It Has Been
X Days
Since a Techbro Asshole Made a Fedi Scraper/Indexer.
Seriously. Stop fucking doing it.
- 2025-12-08: Readily.news ⚠️
@ansuz@social.cryptography.dog discovered a new fedi scraper in the guise of an LLM news-summarising bot, that hijacks account permissions to scrape posts (including private & DMs) from the follows of every person who uses it. Their writeup has lots more info. Once again showing that people who use LLM tech don't care about the world being made worse for everyone around them.
- 2025-08-02: NodeBB (sorta) 🛠️
@dentangle@chaos.social discovered that some forum software called NodeBB supports ActivityPub but sorta forgot to set noindex tags, or distinguish between Public and Unlisted status types. Gotta ship that MVP, right? Even Eugen "Website" Boy decided to weigh in to go complain to the authors. Not the worst offence here, but a classic "consider privacy last, if at all" techbro approach.
- 2025-05-27: FediLive
Self-reporting on fedi, some Berlin techbro consultancy announced 'FediLive': a scraper so unoriginal in function and so unimaginatively named that I was almost certain that I'd already written about it here. It grabs the instances.social list and then connects to and scrapes the public timeline of all the instances on there.
Unsurprisingly, it does not set a user agent, comply with GDPR, or do anything whatsoever to comply with the policies of the instances or privacy of their users.
The authors announce that it "could be leveraged for data analysis, ... leading to a deeper understanding of user activities on Mastodon" [sic]. Wow, how inspiring and novel. Bet they'll be disappointed when the VC cash doesn't eventuate.
- 2025-02-07: Fedicate 💀
Ginny McQueen reports a site, "Fedicate", that is intended to be some kind of public fedi account complaint aggregator. When confronted about the potential for harassment this exposes, the author (@sam@bikersgo.social) appears to have pivoted to hosting "endorsements" instead. Either way, it includes users and servers without consent, and there is not even any opt-out mechanism.
- 2025-02-07: Fediral
Another in the series of incomprehensible Spoonerisms: @tobi@goblin.technology discovered a scraper calling itself "Fediral", with zero public documentation, that is already being dug into for ignoring servers' robots.txt, despite accessing it. I expect we'll find out more soon.
- 2025-01-10: FediFirehose
DJSundog discovers fedi scraper "FediFirehose" by Roni Laukkarinen (@rolle@mementomori.social) using an account on mastodon.social to "firehose" posts to an external database and website. No opt-out method existed. Appears to have existed since 2023. Project shut down after outcry. 💀
- 2024-06-13: Maven
Founders at VC-funded "AI" startup Maven scrape 1.2mil+ posts from Fedi via mastodon.social, with complete disregard for privacy, consent, copyright or GDPR. When called on it, "the extreme negative reaction was a surprise to me". Data later expunged, but it may be back.
https://social.wake.st/@liaizon/112603892765841003
https://wedistribute.org/2024/06/maven-mastodon-posts/
https://app.heymaven.com/discover/1190743
- 2024-06-11: Awakari
Site follows Mastodon users in an opt-out system to populate a for-profit feed aggregation tool. No clear way to remove data from its search index. Not privacy/GDPR compliant.
https://mas.to/@meganL/112576865799374035
- 2024-06-04: Several university departments scrape fedi instance block information with no consent, no opt-out, no respect for robots.txt
https://goblin.technology/@tobi/statuses/01HZHFNX6Q38A1QCKEPAM2TYFP
- 2024-03-29: Webis
Matti Wiegman revealed Webis mastodon scraping corpus, collected without consent over Dec 2023 to Feb 2024.
https://idf.social/@djoerd/112173581660862282
- 2024-02-29: contentnation.net by @sash@noc.social.
ActivityPub-based content scraper website about "monetising content". Doesn't allow defederation by regular methods, allow instance-level opt-outs, or respect post delete requests. Unknown if respects #nobot or similar. Author does not respect GDPR takedown requests. Been operating since 2022.
https://web.archive.org/web/20230923032556/https://contentnation.net/en/content.htm
https://cloudisland.nz/@aurynn/112012888228299945
- 2024-02-13: Bsky.brid.gy by @snarfed.org@snarfed.org.
BlueSky bridge with only custom opt-out mechanism. Doesn't respect #nobot or similar.
https://web.archive.org/web/20240212170323/https://snarfed.org/2024-02-12_52106
- 2023-08-07: Mnemo.social by @dt@mastodon.top. 💀
Doesn't respect nobot, robots.txt, or any individual or instance-level anti-indexing opt-outs.
https://mastodon.art/@welshpixie/110847598016972837
- 2023-06: (There were a couple in the intervening months.)
- 2023-02-09: Peopl.social 💀
https://mastodon.gamedev.place/@rzubek/109832886323414200
- 2023-02-03: Flockingbird Hashtag scraper 💀
https://troet.cafe/@baddadda/109720568312142935, https://search.flockingbird.social/
- 2023-01-24: Takesama_bot 💀
ActivityPub proposal for built-in scraping. Nerds who want to silently scrape your content without authorisation.
https://github.com/w3c/activitypub/issues/361, https://takesama.com/whitepapers/takesama_bot
- 2023-01-13: Searchtodon.social 💀
https://chaos.social/@janl/109677640974046213
- 2023-01-10: FediverseAlmanac.com. 💀
https://hachyderm.io/@tedivm/109656323461979925
- 2023-01-07: Flussence scraper. 💀
(no longer up) https://pleroma.flussence.eu/objects/ecc2e183-48ed-4886-9621-1160a86e9924
- 2022-12-27: Mastinator.com 🧟
https://mastodon.ar.al/@aral/109585159213960986
by @s0