mastodontech.de is one of many independent Mastodon servers you can use to take part in the Fediverse.
Open to everyone (over 16) and provided by Markus'Blog

Server statistics:

1.5K
active profiles

#scraping

6 posts · 6 participants · 0 posts today

That's the logic I don't get, I guess I will never be rich, unless I win the lottery??..
Scraping is a huge business nowadays.
NB: many LLMs are based on something called the Pile, which is weird and shady to say the least. I don't think using LLMs for business is good for reputation. But clearly, we are not really allowed to think otherwise (the Physics Nobel Prize for AI was the end of the argument for me), and I want to work, it is MY fault, I should've known better.

Replied in thread

@akamran @davidtoddmccarty If you search Google for #Mastodon hashtag scraping, you find software and programs that help AI do that. It exists.

Fact is that from today, the main instances mastodon.social and mastodon.online prohibit #scraping officially: techcrunch.com/2025/06/17/mast

Problem of decentralisation: admins/users of other instances must become aware of the problem and change their terms, too.

It may be funny but it's no joke.

TechCrunch · Mastodon updates its terms to prohibit AI model training | TechCrunch
Days after Elon Musk-owned X updated its terms to explicitly prohibit AI model training, decentralized social network Mastodon updated its own rules to bar any kind of model training, as well.

Continued thread

2/

Scraping (as in Web Scraping) is the act of extracting data from HTML web-pages where the data is NOT machine-legible.

If the data, even in an HTML web-page, is in a machine-legible format, then it is NOT scraping.

...

And, getting data in JSON (key-value pairs) is definitely NOT scraping — as JSON's purpose is to communicate data in a machine-legible manner.
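
To make the distinction in this post concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the HTML snippet, the class names `author` and `text`, and the JSON payload are assumptions, not the markup or API of any real Mastodon server. The first function has to dig the values out of presentation markup (scraping in the sense above), the second just parses data that was already published in a machine-legible form.

```python
import json
from html.parser import HTMLParser

# Hypothetical page markup: the data is buried in presentation HTML.
HTML_PAGE = (
    '<html><body><div class="post">'
    '<span class="author">alice</span>'
    '<p class="text">Hello Fediverse</p>'
    '</div></body></html>'
)

# Hypothetical payload: the same data, already as key-value pairs.
JSON_PAYLOAD = '{"author": "alice", "text": "Hello Fediverse"}'


class PostScraper(HTMLParser):
    """Collects the text of elements whose class is 'author' or 'text'."""

    def __init__(self):
        super().__init__()
        self._field = None  # field name we are currently inside, if any
        self.post = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in ("author", "text"):
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self.post[self._field] = data.strip()
            self._field = None


def scrape_html(page):
    """Scraping: guess the structure from the markup and extract values."""
    parser = PostScraper()
    parser.feed(page)
    return parser.post


def read_json(payload):
    """Not scraping by the definition above: the format is the structure."""
    return json.loads(payload)


if __name__ == "__main__":
    print(scrape_html(HTML_PAGE))
    print(read_json(JSON_PAYLOAD))
```

Both calls end up with the same dictionary, but the HTML version breaks as soon as the page layout or class names change, while the JSON version only depends on the keys.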

CC: @404mediaco

#Scraper #Scraping #WebScraper