- cross-posted to:
- fediverse@lemmy.ml
I think it’s just pulling from the frontpage?
If you have a hashtag in your title, the bot’s post will be parsed by Mastodon as having used that hashtag. I’m not sure how I feel about this since most hashtags in post titles are ironic or sarcastic.
I’m curious about this too. Bots are scanning my instance, and I’ve thought about creating a robots.txt file in case they want to respect it.
You can always just block them in Nginx (or whatever public-facing server you use) if they start posing a real problem. This works especially well for bots run by a company, since they usually come from the company’s own IP range; if you can track that range down, you should be good.
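If it helps, here is a rough Python sketch of the IP range idea. The ranges are placeholders from the reserved documentation blocks, not any real company's addresses, and the function names are made up for illustration. It checks whether an address falls inside a blocked range and prints lines using Nginx's `deny` directive that you could paste into a server block.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), NOT a real bot operator's IPs.
BLOCKED_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def is_blocked(ip, ranges=BLOCKED_RANGES):
    """Return True if `ip` falls inside any of the CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(r) for r in ranges)


def nginx_deny_lines(ranges=BLOCKED_RANGES):
    """Build one Nginx `deny` line per CIDR range."""
    return [f"deny {ipaddress.ip_network(r)};" for r in ranges]


if __name__ == "__main__":
    print(is_blocked("192.0.2.55"))
    print("\n".join(nginx_deny_lines()))
```

You would still have to find the operator's actual ranges yourself, for example from your access logs.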
I have fail2ban set up pretty tight and it does a good job on bad bots. I’m talking more about the Ahrefs bot, Bing, Google, etc., which are indexing my instance. They will respect the file, and hopefully a mirror bot with a public site will too. Without robots.txt configured, you don’t have as good a case against their unwanted retrieval.
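For anyone who wants to sanity-check a robots.txt before deploying it, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules are only an example: I'm assuming `AhrefsBot` is the user-agent token you want to shut out entirely and that `/search` is a path you'd rather keep every crawler away from. Swap in whatever applies to your instance.

```python
from urllib.robotparser import RobotFileParser

# Example rules only: block one crawler completely, keep everyone out of /search.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if a well-behaved crawler named `agent` may fetch `path`."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent, path in [("AhrefsBot", "/"), ("Googlebot", "/post/1"), ("Googlebot", "/search")]:
        print(agent, path, allowed(agent, path))
```

This only tells you what a crawler that honors the file would do. Anything that ignores robots.txt still needs fail2ban or a server-level block.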