You can always just blacklist them in Nginx (or whatever public-facing server you use) if they start posing a real problem. That goes especially for bots run by a company, since they usually crawl from a corporate IP range; if you can track that range down, you should be good.
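For example, something like this in the Nginx server block would do it. The IP range and User-Agent string here are placeholders, not a real bot; swap in whatever you actually see in your access logs:

```nginx
server {
    # Deny a whole corporate IP range outright
    # (203.0.113.0/24 is a documentation range, used here as a placeholder)
    deny 203.0.113.0/24;

    # Return 403 to any client whose User-Agent matches,
    # case-insensitively ("BadBot" is a made-up example name)
    if ($http_user_agent ~* "BadBot") {
        return 403;
    }
}
```

`deny` comes from the stock access module, so no extra build flags needed.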
I have fail2ban set up pretty tight and it does a good job on bad bots. I'm talking more about the Ahrefs bot, Bing, Google, etc., which are indexing my instance. They will respect the file, and hopefully a mirror bot with a public site will too. Without a robots.txt configured you don't have as strong a case against their unwanted crawling.
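For anyone wondering what a "pretty tight" fail2ban setup for bad bots might look like, here's a sketch against an Nginx access log. The filter name and regex are illustrative, not stock fail2ban; you'd write the filter yourself to match the User-Agents you're seeing:

```ini
# /etc/fail2ban/jail.local -- hypothetical jail; "nginx-badbots"
# is a custom filter you define in filter.d, not a shipped one
[nginx-badbots]
enabled  = true
port     = http,https
filter   = nginx-badbots
logpath  = /var/log/nginx/access.log
maxretry = 2
bantime  = 86400
```

```ini
# /etc/fail2ban/filter.d/nginx-badbots.conf
# Matches combined-format log lines whose User-Agent field contains
# one of the (made-up) bot names below
[Definition]
failregex = ^<HOST> .* ".*(?:BadBot|EvilScraper).*"$
```

Test the regex against your real log with `fail2ban-regex` before enabling the jail.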
I’m curious about this too. Bots are scanning my instance and I’ve thought about creating a robots.txt file in case they want to respect it
https://www.robotstxt.org/robotstxt.html
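Per that page, the minimal "go away, everyone" version, served at the site root as `/robots.txt`, is just:

```
# Disallow all well-behaved crawlers from the entire site
User-agent: *
Disallow: /
```

You can also scope it, e.g. `Disallow: /api/` to keep crawlers out of specific paths while leaving the rest indexable. It's purely advisory, though: only compliant bots honor it.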