You can always just blacklist them in Nginx (or whatever public-facing server you use) if they start posing a real problem. That goes especially for bots run by a company, since they usually crawl from a corporate IP range; if you can track that range down, you should be good.
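For example, something like this in the Nginx server block would do it. The IP range and User-Agent string here are placeholders, not a real bot; swap in whatever you actually see in your access logs:

```nginx
server {
    # Deny a whole corporate IP range outright
    # (203.0.113.0/24 is a documentation range, used here as a placeholder)
    deny 203.0.113.0/24;

    # Return 403 to any client whose User-Agent matches,
    # case-insensitively ("BadBot" is a made-up example name)
    if ($http_user_agent ~* "BadBot") {
        return 403;
    }
}
```

`deny` comes from the stock access module, so no extra build flags needed.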
I have fail2ban set up pretty tight and it does a good job on bad bots. I'm talking more about the Ahrefs bot, Bing, Google, etc., which are indexing my instance. They will respect the file, and hopefully a mirror bot with a public site will too. Without a robots.txt configured you don't have as strong a case against their unwanted crawling.
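For anyone wondering what a "pretty tight" fail2ban setup for bad bots might look like, here's a sketch against an Nginx access log. The filter name and regex are illustrative, not stock fail2ban; you'd write the filter yourself to match the User-Agents you're seeing:

```ini
# /etc/fail2ban/jail.local -- hypothetical jail; "nginx-badbots"
# is a custom filter you define in filter.d, not a shipped one
[nginx-badbots]
enabled  = true
port     = http,https
filter   = nginx-badbots
logpath  = /var/log/nginx/access.log
maxretry = 2
bantime  = 86400
```

```ini
# /etc/fail2ban/filter.d/nginx-badbots.conf
# Matches combined-format log lines whose User-Agent field contains
# one of the (made-up) bot names below
[Definition]
failregex = ^<HOST> .* ".*(?:BadBot|EvilScraper).*"$
```

Test the regex against your real log with `fail2ban-regex` before enabling the jail.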
I’m curious about this too. Bots are scanning my instance and I’ve thought about creating a robots.txt file in case they want to respect it
https://www.robotstxt.org/robotstxt.html
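Per that page, the minimal "go away, everyone" version, served at the site root as `/robots.txt`, is just:

```
# Disallow all well-behaved crawlers from the entire site
User-agent: *
Disallow: /
```

You can also scope it, e.g. `Disallow: /api/` to keep crawlers out of specific paths while leaving the rest indexable. It's purely advisory, though: only compliant bots honor it.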