Hi all, I sent a message on masto about some concerns I have about the suggested use of archive.is on lemmy but since I didn’t get any feedback, I’m copying it here:
Hi! I am currently trying #Lemmy locally in order to figure out what features are available on the platform (for a possible future instance deployment). So far so good, thanks for the work done.
That said, I am a bit concerned about the “Create Post” form and the suggestion to use archive.today/archive.is to archive the shared links. As opposed to archive.org, archive.today is quite shady (zero info about the owner) and the way its engine circumvents the robots.txt rules isn’t privacy friendly in my opinion.
To make it short, I don’t think it’s safe to encourage people to use archive.today since this website can be a vector of harassment. Have you considered removing this feature or using another service instead?
source: https://hackers.town/web/statuses/106677926407258300
I also find it insufferable because it breaks reader mode, redirection add-ons and a few other things.
IMO the archiving option should work by providing the original link + the archived page in case the content in the original page is modified/deleted.
and the way its engine is circumventing the robots.txt rules isn’t privacy friendly in my opinion
I’m fairly certain archive.org does this too.
They don’t: https://help.archive.org/hc/en-us/articles/360004651732-Using-The-Wayback-Machine
(the “Some sites are not available because of robots.txt or other exclusions. What does that mean?” question)
Huh, I could have sworn I saw a blog post a while back about how they stopped respecting robots.txt.