One thing Reddit dominates is search results. I keep looking things up and seeing so many links to Reddit, which I guess is going to help keep that place relevant (unless those subreddits stay dark).

I wondered how Lemmy and the whole fediverse thing works for that? With more posts, can we expect to see people arriving through search results?

  • wpuckering@lm.williampuckering.com
    1 year ago

    As a general rule, I prevent all of my self-hosted services that are directly exposed to the Internet from being crawled or indexed by search engines. Any service I do expose publicly is of course behind proper authentication and secured using modern best practices and standards, but someone is less likely to stumble onto services they have no use for, and potentially try to exploit them, if those services aren’t presented front and center in a search result. I wouldn’t call it a proper security measure by any means (obscurity has nothing to do with real security), but blending into the crowd or taking a seat at the back of the room draws less attention if you don’t care to be the first target in someone’s sights.
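    For anyone wondering what "preventing crawling and indexing" can look like in practice, here is a minimal sketch. It is not the exact configuration described above; it assumes the two common mechanisms, a blanket robots.txt and an X-Robots-Tag response header, and the hostname lemmy.example is a made-up placeholder. The helper only uses Python's standard robots.txt parser to check what a well-behaved crawler would conclude.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a blanket policy that asks every crawler to stay out entirely.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"

# Assumption: the reverse proxy (or the app itself) attaches this header to
# every response, asking search engines not to index pages they do reach.
NOINDEX_HEADERS = {"X-Robots-Tag": "noindex, nofollow"}


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

    Both mechanisms are only requests that compliant crawlers honor, which fits the point above: this lowers visibility, it does not replace authentication.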

    So why do I expose any of my self-hosted services to the Internet in the first place, rather than access them exclusively via VPN? For me there are a few reasons:

    • Ease of Access - I want the ability to instantly share specific services that I host with friends and family over the Internet, and I can’t expect them to connect over a VPN, even if I were to offer to help them get set up
    • Performance - I use Cloudflare Tunnels to expose my services (no open router ports, ever), which lets me use Cloudflare’s CDN to cache static assets such as immutable images, CSS, and JavaScript, and I’ve extensively tweaked my Cache Rules to squeeze the most out of it
    • Security - Cloudflare secures my services with their built-in tooling, and I can use Cloudflare Access if I want to limit access further to specific users by means of accounts they already have, such as Google or various social media account providers
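    To illustrate the caching idea from the Performance point, here is a rough origin-side sketch. It is not a Cloudflare Cache Rule (those are configured in Cloudflare itself) and not the setup described above; it only shows the kind of Cache-Control header a CDN can act on. The suffix list and the function name cache_control_for are assumptions chosen for the example.

```python
from pathlib import PurePosixPath

# Assumption: these suffixes mark static assets that never change once served.
STATIC_SUFFIXES = {".css", ".js", ".png", ".jpg", ".svg", ".woff2"}


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value for a request path (hypothetical helper)."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in STATIC_SUFFIXES:
        # Long-lived and immutable: a CDN or browser may reuse it for a year.
        return "public, max-age=31536000, immutable"
    # Everything else (pages, API responses) is not stored by caches.
    return "no-store"
```

    Marking assets as immutable is only safe when file names change whenever the content does, for example with a hash in the name.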

    …And there are more reasons I could get into, and I could easily expand on the ones above, but I’ll leave it there.

    Of course having all of my external traffic flow through Cloudflare means there’s no expectation of data privacy for any payload traversing in and out of my services, but I’ve decided that I’m okay with that for the other benefits I get out of Cloudflare. Nothing’s truly free, right?

    But to answer your original question more specifically, and with the context above in mind, why actively work against indexing in the case of my Lemmy instance? Well, I’m the only user on my instance. I only use it as a home server for my account. That means I’m not creating any communities on it, and there’s no content actually originating from my instance proper. Anybody who came across my instance and browsed it would see content that originates from other instances, and only content from the time I set up my server and began federating with those other servers onward. They wouldn’t see every comment from posts that pre-date my federation, so it would be an incomplete view.

    They would be better off going directly to the server that originated the content. They could of course do that by following the permalink from my own server, but it’s an extra hop. It’s arguably better in this case if I just remove my server entirely from any possible search results, so that if the originating instance is indexable, its content shows up in the results and mine doesn’t. That would probably be a better experience for people trying to find Lemmy content via search engines; they’d hopefully land on the originating instance sooner rather than later.

    Long answer, but I wanted to give as much insight and clarity into why I do what I do. Happy to answer any more questions!

    • Kresten@feddit.dk
      1 year ago

      Interesting insight into your setup and thought process.

      That makes good sense. I didn’t realize you hosted your instance only for yourself. I might consider that as well in the future.