At the time of writing, Lemmy.world has the second-highest number of active users of any Lemmy instance.

Also at the time of writing, Lemmy.world has >99% uptime.

By comparison, other Lemmy instances with as many users as Lemmy.world keep going down.

What optimizations has Lemmy.world made to its hosting configuration that make it more resilient than other instances’ hosting configurations?

See also “Does Lemmy cache the frontpage by default (read-only)?” on !lemmy_support@lemmy.ml

  • Ruud@lemmy.worldM · ↑320 ↓1 · 1 year ago

    Thank you for all the compliments.

    This ride reminds me of Mastodon.world in November. Details on that are here: https://blog.mastodon.world/and-then-november-happened

    So I started lemmy.world on a 2CPU/4GB VPS, keeping an eye on the performance. Soon I decided to double that, and after the first few thousand users joined, I doubled it again to 8CPU/16GB. That was also the max I could get for that VPS type.

    But already some donations came in, without me really asking. That reminded me of the willingness to donate on Mastodon, which allowed me to easily pay for a very powerful server for mastodon.world, one of the reasons it grew so fast. Other (large) servers crashed and closed registrations; I (mainly) didn’t.

    So, I decided to buy the same large server (32 CPUs/64 threads with 128GB RAM) as for masto (but that masto one has double the RAM). With the post announcing that, I also mentioned the donation possibilities. That brought a lot of donations immediately, already funding this server for at least 2 months. (To the anonymous person donating $100: wow!)

    Now next: to solve the issue with post slowness. That’s probably a database issue.

    And again: the migration took 4 minutes of downtime, and that could have been less if I hadn’t been eating pizza at the same time. So if any server wants to migrate: please do! If you have the userbase, you’ll get the donations for it. Contact me if you have questions.

    • Balthazar@sopuli.xyz · ↑25 · 1 year ago

      And this kindness and willingness to help is why I’ve already fallen in love with Lemmy. Thank you, good sir, thank you dearly for helping the next generation of internet denizens :D

    • million@lemmy.world · ↑15 · 1 year ago

      Nice job, thanks very much for the write up.

      Out of curiosity are you cloud hosting or do you own a server on a rack somewhere? Scaling with Kubernetes or VMs or just running bare-metal?

    • grouvie@lemmy.help · ↑13 · 1 year ago

      Interesting writeup. I’m curious about the resource usage of the Lemmy backend and frontend deployments. Do you have any insights on the resource utilization of these deployments?

    • QuazarOmega@fedia.io · ↑9 · 1 year ago

      Amazing stuff!
      I hope others will follow suit, it’s the biggest hurdle right now for serious adoption in my opinion and you crushed it!

    • Dream_state@lemmy.world · ↑5 · 1 year ago

      Am I able to use the same account to log in to mastodon.world? Or do I need to make an account there too? I’ve never used Mastodon, but I’m vibing with the fediverse stuff.

        • Balthazar@sopuli.xyz · ↑5 · 1 year ago

          You can; you just have to log in to the server/website where you made the account, then browse over to the server you wish to contribute to or use.

          It’s a bit weird, I know. If you’ve got any questions I’ll try my best to answer them :D

      • mjgood91@lemmy.world · ↑3 · 1 year ago

        I bet you could do it if your instance didn’t pull in a lot of traffic.

        If it did… I reckon that you might be able to pull it off to a certain extent so long as your internet package was good enough, but if you got hit with a Reddit-level flood of incoming users, your network almost certainly wouldn’t be able to keep up.

        Even if it could, if you were consistently eating through all the upload bandwidth, I reckon you’d draw the eyes of your ISP and they might send you a letter kindly and respectfully telling you that if you don’t upgrade to a commercial line they’re not renewing your contract.

    • Xaphanos@lemmy.world · ↑2 · 1 year ago

      As someone “in the business”, but not nearly as technical as you… How far can a single instance scale? Can a load balancer spread it over multiple front-ends to handle user load? Can the back-end be re-worked to handle hundreds of millions of user operations per second? Can it work with a CDN? Can a single “Lemmy.World” site exist as a distributed site - with hundreds of servers spread across dozens of sites across the globe?

      I expect this entire line of thought is antithetical to the entire Lemmy philosophy of distributed operation. I expect that the “correct” way is to spin off “NA.Lemmy.World”, “EMEA.Lemmy.World”, “APAC.Lemmy.World”, etc. as separate servers. Is that correct?

      Thanks.

      • Ruud@lemmy.worldM · ↑2 · 1 year ago

        Best would be to have thousands of servers. I wouldn’t make them subdomains of lemmy.world, but people could create separate instances for large communities, for example https://lemmy-selfhosted.com. Scaling: if the Lemmy software gets tuned a bit better, I think the current server could host at least a million users. But let’s hope for more servers first…

    • Xanvial@beehaw.org · ↑2 · 1 year ago

      Interesting. I’m new on Lemmy (and the fediverse itself), but when you say server, do you mean the backend that handles frontend traffic, or the database that stores all the data? Seems the next optimization step is distributing the traffic to multiple servers.

      Also (again, I don’t know about the Lemmy system itself), maybe you can get away with upgrading just the CPU cores or just the RAM (depending on what is bottlenecking the system). In my experience, the RAM requirement scales more slowly than the CPU requirement.

    • maltfield@monero.houseOP · ↑75 ↓3 · 1 year ago

      Yes. And I’m asking him to share his tweaks here with the community so that other instance admins can shore up their servers :)

      • PriorProject@lemmy.world · ↑108 ↓1 · 1 year ago

        Fwiw, he has been providing quite a lot of transparency in his posts to this community. He’s shared his hardware config in detail, posted maintenance posts with brief descriptions of what he’s doing, and replied to comments about specific config tweaks. I haven’t catalogued a list of links, but I’ve seen him do all of these things in the last 48h. It’s easy to imagine that all these things could be compiled in real time into a how-to, but it’s a pretty big deal just to keep the lights on right now, and it’s pretty difficult to know whether tweaks that helped your setup are generally applicable, or only situationally useful and happen to perform well for your specific setup.

        I’m sure we will see more high-performance Lemmy guides in the future, but at this point no one has more than 36h of experience with high-performance Lemmy. Give them a minute to catch up.

        • MetaCubed@lemmy.world · ↑10 · 1 year ago

          Saying this without any knowledge of lemmy’s backend… Large, high user count databases with (very quickly) growing demand take more power than generally expected. At a certain point, throwing money at the problem is the solution.

          • 676@lemmy.ca · ↑1 · 1 year ago

            I respect anyone working with any amount of databases. Just running my self-hosted operation made me realize how fucking complicated and stupid, but necessary, databases are.

  • PriorProject@lemmy.world · ↑161 · 1 year ago

    I’m not an admin, but have followed the sizing discussions around the lemmyverse as closely as I can from my position of lacking first-hand knowledge:

    • lemmy.ml is the biggest instance by user count, but runs on incredibly modest 8-cpu hardware. Their cloud provider doesn’t provide any easy scale up options for them, so they can’t trivially restart on a bigger VM with their db and disk in place. I suspect this means that instance is going to suffer for a bit as they figure out what to do next.
    • lemmy.world on the other hand was running on a box at least twice as big as lemmy.ml at last count, and I believe they can go quite a bit bigger if they need to.
    • The lemmy.world admins also run mastodon.world and lived through the twitterpocalypse, seeing peak user registrations rates of 4k per hour. So this is not their first rodeo in terms of explosive growth, I’m sure that experience gives them some tricks up their sleeve.
    • The admin team is pretty clearly technically strong. If I recall correctly, ruud is a professional database admin. One of the spooky parts of Lemmy performance-wise is the db. If ruud or others on the admin team custom-tuned their pg setup based on their own analysis of how/why it’s slow, they may be getting more performance per CPU cycle than other instances running more stock configs or that are cargo-culting tweaks that aren’t optimal for their setup without understanding what makes them work.

    I’m surprised that sh.itjust.works isn’t growing faster. They also have a hefty hardware setup and seemingly the technical admins to handle big user counts. I wonder if it’s a branding problem, where lemmy.world sounds inviting and plausibly serious where sh.itjust.works sounds like clowntown even though it’s run by a capable and serious team.

    • Pspspspspsps@lemmy.world · ↑126 · edited · 1 year ago

      I wonder if it’s a branding problem, where lemmy.world sounds inviting and plausibly serious where sh.itjust.works sounds like clowntown

      That was my thought process when choosing an instance tbh. I’m not a tech person; I looked at the list and lemmy.world was the first ‘safest feeling’ instance that had open sign-up. I saw sh.itjust.works and didn’t even check their sign-up process; there were too many periods in the strange name and it just looks weird to me as someone not used to these things. Edit: spelling

      • Z______@lemmy.world · ↑36 ↓3 · edited · 1 year ago

        I definitely second the motion on it being a branding problem. Stuff like sh.itjust.works seems to me like something that dark-basement tech nerds would come up with that is “edgy” and really only used by them and other people like them.

        I’m not really into the ironic “edgy” aesthetic and part of the struggle with this transition for me has been orienting myself in the space because I don’t want to commit to some “sketchy” edgelord URL

        • DarkwingDuck@sh.itjust.works · ↑31 ↓2 · 1 year ago

          something that dark basement tech nerds would come up with that is “edgy” and really only used by them and other people like them.

          That’s exactly what it is and why I love it. The whole thing about this federated networking is that it doesn’t matter where you signed up.

      • Guy_Fieris_Hair@lemmy.world · ↑14 ↓1 · edited · 1 year ago

        I do think join-lemmy.org could possibly be changed to show server usage/capacity and uptime. When I initially signed up I went for lemmy.ml, because what the heck is the difference? Honestly I was having all kinds of timeouts and thought the entire lemmy-verse was probably struggling. I was concerned that that was the experience everyone was getting, and that they were going to leave because it is unsustainable.

        But I ended up seeing a page showing the uptime of servers, and lemmy.world was 100% (at that point). So I figured I’d start an account here. HOLY CRAP, IT IS SO MUCH FASTER. I would have had a hard time sticking around if it all worked like lemmy.ml.

        I started a community on lemmy.ml. Wish I would have done it here.

      • s4if@lemmy.world · ↑8 · 1 year ago

        Nah, I’m a bit regretful about not signing up on their instance. sh.itjust.works is a cool name and can be a brag point, lol. lemmy.world is a bit too generalist, but I won’t migrate, as Ruud (the admin of lemmy.world) is doing a good job managing the instance. I appreciate that. :)

      • flatbield@beehaw.org · ↑1 · 1 year ago

        For what it is worth, I looked at sh.itjust.works. The reason I chose beehaw.org was that they were more local, and had more local content and users. Plus the server focus and values seemed to fit me better. Yes, their domain is a bit odd, but that was not a factor for me.

    • Retro@lemmy.world · ↑46 ↓1 · edited · 1 year ago

      I originally signed up with sh.itjust.works, but I wanted to be on the instance with the majority of migrants.

      Also, it sounds dumb, but I think the sh.itjust.works domain is just kinda weird, technically has a “curse word” in it (not that I personally care), and they don’t support NSFW content (which isn’t just used for porn). So, it didn’t make sense to have that as my home instance. 🤷‍♂️

      Edit: Also, this is my first comment on here! Hello world! 👋

      • PriorProject@lemmy.world · ↑12 · 1 year ago

        Yeah, I get it. Naming optics aside, it seems like an instance with a lot of headroom relative to others, with a capable team. It would be near the top of my word-of-mouth options in spite of the idiosyncratic name.

    • Master@lemmy.world · ↑30 ↓3 · 1 year ago

      Can confirm… I didn’t sign up for sh.itjust.works solely because of the name… I don’t particularly want that attached to every post I make.

      • lift@aussie.zone · ↑6 · 1 year ago

        Agreed. I have no idea what I’m doing, but lemmy.world sounded inviting - thus, I’m here.

    • wheen@lemmy.world · ↑15 · 1 year ago

      Can none of this scale horizontally? Every mention of scaling has been just “throw a bigger computer at it”.

      We’re already running into issues with the bigger servers being unable to handle the load. Spinning up entirely new instances technically works, but is an awful user experience and seems like it could be exploited.

      • PriorProject@lemmy.world · ↑43 ↓1 · 1 year ago

        It’s important to recall that last week the biggest lemmy server in the world ran on a 4-core VM. Anybody that says you can scale from this to reddit overnight with “horizontal scaling” is selling some snake oil. Scaling is hard work and there aren’t really any shortcuts. Lemmy is doing pretty well on the curve of how systems tend to handle major waves of adoption.

        But that’s not your question, you asked if Lemmy can horizontally scale. The answer is yes, but in a limited/finite way. The production docker-compose file that many lemmy installs are based on has 5 components. From the inside out, they are:

        • Postgres: The database, stores most of the data for the other components. Exposes a protocol to accept and return SQL queries and responses.
        • Lemmy: The application server, exposes websockets and http protocols for lemmy clients… also talks to the db.
        • Lemmy-ui: Talks to Lemmy over websockets (for now, they’re working to deprecate that soon) and does some fancy dynamic webpage construction.
        • Nginx: Acts as a web proxy. Does https encryption, compression over the wire, could potentially do some static asset caching of images but I didn’t see that configured in my skim of the config.
        • Pict-rs: Some kind of image-hosting server.

        So… first off… there are 5 layers there that talk to each other over the docker network. So you can definitely use 5 computers to run a lemmy instance. That’s a non-zero amount of horizontal scaling. Of those layers, I’m told that lemmy and lemmy-ui are stateless and you can run an arbitrary number of them today. There are ways of scaling nginx using round-robin DNS and other load-balancing mechanisms. So 3 out of the 5 layers scale horizontally.
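        As a rough sketch of what that load-balancing piece could look like in nginx, with made-up container names, ports, and replica counts (the real compose file and its ports may differ):

        ```nginx
        # Hypothetical nginx upstreams spreading load over multiple stateless replicas.
        upstream lemmy_backend {
            server lemmy-1:8536;      # application-server replicas
            server lemmy-2:8536;
            server lemmy-3:8536;
        }

        upstream lemmy_frontend {
            server lemmy-ui-1:1234;   # UI replicas
            server lemmy-ui-2:1234;
        }

        server {
            listen 80;                            # TLS omitted from the sketch
            server_name example-instance.tld;     # placeholder

            # API and websocket traffic to the application servers.
            location /api/ {
                proxy_pass http://lemmy_backend;
                proxy_http_version 1.1;
                proxy_set_header Upgrade $http_upgrade;    # needed while the websocket API is still in use
                proxy_set_header Connection "upgrade";
            }

            # Everything else to the UI replicas.
            location / {
                proxy_pass http://lemmy_frontend;
            }
        }
        ```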

        Pict-rs does not. It can be backed by object storage like S3, and there are lots of object storage systems that scale horizontally. But pict-rs itself seems to still need to be a single instance. But still, that’s just one part of lemmy and you can throw it on a giant multicore box backed by scalable object storage. Should take you pretty far.

        Which leaves postgres. Right now I believe everyone is running a single postgres instance and scaling it bigger, which is common. But postgres has ways to scale across boxes as well. It supports “read-replicas”, where the “main” postgres copies data to the replicas and they serve reads so the leader can focus on handling just the writes. Lemmy doesn’t support this kind of advanced request routing today, but Postgres is ready when it can. In the far future, there’s also sharding writes across multiple leaders, which is complex and has its downsides but can scale writes quite a lot.
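        For the curious, a bare-bones sketch of how a streaming read-replica is typically stood up with stock Postgres tooling (hostnames and credentials are placeholders); Lemmy itself would still need the request routing described above before it could actually send reads there:

        ```sh
        # On the primary: create a replication role (pg_hba.conf also needs a
        # matching "replication" entry for the replica host).
        psql -c "CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'change-me';"

        # On the replica: clone the primary and write the standby configuration.
        # -R creates standby.signal and primary_conninfo automatically (Postgres 12+).
        pg_basebackup -h primary.example.internal -U replicator \
                      -D /var/lib/postgresql/data -R -X stream -P

        # Start the replica; it now follows the primary and can serve read-only queries.
        pg_ctl -D /var/lib/postgresql/data start
        ```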

        All of which is to say… lemmy isn’t built purely on distributed primitives that can each scale horizontally to arbitrary numbers of machines. But there is quite a lot of opportunity to scale out in the current architecture. Why don’t people do it more? Because buying a bigger box is 10x-100x easier until it stops being possible, and we haven’t hit that point yet.

    • Druidgrove@lemmy.world · ↑14 · 1 year ago

      I’m now going to start incorporating “Sounds like clowntown” into my everyday conversations - that’s funny!

    • WaffleFriends@lemmy.world · ↑12 · 1 year ago

      I had a very similar thought process when choosing my instance. lemmy.world seemed like it would be more open to new users than an instance named sh.itjust.works. Idk why that was my thought process but I’m here now

    • StrayPizza@lemmy.world · ↑9 · 1 year ago

      I hope lemmy.ml can upgrade at some point. A lot of the slowness I’m running into is trying to browse/discovery communities that happen to live on that instance.

    • maltfield@monero.houseOP · ↑8 ↓1 · 1 year ago

      Right, but if you don’t have a cache set up, then the DB gets taxed. At a certain point a cache loses its benefit, but an enormous amount of savings can be made (to backend DB calls, for example) by just caching all API reads for ~60 seconds.
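      To make that concrete, here is a minimal sketch of a ~60-second micro-cache in an nginx proxy in front of the backend. The zone name, paths, port, and the way authenticated traffic is detected are all illustrative and would need adapting to how Lemmy actually passes auth (e.g. a token parameter or cookie rather than an Authorization header):

      ```nginx
      # Hypothetical micro-cache for anonymous read traffic (illustrative values).
      proxy_cache_path /var/cache/nginx/lemmy levels=1:2 keys_zone=lemmy_api:10m
                       max_size=256m inactive=10m use_temp_path=off;

      server {
          listen 80;                          # TLS omitted from the sketch
          server_name example-instance.tld;   # placeholder

          location /api/ {
              proxy_pass http://lemmy:8536;   # backend container name/port may differ

              # Only cache anonymous GET requests; let everything else pass through.
              set $skip_cache 1;
              if ($request_method = GET) { set $skip_cache 0; }
              if ($http_authorization)   { set $skip_cache 1; }

              proxy_cache          lemmy_api;
              proxy_cache_bypass   $skip_cache;
              proxy_no_cache       $skip_cache;
              proxy_cache_valid    200 60s;    # the ~60 second window discussed above
              proxy_cache_key      "$scheme$host$request_uri";
              proxy_cache_lock     on;         # collapse concurrent misses into one backend hit
              proxy_ignore_headers Cache-Control Expires;
              add_header           X-Cache-Status $upstream_cache_status;
          }
      }
      ```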

      • andrew@radiation.party · ↑9 · edited · 1 year ago

        Ensuring there’s no data leakage in those cached calls can be tricky, especially if any API calls return anything sensitive (login tokens, authentication information, etc.), but I can see caching all read-only endpoints that return the same data regardless of permissions for a second or two being helpful for the larger servers.

        It’s also worth noting that postgres does its own query-level caching, quite aggressively too. I’ve worked in some places where we had to add a SELECT RANDOM() to a query to ensure it was pulling the latest data.

        • maltfield@monero.houseOP · ↑4 · 1 year ago

          In my experience, the biggest caching wins come from caching in front of the backend, in RAM, so the query never even reaches those services at all. I’ve used Varnish for this (which is also what the big CDN providers use). In Lemmy, I imagine that would be the nginx proxy that sits in front of the backend.
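          For reference, a bare-bones Varnish (VCL) sketch of that idea, assuming the backend listens on 8536 and that only anonymous GET API requests are safe to cache; the TTL and matching rules are illustrative:

          ```vcl
          vcl 4.1;

          # Hypothetical backend definition; host/port depend on the deployment.
          backend default {
              .host = "127.0.0.1";
              .port = "8536";
          }

          sub vcl_recv {
              # Only consider anonymous GET requests against the API for caching.
              if (req.method == "GET" && req.url ~ "^/api/" && !req.http.Authorization) {
                  unset req.http.Cookie;   # cookies would otherwise force a pass
                  return (hash);
              }
              return (pass);
          }

          sub vcl_backend_response {
              if (bereq.url ~ "^/api/") {
                  set beresp.ttl = 60s;    # short in-RAM window, as discussed above
              }
          }
          ```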

          • PriorProject@lemmy.world · ↑2 · 1 year ago

            I haven’t heard admins discussing web-proxy caching, which may have something to do with the fact that the Lemmy API is currently pretty much entirely over websockets. I’m not an expert in websockets, and I don’t want to say that websocket API responses absolutely can’t be cached… but it’s not like caching a RESTful API. They are working on moving away from websockets, btw… but it’s not there yet.

            The comments from Lemmy devs in https://github.com/LemmyNet/lemmy/issues/2877 make me think that there’s a lot of database query optimization low-hanging fruit to be had, and that admins are frequently focusing on app configs like worker counts and db configs to maximize the effectiveness of db-level caches, indexes, and other optimizations.

            Which isn’t to say there aren’t gains in the direction you’re suggesting, but I haven’t seen evidence that anyone’s secret sauce is in effective web-proxy caches.

            • s900mhz@beehaw.org · ↑3 · 1 year ago

              I may be wrong, but there is a branch in the works (in the UI repo) that pulls the websocket out and replaces it all with HTTP calls. So the websocket may not be here for long.

              • PriorProject@lemmy.world · ↑1 · 1 year ago

                You’re correct, the devs are already committed to deprecating the websocket API. This may make caching easier in the future, and people may use it more as a result. I’m a little bit skeptical, as most of the heavy requests are from authenticated users, and web-proxy caching authenticated requests without risking serving them up to the wrong user is also non-trivial. But caching is not my area of expertise; there may be straightforward solutions here.

                But my comment was in reference to current releases in use on real world Lemmy servers.

                • s900mhz@beehaw.org · ↑1 · 1 year ago

                  Yes, I didn’t intend to downplay your comment. Caching at the proxy layer with auth is something I am not familiar with. I never had to implement it in my career. (So far 😅) I just wanted to make it known that the websocket may be a thing of Lemmy’s past, for anyone unaware.

            • Yours Truly@dataterm.digital · ↑3 · edited · 1 year ago

              I work on nginx cache modules for a CDN provider.

              While websockets can be proxied, they’re impractical to cache. There are no turn key solutions for this that I’m aware of, but an interesting approach might be to build something on top of NChan with some custom logic in ngx_lua.

              I agree with you that web proxy caches aren’t the silver-bullet solution. They need to be part of a more holistic approach, which should start with optimizing the database queries.

              Caching with auth is possible, but it’s a whole can of worms that should be a last resort, not a first one.

            • maltfield@monero.houseOP · ↑3 · 1 year ago

              Yeah, that’s exactly why I’m asking this question. All the effort seems to be going into the DB – but you can have a horribly shitty DB and backend but still have a massively performant webserver by just caching away the reads to RAM.

              I didn’t see any tickets about this on GitHub, which is why I’m asking around to see if there’s actually some very low-hanging fruit for improving all the instances with a frontend RAM cache.

              • PriorProject@lemmy.world · ↑3 · 1 year ago

                Yeah, that’s exactly why I’m asking this question. All the effort seems to be going into the DB – but you can have a horribly shitty DB and backend but still have a massively performant webserver by just caching away the reads to RAM.

                Much of your post seemed to focus on the techniques employed by lemmy.world, and caching websocket responses in the web proxy does not seem to feature prominently among those techniques.

                If you’re interested in advancing the state of the discussion around web-proxy caching, I’d consider standing up an instance to experiment with it and reporting your own findings. You wouldn’t necessarily have to take on the ongoing expense and moderation headache of a public instance: you could set up with new user registrations closed, create your own test users, and write a small load generator powered by https://join-lemmy.org/api/ to investigate the effect of caching common API queries.
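                A toy read-load generator along those lines could be as small as the sketch below; the instance URL is a placeholder, and the endpoint/parameters should be checked against the published API docs for the Lemmy version under test:

                ```python
                # Toy read-load generator for a private test instance (illustrative only).
                import concurrent.futures
                import time

                import requests

                BASE_URL = "https://lemmy.example.test"   # placeholder test instance
                ENDPOINT = "/api/v3/post/list"            # verify against the API docs
                WORKERS = 20
                REQUESTS_PER_WORKER = 50


                def worker(_: int) -> list[float]:
                    """Fire a series of front-page style reads and record each latency."""
                    latencies = []
                    session = requests.Session()
                    for i in range(REQUESTS_PER_WORKER):
                        start = time.monotonic()
                        resp = session.get(
                            BASE_URL + ENDPOINT,
                            params={"sort": "Hot", "limit": 20, "page": i % 5 + 1},
                            timeout=30,
                        )
                        resp.raise_for_status()
                        latencies.append(time.monotonic() - start)
                    return latencies


                if __name__ == "__main__":
                    with concurrent.futures.ThreadPoolExecutor(max_workers=WORKERS) as pool:
                        results = [t for batch in pool.map(worker, range(WORKERS)) for t in batch]
                    results.sort()
                    print(f"requests: {len(results)}")
                    print(f"p50: {results[len(results) // 2]:.3f}s")
                    print(f"p95: {results[int(len(results) * 0.95)]:.3f}s")
                ```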

    • isosphere@beehaw.org · ↑6 ↓1 · edited · 1 year ago

      sh.itjust.works

      On paper I’d be on this instance, but the name is quite terrible and gives me little confidence in the administration.

  • Skull giver@popplesburger.hilciferous.nl · ↑73 ↓1 · 1 year ago

    Lemmy.world is using some Big Boy Hardware while most other servers are hosted on more hobbyist-grade hardware.

    Lemmy is quite efficient software. It has its challenges and weirdness, but I wouldn’t bat an eye if someone told me a server with 200 people was running on a Raspberry Pi attached to an SSD in someone’s basement. It barely consumes any CPU, and for small communities the entire thing runs on less memory than a copy of Chrome loading the homepage.

    Needless to say, a network of servers that has been serving hundreds of people every day struggled to adapt to hundreds of thousands of people coming in at once.

    This isn’t unique to Lemmy either. We saw this happen to Mastodon when Twitter first started fucking up, then we saw it on Mastodon alternatives when the main Mastodon servers closed registration while they desperately scaled up.

    kbin, which is written in PHP, has also been seriously struggling. I’m pretty sure they still have their Cloudflare anti-DDoS-firewall enabled, meaning federating with kbin users is basically impossible until they can get that fixed.

    There are all sorts of things that could be improved (the frontend can use more aggressive caching, the backend might be written in a way that you can spawn more to take the load, the database can be sharded, you name it) but for Lemmy’s current size, throwing more cores and more RAM at the problem will quickly and easily fix the issue.

    If anything, this shows how inefficient some easier web application frameworks are. Good luck getting Mastodon to run well with less than half a GB reserved for Mastodon itself. Maybe there’s a point where Mastodon’s architecture starts to become more efficient than Lemmy’s, but from what I can tell that’s not all that likely.

    • Televise@lemmy.world · ↑9 · 1 year ago

      Having to separate kbin from the rest of the fediverse is really limiting, and makes the experience more fractured.

      • Skull giver@popplesburger.hilciferous.nl · English · ↑3 · 1 year ago

        True, but that won’t make it any more or less stable. I suppose the fact that kbin can serve a page without JavaScript would make caching common pages easier, but you’ll still need to update the cache every time someone comments.

        In theory it shouldn’t be that hard to take the kbin frontend code and use it as a template for Lemmy. There’s a good chance rewriting the UI from scratch is easier, but it’s all just HTML and CSS.

        • sneakattack@lemmy.ca · English · ↑2 · 1 year ago

          I played around with the Stylus browser extension and made a custom script with adjustments to widths, padding, font sizes, line heights, etc and Lemmy started to feel a lot better and more familiar. I’m sure there are really talented people working on ideas to make it better.

      • For super cheap, there’s Oracle’s free* servers. Restricted to about 50mbps in bandwidth but doesn’t cost you a penny. Get the ARM server if you can, those 4 cores and 24GB of RAM will serve you well.

        For cheap, there are hosting providers like Contabo and Hetzner that will offer you servers for a fair price. You’ll need to do your own backups though!

        OVH has some very cheap dedicated servers if you need more performance. Your uplink will be limited to 100mbps but you get dedicated CPU cores, dedicated hard drives/SSDs, dedicated RAM, everything. This comes with the risk that you’ll need to do your own backups because those dedicated components can die and there’s no failover (by default).

        It’s hard to say what hardware you’ll actually need, but I’ve read somewhere that lemmy.ml used to run on a machine with 8 CPU cores to serve thousands of users before everyone migrated over. The admin of lemmy.world posts regular updates on how the server is doing, I think the 21k users and all of its federated communities are running on a 32 core CPU with 128GB of RAM. Based on this, 6 cores and 8GB of RAM should be more than enough to serve ±100 users with the responsiveness of the lemmy.world instance (probably better because you’ll have comparatively more bandwidth).

        *=free forever but you do need to answer a call from a marketeer

  • neighbourbehaviour@lemmy.world · ↑58 · edited · 1 year ago

    It’s known in the industry as the throw-hardware-at-it optimization. It’s often effective and what’s needed to buy time for software optimization to come in.

    • OsrsNeedsF2P@lemmy.ml · ↑21 · 1 year ago

      As someone who got burnt out on one of their last businesses due to optimizing too early - Yes!!!

      Doing it “properly” with “stateless servers” and “autoscaling” with “Kubernetes” costs a hell of a lot more money than a 64 Core server with 256 GB of RAM

      • Atemu@lemmy.ml · ↑4 · 1 year ago

        Completely OT but it’s so nice to recognise usernames you’ve often seen around your “neighbourhood” on Reddit.

  • MasterBlaster@lemmy.world · ↑45 · 1 year ago

    Talk about dumb luck! I chose this server (apparently 2 days after launch) because the documentation suggested choosing a less populated server to spread the load. Now I’m on one of the biggest and most stable. Me so happy!

  • netburnr@lemmy.world · ↑37 · edited · 1 year ago

    He has been posting updates along the way. It’s a combo of upgrading the server as it hits its limits, and tuning his web proxy and docker containers to handle the increased load and federation requirements.

    Been doing an amazing job of it too. I just randomly chose this instance and I’m glad.

    Edit. Here is his last post https://lemmy.world/post/75556

  • andrew@radiation.party · ↑16 · 1 year ago

    Likely experience and knowledge improving the quality of deployment. Most instances are likely underspecced, are on hosts that aren’t easy to scale up with, or are maxed out in their current offering tier (lemmy.ml comes to mind there)

    • maltfield@monero.houseOP · ↑6 ↓1 · 1 year ago

      I wouldn’t be surprised if it has more to do with caching than throwing hardware at it.

      • andrew@radiation.party · ↑9 · edited · 1 year ago

        Looking at ruud’s post, he moved the instance to a pretty beefy server - it sounds like a large part of the stability is coming from overestimating performance requirements.

            • PriorProject@lemmy.world · ↑11 · 1 year ago

              Lemmy is a monolithic application, so there’s only so much a server upgrade can do.

              This is sort of true, but not really true. The default docker setup is comprised of 4 containers. I’ve seen admins report that two of those containers (lemmy and lemmy-ui) can be horizontally scaled just fine. The pict-rs and postgres containers can currently only be vertically scaled, but Postgres natively supports scaling read load, at least through read-replicas, and there’s an incomplete proposal to support scaling reads through separate db connections.

              All of which is to say, it’s possible to throw 4-6 machines at a Lemmy install. It’s not truly a single-process monolith. Would the Lemmy code be able to productively use all that hardware? I dunno. It’s scaled better to big hardware on lemmy.world than I would have predicted last week; maybe it can fully utilize a 6-machine setup, or maybe the db falls over first and you need to fix performance bugs before an instance can scale to the user counts necessary to support bigger hardware setups.
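              As a concrete (hypothetical) illustration of that first part: with a compose file where the scaled services have no fixed container_name and no conflicting published host ports, something like this runs several copies of the stateless containers:

              ```sh
              # Scale the stateless services to multiple replicas (service names are illustrative).
              docker compose up -d --scale lemmy=3 --scale lemmy-ui=3

              # pict-rs and postgres stay at a single instance and are scaled vertically instead.
              docker compose ps
              ```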

                • PriorProject@lemmy.world · ↑4 · 1 year ago

                  What is Pict-rs?

                  The image-hosting component: https://crates.io/crates/pict-rs

                  I wonder if you could replace Postgres with CockroachDB in this instance to scale out.

                  It’s not a crazy idea, but it’s not obviously a panacea either.

                  • Cockroach is harder to admin and would only provide benefit on giant instances. It would be worse for small/beginner instances, which is the majority of them right now.
                  • Cockroach doesn’t support the full PG protocol, so there may be porting issues.
                  • Cockroach can have poor performance on some query types. I don’t have deep knowledge of what kinds or if Lemmy would be impacted, but when I read about cockroach migrations it’s common to find some performance footgun people had to work around. This can also add to the porting effort and could make it harder to develop for Lemmy on cockroach.
                  • There’s also AWS Aurora, for distributed PG-compatible dbs.
                  • Finally Postgres scales pretty big. You can’t run reddit or Facebook on a single-write leader with a bunch of read replicas, but you can run a pretty big website that way.

                  All of which is to say, maybe there’s something there… but often distributed databases have fewer features and bigger footguns than RDBMSes. If you want to try Cockroach or Aurora, you don’t need the devs’ help. You can just stand up a Lemmy instance that points at them. If the compatibility is really good enough, it will “just work” and you can try some performance testing. If it doesn’t, then you have your answer: the porting effort is only required for a speculative benefit.
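                  If anyone does try that, the hook is simply the database connection string Lemmy reads at startup. The variable name and URL below are illustrative (the exact setting depends on the Lemmy version and whether it is configured via lemmy.hjson or the environment), not a tested configuration:

                  ```sh
                  # Hypothetical: point a test Lemmy instance at a CockroachDB cluster instead of Postgres.
                  # 26257 is CockroachDB's default SQL port; credentials and host are placeholders.
                  export LEMMY_DATABASE_URL="postgres://lemmy:change-me@cockroach.example.internal:26257/lemmy"

                  # Then bring the instance up and see whether the migrations and queries survive contact.
                  docker compose up -d
                  ```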