So after extending the virtual cloud server twice, we’re at the max for the current configuration. And with this crazy growth (almost 12k users!!), even the upgraded server is getting closer and closer to capacity.

Therefore I decided to order a dedicated server, the same kind as used for mastodon.world.

So the bad news… we will need some downtime. Hopefully not too much. I will prepare the new server, copy (rsync) stuff over, stop Lemmy, do a last rsync and change the DNS. If all goes well it should take maybe 10 minutes of downtime, 30 at most. (With mastodon.world it took 20 minutes, mainly because of a typo :-) )
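
For anyone curious, here is a rough sketch of that sequence as a dry-run plan in Python. It only builds and prints the commands; it does not run them. The data path, host name, and stop command are made up for illustration and are not the actual setup. The DNS change is left out because it happens at the DNS provider, not on the server.

```python
from __future__ import annotations

import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    description: str
    command: list[str]
    causes_downtime: bool


def migration_plan(data_dir: str, new_host: str, stop_cmd: list[str]) -> list[Step]:
    """Build the copy / stop / final-copy sequence without executing anything.

    data_dir, new_host and stop_cmd are hypothetical inputs for illustration.
    """
    # -a preserves permissions and timestamps; --delete removes files on the
    # target that no longer exist on the source, so the final pass is exact.
    sync = ["rsync", "-a", "--delete", data_dir, f"{new_host}:{data_dir}"]
    return [
        # The bulk copy runs while the site is still up, so it costs no downtime.
        Step("Bulk copy while Lemmy is still running", sync, False),
        # Downtime starts here: nothing may change the data after this point.
        Step("Stop Lemmy so the data stops changing", stop_cmd, True),
        # Same command again; rsync only transfers what changed since step 1.
        Step("Final rsync to pick up the last changes", sync, True),
    ]


def print_plan(plan: list[Step]) -> None:
    for number, step in enumerate(plan, start=1):
        marker = "DOWNTIME" if step.causes_downtime else "online"
        print(f"{number}. [{marker}] {step.description}")
        print(f"   {shlex.join(step.command)}")


# Hypothetical values, only to show the shape of the plan.
print_plan(migration_plan("/srv/lemmy/", "new-server", ["systemctl", "stop", "lemmy"]))
```

The point of doing the rsync twice is that the first pass moves the bulk of the data while the site is still online, so the pass inside the downtime window only has to transfer what changed in between.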

For those who would like to donate to cover server costs, you can do so at our OpenCollective or Patreon.

Thanks!

Update: The server was migrated. It took around 4 minutes of downtime. For those who asked, it now uses a dedicated server with an AMD EPYC 7502P 32-core “Rome” CPU and 128GB RAM. Should be enough for now.

I will be tuning the database a bit, so that will cause a few extra seconds of downtime, but just refresh and it’s back. After that I’ll investigate the cause of the slow posting further. Thanks @veroxii@lemmy.world for assisting with that.

  • Luca@lemmy.world
    1 year ago

    I’m not too familiar with Lemmy’s codebase, but I am a devops engineer. Is the software written in any way to support horizontal scaling? If so, I’d be happy to consult/help to get the instance onto an autoscaling platform eventually.

    • Gollum@feddit.de
      1 year ago

      The code is open source on GitHub and the backend is written in Rust.

      I have no idea how it goes in terms of scaling…

      • pleasemakesense@lemmy.world
        1 year ago

        Apparently it’s not ideal at horizontal scaling (that’s what I’ve picked up from reading stuff here, could be wrong).

        • nulldev@lemmy.vepta.org
          1 year ago

          I think they can horizontally scale the Postgres maybe? Postgres is probably the biggest performance bottleneck.

          • bobaduk@lemmy.world
            1 year ago

            Databases are also the hardest bit to horizontally scale. Web servers are easy cos they’re (usually) stateless. It’s state that’s hard to scale out.

          • pleasemakesense@lemmy.world
            1 year ago

            Have they implemented the Postgres? Last I read they were still using websockets (I think; I’m not a programmer and don’t know what all that means lmfao).

            • nulldev@lemmy.vepta.org
              1 year ago

              Postgres is a database. Websockets is a communication method between the browser and the server.

              So the infrastructure is like this:

              Browser <--Websockets--> Server <--> Postgres
              

              So there are a couple of problems here. First of all, websockets are very resource heavy, so too many of them will slow down the server; that’s why they are working on replacing websockets with something else. Second, the database (Postgres) is getting overloaded, so they need to figure out how to scale it up or use it more efficiently.

              • Luca@lemmy.world
                1 year ago

                Man, the place I work at has a single DB instance (with a read replica) serving millions of users. I’m not saying this should be true everywhere, but I don’t understand how Postgres is buckling here. Does Lemmy have a bunch of horrifically unoptimized queries, or is the DB just on an underpowered machine?

                • NebLem@lemmy.world
                  1 year ago

                  Yes to both. Lemmy does have a few PRs to make the queries more efficient (and not just blind generic ORM calls) but most instances outside of lemmy.world are very underpowered (which makes federation synchronization slow).

    • terebat@programming.dev
      1 year ago

      Doesn’t support HA or horizontal scaling yet, from what I read. Unsure if kbin does. They’d probably have to add support for horizontal scaling before autoscaling would do anything.

      • Luca@lemmy.world
        1 year ago

        Yeah, that’s what I was afraid of. Understandable though, since horizontal scaling/HA usually isn’t a priority when developing a new application.

        • terebat@programming.dev
          1 year ago

          I think there are people working on adding support for pgbouncer and splitting out Postgres from the core server to avoid having a single-box setup.