A widespread Blue Screen of Death (BSOD) issue on Windows PCs disrupted operations across various sectors, notably impacting airlines, banks, and healthcare providers. The issue was caused by a problematic channel file delivered via an update from the popular cybersecurity service provider, CrowdStrike. CrowdStrike confirmed that this crash did not impact Mac or Linux machines.

Although many may view this as an isolated incident, similar problems have been occurring for months with little public attention. Users of Debian and Rocky Linux also experienced significant disruptions as a result of CrowdStrike updates, raising serious concerns about the company’s software update and testing procedures. These occurrences highlight potential risks for customers who rely on their products daily.

  • sudo@programming.dev
    2 months ago

    The analysis revealed that the Debian Linux configuration was not included in their test matrix.

    You might as well say you don’t support Linux.

    “Crowdstrike’s model seems to be ‘we push software to your machines any time we want, whether or not it’s urgent, without testing it’,” lamented the team member.

    I wonder how this shit works on NixOS.

    • Flatfire@lemmy.ca
      2 months ago

      If I’m remembering right, RHEL is Crowdstrike’s primary Linux target. And NixOS wouldn’t even be a factor since it’s basically just not enterprise grade.

      That said, they need a serious revision of their QA processes.

      • circuscritic@lemmy.ca
        2 months ago

        RHEL, Ubuntu, & Debian cover the vast majority of enterprise installs I imagine, and provide a solid testing base for developers in the Linux business software space.

        Maybe you add Gentoo, some post-CentOS clones/forks, or other more niche industry/workload specific distros, but how do you skip Debian?

        • lemmyreader@lemmy.ml
          2 months ago

          RHEL, Ubuntu, & Debian cover the vast majority of enterprise installs I imagine, and provide a solid testing base for developers in the Linux business software space.

          Enterprises, I imagine, are using RHEL, Ubuntu, SUSE’s SLES and Oracle Linux, and probably not Debian. But that’s a guess. Where can statistics and numbers be found?

          • Pup Biru@aussie.zone
            2 months ago

            consultant for large enterprises in australia, and i literally can’t say i’ve ever seen anyone running anything other than RHEL and amazon linux (so… RHEL) in production… unless we’re talking not for profits, and then that’s been a bit of a mixed bag

  • Telorand@reddthat.com
    2 months ago

    Users of Debian and Rocky Linux also experienced significant disruptions as a result of CrowdStrike updates, raising serious concerns about the company’s software update and testing procedures. These occurrences highlight potential risks for customers who rely on their products daily.

    Hot take: maybe bossware is a fucking drain on society, and people should stop buying it.

  • SkyNTP@lemmy.ml
    2 months ago

    The software is not the problem. Software breaks all the time. The problem is monocultures and centralization. Building entire industry ecosystems all around a single point of failure. This is the just-in-time manufacturing supply chain disruptions and fragility all over again.

    Who knew, a diverse ecosystem was a strength, not a weakness.

    • Ooops@feddit.org
      2 months ago

      The software is the problem if it’s produced with a corporate mentality of “ship first, fix later”.

    • rozodru@lemmy.ca
      2 months ago

      also doesn’t help when the CEO of said company isn’t a fan of testing or code reviews and IS a fan of crunch and speedy development. One of the reasons that whole McAfee snafu also happened. He believed development at McAfee was too slow.

  • Toes♀@ani.social
    2 months ago

    There’s a concept in this industry where you eat your own dog food.

    Deploying these updates to your own people could have avoided this mess.

    • themoonisacheese@sh.itjust.works
      2 months ago

      Oh but they did. Turns out that this is specifically caused by one driver expecting another to be installed, the other one being for another of their products. If you have the other product installed, it doesn’t crash, so it didn’t crash on their machines because they have all their products installed, and apparently not a single element of their test matrix has the single most common configuration they service.
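
The test-matrix point above can be illustrated with a minimal sketch. It assumes the commenter's (unconfirmed) premise that behavior depends on which optional components are installed. The platform and component names are hypothetical placeholders, not CrowdStrike's real products or its actual test setup. The idea is only that a matrix built from every platform and every installed/absent combination necessarily includes the "nothing extra installed" configuration.

```python
from itertools import product

# Hypothetical names for illustration only; not real product or platform lists.
PLATFORMS = ["windows", "debian", "rocky"]
OPTIONAL_COMPONENTS = ["component_a", "component_b"]


def build_test_matrix(platforms, optional_components):
    """Pair every platform with every subset of optional components."""
    matrix = []
    for platform in platforms:
        # Each component is either installed (True) or absent (False).
        for flags in product([True, False], repeat=len(optional_components)):
            installed = frozenset(
                name for name, on in zip(optional_components, flags) if on
            )
            matrix.append((platform, installed))
    return matrix


matrix = build_test_matrix(PLATFORMS, OPTIONAL_COMPONENTS)
```

With three platforms and two optional components this yields twelve configurations, including each platform with no optional component installed. A real matrix would likely be pruned, since the full product grows quickly as components are added.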

      • Fox@pawb.social
        2 months ago

        Do you have a source for that? I’m intrigued. Their own blog post is only talking about a “logic error”.

        • themoonisacheese@sh.itjust.works
          2 months ago

          It’s a very educated guess based on the following:

          The crash is a null pointer dereference, which a linter ought to catch.

          The crash does not happen if you have crowdstrike sensor installed, which is weird because crowdstrike sensor’s job is not to prevent any crashes.

          Hence the guess: the update they pushed tries accessing memory in sensor, but if it’s not installed the pointer is null and that’s Bye-Bye.
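
The guess above can be sketched in a few lines. This is an analogy under stated assumptions, not the actual driver code: Python's `None` stands in for a null pointer, and `SensorState`, `find_sensor`, and both `read_version_*` functions are hypothetical names invented for the example.

```python
from typing import Optional


class SensorState:
    """Hypothetical stand-in for data owned by a separately installed component."""

    def __init__(self, version: str) -> None:
        self.version = version


def find_sensor(installed: dict) -> Optional[SensorState]:
    # Returns None when the component is absent, the analogue of a null pointer.
    return installed.get("sensor")


def read_version_unguarded(installed: dict) -> str:
    # Assumes the component is always present.
    # Raises AttributeError when it is not (the analogue of the crash).
    return find_sensor(installed).version


def read_version_guarded(installed: dict) -> Optional[str]:
    # Checks for absence before touching the component's data.
    sensor = find_sensor(installed)
    if sensor is None:
        return None
    return sensor.version
```

In kernel-mode code there is no exception to catch, so the unguarded pattern would bring down the whole machine rather than one process, which is why a missing check of this kind would matter so much if the guess is right.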

        • lemmyvore@feddit.nl
          2 months ago

          I heard a different rumor, that the driver file they pushed was all zeros. I’m inclined to believe that one.

      • Mango@lemmy.world
        2 months ago

        This is the best explanation of this I’ve heard and you’re just like… A dude on Lemmy.

  • LeFantome@programming.dev
    2 months ago

    The article implies that the CrowdStrike issue impacted only Debian and Rocky 9.4. Debian I can see. But how did something impact Rocky but not RHEL itself or Alma or Oracle?

    Is Rocky actually different from RHEL now? Their entire brand promise is that they are the same.

  • rsp@ecoevo.social
    2 months ago

    @lemmee_in I can’t find any news about this. Just a statement in a forum and everyone basing subsequent articles on that. It appears to have been limited to a single company? Is there any support for this claim?

    • learningduck@programming.dev
      2 months ago

      Based on the article, it seems like the issue only happens on a specific distro. Is it only Rocky or other Debians?

      I wonder if other distros experience similar issues. Maybe Linux users don’t even install CS at all and try to keep their OS as lean as possible.