• FuckBigTech347
    4 months ago

    Coding directly in assembly is rare.

    I used to think that, but when you’re dealing with a lot of low-level stuff you’ll eventually realize that Compilers are pretty bad at generating fast and reliable Assembly where it’s needed. Also, some Architectures have specific machine instructions that Compilers just don’t take advantage of, no matter what flags you enable.

    • CanadaPlus@lemmy.sdf.org
      4 months ago

      Also, some Architectures have specific machine instructions that Compilers just don’t take advantage of, no matter what flags you enable.

      Interesting. Do you have some examples?

      Writing those frequently called leaf functions in assembly has certainly far outlived its use in other places. But the word on the street, or I guess the conventional wisdom, is that compilers have gradually caught up even there.

      • FuckBigTech347
        4 months ago

        Some that come to mind right now are RDRAND and RDSEED on x86, both of which let you get a random number straight from the CPU. gcc and clang never replace calls to libc’s rand() with them, even though they happily replace calls to memset() and memcpy() with a chunk of assembly that does the same thing wherever possible. The only way to make use of those instructions is either through builtin intrinsics (which are basically inline assembly) or by writing straight-up assembly.
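
        For anyone who wants to try the intrinsic route, here is a minimal sketch, assuming gcc or clang on x86. `_rdrand32_step` comes from `<immintrin.h>`; the helper names `cpu_has_rdrand` and `hw_random_u32` are made up for this example, and the retry count of 10 is an arbitrary choice.

        ```c
        #if defined(__x86_64__) || defined(__i386__)
        #include <cpuid.h>
        #include <immintrin.h>

        /* Returns 1 if CPUID leaf 1 reports RDRAND support (ECX bit 30). */
        int cpu_has_rdrand(void)
        {
            unsigned int eax, ebx, ecx, edx;
            if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
                return 0;
            return (int)((ecx >> 30) & 1u);
        }

        /* The target attribute enables the RDRAND intrinsic for this one
         * function, so the file should build without passing -mrdrnd.
         * Returns 1 and fills *out on success, 0 if the CPU kept failing. */
        __attribute__((target("rdrnd")))
        int hw_random_u32(unsigned int *out)
        {
            /* RDRAND can fail transiently, so retry a few times. */
            for (int i = 0; i < 10; i++) {
                if (_rdrand32_step(out))
                    return 1;
            }
            return 0;
        }
        #else
        /* Not x86: report no support and never succeed. */
        int cpu_has_rdrand(void) { return 0; }
        int hw_random_u32(unsigned int *out) { (void)out; return 0; }
        #endif
        ```

        Check `cpu_has_rdrand()` before calling `hw_random_u32()`, since executing RDRAND on a CPU that lacks it raises an illegal-instruction fault.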

        Fairly recently I had to write custom multi-threading code, and the spawned thread processes kept randomly crashing when they reached the end of their main function. At that point they send a “done” signal to the main process to let it know that they’re finished and that their resources can now be de-allocated. The crash happened because the compiler generated code that still touched the thread process’ stack long after it had sent the “done” signal, so sometimes the main process had already de-allocated the thread’s stack in the meantime. No amount of massaging carefully crafted C code can get common compilers to reliably generate bug-free code in this case, and there is no way to tell the C compiler that the stack is a no-no zone after a certain point or within a certain scope (at least none that I know of). The easiest solution was to just write some assembly by hand.
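
        To give an idea of the shape of such a fix, here is a rough sketch, assuming Linux on x86-64 with gcc or clang. It is not the actual code from that project: the function name `signal_done_and_exit` and the plain 32-bit “done” flag are assumptions made for illustration. The point is that the flag store and the exit syscall sit in one asm statement, so no compiler-generated stack access can land between them.

        ```c
        #include <stdint.h>

        #if defined(__linux__) && defined(__x86_64__)
        /* Publishes "done" and terminates the calling thread without touching
         * the stack after the store. Anything the compiler spills happens
         * before the asm statement, while the stack is still valid. */
        __attribute__((noreturn))
        void signal_done_and_exit(volatile int32_t *done_flag)
        {
            __asm__ volatile(
                "movl $1, (%0)\n\t"       /* set flag; stack may be freed from here on */
                "movl $60, %%eax\n\t"     /* 60 = SYS_exit, ends only this thread */
                "xorl %%edi, %%edi\n\t"   /* exit status 0 */
                "syscall"
                :
                : "r"(done_flag)
                : "rax", "rdi", "rcx", "r11", "memory");
            __builtin_unreachable();
        }
        #endif
        ```

        This sketch ignores at least one hazard: if a signal is delivered to the thread between the store and the syscall, the kernel will try to build a signal frame on the possibly freed stack, so a real version would probably want signals blocked first.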