• merc@sh.itjust.works
    10 months ago

    Not only that, but the same variables that turn on “hallucination” are the ones that make it interesting.

    By the very design of generative LLMs, the same knob that makes them unpredictable makes them invent “facts”. If they’re 100% predictable they’re useless because they just regurgitate word for word something that was in the training data. But, as soon as they’re not 100% predictable they generate word sequences in a way that humans interpret as lying or hallucinating.

    So you can’t have a generative LLM that is “creative”, in the sense that it comes up with novel sequences of words, without also having “hallucinations”.
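The “knob” in this comment is presumably what is usually called sampling temperature, though the comment does not name it. A minimal sketch under that assumption, with made-up tokens and scores (a real model scores tens of thousands of tokens at every step):

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next(tokens, logits, temperature, rng):
    """Pick the next token: greedy at temperature 0, weighted random otherwise."""
    if temperature == 0:
        return tokens[logits.index(max(logits))]
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical candidates and scores for one next-word prediction.
tokens = ["Paris", "Lyon", "Atlantis"]
logits = [4.0, 2.0, 0.5]

rng = random.Random(0)
greedy = [sample_next(tokens, logits, 0, rng) for _ in range(5)]
cold = softmax_with_temperature(logits, 0.5)
warm = softmax_with_temperature(logits, 1.5)
```

With temperature 0 the top-scoring token wins every time. Raising the temperature shifts probability toward the low-scoring candidates, so in this toy setup the same setting that adds variety also makes “Atlantis” more likely.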

    • JoBo@feddit.uk
      10 months ago

      the same knob that makes them unpredictable makes them invent “facts”.

      This isn’t what makes them invent facts, or at least not the only (or main?) reason. Fake references, for example, arise because the model encounters references in its training text, so it knows what they look like and where they should be used. It just doesn’t know what a reference is, or that one is supposed to match up to something real which says what the text implies that it says.
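A toy sketch of this point, not a claim about how any real LLM is built: a hypothetical bigram counter that has only learned which word tends to follow which. The “citations” are made-up placeholders. Even with no randomness at all, it can emit a well-formed reference that was never in its training data.

```python
from collections import Counter, defaultdict

# Made-up placeholder citations, used only as toy training data.
training_refs = [
    "Smith 2019 Nature",
    "Smith 2019 Nature",
    "Jones 2019 Science",
]

def train_bigrams(lines):
    """Count which token follows which; this is all the toy model 'knows'."""
    follows = defaultdict(Counter)
    for line in lines:
        words = line.split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def greedy_complete(follows, start, max_len=3):
    """Always take the single most frequent next token (no randomness)."""
    out = [start]
    while len(out) < max_len and follows[out[-1]]:
        out.append(follows[out[-1]].most_common(1)[0][0])
    return " ".join(out)

model = train_bigrams(training_refs)
completed = greedy_complete(model, "Jones")
is_real = completed in training_refs
```

Here `completed` has the shape of a reference (author, year, venue), but `is_real` is false: the toy model learned the form of a citation, not the fact that it has to point at something that exists.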

      • merc@sh.itjust.works
        10 months ago

        so it knows what they look like and where they should be used

        Right, and if it’s set to a “strict” setting where it only ever uses the 100% perfect next word, then whenever the words leading up to a reference match something it has seen before, it will spit out that specific reference from its training data. But when it’s set to be “creative”, and to predict words that are a good but not perfect match, it will spit out references that are plausible but don’t exist.

        So, if you want it to only use real references, you have to set it up to not be at all creative and always use the perfect next word. But, that setting isn’t very interesting because it just word-for-word spits out whatever was in its training data. If you want it to be creative, it will “daydream” references that don’t exist. The same knob controls both behaviours.

        • JoBo@feddit.uk
          10 months ago

          That’s not how it works at all. That’s not even how references work.