• QuadratureSurfer@lemmy.world
    English
    arrow-up
    24
    ·
    6 months ago

    I’m just glad to hear that they’re working on a way for us to run these models locally rather than forcing a connection to their servers…

    Even if I would rather run my own models, at the very least this incentivizes Intel and AMD to start implementing NPUs (or maybe we’ll actually see plans for consumer-grade GPUs with more than 24GB of VRAM?).

    • suburban_hillbilly@lemmy.ml
      arrow-up
      28
      ·
      6 months ago

      Bet you a tenner that within a couple of years they start using these systems as distributed processing for their in-house AI training to subsidize costs.

      • 8ender@lemmy.world
        English
        arrow-up
        6
        ·
        6 months ago

        That was my first thought. Server-side LLMs are extraordinarily expensive to run. Download the costs to users.

      • QuadratureSurfer@lemmy.world
        English
        arrow-up
        3
        ·
        6 months ago

        Similar use cases to what I’m doing right now: running LLMs like Mixtral 8x7B (or something better by the time we start seeing these), Whisper (STT), or Stable Diffusion.

        I use a fine-tuned version of Mixtral (dolphin-Mixtral) for coding purposes.

        Transcribing live audio for notes/search, or translating audio from other languages using Whisper (especially useful for verifying claimed translations from Russian/Ukrainian/Hebrew/Arabic, given all of the fake information being thrown around).

        Combine the 2 models above with a text-to-speech (TTS) system, a vision model like LLaVA, and some animatronics, and then I’ll have my own personal GLaDOS: https://github.com/dnhkng/GlaDOS

        And then there’s Stable Diffusion for generating images for DnD recaps, concept art, or even just avatar images.

        • Alphane Moon@lemmy.ml
          arrow-up
          2
          ·
          6 months ago

          Thank you! I currently use my 3080 dGPU for Stable Diffusion. I wonder to what extent NPUs will be usable with Stable Diffusion XL.

  • ThyTTY@lemmy.world
    arrow-up
    5
    ·
    6 months ago

    Shit, I think I need to buy a second laptop just in case, because otherwise I will be forced to buy a machine with all this crap.

  • AutoTL;DR@lemmings.world
    English
    arrow-up
    3
    ·
    6 months ago

    This is the best summary I could come up with:


    At a minimum, systems will need 16GB of RAM and 256GB of storage, to accommodate both the memory requirements and the on-disk storage requirements needed for things like large language models (LLMs; even so-called “small language models” like Microsoft’s Phi-3 still use several billion parameters).

    Microsoft says that all of the Snapdragon X Plus and Elite-powered PCs being announced today will come with the Copilot+ features pre-installed, and that they’ll begin shipping on June 18th.

    But the biggest new requirement, and the blocker for virtually every Windows PC in use today, will be for an integrated neural processing unit, or NPU.

    Right now, that requirement can only be met by a single chip in the Windows PC ecosystem, one that isn’t even quite available yet: Qualcomm’s Snapdragon X Elite and X Plus, launching in the new Surface and a number of PCs from the likes of Dell, Lenovo, HP, Asus, Acer, and other major PC OEMs in the next couple of months.

    NPUs that meet Microsoft’s Copilot+ PC requirement will be used to power (among other things) a group of features that Microsoft is calling “Recall,” a collection of AI features that will try to make helpful suggestions by keeping track of everything you’ve done on your PC, including attending meetings, opening files, and doing web searches.

    On a Copilot+ PC with the minimum 256GB SSD, Microsoft says Recall will take up about 25GB of disk space and store around three months’ worth of events.


    The original article contains 888 words, the summary contains 245 words. Saved 72%. I’m a bot and I’m open source!

  • UnfortunateShort@lemmy.world
    arrow-up
    2
    ·
    edit-2
    6 months ago

    That’s 40 TOPS, no? I mean, why use standard nomenclature when you can have a big number, I guess.

    And yeah, that’s a lot of OPS for a ‘+’.

  • shirro@aussie.zone
    English
    arrow-up
    4
    arrow-down
    3
    ·
    6 months ago

    Don’t give a fuck about Microsoft. The last notable products they invented were Windows 95 and Office on the Mac. It has all been downhill since. An NPU isn’t going to make gcc or games run faster, so who the fuck needs it?