Mozilla’s innovation group and Justine Tunney just released llamafile, and I think it’s now the single best way to get started running Large Language Models (think your own local copy of ChatGPT) on your own computer.
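For context, getting started really is just download, mark executable, and run. Here’s a minimal sketch based on the project’s README at release; the specific model file and URL (the LLaVA 1.5 llamafile from the jartine Hugging Face repo) are assumptions and may have changed since:

```sh
# Download a llamafile (model weights + runtime in a single executable).
# File name and URL assumed from the llamafile README at the time of release.
curl -LO https://huggingface.co/jartine/llava-v1.5-7B-GGUF/resolve/main/llava-v1.5-7b-q4.llamafile

# Mark it executable (macOS/Linux; on Windows you rename it to add .exe instead).
chmod +x llava-v1.5-7b-q4.llamafile

# Run it — it starts a local server and serves a chat UI at http://127.0.0.1:8080.
./llava-v1.5-7b-q4.llamafile
```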
What kind of specs do you need to run one of these?
Seems like 4 GB of RAM is enough for the smaller models; after that, the difference is mostly how fast it generates responses, which comes down to how good the CPU is.