I finally get vector embeddings and this whole thing about transformers.

Here’s a Deepseek explanation:

If it sounds confusing, it was to me too until I started working on a project for myself (a discovery app for Linux so I can find apps more easily); then it all made sense.

What I’m using for this app is a sentence-transformer model. It’s a tiny (~90MB) pre-trained LLM that embeds vectors over 384 dimensions.

Here's what that means, using the example in the picture above:

# Your transformer model does this:

“read manga” → [0.12, -0.45, 0.87, …, 0.23] # 384 numbers!

“comic reader” → [0.11, -0.44, 0.86, …, 0.22] # Very similar numbers!

“email client” → [-0.89, 0.32, -0.15, …, -0.67] # Very different numbers!
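For reference, here's roughly what producing those numbers looks like in code. This is just a minimal sketch using the sentence-transformers library; all-MiniLM-L6-v2 is one ~90MB model that outputs 384-dimensional vectors like the one I described (swap in whichever sentence-transformer you prefer):

```python
# Minimal sketch: turn sentences into 384-dimensional embedding vectors.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # ~90MB, 384 dimensions

sentences = ["read manga", "comic reader", "email client"]
embeddings = model.encode(sentences)

print(embeddings.shape)  # (3, 384): one 384-number vector per sentence
```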

These embeddings capture semantic meaning, just by transforming sentences into a series of numbers. You can see ‘comic reader’ is close to ‘read manga’: not exactly the same, but much closer than “email client” is. So mathematically, we know that ‘read manga’ and ‘comic reader’ must have similar meanings in language.

When you search for “read manga” in my app, it transforms that query into a vector. Then, using cosine similarity, which basically measures how closely two embeddings point in the same direction (the angle between them), you can calculate the similarity between two concepts:
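Under the hood, cosine similarity is just the dot product of the two vectors divided by the product of their lengths. A minimal sketch, reusing the model from the snippet above (exact numbers depend on the model, but ‘comic reader’ should score far higher than ‘email client’):

```python
import numpy as np

def cosine_similarity(a, b):
    # Dot product divided by the product of the vectors' lengths (norms):
    # 1.0 means they point the same way, values near 0 mean unrelated.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

query_vec = model.encode("read manga")
print(cosine_similarity(query_vec, model.encode("comic reader")))  # high
print(cosine_similarity(query_vec, model.encode("email client")))  # much lower
```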

With a transformer, at no point do you have to pre-write a synonyms list (for example, that ‘video’ is a synonym of mp4, film, movie, etc. - traditionally this would be hard-coded, with someone writing out the synonyms one by one).

Instead, the sentence-transformer model comes pre-trained: it learned these semantic relationships on its own, and you can just use it; it works out of the box.

It is itself a very small LLM, and both it and the big LLMs use the transformer architecture; the sentence-transformer just stops once it has the vector embeddings instead of going on to generate text.

And this is how Deepseek, in the example above, is able to tell me about Komikku, which is a real app for reading manga, instead of randomly naming VLC or inventing a fake name. You’ll notice it was also able to write a pretty convincing example of vector embeddings, with numbers that are actually close together or far apart.

This stuff gets complex fast; even now I’m not sure I’m accurately representing how LLMs work lol. But this is a fundamental mechanism of LLMs, including the CLIP models in image generation (which encode your prompt into something the checkpoint can understand).

FAISS is the second step; it’s what allows us to query that matrix of vectors. After we’ve transformed your search query into a vector, we need to compare it to all the other vectors, and that’s what FAISS is for, much like SQL is how we talk to a database.
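Not my app’s exact code, but a minimal sketch of the idea, assuming the faiss-cpu package and the model from the snippets above (the app descriptions here are made-up placeholders):

```python
import faiss
import numpy as np

# Made-up placeholder descriptions; in the real app these come from each
# application's short description.
descriptions = [
    "Komikku: a manga reader",
    "Thunderbird: an email client",
    "VLC: a media player for video and music",
]

# Normalizing the vectors makes the inner product equal to cosine similarity.
doc_vecs = model.encode(descriptions, normalize_embeddings=True)

index = faiss.IndexFlatIP(doc_vecs.shape[1])      # 384-dimensional index
index.add(np.asarray(doc_vecs, dtype="float32"))

# Embed the search query the same way and ask for the top 3 matches.
query_vec = model.encode(["read manga"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query_vec, dtype="float32"), 3)

for score, i in zip(scores[0], ids[0]):
    print(f"{score:.3f}  {descriptions[i]}")
```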

I don’t know if this is true of everyone but I learn best by having actual real projects to delve into. Using LLMs like this, anyone can start learning about concepts that seem completely above their level. The LLM codes the app but even with this level of abstraction/black box (I’m not entirely sure of the app’s logic, though the LLM can explain it to me), you learn new things. And most of all: you can start making new solutions that you couldn’t before.

The app

Sentence transformers make for a very powerful search engine, and not only that, anyone can do it. I’m making my app with crush, and it works. In fact, it took less than 24 hours to get it all up and running (and I was asleep for some of those hours).

Example of usage. Similarity is the cosine similarity we talked about: how close the two embeddings (your search and the app’s description) are to each other.

No synonym dictionary, no regex fuzzy matching, no hard-coded concepts. Everything is built and searched automatically, and that is very powerful. It also means you can run very long, specific queries and it will still find results.

The next steps are 1. optimization and 2. further improving the search results. This is just simple cosine similarity, but we could do more if we had more data than just the short description. That’s a bit of a challenge since I’m not sure where to get that data from exactly, but we’ll get there. I’m sure there’s also a bunch more math we could add to refine the results.

As you can see, the best similarity for the query above (which is admittedly very specific) is only 58.4% - the cosine similarity expressed as a percentage. I want to be able to reliably get up to at least 75%. To do that, you can either refine how the embedding works or add complementary methods on top of what’s already there.

I keep saying it: LLMs and this whole tech allow us to approach problems in a different way. This is completely different from doing fuzzy search, which was the standard for years. 12 hours is all you need to deploy a search engine now.

As for optimization: the app needs the ~90MB model on your machine to embed your search query, and it creates some heavy JSON files (10+ MB). That’s okay for the prototype, and I’m personally fine with going up to ~250MB total disk space, but no more. The bigger problem is that development currently runs on PyTorch, which takes up several gigabytes of space, which is just ludicrous. But first I want to finish the prototype and implement everything; then I can look at optimization and refactoring.

Oh yeah and it can also do that out of the box:

(Katawa Shoujo being included was funny lol, but the rest works: we name a completely different app in the query and it still finds results, but for books, novels, ebooks, etc.!)

If I ever finish this - I’m having some issues with crush I want to fix, and I don’t want to rush and burn myself out - I’ll put it up on my codeberg under MIT licence.

And with Deepseek being so cheap, the total cost so far is not even $2.50 lol. You can power this sort of development with $1 ko-fi donations.

PS: If there’s something you’re really looking for, I can run some searches for apps and send you the results. It could be worth building a small index of unsatisfactory results too.

    • ☆ Yσɠƚԋσʂ ☆ · 10 days ago

      It’s really unbecoming behavior to berate people who are learning and sharing what they’ve figured out. The LLM clearly explained the concept in an accessible way. Complaining about “AI slop” isn’t useful. Stop gatekeeping and telling people how to learn.

      • Philo_and_sophy · 10 days ago

        If that reads as beratement, I do apologize as a self crit. Yet if I were to hold the same level of consideration, there are many comments in this post that also warrant removal because I certainly don’t feel respected

        To the point of complaining isn’t useful, that’s why I also included resources for anyone who wants to learn. Not a defense, but reflecting intention

        Moreover, I accept the criticism that there are more helpful approaches. It would take much more time to correct the factual errors but that would be a start

        Personally I don’t see how asking people not to post AI generated information vs linking to human based and vetted resources is gatekeeping, but I accept that standard if it is

        All that said, i just realized we don’t have any explicit rules against misinformation, which was my fundamental issue.

        • amemorablename · 10 days ago

          It sounds like there are two main things at work here:

          1. Validity of information provided by AI without cross-referencing (backing up what they say with other sources). I think this is a valid concern, though trying to have that discussion within this thread probably isn’t the best place for it. It may be better to make a dedicated thread and link to this one if you want to use it as a reference for incorrect output.

          2. Correcting information you recognize to be incorrect. If you can provide sourcing on what’s incorrect about the output, that would certainly help the case that we should be wary of posting AI output without other sources.

          But your original comment reads to me like the usual reaction of people hating AI because it’s AI, rather than directly addressing these issues. I don’t think it’s too late to course correct and focus on correctness of information.

          • Philo_and_sophy · 10 days ago

            Agreed on the first two points fam. Though I’m legitimately very puzzled how multiple people read that comment as anti AI when the point was preserving the human and social connections underlying this tech imho

            We as socialists/communists have a more rooted appreciation for humanity and society than our liberal public, so I assumed that value would be assumed in my comment rather than the anti ai sentiment.

            Probably a mix of poor assumptions and being bad at off the cuff words/comments 🤷🏿‍♀️

            To the point of course correction, I’m tapping out actually. This is the culmination of months of trying to combat misinformation in this community, specifically from this poster.

            I have nothing but respect for them as a comrade and genuinely support their work in AI, but they post things that gain traction in the community even though it’s factually incorrect (most recently about SocialismAI vs their MCP server project)

            It’s really hard (i.e. rigorous and time consuming) to manually correct information, hence me making the request that I did. And if we start using AI for corrections, then we’ve lost the script completely imho

Thanks for the helpful comment

            • CriticalResist8 (OP) · 10 days ago

              The two systems (WSWS AI and MCP) are RAGs. Just that one was made in the simplest way possible and the other was made “correctly”. Conceptually both perform the same function: retrieving information from a pool of texts and basing the LLM’s response on the information found there. It seems we might disagree on definition there, rereading your comment from back then, and that’s fine. But it doesn’t mean I’m spreading misinformation just because we don’t work on the same definition of what a RAG is.

              You can’t just say “I’m right, you’re wrong” and expect people to follow along. Nobody knows literally everything.

              I’m not sure what else to say. If you must impart knowledge then make your own posts to share and teach people. And if you have a problem with me then I can only recommend you block me. “Months” is stretching it, the only other time I know we interacted was with that MCP post which was 17 days ago. One time two months ago you asked about Mistral (and a definition of what free means to you so that’s why we may be working on different definitions) and that same month asked about the ProleWiki dump files - I checked my inbox. But those two occurrences were not combating misinformation but asking a question.

              Again in your comment you talk about me (directly) sharing misinformation but make no indication of what that misinformation is, neither broadly nor specifically. It comes across as you don’t like something about how I presented that information, but we don’t know what and we are left to guess. The way you say it, even if it wasn’t intended (and why I’m pointing it out) is as if I go around just saying shit for the lolz.

              I’ve been wanting to have a glossary/encyclopedia of LLM-related terms, especially pertaining to users who want to start understanding all the terms they might hear about AI. If you want to contribute, I’ve been hopeful for someone to make that glossary.

              One last thing I want to point just for thoroughness,

              when the point was preserving the human and social connections underlying this tech imho

              This implies there are “right” ways to use AI and “wrong” ways. But this is exactly in-the-box thinking. What is the right way to use AI to you may not be important at all to someone else. To be cohesive we can’t on the one hand say “yes you can use AI to code an app instead of asking your developer friend” and on the other “oh but don’t trust it to explain that app go read the books written by people on it”. Either we accept AI will replace some ‘social’ connections or we think it shouldn’t replace anything and burn it down entirely. But the moment you pass a prompt to an LLM, any prompt, is the moment you are not asking Google/your friend/a lemmy thread instead.

            • amemorablename · 10 days ago

              I think it’s largely because situations keep coming up on here where somebody posts something AI and it gets dismissive comments directed toward it. It’s not that AI can’t have problems. It’s that the content of criticisms is often very shallow.

              And again, I think AI in the realm of fact is a real concern. I’m not expecting you to manually correct AI output every time you see it somewhere, but if you can correct in this context, you could leverage it in a broader argument about risks of engaging with AI output in matters of fact. Without those corrections, we have to take you at your word that you’ve observed factual errors and the position comes out weaker as a result, even though I think it is a good one, broadly speaking.

              • Philo_and_sophy · 10 days ago

                I’m only pulling off the top, but there are more issues that hopefully will be addressed by others

                What I’m using for this app is a sentence-transformer model. It’s a tiny (~90MB) pre-trained LLM that embeds vectors over 384 dimensions.

                The poster isn’t using an LLM but a text encoder. This is important because the model will never generate text, only vector embeddings. They allude to as much in their post, but I don’t know if they are aware by their own admission

                On the more pedantic, yet still meaningful side: The first L in LLM is for large, and by their own admission it’s a tiny model

                And also these models generate embeddings vs embed vectors. No model embeds vectors afaik.

                • amemorablename · 10 days ago

                  Hmm, okay. I was more thinking of errors the model itself made in describing how things work. But correcting other people is reasonable too.

                  That said, I’m not so sure about this point:

                  No model embeds vectors afaik.

                  From what I can find through a cursory search, there is something called vector embedding going on, at least in the context of LLMs. I guess this smaller model is a different story if it is a wholly different kind of architecture, as you say.

                  https://medium.com/@narendra.squadsync/vector-embeddings-in-large-language-models-llms-3e746f1063f3

                  https://labs.adaline.ai/p/how-do-embeddings-work-in-llms

                  https://ml-digest.com/architecture-training-of-the-embedding-layer-of-llms/

                  (I don’t know if these sources are reliable, it’s just what I could find.)

                  • Philo_and_sophy · 10 days ago

                    Respectfully, there’s no embedding of vectors, even in these examples. These are all examples of models which generate embeddings or the embedding layers in the models themselves. You’re implicitly proving my point about how hard it is to understand this work

                    An easy thought experiment is if you incrementally “embed vectors” or accumulate any data into a given model, it eventually expands to being unhostable.

                    In reality, we train models and transformers so they have internal latent representations via their weights.

Many neural nets have embedding layers, but those are about mapping and shaping the data as it flows through the model. But again, these are trained, not “embedded”

                    But to the larger point, these are quality resources from people which have been vetted (via PageRank I assume). It’s so much easier to have these discussions with a static knowledge base, vs AI output that can never be replicated verbatim

                • m532 · 10 days ago

                  I wonder what those “more issues” are when you want someone to “delete your post” over a bunch of nitpicks (some of them wrong even)…

                  And also these models generate embeddings vs embed vectors. No model embeds vectors afaik.

                  An encoder model generates embeddings for the input. The embeddings are tensors. Vectors are 1-dimensional tensors. Most models use higher-dimensional tensors, but those could also be view-ed as 1-dimensional. So, every model with embeddings embeds vectors.

                  The first L in LLM is for large, and by their own admission it’s a tiny model

When LLMs were invented, 90MB models were large.

                  • Philo_and_sophy · 9 days ago

                    You’re contradicting yourself in your own paragraph fam

                    An encoder model generates embeddings for the input.

                    But also

                    So, every model with embeddings embeds vectors.

                    Which one is it, do models generate embeddings or do they embed vectors?


                    And to be clear, I believe that your assertion that models “embed vectors” is incorrect, I just want you to clarify your rebuttal 🤷🏿‍♀️

                    To the last point, is it still the 1900s? Do we still call movies talkies? Language matters, especially when your intent is to educate

LLMs are integrally tied to transformer architectures, and transformers enabled devs to scale language models into LLMs

GPT-1, the first LLM, was 117 million parameters, which is much larger than OP’s tiny “LLM”