Wowed by a new paper I just read and wish I had thought to write myself. Lukas Berglund and others, led by Owain Evans, asked a simple, powerful, elegant question: can LLMs trained on A is B infer automatically that B is A? The shocking (yet, in historical context, see below, unsurprising) answer is no:
Well yeah - because that’s not how LLMs work. They generate sentences that conform to the word-relationship statistics that were generated during the training (e.g. making comparisons between all the data the model was trained on). It does not have any kind of logic and it does not know things. It literally just navigates a complex web of relationships between words using the prompt as a guide, creating sentences that look statistically similar to the average of all trained sentences.
TL;DR; It’s an illusion. You don’t need to run experiments to realize this, you just need to understand how AI/ML works.
Tell that to all the tech bros on the internet are convinced that ChatGPT means AGI is just around the corner…