The title is self-explanatory. The benefits of this would be tremendous: if correctly trained and perfected, it would be the greatest tool to democratize knowledge about Marxism.

There are already several open-source large language models out there, but I think the biggest bottlenecks are the knowledge needed to deploy such models and the computing power to run them.

Thread to discuss this subject

  • @Neodosa

    First off, as someone who has programmed GPT stuff since way before ChatGPT, we don’t even need to train our own model. That is overly expensive and unnecessary for our purpose. What is much smarter in this case is to take all of the Marxist works and let a chatbot access their contents using semantic search. The way we do this is to split the works into small chunks, which we then convert into embedding vectors. When the user sends a message to the chatbot, the message and its context are also converted into an embedding vector. We then compute a dot product between the user’s message embedding and the chunk embeddings to find the chunks most relevant to the user’s question. A pre-trained model can then use the fetched passages to answer the user’s question.
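
    In code, the retrieval step described above might look roughly like the sketch below. This is a minimal sketch, not anyone’s actual implementation: it assumes the `sentence-transformers` and `numpy` Python packages, the model name `all-MiniLM-L6-v2` is just one example embedding model, and the chunking scheme and prompt format are placeholder choices.

    ```python
    # Minimal semantic-search (retrieval) sketch.
    # Assumes: pip install sentence-transformers numpy
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")  # example embedding model

    def chunk(text, size=500):
        """Split a work into roughly `size`-character chunks."""
        return [text[i:i + size] for i in range(0, len(text), size)]

    # 1. Split the works into small chunks and embed them once, offline.
    works = ["...full text of one work...", "...full text of another work..."]  # placeholders
    chunks = [c for w in works for c in chunk(w)]
    chunk_vecs = model.encode(chunks, normalize_embeddings=True)  # shape: (n_chunks, dim)

    def retrieve(question, k=3):
        """Embed the user's question and return the k most relevant chunks."""
        q_vec = model.encode([question], normalize_embeddings=True)[0]
        scores = chunk_vecs @ q_vec  # dot product; equals cosine similarity since vectors are normalized
        top = np.argsort(scores)[::-1][:k]
        return [chunks[i] for i in top]

    # 2. Prepend the retrieved chunks to the prompt of a pre-trained chat model,
    #    which then answers using that context.
    question = "What is commodity fetishism?"
    context = "\n\n".join(retrieve(question))
    prompt = f"Answer using the following excerpts:\n{context}\n\nQuestion: {question}"
    ```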

    Of course, training one’s own model can be good if we want it to be even more accurate and familiar with the material; however, a good starting point would be to use semantic search.