Obviously this is still in what could be considered “late beta”, but the pipeline was a huge success. https://fr.prolewiki.org/
The translation quality is honestly very good; we picked the right model and prompt for this.
This got us, I would say, 75-80% of the way there. The remaining 20-25% is busywork that you won’t escape, or at least I don’t know how to automate it…
Think of it this way: ProleWiki EN has 5 years of organic content written over time, with links and page redirects created along the way. We are starting from 0. So, currently, most pages have redlinks (here’s a benchmark one: https://fr.prolewiki.org/wiki/Corée) because the redirects are not created. The pages exist, it’s just that the links should go to, say, “Kim Il Sung” instead of “Kim Il-Sung”. Normally you’d create a redirect like Wikipedia does, i.e. Kim Il-Sung takes you to Kim Il Sung. But we don’t have that history so we have to create them.
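To make the redirect idea concrete, here is a minimal sketch of how redirect pages could be generated in bulk once you know which redlinked title maps to which existing page. `#REDIRECT [[Target]]` is standard MediaWiki redirect wikitext; the function name and the mapping below are made up for illustration, and actually saving the pages to the wiki is left out.

```python
def redirect_wikitext(target: str) -> str:
    """Return the wikitext body of a MediaWiki redirect page pointing at `target`."""
    return f"#REDIRECT [[{target}]]"


# Hypothetical mapping: redlinked title -> page that actually exists.
wanted_to_existing = {"Kim Il-Sung": "Kim Il Sung"}

# Title of each redirect page to create -> the wikitext it should contain.
redirect_pages = {
    wanted: redirect_wikitext(existing)
    for wanted, existing in wanted_to_existing.items()
}
```

The hard part is still building the mapping, which is the manual hunt described here; the code only covers the mechanical step afterwards.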
We could have exported the redirects but I decided against it because it would probably be a bigger headache. Same for the templates, we’re going to run them through Deepseek as needed.
Aside from that we focused on getting the triad of homepages (Home/Library/Essays) cleaned up and ready to go. Here’s the essays for example: https://fr.prolewiki.org/wiki/ProleWiki:Essais
I’m hopeful that with this out of the way we will get new editors and even anonymous editors interested in participating (tomorrow I think I will open up anonymous editing on the French instance to every namespace). It’ll take some time to finish cleaning everything up and tbh even the English instance isn’t completely pristine. I saw some pages that I didn’t even know existed and were clearly test pages from 2020 lol.
Obviously fixing these red links is not going to happen overnight, we’re in for the long haul. But we got 80% out of the way in a week.
I learned a few things from this pipeline that I would do differently. Tbh we were getting kinda antsy to get this up and running. But if we were to redo this for other languages I would change some things to save on the headache.
The pipeline was: download all PW pages through the API -> run them through an LLM to translate from EN to FR -> use a regex script to clean up translation artifacts -> upload to the website.
Simple enough in theory but not so simple in practice, esp. the regex to clean up the translation artifacts.
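Here is a rough sketch of the two middle steps of that pipeline (translate, then regex cleanup), just to show the shape. Everything specific in it is an assumption: the artifact patterns (a chatty preamble line and runs of blank lines) are examples rather than the actual ones we hit, `translate` is a placeholder for whatever LLM call you use, and the download and upload steps are left out.

```python
import re
from typing import Callable, Dict

# Assumed artifact patterns; the real ones depend on the model and the prompt.
PREAMBLE = re.compile(r"\A\s*(?:Voici|Here is)[^\n]*:\s*\n+", re.IGNORECASE)
EXTRA_BLANK_LINES = re.compile(r"\n{3,}")


def clean_artifacts(text: str) -> str:
    """Strip an LLM preamble line and collapse runs of blank lines."""
    text = PREAMBLE.sub("", text)
    text = EXTRA_BLANK_LINES.sub("\n\n", text)
    return text.strip()


def run_pipeline(pages: Dict[str, str], translate: Callable[[str], str]) -> Dict[str, str]:
    """Translate every page body, then clean it. Keys are the original titles."""
    return {title: clean_artifacts(translate(body)) for title, body in pages.items()}


# Stand-in for the LLM: adds a preamble but does not actually translate.
def fake_translate(body: str) -> str:
    return "Voici la traduction :\n\n" + body


cleaned = run_pipeline({"Korea": "== History ==\n\n\n\nText."}, fake_translate)
```

Passing `translate` in as a function keeps the cleanup testable without calling a model, which helps because the regex step is where most of the iteration happens.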
edit - oh yeah, total time from start to finish was exactly 1 week (Saturday to Saturday). This is the power of LLMs lol, you just have to find the right one and prompt it. Funnily enough I think the smaller models did a better job than the bigger models. Contrast that with what 5 sleep-deprived tankies could have achieved lol
I want live Hakim reaction to this.
If you understand French and want to help, you don’t even need an account to make edits. I recommend looking at the most wanted pages: https://fr.prolewiki.org/wiki/Spécial:Pages_demandées. We will take care of the Templates (modèles), but the job now is to hunt down the wanted content pages and find their actual equivalent on ProleWiki. For example, I just found that Reich allemand (1933-1945) was uploaded as République allemande (1933-1945). It’s a simple Move action to change the name (in this case calling it the Reich is more accurate than calling it a republic, no idea what the LLM was thinking lol), or otherwise make a redirect to the correct page.
Definitely next time I will have the script write down the match between the original and translated title somewhere so we have a key.
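A minimal sketch of what that key could look like, assuming a plain CSV with one row per page. The column names, function names, and sample titles are all made up for illustration; in a real run the pairs would come from the translation step as each title is processed.

```python
import csv
import io
from typing import Dict, Iterable, TextIO, Tuple


def write_title_key(pairs: Iterable[Tuple[str, str]], out: TextIO) -> None:
    """Write (original title, translated title) pairs as CSV with a header row."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["title_en", "title_fr"])
    writer.writerows(pairs)


def read_title_key(src: TextIO) -> Dict[str, str]:
    """Load the key back as a dict: original title -> translated title."""
    return {row["title_en"]: row["title_fr"] for row in csv.DictReader(src)}


# Illustrative pairs only; an in-memory buffer stands in for a file on disk.
buffer = io.StringIO()
write_title_key([("Korea", "Corée"), ("Kim Il Sung", "Kim Il Sung")], buffer)
buffer.seek(0)
title_key = read_title_key(buffer)
```

With a key like this, a redlink can be checked against the title the script actually uploaded instead of hunting for it by hand.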



