And another aspect is that, at least in the realm of coding, we’re trying to get these models to write code the way humans do. But I’d argue that’s not really an optimal approach, because models have different strengths. Their biggest limitation is that they struggle with large contexts, but given a small, focused task, even small models handle it well. So we could move to structuring programs out of small, isolated components that can be reasoned about independently. There are already tools like workflow engines that do this sort of thing; they just never caught on with human coders because they require more ceremony. But I think viewing a program as a state graph would be a really nice way for humans to tell whether the semantics are correct, and then the LLM could implement each node in the graph as a small, isolated task that can be verified fairly easily.
Love this approach. It’s a unique way of thinking about this, and I haven’t seen it elsewhere.
Ran across this yesterday, not sure if you’ve seen it. A bit esoteric, but the author argues for using Go instead of Python for LLM coding. It’s stuck with me to the point that I’m thinking of doing a full refactor on a project of mine, even though I don’t know Go as well. But the bloat of the monorepo from having all these Python dependencies is definitely food for thought. Your construct, I think, takes it a step further by simplifying the code base for machines.
https://lifelog.my/episode/why-i-vibe-in-go-not-rust-or-python
Right, languages can give us a lot of guard rails, and Go is a pretty good candidate, being a fairly simple language with types keeping the code on track. I’ve played a bit with LLMs writing it, and the results seem pretty decent overall. But then there’s the whole architecture layer on top of that, and that seems to be an area that’s largely unexplored right now.
I think the key is focusing on the contract. The human has to be able to tell that the code is doing what’s intended, and the agent needs clear requirements and fixed context to work in. Breaking the program up into small isolated steps seems like a good approach for getting both these things. You can review the overall logic of the application by examining the graph visually, and then you can check the logic of each step independently without needing a lot of context for what’s happening around it.
I’ve actually been playing with the idea a bit. Here’s an example of what this looks like in practice. The graph is just a data structure showing how the different steps connect to each other:
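Something like this (a rough sketch in Go, with made-up step names, just to illustrate the shape of it):

```go
package main

import "fmt"

// Edge in the workflow graph: which step feeds into which.
type Edge struct {
	From, To string
}

// The program's entire control flow, declared as plain data.
// A reviewer can check the semantics just by reading the edges
// (or rendering them visually), without reading any step's code.
var workflow = []Edge{
	{"fetch_order", "validate_order"},
	{"validate_order", "charge_card"},
	{"charge_card", "send_receipt"},
	{"charge_card", "update_inventory"},
}

func main() {
	for _, e := range workflow {
		fmt.Printf("%s -> %s\n", e.From, e.To)
	}
}
```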
and each node is a small bit of code with a spec around its input/output that the LLM has to follow:
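Along these lines (again just a sketch; the Order type and the validation rules are invented for illustration):

```go
package main

import (
	"errors"
	"fmt"
)

// The node's input/output contract: an Order goes in, and either
// passes through or is rejected. No other state is visible.
type Order struct {
	ID    string
	Total int // cents
}

// ValidateOrder is one node in the graph. The spec the LLM has to
// satisfy is just this signature plus the stated rules: the ID must
// be non-empty and the total must be positive.
func ValidateOrder(in Order) (Order, error) {
	if in.ID == "" {
		return Order{}, errors.New("missing order id")
	}
	if in.Total <= 0 {
		return Order{}, errors.New("non-positive total")
	}
	return in, nil
}

func main() {
	out, err := ValidateOrder(Order{ID: "A-1", Total: 1299})
	fmt.Println(out, err)
}
```

Since each node only sees its own input, you can verify it with a couple of table tests and never think about the rest of the graph.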
It’s been a fun experiment to play with so far.
Have you been able to incorporate this approach into an existing project? I’d imagine a lot of refactoring would be needed, but then again, it might work for stitching in new components as well.
I ended up incorporating this pattern into a project at one of my jobs a while back, and I’ve loved it ever since. I was working at a startup, and their requirements kept changing every few months. It became really painful to add and remove features after a while, and I figured I needed a way to stay sane doing it. So I started thinking about ways to compartmentalize things and settled on this idea. It made things a lot more manageable because I could come back to code I’d written months ago, just look at the data flow, and figure out how it needed to be rearranged without having to relearn the details of the codebase.
It is a bit of effort to move existing logic over to this pattern, though, because you really have to make sure each component is context-free. If you have an app with a bunch of shared state already, pulling that apart is a big effort that’s hard to justify.
Nice! Glad to hear this has been battle tested, haha. I’m part of a big team on my current project, and it’s just been released (well, disabled by a feature flag for now). May not be the ideal candidate for trying this out, but here’s hoping I can when the next thing comes up.
Yeah, give it a shot if you get the opportunity. Also worth mentioning that there are off-the-shelf engines you could use too. For example, there are JS libraries like flowcraft that implement the workflow part for you.
Ok, if there are existing libraries that handle the workflow bit, this will be an even easier sell. Cheers, tovarish!
O7