• CriticalResist8MA · 2 months ago

      Honestly, just dive into practice. Check the sidebar for instructions, but get Crush (multi-model, open source), put $5 on the DeepSeek API (cheap), connect the two with the API key, and go. It works best if you have an actual task to give it, like writing a Python script you’ve been wanting to have, or fixing something on your PC.

      It’s very scary at first because the tokens just fly by and you have no idea what it’s doing, but you get used to it. The usual adage: if something isn’t working the way you expect, it can be reprompted into working. Think of it as a new teammate on your project: they need to be brought up to speed on what the project is and how you’re working on it; they can’t divine that. They’re also a teammate who needs frequent reminders about routine things (like remembering to make git commits).

      • Che's Motorcycle · 2 months ago

        Ok! First shot at it was pretty successful. I’ll see if my teammates agree when I open a PR tomorrow.

      • Che's Motorcycle · 2 months ago

        I mainly use Cursor for work (it’s company-paid, lol), and saw they recently added agent support. I think the scaffolding + TDD approach mentioned here looks really promising.

  • Munrock ☭ · 2 months ago

    I haven’t looked at the guide linked, but something that has made a world of difference to the outcome is getting the agent to write documentation and implementation plans first, instead of telling it to code directly from a prompt. Then you review (and edit, if necessary) the implementation plan.

    It adds a safety layer, essentially letting you check the logic before letting it write code. It makes large assignments a lot more consistent.

    And after a while, you start to get a knack for spotting problems the AI is likely to trip up on. Once the implementation plan is laid out in discrete steps, you can chunk it into phases: multiple consecutive low-risk tasks get assigned to a single prompt, while the high-risk steps you either do yourself or break down into even smaller steps with more explicit instructions.

    Another pitfall is limiting the context when you limit the task. I did this a lot at first, thinking the smaller and more precise the prompt, the more controlled the outcome. You’ll know when that works, but sometimes giving the LLM the end-goal design, even when you just want it working on a small function or process, saves it from making short-sighted design mistakes.

    • ☆ Yσɠƚԋσʂ ☆OP · 2 months ago

      Yeah, I’ve learned that making it write stuff down in files is really useful; they basically act as long-term memory you can refer back to when refreshing the context. I can also highly recommend telling LLMs to write mermaidjs diagrams in the docs.
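      As a tiny illustration of the kind of diagram that ends up in a docs file (the component names here are made up, not from the thread), a Mermaid flowchart might look like:

```mermaid
flowchart TD
    CLI[CLI entry point] --> Parser[argument parsing]
    Parser --> Fetch[fetch data from API]
    Fetch --> Cache[(local cache)]
    Fetch --> Render[render report]
```

      Dropped into a markdown doc, this renders as a boxes-and-arrows picture the model can regenerate or extend on later sessions.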

      • Munrock ☭ · 2 months ago

        And having read the article now, I think it’s right about using test-driven development for similar reasons. The tests add extra context and reduce ambiguity in the prompt, on top of their actual testing function, as long as you write the test criteria yourself. I once tried letting an LLM write its own tests: it made huge errors, then rewrote correctly-functioning code into a hot mess just to make the tests pass. That was… fun.
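        A minimal sketch of "write the test criteria yourself" (the `slugify` function and its behavior are invented for illustration): the human pins down the intended behavior as tests first, then hands the implementation task to the model.

```python
# Tests written by the human FIRST; the implementation below stands in
# for what the LLM would be asked to produce against these criteria.

def slugify(title: str) -> str:
    """Lowercase, strip punctuation, join words with hyphens."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return "-".join(cleaned.split())

def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_strips_punctuation():
    assert slugify("C++ in 2024!") == "c-in-2024"

def test_collapses_whitespace():
    assert slugify("  a   b  ") == "a-b"
```

        Because the assertions encode the spec, the model can’t quietly redefine "working" the way it can when it writes its own tests.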

        • ☆ Yσɠƚԋσʂ ☆OP · 2 months ago

          Yup, basically the more you can box the LLM in, the better results you’ll get. A helpful way to think about it is to treat the code as a state machine. At a high level you have a graph of nodes and transitions: each node can be an isolated component (like a microservice) that can be reasoned about independently, and the edges of the graph are the state transitions between nodes. A node gets the current state passed in as input (a data structure), does some work, produces a new state, and that gets routed somewhere. This model fits LLM coding perfectly, because you can break any large project into a series of small components like that. The models are great at working on small things, and they don’t have to build up a large context that way. Meanwhile, the top-level flow is also very manageable, and you can plug new components in, move them around, etc.
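          A toy sketch of that mental model (node names, state fields, and routing rules are all invented here): each node is a small function from state to state, and a routing table holds the graph's edges, so the top-level flow stays reviewable while each node is small enough to hand to a model on its own.

```python
from typing import Callable, Dict, Optional

State = Dict[str, object]

def validate(state: State) -> State:
    # Node: takes the state in, returns a new state (no mutation).
    new = dict(state)
    new["valid"] = bool(new.get("payload"))
    return new

def process(state: State) -> State:
    new = dict(state)
    new["result"] = str(new["payload"]).upper()
    return new

def reject(state: State) -> State:
    new = dict(state)
    new["result"] = "rejected"
    return new

NODES: Dict[str, Callable[[State], State]] = {
    "validate": validate,
    "process": process,
    "reject": reject,
}

def route(node: str, state: State) -> Optional[str]:
    # Edges: decide where the newly produced state flows next.
    if node == "validate":
        return "process" if state["valid"] else "reject"
    return None  # process and reject are terminal nodes

def run(state: State) -> State:
    # Top-level flow: walk the graph until a terminal node.
    node: Optional[str] = "validate"
    while node is not None:
        state = NODES[node](state)
        node = route(node, state)
    return state
```

          Swapping a node out, or inserting a new one, only touches `NODES` and `route`, which is what makes the top level easy to rearrange.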