"participants who had access to an AI assistant wrote significantly less secure code" and "were also more likely to believe they wrote secure code" - 2023 Stanford University study published at CCS23

Arthur Besse@lemmy.ml · 2 months ago

nexv@programming.dev · 2 months ago

Not specific to this research, but… if you rely on an LLM to write security-sensitive code, I wouldn’t expect you to write secure code without an LLM anyway.

2pt_perversion@lemmy.world · 2 months ago

I’m doing my part by writing really shitty foss projects for AI to steal and train on.

NauticalNoodle@lemmy.ml · edit-2 2 months ago

It seems to me that if one can explain the function of their pseudocode in enough detail for an LLM to turn it into a functional and reliable program, then the hardest part of writing the code was already done without the LLM.

Nomecks@lemmy.ca · 2 months ago

No worries, the properly implemented CI/CD pipelines will catch the bad code!

azimir@lemmy.ml · 2 months ago

I had a student come into office hours asking why their program got a bad grade. I looked, and it didn’t actually do anything related to the assignment.

Upon further query, they objected, saying that the CI pipeline built it just fine.

So …yeah… You can write a program that builds and runs, but doesn’t do the required tasks, which makes it wrong. This was not a concept they’d figured out yet.

Arcka@midwest.social · 2 months ago

Shouldn’t the pipeline have failed unless the functional tests passed?

Hasherm0n@lemmy.world · 2 months ago

Until you find out those were also built by a junior using an llm to help 🙃
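To make the build-versus-behaviour gap from this subthread concrete, here is a minimal sketch. The assignment ("sum the even numbers in a list"), the function name `sum_of_evens`, and both check functions are hypothetical, invented purely for illustration; they are not from the course or pipeline described above.

```python
# Hypothetical assignment: return the sum of the even numbers in a list.
def sum_of_evens(numbers):
    """Stand-in for a submission that runs cleanly but ignores the task."""
    return 0  # never looks at the input


def build_check():
    """Roughly what a build-only pipeline verifies: the code loads and runs."""
    try:
        sum_of_evens([1, 2, 3, 4])
        return True
    except Exception:
        return False


def functional_check():
    """What a functional test verifies: the output matches the assignment."""
    return sum_of_evens([1, 2, 3, 4]) == 6  # 2 + 4


if __name__ == "__main__":
    print("build check passed:", build_check())
    print("functional check passed:", functional_check())
```

The build-style check passes while the functional check fails, which is the situation the student ran into. Of course, this only helps if the functional test itself encodes the requirement correctly.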

HubertManne@moist.catsweat.com · 2 months ago

I really don’t get how it’s different than a search engine. Granted, it’s surprising how often I have to give up in disgust and just go back to normal search, but pretty often they can find more relevant stuff faster.

Arthur Besse@lemmy.ml · 2 months ago

“I really don’t get how it’s different than a search engine”

Neither did this guy.

The difference is that LLM output is (in the formal sense) bullshit.

HubertManne@moist.catsweat.com · 2 months ago

So is search. I mean, I would not click the first link from a search and then copy and paste code from the site into my project, no questions asked. Similarly, you can look over what the AI comes up with and see if it makes sense, same as you would do with some dude’s blog. You can also check the references it gives or ask it to expand on some part: “hey, what does the function X do?” I really don’t see it as being worse than search.

moriquende@lemmy.world · edit-2 2 months ago

Not that you should be copy-pasting any significant amount of code, but at least when you do, you’re required to understand it enough to fit it into your program. LLMs just straight up camouflage the shit code by putting in something that already fits and has no squiggly red lines beneath. Many people probably don’t bother reading it at that point.

HubertManne@moist.catsweat.com · 2 months ago

Yeah, I mean, by that standard anything a person like that uses is going to be an issue. They can be useful, but I’m worried about the power they use, although I wonder how much power that is relative to searching different blogs for 10 or 20 minutes.

Facebones@reddthat.com · 2 months ago

For a point of comparison, a ChatGPT request uses 2.9 watt-hours (and rising) to a Google search’s 0.3 (which, per your example, would only be run once, assuming you’re checking different blogs from the same list of results).

https://timesofindia.indiatimes.com/technology/tech-news/chatgpt-google-search-need-power-to-run-heres-how-much-water-and-electricity-are-used-to-answer-questions/articleshow/111382705.cms
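Taking the two per-request figures quoted above at face value (2.9 Wh and 0.3 Wh), a quick back-of-the-envelope sketch of the ratio looks like this. The constant and function names are made up for illustration, and the calculation ignores everything else involved, such as loading the result pages or asking follow-up prompts.

```python
import math

# Per-request figures as quoted in the comment above (watt-hours).
CHATGPT_WH_PER_REQUEST = 2.9
SEARCH_WH_PER_QUERY = 0.3


def energy_ratio(llm_wh=CHATGPT_WH_PER_REQUEST, search_wh=SEARCH_WH_PER_QUERY):
    """How many times more energy one LLM request uses than one search."""
    return llm_wh / search_wh


def searches_to_match(llm_wh=CHATGPT_WH_PER_REQUEST, search_wh=SEARCH_WH_PER_QUERY):
    """Smallest whole number of searches whose total meets or exceeds one LLM request."""
    return math.ceil(llm_wh / search_wh)


if __name__ == "__main__":
    print(f"one request is about {energy_ratio():.1f}x one search")
    print(f"searches needed to match one request: {searches_to_match()}")
```

On these numbers a single request costs a bit under ten searches' worth of energy, so rerunning a search a few times with new keywords would still come in lower.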

HubertManne@moist.catsweat.com · 2 months ago

Generally I end up checking some results and often changing the search with new keywords, but all the same, I’m generally asking follow-up questions in a similar way. I’m betting that any energy the AI uses to check web destinations is likely not included, which would be the same as me going to a destination, maybe less if it’s more of a crawl or API. Any way you slice it, it’s going to be more, I think.

ampersandcastles@lemmy.ml · 2 months ago

People like to gatekeep easy access to knowledge for some reason.

meliante@lemmy.world · edit-2 2 months ago

2023? Like last year? Like when LLMs were just a curiosity more than anything useful?

They should be doing these studies continuously…

Edit: Oh no, I forgot Lemmy hates LLMs. Oh well, can’t blame you guys, hate is the basic manifestation towards what scares you, and it’s revealing.

tpihkal@lemmy.world · 2 months ago

I’m sure they will, here’s year one.

gencha@lemm.ee · 2 months ago

They do. Reality is not going to change though. You can enable a handicapped developer to code with LLMs, but you can’t win a foot race by using a wheelchair.

gencha@lemm.ee · 2 months ago

I’m just waiting for someone to lecture me how the speed record in wheelchair sprint beats feet’s ass…

1984@lemmy.today · 2 months ago

Hmm. To me 2023 was the breakthrough year for them. Now we are already getting used to their flaws.

fishos@lemmy.world · 2 months ago

Hmmm, it’s almost like the study was testing people’s perception of the usefulness of AI vs the actual usefulness and results that came out.

chiisana@lemmy.chiisana.net · 2 months ago

While I agree with the “they should be doing these studies continuously” point of view, I think the bigger red flag here is that with the advancements of AI, a study published in 2023 (meaning the experiment was done much earlier) is deeply irrelevant today in late 2024. It feels misleading and disingenuous to be sharing this today.

justOnePersistentKbinPlease@fedia.io · 2 months ago

No. I would suggest you actually read the study.

The problem that the study reveals is that people who use AI-generated code as a rule don’t understand it and aren’t capable of debugging it. As a result, bigger LLMs will not change that.

chiisana@lemmy.chiisana.net · 2 months ago

I did in fact read the paper before my reply. I’d recommend considering the participant pool (a very common problem in most academic research, but very relevant given the argument you’re making): the vast majority of the participants were students (over 60% if memory serves; I’m on mobile currently and can’t go back to read easily), most of them undergraduates with very limited exposure to actual dev work. They are then prompted to, quite literally as the first question, produce code for asymmetrical encryption and decryption.

Seasoned developers know not to implement their own encryption because it is a very challenging space; this is similar to asking undergraduate students to conduct brain surgery and expecting them to know what to look for.