Excerpt:

“Even within the coding, it’s not working well,” said Smiley. “I’ll give you an example. Code can look right and pass the unit tests and still be wrong. The way you measure that is typically in benchmark tests. So a lot of these companies haven’t engaged in a proper feedback loop to see what the impact of AI coding is on the outcomes they care about. Lines of code, number of [pull requests], these are liabilities. These are not measures of engineering excellence.”

Measures of engineering excellence, said Smiley, include metrics like deployment frequency, lead time to production, change failure rate, mean time to restore, and incident severity. And we need a new set of metrics, he insists, to measure how AI affects engineering performance.
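As a rough illustration of how four of the metrics named above could be computed from deployment records, here is a minimal sketch. The `Deployment` record, its field names, and the `delivery_metrics` function are hypothetical, not taken from the article; real data would come from CI/CD and incident tooling. Incident severity is left out because it needs its own classification scheme.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape, assumed for illustration only.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def delivery_metrics(deployments: list[Deployment], window_days: int) -> dict[str, float]:
    """Summarize four delivery metrics over a window of `window_days` days."""
    if not deployments:
        raise ValueError("need at least one deployment")
    n = len(deployments)
    lead_times = [hours(d.deployed_at - d.committed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deployments_per_day": n / window_days,
        "mean_lead_time_hours": sum(lead_times) / n,
        "change_failure_rate": len(failures) / n,
        # 0.0 when nothing failed (or nothing has been restored yet)
        "mean_time_to_restore_hours": (
            sum(restore_times) / len(restore_times) if restore_times else 0.0
        ),
    }


# Made-up sample data: four deployments in one week, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
day = timedelta(days=1)
sample = [
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0 + day, t0 + day + timedelta(hours=8), failed=True,
               restored_at=t0 + day + timedelta(hours=10)),
    Deployment(t0 + 2 * day, t0 + 2 * day + timedelta(hours=6)),
    Deployment(t0 + 3 * day, t0 + 3 * day + timedelta(hours=6)),
]
metrics = delivery_metrics(sample, window_days=7)
```

Comparing these numbers before and after adopting an AI coding tool is one way to build the kind of feedback loop Smiley describes.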

“We don’t know what those are yet,” he said.

One metric that might be helpful, he said, is measuring tokens burned to get to an approved pull request – a formally accepted change in software. That’s the kind of thing that needs to be assessed to determine whether AI helps an organization’s engineering practice.
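A minimal sketch of that metric, under stated assumptions: the `PullRequest` record and its fields are hypothetical, and the choice to count tokens spent on rejected pull requests in the total is mine, not Smiley's. Token counts would have to come from whatever usage logs the AI tool exposes.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    # Hypothetical fields, assumed for illustration only.
    pr_id: str
    tokens_used: int   # tokens consumed while producing this PR
    approved: bool     # was the PR formally accepted?


def tokens_per_approved_pr(prs: list[PullRequest]) -> float:
    """Total tokens spent (including on rejected PRs) per approved PR."""
    approved = sum(1 for pr in prs if pr.approved)
    if approved == 0:
        # Tokens may have been spent, but nothing was accepted.
        return float("inf")
    return sum(pr.tokens_used for pr in prs) / approved


# Made-up sample data: two approved PRs and one rejected PR.
sample_prs = [
    PullRequest("a", 120_000, True),
    PullRequest("b", 300_000, False),
    PullRequest("c", 80_000, True),
]
cost = tokens_per_approved_pr(sample_prs)  # 500_000 tokens / 2 approved PRs
```

Counting rejected work in the numerator means a tool that produces many discarded attempts looks expensive, even if each accepted change was cheap on its own.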

To underscore the consequences of not having that kind of data, Smiley pointed to a recent attempt to rewrite SQLite in Rust using AI.

“It passed all the unit tests, the shape of the code looks right,” he said. “It’s 3.7x more lines of code that performs 2,000 times worse than the actual SQLite. Two thousand times worse for a database is a non-viable product. It’s a dumpster fire. Throw it away. All that money you spent on it is worthless.”
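To make the gap between "passes the tests" and "performs acceptably" concrete, here is a toy sketch. It has nothing to do with the actual SQLite rewrite: both lookup functions are hypothetical, and the count of rows examined is a deterministic stand-in for wall-clock benchmark time.

```python
def find_linear(rows, key):
    """Scan rows in order; return (value, rows_examined)."""
    steps = 0
    for k, v in rows:
        steps += 1
        if k == key:
            return v, steps
    return None, steps


def find_indexed(index, key):
    """Look up via a prebuilt dict; counted as a single probe."""
    return index.get(key), 1


rows = [(i, f"row-{i}") for i in range(10_000)]
index = dict(rows)
keys = [0, 4_999, 9_999]

# "Unit test": both implementations return the same answers.
for k in keys:
    assert find_linear(rows, k)[0] == find_indexed(index, k)[0]

# "Benchmark": compare the work each implementation did for the same queries.
linear_steps = sum(find_linear(rows, k)[1] for k in keys)
indexed_steps = sum(find_indexed(index, k)[1] for k in keys)
slowdown = linear_steps / indexed_steps
```

Both versions are correct by the unit test, yet the linear scan does thousands of times more work on this data. Only the benchmark step reveals that.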

All the optimism about using AI for coding, Smiley argues, comes from measuring the wrong things.

“Coding works if you measure lines of code and pull requests,” he said. “Coding does not work if you measure quality and team performance. There’s no evidence to suggest that that’s moving in a positive direction.”

  • Technus@lemmy.zip · 15 upvotes · 16 hours ago

    I realized the fundamental limitation of the current generation of AI: it’s not afraid of fucking up. The fear of losing your job is a powerful source of motivation to actually get things right the first time.

    And this isn’t meant to glorify toxic working environments or anything like that; even in the most open and collaborative team that never tries to place blame on anyone, in general, no one likes fucking up.

    So you double check your work, you try to be reasonably confident in your answers, and you make sure your code actually does what it’s supposed to do. You take responsibility for your work, maybe even take pride in it.

    Even now we’re still having to lean on that, but we’re putting all the responsibility and blame on the shoulders of the gatekeeper, not the creator. We’re shooting a gun at a bulletproof vest and going “look, it’s completely safe!”

    • Feyd@programming.dev · 13 upvotes · 15 hours ago

      > fear of losing your job is a powerful source of motivation

      I just feel good when things I make are good, so I try to make them good. Fear is a terrible motivator for quality.

    • deadcream@sopuli.xyz · 10 upvotes · 15 hours ago

      > So you double check your work, you try to be reasonably confident in your answers, and you make sure your code actually does what it’s supposed to do. You take responsibility for your work, maybe even take pride in it.

      In my experience, around 50% of (professional) developers do not take pride in their work, nor do they care.

      • Technus@lemmy.zip · 6 upvotes · 14 hours ago

        > In my experience, around 50% of (professional) developers do not take pride in their work, nor do they care.

        I agree. And in my experience, that 50% have been the quickest and most eager to add LLMs to their workflow.

        • nymnympseudonym@piefed.social · 3 upvotes, 2 downvotes · 13 hours ago

          And when they do, the quality of their code goes up.

          I agree we’re better off firing them, but I’m not their manager, and I do appreciate stuff with fewer memory leaks and SQL injections.

          • deadcream@sopuli.xyz · 1 upvote · 7 hours ago

            The amount of their output goes up. More importantly, they excrete code faster than good developers equipped with AI, simply because they don’t bother to review generated code. So now they are seen as top performers instead of always lagging behind, as they did before AI.

            Whether it actually results in better code is debatable, especially in the long run.