• nandeEbisu@lemmy.world
    7 days ago

    I understand why people are bandwagoning on Anthropic; they're certainly not a blameless company, given the negative societal, environmental, and economic impacts of the product they're pushing.

    This article is kind of dumb though: structurally, the headline and the introductory paragraph address completely different issues, and the "nothingburger" framing is a straw man.

    I don't think anyone who has had access to the models and been able to write about them has claimed they are better than humans at finding bugs. Most articles I've read from those people say the models are valuable because expert security researchers are a finite resource: they can only do so much manual poring over large code bases, and honestly that sounds like mind-numbing work.

    The value is not that it's better than human security researchers, but that it doesn't sleep and you can spawn a bunch of agents to look for stuff, so a researcher only reviews and guides agents instead of getting bogged down in the nitty-gritty of scanning code bases.

    The Mozilla article a few days ago put it well: if it becomes easier to find bugs, that generally benefits criminals, who have a large profit incentive to find a single zero-day, more than it benefits security researchers, for whom no number of found zero-days is ever enough; it only takes one missed one to cause a major incident.

    I think the idea of responsible disclosure here is a good one, if somewhat poorly executed with the model leak, and I don't understand why it's being mocked, other than that people just see the letters "AI" and start hurling tomatoes. There are plenty of legitimate targets in the IP-theft and vibe-coded-garbage areas of AI; it's counterproductive to attack a company for doing the right thing.

    • hendrik@palaver.p3x.de
      7 days ago

      Yeah, all of this just doesn't say much. Someone said it found hundreds of exploits; someone said it was 40, or maybe none. It might be way better than everything before, or it might perform about as well as some other model. That's the bandwidth of claims out there, as they say. But that's not really helpful information for anything.

      • nandeEbisu@lemmy.world
        7 days ago

        It's not unknown whether it's effective…

        Someone from Mozilla put their name on a blog post saying they fixed 22 vulnerabilities and that an upcoming patch fixes 200+:

        https://blog.mozilla.org/en/privacy-security/ai-security-zero-day-vulnerabilities/

        This isn't Anthropic; Mozilla has no incentive to lie about how good the product is. They're committing limited development resources to these fixes, so they would want them to be real problems.

        It's also definitely more than zero: the original Mythos post disclosed an OpenBSD vulnerability that was patched, and they disclosed further, unpatched vulnerabilities as hashes (basically signatures of either the descriptions or the exploit implementations), so when they fix and disclose the plaintext versions, we'll know the original post was telling the truth. The signatures are difficult to forge.
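        To spell out how that hash-based disclosure works: it's essentially a cryptographic commitment. You publish the digest of a write-up now, reveal the plaintext after the patch ships, and anyone can check that the revealed text matches the earlier digest. A minimal sketch (the report string and function names here are made up for illustration; real schemes typically also mix in a random salt so low-entropy reports can't be brute-forced):

        ```python
        import hashlib

        def commit(disclosure: str) -> str:
            """Publish this digest now; keep the plaintext private until the bug is patched."""
            return hashlib.sha256(disclosure.encode("utf-8")).hexdigest()

        def verify(disclosure: str, published_digest: str) -> bool:
            """After the reveal, anyone can check the plaintext against the old digest."""
            return commit(disclosure) == published_digest

        # At disclosure time: publish only the digest (hypothetical report text).
        report = "out-of-bounds read in a parser, reachable via crafted input"
        digest = commit(report)

        # After the patch ships: reveal the report; third parties verify it matches.
        assert verify(report, digest)
        assert not verify("a different report", digest)
        ```

        Forging this after the fact would require finding a second preimage for SHA-256, which is considered computationally infeasible; that's why the published hashes count as evidence the findings existed at posting time.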

        We also have a ton of third-party researchers looking at this stuff in case they did forge those signatures, and I haven't seen any whistleblowers with access to the model saying it's garbage. If you have any sources, let me know; I'd be interested in reading them.

        My point is not to white-knight for Anthropic. They're flouting IP rights and driving up energy prices for private profit. But when you take a position and say something, it should be for actual reasons, not just "I hate this company."

        • hendrik@palaver.p3x.de
          7 days ago

          Hmmh, yes. I mean, they throw a lot of numbers around. A 10-fold performance increase would be impressive, but they're not really transparent, are they? How do the "vulnerabilities" from one category compare to the "security-sensitive bugs" from the other? We know Opus found 7 medium- and 14 high-severity vulnerabilities, and then for Mythos they give one total number. I also don't know whether the tooling and pipeline changed, or the focus, or how many resources they allocated in February compared to March. So it's really hard to tell whether the Mythos model is 10x better, or a hundred times better, or whatever, compared to what they had two months ago.

          I tend to question claims like that when they come from a profit-oriented company that doesn't provide enough details to fact-check them. I mean, I read 271 vulnerabilities. Then I go check the list of CVEs fixed in Firefox 150, and I see the name "Claude" mentioned just three times. Dunno why that is. But it was 22 times in 148, when they used Opus.

          And then they didn't compare it with anything. For all we know, a random Chinese (or Google) model might find 20 or 200 or 2000 bugs in Firefox, and that would be important info if we want to hype Anthropic specifically. But as far as I know, Mozilla only has that one collaboration with one company, so there's no reference point.

          The info I have is all from The Register article above; they said there are wildly different perspectives and counts out there, and they wrote who said what. I don't have access anyway. All I can do is wait and see what effect it has on some software projects.

          But I think what I'd really like to know is the details behind all of this. Someone should dissect some example findings, and give some numbers on the allocated resources: like whether they spent a million dollars on the electricity bill to find a few medium-severity vulnerabilities, or whatever the numbers are.