• [object Object]@lemmy.ca
    English
    arrow-up
    168
    ·
    edit-2
    4 days ago

    This is too real.

    Now I get PRs entirely written by Claude from my VP that include things like full plaintext secret keys, or reimplement an API that exists, just shittier.

    “Claude wrote this in an hour, why is review taking so long?”

    Uhh because I can’t figure out the diplomatic way to say this is shit and you need to stop without creating an incident, and I don’t want to spend half my day reviewing crap.

      • whotookkarl@lemmy.dbzer0.com
        English
        arrow-up
        23
        ·
        4 days ago

        Or spend hours explaining in excruciating detail all the reasons why it’s shit and what they should have done instead. Make sure to throw all the heavy-handed certification standards, strict audit requirements, and mind-numbing bikeshedding naming standards back at them.

        • pinball_wizard@lemmy.zip
          English
          arrow-up
          2
          ·
          2 days ago

          Yes. This is the way.

          I’m the VP’s ally. Practically their best friend. It’s all these pesky regulations, lawyers, audits and extreme personal liability that are slowing both of us down from doing things the sociopath way… at least until I find a gig with a less sociopathic boss.

  • Smith6612@lemmy.world
    English
    arrow-up
    10
    ·
    3 days ago

    My favorite meme about Google AI is the one where it tries to justify that the pool of the Titanic is not full of water.

  • Vanth@reddthat.com
    English
    arrow-up
    152
    arrow-down
    1
    ·
    4 days ago

    Best part of the article, hat tip to author Emanuel for how he included the correction request:

    After this story was published Google’s spokesperson reached out and asked us to publish a slightly different version of that statement. The new statement no longer stated that “it’s critical that we maintain humans in the loop.”

    • Th4tGuyII@fedia.io
      arrow-up
      58
      ·
      4 days ago

      It’s a very damning line to retract, but I don’t think anybody is surprised at this point.

    • KeenFlame@feddit.nu
      English
      arrow-up
      1
      ·
      3 days ago

      “No no that is wrong! We said fuck all kids too, we really meant everyone! not just the adults??”

  • brucethemoose@lemmy.world
    English
    arrow-up
    4
    ·
    edit-2
    2 days ago

    Gemini actually has a really interesting architecture, hence it has fast responses, and it’s easily the best long context model out there.

    And outside of benchmaxxing or pure coding, Gemma is very good for its size. 12B is an incredible multimodal LLM, the only one natively trained for image/text ingestion without an mmproj hacked on at the end.

    …But it sure feels like executive meddling kills it.

    The pattern I see is:

    • Gemini preview is released.

    • It’s genuinely good! It’s smart, it’s straight.

    • Then they “refine” it, and it gets more and more sycophantic, more deep-fried. Long context performance degrades… benchmark scores go up, but anyone who actually uses it can immediately tell it’s gotten worse.

    • Only then is it released for mass use.

    It’s obvious they took a good model, then enshittified it to make their bosses happy and tech bros on Twitter excited.

    Gemma has the same pattern. Researchers tease the local community, delay it, and then when a new Gemma finally comes out, it turns out to be using some old SWA architecture. And the biggest model is cut. And only a smaller one uses the multimodal training.

    It’s obvious it was neutered to not “threaten” the Gemini API or be too “unsafe.”


    Another thing I’ve noticed is that both Gemini and Gemma are awful at their default sampling settings (temperature 1.0, top-p 0.95). That much randomness completely screws them up. But they like low temperature + min-p, and Gemma loves constrained sampling.

    But 99% of users don’t know anything about sampling, so that’s going to leave a bad impression.
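For anyone who hasn't touched sampler settings, here is a rough sketch of the difference between the two setups mentioned above. The logits are made up for illustration (one strong candidate, one plausible alternative, and a long tail of junk tokens), and the filters are simplified textbook versions, not any vendor's actual implementation.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities at a given temperature (> 0)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.95):
    """Keep the smallest set of most likely tokens whose total mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return sorted(kept)

def min_p_filter(probs, min_p=0.1):
    """Keep tokens whose probability is at least min_p times the top probability."""
    cutoff = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= cutoff]

# Hypothetical logits: a strong token, a plausible one, and 50 junk tokens.
logits = [5.0, 3.5] + [0.0] * 50

default_kept = top_p_filter(softmax(logits, temperature=1.0), top_p=0.95)
cooler_kept = min_p_filter(softmax(logits, temperature=0.7), min_p=0.1)
print(len(default_kept), len(cooler_kept))
```

On this toy distribution, top-p at temperature 1.0 has to let in dozens of junk tokens to reach 95% of the mass, while low temperature plus min-p keeps only the two sensible candidates. Real vocabularies are far larger, so treat the exact numbers as illustrative only.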

      • brucethemoose@lemmy.world
        English
        arrow-up
        3
        ·
        edit-2
        2 days ago

        I use top-nσ (top-n-sigma) sampling at 1.0, a slop phrase banlist, and maybe a little rep penalty.
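Assuming the sigma-based sampler above means top-nσ, the idea as I understand it is to keep only tokens whose logit lies within n standard deviations of the highest logit. A minimal sketch with made-up logits follows; the function name and inputs are hypothetical, not taken from any particular backend.

```python
import statistics

def top_n_sigma_filter(logits, n=1.0):
    """Keep token indices whose logit is within n standard deviations of the max."""
    threshold = max(logits) - n * statistics.pstdev(logits)
    return [i for i, x in enumerate(logits) if x >= threshold]

# Hypothetical logits: two close candidates and a long tail of junk tokens.
logits = [5.0, 4.5] + [0.0] * 50

kept = top_n_sigma_filter(logits, n=1.0)

# Dividing every logit by a temperature scales the max and the standard
# deviation by the same factor, so the kept set does not change.
kept_at_half_temperature = top_n_sigma_filter([x / 0.5 for x in logits], n=1.0)
print(kept, kept_at_half_temperature)
```

Because the cutoff is measured on the logits themselves, the set of surviving tokens stays the same when the temperature changes; temperature only reshapes the probabilities among the survivors.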

        Beyond that it depends on the usage.

        For scripts or “questioning a document,” it’s as low as can be until it loops. I start with zero temperature. But I don’t really use Gemma for coding, TBH, and it’s not good for longer documents.

        If it’s for a specific language or a very specific script, I sometimes constrain grammar for the language.
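A toy illustration of what constraining by grammar means: at each step, tokens the grammar forbids are masked out before picking. The vocabulary, the lookup-table "grammar", and the fake model below are all invented for the example; real backends use proper grammar formats and real tokenizers.

```python
import math

# Hypothetical toy vocabulary and "grammar": output must be exactly "true" or "false".
VOCAB = ["tr", "ue", "fal", "se", "maybe", "<eos>"]
ALLOWED_NEXT = {
    "": {"tr", "fal"},
    "tr": {"ue"},
    "fal": {"se"},
    "true": {"<eos>"},
    "false": {"<eos>"},
}

def constrained_greedy_step(logits, text_so_far):
    """Mask tokens the grammar forbids, then pick the highest remaining logit."""
    allowed = ALLOWED_NEXT[text_so_far]
    masked = [x if tok in allowed else -math.inf for tok, x in zip(VOCAB, logits)]
    best = max(range(len(VOCAB)), key=lambda i: masked[i])
    return VOCAB[best]

def generate(logit_fn):
    """Greedy decoding under the toy grammar until the end-of-sequence token."""
    text = ""
    while True:
        tok = constrained_greedy_step(logit_fn(text), text)
        if tok == "<eos>":
            return text
        text += tok

def fake_model(text):
    """Stand-in for a model that always prefers the invalid token "maybe"."""
    return [1.0, 0.5, 2.0, 0.3, 9.0, 0.1]

result = generate(fake_model)
print(result)
```

Even though the fake model always ranks "maybe" highest, the mask means it can only ever emit something the grammar accepts.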

        For more “general” writing, like brainstorming or RP or whatever, I start at around 0.7 with minimal DRY sampling and look at the logit percentages in the Mikupad UI. Especially “important” tokens like names or information recall. If the probability of getting correct answers is too low, I turn the temperature down.

        …But honestly, I tend to use big MoEs instead of Gemma for that, too.


        And if none of this makes any sense…

        Yeah. That’s the problem.

        Sampling was supposed to be a temporary stopgap until looping and such was figured out, but the big LLM devs just never addressed it in production. There are all sorts of interesting papers, including one from Google about sampling logits per-layer, but they don’t implement any of them in the API models.

  • canadaduane@lemmy.ca
    English
    arrow-up
    9
    ·
    3 days ago

    404media: “This post is for paid members only” But we’ll sure as hell put ads on it anyway.

  • uuj8za@piefed.social
    English
    arrow-up
    75
    ·
    edit-2
    4 days ago

    Google’s CEO says 75% of the company’s code is AI-generated.

    Everyone should take this with a huge grain of salt. Like all other internal company stat reports, it’s bullshit and manufactured.

    Example: my company has recently introduced a gate on CI. All commits must have “Co-Authored-By: X”. Technically, you can set X=None, but most people aren’t doing that because we’re not stupid and we know the commit history can easily be data mined and used to generate stats on who is or isn’t using AI. And we don’t want to get fired.

    Result: 99% of all new commits use “Co-Authored-By: Claude”. Every commit I make now has “Co-Authored-By: Claude”. Am I using AI? FUCK NO. But, now I have to add that stupid line to any work I turn in.
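To make it concrete, here is a guess at what a gate like that, and the stat it feeds, could look like. This is a hypothetical sketch, not my company's actual CI; the function names and sample commit messages are invented.

```python
import re
from collections import Counter

# Matches a trailer line such as "Co-Authored-By: Claude" (case-insensitive).
TRAILER = re.compile(
    r"^co-authored-by:[ \t]*(?P<who>\S.*?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

def co_authors(message):
    """Return every Co-Authored-By value found in a commit message."""
    return [m.group("who") for m in TRAILER.finditer(message)]

def passes_gate(message):
    """The hypothetical CI gate: the commit needs at least one such trailer."""
    return len(co_authors(message)) > 0

def trailer_stats(messages):
    """Count trailer values across commits, the way a dashboard might."""
    counts = Counter()
    for msg in messages:
        counts.update(co_authors(msg))
    return counts

# Invented sample history.
history = [
    "Fix off-by-one in pager\n\nCo-Authored-By: Claude",
    "Refactor auth module\n\nCo-Authored-By: None",
    "Bump version",
]

print([passes_gate(m) for m in history])
print(trailer_stats(history))
```

Note what the check can and cannot see: it only verifies that the line is present. Nothing ties the trailer to how the code was actually written, so any percentage built on it measures compliance with the gate, not AI usage.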

    • criss_cross@lemmy.world
      English
      arrow-up
      28
      ·
      4 days ago

      We have a commit skill we’re supposed to use. So for non-trivial work that I don’t want the AI to screw up, I do it by hand, then use the skill so it can vomit out a commit message and PR.

      I get the shiny “Co-Authored-By: Claude” and burn a ton of tokens to make myself look “AI Fluent”

    • mcv@lemmy.zip
      English
      arrow-up
      11
      ·
      4 days ago

      This is insane to me. Having a way to easily distinguish AI-generated commits from human-created ones makes a lot of sense, but lying that your honest, high-quality handcrafted commit is AI slop makes it pointless.

      That people feel they need to do this in order to protect their jobs is fucking insane and self destructive.

    • Steve@startrek.website
      English
      arrow-up
      14
      ·
      4 days ago

      Remember that part in The Big Short where the stripper is talking about all the houses she owns? Similar vibes.

    • 0x0@infosec.pub
      English
      arrow-up
      4
      ·
      4 days ago

      Microslop really went to shit after statements just like that. Can’t wait for Google to implode too.

    • masterspace@lemmy.ca
      English
      arrow-up
      3
      ·
      edit-2
      4 days ago

      We’re a small company so I do the opposite and am avoiding any co-authored tag being applied to the code I publish.

      I review and test my code before it’s published to make sure that it works and that it’s the right solution to the problem, and I’m the one responsible for fixing it if it goes wrong late at night in prod.

      That was the case when I was using Intellisense and codegen tools and that’s still the case now.

      That makes me the author.

      Anything else is a lie, a violation of engineering ethics, and flat-out not SOC 2 compliant, nor compliant with any regulation that matters.

  • Th4tGuyII@fedia.io
    arrow-up
    64
    ·
    4 days ago

    “We encourage our engineers to vigorously test and critique our internal tools; that candid feedback loop, even via our internal meme generator, is vital to how we build technology”

    Google listening to employee feedback:

    ...

    • uuj8za@piefed.social
      English
      arrow-up
      13
      ·
      4 days ago

      Honestly, that would be great if they just tossed it out the window.

      What they’re probably doing is building a list of who they should lay off next based on the feedback.

  • kreskin@lemmy.world
    English
    arrow-up
    5
    ·
    edit-2
    3 days ago

    None of my management cares if AI agents work well, they just want to get them deployed ASAP. I dread the day they go into use. They will claim I have no engineering talent or something like that. I’m not sure malicious compliance will work this time but it’s worth a shot.

    On the bright side, it’s never too late to be a meth head salvaging copper from around town, and I know where a bunch of metal is at.

    • MrKoyun@lemmy.world
      English
      arrow-up
      1
      ·
      2 days ago

      It says in the article that 404 Media recreated similar images to the memes they saw to protect their sources, so there is a chance that the originals were pure gold.

  • Deebster@infosec.pub
    English
    arrow-up
    7
    ·
    4 days ago

    Kinda weird experience to be reading textual descriptions of memes and having to reconstruct them in my head. They had enough to say to not need to pad out their word count that way.

    • Vanth@reddthat.com
      English
      arrow-up
      9
      ·
      4 days ago

      They’re probably doing that to protect the identity of any Google workers providing them with information. If they posted the actual meme, Google could possibly trace it back to an employee and fire them.

      For some of the memes they do have in the article, they note they are reconstructions and not the actual memes from Google’s internal channels.

      I agree it’s long though, they could have just recreated them and skipped the written description.

  • reksas@sopuli.xyz
    English
    arrow-up
    7
    ·
    4 days ago

    Paid tons of money to fool around, while some who would be willing to work don’t get hired no matter what.

  • ddplf@szmer.info
    English
    arrow-up
    6
    arrow-down
    1
    ·
    4 days ago

    Also a big, chunky and oily FUCK YOU to all of you who work for or aspire to work for FAANG, MAAMA or whatever fucking letters you call it these days