American companies are spending enormous sums to develop high-performing AI models. Distillation attacks let rivals extract that capability on the cheap, and nobody is doing much to stop it.
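For context, "distillation" here means training a cheaper student model to imitate a teacher model's output distribution rather than learning from raw labels. The sketch below is a minimal, generic illustration of the core loss; all names and numbers are made up for the example and do not describe any lab's actual pipeline.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: higher T flattens the distribution,
    # exposing the teacher's relative confidence in near-miss answers.
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Cross-entropy of the student's softened distribution against the
    # teacher's: the student is rewarded for copying the *whole* shape
    # of the teacher's output, not just its top answer.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return -sum(pi * math.log(qi + 1e-12) for pi, qi in zip(p, q))

teacher = [4.0, 2.0, 0.5]    # hypothetical teacher logits over 3 tokens
mimic   = [3.9, 2.1, 0.4]    # student that copies the full distribution
top1    = [4.0, -5.0, -5.0]  # student that only matches the argmax

# The mimic incurs a lower loss even though both pick the same top token.
assert distillation_loss(mimic, teacher) < distillation_loss(top1, teacher)
```

A distillation attack applies the same idea across an API boundary: the attacker queries the expensive model at scale and uses its responses as the teacher signal.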

  • H Ramus@piefed.social · 3 points · 3 hours ago (edited)

    World’s smallest violin. Let’s break it down:

    • Hardware - all paid to providers and more prominently Nvidia;
    • Software - all the statistical relationships and logic were developed by handsomely paid staff;
    • Input data - there’s no such thing as copyright, intellectual property or any sort of mechanism that prevents harvesting copious amounts of data that was created, refined and delivered as part of human experience or a business product. It’s free for all to take, why pay for data?
    • Output of LLM - by the same logic as the previous point, it’s free for all to take, why pay for data?

    So, competitors can’t avoid the hardware costs but can save on developer costs? Nobody paid for input data anyway. Sounds like a VC’s wet dream.

  • fodor@lemmy.zip · 6 points · 11 hours ago

    lol it is perhaps costing billions but is it worth billions? let’s not pretend money spent (or laundered) implies value…

      • pdxfed@lemmy.world · 1 point · 5 hours ago

        If Wallace Shawn and Billy Crystal don’t act in TPB, it’s never the instant classic it was and would have been immediately forgotten. It’s a B movie at best without them, though Elwes is carrying a lot. As it was, thank God they were both in it.

    • binarytobis@lemmy.world · 5 points · 15 hours ago

      My SIL’s friend was bragging about her son “writing” books using an LLM and selling them on amazon. “He checked and it isn’t even plagiarism!”

      If it wasn’t our first meeting I probably would have pointed out how, in fact, it is.

  • bitteroldcoot@piefed.social · 42 points (2 down) · 20 hours ago

    I worked with computers for about 30 years, and in retirement I’ve been testing AI for fun. I’ve yet to figure out what the point of them is. They lie, manipulate users and censor information. Their prose is overly verbose and their code sucks. What’s the point…

    You know, as I was typing the first paragraph I realized the point. They are really good at controlling and manipulating stupid people. They are the new Facebook and twitter. How depressing.

    • Strider@lemmy.world · 5 points · 14 hours ago

      Well, the point is using humongous amounts of energy, cutting resources from everything else and creating a huge money funnel.

      It’s the most effective hype yet.

    • unmagical@lemmy.ml · 14 points · 19 hours ago

      They seem great till you ask them about something you know. Somehow people fail to extrapolate out that the failures they see in their field of expertise are actually there across all subject matters.

      • moopet@sh.itjust.works · 1 point · 1 hour ago

        I find the same with human-written articles. Like New Scientist, for example. When I was young I liked reading it, right up until I started reading articles on topics I knew well. They were all misleading shite. So I naturally assume that everything else I read associated with that magazine is also shite.

      • very_well_lost@lemmy.world · 9 points · 17 hours ago

        Briefly stated, the Gell-Mann Amnesia effect is as follows. You open the newspaper to an article on some subject you know well. In Murray’s case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward—reversing cause and effect. I call these the “wet streets cause rain” stories. Paper’s full of them.

        In any case, you read with exasperation or amusement the multiple errors in a story, and then turn the page to national or international affairs, and read as if the rest of the newspaper was somehow more accurate about Palestine than the baloney you just read. You turn the page, and forget what you know.

    • 13igTyme@piefed.social · 5 points · 20 hours ago

      I work for a company that uses machine learning to make predictions for hospitals for census and discharges. It’s only a tool and works to help, not replace. We’re also working on it reading unstructured notes. I’m incredibly sceptical of AI and we test the shit out of it to make sure it’s accurate.

      • bitteroldcoot@piefed.social · 6 points (1 down) · 19 hours ago

        “Reading unstructured notes”? And if it screws up, someone dies? I have doctors who want AI to transcribe what they say. I refused to sign the permission form.

        • 13igTyme@piefed.social · 4 points · 19 hours ago

          The software is only used to help identify barriers for patients currently discharging. A person isn’t going to die when discharging home and waiting on DME (durable medical equipment).

    • kboos1@lemmy.world · 3 points · 19 hours ago

      The only thing I have found useful about AI is its ability to quickly fill in documents with slop to make it seem like I spent more time and effort on them. Usually I put a document together with the major points and framework, then give it to AI to slop it up and format it. Then I proof it and send it out. It’s also good for note taking and transcripts.

      Other than that it seems like just another form of control, because now it can search data and make decisions quickly and cheaply. This means that things that weren’t worth making time for in the past can just be given to AI to track. In fact my company is playing around with using AI to track our progress on projects so that the PMs don’t have to interact with engineers directly. I would also bet that it will be used to assess performance in future annual performance reviews.

      Companies are also hoping to get rid of the employees who perform the menial tasks that support staff handle, and the employees whose tasks they believe don’t require specialized skills or talents.

  • eleijeep@piefed.social · 11 points · 18 hours ago

    What brain? They’re developing an accountability-laundering propaganda machine. There’s nothing involved that you could call a “brain.”

  • brucethemoose@lemmy.world · 7 points · 18 hours ago (edited)

    Actually, Chinese labs are doing a whole lot more innovation than the American “AI brains,” or at least more innovation that we know about. Their architectures are getting more and more efficient, versus US Big Tech’s “the same, but bigger, and capture regulators” ethos.

    Not that the Chinese labs are saints. They’re 100% distilling US labs’ data. It’s somewhat measurable:

    https://eqbench.com/creative_writing.html

    They’re almost certainly using unspecified Chinese govt data too, or at least sharing data between them, given the common quirks and behavior across models and the efficiency for their size. Not to speak of political “gaps” (which US models certainly have too).

  • Meron35@lemmy.world · 2 points (1 down) · 11 hours ago

    As if American AI firms aren’t doing the same.

    Anthropic made a lot of noise about being the victim of large-scale distillation attacks (i.e. other AI firms, usually Chinese, copying/scraping their model), but people quickly pointed out the hypocrisy: Anthropic themselves seem to have copied DeepSeek.

    If you bypass the system prompt and ask Claude what model it is (e.g. via OpenRouter), it’ll reply that it’s DeepSeek.

  • magnetosphere@fedia.io · 7 points · 19 hours ago

    …and nobody is doing much to stop it.

    Why should we care?

    I see this as a perfect real-world test. These companies can’t even protect what’s supposed to make them “valuable”. That doesn’t make it our problem. This is an easily foreseeable issue that they chose to ignore in their rush to market. They’re simply not ready. It’s their own fault.

  • mrmaplebar@fedia.io · 5 points · 19 hours ago

    I believe they have been doing that and will continue to do it. Not just through distillation attacks, but also through hacking corporate and government networks, and good old-fashioned espionage.

    But “easy come, easy go”, I guess. Because all of the training data was stolen in the first place. Just one more reason why the AI business is fucked. The answer for society remains regulation.

    • Hegar@fedia.io · 4 points · 18 hours ago

      “Only I stole this fairly” has been the motto of oligarchs for millennia.

  • SalamenceFury@piefed.social · 5 points (1 down) · 20 hours ago

    I don’t care if anyone steals any AI model; in a just world, LLMs would be considered illegal everywhere.