TL;DR: The big tech AI company LLMs have gobbled up all of our data, but the damage they have done to open source and free culture communities is particularly insidious. By taking advantage of those who share freely, they destroy the bargain that made free software spread like wildfire.

  • BackgrndNoize@lemmy.world
    arrow-up
    25
    arrow-down
    1
    ·
    2 days ago

    So the issue is that AI strips the provenance of the open source contributors, and the output it spits out based on the data it consumed is not subject to the same open source licensing that applies to the open source projects. These AI companies profit from this, but the open source contributors don’t see a dime. Well, that’s kinda always been the case though: so many amazing open source projects get co-opted by tech giants like Microsoft and then repackaged as proprietary software for a profit (embrace, extend, extinguish). Back then they needed a team of developers to do that; now it’s more automated with AI, I guess.

    • yoasif@fedia.io (OP)
      arrow-up
      18
      ·
      2 days ago

      Copyleft software isn’t supposed to just be repackaged as proprietary, though. Permissive licenses, sure - but people know what they were signing up for (presumably) there.

    • aichan@piefed.blahaj.zone
      English
      arrow-up
      5
      arrow-down
      3
      ·
      2 days ago

      I believe the discourse that the FSF has managed to spread is greatly harmful to developers and communities. They are copyleft absolutists who believe no restrictions should be imposed on the use of our code, not even on megacorps that massively profit from it, oftentimes giving nothing in return.

      I am in the process of making a revised version of the copyfarleft Cooperative Software License with a lawyer, and once it’s done I will switch most of my development to it, with a clear warning for any company that uses my code to fuck off (or pay me, I guess).

      • misk@piefed.social
        English
        arrow-up
        7
        ·
        edit-2
        2 days ago

        If the code used to train an LLM was released under a copyleft license, then there’s only one way to interpret how the output should be licensed. There’s nothing absolutist about it; that’s just how the GPL and such were intended to work. If you don’t like it, don’t use it to train models.

        • aichan@piefed.blahaj.zone
          English
          arrow-up
          5
          ·
          2 days ago

          I think you misinterpreted my comment? I mean the Free Software Foundation is copyleft absolutist, as in, they will defend that model of licensing no matter what. I agree, of course: an LLM can be trained legally with GPL code; as you say, that’s how it is.

      • Klear@quokk.au
        English
        arrow-up
        7
        ·
        2 days ago

        You don’t want copyleft. What you’re looking for is called “copyright”

        • aichan@piefed.blahaj.zone
          English
          arrow-up
          4
          arrow-down
          1
          ·
          edit-2
          2 days ago

          No, it’s copyfarleft. Both it and copyleft USE copyright. I recommend the Telekommunist Manifesto on this topic, and you can find the FSF’s stupid take on this here. I don’t want copyleft anymore; I don’t think it is enough. The FSF’s justification is hypocritical and cowardly, as they state that “…embedding that desire (ethical behavior) in software license requirements will backfire, by legitimizing fundamentally unjust power over others” while using the power of copyright themselves, and in a world where we already see bad actors profiting from collective work.

          Edit: Adding to this, the first word of the GNU GENERAL PUBLIC LICENSE is Copyright lmao

    • BurnoutDV@lemmy.world
      arrow-up
      3
      ·
      2 days ago

      Isn’t that thumbnail from the Times or something, parodying the famous grayscale worker picture that can be found at every dentist?

  • atzanteol@sh.itjust.works
    English
    arrow-up
    6
    arrow-down
    21
    ·
    2 days ago

    “destroy the bargain that made free software spread like wildfire”

    If you don’t want your code to be used by others, then don’t make it open source.

    • yoasif@fedia.io (OP)
      arrow-up
      21
      ·
      2 days ago

      Do you understand how free software works? Did you read the post? I’d love to clarify, but I’m not going to rewrite the article.

      • atzanteol@sh.itjust.works
        English
        arrow-up
        4
        arrow-down
        2
        ·
        2 days ago

        Yes. And this is kinda hand-wavy bullshit.

        “By incorporating copyleft data into their models, the LLMs do share the work - but not alike. Instead, the AI strips the work of its provenance and transforms it to be copyright free.”

        That’s not how it works. Your code is not “incorporated” into the model in any recognizable form. It trains a model of vectors. There isn’t a file with your for loop in there though.

        I can read your code, learn from it, and create my own code with the knowledge gained from your code without violating an OSS license. So can an LLM.

        • yoasif@fedia.io (OP)
          arrow-up
          2
          ·
          2 days ago

          “I can read your code, learn from it, and create my own code with the knowledge gained from your code without violating an OSS license.”

          Why is Clean-room design a thing then?

          • atzanteol@sh.itjust.works
            English
            arrow-up
            1
            arrow-down
            1
            ·
            2 days ago

            “create my own code with the knowledge gained from your code”

            Not copy your code. Use it to learn what algorithms it uses and ideas on how to implement it.

        • VoterFrog@lemmy.world
          arrow-up
          2
          ·
          2 days ago

          “I can read your code, learn from it, and create my own code with the knowledge gained from your code without violating an OSS license. So can an LLM.”

          Not even just an OSS license. No license backed by law is any stronger than copyright. And you are allowed to learn from or statistically analyze even fully copyrighted work.

          Copyright is just a lot more permissive than I think many people realize. And there’s a lot of good that comes from that. It’s enabled things like API emulation and reverse engineering and being able to leave our programming job to go work somewhere else without getting sued.

      • atzanteol@sh.itjust.works
        English
        arrow-up
        3
        arrow-down
        2
        ·
        2 days ago

        Also - this conclusion is ridiculous:

        “By incorporating copyleft data into their models, the LLMs do share the work - but not alike. Instead, the AI strips the work of its provenance and transforms it to be copyright free.”

        That is absolutely not true. It doesn’t remove the copyright from the original work and no court has ruled as such.

        If I wrote a “random code generator” that just happened to create the source code for Microsoft Windows in its entirety, it wouldn’t strip Microsoft of its copyright.

      • atzanteol@sh.itjust.works
        English
        arrow-up
        1
        arrow-down
        4
        ·
        2 days ago

        If you put a fucking sign on your door saying “come on in!” then don’t be angry when people do?

        • ImgurRefugee114@reddthat.com
          arrow-up
          4
          ·
          edit-2
          2 days ago

          We do hang signs on the doors, but they say something slightly different:

          https://en.wikipedia.org/wiki/Open-source_license

          Public domain licenses are truly as you describe, but copyleft licenses are far from that. There are also many “source available” licenses which aren’t open at all. Just because you can read a book doesn’t mean you can print and sell it.

            • ImgurRefugee114@reddthat.com
              arrow-up
              2
              ·
              1 day ago

              Uh… Lots of people? That’s kinda the problem. Maybe use a search engine. There are plenty of cases of LLMs ‘laundering’ copyleft code into (often) proprietary codebases. And that’s just the most blatant and brain-dead obvious example; the use of GPL code to train commercial models is a bit more subtle and nuanced but no less nefarious, and the laws are currently unequipped to handle that part at all.

                • ImgurRefugee114@reddthat.com
                  arrow-up
                  2
                  arrow-down
                  1
                  ·
                  1 day ago

                  Oh I’m sorry I didn’t realize you had the intelligence of an LLM. The conversation is now over. I hope you have a pleasant day.