Related:

This is in a PR where Shougo, another long-time contributor, communicates entirely in walls of unparseable AI slop text: https://github.com/vim/vim/pull/19413

Thank you for the detailed feedback! I’ve addressed all the issues:

Thank you for the feedback! I agree that following the Vim 8+ naming convention makes sense.

Thank you for the feedback on naming!

Thanks for the suggestion! After thinking about this more, I believe repeat_set() / repeat_get() is the right choice:

Thank you for the feedback. A brief clarification.

https://hachyderm.io/@AndrewRadev/116176001750596207


  • hperrin@lemmy.ca
    3 days ago · 198 up, 2 down

    I spent literally all day yesterday working on this:

    https://sciactive.com/human-contribution-policy/

    I’ve started to add it to my projects. Eventually, it will be on all of my projects. I made it so that any project could adopt it, or modify it to their needs. It’s got a thorough and clear definition of what is banned, too, so it should help any argument over pull requests.

    Hopefully more projects will outright ban AI generated code (and other AI generated material).

    • gaiety@lemmy.blahaj.zone
      4 hours ago · 1 up, 1 down

      This is super cool!

      I did want to offer one language critique: it’s easy to jump to the word human as the opposite of AI-made, but there are a lot of therians and adjacent entities in the software engineering space. It would be wonderful to find language that is a pro-“human” policy that avoids that word and instead focuses on people of all sorts of identities so as not to be othering.

      Sounds strange to some, I’m sure, but this has been coming up more and more with coworkers I’ve had across several companies. It’s kind of like moving from “he or she” to “they”. A great example is the writings of beeps, a prominent software engineer on the GOV.UK site and its accessibility: https://beeps.website/about/nonhuman/

      Regardless of whether any changes are made, thanks for reading and for your policy writeup. Again, very cool :D

    • Bibip@programming.dev
      1 day ago · 8 up, 4 down

      hi, i have strong feelings about the use of genai but i come at it from a very different direction (story writing). it’s possible for someone to throw together a 300 page story book in an afternoon - in the style of lovecraft if they want, or brandon sanderson, or dan brown (dan brown always sounds the same and so we might not even notice). now, the assumption that i have about said 300 pager is that it will be dogshit, but art is subjective and someone out there has been beside themselves pining for it.

      but this has always been true. there have always been people churning out trash hoping to turn a buck. the fact that they can do it faster now doesn’t change that they’re still in the trash market.

      so: i keep writing. i know that my projects will be plagiarized by tech companies. i tell myself that my work is “better” than ai slop.

      for you, things are different. writing code is a goal-oriented creative endeavor, but the bar for literature is enjoyment, and the bar for code is functionality. with that in mind, i have some questions:

      if someone used genai to generate code snippets and they were able to verify the output, what’s the problem? they used an ersatz gnome to save them some typing. if generated code is indistinguishable from human code, how does this policy work?

      for code that’s been flagged as ai generated- and let’s assume it’s obvious, they left a bunch of GPT comments all over the place- is the code bad because it’s genai or is it bad because it doesn’t work?

      i’m interested to hear your thoughts

      • hperrin@lemmy.ca
        1 day ago (edited) · 8 up

        That’s a very good question, and I appreciate it.

        I put a lot of this in the reasoning section of the policy, but basically there are legal, quality, security, and community reasons. Even if the quality and security reasons are solved (as you’re proposing with the “indistinguishable from human code” aspect), there are still legal and community reasons.

        Legal

        AI generated material is not copyrightable, and therefore licensing restrictions on it cannot be enforced. It’s considered public domain, so putting that code into your code base makes your license much less enforceable.

        AI generated material might be too similar to its copyrighted training data, making it actually copyrighted by the original author. We’ve seen OpenAI and Midjourney get sued for regurgitating their training data. It’s not farfetched to think a copyright owner could go after a project for distributing their copyrighted material after an AI regurgitated it.

        Community

        People have an implicit trust that the maintainers of a project understand the code. When AI generated code is included, that may not be the case, and that implicit trust is broken.

        Admittedly, I’ve never seen AI generated code that I couldn’t understand, but it’s reasonable to think that as AI models get bigger and more capable of producing abstract code, their code could become too obscure or abstracted to be sufficiently understood by a project maintainer.

      • hperrin@lemmy.ca
        2 days ago · 15 up

        Ok, yeah, I’ll make a post for it.

        Feel free to share it anywhere. :)

      • hperrin@lemmy.ca
        3 days ago · 86 up

        Basically the best you can do is continue as normal, and if someone submits something that says it is or obviously is AI, point to this policy and reject it. Just having the policy should be a decent deterrent.

          • balsoft@lemmy.ml
            17 hours ago (edited) · 2 up

            If you vibe-code it and use an LLM to respond to reviews, it is really easy to tell.

            If you know what you’re doing and are just using an LLM to speed up boilerplate writing, honestly, who cares. It is technically copyright infringement but so many people are doing it that it’s not likely to be a problem.

            I think this policy is overblown a bit. A better policy is “you need to understand, and be responsible for, what every part of your contribution does”. Enough to tell lazy vibecoders to fuck off, and allows for some flexibility in your tooling.

          • hperrin@lemmy.ca
            1 day ago (edited) · 8 up, 1 down

            People submitting malicious or deceptive code to open source repositories isn’t a new phenomenon. Just know that if you do it with any name in any way attached to your real name, and anyone finds out, you can kiss your reputation in the software dev community goodbye.

            Also, if you don’t admit that it’s AI generated, and it turns out to be copyrighted code, you’ll have a fun time in court trying to defend yourself for copyright infringement by admitting to fraud.

      • Jankatarch@lemmy.world
        3 days ago (edited) · 24 up, 1 down

        Same mindset as “You don’t need a perfect lock to protect your house from thieves, you just need one better than what your neighbors have.”

        If a vibecoder sees this, they will not bother with obfuscation and will simply move on to the next project.

      • Retail4068@lemmy.world
        3 days ago · 1 up, 83 down

        No, it’s a prejudiced hot take that’s completely and utterly unenforceable which will be seen as some Luddite behavior in 10 years when everyone is using the tooling.

          • Retail4068@lemmy.world
            3 days ago · 3 up, 62 down

            I did. And you’re worried about clankers being able to comprehend as well as a human 🤣, good Lord the bar is low.

            • Scubus@sh.itjust.works
              2 days ago · 2 up, 4 down

              Ok that’s really funny and I do agree with you, but I think you might be coming at this a little… unhinged. The issue with this is that it is unenforceable and honestly somewhat pointless. If AI tools are not up to scratch, then that will always be reflected in the quality of the code. Bad code is bad code, it doesn’t matter what made it. A lot of people seem to think AI is synonymous with bad code, and if that is the case, simply ban bad code.

              The issue they are going to run into is twofold:

              Firstly, what qualifies as “using AI”? Admittedly I haven’t actually read their licensing, but I’m just going to take a guess and say that it bans all forms of AI used anywhere in production. Almost every compiler I use these days has auto predict. It’s rarely useful, but if it does happen to guess the rest of the code I was already going to type, and I accept that, did I use AI to assist my coding? Back in the day before it was an llm the auto predict was actually decent, so not all of them use AI. How would you even know whether yours is AI or not?

              The second issue is an issue of foresight. When the AI tools do become up to scratch, that will be reflected in the quality of their code. Suddenly AI generated code is faster, more efficient, and easier to understand all simultaneously. Anyone using this license is effectively admitting that theirs is the inferior option.

              It’s always hilarious to me when people ask whether something is AI slop. I dunno man, has your ability to detect whether something is good been reduced to AI slop? If it’s good, it’s good. If it’s not, it’s not. Either you like it or you don’t. Feels very similar to transphobes saying they can always tell. If that’s true, and AI really is always going to be worse, you should never have to ask whether something is AI slop, you should just be able to tell. Otherwise it’s just slop, no ai necessary.

              • hperrin@lemmy.ca
                1 day ago (edited) · 3 up

                Firstly, what qualifies as “using AI”? Admittedly I haven’t actually read their licensing, but I’m just going to take a guess and say that it bans all forms of AI used anywhere in production. Almost every compiler I use these days has auto predict. It’s rarely useful, but if it does happen to guess the rest of the code I was already going to type, and I accept that, did I use AI to assist my coding? Back in the day before it was an llm the auto predict was actually decent, so not all of them use AI. How would you even know whether yours is AI or not?

                So two things. First, it’s a policy, not a license. Second, the definition of AI generated is very clear in the policy.

                I don’t know why you would criticize it without reading it, but the main problems with AI generated code are legal, not quality related, and they are also clearly laid out in the policy.

          • Retail4068@lemmy.world
            2 days ago (edited) · 1 up, 16 down

            Yes it does. Folks who just want to screech went crazy. Like, two of you actually engaged and brought valid concerns. Y’all are a CRAZY prejudiced bunch and hate being called out just as much as the next shit flinging monkey tribe.

            You actually think Lemmy is better behaved 🤣🤣🤣🤣

    • thethunderwolf@lemmy.dbzer0.com
      2 days ago · 3 up, 1 down

      “AI generated” means that the subject material is in whole, or in meaningful part, the output of a generative AI model or models, such as a Large Language Model. This does not include code that is the result of non-generative tools, such as standard compilers, linters, or basic IDE auto-completions. This does, however, include code that is the result of code block generators and automatic refactoring tools that make use of generative AI models.

      As “artificial intelligence” is not that well defined, you could clarify what the policy defines “AI” as by specifying that “AI” involves machine learning.

      • hperrin@lemmy.ca
        2 days ago · 12 up

        “Generative AI model” is a pretty well defined term, so this prohibits all of those things like ChatGPT, Gemini, Claude Code, Stable Diffusion, Midjourney, etc.

        Machine learning is a much broader category, so banning all outputs of machine learning may have unintended consequences.