• brucethemoose@lemmy.world · 6 days ago

    This is stupid.

    As I always preach, I am one of Lemmy’s rare local LLM advocates. I use “AI” every day. But I’d vote no.


    The real question isn’t whether you’re for or against AI.

    It’s whether you support Tech Bro oligarchy and enshittification, or prefer being in control of your tech. It’s not even about AI; it’s the same dilemma as the Fediverse vs Big Tech.


    And that’s what Altman and such fear the most. It’s what you see talked about in actual ML research circles, as they tilt their heads at weird nonsense coming out of Altman’s mouth. Open weights AI is a race to the bottom. It’d turn “AI” into dumb, dirt cheap, but highly specialized tools. Like it should be.

    And they can’t make trillions off that.

    • eatCasserole@lemmy.world · 6 days ago

      Well I think that the products currently being hyped as “AI” are significantly more dangerous and harmful than they will ever be useful, and I would like to burn them.

    • Voroxpete@sh.itjust.works · 6 days ago

      If it helps, I think you just need to read the question as “Do you want AI in the form that is currently being offered?” For all intents and purposes, that’s the question being asked, because that’s how the average person is going to read it.

      The fact that AI can be a totally different product that doesn’t fundamentally suck is nice to know, but doesn’t exactly offer anything to most people.

      • hector@lemmy.today · 6 days ago

        That is like arguing that climate change could have been addressed. Yes, but it was never going to be. Zero chance from day one. Zero chance twenty-six years back, when I learned about it in more depth.

        Yes, it could be, but it was never going to be. Any chance was purely theoretical.

      • brucethemoose@lemmy.world · 6 days ago

        The problem is that being “anti-AI” smothers open weights ML while doing basically nothing against corporate AI.

        The big players do not care. They’re going to shove it down your throats either way. And this whole fuss is convenient, as it crushes support for open weights AI and draws attention away from it.


        What I’m saying is that people need to be advocating for open weights stuff instead of “just don’t use AI,” in the same way one would advocate for Lemmy/Piefed instead of “just don’t use Reddit.”

        The Fediverse could murder trillions of dollars in corporate profit with enough critical mass. Open weights AI is the same, except it’s much closer to that critical mass than people realize.

    • errer@lemmy.world · 6 days ago

      Out of curiosity, how are you running a local LLM? I’ve been trying recently, and the results I’ve gotten have been really subpar compared to what the big boys offer. I’ve been using LM Studio.

      • cecilkorik@piefed.ca · 6 days ago

        Open models, running on consumer hardware, will probably be really subpar for a very long time, because the AI big boys are heavily subsidizing their AIs with massive models running on massive compute farms in massive datacenters. They are subsidizing these things with your environment, your electricity dollars, your tax dollars, and your retirement fund’s dollars, since they’re all losing money all the time.

        They’re not going to do that forever (and you can’t afford to pay for it forever either), but that’s okay, because for now they just want to get you onboard. It’s a trap, and you can play around with that alluring trap if you want, but they’re going to make it really easy to fall in when you do.

        Once they’ve got you in the trap, they’re eventually going to tighten the screws, lock you into their ecosystem (if they haven’t already), and start extracting money from you directly. That’s coming: they’ll lock it down and enshittify until you pay them, then enshittify that until you pay them more, because eventually they have to face the real economics of providing these AI models and start actually turning a profit. You can’t even imagine how far they’ll have to go to get to that point; you might as well just set your wallet on fire now.

        So, if you’re expecting open, free, local models to compete directly with that, you aren’t understanding their business model. The results that big AI providers are giving you are an absolute illusion, built out of a collage of countless lies and thefts. Those results won’t continue the way they are now; they can’t, because they’re not economically viable. They’re only that good so they can lure you into the trap.

        Open models, on the other hand, operate in the concrete reality of the here and now. What they’re giving you is the real quality of existing models on sensible contemporary hardware, using the real economics of providing these things. If you find that underwhelming, well, maybe it is, but that doesn’t mean the lies of big AI are real or ever will be. You can try to take advantage of the constant, delightful lies they’re currently showing you while you still can, but chances are they’ll end up taking advantage of you instead, because that’s their goal.

      • brucethemoose@lemmy.world · 6 days ago

        Yeah, accessibility is the big problem.


        What I use depends on the task.

        For “chat” and creativity, I use my own version of GLM 4.6 350B quantized to just barely fit in 128GB RAM/24GB VRAM, with a fork of llama.cpp called ik_llama.cpp:

        https://huggingface.co/Downtown-Case/GLM-4.6-128GB-RAM-IK-GGUF

        It’s complicated, but in a nutshell, the degradation vs the full model is reasonable even though it’s like 3 bits instead of 16, and it runs at 6-7 tokens/sec even with so much of the model in CPU RAM.
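
        To make that concrete, here is a rough sketch of querying a local llama.cpp/ik_llama.cpp server from a script, assuming it was launched with the usual OpenAI-compatible endpoint on the default port (the model file, port, and prompt are placeholders, not the exact setup above):

        ```python
        # Minimal sketch: talk to a locally running llama.cpp/ik_llama.cpp server
        # through its OpenAI-compatible API. Assumes something like
        #   llama-server -m glm-4.6-iq3.gguf -ngl 99 --port 8080
        # is already running; the model file and port are placeholders.
        import requests

        resp = requests.post(
            "http://localhost:8080/v1/chat/completions",
            json={
                "model": "local",  # largely ignored by llama.cpp, but the schema wants it
                "messages": [
                    {"role": "user", "content": "Summarize why open weights LLMs matter."}
                ],
                "max_tokens": 256,
                "temperature": 0.6,
            },
            timeout=300,
        )
        print(resp.json()["choices"][0]["message"]["content"])
        ```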

        For the UI, it varies, but I tend to use mikupad so I can manipulate the chat syntax. LMStudio works pretty well though.


        Now, for STEM stuff or papers? I tend to use Nemotron 49B quantized with exllamav3, or sometimes Seed-OSS 36B, as both are good at that and at long context stuff.

        For coding, automation? It… depends. Sometimes I use Qwen VL 32B or 30B, in various runtimes, but it seems that GLM 4.7 Flash and GLM 4.6V will be better once I set them up.

        Minimax is pretty good at making quick scripts, while being faster than GLM on my desktop.

        For a front end, I’ve been switching around.

        I also use custom sampling. I basically always use n-gram sampling in ik_llama.cpp where I can, with DRY at modest temperatures (0.6?). Or low or even zero temperature for more “objective” things. This is massively important, as default sampling is where so many LLM errors come from.
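
        For a rough idea of what that tuning looks like on the wire, here is a sketch against a llama.cpp-style server’s native /completion endpoint; the parameter names follow mainline llama.cpp’s DRY options, and ik_llama.cpp’s extra n-gram sampling knobs are not shown:

        ```python
        # Sketch: a completion request with DRY repetition control and a modest
        # temperature. Parameter names follow mainline llama.cpp's /completion
        # endpoint; ik_llama.cpp's n-gram sampling options differ and are omitted.
        import requests

        payload = {
            "prompt": "Explain mixture-of-experts offloading in two sentences.",
            "n_predict": 200,
            "temperature": 0.6,      # modest temperature for chat/creative use
            # "temperature": 0.0,    # or zero for more "objective" tasks
            "dry_multiplier": 0.8,   # enable DRY
            "dry_base": 1.75,
            "dry_allowed_length": 2,
        }

        resp = requests.post("http://localhost:8080/completion", json=payload, timeout=300)
        print(resp.json()["content"])
        ```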

        And TBH, I also use GLM 4.7 over API a lot, in situations where privacy does not matter. It’s so cheap it’s basically free.


        So… Yeah. That’s the problem. If you just load up LMStudio with its default Llama 8B Q4KM, it’s really dumb and awful and slow. You almost have to be an enthusiast following the space to get usable results.

        • errer@lemmy.world · 5 days ago

          Thank you, very insightful.

          Really, the big distinguishing feature is VRAM. We consumers just don’t have enough. If I could have a 192GB VRAM system, I could probably run a local model comparable to what OpenAI and others offer, but here I am with a lowly 12GB.

          • brucethemoose@lemmy.world · 5 days ago

            You mean an Nvidia 3060? You can run GLM 4.6, a 350B model, on 12GB VRAM if you have 128GB of CPU RAM. It’s not ideal though.

            More practically, you can run GLM Air or Flash quite comfortably. And that’ll be considerably better than “cheap” or old models like Nano, on top of being private, uncensored, and hackable/customizable.
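
            As a rough illustration of that kind of partial offload (using the plain llama-cpp-python bindings rather than ik_llama.cpp, with the model file and layer count as placeholders):

            ```python
            # Sketch: load a GGUF model with only some layers on the GPU, keeping the
            # rest in CPU RAM. Model path and n_gpu_layers are placeholders; tune the
            # layer count to whatever fits in 12GB of VRAM.
            from llama_cpp import Llama

            llm = Llama(
                model_path="glm-4.5-air-q4.gguf",  # placeholder file name
                n_gpu_layers=20,                   # partial GPU offload
                n_ctx=8192,
            )

            out = llm.create_chat_completion(
                messages=[{"role": "user", "content": "Write a haiku about VRAM."}],
                max_tokens=64,
            )
            print(out["choices"][0]["message"]["content"])
            ```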

            The big distinguishing feature is “it’s not for the faint of heart,” heh. It takes time and tinkering to set up, as all the “easy” preconfigurations are suboptimal.


            That aside, even if you have a toaster, you can invest a bit in API credits and run open weights models with relative privacy on a self-hosted front end. Pick the jurisdiction of your choosing.

            For example: https://openrouter.ai/z-ai/glm-4.6v

            It’s like a dollar or two per million words. You can even give a middle finger to Nvidia by using Cerebras or Groq, which don’t use GPUs at all.
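
            For instance, a minimal sketch of calling that model through OpenRouter’s OpenAI-compatible API (the key is a placeholder; the model ID comes from the link above):

            ```python
            # Sketch: query an open weights model hosted behind OpenRouter's
            # OpenAI-compatible API. Works the same from a self-hosted front end
            # or a plain script; the API key is a placeholder.
            from openai import OpenAI

            client = OpenAI(
                base_url="https://openrouter.ai/api/v1",
                api_key="YOUR_OPENROUTER_KEY",
            )

            completion = client.chat.completions.create(
                model="z-ai/glm-4.6v",
                messages=[{"role": "user", "content": "Summarize this paper abstract in two sentences."}],
                max_tokens=300,
            )
            print(completion.choices[0].message.content)
            ```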