Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn’t ready to take on the role of the physician.”

“In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice,” the study’s authors wrote. “One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care.”

  • BeigeAgenda@lemmy.ca
    11 days ago

    Anyone who has knowledge of a specific subject says the same: LLMs are constantly incorrect and hallucinate.

    Everyone else thinks it looks right.

    • tyler@programming.dev
      11 days ago

      That’s not what the study showed, though. The LLMs were right over 98% of the time… when given the full situation by a “doctor”. The problem was normal people trying to self-diagnose without knowing which details were important.

      This is why studies are incredibly important. Even with the text of the study right in front of you, you assumed a conclusion the study did not reach.

    • agentTeiko@piefed.social
      11 days ago

      Yep, it’s why C-levels think it’s the Holy Grail: they don’t see the problem, because everything that comes out of their mouths is bullshit as well. So they don’t see the difference.

    • zewm@lemmy.world
      11 days ago

      It is insane to me how anyone can trust LLMs when their information is incorrect 90% of the time.

      • SuspciousCarrot78@lemmy.world
        11 days ago

        I don’t think it’s their information per se, so much as how the LLMs tend to use said information.

        LLMs are generally tuned to be expressive and lively. Part of that involves “random” (i.e., roll the dice) output based on inputs + training data. (I’m skipping over technical details here for the sake of simplicity.)

        That’s what the masses have shown they want: friendly, confident-sounding chatbots that can give plausible answers that are mostly right, sometimes.

        But for certain domains (like med) that shit gets people killed.

        TL;DR: they’re made for chitchat engagement, not high fidelity expert systems. You have to pay $$$$ to access those.
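
As a rough illustration of the “roll the dice” point in the comment above, here is a minimal Python sketch of temperature sampling, one common way text generators choose among candidate outputs. The candidate replies and their scores are made up for illustration, and `sample_with_temperature` is a hypothetical helper, not any vendor's API; real systems sample token by token over a far larger vocabulary.

```python
import math
import random


def sample_with_temperature(logits, temperature, rng):
    """Pick one index from `logits` after temperature scaling (toy sketch)."""
    if temperature <= 0:
        # Treat zero temperature as greedy decoding: always take the top score.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exp() for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    # random.choices draws one index in proportion to the weights.
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# Hypothetical scores for three candidate replies to the same prompt.
candidates = ["seek emergency care", "rest in a dark room", "drink water"]
logits = [2.0, 1.5, 0.2]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
greedy = {candidates[sample_with_temperature(logits, 0.0, rng)] for _ in range(200)}
sampled = [candidates[sample_with_temperature(logits, 1.0, rng)] for _ in range(200)]

print("temperature 0.0:", greedy)
print("temperature 1.0:", {reply: sampled.count(reply) for reply in candidates})
```

With the temperature at 0, the top-scoring reply comes back every time. At 1.0, repeated runs on the same input can return different replies, which is one plausible way near-identical messages could end up with different answers.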