• WhatAmLemmy@lemmy.world
    English · 24 upvotes · edited 9 days ago

    The point wasn’t to answer questions. It was to murder 10 baby giraffes a day, for profit!

  • pkjqpg1h@lemmy.zip
    English · 9 upvotes · 8 days ago

    According to the AA-Omniscience benchmark, here is how the most expensive models do:

    Opus 4.6 has a 60% hallucination rate and a 46% accuracy rate. Gemini 3.1 Pro Preview has a 50% hallucination rate and a 55% accuracy rate.

    And the questions aren’t even open-ended.

    I don’t even need to tell you about the other models.

    • Kairos@lemmy.today
      4 upvotes · edited 8 days ago

      “Opus 4.6”, like every other LLM, has a 100% hallucination rate because that’s literally the only thing they do.