I don’t know if this is an acceptable format for a submission here, but here it goes anyway:

The Wikimedia Foundation has been developing an LLM that would produce simplified Wikipedia article summaries, as described here: https://www.mediawiki.org/wiki/Reading/Web/Content_Discovery_Experiments/Simple_Article_Summaries

We would like to provide article summaries, which would simplify the content of the articles. This will make content more readable and accessible, and thus easier to discover and learn from. This part of the project focuses only on displaying the summaries. A future experiment will study ways of editing and adjusting this content.

Currently, much of the encyclopedic quality content is long-form and thus difficult to parse quickly. In addition, it is written at a reading level much higher than that of the average adult. Projects that simplify content, such as Simple English Wikipedia or Basque Txikipedia, are designed to address some of these issues. They do this by having editors manually create simpler versions of articles. However, these projects have so far had very limited success - they are only available in a few languages and have been difficult to scale. In addition, they ask editors to rewrite content that they have already written. This can feel very repetitive.

In our previous research (Content Simplification), we have identified two needs:

  • The need for readers to quickly get an overview of a given article or page
  • The need for this overview to be written in language the reader can understand

Etc., you should check the full text yourself. There’s a brief video showing how it might look: https://www.youtube.com/watch?v=DC8JB7q7SZc

This hasn’t been met with warm reactions: the comments on the respective talk page have questioned the purpose of the tool (shouldn’t the introductory paragraphs do the same job already?), and other complaints have been raised as well:

Taking a quote from the page for the usability study:

“Most readers in the US can comfortably read at a grade 5 level,[CN] yet most Wikipedia articles are written in language that requires a grade 9 or higher reading level.”

Also stated on the same page, the study only had 8 participants, most of which did not speak English as their first language. AI skepticism was low among them, with one even mentioning they ‘use AI for everything’. I sincerely doubt this is a representative sample and the fact this project is still going while being based on such shoddy data is shocking to me. Especially considering that the current Qualtrics survey seems to be more about how to best implement such a feature as opposed to the question of whether or not it should be implemented in the first place. I don’t think AI-generated content has a place on Wikipedia. The Morrison Man (talk) 23:19, 3 June 2025 (UTC)

The survey the user mentions is this one: https://wikimedia.qualtrics.com/jfe/form/SV_1XiNLmcNJxPeMqq. True enough, it pretty much takes for granted that the summaries will be added: there’s no judgment of their actual quality, and they’re only asking for people’s feedback on how the summaries should be presented. I filled it out and couldn’t even find a place to say that, e.g., the summary they show is written almost insultingly, like it’s meant for particularly dumb children. I also couldn’t tell whether it is accurate, because they just scroll around in the video.

A very extensive discussion is going on at the Village Pump (en.wiki).

The comments are also overwhelmingly negative, some of them pointing out that the summary doesn’t summarise the article properly (“Perhaps the AI is hallucinating, or perhaps it’s drawing from other sources like any widespread llm. What it definitely doesn’t seem to be doing is taking existing article text and simplifying it.” - user CMD). A few comments acknowledge potential benefits of the summaries, though with a significantly different approach to using them:

I’m glad that WMF is thinking about a solution of a key problem on Wikipedia: most of our technical articles are way too difficult. My experience with AI summaries on Wikiwand is that it is useful, but too often produces misinformation not present in the article it “summarises”. Any information shown to readers should be greenlit by editors in advance, for each individual article. Maybe we can use it as inspiration for writing articles appropriate for our broad audience. —Femke 🐦 (talk) 16:30, 3 June 2025 (UTC)

One of the reasons many prefer chatGPT to Wikipedia is that too large a share of our technical articles are way way too difficult for the intended audience. And we need those readers, so they can become future editors. Ideally, we would fix this ourselves, but my impression is that we usually make articles more difficult, not easier, when they go through GAN and FAC. As a second-best solution, we might try this as long as we have good safeguards in place. —Femke 🐦 (talk) 18:32, 3 June 2025 (UTC)

Finally, some comments are problematising the whole situation with WMF working behind the actual wikis’ backs:

This is a prime reason I tried to formulate my statement on WP:VPWMF#Statement proposed by berchanhimez requesting that we be informed “early and often” of new developments. We shouldn’t be finding out about this a week or two before a test, and we should have the opportunity to inform the WMF if we would approve such a test before they put their effort into making one happen. I think this is a clear example of needing to make a statement like that to the WMF that we do not approve of things being developed in virtual secret (having to go to Meta or MediaWikiWiki to find out about them) and we want to be informed sooner rather than later. I invite anyone who shares concerns over the timeline of this to review my (and others’) statements there and contribute to them if they feel so inclined. I know the wording of mine is quite long and probably less than ideal - I have no problem if others make edits to the wording or flow of it to improve it.

Oh, and to be blunt, I do not support testing this publicly without significantly more editor input from the local wikis involved - whether that’s an opt-in logged-in test for people who want it, or what. Regards, -bɜ:ʳkənhɪmez | me | talk to me! 22:55, 3 June 2025 (UTC)

Again, I recommend reading the whole discussion yourself.

EDIT: WMF has announced they’re putting this on hold after the negative reaction from the editors’ community. (“we’ll pause the launch of the experiment so that we can focus on this discussion first and determine next steps together”)

  • AbouBenAdhem@lemmy.world · 5 months ago

    IIRC, they weren’t trying to stop them—they were trying to get the scrapers to pull the content in a more efficient format that would reduce the overhead on their web servers.

    • Lv_InSaNe_vL@lemmy.world · 5 months ago

      You can literally just download all of Wikipedia in one go from one URL. They would rather people just do that instead of crawling their entire website because that puts a huge load on their servers.

      • palordrolap@fedia.io · 5 months ago

        Ah, but the clueless code monkeys, script kiddies and C-levels who are responsible for writing the AI companies’ processing code only know how to scrape from someone else’s website. They can’t even ask their (respective) company’s AI for help because it hasn’t been trained yet. (Not that Wikipedia’s content will necessarily help).

        They’re not even capable of taking the ZIP file and hosting the contents on localhost to allow the scraper code they got working to operate on something it understands.

        So hammer Wikipedia they must, because it’s the limit of their competence.

        • JackbyDev@programming.dev · 5 months ago (edited)

          What’s funny is crawling the site would actually be more difficult and take longer than downloading and reading the archive.

          Context for others, Wikipedia is only ~24 GB (compressed and without media or history). https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia

          As of 16 October 2024, the size of the current version including all articles compressed is about 24.05 GB without media.
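
For anyone wondering what "downloading and reading the archive" might look like in practice, here is a minimal sketch using only the Python standard library. It assumes the archive is a bz2-compressed MediaWiki XML export; the file name in the comment and the helper names `iter_pages` and `iter_dump` are illustrative assumptions, not anything taken from the thread.

```python
import bz2
import xml.etree.ElementTree as ET


def iter_pages(stream):
    """Yield (title, text) pairs from a MediaWiki XML export stream.

    Hypothetical helper: it reads the stream incrementally rather than
    loading the whole dump into memory.
    """
    title, text = None, None
    for _, elem in ET.iterparse(stream, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace, if any
        if tag == "title":
            title = elem.text
        elif tag == "text":
            text = elem.text or ""
        elif tag == "page":
            yield title, text
            title, text = None, None
            elem.clear()  # release the page's children once it is processed


def iter_dump(path):
    """Open a bz2-compressed dump and stream its pages.

    The path is an assumption, e.g. "enwiki-latest-pages-articles.xml.bz2".
    """
    with bz2.open(path, "rb") as f:
        yield from iter_pages(f)
```

Something like `for title, text in iter_dump(path): ...` would then walk every article locally, with no requests sent to Wikipedia's servers after the initial download.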