Wikidata:WikiProject AI Cleanup



This WikiProject is currently active.

 To report issues with AI use, please go to Wikidata:AI noticeboard.

Welcome to WikiProject AI Cleanup, a collaboration to combat the increasing problem of poorly verified AI-generated content on Wikidata. If you would like to help, add yourself as a participant in the project, inquire on the talk page, and see the to-do list.

Goals

Since 2022, large language models (LLMs), including GPT models, have become a convenient tool for creating text and structured data at scale. Unfortunately, these models virtually always fail to source claims properly and often introduce subtle errors into structured knowledge bases. Although Wikidata requires verifiable references for statements, a significant amount of LLM-generated data added from 2022 onward remains on Wikidata in the form of statements, descriptions, aliases, and references. The purpose of this project is to identify and address the misuse of AI in Wikidata content. The project's goals are:

  • To identify statements, descriptions, labels, and aliases written or translated by AI, and to scrutinize such content to make sure it follows Wikidata's policies. Removing unsourced or likely inaccurate claims should be the first priority.
  • To identify AI-generated images on Wikimedia Commons and ensure appropriate usage in item statements.
  • To keep track of AI-using editors who may not realize the deficiencies of AI as a data creation tool.
  • To educate editors about the issues with AI-generated content and Wikidata's expectations around AI use.
  • To protect against disruption from AI agents and unauthorized automated editing.

Editing advice

  • Tag items with appropriate templates, remove unsourced statements and/or content added in violation of Wikidata:Verifiability, and warn users who add AI-generated content to items.
  • Any pages (not just items) that are clearly entirely LLM-generated without human review can be nominated for deletion according to Wikidata deletion policies.
  • Identifying AI-assisted edits is difficult in most cases since the generated text is often indistinguishable from human text. The signs of AI writing page provides a list of characteristics that are associated with text generated by AI chatbots.
    • Content that was present in an item before November 30, 2022 (the release date of ChatGPT) is very unlikely to be AI-generated.
  • Most AI content is not "unsourced": sometimes it cites real sources that are unrelated to the item's topic, sometimes it fabricates sources, and sometimes it uses legitimate sources to create inaccurate content. When removing bad AI content, check the cited references for legitimacy, and if they are legitimate, consider preserving them on the item talk page for eventual use in human-verified content.
    • As of 2026, recent models will usually cite real sources, although those sources will likely not support the content they are cited for. The existence of these sources should not, by itself, be taken as evidence that the content is human-written.
  • Sometimes entire items or large clusters of statements are AI-generated. In such cases, check that the topic is legitimate and meets notability requirements. Occasionally, hoaxes have made it onto Wikidata because AI tools can create fake citations that appear legitimate.
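
The date heuristic above (content present before November 30, 2022 is very unlikely to be AI-generated) can be applied mechanically when triaging an item's history. The following is only a minimal sketch and not an official project tool. It assumes revision timestamps are ISO 8601 UTC strings of the form "2022-11-29T12:00:00Z" and that each revision is a plain dictionary with a "timestamp" key. The function names and the "id" field are hypothetical and chosen for illustration.

```python
from datetime import datetime, timezone

# Cutoff taken from the advice above: the ChatGPT release date,
# interpreted here (an assumption) as the start of that day in UTC.
CHATGPT_RELEASE = datetime(2022, 11, 30, tzinfo=timezone.utc)


def parse_timestamp(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as '2022-11-29T12:00:00Z'."""
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def predates_chatgpt(ts: str) -> bool:
    """Return True if the timestamp falls strictly before the cutoff."""
    return parse_timestamp(ts) < CHATGPT_RELEASE


def needs_review(revisions: list[dict]) -> list[dict]:
    """Keep only revisions made on or after the cutoff.

    Each revision is assumed to be a dict with a 'timestamp' key
    (hypothetical shape, for illustration only).
    """
    return [r for r in revisions if not predates_chatgpt(r["timestamp"])]


if __name__ == "__main__":
    history = [
        {"id": 1, "timestamp": "2021-05-04T08:15:00Z"},
        {"id": 2, "timestamp": "2024-02-10T19:30:00Z"},
    ]
    for rev in needs_review(history):
        print(rev["id"], rev["timestamp"])
```

A filter like this only narrows down which edits deserve a closer look. A later date does not show that content is AI-generated, and the references still need to be checked by hand as described above.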

Open tasks

See Category:Items containing suspected AI-generated content for all items that have been tagged as possibly {{AI-generated}}. The tasks page recommends ways to handle items, talk page discussions, and sources that use AI-generated content.

Participants

Primary contacts:

Feel free to add yourself here!



Resources

Essays

Information

Relevant discussions

These threads may be useful for editors seeking information about how AI has previously been handled on Wikidata.

Project resources

See also
