Researchers from Princeton University found that 4.36% of the 2,909 English Wikipedia articles created in August 2024 contained significant AI-generated content. The study, led by Creston Brooks, Samuel Eggert, and Denis Peskoff, highlighted the growing role of AI in content creation and raised concerns about the implications for quality, accountability, and potential bias amplification.
The researchers used two AI detection tools, GPTZero (a proprietary detector) and Binoculars (an open-source alternative), to measure the presence of AI-generated content. GPTZero flagged 156 articles and Binoculars flagged 96, with 45 articles flagged by both tools. The flagged articles were often of lower quality, with fewer references and weaker integration into Wikipedia’s existing knowledge network. Some were self-promotional, advertising businesses or individuals and relying on superficial citations such as personal YouTube videos. Others exhibited political bias, including attempts to manipulate coverage of contentious topics such as Albanian history.
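To make the overlap between the two detectors concrete, the short sketch below applies inclusion-exclusion to the counts reported above. The variable names and the `share` helper are illustrative assumptions, not part of the study. The percentages it prints are derived here rather than quoted from the paper, and they are not meant to reproduce the 4.36% headline figure.

```python
# Counts as reported in the article; names are illustrative only.
total_articles = 2909      # English Wikipedia articles created in August 2024
gptzero_flagged = 156      # flagged by GPTZero
binoculars_flagged = 96    # flagged by Binoculars
flagged_by_both = 45       # flagged by both tools

# Inclusion-exclusion: articles flagged by at least one detector.
flagged_by_either = gptzero_flagged + binoculars_flagged - flagged_by_both


def share(count, total=total_articles):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


summary = {
    "GPTZero only": gptzero_flagged - flagged_by_both,
    "Binoculars only": binoculars_flagged - flagged_by_both,
    "Both tools": flagged_by_both,
    "Either tool": flagged_by_either,
}

for label, count in summary.items():
    print(f"{label}: {count} articles ({share(count)}% of {total_articles})")
```

Under these counts, most flagged articles were caught by only one of the two tools, which is consistent with the paper's point that detection remains difficult.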
The study also analyzed AI-generated content on other platforms and found that rates varied widely. Among 3,000 Reddit comments examined, fewer than 1% were flagged as AI-generated, whereas the share of AI-generated press releases from 60 UN country teams rose from under 1% before 2022 to 20% in 2024. These findings illustrate how unevenly AI-generated content has been adopted, and how hard it is to detect, across different types of media.
The paper emphasized the challenges in detecting AI-generated content, particularly as generative language models become more advanced. It highlighted the need for effective verification methods to ensure the integrity of online information.