Disinformation Detection at Scale
The growing prevalence of fake and manipulated visual media on the Internet has created social and technical dilemmas that need solutions. From political memes meant to influence elections to dishonest scientists fabricating experimental results, the landscape of fake media is diverse and difficult to identify by automated means. There is a crucial need for scalable algorithms that can process upwards of a billion images per day for signs of tampering or synthetic content. This project will reassess existing assumptions about this problem and develop new algorithms and scalable infrastructure to move toward that goal. A secondary concern is the interpretation of multimodal data, where clues about the veracity of content may be contained in the text surrounding an image. This project will develop Natural Language Processing capabilities to address this concern.
Image caption: Left: original image. Center: forensic analysis indicating the tampered region. Right: forensic analysis indicating a text overlay.
Repositories
pyIFD – Python-based Image Forgery Detection Toolkit
Other Sources of Funding that Support this Research
DARPA, 2016; HHS, 2018; USAID, 2018
Press
pyIFD: Python-based Image Forgery Detection Toolkit
Learn More
Disinformation Detection at Scale Overview (PDF)
Watch
Slides: Image Provenance Analysis for Disinformation Detection
See Also
Summer 2022 subproject: Understanding accuracy decay in online image retrieval systems within the context of open-set classification and unsupervised clustering
Project Poster