Disinformation Detection at Scale

The growing prevalence of fake and manipulated visual media on the Internet has created social and technical problems that demand solutions. From political memes meant to influence elections to dishonest scientists fabricating experimental results, the landscape of fake media is diverse and difficult to identify by automated means. There is a crucial need for scalable algorithms that can process upwards of a billion images per day for signs of tampering or synthetic content. This project will reassess the assumptions commonly made about this problem and develop new algorithms and scalable infrastructure to move towards that goal. A secondary concern is the interpretation of multimodal data, where clues about the veracity of content may be contained in the text surrounding an image. This project will develop Natural Language Processing capabilities to make use of those clues.
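To give a sense of what a billion images per day means in practice, the following back-of-the-envelope sketch converts the daily volume into a sustained per-second rate and a rough worker count. The function names and the 0.5 seconds-per-image cost are illustrative assumptions, not figures from this project.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def required_throughput(images_per_day: int) -> float:
    """Average images per second needed to keep up with the daily volume."""
    return images_per_day / SECONDS_PER_DAY


def workers_needed(images_per_day: int, seconds_per_image: float) -> int:
    """Minimum number of parallel workers.

    Assumes each worker handles one image at a time at a fixed cost and
    that load is spread perfectly evenly (both are simplifications).
    """
    return math.ceil(required_throughput(images_per_day) * seconds_per_image)


if __name__ == "__main__":
    daily_volume = 1_000_000_000  # "a billion images per day" from the text
    assumed_cost = 0.5            # hypothetical seconds of analysis per image

    print(f"{required_throughput(daily_volume):,.0f} images/second")
    print(f"{workers_needed(daily_volume, assumed_cost):,} workers")
```

Under these assumptions the sustained rate is roughly 11,600 images per second, so even a modest per-image cost translates into thousands of parallel workers.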

Image caption: Left: Original image. Center: Forensic analysis indicating tampered region. Right: Forensic analysis indicating text overlay.


Repositories

pyIFD – Python-based Image Forgery Detection Toolkit


Other Sources of Funding that Support this Research

DARPA, 2016; HHS, 2018; USAID, 2018


Press

pyIFD: Python-based Image Forgery Detection Toolkit


Learn More

Disinformation Detection at Scale Overview (PDF)


Watch

July 21, 2022 Research Days talk: Image Provenance Analysis for Disinformation Detection

Slides: Image Provenance Analysis for Disinformation Detection


See Also

Summer 2022 subproject: Understanding accuracy decay in online image retrieval systems within the context of open-set classification and unsupervised clustering


Project Poster

Link to full size poster