Red Hat Research Quarterly

An open source tool to fight visual disinformation

Jason Schlessman

Jason Schlessman is a Principal Software Engineer at Red Hat Research, focused on novel artificial intelligence and machine learning innovations that lead to pragmatic and feasible solutions. He especially targets projects that serve the well-being of humanity, fostering ethical uses of technology. He can be found online @ EldritchJS.

Related Projects

Disinformation Detection at Scale

Article featured in

Red Hat Research Quarterly

May 2022

Red Hat Research is participating in an initiative on image disinformation detection, that is, identifying false information in images that are intended to mislead. The project began in January 2021 in response to the need for more mature tooling in the fairly nascent field of image forensics and analysis using statistical and classical machine learning methods.

The project is a collaboration between Red Hat Research and the University of Notre Dame (UND), led by Jason Schlessman at Red Hat and Professor Walter Scheirer, a member of UND’s Computer Vision Research Lab. The project has produced an open source Python toolkit and is now assessing performance results using a ground-truth dataset.

The ubiquity of disinformation

Methods for detecting altered images are of particular interest for many reasons. We live in an image-driven world: the apps we use, the sites we browse, and, perhaps most importantly, the social media we engage with are fundamentally image-centric. This is not limited to static individual images; videos are sequences of images that have their own impact. Each of these sources serves myriad new images that we then ingest and process. They have the power to leave a lasting impression on the viewer, which can then propagate rapidly among other individuals due to our society’s vast connectedness.

While this level of connectedness and data richness is of huge value to society, we cannot deny the potential for adversarial actors to take advantage of these influential information sources. Given the pervasiveness of image and video data, detecting the operations of these actors is a problem whose pursuit offers benefits beyond technological progress, with legal, social, and political impact. A search for “disinformation” in a search engine or on a news site will find instances of convincingly altered images spreading as memes that have the potential to disrupt sociopolitical sentiment, from local elections to the Russian invasion of Ukraine.

We enjoyed a brief period when convincing alterations to images were possible only for those with digital art expertise. However, the same technological advances that make convincing image manipulation possible also put it within reach of those without domain-specific expertise. We live in a deepfake world that brings these methods to the masses.

Finding a scalable solution

In the absence of reliable methods and tooling for detecting fakes, recognizing these images has required explicit one-off identification methods. As technologists, we believe it is imperative to help find efficient means of determining the provenance of images and thus the information they provide to the citizen internet user, as well as the technology executive, the journalist, or the data scientist.

When the Red Hat/UND collaborative effort began, the primary tool available for research beyond deep-learning approaches was a software library that was academic in nature. Meanwhile, Professor Scheirer and other researchers needed software development and repeatable workflows approaching the enterprise level that Red Hat provides. The collaboration’s goals include providing image data security for all and acquiring input from both industry and academia; therefore, the work coming from this collaboration is open source and freely available.

pyIFD, a Python package providing an image manipulation detection toolkit, was released publicly in August 2021. Since then, efforts have been made to assess the algorithmic performance of the methods provided in pyIFD with respect to deep-learning approaches, using a ground-truth image dataset. Results from these efforts will be published once complete. Work is also underway on the real-time performance potential of these methods. We wish to determine whether these methods could be deployed in a live system for immediate detection, aimed at all internet users. For example, could my phone give a warning for a suspect image, or must the detection occur on the server? If we do server-side analysis, given the number of images posted online daily, are the methods scalable?
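One rough way to frame the server-side scalability question is a back-of-the-envelope capacity estimate. The sketch below is illustrative only: the function name `workers_needed`, its parameters, and the numbers in the usage lines are hypothetical assumptions, not measurements from pyIFD or from this project.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def workers_needed(images_per_day: int,
                   seconds_per_image: float,
                   utilization: float = 0.7) -> int:
    """Estimate how many parallel workers keep pace with a daily image volume.

    All inputs are assumptions supplied by the caller:
    - images_per_day: images that must be analyzed each day
    - seconds_per_image: average analysis time for one image on one worker
    - utilization: fraction of each worker's day spent on analysis (0 < u <= 1)
    """
    if images_per_day < 0:
        raise ValueError("images_per_day must be non-negative")
    if seconds_per_image <= 0:
        raise ValueError("seconds_per_image must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")

    # Total compute time demanded per day, in worker-seconds.
    busy_seconds = images_per_day * seconds_per_image
    # Usable seconds one worker contributes per day.
    usable_seconds = SECONDS_PER_DAY * utilization
    return math.ceil(busy_seconds / usable_seconds)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the call shape.
    print(workers_needed(images_per_day=10_000_000, seconds_per_image=0.5))
```

An estimate like this only bounds steady-state throughput. It says nothing about burst traffic, per-image latency on a phone, or detection accuracy, which are the questions the ongoing work aims to answer.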

Beyond this work, a member of Professor Scheirer’s lab, William Theisen, will conduct research at Red Hat as a summer 2022 intern. He will explore using multimodal detection methods for combating online disinformation at scale. This work would specifically target images that contain text (e.g., memes). Does this additional information lead to stronger detectors? Can their performance be achieved using off-the-shelf models on real data? Can this work keep up with the speed of information online? Stay tuned to find out!
