Detecting Deepfakes: MIT CSAIL Model Identifies Manipulations Using Local Artifacts

When celebrity porn and other deepfake videos went viral several years ago, they caught the world largely unprepared: few could believe how convincingly AI had generated the fake images. Since then, numerous breakthroughs in image synthesis algorithms and in face synthesis and face-swapping technologies enabled by generative adversarial networks (GANs) have made deepfakes even more believable. Governmental and other bodies, meanwhile, have been scrambling to catch up, looking for ways to counter the malicious spread of deepfakes that are now so realistic they can be difficult, if not impossible, for the human eye to detect.

A team of researchers from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) has proposed a new model designed to spot deepfakes by examining subtle visual artifacts, such as textures in hair, backgrounds, and faces, and by visualizing the image regions where it has detected manipulations.


The team noted that as state-of-the-art (SOTA) image synthesis techniques continue to advance, it is critical that fake image detection methods keep pace so that deepfakes created with new methods can be identified efficiently and robustly. It is therefore important to understand which artifacts fake image detectors rely on if they are to remain effective against continually evolving synthesis algorithms.

The researchers focused on exploring fake image detectors’ ability to generalize to unseen fake images, and on identifying which properties of fake images generalize across different model architectures, data, and variations in training.

The team adopted a patch-based classifier with limited receptive fields to visualize the regions of fake images that are most easily detectable. They first preprocessed the classifier’s training images to reduce formatting differences between the real and fake input images, then trained the classifier in a fully convolutional manner, truncating a standard deep network architecture after various intermediate layers. Because the truncated model is fully convolutional and has a limited receptive field, it focuses on textures in small patches, which offers a natural way to visualize patterns that are indicative of real or fake images.
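To make the link between truncation depth and patch size concrete, the short sketch below computes how large a region of the input each output location can see after each layer of a convolutional stack. It uses the standard receptive field recurrence for layers without dilation. The layer list is hypothetical and chosen only for illustration; it is not the architecture the researchers used.

```python
def receptive_fields(layers):
    """Return the receptive field size (in input pixels) after each layer.

    `layers` is a list of (kernel_size, stride) pairs, one per conv or
    pooling layer. Dilation is assumed to be 1 throughout.
    """
    sizes = []
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance in input pixels between adjacent outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the view by (k - 1) jumps
        jump *= stride             # striding spreads later outputs further apart
        sizes.append(rf)
    return sizes


# Hypothetical stack of (kernel_size, stride) pairs, for illustration only.
HYPOTHETICAL_LAYERS = [(3, 2), (3, 1), (3, 1), (3, 2), (3, 1)]

for depth, size in enumerate(receptive_fields(HYPOTHETICAL_LAYERS), start=1):
    print(f"truncate after layer {depth}: each output sees a {size}x{size} patch")
```

Truncating earlier yields smaller patches, so each prediction has to rest on local texture rather than the global structure of the face.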


The researchers tested their approach on a suite of synthetic face datasets to determine which artifacts classifiers learn in order to detect fake images generated by different models. They discovered that more complex patches, such as hair and background areas, tend to be more detectable across different synthetic image sources. Moreover, even when a deepfake image generator is adversarially fine-tuned against a fake image classifier, it can still leave detectable artifacts in specific image patches where telltale mistakes can be spotted.
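As a rough illustration of how patch-wise predictions can point to the most detectable regions, the sketch below takes a grid of per-patch fake probabilities, ranks the patches, and averages them into a single image-level score. The grid values are made up, the function names are hypothetical, and simple averaging is an assumption rather than a confirmed detail of the researchers' method.

```python
from statistics import mean


def image_score(patch_scores):
    """Average per-patch fake probabilities into one image-level score."""
    return mean(score for row in patch_scores for score in row)


def top_patches(patch_scores, k=3):
    """Return the k highest-scoring patches as (row, col, score) tuples."""
    flat = [
        (r, c, score)
        for r, row in enumerate(patch_scores)
        for c, score in enumerate(row)
    ]
    return sorted(flat, key=lambda item: item[2], reverse=True)[:k]


# Made-up 3x4 grid of per-patch fake probabilities (illustrative values only).
scores = [
    [0.91, 0.88, 0.35, 0.40],
    [0.30, 0.22, 0.25, 0.79],
    [0.28, 0.20, 0.31, 0.83],
]

print("image-level score:", round(image_score(scores), 3))
for row, col, score in top_patches(scores):
    print(f"patch ({row}, {col}) looks most suspicious: {score:.2f}")
```

In a real pipeline, the grid would come from the classifier's spatial output map, and the top-ranked cells would be mapped back to pixel regions such as hair or background.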

