The Deepfake Detection Challenge was about identifying media manipulated with Deepfake techniques. The main goal is to build a binary classification model that predicts whether an input video is fake or not.
The data is provided as 4 datasets:
- Training Set: the dataset containing labels for the target. The full training set is just over 470 GB and is broken up into 50 groups of ~10 GB each.
- Public Validation Set: a small dataset of 400 videos. This set is used when you commit your Kaggle notebook. It is accessible on the same page as the Training Set.
- Public Test Set: completely withheld; it is what Kaggle’s platform computes the public leaderboard against.
- Private Test Set: privately held outside of Kaggle’s platform and used to compute the private leaderboard.
The solution can be described in several steps.
- Extract 17 faces from each video with the MTCNN face detector (see the sketch after this list)
- Save each face to disk
- Save the corresponding metadata (which face comes from which video)
- Split the training dataset into train and validation parts
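A minimal sketch of the extraction step is below. It assumes the `mtcnn` pip package for detection and OpenCV for reading frames; the frame-sampling strategy, file naming, and the `extract_faces` helper are illustrative, not the original code.

```python
# Sketch: sample 17 frames per video, detect one face per frame with MTCNN,
# save the crops to disk and record which crop came from which video.
import csv
import os

import cv2
from mtcnn import MTCNN

N_FACES = 17
detector = MTCNN()

def extract_faces(video_path, out_dir, metadata_writer):
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Evenly spaced frame indices across the whole video (assumption).
    frame_ids = [int(i * total / N_FACES) for i in range(N_FACES)]
    video_name = os.path.splitext(os.path.basename(video_path))[0]

    for i, frame_id in enumerate(frame_ids):
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame_id)
        ok, frame = cap.read()
        if not ok:
            continue
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        detections = detector.detect_faces(rgb)
        if not detections:
            continue
        # Keep the most confident detection in the frame.
        x, y, w, h = max(detections, key=lambda d: d["confidence"])["box"]
        face = rgb[max(y, 0):y + h, max(x, 0):x + w]
        face_file = f"{video_name}_{i:02d}.png"
        cv2.imwrite(os.path.join(out_dir, face_file),
                    cv2.cvtColor(face, cv2.COLOR_RGB2BGR))
        # Metadata row: which face crop belongs to which source video.
        metadata_writer.writerow([face_file, os.path.basename(video_path)])
    cap.release()
```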
Train a binary classifier based on the Xception CNN, as sketched below.
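A rough training sketch, assuming tf.keras with an ImageNet-pretrained Xception backbone and a sigmoid output; the directory layout (`faces/train`, `faces/val` with real/fake subfolders), input size, and hyperparameters are assumptions:

```python
# Sketch: binary real/fake classifier on top of an ImageNet-pretrained Xception.
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SIZE = (299, 299)  # Xception's native input resolution

def build_model():
    # Pretrained backbone with global average pooling and a single sigmoid unit.
    base = tf.keras.applications.Xception(
        include_top=False, weights="imagenet",
        input_shape=(*IMG_SIZE, 3), pooling="avg")
    out = tf.keras.layers.Dense(1, activation="sigmoid")(base.output)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Face crops are assumed to be sorted into real/ and fake/ subfolders.
gen = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.xception.preprocess_input)
train_flow = gen.flow_from_directory(
    "faces/train", target_size=IMG_SIZE, batch_size=32, class_mode="binary")
val_flow = gen.flow_from_directory(
    "faces/val", target_size=IMG_SIZE, batch_size=32, class_mode="binary")

model = build_model()
model.fit(train_flow, validation_data=val_flow, epochs=5)
```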
Inference follows the same per-face approach, using the pretrained model from the previous step.
- Extract 17 faces from each video
- Predict a label in [0, 1] for each face separately with the pretrained model
- Use the mean of the 17 predictions as the video-level prediction of whether the input video is fake (see the sketch after this list)
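A minimal inference sketch under the same assumptions; the model path is illustrative, and the face crops are expected in the same format as during training:

```python
# Sketch: per-video prediction by averaging the model's per-face probabilities.
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("xception_deepfake.h5")  # illustrative path

def predict_video(faces):
    """faces: array of shape (n_faces, 299, 299, 3), RGB, uint8 (up to 17 crops)."""
    x = tf.keras.applications.xception.preprocess_input(faces.astype("float32"))
    per_face = model.predict(x).ravel()   # one probability in [0, 1] per face
    return float(per_face.mean())         # mean over the faces = video-level score
```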
Planned improvements:
- Migrate from Keras Sequential to tf.data.Dataset (currently NVIDIA DALI cannot work with TF distribution strategies, so we need to wait for the next release); a possible pipeline is sketched after this list
- Find a working TensorFlow implementation of MTCNN and make it work with TF 2.0
- Clean up the data preparation and evaluation stages
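For the planned migration, a `tf.data.Dataset` pipeline over the saved face crops could look roughly like the sketch below; the file pattern and the label-from-path convention are assumptions:

```python
# Sketch of a tf.data input pipeline over the saved face crops.
import tensorflow as tf

IMG_SIZE = (299, 299)

def load_face(path):
    img = tf.io.read_file(path)
    img = tf.image.decode_png(img, channels=3)
    img = tf.image.resize(img, IMG_SIZE)
    img = tf.keras.applications.xception.preprocess_input(img)
    # Assume crops are stored under .../fake/... and .../real/... subdirectories.
    label = tf.cast(tf.strings.regex_full_match(path, ".*/fake/.*"), tf.float32)
    return img, label

dataset = (tf.data.Dataset.list_files("faces/train/*/*.png")
           .map(load_face, num_parallel_calls=tf.data.experimental.AUTOTUNE)
           .batch(32)
           .prefetch(tf.data.experimental.AUTOTUNE))
```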