
EMOCA: Emotion Driven Monocular Face Capture and Animation

Original Authors: Radek Daněček · Michael J. Black · Timo Bolkart

CVPR 2022

This repository is a fork of EMOCA, which is the official implementation of the CVPR 2022 paper EMOCA: Emotion Driven Monocular Face Capture and Animation.

This version focuses on EMOCA V2 support and makes changes that simplify installation and improve usability for Sign Language Translation. New features will be noted here as they are added.

Top row: input images. Middle row: coarse shape reconstruction. Bottom row: reconstruction with detailed displacements.



EMOCA takes a single in-the-wild image as input and reconstructs a 3D face with sufficient facial expression detail to convey the emotional state of the input image. EMOCA advances the state of the art in in-the-wild monocular face reconstruction, with an emphasis on accurately capturing emotional content. The official project page is here.

EMOCA project

The training and testing script for EMOCA can be found in the EMOCA subfolder.

Installation

Note: recommended usage is with the Docker image, hosted on the University's container registry at https://container-registry.surrey.ac.uk/shared-containers/emoca-docker. The repo that builds the image is separate for CI purposes, but can be found here. A Dockerfile for the original EMOCA can also be found in that repo.
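For reference, a minimal pull-and-run sketch might look like the following; the image tag and mount paths are assumptions, so check the registry and the image's own documentation for the actual usage:

# Pull the prebuilt image from the University's registry (the "latest" tag is an assumption)
docker pull container-registry.surrey.ac.uk/shared-containers/emoca-docker:latest
# Run it interactively with GPU access and a mounted data directory (paths are placeholders)
docker run --gpus all -it -v /path/to/data:/data container-registry.surrey.ac.uk/shared-containers/emoca-docker:latest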

  1. Install mamba (via Miniforge)
# Download the Miniforge installer (provides conda and mamba)
wget -O Miniforge3.sh "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
# Install to a prefix of your choice (here $HOME/conda) and enable conda/mamba in the current shell
bash Miniforge3.sh -b -p "$HOME/conda"
source "$HOME/conda/etc/profile.d/conda.sh"
source "$HOME/conda/etc/profile.d/mamba.sh"
conda init && mamba init
conda activate
  2. Clone this repo
git clone https://gitlab.surrey.ac.uk/as03095/emoca.git
cd emoca
  3. (TODO) The rest of this guide is still being written; a tentative environment-setup sketch is given below.
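As a rough sketch of the remaining setup, assuming this fork follows the upstream EMOCA conventions (a conda/mamba environment file at the repo root and an editable install of the gdl package; the file name below is carried over from upstream EMOCA and is an assumption for this fork):

# Create and activate the environment used in the Usage section below
# (the environment file name is an assumption based on upstream EMOCA)
mamba env create -n emoca -f conda-environment_py38_cu11_ubuntu.yml
conda activate emoca
# Install the gdl package in editable mode (assumes a setup.py at the repo root, as in upstream EMOCA)
pip install -e .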

For Training

Additional data is required for training. This includes:

  1. DECA training data
  2. AffectNet training data
  3. Basel Face Model converted to FLAME texture space, using BFM_to_FLAME

Usage

  1. Activate the environment:
conda activate emoca
  2. For running EMOCA examples, see the EMOCA subfolder (a minimal command sketch is given below).

  3. Emotion Recognition is not a priority for this fork, but the original implementation can be found here.
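For a concrete starting point, a minimal reconstruction run might look like the sketch below; the demo script name, arguments, and model name are assumptions carried over from the upstream EMOCA repo and may differ in this fork:

# Run single-image reconstruction on a folder of images
# (script path, flags, and model name are assumptions based on upstream EMOCA)
cd gdl_apps/EMOCA
python demos/test_emoca_on_images.py --input_folder /path/to/images --output_folder /path/to/output --model_name EMOCA_v2_lr_mse_20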

Structure

This repo has two subpackages: gdl and gdl_apps.

GDL

gdl is a library of research code; some of it is reasonably organized, some of it less so. It includes, but is not limited to, the following:

  • models contains (larger) PyTorch-based deep learning modules
  • layers contains individual deep learning layers
  • datasets contains base classes and their implementations for the various datasets used in this work, mostly image-based datasets with various forms of ground truth (if any)
  • utils contains various tools

The repo is heavily based on PyTorch and PyTorch Lightning.

GDL_APPS

gdl_apps contains prototypes that use the gdl library, including scripts to train, evaluate, test, and analyze models from gdl and/or data for various tasks.

Look for individual READMEs in each sub-project.

Current projects:

  • EMOCA
  • Emotion Recognition (not a priority for this fork; see Usage above)

Citation

The rest of this README is from the original EMOCA repo.

If you use this work in your publication, please cite the following publications:

@inproceedings{EMOCA:CVPR:2022,
  title = {{EMOCA}: {E}motion Driven Monocular Face Capture and Animation},
  author = {Danecek, Radek and Black, Michael J. and Bolkart, Timo},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages = {},
  year = {2022}
}

As EMOCA builds on top of DECA and uses parts of DECA as a fixed part of the model, please further cite:

@article{DECA:Siggraph2021,
  title={Learning an Animatable Detailed {3D} Face Model from In-The-Wild Images},
  author={Feng, Yao and Feng, Haiwen and Black, Michael J. and Bolkart, Timo},
  journal = {ACM Transactions on Graphics (ToG), Proc. SIGGRAPH},
  volume = {40}, 
  number = {8}, 
  year = {2021}, 
  url = {https://doi.org/10.1145/3450626.3459936} 
}

Furthermore, if you use EMOCA v2, please also cite SPECTRE:

@article{filntisis2022visual,
  title = {Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos},
  author = {Filntisis, Panagiotis P. and Retsinas, George and Paraperas-Papantoniou, Foivos and Katsamanis, Athanasios and Roussos, Anastasios and Maragos, Petros},
  journal = {arXiv preprint arXiv:2207.11094},
  publisher = {arXiv},
  year = {2022},
}

License

This code and model are available for non-commercial scientific research purposes as defined in the LICENSE file. By downloading and using the code and model you agree to the terms of this license.

Acknowledgements

Many people deserve credit. These include, but are not limited to: Yao Feng and Haiwen Feng for their original implementation of DECA, and Antoine Toisoul and colleagues for EmoNet.