MU-GAN: Facial Attribute Editing Based on Multi-attention Mechanism
🌏 Source
Available at: https://arxiv.org/abs/2009.04177. Source code at: SuSir1996/MU-GAN.
Task: facial attribute editing, which has two main objectives: (1) correctly translating the input face so that it carries the desired target attributes, and (2) preserving the attribute-irrelevant details of the original image.
The Multi-attention U-Net-based Generative Adversarial Network (MU-GAN):
CNN-based GANs are good at editing local attributes and synthesizing images with few geometric constraints. However, they have difficulty with images governed by geometric or structural patterns, which is exactly the situation in facial attribute editing. It is also known that there are complex coupling relationships among facial attributes (e.g., gender and beard).
Thus, a desirable model needs the ability to decouple attributes so that the edited image meets the target labels. To solve these problems, the paper constructs a new generator with a novel encoder-decoder architecture and proposes the Multi-attention U-Net-based GAN (MU-GAN) model.
To address the detail-retention and blurry-image problems, the paper replaces the original asymmetric CNN-based encoder-decoder with a symmetric Attention U-Net architecture. Besides, instead of directly connecting the encoder to the decoder via plain skip connections, the paper presents attention U-Net connections (AUCs), which selectively transfer attribute-irrelevant representations from the encoder and concatenate them with the decoder representations to improve image quality and detail preservation.
With the attention mechanism, AUCs filter out representations related to the original attributes while preserving attribute-irrelevant details. This promotes image fidelity without weakening attribute-manipulation ability.
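Below is a minimal PyTorch sketch of such an attention-gated skip connection. The class name AttentionGate, the channel sizes, and the assumption that the decoder feature has already been upsampled to the encoder feature's spatial resolution are illustrative choices, not the authors' exact implementation.

```python
# Sketch of an attention-gated skip connection (AUC-style), assuming encoder and
# decoder features already share the same spatial size. Names are illustrative.
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, enc_channels, dec_channels, inter_channels):
        super().__init__()
        self.w_enc = nn.Conv2d(enc_channels, inter_channels, kernel_size=1)  # project encoder (skip) features
        self.w_dec = nn.Conv2d(dec_channels, inter_channels, kernel_size=1)  # project decoder (gating) features
        self.psi = nn.Sequential(
            nn.ReLU(inplace=True),
            nn.Conv2d(inter_channels, 1, kernel_size=1),
            nn.Sigmoid(),                                                    # per-pixel attention coefficients in [0, 1]
        )

    def forward(self, enc_feat, dec_feat):
        # The attention map decides which encoder details are attribute-irrelevant
        # and should be passed on to the decoder.
        attn = self.psi(self.w_enc(enc_feat) + self.w_dec(dec_feat))
        gated = enc_feat * attn                       # suppress attribute-related encoder responses
        return torch.cat([gated, dec_feat], dim=1)    # concatenate with decoder features, as in a U-Net skip


# Usage: gate 64-channel encoder features with 64-channel decoder features.
gate = AttentionGate(enc_channels=64, dec_channels=64, inter_channels=32)
fused = gate(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))   # shape: (1, 128, 32, 32)
```

The sigmoid map acts as a per-pixel mask, so encoder details that conflict with the requested attribute change can be suppressed before the concatenation.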
Self-attention mechanism: self-attention layers are incorporated alongside the AUCs so that the generator can model long-range dependencies and global structure, which purely convolutional layers with their local receptive fields handle poorly; a sketch is given below.
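A minimal sketch of a self-attention layer for 2-D feature maps in the style of the Self-Attention GAN (SAGAN); treating MU-GAN's self-attention as exactly this formulation is an assumption, and the names (query/key/value projections, gamma) follow the usual convention rather than the paper's code.

```python
# SAGAN-style self-attention over spatial positions of a feature map.
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))   # learnable residual weight, starts at 0

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)        # (b, h*w, c//8)
        k = self.key(x).flatten(2)                           # (b, c//8, h*w)
        attn = torch.softmax(q @ k, dim=-1)                  # (b, h*w, h*w) pairwise attention
        v = self.value(x).flatten(2)                          # (b, c, h*w)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)    # aggregate values over all positions
        return self.gamma * out + x                           # residual connection
```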
The discriminator consists of two sub-networks: an adversarial branch that decides real / fake, and an attribute-classification branch that predicts the attribute labels. A sketch follows.
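A hedged sketch of such a two-headed discriminator: a shared convolutional trunk feeding a real/fake head and an attribute-classification head. The shared trunk, the layer count, the 128x128 input size, and the unbounded (WGAN-style) adversarial score are illustrative assumptions, not the paper's exact configuration.

```python
# Two-headed discriminator: one score for real/fake, one logit vector for attributes.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, n_attrs, base_channels=64):
        super().__init__()
        layers, in_ch = [], 3
        for i in range(5):                                   # downsample 128x128 -> 4x4
            out_ch = base_channels * (2 ** i)
            layers += [nn.Conv2d(in_ch, out_ch, 4, stride=2, padding=1),
                       nn.LeakyReLU(0.2, inplace=True)]
            in_ch = out_ch
        self.trunk = nn.Sequential(*layers)
        self.adv_head = nn.Linear(in_ch * 4 * 4, 1)          # real/fake score
        self.cls_head = nn.Linear(in_ch * 4 * 4, n_attrs)    # one logit per attribute

    def forward(self, x):
        h = self.trunk(x).flatten(1)
        return self.adv_head(h), self.cls_head(h)


# Usage on a 128x128 RGB batch with 13 binary attributes (illustrative sizes).
d = Discriminator(n_attrs=13)
adv_score, attr_logits = d(torch.randn(2, 3, 128, 128))
```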
In MU-GAN, the generator is a symmetric encoder-decoder: the encoder extracts a latent representation of the input face, the decoder synthesizes the edited image conditioned on the target attribute vector, AUCs bridge the mirrored encoder/decoder layers, and self-attention layers complete the multi-attention design. A structural skeleton is sketched below.
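The following skeleton shows how those pieces could be wired together: a symmetric encoder/decoder, the target attribute vector broadcast and concatenated at the bottleneck (as in AttGAN-style editors), and simplified sigmoid gates standing in for the AUCs on the skip paths (the fuller attention-gate sketch above combines encoder and decoder features). Depth, channel sizes, and the bottleneck injection point are assumptions for illustration.

```python
# Structural skeleton of a symmetric attention U-Net generator for attribute editing.
import torch
import torch.nn as nn

def down(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 4, 2, 1), nn.BatchNorm2d(out_ch), nn.ReLU(True))

def up(in_ch, out_ch):
    return nn.Sequential(nn.ConvTranspose2d(in_ch, out_ch, 4, 2, 1), nn.BatchNorm2d(out_ch), nn.ReLU(True))

class Generator(nn.Module):
    def __init__(self, n_attrs):
        super().__init__()
        self.enc1, self.enc2, self.enc3 = down(3, 64), down(64, 128), down(128, 256)
        # Target attribute vector is broadcast spatially and concatenated at the bottleneck.
        self.dec3 = up(256 + n_attrs, 128)
        self.gate2 = nn.Sequential(nn.Conv2d(128, 128, 1), nn.Sigmoid())  # simplified stand-in for an AUC gate
        self.dec2 = up(128 + 128, 64)
        self.gate1 = nn.Sequential(nn.Conv2d(64, 64, 1), nn.Sigmoid())
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(64 + 64, 3, 4, 2, 1), nn.Tanh())

    def forward(self, x, attrs):
        e1 = self.enc1(x)                                    # 64 channels at 64x64 (for a 128x128 input)
        e2 = self.enc2(e1)                                   # 128 channels at 32x32
        e3 = self.enc3(e2)                                   # 256 channels at 16x16 (bottleneck)
        a = attrs.view(attrs.size(0), -1, 1, 1).expand(-1, -1, e3.size(2), e3.size(3))
        d3 = self.dec3(torch.cat([e3, a], dim=1))            # decode conditioned on target attributes
        d2 = self.dec2(torch.cat([e2 * self.gate2(d3), d3], dim=1))   # gated skip from the mirrored layer
        return self.dec1(torch.cat([e1 * self.gate1(d2), d2], dim=1))


# Usage on a 128x128 RGB batch with a 13-dimensional attribute vector (illustrative sizes).
g = Generator(n_attrs=13)
edited = g(torch.randn(2, 3, 128, 128), torch.rand(2, 13))   # shape: (2, 3, 128, 128)
```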
Loss is calculated as three parts: an adversarial loss (real / fake), an attribute-classification loss that pushes the edited image toward the target labels, and a reconstruction loss that requires the generator to recover the input when the original attributes are given.
Overall objective: a weighted sum of the three terms, split into a generator objective and a discriminator / classifier objective.
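A hedged LaTeX sketch of this objective in the usual AttGAN-style convention; the grouping of terms and the weight symbols \(\lambda_1\), \(\lambda_2\), \(\lambda_3\) are illustrative and should be checked against the paper.

```latex
% Generator: adversarial term + reconstruction term + attribute-classification term.
\min_{G}\ \mathcal{L}_{G}
  = \mathcal{L}_{adv}^{G}
  + \lambda_{1}\,\mathcal{L}_{rec}
  + \lambda_{2}\,\mathcal{L}_{cls}^{G}

% Discriminator / classifier: real-vs-fake term + attribute classification on real images.
\min_{D}\ \mathcal{L}_{D}
  = \mathcal{L}_{adv}^{D}
  + \lambda_{3}\,\mathcal{L}_{cls}^{D}
```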
Compare MU-GAN with AttGAN and STGAN.
Multi-attention mechanism: AUCs and self-attention are combined into a symmetrical U-Net-like architecture.