StyleFlow: Attribute-Conditioned Exploration of StyleGAN-Generated Images Using Conditional Continuous Normalizing Flows (ACM TOG 2021)
See you @ SIGGRAPH 2021
1KAUST
2UCL, Adobe Research

Abstract

High-quality, diverse, and photorealistic images can now be generated by unconditional GANs (e.g., StyleGAN). However, limited options exist to control the generation process using (semantic) attributes while still preserving the quality of the output. Further, due to the entangled nature of the GAN latent space, performing edits along one attribute can easily result in unwanted changes along other attributes. In this paper, in the context of conditional exploration of entangled latent spaces, we investigate the two sub-problems of attribute-conditioned sampling and attribute-controlled editing. We present StyleFlow as a simple, effective, and robust solution to both sub-problems by formulating conditional exploration as an instance of conditional continuous normalizing flows in the GAN latent space conditioned on attribute features. We evaluate our method using the face and car latent spaces of StyleGAN, and demonstrate fine-grained disentangled edits along various attributes on both real photographs and StyleGAN-generated images. For example, for faces we vary camera pose, illumination, expression, facial hair, gender, and age. Finally, via extensive qualitative and quantitative comparisons, we demonstrate the superiority of StyleFlow over other concurrent works.

Overview Video

Method

Attribute-conditioned editing using StyleFlow. Starting from a source image, we support attribute-conditioned editing by using a reverse inference followed by a forward inference through a sequence of CNF blocks. Here, z denotes the variable of the prior distribution and w denotes the intermediate latent vector of StyleGAN.
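The reverse-then-forward inference above can be sketched with a toy conditional flow. This is a minimal illustration only: the velocity field below is a hypothetical placeholder (the real StyleFlow uses learned conditional CNF blocks over StyleGAN's 512-d w space, integrated with an adaptive ODE solver rather than fixed-step Euler).

```python
import numpy as np

DIM = 4  # toy latent dimension; StyleGAN's w is 512-d


def velocity(z, t, attr):
    # Hypothetical conditional velocity field f(z, t; attr).
    # In StyleFlow this is a neural network conditioned on attributes.
    return np.tanh(z + attr) * (1.0 - t)


def integrate(z, attr, t0, t1, steps=100):
    # Fixed-step Euler integration of dz/dt = f(z, t; attr).
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        z = z + dt * velocity(z, t, attr)
        t += dt
    return z


def edit(w, attr_src, attr_tgt):
    # Reverse inference: map w back to the prior variable z under the
    # source attributes; then forward inference under target attributes.
    z = integrate(w, attr_src, t0=1.0, t1=0.0)      # reverse pass
    return integrate(z, attr_tgt, t0=0.0, t1=1.0)   # forward pass


w = np.array([0.3, -0.5, 0.8, 0.1])
attr = np.zeros(DIM)
# Editing with unchanged attributes approximately recovers w
# (up to Euler discretization error).
w_same = edit(w, attr, attr)
```

Changing `attr_tgt` while keeping `attr_src` fixed moves the latent only along the conditioned attribute directions, which is what makes the edits disentangled.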

Sampling

Attribute-conditioned sampling using StyleFlow. Here we show sampling results for attribute specifications of females with glasses in a target pose (top); 50-year-old males with facial hair (middle); and smiling 5-year-old children in a target pose (bottom).
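The sampling sub-problem amounts to drawing z from the prior and pushing it forward through the conditional flow to obtain a latent w consistent with the target attributes. A toy sketch, again with a placeholder velocity field and an illustrative attribute code (not the trained model):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4  # toy latent dimension; StyleGAN's w is 512-d


def velocity(z, t, attr):
    # Placeholder conditional velocity field (not the learned network).
    return np.tanh(z + attr) * (1.0 - t)


def sample_w(attr, steps=100):
    z = rng.standard_normal(DIM)        # z ~ N(0, I), the prior
    dt = 1.0 / steps
    t = 0.0
    for _ in range(steps):              # forward Euler, t: 0 -> 1
        z = z + dt * velocity(z, t, attr)
        t += dt
    return z


# A hypothetical attribute code (e.g., "smiling, young"); in StyleFlow
# the conditioning vector holds the target attribute specification.
attr = np.array([1.0, 0.0, -1.0, 0.0])
w = sample_w(attr)
```

Each draw of z yields a different sample, while the shared attribute conditioning keeps all samples consistent with the specification, as in the figure above.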

Sequential Edits

Sequential edits using StyleFlow, with '+'/'-' denoting that the corresponding attribute was increased or decreased.

Real Image Edits

Real image non-sequential edits using our StyleFlow framework. Note that the method handles extreme poses (first and second rows), asymmetrical expressions (fourth row), and age diversity (first and last rows) well compared to concurrent methods.

Bibtex

@article{10.1145/3447648,
author = {Abdal, Rameen and Zhu, Peihao and Mitra, Niloy J. and Wonka, Peter},
title = {StyleFlow: Attribute-Conditioned Exploration of StyleGAN-Generated Images Using Conditional Continuous Normalizing Flows},
year = {2021},
issue_date = {May 2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {40},
number = {3},
issn = {0730-0301},
url = {https://doi.org/10.1145/3447648},
doi = {10.1145/3447648},
abstract = {High-quality, diverse, and photorealistic images can now be generated by unconditional GANs (e.g., StyleGAN). However, limited options exist to control the generation process using (semantic) attributes while still preserving the quality of the output. Further, due to the entangled nature of the GAN latent space, performing edits along one attribute can easily result in unwanted changes along other attributes. In this article, in the context of conditional exploration of entangled latent spaces, we investigate the two sub-problems of attribute-conditioned sampling and attribute-controlled editing. We present StyleFlow as a simple, effective, and robust solution to both the sub-problems by formulating conditional exploration as an instance of conditional continuous normalizing flows in the GAN latent space conditioned by attribute features. We evaluate our method using the face and the car latent space of StyleGAN, and demonstrate fine-grained disentangled edits along various attributes on both real photographs and StyleGAN generated images. For example, for faces, we vary camera pose, illumination variation, expression, facial hair, gender, and age. Finally, via extensive qualitative and quantitative comparisons, we demonstrate the superiority of StyleFlow over prior and several concurrent works. Project Page and Video: https://rameenabdal.github.io/StyleFlow.},
journal = {ACM Trans. Graph.},
month = may,
articleno = {21},
numpages = {21},
keywords = {image editing, generative adversarial networks}
}

Acknowledgements

This work was supported by the KAUST Office of Sponsored Research (OSR) and Adobe Research.

Related Work