Seminar by Alasdair Newson - Autoencoders and Generative Adversarial Networks for Image Understanding and Editing

Abstract

Deep generative models are neural networks which can produce random examples of very high-dimensional and complicated data, such as images. They include, in particular, Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs). The core idea of these models is to learn a latent representation of the data, synthesise the data in this latent space, and then project back to the data space. In general, the latent space is designed to be more compact than the data space, which makes the representation more powerful. When the projection from the data space to the latent space is also constructed, the network is called an autoencoder. Finally, generative models can also be exploited to achieve complex modifications of the output data (e.g. editing).

In this presentation, I will discuss three topics concerning generative models in the case of image data. Firstly, I will look at exactly how an autoencoder can learn an optimal latent space in the case of simple images of centered, binary disks. The goal of this work is to understand the inner workings of autoencoders in the simplest situation possible. Secondly, I will discuss how an autoencoder can be created to imitate Principal Component Analysis (PCA). The associated architecture and loss function organise the latent space into independent axes which represent different attributes of the data (for example shape or rotation). These attributes are learned in a completely unsupervised manner. Finally, I will present a method which uses a pre-trained GAN to achieve high-level editing of facial images with labelled attributes. For example, this allows us to modify the hairstyle or smile of a person's face. This approach is completely generic and could therefore be applied to any type of data, as long as a pre-trained GAN is available.

Alasdair Newson completed his PhD in image and video processing in March 2014 under the supervision of Andres Almansa, Yann Gousseau and Patrick Perez, with Technicolor and Telecom ParisTech. His research interests include image and video inpainting and restoration, statistical methods in image processing, film restoration, variational methods, and motion estimation. He spent one year as a postdoctoral researcher with the team of Guillermo Sapiro, followed by one year at Paris Descartes (Paris, France) with Julie Delon and Bruno Galerne. He is currently an associate professor (Maître de Conférences) at Télécom Paris, where he works on deep learning for image processing.

Date
May 24, 2023 2:00 PM — 4:00 PM
Location
Computer Science dept., University of Turin
Via Pessinetto, 12, Torino