SEMINAR: On Building Neural Networks with Imagination Skills
14-12-2020

Speaker: Aykut Erdem 

Title: On Building Neural Networks with Imagination Skills

Date/Time: 15 December 2020 / 13:40-14:30

Zoom: Meeting ID: 954 9528 7781

Passcode: gradsem

Abstract: In the past few years, generative adversarial networks (GANs) and variational autoencoders (VAEs), two popular types of deep generative models, have matured to the point of synthesizing realistic-looking images. In this talk, I will discuss our recent efforts to leverage these models so that machines can exhibit different forms of imagination skills. First, I will describe how GANs can be exploited as an image prior within a semantic photo/video editing system. In particular, I will introduce a conditional GAN model that can generate the same scene under different conditions such as lighting, weather, season, or time of day. When integrated with a deep photo style transfer method, the combined framework can be used to seamlessly adjust high-level transient attributes of a given natural image. Second, I will present our ongoing research, in which we seek to build a video-to-video translation model that is guided solely by natural language expressions. Our proposed model can be seen as a special-purpose VAE-GAN hybrid, in which the two networks work together to perform minimal, local semantic edits, synthesizing an alternate version of the input video clip that resembles the target linguistic description.

Bio: Aykut Erdem is an Associate Professor of Computer Science at Koç University. He received his Ph.D. degree from Middle East Technical University in 2008. From 2008 to 2010, he was a post-doctoral researcher at Ca’ Foscari University of Venice, working on the EU-FP7 SIMBAD project. Previously, he was with the Computer Engineering Department at Hacettepe University, where he was one of the directors of the Computer Vision Lab. The broad goal of his research is to explore better ways to understand, interpret, and manipulate visual data. His current research focuses on learning-based approaches to image editing, visual saliency estimation, and connecting vision and language.