SEMINAR: Multimedia Signal Processing: A Multimodal Analysis Perspective

Hello everyone,


We are thrilled to announce that this Wednesday we will host Dr. Engin Erzin from Koç University, who will deliver the 6th seminar of our MATH+ Seminar Series!

The event will be held in FASS G022 at 15.00 on Wednesday.

The abstract of the talk and a short bio of our guest are below.

Looking forward to seeing you there!

SIAM Sabancı
Title: Multimedia Signal Processing: A Multimodal Analysis Perspective for Selected Human-Computer Interaction Applications

Abstract: This talk explores multimedia signal processing and multimodal analysis, where information from different sources such as audio, video, and motion is combined to create a richer understanding of human communication. We'll briefly discuss two key applications as motivation for multimedia signal processing: Throat Microphone Speech Enhancement and Prosody-Driven Head Gesture Synthesis. The first tackles the challenge of improving the quality of speech captured by skin-attached throat microphones. The second aims to generate realistic head movements that align with the emotional tone (prosody) of speech, making communication more natural. Moving beyond audio-visual processing, we'll explore how to assess engagement between humans and robots, a crucial aspect of human-robot interaction. Finally, we'll discuss Affective Video Summarization, which aims to create video summaries that attend to and capture emotional content. These applications demonstrate the potential of multimodal analysis and multimedia signal processing to enhance human-computer interaction and offer a more in-depth understanding of human behavior modeling.

Bio: Engin Erzin received his Ph.D., M.Sc., and B.Sc. degrees from Bilkent University, Ankara, Turkey, in 1995, 1992, and 1990, respectively, all in Electrical & Electronics Engineering. During 1995-1996, he was a postdoctoral fellow in the Signal Compression Laboratory, University of California, Santa Barbara. He joined Lucent Technologies in September 1996 and was with the Consumer Products Group for one year as a Member of the Technical Staff of the Global Wireless Products Group. From 1997 to 2001, he was with the Speech and Audio Technology Group of the Network Wireless Systems. Since January 2001, he has been with the Computer Engineering and Electrical & Electronics Engineering Departments of Koç University, Istanbul, Turkey. Engin Erzin is currently a member of the IEEE Speech and Language Processing Technical Committee and has previously served as Associate Editor for the IEEE Transactions on Multimedia (2018-2022) and Associate Editor of the IEEE Transactions on Audio, Speech & Language Processing (2010-2014). His research interests include speech signal processing, audio-visual signal processing, human-computer interaction, and pattern recognition.

FENS Dean's Office

Orta Mahalle, 34956 Tuzla, İstanbul, Türkiye

+90 216 483 96 00

© Sabancı University 2023