Day 1 / May 17th

Link to the Zoom videoconference:

2pm – 2.30pm: Welcome and Summary of Multimedia Team research on Computer Vision, Deep Learning and Compression (M. Cagnazzo, S. Lathuilière, Telecom-Paris)
2.30pm – 3pm: Applications of graphical models and deep models (Chaohui Wang, Maître de Conférences at Université Gustave Eiffel)
3pm – 4pm: Q&A, Discussion, break
4pm – 4.30pm: Optimization Problems on Graphs (Mireille El Gheche, Senior research scientist at Sony AI, Zurich)
Q&A, Discussion, closing notes

Day 2 / May 18th

Link to the Zoom videoconference:

2pm – 2.30pm: Deep Statistical Image Modeling (Siavash Bigdeli, Senior data scientist at CSEM)
2.30pm – 3.30pm: Q&A, Discussion, break
3.30pm – 4pm: Regularization and deep neural networks for computer vision: EnDing biased feature propagation, enhancing compression and beyond (Enzo Tartaglione, Post-doc at University of Turin)
4pm – 5pm: Q&A, Discussion, closing notes

Talk summaries and speaker bios

Applications of graphical models and deep models

ABSTRACT: In this talk, I will present five works based on graphical models and deep models, both of which are important modeling tools for computer vision. Regarding graphical models, I will show two applications: (1) a hierarchical model of object appearance that uses a single graphical model to exploit shared information across multiple quantization levels, so as to improve object tracking performance; (2) a high-order graphical model for jointly inferring multiple 3D objects and the indoor scene layout from a single RGB-D image captured with a Kinect camera. Regarding deep models, I will present two works on occlusion boundary and oriented occlusion boundary estimation, motivated by the fact that occlusion boundaries contain rich perceptual information about the underlying scene structure and are one of the main obstacles to scene understanding. Last but not least, I will present one work on re-rendering new images of an object of interest from a single image of it, by specifying multiple scene properties (such as viewpoint, illumination, expression, etc.).

BIO: Chaohui Wang has been Maître de Conférences at Université Gustave Eiffel since 09/2014 and is a researcher at the LIGM Laboratory (UMR 8049), Université Gustave Eiffel, CNRS, ESIEE Paris, Ecole des Ponts, France. He received his PhD degree (thesis titled “Distributed and Higher-Order Graphical Models: Towards Segmentation, Tracking, Matching and 3D Model Inference”) from École Centrale Paris, France, in 2011, under the supervision of Prof. Nikos Paragios. After that, he did a postdoc with Prof. Stefano Soatto at the University of California, Los Angeles, USA, and another postdoc with Prof. Michael J. Black at the Max Planck Institute for Intelligent Systems, Germany. His research interests include computer vision, machine learning, and related problems. To date, he has published more than 30 papers (h-index: 20) and one book chapter, and holds three US patents. His work has won several research awards. He became an IEEE Senior Member in 2020.

Optimization Problems on Graphs

ABSTRACT: In many network-based applications, high-dimensional data naturally reside on the vertices of weighted graphs. Graph signal processing merges algebraic and spectral graph theoretic concepts with computational harmonic analysis to process such signals on graphs. In this presentation, we outline the main challenges of the area and highlight the importance of incorporating the irregular structures of graph data domains when processing signals on graphs. We then detail two novel approaches to two important problems. First, we will present a recent framework based on optimal transport for the graph alignment problem, which derives a simple yet novel and powerful distance between graphs, based on the Wasserstein distance between the distributions of the random Gaussian models associated with the two graphs being compared. Second, we will show a graph-based depth refinement framework that introduces a novel regularizer which explicitly promotes the reconstruction of piece-wise planar scenes but, thanks to the underlying graph, is flexible enough to handle scenes that are not fully piece-wise planar as well.

BIO: Mireille El Gheche received the Master's degree in Radio-communication from Centrale Supélec in 2010 and the Ph.D. degree in signal and image processing from Université Gustave Eiffel in 2014. From Jan. 2015 until Aug. 2017, she was a Postdoctoral Researcher at the IMS and IMB laboratories, Université de Bordeaux, where she worked on optimization methods for image super-resolution, denoising, and reconstruction. From Nov. 2017 until Jan. 2021, she was a Postdoctoral Researcher at École Polytechnique Fédérale de Lausanne (Switzerland), where she worked on computational problems in graph theory using machine learning approaches. Since Feb. 2021, she has been a Senior research scientist at Sony AI, Zurich.

Deep Statistical Image Modeling

ABSTRACT: My talk will focus on our key results for building efficient statistical image models using deep neural networks. In the first part, I will give an overview of our work on learning image densities using denoising autoencoders and show how they can be employed in image enhancement and generation problems. In the second part, I will present our approach to learning a generative model that maps a compact representation to high-dimensional textures with infinite resolution.

BIO: Siavash Bigdeli is a senior data scientist at CSEM. He received a Ph.D. in Computer Science from the University of Bern in 2018. Prior to joining CSEM in 2019, he was a postdoctoral fellow at EPFL. His interests are Computer Vision, Deep Learning and, in particular, Statistical Signal Processing.

Regularization and deep neural networks for computer vision: EnDing biased feature propagation, enhancing compression and beyond.

ABSTRACT: Artificial neural networks achieve state-of-the-art performance in an ever-growing number of tasks, and nowadays they are used to solve a very large variety of problems, especially in computer vision. There are, however, open challenges, such as the presence of biases in the training data, which calls into question the generalization capability of these models, or the models’ size, which calls into question their deployability on mobile/embedded devices.
Many recent approaches aim to solve these issues. Some of them rely on the design of a regularization objective function, which is minimized at training time alongside the standard loss function and puts extra constraints on the learning problem. One such regularizer, EnD, can be used for bias disentangling in image classification tasks without the overhead of generating an unbiased dataset or of training extra models/layers in the deep model. With a proper design of the regularization function, it is possible to successfully solve other problems as well, such as model simplification (i.e. structured pruning) or compression, which opens up new possibilities, such as model decomposition towards explainability.

BIO: Enzo Tartaglione received the joint MS degree in electronic engineering from Polytechnic of Torino, University of Illinois at Chicago and Polytechnic of Milan in 2015, with 110/110 cum laude. In 2016 he was also awarded the “Alta Scuola Politecnica” diploma. In 2019 he received the PhD in physics from Polytechnic of Torino, cum laude, with the thesis “From Statistical Physics to Algorithms in Deep Neural Systems”. He is currently a postdoc in the EIDOS group at Università degli Studi di Torino. His main research interests are neural networks applied to medical image processing, unbiased learning, compression/pruning of deep models, explainable AI and privacy-aware deep learning.


The team has several recently accepted publications:

[1] N. Hobloss, L. Ge, M. Cagnazzo, “A Multi-View Stereoscopic Video Database With Green Screen (MTF) For Video Transition Quality-of-Experience Assessment”, accepted at QoMEx 2021

[2] M. Milovanovic, M. Cagnazzo, F. Henry, J. Jung, “Patch Decoder-Side Depth Estimation in MPEG Immersive Video”, accepted at IEEE ICASSP’21

[3] T. Karagkioules et al., “Online Learning for Adaptive Video Streaming in Mobile Networks”, accepted in ACM Transactions on Multimedia

Congrats to Nour, Marta and Theo, and to the whole advising team!

MILES project approved

The project “MILES – MachIne Learning for Efficient Streaming” has been funded by the Institut Polytechnique de Paris. Within our Multimedia team, it will develop online learning methods for improved video streaming on mobile devices.

Congratulations to Attilio Fiandrotti, in charge of the project!

Best paper award

Our article “Cockpit video coding with temporal prediction” by I. Mitrica (MM/IDS and Safran), A. Fiandrotti (MM/IDS), M. Cagnazzo (MM/IDS), C. Ruellan and E. Mercier has received the Best Paper Award at the “European Workshop on Visual Information Processing” (Rome, Italy, 31/10/2019).

Congrats Iulia!
