Category Archives: Séminaire



Day 1 / May 17th

Link to the Zoom videoconference:

2pm – 2.30pm: Welcome and Summary of Multimedia Team research on Computer Vision, Deep Learning and Compression (M. Cagnazzo, S. Lathuilière, Telecom-Paris)
2.30pm – 3pm: Applications of graphical models and deep models (Chaohui Wang, Maître de Conférences at Université Gustave Eiffel)
3pm – 4pm: Q&A, discussion, break
4pm – 4.30pm: Optimization Problems on Graphs (Mireille El Gheche, Senior research scientist at Sony AI, Zurich)
Q&A, discussion, closing notes

Day 2 / May 18th

Link to the Zoom videoconference:

2pm – 2.30pm: Deep Statistical Image Modeling (Siavash Bigdeli, Senior data scientist at CSEM)
2.30pm – 3.30pm: Q&A, discussion, break
3.30pm – 4pm: Regularization and deep neural networks for computer vision: EnDing biased feature propagation, enhancing compression and beyond (Enzo Tartaglione, Post-doc at University of Turin)
4pm – 5pm: Q&A, discussion, closing notes

Talk summaries and speaker bios

Applications of graphical models and deep models

ABSTRACT: In this talk, I will present five works based on graphical models and deep models, both of which are important modeling tools for computer vision. Regarding graphical models, I will show two applications: (1) a hierarchical model of object appearance that uses a single graphical model to exploit shared information across multiple quantization levels, so as to improve the performance of object tracking; (2) a high-order graphical model for jointly inferring multiple 3D objects and the indoor scene layout from a single RGB-D image captured with a Kinect camera. Regarding deep models, I will present two works on occlusion boundary and oriented occlusion boundary estimation, motivated by the fact that occlusion boundaries carry rich perceptual information about the underlying scene structure and are one of the main obstacles to scene understanding. Last but not least, I will present one work on re-rendering new images of an object of interest from a single image of it, by specifying multiple scene properties (such as viewpoint, illumination, and expression).

BIO: Chaohui Wang has been Maître de Conférences at Université Gustave Eiffel since 09/2014, and is a researcher at the LIGM Laboratory (UMR 8049), Université Gustave Eiffel, CNRS, ESIEE Paris, École des Ponts, France. He received his PhD degree (thesis: “Distributed and Higher-Order Graphical Models: Towards Segmentation, Tracking, Matching and 3D Model Inference”) from École Centrale Paris, France, in 2011, under the supervision of Prof. Nikos Paragios. He then did a postdoc with Prof. Stefano Soatto at the University of California, Los Angeles, USA, and another postdoc with Prof. Michael J. Black at the Max Planck Institute for Intelligent Systems, Germany. His research interests include computer vision, machine learning, and related problems. To date, he has published more than 30 papers (h-index: 20) and one book chapter, and holds three US patents. His work has won several research awards. He became an IEEE Senior Member in 2020.

Optimization Problems on Graphs

ABSTRACT: In many network-based applications, high-dimensional data naturally reside on the vertices of weighted graphs. Graph signal processing merges algebraic and spectral graph-theoretic concepts with computational harmonic analysis to process such signals on graphs. In this presentation, we outline the main challenges of the area and highlight the importance of incorporating the irregular structure of graph data domains when processing signals on graphs. We then detail two novel approaches to two important problems. First, we will present a recent optimal-transport framework for the graph alignment problem, which derives a simple, yet novel and powerful, distance between graphs: the Wasserstein distance between the distributions of random Gaussian models associated with the two graphs being compared. Second, we will show a graph-based depth refinement framework introducing a novel regularizer that explicitly promotes the reconstruction of piecewise-planar scenes but, thanks to the underlying graph, is flexible enough to handle scenes that are not fully piecewise planar.
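The graph distance sketched in the abstract can be illustrated concretely. A common construction (used, for instance, in optimal-transport graph comparison frameworks) associates each graph with a zero-mean Gaussian whose covariance is the pseudo-inverse of the graph Laplacian, then takes the closed-form 2-Wasserstein distance between the two Gaussians. The code below is a minimal sketch under that assumption; the exact formulation in the talk may differ.

```python
import numpy as np
from scipy.linalg import sqrtm

def graph_covariance(adj):
    # Map a graph (adjacency matrix) to the covariance of its
    # associated Gaussian: the pseudo-inverse of the Laplacian.
    lap = np.diag(adj.sum(axis=1)) - adj
    return np.linalg.pinv(lap)

def wasserstein_graph_distance(adj1, adj2):
    # Closed-form 2-Wasserstein distance between N(0, S1) and N(0, S2):
    # W2^2 = tr(S1) + tr(S2) - 2 tr((S2^{1/2} S1 S2^{1/2})^{1/2})
    s1, s2 = graph_covariance(adj1), graph_covariance(adj2)
    a = sqrtm(s2).real
    root = sqrtm(a @ s1 @ a).real
    w2_sq = np.trace(s1) + np.trace(s2) - 2.0 * np.trace(root)
    return max(float(w2_sq), 0.0) ** 0.5  # clip tiny negative round-off
```

Comparing a 4-node path graph with itself gives (numerically) zero, while comparing it with a 4-node cycle gives a strictly positive distance, as expected of a graph dissimilarity measure.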

BIO: Mireille El Gheche received the Master's degree in radio-communication from Centrale Supélec in 2010 and the Ph.D. degree in signal and image processing from Université Gustave Eiffel in 2014. From Jan. 2015 until Aug. 2017, she was a postdoctoral researcher at the IMS and IMB laboratories, Université de Bordeaux, where she worked on optimization methods for image super-resolution, denoising, and reconstruction. From Nov. 2017 until Jan. 2021, she was a postdoctoral researcher at École Polytechnique Fédérale de Lausanne (Switzerland), where she worked on computational problems in graph theory using machine learning approaches. Since Feb. 2021, she has been a Senior research scientist at Sony AI, Zurich.

Deep Statistical Image Modeling

ABSTRACT: My talk will be focused on our key results for building efficient statistical image models using deep neural networks. In the first part, I will give an overview of our work on learning image densities using denoising autoencoders and show how they can be employed in image enhancement and generation problems. In the second part, I will present our approach to learning a generative model for mapping a compact representation to high-dimensional textures with infinite resolution.
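The link between denoising autoencoders and image densities rests on a classical result (due to Alain and Bengio): for noise level σ, the residual of the optimal denoiser approximates σ² times the gradient of the log of the smoothed data density, r(y) − y ≈ σ² ∇ log p_σ(y). The sketch below verifies this identity in the one case where everything is available in closed form, a 1-D Gaussian data distribution; it is an illustration of the principle, not the deep models discussed in the talk.

```python
import numpy as np

# Data model: x ~ N(0, s^2), observation y = x + n with n ~ N(0, sigma^2).
# The MMSE-optimal denoiser is linear: r(y) = E[x|y] = y * s^2 / (s^2 + sigma^2).
# The smoothed density is p_sigma = N(0, s^2 + sigma^2), hence
# grad log p_sigma(y) = -y / (s^2 + sigma^2).
s, sigma = 2.0, 0.5

def optimal_denoiser(y):
    return y * s**2 / (s**2 + sigma**2)

def grad_log_smoothed_density(y):
    return -y / (s**2 + sigma**2)

y = np.linspace(-3.0, 3.0, 7)
residual = optimal_denoiser(y) - y
# Denoiser residual equals sigma^2 * score of the smoothed density.
assert np.allclose(residual, sigma**2 * grad_log_smoothed_density(y))
```

In this toy setting the identity is exact; for deep denoisers on images it holds approximately, which is what makes a trained denoiser usable as an image prior for enhancement and generation.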

BIO: Siavash Bigdeli is a senior data scientist at CSEM. He received a Ph.D. in Computer Science from the University of Bern in 2018. Prior to joining CSEM in 2019, he was a postdoctoral fellow at EPFL. His interests are Computer Vision, Deep Learning and particularly Statistical Signal Processing.

Regularization and deep neural networks for computer vision: EnDing biased feature propagation, enhancing compression and beyond.

ABSTRACT: Artificial neural networks achieve state-of-the-art performance on an ever-growing number of tasks, and nowadays they are used to solve an incredibly large variety of problems, especially in computer vision. There are, however, open challenges, such as the presence of biases in the training data, which calls the generalization capability of these models into question, or the models’ size, which limits their deployability on mobile/embedded devices.
Many recent approaches aim at solving these issues. Some of them rely on the design of regularization terms in the objective function: these are minimized at training time alongside the standard loss function, putting extra constraints on the learning problem. The EnD regularizer, for instance, can be used for bias disentangling in image classification tasks without the overhead of generating an unbiased dataset or of training extra models/layers on top of the deep model. With a proper design of the regularization function it is also possible to solve other problems, such as model simplification (i.e. structured pruning) or compression, which opens up new possibilities, like model decomposition towards explainability.
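The general recipe described above, a standard loss plus a regularization term minimized at training time, can be sketched as follows. The group-lasso penalty and the λ weight here are illustrative choices commonly used for structured pruning, not the speaker's exact formulation.

```python
import numpy as np

def group_lasso(W):
    # Structured-sparsity regularizer: sum of the L2 norms of the rows of W.
    # Each row corresponds to one output neuron, so minimizing this term
    # pushes entire neurons toward zero, enabling structured pruning.
    return float(np.sum(np.linalg.norm(W, axis=1)))

def regularized_objective(data_loss, W, lam=1e-3):
    # Training objective = standard loss + lambda * regularizer,
    # i.e. the extra constraint is minimized alongside the task loss.
    return data_loss + lam * group_lasso(W)
```

During training, the gradient of the penalty shrinks whole rows of the weight matrix; rows that reach (near) zero can then be removed, simplifying the model without changing its architecture family.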

BIO: Enzo Tartaglione received the joint MS degree in electronic engineering from the Polytechnic of Torino, the University of Illinois at Chicago, and the Polytechnic of Milan in 2015, with 110/110 cum laude. In 2016 he was also awarded the “Alta Scuola Politecnica” diploma. In 2019 he received a PhD in physics from the Polytechnic of Torino, cum laude, with the thesis “From Statistical Physics to Algorithms in Deep Neural Systems”. He is currently a postdoc in the EIDOS group at Università degli Studi di Torino. His main research interests are neural networks applied to medical image processing, unbiased learning, compression/pruning of deep models, explainable AI, and privacy-aware deep learning.

Invited talk: “Virtual Reality: from Neuroscience to Media Streaming”

Our next seminar!

Date: Friday Feb 2nd 2018
Time: 13h30
Room: C48
Title: Virtual Reality: from Neuroscience to Media Streaming
Immersive Virtual Reality is transitioning, at a remarkable pace, from a tool for fundamental research in cognitive neuroscience, therapy and rehabilitation to a consumer product for training, education and entertainment.
In this talk I will review some highlights of this transition. The talk will have three parts:
1) Review some of the known principles and well-established mechanisms of Virtual Reality science, as they can be found in the relevant literature;
2) Introduce some of the ongoing initiatives in terms of capture technologies, encoding mechanisms, and delivery infrastructure fostered by the transformation of VR into a consumer product;
3) Discuss, hopefully in interaction with the audience, where we are heading, what might happen in the near future, and what we can do to address the challenges and opportunities associated with this medium.

Dr. Joan Llobera is an electrical engineer and an academic researcher working at the intersection of cognitive science and virtual reality. He obtained a double diploma in Electrical Engineering (Universitat Politècnica de Catalunya and Télécom Paris), holds two Master's degrees, one in cognitive science (EHESS) and another in software (UPC), and in 2012 obtained a PhD from the Universitat de Barcelona on the topic of Stories in Virtual Reality, which received a Cum Laude qualification. He also worked as a postdoctoral researcher at the École Polytechnique Fédérale de Lausanne. Since September 2015 he has been a senior researcher at the i2CAT foundation.

Multimedia group seminar day

Three talks are on the program for our last group seminar day (June 12th 2017):

  1. Iulia Mitrica : “Aircraft screen content compression”
  2. Belén Luque Lopez : “Super resolution on 3D point cloud using CNNs”
  3. Theodoros Karagkioules : “A Comparative Case Study of HTTP Adaptive Streaming Algorithms in Mobile Networks”