9h45 Welcome coffee
10h00-11h15 “Deep learning for Super Resolution and Tracking”, by Gianni Franchi, Univ. Paris Sud
11h15-12h30 “Deformation models for image and video generation”, by Stéphane Lathuilière, Univ. Trento (Italy)
12h30-14h Lunch break
14h-15h15 “The Next Big Thing: From Systems to Deep Systems”, by Francesco Banterle, CNR Pisa (Italy)
15h15-16h Coffee, free discussion

Abstracts:

“Deep learning for Super Resolution and Tracking”
This presentation will cover two projects:
Project 1 aims to combine deep learning and geostatistics techniques for image super-resolution. The goal is to obtain both the strong results of deep learning and the uncertainty of the estimator, thanks to geostatistics.
Project 2 aims to track people in a video of an extremely dense crowd. Since no annotated dataset is available for this project, we will propose a technique in which the neural network learns on its own (self-supervised learning).

“Deformation models for image and video generation”
Generating realistic images and videos has countless applications in different areas, ranging from photography technologies to e-commerce business.
Recently, deep generative approaches have emerged as effective techniques for generation tasks. In this talk, we will first present the problem of pose-guided person image generation. Specifically, given an image of a person and a target pose, a new image of that person in the target pose is synthesized. We will show that important body-pose changes affect generation quality and that specific feature map deformations lead to better images.
Then, we will present our recent framework for video generation. More precisely, our approach generates videos where an object in a source image is animated according to the motion of a driving video. In this task, we employ a motion representation based on keypoints that are learned in a self-supervised fashion. Therefore, our approach can animate any arbitrary object without using annotation or prior information about the specific object to animate.

“The Next Big Thing: From Systems to Deep Systems”
The main communities in Computer Science are all shifting from traditional algorithms towards deep-based algorithms where deep learning is extensively used to solve everyday problems. Although this is very attractive in terms of quality and speed, the days of end-to-end encoding are numbered because more than one network is needed to achieve a full task. This talk will show a traditional system for 3D reconstruction, how to make it deep, and the making of a from-scratch deep system in which deep learning was in the loop from start to finish.[:]

Publication

[:fr]Article ICIP’19[:en]Paper accepted at IEEE ICIP’19 [:it]Article ICIP’19[:]

07/05/2019 Marco Cagnazzo

[:fr]Notre article sur la représentation scalable des hologrammes a été accepté dans la conférence IEEE ICIP’19:

Anas El Rhammad, Patrick Gioia, Antonin Gilles, Marco Cagnazzo, ‘SCALABLE CODING FRAMEWORK FOR A VIEW-DEPENDENT STREAMING OF DIGITAL HOLOGRAMS'[:en]Our article on scalable hologram representation has been accepted at the IEEE ICIP’19 conference:

Anas El Rhammad, Patrick Gioia, Antonin Gilles, Marco Cagnazzo, ‘SCALABLE CODING FRAMEWORK FOR A VIEW-DEPENDENT STREAMING OF DIGITAL HOLOGRAMS'[:]

3D Video, Conference

[:fr]Session spéciale à ACM ICDSC’19[:en]Special session ACM ICDSC’19[:]

29/04/2019 Marco Cagnazzo

[:fr]Notre proposition de session spéciale à la conférence ACM International Conference on Distributed Smart Cameras (ICDSC’19) a été acceptée !
La session spéciale s’intitule : « Trends in Free Navigation Technologies ».
Ici on peut trouver l’appel à soumissions [:en]Our special session proposal for the ACM International Conference on Distributed Smart Cameras (ICDSC’19) has been accepted!
The special session title is « Trends in Free Navigation Technologies ».
Find here the call for papers

It is an open special session, so you can apply directly through the conference website.[:]

Position

[:fr]Ouverture de poste: Maître de Conférence en Analyse et apprentissage pour la vidéo [:en]Open position: Associate/Assistant Professor in Video Analysis and Learning[:]

15/02/2019 Marco Cagnazzo

[:fr]Telecom ParisTech, une Grande Ecole d’Ingénieur de l’Institut Polytechnique de Paris, recrute un·e Maître de Conférences en Analyse et Apprentissage pour la Vidéo. Le poste est localisé au sein de l’Équipe Multimédia, dans le Département Image, Data, Signal (IDS) et le laboratoire LTCI.

L’équipe Multimédia a une longue expérience dans le domaine du codage et de la transmission de la vidéo. Plus récemment, l’analyse et l’apprentissage pour la vidéo ont pris une place d’importance grandissante dans l’activité de recherche et enseignement de l’équipe, comme témoigné par la mise en place du Groupe d’étude pour l’application de l’apprentissage (profond) à la compression vidéo. L’équipe a l’objectif de agrandir son activité dans ce domaine et plusieurs nouveaux projets de recherche viennent d’être lancé : par exemple, l’équipe a 2 projets en cours sur l’utilisation de l’apprentissage profond pour la compression vidéo, et un autre sur l’utilisation des techniques d’apprentissage pour l’évaluation de la qualité photographique. C’est dans ce cadre et pour supporter l’activité croissante de l’équipe que la position de Maître de Conférences en Analyse et apprentissage pour la Vidéo a été ouverte.

Les candidats doivent avoir un dossier de recherche universitaire de qualité et la personne retenue sera encouragée à encadrer des thèses de doctorat, des ingénieur·es et des post-doctorant·es, tout en participant activement aux projets financés et aux activités de l’équipe Multimédia. Ses activités d’enseignement se dérouleront au sein des différents cursus de Télécom ParisTech et de l’Institut Polytechnique de Paris ; ils peuvent être donnés en anglais.

Plus d’informations ici.

Postuler par e-mail.[:en]Telecom ParisTech, a CS/EE school of Institut Polytechnique de Paris, is hiring an Associate Professor in Video Analysis and Learning. The position will be located in the Multimedia Team, within the Image, Data, Signal Department (IDS), and the LTCI laboratory.

The Multimedia team has long-standing experience in video and image coding and transmission. More recently, video analysis and learning have become increasingly important for the team, which now runs a regional study group on Machine and Deep Learning applications to image and video compression. The team aims to expand its activity in this area, and several new and exciting research projects have just been launched, such as research programs in Deep Learning assisted video compression and learning-based photographic quality evaluation. In this context, and to support the increasing activity of the team, a permanent position in video analysis and learning has been opened.

Applicants are expected to provide an outstanding academic research record and will be encouraged to advise PhD theses, supervise engineers and post-docs, while being actively involved in funded projects and in the activities of the Multimedia team. The teaching activities will take place in the engineer and master tracks at Telecom ParisTech and can be given in English.

Find here more information.

Apply via e-mail.[:]

Publication

[:fr]Articles ICASSP[:en]ICASSP papers[:]

04/02/2019 Marco Cagnazzo

[:fr]Trois articles ont été acceptés dans la conférence IEEE ICASSP :
1) S. Zheng, M. Cagnazzo, M. Kieffer. « CHANNEL IMPULSIVE NOISE MITIGATION FOR LINEAR VIDEO CODING SCHEMES »
2) L. Wang, A. Fiandrotti, A. Purica, G. Valenzise, M. Cagnazzo. « ENHANCING HEVC SPATIAL PREDICTION BY CONTEXT-BASED LEARNING »
3) P. Nikitin, M. Cagnazzo, J. Jung. « COMPRESSION IMPROVEMENT VIA REFERENCE ORGANIZATION FOR 2D-MULTIVIEW CONTENT ».
Félicitations aux auteur.e.s, en particulier à Shuo, Li et Pavel.[:en]Three articles have been accepted into IEEE ICASSP :
1) S. Zheng, M. Cagnazzo, M. Kieffer. « CHANNEL IMPULSIVE NOISE MITIGATION FOR LINEAR VIDEO CODING SCHEMES »
2) L. Wang, A. Fiandrotti, A. Purica, G. Valenzise, M. Cagnazzo. « ENHANCING HEVC SPATIAL PREDICTION BY CONTEXT-BASED LEARNING »
3) P. Nikitin, M. Cagnazzo, J. Jung. « COMPRESSION IMPROVEMENT VIA REFERENCE ORGANIZATION FOR 2D-MULTIVIEW CONTENT ».
Congrats to Shuo, Li and Pavel.[:]

PhD defense

[:fr]Soutenance de Shuo Zheng[:en]Shuo Zheng’s PhD defense[:]

25/01/2019 Marco Cagnazzo

[:fr]Shuo Zheng soutient sa thèse mardi 05 Février à 10h en amphi Opale à Télécom ParisTech (46 rue Barrault, 75013 Paris). Sa recherche s’inscrit dans le contexte du codage vidéo linéaire.

Titre: Prise en compte des contraintes de canal dans les schémas de codage vidéo conjoint du source-canal

Composition du jury:

M. François-Xavier Coudoux, Université Polytechnique Hauts-de-France, Rapporteur
Mme Aline Roumy, INRIA Rennes, Rapportrice
M. Jean-Marie Gorce, INSA Lyon, Examinateur
M. Marc Leny, Ektacom, Examinateur
Mme Michèle Wigger, TélécomParisTech, Examinatrice
M. Marco Cagnazzo, TélécomParisTech, Directeur de thèse
M. Michel Kieffer, Université de Paris-sud, Co-directeur de thèse

Résumé: Les schémas de Codage Vidéo Linéaire (CVL) inspirés de SoftCast ont émergé dans la dernière décennie comme une alternative aux schémas de codage vidéo classiques. Ces schémas de codage source-canal conjoint exploitent des résultats théoriques montrant qu’une transmission (quasi-)analogique est plus performante dans des situations de multicast que des schémas numériques lorsque les rapports signal-à-bruit des canaux (C-SNR) diffèrent d’un récepteur à l’autre. Dans ce contexte, les schémas de CVL permettent d’obtenir une qualité de vidéo décodée proportionnelle au C-SNR du récepteur. Une première contribution de cette thèse concerne l’optimisation de la matrice de précodage de canal pour une transmission de type OFDM de flux générés par un CVL lorsque les contraintes de puissance diffèrent d’un sous-canal à l’autre. Ce type de contrainte apparaît sur des canaux DSL, ou dans des dispositifs de transmission sur courant porteur en ligne (CPL). Cette thèse propose une solution optimale à ce problème de type multi-level water filling et nécessitant la solution d’un problème de type Structured Hermitian Inverse Eigenvalue. Trois algorithmes sous-optimaux de complexité réduite sont également proposés. De nombreux résultats de simulation montrent que les algorithmes sous-optimaux ont des performances très proches de l’optimum et réduisent significativement le temps de codage. Le calcul de la matrice de précodage dans une situation de multicast est également abordé. Une seconde contribution principale consiste en la réduction de l’impact du bruit impulsif dans les CVL. Le problème de correction du bruit impulsif est formulé comme un problème d’estimation d’un vecteur creux. Un algorithme de type Fast Bayesian Matching Pursuit (FBMP) est adapté au contexte CVL. Cette approche nécessite de réserver des sous-canaux pour la correction du bruit impulsif, entraînant une diminution de la qualité vidéo en l’absence de bruit impulsif.
Un modèle phénoménologique (MP) est proposé pour décrire l’erreur résiduelle après correction du bruit impulsif. Ce modèle permet d’optimiser le nombre de sous-canaux à réserver en fonction des caractéristiques du bruit impulsif. Les résultats de simulation montrent que le schéma proposé améliore considérablement les performances lorsque le flux CVL est transmis sur un canal sujet à du bruit impulsif.[:en]

Shuo Zheng’s PhD defense will take place on 5 February at 10 am in Amphi Opale at TélécomParisTech (46 rue Barrault, 75013 Paris).

Committee:

Mr François-Xavier Coudoux, Université Polytechnique Hauts-de-France, Referee
Mrs Aline Roumy, INRIA Rennes, Referee
Mr Jean-Marie Gorce, INSA Lyon, Examiner
Mr Marc Leny, Ektacom, Examiner
Mrs Michèle Wigger, TélécomParisTech, Examiner, Jury’s Chair
Mr Marco Cagnazzo, TélécomParisTech, Advisor
Mr Michel Kieffer, Université de Paris-sud, Advisor

Title: Accounting for Channel Constraints in Joint Source-Channel Video Coding Schemes

Abstract: SoftCast-based Linear Video Coding (LVC) schemes have emerged in the last decade as a quasi-analog joint source-channel alternative to classical video coding schemes. Theoretical analyses have shown that analog coding is better than digital coding in a multicast scenario when the channel signal-to-noise ratios (C-SNR) differ among receivers. In this context, LVC schemes provide each receiver with a decoded video quality proportional to its C-SNR. This thesis first considers the channel precoding and decoding matrix design problem for LVC schemes under a per-subchannel power constraint. Such a constraint is found, e.g., on Power Line Telecommunication (PLT) channels and is similar to per-antenna power constraints in multi-antenna transmission systems. An optimal design approach is proposed, involving a multi-level water-filling algorithm and the solution of a structured Hermitian Inverse Eigenvalue problem. Three lower-complexity suboptimal algorithms are also proposed. Extensive experiments show that the suboptimal algorithms perform close to the optimal one and can significantly reduce the complexity. The precoding matrix design in multicast situations is also considered. A second main contribution is an impulse noise mitigation approach for LVC schemes. Impulse noise identification and correction can be formulated as a sparse vector recovery problem. A Fast Bayesian Matching Pursuit (FBMP) algorithm is adapted to LVC schemes. Subchannels must be provisioned for impulse noise mitigation, which lowers the nominal video quality in the absence of impulse noise. A phenomenological model (PM) is proposed to describe the impulse noise correction residual. Using the PM, an algorithm to evaluate the optimal number of subchannels to provision is proposed. Simulation results show that the proposed algorithms significantly improve the video quality when the video is transmitted over channels prone to impulse noise.

[:]

Publication

[:fr]Article TMM accepté[:en]Article in IEEE Transactions on Multimedia[:]

21/01/2019 Marco Cagnazzo

[:fr]L’article « Very Low Bitrate Semantic Compression of Airplane Cockpit Screen Content » a été accepté dans IEEE Trans. on Multimedia.
Félicitation à Iulia Mitrica, première auteure de cet étude portant sur la reconnaissance des éléments sémantiques (texte, graphes) dans le codage de la vidéo d’écrans d’avion.[:en]Our article entitled « Very Low Bitrate Semantic Compression of Airplane Cockpit Screen Content » has been accepted for publication in IEEE Transactions on Multimedia.

Congratulations to Iulia, our first author.[:]
