Category Archives: Poste

Open position: Associate/Assistant Professor in Video Analysis and Learning

Telecom ParisTech, a CS/EE school of Institut Polytechnique de Paris, is hiring an Associate Professor in Video Analysis and Learning. The position will be located in the Multimedia Team, within the Image, Data, Signal Department (IDS), and the LTCI laboratory.

The Multimedia team has a long track record in video and image coding and transmission. More recently, video analysis and learning have become increasingly relevant for the team, which now runs a regional study group on Machine and Deep Learning applications to image and video compression. The team aims to expand its activity in this area, and several new research projects have just been launched, such as research programs in Deep Learning assisted video compression and learning-based photographic quality evaluation. In this context, and to support the team's growing activity, a permanent position in video analysis and learning has been opened.

Applicants are expected to have an outstanding academic research record and will be encouraged to advise PhD theses and supervise engineers and post-docs, while being actively involved in funded projects and in the activities of the Multimedia team. Teaching will take place in the engineering and master's tracks at Telecom ParisTech and can be given in English.

More information is available here.

Apply via e-mail.

Open position: Associate professor in Immersive Video

Télécom ParisTech recruits
An associate professor in Immersive Video
46 rue Barrault- 75013 PARIS

Application deadline: 10 Nov. 2017

See also here (French).

Please send:
– a detailed curriculum vitae
– a cover letter
– a description of the past activities of the candidate in teaching (initial and continuing training) and research
– the main publications
– the names and addresses of two qualified persons who can give an informed opinion on the applicant
– a concise teaching and research project (maximum 3 pages)
Send to
E-mail :
Mail : Télécom ParisTech – DRH – B659 – 46 rue Barrault – 75634 Paris Cedex 13
Position description (in French)
Extract of the position description in English:
Required degrees:
  • PhD; or
  • Master's degree (or equivalent) and 5+ years of experience; or
  • Professional with 8+ years of experience
Required knowledge:
  • Video compression; and
  • Immersive video
Required skills:
  • University-level teaching experience; and
  • Good level of English; and
  • Good level of French, or a commitment to reach a good level as soon as possible
Desirable skills (one or more of the following):
  • Statistical modeling of multi-dimensional signals
  • Video streaming
  • Video quality
  • Digital holography
  • Programming languages
  • Project-based and problem-based teaching


Open position: Associate professor in Immersive Video

Very soon we will open a position of Associate Professor (specialty: immersive video) in our team at Telecom-ParisTech. We are looking for a brilliant PhD holder, preferably with at least one year of post-doctoral experience. More experienced candidates are also welcome. The detailed call will follow soon.

Research domains: immersive video, video coding, video transmission, video quality.

Background: signal processing, networking, applied maths.

Potential candidates can contact me at

PhD position available

Update: this position is no longer available

Acquisition and visualization of the Plenoptic function with intermediate view synthesis


There is increasing interest in applications that enable Free Navigation Video Services [1], where users can change their viewpoint on a scene while receiving a video. These services try to provide the user with the so-called Plenoptic function of the scene [2], defined as:

P_f = P_f(x, y, z, θ, φ, λ, t)

It gives the light intensity at each position (x, y, z), for any incident angle (θ, φ), for any wavelength λ and at any time t. This doctoral project focuses on three key problems related to the use of the Plenoptic function: its acquisition, synthesis and visualization.

Current acquisition tools cannot capture the whole Plenoptic function; they only sample it. For example, in Super-MultiView video [3], the plane z = z_0 is fixed, and only the forward scene, i.e. the part for which the polar angle lies between -π/2 and π/2, is acquired. Moreover, this plane is sampled only at the position of each camera.

In this project we are interested in the interpolation of the Plenoptic function, i.e. in the synthesis of virtual viewpoints that were not acquired by real cameras. Moreover, we also want to explore the case of irregular sampling positions of P_f.
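As a rough illustration of what interpolating between sampled viewpoints means, the sketch below blends the two nearest camera views linearly. It is only a toy baseline under strong assumptions: each view is reduced to a single scanline of intensities, cameras lie on a line, and the function name `interpolate_view` and the data layout are hypothetical. It ignores scene geometry, so on real content it would produce ghosting; the synthesis methods discussed in this project rely on disparity or depth instead.

```python
from bisect import bisect_left


def interpolate_view(samples, x_virtual):
    """Approximate the view at x_virtual by blending the two nearest views.

    samples: list of (camera_x, scanline) pairs sorted by camera_x.
             Camera positions may be irregularly spaced.
    Assumption: plain linear blending, with no geometry compensation.
    """
    positions = [x for x, _ in samples]
    if not positions[0] <= x_virtual <= positions[-1]:
        raise ValueError("x_virtual must lie between the outermost cameras")

    # Index of the first camera located at or to the right of x_virtual.
    i = bisect_left(positions, x_virtual)
    if positions[i] == x_virtual:
        return list(samples[i][1])  # a real camera sits exactly there

    (x0, left), (x1, right) = samples[i - 1], samples[i]
    w = (x_virtual - x0) / (x1 - x0)  # 0 at the left camera, 1 at the right
    return [(1 - w) * a + w * b for a, b in zip(left, right)]


# Three irregularly spaced cameras, two pixels per scanline.
views = [(0.0, [0.0, 10.0]), (1.0, [10.0, 20.0]), (4.0, [40.0, 50.0])]
midway = interpolate_view(views, 2.5)
```

Because the weights depend on the actual camera positions rather than on their indices, the same code handles regular and irregular sampling.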


Access to the Plenoptic function would allow new ways to create and consume visual content. For example, the Fyuse application [4] lets the user change the viewing angle during playback, while the Lytro system [5] allows post-acquisition refocusing.

Several scientific fields are relevant to this approach:

  • image aesthetics [6]
  • virtual cinema [7]
  • perception and visual attention [8][9]
  • free viewpoint video [10][11]

These topics interact with one another: view synthesis is a prerequisite for virtual cinema and may benefit from visual attention and perception information; the whole process affects the quality and the aesthetics of the resulting image.


Image synthesis plays a key role in the system that we want to implement. We can see the problem as the interpolation of the Plenoptic function from a set of samples [12]. This reconstruction is based on the scene geometry and often uses post-processing to alleviate synthesis artifacts.

Image synthesis and rendering have long been studied by the Computer Vision and Compression communities, even outside the context of Plenoptic function interpolation. The first methods used only the images for the synthesis: they belong to the Image-Based Rendering (IBR) [13] family. Disparity estimation and occlusion detection are typical tools used to improve the synthesis in this case [14], and may prove useful in this doctoral project.

When depth information is also available, we have the Depth Image-Based Rendering (DIBR) [15] family. Even though DIBR has been known since the early 2000s, the synthesis quality is not yet fully satisfactory [16]. Nevertheless, some promising methods have been proposed recently [17]. They combine temporal and inter-view redundancy to improve the synthesis.
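To make the DIBR idea concrete, here is a minimal sketch of forward warping on a single scanline. It assumes rectified, parallel cameras, so that a pixel at depth z moves horizontally by a disparity of focal_px * baseline / z; the sign convention (virtual camera shifted to the right, pixels moving to the left) and all names are illustrative assumptions, not taken from the cited methods. A depth buffer resolves occlusions, and target pixels that receive no source pixel stay marked as holes, which is where the post-processing mentioned above would come in.

```python
def warp_scanline(colors, depths, focal_px, baseline, hole=None):
    """Forward-warp one scanline to a virtual camera shifted by `baseline`.

    Assumptions: rectified parallel cameras, positive depths, and a
    virtual camera displaced to the right, so pixels move to the left.
    """
    width = len(colors)
    warped = [hole] * width
    nearest = [float("inf")] * width  # depth buffer for occlusion handling

    for x, (color, z) in enumerate(zip(colors, depths)):
        disparity = focal_px * baseline / z
        xt = round(x - disparity)  # target column in the virtual view
        # Keep the pixel only if it lands inside the image and is closer
        # to the camera than whatever was already written there.
        if 0 <= xt < width and z < nearest[xt]:
            warped[xt] = color
            nearest[xt] = z
    return warped


# Pixel "c" is in the foreground (depth 2), the others are background (depth 4).
virtual = warp_scanline(["a", "b", "c", "d"], [4.0, 4.0, 2.0, 4.0],
                        focal_px=4.0, baseline=1.0)
```

In this toy case the foreground pixel "c" overwrites the background pixel "b" at the same target column, and two columns are left as holes (disocclusions) that a real system would have to fill by inpainting or by using another view.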

Another difficulty may come from the camera positioning [18]. A preliminary calibration and synchronization phase is needed in order to obtain a high-quality synthesis [19][20][21]. To this end, feature matching tools such as SIFT [22] or SURF [23] could be employed. This appears necessary in order to achieve 3D scene understanding [1][18].

Work agenda

This doctoral project will start with a thorough study of the state of the art in the relevant domains: image synthesis, camera calibration, 3D geometry, feature matching and visual attention. From a practical point of view, the PhD candidate may use the facilities at b<>com to test the acquisition of the Plenoptic function and to perform camera calibration and synchronization.

Then, the PhD candidate will implement and test different synthesis methods, starting from the state of the art and then proposing more complex and effective solutions. Human vision principles should be integrated into the new approaches.

At the same time, the impact of the synthesis methods on practical applications such as visualization, free navigation and virtual cinema will be taken into account. The final target of the doctoral project is mastery of the complete system, from acquisition to visualization.

Advisors:

Rémi Cozot, Associate Professor (Maître de Conférences) with habilitation to supervise research (HDR), IRT b<>com, IRISA/Université de Rennes 1

Marco Cagnazzo, Associate Professor (Maître de Conférences) with habilitation to supervise research (HDR), IRT b<>com, Telecom-ParisTech/Institut Mines-Télécom


  1. M. Tanimoto, Free-Viewpoint Television. In Ronfard, R. & Taubin, G. (Eds.), Image and Geometry Processing for 3-D Cinematography, Springer Berlin Heidelberg, 2010, 53-76
  2. E. H. Adelson and J. Bergen, “The plenoptic function and the elements of early vision,” In Computational Models of Visual Processing, pages 3-20. MIT Press, 1991
  3. Dricot, A.; Jung, J.; Cagnazzo, M.; Pesquet, B. & Dufaux, F. “Full Parallax 3D Video Content Compression”. In Novel 3D Media Technologies, Springer New York, 2015, 49-70
  6. C. Bist, R. Cozot, G. Madec, X. Ducloux, Style Aware Tone Expansion for HDR Displays. Graphics Interface 2016
  7. C. Lino, M. Christie, Efficient composition for virtual camera control. ACM SIGGRAPH / Eurographics Symposium on Computer Animation, 2012
  8. S. Hillaire, A. Lécuyer, T. Regia-Corte, R. Cozot, J. Royan and G. Breton, Design and application of real-time visual attention model for the exploration of 3d virtual environments. IEEE Transactions on Visualization and Computer Graphics (TVCG), 18(3):356–368, 2012
  9. S. Hillaire, A. Lécuyer, R. Cozot and G. Casiez, Depth-of-field blur effects for first-person navigation in virtual environments. IEEE Computer Graphics and Applications, 28(6):47–55, 2008
  10. D. Farin, Y. Morvan, P. H. N. de With, View Interpolation Along a Chain of Weakly Calibrated Cameras. IEEE Workshop on Content Generation and Coding for 3D-Television, Eindhoven, Netherlands, June 2006
  11. F. Dufaux, B. Pesquet-Popescu, M. Cagnazzo (eds.): Emerging Technologies for 3D Video. Wiley, 2013
  12. Chebira, A., Dragotti, P. L., Sbaiz, L., & Vetterli, M. (2003, September). Sampling and interpolation of the plenoptic function. In Image Processing, 2003. ICIP 2003. 2003 International Conference on (Vol. 2, pp. II-917). IEEE
  13. H. Shum, S. Kang, A review of image-based rendering techniques. Proceed. Intern. Symp. Visual Comm and Proc. (2000). doi: 10.1117/12.386541
  14. G. Petrazzuoli, M. Cagnazzo, B. Pesquet-Popescu. “Novel solutions for side information generation and fusion in multiview DVC”. In EURASIP Journal of Advances on Signal Processing, vol. 2013, no. 154, pp. 17, October 2013.
  15. Fehn, C. (2004, May). Depth-image-based rendering (DIBR), compression, and transmission for a new approach on 3D-TV. In Electronic Imaging 2004 (pp. 93-104). International Society for Optics and Photonics.
  16. A. Dricot, J. Jung, M. Cagnazzo, F. Dufaux, B. Pesquet-Popescu. “Subjective evaluation of Super Multi-View compressed contents on high-end light-field 3D displays”. In Elsevier Signal Processing: Image Communication, vol. 39, pp. 369-385, November 2015
  17. A. Purica, E. Mora, M. Cagnazzo, B. Ionescu, B. Pesquet-Popescu. “Multiview plus depth video coding with temporal prediction view synthesis”. In IEEE Transactions on Circuits and Systems for Video Technology, vol. 26, no. 2, pp. 360–374, February 2016.
  18. N. Snavely, S. M. Seitz, R. Szeliski. Modeling the world from internet photo collections. Int. J. Comput. Vis., 80 (2) (2008), pp. 189–210
  19. S. Milani, Compression of multiple user photo galleries, Image and Vision Computing, Volume 53, September 2016, Pages 68-75
  20. L. Zini, A. Cavallaro, F. Odone. Action-based multi-camera synchronization. IEEE J. Emerging Sel. Top. Circuits Syst., 3 (2) (2013), pp. 165–174
  21. L. Shen, Z. Liu, T. Yan, Z. Zhang, P. An. View-adaptive motion estimation and disparity estimation for low complexity multiview video coding. IEEE Trans. Circuits Syst. Video Technol., 20 (6) (2010), pp. 925–930
  22. Lowe, D. G. (1999). Object recognition from local scale-invariant features. In Computer vision, 1999. The proceedings of the seventh IEEE international conference on (Vol. 2, pp. 1150-1157). IEEE.
  23. Bay, H., Tuytelaars, T., & Van Gool, L. (2006, May). SURF: Speeded up robust features. In European conference on computer vision (pp. 404-417). Springer Berlin Heidelberg.


PhD Thesis: compression of avionics screen content

Airplane screens display very specific video content, in which text and graphics are superimposed on images or on a uniform background.

Compressing this kind of data requires adapted techniques, since the most important information (text, graphics) is usually degraded by traditional, transform-based video compression techniques.

We want to investigate the use of classification, segmentation and inpainting to recognize the most relevant information and encode it with appropriate methods.

The PhD student will work at both Telecom-ParisTech and Zodiac Aerospace.


PhD thesis: multiview video compression and streaming

Three-year contract leading to a PhD degree.
The topic is the problem of interactive streaming of multiview video.
Multiview video is composed of several video sequences, each corresponding to a different point of view. Interactive access to this video requires switching from one view to another. This is problematic for predictive coding: predicting an image from one belonging to another view is complex (all inter-view dependencies must be taken into account), while coding each view independently is not efficient. Possible solutions are based on distributed video coding.

Links: Paper on IMVS + DVC.

See also papers by G. Cheung.