
Semantic Segmentation on Videos

Grant number: 17/16597-7
Support Opportunities: Scholarships in Brazil - Doctorate
Effective date (Start): December 01, 2017
Effective date (End): February 28, 2022
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computer Systems
Principal Investigator: Gerberth Adín Ramírez Rivera
Grantee: Darwin Danilo Saire Pilco
Host Institution: Instituto de Computação (IC). Universidade Estadual de Campinas (UNICAMP). Campinas, SP, Brazil
Associated scholarship(s): 19/18678-0 - Semantic Segmentation Using Hourglass Learning Model, BE.EP.DR


The semantic segmentation task aims to produce a dense classification by assigning a label to every pixel of each object present in images or videos. Convolutional neural network (CNN) approaches have proved useful, exhibiting the best results in this task. Some challenges remain, however, such as the low resolution of feature maps and the loss of spatial precision, both produced in the last convolution layer of the CNNs. How to solve these problems and obtain consistent results is still an open question for images and even more so for videos, which makes semantic segmentation on video a rather difficult problem. In this Ph.D. project, we propose an hourglass-shaped CNN architecture to address the semantic segmentation task on video. The proposed architecture is end-to-end trainable and extracts spatiotemporal information to discriminate between the several object classes present in a video. The final result of the architecture is thus a densely labeled video. To achieve this goal, we need to: (1) learn meaningful spatiotemporal features that differentiate the objects in the video (by learning convolution kernels) while remaining consistent across frame variations; (2) learn multidimensional up-sampling and fusion kernels that use the predictions of lower-resolution levels and the existing spatiotemporal features to maintain the relations between voxels through the learned nonlinearities; and (3) create an end-to-end learning framework (data augmentation and loss functions) that uses the existing tags (both dense annotations and bounding boxes) on video datasets to train the network.
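
To make the "hourglass" shape concrete, the sketch below traces how feature-map sizes could shrink and then grow back in a symmetric encoder-decoder applied to a video clip. It is only an illustration under stated assumptions, not the architecture of the project: the function name `hourglass_shapes`, the number of levels, the down-sampling factor of 2, and the choice to leave the temporal dimension untouched are all hypothetical.

```python
from typing import List, Tuple

# (frames, height, width) of a feature map; channel count is ignored here.
Shape = Tuple[int, int, int]


def hourglass_shapes(input_shape: Shape, levels: int, factor: int = 2) -> List[Shape]:
    """Trace feature-map sizes through a symmetric encoder-decoder.

    Assumption (not from the abstract): each encoder level divides the
    spatial size by `factor`, each decoder level multiplies it back, and
    the number of frames is kept unchanged.
    """
    frames, height, width = input_shape
    total = factor ** levels
    if height % total or width % total:
        raise ValueError("height and width must be divisible by factor**levels")

    shapes = [input_shape]

    # Encoder: resolution drops, which is where spatial precision is lost.
    for _ in range(levels):
        height, width = height // factor, width // factor
        shapes.append((frames, height, width))

    # Decoder: up-sampling restores the input resolution so that every
    # voxel of the clip can receive a class label.
    for _ in range(levels):
        height, width = height * factor, width * factor
        shapes.append((frames, height, width))

    return shapes


if __name__ == "__main__":
    for shape in hourglass_shapes((8, 64, 64), levels=3):
        print(shape)
```

The narrowest entry in the middle of the list corresponds to the low-resolution feature maps mentioned above, and the last entry matches the input size, which is what a dense (per-pixel, per-frame) labeling requires.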


Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
SAIRE, DARWIN; RIVERA, ADIN RAMIREZ. Empirical Study of Multi-Task Hourglass Model for Semantic Segmentation Task. IEEE ACCESS, v. 9, p. 80654-80670, . (19/18678-0, 19/07257-3, 17/16597-7)
SAIRE, DARWIN; RIVERA, ADIN RAMIREZ. Global and Local Features Through Gaussian Mixture Models on Image Semantic Segmentation. IEEE ACCESS, v. 10, p. 14-pg., . (19/07257-3, 17/16597-7, 19/18678-0)
Academic Publications
(References retrieved automatically from State of São Paulo Research Institutions)
PILCO, Darwin Danilo Saire. Uma análise do espaço latente em modelos encoder-decoder para melhorar o aprendizado de representação para a tarefa de segmentação semântica em imagens [An analysis of the latent space in encoder-decoder models to improve representation learning for the semantic segmentation task on images]. 2022. Doctoral Thesis - Universidade Estadual de Campinas (UNICAMP). Instituto de Computação, Campinas, SP.
