Audio-Visual Speech Processing by Machine Learning

Abstract

This research plan addresses a common basis for several areas of signal processing, namely speech analysis, speech and audio coding, speech recognition and audio feature recognition, and source separation, with regularizations that adapt the methods to the desired application. Traditionally, speech analysis, besides being important in its own right, has supplied the signal representations and model parameters that the other areas require. It is losing appeal in this role with the rise of deep learning, and parallels with deep learning models are to be established in order to provide some interpretation. Beyond the usual types of time-frequency decomposition and modification and autoregressive analysis, new algorithms based on machine learning and deep learning will be explored and proposed for the enhancement, separation and synthesis of speech and audio signals, partially or totally replacing traditional analysis. Research will also focus on generative machines capable of handling video signals and time series. Additionally, the parameters and representations of the speech signal will be used to model and develop non-intrusive speech quality metrics; for this purpose, the speech signal is degraded using different communication system parameters. (AU)

Scientific publications (12)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
DA SILVA, MARIELLE JORDANE; MELGAREJO, DICK CARRILLO; ROSA, RENATA LOPES; RODRIGUEZ, DEMOSTENES ZEGARRA. Speech Quality Classifier Model based on DBN that Considers Atmospheric Phenomena. JOURNAL OF COMMUNICATIONS SOFTWARE AND SYSTEMS, v. 16, n. 1, p. 75-84, . (15/24496-0, 18/26455-8)
SILVA, JUAN CASAVILCA; SAADI, MUHAMMAD; WUTTISITTIKULKIJ, LUNCHAKORN; MILITANI, DAVI RIBEIRO; ROSA, RENATA LOPES; RODRIGUEZ, DEMOSTENES ZEGARRA; AL OTAIBI, SATTAM. Light-Field Imaging Reconstruction Using Deep Learning Enabling Intelligent Autonomous Transportation System. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, v. 23, n. 2, . (18/26455-8)
RIBEIRO, DAVID AUGUSTO; SILVA, JUAN CASAVILCA; LOPES ROSA, RENATA; SAADI, MUHAMMAD; MUMTAZ, SHAHID; WUTTISITTIKULKIJ, LUNCHAKORN; ZEGARRA RODRIGUEZ, DEMOSTENES; AL OTAIBI, SATTAM. Light Field Image Quality Enhancement by a Lightweight Deformable Deep Learning Framework for Intelligent Transportation Systems. ELECTRONICS, v. 10, n. 10, . (18/26455-8)
NUNES, RODRIGO DANTAS; ROSA, RENATA LOPES; RODRIGUEZ, DEMOSTENES ZEGARRA. Performance improvement of a non-intrusive voice quality metric in lossy networks. IET COMMUNICATIONS, v. 13, n. 20, p. 3401-3408, . (15/24496-0, 18/26455-8)
MENDONCA, ROBSON V.; SILVA, JUAN C.; ROSA, RENATA L.; SAADI, MUHAMMAD; RODRIGUEZ, DEMOSTENES Z.; FAROUK, AHMED. A lightweight intelligent intrusion detection system for industrial internet of things using deep learning algorithm. EXPERT SYSTEMS, . (15/24496-0, 18/26455-8)
VIEIRA, SAMUEL TERRA; ROSA, RENATA LOPES; RODRIGUEZ, DEMOSTENES ZEGARRA. A Speech Quality Classifier based on Tree-CNN Algorithm that Considers Network Degradations. JOURNAL OF COMMUNICATIONS SOFTWARE AND SYSTEMS, v. 16, n. 2, p. 180-187, . (15/24496-0, 18/26455-8)
MILITANI, DAVI RIBEIRO; DE MORAES, HERMES PIMENTA; ROSA, RENATA LOPES; WUTTISITTIKULKIJ, LUNCHAKORN; RAMIREZ, MIGUEL ARJONA; RODRIGUEZ, DEMOSTENES ZEGARRA. Enhanced Routing Algorithm Based on Reinforcement Machine Learning-A Case of VoIP Service. SENSORS, v. 21, n. 2, . (19/07665-4, 18/26455-8, 18/12579-7)
BARBOSA, RODRIGO CARVALHO; AYUB, MUHAMMAD SHOAIB; ROSA, RENATA LOPES; RODRIGUEZ, DEMOSTENES ZEGARRA; WUTTISITTIKULKIJ, LUNCHAKORN. Lightweight PVIDNet: A Priority Vehicles Detection Network Model Based on Deep Learning for Intelligent Traffic Lights. SENSORS, v. 20, n. 21, . (19/07665-4, 18/26455-8, 18/12579-7)
HAJAROLASVADI, NOUSHIN; RAMIREZ, MIGUEL ARJONA; BECCARO, WESLEY; DEMIREL, HASAN. Generative Adversarial Networks in Human Emotion Synthesis: A Review. IEEE ACCESS, v. 8, p. 218499-218529, . (19/07665-4, 18/12579-7, 18/26455-8)
TERRA VIEIRA, SAMUEL; LOPES ROSA, RENATA; ZEGARRA RODRIGUEZ, DEMOSTENES; ARJONA RAMIREZ, MIGUEL; SAADI, MUHAMMAD; WUTTISITTIKULKIJ, LUNCHAKORN. Q-Meter: Quality Monitoring System for Telecommunication Services Based on Sentiment Analysis Using Deep Learning. SENSORS, v. 21, n. 5, . (18/26455-8)
RODRIGUEZ, DEMOSTENES Z.; CARRILLO, DICK; RAMIREZ, MIGUEL A.; NARDELLI, PEDRO H. J.; MOELLER, SEBASTIAN. Incorporating Wireless Communication Parameters Into the E-Model Algorithm. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, v. 29, p. 956-968, . (18/26455-8, 15/24496-0)
ROSA, RENATA LOPES; DE SILVA, MARIELLE JORDANE; SILVA, DOUGLAS HENRIQUE; AYUB, MUHAMMAD SHOAIB; CARRILLO, DICK; NARDELLI, PEDRO H. J.; RODRIGUEZ, DEMOSTENES ZEGARRA. Event Detection System Based on User Behavior Changes in Online Social Networks: Case of the COVID-19 Pandemic. IEEE ACCESS, v. 8, p. 158806-158825, . (15/24496-0, 18/26455-8)
