Genome centric analysis of the human microbiome: detection of bioindicators and optimization through machine learning

Grant number: 22/03534-5
Support Opportunities: Scholarships abroad - Research Internship - Doctorate (Direct)
Effective date (Start): July 01, 2022
Effective date (End): June 30, 2023
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: André Carlos Ponce de Leon Ferreira de Carvalho
Grantee: Jonas Coelho Kasmanas
Supervisor: Peter Florian Stadler
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Research place: Leipzig University, Germany
Associated to the scholarship: 19/03396-9 - Analysis and classification of human microbiomes: detection of bioindicators and optimization through machine learning, BP.DD


Different microbial communities inhabit the different regions of the human body, such as the mouth, gut, and vagina. They are collectively called the Human Microbiome (HM). These communities participate in processes essential to human health. Thus, a better understanding of the human core microbiome may help the early diagnosis of disorders and contribute to potentially more efficient treatments. Metagenomics, in turn, is one of the best ways to study and understand microbial communities. In this project, I will investigate the use of metagenomics and Machine Learning (ML) to recover Metagenome-Assembled Genomes (MAGs) and unravel their biodiversity, genetic potential, and connection with human health. Metagenomics, the study of a collection of genetic material (genomes) from a mixed community of organisms, generates a large amount of information that is currently underused due to its complexity. At the same time, ML is an efficient way to extract new and useful information and prediction models from complex data. The main goal of this BEPE is to improve the understanding of how the biodiversity and genetic potential of the healthy human microbiome (prokaryotes, eukaryotes, and viruses) can be described. To that end, I will use ML models to differentiate human metagenomic samples from healthy and not-healthy patients. Preliminary results from the student's Ph.D. project should support the development of the internship. In the past year, I created the HumanMetagenomeDB (HMgDB) and selected around 12 thousand human metagenomic samples to serve as data for this internship. The internship will start by updating the HMgDB to include feedback from the domain experts at Leipzig University (UniLeipzig) and the Helmholtz Center for Environmental Research (UFZ). Prof. Peter Stadler's group has wide expertise in developing computational tools to explore biological sequence data.
In addition, UniLeipzig has a close partnership with UFZ and the genome-centric analysis group of Dr. Ulisses Nunes da Rocha. Consequently, the interdisciplinary collaboration among the groups will be of utmost importance for the internship. Next, with the help of the genome-centric analysis group at UFZ, I will recover the MAGs from the metagenomic samples. The recovered genomes come from hundreds of different studies worldwide. Therefore, this BEPE aims to contribute data to the Human Metagenomic Assemble Genomes Consortium (HuMAGs), created during the student's Ph.D. project in partnership with the UFZ experts. The second step is to describe the large-scale set of recovered MAGs together with the experts in the HuMAGs. This step will also define and extract the relevant biological features from the MAGs that will be used in the following step. The third step of the internship is to create ML predictive models that identify the critical features separating healthy from not-healthy status. This step should help define the core genetic structures of a healthy human microbiome in the different body compartments (e.g., gut, skin, and vagina) and establish the ML modeling groundwork for dealing with metagenomic data. Finally, I aim to evaluate the models together with domain experts to identify potential bioindicators of a not-healthy state in the human microbiome. (AU)
