Selection of representative molecules via machine learning

Grant number: 20/05329-4
Support Opportunities: Scholarships in Brazil - Scientific Initiation
Effective date (Start): August 01, 2020
Effective date (End): December 31, 2021
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Marcos Gonçalves Quiles
Grantee: Felipe Vaiano Calderan
Host Institution: Instituto de Ciência e Tecnologia (ICT). Universidade Federal de São Paulo (UNIFESP). Campus São José dos Campos. São José dos Campos, SP, Brazil
Host Company: Universidade de São Paulo (USP). Instituto de Química de São Carlos (IQSC)
Associated research grant:17/11631-2 - CINE: computational materials design based on atomistic simulations, meso-scale, multi-physics, and artificial intelligence for energy applications, AP.PCPE


Analyzing and designing new materials is costly, both experimentally and computationally. Computational methods such as molecular dynamics and Density Functional Theory (DFT) calculations have been used to study chemical compounds. Even with these approaches, however, materials screening remains expensive: because the number of candidate compounds grows exponentially with molecular size and with the variety of materials considered, exhaustive screening becomes prohibitive. Based on the premise that molecules with similar characteristics tend to have similar properties, the cost of accurate simulation can be reduced by selecting and simulating only representative examples. Materials scientists therefore need techniques for selecting representative materials for a given task, so that only a subset of the possible structures has to be examined. In this context, machine learning methods such as clustering are promising alternatives. Because clustering is unsupervised, however, the clusters it forms depend only on the attributes and the similarity function used in the experiments, and may therefore reveal structures that are not suitable for the problem under analysis. This limitation might be addressed by introducing a bias through supervision. To assist materials scientists in the automatic selection of representative examples, this project aims to investigate and implement a data clustering tool biased by external information (supervision). To this end, we will evaluate the combination of an optimization method with the clustering algorithm to build biased clusters that follow the target defined by the specialist.
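
The idea of biasing a clustering with external information can be illustrated with a minimal sketch. It is not the tool developed in this project: the abstract does not name the clustering algorithm or the optimization method, so k-means and a random search over feature weights are assumptions chosen for brevity, and all function names (`kmeans`, `representatives`, `target_spread`, `biased_clustering`) are hypothetical. Each molecule is assumed to be a numeric descriptor vector, and the supervision is assumed to be one known property value per molecule.

```python
import random

def sqdist(p, q):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means (assumed algorithm); returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sqdist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def representatives(points, labels, centroids):
    """Index of the member closest to each non-empty cluster's centroid."""
    reps = []
    for c, centroid in enumerate(centroids):
        members = [i for i, l in enumerate(labels) if l == c]
        if members:
            reps.append(min(members, key=lambda i: sqdist(points[i], centroid)))
    return reps

def target_spread(labels, target, k):
    """Mean within-cluster squared deviation of the external property."""
    total = 0.0
    for c in range(k):
        vals = [t for t, l in zip(target, labels) if l == c]
        if vals:
            mean = sum(vals) / len(vals)
            total += sum((v - mean) ** 2 for v in vals)
    return total / len(target)

def biased_clustering(points, target, k, trials=30, seed=0):
    """Random search over feature weights (assumed optimizer).

    Trial 0 uses unit weights, i.e. the unbiased clustering, so the
    returned score is never worse than the unsupervised baseline.
    Returns (score, weights, labels, representative_indices).
    """
    rng = random.Random(seed)
    dim = len(points[0])
    best = None
    for t in range(trials):
        weights = [1.0] * dim if t == 0 else [rng.random() for _ in range(dim)]
        scaled = [[w * x for w, x in zip(weights, p)] for p in points]
        labels, centroids = kmeans(scaled, k, seed=seed)
        score = target_spread(labels, target, k)
        if best is None or score < best[0]:
            best = (score, weights, labels,
                    representatives(scaled, labels, centroids))
    return best
```

Calling `biased_clustering(descriptors, property_values, k)` returns the indices of the molecules that would be sent to the expensive simulation. The score rewards clusters whose members have similar values of the external property, which is one possible reading of "following the specialist target"; other objectives and optimizers would fit the same structure.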

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
DE MENDONCA, JOAO PAULO A.; CALDERAN, FELIPE, V; LOURENCO, TUANAN C.; QUILES, MARCOS G.; DA SILVA, JUAREZ L. F.. Theoretical Framework Based on Molecular Dynamics and Data Mining Analyses for the Study of Potential Energy Surfaces of Finite-Size Particles. JOURNAL OF CHEMICAL INFORMATION AND MODELING, v. 62, n. 22, p. 10-pg., . (20/05329-4, 18/21401-7, 17/11631-2, 19/23681-0)