
Predictive models for the phases from structure factors of centric reflections in protein crystallography by machine learning

Grant number: 18/23946-0
Support Opportunities: Scholarships in Brazil - Scientific Initiation
Effective date (Start): May 01, 2019
Effective date (End): June 30, 2020
Field of knowledge: Biological Sciences - Biochemistry - Chemistry of Macromolecules
Principal Investigator: Andre Luis Berteli Ambrosio
Grantee: Felipe de Souza Lincoln
Host Institution: Instituto de Física de São Carlos (IFSC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Associated research grant: 13/07600-3 - CIBFar - Center for Innovation in Biodiversity and Drug Discovery, AP.CEPID


The phase problem is a notorious obstacle in protein X-ray crystallography. Fundamentally, technological limitations of the detection systems mean that only intensities are recorded, and the information on the phases of the waves scattered constructively by the components of the crystal is lost. As a consequence, the electron density distribution in the unit cell cannot be calculated directly from the measured data. Currently, two experimental approaches can be applied to overcome this problem: (I) partial replacement of the ordered aqueous solvent by electron-dense ions (metals or halides), or (II) selective quantification of the dispersive (wavelength-dependent) component of the atomic scattering factor. Alternatively, prior information, in the form of known crystal structures that are functionally related or homologous to components of the crystal, may serve as the source of an initial set of phases. Although challenging, these methods, when feasible, have already enabled the determination of more than a hundred thousand atomic models of the most diverse proteins (and their complexes). In this project, building on this collection of structural information already available in the Protein Data Bank (PDB), we propose to analyze the phase problem from the perspective of supervised learning. More precisely, we seek to develop a predictive model for the phases of the structure factors of reflections with phase restrictions, that is, centric reflections. Using the Python programming language, we will assess the feasibility of this approach, mainly with the open-source library XGBoost, and seek to understand its limitations given the extent of the information available in the PDB. The success of our proposal will represent a step forward in the study of the phase problem from the point of view of artificial intelligence.
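
To illustrate why reflections with phase restrictions lend themselves to supervised learning, the following minimal sketch computes structure factors for a toy crystal and shows that the restricted phases reduce to a binary label. It is not the project's pipeline. It assumes space group P2 (unique axis b), where the h0l reflections are centric with phases of 0 or π, and it uses point atoms with constant, made-up scattering factors. All function names are hypothetical.

```python
import cmath
import math
import random


def structure_factor(hkl, atoms):
    """F(hkl) = sum_j f_j * exp(2*pi*i*(h*x_j + k*y_j + l*z_j)) over the unit cell."""
    h, k, l = hkl
    return sum(
        f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
        for f, (x, y, z) in atoms
    )


def p2_cell(n_atoms, seed=0):
    """Toy P2 cell: each random atom (x, y, z) gets a symmetry mate at (-x, y, -z)."""
    rng = random.Random(seed)
    atoms = []
    for _ in range(n_atoms):
        f = rng.uniform(6.0, 16.0)  # assumed constant scattering factor (illustrative)
        x, y, z = rng.random(), rng.random(), rng.random()
        atoms.append((f, (x, y, z)))
        atoms.append((f, (-x, y, -z)))
    return atoms


def binary_phase_label(F, tol=1e-6):
    """Map a centric structure factor to 0 (phase 0) or 1 (phase pi)."""
    if abs(F.imag) > tol * max(1.0, abs(F)):
        raise ValueError("phase is not restricted to 0 or pi")
    return 0 if F.real >= 0 else 1


atoms = p2_cell(50)

# In P2 with unique axis b, the centric zone is h0l (k = 0).
centric = [(h, 0, l) for h in range(-5, 6) for l in range(0, 6) if (h, l) != (0, 0)]
labels = {hkl: binary_phase_label(structure_factor(hkl, atoms)) for hkl in centric}

# A general (acentric) reflection keeps an unrestricted phase.
general_phase = cmath.phase(structure_factor((1, 1, 1), atoms))
```

Under these assumptions, `labels` holds one 0/1 target per centric reflection, which is the kind of quantity a classifier such as XGBoost could be trained to predict from features derived from the measured data. Other space groups restrict centric phases to different pairs of values, so the labeling rule would need to be adapted.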
