Gene expression regulation is an essential cellular process that responds to the mutual dynamics established between an organism and its environment. Beyond the regulatory elements already known, there is growing interest in the regulatory role played by non-coding RNA molecules (ncRNAs), which interact with certain proteins to perform their regulatory functions at the post-transcriptional level. Sm family proteins, present in all three domains of life, are key elements of this regulatory network and have therefore been widely studied and characterized. Laboratory experiments based on immunoprecipitation techniques can identify RNA-protein complexes satisfactorily; however, the time, resources, and personnel they demand make this strategy unfeasible in a broader context, given the number of organisms and functional elements to be characterized. To fill this gap, machine learning techniques have emerged as an alternative approach for predicting protein-RNA interactions. Although related works follow this approach, they have not captured the specificities of the interacting molecules well enough to ensure good predictions. In this context, the objective of this work is to combine different models into a single classifier, in order to exploit the combination of different perspectives on the data through different biases and search representations. To evaluate the methodology, the strategy will be applied to data from the model organism Halobacterium salinarum and compared with other techniques.