The research group headed by this project's supervisor is developing Perseus, a novel technique for handling persistent suffix trees. Perseus has three distinctive properties. First, it constructs persistent suffix trees whose sizes may exceed main memory capacity. Second, it provides an algorithm that lets users specify which substrings of the input string should be indexed, according to the requirements of their applications. Third, it proposes an extended exact-matching algorithm that searches for a query string in suffix trees that may be partitioned. This project aims to extend Perseus so that it can be used to index huge nucleotide sequences. Specifically, it aims to develop a strategy for using Perseus when the memory required to store the string being indexed is larger than main memory capacity. The project also aims to investigate the execution of approximate queries and to carry out experimental tests that allow our work to be compared with related approaches.