
Hierarchical clustering methods for automatic organization of search engines results


Textual information retrieval is traditionally based on keyword search. The search returns a list of documents ordered by their relevance to the query. However, this approach has some well-known limitations. Users generally explore only the first results in the list, to the detriment of documents the search engine considers less relevant. Moreover, a significant part of the information is lost because users have difficulty expressing their search objectives through keywords. In this project, methods for hierarchical clustering of documents are explored to help organize search engine results. Data returned by one or more search engines are organized into groups, in which items that are similar and related to the same topic are placed in the same group. Furthermore, the groups are organized hierarchically, so that the nearer a group is to the root node, the more general the knowledge it represents. The details of a given group and its more specific knowledge are arranged in groups and subgroups at lower levels of the hierarchy. Each group has a succinct description, i.e., a topic that helps the user explore the results at different levels of granularity. This organization based on hierarchical topics makes it easier to find the information of interest and offers a view complementary to the model based on a list ordered by document relevance. On the other hand, clustering search results has specific requirements and challenges. The dynamic nature of the data returned by search engines, the need for computational efficiency, and the demand for interpretation and interaction by users give rise to new requirements. These requirements pose scientific and technological challenges, which are the objectives of this research project. (AU)
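The organization described above can be illustrated with a minimal sketch. It is not the project's method: it assumes a simple bottom-up (agglomerative) clustering with cosine similarity over word counts, and it uses the most frequent terms of each group as its topic description. All names (`tokenize`, `cosine`, `label`, `build_hierarchy`, `outline`) and the stopword list are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math
import re

# Assumption: a tiny illustrative stopword list, not a real linguistic resource.
STOPWORDS = {"a", "an", "the", "of", "for", "and", "in", "to", "with"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def label(vec, k=2):
    """Describe a group by its k most frequent terms (ties broken alphabetically)."""
    ranked = sorted(vec.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

def build_hierarchy(snippets):
    """Repeatedly merge the two most similar groups until one root remains."""
    nodes = [{"vec": Counter(tokenize(s)), "docs": [i], "children": []}
             for i, s in enumerate(snippets)]
    while len(nodes) > 1:
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                sim = cosine(nodes[i]["vec"], nodes[j]["vec"])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        _, i, j = best
        right = nodes.pop(j)  # pop the larger index first so i stays valid
        left = nodes.pop(i)
        nodes.append({
            "vec": left["vec"] + right["vec"],  # summed counts act as a centroid
            "docs": sorted(left["docs"] + right["docs"]),
            "children": [left, right],
        })
    return nodes[0]

def outline(node, depth=0):
    """Return the hierarchy as indented lines: topic terms and member documents."""
    lines = ["  " * depth + ", ".join(label(node["vec"])) + f" (docs {node['docs']})"]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines

if __name__ == "__main__":
    results = [
        "python programming tutorial",
        "python programming language guide",
        "jaguar cat habitat",
        "jaguar cat diet",
    ]
    print("\n".join(outline(build_hierarchy(results))))
```

The pairwise search makes each merge quadratic in the number of groups, which is acceptable for a short result list but would not meet the efficiency requirement mentioned above for larger or continuously changing result sets.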


Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
ROSSI, RAFAEL GERALDELI; LOPES, ALNEU DE ANDRADE; FALEIROS, THIAGO DE PAULO; REZENDE, SOLANGE OLIVEIRA. Inductive Model Generation for Text Classification Using a Bipartite Heterogeneous Network. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, v. 29, n. 3, p. 361-375, . (11/12823-6, 11/23689-9, 11/19850-9)
