
Incorporating the semantics into the websensors construction process

Grant number: 13/14757-6
Support Opportunities: Scholarships in Brazil - Doctorate
Effective date (Start): December 01, 2013
Effective date (End): May 31, 2018
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Solange Oliveira Rezende
Grantee: Roberta Akemi Sinoara
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Associated scholarship(s): 16/07620-2 - Semantic Representation for Text Classification, BE.EP.DR


Text Mining techniques that support knowledge discovery become essential as the volume and variety of digital text documents grow, whether in social networks, on the web, or inside organizations. The possible applications of Text Mining are as varied as the text sources. Applications and research efforts have been developed with the goal of using the web as a powerful social sensor. In this context, websensors arise as sensors that monitor the publication of text documents and maintain a time series for certain topics. Websensors are widely applicable: depending on the documents monitored, the activity of a websensor can support the understanding, explanation, or prediction of a fact. Websensors can be built from text clustering, which avoids the need for large amounts of labeled data or for intensive effort by a domain specialist to define the sensors' parameters. However, the semantic aspects of the texts can be crucial to the quality and effective use of the extracted clusters. Learning good websensors may require, for instance, a text organization that distinguishes documents which, despite using the same vocabulary, present different ideas about the same subject. Text Mining research has advanced considerably in recent years; however, semantics remains a challenge for the field. Motivated by this gap, this PhD research project aims to incorporate semantics into the websensor construction process, achieving a more refined organization that considers the ideas expressed in the documents. A new text representation format will be developed to capture semantic aspects. In addition, clustering algorithms will be developed or adapted to make effective use of the semantic representation.
Although this project focuses on including semantics in the construction of websensors through clustering methods, its results can later be extended to other Text Mining tasks, such as document classification and sentiment analysis. (AU)
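
As an illustration only, the idea of a websensor built from text clustering can be sketched as follows: documents are grouped by vocabulary overlap, and each cluster keeps a count of documents per time period. This is not the project's method. The function names, the greedy "leader" clustering, the bag-of-words representation, and the 0.3 similarity threshold are all assumptions chosen to keep the sketch self-contained. Because it compares vocabulary only, it would not separate documents that use the same words to express different ideas, which is the gap the project addresses.

```python
from collections import Counter, defaultdict
from math import sqrt


def bag_of_words(text):
    """Represent a document as lowercase term counts (no semantics)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def leader_clustering(vectors, threshold=0.3):
    """Greedy clustering (an assumed stand-in for any text clustering method).

    Each document joins the most similar existing leader if the similarity
    reaches the threshold; otherwise it becomes the leader of a new cluster.
    """
    leaders, labels = [], []
    for vec in vectors:
        best, best_sim = None, threshold
        for idx, leader in enumerate(leaders):
            sim = cosine(vec, leader)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            leaders.append(vec)
            best = len(leaders) - 1
        labels.append(best)
    return labels


def websensor_series(dated_docs, threshold=0.3):
    """Return {cluster_id: {period: document_count}} for (period, text) pairs."""
    vectors = [bag_of_words(text) for _, text in dated_docs]
    labels = leader_clustering(vectors, threshold)
    series = defaultdict(Counter)
    for (period, _), label in zip(dated_docs, labels):
        series[label][period] += 1
    return {label: dict(counts) for label, counts in series.items()}
```

Calling `websensor_series` on a list of `(period, text)` pairs yields one time series per cluster; each series plays the role of a single websensor's activity over time.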


Scientific publications (4)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
SINOARA, ROBERTA A.; CAMACHO-COLLADOS, JOSE; ROSSI, RAFAEL G.; NAVIGLI, ROBERTO; REZENDE, SOLANGE O.. Knowledge-enhanced document embeddings for text classification. KNOWLEDGE-BASED SYSTEMS, v. 163, p. 955-971. (16/17078-0, 13/14757-6, 16/07620-2)
SINOARA, ROBERTA A.; SCHEICHER, RICARDO B.; REZENDE, SOLANGE O.; IEEE. Evaluation of Latent Dirichlet Allocation for Document Organization in Different Levels of Semantic Complexity. 2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), v. N/A, p. 8-pg. (13/14757-6)
SINOARA, ROBERTA A.; ROSSI, RAFAEL G.; REZENDE, SOLANGE O.; IEEE. Semantic Role-based Representations in Text Classification. 2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), v. N/A, p. 6-pg. (16/07620-2, 14/08996-0, 11/12823-6, 13/14757-6)
SINOARA, ROBERTA A.; SUNDERMANN, CAMILA V.; MARCACINI, RICARDO M.; DOMINGUES, MARCOS A.; REZENDE, SOLANGE O.; ALMEIDA, A; BERNARDINO, J; GOMES, EF. Named Entities as Privileged Information for Hierarchical Text Clustering. PROCEEDINGS OF THE 18TH INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM (IDEAS14), v. N/A, p. 10-pg. (13/16039-3, 13/14757-6, 10/20564-8, 12/13830-9)
Academic Publications
(References retrieved automatically from State of São Paulo Research Institutions)
SINOARA, Roberta Akemi. Semantic aspects in the representation of texts for automatic classification. 2018. Doctoral Thesis - Universidade de São Paulo (USP). Instituto de Ciências Matemáticas e de Computação (ICMC/SB) São Carlos.
