Databases are growing larger and, in many situations, only a small subset of the data items can be labeled. This happens because the labeling process is often expensive and time consuming, and requires the involvement of human experts. Thus, semi-supervised learning techniques, designed for situations in which only a small fraction of the data is labeled, have become very relevant. Several semi-supervised algorithms have been proposed, showing that it is possible to obtain good results using prior knowledge. Among these algorithms, those based on graphs have gained prominence in the area. This interest is justified by the benefits of graph representations, such as the ability to capture the topological structure of the data, to represent hierarchical structures (graphs and subgraphs), and to detect clusters. However, most available data is represented as an attribute-value table, which makes it necessary to study graph construction techniques before such algorithms can be applied. This work focuses on graph-based semi-supervised algorithms that use a weight matrix between the vertices of a sparse graph, since denser graphs can lead to degenerate solutions. Because the generation of the weight matrix and of the sparse graph, and their relation to the performance of the algorithms, have been little investigated in the literature, this project aims to investigate these aspects and to propose new network generation methods that combine the advantages of two or more methods of network construction from attribute-value data, or that consider features not yet explored in the literature. The generated networks will be applied in graph-based semi-supervised learning algorithms.
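To make the idea of building a sparse graph and a weight matrix from an attribute-value table concrete, the following is a minimal sketch of one common baseline: a symmetric k-nearest-neighbor graph with Gaussian edge weights. It is only an illustration, not one of the methods proposed by the project. The function name `knn_graph`, the parameters `k` and `sigma`, and the toy data are all assumptions introduced here.

```python
import math


def knn_graph(rows, k=2, sigma=1.0):
    """Return a symmetric weight matrix for a k-nearest-neighbor graph.

    rows  : list of numeric attribute vectors (one per data item)
    k     : number of neighbors each item selects (hypothetical default)
    sigma : width of the Gaussian kernel used for edge weights (assumed)
    """
    n = len(rows)

    def sqdist(a, b):
        # Squared Euclidean distance between two attribute vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Start with no edges; a zero weight means "not connected".
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Rank every other item by its distance to item i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sqdist(rows[i], rows[j]),
        )
        for j in others[:k]:
            w = math.exp(-sqdist(rows[i], rows[j]) / (2.0 * sigma ** 2))
            # Keep the edge if either endpoint selected the other.
            W[i][j] = w
            W[j][i] = w
    return W


# Toy attribute-value table with two well-separated groups of items.
data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
W = knn_graph(data, k=2)
```

With this toy data each item selects neighbors only inside its own group, so the resulting matrix is sparse and has no edges between the two groups. The choice of `k`, of the kernel, and of the symmetrization rule all change the graph, which is the kind of design decision the project sets out to study.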