In recent years, the growth in data generation and storage has increased the need to create and maintain large databases. In particular, time series databases are frequently used in fields such as medicine, economics and agrometeorology. In this context, applying computational techniques to find useful patterns in databases has become an important task. Classification is one of the most widely used tasks in data mining. However, manual analysis of large-scale time series databases is impracticable, which makes data labeling an expensive process. As a result, usually only a few labeled instances can be obtained, in contrast to the large amount of unlabeled data available. Semi-supervised classification addresses this problem by using both labeled and unlabeled data to build the classifier. The goal of this project is to develop a solution for semi-supervised classification of time series, exploring approaches that combine clustering and classification as well as graph-based approaches. The developed solution will be applied to the classification of vegetation index time series obtained from satellite images of crop plantations. Through this analysis, we aim to identify sugarcane crop areas in Brazil over the last decade. Finally, we expect the results of this work to contribute to the field of temporal data mining and to support research in agrometeorology.