Data streams are characterized by the continuous generation of large amounts of data, with varying time intervals between samples. They are ordered sequences of instances that arrive over time and can be of unbounded size. They have gained increasing attention in recent years because of the numerous real-world applications in dynamic environments that produce non-stationary data, for which traditional Data Mining and Machine Learning methods are unsuccessful. The patterns in a data stream can change over time, and such a change can make predictive models outdated and inaccurate. In this scenario, data stream classification is an important task that has stood out, since the model requires constant updates so that its accuracy remains stable despite changes in class distribution over time. Moreover, in real applications, class labels are rarely available for training a prediction model. The present research will develop a data stream classification method, exploring situations of concept drift and limitations such as extreme latency and imbalanced classes. One application to be investigated is the classification of insect vectors of relevant infectious diseases, such as dengue and Zika fever.