Several recent studies have used artificial intelligence to collect and evaluate secondary health data obtained from electronic hospital records. At the Hospital das Clínicas of the Faculty of Medicine of Botucatu (HCFMB) there is currently a significant demand for the analysis of consultations between medical specialties. The introduction of the computerized system of medical records and consultations created the need for a professional in each specialty to be available to respond to the large volume of consultations. The ultimate objective of this study is to develop a neural network, combined with natural language extraction methods, capable of obtaining structured information from the consultation field of the HCFMB medical records and thereby to create a supervised machine learning model for automated response to consultations, called Cross-Check.
The HCFMB currently uses the SOUL MV Hospitalar electronic medical record system, and the unstructured data will be obtained from the "complementary assessment" field present in these records. Doctors fill in this field to request a consultation from a different specialty; it is a free-text field with no character limit that states the reasons why the patient needs the evaluation. However, the analysis of medical records is complex because of acronyms, negation, grammatical errors, differing descriptive writing styles, inadequate filling of specific structured fields, etc.
In order to enlarge the text corpus, data from the last 5 available medical evolutions will also be collected. Additionally, structured information from the following fields will be considered: sex, race, education, profession, date of birth, marital status, specialty and the International Statistical Classification of Diseases and Related Health Problems (ICD) code. The data will be provided by the Medical Informatics Center of the Hospital das Clínicas of the Faculty of Medicine of Botucatu (CIMED).
Medical record samples without the complementary assessment field will be excluded from this study. For reasons of confidentiality, data identifying the patient or the medical team will not be included.
The Cross-Check model will classify consultations into 4 categories: referral to the specialty outpatient clinic, patient already followed up by the specialty, referral to the health center, or no indication for consultation.
To achieve the proposed objective, it will be necessary to: compare existing text extraction methods and define the most appropriate one; choose the methods that will be used for pattern recognition and for data grouping; develop a neural network capable of collecting data from medical records and turning them into structured elements; assess the quality of the extracted information; and validate the neural network and the proposed model.
To turn the unstructured data from the "complementary assessment" field into a structured database, a Convolutional Neural Network model for Named Entity Recognition will be built. Natural Language Processing techniques will be applied in pre-processing, to prepare the database, and in post-processing, to convert the model output into a binary matrix (records/entities) indicating the presence of each entity in each record.
We will use data from requests that have already been answered, with 70% of the data for training and 30% for validation. It is expected that the Cross-Check predictive model will be able to estimate the probability of each of the 4 categories and make it possible to prioritize the inter-consultation evaluations of the more severe cases. In the future, the model could be scaled to respond to inter-consultations from basic health units to tertiary hospitals with minimal need for human intervention.
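As an illustration only, the binary matrix (records/entities) and the 70%/30% split described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the function names, the example entities, the random seed and the shuffling strategy are all hypothetical, and the entity lists stand in for the output of the Named Entity Recognition model, which is not shown.

```python
import random

# The 4 target categories named in the project description.
CATEGORIES = [
    "referral to the specialty outpatient clinic",
    "patient already followed up by the specialty",
    "referral to the health center",
    "no indication for consultation",
]


def build_presence_matrix(record_entities):
    """Build a binary matrix with one row per record and one column per entity.

    record_entities: list of lists, each holding the entities found in one record
    (assumed here to come from a separate NER step).
    Returns (vocabulary, matrix), where matrix[i][j] is 1 if entity j occurs in record i.
    """
    vocabulary = sorted({entity for entities in record_entities for entity in entities})
    column = {entity: j for j, entity in enumerate(vocabulary)}
    matrix = []
    for entities in record_entities:
        row = [0] * len(vocabulary)
        for entity in entities:
            row[column[entity]] = 1
        matrix.append(row)
    return vocabulary, matrix


def split_70_30(rows, labels, seed=0):
    """Shuffle the records and return 70% for training and 30% for validation."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)  # fixed seed (an assumption) for reproducibility
    cut = round(0.7 * len(order))
    train, valid = order[:cut], order[cut:]
    return (
        [rows[i] for i in train], [labels[i] for i in train],
        [rows[i] for i in valid], [labels[i] for i in valid],
    )


# Hypothetical entities for three records (invented for illustration only).
example_entities = [
    ["diabetes", "insulin"],
    ["hypertension"],
    ["diabetes", "hypertension"],
]
vocabulary, matrix = build_presence_matrix(example_entities)
# vocabulary is ["diabetes", "hypertension", "insulin"]
# matrix is [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
```

Each row of the matrix could then be fed to a classifier together with the structured fields (sex, specialty, ICD code and so on), with the label taken from the response already given to that request.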