An infrastructure for the recommendation of software improvements using large-scale code repositories


A vast amount of source code is currently available in repository hosting services on the Web; GitHub alone hosts millions of open source projects. Developers can use this material to improve their own code. The idea is to first identify functions in the repository that are similar to those in the developer's local project but better implemented in some respect. It would then be possible to recommend different types of improvements to the developer's code, such as performance gains or even automated software repair. In this research project we plan to investigate and develop an infrastructure for recommending software improvements using source code available in large-scale code repositories. The project builds on the proponent's extensive experience with code search, a topic the proponent has worked on in collaboration with Cristina Lopes from UCI for over ten years. As a basis for the project we intend to use Sourcerer, a sophisticated infrastructure for source code analysis and indexing. We also plan to apply an approach recently developed within our group that identifies pairs of methods with similar semantics (which we call functional redundancy). During the project we intend to investigate which types of improvements could be implemented in the infrastructure and, for those that are proposed and implemented, to perform experiments evaluating their applicability. The first type of improvement we will target is automated repair: fixing bugs that may appear in the developer's methods but are absent from a redundant method available in the repository. For these studies we will use a repository of approximately 17,000 Java projects hosted on GitHub, classified as engineered software projects by a recent machine learning approach developed by other researchers, and indexed with Sourcerer.
At the end of the project we expect to develop a prototype that can be useful for the software development community and to conduct experiments using the developed platform and tool. (AU)


Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
LAZZARINI LEMOS, OTAVIO AUGUSTO; SUZUKI, MARCELO; DE PAULA, ADRIANO CARVALHO; LE GOES, CLAIRE. Comparing Identifiers and Comments in Engineered and Non-Engineered Code: A Large-Scale Empirical Study. PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20), 10 pp. (17/27098-1)
