Tegawendé Bissyandé: Free-form and code-to-code search - Leveraging Q&A Data towards Accurate Matching of Diverse Solutions

Time: Tue 2017-09-26, 13:00

Location: Room 1440 (Biblioteket), Lindstedtsvägen 3

Participating: Tegawendé Bissyandé, Research Associate @SnT / Univ. Luxembourg

Abstract:

Code search is an unavoidable activity in software development. Although various approaches have been explored in the literature to support code search tasks, two major issues challenge the accuracy and scalability of state-of-the-art techniques.

First, source code terms such as method names and variable types are often different from conceptual words mentioned in a free-form search query. This vocabulary mismatch problem can make code search inefficient. We will introduce COde voCABUlary (CoCaBu), an approach to resolving the vocabulary mismatch problem when dealing with free-form code search queries. Our approach leverages common developer questions and the associated expert answers to augment user queries with the relevant, but missing, structural code entities in order to improve the performance of matching relevant code examples within large code repositories.
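The idea of augmenting a free-form query with code entities drawn from Q&A data can be sketched roughly as follows. This is only an illustrative toy, not the CoCaBu implementation: the `QA_POSTS` corpus, the word-overlap scoring, and the `augment_query` helper are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical toy Q&A corpus: (question text, code entities from the answer).
QA_POSTS = [
    ("how to read a text file line by line",
     ["BufferedReader", "FileReader", "readLine"]),
    ("read file contents into a string",
     ["Files", "readAllBytes", "Paths"]),
    ("how to sort a list of strings",
     ["Collections", "sort"]),
]


def tokenize(text):
    """Lowercase a free-form text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def augment_query(query, posts=QA_POSTS, top_k=2):
    """Return (query words, code entities taken from the most similar questions).

    Similarity is plain Jaccard word overlap, an assumption for this sketch.
    """
    q = tokenize(query)
    scored = []
    for question, entities in posts:
        t = tokenize(question)
        union = q | t
        score = len(q & t) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, entities))
    # Keep only the top_k most similar questions.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    counts = Counter()
    for _, entities in scored[:top_k]:
        counts.update(entities)
    return sorted(q), [entity for entity, _ in counts.most_common()]


words, entities = augment_query("read a file line by line")
print(words)
print(entities)
```

The augmented query (the original words plus the collected entities) could then be matched against an index of code, so that a fragment using `BufferedReader` becomes reachable even though the user never typed that name.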

Second, a large body of research on searching for code clones has focused on identifying (nearly) similar pieces of code either (1) statically by recognizing syntactic and structural similarities or (2) dynamically by observing when code fragments produce the same outputs for the same inputs or present similar execution traces. Unfortunately, the former approaches cannot detect code fragments that are functionally similar although syntactically different, while the latter approaches cannot scale to the search of (partially) equivalent code fragments in large code bases.
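To make the contrast concrete, here is a small hypothetical pair of fragments that compute the same result with different syntax. The two sources, the crude token-overlap measure, and the `same_outputs` check are assumptions chosen for illustration; they are not taken from any specific clone detector.

```python
import re

# Two hypothetical fragments with the same behavior but different syntax.
SRC_A = """
def total(values):
    result = 0
    for v in values:
        result += v
    return result
"""

SRC_B = """
def total(values):
    return values[0] + total(values[1:]) if values else 0
"""


def token_similarity(a, b):
    """Static view: Jaccard overlap of the two fragments' token sets."""
    ta = set(re.findall(r"\w+|[^\w\s]", a))
    tb = set(re.findall(r"\w+|[^\w\s]", b))
    return len(ta & tb) / len(ta | tb)


def same_outputs(src_a, src_b, inputs):
    """Dynamic view: run both fragments and compare outputs on each input."""
    ns_a, ns_b = {}, {}
    exec(src_a, ns_a)
    exec(src_b, ns_b)
    return all(ns_a["total"](x) == ns_b["total"](x) for x in inputs)


print(token_similarity(SRC_A, SRC_B))
print(same_outputs(SRC_A, SRC_B, [[], [1], [1, 2, 3], [5, -5]]))
```

The token overlap is only partial, so a purely syntactic matcher may miss the pair, while the dynamic check confirms equivalence on the tested inputs but requires executing every candidate, which is what becomes costly on a large code base.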

We present FaCoY (Find Another Code other than Yours), a novel approach for statically finding code fragments that may be semantically similar to user input code. FaCoY implements a query alternation strategy: instead of directly matching code query tokens with code in the search space, FaCoY first attempts to identify other tokens that may also be relevant in implementing the functional behavior of the input code.
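A minimal sketch of what query alternation might look like is given below. The abstract does not spell out the mechanism, so this example assumes, purely for illustration, that alternate tokens come from Q&A snippets attached to similar questions; `POSTS`, `CODE_BASE`, and all function names are hypothetical.

```python
# Hypothetical Q&A posts: (question, tokens of the answer's code snippet).
POSTS = [
    ("read file line by line", {"BufferedReader", "FileReader", "readLine"}),
    ("read file line by line java 8", {"Files", "lines", "Paths"}),
    ("sort a list", {"Collections", "sort"}),
]

# Hypothetical search space: fragment name -> tokens it contains.
CODE_BASE = {
    "A.java": {"Files", "lines", "Paths", "forEach"},
    "B.java": {"Collections", "sort"},
}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def alternate_tokens(query_tokens, posts=POSTS, min_sim=0.3):
    """Collect tokens from snippets whose questions resemble those of
    posts that share at least one token with the code query."""
    matched = [set(q.split()) for q, snip in posts if query_tokens & snip]
    alternates = set()
    for question, snippet in posts:
        words = set(question.split())
        if any(jaccard(words, m) >= min_sim for m in matched):
            alternates |= snippet
    return alternates - query_tokens


def search(code_base, tokens):
    """Rank fragments by the number of tokens they share with the query."""
    hits = [(len(tokens & frag), name)
            for name, frag in code_base.items() if tokens & frag]
    return [name for _, name in sorted(hits, reverse=True)]


query = {"BufferedReader", "readLine"}
alternates = alternate_tokens(query)
print(search(CODE_BASE, query))
print(search(CODE_BASE, query | alternates))
```

In this toy setting, direct token matching finds nothing for the query, whereas the alternated query reaches `A.java`, a fragment that reads a file through a different API.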