Earth System Science Interdisciplinary Center, University of Maryland, College Park, USA
Department of Computer Science, Wayne State University, Detroit, USA
Article info
Article history:
Received 21 June 2012
Received in revised form 28 September 2012
Accepted 30 September 2012
Available online 13 October 2012
Abstract
Context: The functionality of a software system is most often expressed in terms of concepts from its
problem or solution domains. The process of finding where these concepts are implemented in the source
code is known as concept location and it is a prerequisite of software change.
Objective: We investigate a static approach to concept location named DepIR that combines program
dependency search (DepS) with information retrieval-based search (IR). In this approach, programmers
explore the static program dependencies of the source code components retrieved by the IR search
engine.
Method: The paper presents an empirical study that compares DepIR with its constituent techniques. The
evaluation is based on an empirical method of reenactment that emulates the steps of concept location
for 50 past changes mined from software repositories of five software systems.
Results: The results of the study indicate that DepIR significantly outperforms both DepS and IR.
Conclusion: DepIR allows developers to perform concept location efficiently. It allows finding concepts
even with queries that do not rank the relevant software components highly. Since formulating a good
query is not always easy, this tolerance of lower-quality queries significantly broadens the usability of
DepIR compared to traditional IR.
© 2012 Elsevier B.V. All rights reserved.
Keywords:
IR
Dependency search
Concept location
1. Introduction
Concept (or feature) location is one of the most common program comprehension activities and it identifies parts of the source
code that implement particular concepts. Concept location has
been addressed in the context of many software engineering tasks
[1,2]; it is an important and well-defined part of the software
change process [3].
In this process, concept location starts with the change request,
which can be a new feature request or a bug report, and identifies
the location in the code where the core of the change needs to be
made. While concept location identifies the specific code snippet
where the change starts, the full extent of the software change is
established via impact analysis, which uses the results of concept
location as an input and identifies all the remaining locations in
source code that should be modified. These two activities are recognized to be complementary to each other, yet they are methodologically different [4,5].
Developers start concept location by extracting relevant concepts from the change request. Then, they locate them in the
E-mail addresses: maksym@umd.edu (M. Petrenko), rajlich@wayne.edu
0950-5849/$ - see front matter 2012 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.infsof.2012.09.013
2. Related work
Based on types of analyses and underlying information, existing approaches to concept location can be classified as static, dynamic, and hybrid [6]. In this work, we study techniques that
belong to the category of static concept location [2,7]. One of
the most popular approaches to static concept location is textual
pattern matching, an approach where the programmers use a tool
(e.g. grep [8]) to search the source code for patterns that match
queries, formulated as regular expressions. Petrenko et al. described how ontology fragments can be used to formulate and
improve the quality of these queries [9]. IR-based approaches
provide an alternative to the pattern matching tools and rely on
queries, formulated in terms of natural language; we provide a
detailed review and discussion of IR approaches in Section 2.1.
Other static techniques are based on using program dependencies
that guide the search process; we discuss these techniques in
Section 2.2.
Source code components (e.g., classes, methods, fields, etc.) relate to one another via data and control flow. These relationships
between the software components can be modeled as a graph, where
the nodes represent the components and the edges represent
dependencies between these components (e.g., method calls,
inheritance, declarations of abstract data types, etc.). Program
dependency search (DepS) is a concept location technique where
the programmers traverse such a graph of a software system in a
depth-first manner, starting from a known location [12]. The following steps describe the process of DepS. The first step is performed once, while the second step iterates until the concept is
located.
3. DepIR
DepIR is a hybrid concept location technique that leverages
the strengths of each constituent technique against the weaknesses of the other. Compared to the stand-alone DepS and IR, DepIR gives the users more options for conducting the search and hence
it allows locating concepts that can be missed by these two techniques alone.
One of the weaknesses of IR is that it constrains the user to
navigate the software in the order of the results returned to a given
user query. This ordering, while semantically relevant, is void of
structural information. Thus, two software components A and B,
which are structurally strongly related, may not appear adjacent
in the ranked list of results. While investigating A, the developer
should be aware of or at least pointed to B. In DepIR, structural
relationships are preserved and presented to the user together
with the text retrieval results.
On the other hand, when searching the dependency graph using
DepS, some dependencies are hard to detect with the available program analysis tools, for example database dependencies, reflection,
and so forth. Hence, the dependency graph may not contain a path
to the node implementing the concept, making this node (location)
hard to reach. Moreover, in large systems, the path in the graph
from the entry point could be long and developers may need to
backtrack several times. DepIR overcomes these limitations by
using IR to query source code and identify a relevant starting point
for the search, and also by ranking neighbors of nodes in the
dependency graph by their relevance to the user query.
DepIR combines IR with DepS. It guides developers by maintaining a global context and a local context. The global context relates to
the change request and is captured in the user query that is handled by information retrieval. The local context relates to the
source code dependencies and allows browsing to the immediate
neighbors.
Although a different dependency graph can be used, we describe DepIR based on the Method Dependency Graph (MDG). In
the MDG, the nodes of the graph are program methods and the dependencies are calls between these methods, as identified by traversing
the Eclipse Abstract Syntax Tree (AST) [31] and extracting method
bindings [32].
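As a rough illustration of the data structure, an MDG can be held as an adjacency mapping built from caller/callee pairs. The sketch below is not the Eclipse-based extraction used in this work: the function name `build_mdg`, the method names, and the choice to store each call in both directions (so that a developer can browse to callers as well as callees) are assumptions made for the example.

```python
from collections import defaultdict


def build_mdg(calls):
    """Build a method dependency graph from (caller, callee) pairs.

    Assumption: each call is stored in both directions so that both
    callers and callees count as neighbors of a method.
    """
    graph = defaultdict(set)
    for caller, callee in calls:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


# Hypothetical call pairs, for illustration only.
calls = [("main", "init"), ("init", "load_preferences"), ("main", "run")]
mdg = build_mdg(calls)
print(sorted(mdg["main"]))
print(sorted(mdg["init"]))
```

In a real setting the call pairs would come from a static analysis of the source code rather than from a hand-written list.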
The following steps summarize DepIR and are illustrated by
Fig. 1:
1. Initial indexing of documents and extraction of program dependencies: The source code of the software system is parsed and
all methods and their dependencies defined in the code are
extracted to build the MDG as explained above. The corpus of the
software system is created and indexed with an IR technique
as explained in Section 2.1. This part of the process is fully automatic (i.e., no user input is required) and is performed every time
major changes are applied to the system.
2. Ranking methods: The developers formulate a text query that
describes the target concept based on their knowledge of the
code and the description of the change request. During concept
location, they may learn new facts about the code and return to
this step and reformulate the query. Once the query is formulated, the IR engine ranks all methods in descending order
based on their relevance (i.e., textual similarity) to the query.
Fig. 1. Example of DepIR workflow: (1) Indexing methods and extracting their dependencies; circles represent methods and lines represent dependencies. (2) Formulating a
query and ranking the methods; numbers indicate the ranks of the methods. (3) Investigating the results of the query; the method with the highest rank is shaded in gray. (4)
Exploring the neighborhood of the highly-ranked methods using dependency search, as highlighted by the bold edges, and locating the concept in the methods shaded in black.
3. Selecting a promising method: The programmer inspects these
ordered results and determines their relevance to the change
task. For each method, the following decisions and actions are
made:
a. If the method is the place that will change, then concept
location ends and the developer continues with the next
phase of the software change (not discussed in this paper).
b. If the method is irrelevant to the task at hand, the programmers inspect the method with the next highest rank, or return
to step #2 and reformulate the query.
c. If the method is related to the concept's implementation, yet
it is not where changes are to be made, then the programmers consider this method to be the focus method and continue
with step #4.
4. Navigating program dependencies: The developers explore the
MDG by investigating methods in the neighborhood of the focus
method. The adjacent methods are ranked based on their similarity to the user query. For each neighbor, the developer has
the following options:
a. If the method will be the one that changes, then concept
location ends and the developer continues with the next
phase of the software change (not discussed here).
b. If the method is not the one that will change, then navigate
to an adjacent method or backtrack.
c. Return to step #2 and reformulate the query.
d. Stop the navigation; return to the IR search results from step
#3 and continue to investigate the list of retrieved methods.
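The ranking in step 2 and the neighbor ranking in step 4 can be sketched as follows. This is only an illustration under stated assumptions: plain term-frequency cosine similarity stands in for the IR technique of Section 2.1, each method is represented by a short text, and all names in the example (`rank_methods`, `rank_neighbors`, the toy corpus and graph) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(query, document):
    """Cosine similarity between the term-frequency vectors of two texts."""
    q, d = Counter(tokenize(query)), Counter(tokenize(document))
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0


def rank_methods(query, corpus):
    """Step 2: order all methods by descending similarity to the query."""
    return sorted(corpus, key=lambda m: (-cosine(query, corpus[m]), m))


def rank_neighbors(query, focus, graph, corpus):
    """Step 4: order the neighbors of the focus method by similarity."""
    return sorted(graph.get(focus, ()),
                  key=lambda m: (-cosine(query, corpus.get(m, "")), m))


# Toy corpus and dependency graph (hypothetical, for illustration only).
corpus = {
    "handle_drop": "drop drag event open file",
    "open_file": "open file path buffer",
    "paint": "paint screen",
}
graph = {"handle_drop": {"open_file", "paint"}}

query = "open file drag drop"
print(rank_methods(query, corpus))
print(rank_neighbors(query, "handle_drop", graph, corpus))
```

Ties are broken alphabetically here only to make the ordering deterministic; an actual IR engine would apply its own scoring and tie-breaking.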
DepIR keeps track of the entire search and navigation path to
guide the developer and to avoid repetitions. Once the programmer
examines a method during concept location and decides that it
does not implement the target concept, DepIR marks the method
and notifies the programmers if they attempt to visit it a second
time. This ensures that the same method is not inspected twice if
there are cycles in the MDG.
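The bookkeeping described above can be sketched with a small helper that records the navigation path and remembers rejected methods. The class and method names below are hypothetical and serve only to illustrate the idea; they do not describe the actual tool.

```python
class NavigationLog:
    """Records visited methods and flags revisits of rejected ones."""

    def __init__(self):
        self.path = []          # every visit, in order
        self.rejected = set()   # methods judged not to implement the concept

    def visit(self, method):
        """Record a visit; return True if the method was rejected earlier."""
        already_rejected = method in self.rejected
        self.path.append(method)
        return already_rejected

    def reject(self, method):
        """Mark a method as not implementing the target concept."""
        self.rejected.add(method)


log = NavigationLog()
log.visit("handle_drop")
log.reject("handle_drop")
if log.visit("handle_drop"):
    print("handle_drop was already inspected and rejected")
```

Because a revisit is reported rather than forbidden, the developer can still pass through a rejected method when backtracking in a cyclic graph.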
3.1. Example of locating a concept using DepIR
A practical example of DepIR is locating the concept "Open a file by
dragging and dropping it into the editor" in jEdit. Note that jEdit is
also used in the case study of Section 4.
The concept location begins by formulating the following DepIR
query: "open file using drag and drop". The top ten methods returned by IR and their relevance scores are isDragInProgress
(0.75), isDragEnabled (0.71), startDragAndDrop (0.66), setDragInProgress (0.64), drop (0.59), setDragEnabled (0.58), dragEnter
(0.48), two overloaded openFile (0.47) methods in the jEdit class,
and the insideSelection (0.46) method in the SelectionManager
class. Upon inspecting the top ranked methods in the TexEdit class,
the programmer concluded that although these methods are rele-
System     Classes  Methods  Revisions total
Adempiere  4116     74,930   5581             954
DrJava     1147     20,193   3197             440
JabRef     632      5768     2300             35
jEdit      536      7296     3796             500
Megamek    1268     12,793   5910             1664
Fig. 2. Reenactment of concept location techniques. The gray shading indicates methods inspected to locate the concept, black shading represents a method containing the
concept, numbers are IR ranks of the methods, bold edges are dependencies followed while exploring the dependency graph, and M marks the main() method. (a)
Dependency search is reenacted as the shortest path in the dependency graph between the main() method and the method containing the concept. (b) Reenactment of IR
explores methods in the order suggested by their ranks until the concept is located. (c) Reenactment of DepIR inspects methods in the order suggested by their ranks until a
method is selected as the starting point of dependency search; after this, the shortest path is computed from this starting method to the concept.
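The three reenactments described in the caption of Fig. 2 can be sketched as follows. This is a minimal illustration, not the tooling used in the study: the graph is a plain adjacency mapping, breadth-first search stands in for a shortest-path algorithm on an unweighted graph, the starting method for DepIR is passed in explicitly, and counting every method on a path (including its first method) as one inspection is an assumption. All function and method names are hypothetical.

```python
from collections import deque


def shortest_path_edges(graph, source, target):
    """Number of edges on a shortest path (BFS), or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def effort_deps(graph, main, target):
    """(a) Methods on the shortest path from main() to the concept."""
    edges = shortest_path_edges(graph, main, target)
    return None if edges is None else edges + 1


def effort_ir(ranking, target):
    """(b) Methods inspected in rank order until the concept is reached."""
    return ranking.index(target) + 1


def effort_depir(graph, ranking, start, target):
    """(c) Ranked methods inspected up to the start, plus the path from it."""
    edges = shortest_path_edges(graph, start, target)
    return None if edges is None else ranking.index(start) + 1 + edges


# Toy graph and ranking (hypothetical, for illustration only).
graph = {"main": ["a"], "a": ["main", "b"], "b": ["a", "c"], "c": ["b"]}
ranking = ["b", "a", "main", "c"]
print(effort_deps(graph, "main", "c"))
print(effort_ir(ranking, "c"))
print(effort_depir(graph, ranking, "b", "c"))
```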
Table 2
Aggregate results for the five systems.
            Inspected methods
Statistics  DepS  IR   DepIR
Average     4.3   190  2.98
Median      4     10   3
StDev       0.97  647  1.23
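As a cross-check, the averages above can be recomputed from the per-change numbers of inspected methods reported for DepS, IR, and DepIR. The lists below are transcribed from those per-change results, in the order Adempiere, DrJava, JabRef, jEdit, Megamek; the variable names are ours and the snippet is only a sanity check, not part of the study's tooling.

```python
from statistics import mean

# Inspected methods per change, ten changes per system
# (Adempiere, DrJava, JabRef, jEdit, Megamek).
deps = [5, 6, 5, 5, 5, 5, 5, 5, 5, 5,
        4, 4, 4, 3, 3, 6, 4, 4, 3, 5,
        3, 5, 5, 4, 4, 4, 5, 4, 4, 3,
        3, 3, 5, 6, 3, 3, 3, 3, 4, 3,
        5, 3, 5, 3, 5, 5, 5, 5, 6, 5]
ir = [13, 21, 3, 2, 7, 1, 6, 7, 4, 1,
      2, 44, 53, 3, 4, 585, 4, 2, 11, 2108,
      1, 3, 99, 11, 3, 14, 27, 6, 53, 104,
      1, 1014, 10, 17, 253, 416, 1, 14, 3997, 485,
      3, 2, 4, 2, 28, 31, 10, 11, 1, 20]
depir = [4, 5, 3, 2, 3, 1, 3, 5, 4, 1,
         2, 5, 5, 3, 3, 2, 3, 2, 5, 5,
         1, 3, 4, 3, 2, 3, 4, 4, 3, 2,
         1, 5, 2, 2, 5, 3, 1, 4, 3, 4,
         3, 2, 3, 2, 2, 3, 2, 3, 1, 3]

averages = {"DepS": mean(deps), "IR": mean(ir), "DepIR": mean(depir)}
for technique, value in averages.items():
    print(f"{technique}: {value:.2f}")
```

Rounded, the recomputed averages agree with the Average row of the table.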
Fig. 3. Plot with effort data for the 50 changes and the three techniques. Legend: Dep Search, IR, DepIR. The changes are grouped by system (Adempiere, Megamek, DrJava, JabRef, jEdit) and labeled by revision number; the effort axis carries the tick labels 10, 100, 1000, and 10000.
Inspected methods
System     Revision  DepS  IR    DepIR  Min  Max   Mean  Median  StDev  Min
Adempiere  7387      5     13    4      2    107   35    17      34.8   1
Adempiere  7630      6     21    5      2    2793  572   8       571.8  1
Adempiere  8052      5     3     3      2    140   58    46      57.8   Located
Adempiere  8136      5     2     2      2    1156  306   33      305.8  Located
Adempiere  8189      5     7     3      2    54    22    17      21.8   28
Adempiere  8228      5     1     1      2    1128  298   32      297.8  Located
Adempiere  8275      5     6     3      2    610   169   33      168.8  5
Adempiere  8739      5     7     5      2    414   112   17      111.8  2
Adempiere  9289      5     4     4      2    1049  279   33      278.8  39
Adempiere  9400      5     1     1      2    97    33    17      32.8   Located
DrJava     1294      4     2     2      10   45    24    17      24.0
DrJava     1578      4     44    5      3    111   42    13      42.0
DrJava     4305      4     53    5      7    188   68    9       68.0
DrJava     4637      3     3     3      8    228   118   118     118.0
DrJava     4744      3     4     3      7    309   158   158     158.0
DrJava     4810      6     585   2      3    116   29    8       28.6
DrJava     4927      4     4     3      8    248   89    11      89.0
DrJava     4928      4     2     2      7    237   86    16      86.0
DrJava     4966      3     11    5      8    14    11    11      11.0
DrJava     838       5     2108  5      3    72    22    7       21.8
JabRef     1002      3     1     1      17   47    32    32      32.0
JabRef     1336      5     3     3      1    82    29    17      28.8
JabRef     1698      5     99    4      1    405   106   10      105.8
JabRef     1728      4     11    3      1    91    35    14      34.7
JabRef     1737      4     3     2      1    411   142   14      141.7
JabRef     1745      4     14    3      1    413   142   14      141.7
JabRef     1812      5     27    4      1    55    19    10      18.8
JabRef     1831      4     6     4      1    424   146   15      145.7
JabRef     1909      4     53    3      1    58    24    15      23.7
JabRef     2383      3     104   2      1    15    8     8       7.5
jEdit      11,176    3     1     1      49   274   161   161     161.0
jEdit      12,522    3     1014  5      7    49    28    28      28.0
jEdit      13,899    5     10    2      14   102   51    45      51.0
jEdit      14,898    6     17    2      4    107   38    15      37.8
jEdit      15,732    3     253   5      53   107   80    80      80.0
jEdit      5312      3     416   3      42   239   140   140     8.4
jEdit      5571      3     1     1      9    44    26    26      26.0
jEdit      6686      3     14    4      42   237   139   139     139.0
jEdit      7074      4     3997  3      11   248   100   42      100.0
jEdit      7629      3     485   4      46   252   149   149     149.0
Megamek    5085      5     3     3      13   45    28    27      28.0   Located
Megamek    5573      3     2     2      13   48    30    30      30.0   Located
Megamek    5583      5     4     3      7    100   33    12      33.0   38
Megamek    5650      3     2     2      13   48    30    30      30.0   Located
Megamek    5656      5     28    2      13   168   68    46      68.0   61
Megamek    5700      5     31    3      13   101   51    46      51.0   15
Megamek    5912      5     10    2      13   314   98    33      98.0   13
Megamek    5934      5     11    3      13   314   98    33      98.0   7
Megamek    6459      6     1     1      2    75    31    14      30.8   Located
Megamek    6473      5     20    3      13   359   109   33      109.0  51
seven out of 50 cases, with three cases in DrJava and four in jEdit.
A closer analysis of the case study logs showed that the main()
method of jEdit resides in the same class as a collection of methods that provide centralized access to the common libraries and
preferences of jEdit. Because of this design, the client methods
that call the library and preference methods are very close in
the MDG to main(). The outlier cases required modification of such
client methods, resulting in the observed performance of DepS.
We also found a similar design in DrJava, where the main()
method is used to load common libraries and initialize program
preferences; this design peculiarity also resulted in the outlier
cases.
Max
6650
6664
with IR
with IR
2836
with IR
331
27
1049
with IR
with IR
with IR
268
with IR
61
70
13
77
with IR
359
Mean
part
part
part
part
2218
2378
only
only
1432
only
168
10
544
only
part only
part only
153
part only
61
42
13
42
part only
205
Median
StDev
4
1424
2217.7
2377.8
1432
1432.0
168
7
544
168.0
9.5
544.0
9.5
11
32.8
8.8
5.5
6
3
3
13.5
5
5.0
6.0
3.0
3.0
39.0
20.8
48.5
10
207
22
51
7
5
9
15
48.0
140.0
207.0
22.0
51.0
35.0
3.7
9.0
15.0
11.5
4
23
14.5
3.5
76.8
4.0
23.0
33.8
18.0
8
8.5
21
82.0
8.0
36.0
153
153.0
61
42.5
13
42
61.0
42.0
13.0
42.0
205
205.0
studied cases DepIR required inspecting no more than five methods, we conclude that this assumption had no effect on the results
of the study.
We assessed the effort required to locate the concepts by
recording the number of inspected methods, which may pose a
threat to external validity. Depending on the goals of a particular
maintenance process, other measures of effort, such as the time
that programmers needed to locate the concepts, might provide a
more suitable insight into the difference between the compared
techniques. Furthermore, different change requests and software
systems might require a different effort to locate the concepts. A
specific software system might have classes with more or less
descriptive terms, which can impact the effectiveness of both IR
and DepIR.
In the studies, all of the selected change requests described bugs,
that is, program functionalities that can be viewed as unwanted features
[10,38]. Other types of changes, such as adding a new functionality
or modifying an existing functionality, may require a different effort. Furthermore, the selected change requests described explicit
features. We expect that DepS and DepIR will have an advantage
over IR when locating implicit features, since such features are implied but not explicitly expressed in the code and, therefore, can be
hard or impossible to locate by querying source code for specific
terms [43].
The systems we studied are open source and may not be representative of software developed by other processes. However, it
should be noted that, for example, Megamek was developed by
29 programmers and 75% of its code was contributed by a core
of five developers, where each of these core developers contributed
over 5% of the total code. Adempiere was developed by 53 developers and a core of 6 developers contributed 80% of the total code
base, where each of the core developers contributed over 5% of the
code. In this way, Adempiere and Megamek display characteristics similar to industrial software, where the bulk of the code is
developed by a relatively small and stable team of developers; this
is different from a more typical open-source software process,
where the code is produced by large and spontaneous groups of
independent contributors [44,45].
5. Conclusions and future work
In this paper, we proposed and discussed DepIR, a concept location technique that combines program dependency search and
Information Retrieval based search approaches. Our case study
indicates that concept location using DepIR requires a significantly
smaller effort than the concept location using DepS or IR alone.
Furthermore, the comparative analysis of results of the concept
location using IR and DepIR indicates that DepIR allows finding
concepts even with queries that do not rank the relevant methods
highly. Since formulating a good query is not always trivial, this
tolerance of lower-quality queries significantly broadens the
usability of DepIR.
In the future, we plan to investigate which specic software
characteristics make DepIR a better choice for concept location
technique compared to other concept location techniques. We
would also like to investigate whether the concept location process
can be further improved by combining DepIR with the recent techniques that, based on IR, generate a sub-graph of the dependency
graph as discussed in Section 4, i.e., [17-21].
Acknowledgments
We would like to thank Andrian Marcus and Denys Poshyvanyk
for their extensive input on IR techniques and help with the early
versions of this paper. This work was partially supported by the
[22] N.E. Gold, M. Harman, D. Binkley, R.M. Hierons, Unifying program slicing and
concept assignment for higher-level executable source code extraction,
Journal of Software Practice and Experience 35 (2005) 977-1006.
[23] R. Al-Ekram, K. Kontogiannis, Source code modularization using lattice of
concept slices, in: The Eighth Euromicro Working Conference on Software
Maintenance and Reengineering, 2004, pp. 195-203.
[24] N.E. Gold, M. Harman, Z. Li, K. Mahdavi, Allowing overlapping boundaries in
source code using a search based approach to concept binding, in: IEEE
International Conference on Software Maintenance, 2006, pp. 310-319.
[25] E. Hill, L. Pollock, K. Vijay-Shanker, Exploring the neighborhood with Dora to
expedite software maintenance, in: IEEE/ACM International Conference on
Automated Software Engineering, 2007, pp. 14-23.
[26] A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, An exploratory study of how
developers seek, relate, and collect relevant information during software
maintenance tasks, IEEE Transactions on Software Engineering (TSE) 32 (2006)
971-987.
[27] J. Sillito, G.C. Murphy, K. De Volder, Asking and answering questions during a
programming change task, IEEE Transactions on Software Engineering 34
(2008) 1-18.
[28] J. Sillito, K. De Volder, B. Fisher, G.C. Murphy, Managing software change tasks:
an exploratory study, in: International Symposium on Empirical Software
Engineering (ISESE 2005), Noosa Heads, Australia, 2005, pp. 23-32.
[29] G. Marchionini, Information Seeking in Electronic Environments, Cambridge
University Press, Cambridge, United Kingdom, 1997.
[30] M.P. Robillard, W. Coelho, G.C. Murphy, How effective developers investigate
source code: an exploratory study, IEEE Transactions on Software Engineering
30 (2004) 889-903.
[31] T. Kuhn, O. Thomann, Abstract Syntax Tree, in: Eclipse Corner Articles, 2007.
[32] M. Petrenko, V. Rajlich, Variable granularity for improving precision of impact
analysis, in: IEEE International Conference on Program Comprehension, 2009,
pp. 10-19.
[33] D. Poshyvanyk, A. Marcus, Y. Dong, JIRiSS - an Eclipse plug-in for source code
exploration, in: 14th IEEE International Conference on Program
Comprehension (ICPC'06), Athens, Greece, 2006, pp. 252-255.
[34] A.E. Hassan, R.C. Holt, Replaying development history to assess the
effectiveness of change propagation tools, Empirical Software Engineering 11
(2006) 335-367.
[35] G. Canfora, L. Cerulo, Impact analysis by mining software and change request
repositories, in: 11th IEEE International Symposium on Software Metrics,
2005, pp. 9-29.
[36] G. Antoniol, G. Canfora, G. Casazza, A. De Lucia, Identifying the starting impact
set of a maintenance request: a case study, in: European Conference on
Software Maintenance and Reengineering, IEEE Computer Society, 2000, pp.
227-230.
[37] C. Jensen, W. Scacchi, Discovering, modeling, and reenacting open source
software development processes, new trends in software process modeling,
Series in Software Engineering and Knowledge Engineering 18 (2006) 1-20.
[38] D. Liu, A. Marcus, D. Poshyvanyk, V. Rajlich, Feature location via information
retrieval based filtering of a single scenario execution trace, in: 22nd IEEE/ACM
International Conference on Automated Software Engineering (ASE'07),
Atlanta, Georgia, 2007, pp. 234-243.
[39] E.W. Dijkstra, A note on two problems in connexion with graphs, Numerische
Mathematik 1 (1959) 269-271.
[40] W.J. Conover, Practical Nonparametric Statistics, third ed., Wiley, New York,
NY, 1999.
[41] K.B. McKeithen, J.S. Reitman, H.H. Rueter, S.C. Hirtle, Knowledge organisation
and skill differences in computer programmers, Cognitive Psychology 13
(1981) 307-325.
[42] D.N. Perkins, F. Martin, Fragile knowledge and neglected strategies in novice
programmers, in: Empirical Studies of Programmers, 1986, pp. 213-229.
[43] V. Rajlich, Intensions are a key to program comprehension, in: IEEE
International Conference on Program Comprehension (ICPC'09), 2009, pp.
1-9.
[44] H.K. Wright, M. Kim, D.E. Perry, Validity concerns in software engineering
research, in: FSE/SDP Workshop on Future of Software Engineering Research,
2010, pp. 411-414.
[45] A. Mockus, R.T. Fielding, J.D. Herbsleb, Two case studies of open source
software development: Apache and Mozilla, ACM Transactions on Software
Engineering and Methodology 11 (2002) 309-346.