Automated Fine Grained Traceability Links Recovery between High Level Requirements and Source Code Implementations

Keywords: Software Traceability, Information Retrieval, Static Code Analysis, Program Slicing, Software Maintenance, Natural Language Processing, Healthcare


Software traceability has long been a matter of discussion in the software engineering community. Maintaining and recovering traces among software artifacts is fundamental to performing software maintenance tasks and verifying requirements compliance. Furthermore, in some application contexts, such as banking and healthcare, traceability is mandatory. Research on software traceability has so far focused on recovering lost traceability links at coarse-grained and middle-grained levels of detail; however, the proposed techniques do not reach the granularity required in critical contexts. In this work we propose a fine-grained traceability algorithm designed to recover traces between high-level requirements written in natural language and the source code statements that implement them. We tested our approach on four open-source healthcare systems, tracing constraint requirements specified by the HIPAA law, and we present the evaluation results in this paper.





How to Cite
A. Velasco and J. Aponte, “Automated Fine Grained Traceability Links Recovery between High Level Requirements and Source Code Implementations”, paradigmplus, vol. 1, no. 2, pp. 18-41, Aug. 2020.