Applying Information Theory to Software Evolution: What can we Learn From Surprising Changes?
| dc.contributor.advisor | Treude, Christoph (Singapore Management University) | |
| dc.contributor.advisor | Baltes, Sebastian | |
| dc.contributor.author | Rodrigues Figueiredo Torres, Adriano | |
| dc.contributor.school | School of Computer and Mathematical Sciences | |
| dc.date.issued | 2024 | |
| dc.description.abstract | Information theory, though widely applied in various disciplines, remains underexplored in the field of software engineering and the evolution of code bases. This work seeks to address this gap by investigating the application of information theory to measure the ever-changing complexity of software projects. We examine two definitions of entropy, one based on natural language tokens and another based on abstract syntax tree nodes, and apply them to the commit history of 95 open-source projects. Our analysis reveals a strong correlation between the two entropy measures and highlights their mostly weak correlations with established metrics of software complexity. Furthermore, our data from information-theoretic anomaly detection suggest that significant fluctuations in project information content may inform a definition of surprising change events. Building on this, we extend our analysis to measure the information content of source code using tokens and abstract syntax tree nodes as the elements of the communication system defined by the source code. Through an empirical assessment of 95 actively maintained open-source projects, we demonstrate that entropy metrics capture distinct dimensions of complexity compared to traditional metrics. Finally, we showcase the efficacy of information theory in anomaly detection, indicating its potential in identifying unusual commits generated during software evolution, thus paving the way for automated detection of such events. This research offers valuable insights into the use of information theory for a comprehensive understanding of software complexity throughout its development life cycle. | |
| dc.description.dissertation | Thesis (MPhil.) -- University of Adelaide, School of Computer and Mathematical Sciences, 2024 | en |
| dc.identifier.uri | https://hdl.handle.net/2440/144189 | |
| dc.language.iso | en | |
| dc.provenance | This electronic version is made publicly available by the University of Adelaide in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to Fair Dealing exceptions. If you are the owner of any included third party copyright material you wish to be removed from this electronic version, please complete the take down form located at: http://www.adelaide.edu.au/legals | en |
| dc.subject | information theory | |
| dc.subject | software evolution | |
| dc.subject | static analysis | |
| dc.title | Applying Information Theory to Software Evolution: What can we Learn From Surprising Changes? | |
| dc.type | Thesis | en |
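
The abstract describes a token-based entropy measure without giving its formula. As a rough illustration only, the sketch below computes the Shannon entropy of a token frequency distribution for one source snapshot. The function name `token_entropy` and the regex tokenizer are assumptions made for this example; they are not taken from the thesis, whose actual tokenization and entropy definition may differ.

```python
import math
import re
from collections import Counter


def token_entropy(source: str) -> float:
    """Shannon entropy, in bits per token, of the token frequencies in `source`."""
    # Assumed tokenizer (hypothetical, not the thesis's): a run of word
    # characters, or any single non-space, non-word symbol.
    tokens = re.findall(r"\w+|[^\w\s]", source)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    # H = -sum(p * log2(p)) over the observed token probabilities.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Comparing two snapshots gives the change in entropy between them.
before = "x = 1\ny = 1\n"
after = "x = 1\ny = compute(x, 2)\n"
delta = token_entropy(after) - token_entropy(before)
```

Under this sketch, a large absolute `delta` between consecutive commits would be one candidate signal for the kind of fluctuation in information content that the abstract mentions. How the thesis actually flags surprising changes is not specified in this record.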
Files
Original bundle
- Name: Rodrigues Figueiredo Torres2024_MPhil.pdf
- Size: 2.89 MB
- Format: Adobe Portable Document Format