As developers have adopted version control and bug tracking systems, the software development process has become more visible and traceable. Since the 2000s, researchers in the mining software repositories (MSR) area have studied the information these systems record and have proposed various prediction models to improve software quality and reduce development cost.
In this dissertation, we study two principal problems in mining software repositories using graph-based approaches. First, we examine change recommendation. Change recommendation approaches have been proposed to prevent omission errors by predicting additional change locations for a given change set. To investigate real-world omission errors, we study bug reports that were fixed more than once (multi-fix bugs). We empirically study the characteristics of these multi-fix bugs and how well the supplementary patch locations can be predicted from the initial change locations. Additionally, we propose a novel graph representation, the change relationship graph, and use it to comprehensively investigate the relationships between initial and supplementary change locations. Second, we examine reliability prediction. We propose a reliability measure, the weighted number of faults, that takes into account the severity of each released fault. We build prediction models with regression, using existing object-oriented, change, and graph metrics. We investigate how metric sets, feature selection methods, regression models, and the size of the training set affect prediction accuracy. Furthermore, we use structure, clone, and co-change graphs to investigate how these graphs evolve over the release history, as well as how they can be used to predict release-level reliability and fault-prone classes.
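As a rough illustration of two ideas named above, the following minimal Python sketch computes a weighted number of faults and builds a simple co-change graph from change sets. It is not the dissertation's implementation: the severity labels and weights, the file names, and the helper names `weighted_fault_count`, `co_change_graph`, and `recommend` are all hypothetical, and the plain co-change graph only stands in for the richer change relationship graph, whose definition is not given here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical severity weights; the actual weighting scheme is not stated here.
SEVERITY_WEIGHTS = {"minor": 1, "major": 3, "critical": 5}


def weighted_fault_count(severities, weights=SEVERITY_WEIGHTS):
    """Sum the severity weight of each fault reported against a release."""
    return sum(weights[s] for s in severities)


def co_change_graph(change_sets):
    """Count, for every pair of files, how many change sets contain both."""
    edges = Counter()
    for files in change_sets:
        # Sort so that each undirected edge has a single canonical key.
        for a, b in combinations(sorted(set(files)), 2):
            edges[(a, b)] += 1
    return edges


def recommend(edges, initial, top_n=3):
    """Rank files outside `initial` by total co-change weight with it."""
    scores = Counter()
    for (a, b), weight in edges.items():
        if a in initial and b not in initial:
            scores[b] += weight
        elif b in initial and a not in initial:
            scores[a] += weight
    return [name for name, _ in scores.most_common(top_n)]


# Illustrative data only.
faults = ["minor", "critical", "major", "minor"]
change_sets = [
    ["A.java", "B.java"],
    ["A.java", "B.java", "C.java"],
    ["B.java", "C.java"],
]
edges = co_change_graph(change_sets)
print(weighted_fault_count(faults))
print(recommend(edges, {"A.java"}))
```

In this toy setting, the candidates ranked for an initial change to `A.java` are the files that most often changed together with it. A real predictor would also need to account for factors such as how recent a change is and how large a change set is.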