Refactoring is a widely accepted technique for improving software quality by restructuring the design of a program without changing its behavior. In general, a sequence of refactorings must be applied until the quality of the code has improved satisfactorily. The final design, however, can vary with the order in which the refactorings are applied, so different orders yield different degrees of quality improvement. It is therefore necessary to determine a proper refactoring schedule in order to obtain as much benefit as possible. Nevertheless, there has been little research on the problem of generating schedules that maximize quality improvement.
In this thesis, we propose an approach that automatically determines an appropriate schedule to maximize the quality improvement achieved through refactoring. Our approach to scheduling refactorings for code clones consists of three steps. First, we identify where to apply refactoring by detecting code clones that are suitable for it. To do this, we define method clone sets as the code clones that are suitable for refactoring from a refactoring viewpoint. Second, we detect the set of refactorings that can be applied to the detected clones. For this purpose, we propose a set of rules for determining appropriate refactoring operators from given code clones. Finally, we generate the refactoring schedule that maximizes the quality improvement achieved by refactoring. A straightforward way to find the best schedule is to generate all possible schedules and then select the most beneficial one. However, such a brute-force approach is time consuming because, in theory, n refactorings can be arranged into n! schedules. As the number of refactorings increases, the number of possible schedules therefore grows factorially, and exhaustively investigating all possible sequences quickly becomes computationally infeasible. To generate an appropriate refactoring schedule at a reasonable computational cost, we adapt a genetic algorithm (GA) that has been widely used to find...
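To make the contrast between exhaustive enumeration and GA-based search concrete, the following is a minimal sketch under stated assumptions, not the method developed in this thesis. The refactoring names, the `GAINS` and `DISABLES` tables, the `improvement` fitness function, and the choice of order crossover, swap mutation, and truncation selection are all hypothetical and serve only to illustrate an order-dependent scheduling problem.

```python
import itertools
import random

# Hypothetical toy model: each refactoring has a quality gain, and applying
# some refactorings first makes others inapplicable (for example, because
# the clone they target no longer exists).
GAINS = {"A": 3.0, "B": 5.0, "C": 2.0, "D": 4.0, "E": 1.0}
DISABLES = {"A": {"B"}, "D": {"C"}}  # applying the key disables the values


def improvement(schedule):
    """Total gain of a schedule; refactorings disabled earlier are skipped."""
    disabled, total = set(), 0.0
    for r in schedule:
        if r in disabled:
            continue
        total += GAINS[r]
        disabled |= DISABLES.get(r, set())
    return total


def brute_force(refactorings):
    """Evaluate all n! orderings and return the most beneficial one."""
    return max(itertools.permutations(refactorings), key=improvement)


def order_crossover(p1, p2, rng):
    """Copy a slice of p1 in place and fill the rest in p2's relative order."""
    i, j = sorted(rng.sample(range(len(p1) + 1), 2))
    middle = p1[i:j]
    rest = [r for r in p2 if r not in middle]
    return rest[:i] + middle + rest[i:]


def swap_mutation(schedule, rng, rate=0.2):
    """With probability `rate`, swap two positions of the schedule."""
    s = list(schedule)
    if rng.random() < rate:
        a, b = rng.sample(range(len(s)), 2)
        s[a], s[b] = s[b], s[a]
    return s


def genetic_schedule(refactorings, pop_size=20, generations=30, seed=0):
    """Search the space of orderings with a simple permutation-encoded GA."""
    rng = random.Random(seed)
    population = [rng.sample(refactorings, len(refactorings))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=improvement, reverse=True)
        parents = population[: pop_size // 2]  # keep the better half
        children = [
            swap_mutation(order_crossover(*rng.sample(parents, 2), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=improvement)
```

In this sketch, `brute_force` evaluates every one of the n! orderings, whereas `genetic_schedule` evaluates only about `pop_size * generations` candidates. The price is that the GA is a heuristic: the schedule it returns is always a valid ordering, but it is not guaranteed to be the optimal one.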