This thesis proposes a new branch folding technique, speculative branch folding, for low-power embedded applications. The technique applies to single-issue, five-stage pipeline structures. An implementation for a commercial embedded processor is also presented.
The proposed technique predicts the direction of a conditional branch and then combines the branch instruction with the first instruction on the predicted path of the branch. The resulting folded instruction consists of the predicted target instruction and the branch condition. Only this folded instruction enters the execution pipeline, in place of both the branch instruction and the predicted target instruction. Because no branch instruction enters the execution pipeline, execution continues as if the original instruction sequence contained no branch.
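The folding step described above can be illustrated with a minimal sketch. It is not the hardware design of this thesis: the names `Instr`, `Folded`, and `fold` are hypothetical, the program is modeled as a simple list of instructions, and the sketch assumes that a branch whose predicted successor is itself a branch is left unfolded. How the carried condition is later checked, and how a misprediction is recovered, is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction model (an assumption of this sketch)."""
    op: str                        # mnemonic, e.g. "add" or "beq"
    cond: Optional[str] = None     # branch condition; None for non-branches
    target: Optional[int] = None   # taken-target index; None for non-branches


@dataclass(frozen=True)
class Folded:
    """Folded instruction: predicted target instruction plus branch condition."""
    op: str                # first instruction on the predicted path
    cond: str              # condition of the folded-away branch
    predicted_taken: bool  # direction that was predicted


def fold(program: List[Instr], pc: int,
         predict_taken: bool) -> Tuple[Union[Instr, Folded], int]:
    """Return what enters the pipeline for program[pc] and the next fetch index."""
    instr = program[pc]
    if instr.cond is None:
        # Ordinary instruction: enters the pipeline unchanged.
        return instr, pc + 1

    # Conditional branch: pick the first instruction on the predicted path.
    succ = instr.target if predict_taken else pc + 1
    nxt = program[succ]
    if nxt.cond is not None:
        # Assumption of this sketch: do not fold a branch into a branch.
        return instr, succ

    # Only the folded instruction enters the pipeline; the branch does not.
    return Folded(nxt.op, instr.cond, predict_taken), succ + 1


program = [
    Instr("add"),
    Instr("beq", cond="r1 == r2", target=3),
    Instr("sub"),
    Instr("mul"),
]
entry_taken, next_taken = fold(program, 1, predict_taken=True)
entry_not_taken, next_not_taken = fold(program, 1, predict_taken=False)
```

With the branch at index 1 predicted taken, the entry that reaches the pipeline is the folded `mul` carrying the condition `r1 == r2`, and fetch resumes past the target. With the opposite prediction, the fall-through `sub` is folded instead.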
The effectiveness of speculative branch folding is evaluated by extensive simulation. The SimpleScalar simulator is modified to model the proposed technique. We use the MediaBench benchmark suite, ported to the SimpleScalar PISA instruction set architecture. The experimental results show that the proposed technique performs better than the SBB folding scheme, a previously proposed branch folding technique for low-power embedded processors.
A commercial embedded processor, CalmDSP™, is redesigned using speculative branch folding. The hardware cost and the power consumption are measured using commercial and in-house tools from the ASIC design flow of Samsung. The implementation results for the new processor show that the proposed technique can be applied to commercial embedded processors.