A TLB (Translation Lookaside Buffer) is a processor cache that accelerates the translation of virtual addresses to physical addresses. Because the working sets of application programs have been growing rapidly, TLB reach (the maximum amount of memory that a TLB can map) is failing to keep pace.
A TLB is an expensive processor resource and must operate at high speed, so the number of TLB entries cannot be increased freely.
The superpage approach was proposed to increase TLB reach without increasing the number of TLB entries. In a superpage TLB, a single TLB entry can map a superpage, which consists of several contiguous base pages. However, superpages impose several strong requirements that hinder their use in practice. Two previous schemes, the partial-subblock TLB and the shadow memory, were proposed to relax these requirements. The partial-subblock TLB relaxes only a small portion of the requirements and also limits the superpage size. The shadow memory relaxes most of the requirements but introduces other serious problems.
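The effect of superpages on TLB reach can be illustrated with a minimal sketch. The entry count, base page size, and superpage size below are illustrative assumptions, not figures taken from this dissertation, and `tlb_reach` is a hypothetical helper name.

```python
def tlb_reach(num_entries: int, page_size: int) -> int:
    """Bytes mapped when every TLB entry maps one page of page_size bytes."""
    return num_entries * page_size


# Assumed parameters, chosen only for illustration.
BASE_PAGE = 4 * 1024        # 4 KiB base page
ENTRIES = 64                # number of TLB entries
PAGES_PER_SUPERPAGE = 16    # one superpage = 16 contiguous base pages

# Reach when each entry maps a single base page.
base_reach = tlb_reach(ENTRIES, BASE_PAGE)

# Reach when each entry maps a superpage; the entry count is unchanged.
super_reach = tlb_reach(ENTRIES, PAGES_PER_SUPERPAGE * BASE_PAGE)

print(f"base pages: {base_reach // 1024} KiB")
print(f"superpages: {super_reach // 1024} KiB")
```

Under these assumptions the reach grows by the number of base pages per superpage while the number of entries stays fixed, which is the motivation stated above.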
This dissertation explores various schemes for supporting superpages efficiently and increasing the actual utilization of superpages. First, it proposes a hybrid scheme that integrates the shadow memory and a partial-subblock TLB, thereby inheriting the benefits of both.
The hybrid scheme achieves superpage utilization as high as that of the shadow memory, and avoids most of the shadow memory's problems by virtue of the partial-subblock TLB. It also inherits most of the hardware cost and overhead of both schemes. However, the evaluation shows that the performance gain outweighs this cost and overhead.
Second, this dissertation introduces a new TLB structure, called VS-TLBs. They are based on subblock TLBs and add a subblock size field. By virtue of this field, a subblock in VS-TLBs can consist of multiple pages, while a subblock in the subblo...