With the growing importance of in-memory data processing, cloud providers have launched large memory virtual machine services to accommodate memory-intensive workloads. However, expanding these large memory services further while retaining cloud economics is challenging, because services built on scaled-up machines are far less cost-efficient than scaled-out services built on volume commodity servers. By exploiting memory usage imbalance across cloud nodes, disaggregated memory can provide a cost-effective way to scale memory capacity.
The hypervisor-integrated design makes several new contributions to disaggregated memory design and implementation. First, through this tight integration, it introduces a page management mechanism and policy tuned for disaggregated memory and restructures the memory management procedures to relieve scalability bottlenecks. Second, exploiting the page access records available to the hypervisor, it supports application-aware elastic block sizes, fetching remote memory pages at different granularities. Third, we propose a new Service-Level Agreement (SLA) model that benefits both cloud consumers and providers, along with the performance prediction model necessary to support it. Fourth, when only partial memory traces are available, a dual-counter-based memory profiler dynamically determines the memory capacity needed to fulfill the new SLA. Finally, to complement profiling with partial memory traces, the replacement mechanism in direct memory supports memory tracing for the LRU group of pages.
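To make the fourth contribution concrete, the following is a minimal sketch of what a dual-counter profiler over a partial trace could look like. It is an illustrative assumption, not the paper's implementation: the `DualCounterProfiler` class, its counter semantics (one counter for accesses that hit the tracked LRU group of local pages, one for accesses that would fetch from remote memory), and the SLA check are all hypothetical names introduced here.

```python
from collections import OrderedDict

class DualCounterProfiler:
    """Hypothetical sketch: estimate whether the current direct (local)
    memory capacity keeps the miss ratio under an SLA-derived bound,
    using only the partial trace of pages in the tracked LRU group."""

    def __init__(self, tracked_capacity):
        self.tracked_capacity = tracked_capacity  # pages tracked in the LRU group
        self.lru = OrderedDict()                  # page -> None, most recent last
        self.hits = 0    # counter 1: accesses hitting the tracked LRU group
        self.misses = 0  # counter 2: accesses that would go to remote memory

    def access(self, page):
        if page in self.lru:
            self.hits += 1
            self.lru.move_to_end(page)            # refresh recency
        else:
            self.misses += 1
            self.lru[page] = None
            if len(self.lru) > self.tracked_capacity:
                self.lru.popitem(last=False)      # evict least recently used

    def miss_ratio(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0

    def meets_sla(self, max_miss_ratio):
        return self.miss_ratio() <= max_miss_ratio
```

Under this reading, a controller would grow the local capacity whenever `meets_sla` fails and could shrink it when the miss ratio sits well below the bound; only two counters and one LRU list are maintained, which is consistent with the low footprint claimed for the profiler.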
With the new SLA support, the possible performance degradation of disaggregated memory relative to an ideal large memory machine is curtailed within a contracted margin. The experimental results show that disaggregated memory dynamically chooses the best block configuration and keeps performance degradation within the contracted margin, while the profiler incurs a lower computation cost than a balanced-tree-based reuse distance profiler and requires only 0.1% of its memory footprint.