THE CHALLENGE
In hyperscale data centers, memory is currently the most expensive component of CPU-based servers in terms of both monetary cost and carbon footprint, and it is expected to become even more expensive relative to other server components. Hardware memory compression, in which the CPU’s memory controller transparently compresses and packs memory values more densely, is a promising and performant way to combat the high cost of memory. While this approach boosts memory capacity without adding expensive memory chips, it also breaks the traditional link between what the operating system (OS) thinks it is using and what the hardware actually stores, since dynamically compressed data takes up varying amounts of physical space. As a result, the OS can no longer accurately partition memory resources across co-located workloads, leading to increased performance variation. Current tools such as memory quotas and ballooning lack visibility into this compressed layer, making it hard to guarantee performance or isolate workloads in multi-tenant environments. This ultimately undermines server efficiency, cost savings, and the predictability that enterprises and cloud customers demand.
OUR SOLUTION
We introduce Multi-domain Hardware Memory Compression, a hardware innovation embedded directly in each CPU’s memory controller that gives cloud providers and data center operators fine-grained control over how much actual DRAM each workload consumes, even when memory is compressed in hardware. The OS assigns clear memory quotas through lightweight per-job control blocks, and Multi-domain Hardware Memory Compression keeps each job within its real memory budget by automatically compressing less-used data in the background. If a workload still exceeds its quota, the system is alerted early so it can take corrective action. Unlike previous methods that relied on software guesses about data compressibility, this solution enforces exact memory limits in hardware, delivering predictable performance and strong isolation between tenants. The result is tighter consolidation, reduced overprovisioning, and up to 70% savings in memory costs while maintaining the reliability and service-level guarantees that businesses and cloud customers require.
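The quota mechanism described above can be pictured with a small software model. This is only an illustrative sketch, not the hardware design: the names `Page`, `DomainControlBlock`, and `enforce_quota`, the 4 KB page size, the least-recently-used ordering, and the assumption that each page's compressed size is known in advance are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field

PAGE_KB = 4  # assumed uncompressed page size, for illustration only


@dataclass
class Page:
    last_access: int        # logical timestamp of the most recent access
    compressed_kb: float    # DRAM the page would occupy once compressed (assumed known)
    is_compressed: bool = False

    @property
    def dram_kb(self) -> float:
        """Physical DRAM this page currently occupies."""
        return self.compressed_kb if self.is_compressed else PAGE_KB


@dataclass
class DomainControlBlock:
    """Hypothetical per-job control block: a DRAM quota plus the pages charged to it."""
    quota_kb: float
    pages: list = field(default_factory=list)

    def dram_usage_kb(self) -> float:
        return sum(p.dram_kb for p in self.pages)


def enforce_quota(domain: DomainControlBlock) -> bool:
    """Compress the least recently used pages until the job fits its quota.

    Returns True if the job is within quota afterwards. False models the
    early alert raised when compression alone is not enough.
    """
    for page in sorted(domain.pages, key=lambda p: p.last_access):
        if domain.dram_usage_kb() <= domain.quota_kb:
            break
        page.is_compressed = True
    return domain.dram_usage_kb() <= domain.quota_kb


# Four 4 KB pages (16 KB) charged to a job with a 10 KB quota.
job = DomainControlBlock(
    quota_kb=10,
    pages=[Page(last_access=t, compressed_kb=1.0) for t in range(4)],
)
within_quota = enforce_quota(job)
print(within_quota, job.dram_usage_kb())
```

In this toy run the two coldest pages are compressed and the job lands exactly on its 10 KB budget. With a quota below 4 KB, every page would be compressed and the function would still return False, which corresponds to the case where the system is alerted to take corrective action.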
Figure: Full-system prototype on a Genesys 2 Kintex-7™ FPGA. Linux boots with two cores and 3979736 KB of memory, about 4X the 1 GB of DRAM on board (prototype: https://youtu.be/-1JG3JnIY3U).
Advantages:
Potential Application: