Stand-Alone Accelerator Protocol (SAP) for Heterogeneous Computing Systems

Today's high-performance computing (HPC) systems are largely structured based on traditional central processing units (CPUs) with tightly coupled general-purpose graphics processing units (GPUs, which can be considered domain-specific accelerators). GPUs have a different programming model than CPUs and are only efficient in exploiting spatial parallelism for accelerating high-concurrency algorithms but not the temporal/pipeline parallelism vital to accelerating high-dependency algorithms that are widely used in predictive simulations for computational science. As a result, today's HPC systems still have huge room for improvement in terms of performance and energy efficiency for running complex scientific computing tasks (e.g., many large pieces of legacy HPC codes for predictive simulations are still running on CPUs).

In recent years, a few more accelerator choices for heterogeneous computing systems (e.g., HPC and other large-scale computing systems) have emerged, such as field-programmable gate arrays (FPGAs, which can be considered reconfigurable accelerators) and tensor processing units (TPUs, which can be considered application-specific accelerators). Although these new accelerators offer flexible or customized hardware architectures with excellent capabilities for exploiting temporal/pipeline parallelism efficiently, their adoption in extreme-scale scientific computing is still at its infancy and is expected to be a tortuous process regardless of their superior performance and energy efficiency benefits.

The fundamental challenge to the adoption of any new accelerators in HPC, such as FPGAs and TPUs, is that each accelerator's programming model, message passing interface, and virtualization stack is developed independently and is specific to the respective hardware architecture. With the lack of clarity in the demarcation between hardware-specific and hardware-agnostic development regions, today's programming models require domain-matter experts (DMEs) and hardware-matter experts (HMEs) to work interdependently to make a significant effort in optimizing hardware-specific codes in order to adopt new accelerator devices in HPC and gain performance benefits. This tangled association is a self-imposed bottleneck from existing programming models that impairs a future in true heterogeneous HPC and severely impacts the velocity of scientific discovery.

Researchers at Arizona State University have developed a stand-alone accelerator protocol (SAP) for heterogeneous computing systems. The adoption of emerging accelerators is key to achieving greater scale and performance in heterogeneous computing systems. The SAP allows for a hardware accelerator to be plug-and-playable in a stand-alone fashion (without needing a local CPU host) and interact with a remote computing system agent for application acceleration across any network infrastructure. The SAP further facilitates a hardware-agnostic accelerator orchestration (HALO) software framework for hardware-agnostic programming with high performance portability and scalability in heterogeneous computing systems, such as HPC systems, data center computing systems, and edge computing systems. Related PublicationHALO 1.0: A Hardware-agnostic Accelerator Orchestration Framework for Enabling Hardware-agnostic Programming with True Performance Portability for Heterogeneous HPC

Potential Applications:

  • High-performance computing, cloud computing
  • GPUs, FPGAs, TPUs, and other domain- and application-specific accelerators

Benefits and Advantages:

  • Flexible hardware-agnostic environment allows application developers to develop high-performance applications without knowledge of the underlying hardware
  • Environment facilitates dynamic plugin of an accelerator onto the network fabric, which can be auto-discovered and utilized by applications
Patent Information: