Reinforcement Learning for Fault-tolerant Energy-efficient NoC Design

Researchers at GW have developed a novel, cost-effective, energy-efficient computer architecture design based on reinforcement learning (RL) for use in a range of computing applications. The disclosed design offers better fault tolerance, reliability, and network latency than existing solutions. It can include an Error Correction Code (ECC)-based fault-tolerant scheme and a dynamic control policy driven by RL. The design targets three major limitations of conventional computer architecture designs: performance scaling, power budget, and security demand.

The disclosed invention can be implemented as a system, a method, or a device. Any of these can include the following aspects: (i) proactive fault-tolerant schemes that allow routers to switch among four different fault-tolerant operation modes; (ii) an RL-based control policy for the proposed fault-tolerant scheme, which observes a set of network-on-chip (NoC) system parameters at runtime and automatically evolves an optimal per-router control policy; (iii) minimized system-level network latency and maximized energy efficiency, with better fault coverage than conventional NoC designs. In an embodiment, each operation mode can have different trade-offs among fault-tolerant capability, retransmission traffic, packet latency, and energy efficiency.
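To make the per-router control idea more concrete, the following is a minimal sketch of how a router could learn to pick one of four operation modes from runtime observations. It uses plain tabular Q-learning as a stand-in; the disclosure does not specify the learning algorithm at this level of detail, and the class name `RouterAgent`, the observed features (error rate, buffer utilization), the bin thresholds, and the reward shape are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict

NUM_MODES = 4  # the listing describes four fault-tolerant operation modes


class RouterAgent:
    """Illustrative tabular Q-learning agent for a single router (assumed design)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        # Q-table: state -> one estimated value per operation mode
        self.q = defaultdict(lambda: [0.0] * NUM_MODES)

    def select_mode(self, state):
        """Epsilon-greedy choice among the operation modes."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(NUM_MODES)
        values = self.q[state]
        return values.index(max(values))

    def update(self, state, mode, reward, next_state):
        """Standard one-step Q-learning update."""
        best_next = max(self.q[next_state])
        td_target = reward + self.gamma * best_next
        self.q[state][mode] += self.alpha * (td_target - self.q[state][mode])


def discretize(error_rate, buffer_utilization):
    """Map two hypothetical runtime observations to a small discrete state."""
    err_bin = 0 if error_rate < 0.01 else (1 if error_rate < 0.1 else 2)
    buf_bin = 0 if buffer_utilization < 0.5 else 1
    return (err_bin, buf_bin)


def reward(latency_cycles, energy_pj):
    """Hypothetical reward: lower latency and lower energy both score higher."""
    return -(latency_cycles * energy_pj)
```

In a simulator, each router would call `select_mode` at the start of a control interval, run in that mode, then call `update` with the measured reward and the next observed state. Keeping one small agent per router mirrors the per-router policy described above, though a shared or function-approximated policy would be an equally plausible design.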

Fig. 1 – Aspects of the disclosed invention

 

Applications:

  • Computer architecture applications in mobile devices, computers, supercomputers, and data centers
  • Interconnection architectures in multicore processors
  • Exascale and other large-scale computing applications
  • Other networking applications
    • Networks-on-chip, systems-on-chip

 

Advantages:

  • High performance
  • Cost-effective
  • Energy-efficient
  • Better fault tolerance, reliability, and network latency than existing solutions

Patent Information:

  • Title: Learning-Based High-Performance, Energy-Efficient, Fault-Tolerant On-Chip Communication Design Framework
  • App Type: US Utility
  • Country: *United States of America
  • Serial No.: 16/547,297
  • Patent No.: 12,040,897
  • File Date: 8/21/2019
  • Issued Date: 7/16/2024
  • Expire Date: not listed