A way to organize distributed systems in data centers that requires fewer resources and increases efficiency and predictability.
Problem:
Many distributed systems today are built on the assumption that the system is asynchronous: that packets can be lost or delayed arbitrarily and that clocks are at most weakly synchronized. This assumption results in systems that are extremely robust, but it also creates many difficult challenges, including complex coordination, slow failure detection, high resource requirements, and high tail latencies. For current data-center hardware, however, the assumption is quite pessimistic: current nodes and switches can deliver far better performance, as long as the system is structured carefully to take advantage of them.
Solution:
This invention describes a way to structure a distributed system as "quasi-synchronous", with tightly synchronized clocks and carefully scheduled network transmissions. By coordinating communication through predetermined time slots and assuming bounded clock differences and limited consecutive packet loss, nodes can exchange messages in a predictable, clock-driven manner. This approach reduces coordination overhead, allows replicas to process requests deterministically, and lets them distinguish packet loss from node failure more quickly, enabling faster recovery. A state-machine replication (SMR) system has been implemented as a case study and shows much higher performance than state-of-the-art solutions.
Technology:
The invention describes several techniques that can be used to make distributed systems quasi-synchronous, including a way to schedule the network to avoid queueing delays and to achieve delivery by a specified deadline; a way to detect node failures based on the absence of expected transmissions; and a way to handle occasional packet losses despite the tight timing guarantees.
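As a rough illustration of how slot scheduling and absence-based failure detection might fit together, the following minimal Python sketch may help. It is not the invention's protocol. The names `Schedule` and `FailureDetector` and all parameters are hypothetical. The sketch assumes round-robin slots, a guard interval equal to the assumed clock-skew bound, and it ignores propagation delay.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Schedule:
    """Round-robin transmission slots (hypothetical model, integer microseconds)."""
    replicas: int      # number of replicas taking turns
    tx_time_us: int    # time needed to transmit one message
    max_skew_us: int   # assumed bound on the clock difference between any two nodes

    @property
    def slot_us(self) -> int:
        # Pad each slot by the skew bound so consecutive senders do not overlap.
        return self.tx_time_us + self.max_skew_us

    def slot_start(self, round_no: int, replica: int) -> int:
        """Local time at which `replica` may start sending in round `round_no`."""
        return (round_no * self.replicas + replica) * self.slot_us

    def owner(self, t_us: int) -> int:
        """Replica that owns the slot containing time `t_us`."""
        return (t_us // self.slot_us) % self.replicas


@dataclass
class FailureDetector:
    """Suspects a replica once it misses more slots than packet loss can explain."""
    max_consecutive_losses: int          # assumed bound on back-to-back packet losses
    missed: dict = field(default_factory=dict)

    def end_of_slot(self, replica: int, received: bool) -> bool:
        """Record the outcome of one expected transmission; return True if suspected."""
        if received:
            self.missed[replica] = 0
        else:
            self.missed[replica] = self.missed.get(replica, 0) + 1
        # More misses than the loss bound allows cannot be explained by loss alone.
        return self.missed[replica] > self.max_consecutive_losses


if __name__ == "__main__":
    sched = Schedule(replicas=3, tx_time_us=10, max_skew_us=2)
    detector = FailureDetector(max_consecutive_losses=2)
    # Replica 1 stays silent for three of its slots in a row.
    for round_no in range(3):
        suspected = detector.end_of_slot(1, received=False)
        print(sched.slot_start(round_no, 1), suspected)
```

Under these assumptions, a replica is suspected only after it has missed one more expected transmission than the loss bound permits, which is how a bounded-loss assumption can separate packet loss from node failure within a few slots.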
A specific protocol for state-machine replication is provided as an example, but the technology should be applicable to a wide range of distributed systems. Experiments show a throughput improvement of two orders of magnitude while using half as many replicas as state-of-the-art solutions.
Advantages:
Stage of Development:
Figure A): Illustrates the overall system architecture. Clients send requests to a set of replica servers that each run the same application and maintain synchronized state. Replicas communicate with one another through a dedicated replica network to exchange replication messages and maintain consistent ordering of requests. In addition, replicas receive a shared timing signal through a separate clock network, which keeps their clocks closely synchronized. Separating the client network, replica network, and clock synchronization channel enables replicas to coordinate communication and processing in a predictable, time-driven manner.
Figure B): Illustrates the scheduled communication model used between replicas. Figure B (left) shows the network topology and how replicas are connected through switches that forward messages between servers. Figure B (right) shows a time-based transmission schedule in which each replica is assigned specific time slots to broadcast messages. The schedule accounts for network delays and small clock differences so that packets arrive without causing congestion or queue buildup. By following this predetermined schedule, replicas exchange messages in a controlled sequence that enables deterministic system behavior.
Intellectual Property:
Reference Media:
Desired Partnerships:
Docket #26-11452