Methods for data multiplexing in neural networks
Princeton Docket # 22-3898-1
Current neural network architectures, particularly Transformers, face significant challenges due to their GPU-memory intensity, which limits throughput and increases time and energy costs during training and deployment. To address this issue, Princeton University researchers have developed a method that leverages data multiplexing to enhance neural network efficiency. The technique allows a model to process multiple input instances simultaneously by compressing them into a single combined vector representation that is handled in one forward pass.
Unlike traditional batch processing, where each input still consumes its own compute and activation memory, this approach compresses several inputs into a single instance, so the cost of one forward pass is shared across all of them and predictions are produced simultaneously without sacrificing accuracy. The method achieves throughput improvements of 5x to 20x over conventional techniques, yielding substantial savings in time, energy, and operational costs. By dramatically increasing model throughput, this technology offers a transformative solution to the limitations of current neural network architectures, enabling organizations to handle larger datasets and improve overall performance.
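For illustration only, the sketch below shows one way such multiplexing could be wired up in PyTorch: per-slot projections fuse N inputs into a single vector, a shared backbone runs a single forward pass on the fused representation, and per-slot heads demultiplex it into one prediction per original input. The class and parameter names (MultiplexedClassifier, num_instances, and so on) are hypothetical and are not taken from the inventors' implementation.

```python
import torch
import torch.nn as nn

class MultiplexedClassifier(nn.Module):
    """Toy sketch of data multiplexing: fuse N inputs into one vector,
    run the shared backbone once, then recover one prediction per input."""

    def __init__(self, input_dim, hidden_dim, num_classes, num_instances):
        super().__init__()
        self.num_instances = num_instances
        # One projection per multiplexed slot, so the fused vector
        # keeps the individual inputs separable (assumption for this sketch).
        self.mux_proj = nn.ModuleList(
            [nn.Linear(input_dim, hidden_dim, bias=False) for _ in range(num_instances)]
        )
        # Shared backbone: its cost is paid once for all N inputs.
        self.backbone = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # One demultiplexing head per slot produces that input's prediction.
        self.demux_heads = nn.ModuleList(
            [nn.Linear(hidden_dim, num_classes) for _ in range(num_instances)]
        )

    def forward(self, xs):
        # xs: list of num_instances tensors, each of shape (batch, input_dim)
        combined = torch.stack(
            [proj(x) for proj, x in zip(self.mux_proj, xs)], dim=0
        ).mean(dim=0)                       # (batch, hidden_dim): one fused vector
        h = self.backbone(combined)         # single forward pass for all N inputs
        return [head(h) for head in self.demux_heads]  # N prediction tensors

# Usage: four inputs share one forward pass through the backbone.
model = MultiplexedClassifier(input_dim=32, hidden_dim=64, num_classes=10, num_instances=4)
xs = [torch.randn(8, 32) for _ in range(4)]
logits = model(xs)  # list of 4 tensors, each of shape (8, 10)
```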
Applications
• Natural language processing
• Image classification and object detection
Advantages
• Increased throughput
• Improved efficiency
• Cost savings
• Easily scalable
• No loss in accuracy
Inventors
Karthik Narasimhan Ph.D. is an associate professor at Princeton University. His research includes natural language processing, reinforcement learning and artificial intelligence.
Vishvak Murahari is currently a Ph.D. student at Princeton University studying computer science. His research focuses include natural language processing and machine learning.
Carlos Jimenez is a Ph.D. student at Princeton University. His research interests include language models and task-oriented dialogue.
Runzhe Yang Ph.D. received his doctorate from Princeton University and is currently a scientist at Susquehanna. His research primarily focuses on machine learning and computational neuroscience.
Ameet Deshpande is a Ph.D. student at Princeton University in the computer science department. His research focuses on the field of natural language processing (NLP).
Kai Li Ph.D. is the Paul M. and Marcia R. Wythes Professor in the computer science department at Princeton University. His current research interests include ML for systems, privacy preservation for ML, and data auditing.
Yushan Su Ph.D. is a postdoctoral researcher at Princeton University who aims to improve computer systems and machine learning efficiency by using data multiplexing, model compression and hardware accelerators.
Intellectual Property & Development Status
Patent protection is pending.
Princeton is currently seeking commercial partners for the further development and commercialization of this opportunity.
Contact
Prabhpreet Gill
Princeton University Office of Technology Licensing • (609)258-3653 • psgill@princeton.edu