Sensory, a leader in voice AI for consumer products, has announced a breakthrough technology integration that enables ChatGPT and other large language models to drive conversational voice responses (VoiceChat) on consumer products and other devices that lack keyboards and large screens. Targeting in-ear voice assistants, smartwatches, smartphones, automotive infotainment systems, and more, the integration delivers a fast, seamless conversational experience and unlocks VoiceChat-style capabilities for numerous electronics companies and their customers.
“Generative AI has the potential to make consumer devices smarter than ever. Integrating this powerful new technology with our robust voice AI stack is a game-changer for the market, and allows our customers to create a new generation of infinitely capable voice assistants tailored to a variety of customized domains,” said Todd Mozer, CEO of Sensory.
True to Sensory’s established reputation for highly accurate voice AI solutions, the company’s enabling technology stack makes generative AI even more accurate. It includes:
- Wake word recognition.
- Accurate speech-to-text with context and AI-generated prompt engineering to ensure ideal generative AI results.
- Intelligent response selection that helps avoid unpredictable and incorrect responses, aka ‘AI hallucinations,’ which can occur on platforms that rely solely on generative AI.
- Text-to-speech that lets users hear the generated responses in a natural voice.
- Conversational follow-up, letting users ask follow-up questions and issue commands to filter, sort, or add information to the original request, making the conversation more natural and human-like.
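The stack components above can be sketched as a single pipeline. This is a minimal illustrative sketch in Python; every function name and the stand-in logic here are assumptions for illustration, not Sensory's actual API or models.

```python
from __future__ import annotations

# Illustrative sketch of the voice assistant flow: wake word -> STT ->
# prompt engineering -> LLM -> response selection -> TTS. All names and
# logic are hypothetical stand-ins, not Sensory's implementation.

def detect_wake_word(audio: str) -> bool:
    # Placeholder: a real system runs an on-device wake word model.
    return audio.startswith("hey assistant")

def speech_to_text(audio: str) -> str:
    # Placeholder for accurate, context-aware speech-to-text.
    return audio.removeprefix("hey assistant").strip()

def build_prompt(query: str, history: list[str]) -> str:
    # Prompt engineering: fold prior turns into the LLM prompt so
    # follow-up questions keep their context.
    context = " | ".join(history)
    return f"Context: {context}\nUser: {query}"

def select_response(candidates: list[str]) -> str:
    # Intelligent response selection: screen out candidates that fail a
    # sanity check (a toy stand-in for hallucination filtering).
    vetted = [c for c in candidates if c and not c.startswith("UNSURE")]
    return vetted[0] if vetted else "Sorry, I don't know."

def handle_turn(audio: str, history: list[str]) -> str | None:
    if not detect_wake_word(audio):
        return None
    query = speech_to_text(audio)
    prompt = build_prompt(query, history)
    # Placeholder LLM call returning ranked candidate responses.
    candidates = [f"Answer to: {query}", "UNSURE: low-confidence guess"]
    reply = select_response(candidates)
    history.append(query)  # retained so follow-ups stay in context
    return reply  # in a real stack, this would be sent to text-to-speech

history: list[str] = []
print(handle_turn("hey assistant what's the weather", history))
```

Keeping the history list between turns is what allows a follow-up such as "and tomorrow?" to be interpreted against the original request.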
“This launch expands Sensory’s capabilities to bring VoiceChat capabilities to devices of all types, giving businesses the opportunity to create more engaging and interactive products,” Mozer continued.
With Sensory’s hybrid cloud + edge AI platform, customers can choose to implement a number of powerful AI technologies to bolster the end-user experience and security, and split AI inference duties between edge devices and the cloud.
Using a smartwatch as an example of an ultra-low-power device, light-duty AI such as wake word recognition, speaker verification, simple voice controls, and sound identification can run on-device. More complex AI inference, such as wake word, speaker, and sound ID revalidation, as well as domain-specific assistants and natural language understanding engines, can be routed to a more powerful connected device like a smartphone. And high-horsepower AI inference, like generative AI and today’s generation of VoiceChat, improved revalidation, and face and object recognition, can be routed to the cloud.
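The tiered routing described above can be sketched as a simple "cheapest tier that can handle the task" rule. The task-to-tier mapping below is an assumed simplification based on the examples in the text, not Sensory's actual routing logic.

```python
# Illustrative sketch of hybrid edge/cloud inference routing.
# Compute tiers, cheapest (lowest power, lowest latency) first.
TIERS = ("device", "phone", "cloud")

# Minimum tier each task needs -- an assumed mapping for illustration.
TASK_TIER = {
    "wake_word": "device",
    "speaker_verification": "device",
    "simple_voice_control": "device",
    "sound_id": "device",
    "wake_word_revalidation": "phone",
    "domain_assistant": "phone",
    "nlu": "phone",
    "generative_ai": "cloud",
    "voicechat": "cloud",
    "face_object_recognition": "cloud",
}

def route(task: str, available: tuple = TIERS) -> str:
    """Return the cheapest available tier capable of running the task."""
    needed = TIERS.index(TASK_TIER[task])
    for tier in available:
        if TIERS.index(tier) >= needed:
            return tier
    raise RuntimeError(f"no available tier can run {task}")

print(route("wake_word"))      # light-duty: stays on the watch
print(route("generative_ai"))  # high-horsepower: goes to the cloud
```

Note that the rule degrades gracefully: if the phone tier is unavailable, a phone-class task can still be escalated to the cloud.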
SensoryCloud’s voice assistant solution is powered by a cutting-edge technology stack that includes Go, gRPC, NVIDIA Triton, and AWS Global Accelerator. The lightning-fast Go programming language enables scalable, high-performance applications that can handle even the most demanding workloads, and gRPC enables advanced SDKs for seamless communication between components. SensoryCloud uses proprietary techniques to compress dialog data to reduce cloud fees and decrease latencies.
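Sensory's compression techniques are proprietary, so the sketch below only illustrates the general idea with the Python standard library: serialize the dialog turns and compress them with zlib before transmission, shrinking the payload that crosses the network.

```python
import json
import zlib

# Toy sketch of compressing dialog data for transmission. zlib stands in
# for SensoryCloud's proprietary compression, which is not public.

def pack_dialog(turns: list) -> bytes:
    """Serialize and compress a dialog history for sending to the cloud."""
    raw = json.dumps(turns).encode("utf-8")
    return zlib.compress(raw, level=9)

def unpack_dialog(payload: bytes) -> list:
    """Decompress and deserialize a dialog history on the receiving side."""
    return json.loads(zlib.decompress(payload))

turns = [{"role": "user", "text": "what's the weather in Paris today?"}] * 50
payload = pack_dialog(turns)
print(len(json.dumps(turns).encode("utf-8")), "->", len(payload), "bytes")
```

Dialog histories are highly repetitive (roles, field names, recurring phrases), which is exactly the kind of payload where compression pays off in both bandwidth cost and latency.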