SensoryCloud STT uses an end-to-end architecture designed for flexibility and accuracy. An optional domain-specific language model can be customized and applied to support domains with special vocabulary or industry-specific jargon. The platform supports both streaming audio and batch modes.
SensoryCloud TTS combines end-to-end models with neural vocoders. The result is natural, human-sounding synthesized speech that is generated significantly faster than real time, minimizing synthesis delay.
Wake Word Detection
Most wake words live on the edge, but cloud-based verification can significantly improve wake word performance. A second check in the cloud is a valuable technique for reducing false alarms from the on-device model. The wake word data can also be used to automatically train new edge-based models for ongoing improvements.
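The two-stage idea described above (an on-device trigger followed by a cloud check) can be sketched roughly as follows. This is a minimal illustration, not SensoryCloud's API: the function name, the score range of 0 to 1, the threshold values, and the `cloud_verify` callback are all assumptions made for the example.

```python
from typing import Callable


def accept_wake_word(
    audio: bytes,
    edge_score: float,
    cloud_verify: Callable[[bytes], float],
    edge_threshold: float = 0.5,   # hypothetical on-device threshold
    cloud_threshold: float = 0.8,  # hypothetical, stricter cloud threshold
) -> bool:
    """Accept a wake word only if both the edge and cloud stages agree.

    edge_score is the confidence reported by the on-device model (assumed 0..1).
    cloud_verify stands in for a call to a cloud verifier and returns a
    confidence in the same range; it is a placeholder, not a real endpoint.
    """
    # Stage 1: if the edge model is not confident, nothing is sent to the cloud.
    if edge_score < edge_threshold:
        return False
    # Stage 2: the cloud verifier re-scores the same audio to filter false alarms.
    return cloud_verify(audio) >= cloud_threshold
```

Because the cloud stage only runs after an edge trigger, it can reject false alarms without affecting audio that the device never flagged.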
A highly flexible facial recognition solution allows app developers and OEMs to quickly add face biometrics to any mobile or desktop application or device. SensoryCloud Face Recognition supports advanced cloud features such as single-frame liveness, cross-device authentication, and continuous model updates. It verifies identity by matching the user’s face to a stored biometric template, which is held as an irreversible, encrypted code.
Voice biometrics automatically identify and authenticate users based on their voice. The technology supports multiple enrolled users and enables brands to deliver new experiences where a device instantly associates a user’s voice with a profile, allowing it to access specific data, track conversations, or control access to features and capabilities.
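As a rough illustration of how a voice could be associated with one of several enrolled profiles, the sketch below compares a voice embedding against stored templates using cosine similarity. It is a generic technique shown under stated assumptions, not a description of SensoryCloud's method: the embedding vectors, the `identify` function, and the 0.8 threshold are all hypothetical.

```python
import math
from typing import Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(
    embedding: List[float],
    enrolled: Dict[str, List[float]],
    threshold: float = 0.8,  # hypothetical acceptance threshold
) -> Optional[str]:
    """Return the best-matching enrolled user, or None if no one is close enough."""
    best_user, best_score = None, threshold
    for user, template in enrolled.items():
        score = cosine(embedding, template)
        if score >= best_score:
            best_user, best_score = user, score
    return best_user


# Toy two-dimensional templates; real voice embeddings would be much larger.
profiles = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
print(identify([0.9, 0.1], profiles))  # closest to alice's template
print(identify([1.0, 1.0], profiles))  # equally far from both, below threshold
```

Returning `None` for an unmatched voice is what lets an application fall back to a guest profile or deny access to restricted features.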
Sound ID makes devices cognizant of concerning sounds and can warn people when they occur to enhance situational awareness at home, at work and more. Our models are trained to recognize a variety of environmental sounds, including glass breaking, babies crying, dogs barking, home security alarms, smoke/CO alarms, doorbells, knocking, snoring and more.