30-60 available languages
All embedded voice technology with one solution: CSDK.
The CSDK (Cerence Software Development Kit) is an embedded voice technology in the form of a software development kit. With these tools, you will be able to integrate different voice features to create many types of interactions.
In its latest version, the CSDK contains:
- Cerence ASR (previously VoCon): An embedded ASR module (Automatic Speech Recognition, also called Speech-to-Text or STT) for voice transcription.
- Cerence TTS (previously Vocalizer): An embedded TTS (Text-to-Speech) module used to produce voice synthesis.
- Cerence NLU: An NLU (Natural Language Understanding) module for natural language comprehension in embedded systems.
- Cerence Audio-Processing: Several tools included in the CSDK to improve and simplify the processing of microphone audio.
- Dev Tools: A Windows-based software suite to help you develop your voice solutions.
Cerence ASR: embedded voice transcription engine (Speech-to-Text), formerly VoCon.
Cerence ASR (also known as STT for Speech-to-Text) is one of the most popular embedded voice transcription solutions. It is also the voice engine formerly known as VoCon by Nuance.
Included in the CSDK, it offers superior functionality, unmatched accuracy and high performance for a variety of applications that benefit from speech control. Designed as a modular and scalable engine, Cerence ASR can be adapted to a wide range of embedded applications in industry, logistics, transportation, etc.
The great strength of the Cerence ASR included in the CSDK lies in its extensible dictionaries. This feature lets you directly modify the lexicons understood by the transcription engine to improve its performance in particular cases, for example with terms specific to your business. If a word is initially misrecognized, you can rework its associated phonetics using the development tools.
Cerence ASR has several functions, including:
- Wide vocabulary support: Allows speech recognition over large corpora of up to millions of units.
- High reliability in a noisy environment: Capable of high-precision recognition with a signal-to-noise ratio as low as 5 dB.
- Embedded Voice Dictation: Recognizes free dictation text, more broadly than separate voice commands.
- Spelling module: Serves as a backup for the voice recognition system.
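To make the 5 dB figure in the list above concrete, here is a minimal sketch of how a signal-to-noise ratio in decibels relates to a linear power ratio. The function names are illustrative only and are not part of the CSDK; the formula is the standard definition of SNR for power values.

```python
import math


def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels from two average power values."""
    if signal_power <= 0 or noise_power <= 0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(signal_power / noise_power)


def power_ratio(snr: float) -> float:
    """Inverse conversion: linear power ratio for a given SNR in dB."""
    return 10.0 ** (snr / 10.0)


# At 5 dB the speech carries only about 3.16 times the power of the noise.
ratio_at_5_db = power_ratio(5.0)
```

In other words, an SNR of 5 dB describes a setting where background noise is a substantial fraction of the speech power, which is typical of vehicles or industrial sites.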
For more information about the features of Cerence ASR, you can contact us directly for a detailed presentation.
Cerence TTS: embedded and cloud-based speech synthesis tool (Text-to-Speech), formerly Vocalizer.
Cerence TTS (previously known as Vocalizer), also a module of the CSDK, transforms the voice assistant experience by offering the most natural speech synthesis for cloud and embedded applications. Cerence offers Cerence Cloud Services and integrated SDKs for Windows, Linux, OSX, Android and iOS.
Cerence TTS is a suite of speech synthesis solutions that generates high-quality voice from Text-to-Speech and pre-recorded audio. The software is optimized for reading long texts in a natural, human-like way. New algorithms based on deep learning models offer greater fluidity and more natural prosody, providing a unique vocal experience.
Cerence TTS also offers several features, such as:
- Emotional voice synthesis: Choice between 4 ways of speaking (neutral, playful, authoritative and empathetic).
- Improved Expression Styles: Ability to enhance text-to-speech with pre-recorded speech elements.
- Contextual intelligence: Optimizes the reading of certain elements through an intelligent tagging system for addresses, dates, phone numbers…
- Prosody control: Manipulation of the pitch, volume, rhythm and timbre of the synthesized voice.
For more information about the features of Cerence TTS, you can contact us directly for a detailed presentation.
The technical environments for integrating the CSDK locally into your systems are as follows:
Operating-system-dependent API binding and packaging:
- Android: The CSDK is delivered with a Java API binding compiled in an Android archive (AAR).
- Windows/Linux: The CSDK is delivered with a C API binding.
- Apple iOS: The CSDK is shipped in a framework archive; it is deployed with an Objective-C binding and bridging headers to support the Swift API.
Standard ports and tools:
- iOS (version 7.0 and up): arm64 and x86_64
- Android (version 6.0 and up): armv7 (32-bit), arm64 and x86_64
- Linux: armv7 (32-bit), arm64 and x86_64
- Windows: x86_64
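The platform list above can be captured as a small lookup table, which is handy when checking a build target before requesting an evaluation. The sketch below is only an illustration: the dictionary, the `is_supported` function and the way versions are compared are assumptions made for this example, not something shipped with the CSDK. The operating systems, minimum versions and architectures come from the list above.

```python
from typing import Optional

# Minimum OS version (None when the page states none) and supported architectures.
SUPPORTED_TARGETS = {
    "ios": {"min_version": "7.0", "archs": {"arm64", "x86_64"}},
    "android": {"min_version": "6.0", "archs": {"armv7", "arm64", "x86_64"}},
    "linux": {"min_version": None, "archs": {"armv7", "arm64", "x86_64"}},
    "windows": {"min_version": None, "archs": {"x86_64"}},
}


def _parse_version(version: str) -> tuple:
    """Turn '6', '6.0' or '12.1.3' into a comparable 3-part tuple."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def is_supported(os_name: str, arch: str, version: Optional[str] = None) -> bool:
    """Return True if the (OS, architecture, version) target is in the list.

    Assumption: when no version is given, only OS and architecture are checked.
    """
    target = SUPPORTED_TARGETS.get(os_name.lower())
    if target is None or arch not in target["archs"]:
        return False
    minimum = target["min_version"]
    if version is None or minimum is None:
        return True
    return _parse_version(version) >= _parse_version(minimum)
```

For example, `is_supported("android", "arm64", "5.1")` returns `False` because Android 6.0 is the stated minimum, while `is_supported("linux", "armv7")` returns `True`.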
Code and data:
Feature | Code size |
Basic command and control | 3.2 MB |
All features, largest acoustic model | 9.5 MB |
Data, model size:
Component | Data size per language |
Acoustic model by language – Gen 4 compact / Gen 5 / Gen 6 | ~900 kB / ~4 MB / ~6 MB |
CLC – Monolingual | 300-7300 kB |
CLC – Multilingual | 700-3000 kB |
Use cases: data size and total RAM usage.
Component | Data size per language | Total RAM usage |
Number recognition | 4 kB | 1.25 MB |
Basic application C&C, 100 / 10K commands | 10 / 500 kB | 1.3 / 1.8 MB |
Telephony with grammar + expressions | 0.52 MB | 12.6 MB |
Points of interest and addresses (USA only) | 300 MB | 56 MB |
Embedded Voice Dictation | 100 MB | 100 MB |
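As a rough illustration of how the use-case table above can guide hardware sizing, the sketch below lists which use cases fit within a given storage and RAM budget. The `UseCase` class and `fitting_use_cases` function are hypothetical helpers written for this example. The figures are taken from the table, reading the "100 / 10K commands" row as two separate configurations, with kB converted to MB using 1 MB = 1000 kB. Code size and acoustic models are not included in this check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    data_mb: float  # data size per language
    ram_mb: float   # total RAM usage


USE_CASES = [
    UseCase("Number recognition", 0.004, 1.25),
    UseCase("Basic C&C, 100 commands", 0.010, 1.3),
    UseCase("Basic C&C, 10K commands", 0.5, 1.8),
    UseCase("Telephony with grammar + expressions", 0.52, 12.6),
    UseCase("Points of interest and addresses (USA only)", 300.0, 56.0),
    UseCase("Embedded Voice Dictation", 100.0, 100.0),
]


def fitting_use_cases(free_storage_mb: float, free_ram_mb: float) -> list:
    """Names of the use cases whose data and RAM needs fit the given budget."""
    return [
        u.name
        for u in USE_CASES
        if u.data_mb <= free_storage_mb and u.ram_mb <= free_ram_mb
    ]
```

With 50 MB of free storage and 16 MB of free RAM, for instance, the first four use cases fit, while points of interest and dictation do not.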
Component | Storage required (excluding code) | RAM used |
Embedded Compact (small system) | 10 MB average / 21 MB maximum | 6 MB average / 23 MB maximum |
Embedded Pro (TTS optimized for greater capacity, e.g. navigation, SMS reading…) | 55 MB average / 131 MB maximum | 14 MB average / 38 MB maximum |
Embedded High (high-quality TTS, suitable for all uses) | 120 MB average / 325 MB maximum | 24 MB average / 69 MB maximum |
Embedded Premium (highest-performing TTS, based on a deep learning model) | 337 MB average / 558 MB maximum | 159 MB average / 198 MB maximum |
The code size of a full-featured Cerence TTS is 10 to 13.5 MB, depending on the integration platform. This can be reduced depending on the languages and features selected for use.
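The TTS footprint figures above lend themselves to a simple selection rule: pick the highest-quality tier whose worst-case requirements still fit the device. The sketch below is an assumption-laden illustration, not a Cerence tool. `best_tts_tier` is a hypothetical helper; it uses the maximum storage and RAM values from the table and adds the upper code-size bound of 13.5 MB to the storage requirement, which is a deliberately conservative choice.

```python
from typing import Optional

# (tier, maximum storage in MB excluding code, maximum RAM in MB),
# ordered from lowest to highest quality as in the table above.
TTS_TIERS = [
    ("Compact", 21.0, 23.0),
    ("Pro", 131.0, 38.0),
    ("High", 325.0, 69.0),
    ("Premium", 558.0, 198.0),
]

# Upper bound of the full-featured code size quoted above (assumed worst case).
MAX_CODE_MB = 13.5


def best_tts_tier(storage_mb: float, ram_mb: float) -> Optional[str]:
    """Highest-quality tier whose worst-case footprint fits, or None."""
    best = None
    for name, max_storage, max_ram in TTS_TIERS:
        if max_storage + MAX_CODE_MB <= storage_mb and max_ram <= ram_mb:
            best = name  # later tiers are higher quality, so keep overwriting
    return best
```

A device with 200 MB of storage and 64 MB of RAM would land on the Pro tier under these assumptions; using the average figures instead of the maximums would give a less conservative answer.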
Would you like to try the CSDK?
We can grant you an evaluation period!
The VoiceMarket supports you throughout your projects.
The state of the art of embedded voice.
The CSDK is the flagship solution in embedded voice technology today. Integrated into the products of the largest companies across many applications, the CSDK keeps advancing voice-based human-computer interaction with ever-increasing performance.
Complete and multi-purpose solution.
The CSDK comes in the form of a software development kit, which lets users adapt it as needed to carry out their voice projects. This versatility makes the CSDK a truly complete tool for creating voice applications, especially in embedded contexts.
Spin-off of a leader in modern voice technology.
Cerence is a spin-off of the world-renowned Nuance, a leader in speech technology. This lineage gives the company, and the CSDK in particular, access to some of the best technological expertise in the field of voice, a guarantee of impeccable quality.
What the CSDK can do for you…
A tailor-made solution.
The CSDK is a modular tool offering you different modules to be integrated according to your needs and constraints. This versatility allows you to design the most suitable solution for your project to optimize its performance.
100% embedded voice.
The main selling point of the CSDK, embedded voice technology, lets you create voice use cases that do not depend on the Cloud. This independence is essential in environments where no internet connection is available.
Multilingual technology.
Depending on its modules, the CSDK is able to manage from 30 to more than 60 different languages in a totally embedded way. The exhaustive list of compatible languages can be found at the top of the page in the main information.
A single business model.
The pricing model of the CSDK is very simple: an annually renewed license per device and/or per user. License pricing is available on request directly from the VoiceMarket, along with a complete quotation for your needs if desired.