The transcription tools developed by Zaion Lab engineers are based on complex neural architectures that accurately represent both the spoken and the semantic features of speech. This leads to a very low word error rate (WER), and thus a level of accuracy that surpasses the tools available on the market.
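For readers unfamiliar with the metric, WER is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. The sketch below illustrates that standard definition only; the function name and the sample sentences are illustrative assumptions, and it does not describe how Zaion evaluates its own models.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of six.
wer = word_error_rate(
    "my customer number is four two",
    "my customer number is for two",
)
print(f"WER: {wer:.1%}")
```

A lower value means fewer transcription mistakes; a perfect transcript scores 0.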
Real time or asynchronous
When it comes to conversation, it has to be instant! Our transcription tool is designed to respond in real time, which means in less than 200 ms (milliseconds), providing a smooth conversation with a voice bot.
It can also be used asynchronously for applications that do not need an instant response. In this case, the response time is one third of the recording time on average.
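As a rough planning aid, the average figure above can be turned into a quick estimate. This is a minimal sketch that assumes the one-third ratio holds for a given recording; the function and constant names are hypothetical, and actual processing times will vary.

```python
# Average ratio stated above: processing takes about one third of the recording time.
RECORDING_TO_PROCESSING_DIVISOR = 3


def estimated_async_seconds(recording_seconds: float) -> float:
    """Estimate the average asynchronous response time for a recording."""
    if recording_seconds < 0:
        raise ValueError("recording_seconds must be non-negative")
    return recording_seconds / RECORDING_TO_PROCESSING_DIVISOR


# A 3-minute call (180 s) would take about one minute on average.
print(estimated_async_seconds(180))
```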
Recognition of specific data formats
Customer service interactions often include specific data formats, such as customer numbers and license plates, that are not present in the datasets used to train the models. These include alphanumeric references, the spelling of first and last names, and addresses. The absence of these formats from the training phase of large-vocabulary models explains why their results are typically unsatisfactory.
We have developed models that combine a powerful technical architecture with a high degree of business expertise to recognize these patterns. The result is impressive: Zaion speech recognizes more than 87% of the most intricate alphanumeric references on the first try, including complex French and Belgian license plates. This makes the customer identification phase smooth and natural.
Technology adapted to the specific nature of conversations
“…All of us who engage in conversation know that despite everything there are “failures” in the system: our interlocutors are not necessarily finished speaking at the moment we think they are, and they do not always wait until we ourselves are finished before speaking.” Candace West: Gender, Language and Conversation
Human conversations are often punctuated by social dialogue dynamics such as interruptions and overlapping speech. Although the human ear is familiar with this type of behavior, it poses a real challenge for speech recognition tools.
Zaion speech is designed to cope with these situations. The tool natively handles overlaps, onomatopoeia, hesitations, and other conversational behaviors.
Robust against noise and poor-quality phone signals
Just as a weak phone signal makes speech unintelligible to the human ear, the performance of speech recognition tools depends heavily on the quality of the signal.
Our training methodology forces the system to recognize speech irrespective of signal distortions.