AI: What is the difference between NLP and NLU?
Although only one small letter differentiates them, NLP (Natural Language Processing) and NLU (Natural Language Understanding) are distinct but complementary notions when it comes to recognizing text and understanding words.
It is easy to see why natural language understanding is an extremely important issue for companies that want to use intelligent robots to communicate with their customers.
The notion of NLP emerged in the 1960s. Its main purpose is to allow machines to record and process information in natural language.
The difference between NLP and NLU
In order to be able to work and interact with us properly, machines need to learn through a natural language processing (NLP) system.
NLP – Natural Language Processing
NLP or ‘Natural Language Processing’ is a set of text recognition solutions that can understand words and sentences formulated by users.
The aim is to analyze and understand a need expressed naturally by a human and be able to respond to it.
NLP groups together all the technologies that take raw text as input and then produce a desired result, such as natural language understanding, a summary, or a translation. In practical terms, NLP makes it possible to understand what a human being says, to process the data in the message, and to provide a natural language response.
NLU – a subfield of NLP
Natural Language Understanding (NLU) refers to the analysis of a written or spoken text in natural language and understanding its meaning. It is therefore a subfield of NLP.
NLP interprets what the client says or writes literally, while NLU identifies the intentions and the deeper meaning.
NLU relies on algorithms trained to categorize information ‘inputs’ according to ‘semantic data classes’. The resulting model, typically built with neural networks, can determine whether an input X belongs to class Y, class Z, or any other class.
“I’m trying to get in touch Amazon, do you know the number?” The NLU will understand that the person wants to contact the multinational technology company.
“Where is the Amazon Forest located?” The NLU will understand the difference in intent with the previous sentence. We are talking about the rainforest here, not the company.
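The two Amazon sentences above can be sketched in code. Here is a minimal, purely illustrative intent classifier: a tiny bag-of-words scorer in Python. The intent names and example phrases are invented for illustration; a real NLU engine would use a trained neural model rather than simple word overlap.

```python
# Hypothetical sketch of intent classification via word overlap.
# Intent names and example phrases below are invented for illustration;
# production NLU systems learn these classes from large annotated datasets.

INTENT_EXAMPLES = {
    "contact_company": [
        "i'm trying to get in touch with amazon do you know the number",
        "how can i call customer service",
    ],
    "geography_question": [
        "where is the amazon forest located",
        "which country is the rainforest in",
    ],
}

def classify_intent(text: str) -> str:
    """Return the intent whose example phrases share the most words with the input."""
    words = set(text.lower().replace("?", "").replace(",", "").split())

    def score(examples):
        # Best overlap between the input and any example phrase of this intent
        return max(len(words & set(ex.split())) for ex in examples)

    return max(INTENT_EXAMPLES, key=lambda intent: score(INTENT_EXAMPLES[intent]))

print(classify_intent("I'm trying to get in touch Amazon, do you know the number?"))
# -> contact_company
print(classify_intent("Where is the Amazon Forest located?"))
# -> geography_question
```

Even this toy version shows the core idea: the same word, "Amazon", lands in different classes depending on the words around it.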
NLU is also able to recognize entities: words and expressions identified in the user’s request (input) that can determine the path of the conversation.
In our Amazon example, the entities recognized in the input are: Amazon + forest + located
Intent and entity
The understanding of natural language is therefore based on two key pieces of information: the intent and the entity.
The intent enables the understanding of the message from the user. It is characterized by a typical syntactic structure found in the majority of inputs corresponding to the same objective.
The entity is a piece of information present in the user’s request, which is relevant to understand their objective. It is typically characterized by short words and expressions that are found in a large number of inputs corresponding to the same objective.
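A simple way to picture entity recognition is a gazetteer: a dictionary of known values per entity type, matched against the user's words. The sketch below is purely illustrative, with invented entity types and values; real NLU engines learn entities from annotated examples instead of fixed lists.

```python
# Hypothetical sketch of entity recognition with a small gazetteer.
# Entity types and values are invented for illustration only.

ENTITY_GAZETTEER = {
    "organization": {"amazon", "google"},
    "place": {"forest", "rainforest", "river"},
}

def extract_entities(text: str) -> dict:
    """Return the entities found in the input, grouped by entity type."""
    tokens = [t.strip("?,.!").lower() for t in text.split()]
    found = {}
    for entity_type, values in ENTITY_GAZETTEER.items():
        matches = [t for t in tokens if t in values]
        if matches:
            found[entity_type] = matches
    return found

print(extract_entities("Where is the Amazon Forest located?"))
# -> {'organization': ['amazon'], 'place': ['forest']}
```

Combined with the intent, these entities give the conversational agent everything it needs to choose its next step.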
NLP and NLU also make applications such as the following possible:
- Automatic dialogue summary
- Automatic translation
Thanks to NLU, a callbot powered by artificial intelligence has an advanced understanding of natural language. If this is not precise enough, human intervention is possible, for instance through a low-code platform for creating conversational agents.
NLU – NLP and speech recognition
Speech recognition is not a new topic. Historically, the first speech recognition goal was to accurately recognize the 10 digits transmitted over a wired device (Davis et al., 1952). From 1960 onwards, numerical methods were introduced that effectively improved the recognition of individual components of speech, such as when you are asked to say 1, 2 or 3 over the phone. However, it took much longer to tackle ‘continuous’ speech, which remained rather complex for a long time (Haton et al., 2006).
Just like learning to read where you first learn the alphabet, then sounds, and eventually words, the transcription of speech has evolved over time with technology.
From the simplest to the most complex:
- Recognition of individual words
- Recognition of a string of words where only one person speaks
- Coarticulation, where one phoneme can influence another. For example, in the sentence “I have to go”, the end of the word ‘have’ sounds like ‘f’
- Taking into account disruptive factors such as continuous speech or the variability of interlocutors, which has been made possible by advances in machine learning.
ASR or Automatic Speech Recognition
When dealing with speech interaction, it is essential to have a system that transcribes speech in real time.
This transcription phase takes place together with the analysis and comprehension stages.
The transcription uses algorithms known as Automatic Speech Recognition (ASR), which generate a written version of the conversation in real time.
Simply put, you can think of ASR as speech recognition software that lets someone make a voice request.
It then transforms this request into written text. ASR analyzes the context of each word in order to pick the right word out of all its homonyms. So, when we say the word “Amazon”, are we talking about the company or the forest? It is the context of the sentence that determines the optimal interpretation. This is where AI comes into play, finding the true meaning of the question depending on the context.
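The homonym problem can be sketched as a post-processing step on a transcript: score each candidate sense of "Amazon" by how many of its context-cue words appear in the sentence. The cue lists below are invented for illustration; real ASR systems use statistical language models rather than hand-written cues.

```python
# Hypothetical sketch of context-based homonym disambiguation on an
# ASR transcript. Sense labels and cue words are invented for illustration.

CONTEXT_CUES = {
    "Amazon (company)": {"number", "order", "delivery", "customer", "contact"},
    "Amazon (rainforest)": {"forest", "located", "river", "brazil", "trees"},
}

def disambiguate(transcript: str) -> str:
    """Pick the sense whose cue words best match the transcribed sentence."""
    words = {t.strip("?,.").lower() for t in transcript.split()}
    return max(CONTEXT_CUES, key=lambda sense: len(words & CONTEXT_CUES[sense]))

print(disambiguate("do you know the Amazon customer number"))
# -> Amazon (company)
print(disambiguate("where is the Amazon forest located"))
# -> Amazon (rainforest)
```

The surrounding words, not the ambiguous word itself, carry the signal, which is exactly why context matters for ASR output.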