Researchers at Johns Hopkins Introduce a Machine Learning Model That Can Allow Computers to Understand Human Conversation

Human conversation is dynamic, full of exceptions and unexpected ways of expressing oneself. In recent years, significant progress has been made in helping machine learning systems grasp even the basics of human communication. While human-machine interaction is the focus of most current dialogue research, it is human-human communication that poses the biggest obstacles to spoken language understanding.

Researchers from the Johns Hopkins Center for Language and Speech Processing have introduced a machine learning model that distinguishes the functions of speech in transcripts of dialogues produced by language understanding (LU) systems. This work could one day help computers “understand” spoken or written text in much the same way that humans do.

The new model recognizes the intent behind words in the final transcript and categorizes them as “Statement,” “Question,” or “Interruption”: a task known as “dialogue act recognition.” With this, the model could become a bridge in understanding human conversation.
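To make the task concrete, here is a deliberately simple, rule-based sketch of dialogue act recognition: it assigns each transcript line one of the three labels mentioned above. The rules and examples are illustrative assumptions for this article, not the trained neural model the Johns Hopkins team built.

```python
def classify_dialogue_act(utterance: str) -> str:
    """Assign a coarse dialogue-act label to a single utterance."""
    text = utterance.strip()
    # An utterance cut off with a dash often marks an interruption.
    if text.endswith("--") or text.endswith("-"):
        return "Interruption"
    # A question mark or a leading wh-/auxiliary word suggests a question.
    first = text.split()[0].lower() if text.split() else ""
    if text.endswith("?") or first in {"who", "what", "when", "where",
                                       "why", "how", "do", "did",
                                       "is", "are", "can"}:
        return "Question"
    return "Statement"

transcript = [
    "Where did you send the report?",
    "I emailed it yesterday.",
    "But I never--",
]
labels = [classify_dialogue_act(u) for u in transcript]
print(labels)  # ['Question', 'Statement', 'Interruption']
```

A real dialogue-act recognizer learns these cues from labeled conversations rather than hand-written rules, but the input/output contract is the same: one act label per utterance.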

When presented with large, unstructured chunks of text, LU systems struggle to classify properties like the text’s topic, mood, and purpose. The new approach spares LU systems from that scenario: instead of raw text, they can work with precisely labeled units of communication, such as a question or an interruption, drawn from a fixed set of categories.

Some previously established language-understanding models were adapted to arrange and categorize words and phrases. Further, the team analyzed how different variables, such as punctuation, affect those models’ performance. It was found that punctuation delivers significant clues to the models that are not otherwise evident in the text.
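One way to probe punctuation’s contribution, sketched below under assumed toy data, is an ablation: score the same classifier on transcripts with and without punctuation. The `is_question` rule here is a stand-in for the adapted models in the study, chosen to make the dependence on punctuation obvious.

```python
import string

def strip_punctuation(text: str) -> str:
    """Remove all ASCII punctuation, simulating a raw speech transcript."""
    return text.translate(str.maketrans("", "", string.punctuation))

def is_question(text: str) -> bool:
    # A punctuation-reliant toy model: it leans entirely on "?".
    return text.strip().endswith("?")

# (utterance, is-it-a-question) pairs; toy data for illustration.
data = [("Is it raining?", True), ("It is raining.", False)]

acc_with = sum(is_question(t) == y for t, y in data) / len(data)
acc_without = sum(is_question(strip_punctuation(t)) == y for t, y in data) / len(data)
print(acc_with, acc_without)  # 1.0 0.5
```

The drop from perfect to chance-level accuracy once punctuation is stripped mirrors, in miniature, the clue-carrying role of punctuation the team observed.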

Most natural language processing (NLP) algorithms work well only when the text has a clear structure, such as when a person speaks in complete sentences. However, people rarely communicate so formally in real life, making it difficult for algorithms to determine where a phrase begins and ends.
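The segmentation problem is easy to demonstrate. A naive sentence splitter that keys on sentence-final punctuation (a common baseline, assumed here for illustration) handles formal text but returns casual, unpunctuated speech as one undivided chunk:

```python
import re

def naive_sentences(text: str) -> list[str]:
    # Split after sentence-final punctuation -- works only for formal text.
    return [s for s in re.split(r"(?<=[.?!])\s+", text) if s]

formal = "I sent the file. Did you get it?"
casual = "yeah so i sent the file um did you get it i mean the new one"

print(naive_sentences(formal))  # two clean sentences
print(naive_sentences(casual))  # one long undivided chunk
```

Conversational speech, with its fillers and missing punctuation, gives such boundary rules nothing to anchor on, which is exactly the gap the dialogue-act approach targets.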

The team wanted to ensure that the new technology could understand everyday speech. Dialogue acts are critical to comprehending a dialogue’s structure: they are atomic units of communication, more fine-grained than utterances and more specialized in their function. This is where the “dialogue act” concept comes into play, and it could help with a variety of tasks, including summarization, intent identification, and keyword discovery.
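The sketch below shows, with assumed act labels (e.g., produced upstream by a dialogue-act recognizer), how a tagged transcript feeds the downstream tasks just mentioned: filtering by act type for intent identification, and counting terms for keyword discovery.

```python
from collections import Counter

# (act label, utterance) pairs; the labels are assumed inputs from a
# dialogue-act recognizer, and the transcript is invented for illustration.
tagged = [
    ("Statement", "the shipment left the warehouse on monday"),
    ("Question", "when will the shipment arrive"),
    ("Question", "can you track the shipment"),
]

# Intent identification: the questions carry the caller's requests.
questions = [text for act, text in tagged if act == "Question"]

# Keyword discovery: frequent words across all acts hint at the topic.
words = Counter(w for _, text in tagged for w in text.split())

print(questions)
print(words.most_common(2))  # [('the', 4), ('shipment', 3)]
```

In practice one would drop stopwords like “the” before counting, but even this toy version surfaces “shipment” as the conversation’s topic.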

Many organizations use speech analytics to glean insights from consumer interactions with contact center customer care representatives. In most cases, speech analytics entails automatic transcription of conversations, and keyword searches over those transcripts offer only limited potential for insight. The team believes their model will benefit such organizations.

The model could also be adopted by doctors in the future, sparing them the time spent taking notes while seeing patients. The approach could automatically read the transcript of a conversation, fill out paperwork, and take notes.
