Microsoft Azure AI is Bringing Iconic Characters to Life with the Help of Custom Neural Voice and 5G Network

Microsoft Azure AI has combined 5G, augmented reality, artificial intelligence, and Custom Neural Voice to bring the iconic character Bugs Bunny to life at the AT&T Experience Store in Dallas. The breakthrough behind the character was the use of deep learning to get pronunciation right and to perfect the duration and tone of the voice. A major reason for demonstrating the technology in the store was to make people aware of the 5G cellular network's capabilities. When a customer enters the store, the character strikes up a real-time conversation, greeting the customer by name and asking for help finding the golden carrots hidden around the store. The customer then guides the character through the store via chat.

The Impeccable Technology

For years, synthesized speech sounded distinctly robotic, but Custom Neural Voice has raised it to a level where it sounds remarkably natural. The 5G network lets Bugs Bunny appear in HD within seconds and move flawlessly around the store, and consumers feel a stronger connection through this more natural speech. The technology that makes the whole process smooth is neural text-to-speech, part of Azure Cognitive Services. Compared to the 4G network, 5G offers more computing power and much higher speeds.
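As a rough illustration (not Microsoft's actual Bugs Bunny integration), requests to Azure's neural text-to-speech service are typically expressed as SSML documents that name the desired voice. The sketch below builds such a payload; the voice name is a stock Azure voice used as a placeholder, since a Custom Neural Voice deployment would supply its own voice name and endpoint.

```python
# Minimal sketch: assembling an SSML payload for a neural text-to-speech
# request. The voice name is a placeholder stock voice; a Custom Neural
# Voice would substitute its own deployed voice name. Note that real code
# should XML-escape the input text.
from typing import Optional

def build_ssml(text: str, voice: str = "en-US-JennyNeural",
               style: Optional[str] = None) -> str:
    """Wrap plain text in the SSML envelope the Speech service expects."""
    body = text
    if style:  # some neural voices support expressive speaking styles
        body = f'<mstts:express-as style="{style}">{text}</mstts:express-as>'
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">'
        f'<voice name="{voice}">{body}</voice>'
        "</speak>"
    )

ssml = build_ssml("Eh, what's up, doc?", style="cheerful")
print(ssml)
```

The resulting string would be posted to the service's synthesis endpoint, which streams back audio rendered in the named voice.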

The Iconic Character and Transparency 

The main motive behind bringing such an iconic character to life was to unify the physical and virtual environments harmoniously. Bugs Bunny was the first character selected for this purpose, but in all likelihood it will not be the last. Building the character involved a voiceover actor pre-recording more than 2,000 phrases and lines, with the Warner Bros. and Microsoft teams working tirelessly to ensure that the character accurately reflected Bugs Bunny's personality and traits.

The result of this effort was a character that looks as real as possible. Although the technology is generally available in Azure cloud regions, access to it is restricted rather than open to the public, in order to prevent misuse. Microsoft has also put a set of guidelines and ethical principles in place for using the technology.

Creating the Perfect Custom Voice  

After recording the phrases and lines, the second step was to build a "font" of sounds from which a natural-sounding voice could say anything in any situation. The font of sounds works like a computer font, where individual letters combine into meaningful words and sentences. Deep learning ties everything together, producing a voice that sounds nearly as good as a real person's.

Deep learning is a branch of machine learning in which machines learn from a given set of data in order to mimic human-like behavior. As the name suggests, deep learning models a phenomenon in depth by stacking many layers within a neural network. In creating the custom voice, two neural networks with different layers perform different functions, and when the two networks work in harmony, a more natural sound emerges.
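The two-network arrangement described above commonly splits into an acoustic model, which predicts spectral feature frames from the input text, and a neural vocoder, which turns those frames into an audio waveform. The deliberately toy NumPy sketch below (random weights, no training, invented sizes) only illustrates that division of labor, not any real TTS architecture:

```python
# Toy sketch of a two-stage text-to-speech pipeline: an "acoustic model"
# produces feature frames and a "vocoder" converts frames to audio.
# All weights are random and the sizes are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)

def acoustic_model(phoneme_ids: np.ndarray) -> np.ndarray:
    """Stand-in for network 1: map phoneme ids to mel-like feature frames."""
    embed = rng.normal(size=(50, 8))   # untrained lookup table, 50 ids x 8 dims
    return embed[phoneme_ids]          # shape: (num_frames, 8)

def vocoder(features: np.ndarray, sr: int = 16000) -> np.ndarray:
    """Stand-in for network 2: turn feature frames into a waveform."""
    samples_per_frame = 160            # 10 ms frames at a 16 kHz sample rate
    t = np.arange(samples_per_frame) / sr
    # Each frame modulates the pitch of a short sine-wave chunk.
    chunks = [np.sin(2 * np.pi * (200 + 50 * f.sum()) * t) for f in features]
    return np.concatenate(chunks)

phonemes = np.array([3, 17, 5, 42])    # a pretend phoneme sequence
audio = vocoder(acoustic_model(phonemes))
print(audio.shape)                     # 4 frames x 160 samples = (640,)
```

In a real system both stages are trained deep networks; the point here is only that the first network's output is the second network's input, and the naturalness of the final voice depends on the two cooperating.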

Creativity at its Best

Custom Neural Voice is a unique technology in its own right, and its benefits extend to education. Microsoft partnered with a non-profit organization in Beijing, China to generate AI audio content with Custom Neural Voice to help individuals facing vision problems. Microsoft also collaborated with Duolingo (a company offering various language learning courses) to improve the quality of its services and personalize the process of language learning. The Microsoft team created nine different characters to give the program a diverse outlook; researchers tried and tested hundreds of candidates to reflect the cultural influences that would help users connect better with the app's characters. Every character has a distinctive and unique personality, and the custom neural voices act as a catalyst for users in their learning process. The characters speak English, Spanish, French, German, and Japanese for now, and experiments are underway to create more characters and expand the range of languages Duolingo offers.

Responsibly Moving Forward 

This technology aims to leave a positive and lasting impact on people, and while pursuing that goal, Microsoft takes responsibility for ensuring that no harm is caused anywhere in the world. Time and again, Microsoft runs specific tests, assesses the potential risks the technology poses, and then aligns its efforts toward mitigating those risks and creating protocols for use. Azure Cognitive Services wants to empower its customers while treading carefully; Microsoft says it is committed to responsible AI, and whenever it works on new technology, its foremost concern is to follow all of its guidelines.


Amreen Bawa is a consulting intern at MarktechPost. Along with pursuing a BA (Hons) in Social Sciences at Panjab University, Chandigarh, she is a keen learner and writer with a special interest in the application and scope of artificial intelligence in various facets of life.