A virtual you is about to take up residence in the cloud and it will use your unique voice to fluently speak languages you don’t even understand. This visual and vocal clone will become your international ambassador, traveling the digital world to engage face-to-face with customers, collaborators, and colleagues.
This is one promised outcome of the continuing evolution of cloud computing and its potential as a platform for enhancing the power of video conferencing. This particular vision of a holographic, multilingual digital self comes from Microsoft’s recent Inspire 2019 conference in Las Vegas. Using technologies supplied by its cloud platform, Azure, and presented through its mixed reality headset, HoloLens 2, Microsoft demonstrated how future video conference callers and presenters will be able to authentically recreate their image and voice digitally and send it out across the internet-connected world.
Azure video conferencing is a glimpse at the next generation of digital communication.
Building a Better Hologram
Microsoft seems to mark every consumer, industry, and developer conference with some sort of new holographic spectacle. Ever since the HoloLens was unveiled in early 2015, Microsoft has been using the holographic headgear to promise all sorts of interactive, 3D breakthroughs that will change our lives.
This year’s demonstration came with a vocal element. Azure corporate vice president Julia White donned the still rather bulky-looking headset to show in-room and online audiences how the tech could be combined with text-to-speech, translation, and voice signature apps to create a realistic avatar of herself that was capable of speaking Japanese.
It would have been even more impressive to see the system convert live speech or text into a 3D, Japanese-speaking digital replicant on the spot. What we did get was still one of the more convincing digital recreations of a human being to date. The focus for us, though, is on the Azure elements of the performance.
The cloud platform provided the language conversion and AI components of the demonstration. Its machine learning prowess was used to make the hologram speak in White’s own unique voice, even though it used words she doesn’t understand. The system drew on recordings of her natural voice to give the virtual clone the same inflections and idiosyncrasies.
This kind of powerful cloud-based AI is what could give video conferencing a new level of versatility and authenticity online.
You’re going to spend the majority of your working life in the cloud. It is estimated that more than 73 percent of businesses already have at least a portion of their digital infrastructure in the cloud, and that 30 percent of all IT budgets are dedicated to cloud computing. The cloud is a cheap, low-maintenance, easily upgradable way of building and accessing the IT infrastructure that powers all your digital requirements, from storage to hosting video calls. Cloud-based video conferencing has become the dominant form of video calling because it offers flexibility and scalability.
At the moment, Microsoft’s cloud platform Azure runs a distant second in terms of public cloud market share to Amazon’s AWS (about 15 percent to Amazon’s 47 percent). Microsoft’s product is, however, used by more than 95 percent of Fortune 500 companies (according to Microsoft), and it is growing at a faster rate than its Big 5 rival, in part due to the existing enterprise license deals it has in place for other apps such as Office 365 and Windows.
That library of software includes Microsoft’s workplace collaboration app, Teams. The video conferencing-enabled platform could potentially be enhanced by cloud access to the data-heavy, high-performance apps of Azure, such as media services and the neural text-to-speech capabilities demonstrated at Inspire 2019. Amazon, likewise, could potentially link its cloud services to its own video app, Chime.
Cloud computing potentially gives everyday professionals cheap access to advanced software that a decade ago was prohibitively expensive to both use and maintain.
Avatars for All with Azure Video Conferencing
To be clear, Azure isn’t a video conferencing platform in itself. It is an enabling service used for IT infrastructure, platform, and software provision. What Microsoft has demonstrated with its HoloLens 2 performance is how Azure and other cloud platforms can be leveraged to enhance video conferencing. There are, however, still a few drawbacks to using an avatar as Microsoft imagines.
For starters, when you wear a headset to access mixed reality technologies you obscure your face and make it impossible to broadcast a live image of yourself on a video call. That’s one reason you have to use an avatar to project yourself in 3D. The other is that it takes a huge amount of computing power to generate a realistic live hologram, the kind of power only the biggest corporations can afford. There is a hope that the advanced facial recognition technology of leading consumer-grade webcams could one day provide the necessary imagery, but the visuals created by Microsoft for Inspire 2019 came from a high-end studio.
Basic platforms such as Skype already offer live text translation for video calls, so going live with the signature voice conversion that Azure enables does not seem like a prohibitive leap. However, even if you can accept an avatar as your replacement during a video conference, you need everyone else on the conference call to have access to a hologram of themselves as well. Again, if the technology becomes more accessible, then it may one day be common practice to have such a digital self lying in wait in the cloud, ready to facilitate a cross-language conversation, but until then, this is a dream for another day.
Yet…what a day that could be. Imagine the possibilities of sending your digital, language-enhanced self into every corner of the internet, to meet on virtual landscapes that aren’t bound by mortal scales of time, size, or destination. Cloud-based video conferencing that draws on the AI and machine learning provided by platforms like Azure is a glimpse into that digital future.