An open source computing system that users control by voice command could create new opportunities for service providers seeking to differentiate their offerings or develop custom solutions.
Earlier this month, engineering researchers at the University of Michigan unveiled Sirius, "an open end-to-end standalone speech and vision based intelligent personal assistant (IPA) service similar to Apple’s Siri, Google’s Google Now, Microsoft's Cortana, and Amazon’s Echo." In addition to speech recognition, the IPA includes image matching, a cloud-based question-and-answer system, and natural language processing. The system integrates these capabilities so that users can, for example, show it a photograph of a restaurant and ask, "When does this place close?" according to the developers.
"I think software providers could incorporate Sirius in varieties of aspects. One example is more advanced automatic machine response on hotlines. For example, you can call Delta and just ask, "When is my flight?" and get the answer right away instead of navigating through depths of the menu. In general, Sirius can be used to provide a more user friendly interface," Yiping Kang, a first-year PhD student at the University of Michigan who worked on Sirius, told Talkin' Cloud. "Sirius could be used to replace most search functionality and enable more advanced and customer services that are easier to use. Of course, this is only one product that could benefit from Sirius. I am sure there are a lot other ways to incorporate Sirius into commercial software."
Sirius is free, and solution providers, end users, or IT departments can customize the software. It is based primarily on Carnegie Mellon University's Sphinx, Microsoft Research's Kaldi, Germany's RWTH Aachen RASR, and OpenEphyra. Image recognition is based on the SURF (Speeded-Up Robust Features) algorithm. Because Sirius is open source, the IPA could appeal to channel organizations that are active in vertical markets such as banking, hospitality, tourism, and retail, or that serve horizontal business needs including customer service, education, wearables, and distribution, said Michael Starnes, president of Orlando-based Starnes Consulting, in an interview.
"With the main competing platforms, the existing licensed options — from Apple, Microsoft, and Google — the main publicized use of voice control is driven by a set of hands-free options. Sirius is taking an additional fork to functionality, one not yet seen (in my opinion) by the big three. The ability to upload an image — from a phone or wearable — and ask a question about that image — or refer to the image in a complex sentence — is part of the Sirius project team goals," he said. "For example, taking a picture of the Washington Monument and then asking “How old is this?” That's a function set that goes beyond the current voice control paradigm. A true step forward, in my opinion."
What's in it for solution providers?
As businesses, developers, and the channel try to tap the potential of wearables, Sirius could play an important role, said Starnes. The availability of a free, open source IPA gives solution providers a much lower cost of entry and more control over their client offerings.
"For Sirius to succeed, a market player — regional VAR or manufacturing arm — must adopt the software in a usable and marketable format. The chance of this happening is high given the proclivity of all the voice command players to demand chip/OS level payment for enabled software features," he said.
Added Kang: "The benefit of Sirius would be more flexible and configurable in terms of capability, focus, and deployment. It can provide a customized IPA that is more suitable for the specific requirement of the customer."
Google, ARM, DARPA, and the National Science Foundation helped fund the project.