dilbert said:
Dispense with native language recognition and just use "voice activation" to select a custom mode, with a speech pattern that is developed in DPP and uploaded into the camera. Then the camera only needs to match what it hears against the pattern files, and it is trained only to your voice.
If building the pattern file were too much for a PC (not likely), Canon could offer it as part of a "Canon Cloud Photography" package or feature.
I think a limited number of commands could be handled by the method you described. By limited, I mean specific commands, not general queries.
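To make the "limited command set" idea concrete, here is a minimal sketch in Python. It assumes some recognizer has already turned the audio into text, which is a simplification: a camera matching against uploaded voice pattern files would compare audio features, not strings. The command phrases, the mode names, and the match_command helper are all hypothetical and are not anything Canon ships.

```python
import difflib

# Hypothetical command set: spoken phrase -> custom mode.
# These names are illustrative only, not Canon's.
COMMANDS = {
    "custom one": "C1",
    "custom two": "C2",
    "custom three": "C3",
}


def match_command(heard, cutoff=0.75):
    """Return the mode for the closest known phrase, or None if nothing is close enough."""
    phrase = heard.strip().lower()
    # Only the fixed phrases are candidates, so a general query is rejected
    # rather than being forced onto some command.
    hits = difflib.get_close_matches(phrase, list(COMMANDS), n=1, cutoff=cutoff)
    return COMMANDS[hits[0]] if hits else None


if __name__ == "__main__":
    for heard in ("Custom one", "custom too", "what is the weather today"):
        print(repr(heard), "->", match_command(heard))
```

Because the matcher only ever chooses among a handful of fixed phrases, a slightly misheard command can still land on the right mode, while anything that looks like a general query falls below the cutoff and is ignored.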
I find Siri so bad that I am amazed that someone would embarrass themselves by releasing it. Google Maps and the Google browser work great, and almost always figure out what I am saying. I have an Amazon Fire TV (the latest model), and it is very good at figuring out my voice commands about almost anything.
I attended a meeting last week for deaf and hard of hearing people (I'm having a cochlear implant next week). The meeting used Streamtext.net, and I was totally amazed.
http://www.streamtext.net/
I did not spot the microphone(s), but every person's voice in the room was picked up, translated into text, and projected on a large screen. It was 100% accurate. I've never seen voice to text that accurate, much less in a meeting with a fair amount of noise and people scattered around the room. It's a pay-to-use service, but well worth it: even at its highest rate of $0.09/minute, a 2-hour meeting costs only $10.80. And you can have up to 50 clients connected at various locations at no additional charge. Since it's captured by a web browser, you can record the meeting and publish minutes later too.
Because I will be totally deaf for a month after my implant surgery, I have been researching and testing voice-to-text apps for tablet and smartphone use so my wife can easily communicate with me. Many of the apps work, but only marginally. I currently have speak2see installed on my iPhone. It works well, but there is no control over text size, so it needs a large tablet, or I need to find a way to reduce the text size.