Would Voice commands be useful

RGF

How you relate to the issue, is the issue.
Jul 13, 2012
I would, at least in theory, be able to say commands to my camera such as "F8", or set the exposure compensation to +1/2 stop.

Of course, there would need to be a way to distinguish my camera from other photographers' cameras. Some sort of voice recognition and/or a name for the camera would be required.
 
The only practical voice commands are actually done by transmitting the voice via an internet or phone connection to a remote computer that is very fast and powerful. Thus, you would need a phone connection or Wi-Fi for that to happen. Take your smartphone, disconnect from Wi-Fi and the phone service, and try a voice command. My iPhone 6+ comes up with totally silly text. Without the connection, it's worthless.

Just voice commands in a camera using the already overloaded processor might not work well at all.
 
Mt Spokane Photography said:
The only practical voice commands are actually done by transmitting the voice via an internet or phone connection to a remote computer that is very fast and powerful. Thus, you would need a phone connection or Wi-Fi for that to happen. Take your smartphone, disconnect from Wi-Fi and the phone service, and try a voice command. My iPhone 6+ comes up with totally silly text. Without the connection, it's worthless.

Just voice commands in a camera using the already overloaded processor might not work well at all.

Not yet... Moore's Law may catch up.

-pw
 
Mt Spokane Photography said:
The only practical voice commands are actually done by transmitting the voice via an internet or phone connection to

Actually, that happens mostly because they want to know what you're telling your device, extract data from it, and use it to build profiles over time.

Complex voice queries may need more power (and data) to process, though nothing a decent PC of today couldn't handle. The biggest hurdle is working out the real semantic meaning of a sentence among the different possible ones.

But a simple, limited, fixed set of voice commands like "aperture f/8", "shutter 1/125", "compensation + 1 and 1/2", "ISO 400" (and you don't really need much more) could easily be processed by smartphone hardware. Phones already did this in the pre-Siri/OK Google/Cortana days, and it worked well, even without training.
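To show how small such a fixed command set is once the speech has been turned into text, here is a rough Python sketch. It assumes some recogniser has already produced the phrase as a string; the setting names and the `parse_command` function are made up for illustration and are not any camera maker's API.

```python
import re
from fractions import Fraction

# Hypothetical fixed grammar: one pattern per command, nothing open-ended.
APERTURE = re.compile(r"^aperture f/?(\d+(?:\.\d+)?)$")
SHUTTER = re.compile(r"^shutter (\d+)/([1-9]\d*)$")
ISO = re.compile(r"^iso (\d+)$")
COMPENSATION = re.compile(r"^compensation ([+-]) ?(\d+)(?: and (\d+)/([1-9]\d*))?$")


def parse_command(text):
    """Return (setting, value) for a recognised phrase, or None otherwise."""
    # Normalise case and whitespace so "ISO  400" and "iso 400" look the same.
    phrase = " ".join(text.lower().split())

    m = APERTURE.match(phrase)
    if m:
        return ("aperture", float(m.group(1)))

    m = SHUTTER.match(phrase)
    if m:
        return ("shutter", Fraction(int(m.group(1)), int(m.group(2))))

    m = ISO.match(phrase)
    if m:
        return ("iso", int(m.group(1)))

    m = COMPENSATION.match(phrase)
    if m:
        sign = 1 if m.group(1) == "+" else -1
        stops = Fraction(int(m.group(2)))
        if m.group(3):
            # Optional fractional part, e.g. "and 1/2".
            stops += Fraction(int(m.group(3)), int(m.group(4)))
        return ("compensation", sign * stops)

    # Anything outside the fixed grammar is rejected rather than guessed at.
    return None
```

Anything that doesn't match one of the four patterns returns `None`, which is the whole point of a fixed set: no attempt to understand what the user "really" meant.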

Don't know about camera processors. It would also increase power consumption a little.

How useful it would be I don't know. There could be some use cases, but would it really be faster than a well-laid-out dial-and-button interface?
 
LDS said:
Mt Spokane Photography said:
The only practical voice commands are actually done by transmitting the voice via an internet or phone connection to
...

But a simple, limited, fixed set of voice commands like "aperture f/8", "shutter 1/125", "compensation + 1 and 1/2", "ISO 400" (and you don't really need much more) could easily be processed by smartphone hardware. Phones already did this in the pre-Siri/OK Google/Cortana days, and it worked well, even without training.

Don't know about camera processors. It would also increase power consumption a little.

How useful it would be I don't know. There could be some use cases, but would it really be faster than a well-laid-out dial-and-button interface?

I think the utility of voice commands would be best realised when setting up exposure bracketing or complex intervalometer sequences, and like you, I was thinking the commands could easily be handled by smart phones.

The 1D bodies have traditionally been able to record voice annotations to go alongside an image; I'd find it much more useful if I could dictate a name or some keywords and have that embedded as text in the image metadata. If I shot a sequence of images, and could literally tell the camera to attach some particular keywords to all images shot in the last two minutes, or all images shot within 50m of my current GPS location, or meeting some other criteria, I'd definitely use the feature regularly.
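The selection side of that idea is simple to sketch. The Python below assumes each shot already carries a capture time and GPS coordinates; the `Shot` class and the two tagging functions are hypothetical names for illustration, and writing the keywords into real image metadata is left out.

```python
import math
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Shot:
    """Hypothetical record of one image; not a real camera data structure."""
    filename: str
    taken_at: datetime
    lat: float
    lon: float
    keywords: list = field(default_factory=list)


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres using the haversine formula."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def tag_recent(shots, keywords, now, window=timedelta(minutes=2)):
    """Attach keywords to every shot taken within `window` before `now`."""
    selected = [s for s in shots if timedelta(0) <= now - s.taken_at <= window]
    for shot in selected:
        shot.keywords.extend(keywords)
    return selected


def tag_nearby(shots, keywords, lat, lon, radius_m=50.0):
    """Attach keywords to every shot taken within `radius_m` of (lat, lon)."""
    selected = [s for s in shots if distance_m(s.lat, s.lon, lat, lon) <= radius_m]
    for shot in selected:
        shot.keywords.extend(keywords)
    return selected
```

Other criteria would just be more filter functions of the same shape, so the voice side only has to supply the keywords and pick which filter to run.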
 
I dare say folks shooting wildlife, weddings, concerts, and the like would not find it useful...

Basic adjustments such as speed, aperture, and ISO are already so fast I'm unsure it would be useful. Maybe for changing more complex settings, or settings one would normally have to access the menus for. It would seem like a cool idea for setting up long bracketing sequences. Also might be nice for flash settings.
 
d said:
I think the utility of voice commands would be best realised when setting up exposure bracketing or complex intervalometer sequences, and like you, I was thinking the commands could easily be handled by smart phones.

Basically, you have two ways to enable voice commands. One closely matches the UI navigation, and it is simpler to implement. The other tries to understand what a user wants, and acts accordingly. The latter is much more complex to implement (the software needs to "understand" what the user really means), and requires far more processing power and other resources.

If you want to say "camera, take a photo every ten minutes from dawn to sunset, then one every half an hour through the night, bracket two stops with 1/2 stop interval", well, this is complex enough, and the camera will also have to compute what "dawn", "sunset" and "night" mean for a given location and day of the year. It also has to understand that "interval" is associated with "bracketing", etc.

IMHO voice commands would be more useful for quick settings that need to be applied promptly (hoping commands are understood quickly and correctly...) rather than complex settings, which require planning anyway and don't need quick changes.
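For contrast, the purely mechanical part of that request is tiny once the numbers have been extracted. A minimal Python sketch, assuming "bracket two stops with 1/2 stop interval" means offsets from -2 to +2 EV around the metered exposure (that reading is itself a guess, which is rather the point); `bracket_offsets` is a made-up helper, and the dawn/sunset and language-understanding parts are not covered at all.

```python
from fractions import Fraction


def bracket_offsets(span_stops, step_stops):
    """Exposure offsets in stops, from -span to +span in equal steps."""
    span = Fraction(span_stops)
    step = Fraction(step_stops)
    if step <= 0 or span < 0:
        raise ValueError("span must be >= 0 and step must be > 0")
    # Whole number of steps that fit on each side of the metered exposure.
    steps_per_side = span // step
    return [i * step for i in range(-steps_per_side, steps_per_side + 1)]


# Two stops either side in half-stop steps gives nine frames.
offsets = bracket_offsets(2, "1/2")
print([str(o) for o in offsets])
```

The arithmetic is the easy bit; deciding that "interval" belongs to "bracket" and not to the "every ten minutes" clause is where the real work would be.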
 
LDS said:
d said:
I think the utility of voice commands would be best realised when setting up exposure bracketing or complex intervalometer sequences, and like you, I was thinking the commands could easily be handled by smart phones.
If you want to say "camera, take a photo every ten minutes from dawn to sunset, then one every half an hour through the night, bracket two stops with 1/2 stop interval", well, this is complex enough, and the camera will also have to compute what "dawn", "sunset" and "night" mean for a given location and day of the year. It also has to understand that "interval" is associated with "bracketing", etc.

You're wrongly assuming I want to have a conversation with the camera - voice commands are one thing, using natural language is something else, and isn't necessary for how I see such a system working.
 
Mt Spokane Photography said:
The only practical voice commands are actually done by transmitting the voice via an internet or phone connection to a remote computer that is very fast and powerful. Thus, you would need a phone connection or Wi-Fi for that to happen. Take your smartphone, disconnect from Wi-Fi and the phone service, and try a voice command. My iPhone 6+ comes up with totally silly text. Without the connection, it's worthless.

Just voice commands in a camera using the already overloaded processor might not work well at all.

Almost.

I had a phone, a Motorola Timeport (in 2000), that had voice recognition, but it was limited: you needed to teach it. The system you're referring to allows anyone to talk to a device without training the device to their voice first.

A camera is usually personal, so it could use the trained method, which vastly cuts the computational overhead, especially if all it's looking for is "f8" or "ISO400".
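One classic way to do trained, single-speaker matching is to store a recorded template per command and compare each new utterance against the templates with dynamic time warping (DTW). Whether the Timeport worked this way isn't stated here, so treat that as an assumption. The Python sketch below uses toy one-number-per-frame features; a real system would use a vector of audio features per frame, and `recognise` and its threshold are illustrative only.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    # prev[j] is the best cost of aligning the frames seen so far with b[:j].
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Extend the cheapest of: repeat b[j-1], repeat x, or advance both.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(b)]


def recognise(utterance, templates, max_distance):
    """Return the label of the closest trained template, or None if too far."""
    best_label, best_distance = None, float("inf")
    for label, template in templates.items():
        d = dtw_distance(utterance, template)
        if d < best_distance:
            best_label, best_distance = label, d
    return best_label if best_distance <= max_distance else None
```

The cost grows with the number of templates times their lengths, so a dozen short commands trained to one voice is a very small job compared with open-ended recognition for any speaker.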
 
rfdesigner said:
I had a phone, a Motorola Timeport (in 2000), that had voice recognition, but it was limited: you needed to teach it.

A friend of mine had a voice-recognition phone and spent the best part of an hour tagging all his contacts with a voiceprint.
It wouldn't work the next day, and after a long and tedious call to Support he realised he had recorded it after a session in the pub and the phone did not recognise his voice when he was sober.
 
IglooEater said:
I dare say folks shooting wildlife, weddings, concerts, and the like would not find it useful...

Basic adjustments such as speed, aperture, and ISO are already so fast I'm unsure it would be useful.
Maybe for changing more complex settings, or settings one would normally have to access the menus for. It would seem like a cool idea for setting up long bracketing sequences. Also might be nice for flash settings.

Haha! I was thinking the same thing....
 
dilbert said:
Dispense with the native language recognition and just use "voice activation" to select which custom mode and upload a speech pattern that is developed and uploaded into the camera using DPP. Then the camera only needs to match what it hears against the pattern files and is also only trained to your voice.

If it was too much for a PC (not likely) to build the pattern file then Canon could offer it as part of a "Canon Cloud Photography" package or feature.

I think a limited number of commands could be handled by the method you described. By limited, I mean specific commands, not general queries.

I find Siri so bad that I am amazed that someone would embarrass themselves by releasing it. Google Maps and the browser work great, and almost always figure out what I am saying. I have an Amazon Fire TV (the latest), and it is very good at figuring out my voice commands about almost anything.

I attended a meeting last week for deaf and hard of hearing people (I'm having a cochlear implant next week). The meeting used Streamtext.net, and I was totally amazed.

http://www.streamtext.net/

I did not spot the microphone(s), but every person's voice in the room was picked up and translated into text and projected on a large screen. It was 100% accurate. I've never seen voice-to-text that accurate, much less in a meeting with a fair amount of noise and people scattered around the room. It's a pay-to-use service, but well worth it: even at its highest rate of $0.09/minute, a 2-hour meeting costs only $10.80. And you can have up to 50 clients connected at various locations at no additional charge. Since it's captured by a web browser, you can record the meeting and publish minutes later too.

Because I will be totally deaf for a month after my implant surgery, I have been researching and testing voice-to-text for tablet and smart phone use so my wife can easily communicate with me. Many of the apps work, but only marginally. I currently have speak2see installed on my iPhone. It works well, but there is no control over text size, so it needs a large tablet, or I need to find a way to reduce the text size.
 
dilbert said:
Mt Spokane Photography said:
dilbert said:
Dispense with the native language recognition and just use "voice activation" to select which custom mode and upload a speech pattern that is developed and uploaded into the camera using DPP. Then the camera only needs to match what it hears against the pattern files and is also only trained to your voice.

If it was too much for a PC (not likely) to build the pattern file then Canon could offer it as part of a "Canon Cloud Photography" package or feature.

..., so it's the potential for improvement that made me choose the one I did, not current performance.

I think a limited number of commands could be handled by the method you described. By limited, I mean specific commands, not general queries.
...

I am thinking along the lines of having it recognise the word "beetlejuice" and change the camera settings to the "beetlejuice" custom set (ISO 25600, f/1.4, shutter 1/60, center-point AF only), or you say "mama mia" and it sets the camera to that set. Nothing generic, i.e. not "aperture f eight" or "shutter one two hundred and fiftieth" or "white balance automatic", as they're all far too complex.

Sorry to learn about your hearing problems - I hope R&D keeps on improving the options available there for you.


It could just set C1... C3, but I like your idea better: being able to pre-program a large number of options for voice recall.
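On the camera side, that kind of voice recall is little more than a lookup table from trigger word to a stored settings set. A rough Python sketch, assuming the trigger word has already been matched by whatever recogniser is used; the `Preset` fields and `recall_preset` are hypothetical, the first preset uses the values from the quoted example, and the second one's values are invented purely for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Preset:
    """Hypothetical custom settings set; field names are illustrative."""
    iso: int
    aperture: float
    shutter: Fraction
    af_mode: str


PRESETS = {
    # Values taken from the example quoted above.
    "beetlejuice": Preset(25600, 1.4, Fraction(1, 60), "center point"),
    # Invented values, only here to show a second entry.
    "mama mia": Preset(100, 8.0, Fraction(1, 250), "center point"),
}


def recall_preset(heard, presets=PRESETS):
    """Return the preset for a trigger word, or None if it is not known."""
    # Normalise case and spacing; unknown words are ignored, not guessed.
    return presets.get(" ".join(heard.lower().split()))
```

Unlike C1 to C3, the table can hold as many entries as there are distinct trigger words the recogniser can tell apart.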


The cochlear implant business is a billion-dollar industry now, but implants are still relatively rare. In our fair-sized city of Spokane, they only do about 40 a year. The low volume has led the implant makers to team up with hearing aid makers to take advantage of their research and their wireless accessories. (Advanced Bionics was bought by Sonova, who also owns Phonak.)

Implants still lag hearing aid technology by about 3 years due to the FDA approval process. These are medical devices that are inserted into your cochlea and hit the nerves with voltage pulses, with all kinds of potential side effects, so the FDA is careful.

I've done a bit of research into available information about the technical workings before selecting the brand. However, the results from brand to brand are pretty much equal. I have attended two meetings with users and marveled at how much better than me they could hear. Music, however, is an issue, and telephone use is still difficult for many.

Turning on the device for the first time is said to sound like everyone is breathing helium, high-pitched and squeaky. Your brain learns to interpret the correct pitch after practice, so voices sound more normal. Some of the users I met have had an implant since the 1980s (one was 1982).
 