Monday, 10 January 2022

Voice recognition and an ape's reflexion

When I wrote my annual 'Technology and Me' post at the end of last year, I mentioned that I had bought a smart speaker on a whim and that we were both amazed at how good the voice recognition was.

I must admit to being a bit of a Luddite about voice recognition, preferring keyboards - and I mean physical keyboards - over voice commands: more privacy, more thought, and of course the chance to review and edit.

It's why I've only just thrown out my last Nokia Asha - essentially a poor man's BlackBerry with a responsive little keyboard you could type on, and, importantly, learn to type fast on. I'd kept it for years for overseas travel, even though some of its services, such as push email, had stopped working some time ago.

However, Luddite or not, I've helped set up voice recognition for people who, for one reason or another, had difficulty using keyboards, including one library systems admin who tried to use Dragon Dictate with vi in a terminal emulator.

Not a great success - GNU nano, however, proved surprisingly usable as an alternative.

However, I'd never used voice recognition seriously myself until we got a smart speaker.

While the voice recognition capability is impressive, like all such devices it's highly reliant on its backend data set, meaning the answers are fairly standard and, after a while, predictable. Asked to play ABC NewsRadio, it pulls the stream in from iHeartRadio, while it sources ABC Classic directly from the ABC's own feed. Even so, like all such devices, its response to a query sometimes almost seems like magic.

Just as, many years ago, when I asked the GPS system in my car to take us to a hotel in Brisbane, the system threaded us through the urban motorway system, took us past the hotel heading west, then ducked us down and around an intersection and back up heading east - the GPS 'knew' that the hotel's parking garage was only accessible from the eastbound lanes and directed us accordingly.

Of course it no more 'knew' than the fictional Emilybot knows about Emily Brontë's inner life; what it can answer and talk about is determined by the richness of the underlying dataset.

So, my car's GPS 'knew' you needed to be in an eastbound lane to get into the underground parking garage, just as the Google Assistant in the smart clock knows about radio stations, knows we have a Spotify subscription, and so on.

It also has some strange aberrations - asked for the weather, it invariably gives us the weather for Blacktown in Sydney, which is where our outward-facing IP address resolves to. (Our ISP uses a combination of NAT and BGP which effectively obscures our internal NBN network address, meaning the external address we are allocated at any one time comes from one of their Sydney points of presence.)
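If you're curious what that looks like in practice, here's a minimal sketch in Python - using the public lookup services api.ipify.org and ip-api.com purely for illustration; the smart clock of course relies on Google's own geolocation data, not these - of how a service ends up placing us in Blacktown:

# Rough sketch: how a geolocation-based service 'sees' a home connection
# behind carrier-grade NAT. The lookup services here are illustrative only.
import json
import urllib.request

# Ask an external service what our outward-facing address is -
# behind the ISP's NAT this is their address, not ours.
public_ip = urllib.request.urlopen("https://api.ipify.org").read().decode()

# Look that address up in a geolocation database. For us the answer
# comes back as a Sydney suburb (Blacktown), because that's where the
# ISP's point of presence is, not where the house is.
with urllib.request.urlopen(f"http://ip-api.com/json/{public_ip}") as response:
    location = json.loads(response.read().decode())

print(public_ip, location.get("city"), location.get("regionName"))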

However, we can forgive it that - the Guardian invariably makes the same mistake, as do sites with those silly clickbait ads: 'The prices of cremations in Blacktown may surprise you'.

But to return to the main point: the assistant, for all its faults, gives the appearance of intelligence, just as my car's GPS did.

In fact, neither of the systems is intelligent. Unlike apes, or my cats, they are constrained by the richness of their dataset and, outside its parameters, flounder helplessly ...
