The spirit of the “Boaty McBoatface” phenomenon lives on at Google.
Today, Google introduces Parsey McParseface – a free new tool, born from Google’s research division to help computers better parse and understand English sentences.
“We were having trouble thinking of a good name, and then someone said, ‘We could just call it Parsey McParseface!’ So… yup,” says a Google spokesperson.
Parsey McParseface is a piece of a larger framework released today called SyntaxNet, itself a big part of Google’s popular home-built TensorFlow software for building artificial intelligence, as explained in a blog entry. With this release, any developer anywhere can download, use, and even start to improve Google’s tools in their own software.
One of the biggest problems in artificial intelligence today is that while speech recognition by computers may be better than ever, computers still have trouble understanding exactly what we mean. After all, language is complicated: Consider that "Buffalo buffalo Buffalo buffalo buffalo buffalo" is a 100% grammatically correct sentence in American English.
It’s an issue that titans like Google, Facebook and Microsoft have thrown themselves into, as artificial intelligence and the ability to talk to a computer like a human continues to become an important part of the future of tech.
Back to school
To understand how Parsey McParseface and SyntaxNet tackle this problem, it may be helpful to flash back to your grade school English classes, where you were taught how to diagram a sentence, identifying verbs, nouns, and subjects.
Parsey McParseface does those diagrams automatically. Take the sentence "Alice saw Bob": Alice is the subject, Bob is the direct object, "saw" is the verb. Boom.
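One common way to write down a diagram like that is as a dependency parse: each word points at its "head" word, with a label for the relationship. The snippet below is a hand-written sketch of that idea in Python, using conventional labels like "nsubj" and "dobj"; it is an illustration of the data structure, not SyntaxNet's actual output format.

```python
# A dependency parse: for each word, the index of its head word
# and the label of the relation. -1 marks the root of the sentence.
# Hand-written sketch for "Alice saw Bob" -- not real SyntaxNet output.
sentence = ["Alice", "saw", "Bob"]

parse = {
    0: (1, "nsubj"),   # "Alice" is the nominal subject of "saw"
    1: (-1, "root"),   # "saw" is the root verb of the sentence
    2: (1, "dobj"),    # "Bob" is the direct object of "saw"
}

for i, word in enumerate(sentence):
    head, rel = parse[i]
    head_word = "ROOT" if head == -1 else sentence[head]
    print(f"{word} --{rel}--> {head_word}")
```

Every word gets exactly one head, and following the arrows from any word eventually leads to the root verb.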
That’s simple enough. But to use Google’s own example, things can get messy. Consider a longer but still straightforward sentence like “Alice drove down the street in her car.” To us normal humans, there’s no way to misinterpret that, because we know how cars work and where they drive.
But if you’re an average computer, just following instructions, and you’re doing sentence diagrams, it is totally grammatically correct to parse that sentence as saying the street was located in Alice’s car. Obviously, that’s not right or even physically possible, but it is correct by the laws of grammar.
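The two readings come down to where the phrase “in her car” attaches. The sketch below writes both out as competing head choices; the indices and structure are made up for illustration and are not SyntaxNet’s actual representation.

```python
# Two grammatically valid attachments for "in her car" in
# "Alice drove down the street in her car". Each parse records
# the head of the preposition "in" (word index 5). Illustrative only.
words = ["Alice", "drove", "down", "the", "street", "in", "her", "car"]

# Intended reading: "in her car" modifies the verb "drove".
parse_verb_attach = {5: 1}   # head of "in" is "drove"

# Absurd but grammatical reading: "in her car" modifies "street".
parse_noun_attach = {5: 4}   # head of "in" is "street"

for name, parse in [("verb attachment", parse_verb_attach),
                    ("noun attachment", parse_noun_attach)]:
    head = words[parse[5]]
    print(f"{name}: 'in her car' attaches to '{head}'")
```

Both parses obey the rules of grammar; only the first matches how the world actually works, which is exactly the gap Google is trying to close.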
“Humans do a remarkable job of dealing with ambiguity, almost to the point where the problem is unnoticeable; the challenge is for computers to do the same,” writes Google in a blog entry.
Parsey McParseface uses neural networks, kind of like the ones that let Google DeepMind outsmart Go champion Lee Sedol, to score candidate parses of each sentence for “plausibility” – how likely it is that a human would mean that reading. That means a lot of saved time and a big boost to efficiency, since the parser can throw away implausible sentence constructions early instead of examining every one.
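Google’s blog post describes this as beam search: build up a parse decision by decision, score each partial hypothesis, and keep only the top-scoring few at every step. Here’s a toy sketch of that control flow in Python; the scoring function is invented for illustration, whereas the real system scores hypotheses with a neural network.

```python
import heapq

# Toy beam search: extend each hypothesis with every possible next
# decision, score the results, and keep only the best beam_width of them.
# Scores here are made up; SyntaxNet uses a learned neural-network scorer.
def beam_search(start, expand, score, beam_width=2, steps=3):
    beam = [start]
    for _ in range(steps):
        candidates = [hyp + [choice] for hyp in beam for choice in expand(hyp)]
        beam = heapq.nlargest(beam_width, candidates, key=score)
    return beam[0]

# Hypothetical example: each step appends decision "a" or "b", and the
# scorer simply prefers hypotheses containing more "a" decisions.
best = beam_search(start=[],
                   expand=lambda hyp: ["a", "b"],
                   score=lambda hyp: hyp.count("a"))
print(best)  # -> ['a', 'a', 'a']
```

The payoff is that the number of hypotheses stays fixed at the beam width instead of exploding combinatorially, which is what makes parsing long sentences tractable.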
It means Parsey McParseface can correctly diagram out and understand longer, more complex sentences.
In Google’s own tests, running SyntaxNet and Parsey McParseface against random data drawn from the web, the parser was about 90% accurate in understanding sentences – a good start, but with lots of room to grow, Google says. For starters, that means going beyond English. But it also needs to teach SyntaxNet more about the real world.
“The major source of errors at this point are examples such as the prepositional phrase attachment ambiguity described above, which require real world knowledge (e.g. that a street is not likely to be located in a car) and deep contextual reasoning,” Google writes.