Google is giving away the tool it uses to understand language, Parsey McParseface

Okay, Google. Okay. We get it.

Say hello to Parsey McParseface.

Yes, to get you to pay attention to what would otherwise be a fairly dense and nerdy thing, Google is using an homage to Boaty McBoatface for one of the software tools it's releasing today. But don't just laugh (or groan) at the name: what Google is giving developers and researchers access to is a big deal. Today, it's open-sourcing something it calls SyntaxNet and a component for it, Mr. McParseface. These are some of the tools that Google uses to understand natural language when you type it into a box or speak to Google Now.

SyntaxNet is the overall framework for parsing sentences, what's known as a "syntactic parser." Parsey McParseface is the English language plug-in for SyntaxNet. Google claims that the parser can correctly identify the subjects, objects, verbs, and other grammatical building blocks of sentences as well as (or, in some cases, better than) trained human linguists, achieving 94 percent accuracy on English-language news articles.

Understanding the grammatical structure of a sentence is key to helping computers act on its meaning. For a simple example, you might say "Give me the time in Paris." Depending on how you parse that sentence, it could mean "Tell me what time it is in Paris" or it could mean "When I am in Paris, tell me what time it is." Determining which of those meanings is the intended one is a super complicated process for a computer, and it can't even start unless it's able to parse the different grammatical parts of the sentence.
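To make the two readings concrete, here is a minimal sketch in Python that writes each one as a set of "head" links, where every word points at the word it modifies. The parses are written by hand for illustration, the names (`parse_time_in_paris`, `parse_while_in_paris`, `differing_heads`) are made up, and the preposition-as-head convention is only one of several in use; none of this is SyntaxNet's actual output format.

```python
# Each parse maps a word to its head (the word it attaches to).
# None marks the root of the sentence. Hand-written, not parser output.

# Reading 1: "Tell me what time it is in Paris" ("in Paris" modifies "time")
parse_time_in_paris = {
    "Give": None, "me": "Give", "the": "time",
    "time": "Give", "in": "time", "Paris": "in",
}

# Reading 2: "When I am in Paris, tell me the time" ("in Paris" modifies "Give")
parse_while_in_paris = {
    "Give": None, "me": "Give", "the": "time",
    "time": "Give", "in": "Give", "Paris": "in",
}

def differing_heads(a, b):
    """Return the words whose head differs between two parses."""
    return {word: (a[word], b[word]) for word in a if a[word] != b[word]}

print(differing_heads(parse_time_in_paris, parse_while_in_paris))
```

In this sketch the two readings differ in a single link, the word that "in" attaches to, which is why one small parsing decision can flip the meaning of the whole request.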

To figure it out, Google says that "SyntaxNet applies neural networks to the ambiguity problem" and then uses "Beam Search" to apply probabilities to multiple possible meanings at the same time before landing on the correct meaning.
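The general idea behind beam search can be sketched in a few lines of Python. This is a toy, not Google's code: the decision names and probabilities in `toy_probs` are invented for illustration, and in a real system a neural network would supply the scores. What it shows is that keeping several candidate analyses alive at once can find a better overall answer than committing to the single best-looking choice at each step.

```python
import math

def beam_search(steps, next_probs, beam_width):
    """Keep the beam_width most probable decision sequences at every step."""
    beam = [((), 0.0)]  # (decisions so far, log-probability)
    for _ in range(steps):
        candidates = []
        for seq, score in beam:
            for choice, p in next_probs(seq).items():
                candidates.append((seq + (choice,), score + math.log(p)))
        # Highest log-probability first, then cut down to the beam width.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:beam_width]
    return beam

def toy_probs(seq):
    """Invented probabilities standing in for a trained model's output."""
    if not seq:
        return {"attach_to_verb": 0.55, "attach_to_noun": 0.45}
    if seq[-1] == "attach_to_verb":
        return {"reading_a": 0.6, "reading_b": 0.4}
    return {"reading_c": 0.9, "reading_d": 0.1}

greedy = beam_search(2, toy_probs, beam_width=1)[0]
wider = beam_search(2, toy_probs, beam_width=2)[0]
print("beam width 1:", greedy[0], round(math.exp(greedy[1]), 3))
print("beam width 2:", wider[0], round(math.exp(wider[1]), 3))
```

With a beam width of 1 the search commits to the slightly more probable first choice and ends up with an overall probability of about 0.33. With a width of 2 it keeps the runner-up around long enough to discover that it leads to a more probable complete analysis, at about 0.405.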

Google has been on an open-sourcing tear with its machine learning platform, TensorFlow. After open-sourcing it last year and then releasing the piece that lets it run across multiple computers, the company is now putting out these other components that are built on that same platform. Google says that the "release includes all the code needed to train new SyntaxNet models on your own data, as well as Parsey McParseface, an English parser that we have trained for you and that you can use to analyze English text."