NASA is using a database technology popularized by Facebook to save millions just by making sure its engineers don’t repeat the mistakes of the past. The same technology is directly applicable to industry, real-time analytics, the Internet of Things, and large-scale retail, too.
Not too long ago, an engineer on the Orion spacecraft had a major problem, David Meza, Chief Knowledge Architect at NASA’s Johnson Space Center, recently explained to Business Insider.
Orion’s uprighting system, which flips the returning capsule right-side-up after it splashes down in the ocean, wasn’t working correctly. In tests and simulations, the capsule stayed on its side, even after hitting the water.
Not great, considering it’s supposed to be carrying four astronauts.
Luckily, NASA maintains the so-called “Lessons Learned” system, where the agency’s engineers take the lessons they’ve learned and contribute them to a central database.
Lessons Learned spans a half-century of collective NASA engineering knowledge, all the way back to the legendary Apollo moon missions – which is good, because Apollo used a similar uprighting system to Orion. There was every possibility that somewhere in Lessons Learned was the key to fixing Orion.
But that massive database is no good if you can’t find anything.
NASA’s original web search for Lessons Learned couldn’t turn anything up; even NASA’s History Office turned up only three relevant files in eight days of looking. The Orion team even visited retired Apollo astronauts at home to see if maybe they had anything useful in their attics.
That’s where Meza and his team came in. Meza had been experimenting with a new, smarter, data-driven approach to sifting the Lessons Learned database. Within three hours, Meza’s better Lessons Learned had come up with 30 relevant documents. One of the first 10 documents had the cure for what ailed Orion. The mission was back on track.
Meza says the engineer approached him afterwards: Without that information, NASA would have had to spend two years and millions of dollars building a second Orion uprighting system just to test, delaying its 2023 launch further.
Meza cites the Orion case as a big reason why his team has embraced a newer kind of database, growing in popularity among developers, called a “graph database,” a term first popularized by Neo Technology.
Neo makes the Neo4j software, used by customers including Marriott, Monsanto, Walmart, and Meza’s team at NASA.
Facebook borrowed the term for its famed “social graph,” which it introduced in 2007. In fact, Facebook is a great example of a graph database: Everything is defined by relationships.
Your profile is instantly and invisibly linked to everybody who likes the same bands, foods, places you’ve been, songs you’ve listened to on Spotify, and so on. So when you search Facebook for “Friends who like Nickelback,” it’s basically drawing a line between your friends and people who like the band.
It seems simple. But with more traditional “relational databases,” which are more like giant spreadsheets, it would be impossible at Facebook’s scale, or at least “really, really, really slow,” says Neo4j CEO Emil Eifrem.
Imagine an Excel workbook with multiple tabs, one listing all your friends, one for everybody on Facebook who likes Nickelback, and so on. Then, imagine having to compare the two tabs row by row to find a match. Then, imagine doing that for every single one of the people who visit Facebook every day. Not easy, cheap, or fast.
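To make the contrast concrete, here is a minimal Python sketch of the “Friends who like Nickelback” search as a walk along relationships rather than a comparison of two tabs. The names, the edge list, and the friends_who_like function are all invented for illustration; this is not how Facebook or Neo4j actually stores data, only a toy version of the idea.

```python
from collections import defaultdict

# Hypothetical toy data: each edge is (source, relationship, target).
edges = [
    ("you", "FRIEND", "alice"),
    ("you", "FRIEND", "bob"),
    ("you", "FRIEND", "carol"),
    ("alice", "LIKES", "Nickelback"),
    ("carol", "LIKES", "Nickelback"),
    ("dave", "LIKES", "Nickelback"),  # likes the band, but is not your friend
]

# Index edges by (node, relationship) so following a relationship is a lookup,
# not a scan over every row.
graph = defaultdict(set)
for source, relationship, target in edges:
    graph[(source, relationship)].add(target)


def friends_who_like(person, band):
    """Follow FRIEND edges from person; keep friends with a LIKES edge to band."""
    return sorted(
        friend
        for friend in graph.get((person, "FRIEND"), set())
        if band in graph.get((friend, "LIKES"), set())
    )


result = friends_who_like("you", "Nickelback")
print(result)  # ['alice', 'carol']
```

The search only ever touches your own friends and their likes, so its cost depends on how many friends you have, not on how many people like the band.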
Down to earth
At NASA, Meza’s team paired a Neo4j graph database with the know-how of a data science specialist and a web developer to come up with a better, smarter way to search the Lessons Learned database.
Before, Lessons Learned required you to punch in a keyword, like “valve contamination,” which would produce a bunch of links to documents in a spreadsheet. But there was no way to search within those documents: You’d have to download the files as they came up, one by one, and hope it was in there.
By putting Lessons Learned into a graph database, combined with some magic with statistics, it became a lot simpler to link together the millions of articles in the database in a way that’s just a little bit more like Wikipedia – automatically and intelligently grouping together articles based on their content and the words used therein.
It means that the NASA Lessons Learned database can make more natural connections between topics. “Valve maintenance” might lead to “water corrosion” might lead to “fire hazards,” based on cross-indexed words in the article. It means engineers can spend less time guessing the magic words to make their answer appear and get back to work on Orion.
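The article does not say which statistics Meza’s team used, so the following Python sketch swaps in a much cruder stand-in: two lessons are linked when they share at least one keyword. The lesson titles, their text, the stop-word list, and the related function are all hypothetical. It is only meant to show how cross-indexed words can chain one topic to the next.

```python
from itertools import combinations

# Hypothetical lesson titles and text; not real Lessons Learned entries.
lessons = {
    "valve maintenance": "valve seal inspection water leak",
    "water corrosion": "water leak corrosion wiring damage",
    "fire hazards": "wiring damage spark fire",
}

# Assumed list of words too common to signal a real connection.
STOP_WORDS = {"the", "a", "of", "and"}


def keywords(text):
    """Split text into a set of lowercase words, minus stop words."""
    return {word for word in text.lower().split() if word not in STOP_WORDS}


# Link two lessons when they share at least one keyword.
links = {title: set() for title in lessons}
for a, b in combinations(lessons, 2):
    if keywords(lessons[a]) & keywords(lessons[b]):
        links[a].add(b)
        links[b].add(a)


def related(start, max_hops):
    """Return lessons reachable from start within max_hops links."""
    seen = {start}
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for neighbor in sorted(links[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return sorted(seen - {start})


print(related("valve maintenance", 1))  # ['water corrosion']
print(related("valve maintenance", 2))  # ['fire hazards', 'water corrosion']
```

In this toy data, “valve maintenance” and “fire hazards” share no words at all, yet the second hop still connects them through “water corrosion.”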
Intriguingly, the International Consortium of Investigative Journalists used Neo4j to similar ends, putting the Panama Papers leak into a graph so hundreds of reporters from around the world could sort the data and come up with blistering reveals on the effects of offshoring capital on the global economy.
Graph databases are broadly applicable past Meza’s brand of knowledge management, as well.
It turns out that graph databases are fabulous at pattern detection: Going back to the Facebook example, it’s pretty simple for the system to guess that you might know somebody if 90% of your friends know them, too. Similarly, if half of your friends like Taco Bell on Facebook, it’s not that hard to extrapolate that you might, as well.
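As a rough illustration of that guess, the Python sketch below counts how many of your friends know each stranger and suggests anyone above a chosen share. The friendship data, the suggest function, and the threshold values are assumptions made up for this example, not Facebook’s actual method.

```python
# Hypothetical friendship data: each person maps to the set of people they know.
friends = {
    "you": {"ann", "ben", "cat", "dan"},
    "ann": {"you", "eve"},
    "ben": {"you", "eve"},
    "cat": {"you", "eve", "fay"},
    "dan": {"you"},
    "eve": {"ann", "ben", "cat"},
    "fay": {"cat"},
}


def suggest(person, threshold):
    """Suggest non-friends known by at least `threshold` of person's friends."""
    mine = friends[person]
    counts = {}
    for friend in mine:
        for candidate in friends[friend]:
            # Skip the person themselves and people they already know.
            if candidate != person and candidate not in mine:
                counts[candidate] = counts.get(candidate, 0) + 1
    return sorted(
        candidate
        for candidate, count in counts.items()
        if count / len(mine) >= threshold
    )


print(suggest("you", 0.5))  # ['eve'] (3 of your 4 friends know eve)
print(suggest("you", 0.2))  # ['eve', 'fay']
```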
That’s why Amazon built its own graph database system, called Dynamo, early on in its development to power its famed recommendation engine. If a book has lots of attributes in common (author, subject matter, publisher) with something a customer bought, Dynamo can go through that graph and guess that you might like it.
“The graph can be people connected to other people, or it can be other things,” explains Eifrem.
In a business context, graph databases are fabulous for fraud detection, for the same reasons.
Instead of looking for characteristics of a person or product, you’re looking for characteristics of a credit card transaction. The transactions made to or from, say, specific accounts are all connected to each other. So if one is fraudulent, the transactions connected to it probably are, too.
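A minimal Python sketch of that idea: starting from one account flagged as fraudulent, walk the transfers outward and collect every account linked to it. The account names, the transfer list, and the connected_accounts function are hypothetical, and a real system would treat the result as a list of accounts to review, not as proof of fraud.

```python
from collections import defaultdict, deque

# Hypothetical transfers as (from_account, to_account) pairs.
transfers = [
    ("acct1", "acct2"),
    ("acct2", "acct3"),
    ("acct4", "acct5"),
]

# Treat each transfer as an undirected link between two accounts.
neighbors = defaultdict(set)
for sender, receiver in transfers:
    neighbors[sender].add(receiver)
    neighbors[receiver].add(sender)


def connected_accounts(flagged):
    """Return every account linked to `flagged` through a chain of transfers."""
    seen = {flagged}
    queue = deque([flagged])
    while queue:
        account = queue.popleft()
        for other in neighbors.get(account, set()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return sorted(seen - {flagged})


print(connected_accounts("acct1"))  # ['acct2', 'acct3']
print(connected_accounts("acct4"))  # ['acct5']
```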
Graph databases are a newer way of looking at data – not just the data points, but how each one relates to the others. NASA is using them to move faster; Amazon and Walmart are using them for commerce. Facebook might be the most visible graph out there, but it’s not the only one.
“We’re just living in an increasingly connected world,” says Eifrem.