Hadoop-da-Loop

  • Posted on: 6 November 2013
  • By: Jay Oyster

While searching for a comparative analysis of the features in the various versions of the desktop database application Microsoft Access, I was presented with an odd 13-minute-long ad from Cisco on their new product supporting Big Data. Now, while I don't really understand what the product they were pitching actually WAS, I did key on one word in the presentation that I've heard more and more recently . . . 'Hadoop'.

What the heck is Hadoop, you might ask? I know that I certainly was asking that question. After doing some research, I found out that it's related to the concept of big data: the habit very large organizations have of aggregating really BIG sets of data into really BIG databases. Back in the early 2000s, Google was dealing with the problem of mining the data they were getting from repeatedly indexing the entire Internet.

Each time they had to ask a question, it took a team of PhDs to come up with not only the answers, but the tools to simply ask the questions. So they invented technology to standardize the asking of questions of databases so large that they may be spread out over thousands of servers around the world.

Yahoo, on the other hand, didn't have the investment to invent that technology, so they hired a guy (Doug Cutting) who sort of reverse engineered the technology and started releasing it as an open source software project. (Yeah, I know, the history is more complicated than that, but I think I got the gist of it.) Since Yahoo hired him, it has kept the software open source, so naturally, it's taking over the world of Big Data. The logical conclusion is that the tools invented at Google will eventually be replaced by the open source version of the very same tools, as their feature sets get usurped by the crowd-sourced version of the same technology.

But you know what my favorite aspect of this whole story is? The name. HADOOP. Where the heck did that come from? For people in this particular part of the IT industry, I'm sure this is way old news. But for most of us in the population, it's just another little piece of esoteric IT jargon. What the heck is a Hadoop!?

The initial ad for the Cisco product featured a pair of Indian engineers talking about the product, so I thought, "Maybe it's a word in Hindi, or some other Indian language . . . "

Nope. It turns out "Hadoop" was the name of Doug Cutting's son's stuffed toy elephant.

All of the cool words of the digital age are coming out of the mouths of babes. "Googol" was invented by the young nephew of a mathematician to describe the number 1 followed by a hundred zeros. And now, the way you ask big questions of big sets of data is the name of a toy elephant.

I abso-fricken-lutely LOVE this.