Big data: Are you creating a garbage dump or mountains of gold?
Stop, look and listen before being taken in by vendor tall tales
Published 15:45, 05 September 11
You’re not really sure how it happened, but some time between last year and the summer of 2011 you were suddenly facing a big data problem, or you were being told you were facing a big data problem, or more accurately you were being told that you needed a big data solution.
Funny thing was, you hadn’t really done anything drastic over the last couple of years that would seem to indicate a tsunami of data was about to breach your storage floodgates. But then again, it wasn’t like you watched yourself going bald either.
On the other hand it is hard to argue with the old adage that knowledge is power, and that data leads to information which can lead to knowledge, which can hopefully translate into making highly informed decisions.
It is also true that there is a massive amount of data flowing in and out of the business that seems to ghost its way through the system like an angst-ridden goth teen with a passion for walks in dark alleys - alone at night - while wearing all black and really dark purple.
So, does your data hold some big magical gold nuggets of information that will radically transform the business?
Will you single-handedly (using MPP, NoSQL, schema-less architectures) Hadoop your way to greatness, while getting a chance to brush up on your Python scripting skills?
Or is this whole thing going to end up like your company’s previous - highly successful? - ERP, CRM, CMDB, PKI, and <insert hype technology du jour> deployments?
You’re dying to find out, aren’t you?
Before you go leaping into the murky abyss of your company’s data swamp you may just want to take a deep breath and do a little planning.
- First off, what questions are you hoping and/or needing to answer? If you don’t know what questions you are trying to answer, why would you exert any effort on trying to find an answer?
- Do you believe you have the data to actually answer the question? This is a very common theme in analytics and BI: you want an answer, so you spend a ton of money to churn, prep, store, map-reduce, analyse and visualise data, only to find out that you cannot answer the questions you really care about, because you simply do not collect the data they require, regardless of the myriad tools, consultants and sharply dressed enterprise sales weasels that come knocking.
- Assuming you were able to get the answer, could you act on the information? Another common problem is that many companies are not organisationally structured to support acting on business intelligence, especially if it requires iterative dialogue with the data.
- Do you have the skills in-house to understand your data, infrastructure, and analytic requirements? Why purchase and deploy if you cannot actually administer and use?
- What tools do you currently use for data management, analytics and business intelligence?
- What is the gap between your data requirements and the incumbent tools’ capabilities?
- Do you have the skills in-house to research, analyse, test and deploy a big data solution?
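The second question on that list, whether you actually collect the data your questions need, is cheap to sanity-check before anyone signs a purchase order. Here is a minimal sketch in Python, assuming you can write down each business question alongside the data fields it would require. The questions, the field names and the `find_data_gaps` helper are all made up for illustration, not taken from any real tool or system.

```python
# Hypothetical example: every question and field name below is invented
# for illustration; substitute your own inventory.

def find_data_gaps(questions, collected_fields):
    """Return, per question, the sorted list of required fields you do not collect."""
    collected = set(collected_fields)
    gaps = {}
    for question, required in questions.items():
        missing = sorted(set(required) - collected)
        if missing:
            gaps[question] = missing
    return gaps


# Each business question mapped to the fields needed to answer it (assumed).
questions = {
    "Which customers are likely to churn?": ["customer_id", "last_login", "support_tickets"],
    "Which products sell together?": ["order_id", "product_id"],
}

# Fields the business actually captures today (assumed).
collected_fields = ["customer_id", "order_id", "product_id", "last_login"]

gaps = find_data_gaps(questions, collected_fields)
for question, missing in gaps.items():
    print(f"{question} -> missing: {', '.join(missing)}")
```

Any question that shows up in the output cannot be answered with the data you collect today, no matter which tool you buy.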
Assuming you can answer those questions, it might be time to start doing some research on big data solutions, but before you run out and grab the latest shiny object make sure you know what problem you’re actually trying to solve.
That seems like a lot to think about, but would you rather spend your time up front doing all the research, planning and expectation setting to increase your probability of success or just go for it?
How about the pain in the backside of watching hundreds of thousands to millions of dollars, and 18 months, get flushed down the drain, with nothing to show for it except a metric ton of crappy free big data vendor schwag?