'Why'...the holy grail of Big Data.

Predictive analysis.

Who, What, Where, How, When, Why: these are the questions we ask of our data. But 'Why' is the big star.

When we understand 'why' something happens, we have a good chance of predicting when it might happen again. To reach this level of analysis we must already have a grasp of the 'Who, What, Where, How, When' of the data. Or, scarier still, by knowing the 'why', we could manipulate events or people to generate the next occurrence. Star Wars jokes aside, there is a Dark Side to Big Data.

Take prior-year sales data. Companies often use it to make forecasts for the upcoming year. We can all clearly see the trends, the ebb and flow of the markets. But why do they fluctuate when they do? Knowing that answer is the key, well, to everything. Let's call it 'the why of when'. It is important to know why an event happened when it did, because timing matters: a seemingly identical causal event can have varying effects at different times.

Big Data is driven by the questions we ask of it. As the types and amounts of data we create become more complex and boundless, so do the questions that it can answer, limited only by our imagination.

The fight against global terrorism: Big Data to the rescue. Collecting data on all our "enemies" would mean collecting data on ourselves. Screech... that's the sound of the brakes on the wheels of the imaginary Cyber justice bus, first stop civil liberties. Big Brother concerns aside, I could easily envision a system (OK, I've more than envisioned it) that could collect, store, and process biometric information for every human on the planet and produce query results in real time; think customs clearing house, airport security, border control. But... who wants that?!

The IoT (Internet of Things) is quickly providing the eyes and ears, the data, needed to answer the questions we have yet to ask. Or are afraid to answer. This is 'meaning of life' type stuff. I love to pretend to be deep.

Questions are being asked, and data scientists, developers, and BI analysts are answering them with Big Data. In upcoming discussions we will tackle the technology that lets us do it. From cybersecurity best practices and data storage and query solutions like Hadoop and Vertica, to statistical languages like R and Python and the BI tools that tie them all together, we'll discuss it all.


