After the data crisis at Facebook, we are seeing articles implying that Big Data’s reputation has become a bit tarnished. For example, it has been said that big data has become bad data. Additionally, it has been said that rather than being a valuable business asset much like oil, after what happened at Facebook, data may become a liability.
All of this is occurring around the same time that a new book by data visualization guru Stephen Few came out, titled “Big Data, Big Dupe”, which portrays Big Data quite negatively. Further pessimism about Big Data is expressed in the title of a recent article “Study: Few Companies Have Successfully Monetized Big Data,” written by DH Kass on April 4, 2018 at Channel e2e and reporting on a study with 5600 responses from professionals at 429 companies conducted by AtScale, a Big Data analytics specialist. Additional discouraging material appears on the Consultancy.uk website in an April 30, 2018 article titled “70% of Big Data Projects in UK Fail to Realize Full Potential”.
Does all of this negativity bode badly for Big Data?
Probably not, and here’s why. First of all, despite the recent tarnish to its image, much of what is being written about Big Data is still positive. For example, the April 30, 2018 Wall Street Journal has an entire section with several articles on AI. (AI is artificial intelligence, which uses computers to apply Big Data.) The subhead on the section’s front page says, “Artificial Intelligence threatens to destroy a lot of jobs. But there’s another side to the story.” The headlines for articles in the section also suggest a beneficial impact for AI. Thus, positive views coexist with negative portrayals of Big Data.
Nonetheless, the Facebook data situation was certainly a setback. I will not speculate about its impact, though. My expertise lies much more in analytics, in making sense of data and discovering meaningful insights, than in privacy and security issues.
The AtScale study, however, found that only “nine percent of businesses have woven big data and analytics into their organization’s DNA to where it’s central to strategy, decision making, execution, investments and revenue generation, all supported by an executive advocate.” The study also found that 39% of companies were at a “strategic phase” and had “committed to a Big Data platform”, which is used by IT teams and can be accessed by business users, and that 51% were “experimenting with Big Data”.
As I see it, although the AtScale article, with its negative headline, paints a relatively bleak picture, the study itself does not necessarily suggest that companies’ data efforts won’t pay off. Companies may merely need more time. As experience is gained, more of them may have the potential to better integrate the data into their strategy and operations.
Likewise, the Consultancy.uk article seems to have a title that is far more negative than the material it presents. In fact, the numbers in the article paint a more optimistic picture than does the AtScale study. The article reports 18% “fully implemented and using Big Data” and 47% “tentatively started with a few projects”. Surprisingly, given how negative the title is, a bar chart featured in the material shows that 70% were “somewhat successful” with Big Data, while only 30% were “very successful”. Yet the article’s text discouragingly describes this as “just 30% of organizations that use Big Data are extracting enough value”, a viewpoint in keeping with the title’s claim that 70% fail to realize the full potential.
As for the negative view of Big Data in Stephen Few’s book, I’d say it’s a bit extreme, although he does make some valid points. According to what he said in an online interview and wrote in his blog, his take is that Big Data does not really exist because there is no agreed-upon definition of it. I don’t agree that the lack of a consensus definition is a problem, especially since Big Data and data science are relatively recently coined terms, at least to those of us whose experience with data goes back many years.
The term Big Data can refer to the vastly increased amount of data now out there, as well as the huge quantities that up-and-coming technologies can eventually generate. One example is the internet of things, which is expected to connect all sorts of things (ranging from refrigerators to driverless cars to jet engines and complex machinery) and will generate voluminous data as a result. The key point is that there’s so much more data out there today than in the past, and these vast quantities of information can be considered Big Data, even if no one agrees on the exact definition.
Where Stephen Few does make valid points, however, is in disagreeing with Big Data’s position that correlations matter and causation does not. I agree with Few that causation does matter. Granted, there may be applications where today’s AI and machine learning come up with useful correlation-only results. But there generally is value in also looking at qualitative explanations for the data.
That’s why I’ve always emphasized understanding the dynamics of the business, instead of just blindly using algorithms. That’s why I’ve advocated the need for human input. Many users of data science today agree and have been doing just that. Furthermore, Few is right that the skills of professionals who were working with data long before the term Big Data was coined should be integrated into today’s data analytics. In fact, in my previous writings I say that the data skills of areas like traditional market research need to be integrated into what is done with Big Data.
He’s also right that you don’t necessarily need huge volumes of data to find meaningful insights. Nonetheless, unlike him, I would take the position that if vast quantities of data are available, it can be worth exploring whether insights might be found there. However, I’d agree with him that vast quantities of data are not always needed and that companies do not necessarily have to collect huge volumes of data on everything. Yet, as more and more of this data is available, examining it for meaningful information might be worthwhile.
Finally, it’s important to note that Stephen Few’s negative view applies only to what has become known as Big Data. He continues to believe in the value of data analytics that is done right and reflects a high skill level, drawing on skills that exist now and that were around long before the term Big Data was coined. Apparently, his concern is that some aspects of what has come to be known as Big Data are not necessarily consistent with doing highly skilled data analysis right.
So, in conclusion, although we are seeing some negativity about Big Data, there are still reasons to take a positive view of what data can offer. No, Big Data is not a magic bullet, and ethics and privacy are an issue. Yet, as with any popular technology, companies will have to get past the hype and determine how and where it can be harnessed effectively. Those who are able to use it well are likely to benefit.