The October issue of Harvard Business Review featured articles on Big Data. One of them, “Data Scientist: The Sexiest Job of the 21st Century” by Thomas H. Davenport and D.J. Patil, discusses the role and skills of data scientists, the professionals charged with making discoveries from Big Data. The article lays good groundwork regarding who data scientists are and what they do.
Yet the article misses the mark in a key area: it puts far too much emphasis on technical skills such as the ability to write computer code. The article says that all data scientists should have coding skills, but there are reasons to dispute this. When the goal is discoveries that lead to meaningful insights, other aspects of working with Big Data are as important as the ability to write code, if not more so. That’s why, contrary to what the article recommends, not all data scientists need to be skilled at coding.
Of course, coding is an important element in successfully tapping Big Data, so people with those skills are needed. But not everyone with strong coding skills will excel at extracting meaning from data. Likewise, someone with curiosity and a heightened ability to look for and find meaning in data may lack coding skills. Coding and finding meaning are two distinct skill sets that are not necessarily found in the same person. True, it might be a plus if those who are extracting meaning from data also have at least some understanding of what coding entails, even if they are not skilled at doing it. But when meaningful insights are the goal, it can be a mistake to require coding skills, since doing so may eliminate some professionals with a truly strong ability to make sense of the data.
The Harvard article does discuss the value of traits like curiosity and the desire to go beyond the surface of a problem. The article also points out the value of good communication skills, including the ability to display data visually in creative, effective ways. Yet these valuable characteristics of data scientists do not necessarily come with coding skills, and strong coding skills do not necessarily come with these other crucial characteristics.
That’s why my own background, experience and analysis lead me to disagree with the stance that all data scientists must have coding skills. I say this as someone who spent my early career in quantitative analysis positions, essentially doing what a data scientist would have been doing decades ago. I say this as someone who saw the limitations of statistical analysis and learned the importance of understanding the broader business dynamics when making use of data. I say this as someone who has since spent over 25 years researching business success and failure patterns to understand those broader dynamics. And I have concluded that coding is not a requirement for all professionals who specialize in getting meaningful insights from data.
Furthermore, in that very same issue of Harvard Business Review, another article, “Big Data: The Management Revolution” by Andrew McAfee and Erik Brynjolfsson, features a pull quote that makes an important point. It says, “Big Data’s power does not erase the need for vision or human insight.” As I see it, sometimes this insight can come from managers and staff who are not specialists in using Big Data. In other cases, however, data scientists with the right background can contribute the human insight and vision that make the data meaningful. The data scientists who can do so provide a valuable benefit, even if they lack coding skills.
So, I encourage companies not to follow the Harvard article’s recommendation that all data scientists have coding skills. Instead, recognize that, while coding skills do play an important role, there is a need for people who can find real meaning in the data, even if those people lack coding skills. After all, finding meaningful insights and applying them is a major benefit of Big Data.
I agree. I have participated in marketing research projects with tremendous quantities of data that have generated very little in the way of useful findings.
Computers now have the capacity to run massive amounts of data through every possible combination and correlation, but the results are meaningless without brainpower to detect findings that help explain the market or suggest strategies.
Sometimes the huge amounts of numbers simply confuse people. I have seen analysts pull striking numbers out of the mess without noticing that statistical validity measures indicate the numbers shouldn’t be used. I have also seen analysts mistakenly interpret validity measures as end findings. (It’s certainly embarrassing when that problem is discovered.)
Other times the findings are interesting but no one knows what to do with them. The result is lots of bulleted text that makes for a lengthy report but does not support relevant conclusions.
It can be very efficient for a tech person to handle the coding while a subject analyst looks at what the data mean. Interpretation benefits from creativity and knowledge, so the analyst does not need to know how to program as long as he or she can interpret the data.
-Diana