Why Programmers Are Not Data Scientists (and Vice Versa)

Hot jobs come in waves, and not surprisingly, the information technology sector follows fashions as religiously as teenagers do.

There is a good reason for this, of course. The hot IT jobs are where the money is, and if you want to play in that market, you need the skills or training to participate. Otherwise, you run the risk of watching your income fall as you’re relegated to lower-paying jobs, or worse, are forced into IT management, doomed never to touch a compiler again, while never quite managing to play in the big leagues with the C-suite (I may be exaggerating a bit here, though not necessarily by much).

Over the years the role of programmers as generalists has faded even as their importance as tool creators to assist others has grown dramatically.

Programmers Are Tool Makers, Data Scientists Are Tool Users

Around 2015, as the Big Data / Hadoop hoopla was beginning to fade, the big tech industry analysts came to the realization that with all of this big data out there, you needed someone to make sense of that data. Data analytics tools have been around forever, but for the most part they were specialty tools that mathematicians, biostatisticians, actuaries, and others of that particular ilk used – SAS, SPSS, Mathematica, Matlab.

Then along came R. R was intended not as a general-purpose programming language but as a data analysis language, though it traces its lineage to the S language and borrows some of its semantics from Scheme. R is not new – it debuted in 1993 and took on a lot of the heavy lifting that statisticians needed to create pipelines and work with datasets. Statisticians use it because when you’re crunching numbers, being able to parameterize functions is important, and sometimes a command line interface (a CLI) is precisely the right tool for the job.

However, it’s worth noting that even though R (like Scheme) is a programming language, the purpose of running R is generally not to build applications – it is to generate reports based upon the analysis of data by people who understand how to analyze data. They have statistical training, they generally have a pretty good idea about concepts such as distributions, margins of error and data sampling, and they are usually asking a particular question: why does the data look the way that it does? What is the story that the data is trying to tell?
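
To make that vocabulary a little more concrete, here is a minimal sketch of the kind of question an analyst asks of a sample. It is written in Python rather than R, the order values are invented for illustration, and it assumes a normal approximation (z = 1.96), which is a simplification; for a sample this small a t-distribution would usually be the better choice.

```python
import math
import statistics


def margin_of_error(sample, z=1.96):
    """Approximate margin of error for the sample mean.

    Uses a normal approximation; z=1.96 corresponds to roughly 95% confidence.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return z * standard_error


# Hypothetical order values, invented for this example.
orders = [102.0, 98.5, 110.2, 95.1, 104.7, 99.3, 101.8, 97.6]

mean = statistics.mean(orders)
moe = margin_of_error(orders)
print(f"mean = {mean:.2f} +/- {moe:.2f}")
```

The point is less the arithmetic than the habit of mind: the number is reported together with how far it can be trusted.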

Now, it turns out that there’s another sector out there that asks much the same question: business intelligence analysts. Note the word analyst here. Programmers, in general, ask a different question: how can I build a tool to solve a problem? There is still a certain aspect of analysis here – decomposing a problem so that it can be recomposed, via some kind of modular framework, into components that ultimately work together as a functional unit – but the focus, in general, is not on the data itself; it’s on the tools that manipulate that data. Analysts basically look at the data, using the tools that programmers make, in order to draw conclusions that are consistent with sound statistical principles.

To a typical business person, this distinction might not really make that much difference. A programmer and an analyst are, to borrow a term from the television show Bones, squints: people who spend all of their time doing weird, mysterious things with computers that require close focus, and hence, reading glasses. However, and again to borrow from a somewhat dubious source (MBTI), programmers are INTJs while analysts are INTPs.

Programmers are engineers. They are fascinated by building things and they see systems as basically gigantic tinker toy sets that allow them to build ever more complex things. Analysts, on the other hand, are interested in understanding how and why systems work, and as such are much more focused on ascertaining the patterns that allow for classification. They often work together – theoretical physicists (INTPs) work out the theory, then experimental physicists (INTJs) build the tools necessary to verify or disprove that theory. But their mindsets ultimately are fairly different. This is important from a business standpoint.

How Big Data Led To Big Data Scientists

Fifteen years ago, business intelligence systems were all the rage, and ultimately a class of business person called the business analyst sprang up. Business analysts wore suits, but you could always tell that they weren’t “real” managers because they were the ones always hunched over their keyboards building up complex business models in Excel. They were the ones who gravitated towards the BI suites when they came out, because those suites allowed them to do better statistical analysis and to work with intricately connected datasets, though even they would readily admit that the BI tools were not quite as accurate as the vendors told them they could be.

Once big data put large amounts of what had previously been siloed application data and transactional logs into the hands of the business analysts, those same people came to realize that what they were doing was not really all that different from what scientists were doing – modeling, analyzing, presenting. Their shift to the dark side was complete. They had become data nerds, and the big business analysis firms picked up on the trend, dubbing this new generation of analysts data scientists.

On the face of it, data scientist is a redundancy: the scientific method is based upon the accumulation of evidence and testability, which means that all scientists are in fact data scientists. As a marketing term, however, it did what the marketers hoped it would do – it gave a veneer of scientific respectability to finance, which, along with psychology and sociology, has always struggled to attain some kind of scientific legitimacy. Suddenly, it became fashionable for people to walk around wearing lab coats and designer nerd glasses, not because they needed either, but because they were scientists, dammit.

Data science became the next big career, and for the first time in a long while, people with PhDs in mathematics were making really good money. Companies wanted their own data scientists to make sense of all of the data that they were generating, because surely that Big Hadoop Data Lake they’d spent the last five years creating held some kind of insight. Otherwise, they would have wasted their money, and no manager worth their bonus would dare admit to wasting money on dead-end technologies.

Systems thinking is the domain of the data analyst – understanding the inputs and outputs of what keeps a system functioning.

The Importance of System Thinking

The business analysts (who were, in fact, the true subject matter experts in their domain) began to be pushed aside by doctorates who could do differential geometry in their heads but had likely never dealt with a business model in their lives, and who were then told to do magic. You can see where this is going. The computations became fancier, the analysis likely became more rigorous, but the modeling, which ALWAYS comes down to understanding your domain of expertise, became sloppier, especially once machine learning and neural networks came into play.

Have you ever watched a flock of birds in flight? That’s a neural net. Each bird gets sensory input that tells it where its nearest neighbors are, where the ground and other obstacles are, where predators are if it is on the outside edge of the flock, and likely some basic sense of the magnetic fields in the immediate area. Any given bird does not have a complete understanding of the whole flock, but it adjusts its actions based upon a few inputs with varying weights. This works reasonably well in most circumstances, and is usually pretty good for determining the actions of a system of autonomous but dependent agents (such as companies in an economy). But if the data that any given sensor (such as a bird or a company) receives is inadequate or not modeled properly, the result can be catastrophe for that particular sensor, and quite possibly for the whole flock or economy.
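
The "few inputs with varying weights" idea can be sketched in a handful of lines of Python. This is only an illustration of a single weighted unit, not a trained network or a real flocking model; the signal names, readings, and weights are all invented for the example.

```python
import math


def steer(signals, weights):
    """Combine weighted signals into one steering value in (-1, 1).

    Negative means turn left, positive means turn right.
    Signals with no matching weight are ignored (weight 0).
    """
    total = sum(weights.get(name, 0.0) * value for name, value in signals.items())
    return math.tanh(total)


# Hypothetical readings: neighbours pull right, a predator on the right pushes left.
signals = {"neighbours": 0.5, "predator": -1.0}

well_modelled = {"neighbours": 0.6, "predator": 2.0}
badly_modelled = {"neighbours": 0.6, "predator": 0.0}  # predator signal ignored

print(steer(signals, well_modelled))   # negative: veers away from the predator
print(steer(signals, badly_modelled))  # positive: drifts toward it
```

Nothing in the arithmetic changes between the two cases; only the modeling does, and that is where the bird gets eaten.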

Many ecologists and economists are systems thinkers – they understand the tools of modeling, and they also understand a particular subject domain, which means that they are pretty good at knowing the limitations of those tools within that domain. Programmers are subject matter experts as well, but primarily in the domain of building tools or algorithms. I can tell you as a programmer how to translate the mathematics of differential geometry (which is what autonomous agent modeling really is) into a numerical method approximation, but understanding what particular variables are important (or even independent) is likely not something I can do very well because I don’t understand the domain.

Programmers look at the command line and think that because they understand the function set they can do data science. This has even been reinforced by the appearance of statistical and deep learning tools in Python (which many programmers are comfortable with), and to be honest by the potential to go from the average wage for a Python programmer (now around $85K in the US) to a data scientist (around $110K in the US). Some actually do make the jump successfully, but these were people who in general came into programming from other domains and as such have a good basic grasp of the squishier factors of their respective fields.

At the same time, many people with PhDs who enter as data scientists struggle with the fact that the expectations placed on them include data processing functions that they never had to think about before, because in previous jobs or in academia they controlled the data collection. Clinical data gathered in controlled environments is relatively clean; business data, especially systemic enterprise data dumped into what could be called a data swamp, is anything but. Such data invariably carries hidden assumptions, a mixture of encoding formats and, frequently, poor modeling. Because it was collected primarily as an artifact of a specific process, usually one that had nothing to do with analytics, making that data say anything useful can be challenging even when you do know the domain, let alone when you don’t.
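
As one small illustration of what "a mixture of encoding formats" means in practice, here is a hedged Python sketch of a decoder that tries a short list of encodings in turn. The function name and the choice of candidate encodings are assumptions made for the example; a real pipeline would pick its candidates from what is known about the source systems.

```python
def decode_field(raw: bytes, encodings=("utf-8", "cp1252")) -> str:
    """Decode a raw field by trying each candidate encoding in order.

    Falls back to latin-1, which maps every byte to a code point and so
    never fails (though the result may still be wrong for the data).
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode("latin-1")


# The same word as two different upstream systems might have stored it.
from_web_app = "café".encode("utf-8")
from_legacy_export = "café".encode("cp1252")

print(decode_field(from_web_app))
print(decode_field(from_legacy_export))
```

A guess like this can only recover the characters. Whether the field means what the analyst thinks it means is still a question of domain knowledge.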

On a related note, I want to stress again the importance of enterprise knowledge graphs and metadata/identity management. One of the central problems that both data engineers and data scientists face is the need for organizational data that is consistent across teams and requires as little retranslation as possible. This doesn’t necessarily mean that every database needs to have the same labels and definitions, but it does mean that where there is divergence, there must be some means of accessing information through a common ontology.
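
In its simplest form, "accessing information through a common ontology" amounts to a maintained mapping from each system’s labels to a shared vocabulary. The Python sketch below uses a plain dictionary for that mapping; the source names, field names, and records are all hypothetical, and a real knowledge graph would carry far more (definitions, identities, relationships) than a rename table.

```python
# Hypothetical per-source label maps onto one shared vocabulary.
COMMON_TERMS = {
    "crm": {"cust_id": "customer_id", "zip": "postal_code"},
    "billing": {"CustomerNo": "customer_id", "PostCode": "postal_code"},
}


def to_common(source, record):
    """Rename a record's fields to the shared vocabulary.

    Fields without a known mapping are passed through unchanged.
    """
    mapping = COMMON_TERMS[source]
    return {mapping.get(field, field): value for field, value in record.items()}


crm_row = {"cust_id": 7, "zip": "98101"}
billing_row = {"CustomerNo": 7, "PostCode": "98101", "balance": 12.5}

print(to_common("crm", crm_row))
print(to_common("billing", billing_row))
```

Each database keeps its own labels; the mapping, and the people who curate it, are what make the two rows comparable.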

This means that ontologists and curators still play a role (and a growing one) in the overall mix of data professionals in an organization. I hope to address this more in an upcoming article, but keep in mind that the people who manage and organize the metadata of that organization are the ones that make the consistency of results possible.

The days of the lone analyst are long over. In most cases, you need a team of different people: data engineers, analysts, visualizers and storytellers, to be effective.

Data Science Is A Team Effort

From a management perspective, there are many lessons to be gained here. Don’t try to turn your programmers into data analysts unless they have a solid analytics background already. If you are getting into deep data analytics as part of your company flow, make sure that you have a good knowledge engineer and data quality crew hired first to handle ingestion from data sources, and bring in your analyst primarily to help the knowledge engineer know what kind of data they need to have to perform their jobs properly.

Once you’ve proven out this process, then you can start bringing in other analysts, keeping in mind that their goal is both to make sense of the data that they’re being handed and ultimately to build models that make predictions about future behavior possible. Hiring a data strategist first is an even better solution because ultimately the goal of such models is to inform decisions about future actions, and having someone who oversees this process is essential.

Recognize that an end-to-end data strategy requires thinking about the data lifecycle as a team effort. In my experience, there is a world of difference between a data analyst, who can build and interpret a given model based upon existing data, and a data whisperer, someone who can take this interpretation and put it into terms that a lay audience can readily understand. The data whisperer, in turn, works with data visualizers, programmers who are adept at creating meaningful visualizations of this information (an R graphic may be suitable for a dissertation or paper, but will likely be meaningless to the typical business person).

Finally, in the post-Covid environment affecting everyone, it may be more useful to work with an outsourced data analytics team at first, to get an idea of how best to use data analytics in your own organization before committing to an in-house team. If you use such teams to generate quarterly reports, it’s often more cost-effective to go external, but once your data analytics are actually fueling other initiatives, building an internal team makes far more sense.

Conclusions

There are few things stopping programmers from becoming data analysts and vice versa, but it is important within any business to understand the difference between the two roles, and how one can (and should) support the other. A data analyst (even one who is part of a broader team) is ultimately the definition of a subject matter expert, someone who can put the necessary context of a given field into perspective to determine the answers to questions. Programmers, for the most part, are tool builders who provide the tools necessary for analysts to better perform their own roles, as well as helping to visualize and otherwise prepare the analysis for dissemination. Both should know their way around a command line, but what they do with it can be very, very different.

Original post: https://www.bbntimes.com/technology/why-programmers-are-not-data-scientists-and-vice-versa
