Can you imagine modern life without electricity? I bet the idea seems ludicrous, and clearly, if you are reading this, you are among the 87% of the world population that enjoys access to electrical power. Andrew Ng, a household name in the Machine Learning community, famously stated that “AI is the new electricity”, which is a strikingly accurate analogy when one starts unpacking its implications.
But while electricity mostly produces perceptible, understandable and tangible effects, the results of using AI, or to be precise, Machine Learning technology, are inconspicuous, inscrutable and hidden for most of us. Nevertheless, Machine Learning pervades most aspects of modern society and industries, fueled by incentives such as efficiency, optimization, cost savings, increased insight and productivity.
There are several pitfalls to be aware of. Arguably, the worst outcome is not having wasted time and money on implementing a Machine Learning system that didn’t perform as well as hoped. It would be far worse if this system participated in perpetuating societal bias that hurts people and harms our society, either surreptitiously through discrimination, by limiting equal access to opportunities, by declining mortgages with no possibility of appeal, or by hiding job listings from candidates who don’t fit the traditional profile.
Alternatively, the system could cause overt injury, for instance through autonomous vehicles that make fatal decisions based on poorly understood image recognition, or by recommending to patients medical diagnoses and treatments that are derived from flawed reasoning.
There’s a little Black Box for everyone
Progress in Data Science and Machine Learning has a few things in common with a runaway freight train, such as speed and an accident waiting to happen. Academic researchers are frantically spitting out papers featuring new algorithms and incrementally improved results. Meanwhile, even swifter progress is occurring on the hardware and software side of the same AI hype-coin. Few other research fields, if any, have so rapidly transitioned from being a small niche of mostly academic interest to becoming so readily available and desirable to learn for non-expert users.
This trend is fuelled by a fortuitous congruence of circumstances. The essential programming languages for Machine Learning are intuitive and high-level, and can easily be learned in a few weeks’ time by someone already familiar with computer programming. And, with the availability of powerful open-source libraries and frameworks, coupled with high-quality data sets in the public domain and cheap cloud computing, Machine Learning has turned into an endeavour with a fairly low barrier to entry.
But these technical aspects aside, an inherently wonderful benefit of Machine Learning being a young research field is that the general rule from the outset has been to publish scientific papers and results openly, as pre-prints on arXiv.org and in Open Access journals and proceedings such as JMLR and PMLR. Tutorials and lectures are freely shared on YouTube and as free online university courses. This has led to a democratisation of Machine Learning knowledge, where skills and know-how really only depend on access to electricity, internet and a laptop. In other words: Anyone can do Machine Learning today.
However, the mere fact that anyone and their grandma is able to get a Machine Learning system up and running does not necessarily imply that there is any understanding, or even intuition, of what is going on under the hood, either during development or after the system is put into production. Most such systems are magic black boxes, where one inserts data into one end, stirs around for some indeterminate amount of time, and then a number plops out on the other side.
So, how can we guard ourselves against potentially harmful outcomes if we don’t know what is going on? Is it possible to peek inside this Magic Black Box that is Machine Learning, and understand how and why a Machine Learning system arrived at a particular result?
How to Good Science
Let’s take a step back and contemplate the word “science” that makes up one half of “Data Science”. It is hard to properly define what constitutes “Good” science, but, if we look to the STEM research fields, there are some established guidelines and principles to follow in order to avoid doing outright bad science. When designing an experiment to test a hypothesis, it is equally important to put considerable thought into how to ensure an optimal execution of the experiment, by following these steps:
Isolate the experimental setup to avoid contamination from external influences.
Choose proper metrics.
Accurately measure and store the sampled data.
Analyse and interpret the data.
Verify the results.
Especially important is The Zeroth Law: Remember to leave predetermined viewpoints at the door, and emotionally prepare for having the hypothesis completely disqualified…
These guidelines are applicable to Machine Learning as well, if one thinks of the training of the model as being the experiment, and the output of the model as being the data sampled and collected during the execution of the experiment. Choosing suitable metrics, storing the output, and even verifying the results, are easily transferable concepts.
The challenge lies with controlling for outside influences and interpreting the output, because the training data that is put into the Machine Learning system contains both the desired signal that tests the hypothesis and external noise that contaminates and confounds the result. And, when interpreting the result, we need to be able to decouple the signal from the noise.
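To make the idea of decoupling signal from noise a little more concrete, here is a minimal sketch in Python. It uses permutation importance, which is my choice for illustration and only one of several possible techniques. Everything in it is an assumption made up for the example: the synthetic data, the two feature columns (one carrying the signal, one pure noise), and the helper names `fit_two_feature_ols`, `mean_squared_error` and `permutation_importance`. The idea is to shuffle one input column at a time on held-out data and see how much the error grows. A column the model truly relies on should hurt a lot when scrambled, while a noise column should barely matter.

```python
import random


def fit_two_feature_ols(rows, targets):
    """Least-squares fit of y ~ w0*x0 + w1*x1 (no intercept) via the 2x2 normal equations."""
    s00 = sum(r[0] * r[0] for r in rows)
    s01 = sum(r[0] * r[1] for r in rows)
    s11 = sum(r[1] * r[1] for r in rows)
    t0 = sum(r[0] * y for r, y in zip(rows, targets))
    t1 = sum(r[1] * y for r, y in zip(rows, targets))
    det = s00 * s11 - s01 * s01
    return ((s11 * t0 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det)


def mean_squared_error(weights, rows, targets):
    """Average squared difference between the model's predictions and the targets."""
    errors = [(weights[0] * r[0] + weights[1] * r[1] - y) ** 2 for r, y in zip(rows, targets)]
    return sum(errors) / len(errors)


def permutation_importance(weights, rows, targets, feature, rng):
    """Increase in error when one feature column is shuffled, breaking its link to the target."""
    baseline = mean_squared_error(weights, rows, targets)
    column = [r[feature] for r in rows]
    rng.shuffle(column)
    shuffled = [tuple(v if i == feature else r[i] for i in range(2)) for r, v in zip(rows, column)]
    return mean_squared_error(weights, shuffled, targets) - baseline


def make_data(n, rng):
    """Hypothetical data: column 0 drives the target, column 1 is unrelated noise."""
    rows = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    targets = [3.0 * r[0] + rng.gauss(0, 0.5) for r in rows]
    return rows, targets


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train_rows, train_targets = make_data(500, rng)
test_rows, test_targets = make_data(500, rng)

weights = fit_two_feature_ols(train_rows, train_targets)
importances = [
    permutation_importance(weights, test_rows, test_targets, feature, rng)
    for feature in range(2)
]
print("importance of signal column:", importances[0])
print("importance of noise column:", importances[1])
```

In this toy setting we know which column is the signal because we generated the data ourselves. With real data we do not, which is exactly why such a check should be run on held-out data and read with a sceptical eye.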
Good Data Science requires Interpretability
Academic and corporate interests have been more focused on advancing the theories, algorithms, and software tools in order to perform increasingly elaborate experiments, rather than on experimental verification of the kind practised in the more established physical sciences. Fortunately, the need for accountability is catching up, germinating into the sub-field of Interpretable Machine Learning.
Interpretability in Machine Learning is not a well-defined concept, because it is context-dependent and will mean different things for different problems. But, arguably, one can take it to mean opening up the magic black box, rendering it transparent, and being able to interpret, explain and trust what is going on inside it. Speaking in general terms, to assess if one has achieved Interpretability, I propose that we should be able to answer the following questions with some degree of confidence (that is, given our hypothesis, our questions posed to the data):
In part two of this series I will elaborate on this checklist and supply examples to argue the relevance of these questions and outline potential outcomes of failing to answer them.
Understanding data is easier than understanding science
It might be tempting to buy into the pitch that “Your own engineers and developer teams understand your data the best, so just re-skill them”. Sure, ideally every company that has some data-driven operational aspect, or provides data-driven services, should have an in-house team of data scientists and Machine Learning experts. Also, due to the scarcity of, and high demand for, this competency, it makes sense to retrain in-house developers.
Contrary to a common misconception, though, most data scientists are not programmers by training. They often hold a Master’s degree or PhD in a STEM field, such as Mathematics, Statistics or Physics, have years of experience in doing “Good Science”, and have acquired a generally sceptical attitude towards both data quality and model veracity. However, if one does decide on going down the retraining road, it is highly recommended that the team adheres to the Checklist of Interpretability and becomes equipped with the following:
The knowledge required to identify issues and shortcomings of the data.
The skills to properly pose the questions that one wants the data to answer.
The abilities to interpret and explain the answers the system outputs.
And lastly, for everyone involved, including the product owner, to gain a deep appreciation for The Precautionary Principle:
To avoid using Machine Learning technology that is not fully understood in decision making systems where the stakes are high or critical.
Getting Machine Learning education and training right will play a significant role in the improvement of our collective future. Achieving the common goals of halting climate change, fighting poverty and raising living standards in developing countries requires us to make technology that is resource- and energy-efficient. It must also be interpretable, explainable and proven worthy of our trust, so it does not cause unintended harm along the way.
After all, who would accept reduced access to electricity in order to lessen their ecological impact? Would you?