A.I. systems are no longer a novelty: from product and movie recommendations to taxi-hailing services, they are present everywhere, and their adoption and popularity keep growing over time. Fairness is the absence of any prejudice or favoritism toward an individual or a group based on their inherent or acquired characteristics; an unfair system is therefore one that is biased toward or against a certain kind of individual.
Problems with unfair AI
There are many famous cases that underscore the importance of fairness in A.I. systems. One recent paper, ‘Imperfect ImaGANation: Implications of GANs Exacerbating Biases on Facial Data Augmentation and Snapchat Selfie Lenses’, makes this point. It says ‘In this paper, we show that in settings where data exhibits bias along some axes (eg. gender, race), failure modes of Generative Adversarial Networks (GANs) exacerbate the biases in the generated data.’ Many top researchers have been speaking about biases, and there is a lot of active research in this direction.
COMPAS is a system that predicted whether or not a defendant was likely to re-offend. It discriminated to some degree against African American defendants compared to others. The underlying issue was biased data, but a lack of transparency about the system made matters worse. A later study showed the accuracy of the system to be only around 65 percent, comparable to predictions made by non-experts. It was also later found that the risk could be predicted without sensitive features such as race and gender. Had COMPAS been designed with fairness in mind, such a blunder could have been avoided, especially in such a critical domain.
Another very interesting area where bias was discovered is healthcare. The paper titled ‘Dissecting racial bias in an algorithm used to manage the health of populations’ examined an algorithm that helps hospitals and insurance agencies identify which patients will benefit most from high-end healthcare services targeted at high-risk individuals. It found that the algorithm relied on training data in which the ratio between white and Black patients was roughly 7:1. Even though this may represent reality, such imbalanced data is a problem that needs to be mitigated in general.
Digging a bit deeper and simplifying the problem at hand, two classes of bias emerge: algorithmic biases and data biases.
Commonly occurring biases related to data are Historical Bias, Representation Bias, Measurement Bias, Evaluation Bias, Aggregation Bias, Population Bias, Sampling Bias, Content Production Bias, Temporal Bias, Popularity Bias, Observer Bias, and Funding Bias. One good example to showcase bias is Simpson’s Paradox, where the characteristics of subgroups are very different from those of the aggregated data. This means that we should aggregate data only at a level where the components of a group are sufficiently similar, which is often difficult to accomplish. So we need to decide when and how much to aggregate not on the basis of our convenience but according to how the data demands to be segregated.
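To make Simpson’s Paradox concrete, here is a minimal sketch in plain Python. The counts follow the widely used kidney-stone treatment illustration of the paradox and should be treated as illustrative only; the function names are my own and do not come from any library.

```python
# (successes, trials) per treatment and subgroup.
# Illustrative counts only, used to demonstrate the paradox.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def subgroup_rates(data):
    """Success rate of every treatment inside every subgroup."""
    return {
        treatment: {group: s / n for group, (s, n) in groups.items()}
        for treatment, groups in data.items()
    }


def aggregate_rates(data):
    """Success rate of every treatment after pooling all subgroups."""
    pooled = {}
    for treatment, groups in data.items():
        successes = sum(s for s, _ in groups.values())
        trials = sum(n for _, n in groups.values())
        pooled[treatment] = successes / trials
    return pooled


by_group = subgroup_rates(data)
overall = aggregate_rates(data)

for group in ("small", "large"):
    print(group, {t: round(by_group[t][group], 3) for t in data})
print("pooled", {t: round(overall[t], 3) for t in data})
```

Treatment A has the higher rate inside each subgroup, yet treatment B looks better once the subgroups are pooled, because the two treatments were applied to very different mixes of cases. This is exactly the situation where aggregating for convenience hides what the data is saying.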
When it comes to algorithms, we might classify types of discrimination as direct, indirect, systemic, statistical, explainable, and unexplainable. One good example of systemic discrimination is Amazon’s AI hiring algorithm, which turned out to be biased against women.
How to check for biases and discriminatory behavior in these systems depends on the case. Which bias to detect first and which kind of discrimination to address first is a decision specific to each application. But one should try to rule out as many biases as possible from the algorithms and data, while maintaining a healthy balance between effectiveness and fairness.
Tools for Fairness
Industry leaders have taken many interesting approaches to fairness in A.I. I would like to mention a few very interesting ideas in that sphere:
ML fairness gym relies on the foundational idea of understanding the long-term impact of ML decision systems through simulation, and hence tries to create a replica of a socially dynamic system. The simulation framework can also be extended to multi-agent interaction environments. Papers such as ‘Delayed Impact of Fair Machine Learning’ tell us how important it is to consider dynamic and temporal factors.
AI Fairness 360 by IBM is an open-source tool to address the issue of fairness in data and algorithms. It implements techniques from several research papers and provides bias detection, bias mitigation, and bias explainability tools.
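To give a feel for what bias detection means in practice, here is a minimal hand-rolled sketch of two commonly used group-fairness metrics, statistical parity difference and disparate impact. It does not use the AI Fairness 360 API; the function names, the toy predictions, and the group labels "a" and "b" are all assumptions made for illustration.

```python
def selection_rate(predictions, groups, group):
    """Share of favorable (1) predictions received by one group."""
    picked = [p for p, g in zip(predictions, groups) if g == group]
    return sum(picked) / len(picked)


def statistical_parity_difference(predictions, groups, unprivileged, privileged):
    """Unprivileged selection rate minus privileged selection rate (0 is parity)."""
    return (selection_rate(predictions, groups, unprivileged)
            - selection_rate(predictions, groups, privileged))


def disparate_impact(predictions, groups, unprivileged, privileged):
    """Unprivileged selection rate divided by privileged selection rate (1 is parity)."""
    return (selection_rate(predictions, groups, unprivileged)
            / selection_rate(predictions, groups, privileged))


# Hypothetical model outputs: 1 = favorable decision, 0 = unfavorable.
predictions = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a"] * 5 + ["b"] * 5  # "a" is treated as the privileged group here

spd = statistical_parity_difference(predictions, groups, "b", "a")
di = disparate_impact(predictions, groups, "b", "a")
print(f"statistical parity difference: {spd:.2f}")
print(f"disparate impact: {di:.2f}")
```

In this toy data group "a" receives the favorable outcome 80 percent of the time and group "b" only 20 percent. A disparate impact well below 1 (a threshold of 0.8 is often used as a rule of thumb) is the kind of signal that would prompt a closer look and possibly a mitigation step.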
FATE: Fairness, Accountability, Transparency, and Ethics in AI. In this offering by Microsoft, we get effective assessment tools, visualization dashboards, and bias mitigation algorithms. It’s a good tool for comparing the trade-offs between the performance and fairness of a system.
Even the EU Commission white paper on artificial intelligence emphasizes taking fairness into account. Algorithms that appear to be black boxes at the moment are the subject of active research aimed at understanding them better, and that research is making great progress. As these algorithms become more explainable, it will become much easier to track down biases and make the necessary interventions to ensure fairness. Papers such as ‘Inceptionism: Going deeper into neural networks’, ‘The building blocks of interpretability’, ‘Feature visualization’, and many more show progress in this direction. When it comes to explainable AI, many tools are available right now that we can use to understand how even very complicated, black-box algorithms work. Tools such as LIME and SHAP, the use of simpler surrogate models, and feature importance graphs are very helpful here. For advanced applications on unstructured data, such as deep learning, techniques such as Grad-CAM and attention visualization have also become popular for interpretability.
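As a small illustration of the feature importance idea, here is a sketch of permutation importance in plain Python: shuffle one feature at a time and measure how much the accuracy of the model drops. The `black_box` function is a hypothetical stand-in for an opaque model, and the data is randomly generated; neither LIME nor SHAP is used here.

```python
import random


def black_box(row):
    """Hypothetical opaque model: leans on feature 0, a little on feature 1,
    and ignores feature 2 entirely."""
    return 1 if row[0] + 0.2 * row[1] > 0.6 else 0


def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Average drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild every row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(200)]
# For the demo the labels are the model's own outputs, so baseline accuracy is 1.0.
labels = [black_box(r) for r in rows]

importances = permutation_importance(black_box, rows, labels)
for j, score in enumerate(importances):
    print(f"feature {j}: importance {score:.3f}")
```

The feature the model ignores gets an importance of zero, while the feature it leans on gets the largest score. From a fairness standpoint, a sensitive field (or a close proxy for one) showing up with high importance is a hint that the model deserves a closer look.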
Practices to ensure the fair practice of AI
Google also provides a set of fair practices that are fundamentally based on two major ideas: ensuring transparency in how the algorithm makes decisions, and forming teams that are diverse in nature. The main idea is to capture many varied views about the data and algorithms so that the issue of bias can be attacked from all corners. In addition to having people from different domains in the team, the open-source community can also serve as an extended team. Community groups are also useful in creating awareness and ensuring transparency. Model drift, as well as the post-deployment performance of the systems, should also be monitored more vigilantly. There should be extended studies of the origin of the data, data collection methods, data preprocessing, data post-processing, its labeling, the possible presence of sensitive fields such as race, gender, and religion, and whether the data is diverse enough and balanced across all the classes present.
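Part of such a data study can be automated. The sketch below flags which sensitive fields are present in a dataset and which values are underrepresented. The field names come from the list above; the `audit` function, the 20 percent threshold, and the toy records are assumptions chosen only for illustration.

```python
from collections import Counter

SENSITIVE_FIELDS = {"race", "gender", "religion"}


def audit(records, label_field, min_share=0.2):
    """Report sensitive fields present and values whose share is below min_share.

    min_share is an arbitrary illustrative threshold, not a standard.
    """
    present = sorted(SENSITIVE_FIELDS & set(records[0]))
    report = {"sensitive_fields": present, "shares": {}, "underrepresented": {}}
    for field in present + [label_field]:
        counts = Counter(r[field] for r in records)
        total = sum(counts.values())
        shares = {value: count / total for value, count in counts.items()}
        report["shares"][field] = shares
        report["underrepresented"][field] = sorted(
            value for value, share in shares.items() if share < min_share
        )
    return report


# Hypothetical loan records: nine applicants marked "m", one marked "f".
records = [
    {"gender": "m", "income": 40 + i, "approved": i % 2} for i in range(9)
] + [{"gender": "f", "income": 55, "approved": 1}]

report = audit(records, label_field="approved")
print("sensitive fields present:", report["sensitive_fields"])
print("underrepresented values:", report["underrepresented"])
```

A check like this does not prove a dataset is fair, but it surfaces imbalances early enough that they can be discussed by the team before a model is trained.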
Recent papers include a new section, ‘Broader Impact’, which also covers the ethical aspects of using the algorithms they present. This shows researchers’ increasing sensitivity to developing not just more accurate systems but also fair ones. NeurIPS 2020, one of the most famous deep learning and machine learning conferences, published guidelines for the Broader Impact section as follows:
‘Authors are required to include a statement of the broader impact of their work, including its ethical aspects and future societal consequences. The authors should discuss both positive and negative outcomes if any. For instance, authors should discuss who may benefit from this research, who may be put at disadvantage from this research, what are the consequences of the failure of the system, whether the task/method leverages biases in the data. If authors believe this is not applicable to them, authors can simply state this.’
In conclusion, we can see that the research world, as well as the engineering world, is now taking the problem of unfairness in AI seriously, and we can see good work coming out. I feel that in the future it will become almost a prerequisite for AI systems to meet a standard bar of fairness, both in terms of the training data used and the algorithms used. One troublesome thing can be tracking down faulty biases in increasingly complex systems and in the huge amounts of data on which they are trained. One good example is GPT-3, a 175-billion-parameter language model (a deep learning system) trained on a corpus as big as 2000 GB. If progress in understanding these systems and in fairness research keeps pace with the development of new methods, then the future is safe and we can expect a safer and fairer space. In the future, we might witness a dedicated body, something like the FDA, consisting of experts from diverse fields, that will ensure the fairness of these systems. This might also require developing, before it is too late, standardized procedures for checking biases and other ethical standards, and these procedures should also scale well to huge data sources.
Original post: https://towardsdatascience.com/fairness-in-a-i-5d3ceaaf649