AI and explainability, a non-problem?

The use of AI/machine learning in decision-making is often considered problematic due to its opacity. This consideration is not unreasonable. When making a decision, it is generally assumed that the decision should be explainable to the subject of the decision. If the decision cannot be explained, we have no meaningful way of holding the decision-maker accountable.

Because of the opacity that frequently occurs with AI decision-makers, many, like the EU commission on ethical AI, argue that a ‘principle of explainability’ should apply to all AI. In short, developers need to make it possible to understand how an AI makes its decisions in order to meet the requirement of ethical AI.

But is this requirement justified? In a recent article, Scott Robbins, a former computer engineer and current PhD candidate focusing on the ethics of AI, argues that ‘the principle of explainability’ (he calls it ‘explicability’) is “misguided”. He is concerned that this principle might prevent us from reaping the benefits of AI and give rise to redundancies in its development. I find his argument quite compelling, and in this post I will summarize it in an easy-to-digest way.

Scary-looking android in a suit, looking at the camera
Are you sure you want to hear his explanation? Image by Morning Brew via Unsplash

Why the principle of explainability is misguided

Robbins’s argument follows from the observation that explainability is a function of the decision to be made, not the agent/entity making the decision. “We do not require everyone capable of making a decision to be able to explain every decision they make. Rather, we require them to provide explanations when the decisions they have made require explanations”. The result, he argues, is that many AI decision-makers are redundant.

How so? The simple reason is that, in most cases, applying the principle of explicability “assumes that we have a list of considerations that are acceptable for a given decision”, but if we already know the acceptable considerations “then we could just hard code them into traditional automation algorithms rather than let the ML algorithm take the role of decision-maker”. In other words, there is no need to use an AI to learn rules from historical data when we already know the rules beforehand. If we do know them, we can write a traditional algorithm, such as a decision tree.

For example, consider an AI that decides whether to give someone a loan. If it rejects a loan application, it is reasonable to ask why it decided to do so. Was it because the applicant had a high debt-to-income ratio, or was it because they were black? If we investigate and find that it was because of high debts, we say that it made the right decision. If we find that it was because the applicant was black, or a proxy thereof, we say that it is wrong. However, in doing so we explicate the acceptable considerations for accepting or rejecting loan applications, and having done so, the AI is redundant. We could instead make an algorithm with the rule “if debt-to-income ratio exceeds 2, reject loan, otherwise accept”.
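To make the point concrete, here is a minimal sketch of what such a hard-coded rule might look like. The threshold of 2 comes from the example above; everything else (the `Applicant` type, the `decide_loan` function, and the treatment of applicants with no income) is a hypothetical illustration, not something taken from Robbins’s paper.

```python
from dataclasses import dataclass

# Threshold from the example rule in the text; the name is hypothetical.
MAX_DEBT_TO_INCOME = 2.0


@dataclass
class Applicant:
    debt: float
    annual_income: float


def decide_loan(applicant: Applicant) -> tuple[str, str]:
    """Return a (decision, reason) pair for a loan application.

    The only consideration is the debt-to-income ratio, so the
    explanation for any decision is simply the rule itself.
    """
    # Assumption: with no positive income the ratio is undefined,
    # and this sketch treats that case as a rejection.
    if applicant.annual_income <= 0:
        return "reject", "no positive income, so the ratio is undefined"

    ratio = applicant.debt / applicant.annual_income
    if ratio > MAX_DEBT_TO_INCOME:
        return "reject", f"debt-to-income ratio {ratio:.2f} exceeds {MAX_DEBT_TO_INCOME}"
    return "accept", f"debt-to-income ratio {ratio:.2f} does not exceed {MAX_DEBT_TO_INCOME}"


if __name__ == "__main__":
    print(decide_loan(Applicant(debt=50_000, annual_income=20_000)))
    print(decide_loan(Applicant(debt=20_000, annual_income=20_000)))
```

Because the acceptable consideration is written directly into the rule, every decision carries its own explanation, and no learned model is needed to produce it.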

Another consequence of the principle of explainability is that in low-stakes decisions, those where we do not require explanation, the development of AI is hampered by having to make it explainable. For example, in the promising field of NLP (natural language processing), it is not clear that an AI must be explainable. If we want to translate a web page from one language to another, it is sufficient that the translation is good (whatever that means), not that the choice of specific translations can be accounted for.

Similarly, requiring explainability means that AI decision-makers could be developed for tasks that might as easily (and more cheaply) have been done by decision trees, i.e. by focusing on the considerations that must hold for the decision to be ‘right’.

In short, the principle of explainability is misguided, and we might add, misleading, because it confuses the decision-maker with the decision. Focusing on the latter will be more fruitful and can also provide a path for non-explainable AI.

In conclusion, I find Robbins’s paper a highly valuable contribution to the literature on explainability and AI. I particularly like the simplicity of his argument and the way it defuses a lot of the mystique concerning opaqueness in AI, which is sorely needed in the literature. This is not to say that there are no unresolved problems. First, there is the problem of unintended consequences. Even if we let AI make decisions in low-stakes situations, there might be consequences down the line where accountability is needed. Second, there might be opportunity costs in not letting AI make decisions. For example, if someone develops an AI that makes policy proposals to mitigate global warming, can we afford to say no? Lastly, there might be gray areas where the property of requiring explanation is itself disputed. In some cases, some people might say that a decision requires an explanation while others say it does not. None of these are problems with Robbins’s paper, but they show that shifting our perspective from AI to the properties of decisions leaves ample room for further theorizing.

Robbins’s paper can be read here. Here is the link to his website, where his other papers are published. I first heard about Robbins from his appearance on the Philosophical Disquisitions podcast hosted by John Danaher, which you can find here or wherever podcasts are found.

Professional ethics in the age of AI: Upgrading to v3.0

Doctors versus Google

Can a team of laypeople armed with Google beat doctors at diagnostics? That is the premise of a Norwegian TV show that has won international acclaim. Doctors are seemingly happy to participate and defend the honor of their practice. But the very fact that this is a realistic challenge is symptomatic of a more general and fundamental shift in the traditional power base of the professions. Developments in the field of artificial intelligence and the proliferation of online services are making accessibility of knowledge less dependent on traditional modes of professional practice. I believe this calls for a new perspective in professional ethics that takes these shifts seriously. As I will explain, “professional ethics version 3.0” may be an appropriate term for this upgrade.

“Increasingly capable machines”

The developments that necessitate this new perspective in normative theorizing are vividly portrayed in Richard and Daniel Susskind’s book The Future of the Professions (2015). They argue that technology is dismantling the monopolies of the traditional professions, and that this is for the better. In what they call our current “technology-based Internet society,” new ways of sharing expertise are refashioning public expectations. The book presents telling numbers on how artificial intelligence and online services are outcompeting traditional practices of providing academic courses, medical information, tax preparation, legal advice and more. Tasks that have been performed by professionals are taken over by “increasingly capable machines” that allegedly deliver services cheaper, faster, and better.

Normative theorists need to consider what these findings and predictions imply with regard to standards of professional role morality. Given that we are facing complex and fundamental change due to the possibilities of artificial intelligence, theories of professional ethics need to address how this alters the ground for legitimate public expectations and the conditions of trust. In particular, how does technological change in practice affect the merits of professional decisions and actions?

Professional ethics before AI

The call for a “third version” of professional ethics may sound hyperbolic, but let me explain how it relates to two previous stages. Version one concerned individual professionals. The early professional ethics codes were highly aware of how the behavior and values of the single role holder reflected on the public standing of the profession as a whole. Although this aspect has never disappeared, we can speak of a second stage (version two) when organizations and their procedural regulations gained more attention. This has been called “the institutional turn” in professional ethics (cf. Thompson, 1999). While organizations have always shaped professional practice, the appreciation of their significance for professional responsibility was gradual. The question now is how the swift arrival of artificial intelligence and new modes of sharing expertise changes our moral relation to professionals.

Philosophers should work in tandem with sociologists here. In this regard, consider how a call for a transition from version one to version two was foreshadowed in sociological writing. Thirty years ago, Andrew Abbott noted in his cornerstone contribution to professional sociology, The System of Professions (1988), that the public approval of professional jurisdictions rested on outdated archetypes of work. The professions wanted to appear virtuous, but the public image of the virtuous professional did not really track institutional reality. Abbott drew attention to how the public continued to think of professionals in the image of a romanticized past: “Today, for example, when the vast majority of professionals are in organizational practice, and indeed when only about 50 percent of even doctors and lawyers are in independent practice, the public continues to think of professional life in terms of solo, independent practice” (p. 61).

When machines become professionals

How is the third version special compared to the previous two? One important distinction is that the third version is gradually dispelling the social logic of ordinary morality, which arguably remained perceptibly intact even in the organizational setting. That is, the organizational aspect of professional practice does not by itself imply a radical break with the kind of interaction we are familiar with from ordinary, non-institutional morality. There are still face-to-face interactions that enable immediate emotional responses.

Care, loyalty, and respect are key virtues of role holders in hospitals or classrooms. They are also concepts that most clearly apply to the relations between agents who encounter each other directly. To care about patients or pupils, for example, seems to involve being concerned about the condition of concrete individuals, as opposed to more abstract categories. Similarly, loyalty to clients often requires attentiveness to how needs and interests are expressed (how they matter to this client), not just mechanical subsumption under institutional rules. Moreover, respect for autonomous decisions requires that conditions are present for making a professional judgment about relevant agent capacities of the decision-maker (e.g., understanding, free deliberation).

A natural question, then, for those who have worked with ethical theories for traditional practice will be how the old concepts translate to the new scene. What happens to the values of professional practice that were grounded in genuine human engagement and direct emotional participation? Susskind and Susskind are not worried about this; they believe machines will become better than humans at engaging with understanding and empathic emotions (2015, p. 280). But whatever the technological realism of this stance, there is reason to stop and consider the conceptual difficulties it faces. We appreciate sincere expressions of empathy precisely because they communicate genuine like-mindedness. Many of our emotional reactions are tied to ideas about human dignity, fellowship, and mutual respect. We might have to find a new moral base for our interaction with machines. My suggestion here is that the third version of professional ethics needs to explain how the traditional moral concepts change meaning and significance when professional work is gradually decomposed into more specialized tasks and new technology takes over old ones.

New standards for professional practice?

A professional ethics for the new age is not just about the substance of norms and emotions, but also about how the standards for this normative order are derived or constructed. That is, even the basic sources of legitimate professional standards may be changing. Professional associations have traditionally developed their codes through appeals to the “internal” or “intrinsic” values of their practice. Some may hold that radical change in this regard is called for by the opportunities of technology. Technology may not merely be a vehicle of diffusing information; it may entail a form of “democratization” of the legislative process for professional norms. For example, one could argue that what is needed, for the most part, are efficient systems for registering user contentment. Now that people are being serviced in greater numbers at greater distances, the argument goes, the important thing is getting tools for aggregating satisfaction and adjusting the systems accordingly.

I believe, to the contrary, that the standards of professional ethics cannot be reduced to aggregating satisfaction. It is a mark of professional integrity to resist pandering, to aim to rectify self-serving beliefs, and to make decisions responsive to genuine professional values. While some choice-friendly aspects of the new systems can overcome pernicious forms of paternalism that were made possible by traditional practice, there is still a need to allow professional judgment to be a counterweight to mere user satisfaction.

What machines can’t do

One reason for emphasizing the need for professional judgment is the lack of collaborative ability in machines. There is no mutual agreement on the appropriate end to pursue; the machine cannot adequately make normative assessments of the cognitive processes of others, and it cannot place goals within a larger space of meaning (a lifeworld). The machine basically aids us in achieving our ends as they are, with at most a weak ability to interpret our situation or make counter-suggestions. In short, machines do not understand us and do not engage with us to determine our goals. This is a point argued at length in Steven Sloman and Philip Fernbach’s The Knowledge Illusion (2017). These cognitive scientists are skeptical about the potential for automated services to replace professional judgment. One of their findings is that using services like WebMD raises people’s confidence in their own level of knowledge without raising their actual level of knowledge accordingly. People tend to have a rather blurred sense of the distinction between what they know and what knowledge is available.

What does this mean for professional ethics?

None of the above is an argument against letting technology change professional practice. It is rather a point about how a theory of professional ethics can highlight considerations to which the new system needs to respond. The professional practice of the “technology-based Internet society” should be reformed in light of the genuine virtues of professional ethics, not vice versa. While it is important to understand the gains in efficiency derived from compartmentalization, standardization, and automatization, it is also necessary to operate with an adequate conception of what kind of efficiency we should strive for. This does not just require the participation of practitioners of good judgment in the development of the systems. It also requires that theorists of professional ethics help articulate public frameworks for identifying the new ethical challenges that arise.


Abbott, A. (1988). The System of Professions. Chicago: The University of Chicago Press.

Sloman, S., & Fernbach, P. (2017). The Knowledge Illusion. London: Macmillan.

Susskind, R., & Susskind, D. (2015). The Future of the Professions. Oxford: Oxford University Press.

Thompson, D. (1999). The institutional turn in professional ethics. Ethics and Behavior 9(2), 109-118.


Andreas Eriksen is a Postdoctoral Fellow at ARENA Centre for European Studies.
