On the difficulty of creating a data science code of ethics

(NOTE: I’ve updated my thinking, although I still stand by everything I’ve written here. I no longer just think the code of ethics is unimplementable. I think it’s built on the wrong foundation. I explain my reasoning here.)

DJ Patil recently wrote about the need for a code of ethics for data science. It’s not clear to me that data science as a profession is ready for a code of ethics. Codes are just words unless there is a mechanism to enforce sanctions against people who disregard those codes, and I’m pretty sure no single data science community is cohesive enough to enforce rules even for its own members.

That being said, several products backed by machine learning have recently faced a lot of criticism for reinforcing stereotypes or justifying unjust outcomes. (See Patil’s post for some examples, or this post by Mike Loukides for others. Or just Google it. There are a lot of examples.) I’m extremely skeptical that any code of ethics could ever be formulated that would have prevented any of these problems. These weren’t cases of malign intent or moral lapses. They were cases of poor design. Design problems are a reason to worry about competency, not ethics.

But putting those concerns aside, it’s worth thinking about what a code of ethics could look like, because it gets at the idea that a profession should have guiding principles. That’s kind of an old-fashioned idea, and I find that old-fashioned ideas often have a whole lot of wisdom in them.

Patil references the Hippocratic Oath and its importance for the medical profession. As I started looking into it, I was surprised to learn that the Hippocratic Oath isn’t all that important to the medical profession. In most cases, medical schools use a version of the oath that reads more like a greeting card than a solemn vow. Many medical schools don’t require any form of the oath at all. But, again, I tend to think that old wisdom, though sometimes phrased in ways that turn off a modern audience, contains some of the best advice out there. So I want to use the original Hippocratic Oath as a sort of template for thinking about what a data science code of ethics could look like.

And I want to do all of this to illustrate why I think data science as a profession lacks the two things necessary for a code of ethics to be a real force: a cohesive community and a simple core purpose.

The Hippocratic Oath

Before looking at the specifics of the oath, let me just warn readers that the Hippocratic Oath can read as more than old-fashioned. In some places it is offensively archaic. The diamonds buried in that rough are worth the digging. Suspend your modern sentiments for just a moment and then we’ll get back to talking about data science.

“I swear by Apollo the physician, and Asclepius, and Hygieia and Panacea and all the gods and goddesses as my witnesses, that, according to my ability and judgement, I will keep this Oath and this contract:”

No, I’m not suggesting a data science code of ethics should invoke a deity. I think this first portion of the oath contains two important principles. First, if you betray your professional responsibilities, you are betraying whatever is best and highest within you. That’s what differentiates a profession from a job. There’s no shame at all in having a job without a profession, but if you’re going to have a profession, it always has to be more important than any job. Second, any code of ethics necessarily depends upon individual practitioners following that code according to their “ability and judgement”. No code can be so perfectly formulated that it can be applied equally to people of all skill levels, or remove the need for individual human judgement calls.

“To hold him who taught me this art equally dear to me as my parents, to be a partner in life with him, and to fulfill his needs when required; to look upon his offspring as equals to my own siblings, and to teach them this art, if they shall wish to learn it, without fee or contract; and that by the set rules, lectures, and every other mode of instruction, I will impart a knowledge of the art to my own sons, and those of my teachers, and to students bound by this contract and having sworn this Oath to the law of medicine, but to no others.”

The Hippocratic Oath was clearly written at a time when professional skills were transmitted by apprenticeship rather than more formal instruction. But, again, there’s at least one important principle: joining a profession means agreeing to train and teach. There is no such thing as a non-teaching practitioner. You never get to a point in your career where you are too busy, too important, or too advanced not to have an obligation to help develop even the newest newcomer to the field.

“I will use those dietary regimens which will benefit my patients according to my greatest ability and judgement, and I will do no harm or injustice to them.”

Practitioners act at the invitation and request of a customer. If a customer invites you to practice your profession, you are agreeing to benefit that customer as much as is in your ability. That, of course, means you will avoid harming their interests.

“I will not give a lethal drug to anyone if I am asked, nor will I advise such a plan; and similarly I will not give a woman a pessary to cause an abortion.”

This section references moral rules that aren’t nearly as mainstream today as they used to be. Look past that. The basic idea seems to be that a doctor’s job is to save life, not end it. Abstract that to professions in general instead of the medical profession in particular, and I think we can arrive at something like this: a profession has a very small and simple set of core principles; when faced with the need to make any individual, practical judgement call, practitioners should first reference these principles. All other things being equal, the more a judgement call upholds the core principles, the better the judgement call is.

“In purity and according to divine law will I carry out my life and my art.”

I actually really like this short idea. How you act as an individual practitioner will, at least partially, shape how people view your profession. Keep that in mind and act accordingly. When you join a profession, you have a responsibility to more than just yourself and your clients. You have a responsibility to the profession itself.

“I will not use the knife, even upon those suffering from stones, but I will leave this to those who are trained in this craft.”

This part doesn’t apply to our modern conception of doctor, but at the time the oath was written, doctors and surgeons were two different things. The lesson: stick to what you’ve been trained to do. Sutor, ne ultra crepidam.

“Into whatever homes I go, I will enter them for the benefit of the sick, avoiding any voluntary act of impropriety or corruption, including the seduction of women or men, whether they are free men or slaves.”

It takes a little digging to find the foundational principle in this section, but once we do find it, it’s an aspect of professionalism that I don’t often see discussed. As I stated before, professional practitioners work at someone else’s request. When invited into someone’s “home” (company, department, organization, etc.), you are a guest, there to address the need that prompted them to invite you in the first place, and for no other reason. If, while a guest, you see any other people, opportunities, or resources that could benefit you personally, you leave those things alone. Even if it would be ok to take advantage of those things normally, it is not ok when you find them under your employer’s roof, and it is especially not ok to take advantage of them without your employer’s prior informed consent.

“Whatever I see or hear in the lives of my patients, whether in connection with my professional practice or not, which ought not to be spoken of outside, I will keep secret, as considering all such things to be private.”

The strong version of this is “a professional should consider it an insult to be asked to sign a non-disclosure agreement.”

“So long as I maintain this Oath faithfully and without corruption, may it be granted to me to partake of life fully and the practice of my art, gaining the respect of all men for all time. However, should I transgress this Oath and violate it, may the opposite be my fate.”

For the time being, let’s leave out the part about “partaking of life fully” since I don’t know how to operationalize that. But look at the rest: if you adhere to the code of ethics for your profession, you should (1) be allowed to practice and (2) gain respect, in the wider sense of gaining a good reputation. If you violate it, “may the opposite be my fate.” Violating the code of ethics should impede your ability to practice, and should damage your reputation.

Something that strikes me about the Hippocratic Oath is that it is normative, not prescriptive. It tells you things you should or should not do, but it doesn’t tell you what you should do when you’re asked or told to do things you shouldn’t, and only very vaguely talks about what should happen to you if you do those things you shouldn’t. It defines the ideals of the community, but leaves it up to the community to regulate the actual lived experience of those ideals.

That’s important. Really important. It’s easy for a code of ethics to turn into a policy document, where it becomes more legalistic than aspirational. No code will ever prevent abuse of trust or authority. No code will ever ensure just outcomes or even good-faith effort. Communities do that. Codes are a way for individual community members to publicly commit to the community’s ideals.

A (kinda sorta) data science oath

Here’s a rough outline of how these same basic principles could be incorporated into a data science code of ethics:

I have chosen data science as a profession. Therefore, I have a responsibility to my profession as well as to myself and to those for whom I build. To the best of my ability and judgement, I will adhere to the following principles:

When invited to practice my profession, whether in the capacity of an employee, contractor, consultant, or volunteer, I will above all devote my time, talent, and efforts to ensuring that those who enlisted my help are better off for it.

In cases where a request requires an extension of my skills, I will make it clear to all involved that what I produce will be the work of a student. In cases where a request requires entirely new skills, I will first recommend that the task be given to someone with the appropriate skills; if that recommendation is rejected, I will only fulfill the request under the explicit understanding that I do so as a student.

I will never take personal advantage of people, resources, information, or opportunities that I encounter while performing my professional responsibilities. I will never use something that any employer, customer, or colleague considers their own to gain or benefit other employers, customers, or colleagues.

I will devote time to train and teach anyone who wants to learn the profession, never turning away anyone who asks for help without at least giving them some thoughtful guidance about finding what they are looking for.

If I honestly strive to live my profession according to the principles outlined above, may I gain more opportunities to practice my craft. If I knowingly fail to live up to these principles, may I rightfully lose the trust of those who would seek my help.

Now, I don’t propose that we adopt the above wording or anything like it as a code of ethics for data science. It’s an illustration of what such a code could look like if data science’s objectives were the same as medical doctors’ objectives. I was actually just a little surprised at how transferable the principles were.

At any rate, what I’ve written is pretty minimalist. That parsimony, I think, is one of the great strengths of the Hippocratic Oath. If your code of ethics is more than a page, it’s not a code of ethics. It’s a policy manual. Policy manuals are often useful in spelling out specific ways ethical principles translate into everyday practices, but that’s only really productive after consensus on core principles is already in place.

This draft code is missing two important things. One of those things is a cohesive community. In a lot of ways, Hippocrates had it easy — his community was his students, so consensus and cohesion were kind of built in. Data science doesn’t have that. Without the support of a cohesive underlying community, no code is ever going to mean anything.

The other thing this code is missing — something the Hippocratic Oath had — is a simple core principle. Yes, it names a few basic principles: leaving recipients of your help better off, public acknowledgement of cases where you are practicing as a student, never taking personal advantage, and prioritizing teaching and training of others. None of these carry the same sense of distilled purpose as “save life.”

So that raises what, I think, is the most important question in any discussion of data science ethics: what is supposed to be the purpose of data science? I don’t think that question has been answered. We would benefit from fewer debates about what data science is or should be, and more debates about what data science is supposed to accomplish. If even a small group can achieve consensus on that issue, that group will be positioned to determine the future of the profession.