Should data scientists sign an ethical code?

Written by ram_ilan | Published 2017/09/12
Tech Story Tags: iot | data-science | data-scientist | ethics | viatical-industry


Consider this.

A terminally ill patient has just been told by her doctor that she has about a year left to live. She holds a life insurance policy worth $100,000. But what she needs now is money: for medical care, or perhaps simply to live her remaining days well.

Say an investor offers to buy that policy from her for $50,000, half its face value, and to take over the annual premium payments. When she dies, he will be the one to collect the $100,000.

It’s a deal. The dying policyholder gets access to cash, and the investor makes a profit. But there’s a catch: everything depends on her dying on schedule.

This market is known as the viatical industry. The instrument guarantees a fixed payoff at death, but the rate of return depends on how long the person lives.

If the policyholder dies in less than a year, the investor makes a killing (metaphorically, at least). If she survives two years, his annual rate of return is roughly cut in half, and he has to keep paying the premiums. If she somehow recovers, the investor could end up making nothing.
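To make the arithmetic concrete, here is a minimal sketch in Python. The purchase price and payout are the figures from the example above; the $2,000 annual premium is an assumption added purely for illustration.

```python
def viatical_annual_return(payout, purchase_price, annual_premium, years_survived):
    """Approximate (simple, non-compounded) annualized return on a viatical settlement.

    The investor pays `purchase_price` up front, pays `annual_premium` for every
    year the policyholder survives, and collects `payout` at death.
    """
    total_outlay = purchase_price + annual_premium * years_survived
    profit = payout - total_outlay
    return profit / total_outlay / years_survived

# Illustrative numbers: $100,000 payout, bought for $50,000, assumed $2,000/year premium.
for years in (1, 2, 5):
    r = viatical_annual_return(100_000, 50_000, 2_000, years)
    print(f"Policyholder lives {years} year(s): ~{r:.0%} annual return per year")
```

With these assumed numbers, the annualized return falls from roughly 92% if she dies within the first year to about 43% if she survives a second year, and keeps shrinking the longer she lives.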

Sounds too horrible to be true? In his book What Money Can’t Buy, Michael Sandel brings to light many more such searing examples of market-driven practices where the moral limits of money come into question, even as you start wondering whether there are things money shouldn’t be able to buy.

When we bring this analogy to the dynamic discipline of data science, a very interesting parallel emerges.

Data has been hailed as a panacea for many problems. And it has indeed helped fields across disciplines take huge strides. But as more and more data scientists join the industry, should there be more discussion of the moral limits of data?

The key question:

Should data scientists sign a code committing them to deal sensitively with data, just as doctors take an oath to care for their patients?

Let’s understand why this is becoming urgent today.

The gap:

Most discussions about the moral limits of data are restricted to regulations around its use: which data may be used, what privacy restrictions apply, how those rules vary across countries, and how companies incorporate them into their processes.

Forrester’s Data Protection Heatmap gives a country-by-country snapshot. But as data becomes more open and pervasive, and as analytics solutions start becoming invisible, is the question around the ethics or morals of data also changing?

The key question:

How is data generated and how is it used for different purposes?

Danger by likes:

Psychologist Michal Kosinski developed a method to analyze people based on their Facebook activity. It was built on a model that assesses human beings along five personality traits, known as OCEAN: openness, conscientiousness, extraversion, agreeableness, and neuroticism.

The claim: from these traits, one could make a relatively accurate assessment of a person, including their needs, fears, and how they are likely to behave. The Big Five have become the standard framework of psychometrics. But for a long time, the problem with this approach was data collection, because it involved filling out a complicated, highly personal questionnaire.
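When such a questionnaire can be replaced by behavioural traces, the modelling itself is simple. Below is a minimal, purely illustrative sketch, not Kosinski’s actual method: each user is represented as a vector of page likes (synthetic here), and a plain linear model is fit to a trait score.

```python
# Illustrative sketch only: predicting a single OCEAN trait score
# from binary "page like" features with a simple linear model.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_users, n_pages = 1_000, 500
likes = rng.integers(0, 2, size=(n_users, n_pages))           # 1 = user liked the page
signal = rng.normal(0, 0.1, size=n_pages)                     # hypothetical trait signal per page
openness = likes @ signal + rng.normal(0, 0.5, size=n_users)  # synthetic "openness" scores

X_train, X_test, y_train, y_test = train_test_split(likes, openness, random_state=0)
model = Ridge(alpha=1.0).fit(X_train, y_train)
print("Held-out R^2:", round(model.score(X_test, y_test), 2))
```

The point is not the model, which is ordinary, but the data: once likes stand in for the questionnaire, anyone with access to them can run this kind of assessment at scale.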

Then came the Internet. And Facebook. And Kosinski. But this isn’t the full story. The article digs into something else far more disturbing.

Without Kosinski’s knowledge, a small company called Cambridge Analytica used a very similar, perhaps even identical, model to help fashion two of the most unthinkable political results in recent history: Brexit and Trump. Did Kosinski imagine the potential moral limits of his model?

The key question:

If our data is used with a very different intent than the one it was collected for, how can we control its potential impact?

Danger by devices:

Imagine the amount of data a small wearable device such as a watch might have about each one of us. Wearable tech and IoT are not just buzzwords. They have come very close to us — to our devices and our homes. Now, look at this headline.

“Hacked by your fridge.”

The article described a cyberattack launched through smart fridges just last year. Without breaking the security features of laptops and other devices, hackers could use often-unsecured IoT devices as a foothold to compromise an entire network.

At what point does the personalization we need and welcome change into unwelcome persecution and downright danger?

Security in wearable devices has become a key concern for many people, as the information on a wearable is becoming even more personal and valuable than anything our credit cards could reveal about us.

The key question:

If the potential misuse of data could lead to unintended consequences, what safeguards do we need to have in place?

The need:

Where does the solution lie?

Individuals cutting off access to their personal data? That is becoming increasingly unrealistic.

Manufacturers of devices putting in strong security controls on such devices? That’s happening for sure.

Or does it also lie with the data scientists who devise the intelligence these machines run on, by heightening their awareness of, and sensitivity to, the potential moral limits of data? Is this a code all of us need to sign?

Let’s change the conversation to good data.

The intention of this article is to raise awareness about the potential use of data and our individual and collective responsibilities.

(This write-up first appeared on LinkedIn Pulse, Datafloq, and BRIDGEi2i and was authored by Prithvijit Roy, CEO and Co-founder at BRIDGEi2i Analytics Solutions)

If you liked this article, please do clap and leave a comment so that it reaches more folks!

