As part of an international campaign to lift the lid on data privacy violations, The Privacy Collective is asking some of Europe’s leading experts why online privacy matters.
Duncan Ross is the Board Director of DataKind UK, the country’s foremost data philanthropy organisation, and chief data officer of Times Higher Education. He is a business-focused data scientist who has been named one of the top 50 data leaders in the UK by Information Age and formerly worked with Experian as their data director. He tells The Privacy Collective about championing responsible data use, why it’s difficult for individuals to determine how decisions are made, and why the world needs more ethical data scientists.
Can you tell me a bit about your work and mission at DataKind?
DataKind works with the social sector around the world to help organisations understand how they can use data more effectively. Some of the issues these organisations are dealing with are the most challenging that people will face in their lives. There are decisions made around health, around income, around safety and security. And so the users of that service need to be absolutely convinced that they are treated fairly, that their data is being handled responsibly, and that concerns around privacy have been considered.
What does responsible data use look like?
Responsible data use means organisations understand the context in which data is going to be used. One of the difficulties of course is that there’s often a significant gulf between the people who are working in data, and the people who receive the outcome of that data work. If data analysts are aware of the context in which a decision is being made, that can help ensure the available data is used in the right way.
There are some real issues around differential impact, for example, that can be overlooked by a data analyst or algorithm that doesn’t have the right context. Bail is a good example of this. If you’re denied bail, that has a huge impact on you – you have to stay in prison. If you’re given bail, that also has an impact, but it’s not equal. Most data analysis, by default, will treat those two outcomes as if they are equally expensive in terms of the impact on the person.
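The asymmetry Ross describes can be sketched in a few lines of Python. This is a hypothetical illustration only: the cost figures and threshold behaviour below are invented to show why treating the two bail outcomes as equally expensive distorts the decision.

```python
# Hypothetical illustration of asymmetric decision costs in a bail-style
# decision. The cost numbers are invented for illustration only.

COST_DENY_IF_SAFE = 10.0   # wrongly keeping a safe person in prison
COST_GRANT_IF_RISKY = 3.0  # wrongly releasing a risky person

def expected_cost(decision: str, p_risky: float) -> float:
    """Expected cost of a decision given P(person is risky)."""
    if decision == "deny":
        return (1 - p_risky) * COST_DENY_IF_SAFE
    return p_risky * COST_GRANT_IF_RISKY

def decide(p_risky: float) -> str:
    """Pick the decision with the lower expected cost."""
    return min(("deny", "grant"), key=lambda d: expected_cost(d, p_risky))

# A naive, symmetric-cost rule would deny bail whenever p_risky > 0.5.
# With asymmetric costs, the threshold moves: here, denial is only the
# cheaper option once p_risky exceeds 10/13 (about 0.77).
print(decide(0.6))  # "grant" - denying a probably-safe person costs more
print(decide(0.9))  # "deny"
```

The point is not the particular numbers, but that the analyst has to supply the cost asymmetry explicitly; by default, most analysis silently assumes the two errors weigh the same.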
In the social sector, those decisions can be much more critical, and it’s therefore important that data is used in the right way. If someone comes to an advice centre and asks for advice, then AI might be able to make an assumption about other things that will be relevant to that person. But conversely, the cost of getting that wrong could be very high. The context element is crucial.
How well is the social sector using the data they have access to?
It’s growing, but it’s still in the early stages. And part of it is a genuine worry that they don’t have the experience or the expertise to understand these problems. Most decisions are still not made using data, they’re made using people. And those can often be badly executed as well. So this is not just about making sure artificial intelligence (AI) uses data appropriately. It’s about how everyone uses the data, how everyone uses the insights they have and that they do it in a consistent, fair and open way.
If used ethically, data can absolutely change the world for the better. There are all kinds of situations where we see how a sensible and appropriate use of data could move things forward. Charities in the UK are under huge amounts of financial pressure right now, their resources are shrinking, but the need for their support is growing. One thing they can do is to use data to understand whether they’re being as effective as they think they are, and whether they’re making a difference. Charities often get into the habit of doing stuff because it looks good, without necessarily asking, is this actually making the situation better or worse? Sometimes data can help provide that insight.
In the commercial sector, how are algorithms and technology being used by companies around the world to boost profits? What are the implications of that data being misused?
There are huge differences between how a company might use data on their own website and where it’s shared with third parties. So if I go onto Amazon to buy something and they then recommend other products to me, there is a kind of implicit contract because I’ve consented to that – I’m there to buy things, they’re there to sell me things. Much in the same way as if I walk into a department store, the store is free to advertise products to me.
I think where it becomes really difficult is when you have data that is gathered on one site, and is passed onto someone else for action. Then the chain of consent becomes much more difficult to follow. Not necessarily in a legal sense, because often there are checkboxes that will give that third party authority. But certainly consent from the perspective of the individual becomes blurry – do I understand that my data is being passed on and used elsewhere? And how is it being used? Is it just to advertise things to me (which is perhaps irritating more than anything else), or is that data being used by agencies that are making decisions about my life?
We cannot say that all data is being handled badly. We have to work on the assumption that most companies are at least trying to behave in a way that’s legal and appropriate. Most companies do realise that if they behave badly towards consumers, it comes back to bite them. But it’s quite easy for companies to do things which are harmful, without necessarily realising it. And I think that’s part of the danger with data use – it’s not the intentional errors, but the unintentional ones that can be more problematic.
What do we need to do to encourage more responsible data use?
There needs to be a balanced approach. I’m not someone who thinks that technology should be given free rein and somehow, miraculously, the market will resolve everything. There is a role for appropriate regulation, but I do think regulation is quite a difficult thing to do effectively. And so alongside that, people like me who work in data need to take responsibility for the work we’re doing. Sometimes, that means we need good people working in bad companies. We need individuals who have an ethical approach to the data work that they are doing.
If all of the ethical data scientists boycotted Facebook, then Facebook will be full of the unethical data scientists, which probably wouldn’t give the results we want. There’s more we can do to try and help data scientists understand why data use and ethical data use is so important. And that’s one of the things DataKind does as well, our volunteers are pro bono data scientists demonstrating how charities can use data effectively and for good. Hopefully that gives them an insight into other parts of society and helps inform their decisions elsewhere too.
How aware do you think the British public are about the need to take privacy seriously? Has the General Data Protection Regulation (GDPR) had an impact?
GDPR has created a transnational framework for data and data rights. The Information Commissioner’s Office also seems to be a lot more serious about issuing fines than they were under the Data Protection Act, which is positive. But you can’t guarantee that just because you raise an issue, they’re going to take it up or indeed that you’re going to be able to get enough information to build a case. This is why group actions are really important – they allow you to build up patterns of behaviour that you can’t see as an individual and I think there are opportunities for data analysts to help support uncovering that kind of statistical bias.
If I think a decision has been made unfairly against me, for example, it’s quite difficult for me to prove that (regardless of whether it was a decision made by technology or not). It’s only when you look at the statistical group that you can say, ‘well oddly enough, it turns out that everyone who’s disabled hasn’t received that particular offer either’. Then you have a case. The reality is that most people who are on the wrong end of a decision aren’t necessarily aware of how that decision has come about. But they shouldn’t have to be. They should just be able to have the confidence that these decisions are being made in a fair and equitable way, using data that they have consented to provide.
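The kind of group-level pattern Ross describes is straightforward to surface once records are pooled. A minimal sketch, using invented records and field names purely for illustration:

```python
# Hypothetical sketch: a disparity that no single individual could see
# from their own case, but that is obvious once records are pooled.
# The records and field names below are invented for illustration.

records = [
    {"disabled": True,  "got_offer": False},
    {"disabled": True,  "got_offer": False},
    {"disabled": True,  "got_offer": False},
    {"disabled": False, "got_offer": True},
    {"disabled": False, "got_offer": True},
    {"disabled": False, "got_offer": False},
]

def offer_rate(group):
    """Fraction of a group that received the offer."""
    return sum(r["got_offer"] for r in group) / len(group)

disabled = [r for r in records if r["disabled"]]
others = [r for r in records if not r["disabled"]]

# Individually, each refusal looks unremarkable; grouped, the gap is stark.
print(offer_rate(disabled))  # 0.0
print(round(offer_rate(others), 2))  # 0.67
```

A real analysis would of course need far more data and a proper statistical test before claiming bias, but the structure of the argument is the same: the evidence only exists at the level of the group.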
What can people do to educate themselves and protect their online data today?
Be aware. Don’t automatically sign up to everything. Think about what the implications could be if that data was misused. But it is difficult, because the world is so interconnected and technology is so ingrained in our lives.
The other thing is to work as a community, and provide support when issues get raised by bodies that are trying to fight your corner at a higher level. The work that is being done by The Privacy Collective, for example, is absolutely crucial. It’s very difficult for an individual to have much of an impact on these sorts of issues themselves, but collectively we can make a difference.
Your data should not be for sale. We’re taking Oracle and Salesforce to court for the misuse of millions of people’s data and we need your help! If you believe that tech giants should be held accountable for their use of people’s data please support our claim here. Because your privacy matters.