As part of an international campaign to lift the lid on data privacy violations, The Privacy Collective is asking some of the UK’s leading experts why online privacy matters.
Dr Hamed Haddadi is a Reader in Human-Centred Systems at Imperial College London, specialising in privacy and human-data interaction. He is also a visiting professor at Brave Software and an academic fellow of the Data Science Institute. Here he discusses why we should be more concerned about Amazon than Facebook, the benefits of data if it’s used properly, and why introducing some transparency into these systems may start to rebuild the public’s trust.
Why does online privacy matter?
Think about a company like Amazon, for example, which doesn’t get as much attention as Google or Facebook when it comes to online privacy, yet has a strong presence across our lives – you have the website, you have the entertainment, you have all the home assistants and smart devices that they offer. We’re talking about a wealth of information being available: when we are home, who we are interacting with, how many people live in our house, what times we watch which programmes, what we buy from the web, what kind of items we browse online, what kind of discussions, emotions and moods we go through at different times of the day, different days of the week and different weeks of the year. If you ask a home assistant like Alexa what the news is today – how does it decide what news to play back to you?
The consequences of this sort of data collection are not usually visible in the short term, but all of this can add up to be used for a variety of purposes. Often the intentions are good, but the fact that you’ve now collected a large array of rich data from citizens means there can be undesired consequences. We’ve seen examples of that with Cambridge Analytica and the Brexit campaign, where the availability of this data can be misused for other purposes. At the moment, I think there’s a huge amount of effort going into collecting as much data as possible to train models. How those models will be used later on … the possibilities are endless.
Can you tell me a bit about your research? What does human-data interaction mean?
At Imperial, our work is mostly centred around the systems and algorithms that we deal with as human beings in our daily life. So this could be Internet of Things (IoT) devices, smart home assistants, home cameras, browsers – things like that. We’re mostly looking at the data that these devices collect, the volume of information, and the destinations where that data goes. And we are trying to develop privacy-preserving techniques so that we can still use data for performing analytics and providing services, without necessarily jeopardising the individual’s privacy.
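One classical example of the kind of privacy-preserving analytics technique described here is randomized response: each individual adds noise to their own answer before it is collected, so the aggregator can recover population statistics without ever learning any individual’s true answer. A minimal sketch (the 75% truth probability and function names are illustrative, not taken from the group’s actual work):

```python
import random

def randomized_response(true_answer: bool, p_truth: float = 0.75) -> bool:
    """Report the true answer with probability p_truth, otherwise a fair
    coin flip. Each respondent gains plausible deniability, while aggregate
    statistics remain recoverable."""
    if random.random() < p_truth:
        return true_answer
    return random.random() < 0.5

def estimate_true_rate(reports: list[bool], p_truth: float = 0.75) -> float:
    """Invert the noise: observed = p_truth * true + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth
```

With enough respondents the estimate converges on the true rate, even though no single report can be trusted – which is exactly the trade-off between useful analytics and individual privacy that the interview describes.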
Human-data interaction (HDI) is about the interactions, inferences and understanding that we have around the data that is collected about us, or is generated from our activities. It’s about bringing more transparency into these practices, letting you engage with this ecosystem, and having some sort of negotiability and agency over the data – so you can say: “OK, I’m happy to reveal this data for the purposes of x, y, and z, but in return, what do I get?”. When we started HDI quite a few years ago, there wasn’t really something like the General Data Protection Regulation (GDPR), where we could try to enforce or regulate this space. The idea was to bring some sort of self-regulation, and provide the infrastructure, the services and the functionalities to enable this interaction.
At Brave, you’ve been developing an alternative browser model, which sounds really interesting. Can you tell me a bit more about that?
Browsing today is mostly about collecting as much data as possible in order to run real time bidding for adverts and showing ads to users. And the majority, nearly all of these ad impressions and clicks, are actually wasted. They’re either by bots, or they’re basically spam. And users really don’t engage much. It poses the question – do we really need such large-scale data collection practices and such a large invasion of users’ privacy in order to show them ads that they click on maybe once a week or once a month? It’s hugely inefficient.
At Brave, we’re looking at using much more basic but personalised models that run purely in the browser, which can show some notifications if the user opts into that system. The idea is not only to improve the user’s privacy, but also to hugely improve the performance of their browsing experience, because websites load faster. The click-through rates that we’ve seen from the ads are higher than current advertising-industry rates, and we’re getting good-quality engagement and conversions as well. The publishers get a share of the revenue, and users get tokens they can spend on tipping various publishers, Twitter users, YouTubers, etc.
How aware do you think the public is about what’s happening to their data behind the screens?
I think awareness has massively improved, thanks to the GDPR and the publicity around huge misinformation campaigns that have happened around the world. People see the impact that can have on society as a whole. The one thing that we haven’t seen much of is enforcement by the regulators. We’ve seen a number of examples of the big regulators like the Information Commissioner’s Office (ICO), like CNIL in France, failing to enforce, or we’ve seen very small fines given against the likes of Facebook. Considering the extent of data abuse that we’ve seen, I think regulators really need to step in in order to make GDPR a reality. There’s always this tension, especially in Europe, that if you go too far with enforcement, it might prevent businesses and industry from flourishing, or companies might move their operations elsewhere.
You recently contributed to the government’s Future of Citizen Data Systems report, which is now providing a springboard for their National Data Strategy consultation. What were your recommendations, and what do you hope the UK’s data strategy will look like in the future?
There are clear benefits of data if it is used properly. One example during the pandemic is citizens’ mobility trends. Where do people go? Who do they spend time with? What kinds of behaviours do they have? If you know that, you can recommend measures to minimise the risk of spreading the virus. But privacy-preserving techniques need to be used when dealing with this data.
My recommendations were that if it’s critical to use this data to control the pandemic, the government and the data processor should clarify exactly why this data is being collected, for what purpose, and how long it will be held. And secondly, to what extent can we still achieve the desired purpose without necessarily having this data lying around for anyone to see? If we are talking about contact tracing on the phone, or in a more manual way in restaurants, etc – how do you collect that data? Where does that data go? How is it analysed, and what happens to it later on?
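One common way to release such aggregate statistics – say, visit counts for a venue – without exposing individual records is to add calibrated Laplace noise, the standard mechanism from differential privacy. This is offered here as a general sketch of the kind of technique meant, not as the report’s specific recommendation; the epsilon value is illustrative:

```python
import random

def laplace_noise(scale: float) -> float:
    """Draw Laplace(0, scale) noise: the difference of two exponential
    variables with mean `scale` is Laplace-distributed."""
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def private_count(true_count: int, epsilon: float = 0.5) -> float:
    """Release a count with noise calibrated to sensitivity 1: adding or
    removing any single person changes the count by at most 1, so noise of
    scale 1/epsilon hides each individual's presence."""
    return true_count + laplace_noise(1.0 / epsilon)
```

A health authority could publish `private_count` outputs per venue and day: trends remain visible in aggregate, while no single released number reveals whether a particular person was present.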
I’m hoping there will be more transparency and clarity in this space. I think one of the biggest issues that we’re going to have in the next couple of months and years is the lack of trust, specifically in governments like the UK and US. Even if a vaccine is released and there are campaigns around health and vaccination, trust has been damaged. I think introducing a little bit of transparency can help both the governments and citizens in this situation.
Your data should not be for sale. We’re taking Oracle and Salesforce to court for illegally selling millions of people’s data, and we need your help! If you believe that tech giants should be held accountable for their use of people’s data, please support our claim by “liking” our support button at the top of this page.
We’re fighting for change, because your privacy matters.