21 Jan 2024
Along with five leading digital experts, we aim to highlight some of the biggest challenges in privacy for the coming year. We cover everything from cookies and digital fingerprinting to AI, LLMs and much more.
Trust 3.0 exists to ensure human-centred thinking is at the core of business decision-making where data is a consideration, and to upskill the public to drive better standards of data literacy.
Founder of FOU Analytics. Augustine has been studying the prevalence of ad fraud in digital marketing for the last 11 years and is also a staunch privacy advocate.
Founder of ASG Solutions, which is focused on helping companies achieve sustainable growth through responsible marketing. Prior to that, Arielle loudly resigned from her role as Chief Privacy Officer at a global media agency.
A data privacy and technology attorney with 17 years of experience in consumer rights law, with the last 5 years spent in privacy and 3 in the ethical use of AI.
A junior data science consultant with a strong interest in software privacy and data.
A French corporate lawyer and founder of Data Rainbow, enthusiastic about everything AI.
We all know that cookies are going away, and to be more precise, it's third-party cookies that are going away. Websites like the New York Times can still set a first-party cookie when you visit the site, so that you can remain logged in, for example.
I'm seeing that even though third-party cookies are going away this year, the ad tech companies are finding workarounds.
They will be using even more invasive data collection practices, what we typically call fingerprinting. A fingerprint is the collection of all the different JavaScript-readable parameters that pertain to your browser.
Combined with an IP address, the browser version, the screen resolution, and the list of plugins, they can still track a person relatively uniquely.
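As a rough illustration of the technique (a minimal sketch with an illustrative choice of signals, not any vendor's actual script), a few lines of browser code can read those parameters and hash them into a stable identifier:

```ts
// Minimal sketch of browser fingerprinting: read a handful of
// JavaScript-visible signals and hash them into a stable identifier.
// The signal selection here is illustrative; real scripts read far more.
async function fingerprint(): Promise<string> {
  const signals = [
    navigator.userAgent,                                      // browser and version
    `${screen.width}x${screen.height}x${screen.colorDepth}`,  // screen resolution
    navigator.language,
    Intl.DateTimeFormat().resolvedOptions().timeZone,
    String(navigator.hardwareConcurrency),                    // CPU core count
    Array.from(navigator.plugins, (p) => p.name).join(","),   // plugin list
  ].join("|");
  // Hash the concatenated signals; combined with the IP address seen
  // server-side, the result is often close to unique per browser.
  const bytes = new TextEncoder().encode(signals);
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  return Array.from(new Uint8Array(digest), (b) =>
    b.toString(16).padStart(2, "0")
  ).join("");
}
```

Because none of this requires setting a cookie, blocking third-party cookies does nothing to stop it.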
A couple of things are a continuation of what we've seen in recent years but are already ramping up: sensitive data in the context of ad tech, as well as children's privacy.
We've seen some recent reports that, beyond what we traditionally think of as privacy harms, the algorithmic recommendation engines on some social platforms were promoting and fostering child sexual abuse.
There is also the sensitive personal information issue: scrutiny of the use of health data in advertising has grown since the Dobbs decision in the US.
The last thing I'd call out is a broader theme that is especially troubling in light of the progress that needs to be made in these areas: opacity creep from the platforms, just as laws like the Digital Services Act arrive and the FTC ramps up its enforcement.
And finally, platforms taking advantage of the AI momentum to push their black-box products.
There are three ways to resolve a dispute: negotiation, litigation and legislation. So far we've tried negotiating with big tech, and that hasn't worked. We are at a standstill on legislation because members of Congress need this data to keep their jobs.
We're getting nowhere on legislation at the federal level, so we're getting state legislation instead, financed mostly by the tech industry. They're getting versions of privacy law with weak consumer protections adopted. It's policy that blocks good legislation, but they get to say they're doing something about privacy.
They've enshrined privacy rights. Also, with the money that was spent lobbying to accomplish this, you could have handled the food insecurity issues of the entire city of Detroit for five years. That's just what was spent passing Virginia's privacy law!
What the regulators are doing is putting forward legislation that claws back some of the agency we have over our data.
The biggest challenge that I expect to pop up during this year is hidden LLMs.
Language models that we know, such as ChatGPT, are being embedded into products where we won't even be aware that they are there, and they will be used to collect far more personal and confidential data from people and businesses without our knowledge.
Algorithms are a big thing. I've been speaking today about education and the use of AI, and one school in the UK has appointed an AI chatbot as headmaster.
It is going to be interesting to follow the course of data brokers and the whole industry behind them. That's one area that hasn't been tackled enough under the GDPR, but the FTC has been more active.
And yes, litigation: I've been following Article 82 GDPR on civil compensation for non-material damage. I'm still waiting, and I can't understand why nothing has been happening in that area yet. We see data breaches, and we know that data breaches sometimes just happen, but when the data controller is responsible and liable, there should be some kind of compensation.
Even with third-party cookies, the data for targeting was so inaccurate that if we do away with the cookies, it's not going to get any better. And I'm speaking again mainly from the marketing perspective.
Cookies identify the user or their browser; all the targeting is generally inferred from website visitation patterns and things like that, so it is not that accurate to begin with. There haven't been many academic studies on it, but the ones we have seen showed that even with one parameter like gender, the accuracy of the targeting was less than 50%, so it was no better than random.
And if you start using two parameters, like gender plus age, accuracy goes down to 12%, which is roughly one in eight.
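Those figures are consistent with simple arithmetic: if each attribute is inferred independently, the errors compound multiplicatively. A back-of-the-envelope sketch (the per-attribute accuracies below are illustrative, not taken from the studies mentioned):

```ts
// Back-of-the-envelope: independently inferred attributes multiply,
// so per-attribute errors compound. The inputs here are illustrative.
const genderAccuracy = 0.45; // slightly worse than a coin flip
const ageAccuracy = 0.27;    // picking the right band among several
const jointAccuracy = genderAccuracy * ageAccuracy;
console.log(jointAccuracy.toFixed(2)); // "0.12", roughly 1 in 8
```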
So now ad-tech platforms and agencies are just going to continue making more BS up, and people won't know the difference. The large agencies are still going to target; they're going to buy from The Trade Desk, which has this Unified ID 2.0 and all that kind of stuff. It's the same BS.
Some of these moves are great for Google, because they're going to concentrate more of the spend in logged-in environments!
We're talking about these alternative identifiers, and I think the way they're being positioned is supposed to convince regulators and marketers that somehow they're better for privacy. And yet I haven't seen one of these identity frameworks that actually comes with standards for notice and choice any better than the wild-west iteration that existed during the cookie era.
One of the things that I'm watching closely from that perspective, in terms of human impact, is that I don't really see anyone holding back on using certain sensitive transaction or purchase data to create audience segments. So again, it's a new manifestation of the exact same problem.
I recently experimented with LinkedIn's data.
They had 538 data points they had inferred about me based on my behavior on their site, and 88% of them were wrong!
So I'm not so concerned about the cookies as I am about the inferencing. Inferencing means making shit up; it means snake oil, and you should not invest money in it. They said that my skills include hip hop and hand-to-hand combat. They also thought that I was a doctor.
LinkedIn considers me to be an air conditioning expert. This is the quality of data that LinkedIn has on me: I post there three times a day, have thousands of posts on the platform, and they still see me as an air conditioning expert.
I'm less aware of how it works behind the scenes, but what I saw is that before, we would go to maps.google.com, and now we go to google.com/maps. That means Maps shares the same cookies as the search engine.
I expect that the large corporations like Google and Facebook and the other big tech companies will try to bring all their websites onto their principal domain so they can share first-party cookies between their different services.
You don't use Maps for the same reason you use YouTube, the search engine, or the AI tools that are coming. So I don't think this will be good for privacy; when you are inside the Google bubble, privacy will be worse because of it.
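To see why consolidating everything onto one domain matters for cookies, here is a hedged sketch of the standard scoping rules (per RFC 6265; the values are made up, and this is not a description of Google's actual setup):

```ts
// Hedged sketch of cookie scoping, run from a page on https://google.com.

// Host-only cookie: sent back only to google.com itself.
document.cookie = "sid=abc123; Path=/; Secure";

// Domain cookie: also sent to subdomains like maps.google.com.
document.cookie = "pref=dark; Domain=google.com; Secure";

// Once Maps lives at https://google.com/maps rather than a separate
// host, it shares the exact same origin as Search, so even host-only
// cookies flow to it automatically. Everything stays "first party",
// untouched by third-party cookie deprecation.
```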
The first assumption is that any electronic device belongs to one individual person. That has become the case more and more, but there are still people who share their devices. Sharing a device means sharing a profile, but the profiling system doesn't care: to it, it's one person.
As Francois mentioned, with something like Google Maps you can learn a lot about an individual: their political activities, their religious beliefs. Everything they do is revealed by geolocation data. So that's a huge problem with tracking and profiling, and there's underlying inaccuracy too: assuming it's an individual device leads to a lot of wrong decisions.
My question is, do we really need profiling for trade and business? We've seen business and trade for centuries without profiling. Why do we need it? Do we need a YouTube or a Google to be so rich?
AI is very concerning because, as we've all acknowledged in this forum, there's a lot of inaccurate data out there, and AI is democratizing access to tools for people to make up even more BS.
I ran my own experiments. We used ChatGPT and built our own LLMs. Even when I fed it 500 of my own articles, the stuff that came out was barely readable.
So the LLMs can form English sentences, but the content and the meaning are so inaccurate and so bad.
You just need some flowery BS to sell coffee or chocolate or something. So there are certain cases where AI will be helpful, because it saves time; it'll make up the BS faster than a human can.
But there are other dangers. We've already seen how AI can create deepfakes: Obama saying something he never said, Trump saying something he never said!
We've seen it since the 2016 election, and this year is another election cycle in the US, so we're going to see the proliferation of incorrect information deliberately falsified using AI.
I'll speak again from the advertising industry dynamics that I'm seeing. So first of all, I think that the acceleration of AI is positioning the platforms to sustain and really expand their monopolies.
But, the other problem is that they're using it to condition marketers to relinquish even more control and to accept even less transparency over where their ads run.
The knock-on effect is that it puts publishers further into a chokehold. Quality journalism declines further, and meanwhile we will have the rise of information that's not credible and not accurate, in an election year.
I worry about civic integrity; I worry about truth and how it is sustained. In particular, in the near term, I really, really wonder about the future of journalism.
My biggest issues are going to be fraud and discrimination. If you've been accused of aiding genocide, I really desperately think you're going to have problems with discrimination in your tools too.
If you've been previously found to be committing fraud on the public before you implemented generative AI into all of your tools, I think you have to pause and clear the air to let us know that you are trustworthy before you start implementing the ultimate bullsh*t maker into everything that you do.
Using complexity as cover and creating this moat is just going to enrich the first movers in the community who have already invested in building AI.
I think that we need a public testing space, like a sandbox, for these types of things. Technology companies used to ask for permission before they deployed something, and they were horribly embarrassed if it went wrong. That's gone out the window.
One thing that worries me a lot about AI, and I would put it at level six, is its use in healthcare, because we have a lack of specialists, especially in mental health, and people who are in a bad situation mentally will tend to use these kinds of tools to reassure themselves that they are not doing that badly.
These tools are not doctors, they are not psychologists, and they cannot diagnose anything; they're just spitting out words. The danger there is that people will tend to believe what comes out of these tools, which were trained on websites where people could post almost anything, like WebMD.
I don't dislike AI, I love technology. What I don't like is technology abusing my privacy and technology being inaccurate.
I think within AI we have to create better definitions. I don't think ChatGPT is AI, and LLMs aren't the same kind of AI as the ones that medical technology or other technologies have been using.
ChatGPT is a language model. I think it's been an abuse of language to call it AI. It's in no way an AI; it's useful for rewriting what you can write as a non-native speaker.
A lot of so-called AIs promise more than they can deliver, and I am waiting for the bubble to burst.
I've noted two ideas. The first one is about cookies, and we talked a lot about cookies.
I want to define the homemade cookie, which is the opposite of this: building different sub-domains for different parts of your business, so your buyers have their own cookies and your regular visitors have theirs.
The other thing is something that was announced by Intel: they are bringing AI processors to motherboards for personal computers. This will allow us to run those fancy tools we call LLMs locally on our machines, instead of relying on compute in the cloud.
I'm really looking forward to being able to use more of these toys locally, because I think LLMs on our machines are toys: for fun, for gaming, for family purposes, and not for medical advice or things like that!
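As a pointer to how that might look (a hedged sketch assuming a local runner such as Ollama and its JavaScript client; the model name and prompt are illustrative), a prompt to a locally hosted model never has to leave the machine:

```ts
// Hedged sketch: querying a locally running LLM via the Ollama JS
// client, so the prompt never leaves the machine. Assumes Ollama is
// installed and the model has been pulled; names are illustrative.
import ollama from "ollama";

async function main() {
  const response = await ollama.chat({
    model: "llama3",
    messages: [{ role: "user", content: "Write a bedtime story about a robot." }],
  });
  console.log(response.message.content);
}

main();
```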
For any more information on Trust 3.0 please get in touch via contact@trust30.org
To get in touch with the speakers please reach out to them on LinkedIn via the links below: