How to ensure data privacy in AI-driven apps
It is hardly news to anyone that companies collect loads of users’ data for marketing, advertising, and other purposes. Yet, this fact does not seem that frightening anymore. We have all made peace with it, sacrificing parts of our personal space for curated, personalized content.
As AI rapidly enters our lives, data privacy is becoming more challenging, and the question of data protection is back on the table. With AI, data collection can go uncontrolled, resulting in massive information leaks and manipulation.
While regulatory bodies are struggling to fit the AI vs. data issue into a legal framework, the burden of responsibility has fallen onto the shoulders of developers. And regulatory bodies do not mind!
For one, the UK’s data protection watchdog has clarified its expectations in its latest warning, saying AI app developers should cover all privacy risks before bringing AI-powered products to the market.
Well, developers have a lot of work to do now. Below, I want to share the main challenges developers face in AI integration and how they overcome them.
Is it necessary to introduce new protection measures for AI?
Why is it vital to talk about data protection in the context of AI specifically? Aren’t the existing instruments, like data encryption or minimization, enough to cover the data-related risks?
Well, firstly, data privacy is still crucial to users. They still put their privacy above the enthralling user experience that AI can provide and are not likely to compromise. According to a Genpact survey of more than 5,000 people, 63% of respondents valued privacy more than a positive customer experience and wanted businesses to avoid using AI if it invaded their privacy.
Moreover, based on a 2020 survey by the European Consumer Organization, 45-60% of Europeans believe that AI will increase the misuse of personal information.
But the harsh truth is you cannot totally prevent data leaks and misuse, whether in a regular app or an AI-powered one. Instead, you can prevent data from getting to the points where it can be misused.
In an ideal world, a whole set of data protection measures would be taken at different levels: regulatory, technical and educational. But first and foremost, the responsibility lies with developers to implement preventive mechanisms when integrating AI.
Two scenarios of AI app integration and their privacy challenges
AI integration into an app usually follows one of two scenarios: you can either integrate an existing third-party AI model through an API, or fine-tune and train an open-source model yourself. Both cases pose their own privacy risks, though the responsible actor is different.
With API integration, the risks are on the model provider. Logically enough, the designer of the model is responsible for the way the model processes data. And given the legal loopholes around AI and privacy, there can be a lot of misunderstanding about how models should collect and process data.
Take ChatGPT, for example: it has been banned in Italy and is under investigation in Germany, France and Spain, while Canada is examining potential infringements of its own data protection laws.
So what can developers do in terms of data protection when integrating an open AI model like LLaMA, for example?
When you work with API models, you can still anonymize the data that you send to the model, removing PII or other sensitive information from the inputs you process with it (this concerns your inputs, not the provider's training data). This is one of the most popular approaches to working with sensitive data and cloud APIs. It works only if you can anonymize the data reliably, which is not always easy: sometimes PII is buried inside unstructured text.
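A minimal sketch of this idea is shown below, assuming that simple regular expressions are good enough for the PII you expect. The patterns, the `redact` and `build_prompt` helpers, and the prompt text are all hypothetical and for illustration only; reliable detection in unstructured text usually needs a dedicated tool.

```python
import re

# Hypothetical patterns for illustration only. Real PII detection needs far
# more coverage (names, addresses, IDs) than two regular expressions.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> str:
    """Replace every matched span with a placeholder such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def build_prompt(user_message: str) -> str:
    """Redact first, so the raw PII never leaves the app."""
    return "Summarize this support ticket:\n" + redact(user_message)


prompt = build_prompt("Hi, I'm at jane.doe@example.com or +1 (555) 123-4567.")
print(prompt)
```

Only the redacted `prompt` would then be passed to whichever API client the app uses. The redaction step sits in your own code, so it works the same way regardless of the provider.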
Now, let’s move to the other scenario: building and training your own AI model. Again, when we talk about “data protection in AI-driven apps”, the point is less about protecting data completely and more about keeping it out of places where it shouldn’t be. For example, if you do not want the model to operate on personal data, you should anonymize the data before sending it to the model for training.
Building your own AI model for an app gives you a lot of room for action, while also posing more risks at every level of development. In other words, the more you can influence the AI model, the more problems you might eventually have, and the more opportunities you have to fix them at each step of development.
Some ways you can ensure data protection in AI apps
Unfortunately, there is not much to say here. The most reasonable and realistic step to take is alignment: training the system so that it pursues user-oriented goals.
However, you can be creative and invent your own way of ensuring privacy. For example, you can connect one neural network to another so that the first supervises the second, checks whether personal data is present, and anonymizes it.
- Homomorphic Encryption
A lot of the data protection mechanisms that we know, like encryption or anonymization, primarily relate to the way data is stored. Yet when it comes to processing, there are certain privacy-enhancing technologies that keep data safe. One of them is homomorphic encryption.
Homomorphic encryption is a privacy-enhancing technique that makes it possible to encrypt data before processing, so that another party can use the data without accessing the raw information. It’s like putting the data into a safe and sending it to a receiver who can operate on the information without actually reading it. That’s the magic!
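To make the "safe" metaphor concrete, here is a toy sketch of the Paillier scheme, which is only additively homomorphic (ciphertexts can be added, nothing more). The article does not name a particular scheme, so this choice is an assumption, and the tiny primes are insecure and used purely for illustration.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Toy Paillier key generation with generator g = n + 1 (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1
    return n, (lam, mu)   # public key n, private key (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt an integer 0 <= m < n using only the public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    """Recover the plaintext using the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the hidden plaintexts."""
    return (c1 * c2) % (n * n)


# The data owner encrypts two values; an untrusted party adds them blindly.
n, priv = keygen(1009, 1013)
c_sum = add_encrypted(n, encrypt(n, 52000), encrypt(n, 48000))
print(decrypt(n, priv, c_sum))  # 100000
```

The party calling `add_encrypted` never sees 52000 or 48000, only ciphertexts. Running a full AI model over encrypted inputs requires much heavier schemes than this additive toy.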
Yet, so far, this is more of a beautiful theory than a practical reality.
The truth is there is not much you can do to encrypt your data during processing. But anonymization is one of the few things you can do.
Anonymizing data means deleting all the information that can be used to identify a person, such as names, addresses and contact details. This is usually done automatically by processing databases of structured information so that no identity-related data remains.
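For structured records, that automatic pass can be sketched roughly as follows. The column names, the `anonymize_record` helper and the secret key are hypothetical. Note that replacing an ID with a keyed hash is strictly pseudonymization rather than full anonymization; it is included here as an assumption, in case rows need to stay linkable for training.

```python
import hashlib
import hmac

# Hypothetical column names; adjust them to your own schema.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "address"}


def anonymize_record(record: dict, secret_key: bytes, id_field: str = "user_id") -> dict:
    """Drop identifying fields and replace the ID with a keyed hash."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if id_field in cleaned:
        digest = hmac.new(secret_key, str(cleaned[id_field]).encode(), hashlib.sha256)
        cleaned[id_field] = digest.hexdigest()[:16]  # stable, non-reversible token
    return cleaned


row = {
    "user_id": 42,
    "name": "Jane Doe",
    "email": "jane.doe@example.com",
    "address": "1 Main St",
    "purchases": 7,
}
print(anonymize_record(row, secret_key=b"keep-this-key-out-of-the-dataset"))
```

Only fields such as `purchases` and the hashed `user_id` would reach the training set. Free-text fields still need separate treatment, since identifiers can hide inside them.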
- Combination of AI models
If you’re absolutely determined to provide data privacy at any cost, you can build a kind of Frankenstein scheme by connecting one AI network to another: one does the processing, while the other supervises it and takes protection measures, like anonymizing.
Sounds crazy, but it’s still more practical than homomorphic encryption.
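The two-model scheme can be sketched as a thin wrapper, assuming the supervising model returns character spans of the personal data it finds. Both models below are toy stand-ins (a regex and an echo function), and the `guarded_call` helper is hypothetical; in a real app each would be replaced by an actual model.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start, end, label); assumed not to overlap


def guarded_call(text: str,
                 guard: Callable[[str], List[Span]],
                 worker: Callable[[str], str]) -> str:
    """Let the guard model mask personal data before the worker model sees it."""
    # Replace from right to left so earlier offsets stay valid.
    for start, end, label in sorted(guard(text), reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return worker(text)


def toy_guard(text: str) -> List[Span]:
    """Stand-in for a supervising model: flags the word after 'my name is'."""
    return [(m.start(1), m.end(1), "NAME")
            for m in re.finditer(r"my name is (\w+)", text, re.IGNORECASE)]


def toy_worker(text: str) -> str:
    """Stand-in for the processing model: just echoes what it received."""
    return f"MODEL SAW: {text}"


print(guarded_call("Hello, my name is Alice and my order is late.",
                   toy_guard, toy_worker))
# MODEL SAW: Hello, my name is [NAME] and my order is late.
```

The same guard could also be run over the worker's output before it is shown to the user, at the cost of a second model call per request.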
As AI is rapidly becoming a part of our lives through apps, the issue of data protection returns with new force. Since the legal regulation of AI in terms of data protection remains ambiguous, app developers bear the full responsibility for data protection in mobile apps.
Though there are few ways to encrypt personal information during processing, there are still some ways to make it safer.
With expertise in validating app ideas, user research, and building minimum viable products (MVPs), Dima helps tech startups and Fortune 500 companies create outstanding mobile and web applications. Dima has helped numerous startup founders (Aspiration, GOAT, Dollar Shave Club) ensure excellent IT development while focusing on their core business objectives. During the last couple of years, fintech startups have been the main focus of Dima’s work.