Data privacy is a moving target. Organizations are capturing more data — on customers, partners, users, vendors, employees. At the same time, local, national, and international laws governing privacy and data protection are almost constantly changing.
Against this backdrop, Mark Cockerill, vice president of legal for EMEA and head of global privacy at ServiceNow, says there is no global consensus on what data privacy is or how to approach it. “People are shaped by their experiences,” he says, “so different countries and regions have different ideas about how important data privacy is, or what it means.”
However, security teams and analysts agree that companies should invest in a data privacy infrastructure that incorporates “privacy by design.” Instead of collecting data and trying to secure it after the fact, privacy by design builds protection into the data collection process itself.
For that purpose, organizations use tools such as artificial intelligence (AI) and machine learning (ML) that parse and secure vast amounts of data as it is collected. But how does that work? Why is it important? And what ethical questions arise when machines interact with human data?
Manual tools complicate compliance
The problem is that organizations now process too much data for humans to parse and secure on their own. If personal information or personally identifiable information (PII), such as credit card numbers or GPS coordinates, falls into the wrong hands, bad actors could steal financial data or commit identity theft. But organizations can’t secure their data if they don’t know what they have or where it’s stored. And with so much data pouring in all the time, businesses are losing visibility into their data infrastructure.
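To make the scale of the problem concrete, the snippet below is a minimal, hypothetical sketch of what scanning unstructured text for two of the PII types mentioned above might look like. The regular expressions, the Luhn check, and the function names are illustrative assumptions, not any vendor’s actual detection logic.

```python
# Hypothetical sketch: flag likely PII (credit card numbers, GPS coordinates)
# in unstructured text. Real discovery tools are far more sophisticated;
# the patterns and names here are illustrative only.
import re

CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")
GPS_RE = re.compile(r"-?\d{1,2}\.\d{4,},\s*-?\d{1,3}\.\d{4,}")

def luhn_valid(number: str) -> bool:
    """Luhn checksum: filters out random digit strings that are not card numbers."""
    digits = [int(d) for d in re.sub(r"\D", "", number)][::-1]
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def scan_for_pii(text: str) -> list[tuple[str, str]]:
    """Return (pii_type, match) pairs found in a block of free text."""
    findings = [("credit_card", m.group()) for m in CARD_RE.finditer(text)
                if luhn_valid(m.group())]
    findings += [("gps_coordinate", m.group()) for m in GPS_RE.finditer(text)]
    return findings

print(scan_for_pii("Order shipped to 40.7128, -74.0060; card 4111 1111 1111 1111."))
```

Even a toy scanner like this makes the visibility problem clear: it has to be pointed at every store of text an organization holds, which is exactly what companies with sprawling data estates struggle to do.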
ServiceNow and BigID, which partners with ServiceNow to manage sensitive and private data, surveyed IT and engineering leaders to understand how they manage privacy in their own organizations. The survey revealed that businesses struggle to comply with regulatory and compliance guidelines.
53%
Percentage of companies tracking data using Excel sheets
Compliance with the General Data Protection Regulation, or GDPR, is especially difficult. That’s because it requires complex documentation and collaboration throughout the organization to determine what data the company has, who owns it, and how it is processed. Despite the enormous amount of data organizations manage, many still use manual tools and processes to monitor it.
According to the survey data, most companies rely on Excel spreadsheets (53%) and data mapping or visualization tools such as Visio (41%). Reflecting this reliance on manual tools, many respondents scan and classify data only in structured sources (40%), have not yet been able to analyze data in both structured and unstructured locations (12%), or have no initiative to scan for sensitive data at all (4%). This makes it harder for companies to proactively manage and monitor regulated data, which is essential for compliance with the GDPR and other regulations.
Many also do not appear to be actively building privacy by design into their processes and products. One-third of organizations say they are simply reacting to the changing privacy landscape without taking steps to optimize their privacy programs, while 10% say they are not in a position to respond at all.
Future of data privacy
That’s where companies like BigID can help. BigID uses AI and ML to disambiguate different types of data. Dimitri Sirota, CEO of BigID, said the goal is to build a map of each data point belonging to a particular identity. That allows companies to demystify their data: to find out what they have, where it is stored, and whether it contains personal information.
“There’s a huge volume,” Sirota said. “Structured and unstructured data, cloud and on-premise data … How do [businesses] take a picture of what data and whose data they have? The only way that is possible is to use machine learning of various kinds. It’s about improving transparency and trust.”
[Read the Workflow Guide: Cyberthreat control & management]
BigID uses AI in two ways. First, AI maps an organization’s data infrastructure, including the parts not visible to the organization itself. Algorithms learn where the data is located, what type of data it is, and whether it belongs to a particular individual. Second, AI automates the collection and processing of data requests: individual customers or users can file a request to see whether the company has collected their data, what type of data it holds, and how it plans to use it. AI puts workflows in place that make it easy for customers, businesses, and auditors to track that data.
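As a rough illustration of the two roles described above, here is a minimal, hypothetical sketch of how discovered data points might be grouped under the identity they belong to and then used to answer a subject access request. The record fields, matching key, and function names are assumptions for illustration, not BigID’s actual implementation.

```python
# Hypothetical sketch: correlate discovered data records into a per-identity
# map so a subject access request ("what data do you hold on me?") can be
# answered from one place. Fields and matching rule are illustrative only.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class DiscoveredRecord:
    source: str          # e.g. "crm.contacts" or "s3://exports/2023.csv"
    attribute: str       # e.g. "email", "credit_card", "gps_coordinate"
    value: str
    identity_key: str    # normalized identifier the classifier linked this record to

def build_identity_map(records: list[DiscoveredRecord]) -> dict[str, list[DiscoveredRecord]]:
    """Group every discovered data point under the identity it belongs to."""
    identity_map: defaultdict[str, list[DiscoveredRecord]] = defaultdict(list)
    for rec in records:
        identity_map[rec.identity_key].append(rec)
    return dict(identity_map)

def answer_access_request(identity_map: dict, identity_key: str) -> list[str]:
    """Summarize what the organization holds on one person and where it lives."""
    return [f"{r.attribute} in {r.source}" for r in identity_map.get(identity_key, [])]

records = [
    DiscoveredRecord("crm.contacts", "email", "a.lee@example.com", "a.lee@example.com"),
    DiscoveredRecord("s3://exports/2023.csv", "credit_card", "4111 **** **** 1111", "a.lee@example.com"),
]
print(answer_access_request(build_identity_map(records), "a.lee@example.com"))
```

The point of the grouping step is the “map of each data point belonging to a particular identity” that Sirota describes: once records from structured and unstructured sources are keyed to a person, access requests and audits become lookups rather than company-wide hunts.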
Sirota said AI gives organizations (and individuals) a big-picture view of their data. “Historically, data has been viewed in a siloed way: legal has one view of the data, security has a different view, and management has yet another. We think it’s important to look at the data from a unified perspective.”
Ethical minefields
At first glance, using AI and ML to classify personal information does not seem as ethically fraught as, say, using machine learning to determine who qualifies for a bank loan or how long a prison sentence should be. But Cockerill said there are still questions to consider.
“As you begin to use AI and ML to identify sets of personal data, ethical challenges begin to emerge based on where you conduct that analysis,” he explains. “Are you performing that analysis in the same location where that data is stored, or are you moving it to a centralized database? Are you performing that analysis just to classify the data, or are you using the dataset for other reasons as well?”
Cockerill said that when people give their data to a corporation, they may not fully understand all the ways the data will be used. The problem, he says, is that when an organization collects data for one reason, it often uses that data for another reason as well, and the customer or employee may not know that. “Someone may want to know if their information is being transferred or used in a way that is not in the original agreement,” he said. “It’s something to consider.”
As you begin to use AI and ML to define sets of personal data, ethical challenges will begin to emerge.
Furthermore, whenever ML or AI interacts with human data, there is the potential for bias to affect the results. “You can’t completely eliminate bias, because you always have a developer making decisions about your algorithms,” Cockerill says. The developer decides what data is used to train the algorithms, for example. Cockerill said the potential for bias and false results is amplified when an algorithm built to identify personal data in English is asked to parse data in another language.
[Read also: What’s wrong with bias training, and how to fix it]
Checks and balances
A key principle of privacy by design is anticipating data classification challenges, such as ethical questions and the potential for inaccurate results, and working to mitigate them early in the data processing lifecycle.
Cockerill said that in conversations about data privacy, everyone talks about the lifecycle of data: collection, use, storage, and deletion. But he invites organizations to reframe the conversation. “I want to take this back to some important questions: What are you doing with the data? Who is accessing it? And where?”
Cockerill emphasizes that “checks and balances” are critical. He wants to see more companies asking internal ethics committees to review how the organization uses AI and ML. He said algorithmic impact assessments can help determine whether organizations are making the right value-based decisions. These best practices should be incorporated into the data collection process.
“Privacy by design is really the right way to look at it,” Cockerill said.