The growth of data has been exponential. By 2023, it's anticipated that approximately 463 exabytes (EB) will be created every day. To put this into perspective, one exabyte is a unit equivalent to 1 billion gigabytes. By 2021, 320 billion emails will be sent daily, many of which contain personal information.

Data collected around the globe contains the type of information that businesses leverage to make more informed decisions. This is why any data that includes Personally Identifiable Information (PII), such as financial or medical information, needs to be protected.

Due to various concerns surrounding this inventory of information, governments have stepped in to protect their citizens on a local, national, and international level. Some of these data protection laws include but are not limited to the California Consumer Privacy Act (CCPA), the Health Insurance Portability and Accountability Act (HIPAA), the General Data Protection Regulation (GDPR), and the China Cybersecurity Law.

In this article, we will compare two of these: CCPA and GDPR. Although there are similarities between the regulations, there are also key differences to be aware of. GDPR came into effect first, on May 25, 2018, followed by CCPA, which came into effect on January 1, 2020.

Table of Contents:

  1. What is the General Data Protection Regulation (GDPR)?
  2. What is the California Consumer Privacy Act (CCPA)?
  3. Comparing GDPR and CCPA Data Privacy Laws
  4. Optimize and Enhance Data Protection Using ETL
  5. How Does ETL Impact GDPR and CCPA?
  6. The Importance of Hashing Meaningful Information

What Is the General Data Protection Regulation (GDPR)?

The European Union's General Data Protection Regulation (GDPR) safeguards its citizens' data privacy and ensures protection in the event of data breaches.

This regulation applies to any business that supplies goods and services to EU citizens and residents, as well as those that process the personal data of EU citizens (whether they live within the EU or not).

The GDPR states that:

  • When data is processed, consent agreements must be presented and explained in layman's terms.
  • If a data breach occurs, upon detection, those affected must be notified within 72 hours.
  • If personal data is being sold, those consumers have the legal right to know to whom and if it is being sold for a business purpose, commercial purpose, or something else.
  • Citizens of the EU have the right to complete erasure from a company's database.

Violations, whether accidental or intentional, can be incredibly costly, with fines of as much as €20 million or 4% of worldwide annual turnover, whichever is greater. In addition, individuals impacted by a breach also have rights to compensation.
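The "whichever is greater" rule can be written as a one-line calculation. The following is a minimal sketch, assuming the two figures quoted above and a company's annual revenue expressed in euros; the function and constant names are illustrative only, and an actual fine depends on the regulator's assessment rather than on this ceiling alone.

```python
from decimal import Decimal

# Figures quoted in the paragraph above; names are illustrative assumptions.
FIXED_CAP_EUR = Decimal("20000000")  # the EUR 20 million figure
REVENUE_SHARE = Decimal("0.04")      # the 4% figure


def max_gdpr_fine(annual_revenue_eur: Decimal) -> Decimal:
    """Return the ceiling on a fine: the greater of the fixed cap and 4% of revenue."""
    return max(FIXED_CAP_EUR, annual_revenue_eur * REVENUE_SHARE)


if __name__ == "__main__":
    # A smaller company hits the fixed cap; a larger one hits the percentage.
    for revenue in (Decimal("100000000"), Decimal("1000000000")):
        print(revenue, "->", max_gdpr_fine(revenue))
```

Using `Decimal` rather than floating point keeps currency amounts exact, which matters once figures reach the hundreds of millions.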

Once this privacy regulation was implemented, changes were seen across the world of business. For example, 60% of those surveyed said GDPR significantly changed their organizations' workflow in terms of how they collect, use, and protect personal information. In addition, 48% of respondents expressed plans to exercise their new rights as a direct result of GDPR.


What Is the California Consumer Privacy Act (CCPA)?

CCPA applies to companies across the globe who collect any personal data pertaining to California residents.

This applies to companies that make a minimum of $25 million in annual revenue, companies that hold data on at least 50,000 residents, and companies that make more than half of their revenue from selling personal data.
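As a rough illustration, the three thresholds above can be expressed as a simple check. This is a sketch only: it assumes that meeting any one of the criteria is enough, the class and field names are hypothetical, and it is not a substitute for legal advice on whether the law applies to a given company.

```python
from dataclasses import dataclass


@dataclass
class CompanyProfile:
    """Hypothetical summary of the figures the thresholds refer to."""
    annual_revenue_usd: float
    california_residents_with_data: int
    share_of_revenue_from_selling_data: float  # fraction between 0.0 and 1.0


def ccpa_likely_applies(company: CompanyProfile) -> bool:
    """Return True if the company meets at least one threshold from the text."""
    return (
        company.annual_revenue_usd >= 25_000_000
        or company.california_residents_with_data >= 50_000
        or company.share_of_revenue_from_selling_data > 0.5
    )


if __name__ == "__main__":
    small_shop = CompanyProfile(2_000_000, 1_200, 0.0)
    data_broker = CompanyProfile(3_000_000, 10_000, 0.8)
    print(ccpa_likely_applies(small_shop), ccpa_likely_applies(data_broker))
```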

Signed into law in late June 2018 and put into effect in January 2020, the CCPA takes a broader approach than the GDPR. Unfortunately, once the CCPA was placed into effect, it was reported that anywhere from 56-88% of firms were not ready.

The CCPA states that:

  • Companies are required to inform individuals when their particular consumer information is obtained. They must also share how information is being used and to whom they sell the data.
  • Consumers must have access to a "Do Not Sell My Personal Information" opt-in/opt-out option on the company's enterprise homepage.
  • California residents and citizens have the right to deletion from databases, as well as from the third-party databases that purchased their information.

Simply put, this new law, known as California Assembly Bill 375, allows any Californian to seek and obtain all of the consumer data a company has saved about them, as well as a full list of the third parties with whom the data has been shared.

If guidelines are violated, even if there wasn't a breach, this law gives consumers the right of action to file a class-action lawsuit against the company. Violations can cost up to $7,500 per record, which can quickly add up to millions.

Remember, this applies to every business that meets these criteria, not just those located in California.

Comparing GDPR and CCPA Data Privacy Laws

Both GDPR and CCPA emphasize the critical importance of ensuring only authorized users can access personal data. While the more recently implemented CCPA borrows core concepts from the GDPR, there are discrepancies.

That is why it is important to remain mindful that there are some key distinctions and comparisons between GDPR and CCPA when planning your upcoming security and consumer data processing strategies:

  • In terms of legal framework, the CCPA speaks of the personal data of California residents being "collected," whereas the GDPR refers to "processing" activities involving the data of EU residents. This means that the GDPR applies to the processing of personal data by automated means, as opposed to the CCPA, which highlights the "collecting, selling, or sharing" of personal information. In addition, the CCPA applies to "consumers," whereas the GDPR applies to "data subjects," a term with less clearly defined residency or citizenship requirements.
  • Under the GDPR, EU residents have the right to access all of their personal data that has been processed, whereas under the CCPA, the consumer's right to access personal data is narrowed to data collected in the last 12 months. If the data was collected more than 12 months ago, it may be exempt.
  • Both the CCPA and the GDPR state that all data must be exported in a user-friendly format. However, only the GDPR states that imported information must also be in a user-friendly format.
  • Under the GDPR, citizens have a right to correct errors in processed EU personal data. This right to correction is not included under the CCPA.
  • The right to erase personal data under certain conditions pertains to both the GDPR and CCPA.
  • While the GDPR applies to non-profit and profit entities alike, the CCPA applies only to "businesses."

This means that GDPR compliance does NOT ensure CCPA compliance. Yes, there are many similarities between the two. However, there are also some key differences that could have an immense impact on your business.

If you are still unsure of which data protection regulations are applicable to your company, it is recommended you obtain professional legal advice to get a clear sense of your responsibilities and any possible exemptions.


Optimize and Enhance Data Protection Using ETL

Data breaches, whether they're accidental or not, can quickly put enterprises out of business. Therefore, it is imperative that protective solutions are proactively implemented.

Considering 91% of customers trust companies that are honest and transparent, both CCPA and GDPR compliance can help you build trust. Some companies go one step further and implement an ETL solution for this purpose. Typically used to extract, transform, and load raw data into an integrated format (most often for business intelligence purposes), an ETL pipeline can also be implemented to ensure compliance and sustain data governance.

Related: What is ETL? An Introduction to Data Integration

How Does ETL Impact GDPR and CCPA?

When companies are dealing with massive amounts of data, sensitive data can be spread across many locations. Taking a manual approach is extremely time-consuming and, unfortunately, errors are often made, increasing the risk of a data breach.

By contrast, an encrypted, automated solution is scalable, repeatable, and much more effective. ETL is an ongoing process that extracts data from multiple sources in order to cleanse and enrich it and to identify sensitive information. That data is then encrypted and loaded into a data warehouse.

For example, when sensitive consumer information is detected (such as names, social security numbers, zip codes, or other identifiers) it can then be proactively encrypted.

This step is imperative to the safety of consumers and the success of enterprises. By developing key protocols in relation to this step, enterprises can continuously ensure greater data security and confidentiality.
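The detection step described above can be sketched as a small transform function. This is a minimal illustration, not a description of any particular ETL product: the pattern, function names, and record layout are assumptions, only one kind of identifier (a US social security number written as 123-45-6789) is detected, and the sketch redacts the value rather than encrypting it, since encryption would also require a key management setup.

```python
import re

# Hypothetical pattern: SSNs written in the common 3-2-4 digit layout.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_sensitive_fields(record: dict) -> list:
    """Return the names of string fields that contain an SSN-like value."""
    return [
        key
        for key, value in record.items()
        if isinstance(value, str) and SSN_PATTERN.search(value)
    ]


def redact_record(record: dict) -> dict:
    """Return a copy of the record with SSN-like substrings replaced."""
    return {
        key: SSN_PATTERN.sub("[REDACTED]", value) if isinstance(value, str) else value
        for key, value in record.items()
    }


if __name__ == "__main__":
    row = {"name": "Ada", "note": "SSN 123-45-6789 on file", "age": 36}
    print(find_sensitive_fields(row))
    print(redact_record(row))
```

In a real pipeline this function would run in the transform stage, before anything is loaded into the warehouse, so that the raw identifier never reaches the destination.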

The Importance of Hashing Meaningful Information


The protection of data begins with the detection of sensitive information. With an ETL tool's rich set of functions, you can manipulate the resulting output.

For example, you can hash PII, masking that data in order to maintain compliance. Hashing is a one-way process that takes meaningful (often sensitive) information and transforms it into a fixed-length value that looks random, so that it can be copied and compared but not reversed.

For example, if a social security number is replaced by its hash, the original number cannot be read back from the stored value. However, because the same input always produces the same hash, you can take the original data and hash it again to confirm that the result is identical. Since PII is rarely needed for data analytics, it is highly recommended that you always keep this information hashed or encrypted in order to reduce your risk.
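The hash-and-compare idea can be sketched with Python's standard library. This example is an assumption about how one might implement it, not a description of any specific product: it uses HMAC with SHA-256, a keyed hash, rather than a plain hash, because identifiers such as social security numbers come from a small set of possible values and an unkeyed hash of them could be guessed by trying every candidate. The function names and the key handling are illustrative only.

```python
import hashlib
import hmac


def hash_pii(value: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest (HMAC) of the value as a hex string."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def matches(value: str, stored_digest: str, secret_key: bytes) -> bool:
    """Hash the value again and compare it with the stored digest."""
    return hmac.compare_digest(hash_pii(value, secret_key), stored_digest)


if __name__ == "__main__":
    # In practice the key would come from a secrets manager, never from code.
    key = b"example-key-for-illustration-only"
    stored = hash_pii("123-45-6789", key)
    print(stored)
    print(matches("123-45-6789", stored, key))
    print(matches("987-65-4321", stored, key))
```

Because the digest is deterministic for a given key, hashed columns can still be used to join or deduplicate records without exposing the underlying values.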

As more regions enact privacy protection measures, regulatory compliance will be critical to your success, making ETL for both GDPR and CCPA even more important to privacy compliance and security. If you're ready to take the next step and require a data integration platform that allows you to bring all of your data sources together, get in touch with us to acquire your complete toolkit.