Data Classification

ALTR offers a variety of tools to assist in data discovery and governance, including classification with Google DLP, Data Classification with Snowflake, and metadata management with Snowflake Object Tag data. These tools help ALTR users identify sensitive columns and group them with Data Tags so that data governance can scale.

Types of Data Classification Reports

There are two types of data classification reports, each of which gives you insight into the type of sensitive data that exists in each column. Read on for the details that will help you decide which type of report to use for your business.

Google DLP Classification

Google DLP Classification enables ALTR users to send a random sample of their data to Google’s DLP service for classification. ALTR selects a random sample from each column in your Snowflake database (up to 256 values per column) and sends that sample to Google DLP for analysis. If only a limited amount of data is available in a column (two times the number of columns in the table), the column is not classified. Each column is sampled separately to protect the randomness and anonymity of the data. Google’s DLP service returns possible classification results to ALTR, which associates those results with the affected columns as Data Tags.
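The per-column sampling described above can be illustrated with a minimal Python sketch. This is not ALTR's implementation: the function name `sample_columns`, the in-memory `table` structure, and the `min_values` parameter (a hypothetical stand-in for the "limited amount of data" threshold) are all assumptions made for illustration. Only the cap of 256 values per column and the fact that each column is sampled separately come from the description above.

```python
import random

MAX_SAMPLE_SIZE = 256  # per-column cap stated in the text above


def sample_columns(table, min_values=1, seed=None):
    """Draw an independent random sample from each column.

    `table` maps a column name to a list of values (hypothetical structure).
    `min_values` is an assumed threshold below which a column is skipped.
    """
    rng = random.Random(seed)
    samples = {}
    for column, values in table.items():
        present = [v for v in values if v is not None]
        if len(present) < min_values:
            continue  # too little data: the column is not classified
        k = min(MAX_SAMPLE_SIZE, len(present))
        # Sampling each column on its own means the sampled values
        # from different columns do not line up into original rows.
        samples[column] = rng.sample(present, k)
    return samples


# Example with made-up data: one large column and one nearly empty column.
table = {
    "email": [f"user{i}@example.com" for i in range(1000)],
    "notes": [None, None, "call back"],
}
samples = sample_columns(table, min_values=5, seed=0)
```

In this example, `samples` contains 256 values for `email` and no entry for `notes`, mirroring how a column with too little data would not appear in the classification results.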

You can access Google DLP results in ALTR via the Google DLP Classification Report tab shown in figure 1.

NOTE

  • Due to the sensitive nature of client data, ALTR does not perform a Google DLP classification unless you request it.
  • Additionally, ALTR does not persist any of the values sampled during a Google DLP classification.

Fig. 1 Data Management page to access the Google DLP Classification report

Snowflake Data Classification and Object Tags

ALTR integrates with Snowflake’s Object Tagging functionality to import any Object Tags available in Snowflake. The following options are available for importing Snowflake Object Tag data into ALTR:

  • Import any existing Object Tags available in Snowflake OR
  • Execute a Snowflake Data Classification first and then import all available object tag data

Object Tags are metadata in Snowflake, similar to ALTR’s Data Tags, that are used to associate particular Snowflake Objects with each other. You can manually define these tags in Snowflake and assign them to columns, or you can automatically generate them through a Snowflake Data Classification.

Snowflake’s Data Classification tool scans all of the columns in a Snowflake database and attempts to identify what kind of data exists in each column. The classification produces two Snowflake Object Tags: Semantic Categories, which indicate the specific type of data; and Privacy Categories, which indicate the sensitivity of the data.
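To make the two tag types concrete, here is a small Python sketch of what classification results might look like once collected per column. The table and column names are hypothetical, and the dictionary layout is an assumption for illustration, not the format Snowflake or ALTR actually returns. The category values shown (for example `EMAIL` paired with `IDENTIFIER`) follow the vocabulary in Snowflake's classification documentation.

```python
# Hypothetical classification results: column -> the two Object Tags.
classification = {
    "CUSTOMERS.EMAIL": {
        "SEMANTIC_CATEGORY": "EMAIL",
        "PRIVACY_CATEGORY": "IDENTIFIER",
    },
    "CUSTOMERS.AGE": {
        "SEMANTIC_CATEGORY": "AGE",
        "PRIVACY_CATEGORY": "QUASI_IDENTIFIER",
    },
    "PAYROLL.SALARY": {
        "SEMANTIC_CATEGORY": "SALARY",
        "PRIVACY_CATEGORY": "SENSITIVE",
    },
}


def columns_by_privacy_category(results):
    """Group column names by their Privacy Category tag value."""
    grouped = {}
    for column, tags in results.items():
        grouped.setdefault(tags["PRIVACY_CATEGORY"], []).append(column)
    return grouped


grouped = columns_by_privacy_category(classification)
```

Grouping by Privacy Category in this way shows why the tags are useful for governance: columns with the same sensitivity can be handled together rather than one at a time.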

If you trigger a Snowflake Classification and Object Tag Integration for a database in ALTR, ALTR triggers a Classification for each column in that database and stores the resulting Object Tags (and all other Object Tags available in Snowflake) as Data Tags in ALTR. Running a Snowflake Object Tag Import without a classification does not trigger a new Snowflake Data Classification, but it may access any Object Tags created by previous Snowflake classifications.

Note: When performing a Snowflake Data Classification and Object Tag Import, your data stays local inside Snowflake. ALTR does not access the individual values present in your Snowflake Database; it only accesses the resulting metadata.

Using Data Tags in ALTR

In ALTR, you can leverage data tags to more easily identify sensitive data and use these identifications to create governance rules at scale.

Accessing Data Tag Information

ALTR enables you to see the results of a Google DLP Classification via the Google DLP Classification Report, which is available in the second tab of the Data Management Page. On the Google DLP Classification Report, you can easily see the most recent Google DLP classification for each database and connect the classified columns with a single click.

ALTR also enables you to view the results of a Snowflake Classification Report. However, be aware you can only view the most recent one per database.

Note: ALTR automatically generates a Friendly Name for columns connected through the Google DLP classification report based on the classification and the column name.

Enforcing Column Access Policies on Data Tags

All ALTR customers (whether on the Free, Enterprise, or Enterprise Plus plan) can define Column Access Policies on Data Tags, which saves significant time compared with specifying individual columns when creating and updating policies.

Frequently Asked Questions

GOOGLE DLP CLASSIFICATION

  1. I triggered a Google DLP Classification for a database but it hasn’t finished yet. What’s going on?
    Due to throughput limitations with Google, the Google DLP Classification process can take a long time, sometimes over a day, depending on the number of columns in the database. If a Google DLP classification is still marked as 'in-progress' after several days, please reach out to support@altr.com.

    Also, be aware that you can only see the most recently run report per database.
  2. I performed a Google DLP Classification but many columns appear to be missing from the Classification Report. Why?
    The Google DLP Classification Report only returns results for columns that Google identified as potentially sensitive. There are a variety of reasons a column may not appear in the classification report, such as a sample not containing any potentially sensitive data or a column not having enough values to be randomly sampled.


SNOWFLAKE DATA CLASSIFICATION

  1. I triggered a Snowflake Data Classification/Tag Import but I’m not sure whether it’s completed or not. How can I tell?
    ALTR enables you to view the status of your Snowflake Classification Report, which is available on the second tab of the Data Management page. Depending on the complexity of your database, the process can take several hours, especially for a Snowflake Classification. If Snowflake Object Tags are still not available in ALTR after multiple hours, contact support@altr.com.
  2. My Snowflake Data Classification / Object Tag Import finished. Where can I see the results?
    ALTR enables you to view the results of a Snowflake Classification Report, which is available on the second tab of the Data Management page. However, be aware that you can only view the most recent one per database.
  3. I created a tag-based Column Access Policy but the policy isn’t being enforced for all of the tagged columns in Snowflake. Why?
    Even if they are tagged and included in a policy, columns must still be connected to ALTR before Dynamic Data Masking Policies can be created for them in Snowflake. Make sure that any columns you want governed in ALTR have been connected in the Data Management page or the Google DLP Classification Report.
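The rule in the last answer (a tag-based policy only takes effect on columns that are both tagged and connected) can be sketched as a simple set intersection. This is an illustrative Python sketch, not ALTR's API: the function `enforced_columns`, the column names, and the tag name are all hypothetical.

```python
def enforced_columns(policy_tag, column_tags, connected_columns):
    """Split tagged columns into those a tag-based policy can govern
    and those it cannot, based on whether they are connected.

    `column_tags` maps a column name to the set of Data Tags on it.
    `connected_columns` is the set of columns connected to ALTR.
    """
    tagged = {col for col, tags in column_tags.items() if policy_tag in tags}
    enforced = tagged & connected_columns      # tagged and connected
    not_enforced = tagged - connected_columns  # tagged but not connected
    return enforced, not_enforced


# Hypothetical example: three columns carry the tag, two are connected.
column_tags = {
    "CUSTOMERS.EMAIL": {"EMAIL_ADDRESS"},
    "LEADS.EMAIL": {"EMAIL_ADDRESS"},
    "ARCHIVE.EMAIL": {"EMAIL_ADDRESS"},
    "CUSTOMERS.CITY": {"LOCATION"},
}
connected = {"CUSTOMERS.EMAIL", "LEADS.EMAIL", "CUSTOMERS.CITY"}

enforced, not_enforced = enforced_columns("EMAIL_ADDRESS", column_tags, connected)
```

Here `ARCHIVE.EMAIL` ends up in `not_enforced`: it carries the tag, but because it has not been connected, the policy would not apply to it until it is.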