Data Tags and Data Classification

ALTR offers several tools for data discovery and governance, including classification with Google DLP, Data Classification with Snowflake, and metadata management with Snowflake Object Tag data. These tools help ALTR users identify sensitive columnar data and group it via Data Tags so that data can be governed at scale.

Data Tags in ALTR

Data Tags are metadata within ALTR that enable users to define groups of columnar data. Data Tags can be generated by performing a Google DLP Classification, a Snowflake Classification, or a Snowflake Object Tag import. When any of these processes is run, ALTR groups columns together and tags them based on the resulting Google DLP (GDLP) classifications or Snowflake Object Tags.

Enterprise-level ALTR users can then define Column Access Policies on these tags instead of having to individually specify columns.

Note: Even if Column Access Policies are defined by tags, columns must still be individually connected to ALTR in order to be governed. Learn more about connecting columns to ALTR here.
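The relationship between tags, policies, and connected columns can be illustrated with a small sketch. It is not ALTR's implementation or API: the `Column` record, the function names, and the sample column and tag names are all hypothetical. It only models the rule stated above, that a tag-based policy takes effect on a column only when the column carries one of the policy's tags and is also connected to ALTR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical column record for illustration; not an ALTR type."""
    name: str
    tags: frozenset
    connected: bool


def governed_columns(columns, policy_tags):
    """Names of columns that carry a policy tag AND are connected."""
    wanted = set(policy_tags)
    return [c.name for c in columns if c.connected and c.tags & wanted]


def tagged_but_unconnected(columns, policy_tags):
    """Names of columns that carry a policy tag but are not yet connected."""
    wanted = set(policy_tags)
    return [c.name for c in columns if not c.connected and c.tags & wanted]


# Made-up example data.
columns = [
    Column("CUSTOMERS.EMAIL", frozenset({"EMAIL_ADDRESS"}), connected=True),
    Column("CUSTOMERS.ALT_EMAIL", frozenset({"EMAIL_ADDRESS"}), connected=False),
    Column("ORDERS.TOTAL", frozenset(), connected=True),
]

print(governed_columns(columns, ["EMAIL_ADDRESS"]))
print(tagged_but_unconnected(columns, ["EMAIL_ADDRESS"]))
```

In this model, `CUSTOMERS.ALT_EMAIL` is tagged but would not be governed until it is connected.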

Types of ALTR Data Tag Integrations

Google DLP Classification

Google DLP Classification enables ALTR users to send a random sample of their data to Google’s DLP service for classification. In a Google DLP Classification, ALTR selects a random sample from each column in your Snowflake database (up to 256 values per column) and sends that sample to Google DLP for analysis. If a column holds only a limited amount of data (the threshold is two times the number of columns in the table), the column is not classified. Each column is sampled separately to protect the randomness and anonymity of the data. Google’s DLP service returns possible classification results to ALTR, which associates those results with the affected columns as Data Tags.
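The sampling rules described above can be sketched roughly as follows. This is an illustration only, not ALTR's code: the function names are hypothetical, and the skip condition assumes that "limited" means fewer than two times the number of columns in the table, which is one reading of the text. No call to Google DLP is shown.

```python
import random

MAX_SAMPLE_SIZE = 256  # per-column cap stated in the text


def sample_column(values, table_column_count, rng=None):
    """Return a random sample of up to 256 values, or None if skipped.

    Assumption: a column is skipped when it holds fewer than
    2 * table_column_count values.
    """
    if rng is None:
        rng = random.Random()
    values = list(values)
    if len(values) < 2 * table_column_count:
        return None  # too little data; the column is not classified
    return rng.sample(values, min(MAX_SAMPLE_SIZE, len(values)))


def sample_table(table, rng=None):
    """Sample each column separately; `table` maps column name -> values."""
    column_count = len(table)
    return {
        name: sample_column(values, column_count, rng)
        for name, values in table.items()
    }


# Made-up two-column table: one large column, one nearly empty column.
table = {"EMAIL": [f"user{i}@example.com" for i in range(1000)], "NOTES": ["a", "b", "c"]}
samples = sample_table(table, random.Random(0))
```

Here `samples["EMAIL"]` holds 256 values drawn without replacement, and `samples["NOTES"]` is `None` because three values fall below the assumed threshold of four.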

You can access Google DLP results in ALTR via the Google DLP Classification Report.

Note: Due to the sensitive nature of client data, ALTR does not perform a GDLP classification unless explicitly requested by a user. Additionally, ALTR does not persist any of the values sampled during a Google DLP classification.

Snowflake Data Classification and Object Tags

ALTR integrates with Snowflake’s Object Tagging functionality to import any Object Tags available in Snowflake. Two options are available for importing Snowflake Object Tag data into ALTR:

  • importing any existing Object Tags available in Snowflake, or
  • executing a Snowflake Data Classification first and then importing all available Object Tag data.

Object Tags are metadata in Snowflake, similar to ALTR’s Data Tags, that are used to associate particular Snowflake objects with each other. You can manually define these tags in Snowflake and assign them to columns, or you can automatically generate them through a Snowflake Data Classification.

Snowflake’s Data Classification tool scans all of the columns in a Snowflake database and attempts to identify what kind of data exists in each column. The scan produces two kinds of Snowflake Object Tags: Semantic Categories, which indicate the specific type of data, and Privacy Categories, which indicate the sensitivity of the data.

If you trigger a Snowflake Classification and Object Tag Integration for a database in ALTR, then ALTR will trigger a Classification for each column in that database and store the resulting Object Tags (and all other object tags available in Snowflake) as Data Tags in ALTR. Running a Snowflake Object Tag Import without a classification will not trigger a new Snowflake Data Classification, but may access any Object Tags created by previous Snowflake classifications.
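As a rough illustration of how imported Object Tags could become column groupings, the sketch below groups columns by tag name and value. It is not ALTR's import logic: the row layout, the function name, and the sample data (including the `COST_CENTER` tag) are hypothetical, and keying a group by the name and value pair is an assumption made for the example. `SEMANTIC_CATEGORY` and `PRIVACY_CATEGORY` are the tag names Snowflake's classification assigns.

```python
from collections import defaultdict


def group_columns_by_tag(tag_rows):
    """Group columns by (tag name, tag value).

    `tag_rows` is an iterable of (column, tag_name, tag_value) tuples,
    a hypothetical shape for metadata read from Snowflake.
    """
    groups = defaultdict(set)
    for column, tag_name, tag_value in tag_rows:
        groups[(tag_name, tag_value)].add(column)
    # Sort the column names so each group is listed deterministically.
    return {key: sorted(cols) for key, cols in groups.items()}


# Made-up metadata: classification tags plus one manually defined tag.
tag_rows = [
    ("CUSTOMERS.EMAIL", "SEMANTIC_CATEGORY", "EMAIL"),
    ("CUSTOMERS.EMAIL", "PRIVACY_CATEGORY", "IDENTIFIER"),
    ("CUSTOMERS.PHONE", "SEMANTIC_CATEGORY", "PHONE_NUMBER"),
    ("CUSTOMERS.PHONE", "PRIVACY_CATEGORY", "IDENTIFIER"),
    ("CUSTOMERS.REGION", "COST_CENTER", "sales"),
]

data_tags = group_columns_by_tag(tag_rows)
for key, cols in data_tags.items():
    print(key, cols)
```

Only tag metadata appears in this sketch; no column values are read, which matches the metadata-only nature of the import.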

Note: When performing a Snowflake Data Classification and Object Tag Import, your data stays local inside Snowflake. ALTR does not access the individual values present in your Snowflake Database; it only accesses the resulting metadata.

Using Data Tags in ALTR

In ALTR, users can leverage data tags to more easily identify sensitive data and use these identifications to create governance rules at scale.

Accessing Data Tag Information

ALTR enables you to see the results of a Google DLP Classification via the Google DLP Classification Report, which is available in the second tab of the Data Management Page. On the Google DLP Classification Report, you can easily see the most recent Google DLP classification for each database and connect the classified columns with a single click.

ALTR also enables you to view the results of a Snowflake Classification in the Snowflake Classification Report. However, be aware that you can only view the most recent report per database.

Note: ALTR automatically generates a Friendly Name for columns connected through the Google DLP Classification Report, based on the classification and the column name.

Enforcing Column Access Policies on Data Tags

Enterprise ALTR customers can define Column Access Policies on Data Tags, which saves significant time compared with specifying individual columns when creating and updating policies.

FAQs

GOOGLE DLP CLASSIFICATION

  1. I triggered a Google DLP Classification for a database but it hasn’t finished yet. What’s going on?
    Due to throughput limitations with Google, the Google DLP Classification process can take a long time, sometimes over a day, depending on the number of columns in the database. If a Google DLP classification is still marked as 'in-progress' after several days, please reach out to support@altr.com.

    Also, be aware that you can only see the most recently run report per database.
  2. I performed a Google DLP Classification but many columns appear to be missing from the Classification Report. Why?
    The Google DLP Classification Report only returns results for columns that Google identified as potentially sensitive. There are a variety of reasons a column may not appear in the classification report, such as a sample not containing any potentially sensitive data or a column not having enough values to be randomly sampled.


SNOWFLAKE DATA CLASSIFICATION

  1. I triggered a Snowflake Data Classification/Tag Import but I’m not sure whether it’s completed or not. How can I tell?
    ALTR enables you to view the status of your Snowflake classification in the Snowflake Classification Report, which is available on the second tab of the Data Management page. Depending on the complexity of your database, the process could take several hours, especially for a Snowflake Classification. If Snowflake Object Tags are still not available in ALTR after multiple hours, contact support@altr.com.
  2. My Snowflake Data Classification / Object Tag Import finished. Where can I see the results?
    ALTR enables you to view the results in the Snowflake Classification Report, which is available on the second tab of the Data Management page. However, be aware that you can only view the most recent report per database.
  3. I created a tag-based Column Access Policy but the policy isn’t being enforced for all of the tagged columns in Snowflake. Why?
    Even if they are tagged and included in a policy, columns must still be connected to ALTR to create Dynamic Data Masking Policies in Snowflake. Make sure that any columns you want governed in ALTR have been connected in the Data Management page or the Google DLP Classification Report.