Automatic Tagging
Automatic tagging allows you to assign Snowflake object tags to columns based on data classification results. After a data classification job completes, provide ALTR with a JSON object that maps classifiers to tags in order to tag your data. Once your data is tagged, connect the tag in ALTR to monitor database activity and control access. Then, apply tag-based policy to mask sensitive columns.
Note
Automatic tagging is also available in ALTR’s API. Learn more. When automatically tagging data through ALTR's API, you can target objects more specific than a database, so separate jobs may affect the same schema objects. If you run multiple simultaneous jobs that affect the same columns, you may experience inconsistent results.
To set up and use automatic tagging:
To instruct ALTR on which tags to assign to classifiers, write a JSON object that maps the classifiers to tags. This step allows you to define tags and policy on specific sensitivity categories, such as PHI or PCI instead of individual classifiers like Name and Phone Number. When automatic tagging is triggered, ALTR tags any columns included in that classification report based on the mapping defined in the JSON object.
Note
If you want to save the classifier results directly as tags, use the “infotypes” example below.
Priority
The mapping requires a priority for each classifier-tag pair, which ALTR uses to resolve conflicts when a column is assigned more than one classifier.
For example, if a column in your classification report contains “Austin,” “Dallas,” and “Charlotte,” then the classification process could report that the column contains both city names and first names. You can use the priority field in the mapping to determine whether this column should be tagged as names or as locations.
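The conflict resolution described above can be sketched in a few lines of Python. This is only an illustration, not ALTR's implementation: the function name, the sample mapping, and the tag values are hypothetical, and the sketch assumes that the entry with the lowest priority number wins, which this article does not state explicitly.

```python
from typing import Optional

# Hypothetical mapping entries in the same shape as the example mappings.
MAPPING = [
    {"tag_name": "Category", "tag_value": "Location", "priority": 1,
     "classifiers": ["LOCATION"]},
    {"tag_name": "Category", "tag_value": "Name", "priority": 2,
     "classifiers": ["FIRST_NAME"]},
]


def resolve_tag(column_classifiers: list[str], mapping: list[dict]) -> Optional[dict]:
    """Return the mapping entry that wins for a column, or None if none match.

    ASSUMPTION: the matching entry with the lowest priority number wins.
    """
    found = set(column_classifiers)
    matches = [entry for entry in mapping if found & set(entry["classifiers"])]
    if not matches:
        return None
    return min(matches, key=lambda entry: entry["priority"])


# A column of city names that was classified as both a location and a first name.
winner = resolve_tag(["FIRST_NAME", "LOCATION"], MAPPING)
print(winner["tag_value"])
```

Under that assumption, swapping the two priority values would tag the same column as a name instead of a location.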
Write JSON Object to Map Classifiers to Tags
Only classifiers in the selected classification report are mapped. If the tag you wish to use does not exist in Snowflake, include the tag in your JSON object and turn on the Create Tags toggle. Learn more.
To write the JSON object, use the following examples for guidance. Be sure to include:
All the classifiers being mapped
The priority for each mapping object. Learn more.
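Before submitting a mapping, it can help to check it locally against the two requirements above. The following is a minimal sketch, not an ALTR tool: the function name is hypothetical, the required keys are inferred from the example mappings in this article, and treating a repeated priority as a problem is this sketch's own choice.

```python
# Keys that every entry in the example mappings carries.
REQUIRED_KEYS = {
    "tag_database", "tag_schema", "tag_name",
    "tag_value", "priority", "classifiers",
}


def check_mapping(mapping, report_classifiers):
    """Return a list of human-readable problems found in a mapping."""
    problems = []
    priorities_seen = set()
    mapped = set()
    for index, entry in enumerate(mapping):
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"entry {index}: missing {sorted(missing)}")
        priority = entry.get("priority")
        if priority is not None:
            # ASSUMPTION: a repeated priority makes conflict resolution ambiguous.
            if priority in priorities_seen:
                problems.append(
                    f"entry {index}: priority {priority} is used more than once"
                )
            priorities_seen.add(priority)
        mapped.update(entry.get("classifiers", []))
    # Classifiers that appear in the report but that no entry maps to a tag.
    for classifier in sorted(set(report_classifiers) - mapped):
        problems.append(f"classifier {classifier} is not mapped to any tag")
    return problems
```

An empty list means the sketch found nothing to flag; it does not guarantee that ALTR will accept the mapping.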
Example JSON Mappings
Below are some example mappings. Be sure to specify the name of the database and schema where the tag is defined in Snowflake. You can find this information by executing a “SHOW TAGS” statement. If you are creating new tags, specify the name of the database and schema where you want ALTR to create the tag. Make sure that the service user has the CREATE TAG privilege in that schema.
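As a rough aid for finding where a tag is defined, the sketch below builds the SHOW TAGS statement for one schema and the fully qualified name of a tag. The helper names and the sample database and schema are hypothetical. Identifiers are double-quoted, which makes them case-sensitive in Snowflake, so pass names exactly as they are stored.

```python
def quote_ident(name: str) -> str:
    """Double-quote a Snowflake identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def show_tags_statement(database: str, schema: str) -> str:
    """Build a SHOW TAGS statement scoped to a single schema."""
    return f"SHOW TAGS IN SCHEMA {quote_ident(database)}.{quote_ident(schema)};"


def qualified_tag_name(database: str, schema: str, tag_name: str) -> str:
    """Build the fully qualified name of a tag (database.schema.tag)."""
    return ".".join(quote_ident(part) for part in (database, schema, tag_name))


# GOVERNANCE and TAGS are placeholder names for illustration only.
print(show_tags_statement("GOVERNANCE", "TAGS"))
print(qualified_tag_name("GOVERNANCE", "TAGS", "ALTR Stoplight Policy"))
```

Quoting matters for tag names such as "ALTR Stoplight Policy", which contain spaces and cannot be written as unquoted identifiers.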
This example mapping for Google DLP classifiers assigns tags based on a stoplight model, where data sensitivity is identified through three broad categories: “green,” “yellow” and “red.” This mapping is suitable for organizations seeking a lightweight and flexible tag structure.
[ { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Stoplight Policy", "tag_value": "Red", "priority": 1, "classifiers": [ "CREDIT_CARD_NUMBER", "US_SOCIAL_SECURITY_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Stoplight Policy", "tag_value": "Yellow", "priority": 2, "classifiers": [ "DATE_OF_BIRTH", "EMAIL_ADDRESS", "PHONE_NUMBER", "FIRST_NAME", "LAST_NAME", "PERSON_NAME", "MALE_NAME", "FEMALE_NAME", "LOCATION", "STREET_ADDRESS" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Stoplight Policy", "tag_value": "Green", "priority": 3, "classifiers": [ "DATE", "GENERIC_ID", "COUNTRY_DEMOGRAPHIC", "US_STATE", "GENDER", "ORGANIZATION_NAME", "DOMAIN_NAME" ] } ]
This example mapping for Snowflake classifiers assigns tags based on a stoplight model, where data sensitivity is identified through three broad categories: “green,” “yellow” and “red.” This mapping is suitable for organizations seeking a lightweight and flexible tag structure.
[ { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Stoplight Policy", "tag_value": "Red", "priority": 1, "classifiers": [ "SEMANTIC_CATEGORY:BANK_ACCOUNT", "SEMANTIC_CATEGORY:NATIONAL_IDENTIFIER", "SEMANTIC_CATEGORY:TAX_IDENTIFIER", "SEMANTIC_CATEGORY:PAYMENT_CARD" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Stoplight Policy", "tag_value": "Yellow", "priority": 2, "classifiers": [ "SEMANTIC_CATEGORY:DATE_OF_BIRTH", "SEMANTIC_CATEGORY:EMAIL", "SEMANTIC_CATEGORY:PHONE_NUMBER", "SEMANTIC_CATEGORY:NAME", "SEMANTIC_CATEGORY:STREET_ADDRESS" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Stoplight Policy", "tag_value": "Green", "priority": 3, "classifiers": [ "SEMANTIC_CATEGORY:CITY", "SEMANTIC_CATEGORY:POSTAL_CODE", "SEMANTIC_CATEGORY:AGE", "SEMANTIC_CATEGORY:COUNTRY", "SEMANTIC_CATEGORY:GENDER" ] } ]
This example mapping for Google DLP classifiers assigns tags mapped directly off of Google DLP's Infotypes. This mapping is suitable for organizations who want to tag and control access based directly on classification results.
[ { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Credit Card Number", "priority": 1, "classifiers": [ "CREDIT_CARD_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Social Security Number (SSN)", "priority": 2, "classifiers": [ "US_SOCIAL_SECURITY_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Date of Birth", "priority": 3, "classifiers": [ "DATE_OF_BIRTH" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Email Address", "priority": 4, "classifiers": [ "EMAIL_ADDRESS" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Phone Number", "priority": 5, "classifiers": [ "PHONE_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Address", "priority": 6, "classifiers": [ "LOCATION", "STREET_ADDRESS" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Name", "priority": 7, "classifiers": [ "FIRST_NAME", "LAST_NAME", "MALE_NAME", "FEMALE_NAME", "PERSON_NAME" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Date", "priority": 8, "classifiers": [ "DATE" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Gender", "priority": 9, "classifiers": [ "GENDER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Country", "priority": 10, "classifiers": [ "COUNTRY_DEMOGRAPHIC" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "State", "priority": 11, "classifiers": [ "US_STATE" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Organization Name", "priority": 12, "classifiers": [ "ORGANIZATION_NAME" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Domain Name", "priority": 13, "classifiers": [ "DOMAIN_NAME" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "ALTR Classification", "tag_value": "Generic ID", "priority": 14, "classifiers": [ "GENERIC_ID" ] } ]
This example mapping for Google DLP classifiers assigns tags for financial information, with an eye towards data that may be subject to Payment Card Industry (PCI) regulation. This mapping is suitable for organizations looking to identify and protect potentially sensitive financial data.
[ { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PCI", "tag_value": "Primary Account Number (PAN)", "priority": 1, "classifiers": [ "CREDIT_CARD_NUMBER", "CREDIT_CARD_TRACK_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PCI", "tag_value": "Account Number", "priority": 2, "classifiers": [ "FINANCIAL_ACCOUNT_NUMBER", "IBAN_CODE" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PCI", "tag_value": "Cardholder Name", "priority": 3, "classifiers": [ "PERSON_NAME" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PCI", "tag_value": "Expiration Date", "priority": 4, "classifiers": [ "DATE" ] } ]
This example mapping for Google DLP classifiers assigns tags based on healthcare and PHI data, loosely modeled after HIPAA identifiers. This mapping is suitable for organizations looking to identify and protect potentially sensitive healthcare data.
[ { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: Medical Record Number", "priority": 1, "classifiers": [ "MEDICAL_RECORD_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: Social Security Number (SSN)", "priority": 2, "classifiers": [ "US_SOCIAL_SECURITY_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: Taxpayer Identification Number (TIN)", "priority": 3, "classifiers": [ "US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER", "US_PREPARER_TAXPAYER_IDENTIFICATION_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: Medicare Beneficiary Number", "priority": 4, "classifiers": [ "US_MEDICARE_BENEFICIARY_ID_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: Phone Number", "priority": 5, "classifiers": [ "PHONE_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: Name", "priority": 6, "classifiers": [ "FIRST_NAME", "LAST_NAME", "PERSON_NAME", "MALE_NAME", "FEMALE_NAME" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: Date", "priority": 7, "classifiers": [ "DATE_OF_BIRTH", "DATE" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: Email Address", "priority": 8, "classifiers": [ "EMAIL_ADDRESS" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: Geographic Subdivision", "priority": 9, "classifiers": [ "STREET_ADDRESS", "LOCATION", "LOCATION_COORDINATES", "US_STATE" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: Device Identifier", "priority": 10, "classifiers": [ "IMEI_HARDWARE_ID", "ICCID_NUMBER", "MAC_ADDRESS", "MAC_ADDRESS_LOCAL" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: IP Address", "priority": 11, "classifiers": [ "IP_ADDRESS" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: URL", "priority": 12, "classifiers": [ "URL" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: License Number", "priority": 13, "classifiers": [ "US_DRIVERS_LICENSE_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "HIPAA Identifier: Vehicle Identification Number", "priority": 14, "classifiers": [ "US_VEHICLE_IDENTIFICATION_NUMBER", "VEHICLE_IDENTIFICATION_NUMBER" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "Diagnosis Code", "priority": 15, "classifiers": [ "ICD9_CODE", "ICD10_CODE" ] }, { "tag_database": "[TAG_DATABASE]", "tag_schema": "[TAG_SCHEMA]", "tag_name": "PHI", "tag_value": "Medical Term", "priority": 16, "classifiers": [ "MEDICAL_TERM" ] } ]
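The example mappings above all share one shape, so a mapping can also be generated rather than typed by hand. The sketch below is one possible way to do that, not an ALTR utility: the function name and the GOVERNANCE and TAGS names are hypothetical. It assigns priorities 1, 2, 3, and so on in the order the levels are listed, which avoids repeated or skipped priority values.

```python
import json


def build_mapping(tag_database, tag_schema, tag_name, levels):
    """Build a mapping list from (tag_value, classifiers) pairs.

    Priorities are assigned from 1 upward in the order the pairs are listed.
    """
    return [
        {
            "tag_database": tag_database,
            "tag_schema": tag_schema,
            "tag_name": tag_name,
            "tag_value": tag_value,
            "priority": priority,
            "classifiers": list(classifiers),
        }
        for priority, (tag_value, classifiers) in enumerate(levels, start=1)
    ]


# Placeholder database and schema names; classifier lists shortened for brevity.
mapping = build_mapping(
    "GOVERNANCE",
    "TAGS",
    "ALTR Stoplight Policy",
    [
        ("Red", ["CREDIT_CARD_NUMBER", "US_SOCIAL_SECURITY_NUMBER"]),
        ("Yellow", ["DATE_OF_BIRTH", "EMAIL_ADDRESS", "PHONE_NUMBER"]),
        ("Green", ["DATE", "GENDER", "US_STATE"]),
    ],
)
print(json.dumps(mapping, indent=2))
```

The printed JSON can then be pasted wherever the mapping object is requested.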
Trigger an automatic tagging job to tag data in Snowflake based on an ALTR classification report.
When triggered, ALTR automatically tags any columns in that classification report based on the specified mapping. Note that only one automatic tagging job can be run at a time per database.
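If you start tagging jobs from your own scripts, one way to respect the one-job-per-database limit is to serialize submissions on the client side. The sketch below is illustrative only: `run_serialized` is a hypothetical helper, the `job` argument stands in for whatever call actually triggers tagging, and the locks only coordinate threads within a single Python process.

```python
import threading
from collections import defaultdict

# One lock per database name, created on first use.
_registry_lock = threading.Lock()
_database_locks = defaultdict(threading.Lock)


def run_serialized(database, job):
    """Run job() while holding the lock for the given database.

    Calls for the same database run one at a time; calls for different
    databases can run concurrently.
    """
    with _registry_lock:
        lock = _database_locks[database]
    with lock:
        return job()
```

A call such as `run_serialized("SALES_DB", trigger_job)` would block until any earlier call for the same database has returned; both names in that call are placeholders.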
To trigger an automatic tagging job:
Open the Navigation menu and go to the classification reports.
Select the classification report for which to trigger an automatic tagging job.
Click the Tag Columns button.
Enter the JSON object to map classifiers to tags. Learn more.
(Optional) Turn on the Create Tags toggle to automatically create tags that do not exist in Snowflake but are listed in the JSON mapping object. Learn more.
Click the Tag Columns button. Depending on the number of columns in the classification report, the tagging job may take several minutes or hours to complete.
Click the View Tagging Summary button on the Classification Report page to view details about which columns were tagged and which columns failed to tag.
Note
If columns failed to tag, refer to Automatic Tagging Troubleshooting for assistance, make adjustments as needed, and trigger a new tagging job. If the error persists, contact ALTR Support.
Create Tags During Automatic Tagging
As you set up automatic tagging, you have the option to create tags that don’t exist in Snowflake and to add allowed values to existing tags by turning on the Create Tags toggle. Define all the classifiers, tags and allowed values to be mapped in the JSON object, including the tags and allowed values that don’t exist in Snowflake. When the tagging job runs, any tags and allowed values defined in the JSON object that don’t exist in Snowflake will be created.
Optionally, you can manually connect tags before you run the automatic tagging job. Learn more.
Before Using This Feature
Before turning on the Create Tags toggle, ensure
your service user has the correct privileges:
APPLY TAG
OWNERSHIP on the existing tag if adding new allowed values
USAGE on the database and schema where the tag resides
CREATE TAG on the schema where the tag will reside
you included the tag and allowed values in the JSON object
Note
If privileges and the JSON object are not properly configured, part or all of the automatic tagging job will fail.
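The privileges listed above correspond to Snowflake GRANT statements. The sketch below generates them as strings so they can be reviewed before an administrator runs them. It is a hedged illustration: the function and role names are hypothetical, it assumes the service user works through a role (Snowflake grants privileges to roles), and transferring OWNERSHIP of an existing tag can have side effects on current grants that should be checked first.

```python
def quote_ident(name: str) -> str:
    """Double-quote a Snowflake identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def grant_statements(role, database, schema, existing_tag=None):
    """Build GRANT statements for the privileges listed in this section.

    Pass existing_tag only when new allowed values will be added to a tag
    that already exists, since that case requires OWNERSHIP on the tag.
    """
    r = quote_ident(role)
    db = quote_ident(database)
    sch = f"{db}.{quote_ident(schema)}"
    statements = [
        f"GRANT APPLY TAG ON ACCOUNT TO ROLE {r};",
        f"GRANT USAGE ON DATABASE {db} TO ROLE {r};",
        f"GRANT USAGE ON SCHEMA {sch} TO ROLE {r};",
        f"GRANT CREATE TAG ON SCHEMA {sch} TO ROLE {r};",
    ]
    if existing_tag is not None:
        tag = f"{sch}.{quote_ident(existing_tag)}"
        statements.append(f"GRANT OWNERSHIP ON TAG {tag} TO ROLE {r};")
    return statements


# ALTR_SERVICE_ROLE, GOVERNANCE and TAGS are placeholder names.
for statement in grant_statements("ALTR_SERVICE_ROLE", "GOVERNANCE", "TAGS"):
    print(statement)
```

Because the identifiers are double-quoted, they are case-sensitive and must match the stored names exactly.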
When to Use This Feature
This feature is useful the first few times you automatically tag objects when you are still defining your tags. If you are continuously running this automatic tagging job or have created the object tags directly in Snowflake, then you should not use this feature.
Once the automatic tagging job has been triggered, ALTR starts preparing to identify and tag all of the columns in Snowflake. Track the automatic tagging job to see what is in progress, stuck or completed.
To track tagging jobs, click the Notifications (bell) icon at the top of the page.
Once the automatic tagging job completes, ALTR generates a summary report so you can identify which objects were tagged and see if any object-level errors occurred during the tagging job. You can also download the summary report to either distribute the report or to view the entire report at one time.
To view the tagging summary:
Navigate to the classification report.
Click the View Tagging Summary button.
To download the tagging summary report, click the Download CSV Report button.
Download Tagging Summary
Download the summary report as a CSV file to see which columns are protected and to distribute the report to stakeholders.
To download the tagging summary report:
Select the classification report.
Click the View Tagging Summary button.
Click the Download CSV Report button.
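Once downloaded, the CSV can be summarized with a short script. The sketch below counts rows by the value in one column. The column names and sample rows are hypothetical, because this article does not describe the report's layout; replace `status_column` with the header that your downloaded report actually uses.

```python
import csv
import io
from collections import Counter


def count_by_status(csv_text, status_column):
    """Count the rows of a CSV report by the value in status_column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[status_column] for row in reader)


# Hypothetical sample; a real report's headers and values may differ.
sample = (
    "column,status\n"
    "CUSTOMERS.EMAIL,TAGGED\n"
    "CUSTOMERS.SSN,FAILED\n"
    "CUSTOMERS.PHONE,TAGGED\n"
)
counts = count_by_status(sample, "status")
print(dict(counts))
```

For a file on disk, read its contents with `open(path, newline="").read()` and pass the text to the same function.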
If the automatic tagging job failed to tag columns, check the following causes and try tagging again. If the error persists, contact ALTR Support.
Note
If working with ALTR Support to troubleshoot a failed job, be sure to send them the Tracking ID on the Tagging Summary page. Learn more. This unique ID is assigned to the tagging job and is helpful to expedite the resolution.
Your automatic tagging job may have failed or encountered errors because
one of the tags specified doesn’t exist in Snowflake
one of the columns classified no longer exists in Snowflake
the service user doesn’t have privileges to create tags on the relevant Snowflake schemas (only applicable if the “Create Tags” option was turned on)
the service user doesn't have privileges to apply tags on your Snowflake account
there is a typo or other error in your JSON object
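For the last cause in the list above, a quick local check can point to where a JSON object stops being valid before you submit it again. This minimal sketch uses Python's standard `json` module; the function name is hypothetical, and the check covers syntax only, not whether the tags or classifiers exist.

```python
import json


def find_json_error(text):
    """Return a description of the first JSON syntax error, or None if valid."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# A trailing comma after the last field is a common typo.
broken = '[ { "tag_name": "PHI", "priority": 1, } ]'
print(find_json_error(broken))
```

The exact wording of the message depends on the Python version, but the line and column locate the problem.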
Locate the Tracking ID
If you need to contact ALTR Support, they will need the tagging job’s tracking ID to assist you.
Note
The Tracking ID is automatically included if you click the Contact ALTR support button for failed tagging jobs from the Notifications (bell) icon.
To locate the tracking ID:
Select the classification report.
Click the View Tagging Summary button.
Copy the Tracking ID and include it in your message to ALTR Support.