
Data Classification

Data classification in ALTR helps you identify potentially sensitive data across data sources. This process determines where specific types of information, such as personal, financial or regulated data, may exist.

After a classification scan completes, the results are available in a classification report. Use these results to understand how sensitive data is distributed across your data sources and to protect sensitive data by applying access controls.

ALTR supports three classification methods:

  • ALTR Native classification: uses regex-based classifiers that you define or that ALTR manages
  • Snowflake classification: leverages Snowflake’s native classification capabilities
  • Google DLP classification: uses Google’s Data Loss Prevention service

Refer to the following table for a high-level comparison of ALTR’s classification methods:

| Method | Description | Supported Data Sources | Available Features | Limitations |
|---|---|---|---|---|
| ALTR Native | Matches custom or ALTR-managed classifier patterns against a data sample | Snowflake | Full customization of classification scans; data remains in Snowflake and does not leave your Snowflake instance; connect columns from the classification report; pause and cancel classification scans | Does not support auto tagging |
| Snowflake | Uses Snowflake’s built-in classification functionality | Snowflake | Data remains in Snowflake and does not leave your Snowflake instance; connect columns from the classification report; supports auto tagging | Classification logic is defined by Snowflake and is not configurable by use case; doesn’t support custom regex |
| Google DLP | Matches pre-defined rules against a data sample using Google Data Loss Prevention | Snowflake, Databricks | Delivers a good base set of classifiers with its classification engine; supports auto tagging; connect columns from the classification report | Data has to leave your database environment and ALTR in order to go to Google; limited customization beyond predefined infoTypes; doesn’t support custom regex |
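
To make the idea of regex-based classification concrete, the following is a minimal sketch of matching classifier patterns against a sample of column values. It is not ALTR’s implementation: the classifier names, the patterns, the 50% match threshold, and the `classify_column` helper are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical classifiers; names and patterns are illustrative only.
CLASSIFIERS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def classify_column(sample, classifiers=CLASSIFIERS, threshold=0.5):
    """Return the classifier names whose pattern fully matches at least
    `threshold` of the non-empty values in the sample (assumed rule)."""
    values = [v for v in sample if v]
    if not values:
        return []
    hits = Counter()
    for value in values:
        for name, pattern in classifiers.items():
            if pattern.fullmatch(value):
                hits[name] += 1
    return sorted(name for name, n in hits.items() if n / len(values) >= threshold)


# A small, made-up data sample keyed by column name.
sample_columns = {
    "contact": ["ann@example.com", "bob@example.org", "n/a"],
    "tax_id": ["123-45-6789", "987-65-4321"],
    "notes": ["called twice", "left voicemail"],
}

# A toy "classification report": column name -> matched classifiers.
report = {col: classify_column(vals) for col, vals in sample_columns.items()}
```

In this sketch, a column is only flagged when enough of the sampled values match, so a single stray value does not classify a whole column. How a real scan samples data and scores matches may differ.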

To classify data in ALTR:

  1. Connect a data source to ALTR.
  2. Determine your classification method.
  3. Classify data.
  4. View classification report.
  5. Protect sensitive data using classification results.

Data classification may take several minutes to run, depending on the number of columns in the data source. An email is sent to administrators when the classification report is ready (Snowflake and Google DLP only).

Classifying data runs a classification scan and generates a classification report.

To classify data:

  1. Connect a Snowflake or a Databricks data source to ALTR.
  2. If using ALTR Native Classification, create a collection with at least one classifier. Learn more.
  3. Select Data Classification > Classification Reports in the Navigation menu.
  4. Click Classify Data to run a classification scan and generate a report; a modal displays.
  5. Select a data source.
  6. Select a classification method.
  7. Select a collection (ALTR Native only).
  8. Click Classify Data. This process may take a few minutes depending on the size of the data source. When the Status is Success, the classification report is available.

Pause a classification scan to temporarily stop it, for example, if there are performance issues. You cannot view the report of a partially run classification scan; the scan must complete successfully before you can view the report.

If you update the collection used by a paused classification scan, the final report does not include the changes; the scan uses a snapshot of the collection taken when the classification started.
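
The snapshot behavior can be pictured with a small sketch. This is an illustration of the concept only, assuming a hypothetical `Scan` class that copies its classifier collection when it starts; it does not reflect how ALTR stores collections internally.

```python
import copy


class Scan:
    """Hypothetical scan that freezes its classifier collection at start."""

    def __init__(self, collection):
        # Deep copy, so later edits to the collection do not reach this scan.
        self.classifiers = copy.deepcopy(collection)


collection = {"EMAIL": r".+@.+"}          # made-up collection
scan = Scan(collection)                    # scan starts, snapshot is taken

# The collection is edited while the scan is paused.
collection["US_SSN"] = r"\d{3}-\d{2}-\d{4}"

# The scan still sees only the classifiers that existed when it started.
snapshot_names = sorted(scan.classifiers)
```

To include the new classifier in a report, you would run a new classification scan after updating the collection.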

To pause a running classification scan:

  1. Select Data Classification > Classification Reports in the Navigation menu.
  2. Locate the ALTR Native classification scan to pause (Status is In Progress).
  3. Click the ellipsis menu for the running scan.
  4. Select Pause scan.

To resume a paused classification scan:

  1. Select Data Classification > Classification Reports in the Navigation menu.
  2. Locate the ALTR Native classification scan to resume (Status is Paused).
  3. Click the ellipsis menu for the paused scan.
  4. Select Resume scan.

Cancel a classification scan to permanently stop it. You lose any data generated for the report, and a cancelled scan can’t be restarted. To temporarily stop a running scan, pause it instead. A cancelled classification report remains on the Classification Report page with a Status of Cancelled.

To cancel a running classification scan:

  1. Select Data Classification > Classification Reports in the Navigation menu.
  2. Locate the ALTR Native classification scan to cancel (Status is In Progress or Paused).
  3. Click the ellipsis menu for the scan.
  4. Select Cancel scan; a modal displays to confirm.
  5. Click Cancel Scan.
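
The pause, resume, and cancel rules described above can be summarized as a small state machine. The sketch below is a mental model only: the status labels come from this page, while the `Status` enum, the `ALLOWED` transition table, and the helper functions are hypothetical and not part of any ALTR API.

```python
from enum import Enum


class Status(Enum):
    IN_PROGRESS = "In Progress"
    PAUSED = "Paused"
    CANCELLED = "Cancelled"
    SUCCESS = "Success"


# Assumed transitions, derived from the rules on this page:
# a running scan can be paused, cancelled, or finish; a paused scan can be
# resumed or cancelled; cancelled and successful scans are final.
ALLOWED = {
    Status.IN_PROGRESS: {Status.PAUSED, Status.CANCELLED, Status.SUCCESS},
    Status.PAUSED: {Status.IN_PROGRESS, Status.CANCELLED},
    Status.CANCELLED: set(),
    Status.SUCCESS: set(),
}


def transition(current, target):
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def report_available(status):
    """A report can be viewed only after the scan completes successfully."""
    return status is Status.SUCCESS
```

For example, `transition(Status.PAUSED, Status.IN_PROGRESS)` models resuming a scan, while `transition(Status.CANCELLED, Status.IN_PROGRESS)` raises `ValueError` because a cancelled scan can’t be restarted.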

Once the classification scan completes and the Status on the Classification Report page is Success, the classification report is available to view.

To view the classification report:

  1. Select Data Classification > Classification Reports in the Navigation menu.
  2. Click a report to view details.

Classification results in ALTR help you identify where sensitive data exists so you can apply data access control policies to protect it.

When a classification scan completes, ALTR generates a classification report that shows which columns were identified as containing specific types of data, such as personal, financial or regulated information. You can use these results to create or update policies that protect sensitive data across your environment.

To protect sensitive data using classification:

  1. Identify sensitive data. Classification reports highlight columns that contain sensitive data based on the classifiers used in the scan. Assign tags to columns based on classification results using automatic tagging (Snowflake classification and Google DLP classification only).
  2. Apply access controls. Based on classification results, create policies to:
    1. mask sensitive data
    2. restrict access by user or role
    3. allow access only under defined conditions
  3. Automate ongoing protection. As new data is classified or existing data changes, policies continue to apply automatically, helping maintain consistent protection without manual updates.
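
As a rough illustration of how classification results can drive protection, the sketch below applies a tag-based masking rule when a value is read. Everything in it is hypothetical: the report contents, the `POLICY` structure, the role names, and the `read_value` helper are assumptions for illustration, not ALTR’s policy model.

```python
def mask_email(value):
    """Hide the local part of an email address, keeping the domain."""
    return "***@" + value.split("@")[-1]


# Made-up classification results: column -> tags found by a scan.
report = {"customers.email": ["EMAIL"], "customers.city": []}

# Made-up policy: for each tag, which roles see clear text and how to mask.
POLICY = {
    "EMAIL": {"allowed_roles": {"SUPPORT_ADMIN"}, "mask": mask_email},
}


def read_value(column, value, role, report=report, policy=POLICY):
    """Return the value as the given role would see it under the policy."""
    for tag in report.get(column, []):
        rule = policy.get(tag)
        if rule and role not in rule["allowed_roles"]:
            return rule["mask"](value)
    return value


masked = read_value("customers.email", "ann@example.com", "ANALYST")
clear = read_value("customers.email", "ann@example.com", "SUPPORT_ADMIN")
untagged = read_value("customers.city", "Austin", "ANALYST")
```

Because the rule is keyed on the tag rather than on a specific column, any column later classified with the same tag would be covered without a policy change, which mirrors the ongoing-protection idea in step 3.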