Manage Databricks Data Sources
This section describes ALTR's capabilities for managing Databricks data sources. ALTR must be connected to a data source in order to enforce data access governance and advanced data protection on sensitive data.
When connecting your account to ALTR, you will need the following information:
Workspace Hostname
Service Principal ID
Cluster ID
OAuth Secret
For assistance locating these fields in your Databricks account, read our documentation.
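Before filling in the form, it can help to collect the four values in one place and check them for obvious copy-and-paste mistakes. The following is a minimal sketch, not part of ALTR or Databricks: the `DatabricksConnection` class and its `problems` method are hypothetical names, and the rule that the hostname is a bare host without a scheme or path is an assumption made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DatabricksConnection:
    """Hypothetical container for the four values listed above."""

    workspace_hostname: str
    service_principal_id: str
    cluster_id: str
    oauth_secret: str

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means nothing obvious is wrong."""
        issues = []
        for name, value in vars(self).items():
            if not value.strip():
                issues.append(f"{name} is empty")
            elif value != value.strip():
                issues.append(f"{name} has leading or trailing whitespace")
        host = self.workspace_hostname.strip()
        # Assumption: the hostname is entered as a bare host, with no scheme or path.
        if "://" in host or "/" in host:
            issues.append("workspace_hostname should not include a scheme or path")
        return issues
```

Calling `problems()` on an instance returns an empty list when every field is filled in and free of stray whitespace; it does not confirm that the values are accepted by Databricks or ALTR.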
Note
Ideally, each metastore has its own ALTR organization (tenant), although it is technically possible to connect two organizations to the same metastore.
To connect a Databricks metastore:
In the Navigation menu, open the data sources page.
Click Add Data Source.
On the Databricks card, click Add Data Source.
Enter a user-friendly Display Name for the connection.
Enter the Workspace Hostname.
Enter the Service Principal ID.
Enter the Cluster ID.
Note
Once the metastore is connected, the Cluster ID cannot be edited. To update the Cluster ID, disconnect the metastore from ALTR and reconnect it.
Enter the OAuth Secret.
(Optional) Click Run Data Classification Scan to classify the metastore.
Note
Classification uses Google DLP to scan all catalogs within the metastore that are accessible to the service principal and identify sensitive data. Learn more.
Click Connect Data Source.
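If the connection fails, one way to narrow down the cause is to check the service principal credentials against Databricks directly. The sketch below builds, without sending, an OAuth client-credentials token request for the workspace's `/oidc/v1/token` endpoint. It assumes that the Service Principal ID entered in ALTR is the OAuth client ID Databricks expects, which this page does not confirm; `build_token_request` is a hypothetical helper name.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    workspace_hostname: str, client_id: str, oauth_secret: str
) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request."""
    # Assumption: client_id is the same value as the Service Principal ID.
    credentials = f"{client_id}:{oauth_secret}".encode("utf-8")
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    return urllib.request.Request(
        url=f"https://{workspace_hostname}/oidc/v1/token",
        data=body,
        headers={
            # HTTP Basic authentication carrying the ID and secret.
            "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` should return a JSON body containing an `access_token` when the ID and secret are valid. An HTTP error response suggests the same values would likely be rejected when connecting the data source.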
Remove a Databricks data source from ALTR if your service user is having problems or some other issue has occurred with the data source.
To remove a data source:
In the Navigation menu, open the data sources page.
Select the data source you wish to disconnect.
Click the Remove Data Source button. Removing a data source can take several minutes to complete.