Databricks

Databricks is a data lakehouse that unifies the best of data warehouses and data lakes in one simple platform to handle all your data, analytics and AI use cases. It’s built on an open and reliable data foundation that efficiently handles all data types and applies one common security and governance approach across all of your data and cloud platforms.

Limitations

Metaplane currently supports only Databricks workspaces that use Unity Catalog and a SQL Warehouse. If your setup differs, please reach out through the chat bubble in the bottom left corner.

1. Generate a Databricks access token

Metaplane needs an access token to access Databricks. It is recommended that you create a service principal for Metaplane and generate an access token for that service principal. Alternatively, you can use a personal access token tied to your own user.

Generate personal access token for your user

Follow the instructions here to generate a personal access token for your user. Save this access token somewhere safe.

Create a service principal for Metaplane

Follow the instructions here to create a service principal using the Databricks API. Take note of the service principal's application ID and save it somewhere safe. You will use it as `<application_id>` when granting permissions in Step 2.

Grant token usage to service principal

Follow the instructions here to give Metaplane's service principal permissions to use access tokens.

Generate an access token for Metaplane's service principal

Follow the instructions here to generate an access token for Metaplane's service principal. To keep Metaplane's connection to Databricks from being interrupted by an expired token, set lifetime_seconds to null so that the token does not expire. Save this access token somewhere safe.
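The three service principal steps above can be sketched as plain HTTP requests. This is a minimal sketch, not Metaplane's or Databricks' official tooling: the endpoint paths and payload fields below are assumptions based on the Databricks SCIM, Permissions and Token Management REST APIs and should be confirmed against the API reference for your workspace. The helper names, the `metaplane` display name and the `admin_token` parameter (a token belonging to a workspace admin) are hypothetical. The functions only build the requests; nothing is sent until you pass one to `urllib.request.urlopen`.

```python
import json
import urllib.request

# Assumed REST paths; confirm them against the API reference for your workspace.
SCIM_SERVICE_PRINCIPALS = "/api/2.0/preview/scim/v2/ServicePrincipals"
TOKEN_PERMISSIONS = "/api/2.0/preview/permissions/authorization/tokens"
ON_BEHALF_OF_TOKENS = "/api/2.0/token-management/on-behalf-of/tokens"


def _request(host, admin_token, method, path, payload):
    """Build (but do not send) an authenticated JSON request."""
    return urllib.request.Request(
        url=f"https://{host}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
        method=method,
    )


def create_service_principal(host, admin_token, display_name="metaplane"):
    """Request that creates the service principal for Metaplane."""
    payload = {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"],
        "displayName": display_name,
    }
    return _request(host, admin_token, "POST", SCIM_SERVICE_PRINCIPALS, payload)


def grant_token_usage(host, admin_token, application_id):
    """Request that lets the service principal use access tokens."""
    payload = {
        "access_control_list": [
            {"service_principal_name": application_id, "permission_level": "CAN_USE"}
        ]
    }
    return _request(host, admin_token, "PATCH", TOKEN_PERMISSIONS, payload)


def create_token_for(host, admin_token, application_id, lifetime_seconds=None):
    """Request that generates an access token on behalf of the service principal."""
    payload = {
        "application_id": application_id,
        "comment": "Metaplane",
        # None is serialized as JSON null, matching the non-expiring setting above.
        "lifetime_seconds": lifetime_seconds,
    }
    return _request(host, admin_token, "POST", ON_BEHALF_OF_TOKENS, payload)
```

To send a request, call `urllib.request.urlopen(request)` and parse the JSON response. Under the same assumptions, the create response should carry the application ID and the token response should carry the token value to save.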

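2. Grant Metaplane's service principal permission to your data

Run one of the following sets of commands for each catalog you want Metaplane to have access to. Replace `<application_id>` with the service principal's application ID from Step 1.

Grant access to all existing and future tables within a catalog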

GRANT USE_CATALOG ON CATALOG <catalog_name> TO `<application_id>`;
GRANT USE_SCHEMA ON CATALOG <catalog_name> TO `<application_id>`;
GRANT SELECT ON CATALOG <catalog_name> TO `<application_id>`;
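If you have several catalogs, the three statements above can be generated rather than typed by hand. The following is a small sketch that only builds the SQL text. The function name and the example catalog names are hypothetical, and it assumes catalog names are plain identifiers that need no quoting.

```python
def catalog_grants(catalog_name, application_id):
    """Return the three catalog-level GRANT statements for one catalog."""
    principal = f"`{application_id}`"  # the application ID is backtick-quoted, as above
    return [
        f"GRANT {privilege} ON CATALOG {catalog_name} TO {principal};"
        for privilege in ("USE_CATALOG", "USE_SCHEMA", "SELECT")
    ]


if __name__ == "__main__":
    catalogs = ["sales", "marketing"]  # hypothetical catalog names
    application_id = "<application_id>"  # replace with the value saved in Step 1
    for catalog in catalogs:
        print("\n".join(catalog_grants(catalog, application_id)))
```

Paste the printed statements into the SQL editor and run them as a user who is allowed to grant privileges on those catalogs.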

Grant access to specific tables within a catalog

GRANT USE_CATALOG ON CATALOG <catalog_name> TO `<application_id>`;
GRANT USE_SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `<application_id>`;
GRANT SELECT ON TABLE <catalog_name>.<schema_name>.<table_name> TO `<application_id>`;
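When only a handful of tables should be visible, each table needs its own SELECT grant, while the catalog and schema grants are needed once per catalog and schema. A minimal sketch of that bookkeeping follows. The function name and the example table names are hypothetical, and it assumes fully qualified `catalog.schema.table` names made of plain identifiers.

```python
def table_grants(table_names, application_id):
    """Return de-duplicated GRANT statements for fully qualified table names."""
    principal = f"`{application_id}`"
    statements = []
    for full_name in table_names:
        parts = full_name.split(".")
        if len(parts) != 3:
            raise ValueError(f"expected catalog.schema.table, got {full_name!r}")
        catalog, schema, table = parts
        candidates = [
            f"GRANT USE_CATALOG ON CATALOG {catalog} TO {principal};",
            f"GRANT USE_SCHEMA ON SCHEMA {catalog}.{schema} TO {principal};",
            f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO {principal};",
        ]
        # Keep the first occurrence of each statement so shared catalog and
        # schema grants are emitted only once.
        for statement in candidates:
            if statement not in statements:
                statements.append(statement)
    return statements


if __name__ == "__main__":
    tables = ["sales.public.orders", "sales.public.customers"]  # hypothetical
    print("\n".join(table_grants(tables, "<application_id>")))
```

For the two example tables this yields four statements: one catalog grant, one schema grant and two table grants.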

3. Create a Databricks SQL Warehouse for Metaplane

  1. Follow the instructions here to create a SQL Warehouse for Metaplane to use. You will use the Host, Port and HTTP path from the 'Connection details' tab when creating the connection to Databricks in Metaplane.
  2. Click the 'Permissions' button and give the Metaplane service principal 'Can use' permissions.

4. Add Databricks as a connection in Metaplane

On the connections page, click the 'Add connection' button in the upper right corner and find the Databricks icon under Warehouses. The Host, Port and HTTP path fields come from the 'Connection details' tab of the SQL Warehouse created in Step 3. The Access token comes from Step 1.
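Before entering these values in Metaplane, it can help to confirm that the token can actually reach the SQL Warehouse. The sketch below is one way to do that with the open-source `databricks-sql-connector` package (`pip install databricks-sql-connector`). It is not part of Metaplane, and the helper names are hypothetical. The normalization step reflects what the connector expects (a hostname without a scheme, an HTTP path with a leading slash). Whether Metaplane's form applies the same rules is an assumption. The port is not passed in this sketch.

```python
def normalize_connection_details(host, http_path):
    """Strip a URL scheme and trailing slash from host; ensure http_path starts with '/'."""
    host = host.strip()
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    host = host.rstrip("/")
    if not host:
        raise ValueError("host is empty")
    http_path = http_path.strip()
    if not http_path.startswith("/"):
        http_path = "/" + http_path
    return host, http_path


def check_connection(host, http_path, access_token):
    """Run SELECT 1 against the SQL Warehouse and return the single value."""
    # Imported lazily so the helper above works without the connector installed.
    from databricks import sql

    host, http_path = normalize_connection_details(host, http_path)
    with sql.connect(
        server_hostname=host,
        http_path=http_path,
        access_token=access_token,
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            return cursor.fetchone()[0]
```

If `check_connection` raises an authentication or permission error, revisit the token from Step 1 and the 'Can use' permission from Step 3 before adding the connection in Metaplane.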