Airbyte
Airbyte is a data movement platform with an expansive catalog of connectors, allowing users to seamlessly sync data between systems.
Generate Airbyte API credentials
Pick an Airbyte user for the Metaplane connection
The credentials you give Metaplane must be associated with an Airbyte user account. You can use your own account, or you can create a dedicated Metaplane account in Airbyte, which gives you finer-grained control over which resources Metaplane can access.
Airbyte Cloud
If you use Airbyte Cloud, you can use API token authentication to connect Metaplane.
Create an Application
You can follow the steps outlined in https://reference.airbyte.com/reference/authentication-20 to generate an Airbyte Application for Metaplane. The tl;dr is: go to Airbyte Cloud, go to Settings -> Applications, click Create an Application, and call it Metaplane.
Once you've created the application, note the Client ID and Client Secret. These are what you'll need to pass to Metaplane in the next step.
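If you want to sanity-check the Client ID and Client Secret before handing them to Metaplane, you can try exchanging them for an access token yourself. The sketch below is illustrative only: the token endpoint URL and the request body shape are assumptions based on Airbyte's authentication reference linked above, so confirm both there before relying on it. The names AIRBYTE_TOKEN_URL, token_request_body, and request_access_token are hypothetical helpers, not part of Airbyte or Metaplane.

```shell
#!/usr/bin/env bash
# Assumption: Airbyte Cloud's token endpoint. Verify against the Airbyte API reference.
AIRBYTE_TOKEN_URL="${AIRBYTE_TOKEN_URL:-https://api.airbyte.com/v1/applications/token}"

# Build the JSON body for the token request.
# Assumes the ID and secret contain no double quotes or backslashes.
token_request_body() {
  printf '{"client_id":"%s","client_secret":"%s"}' "$1" "$2"
}

# POST the credentials and print the raw JSON response.
# Nothing is sent until this function is called.
request_access_token() {
  curl --silent --show-error --request POST "$AIRBYTE_TOKEN_URL" \
    --header "Content-Type: application/json" \
    --data "$(token_request_body "$1" "$2")"
}

# Example (requires network access and real credentials):
#   request_access_token "$AIRBYTE_CLIENT_ID" "$AIRBYTE_CLIENT_SECRET"
```

A JSON response containing an access token suggests the application was created correctly; an error response suggests the ID or secret was copied incorrectly.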
Airbyte OSS
If you run a self-hosted open-source deployment of Airbyte, you can use username-password authentication to connect Metaplane. This could be the username and password of your personal Airbyte user, or of the dedicated Metaplane account you (optionally) created in the previous step.
Whitelist Metaplane IP Addresses
Depending on your method of deployment, your Airbyte instance may be behind a firewall that restricts external access.
Metaplane will only access your Airbyte instance through the following IPs:
44.197.96.121
34.206.79.174
107.22.42.246
As a comma-separated list: 44.197.96.121, 34.206.79.174, 107.22.42.246
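How you allowlist these addresses depends entirely on your firewall or cloud provider, so the following is only a small helper sketch rather than a firewall configuration. It prints the three addresses as single-host CIDR blocks (the form many allowlist UIs expect) and checks whether a given address is one of Metaplane's. The function and variable names are hypothetical.

```shell
#!/usr/bin/env bash
# The three Metaplane IP addresses listed above.
METAPLANE_IPS="44.197.96.121 34.206.79.174 107.22.42.246"

# Print each Metaplane IP as a single-host CIDR block, one per line.
metaplane_cidrs() {
  local ip
  for ip in $METAPLANE_IPS; do
    printf '%s/32\n' "$ip"
  done
}

# Exit 0 if the given address is one of Metaplane's IPs, 1 otherwise.
is_metaplane_ip() {
  local ip
  for ip in $METAPLANE_IPS; do
    [ "$ip" = "$1" ] && return 0
  done
  return 1
}
```

For example, `is_metaplane_ip 34.206.79.174 && echo allowed` can be used to check an address seen in your access logs.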
Identify your base API URL
The base URL usually has the format https://<airbyteInstanceUrl>/api/public/v1, but you can check with your Airbyte administrator to be sure. The airbyteInstanceUrl may or may not include a port as well; the default is usually :8000.
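As a rough illustration of the format above, the helper below assembles a candidate base URL from a host and an optional port. It is a sketch under assumptions: the function name airbyte_api_url is hypothetical, the host airbyte.example.com is a placeholder, and the https scheme is taken from the format shown above (your deployment may differ).

```shell
#!/usr/bin/env bash
# Build the Public API base URL from a host and an optional port.
# Usage: airbyte_api_url HOST [PORT]
airbyte_api_url() {
  local host="${1%/}"   # drop one trailing slash, if present
  local port="${2:-}"
  if [ -n "$port" ]; then
    host="${host}:${port}"
  fi
  printf 'https://%s/api/public/v1\n' "$host"
}

# Example with a placeholder host and the usual default port:
#   AIRBYTE_API_URL="$(airbyte_api_url airbyte.example.com 8000)"
```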
Validating URL and credentials
If the Airbyte Public API isn't something you use regularly, finding the correct URL can be difficult. To validate your URL and credentials in your local terminal, you can run the following, substituting your own values for AIRBYTE_USER, AIRBYTE_PASS, and AIRBYTE_API_URL:
AIRBYTE_USER=
AIRBYTE_PASS=
AIRBYTE_API_URL=
curl "$AIRBYTE_API_URL/health" --header "Authorization: Basic $(printf '%s' "${AIRBYTE_USER}:${AIRBYTE_PASS}" | base64 | tr -d '\n')"
If your credentials and API URL are all correct, the response will be Successful operation.
If the response is an HTML block containing something like This deployment of Airbyte is protected by HTTP Basic Authentication, then your API URL is most likely correct, but the username and password were rejected.
If the request times out, then your API URL is incorrect (more likely), or your local IP address needs to be whitelisted for access to your Airbyte instance (less likely, since you probably already have access to Airbyte).
If you get back a message like Object not found, this means your API URL points to an Airbyte instance, but likely not to the correct path. To further validate, you can try hitting the /workspaces endpoint and see if it returns any more information:
curl "$AIRBYTE_API_URL/workspaces" --header "Authorization: Basic $(printf '%s' "${AIRBYTE_USER}:${AIRBYTE_PASS}" | base64 | tr -d '\n')"
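The outcomes described above can be sketched as a small script that builds the Basic Authorization header and maps a curl result to one of those cases. This is a minimal sketch, not an official tool: the function names are hypothetical, the matched strings are the ones quoted in this guide (your Airbyte version may word them differently), and treating exit code 28 as a timeout relies on curl's documented exit codes.

```shell
#!/usr/bin/env bash
# Build the HTTP Basic Authorization header for a username and password.
basic_auth_header() {
  # printf '%s:%s' keeps any "%" in the password from being read as a format
  # directive; tr removes line wraps some base64 tools add to long output.
  printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
}

# Map a curl exit code and response body to one of the outcomes above.
# Usage: diagnose_response CURL_EXIT_CODE BODY
diagnose_response() {
  local code="$1" body="$2"
  if [ "$code" -eq 28 ]; then            # 28 is curl's "operation timed out"
    echo "timeout: check the API URL, then the firewall allowlist"
  elif [ "$code" -ne 0 ]; then
    echo "curl failed with exit code $code"
  else
    case "$body" in
      *"Successful operation"*)      echo "ok: URL and credentials accepted" ;;
      *"HTTP Basic Authentication"*) echo "auth: URL likely correct, credentials rejected" ;;
      *"Object not found"*)          echo "path: host is Airbyte, but the path is likely wrong" ;;
      *)                             echo "unknown: inspect the response manually" ;;
    esac
  fi
}

# Example (requires network access to your Airbyte instance):
#   body="$(curl --silent --max-time 15 "$AIRBYTE_API_URL/health" \
#     --header "$(basic_auth_header "$AIRBYTE_USER" "$AIRBYTE_PASS")")"
#   diagnose_response "$?" "$body"
```

Setting `--max-time` in the example makes a wrong URL or blocked IP fail quickly instead of hanging until the operating system gives up on the connection.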
Create an Airbyte connection in Metaplane
Head over to Metaplane, add a new connection via Settings -> Data Stack -> Add connection, and select Airbyte.
Select your deployment type (Cloud if you use Airbyte Cloud, or Self hosted otherwise) and enter the corresponding credentials you generated in the previous step (username/password for Self hosted and client ID + secret for Cloud).
For Self hosted Airbyte instances, you'll also need to enter the Airbyte Public API base URL that you identified.
What to expect
Metaplane will populate with all of the Airbyte Workspaces, Connections, and Streams that it has access to. You'll see both upstream and downstream warehouse lineage for Streams. You'll also see some useful metrics (which you can create monitors on) about the syncs that Airbyte ran for each Connection, including:
- Time since an Airbyte Connection last successfully synced
- Bytes + rows written per sync
- Duration of each sync