Metaplane automatically uses a combination of parsing queries from your warehouse and metadata available from 3rd party APIs of tools that you've connected in your account to generate a lineage diagram, similar to the one shown below. You're able to define the number of levels either upstream or downstream and which column you'd like to drill into through the dropdowns on the bottom left corner of your lineage graph view.
Metaplane is able to provide intra warehouse lineage, e.g. table → table and column → column connections, via two mechanisms. You can have any combination of these mechanisms enabled
- Metaplane can automatically ingest lineage table level lineage when dbt core or cloud is connected.
- When Metaplane has been given access to your warehouse query logs, we can parse SQL statements to build a column level lineage.
To build CLL, Metaplane parses your query history looking for any table or view
CREATE statements.Metaplane can parse complex sql statements to figure out what columns are related to one another.
Let’s take a look at a simple query to understand how CLL works in practice:
create or replace view db.schema.customers as SELECT ( SELECT count(distinct owner_id) from db.schema.deals ) as total_deals, deal_flow.deal_id FROM deal_flow WHERE deal_flow.amount > 100
In this example, we are creating a new view called
customers that has the columns
total_deals, deal_id . Our parser looks at each top level column and recursively determines what columns contribute data. Here, we can see that the
total_deals column is derived from the
deals.owner_id column and the
deal_id column is derived from the column with the same name in the
deal_flow table. We think of these as “DIRECT” lineage, since the columns are directly contributing their own data to the top level column. Metaplane is also able to parse out “FILTER” lineage, which we think of as any column that isn’t contributing data, but is being used to filter down a result. In the example above, we can see that
deal_flow.amount will be picked up as a “FILTER” column for
deal_id . By default, our application only shows “DIRECT” lineage.
Because we parse queries to determine lineage, there are cases where warehouse to warehouse lineage for columns can become stale over time. If 21 days pass with no update to the lineage, we will remove it the next time we see new lineage information for that column come in.
If you've connected any of the business intelligence tools to your Metaplane account, we'll automatically add the ddashboards (or equivalent terminology) and relations to objects in your warehouse to your lineage graph, typically found downstream (right hand side of graph).
If you've connected Fivetran to your Metaplane account, you'll be able to see which connectors load data into which tables in your warehouse, located upstream (left hand side of graph).
Updated about 2 months ago