Monitors as Code
Instead of managing monitors via the UI, Metaplane allows users to define monitors as part of the dbt model metadata. When Metaplane syncs your dbt connection, we ingest any monitors defined in the dbt model metadata and automatically perform the necessary creates, updates, and disables. These syncs normally run on an hourly cadence, but you can also kick off a manual sync from the connection page for dbt.
YAML Specification
Below is an annotated example of defining monitors in the dbt meta field.
meta:
  metaplane:
    createMonitors:
      defaultConfiguration: # optional, applies to all monitors
        timeWindowFilter:
          columnName: "createdAt"
          duration: P2D # any ISO 8601 duration combining days, hours, and minutes is allowed, e.g. "PT2H30M" is also valid
        cronSchedule: "0 * * * *" # optional, defaults to the account setting
      monitors: # list of monitors to apply, each can optionally have more specific configuration
        - ROW_COUNT
        - FRESHNESS
        - FRESHNESS:
            columnMatchers: # optional, only include specific columns, accepts regex
              includeColumns:
                - FIRST_ORDER
            configuration: # override top level configuration in this scope
              cronSchedule: "2 * * * *"
        - MAX:
            columnMatchers:
              includeColumns:
                - CUSTOMER_ID
            configuration:
              cronSchedule: "3 * * * *"
              where: "NUMBER_OF_ORDERS > 4" # optionally specify fully custom SQL added as a where clause
              groupBy: # optional list of columns to group results by
                - FIRST_NAME
        - MAX: # specify a monitor type multiple times to have different configurations for different columns
            columnMatchers:
              includeColumns:
                - NUMBER_OF_ORDERS
            configuration:
              cronSchedule: "4 * * * *"
              # optionally specify a list of manual rules to use instead of automatic anomaly detection
              # rule types are [GREATER_THAN, GREATER_THAN_EQUALS, LESS_THAN, LESS_THAN_EQUALS]
              # only one rule per type (e.g. LESS_THAN or LESS_THAN_EQUALS) can be used in a ruleset
              # value must be a valid double precision float
              manualRules:
                - GREATER_THAN:
                    value: 60.0
                - LESS_THAN:
                    value: 0.0
        - CUSTOM: # custom SQL monitors always require the "sql" field
            sql: "select * from schema.table1" # table names must be fully qualified in SQL
            name: "My Monitor Name 1" # required name of the custom SQL monitor to display in the app
            identifier: 1 # user supplied integer to uniquely identify custom SQL tests. Only needs to be unique within the scope of a single model
            configuration:
              anomalyRule: # optional, additional settings for anomaly detection
                # Controls the bounds of the model. The default value is 3.0.
                # 1.0 is the highest sensitivity, corresponding to the smallest bounds.
                # 6.0 is the lowest sensitivity, corresponding to the largest bounds.
                sensitivity: 4.0
        - CUSTOM:
            sql: "select * from schema.table2"
            description: "a nice description here" # optional description of the custom monitor
            name: "My Monitor Name 2" # required name of the custom SQL monitor to display in the app
            identifier: 2 # optional secondary identifier if you need duplicate names
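For context, here is a minimal sketch of where such a meta block could sit in a dbt properties file. The file path and the model name customers are hypothetical, and the example assumes meta is declared directly on the model; depending on your dbt version, meta may instead need to be nested under the model's config key.

```yaml
# models/schema.yml (hypothetical path)
version: 2

models:
  - name: customers # hypothetical model name
    meta:
      metaplane:
        createMonitors:
          monitors: # defaultConfiguration is optional and omitted here
            - ROW_COUNT
            - FRESHNESS
```

Keeping the monitor definitions next to the model they describe means they are versioned and reviewed together with the model itself.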
The supported monitor types are: [ROW_COUNT, CARDINALITY, UNIQUENESS, MAX, MIN, MEAN, STDDEV, FRESHNESS, CUSTOM, NULLNESS, COLUMN_COUNT, PERCENT_ZERO, PERCENT_NEGATIVE, SUM].
Any monitor listed without columnMatchers is automatically applied to every relevant target. For example, STDDEV only works on numeric columns, so it is applied to all of the numeric columns of the table. ROW_COUNT is a table level monitor, so it is only ever applied to the table.
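As a small illustration of this default behavior, the sketch below lists two monitor types with no columnMatchers. The comments restate the behavior described above; no other settings are assumed.

```yaml
meta:
  metaplane:
    createMonitors:
      monitors:
        - ROW_COUNT # table level monitor: applied to the table only
        - STDDEV # no columnMatchers: applied to every numeric column of the table
```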
The FRESHNESS monitor is a special case, since it is both a table level and a column level monitor. When FRESHNESS is specified without columnMatchers, we treat it as a table level monitor only. To apply FRESHNESS to a column, columnMatchers must be specified.
Adding columnMatchers to the monitor definition allows you to limit which columns the monitor is applied to. When specified, Metaplane adds monitors to the columns that match any of the specified includeColumns entries. You can specify the full column name or provide a regex to match on.
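The sketch below mixes a full column name with a regex in includeColumns. The column names and the pattern are hypothetical, and the text above does not state whether matching is case sensitive or anchored to the whole column name, so test a pattern against your own columns before relying on it.

```yaml
meta:
  metaplane:
    createMonitors:
      monitors:
        - NULLNESS:
            columnMatchers:
              includeColumns:
                - CUSTOMER_ID # full column name (hypothetical)
                - ".*_AT" # regex intended to match columns such as CREATED_AT (hypothetical)
```

Quoting the regex keeps YAML from interpreting special characters in the pattern.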