Monitors as Code

Instead of managing monitors in the UI, you can define them as part of your dbt model metadata. When Metaplane syncs your dbt connection, we ingest any monitors defined in the dbt model metadata and automatically create, update, or disable monitors to match. Syncs normally run hourly, but you can also kick off a manual sync from the dbt connection page.
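For orientation, here is a minimal sketch of where the meta block might sit in a dbt properties file. The file path and the model name customers are hypothetical, and placing meta directly under the model entry is only one of the locations dbt accepts, so check which one your dbt version prefers.

```yaml
# models/schema.yml (hypothetical path)
version: 2

models:
  - name: customers  # hypothetical model name
    meta:
      metaplane:
        createMonitors:
          monitors:
            - ROW_COUNT
            - FRESHNESS
```

The examples in the rest of this page show only the meta block itself, without the surrounding model entry.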

YAML Examples

Row Count & Freshness Monitors

meta:
  metaplane:
    createMonitors:
      monitors:
        - ROW_COUNT
        - FRESHNESS

Column-Level Monitors

meta:
  metaplane:
    createMonitors:
      monitors:
        - FRESHNESS:
            columnMatchers:
              includeColumns:
                - UPDATED_AT
        - MAX:
            name: "my custom name"
            columnMatchers:
              includeColumns:
                - CUSTOMER_ORDER_SIZE
            configuration:
              cronSchedule: "3 * * * *"
              where: "UPDATED_AT >= CURRENT_DATE - 7"
        - MIN:
            columnMatchers:
              includeColumns:
                - CUSTOMER_ORDER_SIZE
            configuration:
              cronSchedule: "4 * * * *"
              manualRules:
                - GREATER_THAN:
                    value: 60.0
                - LESS_THAN:
                    value: 15.0
        - NULLNESS:
            columnMatchers:
              includeColumns:
                - CUSTOMER_ID

Custom SQL Monitors

meta:
  metaplane:
    createMonitors:
      monitors: 
        - CUSTOM:
            sql: "select count(*) from schema1.table1 st1 join schema2.table1 st2 on st1.id = st2.id"
            description: "a nice description here"
            name: "My Monitor Name 1"
            identifier: 1
        - CUSTOM:
            sql: "select * from schema.table2"
            name: "My Monitor Name 2"
            identifier: 2
            configuration:
              anomalyRule:
                sensitivity: 3.0
                includeObservationsSince: 2024-07-07T13:33:41+00:00
                modelTypeOverride: ROW_COUNT
                modelClassType: STATIONARY
                modelBoundsOverride: UPPER_ONLY

Group By Monitors

meta:
  metaplane:
    createMonitors:
      monitors:
        - MAX:
            name: "my custom name"
            columnMatchers:
              includeColumns:
                - CUSTOMER_ORDER_SIZE
            configuration:
              cronSchedule: "3 * * * *"
              where: "NUMBER_OF_ORDERS > 4"
              groupBy:
                - FIRST_NAME
        - MIN:
            columnMatchers:
              includeColumns:
                - CUSTOMER_ORDER_SIZE
            configuration:
              groupBy:
                - FIRST_NAME
              manualRules:
                - LESS_THAN:
                    value: 15.0

YAML Specification

Below is an annotated example of defining monitors in the dbt meta field.

meta:
  metaplane:
    createMonitors:
      defaultConfiguration: # optional, applies to all monitors
        timeWindowFilter:
          columnName: "createdAt"
          duration: P2D # any ISO 8601 duration built from days, hours, and minutes is allowed, e.g. "PT2H30M"
        cronSchedule: "0 * * * *" # optional, defaults to the account setting
      defaultTags: # optional, applies these tags to all monitors
        - my_custom_tag # the tag does not need to exist in Metaplane already; it can be any name you like
      monitors: # List of monitors to apply, each can optionally have more specific configuration
        - ROW_COUNT
        - FRESHNESS
        - FRESHNESS:
            columnMatchers: # optional list to only include specific columns, accepts regex
              includeColumns:
                - FIRST_ORDER
            configuration: # override top level configuration in this scope
              cronSchedule: "2 * * * *"
              freshnessTimeZone: "America/New_York" # For column freshness monitors, if the column timestamp value is stored without a time zone, use this setting to specify which time zone to use. Valid timezone values are listed here: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
            tags:
              - another_custom_tag # this tag will only be applied to this FRESHNESS monitor
        - MAX:
            name: "my custom name" # any monitor can have a custom name
            columnMatchers:
              includeColumns:
                - CUSTOMER_ID
            configuration:
              cronSchedule: "3 * * * *"
              where: "NUMBER_OF_ORDERS > 4" # optionally specify fully custom sql added as a where clause
              groupBy: # optional list of columns to group results by
                - FIRST_NAME
        - MAX: # specify monitor type multiple times to have different configurations for different columns
            columnMatchers:
              includeColumns:
                - NUMBER_OF_ORDERS
            configuration:
              cronSchedule: "4 * * * *"
              # optionally specify a list of manual rules to use instead of automatic anomaly detection
              # rule types are [GREATER_THAN, GREATER_THAN_EQUALS, LESS_THAN, LESS_THAN_EQUALS]
              # only one rule per direction can be used in a ruleset (e.g. LESS_THAN or LESS_THAN_EQUALS, not both)
              # value must be a valid double-precision float
              manualRules:
                - GREATER_THAN:
                    value: 60.0
                - LESS_THAN:
                    value: 0.0
        - CUSTOM: # custom SQL monitors always require the "sql" field
            sql: "select * from schema.table1" # table names must be fully qualified in SQL
            name: "My Monitor Name 1" # required name of the custom SQL monitor, displayed in the app
            identifier: 1 # user-supplied integer that uniquely identifies a custom SQL monitor; only needs to be unique within a single model
            configuration:
              anomalyRule: # optional, additional settings for anomaly detection
                # Controls the width of the model's bounds. The default value is 3.0.
                # 1.0 is the highest sensitivity, corresponding to the narrowest bounds.
                # 6.0 is the lowest sensitivity, corresponding to the widest bounds.
                sensitivity: 4.0
                # optional: limit modeling to observations since this time
                # must be an ISO 8601 date-time with a time zone offset
                includeObservationsSince: 2024-07-07T13:33:41+00:00
                # optional: change the anomaly model used for the custom SQL monitor
                # one of [ROW_COUNT, FRESHNESS, CARDINALITY, NULLNESS, DURATION, MEAN, STDDEV]
                modelTypeOverride: ROW_COUNT
                # optional: override the class of the anomaly model
                # one of [STATIONARY]
                modelClassType: STATIONARY
                # optional: override the bounds of the anomaly model
                # one of [UPPER_ONLY, LOWER_ONLY]
                modelBoundsOverride: UPPER_ONLY
                
        - CUSTOM:
            sql: "select * from schema.table2"
            description: "a nice description here" # optional description of the custom monitor
            name: "My Monitor Name 2" # required name of the custom SQL monitor, displayed in the app; must be unique within a dbt model
            identifier: 2 # optional secondary identifier that lets you rename the monitor without downtime; must be unique within a dbt model

The supported monitor types are: [ROW_COUNT, CARDINALITY, UNIQUENESS, MAX, MIN, MEAN, STDDEV, FRESHNESS, CUSTOM, NULLNESS, COLUMN_COUNT, PERCENT_ZERO, PERCENT_NEGATIVE, SUM].
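The types that do not appear in the examples above can presumably be listed in the same way. A short sketch, assuming UNIQUENESS, PERCENT_ZERO, and SUM accept the same columnMatchers shape as MAX and MIN; the column names are hypothetical.

```yaml
meta:
  metaplane:
    createMonitors:
      monitors:
        - COLUMN_COUNT
        - UNIQUENESS:
            columnMatchers:
              includeColumns:
                - ORDER_ID        # hypothetical column name
        - PERCENT_ZERO:
            columnMatchers:
              includeColumns:
                - DISCOUNT_AMOUNT # hypothetical column name
        - SUM:
            columnMatchers:
              includeColumns:
                - ORDER_TOTAL     # hypothetical column name
```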

Any monitor listed without columnMatchers is automatically applied wherever it is relevant. For example, STDDEV only works on numeric columns, so it is applied to every numeric column of the table. ROW_COUNT is a table-level monitor, so it is only ever applied to the table itself.

The FRESHNESS monitor is a special case because it is both a table-level and a column-level monitor. When FRESHNESS is specified without columnMatchers, we treat it as a table-level monitor only. To apply FRESHNESS to a column, you must specify columnMatchers.
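As a small sketch of the scoping rules above, the fragment below lists FRESHNESS twice and STDDEV once. The column name LAST_SEEN_AT is hypothetical.

```yaml
meta:
  metaplane:
    createMonitors:
      monitors:
        - FRESHNESS             # no columnMatchers: table-level freshness only
        - FRESHNESS:            # with columnMatchers: column-level freshness
            columnMatchers:
              includeColumns:
                - LAST_SEEN_AT  # hypothetical column name
        - STDDEV                # no columnMatchers: applied to every numeric column
```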

Adding columnMatchers to a monitor definition limits which columns the monitor is applied to. When specified, Metaplane adds the monitor to every column that matches any of the includeColumns entries. Each entry can be a full column name or a regex to match against.
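A minimal sketch of mixing an exact name with a regex entry, assuming hypothetical columns such as CREATED_AT and UPDATED_AT. This page does not state the regex dialect, the case sensitivity, or whether a pattern must match the whole column name, so the anchors below are a defensive choice rather than a documented requirement.

```yaml
meta:
  metaplane:
    createMonitors:
      monitors:
        - NULLNESS:
            columnMatchers:
              includeColumns:
                - CUSTOMER_ID   # exact column name
                - "^.*_AT$"     # regex: intended to match CREATED_AT, UPDATED_AT, and similar
```

Quoting the pattern keeps YAML from interpreting any special characters in it.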