To start customizing a monitor, open the monitor's page and locate the configuration pane on the right-hand side of the screen. This pane lets you switch between automatic and manual anomaly detection, configure monitor sensitivity and frequency, test against rolling time windows, and specify which rows you want to monitor.
Metaplane offers two types of anomaly detection: automatic and manual.
Automatic anomaly detection is the default when you create a new monitor. With automatic anomaly detection, Metaplane builds a machine-learning model from your historical data, then compares the values it observes against that model's predictions. Automatic anomaly detection works best for tables or columns with periodic patterns, seasonality, and linear changes.
When using automatic anomaly detection, you can choose the period of time that Metaplane's models use in making their predictions in the Model data since field. By default the model uses the date that it began training, but there might be cases where you'd like to change this.
For example, let's say that the data in your table materially changed since the model first went through its training period. Maybe your team ran a huge backfill of data and now the row counts are off, or maybe your team cleaned up several duplicate records and the uniqueness spiked. In these cases, you may want to update the date in the Model data since field to be after the date the change happened. Then, in the next run, Metaplane will only use the more recent data to make its predictions.
Note: If you update the Model data since field, Metaplane may put the model back into training if it no longer has enough data to predict. This is more common if you update the field to a date that's less than 3-5 days from the current date.
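To make the effect of the Model data since field concrete, here is a minimal sketch of the idea, not Metaplane's implementation. The function names, the sample row counts, and the `MIN_TRAINING_POINTS` value are all assumptions made for illustration; the real amount of history Metaplane needs is not stated here.

```python
from datetime import date

# Hypothetical minimum number of observations needed before predicting.
# Metaplane's actual requirement is not documented in this section.
MIN_TRAINING_POINTS = 5

def usable_history(observations, model_data_since):
    """Keep only (day, value) pairs observed on or after the cutoff date."""
    return [(day, value) for day, value in observations if day >= model_data_since]

def needs_retraining(observations, model_data_since, min_points=MIN_TRAINING_POINTS):
    """Return True when too little history remains after applying the cutoff."""
    return len(usable_history(observations, model_data_since)) < min_points

# Daily row counts: a backfill on 8 March roughly doubles the table.
history = [(date(2024, 3, d), 1000 + d) for d in range(1, 8)]
history += [(date(2024, 3, d), 2000 + d) for d in range(8, 15)]

# Moving the cutoff to the backfill date drops the pre-backfill values.
after_backfill = usable_history(history, date(2024, 3, 8))

# A cutoff only a few days back leaves too little data to predict from.
too_recent = needs_retraining(history, date(2024, 3, 12))
```

In this sketch, a cutoff on the backfill date keeps seven days of post-backfill values, while a cutoff three days before the last observation would send the hypothetical model back into training.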
Manual anomaly detection, on the other hand, relies on your understanding of the desired behavior for the underlying metadata through upper and/or lower thresholds you apply to the monitor. Manual anomaly detection works best for tables and columns where you have strict requirements about the desired behavior. Some examples might be:
- If you have columns that should never be null, you could set the upper threshold for a nullness monitor to 0%. This means that Metaplane will trigger an incident any time the nullness is greater than 0%.
- If you have columns that should always be unique, you could set the lower threshold for a uniqueness monitor to 100%. This means that Metaplane will trigger an incident any time the uniqueness is less than 100%.
- If you have a table that doesn't get updated on any fixed schedule, but really should be updated at least every two weeks, you could set an upper threshold of 14 days. This means that Metaplane will trigger an incident any time the table goes more than 14 days without an update.
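The three examples above all reduce to the same comparison against an optional lower and upper bound. The following is a rough sketch of that logic under the assumption that a value exactly equal to a threshold does not trigger an incident; `breaches_threshold` is a hypothetical helper, not a Metaplane function.

```python
from typing import Optional

def breaches_threshold(
    value: float,
    lower: Optional[float] = None,
    upper: Optional[float] = None,
) -> bool:
    """Return True when the value falls outside the configured thresholds."""
    if lower is not None and value < lower:
        return True
    if upper is not None and value > upper:
        return True
    return False

# Nullness monitor with an upper threshold of 0%: any nulls are an incident.
nullness_incident = breaches_threshold(0.5, upper=0.0)

# Uniqueness monitor with a lower threshold of 100%: any duplicate is an incident.
uniqueness_incident = breaches_threshold(99.9, lower=100.0)

# Freshness monitor with an upper threshold of 14 days.
freshness_incident = breaches_threshold(15, upper=14)
freshness_ok = breaches_threshold(14, upper=14)
```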
The sensitivity setting lets you tune the expected range generated by the machine learning model.
Increasing the sensitivity will decrease the size of the model thresholds. This is perfect for instances where you want to know about more subtle variations in the values Metaplane observes. For example, let's say you have a distribution monitor on a column that's critically important to some business logic, and you want to know about even minute changes to the standard deviation or mean. Increasing the sensitivity might be the way to go.
Decreasing the sensitivity will increase the size of the model thresholds. This is ideal for situations where you want to only get alerts about large variations in the values Metaplane observes. For example, let's say you have a table where it’s fine if the row count fluctuates a lot on a daily basis, but you’d want to know if it suddenly dropped to zero. Decreasing the sensitivity would enable this behavior.
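One way to picture the relationship between sensitivity and threshold size is a band around the historical mean whose width shrinks as sensitivity rises. This is only an illustration: Metaplane's actual model is not described here, and the `BAND_WIDTH` mapping, the level names, and the sample row counts are assumptions.

```python
from statistics import mean, stdev

# Hypothetical mapping from sensitivity level to band width,
# measured in standard deviations. Higher sensitivity, narrower band.
BAND_WIDTH = {"low": 4.0, "medium": 3.0, "high": 2.0}

def expected_range(history, sensitivity="medium"):
    """Return (lower, upper) bounds around the mean of past values."""
    center = mean(history)
    spread = stdev(history)
    width = BAND_WIDTH[sensitivity]
    return center - width * spread, center + width * spread

def is_anomaly(value, history, sensitivity="medium"):
    """Flag a value that falls outside the expected range."""
    lower, upper = expected_range(history, sensitivity)
    return value < lower or value > upper

row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]

# A modest bump is flagged only when sensitivity is high.
bump_at_high = is_anomaly(1040, row_counts, "high")
bump_at_low = is_anomaly(1040, row_counts, "low")

# A drop to zero is flagged even at the lowest sensitivity.
drop_at_low = is_anomaly(0, row_counts, "low")
```

With these sample values, a row count of 1040 falls outside the narrow high-sensitivity band but inside the wide low-sensitivity band, while a drop to zero is outside both.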
Best practices for model sensitivity
As you begin to tweak the alert sensitivities for your Metaplane monitors, we’d recommend starting small. Adjust the sensitivity up or down a notch, then wait for the monitor’s next regularly scheduled run (or, if you want a quicker result, you can run the monitor manually). As the results come in, see if they’re in line with the adjustment you were hoping for, and if not, continue to tweak the sensitivity until it is.
By default, Metaplane runs monitors every hour. You can change your organization-wide default on the account page. When changing the schedule, you can pick one of the preset options or define your own cron schedule.
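For readers new to cron, the sketch below splits a few example expressions into their fields. It assumes the standard five-field cron format (minute, hour, day of month, month, day of week); the exact syntax Metaplane accepts is not stated here, so check the schedule picker before relying on these. `describe_cron` is a hypothetical helper.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")

def describe_cron(expression):
    """Split a five-field cron expression into named fields."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))

# At minute 0 of every hour (matches the hourly default described above).
hourly = describe_cron("0 * * * *")

# At minute 0 of every sixth hour.
every_six_hours = describe_cron("0 */6 * * *")

# At 06:30 on Monday through Friday.
weekday_mornings = describe_cron("30 6 * * 1-5")
```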
Adding a time window filter to your monitor is a useful way to limit the total amount of data Metaplane needs to scan or ensure that old data isn’t being included in the observation. The monitor can be filtered by any datetime column on the table.
Once selected, this effectively adds a WHERE clause to the monitor's SQL. For example, a one-day window on an orders table filters out any orders from more than a day ago. Note that the time window is rolling, not calendar-based.
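The difference between a rolling and a calendar-based window can be sketched with an in-memory SQLite table. The `orders` table, the `created_at` column, and the query shape are assumptions for illustration; the SQL Metaplane generates for your warehouse will differ.

```python
import sqlite3
from datetime import datetime, timedelta

# Fixed "current time" so the example is repeatable.
now = datetime(2024, 3, 15, 9, 0, 0)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "2024-03-13 12:00:00"),  # about two days old: outside both windows
        (2, "2024-03-14 20:00:00"),  # 13 hours old: yesterday by the calendar
        (3, "2024-03-15 08:00:00"),  # 1 hour old: inside both windows
    ],
)

def count_since(cutoff):
    """Count orders created at or after the cutoff timestamp."""
    row = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE created_at >= ?",
        (cutoff.strftime("%Y-%m-%d %H:%M:%S"),),
    ).fetchone()
    return row[0]

# Rolling one-day window: everything from the last 24 hours.
rolling_count = count_since(now - timedelta(days=1))

# Calendar-based window, shown only for contrast: everything since midnight.
calendar_count = count_since(now.replace(hour=0, minute=0, second=0))
```

Here the rolling window includes the order placed yesterday evening because it is less than 24 hours old, whereas a calendar-based window starting at midnight would exclude it.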
Metaplane also lets you provide your own SQL expression, which is added to the monitor's WHERE clause so that monitoring focuses on the segments you designate. This can be used independently or in conjunction with the other filters.
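As a rough sketch of how a custom expression might combine with a time window, the example below joins both conditions with AND in a single WHERE clause. The table, columns, filter expression, and `monitored_row_count` helper are hypothetical, and how Metaplane actually composes the clause is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, region TEXT, status TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "EU", "complete", "2024-03-15 08:00:00"),
        (2, "US", "complete", "2024-03-15 08:30:00"),
        (3, "EU", "test", "2024-03-15 08:45:00"),
        (4, "EU", "complete", "2024-03-10 08:00:00"),
    ],
)

def monitored_row_count(custom_filter=None, window_start=None):
    """Count rows matching an optional time window and custom SQL filter."""
    clauses, params = [], []
    if window_start is not None:
        clauses.append("created_at >= ?")
        params.append(window_start)
    if custom_filter:
        # The expression is inserted as raw SQL, so it must come from a trusted user.
        clauses.append(f"({custom_filter})")
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    return conn.execute(f"SELECT COUNT(*) FROM orders{where}", params).fetchone()[0]

segment = "region = 'EU' AND status <> 'test'"

all_rows = monitored_row_count()
segment_only = monitored_row_count(custom_filter=segment)
segment_recent = monitored_row_count(segment, "2024-03-14 09:00:00")
```

Wrapping the custom expression in parentheses keeps any OR inside it from changing how it combines with the time window condition.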