Amazon Lookout for Metrics uses machine learning (ML) to automatically detect and diagnose anomalies (outliers from the norm) without requiring any prior ML experience. Amazon CloudWatch provides you with actionable insights to monitor your applications, respond to system-wide performance changes, optimize resource utilization, and get a unified view of operational health.

This post demonstrates how you can seamlessly connect to your data in CloudWatch to set up a highly accurate anomaly detector across metrics, dimensions, and namespaces of your choice using Lookout for Metrics. The solution allows you to set up a continuous anomaly detector and optionally set up alerts to receive notifications when anomalies occur.

Solution overview

The following diagram shows the architecture of our continuous detection system.

To implement our solution, we complete the following high-level steps:

  1. Create an anomaly detector with Lookout for Metrics.
  2. Add a dataset to the detector and define the CloudWatch metrics.
  3. Activate the detector.
  4. Create an alert.
  5. Review detector status.
  6. Review and analyze any found anomalies.

The dataset used for this post comes from an Amazon API Gateway-based service whose APIs emit metrics such as Latency, 4XXError, 5XXError, and request count, all available through CloudWatch.

Create an anomaly detector with Lookout for Metrics

To create your anomaly detector, complete the following steps:

  1. On the Lookout for Metrics console, choose Create detector.
  2. For Detector name, enter a name.
  3. For Description, enter an optional description.
  4. For Interval, choose the time between each analysis.
  5. In the Encryption section, you can optionally have Lookout for Metrics encrypt your data with an AWS Key Management Service (AWS KMS) key.
  6. Optionally, add tags in the Tags section.
  7. Choose Create.
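
The same detector can also be created programmatically. The following is a minimal sketch rather than the method used in this post: it assumes a boto3 `lookoutmetrics` client and its `create_anomaly_detector` operation, and the helper names and default interval are illustrative assumptions. The request is built as a plain dictionary so you can inspect it before making any call.

```python
from typing import Any, Dict, Optional

# Interval values expressed as ISO 8601 durations (5 min, 10 min, 1 hour, 1 day).
ALLOWED_FREQUENCIES = {"PT5M", "PT10M", "PT1H", "P1D"}


def build_detector_request(
    name: str,
    description: str = "",
    frequency: str = "PT1H",            # assumed default, not from the post
    kms_key_arn: Optional[str] = None,  # encryption key is optional
    tags: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for create_anomaly_detector."""
    if frequency not in ALLOWED_FREQUENCIES:
        raise ValueError(f"unsupported interval: {frequency}")
    request: Dict[str, Any] = {
        "AnomalyDetectorName": name,
        "AnomalyDetectorConfig": {"AnomalyDetectorFrequency": frequency},
    }
    # Optional console fields are only sent when they are provided.
    if description:
        request["AnomalyDetectorDescription"] = description
    if kms_key_arn:
        request["KmsKeyArn"] = kms_key_arn
    if tags:
        request["Tags"] = dict(tags)
    return request


def create_detector(client: Any, **options: Any) -> str:
    """Create the detector and return its ARN.

    `client` is expected to behave like boto3.client("lookoutmetrics").
    """
    response = client.create_anomaly_detector(**build_detector_request(**options))
    return response["AnomalyDetectorArn"]
```

For example, `create_detector(client, name="api-gateway-detector", frequency="PT5M")` would return the ARN that the later steps refer to; the detector name here is a placeholder.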

Add a dataset to the detector and define the CloudWatch metrics

After you create the anomaly detector, a banner appears that confirms its creation. You can then add a dataset to your newly created detector.

  1. Choose Add a dataset, either on the banner or on the detector details page.
  2. For Name, enter a name for the dataset.
  3. Optionally, enter a description and choose a time zone.
  4. For Datasource, choose the data source that stores your data.

Lookout for Metrics supports multiple data sources. For this post, we use CloudWatch.

  5. Choose Next.

We now define the relevant CloudWatch metrics.

  6. For Namespace, choose the CloudWatch namespace to use with the dataset (for this post, we choose ApiGateway).

Lookout for Metrics automatically populates this list with all the available namespaces for your account.

  7. For Dimensions, choose up to five dimensions within your CloudWatch namespace.

Lookout for Metrics pre-populates the available dimensions for the chosen namespace.

  8. For Metric, choose up to five metrics to monitor.

These metrics should also belong to the same namespace.

  9. Choose Next.
  10. Review the details.
  11. Choose Save dataset to save the dataset settings.
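
The dataset definition above maps onto a single request if you prefer the API. The following sketch assumes the boto3 `lookoutmetrics` operation `create_metric_set` with a `CloudWatchConfig` metric source. The helper name, the IAM role ARN, the `AWS/ApiGateway` namespace string, and the `ApiName` dimension are illustrative assumptions; the five-item limits come from the console steps above.

```python
from typing import Any, Dict, Optional, Sequence


def build_metric_set_request(
    detector_arn: str,
    name: str,
    namespace: str,
    metrics: Dict[str, str],      # metric name -> aggregation ("AVG" or "SUM")
    dimensions: Sequence[str],
    role_arn: str,                # role that lets the service read CloudWatch
    frequency: str = "PT5M",      # assumed; should match the detector interval
    timezone: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for create_metric_set with a CloudWatch source."""
    if not 1 <= len(metrics) <= 5:
        raise ValueError("choose between one and five metrics")
    if len(dimensions) > 5:
        raise ValueError("choose at most five dimensions")
    for metric, aggregation in metrics.items():
        if aggregation not in ("AVG", "SUM"):
            raise ValueError(f"unsupported aggregation for {metric}: {aggregation}")

    request: Dict[str, Any] = {
        "AnomalyDetectorArn": detector_arn,
        "MetricSetName": name,
        # Every metric is tied to the same namespace, as in the console flow.
        "MetricList": [
            {"MetricName": m, "AggregationFunction": a, "Namespace": namespace}
            for m, a in metrics.items()
        ],
        "DimensionList": list(dimensions),
        "MetricSetFrequency": frequency,
        "MetricSource": {"CloudWatchConfig": {"RoleArn": role_arn}},
    }
    if timezone:
        request["Timezone"] = timezone
    return request
```

With a client in hand, `client.create_metric_set(**build_metric_set_request(...))` would submit the request. For the API Gateway service in this post, the metrics argument might look like `{"Latency": "AVG", "5XXError": "SUM"}`.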

Activate the detector

Now that the dataset is created, we activate the detector.

  1. On the details page for the detector, choose Activate or Activate detector.
  2. Choose Activate to confirm that you want to activate the detector for continuous detection.

A message appears to confirm that the detector is activating.
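
Activation can be scripted as well. The sketch below assumes the boto3 `lookoutmetrics` operations `activate_anomaly_detector` and `describe_anomaly_detector`, and it polls until the detector leaves the `ACTIVATING` state. The function name, polling interval, and retry count are illustrative assumptions.

```python
import time
from typing import Any, Callable


def activate_and_wait(
    client: Any,
    detector_arn: str,
    poll_seconds: float = 30.0,   # assumed polling interval
    max_polls: int = 20,          # assumed upper bound on status checks
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Activate a detector and return the first status after ACTIVATING.

    `client` is expected to behave like boto3.client("lookoutmetrics").
    If the detector is still activating after `max_polls` checks,
    "ACTIVATING" is returned so the caller can decide what to do.
    """
    client.activate_anomaly_detector(AnomalyDetectorArn=detector_arn)
    status = "ACTIVATING"
    for _ in range(max_polls):
        response = client.describe_anomaly_detector(AnomalyDetectorArn=detector_arn)
        status = response["Status"]
        if status != "ACTIVATING":
            break
        sleep(poll_seconds)
    return status
```

The `sleep` parameter is injectable so the loop can be exercised without real waiting.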

Create an alert

At any time before or after you activate the detector, you can create an alert.

  1. In the navigation pane, choose Alerts.
  2. For Alert name, enter a name.
  3. For Severity threshold, choose the minimum severity score that an anomaly must reach to trigger the alert.
  4. For Channel, choose either Amazon Simple Notification Service (Amazon SNS) or AWS Lambda as the notification method.

For this post, we use Amazon SNS.

  5. Choose Add alert.
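
An equivalent alert can be defined through the API. The following sketch assumes the boto3 `lookoutmetrics` operation `create_alert`, whose `Action` parameter takes either an `SNSConfiguration` or a `LambdaConfiguration`. The helper name and all ARNs are placeholders.

```python
from typing import Any, Dict, Optional


def build_alert_request(
    detector_arn: str,
    name: str,
    threshold: int,
    role_arn: str,                        # role the service assumes to notify
    sns_topic_arn: Optional[str] = None,
    lambda_arn: Optional[str] = None,
    description: str = "",
) -> Dict[str, Any]:
    """Build keyword arguments for create_alert with exactly one channel."""
    if not 0 <= threshold <= 100:
        raise ValueError("severity threshold must be between 0 and 100")
    # Exactly one of the two channels must be chosen.
    if (sns_topic_arn is None) == (lambda_arn is None):
        raise ValueError("choose exactly one of sns_topic_arn or lambda_arn")

    if sns_topic_arn is not None:
        action = {"SNSConfiguration": {"RoleArn": role_arn, "SnsTopicArn": sns_topic_arn}}
    else:
        action = {"LambdaConfiguration": {"RoleArn": role_arn, "LambdaArn": lambda_arn}}

    request: Dict[str, Any] = {
        "AlertName": name,
        "AlertSensitivityThreshold": threshold,
        "AnomalyDetectorArn": detector_arn,
        "Action": action,
    }
    if description:
        request["AlertDescription"] = description
    return request
```

Passing the result to `client.create_alert(**request)` would register the alert. The threshold value to use is a judgment call: a lower value produces more notifications.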

Review detector status

When the anomaly detector is active, you can use the Detector log tab on the detector details page to review the detector runs that Lookout for Metrics has performed.
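
Run history can also be retrieved programmatically. The sketch below assumes the boto3 `lookoutmetrics` operation `describe_anomaly_detection_executions`, which we understand to return an `ExecutionList` whose entries carry a `Timestamp`, a `Status`, and, for failed runs, a `FailureReason`. The function name and the result layout are illustrative assumptions.

```python
from collections import Counter
from typing import Any, Dict


def summarize_runs(client: Any, detector_arn: str, max_results: int = 25) -> Dict[str, Any]:
    """Count detector runs by status and collect the failed ones.

    `client` is expected to behave like boto3.client("lookoutmetrics").
    Only the first page of results is inspected in this sketch.
    """
    response = client.describe_anomaly_detection_executions(
        AnomalyDetectorArn=detector_arn,
        MaxResults=max_results,
    )
    executions = response.get("ExecutionList", [])
    counts = Counter(run.get("Status", "UNKNOWN") for run in executions)
    # Keep timestamp and reason for any run whose status starts with FAILED.
    failures = [
        (run.get("Timestamp"), run.get("FailureReason"))
        for run in executions
        if run.get("Status", "").startswith("FAILED")
    ]
    return {"counts": dict(counts), "failures": failures}
```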

You can also choose View anomalies on the detector details page to manually inspect anomalies that may have been detected.

On the Anomalies page, you can adjust the severity score threshold on the threshold dial to filter anomalies above a given score.
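
The threshold dial has an API counterpart. The following sketch assumes the boto3 `lookoutmetrics` operation `list_anomaly_group_summaries`, which takes a `SensitivityThreshold` and returns pages of `AnomalyGroupSummaryList` entries. The function name and the default threshold of 70 are illustrative assumptions.

```python
from typing import Any, Dict, List


def list_anomaly_groups(
    client: Any,
    detector_arn: str,
    threshold: int = 70,   # assumed default; mirrors the console threshold dial
) -> List[Dict[str, Any]]:
    """Return anomaly groups at or above the threshold, highest score first.

    `client` is expected to behave like boto3.client("lookoutmetrics").
    """
    groups: List[Dict[str, Any]] = []
    kwargs: Dict[str, Any] = {
        "AnomalyDetectorArn": detector_arn,
        "SensitivityThreshold": threshold,
    }
    while True:
        page = client.list_anomaly_group_summaries(**kwargs)
        groups.extend(page.get("AnomalyGroupSummaryList", []))
        token = page.get("NextToken")
        if not token:
            break
        # Follow pagination until the service stops returning a token.
        kwargs["NextToken"] = token
    return sorted(groups, key=lambda g: g["AnomalyGroupScore"], reverse=True)
```

Sorting by `AnomalyGroupScore` puts the anomalies that most need attention at the top of the list.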

Review and analyze anomalies

When it detects an anomaly, Lookout for Metrics assigns a severity score to help you prioritize what matters most. To help you find the root cause, it groups anomalies that may be related to the same incident and summarizes the different sources of impact.

In the following screenshot, the anomaly in latency on June 7 at 20:00 GMT had a severity score of 86, indicating a high-severity anomaly that needs immediate attention. The impact analysis also tells you that the primary API impacted was ListMetricSets.

Lookout for Metrics also allows you to provide real-time feedback on the relevance of the detected anomalies, which enables a powerful human-in-the-loop mechanism. This information is fed back to the anomaly detection model to improve its accuracy continuously, in near-real time.
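
Feedback can be submitted through the API as well as the console. The sketch below assumes the boto3 `lookoutmetrics` operation `put_feedback`, which takes an `AnomalyGroupTimeSeriesFeedback` structure. The helper names and the IDs in the usage note are placeholders.

```python
from typing import Any, Dict


def build_feedback_request(
    detector_arn: str,
    anomaly_group_id: str,
    time_series_id: str,
    is_anomaly: bool,
) -> Dict[str, Any]:
    """Build keyword arguments for put_feedback on one time series."""
    return {
        "AnomalyDetectorArn": detector_arn,
        "AnomalyGroupTimeSeriesFeedback": {
            "AnomalyGroupId": anomaly_group_id,
            "TimeSeriesId": time_series_id,
            # True marks the detection as relevant, False as not relevant.
            "IsAnomaly": bool(is_anomaly),
        },
    }


def send_feedback(client: Any, **fields: Any) -> None:
    """Submit relevance feedback.

    `client` is expected to behave like boto3.client("lookoutmetrics").
    """
    client.put_feedback(**build_feedback_request(**fields))
```

For instance, calling `send_feedback` with `is_anomaly=False` for a given group and time series would flag that detection as not relevant.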

Conclusion

You can seamlessly connect to your data in CloudWatch to set up a highly accurate anomaly detector across metrics, dimensions, and namespaces of your choice using Lookout for Metrics.

To get started with this capability, see Using Amazon CloudWatch with Lookout for Metrics. You can use this capability in all Regions where Lookout for Metrics is publicly available. For more information about Region availability, see AWS Regional Services.


About the Authors

Ankita Verma is the Product Lead for Amazon Lookout for Metrics. Her current focus is helping businesses make data-driven decisions using AI and ML. Outside of AWS, she is a fitness enthusiast, and loves mentoring budding product managers and entrepreneurs in her free time. She also publishes a weekly product management newsletter called The Product Mentors on Substack.

Raj Vippagunta is a Senior SDE at AWS AI Services. He uses his vast experience in large-scale distributed systems and his passion for machine learning to build practical service offerings in the AI space. He has helped build various solutions for AWS and Amazon. In his spare time, he likes reading books and watching travel and cuisine vlogs from across the world.

This post was first published on: AWS