GCP Cloud Monitoring Metrics

For GCP-managed services, GCP captures metrics in Cloud Monitoring (formerly Stackdriver). Nullstone displays these metrics in real time through the Nullstone UI.

To retrieve the relevant metrics for an infrastructure component, Nullstone relies on workspace outputs from Terraform modules, which tell it how to authenticate and which metrics to query. For example, the GKE Service module, which creates container apps, emits outputs that let Nullstone read CPU and memory metrics.

This reference describes the required outputs and their formats so that you can update and extend Terraform modules with the metrics you need.

Required Outputs

To enable real-time monitoring for any workspace in Nullstone that uses GCP Cloud Monitoring metrics, emit the following outputs from the Terraform module.

metrics_provider
metrics_reader
metrics_mappings

hcl

output "metrics_provider" {
  value = "cloudmonitoring"
}
output "metrics_reader" {
  value = {
    email       = "<gcp-service-account-email>"
    private_key = "<gcp-service-account-private-key>"
  }
}
output "metrics_mappings" {
  value = [
    // See below for syntax
  ]
}

TIP

Capability modules are attached to applications and don't have their own workspaces. As a result, they support only the metrics_mappings output and must use the same metrics provider as the app they are attached to. A capability's metrics are shown in the attached application's dashboard.

`metrics_reader`

The metrics reader is a GCP service account with read-only access limited to the metrics it needs. The following IAM membership is recommended for the metrics reader.

hcl

resource "google_project_iam_member" "metrics_reader_metrics_viewer" {
  project = local.project_id
  role    = "roles/monitoring.viewer"
  member  = "serviceAccount:${google_service_account.metrics_reader.email}"
}
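
The membership above refers to a `google_service_account.metrics_reader` resource that is not shown. A minimal sketch of how that account, a key for it, and the `metrics_reader` output might fit together is given below. The `account_id` and `display_name` values are hypothetical, and `local.project_id` is assumed to be defined elsewhere in the module. The `private_key` attribute of `google_service_account_key` is base64-encoded JSON credentials; whether Nullstone expects that form or a decoded one is not stated here, so treat the output wiring as an assumption to verify.

```hcl
# Assumption: illustrative names only; adapt to your module's conventions.
resource "google_service_account" "metrics_reader" {
  project      = local.project_id
  account_id   = "metrics-reader" # hypothetical ID
  display_name = "Metrics Reader" # hypothetical display name
}

# Generates a key for the service account.
# The provider marks private_key as sensitive.
resource "google_service_account_key" "metrics_reader" {
  service_account_id = google_service_account.metrics_reader.name
}

output "metrics_reader" {
  value = {
    email = google_service_account.metrics_reader.email
    # Assumption: passed through as-is (base64-encoded JSON credentials).
    private_key = google_service_account_key.metrics_reader.private_key
  }

  # Terraform requires this flag when an output contains a sensitive value.
  sensitive = true
}
```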

`metrics_mappings`

metrics_mappings controls how metrics are queried and displayed in the workspace dashboard.

TIP

See Chart Types for Metrics for reference on chart types and which metrics are necessary for each chart.

Format

hcl

locals {
  metrics_mappings = [
    { // Each group represents a single chart to display metrics
      name = "" // Used for identification and UI chart title
      type = "" // Type of chart (see below for a listing of chart types)
      unit = "" // Unit of measurement (displayed in chart UI)
      
      mappings = { // Map of each series displayed on the chart
        "<metric_id>" = { // metric_id should be of the form "<chart-name>_<metric>"
          query = "" // required
        }
        // ... more metric series
      }
    }
    // ... more metric groups
  ]
}

Chart Types

Metric Group

name - The name that will be displayed in the chart. Also used as a prefix for metric_id in each series.
type - The type of chart to display in the Nullstone UI. See Chart Types for Metrics.
unit - Used for display purposes only. It tells the user which unit each measurement is expressed in.

Metric Series

"metric_id" (string: <group-name>_<metric>) - Each series needs a unique ID. <group-name> must match the parent metric group name to be included in the chart display. <metric> must match a reserved metric name based on the chart type to be included in the chart display. For example, a usage chart requires one of reserved, average, min, or max. See Chart Types for Metrics for a list of available <metric> for each chart type.
query (string) - A PromQL query that produces a single time series. For information on what metric names are available in GCP Cloud Monitoring via PromQL, refer to Using PromQL for Cloud Monitoring metrics. For differences and limitations, refer to PromQL compatibility.
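
The "single time series" requirement on query is easiest to see by comparing a bare selector with an aggregated one. The sketch below uses a metric name taken from the example in this reference; the local names and the namespace value are hypothetical and exist only for illustration.

```hcl
locals {
  # Hypothetical filter used only for this illustration.
  example_filter = "monitored_resource=\"k8s_container\",namespace_name=\"my-namespace\""

  # A bare selector generally returns one series per matching container,
  # so it is not suitable as a series query on its own.
  per_container_query = "kubernetes_io:container_cpu_request_cores{${local.example_filter}}"

  # Wrapping the selector in an aggregation (avg, min, max, sum, ...)
  # collapses the matches into a single series.
  single_series_query = "avg(kubernetes_io:container_cpu_request_cores{${local.example_filter}})"
}
```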

Example

The following example provides a single usage chart to display CPU usage for a GKE service.

hcl

locals {
  pod_name_regex = "^${local.app_name}-[0-9a-f]{10}-.*$"
  query_filter   = "monitored_resource=\"k8s_container\",cluster_name=\"${local.cluster_name}\",namespace_name=\"${local.app_namespace}\",pod_name=~\"${local.pod_name_regex}\""
  
  metrics_mappings = [
    {
      name = "app/cpu"
      type = "usage"
      unit = "cores"

      mappings = {
        cpu_reserved = {
          query = "avg(kubernetes_io:container_cpu_request_cores{${local.query_filter}})"
        }
        cpu_average = {
          query = "(avg(kubernetes_io:container_cpu_request_utilization{${local.query_filter}}))*(avg(kubernetes_io:container_cpu_request_cores{${local.query_filter}}))"
        }
        cpu_min = {
          query = "(min(kubernetes_io:container_cpu_request_utilization{${local.query_filter}}))*(avg(kubernetes_io:container_cpu_request_cores{${local.query_filter}}))"
        }
        cpu_max = {
          query = "(max(kubernetes_io:container_cpu_request_utilization{${local.query_filter}}))*(avg(kubernetes_io:container_cpu_request_cores{${local.query_filter}}))"
        }
      }
    }
  ]
}
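
A memory chart can follow the same pattern. The sketch below is an illustration under several assumptions rather than the GKE Service module's actual configuration: it reuses `local.query_filter` and `local.metrics_mappings` from the example above, the local name `memory_chart` is hypothetical, and the `memory_type="non-evictable"` filter is one reasonable choice for excluding reclaimable memory. It also shows one way to emit the combined list through the `metrics_mappings` output.

```hcl
locals {
  # Assumption: count only non-evictable memory toward usage.
  memory_filter = "memory_type=\"non-evictable\",${local.query_filter}"

  # Hypothetical second chart, shaped like the CPU chart above.
  memory_chart = {
    name = "app/memory"
    type = "usage"
    unit = "bytes"

    mappings = {
      memory_reserved = {
        query = "avg(kubernetes_io:container_memory_request_bytes{${local.query_filter}})"
      }
      memory_average = {
        query = "avg(kubernetes_io:container_memory_used_bytes{${local.memory_filter}})"
      }
      memory_min = {
        query = "min(kubernetes_io:container_memory_used_bytes{${local.memory_filter}})"
      }
      memory_max = {
        query = "max(kubernetes_io:container_memory_used_bytes{${local.memory_filter}})"
      }
    }
  }
}

output "metrics_mappings" {
  # Appends the memory chart to the CPU chart defined in local.metrics_mappings.
  value = concat(local.metrics_mappings, [local.memory_chart])
}
```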
