CloudWatch Metrics¶
Important
Available since v1.1.0
A CloudWatch using GetMetricData can be used to obtain measurements for analysis.
Setup¶
You can use CloudWatch Metrics if you have used to EKS or not. This analysis is required IAM permission for cloudwatch:GetMetricData
and you need to define AWS_REGION
in Deployment for argo-rollouts
.
EKS¶
If you create new cluster on EKS, you can attach cluster IAM role or attach IAM roles for service accounts. If you have already cluster on EKS, you can attach IAM roles for service accounts.
not EKS¶
You need to define access key and secret key.
apiVersion: v1
kind: Secret
metadata:
name: cloudwatch-secret
type: Opaque
stringData:
AWS_ACCESS_KEY_ID: <aws-access-key-id>
AWS_SECRET_ACCESS_KEY: <aws-secret-access-key>
AWS_REGION: <aws-region>
apiVersion: apps/v1
kind: Deployment
metadata:
name: argo-rollouts
spec:
template:
spec:
containers:
- name: argo-rollouts
env:
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
name: cloudwatch-secret
key: AWS_ACCESS_KEY_ID
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
name: cloudwatch-secret
key: AWS_SECRET_ACCESS_KEY
- name: AWS_REGION
valueFrom:
secretKeyRef:
name: cloudwatch-secret
key: AWS_REGION
Configuration¶
metricDataQueries
- GetMetricData query: MetricDataQueryinterval
- optional interval, e.g. 30m, default: 5m
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: success-rate
spec:
metrics:
- name: success-rate
interval: 1m
successCondition: "len(result[0].Values) >= 5 and all(result[0].Values, {# <= 0.01})"
failureLimit: 3
provider:
cloudWatch:
interval: 30m
metricDataQueries:
- {
"id": "rate",
"expression": "errors / requests"
}
- {
"id": "errors",
"metricStat": {
"metric": {
"namespace": "app",
"metricName": "errors"
},
"period": 300,
"stat": "Sum",
"unit": "Count"
},
"returnData": false
}
- {
"id": "requests",
"metricStat": {
"metric": {
"namespace": "app",
"metricName": "requests"
},
"period": 300,
"stat": "Sum",
"unit": "Count"
},
"returnData": false
}
debug¶
You can confirm the results value in AnalysisRun
.
$ kubectl get analysisrun/rollouts-name-xxxxxxxxxx-xx -o yaml
(snip)
status:
metricResults:
- count: 2
failed: 1
measurements:
- finishedAt: "2021-09-08T17:29:14Z"
phase: Failed
startedAt: "2021-09-08T17:29:13Z"
value: '[[0.0029476787030213707 0.006100422336931018 0.01020408163265306 0.007932573128408527
0.00589622641509434 0.006339144215530904]]'
- finishedAt: "2021-09-08T17:30:14Z"
phase: Successful
startedAt: "2021-09-08T17:30:14Z"
value: '[[0.004484304932735426 0.0058374494836102376 0.006736068585425597 0.008444444444444444
0.006859756097560976 0.0045385779122541605]]'
name: success-rate
phase: Running
successful: 1
phase: Running
startedAt: "2021-09-08T17:29:14Z"