When to use this runbook?
Use this runbook when monitoring the Secret Detection Service to identify and mitigate reliability issues or performance regressions that may occur when it is enabled on GitLab.com and/or Dedicated.
What to monitor?
We primarily need to monitor system metrics and recurrent errors raised within the service. Here is the narrowed-down list of monitoring targets:
- Resource Saturation: Saturation is the fraction of a finite resource that is currently in use.
- Aggregated Service Level Indicators (SLIs)
  - Apdex Score: Apdex is a measure of requests that complete within a tolerable period of time for the service.
  - Error Ratio: Error rates are a measure of unhandled service exceptions per second. Client errors are excluded when possible.
  - Request Rate: The operation rate is the sum total of all requests being handled by all components within this service. Note that a single user request can lead to requests to multiple components.
- Recurrent application errors raised by the service.
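
To make the metric definitions above concrete, here is a minimal Python sketch of how each could be computed from raw request samples. It is an illustration only, not how the dashboards actually derive these values: the `RequestSample` type, the function names, the 1-second latency threshold, and the sample data are all assumptions made for this example, and Apdex is simplified to the share of requests that finish within the threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestSample:
    """Hypothetical record of one request handled by the service."""
    duration_s: float   # time taken to complete the request, in seconds
    failed: bool        # True if it ended in an unhandled service exception


def apdex(samples: List[RequestSample], threshold_s: float) -> float:
    """Share of requests that completed within the tolerable duration."""
    within = sum(1 for s in samples if s.duration_s <= threshold_s)
    return within / len(samples)


def error_ratio(samples: List[RequestSample]) -> float:
    """Share of requests that ended in an unhandled service exception."""
    return sum(1 for s in samples if s.failed) / len(samples)


def request_rate(samples: List[RequestSample], window_s: float) -> float:
    """Requests per second over the observation window."""
    return len(samples) / window_s


def saturation(used: float, capacity: float) -> float:
    """Fraction of a finite resource (CPU, memory, ...) currently in use."""
    return used / capacity


# Made-up samples observed over an assumed 2-second window.
samples = [
    RequestSample(duration_s=0.2, failed=False),
    RequestSample(duration_s=0.4, failed=False),
    RequestSample(duration_s=1.5, failed=False),
    RequestSample(duration_s=0.3, failed=True),
]

print(apdex(samples, threshold_s=1.0))      # 3 of 4 requests within 1 s
print(error_ratio(samples))                 # 1 of 4 requests failed
print(request_rate(samples, window_s=2.0))  # 4 requests in 2 s
print(saturation(used=3.0, capacity=4.0))   # e.g. 3 of 4 CPU cores busy
```

In this sketch, a falling Apdex or a rising error ratio or saturation value is the kind of change this runbook is meant to help investigate.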
How to monitor the service?
Most of the above-mentioned monitoring targets, namely Resource Saturation and the Aggregated SLIs, are available in the Service Overview Dashboard.