Troubleshooting
Information about the monitoring and logging tools for Duo Workflow.
Tools
Duo Workflow uses the following logging and monitoring tools:
- LangSmith - collects logs scoped to the underlying graph execution, including information such as LLM completions and tool calls
- GCP Logs Explorer - the Logs Explorer for the gitlab-runway-production GCP project. These logs include all logs generated by Duo Workflow Service (WARNING: this project also collects logs from other Runway services, so correct filtering is necessary to scope the browsed logs to the right entity)
- Sentry error tracking - collects error traces
- Runway monitoring dashboard - a Grafana dashboard that tracks hardware resource consumption for Duo Workflow Service
- Tableau dashboard for internal events tracking - displays aggregated data collected with internal event tracking, showing additional product metrics such as the total number of workflows or the distribution of workflow outcomes
Tips and tricks
A typical investigation of a problematic Duo Workflow execution follows the steps listed below.
Based on a user report:
- Ask the user for the workflow_id of the problematic workflow, which is displayed in the list of workflows
- Use the workflow_id from the previous step to filter LangSmith traces by applying a Metadata filter with thread_id=[workflow_id]
- Use the workflow_id from the first step to filter logs in GCP Logs Explorer
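The LangSmith step above can also be expressed as a filter string for scripted queries. The following is a minimal sketch, assuming LangSmith's run-filter query syntax with metadata_key and metadata_value; the helper name thread_id_filter is hypothetical and not part of Duo Workflow or LangSmith.

```python
import json


def thread_id_filter(workflow_id) -> str:
    """Build a filter string matching runs whose metadata has thread_id == workflow_id.

    Assumption: the target system accepts LangSmith-style filter expressions
    of the form and(eq(metadata_key, ...), eq(metadata_value, ...)).
    """
    # json.dumps produces a double-quoted string literal with escaping applied.
    value = json.dumps(str(workflow_id))
    return f'and(eq(metadata_key, "thread_id"), eq(metadata_value, {value}))'


if __name__ == "__main__":
    # "12345" is a made-up workflow_id used only for illustration.
    print(thread_id_filter("12345"))
```

The resulting string can be pasted wherever a raw LangSmith filter expression is accepted. Check it against the current LangSmith documentation before relying on it.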
Based on a Sentry issue:
- Use the Duo Workflow Sentry issue to locate the problematic workflow's correlation_id
- Use the correlation_id from the previous step to filter logs in GCP Logs Explorer. Example filter: jsonPayload.correlation_id="e7171f28-706d-4a47-be25-29d9b3751c0e"
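When the same lookup is repeated often, the Logs Explorer filter can be generated rather than typed by hand. The sketch below builds the jsonPayload.correlation_id filter shown above and optionally narrows it with a timestamp lower bound. The function names and the since parameter are illustrative assumptions, and the filter does not scope results to Duo Workflow Service, so additional service filtering may still be needed.

```python
from datetime import datetime, timezone
from typing import Optional


def quote(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def correlation_filter(correlation_id: str, since: Optional[datetime] = None) -> str:
    """Build a Logs Explorer query for one correlation_id.

    `since` is a hypothetical convenience parameter; it must be a
    timezone-aware datetime and is rendered as a UTC timestamp.
    """
    clauses = [f"jsonPayload.correlation_id={quote(correlation_id)}"]
    if since is not None:
        stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        clauses.append(f"timestamp>={quote(stamp)}")
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(correlation_filter("e7171f28-706d-4a47-be25-29d9b3751c0e"))
```

Paste the printed string into the Logs Explorer query box for the gitlab-runway-production project.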
In addition, one can use a workflow's workflow_id, which is recorded either in Sentry or in Logs Explorer, to filter LangSmith logs by applying the thread_id metadata filter and matching it against the workflow_id.
Last modified December 13, 2024: Remove trailing spaces (a4c83fb3)