Agentic RAG Workflow

LangGraph orchestration with MLflow observability

LangGraph Workflow

graph TD
    A[User Query Input] --> B[Summarize Node]
    B --> C[Bot Response Output]
    B --> D[Thread Retrieval]
    D --> E[Query Optimization]
    E --> F[Vector Search]
    F --> G[Response Formatting]
    G --> C
    classDef userQuery fill:#dc2626,stroke:#ef4444,stroke-width:2px,color:#fecaca
    classDef summarize fill:#1e40af,stroke:#3b82f6,stroke-width:2px,color:#dbeafe
    classDef botResponse fill:#16a34a,stroke:#22c55e,stroke-width:2px,color:#dcfce7
    classDef sequential fill:#7c2d12,stroke:#ea580c,stroke-width:2px,color:#fed7aa
    class A userQuery
    class B summarize
    class C botResponse
    class D,E,F,G sequential

Main Workflow

User Query Input

Initial user question or request received via Slack

Summarize Node

Core LangGraph node that orchestrates all processing steps

Bot Response Output

Final formatted response delivered back to the user

Sequential Processing

Thread Retrieval

Fetches conversation history for context awareness

Query Optimization

Uses an LLM to rewrite the user request for better search results

Vector Search

Retrieves relevant documentation using optimized query

Response Formatting

Uses an LLM to format the response from the thread context and retrieved docs
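
The four sequential steps above can be sketched as functions that each take and return a shared state dictionary. This is a plain-Python stand-in, not LangGraph code: the state fields, the FAKE_THREAD and FAKE_INDEX data, and the keyword matching that replaces the LLM and the vector index are all assumptions made for illustration.

```python
from typing import TypedDict


class State(TypedDict, total=False):
    question: str         # latest user message
    thread: list[str]     # earlier messages in the Slack thread
    search_query: str     # rewritten query used for retrieval
    documents: list[str]  # retrieved passages
    response: str         # final reply text


# Hypothetical stand-ins for the Slack API and the vector index.
FAKE_THREAD = ["How do I deploy the bot?", "Use the deploy script."]
FAKE_INDEX = {
    "deploy": "Deployment guide: run the deploy script from the repo root.",
    "token": "Auth guide: tokens are refreshed automatically.",
}


def retrieve_thread(state: State) -> State:
    # Thread Retrieval: a real step would call the Slack API here.
    return {**state, "thread": list(FAKE_THREAD)}


def optimize_query(state: State) -> State:
    # Query Optimization: a real step would prompt an LLM; this one only
    # keeps words from the thread and question that the index knows about.
    words = " ".join(state["thread"] + [state["question"]]).lower().split()
    keywords = sorted({w.strip("?.,") for w in words} & set(FAKE_INDEX))
    return {**state, "search_query": " ".join(keywords)}


def vector_search(state: State) -> State:
    # Vector Search: exact keyword lookup stands in for similarity search.
    docs = [FAKE_INDEX[k] for k in state["search_query"].split()]
    return {**state, "documents": docs}


def format_response(state: State) -> State:
    # Response Formatting: a real step would prompt an LLM with the docs.
    body = "\n".join(f"• {d}" for d in state["documents"])
    return {**state, "response": body or "No matching docs found."}


STEPS = [retrieve_thread, optimize_query, vector_search, format_response]


def summarize(question: str) -> State:
    # Run the steps in the order shown in the workflow graph.
    state: State = {"question": question}
    for step in STEPS:
        state = step(state)
    return state
```

Calling summarize("What about the token?") picks up "deploy" from the thread history as well as "token" from the question, which is the point of retrieving the thread before building the search query.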

MLflow Observability

Monitoring Components

Trace Decorators

Applied to all LangGraph nodes for end-to-end visibility

Experiment Tracking

Compares model performance across different configurations

Inference Capture

Production interactions captured for evaluation datasets

Lineage Tracking

Links each response back through every processing step to the original query
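
To make the Trace Decorators idea concrete, the sketch below shows roughly what a tracing decorator records for each call: a name, a parent span, a status, and a duration. It is a plain-Python illustration, not MLflow's implementation (MLflow ships its own mlflow.trace decorator for this). The SPANS list, the traced name, and the two example functions are hypothetical, and the module-level stack is not thread-safe.

```python
import functools
import time

SPANS: list[dict] = []  # hypothetical in-memory sink; a tracing backend would persist these
_stack: list[str] = []  # names of the spans currently open, used for parent links


def traced(func):
    """Record one span per call, including the caller's span as parent."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        parent = _stack[-1] if _stack else None
        _stack.append(func.__name__)
        start = time.perf_counter()
        status = "OK"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "ERROR"
            raise
        finally:
            _stack.pop()
            SPANS.append(
                {
                    "name": func.__name__,
                    "parent": parent,
                    "status": status,
                    "seconds": time.perf_counter() - start,
                }
            )

    return wrapper


@traced
def vector_search(query: str) -> list[str]:
    return [f"doc about {query}"]


@traced
def summarize(question: str) -> str:
    return vector_search(question)[0]
```

After one call to summarize, SPANS holds the inner vector_search span first (it finishes first) with summarize as its parent, followed by the summarize span with no parent. That parent link is what lets a trace viewer rebuild the path from query to response.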

Observability Benefits

Performance Monitoring

Real-time tracking of response times and throughput

Quality Assurance

Automated evaluation of response quality and accuracy

Debug Capabilities

Detailed trace analysis for troubleshooting issues

Continuous Improvement

Data-driven insights for model and workflow optimization
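
As one way to read the Performance Monitoring item, response times taken from traces can be reduced to a few summary numbers. The sketch below assumes the durations have already been exported as a list of milliseconds covering a known time window. The function name and the choice of p50 and p95 are illustrative, not taken from the project.

```python
import statistics


def latency_summary(durations_ms: list[float], window_s: float) -> dict:
    """Summarize response times observed during a window of window_s seconds."""
    # quantiles(n=100) returns the 99 percentile cut points; index 49 is the
    # 50th percentile and index 94 is the 95th. Needs at least two samples.
    cuts = statistics.quantiles(durations_ms, n=100, method="inclusive")
    return {
        "count": len(durations_ms),
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "throughput_per_s": len(durations_ms) / window_s,
    }
```

Reporting a high percentile alongside the median matters for a chat bot, because a few slow LLM or search calls can make the bot feel unresponsive even when the median looks healthy.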

Application Architecture

Slack Bot Components

Key Features

Full Thread Context

Complete conversation history for context-aware responses

Intelligent Query Enhancement

An LLM generates focused search queries from the thread context

Slack Formatting

Proper links, code blocks, and bullet points

Graceful Fallbacks

Handles service unavailability gracefully

Token Refresh

Automatic token management for long sessions
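
The Slack Formatting feature above usually comes down to converting LLM-style Markdown into Slack's mrkdwn syntax, where links are written as <url|text> and bold uses single asterisks. The converter below is a minimal sketch under that assumption. The function name and the regular expressions are illustrative, it covers only links, bold text, and bullets, and it leaves the contents of code blocks untouched.

```python
import re

FENCE = "`" * 3  # Markdown code-fence marker, built indirectly to keep this listing readable
LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")
BOLD = re.compile(r"\*\*(.+?)\*\*")
BULLET = re.compile(r"^(\s*)[-*]\s+")


def to_slack_mrkdwn(markdown: str) -> str:
    """Convert a small subset of Markdown to Slack mrkdwn, line by line."""
    out = []
    in_code = False
    for line in markdown.splitlines():
        if line.strip().startswith(FENCE):
            # Entering or leaving a code block: emit a bare fence and drop
            # any language hint that followed the opening marker.
            in_code = not in_code
            out.append(FENCE)
            continue
        if in_code:
            out.append(line)  # code is passed through unchanged
            continue
        line = BULLET.sub(lambda m: m.group(1) + "• ", line)
        line = BOLD.sub(r"*\1*", line)
        line = LINK.sub(r"<\2|\1>", line)
        out.append(line)
    return "\n".join(out)
```

For example, the line "- See the **setup guide** at [docs](https://example.com/docs)" becomes "• See the *setup guide* at <https://example.com/docs|docs>".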