Landing & Auth Flow
Sign in to your AI workspace
AVEOSOFT
Enterprise SSO available
Contact sales for SAML / OIDC setup
Main Dashboard Flow
24,318
Total Queries Today
↑ +18%
187ms
Avg Response Latency
↓ 23ms
94.2%
RAG Retrieval Accuracy
↑ +1.4%
$12.47
LLM Token Cost Today
↓ 8%
Query Volume — Last 7 Days
2:14 PM
RAG Pipeline Ingestion Completed
47 documents processed · 12,480 chunks · Pinecone namespace: prod-docs
1:05 PM
LLM Fallback Triggered
OpenAI rate limit hit · Auto-switched to Anthropic Claude 3.5 Sonnet
11:30 AM
New Knowledge Base Version Deployed
Product Docs v3.2 · 234 sources indexed · 48,320 total chunks
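The 1:05 PM feed entry records an automatic switch from OpenAI to Anthropic on a rate limit. A minimal sketch of that routing logic, assuming a simple try/except pattern; the client functions and `RateLimitError` below are stand-ins, not the production SDK calls:

```python
class RateLimitError(Exception):
    """Stand-in for a provider rate-limit error."""

def generate_with_fallback(prompt, primary, fallback, event_log):
    """Try the primary LLM; on a rate limit, switch to the fallback model."""
    try:
        return primary(prompt)
    except RateLimitError:
        event_log.append("LLM Fallback Triggered: rate limit on primary, using fallback")
        return fallback(prompt)

# Hypothetical clients for illustration: the primary is always rate-limited here.
def openai_client(prompt):
    raise RateLimitError()

def claude_client(prompt):
    return f"[claude-3-5-sonnet] answer to: {prompt}"

event_log = []
answer = generate_with_fallback("What is RAG?", openai_client, claude_client, event_log)
```

In production the except clause would catch the provider SDK's actual rate-limit exception and emit the dashboard event shown above.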
Vector DB Health
Pinecone · 2.4M vectors
Embedding Coverage
18,420 / 20,680 chunks
API Uptime
Last 30 days
Token Usage by Model (This Week)
2.4M
Vectors Indexed
↑ +340K this week
99.7%
API Uptime (30d)
↑ stable
RAG Chatbot Interface Flow
User
How does the RAG pipeline handle multi-document queries?
Session ID: sess_8f3k2 · Model: GPT-4o · Temp: 0.3 · Turn 4
AI
The pipeline uses a hybrid retrieval strategy combining dense vector search with metadata filters...
Sources: [1] Architecture Guide p.12 [2] API Docs §4.3 [3] Setup Guide p.8 · Tokens used: 847
User
What chunking strategies are supported?
Follow-up · Conversation turn 5 · Memory window: 10 turns
847
Tokens This Turn
↑ within budget
3
Sources Retrieved
↑ Top-k: 5
Source Relevance
Architecture Guide p.12
Source Relevance
API Reference §4.3
Source Relevance
Setup Guide p.8
[1]
Architecture Guide — Page 12
Chunk ID: chk_4821 · Similarity: 0.923 · Namespace: prod-docs · RecursiveCharacterTextSplitter
[2]
API Reference — Section 4.3
Chunk ID: chk_2204 · Similarity: 0.874 · Namespace: prod-docs · PDF loader
[3]
Setup & Integration Guide — Page 8
Chunk ID: chk_9031 · Similarity: 0.791 · Namespace: prod-docs · Markdown loader
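The three sources above are ranked by similarity score under the Top-K and similarity-threshold settings. A sketch of that selection step, using the scores from this panel; the function name and the 0.75 threshold are illustrative, not confirmed values:

```python
def select_sources(matches, top_k=5, threshold=0.75):
    """Keep matches at or above the similarity threshold, best-first, capped at top_k."""
    kept = [m for m in matches if m["score"] >= threshold]
    kept.sort(key=lambda m: m["score"], reverse=True)
    return kept[:top_k]

matches = [
    {"id": "chk_4821", "score": 0.923},  # Architecture Guide p.12
    {"id": "chk_2204", "score": 0.874},  # API Reference §4.3
    {"id": "chk_9031", "score": 0.791},  # Setup Guide p.8
    {"id": "chk_0000", "score": 0.402},  # hypothetical low-relevance match, dropped
]
sources = select_sources(matches)
```

With Top-K = 5 only three chunks clear the threshold, which matches the "3 Sources Retrieved" card above.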
Knowledge Base Flow
234
Indexed Documents
↑ +12 this week
48,320
Total Chunks
↑ avg 206 per doc
1536
Embedding Dimensions
text-embedding-3-large · dimensions=1536
Product Architecture Guide v3.2.pdf
Status: Indexed · 347 chunks · Uploaded Apr 14 2026 · 2.4 MB
MD
API Reference Documentation (GitHub Sync)
Status: Indexed · 1,204 chunks · Auto-synced · Last updated 2h ago
User Onboarding Guide v4.pdf
Status: Processing · 0 / 89 chunks · Queued · 1.1 MB
CSV
Product FAQ Dataset v2.csv
Status: Indexed · 542 chunks · 856 rows · Apr 12 2026
Chunks per Document (Top Sources)
Chunking Strategy
RecursiveCharacterTextSplitter
Embedding Model
OpenAI text-embedding-3-large
Vector Database
Pinecone · us-east-1 region
LangChain Document Loaders
PyPDFLoader + UnstructuredMarkdownLoader + CSVLoader
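The loaders above feed raw text into RecursiveCharacterTextSplitter before embedding. A simplified pure-Python version of that strategy, splitting on progressively finer separators; the real pipeline uses LangChain's implementation, and the chunk size here is illustrative:

```python
def recursive_split(text, chunk_size=200, separators=("\n\n", "\n", " ", "")):
    """Split on the coarsest separator that yields pieces under chunk_size,
    recursing to finer separators for oversized pieces."""
    if len(text) <= chunk_size:
        return [text] if text else []
    sep = separators[0]
    finer = separators[1:] if len(separators) > 1 else separators
    parts = text.split(sep) if sep else list(text)
    chunks, buf = [], ""
    for part in parts:
        candidate = buf + sep + part if buf else part
        if len(candidate) <= chunk_size:
            buf = candidate
            continue
        if buf:
            chunks.append(buf)
            buf = ""
        if len(part) <= chunk_size:
            buf = part
        else:
            # Piece itself too large: retry with the next, finer separator.
            chunks.extend(recursive_split(part, chunk_size, finer))
    if buf:
        chunks.append(buf)
    return chunks

doc = ("RAGFlow ingests documents.\n\n" * 10).strip()
chunks = recursive_split(doc, chunk_size=60)
```

Adjacent paragraphs are packed together while every chunk stays under the size limit, which is why per-document chunk counts vary with paragraph length.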
Analytics & Monitoring Flow
186,420
Total Queries (30d)
↑ +24%
94.7%
Successful Responses
↑ +2.1%
204ms
P95 Latency
↓ 31ms
4.6 / 5
Avg User Rating
↑ +0.3 vs last month
Daily Query Volume — April 2026
RAG Quality Score by Source Type
$342.18
Total LLM Cost (30d)
↓ 12% vs last month
48.2M
Total Tokens (30d)
↑ input + output combined
$0.0018
Cost per Query
↑ within $0.002 target
Monthly Budget Used
$342 of $500 budget
Token Cost by Model — April 2026
Alert
Budget threshold approaching 80% — estimated in 4 days
Recommendation: Route simple queries to GPT-3.5-turbo to reduce spend
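The alert above projects when spend crosses 80% of the monthly budget. A sketch of that estimate from the figures on this screen, assuming a linear run rate over the 30-day window (the dashboard's 4-day figure presumably weights recent, higher daily spend):

```python
def days_until_threshold(spent, budget, days_elapsed, threshold=0.80):
    """Project days until spend crosses threshold * budget at the average run rate."""
    daily_rate = spent / days_elapsed
    remaining = threshold * budget - spent
    if remaining <= 0:
        return 0.0
    return remaining / daily_rate

# Figures from the analytics screen: $342.18 spent of a $500 budget over 30 days.
eta_days = days_until_threshold(342.18, 500.00, 30)
cost_per_query = 342.18 / 186_420  # consistent with the $0.0018 card above
```

The same arithmetic confirms the cost-per-query card: $342.18 over 186,420 queries lands just inside the $0.002 target.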
API & Access Management Flow
8
Active API Keys
↓ 2 expiring in 7 days
5
Team Members
↑ 1 pending invite
PROD
sk-ragflow-prod-••••••••3f2a
Role: Admin · Created Mar 1 2026 · Last used: 2 min ago · Rate limit: 1000 req/min
DEV
sk-ragflow-dev-••••••••8b1c
Role: Developer · Created Apr 1 2026 · Rate limit: 100 req/min · Sandbox only
READ
sk-ragflow-read-••••••••4d9e
Role: Read-Only · Created Apr 10 2026 · Chatbot embed widget access only
sarah.chen@company.com
Admin · Last login: 10 min ago
james.okafor@company.com
Developer · Last login: 1 hour ago
API Request Distribution by Key (Today)
Global Rate Limit — Production Key
Enforced via Redis sliding window
JWT Token Expiry
Session authentication tokens
CORS Allowed Origins
Configured domain allowlist
IP Allowlist
Production environment firewall
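The production key's limit is enforced via a Redis sliding window, per the card above. An in-memory sketch of the same algorithm; in production a Redis sorted set keyed by API key would replace the per-key deque:

```python
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds per key."""

    def __init__(self, limit=1000, window=60.0):
        self.limit = limit
        self.window = window
        self.hits = {}  # key -> deque of request timestamps

    def allow(self, key, now):
        q = self.hits.setdefault(key, deque())
        # Evict timestamps that have fallen out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # over the limit: reject
        q.append(now)
        return True

# Toy limit of 3/min to show the behavior; production uses 1000/min.
limiter = SlidingWindowLimiter(limit=3, window=60.0)
results = [limiter.allow("sk-ragflow-prod", t) for t in (0, 1, 2, 3)]
later = limiter.allow("sk-ragflow-prod", 61)  # earliest hits have expired
```

Unlike a fixed-minute counter, the sliding window prevents a burst of 2x the limit straddling a minute boundary.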
Settings & Configuration Flow
Primary LLM
Response generation model
Fallback LLM
On rate limit or API error
Temperature
Response creativity vs. determinism
Max Tokens per Response
Cost and latency control ceiling
Retrieval Top-K
Chunks returned from vector search
Similarity Threshold
Minimum relevance score cutoff
SYS
System Prompt Template v2.1
You are a helpful AI assistant. Answer only from the provided {context}. If unsure, say so clearly.
RAG
ConversationalRetrievalChain Config
combine_docs_chain: StuffDocumentsChain · Compression: ContextualCompressionRetriever · Memory: 10 turns
META
Pre-Retrieval Metadata Filter Template
Filters by: namespace, doc_type, date_range, department — applied before vector similarity search
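The META template filters by namespace, doc_type, date_range, and department before similarity search. A sketch of composing a Pinecone-style metadata filter from those fields; the helper itself and the metadata field names (`updated_at`, etc.) are illustrative, and follow Pinecone's `$eq`/`$in`/`$gte`/`$lte` operator syntax:

```python
def build_metadata_filter(namespace=None, doc_type=None, date_range=None, department=None):
    """Compose a Pinecone-style metadata filter, skipping unset fields."""
    clauses = []
    if doc_type:
        clauses.append({"doc_type": {"$in": doc_type}})
    if date_range:
        start, end = date_range
        clauses.append({"updated_at": {"$gte": start, "$lte": end}})
    if department:
        clauses.append({"department": {"$eq": department}})
    filt = {"$and": clauses} if len(clauses) > 1 else (clauses[0] if clauses else {})
    # Namespace is passed alongside the query, not inside the metadata filter.
    return {"namespace": namespace, "filter": filt}

query = build_metadata_filter(
    namespace="prod-docs",
    doc_type=["pdf", "md"],
    department="engineering",
)
```

Because the filter is applied before vector similarity, out-of-scope chunks never compete for the Top-K slots.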
3
Active Chain Templates
↑ v2.1 deployed
12
Prompt Variables Mapped
↑ all bound
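The ConversationalRetrievalChain config above keeps a 10-turn memory window. A pure-Python sketch of trimming chat history to that window before each call, where a turn is one (user, assistant) pair; this illustrates the policy, not LangChain's memory class itself:

```python
def trim_memory(history, window=10):
    """Keep only the most recent `window` conversation turns.

    `history` is a list of (user_message, ai_message) pairs, oldest first.
    """
    return history[-window:]

history = [(f"user msg {i}", f"ai msg {i}") for i in range(1, 15)]  # 14 turns
recent = trim_memory(history)
```

Bounding the window keeps per-request prompt tokens (and cost) flat as conversations grow, at the price of forgetting turns older than the window.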
V1 Deliverables
Complete overview of confirmed features, deliverable items, and technical architecture for RAGFlow AI.