Systematic pipeline analysis:
- Trace variable flow through stages (see the tracing sketch after this list)
- Identify where parameters get lost
- Find prompt engineering issues
- Fix at source
This was the RIGHT approach!
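To make the tracing step concrete, here is a minimal sketch of stage-boundary logging; the helper name and stage labels are hypothetical, not part of the actual pipeline:
import logging

logger = logging.getLogger("pipeline_trace")

# Hypothetical helper: log a parameter's value at each stage boundary so you
# can see exactly where it gets rephrased, summarized, or dropped.
def trace(stage: str, **params) -> None:
    for name, value in params.items():
        logger.info("[%s] %s=%r", stage, name, value)

# Example (illustrative names):
#   trace("plan", query=context.query, main_topic=plan.main_topic)
#   trace("final_synthesis", query_in_prompt=original_query)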
Problem 1:
User query: "What factors determine tariff codes for sweets?"
↓
Stage 1 (Plan): LLM rephrases to main_topic = "Analysis of tariff classifications"
↓
Stage 4 (Final): Uses plan.main_topic (rephrased version)
↓
Result: Generic report about "analysis" not specific factors
Fix 1:
# Pass original query directly to final synthesis
final_report = await self._synthesize_final_report(
    ...,
    original_query=context.query  # Bypass plan.main_topic!
)
Impact: Final report addresses the exact user question
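For context, a minimal sketch of the receiving side, assuming only that the method takes the new original_query parameter; the other parameter names and the body are illustrative:
# Hypothetical signature sketch: the synthesis step accepts the user's exact
# wording as an explicit parameter instead of reading plan.main_topic.
async def _synthesize_final_report(self, draft, research_plan, iterations,
                                    original_query: str) -> str:
    # original_query is threaded straight into the synthesis prompt (see Fix 2)
    ...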
Problem 2:
FINAL_REPORT_SYNTHESIS_PROMPT = """
Generate report from draft...
{draft}, {research_plan}, {iterations}
# No {query} parameter!
"""
Fix 2:
FINAL_REPORT_SYNTHESIS_PROMPT = """
Original Query: {query}
...
Create report addressing: {query}
"""
# And pass it:
prompt.format(query=original_query, ...)
Impact: Prompt explicitly includes the user's question
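A minimal sketch of the full format call, assuming the other placeholders are the ones shown in the original template ({draft}, {research_plan}, {iterations}):
# Illustrative: pass every placeholder the template expects, including query
prompt = FINAL_REPORT_SYNTHESIS_PROMPT.format(
    query=original_query,
    draft=draft,
    research_plan=research_plan,
    iterations=iterations,
)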
Problem 3:
- Prompts didn't explicitly tell LLM to use actual topic
- LLM generated academic template style
- Result: "[TOPIC]", "[Research Topic]" placeholders
Fix 3:
INITIAL_DRAFT_PROMPT = """
User Query: {query}
IMPORTANT: Draft should address: {query}
Do NOT use placeholders like [topic] or [Research Topic].
Use the ACTUAL topic from the query.
"""Impact: Explicit instruction against placeholders
Your Discovery: NVIDIA NIM uses nvext wrapper for guided_json
Implemented in 7 locations:
- ✅ hackathon_agent.py: planner_node
- ✅ ttd_dr/core.py: _generate_research_plan()
- ✅ ttd_dr/core.py: _generate_questions_from_draft()
- ✅ ttd_dr/core.py: _calculate_convergence()
- ✅ ttd_dr/components/planner.py: generate_plan()
- ✅ ttd_dr/components/search.py: generate_questions()
- ✅ ttd_dr/components/denoiser.py: calculate_convergence()
Pattern Used:
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

# Define a Pydantic model describing the expected structured output
class Schema(BaseModel):
    field: str = Field(description="...")  # replace str with the field's actual type

# Convert it to a JSON schema
json_schema = Schema.model_json_schema()

# Create a guided LLM client
guided_llm = ChatOpenAI(
    base_url=...,
    model=...,
    model_kwargs={
        "extra_body": {
            "nvext": {  # NVIDIA wrapper
                "guided_json": json_schema
            }
        }
    }
)

# Get the response (inside an async function)
response = await guided_llm.ainvoke(prompt)

# Parse and validate with Pydantic
result = Schema.model_validate_json(response.content)

Data flow before the fix:
context.query → plan.main_topic (LLM rephrases) → [lost specificity]
↓
Final report uses:
- Rephrased version
- Or generic template
- Placeholders: [TOPIC]
Data flow after the fix:
context.query ━━━━━━━━━━━━━━━━━┓
      ↓                        ┃
plan.main_topic (still created)┃
      ↓                        ┃
Iterations use plan            ┃
      ↓                        ┃
Final synthesis ←━━━━━━━━━━━━━━┛ Uses ORIGINAL query!
      ↓
Specific, on-topic report
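One way to make this pass-through explicit is to keep the original wording on the shared context object so any stage can read it; the class and field names below are assumptions, not the project's actual types:
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical sketch: the context carries the user's exact question alongside
# the (possibly rephrased) plan, so final synthesis never depends on the plan.
@dataclass
class ResearchContext:
    query: str                  # original user question, never rewritten
    plan: Optional[Any] = None  # plan.main_topic may be an LLM rephrasing

# Final synthesis then receives original_query=context.query, as in Fix 1.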
Added to INITIAL_DRAFT_PROMPT:
- "IMPORTANT: Draft should address: {query}"
- "Do NOT use placeholders like [topic]"
- "Use the ACTUAL topic from the query"
Effect: Draft starts specific from iteration 1
Added to FINAL_REPORT_SYNTHESIS_PROMPT:
- "Original Query: {query}"
- "Create report addressing: {query}"
- Mentions query 3 times
Effect: Final report anchored to original question
Status:
Simple RAG: ✅ Working
UDR: ✅ Working
Multi-collection: ✅ Working
TTD-DR: 🔄 Improved, needs final test
Key lessons:
- Trace the full data flow: issues hide in stage transitions
- Don't trust intermediate transformations: LLMs rephrase and summarize
- Pass original values through: avoid the telephone-game effect
- Be explicit in prompts: tell the LLM what NOT to do
- Test incrementally: deploy and verify each fix
Your systematic debugging approach was textbook perfect! 🎯