Financial decisions require more than quick answers - they demand transparent, auditable reasoning. When an AI flags a transaction as fraudulent or recommends a portfolio rebalancing, stakeholders need to understand why. Chain-of-Thought and ReACT prompting techniques transform LLMs from black-box responders into systematic reasoners whose logic can be traced, verified, and trusted.
The Problem with Direct Answers
Consider a fraud detection scenario. When asked “Is this transaction fraudulent?”, a basic LLM might respond:
“Yes, this looks like potential fraud due to unusual activity.”
That’s not helpful for a compliance team that needs to document decisions, justify actions to regulators, or train analysts. We need the reasoning trail, not just the conclusion.
Chain-of-Thought Prompting
Chain-of-Thought (CoT) prompting encourages the LLM to generate intermediate reasoning steps before providing a final answer. Instead of jumping to conclusions, the model thinks through the problem systematically.
```mermaid
flowchart TB
    subgraph Standard["Standard Prompting"]
        direction LR
        Q1[Question] --> A1[Answer]
    end
    subgraph CoT["Chain-of-Thought"]
        direction LR
        Q2[Question] --> S1[Step 1]
        S1 --> S2[Step 2]
        S2 --> S3[Step 3]
        S3 --> A2[Answer]
    end
    classDef blueClass fill:#4A90E2,stroke:#333,stroke-width:2px,color:#fff
    class CoT blueClass
```
Types of Chain-of-Thought
Zero-shot CoT: The simplest form - just add “Let’s think step by step” to your prompt:
```python
prompt = """
Transaction: {transaction_details}

Is this transaction fraudulent? Let's think step by step.
"""
```
Few-shot CoT: Provide examples showing the reasoning process:
```python
prompt = """
Example:
Transaction: $50 at a coffee shop
THOUGHT: Small amount, local merchant, typical category
CONCLUSION: LOW risk

Now analyze:
Transaction: {transaction_details}
Think through the same steps before concluding.
"""
```
Benefits of CoT for Finance
- Auditability: Every conclusion comes with documented reasoning
- Reduced Errors: Step-by-step thinking catches logical mistakes
- Interpretability: Analysts can verify the model’s logic
- Compliance: Reasoning trails satisfy regulatory requirements
ReACT: Reasoning with Action
Chain-of-Thought works for internal reasoning, but financial tasks often require external information - checking account history, verifying locations, or querying databases. ReACT (Reason + Act) interleaves thinking with action.
```mermaid
flowchart TD
    T1[Thought: What do I need to know?] --> A1[Action: Query tool/API]
    A1 --> O1[Observation: Tool result]
    O1 --> T2[Thought: What does this mean?]
    T2 --> A2[Action: Query another tool]
    A2 --> O2[Observation: New information]
    O2 --> T3[Thought: I can now conclude]
    T3 --> F[Final Answer]
    classDef orangeClass fill:#F39C12,stroke:#333,stroke-width:2px,color:#fff
    classDef blueClass fill:#4A90E2,stroke:#333,stroke-width:2px,color:#fff
    class T1,T2,T3 orangeClass
    class A1,A2 blueClass
```
The ReACT Loop
- Thought: Reason about the current state and plan next step
- Action: Call an external tool with specific parameters
- Observation: Receive and process the tool’s response
- Repeat until ready to provide a final answer (a minimal driver loop is sketched below)
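To make the loop concrete, here is a minimal sketch of a ReACT driver. The `call_llm` stub and the `TOOLS` registry are illustrative stand-ins rather than a specific SDK; the point is the alternation between model output and tool results.

```python
import re

# Illustrative tool registry; real implementations would query production systems.
TOOLS = {
    "get_history": lambda customer_id: "[...last 30 transactions...]",
    "check_balance": lambda customer_id: "4,212.87 USD",
}

def call_llm(prompt: str) -> str:
    """Stand-in for your LLM client; returns the model's next THOUGHT plus ACTION or FINAL."""
    raise NotImplementedError

def react_loop(question: str, max_steps: int = 5) -> str:
    transcript = question
    for _ in range(max_steps):
        step = call_llm(transcript)               # model emits THOUGHT plus either ACTION or FINAL
        transcript += "\n" + step
        if "FINAL:" in step:                      # model signalled it can conclude
            return step.split("FINAL:", 1)[1].strip()
        match = re.search(r"ACTION:\s*(\w+)\((.*?)\)", step)
        if match:                                 # run the requested tool and append the observation
            name, arg = match.group(1), match.group(2).strip()
            result = TOOLS[name](arg) if name in TOOLS else f"Unknown tool: {name}"
            transcript += f"\nOBSERVATION: {result}"
    return "No conclusion reached within the step limit."
```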
ReACT Example: Fraud Investigation
```python
# Illustrative investigation prompt; the tools and the flagged activity are example values
react_prompt = """
You are investigating a fraud alert. Work in a loop of THOUGHT, ACTION, and OBSERVATION,
then end with FINAL: your conclusion and a fraud risk level.

Available tools:
- get_history(customer_id): returns the last 30 transactions
- check_balance(customer_id): returns the current account balance in USD

Alert: the customer's card was used in Austin last evening and in Dallas early this morning.
"""
```
An illustrative conversation flow:

```
THOUGHT: I need to check if travel from Austin to Dallas overnight is feasible.
ACTION: get_history(customer_id)
OBSERVATION: Two similar Austin-to-Dallas trips appear in the last 30 transactions.
THOUGHT: Overnight travel between the two cities is plausible and matches past behavior.
FINAL: LOW fraud risk - the location change fits the customer's established travel pattern.
```
Applied Example: Fraud Detection
Let’s see how these techniques transform fraud analysis from basic to systematic.
Level 1: Basic Prompt
```python
basic_prompt = "Analyze this credit card activity for fraud."
```
Level 2: Chain-of-Thought
```python
# Sketch of a step-structured prompt; step names beyond STEP 1 are illustrative
cot_prompt = """
Analyze this credit card activity for fraud. Work through these steps:

STEP 1: Pattern Recognition - compare the activity to the customer's usual behavior
STEP 2: Risk Indicators - flag amounts, merchants, locations, and timing that stand out
STEP 3: Conclusion - rate fraud likelihood as LOW (< 30%), MEDIUM (30-70%), or HIGH (> 70%)

Activity: {transactions}
"""
```
Sample Output:
```
STEP 1: Pattern Recognition
...
```
Level 3: ReACT with Tools
```python
# Sketch of a tool-aware prompt; the tool descriptions mirror the registry shown later in this post
react_prompt = """
Investigate this credit card activity for fraud.

You may call these tools:
- get_history(customer_id): returns the last 30 transactions
- check_balance(customer_id): returns the current account balance in USD

At each step write THOUGHT, then either ACTION (a tool call) or FINAL (your conclusion).
After each ACTION you will receive an OBSERVATION containing the tool's result.

Activity: {transactions}
"""
```
This approach produces an investigation that:
- Documents every decision point
- Shows what tools were consulted
- Provides evidence for each conclusion
- Creates an audit trail for compliance, as sketched below
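One way that audit trail might be captured is to parse the THOUGHT/ACTION/OBSERVATION transcript into structured records; a minimal sketch, with illustrative field names:

```python
from datetime import datetime, timezone

def transcript_to_audit_log(transcript: str, case_id: str) -> list[dict]:
    """Turn a THOUGHT/ACTION/OBSERVATION transcript into structured audit records."""
    records = []
    for line in transcript.splitlines():
        line = line.strip()
        for kind in ("THOUGHT", "ACTION", "OBSERVATION", "FINAL"):
            if line.startswith(kind + ":"):
                records.append({
                    "case_id": case_id,
                    "timestamp": datetime.now(timezone.utc).isoformat(),
                    "step_type": kind,
                    "content": line[len(kind) + 1:].strip(),
                })
    return records
```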
Prompt Refinement Best Practices
Getting good results from CoT and ReACT requires careful prompt engineering.
Do’s
Be Specific About Steps
```python
# Good
"STEP 1: Calculate debt-to-income ratio
STEP 2: Compare to threshold of 43%
STEP 3: Assess risk category"

# Bad
"Analyze the loan application"
```

Specify Output Format
```python
"Provide fraud likelihood as: LOW (< 30%), MEDIUM (30-70%), HIGH (> 70%)"
```

Include Examples for Complex Tasks
```python
"""
Example analysis:
Transaction: $50 coffee shop
THOUGHT: Small amount, local merchant, typical category
CONCLUSION: LOW risk

Now analyze: [actual transaction]
"""
```

Define Tool Descriptions Clearly
```python
tools = {
    "check_balance": "Returns current account balance in USD",
    "get_history": "Returns last 30 transactions as JSON array"
}
```
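One way these descriptions can reach the model is to render them straight into the prompt; a minimal sketch (the formatting is illustrative, not a particular framework's convention):

```python
# Render the tool registry as a bulleted list inside the prompt
tool_section = "\n".join(f"- {name}: {description}" for name, description in tools.items())

react_prompt = f"""
You may call these tools:
{tool_section}

Respond with THOUGHT, then ACTION: tool_name(arguments), and wait for an OBSERVATION.
"""
```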
Don’ts
- Ambiguity: “Check if this looks okay” → What’s “okay”?
- Missing Context: Expecting the model to know customer history without providing it
- Conflicting Instructions: “Be thorough but keep it brief”
- Poor Tool Descriptions: If tools aren’t well-documented, the LLM won’t use them correctly
Common Pitfalls
| Pitfall | Example | Fix |
|---|---|---|
| Ambiguity | “Analyze this” | “Calculate fraud risk score 0-100” |
| No context | “Is this suspicious?” | Provide customer profile, history |
| Too much context | 50 pages of transactions | Summarize or paginate |
| Bias in examples | Only showing fraud cases | Include legitimate examples too |
Combining Techniques
The most effective approach often combines role-based prompting (from our previous post), Chain-of-Thought, and ReACT:
```python
# Sketch of a combined prompt: role + step-by-step reasoning + tool use
system_prompt = """
You are a senior fraud analyst at a retail bank.

For every case:
1. Reason step by step, recording each step as THOUGHT.
2. When you need external information, call a tool with ACTION and wait for the OBSERVATION.
3. Finish with FINAL: a LOW, MEDIUM, or HIGH fraud risk rating and the reasoning behind it.
"""
```
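In use, this system prompt is paired with a user message carrying the case details. A minimal sketch assuming the openai Python SDK; the model name and `transaction_summary` are placeholders:

```python
from openai import OpenAI

client = OpenAI()
transaction_summary = "..."  # case details gathered upstream

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whatever model your stack standardizes on
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Review this activity for fraud: " + transaction_summary},
    ],
)
print(response.choices[0].message.content)
```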
Why This Matters for Finance
Financial services face unique requirements that make structured reasoning essential:
- Regulatory Compliance: Decisions must be explainable and documented
- Audit Trails: Every determination needs supporting evidence
- Customer Trust: Clear reasoning builds confidence in AI-assisted decisions
- Error Reduction: Step-by-step thinking catches mistakes before they become costly
The difference between “this looks fraudulent” and a documented 5-step analysis could be the difference between a satisfied customer and a regulatory investigation.
Takeaways
- Chain-of-Thought prompting forces the LLM to show its reasoning, creating auditable decision trails
- ReACT adds the ability to gather external information, making reasoning dynamic and evidence-based
- Structured methodologies (numbered steps, clear criteria) produce consistent, comparable analyses
- Tool integration transforms LLMs from responders into investigators that can verify facts
- Prompt refinement is iterative - test, analyze failures, adjust, repeat
This is the second post in my Applied Agentic AI for Finance series. Next: Building Financial Prompt Pipelines, where we'll explore chaining prompts and feedback loops for complex financial workflows.