Reasoning Chains for Financial Decisions

Financial decisions require more than quick answers - they demand transparent, auditable reasoning. When an AI flags a transaction as fraudulent or recommends a portfolio rebalancing, stakeholders need to understand why. Chain-of-Thought and ReACT prompting techniques transform LLMs from black-box responders into systematic reasoners whose logic can be traced, verified, and trusted.

The Problem with Direct Answers

Consider a fraud detection scenario. When asked “Is this transaction fraudulent?”, a basic LLM might respond:

“Yes, this looks like potential fraud due to unusual activity.”

That’s not helpful for a compliance team that needs to document decisions, justify actions to regulators, or train analysts. We need the reasoning trail, not just the conclusion.

Chain-of-Thought Prompting

Chain-of-Thought (CoT) prompting encourages the LLM to generate intermediate reasoning steps before providing a final answer. Instead of jumping to conclusions, the model thinks through the problem systematically.

flowchart TB
    subgraph Standard["Standard Prompting"]
        direction LR
        Q1[Question] --> A1[Answer]
    end

    subgraph CoT["Chain-of-Thought"]
        direction LR
        Q2[Question] --> S1[Step 1]
        S1 --> S2[Step 2]
        S2 --> S3[Step 3]
        S3 --> A2[Answer]
    end

    classDef blueClass fill:#4A90E2,stroke:#333,stroke-width:2px,color:#fff

    class CoT blueClass

Types of Chain-of-Thought

Zero-shot CoT: The simplest form - just add “Let’s think step by step” to your prompt:

prompt = """
Analyze this transaction for fraud risk.
Let's think step by step.

Transaction: $2,450 online electronics purchase at 2:15 AM
Customer: Typical spend $2,800/month, Austin TX resident
"""

Few-shot CoT: Provide examples showing the reasoning process:

prompt = """
Example:
Q: Is a $500 gas purchase suspicious for a customer with $300 monthly gas spending?
A: Let's analyze step by step:
1. Typical pattern: $300/month on gas
2. Current transaction: $500 (67% above average)
3. Single transaction vs monthly total
4. Conclusion: Slightly elevated but could be a road trip. Medium risk.

Now analyze:
Q: Is a $2,450 electronics purchase at 2:15 AM suspicious?
"""

Benefits of CoT for Finance

  1. Auditability: Every conclusion comes with documented reasoning
  2. Reduced Errors: Step-by-step reasoning exposes logical mistakes that a one-line answer would hide
  3. Interpretability: Analysts can verify the model’s logic
  4. Compliance: Reasoning trails give compliance teams documentation to support regulatory reviews

ReACT: Reasoning with Action

Chain-of-Thought works for internal reasoning, but financial tasks often require external information - checking account history, verifying locations, or querying databases. ReACT (Reason + Act) interleaves thinking with action.

flowchart TD
    T1[Thought: What do I need to know?] --> A1[Action: Query tool/API]
    A1 --> O1[Observation: Tool result]
    O1 --> T2[Thought: What does this mean?]
    T2 --> A2[Action: Query another tool]
    A2 --> O2[Observation: New information]
    O2 --> T3[Thought: I can now conclude]
    T3 --> F[Final Answer]

    classDef orangeClass fill:#F39C12,stroke:#333,stroke-width:2px,color:#fff
    classDef blueClass fill:#4A90E2,stroke:#333,stroke-width:2px,color:#fff

    class T1 orangeClass
    class T2 orangeClass
    class T3 orangeClass
    class A1 blueClass
    class A2 blueClass

The ReACT Loop

  1. Thought: Reason about the current state and plan next step
  2. Action: Call an external tool with specific parameters
  3. Observation: Receive and process the tool’s response
  4. Repeat until ready to provide final answer
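The loop above can be sketched in a few lines of Python. This is an illustration under loud assumptions, not a production agent: `run_react` and `scripted_llm` are hypothetical names, the "model" is a stand-in that replays canned replies so the example runs without an API key, and the argument parsing is deliberately naive.

```python
import re
from typing import Callable, Iterable

# Matches a line such as: ACTION: check_location("Austin", "Dallas")
ACTION_RE = re.compile(r"^ACTION:\s*(\w+)\((.*)\)\s*$", re.MULTILINE)


def run_react(
    llm: Callable[[str], str],
    tools: dict[str, Callable[..., str]],
    case: str,
    max_steps: int = 5,
) -> tuple[str, str]:
    """Alternate model replies and tool calls until a FINAL ANSWER appears."""
    transcript = f"Case: {case}\n"
    for _ in range(max_steps):
        reply = llm(transcript)  # Thought, plus an Action or a Final Answer
        transcript += reply + "\n"
        if "FINAL ANSWER:" in reply:
            return reply.split("FINAL ANSWER:", 1)[1].strip(), transcript
        match = ACTION_RE.search(reply)
        if match is None:
            observation = "No valid ACTION line found."
        else:
            name, raw_args = match.groups()
            # Naive parsing: assumes quoted string arguments without embedded commas.
            args = [a.strip().strip('"') for a in raw_args.split(",") if a.strip()]
            tool = tools.get(name)
            observation = tool(*args) if tool else f"Unknown tool: {name}"
        transcript += f"OBSERVATION: {observation}\n"  # fed back on the next turn
    return "UNDETERMINED: step limit reached", transcript


def scripted_llm(replies: Iterable[str]) -> Callable[[str], str]:
    """Stand-in for a real model call: returns canned replies in order."""
    remaining = iter(replies)
    return lambda _transcript: next(remaining)


replies = [
    'THOUGHT: Check whether overnight travel is feasible.\n'
    'ACTION: check_location("Austin", "Dallas")',
    "THOUGHT: Travel is feasible, so I need the customer to confirm.\n"
    "FINAL ANSWER: Needs customer verification.",
]
tools = {"check_location": lambda a, b: f"{a} to {b}: 200 miles, ~3 hours by car"}
answer, transcript = run_react(
    scripted_llm(replies), tools, "Card used in Austin at 11 PM, then Dallas at 8 AM"
)
print(transcript)
```

The `max_steps` cap is a design choice worth keeping in a real system: without it, a model that never emits a final answer would loop forever.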

ReACT Example: Fraud Investigation

react_prompt = """
You are a fraud investigator. Use the ReACT framework:

THOUGHT: [Reason about what you need to investigate]
ACTION: [Specify tool to use: check_location, check_travel_alerts, verify_transaction, contact_customer]
OBSERVATION: [Result from tool - I will provide this]

Continue cycling until you can make a determination.

Case: Credit card used in Austin at 11 PM, then Dallas at 8 AM (200 miles apart)
"""

Conversation flow:

THOUGHT: I need to check if travel from Austin to Dallas overnight is feasible.
ACTION: check_location("Austin to Dallas", "9 hours")
OBSERVATION: 200 miles, ~3 hours by car. Feasible within the 9-hour window.

THOUGHT: The trip is feasible, so location alone does not settle it. Let me check
whether the customer has travel history or notifications on file.
ACTION: check_travel_alerts("customer_id_1847")
OBSERVATION: No travel notifications on file. Customer typically local.

THOUGHT: No travel alerts, unusual pattern. I should attempt customer contact.
ACTION: contact_customer("phone")
OBSERVATION: No answer, voicemail left.

THOUGHT: Cannot reach customer, unusual travel pattern, no prior notification.
Combined with late-night purchases and geographic spread, this is high-risk.

FINAL ANSWER: High fraud risk. Recommend temporary hold and escalation.
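The first observation in this flow rests on simple arithmetic: 200 miles in the 9 hours between 11 PM and 8 AM needs an average of only about 22 mph. A location tool could compute that directly instead of leaving it to the model. The sketch below is one hypothetical way to do it; the function names and the 60 mph driving limit are assumptions made for illustration.

```python
def required_speed_mph(distance_miles: float, window_hours: float) -> float:
    """Average speed needed to cover the distance within the time window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return distance_miles / window_hours


def travel_feasibility(
    distance_miles: float, window_hours: float, max_avg_mph: float = 60.0
) -> str:
    """Compare the required speed with an assumed sustainable driving average."""
    speed = required_speed_mph(distance_miles, window_hours)
    verdict = "feasible" if speed <= max_avg_mph else "not feasible"
    return f"{verdict} by car: needs {speed:.1f} mph average (limit {max_avg_mph:.0f} mph)"


print(travel_feasibility(200, 9))
# feasible by car: needs 22.2 mph average (limit 60 mph)
```

A feasible trip is not proof of legitimacy, which is why the investigation above goes on to check travel alerts and contact the customer.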

Applied Example: Fraud Detection

Let’s see how these techniques transform fraud analysis from basic to systematic.

Level 1: Basic Prompt

basic_prompt = "Analyze this credit card activity for fraud."

# Response: "This may be fraudulent due to unusual patterns."
# Problem: No reasoning trail, not auditable

Level 2: Chain-of-Thought

cot_prompt = """
You are a fraud analyst. Use this methodology:

STEP 1: Pattern Recognition
- Identify unusual patterns in timing, location, amounts
- Compare against typical customer behavior

STEP 2: Risk Factor Analysis
- Evaluate geographic anomalies
- Assess timing inconsistencies
- Review spending amounts and categories

STEP 3: Behavioral Assessment
- Do purchases align with customer profile?
- Is the transaction sequence logical?
- Are travel patterns feasible?

STEP 4: Evidence Synthesis
- Compile suspicious indicators
- Weigh evidence strength
- Consider alternative explanations

STEP 5: Conclusion
- Fraud likelihood: Low/Medium/High
- Recommended action: approve, decline, contact customer
- Key evidence supporting decision

Work through each step systematically.
"""

Sample Output:

STEP 1: Pattern Recognition
- 8 transactions in 48 hours totaling $5,456
- Customer typical: $2,800/month, local Austin purchases
- Anomaly: Late-night online purchases, multi-city same-day

STEP 2: Risk Factor Analysis
- Geographic: Austin → Dallas → Houston in 12 hours (unusual)
- Timing: 2 AM online purchases (atypical for this customer)
- Amounts: $2,450 single purchase (near monthly total)

STEP 3: Behavioral Assessment
- Profile mismatch: Software consultant, business purchases typical
- $890 gaming platform doesn't fit professional profile
- Travel pattern requires 400+ miles in 12 hours

STEP 4: Evidence Synthesis
Strong indicators:
- Geographic implausibility (Houston lunch after Dallas morning)
- Spending pattern deviation (2x monthly average in 48 hours)
- Transaction timing (2 AM purchases unusual for profile)

Weak indicators:
- Austin transactions could be legitimate
- Camera equipment could be work-related

STEP 5: Conclusion
Fraud Likelihood: HIGH
Recommended Action: Temporary hold, immediate customer contact
Key Evidence: Implausible same-day travel between Dallas and Houston
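Because the output follows a fixed STEP layout, it can be checked mechanically before it goes into a case file. The sketch below splits a response into its steps, reports any that are missing, and pulls out the likelihood label. It assumes the model followed the format exactly; `split_steps`, `missing_steps`, and `extract_likelihood` are hypothetical helper names.

```python
import re
from typing import Optional

STEP_RE = re.compile(r"^STEP (\d+): (.+)$", re.MULTILINE)


def split_steps(output: str) -> dict[int, tuple[str, str]]:
    """Map step number -> (title, body) for every 'STEP n: Title' header found."""
    matches = list(STEP_RE.finditer(output))
    sections = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(output)
        sections[int(m.group(1))] = (m.group(2).strip(), output[m.end():end].strip())
    return sections


def missing_steps(output: str, expected: int = 5) -> list[int]:
    """Step numbers in 1..expected that the response did not include."""
    return sorted(set(range(1, expected + 1)) - set(split_steps(output)))


def extract_likelihood(output: str) -> Optional[str]:
    """Return LOW, MEDIUM, or HIGH from the conclusion, or None if absent."""
    m = re.search(r"Fraud Likelihood:\s*(LOW|MEDIUM|HIGH)", output, re.IGNORECASE)
    return m.group(1).upper() if m else None


# A deliberately truncated response, to show the missing-step check.
sample = """STEP 1: Pattern Recognition
- 8 transactions in 48 hours totaling $5,456

STEP 5: Conclusion
Fraud Likelihood: HIGH
Recommended Action: Temporary hold, immediate customer contact
"""
print(missing_steps(sample))       # [2, 3, 4]
print(extract_likelihood(sample))  # HIGH
```

A response with missing steps can be rejected or re-requested rather than filed as an incomplete analysis.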

Level 3: ReACT with Tools

react_prompt = """
You are a fraud investigator with access to these tools:
- check_geolocation(from, to, time_hours): Check if travel is feasible
- verify_transaction(tx_id): Get merchant verification status
- check_customer_history(customer_id): Get recent activity patterns
- contact_customer(method): Attempt customer contact

Use THOUGHT, ACTION, OBSERVATION format to investigate systematically.
"""

This approach produces an investigation that:

  • Documents every decision point
  • Shows what tools were consulted
  • Provides evidence for each conclusion
  • Creates an audit trail for compliance
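One way to capture those four properties is to record each THOUGHT / ACTION / OBSERVATION as structured data instead of free text. The sketch below is a minimal illustration; `ReactStep` and `AuditTrail` are hypothetical classes, and a real system would add whatever fields its compliance team requires.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass
class ReactStep:
    thought: str
    action: Optional[str] = None       # e.g. 'check_geolocation("Austin", "Dallas", 9)'
    observation: Optional[str] = None  # tool result, if an action was taken
    timestamp: str = field(default_factory=_utc_now)


@dataclass
class AuditTrail:
    case_id: str
    steps: list[ReactStep] = field(default_factory=list)
    final_answer: Optional[str] = None

    def record(self, thought: str, action: Optional[str] = None,
               observation: Optional[str] = None) -> None:
        """Append one decision point, timestamped in UTC."""
        self.steps.append(ReactStep(thought, action, observation))

    def tools_consulted(self) -> list[str]:
        """Names of the tools called, in order."""
        return [s.action.split("(", 1)[0] for s in self.steps if s.action]

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


trail = AuditTrail(case_id="case-001")
trail.record(
    "Check whether overnight travel is feasible.",
    'check_geolocation("Austin", "Dallas", 9)',
    "200 miles, ~3 hours by car",
)
trail.record("Travel is feasible, so customer confirmation is needed.")
trail.final_answer = "Needs customer verification"
print(trail.tools_consulted())
print(trail.to_json())
```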

Prompt Refinement Best Practices

Getting good results from CoT and ReACT requires careful prompt engineering.

Do’s

  1. Be Specific About Steps

    # Good
    "STEP 1: Calculate debt-to-income ratio
    STEP 2: Compare to threshold of 43%
    STEP 3: Assess risk category"

    # Bad
    "Analyze the loan application"
  2. Specify Output Format

    "Provide fraud likelihood as: LOW (< 30%), MEDIUM (30-70%), HIGH (> 70%)"
  3. Include Examples for Complex Tasks

    """
    Example analysis:
    Transaction: $50 coffee shop
    THOUGHT: Small amount, local merchant, typical category
    CONCLUSION: LOW risk

    Now analyze: [actual transaction]
    """
  4. Define Tool Descriptions Clearly

    tools = {
        "check_balance": "Returns current account balance in USD",
        "get_history": "Returns last 30 transactions as JSON array",
    }
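Specific steps also have the advantage that you can check them yourself. The three loan steps from the first item translate directly into code, which gives you a reference value to compare against the model's answer. This is a sketch under assumptions: the function names and the two category labels are made up here, and only the 43% threshold comes from the prompt above.

```python
DTI_THRESHOLD = 0.43  # the 43% threshold named in STEP 2


def debt_to_income(monthly_debt: float, gross_monthly_income: float) -> float:
    """STEP 1: debt-to-income ratio as a fraction (0.36 means 36%)."""
    if gross_monthly_income <= 0:
        raise ValueError("gross_monthly_income must be positive")
    return monthly_debt / gross_monthly_income


def assess_application(monthly_debt: float, gross_monthly_income: float) -> dict:
    dti = debt_to_income(monthly_debt, gross_monthly_income)  # STEP 1
    within = dti <= DTI_THRESHOLD                             # STEP 2
    # STEP 3: the labels below are illustrative assumptions, not an industry standard.
    category = "within threshold" if within else "above threshold"
    return {"dti": round(dti, 4), "threshold": DTI_THRESHOLD, "category": category}


print(assess_application(1800, 5000))
print(assess_application(2500, 5000))
```

With $1,800 of monthly debt on $5,000 of income the ratio is 36%, under the threshold; at $2,500 it is 50%, over it.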

Don’ts

  1. Ambiguity: “Check if this looks okay” → What’s “okay”?
  2. Missing Context: Expecting the model to know customer history without providing it
  3. Conflicting Instructions: “Be thorough but keep it brief”
  4. Poor Tool Descriptions: If tools aren’t well-documented, the LLM won’t use them correctly

Common Pitfalls

| Pitfall | Example | Fix |
| --- | --- | --- |
| Ambiguity | “Analyze this” | “Calculate fraud risk score 0-100” |
| No context | “Is this suspicious?” | Provide customer profile, history |
| Too much context | 50 pages of transactions | Summarize or paginate |
| Bias in examples | Only showing fraud cases | Include legitimate examples too |
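For the "too much context" row, one option is to compress the transaction history into a few aggregate lines before it reaches the prompt. The sketch below shows the idea with a hypothetical `Tx` record and `summarize` helper; which aggregates matter will depend on your own fraud rules.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Tx:
    amount: float
    city: str
    category: str


def summarize(txs: list[Tx], top_n: int = 3) -> str:
    """Reduce a transaction list to a short, prompt-friendly summary."""
    if not txs:
        return "No transactions."
    total = sum(t.amount for t in txs)
    largest = max(txs, key=lambda t: t.amount)
    cities = Counter(t.city for t in txs).most_common(top_n)
    city_text = ", ".join(f"{city} ({count})" for city, count in cities)
    return "\n".join([
        f"{len(txs)} transactions totaling ${total:,.2f}",
        f"Largest: ${largest.amount:,.2f} {largest.category} in {largest.city}",
        f"Top cities: {city_text}",
    ])


history = [
    Tx(2450.0, "Austin", "electronics"),
    Tx(890.0, "Dallas", "gaming"),
    Tx(50.0, "Austin", "coffee"),
]
print(summarize(history))
```

Summaries lose detail, so it can help to give the model a tool for fetching individual transactions when it needs them.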

Combining Techniques

The most effective approach often combines role-based prompting (from our previous post), Chain-of-Thought, and ReACT:

system_prompt = """
You are a Senior Fraud Analyst (CFE certified) with 10 years of experience
in credit card fraud investigation.

Use the ReACT framework with these tools:
- check_geolocation(from, to, hours)
- verify_merchant(merchant_id)
- get_customer_profile(customer_id)

For each investigation:
1. State your initial hypothesis
2. Gather evidence using tools (THOUGHT → ACTION → OBSERVATION)
3. Apply the fraud scoring matrix:
   - Geographic anomalies: +30 points
   - Timing anomalies: +20 points
   - Amount anomalies: +25 points
   - Profile mismatch: +25 points
4. Calculate total score and classify:
   - 0-30: LOW, approve
   - 31-60: MEDIUM, verify with customer
   - 61-100: HIGH, decline and escalate

Document every step for audit compliance.
"""

Why This Matters for Finance

Financial services face unique requirements that make structured reasoning essential:

  1. Regulatory Compliance: Decisions must be explainable and documented
  2. Audit Trails: Every determination needs supporting evidence
  3. Customer Trust: Clear reasoning builds confidence in AI-assisted decisions
  4. Error Reduction: Step-by-step reasoning exposes mistakes so they can be caught before they become costly

The difference between “this looks fraudulent” and a documented 5-step analysis could be the difference between a satisfied customer and a regulatory investigation.

Takeaways

  1. Chain-of-Thought prompting forces the LLM to show its reasoning, creating auditable decision trails

  2. ReACT adds the ability to gather external information, making reasoning dynamic and evidence-based

  3. Structured methodologies (numbered steps, clear criteria) produce consistent, comparable analyses

  4. Tool integration transforms LLMs from responders into investigators that can verify facts

  5. Prompt refinement is iterative - test, analyze failures, adjust, repeat


This is the second post in my Applied Agentic AI for Finance series. Next: Building Financial Prompt Pipelines where we’ll explore chaining prompts and feedback loops for complex financial workflows.
