Welcome to the Deep Dive

A production-grade sentiment analysis service built on AWS serverless services

🎯

What We're Building

A real-time financial sentiment analysis platform that aggregates news from Tiingo & Finnhub, computes sentiment scores, and delivers alerts to users. Think Bloomberg Terminal meets modern fintech.

⚡

Key Technical Decisions

Single-table DynamoDB design with 3 GSIs
Lambda with Function URL (not API Gateway)
In-memory caching for warm invocations
Circuit breakers per external service

📊

The Numbers

1,230+ unit tests passing
~$2.50/mo dev environment cost
80-95% DynamoDB read reduction via caching
<100ms p99 latency target

Architecture Overview

A serverless event-driven pipeline for sentiment analysis

Data Flow Pipeline

1. Ingestion Lambda

EventBridge triggers every 5 minutes → Fetches news from Tiingo/Finnhub → Writes to DynamoDB with status='pending'

2. SNS Notification

New items trigger SNS → Fan-out to Analysis Lambda

3. Analysis Lambda

Compute sentiment scores → Update item with sentiment + confidence → Set status='analyzed'

4. Dashboard Lambda

REST API serving endpoints → Query GSIs → Return aggregated data

5. Notification Lambda

Evaluate alert rules → Send emails via SendGrid → Track quotas
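
The status transition in steps 1 and 3 can be sketched as below. This is a minimal illustration, not the service's code: the function names, the item fields other than status, sentiment and confidence, and the use of a string label for sentiment are all assumptions.

```python
from typing import Any, Dict


def new_item(article_id: str, ticker: str, headline: str) -> Dict[str, Any]:
    """Step 1 (hypothetical shape): ingestion stores a fetched article as pending."""
    return {
        "article_id": article_id,
        "ticker": ticker,
        "headline": headline,
        "status": "pending",
    }


def mark_analyzed(item: Dict[str, Any], sentiment: str, confidence: float) -> Dict[str, Any]:
    """Step 3: attach sentiment + confidence and set status='analyzed'."""
    if item["status"] != "pending":
        raise ValueError(f"expected a pending item, got status={item['status']!r}")
    # Return a new dict so the caller's copy is left untouched.
    return {**item, "sentiment": sentiment, "confidence": confidence, "status": "analyzed"}
```

Rejecting items that are not pending is one way to keep a redelivered SNS message from overwriting an existing score; whether the real Analysis Lambda does this is not stated here.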

🗄️

DynamoDB Single-Table Design

# Primary Keys
USER#{user_id}   | PROFILE
USER#{user_id}   | CONFIG#{config_id}
CONFIG#{config_id} | ALERT#{alert_id}
TOKEN#{token}    | TOKEN

# GSIs
by_sentiment: sentiment → timestamp
by_status: status → timestamp
by_tag: tag → timestamp
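
As a rough sketch, the primary key patterns above could be built with small helpers like these. The attribute names PK and SK and the helper names are assumptions; only the key formats come from the list above.

```python
from typing import Dict


def user_profile_key(user_id: str) -> Dict[str, str]:
    """USER#{user_id} | PROFILE"""
    return {"PK": f"USER#{user_id}", "SK": "PROFILE"}


def user_config_key(user_id: str, config_id: str) -> Dict[str, str]:
    """USER#{user_id} | CONFIG#{config_id}"""
    return {"PK": f"USER#{user_id}", "SK": f"CONFIG#{config_id}"}


def config_alert_key(config_id: str, alert_id: str) -> Dict[str, str]:
    """CONFIG#{config_id} | ALERT#{alert_id}"""
    return {"PK": f"CONFIG#{config_id}", "SK": f"ALERT#{alert_id}"}


def token_key(token: str) -> Dict[str, str]:
    """TOKEN#{token} | TOKEN"""
    return {"PK": f"TOKEN#{token}", "SK": "TOKEN"}
```

With this layout, listing a user's configurations would typically be a single Query on the USER#{user_id} partition with a sort key prefix of CONFIG#.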

📦

Lambda Functions

ingestion	News fetcher (5min schedule)
analysis	Sentiment scorer (SNS trigger)
dashboard	REST API
notification	Alert evaluator & emailer
metrics	CloudWatch metrics (1min)

Authentication

Multiple auth strategies for different user journeys

👤

Anonymous Sessions

Quick start for new users. Creates a temporary session with limited capabilities (1 config, basic alerts).

✉️

Magic Link

Passwordless email authentication. Token expires in 15 minutes. Converts anonymous sessions to full accounts.
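
A minimal sketch of a 15-minute magic link token, assuming an opaque random token stored alongside its expiry. The record shape and function names are hypothetical; only the 15-minute lifetime comes from the description above.

```python
import secrets
import time
from typing import Any, Dict, Optional

MAGIC_LINK_TTL_SECONDS = 15 * 60  # token expires in 15 minutes


def issue_token(now: Optional[float] = None) -> Dict[str, Any]:
    """Create a URL-safe random token plus its expiry timestamp (epoch seconds)."""
    now = time.time() if now is None else now
    return {
        "token": secrets.token_urlsafe(32),
        "expires_at": now + MAGIC_LINK_TTL_SECONDS,
    }


def is_valid(record: Dict[str, Any], now: Optional[float] = None) -> bool:
    """A token is valid strictly before its expiry time."""
    now = time.time() if now is None else now
    return now < record["expires_at"]
```

Passing the clock in as a parameter keeps the expiry logic easy to test without sleeping.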

🔗

OAuth 2.0

Google OAuth integration via Cognito. Full account with all features. Supports account merging.

Try It: Create Anonymous Session

POST /api/v2/auth/anonymous

Configuration Management

User-defined watchlists with ticker tracking

Business Rules

Max 2 configurations per user (free tier)
Max 5 tickers per configuration
Ticker symbols validated against exchange data
Soft deletes preserve audit trail
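
The first three rules could be checked along these lines. This is a sketch under assumptions: the function name, the error messages, and the known_symbols set standing in for exchange data are all hypothetical.

```python
from typing import List, Set

MAX_CONFIGS_FREE_TIER = 2
MAX_TICKERS_PER_CONFIG = 5


def validate_new_config(existing_count: int, tickers: List[str], known_symbols: Set[str]) -> List[str]:
    """Return a list of rule violations; an empty list means the config is acceptable."""
    errors = []
    if existing_count >= MAX_CONFIGS_FREE_TIER:
        errors.append(f"free tier allows at most {MAX_CONFIGS_FREE_TIER} configurations")
    if len(tickers) > MAX_TICKERS_PER_CONFIG:
        errors.append(f"at most {MAX_TICKERS_PER_CONFIG} tickers per configuration")
    # known_symbols is a stand-in for whatever exchange data the service validates against.
    unknown = [t for t in tickers if t.upper() not in known_symbols]
    if unknown:
        errors.append("unknown tickers: " + ", ".join(unknown))
    return errors
```

Collecting all violations rather than stopping at the first lets the API report every problem in one response.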

Caching Strategy

Config List Cache ~80% read reduction

60-second TTL in-memory cache. Invalidated on create/update/delete.
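
A minimal sketch of a 60-second in-memory cache with explicit invalidation. The class and method names are assumptions, and the injectable clock exists only to make the example testable.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Process-local cache; entries expire ttl_seconds after they are stored."""

    def __init__(self, ttl_seconds: float = 60.0, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller falls through to DynamoDB.
            del self._entries[key]
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def invalidate(self, key: str) -> None:
        """Call after create/update/delete so stale lists are never served."""
        self._entries.pop(key, None)
```

An instance held at module level survives only while a Lambda execution environment stays warm, and each concurrent environment has its own copy, so invalidation in one environment does not reach the others until their entries expire.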

Try It: Create Configuration

POST /api/v2/configurations

{"name": "Tech Watchlist", "tickers": ["AAPL", "MSFT", "NVDA"]}

Try It: List Configurations

GET /api/v2/configurations

Requires an active session, so create one first.

Sentiment Analysis

Real-time market sentiment from multiple sources

📰

Tiingo News

Primary news source. Provides headlines, descriptions, and tickers mentioned. Free tier: 500 symbols/month.

📊

Finnhub Sentiment

Social sentiment scores. Bullish/bearish percentages. Free tier: 60 calls/minute.

🤖

Our Model

Weighted ensemble of Tiingo + Finnhub. Confidence-weighted averaging. Handles missing sources gracefully.
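
Confidence-weighted averaging with missing sources might look like the sketch below. The input shape (a score and a confidence per source, or None when a source failed) and the function name are assumptions; the actual model's weights are not described here.

```python
from typing import Dict, Optional, Tuple


def ensemble(readings: Dict[str, Optional[Tuple[float, float]]]) -> Optional[dict]:
    """Combine per-source (score, confidence) pairs into one weighted score.

    A source that failed is passed as None and simply drops out of the average.
    Returns None when no source contributed anything.
    """
    available = {
        name: reading
        for name, reading in readings.items()
        if reading is not None and reading[1] > 0
    }
    if not available:
        return None
    total_confidence = sum(conf for _, conf in available.values())
    score = sum(s * conf for s, conf in available.values()) / total_confidence
    return {"score": score, "sources": sorted(available)}
```

Returning the list of contributing sources gives callers a way to see when a result is based on partial data.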

Try It: Get Sentiment for Configuration

GET /api/v2/configurations/{config_id}/sentiment

Requires an existing configuration. Returns sentiment for that configuration's tickers.

External API Integration

Resilient adapters for third-party services

⚡

Adapter Pattern

Each external API has a dedicated adapter class. Standardized interface for news/sentiment. Easy to add new sources.
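
One way to express a standardized adapter interface is a typing Protocol, sketched below. NewsItem, NewsAdapter, StaticAdapter and fetch_all are hypothetical names; StaticAdapter returns canned data where a real adapter would call the vendor API.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class NewsItem:
    source: str
    ticker: str
    headline: str


class NewsAdapter(Protocol):
    """The interface every source adapter is assumed to satisfy."""

    name: str

    def fetch(self, tickers: List[str]) -> List[NewsItem]: ...


class StaticAdapter:
    """Stand-in adapter backed by a dict; used here instead of a real HTTP client."""

    def __init__(self, name: str, headlines: Dict[str, str]):
        self.name = name
        self._headlines = headlines

    def fetch(self, tickers: List[str]) -> List[NewsItem]:
        return [NewsItem(self.name, t, self._headlines[t]) for t in tickers if t in self._headlines]


def fetch_all(adapters: List[NewsAdapter], tickers: List[str]) -> List[NewsItem]:
    """Callers depend only on the interface, so adding a source means adding an adapter."""
    items: List[NewsItem] = []
    for adapter in adapters:
        items.extend(adapter.fetch(tickers))
    return items
```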

🛡️

Quota Tracking

Per-service quota limits tracked in DynamoDB. Warning at 50%, critical at 80%. Reserve 10% for priority operations.
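
The thresholds above can be sketched with integer arithmetic as follows. The function names are hypothetical, and reading "reserve 10%" as "non-priority calls stop at 90% of the limit" is an assumption about the intended behavior.

```python
WARNING_PERCENT = 50
CRITICAL_PERCENT = 80
RESERVE_PERCENT = 10  # assumed to mean: the last 10% is for priority calls only


def quota_level(used: int, limit: int) -> str:
    """Classify current usage as ok, warning (>= 50%) or critical (>= 80%)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if used * 100 >= CRITICAL_PERCENT * limit:
        return "critical"
    if used * 100 >= WARNING_PERCENT * limit:
        return "warning"
    return "ok"


def may_call(used: int, limit: int, priority: bool = False) -> bool:
    """Non-priority calls may not eat into the reserved slice of the quota."""
    reserve = 0 if priority else limit * RESERVE_PERCENT // 100
    return used < limit - reserve
```

For example, with the Tiingo limit of 500 per month, non-priority calls would stop at 450 while priority calls could continue up to 500.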

API Quota Status

Tiingo (500/mo)
Finnhub (60/min)
SendGrid (100/day)

Circuit Breaker

Prevent cascading failures with per-service circuit breakers

State Machine

✅ CLOSED (normal ops) → 5 failures → 🚫 OPEN (blocking) → 60s timeout → 🔄 HALF-OPEN (testing)
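
The state machine above can be sketched as a small class. The class and method names are assumptions, as is the choice to reopen the circuit when the HALF-OPEN probe fails; the 5-failure threshold and 60-second timeout come from the diagram.

```python
import time
from typing import Callable


class CircuitBreaker:
    """CLOSED -> OPEN after N failures -> HALF_OPEN after a timeout -> CLOSED on success."""

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self._opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state == "CLOSED":
            return True
        if self.state == "OPEN" and self._clock() - self._opened_at >= self._reset_timeout:
            # Timeout elapsed: let exactly one probe request through.
            self.state = "HALF_OPEN"
            return True
        # Still OPEN, or HALF_OPEN with a probe already in flight.
        return False

    def record_success(self) -> None:
        self.state = "CLOSED"
        self.failures = 0

    def record_failure(self) -> None:
        self.failures += 1
        # Assumption: a failed probe reopens the circuit immediately.
        if self.state == "HALF_OPEN" or self.failures >= self._threshold:
            self.state = "OPEN"
            self._opened_at = self._clock()
```

A caller would check allow_request() before each external call, then report the outcome with record_success() or record_failure().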

Live Circuit Status

Tiingo

State: CLOSED

Failures: 0/5

Finnhub

State: CLOSED

Failures: 0/5

SendGrid

State: CLOSED

Failures: 0/5

Traffic Generator

Generate synthetic traffic to demonstrate system behavior

Quick Start Commands

Run these commands from the interview/ directory:

# Basic flow: session → config → sentiment + OHLC
python3 traffic_generator.py --env preprod --scenario basic

# Price-Sentiment Overlay: OHLC + historical sentiment
python3 traffic_generator.py --env preprod --scenario price-sentiment

# Cache warmup: show latency improvement
python3 traffic_generator.py --env preprod --scenario cache

# Load test: 5 users, 10 requests each
python3 traffic_generator.py --env preprod --scenario load --users 5 --requests 10

# Rate limit test: burst 50 requests
python3 traffic_generator.py --env preprod --scenario rate-limit

# Run ALL scenarios
python3 traffic_generator.py --env preprod --scenario all

🎯

Basic Flow

Creates session → Creates config → Lists configs → Fetches sentiment + OHLC + sentiment history. Great for walking through the happy path.

📈

Price-Sentiment Overlay

Tests OHLC and sentiment history endpoints with multiple tickers (AAPL, MSFT, NVDA) and time ranges (1W, 1M, 3M). Validates response shapes.

🚀

Cache Warmup

Shows cold vs warm cache latency difference. Watch the latency drop from ~200ms to ~50ms as caches warm up.

⚡

Load Test

Simulates concurrent users hitting the API. Demonstrates horizontal scalability and resource isolation.

🛑

Rate Limit

Bursts 50 requests to trigger rate limiting. Shows graceful degradation with 429 responses.

Chaos Engineering

Inject failures to test resilience

Failure Injection Scenarios

These demonstrate what happens when external services fail. Watch how circuit breakers protect the system.

Tiingo API Failure

Simulate 500 errors from Tiingo. Circuit opens after 5 failures.

Finnhub API Failure

Simulate 500 errors from Finnhub. System falls back to Tiingo only.

Tiingo Timeout (10s)

Simulate slow responses. Shows timeout handling and fallback.

SendGrid Rate Limit

Simulate 429 from SendGrid. Emails queue for retry.

⚠️ Interview Demo Note

In production, chaos experiments would be enabled via environment variables or feature flags. For this demo, we simulate the effects to show how the system would respond.

What to Observe

1. Circuit State Changes

Watch the Circuit Breaker section → Tiingo card changes from CLOSED (green) to OPEN (red) after 5 failures.

2. Graceful Degradation

Sentiment API returns partial data from Finnhub only. Response includes "source_status" showing which sources succeeded.

3. Self-Healing

After 60 seconds, circuit enters HALF-OPEN state and tests with a single request. If successful, circuit closes.

Caching Strategy

Multi-layer caching for performance and cost optimization

Cache Layers

Layer	TTL	Savings	Pattern
Circuit Breaker	60s	~90% DynamoDB reads	Write-through
Quota Tracker	60s	~95% DynamoDB reads	Batched sync
User Configs	60s	~80% DynamoDB reads	Invalidate on mutate
Sentiment	5min	~70% CPU	Read-through
GSI Metrics	60s	~40% RCU	Read-through

Observability

Logs, metrics, and traces for production debugging

📝

Structured Logging

JSON logs with correlation IDs. Log levels: DEBUG, INFO, WARN, ERROR. Sensitive data sanitized before logging.
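
A minimal sketch of JSON logging with a correlation ID and sanitization, built on the standard logging module. The formatter name, the correlation_id and context record attributes, and the list of sensitive keys are all assumptions.

```python
import json
import logging

# Assumed list of keys whose values must never reach the logs.
SENSITIVE_KEYS = {"email", "token", "authorization"}


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, masking sensitive context values."""

    def format(self, record: logging.LogRecord) -> str:
        # correlation_id and context are expected to arrive via the `extra=` argument.
        context = getattr(record, "context", {})
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            "correlation_id": getattr(record, "correlation_id", None),
            "context": {
                key: ("***" if key.lower() in SENSITIVE_KEYS else value)
                for key, value in context.items()
            },
        }
        return json.dumps(payload)
```

A handler would use it via handler.setFormatter(JsonFormatter()), with callers passing extra={"correlation_id": ..., "context": {...}}. Note that Python's standard level name is WARNING, so emitting WARN would need a small mapping.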

📊

CloudWatch Metrics

Custom metrics for sentiment distribution, ingestion rates, cache hit rates. 1-minute granularity.

🔍

X-Ray Tracing

End-to-end distributed tracing. Trace external API calls, DynamoDB operations. Identify bottlenecks.

Health Check

GET /health

Service health and dependency status

Testing Strategy

Comprehensive testing pyramid with oracle-based validation

Test Pyramid

E2E

Integration

Unit (1,230+ tests)

✅

Unit Tests

Moto for DynamoDB mocking
responses for HTTP mocking
pytest with coverage
Fast feedback (<30s)

🔄

Integration Tests

Direct handler invocation
Cross-component flows
Mocked external APIs
Contract validation

🎯

E2E + Oracle

Real AWS resources
Synthetic data generators
Test oracle validation
TTL-based cleanup
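
Oracle-based validation can be illustrated with a seeded generator and an independent calculation of the expected answer. Everything here is a hypothetical sketch: the item shape, the sentiment labels, and the idea that the endpoint under test returns a count per label are assumptions.

```python
import random
from collections import Counter
from typing import Dict, List

LABELS = ["positive", "neutral", "negative"]  # assumed label set


def generate_items(seed: int, count: int) -> List[dict]:
    """Deterministic synthetic data: the same seed always yields the same items."""
    rng = random.Random(seed)
    return [{"id": f"item-{i}", "sentiment": rng.choice(LABELS)} for i in range(count)]


def oracle_distribution(items: List[dict]) -> Dict[str, int]:
    """The oracle: compute the expected answer directly from the generated data."""
    return dict(Counter(item["sentiment"] for item in items))


def validate(api_response: Dict[str, int], items: List[dict]) -> List[str]:
    """Compare what the system returned against the oracle; return any mismatches."""
    expected = oracle_distribution(items)
    return [
        f"{label}: expected {expected.get(label, 0)}, got {api_response.get(label, 0)}"
        for label in sorted(set(expected) | set(api_response))
        if expected.get(label, 0) != api_response.get(label, 0)
    ]
```

In an E2E run, the synthetic items would be written to real AWS resources and api_response would come from the deployed endpoint, while the oracle stays entirely local.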

Infrastructure

Terraform-managed AWS serverless stack

📦

Terraform Modules


infrastructure/terraform/modules/
├── dynamodb/     # Table + GSIs + Backups
├── iam/          # Lambda roles
├── secrets/      # Secrets Manager
├── sns/          # Event fan-out
├── cognito/      # User pools
├── amplify/      # Frontend hosting
└── eventbridge/  # Schedules

💰

Cost Optimization

DynamoDB on-demand (pay per request)
Lambda 128MB minimum memory
Free tier maximization
~$2.50/mo dev environment

Environment Parity

Feature	Dev	Preprod	Prod
DynamoDB	✅	✅	✅
Point-in-Time Recovery	❌	✅	✅
Daily Backups	❌	✅	✅
Amplify Hosting	❌	✅	✅