SYSTEM ARCHITECTURE

INTEL DATA PIPELINE

Automated multi-stage intelligence pipeline running on Cloudflare's global edge. Collects, deduplicates, classifies, fact-checks, and indexes conflict news every hour.

8 STAGES
3 AI MODELS
1h CYCLE TIME
0.85 DEDUP SCORE
6 REST APIs

SYSTEM ARCHITECTURE

EXTERNAL SERVICES
Gemini 3.1 Pro: Google Search grounding
Brave Search: Independent web index

CLOUDFLARE EDGE
Workers: API + Cron runtime
Workflows: Durable pipeline execution
Workers AI: BGE + Llama 3.1-8B
D1 SQL: SQLite edge DB
Vectorize: 384-dim vector index

FRONTEND
React 19 SPA: war.trackit.today
Polls /api/intel every 30s
localStorage cache
PWA offline fallback

PIPELINE FLOW

Each run is a durable Cloudflare Workflow instance. Steps checkpoint automatically — if a step fails mid-run, only that step retries (not the entire pipeline).
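
The retry behaviour described above can be illustrated with a toy model. This is a minimal in-memory sketch and not the Cloudflare Workflows API: runStep, the Checkpoints map, and the attempt limit are hypothetical names chosen only for illustration. It shows why a failure in one step does not re-run steps that already completed.

```typescript
// Toy model of step-level checkpointing. NOT the Cloudflare Workflows API;
// every name here is hypothetical and exists only to illustrate the idea.
type Checkpoints = Map<string, unknown>;

function runStep<T>(
  checkpoints: Checkpoints,
  stepName: string,
  fn: () => T,
  maxAttempts = 3,
): T {
  // A step that already completed returns its saved result without re-running.
  if (checkpoints.has(stepName)) return checkpoints.get(stepName) as T;

  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const value = fn();
      checkpoints.set(stepName, value); // checkpoint on success
      return value;
    } catch (err) {
      lastError = err; // only this step is retried
    }
  }
  throw lastError;
}

// Example: "collect" succeeds once; "embed" fails on its first attempt.
const checkpoints: Checkpoints = new Map();
let collectRuns = 0;
let embedRuns = 0;

const articles = runStep(checkpoints, "collect", () => {
  collectRuns++;
  return ["article-a", "article-b"];
});

const vectorCount = runStep(checkpoints, "embed", () => {
  embedRuns++;
  if (embedRuns === 1) throw new Error("transient inference error");
  return articles.length;
});
```

In this sketch the "embed" step runs twice while "collect" runs only once, which mirrors the checkpointing behaviour the pipeline relies on.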

00
CRON TRIGGER
Cloudflare Workers

A scheduled trigger fires every hour at :00 via Cloudflare's built-in cron system. On-demand runs are also supported through a webhook (POST /api/cron/trigger with a Bearer token). Either path kicks off a new durable Workflow instance.

CRON: 0 * * * * · EVERY HOUR · WEBHOOK MANUAL
OUT: Workflow start signal
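
The Bearer-token guard on the manual trigger could look roughly like the following. This is a sketch under assumptions: the helper names are hypothetical, and the expected token is passed in as a plain string because the text does not say where the secret is stored.

```typescript
// Hypothetical helpers for guarding POST /api/cron/trigger.
// Parses the standard "Authorization: Bearer <token>" header form.
function extractBearerToken(header: string | null): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match?.[1] ?? null;
}

// Compares every character so timing does not depend on where the first
// mismatch occurs (the token length is still revealed).
function safeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

function isAuthorizedTrigger(
  method: string,
  authorization: string | null,
  expectedToken: string,
): boolean {
  if (method.toUpperCase() !== "POST") return false;
  const token = extractBearerToken(authorization);
  return token !== null && safeEqual(token, expectedToken);
}
```

A request handler would call isAuthorizedTrigger with the request method and its Authorization header, and respond with 401 when it returns false.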
01
NEWS COLLECTION
Gemini 3.1 Pro Preview

Gemini 3.1 Pro with real-time Google Search grounding fetches the latest Iran-Israel-US conflict news as a structured JSON array. Thinking mode (high) enables deep multi-step reasoning for better article synthesis. Auto-fallback chain: 3.1 Pro → 2.5 Flash (thinkingBudget 4096) → 2.0 Flash.

GEMINI 3.1 PRO PREVIEW · GOOGLE SEARCH GROUNDING · THINKING: HIGH · 50s TIMEOUT
IN: Structured prompt OUT: ~14 raw articles
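
The auto-fallback chain can be sketched as a loop that tries each model in order and returns the first success. The model identifier strings below are placeholders (the exact API model IDs are not given here), and withFallback and ModelCall are hypothetical names. The 50-second default matches the timeout stated for this step.

```typescript
// Placeholder identifiers; the real API model IDs are an assumption.
const MODEL_CHAIN = ["gemini-3.1-pro", "gemini-2.5-flash", "gemini-2.0-flash"] as const;

// The caller supplies the actual request logic and should pass `signal`
// to its HTTP client so the timeout can cancel an in-flight request.
type ModelCall<T> = (model: string, signal: AbortSignal) => Promise<T>;

async function withFallback<T>(
  models: readonly string[],
  call: ModelCall<T>,
  timeoutMs = 50_000,
): Promise<{ model: string; result: T }> {
  const errors: string[] = [];
  for (const model of models) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      const result = await call(model, controller.signal);
      return { model, result }; // first success wins
    } catch (err) {
      errors.push(`${model}: ${err instanceof Error ? err.message : String(err)}`);
    } finally {
      clearTimeout(timer);
    }
  }
  throw new Error(`All models failed: ${errors.join("; ")}`);
}
```

Returning the model name alongside the result makes it easy to log which tier of the chain actually served a given run.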
02
EMBEDDING GENERATION
Workers AI — BGE-small-en-v1.5

Each article's title + first 500 characters are vectorized into a 384-dimensional dense embedding. Runs entirely on Cloudflare's edge inference — zero external network calls, sub-second latency for the full batch. Step timeout: 30 seconds.

@cf/baai/bge-small-en-v1.5 · 384-DIM VECTORS · BATCH INFERENCE
IN: 14 raw articles OUT: 14 × 384-dim vectors
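
Preparing the text that gets embedded might look like this. It is a minimal sketch: the RawArticle shape and the newline separator between title and body are assumptions, since only "title + first 500 characters" is specified.

```typescript
interface RawArticle {
  title: string;
  content: string;
}

// Title plus the first 500 characters of the body.
// Joining with a newline is an assumption; no separator is specified.
function embeddingInput(article: RawArticle, maxBodyChars = 500): string {
  const body = article.content.slice(0, maxBodyChars);
  return `${article.title}\n${body}`;
}

// One input string per article, ready to send as a single batch.
function embeddingBatch(batch: RawArticle[]): string[] {
  return batch.map((a) => embeddingInput(a));
}
```

Inside a Worker, the resulting string array would typically be passed to the embedding model in one call, for example through a Workers AI binding (assumed here to be named AI) with the batch as the text input.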
03
DEDUPLICATION
Cloudflare Vectorize

Each new embedding is queried against the persistent Vectorize index (growing over time). Articles scoring ≥ 0.85 cosine similarity to any previously seen article are silently dropped as duplicates. Only genuinely novel events advance. Step timeout: 15 seconds.

COSINE SIMILARITY · THRESHOLD: 0.85 · TOP-K: 1 QUERY
IN: 14 articles + vectors OUT: 2–5 unique articles
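
The threshold rule can be shown with a small self-contained example. In this sketch a brute-force scan over an in-memory array stands in for the Vectorize top-1 query; the function names are hypothetical, and only the 0.85 cosine threshold comes from the description above.

```typescript
const DEDUP_THRESHOLD = 0.85;

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // lengths match, so b[i] is always defined here
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Equivalent of a top-K = 1 query: the single best score against the index.
function bestMatchScore(vector: number[], index: number[][]): number {
  let best = -1;
  for (const stored of index) {
    best = Math.max(best, cosineSimilarity(vector, stored));
  }
  return best;
}

function isDuplicate(
  vector: number[],
  index: number[][],
  threshold = DEDUP_THRESHOLD,
): boolean {
  return bestMatchScore(vector, index) >= threshold;
}
```

With an empty index the best score stays at -1, so the first article ever processed is always treated as novel.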
04
CLASSIFY + SUMMARIZE
Workers AI — Llama 3.1-8B Instruct

Each novel article is classified and summarized in two languages in a single LLM call. The output contains: category (Military / Diplomatic / Nuclear / Economic), severity rating (1–5), tag array, English summary, Chinese summary (summary_zh), Chinese title (title_zh), and event_type. Step timeout: 5 minutes.

@cf/meta/llama-3.1-8b-instruct · TEMP: 0.2 · MAX 2048 TOKENS · JSON OUTPUT
IN: Novel articles OUT: Classified + translated
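
Because a small model can return malformed or out-of-range JSON, the output of this step benefits from validation before it moves on. The following is a sketch of such a check; the Classification type and parseClassification are hypothetical names, while the field names and allowed values follow the description above.

```typescript
const CATEGORIES = ["Military", "Diplomatic", "Nuclear", "Economic"] as const;
type Category = (typeof CATEGORIES)[number];

interface Classification {
  category: Category;
  severity: number; // integer 1-5
  tags: string[];
  summary: string;
  summary_zh: string;
  title_zh: string;
  event_type: string;
}

const isCategory = (v: unknown): v is Category =>
  typeof v === "string" && (CATEGORIES as readonly string[]).includes(v);

const isNonEmptyString = (v: unknown): v is string =>
  typeof v === "string" && v.length > 0;

// Returns null instead of throwing so the caller can skip or retry the article.
function parseClassification(raw: string): Classification | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const o = data as Record<string, unknown>;
  const { category, severity, tags, summary, summary_zh, title_zh, event_type } = o;

  if (!isCategory(category)) return null;
  if (typeof severity !== "number" || !Number.isInteger(severity)) return null;
  if (severity < 1 || severity > 5) return null;
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === "string")) return null;
  if (!isNonEmptyString(summary) || !isNonEmptyString(summary_zh)) return null;
  if (!isNonEmptyString(title_zh) || !isNonEmptyString(event_type)) return null;

  return { category, severity, tags: tags as string[], summary, summary_zh, title_zh, event_type };
}
```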
05
FACT CORROBORATION
Brave Search API

A targeted Brave Search query (article title + 'Iran Israel US military 2026') is executed for each article. Results from independent news sources are collected as corroborating evidence — Brave uses its own independent index, not Google or Bing.

BRAVE SEARCH API · LIVE WEB INDEX · INDEPENDENT SOURCES
IN: Article title + key terms OUT: Independent news sources
06
FACT-CHECK
Workers AI — Llama 3.1-8B Instruct

LLM cross-references the original article against Brave Search results. Returns: status (verified / uncertain / disputed), confidence score (0–100), human-readable verification notes, and corroborating URLs. Disputed articles are silently dropped from the pipeline — never stored. Step timeout: 3 minutes.

VERIFIED | UNCERTAIN | DISPUTED · CONFIDENCE 0–100 · JSON OUTPUT
IN: Article + web sources OUT: Verified and uncertain articles (disputed dropped)
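
The drop rule for this step can be sketched as a simple filter. The type and function names are hypothetical; the three statuses and the 0–100 confidence range come from the description above. Clamping the confidence is an added safeguard in this sketch, not something the text specifies.

```typescript
type FactStatus = "verified" | "uncertain" | "disputed";

interface FactCheck {
  status: FactStatus;
  confidence: number; // expected 0-100
  notes: string;
  sources: string[];
}

interface CheckedArticle {
  title: string;
  factCheck: FactCheck;
}

// Assumption: out-of-range or non-finite scores from the model are clamped.
function clampConfidence(value: number): number {
  if (!Number.isFinite(value)) return 0;
  return Math.min(100, Math.max(0, Math.round(value)));
}

// Disputed articles are dropped; verified and uncertain ones continue.
function keepForStorage(checked: CheckedArticle[]): CheckedArticle[] {
  return checked
    .filter((a) => a.factCheck.status !== "disputed")
    .map((a) => ({
      ...a,
      factCheck: { ...a.factCheck, confidence: clampConfidence(a.factCheck.confidence) },
    }));
}
```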
07
STORE + INDEX
D1 Database + Vectorize

Articles that pass the fact-check are written to D1 with all metadata. Their embeddings are upserted to Vectorize so future runs can deduplicate against them. A live_events record is also created for the Timeline view. Data is immediately available via the /api/intel REST endpoint. Step timeout: 30 seconds.

CLOUDFLARE D1 (SQLITE) · VECTORIZE UPSERT · LIVE EVENTS TABLE · /api/intel
IN: Verified articles OUT: Persisted + indexed + served

TECHNOLOGY STACK

CF
Cloudflare Workers
RUNTIME

Serverless compute at the edge. The entire backend runs as a single Workers bundle — zero server management, global distribution.

V8 Isolate · Edge CDN · < 1ms cold start · Cron + Webhook
WF
Cloudflare Workflows
ORCHESTRATION

Long-running durable pipeline with automatic step-level retry and checkpointing. If a step fails, only that step retries — not the whole pipeline.

Durable execution · Auto-retry · Step checkpointing · Up to 1yr lifespan
AI
Workers AI
EDGE INFERENCE

On-edge ML inference runs directly inside the Worker runtime. No external API calls for embeddings or classification — single-digit millisecond overhead.

On-edge ML · BGE embeddings · Llama 3.1-8B · Zero network hop
VZ
Cloudflare Vectorize
VECTOR DATABASE

Persistent vector database grows with every processed article. Enables semantic deduplication — not just exact-match but concept-level duplicate detection.

Persistent index · 384-dim BGE · Cosine similarity · ANN queries
D1
Cloudflare D1
SQL DATABASE

Edge-native SQLite. Stores all verified articles, analytics events, live timeline events, and oil/market data snapshots.

SQLite-compatible · Edge-native SQL · SYD replica · REST API
GM
Gemini 3.1 Pro
NEWS LLM

Google's frontier model with real-time web access for news collection. Thinking mode produces higher-quality article analysis and structured JSON output.

Google Search grounding · Thinking: HIGH · 50s timeout · 3-model fallback
BR
Brave Search
FACT SEARCH

Independent search index (not Google/Bing-derived) used for fact corroboration. Provides unbiased, real-time corroborating sources for each article.

Independent index · No SEO spam · Real-time results · REST API
RE
React 19 + Vite
FRONTEND

SPA polls /api/intel every 30 seconds. Articles cached in localStorage for offline resilience. PWA-enabled — installable as a native-like app.

React 19 · Vite 6 · Lazy routes · 30s polling
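
The poll-then-fall-back-to-cache behaviour can be sketched without any React specifics. This is an illustrative outline under assumptions: the cache key, loadIntel, and startPolling are hypothetical names, and the fetcher and storage are injected so the logic stays independent of the browser. In a browser, window.localStorage satisfies the KeyValueStore interface used here.

```typescript
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

type Fetcher = () => Promise<unknown[]>;

const CACHE_KEY = "intel-cache"; // hypothetical key name

function readCache(store: KeyValueStore): unknown[] {
  const cached = store.getItem(CACHE_KEY);
  if (cached === null) return [];
  try {
    const parsed: unknown = JSON.parse(cached);
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return []; // corrupted cache entry
  }
}

// Try the network first; on any failure, serve the last cached batch.
async function loadIntel(
  fetchLatest: Fetcher,
  store: KeyValueStore,
): Promise<{ articles: unknown[]; fromCache: boolean }> {
  try {
    const articles = await fetchLatest();
    store.setItem(CACHE_KEY, JSON.stringify(articles));
    return { articles, fromCache: false };
  } catch {
    return { articles: readCache(store), fromCache: true };
  }
}

// Runs `tick` immediately, then every 30 seconds; returns a stop function.
function startPolling(tick: () => void, intervalMs = 30_000): () => void {
  tick();
  const id = setInterval(tick, intervalMs);
  return () => clearInterval(id);
}
```

A component would call startPolling on mount with a tick that invokes loadIntel and updates state, and call the returned stop function on unmount.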

ARTICLE DATA SCHEMA

D1 table articles — each row is a verified, deduplicated, AI-processed news article.

FIELD              TYPE         DESCRIPTION
id                 TEXT (UUID)  Unique article identifier
title              TEXT         Original article headline
content            TEXT         Full article body text from source
summary            TEXT         AI-generated English summary (2–3 sentences)
summary_zh         TEXT         AI-generated Traditional Chinese summary
title_zh           TEXT         AI-translated Chinese headline
category           TEXT         Military | Diplomatic | Nuclear | Economic
event_type         TEXT         airstrike | missile | diplomatic | nuclear | sanction | ...
tags               JSON []      Array of keyword tags (e.g. ['Iran', 'IRGC', 'nuclear'])
severity           INT 1–5      Geopolitical significance score (5 = war-changing)
source             TEXT         News outlet name (e.g. AP News, Al Jazeera)
source_url         TEXT         Original article URL
published_at       DATETIME     Original publication timestamp
fact_check_status  TEXT         verified | uncertain (disputed = dropped)
fact_check_notes   TEXT         AI verification reasoning and caveats
brave_sources      JSON []      Corroborating URLs from Brave Search
created_at         DATETIME     Database insertion timestamp
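
Mapping one article onto this schema as a parameterized INSERT might look like the sketch below. The ArticleRecord interface and buildInsert are hypothetical; the column names follow the table above, and the two JSON columns are serialized to strings because SQLite has no native array type.

```typescript
interface ArticleRecord {
  id: string;
  title: string;
  content: string;
  summary: string;
  summary_zh: string;
  title_zh: string;
  category: string;
  event_type: string;
  tags: string[];
  severity: number;
  source: string;
  source_url: string;
  published_at: string;
  fact_check_status: "verified" | "uncertain";
  fact_check_notes: string;
  brave_sources: string[];
  created_at: string;
}

const COLUMNS: (keyof ArticleRecord)[] = [
  "id", "title", "content", "summary", "summary_zh", "title_zh",
  "category", "event_type", "tags", "severity", "source", "source_url",
  "published_at", "fact_check_status", "fact_check_notes", "brave_sources",
  "created_at",
];

// Builds the SQL text and the positional parameters separately so values
// are always bound, never interpolated into the statement.
function buildInsert(article: ArticleRecord): { sql: string; params: (string | number)[] } {
  const params = COLUMNS.map((col) => {
    const value = article[col];
    return Array.isArray(value) ? JSON.stringify(value) : value;
  });
  const placeholders = COLUMNS.map(() => "?").join(", ");
  const sql = `INSERT INTO articles (${COLUMNS.join(", ")}) VALUES (${placeholders})`;
  return { sql, params };
}
```

In a Worker with a D1 binding (assumed here to be named DB), the result would typically be executed with a prepared statement: prepare the sql, bind the params, then run.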

LIVE DATA SOURCES

Four external APIs feed this platform — news intelligence, financial signals, vessel tracking, and fact verification.

YF
Yahoo Finance v8
FINANCIAL DATA
NO KEY · FREE

Provides Brent crude oil spot price (BZ=F) and 22 defense stock quotes. Uses the undocumented v8/finance/chart endpoint — no API key required. Each ticker is fetched in parallel via Promise.allSettled so a single failure doesn't block the rest. Oil price is sanity-checked ($20–$300 range). 8-second timeout per request.

ENDPOINT: query2.finance.yahoo.com/v8/finance/chart/{symbol}
NOTE: No API key · Unofficial endpoint · May be rate-limited
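
The parallel-fetch and sanity-check logic can be sketched as follows. The per-symbol fetcher is injected and its name is hypothetical, so this example makes no claim about the shape of the Yahoo Finance response; only the Promise.allSettled pattern and the $20–$300 range come from the description above.

```typescript
type QuoteFetcher = (symbol: string) => Promise<number>;

// Rejects values outside the $20-$300 sanity range, plus NaN and Infinity.
function isPlausibleOilPrice(price: number): boolean {
  return Number.isFinite(price) && price >= 20 && price <= 300;
}

// Fetches every symbol in parallel; a failed ticker is simply omitted
// from the result instead of failing the whole batch.
async function fetchQuotes(
  symbols: string[],
  fetchPrice: QuoteFetcher,
): Promise<Map<string, number>> {
  const settled = await Promise.allSettled(symbols.map((s) => fetchPrice(s)));
  const quotes = new Map<string, number>();
  settled.forEach((outcome, i) => {
    const symbol = symbols[i];
    if (symbol !== undefined && outcome.status === "fulfilled") {
      quotes.set(symbol, outcome.value);
    }
  });
  return quotes;
}
```

A real fetcher would issue the HTTP request with an abort signal set to the 8-second timeout and extract the price from the response before returning it.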
BR
Brave Search API
FACT CORROBORATION
API KEY · FREE

Independent web search index (not derived from Google or Bing) used in Pipeline step 05. A targeted query is fired for each article (title + 'Iran Israel US military 2026') with freshness:pw (past week). Returns up to 8 corroborating results with title, URL, and snippet. Articles that the step 06 fact-check then marks as disputed are silently dropped.

ENDPOINT: api.search.brave.com/res/v1/web/search
NOTE: Requires API key (BRAVE_API_KEY) · Free tier available
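
Building the corroboration request might look like the sketch below. The helper names are hypothetical and the https scheme is assumed. The q, freshness, and count query parameters and the X-Subscription-Token header are the ones Brave's web search API uses; the values follow the description above.

```typescript
const BRAVE_ENDPOINT = "https://api.search.brave.com/res/v1/web/search";
const QUERY_SUFFIX = "Iran Israel US military 2026";

// Returns the full request URL for one article title.
function buildBraveSearchUrl(title: string): string {
  const url = new URL(BRAVE_ENDPOINT);
  url.searchParams.set("q", `${title} ${QUERY_SUFFIX}`);
  url.searchParams.set("freshness", "pw"); // past week
  url.searchParams.set("count", "8"); // up to 8 results
  return url.toString();
}

// Headers for the request; the key would come from BRAVE_API_KEY.
function braveHeaders(apiKey: string): Record<string, string> {
  return {
    Accept: "application/json",
    "X-Subscription-Token": apiKey,
  };
}
```

Using URLSearchParams rather than string concatenation keeps titles with quotes, ampersands, or non-ASCII characters correctly encoded.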
AIS
AISStream.io
VESSEL TRACKING
API KEY · FREE · FALLBACK

Real-time AIS (Automatic Identification System) vessel data via WebSocket. Subscribes to the Strait of Hormuz bounding box (25–27.5°N, 55–57.5°E), collects for 15 seconds, then returns unique vessels by MMSI. Browser CORS blocks direct access — data is fetched by the Cloudflare Worker backend. Frontend falls back to a deterministic simulation engine when live data is unavailable.

ENDPOINT: wss://stream.aisstream.io/v0/stream (WebSocket)
NOTE: Requires API key · Free tier · Backend-only (CORS blocked in browser)
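
The bounding-box filter and MMSI deduplication can be sketched in isolation from the WebSocket handling. The VesselPosition shape and the function names are hypothetical. The subscription message format (APIKey plus BoundingBoxes as latitude/longitude corner pairs) is an assumption about the AISStream protocol and should be checked against its documentation.

```typescript
// Strait of Hormuz box from the description: 25-27.5°N, 55-57.5°E.
const HORMUZ_BOX = { minLat: 25, maxLat: 27.5, minLon: 55, maxLon: 57.5 };

interface VesselPosition {
  mmsi: number;
  lat: number;
  lon: number;
  name?: string;
}

function inHormuzBox(p: VesselPosition, box = HORMUZ_BOX): boolean {
  return (
    p.lat >= box.minLat && p.lat <= box.maxLat &&
    p.lon >= box.minLon && p.lon <= box.maxLon
  );
}

// Keeps one entry per MMSI; a later report overwrites an earlier one.
function uniqueByMmsi(positions: VesselPosition[]): VesselPosition[] {
  const latest = new Map<number, VesselPosition>();
  for (const p of positions) latest.set(p.mmsi, p);
  return Array.from(latest.values());
}

// ASSUMPTION: field names and corner ordering follow AISStream's
// subscription format as [[lat, lon], [lat, lon]] per box.
function subscriptionMessage(apiKey: string): string {
  return JSON.stringify({
    APIKey: apiKey,
    BoundingBoxes: [[
      [HORMUZ_BOX.minLat, HORMUZ_BOX.minLon],
      [HORMUZ_BOX.maxLat, HORMUZ_BOX.maxLon],
    ]],
  });
}
```

After the 15-second collection window, the backend would pass everything it received through uniqueByMmsi to get the vessel list and count.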
GM
Gemini API (Google)
NEWS INTELLIGENCE
API KEY · PAID

Google's Gemini 3.1 Pro with real-time Google Search grounding is the primary news collection engine (Pipeline step 01). Thinking mode (HIGH) enables multi-step reasoning for better article synthesis. A 3-model fallback chain protects against outages: Gemini 3.1 Pro → 2.5 Flash (thinkingBudget 4096) → 2.0 Flash. Article classification and bilingual summaries (step 04) are not handled by Gemini; they run on Workers AI Llama 3.1-8B.

ENDPOINT: generativelanguage.googleapis.com/v1beta/models/{model}:generateContent
NOTE: Requires API key (GEMINI_API_KEY) · Paid · 50s timeout

REST API ENDPOINTS

Base URL: https://iran-war-intel.672rmwysbs.workers.dev

GET /api/intel
Latest verified articles (JSON array, sorted by created_at DESC)
GET /api/insights/oil
Brent crude oil price + Hormuz vessel count (Yahoo Finance v8 + Gemini)
GET /api/insights/market
Defense stock quotes (22 tickers: LMT, RTX, NOC, GD, BA, ...)
GET /api/live-events
Timeline events for the Timeline page (date, type, severity, summary)
POST /api/cron/trigger · AUTH
Manually trigger the pipeline (returns article batch + market data)
GET /api/analytics/dashboard
Aggregated analytics stats (page views, events, top pages)