Architecture

The PolicyEngine API v2 is a distributed system for running tax-benefit microsimulations with persistence and async processing.

Components

API server

FastAPI application exposing RESTful endpoints for creating and managing datasets, defining policy reforms, queueing simulations, and computing aggregates. The server validates requests, persists to PostgreSQL, and queues background tasks.

Database

PostgreSQL (via Supabase) stores all persistent data. SQLModel provides a type-safe ORM layer with Pydantic integration.

Tables: datasets, policies, simulations, aggregates, reports, decile_impacts, program_statistics, parameters

Worker

Background workers poll for pending simulations and reports. They load datasets from storage, run PolicyEngine simulations, compute aggregates and impact statistics, then store the results in the database.
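The polling behaviour described above can be sketched as follows. This is a minimal illustration, not the project's worker code: the `Simulation` class, the in-memory `store`, and the `running` and `failed` statuses are assumptions made for the example (the source only names `pending` and `completed`), and `run_simulation` stands in for the real PolicyEngine call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Simulation:
    """Hypothetical stand-in for a row in the simulations table."""
    id: str
    status: str = "pending"  # assumed lifecycle: pending -> running -> completed/failed
    result: Optional[dict] = None
    error: Optional[str] = None


def claim_next_pending(store: dict) -> Optional[Simulation]:
    """Mark the first pending simulation as running and return it, or None."""
    for sim in store.values():
        if sim.status == "pending":
            sim.status = "running"
            return sim
    return None


def run_worker_once(store: dict, run_simulation: Callable[[Simulation], dict]) -> bool:
    """Process at most one pending simulation; return False if none was found."""
    sim = claim_next_pending(store)
    if sim is None:
        return False
    try:
        sim.result = run_simulation(sim)
        sim.status = "completed"
    except Exception as exc:  # record the failure instead of crashing the worker
        sim.error = str(exc)
        sim.status = "failed"
    return True
```

A real worker would call something like `run_worker_once` in a loop and sleep briefly whenever it returns `False`. With several workers, the claim step would also need to be atomic in the database.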

Storage

Dataset files (HDF5 format) are stored in Supabase Storage with local caching for performance. The storage layer handles downloads and caching transparently.
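A download-then-cache layer of this kind might look like the sketch below. The function name `get_dataset_path`, the cache layout, and the injected `download` callable are all hypothetical; the actual storage layer and its Supabase client calls are not shown in this document.

```python
from pathlib import Path
from typing import Callable


def get_dataset_path(
    name: str,
    cache_dir: Path,
    download: Callable[[str, Path], None],
) -> Path:
    """Return a local path for `name`, downloading it only on a cache miss.

    `download(name, dest)` is an assumed callable that writes the remote
    file to `dest`; in practice it would wrap the storage client.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    local = cache_dir / name
    if not local.exists():
        # Write to a temporary file first so an interrupted download
        # never leaves a truncated file under the final name.
        tmp = local.with_suffix(local.suffix + ".part")
        download(name, tmp)
        tmp.replace(local)
    return local
```

Callers only ever see a local path, which is what makes the caching transparent: the first call pays for the download and later calls return immediately.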

Request flow

1. Client creates simulation via POST /analysis/economic-impact
2. API validates request and persists simulation + report records
3. API returns pending status immediately
4. Worker picks up pending simulation from queue
5. Worker loads dataset and runs PolicyEngine simulation
6. Worker updates simulation status to completed
7. Worker picks up pending report
8. Worker computes decile impacts and program statistics
9. Client polls GET /analysis/economic-impact/{id} to check status
10. Once complete, response includes full analysis results
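From the client's side, the flow above amounts to polling until the status changes. The sketch below assumes the response body is a JSON object with a `status` field whose values include `pending` and `completed` (and, as a further assumption, `failed`); the exact response schema is not given here. The `fetch` callable is a placeholder for an HTTP GET against /analysis/economic-impact/{id}.

```python
import time
from typing import Callable


def wait_for_report(
    fetch: Callable[[str], dict],
    report_id: str,
    interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll `fetch(report_id)` until the report completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        body = fetch(report_id)
        status = body.get("status")  # assumed field name
        if status == "completed":
            return body
        if status == "failed":  # assumed terminal status
            raise RuntimeError(f"report {report_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"report {report_id} still {status!r} after {timeout}s")
        sleep(interval)
```

Injecting `fetch` and `sleep` keeps the function independent of any particular HTTP library and makes it easy to test without a running server.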

Data models

All models follow Pydantic/SQLModel patterns for type safety across API, database, and business logic:

Base: Shared fields across models

Table: Database model with ID and timestamps

Create: Request schema (no ID)

Read: Response schema (with ID and timestamps)
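The four-way split can be illustrated with a small example. To keep it dependency-free, the sketch uses standard-library dataclasses as a stand-in; with SQLModel, the base would subclass `SQLModel` and the table class would pass `table=True`. The `Policy*` names and their fields are hypothetical and are not taken from the project's schema.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from uuid import UUID, uuid4


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class PolicyBase:
    """Base: fields shared by every variant."""
    name: str
    description: str


@dataclass
class PolicyTable(PolicyBase):
    """Table: stored row, with a generated ID and timestamps."""
    id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)


@dataclass
class PolicyCreate(PolicyBase):
    """Create: request schema, so clients cannot supply an ID."""


@dataclass
class PolicyRead(PolicyBase):
    """Read: response schema, where ID and timestamps are required."""
    id: UUID
    created_at: datetime
    updated_at: datetime


def create_policy(payload: PolicyCreate) -> PolicyRead:
    """Turn a request into a row (ID assigned here), then into a response."""
    row = PolicyTable(**asdict(payload))
    return PolicyRead(**asdict(row))
```

Because Create and Read both inherit from Base, a field added to the base shows up in the request schema, the table, and the response schema at once.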

Scaling

API scaling

Run multiple Uvicorn workers behind a load balancer to scale the API horizontally.

Worker scaling

Increase the worker count to process more simulations in parallel.

Database

PostgreSQL supports read replicas for high read throughput.

Caching

Deterministic UUIDs ensure that identical requests reuse cached results.
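One common way to obtain deterministic IDs is a name-based UUID (version 5) computed over a canonical serialisation of the request. The sketch below shows that technique under stated assumptions: the namespace value, the `simulation_id` function, and the choice of input fields are illustrative only, and the project may derive its IDs differently.

```python
import json
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
SIMULATION_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/simulations")


def simulation_id(dataset_id: str, policy_id: str, params: dict) -> uuid.UUID:
    """Derive a stable UUID from the inputs that define a simulation."""
    # sort_keys and fixed separators give the same string regardless of
    # the order in which the caller built the dictionaries.
    canonical = json.dumps(
        {"dataset_id": dataset_id, "policy_id": policy_id, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return uuid.uuid5(SIMULATION_NAMESPACE, canonical)
```

Before queueing work, the API can look up the derived ID and return the stored result if a row already exists, so a repeated request never triggers a second run.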