MoaviaMahmood/Financial-Transaction-Monitoring

Financial Transaction / Anti-Money Laundering Monitoring System

A real-time Anti-Money-Laundering (AML) data engineering pipeline on AWS, with end-to-end streaming detection and a live operational dashboard.



Overview

SENTINEL is a production-style anti-money-laundering pipeline built entirely on AWS serverless infrastructure. Synthetic banking transactions are generated on a schedule, streamed through Amazon Kinesis, evaluated against fourteen distinct AML detection rules in real time, and persisted as a partitioned data lake in S3. A FastAPI backend exposes the cleaned data through a REST interface, and a React/TypeScript dashboard renders it as a live operational console for compliance analysts.

The system was built as a Final Year Project to demonstrate a complete, cloud-native data engineering stack: ingestion, streaming, detection, storage, cataloguing, querying, and visualization, with realistic data quality and governance considerations.


Architecture

Image

The pipeline is organized into three logical layers:

Scheduling — Amazon EventBridge triggers two Lambdas on independent cadences. Entity-Generator runs every 5 minutes to produce customer and account snapshots. AML-pipeline-scheduler runs every minute to invoke a Step Functions workflow.

Data Generation — A Step Functions state machine orchestrates two Lambdas in sequence: GenerateTransactions reads the latest customer/account snapshots from S3 and produces a mix of normal and suspicious transactions, then publishes them to Kinesis; GenerateAlerts runs separate batch alerting for analytical purposes.

Stream Processing — A consumer Lambda subscribes to the Kinesis stream, applies the AML detection rules per transaction, and writes both raw transactions and generated alerts back to S3 in date-partitioned folders.
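
As a rough illustration of the consumer side: a Lambda triggered by Kinesis receives each record's payload base64-encoded under `Records[].kinesis.data`. The sketch below decodes such an event. It assumes the producer publishes one JSON object per record; the helper names and the `txn_id`/`amount` fields are hypothetical and not taken from the repository.

```python
import base64
import json


def decode_kinesis_event(event: dict) -> list[dict]:
    """Decode the base64 payloads of a Kinesis-triggered Lambda event.

    Assumption (not confirmed by the source): each record carries one
    JSON-encoded transaction.
    """
    transactions = []
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        transactions.append(json.loads(raw))
    return transactions


def lambda_handler(event, context):
    # Hypothetical handler: a real consumer would evaluate the AML rules and
    # write results to S3 here; this sketch only counts decoded records.
    transactions = decode_kinesis_event(event)
    return {"decoded": len(transactions)}


def make_test_event(payloads: list[dict]) -> dict:
    """Build a minimal event in the shape Lambda passes for Kinesis sources."""
    return {
        "Records": [
            {"kinesis": {"data": base64.b64encode(json.dumps(p).encode()).decode()}}
            for p in payloads
        ]
    }
```

`make_test_event` is only there so the handler can be exercised locally without a live stream.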

The downstream layer adds AWS Glue (cataloguing the partitioned S3 data into Hive-style tables), Amazon Athena (SQL views with type casting and data quality filters), a FastAPI backend (a thin REST bridge), and a React dashboard (the user-facing console).


Tech stack

| Layer | Technology |
| --- | --- |
| Cloud (AWS) | Lambda · Kinesis Data Streams · S3 · EventBridge · Step Functions · IAM · Glue Data Catalog · Athena |
| Data | CSV (entities/transactions), JSONL (real-time alerts), Hive partitioning (dt=YYYY-MM-DD) |
| Backend | Python 3.11 · FastAPI · PyAthena · boto3 · Uvicorn |
| Frontend | React 19 · TypeScript · Vite · Plotly-style custom charts (no chart library, hand-rendered SVG) |
| Detection | 14 AML patterns (structuring, layering, round-trip, shell company, trade-based, high velocity, impossible travel, etc.) |

Features

  • Real-time streaming — transactions move from generation to detection in seconds via Kinesis
  • 14 AML detection rules — covering placement, layering, and integration phases of money laundering
  • Cleaned data layer — Athena views handle type casting (CSV strings → booleans, ISO timestamps → typed TIMESTAMP), filter invalid rows, and resolve slowly-changing customer dimensions
  • Live dashboard — KPI tiles, alert breakdown chart, top suspicious entities, jurisdictional risk panel, and a streaming alert feed, all driven by real Athena queries
  • Multi-view navigation — Overview, Alerts, Transactions, and Entities pages share the same backend
  • Partitioned data lake — every dataset is partitioned by date, optimized for Athena scan cost and query speed
  • Automated cataloguing — Glue crawlers register new partitions, no manual table maintenance
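
To illustrate the partitioning point in the list above: a query that filters on the partition column lets Athena skip every `dt=` folder outside the range. The sketch below builds such a query string. The view name `transactions_clean` appears elsewhere in this README, but whether that view exposes `dt` as a column is an assumption, and `partition_pruned_query` is a hypothetical helper.

```python
from datetime import date, timedelta


def partition_pruned_query(table: str, end: date, days: int = 1, limit: int = 100) -> str:
    """Return a SQL string restricted to the last `days` date partitions.

    Assumes a string partition column named `dt` in YYYY-MM-DD form.
    """
    if days < 1 or limit < 1:
        raise ValueError("days and limit must be positive")
    if not table.replace("_", "").isalnum():
        # Identifiers cannot be bound as query parameters, so validate them.
        raise ValueError(f"unexpected table name: {table!r}")
    start = end - timedelta(days=days - 1)
    return (
        f"SELECT * FROM {table} "
        f"WHERE dt BETWEEN '{start.isoformat()}' AND '{end.isoformat()}' "
        f"LIMIT {limit}"
    )
```

The resulting string could then be passed to whichever Athena client the backend uses.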

Repository structure

Financial-Transaction-Monitoring/
├── aml-system/
│   ├── data-generators/        Lambda source for entity, transaction, and alert generation
│   ├── ingestion/              Kinesis consumer Lambda (real-time AML detection)
│   ├── pipelines/              Step Functions definitions
│   ├── frontend/               React + TypeScript dashboard (SENTINEL)
│   ├── ml/                     Reserved for future ML-based scoring (planned)
│   ├── services/               Shared utilities
│   └── backend/                FastAPI bridge between Athena and the React UI
├── docs/                       Repo-level documentation (this README's images)
└── README.md                   You are here

Each major folder has its own README.md explaining its contents, dependencies, and run instructions.


Setup

The project has three deployable components: AWS infrastructure, the FastAPI backend, and the React frontend. They are independent — the dashboard runs locally against AWS, no hosting required.

Prerequisites

  • AWS account with admin access (free tier sufficient for the load this project generates)
  • Python 3.11+
  • Node.js 18+
  • AWS CLI configured (aws configure with credentials in region eu-north-1)

1. AWS infrastructure

The Lambdas, Step Functions, EventBridge schedules, and Kinesis stream are deployed manually through the AWS console. See aml-system/README.md for step-by-step provisioning instructions and the Glue crawler setup.

The S3 bucket structure assumed by all downstream code is:

aml-data/
├── customers/dt=YYYY-MM-DD/customers_*.csv
├── accounts/dt=YYYY-MM-DD/accounts_*.csv
├── transactions/dt=YYYY-MM-DD/transactions_*.csv
└── alerts/
    ├── batch/dt=YYYY-MM-DD/alerts_*.csv
    └── realtime/dt=YYYY-MM-DD/*.jsonl
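
A minimal sketch of how an object key following this layout could be built. The README only specifies the `transactions_*.csv` pattern, so the timestamp suffix and both function names are assumptions made for illustration.

```python
from datetime import date, datetime, timezone


def partition_key(dataset: str, day: date, filename: str) -> str:
    """Build an S3 object key using the Hive-style dt=YYYY-MM-DD layout."""
    return f"{dataset}/dt={day.isoformat()}/{filename}"


def transactions_key(now: datetime) -> str:
    # Hypothetical naming: only the transactions_*.csv pattern is documented,
    # so the timestamp suffix used here is an assumption.
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return partition_key("transactions", now.date(), f"transactions_{stamp}.csv")


# Example: key for a file written right now (UTC).
print(transactions_key(datetime.now(timezone.utc)))
```

Keeping the `dt=` segment in the key is what allows a Glue crawler to register each day as a partition.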

2. Backend

cd aml-system/backend
python -m venv venv
venv\Scripts\activate          # Windows
# source venv/bin/activate     # macOS/Linux
pip install -r requirements.txt
cp .env.example .env           # then edit with your bucket name (Windows cmd: copy .env.example .env)
uvicorn main:app --reload --port 8000

The backend serves on http://localhost:8000. Auto-generated API docs are at http://localhost:8000/docs.

See aml-system/backend/README.md for the endpoint reference.

3. Frontend

cd aml-system/frontend
npm install
npm run dev

The dashboard opens at http://localhost:5173. The frontend expects the backend to be running at http://localhost:8000.

See aml-system/frontend/README.md for component documentation.


Data flow

EventBridge (every 1 min)
        │
        ▼
Step Functions: AML-pipeline
   ├── GenerateTransactions Lambda
   │       │
   │       ├── Reads latest customers/accounts from S3
   │       ├── Builds normal + suspicious transactions
   │       ├── Writes CSV to s3://.../transactions/dt=.../
   │       └── Publishes records to Kinesis stream
   │
   └── GenerateAlerts Lambda (batch path, parallel)
           └── Reads transactions, writes alerts/batch/

Kinesis Stream
        │
        ▼
AML Consumer Lambda (real-time path)
   ├── Evaluates 14 AML rules per transaction
   ├── Writes JSONL alerts to s3://.../alerts/realtime/dt=.../
   └── (optional) Updates DynamoDB state for stateful rules

S3 Data Lake (raw)
        │
        ▼
Glue Crawler — auto-detects partitions, registers tables in aml_db

Athena Views (cleaned)
   ├── customers_clean       Latest snapshot, typed pep_flag
   ├── accounts_clean        Typed balances, latest snapshot
   ├── transactions_clean    Filters invalid rows, parses timestamps
   ├── alerts_clean          Score-validated alerts
   └── alerts_enriched       Pre-joined alerts × customers × transactions

FastAPI Backend
   └── 6 endpoints querying the cleaned views

React Dashboard (SENTINEL)
   └── Auto-refreshes every 30s
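
The real-time path in the diagram above writes alerts as JSONL, that is, one JSON object per line. A small sketch of that serialization follows; the alert fields and helper names are hypothetical, and the consumer's actual output format beyond "JSONL" is not documented here.

```python
import json


def to_jsonl(alerts: list[dict]) -> str:
    """Serialize alerts as JSON Lines: one compact JSON object per line."""
    return "".join(
        json.dumps(alert, separators=(",", ":"), sort_keys=True) + "\n"
        for alert in alerts
    )


def from_jsonl(body: str) -> list[dict]:
    """Parse a JSONL document, skipping blank lines."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]
```

Because every line is an independent document, a batch of alerts can be appended to or split without re-parsing the whole file.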

AML detection rules

The Kinesis consumer Lambda evaluates each incoming transaction against fourteen rules. Each rule that fires generates a separate alert with a 0–100 severity score.

| Pattern | Description | Typical score |
| --- | --- | --- |
| HIGH_VALUE | Single transaction ≥ $10,000 | 85 |
| HIGH_RISK_COUNTRY | Sender or receiver in FATF-monitored jurisdiction | 60 |
| CROSS_BORDER_HIGH | Cross-border transaction with elevated value | 70 |
| STRUCTURING | Multiple deposits just below $10K reporting threshold | 90+ |
| STRUCTURING_WINDOW | Time-windowed variant of structuring | 90 |
| LAYERING | Funds cycled through 3–7 accounts rapidly | 95+ |
| ROUND_TRIP | Same amount returned to origin within hours | 100 |
| SHELL_COMPANY | Money flowing through known shell-company accounts | 100 |
| TRADE_BASED | Severe over- or under-invoicing on trade payments | 75+ |
| LARGE_RAPID | Single large wire to a high-risk jurisdiction | 99+ |
| HIGH_VELOCITY | Burst of 8–15 payments from one account in 30 minutes | 73+ |
| VELOCITY_BURST | Short-window high-frequency variant | 75 |
| IMPOSSIBLE_TRAVEL | Same account transacting in geographically impossible cities | 71+ |
| RAPID_MOVEMENT | Rapid net outflow from a single account | 80 |

Rules and scoring logic are implemented in aml-system/ingestion/aml-kinesis-consumer_LAMBDA.py.
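
For orientation only, here is a sketch of how two of the stateless rules could be expressed so that each rule that fires produces its own alert. This is not the code in the file named above: the transaction field names, the `Alert` type, and the country set are hypothetical, while the $10,000 threshold and the scores 85 and 60 come from the table.

```python
from dataclasses import dataclass

# Hypothetical placeholder codes, for illustration only.
HIGH_RISK_COUNTRIES = {"XX", "YY"}
HIGH_VALUE_THRESHOLD = 10_000  # from the HIGH_VALUE row of the table


@dataclass(frozen=True)
class Alert:
    txn_id: str
    pattern: str
    score: int  # 0-100 severity


def evaluate(txn: dict) -> list[Alert]:
    """Run two stateless rules; each rule that fires yields its own alert."""
    alerts = []
    if txn["amount"] >= HIGH_VALUE_THRESHOLD:
        alerts.append(Alert(txn["txn_id"], "HIGH_VALUE", 85))
    if {txn["sender_country"], txn["receiver_country"]} & HIGH_RISK_COUNTRIES:
        alerts.append(Alert(txn["txn_id"], "HIGH_RISK_COUNTRY", 60))
    return alerts
```

Windowed rules such as HIGH_VELOCITY need state across transactions and would not fit this per-record shape.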


Screenshots

Live dashboard (Overview)

Image

AML pattern breakdown

Image

Top suspicious entities (cross-table joined)

Image

Athena query — alerts enriched view

Image

FastAPI auto-generated docs

Image

Data quality layer

A common pitfall in data lake design is querying raw data directly. SENTINEL's Athena views enforce a cleaning layer between raw S3 and consumers:

  • pep_flag is cast from CSV string "True"/"False" to actual boolean
  • ISO 8601 timestamp strings are parsed into TIMESTAMP type
  • Future-dated transactions are filtered out (handles a generator-side artifact)
  • Self-transfers (sender = receiver) and zero/negative amounts are excluded
  • Customer snapshots are deduplicated to the latest day's data
  • Orphan alerts (referencing customers no longer present) are flagged and reportable

These views are defined in docs/sql/views.sql and cited in the project report as the "data cleaning layer."
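
The project implements these rules as Athena SQL views. Purely as an illustration of the same filters, the Python sketch below applies them to a single raw CSV row. The column names (`amount`, `timestamp`, `sender_account`, `receiver_account`) and the treatment of timezone-naive timestamps as UTC are assumptions, not taken from the view definitions.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_pep_flag(value: str) -> bool:
    """Cast the CSV strings "True"/"False" to a real boolean."""
    normalized = value.strip().lower()
    if normalized not in {"true", "false"}:
        raise ValueError(f"unexpected pep_flag value: {value!r}")
    return normalized == "true"


def clean_transaction(row: dict, now: datetime) -> Optional[dict]:
    """Return a typed copy of a raw row, or None if the row should be dropped."""
    try:
        amount = float(row["amount"])
        ts = datetime.fromisoformat(row["timestamp"])
        sender, receiver = row["sender_account"], row["receiver_account"]
    except (KeyError, ValueError):
        return None  # malformed or incomplete row
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive values are UTC
    if ts > now:
        return None  # future-dated transaction
    if amount <= 0:
        return None  # zero or negative amount
    if sender == receiver:
        return None  # self-transfer
    return {**row, "amount": amount, "timestamp": ts}
```

Passing `now` in explicitly keeps the future-date check deterministic, which makes the function easy to test.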


Limitations and future work

The project is intentionally scoped to the data engineering pipeline; several adjacent capabilities are outlined but not implemented:

  • The Network Graph, Case Manager, Watchlists, SAR Reports, Audit Log, and Rule Config views appear in the navigation as placeholders for planned work; each can reuse the same FastAPI + Athena pattern
  • Mobile-responsive navigation is incomplete; the analyst console is desktop-first by design, and mobile interaction still needs polish
  • ML-based scoring is reserved for a future iteration in aml-system/ml/; the current detection layer is rule-based
  • KPI sparklines display flat indicators; a time-series endpoint would populate them
  • Dead-letter handling on the Kinesis consumer is not configured; production deployment would add a DLQ

Author

Moavia Mahmood, BSc Software Engineering, OSTIM Technical University

Final Year Project, 2026


License

This project is provided for academic review. See LICENSE for details.
