SA Fake Data API
A production-grade SaaS platform for generating realistic South African test data, complete with authentication, subscriptions, and rate-limited APIs.
Role: Solo Developer (Me)
Timeline: 2026
Team: 1
- 5 API endpoints: Person, ID, Address, Company, and Bank data generation
- 500 free-tier requests per month, with no credit card required
- 100 records per single API request (batch size)
- 9 provinces: all SA provinces with real suburb data
The Problem
Developers building South African applications lack access to properly formatted test data. Generic fake data generators produce US-centric results — invalid SA ID numbers, wrong phone formats, non-existent provinces and suburbs. This forces developers to hand-craft test fixtures or ship code tested against unrealistic data.
The Solution
Built a full REST API platform with 5 endpoints that generate authentic South African data: people, ID numbers, addresses, companies, and bank accounts. Generated ID numbers embed the date of birth and gender and end in a valid Luhn check digit. Name generation follows a demographically weighted distribution across ethnic groups, matching SA census data (Zulu 22.7%, Xhosa 16%, Sotho 15%, Afrikaans 13.5%).
The platform features full user authentication (GitHub & Google OAuth plus credentials), tiered subscription billing via Paystack, Redis-backed rate limiting, and deterministic seeding for reproducible test results. Built with Next.js 16, PostgreSQL via Neon, Drizzle ORM, and deployed on Vercel with a complete CI/CD pipeline including automated E2E tests.
Overview
SA Fake Data API is a SaaS platform that generates realistic South African test data through a simple REST API. Developers building applications for the South African market need properly formatted test data — valid ID numbers, real province and suburb names, correct phone formats, and demographically accurate name distributions. This project solves that problem with five purpose-built API endpoints and a complete platform around them.
Architecture & Tech Stack
The platform is built on Next.js 16 with the App Router, using React 19 and TypeScript in strict mode. The frontend uses Tailwind CSS v4 for styling. The backend leverages Next.js API routes for all server-side logic, with PostgreSQL hosted on Neon as the primary database and Drizzle ORM for type-safe database access.
The application follows a clean separation of concerns: marketing pages live under a (marketing) route group, authentication flows under (auth), the user dashboard under (dashboard), and admin tools under (admin). API endpoints are versioned under /api/v1/ with dedicated middleware for authentication, rate limiting, and usage tracking.
Data Generation Engine
The core of the platform is a suite of deterministic data generators. Each generator produces data that passes real-world validation:
- Person generator creates realistic South African individuals with ethnically weighted name distribution matching census data — Zulu 22.7%, Xhosa 16%, Sotho 15%, Afrikaans 13.5%, and more
- ID number generator produces valid 13-digit South African ID numbers with an embedded date of birth, gender encoding, and a correct Luhn check digit
- Address generator covers all 9 provinces with real suburb names sourced from actual South African geographic data
- Company generator creates realistic business entities with valid registration and VAT number formats
- Bank generator produces account details for major South African banks with correct branch codes and account number formats
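As an illustration of the ID number generator described above, here is a minimal sketch of how a 13-digit ID could be assembled and given its Luhn check digit. The function names (`luhnCheckDigit`, `buildSaId`) and the fixed citizenship and legacy digits are assumptions made for this example, not the project's actual code.

```typescript
type Gender = "female" | "male";

// Compute the Luhn check digit for a digit string. The check digit is the
// value that, appended to the payload, makes the whole number pass Luhn.
function luhnCheckDigit(payload: string): number {
  let sum = 0;
  for (let i = 0; i < payload.length; i++) {
    // Walk from the right; the rightmost payload digit is doubled first.
    let digit = Number(payload[payload.length - 1 - i]);
    if (i % 2 === 0) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    sum += digit;
  }
  return (10 - (sum % 10)) % 10;
}

// Layout: YYMMDD (date of birth) + SSSS (gender sequence) + C (citizenship)
// + A (legacy digit) + Z (Luhn check digit).
function buildSaId(dob: Date, gender: Gender, sequence: number): string {
  const yy = String(dob.getUTCFullYear() % 100).padStart(2, "0");
  const mm = String(dob.getUTCMonth() + 1).padStart(2, "0");
  const dd = String(dob.getUTCDate()).padStart(2, "0");
  // 0000-4999 encodes female, 5000-9999 encodes male.
  const base = gender === "male" ? 5000 : 0;
  const ssss = String(base + (sequence % 5000)).padStart(4, "0");
  // Assumption for this sketch: citizenship digit 0 (citizen), legacy digit 8.
  const payload = `${yy}${mm}${dd}${ssss}08`;
  return payload + String(luhnCheckDigit(payload));
}
```

For example, `buildSaId(new Date(Date.UTC(1980, 0, 1)), "male", 9)` returns `"8001015009087"`: born 1 January 1980, male sequence 5009, check digit 7.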
All generators support deterministic seeding — passing a seed parameter produces identical results every time, which is critical for reproducible test suites. Batch generation allows up to 100 records per request, counted as a single rate-limit hit.
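One way deterministic seeding and weighted name distribution can fit together is sketched below. It assumes a small seeded PRNG (mulberry32, chosen here only for illustration) and includes only the four weights quoted above; the real generators cover more groups and may use a different PRNG. All function and constant names are hypothetical.

```typescript
// mulberry32: a tiny 32-bit seeded PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface Weighted<T> {
  value: T;
  weight: number;
}

// Only the weights listed in the text; the pick normalizes over their total.
const GROUPS: readonly Weighted<string>[] = [
  { value: "Zulu", weight: 22.7 },
  { value: "Xhosa", weight: 16 },
  { value: "Sotho", weight: 15 },
  { value: "Afrikaans", weight: 13.5 },
];

// Pick one item with probability proportional to its weight.
function weightedPick<T>(rng: () => number, items: readonly Weighted<T>[]): T {
  const total = items.reduce((sum, item) => sum + item.weight, 0);
  let r = rng() * total;
  for (const item of items) {
    r -= item.weight;
    if (r < 0) return item.value;
  }
  return items[items.length - 1].value; // guard against rounding at the edge
}

// The same seed always yields the same sequence of picks.
function generateGroups(seed: number, count: number): string[] {
  const rng = mulberry32(seed);
  return Array.from({ length: count }, () => weightedPick(rng, GROUPS));
}
```

Because all randomness flows through one seeded function, two calls such as `generateGroups(42, 100)` produce identical arrays, which is the property a reproducible test suite relies on.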
Authentication & Authorization
The authentication system is built on NextAuth v5 with multiple providers: GitHub OAuth, Google OAuth, and email/password credentials. Passwords are hashed with bcrypt, and login endpoints are rate-limited to 5 attempts per 15 minutes to slow brute-force attacks.
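The 5-attempts-per-15-minutes rule can be pictured as a fixed-window counter. The sketch below keeps counts in memory purely for illustration; the class name `LoginLimiter` and the choice of a fixed window (rather than a sliding one) are assumptions, and a deployed version would need a shared store.

```typescript
interface AttemptWindow {
  count: number;
  windowStart: number; // epoch milliseconds
}

class LoginLimiter {
  private attempts = new Map<string, AttemptWindow>();
  private max: number;
  private windowMs: number;

  constructor(max: number = 5, windowMs: number = 15 * 60 * 1000) {
    this.max = max;
    this.windowMs = windowMs;
  }

  // Returns true if this attempt is allowed, false if the caller is blocked.
  // `key` would typically be an email address or client IP.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.attempts.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // No window yet, or the old one expired: start a fresh window.
      this.attempts.set(key, { count: 1, windowStart: now });
      return true;
    }
    if (entry.count >= this.max) return false;
    entry.count += 1;
    return true;
  }
}
```

With the defaults, the first five calls to `allow("user@example.com")` inside a window return `true`, the sixth returns `false`, and attempts are accepted again once 15 minutes have passed since the window began.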
The platform supports full user lifecycle flows including email verification, password reset with secure tokens, and role-based access control separating regular users from administrators. JWT sessions keep the system stateless and scalable.
API access is managed through user-generated API keys. Each key is hashed before storage and identified by a unique prefix. Usage is tracked per key with monthly quotas enforced based on the user's subscription plan.
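A minimal sketch of the key scheme described above: generate a random key, keep only a short prefix and a hash, and show the raw key to the user once. The `sk_` prefix, the prefix length, and the use of SHA-256 are assumptions for this example; the source does not state which hash or key format the platform uses.

```typescript
import { createHash, randomBytes } from "node:crypto";

interface StoredApiKey {
  prefix: string; // safe to display, used to identify the key
  hash: string; // hex digest, the only form persisted
}

function hashApiKey(rawKey: string): string {
  return createHash("sha256").update(rawKey).digest("hex");
}

function createApiKey(): { rawKey: string; stored: StoredApiKey } {
  // 24 random bytes give 48 hex characters of secret material.
  const rawKey = `sk_${randomBytes(24).toString("hex")}`;
  const prefix = rawKey.slice(0, 11); // "sk_" plus 8 characters
  return { rawKey, stored: { prefix, hash: hashApiKey(rawKey) } };
}

// Lookup on an incoming request: hash the presented key and compare digests.
function matchesStoredKey(presented: string, stored: StoredApiKey): boolean {
  return presented.startsWith(stored.prefix) && hashApiKey(presented) === stored.hash;
}
```

Only `stored` would be written to the database, so the raw key cannot be recovered from a leaked table. The prefix lets a dashboard list keys without revealing them.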
Subscription & Billing
The billing system integrates with Paystack, a payment gateway widely used across Africa. The platform offers a free tier with 500 requests per month — no credit card required — along with paid plans that unlock higher request limits.
Plan-based rate limiting is enforced through Upstash Redis. When a user exceeds their monthly quota, the API returns a 429 status with clear messaging about their limit. The subscription management system handles upgrades, cancellations, and status checks through Paystack webhooks.
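The monthly quota check can be sketched as a per-key counter bucketed by calendar month. An in-memory map stands in for Redis here, and the names (`InMemoryCounter`, `monthKey`, `checkQuota`) and the response shape are hypothetical. In a Redis-backed version the increment would be a single atomic operation on a key that expires after the month ends.

```typescript
interface QuotaDecision {
  status: 200 | 429;
  remaining: number;
  message?: string;
}

// Stand-in for a Redis counter: increment and return the new value.
class InMemoryCounter {
  private counts = new Map<string, number>();

  increment(key: string): number {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
}

// One counter per API key per calendar month (UTC).
function monthKey(apiKeyId: string, now: Date): string {
  const month = String(now.getUTCMonth() + 1).padStart(2, "0");
  return `usage:${apiKeyId}:${now.getUTCFullYear()}-${month}`;
}

function checkQuota(
  counter: InMemoryCounter,
  apiKeyId: string,
  monthlyLimit: number,
  now: Date = new Date(),
): QuotaDecision {
  const used = counter.increment(monthKey(apiKeyId, now));
  if (used > monthlyLimit) {
    return {
      status: 429,
      remaining: 0,
      message: `Monthly limit of ${monthlyLimit} requests reached`,
    };
  }
  return { status: 200, remaining: monthlyLimit - used };
}
```

With the free tier's limit of 500, the 501st request in a month would receive the 429 decision, and the count starts again from zero under the next month's key.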
CI/CD Pipeline
The project uses a multi-stage CI/CD pipeline running on GitLab CI:
- Validate stage runs on every branch: build, typecheck, lint, and unit tests must all pass
- Security stage runs on merge requests to main: automated security scanning
- E2E stage runs on merge requests to main: full Playwright end-to-end test suite
- Deploy stage runs on main branch only: triggers a Vercel deployment via deploy hook
Locally, Husky pre-commit hooks run lint-staged checks, TypeScript verification, unit tests, and a full build, so broken code is caught before it reaches the remote. The entire pre-commit pipeline completes in roughly 15 seconds.
Security Hardening
Security is built into every layer. The application sets strict HTTP headers including Content Security Policy, HSTS, X-Frame-Options, and X-Content-Type-Options. Rate limiting protects both the API endpoints and authentication flows.
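In a Next.js project, headers like these are commonly declared through the `headers` function in the Next.js config file. The sketch below shows the general shape; the specific header values are illustrative assumptions, not the policy this platform actually ships.

```typescript
interface HeaderEntry {
  key: string;
  value: string;
}

// Illustrative values only; a real CSP needs tuning per application.
function securityHeaders(): HeaderEntry[] {
  return [
    { key: "Content-Security-Policy", value: "default-src 'self'; frame-ancestors 'none'" },
    { key: "Strict-Transport-Security", value: "max-age=63072000; includeSubDomains" },
    { key: "X-Frame-Options", value: "DENY" },
    { key: "X-Content-Type-Options", value: "nosniff" },
  ];
}

// Shape expected from the `headers` option of a Next.js config:
// a list of route patterns, each with the headers to attach.
async function headers(): Promise<{ source: string; headers: HeaderEntry[] }[]> {
  return [{ source: "/(.*)", headers: securityHeaders() }];
}
```

Keeping the list in one function makes it easy to assert in a unit test that none of the four headers is dropped by accident.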
All secrets are managed through environment variables with no hardcoded credentials in the codebase. Password comparison uses timing-safe equality checks to prevent timing attacks. The API key system hashes keys before storage so even a database breach doesn't expose raw credentials.
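To illustrate what a timing-safe equality check looks like in Node.js, here is a sketch that compares a presented secret against a stored SHA-256 digest using `timingSafeEqual` from `node:crypto`. The function name `verifySecret` and the choice of SHA-256 are assumptions for this example. Passwords hashed with bcrypt would normally go through the bcrypt library's own compare routine instead.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

function sha256(input: string): Buffer {
  return createHash("sha256").update(input).digest();
}

// Compare in constant time so response timing does not reveal how many
// leading bytes of the digest matched.
function verifySecret(presented: string, storedHashHex: string): boolean {
  const expected = Buffer.from(storedHashHex, "hex");
  const actual = sha256(presented);
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (expected.length !== actual.length) return false;
  return timingSafeEqual(actual, expected);
}
```

A plain `===` on strings can return as soon as the first differing character is found, which is the behavior this pattern avoids.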
Testing Strategy
The testing infrastructure includes 16 unit test files running on Vitest with an 80%+ coverage target. End-to-end tests use Playwright to verify critical user flows — registration, API key generation, data generation, and subscription management.
The test suite validates the data generators against real South African format specifications, ensuring generated ID numbers pass Luhn validation, phone numbers match the correct digit patterns, and addresses reference actual provinces and suburbs.
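The kind of format checks such tests rely on might look like the sketch below, covering phone numbers and provinces. The helper names are hypothetical and the phone rule is deliberately loose: it checks only the digit pattern (a leading 0 or +27 followed by nine digits), not whether a prefix is actually allocated.

```typescript
// The nine provinces of South Africa.
const SA_PROVINCES: readonly string[] = [
  "Eastern Cape",
  "Free State",
  "Gauteng",
  "KwaZulu-Natal",
  "Limpopo",
  "Mpumalanga",
  "Northern Cape",
  "North West",
  "Western Cape",
];

// National format: 0 + 9 digits. International format: +27 + 9 digits.
const SA_PHONE_PATTERN = /^(?:\+27|0)\d{9}$/;

function isSaPhoneNumber(value: string): boolean {
  // Ignore spaces so "082 123 4567" and "0821234567" are treated alike.
  return SA_PHONE_PATTERN.test(value.replace(/\s+/g, ""));
}

function isSaProvince(value: string): boolean {
  return SA_PROVINCES.includes(value);
}
```

A generator test can then loop over a seeded batch of records and assert that every phone number and province passes these predicates.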
Key Technical Decisions
Choosing Neon for PostgreSQL gave the project serverless database scaling without managing infrastructure. Drizzle ORM was selected over Prisma for its lighter footprint and closer-to-SQL query building. Paystack was the natural choice for payments given the South African target market, offering local payment methods and ZAR currency support.
The decision to use deterministic seeding throughout all generators was driven by developer experience — being able to reproduce exact test data from a seed value makes debugging and test isolation significantly easier for end users of the API.