A structured peer review format where someone walks their colleagues through a real design -
and their colleagues tell them, honestly, where they got it wrong.
Someone shares their design
A developer presents a real project they built - not a toy example, not a sanitised demo.
The full thing: what problem it solves, the decisions made, and why.
Step 1
The design covers every layer
Requirements & business context, architecture, backend, frontend, testing strategy,
infrastructure and DevOps. Nothing is skipped - every layer is fair game for critique.
Step 2
🔥
Colleagues roast it - constructively
For each decision, the audience calls out what was done wrong and how they would do it
instead. The goal is not to embarrass - it's to learn. Bad choices get named.
Good ones get credit too.
Step 3
Real code, real trade-offs, and a room full of people who've made the same mistakes.
The presenter learns what to fix. The audience learns what not to build.
Everyone leaves with something useful, eventually.
🤖 Disclaimer: All the roasts and "Actually Kinda Smart" takes that follow were written by the Bot.
The presenter provided the code. The Bot provided the dad jokes. Responsibilities are clearly divided.
🔥 Roast My Design · Reverse Wall of Shame
Reverse Wall of Shame
I built a leaderboard. Then I over-engineered it.
An internal recognition platform for publicly celebrating the people who actually get things done -
because someone has to, and it shouldn't happen quietly.
This is the architectural post-mortem I'm proud of.
"We want a Wall of Shame. But reversed. With achievements. And a carousel. And offline mode.
And admin login. Oh, and it should be on the office TV." - Miro, at some point.
Read-only leaderboard ranked by cumulative achievement score over a rolling 12-month window. No auth, no exposed emails.
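The rolling 12-month scoring rule fits in a few lines. A minimal sketch - the field names and ISO date format are assumptions, not the project's actual schema:

```python
from datetime import datetime, timedelta

def cumulative_score(assignments, scores_by_achievement, now):
    """Sum the scores of assignments dated inside the rolling 12-month window.

    Hypothetical shapes: assignments is a list of
    {"achievementId": str, "date": "YYYY-MM-DD"} records;
    scores_by_achievement maps achievement id -> numeric score.
    """
    window_start = now - timedelta(days=365)
    return sum(
        scores_by_achievement[a["achievementId"]]
        for a in assignments
        if datetime.fromisoformat(a["date"]) >= window_start
    )
```

Assignments older than the window simply stop counting - nobody coasts on a 2019 badge.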
Core product
Achievement Catalogue
Admins define achievements: title, description, numeric score, and icon. These are the units of recognition driving all scoring.
👥
People Management
Name, discipline, unique email, optional photo. The roster of people who receive achievements and appear on the board.
Achievement Assignment
Assign any achievement to any person with a date and optional context. Same achievement can be received multiple times.
Secure Admin Panel
All data modifications gated behind corporate SSO. Only some know who the admins are. Classic power dynamic.
Mystery admins
📺
Scoreboard Carousel
Full-screen presentation mode, cycles through people one at a time. Auto-activates after 30 seconds of idle display. Built for office TVs.
TV UX approved
💾
Offline USB Export
Downloads the current scoreboard as a fully self-contained HTML file with base64-inlined images. Playable on a TV with no internet.
Apocalypse ready
Auto Scoreboard Refresh
Regenerated automatically twice daily via a scheduled rule. Max staleness: 12 hours. Admins can also trigger on-demand.
🖼️
Media Management
Upload, browse, and delete person photos and achievement icons. A media picker is embedded in the person and achievement forms.
📦
Data Export & Import
Full database dump for backup or migration. Import accepts the same format the export produces. Full round-trip.
The Name
🔥 The Roast
"Reverse Wall of Shame" is just a leaderboard with extra steps and a name that requires
two seconds of explanation at every demo. Should have called it "the board".
🧠 Actually Kinda Smart
The name is the product. It inverts the shame narrative into recognition without sounding
like a corporate awards ceremony. Memorable, defensible, slightly smug - exactly right.
Offline USB Export
🔥 The Roast
A fully self-contained HTML file with base64-inlined images - in case the office loses
internet mid-ceremony and someone needs to plug in a USB stick. For a recognition board.
The threat model for this feature has never existed.
🧠 Actually Kinda Smart
The export implementation is genuinely clever: one HTTP call, one file, runs anywhere.
Conference rooms with locked-down networks, hotel TVs, laptops on a plane.
The use case is niche but the solution is zero-dependency portable HTML - hard to argue with that.
Scoreboard Carousel
🔥 The Roast
A full-screen cinematic presentation mode - for an office TV showing recognition scores.
Someone built a slide deck engine into a leaderboard. The TV has one job.
It does not need transitions.
🧠 Actually Kinda Smart
The TV is the product surface. If the scoreboard just sits there as a static list,
nobody reads it after day two. The carousel forces visibility - each person gets their
moment. That is the whole point of the board. Without the carousel, it is wallpaper.
Auto Scoreboard Refresh
🔥 The Roast
An EventBridge cron triggers SNS, which fans out to SQS, which invokes a Lambda,
which writes one JSON file - twice a day. That is a four-service distributed pipeline
to schedule a file write. A cronjob would have been three lines.
🧠 Actually Kinda Smart
The SQS queue serialises concurrent regeneration requests so two Lambdas never race on
the same S3 write. SNS means future consumers - Slack notifications, webhooks - can
subscribe without touching the generator. The DLQ auto-retries failures silently.
That is not over-engineering, that is the architecture doing its job.
Data Export & Import
🔥 The Roast
A full round-trip backup and restore pipeline - for an app that tracks who won a
"Best Commit Message" badge in Q3. The data loss risk here is reputational at best.
Nobody is calling the DBA at 2am over this.
🧠 Actually Kinda Smart
PITR covers infrastructure-level recovery, but environment migration - dev to prod,
org rebrand, new AWS account - requires a portable data format. The export/import
round-trip is the escape hatch that makes the system not hostage to its own DynamoDB tables.
You only need it once. When you need it, you really need it.
02 · Architecture Overview
The Big Picture
Serverless. Event-driven. Over-engineered in all the right ways.
Here is what we are dealing with before we go into each discipline.
🔥 Core thesis: "Let's build a leaderboard using a full serverless event-driven AWS architecture - WAF,
KMS encryption at rest, PITR, OIDC federation, SNS fan-out, and a custom Lambda logging layer."
Nobody asked for that. Nobody said no. I proceeded.
Tech Stack
Backend
Runtime
Python 3.14
Web framework
FastAPI
ASGI adapter
Mangum
AWS SDK
boto3
JWT validation
PyJWT[crypto] (authorizer)
Structured logging
rws-logger (custom Lambda layer)
Package management
uv - per-function virtual envs
Frontend
Framework
React 19 + TypeScript
Build tool
Vite
Auth
@azure/msal-browser + @azure/msal-react
Runtime config
window.__env__ via public/config.js
Infrastructure
IaC
OpenTofu
State backend
S3 (state + lock)
CI/CD
Azure DevOps Pipelines
AWS auth from CI
OIDC - sts:AssumeRoleWithWebIdentity
The Scoreboard Regeneration Pipeline
Every time the scoreboard needs to be refreshed - scheduled or on-demand - this is the journey:
⏰
EventBridge
cron / admin click
↓
📢
SNS Topic
fan-out
↓
📬
SQS Queue
buffer + DLQ
↓
⚡
Generator Lambda
concurrency = 1
↓
S3 Write
scoreboard.json
↓
CloudFront
cache invalidation
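The generator step at the end of that chain reduces to "scan three tables, aggregate, write one file". A minimal sketch - table names, bucket name, and item fields are stand-ins, not the project's real definitions:

```python
import json

def build_scoreboard(people, achievements, assignments):
    """Pure aggregation step: the 'manual JOIN' across the three tables."""
    score_of = {a["id"]: a["score"] for a in achievements}
    totals = {p["id"]: 0 for p in people}
    for pa in assignments:
        totals[pa["personId"]] += score_of[pa["achievementId"]]
    ranked = sorted(people, key=lambda p: totals[p["id"]], reverse=True)
    return [{"name": p["name"], "score": totals[p["id"]]} for p in ranked]

def handler(event, context):
    import boto3  # imported lazily so the pure part stays testable without AWS
    ddb = boto3.resource("dynamodb")
    records = [ddb.Table(name).scan()["Items"]
               for name in ("People", "Achievements", "PersonAchievements")]
    body = json.dumps(build_scoreboard(*records))
    boto3.client("s3").put_object(
        Bucket="rws-scoreboard", Key="scoreboard.json",
        Body=body, ContentType="application/json",
    )
    # A CloudFront invalidation of /scoreboard.json would follow here.
```

The batch-fetch cost lives entirely in this function and runs twice a day - never on a page load.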
System Qualities
Scales to Zero
No idle servers. Lambda + DynamoDB on-demand. Pay per invocation.
🛡️
Security in Depth
WAF IP allowlist · JWT auth · KMS everywhere · TLS 1.3 · PITR on all tables.
Resilient Read Path
CDN serves scoreboard.json even during a full Lambda + DynamoDB outage.
Observable
Structured JSON logs with correlation IDs propagated across every Lambda.
🚢
Independently Deployable
Every Lambda has its own bundle and deploy script. Infra managed separately.
API-First
OpenAPI spec is the documentation contract. JSON schemas live as versioned files.
Architectural Decisions & Trade-offs
1 · Multiple single-purpose Lambda functions instead of a monolith
Each API resource group is a separate Lambda - independent deployment and isolated blast radius.
✅ Pros
⚠️ Cons
Independent deployment and scaling per function
More IAM roles, env vars, deploy scripts to manage
Least-privilege IAM - each function only accesses the tables it needs
Cold start per function in low-traffic scenarios
Isolated blast radius - bug in export cannot affect achievements
Cross-cutting concerns (logging, CORS) must be layered
2 · DynamoDB instead of a relational database
Three tables: People, Achievements, PersonAchievements. No joins at query time.
✅ Pros
⚠️ Cons
Fully serverless - no connection pooling, no schema migrations
No joins - scoreboard generator must batch-fetch all related records
Auto-scaling, built-in KMS encryption, PITR enabled out of the box
Query patterns must be designed upfront via GSIs
Pay-per-request pricing aligns with low-write workload
Unique email enforcement requires a conditional write
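The "conditional write" in that last row could look roughly like this - the GSI name, key names, and the choice to raise a `ValueError` are assumptions for illustration:

```python
def create_person(table, person):
    """Enforce email uniqueness as described above: check a hypothetical
    "email-index" GSI first, then insert with a condition expression so a
    retry or race can never silently overwrite an existing record.
    `table` is a boto3 DynamoDB Table resource (or any stand-in exposing
    the same query/put_item interface)."""
    existing = table.query(
        IndexName="email-index",
        KeyConditionExpression="email = :e",
        ExpressionAttributeValues={":e": person["email"]},
        Limit=1,
    )
    if existing["Count"] > 0:
        raise ValueError("email already in use: " + person["email"])
    table.put_item(
        Item=person,
        # attribute_not_exists guards the primary key on the write itself.
        ConditionExpression="attribute_not_exists(id)",
    )
    return person
```

The GSI lookup gives a friendly error; the condition expression is what actually holds under concurrency.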
3 · Pre-generated static scoreboard JSON
Scoreboard is generated asynchronously and stored as scoreboard.json in S3 - no DB read at display time.
✅ Pros
⚠️ Cons
Public endpoint requires no auth and no database read
Scoreboard is at most 12 hours stale without a manual regeneration
Survives a full backend outage (CDN continues to serve cached JSON)
Extra pipeline complexity (SNS, SQS, EventBridge, CloudFront invalidation)
Scales to thousands of simultaneous TV screens at no extra cost
Admin must trigger refresh explicitly to see changes immediately
4 · SNS fan-out with SQS buffering for scoreboard generation
EventBridge schedule and admin trigger both publish to SNS → SQS → Lambda generator.
✅ Pros
⚠️ Cons
Future consumers (Slack, analytics) can subscribe to SNS without changing existing code
Config values are not baked into the JS bundle - a config.js is uploaded to S3 at deploy time.
✅ Pros
⚠️ Cons
Single build artifact can be deployed to any environment
Extra deploy step required to generate and upload config.js
Config changes do not require a rebuild
config.js must be served with no-cache to prevent stale values
Branch protection for prod enforced at deploy stage, not build
9 · Open CORS on all Lambda APIs
All Lambda functions configure FastAPI's CORS middleware to allow requests from any origin.
✅ Pros
⚠️ Cons
Simplifies local development and cross-environment testing
Relaxed security posture - any origin can call the API
No need to update CORS config when the frontend domain changes
The WAF IP allowlist is the practical access control layer
C2 · Container Diagram
The Scoreboard Pipeline
🔥 The Roast
You built a four-service enterprise event pipeline - EventBridge, SNS, SQS, reserved concurrency -
to generate one JSON file, twice a day. That is 730 executions per year.
The infra to support it probably costs more than the compute.
🧠 Actually Kinda Smart
SQS ensures only one generator runs at a time - no race conditions on the S3 write.
SNS lets future consumers (Slack notifications, webhooks) subscribe without touching existing code.
The DLQ means failed regenerations auto-retry. Reserved concurrency = 1 costs literally nothing extra.
DynamoDB over RDS
🔥 The Roast
Three tables. No joins. A scoreboard generator that has to batch-fetch every person,
every achievement, and every assignment separately - because someone said "NoSQL scales"
for an app that writes maybe 50 rows a month. A SQLite file on a Raspberry Pi would have been fine.
🧠 Actually Kinda Smart
No RDS instance means no idle compute, no connection pool to manage, no schema migration scripts.
Auto-scaling, KMS encryption, and PITR come free out of the box.
For a low-write workload the economics are genuinely better than provisioned RDS.
The batch-fetch cost is paid once per regeneration, not per page load.
REST API v1 over HTTP API v2
🔥 The Roast
Chose the API Gateway variant that costs 3.5 times more per request - for an internal tool
behind a WAF that already blocks 99% of traffic. The audience is literally six admins
and a TV. The cheaper API would have handled that without breaking a sweat.
🧠 Actually Kinda Smart
HTTP API v2 cannot integrate with WAF. The WAF is not optional - it is the IP allowlist
that keeps the API private. REST API v1 was the only path to native WAF integration
without a CloudFront workaround. The price delta on near-zero traffic is cents per month.
KMS Everywhere
🔥 The Roast
Customer-managed KMS keys for DynamoDB, SSM, Lambda environment variables, and CloudWatch Logs -
on an app that stores people's names, job titles, and badge icons.
The threat model here is a disgruntled AWS employee with root access.
Nobody else is getting at this data.
🧠 Actually Kinda Smart
Checkov fails the plan without customer-managed keys on these resources.
The pipeline enforces it. The cost is minimal and the audit trail is real.
Security controls that are cheap to implement should be implemented - the point is building
the habit, not protecting secrets that do not exist yet.
Open CORS
🔥 The Roast
CORS is set to allow any origin across all Lambda functions.
The official justification is "the WAF handles it".
The WAF is an IP allowlist. The CORS header is sent after the WAF lets you through.
At that point, any browser tab on an allowed IP can call the API. Intentional, but worth naming.
🧠 Actually Kinda Smart
Local development, staging, and prod all share the same Lambda bundle.
Locking CORS to a specific origin means updating config for every environment change
and every domain rename. The WAF IP restriction is the meaningful access boundary here.
Open CORS behind a strict WAF is a deliberate layered posture, not a shortcut.
03 ยท Development - Backend
Nine Functions Walk Into a Bar
Each one orders a different drink, has its own IAM role, its own virtual environment,
and absolutely refuses to share a deployment pipeline with anyone.
Authorizer
Validates Entra ID JWT Bearer tokens. Fetches JWKS (cached). Returns IAM Allow/Deny policy to API GW.
→ all protected routes
Achievements
CRUD for the achievement catalogue. Blocks deletion when referenced by a PersonAchievement.
/achievements
👥
People
CRUD for the people roster. Unique email enforced via DynamoDB GSI + conditional write.
/people
Person Achievements
CRUD for assignment records (person · achievement · date · context).
/person-achievements
Scoreboard Trigger
Publishes a refresh event to SNS. That is literally the entire function.
/scoreboard/trigger
⚡
Scoreboard Generator
Reads all three DynamoDB tables, builds scoreboard.json, writes to S3, invalidates CloudFront.
SQS / EventBridge
📤
Data Export
Full DynamoDB scan. Returns JSON dump of all people, achievements, and assignments.
/export
📥
Data Import
Accepts the export format. Batch-writes all records to all three DynamoDB tables.
/import
🖼️
Media
Upload/list/delete images. Magic-byte validation. Scoped strictly to the media/ S3 prefix.
/media
C3 · Backend Lambda Functions
rws-logger - Because console.log is for Quitters
A custom Lambda layer published to AWS Lambda Layers and as a private package on AWS CodeArtifact.
Every Lambda imports it. Every log entry gets:
timestamp · level · message · correlationId.
The correlation ID is extracted from the API Gateway event and propagated across the full request lifecycle.
The logger has its own independent unit test suite.
Backend Unit Tests
Each Lambda has an independent test suite co-located in its own tests/ directory
with its own uv virtual environment. Tests run as a quality gate before bundling.
Test Tooling
pytest
Test runner and assertions
moto
AWS service mocking (DynamoDB, S3, SNS, SQS, SSM)
httpx
FastAPI TestClient transport
freezegun
Time freezing for scoreboard window tests
unittest.mock
Patching boto3 and urllib calls
Mocking strategy: AWS via moto @mock_aws. DynamoDB tables are created
in fixtures with the same GSIs as production. The authorizer patches at the lowest viable level
so _get_config and _get_jwks are actually executed and covered -
not just mocked away. No real AWS calls. Ever.
Coverage target - all Lambda functions
≥ 90% - enforced as a CI quality gate before bundling
FastAPI + Mangum
🔥 The Roast
A full async web framework with request validation, serialisation, and automatic OpenAPI docs -
inside a Lambda that handles maybe 20 requests a day from six admins.
A racing engine on a bicycle. The bicycle has a custom exhaust pipe (rws-logger).
🧠 Actually Kinda Smart
FastAPI gives free request validation, automatic 422 errors, and a local dev server.
Mangum is a 3-line ASGI adapter. Run uvicorn handler:app locally and Lambda
doesn't exist. The cold-start overhead is negligible. The DX gain is not.
DynamoDB - No Joins, No Problem
🔥 The Roast
Three tables. No joins. The scoreboard generator batch-fetches People, Achievements, and
PersonAchievements separately, then stitches them together in Python.
You invented a manual JOIN and called it "aggregation" in the docs.
🧠 Actually Kinda Smart
Fully serverless, auto-scaling, KMS-encrypted, PITR out of the box. No connection pooling,
no schema migrations - ever. Pay-per-request is cheap for a low-write workload.
The batch-fetch runs once per regeneration, not per page load.
The Scoreboard Trigger Lambda
🔥 The Roast
A dedicated Lambda function whose entire body is one SNS publish() call.
It has its own IAM role, its own pyproject.toml, its own virtual environment,
its own deploy script, and its own test suite. For one line of business logic.
🧠 Actually Kinda Smart
Separation of concerns: the HTTP trigger and the generation work are independently deployable.
The trigger Lambda can be replaced, versioned, or removed without touching the generator.
The IAM boundary means the admin-facing endpoint cannot access DynamoDB at all.
One line of business logic, zero blast radius.
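For scale, that entire function body might plausibly be - env var name and message shape are assumptions:

```python
import json
import os

def handler(event, context):
    """The whole trigger Lambda: publish one refresh event and return."""
    import boto3  # imported lazily so the module imports cleanly anywhere
    boto3.client("sns").publish(
        TopicArn=os.environ["SCOREBOARD_TOPIC_ARN"],
        Message=json.dumps({"source": "admin-trigger"}),
    )
    return {"statusCode": 202, "body": "regeneration queued"}
```

One publish, one 202, and the IAM role behind it can touch SNS and nothing else.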
rws-logger - Custom Logging Layer
🔥 The Roast
A custom Python Lambda layer, published to CodeArtifact, with its own test suite,
just to add a correlationId field to log lines.
CloudWatch can filter by request ID natively. This solved a problem that did not exist
and then got its own CI pipeline.
🧠 Actually Kinda Smart
The correlation ID propagates the API Gateway request ID across every Lambda in a chain -
not just the one that received the HTTP call. CloudWatch's built-in request ID only covers
the entry Lambda. Structured JSON means logs are queryable with CloudWatch Insights
without parsing free-text strings. The layer costs nothing to run and everything to not have
when you're debugging a silent failure at 2am.
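What rws-logger's output might look like, reduced to a sketch - the real layer's API is not shown here, so every name below is an assumption:

```python
import json
import time

class CorrelationLogger:
    """Minimal structured logger: one JSON object per line, carrying a
    correlation id that downstream Lambdas can reuse."""

    def __init__(self, correlation_id):
        self.correlation_id = correlation_id

    def log(self, level, message, **extra):
        entry = {
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "level": level,
            "message": message,
            "correlationId": self.correlation_id,
            **extra,
        }
        print(json.dumps(entry))  # CloudWatch captures stdout as one log line
        return entry

def from_api_gateway(event):
    # Seed the id from the API Gateway request context; any Lambda invoked
    # further down the chain is handed the same id.
    return CorrelationLogger(event["requestContext"]["requestId"])
```

Because every line is JSON, CloudWatch Insights can filter on `correlationId` directly instead of regex-mining free text.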
Magic-Byte Validation in the Media Handler
🔥 The Roast
The media upload endpoint reads the first bytes of every file to validate the magic signature
before accepting it. For an internal admin tool where the uploaders are the same six people
who can already read the source code. The threat model is a very confused colleague.
🧠 Actually Kinda Smart
MIME-type spoofing via file extension is trivial. Magic-byte validation catches it at the
Lambda level before the file ever reaches S3. The WAF does not inspect request bodies.
The check is three lines. The habit of validating untrusted input regardless of who's
sending it is the right one to build - even when the sender is yourself.
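The "three lines" are essentially a signature lookup. A sketch with a deliberately small accepted set - the real endpoint's list may differ:

```python
# First-bytes signatures for common image types (assumed accepted set).
MAGIC_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}

def sniff_image_type(payload: bytes):
    """Return the detected MIME type, or None when the leading bytes match
    no known image signature - the file extension is never consulted."""
    for magic, mime in MAGIC_SIGNATURES.items():
        if payload.startswith(magic):
            return mime
    return None
```

A renamed `.exe` claiming to be `photo.png` fails this check before a single byte lands in S3.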
Per-Function Virtual Environments
🔥 The Roast
Nine Lambda functions. Nine pyproject.toml files. Nine uv virtual
environments. Nine bundle scripts. If you need to update a shared dependency you touch nine
files. This is distributed monorepo pain with extra steps.
🧠 Actually Kinda Smart
Each Lambda's bundle contains only what it needs - no shared bloat inflating cold-start times.
A breaking change in one function's dependencies cannot silently affect another.
uv makes virtual env creation fast enough that the overhead is seconds, not minutes.
The pain of updating nine files is the cost of true isolation. It has paid off exactly once
already - when boto3 was bumped in one function without touching the authorizer.
04 ยท Development - Frontend
One React App. Many Opinions.
React 19. TypeScript. Vite. MSAL. And a window.__env__ injection scheme
that sounds like a hack until you hear the reasoning - and then it sounds like a hack you would do again.
window.__env__ via public/config.js injected at deploy time
Key Components
ScoreboardPage
Public entry point. Ranked table, carousel trigger, offline export button.
AdminPage
Tabbed interface: Achievements, People, Person Achievements, Media, Data Management.
useScoreboard
Fetches scoreboard.json from the CDN. Builds ranked ScoreboardEntry list client-side.
useCarouselTimer
State machine: 30s idle → carousel mode. Configurable timeouts from window.__env__.
CarouselView
Full-screen dark overlay. Cycles through people every 6s. CSS slide-in/slide-out animations.
ProtectedRoute
Guards admin routes. Checks active MSAL account. Redirects to scoreboard if absent.
api.ts
Thin fetch wrapper. Attaches JWT Bearer token to every admin API request automatically.
exportScoreboard
Collects CSS, base64-encodes all images, inlines carousel JS. Downloads as scoreboard.html.
MediaPickerModal
Browse or upload media. Embedded in achievement and person forms.
C3 · React SPA Components
Frontend Unit Tests
Test Tooling
Vitest
Test runner and assertions
@testing-library/react
Component rendering, semantic queries
@testing-library/user-event
Realistic user interactions
vi.mock / vi.spyOn
Module and function mocking
vi.stubEnv
Environment variable stubbing
Coverage exclusions: useAuth.ts (MSAL popup - untestable in jsdom),
App.tsx (Router wiring, tested via E2E),
main.tsx (entry point, no logic),
config.ts (window.__env__ is runtime-only).
All exclusions are documented and justified.
≥ 90% - enforced before the Vite bundle is produced
The Runtime Config Trick
🔥 The Roast
Instead of a .env file like a normal person, the deploy step generates config.js,
uploads it to S3, and the page loads it at runtime via window.__env__.
Environment variables: one weird trick. The React app reads globals off window like it's 2012.
🧠 Actually Kinda Smart
One build artifact, any environment - no rebuild for a config change.
Branch protection enforced at deploy time, not build time.
config.js is served with no-cache so stale config is structurally impossible.
The window global is typed and read through a single config.ts module - not scattered across the app.
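The deploy-time half of the trick is tiny: render a config.js from the environment and upload it with no-cache. A sketch, with variable names as stand-ins:

```python
import json

def render_config_js(env: dict) -> str:
    """Render the config.js that the page loads before the bundle, so the
    SPA reads window.__env__ at runtime instead of baked-in build values."""
    return f"window.__env__ = {json.dumps(env, indent=2)};\n"

# Deploy-time usage (sketch; bucket and keys are illustrative):
# s3.put_object(
#     Bucket="rws-frontend",
#     Key="config.js",
#     Body=render_config_js({"API_BASE_URL": "https://api.example.test"}),
#     ContentType="application/javascript",
#     CacheControl="no-cache",  # matches the no-cache requirement above
# )
```

The same `dist/` bundle then works in every environment; only this one generated file differs.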
Carousel + State Machine
🔥 The Roast
A useCarouselTimer hook with a 30-second idle state machine, configurable timeouts
read from window.__env__, and CSS slide-in/slide-out animations.
For a TV. That someone walks past. Once a day. The TV does not care about transitions.
🧠 Actually Kinda Smart
The carousel is the entire UX on a display nobody touches. Without it the board is a static
list and nobody reads it after day two. The idle timer and configurable timeouts mean it works
in any room layout without a code change. The animations are the difference between a
dashboard and a PowerPoint slide.
Offline Export
🔥 The Roast
The export function fetches all scoreboard data, base64-encodes every image,
inlines the carousel JS, and emits a single self-contained scoreboard.html.
A bespoke static site generator - for a leaderboard that already has a live URL.
🧠 Actually Kinda Smart
Conference rooms have projectors and no Wi-Fi. USB-booted TVs exist.
One click, one file, zero dependencies. The base64 inlining means images never break
on an air-gapped screen. The carousel works offline too - it is all inlined.
The feature is a single function. The value is disproportionate to the effort.
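The load-bearing piece is base64 inlining. The real export runs in the browser in TypeScript; the same idea in Python, purely for illustration:

```python
import base64

def inline_image(data: bytes, mime: str = "image/png") -> str:
    """Turn raw image bytes into a data: URI so the exported HTML renders
    with zero network access - the core trick of the offline export."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"

# Every <img src="..."> in the exported page is rewritten to such a URI,
# so the single scoreboard.html file carries its own images.
```

The cost is file size (base64 adds roughly a third); the payoff is that nothing on an air-gapped TV can 404.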
MSAL in a React SPA
🔥 The Roast
@azure/msal-browser + @azure/msal-react for an admin panel
with maybe three active users. The authentication library is roughly the same size
as the rest of the app combined. The login button triggers a Microsoft popup.
For a leaderboard admin.
🧠 Actually Kinda Smart
The company already uses Entra ID. There is no credential management to build,
no password reset flow, no MFA to implement - it all comes from the IdP.
MSAL handles token refresh, expiry, and silent re-authentication automatically.
The popup is one line. The alternative is building auth from scratch.
React 19 for an Internal Tool
🔥 The Roast
React 19. The newest major version, released months before this was built.
An office recognition board is now running on the bleeding edge of the most popular
frontend framework in the world. This is not a risk assessment. This is a flex.
🧠 Actually Kinda Smart
Low-stakes internal tooling is exactly where you should use new technology.
If something breaks, six admins notice. React 19's improvements to the rendering model
and the compiler are real. The upgrade path from 18 is smooth.
The risk-reward here is correct.
useAuth.ts Excluded from Coverage
🔥 The Roast
The hook that handles the MSAL popup login flow is excluded from the 90% coverage gate
because it cannot be tested in jsdom. The most security-critical path in the app
has zero unit test coverage. The coverage badge is technically accurate.
🧠 Actually Kinda Smart
MSAL's popup flow requires a real browser window - jsdom cannot simulate it.
The exclusion is documented and justified, not hidden. The E2E tests on Port 5173
exercise the full MSAL init flow via intercepted OIDC responses.
The coverage gap is real but the testing gap is not.
05 · Testing - E2E
The Parallel Universe Where Auth Doesn't Exist
Playwright, Chromium, two Vite servers running simultaneously,
and a complete network-level impersonation of Microsoft's identity platform.
All for a leaderboard.
Port 5173 - Real MSAL
Real MSAL, but login.microsoftonline.com is intercepted via page.route()
and returns stubbed OIDC discovery responses. Tests: auth flow, scoreboard display, carousel.
Microsoft never receives a real network call.
🚪
Port 5174 - Auth Bypass
VITE_E2E_BYPASS_AUTH=true makes ProtectedRoute render unconditionally.
Tests: all admin screens (achievements, people, person achievements, media).
API responses mocked at network level via page.route(). No live backend needed.
All admin E2E tests own their own truth. Fully deterministic. No live backend.
No surprise API changes breaking your E2E suite at 2am.
CI Behaviour
⚠️ E2E tests are intentionally not a build gate.
They run in a separate, manually-triggered pipeline.
Rationale: Playwright + network-level mocking + timing sensitivity = occasional false negatives
that should not block fast feedback on code changes.
retries: 2 in CI · workers: 1 in CI · forbidOnly: true - test.only blocks the run · Docker-first on Ubuntu 25.10 · HTML report published regardless of outcome
Two Vite Servers, Two Realities
🔥 The Roast
Two Vite dev servers launched in parallel: one where Microsoft Entra ID is intercepted
at the network layer and replaced with stubs, and one where authentication does not exist at all.
Two parallel universes with different laws of physics - for a leaderboard.
🧠 Actually Kinda Smart
Admin tests need authenticated state but CI cannot use real Entra ID credentials.
The bypass server is isolated - VITE_E2E_BYPASS_AUTH=true is a build-time flag
that cannot reach production builds. The no-bypass server still exercises the full MSAL init flow.
Two servers, two coverage domains, zero real Microsoft calls.
Faking Microsoft at the Network Layer
🔥 The Roast
page.route() intercepts calls to login.microsoftonline.com and returns
hand-crafted OIDC discovery responses and fake JWT tokens. You have written a partial clone
of Microsoft's identity platform in 50 lines of Playwright fixture code. For a leaderboard.
🧠 Actually Kinda Smart
The alternative is a test account in a real Entra ID tenant - credentials in CI, rotation policy,
MFA edge cases, and a dependency on Microsoft's uptime during your test run.
The stub is deterministic, offline, and never expires. It tests exactly what needs testing:
that MSAL receives a valid-looking token and lets the user through.
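In Playwright's Python bindings (the suite itself presumably uses the Node API, but page.route behaves the same way) the stub reduces to one route registration. The discovery document below is a trimmed stand-in, not the fixture's real payload:

```python
# Trimmed stand-in for a real OIDC discovery document.
OIDC_DISCOVERY = {
    "issuer": "https://login.microsoftonline.com/common/v2.0",
    "token_endpoint": "https://login.microsoftonline.com/common/oauth2/v2.0/token",
    "jwks_uri": "https://login.microsoftonline.com/common/discovery/v2.0/keys",
}

def stub_microsoft(page):
    """Answer every login.microsoftonline.com request locally, so no call
    ever leaves the test run. `page` is a Playwright Page object."""
    page.route(
        "https://login.microsoftonline.com/**",
        lambda route: route.fulfill(json=OIDC_DISCOVERY),
    )
```

One glob pattern covers discovery, token, and JWKS endpoints alike - which is most of the "partial clone of Microsoft" the roast is complaining about.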
All API Responses Mocked
🔥 The Roast
Every admin E2E test mocks every API call via page.route().
The E2E suite never touches the actual backend. Technically these are integration tests
dressed in Playwright's clothing. The "E2E" in the name is doing a lot of work.
🧠 Actually Kinda Smart
A live backend in E2E means flaky tests, state pollution between runs, and a dependency
on whatever is deployed. Network-level mocking gives full control over every response,
including errors, empty states, and pagination edge cases. The tests are deterministic
by design. The coverage is real even if the backend is not.
E2E Not a Build Gate
🔥 The Roast
The E2E suite is manually triggered and intentionally excluded from blocking the build.
The official reason is "timing sensitivity". The actual reason is that Playwright tests
are hard to make perfectly stable and someone made the pragmatic call to not let
a flaky carousel animation test block a hotfix deploy.
🧠 Actually Kinda Smart
Flaky E2E tests as a hard gate are worse than no gate - they train engineers to
ignore failures and re-run until green. A separate, manually-triggered pipeline with
retries: 2 and workers: 1 is slower but trustworthy. The HTML report is always published.
The tests are run, just not as a blocker. That is a deliberate trade-off, not laziness.
Playwright on Docker Ubuntu for a Vite Dev App
🔥 The Roast
The E2E pipeline spins up a Docker container on Ubuntu 25.10, installs Chromium,
starts two Vite dev servers, and runs Playwright - against a development build of a
React app that serves a recognition leaderboard. The infrastructure-to-feature ratio
is heroic.
🧠 Actually Kinda Smart
A dev server is the fastest reproducible target - no S3 sync, no CloudFront cache,
no deploy step. Docker ensures the browser, Node version, and OS are identical in CI
and locally. Ubuntu 25.10 is the latest LTS-adjacent release with the most current
Playwright browser binaries. The setup is correct, even if the application under
test is a scoreboard.
06 · DevOps
Two Clouds. One Project. Zero Static Credentials.
Azure DevOps for pipelines. AWS for everything else.
OpenTofu for infrastructure. OIDC federation so there is not a single access key stored anywhere.
Manual trigger. tofu init → tofu plan → Checkov security scan →
manual approval gate → tofu apply (exact plan, no drift).
Branch-protected for prod. Artifact-based apply: what was reviewed is what is deployed.
⚡
Lambda Build Pipeline
Auto-triggered on push (path-filtered to function directory).
Unit tests as quality gate → bundle → publish artifact.
Separate manual deploy step promotes the artifact to any environment without rebuilding.
🖥️
Frontend Pipeline
Auto-triggered on push. TypeScript check → Vitest coverage → Vite bundle.
Deploy: generate config.js, sync dist/ to S3, CloudFront invalidation.
Branch-protected for prod.
🧪
E2E Pipeline
Manually triggered. Playwright Chromium.
HTML report published as a pipeline artifact regardless of outcome.
Not a build gate - intentionally.
AWS Infrastructure · eu-west-2
Security Posture
🚫
WAF IP Allowlist
Blocks all traffic by default. Allows only known IP ranges. The practical access control layer for the public API.
KMS Everywhere
DynamoDB ยท SSM SecureStrings ยท Lambda env vars ยท CloudWatch Logs - all encrypted with customer-managed KMS keys.
⏱️
PITR on All Tables
35-day point-in-time recovery window on every DynamoDB table. Not just the important ones.
TLS 1.3
Security policy enforced on the API Gateway custom domain. Minimum version, not negotiable.
Checkov Scans
HCL and plan JSON scanned on every infrastructure pipeline run. Policy failures block deployment.
🪪
No Stored Credentials
OIDC short-lived tokens from ADO to AWS. Zero access keys in any pipeline variable group.
SCA - Vulnerability Scanning
Every backend Lambda project and the frontend ship a scan_vulnerabilities.sh script.
Runs Snyk CLI SCA against the full locked dependency tree, including dev deps.
🐍 Backend (per Lambda)
Isolated venv
uv venv .snyk_venv with pip seeded
Install deps
UV_PROJECT_ENVIRONMENT=.snyk_venv uv sync --dev
Export deps
uv export --dev → requirements-exported.txt
Scan
snyk test --command=.snyk_venv/bin/python
Cleanup
trap cleanup EXIT - removes venv + txt on exit
⚛️ Frontend (npm)
Install
npm ci - syncs node_modules with lock file
Scan
snyk test --file=package.json
⚙️ Environment Variables
SNYK_TOKEN
Required - Snyk API token
SEVERITY_LEVEL
Default: high
PYTHON_VERSION
Default: python3.14 (backend only)
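Assembled from the steps and variables above, the per-Lambda script might look like the following. This is a hedged reconstruction of scan_vulnerabilities.sh, not the shipped version: by default it prints the uv/snyk commands instead of executing them, and the real script's flags and ordering may differ.

```shell
#!/usr/bin/env bash
# Hedged reconstruction of scan_vulnerabilities.sh for one backend Lambda.
# DRY_RUN=1 (default) prints the uv/snyk commands rather than running them.
set -euo pipefail

SEVERITY_LEVEL="${SEVERITY_LEVEL:-high}"
PYTHON_VERSION="${PYTHON_VERSION:-python3.14}"
DRY_RUN="${DRY_RUN:-1}"

run() { if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi; }

cleanup() { rm -rf .snyk_venv requirements-exported.txt; }
trap cleanup EXIT   # venv and export file removed on success, failure, or Ctrl-C

[ "$DRY_RUN" = 1 ] || : "${SNYK_TOKEN:?Snyk API token is required}"

run uv venv .snyk_venv --seed --python "$PYTHON_VERSION"   # isolated venv, pip seeded
run env UV_PROJECT_ENVIRONMENT=.snyk_venv uv sync --dev    # install full dev tree
run uv export --dev -o requirements-exported.txt           # lock-file export for snyk
run snyk test --file=requirements-exported.txt \
    --command=.snyk_venv/bin/python \
    --severity-threshold="$SEVERITY_LEVEL"
```

The `trap cleanup EXIT` line is the load-bearing one: the throwaway venv and the exported requirements file never survive the script, no matter how the scan ends.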
OpenTofu - Not Terraform
🔥 OpenTofu manages everything: DynamoDB tables, Lambda functions, IAM roles, WAF ACLs,
ACM certificates, Route53 records, API Gateway resources, methods, integrations, the authorizer,
SQS queues, SNS topics, EventBridge rules, and KMS keys.
The OpenAPI spec is the documentation contract - the actual API is deployed by OpenTofu.
You made a deliberate choice: one source of truth for deployment.
Two Clouds, One Project
🔥 The Roast
Microsoft for CI/CD, Amazon for compute. Two billing departments, two CLIs, two sets of IAM concepts,
two support contracts, and an architecture diagram that requires a legend.
Why pick one cloud when you can explain both at every job interview?
🧠 Actually Kinda Smart
ADO was already running for source control and project management - the switch cost was zero.
OIDC federation bridges them without storing a single AWS credential anywhere.
Short-lived tokens via sts:AssumeRoleWithWebIdentity mean no rotation calendar,
no leaked key incidents, and no 3am Slack messages about compromised access keys.
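The exchange described above boils down to a single STS call. The sketch below only prints the command rather than executing it: the role ARN and JWT are placeholders, and in a real Azure DevOps job the federated token comes from the AWS service connection, not from an environment variable.

```shell
#!/usr/bin/env bash
# Hedged sketch of the OIDC token exchange behind "no stored credentials".
# Role ARN and JWT are placeholders; an ADO job receives the federated token
# from its service connection, not from an env var like this.
ROLE_ARN="arn:aws:iam::111122223333:role/ado-deployer"   # hypothetical role
ADO_JWT="${ADO_JWT:-<federated-jwt-issued-per-job>}"     # short-lived, job-scoped
SESSION="ado-${BUILD_BUILDID:-local}"                    # ADO predefined variable

cmd=(aws sts assume-role-with-web-identity
     --role-arn "$ROLE_ARN"
     --role-session-name "$SESSION"
     --web-identity-token "$ADO_JWT"
     --duration-seconds 3600)                            # keys expire on their own

echo "${cmd[@]}"   # print rather than execute - nothing here to rotate or leak
```

Because the returned keys expire after the session duration, there is no long-lived secret to store in a variable group in the first place.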
OpenTofu Manages Everything
๐ฅ The Roast
API Gateway routes, methods, integrations, the authorizer, WAF rules, KMS keys, Route53 records -
all declared in HCL. Every new API endpoint requires an OpenTofu plan/apply cycle.
The OpenAPI spec exists purely as documentation that is structurally forbidden from being the truth.
๐ง Actually Kinda Smart
One source of truth for deployment means no drift between a separately deployed API spec
and the actual infra state. Every change goes through plan → Checkov → manual approval → apply.
The API contract is version-controlled, reviewable, and auditable.
The pain of HCL-defined routes is the price of that guarantee.
Manual Artifact Promotion
🔥 The Roast
The build pipeline produces an artifact. A separate manual deploy step promotes that artifact
to an environment. Two pipeline runs to get code to prod. The CD in CI/CD is doing
less work than the name implies.
🧠 Actually Kinda Smart
What was reviewed in the plan is exactly what is applied - no rebuild, no code change between
approval and execution. Promoting the same artifact to staging and then prod means
the thing that was tested is the thing that ships. The manual gate is deliberate:
infra changes deserve a human in the loop.
Checkov on Every Infrastructure Run
🔥 The Roast
Checkov scans the HCL and the plan JSON on every infrastructure pipeline run for a
leaderboard that stores names and badge icons. The security scanning infrastructure
is more sophisticated than the data being protected.
🧠 Actually Kinda Smart
Checkov is what enforces the customer-managed KMS keys, the PITR settings, and the
access logging that would otherwise be easy to skip on a low-stakes project.
Policy-as-code means the security posture is reproducible and auditable.
It runs in seconds. There is no good reason not to have it.
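For reference, the two passes mentioned above amount to four commands. This sketch lists them rather than executing them; the flags are assumed defaults, and the pipeline's actual invocations may add policy or skip options.

```shell
#!/usr/bin/env bash
# Hedged sketch of the two Checkov passes on an infrastructure run: a static
# scan of the HCL and a scan of the rendered plan JSON.
steps=(
  "tofu plan -out=tfplan"                   # plan artifact later applied verbatim
  "tofu show -json tfplan > tfplan.json"    # render the plan for scanning
  "checkov -d . --framework terraform"      # pass 1: static HCL
  "checkov -f tfplan.json"                  # pass 2: plan JSON with resolved values
)
printf '%s\n' "${steps[@]}"
```

Scanning the plan JSON as well as the HCL matters because the plan contains resolved values - module outputs, defaults, computed attributes - that a static scan of the source alone cannot see.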
Snyk SCA on Every Lambda
🔥 The Roast
Nine Lambda functions. Nine scan_vulnerabilities.sh scripts. Each one creates
an isolated venv, installs all dev dependencies, exports them, runs Snyk, and cleans up.
Vulnerability scanning for a leaderboard's dependency tree. The CVE risk here is theoretical.
🧠 Actually Kinda Smart
The scan runs against the full locked dependency tree including dev deps - not just
the production bundle. Vulnerabilities in build tools matter too.
The trap cleanup EXIT pattern means a failed scan never leaves a venv behind.
This is the kind of operational hygiene that is cheap to add early and expensive to retrofit.
07 · AI
Built With AI
This project was built with AI assistance throughout - not just for boilerplate, but for architecture
discussions, test generation, documentation, and the presentation you're reading right now.
Here's an honest breakdown of what that looked like.
🛠 Tools Used
Windsurf – Agentic Mode
AI coding agent running autonomously: multi-step tasks, file edits, terminal commands, and iterative reasoning without manual steering.
IntelliJ IDEA + Windsurf Plugin
IDE of choice. Windsurf runs as an IntelliJ plugin, keeping the full IDE experience while giving the agent access to the workspace.
LLM: Claude Sonnet 4.6 Thinking
Extended thinking model powering the agent. Used for architecture decisions, code generation, documentation, and the roasts in this presentation.
⚡ The Accelerator Template
One side-effect of building this project with AI was extracting the architectural
pattern - Lambda + API Gateway backend, S3 + CloudFront frontend, OpenTofu infra, Azure DevOps
pipelines. So I created a reusable reference library that an AI
agent can read at the start of a new project and immediately know how to build the whole stack
from scratch, almost correctly, without being told too many times.
The template consists of four reference
guides plus an entry-point file that explains how they connect:
🤖 bot-entry.md
The entry-point file. Explains how the four reference guides connect, when to use each,
and the conventions an AI agent should follow when bootstrapping the stack from scratch.
How to define the backend in OpenTofu: a reusable Lambda module, shared KMS key, per-function
IAM policies, and a REST API Gateway with TOKEN authorizer, Lambda proxy integrations, and CORS.
One module call per function - no Lambda resources defined outside it.
How to define the frontend in OpenTofu: a private S3 bucket (SSE-KMS via Origin Access Control),
a WAF Web ACL with IP allowlist, CloudFront distribution with security response headers,
ACM certificates (in us-east-1), and Route 53 alias records for both the
frontend and API Gateway domains.
The four-stage infrastructure pipeline: Plan → Scan (Checkov) → Manual Approval → Apply.
Covers the OIDC federation setup (iam_oidc.tf) that lets Azure DevOps assume
an AWS IAM role without storing any static credentials. All pipelines use short-lived
STS tokens scoped to the job.
Build and deploy pipelines for Lambda functions and the frontend SPA. Covers unit tests as a
quality gate, SCA vulnerability scanning, artifact bundling, and deployment to AWS.
Includes reusable ADO template steps and the shell scripts (bundle.sh,
deploy.sh, run_unittest.sh, scan_vulnerabilities.sh)
each service uses. Also covers the shared Python library / Lambda layer publish pattern
via AWS CodeArtifact.
The idea in one sentence:
Give an AI agent these four files at the start of a new project and it can bootstrap the
entire AWS + Azure DevOps stack - correctly, securely, and consistently - without being walked
through it step by step each time.