Case study 01 — ML Systems / Computer Vision

DeepVerify

A deepfake detection platform built on one premise: no single detector deserves your trust. Five CNN architectures examine every upload in parallel, and their agreement — or disagreement — becomes the verdict.

Status: Live · deepverify.site
Timeline: Nov — Dec 2025
Role: Solo build
Stack: Next.js · FastAPI · PyTorch · Redis · Docker

01 — The system

Stage 01 — Upload

Drag-and-drop console accepting multiple images at once. Each upload becomes an independent job with its own lifecycle — nothing blocks while models work.

Client-side type and size validation before any byte leaves the browser
Parallel uploads — each image is a separate job
Job ID returned immediately; the UI tracks status from there
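The intake rules above can be sketched in a few lines. The case study runs this check in the browser; the Python version below is only an illustration of the same rule, for example as a server-side mirror. The allowed types, the size limit, and the names `Job`, `validate`, and `submit` are all assumptions, not taken from DeepVerify.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

# Assumed limits: the case study does not state the real ones.
ALLOWED_TYPES = {"image/jpeg", "image/png", "image/webp"}
MAX_BYTES = 10 * 1024 * 1024


@dataclass
class Job:
    """One upload, tracked independently of every other upload."""
    filename: str
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "queued"


def validate(content_type: str, size: int) -> Optional[str]:
    """Return an error message, or None if the file is acceptable."""
    if content_type not in ALLOWED_TYPES:
        return f"unsupported type: {content_type}"
    if size <= 0 or size > MAX_BYTES:
        return f"size out of range: {size} bytes"
    return None


def submit(files):
    """Create one job per valid (name, content_type, size) tuple.

    Rejected files are reported separately and never block the valid ones.
    """
    jobs, rejected = [], []
    for name, content_type, size in files:
        error = validate(content_type, size)
        if error:
            rejected.append((name, error))
        else:
            jobs.append(Job(filename=name))
    return jobs, rejected
```

Each accepted file gets its own `job_id` straight away, which is what lets the UI start tracking status before any model has run.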

02 — Challenge: What stood in the way

One model lies confidently.

Single-model detectors look great on their own test set, then misfire in production — each architecture overfits to the artifacts of whatever generators it trained against. And inference this heavy can't sit on the request path.

A single CNN locks onto one family of generation artifacts — confident and wrong on the next one

Five models per image is heavy compute; running it synchronously freezes every upload

Skewed training data silently biases verdicts — the set had to stay balanced at 40K scale

03 — Solution: Decisions and tradeoffs

Three decisions shaped the system — each one traded raw simplicity for production behavior.

An ensemble, not a bigger model

Five diverse architectures vote on every image. Where one model's blind spot begins, another's training takes over — disagreement itself becomes a usable signal.

Tradeoff — 5× inference cost per image — bought back with parallel workers behind the queue.
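A minimal sketch of how a weighted vote can keep disagreement visible rather than averaging it away. The case study does not publish its weights, thresholds, or aggregation formula, so `ensemble_verdict`, the equal default weights, the 0.5 threshold, and the `max_spread` cutoff are illustrative assumptions. Scores are assumed to be each model's probability that the image is synthetic.

```python
from statistics import pstdev


def ensemble_verdict(scores, weights=None, threshold=0.5, max_spread=0.15):
    """Combine per-model scores into a verdict plus a disagreement signal.

    scores:  dict of model name -> assumed P(synthetic) in [0, 1]
    weights: dict of model name -> weight (defaults to equal weights)
    """
    names = list(scores)
    if weights is None:
        weights = {name: 1.0 for name in names}

    total = sum(weights[name] for name in names)
    weighted = sum(scores[name] * weights[name] for name in names) / total

    # Population standard deviation of the raw scores: 0 means full agreement.
    spread = pstdev(list(scores.values()))
    votes = sum(1 for name in names if scores[name] >= threshold)

    return {
        "verdict": "synthetic" if weighted >= threshold else "real",
        "score": weighted,
        "spread": spread,
        "votes_synthetic": votes,
        "contested": spread > max_spread,  # flag split decisions for review
    }
```

Returning `spread` and `contested` alongside the verdict is the design point: two images with the same weighted score can carry very different levels of agreement.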

A queue between the API and the models

Redis decouples intake from inference. Uploads return instantly with a job ID; workers chew through the backlog at their own pace.

Tradeoff — Results are eventual, not inline — acceptable when the answer lands in seconds.
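The decoupling described here can be illustrated without a Redis server. The sketch below swaps in Python's standard-library `queue.Queue` and threads as a stand-in for the Redis-backed queue and worker processes; it is not DeepVerify's code, and `enqueue`, `worker`, `fake_inference`, and `run_demo` are hypothetical names.

```python
import queue
import threading
import uuid

jobs = {}                  # job_id -> {"status": ..., "result": ...}
pending = queue.Queue()    # stand-in for the Redis-backed queue
lock = threading.Lock()


def enqueue(image_name):
    """Intake path: record the job, queue it, return the ID without waiting."""
    job_id = uuid.uuid4().hex
    with lock:
        jobs[job_id] = {"status": "queued", "result": None}
    pending.put((job_id, image_name))
    return job_id


def fake_inference(image_name):
    """Placeholder for the five-model ensemble."""
    return {"image": image_name, "verdict": "real"}


def worker():
    """Pull jobs at the worker's own pace until a None sentinel arrives."""
    while True:
        item = pending.get()
        if item is None:
            pending.task_done()
            break
        job_id, image_name = item
        with lock:
            jobs[job_id]["status"] = "running"
        result = fake_inference(image_name)
        with lock:
            jobs[job_id].update(status="done", result=result)
        pending.task_done()


def run_demo(names, n_workers=2):
    """Start workers, enqueue every image, wait for the backlog to drain."""
    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    ids = [enqueue(name) for name in names]
    pending.join()
    for _ in threads:
        pending.put(None)
    for thread in threads:
        thread.join()
    return ids
```

`enqueue` never touches the model, so its cost stays constant however long inference takes; adding workers raises throughput without changing the intake path.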

Balance the data before tuning the models

The custom CNN trained on 40K images split exactly 20K real / 20K synthetic. Keeping that balance moved accuracy more than any architecture tweak.

Tradeoff — Curation time over raw dataset volume.
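One simple way to enforce an exact class balance is to sample the same number of items from each label. This is a sketch under assumptions: the case study does not say how its 20K / 20K split was produced, and `balance` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def balance(samples, per_class, seed=0):
    """Return exactly `per_class` items of each label, shuffled.

    samples: list of (path, label) pairs, e.g. label in {"real", "synthetic"}
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    balanced = []
    for label, paths in by_label.items():
        if len(paths) < per_class:
            raise ValueError(
                f"label {label!r} has {len(paths)} items, need {per_class}"
            )
        # Sample without replacement so no image is duplicated.
        balanced.extend((path, label) for path in rng.sample(paths, per_class))

    rng.shuffle(balanced)
    return balanced
```

Raising when a class is short, instead of silently returning a lopsided set, surfaces exactly the kind of skew that would otherwise bias verdicts unnoticed.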

04 — Architecture

Upload console and verdict report. Optimistic job rows track status from queued to done without a reload.

Upload console: drag-and-drop, multi-file
Job status: polls the queue state per job
Verdict report: per-model breakdown + timing
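Per-job polling can be sketched as a loop with a growing interval. The real client is part of the Next.js frontend; this Python version only shows the logic. `poll_until_done`, the interval values, and the `fetch_status` callable are assumptions, and `"done"` is the only terminal state the case study names.

```python
import time


def poll_until_done(fetch_status, job_id, interval=0.5, max_interval=4.0,
                    timeout=30.0, sleep=time.sleep):
    """Call fetch_status(job_id) until the job reports status "done".

    fetch_status: callable returning a dict with at least a "status" key
    sleep:        injectable so the loop can be tested without real waiting
    """
    waited = 0.0
    while True:
        state = fetch_status(job_id)
        if state["status"] == "done":
            return state
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still {state['status']!r}")
        sleep(interval)
        waited += interval
        # Back off so a slow job does not generate a flood of requests.
        interval = min(interval * 2, max_interval)
```

A usage example with a stubbed status source:

```python
states = iter([{"status": "queued"}, {"status": "running"},
               {"status": "done", "verdict": "real"}])
final = poll_until_done(lambda job_id: next(states), "img_4821",
                        sleep=lambda seconds: None)
```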

05 — Outcomes: Measured, not promised

Model ensemble: EfficientNet · ResNet · Xception · MobileNet · custom CNN

Reported accuracy: 5-model weighted vote; the figure is shown on the live product

Lighthouse (desktop): perf 100 · CLS 0 · LCP 0.7s (mobile 95)

Training images: 40K, balanced at 20K real / 20K synthetic

06 — Interface: Product views

View 01: DeepVerify homepage (Multi-Model Deepfake Detection). The production landing: multi-model verdicts with heatmaps and confidence scores.

View 03: Ensemble verdict

EfficientNet-B0: 0.82
ResNet-50: 0.74
Xception: 0.90
MobileNet-V2: 0.66
Custom CNN: 0.87

Verdict: SYNTHETIC, 96.8% confidence, 1.8s

Five independent probabilities, one weighted answer: spread is signal.

View 02: Inference queue

QUEUED
img_4823: waiting · pos 1
img_4824: waiting · pos 2

RUNNING
img_4821: ensemble · 5 models
img_4822: preprocess

DONE
img_4818: real · 0.94
img_4819: synthetic · 0.97
img_4820: real · 0.89

Jobs move queued → running → done while the API stays untouched.

View 04: API response

{
  "job_id": "img_4821",
  "verdict": "synthetic",
  "confidence": 0.968,
  "models": {
    "efficientnet_b0": 0.82,
    "resnet50": 0.74,
    "xception": 0.90,
    "mobilenet_v2": 0.66,
    "custom_cnn": 0.87
  },
  "latency_s": 1.8
}

An answer you can interrogate: every model on the record.
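Because every model's score is in the payload, a client can interrogate the response directly. The sketch below parses the sample response shown above into a small dataclass. The `Verdict` class and its helpers are hypothetical, and it assumes each per-model value is that model's probability that the image is synthetic, which the page implies but does not state.

```python
import json
from dataclasses import dataclass

# The sample payload from the API response view.
RESPONSE = """
{
  "job_id": "img_4821",
  "verdict": "synthetic",
  "confidence": 0.968,
  "models": {
    "efficientnet_b0": 0.82,
    "resnet50": 0.74,
    "xception": 0.90,
    "mobilenet_v2": 0.66,
    "custom_cnn": 0.87
  },
  "latency_s": 1.8
}
"""


@dataclass
class Verdict:
    job_id: str
    verdict: str
    confidence: float
    models: dict
    latency_s: float

    @property
    def spread(self):
        """Gap between the most and least suspicious model."""
        return max(self.models.values()) - min(self.models.values())

    def dissenters(self, threshold=0.5):
        """Models whose own call disagrees with the overall verdict."""
        says_synthetic = self.verdict == "synthetic"
        return [name for name, p in self.models.items()
                if (p >= threshold) != says_synthetic]


result = Verdict(**json.loads(RESPONSE))
least_sure = min(result.models, key=result.models.get)
```

In this sample no model dissents and the least certain one is `mobilenet_v2`. The reported confidence (0.968) sits above every per-model score, so it is not a plain weighted mean of them; the page does not say how it is derived.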

07 — Learnings: What the build taught

Disagreement is information

When five models split on an image, that spread says more than any single confidence score. Designing the verdict to expose it — instead of averaging it away — made the output trustworthy.

Queues are a UX feature

Redis wasn't a performance trick. It's the reason uploading feels instant while five CNNs grind in the background: latency you can't remove can still be moved off the request path.

Data discipline beats architecture

Keeping the training set balanced at 20K/20K moved the metrics more than any clever layer ever did. The boring work was the high-leverage work.
