Measure the build
your users get.

The production build your users actually run — measured across versions, flagged as a regression only when the change clears real, measured noise.

Read the docs →

Cross-version performance, on the real build

RAM RSS

0MB

start142MB peak356MB

median of 5 · ±12 MB · lower better

Render count

wasted18% wasted/cmd2.4

median of 5 · ±186 · lower better

version4.2.0 (812) branchmain commita1c3f90

iPhone 15 (sim) · iOS 26.4 · window 30 s

Find the screen that's leaking.

A global memory slope is duration-fragile. weftrun diffs the memory retained per screen — Δ-of-Δ — so a screen that leaks 3.4 extra MB reads the same whether the window was 30 s or 120 s. Hover a row to see exactly when in the run it happened.

Per-screen Δ RAMsingle-run · duration-robust

Screen	baseline Δ	candidate Δ	Δ-of-Δ
Detail	+8.2	+11.6	+3.4 MB
Feed	+4.1	+4.8	+0.7 MB
Home	+2.3	+2.1	−0.2 MB
Onboarding	—	+1.8	✦ new

Lines aligned at t=0 (window start). Δ-of-Δ keeps color even when run windows differ. — nav · RAM · CPU (sim-only)

Repeatability is measurable.

--repeat 5 runs your flow five times — each an isolated cold start, the app terminated between runs. Build and install once; measure five times. The sample standard deviation across those runs is your noise floor. Not a policy. Not a guess. A measurement.

$ weftrun run checkout-flow --repeat 5
build once · install once · measure 5×
✓ 5 cold starts · ram.rss median 324 MB · ±12 MB

five runs · cold start each

git diff --stat

$ git diff --stat
$

release build · production-optimized · auto-reverted

No source changes.
Production-representative.

weftrun instruments your release build at build time — no edits to your codebase, no SDK to add. You measure the optimized, production-compiled app your users actually run, not a debug build with developer-mode overhead. When the run ends, every injected change is reverted.

Color only when signal beats noise.

A red number is an accusation. Most tools accuse on a flat 5% — crying wolf on a 6% wobble that's pure noise, and staying silent when a leak goes from zero to five megabytes. weftrun colors a delta only when it clears 2σ of measured noise. A change that appeared from nothing gets a ✦ new badge. And if two runs aren't comparable, it refuses to diff them at all.

RAM RSS

324→298MB

▼ −8%

clears 2σ (±15 combined) · median of 5

Render count

1,847→2,100

▲ +14%

clears 2σ (±196 combined) · median of 3

JS heap comparable

41.2→42.0MB

· +2%

within 2σ noise floor — not colored

RAM trend

0→1.8MB/min ✦ new

appeared

0 → N — structural, surfaced first-class

the gate

Drag it: nothing turns red
until it earns red.

This is the chokepoint every verdict passes through. Move the candidate's delta. The badge stays neutral until the change clears twice the measured standard deviation — then, and only then, it's a regression.

measurement classes

verdict: the pass/fail headline metric (ram.rss, render.count).
comparable: cross-version comparable diagnostic.
sim-only: host/sim-bound — same-machine trending only.
proxy: not a real device number (rAF self-loop "FPS").

verdict gate render.count · baseline 1,847

+0.0% · below 2σ guard

candidate Δ vs baseline · noise floor ±196 (2σ)

02σ+450

below 2σ → noise above 2σ → regression

Measure the next change.

weftrun is opening access in batches. Join the waitlist — we'll email you the moment your seat is ready.

Free

5 runs / month
public dashboard
1 app

Team

$29/mo

unlimited runs
private CI gating
fingerprint-gated baselines
priority support

weftrun run <flow> --repeat 5

Measure the build
your users get.

Find the screen that's leaking.

Repeatability is measurable.

No source changes.
Production-representative.

Color only when signal beats noise.

Drag it: nothing turns red
until it earns red.

Measure the next change.

Free

Team

Join the weftrun waitlist

You're on the list.

Measure the buildyour users get.

Find the screen that's leaking.

Repeatability is measurable.

No source changes.Production-representative.

Color only when signal beats noise.

Drag it: nothing turns reduntil it earns red.

Measure the next change.

Free

Team

Measure the build
your users get.

No source changes.
Production-representative.

Drag it: nothing turns red
until it earns red.