How we reduced analysis time from 3 minutes to under 2
April 1, 2026 · 8 min read
The problem
When ARI launched, a full analysis took an average of 3 minutes 20 seconds. That's not bad — manual QA takes 3 days — but it felt slow. Engineers would trigger an analysis, context-switch to something else, and forget to check the result.
We set a target: under 2 minutes, always.
What was taking so long
We profiled hundreds of analyses and found the time was split roughly like this:
- 46%: crawling and simulating user flows
- 31%: AI inference (the Claude API calls)
- 15%: regression comparison against the previous baseline
- 8%: report generation and storage
The low-hanging fruit was obvious: we were doing most of this sequentially.
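To put those shares in wall-clock terms, here is a quick back-of-the-envelope conversion against the 3 minute 20 second average. The stage names are labels made up for this sketch, and the numbers are simply the percentages above applied to 200 seconds.

```python
# Convert the profiled percentage split into approximate seconds per stage.
# Stage names are illustrative labels, not identifiers from the ARI codebase.
TOTAL_SECONDS = 3 * 60 + 20  # the 3m20s average quoted above

SHARES_PERCENT = {
    "crawling": 46,
    "ai_inference": 31,
    "regression": 15,
    "report": 8,
}

seconds_per_stage = {
    stage: round(TOTAL_SECONDS * pct / 100)
    for stage, pct in SHARES_PERCENT.items()
}

if __name__ == "__main__":
    for stage, secs in seconds_per_stage.items():
        print(f"{stage}: ~{secs}s")
```

That works out to roughly 92s of crawling, 62s of inference, 30s of regression comparison, and 16s of reporting, which lines up with the ~90s and ~30s starting points quoted in the fixes below.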
Fix 1: Parallel flow execution
Previously, we crawled each critical user flow one at a time: login, then checkout, then signup, then search, and so on. Each flow took 15–25 seconds, so an app with five flows could spend up to 125 seconds crawling before we even started AI inference.
We refactored to run all flows concurrently using a worker pool. Each flow gets its own headless browser instance. The pool scales based on application complexity — simple apps get 3 workers, complex apps get up to 8.
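A minimal sketch of the bounded worker pool idea, assuming an asyncio-based crawler. The names `pool_size`, `crawl_flow`, and `crawl_all` are hypothetical, and the sleep stands in for driving a real headless browser; only the 3-worker and 8-worker limits come from the description above.

```python
import asyncio


def pool_size(flow_count: int, complex_app: bool) -> int:
    """Pick a worker count: 3 for simple apps, up to 8 for complex ones.

    Assumption for this sketch: never start more workers than there are flows.
    """
    limit = 8 if complex_app else 3
    return max(1, min(flow_count, limit))


async def crawl_flow(name: str) -> dict:
    """Hypothetical stand-in for one headless browser walking one flow."""
    await asyncio.sleep(0.01)  # placeholder for the real crawl
    return {"flow": name, "status": "ok"}


async def crawl_all(flows: list[str], complex_app: bool = False) -> list[dict]:
    """Crawl every flow concurrently, capped at the pool size."""
    slots = asyncio.Semaphore(pool_size(len(flows), complex_app))

    async def worker(name: str) -> dict:
        async with slots:  # wait for a free worker slot
            return await crawl_flow(name)

    # gather returns results in the same order as the input flows
    return await asyncio.gather(*(worker(name) for name in flows))


if __name__ == "__main__":
    flows = ["login", "checkout", "signup", "search"]
    print(asyncio.run(crawl_all(flows, complex_app=True)))
```

With the pool in place, total crawl time is bounded by the slowest batch of flows rather than the sum of all of them.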
Result: flow crawling dropped from ~90s to ~22s.
Fix 2: Streaming AI inference
We used to wait for the full crawl to complete before sending anything to the AI. Now we stream flow results to the AI as they complete. By the time the last flow finishes crawling, the AI has already processed the first three.
This required restructuring our prompt pipeline significantly — the AI now receives partial context and updates its analysis as new flow data arrives.
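The overlap can be sketched roughly like this, assuming the crawls run as asyncio tasks. `RunningAnalysis` is a hypothetical accumulator standing in for the incremental prompt pipeline; it does not call any real model API, and the sleep durations are invented to simulate flows finishing at different times.

```python
import asyncio


async def crawl_flow(name: str, seconds: float) -> dict:
    """Hypothetical crawl that finishes after `seconds`."""
    await asyncio.sleep(seconds)
    return {"flow": name}


class RunningAnalysis:
    """Illustrative stand-in for an analysis that accepts partial context."""

    def __init__(self) -> None:
        self.seen: list[str] = []

    async def update(self, flow_result: dict) -> None:
        # A real pipeline would send this flow's data to the model here
        # and revise the analysis with the new context.
        self.seen.append(flow_result["flow"])


async def analyze_streaming(flows: dict[str, float]) -> list[str]:
    """Feed each flow into the analysis as soon as its crawl completes."""
    analysis = RunningAnalysis()
    tasks = [asyncio.create_task(crawl_flow(n, s)) for n, s in flows.items()]
    # as_completed yields results in completion order, so analysis of early
    # flows overlaps with the crawls that are still running.
    for finished in asyncio.as_completed(tasks):
        await analysis.update(await finished)
    return analysis.seen


if __name__ == "__main__":
    durations = {"login": 0.15, "checkout": 0.05, "signup": 0.10}
    print(asyncio.run(analyze_streaming(durations)))
```

The key property is that the analysis sees flows in the order they finish, not the order they were started, so a slow flow never blocks work on the fast ones.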
Result: overlapping AI processing with crawling cut total time by ~35s.
Fix 3: Incremental regression comparison
Comparing against the previous baseline used to reload the entire baseline snapshot from storage. For large apps, this could be 50MB+ of crawl data.
We switched to a diff-based approach: we store a compact hash fingerprint of each flow state, and only load full snapshots for flows where the fingerprint changed.
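As an illustration of the fingerprint check, here is a small sketch that assumes each flow state can be serialized to JSON and hashed with SHA-256. The function names and the shape of the flow state are assumptions for the example, not the actual storage format.

```python
import hashlib
import json


def fingerprint(flow_state: dict) -> str:
    """Hash a flow state into a compact, order-independent fingerprint."""
    # sort_keys makes the serialization stable regardless of dict key order
    canonical = json.dumps(flow_state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_flows(baseline_fps: dict[str, str], current: dict[str, dict]) -> list[str]:
    """Return flows whose fingerprint differs from (or is missing in) the baseline."""
    return [
        name
        for name, state in current.items()
        if baseline_fps.get(name) != fingerprint(state)
    ]


if __name__ == "__main__":
    baseline = {
        "login": fingerprint({"steps": 3, "errors": 0}),
        "checkout": fingerprint({"steps": 5, "errors": 0}),
    }
    current = {
        "login": {"errors": 0, "steps": 3},     # unchanged, keys reordered
        "checkout": {"steps": 5, "errors": 1},  # changed
        "search": {"steps": 2, "errors": 0},    # new flow, no baseline yet
    }
    # Only these flows would need their full snapshots loaded.
    print(changed_flows(baseline, current))
```

Only the flows returned by `changed_flows` need their full baseline snapshot pulled from storage; everything else is settled by comparing a few dozen bytes.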
Result: regression comparison dropped from ~30s to ~4s for most analyses.
The result
Average analysis time is now 1 minute 58 seconds. P95 is under 3 minutes even for large applications.
More importantly, completion rates improved — engineers now stay on the analysis page instead of switching away, because the result arrives before they lose focus.