The Showroom Standard Audit Pipeline

The infrastructure behind The Last Mile.

In flooring, tile, cabinets, and countertops, manufacturers invest heavily in brand marketing and demand generation, but most customer journeys end at a local dealer’s website. These dealers operate independently, and their digital presence varies widely. Understanding the state of this “dealer layer” requires measuring thousands of sites under consistent conditions.


Showroom Standard is the system behind The Last Mile.

It was built to audit dealer websites in the flooring, tile, cabinet, and countertop trades under consistent conditions. Not one-off checks. Not screenshots. Not hand-picked examples.

A repeatable pipeline.

The goal is simple: find the dealer layer, classify it honestly, measure the experience, and preserve enough structure in the data to make the findings defensible later.

The pipeline has three layers.

Discovery

Discovery surfaces dealer candidates from two sources: search-based market queries and a confirmed-dealer network.

That first pool is messy by design.

A search for flooring, tile, cabinets, or countertops in a metro area returns actual dealers, designers, contractors, distributors, brand pages, dead sites, and unrelated businesses that happen to match the language.

The pipeline does not treat that noise as an inconvenience. It treats it as part of the work.

Candidates are deduplicated, normalized, and stored before classification. This keeps the discovery layer visible instead of quietly cleaning the population until it looks better than it really is.
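The normalize-then-deduplicate step can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code: the function names are hypothetical, and it assumes that a lowercase hostname with any leading "www." removed is an adequate identity for a dealer site.

```python
from urllib.parse import urlsplit


def normalize_domain(raw_url: str) -> str:
    """Reduce a candidate URL to a lowercase host without a leading 'www.'."""
    candidate = raw_url.strip()
    # urlsplit only populates the host when a scheme is present.
    if "://" not in candidate:
        candidate = "http://" + candidate
    host = (urlsplit(candidate).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def dedupe_candidates(raw_urls: list[str]) -> list[str]:
    """Keep the first occurrence of each normalized domain, preserving order."""
    seen: set[str] = set()
    unique: list[str] = []
    for raw in raw_urls:
        domain = normalize_domain(raw)
        if domain and domain not in seen:
            seen.add(domain)
            unique.append(domain)
    return unique
```

Nothing is filtered for quality here. Two spellings of the same site collapse into one row, but a dead site or an unrelated business stays in the pool until classification decides what it is.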

Classification

Classification runs each candidate through a multi-stage LLM pipeline.

The classifier extracts site content, evaluates the business against a four-category taxonomy, and writes the result with confidence scoring.

The four categories are:

  • DEALER
  • ADJACENT
  • NON_DEALER
  • UNCERTAIN

The purpose is not just to decide whether a site “counts.”

The purpose is to avoid forcing uncertainty into the wrong bucket.
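One way to keep uncertainty out of the wrong bucket is to make UNCERTAIN the default whenever the model's answer is unusable or weak. The sketch below assumes a confidence score between 0 and 1 and a cutoff of 0.7. Both are illustrative assumptions, as are the names `Classification`, `finalize`, and `MIN_CONFIDENCE`; the source does not state a threshold or a record layout.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    DEALER = "DEALER"
    ADJACENT = "ADJACENT"
    NON_DEALER = "NON_DEALER"
    UNCERTAIN = "UNCERTAIN"


@dataclass(frozen=True)
class Classification:
    domain: str
    category: Category
    confidence: float  # assumed to be in the range 0.0 to 1.0


# Hypothetical cutoff; the source does not publish one.
MIN_CONFIDENCE = 0.7


def finalize(domain: str, raw_label: str, confidence: float) -> Classification:
    """Map a raw model label onto the taxonomy, falling back to UNCERTAIN."""
    try:
        category = Category(raw_label.strip().upper())
    except ValueError:
        # A label outside the taxonomy is never coerced into a real category.
        category = Category.UNCERTAIN
    if confidence < MIN_CONFIDENCE:
        category = Category.UNCERTAIN
    return Classification(domain, category, confidence)
```

Under these assumptions, `finalize("a.example", "dealer", 0.92)` keeps the DEALER label, while the same label at a confidence of 0.4 is stored as UNCERTAIN.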

Ground-truth validation against a 100-site manual review produced 87% agreement between the classifier and human judgment.
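Agreement in this sense is the share of reviewed sites where the classifier and the human reviewer chose the same category. A minimal sketch follows; the function name and the sample labels in the usage note are illustrative and are not the review data.

```python
def percent_agreement(model_labels: list[str], human_labels: list[str]) -> float:
    """Fraction of sites where the model and the human reviewer agree."""
    if len(model_labels) != len(human_labels):
        raise ValueError("label lists must be the same length")
    if not model_labels:
        raise ValueError("at least one labeled site is required")
    matches = sum(m == h for m, h in zip(model_labels, human_labels))
    return matches / len(model_labels)
```

With toy labels `["DEALER", "ADJACENT", "DEALER", "UNCERTAIN"]` from the model and `["DEALER", "ADJACENT", "NON_DEALER", "UNCERTAIN"]` from the reviewer, three of four match and the function returns 0.75. By the same arithmetic, 87 matches out of 100 reviewed sites gives 0.87.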

That matters because the performance research depends on the population being clean enough to trust.

Audit

Audit runs Lighthouse measurements through the PageSpeed Insights API for each classified site.

Mobile and desktop are measured separately because they are not the same experience. That distinction became one of the central findings of the research.
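A request to the PageSpeed Insights v5 API takes the site URL and a `strategy` of either `mobile` or `desktop`, and returns the Lighthouse result as JSON. The sketch below shows one way to measure both strategies for a site. The helper names are hypothetical, the choice to request only the performance category is an assumption, and real use would need error handling and rate limiting that are omitted here.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
STRATEGIES = ("mobile", "desktop")


def build_psi_url(site_url: str, strategy: str, api_key: Optional[str] = None) -> str:
    """Build a PageSpeed Insights request URL for one site and one strategy."""
    if strategy not in STRATEGIES:
        raise ValueError(f"strategy must be one of {STRATEGIES}")
    params = {"url": site_url, "strategy": strategy, "category": "performance"}
    if api_key:
        params["key"] = api_key
    return f"{PSI_ENDPOINT}?{urlencode(params)}"


def performance_score(psi_response: dict) -> Optional[float]:
    """Return the Lighthouse performance score (0 to 1), or None if absent."""
    categories = psi_response.get("lighthouseResult", {}).get("categories", {})
    return categories.get("performance", {}).get("score")


def audit_site(site_url: str, api_key: Optional[str] = None,
               timeout: float = 120.0) -> dict:
    """Measure mobile and desktop separately; this performs network requests."""
    results = {}
    for strategy in STRATEGIES:
        url = build_psi_url(site_url, strategy, api_key)
        with urlopen(url, timeout=timeout) as response:
            results[strategy] = performance_score(json.load(response))
    return results
```

Keeping the two strategies as separate requests and separate stored values is what allows the mobile and desktop distributions to be compared later rather than averaged away.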

Each result is stored with a timestamp so the dataset can support longitudinal analysis, not just a single snapshot.
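Timestamped storage can be as simple as one row per measurement, never overwritten. The sketch below uses Python's built-in `sqlite3` module. The table name, column names, and helper functions are assumptions made for illustration and do not reflect the pipeline's real schema.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional

# Hypothetical schema: one row per (site, strategy, measurement time).
SCHEMA = """
CREATE TABLE IF NOT EXISTS audit_results (
    id INTEGER PRIMARY KEY,
    domain TEXT NOT NULL,
    strategy TEXT NOT NULL CHECK (strategy IN ('mobile', 'desktop')),
    performance_score REAL,
    measured_at TEXT NOT NULL
)
"""


def record_audit(conn: sqlite3.Connection, domain: str, strategy: str,
                 score: Optional[float],
                 measured_at: Optional[datetime] = None) -> None:
    """Append one measurement with a UTC ISO 8601 timestamp."""
    when = measured_at or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO audit_results (domain, strategy, performance_score, measured_at) "
        "VALUES (?, ?, ?, ?)",
        (domain, strategy, score, when.isoformat(timespec="seconds")),
    )
    conn.commit()


def history(conn: sqlite3.Connection, domain: str, strategy: str) -> list:
    """Return (measured_at, performance_score) rows, oldest first."""
    cursor = conn.execute(
        "SELECT measured_at, performance_score FROM audit_results "
        "WHERE domain = ? AND strategy = ? ORDER BY measured_at",
        (domain, strategy),
    )
    return cursor.fetchall()
```

Because every run is inserted rather than updated, a later retest of the same site adds to its history instead of replacing the earlier score.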

A 200-site sample is retested periodically to estimate measurement variance. That keeps the findings grounded in repeatable measurement instead of pretending web performance is perfectly stable from run to run.
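The source does not say how variance is summarized, so the following is only one plausible approach: compute the sample standard deviation of each retested site's repeated scores, then take the median across sites as a typical run-to-run spread. The function names and the input shape (domain mapped to a list of scores) are assumptions.

```python
from statistics import median, stdev


def per_site_spread(retests: dict[str, list[float]]) -> dict[str, float]:
    """Sample standard deviation of repeated scores, for sites with 2+ runs."""
    return {
        domain: stdev(scores)
        for domain, scores in retests.items()
        if len(scores) >= 2
    }


def typical_spread(retests: dict[str, list[float]]) -> float:
    """Median per-site standard deviation across the retest sample."""
    spreads = per_site_spread(retests)
    if not spreads:
        raise ValueError("no site has at least two measurements")
    return median(spreads.values())
```

A difference between two groups of sites that is smaller than this typical spread is hard to distinguish from measurement noise, which is the reason to estimate it at all.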

Current scale

As of May 2026, the pipeline has scored approximately 22,000 dealer candidates.

Roughly 20,000 sites have been audited.

The confirmed research population includes approximately 7,300 dealer websites, plus a supplemental cohort of 926 sites from a manufacturer-aligned dealer network.

Those numbers matter less as bragging rights than as a guardrail.

The argument in The Last Mile does not come from finding a few bad websites and turning them into a story. It comes from measuring the dealer layer at enough scale to see the median clearly.

Current implementation

The pipeline currently runs on a Hetzner VPS alongside other operational infrastructure.

Audit storage is handled in SQLite.

The classifier is built on the Anthropic API.

That stack is intentionally simple. The value is not in the complexity of the infrastructure. It is in the discipline of the pipeline: discovery, classification, audit, validation, and repeatable measurement.
