Semrush builds its keyword search volume data by scraping SERPs, partnering with third‑party providers, incorporating click‑stream signals, and applying proprietary machine‑learning models to turn raw signals into estimated monthly volumes.
- Semrush uses proprietary crawlers to scrape the first 100 organic SERP results for each keyword.
- Third‑party providers supply raw SERP data and click‑stream logs.
- Semrush combines multiple data streams (third‑party providers, click‑stream data, Google Keyword Planner) and applies machine‑learning models to turn the raw data into search‑volume estimates.
- The raw SERP data is enriched with ranking, snippet, and competition signals to feed its analytics reports.
- Some sources emphasize Google Keyword Planner as a primary baseline, while others treat it as just one of many signals, leading to differing views on its relative importance.
How Semrush Obtains Keyword Search Data
Semrush does not pull a single “official” search‑volume figure from Google. Instead, it creates its own estimates by blending several data streams and processing them with proprietary algorithms.
| Source / Method | What It Provides | How Semrush Uses It |
|---|---|---|
| SERP scraping (first ≈ 100 organic results) | Raw ranking positions, featured snippets, result counts | Collected by Semrush’s proprietary crawlers and fed into Domain & Keyword Analytics reports [2] |
| Third‑party data providers | Bulk SERP data, click‑through rates, competition signals | Partnerships supply large‑scale SERP extracts that supplement Semrush’s own crawls [4] |
| Google Keyword Planner (or similar planners) | Baseline monthly search‑volume numbers | Acts as a “ground‑truth” reference that is later adjusted by Semrush’s models [1] |
| Click‑stream data (anonymous user‑behavior logs) | Real‑world search activity patterns | Integrated into the modeling pipeline to improve volume accuracy [8] |
| Machine‑learning / statistical modeling | Converts raw signals into estimated monthly searches, CPC, competition level | Proprietary algorithms weigh each signal, smooth out noise, and produce the final numbers shown in the Keyword Overview [7][9] |
Step‑by‑step Flow
- Data acquisition – Semrush gathers raw keyword signals from SERP scrapes, third‑party feeds, Google Keyword Planner, and click‑stream logs.
- Signal enrichment – The first 100 organic results per keyword are analyzed to capture ranking depth, featured snippets, and result counts [2].
- Modeling – All collected signals are fed into machine‑learning models that estimate monthly search volume, CPC, and competition metrics [7].
- Continuous updates – The US database is refreshed regularly; each refresh re‑scrapes the first 100 organic results to keep the data current [2].
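The blending idea behind the modeling step can be illustrated with a toy example. Semrush’s actual models are proprietary, so everything below is an assumption made for illustration: the `KeywordSignals` fields, the `WEIGHTS` values, and the use of a plain weighted average in place of a machine‑learning model are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeywordSignals:
    """Hypothetical per-keyword inputs; field names are illustrative only."""
    planner_volume: Optional[float]      # baseline from a keyword planner
    clickstream_volume: Optional[float]  # volume extrapolated from click-stream logs
    third_party_volume: Optional[float]  # volume reported by a third-party feed


# Assumed weights for the sketch; the real weighting is not public.
WEIGHTS = {
    "planner_volume": 0.5,
    "clickstream_volume": 0.3,
    "third_party_volume": 0.2,
}


def estimate_monthly_volume(signals: KeywordSignals) -> Optional[int]:
    """Blend the available signals with a weighted average.

    Missing signals (None) are skipped, and the remaining weights are
    renormalized so that they still sum to 1.
    """
    total = 0.0
    weight_sum = 0.0
    for name, weight in WEIGHTS.items():
        value = getattr(signals, name)
        if value is None:
            continue
        total += weight * value
        weight_sum += weight
    if weight_sum == 0:
        return None  # no usable signal for this keyword
    return round(total / weight_sum)


if __name__ == "__main__":
    example = KeywordSignals(
        planner_volume=1000, clickstream_volume=1200, third_party_volume=800
    )
    print(estimate_monthly_volume(example))
```

A weighted average is only a stand‑in here. It shows why the final figure is an estimate that depends on which signals are available and how much each one is trusted.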
Why the Estimates May Differ from Google’s Numbers
- Sampling: Only the top 100 organic results are examined, not the entire SERP.
- Modeling assumptions: Machine‑learning models extrapolate from sampled data and click‑stream trends, which can introduce variance.
- Multiple data sources: Combining third‑party and click‑stream data adds depth, but it also involves weighting choices that differ from Google’s internal calculations.
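To make the effect of weighting choices concrete, the sketch below blends the same two hypothetical signals under two different weightings and measures how far each result lands from the planner baseline. The signal names, values, and weights are invented for illustration and do not reflect Semrush’s or Google’s real calculations.

```python
from typing import Mapping


def blend(signals: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of the signals, normalized by the weights actually used."""
    weight_sum = sum(weights[name] for name in signals)
    return sum(weights[name] * value for name, value in signals.items()) / weight_sum


def relative_gap(estimate: float, baseline: float) -> float:
    """Signed difference between estimate and baseline, as a fraction of the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (estimate - baseline) / baseline


# Hypothetical monthly volumes for a single keyword.
signals = {"planner": 1000.0, "clickstream": 1400.0}

planner_heavy = blend(signals, {"planner": 0.8, "clickstream": 0.2})
clickstream_heavy = blend(signals, {"planner": 0.2, "clickstream": 0.8})

if __name__ == "__main__":
    for label, estimate in [
        ("planner-heavy", planner_heavy),
        ("clickstream-heavy", clickstream_heavy),
    ]:
        gap = relative_gap(estimate, signals["planner"])
        print(f"{label}: estimate={estimate:.0f}, gap vs planner={gap:+.0%}")
```

With identical inputs, the estimate moves further from the planner figure as more weight shifts to the click‑stream signal, which is one way two tools can report different volumes for the same keyword.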
Takeaway
Semrush’s keyword search volume figures are derived estimates built from a mixture of SERP scraping, third‑party and click‑stream data, and machine‑learning models. They provide a useful, albeit approximate, view of keyword popularity for SEO and PPC planning.