Regex Extractor · Filter Lines by Pattern | DedupeLines
DedupeLines
Free · No signup · Browser-local · zero upload · line-by-line regex

Regex extractor for lines

Paste a list, write one regex, get back only the lines that match. Built for log scrubbing and lead-list cleanup — plus any text where you only need the rows that fit.

When to use this tool

guide
01 / Log filtering

Keep only ERROR / WARN lines from a 50K-line server log without SSHing in to run grep.

02 / Lead scrubbing

From a long email list, pull out only gmail addresses, or only your own @company.com domain.

03 / URL audit

Trim a big sitemap dump down to URLs that match a path pattern — no Excel filter-dialog dance.

04 / Pattern testing

Iterate on a regex visually: edit the pattern, hit Run, see which lines stay. The built-in presets get you started.

How it works

three steps
  1. Paste your text or drop a .txt file

    Live preview up to 100K lines on desktop / 5K on mobile. Larger inputs switch to big-file mode automatically: a Web Worker takes over, the page stays responsive, and you still get the .txt download.

  2. Write one regex

    JavaScript flavour (no surrounding slashes). 15 presets cover email, URL, IP, dates, times, phones, plus parametric patterns like Starts/Ends/Contains where you replace the highlighted abc placeholder.

  3. Hit Run

    Matching lines stay, the rest are dropped. Toggle case sensitivity, trim, blank-line removal, dedup of matches, or shuffle of output before you run.
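
The three steps above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual source: the function name `filterLines`, the option names, and the order in which trim, blank-line removal and dedup are applied are all assumptions. Shuffle is left out to keep the sketch short.

```javascript
// Hypothetical helper: none of these names come from the tool's source.
function filterLines(text, pattern, options = {}) {
  const {
    caseSensitive = false, // the page says matching is case-insensitive by default
    trim = false,
    dropBlank = false,
    dedupe = false,
  } = options;

  const re = new RegExp(pattern, caseSensitive ? "" : "i");
  const seen = new Set();
  const output = [];

  for (const raw of text.split(/\r?\n/)) {
    const line = trim ? raw.trim() : raw;
    if (dropBlank && line === "") continue; // skip blank lines before matching
    if (!re.test(line)) continue;           // unmatched lines are dropped
    if (dedupe) {
      if (seen.has(line)) continue;         // keep only the first copy of a match
      seen.add(line);
    }
    output.push(line);
  }
  return output;
}

const sample = "ERROR disk full\ninfo ok\n\nerror disk full  \nERROR disk full";
console.log(filterLines(sample, "^error", { trim: true, dedupe: true }));
// [ 'ERROR disk full', 'error disk full' ]
```

Dedup here compares the exact line text, so the two surviving lines differ only in case.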

Under the hood

engine notes
Algorithm
JavaScript native RegExp · one re.test() per line

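Uses the browser's built-in RegExp implementation (V8 in Chrome and Edge, SpiderMonkey in Firefox, JavaScriptCore in Safari), all of which follow ECMA-262. The i flag (case-insensitive) is set by default unless the case toggle is on.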
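
As a rough sketch of what "one re.test() per line" with a default i flag could look like (the helper name `buildLineRegex` is hypothetical, and the tool's real flag handling is not documented here):

```javascript
// Hypothetical helper; the tool's real option names are not shown on this page.
function buildLineRegex(pattern, caseSensitive = false) {
  // No "g" or "y" flag: with those flags test() advances lastIndex between
  // calls, so a result could depend on the line tested just before.
  return new RegExp(pattern, caseSensitive ? "" : "i");
}

const loose = buildLineRegex("^warn");
const strict = buildLineRegex("^warn", true);

console.log(loose.test("WARN cache miss"));  // true
console.log(strict.test("WARN cache miss")); // false

// The stateful behaviour this sketch avoids:
const sticky = /a/g;
console.log(sticky.test("a")); // true
console.log(sticky.test("a")); // false, because lastIndex moved past the match
```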

Throughput
100K lines tested in ≈380 ms on a 2024 M3 MacBook Air

One O(n) pass over the lines, with no per-line allocation beyond the match string. An 80 MB hard ceiling per run protects against accidental gigabyte inputs.

Threading
Web Worker for files ≥ 2 MB or ≥ 100K lines

Main thread stays interactive; the Worker imports the same dedupe.js engine and posts the result back when done.
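
A minimal sketch of that hand-off, assuming a message shape of `{ text, pattern, flags }` and a placeholder file name `filter-worker.js`. Neither is taken from dedupe.js; they only illustrate the postMessage round trip.

```javascript
// Assumed job shape: { text, pattern, flags }. Not taken from dedupe.js.
function runJob({ text, pattern, flags }) {
  const re = new RegExp(pattern, flags);
  return text.split(/\r?\n/).filter((line) => re.test(line));
}

// Worker side: only wires itself up when loaded inside a Web Worker.
if (typeof WorkerGlobalScope !== "undefined" && self instanceof WorkerGlobalScope) {
  self.onmessage = (event) => {
    self.postMessage({ lines: runJob(event.data) });
  };
}

// Page side (browser only), with "filter-worker.js" as a placeholder name:
//   const worker = new Worker("filter-worker.js");
//   worker.onmessage = (e) => console.log(e.data.lines.length, "lines matched");
//   worker.postMessage({ text, pattern: "^(ERROR|WARN)", flags: "i" });
```

Because the matching logic lives in a plain function, the same code can run on the main thread for small inputs and inside the Worker for large ones.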

Privacy & limits

what stays where

Everything runs in your browser. Your text is never uploaded: the regex runs against in-memory strings, never against anything sent to a server. After the page is loaded, you can disconnect from the network and the tool still works. An 80 MB hard cap per run protects against accidentally locking the tab on a multi-gigabyte log.

Common use cases

where it fits

Log filtering at scale

Keep only ERROR / WARN / FATAL lines from a 100K-line server log. Iterate the pattern visually instead of SSHing in to run grep variants.

Lead list scrubbing

From a 50K-row email export, pull out only the gmail.com or @your-company.com addresses — no Excel filter dance, no upload.

Sitemap audit

Trim a 200K-row sitemap dump down to URLs that match a path pattern. Combine with the dedupe home tool for unique URLs only.

Pattern testing without leaving your data

Iterate on a regex against your real list (not toy examples on regex101) without uploading anything. Tweak, Run, watch matches.
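
The use cases above might translate into patterns like the following. These are illustrative only: the sample data, the `/blog/` path and the `keep` helper are made up, and in the tool you would type just the part between the slashes.

```javascript
// Illustrative patterns only; adjust them to your own data.
const patterns = {
  logLevels: /^(ERROR|WARN|FATAL)\b/,      // lines that start with a severity
  gmailOnly: /@gmail\.com$/i,              // addresses ending in @gmail.com
  blogPaths: /^https?:\/\/[^/]+\/blog\//i, // URLs under a /blog/ path
};

// Hypothetical helper: keep only the lines a pattern matches.
const keep = (lines, re) => lines.filter((line) => re.test(line));

console.log(keep(["ERROR db down", "INFO ok", "FATAL oom"], patterns.logLevels));
// [ 'ERROR db down', 'FATAL oom' ]

console.log(keep(["ana@gmail.com", "bo@example.org"], patterns.gmailOnly));
// [ 'ana@gmail.com' ]

// Sitemap audit plus dedup: a Set drops repeated URLs after filtering.
const urls = [
  "https://example.com/blog/a",
  "https://example.com/about",
  "https://example.com/blog/a",
];
console.log([...new Set(keep(urls, patterns.blogPaths))]);
// [ 'https://example.com/blog/a' ]
```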

Performance & comparison

real numbers
Metric · Value · Notes

Throughput · 100K lines
DedupeLines: ~380 ms (Chrome 132) · GNU grep -E: ~160 ms

Measured on a 2024 M3 MacBook Air. Reference dataset: 100K log lines, pattern /^(ERROR|WARN)/. grep wins on raw speed; DedupeLines wins on "no SSH, no install, works in any browser tab, including iPad and locked-down corporate Windows".

Memory footprint
DedupeLines: ~3× input in JS heap · grep: ~1.2× input in RSS

Browser JS overhead vs native binary. The 80 MB hard cap per run prevents OOM tab crashes; a 50 MB log stays under 200 MB resident.

Async offload threshold
DedupeLines: 2 MB / 100K lines (desktop) · regex101.com: ~50K char editor limit

Beyond the threshold, DedupeLines offloads to a Web Worker; the main thread stays interactive. iOS Safari memory caps drove the lower mobile threshold (300 KB / 5K lines).

Regex engine
DedupeLines: ECMA-262 §22.2 RegExp · grep: POSIX BRE / ERE

JavaScript native RegExp, implemented to the same specification in Chrome, Firefox, Safari and Edge. Lookbehind, named groups and Unicode property escapes (\p{…}) are all supported. The POSIX flavour in grep is more conservative.

Cold-start latency
DedupeLines: 0 ms (page already loaded) · SSH to remote: 200-500 ms RTT

No round-trip to a remote box. Once the page is loaded, the regex runs as fast as a key press, and the tool works offline.
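
If you want to sanity-check the throughput figure on your own machine, a rough timing sketch could look like this. The reference dataset is not published, so the log lines below are synthetic, and your timings will differ from the numbers quoted above.

```javascript
// Synthetic data: made-up log lines, evenly split across four levels.
const levels = ["ERROR", "WARN", "INFO", "DEBUG"];
const lines = Array.from({ length: 100_000 }, (_, i) =>
  `${levels[i % levels.length]} request ${i} handled`);

const re = /^(ERROR|WARN)/; // same pattern as the reference measurement
const start = performance.now();
let matched = 0;
for (const line of lines) {
  if (re.test(line)) matched++;
}
const elapsedMs = performance.now() - start;

console.log(`${matched} of ${lines.length} lines matched in ${elapsedMs.toFixed(1)} ms`);
```

Paste it into the browser console or run it with Node. A single run is noisy, so repeat it a few times before comparing.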

Frequently asked

answered

Which regex flavour does this use?

JavaScript built-in RegExp (ECMA-262). Lookbehind, named groups, Unicode property escapes (\p{…}) all work in modern Chrome, Firefox, Safari, and Edge. Don't prefix your pattern with /; the slashes around the input field are decorative.
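
A quick way to see those features in plain JavaScript is shown below. One caveat: Unicode property escapes only work when the u (or v) flag is set, and this page does not say whether the tool sets that flag for you, so treat the last two lines as a property of the language rather than of the tool.

```javascript
// Lookbehind: digits that are preceded by a dollar sign.
console.log(/(?<=\$)\d+/.test("total: $42")); // true

// Named group: pull the year out of an ISO-style date.
const m = "deployed 2024-03-15".match(/(?<year>\d{4})-(?<month>\d{2})/);
console.log(m.groups.year); // 2024

// Unicode property escape: \p{Lu} means "uppercase letter" under the u flag.
console.log(/^\p{Lu}/u.test("\u00C9clair")); // true
// Without the u flag, \p is read as a literal "p", so nothing matches here.
console.log(/^\p{Lu}/.test("\u00C9clair"));  // false
```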

Why doesn't my regex match anything?

Three common causes: (1) the trim toggle is off and your lines have trailing whitespace that breaks ^…$ anchors, (2) the case toggle is on and your characters differ in case, (3) you wrapped the pattern in /…/ or escaped its slashes out of habit. Enter the bare pattern; JavaScript regex doesn't need forward slashes escaped.
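
Each of these is easy to reproduce in a console. The sketch below assumes the pattern field is passed to `new RegExp()` as typed, which this page implies but does not state outright; the sample lines are made up.

```javascript
const anchored = "^ERROR: timeout$";

// (1) Trailing whitespace defeats the $ anchor until the line is trimmed.
const padded = "ERROR: timeout  ";
console.log(new RegExp(anchored, "i").test(padded));        // false
console.log(new RegExp(anchored, "i").test(padded.trim())); // true

// (2) Case-sensitive matching rejects lines that differ only in case.
console.log(new RegExp(anchored).test("error: timeout"));      // false
console.log(new RegExp(anchored, "i").test("error: timeout")); // true

// (3) Slashes: unescaped and escaped forms both match a literal "/",
// but slashes wrapped around the whole pattern become literal characters.
console.log(new RegExp("api/v1").test("GET api/v1/users"));    // true
console.log(new RegExp("api\\/v1").test("GET api/v1/users"));  // true
console.log(new RegExp("/^ERROR/").test("ERROR: timeout"));    // false
```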

Can the pattern span multiple lines?

No — this tool runs the regex once per line, by design. If you need cross-line matching, use a regex IDE like regex101.com that handles the dotall flag and multiline anchors directly.
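
To see why, compare one test against the whole text with one test per line. This is a small sketch with made-up input:

```javascript
const text = "BEGIN\nEND";
const crossLine = /BEGIN\nEND/;

// Against the whole text, the pattern can see the newline and matches.
console.log(crossLine.test(text)); // true

// Tested line by line, no single line ever contains a newline.
const perLine = text.split("\n").some((line) => crossLine.test(line));
console.log(perLine); // false
```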

Is my data sent anywhere?

No. The regex match runs entirely in this browser tab. Nothing about your text gets shipped to a server, and the analytics layer cannot read the lines you tested. Disconnect from Wi-Fi after loading and the tool still works.

What's the file size limit?

80 MB hard cap per run. Above 2 MB on desktop (300 KB on mobile) the tool switches to big-file mode — your text doesn't go through the textarea, processing happens in a Web Worker, and the result downloads as .txt.
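
Put together, the size rules on this page could be expressed roughly as follows. The function and constant names are hypothetical, and two details are assumptions: that MB and KB mean binary units, and that inputs exactly at a threshold go to the Worker.

```javascript
// Assumption: binary units. The page does not say whether 1 MB is 2^20 or 10^6 bytes.
const KB = 1024;
const MB = 1024 * KB;

const LIMITS = {
  hardCapBytes: 80 * MB,
  desktop: { bytes: 2 * MB, lines: 100_000 },
  mobile: { bytes: 300 * KB, lines: 5_000 },
};

// Hypothetical helper: decide how an input of a given size would be handled.
function chooseMode(byteLength, lineCount, isMobile = false) {
  if (byteLength > LIMITS.hardCapBytes) return "rejected"; // over the 80 MB cap
  const limit = isMobile ? LIMITS.mobile : LIMITS.desktop;
  const big = byteLength >= limit.bytes || lineCount >= limit.lines;
  return big ? "worker" : "inline";
}

console.log(chooseMode(1 * MB, 20_000));       // inline
console.log(chooseMode(3 * MB, 20_000));       // worker
console.log(chooseMode(400 * KB, 1_000, true)); // worker
console.log(chooseMode(81 * MB, 1));           // rejected
```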