Regular Expressions for Developers: A Practical Guide with Real-World Patterns

What Are Regular Expressions?

A regular expression (regex) is a sequence of characters that defines a search pattern. Regex engines can test whether a string matches a pattern, extract matching substrings, or replace parts of a string.

Regex is supported in virtually every programming language and in most text editors, command-line tools (grep, sed), and database engines, although syntax and feature support differ between engines.

Core Syntax

Literal Characters

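Most characters match themselves: cat matches the string "cat". Characters with a special meaning (. * + ? ^ $ { } ( ) [ ] | \) must be escaped with a backslash to match literally, so \. matches a literal dot.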

Character Classes

Syntax	Matches
`[abc]`	`a`, `b`, or `c`
`[^abc]`	Any character except `a`, `b`, `c`
`[a-z]`	Any lowercase ASCII letter
`[0-9]`	Any ASCII digit
`.`	Any character except newline (unless the `s` flag is set)
`\d`	Digit `[0-9]`
`\w`	Word character `[a-zA-Z0-9_]`
`\s`	Whitespace `[ \t\n\r\f\v]`
`\D`, `\W`, `\S`	Negated versions
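
To make the table concrete, here is a minimal JavaScript sketch of a few classes at work. The sample strings and variable names are illustrative assumptions, not part of any API.

```javascript
// Character classes in action (all sample strings are made up).

// [aeiou] matches any single listed character.
const vowels = "regular expression".match(/[aeiou]/g);
// ["e", "u", "a", "e", "e", "i", "o"]

// \D is the negation of \d, so this strips everything that is not a digit.
const digitsOnly = "Order #4821".replace(/\D/g, ""); // "4821"

// \w includes the underscore but not the hyphen.
const words = "snake_case and kebab-case".match(/\w+/g);
// ["snake_case", "and", "kebab", "case"]

// \s+ collapses any run of spaces, tabs, and newlines.
const collapsed = "a \t b\n c".replace(/\s+/g, " "); // "a b c"
```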

Quantifiers

Syntax	Meaning
`*`	0 or more
`+`	1 or more
`?`	0 or 1 (optional)
`{n}`	Exactly n
`{n,}`	At least n
`{n,m}`	Between n and m (inclusive)

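Quantifiers are greedy by default: they match as much as possible and give characters back only when the rest of the pattern would otherwise fail. Append ? to make a quantifier lazy (match as little as possible), as in .*? or \d+?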
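
The difference between greedy and lazy matching is easiest to see side by side. The following is a small sketch; the HTML-like sample string is an assumption chosen only for illustration.

```javascript
const html = "<b>bold</b> and <i>italic</i>";

// Greedy: .+ runs to the end of the string, then backs up to the LAST ">".
const greedy = html.match(/<.+>/)[0]; // "<b>bold</b> and <i>italic</i>"

// Lazy: .+? stops at the FIRST ">" that lets the pattern succeed.
const lazy = html.match(/<.+?>/)[0]; // "<b>"

// With the g flag, the lazy version picks out each tag separately.
const allTags = html.match(/<.+?>/g); // ["<b>", "</b>", "<i>", "</i>"]
```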

Anchors

Syntax	Meaning
`^`	Start of string (or line in multiline mode)
`$`	End of string (or line in multiline mode)
`\b`	Word boundary
`\B`	Non-word boundary
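
As a rough illustration of how the anchors behave, the sketch below runs them against a three-line sample string (an assumption made up for this example).

```javascript
const text = "cat\nconcat\ncat nap";

// Without the m flag, ^ and $ only match at the very start and end of the input.
const wholeString = /^cat$/.test(text); // false

// With the m flag, ^ and $ also match at line breaks.
const wholeLines = text.match(/^cat$/gm); // ["cat"]

// \b requires a word/non-word transition, so "concat" is skipped.
const standalone = text.match(/\bcat\b/g); // ["cat", "cat"]

// \B requires NO transition, so only the "cat" inside "concat" qualifies.
const embedded = text.match(/\Bcat\b/g); // ["cat"]
```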

Groups and References

(abc)       — capturing group
(?:abc)     — non-capturing group
(?<name>abc) — named capturing group
\1          — backreference to group 1
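
A short JavaScript sketch of each group type follows. The sample strings are illustrative, and the \k<name> form of a named backreference is JavaScript syntax that other engines may spell differently.

```javascript
// Backreference: \1 must repeat exactly what group 1 captured.
const doubled = "it was the the best".match(/\b(\w+) \1\b/);
const doubledText = doubled[0]; // "the the"
const doubledWord = doubled[1]; // "the"

// Non-capturing group: groups the year digits without taking a group number,
// so the month ends up in group 1.
const month = "2026-04".match(/(?:\d{4})-(\d{2})/)[1]; // "04"

// Named groups can be referenced in a replacement string as $<name>.
const ukDate = "2026-04-01".replace(
  /(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})/,
  "$<d>/$<m>/$<y>"
); // "01/04/2026"

// Named backreference: the closing quote must equal the opening quote.
const quoted = 'say "hi" now'.match(/(?<q>["']).*?\k<q>/)[0]; // "\"hi\""
```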

Lookaheads and Lookbehinds

(?=...)   — positive lookahead: followed by
(?!...)   — negative lookahead: not followed by
(?<=...)  — positive lookbehind: preceded by
(?<!...)  — negative lookbehind: not preceded by
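
Lookarounds assert what surrounds a match without consuming it. The sketch below shows one use of each form; the sample data is invented, and lookbehind assumes a reasonably modern JavaScript engine (it was added in ES2018).

```javascript
// Positive lookahead: digits only when followed by "px" ("px" is not captured).
const pixels = "10px 20em 30px".match(/\d+(?=px)/g); // ["10", "30"]

// Stacked lookaheads: at least one digit AND one lowercase letter, 8+ chars.
const strongEnough = /^(?=.*\d)(?=.*[a-z]).{8,}$/;
const ok = strongEnough.test("passw0rdx"); // true
const noDigit = strongEnough.test("password"); // false

// Negative lookahead: keep file names that do not end in ".tmp".
const kept = ["a.txt", "b.tmp", "c.md"].filter((f) => /^(?!.*\.tmp$)/.test(f));
// ["a.txt", "c.md"]

const prices = "cost: $42, qty: 7, fee: $5";

// Positive lookbehind: digits preceded by "$".
const dollars = prices.match(/(?<=\$)\d+/g); // ["42", "5"]

// Negative lookbehind: digits NOT preceded by "$" or by another digit.
const plain = prices.match(/(?<![$\d])\d+/g); // ["7"]
```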

Flags

Flag	Meaning
`i`	Case-insensitive
`g`	Global (find all matches, not just first)
`m`	Multiline (`^` and `$` match line start/end)
`s`	Dotall (`.` matches newlines too)
`u`	Unicode mode (match by code point; enables `\u{...}` and `\p{...}` escapes)
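
The effect of each flag can be checked with one-liners. This is a minimal JavaScript sketch with made-up inputs; other languages expose the same ideas through different flag names or function arguments.

```javascript
// i: letter case is ignored.
const caseInsensitive = /hello/i.test("HeLLo"); // true

// g: without it, replace() touches only the first match.
const firstOnly = "a-b-c".replace(/-/, "+"); // "a+b-c"
const every = "a-b-c".replace(/-/g, "+"); // "a+b+c"

// m: ^ and $ match at each line break, not just at the ends of the input.
const lines = "one\ntwo".match(/^\w+$/gm); // ["one", "two"]

// s: the dot also matches a newline.
const dotDefault = /a.b/.test("a\nb"); // false
const dotAll = /a.b/s.test("a\nb"); // true

// u: the dot matches a whole code point instead of one UTF-16 code unit.
const emoji = "\u{1F600}"; // a single emoji stored as two code units
const withoutU = /^.$/.test(emoji); // false
const withU = /^.$/u.test(emoji); // true
```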

Using Regex in JavaScript

// Test for a match
/^\d+$/.test("12345"); // true
/^\d+$/.test("123a5"); // false

// Extract first match (returns null when nothing matches)
"hello world".match(/\w+/); // ["hello"], plus index and input properties

// Extract all matches (global flag)
"hello world".match(/\w+/g); // ["hello", "world"]

// Replace
"foo bar".replace(/\b\w/g, (c) => c.toUpperCase()); // "Foo Bar"

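// Named groups (the ?? {} fallback avoids a TypeError when nothing matches)
const { year, month, day } =
  "2026-04-01".match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/)?.groups ?? {};
// year === "2026", month === "04", day === "01"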

Common Patterns

Email (simple)

^[^\s@]+@[^\s@]+\.[^\s@]+$

Note: full email validation per RFC 5321 and RFC 5322 is extremely complex. Use this simple pattern as a basic sanity check on input, and always verify the address by sending a confirmation link.
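
A minimal sketch of how the simple pattern might be wrapped for form input. The function name looksLikeEmail and the decision to trim surrounding whitespace are assumptions made for this example.

```javascript
const simpleEmail = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Hypothetical helper: a shape check only. It says nothing about whether
// the mailbox exists, which is what the confirmation link is for.
function looksLikeEmail(input) {
  return simpleEmail.test(input.trim()); // trimming is an assumption
}

looksLikeEmail(" user@example.com "); // true
looksLikeEmail("user@example"); // false: no dot after the @
looksLikeEmail("a b@example.com"); // false: contains whitespace
looksLikeEmail("user@@example.com"); // false: two @ signs
```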

URL

^https?:\/\/([\w-]+\.)+[\w-]+(\/[\w\-./?%&=]*)?$
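
The sketch below probes what this pattern accepts and rejects, then shows one possible alternative that relies on the built-in URL parser instead of a regex. The helper name isHttpUrl is an assumption for this example.

```javascript
const urlPattern = /^https?:\/\/([\w-]+\.)+[\w-]+(\/[\w\-./?%&=]*)?$/;

urlPattern.test("https://example.com/docs/index.html?x=1&y=2"); // true
urlPattern.test("ftp://example.com"); // false: scheme must be http or https
urlPattern.test("http://localhost:3000/"); // false: no dot in host, port not covered
urlPattern.test("https://example.com/page#top"); // false: "#" is not in the path class

// Hypothetical alternative: let the standard URL constructor do the parsing
// and only check the scheme afterwards.
function isHttpUrl(input) {
  try {
    const { protocol } = new URL(input);
    return protocol === "http:" || protocol === "https:";
  } catch {
    return false; // the constructor throws on unparseable input
  }
}

isHttpUrl("http://localhost:3000/"); // true
isHttpUrl("not a url"); // false
```

Which approach fits depends on whether the goal is to restrict input to a narrow shape (the regex) or to accept anything a browser would treat as an HTTP URL (the parser).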

UK Postcode

^[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}$
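
The pattern expects uppercase letters, so user input usually needs normalising first. A minimal sketch, assuming trimming and upper-casing are acceptable; the helper name isPostcodeShaped is hypothetical, and the check covers format only, not whether the postcode exists.

```javascript
const postcodePattern = /^[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}$/;

// Hypothetical helper: normalise, then test the shape.
function isPostcodeShaped(input) {
  return postcodePattern.test(input.trim().toUpperCase());
}

isPostcodeShaped("SW1A 1AA"); // true
isPostcodeShaped("m1 1ae"); // true after upper-casing
isPostcodeShaped("EC1A1BB"); // true: the space is optional
isPostcodeShaped("12345"); // false
```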

IPv4 Address

^(\d{1,3}\.){3}\d{1,3}$

(This accepts out-of-range octets such as 999.999.999.999, so check in code that each octet is between 0 and 255.)
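
One way to combine the shape check with a numeric range check is sketched below. The function name isValidIPv4 is an assumption, and this version still accepts leading zeros such as 010, which some applications may want to reject.

```javascript
const ipv4Shape = /^(\d{1,3}\.){3}\d{1,3}$/;

// Hypothetical helper: regex for the shape, plain arithmetic for the range.
function isValidIPv4(input) {
  if (!ipv4Shape.test(input)) return false;
  // The regex guarantees four all-digit parts, so Number() cannot yield NaN.
  return input.split(".").every((octet) => Number(octet) <= 255);
}

isValidIPv4("192.168.0.1"); // true
isValidIPv4("255.255.255.255"); // true
isValidIPv4("999.999.999.999"); // false: right shape, octets out of range
isValidIPv4("1.2.3"); // false: only three octets
```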

ISO 8601 Date (YYYY-MM-DD)

^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
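
This pattern checks the shape and the month and day ranges, but it cannot know how many days each month has, so 2026-02-31 passes. A minimal sketch of a follow-up calendar check, using a hypothetical helper named isRealDate:

```javascript
const isoDatePattern = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

// Hypothetical helper: build the date, then confirm it did not roll over
// into a different month (which is what happens for 31 February).
function isRealDate(input) {
  if (!isoDatePattern.test(input)) return false;
  const [year, month, day] = input.split("-").map(Number);
  const date = new Date(0);
  // setUTCFullYear avoids the two-digit-year remapping done by Date.UTC.
  date.setUTCFullYear(year, month - 1, day);
  return date.getUTCMonth() === month - 1 && date.getUTCDate() === day;
}

isRealDate("2026-04-01"); // true
isRealDate("2024-02-29"); // true: 2024 is a leap year
isRealDate("2023-02-29"); // false: rolls over to 1 March
isRealDate("2026-02-31"); // false
```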

Hex Colour Code

^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$
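
Because the pattern accepts both the 3-digit and 6-digit forms, code that consumes the value often normalises it. A small sketch, with the helper name normaliseHex and the lowercase output chosen as assumptions:

```javascript
const hexColour = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/;

// Hypothetical helper: returns "#rrggbb" in lowercase, or null if invalid.
function normaliseHex(input) {
  const match = hexColour.exec(input);
  if (!match) return null;
  const digits = match[1];
  // Expand shorthand by doubling each digit: "FA0" becomes "FFAA00".
  const full = digits.length === 3 ? digits.replace(/./g, (c) => c + c) : digits;
  return "#" + full.toLowerCase();
}

normaliseHex("#FA0"); // "#ffaa00"
normaliseHex("#1a2B3c"); // "#1a2b3c"
normaliseHex("#12345"); // null: five digits
normaliseHex("123456"); // null: missing "#"
```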

Semantic Version

^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-([\w.-]+))?(?:\+([\w.-]+))?$
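
The capture groups in this pattern map directly onto the parts of a version string. A minimal parsing sketch; the function name parseSemver and the shape of the returned object are assumptions for illustration.

```javascript
const semverPattern =
  /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-([\w.-]+))?(?:\+([\w.-]+))?$/;

// Hypothetical helper: returns the parsed parts, or null if the input is invalid.
function parseSemver(input) {
  const match = semverPattern.exec(input);
  if (!match) return null;
  // Optional groups that did not participate are undefined, so default to null.
  const [, major, minor, patch, prerelease = null, build = null] = match;
  return {
    major: Number(major),
    minor: Number(minor),
    patch: Number(patch),
    prerelease,
    build,
  };
}

parseSemver("1.2.3-beta.1+build.5");
// { major: 1, minor: 2, patch: 3, prerelease: "beta.1", build: "build.5" }
parseSemver("1.0.0"); // prerelease and build are null
parseSemver("01.2.3"); // null: leading zero in the major version
parseSemver("1.2"); // null: patch version missing
```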

Performance Pitfalls: Catastrophic Backtracking

A poorly written pattern can force a backtracking regex engine to explore an exponential number of paths, which can hang the program even on modest inputs.

Danger pattern:

(a+)+b

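Against the input "aaaaaaaaaa" (no trailing b), the match must fail, but before giving up the engine tries every way to partition the a's between the inner and outer quantifiers. The number of attempts roughly doubles with each additional a, so slightly longer inputs take dramatically longer.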
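
One way to defuse this particular pattern is to notice that the outer group adds nothing: (a+)+b and a+b accept exactly the same strings. The sketch below (variable names are illustrative) checks that on a few samples; the inputs are deliberately short so that the risky pattern still finishes quickly.

```javascript
const risky = /(a+)+b/; // nested quantifiers: many ways to split a run of a's
const safe = /a+b/; // one quantifier: only one way to consume the run

const samples = ["ab", "aaab", "xaab", "b", "aaa", "aaaaaaaaaa"];

// Both patterns give the same verdict on every sample.
const verdicts = samples.map((s) => [risky.test(s), safe.test(s)]);
const equivalent = verdicts.every(([r, s]) => r === s); // true
```

The same idea applies more generally: when an inner and an outer quantifier can both consume the same characters, look for a flatter pattern that describes the same language.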

Avoid:

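Nested quantifiers whose repetitions can match the same text, such as (a+)+ or (\w*)*
Unbounded .* in the middle of a pattern; constrain it with anchors or a more specific character class such as [^,]*, or make the quantifier possessive (*+ in PCRE; JavaScript's RegExp has no possessive quantifiers)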

For user-supplied regex patterns (e.g., search features), always validate the pattern or enforce a time limit on its execution to prevent ReDoS (Regular expression Denial of Service).
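
The validation side could look something like the sketch below. Everything here is an assumption made for illustration: the helper names, the length limits, and above all the nested-quantifier heuristic, which is deliberately crude and will miss many dangerous patterns. JavaScript's RegExp has no built-in timeout, so a real time limit needs something outside the regex call itself, such as running the match in a separate worker that can be terminated.

```javascript
// Hypothetical limits; tune them for the application.
const MAX_PATTERN_LENGTH = 100;
const MAX_INPUT_LENGTH = 10000;

// Crude heuristic: a group containing + or * that is itself followed by a
// quantifier, e.g. (a+)+ or (\w*)*. Not a complete ReDoS detector.
const NESTED_QUANTIFIER = /\([^()]*[+*][^()]*\)[+*{]/;

// Returns a compiled RegExp, or null if the pattern is rejected.
function compileUserPattern(source) {
  if (typeof source !== "string" || source.length > MAX_PATTERN_LENGTH) return null;
  if (NESTED_QUANTIFIER.test(source)) return null;
  try {
    return new RegExp(source);
  } catch {
    return null; // syntax error in the user's pattern
  }
}

// Returns true/false for a match, or null if the pattern or input is rejected.
function safeSearch(source, input) {
  const pattern = compileUserPattern(source);
  if (pattern === null || input.length > MAX_INPUT_LENGTH) return null;
  return pattern.test(input);
}

safeSearch("\\d+", "abc123"); // true
safeSearch("(a+)+b", "aaaa"); // null: rejected by the heuristic
safeSearch("[", "anything"); // null: invalid pattern
```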