DevGizmo
Back to Blog
web·

Regular Expressions for Developers: A Practical Guide with Real-World Patterns

Regular expressions are one of the most powerful tools in a developer's toolkit. Learn regex syntax, quantifiers, groups, lookaheads, and a collection of real-world patterns for validating email, URLs, dates, and more.

regexregular-expressionsjavascriptpattern-matching

What Are Regular Expressions?

A regular expression (regex) is a sequence of characters that defines a search pattern. Regex engines can test whether a string matches a pattern, extract matching substrings, or replace parts of a string.

Regex is supported in virtually every programming language (with minor syntax differences) and in most text editors, command-line tools (grep, sed), and database engines.

Core Syntax

Literal Characters

Most characters match themselves: cat matches the string "cat".

Character Classes

SyntaxMatches
[abc]a, b, or c
[^abc]Any character except a, b, c
[a-z]Any lowercase letter
[0-9]Any digit
.Any character except newline
\dDigit [0-9]
\wWord character [a-zA-Z0-9_]
\sWhitespace [ \t\n\r\f\v]
\D, \W, \SNegated versions

Quantifiers

SyntaxMeaning
*0 or more
+1 or more
?0 or 1 (optional)
{n}Exactly n
{n,}At least n
{n,m}Between n and m

Quantifiers are greedy by default (match as much as possible). Append ? to make them lazy (match as little as possible): .*?

Anchors

SyntaxMeaning
^Start of string (or line in multiline mode)
$End of string (or line in multiline mode)
\bWord boundary
\BNon-word boundary

Groups and References

(abc)       — capturing group
(?:abc)     — non-capturing group
(?<name>abc) — named capturing group
\1          — backreference to group 1

Lookaheads and Lookbehinds

(?=...)   — positive lookahead: followed by
(?!...)   — negative lookahead: not followed by
(?<=...)  — positive lookbehind: preceded by
(?<!...)  — negative lookbehind: not preceded by

Flags

FlagMeaning
iCase-insensitive
gGlobal (find all matches, not just first)
mMultiline (^ and $ match line start/end)
sDotall (. matches newlines too)
uUnicode mode

Using Regex in JavaScript

// Test for a match
/^\d+$/.test("12345"); // true
/^\d+$/.test("123a5"); // false

// Extract first match
"hello world".match(/\w+/); // ["hello"]

// Extract all matches (global flag)
"hello world".match(/\w+/g); // ["hello", "world"]

// Replace
"foo bar".replace(/\b\w/g, (c) => c.toUpperCase()); // "Foo Bar"

// Named groups
const { year, month, day } =
  "2026-04-01".match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/)?.groups ?? {};

Common Patterns

Email (simple)

^[^\s@]+@[^\s@]+\.[^\s@]+$

Note: true email validation per RFC 5321 is extremely complex. Use this simple pattern for basic input sanitisation, but always verify email by sending a confirmation link.

URL

^https?:\/\/([\w-]+\.)+[\w-]+(\/[\w\-./?%&=]*)?$

UK Postcode

^[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}$

IPv4 Address

^(\d{1,3}\.){3}\d{1,3}$

(This allows invalid ranges like 999.999.999.999 — add numeric range checks in code.)

ISO 8601 Date (YYYY-MM-DD)

^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$

Hex Colour Code

^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$

Semantic Version

^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-([\w.-]+))?(?:\+([\w.-]+))?$

Performance Pitfalls: Catastrophic Backtracking

Poorly written regex can cause exponential backtracking, hanging the regex engine.

Danger pattern:

(a+)+b

Against the input "aaaaaaaaaa" (no trailing b), this causes catastrophic backtracking because the engine tries every way to partition the as between the nested quantifiers.

Avoid:

  • Nested quantifiers with overlapping matches
  • Unbounded .* in the middle of a pattern — use anchors or make quantifiers possessive (*+ in PCRE)

For user-supplied regex patterns (e.g., search features), always validate or timeout the execution to prevent ReDoS (Regular expression Denial of Service).

Try it yourself

Put these concepts into practice with the free online tool on DevGizmo.