Numbered Groups in re Module – Complete Guide for Data Science 2026
Numbered groups are the default capturing groups created by plain parentheses (...) in regular expressions. Python’s re module automatically assigns them numbers starting from 1 (left to right). You can then reference them with match.group(1), \1 in substitutions, or as columns in pandas .str.extract(). Numbered groups are the simplest and most commonly used way to extract multiple structured fields from text in data science workflows.
TL;DR — Numbered Groups
- (pattern) → creates group 1, 2, 3…
- Access with match.group(1), \1, \2
- Return multiple columns with pandas .str.extract()
- Perfect for extracting IDs, dates, prices, emails in one pass
1. Basic Numbered Groups
import re
text = "Order ORD-98765 for $1,250.75 on 2026-03-19"
match = re.search(r"ORD-(\d+).*?\$(\d+(?:,\d+)?(?:\.\d+)?)", text)
print(match.group(0)) # full match
print(match.group(1)) # order ID (group 1)
print(match.group(2)) # amount (group 2)
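Beyond single-index access, match.groups() returns every numbered group at once as a tuple, and match.group() accepts several indices in one call. A short sketch reusing the order-line string above:

```python
import re

text = "Order ORD-98765 for $1,250.75 on 2026-03-19"
match = re.search(r"ORD-(\d+).*?\$(\d+(?:,\d+)?(?:\.\d+)?)", text)

# groups() returns all numbered groups as a tuple (group 0 is excluded)
print(match.groups())     # ('98765', '1,250.75')

# group() with multiple indices returns just those groups, in the order given
print(match.group(2, 1))  # ('1,250.75', '98765')
```

This is handy for unpacking: order_id, amount = match.groups().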
2. Real-World Data Science Examples with Pandas
import pandas as pd
df = pd.read_csv("logs.csv")
# Extract multiple fields using numbered groups
df[["order_id", "amount", "date"]] = df["log"].str.extract(
r"ORD-(d+).*?$(d+(?:,d+)?(?:.d+)?).*?(d{4}-d{2}-d{2})"
)
# One-line extraction of a single field via numbered group 1
df["email"] = df["log"].str.extract(r"(\S+@\S+\.\S+)")[0]
3. Numbered Groups in Substitution
# Reorder date using numbered groups
print(re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "2026-03-19"))  # 19/03/2026
# Swap first and last name
print(re.sub(r"(\w+)\s+(\w+)", r"\2, \1", "Alice Johnson"))  # Johnson, Alice
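When the replacement needs logic rather than simple reordering, re.sub also accepts a function: it receives the match object and its return value becomes the replacement, with numbered groups read via m.group(n). A small sketch (the email-like pattern here is deliberately simplified):

```python
import re

# Uppercase only the captured domain part of each email-like token
def upper_domain(m):
    return m.group(1) + "@" + m.group(2).upper()

text = "contact: alice@example.com, bob@data.org"
print(re.sub(r"(\w+)@([\w.]+)", upper_domain, text))
# contact: alice@EXAMPLE.COM, bob@DATA.ORG
```

This callable form is the usual escape hatch when \1-style templates can't express the transformation (case changes, arithmetic on captured numbers, lookups).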
4. Best Practices in 2026
- Use numbered groups when you need simple positional access to captured values
- Switch to named groups (?P<name>...) for complex patterns with many groups
- Use non-capturing groups (?:...) when you only need grouping
- Pre-compile patterns that contain several numbered groups
- Combine with pandas .str.extract() for vectorized multi-column extraction
Conclusion
Numbered groups are the foundation of structured text extraction in Python’s re module. In 2026 data science projects, they let you pull multiple fields (IDs, amounts, dates, emails…) in a single clean pattern and reference them instantly with group(1), \1, or pandas columns. Use numbered groups for simple cases, named groups for readability in complex patterns, and you’ll build faster, more maintainable text-processing pipelines than ever.
Next steps:
- Take one of your current regex patterns and convert it to use numbered groups for multi-field extraction