Regular Expressions in Python (RegEx): A Beginner’s Guide

Regular expressions (RegEx) in Python are a powerful tool for text processing. Whether you’re searching for patterns, extracting specific strings, or replacing text, RegEx simplifies these tasks. This article provides a beginner-friendly introduction to Python RegEx, its syntax, and practical examples.

What are Regular Expressions in Python (RegEx)?

Regular Expressions (RegEx) are sequences of characters that define search patterns. In Python, the built-in re module provides the functions for working with them.

Key Applications of Regular Expressions in Python:

  1. Searching: Find specific patterns in text.
  2. Matching: Validate strings against patterns.
  3. Extracting: Retrieve portions of text based on patterns.
  4. Replacing: Replace text matching a pattern with new content.

How to Use RegEx in Python

To use RegEx in Python, you need to import the re module.

import re  

Commonly Used Functions in Regular Expressions in Python:

  1. re.match(): Checks whether a pattern matches at the start of a string.
  2. re.search(): Scans the string and returns the first occurrence of a pattern.
  3. re.findall(): Returns all non-overlapping matches of a pattern as a list.
  4. re.sub(): Returns a new string with occurrences of a pattern replaced by a replacement string.
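
The four functions above can be compared side by side on one string. This is a minimal sketch; the sample text and variable names are made up for illustration.

```python
import re

text = "cat bat cat"

# re.match() only tries the pattern at position 0 of the string.
starts_with_cat = re.match(r"cat", text)   # match object: text starts with "cat"
starts_with_bat = re.match(r"bat", text)   # None: "bat" is not at the start

# re.search() scans forward and stops at the first occurrence.
first_bat = re.search(r"bat", text)
print(first_bat.start())   # 4

# re.findall() returns every non-overlapping match as a list of strings.
all_cats = re.findall(r"cat", text)
print(all_cats)            # ['cat', 'cat']

# re.sub() returns a new string; the original text is left unchanged.
replaced = re.sub(r"cat", "dog", text)
print(replaced)            # dog bat dog
```

Note that re.match() and re.search() return a match object or None, so their result is usually checked with an if statement before use.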

RegEx Syntax and Patterns

RegEx patterns consist of literal characters and metacharacters. Below are some commonly used metacharacters:

Pattern | Description                                                        | Example
.       | Matches any single character except a newline                      | a.c matches “abc”
^       | Matches the start of the string                                    | ^hello matches “hello world”
$       | Matches the end of the string (or just before a trailing newline)  | world$ matches “hello world”
\d      | Matches any digit (0-9)                                            | \d+ matches “12345”
\w      | Matches any letter, digit, or underscore                           | \w+ matches “hello123”
*       | Matches 0 or more repetitions                                      | a* matches “aaa”
+       | Matches 1 or more repetitions                                      | a+ matches “aaa”
?       | Matches 0 or 1 occurrence                                          | a? matches “a” or “”
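
To see these metacharacters in action, the following sketch runs a few of them through re.search(). The sample strings are chosen for illustration only.

```python
import re

# Each tuple is (pattern, text); re.search() returns the first match or None.
examples = [
    (r"a.c", "abc"),
    (r"^hello", "hello world"),
    (r"world$", "hello world"),
    (r"\d+", "id 12345"),
    (r"\w+", "hello123!"),
    (r"a+", "caaat"),
]

results = {}
for pattern, text in examples:
    found = re.search(pattern, text)
    results[pattern] = found.group() if found else None
    print(f"{pattern!r} on {text!r} -> {results[pattern]!r}")

# a* can match zero characters, so it succeeds immediately at position 0
# with an empty match instead of finding the later run of a's.
empty_match = re.search(r"a*", "caaat").group()
print(repr(empty_match))   # ''
```

The last line shows why + is often the safer choice when at least one character is expected.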

Practical Examples of Python RegEx

Check if a String Starts with a Specific Word

import re  

pattern = r"^Hello"  
text = "Hello, welcome to Python programming!"  
if re.match(pattern, text):  
    print("String starts with 'Hello'")  
else:  
    print("No match")  
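
Because re.match() already starts at the beginning of the string, the ^ anchor in the pattern above is optional there. The difference only shows up with re.search(), as this small sketch illustrates (the sample text is an assumption made for the example):

```python
import re

text = "Well, Hello there"

# re.match() fails because "Hello" is not at position 0.
match_result = re.match(r"Hello", text)
print(match_result)               # None

# re.search() finds "Hello" anywhere in the string.
search_result = re.search(r"Hello", text)
print(search_result.start())      # 6

# With the ^ anchor, re.search() is restricted to the start again.
anchored_result = re.search(r"^Hello", text)
print(anchored_result)            # None
```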

Validate an Email Address

import re  

email = "example@mail.com"  
pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"  

if re.match(pattern, email):  
    print("Valid email address")  
else:  
    print("Invalid email address")  
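
One detail worth knowing: $ also matches just before a trailing newline, so the pattern above accepts "example@mail.com\n". A possible alternative is re.fullmatch(), which requires the whole string to match. The sketch below uses the same simplified pattern; the names EMAIL_PATTERN and is_valid_email are illustrative, and the pattern is not a complete email validator.

```python
import re

# Same simplified pattern as above, without ^ and $ (fullmatch adds that check).
EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def is_valid_email(candidate):
    """Return True if the whole string fits the simplified email pattern."""
    return EMAIL_PATTERN.fullmatch(candidate) is not None

print(is_valid_email("example@mail.com"))     # True
print(is_valid_email("example@mail"))         # False
print(is_valid_email("example@mail.com\n"))   # False

# For comparison: the ^...$ version with re.match() accepts the trailing newline.
anchored = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
accepts_newline = re.match(anchored, "example@mail.com\n") is not None
print(accepts_newline)                        # True
```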

Extract All Digit Sequences from a String

import re  

text = "Order ID: 12345, Amount: $678.90"  
digits = re.findall(r"\d+", text)  
print(digits)  # Output: ['12345', '678', '90']  
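
Notice that \d+ splits the amount into '678' and '90' because the dot is not a digit. If decimal numbers should stay together, one option is to allow an optional fractional part, as in this sketch:

```python
import re

text = "Order ID: 12345, Amount: $678.90"

# \d+ matches the integer part; (?:\.\d+)? optionally adds a decimal part.
# (?:...) is a non-capturing group, so findall() returns whole matches.
numbers = re.findall(r"\d+(?:\.\d+)?", text)
print(numbers)   # ['12345', '678.90']
```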

Replace Substrings in a String

import re  

text = "The sky is blue."  
result = re.sub(r"blue", "clear", text)  
print(result)  # Output: The sky is clear.  
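
re.sub() also accepts a count argument and can reuse captured groups in the replacement. A brief sketch with made-up sample strings:

```python
import re

text = "blue sky, blue sea"

# count=1 limits the substitution to the first match.
first_only = re.sub(r"blue", "clear", text, count=1)
print(first_only)   # clear sky, blue sea

# \1 and \2 in the replacement refer to the first and second captured groups.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")
print(swapped)      # world hello
```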

Split a String Based on Delimiters

import re  

text = "Python,Java;C++|JavaScript"  
languages = re.split(r"[;,|]", text)  
print(languages)  # Output: ['Python', 'Java', 'C++', 'JavaScript']  
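
If the delimiters may be surrounded by spaces, the pattern can absorb that whitespace too. This is a small sketch under that assumption; the input string is invented for the example.

```python
import re

text = "Python , Java; C++ |JavaScript"

# \s* on both sides removes optional whitespace around each delimiter.
languages = re.split(r"\s*[;,|]\s*", text)
print(languages)   # ['Python', 'Java', 'C++', 'JavaScript']
```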

Tips for Working with Regular Expressions in Python

  1. Use raw strings (r"pattern") so that backslashes reach the regex engine unchanged instead of being treated as Python escape sequences.
  2. Break down complex patterns into smaller parts for clarity.
  3. Test patterns with tools like regex101.com.
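
The first two tips can be illustrated with a short sketch. It shows what goes wrong without a raw string, and uses the re.VERBOSE flag as one way to break a pattern into commented parts. The name date_pattern and the sample strings are assumptions made for the example.

```python
import re

# In a normal string "\b" is the backspace character (one character),
# while r"\b" keeps the backslash so the regex engine sees a word boundary.
lengths = (len("\b"), len(r"\b"))
print(lengths)          # (1, 2)

without_raw = re.findall("\bcat\b", "cat concat cat")
with_raw = re.findall(r"\bcat\b", "cat concat cat")
print(without_raw)      # []
print(with_raw)         # ['cat', 'cat']

# re.VERBOSE ignores whitespace and allows comments inside the pattern,
# which makes it easier to split a long pattern into labelled parts.
date_pattern = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
""", re.VERBOSE)

parts = date_pattern.search("Backup on 2024-12-31").groups()
print(parts)            # ('2024', '12', '31')
```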

Conclusion

Regular Expressions in Python are a must-know tool for developers, especially for tasks involving pattern matching and text manipulation. By mastering RegEx, you’ll gain the ability to handle complex text-processing challenges with ease.

INTERVIEW QUESTIONS

1. Validate a Phone Number Using RegEx

Company: Amazon
Answer:
The regex pattern r"^\+91-\d{10}$" validates a phone number in the format +91-1234567890, where +91- is the country code, followed by exactly 10 digits.

import re

pattern = r"^\+91-\d{10}$"
phone_number = "+91-1234567890"

if re.match(pattern, phone_number):
    print("Valid phone number")
else:
    print("Invalid phone number")
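
To check how the pattern behaves on invalid input, it can be run against a few candidates. The sketch below compiles the pattern once and uses fullmatch(), so ^ and $ are not needed; PHONE_PATTERN and the candidate list are illustrative.

```python
import re

# Compiled once and reused; fullmatch() requires the whole string to match.
PHONE_PATTERN = re.compile(r"\+91-\d{10}")

candidates = [
    "+91-1234567890",    # valid
    "+91-12345",         # too few digits
    "91-1234567890",     # missing the leading +
    "+91-12345678901",   # too many digits
]

valid = [c for c in candidates if PHONE_PATTERN.fullmatch(c)]
print(valid)   # ['+91-1234567890']
```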

2. Extract Domain Names from Email Addresses Using RegEx

Company: Google
Answer:
The regex pattern r"@([a-zA-Z0-9.-]+)" captures the domain name that follows the @ sign in an email address. For example@gmail.com, it returns gmail.com.

import re

pattern = r"@([a-zA-Z0-9.-]+)"
email = "example@gmail.com"

domain = re.search(pattern, email)
if domain:
    print("Domain:", domain.group(1))  # Output: Domain: gmail.com
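
When the text contains several addresses, re.findall() with the same pattern returns just the captured group for each match. A short sketch with invented addresses, which also shows one limitation of this simple pattern:

```python
import re

pattern = r"@([a-zA-Z0-9.-]+)"

# With one capturing group, findall() returns the group, not the whole match.
text = "Contact alice@gmail.com or bob@example.org for details."
domains = re.findall(pattern, text)
print(domains)         # ['gmail.com', 'example.org']

# Limitation: a sentence-ending period is swallowed, because "." is allowed
# inside the character class.
trailing_dot = re.findall(pattern, "Write to bob@example.org.")
print(trailing_dot)    # ['example.org.']
```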

3. Check if a String is a Valid Palindrome Using RegEx

Company: Infosys
Answer:
Regex alone cannot check for palindromes, but it can do the cleanup step: remove every non-alphanumeric character, convert the result to lowercase, and then compare the string with its reverse.

import re

def is_palindrome(s):
    # Remove non-alphanumeric characters and make lowercase
    clean_string = re.sub(r'[^a-zA-Z0-9]', '', s).lower()
    return clean_string == clean_string[::-1]

# Test the function
print(is_palindrome("A man, a plan, a canal, Panama"))  # Output: True

4. Find All Words in a String Starting with a Vowel

Company: TCS
Answer:
The regex pattern r"\b[aeiouAEIOU][a-zA-Z]*\b" finds words that start with a vowel, in either uppercase or lowercase.

import re

pattern = r"\b[aeiouAEIOU][a-zA-Z]*\b"
text = "An apple a day keeps the doctor away."

words_starting_with_vowel = re.findall(pattern, text)
print(words_starting_with_vowel)  # Output: ['An', 'apple', 'a', 'away']
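
An alternative to listing both cases in the character class is the re.IGNORECASE flag. A minimal sketch of the same search:

```python
import re

text = "An apple a day keeps the doctor away."

# re.IGNORECASE lets [aeiou] and [a-z] match uppercase letters as well.
words = re.findall(r"\b[aeiou][a-z]*\b", text, flags=re.IGNORECASE)
print(words)   # ['An', 'apple', 'a', 'away']
```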

5. Extract Dates from a Log File

Company: Microsoft
Answer:
The regex pattern r"\d{4}-\d{2}-\d{2}" extracts dates in the format YYYY-MM-DD.

import re

pattern = r"\d{4}-\d{2}-\d{2}"
log = "2025-01-02 Error occurred. 2024-12-31 Backup completed."

dates = re.findall(pattern, log)
print(dates)  # Output: ['2025-01-02', '2024-12-31']
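
The pattern only checks the shape of the date, so a string like 2024-13-45 would also match. If real calendar dates are required, one possible follow-up is to pass each match through datetime.strptime(). In this sketch the function name extract_valid_dates and the corrupt log entry are hypothetical.

```python
import re
from datetime import datetime

# Hypothetical log with one entry that has the right shape but is not a real date.
log = "2025-01-02 Error occurred. 2024-13-45 Corrupt entry. 2024-12-31 Backup completed."

def extract_valid_dates(text):
    """Return only the YYYY-MM-DD matches that are real calendar dates."""
    valid = []
    for candidate in re.findall(r"\d{4}-\d{2}-\d{2}", text):
        try:
            datetime.strptime(candidate, "%Y-%m-%d")
        except ValueError:
            continue   # shape matched, but it is not a real date
        valid.append(candidate)
    return valid

valid_dates = extract_valid_dates(log)
print(valid_dates)   # ['2025-01-02', '2024-12-31']
```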
