Regular Expressions (RegEx) are a powerful tool in Python for text processing. Whether you’re searching for patterns, extracting specific strings, or replacing text, RegEx simplifies these tasks. This article provides a beginner-friendly introduction to Python RegEx, its syntax, and practical examples.

What are Regular Expressions (RegEx)?

Regular Expressions (RegEx) are sequences of characters that define search patterns. In Python, the built-in re module provides the functions for working with RegEx.

Key Applications of RegEx:

- Searching: Find specific patterns in text.
- Matching: Validate strings against patterns.
- Extracting: Retrieve portions of text based on patterns.
- Replacing: Replace text matching a pattern with new content.

How to Use RegEx in Python

To use RegEx in Python, you need to import the re module.

    import re

Commonly Used RegEx Functions in Python:

- re.match(): Checks if a pattern matches at the start of a string.
- re.search(): Searches the entire string and returns the first occurrence of a pattern.
- re.findall(): Returns a list of all non-overlapping matches of a pattern in a string.
- re.sub(): Replaces occurrences of a pattern with a replacement string.

RegEx Syntax and Patterns

RegEx patterns consist of literals and metacharacters. Below are some commonly used metacharacters:

    Pattern  Description                                             Example
    .        Matches any single character except a newline           a.c matches "abc"
    ^        Matches the start of the string                         ^hello matches "hello world"
    $        Matches the end of the string                           world$ matches "hello world"
    \d       Matches any digit (0-9)                                 \d+ matches "12345"
    \w       Matches any word character (letter, digit, underscore)  \w+ matches "hello123"
    *        Matches 0 or more repetitions                           a* matches "aaa"
    +        Matches 1 or more repetitions                           a+ matches "aaa"
    ?        Matches 0 or 1 occurrence                               a? matches "a" or ""

Practical Examples of Python RegEx

Check if a String Starts with a Specific Word

    import re

    pattern = r"^Hello"
    text = "Hello, welcome to Python programming!"
    if re.match(pattern, text):
        print("String starts with 'Hello'")
    else:
        print("No match")

Validate an Email Address

    import re

    email = "example@mail.com"
    pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

    if re.match(pattern, email):
        print("Valid email address")
    else:
        print("Invalid email address")

Extract All Digits from a String

    import re

    text = "Order ID: 12345, Amount: $678.90"
    # \d+ matches each run of consecutive digits
    digits = re.findall(r"\d+", text)
    print(digits)  # Output: ['12345', '678', '90']

Replace Substrings in a String

    import re

    text = "The sky is blue."
    result = re.sub(r"blue", "clear", text)
    print(result)  # Output: The sky is clear.

Split a String Based on Delimiters

    import re

    text = "Python,Java;C++|JavaScript"
    # Inside a character class, | is a literal character
    languages = re.split(r"[;,|]", text)
    print(languages)  # Output: ['Python', 'Java', 'C++', 'JavaScript']

Tips for Working with RegEx

- Use raw strings (r"pattern") so that backslashes reach the regex engine unchanged instead of being treated as Python escape sequences.
- Break down complex patterns into smaller parts for clarity.
- Test patterns with tools like regex101.com.

Conclusion

Regular Expressions in Python are a must-know tool for developers, especially for tasks involving pattern matching and text manipulation. By mastering RegEx, you'll be able to handle complex text-processing tasks with far less code.

INTERVIEW QUESTIONS

1. Validate a Phone Number Using RegEx

Company: Amazon

Answer: The regex pattern r"^\+91-\d{10}$" validates a phone number in the format +91-1234567890, where +91- is the country code, followed by exactly 10 digits.

    import re

    pattern = r"^\+91-\d{10}$"
    phone_number = "+91-1234567890"

    if re.match(pattern, phone_number):
        print("Valid phone number")
    else:
        print("Invalid phone number")

2. Extract Domain Names from Email Addresses

Company: Google

Answer: The regex pattern r"@([a-zA-Z0-9.-]+)" captures the domain name from an email address. For example@gmail.com, the captured group is gmail.com.

    import re

    pattern = r"@([a-zA-Z0-9.-]+)"
    email = "example@gmail.com"

    domain = re.search(pattern, email)
    if domain:
        # group(1) is the text captured by the parentheses
        print("Domain:", domain.group(1))  # Output: Domain: gmail.com

3. Check if a String is a Valid Palindrome

Company: Infosys

Answer: Regex cannot check for palindromes on its own, but you can combine it with ordinary logic: use a regex to strip everything except letters and digits, then compare the cleaned string with its reverse.

    import re

    def is_palindrome(s):
        # Remove non-alphanumeric characters and make lowercase
        clean_string = re.sub(r'[^a-zA-Z0-9]', '', s).lower()
        return clean_string == clean_string[::-1]

    # Test the function
    print(is_palindrome("A man, a plan, a canal, Panama"))  # Output: True

4. Find All Words in a String Starting with a Vowel

Company: TCS

Answer: The regex pattern r"\b[aeiouAEIOU][a-zA-Z]*\b" finds the words in a string that start with a vowel, in either upper or lower case.

    import re

    pattern = r"\b[aeiouAEIOU][a-zA-Z]*\b"
    text = "An apple a day keeps the doctor away."

    words_starting_with_vowel = re.findall(pattern, text)
    print(words_starting_with_vowel)  # Output: ['An', 'apple', 'a', 'away']

5. Extract Dates from a Log File

Company: Microsoft

Answer: The regex pattern r"\d{4}-\d{2}-\d{2}" extracts dates in the format YYYY-MM-DD.

    import re

    pattern = r"\d{4}-\d{2}-\d{2}"
    log = "2025-01-02 Error occurred. 2024-12-31 Backup completed."

    dates = re.findall(pattern, log)
    print(dates)  # Output: ['2025-01-02', '2024-12-31']
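
The pattern in question 5 checks only the shape of the text, so it would also accept a string such as 2025-13-45. A common follow-up is to filter the matches down to real calendar dates. The sketch below is one possible way to do that, not part of the original answer: the names DATE_PATTERN and extract_valid_dates are hypothetical, the added \b anchors are an assumption about what should count as a date, and the sample log line is made up for illustration.

```python
import re
from datetime import datetime

# Same shape as the pattern in question 5. The \b word boundaries are an
# added assumption so that a longer digit run such as "12025-01-02" is not
# partially matched.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_valid_dates(log):
    """Return the YYYY-MM-DD strings in `log` that are real calendar dates."""
    valid = []
    for candidate in DATE_PATTERN.findall(log):
        try:
            # strptime raises ValueError for impossible dates (e.g. month 13)
            datetime.strptime(candidate, "%Y-%m-%d")
        except ValueError:
            continue
        valid.append(candidate)
    return valid


# Hypothetical log line containing one impossible date
log = "2025-01-02 Error occurred. 2025-13-45 Corrupt entry. 2024-12-31 Backup completed."
print(extract_valid_dates(log))  # Output: ['2025-01-02', '2024-12-31']
```

The regex does the cheap work of locating candidates, and datetime.strptime handles rules that are awkward to express in a pattern, such as month lengths and leap years.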