Why do we use ‘r’ in Regular Expressions (regex)?


When I started learning regular expressions, the thing that baffled me for the longest time was the use of ‘r’. I went through some articles and documentation, but I was not satisfied with the explanations. So I experimented with as many different kinds of examples as I could to get better insight. This is an attempt to help all such confused souls by sharing the examples that made it make sense for me.

Let me first talk about Python escape sequences and regex special characters.

There are many escape sequences in Python. For instance, ‘\t’ represents a tab and ‘\n’ represents a new line. When these escape sequences appear in a string in Python code, they introduce a tab or a new line.

Just like Python's escape sequences, regex has its own special characters, or metacharacters: . ^ $ * + ? { } [ ] \ | ( ). These carry special meanings when used in a regex pattern.

The letter ‘r’ stands for ‘raw string’. Placed before a string's opening quote, it tells Python to leave every backslash in the string as it is, instead of treating it as the start of an escape sequence.

The first question that popped into my mind was:

Is the ‘r’ (raw string) prefix used only in regex? A big NO!

Let me illustrate it using the following examples:

Use of ‘r’ in simple Python code

Use of ‘r’ to read a file path in pandas

Regex with a Python escape sequence

Regex with a regex special character

1. Use of ‘r’ in simple Python code:
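A minimal sketch of the idea, using a made-up sample string (the variable names are illustrative assumptions, not the original snippet). The same characters are typed twice, once as a normal string and once with the ‘r’ prefix:

```python
# Normal string literal: Python turns \n into a newline and \t into a tab.
normal = 'Hello\nWorld\tagain'

# Raw string literal: every backslash stays in the string exactly as typed.
raw = r'Hello\nWorld\tagain'

print(normal)                  # printed over two lines, with a tab before "again"
print(raw)                     # Hello\nWorld\tagain
print(len(normal), len(raw))   # 17 19
```

The raw version is two characters longer because each backslash and the letter after it are kept as two separate characters.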

2. Use of ‘r’ to read a file path in pandas:

Here we try to open a CSV file named ‘fruits’ and load it into a pandas DataFrame.
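As a rough sketch, assume the file sits at a hypothetical Windows path (the folder names below are invented for illustration). The point is what happens to the path string before pandas ever sees it, so the read_csv call is left as a comment:

```python
# Hypothetical Windows path, used only for illustration.
plain_path = 'C:\new\table\fruits.csv'        # \n, \t and \f turn into newline, tab, form feed
raw_path = r'C:\new\table\fruits.csv'         # backslashes are kept as typed
escaped_path = 'C:\\new\\table\\fruits.csv'   # doubling every backslash also works

print(len(plain_path), len(raw_path))   # 20 23
print(raw_path)                         # C:\new\table\fruits.csv
print(raw_path == escaped_path)         # True

# With pandas installed and the file in place, the raw path is what you would pass:
# import pandas as pd
# df = pd.read_csv(raw_path)
```

The un-prefixed version silently loses three backslashes, so it no longer points at the file.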

Now that you’ve seen raw strings used in places other than regex, let’s use them in regex, first with a Python escape sequence and then with a regex metacharacter.

1. Regex with Python escape sequence

We use the findall() function in the following examples. It finds all the non-overlapping matches of a pattern in a given string and returns them as a list. We then use len() to count the number of matches.

First example:
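A sketch with an assumed sample text that contains two real newline characters. Both patterns find them, but for different reasons:

```python
import re

text = 'apple\nbanana\ncherry'   # contains two real newline characters

# Without r: Python converts '\n' to a newline before re ever sees the pattern.
without_r = re.findall('\n', text)

# With r: re receives backslash + n and interprets that itself as a newline.
with_r = re.findall(r'\n', text)

print(len(without_r), len(with_r))   # 2 2
```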

Second example :
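A sketch for the opposite situation, assuming the text holds the two characters backslash and n rather than a real newline (the sample text is made up for illustration):

```python
import re

text = r'apple\nbanana\ncherry'   # backslash + n twice, no real newline

# r'\n' means "newline" to the regex engine, so nothing matches here.
newline_hits = re.findall(r'\n', text)

# r'\\n' means "a literal backslash followed by n".
literal_hits_raw = re.findall(r'\\n', text)

# The same pattern without r needs four backslashes.
literal_hits_escaped = re.findall('\\\\n', text)

print(len(newline_hits), len(literal_hits_raw), len(literal_hits_escaped))   # 0 2 2
```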

Now we use the sub() function. Its first argument is the pattern to replace, the second is the replacement text, and the third is the string in which the replacement should be done. Let’s look at the Python escape sequence example again, this time using sub().
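A small sketch of sub() under the same assumption as before, a made-up text with real newlines that we want on one line:

```python
import re

text = 'apple\nbanana\ncherry'   # two real newline characters

# Arguments: pattern to find, replacement text, string to search in.
one_line = re.sub(r'\n', ', ', text)

print(one_line)   # apple, banana, cherry
```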

2. Regex with regex special character:

Here we use the sequence ‘\s’ (lowercase ‘s’), which in regex represents a whitespace character. We try to find out the number of times ‘\s’ appears in a given text using findall() and len().

Note: We are counting occurrences of the literal text ‘\s’, not the number of whitespace characters.

First example:
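A sketch with an assumed text that contains the literal characters ‘\s’ twice and one real space, so the two meanings can be told apart:

```python
import re

text = r'one\stwo three\sfour'   # two literal \s sequences and one real space

# r'\s' is the regex class for whitespace: it finds only the real space.
spaces = re.findall(r'\s', text)

# r'\\s' is an escaped backslash followed by s: it finds the literal text \s.
literal = re.findall(r'\\s', text)

print(len(spaces), len(literal))   # 1 2
```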

Second example :
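The same counts can be reached without ‘r’ by doubling the backslashes. This sketch reuses the assumed sample text. Unlike ‘\n’, ‘\s’ is not a Python escape sequence, and recent Python versions warn about an un-doubled ‘\s’ in a normal string, so the backslashes are doubled explicitly here:

```python
import re

text = r'one\stwo three\sfour'   # two literal \s sequences and one real space

# Each pair of backslashes in a normal string is one backslash in the pattern.
print('\\s' == r'\s')       # True
print('\\\\s' == r'\\s')    # True

spaces = re.findall('\\s', text)       # whitespace class
literal = re.findall('\\\\s', text)    # literal backslash + s

print(len(spaces), len(literal))   # 1 2
```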

Here again we use the sub() function, this time to replace the literal ‘\s’ with an actual whitespace character.
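A sketch of that replacement, assuming a made-up text in which ‘\s’ was typed where spaces should be:

```python
import re

text = r'one\stwo\sthree'   # literal \s where spaces should be

# r'\\s' matches the literal backslash + s; each match becomes a real space.
cleaned = re.sub(r'\\s', ' ', text)

print(cleaned)   # one two three
```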

So now it’s clear that we can use extra backslashes as an alternative to ‘r’. Here it was easy to see whether the text contained a Python escape sequence or a regex metacharacter and then decide how many backslashes to use. In practice, with huge datasets, you may not have that advantage. So it’s better to always use ‘r’ when writing a regex pattern. I hope you now have a better understanding of raw strings.

Thank you!

I am a keen learner and diligent teacher with special interest in mathematics and machine learning.

Gaurav Patil