Regex Fundamentals

Recently, DS47 underwent training in Regex (Regular Expression). Although the language initially seemed confusing, it quickly became clear how powerful Regex can be for searching, matching, and manipulating strings of text based on specific patterns and rules.

This blog covers some of the most commonly used Regex patterns to help beginners familiarise themselves with the language.

Qualifier Expressions

Qualifier expressions are used to pick out particular patterns of text. Some common types are described below:

\w : This pattern codes for any word character: any letter, any digit, or the underscore. Spaces and punctuation are not picked up.

[a-z] : This pattern codes for any lowercase letter.

[A-Z] : This pattern codes for any uppercase letter.

\d : This pattern codes for any numeric digit.
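
The four patterns above can be tried out in a few lines of code. The sketch below is only an illustration: it assumes Python and its built-in re module (the training itself may have used a different tool), and the sample text is made up. Each pattern is applied on its own, so every match is a single character.

```python
import re

# Made-up sample text, used only for illustration.
text = "DS47 ok_!"

# Each pattern matches one character at a time.
word_chars = re.findall(r"\w", text)    # letters, digits and the underscore
lowercase = re.findall(r"[a-z]", text)  # lowercase letters only
uppercase = re.findall(r"[A-Z]", text)  # uppercase letters only
digits = re.findall(r"\d", text)        # digits only

print(word_chars)  # ['D', 'S', '4', '7', 'o', 'k', '_']
print(lowercase)   # ['o', 'k']
print(uppercase)   # ['D', 'S']
print(digits)      # ['4', '7']
```

Note that the space and the exclamation mark are not picked up by any of the four patterns.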

Quantifier Expressions

Quantifier expressions help a user to express how many characters to pick out from the text.

* : This pattern states that the preceding element will be collected zero or more times.

+ : This pattern states that the preceding element will be collected one or more times.
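
The difference between the two quantifiers is easiest to see side by side. This is a minimal sketch, again assuming Python's re module and a made-up sample string. In both patterns the quantifier applies only to the letter b that comes directly before it.

```python
import re

# Made-up sample text, used only for illustration.
text = "a ab abbb"

# "ab*" : an "a" followed by zero or more "b" characters.
zero_or_more = re.findall(r"ab*", text)

# "ab+" : an "a" followed by one or more "b" characters.
one_or_more = re.findall(r"ab+", text)

print(zero_or_more)  # ['a', 'ab', 'abbb']
print(one_or_more)   # ['ab', 'abbb']
```

The lone "a" is collected by * because zero b characters are allowed, but it is skipped by +, which needs at least one.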

Combining Quantifier and Qualifier Expressions

Quantifier and qualifier expressions can be combined to help extract whole words or numbers, instead of just single characters. Some examples include:

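\w* : Collects consecutive word characters until the expression breaks. Because \w does not collect spaces or punctuation, the expression restarts at each of them, hence each word in a string is collected. As * also allows zero characters, \w* can return empty matches as well; \w+ avoids this.

\d+ : Collects consecutive digits until a non-digit character breaks the expression, hence each whole number in a string is collected.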
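
The combined patterns can be sketched the same way, assuming Python's re module and a made-up sentence. The exact handling of empty matches differs between tools, so the example only separates them out rather than relying on where they appear.

```python
import re

# Made-up sample text, used only for illustration.
text = "DS47 trained 12 people"

# \w+ : one or more word characters, so each word is one match.
words = re.findall(r"\w+", text)

# \d+ : one or more digits, so each run of digits is one match.
numbers = re.findall(r"\d+", text)

# \w* : zero or more word characters, so empty matches are allowed too.
star_matches = re.findall(r"\w*", text)
non_empty = [m for m in star_matches if m]

print(words)      # ['DS47', 'trained', '12', 'people']
print(numbers)    # ['47', '12']
print(non_empty)  # ['DS47', 'trained', '12', 'people']
```

Once the empty strings are filtered out, \w* returns the same words as \w+, which is why \w+ is often the more convenient choice for extracting words.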

Author:
Dan Booth