JS: Figuring Out RegEx

Photo via Malwarebytes Labs

I’ve always been afraid of RegEx. When I first started learning to code, RegEx looked like an alien language: a bunch of characters strung together to do powerful things within code. Now that I’ve been coding for about a year, I figure it’s time to learn and memorize some of the basics of RegEx.

What is RegEx?

RegEx is shorthand for Regular Expression, which is a sequence of characters that forms a search pattern. It can be used to search, or to search and replace, kind of like using CTRL-F for the “find” option in a document. The expression can be a single character or a more complicated pattern, which is what I struggle with the most. The typical syntax for a regular expression literal is:

/pattern/modifiers;

Within this syntax, the pattern defines what to search for, while the modifiers (also called flags) “modify” how the search is carried out. A regular expression can also be created with the RegExp constructor, using either a string pattern or a regular expression literal:

// constructor w/ string pattern
new RegExp('pattern', 'modifiers')
// constructor w/ regular expression literal
new RegExp(/pattern/, 'modifiers')

As a general rule of thumb, if the regular expression is expected to remain constant, then it’s best to use a regex literal. Otherwise, for example when the pattern is built at runtime, use the RegExp constructor.
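To make that rule of thumb a bit more concrete, here is a minimal sketch. The variable names (literalPattern, searchTerm, dynamicPattern) are made up for illustration, and searchTerm stands in for a value that is only known at runtime.

```javascript
// Constant pattern: a literal is the simplest choice.
const literalPattern = /cat/i;

// Pattern only known at runtime (searchTerm is a hypothetical input value),
// so it is passed to the RegExp constructor as a string.
const searchTerm = "dog";
const dynamicPattern = new RegExp(searchTerm, "i");

console.log(literalPattern.test("Cat nap")); // true
console.log(dynamicPattern.test("Hot DOG")); // true
console.log(dynamicPattern.test("Cat nap")); // false
```

One thing to keep in mind: if the string contains characters like . or *, the constructor treats them as regex syntax, not as literal text.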

RegEx Methods

A regular expression is a type of object, so it comes with a bunch of methods. Here are a few, along with some string methods that accept a regular expression:

The .test() method searches a string for a pattern and returns a boolean.

let testPattern = /test/;
testPattern.test('This will return false'); // false
testPattern.test('This test will return true'); // true

The .exec() method searches a string for the specified pattern and returns the result as an array containing the matched text, plus properties such as the index where the match was found. If no match is found, it returns null.

let testPattern = /test/;
testPattern.exec('This will return null'); // null
testPattern.exec('This test will return a match'); // ["test", index: 5, input: "This test will return a match", groups: undefined]
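As a small sketch of how the result of .exec() might be used in practice, the example below checks for null first and then reads the matched text and its index. The variable names are only for illustration.

```javascript
const sentence = "This test will return a match";
const execResult = /test/.exec(sentence);

// exec() returns null when nothing matches, so check before reading properties.
if (execResult !== null) {
  console.log(execResult[0]);    // "test" (the matched text)
  console.log(execResult.index); // 5 (where the match starts)
}

const noResult = /test/.exec("Nothing to find here");
console.log(noResult); // null
```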

The .match() method works much like .exec(), but it is a string method that takes the regex as its argument. Without a modifier, the return value describes the first match within the string. If we add the g (for global) modifier, it returns an array of all matches (useful if you want to count the matches).

let testPattern1 = /est/;
let testPattern2 = /est/g;
let testString = "strongest, fastest, the best";
testString.match(testPattern1); // ["est", index: 6, input: "strongest, fastest, the best", groups: undefined]
testString.match(testPattern2); // ["est", "est", "est"]
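Since counting matches came up, here is one way it could be done. countMatches is a hypothetical helper name (not a built-in), and the sketch assumes the pattern passed in uses the g modifier.

```javascript
// countMatches is a made-up helper for illustration, not part of JavaScript.
// It assumes `pattern` has the g modifier so match() returns every match.
function countMatches(text, pattern) {
  // match() returns null when nothing matches, so fall back to an empty array.
  const matches = text.match(pattern) || [];
  return matches.length;
}

console.log(countMatches("strongest, fastest, the best", /est/g)); // 3
console.log(countMatches("no hits here", /xyz/g));                 // 0
```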

We can use the .replace() string method with RegEx to do a find-and-replace on a string.

let testString = "stronger, faster, quicker";
testString.replace(/er/g, "est"); // "strongest, fastest, quickest"
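To see what the g modifier is doing in a replace, it can help to compare the same call with and without it. This is just a quick sketch with made-up variable names.

```javascript
const phrase = "stronger, faster, quicker";

// Without g, only the first match is replaced.
const firstOnly = phrase.replace(/er/, "est");
console.log(firstOnly); // "strongest, faster, quicker"

// With g, every match is replaced.
const allMatches = phrase.replace(/er/g, "est");
console.log(allMatches); // "strongest, fastest, quickest"

// replace() returns a new string; the original is left unchanged.
console.log(phrase); // "stronger, faster, quicker"
```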

There are many more methods out there, but these are some of the basics.

Basic Regex Cheat Sheet

For the most part, regex is just a lot of memorization. Here are the most common symbols that you may see when working with RegEx.

  • . — (period) Matches any single character, except for line breaks.
  • * — Matches the preceding expression 0 or more times.
  • + — Matches the preceding expression 1 or more times.
  • {2} — Matches the preceding expression 2 times (2 can be replaced with any number).
  • {2,} — Matches the preceding expression 2 or more times (2 can be replaced with any number; note there is no space inside the braces).
  • {2,5} — Matches the preceding expression 2 up to 5 times (2 & 5 can be replaced with any number).
  • ? — Preceding expression is optional (Matches 0 or 1 times).
  • ^ — Matches the beginning of the string.
  • $ — Matches the end of the string.
  • \d — Matches any single digit character.
  • \w — Matches any word character (alphanumeric & underscore).
  • \s — Matches any whitespace character (including tabs and line breaks).
  • [XYZ] — Character Set: Matches any single character from the characters within the brackets. You can also do a range like [A-Z] or [0-9].
  • [XYZ]+ — Matches one or more of any of the characters in the set.
  • [^A-Z] — Inside a character set, the ^ is used for negation. In this example, match anything that is NOT an uppercase letter.
  • g — Global search, meaning it will find all matches
  • i — case insensitive search
  • m — multi-line search, (meaning ^ and $ will match the start and end of a line instead of the whole string)
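A few of the symbols above can be tried out together. The patterns and variable names below are made up for illustration, so treat this as a rough sketch and not a complete reference.

```javascript
// \d with {n} and anchors: exactly 3 digits, a dash, then 4 digits.
const shortNumber = /^\d{3}-\d{4}$/;
console.log(shortNumber.test("555-1234"));  // true
console.log(shortNumber.test("5555-1234")); // false (4 digits before the dash)

// ? makes the preceding character optional.
const colorPattern = /colou?r/;
console.log(colorPattern.test("color"));  // true
console.log(colorPattern.test("colour")); // true

// [^A-Z] with g: remove everything that is not an uppercase letter.
const initials = "Hello World".replace(/[^A-Z]/g, "");
console.log(initials); // "HW"

// \s+ with g: collapse runs of whitespace into a single space.
const tidy = "a \t b\nc".replace(/\s+/g, " ");
console.log(tidy); // "a b c"

// i: case-insensitive search.
console.log(/hello/i.test("HELLO")); // true

// m: ^ and $ match at line boundaries, so each line is found separately.
const lines = "one\ntwo".match(/^\w+$/gm);
console.log(lines); // ["one", "two"]
```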

There’s honestly so much more out there to learn, and this is just a start. All it really takes is more practice to get used to the syntax. HackerRank offers a lot of practice problems, so check it out. I know I will.

https://www.hackerrank.com/domains/regex

Software Engineer based out of the San Francisco Bay Area. Flatiron School graduate with 8+ years background in healthcare.