
8 Simple yet Powerful RegEx Concepts for Google Analytics

I Know Regular Expressions

In this post, I’m going to introduce you to eight powerful RegEx concepts that you can use in Google Analytics to create better advanced segments and filters. These were chosen based on things that we at SwellPath use most to get actionable data that drives decisions. At the end of this, you should be ready to harness the power of RegEx to get that high-value site data from your GA profiles.

“Everybody stand back! I know regular expressions!”

Last week, I had the pleasure of joining the Portland Google Analytics User Group as a “Subject Matter Expert”, teaching people about using advanced segments and filters in GA. It was a great event, with a great crowd ranging from absolute beginners to experienced users. I brought up using regular expressions at the event and wanted to provide a post for those who are still getting their feet wet.

Using RegEx in your advanced filters and segments gives you incredible power over your data. Using the basic method, you could set up an advanced filter to look at visitors landing on blog posts from January, February, and March.

Basic multi-condition advanced segment in Google Analytics

Using RegEx, you can make that same filter using this.

Advanced Segment with RegEx in Google Analytics

Just from that super watered-down example, you can see that using RegEx has the potential to greatly simplify and power up your GA segments and filters. So without further ado, here are eight RegEx concepts that will revolutionize how you use advanced filters and segments.

#1. Pipe Dreams

One of the easiest RegEx concepts to understand is the pipe. It looks like this: |

It’s essentially the word “or” and lets you tell Google Analytics that you want results matching this or that.

Better example: Mario|Luigi
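
To see the pipe in action outside of GA, here is a minimal sketch in Python. Python's re module is not the engine Google Analytics uses, but the pipe behaves the same way in both as far as I know. The page titles are made up for illustration.

```python
import re

# Hypothetical page titles, invented for this example.
titles = [
    "Mario Kart tips",
    "Luigi's Mansion review",
    "Princess Peach guide",
    "Super Mario Bros",
]

# The pipe means "or": match titles containing "Mario" or "Luigi".
pattern = re.compile(r"Mario|Luigi")

# re.search looks for the pattern anywhere in the string (a partial match).
matches = [title for title in titles if pattern.search(title)]
print(matches)
```

Only the Princess Peach title is left out, because it contains neither name.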

#2. Collect Carrot, ???, Profit

Don’t get the joke in the title? Don’t worry. It’s not just you. The best kind of jokes require an explanation (#ProbablyNotAccurate).

In RegEx, we can specify exactly where we want something to appear or make certain things completely optional. By adding a caret to the beginning of a regular expression, you’re specifying that you only want results that start with that. So, using the following in the keywords report would only return searches that start with “swell”: ^swell

You can also do the other half of that: specify that you want your string to end with something. The following would return any keywords ending with “rad”: rad$

  • SwellPath is rad
  • SEO for Lauren Conrad

You can also use them both to specify you want an exact match: ^swellpath$

The last concept here is the question mark. This makes the preceding character optional. Very useful in spelling variations that have extra letters. Using the following, you could get both an exact match on “colors” or on “colours”: ^colou?rs$

Sure beats this: ^colors$|^colours$

So in this section, we covered the caret “^”, the question mark “?”, and the dollar sign “$”.

  1. Collect carrot
  2. ???
  3. PROFIT

GET IT?!
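
The caret, dollar sign, and question mark can be tried out with a short Python sketch. The keyword list and the helper name matching are my own assumptions for the example, and Python's re module matches case-sensitively unless told otherwise, which may differ from how your GA reports behave.

```python
import re

# Hypothetical keywords, invented for this example.
keywords = [
    "swellpath",
    "swellpath seo",
    "swell waves",
    "SwellPath is rad",
    "SEO for Lauren Conrad",
    "radio ads",
    "colors",
    "colours",
    "watercolors",
]


def matching(pattern, items):
    """Return the items in which the pattern is found somewhere."""
    return [item for item in items if re.search(pattern, item)]


starts_with_swell = matching(r"^swell", keywords)   # caret: must start with "swell"
ends_with_rad = matching(r"rad$", keywords)         # dollar sign: must end with "rad"
exact_swellpath = matching(r"^swellpath$", keywords)  # both: exact match
color_variants = matching(r"^colou?rs$", keywords)  # question mark: the "u" is optional

print(starts_with_swell)
print(ends_with_rad)
print(exact_swellpath)
print(color_variants)
```

Note that "radio ads" is not caught by rad$ and "watercolors" is not caught by ^colou?rs$, because the anchors pin the match to the end and the start of the keyword.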

#3. Father, Mother, Sister, Brother

You can also use RegEx to define a “family” of items. You can combine parentheses with the pipe to use the “or” functionality within a larger expression. For example, the following works fine if it’s all you’re looking for: father|mother|sister|brother

But what if you want something more specific but still want the freedom to catch all the family member variations? You can use parentheses.

Visiting my (father|mother|sister|brother)
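
Here is a small Python sketch of why the parentheses matter. The phrases are invented for the example. Without parentheses, the pipe splits the whole expression, so "Visiting my" only applies to the first alternative.

```python
import re

# Hypothetical search phrases, invented for this example.
phrases = [
    "Visiting my mother",
    "Visiting my brother in Portland",
    "Visiting my dentist",
    "Calling my sister",
]

# Grouped: "Visiting my" must be followed by one of the four family members.
grouped = r"Visiting my (father|mother|sister|brother)"

# Ungrouped: means "Visiting my father" OR "mother" OR "sister" OR "brother".
ungrouped = r"Visiting my father|mother|sister|brother"

grouped_hits = [p for p in phrases if re.search(grouped, p)]
ungrouped_hits = [p for p in phrases if re.search(ungrouped, p)]

print(grouped_hits)
print(ungrouped_hits)
```

The ungrouped version also picks up "Calling my sister", which is probably not what you wanted.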

#4. To Infinity and beyond!

RegEx also lets you specify repetition. This is useful if you want to account for bad spelling: scho+l. The + sign specifies that the preceding character can occur one or more times. You can also specify that a character can occur zero times (optional), one time, or more, using the asterisk. To find an excitable text messager, just use OMF*G. You’ll get “OMG”, “OMFG”, or “OMFFFFFFFFFFG”.
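
A quick way to check the difference between + and * is a Python sketch like the one below. The word lists are made up, and I have added ^ and $ around each pattern so that only whole words count.

```python
import re

# Hypothetical inputs, invented for this example.
spellings = ["schl", "schol", "school", "schoool"]
texts = ["OMG", "OMFG", "OMFFFFFFFFFFG", "OMFT"]

# "+" requires at least one "o", so "schl" is rejected.
one_or_more_o = [w for w in spellings if re.search(r"^scho+l$", w)]

# "*" allows zero "F" characters, so plain "OMG" is accepted.
zero_or_more_f = [t for t in texts if re.search(r"^OMF*G$", t)]

print(one_or_more_o)
print(zero_or_more_f)
```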

Mike Arnesen at the Portland Google Analytics User Group
Photo credit: iSite Design

#5. Surprise Me

Having a “wildcard” character is always useful. The period, dot, or full stop is your wildcard in RegEx and stands for any single character. Using it in an expression allows any one character to occur in a given position. So, d.g would round up “dog”, “dig”, “dug”, or “d4g”. You can also pair the period with the asterisk. Using .* means you’ll take any run of characters of any length, including none. Anything at all. Not that useful within Google Analytics, but it’s handy in other applications.
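
The sketch below, in Python, shows the dot standing in for exactly one character and .* standing in for any number of them. The word list is invented, and the ^ and $ anchors are my addition so that only whole words are tested.

```python
import re

# Hypothetical words, invented for this example.
words = ["dog", "dig", "dug", "d4g", "dg", "doug"]

# A single dot matches exactly one character, whatever it is.
single_wildcard = [w for w in words if re.search(r"^d.g$", w)]

# ".*" matches any run of characters, including an empty one.
any_run = [w for w in words if re.search(r"^d.*g$", w)]

print(single_wildcard)
print(any_run)
```

The single dot rejects "dg" (nothing in the middle) and "doug" (two characters in the middle), while .* accepts every word in the list.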

#6. Keeping it Classy

One of my favorite concepts in RegEx is character classes. Using square brackets, we can define a whole class of things that apply to the space of one character. What does that mean? Say you want to find pages in Google Analytics that start with a number. ^0|^1|^2|^3|^4|^5|^6|^7|^8|^9 is pretty lame. Using a character class, we can use ^[0-9]. The hyphen there means “through” and can be used for number or letter ranges. More commonly, you’ll specify a few letters using a character class: [kc]haos
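
Character classes are easy to test with a short Python sketch. The titles and words below are assumptions made up for the example, not data from a real GA profile.

```python
import re

# Hypothetical page titles and words, invented for this example.
titles = [
    "8 Simple RegEx Concepts",
    "13 Tips for Google Analytics",
    "About SwellPath",
    "Top 10 Filters",
]
words = ["chaos", "khaos", "total chaos", "haos"]

# [0-9] stands for one digit, so ^[0-9] means "starts with a digit".
starts_with_digit = [t for t in titles if re.search(r"^[0-9]", t)]

# [kc] stands for one character that is either "k" or "c".
k_or_c = [w for w in words if re.search(r"[kc]haos", w)]

print(starts_with_digit)
print(k_or_c)
```

"Top 10 Filters" contains digits but does not start with one, so the caret keeps it out.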

#7. History Repeats

We covered some repetition with the plus sign and the asterisk. The upper limit on both of those is infinity. There are a lot of situations when you need something more controlled. Using curly braces allows you to use a set number of repetitions. You can set lower and upper limits using a first and a second number between the braces. A{2,4} means you’re looking for “AA”, “AAA”, or “AAAA”. You can also omit the comma and the second number to specify an exact number of repetitions. Pretty co{2}l.
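
Curly braces can be checked with the Python sketch below. The inputs are invented, and I have anchored the patterns with ^ and $ because an unanchored A{2,4} would still be found inside a longer run of A's.

```python
import re

# Hypothetical inputs, invented for this example.
runs = ["A", "AA", "AAA", "AAAA", "AAAAA"]
words = ["col", "cool", "coool"]

# {2,4}: between two and four repetitions of "A".
two_to_four = [r for r in runs if re.search(r"^A{2,4}$", r)]

# {2}: exactly two repetitions of "o".
exactly_two_o = [w for w in words if re.search(r"^co{2}l$", w)]

# Without anchors, the pattern is found inside the longer string too.
found_inside_longer_run = bool(re.search(r"A{2,4}", "AAAAA"))

print(two_to_four)
print(exactly_two_o)
print(found_inside_longer_run)
```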

#8. Don’t Even Think About It

You’ve probably noticed by now that there are a good number of characters that have a special meaning in RegEx. To use them literally, you need to tell them to not be special anymore. Characters that have special meaning are the square bracket [, the backslash \, the caret ^, the dollar sign $, the period ., the pipe |, the question mark ?, the asterisk *, the plus sign +, and the round bracket (.

You can precede any of these special characters with a backslash to cancel out their special meaning. If you’re looking for pages in Google Analytics that contain a query string parameter, use \? in your expression.
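
Escaping is where small slips cause big surprises, so here is a hedged Python sketch. The page paths are invented. The backslash matters: /? with a forward slash would mean "an optional slash", which matches every path. The last lines use re.escape, a Python helper (not something GA offers) that adds the backslashes for you.

```python
import re

# Hypothetical page paths, invented for this example.
pages = [
    "/checkout?p=shipping",
    "/checkout",
    "/blog/regex?utm_source=news",
    "/blog/regex",
]

# Escaped question mark: only pages that contain a literal "?".
with_query = [p for p in pages if re.search(r"\?", p)]

# Forward slash plus "?" is an optional slash, so every page matches.
optional_slash = [p for p in pages if re.search(r"/?", p)]

sentence = "I need about $3.50"
# Unescaped, "$" means "end of string", so nothing can follow it.
unescaped = re.search(r"$3.50", sentence)
# Escaped, "$" and "." are treated as literal characters.
escaped = re.search(r"\$3\.50", sentence)

# Python can add the backslashes for you.
auto_escaped = re.escape("$3.50")

print(with_query)
print(len(optional_slash))
print(unescaped, escaped.group(), auto_escaped)
```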

Quick Reference and Putting it All Together

  1. This or that
    This|that
  2. Start with, optional, end with
    ^start, options?, End$
  3. Families or groups
    (mother|father|sister|brother|direwolf)
  4. One or more, zero or more
    The+, OMF*G
  5. Wildcard
    d.g d.g
  6. Character Classes
    We’re number [1-9]
  7. Controlled repetition
    503 806 [0-9]{4}
  8. Escape Special Characters
    I need about \$3\.50

Just to provide an example of how powerful this could be, I put together a RegEx to capture all the variations of my name plus the term “SEO”.

^Mi(ke|ch[ae]{2}l).?arne+s[eo]n.*s.?e.?o.?$

The following list represents a fraction of what this RegEx can grab:

  • Mike Arnesen SEO
  • Mike Arnesen S.E.O.
  • Michael Arnesen SEO
  • Mike Arnesen SEO.
  • Mike-Arnesen SEO.
  • Mike-Arnesen SEO
  • Mike-Arnesen SE.O
  • Mike-Arnesen S.E.O.
  • Mike-Arnesen S.E.O
  • Mike Arneson SEO.
  • Mike Arneson SEO
  • Mike Arneson SE.O
  • Mike Arneson S.E.O.
  • Mike Arneson S.E.O
  • Mike Arnesen sucks at SEO
  • Mike Arnesen SE.O
  • Mike Arnesen S.E.O
  • Mike Arnesen loves SEO
  • Mike Arnesen knows SEO
  • Mike Arnesen is terrible at SEO
  • Mike Arnesen is speaking at SMX East about Google Authorship and SEO
  • Mike Arnesen is great at SEO
  • Mike Arnesen has never actually done SEO
  • Micheal-Arnesen SEO.
  • Micheal-Arnesen SEO
  • Micheal-Arnesen SE.O
  • Micheal-Arnesen S.E.O.
  • Micheal-Arnesen S.E.O
  • Micheal-Arneesen S.E.O
  • Micheal-Arneeesen SEO
  • Micheal-Arneeesen SE.O
  • Micheal-Arneeesen S.E.O.
  • Micheal-Arneeeeesen SEO.
  • Michael-Arnesen SEO.
  • Michael-Arnesen SEO
  • Michael-Arnesen SE.O
  • Michael-Arnesen S.E.O.
  • Michael-Arnesen S.E.O
  • Michael Arneson SEO.
  • Michael Arneson SEO
  • Michael Arneson SE.O
  • Michael Arneson S.E.O.
  • Michael Arneson S.E.O
  • Michael Arnesen SEO.
  • Michael Arnesen SE.O
  • Michael Arnesen S.E.O.
  • Michael Arnesen S.E.O
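
If you want to check the expression against a few of these keywords yourself, the Python sketch below is one way to do it. It assumes case-insensitive matching (the re.IGNORECASE flag), since the pattern spells "arne" in lowercase while the keywords use "Arnesen". The three near-miss keywords are my own inventions, not from the list above.

```python
import re

# The expression from the post. re.IGNORECASE is an assumption: the pattern
# has a lowercase "arne" while the keywords are written "Arnesen".
NAME_SEO = re.compile(
    r"^Mi(ke|ch[ae]{2}l).?arne+s[eo]n.*s.?e.?o.?$",
    re.IGNORECASE,
)

# A sample of keywords taken from the list above.
should_match = [
    "Mike Arnesen SEO",
    "Mike-Arnesen S.E.O.",
    "Mike Arneson SE.O",
    "Micheal-Arneeeeesen SEO.",
    "Michael Arnesen S.E.O",
    "Mike Arnesen has never actually done SEO",
]

# Hypothetical near-misses, invented for this example.
should_not_match = [
    "Mike Arnesen PPC",
    "Mick Arnesen SEO",
    "SEO by Mike Arnesen",
]

results = {
    keyword: bool(NAME_SEO.search(keyword))
    for keyword in should_match + should_not_match
}

for keyword, matched in results.items():
    print(f"{matched!s:5}  {keyword}")
```

The near-misses fail for different reasons: the first never mentions SEO, the second has a first name the group does not allow, and the third does not start with the name.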

If reading this got you interested in learning more about RegEx, there is an excellent site that will teach you everything you’ll ever need to know about RegEx. Best of luck in digging deeper into your data. If you found this post useful, let me know in the comments. Happy RegExing!



Mike Arnesen - Director of Analytics & Optimization

A diehard SEO and web analytics geek, Mike is the Director of Analytics & Optimization at SwellPath. He is also a board member at SEMpdx. Mike's fascination for search experience optimization, structured data and semantic markup, and web technology knows no bounds. Beyond geeking out with SEO and analytics, Mike is also a prolific blogger, speaker (MozCon, SemTechBiz, SEMpdx, SMX, State of Search Conference, etc.), and company culture advocate. When not in the office, Mike is spending time with his wife, enjoying the outdoors, or keeping up with inbound marketing news via mobile; most of the time, it's all three simultaneously.


49 Responses to “8 Simple yet Powerful RegEx Concepts for Google Analytics”

  1. AllenJT

    Thanks Mike — this is a fantastic guide! I really wish I had something like this when I was first starting to use Regular Expressions. I will definitely be referring back to this in the future.

  2. Tyler Cook

    Thanks for this simple and powerful guide on Regular Expressions. Another resource I recommend is http://regex.learncodethehardway.org/book/. Very useful for programming languages such as PHP and Python.

  3. blizzle

    I counted 2 South Park references (carrot ??? profit and $3.50) – did I miss any?

  4. David Manion

    Thanks for the great article. Very clear. However, there is one thing I haven’t been able to find information on. In Google Analytics, how do I ask, via a RegEx, to get something “and” something else? For instance, I have a string that I want to set up as goals and funnels for my shopping cart – the string is:

    /XXXXXX/checkout?o=199996539;s=bXh5SLIjcB6petWsuELM6xvgInJ72mnhfdn.PL2gS1U;t=VIHW74UTIHKEC;ac=view;p=shipping

    How do I retrieve anything with “checkout” and “p=shipping”, whilst ignoring the rest?

    Many thanks,

    David

  5. 13 Tips to Streamline your Google Analytics Experience

    […] – How to show keyword positions within Google Analytics with filters. – 8 Simple yet powerful RegEx concepts for Google Analytics […]

  6. Suzy Bureau

    This is incredibly helpful -and I love the humor! Thanks for taking the time to explain things simply. I will be using these in the future!

  7. eywu

    I think the example in the last sentence has the slash facing the wrong direction.

    use /? in your expression

    I believe it should be \? which is the backslash mentioned in the sentence prior.

  8. André Mafei

    Use:
    /XXXXXX/checkout.*p=shipping

