In this post, I’m going to introduce you to eight powerful RegEx concepts that you can use in Google Analytics to create better advanced segments and filters. These were chosen based on things that we at SwellPath use most to get actionable data that drives decisions. At the end of this, you should be ready to harness the power of RegEx to get that high-value site data from your GA profiles.
“Everybody stand back! I know regular expressions!”
Last week, I had the pleasure of joining the Portland Google Analytics User Group as a “Subject Matter Expert”, teaching people about using advanced segments and filters in GA. It was a great event, with a great crowd ranging from absolute beginners to experienced users. I brought up using regular expressions at the event and wanted to provide a post for those who are still getting their feet wet.
Using RegEx in your advanced filters and segments gives you incredible power over your data. Using the basic method, you could set up an advanced filter to look at visitors landing on blog posts from January, February, and March.
Using RegEx, you can make that same filter using this.
Just from that super watered-down example, you can see that using RegEx has the potential to greatly simplify and power up your GA segments and filters. So without further ado, here are eight RegEx concepts that will revolutionize how you use advanced filters and segments.
#1. Pipe Dreams
One of the simplest RegEx concepts to understand is the pipe. It looks like this: |
It’s essentially the word “or” and lets you tell Google Analytics that you want results matching this or that.
Better example: Mario|Luigi
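To see the pipe in action outside of GA, here is a minimal sketch using Python's built-in re module as a stand-in. The keyword list is made up for illustration, and the case-insensitive partial matching is an assumption about how your GA filter is configured, not a guarantee.

```python
import re

# Hypothetical keyword list; in GA these would be rows in a keyword report.
keywords = ["super mario bros", "luigi's mansion", "princess peach", "Mario Kart"]

# Assumption: the filter matches anywhere in the value and ignores case,
# so re.search with IGNORECASE is used to approximate that behavior.
pattern = re.compile(r"Mario|Luigi", re.IGNORECASE)

# Keep every keyword that contains "mario" OR "luigi".
matches = [k for k in keywords if pattern.search(k)]
print(matches)
```

Anything containing either name is kept; "princess peach" is dropped because neither side of the pipe matches it.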
#2. Collect Carrot, ???, Profit
Don’t get the joke in the title? Don’t worry. It’s not just you. The best kind of jokes require an explanation (#ProbablyNotAccurate).
In RegEx, we can specify exactly where we want something to appear or make certain things completely optional. By adding a caret to the beginning of a regular expression, you’re specifying that you only want results that start with that. So, using the following in the keywords report would only return searches that start with “swell”: ^swell
You can also do the other half of that: specify that you want your string to end with something. The following would return any keywords ending with “rad”: rad$
- SwellPath is rad
- SEO for Lauren Conrad
You can also use them both to specify you want an exact match: ^swellpath$
The last concept here is the question mark. This makes the preceding character optional, which is very useful for spelling variations that have extra letters. Using the following, you could get an exact match on either “colors” or “colours”: ^colou?rs$
Sure beats this: ^colors$|^colours$
So in this section, we covered the caret “^”, the question mark “?”, and the dollar sign “$”.
GET IT?!
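Here is a rough sketch of the caret, dollar sign, and question mark, again using Python's re module as a stand-in for GA. The helper name matching and the sample keywords are hypothetical, and case-insensitive matching is assumed.

```python
import re

def matching(pattern, candidates):
    """Return the candidates the pattern matches (case-insensitive, partial match)."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [c for c in candidates if rx.search(c)]

# Hypothetical keyword data for illustration only.
keywords = ["swellpath seo", "portland swell", "SwellPath is rad",
            "SEO for Lauren Conrad", "radio ads", "swellpath"]

starts = matching(r"^swell", keywords)        # must begin with "swell"
ends = matching(r"rad$", keywords)            # must end with "rad"
exact = matching(r"^swellpath$", keywords)    # must be exactly "swellpath"

# The "u" is optional, so both spellings match; extra letters do not.
spelling = matching(r"^colou?rs$", ["colors", "colours", "colouurs", "watercolors"])

print(starts, ends, exact, spelling, sep="\n")
```

Note that "radio ads" is not caught by rad$ because the "rad" is at the start, and "watercolors" is not caught by ^colou?rs$ because of the leading caret.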
#3. Father, Mother, Sister, Brother
You can also use RegEx to define a “family” of items. You can combine parentheses with the pipe to use the “or” functionality within a larger expression. For example, the following works fine if it’s all you’re looking for: father|mother|sister|brother
But what if you want something more specific but still want the freedom to catch all the family member variations? You can use parentheses.
Visiting my (father|mother|sister|brother)
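The difference the parentheses make is easier to see side by side. This is a small sketch in Python's re module (a stand-in for GA, with made-up phrases and assumed case-insensitive matching) comparing the grouped expression with the same expression minus the parentheses.

```python
import re

# Hypothetical phrases for illustration only.
phrases = ["Visiting my mother this weekend", "visiting my brother",
           "Visiting my dentist", "my sister is visiting"]

# Grouped: "Visiting my " followed by one of the four family members.
grouped = re.compile(r"Visiting my (father|mother|sister|brother)", re.IGNORECASE)

# Ungrouped: the pipe splits the WHOLE expression, so this reads as
# "Visiting my father" OR "mother" OR "sister" OR "brother".
ungrouped = re.compile(r"Visiting my father|mother|sister|brother", re.IGNORECASE)

grouped_hits = [p for p in phrases if grouped.search(p)]
ungrouped_hits = [p for p in phrases if ungrouped.search(p)]

print(grouped_hits)
print(ungrouped_hits)
```

Without the parentheses, "my sister is visiting" sneaks in because "sister" alone satisfies one branch of the pipe.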
#4. To Infinity and beyond!
RegEx also lets you specify repetition. This is useful if you want to account for bad spelling: scho+l. The + sign specifies that the preceding character can occur one or more times. You can also specify that a character can occur zero times (optional), one time, or more, using the asterisk. To find an excitable text messager, just use OMF*G. You’ll get “OMG”, “OMFG”, or “OMFFFFFFFFFFG”.
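As a quick sketch of the plus sign and asterisk, here are both expressions run through Python's re module (used as a stand-in for GA). The word lists are invented, and the caret and dollar sign are added only so each test word has to match in full.

```python
import re

# One or more "o" characters between "sch" and "l".
plus = re.compile(r"^scho+l$")
# Zero or more "F" characters between "OM" and "G".
star = re.compile(r"^OMF*G$")

plus_hits = [w for w in ["schol", "school", "schoool", "schl"] if plus.search(w)]
star_hits = [w for w in ["OMG", "OMFG", "OMFFFFFFFFFFG", "OMFFF"] if star.search(w)]

print(plus_hits)
print(star_hits)
```

"schl" fails because + needs at least one "o", while "OMG" passes because * is happy with zero "F"s.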

Photo credit: iSite Design
#5. Surprise Me
Having a “wildcard” character is always useful. The period, dot, or full stop is your wildcard in RegEx and stands for any single character. Using it in an expression allows anything to occur in that one position. So, d.g would round up “dog”, “dig”, “dug”, or “d4g”. You can also pair the period with the asterisk. Using .* means you’ll take anything, of any length. Anything at all. Not that useful within Google Analytics, but it’s handy in other applications.
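A minimal sketch of the wildcard, using Python's re module as a stand-in for GA. The page paths are hypothetical, and the anchors on d.g are there only to make each test word match as a whole.

```python
import re

# The dot stands for exactly one character, whatever it is.
dot = re.compile(r"^d.g$")
dot_hits = [w for w in ["dog", "dig", "dug", "d4g", "dg", "doog"] if dot.search(w)]

# Dot plus asterisk: any run of characters (including none) in the middle.
# Hypothetical page paths for illustration only.
paths = ["/blog/2012/regex-for-seo", "/blog/seo", "/services/seo", "/blog/analytics"]
blog_seo = re.compile(r"^/blog/.*seo")
path_hits = [p for p in paths if blog_seo.search(p)]

print(dot_hits)
print(path_hits)
```

"dg" and "doog" miss because the dot needs exactly one character, no fewer and no more.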
#6. Keeping it Classy
One of my favorite concepts in RegEx is character classes. Using square brackets, we can define a whole class of things that apply to the space of one character. What does that mean? Say you want to find pages in Google Analytics that start with a number. ^0|^1|^2|^3|^4|^5|^6|^7|^8|^9 is pretty lame. Using a character class, we can use ^[0-9]. The hyphen there means “through” and can be used for number or letter ranges. More commonly, you’ll specify a few letters using a character class: [kc]haos
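To check that the character class really does the same job as the long version, here is a short sketch in Python's re module (a stand-in for GA; the page names and words are made up).

```python
import re

# Hypothetical page names for illustration only.
pages = ["2012-review", "about-us", "8-regex-concepts", "top-10"]

# The long-winded way and the character-class way.
verbose = re.compile(r"^0|^1|^2|^3|^4|^5|^6|^7|^8|^9")
digit_start = re.compile(r"^[0-9]")

verbose_hits = [p for p in pages if verbose.search(p)]
class_hits = [p for p in pages if digit_start.search(p)]

# A class listing specific letters: "k" or "c" in the first position.
chaos_hits = [w for w in ["chaos", "khaos", "thaos"] if re.search(r"[kc]haos", w)]

print(class_hits)
print(chaos_hits)
```

Both expressions return the same pages; the class is just far easier to read and maintain.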
#7. History Repeats
We covered some repetition with the plus sign and the asterisk. The upper limit on both of those is infinity. There are a lot of situations when you need something more controlled. Using curly braces allows you to use a set number of repetitions. You can set lower and upper limits using a first and a second number, separated by a comma, between the braces. A{2,4} means you’re looking for “AA”, “AAA”, or “AAAA”. You can also use a single number to specify an exact number of repetitions. Pretty co{2}l.
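Here is a small sketch of curly braces in Python's re module (a stand-in for GA, with invented test strings). The anchors matter here: without them, A{2,4} would still find a partial match inside a longer run such as "AAAAA".

```python
import re

# Between two and four "A" characters, and nothing else.
ranged = re.compile(r"^A{2,4}$")
# Exactly two "o" characters.
exact = re.compile(r"^co{2}l$")

ranged_hits = [s for s in ["A", "AA", "AAA", "AAAA", "AAAAA"] if ranged.search(s)]
exact_hits = [s for s in ["col", "cool", "coool"] if exact.search(s)]

# Without anchors, a partial match is found inside the longer string.
unanchored = re.search(r"A{2,4}", "AAAAA") is not None

print(ranged_hits)
print(exact_hits)
print(unanchored)
```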
#8. Don’t Even Think About It
You’ve probably noticed by now that there are a good number of characters that have a special meaning in RegEx. To use them literally, you need to tell them to not be special anymore. Characters that have special meaning are the opening square bracket [, the backslash \, the caret ^, the dollar sign $, the period ., the pipe |, the question mark ?, the asterisk *, the plus sign +, the round brackets ( and ), and the opening curly brace {.
You can precede any of these special characters with a backslash to cancel out their special meaning. If you’re looking for pages in Google Analytics that contain a query string parameter, use \? in your expression.
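A short sketch of escaping, using Python's re module as a stand-in for GA. The page paths are hypothetical. It also shows why the direction of the slash matters: the escape character is the backslash, whereas a forward slash followed by a question mark just means "an optional slash" and matches every page.

```python
import re

# Hypothetical page paths for illustration only.
pages = ["/checkout?step=2", "/checkout", "/search?q=regex"]

# Backslash + question mark: a literal "?" must be present.
literal_q = re.compile(r"\?")
with_query = [p for p in pages if literal_q.search(p)]

# Forward slash + question mark: "zero or one slash", which matches anything.
optional_slash = re.compile(r"/?")
matches_everything = all(optional_slash.search(p) for p in pages)

# Escaping the dollar sign and the period to match a literal price.
price = re.compile(r"\$3\.50")
has_price = price.search("I need about $3.50") is not None
no_price = price.search("I need about $3x50") is None

# Python can build the escaped form for you (this is a Python convenience, not a GA feature).
escaped = re.escape("$3.50")

print(with_query, matches_everything, has_price, no_price, escaped)
```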
Quick Reference and Putting it All Together
- This or that: This|that
- Start with, optional, end with: ^start, options?, End$
- Families or groups: (mother|father|sister|brother|direwolf)
- One or more, zero or more: The+, OMF*G
- Wildcard: d.g d.g
- Character Classes: We’re number [1-9]
- Controlled repetition: 503 806 [0-9]{4}
- Escape Special Characters: I need about \$3\.50
Just to provide an example of how powerful this could be, I put together a RegEx to capture all the variations of my name plus the term “SEO”.
^Mi(ke|ch[ae]{2}l).?arne+s[eo]n.*s\.?e\.?o\.?$
The following list represents a fraction of what this RegEx can grab:
- Mike Arnesen SEO
- Mike Arnesen S.E.O.
- Michael Arnesen SEO
- Mike Arnesen SEO.
- Mike-Arnesen SEO.
- Mike-Arnesen SEO
- Mike-Arnesen SE.O
- Mike-Arnesen S.E.O.
- Mike-Arnesen S.E.O
- Mike Arneson SEO.
- Mike Arneson SEO
- Mike Arneson SE.O
- Mike Arneson S.E.O.
- Mike Arneson S.E.O
- Mike Arnesen sucks at SEO
- Mike Arnesen SE.O
- Mike Arnesen S.E.O
- Mike Arnesen loves SEO
- Mike Arnesen knows SEO
- Mike Arnesen is terrible at SEO
- Mike Arnesen is speaking at SMX East about Google Authorship and SEO
- Mike Arnesen is great at SEO
- Mike Arnesen has never actually done SEO
- Micheal-Arnesen SEO.
- Micheal-Arnesen SEO
- Micheal-Arnesen SE.O
- Micheal-Arnesen S.E.O.
- Micheal-Arnesen S.E.O
- Micheal-Arneesen S.E.O
- Micheal-Arneeesen SEO
- Micheal-Arneeesen SE.O
- Micheal-Arneeesen S.E.O.
- Micheal-Arneeeeesen SEO.
- Michael-Arnesen SEO.
- Michael-Arnesen SEO
- Michael-Arnesen SE.O
- Michael-Arnesen S.E.O.
- Michael-Arnesen S.E.O
- Michael Arneson SEO.
- Michael Arneson SEO
- Michael Arneson SE.O
- Michael Arneson S.E.O.
- Michael Arneson S.E.O
- Michael Arnesen SEO.
- Michael Arnesen SE.O
- Michael Arnesen S.E.O.
- Michael Arnesen S.E.O
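If you want to check a list like this yourself, here is a minimal sketch that runs the expression over a handful of the entries above using Python's re module as a stand-in for GA. Case-insensitive matching is assumed, and the three "should not match" strings are made up for contrast.

```python
import re

# The expression from the post, compiled with case-insensitive matching (an assumption).
name_seo = re.compile(
    r"^Mi(ke|ch[ae]{2}l).?arne+s[eo]n.*s\.?e\.?o\.?$",
    re.IGNORECASE,
)

# A few entries taken from the list above.
should_match = [
    "Mike Arnesen SEO",
    "Michael Arneson S.E.O.",
    "Micheal-Arneeeeesen SEO.",
    "Mike Arnesen is great at SEO",
    "Mike Arnesen SE.O",
]

# Hypothetical near misses: wrong ending, wrong first name, name not at the start.
should_not_match = [
    "Mike Arnesen PPC",
    "Mick Arnesen SEO",
    "SEO by Mike Arnesen",
]

matched = [s for s in should_match if name_seo.search(s)]
rejected = [s for s in should_not_match if not name_seo.search(s)]

print(matched)
print(rejected)
```

One side effect worth knowing: because [ae]{2} allows any two-letter mix of "a" and "e", spellings like "Michaal" get through too. For catching typos, that is arguably a feature.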
If reading this got you interested in learning more about RegEx, there is an excellent site that will teach you everything you’ll ever need to know about RegEx. Best of luck in digging deeper into your data. If you found this post useful, let me know in the comments. Happy RegExing!
Like this post? Follow Mike Arnesen on Google+
Tags: advanced segments, Google Analytics, regular expressions


Thanks Mike — this is a fantastic guide! I really wish I had something like this when I was first starting to use Regular Expressions. I will definitely be referring back to this in the future.
Thanks so much for the feedback! I hope that members of the user group find it useful. I definitely wouldn’t have minded a nice easy to understand guide when I was starting either. A lot of the resources out there are laden with “geek speak” that a lot of people don’t really get.
Thanks for this simple and powerful guide on Regular Expressions. Another resource I recommend is http://regex.learncodethehardway.org/book/. Very useful for programming languages such as PHP and Python.
So glad that you (and so many others) are finding it informative and useful! Learn RegEx the hard way, eh? Sounds intriguing. I’ll check it out.
I counted 2 south park references (carrot ??? profit and $3.50) – did I miss any?
You got’em all. I’ll have to sprinkle in some more creative references next time.
“OH now it’s only two fiddy, what – is there a sale on loch ness munchies or somethin!?”
I need about \$[2-3]\.50.
Thanks for the great article. Very clear. However, there is one thing I haven’t been able to find information on. In google analytics how do I ask, via a RegEx, to get something “and” something else. For instance, I have a string that I want to set up as goals and funnels for my shopping cart – the string is:
/XXXXXX/checkout?o=199996539;s=bXh5SLIjcB6petWsuELM6xvgInJ72mnhfdn.PL2gS1U;t=VIHW74UTIHKEC;ac=view;p=shipping
How do I retrieve anything with “checkout” and “p=shipping”, whilst ignoring the rest?
Many thanks,
David
This is incredibly helpful -and I love the humor! Thanks for taking the time to explain things simply. I will be using these in the future!