What is cardinality?
Cardinality is the number of elements in a set or other grouping, as a property of that grouping. For example, the set A = {2, 4, 6} contains 3 elements, so A has a cardinality of 3. High cardinality means there are many unique values in the set or grouping. Low cardinality means there are few unique values in the set or grouping.
How can I tell if I am experiencing high cardinality in Google Analytics?
Google Analytics limits the number of unique values it stores for a dimension per day. Daily processed tables store up to 50K rows for standard Analytics and up to 75K rows for Google Analytics 360. Multi-day processed tables, which contain 4 days' worth of data, store up to 100K rows for standard Analytics and 150K rows for Google Analytics 360. When the number of unique values exceeds the limit, Google Analytics automatically chooses the top values to display and creates a row labeled "(other)" for the remaining values.
Dimensions with high cardinality potential:
- Page
- Event Label
- Custom Dimensions
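The rollup into "(other)" can be illustrated with a small sketch. This is not Google Analytics' actual processing code: `roll_up_other` and the sample page values are hypothetical, "top values" is assumed to mean the values with the most hits, and a tiny row limit stands in for the real 50K/75K limits so the effect is visible.

```python
from collections import Counter


def roll_up_other(values, row_limit):
    """Keep the `row_limit` most frequent values and fold the rest into "(other)".

    Illustrative assumption: "top values" means highest hit count.
    """
    counts = Counter(values)
    if len(counts) <= row_limit:
        # Cardinality is within the limit, so no "(other)" row is needed.
        return dict(counts)
    top = counts.most_common(row_limit)
    table = dict(top)
    # Everything outside the top rows is aggregated into a single row.
    table["(other)"] = sum(counts.values()) - sum(n for _, n in top)
    return table


# Hypothetical Page values; the sessionid parameter inflates cardinality.
pages = [
    "/home", "/home", "/home",
    "/cart?sessionid=1", "/cart?sessionid=2", "/cart?sessionid=3",
    "/pricing", "/pricing",
]

cardinality = len(set(pages))          # number of unique Page values
table = roll_up_other(pages, row_limit=2)
print(cardinality, table)
```

With a limit of 2 rows, the three `/cart` variants end up in "(other)" even though they are really the same page, which is the symptom the solutions in this article try to remove.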
How do I get rid of (other)?
There are several steps and workarounds for dealing with high cardinality in Google Analytics. Before reading further, note that we strongly recommend trying each of the following steps in a new view or a test view before applying it to your production views.
Solution 1: Query Parameter Settings
The easiest and quickest step you can take to reduce cardinality is to change your query parameter settings. You can reduce the number of possible values in the Page dimension by filtering out dynamic session or customer ID variables. Any query parameters or unique session identifiers that you do not want to appear in your URLs can be entered in the setting as a comma-separated list. For example, if you do not want to see "sessionid" in your URLs because it is causing high cardinality, enter sessionid in the Exclude URL Query Parameters setting. It's important to note that this setting is applied before filters. This means that if the parameter in the original URI is "SessionID", you will need to enter "SessionID", even if a lowercase URI filter is applied to this view.
Solution 2: Site Search Settings
Having a search function on your website can lead to high cardinality. When a user enters text into the search box, the text is captured in the URL, so every distinct search term produces a distinct page value. To solve this issue, go to the view settings in Google Analytics and check the "Strip query parameters out of URL" box.
Original URL: www.example.com/?s=xxx
New URL: www.example.com/
Instead of appearing in the URL data, the search terms can now be found under the Site Search reports, where they are cleaner and easier to analyze.
[Screenshots: the Page dimension before the "Strip query parameters out of URL" checkbox is checked; the Search Terms report after applying the site search settings; the Page dimension after applying the settings.]
Solution 3: Custom Tables
Custom Tables give you access to all of the data for a particular set of metrics, dimensions, segments, and filters on a daily basis. The main purpose of Custom Tables is to avoid sampling of your data caused by large metric volumes, such as a high number of sessions. Custom Tables do not eliminate cardinality issues, but unlike the standard daily processed tables, they have a limit of 1M unique rows per day, which is 925K higher than the normal limit of 75K rows for GA 360.
Solution 4: Filters
There are many filters you can use to reduce high cardinality. Before applying any filters, please note these rules:
- Filters are destructive. Filtering your incoming hits permanently includes, excludes, or alters those hits in that view, according to the type of filter. Therefore, you should ALWAYS maintain an unfiltered view of your data so you always have access to your full data set.
- Filters require up to 24 hours before they are applied to your data.
- Fields specified in a filter must exist in the hit and not be null in order for the filter to be applied to that hit. For example, if you are filtering on Hostname, but the hit does not contain that field (perhaps the hit was sent via the Measurement Protocol and that request did not contain the &dh parameter), then any filters acting on Hostname will be ignored and the hit will be processed as if there was no filter.
- Filters are account-level objects. If you edit a filter at the view level, you are also changing the filter at the account level, and any other views that use the filter are also affected by the change. If you want to customize a single instance of an existing filter used by multiple views, create a new filter and apply it to that single view.
- Filters are applied in the order in which they are listed, and that order matters. Google Analytics processes each hit through the filters in sequence, so the output of one filter is the input to the next. To avoid accidentally losing data, make sure the filters are assigned in the correct order.
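Why filter order matters can be shown with a toy model. The sketch below is an illustration under stated assumptions, not how Google Analytics implements filters: each hypothetical filter is a plain function that returns a (possibly modified) hit, or `None` to drop it, and the include filter is assumed to match case-sensitively.

```python
def lowercase_uri(hit):
    """Hypothetical filter: rewrite the request URI in lowercase."""
    return {**hit, "uri": hit["uri"].lower()}


def include_only_blog(hit):
    """Hypothetical include filter: keep hits whose URI starts with /blog/.

    Assumed to be case-sensitive for the sake of the example.
    """
    return hit if hit["uri"].startswith("/blog/") else None


def apply_filters(hits, filters):
    """Run every hit through the filters in list order."""
    kept = []
    for hit in hits:
        for f in filters:
            hit = f(hit)
            if hit is None:
                break  # hit was excluded; later filters never see it
        if hit is not None:
            kept.append(hit)
    return kept


hits = [{"uri": "/Blog/Post-1"}, {"uri": "/blog/post-2"}, {"uri": "/pricing"}]

lower_first = apply_filters(hits, [lowercase_uri, include_only_blog])
include_first = apply_filters(hits, [include_only_blog, lowercase_uri])

print(len(lower_first), len(include_first))
```

Running the lowercase filter first keeps both blog hits, while running the include filter first drops `/Blog/Post-1` before it can be normalized. Because filters are destructive, a hit dropped this way cannot be recovered in that view.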