This guide explains what the cleanUrlTracker plugin is and how to integrate it into your analytics.js tracking implementation.
When viewing your most visited pages in Google Analytics, it's not uncommon to see multiple different URL paths that reference the same page on your site. The following report table is a good example of this and the frustrating situation many users find themselves in today:
Page | Pageviews |
---|---|
/contact | 967 |
/contact/ | 431 |
/contact?hl=en | 67 |
/contact/index.html | 32 |
To prevent this problem, it's best to settle on a single, canonical URL path for each page you want to track, and only ever send the canonical version to Google Analytics.
The cleanUrlTracker plugin helps you do this. It lets you specify whether to include extraneous parts of the URL path, and it updates all URLs accordingly.
The cleanUrlTracker plugin works by intercepting each hit as it's being sent and modifying the page field based on the rules specified by the configuration options. The plugin also intercepts calls to tracker.get() that reference the page field, so other plugins that use page data get the cleaned versions instead of the original versions.
Note: while the cleanUrlTracker plugin does modify the page field value for each hit, it never modifies the location field. This allows campaign and site search data encoded in the full URL to be preserved.
To enable the cleanUrlTracker plugin, run the require command, specify the plugin name 'cleanUrlTracker', and pass in the configuration options you want to set:
ga('require', 'cleanUrlTracker', options);
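As a rough sketch of where the require command fits, the snippet below queues it between the create and send commands, so the tracker exists before the plugin is required and the plugin is active before the first hit is sent. The tracking ID is a placeholder, the options object is only an example, and the small ga stub stands in for the command queue that the standard analytics.js snippet sets up. It also assumes the script that provides the plugin is loaded on the page.

```javascript
// Stand-in for the command queue created by the standard analytics.js
// snippet: commands are stored until the library loads and runs them.
var root = typeof window !== 'undefined' ? window : globalThis;
root.ga = root.ga || function() {
  (root.ga.q = root.ga.q || []).push(arguments);
};

// Example options only; see the options table for what is available.
var options = {trailingSlash: 'remove'};

ga('create', 'UA-XXXXX-Y', 'auto');         // placeholder tracking ID
ga('require', 'cleanUrlTracker', options);  // enable the plugin on the default tracker
ga('send', 'pageview');                     // queued after the plugin is required
```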
The following table outlines all possible configuration options for the cleanUrlTracker plugin. If any of the options has a default value, the default is explicitly stated:
Name | Type | Description |
---|---|---|
stripQuery | boolean | When true, the query string portion of the URL will be removed. Default: false |
queryDimensionIndex | number | There are cases where you want to strip the query string from the URL, but you still want to record what query string was originally there, so you can report on those values separately. You can do this by creating a new custom dimension in Google Analytics. Set the dimension's scope to "hit", and then set the index of the newly created dimension as the queryDimensionIndex option. Once set, the stripped query string will be set on the custom dimension at the specified index. |
indexFilename | string | When set, the indexFilename value will be stripped from the end of a URL. If your server supports automatically serving index files, you should set this to whatever value your server uses (usually 'index.html'). |
trailingSlash | string | When set to 'add', a trailing slash is appended to the end of all URLs (if not already present). When set to 'remove', a trailing slash is removed from the end of all URLs. No action is taken if any other value is used. Note: when using the indexFilename option, index filenames are stripped prior to the trailing slash being added or removed. |
urlFieldsFilter | Function | A function that is passed a … Warning: be careful when modifying the … |
The following table lists all methods for the cleanUrlTracker plugin:
Name | Description |
---|---|
remove | Removes the cleanUrlTracker plugin from the specified tracker and restores all modified tasks to their original state prior to the plugin being required. |
For details on how analytics.js plugin methods work and how to invoke them, see calling plugin methods in the analytics.js documentation.
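As an illustration, analytics.js plugin methods are invoked through the command queue using the pluginName:methodName form, optionally prefixed with a tracker name. A minimal sketch of calling the remove method follows. The tracker name myTracker is hypothetical, and the ga stub only stands in for the queue that the standard analytics.js snippet creates.

```javascript
// Stand-in for the analytics.js command queue (normally created by the
// standard tracking snippet).
var root = typeof window !== 'undefined' ? window : globalThis;
root.ga = root.ga || function() {
  (root.ga.q = root.ga.q || []).push(arguments);
};

// Remove the plugin from the default tracker.
ga('cleanUrlTracker:remove');

// Remove the plugin from a named tracker ("myTracker" is a hypothetical name).
ga('myTracker.cleanUrlTracker:remove');
```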
Given the four URL paths shown in the table at the beginning of this guide, the following cleanUrlTracker configuration would ensure that only the URL path /contact ever appears in your reports (assumes you've created a custom dimension for the query at index 1):
ga('require', 'cleanUrlTracker', {
  stripQuery: true,
  queryDimensionIndex: 1,
  indexFilename: 'index.html',
  trailingSlash: 'remove'
});
And given those four URLs, the following fields would be sent to Google Analytics for each respective hit:
[1] {
  "location": "/contact",
  "page": "/contact"
}
[2] {
  "location": "/contact/",
  "page": "/contact"
}
[3] {
  "location": "/contact?hl=en",
  "page": "/contact",
  "dimension1": "hl=en"
}
[4] {
  "location": "/contact/index.html",
  "page": "/contact"
}
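To make the rules behind these results concrete, here is a simplified, standalone sketch of how the four options could be applied to a page path. It is not the plugin's actual source: the cleanPage function is hypothetical, it follows only the behavior described in the options table (index filenames are stripped before the trailing slash is handled), and keeping the root path as '/' is an assumption of this sketch.

```javascript
// Hypothetical helper that mimics the documented cleaning rules.
function cleanPage(page, options) {
  var fields = {};
  var queryStart = page.indexOf('?');
  var path = queryStart === -1 ? page : page.slice(0, queryStart);
  var query = queryStart === -1 ? '' : page.slice(queryStart + 1);

  if (options.stripQuery) {
    // Optionally record the stripped query string on a custom dimension.
    if (options.queryDimensionIndex && query) {
      fields['dimension' + options.queryDimensionIndex] = query;
    }
    query = '';
  }

  // Strip the index filename first ...
  if (options.indexFilename) {
    var parts = path.split('/');
    if (parts[parts.length - 1] === options.indexFilename) {
      parts[parts.length - 1] = '';
      path = parts.join('/');
    }
  }

  // ... then add or remove the trailing slash.
  if (options.trailingSlash === 'remove') {
    path = path.replace(/\/+$/, '') || '/';  // assumption: keep the root as "/"
  } else if (options.trailingSlash === 'add' && !/\/$/.test(path)) {
    path = path + '/';
  }

  fields.page = path + (query ? '?' + query : '');
  return fields;
}

var options = {
  stripQuery: true,
  queryDimensionIndex: 1,
  indexFilename: 'index.html',
  trailingSlash: 'remove'
};

// Every result has page === '/contact'; the third also carries
// dimension1 === 'hl=en'.
var results = [
  '/contact',
  '/contact/',
  '/contact?hl=en',
  '/contact/index.html'
].map(function(page) {
  return cleanPage(page, options);
});
```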
If the available configuration options are not sufficient for your needs, you can use the urlFieldsFilter option to arbitrarily modify the URL fields sent to Google Analytics.
The following example passes the same options as the basic example above, but it also removes user-specific IDs from the page path. For example, /users/18542823 becomes /users/<user-id>:
ga('require', 'cleanUrlTracker', {
  stripQuery: true,
  queryDimensionIndex: 1,
  indexFilename: 'index.html',
  trailingSlash: 'remove',
  urlFieldsFilter: function(fieldsObj, parseUrl) {
    // Keep only the pathname and replace numeric user IDs with a placeholder.
    fieldsObj.page = parseUrl(fieldsObj.page).pathname
        .replace(/^\/users\/(\d+)/, '/users/<user-id>');
    return fieldsObj;
  },
});
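To try the filter logic outside of a tracker, the sketch below runs the same function body against a sample path. The parseUrl function here is only a stand-in built on the standard URL constructor with a placeholder origin. It is an assumption that it behaves closely enough to the helper the plugin passes in for this purpose, since only the pathname property is used.

```javascript
// Hypothetical stand-in for the parseUrl helper passed by the plugin.
function parseUrl(url) {
  return new URL(url, 'https://example.com');  // placeholder origin
}

// Same logic as the urlFieldsFilter option in the example above.
function urlFieldsFilter(fieldsObj, parseUrl) {
  fieldsObj.page = parseUrl(fieldsObj.page).pathname
      .replace(/^\/users\/(\d+)/, '/users/<user-id>');
  return fieldsObj;
}

// The numeric ID is replaced and the rest of the path is left alone.
var filtered = urlFieldsFilter({page: '/users/18542823/settings'}, parseUrl);

// Paths that do not start with /users/<digits> pass through unchanged.
var untouched = urlFieldsFilter({page: '/contact'}, parseUrl);
```

Because the filter reads only the pathname, any query string still present on the page value is dropped by this particular function.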