Page Cache vs. Landing Page Performance

Though "landing page" is often thrown around in casual conversation to mean "any page a user first lands on," the more precise marketing definition usually refers to a page specifically linked to by an advertisement, the goal of which is to get the user to perform an action such as filling out a form, downloading a file, etc.

One of the biggest mistakes website operators make when they make their first foray into paid search is to dump the user onto their homepage.  Instead, best practice is to create pages customized for each advertisement with engaging content relevant to the advertisement clicked.

Moreover, to maximize the effectiveness of each ad campaign and its corresponding landing page, gathering and analyzing as much data as possible from user interactions is paramount.  This can be done in any number of ways, from client-side tracking scripts like Google Analytics to server-side data storage and analysis.  Diligent marketers will use the data gathered to optimize their pages and maximize conversion.

A textbook Drupal use-case

Naturally, paid search marketers will want to create a number of pages that are essentially the same, but have content and copy targeted to specific search phrases.  Drupalistas might translate this into a content type with whatever fields the marketer or design calls for: featured image, call to action, etc.

To allow for some basic analysis, you might attach properties to the CTA (be it additional query parameters on a download link, or additional hidden fields on a form).  If you wanted to answer the question, "Which landing page is performing best?" you'd want to be able to segment by node ID, for example.  Other common properties you might include on your content type might be: campaign ID, source, etc.
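As a rough, language-neutral illustration (this is not Drupal code), attaching properties to a download link amounts to appending query parameters to the CTA URL. The function name `tag_cta_url`, the parameter names `nid`, `campaign`, and `source`, and the file path are all assumptions made up for this sketch.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_cta_url(url, properties):
    """Return url with the given tracking properties appended as query parameters."""
    parts = urlsplit(url)
    # Keep any parameters the link already carries, then add ours.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(properties.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical values: node ID 42, a campaign ID, and a traffic source.
cta = tag_cta_url(
    "http://example.com/files/whitepaper.pdf",
    {"nid": 42, "campaign": "spring-sale", "source": "adwords"},
)
print(cta)
# http://example.com/files/whitepaper.pdf?nid=42&campaign=spring-sale&source=adwords
```

The same properties could just as well be written out as hidden form fields; the point is only that each conversion carries enough context to be segmented later.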

Google, Bing, et al. will also append query parameters to your page that you might want to use to override or supplement the above field data stored in Drupal.  For instance, you might want to analyze which keywords lead to a higher conversion rate for a given landing page.
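One way to picture the "override or supplement" step is as a simple merge: start with the values stored on the node, then let whitelisted query parameters win. This is only a sketch under assumptions; `tracking_properties` and the field names are hypothetical, and the parameter names are borrowed from the example URL later in this post.

```python
from urllib.parse import parse_qs, urlsplit


def tracking_properties(node_fields, request_url, allowed=("cid", "kw", "adgroup")):
    """Merge stored field values with whitelisted query parameters from the request."""
    query = parse_qs(urlsplit(request_url).query)
    merged = dict(node_fields)  # defaults stored with the landing page
    for name in allowed:
        if name in query:
            # The ad network's value overrides or supplements the stored one.
            merged[name] = query[name][0]
    return merged


stored = {"nid": "42", "cid": "default-campaign", "source": "adwords"}
url = "http://example.com/my-landing-page?cid=myCampaignId123&kw=User%20Search%20Query&gclid=123abc"
print(tracking_properties(stored, url))
```

Here `cid` is overridden, `kw` is added, and `gclid` is ignored because it is not in the whitelist.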

The trouble with page cache

Naturally, these landing pages need to be blazingly fast; user attention is measured in hundreds of milliseconds, so every bit of time a user spends waiting for your page to load is greater potential for abandonment.  Luckily, like any other page on your Drupal installation, each landing page node will be cached at the page level, and Drupal's page cache can handily serve sub-second page requests.  Even better, every visit from a paid search ad is going to be anonymous, so all of your landing page visits will be served from cache, right?

As it turns out, very few visits to your landing pages will be served from cache, and the main reason is all of that data we need to analyze landing page performance.

When a user clicks on your Google ad, Google sends them to a page like this:

http://example.com/my-landing-page?cid=myCampaignId123&adgroup=My%20Ad%20Group&kw=User%20Search%20Query&adused=12345&gclid=123abc

While this provides fantastic data we can use to perform in-depth analysis, the trouble is that Drupal will use the entire URL, query string included, as the cache ID for the page cache entry.

If the next visitor uses a different keyword to see your ad, Drupal generates a completely new page, even if it's identical to the last visitor's experience.  If you have multiple campaigns pointing to the same landing page, a new page will have to be generated from scratch for each one.  If you use multiple paid search providers and point them to the same page, each one will have unique query parameters, resulting in more uncached page loads.

Most troubling is the gclid appended at the very end of the example string above.  It's an identifier generated by Google that is unique to each ad click, virtually guaranteeing an uncached page load on every ad click.
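To make the effect concrete, here is a toy model of a page cache keyed on the complete URL. It is not Drupal's implementation, just a minimal sketch; the class `FullUrlPageCache` and the click IDs are invented for illustration.

```python
class FullUrlPageCache:
    """Toy page cache keyed on the complete URL, query string included."""

    def __init__(self):
        self.entries = {}
        self.renders = 0

    def render(self, path):
        # Stand-in for an expensive full page build.
        self.renders += 1
        return f"<html>landing page at {path}</html>"

    def serve(self, url):
        if url not in self.entries:
            path = url.split("?", 1)[0]
            self.entries[url] = self.render(path)
        return self.entries[url]


cache = FullUrlPageCache()
for click in range(5):
    # Hypothetical click IDs; real gclid values are opaque strings.
    cache.serve(f"http://example.com/my-landing-page?kw=shoes&gclid=click{click}")

print(cache.renders)                     # 5: every click was a cache miss
print(len(set(cache.entries.values())))  # 1: all five cached pages are identical
```

Five clicks produce five full page builds and five cache entries holding the same markup.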

What to do?

Since we can't rely on page cache for landing pages (and in fact, much of the dynamic form appending and URL rewriting we'd be doing to capture data depends on there being no page cache in place), the next thing to look at is optimizing everything else we can on the page: look into what block cache (and Block Cache Alter) can do for you, look into Views caching if applicable, and consider whether Entity Cache may be useful.  How you implement these will vary from site to site and, unfortunately, I don't believe they will deliver the order-of-magnitude gains seen with page cache.

A more interesting solution, proposed in this thread by mikeytown (maintainer of several high-performance Drupal modules like Boost and AdvAgg), is to remove specific query parameters at the server level or in settings.php, thereby tricking Drupal into getting and setting the page cache using a cleaner URL string.  This will almost certainly have unintended consequences, however, and should be implemented with extreme caution.

Neither of these is particularly satisfying.  It seems that the need for better data is at odds with the need for performance.

Have you noticed poor performance on your landing pages?  If so, how have you solved the problem?
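For readers who want to picture the query-stripping idea, here is a language-neutral sketch of the normalization step. It is not the code from the thread and not a settings.php snippet; `cache_key` and the `TRACKING_PARAMS` list are assumptions based on the example URL above, and a real list would depend on which ad networks you use.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking-only parameters, taken from the example URL above.
TRACKING_PARAMS = frozenset({"cid", "adgroup", "kw", "adused", "gclid"})


def cache_key(url, ignored=TRACKING_PARAMS):
    """Return the URL with tracking parameters removed, for use as a cache key."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignored
    ]
    kept.sort()  # parameter order should not create separate cache entries
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(cache_key("http://example.com/my-landing-page?cid=a&kw=shoes&gclid=123abc"))
# http://example.com/my-landing-page
print(cache_key("http://example.com/news?page=2&gclid=123abc"))
# http://example.com/news?page=2
```

Parameters that actually change the page, such as a pager, survive the normalization. The catch described above still applies: once the page comes from cache, any server-side logic that reads the stripped parameters no longer runs per visit, so that data would have to be captured some other way.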

Comments

Still looking

As the starter of the mikeytown thread linked above, I'm still looking for a good way to prevent random query strings from breaking page cache. As an aside, this behavior seems like a good way to DoS a site to death.

Query munging

What's wrong with modifying the query, exactly? If you know what parameters Google will be using, you can just knock out those specifically. I cannot think of an unpleasant side-effect off hand other than not getting all of them, so it only sometimes works.

Incidentally, the same issue is present if you're using Varnish, and I suspect the query-futzing solution would be the same (just done in Varnish configuration).

I suppose as an alternative, you could implement your own cache backend for page cache (quite easy in Drupal 7 now) that strips ALL query parameters when checking or setting the cache, but only on the front page. This assumes that your front page will never have a useful query parameter in it, so no pagers, but I suspect that's safe. Any other page than your home page you behave normally, but on the home page you just rip the query parameters off entirely and cache based only on the path. (Technically that means you're removing all query parameters except q, I think; there's some implementation details there that are left as an exercise for the reader.)

