Page Cache vs. Landing Page Performance

Though "landing page" is often thrown around in casual conversation to mean "any page a user first lands on," the more precise marketing definition usually refers to a page specifically linked to by an advertisement, the goal of which is to get the user to perform an action such as filling out a form, downloading a file, etc.

One of the biggest mistakes website operators make on their first foray into paid search is to dump the user onto their homepage.  Instead, best practice is to create pages customized for each advertisement, with engaging content relevant to the advertisement clicked.

Moreover, to maximize the effectiveness of each ad campaign and its landing page, gathering and analyzing as much data as possible from user interactions is paramount.  This can be done in any number of ways, from client-side tracking scripts like Google Analytics to server-side data storage and analysis.  Diligent marketers will use the data gathered to optimize their pages and maximize conversion.

A textbook Drupal use-case

Naturally, paid search marketers will want to create a number of pages that are essentially the same, but have content and copy targeted to specific search phrases.  Drupalistas might translate this into a content type with whatever fields the marketer or design calls for: featured image, call to action, etc.

To allow for some basic analysis, you might attach properties to the CTA (be it additional query parameters on a download link, or additional hidden fields on a form).  If you wanted to answer the question, "Which landing page is performing best?" you'd want to be able to segment by node ID, for example.  Other common properties you might include on your content type might be: campaign ID, source, etc.
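As a rough illustration of attaching properties to a CTA link, here is a minimal sketch in Python (not Drupal code). The function name `tag_cta_url`, the parameter names `nid`, `campaign_id`, and `source`, and the example URL are all assumptions made for the example; in practice they would come from whatever fields your content type defines.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_cta_url(cta_url, properties):
    """Return cta_url with the given tracking properties appended as query parameters."""
    parts = urlsplit(cta_url)
    # Keep any query parameters the link already has, then append ours.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(properties.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical landing page node 42 in a hypothetical "spring-sale" campaign.
tagged = tag_cta_url(
    "https://example.com/files/whitepaper.pdf",
    {"nid": 42, "campaign_id": "spring-sale", "source": "google"},
)
print(tagged)
```

With the node ID carried on the download link, each download can later be segmented by the landing page that produced it.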

Google, Bing, et al. will also append query parameters to your page that you might want to use to override or supplement the above field data stored in Drupal.  For instance, you might want to analyze which keywords lead to a higher conversion rate for a given landing page.
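One way to picture the override is as a merge: start with the values stored on the node, then let a whitelisted set of incoming query parameters replace or extend them. The sketch below is in Python for illustration only; the whitelist `OVERRIDABLE`, the function `tracking_properties`, and the parameter names are hypothetical, not anything Drupal or the ad networks define.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical whitelist of parameters an ad click is allowed to override.
OVERRIDABLE = ("campaign_id", "source", "keyword")


def tracking_properties(stored_fields, request_url):
    """Merge stored field values with whitelisted query parameters from the request."""
    incoming = parse_qs(urlsplit(request_url).query)
    merged = dict(stored_fields)
    for name in OVERRIDABLE:
        if name in incoming:
            # If a parameter is repeated, the last value wins.
            merged[name] = incoming[name][-1]
    return merged


props = tracking_properties(
    {"nid": 42, "campaign_id": "default", "source": "unknown"},
    "https://example.com/landing/widgets?source=bing&keyword=blue+widgets&foo=bar",
)
print(props)
```

Parameters outside the whitelist (`foo` here) are ignored, so arbitrary query strings cannot leak into the data you analyze.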

The trouble with page cache

Naturally, these landing pages need to be blazingly fast; user attention is measured in hundreds of milliseconds, so every bit of time a user spends waiting for your page to load is greater potential for abandonment.  Luckily, like any other page on your Drupal installation, each landing page node will be cached at the page level, and Drupal's page cache can handily serve sub-second page requests.  Even better, every visit from a paid search ad is going to be anonymous, so all of your landing page visits will be served from cache, right?

As it turns out, very few visits to your landing pages will be served from cache, and the main reason is all of that data we need to analyze landing page performance.

When a user clicks on your Google ad, Google sends them to a page like this:

While this provides fantastic data we can use to perform in-depth analysis, the trouble is that Drupal will use this entire string as the primary key associated with the page cache.

If the next visitor uses a different keyword to see your ad, Drupal generates a completely new page, even if it's identical to the last visitor's experience.  If you have multiple campaigns pointing to the same landing page, a new page will have to be generated from scratch for each one.  If you use multiple paid search providers and point them to the same page, each one will have unique query parameters, resulting in more uncached page loads.

Most troubling is the gclid appended at the very end of the example string above.  It's a globally unique identifier that Google generates for each ad click, virtually guaranteeing an uncached page load every time.
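The effect is easy to reproduce with a toy cache keyed on the full URL. The Python sketch below is a deliberate simplification, not Drupal's actual cache code: the dictionary `page_cache`, the `render_page` stand-in, the `keyword` parameter name, and the gclid values are all made up for the example.

```python
from urllib.parse import urlsplit

page_cache = {}   # toy page cache: full URL -> rendered HTML
render_count = 0  # how many times we had to build a page from scratch


def render_page(path):
    """Stand-in for an expensive full page build."""
    global render_count
    render_count += 1
    return f"<html>landing page for {path}</html>"


def serve(url):
    """Serve a page, using the entire URL (query string included) as the cache key."""
    if url not in page_cache:
        page_cache[url] = render_page(urlsplit(url).path)
    return page_cache[url]


base = "https://example.com/landing/widgets"
clicks = [
    f"{base}?keyword=blue+widgets&gclid=AAA111",
    f"{base}?keyword=blue+widgets&gclid=BBB222",
    f"{base}?keyword=cheap+widgets&gclid=CCC333",
]
pages = [serve(url) for url in clicks]

print(render_count)     # 3: every click missed the cache
print(len(set(pages)))  # 1: yet all three responses are identical
```

Even the two clicks that share a keyword miss the cache, because the gclid alone makes each URL unique.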

What to do?

Since we can't rely on page cache for landing pages (and in fact, much of the dynamic form appending and URL rewriting we'd be doing to capture data depends on there being no page cache in place), the next thing to look at is optimizing everything else we can on the page: look into what block cache (and Block Cache Alter) can do for you, look into Views caching if applicable, and consider whether Entity cache may be useful.  The right mix will vary from site to site and, unfortunately, I don't believe it will deliver the order-of-magnitude gains seen with page cache.

A more interesting solution, proposed in this thread by mikeytown (maintainer of several high-performance Drupal modules like Boost and Advagg), is to remove specific query parameters at the server level or in settings.php, thereby tricking Drupal into getting and setting the page cache using a cleaner URL string.  This will almost certainly have unintended consequences and should be implemented with extreme caution.

Neither of these is particularly satisfying.  It seems that the need for better data is at odds with the need for performance.

Have you noticed poor performance on your landing pages?  If so, how have you solved the problem?
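For anyone who wants to reason about the parameter-stripping idea before trying it, here is a minimal sketch of the normalization step, written in Python rather than as a server rule or settings.php snippet. The `TRACKING_PARAMS` set and the `cache_key` function are assumptions for illustration; only `gclid` comes from the discussion above, and the real list would depend on your ad providers.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that should not affect the cached page.
TRACKING_PARAMS = {"gclid", "keyword", "campaign_id", "source"}


def cache_key(url):
    """Return the URL with tracking parameters removed, for use as a cache key."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in TRACKING_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


base = "https://example.com/landing/widgets"
print(cache_key(f"{base}?keyword=blue+widgets&gclid=AAA111"))
print(cache_key(f"{base}?page=2&gclid=BBB222"))
```

Every ad click now maps to the same key, while parameters that genuinely change the page (such as `page`) still produce distinct entries. The catch is the one noted above: a page served from cache under the cleaned key can no longer vary its server-side output per click, so any data capture that depends on those parameters has to happen elsewhere.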