
Blocks, caching, and context


David Strauss was in Chicago for Tek-X last week, so I managed to grab him for a few hours to chew over the question of caching and performance. It was an extremely productive meeting, so I'm going to try to summarize our notes here for posterity and further chewing.

Here's the basic challenge:

For ESI caching (Varnish, et al) to work, we need to be able to isolate a specific Block to load and render it, with the context information it should have, independently of the rest of the page. For example, the ideal case is that we could serve a fully cached page to everyone, including authenticated users, except for the Navigation block that has user-sensitive information such as selected menu items they can see, the "Hello, $username!" string, etc. Then the page gets cached in Varnish, and on subsequent page requests all Varnish does is call back to Drupal to render just that one block for the current user then drop that block into its cached page and print it back to the browser. All Drupal is doing at that point is rendering user-context-sensitive blocks; the rest of the page it almost never cares about.

For that to work, though, we absolutely must be able to know what context information is appropriate for that block, and what information to send it. Reconstructing the entire context object is needlessly expensive and wasteful. That is, we'd want to have a URL such as /esi/$block_id?uid=5&path=/some/page/here. That URL would result in rendering the block configured as $block_id (say, the Navigation menu block plus its configuration) with the context of user 5 and a path of /some/page/here. The entire rest of the page process is skipped, and we can then go about streamlining bootstrap to do less work. It's actually the exact same callback that would be used for AHAH rendering of blocks. Both Varnish and Drupal can then do additional caching of that rendered block as appropriate.

In practice the URL would be less human-friendly than that, and would probably have a serialized or URL-encoded context array rather than separate keys. That's an implementation detail, though. I just mention it for completeness.
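To make that a little more concrete, here's a minimal sketch of packing the context into a single URL-encoded query parameter and unpacking it again on the other side. The function names esi_block_url() and esi_context_from_query() and the "context" parameter are made up for illustration; nothing here is an agreed-upon format.

<?php
// Hypothetical helper: build the callback URL for one block from the
// context values that block needs. All names here are illustrative.
function esi_block_url($block_id, array $context) {
  // Sort by key so the same context always yields the same URL,
  // which matters if Varnish is going to cache on it.
  ksort($context);
  $query = http_build_query(array('context' => $context));
  return '/esi/' . rawurlencode($block_id) . '?' . $query;
}

// The reverse step, for the callback that handles /esi/$block_id.
function esi_context_from_query($query_string) {
  parse_str($query_string, $params);
  return isset($params['context']) ? $params['context'] : array();
}

$url = esi_block_url('navigation', array('uid' => 5, 'path' => '/some/page/here'));
$context = esi_context_from_query(parse_url($url, PHP_URL_QUERY));
?>

Note that everything comes back out of the query string as strings (the uid is '5', not 5), so whatever consumes the context would need to cope with that.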

In order for that to work, we need a way to know exactly what context $block_id is going to use. And not just that block, but all of the things that block may end up calling.

As an aside, this is why it's imperative that we only support injected context. Context that cannot be injected becomes incompatible with Varnish, AHAH callbacks, and pretty much anything else that bypasses the "render the entire everything" process. That's going to be a lot of things...

There are two ways that we could determine what context a given block uses: require the block author to specify it explicitly in the definition hook, or derive it automatically. Both have drawbacks. Explicit specification puts a large burden on the module author. Deriving it is more complex, and if some context is only used conditionally (the user has role X only if the current OG is Y) it could get missed.

However, it is possible to do both. That is, derive most context and allow the block author to also specify context keys that may not derive easily. Context is relevant for a given block if it matches either criterion.

In essence, then, the context keys that a given block uses, together with their values, become its cache key, and that cache key can be used by Varnish, Drupal's own caching system, or anything else. That's much more robust than trying to cache on derived SQL strings as Drupal 7 attempts to do, and gives us much more flexibility to do "fancy stuff" with Blocks since we know so much more about them. That also means partial page caching becomes really easy, even without Varnish.
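As a rough sketch of what "context as cache key" could mean in practice, assuming the context has already been reduced to scalar identifiers (a uid, a path, a nid): block_cache_key() is a hypothetical helper, not an existing Drupal function, and the md5-of-serialize scheme is just one easy way to do it.

<?php
// Hypothetical helper: turn a block ID plus the context identifiers the
// block used into a cache key.
function block_cache_key($block_id, array $context_ids) {
  // Normalize to strings so that 5 and '5' give the same key.
  $context_ids = array_map('strval', $context_ids);
  // Key order must not affect the result.
  ksort($context_ids);
  return $block_id . ':' . md5(serialize($context_ids));
}

$a = block_cache_key('navigation', array('uid' => 5, 'path' => '/some/page/here'));
$b = block_cache_key('navigation', array('path' => '/some/page/here', 'uid' => '5'));
$c = block_cache_key('navigation', array('uid' => 6, 'path' => '/some/page/here'));
// $a and $b are the same key; $c differs because the user differs.
?>

A block that uses no user context at all would simply have no uid in its key, and so would be shared across all users for free.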

David went so far as to suggest that we could do a poor-man's ESI within Drupal's own caching system by caching the page with ESI tags in it, and then running a pre-process on it to find the ESI tags and do the same single-block-render routine all in PHP without having a second page request. It would just be a routine that mocks up a context object to match the ESI tag and renders that one block, then does a str_replace(). (That makes context mocking even more critical, and not just for testing.)
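Something like the following is what I have in mind for that pre-process step. It's only a sketch: render_single_block() is a placeholder for the real single-block-render routine, the tag format assumes the /esi/$block_id?... URLs described above, and I've used preg_replace_callback() with a closure (so PHP 5.3) rather than a literal str_replace() because it finds and swaps the tags in one pass.

<?php
// Placeholder for the single-block-render routine. A real one would mock
// up a context object from $context and render the configured block.
function render_single_block($block_id, array $context) {
  return '<div class="block-' . $block_id . '">Hello, user ' . $context['uid'] . '!</div>';
}

// Find ESI include tags in a cached page and replace each one with the
// freshly rendered block, all within the same PHP request.
function poor_mans_esi($cached_page) {
  $pattern = '#<esi:include src="/esi/([^"?]+)\?([^"]*)"\s*/>#';
  return preg_replace_callback($pattern, function ($matches) {
    // The query string sits inside an attribute, so decode &amp; first.
    parse_str(html_entity_decode($matches[2]), $context);
    return render_single_block($matches[1], $context);
  }, $cached_page);
}

$cached = '<body><esi:include src="/esi/navigation?uid=5&amp;path=/some/page/here" /><p>Cached content</p></body>';
$page = poor_mans_esi($cached);
?>

Because the cached page already carries the same tags Varnish would consume, the same cached markup could in principle serve both setups.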

Note: The code in the sections that follow is there only because it's the easiest way to explain what we talked about, and it should be viewed as pseudo-code using PHP syntax, not as actual code implementations.

That has implications for how the context system works. Specifically, we would need a way to round-trip context keys from key (primary_node) to value (nid 12) to loaded value (the $node object) and back again, because we do not want to serialize the entire $node into the context cache string, just its nid. The easiest way to do that is to make the context callback for each key an object, and give it methods to return whichever of those keys we want at any given time.

David also made the suggestion that we can avoid the overhead of a registry hook with magic class naming, much as we do for database drivers. The only downside is that, like DB drivers, they would most likely have funky names with underscores in them, which is otherwise a coding standards violation.

That is, the primary node context would work something like this:

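<?php
$node = $context->get('primary_node');

class Context {

  public function get($key) {
    // Magic naming: the handler class is derived from the context key.
    $class = "Context_" . $key;
    $handler = new $class($this);
    return $handler->getValue();
  }

}

class Context_primary_node implements ContextInterface {

  protected $context;

  // The handler is handed the context object it belongs to.
  public function __construct($context) {
    $this->context = $context;
  }

  public function getValue() {
    $nid = $this->context->arg(1);
    return node_load($nid);
  }

  public function getIdentifier() {
    return $this->getValue()->nid;
  }
}
?>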

Give or take lots of caching. And then there'd be some mechanism (which I'm not thinking through yet) to insert that nid into Context_primary_node in place of checking the URL.
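Just to show that such a mechanism is plausible, here is one very rough way it could go: the context object keeps a list of injected identifiers and hands the matching one to the handler, which only falls back to the URL when nothing was injected. The inject() method and the extra constructor argument are purely my invention for this sketch, node_load() is stubbed so the example stands alone, and I've left ContextInterface off for the same reason.

<?php
// Stand-in for Drupal's node_load() so the sketch runs on its own.
if (!function_exists('node_load')) {
  function node_load($nid) {
    return (object) array('nid' => $nid, 'title' => 'Node ' . $nid);
  }
}

class Context {
  protected $args;
  // Hypothetical: identifiers supplied from outside, e.g. by an ESI URL.
  protected $injected = array();

  public function __construct(array $args = array()) {
    $this->args = $args;
  }

  public function arg($index) {
    return isset($this->args[$index]) ? $this->args[$index] : NULL;
  }

  public function inject($key, $identifier) {
    $this->injected[$key] = $identifier;
  }

  public function get($key) {
    $class = 'Context_' . $key;
    $id = isset($this->injected[$key]) ? $this->injected[$key] : NULL;
    $handler = new $class($this, $id);
    return $handler->getValue();
  }
}

class Context_primary_node {
  protected $nid;

  public function __construct(Context $context, $nid = NULL) {
    // Use the injected nid if there is one, else fall back to the URL.
    $this->nid = isset($nid) ? $nid : $context->arg(1);
  }

  public function getValue() {
    return node_load($this->nid);
  }

  public function getIdentifier() {
    return $this->nid;
  }
}

// Normal page request for node/12: the nid comes from the path.
$from_url = new Context(array('node', 12));
// ESI callback: no node path at all, so the nid is injected.
$from_esi = new Context(array('esi', 'navigation'));
$from_esi->inject('primary_node', 12);
?>

In this version getIdentifier() no longer has to load the node just to report its nid, which is what we'd want when building the context cache string.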

As a side effect, registering a new context responder becomes dead simple: Write a class with the right name and poof. It also prevents duplicates; that would cause a PHP parser error. :-) (Actually it might not since the classes would autoload, but it becomes an issue for the autoloader to sort out rather than the context system.)

In order to derive the used context keys, we'd need to introduce an extra thin layer in the block rendering. Remember that the context needed by a block isn't just that block, it's that block plus all of the blocks underneath it (if it's a Block Region). For that, we'd need to split the "render a child block" process into two parts. It's easier to just show it in code, I think:

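<?php
class BlockRegion extends Block {

  public function render() {
    $rendered = array();
    foreach ($this->blocks as $block_info) {
      $block = new $block_info['class']($this->context);
      $rendered[$block_info['block_id']] = $block->generate();
      // array_merge() rather than +=, since these are plain lists of keys
      // and += would drop entries with matching numeric indexes.
      $this->usedContext = array_merge($this->usedContext, $block->getUsedContext());
    }

    // Use $rendered and the configured template/layout to
    // generate the string of this "block".
  }

}

abstract class Block {

  protected $context;
  protected $usedContext = array();
  protected $info_hook = array('context_keys' => array());

  public function __construct($context) {
    $this->context = $context;
  }

  public function generate() {
    $this->context->startTracking();
    $output = $this->render();
    // Merge rather than overwrite, so that keys collected from child
    // blocks during render() are not thrown away.
    $this->usedContext = array_merge($this->usedContext, $this->context->stopTracking());
    return $output;
  }

  // Individual block classes would implement this as appropriate.
  abstract public function render();

  public function getUsedContext() {
    // The important context is whatever we detected plus
    // whatever the block author said was important.
    return array_unique(array_merge($this->usedContext, $this->info_hook['context_keys']));
  }
}
?>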

And then a given context object would have an internal system to track when a given context key is requested, relative to its currently open tracking layers.
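One way that internal tracking could work, sketched under the assumption that tracking layers nest like a stack and that a key requested by an inner block also counts as used by every block around it. TrackingContext and its plain array of values are stand-ins for the real context object, which we haven't designed yet.

<?php
// Hypothetical tracking context: every get() is recorded in all of the
// tracking layers that are open at the time.
class TrackingContext {
  protected $values;
  protected $layers = array();

  public function __construct(array $values) {
    $this->values = $values;
  }

  public function startTracking() {
    // Open a new, empty layer on top of the stack.
    $this->layers[] = array();
  }

  public function stopTracking() {
    // Close the innermost layer and return the keys it saw.
    $layer = array_pop($this->layers);
    return $layer ? array_keys($layer) : array();
  }

  public function get($key) {
    foreach (array_keys($this->layers) as $i) {
      $this->layers[$i][$key] = TRUE;
    }
    return isset($this->values[$key]) ? $this->values[$key] : NULL;
  }
}

$context = new TrackingContext(array('uid' => 5, 'path' => '/some/page/here'));
$context->startTracking();           // Outer block (say, a region).
$context->get('path');
$context->startTracking();           // Inner block.
$context->get('uid');
$inner = $context->stopTracking();   // array('uid')
$outer = $context->stopTracking();   // array('path', 'uid')
?>

The inner block only reports the uid, while the region around it reports both keys, which is the "this block plus everything underneath it" behavior described above.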

David, please feel free to correct me if I got something wrong or misrepresented something. :-)

Any more chewing to do?

