Saturday 6 August 2016

Speeding up Dokuwiki

I'm a big fan of Dokuwiki.
  • it's simple
  • has a great ecosystem of plugins
  • has great performance
But some time ago I decided there was room for improvement, so I wrote a very simple framework (itself implemented as a Dokuwiki plugin). I've just uploaded it to GitHub.

Specifically, it provides:
  • Much faster page loading using PJAX
  • Pure JavaScript/CSS extensions - no PHP required
  • Protection against JavaScript injection by page editors
PJAX page loading requires small changes to the template so that, when a PJAX request is made, it excludes everything but the page-specific content (i.e. navigation elements and the rendered markup). Instead of a complete HTML document, you return just a well-formed HTML fragment when the request is flagged as coming from PJAX. There is an example template here. Although the template it is based on is already rather complex, the actual changes to it, or to any existing template, are only a few lines of code - see the diff in the README.
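
To give a feel for the size of the change, here is a rough sketch of the idea in plain PHP. It is not the code from the example template: is_pjax_request() and render_page() are hypothetical helpers, the stylesheet and script names are placeholders, and the detection assumes the jquery-pjax convention of sending an "X-PJAX" request header (which PHP exposes as HTTP_X_PJAX).

```php
<?php
// Sketch only: these helpers are hypothetical, not part of Dokuwiki or the plugin.

/** Assumes the client marks PJAX requests with an "X-PJAX" request header. */
function is_pjax_request(array $server): bool
{
    return isset($server['HTTP_X_PJAX']);
}

/** Return only the page-specific fragment for PJAX, the full document otherwise. */
function render_page(string $title, string $contentHtml, bool $pjax): string
{
    $titleTag = '<title>' . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '</title>';

    if ($pjax) {
        // No <html>, <head>, CSS or JavaScript: the browser already has those.
        return $titleTag . $contentHtml;
    }

    return '<!DOCTYPE html><html><head>' . $titleTag
         . '<link rel="stylesheet" href="style.css">'   // placeholder asset
         . '<script src="site.js"></script>'            // placeholder asset
         . '</head><body><div id="pjax-container">' . $contentHtml . '</div>'
         . '</body></html>';
}
```

In a template this would be called along the lines of echo render_page($title, $html, is_pjax_request($_SERVER)); the only branch that differs between the two kinds of request is the wrapper around the content.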

This saves me around 450 milliseconds per page in loading time.

The savings come from not having to parse the CSS and JavaScript in the browser. The server-side content generation time is not noticeably affected.

But even if you are not using Dokuwiki you can get the same benefits using PJAX on your CMS of choice.

A strict Content Security Policy provides great protection against XSS attacks. But the question then arises of how to get run-time generated data routed to the right bit of code. Jokuwiki solves this by embedding JSON in data-* attributes, including the entry point for execution.
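
The server side of that pattern might look roughly like the sketch below. The attribute name "data-jw", the keys "init" and "args", and the function widget_placeholder() are all hypothetical and chosen for illustration; they are not taken from the Jokuwiki source. The point is only that the data travels as an escaped attribute value rather than as an inline script, so it stays inert under a strict policy until a script loaded from an allowed external file reads it, parses the JSON and calls the named entry point.

```php
<?php
// Sketch only: "data-jw", "init" and "args" are hypothetical names.

/**
 * Build a placeholder element carrying its run-time data as JSON.
 * $entryPoint names the client-side function that should handle it.
 */
function widget_placeholder(string $entryPoint, array $args): string
{
    $json = json_encode(
        ['init' => $entryPoint, 'args' => $args],
        JSON_THROW_ON_ERROR
    );

    // Escaping the JSON for an attribute context means nothing supplied
    // by a page editor can break out of the attribute.
    return '<div class="widget" data-jw="'
         . htmlspecialchars($json, ENT_QUOTES, 'UTF-8')
         . '"></div>';
}
```

Because the entry point is just a name in the data, the client can look it up in a list of registered handlers and ignore anything it does not recognise.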

Wednesday 22 June 2016

Faster and more Scalable Session handling

Chapter 18 (18.13 specifically) looks at PHP session handling, which can be a major bottleneck. I suggested there were options for reducing the impact, top of which was to use a faster substrate for the session data. But no matter how fast the storage is, it won't change the fact that (by default) concurrency control is implemented by locking the data file, so concurrent requests in the same session are processed one at a time.

While I provided an example in the book of propagating authentication and authorization information securely via the URL, removing the need to open the session in the linked page, sometimes you need access to the full session data.

Recently I wrote a drop-in replacement for the default handler. It is completely compatible with the default handler (you can mix and match the two in the same application) but it does not lock the session data file. It struck me that the session handler does a lot of different jobs, each of which a custom handler might do differently. Rather than create every possible combination of storage / representation / replication / concurrency control, I adapted my handler API to allow multiple handlers to be stacked to create a custom combination.

The code (including an implementation of the non-blocking handler) is available on PHPClasses.
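
To illustrate what stacking means, here is a minimal sketch built on PHP's standard SessionHandlerInterface. It is not the API of the code on PHPClasses: the class names ArrayStorageHandler and Base64Handler are hypothetical, the array store is only a stand-in for real storage, and the signatures assume PHP 8. Each layer does one job and delegates everything else to the handler it wraps.

```php
<?php
// Sketch only: hypothetical classes, assuming PHP 8.

/** Innermost layer: keeps session data in an array (stand-in for real storage). */
final class ArrayStorageHandler implements SessionHandlerInterface
{
    /** @var array<string,string> */
    public array $store = [];

    public function open(string $path, string $name): bool { return true; }
    public function close(): bool { return true; }
    public function read(string $id): string|false { return $this->store[$id] ?? ''; }
    public function write(string $id, string $data): bool { $this->store[$id] = $data; return true; }
    public function destroy(string $id): bool { unset($this->store[$id]); return true; }
    public function gc(int $max_lifetime): int|false { return 0; }
}

/** Stackable layer: changes the representation, leaves storage to the inner handler. */
final class Base64Handler implements SessionHandlerInterface
{
    public function __construct(private SessionHandlerInterface $inner) {}

    public function open(string $path, string $name): bool { return $this->inner->open($path, $name); }
    public function close(): bool { return $this->inner->close(); }
    public function destroy(string $id): bool { return $this->inner->destroy($id); }
    public function gc(int $max_lifetime): int|false { return $this->inner->gc($max_lifetime); }

    public function write(string $id, string $data): bool
    {
        return $this->inner->write($id, base64_encode($data));
    }

    public function read(string $id): string|false
    {
        $raw = $this->inner->read($id);
        if ($raw === false || $raw === '') {
            return $raw;                       // nothing stored yet
        }
        $decoded = base64_decode($raw, true);  // strict mode: false on bad input
        return $decoded === false ? '' : $decoded;
    }
}
```

A stack like this would be registered with session_set_save_handler(new Base64Handler(new ArrayStorageHandler()), true); further layers (replication, a different locking policy) can wrap it in the same way.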

One thing I omitted to mention in the book is that when session_start() is called, it sets the Cache-Control and Expires headers to prevent caching (this is the behaviour of the default session.cache_limiter setting, nocache):

Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0



If you want your page to be cacheable, then there is a simple two-step process:

  1. Check - are you really, REALLY sure you want the content to be cacheable and use sessions? If you are just implementing access control, then the content *may* be stored on the user's disk.
  2. Add an appropriate set of headers after session_start():
// Allow caching for one hour; these replace the values set by session_start()
header('Expires: '.gmdate('D, d M Y H:i:s \G\M\T', time() + 3600));
header('Cache-Control: max-age=3600');
header('Vary: Cookie'); // ...but using HTTPS would be better