Saturday 13 June 2020

HTTP2 and 421 errors

I am in the process of migrating an online B2B enterprise built on design patterns from a 1990's dial-up ISP to a more modern infrastructure. It has taken over 20 years of effort to build a network of services which are difficult to manage to and insecure. Replacing this, with no downtime, is something of a challenge. A really important stepping stone to the target architecture is getting every exposed service routed via a proxy. This makes it much, MUCH simpler for us to:

  • (re)route services within our network
  • Upgrade services to HTTP2
  • Provision certificates / manage encryption
  • Configure browser-side caching
  • Analyse traffic
  • ...and more 


Running a handful of BigIp F5s is unfortunately not an option, so my switchboard is a stack of Ubuntu + nginx.

Up till recently the onboarding exercise had gone really well - but then I started encountering 421 errors. My initial reading kept leading me to old bugs in Chrome and other issues with HTTP2. These are neatly summarized by Kevin as follows:

This is caused by the following sequence of events:
  1. The server and client both support and use HTTP/2.
  2. The client requests a page at foo.example.com.
  3. During TLS negotiation, the server presents a certificate which is valid for both foo.example.com and bar.example.com (and the client accepts it). This could be done with a wildcard certificate or a SAN certificate.
  4. The client reuses the connection to make a request for bar.example.com.
  5. The server is unable or unwilling to support cross-domain connection reuse (for example because you configured their SSL differently and Apache wants to force a TLS renegotiation), and serves HTTP 421.
  6. The client does not automatically retry with a new connection (see for example Chrome bug #546991, now fixed). The relevant RfC says that the client MAY retry, not that it SHOULD or MUST. Failing to retry is not particularly user-friendly, but might be desirable for a debugging tool or HTTP library.
However, I was able to reproduce the bug in the first request from a browser instance I had just started. Also I was able to see the issue across sites using distinct certificates. I was confused.

Eventually I looked at the logs for the origin server (Apache 2.4.26). I hadn't considered that before as I knew it did not support HTTP2. But low and behold, there in the logs, a 421 error against my request.

[Fri Jun 12 18:44:47.945706 2020] [ssl:error] [pid 21423:tid 140096556701440] AH02032: Hostname foo.example.net provided via SNI and hostname bar.example.com provided via HTTP have no compatible SSL setup

So disabling SSL session re-use on the connection from the nginx proxy to the origin resolved the issue:

proxy_ssl_session_reuse off;

This does mean slightly more overhead between the proxy and the origin, but since both are on the same LAN, its not really noticeable.

Although its something of a gray area, I don't think this is a bug with nginx - I had multiple sites in nginx pointing at the same backend URL. It would be interesting to check if the issue occurs when I have multiple unique DNS names for the origin - one for each nginx front end - if it still occurs there, then that is probably a bug.