/drupal-with-nginx

Running Drupal using nginx: an idiosyncratically crafted bleeding edge configuration.

Nginx configuration for running Drupal

Introduction

This is an example configuration for running Drupal using nginx, which is a high-performance non-blocking HTTP server.

Nginx doesn't use a module like Apache does for PHP support. The Apache module approach simplifies a lot of things because what you have in reality is nothing less than a PHP engine running on top of the HTTP server.

Instead nginx uses FastCGI to proxy all requests for PHP processing to a PHP FastCGI daemon that waits for incoming requests and then handles the PHP file being requested.

Although the fcgi approach is more cumbersome to set up it provides a greater degree of control over which actions are permitted, hence greater security.
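As a rough illustration of the FastCGI approach, the sketch below hands a single PHP script to a FastCGI daemon. The phpcgi upstream name and the php-fpm socket path are the ones this document refers to further down; the server name and document root are placeholders, and the configuration files in the repository are more elaborate than this.

```nginx
# Both blocks live inside the http context.
# Assumed: a PHP FastCGI daemon (php-cgi or php-fpm) listening on this socket.
upstream phpcgi {
    server unix:/var/run/php-fpm.sock;
}

server {
    listen 80;
    server_name example.com;        # placeholder domain
    root /var/www/example.com;      # placeholder document root

    # Only this exact script is handed to the PHP daemon.
    location = /index.php {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass phpcgi;
    }
}
```

Because the script is matched by an exact location, any PHP file not explicitly listed never reaches the daemon, which is where the extra control comes from.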

This configuration started life as a fork of yhager's configuration, tempered by the omega8cc and Brian Mercer configurations.

I've since changed it substantially. I tried to remove, as best as I can, the traces of bad habits promoted by Apache's configuration logic, namely the use of a .htaccess file and what it entails in terms of reverse logic in the server configuration. I've incorporated tidbits and advice gathered, mostly, from the nginx mailing list and the nginx Wiki.

Layout

The configuration comes in two flavors:

  1. Drupal 6.

  2. Drupal 7.

Furthermore there are two options for each configuration:

  1. A non drush aware option that uses wget/curl to run cron and updates the site using update.php, i.e., via a web interface.

  2. A drush aware flavor that runs cron and updates the site using drush.

    To get drush to run cron jobs the easiest way is to define your own site aliases. See the example aliases file example.aliases.drushrc.php that comes under the examples directory in the drush distribution.

    Example: You create the aliases for example.com and example.net, with aliases @excom and @exnet respectively.

    Your crontab should contain something like:

    COLUMNS=80
    DRUSH=/full/path/to/drush
    */50 * * * * $DRUSH @excom cron -q
    1 2 * * * $DRUSH @exnet cron -q
    

    This means that the cron job for example.com will be run at minutes 0 and 50 of every hour (in the minute field */50 matches only 0 and 50, so it is not a true 50 minute interval) and the cron job for example.net will be run every day at 02:01 hours. Check section 7 of the Drupal INSTALL.txt for further details about running cron.

    Note that /full/path/to/drush is the path to the shell script wrapper that comes with drush, not to the drush.php script. If using drush.php then add php in front of the /path/to/drush.php.

Configuration Selection Algorithm

  1. I'm not using Boost:

    • On drupal 7 use the drupal.conf config in your vhost (server block): include sites-available/drupal.conf;.

    • On drupal 6 use the drupal6.conf config in your vhost (server block): include sites-available/drupal6.conf;.

  2. I'm using Boost for caching on my drupal site.

    • On drupal 7 use the drupal_boost.conf config in your vhost (server block): include sites-available/drupal_boost.conf;.

    • On drupal 6 use the drupal_boost6.conf config in your vhost (server block): include sites-available/drupal_boost6.conf;.

  3. I'm not using drush for updating and running cron. In that case you should additionally include the drupal_cron_update.conf config in your vhost (server block): include sites-available/drupal_cron_update.conf;
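To make the selection concrete, here is a minimal sketch of a vhost for one combination: Drupal 7, no Boost, no drush. The include file names are the ones listed above; the server name and document root are placeholders, and the shipped example.com.conf contains more than this.

```nginx
# Placeholder vhost: Drupal 7, without Boost, cron and updates via the web interface.
server {
    listen 80;
    server_name example.com;       # placeholder domain
    root /var/www/example.com;     # placeholder document root

    # Case 1: not using Boost on Drupal 7.
    include sites-available/drupal.conf;

    # Case 3: not using drush, so cron.php and update.php must be reachable.
    include sites-available/drupal_cron_update.conf;
}
```

For a Boost site, or for Drupal 6, swap the first include for the matching file from the list.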

Boost and Drupal 6

Standard Drupal 6 core also sets cookies for anonymous users. Therefore the following map directive from map_cache.conf will result in the Boost generated pages not being served.

map $http_cookie $no_cache {
    default 0;
    ~SESS 1; # PHP session cookie
}

If you're using the standard Drupal 6 without no_anon then the cache busting map directive is:

map $http_cookie $no_cache {
    default 0;
    ~DRUPAL_UID 1; # DRUPAL_UID cookie
}

This is properly documented in map_cache.conf.

Drupal 6 Global Redirect and the 0 Rewrites Configuration

There's a setting that is enabled by default in globalredirect that removes the trailing slash in the URIs. That setting creates a redirect loop with the 0 rewrites config provided by sites-available/drupal.conf or sites-available/drupal_boost.conf if using Boost.

There are two ways to deal with that:

  1. Install the nginx_fast_config module, which takes care of this setting by removing it from the settings form at /admin/settings/globalredirect and presenting a status line on the status page at /admin/reports/status. This module fixes the issue for you.

  2. Take care of the deslash setting yourself by disabling it at /admin/settings/globalredirect. Note that this is enabled by default.

This is strictly a drupal 6 issue.

General Features

  1. The use of two server directives to do the domain name rewriting, usually redirecting www.example.com to example.com or vice-versa, as recommended in the nginx Wiki Pitfalls page.

  2. Clean URL support.

  3. Access control for cron.php. It can only be requested from a set of IP addresses you specify. This is for the non drush aware version.

  4. Support for the Boost module.

  5. Support for virtual hosts. The example.com.conf file.

  6. Support for Sitemaps RSS feeds.

  7. Support for the Filefield Nginx Progress module for the upload progress bar.

  8. Use of non-capturing regex for all directives that are not rewrites that need to use URI components.

  9. IPv6 and IPv4 support.

  10. Support for private file serving in drupal.

  11. Support for hotlink protection of imagecache generated images.

  12. If you're using php-cgi with UNIX sockets in a /tmp/ subdirectory with permissions 700, i.e., accessible only to the user running the process, you may consider the init script that I make available here on github that launches the PHP FastCGI daemon and spawns new instances as required. This is not needed if you're using php-fpm.

  13. End of the expensive 404s that Drupal usually handles when using Apache with the default .htaccess.

  14. Possibility of using Apache as a backend for dealing with PHP. Meaning using Nginx as reverse proxy.

  15. Advanced Help support.

  16. Advanced Aggregation support.

  17. Microcaching support for both anonymous and authenticated users.
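The first feature in the list, two server blocks for domain name rewriting, can be sketched as follows. This is a minimal illustration that redirects www.example.com to example.com with a return directive; the repository's own vhost file may use a different directive or status code.

```nginx
# Inside the http context.

# First server block: exists only to redirect the www host.
server {
    listen 80;
    server_name www.example.com;
    return 301 $scheme://example.com$request_uri;
}

# Second server block: the real site.
server {
    listen 80;
    server_name example.com;
    # ... the rest of the vhost configuration goes here ...
}
```

Keeping the redirect in its own server block avoids evaluating an if on the Host header for every request.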

Secure HTTP aka SSL/TLS support

  1. By default and since version 0.8.21 only SSLv3 and TLSv1 are supported. The anonymous Diffie-Hellman (ADH) key exchange and MD5 message authentication algorithms are not supported. They can be enabled explicitly but due to their insecure nature they're discouraged. The same goes for SSLv2.

  2. SSL/TLS shared cache for SSL session resume support of 10 MB. SSL session timeout is set to 10 minutes.

  3. Note that for session resumption to work the SSL socket must, at least, be set as the default. Meaning a listen directive like this:

    listen [::]:443 ssl default_server;

    This is so because session resumption takes place before any TLS extension is enabled, namely Server Name Indication. The ClientHello message requests a session ID from a given IP address (server). Therefore the default server setting is required.

    Another option, the one I've chosen here, is to move the ssl_session_cache directive to the http context setting. Of course the downside of this approach is that the ssl_session_cache settings are the same for all configured virtual hosts.
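A minimal sketch of the second option, with the session cache declared once in the http context. The 10 MB size and the 10 minute timeout are the values stated above; the zone name SSL is a placeholder.

```nginx
http {
    # 10 MB shared cache for session resumption; "SSL" is a placeholder zone name.
    ssl_session_cache shared:SSL:10m;
    # Sessions can be resumed for 10 minutes.
    ssl_session_timeout 10m;

    # ... server blocks for the virtual hosts go here ...
}
```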

Security Features

  1. No direct access to PHP scripts. All PHP scripts, including index.php, are accessed only internally.

  2. The use of a default configuration file to block all illegal Host HTTP header requests.

  3. Access control using HTTP Basic Auth for install.php and other Drupal sensitive files. The configuration expects a password file named .htpasswd-users in the top nginx configuration directory, usually /etc/nginx. I provide an empty file. This is also for the non drush aware version.

    If you're on Debian or any of its derivatives like Ubuntu you need either the thttpd-util or apache2-utils package installed.

    With thttpd-util create your password file by issuing:

    thtpasswd -c .htpasswd-users

    With apache2-utils create your password file by issuing:

    htpasswd -d -b -c .htpasswd-users <user> <password>

    You should delete this command from your shell history afterwards with history -d <command number> or alternatively omit the -b switch, then you'll be prompted for the password.

    This creates the file (there's a -c switch). For adding additional users omit the -c.

    Of course you can rename the password file to whatever you want, then accordingly change its name in drupal_boost.conf.

  4. Support for X-Frame-Options HTTP header to avoid Clickjacking attacks.

  5. Protection of the upload directory. You can try to bypass the UNIX file utility or the PHP Fileinfo extension and upload a fake jpeg:

    echo -e "\xff\xd8\xff\xe0\n<?php echo 'hello'; ?>" > test.jpg
    

    If you run php test.jpg you get 'hello'. The fact is that all files with php extension are either matched by a particular location, as is the case for xmlrpc.php, update.php and install.php or match the last directive of the configuration:

    location ~* ^.+\.php$ {
      return 404; 
    }
    

    Returning a 404 (Not Found) for every PHP file not matched by all the previous locations.

    Note that index.php is accessed only indirectly, meaning it is always reached from within the Nginx config. You cannot access it directly from outside.

  6. Use of Strict Transport Security for enhanced security. During the specified period it forces the configured domain to be contacted only over HTTPS. Requires a modern browser to be of use, i.e., Chrome/Chromium, Firefox 4 or Firefox with NoScript.

  7. DoS prevention with a low number of connections allowed per client: 16. This number can be adjusted as you see fit.

  8. The Drupal specific headers like X-Drupal-Cache provided by pressflow or the X-Generator header that Drupal 7 sets are both hidden.

  9. Limitation of allowed HTTP methods. Out of the box only GET, HEAD and POST are allowed.
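Several of the features above map to a handful of directives. The fragment below is a sketch of how they could look inside a server block; it is not copied from the repository. The realm string and the max-age value are placeholders, and the phpcgi upstream is assumed to be defined elsewhere.

```nginx
# Fragment for a server block (assumed to be an HTTPS vhost).

# Clickjacking protection: only allow framing by the same origin.
add_header X-Frame-Options SAMEORIGIN;

# Strict Transport Security; the max-age value (in seconds) is a placeholder.
add_header Strict-Transport-Security "max-age=7200";

# Hide the Drupal specific headers coming from the FastCGI backend.
fastcgi_hide_header X-Drupal-Cache;
fastcgi_hide_header X-Generator;

# HTTP Basic Auth in front of install.php.
location = /install.php {
    auth_basic "Restricted Access";          # placeholder realm
    # Relative paths are looked up from the nginx configuration directory.
    auth_basic_user_file .htpasswd-users;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass phpcgi;
}
```

When proxying to Apache instead of FastCGI, the header hiding would use proxy_hide_header rather than fastcgi_hide_header.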

Private file handling

This config assumes that private files are stored under a directory named private. I suggest sites/default/files/private or sites/<sitename>/files/private but it can be anywhere inside the site root as long as you keep the top level directory name private. If you want to have a different name for the top level directory then replace, in the location ~* private in drupal.conf and/or drupal7.conf, private with the name of your private files top directory.

Example: Calling the top level private files directory protected instead of private.

   location ^~ /sites/default/files/protected {
     internal;
   }

Now any attempt to access the files under this directory directly will return a 404.

Note that this practice is not what's usually recommended. The usual practice involves setting up a directory outside of the files directory and giving write permissions to the web server user. While that might be a simpler alternative in the sense that it doesn't require tweaking the web server configuration, I think it is less advisable, in the sense that now there's another directory that is writable by the server.

I prefer to use a directory under files, which is the only one that is writable by the web server, and use the above location (protected or private) to block access by the client to it.

Also bear in mind that the above configuration stanza is for a drupal 7 or a drupal 6 site not relying on purl. Sites that use it, e.g., sites/products based on spaces like OpenAtrium or ManagingNews, require a regex based location, i.e.:

   location ~* /sites/default/files/protected {
     internal;
   }

in order to work properly.

Fast Private File Transfer

Nginx implements Lighty X-Sendfile using the header: X-Accel-Redirect.

This allows fast private file transfers. I've developed a module tailored for Nginx: nginx_accel_redirect.

Connections per client and DoS Mitigation

The connection zone defined, called arbeit, allows for 16 connections to be established for each client. That seems to me to be a reasonable number. It could happen that you have a setup with lots of CDNs (see this issue) or extensive domain sharding and the number of connections needed per client is greater than 16, especially when using Nginx as a reverse proxy.

It may happen that 16 is not enough and you start getting a lot of 503 Service Unavailable status codes as a reply from the server. In that case tweak the value of limit_conn until you have a working setup. This number must be as small as possible as a way to mitigate the potential for DoS attacks.
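For orientation, a connection limit of this kind can be sketched with the directives of current nginx releases as shown below. The zone name arbeit and the limit of 16 come from the text above; the 10m zone size is an assumption, and the repository may use the older limit_zone syntax instead.

```nginx
http {
    # One shared memory zone keyed by client address; the 10m size is an assumption.
    limit_conn_zone $binary_remote_addr zone=arbeit:10m;

    server {
        # At most 16 simultaneous connections per client address.
        # Requests over the limit get a 503 Service Unavailable reply.
        limit_conn arbeit 16;
    }
}
```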

Image hotlinking protection

Imagecache generated images can be expensive to generate. In those cases providing protection against hotlinking is a must.

To make use of that uncomment the proper line on the /imagecache/ location that includes the sites-available/hotlinking_protection.conf file.

The protection is based on the Nginx referer module. You must specify the hosts that are allowed to access the images. The hostnames can use wildcards or use regexes.
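A minimal sketch of what such a protection could look like with the referer module. The host names are placeholders and the status code returned to hotlinkers is a free choice here, not necessarily what hotlinking_protection.conf does.

```nginx
# Fragment meant to be used inside the /imagecache/ location.

# Requests with no Referer, a stripped Referer, or one of these hosts are valid.
# Wildcards are allowed; a name starting with ~ is treated as a regex.
valid_referers none blocked example.com *.example.com;

# $invalid_referer is set by the referer module for every other request.
if ($invalid_referer) {
    return 403;   # assumed status code
}
```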

HTTP allowed methods made to measure

For a standard drupal install there's no need for any method besides GET, HEAD and POST. The allowed methods are enumerated in the file map_block_http_methods.conf.

If your site uses or provides web services then you must add the methods you need to the list. For example if you want to allow PUT then do:

   map $request_method $not_allowed_method {
       default 1;
       GET 0;
       HEAD 0;
       POST 0;
       PUT 0;
   }

Note that this enables PUT for all locations and clients. If you need finer control then use the limit_except directive and enumerate the client IPs that are allowed to use the extra methods like PUT.
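A sketch of the finer grained alternative with limit_except. The /services path and the client network are hypothetical; PUT still has to be allowed in the map for the request to get this far.

```nginx
# Hypothetical web services endpoint.
location /services {
    # GET, HEAD and POST stay open to everyone. Any other method,
    # such as PUT, is only accepted from the listed network.
    limit_except GET HEAD POST {
        allow 192.168.1.0/24;   # placeholder client network
        deny all;
    }
}
```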

Nginx as a Reverse Proxy: Proxying to Apache for PHP

If you absolutely need to use the rather bad habit of deploying web apps relying on .htaccess, or you just want to use Nginx as a reverse proxy, the config allows you to do so. Note that this provides some benefits over using only Apache, since Nginx is much faster than Apache, not only due to its architecture but also because it buffers upstream replies. Furthermore you can use the proxy cache and/or use Nginx as a load balancer.
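In outline, proxying PHP requests to Apache could look like the sketch below. The backend address 127.0.0.1:8080 is the default this document mentions for upstream_phpapache.conf; the upstream name and the forwarded headers are assumptions.

```nginx
# Inside the http context. The upstream name is a placeholder.
upstream phpapache {
    server 127.0.0.1:8080;   # Apache with mod_php on the loopback interface
}

server {
    listen 80;
    server_name example.com;   # placeholder domain

    location = /index.php {
        proxy_pass http://phpapache;
        # Pass the original host and client address to the backend.
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Adding more server lines to the upstream block is what turns this into a simple load balancer.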

Static index.html file

The / location is a fallback location, meaning that after trying all other, more specific locations, Nginx will return here.

Since there's a try_files $uri directive with @cache, if using Boost, or @drupal, or index.php?q=$uri&$args otherwise, as fallback, a request URI of / returns a 404 if no file is found, even if you have an index.html file at the root. It will work however with /index.html, since that's the argument of the try_files directive.

There are several possible ways to fix that, be it with nested locations inside location / or with an additional try_files $uri/index.html.

The one I opted for is instead making use of the error_page directive. There's an exact location / that issues a 200 code and serves /index.html when a 404 is returned.

Microcaching

Introduction

Microcaching is a caching concept that takes a simple is better approach, meaning we don't care about content expiration because the cache validity time is small enough for that not to be an issue. In this config we set it to 15 seconds. You can tune all cache parameters to your liking. Check the microcache_fcgi.conf or microcache_proxy.conf for anonymous users cache and microcache_fcgi_auth.conf or microcache_proxy_auth.conf for authenticated users cache.
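To give an idea of what a FastCGI microcache involves, here is a minimal sketch. Only the 15 second validity and the $no_cache variable (set by a map like the one in map_cache.conf) come from this document; the cache path, zone name, sizes and cache key are assumptions, and the shipped microcache_fcgi.conf has more knobs.

```nginx
http {
    # Cache storage: path, zone name and sizes are placeholders.
    fastcgi_cache_path /var/cache/nginx/microcache levels=1:2
                       keys_zone=microcache:5m max_size=1g;

    server {
        location = /index.php {
            fastcgi_cache microcache;
            fastcgi_cache_key $scheme$request_method$host$request_uri;
            # Short validity: content expiration is not tracked at all.
            fastcgi_cache_valid 200 301 15s;
            # Backend Cache-Control and Expires headers would otherwise
            # override the validity set above.
            fastcgi_ignore_headers Cache-Control Expires;
            # Skip the cache when the map flags the request.
            fastcgi_cache_bypass $no_cache;
            fastcgi_no_cache $no_cache;

            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_pass phpcgi;
        }
    }
}
```

The proxy variant follows the same pattern with the proxy_cache family of directives.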

Microcaching and Drupal

You can implement a microcaching strategy on drupal using cache_warmer. Tune the many options of that drush command to fit your site traffic pattern.

Anonymous and Authenticated users microcaching

This configuration supports both anonymous and authenticated users caching. You should enable one and only one. The authenticated user cache also supports anonymous users.

By default on both drupal 6 and drupal 7 the anonymous user microcache is enabled. If you want to use the authenticated user microcache instead comment out the line:

  1. include sites-available/microcache_fcgi.conf if using the FCGI microcache (when proxying to FCGI).

  2. include sites-available/microcache_proxy.conf if using the proxy cache (proxying to Apache or other PHP handler).

and uncomment:

  1. include sites-available/microcache_fcgi_auth.conf if using the FCGI microcache (when proxying to FCGI).

  2. include sites-available/microcache_proxy_auth.conf if using the proxy cache (proxying to Apache or other PHP handler).

You're set to go.

Boost and authenticated user microcaching

When using Boost you can use the authenticated user microcache. It will give you an additional layer of caching.

This is enabled by default. Comment out the include sites-available/microcache_fcgi_auth.conf or include sites-available/microcache_proxy_auth.conf line if you don't want to use microcaching at all with Boost.

Microcaching for authenticated users under the hood

The way microcaching for authenticated users is implemented uses a $cache_uid variable that is set in map_cache.conf.

  • anonymous users get a $cache_uid value of nil.

  • authenticated users get a $cache_uid value that is the session id. Note that the named capture that grabs the session ID assumes that you're using the default setting in terms of what drupal calls the session cookie. Hence it starts with SESS. If this isn't the case just remove the string SESS from the regex.

    See drupal_settings_initialize() for drupal 7 or conf_init() for drupal 6 for further information.
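The mapping described above can be sketched like this. The variable name $cache_uid, the nil default and the SESS prefix come from the text; the exact regex is an assumption and may differ from the one in map_cache.conf.

```nginx
# Inside the http context.
map $http_cookie $cache_uid {
    # Anonymous users: no session cookie found.
    default nil;
    # Authenticated users: the named capture grabs the session ID.
    # The pattern is quoted because it contains a semicolon.
    "~SESS[[:alnum:]]+=(?<session_id>[^;]+)" $session_id;
}
```

Making $cache_uid part of the cache key then gives each authenticated session its own cache entries, while all anonymous users share the entries keyed with nil.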

Useful scripts for working with Nginx cache

Here are two useful scripts for working with the Nginx cache:

  1. nginx cache inspector allows you to inspect the cache files.

  2. nginx cache purge allows you to purge an item or set of items from the Nginx cache.

IPv6 and IPv4

The configuration of the example vhosts uses separate sockets for IPv6 and IPv4. This way it is simpler for those not (yet) having IPv6 support to disable it by commenting out the listen directive relative to IPv6.
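A minimal sketch of the two listen directives, assuming port 80 and a placeholder server name. On nginx releases where ipv6only is not on by default, the IPv6 line may need an explicit ipv6only=on parameter to keep the sockets separate.

```nginx
server {
    listen 80;        # IPv4 socket
    listen [::]:80;   # IPv6 socket; comment this line out if you lack IPv6 support
    server_name example.com;   # placeholder domain
}
```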

Installation

  1. Move the old /etc/nginx directory to /etc/nginx.old.

  2. Clone the git repository from github:

    git clone https://github.com/perusio/drupal-with-nginx.git

  3. Edit the sites-available/example.com.conf configuration file to suit your requirements. Namely replacing example.com with your domain.

  4. Setup the PHP handling method. It can be:

    • Upstream HTTP server like Apache with mod_php. To use this method comment out the include upstream_phpcgi.conf; line in nginx.conf and uncomment the lines:

      include reverse_proxy.conf;
      include upstream_phpapache.conf;
      

      Now you must set the proper address and port for your backend(s) in the upstream_phpapache.conf. By default it assumes the loopback 127.0.0.1 interface on port 8080. Adjust accordingly to reflect your setup.

      Comment out all fastcgi_pass directives in either drupal_boost.conf or drupal_boost_drush.conf, depending on which config layout you're using. Uncomment all the proxy_pass directives. They have a comment around them stating these instructions.

    • FastCGI process using php-cgi. In this case an init script is required. This is how the server is configured out of the box. It uses UNIX sockets. You can use TCP sockets if you prefer.

    • PHP FPM, this requires you to configure your fpm setup, in Debian/Ubuntu this is done in the /etc/php5/fpm directory.

      Look here for an example configuration of php-fpm.

    Check that the socket is properly created and is listening. This can be done with netstat, like this for UNIX sockets:

    netstat --unix -l

    And like this for TCP sockets:

    netstat -t -l

    It should display the PHP CGI socket.

    Note that the default socket type is UNIX and the config assumes it to be listening on unix:/tmp/php-cgi/php-cgi.socket, if using php-cgi, or on unix:/var/run/php-fpm.sock if using php-fpm. You should change this to reflect your setup by editing upstream_phpcgi.conf.

  5. Create the /etc/nginx/sites-enabled directory and enable the virtual host using one of the methods described below.

    Note that if you're using the nginx_ensite script described below it creates the /etc/nginx/sites-enabled directory if it doesn't exist the first time you run it for enabling a site.

  6. Reload Nginx:

    /etc/init.d/nginx reload

  7. Check that your site is working using your browser.

  8. Remove the /etc/nginx.old directory.

  9. Done.

Enabling and Disabling Virtual Hosts

I've created a shell script nginx_ensite that lives here on github for quick enabling and disabling of virtual hosts.

If you're not using that script then you have to manually create the symlinks from sites-enabled to sites-available. Only the virtual hosts configured in sites-enabled will be available for Nginx to serve.
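The mechanism behind this is an include in the main configuration. The line below is a sketch that assumes the /etc/nginx layout used in the installation steps; check nginx.conf in the repository for the exact form.

```nginx
http {
    # Every file (or symlink) found in sites-enabled is loaded as a vhost.
    include /etc/nginx/sites-enabled/*;
}
```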

Accessing the php-fpm status and ping pages

You can get status and ping pages for the running instance of php-fpm. There's a php_fpm_status.conf file with the configuration for both features.

  • the status page at /fpm-status;

  • the ping page at /ping.

For obvious reasons these pages are accessible only from a given set of IP addresses. In the suggested configuration only from localhost and non-routable IPs of the 192.168.1.0 network.

The allowed hosts are defined in a geo block in file php_fpm_status_allowed_hosts.conf. You should edit the predefined IP addresses to suit your setup.

To enable the status and ping pages uncomment the line in the example.com.conf virtual host configuration file.
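As an illustration of how a geo block can gate these pages, consider the sketch below. The variable name is hypothetical, the /24 mask for the 192.168.1.0 network is an assumption, and php-fpm itself is assumed to be configured with pm.status_path = /fpm-status and ping.path = /ping. The shipped php_fpm_status.conf and php_fpm_status_allowed_hosts.conf may be organized differently.

```nginx
# Inside the http context: 0 means the client is allowed.
# The variable name is a placeholder.
geo $dont_show_fpm_status {
    default 1;
    127.0.0.1 0;        # localhost
    192.168.1.0/24 0;   # non-routable local network (assumed mask)
}

# Inside the server block.
location = /fpm-status {
    if ($dont_show_fpm_status) {
        return 404;
    }
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass phpcgi;
}

location = /ping {
    if ($dont_show_fpm_status) {
        return 404;
    }
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass phpcgi;
}
```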

Getting the latest Nginx packaged for Debian or Ubuntu

I maintain a debian repository with the latest version of Nginx. This is packaged for Debian unstable or testing. The instructions for using the repository are presented on this page.

It may work or not on Ubuntu, since Ubuntu seems to appreciate finding semi-witty names for their releases more than making clear what the status of the included software is. Is it stable? Is it testing? Is it unstable? The package may work with your currently installed environment or not. I don't have the faintest idea which release to advise, so you're on your own. Generally the APT machinery will sort out for you any dependency issues that might exist.

Ad and Additional modules support

The config is quite tight in the sense that if you have something that is not contemplated in the exact match locations, /index.php, /install.php, etc, and you try to make it work it will fail. Some Drupal modules like ad provide a PHP script. This script needs to be invoked. In the case of the ad module you must add the following location block:

   location = /sites/all/modules/ad/serve.php {
     fastcgi_pass phpcgi;
   }

Of course this assumes that you installed the ad module such that it is usable for all sites. To make it usable when targeting a single site, e.g., mysite.com, insert instead:

   location = /sites/mysite.com/modules/ad/serve.php {
     fastcgi_pass phpcgi;
   }

Proceed similarly for other modules requiring the usage of PHP scripts like ad.

On groups.drupal.org

There's a nginx groups.drupal.org group for sharing and learning more about using nginx with Drupal.

Monitoring nginx

I use Monit for supervising the nginx daemon. Here's my configuration for nginx.

Caveat emptor

You should always test the configuration with nginx -t to see if everything is correct. Only after a successful test should you reload nginx. On Debian and any of its derivatives you can also test the configuration by invoking the init script as: /etc/init.d/nginx testconfig.

My other nginx configs on github

Securing your PHP configuration

I have created a small shell script that parses your php.ini and sets a sane environment, be it for development or production settings.

Grab it here.

TODO

  • Improve the documentation. It's too vague and needs to be more elaborate.

  • Add AgrCache support. (D7)

Acknowledgments

The great bunch at the Nginx group on groups.drupal.org. They've helped me sort out the snafus on this config and offered insights on how to improve it.

Thanks to Richard for setting me down the path of image hotlinking protection.