How to fix nginx and PHP/FastCGI PATH_INFO issue

You may be disappointed by this statement: you don’t. nginx has something in it that’s broken by design, and the author didn’t even bother to reply to my email explaining the situation. I can demonstrate this with a few comparisons.

Apache (+ mod_php5) knows the difference between a script and a PATH_INFO request that ends in .php. It would be ridiculous not to, since the PHP runtime is part of the web server itself. I didn’t bother to try various Apache + mod_fcgid configurations since most of the time Apache simply wastes my time. lighttpd binds the FastCGI proxying to the file extension. This is where nginx fails: it tries to use a one-size-fits-all configuration logic that doesn’t actually fit all the usage modes. The FastCGI pass is done inside a location directive (not a file / file extension!) which says nothing about the nature of the input. Let’s take a look at this example:

/directory.php/file.php/pathinfo.php

Unlike nginx, both Apache and lighttpd won’t make a mess out of a path like this. A location directive is a little bit vague and no amount of regex will ever fix this. Of course, you can fix it for every single damn virtual host, but then again, can you spell boilerplate? Having different configurations for every virtual host, when for other web servers this clearly isn’t a bundled “feature”, is not fun from the system administration point of view. Usually I generate the virtual host configuration from a bash script I wrote myself. I have configuration templates for all of the applications I administer, so it’s all about flags and options instead of manually writing configuration files. Working around nginx’s inability to tell which stuff is which would mean writing a whole bunch of configuration boilerplate for each type of application. That doesn’t sound like one-size-fits-all anymore. Is that fun? Let me rephrase that: it’s not fun. In case AWS disappears into a black hole, I can recreate everything from scratch in a matter of minutes on a completely new hosting service, while changing Cotendo origins is child’s play.
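
To make the problem concrete, this is roughly the one-size-fits-all block that every nginx + PHP-FPM tutorial hands out. It’s only a sketch: php_upstream is just an upstream name (the same one used in the snippet below), assuming the stock fastcgi_params doesn’t already set SCRIPT_FILENAME.

# The usual copy-pasted nginx + PHP-FPM setup. The regex only looks at the
# end of the URI, so /directory.php/file.php/pathinfo.php matches just as
# well as a real script, and the whole path ends up in SCRIPT_FILENAME
# without nginx ever checking what actually exists on disk.
location ~ \.php$
{
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass php_upstream;
}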

Here are a bunch of proposed solutions for something that can turn into a remote exploit. For quite a while I’ve been using the same solution as provided by one of the people commenting on the article:

if (-f $request_filename)
{
    fastcgi_pass php_upstream;
}

Mostly because this is more readable than try_files; I tend to understand code blocks better.
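
That said, the try_files flavour of the same check would look roughly like this (again a sketch, with the same upstream and the same assumption about fastcgi_params):

# Roughly the same guard expressed with try_files: only URIs that map to an
# existing file on disk reach the FastCGI pass, anything else gets a 404.
location ~ \.php$
{
    try_files $uri =404;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass php_upstream;
}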

Of course, a proper PHP script won’t save any uploaded junk to a publicly accessible location, but what sysadmin trusts his coders anyway? I usually don’t. That doesn’t mean they don’t do a good job, but mistakes happen. I can’t make every living thing as paranoid about security as I am. This exploitable situation happens when people validate their uploads via the $_FILES array. News flash: the MIME type in the $_FILES array is provided by the browser. The browser does a lousy job at providing a proper MIME type; it picks one based on the file extension. PHP file with a JPEG extension, anyone? fileinfo would be the proper alternative. PHP should deal with this junk by design, but that’s a whole other joke about the design of PHP.
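
On the nginx side there’s also a cheap extra safety net. It isn’t the fix this article is about, and the uploads path is just a WordPress-style example, but refusing to pass anything under the upload directory to PHP limits the damage of a sloppy upload handler:

# Defense in depth: never hand anything under the uploads directory to
# PHP-FPM, even if the URI ends in .php
# (e.g. /wp-content/uploads/evil.jpg/whatever.php).
# Regex locations are matched in order of appearance, so this block has to
# come before the generic "location ~ \.php$" one.
location ~* ^/wp-content/uploads/.*\.php$
{
    return 403;
}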

Getting back to PATH_INFO. The juicy part is that you can extract the PATH_INFO from an input path by using fastcgi_split_path_info, but that directive uses … regex. Which brings us back to the statement above: no amount of regex will ever fix this crap. Let’s take a look at $request_filename by throwing in a custom debug logging configuration that dumps some variables into the access_log. Guess what, the $request_filename for the above example is … /directory.php/file.php/pathinfo.php, while it’s pretty clear that the actual request filename is /directory.php/file.php. Which is the other broken-by-design thing that nginx features. Q: what damn server-side variable lies to your face by reporting a $request_filename that isn’t actually a file? A: duh!
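
Such a debug setup looks roughly like this. It’s a sketch, not my exact configuration: the log_format name and the log path are arbitrary, and the split regex is the commonly used one.

# In the http block: a log format that exposes what nginx thinks it resolved.
log_format pathinfo_debug '$request_filename | $fastcgi_script_name | $fastcgi_path_info';

# In the server block:
location ~ \.php$
{
    access_log /var/log/nginx/pathinfo-debug.log pathinfo_debug;

    # The usual split: the longest prefix ending in ".php" that still leaves
    # a trailing path segment becomes the script, the rest becomes PATH_INFO.
    # $request_filename, however, keeps reporting document root + full URI,
    # pathinfo and all.
    fastcgi_split_path_info ^(.+\.php)(/.+)$;
    include fastcgi_params;
    fastcgi_param PATH_INFO $fastcgi_path_info;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass php_upstream;
}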

This doesn’t mean you should throw nginx and PHP-FPM away and go back crying to Apache. Just avoid the PATH_INFO junk. However, even when using my proposed configuration, aka checking whether $request_filename is actually a file before doing the FastCGI pass, you can still use fastcgi_split_path_info for a limited amount of work. fastcgi_split_path_info can replace the need for URL rewrites by simply using:

if (!-e $request_filename)
{
    rewrite ^ /index.php last;
}

This works for a lot of stuff like WordPress, Drupal, or Zend Framework. It works for certain stuff, except the stuff containing .php somewhere in the path. I might want to use /%postname%.php as the permalink structure in WordPress. Guess what … with a properly configured nginx (cough!) plus the above rewrite rule replacement, you simply can’t. You have to go back to:

if (!-e $request_filename)
{
    rewrite ^/(.*)$ /index.php?q=$1 last;
}

Which is exactly what I did. For all my apps. It happens to be more deterministic by nature, and I tend to sleep better when I can predict the request pipeline, no matter what the input junk is.

I guess I could give Hiawatha a run; it seems lightweight enough and supports PHP with a threaded architecture. PHP is process based and does blocking I/O by design, therefore the web server is rarely the actual bottleneck.

3 thoughts on “How to fix nginx and PHP/FastCGI PATH_INFO issue”

  1. Donovan

    Nice article. I’ve been using:

     if (!-e $request_filename) {
         rewrite ^.*$ /index.php last;
     }
    

    for some time now without any issues, but I’m wondering what the difference is in using your method of:

    if (-e $request_filename)
    {
        rewrite ^/(.*)$ /index.php?q=$1 last;
    }
    

    Thanks for the article!

  2. SaltwaterC Post author

    I ate the bang operator (fixed the article). The proper syntax is:

    if (!-e $request_filename)
    {
        rewrite ^/(.*)$ /index.php?q=$1 last;
    }
    

    The difference with this syntax is that PATH_INFO isn’t supported, therefore the routing is done by reading a query string argument. The only disadvantage is the need to write application-specific rewrite rules. Unless you’re planning to support various exotic routing schemes, the solution is maintainable without (m)any headaches. At some point I had to administer some “commercial” CMS crap that had more than 200 rewrite rules. That guy clearly never encountered the front controller pattern in his natural life span.

    Most frameworks / CMSes support a query string argument that contains the route, as an alternative to PATH_INFO. WordPress, Drupal, Invision Power Board, and Zend Framework happen to use the q from the example. Kohana uses a rule like /index.php?kohana_uri=$1. ActiveCollab uses a rule like /public/index.php?path_info=$1 for its main application route. The list could go on, and on.
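
    Only the rewrite target changes from one application to another; a Kohana-flavoured sketch of the same block would be:

    # Same front controller pattern, only the query string argument changes;
    # kohana_uri comes straight from the rule mentioned above.
    if (!-e $request_filename)
    {
        rewrite ^/(.*)$ /index.php?kohana_uri=$1 last;
    }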

    With this setup, the rewrite rule can be used together with a fastcgi_pass directive wrapped inside an “if (-f $request_filename)”. For example, something that works with WordPress, while avoiding the “exploit” that’s linked in the article:

    server
    {
        # stuff like listener, document root dir, log config
        
        server_name example.com;
        index index.php;
        
        location /
        {
            if (-f $request_filename)
            {
                break;
            }
            
            if (!-e $request_filename)
            {
                rewrite ^/(.*)$ /index.php?q=$1 last;
            }
        }
        
        location ~ \.php$
        {
            fastcgi_intercept_errors off;
            include includes/fastcgi.params;
            if (-f $request_filename)
            {
                error_page 404 = /404.html;
                fastcgi_pass [UNIX:socket || ip:port || nginx upstream reference ];
            }
        }
    }
    
