Auto Escaping View Variables Using Zend Framework and Doctrine 2

Filter input, escape output.  As complex as security can be when developing web applications, following these simple rules of coding security will cover a whole range of bases.  Today we're going to be looking specifically at the second part – escaping output, and how this can be achieved with Zend Framework and later Doctrine 2.

Note this article was created for Zend Framework 1.

If you’re already clued up on the basic principle of escaping output – I’d suggest you jump ahead to the Zend bit, otherwise read on for a quick overview...

What is escaping and why is it important?

Put simply, escaping output allows you to ensure that content being sent to a page will be presented as you expect it to. This is best explained with an example:

Let’s suppose you have a commenting system on your site, and oops! you forgot to filter your input. Mr Evil commenter adds a comment to your site and whatever he types goes straight into your database unchecked. Another user then comes to your web site and sees all the comments that have been left, including Mr Evil’s. Let’s look at the difference between the output of an escaped and a non-escaped comment:

Escaped:

<script>alert(‘You should escape your output’);</script>

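Not Escaped:

[Nothing is displayed on the page; instead an alert box pops up in the visitor's browser reading: You should escape your output]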

As you can see, the comment in the unescaped version has been executed as JavaScript, allowing the commenter to do things they’re not supposed to. A fairly harmless example, but they could do a whole lot more; Cross-Site Scripting (XSS) attacks can allow the attacker to perform malicious actions in visitors' browsers, like redirecting all your hard-earned traffic to Justin Bieber's fan page, and nobody wants that. Nobody.

So what do I do!?

Coming back to ‘Filter Input, Escape Output’, a lot of this can be prevented through data validation (ensuring you are receiving the correct ‘type’ of data) and data sanitisation (ensuring the data is ‘safe’ before you save it). The bit we’re looking at today, though, is escaping the output, which if implemented correctly should protect you against XSS attacks even on unsanitised data. At a simple level, this can easily be achieved in plain old PHP using functions like htmlspecialchars() or htmlentities(), which convert applicable characters to HTML entities.
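As a quick illustration of the plain PHP approach, here is a minimal sketch. The $comment value is made up for the example, and passing ENT_QUOTES and 'UTF-8' is my own assumption about sensible settings rather than anything the rest of this article depends on:

```php
<?php
// Hypothetical comment submitted by a malicious user
$comment = "<script>alert('You should escape your output');</script>";

// Convert the HTML special characters (<, >, &, quotes) to entities
$safe = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

// The browser now displays the markup as text instead of running it
echo $safe;
```

The angle brackets come out as &lt; and &gt;, so the browser shows the script tag as harmless text.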

Escaping data with Zend

And on to the topic of the day: how do we escape data using the Zend Framework? I’ve looked into the various options for achieving this. My primary goal was finding a solution that was consistent and simple to maintain.

Option 1 – Zend’s Escape Function

This is the simplest method of escaping your data. All you need to do is wrap each variable that needs escaping with Zend’s escape method:

$this->escape($myVariable);

If you dig into the Zend method you’ll see it calls htmlspecialchars by default (htmlentities or a custom callback can be configured through the view’s setEscape() method), the same functions we discussed earlier.

Now here's my reservation about this technique – you have to remember to wrap it around EVERY variable you want to escape - it’s far too easy to forget to use it, and you only need to miss it once! What if a new developer on your team is used to working with a different framework like Symfony where auto-escaping can be enabled? So how about a bit of automation...

Option 2 – Zend Helper to auto-escape variables

One way to create some form of automation would be to write an AutoEscape view Helper which you would call at the top of each view. You still have to remember to add that in, but at least it’s only one line!

This code sample is a solution suggested by Jani Hartikainen over at Code Utopia, with the addition of an escapeDeep function to account for arrays:

class Zend_View_Helper_AutoEscape
{
    private $_unescapedVars = null;
    private $_view;

    // Called automatically by Zend to inject the current view object
    public function setView($view)
    {
        $this->_view = $view;
    }

    // Must be public so the view can call $this->autoEscape()
    public function autoEscape()
    {
        $vars = $this->_view->getVars();
        $this->_unescapedVars = $vars;

        foreach($vars as $k => $v)
        {
            $this->_view->$k = $this->escapeDeep($v);
        }
    }

    public function escapeDeep($unescaped)
    {
        if(is_string($unescaped))
        {
            return htmlspecialchars($unescaped);
        }
        elseif(is_array($unescaped))
        {
            foreach($unescaped as $k => $v)
            {
                $unescaped[$k] = $this->escapeDeep($v);
            }
        }

        // Arrays are returned escaped; any other type is returned unchanged
        return $unescaped;
    }
}

Essentially, the helper gets the view variables, loops through and escapes each one. Note that the 'setView' function is automatically called by Zend to inject the current view object for you to use. 

So with this method all you need to do is include the following code at the top of each view:

<?php echo $this->AutoEscape(); ?>

Note that you can still access the unescaped variables if you need to (e.g. for a CMS page that actually needs script tags).

Option 3 – View Stream

OK, so what if we didn’t even want to include that one line in each view? Are there any other options? Rob Allen’s post over at Akrabat suggests a View Stream, or stream wrapper. In this scenario, PHP short tags are used (<? … ?> rather than the full <?php … ?>), and Zend_View is extended to intercept each view script before it is rendered, at which point all variables output with short tags are escaped. There are a few reasons I like this option:

  • Outputting variables is actually shorter: <? @$var; ?> compared with the original <?php echo $this->escape($var); ?>
  • You can still easily access the unescaped variables using normal PHP tags
  • No need to worry about calling a view helper on every page

The downside of this method however is the overhead for parsing the whole view file. I believe this is similar to Smarty Templates where speed can be an issue. If you are interested in pursuing this route you can read the full article at Akrabat.
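To give a feel for where that overhead comes from, here is a much simplified sketch of the general idea. This is not Rob Allen's implementation and it leaves out the stream wrapper plumbing entirely: rewriteShortEchoTags() is a name I've made up, and I've used the <?= ... ?> short echo form purely for illustration. The point is that the source of every view script has to be rewritten before PHP runs it:

```php
<?php
// Hypothetical helper: rewrites short echo tags so that whatever they
// output is passed through htmlspecialchars() first.
function rewriteShortEchoTags($template)
{
    // $1 is the expression found inside the short echo tag
    return preg_replace(
        '/<\?=\s*(.+?)\s*;?\s*\?>/s',
        '<?php echo htmlspecialchars((string) ($1), ENT_QUOTES, \'UTF-8\'); ?>',
        $template
    );
}

// A one-line view script, held in a string for the example
$source = '<p><?= $comment ?></p>';

// The rewritten source is what would actually be handed to PHP to run
$compiled = rewriteShortEchoTags($source);
echo $compiled;
```

A real stream wrapper would apply a transformation like this each time a view file is opened, which is why the cost grows with the size and number of your views.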

 

Getting Niche – enter Doctrine 2…

So just to complicate matters further, let’s throw Doctrine 2 into the mix.

What’s the problem with just using one of the previous methods? Unfortunately it’s not so simple to iterate through the Doctrine result sets and escape them, as the properties in the results are protected. (Note that you shouldn’t, and can’t, just change the entity properties to public; Doctrine gets very upset.)

So let’s come up with a solution.

I’m going to build on our option 2, so just to recap how this works: in the views, add $this->AutoEscape(); which invokes the Auto Escape helper. The helper gets all of the view variables and ‘deep’ escapes strings and arrays. Anything echoed out in the view from then on will be escaped.

Dealing with Doctrine 2:

As Doctrine 2 uses private or protected properties in its entities, these are not available to iterate through and escape. This is good practice from the perspective of the controller, service or model layer; however, when using the entities as a result set in the view, it prevents manipulation of the data for the view. The solution to this is to extend the entities with a BaseEntity. The BaseEntity contains three methods:

  • __set
  • __get
  • getProperties

The last one is the key for auto-escaping, as it tells us which properties are available and so allows iteration through the entity object without making them public. Note that for this to work you’ll need to update your Doctrine entities so that the variables are protected, not private: get_object_vars() is called from within the base class, and from there it cannot see private properties declared in a subclass.

So the Doctrine entities extend the Base Entity and the Base Entity contains the new methods:

namespace Resources\Entity;

class BaseEntity
{
    public function __get($property)
    {
        return $this->$property;
    }   
    
    public function __set($property, $value)
    {
        $this->$property = $value;
    }
    
    public function getProperties()
    {
        return get_object_vars($this);
    }
}

Note that the entities in my application are stored in library/Resources/Entity so you’ll need to adapt the path / namespace accordingly.
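To see the protected versus private difference in action, here is a small standalone sketch. The Comment entity and its values are invented for the example, and I've left out the namespace and the Doctrine annotations to keep it short:

```php
<?php
// Same three methods as the BaseEntity above (namespace omitted here)
class BaseEntity
{
    public function __get($property) { return $this->$property; }
    public function __set($property, $value) { $this->$property = $value; }
    public function getProperties() { return get_object_vars($this); }
}

// Hypothetical entity, used only for illustration
class Comment extends BaseEntity
{
    protected $author = 'Mr Evil';
    protected $body = '<script>alert(1);</script>';
    private $secret = 'not visible to the base class';
}

$comment = new Comment();

// Lists 'author' and 'body' but not the private 'secret'
$properties = $comment->getProperties();

// Escape every string property through the magic setter
foreach ($properties as $key => $value) {
    if (is_string($value)) {
        $comment->$key = htmlspecialchars($value);
    }
}

echo $comment->body;
```

The private property never shows up in getProperties(), so it would silently be skipped by any escaping loop, which is why the entity properties need to be protected.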

Now that we have access to the properties that will be returned in our Doctrine result set, we need to update the helper so it can loop through them. To do this we’re going to add an additional 'elseif' branch to the escapeDeep function in our helper that checks for objects. Note that we don’t want to escape all objects, otherwise this would cause problems with things like Zend_Form.

When looping through objects, we check that each one is a Doctrine entity by matching against the start of its class name. In my case I used 'Resources\Entity\' (written as 'Resources\Entity\\' in PHP, since the trailing backslash has to be escaped), though this will differ depending on where your entities are stored and what they are called; 'Resources' is normally where you would have your project name. As the code loops through, it uses the __get and __set methods that we defined in our base class to update the object.
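If you'd rather not hard-code the length of the prefix, the class name check can be sketched along these lines. isEntityClass() is a hypothetical function name of my own, and the default prefix is just the one from my application:

```php
<?php
// Hypothetical helper: true when $className starts with the entity
// namespace prefix. The prefix length is calculated, not hard-coded.
function isEntityClass($className, $prefix = 'Resources\\Entity\\')
{
    return strncmp($className, $prefix, strlen($prefix)) === 0;
}

var_dump(isEntityClass('Resources\\Entity\\Event')); // bool(true)
var_dump(isEntityClass('Zend_Form'));                // bool(false)
```

In the helper you would pass in get_class($original), so that only your own entities are escaped and everything else is left alone.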

class Zend_View_Helper_AutoEscape
{
    private $_unescapedVars = null;
    private $_view;
 
    public function setView($view)
    {
        $this->_view = $view;
    }
 
    public function autoEscape()
    {
        
        // First call: escape every view variable.
        // Second call: put the raw values back.
        if($this->_unescapedVars === null)
        {
            
            $vars = $this->_view->getVars();
            $this->_unescapedVars = $vars;
            foreach($vars as $k => $v)
            {
                $this->_view->$k = $this->escapeDeep($v);
            }
            
            // Add the rawVariables to the view also if we need access to them
            $this->_view->assign('rawVariables',$vars);
        }
        else
        {
           $this->_view->assign($this->_unescapedVars);
           $this->_unescapedVars = null;
        }
    }
    

    private function escapeDeep(&$original) 
    {
        if (is_string($original))
        {
            return htmlspecialchars($original);
        }
        elseif (is_array($original)) 
        {
            foreach ($original as $k => $v) 
            {
                $original[$k] = $this->escapeDeep($v);
            }
        }
        elseif(is_object($original))
        {
            // Custom for doctrine 2
            $class = get_class($original);
            if(substr($class, 0, 17) == 'Resources\Entity\\')
            {
                $properties = $original->getProperties();
                foreach($properties as $key=>$value)
                {
                    // Only escape string properties. Joining tables
                    // (objects), arrays, numbers and nulls are left alone.
                    if(is_string($value))
                    {
                        $original->__set($key, htmlentities($value));
                    }
                }      
            }
        }
        return $original;
    }
}

So finally, we have a helper that can escape strings, arrays and Doctrine result sets, requires only one line at the top of each view, and still leaves all the raw variables available if required.

Not everyone requires this fairly niche setup and it’s up to you which approach you ultimately want to take, but the key is to be consistent with whichever method you follow.

Known Limitations:

For the sake of being thorough, I feel I should point out a few known limitations with the proposed setup:

Doctrine

Doctrine result sets should be passed to the view as they need to be used. The view should not call methods belonging to an entity that will retrieve more data. (e.g. $this->event->getRegions() – the regions should be passed in from the controller).

Paged Results

I had a surprise later down the line when I realised that none of my paged result sets that used Doctrine’s Paginate extension were being escaped! To overcome this I created a new AutoEscape class in my library, moved the escapeDeep method into that class and called it from the helper instead. This allows me to reuse the same functionality in my custom paging class so that paged results are also escaped. Feel free to get in touch if you would like a copy of this implementation.
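To give a rough idea of the shape this takes, here is a sketch of what such a shared class might look like. It is not a copy of my implementation: it only handles strings and arrays (the entity branch is left out), the ENT_QUOTES and 'UTF-8' arguments are assumptions, and the $page data is invented for the example:

```php
<?php
// Sketch only: a standalone escaper that both the view helper and a
// custom paging class could call.
class Resources_Classes_AutoEscape
{
    public function escapeDeep($value)
    {
        if (is_string($value)) {
            return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
        }

        if (is_array($value)) {
            foreach ($value as $k => $v) {
                $value[$k] = $this->escapeDeep($v);
            }
        }

        // Escaped arrays, plus any other type, are returned as they are
        return $value;
    }
}

// Hypothetical page of results, as a paging class might hold it
$page = array(
    array('title' => 'Tom & Jerry', 'views' => 10),
    array('title' => '<b>bold</b>', 'views' => 3),
);

$escaper = new Resources_Classes_AutoEscape();
$escapedPage = $escaper->escapeDeep($page);

echo $escapedPage[0]['title'];
```

Because the escaping logic lives in one place, the helper and the paging class can't drift apart over time.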

Custom Classes:

Be aware that only variables that are passed to the view will be escaped. If, for example, you call a method from within the view to retrieve data (e.g. $notifications->getnotifications), these variables need to be escaped manually, either in the view or within the custom class itself. For this reason it is recommended that all variables are passed to the view from the controller where possible, or that any custom classes deal with escaping before returning the data.

Helpers

Variables need to be passed into your helpers from the view - if the helper interacts with, say, a service layer, it will be utilising unescaped variables.


