Perforce Chronicle 2012.2/486814
API Documentation

Url_Filter_UrlPath Class Reference

Filter to normalize a url path for use as a 'custom url'.

Public Member Functions

 filter ($value)
 Normalize a url path component such that characters are consistently encoded.

Detailed Description

Filter to normalize a url path for use as a 'custom url'.

Normalizes the path to encode (and only encode) 'unsafe' characters. The list of 'safe' characters includes all of the unreserved and reserved URI characters (see http://en.wikipedia.org/wiki/Percent-encoding), excluding '?' and '#' since those will terminate the path component.
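
As a minimal sketch of the whitelist idea described above, the encoding step can be isolated into a standalone helper. The function name encodeUnsafeChars is hypothetical and not part of the class; the safe-character list and pattern are taken from the source listing further down this page.

```php
<?php
// Hypothetical helper (not part of Url_Filter_UrlPath): only the encoding
// step, without the trimming and decoding the real filter also performs.
function encodeUnsafeChars($value)
{
    $safe = array(
        // unreserved characters (alpha-numerics are covered by the pattern).
        '-', '_', '.', '~',
        // reserved characters, minus '?' and '#'.
        '!', '*', '\'', '(', ')', ';', ':', '@', '&', '=', '+', '$', ',', '/', '[', ']'
    );

    // Build a negated character class from the whitelist; each listed
    // character is backslash-escaped so that it is taken literally.
    $pattern = '/[^a-z0-9\\' . implode('\\', $safe) . ']/i';

    // Every character outside the whitelist becomes '%' plus its hex value.
    return preg_replace_callback(
        $pattern,
        function ($matches) {
            return '%' . bin2hex($matches[0]);
        },
        $value
    );
}

echo encodeUnsafeChars('a/b:c@d'), "\n";        // a/b:c@d (all safe, unchanged)
echo encodeUnsafeChars('what?now#then'), "\n";  // what%3fnow%23then
echo encodeUnsafeChars('my page'), "\n";        // my%20page
```

Note that bin2hex() yields lowercase hex digits, so the escapes come out as %3f rather than %3F.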

Additionally converts unencoded backslashes to forward-slashes and trims any unencoded leading or trailing slashes and whitespace.
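
The slash handling can be sketched on its own as follows. The helper name trimPath is an assumption made for illustration; the two calls inside it are the ones used in the source listing further down this page. Because this step runs before any decoding, percent-encoded slashes and backslashes are left untouched by it.

```php
<?php
// Hypothetical helper (not part of Url_Filter_UrlPath): only the
// backslash conversion and trimming steps.
function trimPath($value)
{
    // literal backslashes become forward-slashes.
    $value = str_replace('\\', '/', $value);

    // strip spaces, tabs, newlines, carriage returns and slashes
    // from both ends of the string.
    return trim($value, " \t\n\r/");
}

echo trimPath('\\docs\\intro\\'), "\n";  // docs/intro
echo trimPath("  /my page/\n"), "\n";    // my page
echo trimPath('%2Fnews%2F'), "\n";       // %2Fnews%2F (encoded slashes are not trimmed)
```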

Copyright:
2011-2012 Perforce Software. All rights reserved
License:
Please see LICENSE.txt in top-level folder of this distribution.
Version:
2012.2/486814

Member Function Documentation

Url_Filter_UrlPath::filter ($value)

Normalize a url path component such that characters are consistently encoded.

Only 'unsafe' characters are encoded.

Parameters:
string|null $value  the url path component to filter.
Exceptions:
InvalidArgumentException  if the given value is neither a string nor null.
Returns:
string|null  the normalized url path, or null if null was given.
    {
        // leave null values alone.
        if (is_null($value)) {
            return $value;
        }
        
        // ensure we're dealing with a string.
        if (!is_string($value)) {
            throw new InvalidArgumentException(
                "Cannot normalize url path. Value must be a string."
            );
        }
        
        // translate unencoded backslashes to forward-slashes.
        $value = str_replace('\\', '/', $value);

        // trim unencoded leading/trailing slashes and whitespace.
        $value = trim($value, " \t\n\r/");

        // to achieve a consistent level of encoding, we first decode 
        // all characters and then (re)encode the 'unsafe' ones.
        $value = rawurldecode($value);
        
        // identify 'safe' characters.
        $safe = array(
            // unreserved characters (alpha-numerics are handled below).
            '-', '_', '.', '~',
            // reserved characters.
            '!', '*', '\'', '(', ')', ';', ':', '@', '&', '=', '+', '$', ',', '/', '[', ']'
        );

        // encode everything not in our whitelist of safe characters.
        $value = preg_replace_callback(
            '/[^a-z0-9\\' . implode('\\', $safe) . ']/i',
            function($matches)
            {
                return "%" . bin2hex($matches[0]);
            },
            $value
        );

        return $value;
    }
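
To see how the order of the steps plays out, here is a hedged end-to-end sketch. The function normalizeUrlPath is a hypothetical stand-in that copies the body of the listing above into a plain function so the example runs without the Chronicle code base. In an application the method would presumably be called on a Url_Filter_UrlPath instance instead.

```php
<?php
// Hypothetical stand-in mirroring the listing above as a plain function.
function normalizeUrlPath($value)
{
    if (is_null($value)) {
        return $value;
    }
    if (!is_string($value)) {
        throw new InvalidArgumentException(
            "Cannot normalize url path. Value must be a string."
        );
    }
    $value = str_replace('\\', '/', $value);
    $value = trim($value, " \t\n\r/");
    $value = rawurldecode($value);
    $safe  = array(
        '-', '_', '.', '~',
        '!', '*', '\'', '(', ')', ';', ':', '@', '&', '=', '+', '$', ',', '/', '[', ']'
    );
    return preg_replace_callback(
        '/[^a-z0-9\\' . implode('\\', $safe) . ']/i',
        function ($matches) {
            return '%' . bin2hex($matches[0]);
        },
        $value
    );
}

var_dump(normalizeUrlPath(null));           // NULL (null passes through)
echo normalizeUrlPath('a%20b c'), "\n";     // a%20b%20c (mixed encoding made consistent)
echo normalizeUrlPath('%2Fnews%2F'), "\n";  // /news/ (decoded after trimming, '/' is safe)
echo normalizeUrlPath('a%5Cb'), "\n";       // a%5cb (encoded backslash is not converted)

try {
    normalizeUrlPath(42);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";            // Cannot normalize url path. Value must be a string.
}
```

Because trimming and backslash conversion happen before rawurldecode(), they apply only to unencoded characters: an encoded slash survives as a literal '/', and an encoded backslash is re-encoded rather than turned into a slash.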

The documentation for this class was generated from the following file: