alejandro at devenet dot net
15 years ago
When a client sends GET data, URL encoding can cause a small problem with UTF-8 characters.
Consider the "º" character.
Some clients send, for example,
foo.php?myvar=%BA
while other clients send
foo.php?myvar=%C2%BA (the "right" URL encoding, i.e. the UTF-8 bytes)

In this scenario, you assign the value to the variable $x:

<?php
$x = $_GET['myvar'];
?>

$x then holds "�" (bad) in the first case, because the lone byte 0xBA is not valid UTF-8, and "º" (good) in the second case.
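
To make the difference between the two cases visible, here is a small sketch that decodes both forms by hand and prints the raw bytes. It only illustrates the byte sequences; it is not part of the original fix. It assumes the mbstring extension is loaded for mb_check_encoding().

```php
<?php
// Decode the two query-string forms by hand to see the raw bytes.
$latin1 = rawurldecode('%BA');     // one byte: 0xBA
$utf8   = rawurldecode('%C2%BA');  // two bytes: 0xC2 0xBA

echo bin2hex($latin1), "\n"; // ba
echo bin2hex($utf8), "\n";   // c2ba

// mb_check_encoding() needs the mbstring extension (assumed to be loaded).
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
```

The single byte 0xBA is how "º" is encoded in CP1252 and ISO-8859-1, which is why it shows up as "�" on a UTF-8 page.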

To fix that, you can use this function:

<?php
function to_utf8( $string ) {
// From http://w3.org/International/questions/qa-forms-utf-8.html
    if ( preg_match('%^(?:
      [\x09\x0A\x0D\x20-\x7E]            # ASCII
    | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
    | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
    | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
    | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
    | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
    | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
    | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
)*$%xs', $string) ) {
        return $string;
    } else {
        return iconv( 'CP1252', 'UTF-8', $string);
    }
}
?>
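
As a possible alternative to the regular expression in the function above, the validity check could be sketched with mb_check_encoding(). This is only a sketch under assumptions: the name to_utf8_mb is hypothetical, the mbstring and iconv extensions are assumed to be loaded, and returning the input unchanged when iconv() fails is a choice made here, not something the original function does.

```php
<?php
// Hypothetical variant of to_utf8(); the name to_utf8_mb is illustrative only.
// Assumes the mbstring and iconv extensions are loaded.
function to_utf8_mb(string $string): string
{
    // Already valid UTF-8 (this includes plain ASCII): leave untouched.
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string;
    }
    // Otherwise treat the bytes as CP1252, as the original function does.
    // iconv() returns false for bytes it cannot map, so the error is
    // suppressed and the input is returned unchanged in that case.
    $converted = @iconv('CP1252', 'UTF-8', $string);
    return $converted === false ? $string : $converted;
}

var_dump(to_utf8_mb("\xBA") === "\xC2\xBA");     // bool(true)
var_dump(to_utf8_mb("\xC2\xBA") === "\xC2\xBA"); // bool(true)
```

The two checks are not identical: the W3C pattern rejects ASCII control characters other than tab, line feed and carriage return, while mb_check_encoding() accepts any well-formed UTF-8.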

and assign the value like this:

<?php
$x = to_utf8( $_GET['myvar'] );
?>

$x now holds "º" (good) in both the first and the second case.

This solves a lot of i18n problems.

Please fix the automatic URL decoding of $_GET variables in the next PHP version.

Bye.

Alejandro Salamanca
