PHP Manual Masterpieces

Oct 6

Two’s complewhat

I continued reading the comments on the bizarre dechex() conversion function, which works on an “unsigned int” type that doesn’t actually exist in PHP. The documentation says that negative integers are treated as though they were unsigned.

Someone took this to mean it creates incorrect results for negative numbers and tried to roll their own version, one that would produce negative hex representations of negative integers. Ignore the fact that it doesn’t prepend 0x to the output; neither does dechex().

function dec_to_hex($dec) 
{ 
    $sign = ""; // suppress errors 
    if( $dec < 0){ $sign = "-"; $dec = abs($dec); } 

/* ... an array-index based algorithm goes here */
    
    return $sign . $h; 
} 

If you pass 256, you get the output 100. If you pass -256, you get the output… -100. With a literal unary minus sign in front.

This opens an interesting question about what maketh a negative number. In a purely abstract mathematical sense, slapping a negative sign on a positive number is “correct” in any base. But this is computer programming, and hexadecimal is special, because we use it to map directly to literal bit values whereas decimal is an abstraction.

Assuming we add the 0x prefix (after the unary dash) to make real hexadecimal number tokens out of these results, we can do math like this: 257 + -0x100 returns 1 as expected. This is because PHP applies the unary minus to 0x100 when it evaluates the expression, which amounts to a two’s complement negation. -0x100 is not the actual hex representation of -256; it is an expression that evaluates to it. It’s like returning the string “2 + 3” and saying you’ve returned “5”. Only… sort of.
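A minimal sketch of that arithmetic, assuming any reasonably modern PHP. The variable name $result is mine, not something from the manual comment. It only shows that -0x100 is a unary minus applied to the positive literal 0x100, rather than a hex literal in its own right.

```php
<?php
// 0x100 is a positive integer literal (256); the leading "-" is a separate
// unary operator applied to it when the expression is evaluated.
$result = 257 + -0x100;

var_dump($result);          // int(1)
var_dump(-0x100 === -256);  // bool(true)
```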

The actual negation of 0x100 is 0xFFFFFF00 on 32-bit or 0xFFFFFFFFFFFFFF00 on 64-bit (note the lack of negative signs). If you are unsure where all those F’s came from, check out two’s complement representation. Honestly, this is a very complex subject and I don’t blame beginner programmers, or ones who have only ever dealt with scripting languages, for falling down on this.
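If the F’s still look like magic, here is a rough sketch of where they come from: flip every bit, then add one. It assumes a 64-bit PHP build (the 0xFFFFFFFF mask does not fit in a 32-bit integer), and the names $x, $negated, $low32 and $native are only for illustration.

```php
<?php
$x = 0x100;                    // 256

// Two's complement negation: invert every bit, then add one.
$negated = ~$x + 1;
var_dump($negated === -256);   // bool(true)

// Keep only the low 32 bits to see the pattern a 32-bit integer would hold.
// Assumes a 64-bit build, where 0xFFFFFFFF is still an integer.
$low32 = sprintf('%08X', $negated & 0xFFFFFFFF);
echo $low32, "\n";             // FFFFFF00 on a 64-bit build

// The same value printed at the native integer width
// (PHP_INT_SIZE bytes, two hex digits per byte).
$native = sprintf('%0' . (PHP_INT_SIZE * 2) . 'X', $negated);
echo $native, "\n";
```

Note that sprintf() with %X prints the raw bit pattern, so no minus sign ever shows up in the output.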

Here’s the thing, though. If you call dechex(-256), you will get ffffff00 on a 32-bit build (ffffffffffffff00 on 64-bit, and no 0x prefix either way), which is the correct bit pattern, even though the documentation explicitly warns that it treats its input as unsigned. When you understand why, you have achieved binary representation nirvana.
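For the skeptical, a quick check you can run yourself. This is just a sketch: the variable names are mine, and the exact number of digits depends on whether your PHP build uses 32-bit or 64-bit integers.

```php
<?php
// dechex() reads the integer's bits as if they were unsigned,
// so a negative input comes back as its two's complement pattern.
$hex = dechex(-256);
echo $hex, "\n";               // ffffff00 on 32-bit, ffffffffffffff00 on 64-bit

// The last 8 hex digits are the same on both widths.
$tail = substr($hex, -8);
echo $tail, "\n";              // ffffff00

// The digit count matches the native integer size (two digits per byte).
var_dump(strlen($hex) === PHP_INT_SIZE * 2);   // bool(true)
```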

I did not understand hexadecimal, binary, signed and unsigned, and casting - all concepts which PHP exposes but does not handle very gracefully - until PHP stopped being my only programming language. And that’s your PSA for the day.