Thursday, September 13, 2007

Right Shift Considered Harmful (Again)

CLDC 1.0 has no support for floats. So J2ME programmers generally use fixed point maths to represent decimals. The basic idea is that you multiply the integer by the precision factor you need, and then divide by that precision factor to get back your integer. By choosing a precision factor that's a power of 2, we can use left bitshift to covert to fixed point and right bitshift to get back the integer. This is a common optimization since integers are stored in binary format and left bitshift is equivalent to multiplying by 2 and right bitshift is equivalent to dividing by 2. So, for example a typical fixed point math class would look like this:
public final static int PRECISION = 16;
public final static int FACTOR = 65536;

public final static int toFp(int a){ return (int)((long)a << PRECISION); }
public final static int toInt(int a_fp){ return (a_fp >> PRECISION); }
public final static int addFp(int a_fp, int b_fp) { return a_fp + b_fp; }
public final static int subFp(int a_fp, int b_fp) { return a_fp - b_fp; }
public final static int mulFp(int a_fp, int b_fp) { return (int)((((long)a_fp * (long)b_fp)) >> PRECISION); }
public final static int divFp(int a_fp, int b_fp) { return (int)(((long)a_fp << PRECISION) / b_fp); }

But then while writing my first J2ME game using fixed point math, I had the weirdest problem: motion in directions where sin/cos was negative was always greater than in directions where sin/cos was positive. Debugging, I found that the problem was with right shifting negative integers: right-shifting a negative integer is not equivalent to dividing by 2 because negative integers are stored in 2's complement form!

To understand the problem, lets try a simple example:
int a = 3/2;           // = 1
int b = -3/2; // = -1
int c = 3 >> 1; // = 1
int d = -3 >> 1; // = -2!!!
int e = -3 >>> 1; // = 2147483646!!!

System.out.println("a="+a+" b="+b+" c="+c+" d="+d+" e="+e);

I had to understand what was going on here! So here's what I found out:
Since integer is a 32 bit binary in Java:
3 is represented as: 00000000 00000000 00000000 00000011

3 >> 1
= 00000000 00000000 00000000 00000001 (1) (Right most bit 1 is shifted off)
= 2^0 = 1

-3 is represented as: 00000000 00000000 00000000 00000011
then 1's complement = 11111111 11111111 11111111 11111100
then add 1 (2's complement) = 11111111 11111111 11111111 11111101 (Left most bit is sign bit)

-3 >> 1
= 11111111 11111111 11111111 11111101 >> 1
= 11111111 11111111 11111111 11111110 (1) (Right most bit 1 is shifted off and sign bit is retained)
= -(2^0 + 1) = -2 (Converting from 2's complement)

-3 >>> 1
= 11111111 11111111 11111111 11111101 >>> 1
= 01111111 11111111 11111111 11111110 (1) (Right most bit 1 is shifted off but sign bit is also shifted)
= (2^(32-1) + 2^0 + 1) = 2147483646 (Converting from 2's complement)

Apparently this problem is a very old one. I found this interesting paper by Guy Steele in 1976 that explains it in detail: Arithmetic Shifting Considered Harmful. At the time, right shifting instead of dividing by 2 was a common compiler optimization and compiler writers either did not grasp or did not document the difference. What amazes me is that 30 years later, I faced the exact same problem because this difference is still not properly documented. Ironically, even the Java Language Specification which was co-authored by Guy Steele has only a very cryptic reference to this issue. Of course the situation in Java is still better than C/C++ where the result of right shifting a negative integer is compiler/platform dependent.

No comments: