Monday, March 14, 2011

When Does X Not Equal X and Other Oddities

In many programming languages:
  • For what value of x is the condition (x == x) false?
  • For what integer value of y is the condition (y < 0) && (-y < 0) true?
i.e. define x on line 3 and y on line 10  in the following Java code sample so that lines 7 and 12 execute:
  1 public class Odd {
  2   public static final void main(String[] arg) {
  3     // define x here 
  4     if (x == x) { 
  5       System.out.printf("Reflexive%n"); 
  6     } else { 
  7       System.out.printf("Not Reflexive%n"); 
  8     } 
  9  
 10     int y = 0; // change 0 to something else
 11     if (y < 0 && -y < 0) { 
 12       System.out.printf("How can they both be negative?%n"); 
 13     } 
 14   } 
 15 }
In SQL:
  • What database value x is the condition (x = x) false?
  • What about values y and z such that ((y = z) OR (y != z)) is false?











These next few lines are intentionally left blank, in case you wan to think about these before viewing possible answers.











Answers
In mathematics, one of the properties of the equality operator is that it is reflexive. It turns out that the Java (and almost every other language) equality operator does not have this property.  If you put the following on line 3:
double x = 0.0/0.0;
and run the program, it'll print out "Not Reflexive".  Why is this?  Well, according to the IEEE standard 754, 0/0 = NaN. What is NaN?  Well, it is not a number, corresponding to the mathematical concept of an indeterminate number. Because it could be any number, the powers that be decided that NaN != NaN.

So what about the second example? Well, in a 2's complement system of encoding numbers, there is room for one more negative number than positive numbers. (e.g. a 16 bit signed value encodes the values -32,768 to 32,767). What this means is that if you try to negate the most negative number (e.g. -32,768), the corresponding positive number won't fit in the memory location, and so you actually get the same number back. This means an answer to line 10 is:
int y = Integer.MIN_VALUE;

What about the SQL questions? Well, it turns out that SQL treats NULL like NaN, i.e. it is an indeterminate value. SQL goes even further than Java.  Not only is the expression (NULL = NULL) result in false, but (NULL != NULL) also results in false. Well, actually that's not quite true. SQL uses three-valued logic, and so both the expressions actually evaluate to undefined, but undefined is treated as false when it comes to matching rows.

Why does this matter?
Have you ever written code where it is assumed that if two numbers have the same value they will be equal? How about that Math.abs() will always return a negative value? Eric Lippert had a blog post showing how this assumption can cause the standard sort functions to fail.

I personally have generated two reports by running queries like:
SELECT * FROM Table WHERE columnA = columnB -- report 1
SELECT * FROM Table WHERE columnA != columnB -- report 2
and then was mightily confused when, at some point during the analysis, I realized that there were rows in my database table that weren't in either report.

These may feel like nitpicky details that only show up on those gotcha type quizzes. However, not understanding these details can lead to bugs in your code. If you are going to be an expert programmer, you have to know and understand the rough edges of your language.

No comments: