Hungarian Notation

From Jonathan Gardner's Tech Wiki

Abstract

Physicists have long understood that numbers are not all alike: every number carries an inherent "unit". The "Apps Hungarian" system described in http://www.joelonsoftware.com/articles/Wrong.html is one interpretation of this idea. What is truly needed is the concept of units embedded in the programming language itself.

What Physicists Understand that Mathematicians Don't

In Mathematics, a number is something abstract. In Physics, numbers have a real meaning. "5" could be 5 meters, or 5 seconds, or 5 kilograms, or even 5 identical things.

Multiply "5" by "5" and you get "25" in mathematics. In Physics, "5 meters" times "5 units" gives you "25 meters", and "5 kilograms" times "5 seconds" gives you "25 kilogram-seconds".

Now, there is a simple, fundamental mathematical principle behind all this that makes it irrelevant to mathematicians: "meters", "seconds", "units", and so on are really numbers themselves, which can be added, multiplied, subtracted, and divided. For instance, "4.2 km" is really "4,200 m" because "km = 1,000 m", so "4.2 km" is exactly "4.2 times km". Of course, some units don't match and can't be compared. How many meters are in a second? The question simply doesn't make sense, because the two measure completely different things.
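To make the idea concrete, here is a minimal Python sketch of units treated as things you multiply and divide. The Quantity class and its wrap and _merge helpers are names invented for this illustration, not part of any existing library, and the sketch only tracks unit exponents; it makes no attempt at automatic conversion between unrelated unit systems.

```python
class Quantity:
    """A magnitude together with unit exponents, e.g. {"m": 1, "s": -1}."""

    def __init__(self, magnitude, dims=None):
        self.magnitude = magnitude
        # Drop zero exponents so that m / m ends up with no unit at all.
        self.dims = {u: e for u, e in (dims or {}).items() if e != 0}

    @staticmethod
    def wrap(value):
        """Treat a bare number as a quantity with no unit."""
        return value if isinstance(value, Quantity) else Quantity(value)

    def _merge(self, other, sign):
        # Add (sign=+1) or subtract (sign=-1) the other side's exponents.
        dims = dict(self.dims)
        for unit, exponent in other.dims.items():
            dims[unit] = dims.get(unit, 0) + sign * exponent
        return dims

    def __mul__(self, other):
        other = Quantity.wrap(other)
        return Quantity(self.magnitude * other.magnitude, self._merge(other, +1))

    __rmul__ = __mul__  # lets us write 5 * m as well as m * 5

    def __truediv__(self, other):
        other = Quantity.wrap(other)
        return Quantity(self.magnitude / other.magnitude, self._merge(other, -1))

    def __add__(self, other):
        other = Quantity.wrap(other)
        if self.dims != other.dims:
            raise TypeError(f"cannot add {self!r} and {other!r}: units differ")
        return Quantity(self.magnitude + other.magnitude, self.dims)

    def __eq__(self, other):
        other = Quantity.wrap(other)
        return (self.magnitude, self.dims) == (other.magnitude, other.dims)

    def __repr__(self):
        return f"Quantity({self.magnitude!r}, {self.dims!r})"


# Base units are just quantities of magnitude 1.
m = Quantity(1, {"m": 1})
s = Quantity(1, {"s": 1})
kg = Quantity(1, {"kg": 1})
km = 1000 * m  # "km = 1,000 m"

print(4 * km == 4000 * m)   # True
print((5 * kg) * (5 * s))   # Quantity(25, {'kg': 1, 's': 1})
try:
    (5 * m) + (5 * s)       # meters and seconds do not match
except TypeError as err:
    print(err)
```

Multiplication and division always succeed and simply combine the units, while addition refuses mismatched units, which is the "how many meters in a second?" case.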

What Programmers Can Learn

In Joel's article[1], he puts the units in the variable names. This is nice, and perhaps appropriate for some programming languages. My Lisp experience teaches me that anything logical we can do by hand, a computer program can do much better. That means there is a lot of work our languages can do for us.

For instance, let's suppose that we invented two units: "safe string" and "unsafe string". The system could behave as follows.

  • A string is simply a number--a magnitude of some unit, either "unsafe string" or "safe string".
  • By default, all strings are "safe" until they are multiplied by "unsafe string". To reverse, divide by "unsafe string".
  • Any string entering your program which is unsafe should be multiplied by "unsafe string".
  • The "print" function knows how to print "safe strings", but it doesn't know how to print "unsafe strings". That means you cannot print "unsafe strings".
  • A safe string concatenated with an unsafe string is an unsafe string. A warning could be thrown indicating that you are doing something you probably didn't intend to do.
  • The only way to make a safe string from an unsafe string is to analyze the unsafe string and generate a new string from scratch, not composed of any part of the unsafe string.
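The rules above might look roughly like this in Python. This is only a sketch under assumptions: UnsafeString, safe_print, make_safe, and _text_of are hypothetical names invented here, plain str plays the role of the default "safe string", and HTML escaping via the standard html.escape stands in for whatever "safe" means in a given program.

```python
import html


def _text_of(value):
    """Return the raw text of either a plain str or an UnsafeString."""
    return value.text if isinstance(value, UnsafeString) else str(value)


class UnsafeString:
    """Text that entered the program from outside and has not been checked."""

    def __init__(self, text):
        self.text = text

    # Concatenating anything with an unsafe string yields an unsafe string.
    def __add__(self, other):
        return UnsafeString(self.text + _text_of(other))

    def __radd__(self, other):
        return UnsafeString(_text_of(other) + self.text)


def safe_print(value):
    """Print safe strings only; refuse anything still marked unsafe."""
    if isinstance(value, UnsafeString):
        raise TypeError("refusing to print an unsafe string")
    print(value)


def make_safe(value):
    """Analyze the unsafe text and build a brand-new, escaped plain str."""
    return html.escape(value.text)


name = UnsafeString("<b>Bob</b>")      # tagged at the point of entry
greeting = "Hello, " + name            # still unsafe after concatenation

try:
    safe_print(greeting)
except TypeError as err:
    print(err)                         # refusing to print an unsafe string

safe_print(make_safe(greeting))        # Hello, &lt;b&gt;Bob&lt;/b&gt;
```

The point of the design is that the "unsafe" mark travels with the value, so forgetting to escape becomes an error at print time rather than something a reader has to spot in a variable name.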

The above system could be applied to any type.

  • Any object has attached to it its "units".
  • By default, all objects have no unit.
  • Multiplying or dividing by a unit adds or removes the unit.

Or even more simply:

  • Every object has a "unit" attribute that can be set to an arbitrary string.

Possible units:

  • pixels in window coordinates
  • pixels in page coordinates
  • difference
  • etc...
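As a rough illustration of the simpler "unit attribute" variant, the sketch below tags a value with an arbitrary string and refuses to add values whose tags differ. The Tagged class and the two constants are made up for this example; a real system would need more operations than addition.

```python
class Tagged:
    """A value plus a free-form 'unit' string (None means no unit)."""

    def __init__(self, value, unit=None):
        self.value = value
        self.unit = unit

    def __add__(self, other):
        # Only values carrying exactly the same unit string may be added.
        if not isinstance(other, Tagged) or other.unit != self.unit:
            raise TypeError(f"unit mismatch: {self!r} + {other!r}")
        return Tagged(self.value + other.value, self.unit)

    def __repr__(self):
        return f"Tagged({self.value!r}, {self.unit!r})"


WINDOW_PX = "pixels in window coordinates"
PAGE_PX = "pixels in page coordinates"

x_window = Tagged(120, WINDOW_PX)
x_page = Tagged(900, PAGE_PX)

print(x_window + Tagged(30, WINDOW_PX))  # Tagged(150, 'pixels in window coordinates')
try:
    x_window + x_page                    # mixing coordinate systems
except TypeError as err:
    print("rejected:", err)
```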

More thought is needed, but I believe such a system can be easily implemented on top of most truly flexible programming languages.