Proofs through typography with Hungarian notation
The virtue of formal texts is that their manipulations, in order to be legitimate, need to satisfy only a few simple rules; they are, when you come to think of it, an amazingly effective tool for ruling out all sorts of nonsense that, when we use our native tongues, are almost impossible to avoid.
Edsger W. Dijkstra, On the foolishness of "natural language programming".
Hungarian notation
Joel Spolsky has this idea of “making wrong code look wrong” as a way to aid your typechecker. He was writing in 2005, back when fancy dependent type systems like TypeScript were still stuck in the lab, and production typecheckers were limited to primitive concepts like int
and bool
and vector<vector<vector<string>>>
. But his essay is still relevant - unless you have a proofchecker, there will always be certain properties of your program that your off-the-shelf typechecker won’t be able to validate. In that case, you’ll have to simulate your types using names.
His example of making wrong code look wrong is the concept of (Apps) Hungarian Notation. He explains that, although Hungarian Notation gets a bad rap nowadays, its original form was useful to many developers writing application code.
For example, let’s say that you’re writing a website that is vulnerable to XSS attacks. The safety property you have to maintain is:
All strings that originate from the user are unsafe. Any unsafe string must not be output without encoding it.
Using Hungarian Notation makes it easy to maintain this property by following this simple - local - set of rules:
All user-input strings get the prefix
us
- for example,us_input
. This includes functions that return user input, likeus_get_param
.All safe strings get the prefix
s
- for example,s_page_title
.You can’t output any string prefixed with
us
. To enforce this, the only output functions get ans
as their last letter, for examplewrite_s(s_page_title)
.You’re never allowed to assign a
us
string to ans
string, or to mix them in any way, unless you use the functions_from_us
, which HTML-encodes the string.
Each of the four rules above is checkable one line at a time. In other words, if there’s a part of your program that violates the safety property, it’s a single line on which one of the above four rules is violated. For example, can you find which lines have an issue here?
us = us_request("name")
s = us_request("name")
us_name = us
s_name = us
s_name = s_from_us(us)
It’s easy to check the correctness of code written according to these rules, because all you have to do is check that every us
matches and every s
matches across the assignment equals sign, except for the ones that are “canceling out” across a parentheses to a from
-named call. For example, in s_name = s_from_us(us)
, the two us
mentions cancel out, since they sit across a parentheses boundary, while the two s
mentions match across the equals sign.
Coordinate frames
At Mach9, we follow a similar rule for coordinate frame transformations. For example, on the platform, we’ll often work in two different frames:
Project frame, which is the original coordinate frame of the point cloud dataset. This is typically a UTM or a local Mercator projection, and will have
x
pointing east,y
pointing north, andz
pointing directly up.
WGS84 ECEF (”earth-centered earth-fixed) coordinate system, which is a coordinate system that’s defined globally across the globe.
x
points out from the center of the earth towards (0, 0) latitude and longitude,y
points from the center towards (0, 90), andz
points towards the North Pole.
Coordinates in either of these coordinate frames are represented as Cesium.Cartesian3
objects, so the Typescript type system isn’t sufficient to disambiguate between the two. Therefore, we’ve adopted the following rules:
All coordinates in project frame start with
project_
.All coordinates in WGS84 ECEF start with
ecef_
.We have a special set of helper functions called
project_from_ecef_
andecef_from_project_
that perform conversion between the two, and that’s the only way we can convert.
For example, some representative code might look like:
const project_position = await loadProjectPositionFromDB();
const ecef_position = ecef_from_project_cartesian3(project_position);
Cesium.addObject(ecef_position, ...); // add other attributes
The general recipe is as follows:
Define your (finite) list of possible coordinate systems.
All quantities must be prefixed by a coordinate system name.
All of your conversion functions must be named
cs2_from_cs1
.Never mix between coordinate systems. All prefixes must either:
Match entirely within the line, like on line 1
cancel out with a
from
, like on line 2
Einstein notation
In physics, we often need to deal with a variety of vectors and matrices that are related via equations. (These are called tensor equations, but the word tensor means something different than it does in ML.)
When you change the coordinate frame that you’re working in, the numbers in a vector - its coordinates - change, but the underlying physics don’t. Apples fall to the ground at the same speed regardless of what coordinate frame you’re using for your calculations. So physicists are mostly interested in invariant quantities, numbers that don’t change when the coordinate frame does. For example, the length of a vector doesn’t change when you change your frame, and neither does the angle that vector makes with others.
Unfortunately, it’s hard to keep track of exactly what you need to do to keep your equations invariant to coordinate changes. For example, the change-of-basis equation for a matrix is really complicated; there’s a conjugation by the basis, but I and everyone else can’t remember on what side the transpose goes. Fortunately, physicists have come up with a written trick to keep track.
This trick is called Einstein notation, named after the physicist who first brought the plague of tensor calculus upon the graduate departments. Einstein notation has the following rules:
Every tensor gets one coordinate index, called a “dummy label”, per rank (or for ML people per dimension). This label is either upper or lower, depending on if the tensor is a vector or a covector in that dimension.
That’s the mathematician’s way of saying it. For us mortals, we get three simpler rules:Vectors get one upper index:
\(v^a\)Matrices get one lower and one upper index:
\(M_a^b\)For complicated reasons, the output of a cross product actually gets an lower index:
\(c_b\)
Every dummy label appears exactly twice in an equation.
If the label appears twice on one side, they have to come in a pair, one upper and one lower, so they cancel out.
If the label appears once on each side, they have to both be upper or both be lower.
You also get access to a few special symbols you’re allowed to put anywhere:
This one:
\(g_{ab} \text{ or } g_{a}^b \text{ or } g^{ab}\)called the metric tensor. This handles coordinate frame changes for you.
This one:
\(\epsilon_{ijk}\)
called the Levi-Civita tensor. This handles cross-products.
Once you’ve written your equation, you now put an implicit sum over all contracted indexes, and everything else is treated coordinate wise. Now all your equations are invariant to base change!
For example, here’s the length of a vector:
Here’s a matrix multiplication, and notice how the upper-lower contraction corresponds to the inner sum loop:
Notice the similarities between this and Hungarian notation:
A list of simple-to-check rules gets you a complicated property. Checking that each index in turn is used correctly guarantees that you have invariance. Likewise, checking that each line of code follows the Hungarian rules guarantees that we have no XSS attacks.
Every index is used once on each side, or canceled out. In our Hungarian notation, each type tag -
us
ors
- was either on each side, or canceled out across a parentheses.
These similarities aren’t just coincidence. In fact, upper indices correspond to a type of tensor called a contravariant vector, and lower indices are covariant vectors. You can get between contravariant and covariant vectors only by multiplying with the metric in a complicated process, but that’s totally automated by the Einstein rules. Likewise, you can convert unsafe to safe strings only with the s_from_us
encoding function.
Common themes
First, all three examples involve combinatorial complexity. In the XSS scenario, you now have to deal with twice as many possible strings as before, since any string can be treated both in its safe and unsafe form. When you use different coordinate frames, you now multiply all of the different geometric properties of your system by n different frames that they can be expressed in. A physics tensor of rank 5 has 2^5 = 32 different covariant or contravariant forms.
Second, in all three cases, you can separate out the combinatorics into two separate pieces. This is the distinction between the us_
part of a string variable’s name and the _page_title
part, or the difference between the project_
and the _position
parts of the object position variable’s name. In the physics case, the underlying tensor represents a definite physical quantity, and it’s only the frame or coordinates that we express it in that can change.
Third, switching from one “type” to another “type” is totally deterministic. There’s only one possible way to XSS-escape a string - to go from us
to s
. Likewise, there’s only one way to transform a point from project
frame to ecef
frame, or to transform a tensor from contravariant to covariant. Therefore, if you see a line of code or math that needs to mix types, you know exactly where and when to insert the conversion functions.
When you see these three themes come together, you can reduce the combinatorial complexity by following the above unifying guidelines, now expressed in abstract form:
Identify the “frame”- or “type”-like part, and split it off in the naming convention.
Name your variables both by their “frame”/“type” and their intent, in some clearly typographically separate way.
Introduce a set of very constrained “type” or “frame” transformation functions or procedures, and make it so that they only operate within a single line or sentence.
Never mix your “types” in a single line, except by using the specific transformation functions.