2013: Advent Computing: Compilations

December 7, 2013

A computer will take a sequence of numbers and interpret them as instructions. It has to – a computer can only store numbers, and the meaning of those numbers (that they’re instructions the computer should perform, rather than, say, ingredient weights for baking a cake) comes from their context.
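To make the point about context concrete, here is a minimal sketch in Rust (a compiled language picked for illustration; this post doesn't name one). The function names are invented for the example. The same two stored numbers are read once as a single larger number and once as a piece of text, and nothing in the numbers themselves says which reading is the right one.

```rust
/// Read two stored bytes as one 16-bit number (lowest byte first).
pub fn as_number(bytes: [u8; 2]) -> u16 {
    u16::from_le_bytes(bytes)
}

/// Read the same two bytes as text, if they happen to form valid UTF-8.
pub fn as_text(bytes: &[u8; 2]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}
```

Given the bytes 72 and 105, `as_number` yields 26952 while `as_text` yields "Hi"; a processor fetching those same bytes as an instruction would give them a third meaning again.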

But writing a program just by writing the numbers is hard, and even the assembly language mnemonics only help a bit, so we turn to “compiled” languages.

A compiled language is a language for writing programs (which are just those lists of numbers, remember) in something that’s a bit easier to read and understand than assembly language, and provides a selection of more useful features.

One useful feature of a compiled language, as mentioned yesterday, is that it’s often “platform independent”. That is, given some program written in a compiled language, and the right compiler to convert it, you can have it translated into a variety of different machine languages depending on the computer that will run the program. This is often referred to as “portability”; the program written in the compiled language can be ported to, and compiled for, different computers.
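As a rough illustration of portability, the Rust sketch below (Rust being my choice of example language, and `built_for` a made-up name) is a single piece of source text that can be compiled for different machines. The constants it reads come from Rust's standard library and are fixed by the compiler for whichever target it is translating to.

```rust
/// Report which processor type and operating system this copy of the
/// program was compiled for. The source is identical on every platform;
/// only the compiler's output differs.
pub fn built_for() -> String {
    format!("{} on {}", std::env::consts::ARCH, std::env::consts::OS)
}
```

Compiled for one computer this might report something like an x86_64 processor, and compiled for another an ARM one, without a single line of the program changing.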

Perhaps more useful is the idea of variables and classes. Say I’m writing an architecture program, and want to store the length and width of a kitchen. In an assembly language, I’d need to find some memory addresses I can use, store the measurements at those memory addresses, and refer to those memory addresses whenever I wanted to check the lengths. If, however, I use a compiled language, I can store those measurements as, say, “kitchen-length” and “kitchen-width”, and just use those names whenever I want to check the size of the room; the compiler will look after finding and using suitable memory addresses. The machine language version produced by the compiler will still worry about memory addresses, but I won’t need to.
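A minimal sketch of that idea in Rust (an illustrative choice of compiled language) might look like the following. The measurements are invented, and the names use underscores rather than hyphens because Rust doesn't allow hyphens in variable names.

```rust
/// Work out the kitchen's floor area using named variables.
/// No memory address appears anywhere; the compiler chooses them.
pub fn kitchen_area() -> f64 {
    // Measurements in metres (made-up values for the example).
    let kitchen_length = 4.0;
    let kitchen_width = 3.0;

    kitchen_length * kitchen_width
}
```

Where `kitchen_length` and `kitchen_width` actually live in memory is entirely the compiler's business, and it may pick different places on different computers.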

Once variables exist, the compiler can start doing lots of clever tricks. For example, if I were to describe, in my compiled language, what a “room” is, I could create a variable for the kitchen, and the compiler would know it needs a width and a length, and may even let me tell it that multiplying those two numbers gives the floor area. I can then create the bedroom and the bathroom as new “instances” of the room “class”, and the area calculation I described once will work for them just as it does for the kitchen.
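Here is one way the room idea could be sketched in Rust, chosen purely as an example language. Rust calls this kind of description a `struct` rather than a class, and the room sizes and the `total_floor_area` helper are assumptions made up for the illustration.

```rust
/// What a "room" is: something with a length and a width, in metres.
pub struct Room {
    pub length: f64,
    pub width: f64,
}

impl Room {
    /// Floor area, described once and shared by every room.
    pub fn area(&self) -> f64 {
        self.length * self.width
    }
}

/// Create three instances of Room and add up their floor areas.
pub fn total_floor_area() -> f64 {
    let kitchen = Room { length: 4.0, width: 3.0 };
    let bedroom = Room { length: 5.0, width: 4.0 };
    let bathroom = Room { length: 2.0, width: 2.5 };

    kitchen.area() + bedroom.area() + bathroom.area()
}
```

If I forgot to give the bathroom a width, the compiler would refuse to translate the program, because it knows every `Room` needs both measurements.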

The compiler could also warn me about, say, trying to add the area of one room to the length of another, because that’s unlikely to be something I’d want to do. An assembler wouldn’t be able to do this, since it wouldn’t have sufficient context to know that the number in one bit of memory is in metres, while the number in another is in square metres (or kelvin or tonnes of apples or some other obviously incompatible unit).
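One common way to get this kind of check is to give each unit its own type. The Rust sketch below is an assumption about how that might be done, not a description of any particular program; `Metres`, `SquareMetres` and `combined_area` are names invented here. Rust goes a step further than a warning and refuses to compile the mismatched addition, so that line is left commented out.

```rust
use std::ops::{Add, Mul};

/// A length, in metres.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Metres(pub f64);

/// An area, in square metres.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SquareMetres(pub f64);

/// Multiplying two lengths gives an area.
impl Mul for Metres {
    type Output = SquareMetres;
    fn mul(self, other: Metres) -> SquareMetres {
        SquareMetres(self.0 * other.0)
    }
}

/// Two areas can be added together.
impl Add for SquareMetres {
    type Output = SquareMetres;
    fn add(self, other: SquareMetres) -> SquareMetres {
        SquareMetres(self.0 + other.0)
    }
}

pub fn combined_area() -> SquareMetres {
    let kitchen = Metres(4.0) * Metres(3.0);
    let bedroom = Metres(5.0) * Metres(4.0);

    // Uncommenting the next line stops the program compiling, because
    // nothing above says how to add an area to a length:
    // let nonsense = kitchen + Metres(5.0);

    kitchen + bedroom
}
```

In the finished machine language all of these are just numbers again; the unit labels exist only while the compiler is checking the program.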

Finally, a compiler will often do some cunning tricks as it turns the program you wrote in your compiled language into the machine language for your computer. As a simple example, if your program only runs a particular bit of code when some bit of memory is above 42, and the compiler can analyse the program and see that memory will never get above 27, it could simply leave that bit of code out of the machine language it produces, meaning your program is smaller and faster. These tricks can be complicated, but a lot of research goes into making sure they’re not going to be used incorrectly, and they can have a substantial impact on how quickly a program runs.
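The 42-and-27 example could look something like this in Rust (used here only as an illustrative compiled language; `describe_level` is a made-up name). Whether a particular compiler really drops the unreachable branch depends on the compiler and its optimisation settings, so treat this as a sketch of the situation rather than a guarantee of what any compiler does.

```rust
/// Classify a reading that has been capped at 27 before it is checked.
pub fn describe_level(raw: u8) -> &'static str {
    // After this line, `level` can never be larger than 27.
    let level = raw.min(27);

    if level > 42 {
        // This can never run. A compiler that works out that
        // `level` is at most 27 is free to leave this branch
        // out of the machine language altogether.
        "above 42"
    } else {
        "42 or below"
    }
}
```

Whatever number goes in, the answer is always "42 or below", so removing the first branch changes nothing about what the program does, only how big and how fast it is.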

Of course, both assembled and compiled languages need an assembly or compilation step to turn them from whatever the programmer wrote to something the computer can run. Not all languages are like that, and “scripted” or “interpreted” languages, which can bypass that step entirely, are on the plate for tomorrow.
