Saturday, March 29, 2008

How does hardware evolution affect programming language design?

I recently watched an interesting webcast, Programming Language Design and Analysis Motivated by Hardware Evolution, by Professor Alan Mycroft (the webcast's link is accessible only from within the IBM Intranet). Here are a few of the key points I took away.

Not everything scales linearly
As chip designers continue to scale down chips and transistors, they are hitting design walls. Some of these walls stem from the fact that as a transistor's physical size shrinks, other properties of the chip do not scale down linearly with it. The simplest example is dimensional: consider length vs. surface area. Reducing a square's side to 50% of its original length reduces the square's area by 75%, not a linear change. Likewise, various electrical characteristics may change at a different rate than the rate at which length changes.
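Just to make that square example concrete, here is a throwaway C++ snippet (my own illustration, not something from the webcast) that prints the non-linear area change:

    #include <iostream>

    int main() {
        double side = 1.0;          // original side length (arbitrary units)
        double scaled = side * 0.5; // side reduced to 50%

        double original_area = side * side;
        double scaled_area   = scaled * scaled;

        // Halving the side leaves only 25% of the area: a 75% reduction.
        std::cout << "area shrinks to "
                  << (scaled_area / original_area) * 100
                  << "% of the original\n";
        return 0;
    }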


Where is my 12GHz CPU?
Moore's law, which predicts that the number of transistors on a chip doubles roughly every 18 months, is still in effect; sadly, this no longer translates into higher clock speeds. When transistors are miniaturized, distances within the chip shrink as well, which should mean a speed increase. But because of heat dissipation problems (not everything shrinks at the same rate, and the heat generated is one of the things that doesn't, remember?), chip designers are forced to reduce the voltage at which the chip's components operate. Hence no clock speed gain.
It does, however, let us squeeze more cores onto that optimal one cm^2 silicon pad. Hence the multi-core technological path the industry has resorted to in the last couple of years.

Quad core

There's always a trade-off
As the voltage at which the chip operates drops, chip designers are starting to face computation inaccuracy problems. How can we live in peace with this imprecision, the professor ponders? Must we insist on absolute accuracy? Consider the task of rendering video: do we really care about the correctness of every pixel in every frame? Probably not. Just remember those old analog VCRs and audio cassettes; they were highly inaccurate and still delivered the goods. We might decide to compromise on accuracy, some of the time, in order to gain speed, just another kind of trade-off. Programming language designers should assist chip designers by developing languages that can operate in a world of non-absolute certainty.
Also think about the built-in error correction mechanisms put into network protocol stacks.
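To make the accuracy-for-speed trade a bit more tangible, here is a small C++ sketch of my own (not from the talk): it computes a frame's average brightness exactly over every pixel, and then approximately by sampling only a fraction of them:

    #include <cstddef>
    #include <cstdint>
    #include <numeric>
    #include <vector>

    // Exact average brightness: touches every pixel.
    double average_exact(const std::vector<uint8_t>& pixels) {
        return std::accumulate(pixels.begin(), pixels.end(), 0.0) / pixels.size();
    }

    // Approximate average: samples every 16th pixel. Less precise, but
    // roughly 16x less work -- the kind of accuracy-for-speed trade that
    // may be acceptable for video-like workloads.
    double average_sampled(const std::vector<uint8_t>& pixels, std::size_t stride = 16) {
        double sum = 0.0;
        std::size_t count = 0;
        for (std::size_t i = 0; i < pixels.size(); i += stride) {
            sum += pixels[i];
            ++count;
        }
        return count ? sum / count : 0.0;
    }

The sampled version does about a sixteenth of the work; whether its answer is "close enough" is exactly the kind of question a language built for non-absolute certainty would let us express.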

Better in one world, worse in the other
A major problem with multi-core chip processing is that although inter-core communication enjoys high bandwidth (2.5GB/s), it suffers from high latency (~75 clock cycles).
Another problem is that programs are written against a shared memory model, in which all cores must coordinate when accessing the shared main memory, and each core's cache must be refreshed quite often. While this doesn't seem like a major problem for dual or quad cores, think about how heavily it limits performance on a, not so futuristic anymore, 128-core chip.
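A classic illustration of that coordination cost (my own sketch, not the professor's) is a shared counter: when every thread increments the same memory location, the cache line holding it bounces between cores, whereas if each thread counts privately and the results are combined at the end, shared memory is barely touched:

    #include <atomic>
    #include <cstdint>
    #include <thread>
    #include <vector>

    // Contended version: every core hammers the same shared counter, so the
    // cache line holding it ping-pongs between cores -- exactly the kind of
    // shared-memory coordination cost that grows with the core count.
    void count_shared(std::atomic<uint64_t>& counter, uint64_t iterations) {
        for (uint64_t i = 0; i < iterations; ++i)
            counter.fetch_add(1, std::memory_order_relaxed);
    }

    // Less contended version: each thread accumulates into a local variable
    // and writes to shared memory only once, when it is done.
    uint64_t count_private(unsigned threads, uint64_t iterations) {
        std::vector<uint64_t> partial(threads, 0);
        std::vector<std::thread> workers;
        for (unsigned t = 0; t < threads; ++t)
            workers.emplace_back([&partial, t, iterations] {
                uint64_t local = 0;          // thread-private, no coordination needed
                for (uint64_t i = 0; i < iterations; ++i)
                    ++local;
                partial[t] = local;          // single write to shared memory
            });
        for (auto& w : workers) w.join();

        uint64_t total = 0;
        for (uint64_t p : partial) total += p;
        return total;
    }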
Trying to refrain from shared main memory access might turn the tables on some of the disciplines we have come to regard as obvious. For example, when you code a parameterized function you declare how its parameters are passed: either by reference or by value. Declaring this at coding time (rather than deciding it at runtime) can be regarded as "early binding". From a performance perspective, everybody knows that passing by reference is almost always faster than passing by value (assuming you don't intend to change the passed value). This preferred way of doing things might not hold true on a multi-core system, which has to incur an expensive overhead whenever it accesses the data the reference points to in shared main memory; no such price has to be paid if the parameter is passed by value. One way future programming languages might deal with this is to allow "late binding" of the parameter passing method: when running on a chip with only a few cores, a pass by reference will occur, whereas on a core-rich chip a pass by value will be selected. This works whenever the choice between reference and value makes no difference to the program logic (no changes to the parameter's data are visible to the caller, and the parameter's data is not accessed concurrently), so the two can be used interchangeably.
Future languages will need to support this "late-binding" feature and others like it.
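As a rough C++ approximation of what a late-bound passing mode might look like (entirely my own sketch; C++ has no such feature, so the "decision" is faked with a compile-time flag standing in for information the runtime would supply):

    #include <cstddef>
    #include <string>
    #include <type_traits>

    // Hypothetical: imagine this flag being supplied by the runtime,
    // based on how many cores (and how costly shared-memory access) it sees.
    constexpr bool kManyCores = false;

    // Pick the passing mode for a read-only parameter: copy it when cores are
    // plentiful and shared-memory traffic is expensive, otherwise pass a reference.
    template <typename T>
    using ParamType = std::conditional_t<kManyCores, T, const T&>;

    std::size_t describe(ParamType<std::string> text) {
        // The body never mutates `text`, which is the precondition for the
        // two passing modes being interchangeable.
        return text.size();
    }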

Summing up
It will be interesting to keep following these trends of mutual influence between hardware and software.