Sunday, January 29, 2012

Compiler Mythology

One thing that I've grown to dislike over the years is the holier-than-thou attitude common on coding question forums like Stack Overflow with regard to optimization and performance questions.  Often a new programmer will have some interest in learning about the relative speed of a few operations, and will ask a question like "which is faster, a for loop or a while loop?"  Almost every time, the typical response will contain one of the following:
#1 Profile it in your app and only optimize it if you need to. 
#2 The compiler will make the best choice, and was written by smarter people than you, so don't worry about those things. 
#3 Why would you ever need to make a program go faster?
#4 Optimization before there's a need is evil, since it is premature....
#5 Optimize your algorithms, not the code itself...

I can't help but want to smack all of these people, even if they do have partially valid points. Most people do not need to optimize heavily (or so they think), and are fine with wasting cycles all over the place, since in many applications the waste will be lost within the noise of a human response to an action.  However, the attitude that someone is an idiot for even wanting to know more about this mystical topic called performance just annoys the heck out of me.  We programmers live in a world where input is translated to output based purely on logic, reason, and math, and yet the compiler is treated as a whimsical source of randomness that cannot be predicted, known, or even guessed at unless you have actually looked at what the oracle of compilation has given you for a specific piece of code.

Yes, compilers vary between gcc, msvc, borland, whatever.  Yes, compilation results vary based on your platform and the subset of features you use. However, you are always compiling for SOME platform, not an abstraction of a platform.  In the game programming world, you generally have a whopping three different platforms, and in the PC world, you are generally focused on either x86 or x64.  How could one ever learn about a whole two or three different compilers and platforms at the same time! Mysticism! 

Either way, there is nothing magical preventing even a new programmer from learning little tidbits here and there, if we tell them what performance is actually about and how they could begin to gather real knowledge about it.  The typical response is probably based on the fact that those programmers have gotten this far in their careers without needing to know much about performance, and therefore no one else needs to know anything about it either.  This is yet another reason why computer programs have remained at a similar level of responsiveness (sluggish) even with Moore's law kicking butt all along. There's really no reason for lag in any non-video-game application. None! And yet we tolerate it all over the place.  Dismissing and redirecting the original intent of the question (to seek knowledge) and responding with one of these worthless answers just continues this fear of getting to actually know how computers work. There is not one programmer who would not benefit from a better understanding of the hardware their code will run on, and yet we have these jerks responding as if knowing anything below a high-level language is just a waste of time.

A more useful response would be quite simple: shut up and let those who actually know the answer provide it, or at least explain the variables involved and why it is actually a difficult question to answer.  To me, these responders sound like the people who think we should teach less math since we have them fancy calculators to do it for us. Without a basic understanding of how code is transformed into something a processor can handle, a programmer is much more likely to write inefficient, crappy code, focusing on elegant layers upon layers of abstraction instead of solving the real problems at hand.

Here are my quick responses to all five example responses to the initial fake question:
#1 Code is likely to generate similar executable code regardless of context, so you can still learn and predict based solely on what is generated (see the sketch after this list).
#2 The compiler has to live with what you told it to do, the C++ standard, etc., and it will not ALWAYS make the best choice with regard to what you actually intend. Also, compilers are written by regular old humans who occasionally do make mistakes.
#3 Blech.
#4 That's like saying you shouldn't learn to swim until you are busy drowning in a river.
#5 I would assume that any professional programmer is smart enough to not even have to be told this. Also, I'll probably do a post one day on the silliness of some decisions made solely on reducing the big O notation of some code...
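To make #1 and #2 concrete, here is a minimal sketch you can run through a compiler yourself (sum_for and sum_while are just placeholder names I made up). Dump the assembly with something like g++ -O2 -S on gcc, or /O2 /FAs on msvc, and diff the two functions. I'd expect most optimizing compilers to emit identical instructions for both, but the real point is that you can check instead of guess:

// sum.cpp -- compare the code generated for a for loop vs. a while loop.
// gcc:  g++ -O2 -S sum.cpp    (assembly lands in sum.s)
// msvc: cl /O2 /FAs sum.cpp   (assembly lands in sum.asm)

int sum_for(const int* data, int n)
{
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += data[i];
    return total;
}

int sum_while(const int* data, int n)
{
    int total = 0;
    int i = 0;
    while (i < n)
    {
        total += data[i];
        ++i;
    }
    return total;
}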

End Rant.

If you are interested in whether or not A is faster than B, you will have some digging to do, and sometimes it actually does depend on more than just the little snippet of code around it. A decent first step would be to disassemble your code and look at the instructions generated for the routines in question.  Look up the instructions in the processor's docs, and check out those handy latency and throughput figures for each one if the scale is small enough. Intel is kind enough to publish all kinds of docs and optimization guides online. One key thing is that on a given processor, code will execute the same way, with the same performance characteristics, every time it is given the same input (except for the whole OS/threading part, but you'll never have control over that anyway).

Second, time and profile your code.  We don't all have access to excellent tools to help us out with this one, but we all have access to QueryPerformanceCounter or similar timing constructs on our specific platform. A fun little exercise is writing isolated code to time in release mode and seeing how aggressive the compiler actually is.   Hmmm, why does my huge long routine get optimized into return 3; ?
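Here's a minimal sketch of that kind of timing harness on Windows, using QueryPerformanceCounter and QueryPerformanceFrequency (DoWork is just a stand-in for whatever you actually want to measure). Note the two tricks to keep the optimizer honest: the workload depends on a runtime value (argc), so the answer can't be precomputed at compile time, and the result lands in a volatile, so the work can't be thrown away as unused. Take both away, build in release mode, and you'll likely see exactly the return-a-constant behavior described above:

#include <windows.h>
#include <cstdio>

// Stand-in workload; replace with the code you actually care about.
int DoWork(int n)
{
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += i % 7;
    return total;
}

int main(int argc, char**)
{
    // Derive the iteration count from argc so the compiler can't
    // fold the entire computation into a constant at compile time.
    const int iterations = argc * 1000000;

    LARGE_INTEGER freq, start, stop;
    QueryPerformanceFrequency(&freq);   // counter ticks per second

    QueryPerformanceCounter(&start);
    volatile int result = DoWork(iterations);  // volatile keeps the result "used"
    QueryPerformanceCounter(&stop);

    double ms = (stop.QuadPart - start.QuadPart) * 1000.0 / freq.QuadPart;
    printf("DoWork(%d) = %d, took %.3f ms\n", iterations, result, ms);
    return 0;
}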


