The Importance of Profiling

I’ve worked as a software developer and worked with teams of software developers for around 10 years now, Many of those whom I’ve worked with have earned my trust and respect in relation to development and testing techniques. Frustratingly however it’s still with irritating regularity that I hear throw-away comments bourne of uncertainty and ignorance.

A couple of times now I’ve specifically been told that “GD makes my code go slow”. Now for those of you not in the know GD (actually specifically Lincoln Stein’s GD.pm in perl) is a wrapper around Tom Boutell’s most marvellous libgd graphics library. The combination of these two has always performed excellently for me and never been the bottleneck in any of my applications. The applications in question are usually database-backed web applications with graphics components for plotting genomic features or charts of one sort or another.

As any database-application developer will tell you, the database, or network connection to the database is almost always the bottleneck in an application or service. Great efforts are made to ensure database services scale well and perform as efficiently as possible, but even after these improvements are made they usually simply delay the inevitable.

Hence my frustration when I hear that “GD is making my (database) application go slow”. How? Where? Why? Where’s the proof? It’s no use blaming something, a library in this case, that’s out of your control. It’s hard to believe a claim like that without some sort of measurement.

So.. before pointing the finger, profile the code and make an effort to understand what the profiler is doing. In database applications profile your queries – use EXPLAIN, add indices, record SQL transcripts and time the results. Then profile the code which is manipulating those results.

Once the results are in of course, concentrate in the first instance on the parts with the most impact (e.g. 0.1 second off each iteration of a 1000x loop rather than 1 second from /int main/ ) – the low hanging fruit. Good programmers should be relatively lazy and speeding up code with the least amount of effort should be commonsense.