Computer Science Foundations

I've been out of academia and in the real world for 3 years now.  Since then I've gotten to know several developers from all over, and in some of our discussions I've had the opportunity to compare backgrounds and experiences.

As far as I can tell, there are 3 different paths that lead someone to a development position:

  • Computer Science Graduate - Emphasis is on lower level concerns.  Coursework involves more study of principles and algorithms than following the latest tech.  The software projects reproduce things that exist already.
  • Business Applications Graduate - Different universities call this different things, like Management Information Systems and Computer Information Systems.  Regardless of what you call it, the coursework is geared towards tackling business problems and generally nothing lower level.
  • Old School Tech Junkie - These people have differing backgrounds, some with formal education and some not.  Most I've encountered do not have the formal education; they've just watched the field evolve and happened to be there when this whole birth of tech came about.

Where do I come from?  I'm the CS graduate who learned the nitty-gritty, mind-numbing details about computer systems and software.  I had to code the standard structures like linked lists, stacks, and trees.  I had to code basic libraries like those for string handling.  I even had to write my own TCP stack and a basic OS.  Why?  Well, I know now that it's because I needed to learn the limits of the hardware and the environment.

Am I writing kernel code?  Nope.  Am I writing device drivers?  Nope.  I write business applications.  I feel that my education makes it easier for me to do what I do.  What I've found is that a majority of people produce inefficient code because of a lack of understanding of what's happening deep down.  Managed code hides all of that gory stuff from you, and if you've never seen it, then you just don't know it's there.  This is where all that "junk" I learned really starts to help out.

Hmm, let's see... I'll loop over this file right here and build a query for each line.  I'll start with a string and concatenate everything I need to it to generate my query.  Then at the bottom of this loop I'll execute that query and store the results into this list over here.  Then when I'm done I'll loop over my list and write it to a file.

That code will get the job done just fine for small sets of data, but it can break down when it needs to scale.  At 0.5 seconds per iteration, a 10,000 line file takes about an hour and a half.  When that file becomes a 1,000,000 line file, the same loop takes nearly six days, so your hour job is now a week long job.  And that list where you were storing the results?  Oh yeah, it got too big to fit in memory, so now the OS is wasting more time thrashing the page file to and from disk.  Don't forget that the database is struggling with those non-parameterized queries, trying to figure out an execution plan for each one.
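For what it's worth, here is a rough sketch of how I might restructure that loop.  It's only an illustration under some assumptions: I'm using Python with SQLite standing in for whatever database you actually have, and the customers table, its columns, and the export_matches name are all made up for the example.  The idea is to keep one parameterized query whose text never changes and to write each row out as it arrives instead of piling everything into a list first.

```python
import csv
import sqlite3
from typing import Iterable, TextIO


def export_matches(conn: sqlite3.Connection, keys: Iterable[str], out: TextIO) -> int:
    """Look up each key and stream matching rows to `out`.

    Returns the number of rows written.  The table and column names are
    hypothetical; swap in your own schema.
    """
    writer = csv.writer(out)
    # The SQL text is identical on every iteration; only the bound value
    # changes.  No string concatenation, so quotes in the input are harmless.
    sql = "SELECT id, name FROM customers WHERE name = ?"
    written = 0
    for key in keys:
        key = key.strip()
        if not key:
            continue  # skip blank lines in the input
        # Iterate the cursor directly so rows are written as they arrive
        # rather than collected into one big in-memory list.
        for row in conn.execute(sql, (key,)):
            writer.writerow(row)
            written += 1
    return written
```

You would call it with an open file as the source of keys, something like export_matches(conn, open("keys.txt"), open("results.csv", "w", newline="")), since a file object hands you one line at a time instead of loading the whole thing.  This doesn't fix a slow query, and one round trip per line can still hurt at a million lines, but it does take the memory blowup and the per-line query parsing off the table.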

I'm not the smartest person in the world.  I know there are people from all backgrounds who can code circles around me.  That's not what I'm trying to convey here.  What I am trying to say is how important it is to understand what is actually happening behind the curtain for that line of code you just wrote.  Do you understand the implications of what you are doing there?  If you don't, then you'd better learn before your boss finds someone else who does.  If you do know it, well, learn it some more, because we all make stupid decisions in code.  Later.