Be Proud

The other day I called up a former co-worker to share an epiphany I was having. "Hey dude, I just wanted to call and tell you that you should be proud of what we managed to accomplish where we worked."

At the time I was pretty down on myself. I had helped implement a system from the ground up, and it grew in a very organic manner. What it ended up being was not what we intended to write in the beginning. Web services talked to other web services in a chained-together fashion. We only had one database instance, where data loading, exporting, and reporting all competed for resources. The applications were all coupled to an awkward schema, and changes to it required changes to everything. I basically considered it a complete disaster and felt really crappy for having been a large part of its creation.

What a difference half a year can make. I've been exposed to some gnarly stuff, both good and bad. The good stuff? Well, it's better than most of the stuff I wrote. The bad stuff? Oh man, it stinks. It stinks in ways I can't even describe, but I'll try my best:

Awkward schema?
The horribleness I created pales in comparison to what I've seen. I got to see a schema produced by a vendor that decided it would be awesome to obfuscate everything by naming tables and columns with 3 letters and 5 digits. It was then picked up by another vendor that decided to add its own tables and columns filled with inconsistencies, easily identified by their vanity prefixes. Foreign keys? Those are for suckers.

Applications coupled to one another via the database?
Oh yeah, I've seen worse there too. Except this time the schemas are bigger. Keys are repeated in places where they don't seem to need to be. Columns are foreign keys to other tables with names that don't match up. Rogue applications write data in ways that break other applications because they forget to update denormalized fields or just omit data entirely. Sometimes they forget to update fields which hold similar data but in different ways. How many fields indicate status for a record in your app? Fewer than 5? Pffft, you must be doing something wrong.

Chained together applications?
I thought that chaining together services was bad enough. At least those had some form of a contract to communicate between one another. How about applications that are chained via batch job AND are coupled to the database? Run one process and it inserts into a table, run the next process and it processes those records. The 3rd one cleans up after the first two. If something breaks in the middle, may God be with you if you attempt to just re-run the process without cleaning up.

Looking back...
I would have loved to have had the resources for multiple database instances. I mean, we were developing directly against the production environment. That's like running on a tightrope with no net while holding really sharp knives with your ass on fire. How the heck I never screwed something up in a catastrophic manner, I'll never know. I must have digital guardian angels.

This is when I started to get a warm mushy feeling about what we accomplished. We didn't have reporting databases (that would have been awesome though). We didn't have read-only copies for ad-hoc queries (that would have been awesome too). Nope, we had ONE database, we had a lot of junk coming in, and we managed to produce a relatively clean stream of data going out. Here's the amazing part to me: we were handling a much higher volume of data than these other systems I've been exposed to since. That gives me hope that I'm doing SOMETHING right.

Does it mean we are awesome software developers? Nope. We just did whatever we had to do to get the job done. We had limited resources and we made it happen using whatever tricks we could find. It was all trial and error. Ultimately I'd say it was more expensive for our employer than if we'd had what we needed. Looking back, we could have added so much more technical value to the company instead of figuring out how to optimize every single process to run on one machine.

I think I've figured out all of the wrong ways to write code. I screwed up a lot of stuff, but it was all fixable. Because of the amount of ridiculous optimization we had to do, it looks like we got quite a bit right the 2nd (sometimes 3rd) time around. Sure, some of it was messy, but it isn't nearly as bad as some of the stuff I've seen since. If I had it to do all over again, I would do things quite differently. That's par for the course for a software developer, though. We're constantly learning new ways to do what we've already done in a better/faster way. Going forward, I will not let current advancements affect how I perceive past accomplishments. I shipped a lot of software that works. I'm proud of that.

From DataReader to Objects

It's been a while since I posted some actual code on my blog. I'm working on a reporting project at work and ran into a case where I just wanted to execute some arbitrary SQL against an Oracle database and get some plain objects back. Those objects will be passed into a Crystal Report and turned into something suitable for digestion by the management of our company. Nothing fancy here.

First I'll show you my extension methods and then follow with some explanation:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Text.RegularExpressions;

public static class DataRecordExtensions {
	// Matches the start of the string or an underscore, plus the character that follows.
	static Regex underscore = new Regex(@"(^|_)(.)");

	// Converts an underscore_delimited column name to PascalCase, e.g. FIRST_NAME -> FirstName.
	static string convertName(string s) {
		return underscore.Replace(s.ToLower(), m => m.Groups[0].ToString().ToUpper().Replace("_",""));
	}

	// Maps the columns of a single record onto the matching properties of a new T.
	static T ToObject<T>(this IDataRecord r) where T : new() {
		T obj = new T();
		for (int i = 0; i < r.FieldCount; i++) {
			var p = typeof(T).GetProperty(convertName(r.GetName(i)));
			// Columns with no matching property, and NULL values, are simply skipped.
			if (p != null && !r.IsDBNull(i)) {
				if (p.PropertyType == r[i].GetType())
					p.SetValue(obj, r[i], null);
				else {
					// The types don't line up exactly (e.g. the reader hands back a decimal
					// but the property is an int), so fall back to a TypeConverter.
					var c = TypeDescriptor.GetConverter(r[i]);
					if (c.CanConvertTo(p.PropertyType))
						p.SetValue(obj, c.ConvertTo(r[i], p.PropertyType), null);
				}
			}
		}
		return obj;
	}

	// Executes the command and lazily yields one mapped object per row.
	public static IEnumerable<T> GetObjects<T>(this IDbCommand c) where T : new() {
		using (IDataReader r = c.ExecuteReader()) {
			while (r.Read()) {
				yield return r.ToObject<T>();
			}
		}
	}
}

The meat of the work happens in the extension method for IDataRecord called ToObject. It's a generic method that tries to map columns to properties. At our company, the naming convention for column names is underscore-delimited identifiers, since Oracle knows nothing about case unless you use quotes. For my C# objects, I much prefer PascalCase. The convertName method gets called to perform this conversion for me. Finally, the last interesting bit is the TypeDescriptor usage. When there is a difference between the type coming from the DB and the type in the object for a particular field, I'm using TypeDescriptor to find a converter if possible. This is useful if, for example, the type coming from the DB is a decimal but my class needs an int.
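
To make that concrete, here's a quick illustration (the values are made up) of what those two pieces do for a typical column:

// convertName("WIDGETS_CREATED") and convertName("widgets_created") both return "WidgetsCreated",
// which lines up with the WidgetsCreated property on the class shown below.

// Oracle NUMBER columns typically arrive as decimal, so for an int property the
// TypeDescriptor branch ends up doing roughly this:
decimal fromDb = 42m;	// what the data reader handed back
var converter = System.ComponentModel.TypeDescriptor.GetConverter(fromDb);
if (converter.CanConvertTo(typeof(int))) {
	int widgetsCreated = (int)converter.ConvertTo(fromDb, typeof(int));	// 42
}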

The last method, GetObjects, is just a wrapper around ToObject. It fires up a data reader and calls ToObject for every row. Because it uses yield return, the rows aren't actually read until something enumerates the result.

Finally, I'd like to show how this is being used. Assume the following silly reporting class.

public class EmployeePerformance{
	public string FirstName {get; set;}
	public string LastName {get; set;}
	public decimal HoursWorked {get; set;}
	public int WidgetsCreated {get; set;}
}

All we have to do now is create a DbCommand object and call our GetObjects extension method.  I'll give an example as if I were running some report just so you have some context.

var rpt=new EmployeePerformanceReport();

using (var cx = new OracleConnection(MyConnectionString)) {
	cx.Open();
	using (var cmd = cx.CreateCommand()) {
		cmd.CommandText = @"
			SELECT
				first_name,
				last_name,
				hours_worked,
				widgets_created
			FROM
				employee_performance
			WHERE
				work_day between :startDate and :endDate
		";
		cmd.Parameters.Add("startDate",new DateTime(2008,1,1));
		cmd.Parameters.Add("endDate",new DateTime(2008,12,31));
		rpt.SetDataSource(cmd.GetObjects<EmployeePerformance>());
	}
}

//Do something with your report
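
One caveat, and this is an assumption on my part about when SetDataSource actually walks the sequence: since GetObjects is lazy, the rows are only read while something enumerates the result. If the report were to enumerate it after the using blocks have closed the connection, you'd want to materialize the list first (System.Linq needed for ToList):

// Hypothetical safeguard: force all rows to be read while the connection is still open.
rpt.SetDataSource(cmd.GetObjects<EmployeePerformance>().ToList());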

It's just that simple. I'm sure there could be some caching/compilation of the mapping after the first object to speed things up a bit. Speed hasn't been an issue so far, so I didn't pursue that any further. I just wanted to illustrate this technique that lets me avoid using a DataSet, which I've come to loathe in most situations. Please let me know what you think.
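
For what it's worth, here's a rough sketch of what that caching could look like: the reflection lookup is resolved once per type and reused for every row. The names here (CachedMapper, ToObjectCached) and the details are made up for illustration; it isn't thread-safe, and it repeats the same mapping logic as ToObject above.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Reflection;
using System.Text.RegularExpressions;

public static class CachedMapper {
	static readonly Regex underscore = new Regex(@"(^|_)(.)");

	// One property map per type, built the first time that type is requested.
	static readonly Dictionary<Type, Dictionary<string, PropertyInfo>> cache =
		new Dictionary<Type, Dictionary<string, PropertyInfo>>();

	// Same column-name conversion as in the post: FIRST_NAME -> FirstName.
	static string ConvertName(string s) {
		return underscore.Replace(s.ToLower(), m => m.Groups[0].ToString().ToUpper().Replace("_", ""));
	}

	static Dictionary<string, PropertyInfo> MapFor(Type t) {
		Dictionary<string, PropertyInfo> map;
		if (!cache.TryGetValue(t, out map)) {
			map = new Dictionary<string, PropertyInfo>();
			foreach (var p in t.GetProperties())
				map[p.Name] = p;
			cache[t] = map;
		}
		return map;
	}

	public static T ToObjectCached<T>(this IDataRecord r) where T : new() {
		var map = MapFor(typeof(T));
		T obj = new T();
		for (int i = 0; i < r.FieldCount; i++) {
			PropertyInfo p;
			// Skip columns with no matching property and NULL values, as before.
			if (!map.TryGetValue(ConvertName(r.GetName(i)), out p) || r.IsDBNull(i))
				continue;
			object val = r[i];
			if (p.PropertyType == val.GetType())
				p.SetValue(obj, val, null);
			else {
				var c = TypeDescriptor.GetConverter(val);
				if (c.CanConvertTo(p.PropertyType))
					p.SetValue(obj, c.ConvertTo(val, p.PropertyType), null);
			}
		}
		return obj;
	}
}

A further step would be to precompute which column ordinal feeds which property once per reader instead of converting every column name on every row, but even this much takes the repeated GetProperty calls out of the inner loop.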