A Letter to My Unborn Son

Son:
Any day now, you will bless us with your presence. I've spent a lot of time working to get everything ready for you lately. All the while, I've been wondering what I can do to help you become the best man that you can be. This world is full of challenges. As you grow you will encounter people who have no interests other than their own. I hope that you will be able to see through the nonsense and be able to contribute positively to society.

I hope to impart much knowledge upon you. You can't read this now and you won't be able to do so for a while. When you can read and understand this, I hope that you will take these tips to heart:

  1. Have strong opinions, but hold them loosely. You should be confident in what you believe, but not too stubborn to recognize when you are wrong.
  2. Never make excuses when you do something wrong. Own up to your mistakes and move on.
  3. Be humble. No one likes a show-off or a braggart. Let your hard work speak for itself.
  4. Never be ashamed of who you are. Confidence is strength, shame is weakness.
  5. Nothing in life is free. There is always a catch.
  6. Never work for free. There has to be some value in everything that you do. This reward won't always be monetary.
  7. Anything in excess is sinful. Addiction will ruin even the greatest of men.
  8. Power is an illusion. Power is not a goal. Power can be a burden.
  9. Find your passion in life. Figure out what you do well which other people can't or won't do. Do that for a career. You'll be happy doing something that you excel at.
  10. There are no gray areas. There is only right and wrong. If something doesn't feel right, then it probably isn't.
  11. Use the right tool for the job. Without the right tools, your work will either take too long or be of poor quality.
  12. Learn to analyze situations before you react. Impulse decisions are rarely the best ones. Truths will point you in the right direction.
  13. The most important tool is a big hammer. Sometimes you just have to persuade an object to move.
  14. Anything worth doing is worth doing to the best of your ability. If you want to be proud of what you do, you have to know that you gave it your all.
  15. Everything you do reflects on your character. Your actions today are your legacy tomorrow.

Life is full of good and bad. Ultimately it will be up to you to make it what you want. Just know that I will do everything I can to help you along the way. I'm looking forward to meeting you!

Data Exchange Basics

In the past month I've run into two clients that just don't understand how to take in outside data. I really didn't think that this post was necessary until I had these recent conversations. I thought I would just lay it out there how I feel that data exchange works. It's really quite simple.

First, I should describe my background. I work for a national PPO healthcare network. We directly contract with medical providers and also serve as a PPO aggregator taking in data from several sources. We take data from other networks, wrap it up alongside our own, and then redistribute the whole shebang to clients. Multiple gigabytes of data flow into and out of systems every month.

Taking data from other sources can be difficult. The burden is on the recipient to "scrub" the data and make it fit into your own formats and standards. The only data that you can really trust is your own. There is only so much that you can expect the sender to do as they already have the data in the format that they need. Basically the sender needs to do two things in order to make a successful exchange possible.

  1. Provide a unique record id. This is absolutely critical.  Some clients I've dealt with don't have truly unique record level identifiers.  Sometimes a composite of various fields can be used, which is perfectly acceptable but not exactly ideal.  When there is no identifier, then it becomes really hard to know when data changes.
  2. Provide a consistent and reliable feed of data. The data that is being provided needs to be updated regularly and be in a consistent format.  A field that you said was numeric can't magically become alphanumeric without something screwing up. The worst case scenario is when data gets changed in a subtle way that somehow gets misinterpreted on the other end without someone catching it.  When you are updating millions of records per month, you can't eyeball very much of it.
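
As a rough illustration of the composite identifier mentioned in #1, here is a minimal Python sketch.  The field names (tax_id, location_code, specialty) and the helper composite_key are made up for the example; a real feed would have its own fields, and the normalization rules would have to be agreed on with the sender.

```python
import hashlib

# Hypothetical field names; a real feed would define its own.
KEY_FIELDS = ("tax_id", "location_code", "specialty")


def composite_key(record, key_fields=KEY_FIELDS):
    """Build a stable identifier from several fields of one record."""
    parts = []
    for name in key_fields:
        value = record.get(name)
        # Normalize so cosmetic differences (case, padding) do not create a new key.
        parts.append("" if value is None else str(value).strip().upper())
    # Join with a separator that should not occur in the data,
    # so ("ab", "c") and ("a", "bc") do not collide.
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

The weakness is the one noted above: if any of the key fields changes, the record looks brand new, which is why a true record id from the sender is still preferable.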

That's basically it for the sender.  I told you the burden was on the recipient.  So, you have an outside source of data, now what?

  1. Uniquely identify the source.  When you are looking at a piece of data, you have to know where it came from.  Each source of data into your system (including your own) needs to have an identity.  That identity has to be tagged to the data feed.
  2. Store the source's unique record id beside yours. Before you even start to think about it: no, you can't use their identifier as your own.  Furthermore, you shouldn't try to generate some kind of composite key based upon the source and the source's id. You need your own internal id.  Anytime I have a mashup of data, I create an identity (id) for foreign keys and then store the source (source_id) and the source's record identifier (external_id).  External id is always a nullable varchar(max) column.
  3. Reformat the data to fit into your system. Your data standards are not the responsibility of the source.  That bears repeating: your data standards are not the responsibility of the source.  This will mean that you will have to make compromises with the data and sometimes possibly outright reject data.  It's just part of it, but you'd rather have less data of higher quality than a bunch of crappy data.
  4. Notify the source of problems so that they have the opportunity to make corrections. While I don't like it when it happens, clients will approach me from time to time to point out oddities within my own data.  It's going to happen, and it's only because they have a different perspective on the data and may be looking at different things.  More eyeballs on the data means better quality data in the end.
  5. Trust nothing that you are receiving. Whenever the data exchange discussion is initiated, try to get as much information from the source as possible.  What fields are required?  What data types are expected?  Take all of the information you gather and write code to enforce it. Doing a good job here goes a long way toward catching senders who violate #2 in the sender's list above.
  6. Use the source's naming conventions.  Somewhere you will be doing a mapping from their data to your own.  When you are referring to their data, name it (in code, in a temporary table, wherever) what they call it in their documentation.  This makes debugging stuff later on so much easier.  For the love of all things holy, never ever refer to their data by position within a file.  Whenever their format changes (and sometimes it will), you will be up shit creek.  I name it what they call it right up to the point that it leaves their data structures and enters my own so that I can explicitly see the mapping.  "my.StateDiscount = their.Amount-their.FsAllowedAmount" goes a long way in bridging communication barriers 10 months down the line when an odd problem rears its ugly head.
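
Here is a minimal Python sketch of how a few of the recipient's steps might fit together: an internal id stored beside source_id and external_id, a field spec enforced in code, rejected rows collected so they can be reported back to the source, and a mapping that keeps the sender's field names.  The table name discount, the spec layout, and the RecordId field are assumptions for the example (Amount and FsAllowedAmount come from the mapping quoted above), and SQLite's TEXT stands in for varchar(max).

```python
import sqlite3
from decimal import Decimal, InvalidOperation

# Hypothetical spec agreed with the sender: their field name -> (required, parser).
THEIR_SPEC = {
    "RecordId": (True, str),
    "Amount": (True, Decimal),
    "FsAllowedAmount": (True, Decimal),
}


def validate(their):
    """Return a typed copy of the sender's row, or raise ValueError."""
    clean = {}
    for name, (required, parse) in THEIR_SPEC.items():
        raw = their.get(name)
        if raw is None or str(raw).strip() == "":
            if required:
                raise ValueError(f"missing required field {name}")
            clean[name] = None
            continue
        try:
            clean[name] = parse(str(raw).strip())
        except (InvalidOperation, ValueError) as exc:
            raise ValueError(f"bad value for {name}: {raw!r}") from exc
    return clean


def create_schema(conn):
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS discount (
            id INTEGER PRIMARY KEY AUTOINCREMENT,  -- our own internal id
            source_id INTEGER NOT NULL,            -- which feed the row came from
            external_id TEXT,                      -- the source's record id, nullable
            state_discount TEXT NOT NULL
        )
        """
    )


def load(conn, source_id, rows):
    """Insert valid rows; return (row, reason) pairs for the rejected ones."""
    rejected = []
    for their in rows:
        try:
            t = validate(their)
        except ValueError as exc:
            rejected.append((their, str(exc)))
            continue
        # Explicit mapping: our name on the left, their names on the right.
        state_discount = t["Amount"] - t["FsAllowedAmount"]
        conn.execute(
            "INSERT INTO discount (source_id, external_id, state_discount) "
            "VALUES (?, ?, ?)",
            (source_id, t["RecordId"], str(state_discount)),
        )
    return rejected
```

The rejected list is what would go back to the sender under #4; nothing from a bad row reaches the table.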

Bonus Tip: I've found that full data exchanges are much more user friendly than add/update/delete notifications.  This is from my experience with data that is relatively constant with a lot of random updates, which may not necessarily apply to other types of data (time sensitive data comes to mind real fast here).  With relatively static data, a full exchange provides a few benefits for both sides.  The sender doesn't have to worry about tracking changes.  The recipient doesn't have to worry about processing the data in a certain order.  The latest data is the only data that matters.  It also allows the recipient to easily detect if a record gets orphaned somehow.  With change notifications, there is really a need for an acknowledgement from the recipient so that the sender knows the data was processed and is up to date.  The only downside for the full exchange can be file size, but it's a trade off I find acceptable.
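
To make the orphan detection concrete, here is a small Python sketch of comparing what is already stored for one source against a full feed from that source.  It assumes both sides fit in memory as dictionaries keyed by the source's record id; the function name diff_full_exchange is made up for the example.

```python
def diff_full_exchange(current, incoming):
    """Compare stored records with a full feed, both keyed by external id.

    Returns (added, changed, orphaned) as sorted lists of external ids.
    """
    added = sorted(k for k in incoming if k not in current)
    changed = sorted(
        k for k in incoming if k in current and incoming[k] != current[k]
    )
    # Records we hold that the sender no longer provides.
    orphaned = sorted(k for k in current if k not in incoming)
    return added, changed, orphaned
```

Because every feed is complete, the comparison does not depend on the order in which earlier files were processed, and a missed file fixes itself on the next run.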

That's about all I have on this subject.  If you have any more tips on exchanging data, I'd love to hear them.  This is a big part of what I'm responsible for every day.  A lot of it is common sense, and a lot of it is just stuff I learned by screwing up.  Dealing with other people's data is never fun.