Here’s something I’ve learned over the years: when modeling data to build a database, be very careful about which fields you decide to include. Don’t throw in extraneous fields just because “someone might want to store that piece of information someday, and it’s no big deal to include it…”
It is a big deal, and here’s why —
When building and populating a domain of data, users need to trust it. They need to believe that the data in the data store is good, solid data. If you’re stingy with your fields, and you only include the fields that people are going to use, then most of the data will be good data, users will know this, and they’ll trust it.
However, if you just start throwing in fields willy-nilly, and a lot of them don’t get used, your users are going to develop a trust problem. One day, one of them is going to depend on one of those neglected fields: they’re going to do a search on it, or pull a report based on it, and the search or report is going to suck because no one uses the field. Then they start to wonder if they should trust any of the data…
Here’s an example —
You work for a construction firm, and you’re building a comprehensive database of subcontractors in the metropolitan area — electricians, plumbers, etc. Your boss gets angry one day because he’s been let down by a subcontractor one too many times, so he asks you to add a field for “Reliability” to the subcontractor database.
You think this is a silly idea, but who cares, right? If they use it, fine. If not, no harm. You add the field, and a few people use it to quantify a few subcontractors’ reliability. Most don’t use it, but a few do, and your boss is happy.
However, one day a VP in your firm searches for plumbers with a “Reliability” rating of “High,” and he only gets back the precious few for whom the field has been set. He trusts this data for a while, but then slowly realizes that he’s only getting a list of seven plumbers when he should be getting a list of 100. This is because there are 93 plumbers for whom no Reliability rating has been set, and the search has no way to tell “never rated” apart from “not rated High.”
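If you want to see the mechanics of that search, here’s a minimal sketch using Python’s built-in sqlite3 module. The table name, column names, and sample rows are all made up for illustration; the only thing taken from the story is the count of 7 rated plumbers out of 100.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration:
# 100 plumbers, only 7 of whom have a reliability rating.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subcontractors ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "trade TEXT NOT NULL, reliability TEXT)"
)
rows = [
    (f"Plumber {i}", "plumber", "High" if i <= 7 else None)
    for i in range(1, 101)
]
conn.executemany(
    "INSERT INTO subcontractors (name, trade, reliability) VALUES (?, ?, ?)",
    rows,
)

# The VP's search: rows where reliability is NULL never match,
# so the unrated plumbers are silently left out.
high = conn.execute(
    "SELECT COUNT(*) FROM subcontractors "
    "WHERE trade = 'plumber' AND reliability = 'High'"
).fetchone()[0]

# COUNT(column) skips NULLs while COUNT(*) does not,
# so comparing the two shows how much of the field is filled in.
rated, total = conn.execute(
    "SELECT COUNT(reliability), COUNT(*) FROM subcontractors "
    "WHERE trade = 'plumber'"
).fetchone()

print(f"rated High: {high}")
print(f"rated at all: {rated} of {total}")
```

The second query is the “overhead view” the VP never gets: it shows at a glance that the field is mostly empty, which the search result alone does not.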
Your database has just been wounded. This VP is now suspicious of your data. It’s let him down once, and when he begins to understand the mechanics of why, he’s going to wonder what other fields in the database are unreliable too. Since it’s hard to get an “overhead view” of a database, whenever he gets back results that he doesn’t expect, he’s going to start thinking, “Well, I bet this is another one of those situations. This database sucks. I’m going to keep my own subcontractor list in Excel.” And he will.
This is how databases begin the downhill slide into non-use. The usage and adoption of your database depend on your users trusting it. If they start to distrust it, you’re in a world of hurt. So be stingy with the fields you add and defend the integrity of your database, so you don’t make it easy for your data to spoil.