Progressive remodelling – how to ruin your data
Many applications gather chronological data from logs, from activities or from manual data entry. Over its lifetime an app will gather loads and loads of information, often very valuable information. Over that same lifetime the requirements on how to use that information will change.
And probably mess it up.
I am currently working on such an application. For over five years precious information was manually entered.
At first that data was quite complicated. The people who decided what to collect wanted the data to be as rich as possible.
Soon after, they realised that entering such complex data manually was taking up too much time, as was processing it to generate reports. The data structure needed to be simplified.
The structure was simplified by reducing the number of fields, keeping only those that were essential.
Much later the application was running well and actually generated some money, so resources became available to make more out of that data. The data structure was changed again to collect more detailed data for more detailed reports. The reports became more complex again, generating more value for the users.
Now that the application has been live for some years and has a substantial user base, the idea came up to make much more of that data. Rather than reporting only the current data, why not build reports on the trends? Why not show how the data has changed over time?
I'll tell you why not.
Because the data structure was changed twice, and each change resulted in similar but actually incomparable data.
In my case, for the first period users were asked to rate a subject on five different attributes. In the second period, after the first change, they were asked for a single average rating of the same subject. In the third period, after the second change, they were asked to rate three attributes, different from the five of the first period.
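Unfortunately a comparable average cannot simply be calculated from these values. Averaging five attributes, asking for one overall rating and averaging three different attributes do not measure the same thing. As the questions changed, so did the meaning of the answers, because the question directly influences the answer.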
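To make the problem concrete, here is a minimal sketch in Python. The attribute names and the numbers are invented for illustration and are not taken from the real application; the only thing borrowed from it is the shape of the three periods.

```python
from statistics import mean

# Hypothetical records, one per period. Names and values are made up.
period_1 = {"quality": 4, "speed": 2, "price": 3, "support": 5, "design": 1}  # five attributes
period_2 = {"overall": 4}                                                     # one average rating
period_3 = {"reliability": 5, "usability": 3, "value": 4}                     # three other attributes


def naive_score(record):
    """Collapse any record to one number by averaging whatever it contains."""
    return mean(record.values())


# Technically this runs, and it produces one number per period.
trend = [naive_score(r) for r in (period_1, period_2, period_3)]
print(trend)

# But the periods do not share a single question, so the numbers
# answer three different questions.
shared = set(period_1) & set(period_2) & set(period_3)
print(shared)
```

The code happily produces 3, 4 and 4, which looks like a trend. The empty set of shared questions shows why it is not one: the rise from 3 to 4 may say more about the questionnaire than about the subject.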
My advice to you: whenever you change the structure of chronological data, please try to keep the essential values consistent.
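One way this could look, as a rough sketch rather than a recipe: every record keeps one essential value that is always asked in the same way, and everything else is free to change. The names Rating, overall, schema_version, details and yearly_trend are assumptions made up for this example, as are the dates and numbers.

```python
from dataclasses import dataclass, field
from datetime import date
from statistics import mean


@dataclass
class Rating:
    entered_on: date
    overall: int         # the essential value: asked the same way in every version
    schema_version: int  # which version of the entry form produced this record
    details: dict = field(default_factory=dict)  # attributes that may change between versions


def yearly_trend(ratings):
    """Average the stable `overall` value per year, ignoring version-specific details."""
    by_year = {}
    for r in ratings:
        by_year.setdefault(r.entered_on.year, []).append(r.overall)
    return {year: mean(values) for year, values in sorted(by_year.items())}


# Invented sample data spanning three structure versions.
ratings = [
    Rating(date(2019, 3, 1), overall=3, schema_version=1, details={"quality": 4, "speed": 2}),
    Rating(date(2021, 6, 1), overall=4, schema_version=2),
    Rating(date(2023, 9, 1), overall=5, schema_version=3, details={"reliability": 5}),
]
print(yearly_trend(ratings))
```

The detailed attributes can still come and go with each change. The trend report only ever touches the one value that stayed the same, and the stored version number tells you which details to expect in a record.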