Today’s Database Design Principle is based around data movement within the database, specifically when it doesn’t need to be done.
The basic problem I came across here was that two identical tables (A and B) were being used, along with a shared key, to manage the status of information inside the database. The application had a web form that could be filled out, saved, and later recalled to be submitted. In order to differentiate between the saved and the submitted data, the saved form fields were stored in table A, with an assumed status of ‘ENTRY’, while the submitted form fields were being stored in table B, with an assumed status of ‘SUBMITTED’. When a saved form was then submitted, the data from table A would be copied to table B, and then the record in table A would be deleted.
While this design was functional, there were some definite drawbacks to the approach:
- Table Maintenance – If a new field were added, or a field deleted from the form, then the same change had to be made in two places in the database.
- Doing With Two Transactions What Could Be Done With One – With this design, the submission of data then requires two transactions: one insert to populate table B, and a delete to remove the save record from table A. If a single table had been used with a status field, then this could have been accomplished with a single update transaction.
- Complicating Queries and Views – With the two-table approach, I now have to union together the two tables and calculate a status field to get the full set of data. This just adds a layer of complexity to the database, and it increases exponentially with each time this structure is used. Imagine if you have information in three or more different forms that are stored like this that are linked together and all the extra code that is required to keep things straight.
In this case, the easiest solution, as I implied above, would be to maintain a single table with a status column to no longer make the status implied, but explicit. This not only makes any coding linked to the table much easier, but if you are going to do any reporting or allowing non-developer business analysts access to the tables, it will be much easier for them to read the data.
The lesson here is to keep your tables simple and logically laid out. If you can avoid the hassle of moving your data all over (especially just to maintain implied statuses), you should. It will decrease the amount of work that needs to be done, and will be easier to read for anyone else who joins takes over your project.