Poor database usage for user data / profile module
In testing 4.2 today, I was using the user profile module. Then I decided to write my own custom user data module, and stop using the Drupal supplied profile module. That worked fine, but in testing it, I was examing the database. I noticed that all of the old data for the now-unused profile module is still in the "data" column of the "users" table.
So I spent some time thinking about this design, and why it makes it hard to get rid of data stored in a multi-element column like "data" in the "users" table. In a sense, it is serialized data, like that used by PHP sessions and by lots of other applications. The advantage to using this method for storing data is it makes it easy for the programmer to add new data fields because no database change is required. it also makes upgrading or installing patches a little bit easier, for the same reason.
However, I believe this is a bad method in general.
Software should be designed and developed with an eye to providing what the end users need, and to making their jobs easy. It should not be designed to make the programmers' jobs easier at a cost to the end user.
Unfortunately, using serialized, persistent data makes exactly that mistake. It is easier for the programmer, but for the person or people managing a web site, it makes their life more difficult because it wastes data space and reduces performance. This is the mistake that the profile module makes, and it appears to be a flaw in the core Drupal design as well, since the "users" table is part of the core.
The correct way to do this, in my view, is to use normalized database design. For example, in the case of the profile module, the core "users" table should contain columns only necessary for the core system to function. Serialized, persistent data should be avoided unless there is a set of very good arguments to storing persistent data that way (persistent meaning over several sessions). When an administrator enabled the profile module, it should result in the creation of an auxilliary profile table with a primary key tied to the primary key of the users table.
If the administrator no longer wanted to use the profile module, she could disable it, and optionally, delete the table in the database.
This has several advantages:
1. No database space is wasted on storing data that is no longer used as is currently the case.
2. Performance does not suffer from having to parse and carry data which will never be used.
3. The database will be easier to understand.
4. It may actually make it easier to develop modules.
Question: given the current structure, is there any easy way to clean out the old profile module data from the users.data column without having to write something to cleverly parse it out and rewrite the column for all users with only the data I want?
--
Slavica