Tackling oversized cache items in Drupal
Drupal’s highly dynamic and modular nature means that many of the central core and contrib subsystems and modules need to maintain a large amount of meta-data.
Rebuilding the data every request would be very expensive, and usually when one part of the data is needed during part of the request, another part will be needed later during the same request. Since just about every request needs to check variable_get(), load entities with fields attached etc., the meta-data needs to be loaded too.
The pattern followed by most subsystems is to put the data into a single large cache item and statically cache it. The more modules you have, the larger these cache items become — since more modules mean more variables, hook_schema() and hook_theme() implementations, etc. And the same happens via configuration with field instances, content types and default views.
This affects many of the central core subsystems — without which it’s impossible to run a Drupal site — as well as some of the most popular contrib modules. The theme, schema, path alias, variables, field API and modules system all have similar issues in core. Views and CCK have similar issues in contrib.
With just a stock Drupal core install, none of this is too noticeable, but once you hit 100 or 200 installed modules, suddenly every request needs to fetch and unserialize() potentially dozens of megabytes of data. Some of the largest cache items like the theme registry can grow too large for MAX_ALLOWED_PACKET or the memcache default slab size. Since the items are statically cached, these caches can easily add 30MB or 40MB to PHP memory usage combined.
The full extent of this problem became apparent when I profiled WebWise Symantec Connect site (Drupal.org case study). Symantec Connect currently runs on Drupal 6, and as a complex site with a lot of social functionality has a reasonably large number of installed modules.