Drupal's Weak Link
One of the things I love about Drupal is the entity reference field. In Drupal 8, entity reference fields are even better because they’re now included in Drupal Core. This makes building composed and well-structured data much easier. However, I think as Drupal developers and architects, we’re fundamentally misusing them in far too many cases. Over and over, we use entity references to denote a relationship between two entities instead of a simple reference or inclusion. Mere entity references severely limit our ability to create rich, rational content type architectures and force us to jump through additional hoops and write more custom code than ought to be required. I believe that Drupal having only entity references to forge relationships between content types is a major shortcoming.
What We’ve Got
Before we can understand what Drupal is missing, we first need to know what we already have. Out of the box, we can create different types of data. These are our Content Types. Different types of data might be people, blog posts or schools. If you can give it a name and describe it, there’s probably a way to represent it as an entity in Drupal. These data types can reference one another using entity references – a blog post might reference its author or a school might have a reference to all of its students. When we create the reference, we’re drawing a little imaginary arrow that points from a blog post to a user entity.
Let’s explore this concept through a scenario. You’ve been tasked with building a club management system. It should be able to represent lots of different clubs, such as the Engineering Club and the Theater Club. We know that every club can have members, and those members are the users of our application.
How should we manage memberships? Currently, we would model the relationship between clubs and members with an entity reference. Users might have an entity reference field that references the clubs to which they belong, or club nodes might have a field holding all the users that are part of its membership.
What We’re Missing
So, what’s wrong with that? Problems start to crop up as you build out the rest of the application. Let’s add a new requirement. Some clubs need to show who their Chair and Co-Chair are on their club homepage. Another club needs to have a treasurer. Yet another club hands out arbitrary titles to its members, like “Director of Pizza Ordering.”
With entity references alone, we’d be hard pressed to model all of this. We might try to create a “President” or “Treasurer” field on the club content type, but that wouldn’t solve our arbitrary title requirement. We could put a text field on the user for their title, but permissions would get weird and wouldn’t work for members of more than one club.
The truth is that the reference model doesn’t work here. A relationship isn’t just a little arrow from A→B, as an entity reference implies, and it doesn’t always point in only one direction. Sometimes a relationship goes both ways, like a friendship. The technical terms for these two cases are ‘directed’ and ‘undirected.’ Entity references only ever go in one direction.
At the end of the day, relationships between data types are complex and varied. It’s important to be able to describe the nature of a relationship between things in Drupal, be it duration, kind or direction. Clubs can have different kinds of members. Flights can have first class and economy passengers. Friends can be best friends. A celebrity might have followers, but that doesn’t mean that the celebrity is following all of their fans. None of these relationships is easy to model in Drupal. Entity references are a blunt tool for an intricate world.
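To make the directed/undirected distinction concrete, here is a minimal sketch in plain Python. It is not Drupal code; the `Relationship` class, its `directed` flag and the `connects` method are hypothetical names invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Hypothetical link between two things, identified here by plain strings."""
    source: str
    target: str
    directed: bool = True

    def connects(self, a: str, b: str) -> bool:
        """Return True if this relationship can be read as going from a to b."""
        if (self.source, self.target) == (a, b):
            return True
        # An undirected relationship reads the same in both directions.
        return not self.directed and (self.source, self.target) == (b, a)


# A fan follows a celebrity: directed, like an entity reference.
follows = Relationship("fan", "celebrity", directed=True)
# A friendship goes both ways: undirected.
friends = Relationship("alice", "bob", directed=False)
```

An entity reference can only express the first case. Asking `follows.connects("celebrity", "fan")` is answered with no, while `friends.connects("bob", "alice")` is answered with yes.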
A Stronger Link
The correct approach to our club management problem is to model relationships as entities themselves. In the club management application, what we really need is a ‘membership’ entity. That ‘membership’ type would store a reference to the club and a reference to the user, linking them together. And because this ‘membership’ type would be a fieldable entity, it would be trivial to create a taxonomy of membership types. We could just have an autocomplete taxonomy term field for terms like “President,” “Treasurer” or even “Director of Pizza Ordering.”
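The shape of that idea can be sketched outside of Drupal in a few lines of Python. Everything here is an assumption made for illustration: the `Membership` class, the `title` field standing in for a taxonomy term, the sample users, and the two helper functions are hypothetical, not an existing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Membership:
    """Hypothetical relationship entity linking one user to one club."""
    user: str
    club: str
    title: str = "Member"  # stands in for a taxonomy term field


memberships = [
    Membership("dana", "Engineering Club", "Chair"),
    Membership("dana", "Theater Club"),
    Membership("eli", "Engineering Club", "Director of Pizza Ordering"),
]


def titles_for(user, memberships):
    """Map each club the user belongs to onto the title held there."""
    return {m.club: m.title for m in memberships if m.user == user}


def officers_of(club, memberships):
    """Map each non-default title in a club onto the user holding it."""
    return {
        m.title: m.user
        for m in memberships
        if m.club == club and m.title != "Member"
    }
```

Because the title lives on the membership rather than on the user or the club, one user can be Chair of one club and an ordinary member of another, and any club can invent whatever titles it likes.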
The idea is really that simple. Yet, time and time again, I see Drupal sites relying solely on an entity reference when there’s clearly a relationship involved. Entity references should only be used where the referenced content is truly part of the thing from which it’s being referenced. Media entities and taxonomy terms are perfect use cases for an entity reference. As we’ve seen though, entity references aren’t the right tool for everything. To spot relationships vs. references, ask yourself, “is this thing really part of this content? Is there a verb between my content?” Like, “A user joins a club.”
What’s Stopping Us?
In many ways, we already have all the tools we need to build our sites and applications with these concepts. It’s entirely possible for you to create entities representing relationships now. Nothing is stopping you from creating a membership node or a more custom entity using something like Entity Construction Kit.
If the answer to these issues is so simple, why don’t we always use this strategy? Why do we as Drupal developers instinctively reach for the entity reference when we clearly want entity relationships?
The reality is that it’s more work up front. You have to worry about more content types. You need to theme more things. When creating a user, it’s no longer as easy as checking all the clubs they belong to on their edit form. It’s a two-step process. When a user is deleted, something has to know to delete their memberships too. Frankly, it’s just more work.
The Fix
Most, if not all, of the problems with relationship types can be solved with no knowledge of the kinds of relationships involved. For example, automatically deleting a relationship when one of the entities it links is removed could be a pretty universal rule that could be easily overridden when needed.
That’s why I think imagining Drupal with a core Relationship type is an enticing idea. There’s already precedent for creating new entity types to better model content. That’s exactly what taxonomy terms are. They’re an entity type, separate from nodes, that has some extra features like hierarchy and autocomplete/creation. Taxonomy terms are a fantastic tool in our toolbox. Sure, we might have to educate new site builders and clients about them, but once the concept is conveyed, people really run with the idea.
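As a rough illustration of such a universal rule, here is a small Python sketch of a store that drops every relationship touching a deleted entity, whatever kind of relationship it is. The `Store` class, its methods and the `cascade` flag are hypothetical and do not correspond to any Drupal API.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical in-memory store of entities and their relationships."""
    entities: set = field(default_factory=set)
    relationships: list = field(default_factory=list)  # (source, target, kind)

    def relate(self, source, target, kind):
        self.relationships.append((source, target, kind))

    def delete(self, entity, cascade=True):
        """Remove an entity and, by default, every relationship linking it."""
        self.entities.discard(entity)
        if cascade:
            # The rule never looks at the kind, only at the two endpoints.
            self.relationships = [
                r for r in self.relationships if entity not in r[:2]
            ]


store = Store(entities={"dana", "eli", "Engineering Club"})
store.relate("dana", "Engineering Club", "membership")
store.relate("eli", "Engineering Club", "membership")
store.delete("dana")  # dana's membership is removed along with dana
```

Passing `cascade=False` is the override: the entity goes away but its relationships are left for the caller to handle.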
There are solutions out there that attempt to address these problems of relating entities today, but none are fully satisfying. There are plenty of modules that create bidirectional entity or node references, but they fall short because they are not fieldable. Foremost among these solutions is the Relation module. I really liked this module in D7. In fact, I built a few modules around it. Unfortunately, it does not have a shippable release for Drupal 8, and it’s billed purely as an API module for developers. There’s not much of a user interface either. *
Relationships, I think, are the next logical progression in Drupal’s content architecture model. Having true relationships would enable us to do all kinds of neat things and make a great deal of custom code unnecessary. Just imagine a publishing company where a manager should be able to edit any of the content of their subordinates. You could write custom code for that today, but with Relationships, we could write simple modules that rely on the relationship abstraction to configure access policies instead of hand-coding them for each site.
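A sketch of what such a policy might look like, again in plain Python rather than Drupal code. The `can_edit` function, the `manages` relationship kind and the sample names are all assumptions made up for this example.

```python
def can_edit(editor, author, relationships):
    """Hypothetical access rule: authors edit their own content, and
    anyone with a 'manages' relationship to the author may edit it too."""
    if editor == author:
        return True
    return ("manages", editor, author) in relationships


# One directed relationship: morgan manages sam.
relationships = {("manages", "morgan", "sam")}
```

The rule knows nothing about publishing companies or content types. It only asks whether a relationship of a given kind exists, which is what would let a generic module offer it as configuration.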
I’m working on a module to explore these ideas and uncover roadblocks, but I would love to get the community’s response. If you found this post interesting, tell me what you thought made sense and what didn’t. What would you want out of a Relationship type? Do you think I’m crazy? All of it will be valuable feedback.
* If you're a maintainer of Relation reading this, thank you! I really do love Relation. It inspired me to think about problems with content modeling in Drupal in this way; I just want to go a heck of a lot further with it!