Drupal 8: best authoring experience for structured content?
Drupal 8 will ship with big authoring experience improvements: WYSIWYG editing & in-place editing, thanks to the Spark distribution that Acquia — my employer — is sponsoring.
But how well does it fare with the growing importance of structured content? Do Drupal 8’s WYSIWYG & in-place editing enable it or prevent it?
The new web world order: many form factors
The Big Thing of the last few years: the advent of mobile. Inherent to that: websites that are optimized for mobile devices and act as data providers for apps.
A new form factor — mobile devices — changed web development forever. Before mobile, the life of web developers and authors (content creators) was relatively simple: make sure websites work well on a few typical screen sizes (let’s deny the existence of Internet Explorer 6 and all the misery it caused).
But … we cannot predict what’s next. We cannot predict new content consumption form factors. That’s where content strategy becomes vitally important:
content strategy is to copywriting as information architecture is to design
We have to make sure that our content is structured and has enough metadata to successfully reuse the same (structured) content for different content consumption form factors. Without having to edit each piece of content again.
Structured content: successfully dealing with form factors
NPR’s Create Once, Publish Everywhere is the most often cited example of a content strategy that successfully provides content for many form factors. They create content once, then publish it to >10 different platforms. With a small team, they do more than some other companies, because of their excellent content strategy. It took them years to evolve their systems in this direction, and it paid off.
Another example is TV Guide. Back in the 1980s, they decided to capture all semantic metadata and to build a database from which the magazine is extracted, rather than just creating a nicely formatted magazine every time. Thanks to that, they’re still relevant today.
It appears that the reuse of content is something every website should strive towards. There’s nothing inherently bad about it. However, there are downsides.
TV Guide editors used a mainframe application (and maybe still do?). NPR editors use this UI:
NPR editors are encouraged to think only about content, not presentation, hence a very basic data entry UI is all they get. [1] This UI looks more like a web front-end to a database than a CMS (is anybody else reminded of phpMyAdmin?)…
So, while this may be true:
The goal of any CMS should be to gather enough information to present the content on any platform, in any presentation, at any time.
No CMS really aims to have a poor authoring experience, of course.
Drupal & structured content
Drupal is already well prepared for structured content.
The principles applied when reviewing code proposed for inclusion in Drupal core are a superset of the principles applied to structured content. Drupal demands full separation of concerns at every level. Everything must be overridable/alterable. CSS files follow separation of concerns too, so that styling can be overridden cleanly without duplicating all CSS. Content may never contain CSS nor depend on CSS. And so on.
Five features in particular stand out with regard to structured content and content reuse:
- Structured content: Field API.
  It allows content to be modeled as granularly as desired.
- Clean content: Filter system.
  It ensures fancy mark-up is only added on output, and the stored content is as clean as possible. E.g. the fancy typographic features in this very piece of text are automatically added by Typogrify.
- Different presentations of the same content: view modes.
  A view mode defines the order of the fields and the field formatter & label of each field. [2]
- Internal reuse of content (within the website): Views module.
  It creates lists, grids, tables, galleries etc. of content, while showing related content. A listing can be configured to use a specific view mode.
- External reuse of content (outside the website): REST module.
  It provides JSON, XML, HAL, JSON-LD, YourCustomMarkupLanguage output.
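To make the idea of "one piece of structured content, several presentations" a bit more tangible, here is a minimal Python sketch. It is not Drupal's API: the `article` fields, the `VIEW_MODES` table and the `render` function are hypothetical names chosen purely for illustration. It only shows that a view mode can be thought of as an ordered list of fields plus a label setting, applied to the same stored content.

```python
# Hypothetical structured content: every piece of data lives in its own field.
article = {
    "title": "Llamas are awesome",
    "author": "Jane",
    "body": "Llamas hum to each other.",
    "tags": ["animals", "llamas"],
}

# Hypothetical view modes: each one lists which fields to show, in which
# order, and whether the field label should be displayed.
VIEW_MODES = {
    "full": [("title", False), ("author", True), ("body", False), ("tags", True)],
    "teaser": [("title", False), ("tags", True)],
}


def render(content, view_mode):
    """Render the same content differently depending on the view mode."""
    lines = []
    for field, show_label in VIEW_MODES[view_mode]:
        value = content[field]
        if isinstance(value, list):
            value = ", ".join(value)
        lines.append(f"{field.capitalize()}: {value}" if show_label else value)
    return "\n".join(lines)


print(render(article, "teaser"))
print(render(article, "full"))
```

The stored content never changes; only the view mode decides what ends up in the output, which is what makes reuse across listings and platforms possible.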
Drupal authoring experience
Drupal’s authoring experience used to be remarkably similar to that of NPR’s COPE. We’ve put a lot of effort into Drupal 6, 7 and 8 to improve usability in general. In Drupal 8, the Spark distribution, on which I work, has specifically targeted improving the authoring experience.
Some of the authoring experience improvements in Drupal 8 (in part) thanks to Spark:
- two-column backend content editing (with publishing options/meta configuration in a sidebar)
- in-place editing for fields
- CKEditor-powered WYSIWYG editing
The first is noncontroversial when looking at it from a structured content perspective. It’s the second and third that appear to be counter to the premise of structured content — to quote Karen McGrane about WYSIWYG editing:
[…] we allow content creators to embed layout and styling information directly into their content. Unfortunately, the code added by content creators can be at odds with the style sheet, and it’s difficult for developers to parse what’s style and what’s substance. When it comes time to put that content on other platforms, we wind up with a muddled mess.
or Jeff Eaton about in-place editing:
The editing interfaces we offer to users send them important messages, whether we intend it or not. They are affordances, like knobs on doors and buttons on telephones. If the primary editing interface we present is also the visual design seen by site visitors, we are saying: “This page is what you manage! The things you see on it are the true form of your content.”
First, let me state that I in fact do not disagree with either of them. We’ve actually taken that into account while adding WYSIWYG editing and in-place editing to Drupal core. Let me explain how.
WYSIWYG in Drupal 8: enforces clean markup
By default (in the Standard install profile), Drupal 8 will not ship with formatting/layout tools enabled in its WYSIWYG editor (CKEditor).
We make sure in Drupal 8 to prevent crappy markup and format/layout markup (style, font attributes). It’s not only impossible to set these kinds of “bad attributes” in the WYSIWYG editor using the toolbar, it’s also impossible to paste them in or to insert them via “source mode” (where you can type HTML directly). You can type them in source mode, but they will be stripped when you go back to WYSIWYG mode, or upon save if you save without going back to WYSIWYG mode.
This is powered by the new “Advanced Content Filter” feature in CKEditor 4.1, which was added specifically on our request to make this possible.
Furthermore, we made it very easy to configure CKEditor in Drupal 8, yet at the same time very hard to break the above strictness. Only HTML tags and attributes allowed by a specific CKEditor toolbar button will be allowed, even if you add more buttons. So the above “guaranteed clean HTML” will not only be true for the default WYSIWYG configuration, but for any configuration. Drupal 8 will even automatically sync WYSIWYG configuration with filter system configuration:
In the past, configuring WYSIWYG editors was a pain, and in part because of that, the configuration of the WYSIWYG editor and corresponding filter system settings were too permissive.
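The "only what the enabled buttons allow" rule can be sketched in a few lines of Python. This is an illustration of the principle only, not CKEditor's Advanced Content Filter nor Drupal's filter system: the `BUTTON_RULES` mapping, the button names and the `filter_html` function are assumptions made up for this example, and the sketch always allows `<p>`.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical mapping from toolbar buttons to the tags/attributes they allow.
BUTTON_RULES = {
    "Bold": {"strong": set()},
    "Italic": {"em": set()},
    "Link": {"a": {"href"}},
}


def allowed_content(buttons):
    """Merge the rules of the enabled buttons; <p> is always allowed here."""
    allowed = {"p": set()}
    for button in buttons:
        for tag, attrs in BUTTON_RULES[button].items():
            allowed.setdefault(tag, set()).update(attrs)
    return allowed


class StrictFilter(HTMLParser):
    """Drops tags and attributes that are not allowed; keeps the text."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=True)
        self.allowed = allowed
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.allowed:
            return  # the tag is dropped, its text content is kept
        kept = ""
        for name, value in attrs:
            if name in self.allowed[tag]:
                kept += ' {}="{}"'.format(name, escape(value or ""))
        self.out.append("<{}{}>".format(tag, kept))

    def handle_endtag(self, tag):
        if tag in self.allowed:
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def filter_html(html, buttons):
    """Strip everything the enabled buttons do not explicitly allow."""
    parser = StrictFilter(allowed_content(buttons))
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


dirty = '<p style="color:red">Hi <font color="red">there</font> <strong>you</strong></p>'
print(filter_html(dirty, ["Bold"]))  # <p>Hi there <strong>you</strong></p>
```

Because the allowed set is derived from the enabled buttons, adding a button widens what is accepted and removing one narrows it, so the editor configuration and the filtering rules cannot drift apart.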
Finally, we’re currently working on making sure that when you insert an image into a piece of text (with or without a WYSIWYG editor), the stored text won’t contain the final HTML, like <img src="/files/styles/thumbnail/llama.jpg" width="100" height="100" alt="Awesome llama!" />, but instead a placeholder that the filter system will transform into the final HTML upon output: <img data-file-uuid="aa657593-0da9-42c0-9a05-5d63d27ad27d" data-image-style="thumbnail" />.
In other words: the text should only contain text and programmatic references to other content; the filter system should then handle “upcasting” these into their final form. This will make it much, much easier to upgrade existing content to new image styles, to modify referenced media, to migrate to a new CDN, and whatnot.
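As a rough sketch of what such “upcasting” on output could look like, here is a small Python example. It does not reflect how Drupal's filter system is implemented: the `FILES` and `IMAGE_STYLES` lookup tables, the `upcast_images` function and the URL layout are assumptions for illustration, and a regular expression stands in for proper HTML parsing.

```python
import re
from html import escape

# Hypothetical file registry: UUID -> file name and alt text.
FILES = {
    "aa657593-0da9-42c0-9a05-5d63d27ad27d": {
        "name": "llama.jpg",
        "alt": "Awesome llama!",
    },
}

# Hypothetical image styles: name -> (width, height).
IMAGE_STYLES = {"thumbnail": (100, 100)}

# Matches the stored placeholder exactly as written in the text above.
PLACEHOLDER = re.compile(
    r'<img data-file-uuid="(?P<uuid>[0-9a-f-]+)" '
    r'data-image-style="(?P<style>[\w-]+)" />'
)


def upcast_images(text, base_url="/files/styles"):
    """Replace stored placeholders with the final <img> markup on output."""

    def build(match):
        file = FILES.get(match.group("uuid"))
        size = IMAGE_STYLES.get(match.group("style"))
        if file is None or size is None:
            return match.group(0)  # leave unknown references untouched
        width, height = size
        src = "{}/{}/{}".format(base_url, match.group("style"), file["name"])
        return '<img src="{}" width="{}" height="{}" alt="{}" />'.format(
            escape(src), width, height, escape(file["alt"])
        )

    return PLACEHOLDER.sub(build, text)


stored = (
    '<img data-file-uuid="aa657593-0da9-42c0-9a05-5d63d27ad27d" '
    'data-image-style="thumbnail" />'
)
print(upcast_images(stored))
```

Changing the dimensions of the `thumbnail` style, or passing a different `base_url` for a CDN, changes the output for all existing content without touching a single stored text.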
WYSIWYG in Drupal 8: from brochureware to newspapers
Drupal needs to cater to both the extreme of very structured content for maximal reuse and to the extreme of unstructured content (where pretty much all data is in a single “blob” called the “body” field, besides maybe a “title” and a “tags” field). It also needs to deal with everything in between.
Drupal may be used for news sites, but also for brochureware sites. By having the WYSIWYG editor be configurable, and hence letting the site builder choose whether formatting/layout tools are available or not, we empower the user to choose.
WYSIWYG in Drupal 8: previews are evil? WYSIWYM to the rescue?
A WYSIWYG editor by definition provides a preview — a best effort preview, that is not guaranteed to be accurate. Providing a preview is not a problem in and of itself, as long as the author knows and understands that the content will be used in multiple contexts, where it will look different.
Of course, reality is that not every author will be sufficiently educated, so we have to take potential abuse into account. Drupal’s filter system and very strict WYSIWYG editing in Drupal 8 do precisely that.
What might be even better, though, is to make it visually explicit that the WYSIWYG editor is indeed providing a best-effort preview: visualize the building blocks of the content that the author is using, to make the author very aware of the structure of the content being created.
This is what some people have called WYSIWYM: “What You See Is What You Mean”. [3] Wikipedia defines it as follows:
WYSIWYM (an acronym for “what you see is what you mean”) is a paradigm for editing a structured document. It is an adjunct to the better-known WYSIWYG (what you see is what you get) paradigm, which displays a formatted document on screen as it will appear in only one mode of presentation.
The main advantage of this system is the total separation of presentation and content: users can structure and write the document once, rather than repeatedly altering it for each mode of presentation, which is left to the export system.
An HTML editor built specifically to be a WYSIWYM editor exists: WYMeditor.
WYMeditor’s main concept is to leave details of the document’s visual layout, and to concentrate on its structure and meaning, while trying to give the user as much comfort as possible (at least as WYSIWYG editors).
- You may have tried a full-featured WYSIWYG editor, but you apprehend that your clients use it inappropriately, with the risk it degenerates visually and on the code quality.
- You may also have tried the BBcode syntax, Markdown or the wiki-style syntax, but you don’t want to force your clients to solutions that are too technical/complex for them, even if it tends to generate good quality code.
The downside of WYMeditor (besides its utilitarian UI and absence of keyboard accessibility) is that it doesn’t support the whole range of websites that Drupal needs to support: some people want to do everything in a WYSIWYG editor, and for the simplest websites, that’s acceptable. Drupal tries to impose as few choices as possible.
So, ideally, we’d use CKEditor, with a way to turn on a “WYSIWYM mode”. The great news: this already exists to a certain extent in the form of its “Show Blocks” plugin! (Which we’re already shipping with Drupal core specifically to accommodate this.)
If we find this an acceptable solution, then all we need to do is improve CKEditor’s “Show Blocks” plugin!
Of course, this line of reasoning might come across as a superficial solution that isn’t a real solution. But let me demonstrate that the core of this pattern has been used for almost 20 years: in the LaTeX world.
WYSIWYM & LaTeX: LyX
I’m sure many of you know LaTeX. It’s a “document markup language and document preparation system”. It’s typically used for writing papers, but also books. [4]
LaTeX is based on the philosophy that authors should be able to focus on the content of what they are writing without being distracted by its visual presentation. In preparing a LaTeX document, the author specifies the logical structure using familiar concepts such as chapter, section, table, figure, etc., and lets the LaTeX system worry about the presentation of these structures. It therefore encourages the separation of layout from content while still allowing manual typesetting adjustments where needed.
That really captures the gist of it: authors focus on content, don’t think about visual presentation. That’s up to “the system” to figure out.
Now, here too, it is the domain markup, and the complete knowledge of it that authors need, that is problematic: the plethora of LaTeX commands.
That’s why tools like LyX exist. LyX is essentially an easier to use interface to generate LaTeX. It shields the user (mostly) from the rather complex LaTeX markup. It provides a preview of sorts, but one that clearly looks completely different from the end result that LaTeX’s typesetting will generate: LyX encourages writing based on structure (WYSIWYM) rather than appearance (WYSIWYG).
If all of the above sounded rather abstract, let’s look at an example:
- Writing LaTeX: here’s a tiny subset of the LaTeX code — see the attached file for more:
In inline formulas it looks like this:\begin_inset Formula $\lim_{x\rightarrow\infty}f(x)$\end_inset
- Writing LaTeX in LyX:
- The output for both:
LyX’s initial release was in 1995, and it is still actively used. Many, many papers have been written in it, as well as many books.
But … WYSIWYG editors suck!
Sure, WYSIWYG editors sucked… because they allowed for formatting & layout, which Drupal 8’s WYSIWYG editing doesn’t allow.
We still have work to do to stress the importance of content structure over content presentation — see the WYSIWYM section above. But that can be bolted on top of the solid foundations that we already have.
So, these wonderfully colorful quotes used to be painfully true, but they’re not applicable to Drupal 8’s WYSIWYG:
WYSIWYG Editors suck because they promote thinking about style rather than content. While content editors are busy changing headings to Comic Sans, pondering the use of a grimacing smiley on their about us page or getting creative with colour, they are not considering the actual copy they are adding to the site.
WYSIWYG Editors suck because as a designer you lose control over big chunks of the design. Anywhere that allows people to enter HTML via an editor allows them to get as creative as they like, using any mark-up that they like. Unless you carefully go through and remove all the creativity that stuff is going to stay there. For developers, even if you switch off most of the buttons, just allowing the administrator to enter simple formatting and links, you still have a situation where a user is entering HTML which you then display on the website. This can enable all kinds of stuff to get into your content, which is then very hard to remove and fundamentally tied to the current design of the site.
In-place editing
In-place editing does not inherently conflict with structured content. In fact, for most things, Drupal’s implementation of in-place editing stresses the fact that the content is structured: most structured data is impossible to edit in the same way as it is presented. Only for textual fields, we offer the überfancy “true WYSIWYG in-place editing” capability, where Jeff Eaton’s quote from above is most relevant. Even there though, abuse is prevented by the very restrictively configured WYSIWYG editor. For other fields, like taxonomy terms, image fields, boolean fields and so on, we still offer a form-based editing UI while editing in-place, and the danger of letting content presentation prevail is extremely limited.
To a degree, in-place editing can even be useful in increasing awareness of the need for structured content. If the content isn’t structured (i.e. one blob of data, for example a “body” field containing all content besides the title), then that becomes immediately and painfully obvious: no specialized, optimized in-place editors appear to edit the particular piece of content; instead you’d have to find your way to the particular thing you want to edit in the body field.
In-place editing in the way we’ve implemented it encourages structured content.
In our initial implementation of in-place editing, there was more potential for misunderstanding and abuse. But we’ve made two important changes:
- in-place editing is no longer triggered on the page level, but at the entity level: the user must declare his intent to edit a specific entity in-place. So the user can no longer get the impression he’s “editing the page”: he’s explicitly made aware of the type of content (entity type) he’s editing (node, taxonomy term, custom block …) and of the field within that piece of content (entity) that he’s currently editing (Title, Author, Body, Tag, Image …).
- in-place editing no longer saves each field individually. Instead, the modified fields for a specific entity are queued up and saved at once. This strengthens the communication to the user that he’s editing a singular piece of content that just happens to be rendered on this particular page. (In progress.)
Finally, in-place editing is only designed to be used for quick edits (hence it being triggered by a “Quick edit” action in the contextual links of entities). It’s intended to bring a level of “delightful interaction” to editing, instead of being forced to go back to the overwhelming back-end form every single time, even if you don’t need to modify metadata.
Education, understanding, awareness of content reuse
It is absolutely essential that authors (content creators) understand the entire flow of the content: from creating it first, using each field for its proper purpose, to the different ways that content might end up in output.
Because in-place editing happens on the output, and output can happen in many ways, in-place editing never allows all the content to be edited: at the very least it is going to be impossible to edit metadata. From that last perspective, it’s definitely possible for an author to abuse in-place editing.
We need to provide omnipresent, explicit awareness whenever an author is creating or editing content. Both when editing on the back-end and on the front-end. Low-fidelity, simultaneous previews of the different view modes and preferably on multiple form factors would be the ideal here.
Embedding this explicit awareness is something we still have to achieve for Drupal. [5]
Data storage in NPR’s COPE
We saw NPR’s UI earlier in this article. What we didn’t see yet, are two fundamentally different ways of storing the data within what is presented as a single field to the end user:
- Each paragraph of a single text field is stored as a distinct database record. This also implies that the position of the paragraph needs to be stored. (See the full diagram for details.)
- When saving a paragraph, all HTML markup it contains is stored independently: it stores just the text in one database record, and then there is one database record per HTML tag used within that paragraph, which stores the type of tag, the start and end position of that tag within the text, and the attributes for that tag. They call this Markup Addressing.
In essence: extreme database normalization!
Drupal does not yet support this out of the box. The question is whether this is actually necessary. There’s a lot of additional overhead to going so far in normalizing data. What is the use case for storing individual paragraphs in separate database records, when many paragraphs are meaningless without the surrounding paragraphs?
The use case for storing the markup separately from the text it was applied to is clearer: to easily facilitate those platforms that don’t use HTML markup, and to support changes in markup more easily (e.g. <b> → <strong>). NPR decided against the alternative: storing the markup in the database and filtering (stripping/transforming) it on the way out.
The main gripe Daniel Jacobson had with “filter on output” is based on how he’d seen it implemented before: hard-to-maintain scripts, and most systems allowing all markup to be used. However, Drupal already has a mature system to deal with that: its filter system.
Both architectures have downsides. Neither is clearly superior. [6] Time will tell whether Drupal’s data storage approach needs to evolve.
Conclusion
WYSIWYG and in-place editing can clearly be highly problematic when they’re implemented the way they have been on many websites for about a decade now. On many websites, they have been (ab)used to the extreme point of entire HTML pages being built in a WYSIWYG editor, which has caused consistent inconsistency and an utter lack of reuse. Authors liked it at first, until things went bad or until the next redesign.
The other extreme is a system like NPR’s COPE, where it is guaranteed that content is consistent and reusable. At the cost of the authoring experience.
However, I believe that using WYSIWYG editing in a very disciplined manner, combined with a well-defined system for filtering on output and a data model similar to that of NPR’s COPE, can yield results as successful as NPR’s COPE, but with a significantly better authoring experience.
Sources & related reading
- http://en.wikipedia.org/wiki/Content_strategy
- http://blog.programmableweb.com/2009/10/13/cope-create-once-publish-everywhere/
- http://blog.programmableweb.com/2009/10/21/content-modularity-more-than-just-data-normalization/
- http://blog.programmableweb.com/2009/11/11/content-portability-building-an-api-is-not-enough/
- http://karenmcgrane.com/2013/05/23/drupalcon-keynote-video-and-talk-notes/
- https://www.lullabot.com/blog/articles/inline-editing-and-cost-leaky-abstractions
- http://alistapart.com/column/wysiwtf
- http://www.rachelandrew.co.uk/archives/2011/07/27/your-wysiwyg-editor-sucks/
[1] Both examples are content businesses. The efficient managing and reusing of that content is the whole reason they exist and survive. Hence it is acceptable for them to have a very poor authoring experience. Also: the data model has to be right from the beginning; if something was missing or wrong, it may be impossible to transform old content to the updated data model. Hence there is also an intentional lack of flexibility.
[2] Use the Entity View Modes module to create new view modes.
[3] Not in the sense that it was discussed at the WYSIWYM BoF at DrupalCon Portland, where it was really about semantic annotation.
[4] The whole reason it exists is that somebody got fed up with messing with WYSIWYG editors to get everything just right: the typography, the whitespace, the layout, and so on. Instead, that person wanted to just write the content and have software automatically calculate optimal whitespace, optimal typesetting.
[5] The Spark team has already been working on this to a certain extent: the responsive previews patch. However, it is not tightly integrated with editing, neither on the back-end nor on the front-end.
[6] Ideally, there would be a domain-specific markup (as in, a markup with annotations for the specific knowledge domain of your site) that has more expressive semantics and would then be transformed to HTML when the content gets rendered for web purposes, and to something other than HTML for other purposes. We should explore this.
But at the same time, the threshold would become rather high: which sites, besides those whose primary business is the longevity of their content, the long-term relevance and reusability of their content, will want to invest in building their own domain-specific language?
It requires a lot of discipline and research to come up with a sufficiently expressive domain-specific markup. Precisely because once you’ve begun expressing content using your domain-specific markup, there is no way back. You cannot automatically enrich existing content with newly added domain-specific markup. The domain-specific markup must be complete before you begin using it.
Not to mention that the author will need a complete understanding of the domain-specific markup as well, because otherwise it will all have been for nothing. Once you enter this realm, it’s also very realistic (and human) for authors to forget about a few elements of the domain-specific markup. So then something like a WYSIWYG editor, but with buttons that generate the domain-specific markup, could be a great help. This is once again WYSIWYM.
Attachments: LaTeX example (2.58 KB), NPR entity diagram (43.16 KB)