Uber-Text Pages and the Lack of Inheritance in Content Management

By Deane Barker on April 21, 2008

We had a build meeting the other day for a client’s site, and we walked through the site map to determine what content types we were going to need to pull this off.

In these cases, the first content type you inevitably define is the ubiquitous “text page.” This is a simple page. Of text. Duh.

Text pages usually consist of a title, a summary (for index pages where you’re listing a bunch of them), and a body of text. Many content management systems support this model explicitly (it’s built-in this way — think of a blogging platform), or you end up modeling your page like this if your system gives you that ability.

But how far do you…push the text page? There are a lot of opportunities to re-use this content type. How far do you take it?

This particular client also needed an “announcement.” We took a long view of it, and determined that their announcements section was really just a group of text pages, reverse-ordered by date. So, we thought, let’s just tack a “date” field on the text page model, and be done with it. If the text page is in an announcements section, we’ll order by that date. If not, we’ll ignore it.

Then, the client needed an “article” content type. Well, what is an article? It’s a text page…with a date…and an author. So, let’s just tack an “author” datatype on the “text page” model, and we’re good…right? We can use it when we need to, and ignore it when we don’t.

Later, the client needed a “newsletter” content type. Turns out, this is just a text page with a PDF file attachment. So, we tacked on a “file” datatype…

Now, in truth, this situation was hypothetical. But you see the idea at work here? How content types are really just derivative of a core content type? The fact is, an awful lot of content types can be defined as:

  1. Title
  2. Summary
  3. Text body

Tack on these datatypes —

  1. Date
  2. Author
  3. File attachment

— and you’ve handled four separate, logical types in our mythical client’s content model: text page, announcement, article, and newsletter.
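As a rough illustration, the uber-text page could look like the Python sketch below. The class name `TextPage`, the field names `published_on` and `attachment`, and the `announcements` helper are all made up for this example; no particular CMS is implied.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class TextPage:
    """One content type standing in for all four logical types."""
    title: str
    summary: str
    body: str
    # Tacked-on fields: filled in when a section needs them, ignored otherwise.
    published_on: Optional[date] = None   # used by announcements and articles
    author: Optional[str] = None          # used by articles
    attachment: Optional[str] = None      # file path, used by newsletters


def announcements(pages: List[TextPage]) -> List[TextPage]:
    """Return dated pages, newest first; undated pages are left out."""
    dated = [p for p in pages if p.published_on is not None]
    return sorted(dated, key=lambda p: p.published_on, reverse=True)
```

Note that nothing in the type itself says which fields belong to which logical type. That knowledge lives in the section or the template, which is where the trouble discussed next comes from.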

So, the question is, did we take this too far? Or is what we have planned here an elegant solution to modeling this content?

In the end, it depends. It depends on a lot of your content management system’s functionality external to content modeling. Dividing these four logical types into multiple actual types is often valuable for more than just content modeling — many systems will drive things like templating and permissions by content type. And what happens when you need to add a property to your Announcements, but not your Articles? So having everything as some uber-text page can lead to other issues.

In the end, it comes down to repetition vs. elegance. While duplicating your core set of properties on every content type is a pain, you avoid some tricky issues. Conversely, pushing the envelope with a single content type is elegant, but you can paint yourself into a corner pretty quickly.

But, my point here is that we shouldn’t have to do this. And here’s why —

Very few content management systems are using the object-oriented concept of inheritance these days. Inheritance says that Class B is a superset of Class A — it includes all of Class A’s functionality, and then some more. So if I happen to change Class A, Class B will change too.

In this case, I would model a “Page” object with these properties:

  1. Title
  2. Menu Title (for implicitly menued systems)
  3. Summary
  4. Body
  5. META keywords
  6. META description

Then, I would extend this base “Page” object into the “Announcement” object by adding a “date.” I would extend that into an “Article” object by adding an “author,” and into the “Newsletter” object by adding a “file.”

Then, say I want to geo-locate everything someday. I just add a “location” attribute to the base object, and everything extends from that.
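In an ordinary object-oriented language, that hierarchy might be sketched as follows. This is Python, not any CMS's modeling API, and it makes a few assumptions: the field names are invented, `Newsletter` extends `Page` directly (following the earlier "text page with a file" description), and `location` is shown already added to the base class.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import List, Optional


@dataclass
class Page:
    title: str
    menu_title: str
    summary: str
    body: str
    meta_keywords: str = ""
    meta_description: str = ""
    # Added to the base later; every subclass picks it up automatically.
    location: Optional[str] = None


@dataclass
class Announcement(Page):
    published_on: Optional[date] = None


@dataclass
class Article(Announcement):
    author: str = ""


@dataclass
class Newsletter(Page):
    file: Optional[str] = None


def field_names(content_type: type) -> List[str]:
    """List every property a type has, inherited ones included."""
    return [f.name for f in fields(content_type)]
```

Calling `field_names(Article)` lists the base `Page` properties, including `location`, followed by `published_on` and `author`, without `Article` ever mentioning them.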

Very few content management systems allow this. I’ve seen it in exactly two systems, both heavy on document management: Alfresco and Documentum. It’s elegant, it’s precise, and it’s powerful, which should be obvious since it’s been a core tenet of object-oriented programming for years.

Sadly, implementing this kind of system is complicated, and usually computationally expensive. Documentum, for example, maintained a database table for every level of inheritance, and did one-to-one joins all the way down the inheritance tree to return a big database row for an object. (But, on the other hand, this is built-in to Postgres, so WTH?)
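To make the cost concrete, here is a small sketch of the table-per-level pattern using Python's built-in `sqlite3` module. The table and column names are invented for illustration and are not Documentum's actual schema; the point is only that loading one object means one join per level of the tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table per level of the inheritance tree, all sharing the same id.
    CREATE TABLE page (
        id INTEGER PRIMARY KEY, title TEXT, summary TEXT, body TEXT);
    CREATE TABLE announcement (
        id INTEGER PRIMARY KEY REFERENCES page(id), published_on TEXT);
    CREATE TABLE article (
        id INTEGER PRIMARY KEY REFERENCES announcement(id), author TEXT);
""")
conn.execute("INSERT INTO page VALUES (1, 'Launch', 'We launched.', 'Full story')")
conn.execute("INSERT INTO announcement VALUES (1, '2008-04-21')")
conn.execute("INSERT INTO article VALUES (1, 'Hypothetical Author')")


def load_article(connection, object_id):
    """Rebuild one wide row by joining one-to-one down the tree."""
    return connection.execute(
        """
        SELECT p.id, p.title, p.summary, p.body, an.published_on, ar.author
        FROM article AS ar
        JOIN announcement AS an ON an.id = ar.id
        JOIN page AS p ON p.id = an.id
        WHERE ar.id = ?
        """,
        (object_id,),
    ).fetchone()
```

A type three levels deep needs two joins for every read, and every additional level adds another.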

Even if a system didn’t let you do traditional inheritance, N-levels deep, it would be handy if you had a “base object” from which you could derive your types. Meaning, you could alter a base object to include things like the title, summary, text, etc., then each type would be adding properties to this base type. You couldn’t go more than the one level deep, but it would still solve a number of problems.
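A one-level version of this is simple enough to sketch in a few lines of Python. The names `BASE_FIELDS`, `CONTENT_TYPES`, and `fields_for` are hypothetical; the idea is just that every type is the base list plus its own additions.

```python
from typing import Dict, List

# Properties every content type gets.
BASE_FIELDS: List[str] = ["title", "summary", "body"]

# Each type lists only what it adds to the base. With a single level,
# "article" has to repeat "date" instead of inheriting it from "announcement".
CONTENT_TYPES: Dict[str, List[str]] = {
    "text_page": [],
    "announcement": ["date"],
    "article": ["date", "author"],
    "newsletter": ["file"],
}


def fields_for(type_name: str) -> List[str]:
    """Base properties first, then the type's own additions."""
    return BASE_FIELDS + CONTENT_TYPES[type_name]
```

Appending `"location"` to `BASE_FIELDS` would make it show up in `fields_for` for all four types at once, which covers the geo-location case without any deeper hierarchy.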

If your CMS has a strong content tree, you could fake inheritance a bit. You could create a base content type, then add subcontent to “flavor” it. Your base type would have the core properties, and you could add subcontent underneath it to hold other information specific to the pseudo-class you need that particular object to act like. This is hack-ish, but it might work well in some cases, and it fits the model of “Custom Field Sets” we discussed several years ago.
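Here is one way the content-tree trick might look, as a Python sketch. `ContentNode`, the `is_field_set` flag, and `effective_properties` are all assumptions made for the example; a real CMS would have its own way of telling "flavoring" subcontent apart from ordinary child pages.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ContentNode:
    content_type: str
    properties: Dict[str, Any]
    # True for subcontent that only exists to extend its parent.
    is_field_set: bool = False
    children: List["ContentNode"] = field(default_factory=list)


def effective_properties(node: ContentNode) -> Dict[str, Any]:
    """Merge the base node's properties with those of its field-set children."""
    merged = dict(node.properties)
    for child in node.children:
        if child.is_field_set:
            merged.update(child.properties)
    return merged


# A base page "flavored" as an article by a child holding the extra fields.
article = ContentNode(
    "page",
    {"title": "Launch", "summary": "We launched.", "body": "Full story"},
    children=[
        ContentNode("article_fields", {"author": "Hypothetical Author"},
                    is_field_set=True),
        ContentNode("page", {"title": "A normal child page"}),
    ],
)
```

Templates would read `effective_properties(article)` instead of the node's own properties, so the page behaves as if it had an author field even though the base type never declared one.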

In the end, content type inheritance is the holy grail of content modeling, and you don’t see it that often, which is too bad. It would be a huge asset to any CMS that implemented it. eZ publish claims that it’s on the roadmap, but I’ve yet to see anyone put a date on it.

  1. Why not use collections instead of inheritance? Inheritance used the way you described it has the flaws you described. I tend to design CMS systems in a different way. There is a ContentObject that has an ID, some dates, a status, etc., and a set of Items, each itself a ContentObject.

    Each content object can be extended to be a Page (base object for web resources), Image, Text, Paragraphs, or even File (base object for downloadable resources), etc. Some types have a set of Metadata Types (e.g. Page has HTML metadata, Image has EXIF metadata, etc.)

    So when I get to create an Article, I have a Page object that has a collection of Paragraphs. If I need a list of announcements, again it's a Page with a collection of Articles, and a Text object for the announcements description.

    So on a certain level it is collection instead of inheritance.

    The question is, is it good practice to extend Article so it can be a Press Release? Better not. I think that adding a configuration to your Page object is better. So you can have a Page object and ask the object what type it is and what that means. An Article is a Page of type Article, and that means it has a collection of Paragraphs. A Paragraph is an ObjectType with a collection of Text. So now you have a rather flat structure rather than inheritance. Whenever you would like to change one of the types and add something new, it is easily done by changing the configuration of that Type.

    Note that in this model you can also work on the parts of the main Article: if it has 2 paragraphs, each paragraph will have its own status and unique ID, and is also a ContentObject, so it can be processed separately.
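One possible reading of the model in this comment, sketched in Python. Everything here is an interpretation: `TYPE_CONFIG`, the type names used as keys, and the `add` method are hypothetical, and the commenter may have had something different in mind.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List

_next_id = count(1)

# Configuration instead of subclasses: what each type may collect.
TYPE_CONFIG: Dict[str, List[str]] = {
    "AnnouncementList": ["Article", "Text"],
    "Article": ["Paragraph"],
    "Paragraph": ["Text"],
    "Text": [],
}


@dataclass
class ContentObject:
    type_name: str
    status: str = "draft"
    items: List["ContentObject"] = field(default_factory=list)
    id: int = field(default_factory=lambda: next(_next_id))

    def add(self, item: "ContentObject") -> "ContentObject":
        """Attach an item if this object's type is configured to hold it."""
        if item.type_name not in TYPE_CONFIG[self.type_name]:
            raise ValueError(
                f"{self.type_name} cannot contain {item.type_name}")
        self.items.append(item)
        return item
```

Changing what an Article can hold means editing its entry in `TYPE_CONFIG` rather than introducing a subclass, and each Paragraph keeps its own `id` and `status` so it can be processed on its own.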
