The Content Tree

By Deane Barker on August 18, 2005

A while back, I mentioned the concept of a “content tree” with regard to content management. I cited this as a “functional pattern” and promised to talk about it more, but I never did.

So, here goes —

With every content management system (CMS) I’ve written, I always get back to the concept of a content tree. Additionally, every really good CMS I’ve seen has a content tree as the core structure: Documentum, Ektron, Interwoven, Zope, eZ publish, etc. It has become a pattern of content management if ever there was one.

Simply put, a content tree is a taxonomy – a parent-child structure – of content. You start at a root “folder” or “node” and build down from there. You might have a “folder” full of “article” objects, each of which might contain one or more “image” objects, etc. The idea is that a content object can be the child of another object, and the parent of one or more objects.
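As a minimal sketch of that parent-child structure, the snippet below models a content object that knows its parent and its children. The `Node` class, its `kind` field, and the sample names are hypothetical, chosen only for illustration; they are not taken from any particular CMS.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps identity comparison, so comparing two nodes never
# recurses through the parent/children links.
@dataclass(eq=False)
class Node:
    """One content object in the tree (hypothetical name and fields)."""
    name: str
    kind: str  # e.g. "folder", "article", "image"
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach child under this node and return the child."""
        child.parent = self
        self.children.append(child)
        return child


# Build down from a root folder: folder -> article -> image.
root = Node("root", "folder")
articles = root.add(Node("articles", "folder"))
story = articles.add(Node("city-budget", "article"))
photo = story.add(Node("council-photo", "image"))
```

Here `story` is both a child (of the “articles” folder) and a parent (of one image), which is the whole idea in miniature.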

Obviously, this idea isn’t new, but I’m going to explain how I grasped this pattern one day. Perhaps my story will help you understand why this is such an important pattern —

I was building perhaps my 20th CMS (I think – they all kind of run together...). The CMS I was developing was “type-centric,” meaning content was grouped into “types” or “classes” (another pattern), and I used different types to organize the content. I would click a menu link for “articles” and get all the articles, for instance.

Type-centrism is really common in CMSs, and pretty much where everyone starts. You want articles, click the menu link and you get a list of them. Same for pages, authors, etc. It seems perfectly simple on the surface.

Then one day I added a type for “movie review.” I had to do a little programming, but I eventually had a shiny new type.

It turned out, however, that my “movie review” type was almost identical to my “article” type. They both had a title, a preview, an author, a body of text, an image, etc. But since my CMS was type-centric, they needed to be of different types to be separated in the system – I couldn’t very well click the “articles” link and get “movie reviews” mixed in, now could I?

Sometime later, I added a type for “book review.” Not surprisingly, this looked a lot like my “article” type too – it had all the same fields. With this, it became obvious that I was going to be doing this for a while.

So I decided I would create a “page” type instead, and add a property for “class” which would tell me what it was. This way I could recycle one type for many different uses. Brilliance!

So I had to hack away at my system for a while to separate content objects not only by “type” (page), but by “class” (article, movie review, etc.). It was done, and it worked.

But later, I decided that I wanted to separate “non-fiction” book reviews from “fiction” book reviews. How could I do this? I could create a new property on the “book review” class for “genre”...but I didn’t have a book review class anymore – I had lumped everything into “page.” And if I added “genre” to the “page” type (class? I was confused...), it would affect everything of that type.

Then it hit me – a content object is defined by two things: (1) its format (the properties it exposes), and (2) where it is located relative to other content. Two pieces of content may look the same (have the same properties), but they can still belong to different groups: articles, movie reviews, etc.

So I quickly implemented a content tree – a recursive table of “folders” – and started assigning my content to these folders. My page class could now work just fine everywhere: if a “page” was in the folder for “articles,” then it was an article; if it was in the folder for “fiction” which was in the folder for “book reviews,” then it was a review of a work of fiction, etc.
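One way to picture a “recursive table of folders” is a table whose rows point back at their own parent. The sketch below uses Python's built-in `sqlite3` module and a recursive query to work out what a page “is” from the chain of folders above it. The table names, column names, and sample rows are assumptions made for illustration, not the schema of the system described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE folder (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES folder(id),  -- NULL for the root
        name      TEXT NOT NULL
    );
    CREATE TABLE page (
        id        INTEGER PRIMARY KEY,
        folder_id INTEGER NOT NULL REFERENCES folder(id),
        title     TEXT NOT NULL
    );
    INSERT INTO folder VALUES
        (1, NULL, 'root'),
        (2, 1, 'articles'),
        (3, 1, 'book reviews'),
        (4, 3, 'fiction'),
        (5, 3, 'non-fiction');
    INSERT INTO page VALUES
        (1, 2, 'City budget passes'),
        (2, 4, 'A novel worth reading');
""")


def folder_path(conn, page_id):
    """Return the folder names from the root down to the page's folder."""
    rows = conn.execute("""
        WITH RECURSIVE chain(id, parent_id, name, depth) AS (
            -- start at the folder that directly holds the page
            SELECT f.id, f.parent_id, f.name, 0
            FROM folder f JOIN page p ON p.folder_id = f.id
            WHERE p.id = ?
            UNION ALL
            -- then climb one parent at a time until parent_id is NULL
            SELECT f.id, f.parent_id, f.name, chain.depth + 1
            FROM folder f JOIN chain ON f.id = chain.parent_id
        )
        SELECT name FROM chain ORDER BY depth DESC
    """, (page_id,)).fetchall()
    return [name for (name,) in rows]


print(folder_path(conn, 2))  # ['root', 'book reviews', 'fiction']
```

Both rows in `page` have exactly the same columns; only the folder chain says that one is an article and the other is a review of a work of fiction.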

This suddenly opened up other features, since any content object could be “assigned” to another one by virtue of being its child. Instead of having a single property for “image,” a content object could have as many images as it wanted.

I just needed to create an “image” object that could exist in the tree. If an article needed to be supported with 17 images, then the images could fit into the tree as children of the article. It became quite simple for an object to query the tree to collect all its children, so the organization and assignment of content objects suddenly became a breeze.
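To illustrate how an object might query the tree for its children, here is a small sketch that works on flat rows, each carrying a parent id. The row layout (`id`, `parent_id`, `kind`, `name`) and the helper names are hypothetical; a real system would likely read these rows from its content store.

```python
from collections import defaultdict

# Hypothetical flat rows, as they might come back from a content table.
ROWS = [
    {"id": 1, "parent_id": None, "kind": "folder", "name": "articles"},
    {"id": 2, "parent_id": 1, "kind": "article", "name": "harbor-story"},
    {"id": 3, "parent_id": 2, "kind": "image", "name": "harbor-dawn.jpg"},
    {"id": 4, "parent_id": 2, "kind": "image", "name": "harbor-dusk.jpg"},
    {"id": 5, "parent_id": 1, "kind": "article", "name": "other-story"},
]


def index_children(rows):
    """Group rows by parent id so child lookups need no further scanning."""
    by_parent = defaultdict(list)
    for row in rows:
        by_parent[row["parent_id"]].append(row)
    return by_parent


def children_of(by_parent, node_id, kind=None):
    """Return the direct children of node_id, optionally only one kind."""
    kids = by_parent.get(node_id, [])
    if kind is None:
        return list(kids)
    return [k for k in kids if k["kind"] == kind]


by_parent = index_children(ROWS)
images = children_of(by_parent, 2, kind="image")
print([img["name"] for img in images])  # ['harbor-dawn.jpg', 'harbor-dusk.jpg']
```

The article row has no “image” property at all. Two images or seventeen, the lookup is the same.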

Additionally, it became easy to address different branches of the tree uniquely. I could decide that all the book reviews would be formatted a certain way, for example. If the system was rendering an object along that branch – however deep it might be – it could apply different images and styles. If I wanted non-fiction book reviews to look different from fiction book reviews, I could further subdivide the branches.
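Branch-level formatting can be sketched as a lookup where the deepest styled branch wins. In the snippet below, the `BRANCH_STYLES` mapping, the stylesheet names, and the `style_for` helper are all made up for illustration; the point is only that an object inherits its look from wherever it sits.

```python
# Hypothetical style assignments, keyed by branch path from the root.
BRANCH_STYLES = {
    (): "default.css",
    ("book reviews",): "review.css",
    ("book reviews", "non-fiction"): "review-nonfiction.css",
}


def style_for(path):
    """Return the style of the deepest styled branch that contains path."""
    path = tuple(path)
    # Try the full path first, then drop one trailing segment at a time.
    for depth in range(len(path), -1, -1):
        style = BRANCH_STYLES.get(path[:depth])
        if style is not None:
            return style
    raise LookupError("no style configured for the root branch")


print(style_for(["book reviews", "fiction", "some-novel"]))    # review.css
print(style_for(["book reviews", "non-fiction", "some-bio"]))  # review-nonfiction.css
print(style_for(["articles", "city-budget"]))                  # default.css
```

Subdividing a branch further is then just one more entry in the mapping; nothing about the content objects themselves has to change.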

I also found that the content tree eliminated the need for a lot of properties. As I mentioned before, there was no need for “genre” when describing a book review. That property was covered by where it sat in the tree.

This went on for some time – the content tree had opened up so many new possibilities that it kept me busy for quite a while. And with that, I realized why most good content management systems have some sort of tree structure as the backbone of their content organization. This is obviously a simplistic overview, but it’s an accurate description of the epiphany that led me to this understanding.

If you’re contemplating a CMS project, consider this story carefully. I’m quite convinced that if you hack away on and refine your system long enough, all roads will lead you to this core concept – this functional pattern.

(Note: if you’re going to implement a content tree, implement the envelope pattern too. Don’t assign letters to the content tree, assign envelopes. Generally speaking, the content tree shouldn’t care about the letters.)

Comments (3)

SasQ says:

Great idea! :) I think it could be even better to develop it further, e.g. by virtualizing the tree nodes. Then they could refer to content in a database, files, remote resources, or even content generated on the fly! :D If we do that, the tree could be purely virtual and wouldn’t even need to exist anywhere physically! :D It could be generated or loaded [from files or a database] on demand. It could also be used to contain other types of content, e.g. users, their profiles and credentials, access control lists, menu lists etc. Then the tree could be used by other modules of the CMS [authentication module, access control module, user management etc.], so it could become the real CORE of the system :) What do you think about that idea?

But there is still a little problem with it. Content objects differ not only by category [“book review”, “movie review”, “fiction” etc.], but also by their physical structure: an “image” has different fields [size, format...] than a “blog post” [author, posting date, tags] or a user profile [many useless fields :P]. How would you handle that?

Carol says:

Nice article with some good points – I think this “Content Tree” in most CMSs is often the taxonomy of site > sections > pages > content types/layouts. A good CMS will let the publisher classify content by format/function (i.e., content type) and topically (i.e., categories, subcategories, etc.). I also like CMSs that let me finally “tag” content with keywords, which provides another map to the content.

Jens says:

I have been building a CMS and very early on arrived at a content tree. But I arrived at it from another angle. Basically I wanted to avoid creating tables + CRUD for each microsite that had some content to manage, so I figured I needed a generic table that could store all content for any given site. I figured a good way to do it would be adding a pathname as an id for each item (so I could refer uniquely to it from any page) and a type (so I could know if it’s an image, text, or something else). Then on each page I would query a given “folder” and get all the objects matching it, then I would render them any way the page needed. As you said, no matter what, you always end up doing the same thing. (No wonder every single OS has a file system!)