By Deane Barker | July 17, 2011 | 13 Comments
There’s a weird aspect of content management that I see popping up again and again: spatial context, or content geography. What I mean is that content “lives” in some place inside the repository. A lot of systems (though not all) have this sense of “place.”
I’ve discussed content trees at length before. In systems following this pattern, you have a tree of content – all content is in some parent-child relationship, and part of a larger overall structure.
In these cases, content has a place. The content tree forms what we’ll call the “master geography” of the system. It’s a large, obvious, accepted structure that forms implicit relationships between content.
If you take Page A and Page B, they have some spatial relationship, and given the hierarchical nature of content tree-based systems, it becomes most natural to talk about this in terms of genealogy:
- They could both be children of the same parent, making them siblings.
- One could be the parent of the other.
- They could be cousins: their parents are siblings, children of the same grandparent.
- In a larger sense, they could be “closely related” to each other, or “distantly related” to each other, by virtue of being flung all the way across the tree.
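To make the genealogy concrete, here is a minimal sketch of a content tree in Python. The `Page` class, the `relationship` and `distance` helpers, and the page names are all invented for illustration; no particular CMS exposes this API.

```python
class Page:
    """A node in a content tree; every page except the root has one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def lineage(self):
        """Return [self, parent, grandparent, ..., root]."""
        chain, node = [], self
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain


def distance(a, b):
    """Count the tree edges between two pages via their nearest common ancestor."""
    up_b = b.lineage()
    for steps_a, node in enumerate(a.lineage()):
        for steps_b, other in enumerate(up_b):
            if node is other:
                return steps_a + steps_b
    raise ValueError("pages are not in the same tree")


def relationship(a, b):
    """Name the genealogical relationship between two pages."""
    if a is b:
        return "same page"
    if a.parent is b or b.parent is a:
        return "parent and child"
    if a.parent is not None and a.parent is b.parent:
        return "siblings"
    grand_a = a.parent.parent if a.parent is not None else None
    grand_b = b.parent.parent if b.parent is not None else None
    if grand_a is not None and grand_a is grand_b:
        return "cousins"
    return f"related at distance {distance(a, b)}"


# Hypothetical site: home -> (news, about); news -> (page-a, page-b); about -> page-c
root = Page("home")
news = Page("news", root)
about = Page("about", root)
page_a = Page("page-a", news)
page_b = Page("page-b", news)
page_c = Page("page-c", about)

print(relationship(page_a, page_b))  # siblings
print(relationship(news, page_a))    # parent and child
print(relationship(page_a, page_c))  # cousins
print(distance(page_a, page_c))      # 4
```

The point of the sketch is that none of these relationships is stored anywhere. They all fall out of where each page was "put."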
The bottom line is that in a lot of systems, you have this sense of geography – content is “put” somewhere, and you use this knowledge of its place to make value judgments about its relationship to other content.
Why is Page A in this place and Page B in that place? What differentiates the places? Usually it’s based on navigation and menuing. (Related: Menuing in Content Management: Implicit vs. Explicit)
But what if our system has no content tree or folder structure? What forms its geography?
Drupal is the prototype of the “big bucket o’ content” systems. By default, content in Drupal has no place. Some modules can superimpose some sort of geography onto Drupal, but what I found so damn bewildering about Drupal for the longest time is that there is a big amorphous pool of content, into which everything is thrown.
The way that a system like Drupal injects geography is through menus. You group content into menus, so Page A and Page B might be children of the same parent in Menu A. For your site, you may decide that Menu A is the main menu of the site, making this your master geography. But Page A and Page B may not be anywhere close to each other on Menu B, and what does that mean? What makes Menu A the master geography, and not Menu B? Just because you say so?
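A rough sketch of that situation, in Python. This is not Drupal's actual data model or API; the `pool`, `menus`, and `MASTER_MENU` names and the menu contents are assumptions made up to show two independent geographies laid over the same flat pool.

```python
# A flat pool of content: nothing here says where a page "lives".
pool = {"page-a", "page-b", "products", "support", "downloads"}

# Each menu maps an item to its parent within that menu (None = top level).
menus = {
    "menu_a": {"products": None, "page-a": "products", "page-b": "products"},
    "menu_b": {
        "page-a": None,
        "support": None,
        "downloads": "support",
        "page-b": "downloads",
    },
}

# Nothing in the data makes one menu more authoritative than another;
# the master geography is only a convention the site builder declares.
MASTER_MENU = "menu_a"


def are_siblings(menu_name, first, second):
    """True if both items share a parent within the named menu."""
    menu = menus[menu_name]
    if first == second or first not in menu or second not in menu:
        return False
    return menu[first] == menu[second]


def depth(menu_name, item):
    """Number of levels between an item and the top of the named menu."""
    menu = menus[menu_name]
    levels = 0
    while menu[item] is not None:
        item = menu[item]
        levels += 1
    return levels


print(are_siblings(MASTER_MENU, "page-a", "page-b"))  # True
print(are_siblings("menu_b", "page-a", "page-b"))     # False
print(depth("menu_b", "page-b"))                      # 2
```

The same two pages are siblings in one geography and far apart in the other, and the only thing that settles which answer "counts" is the `MASTER_MENU` constant.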
What do we call a system that allows you to create multiple, independent geographies of content, yet has no designated, accepted master geography?
What drove me nuts about Ektron for the longest time is that it has both – it has a folder structure, and it has explicit menus. So what is the single source of truth on content geography: the folder structure or the menus? I would call the folder structure the master geography, but since it wasn’t a true content tree, you couldn’t really use it for navigation very well.
(In fact, I was told by Ektron support on more than one occasion that the folder tree was never meant to define navigation. This irritated me in later versions when they tied breadcrumbs to folders, thus sort of saying “okay, now the folders should be used to define navigation…”)
So, in the end, I needed to use Ektron menus for navigation, which made them master-ish…but the folder structure imparted more geographic-ness…and, you get why I got frustrated.
(On top of this, Ektron has a taxonomy system too. And content collections. So, if we have a folder for “Employee News,” a menu for “Employee News,” a taxonomy node for “Employee News” and a collection for “Employee News,” and all four of them group different content…then where the hell is the Employee News, exactly?)
I guess, no matter what the system, you, as the CMS architect, need to have some single source of truth on geography. If the system doesn’t provide a strong content tree or folder structure, you’re going to need some way to figure out how Page A and Page B relate to each other. What is your master geography?
Whether you acknowledge it or not, so much context is drawn from geography. In most cases, it’s navigation, and this has a huge effect on how your site plays out.
What prompted this post is that I was re-reading a section in the Polar Bear Book the other day. Rosenfeld and Morville were discussing types of navigation. They differentiate between these two:
- Local Navigation
- Contextual Navigation
But for Contextual Navigation, they’re talking about what they call “associative navigation,” which are menus like “More Employee News” or “More Articles Like This.” Content that is somehow related to the main content of the page.
However, note that they have implicitly set apart the notion of geography. Local Navigation is based on geography. Contextual Navigation is based on any other type of relationship. Whether they meant to or not, they set geography apart as a special type of relationship, and one that’s apparently so basic as not to need elaboration.
This is what got me thinking about master geographies and how ingrained they are in most of the systems I work with. Can this be limiting? Occasionally, but not usually. Having the system define a master geographic structure is valuable because it gives everyone a common reference point for that CMS. When using EPiServer, you know that the content tree forms the geography of the site and is the default reference for discussion of place.
However, there is also value in being able to create alternate geographies, which are common in the explicitly menued systems like Ektron. (That system in particular, as noted earlier, might have too many alternate geographies.)
The best of both worlds? A clear, accepted master geography in the form of a content tree (not a folder tree) with a solid option for creating alternates as needed. Who does this really, really well?
(EPiServer is close, but it doesn’t have a good method for creating ad hoc hierarchical geographies like Ektron’s or Drupal’s menus, which is unfortunate.)
Re “who does content tree well”: look at JCR and CQ5.
cf rule 2 in http://wiki.apache.org/jackrabbit/DavidsModel
I agree regarding the master geography of a content tree. This may be obvious, but I also wanted to add any page in that tree should have a URL reflecting its geography in the tree that also serves as the page’s primary key. It’s home, both internally and externally. A tree of pages naturally lends itself well to a path of page names and accurately describes its placement. It’s the way URLs are designed to work and it makes the individual page geography readable.
Ideally the CMS should support the model by not internally disconnecting the content from its URL, as many bucket-based CMSs do. When a page of content has multiple homes (URLs) it goes against the wiring of the web. One example is with search engines (like Google). Anyone working with search accessibility knows that the same content living at multiple URLs reduces the value of that content. Both on-site and off-site links form to multiple URLs and the potential ranking value of that content gets diluted. Duplicate content penalties may become part of the picture. Missed opportunities ensue. But I should clarify I’m talking about large blocks of complete content, not snippets, summaries and partial content. I’m also not suggesting this is a widespread problem with bucket-based structures, just that such systems treat content in a way internally that you’d never want to present it externally… perhaps reinforcing a counterproductive way of thinking about your content. (blogs excepted)
It’s always boggled my mind why systems like Drupal, EE, and countless others build upon a completely different structure (or lack of one) that is so foreign to the presentation and front-end structure. When they do introduce structure, some middle-man logic must be introduced by the site developer to determine what content gets pulled from what bucket/folder/channel. URLs aren’t the primary key to that content… though the competent developer will still ensure the end result looks like they are.
As for creating the alternate geography, this may be the draw of buckets. When you have no native geography, everything is an alternate and it all lives at the same level. If your mind is tuned to buckets rather than structure, perhaps the content tree looks like a confusing stack of buckets rather than a row of them. But in reality, the opportunities for confusion are much greater when there is no tree because the structure is what you ultimately have to serve.
In a content tree, I think it’s simplest if every page in the tree is of a defined type. Each type defines a group of fields. A basic type might consist of the fields: title, body, summary. Another type for news might be: title, date, author, body, summary. Every page in that tree carries one type or another. Making alternate connections in such a content tree is simple because a field like “author” (for example) is a page-to-page connection. And you can define any number of one-to-one or one-to-many page-referencing fields as you see fit for a given type. When the system references that field, it’s just a pointer to a page (rather than something like a block of text). All content still has a home. You can pull content from wherever you want in the tree, but you do so with a specific intention.
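As a hedged illustration of the model described in this comment, here is a small Python sketch. The `PageType` and `Page` classes, the `path` property, and the sample pages are hypothetical, not taken from any specific CMS. Every page has a type, a type defines its fields, a reference field holds a pointer to another page, and each page's path is derived from its place in the tree.

```python
class PageType:
    """A named group of fields; some fields may be pointers to other pages."""

    def __init__(self, name, fields, references=()):
        self.name = name
        self.fields = set(fields)
        self.references = set(references)  # subset of fields that hold Page pointers


class Page:
    def __init__(self, page_type, name, parent=None, **values):
        unknown = set(values) - page_type.fields
        if unknown:
            raise ValueError(f"{page_type.name} has no fields {sorted(unknown)}")
        for key in page_type.references & set(values):
            if not isinstance(values[key], Page):
                raise TypeError(f"{key} must point to a Page")
        self.type = page_type
        self.name = name
        self.parent = parent
        self.values = values

    @property
    def path(self):
        """The page's single home, built from its position in the tree."""
        if self.parent is None:
            return "/"
        return self.parent.path.rstrip("/") + "/" + self.name


basic = PageType("basic", ["title", "body", "summary"])
news = PageType(
    "news",
    ["title", "date", "author", "body", "summary"],
    references=["author"],
)

home = Page(basic, "", title="Home")
people = Page(basic, "people", home, title="People")
jane = Page(basic, "jane", people, title="Jane")
news_section = Page(basic, "news", home, title="News")
story = Page(news, "launch", news_section, title="Launch", author=jane)

print(story.path)                    # /news/launch
print(story.values["author"].path)   # /people/jane
```

The author link crosses the tree without moving either page: both still have exactly one home.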
I think that the widespread use of buckets comes from the legacy of the CMSs we all recognize. In the early 2000s, buckets were easy to code and they fit the need (blogs). But some of these products have scaled beyond blogs and these buckets are awkward and heavy obstacles to carry in a modern CMS.
Deane, many good points, but the whole discussion is based on content being seen as pages and that content is best stored in a content tree. I disagree. In some cases, it’s a good model. But let me ask you a couple of questions to try to illustrate my point:
Let’s say we want to store content about movies and actors and roles etc. Would you store that content in a content tree if you were creating a database application from scratch? Or would you store your content using relationships, the way relational databases were designed for? If you wouldn’t store content in that situation in a content tree, what makes it a good idea to do it in a CMS? Do you think you could model the IMDB database in a content tree?
This brings me to the next point. Why should the CMS dictate how you store and structure your content? Why shouldn’t the needs of the content determine how you choose to store the content? Shouldn’t the CMS just allow you to choose to use a content tree? Or a folder structure when that’s the best solution? Or a purely relational content structure when the content demands it? Or a mix of all of them in the same project?
Vidar, I know where you’re going with this. Refer back to my prior post on relational content modeling — yes, a pure relational model where your CMS mimics a custom database very closely is great (a la WebNodes).
But the content tree is a good distillation of the majority of needs of Web sites today. It makes sense to users and editors, and is a good balance between standardization and real-world needs.
Vidar: I don’t see any problem storing movies, actors and roles in a content tree, if the software I am using allows for good relationships. As Ryan wrote earlier, it gives us a good and sensible URL structure “for free”. A movie site probably has URLs like:
Then it makes complete sense to have the content in a content tree. It just have to be easy to add relations between different content. I need an easy and clean way to say that “role terminator” is played by “arnold the actor”. And when I say that, then “terminator the movie” automatically knows that “arnold the actor” acts in that movie.
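A minimal sketch of that kind of derived relation, in Python. The `RoleIndex` class and its method names are invented for illustration, and the labels are simply the ones used in the comment. The relation is stated once, on the role, and both the movie and the actor can answer questions from it.

```python
class RoleIndex:
    """Stores (role, actor, movie) links and answers queries from either side."""

    def __init__(self):
        self._links = []  # list of (role, actor, movie) tuples

    def add_role(self, role, actor, movie):
        """State the relation once: this role, in this movie, is played by this actor."""
        self._links.append((role, actor, movie))

    def cast_of(self, movie):
        """Actors in a movie, derived from the role links rather than stored on the movie."""
        return sorted({actor for _, actor, m in self._links if m == movie})

    def movies_of(self, actor):
        """Movies an actor appears in, derived the same way."""
        return sorted({movie for _, a, movie in self._links if a == actor})


index = RoleIndex()
index.add_role("role terminator", actor="arnold the actor", movie="terminator the movie")

print(index.cast_of("terminator the movie"))  # ['arnold the actor']
print(index.movies_of("arnold the actor"))    # ['terminator the movie']
```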
Vidar, I’ve been thinking about it, and the simple fact is that I really wouldn’t store IMDB in a content tree. The content model in that case is relational enough that I would do something like a custom database or WebNodes.
However, it would work great if you add in something a little more semantic to EPiServer, like Web3 (watch the video):
For Drupal, modules like Node Relativity could help here.
The bottom line is that a content tree lends itself very well to Web sites. In the event you need something more solidly relational, then maybe a generic CMS isn’t going to work for you, and you need a legitimate database (or, again, something extremely relational like WebNodes or Refresh SR2).
You need to evaluate each situation on its merits. Some (most) work well in a content tree, a few don’t.
@apeisa: How your URLs look and how you structure your content are two totally different things. Any CMS should be able to create the URLs in your example, regardless if it’s a content tree based system, or more relational.
You said “It just have to be easy to add relations between different content” -> that’s my point. A content tree with a parent/child structure is insufficient to model complex data. If you have a system that can create relations between content outside the parent/child content, and preferably different types of relations, you don’t have a content tree based cms anymore.
I’m familiar with Networked Planet. Quite a few EPiServer and SharePoint projects in Norway are using them. I haven’t used it myself, but I’ve talked to several developers that have used it, and one of our employees has used it fairly in depth. Have you used them?
My point with the previous post was that choosing the CMS from project to project, based on the content structure you need, is not an ideal situation. Neither is using a content tree for all projects. Likewise, most simple websites don’t need a relational content structure. Trying to come up with a relational content structure just because you can is not good either. To summarize, IMO your content should determine how you structure your content, not the CMS.
Vidar: I am talking about systems that use a content tree but still allow strong relations. If you build websites (or some kind of web database like IMDB) then you do need URLs. And you do want URLs that make sense. And I think that 99% of the time the URLs will be just the same as with a content tree. (Of course the system should also allow some dynamic URLs, for searches, customization, etc.)
Quote: “that’s my point. A content tree with a parent/child structure is insufficient to model complex data.”
If you have a parent/child structure, that doesn’t mean that you cannot have different types of relations as well. I haven’t yet seen a case where a content tree prevents a complex data structure. If nothing else, you can use your content tree just to make a meaningful URL structure, without having to build it separately. This also gives content editors one meaningful way to browse the data structure (there should be others also, if needed).
What I am trying to say is that I see the content tree as very natural for data that is shown on the web, simply because you still have to build a URL scheme, which is usually just the same.
Vidar, I disagree that “your content should determine how you structure your content…” as an absolute. When it comes to a CMS, there are more factors at play. We are talking about content for presentation and management purposes, not simply database storage purposes.
What I am suggesting is that the structure of your content should not just be reflective of the content itself, but its context as well. We are talking about web-based CMSs. The context is the web and the content is accessed by URL.
URLs are a content tree. Assuming your site is designed for people, the front-end of your site is already presented and accessed as a content tree. (Unless you want to have one of those sites where everything is accessed off the root by its database ID).
There is nothing saying that you can’t draw any number of page-to-page relations in this tree. Likewise, nothing about the content tree suggests that the parent/child relationship is the only manner in which you can model the data. But it is suggesting that data has a primary place (URL) in your site. Incidentally, this is the model that people are based upon… we have family and a family tree, but we also have important connections outside of it: friends, coworkers, teammates, etc.
Here’s a simple example:
This is a site built around a content tree. Note the parent-child relationships used in this site, as well as the other manners in which relations are drawn (architects, for example). Watch how the URL structure mirrors the content tree.
Last point is that you can certainly use a content tree and treat it as a bucket system where all your relations are drawn to other buckets (a 1-level content tree). But people rarely use a content tree in this manner because it makes little sense in the context of the website. For the same reason, a bucket system makes little sense in the context of a website.
I agree that you need URLs, and that they need to make sense. But why limit them to a content tree? URLs shouldn’t be locked to anything, the content tree or otherwise.
I also agree that supporting a parent/child structure doesn’t preclude supporting other relations. In fact, the parent/child of a content tree is just a relation. The problem is that many content management systems don’t support any real relations apart from the content tree. The content tree itself isn’t the problem. It’s reliance on the content tree that limits what you can model with a system.
I can’t agree with you that you should structure your content based on the context. The problem is that the context changes and that there are multiple contexts. The content should stay the same, even if you want to use the content in another context or channel or whatever.
If you keep the natural structure of the content you can use it ANY way you want. The problem with structuring content for a website is that we’re often not just creating a website anymore. IMO, we’re moving towards a situation where we create content stores where the web is just one of the main channels.
Sometimes the content tree is a good way to represent URLs. Sometimes you want another structure for your URLs. My point is that there should be no dependency between the content tree and the URL. Use it if it suits you, but it should be optional.
Whether the bucket system makes little sense in the context of a website depends on your definition of a website. Is a web-based CRM system a website? Storing CRM data in a content tree might not be the best solution. Again, we think the best solution is to determine these things on a project-to-project basis (or even more granular).
Vidar, you are asking good questions but you are not understanding what I’m saying and that’s because I’m not doing a good job of explaining it. I am talking about the context of a web site. I’m not talking about other contexts. I build web sites, and I’m interested in how the CMS can best structure and deliver this content for a web site (and all related contexts).
I would also suggest the content tree would lend itself quite well to the other contexts you speak of… but that’s beside the point. I think we are talking about two different things. I’m equally interested in alternate contexts, so let’s explore that too. If buckets are your preference, the content tree can function as buckets, whether structurally or by node type [template], so nearly any context is covered by the content tree even if you don’t want to commit to a primary structure. We use ProcessWire, and there are many instances where we might retrieve data based on its type [template] rather than its placement in the content tree, so the buckets are there and I don’t discredit their value. But I also think they are just a component of a bigger picture. We have large expectations from a CMS and buckets are not a real solution at the level we work at.
Commitment to a primary structure (and unique URLs) is incredibly important for the usability and indexability of the web sites that most of us develop. Perhaps some of our data comes from a less structured source (whether by SQL database or web service), but ultimately we are using our CMSs to deliver structured data represented by its URL.
If you want to present a practical example of something on a web site (even if it has other contexts) that would benefit more by a lack of a content tree, I will do my best to point out why it would benefit more from the content tree.