I’ve talked a lot over the years about content modeling. Open and Closed Content Management is probably the most self-referenced post on this site. Recently I called content modeling one of the Four Disciplines of Content Management.
But, lingering behind all the questions about how to model something is a bigger question: do you model it at all? When is it obvious to structure some content, and when do you just throw it into the “WYSIWYG pile”?
We were meeting with a client the other day about applying some content management to their Web site. We came upon a page of “business partners.” It had a repeated HTML structure consisting of a logo for the partner, their name, their URL, and a few paragraphs about them. There were maybe a dozen or so partners listed.
It looked like this:

From a content modeling perspective, you have three ways to handle content like this:
No structure
This is a perfectly viable option. Provided you don’t mind a TABLE, the HTML represented here is nothing any decent WYSIWYG editor couldn’t handle.
Structured as a single content object
For most systems, this means an XML document, with a repeating “partner” element, and sub-elements therein for “logo,” “name,” “url,” and “description.”
Structured as multiple content objects
You could create a “partner” content type, with fields for “logo,” “name,” “url,” and “description.” This page would be a rendering of multiple partner content objects, sequentially down the page.
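To make the three options concrete, here is a minimal sketch in Python. It is not tied to any particular CMS: the sample partners, the element names beyond the four fields listed above, and the helper functions are all made up for illustration.

```python
from dataclasses import dataclass
from html import escape
from typing import List
import xml.etree.ElementTree as ET

# Option 1: no structure. The page is one opaque blob of WYSIWYG HTML.
# Nothing inside it can be sorted, reused, or re-rendered.
UNSTRUCTURED_PAGE = "<table><tr><td>logo</td><td>Acme ...</td></tr></table>"

# Option 2: a single content object with a repeating "partner" element.
# (Hypothetical sample data.)
PARTNERS_XML = """
<partners>
  <partner>
    <logo>/logos/acme.png</logo>
    <name>Acme</name>
    <url>http://example.com/acme</url>
    <description>A hypothetical partner.</description>
  </partner>
  <partner>
    <logo>/logos/globex.png</logo>
    <name>Globex</name>
    <url>http://example.com/globex</url>
    <description>Another hypothetical partner.</description>
  </partner>
</partners>
"""

# Option 3: each partner is its own content object of type "partner".
@dataclass
class Partner:
    logo: str
    name: str
    url: str
    description: str

def partners_from_xml(xml_text: str) -> List[Partner]:
    """Split the single XML document into separate partner objects."""
    root = ET.fromstring(xml_text)
    return [
        Partner(
            logo=el.findtext("logo", default="").strip(),
            name=el.findtext("name", default="").strip(),
            url=el.findtext("url", default="").strip(),
            description=el.findtext("description", default="").strip(),
        )
        for el in root.findall("partner")
    ]

def render_partner(p: Partner) -> str:
    """Render one partner; the template decides the markup, not the editor."""
    return (
        f'<div class="partner"><img src="{escape(p.logo)}" alt="{escape(p.name)}">'
        f'<h2><a href="{escape(p.url)}">{escape(p.name)}</a></h2>'
        f"<p>{escape(p.description)}</p></div>"
    )

def render_page(partners: List[Partner]) -> str:
    """The partners page is just every partner rendered in sequence."""
    return "\n".join(render_partner(p) for p in partners)
```

With options 2 and 3 the same data can be re-sorted or rendered somewhere else (a sidebar, a feed) without touching the content itself. With option 1, the markup is the content.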
Not surprisingly, there are advantages and disadvantages for each, and we threw them around with this client.
No Structure
Structured as a single content object
Structured as multiple content objects
So, there’s a run-down of the advantages and the disadvantages of the major approaches. But which one to choose?
As you’d imagine, there’s no clear-cut answer. Here are some of the factors to consider:
How sophisticated is your CMS editing tool? Can it even do repeating elements at all? If you choose the second option, and structure within a single content object, how closely can the editing form look like what’s on the page? (Put another way, how easily can you “trick” the end-user into structuring content by making it look like WYSIWYG?)
(Incidentally, Ektron does a good job of this. You can make input forms with repeating elements that come very close to what the end result will be. Joseph Scott’s Edit in Place would do well here too.)
In writing this post, I tossed it over the pond to Josh Clark for his input. In his response, he captured one of the more succinct differences between how we (developers) look at content, and how the end users do. This, too, needs to play a role in your decision (emphasis mine):
The big advantage to structuring content, of course, is that it lets you repackage it and present it in different forms and contexts. The downside is that it forces editors to approach their content like machines, thinking in terms of abstract fields that might be mixed and matched down the road. The benefits often outweigh this usability cost if you’re going to present the content elements in multiple contexts and/or offer various sorting options with a large number of elements. If not, then I typically go with unstructured.
That’s brilliant, and it’s so true. Understand that structuring content can suck the soul out of the authoring process for a lot of people. Like Josh said, often the advantages are clear enough to justify some soul-sucking, but always approach this with care.
I remember a client for whom we were building a “case studies” section of their Web site. I kept trying to get them to structure the case studies. I would say things like:
If you kept your case studies in an Excel spreadsheet, and each row was a case study, what would the column headings be?
Now, this is a good question and one that’s worked well for me in the past, but this client was just not getting it. Finally, Joe said, “Dude, I think they just want a page…” And he was right. The client wasn’t thinking in terms of structure; they were thinking in terms of a page with stuff on it. They figured they could just WYSIWYG it up, and in the end, they did it this way and they were fine.
Postscript: So, what are we going to do with our original example? At this point, I can’t say, but I’m leaning away from pure WYSIWYG because of the image processing. If we get this client on eZ publish, I imagine we’ll do it in separate records because eZ can’t repeat sub-elements within the same record. If we were to go with a CMS that allowed that, then my inclination would be to do it as a structured single record.
I think the most important questions you ask are, "If your answer to the above is “no,” how sure are you of that? How well does the end-user understand the risk they’re taking by not making the content sub-dividable?" In every project I'm involved with (whether it is for me or someone else), I prefer overkill. I'd rather have features available but not used than lack the needed features down the road.
What I find difficult is not structuring the content, but structuring it in a way that is reasonable for the client. This would go with one of your other questions, "What is the technical sophistication of the end-user?" Unfortunately, most end-users do not have the technical sophistication or know enough to be dangerous. I designed a site once for an investment consultant and we both agreed on the need for structure in the content. What neither of us ever really understood, though, was how best to structure the content. My solution worked for him, but I don't think he was ever fully pleased with it. He wanted it to be structured in the way he did business, but since he didn't do computers and I didn't do investment...we never really came up with a perfect solution.
Thanks for the useful post. Differences in how to structure content are definitely tough to discuss with non-technical site owners, and I think one way to discuss them is to present a bunch of different types of reports/screens that could be generated with varying levels of structure. Drawing out abstract structures is much less effective than describing the different outputs that are possible with different structures. I also do think that we can end up over-complicating/structuring systems for our users, and for large CMS implementations (100,000+ pieces of content) often the more straightforward approach is better. Another consideration: if the input system gets too cumbersome/confusing/abstract for a user, they may either not use it or start putting in lower quality content/metadata.
I think the problem of subcontent structure is more common than the example in this article. I've noticed that I have this problem with links on my pages: every time, I simply put <a> tags in the content, but if I move some article [or change its address in whatever way], that link is broken and I have to find it on all pages and repair it. No CMS I know treats in-content links in any special way. But I wonder if it would be possible to make those links automated: if I change one, it'll change in all places where I linked it in my content. [unbrokable links? hehe :-)]. What if we treat a link as an element of content too? [subcontent]
The same goes for inline images. Let's assume I have an image file on a server, incorporated in a few pages of content. Next I move my site to another hosting server. It is possible to break the images that way, e.g. if the image file wasn't stored in the database but in a filesystem directory. Or maybe I accidentally moved the image to another directory, or changed its name. Then it's broken in all the places where it appears in content, because that <img> tag was referring to the specific file on the server. So, even if I have that image as subcontent in a content tree, the CMS doesn't know exactly where in the content it's referred to; it can only know the page, but not the exact place. If the image location on the filesystem changes, all src=".." attributes need to be changed manually. There should be some way to tell the CMS that "here you should place that image" and when the image location changes, it should be able to change the generated <img> tags automagically, without the need to find and edit them in the content pages manually.
Any ideas?
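One way to sketch the idea raised above: store a placeholder that points at a stable content ID instead of a raw URL, and let the system fill in the current location at render time. This is only an illustration in Python. The `{{ref:...}}` placeholder syntax, the `LOCATIONS` registry, and the IDs are all hypothetical and not taken from any particular CMS.

```python
import re
from typing import Dict, List

# Hypothetical registry: stable content ID -> current location.
# When an article or image moves, only this mapping changes.
LOCATIONS: Dict[str, str] = {
    "article-42": "/articles/content-modeling",
    "logo-7": "/media/logos/acme.png",
}

# Hypothetical placeholder stored in the body instead of a hard-coded URL.
PLACEHOLDER = re.compile(r"\{\{ref:([\w-]+)\}\}")

def resolve(body: str, locations: Dict[str, str]) -> str:
    """Replace each placeholder with the current location of its target.

    Unknown IDs are left untouched so a broken reference stays visible.
    """
    def swap(match: "re.Match[str]") -> str:
        return locations.get(match.group(1), match.group(0))
    return PLACEHOLDER.sub(swap, body)

def references(body: str) -> List[str]:
    """List the content IDs a body refers to, in order of appearance."""
    return PLACEHOLDER.findall(body)

BODY = '<a href="{{ref:article-42}}">Read more</a> <img src="{{ref:logo-7}}">'
```

Because the body only stores IDs, moving the article means updating one entry in `LOCATIONS`; every page that calls `resolve` picks up the new address on the next render. `references` gives the system a way to know which pages use a given item, and where.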