What is MetaData?

By Deane Barker on December 25, 2006

Talking about metadata, like cooking with metafood: It’s actually pretty eerie that I stumbled on this post because I had been thinking about this exact thing: when is something “metadata” and when is it just “data”?

Technical conversations about information and data can sometimes include the word “metadata,” which commonly gets defined as: data about data. It’s a fancy word, and I’ve seen many cases where there’s a need to think in terms of meta-ness about data.

But, as a practice, I find that talking about just plain-old data is not only sufficient most of the time, but prevents conversations from degrading into murkiness around which data is about data, and which isn’t. (From experience, I wonder: is there actually any data that isn’t about data in some way?)

Here’s my theory:

Something is metadata when it’s platform-specific. So, the checkbox “Show on Home Page” would be metadata because it’s related specifically to displaying this piece of content on a Web site.

The “title” is just data (“core data”?) because whether this content is going on a Web site, an RSS feed, a PDF — whatever — the content still needs a title. The title transcends the specific platform (a Web site).
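To make the distinction concrete, here is a minimal sketch of how that split might look in code. It is only an illustration under my own assumptions: the `Content` class, the `platform_settings` field, and the `render_for` function are hypothetical names, not part of any real CMS.

```python
from dataclasses import dataclass, field


@dataclass
class Content:
    # "Core data": every platform needs these.
    title: str
    body: str
    # Platform-specific settings, keyed by a platform name (hypothetical).
    platform_settings: dict = field(default_factory=dict)


def render_for(content, platform):
    """Return the core fields plus only the settings for one platform."""
    out = {"title": content.title, "body": content.body}
    out.update(content.platform_settings.get(platform, {}))
    return out


post = Content(
    title="What is MetaData?",
    body="Body text goes here.",
    platform_settings={"web": {"show_on_home_page": True}},
)

web_view = render_for(post, "web")  # includes show_on_home_page
rss_view = render_for(post, "rss")  # title and body only
```

The title travels to every output, while “Show on Home Page” only appears when the target is the Web site.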

I welcome other thoughts on this.



  1. My theory is that it just depends on the context. If your task is to serve files, then the files are the data and the modification and access times are the metadata. If your task is to gather logs about what is happening, then the access times and the modification times become the data.

    That seems consistent with what Wikipedia has to say on the subject.

  2. So, if your file is a Word document, what about the information in the File > Properties dialog? Data or metadata?

    And if you distill a log file to a data structure of some kind, then I see it as an object with properties, one of which is the access time. That could be considered as core as anything in the file.

    I’m actually willing to postulate that “metadata” doesn’t so much exist — data is just data.

  3. Metadata usually is the “extra information” about the main content (“core data”). To take the file server example above, the main content, or the core data, is the file itself. Everything else is extra information, or metadata (access time, modification time, etc.). But when you take that information into a database, it becomes the main data, and any extra information, such as the size of the log file, becomes the metadata. So it depends on the context and on what you are interested in.

    Similarly, the content of the Word document is data, and the file properties, modification info, etc. are metadata.

  4. In the days of closed database systems, the systems contained data about something (say, data about XYZ). But, they also contained other data that was not about XYZ.

    This other data was stored because it was needed by the system (or DBA) for processing the data about XYZ. So, it was called metadata, and it was data about the data about XYZ. But, it was always something very specific within a closed system.

    As the previous commenters suggested, the more one can define a closed context, the more one can point to some of the data stored in that context and describe its role as being meta in relation to the other data.

    But, much of the time, with the web and other systems, the data gets used in multiple contexts, or even in an open way where there’s little control over or even knowledge about the contexts where the data might be used. And that’s when talking about metadata can be confusing, and when it gets a lot easier to just talk about the data.
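The file-server example from the comments can be sketched in a few lines of Python. This is a rough illustration, not anyone's actual system: `serve_file` and `log_record` are hypothetical names, and treating the file size as the "extra" information in the logging case is my own assumption. Both functions read the same timestamps through the standard `os.stat` call; only the task decides which side of the data/metadata line they land on.

```python
import os


def serve_file(path):
    """File-server view: the bytes are the data, the timestamps are metadata."""
    st = os.stat(path)
    with open(path, "rb") as f:
        data = f.read()
    metadata = {"modified": st.st_mtime, "accessed": st.st_atime}
    return data, metadata


def log_record(path):
    """Log-gathering view: the timestamps are the data, the size is extra."""
    st = os.stat(path)
    data = {"path": path, "modified": st.st_mtime, "accessed": st.st_atime}
    metadata = {"size_bytes": st.st_size}
    return data, metadata
```

Calling both functions on the same file returns the same modification time, once labeled as metadata and once as data, which is the commenters' point about context.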
