Managing Multiple XML Documents

By Deane Barker on May 26, 2003

One of the continuing quandaries I’ve had with XML is the management of multiple XML documents. If I have one big XML document, then it’s easy to work with – to parse with an API, to transform with XSLT, to query with XPath.

But what if I have many documents? For instance, what if I have all my blog entries (400+ at last count) as individual XML documents in a directory somewhere and I want to find all entries containing the word “cuisinart”? What do you do then? Iterate through all the documents, firing off XPath queries, somehow keep track of the ones that match, and then go back and get them when the loop is done? This seems ugly, but the alternative – having everything in one monolithic XML document – seems worse.
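For what it's worth, the "ugly" loop can be fairly short. Here is a rough sketch in Python, assuming the entries live as *.xml files in a single directory. The function name find_entries is made up for illustration, and because the standard library's ElementTree supports only a subset of XPath (no contains()), the text match is done in plain Python rather than with an XPath query.

```python
import glob
import os
import xml.etree.ElementTree as ET


def find_entries(directory, word):
    """Return paths of XML files whose text contains `word` (case-insensitive).

    Hypothetical helper: assumes one entry per *.xml file in `directory`.
    """
    needle = word.lower()
    matches = []
    for path in sorted(glob.glob(os.path.join(directory, "*.xml"))):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        # itertext() yields every text node under the root element
        text = " ".join(root.itertext()).lower()
        if needle in text:
            matches.append(path)
    return matches
```

Calling find_entries("entries", "cuisinart") would return the list of matching file paths, which can then be re-opened one at a time. Every file is parsed on every search, so this is only reasonable for a few hundred documents.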

I’ve heard that Oracle 8 will let you do an XPath query on an individual field in the WHERE clause. I’m trying to figure out if SQL Server 2000 will let you do the same thing. MySQL would be even better, but perhaps that’s hoping for too much.
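To make the idea concrete, here is a small stand-in using Python's built-in sqlite3 module. This is not how Oracle, SQL Server, or MySQL do it; it only illustrates what "an XPath-style lookup in the WHERE clause" could look like. The function xml_find_text, the documents table, and the sample entries are all invented for the example, and the path syntax is ElementTree's limited subset, relative to the document's root element.

```python
import sqlite3
import xml.etree.ElementTree as ET


def xml_find_text(xml_text, path):
    """Return the text of the first element matching `path`, or None.

    Hypothetical helper; `path` uses ElementTree's XPath subset and is
    evaluated relative to the root element of the document.
    """
    if xml_text is None:
        return None
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None  # treat malformed XML as "no value"
    return root.findtext(path)


def open_db():
    """Create an in-memory database with a few made-up sample documents."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL under the same name, taking 2 arguments
    conn.create_function("xml_find_text", 2, xml_find_text)
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, xml TEXT)")
    conn.executemany(
        "INSERT INTO documents (id, xml) VALUES (?, ?)",
        [
            (1, "<entry><title>Kitchen gadgets</title><body>My cuisinart</body></entry>"),
            (2, "<entry><title>XML woes</title><body>Too many files</body></entry>"),
        ],
    )
    return conn


def find_ids_by_title(conn, title):
    """Return ids of documents whose <title> child equals `title`."""
    rows = conn.execute(
        "SELECT id FROM documents WHERE xml_find_text(xml, 'title') = ?",
        (title,),
    ).fetchall()
    return [row[0] for row in rows]
```

The database still has to parse the XML in every row to answer the query, so this buys convenience rather than speed, but the documents stay separate and the search is a single SQL statement.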

There are some XML databases out there (Xindice, for instance; more here), but they’re very new and I don’t know of any that have Windows binaries or that will work without me getting all geeked out.

Comments (2)

Ted Thibodeau Jr says:

Check out Virtuoso.

It’s not free (well, it’s got a 30-day extendable evaluation license), but it’s powerful...

It’s a SQL-92 database (both virtual and relational DBMS) blended with an XML database, with built-in XPath and XQuery support, as well as XSLT, and bunches more, including Blogger, MetaWeblog, and Movable Type API support – from both client and server perspectives – in the latest update (v3.2)!

I can’t do it justice in a comment – and it’ll probably seem like marketing speak, since I work at OpenLink Software, which publishes Virtuoso... I don’t benefit directly from downloads or sales...

Check it out. Let me know what you think.

Tom Dyson says:

Our open source XML extensions for PostgreSQL do exactly what you want: provide XPath support within your SQL statements.

SELECT title FROM documents WHERE xpath_string(xml, '/document/title') = 'Konrad Lorenz';

SELECT xpath_string(xml, '/document/title') AS title FROM documents WHERE id = 22;

The extensions are based on the very fast and lightweight libxml2 library. See: