By Deane Barker on December 24, 2008

PDF Database – pdf and doc search engine: This is a search engine that only indexes and searches PDFs and Word files.

I find this interesting for a particular reason: what does format say about content? Would we find a different mix of content in PDF and Word files than we could find in HTML?

I did some random searches:

The results were interesting. I’m trying to put my finger on whether, and why, the content would have a different…flavor. I found a fair number of sales presentations. Could it be that PDF and Word content is more often directed at a specific audience, rather than meant for general public dissemination?

I’m interested in your thoughts on whether, and why, the character and quality of the content would be any different from HTML search results.



  1. PDF is the only 100-year document format, in that all the information needed to decode the file is contained within the file itself. (Content Management Architect, nice to meet you.)

    We have MS Word files that cannot be read by any of the current versions used in the typical enterprise environment. I dislike PDFs as much as the next person, but until something better comes along it’s the only format we can recommend for long-term archival.

