Search is Hard

By Deane Barker on November 5, 2008

We’re researching search options for a client this week, and I stumbled across this blog post which spoke volumes to me:

Search is Easy, But Good Search is Hard

So true. Search, in its most basic form, is easy. But there are a lot of subtleties that you find yourself longing for that are harder to pull off:

  • Spelling suggestions
  • META searching
  • Content biasing
  • Incremental indexing
  • Index merging
  • Key matches or “best bets”
  • HTTP spidering
  • Etc.
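
To give a feel for what a couple of these involve, here is a minimal sketch of spelling suggestions and "best bets" in Python. It is not how any particular engine does it: the vocabulary, the best-bets table, the URLs, and the function names are all made up for illustration, and the suggestion logic simply leans on the standard library's `difflib` rather than anything index-aware.

```python
import difflib

# Hypothetical data, for illustration only. A real system would draw the
# vocabulary from its index and the best bets from an editor-maintained list.
VOCABULARY = ["search", "index", "spider", "lucene", "metadata"]
BEST_BETS = {"pricing": "/products/pricing", "jobs": "/about/careers"}


def suggest(term, vocabulary=VOCABULARY):
    """Return the closest known term, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(term.lower(), vocabulary, n=1, cutoff=0.75)
    return matches[0] if matches else None


def best_bet(query, best_bets=BEST_BETS):
    """Return the editor-chosen URL for an exact query match, or None."""
    return best_bets.get(query.strip().lower())
```

Even this toy version raises the real questions: where the vocabulary comes from, who maintains the best bets, and how either one stays in step with the index.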

Also, there’s a vast difference between a “search engine” and a “search application.” An engine is just that: a tool to search an index. Lucene.Net provides this. So do Swish-E and Searcharoo.

But a “search application” is all the stuff that surrounds it: the actual searching interface, the indexing methods, the process of maintaining the index, the metadata scheme of your content domain, the spidering process, the filtering of search results, etc.

This difference was fairly stark when we downloaded Lucene.Net the other day and got it working. Using two command-line executables, we were able to index a folder of text files and search them from the command line. So, we had a search engine.
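
To show how small that core can be, here is a rough sketch of the same idea in Python: index a folder of text files, then search them. This is not Lucene.Net's API or algorithm, just a bare inverted index. The function names, the `*.txt` pattern, and the "every term must match" rule are assumptions made for the example.

```python
import re
from collections import defaultdict
from pathlib import Path


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(folder):
    """Map each term to the set of .txt file names that contain it."""
    index = defaultdict(set)
    for path in Path(folder).glob("*.txt"):
        for term in tokenize(path.read_text(encoding="utf-8", errors="ignore")):
            index[term].add(path.name)
    return index


def search(index, query):
    """Return the sorted names of files that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    # index.get avoids creating empty entries for unknown terms.
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)
```

Calling `build_index("docs")` and then `search(index, "good search")` would return the names of the files in a hypothetical `docs` folder that contain both words. There is no ranking, no stemming, and no incremental update.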

This was great, but…what then? Being able to search like this is a far cry from actually having a search system on a Web site that provides some value.

Sometimes, it seems the actual search engine is the smallest part of it. Getting that core to work in a larger, integrated whole to deliver some value to the end user is much more difficult.

Gadgetopia