Building a Search Engine

By Deane Barker on May 2, 2004

Building a Search Engine: An interesting backstory on the Mozdex project, an open source search engine that its developers hope to use to index the entire Web.

…we have a network of two db servers using the Lucene Index system (Jakarta Project) with two terabytes of disk space on each server as it generally takes 10kb per page to store the data and index segments. Our query farm is five P4, and soon to be five more AMD Opterons with 16 gigs of memory. Through some early testing we were able to realize that our biggest cost was rack and facility space and that the performance as well as memory capacity of the Opterons offered us the best value. When thinking of query servers and indexes the goal is to have as much of the index segments in memory as possible for quickest retrieval. The memory capacity and throughput on the Opterons is a great advantage in this arena.
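
To put the quoted figures in perspective, here is a rough back-of-envelope sketch of what they imply. It is not from the Mozdex team. It assumes decimal units (1 KB = 1,000 bytes), that the two db servers hold disjoint slices of the index rather than copies of each other, and that "16 gigs" is per Opteron machine; none of that is stated in the quote, and the function names are made up for illustration.

```python
# Back-of-envelope estimate based on the figures in the quote above.
# Assumptions (not stated in the source): decimal units, and the two
# db servers store different pages rather than replicas of each other.

BYTES_PER_PAGE = 10 * 1000            # "10kb per page" for data plus index segments
DISK_BYTES_PER_SERVER = 2 * 10**12    # "two terabytes of disk space on each server"
DB_SERVERS = 2                        # "a network of two db servers"
RAM_BYTES_PER_OPTERON = 16 * 10**9    # "16 gigs of memory", assumed to be per machine


def pages_that_fit(capacity_bytes, bytes_per_page=BYTES_PER_PAGE):
    """Number of whole pages whose data and index segments fit in capacity_bytes."""
    return capacity_bytes // bytes_per_page


def total_disk_pages(servers=DB_SERVERS):
    """Pages storable across all db servers, assuming no replication."""
    return servers * pages_that_fit(DISK_BYTES_PER_SERVER)


if __name__ == "__main__":
    print("pages per db server:", pages_that_fit(DISK_BYTES_PER_SERVER))
    print("pages across both servers:", total_disk_pages())
    print("pages' worth of data in one Opteron's RAM:",
          pages_that_fit(RAM_BYTES_PER_OPTERON))
```

Under those assumptions, each db server has room for about 200 million pages, while a single query server's memory covers only about 1.6 million pages' worth of stored data. The 10 KB figure mixes stored page data with index segments, so the index share that actually needs to sit in memory is probably smaller, but the gap still suggests why memory capacity weighs so heavily in the hardware choice.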