Exabot

Published: 23 July 2006
Author: Serban Ghita

Exabot was produced by Exalead, a company co-founded in 2000 by François Bourdoncle, a pioneer of the search engine software market who is also its president and chief executive officer. The company's goal was to revolutionize the search engine software market by providing users with a unified technology platform to access information in the enterprise.

Exalead offers several products based on its search engine technology, including exalead:desktop, exalead:workgroup, exalead:enterprise, exalead:datacenter, and exalead: search.

As stated on the company's website, the exalead: search solution is its main product:

"exalead one: search is not just a concept. This unique technology is used every day by hundreds of thousands of users - from students to homemakers, entrepreneurs to corporate titans - around the globe. It serves as the platform for all of Exalead's solutions."

On February 6, 2006, the company announced that the search engine had reached 4 billion indexed web pages.

Here is a technical overview of exalead one:search:

  • written in ExaScript, a Java-based configuration language
  • uses advanced algorithms and is optimized for 64-bit processor architectures
  • simultaneously indexes and processes data (structured and unstructured) as well as search queries in real-time
  • returns more relevant results. The technology also eliminates the need for human intervention or the use of dictionaries.
  • dictionaries (word, word-stemming, noun groups, and thesauri) are fully automatic, incremental, and real-time.
  • inherently multilingual. Language recognition is processed when “crawling” a document.
  • Boolean search (AND, OR, NOT operators)
  • Phrase search (terms enclosed in quotation marks)
  • Proximity search (NEAR operator)
  • Optional search terms (OPT operator)
  • Site collapsing
  • Similarity collapsing
  • Ranking customization
  • Proximity ranking
Supported formats:

File Type                  Description
HTML                       Hypertext Markup Language (version 4.01 and above, and XHTML).
WML                        Wireless Markup Language.
XML                        Extensible Markup Language (any DTD).
Text                       Raw text.
MIME                       Multipurpose Internet Mail Extensions.
Microsoft Office           Microsoft Office documents (Word, Excel, and PowerPoint), versions 95, 97, 2000, and XP.
Adobe PDF                  Adobe Portable Document Format, both compressed and uncompressed.
JPEG, PNG, GIF, MP3, OGG   Metadata of multimedia files.
ZIP, RAR                   Archive files.
WordPerfect                Corel WordPerfect documents, versions 6 and 7.
RTF                        Rich Text Format.
Macromedia Flash           Macromedia Flash text sections and hypertext links.

Encoding Type              Description
Various                    Unicode (UTF), Windows encodings, miscellaneous encodings (Arabic, Chinese, Korean, Japanese, and Russian). More detail is available on demand.
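
To make the query operators in the overview more concrete, here is a minimal Python sketch of how Boolean (AND, OR, NOT), proximity (NEAR), and optional-term (OPT) searches could be evaluated over a small positional inverted index. This is not Exalead's implementation: the function names, the sample documents, the default NEAR window, and the rule that OPT matches are simply ranked first are all assumptions made for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercased term to {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index

def docs_with(index, term):
    """Set of document ids that contain the term."""
    return set(index.get(term.lower(), {}))

def search_and(index, a, b):
    return docs_with(index, a) & docs_with(index, b)

def search_or(index, a, b):
    return docs_with(index, a) | docs_with(index, b)

def search_not(index, a, b):
    """Documents that contain a but not b."""
    return docs_with(index, a) - docs_with(index, b)

def search_near(index, a, b, window=3):
    """Documents where a and b occur within `window` words of each other.

    The window size is a hypothetical default, not a documented value.
    """
    hits = set()
    for doc_id in search_and(index, a, b):
        pos_a = index[a.lower()][doc_id]
        pos_b = index[b.lower()][doc_id]
        if any(abs(i - j) <= window for i in pos_a for j in pos_b):
            hits.add(doc_id)
    return hits

def search_opt(index, required, optional):
    """Documents containing `required`; those that also contain
    `optional` are listed first (assumed ranking rule)."""
    opt = docs_with(index, optional)
    return sorted(docs_with(index, required), key=lambda d: (d not in opt, d))

# Hypothetical sample documents, for illustration only.
docs = {
    1: "exalead search engine indexes web pages",
    2: "the engine of a car is not a search tool",
    3: "desktop search for the enterprise",
}
index = build_index(docs)

print(search_and(index, "search", "engine"))
print(search_not(index, "search", "engine"))
print(search_near(index, "search", "engine", window=1))
print(search_opt(index, "search", "enterprise"))
```

In this sketch, NEAR is just AND with an extra check on word positions, which is why the index stores positions rather than only document ids.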

You can read the full technical specifications on Exalead's site.