Pages

Thursday, November 1, 2012

Apache SOLR

Apache Solr is an open source search server. It is based on the full text search engine called Apache Lucene. So basically Solr is an HTTP wrapper around an inverted index provided by Lucene. An inverted index could be seen as a list of words where each word-entry links to the documents it is contained in. That way getting all documents for the search query "dzone" is a simple 'get' operation.

One advantage of Solr in enterprise projects is that you don't need any Java code, although Java itself has to be installed. If you are unsure when to use Solr and when Lucene, these answers could help. If you need to build your Solr index from websites, you should take a look into the open source crawler called Apache Nutch before creating your own solution.


No comments:

Post a Comment