Monday, November 26, 2012

MyFitnessPal vs Livestrong My Plate

Tried both apps today. My Plate might have the better user interface, but overall I prefer MyFitnessPal for its better database and ease of use.

Highlighting search results with the query term in ElasticSearch


The request below searches the title field for the term "mobile" and asks ElasticSearch to return highlighted fragments from the text field:

curl -XGET 'http://localhost:9200/document/_search?pretty=true' -d '{
    "query" : { "term" : { "title": "mobile" }},
    "highlight" : {
        "fields" : {
            "text" : {}
        }
    }
}'

Wednesday, November 21, 2012

Deleting documents from your index Apache Solr


How can I delete all documents from my index?

Use the "match all docs" query in a delete-by-query command: <delete><query>*:*</query></delete>
You must also commit after running the delete, so to empty the index, run the following two commands:
curl http://localhost:8983/solr/update --data '<delete><query>*:*</query></delete>' -H 'Content-type:text/xml; charset=utf-8'  
curl http://localhost:8983/solr/update --data '<commit/>' -H 'Content-type:text/xml; charset=utf-8'
Another strategy would be to add two bookmarks in your browser:
http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>
http://localhost:8983/solr/update?stream.body=<commit/>
And use those as you're developing to clear out the index as necessary.

for loop in unix

for (( i = 1; i <= 10000; i++ )); do java -jar post.jar "$i.xml"; done
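The one-liner above assumes every file from 1.xml to 10000.xml exists and keeps going if a post fails. A slightly more defensive sketch, assuming bash (the C-style loop is a bash feature, not plain sh), could look like the following. The names post_cmd and post_range are made up for illustration.

```shell
#!/usr/bin/env bash
# Command used to post one file; this is the command from the one-liner above.
# post_cmd is a hypothetical name, kept as an array so arguments stay separate.
post_cmd=(java -jar post.jar)

# post_range FIRST LAST: post FIRST.xml .. LAST.xml in order.
# Missing files are skipped with a warning; the first failed post stops the loop.
post_range() {
    local first=$1 last=$2 i file
    for (( i = first; i <= last; i++ )); do
        file="$i.xml"
        if [[ ! -f $file ]]; then
            echo "skipping $file: not found" >&2
            continue
        fi
        "${post_cmd[@]}" "$file" || return 1
    done
}
```

After sourcing this, calling post_range 1 10000 does the same work as the original loop.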

Thursday, November 1, 2012

Apache SOLR

Apache Solr is an open-source search server built on the Apache Lucene full-text search engine. In essence, Solr is an HTTP wrapper around the inverted index that Lucene provides. An inverted index can be seen as a list of words in which each entry links to the documents that contain that word. That way, getting all documents for the search query "dzone" is a simple 'get' operation.
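To make the inverted index idea concrete, here is a toy sketch in bash (4 or later, for associative arrays). It is only an illustration of the data structure, not how Lucene stores its index; the function names and the three sample documents are made up.

```shell
#!/usr/bin/env bash
# Toy inverted index (requires bash 4+ for associative arrays).
# Keys are lowercased words; values are space-separated document IDs.
declare -A inverted_index

# add_document ID TEXT: record every word of TEXT under document ID.
add_document() {
    local doc_id=$1 word words
    # Lowercase and split on whitespace; real analyzers do far more.
    read -ra words <<< "${2,,}"
    for word in "${words[@]}"; do
        case " ${inverted_index[$word]} " in
            *" $doc_id "*) ;;  # this document is already listed for the word
            *) inverted_index[$word]="${inverted_index[$word]:+${inverted_index[$word]} }$doc_id" ;;
        esac
    done
}

# search WORD: print the IDs of the documents containing WORD.
search() {
    local word=${1,,}
    echo "${inverted_index[$word]}"
}

add_document 1 "DZone publishes Solr articles"
add_document 2 "Lucene powers Solr"
add_document 3 "dzone links"

search dzone   # prints: 1 3
search solr    # prints: 1 2
```

The lookup never scans the documents themselves: the word is the key, and the answer is whatever list is stored under it.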

One advantage of Solr in enterprise projects is that you don't need to write any Java code, although Java itself has to be installed. If you are unsure when to use Solr and when to use Lucene, these answers could help. If you need to build your Solr index from websites, take a look at the open-source crawler Apache Nutch before creating your own solution.