Archive |  November 2011

Wildcard query terms aren’t analyzed, why is that?

Prior to the current 3x branch (which will be released as 3.6) and the trunk (4.0) Solr code, users have frequently been perplexed by wildcard searching being un-analyzed, often manifesting in case sensitivity. Say you have an analysis chain in your schema.xml file defined as follows and a field named lc_field of this type:
<fieldType name="lowercase" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
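The case sensitivity described above can be illustrated without Solr at all. The following is a minimal sketch, not Solr's or Lucene's actual code: the class `WildcardCaseDemo`, its methods, and the sample text are all hypothetical. It assumes index-time analysis that splits on whitespace and lowercases each token, and a trailing-wildcard lookup that uses the query prefix verbatim, which is what "un-analyzed" amounts to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WildcardCaseDemo {

    // Index-time analysis: split on whitespace, then lowercase each token
    // (a stand-in for a whitespace tokenizer followed by a lowercase filter).
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Query-time lookup for a trailing-wildcard term such as "wild*".
    // The prefix is compared as typed: no lowercasing, i.e. no analysis.
    static boolean prefixMatches(List<String> indexedTokens, String wildcardTerm) {
        String prefix = wildcardTerm.endsWith("*")
                ? wildcardTerm.substring(0, wildcardTerm.length() - 1)
                : wildcardTerm;
        for (String token : indexedTokens) {
            if (token.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical sample text; it is stored as "wildcard" and "queries".
        List<String> indexed = analyze("Wildcard Queries");

        System.out.println(prefixMatches(indexed, "Wild*")); // false
        System.out.println(prefixMatches(indexed, "wild*")); // true
    }
}
```

The capitalized prefix misses because the stored tokens were lowercased while the wildcard term was not. Lowercasing the term before the lookup would make both calls match.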
Now, you index
Official release announcement for Lucene/Solr 3.5:

November 27 2011, Apache Lucene™ 3.5.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 3.5.0. Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform. This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release
Solr Reference Guide version 3.4 is now available. The Reference Guide is designed to provide descriptions of all the important features and functions of the LucidWorks for Solr Certified Distribution. You can either view it online or download it as a PDF…. It will be of use at any point in the application lifecycle, whether you need detailed information about Solr or you are just getting started.
The Rich Web Experience 2011

I’ll be speaking at the upcoming Rich Web Experience conference… in Ft. Lauderdale, presenting an “Introduction to Solr”, “Solr Recipes”, and “Lucene for Solr Developers”. I’ll be tying all of these presentations together into a cohesive search/Solr track going from the introduction, to recipes for common tasks, through advanced customization of Solr.
The use of scripting languages to add new functionality to systems is something that I’ve always found very helpful. You don’t have to download the source code of the system; if it has “scriptable” parts, you can add simple functionality in minutes without even compiling. Java provides this capability, in particular with JavaScript. You can refer to… for more information on this. Unfortunately, Java 6’s only included library is Rhino that converts the JavaScript
For those of you in the Raleigh, Durham, Chapel Hill NC area, Lucid Imagination is sponsoring the next (and ongoing) Triangle Hadoop Users Group Meeting, November 15, 2011 @ Bronto Software in Durham, NC. The next meeting will feature Alan Gates of Hortonworks. Alan will be speaking on Apache Pig and HCatalog. To RSVP and find out more, visit….
My most recent article on Mahout is up at IBM developerWorks. It is titled “Apache Mahout: Scalable machine learning for everyone” and is designed to walk you through using Mahout with a real email data set using Hadoop and EC2. It also gets you up to speed on some of the new things in Mahout since I last wrote on the subject for developerWorks…. Note, I will also be giving a talk
For all of those interested in Apache Mahout and scalable machine learning, Lucid Imagination is hosting a Mahout Users Meeting at its new office in Redwood City on Nov. 29th. Doors open at 6:30 pm. The night will feature two speakers, Ted Dunning of MapR Technologies and Grant Ingersoll of LucidWorks, along with a social gathering with food and drinks. For more details and to RSVP, please see…
I confess, this is heavily influenced by the eXtreme Programming folks, but I see it recur again and again: we tech folks have historically been far too quick to say “sure, we can do that”. Even worse, I’ve done it myself on far too many occasions. Yes, we want to be “team players”. No, we don’t like conflict. Yes, often the changes we’re asked to make are technically “interesting” and we like challenges. None of…