Author | Erik Hatcher
My work at LucidWorks primarily involves helping customers build their desired solutions.  Recently, more than one customer has inquired about doing “entity extraction”.  Entity extraction, as defined on Wikipedia, “seeks to locate and classify atomic elements in text into predefined categories such as the names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc.”…  When drilling down into the specifics of the requirements from our customers, it turns out that many of
Ahhh… ApacheCon!  It’s that time again, and in one of America’s finest cities (Portland, Oregon).  Location and quality content, you can’t beat that.  If you’re active in the “big data”, search, and cloud worlds, you owe it to yourself and your employer to get to ApacheCon later this month. First let me highlight two related presentations immediately relevant to Solr users:
Solr Query Parsing
Interpreting what the user meant and what they ideally would …
Exciting news… a new book, “Solr in Action”, has arrived covering Solr 4!  The book is co-authored by two extremely knowledgeable folks, Trey Grainger and Timothy Potter.  Both of these guys are deep in the trenches of using Solr in big (data) ways at their day jobs, so they speak and write with experience. Trey presented “Building a Real-Time Solr-Powered Recommendation Engine”… (video of his session is available too!) at the 2012 Lucene Revolution, and

Open Source Search Conference

Come join us at the Open Source Search Conference on Oct. 1-2 in Chantilly, VA.  LucidWorks will be delivering both a tutorial and a conference talk.  Here are the details:
  • Tomás Fernández Löbbe will be teaching a half-day “Introduction to SolrCloud” tutorial (both in the morning and afternoon, whichever best fits your schedule)
  • I’ll be presenting a short conference session on “Solr 4”
Dec. 6, 2012: A code update was made for Solr 4.0 (see commented section in AccessControlQParserPlugin.java below)
Yonik recently wrote about “Advanced Filter Caching in Solr”, where he talked about expensive and custom filters; the implementation details were left as an exercise for the reader.  In this post, I’m going to provide a concrete example of custom post filtering for the case of filtering documents based on access control lists.

Recap of Solr’s …

Last minute mention, in case you happen to be in the Central VA area (Richmond and surrounding areas) tomorrow night… I’ll be discussing Solr and the latest greatest techniques folks are using to work with Solr from Ruby.  The abstract blurb follows: “Erik Hatcher will discuss and demonstrate the state of the art with using Solr from Ruby. He’ll cover RSolr (and the forthcoming deprecation and removal of solr-ruby, RIP: solr-ruby), Sunspot, Blacklight, and other…
I’ll be speaking at the upcoming Rich Web Experience conference… in Ft. Lauderdale, presenting an “Introduction to Solr”, “Solr Recipes”, and “Lucene for Solr Developers”.  I’ll be tying all of these presentations together into a cohesive search/Solr track going from the introduction, to recipes for common tasks, through advanced customization of Solr.
EuroCon is rapidly approaching.  I’m working on refreshing our Lucene training and updating it to Lucene 3.4.  Almost there!  As part of my effort I came to the (duh!) enlightenment that we really should be teaching this class by the book.  So, we’ve decided that all of the attendees of my “Lucene Application Development Workshop”  will be getting hard copies of “Lucene in Action”… (2nd edition) [sorry to weigh down your baggage for the
The Apache Lucene community has just released Lucene 3.4.0 and Solr 3.4.0. You can read the official announcements here and here. There are several juicy additions, but also a critical bug fix.  It is recommended that all 3.x-using applications upgrade to 3.4 as soon as possible.  Here’s the scoop on this fixed bug:
* Fixed a major bug (LUCENE-3418) whereby a Lucene index could
  easily become corrupted if the OS or computer …
You’re using Solr, or some other Lucene-based search solution, … or you should and will be!  You are (or will be) building your solutions on top of a top-notch search library, Apache Lucene. Solr makes using Lucene easier – you can index a variety of data sources easily, pretty much out of the box, and you can easily integrate features such as faceting, highlighting, and spellchecking – all without writing Java code. And if that’s…