Tag | Apache
Ever since long-running Collection API calls were introduced, users have often noticed that the calls time out every now and then. Calls like ShardSplit time out without much information on the state of the request. Though most calls are practically idempotent, it would still be much better for users to know whether a request is currently in progress, has failed, or actually completed after the timeout duration.

This brought me to start working on asynchronous…

The brightest minds in open source search convened in Dublin on November 4-7, 2013, to discuss topics and trends driving the next generation of search. Lucene/Solr Revolution EU featured four action-packed days of opportunities for developers, technologists, and business leaders to meet, explore, and gain a deeper understanding of the technologies connected with open source search.

Three two-day workshops kicked off the event: Big Data and Solr, Solr Unleashed, and Solr Under the Hood. …

Drupal and Apache Solr search are a potent combination in the move towards “digital experiences” online. The pairing is behind a growing number of customized, personalized enterprise platforms for eCommerce, healthcare, physical retail, and more. Drupal powers a growing portion of the web, and has been adopted especially by governments around the world, the music industry, media organizations, and retailers.

If you have a new web project or an existing Drupal site, the combination of Drupal …

The session, “Use Case Diagnosis – When is Solr Really the Best Tool?,” by Michael Hausenblas, Chief Data Engineer at MapR, will present an overview of common big data use cases in the form of a set of questions that can be used to determine what kind of problem you really have. From the answers to these questions, you can quickly find out which technologies are likely to be most productive, useful, …

During the session, “The First Class Integration of Solr with Hadoop,” Apache Lucene/Solr committer Mark Miller talks about how Solr has been integrated into the Hadoop ecosystem to provide full text search at “Big Data” scale. This talk will give an overview of how Cloudera has tackled integrating Solr into the Hadoop ecosystem and highlights some of the design decisions and future plans. Learn how Solr is getting ‘cozy’ with Hadoop, which contributions are going …

Although people usually come to Lucene and related solutions in order to make data searchable, they often realize that it can do much more for them. Indeed, its ability to handle high loads of complex queries makes Lucene a perfect fit for analytics applications and, for some use cases, even a credible replacement for a primary data store. It is important to understand the design decisions behind Lucene in order to better understand the problems it can …

The juris portal provides access to legal information (about 6.5 million documents) and information about German companies (about 23 million documents). Access is highly personalized: search, links, and search suggestions are customized according to the documents contained in a user’s product collection. There are many search options, system stability and reliability have to be high, and there are DVD versions of subsets of the complete collection.

The Lucene/Solr Revolution session, “Moving a Complex Application …

Like many web applications in the past, the Solr Admin UI up until 4.0 was entirely server-based. It used separate code on the server to generate its dashboards, overviews, and statistics. All that code had to be maintained and still… you weren’t really able to use that kind of data for the things you needed it for. It was wrapped into HTML, most of the time difficult to extract, and they changed the structure …

In the Lucene/Solr Revolution session, “Text Classification with Lucene/Solr, Apache Hadoop and LibSVM,” Majirus Fansi, SOA and Search Engine Developer at Valtech, will show you how to build a text classifier using Apache Lucene/Solr with LibSVM libraries. They classify their corpus of job offers into a number of predefined categories. Each indexed document (a job offer) then belongs to zero, one or more categories. Known machine learning techniques for text classification include naïve Bayes …

As part of their work with large media monitoring companies, Flax has developed a technique for applying tens of thousands of stored Lucene queries to a document in under a second. 

During the Lucene/Solr Revolution session, “Turning Search Upside Down – Using Lucene for Very Fast Stored Queries,” Charlie Hull and Alan Woodward from Flax will talk about how they built intelligent filters to reduce the number of actual queries applied, how they extended Lucene …
