Apache Lucene Improves Sparse Indexing
Written by Kay Ewbank
Tuesday, 22 October 2024
Apache Lucene 10 has been released. The updated version adds a new IndexInput prefetch API, support for sparse indexing on doc values, and upgraded Snowball dictionaries resulting in improved tokenization.

Apache Lucene is a high-performance search engine library written entirely in Java. The developers describe it as being suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search on high-dimensionality vectors, spell correction or query suggestions. There's also a PyLucene subproject that provides Python bindings for Lucene Core.

The first improvement is the new IndexInput#prefetch API, which lets query evaluation logic tell the Directory about regions of data that are about to be read. This allows I/O to be performed concurrently.

Search concurrency has also been improved so that it is now decoupled from the index geometry, meaning an index can be searched using any number of threads, regardless of its number of segments.

Snowball dictionaries have been upgraded, resulting in improved tokenization, and k-means clustering has been added on vectors.

This release also adds initial support for intra-segment concurrency, meaning the index searcher now supports searching across leaf reader partitions concurrently. The developers say this helps make maximum use of available resources, especially with force-merged indices or big segments, but there is still a performance penalty for queries that require segment-level computation ahead of time, such as points/range queries. This is an implementation limitation that the developers expect to improve in future releases, but at the moment intra-segment slicing is not enabled by default.

Lucene 10.0 is available now.
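
To picture what searching "across leaf reader partitions" means, here is a minimal sketch in plain Java. It does not use any Lucene classes and is not how Lucene implements the feature: PartitionSketch, Partition, partition and countMatches are hypothetical names made up for illustration, and the "query" is a stand-in that simply matches even doc IDs. The point is only that once each segment is cut into doc ID ranges, the number of units of work no longer depends on the number of segments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Conceptual sketch only: every name here is hypothetical, none is a Lucene API.
final class PartitionSketch {

    /** A contiguous doc ID range [minDoc, maxDoc) inside one segment. */
    static final class Partition {
        final int segment;
        final int minDoc;
        final int maxDoc;

        Partition(int segment, int minDoc, int maxDoc) {
            this.segment = segment;
            this.minDoc = minDoc;
            this.maxDoc = maxDoc;
        }
    }

    /** Cuts every segment into ranges of at most maxDocsPerPartition docs. */
    static List<Partition> partition(int[] segmentSizes, int maxDocsPerPartition) {
        if (maxDocsPerPartition <= 0) {
            throw new IllegalArgumentException("maxDocsPerPartition must be positive");
        }
        List<Partition> out = new ArrayList<>();
        for (int seg = 0; seg < segmentSizes.length; seg++) {
            for (int start = 0; start < segmentSizes[seg]; start += maxDocsPerPartition) {
                int end = Math.min(start + maxDocsPerPartition, segmentSizes[seg]);
                out.add(new Partition(seg, start, end));
            }
        }
        return out;
    }

    /** Counts matching docs by running one task per partition on a thread pool. */
    static long countMatches(int[] segmentSizes, int maxDocsPerPartition, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (Partition p : partition(segmentSizes, maxDocsPerPartition)) {
                Callable<Long> task = () -> {
                    long hits = 0;
                    for (int doc = p.minDoc; doc < p.maxDoc; doc++) {
                        if (doc % 2 == 0) {
                            hits++; // stand-in "query": even doc IDs match
                        }
                    }
                    return hits;
                };
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get(); // merge the per-partition results
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // One large segment, as after a force merge, still yields four tasks.
        int[] segmentSizes = {1_000_000};
        System.out.println(countMatches(segmentSizes, 250_000, 4));
    }
}
```

In this toy version each partition is independent, so splitting is free. The penalty the developers mention arises when a query has to do work for the whole segment before any documents can be visited, since that work may then be repeated for each partition of the same segment.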