public final class Lucene50StoredFieldsFormat extends StoredFieldsFormat
A StoredFieldsFormat that compresses blocks of documents in
order to improve the compression ratio compared to document-level
compression. It uses the LZ4
compression algorithm by default in 16KB blocks, which is fast to compress
and very fast to decompress. Although the default compression method
(BEST_SPEED) focuses more on speed than on
compression ratio, it should still provide useful compression ratios
for redundant inputs (such as log files, HTML or plain text). For higher
compression, you can choose BEST_COMPRESSION, which uses
the DEFLATE algorithm with 60KB blocks
for a better ratio at the expense of slower performance.
These two options can be configured like this:
// the default: for high performance
indexWriterConfig.setCodec(new Lucene54Codec(Mode.BEST_SPEED));
// instead for higher compression (but slower):
// indexWriterConfig.setCodec(new Lucene54Codec(Mode.BEST_COMPRESSION));
Stored fields are represented by two files:
A fields data file (extension .fdt). This file stores a compact representation of documents in compressed blocks of 16KB or more. When writing a segment, documents are appended to an in-memory byte buffer. When its size reaches 16KB or more, some metadata about the documents is flushed to disk, immediately followed by a compressed representation of the buffer using the LZ4 compression format.
Here is a more detailed description of the field data file format:
VInt (let's call it bitsRequired)
VLong, whose 3 last bits are Type and other bits are FieldNum
String | BinaryValue | Int | Float | Long | Double, depending on Type
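The VLong layout above (Type in the 3 lowest bits, FieldNum in the remaining bits) can be illustrated with plain bit operations. This is a minimal sketch, not Lucene's own code: the class name FieldInfoBits and its methods are hypothetical, and the type codes passed in are placeholders rather than the codes the format actually assigns.

```java
// Hypothetical helper illustrating the "FieldNum + 3-bit Type" packing.
final class FieldInfoBits {
    // Number of low-order bits reserved for the type code.
    static final int TYPE_BITS = 3;
    // Mask selecting the 3 lowest bits (binary 111).
    static final long TYPE_MASK = (1L << TYPE_BITS) - 1;

    private FieldInfoBits() {}

    // Combine a field number and a type code into a single long.
    static long pack(int fieldNum, int type) {
        if (fieldNum < 0) {
            throw new IllegalArgumentException("fieldNum must be >= 0");
        }
        if (type < 0 || type > TYPE_MASK) {
            throw new IllegalArgumentException("type must fit in 3 bits");
        }
        return ((long) fieldNum << TYPE_BITS) | type;
    }

    // Extract the type code from the 3 lowest bits.
    static int type(long packed) {
        return (int) (packed & TYPE_MASK);
    }

    // Extract the field number from the remaining high bits.
    static int fieldNum(long packed) {
        return (int) (packed >>> TYPE_BITS);
    }
}
```

For example, FieldInfoBits.pack(7, 2) yields 58 (7 shifted left by 3 is 56, plus the type code 2), and the two accessors recover 7 and 2 from that value. A reader would use the decoded type to decide which value encoding follows.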
This allows StoredFieldVisitors that are only interested in the first fields of a document to avoid decompressing 10MB of data when the document is 10MB: they only need to decompress 16KB.
A fields index file (extension .fdx).
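The buffering behaviour described for the fields data file (append documents to an in-memory buffer, flush once it holds 16KB or more) can be sketched as follows. This is an illustration under stated assumptions, not the Lucene implementation: the class ChunkedBuffer and its members are hypothetical, chunks are kept in memory instead of being written to an .fdt file, and java.util.zip.Deflater stands in for LZ4 because the JDK does not ship an LZ4 codec.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;

// Hypothetical sketch of chunked buffering with a 16KB flush threshold.
final class ChunkedBuffer {
    static final int CHUNK_SIZE = 16 * 1024;

    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private int pendingDocs = 0;

    // Stand-ins for what would be written to disk.
    final List<Integer> docsPerChunk = new ArrayList<>();   // per-chunk metadata
    final List<byte[]> flushedChunks = new ArrayList<>();   // compressed buffers

    // Append one serialized document; flush once the buffer reaches 16KB.
    // A document is always appended whole, so it never spans two chunks.
    void addDocument(byte[] serializedDoc) {
        pending.write(serializedDoc, 0, serializedDoc.length);
        pendingDocs++;
        if (pending.size() >= CHUNK_SIZE) {
            flush();
        }
    }

    // Record the metadata first, then the compressed buffer, then reset.
    void flush() {
        if (pendingDocs == 0) {
            return;
        }
        docsPerChunk.add(pendingDocs);
        flushedChunks.add(compress(pending.toByteArray()));
        pending.reset();
        pendingDocs = 0;
    }

    // Assumption: DEFLATE at its fastest level replaces LZ4 in this sketch.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }
}
```

With this sketch, three 6000-byte documents end up in one chunk (the third append pushes the buffer past 16KB), while a single 20000-byte document forms a chunk of its own.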
This StoredFieldsFormat does not support individual documents
larger than (2^31 - 2^14) bytes.
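Reading the limit as 2^31 - 2^14 bytes, the arithmetic works out to 2,147,467,264 bytes, which is 16KB short of 2GB. The following sketch only spells out that calculation; the class DocumentSizeLimit and its method are hypothetical names and not part of the Lucene API.

```java
// Hypothetical helper that spells out the documented size limit.
final class DocumentSizeLimit {
    // 2^31 - 2^14 = 2147483648 - 16384 = 2147467264 bytes.
    static final long MAX_DOCUMENT_BYTES = (1L << 31) - (1L << 14);

    private DocumentSizeLimit() {}

    // True if a document of the given size does not exceed the limit.
    static boolean fits(long documentBytes) {
        return documentBytes >= 0 && documentBytes <= MAX_DOCUMENT_BYTES;
    }
}
```

The constant uses long arithmetic because 2^31 does not fit in a Java int.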
|Modifier and Type||Class and Description|
Configuration option for stored fields.
|Modifier and Type||Field and Description|
Attribute key for compression mode.
|Constructor and Description|
Stored fields format with default options
Stored fields format with specified mode
|Modifier and Type||Method and Description|
public StoredFieldsReader fieldsReader(Directory directory, SegmentInfo si, FieldInfos fn, IOContext context) throws IOException
Returns a StoredFieldsReader to load stored fields.
public StoredFieldsWriter fieldsWriter(Directory directory, SegmentInfo si, IOContext context) throws IOException
Returns a StoredFieldsWriter to write stored fields.