lz4/lz4-java

Is there any demo for compressing streaming data?

darouwan opened this issue · 10 comments

I have read this article https://github.com/lz4/lz4/blob/dev/doc/lz4_Frame_format.md and the format looks well suited to compressing continuously incoming data and appending it to a compressed local file.

Unfortunately, I cannot find a concrete sample showing how to use it. The class LZ4BlockOutputStream may be a good tool for this, but its documentation does not show the complete process.

Could anyone provide a good sample?

I found the demo in the test cases. Thanks.

And I have one more question:

"In some circumstances, it may be preferable to append multiple frames, for example in order to add new data to an existing compressed file without re-framing it."

But I failed to append data to the compressed file several times and then read it all back in one pass.
Here is my code:

public static void compressFast(byte[] data) {
        File output = new File("a.lz4");
        try {
            FileOutputStream outputStream = new FileOutputStream(output,true);
            LZ4BlockOutputStream stream = new LZ4BlockOutputStream(outputStream);
            stream.write(data);
            stream.flush();
            stream.finish();

        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void decompressFast() {
        try {
            LZ4BlockInputStream stream = new LZ4BlockInputStream(new FileInputStream(new File("a.lz4")));
            byte[] buf = new byte[100];
            int length = stream.read(buf);
            // Decode only the bytes actually read, not the whole 100-byte buffer.
            System.out.println(new String(buf, 0, Math.max(length, 0)));

            stream.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws IOException {
        LZ4Compress.compressFast("aaabbb".getBytes());
        LZ4Compress.compressFast("bbbccc".getBytes());
        LZ4Compress.decompressFast();
        //LZ4Compress.testWriteCloseWriteCloseRead();
    }

I expected to get both "aaabbb" and "bbbccc", but in fact I only get "aaabbb".
How can I achieve this? Thanks a lot.

The streaming interface currently implemented in lz4-java (LZ4BlockOutputStream and LZ4BlockInputStream) uses its own format and does not support the lz4 Frame format. So, it does not currently support decoding of concatenated compressed streams. However, PR #76 adds exactly the feature you want, and I plan to integrate it in the next release.

Also, the next release will support the lz4 Frame format through LZ4FrameOutputStream and LZ4FrameInputStream. You can see these classes in the master branch.

Sounds nice. Am I right that appending in the lz4 Frame format means concatenating separately compressed frames one after another, at the cost of some compression ratio compared with compressing all the data at once?

After all, the combined size of many small, separately compressed pieces should be larger than the size of the same data compressed in one go.

Yes, your understanding is correct.
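The trade-off confirmed above can be tried without lz4-java. The sketch below is only an analogy: it uses the JDK's gzip classes as a stand-in, because gzip also lets a reader decode several independently compressed members appended back to back. The class and method names (`ConcatenatedMembersDemo`, `gzip`, `gunzipAll`, `gzipEachAndConcatenate`) are hypothetical, and nothing here is lz4-java API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Stand-in demo: gzip members play the role of LZ4 frames (an assumption for illustration).
final class ConcatenatedMembersDemo {

    /** Compresses data into one self-contained gzip member. */
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(sink)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Compresses each chunk on its own and appends the results, like appending to a file. */
    static byte[] gzipEachAndConcatenate(byte[]... chunks) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            byte[] member = gzip(chunk);
            sink.write(member, 0, member.length);
        }
        return sink.toByteArray();
    }

    /** Decodes every member in the input by reading until read() returns -1. */
    static byte[] gunzipAll(byte[] compressed) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                result.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toByteArray();
    }

    public static void main(String[] args) {
        byte[] first = "aaabbb".getBytes(StandardCharsets.UTF_8);
        byte[] second = "bbbccc".getBytes(StandardCharsets.UTF_8);
        byte[] appended = gzipEachAndConcatenate(first, second);
        byte[] whole = gzip("aaabbbbbbccc".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(gunzipAll(appended), StandardCharsets.UTF_8));
        System.out.println("appended: " + appended.length + " bytes, single: " + whole.length + " bytes");
    }
}
```

For these inputs the appended form comes out larger than the single member, since every member carries its own header and trailer and cannot reuse matches from earlier members. The same reasoning applies to appended LZ4 frames, although the exact overhead differs.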

So, is lz4 Frame format support through LZ4FrameOutputStream and LZ4FrameInputStream available now?
To decompress concatenated data, I do this:

byte[] inOutbyte = new byte[MAX_BYTE];
int length = 0;
while ((length = lz4FrameInputStream.read(inOutbyte, 0, MAX_BYTE)) > 0) {
    bufferedOutputStream.write(inOutbyte, 0, length);
}

and it may throw this exception when reaching the end:

java.io.IOException: Stream ended prematurely

Is there any better solution?

@cristph Do you still need help with this? If you receive an IOException when decompressing a stream that was correctly compressed in the LZ4 frame format, can you tell me how to reproduce it?
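The thread does not show how the failing file was written, so the cause of this particular exception is unknown. One generic way a decoder can hit the end of its input too early is a writer that was never closed, so the final blocks and end marker never reach the file. The sketch below reproduces that failure mode with the JDK's gzip classes as a stand-in (an assumption made so the example runs without extra dependencies; it is not lz4-java code, and the error type and message differ from LZ4's). `TruncatedStreamDemo` and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Stand-in demo with gzip: an unclosed writer leaves a truncated compressed stream.
final class TruncatedStreamDemo {

    /** Compresses data; when closeWriter is false, the stream is deliberately left unfinished. */
    static byte[] compress(byte[] data, boolean closeWriter) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            OutputStream out = new GZIPOutputStream(sink);
            out.write(data);
            if (closeWriter) {
                out.close(); // writes any buffered compressed data plus the trailer
            }
            // Not closing leaks the compressor; done here only to show the failure.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Drains the stream until read() reports end of stream by returning -1. */
    static byte[] decompress(byte[] compressed) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                result.write(buf, 0, n);
            }
            return result.toByteArray();
        }
    }

    /** Returns true if decoding fails because the input ends before the stream is complete. */
    static boolean failsWithEof(byte[] compressed) {
        try {
            decompress(compressed);
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] data = "aaabbb".getBytes(StandardCharsets.UTF_8);
        System.out.println("closed writer fails:   " + failsWithEof(compress(data, true)));
        System.out.println("unclosed writer fails: " + failsWithEof(compress(data, false)));
    }
}
```

If the writer side of a real reproduction uses try-with-resources, as `decompress` does here for the reader, this class of truncation is ruled out and any remaining failure points elsewhere.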

@odaira I am trying to compress and decompress large files from disk (a few dozen GB), without holding them in memory.

But I find that it takes a very long time to compress (about 30 seconds per GB, roughly 34 MB/s), far slower than the claimed 800 MB per second lz4 should provide.

I am using the following code. Any insights on what I may be doing wrong?

public static void compress(String decompressedFileName, String compressedFileName) throws IOException{
        InputStream directIn = null;
        BufferedInputStream bufferedIn = null;
        OutputStream directOut = null;
        BufferedOutputStream bufferedOut = null;
        LZ4BlockOutputStream outStream = null;

        try {
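            Path compressedPath = Paths.get(compressedFileName);
            // Create the file only when it is missing: Files.createFile throws
            // FileAlreadyExistsException if the file already exists.
            if(!Files.exists(compressedPath)){
                Files.createFile(compressedPath);
            }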

            directIn = Files.newInputStream(Paths.get(decompressedFileName));
            bufferedIn = new BufferedInputStream(directIn);
            directOut = Files.newOutputStream(compressedPath);
            bufferedOut = new BufferedOutputStream(directOut);
            outStream = new LZ4BlockOutputStream(bufferedOut);

            final byte[] buffer = new byte[localPartitionBlockSize * 16];
            int readBytes = 0;
            do {
                readBytes = bufferedIn.read(buffer);
                if(readBytes > 0) {
                    outStream.write(buffer, 0, readBytes);
                }
            } while (readBytes != -1);
        }finally {
            if(outStream != null) {
                outStream.close();
            }
            if(bufferedOut != null) {
                bufferedOut.close();
            }
            if(directOut != null) {
                directOut.close();
            }
            if(bufferedIn != null) {
                bufferedIn.close();
            }
            if(directIn != null) {
                directIn.close();
            }
        }
    }

    public static void decompress(String compressedFileName, String decompressedFileName) throws IOException{
        InputStream directIn = null;
        BufferedInputStream bufferedIn = null;
        OutputStream directOut = null;
        BufferedOutputStream bufferedOut = null;
        LZ4BlockInputStream inStream = null;

        try {
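            Path decompressedPath = Paths.get(decompressedFileName);
            // Create the file only when it is missing: Files.createFile throws
            // FileAlreadyExistsException if the file already exists.
            if(!Files.exists(decompressedPath)){
                Files.createFile(decompressedPath);
            }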
            directIn = Files.newInputStream(Paths.get(compressedFileName));
            bufferedIn = new BufferedInputStream(directIn);
            inStream = new LZ4BlockInputStream(bufferedIn);
            directOut = Files.newOutputStream(decompressedPath);
            bufferedOut = new BufferedOutputStream(directOut);

            final byte[] buffer = new byte[localPartitionBlockSize * 16];
            int readBytes = 0;
            do {
                readBytes = inStream.read(buffer);
                if(readBytes > 0) {
                    bufferedOut.write(buffer, 0, readBytes);
                }
            } while (readBytes != -1);
        }finally {
            if(bufferedOut != null) {
                bufferedOut.close();
            }
            if(directOut != null) {
                directOut.close();
            }
            if(inStream != null) {
                inStream.close();
            }
            if(bufferedIn != null) {
                bufferedIn.close();
            }
            if(directIn != null) {
                directIn.close();
            }
        }
    }

Also, when I use channels to perform the I/O instead, compressing files and then decompressing them produces very small (mostly empty) compressed files, and the restored files contain only small fractions of the originals.

Any insights you can help me with?

public static void compress(String decompressedFileName, String compressedFileName) throws IOException{
        LZ4Factory factory = LZ4Factory.fastestInstance();
        LZ4CompressorWithLength compressor = new LZ4CompressorWithLength(factory.fastCompressor());
        RandomAccessFile decompressedFile = new RandomAccessFile(decompressedFileName, "r");
        RandomAccessFile compressedFile = new RandomAccessFile(compressedFileName, "rw");
        FileChannel inChannel = null;
        FileChannel outChannel = null;

        try {
            inChannel = decompressedFile.getChannel();
            outChannel = compressedFile.getChannel();

            final ByteBuffer inBuffer = ByteBuffer.allocate(localPartitionBlockSize * 16);
            final ByteBuffer outBuffer = ByteBuffer.allocate(localPartitionBlockSize * 16);
            int readBytes = 0;
            do {
                readBytes = inChannel.read(inBuffer);
                inBuffer.flip();

                if(readBytes > 0) {
                    outBuffer.put(compressor.compress(inBuffer.array(), 0, inBuffer.position()+1));
                    outBuffer.flip();

                    outChannel.write(outBuffer);
                    outBuffer.clear();
                }
                inBuffer.clear();
            } while (readBytes != -1);
        }finally {
            if(inChannel != null) {
                inChannel.close();
            }
            if(outChannel != null) {
                outChannel.close();
            }
        }
    }

    public static void decompress(String compressedFileName, String decompressedFileName) throws IOException{
        LZ4Factory factory = LZ4Factory.fastestInstance();
        LZ4DecompressorWithLength decompressor = new LZ4DecompressorWithLength(factory.fastDecompressor());
        RandomAccessFile compressedFile = new RandomAccessFile(compressedFileName, "r");
        RandomAccessFile decompressedFile = new RandomAccessFile(decompressedFileName, "rw");
        FileChannel inChannel = null;
        FileChannel outChannel = null;

        try {
            inChannel = compressedFile.getChannel();
            outChannel = decompressedFile.getChannel();

            final ByteBuffer inBuffer = ByteBuffer.allocate(localPartitionBlockSize * 16);
            final ByteBuffer outBuffer = ByteBuffer.allocate(localPartitionBlockSize * 16);
            int readBytes = 0;
            do {
                readBytes = inChannel.read(inBuffer);
                inBuffer.flip();

                if(readBytes > 0) {
                    outBuffer.put(decompressor.decompress(inBuffer.array()));
                    outBuffer.flip();

                    outChannel.write(outBuffer);
                    outBuffer.clear();
                }
                inBuffer.clear();
            } while (readBytes != -1);
        }finally {
            if(inChannel != null) {
                inChannel.close();
            }
            if(outChannel != null) {
                outChannel.close();
            }
        }
    }
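A note on the channel-based compress method above: after `inBuffer.flip()`, the buffer's position is 0 and its limit is the number of bytes just read, so `inBuffer.position()+1` evaluates to 1 and only one byte per chunk is handed to the compressor. That would be consistent with the very small compressed files described, though it may not be the only problem. The sketch below shows the buffer state using only `java.nio`; `FlipDemo` and its method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Shows what position(), remaining() and array() report after flip().
final class FlipDemo {

    /** Simulates a channel read of data into a fresh buffer, then flips it for draining. */
    static ByteBuffer fillAndFlip(byte[] data, int capacity) {
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        buffer.put(data); // position advances to data.length, as after FileChannel.read
        buffer.flip();    // limit = data.length, position = 0
        return buffer;
    }

    /** Length computed the way the snippet above does it: position() + 1 after flip(). */
    static int lengthFromPositionPlusOne(ByteBuffer flipped) {
        return flipped.position() + 1;
    }

    /** Number of readable bytes after flip(): limit() - position(). */
    static int lengthFromRemaining(ByteBuffer flipped) {
        return flipped.remaining();
    }

    public static void main(String[] args) {
        ByteBuffer buffer = fillAndFlip("aaabbbbbbccc".getBytes(StandardCharsets.UTF_8), 64);
        System.out.println(lengthFromPositionPlusOne(buffer)); // 1
        System.out.println(lengthFromRemaining(buffer));       // 12
        System.out.println(buffer.array().length);             // 64, the whole backing array
    }
}
```

Two related points, offered as likely contributors rather than a confirmed diagnosis: `array()` returns the whole backing array regardless of how many bytes were read, so passing it without an explicit length includes stale bytes; and the decompress method reads fixed-size chunks that need not line up with the compressed blocks written earlier, so the reader probably also needs a way to find each block's boundary, for example a length prefix written before every compressed block.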

@yonatankarimish
Did your issue get resolved? I am facing the same issue.