Lzop Codec in Apache Spark

Question

srinicodeit opened this issue 3 years ago · 1 comment

Is there any documentation on using the LzopCodec codec in Apache Spark?

Answer 1 · 2023-03-03T20:53:34.000Z

You will need to create a subclass of the codec whose class name is the actual name Hadoop used for the codec, because Hadoop, unfortunately, encodes the codec class name into its file formats, and readers look the codec up by that name. Here is how we did this in a Trino test:
https://github.com/trinodb/trino/blob/master/lib/trino-hive-formats/src/test/java/com/hadoop/compression/lzo/LzopCodec.java#L18
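
To illustrate why the class name matters, here is a minimal, self-contained sketch of the lookup mechanism. It is not the Trino code and does not use any real codec library: `ThirdPartyLzopCodec`, `loadByRecordedName`, and the `.lzo` return value are hypothetical stand-ins, and the nested classes exist only so the example fits in one file.

```java
class CodecNameDemo {
    /** Hypothetical stand-in for an LZOP codec implementation shipped under some other class name. */
    static class ThirdPartyLzopCodec {
        String getDefaultExtension() {
            return ".lzo";
        }
    }

    /** Empty subclass: it adds no behavior, only the class name that readers will look up. */
    static class LzopCodec extends ThirdPartyLzopCodec {}

    /** Mimics a reader that instantiates the codec from a class name recorded in a file. */
    static ThirdPartyLzopCodec loadByRecordedName(String recordedName) {
        try {
            Class<?> codecClass = Class.forName(recordedName);
            return (ThirdPartyLzopCodec) codecClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Without a class under the recorded name, the lookup fails here.
            throw new IllegalStateException("No usable codec class named " + recordedName, e);
        }
    }

    public static void main(String[] args) {
        // In this demo the "recorded" name is simply the subclass's own name.
        String recordedName = LzopCodec.class.getName();
        ThirdPartyLzopCodec codec = loadByRecordedName(recordedName);
        System.out.println(recordedName + " -> " + codec.getDefaultExtension());
    }
}
```

In a real project the subclass would instead be a top-level class named `LzopCodec` in the package `com.hadoop.compression.lzo` (the path of the linked Trino file), extending whichever LZOP codec implementation is on your classpath, with an empty body.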