Aegisthus spends a lot of time deserializing AegisthusKey
danchia opened this issue · 3 comments
In the map stage we tend to be pretty CPU bound, so I took a quick profile. It shows SpillThread using most of the time, and within SpillThread it looks like we spend a lot of time copying bytes.
@danielbwatson I'm going to experiment with making AegisthusKey RawComparable, and will report back on how that goes.
Flat profile of 262.48 secs (10192 total ticks): SpillThread
Interpreted + native Method
11.6% 0 + 778 org.apache.hadoop.io.compress.snappy.SnappyCompressor.compressBytesDirect
1.9% 0 + 128 java.io.FileOutputStream.writeBytes
0.2% 14 + 0 org.apache.hadoop.util.DataChecksum.update
0.2% 13 + 0 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill
0.2% 11 + 0 org.apache.hadoop.io.compress.snappy.SnappyCompressor.compress
0.1% 7 + 0 com.netflix.aegisthus.io.writable.AegisthusKey.readFields
0.1% 7 + 0 org.apache.cassandra.utils.ByteBufferUtil.compareUnsigned
0.1% 7 + 0 org.apache.cassandra.db.marshal.AbstractCompositeType.compare
0.1% 4 + 0 org.apache.hadoop.io.compress.BlockCompressorStream.finish
0.1% 4 + 0 org.apache.hadoop.util.HeapSort.sort
0.1% 4 + 0 org.apache.hadoop.mapred.IFile$Writer.append
0.0% 3 + 0 org.apache.cassandra.utils.ByteBufferUtil.readBytes
0.0% 3 + 0 org.apache.hadoop.io.compress.BlockCompressorStream.rawWriteInt
0.0% 3 + 0 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.swap
0.0% 3 + 0 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare
0.0% 1 + 1 java.io.UnixFileSystem.getBooleanAttributes0
0.0% 2 + 0 org.apache.cassandra.db.marshal.Int32Type.compare
0.0% 2 + 0 org.apache.cassandra.db.marshal.ReversedType.compare
0.0% 2 + 0 org.apache.cassandra.utils.ByteBufferUtil.getShortLength
0.0% 2 + 0 org.apache.hadoop.io.compress.CompressorStream.write
0.0% 2 + 0 com.netflix.aegisthus.io.writable.AegisthusKey.compareTo
0.0% 2 + 0 org.apache.hadoop.io.compress.CompressorStream.
0.0% 2 + 0 org.apache.hadoop.io.WritableComparator.compare
0.0% 2 + 0 java.io.BufferedOutputStream.write
0.0% 2 + 0 java.io.ByteArrayInputStream.read
15.5% 136 + 908 Total interpreted (including elided)
Compiled + native Method
30.5% 2051 + 0 java.io.DataInputStream.readFully
21.8% 1341 + 128 com.netflix.aegisthus.io.writable.AegisthusKey.readFields
13.5% 894 + 15 org.apache.cassandra.db.marshal.AbstractCompositeType.compare
9.8% 658 + 0 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare
2.6% 173 + 0 org.apache.hadoop.util.QuickSort.sortInternal
2.2% 148 + 0 org.apache.hadoop.io.WritableComparator.compare
1.3% 90 + 0 org.apache.hadoop.io.compress.BlockCompressorStream.write
0.8% 52 + 1 org.apache.hadoop.mapred.IFileOutputStream.write
0.5% 34 + 0 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill
0.5% 31 + 0 org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write
0.2% 13 + 0 org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write
0.2% 0 + 11 org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength
0.1% 9 + 0 org.apache.hadoop.io.compress.snappy.SnappyCompressor.setInput
0.1% 8 + 0 org.apache.cassandra.db.marshal.AbstractCompositeType.compare
0.1% 4 + 0 org.apache.hadoop.util.HeapSort.downHeap
0.0% 3 + 0 java.nio.Bits.copyToArray
0.0% 3 + 0 java.nio.DirectByteBuffer.get
0.0% 1 + 0 org.apache.hadoop.util.QuickSort.fix
0.0% 0 + 1 java.util.Vector.addElement
0.0% 1 + 0 java.io.DataInputStream.readInt
0.0% 1 + 0 org.apache.hadoop.mapred.IFile$Writer.append
84.2% 5515 + 156 Total compiled
A quick prototype that avoids all the byte copying shows a modest 30% increase in speed in the map stage, and an overall 20% increase in the reduce stage.
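For anyone reading along: the profile above is consistent with Hadoop's default `WritableComparator.compare`, which deserializes both keys through `readFields` on every comparison during the sort. Below is a minimal sketch of the general alternative, comparing the serialized bytes in place. It is not the prototype's actual code. The key layout (a 4-byte big-endian length followed by the key bytes) and the names `RawKeyComparison` and `serialize` are assumptions made for illustration, not AegisthusKey's real format. Only the parameter shape of `compare` mirrors Hadoop's `RawComparator.compare(byte[], int, int, byte[], int, int)`.

```java
import java.nio.ByteBuffer;

/**
 * Sketch: order two serialized keys by looking at their bytes in place,
 * without building key objects or copying the payload.
 * HYPOTHETICAL layout: 4-byte big-endian length, then that many key bytes.
 */
final class RawKeyComparison {
    private RawKeyComparison() {}

    /** Same parameter shape as Hadoop's RawComparator.compare. */
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Read the length prefixes directly from the sort buffer.
        int len1 = readInt(b1, s1);
        int len2 = readInt(b2, s2);
        // Compare the payloads where they sit; nothing is deserialized.
        return compareUnsigned(b1, s1 + 4, len1, b2, s2 + 4, len2);
    }

    /** Decodes a big-endian int starting at off. */
    static int readInt(byte[] b, int off) {
        return ((b[off] & 0xff) << 24)
             | ((b[off + 1] & 0xff) << 16)
             | ((b[off + 2] & 0xff) << 8)
             |  (b[off + 3] & 0xff);
    }

    /** Lexicographic comparison treating each byte as unsigned. */
    static int compareUnsigned(byte[] b1, int s1, int l1,
                               byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // On a shared prefix, the shorter key sorts first.
        return l1 - l2;
    }

    /** Helper for trying the sketch: writes a key in the hypothetical layout. */
    static byte[] serialize(byte[] key) {
        return ByteBuffer.allocate(4 + key.length)
                         .putInt(key.length)
                         .put(key)
                         .array();
    }
}
```

In a real job this logic would live in a class implementing `org.apache.hadoop.io.RawComparator` for the key type, and it would have to reproduce exactly the same ordering as `AegisthusKey.compareTo`, including any composite or reversed column comparisons.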
@danchia That is awesome, those are impressive improvements. The code looks good to me. If you want to make a pull request, I will get it merged in.
@danchia Thanks for the performance improvement!