haskell-crypto/cryptonite

Incorrect hashes for all hash algorithms beyond 4 GiB of input

nh2 opened this issue · 8 comments

nh2 commented

I found a problem together with @chpatrick:

> :set -XTypeApplications
> import qualified Crypto.Hash as Hash
> import qualified Data.ByteString as BS
> import qualified Data.ByteString.Char8 as BS8
> let hash = show . Hash.hash @BS.ByteString @Hash.SHA1
> let a = BS8.replicate 5000000000 'a' <> BS8.pack "hello"
> let b = BS8.replicate 5000000000 'a' <> BS8.pack "world"
> a == b
False
> hash a == hash b
True

Two different ByteStrings hash to the same SHA1.

Trying SHA256, same issue:

> let hash = show . Hash.hash @BS.ByteString @Hash.SHA256
> hash a == hash b
True

Conclusion

  • fromIntegral is bad.
    Until we deprecate fromIntegral, Haskell code will always be subtly wrong and never be secure.
    I proposed this 3 years ago (easier-to-read threaded view here).
    Concretely:
    • If we don't fix this, people will shy away from using Haskell for serious work. I found this bug in production, and you can imagine how happy I am about that. Rust and C both do this better.
    • If the authors of key crypto libraries fall for these traps (no blame on them), who can get it right? We should remove the traps.

Details

Demonstrating that the overflow occurs at the uint32_t boundary:

Good:

> show $ Hash.hash @BS.ByteString @Hash.SHA1 (BS8.replicate (2^32 - 2) 'a' <> BS8.pack "b")
"78c1411ef0ae4e0328660b7dddc08c06000eb9f0"

> show $ Hash.hash @BS.ByteString @Hash.SHA1 (BS8.replicate (2^32 - 2) 'a' <> BS8.pack "c")
"21f556aba357dd19d7bf01a2a7f05482162ed3fc"

Bad (both inputs are exactly 2^32 bytes long, so the length wraps to 0 in uint32_t and both digests are the SHA1 of the empty input):

> show $ Hash.hash @BS.ByteString @Hash.SHA1 (BS8.replicate (2^32 - 1) 'a' <> BS8.pack "b")
"da39a3ee5e6b4b0d3255bfef95601890afd80709"

> show $ Hash.hash @BS.ByteString @Hash.SHA1 (BS8.replicate (2^32 - 1) 'a' <> BS8.pack "c")
"da39a3ee5e6b4b0d3255bfef95601890afd80709"

Location of the problem:

For SHA1 (note uint32_t len):

void cryptonite_sha1_update(struct sha1_ctx *ctx, const uint8_t *data, uint32_t len)

For all hash algorithms (note Word32 for the length parameter):

hashInternalUpdate :: Ptr (Context a) -> Ptr Word8 -> Word32 -> IO ()

This is where the silent overflow occurs due to fromIntegral:

mapM_ (\b -> B.withByteArray b $ \d -> hashInternalUpdate ctx d (fromIntegral $ B.length b)) ls
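
The truncation itself is easy to reproduce in GHCi (5000000005 is the length of the inputs above):

> import Data.Word
> fromIntegral (5000000005 :: Int) :: Word32
705032709

So both 5 GB inputs effectively hash only their first 705032709 bytes, which are identical runs of 'a', explaining the equal digests.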

This analysis is beyond my expertise, but if I follow it correctly, blockchain security could be at risk if these changes are not made soon. However, how likely is a blockchain with a 4 GB block size?

The whole Num class is a cesspit of evil; fromIntegral is just one of the paper cuts, and there are hundreds of other issues in there (silent overflow, silent underflow, questionable functions for certain types, ...).

Cryptonite is sadly pre-foundation, and some of these issues would be addressed by moving to basement/foundation, where some of the Num evilness is addressed (or at least attempted; there are limitations inherent in the language). For example: through TryFrom, From, IntegralUpsize, and IntegralDownsize ([1] [2]: you cannot convert to a "smaller" type without going through a Maybe type), better signed vs. unsigned separation [3], and moving other operations into their own classes (Add, Sub, ...).
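
For comparison, base itself already ships a checked downcast, toIntegralSized from Data.Bits, which returns Nothing when the value does not fit. A minimal sketch of a checked length conversion (safeLen is just an illustrative helper name):

import Data.Bits (toIntegralSized)
import Data.Word (Word32)

-- Checked conversion: Nothing instead of a silent wrap-around.
safeLen :: Int -> Maybe Word32
safeLen = toIntegralSized

-- > safeLen 5000000005 == Nothing
-- > safeLen 4096       == Just 4096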

About removing the trap from the language: good luck, but I have stopped holding my breath about Haskell correctness and its ability to run safely in production (laziness traps, memory leaks, partial functions). Instead, my recommendation is to use Rust for serious production work (the numeric hierarchy is sensible, overflow/underflow are detected, TryFrom and From are available by default, conversions can be linted, e.g. preventing 'as' casts, and I don't need a magic ball to predict performance and/or runtime profile).

Related to this function itself:

  • You really shouldn't be using the one-shot interface for such large contiguous data; it is practical, but the usual way is to use the streaming interface, to make sure you don't pin more than 4 GB of contiguous RAM. The streaming interface also avoids a single 3-minute C call, depending on the case (runtime lock, interruptibility, etc.).
  • Fixing it:
    1. Make the function return something optional (Maybe Context or Either ContextError Context): this breaks the interface for the 99% of people that use this function with small data, where this can never happen; it also has a speed impact, and suddenly the correct API feels "complicated".
    2. Make the function partial: people will complain that their production code is now exploding.
    3. Throw an exception: same as 2).
    4. Add some documentation.
    5. Automatic chunking? That costs speed; should we expose the chunking parameter?
    6. Move the hash interface to uint64_t; this would need some creative thinking for the old hash algorithms, which are limited to 32-bit lengths. (At the moment we paper over this limitation in a non-correct way.)

Probably option 5 is the only sensible way, considering that we don't break APIs (unless required). A sketch of it follows.
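
A minimal sketch of what option 5 could look like; updateChunk and updateChunked are hypothetical names standing in for the real 32-bit-limited update primitive and its chunking wrapper:

import qualified Data.ByteString as BS

-- Conservative chunk size (1 GiB); anything below 2^32 keeps the
-- uint32_t length valid, and this value also fits a 32-bit Int.
chunkLimit :: Int
chunkLimit = 1024 * 1024 * 1024

-- Feed a buffer of arbitrary size through a 32-bit-limited update
-- primitive by splitting it into safe chunks.
updateChunked :: Monad m => (BS.ByteString -> m ()) -> BS.ByteString -> m ()
updateChunked updateChunk = go
  where
    go bs
      | BS.null bs = pure ()
      | otherwise  = let (chunk, rest) = BS.splitAt chunkLimit bs
                     in updateChunk chunk >> go rest

Exposing chunkLimit as a parameter would answer the speed question above, at the cost of extra API surface.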

[1] https://hackage.haskell.org/package/basement-0.0.11/docs/Basement-From.html
[2] https://hackage.haskell.org/package/basement-0.0.11/docs/Basement-IntegralConv.html
[3] https://hackage.haskell.org/package/basement-0.0.11/docs/Basement-Numerical-Number.html

Why not make all sizes of type size_t?

IMHO, option 6 above is the closest to being right.

nh2 commented

Thanks for the quick merging of our fix in #331!

Note though, as stated in there:

I think hashFinalizePrefix could have the same issue if you give it a huge buffer though, and this doesn't fix it.

Maybe @ocheron (author of that recently added functionality) could chime in whether a similar change is needed there?

@mouse07410 it would be in C, but the boundary between C and Haskell is badly defined.

  • CSize in Haskell is defined as Word64 (on 64-bit architectures) and Word32 (on 32-bit architectures): http://hackage.haskell.org/package/base-4.14.0.0/docs/Foreign-C-Types.html
  • The bytestring interface returns an Int for the size, which was (mistakenly) copied to the bytearray interface for compatibility.
  • bytestring uses fromIntegral to convert between CSize and Int. This is probably incorrect in edge cases (like data larger than 2^63 bytes, currently impossible, or a 30-bit Int).
  • bytestring allocation used to be done by a C call to malloc (which takes a size_t), with sizes converted from/to Int.
  • bytestring memory allocation is now done by the Haskell RTS, where sizes are Int# (the unboxed type behind Int).
  • Haskell's Int has very loose guarantees: it only has to be at least a 30-bit signed integer. It has no guaranteed relation to size_t, although in practice on GHC it does fit (by luck?). Also, size_t is unsigned, whereas Int# is signed.

It's quite likely that Haskell's Word would be closer to size_t, but that is also not guaranteed.
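
This is easy to probe on a concrete platform; a small standalone check (the commented outputs assume a 64-bit GHC):

import Data.Bits (finiteBitSize)
import Foreign.C.Types (CSize)

-- Print the bit widths of CSize and Int on the current platform.
main :: IO ()
main = do
  print (finiteBitSize (0 :: CSize)) -- 64: matches the platform's size_t
  print (finiteBitSize (0 :: Int))   -- 64 in practice, but only >= 30 is guaranteed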

nh2 commented
  • You really shouldn't be using the one-shot interface for such large contiguous data; it is practical, but the usual way is to use the streaming interface, to make sure you don't pin more than 4 GB of contiguous RAM.

@vincenthz I'm not sure that pinning would make a difference for my data, I need to have that ByteString in memory anyway.

avoids a single 3-minute C call, depending on the case (runtime lock, interruptibility, etc.).

Is there any runtime lock? Hashing is a safe foreign import.

Regarding interruptibility, that's right: it cannot be interrupted. What do you suggest for streaming a strict ByteString? Something like this, with cryptonite-conduit?

{-# LANGUAGE ScopedTypeVariables #-}

import           Conduit
import qualified Crypto.Hash as Hash
import           Crypto.Hash.Conduit (sinkHash)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BS8
import           Data.Conduit.List


-- | Chunk a `BS.ByteString` into list of smaller `BS.ByteString`s of no more
-- than /n/ elements.
--
-- Naturally /O(n)/ to create the full output list, but
-- /O(1)/ concerning the input ByteString length, because it uses `BS.splitAt`,
-- which provides a view into the original ByteString (but keeps it alive).
chunkedBy :: Int -> BS.ByteString -> [BS.ByteString]
chunkedBy n bs = if BS.length bs == 0
  then []
  else case BS.splitAt n bs of
    (as, zs) -> as : chunkedBy n zs

{-# INLINEABLE chunkedBy #-}


main :: IO ()
main = do
  let a = BS8.replicate 5000000000 'a' <> BS8.pack "hello" -- 5 GB ByteString

  digest :: Hash.Digest Hash.SHA1 <-
    runConduit $
         sourceList (chunkedBy (4096 * 1024) a)
      .| sinkHash

  print digest

I took chunkedBy from https://hackage.haskell.org/package/hw-prim-0.6.3.0/docs/src/HaskellWorks.Data.ByteString.html#chunkedBy, is there a more accessible package that provides that conveniently?
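
For reference, the same chunk-by-chunk hashing also works without conduit, via cryptonite's pure incremental API (hashInit, hashUpdates, hashFinalize from Crypto.Hash), reusing the chunkedBy helper above:

import qualified Crypto.Hash as Hash
import qualified Data.ByteString as BS

-- Hash a strict ByteString in 4 MiB chunks; every update call then
-- stays far below the 2^32-byte limit.
hashChunked :: BS.ByteString -> Hash.Digest Hash.SHA1
hashChunked =
  Hash.hashFinalize . Hash.hashUpdates Hash.hashInit . chunkedBy (4096 * 1024)

Each C call is then short, so the computation can at least be interrupted between chunks.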

hashFinalizePrefix of course has the same issue, but doesn't need a similar change IMO.
I hope no one uses that with a large buffer.

Changing the mutable hash interface, on the other hand, would be highly welcome. #331 covered only 50% of the hash API.

nh2 commented

I hope no one uses that with a large buffer.

Then I think it should be changed to at least error out, because hope is not something we should build software on, especially not security software.

#331 covered only 50% of the hash API.

Then this ticket should be re-opened (I cannot do that). Which parts are missing?