awnumar/memguard

proposal: protect data with noaccess until it is read

jpaskhay opened this issue ยท 9 comments

Is your feature request related to a problem? Please describe.
Currently the guard pages use noaccess protection, but the inner region (canary + data) is still readable. This leaves the inner region vulnerable to memory scans (have verified this locally).

Describe the solution you'd like
It would be nice to have an option to make the inner region always be set to noaccess and only make it read-only when it needs to be read. Making this a default behavior would likely require a breaking API change. This is also similar to what's described in https://download.libsodium.org/doc/memory_management#guarded-heap-allocations in the sodium_mprotect_noaccess related info (specifically, This function can be used to make confidential data inaccessible except when actually needed for a specific operation.).

We have Java and C# implementations that are very similar to your project and libsodium (we only came across your project when our Go implementation was started). The approach we've taken is:

  • accepting a lambda/function from the user that operates on the data
  • call mprotect with readonly before the lambda invocation and then mprotect with noaccess after the lambda completes
  • we use a lock and access counter to minimize the mprotect readonly and noaccess system calls to only be done when needed (readonly after 1st access, noaccess after last caller exits)

You can refer to how we've implemented this in https://github.com/godaddy/asherah/blob/4ead9ca1803056d2fe2ea006aaa771b3301ae28b/languages/java/secure-memory/src/main/java/com/godaddy/asherah/securememory/protectedmemoryimpl/ProtectedMemorySecret.java#L89-L113.

Describe alternatives you've considered

  1. We considered using Enclave instead of LockedBuffer but we expect that for our use case, it will likely result in higher unmanaged memory usage (lot of caching and concurrent reads of keys).
  2. We've considered seeing if we can replicate the page boundary calculations to get an unsafe handle on the inner region start, but this seems very messy and error prone.
  3. We're considering replicating our Secure Memory implementation in Go, similar to what we did for Java/C#, but we're hoping to use your implementation for Go instead. The only other thing that may be an issue for us is whether the guard pages cause too much unmanaged memory in our workloads, but we'll cross that bridge when we get there.

Additional context
Think I covered all the details. Let me know if you have any questions on this. I can add the script that was used to verify the memory is readable if you'd like.

Not the first time this feature has been suggested. There are a few problems however, mainly on Windows:

  1. Setting PAGE_NOACCESS results in the VirtualLock flag being dropped for that memory region.
  2. Any process can modify the protection values of pages belonging to another process. I don't know why this idiotic behavior is allowed but it is what it is.

On Linux systems this isn't an issue but,

This leaves the inner region vulnerable to memory scans (have verified this locally).

You would need root access to do this at which point the attacker can dump the entire process's memory or make it all readable.

It is still possible to do this yourself. You can reliably compute the pointer value for the start of the inner memory region by getting the pointer to the start of the data section, adding the length of the data, and then subtracting the number of pages that the data spans (this is the data length rounded up to a multiple of the page size).

I also pushed this branch that exports the byte slice referencing the inner pages. I'm not sure if I will merge this into master but it suffices to prove the concept.

There's then this function that can convert a pointer to a slice (there's also this write-up) and this package implements wrappers over system-calls that you can use.

I can add the script that was used to verify the memory is readable if you'd like.

That would be interesting.

Thanks for the quick response, and this is great feedback.

Ah didn't realize Windows behaved that way. We have avoided supporting Windows with our implementations to simplify things, but this is a good thing to keep in mind going forward if we were to reassess.

You would need root access to do this at which point the attacker can dump the entire process's memory or make it all readable.

Right or even the same user as the running process. Assuming at that point, they could re-enable core dumps, use strace, theoretically reverse engineer and recompile the app, etc. The idea is adding the extra layer/hurdle of protection in case another medium of a memory-reading vulnerability arises where they don't need the user/sudo access. It admittedly may be overkill but we're just seeing if we can maintain parity with our previous work.

I applied your change of adding the Buffer.Inner() to the local go mod package, updated our calling code accordingly, and it seemed to work! That is definitely sufficient for our ask.

Related but unrelated: any chance of making the guard pages an optional piece? Can create a separate proposal if you'd like. Would be an interesting tradeoff for a user if they prefer removing that layer of safety for the sake of using up less unmanaged memory.

Sure, will post the script. It's mostly a copy of a StackOverflow post with some minor modifications.

Here is the script we use to scan. It is just thrown together based off https://stackoverflow.com/a/23001686. It seems very similar to the script mentioned in #39 (comment).

#!/usr/bin/env python

# Based off https://stackoverflow.com/a/23001686 with minor updates

import re
import sys


def print_memory_of_pid(pid, only_writable=True):
    """
    Run as same user as process (or root), take an integer PID and return the contents of memory to STDOUT
    """
    memory_permissions = 'rw' if only_writable else 'r-'
    sys.stderr.write("PID = %d" % pid)

    with open("/proc/%d/maps" % pid, 'r') as maps_file:
        with open("/proc/%d/mem" % pid, 'r', 0) as mem_file:
            for line in maps_file.readlines():  # for each mapped region
                m = re.match(r'([0-9A-Fa-f]+)-([0-9A-Fa-f]+) ([-r][-w])', line)

                if m.group(3) == memory_permissions:
                    sys.stderr.write("\nOK : \n" + line+"\n")
                    start = int(m.group(1), 16)

                    if start > 0xFFFFFFFFFFFF:
                        continue

                    end = int(m.group(2), 16)
                    sys.stderr.write( "start = " + str(start) + "\n")
                    mem_file.seek(start)  # seek to region start
                    chunk = mem_file.read(end - start)  # read region contents

                    print(chunk)  # dump contents to standard output
                else:
                    sys.stderr.write("\nPASS : \n" + line+"\n")


if __name__ == '__main__': # Execute this code when run from the commandline.
    try:
        assert len(sys.argv) == 2, "Provide exactly 1 PID (process ID)"
        pid = int(sys.argv[1])

        # check both writable and readable
        print_memory_of_pid(pid)
        print_memory_of_pid(pid, False)
    except (AssertionError, ValueError) as e:
        print("Please provide 1 PID as a commandline argument.")
        print("You entered: %s" % ' '.join(sys.argv))
        raise e

To use it, run the following as the same user as the process or as root (run it once without the stderr redirect to verify it's running properly):

python memory_scan.py <PID> 2>/dev/null | strings | grep <your_secret_or_part_of_it>

Related but unrelated: any chance of making the guard pages an optional piece?

A separate proposal would be good.

Closing this since we found a solution that worked for your use-case.

Closing this since we found a solution that worked for your use-case.

Just to clarify, are you planning on merging the bufmem branch into master?

@jpaskhay Merged the auxiliary function.

@jpaskhay Merged the auxiliary function.

awesome, thanks!