Risks of TOCTOU in keysas-transit and keysas-out and backchannel

Question

Risks of TOCTOU in keysas-transit and keysas-out and backchannel

Closed this issue 3 months ago · 5 comments

Hi,
Thank you for this tool, which is much needed.

I was casually reading the source code and I think I stumbled on a TOCTOU (time of check/time of use) issue in the keysas-transit and keysas-out components.

My understanding is that these components accept as input some metadata and a file descriptor that was opened in keysas-in. Among the metadata, there is the hash of the file associated with that file descriptor. keysas-transit and keysas-out are validating the file content against that hash by computing their own copy of the hash and then comparing it with the expected metadata hash.

To be truthful, I am not entirely sure what the point of that check is since the metadata and the file descriptor are provided by the same untrusted components (the previous components in the chain), but assuming this hash is done for security purposes, the file is later used or transfered.

Unfortunately, keysas-out has no proof that keysas-transit forgot about the file descriptor, and keysas-transit does not have any proof that keysas-in forgot about it as well. As such, the previous component could edit the content of the file after the digest check was performed by the next component and before the file content is used (for analysis or writing).

One possible solution would be for components to acquire a private copy of the file before computing the digest. This could be done by cloning the file, if the filesystem supports it (e.g. btrfs) or creating a copy in an anonymous file. This way, the digest is computed on a copy of the file that cannot be altered by the previous process, whether you trust it to have forgotten about the file descriptor or not.

This could also prevent backchannels with keysas-out being able to communicate directly with keysas-in via the shared file descriptor, since keysas-transit would have changed the open file description in between.

Cheers,
Florian

Answer 1 · 2023-10-12T10:15:40.000Z

Hi Florian,

Thanks for your interest and support ! This is a good question !
First, keysas-in is actually opening the fd in read-only mode (the above comment in the code is actually wrong and needs to be updated for sure).
Then, as soon as we call unlinkat on the fd, it goes out of scope when the function returns (lifetime is over).
But yes, to prove it we could add a control (using fcntl for example) to count the number of opened fd on the same file and make sure it is only 1. I'll also check what could be done using landlock to restrict more the daemons.

Cheers,

Answer 2 · 2023-10-12T12:14:05.000Z

Hi,

Thank you for your reply.

My understanding is that you cannot rely on keysas-in promise to open in read-only mode. This is the less trusted component of the chain, and assuming it does the right thing is blind trust. I suppose that checking in keysas-transit for the read-only flag, using fcntl(2) GET_FL could do the trick and be one more check to add to that component. This could efficiently prevent the backchannel issue, because keysas-out would not be able to write to the file pointed to by the open file description.
However, I do not think this could prevent the TOCTOU issue. Keysas-in could open two FDs for the same file, transmit the read-only FD and alter the file on disk between the check digest check and the use of the file content. Counting file descriptors could do the trick but this cannot be done using any syscalls to the best of my knowledge, and would require reading procfs with very high privileges, which is probably not desirable.

Landlock might be a better idea if you were to enforce that keysas-in cannot open files in write mode at all. Currently, everything in the sas-in directory can be opened with write permission. This is not required since only write perm on the directory and read permission on the file it contains are needed. However only relying on the LSM for security seems brittle and my opinion is that LSM should serve as a safety net and not as the main security barrier.

Answer 3 · 2023-10-12T13:30:39.000Z

Hi Florian,

Keysas-core can also serve as a simple "file filtering gateway" between two networks so I want to be able to remove unwanted directories (in the next release) that could have been dropped (using scp for example) in sas_in directory to avoid disk saturation. But maybe it's a bad thing to keep this functionality going, let me some time to think about it.
The best solution at this point is to drop my privileges at runtime with Landlock using :
LANDLOCK_ACCESS_FS_REMOVE_DIR
LANDLOCK_ACCESS_FS_READ_FILE
LANDLOCK_ACCESS_FS_REMOVE_FILE

I'm also going to look at seccomp filters for a more granular configuration.
However, compromising keysas-in this way seems to be a fairly advanced attack scenario :)

Answer 4 · 2023-10-12T16:05:05.000Z

Hey,

Regarding your use-case of file filtering gateway, my advice would be to have yet another directory handled by another agent. That agent would be responsible for moving the desirable files in sas_in. This would avoid overloading the role of the sas_in directory and increasing the permission required to handle it.

Regarding your solution of hardening with these Landlock flags, I believe this could be a valid improvement. However, I would advise adding the private copy mechanism I discussed in the first post. Since sending the file descriptor in the socket is intended as "moving" (Rust-style) the file descriptor ownership to the next component, the closest system thing to that would be to actually move the file content to a different inode. What do you think about that approach?

Finally, yes. I agree that the keysas-in compromission is far fetched, given the low-ish complexity of that component, at the moment. I suppose any process capable of running with the keysas-in system user could impersonate that component and send malicious files down the keysas-transit socket. But yes, these considerations are mainly in the realm of defense in depth, and trying to build the components with the Bell-LaPadula model in mind ;)

Answer 5 · 2023-10-16T13:22:40.000Z

Hi Florian,

Here is a first level of protection using LandLock

I will also take a look at writing a seccomp filter to restrict the "open" syscall to RO.