Kicksecure/security-misc

`remount-secure`: use `procfs` mount option `subset` (`hide-hardware-info.service`)

adrelanos opened this issue · 5 comments

But for sensitive proc, I think I found a better way. We can modify the mount options. Procfs has the mount option subset. We can set this to the subsets that we want to be available. If we set the option subset=pid, then the pid directories stay visible to everyone (the default), but everything else becomes invisible, including /proc/dma, /proc/consoles, /proc/cmdline, /proc/iomem and like a billion other things; you can check them with ls /proc. I know for a fact that hiding memory access, device information and kernel arguments is good and does not break anything. I am not sure what else can be hidden here without breakage, but I will test and find out the optimal mount options for proc.

Originally posted by @monsieuremre in #172 (comment)
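To see whether a running system already has the option described above, the mount table can be inspected. The following is only a minimal sketch: the helper names has_mount_option and proc_mount_options are hypothetical, and it assumes the kernel lists subset=pid among the options shown in /proc/self/mounts.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the comma-separated option list in $1
# contains exactly the option given in $2.
has_mount_option() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the options of the procfs mounted at /proc,
# reading a mounts table in /proc/self/mounts format
# ($1, default: /proc/self/mounts). If several entries match, the last wins.
proc_mount_options() {
    awk '$2 == "/proc" && $3 == "proc" { opts = $4 }
         END { if (opts != "") print opts }' "${1:-/proc/self/mounts}"
}

opts=$(proc_mount_options 2>/dev/null || true)
if has_mount_option "$opts" "subset=pid"; then
    echo "/proc is mounted with subset=pid"
else
    echo "/proc is mounted without subset=pid"
fi
```

The sketch reads /proc/self/mounts rather than /proc/mounts because the former lives under a pid directory, which subset=pid is meant to keep visible.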

I have done extensive testing. Nothing seems to break. If the remounting is done too early, though, the system won't boot. I don't know which service is the reason. But this is miles ahead of manually setting permissions under /proc, which is non-persistent.

Also, when it comes to /proc/kallsyms, I'm pretty sure setting the sysctl kernel.kptr_restrict=2 already does the trick for the most part, meaning that addresses are hidden. I'm pretty positive that what remains unhidden here is meaningless for the most part.


On second thought, this feature might be more suitable for remount-secure than hide-hardware-info.

I ran some testing on my systems (server VM only, no desktop):

Test method

I applied the patch like this, partially related to #208:

# Run everything below as root (for example in a `sudo -i` shell); the redirections into /etc fail otherwise.
groupadd proc
mkdir -p /etc/systemd/system/proc-hidepid.service.d/
echo "[Service]
ExecStart=/bin/mount -o remount,nosuid,nodev,noexec,hidepid=2,gid=proc,subset=pid /proc
" > /etc/systemd/system/proc-hidepid.service.d/override.conf

# The following changes are necessary to prevent systemd-related failures at boot: https://github.com/Kicksecure/security-misc/issues/208
mkdir -p /etc/systemd/system/systemd-logind.service.d/
echo "[Service]
SupplementaryGroups=proc" > /etc/systemd/system/systemd-logind.service.d/override.conf

mkdir -p /etc/systemd/system/user@.service.d/
echo "[Service]
SupplementaryGroups=proc" > /etc/systemd/system/user@.service.d/override.conf

For a control test against all listed issues below, remove "subset=pid" from /etc/systemd/system/proc-hidepid.service.d/override.conf
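As a rough illustration of the experiment/control pair, the two remount commands differ only in the subset=pid option. The helper strip_mount_option below is hypothetical and only manipulates the option string; it does not touch any file under /etc.

```shell
#!/bin/sh
# Hypothetical helper: print the comma-separated option list $1 with every
# exact occurrence of option $2 removed.
strip_mount_option() {
    printf '%s\n' "$1" | tr ',' '\n' | grep -vxF -- "$2" | paste -sd, -
}

# Option string taken from the override.conf shown above.
experiment_opts="remount,nosuid,nodev,noexec,hidepid=2,gid=proc,subset=pid"
control_opts=$(strip_mount_option "$experiment_opts" "subset=pid")

echo "experiment: /bin/mount -o $experiment_opts /proc"
echo "control:    /bin/mount -o $control_opts /proc"
```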

Kicksecure issues

hide-hardware-info.service fails when it can't find /proc/cpuinfo. Seems easy to work around.

Systemd issues

Systemd sandboxing seems badly broken. Some useful features are not working:

Running systemctl status reports a new "Tainted: cgroups-missing" value.
A syslog entry lists:
System is tainted: cgroups-missing
https://github.com/systemd/systemd/blob/main/README#L418-L425

systemd uses cgroups together with the bpf() call to apply per-unit firewall rules as part of its sandboxing:
https://github.com/systemd/systemd/blob/main/README#L111-L125

Systemd permits use of cgroups to add CPU/Memory usage restrictions:
https://github.com/systemd/systemd/blob/main/README#L104-L109

Impacted services & detection method:

root@user:/lib/systemd/system# grep -nri IpAddressDeny *
dbus-org.freedesktop.hostname1.service:21:IPAddressDeny=any
dbus-org.freedesktop.locale1.service:21:IPAddressDeny=any
dbus-org.freedesktop.login1.service:37:IPAddressDeny=any
dbus-org.freedesktop.timedate1.service:21:IPAddressDeny=any
jitterentropy.service:21:IPAddressDeny=any
systemd-hostnamed.service:21:IPAddressDeny=any
systemd-journald.service:27:IPAddressDeny=any
systemd-journald@.service:22:IPAddressDeny=any
systemd-localed.service:21:IPAddressDeny=any
systemd-logind.service:37:IPAddressDeny=any
systemd-timedated.service:21:IPAddressDeny=any
systemd-udevd.service:41:IPAddressDeny=any
udev.service:41:IPAddressDeny=any
root@user:/lib/systemd/system# grep -nri SocketBind *
root@user:/lib/systemd/system# grep -nri RestrictNet *
root@user:/lib/systemd/system# grep -nri IpIngress *
root@user:/lib/systemd/system# grep -nri IpEgress *
root@user:/lib/systemd/system# grep -nri CPU *
e2scrub_reap.service:18:CPUSchedulingPolicy=idle  #Not sure about this one. Seems to be cgroup?
e2scrub@.service:17:CPUSchedulingPolicy=idle
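The detection greps above can be folded into one pass. This is only a sketch: units_with_directives and the UNIT_DIR variable are hypothetical names, and the directive list is my own selection of systemd resource-control settings (the CPU and memory entries replace the broad "CPU" pattern so that CPUSchedulingPolicy= is not matched).

```shell
#!/bin/sh
# Hypothetical helper: list files under directory $1 that set a directive whose
# name starts with any of the remaining arguments (case-insensitive match).
units_with_directives() {
    dir=$1
    shift
    for directive in "$@"; do
        grep -rilE "^[[:space:]]*${directive}[[:alnum:]]*=" "$dir" 2>/dev/null
    done | sort -u
}

# UNIT_DIR is a hypothetical override; the default matches the directory used above.
units_with_directives "${UNIT_DIR:-/lib/systemd/system}" \
    IPAddressDeny SocketBind RestrictNetworkInterfaces \
    IPIngressFilterPath IPEgressFilterPath CPUQuota CPUWeight MemoryMax
```

Each matching unit file is printed once, even if it sets several of the directives.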

Docker Issues and possible sandbox software issues

A hardened Docker container using Google gVisor fails when it cannot find /proc/meminfo:
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: creating container: cannot create sandbox: cannot create sandbox process: open /proc/meminfo: no such file or directory: unknown.

A normal Docker container fails to start when it cannot find /proc/devices:
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: unable to find device '99/136': open /proc/devices: no such file or directory: unknown.
Docker cgroups (and other containers that use cgroups) are also useful to prevent DoS attacks on a server by restricting the maximum CPU and memory a container can use.
https://docs.docker.com/config/containers/resource_constraints/
My launched containers do not use any cgroups at the moment, so this appears to be built deeply into docker.
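A quick preflight check for the /proc files named in the reports above (/proc/cpuinfo, /proc/meminfo, /proc/devices) might look like the following sketch. The helper name missing_proc_entries is hypothetical, and the list of entries is limited to those mentioned in this thread; other software may depend on more.

```shell
#!/bin/sh
# Hypothetical helper: print each entry (relative to the procfs root $1) that
# does not exist; return non-zero if at least one entry is missing.
missing_proc_entries() {
    root=$1
    shift
    missing=0
    for entry in "$@"; do
        if [ ! -e "$root/$entry" ]; then
            echo "$entry"
            missing=1
        fi
    done
    return "$missing"
}

if missing_proc_entries /proc cpuinfo meminfo devices; then
    echo "all checked /proc entries are present"
else
    echo "some /proc entries are hidden or missing (listed above)"
fi
```

Running this before and after the remount gives a simple way to compare the control and subset=pid configurations.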

Extra notes

Typical namespaces are not broken, as lsns continues to report mnt, uts, and net namespaces in use by haveged, jitterentropy, and some systemd services. All seems normal here.

I did not test subset=pid with bwrap but it may be a good test target next.

This is also somewhat strange, but the system suddenly stops responding to ACPI power button events when booted with subset=pid. Typically a shutdown is initiated when subset=pid is not set, at least on my server VMs.

Nothing in dmesg or syslog indicates issues.