fix(supervisor): tolerate non-empty bounding set when CAP_SETPCAP is unavailable #2075
waynesun09 wants to merge 2 commits into
|
All contributors have signed the DCO ✍️ ✅ |
|
I have read the DCO document and I hereby sign the DCO. |
|
recheck |
|
Few points of note:
|
7320552 to 1dc253d
|
/ok to test 1dc253d |
ad6106c to 3f95d51
|
/ok to test 3f95d51 |
|
The new OCSF degraded-mode alert may not fire. The parent-side probe returns early if Could we make the readiness probe test the actual bounding-set clear behavior? |
adf21ac to b3a0e2a
|
/ok to test b3a0e2a |
|
@waynesun09 looks good to me. Just fix the format issue and we're good to merge. |
|
@alangou cool, I'm on it now. |
b3a0e2a to 9ab2fcb
|
@alangou it's updated, please check, thanks |
|
/ok to test 9ab2fcb |
…unavailable

When running inside rootless Podman on Ubuntu 24.04 with AppArmor's apparmor_restrict_unprivileged_userns=1, prctl(PR_CAPBSET_DROP) returns EPERM even though CAP_SETPCAP may be nominally granted. The capability bounding set remains non-empty, causing the supervisor to abort sandbox creation.

Add a new match arm in validate_capability_bounding_set_clear() that tolerates EPERM when the bounding set is non-empty: log a warning and continue, relying on seccomp to block dangerous syscalls. The existing privileged-environment behavior (fail-closed on non-empty success) is unchanged.

Emit a parent-side OCSF DetectionFinding alert so the degraded mode is visible to operators and SIEM. The readiness probe performs a non-destructive bounding::drop() on an already-absent capability to detect AppArmor restrictions even when CAP_SETPCAP is nominally present in the effective set.

Closes NVIDIA#2069

Signed-off-by: Wayne Sun <gsun@redhat.com>
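For readers who want to see what "the bounding set remains non-empty" looks like from inside a process, here is a minimal std-only sketch that reads the CapBnd mask the kernel exposes in /proc/<pid>/status. The helper names parse_cap_bnd and remaining_caps are hypothetical and are not taken from the PR, which uses the capctl crate rather than parsing procfs.

```rust
/// Parse the `CapBnd:` line of a /proc/<pid>/status dump into a bitmask.
/// Returns None if the line is missing or the value is not valid hex.
/// (Hypothetical helper for illustration; not part of the PR.)
fn parse_cap_bnd(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("CapBnd:"))
        .and_then(|hex| u64::from_str_radix(hex.trim(), 16).ok())
}

/// List the capability numbers whose bits are still set in the mask.
/// An empty result means the bounding set was fully cleared.
fn remaining_caps(mask: u64) -> Vec<u32> {
    (0u32..64)
        .filter(|&bit| (mask & (1u64 << bit)) != 0)
        .collect()
}
```

On Linux, passing the contents of /proc/self/status to parse_cap_bnd and the result to remaining_caps gives the capability numbers still in the bounding set; for example, a mask of 0x100 leaves only capability 8, which is CAP_SETPCAP.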
Add a rootless-caps job to branch-checks.yml that runs the supervisor capability bounding set and drop_privileges tests as an unprivileged user on ubuntu-24.04 where AppArmor restricts PR_CAPBSET_DROP.

Update architecture/sandbox.md to describe the degraded rootless mode where seccomp provides confinement when the bounding set cannot be cleared.

Signed-off-by: Wayne Sun <gsun@redhat.com>
9ab2fcb to 4d5f5ad
|
@alangou the new ci clippy failure on backticked AppArmor is fixed, sorry I missed that in the local test |
Summary
When running under rootless Podman (or any container runtime that drops CAP_SETPCAP), cap_drop_bound() returns EPERM for every capability still in the bounding set. Since v0.0.73 this is fatal: the supervisor crashes on sandbox startup, breaking all rootless Podman deployments.

This PR adds a third match arm to validate_capability_bounding_set_clear() that tolerates EPERM when the bounding set is non-empty, logging a warning instead of returning an error. The sandbox still relies on seccomp and Landlock for confinement in this case.

Related Issue
Fixes #2069
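The decision described in the Summary can be modeled roughly as below. This is a std-only sketch under stated assumptions, not the code in process.rs: ClearError, Outcome, and classify_clear_attempt are hypothetical names, the warning goes to stderr instead of tracing, and the arms are inferred from the PR text (EPERM with an empty set succeeds, EPERM with a non-empty set is tolerated, success with a non-empty set fails closed, anything else is an error).

```rust
/// Hypothetical stand-in for the error returned when clearing the
/// bounding set fails. The real code works with OS errors via capctl.
#[derive(Debug, Clone, PartialEq)]
enum ClearError {
    Eperm,
    Other(String),
}

/// Result of validating the bounding set after the clear attempt.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The bounding set is empty.
    Cleared,
    /// The set could not be cleared; other layers (seccomp) must confine.
    Degraded { remaining: usize },
}

/// Classify a clear attempt given its result and how many capabilities
/// are still present in the bounding set afterwards.
fn classify_clear_attempt(
    result: Result<(), ClearError>,
    remaining: usize,
) -> Result<Outcome, String> {
    match (result, remaining) {
        // Nothing left to drop: success, whether or not the call was permitted.
        (Ok(()), 0) | (Err(ClearError::Eperm), 0) => Ok(Outcome::Cleared),
        // Tolerated case: EPERM with capabilities still present.
        (Err(ClearError::Eperm), n) => {
            eprintln!("warning: bounding set not cleared ({n} capabilities remain)");
            Ok(Outcome::Degraded { remaining: n })
        }
        // Privileged path stays fail-closed: the call succeeded but the set is non-empty.
        (Ok(()), n) => Err(format!("bounding set still holds {n} capabilities")),
        // Any other failure is an error.
        (Err(ClearError::Other(msg)), _) => Err(format!("failed to clear bounding set: {msg}")),
    }
}
```

Keeping the tolerated case as its own variant, rather than folding it into success, lets a caller decide separately whether to raise an operator-visible alert for the degraded mode.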
Changes
crates/openshell-supervisor-process/src/process.rs:
- Add EPERM + non-empty bounding set tolerance branch between the existing EPERM + empty (success) and catch-all (error) arms
- Import warn from tracing
- Update capability_bounding_set_clear_tolerates_nonempty_eperm test to assert is_ok() instead of is_err()
- drop_privileges_succeeds_for_current_group test: remove conditional cfg(target_os) branching

.github/workflows/branch-checks.yml: Add rootless-caps CI job that runs supervisor capability tests as a non-root user without CAP_SETPCAP on ubuntu-24.04

Local Build & Test
Built and tested the fix locally before pushing:
- aarch64-unknown-linux-musl binary inside a rust:1.95-bookworm Podman container (macOS host cannot build Linux-only deps: capctl, landlock, seccompiler)
- localhost/openshell-supervisor:fix-2069 via Dockerfile.supervisor
- (supervisor_image = "localhost/openshell-supervisor:fix-2069" in gateway.toml)
- Phase: Ready with the expected warning:

Testing
cargo test -p openshell-supervisor-process --lib -- capability_bounding drop_privileges passes

Checklist