STRIX -- Swarm Coordination, Safety, and Explainable Autonomy

STRIX is a Rust + Python research platform for coordinating heterogeneous autonomous systems in degraded environments. The public repository focuses on state estimation, task allocation, resilient mesh coordination, safety constraints, simulation, and explainable decision traces.

The core design draws from quantitative finance, control theory, and distributed systems: particle filters for hidden-state estimation, auction-based assignment for constrained resource allocation, regime detection for degraded-mode adaptation, and replayable traces for auditability. The result is a platform-agnostic autonomy stack aimed at research, evaluation, and prototype integration.
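To make the hidden-state estimation idea concrete, here is a minimal sketch of one bootstrap particle-filter step for a 1-D state. It is illustrative only and is not taken from the STRIX crates: the random-walk motion model, the Gaussian measurement model, and the names `particle_filter_step`, `estimate`, `process_std`, and `measurement_std` are all assumptions made for this example.

```python
import math
import random


def particle_filter_step(particles, weights, measurement, rng,
                         process_std=0.5, measurement_std=1.0):
    """One predict / update / resample cycle of a bootstrap particle filter.

    Assumed models (hypothetical, for illustration only):
      motion:      x_next = x + N(0, process_std^2)
      measurement: z      = x + N(0, measurement_std^2)
    """
    # Predict: push every particle through the assumed random-walk model.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]

    # Update: scale each weight by the Gaussian likelihood of the measurement.
    new_weights = [
        w * math.exp(-0.5 * ((measurement - p) / measurement_std) ** 2)
        for p, w in zip(predicted, weights)
    ]
    total = sum(new_weights)
    n = len(predicted)
    if total == 0.0:
        # Every particle is implausible; fall back to uniform weights.
        new_weights = [1.0 / n] * n
    else:
        new_weights = [w / total for w in new_weights]

    # Resample (multinomial) so that likely particles are duplicated.
    resampled = rng.choices(predicted, weights=new_weights, k=n)
    return resampled, [1.0 / n] * n


def estimate(particles, weights):
    """Weighted mean of the particle cloud."""
    return sum(p * w for p, w in zip(particles, weights))


if __name__ == "__main__":
    rng = random.Random(0)
    cloud = [rng.uniform(-10.0, 10.0) for _ in range(200)]
    w = [1.0 / len(cloud)] * len(cloud)
    for _ in range(30):
        cloud, w = particle_filter_step(cloud, w, 3.0, rng)
    print("state estimate:", estimate(cloud, w))
```

Multinomial resampling is used here because it is the shortest to write; systematic or stratified resampling is a common lower-variance alternative.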

The public tree is intentionally conservative. It exposes the reusable autonomy core: coordination, safety, simulation, explainability, and platform-agnostic adapter boundaries. Evaluator collateral, internal review packs, and program-specific material are not maintained as part of the public repository.

Official Project Identity

The official public upstream is https://github.com/RMANOV/strix.

Forks are allowed under Apache-2.0, but the STRIX name, project identity, and official release channel are separate from the source-code license. See NOTICE, TRADEMARKS.md, and Project_Docs/provenance/OFFICIAL_RELEASES.md.

Public releases should be traceable to the official upstream and maintainer release authority. Private keys, local machine release state, customer-specific material, and private companion modules do not belong in the public repository.

Focus Areas

  • State estimation and prediction: particle filters, regime detection, anomaly handling, and degraded-mode reasoning for uncertain environments.
  • Task allocation and coordination: combinatorial auctions, stigmergic coordination, fractal scaling, and bandwidth-aware mesh behavior.
  • Safety and policy gates: barrier functions, conservative task gating, and resilience to GPS loss, comms degradation, and sensor noise.
  • Explainability and replay: structured decision traces, narration hooks, and after-action inspection.
  • Simulator-first integration: reusable adapters and a simulation playground before platform-specific deployment work.
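As a rough illustration of auction-style task allocation, the sketch below scores bids with a simple value-minus-cost rule and assigns tasks greedily, one task per agent. It is a deliberately simplified single-item greedy auction, not the combinatorial auction that STRIX describes, and the names `score_bid`, `greedy_auction`, `energy_weight`, and `risk_weight` are hypothetical.

```python
def score_bid(value, distance_m, risk, energy_weight=0.02, risk_weight=4.0):
    """Hypothetical bid score: task value minus energy and risk penalties."""
    return value - energy_weight * distance_m - risk_weight * risk


def greedy_auction(bids):
    """Assign at most one task per agent, highest score first.

    bids maps (agent, task) -> score. Returns a dict task -> agent.
    Pairs with a non-positive score are never assigned.
    """
    assignment = {}
    busy_agents = set()
    # Highest score first; ties are broken by (agent, task) for determinism.
    ranked = sorted(bids.items(), key=lambda kv: (-kv[1], kv[0]))
    for (agent, task), score in ranked:
        if score <= 0.0:
            break  # everything after this point is not worth doing
        if agent in busy_agents or task in assignment:
            continue
        assignment[task] = agent
        busy_agents.add(agent)
    return assignment


if __name__ == "__main__":
    bids = {
        ("d1", "recon"): 5.0,
        ("d1", "relay"): 4.0,
        ("d2", "recon"): 4.5,
        ("d2", "relay"): 1.0,
    }
    print(greedy_auction(bids))
```

In this example the greedy pass gives recon to d1 and relay to d2 for a total score of 6.0, while the swapped assignment would score 8.5. That gap is one reason allocation engines use combinatorial or optimal-assignment methods rather than a single greedy pass.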

Architecture

+-----------------------------------------------------------+
|              LAYER 0: HUMAN / API INTERFACE               |
|  intents, constraints, confirmation, execution requests   |
+-----------------------------------------------------------+
|          LAYER 1: PLANNING AND STATE ESTIMATION           |
|  particle filters, regime detection, anomaly handling     |
+-----------------------------------------------------------+
|             LAYER 2: TASK ALLOCATION ENGINE               |
|  auction scoring, assignment, energy/risk tradeoffs       |
+-----------------------------------------------------------+
|              LAYER 3: COORDINATION AND MESH               |
|  stigmergy, gossip, hierarchy, distributed convergence    |
+-----------------------------------------------------------+
|           LAYER 4: SAFETY AND PLATFORM ADAPTERS           |
|  safety constraints, simulator-first adapters, I/O edge   |
+-----------------------------------------------------------+
|           LAYER 5: EXPLAINABILITY AND REPLAY              |
|  decision traces, narration hooks, audit and playback     |
+-----------------------------------------------------------+

Performance Snapshot

All measurements below are previously measured software results from Criterion benchmarks on a single core, reproducible with cargo bench (the repository defines no custom [profile.bench], so Cargo's default bench profile applies). They are point-in-time figures, not current guarantees, and must be re-run on the exact submission commit before being quoted as current.

| Benchmark             | Configuration           | Time    |
|-----------------------|-------------------------|---------|
| Particle filter step  | 50 particles            | 42 µs   |
| Particle filter step  | 200 particles (default) | 75 µs   |
| Particle filter step  | 1000 particles          | 226 µs  |
| Combinatorial auction | 5 drones, 3 tasks       | 2.7 µs  |
| Combinatorial auction | 20 drones, 10 tasks     | 47 µs   |
| Combinatorial auction | 50 drones, 20 tasks     | 465 µs  |
| Full swarm tick       | 5 drones                | 298 µs  |
| Full swarm tick       | 10 drones               | 580 µs  |
| Full swarm tick       | 20 drones               | 1.15 ms |

The full tick benchmark covers estimation, regime updates, assignment, coordination, safety clamps, and trace capture. The previously measured ~1.15 ms per tick for 20 drones fits well inside the 100 ms period of a 10 Hz orchestration loop. It is a software-only benchmark figure to be re-confirmed on the submission commit, not a demonstrated end-to-end field result, and it carries no sensor, RF, or platform-I/O budget.
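The loop-budget arithmetic behind that statement can be checked with a few lines. The helper names `tick_budget_fraction` and `headroom_ms` are hypothetical, and the inputs are the point-in-time benchmark figures quoted above, not fresh measurements.

```python
def tick_budget_fraction(tick_time_ms, loop_rate_hz):
    """Fraction of one loop period consumed by a single tick."""
    period_ms = 1000.0 / loop_rate_hz
    return tick_time_ms / period_ms


def headroom_ms(tick_time_ms, loop_rate_hz):
    """Time left in the loop period after the tick (software-only view)."""
    return 1000.0 / loop_rate_hz - tick_time_ms


if __name__ == "__main__":
    # Full swarm tick figures from the snapshot, in milliseconds.
    measured_ms = {5: 0.298, 10: 0.580, 20: 1.15}
    for drones, tick_ms in measured_ms.items():
        share = tick_budget_fraction(tick_ms, 10.0)
        print(f"{drones} drones: {share:.2%} of a 10 Hz period, "
              f"{headroom_ms(tick_ms, 10.0):.2f} ms headroom")
```

The headroom computed here is compute-only. Sensor latency, RF round trips, and platform I/O would all draw on the same period and are not included.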

Scale (estimate / roadmap, not demonstrated). The largest full swarm-tick benchmark above is 20 drones at ~1.15 ms; the combinatorial auction benchmark reaches 50 drones, and the swarm_tick benchmark source exercises up to 100 drones. A practical single-node ceiling of ~400–500 agents at 10 Hz is an audit-derived estimate, and 2000+ agents is a forward roadmap target. Neither is a demonstrated or benchmark-backed capability. These bounds are governed by the canonical claim map in Project_Docs/CAPABILITY_BOUNDARY.md and detailed in Project_Docs/provenance/validation/EVIDENCE_PACKET.md (item 7). They must never be stated as fact.

Quick Start

git clone https://github.com/RMANOV/strix.git
cd strix

cargo test --workspace
cargo build --release
pip install -e .

Requirements: Rust 1.83+, Python 3.11+, maturin 1.11+

Software-Only Replay

STRIX includes a public-safe deterministic replay harness for inspecting scenario behavior before hardware or field validation:

python scripts/strix_sim_replay.py --scenario sim/scenarios/gps_denied_recon.yaml

The command writes a JSON timeline and a self-contained HTML canvas under target/strix-replays/ by default. It is useful for seeded behavior review, scenario regression evidence, and visual inspection of agent reactions. It is not a substitute for hardware, RF, sensor, or field validation.

Capability status / public claim boundary

STRIX is a civilian-dogfooded, single-operator, test-backed, simulator-first research platform. To keep public statements honest, the surfaces in this repository are labelled by claim posture. These labels describe what STRIX claims publicly; they are not a product cut. Every labelled surface below remains present and functional in the source tree.

The canonical, authoritative version of this map lives in Project_Docs/CAPABILITY_BOUNDARY.md. The summary here is a pointer, not a separate source of truth.

Core / load-bearing (the claimable autonomy story):

  • Rust-centered OODA / tick path and orchestration loop.
  • Classical safety constraints (classical control-barrier-function gating is the default; fallback_to_classical = true).
  • ROE / policy gates with friendly-and-civilian deny-first guarding.
  • DecisionTrace / BattleReport and replayable, auditable decision artifacts.
  • Degraded-mode and electronic-warfare (EW) behavior.
  • Simulator-first, software-replay evidence.
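For readers unfamiliar with control-barrier-function gating, the following is a minimal 1-D sketch of the general technique. It is not STRIX's safety module: the single-integrator dynamics, the barrier h(x) = x_max - x, and the names `cbf_clamp` and `alpha` are assumptions chosen to keep the example small.

```python
def cbf_clamp(x, u_nominal, x_max, alpha=1.0):
    """Clamp a velocity command so the state stays below x_max.

    Assumed dynamics: x_dot = u (single integrator).
    Barrier:          h(x) = x_max - x, safe set is h >= 0.
    CBF condition:    h_dot >= -alpha * h, which reduces to u <= alpha * h.
    """
    h = x_max - x
    return min(u_nominal, alpha * h)


def simulate(x0, u_nominal, x_max, dt=0.1, steps=200, alpha=1.0):
    """Euler-integrate the clamped system and return the final state."""
    x = x0
    for _ in range(steps):
        x += dt * cbf_clamp(x, u_nominal, x_max, alpha)
    return x


if __name__ == "__main__":
    # The nominal controller always pushes toward the boundary at 5 units/s.
    print("final position:", simulate(x0=0.0, u_nominal=5.0, x_max=10.0))
```

Far from the boundary the nominal command passes through unchanged; near it the command is scaled down so the state approaches x_max without crossing. In discrete time this sketch relies on alpha * dt <= 1, so that a single Euler step cannot overshoot the boundary.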

Experimental / optional / adapter-boundary labels (still in-tree, not the public claim center):

  • GCBF+: experimental / training path. The shipped safety story is classical-CBF + ROE + traces + simulator, not a trained-weights neural-safety guarantee.
  • ROS2 / MAVLink: adapter-boundary, validation-ahead. These are integration boundaries, not delivered hardware integration.
  • Python LLM / edge inference: optional / facade / degraded-mode support. It is not a required runtime and not the core autonomy path.
  • Optimizer: an offline tool, not a live autonomy core.

Frozen public claim-set. The full allowed / experimental / forbidden claim set is documented in Project_Docs/CAPABILITY_BOUNDARY.md. In short, STRIX does not publicly claim:

  • fielded or on-hardware drone deployment;
  • delivered ROS2/MAVLink hardware integration;
  • defence validation / accreditation / certification;
  • a default-runtime or trained-neural GCBF+ safety guarantee;
  • edge-LLM autonomous decision authority;
  • a shipped STRIX integration into external memory systems;
  • sensor / RF / field readiness inferred from software replay alone.

Scale figures and tick-timing numbers are treated as prior measured software-replay results to be re-run on the exact submission commit, not as final live facts.

Project Structure

strix/
├── crates/
│   ├── strix-core/        state estimation, resilience modules, safety constraints
│   ├── strix-auction/     task allocation and portfolio-style optimization
│   ├── strix-mesh/        coordination mesh, gossip, stigmergy, hierarchy
│   ├── strix-adapters/    simulator-first adapter boundary and platform stubs
│   ├── strix-xai/         explainability engine and decision traces
│   ├── strix-swarm/       integration tick loop across the Rust crates
│   ├── strix-python/      PyO3 bindings for Rust/Python interop
│   └── strix-playground/  simulation playground and preset execution
├── python/strix/
│   ├── brain.py           orchestration loop and planning shell
│   ├── adversarial.py     prediction and hidden-state modeling helpers
│   ├── nlp/               intent parsing and confirmation flow
│   ├── temporal/          multi-horizon planning logic
│   ├── digital_twin/      world model, rehearsal, visualization
│   └── llm/               optional narration and generic provider hooks
├── sim/scenarios/         public simulation scenarios and placeholders
├── demo/                  public demo placeholders and lightweight examples
├── Project_Docs/          sanitized public notes
└── paper/                 paper source and generated PDF

Licensing

This public repository is licensed under Apache License 2.0.

Disclaimer

STRIX is a research prototype under active development. It is intended for research, experimentation, evaluation, and technology demonstration. Users are responsible for validating fitness for their own domain, integration path, and regulatory context.