Technical Deep Dive
Full technical specification. Written for engineers, operators, and architects evaluating the platform.
Layer 1
Hardware
Compute
GPU Nodes
NVIDIA GPU cluster — dedicated VRAM per inference session, no multi-tenant GPU sharing on active workloads
CPU Configuration
Multi-core x86-64 host CPUs for orchestration, pre/post-processing, and API routing layers
System RAM
High-capacity ECC RAM. Inference context loaded into RAM at session start — no swap, no disk writes during active processing
Storage
Model Storage
NVMe SSD for model weights and vector index persistence. Read-optimized, not used as inference scratch space
Client Data
No client payload written to disk. Prompt content, context windows, and intermediate vectors are held in RAM only and discarded at session teardown
Audit Records
Structured receipt log (session ID, timestamp, token count, model version) — no prompt or output content is recorded; a client-held copy is available on request
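For illustration, a receipt of this shape is a few content-free fields. A minimal sketch — the field names follow the text, but the class name and example values are assumptions:

```python
from dataclasses import dataclass, asdict
import json

# Hypothetical shape of one audit receipt. Note what is absent: no prompt
# text, no output text, no intermediate data — metadata only.
@dataclass
class Receipt:
    session_id: str
    timestamp: str       # e.g. ISO-8601
    token_count: int
    model_version: str

receipt = Receipt("sess-0001", "2024-01-01T00:00:00Z", 1842, "v1")
print(json.dumps(asdict(receipt)))
```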
Physical Hosting
Provider
Dedicated hardware in a Tier-grade datacenter facility — not shared cloud. No hypervisor tenant isolation required because the hardware itself is not shared
Regional Gateways
Regional PoP nodes (dedicated hardware in a Tier-grade datacenter facility) used as WireGuard ingress endpoints — geographically distributed entry with centralized compute
Power & Redundancy
Tier-grade facility power, UPS, cooling, and redundant uplinks — operator-level SLA applied to the physical layer
Layer 2
Network Topology
External Surface
Ingress
2.5 GbE external-facing link per regional gateway. All non-WireGuard traffic is dropped at iptables before it reaches the application layer
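A filter of this kind can be sketched in a few iptables rules. The interface name and port below are placeholder assumptions (51820 is WireGuard's default UDP port), not the deployed values:

```shell
# Accept only WireGuard UDP on the external interface; drop everything else
# before it reaches any application listener. eth0 and 51820 are placeholders.
iptables -A INPUT -i eth0 -p udp --dport 51820 -j ACCEPT
iptables -A INPUT -i eth0 -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -i eth0 -j DROP
```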
Port Policy
NGINX configured to return 444 (nginx's non-standard code: close the connection without sending any response) on all connections not originating from an authenticated WireGuard peer. No banner, no headers, no application-layer response — a prober learns nothing about what is running behind the port
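The silent-drop behavior maps onto a catch-all server block. A minimal sketch — the listen port is a placeholder, and a real deployment would bind actual services only to the WireGuard interface address:

```nginx
# Default catch-all: any connection that does not match an explicitly
# configured (WireGuard-facing) server gets no response at all.
server {
    listen 443 default_server;   # placeholder port
    server_name _;
    return 444;                  # nginx-specific: close without responding
}
```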
Tunnel Protocol
WireGuard over UDP — modern cryptography (Curve25519, ChaCha20-Poly1305, BLAKE2s), stateless handshake, no persistent session tables
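Statelessness shows up in the configuration: a WireGuard peering is a few static lines, with no session database behind it. An illustrative fragment — the keys and addresses below are placeholders, not real values:

```ini
[Interface]
PrivateKey = <gateway-private-key>   # placeholder; a Curve25519 private key
Address = 10.0.0.1/24
ListenPort = 51820                   # WireGuard's default port

[Peer]
PublicKey = <client-public-key>      # placeholder
AllowedIPs = 10.0.0.2/32             # cryptokey routing: this peer, this IP
```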
Internal Fabric
Topology
NIC-to-NIC direct attach — nodes are connected at the physical layer with no intermediate switch, router, or hub. No device exists in the path that can be tapped, port-mirrored, or compromised
Throughput
100 GbE at wire speed (~12.5 GB/s) — exceeds NVMe write throughput (~7 GB/s), which is the architectural basis for RAM-only processing: moving data between nodes over the wire is faster than writing it to disk
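The comparison is simple unit arithmetic, using the figures quoted above:

```python
# 100 GbE line rate vs. approximate NVMe sequential write throughput
wire_gbit_s = 100                   # gigabits per second
wire_gbyte_s = wire_gbit_s / 8      # bits -> bytes: 12.5 GB/s on the wire
nvme_gbyte_s = 7                    # approximate NVMe write, GB/s

print(wire_gbyte_s)                 # 12.5
print(wire_gbyte_s > nvme_gbyte_s)  # True: the fabric outruns the disk
```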
Security Properties
No shared broadcast domain — ARP poisoning and VLAN hopping attacks have no surface. No switch firmware attack vector. Passive interception requires physical cable access, not network access
Segmentation
Compute, storage, and management planes on separate direct-attach links. No lateral movement path between planes — they are physically separate cables, not logical VLANs on a shared switch
Layer 3
Software Runtime
Operating System & Base
Inference Runtime
Orchestration & Management
Next: what workloads and clients this environment is designed to run.