System Design

Architecture

A deep dive into how SciDx Remote Execution connects local Python environments to remote Kubernetes compute through ZeroMQ message routing, token-based authentication, and on-demand pod provisioning.

System Overview

Rexec enables a local Python client (inside a Jupyter Notebook or script) to transparently execute functions on a remote Kubernetes cluster. The architecture has five logical layers:

Client Layer

The ndp_ep or scidx-rexec Python library running on the user's machine. Serializes function objects and sends them over ZeroMQ.

NDP EP API

FastAPI gateway. Provides the broker address and spawn API URL to authenticated clients. Also provides dataset discovery.

ZeroMQ Broker

Stateless message router deployed inside the cluster. Routes function-call frames between N clients and M dedicated server pods.

Server Deploy API

FastAPI service that provisions and tears down per-user rexec-server pods via the Kubernetes API.

Rexec Server Pods

Per-user worker pods running inside the cluster. Connect to the broker's internal address and execute dispatched Python functions.

Auth API

External identity provider. All components validate Bearer tokens against the same AUTH_API_URL endpoint.

Full System Diagram

Rexec Full Architecture Diagram

Source: reference/svg/rexec-arc.svg

ZeroMQ Transport Layer

Rexec uses ZeroMQ as its transport layer for low-latency, high-throughput message passing between clients and server pods.

Socket Pattern: ROUTER–DEALER

┌─────────────────────────────────────────────────────────┐
│                   ZeroMQ Broker Pod                     │
│                                                         │
│   ROUTER frontend          ROUTER backend               │
│   (NodePort 30001)         (ClusterIP 5560)             │
│        │                         │                      │
│        ▼                         ▼                      │
│   [routing table] ←────────→ [routing table]            │
└─────────────────────────────────────────────────────────┘
         │                         │
         ▼                         ▼
   Client (DEALER)          Server Pod (DEALER)
   tcp://<node>:30001       tcp://rexec-broker-internal-ip:5560
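Because both broker sockets are ROUTERs, the broker keeps its own routing state rather than relying on a built-in proxy. A minimal, socket-free sketch of that bookkeeping — the exact frame layout and the idle-worker queue are assumptions for illustration, not the actual broker implementation:

```python
from collections import deque

def route_to_worker(idle_workers: deque, request: list[bytes]) -> list[bytes]:
    """Pick an idle worker identity and build the backend ROUTER message.

    `request` arrives from the frontend ROUTER as
    [client-identity, empty, token, fn, args]; prefixing a worker
    identity lets the backend ROUTER deliver it, and carrying the
    client identity inside the payload lets the reply route back.
    """
    worker = idle_workers.popleft()  # hand work to the next idle worker
    return [worker, b""] + request

def route_to_client(reply: list[bytes]) -> list[bytes]:
    """Strip the worker envelope so the frontend ROUTER can route home.

    `reply` arrives from the backend as
    [worker-identity, empty, client-identity, empty, status, result].
    """
    return reply[2:]  # [client-identity, empty, status, result]
```

In a real broker these two functions sit inside a poll loop over the frontend and backend sockets, with workers re-queued as idle after each reply.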

Message Frame Structure (outdated; needs updating to reflect the current output-stream ZMQ message structure)

Client → Broker → Server
[client-identity] [empty] [token] [function-dill] [args-dill]
Server → Broker → Client
[client-identity] [empty] [status] [result-dill]
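In Python terms, the framing above can be sketched as follows. This uses pickle to keep the sketch stdlib-only (the real system serializes with dill, which also handles lambdas and closures), and note that the [client-identity] frame is added by the ROUTER socket itself, not by user code:

```python
import pickle  # rexec uses dill; pickle keeps this sketch dependency-free

def build_request(token: bytes, fn, args: tuple) -> list[bytes]:
    """Frames a client DEALER sends; the broker's ROUTER prepends [client-identity]."""
    return [b"", token, pickle.dumps(fn), pickle.dumps(args)]

def execute_request(frames: list[bytes]) -> list[bytes]:
    """Server-pod side: unpack, run, and frame the reply as [empty, status, result]."""
    _empty, _token, fn_blob, args_blob = frames
    try:
        result = pickle.loads(fn_blob)(*pickle.loads(args_blob))
        return [b"", b"OK", pickle.dumps(result)]
    except Exception as exc:
        return [b"", b"ERROR", pickle.dumps(repr(exc))]
```

The status frame lets the client distinguish a successful result from a serialized remote exception without deserializing the payload first.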

Port Reference

Port              Service Name                 Type            Who Connects
30001 (external)  rexec-broker-external-ip     NodePort        External clients (users)
30002 (external)  rexec-broker-external-ip     NodePort        Management / control
5560 (internal)   rexec-broker-internal-ip     ClusterIP       Server pods (in-cluster)
/rexec            rexec-server-deployment-api  Ingress (HTTP)  NDP EP API (internal)
/api              ndp-ep-api                   Ingress (HTTP)  External clients (users)

Broker Internals

The broker is deployed as a single Kubernetes Deployment with a replicaCount of 1 by default. It exposes two Kubernetes Services:

rexec-broker-external-ip  (NodePort)
  └── port 30001/TCP → broker frontend (client-facing)
  └── port 30002/TCP → broker control

rexec-broker-internal-ip  (ClusterIP)
  └── port 5560/TCP  → broker backend (server pod-facing)

Server Pod Lifecycle

 Sysadmin pre-deploys all three components to Kubernetes (via Helm chart):
 ┌─────────────────────────────────────────────────────────────────────────┐
 │  Kubernetes                                                             │
 │  [NDP EP API pod]   [Deploy API pod]   [Broker pod]                    │
 └─────────────────────────────────────────────────────────────────────────┘
 ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
User            [NDP EP API pod]     [Deploy API pod]      Kubernetes
 │                      │                    │                    │
 │── GET /api/rexec ───▶│                    │                    │
 │   Bearer token       │── POST /spawn ────▶│                    │
 │                      │   token            │── create ns ──────▶│
 │                      │                    │── create pod ─────▶│  [Server pod]
 │                      │                    │                    │
 │                      │                    │◀── pod running ────│
 │ ◀──────────────────────── {broker_url} ───│                    │
 │                      │                    │  [Server pod] connects to [Broker pod]:5560
 │                      │                    │                    │
 │── ZMQ connect ───────────────────────────────────────────────▶ [Broker pod]
 │   (broker_url:30001) │                    │                    │
 │── send(fn) ──────────────────────────────────────────────────▶ [Broker pod]
 │                      │                    │                      ──▶ [Server pod]
 │                      │                    │                          [executes fn]
 │◀── result ─────────────────────────────────────────────────── [Broker pod]
 │                      │                    │                    │
 
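From the client's perspective, the sequence above is two HTTP calls followed by a ZMQ connect. A sketch of that flow, with the HTTP helpers injected so the logic stays transport-agnostic — the endpoint paths and response keys (`spawn_url`, `broker_url`) are illustrative guesses, not the documented API contract:

```python
def provision_session(http_get, http_post, token: str) -> str:
    """Walk the spawn sequence: EP API -> Deploy API -> broker address.

    http_get/http_post are caller-supplied helpers standing in for real
    authenticated HTTP requests (e.g. a requests.Session wrapper).
    """
    # 1. Ask the NDP EP API for the Deploy (spawn) API address.
    info = http_get("/api/rexec", token)
    # 2. Ask the Deploy API to provision a per-user server pod.
    spawned = http_post(info["spawn_url"] + "/spawn", token)
    # 3. The returned broker URL is what the client ZMQ-connects to.
    return spawned["broker_url"]
```

Once `broker_url` is returned, the client opens its DEALER socket against port 30001 and the spawned server pod is already attached to the broker backend on 5560.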

Authentication Flow

Every component in the stack validates the user's Bearer token independently against the configured AUTH_API_URL. This means the identity provider is the single source of truth for all access decisions.

Token validation happens at 3 independent points:

1. NDP EP API   → validates token + (optionally) group membership before returning Rexec Server Spawn API addr
2. Deploy API   → validates token + (optionally) group membership before provisioning server pod
3. ZMQ Broker   → validates token on connection handshake before routing messages
⚠️
All components must use the same AUTH_API_URL. A token issued by IDP-A will be rejected by a component configured to validate against IDP-B.
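Each of the three validation points boils down to the same check: forward the Bearer token to AUTH_API_URL and treat anything other than a success response as a rejection. A minimal stdlib sketch — the `/validate` path is a hypothetical placeholder, since the real endpoint depends on the identity provider:

```python
import urllib.error
import urllib.request

def validate_token(token: str, auth_api_url: str) -> bool:
    """Check a Bearer token against the identity provider.

    The endpoint path below is an assumption; only the Bearer-header
    convention and the shared AUTH_API_URL come from the architecture.
    """
    req = urllib.request.Request(
        f"{auth_api_url}/validate",  # hypothetical validation path
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # 401/403 etc. -> token rejected
```

Because every component calls the same AUTH_API_URL, revoking a token at the identity provider cuts off API access, pod spawning, and broker routing at once.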

Kubernetes Network Topology

External Network
┌──────────────────────────────────────────────────────────────┐
│ User Laptop / Jupyter                                        │
│   ndp_ep client ──→ HTTPS /api/*    ──→ [Ingress] NDP EP API │
│   scidx_rexec   ──→ TCP  :30001     ──→ [NodePort] Broker    │
└──────────────────────────────────────────────────────────────┘

Kubernetes Cluster
┌──────────────────────────────────────────────────────────────┐
│  Namespace: rexec                                            │
│  ┌────────────────┐   ┌──────────────────────────────────┐   │
│  │  NDP EP API    │──▶│  Deploy API                      │   │
│  │  (Ingress /api)│   │  (Ingress /rexec)                │   │
│  └────────────────┘   └──────────────────────────────────┘   │
│                                   │ kubectl                  │
│  Namespace: rexec-broker          ▼                          │
│  ┌────────────────┐  Namespace: rexec-server-<user>          │
│  │  Broker Pod    │  ┌───────────────────────────────────┐   │
│  │  :30001 (ext)  │◀─│  rexec-server Pod                 │   │
│  │  :5560  (int)  │─▶│  (ephemeral, per-user)            │   │
│  └────────────────┘  └───────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────┘

External Auth
┌──────────────────────────────────────────────────────────────┐
│  AUTH_API_URL  (e.g. idp-test.nationaldataplatform.org)      │
│  ← All 3 in-cluster components validate tokens here          │
└──────────────────────────────────────────────────────────────┘

Design Decision: Why ZeroMQ?

✅ Low latency

ZeroMQ's lock-free queuing and minimal framing overhead keep round-trip times in the single-millisecond range for typical function payloads.

✅ Language agnostic

ZeroMQ has bindings for 30+ languages. Future server implementations in Julia, R, or Rust are straightforward.

✅ N:M routing

A single ROUTER–DEALER broker naturally handles N clients talking to M server pods without any coordination overhead.

✅ No message queue infrastructure

No Kafka, RabbitMQ, or Redis required. The broker itself is the queue — one fewer service to operate.

Design Decision: Why Per-User Server Pods?

✅ Isolation

Each user gets a dedicated Python process. One user's long-running job or crash cannot affect another user's session.

✅ Resource limits

Kubernetes resource requests/limits can be applied per pod, giving fine-grained CPU/memory accounting per user.

✅ Custom environments

Different users can get different container images (e.g., different Python packages) by configuring the Deploy API.

⚖️ Trade-off: startup time

Pod provisioning takes 5–30s. For interactive workflows this is acceptable; for sub-second latency, a pre-warmed pool approach would be needed.

Scalability Considerations