End-to-end tutorial: gateway routes an inference request to an orchestrator, the AI runner processes it, and the result returns through the full pipeline. Covers off-chain local setup with a HuggingFace model.
The gateway handles routing and payment negotiation. The orchestrator handles compute. Run both on one machine, off-chain, and watch a full inference request travel through both sides and return a result without a wallet or on-chain registration.
This tutorial runs a complete local AI inference pipeline: a gateway receives a client request, routes it to a local orchestrator, the orchestrator processes it through an AI runner container, and the result returns to the caller. Estimated time: 2 to 3 hours (most of this is model download time).

What you will verify:
The gateway routes an inference request to the orchestrator
The orchestrator processes it through the AI runner
The response returns through the gateway to the caller
```
Client (curl)
  ↓ POST /text-to-image
Gateway (port 8936)
  ↓ routes job + PM ticket
Orchestrator (port 8935)
  ↓ dispatches to AI runner
AI runner container
  ↓ SDXL-Lightning inference on GPU
Orchestrator
  ↓ result + ticket evaluation
Gateway
  ↓ PNG response
Client
```
The gateway and orchestrator run as separate processes. In production, they run on separate machines. This tutorial runs both locally to make the log trace visible end-to-end.
price_per_unit sets the orchestrator’s sell-side price. The gateway’s buy-side cap must be at or above this value for the job to route. In Step 4 the gateway is started with no explicit price cap, so it accepts any price.
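The routing condition reduces to a single comparison. The sketch below illustrates it with made-up numbers; the variable names are illustrative only, not actual flags:

```shell
# Illustrative values only: a job routes when the gateway's buy-side cap
# is unset (treated here as 0, meaning "accept any price") or at least
# the orchestrator's sell-side price_per_unit.
orch_price=70    # orchestrator's price_per_unit (example value)
gateway_cap=0    # gateway's buy-side cap; 0 stands in for "no explicit cap"
if [ "$gateway_cap" -eq 0 ] || [ "$gateway_cap" -ge "$orch_price" ]; then
  echo "job routes"
else
  echo "job rejected: price above cap"
fi
```

With the values above this prints "job routes", which matches the Step 4 setup: no explicit cap on the gateway side.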
-orchAddr http://127.0.0.1:8935 - points directly at the local orchestrator (off-chain mode bypasses on-chain discovery)
-httpIngest - enables the AI inference HTTP endpoints
-remoteSignerAddr - community remote signer for payment ticket signing (no wallet needed)
-cliAddr and -httpAddr use ports distinct from the orchestrator's (7936 and 8936 for the gateway vs 7935 and 8935 for the orchestrator), so both processes can share one machine
The remote signer at signer.eliteencoder.net is a community-hosted service for testing. Check availability in #local-gateways on Discord before you start.
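Assembled from the flags above, the Step 4 gateway command looks roughly like this. Treat it as a sketch: the binary name, the -gateway and -network offchain flags, the signer URL scheme, and the exact address syntax are assumptions; only the flags discussed above come from this tutorial.

```shell
# Sketch of the Step 4 gateway start command (assumptions noted in the
# lead-in; adjust names and syntax to your go-livepeer build).
livepeer -gateway \
  -network offchain \
  -orchAddr http://127.0.0.1:8935 \
  -httpIngest \
  -remoteSignerAddr https://signer.eliteencoder.net \
  -cliAddr 127.0.0.1:7936 \
  -httpAddr 127.0.0.1:8936
```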
Step 5: Send an inference request through the gateway
Send a text-to-image request through the gateway on port 8936. Keep port 8935 for the gateway-to-orchestrator hop:
```shell
curl -X POST http://localhost:8936/text-to-image \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "ByteDance/SDXL-Lightning",
    "prompt": "a coastal town in evening light, photorealistic",
    "width": 512,
    "height": 512,
    "num_inference_steps": 4
  }' \
  -o pipeline-output.png \
  --max-time 60
```
This request travels the full pipeline. A typical first inference takes 5 to 15 seconds (VRAM kernel warm-up on the first job). Subsequent requests take 2 to 4 seconds.

Verify the output:
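A quick check that the download is an actual image rather than a JSON error body is to inspect the PNG signature. This is a minimal sketch; the filename matches the curl command above:

```shell
# A real PNG starts with the 8-byte signature 89 50 4e 47 0d 0a 1a 0a.
# A failed request usually writes a JSON error body to the same file.
png_magic() {
  head -c 8 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n'
}
if [ "$(png_magic pipeline-output.png)" = "89504e470d0a1a0a" ]; then
  echo "pipeline-output.png is a valid PNG"
else
  echo "not a PNG; first bytes of the body:"
  head -c 200 pipeline-output.png 2>/dev/null || true
fi
```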
The request left footprints in each component. Read the logs to understand what happened at each hop.

Gateway log - shows routing decision and payment signing:
The request completed the full Livepeer AI pipeline:
The curl request hit the gateway at :8936 on the /text-to-image endpoint.
The gateway selected the local orchestrator at :8935 (the only option via -orchAddr), signed a payment ticket using the community remote signer, and forwarded the job request.
The orchestrator received the job, forwarded it to the AI runner container via Docker-out-of-Docker, and waited for the result.
The AI runner loaded the SDXL-Lightning model from VRAM (it was pre-warmed), ran 4 diffusion steps, and returned a PNG.
The orchestrator returned the result to the gateway and evaluated the payment ticket (in off-chain mode, settlement is handled by the remote signer instead of the Arbitrum TicketBroker).
The gateway returned the PNG to the curl client.
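The warm-up effect described above is easy to measure with curl's built-in timing variable. A minimal sketch of a timing helper:

```shell
# Report curl's total request time in seconds, so the first (cold)
# inference can be compared against later (warm) ones.
# %{time_total} is curl's built-in --write-out timer variable.
time_request() {
  curl -s -o /dev/null -w '%{time_total}\n' "$@"
}
```

Run it twice with the same POST arguments as the request above; if the model stays warm, the first call should land in the 5 to 15 second range and the second in the 2 to 4 second range.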
In production, the orchestrator is registered on-chain and the gateway discovers it via the Livepeer protocol. Payment tickets settle on Arbitrum through the TicketBroker contract. The inference mechanics are identical.