Deployment

This guide is about shipping Live Neon responsibly.

It covers two related but different things:

Operating against the hosted Live Neon API in production
Planning the infrastructure you need if you are running a private deployment

If you are consuming the hosted API

This is the easiest way to start.

What you need

An organization
An org-scoped API key
One or more agents
A service that can fetch prompts or identity from Live Neon

API key model

Live Neon API keys are org-scoped and are shown once when created. Treat them like production credentials:

Store them in your secrets manager
Rotate them when team access changes
Create separate keys for different services if you want cleaner audit trails
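
A minimal sketch of the first point, assuming your secrets manager injects the key as an environment variable. The variable name `LIVE_NEON_API_KEY` and both helper names are hypothetical, chosen for illustration; Live Neon does not define them.

```python
import os


def load_api_key(env=None, var_name="LIVE_NEON_API_KEY"):
    """Return the API key from the environment, failing loudly if it is absent.

    `var_name` is a hypothetical name; use whatever your secrets manager injects.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; refusing to start without a key")
    return key


def redact(key, visible=4):
    """Return a log-safe form of the key showing only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Failing at startup is deliberate: a missing key should stop the service before it serves traffic. Using one variable per service also keeps the separate-keys pattern easy to audit.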

Base URL

https://persona.liveneon.ai/api/v1

Both /api/ and /api/v1/ work, but /api/v1/ is the canonical form.

Your production integration should do four things

Fetch identity or prompt at session start, deploy time, or approved-change time.
Run the runtime using your preferred framework or model provider.
Publish meaningful experience back in through conversations or source sync.
Poll jobs for long-running work, especially for sync, PBD, and consensus flows.

Do not build around synchronous assumptions

Live Neon runs several important operations as background jobs: PBD processing, content sync, avatar generation, consensus detection, and embedding backfill. A request that starts one of these returns before the work is finished, so your integration has to check the job for the outcome.

Good patterns:

Enqueue follow-up work after a job reaches completed
Surface job status in your own admin UI
Treat job polling as normal infrastructure, not a temporary hack

Jobs are your control plane

The Jobs endpoints are the operational heartbeat of the platform. Use them to track queued work, running work, failures, results, and progress counters.

Jobs are currently the documented way to track long-running work. That means: build polling, make it idempotent, and do not wait for webhook support before designing your integration cleanly.
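
A polling loop along these lines is one way to do it. This is a sketch under stated assumptions, not the official client: `fetch_job` is a function you supply that calls the Jobs endpoint and returns a dict with a `status` key, the `completed` status comes from this guide, and the `failed` status name and `error` field are assumptions to check against the API reference.

```python
import time


class JobFailed(Exception):
    """Raised when a polled job ends in a failed state."""


def poll_job(fetch_job, job_id, *, interval=2.0, max_interval=30.0,
             timeout=600.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal status, backing off between checks.

    `fetch_job(job_id)` is caller-supplied and must return a dict with "status".
    """
    deadline = clock() + timeout
    delay = interval
    while True:
        job = fetch_job(job_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":  # assumed status name, not confirmed by this guide
            raise JobFailed(f"job {job_id} failed: {job.get('error')}")
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        sleep(delay)
        # Double the wait each round, capped, so long jobs do not burn rate limit.
        delay = min(delay * 2, max_interval)
```

Polling is a read, so repeating it is safe. The idempotency work belongs in whatever you run after `completed`: key that follow-up on the job ID so a retried poller does not enqueue it twice.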

Rate limits you should design around

The platform has four main rate-limit tiers:

General: 200 requests per minute
Heavy (sync, PBD): 30 requests per minute
Bulk: 10 requests per minute
Auth: 20 requests per minute

In practice, this means:

Regular agent reads are cheap
Sync and PBD should be paced
Bulk migrations should be chunked
Sensitive auth/key operations should stay infrequent
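
One way to respect these tiers on the client side is to space requests evenly. The sketch below uses the per-minute numbers from the list above; the tier keys, the `TierPacer` class, and the `chunked` helper are illustrative names, and the guide does not say how the server windows its limits, so even pacing is a conservative assumption.

```python
import time

# Requests per minute for each tier, taken from the list above.
TIER_LIMITS = {"general": 200, "heavy": 30, "bulk": 10, "auth": 20}


class TierPacer:
    """Space out calls so each tier stays at or under its per-minute limit."""

    def __init__(self, limits=TIER_LIMITS, clock=time.monotonic, sleep=time.sleep):
        self._min_gap = {tier: 60.0 / rpm for tier, rpm in limits.items()}
        self._next_ok = {tier: 0.0 for tier in limits}
        self._clock = clock
        self._sleep = sleep

    def wait(self, tier):
        """Block until a request in `tier` is allowed, then reserve the slot."""
        now = self._clock()
        ready_at = max(now, self._next_ok[tier])
        if ready_at > now:
            self._sleep(ready_at - now)
        self._next_ok[tier] = ready_at + self._min_gap[tier]


def chunked(items, size):
    """Yield successive lists of at most `size` items, for bulk migrations."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Call `pacer.wait("heavy")` before each sync or PBD request and `pacer.wait("bulk")` before each chunk of a migration. Even pacing gives up bursts in exchange for never tripping the limit from a single process; several workers sharing one key would need a shared limiter instead.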

Recommended production architecture

User or scheduler
      |
Your app / worker
      |
Live Neon API
      |
Jobs + identity outputs
      |
Your LLM runtime
      |
Conversation capture
      |
Live Neon publish + PBD

The hidden win here is separation of concerns:

Live Neon stores and evolves identity
Your runtime handles active task execution
Your workers manage lifecycle and retries

A private deployment needs these building blocks

Web/API service

The application surface for auth, CRUD, prompt generation, and admin flows.

Persistent database

Organizations, agents, beliefs, responsibilities, jobs, content items, connections, and snapshots all require durable storage.

Background workers

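Required because several operations are asynchronous by design: PBD processing, content sync, avatar generation, consensus detection, and embedding backfill all run as background jobs.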

Scheduled sync/cron

Content sources can sync automatically on an hourly cron schedule, importing any new content found.

Secure secret storage

For org API keys, GitHub connections, X connections, and provider keys used by your surrounding runtime.

File and upload handling

Especially for LinkedIn ZIP import, markdown exports, and any pipeline artifacts you choose to persist externally.

Observability

At minimum: request logs, job status visibility, sync results, error capture, and prompt or identity change auditing.

Security considerations

1. Connection credential encryption

GitHub and X credentials are encrypted with AES-256-GCM before storage. If you run a private deployment, preserving equivalent secret-handling standards is not optional.

2. URL ingestion safety

Website and RSS ingestion includes SSRF protection that blocks private/internal network targets and non-HTTP protocols. Content ingestion is one of the easiest places to create accidental infrastructure risk.
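
If you add your own ingestion path in a private deployment, a check in this spirit is a reasonable starting point. This is an illustrative sketch built on the Python standard library, not Live Neon's implementation; the function name and the injectable `resolve` parameter are assumptions made for testability.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_safe_ingest_url(url, resolve=socket.getaddrinfo):
    """Return True only for http(s) URLs whose host resolves to public addresses."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    default_port = 443 if parts.scheme == "https" else 80
    try:
        infos = resolve(parts.hostname, port or default_port, proto=socket.IPPROTO_TCP)
    except (OSError, UnicodeError):
        return False
    if not infos:
        return False
    for info in infos:
        # info[4][0] is the resolved address string; drop any IPv6 zone id.
        address = ipaddress.ip_address(info[4][0].split("%")[0])
        if not address.is_global:
            return False  # private, loopback, link-local, or otherwise non-public
    return True
```

A check at validation time is not sufficient on its own. The fetch should connect to the address that was validated, and every redirect target needs the same check, otherwise a DNS change or a redirect can still reach an internal host.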

3. Robots-aware crawling

Website sync respects robots.txt and reports pages blocked by robots rules or URL filtering. Crawling behavior is treated as policy-aware ingestion, not blind scraping.
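
The same allowed-versus-blocked reporting can be reproduced in your own tooling with the standard library's `urllib.robotparser`. The sketch below is an assumption about how you might do it, not a description of Live Neon's crawler; `partition_by_robots` and the `example-sync-bot` user agent are hypothetical names.

```python
from urllib.robotparser import RobotFileParser


def partition_by_robots(robots_txt, urls, user_agent="example-sync-bot"):
    """Split `urls` into (allowed, blocked) according to the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed, blocked = [], []
    for url in urls:
        (allowed if parser.can_fetch(user_agent, url) else blocked).append(url)
    return allowed, blocked
```

Keeping the blocked list, rather than silently dropping those URLs, is what lets an operator see why a page never showed up in a sync.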

4. Org-scoped auth boundaries

The platform is multi-tenant by design. Your deployment and internal tooling should preserve that separation.

Operational checklists

First production launch

Create org
Create API key
Create one agent
Connect one source
Sync content
Run PBD
Poll job to completion
Review promoted identity
Regenerate prompt
Verify runtime output
Snapshot before expanding scope

Ongoing operations

Monitor failed jobs
Review newly promoted identity on a regular cadence
Rotate credentials when ownership changes
Audit which sources are still high-signal
Snapshot before major org-wide changes
Diff snapshots when investigating behavior changes

Identity-release checklist

Treat significant identity changes like deploy artifacts. Before releasing a materially changed agent:

Compare snapshot diff
Inspect resolved identity
Inspect final compiled prompt
Confirm boundaries are still present
Confirm starred items still deserve emphasis
Verify no noisy source caused accidental drift

Environment model

A good deployment should give you clear environment boundaries, safe background execution, observable change, review before rollout, and boring rollback. At minimum, operate three environments:

Local

For development, fast iteration, and migration testing. Use local for feature work, trying new model settings, shaping seed data, and testing migrations safely.

Staging

For rehearsing the full flow before production. Use staging for source sync testing, discovery job testing, prompt generation testing, identity review flow testing, and release candidate verification.

Production

For real sources, real reviews, real behavior. Use production for approved identity changes, real agent traffic, stable integrations, and monitored background jobs.

Never share secrets, data stores, or cron targets casually across environments. Identity systems are especially sensitive to silent cross-environment leakage because the mistake may not show up as an obvious crash. It may show up as a strange belief six hours later.
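
A cheap guard is to check your own configuration for values that appear in more than one environment. The config shape and key names below (`database_url`, `api_key`, `cron_target`) are hypothetical; adapt them to however you store settings.

```python
def shared_resources(environments, keys=("database_url", "api_key", "cron_target")):
    """Return (key, [environment names]) for each value used by more than one environment.

    Values themselves are not returned, so the result is safe to log.
    """
    seen = {}
    for env_name, config in environments.items():
        for key in keys:
            value = config.get(key)
            if value is not None:
                seen.setdefault((key, value), []).append(env_name)
    return [(key, envs) for (key, _value), envs in seen.items() if len(envs) > 1]
```

Running this in CI against all three environment configs turns silent leakage into a failed build instead of a strange belief six hours later.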

Environments and models

Live Neon has first-class support for environments and model assignments. This is useful if you want:

A staging environment for prompt review
A production environment for stable runtime behavior
Different provider/model combinations by environment
Cleaner separation between experimentation and live traffic

A good pattern is:

Keep one agent identity
Attach multiple environments
Assign different models by environment
Promote identity changes independently from model changes
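
The pattern can be pictured as a small data model in your own deployment tooling. Everything here is a hypothetical local representation for illustration, including the class names, the integer identity version, and the model labels; it does not mirror Live Neon's API objects.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentBinding:
    model: str             # provider/model label, e.g. "provider-a/large"
    identity_version: int  # which identity version this environment serves


@dataclass
class Agent:
    name: str
    environments: dict = field(default_factory=dict)

    def promote_identity(self, env, version):
        """Move one environment to a new identity version; its model is untouched."""
        self.environments[env].identity_version = version

    def assign_model(self, env, model):
        """Change the model for one environment; its identity version is untouched."""
        self.environments[env].model = model
```

Keeping the two operations separate means a behavior change can be traced to either an identity promotion or a model swap, never an ambiguous mix of both.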

Snapshots are a deployment feature

Snapshots are one of the most underappreciated parts of the system. They let you:

Freeze the current state before a risky change
Explain exactly what changed later
Roll your own release notes for identity evolution
Make agent identity less mysterious to operators

If you are running anything important, snapshot before major imports and before enabling new auto-approval behaviors.
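
To make the release-notes idea concrete, here is a sketch that diffs two snapshots client-side. It assumes each snapshot has been reduced to a dict mapping a category name (such as `beliefs` or `boundaries`) to a list of item strings; the real snapshot payload may be shaped differently, and both function names are hypothetical.

```python
def diff_snapshots(before, after):
    """Return items added and removed per category between two snapshots.

    Both arguments are assumed to map a category name to a list of item strings.
    """
    changes = {}
    for category in sorted(set(before) | set(after)):
        old = set(before.get(category, []))
        new = set(after.get(category, []))
        added, removed = sorted(new - old), sorted(old - new)
        if added or removed:
            changes[category] = {"added": added, "removed": removed}
    return changes


def release_notes(changes):
    """Render a diff as plain-text lines, additions first within each category."""
    lines = []
    for category, delta in changes.items():
        lines += [f"+ {category}: {item}" for item in delta["added"]]
        lines += [f"- {category}: {item}" for item in delta["removed"]]
    return "\n".join(lines)
```

A removed entry under `boundaries` is worth treating as a release blocker rather than a note, since that is the kind of change an operator should approve explicitly.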

What to avoid

1. Mixing many noisy sources before first release

This is the fastest way to create an untrustworthy identity baseline.

2. Shipping pending discoveries straight to production

Approval workflows exist because discovered identity is still editorial material.

3. Ignoring async job results

If your system triggers PBD but never checks the job result, you are missing half the control surface.

4. Treating content sync like a fire-and-forget crawler

The sync result tells you what was imported, skipped, or blocked. Read it.

5. Waiting until after drift to add observability

Identity systems need observability earlier than people expect.

A sensible rollout sequence

Phase 1: Single agent, stable prompt

One agent
One or two curated sources
Manual review
Regenerated prompt only

Phase 2: Closed loop

Publish selected conversations
Run PBD after meaningful sessions
Add snapshots and diffs to your workflow

Phase 3: Multi-agent governance

Use org and group inheritance
Enable consensus where appropriate
Introduce staging vs production environments
Use model assignments deliberately

The main production idea

Do not think of Live Neon as "prompt storage." Think of it as identity infrastructure. Prompts are just one output of that infrastructure. That mental model leads to better deployment decisions.

Next reads

Integrations for framework-specific guidance.
PBD Pipeline to understand how discovery works.
Core Concepts to understand the identity model.