Monitoring

What you’ll learn

  • How to configure log levels with RUST_LOG
  • How to monitor LLM spending with nabaos status
  • How to check cache hit rates with nabaos admin cache stats
  • How to set up security alerts via Telegram
  • How anomaly detection works and what triggers alerts
  • How to use the health check endpoint

Log Levels

NabaOS uses tracing-subscriber for structured logging. Control verbosity with the RUST_LOG environment variable:

| Level | What it shows |
|-------|---------------|
| error | Only errors that require attention |
| warn | Warnings and errors |
| info | Normal operation messages, warnings, and errors (default) |
| debug | Detailed internal state, cache decisions, routing decisions |

Set the log level

# Via environment variable
export RUST_LOG=debug
nabaos start

Or in your .env / systemd environment file:

RUST_LOG=debug

Or in Docker:

docker run -e RUST_LOG=debug ghcr.io/nabaos/nabaos:latest

RUST_LOG supports module-level filters for fine-grained control:

export RUST_LOG="nabaos=debug,tower_http=info"
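To reason about which directive applies to a given module, the following is a simplified model of prefix-based matching, written in Python for illustration. It is not tracing-subscriber's actual parser: it only handles `target=level` pairs and a bare global level, and the target names used in the example (`nabaos::cache`, `tower_http::trace`, `hyper`) are hypothetical. Treating unmatched targets as disabled is also an assumption of this sketch.

```python
from typing import Dict, Optional

# Ordered from least to most verbose. "trace" is tracing's most verbose level.
LEVELS = ["error", "warn", "info", "debug", "trace"]


def parse_directives(spec: str) -> Dict[str, str]:
    """Parse 'target=level,target=level' into a dict; a bare level maps to ''."""
    directives: Dict[str, str] = {}
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            target, level = part.split("=", 1)
            directives[target.strip()] = level.strip().lower()
        elif part.lower() in LEVELS:
            directives[""] = part.lower()  # bare level acts as the global default
        else:
            raise ValueError(f"unsupported directive in this sketch: {part!r}")
    return directives


def effective_level(directives: Dict[str, str], target: str) -> Optional[str]:
    """Return the level of the longest directive that prefixes `target`, or None."""
    matches = [t for t in directives if target.startswith(t)]
    if not matches:
        return None
    return directives[max(matches, key=len)]


def is_enabled(directives: Dict[str, str], target: str, level: str) -> bool:
    """True if an event at `level` from `target` would pass the filter."""
    max_level = effective_level(directives, target)
    if max_level is None:
        return False  # assumption: unmatched targets are treated as disabled
    return LEVELS.index(level) <= LEVELS.index(max_level)


directives = parse_directives("nabaos=debug,tower_http=info")
print(effective_level(directives, "nabaos::cache"))        # debug
print(is_enabled(directives, "tower_http::trace", "debug"))  # False
```

The most specific (longest) matching directive wins, so a broad default such as `info` can be combined with a more verbose setting for a single module.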

Example log output at each level

info (default):

2026-02-24T10:00:01Z  INFO  NabaOS starting...
2026-02-24T10:00:02Z  INFO  Security layer initialized
2026-02-24T10:00:02Z  INFO  Ready.
2026-02-24T10:05:11Z  INFO  Cache hit: check_email (fingerprint match)
2026-02-24T10:05:11Z  INFO  Request completed in 12ms

debug:

2026-02-24T10:05:11Z  DEBUG  Fingerprint lookup: hash=a3f8c1 entries_checked=142
2026-02-24T10:05:11Z  DEBUG  Cache hit: similarity=0.97 threshold=0.92
2026-02-24T10:05:11Z  DEBUG  Skipping LLM call, executing cached tool sequence
2026-02-24T10:05:11Z  INFO   Request completed in 12ms

warn:

2026-02-24T10:15:00Z  WARN  Daily budget 82% consumed ($8.20 / $10.00)
2026-02-24T10:15:00Z  WARN  Anomaly score elevated: 0.73 (threshold: 0.80)

error:

2026-02-24T10:20:00Z  ERROR  LLM provider returned 429 Too Many Requests
2026-02-24T10:20:00Z  ERROR  Failed to write cache entry: database is locked
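For a quick overview of a captured log, lines in the format shown above can be tallied by level. The sketch below assumes every line follows the `timestamp  LEVEL  message` layout of the examples. Your actual output may include extra fields such as targets or span names, in which case the pattern needs adjusting.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# Assumed layout: "<timestamp>  <LEVEL>  <message>", as in the examples above.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>ERROR|WARN|INFO|DEBUG|TRACE)\s+(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Split one log line into timestamp, level, and message; None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def count_levels(lines: Iterable[str]) -> Counter:
    """Count how many parsed lines appear at each level, skipping unparseable lines."""
    counts: Counter = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            counts[parsed["level"]] += 1
    return counts


SAMPLE = [
    "2026-02-24T10:00:01Z  INFO  NabaOS starting...",
    "2026-02-24T10:05:11Z  DEBUG  Fingerprint lookup: hash=a3f8c1 entries_checked=142",
    "2026-02-24T10:15:00Z  WARN  Daily budget 82% consumed ($8.20 / $10.00)",
    "2026-02-24T10:20:00Z  ERROR  LLM provider returned 429 Too Many Requests",
    "2026-02-24T10:20:00Z  ERROR  Failed to write cache entry: database is locked",
]

print(count_levels(SAMPLE))
```

A rising ERROR or WARN count over a fixed time window is usually a better signal than any single line.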

Cost Monitoring

Track how much you are spending on LLM API calls:

nabaos status

Expected output:

=== Cost Summary (All Time) ===
  Total LLM calls:     347
  Total cache hits:     2,841
  Cache hit rate:       89.1%
  Input tokens:         1,245,600
  Output tokens:        423,100
  Total spent:          $4.73
  Total saved:          $38.12
  Savings:              88.9%

=== Last 24 Hours ===
  Total LLM calls:     12
  Total cache hits:     94
  Cache hit rate:       88.7%
  Input tokens:         42,300
  Output tokens:        15,200
  Total spent:          $0.18
  Total saved:          $1.44
  Savings:              88.9%

Key metrics

| Metric | What it means |
|--------|---------------|
| Cache hit rate | Percentage of requests served from cache without an LLM call. Target: >85% after the first week. |
| Total spent | Actual dollars spent on LLM API calls. |
| Total saved | Estimated dollars saved by cache hits (based on what those requests would have cost). |
| Savings | total_saved / (total_spent + total_saved) * 100 |
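The percentages can be reproduced by hand. The sketch below applies the savings formula to the "Last 24 Hours" figures from the example output. The hit-rate formula (cache hits divided by cache hits plus LLM calls) is an assumption inferred from those figures, not something stated by the tool.

```python
def cache_hit_rate(llm_calls: int, cache_hits: int) -> float:
    """Percentage of requests served from cache (assumed: hits / (hits + LLM calls))."""
    total = llm_calls + cache_hits
    return 0.0 if total == 0 else cache_hits / total * 100


def savings_pct(total_spent: float, total_saved: float) -> float:
    """Savings as defined above: total_saved / (total_spent + total_saved) * 100."""
    total = total_spent + total_saved
    return 0.0 if total == 0 else total_saved / total * 100


# "Last 24 Hours" example: 12 LLM calls, 94 cache hits, $0.18 spent, $1.44 saved.
print(f"Cache hit rate: {cache_hit_rate(12, 94):.1f}%")  # 88.7%
print(f"Savings:        {savings_pct(0.18, 1.44):.1f}%")  # 88.9%
```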

Programmatic access

If the web dashboard is running, query costs via the API:

curl -s http://localhost:8919/api/costs \
  -H "Authorization: Bearer <token>" | python3 -m json.tool
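The same request can be made from a script. This is a minimal sketch using only the Python standard library. The function names are illustrative, and because the response schema is not documented here, the parsed JSON is returned as-is rather than unpacked into specific fields.

```python
import json
import urllib.request


def build_costs_request(
    token: str, base_url: str = "http://localhost:8919"
) -> urllib.request.Request:
    """Build the GET request for /api/costs with a bearer token."""
    return urllib.request.Request(
        f"{base_url}/api/costs",
        headers={"Authorization": f"Bearer {token}"},
    )


def fetch_costs(
    token: str, base_url: str = "http://localhost:8919", timeout: float = 5.0
):
    """Send the request and return the decoded JSON body, whatever its shape."""
    request = build_costs_request(token, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_costs("<token>")` requires the web dashboard to be running. It raises `urllib.error.URLError` if the server is unreachable and `urllib.error.HTTPError` on a non-2xx response such as a rejected token.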

Cache Statistics

Monitor the cache tiers individually:

nabaos admin cache stats

Expected output:

=== Cache Statistics ===

Fingerprint Cache (Tier 0):
  Entries: 142
  Hits:    1,203

Intent Cache (Tier 2):
  Total entries:   89
  Enabled entries: 84
  Total hits:      1,638

What the numbers mean

| Item | Description |
|------|-------------|
| Fingerprint Cache (Tier 0) | Exact-match lookup by query hash. Sub-millisecond. Zero cost. |
| Intent Cache (Tier 2) | Semantic similarity match using embeddings. Handles paraphrased queries. |
| Enabled vs. total entries | Entries with low success rates are automatically disabled (not deleted). |

A healthy system shows the fingerprint cache growing over time as repeated queries are recognized, and the intent cache accumulating entries for paraphrased patterns.
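To make the difference between the two tiers concrete, here is a toy model of an exact-match lookup followed by a similarity lookup. It is an illustration only and does not reflect NabaOS internals. The query normalization, the bag-of-words stand-in for an embedding model, and the `TieredCache` class are all assumptions of this sketch. The 0.92 default threshold is borrowed from the debug log example earlier on this page.

```python
import hashlib
import math
from collections import Counter
from typing import Dict, List, Optional, Tuple


def fingerprint(query: str) -> str:
    """Hash a normalized query (assumed: lowercase, collapsed whitespace)."""
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def embed(query: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(query.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TieredCache:
    def __init__(self, threshold: float = 0.92) -> None:
        self.threshold = threshold
        self.exact: Dict[str, str] = {}                  # fingerprint tier
        self.intents: List[Tuple[Counter, str]] = []     # similarity tier

    def store(self, query: str, result: str) -> None:
        self.exact[fingerprint(query)] = result
        self.intents.append((embed(query), result))

    def lookup(self, query: str) -> Tuple[Optional[str], str]:
        """Return (result, tier), where tier is 'fingerprint', 'intent', or 'miss'."""
        hit = self.exact.get(fingerprint(query))
        if hit is not None:
            return hit, "fingerprint"
        vector = embed(query)
        best_score, best_result = 0.0, None
        for stored_vector, result in self.intents:
            score = cosine(vector, stored_vector)
            if score > best_score:
                best_score, best_result = score, result
        if best_result is not None and best_score >= self.threshold:
            return best_result, "intent"
        return None, "miss"


cache = TieredCache()
cache.store("check my email inbox for new messages today", "check_email")
print(cache.lookup("Check my email inbox  for new messages today"))
print(cache.lookup("please check my email inbox for new messages today"))
print(cache.lookup("what is the weather"))
```

In this model a repeated query is answered by the first tier, a lightly reworded one by the second, and anything else falls through to a miss, which in a real deployment would mean an LLM call.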


Security Alerts

NabaOS can send real-time security alerts to a dedicated Telegram bot. This keeps security notifications separate from the main agent conversation.

Setup

  1. Create a second Telegram bot via @BotFather for security alerts.
  2. Get the chat ID where alerts should go.
  3. Set the environment variables:
export NABA_SECURITY_BOT_TOKEN="987654:XYZ-security-bot-token"
export NABA_ALERT_CHAT_ID="123456789"
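Before relying on alerts, it can help to confirm that the token and chat ID work together. The sketch below sends a test message through the Telegram Bot API's `sendMessage` method, independently of NabaOS. The function names and the message text are illustrative; only the two environment variables come from the setup steps above.

```python
import json
import os
import urllib.parse
import urllib.request


def build_alert_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build a form-encoded POST to the Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")


def send_test_alert(text: str = "NabaOS security alert test", timeout: float = 10.0):
    """Send a test message using the variables set in step 3; returns Telegram's JSON reply."""
    token = os.environ["NABA_SECURITY_BOT_TOKEN"]
    chat_id = os.environ["NABA_ALERT_CHAT_ID"]
    request = build_alert_request(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A successful call returns a JSON object with `"ok": true` and the message appears in the target chat. If you do not yet know the chat ID, one common approach is to send any message to the new bot and then read the `chat.id` field from the Bot API's `getUpdates` response.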

What triggers alerts

| Alert type | Trigger |
|------------|---------|
| Credential detected | API keys, passwords, tokens, or PII found in a user query |
| Injection attempt | Prompt injection or jailbreak patterns detected by the security layer |
| Out-of-domain request | A query falls outside the constitution’s allowed domains |
| Anomaly detected | Behavioral deviation exceeds the anomaly threshold |
| Budget exceeded | Daily LLM spending exceeds NABA_DAILY_BUDGET_USD |
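As an illustration of the budget trigger, the following sketch compares daily spending against `NABA_DAILY_BUDGET_USD` and formats a line in the style of the WARN example shown earlier. The function names are hypothetical, and how NabaOS itself tracks daily spending is not covered here. The sketch only shows the arithmetic.

```python
import os
from typing import Dict, Union


def daily_budget_from_env() -> float:
    """Read the budget in USD; raises KeyError if the variable is unset."""
    return float(os.environ["NABA_DAILY_BUDGET_USD"])


def budget_status(spent_usd: float, budget_usd: float) -> Dict[str, Union[float, bool]]:
    """Return the percentage consumed and whether spending exceeds the budget."""
    if budget_usd <= 0:
        raise ValueError("budget must be positive")
    return {
        "percent": spent_usd / budget_usd * 100,
        "exceeded": spent_usd > budget_usd,
    }


def format_budget_line(spent_usd: float, budget_usd: float) -> str:
    """Format a summary in the style of the WARN log example."""
    percent = budget_status(spent_usd, budget_usd)["percent"]
    return f"Daily budget {percent:.0f}% consumed (${spent_usd:.2f} / ${budget_usd:.2f})"


print(format_budget_line(8.20, 10.00))
print(budget_status(10.50, 10.00)["exceeded"])  # True
```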

Anomaly Detection

The agent builds a behavioral profile of normal usage patterns during a learning period (default: 24 hours). After the learning period, deviations trigger alerts.

Anomaly detection monitors:

| Signal | Normal | Anomalous |
|--------|--------|-----------|
| Request frequency | 5-20 requests/hour | 200+ requests/hour (possible automation abuse) |
| Query length | 10-500 characters | 5000+ characters (possible injection payload) |
| Domain distribution | Consistent with constitution | Sudden shift to out-of-domain topics |
| Time-of-day patterns | Active 9am-11pm | Burst at 3am (possible compromised token) |
| Cost per request | $0.00-0.01 avg | $5+ per request (possible exploitation) |

When the anomaly score crosses the threshold (default: 0.80), the agent:

  1. Sends a Telegram alert (if security bot is configured).
  2. Logs the event at WARN level.
  3. Continues processing (alerts are informational, not blocking by default).
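The scoring method itself is not described here, so the following is a hypothetical model of how a threshold check of this kind might look, not the agent's actual algorithm. It assumes fixed upper bounds taken from the "Normal" column above rather than a learned profile, an invented scoring function that maps how far a value exceeds its bound into the range 0 to 1, and that a score equal to the threshold counts as crossing it.

```python
from typing import Dict

THRESHOLD = 0.80  # default threshold from the text above

# Assumption: upper ends of the "Normal" ranges stand in for a learned baseline.
BASELINE_UPPER: Dict[str, float] = {
    "requests_per_hour": 20.0,
    "query_length_chars": 500.0,
    "cost_per_request_usd": 0.01,
}


def signal_score(observed: float, upper: float) -> float:
    """Map the relative excess over `upper` into [0, 1); 0 when within bounds."""
    if observed <= upper:
        return 0.0
    excess = (observed - upper) / upper
    return excess / (1.0 + excess)


def anomaly_score(observed: Dict[str, float]) -> float:
    """Overall score: the highest score among the observed signals."""
    return max(
        (signal_score(value, BASELINE_UPPER[name]) for name, value in observed.items()),
        default=0.0,
    )


def is_anomalous(observed: Dict[str, float], threshold: float = THRESHOLD) -> bool:
    """True when the score reaches the threshold (assumed inclusive)."""
    return anomaly_score(observed) >= threshold


print(anomaly_score({"requests_per_hour": 200.0}))                    # 0.9
print(is_anomalous({"requests_per_hour": 15.0, "query_length_chars": 120.0}))  # False
```

With these assumptions, 200 requests per hour scores 0.9 and would trigger an alert, while 25 requests per hour scores 0.2 and would not.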

Health Check Endpoint

When the web dashboard is running, a health endpoint is available:

curl -s http://localhost:8919/api/health

Expected response (HTTP 200):

{
    "status": "ok"
}

Use this endpoint for:

  • Docker health checks: test: ["CMD", "curl", "-sf", "http://localhost:8919/api/health"]
  • Load balancer probes: Point your ALB/Cloud Run health check at /api/health
  • Uptime monitoring: Ping from an external service (UptimeRobot, Pingdom, etc.)
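For scripted checks, such as waiting for the service to come up in a deployment pipeline, the endpoint can be polled from Python. This is a minimal sketch using the standard library. The function names, retry count, and delay are arbitrary choices, and the check assumes the response body matches the example shown above.

```python
import json
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:8919/api/health"


def is_healthy(status_code: int, body: bytes) -> bool:
    """True for HTTP 200 with a JSON body of {"status": "ok"}."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_once(url: str = HEALTH_URL, timeout: float = 3.0) -> bool:
    """Perform one health request; connection errors and non-2xx count as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_healthy(response.status, response.read())
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(
    url: str = HEALTH_URL, attempts: int = 10, delay: float = 2.0
) -> bool:
    """Poll until the endpoint reports healthy or the attempts run out."""
    for _ in range(attempts):
        if check_once(url):
            return True
        time.sleep(delay)
    return False
```

`wait_until_healthy()` returns `True` as soon as one probe succeeds, which makes it suitable as a gate before running smoke tests against a freshly started instance.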

Summary of Monitoring Commands

| Command | What it shows |
|---------|---------------|
| nabaos status | LLM spending, cache savings, token usage |
| nabaos admin cache stats | Cache entries and hit counts per tier |
| journalctl -u nabaos -f | Live log stream (systemd) |
| docker logs -f nabaos | Live log stream (Docker) |
| curl localhost:8919/api/health | Health check (web dashboard) |
| curl localhost:8919/api/dashboard | Full status with costs (web dashboard) |