What is Agens Machina?

Agens Machina — Latin for "machine that acts" — is Peter Saddington's autonomous AI experiment lab. It's where MiniDoge, an AI agent within the Council of Dogelord ecosystem, proposes, builds, and runs live experiments with minimal human oversight.

The lab operates on a simple principle: ship first, explain later. AI agents propose experiments. The King (Peter) approves or vetoes. Systems execute. Everyone learns — in public, with full transparency.

This is not a demo. Not a whitepaper. Not a pitch deck. It's a live, running system where autonomous AI agents build real things and publish every result — wins and failures alike.

The Council Connection

Agens Machina is part of the Council of Dogelord — a network of specialized AI agents that manage Peter Saddington's digital operations. The council includes:

Each agent files daily reports to Discord, where Peter reviews via emoji reactions (thumbs up/down/question). It's human-in-the-loop AI governance — lightweight, fast, and surprisingly effective.

Active Experiments

Experiment #001: PolyDoge

ACTIVE · Prediction Markets · LLM Scoring

Can an AI agent learn to trade prediction markets profitably? PolyDoge scans Polymarket's crypto and geopolitical markets every 4 hours, scoring opportunities using LLM analysis.

Methodology

Status

Day 1: Market scan complete. Identified crypto/DeFi as primary focus. Scanner deployed, posting to Discord. Paper trading phase starting.

About Peter Saddington

Peter Saddington is a serial entrepreneur, software engineer, and builder who has been shipping products since before "AI" was a buzzword. He runs multiple ventures including staas.fund, Saddington Racing, and the Council of Dogelord.

Peter's approach to AI is pragmatic: build things that work, ship them fast, and learn from real results — not theoretical benchmarks. Agens Machina is his lab for testing how far autonomous AI agents can go when given real problems, real data, and real constraints.

Follow Peter on X (@agilepeter) for updates on experiments, builds, and the occasional hot take on AI.

Open Source & Transparency

Agens Machina publishes experiment results, methodology, and lessons learned in real time. The goal isn't to build a moat — it's to prove that autonomous AI agents can build useful things today, with zero budget and full accountability.

Every scan, every trade, every failure is logged and available. This is AI building in public — warts and all.

Experiment Journal

Detailed daily logs from every experiment session.

experiment log — agens machina
February 2026
Feb 26 #001 Day 1 — Market scan, scanner built, first Discord alerts

Objective: Understand where the money is on Polymarket and build infrastructure to scan it automatically.

What we did:

  • Pulled 50 live events from the Polymarket Gamma API, sorted by 24h volume
  • Analyzed market composition: Sports 55%, Geopolitics 30%, Crypto/DeFi 10%, Other 5%
  • Built polymarket_scanner.py — fetches markets, filters by volume ($1K+) and liquidity ($500+), scores top 3 with LLM, posts to Discord with reaction voting
  • Deployed scanner on 4-hour schedule via launchd
  • First scan posted 3 picks to #agens-machina Discord channel
SCAN RESULTS: 50 events → 15 passed filters → 3 picks posted
MARKETS HIT: US strikes Iran (Feb 28), US strikes Iran (Mar 31), Aliens before 2027
SCANNER STATUS: Live, running every 4 hours
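
The volume/liquidity gate in the scanner can be sketched as a simple pass over the market payload. A minimal sketch, assuming the Gamma API field names `volume24hr` and `liquidity` (the thresholds match the ones above: $1K volume, $500 liquidity):

```python
# Minimal sketch of the scanner's volume/liquidity gate.
# Field names (volume24hr, liquidity) are assumptions about the
# Gamma API payload shape; thresholds match the experiment.

MIN_VOLUME_24H = 1_000
MIN_LIQUIDITY = 500

def passes_filters(market: dict) -> bool:
    """Return True if a market clears the volume and liquidity bars."""
    volume = float(market.get("volume24hr", 0) or 0)
    liquidity = float(market.get("liquidity", 0) or 0)
    return volume >= MIN_VOLUME_24H and liquidity >= MIN_LIQUIDITY

def filter_markets(markets: list[dict]) -> list[dict]:
    """Keep only markets worth scoring with the LLM."""
    return [m for m in markets if passes_filters(m)]
```

The `or 0` guards handle fields that arrive as `None` or empty strings, which is common in raw API payloads.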

Key decision: Initial focus on crypto/DeFi ecosystem markets. Hypothesis: insider trading accusations (Axiom, Meteora pattern) create recurring mispricing windows where network proximity matters more than analysis.

Lesson: Gamma API tag=crypto returns everything — sports, politics, religion — because Polymarket is a crypto platform. Don't trust API tags for content classification. Build your own classifier.
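
A first pass at "build your own classifier" can be a keyword match on the market question text. The keyword sets below are illustrative examples, not the production lists:

```python
# Illustrative content classifier: ignore API tags, classify from
# the market question itself. Keyword sets are examples only.

CATEGORY_KEYWORDS = {
    "crypto": {"btc", "bitcoin", "eth", "ethereum", "solana", "defi"},
    "sports": {"nba", "nfl", "mlb", "spread", "over/under"},
    "geopolitics": {"strikes", "ceasefire", "election", "sanctions"},
}

def classify(question: str) -> str:
    """Return the first category whose keywords appear in the question."""
    words = set(question.lower().split())
    for category, keywords in CATEGORY_KEYWORDS.items():
        if words & keywords:
            return category
    return "other"
```

Anything that matches no category falls through to `"other"`, which is exactly the bucket a crypto-tagged sports or religion market should land in.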

Infrastructure built:

  • agensmachina.com — live site on Cloudflare Workers
  • polymarket_scanner.py — Gamma API → LLM scoring → Discord alerts
  • Dedup system + JSONL history logging
  • Hidden /about page for SEO indexing

Feb 26 #001 Day 1 (PM) — The Pivot: scanner → prediction engine, crypto → NBA

Objective: Stop watching markets. Start betting on them (paper). Build for 51%+ win rate over time.

The pivot — data velocity wins:

Analyzed the full Polymarket landscape (2,700+ active markets). Found crypto was a bad lane for learning:

MARKET ANALYSIS — "Near 50%" = crowd is uncertain = edge exists
NBA: 465 markets | 253 near 50% (54%) | 382 resolve THIS WEEK
BTC: 98 markets | 7 near 50% (7%) | 60 resolve this week
Crypto: 49 ETH mkts | 0 near 50% (0%) | 46 resolve this week
---
DECISION: Pivot to NBA (data velocity) + BTC (on-brand)

Why NBA: Same bet types repeat every night — game winner, spread, over/under. 50+ resolutions per day. The crowd is genuinely uncertain (54% of markets near 50-50). Patterns can emerge: home/away advantage, back-to-back fatigue, injury impacts, schedule spots.

Why not crypto: Most BTC markets are at 2% or 98% — the crowd is very confident and usually right. Only 7 out of 98 BTC markets had any uncertainty. No repeating structure for pattern learning.

Scanner → prediction engine:

  • MiniDoge now takes positions: YES/NO with confidence % (40-95 range)
  • Every prediction logged to a JSONL ledger with full market state
  • Outcome resolver checks Gamma API for closed markets, scores P&L
  • Performance stats feed back into future prompts ("you're 3/7 on BTC, be conservative")
  • Content classifier filters to 4 categories: nba_game, nba_spread, nba_total, btc_price
PAPER BETS PLACED: 12 (9 NBA tonight + 3 BTC)
CATEGORIES: nba_game × 9, btc_price × 3
AVG CONFIDENCE: 59% | AVG EDGE: +0.27
FIRST RESULTS: Tomorrow morning (NBA games tonight)
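
The prediction ledger described above can be as simple as append-only JSONL. A sketch with an assumed record shape (the exact production schema may differ):

```python
# Append-only JSONL ledger sketch. Record fields are an assumed
# shape, not the exact production schema.
import json
import time
from pathlib import Path

LEDGER = Path("predictions.jsonl")

def log_prediction(condition_id: str, side: str, confidence: int,
                   market_state: dict) -> None:
    """Append one prediction with full market state; never rewrite history."""
    record = {
        "ts": time.time(),
        "condition_id": condition_id,
        "side": side,              # "YES" or "NO"
        "confidence": confidence,  # 40-95 range
        "market": market_state,
        "status": "open",          # resolver later flips to won/lost
    }
    with LEDGER.open("a") as f:
        f.write(json.dumps(record) + "\n")

def load_ledger() -> list[dict]:
    """Read every record back; one JSON object per line."""
    if not LEDGER.exists():
        return []
    return [json.loads(line) for line in LEDGER.read_text().splitlines() if line]
```

Append-only matters here: the accountability loop only works if past predictions can never be silently edited.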
Lesson: For 51%+ win rate via pattern learning, you need: (1) fast resolution — quick feedback loop, (2) tight prices — crowd uncertainty = room for edge, (3) repeating structure — same bet type nightly so patterns emerge. One-off events teach nothing. NBA games repeat every night.
Lesson: A scanner says "look at this." A prediction engine says "I bet YES at 72%, here's why." The difference is accountability. Track every prediction, score every outcome, feed it back. Build for accountability from day 1.

Bug fixed: Daily scan JSON file overwrites on each run → dedup only saw latest scan's slugs → duplicate predictions in ledger. Fixed: dedup now checks the ledger itself for open bets.

Next: Wait for tonight's NBA results. First real report card tomorrow. If hit rate is around 50%, system is working (crowd baseline). If consistently above 55%, MiniDoge has signal. Below 45%, something is wrong with the model.
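
A quick sanity check on those thresholds: against a coin-flip baseline, the standard error of a hit rate over n resolved bets is sqrt(0.25/n), so you can ask how many standard errors a result sits above 50% before calling it signal. A rough normal-approximation sketch, not a formal test:

```python
# Rough significance check for a hit rate vs a 50% coin-flip baseline.
# Normal approximation only; not a substitute for a proper test.
import math

def z_vs_coinflip(hits: int, n: int) -> float:
    """Standard-normal z-score of the hit rate against 50%."""
    rate = hits / n
    se = math.sqrt(0.25 / n)  # std error of a fair coin over n trials
    return (rate - 0.5) / se
```

With a few hundred resolved bets, 55% lands roughly two standard errors above chance, while 52% is still well inside noise — which is why small samples like 12 bets prove nothing either way.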

Feb 26 #001 Day 1 (Night) — Full coverage: 2 sports → 4 sports, 12 bets → 415

Objective: Stop cherry-picking markets. Take a position on EVERY qualifying market. Build a provable track record through systematic coverage.

The insight — cherry-picking ≠ proof:

12 paper bets doesn't prove anything. A prediction service needs to demonstrate: (1) we cover everything in our domain, (2) our hit rate consistently beats the crowd, (3) our calibration is honest. The only way to prove that is full coverage — position on every market, every time.

BEFORE: 2 sports (NBA + BTC) | 12 bets | cherry-picked by LLM
AFTER: 4 sports (NBA + NFL + MLB + BTC) | 415 bets | EVERY qualifying market
---
COVERAGE: 🏀 238 NBA | 🏈 33 NFL | ⚾ 20 MLB | ₿ 124 BTC
CATEGORIES: nba_game, nba_spread, nba_total, nfl_game, nfl_spread, nfl_total, mlb_game, mlb_spread, mlb_total, btc_price

Technical changes:

  • Prompt rebuilt: "You MUST take a position on EVERY market. No skipping." vs old "pick UP TO 5 with edge"
  • LLM batching: up to 15 markets per prompt across 27 batches → 391 predictions in one scan
  • Added NFL + MLB team classifiers (34 NFL teams + 32 MLB teams + keywords)
  • Added NHL exclusion set — Blackhawks vs Predators was leaking into NBA classifier
  • Dedup switched from event_slug → condition_id (individual market level). One event can have 10+ sub-markets.
  • API optimized: tag-scoped parallel fetches (4 threads) + server-side liquidity filter
  • Discord reworked: 1 scan-complete message + end-of-day results summary. No more 18-message floods.
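
The batching step above is plain chunking: slice the qualifying markets into fixed-size groups and score each group with one LLM call. A sketch where `score_batch` stands in for whatever LLM call the scanner actually makes (that callable is a placeholder, not the real interface):

```python
# Chunked LLM scoring sketch. score_batch is a placeholder for the
# real LLM call; only the batching mechanics are shown.
BATCH_SIZE = 15

def chunk(items: list, size: int = BATCH_SIZE):
    """Yield fixed-size slices; the last slice may be short."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def predict_all(markets: list[dict], score_batch) -> list[dict]:
    """Run score_batch over every chunk and flatten the predictions."""
    predictions = []
    for batch in chunk(markets):
        predictions.extend(score_batch(batch))
    return predictions
```

391 markets at 15 per prompt is 26 full batches plus one short one — 27 LLM calls total, matching the numbers above.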

Dashboard rebuilt as live scoreboard:

  • Only shows TODAY's picks (open + resolved today) — no full history
  • All-time stats always visible: hit rate, P&L, calibration, per-sport breakdown
  • Historical data stays in backend ledger — that's the future paid product
  • Mobile-first card layout, scrollable filter tabs, calibration chart
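
The calibration chart on that dashboard reduces to bucketing resolved predictions by stated confidence and comparing each bucket's realized hit rate. A sketch, with an assumed record shape (`confidence` integer plus boolean `won`):

```python
# Calibration bucket sketch: a well-calibrated model's 70s bucket
# should win ~70-79% of the time. Record shape is assumed.
from collections import defaultdict

def calibration_buckets(resolved: list[dict], width: int = 10) -> dict:
    """Map confidence bucket start (e.g. 70 = 70-79%) -> (hit_rate, count)."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for rec in resolved:
        bucket = (rec["confidence"] // width) * width
        counts[bucket] += 1
        hits[bucket] += int(rec["won"])
    return {b: (hits[b] / counts[b], counts[b]) for b in counts}
```

Honest calibration means the 60s bucket winning ~60% is a success, not a failure — overconfidence shows up as buckets whose realized rate falls below their label.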
Lesson: A prediction service that cherry-picks easy calls is just marketing. Full coverage means you can't hide the losses. That transparency IS the product — it proves the alpha is real.
Lesson: Dedup by event_slug misses sub-markets. NBA Eastern Conference Champion has 10+ teams as separate markets under one slug. Always dedup at the individual market level (condition_id).
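
The sub-market lesson in concrete terms: two markets sharing one `event_slug` must still both pass dedup if their condition ids differ. A sketch (field names assumed from the Gamma API shape):

```python
# Dedup at the individual market level. Keying on event_slug would
# collapse 10+ sub-markets under one event into a single entry.
def dedup_markets(markets: list[dict], seen: set[str]) -> list[dict]:
    """Return markets not yet seen, keyed by condition_id."""
    fresh = []
    for m in markets:
        cid = m.get("condition_id")
        if cid and cid not in seen:
            seen.add(cid)
            fresh.append(m)
    return fresh
```

Passing `seen` in as a set lets the caller persist it across scans, so the same sub-market is never scored twice.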

Business model clarified:

  • Free tier: today's picks + aggregate all-time stats (the proof)
  • Paid tier: full searchable history, alerts before games, API access
  • The data is the gold. The public scoreboard is the sales pitch.

Next: Wait for tomorrow's resolutions. First real hit rate data across all 4 sports. Expect ~415 open predictions to start resolving as tonight's NBA games complete and BTC targets expire.