Typed payloads

Every source emits a Pydantic-typed payload at the extractor boundary — not loose HTML or a generic blob. This is what makes recall composition, per-consumer rendering, and the knowledge graph possible: structure is captured once and reused everywhere.

The payloads

The payloads form a discriminated union on kind, so a consumer can switch on one field:

Payload          kind        Source
TweetPayload     "x"         X (tweets, QRT chains, X-Articles)
RedditPayload    "reddit"    Reddit (post + comment tree)
WikiPayload      "wiki"      Wikipedia
WebPayload       "web"       Generic web articles
YouTubePayload   "youtube"   YouTube (metadata + transcript)
PDFPayload       "pdf"       PDFs
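
As a minimal sketch of how a discriminated union on kind can be declared and consumed with Pydantic v2: the class names and kind values come from the table above, but every other field (text, title, page_count), the payload_adapter name, and the describe helper are hypothetical and are not taken from models.py. Only three of the six payloads are shown.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter


class TweetPayload(BaseModel):
    kind: Literal["x"] = "x"
    text: str  # hypothetical field, for illustration only


class WikiPayload(BaseModel):
    kind: Literal["wiki"] = "wiki"
    title: str  # hypothetical field


class PDFPayload(BaseModel):
    kind: Literal["pdf"] = "pdf"
    page_count: int  # hypothetical field


# Pydantic reads "kind" first and validates against the matching model only.
Payload = Annotated[
    Union[TweetPayload, WikiPayload, PDFPayload],
    Field(discriminator="kind"),
]

# Hypothetical helper name: a reusable validator for the whole union.
payload_adapter = TypeAdapter(Payload)


def describe(payload: Union[TweetPayload, WikiPayload, PDFPayload]) -> str:
    """A consumer that switches on the single discriminator field."""
    if payload.kind == "x":
        return f"tweet: {payload.text}"
    if payload.kind == "wiki":
        return f"wiki: {payload.title}"
    return f"pdf: {payload.page_count} pages"


raw = {"kind": "wiki", "title": "Ada Lovelace"}
payload = payload_adapter.validate_python(raw)
print(type(payload).__name__, "->", describe(payload))
```

A dict with an unknown kind fails validation at the boundary, so a consumer never has to handle an untyped fallback.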

Shared primitives

Cross-platform building blocks are reused across payloads so the same concept has the same shape everywhere:

  • EngagementCounts — likes / reposts / replies / etc. (with a cross-platform net_score for Reddit/HN/SO-style up-minus-down)
  • UrlEntity — a resolved link with its display + expanded forms
  • CommentNode — a node in a recursive comment tree (Reddit threads nest these)
  • media attachments referenced from the payload
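
The primitives above might look roughly like the following sketch. Only the model names and net_score appear in this page; the remaining field names (likes, reposts, replies, display_url, expanded_url, author, body, engagement, children) and the count_comments helper are assumptions made for illustration, and the real definitions in models.py may differ.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class EngagementCounts(BaseModel):
    # Every counter is optional because not every platform reports each one.
    likes: Optional[int] = None
    reposts: Optional[int] = None
    replies: Optional[int] = None
    net_score: Optional[int] = None  # up-minus-down, Reddit/HN/SO style


class UrlEntity(BaseModel):
    display_url: str   # hypothetical name: the shortened form shown in text
    expanded_url: str  # hypothetical name: the resolved destination


class CommentNode(BaseModel):
    author: str  # hypothetical field
    body: str    # hypothetical field
    engagement: EngagementCounts = Field(default_factory=EngagementCounts)
    # Self-reference: each node holds its replies, giving a recursive tree.
    children: List["CommentNode"] = Field(default_factory=list)


# Resolve the "CommentNode" forward reference explicitly.
CommentNode.model_rebuild()


def count_comments(node: CommentNode) -> int:
    """Count a node plus all of its descendants."""
    return 1 + sum(count_comments(child) for child in node.children)


thread = CommentNode.model_validate(
    {
        "author": "alice",
        "body": "Top-level comment",
        "engagement": {"net_score": 12},
        "children": [
            {
                "author": "bob",
                "body": "A reply",
                "children": [{"author": "carol", "body": "A nested reply"}],
            }
        ],
    }
)
print(count_comments(thread))
```

Because the tree is validated recursively, a single model_validate call checks every nested reply against the same shape.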

X-specific structure

X carries richer structure than a flat tweet, so it has dedicated models:

  • XArticle — a long-form X Article, block-structured
  • XArticleBlock — an individual block (paragraph, heading, list, table, …)
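
To illustrate what block-structured means in practice, here is a hedged sketch in which an article is an ordered list of typed blocks and a renderer walks them. The model names come from the list above; the fields (type, text, items, level, title, blocks), the three block types chosen, and both helper functions are hypothetical, and tables are left out for brevity.

```python
from typing import List, Literal

from pydantic import BaseModel, Field


class XArticleBlock(BaseModel):
    # Hypothetical subset of block types; the real model also covers tables, etc.
    type: Literal["heading", "paragraph", "list"]
    text: str = ""                                  # used by heading/paragraph
    items: List[str] = Field(default_factory=list)  # used by list blocks
    level: int = 2                                  # heading depth


class XArticle(BaseModel):
    title: str
    blocks: List[XArticleBlock] = Field(default_factory=list)


def block_to_markdown(block: XArticleBlock) -> str:
    """Render one block; unknown shapes cannot reach here after validation."""
    if block.type == "heading":
        return "#" * block.level + " " + block.text
    if block.type == "list":
        return "\n".join(f"- {item}" for item in block.items)
    return block.text


def article_to_markdown(article: XArticle) -> str:
    """Render the title followed by each block, separated by blank lines."""
    parts = [f"# {article.title}"]
    parts.extend(block_to_markdown(block) for block in article.blocks)
    return "\n\n".join(parts)


article = XArticle(
    title="Typed payloads",
    blocks=[
        XArticleBlock(type="heading", text="Why"),
        XArticleBlock(type="paragraph", text="Structure is captured once."),
        XArticleBlock(type="list", items=["recall", "rendering"]),
    ],
)
print(article_to_markdown(article))
```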

Why typed

  • Recall composes an embed-text per source from the typed fields, not a raw dump (see Recall).
  • Rendering is deterministic from the payload — Markdown, JSON, or future HTML are all derived views (see Substrate overview).
  • The Obsidian plugin’s types.ts mirrors these models additively, staying in lock-step with the daemon.
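
The first two points can be sketched together: an embed-text composed from selected typed fields, and a JSON view derived from the same payload. This is only an illustration under stated assumptions. The title, author, and body fields, the compose_embed_text function, and its max_body_chars parameter are hypothetical and do not describe how Recall actually builds its text.

```python
from typing import Literal, Optional

from pydantic import BaseModel


class WebPayload(BaseModel):
    kind: Literal["web"] = "web"
    title: str                    # hypothetical field
    author: Optional[str] = None  # hypothetical field
    body: str                     # hypothetical field


def compose_embed_text(payload: WebPayload, max_body_chars: int = 200) -> str:
    """Build embed-text from chosen fields instead of dumping the raw page."""
    parts = [payload.title]
    if payload.author:
        parts.append(f"by {payload.author}")
    parts.append(payload.body[:max_body_chars])
    return "\n".join(parts)


payload = WebPayload(
    title="Typed payloads",
    author="Ann",
    body="Structure is captured once and reused everywhere.",
)

# Two derived views of one payload: text for embedding, JSON for transport.
print(compose_embed_text(payload))
print(payload.model_dump_json())
```

Both outputs are pure functions of the payload, so the same input always yields the same text and the same JSON.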

The authoritative definitions live in the daemon’s models.py; the design rationale is in ADR-0009.