How it works
Cruxwire is deliberately small: a single Python process serving the UI and an in-process scheduler running an ingestion pipeline against your own Ollama server. No build step, no external services, no database, just JSON on a volume.
The pipeline
By default the pipeline fires every two hours between 06:00 and 22:00, and once on start. Each run does the same six things.
Requests every feed, parses RSS/Atom, drops items older than your lookback_hours window and anything matching the blocklists, and de-dupes by URL.
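As a rough illustration of the filtering half of this step, here is a minimal sketch. The item shape (a dict with `url`, `title`, and a timezone-aware `published`), the `filter_items` name, and the choice to match blocklist terms against the title are all assumptions made for the example, not Cruxwire's actual code.

```python
from datetime import datetime, timedelta, timezone

def filter_items(items, lookback_hours, blocklist, now=None):
    """Drop stale or blocked items, then de-dupe by URL (first occurrence wins)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=lookback_hours)
    seen_urls = set()
    kept = []
    for item in items:
        if item["published"] < cutoff:
            continue  # older than the lookback window
        title = item["title"].lower()
        if any(term.lower() in title for term in blocklist):
            continue  # matches a blocklist term (assumed: substring of the title)
        if item["url"] in seen_urls:
            continue  # same URL already kept
        seen_urls.add(item["url"])
        kept.append(item)
    return kept
```

Filtering before de-duping keeps the set of seen URLs small and means a blocked or stale copy of a story never shadows a fresh one.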
Unread stories from the previous digest are carried into the new one, so a story you didn't get to doesn't vanish when its feed rotates it out. Read stories are dropped; stale ones are cut.
Each fresh article is scored 0–10 for relevance, given a 1–2 sentence summary and a category, and embedded, all via your local Ollama model. Carried stories reuse their score and only re-embed.
Articles about the same story across outlets are grouped by cosine similarity above your merge threshold. A story is boosted by how many independent sources cover it.
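One simple way to picture the grouping is a greedy pass that compares each article to the first member of every existing group. This is only a sketch under that assumption: the real clustering strategy, the `embedding` and `source` field names, and the helper names below are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(articles, merge_threshold):
    """Greedy grouping: join the first group whose lead article is similar enough."""
    groups = []
    for article in articles:
        for group in groups:
            if cosine(article["embedding"], group[0]["embedding"]) > merge_threshold:
                group.append(article)
                break
        else:
            groups.append([article])  # no match: start a new story group
    return groups

def n_sources(group):
    """Count distinct outlets in a group; this is the N used by the cross-source boost."""
    return len({article["source"] for article in group})
```

Counting distinct sources rather than articles means two pieces from the same outlet do not inflate a story's boost.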
The pool is pruned to a rank-weighted keep set held inside a floor/ceiling band, so the inbox never goes dry on a slow weekend or floods on a busy news day.
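The floor/ceiling idea can be sketched as a clamp on how many stories survive. The cut rule here (a hypothetical `min_rank` threshold) and the parameter names are assumptions for illustration; the actual rank-weighted selection may well differ.

```python
def retain(stories, min_rank, floor, ceiling):
    """Keep the best-ranked stories, never fewer than floor or more than ceiling.

    Assumes floor <= ceiling. If the pool itself is smaller than floor,
    everything is kept.
    """
    ranked = sorted(stories, key=lambda s: s["rank"], reverse=True)
    n_good = sum(1 for s in ranked if s["rank"] >= min_rank)
    keep = max(floor, min(ceiling, n_good))
    return ranked[:keep]
```

On a slow day the floor pulls in a few weaker stories; on a busy day the ceiling trims even good ones.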
The ranked, de-duplicated result is written atomically to digest.json. The frontend picks it up on the next load, with no restart and no downtime.
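A common way to get an atomic write in Python is to write a temporary file in the same directory and rename it over the target. The sketch below shows that pattern; the `write_json_atomic` name is hypothetical, and this is not necessarily how Cruxwire does it.

```python
import json
import os
import tempfile

def write_json_atomic(path, payload):
    """Write payload as JSON so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh)
            fh.flush()
            os.fsync(fh.fileno())  # push the bytes to disk before swapping
        os.replace(tmp_path, path)  # atomically replaces any existing file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # don't leave a half-written temp file behind
        raise
```

Because the swap is a single rename, a request for the digest that arrives mid-run never sees a partially written file.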
Ranking
A story's position blends three signals: the LLM's relevance score against your category interest descriptions, a cross-source boost for how widely it's covered, and your learned affinity for its source.
The cross-source boost is bounded, min(cap, k · log₂(N sources)), so a genuinely big story climbs without letting one noisy topic swamp the page.
# relevance from your local model (0–10)
score = llm_relevance(article, category.interest)
# bounded boost for multi-source coverage
boost = min(BOOST_CAP, BOOST_K * log2(n_sources))
# learned per-source affinity (0.5×–2.0×)
affinity = source_affinity(article.source)
rank = (score + boost) * affinity
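To see how the cap behaves, here is a runnable version of the formula with made-up constants. `BOOST_CAP = 2.0` and `BOOST_K = 1.0` are hypothetical values chosen for the arithmetic, not Cruxwire's defaults.

```python
from math import log2

BOOST_CAP = 2.0  # hypothetical cap, for illustration only
BOOST_K = 1.0    # hypothetical scale factor, for illustration only

def rank(score, n_sources, affinity):
    """Blend relevance, bounded cross-source boost, and source affinity."""
    boost = min(BOOST_CAP, BOOST_K * log2(n_sources))  # n_sources must be >= 1
    return (score + boost) * affinity

print(rank(6.0, 1, 1.0))   # 6.0: a single source earns no boost (log2(1) = 0)
print(rank(6.0, 4, 1.0))   # 8.0: log2(4) = 2 reaches the cap exactly
print(rank(6.0, 64, 1.0))  # 8.0: log2(64) = 6 is clipped to the cap
print(rank(6.0, 4, 0.5))   # 4.0: a low-affinity source halves the result
```

With these values, four outlets and sixty-four outlets produce the same rank, which is the point of the cap.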
Personalization
Two learned signals shape your feed. A per-source affinity multiplier moves with your opens, saves, and dismisses. An embedding-based "taste" vector boosts stories similar to ones you engage with and sinks ones near things you dismiss.
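As a sketch of how a per-source multiplier could move with engagement, the example below nudges it multiplicatively and clamps it to the 0.5×–2.0× range mentioned in the ranking pseudocode. The step sizes and the `update_affinity` name are hypothetical; the real learning rule is not described here.

```python
AFFINITY_MIN, AFFINITY_MAX = 0.5, 2.0  # range taken from the ranking pseudocode

# Hypothetical step sizes: opens and saves nudge up, dismisses nudge down.
STEP = {"open": 1.05, "save": 1.10, "dismiss": 0.90}

def update_affinity(current, event):
    """Apply one engagement event and clamp the result to the allowed range."""
    return max(AFFINITY_MIN, min(AFFINITY_MAX, current * STEP[event]))
```

The clamp keeps the multiplier bounded: no amount of dismissing pushes a source below 0.5×, and no amount of saving pushes it above 2.0×.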
Both are learned automatically and stored per device. Inbox hygiene never wipes them: your learned preferences persist independently of which stories are currently in the digest.
The interest sentences you write for each category feed the scoring prompt.
Read state, Read Later, History, and learned source stats persist to state.json on the cruxwire-data volume. Because the app code is baked into the image and your data lives on the volume, rebuilds preserve your data, and new settings keys fall back to defaults.
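Falling back to defaults for new keys can be as simple as layering the stored file over a defaults dict. In this sketch the `load_settings` name, the `merge_threshold` key, and the default values are assumptions for illustration; only `lookback_hours` is a name that appears in this page.

```python
import json

# Hypothetical defaults; the real keys and values may differ.
DEFAULTS = {"lookback_hours": 24, "merge_threshold": 0.8}

def load_settings(path, defaults=DEFAULTS):
    """Return stored settings layered over defaults; missing keys use the default."""
    try:
        with open(path, encoding="utf-8") as fh:
            stored = json.load(fh)
    except FileNotFoundError:
        stored = {}  # first run: nothing saved yet
    return {**defaults, **stored}
```

A settings file written by an older version simply lacks the new key, so the merged result picks up the default without any migration step.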
Read / Read Later / History sync through the server, so what you've cleared on your laptop is cleared on your phone. The app is fully usable offline from local cache, then reconciles when it reconnects.
Everything ships in one image. The only things outside the container are your Ollama endpoint and a local volume for state.
single container
│
├─ server.py    HTTP: UI, /digest.json, /state, /settings,
│               /feeds, /status, /refresh
│
├─ pipeline.py  background scheduler (cron-like):
│               fetch feeds → carry forward unread → score + embed
│               (Ollama) → cluster → retain → write digest.json
│
└─ /data volume → state.json, feeds.json, digest.json, settings.json
│
└─ HTTP → Ollama (OLLAMA_HOST)
server.py is the HTTP server. It serves the single-file UI, exposes the state / settings / feeds API, serves the current digest, and prunes state server-side.
pipeline.py is the ingestion pipeline and scheduler. It fetches, carries forward, scores, clusters, and retains, then atomically writes the new digest.
The frontend is a single vanilla-JS file with Home, Read Later, History, and Settings views. It applies your per-device affinity multiplier when ordering stories.
Every threshold, schedule, and retention band is documented and editable. The setup guide walks through getting it running and tuning it to your reading.