Pipeline

Where every paper sits in the automated workflow. Each row is a stage; counts come from entities.db.schedule.
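A minimal sketch of how per-stage counts like the ones on this page could be computed. It assumes, without confirmation from this page, that entities.db is a SQLite file, that `schedule` is a table in it, and that the table has `stage` and `updated_at` columns. The column names, the `stage_counts` helper, and the demo rows are all hypothetical.

```python
import sqlite3

# Stage names in the order this page lists them.
STAGES = [
    "discovered", "triaged", "queued", "ingested",
    "refs-extracted", "graphified", "merged", "multica-surfaced",
]


def stage_counts(conn):
    """Return {stage: (row_count, latest_updated_at_or_None)} for every stage.

    Assumes a hypothetical `schedule` table with `stage` and `updated_at`
    columns; stages with no rows are reported as (0, None).
    """
    rows = conn.execute(
        "SELECT stage, COUNT(*), MAX(updated_at) FROM schedule GROUP BY stage"
    ).fetchall()
    found = {stage: (count, last) for stage, count, last in rows}
    return {stage: found.get(stage, (0, None)) for stage in STAGES}


def demo_connection():
    """Build an in-memory database with made-up rows for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE schedule (paper_id TEXT, stage TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO schedule VALUES (?, ?, ?)",
        [
            ("p1", "discovered", "2026-05-16T06:20:00"),
            ("p2", "discovered", "2026-05-16T06:27:00"),
            ("p3", "ingested", "2026-05-16T06:25:00"),
        ],
    )
    return conn


if __name__ == "__main__":
    for stage, (count, last) in stage_counts(demo_connection()).items():
        label = f"{count:,} done" if count else "no rows yet"
        print(f"{stage}: {label} (last: {last or 'n/a'})")
```

The sketch counts rows per stage, so a stage with no rows shows up as "no rows yet" rather than being omitted. Whether the real table holds one row per paper or one row per completed stage is not stated here.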

🔭

Discovered

stage='discovered'

Seen by paper_discovery / local-folder ingest

1,102 done
last: 5/16/2026, 6:27
⚖️

Triaged

stage='triaged'

Scored by paper_triage (regex + Haiku + composite)

no rows yet
last: n/a
📥

Queued

stage='queued'

suggested_action ∈ {read, integrate}

no rows yet
last: n/a
📄

Ingested

stage='ingested'

PDF downloaded + pdftotext + entity created

96 done
last: 5/16/2026, 6:27
🔗

Refs extracted

stage='refs-extracted'

Bibliography parsed → citations rows

30 done
last: 5/16/2026, 6:27
🕸️

Graphified

stage='graphified'

Concepts + relations extracted into entities

35 done
last: 5/16/2026, 6:27
🌐

Merged into 7-way

stage='merged'

Cross-corpus graph updated

no rows yet
last: n/a
📨

Surfaced to multica

stage='multica-surfaced'

Anton sees it in /pending

129 done
last: 5/16/2026, 6:27
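The Queued row's rule, suggested_action ∈ {read, integrate}, can be read as a simple membership test on the triage result. The sketch below is an illustration under assumptions: the `should_queue` helper, the dict shape of a triage result, and the `skip` value are hypothetical and not taken from paper_triage.

```python
# Actions that move a triaged paper into the queue, as listed in the Queued row.
QUEUE_ACTIONS = {"read", "integrate"}


def should_queue(triage_result):
    """Return True when the triage result's suggested_action is queueable.

    `triage_result` is a hypothetical dict; a missing key counts as not queued.
    """
    return triage_result.get("suggested_action") in QUEUE_ACTIONS


if __name__ == "__main__":
    samples = [
        {"paper": "a", "suggested_action": "read"},
        {"paper": "b", "suggested_action": "skip"},  # hypothetical non-queue value
        {"paper": "c"},
    ]
    for sample in samples:
        print(sample["paper"], should_queue(sample))
```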

Next to ingest

`triaged` → `ingested`

queue empty

Local reservoirs

Folders the pipeline can sweep on demand via ingest_local_folder.py.

synthesis/sources/papers/
15 PDFs · Levin-school sources (TAME, bioelectric, etc.)
synthesis/papers/cleaned/
14 Markdown files · pre-extracted, already ingested
alexander/papers/
28 PDFs + 29 extracted · Nature of Order Telegram shares
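The per-folder tallies above (PDFs, Markdown files) could be produced by a small file count like the one sketched here. This is not the contents of ingest_local_folder.py, whose behavior is not described on this page; the `summarize_folder` helper is hypothetical and only counts files directly inside one folder by extension.

```python
from collections import Counter
from pathlib import Path


def summarize_folder(folder):
    """Count the files directly inside `folder`, keyed by lowercase extension.

    Subfolders are not descended into; that choice is an assumption.
    """
    return Counter(
        path.suffix.lower()
        for path in Path(folder).iterdir()
        if path.is_file()
    )


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py <folder> (defaults to the current directory).
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for suffix, count in sorted(summarize_folder(target).items()):
        print(f"{count} {suffix or '(no extension)'}")
```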