The vault holds ~4,700 bookmarked tweets, each as a single markdown file with rich YAML frontmatter (author, handle, date, source URL, category, namespaced tags). Dewey exports come as CSVs that look nothing like the vault’s format. The script bridges the two.
## What it does
- Reads a Dewey CSV from `~/Downloads/`
- Builds a set of tweet IDs already present in the vault by scanning each note's `source:` field for `status/(\d+)`
- For every CSV row not already in the vault, generates a markdown file matching the vault schema:
  - Filename: `YYYY-MM-DD-handle-slug.md`
  - YAML frontmatter (author, handle, date, time, year, source, type, platform, category, topics, tags)
  - Body with attribution header, content, source link, and optional media URL
- Writes new files only; never modifies existing entries
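The dedup step above can be sketched as follows. This is a minimal illustration, not the script itself: the helper names (`existing_ids`, `slugify`, `note_filename`) and the slug rules are assumptions; only the `status/(\d+)` regex and the `YYYY-MM-DD-handle-slug.md` filename shape come from the description.

```python
import re
from pathlib import Path

# Regex from the description: tweet ID is the digits after "status/" in the source URL.
STATUS_RE = re.compile(r"status/(\d+)")

def existing_ids(vault_dir):
    """Collect tweet IDs already in the vault by scanning each note's
    source: frontmatter line for a status/<id> URL."""
    ids = set()
    for note in Path(vault_dir).glob("*.md"):
        for line in note.read_text(encoding="utf-8").splitlines():
            if line.startswith("source:"):
                m = STATUS_RE.search(line)
                if m:
                    ids.add(m.group(1))
                break  # only the frontmatter source: line matters
    return ids

def slugify(text, max_words=6):
    """Hypothetical slug helper: lowercase alphanumeric words joined by dashes."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words[:max_words])

def note_filename(date, handle, text):
    """Build the YYYY-MM-DD-handle-slug.md filename the vault schema expects."""
    return f"{date}-{handle}-{slugify(text)}.md"
```

CSV rows whose extracted ID is already in the set returned by `existing_ids` would simply be skipped, which is what makes reruns idempotent.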
## Notes
- Category is assigned by keyword matching against the vault’s existing taxonomy
- Topic tags use the vault's namespaced convention (`topic/X`)
- One run on a 46k-line CSV completes in seconds; the bottleneck is reading vault markdown for ID extraction
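The keyword-based category assignment might look something like the sketch below. The categories and keyword lists here are invented for illustration; the vault's actual taxonomy is not shown in this document.

```python
# Hypothetical taxonomy: the real vault categories and keywords are assumptions.
CATEGORY_KEYWORDS = {
    "ai-ml": ["llm", "gpt", "neural", "transformer"],
    "programming": ["python", "rust", "compiler", "debug"],
    "productivity": ["notes", "workflow", "habit"],
}

def assign_category(text, default="uncategorized"):
    """Return the first category whose keyword list matches the tweet text,
    falling back to a default when nothing matches."""
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(kw in lowered for kw in keywords):
            return category
    return default
```

First-match-wins keeps the logic predictable: dictionary order doubles as category priority, so more specific categories should be listed before broader ones.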