A small experiment in pulling something useful out of Goodreads without depending on Goodreads’s own UI. The result lives at /books/ and has four tabs: Recent (last finished), Queue (to-read), Recommended (algorithmic), and Saved (manually bookmarked from the recommended pool).
How it’s built
A zero-dependency Node script paginates three Goodreads shelves — read, to-read, currently-reading — over RSS, and writes a single books.json file at build time. The page itself is a static Astro page that reads that JSON. No client-side calls to Goodreads, no server, no database. The script runs in prebuild and predev, so each deploy and each local run starts from a fresh shelf snapshot.
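A minimal sketch of the pagination loop, under stated assumptions: `fetchPage` is a hypothetical helper that downloads and parses one RSS page of a shelf, and an empty page is taken to mean the shelf is exhausted. The real feed URL and parsing are not shown here.

```javascript
// Sketch only. fetchPage(shelf, page) is a hypothetical helper returning the
// parsed items of one RSS page; an empty array is assumed to mean "no more pages".
async function collectShelf(shelf, fetchPage, maxPages = 50) {
  const items = [];
  for (let page = 1; page <= maxPages; page++) {
    const batch = await fetchPage(shelf, page);
    if (batch.length === 0) break; // ran past the last page
    items.push(...batch);
  }
  return items;
}

// Walk the three shelves in turn and return one snapshot object.
async function buildSnapshot(fetchPage) {
  const shelves = ["read", "to-read", "currently-reading"];
  const snapshot = {};
  for (const shelf of shelves) {
    snapshot[shelf] = await collectShelf(shelf, fetchPage);
  }
  return snapshot; // a build script would JSON.stringify this into books.json
}
```

Passing the fetcher in as a parameter keeps the loop testable without network access; `maxPages` is a safety cap so a misbehaving feed cannot stall the build.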
The Recommended tab
This is the part that took most of the time and holds most of the interest.
Recommendations are not generated by Goodreads — they are computed locally from Open Library subject data. The seeding logic:
- 25 books are sampled from the full read history, preferring 5★ over 4★, deduped by author
- Subject frequency across those seeds drives the candidate pool
- Broad-bucket subjects (fantasy, fiction, juvenile, etc.) are blacklisted so the recommender doesn’t reach for centuries-old classics
- Year floor is 1990+
- Cross-references exclude anything already on any shelf, including stripping series suffixes like (Harry Potter, #5) before normalising titles
- A BANNED_AUTHORS / BANNED_TITLE_FRAGMENTS deny-list at the top of the script provides a hard filter for things that shouldn’t surface regardless
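The filtering steps in the list above could look roughly like this. It is a sketch, not the script itself: the deny-list entries, the `BROAD_SUBJECTS` set, the candidate shape (`title`, `author`, `year`, `subjects`) and the exact suffix regex are all assumptions made for illustration.

```javascript
// Hypothetical entries; the real deny-lists are not reproduced here.
const BANNED_AUTHORS = ["example author"];
const BANNED_TITLE_FRAGMENTS = ["box set"];
const BROAD_SUBJECTS = new Set(["fantasy", "fiction", "juvenile"]);
const YEAR_FLOOR = 1990;

// Strip a trailing series suffix such as "(Harry Potter, #5)", then
// lowercase and collapse punctuation so shelf and candidate titles compare equal.
function normaliseTitle(title) {
  return title
    .replace(/\s*\([^()]*#\d+(?:\.\d+)?[^()]*\)\s*$/, "")
    .toLowerCase()
    .replace(/[^\p{L}\p{N}]+/gu, " ")
    .trim();
}

// Count how often each non-broad subject appears across the seed books.
function subjectCounts(seeds) {
  const counts = new Map();
  for (const seed of seeds) {
    for (const raw of seed.subjects) {
      const subject = raw.toLowerCase();
      if (BROAD_SUBJECTS.has(subject)) continue; // skip broad buckets
      counts.set(subject, (counts.get(subject) || 0) + 1);
    }
  }
  return counts;
}

// shelfTitles: a Set of already-normalised titles from every shelf.
function isAllowed(candidate, shelfTitles) {
  const title = normaliseTitle(candidate.title);
  if (candidate.year < YEAR_FLOOR) return false;
  if (shelfTitles.has(title)) return false;
  if (BANNED_AUTHORS.includes(candidate.author.toLowerCase())) return false;
  return !BANNED_TITLE_FRAGMENTS.some((frag) => title.includes(frag));
}
```

Normalising both sides before comparing is what lets a shelf entry with a series suffix match an Open Library title without one.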
The top 5 picks are pinned. The next 60 candidates are shuffled with a daily-seeded mulberry32 PRNG and 19 are taken, so the list is stable within a calendar day and rotates each day. The discipline is the same as the music discovery tool: a transparent corpus plus a transparent filter that I can read in source is friendlier than an opaque recommender, even when the filter is dumber.
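The pin-and-shuffle step can be sketched as follows. `mulberry32` is the commonly circulated 32-bit generator; the seed derivation (the UTC date packed as YYYYMMDD) and the Fisher-Yates shuffle are assumptions, since only the PRNG and the counts are stated above.

```javascript
// mulberry32: a small seeded PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Assumption: the daily seed is the UTC date packed as YYYYMMDD.
function dailySeed(date) {
  return (
    date.getUTCFullYear() * 10000 +
    (date.getUTCMonth() + 1) * 100 +
    date.getUTCDate()
  );
}

// ranked: candidates sorted best-first. Pin the top 5, shuffle the next 60
// with the day's seed, and keep the first 19 of the shuffled pool.
function dailyPicks(ranked, date, pinned = 5, poolSize = 60, take = 19) {
  const top = ranked.slice(0, pinned);
  const pool = ranked.slice(pinned, pinned + poolSize);
  const rand = mulberry32(dailySeed(date));
  // Fisher-Yates shuffle driven by the seeded PRNG.
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return [...top, ...pool.slice(0, take)];
}
```

Because the seed depends only on the date, any two calls on the same UTC day return the same list, which fits a page that is rebuilt once a day.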
Saved tab
localStorage-backed (key pixelbrix:books:saved). A + save button on Recommended cards adds; a − remove button on Saved cards removes. The tab also offers a CSV export (UTF-8 with BOM) and a clear-list action. It’s unsynchronised across devices on purpose — the saved list is a per-browser working set, not a permanent collection.
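A rough sketch of the Saved tab's storage and export helpers. The storage key and the BOM come from the description above; the stored shape (a JSON array of objects with `id`, `title` and `author`), the CSV columns and the function names are assumptions.

```javascript
const SAVED_KEY = "pixelbrix:books:saved";

// `storage` is anything with getItem/setItem, e.g. window.localStorage.
function loadSaved(storage) {
  try {
    const parsed = JSON.parse(storage.getItem(SAVED_KEY) || "[]");
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return []; // corrupt JSON: start from an empty list
  }
}

function saveBook(storage, book) {
  const saved = loadSaved(storage);
  if (!saved.some((b) => b.id === book.id)) saved.push(book); // no duplicates
  storage.setItem(SAVED_KEY, JSON.stringify(saved));
}

function removeBook(storage, id) {
  const saved = loadSaved(storage).filter((b) => b.id !== id);
  storage.setItem(SAVED_KEY, JSON.stringify(saved));
}

// Quote a field only when it contains a quote, comma, or line break.
function csvField(value) {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// The leading U+FEFF is the byte-order mark, written out once the string is
// encoded as UTF-8.
function toCsv(books) {
  const rows = [["title", "author"], ...books.map((b) => [b.title, b.author])];
  return "\uFEFF" + rows.map((r) => r.map(csvField).join(",")).join("\r\n");
}
```

In the browser the helpers would be called with `window.localStorage`; taking the storage object as a parameter means the same code runs against a stub in tests.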
Tooltips and descriptions
Recent and Queue carry their <book_description> straight from the Goodreads RSS payload. Recommendations get one extra work.json fetch each from Open Library at build time to populate descriptions. On hover (~280ms delay), a cursor-following fixed-position popup shows title, author, meta, and a 320-character description. Disabled on touch and on screens under 600px.
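The two tooltip rules that can be written as pure functions are the 320-character cut and the touch/width gate. The sketch below assumes descriptions may contain HTML tags, that truncation backs up to a word boundary and appends an ellipsis, and that touch support and viewport width are passed in by the caller; none of those details are confirmed above.

```javascript
const MAX_DESCRIPTION = 320;

// Strip tags, collapse whitespace, then cut to at most `max` characters,
// backing up to the last space so a word is not split.
function truncateDescription(text, max = MAX_DESCRIPTION) {
  const clean = text.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
  if (clean.length <= max) return clean;
  const cut = clean.slice(0, max - 1); // leave room for the ellipsis
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + "…";
}

// Tooltips are off on touch devices and on viewports narrower than 600px.
function tooltipsEnabled({ hasTouch, viewportWidth }) {
  return !hasTouch && viewportWidth >= 600;
}
```

Truncating at build time rather than on hover would keep the popup code to positioning and timing only.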
Daily rebuild
A GitHub Actions workflow pings a Cloudflare Pages deploy hook daily at 06:17 UTC, so the shelves and the recommendation shuffle refresh without a manual push. The Recent tab stays current with whatever I marked finished overnight; the daily-seeded recommended list rotates predictably.
What it actually surfaces
The interesting empirical observation is that the Recommended tab is more useful at scale than I expected, and most of the credit goes to the deny-lists. Without the broad-bucket subject filter, the candidate pool collapses to public-domain classics whose subjects (“fiction”, “novel”) trivially intersect with everyone’s seed set. With the filter, the pool sits in the post-1990 long tail where the actual signal lives.
The other observation is that what’s on the queue is not what gets read. The Queue tab is interesting precisely because the act of recording an intention to read something turns out, in the data, not to predict reading it. Months later I’ll pick up a Recommended that wasn’t on the Queue, while a Queue item from a year ago sits unmoved.