Toricelli
Tags
- :draft:
A tool to turn a corpus into a feed.
Why toricelli?
At several points in recent history, Tumblr users have felt the need to create a howto for effectively using Tumblr. Universal among these guides is the recommendation to switch your feed to chronological ordering. Why does this work? Why does it outperform Tumblr's best guess at what users want to see? One good answer is that the semantics of a chronological feed are legible to the user. A lot of power comes from knowing why one post follows another - including the ability to curate what you see by messing with who you follow and what content you block. It removes the algorithm as a confounding factor. I think the variable reward circuit in the human learning machine also stabilizes, and stops firing, once it has actually learned a pattern. Hypothesis: gambling is pareidolia.
For some time I've been grappling with the problem of returning to things in my giant-ass notebook. I used to bookmark into here; I also used to catch stray thoughts from inside pomodoros into here; talk through sensitive mental problems with myself in here; draft all my academic output, and most of my creative writing output, in here. I've used it for project management, one-off task tracking, collating records of important email threads, running quantified self experiments...
And gradually, this tapered off. I've spent a long time wondering why, but a while back I figured it out. Once the notebook got too big, all my guesses at how it was organized became too wrong to be useful. I couldn't navigate it anymore.
I think this is a fairly challenging problem. An exobrain isn't meant to be a dead thing. It's meant to work with the brain. The breadth of the interface needed to make such a thing useful means that it needs to be, in some nascent way, alive in its own right. As they say about hairstyles, it needs movement.
I spent some years meditating on this; my best lead for a long while felt like YOW! 2014 Edward Kmett - Stop Treading Water: Learning to Learn #YOW - YouTube, where the presenter talks about how his Github projects organically arrange his work into a spaced-repetition-style recall exercise. There's something about how humans interface with each other about high-density subjects that rhymes with how I wanted the interface with my notebook to work. But spaced repetition wasn't quite right - an SRS deck grows, but doesn't really have a rich theory of change. Is that right? It's not quite right. Say rather that it expands, but doesn't quite accrete. There's something disconcerting about how an uncurated SRS deck flings you from subject to subject. It feels like it burns through a line of credit that my mind is extending it - and yeah, upon examination, that is what it's doing! It's burning through my capacity and willingness to switch context, but it often isn't rewarding me with anything for doing it. An SRS card is a recall task, and recall tasks are often harder than they are enjoyable - especially early in the process. The hedonics of an SRS deck are off. This isn't a good motivation curve.
Let's try to make it up a little bit. I want context to my SRS deck. The key is to notice that a deck is a form of programmable attention. In fact, it's the classic example - a feed. It augments and mediates your attention via an algorithm.
I realized what I needed was something like spaced repetition in two ways:
- it takes in a list of items I want to look at and outputs a subset of them to look at now. In other words, a paginated feed.
- it lets me fuzzy-defer items - effectively, it lets me say: show me this again soon, or not so soon, or quite late. In other words, a todo list.
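As a sketch of those two behaviours together - a due-date heap that pages out a subset and accepts fuzzy deferrals - here's a minimal toy. All names here (`Item`, `Feed`, `page`, `defer`) are mine for illustration, not anything Toricelli actually exposes:

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Item:
    due: float                      # timestamp at which the item should resurface
    name: str = field(compare=False)

class Feed:
    def __init__(self, items):
        self._heap = list(items)
        heapq.heapify(self._heap)   # min-heap ordered by due time

    def page(self, now, size=10):
        """Return up to `size` items due at or before `now` - one page of the feed."""
        out = []
        while self._heap and self._heap[0].due <= now and len(out) < size:
            out.append(heapq.heappop(self._heap))
        return out

    def defer(self, item, now, fuzz):
        """Re-enqueue an item: small `fuzz` means 'soon', large means 'quite late'."""
        heapq.heappush(self._heap, Item(now + fuzz, item.name))
```

The heap is doing the todo-list half; the `size` cap on `page` is doing the feed half.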
I wanted a few other things, which needed to live inside the feed idiom:
- The structure of my notes themselves needed to shape what was happening somehow. Their relations and correlations ought to inform what I'm looking at, in the way web browsing shapes what I look at.
- What I began with in the first place - my notes needed to grow. I wanted to develop things I found in here over time; link, expand, deepen; solve problems, correct misconceptions, learn and document nuance. Recall is all well and good, but I care more about doing and logging work I can return to. I memorise everything worth memorising near-automatically, so my real challenge was always going to be finding or making things I felt were worth memorising.
When I look at a note again, I usually want to change it, read it, or browse it. Either it's done, and I get to remember some beautiful work I've done or thing I've found - ideally, for an annotated bookmark, both; or it's not done, or stale, or missing connections to things, and I want to change the note; or, whatever its doneness, I care more right now that it links to or reminds me of something that's caught my attention. What I want is a collection of entry points into my notes with the right movement to flow with these workflows.
SRS ideas cover reading and changing, but they don't really cover browsing. For this, I decided to run PageRank on the resulting scores.
A few months of usage in, I realized I was very very bored with how repetitious the results were. While my goal of surfacing unexpected 2nd-degree-and-up connections was well served, clusters of related nodes tended to show up near each other, and standalones were completely shafted. I couldn't figure out a way to mitigate this - it's too deeply part of the point of PageRank. The algorithm assumes that all the entropy comes from the keyword search - the sheer number of possible results. It needs to extract all the order it can get out of the links. I realized that I needed almost the inverse behaviour; to invert the standard PageRank score modifications, so that nodes that were too closely related didn't all show up - in fact, I needed them to compete, and to have a clear winner that changed often within a cluster. It is nighttime and I am still mulling over the nicest way to represent this.
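One way to get that compete-for-a-winner behaviour is greedy neighbour suppression: walk the ranked list, and once a node is picked, downrank anything directly linked to it. This is my sketch of the idea, not Toricelli's settled algorithm, and the `penalty` constant is arbitrary:

```python
def diversify(ranked, neighbours, penalty=0.1):
    """ranked: list of (node, score) pairs, best first.
    neighbours: node -> set of directly linked nodes."""
    picked, suppressed = [], set()
    for node, score in ranked:
        if node in suppressed:
            score *= penalty            # this cluster already has a winner
        picked.append((node, score))
        suppressed.update(neighbours.get(node, set()))
    # re-sort by the adjusted scores
    return sorted(picked, key=lambda p: -p[1])
```

Because the suppression depends on what happened to rank highest this time, the within-cluster winner naturally rotates as underlying scores shift.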
One thing this problem does is clarify my goals for the feed. At present, I get to maybe two real entry points (deduplicating nodes that are too closely related) in a day. But if it's a good day, that number jumps to more like twenty. On days where I am carrying over context from an already open thread, or where I have some other heavy work going on, it drops to zero or one. All of this is fine. The real goal is tastiness, enough tastiness to make the harder days still intuitively worthwhile. I want the opportunities inside the Toricelli feed to feel good, good enough to make me want the next one.
So what I want is for the top ten results to have at least three things that are interesting enough to suck me in, the way any feed worth its salt does.
background - SRS algorithms
Consider a value R, the probability of recall - i.e. the probability that you pass the test a card gives you. SRS axioms are as follows:
- R for a given piece of information can be modeled by a decay function over time, unless acted upon by a recall test. This decay function is called the forgetting curve.
- If you pass a recall test at time t:
  - R jumps to 1.
  - R from t onwards decays more slowly.
- If you fail a recall test at time t:
  - R still jumps, but only to some value R_{reminded} < 1. (The test will remind you of the card.)
  - R's rate of decay reverts to its initial value k_0.
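The axioms above can be encoded directly. The exponential shape and every constant here (k_0, R_{reminded}, the 0.7 slowdown factor) are illustrative choices of mine, not claims about the true forgetting curve:

```python
import math

K0 = 0.5          # initial decay rate k_0 (assumed value)
R_REMINDED = 0.6  # R after a failed test (assumed value)

class Card:
    def __init__(self):
        self.k = K0       # current decay rate
        self.r0 = 1.0     # R at the last event
        self.t0 = 0.0     # time of the last event

    def recall(self, t):
        """R decays from its last jump unless acted upon by a test."""
        return self.r0 * math.exp(-self.k * (t - self.t0))

    def test(self, t, passed):
        if passed:
            self.r0, self.k = 1.0, self.k * 0.7   # R jumps to 1, decay slows
        else:
            self.r0, self.k = R_REMINDED, K0      # partial jump, decay resets
        self.t0 = t
```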
This is a control problem. Treat each test as costing some amount c. You want to maximise the value of R over time, for which your primary tool is running tests; and minimise the total cost spent on tests, for which your primary tool is making R decay more slowly - that is, maximising the chance that the user passes tests. You could say you want tests_{passed} / tests_{total} to converge to some high value as fast as possible.
So, the algorithm needs to:
- guess the shape of the forgetting curve d: \text{Time} \to R based on all available data / metadata about a card, including its test history.
- decide when to run the next test (after interval n).
- when the test is run, update both d and n.
SM-2 implements this as two recurrence relations:
- n_i = n_{i-1} \times EF_{i-1}, where EF_i is a value derived from test history and updated with every test.
  - The first two values are fixed: n_1 = 1, n_2 = 6.
- EF_1 = 2.5
- EF_i = EF_{i-1} + (0.1 - (5 - q) \times (0.08 + (5 - q) \times 0.02)), where q is a score between 0 and 5 for each test.
  - This simplifies to EF_i = EF_{i-1} - 0.8 + 0.28q - 0.02q^2.
EF is effectively how the decay is parametrized. Note that for q=4, it never changes. In that universe, n increases by a constant factor of 2.5 each time.
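The recurrences transcribe directly into code. One detail worth adding: SM-2 also clamps EF to a floor of 1.3, which the recurrences above leave out.

```python
def sm2_interval(n_prev, ef_prev):
    """n_i = n_{i-1} * EF_{i-1}; recall the first two intervals are fixed (1, 6)."""
    return n_prev * ef_prev

def sm2_ef(ef_prev, q):
    """EF_i = EF_{i-1} + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)), floored at 1.3."""
    ef = ef_prev + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02))
    return max(ef, 1.3)
```

Note that `sm2_ef(2.5, 4)` returns 2.5 unchanged, which is the q=4 fixed point described above.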
The forgetting curve is only modeled for where it's tested. Many possible functions could fit.
FSRS implements this by explicitly guessing a decay function:
- establish a threshold value for R - call this R_{test} - and run a test at the timestamp t \text{ s.t. } d(t) = R_{test}.
- model the decay as a power law, i.e. d(t) = (1 + kt)^c, where k is derived from test history and c is a negative constant.
- update k after every test.
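That scheduling rule has a closed form: solving (1 + kt)^c = R_{test} gives t = (R_{test}^{1/c} - 1) / k. The constants below (c = -0.5, R_{test} = 0.9) are illustrative stand-ins, not FSRS's actual parameters:

```python
def decay(t, k, c=-0.5):
    """The modeled forgetting curve d(t) = (1 + k t)^c."""
    return (1.0 + k * t) ** c

def next_test_time(k, c=-0.5, r_test=0.9):
    """Time after the last test at which d(t) falls to r_test."""
    return (r_test ** (1.0 / c) - 1.0) / k
```

A larger k (faster forgetting) schedules the next test sooner, which is the whole point of updating k from test history.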
Both of these approaches make choices about modeling based on what empirical data they have available about the forgetting curve, and what parametrizations of it let them update after every test.
They represent solution points in a parent problem of scheduling events that optimise for an outside behaviour that is somehow responsive to the event.
metadata
- mtimes
- score (some value that captures or summarises what happened the last time this was in the feed)
- links
- coming soon:
  - keywords as links
  - link weights (maybe even link properties? inferred from the surrounding link context, perhaps.)
  - publication status: never|no|draft|garden|done
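One natural encoding of this metadata as a record type - the field names and types are my guesses at a reasonable schema, not Toricelli's actual one:

```python
from dataclasses import dataclass
from enum import Enum

class PublicationStatus(Enum):
    NEVER = "never"
    NO = "no"
    DRAFT = "draft"
    GARDEN = "garden"
    DONE = "done"

@dataclass
class NoteMeta:
    mtimes: list     # modification timestamps
    score: float     # summary of what happened last time this was in the feed
    links: list      # outbound links to other notes
    publication_status: PublicationStatus = PublicationStatus.NEVER
```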
algorithm
- calculate a standalone score based on when an SRS algorithm would show a card: this will only be used for ranking.
- run PageRank on it to promote clusters.
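Composed, the two steps might look like the toy sketch below: the standalone SRS-style score becomes PageRank's personalization (teleport) vector, so link clusters get promoted while the ranking still respects due-ness. A sketch under those assumptions, not the real implementation; the damping and iteration counts are the usual defaults:

```python
def feed_order(due_scores, links, damping=0.85, iters=30):
    """due_scores: node -> standalone SRS-style score.
    links: node -> list of outbound links. Returns nodes, best first."""
    total = sum(due_scores.values())
    tele = {n: s / total for n, s in due_scores.items()}  # teleport vector
    rank = dict(tele)
    for _ in range(iters):
        nxt = {n: (1 - damping) * tele[n] for n in rank}
        for n, outs in links.items():
            if outs:
                share = damping * rank[n] / len(outs)
                for m in outs:
                    nxt[m] += share   # each node splits its rank among its links
        rank = nxt
    return sorted(rank, key=rank.get, reverse=True)
```

(Dangling-node mass is simply dropped here, which is fine for ordering but means the scores don't sum to 1; a production version would redistribute it.)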