Toricelli

ID: 4a46861d-3f4b-4e3d-8ed1-8e1d43cb5e14
MTIME: [2025-04-03 Thu 21:55],[2025-03-24 Mon 00:57],[2024-12-25 Wed 16:16]
REVIEW_SCORE: 12.0
ROAM_REFS: https://github.com/itihas/toricelli

A tool to turn a corpus into a feed.

1. Why toricelli?   draft substack

At several points in recent history, Tumblr users have felt the need to write a howto for using Tumblr effectively. Universal among these guides is the recommendation to switch your feed to chronological ordering. Why does this work? Why does it outperform Tumblr's best guess at what users want to see? It's possible that a literally random feed order would outperform the kind of responsive-but-unaligned feed ordering we've grown to expect, but if Tumblr were doing monetizable demographic targeting, it would start by targeting its ads and Blazed (pay-to-promote) posts. Its contextual guessing ought to be optimizing entirely for engagement, as best it can. The For You feed feels reasonably doomscroll-worthy, so why don't people keep it?

One good answer is that a chronological feed has legible pragmatics – users know why one post follows another. Well-understood organizing principles make for better control planes. Users get a good intuitive view of tons of meta – who posts when, which posters and topics they'd rather see more or less of – absent the confounds of an unaligned feedback loop, and they keep the ability to curate what they see by messing with who they follow and what content they block. It removes the algorithm as a confounding factor. I conjecture that the variable reward circuit in the human learning machine stabilizes once it has actually learned a pattern. (Corollary: gambling is a pareidolia hack.)

I've been grappling for some time with the problem of how to return to things I write in my giant-ass digital notebook. I started it in 2016. I bookmark into it; I also use it to catch stray thoughts that might break pomodoros, talk through sensitive mental problems with myself, and draft essays / articles / difficult chat messages. I've used it for project management, one-off task tracking, collating records of important email threads, running quantified self experiments…

The first three years of this were golden - I felt like I finally could never lose anything. This was untrue. When I tried reviewing my research output, I discovered stubs. Stubs everywhere. I hadn't really been bothering to flesh anything out. Six months down the line, entries like

What is the relationship between analytics!Hypergame and Russell!Hypergame? simt

get a lot less useful.

Around 2019, my usage therefore tapered off. I've spent a long time wondering how the hell to fix this problem, because I want the true version of the faith I had for those first three years. A while back I finally cracked it open: the problem was that auto-organizing it is impossible. Once the notebook got too big, it perforce contained more things than I could dream of in my philosophy. All my guesses at how it "ought to be organized" – at a coherent set of governing rules – became too wrong to be useful. And without any organization, I couldn't navigate it anymore.

This problem will be familiar to data analysts stuck in their cleaning (luteal) phase: the reams of old links, bitrot, citations whose citekeys I'd since changed, shifting experimental fragments of structure… it simply wouldn't survive fully-automated crunching. Which raises the question: do I care enough to do it manually? Do I care at all? And if so, what about?

This is fine, I think; even good. An exobrain isn't meant to be a dead thing. It's meant to work with the brain. The real challenge here is that I needed a broader contact surface with the notebook than I had in order for it to be useful; and by the time any artifact gets that, one could argue I'm less consulting it than running it[1]. Something that aligns so deeply with my thought processes ought to appear alive in its own right. As they say about hairstyles, it needs movement.

I spent some years meditating on how exobrains ought to move; my best lead for a long while was YOW! 2014 Edward Kmett - Stop Treading Water: Learning to Learn #YOW - YouTube, where the presenter talks about how his GitHub projects organically arrange his work into a spaced-repetition-style recall exercise. There's something about how humans interface with each other over high-conceptual-density artifacts like Haskell code that rhymes with how I wanted the interface with my notebook to work – something like loading up the context.

Spaced repetition wasn't quite right - an SRS deck grows, but doesn't really have an attached theory of change for the deck itself. Or rather: it expands, but doesn't quite accrete.

There's something disconcerting about how an uncurated SRS deck flings you from subject to subject. It feels like it burns through a line of credit that my mind is extending it - and yeah, upon examination, that is what it's doing! It's burning through my capacity and willingness to switch context, and it often isn't rewarding me with anything for doing it. An SRS deck is a stack of acontextual recall tasks, and recall tasks are often enjoyable only by dint of the context they drag in with them. The hedonics of an SRS deck are off. This isn't a good motivation curve.

Let's try to fix it. I want to add context to my SRS deck. The key is to notice that a deck is a form of programmable attention - and the classic example of programmable attention is a feed. It augments and mediates your attention via an algorithm.

I realized what I needed was something like spaced repetition in two ways:

  • it takes in a list of items I want to look at and outputs a subset of them to look at now. In other words, a paginated feed.
  • it lets me fuzzy-defer items - effectively, it lets me say, show me this again soon, or not so soon, or quite late. In other words, a todo list.
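
As a sketch of that shape - hypothetical names, not toricelli's actual interface - the two operations might look like:

  from datetime import datetime, timedelta

  def page(items, scores, k=10):
      """Paginated feed: return the k items most due for attention now."""
      return sorted(items, key=lambda item: scores[item], reverse=True)[:k]

  def defer(schedule, item, how_soon):
      """Fuzzy deferral: 'soon', 'later', 'much_later' map to rough delays."""
      delay_days = {"soon": 1, "later": 7, "much_later": 30}[how_soon]
      schedule[item] = datetime.now() + timedelta(days=delay_days)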

I wanted a few other things, which needed to live inside the feed idiom:

  • The structure of my notes themselves needed to shape what was happening somehow. Their relations and correlations ought to inform what I'm looking at, the way web browsing shapes what I look at.
  • What I began with in the first place - my notes needed to grow. I wanted to develop things I found in here over time; link, expand, deepen; solve problems, correct misconceptions, learn and document nuance. Recall is all well and good, but I care more about doing and logging work I can return to. I memorise everything worth memorising near-automatically, so my real challenge was always going to be finding or making things I felt were worth memorising.

When I look at a note again, I usually want to change it, read it, or browse it. Either it's done, and I get to remember some beautiful work I've done or thing I've found - ideally, for an annotated bookmark, both; or it's not done, or stale, or missing connections to things, and I want to change the note; or, whatever its doneness, I care more right now that it links to or reminds me of something that's caught my attention. What I want is a collection of entry points into my notes with the right movement to flow with these workflows.

SRS ideas cover reading and changing, but they don't really cover browsing. For this, I decided to run PageRank on the resulting scores.
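
A minimal sketch of that step, assuming notes form a directed link graph and each node already carries an SRS-derived score; networkx's personalized PageRank lets those scores bias the walk while the link structure pulls neighbours along:

  import networkx as nx

  # Notes as a directed graph; edges are links between notes.
  G = nx.DiGraph()
  G.add_edges_from([("note-a", "note-b"), ("note-b", "note-c"), ("note-a", "note-c")])

  # Hypothetical per-note SRS scores: higher = more due for review.
  srs_score = {"note-a": 0.9, "note-b": 0.1, "note-c": 0.4}

  # Personalized PageRank: walks restart at notes in proportion to their
  # SRS score, so well-linked neighbours of due notes also rise in rank.
  rank = nx.pagerank(G, alpha=0.85, personalization=srs_score)
  feed = sorted(rank, key=rank.get, reverse=True)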

There have been hiccups: a few months of usage in, I realized I was very, very bored with how repetitious the results were. While my goal of surfacing unexpected 2nd-degree-and-up connections was well served, clusters of related nodes tended to show up near each other, and standalones were completely shafted. PageRank assumes that all the entropy comes from the keyword query you use to cut through it, so it extracts all the order it can get out of the links. I wonder who the first person inside Google was to realize that the way this algorithm worked meant that the shape of the internet was changing?

It's sort of a no-win: if I review linked nodes in close succession, they'll get scheduled again near each other; but if I space out my reviews to try and mitigate it, the highest-ranked one will still pull the lower ones up, and standalones get deprioritized. I think I might need to invert the standard PageRank score modifications past a certain nearness threshold, so that nodes that are too closely related compete, and each cluster has a clear winner that changes often. Or maybe I just need more entropy in the system, injected in any which way.
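
One possible shape for that inversion - a sketch of the idea, not what toricelli currently does - is greedy selection that damps the scores of each pick's graph neighbourhood:

  import networkx as nx

  def diverse_top_k(G, rank, k=10, radius=1, penalty=0.5):
      """Each pick damps its neighbourhood within `radius` hops (the
      'nearness threshold'), so a tight cluster yields one winner
      instead of flooding the page."""
      remaining = dict(rank)
      picked = []
      undirected = G.to_undirected()
      while remaining and len(picked) < k:
          best = max(remaining, key=remaining.get)
          picked.append(best)
          del remaining[best]
          near = nx.single_source_shortest_path_length(undirected, best, cutoff=radius)
          for node in near:
              if node in remaining:
                  remaining[node] *= penalty
      return picked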

One thing this problem does is clarify my goals for the feed. At present, I get to maybe two real entry points (deduplicating nodes that are too closely related) in a day. But if it's a good day, that number jumps to more like twenty. On days where I am carrying over context from an already open thread, or where I have some other heavy work going on, it drops to zero or one. All of this is fine: my goal is tastiness, i.e. good hedonics, i.e. for even those one-review days to feel intuitively worthwhile. The Feed should feel good, good enough to make me want the one.

Therefore, what I want is for the top ten results to have at least three things with a __? (get this number) chance of getting me to click, the way any feed worth its salt does.

2. background - SRS algorithms

Consider a value \(R\), the probability of recall - i.e. whether you pass the test a card gives you. SRS axioms are as follows:

  • \(R\) for a given piece of information can be modeled by a decay function over time, unless acted upon by a recall test. This decay function is called the forgetting curve.
  • If you pass a recall test at time \(t\):
    • \(R\) jumps to 1.
    • \(R\) from \(t\) onwards decays more slowly.
  • If you fail a recall test at time \(t\):
    • \(R\) still jumps, but to some value \(R_{reminded} < 1\). (The test will remind you of the card.)
    • \(R\)'s rate of decay reverts to \(k_0\), the initial rate.
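
A minimal model of these axioms, assuming an exponential forgetting curve \(R(t) = e^{-t/S}\) whose stability \(S\) grows on a pass and reverts on a fail; all constants are placeholders, not values from any real scheduler:

  import math

  S0 = 1.0  # initial stability, in days (placeholder)

  def recall_probability(elapsed_days, stability):
      """Forgetting curve: R decays from 1 as time since the last test grows."""
      return math.exp(-elapsed_days / stability)

  def after_test(stability, passed, growth=2.5, r_reminded=0.6):
      """Returns (new R, new stability). A pass resets R to 1 and slows
      decay; a fail only reminds you (R_reminded < 1) and decay reverts."""
      if passed:
          return 1.0, stability * growth
      return r_reminded, S0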

This is a control problem. Treat each test as costing some amount \(c\). You want to maximise the value of \(R\) over time, for which your primary tool is running tests; and minimise the total cost of tests over time, for which your primary tool is making \(R\) decay more slowly - that is, maximising the chance that the user passes tests. You could say you want \(\text{tests}_{passed} / \text{tests}_{total}\) to converge to some nonzero value as fast as possible. (I'm not sure that's quite the right formalisation.)

So, the algorithm needs to:

  • guess the shape of the forgetting curve \(d: \text{Time} \to R\) based on all available data / metadata about a card, including its test history.
  • decide when to run the next test (after interval \(n\)).
  • when the test is run, update both \(d\) and \(n\).

SM-2 implements this as two recurrence relations:

  • \(n_i = n_{i-1}\times EF_{i - 1}\), where \(EF_i\) is a value derived from test history and updated with every test.
  • The first two intervals are fixed: \(n_1 = 1\), \(n_2 = 6\)
  • \(EF_1 = 2.5\)
  • \(EF_i = EF_{i-1} + (0.1 - (5 - q) \times (0.08 + (5 - q) \times 0.02))\) where \(q\) is a recall grade between 0 and 5 for each card.
    • simplifies to \(EF_i = EF_{i-1} - 0.8 + 0.28q - 0.02q^2\).
  • \(EF\) is effectively how the decay is parametrized. Note that for \(q=4\), it never changes. In that universe, \(n\) increases by a constant factor of 2.5 each time.
  • The forgetting curve is only modeled at the points where it's tested. Many possible functions could fit.
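
The recurrences translate directly; here's a sketch that walks them over a card's grade history (adding SM-2's standard floor of 1.3 on \(EF\), which the list above omits):

  def sm2_schedule(grades):
      """Yield the interval (in days) after each review, given the
      sequence of recall grades q for one card."""
      ef, interval = 2.5, 0
      for i, q in enumerate(grades, start=1):
          if i == 1:
              interval = 1               # n_1 = 1
          elif i == 2:
              interval = 6               # n_2 = 6
          else:
              interval = interval * ef   # n_i = n_{i-1} * EF_{i-1}
          ef = max(1.3, ef + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)))
          yield interval

With \(q = 4\) throughout, this yields intervals of 1, 6, 15, 37.5, … days - the constant-factor-2.5 regime noted above.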

FSRS implements this by explicitly guessing a decay function:

  • establish a threshold value for \(R\) - call this \(R_{test}\) - and run a test at the timestamp \(t\) such that \(d(t) = R_{test}\).
  • model the decay as a power law, i.e. \(d(t) = (1 + kt)^c\), where \(k\) is derived from test history and \(c\) is a negative constant.
  • update \(k\) after every test.
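
Scheduling then reduces to inverting the decay curve; a sketch, with placeholder values for \(k\), \(c\), and \(R_{test}\):

  def next_test_interval(k, c=-0.5, r_test=0.9):
      """Solve (1 + k*t)**c == r_test for t: the time at which predicted
      recall falls to the test threshold. c must be negative for decay."""
      return (r_test ** (1.0 / c) - 1.0) / k

  # e.g. with k = 0.1: next_test_interval(0.1) ≈ 2.35 days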

Both of these approaches make modeling choices based on what empirical data they have available about the forgetting curve, and on what parametrizations of it let them update after every test.

They represent solution points in a parent problem: scheduling events so as to optimise some outside behaviour that is itself responsive to the events.

3. metadata

  • mtimes
  • score (some value that captures or summarises what happened the last time this was in the feed)
  • links
  • coming soon:
    • keywords as links
    • link weights (maybe even link properties? inferred from the surrounding link context, perhaps.)
    • publicationstatus: never | no | draft | garden | done
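
As a sketch, the per-node record might look like the following - field names are my paraphrase of the list above, not toricelli's actual schema:

  from dataclasses import dataclass
  from datetime import datetime
  from enum import Enum

  class PublicationStatus(Enum):  # "coming soon" field, per the list above
      NEVER = "never"
      NO = "no"
      DRAFT = "draft"
      GARDEN = "garden"
      DONE = "done"

  @dataclass
  class NodeMeta:
      mtimes: list[datetime]  # modification history
      score: float            # summary of the node's last feed appearance
      links: list[str]        # links to other nodes
      publication_status: PublicationStatus = PublicationStatus.NO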

4. algorithm

  • calculate a standalone score based on when an SRS algorithm would show a card: this will only be used for ranking.
  • run pagerank on it to promote clusters.
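
A sketch of the first step, assuming the SRS layer yields a due date and current interval per node; the "overdueness" ratio here is one hypothetical choice of standalone score, used only for ranking:

  from datetime import datetime

  def standalone_score(due, interval_days, now=None):
      """Rank-only score: how far past (or before) its SRS due date a node
      is, normalised by its current interval. Positive means overdue;
      1 means a whole interval late."""
      now = now or datetime.now()
      return (now - due).total_seconds() / (interval_days * 86400)

These scores can then serve as the personalization vector in the PageRank step sketched earlier.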

5. Related:

This node is a singleton!

Footnotes:

[1] perhaps even emulating it.
