# Scoring
iLAB Memory ranks observations with two distinct scoring schemes: SearchScore for mem_search, ContextScore for the memories[] returned by mem_session_start. They share helper math but combine signals with different weights — and their values are not comparable.
Never compare a SearchScore value to a ContextScore value. Different formulas, different scales, different semantic meaning. The only invariant they share is the [0.0, 1.0] range. This is Braess #5 — see Architecture.
## The two schemes

### SearchScore

Used by mem_search. Combines BM25 relevance to your query, recency, and revision count. Higher = more relevant to the query you asked.

### ContextScore

Used by mem_session_start.memories. Combines recency, revision count, and a per-type priority. Higher = more salient as default context to load.
## SearchScore — relevance to a query

```
SearchScore = 0.60 * fts5_rank
            + 0.25 * recency_score(updated_at)
            + 0.15 * revision_score(revision_count)
```
| Signal | Weight | What it measures |
|---|---|---|
| fts5_rank | 0.60 | BM25 from SQLite FTS5, pre-normalized to [0, 1] |
| recency_score | 0.25 | exponential decay, 30-day half-life |
| revision_score | 0.15 | linear in revision_count, capped at 10 |
The dominant signal is the query match. Recency is a secondary tiebreaker — a 6-month-old observation that nails the query still beats a brand-new observation that doesn't.
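As a worked illustration of that dominance, here is a minimal sketch of the weighted sum. The signal values are invented for the example, and this is not the library's implementation:

```python
def search_score(fts5_rank: float, recency: float, revision: float) -> float:
    """Weighted sum per the SearchScore formula; all inputs in [0, 1]."""
    return 0.60 * fts5_rank + 0.25 * recency + 0.15 * revision

# A 6-month-old observation that nails the query (high fts5_rank, decayed recency)
old_hit = search_score(fts5_rank=0.95, recency=0.06, revision=0.5)   # ~0.66
# A brand-new observation that barely matches (low fts5_rank, max recency)
new_miss = search_score(fts5_rank=0.20, recency=1.0, revision=0.0)   # ~0.37
print(old_hit > new_miss)  # True: query match dominates
```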
## ContextScore — salience as default context

```
ContextScore = 0.50 * recency_score(updated_at)
             + 0.30 * revision_score(revision_count)
             + 0.20 * type_priority(type)
```
| Signal | Weight | What it measures |
|---|---|---|
| recency_score | 0.50 | same exponential decay, 30-day half-life |
| revision_score | 0.30 | linear in revision_count, capped at 10 |
| type_priority | 0.20 | per-type weight (see below) |
There is no query here — the goal is "if I have to load N memories at session start, which N matter most for this user right now?" Recency dominates; type priority is a thumb on the scale for stable user facts.
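A minimal sketch of the combination, with made-up signal values, shows how type priority can tip a slightly older profile fact past a fresh discovery (again illustrative, not the library's code):

```python
def context_score(recency: float, revision: float, type_priority: float) -> float:
    """Weighted sum per the ContextScore formula; all inputs in [0, 1]."""
    return 0.50 * recency + 0.30 * revision + 0.20 * type_priority

# A brand-new discovery: max recency, default type priority
fresh_discovery = context_score(recency=1.0, revision=0.0, type_priority=0.50)  # 0.60
# A slightly older, more-revised profile fact: type priority tips the balance
older_profile = context_score(recency=0.8, revision=0.3, type_priority=1.00)    # 0.69
print(older_profile > fresh_discovery)  # True
```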
## Type priorities

| Type | Priority |
|---|---|
| profile | 1.00 |
| preference | 0.90 |
| decision | 0.70 |
| pattern | 0.60 |
| discovery | 0.50 |
| summary | 0.30 |
| any unknown type | 0.50 |
Profile and preference observations bubble to the top of memories[] even when slightly older. Discoveries and patterns rank lower — they show up most often through search.
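The lookup behaves like a dict with a fallback. A sketch (the constant names here are illustrative, not the library's actual identifiers):

```python
# Hypothetical names; the real table lives inside the library.
TYPE_PRIORITY = {
    "profile": 1.00,
    "preference": 0.90,
    "decision": 0.70,
    "pattern": 0.60,
    "discovery": 0.50,
    "summary": 0.30,
}
DEFAULT_TYPE_PRIORITY = 0.50  # any unknown type falls back to 0.50

def type_priority(obs_type: str) -> float:
    return TYPE_PRIORITY.get(obs_type, DEFAULT_TYPE_PRIORITY)
```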
## Shared helpers
Both formulas call the same primitives. There is one definition of recency_score and one of revision_score in the codebase (scoring.py), and each composite assembles them with its own weights.
```python
from ilab_memory.scoring import recency_score, revision_score

print(recency_score("2025-12-01T00:00:00+00:00"))  # ~0.04 if "now" is 2026-04-20 (140 days is ~4.7 half-lives)
print(revision_score(5))   # 0.5
print(revision_score(50))  # 1.0 (capped at REVISION_CAP=10)
```
This is Braess #3. Adding a new signal to one composite requires updating the other in the same PR. The shared helpers are the single source of truth.
## Why two schemes?
Because they answer different questions.
- mem_search: "Among everything I have for this user, what is closest to this query string?" — query relevance dominates.
- mem_session_start: "Among everything I have for this user, what should the LLM see by default before the user even speaks?" — recency and salience dominate.
A single scheme could not weight FTS rank both at 0.60 (when there is a query) and at 0.0 (when there is not). Splitting them keeps each formula honest and tunable.
## Don't mix the scores
```python
hits = mem.mem_search(user_id="alice", query="auth")  # SearchScore
resp = mem.mem_session_start(user_id="alice")         # ContextScore in resp.memories

# DO NOT do this:
all_obs = sorted(hits + resp.memories, key=lambda o: o.score, reverse=True)
# The sort is meaningless — the scores are on incompatible scales.
```
If you need a unified ranking across both, compute it yourself from the underlying observations (e.g., load full records via mem_get_observation and apply your own formula). The library deliberately refuses to fuse the two.
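One way such a do-it-yourself fusion could look, scoring from raw observation fields instead of mixing the two incompatible score values. The field names (updated_at, revision_count, query_rank) and the weights are assumptions for the sketch:

```python
from datetime import datetime, timezone

def my_unified_score(obs: dict, now: datetime) -> float:
    """Custom fusion score from raw fields; field names are assumed, not the library's schema."""
    age_days = (now - datetime.fromisoformat(obs["updated_at"])).total_seconds() / 86400.0
    recency = 0.5 ** (max(age_days, 0.0) / 30.0)   # same 30-day half-life as the library
    revision = min(obs["revision_count"], 10) / 10  # same cap of 10
    score = 0.5 * recency + 0.2 * revision
    query_rank = obs.get("query_rank")              # None for session-start memories
    # Blend in query relevance only for observations that came from search.
    return score + (0.3 * query_rank if query_rank is not None else 0.0)

now = datetime(2026, 4, 20, tzinfo=timezone.utc)
obs_list = [
    {"id": "a", "updated_at": "2026-04-10T00:00:00+00:00", "revision_count": 2, "query_rank": 0.9},
    {"id": "b", "updated_at": "2026-04-18T00:00:00+00:00", "revision_count": 0, "query_rank": None},
]
ranked = sorted(obs_list, key=lambda o: my_unified_score(o, now), reverse=True)
```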
## Next

- Architecture / Braess #5 — why this rule is enforced at the type level (`SearchScore` vs `ContextScore` are distinct frozen Pydantic classes).
- Observations — what the scored objects look like.