Daily Shaarli

All links of one day in a single page.

March 5, 2026

KARL: A Faster Agent for Enterprise Knowledge, powered by custom RL

While SFT distillation meaningfully improves overall performance over the base model, the gap between the two approaches is most apparent when combined with test-time compute. On in-distribution tasks, SFT benefits substantially from parallel sampling (69.1 → 75.3), yet on out-of-distribution tasks the gains are negligible (59.4 → 59.6). This suggests that distillation teaches the model to imitate task-specific expert behavior, which scales well within the training distribution but fails to generalize beyond it. In contrast, KARL benefits from test-time compute both in- and out-of-distribution, indicating that RL develops more general search capabilities rather than task-specific heuristics.
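The "parallel sampling" form of test-time compute mentioned above can be sketched as best-of-n selection: draw several independent rollouts and keep the highest-scoring one. This is a minimal toy illustration, not KARL's implementation; the sampler and scorer below are hypothetical stand-ins.

```python
import random

def sample_answer(rng: random.Random) -> str:
    # Hypothetical stand-in for one stochastic rollout of a model.
    return rng.choice(["short", "medium answer", "a much longer answer"])

def score(answer: str) -> float:
    # Hypothetical verifier; real systems use a reward model or task checker.
    return len(answer)

def best_of_n(n: int, seed: int = 0) -> str:
    # Draw n independent samples and keep the best one under the scorer.
    # Gains from increasing n are exactly the parallel-sampling effect
    # discussed above: they depend on the sampler producing useful diversity.
    rng = random.Random(seed)
    samples = [sample_answer(rng) for _ in range(n)]
    return max(samples, key=score)
```

With a fixed seed the result is deterministic; more samples can only improve (never hurt) the selected score.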

symphony/elixir/README.md at main · openai/symphony

Why Elixir?

Elixir is built on Erlang/BEAM/OTP, which is great for supervising long-running processes. It has an active ecosystem of tools and libraries. It also supports hot code reloading without stopping actively running subagents, which is very useful during development.

awni/mylm: Self-personalizing LM

The above command enters you into a chat loop. You can talk to the model and share information like your name. Every now and then, /sleep the model to transition short-term memory into long-term memory.

The /sleep command:

Generates Q&A pairs based on the context
LoRA fine-tunes the model on the new Q&A pairs plus any from previous sessions
Resets the KV cache

After the /sleep command, the model should remember context from previous sessions even though that context is no longer in the KV cache.
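The three /sleep steps above can be sketched as a small state machine. This is a hedged sketch only: the class and function names are hypothetical placeholders, not mylm's actual API, and the LoRA update is reduced to a counter.

```python
def generate_qa_pairs(context):
    # Hypothetical: turn recent chat turns into Q&A pairs for consolidation.
    return [(f"Q{i}: what did the user say?", turn)
            for i, turn in enumerate(context)]

class SelfPersonalizingLM:
    def __init__(self):
        self.kv_cache = []        # short-term memory: the live context
        self.training_data = []   # long-term memory: accumulated Q&A pairs
        self.adapter_updates = 0  # stands in for LoRA weight updates

    def chat(self, message):
        self.kv_cache.append(message)

    def sleep(self):
        # 1. Generate Q&A pairs based on the current context.
        qa = generate_qa_pairs(self.kv_cache)
        # 2. Fine-tune on new pairs plus any from previous sessions.
        self.training_data.extend(qa)
        self.adapter_updates += 1
        # 3. Reset the KV cache; the facts now live in the adapter weights.
        self.kv_cache = []
```

After `sleep()`, the cache is empty but the Q&A pairs persist, which is why the model can still recall earlier sessions.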