Show HN: Muscle-Mem, a behavior cache for AI agents
Muscle Memory
muscle-mem
is a behavior cache for AI agents.
It is a Python SDK that records your agent's tool-calling patterns as it solves tasks, and will deterministically replay those learned trajectories whenever the task is encountered again, falling back to agent mode if edge cases are detected.
The goal of muscle-mem
is to get LLMs out of the hotpath for repetitive tasks, increasing speed, reducing variability, and eliminating token costs for the many cases that could have just been a script.
It's unexplored territory, so all feedback is welcome!
- Read Muscle Mem - Removing LLM calls from Agents for more context
- Join Muscle Mem discord for feedback
Dev Log
- May 7, 2025 - First working demo
- May 8, 2025 - Open sourced
How It Works
muscle-mem
is not another agent framework.
You implement your agent however you want, and then plug it into muscle-mem
's engine.
When given a task, the engine will:
- determine if the environment has been seen before (cache-hit), or if it's new (cache-miss) using
Checks
- perform the task, either
- using the retrieved trajectory on cache-hit,
- or passing the task to your agent on cache-miss.
- collect tool call events to add to cache as a new trajectory
It's all about Cache Validation
To add safe tool reuse to your agent, the critical question is cache validation. Ask yourself:
For each tool we give to our agent, what features in the environment can be used to indicate whether or not it's safe to perform that action?
If you can answer this, your agent can have Muscle Memory.
The API
Installation
pip install muscle-mem
Engine
The engine wraps your agent and serves as the primary executor of tasks.
It manages its own cache of previous trajectories, and determines when to invoke your agent.
from muscle_mem import Engine engine = Engine() engine.set_agent(your_agent) # your agent is independently callable your_agent("do some task") # the engine gives you the same interface, but with muscle memory engine("do some task") engine("do some task") # cache hit
Tool
The @engine.tool
decorator instruments action-taking tools, so their invocations are recorded to the engine.
from muscle_mem import Engine engine = Engine() @engine.tool() def hello(name: str): print(f"hello {name}!") hello("world") # invocation of hello is stored, with arg name="world"
Check
The Check is the fundamental building block for cache validation. They determine if it’s safe to execute a given action.
Each Check encapsulates:
- A
capture
callback to extract relevant features from the current environment - A
compare
callback to determine if current environment matches cached environment
Check( capture: Callable[P, T], compare: Callable[[T, T], Union[bool, float]], ):
You can attach Checks to each tool @engine.tool
to enforce cache validation.
This can be done before the tool call as a precheck (also used for query time validation), or after a tool call as a postcheck.
Below is a contrived example, which captures use of the hello
tool, and uses timestamps and a one second expiration as the Check mechanic for cache validation.
# our capture implementation, taking params and returning T def capture(name: str) -> T: now = time.time() return T(name=name, time=now) # our compare implementation, taking current and candidate T def compare(current: T, candidate: T) -> bool: # cache is valid if happened within the last 1 second diff = current.time - candidate.time passed = diff <= 1 return passed # decorate our tool with a precheck @engine.tool(pre_check=Check(capture, compare)) def hello(name: str): time.sleep(0.1) print(f"hello {name}")
Putting it all together
Below is the combined script for all of the above code snippets.
from dataclasses import dataclass from muscle_mem import Check, Engine import time engine = Engine() # our "environment" features, stored in DB @dataclass class T: name: str time: float # our capture implementation, taking params and returning T def capture(name: str) -> T: now = time.time() return T(name=name, time=now) # our compare implementation, taking current and candidate T def compare(current: T, candidate: T) -> bool: # cache is valid if happened within the last 1 second diff = current.time - candidate.time passed = diff <= 1 return passed # decorate our tool with a precheck @engine.tool(pre_check=Check(capture, compare)) def hello(name: str): time.sleep(0.1) print(f"hello {name}") # pretend this is your agent def agent(name: str): for i in range(9): hello(name + " + " + str(i)) engine.set_agent(agent) # Run once cache_hit = engine("erik") assert not cache_hit # Run again cache_hit = engine("erik") assert cache_hit # Break cache with a sleep, then run again time.sleep(3) cache_hit = engine("erik") assert not cache_hit
For a more real example, see a computer-use agent implementation:
https://github.com/pig-dot-dev/muscle-mem/blob/main/tests/cua.py
Call To Action
I invite all feedback as this system develops!
Please consider:
- Joining the Muscle Mem discord
- Testing the muscle-mem repo, and giving it a star
What's Your Reaction?






