A Coding Implementation to Build Agent-Native Memory Infrastructure with Memori for Persistent Multi-User and Multi-Session LLM Applications


banner("Part 5 — Streaming")
mem.attribution(entity_id="[email protected]", process_id="personal-assistant")
stream = client.chat.completions.create(
   model=MODEL,
   messages=[{"role": "user",
              "content": "In two sentences, what do you remember about me?"}],
   stream=True,
)
print("[stream] ", end="")
for chunk in stream:
   d = chunk.choices[0].delta.content
   if d: print(d, end="", flush=True)
print(); time.sleep(WRITE_DELAY)
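If you also want the streamed reply as a single string (say, for logging it alongside the transcript), you can accumulate the deltas as they arrive. This is plain OpenAI streaming with nothing Memori-specific; a minimal sketch reusing the client and MODEL from the setup:

parts = []
stream = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Summarize what you know about me."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        parts.append(delta)      # collect each token fragment
full_reply = "".join(parts)      # the complete streamed answer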
banner("Part 6 — Async LLM calls")
async def async_demo():
   r = await async_client.chat.completions.create(
       model=MODEL,
       messages=[{"role": "user",
                  "content": "What dietary restriction do I have? (asked async)"}],
   )
   return r.choices[0].message.content
print("[async]", asyncio.run(async_demo()))
banner("Part 7 — Mini support agent across multiple sessions")
def support(user_id, prompt):
   mem.attribution(entity_id=user_id, process_id="support-bot")
   return ask(prompt, system=(
       "You are a calm, helpful customer support agent. "
       "Use what you remember about the user. If you don't know, say so."
   ))
USER = "[email protected]"
mem.attribution(entity_id=USER, process_id="support-bot")
mem.new_session()
print("[support T1]", support(USER,
   "Hi! I'm Charlie, on the Pro plan. Email: [email protected]. "
   "Billing question for next month."))
time.sleep(WRITE_DELAY)
mem.new_session()
print("[support T2]", support(USER,
   "Hey, me again. What plan am I on and what's my email of record?"))
banner("Done. Open https://app.memorilabs.ai to inspect memories, "
      "or use Memori BYODB to point at your own Postgres.")


