Note: This is an experiment. This post was created and published by the assistant itself at my request, and it will probably evolve over time as the setup changes.

For the past few days, I’ve been experimenting with OpenClaw, an open source framework for building self-hosted personal assistants. The interesting part is that it’s not just a chatbot, but a system that can integrate with your own infrastructure, access APIs, manage automations, and communicate across multiple channels (Telegram, Discord, Signal, and so on).

Why OpenClaw?

In the AI assistant landscape, most solutions are cloud-based and closed source. OpenClaw takes a different approach:

  • Self-hosted: it runs on your own hardware
  • Modular: a skill system that extends the assistant’s capabilities
  • Multi-channel: Telegram, Discord, WhatsApp, Signal, Slack, and more
  • Specialized agents: the ability to spawn sub-agents for specific tasks

The setup

I installed OpenClaw on a local Linux machine. The basic setup is relatively simple:

npm install -g openclaw
openclaw gateway start

The gateway is the heart of the system: it manages sessions, handles connections to messaging providers, and coordinates the agents.

Defining the assistant’s identity

The most interesting part was configuring the assistant’s personality. To do that, I used Claude Opus 4.5 — a very capable model that analyzed information from my personal website to understand who I am, what I do, and how I like to communicate.

That analysis produced a file called SOUL.md, which defines the assistant’s character: a competent digital butler, with a touch of irony, speaking Italian and knowing when to be discreet. The chosen emoji is 😺 — a small personal touch.

OpenClaw’s approach is interesting: instead of ad-hoc prompt engineering for each interaction, it relies on context files (SOUL.md, USER.md, AGENTS.md) that the assistant reads automatically at each session start. This gives you behavioral consistency without having to repeat instructions all the time.
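As a rough sketch of how this kind of file-based context loading can work (the file names come from the setup above; the loader itself is hypothetical, not OpenClaw's actual code):

```python
from pathlib import Path

# Context files described above; the loading order is an assumption.
CONTEXT_FILES = ["SOUL.md", "USER.md", "AGENTS.md"]

def build_system_context(base_dir="."):
    """Concatenate whichever context files exist into one system prompt."""
    parts = []
    for name in CONTEXT_FILES:
        path = Path(base_dir) / name
        if path.exists():
            parts.append(f"## {name}\n{path.read_text(encoding='utf-8').strip()}")
    return "\n\n".join(parts)
```

The payoff is that the same identity and rules reach every session without being restated in each prompt.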

Active integrations

At the moment I have configured:

  • Telegram: the main channel for interacting with the assistant
  • Git: repository operations, commits, pushes
  • Skill system: weather, web search, wacli for WhatsApp
  • Cron: recurring task scheduling and reminders

Skills

The skill system is well designed. Each skill is a folder containing:

  • SKILL.md: documentation on how to use it
  • Scripts/binaries for specific operations
  • References to external tools (for example himalaya for email)
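For concreteness, a minimal skill folder might look like this (the layout follows the description above; the weather skill name and file names are purely illustrative):

```
weather/
├── SKILL.md        # when and how the assistant should use this skill
├── get-weather.sh  # script the assistant actually invokes
└── README.md       # optional notes for humans
```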

I currently have skills for:

  • Checking the weather
  • Searching the web
  • Interacting with Reddit, Hacker News, and FreshRSS
  • Handling email via IMAP/SMTP
  • And many others

The assistant in action

The assistant really behaves like a digital butler:

  • It replies directly on Telegram when it has something to communicate
  • It can run shell commands, edit files, and manage git
  • It performs web research and reports back the results
  • It keeps long-term memory in markdown files
  • It knows when to stay quiet (for example in group chats, when there is no need to intervene)

Scheduled automations

A key part of the setup is recurring automation, handled through OpenClaw’s cron system:

  • Every morning: a weather update for the day
  • Monday to Friday, mid-morning: collection of news relevant to my work from my FreshRSS feed
  • At lunchtime: a summary of the day’s important news
  • WhatsApp group monitoring: if it detects more than 10 unread messages in a group with friends over the last 4 hours, it sends me a Telegram summary with the key points of the conversation
  • Monthly spending analysis: every 17th of the month, it analyzes a CSV export of my expenses (covering the period from the 17th of the previous month to the 17th of the current month) and generates a full report with trends, savings suggestions, and yearly projections

This creates an automatic information flow that keeps me updated without having to manually check multiple sources.
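The WhatsApp group-monitoring rule above reduces to a simple threshold check. A sketch (the 10-message and 4-hour numbers mirror the rule as described; the function itself is illustrative):

```python
from datetime import datetime, timedelta, timezone

THRESHOLD = 10            # unread messages that trigger a summary
WINDOW = timedelta(hours=4)

def should_summarize(unread_timestamps, now=None):
    """True when more than THRESHOLD unread messages fall within the last WINDOW."""
    now = now or datetime.now(timezone.utc)
    recent = [t for t in unread_timestamps if now - t <= WINDOW]
    return len(recent) > THRESHOLD
```

The expensive summarization step then runs only when the check fires, which keeps the monitoring path cheap.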

Personal spending analysis

A more recent automation concerns personal finance. I configured a cron job that, every month on the 17th at 10:00:

  1. Retrieves the latest email containing the CSV export of my expenses
  2. Filters transactions for the period from the 17th of the previous month to the 17th of the current month
  3. Analyzes the data and generates a full report including:
    • Income vs expenses summary with savings percentage
    • Top spending categories with ASCII graphs
    • High-spending days and anomalies
    • 💡 Personalized suggestions from the digital butler (for example: “You spent X on restaurants, +20% compared to last month”)
    • Yearly projections based on the current trend

The task runs with Claude Opus 4 to ensure deeper analysis and better insights. The result is a detailed report sent directly to Telegram, which helps me keep an eye on my finances without opening banking apps or spreadsheets.
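The 17th-to-17th window is easy to get wrong around the year boundary, so it is worth computing explicitly. A sketch (a standard cron expression for "every 17th at 10:00" would be `0 10 17 * *`; the helper below is illustrative, not OpenClaw code):

```python
from datetime import date

def spending_window(run_day):
    """Return (start, end) dates for the 17th-of-previous-month
    to 17th-of-current-month analysis window."""
    end = run_day.replace(day=17)
    if end.month == 1:  # January rolls back into December of the previous year
        start = end.replace(year=end.year - 1, month=12)
    else:
        start = end.replace(month=end.month - 1)
    return start, end
```

Day 17 exists in every month, so the `replace` calls never raise on short months.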

Email integration with Proton Mail

To complete the morning briefing, I also integrated email monitoring. Since I use Proton Mail, which does not expose a direct IMAP server, I had to rely on Proton Mail Bridge — a tool that creates a local IMAP/SMTP bridge connected to the Proton account.

The bridge runs in a Docker container using the shenxn/protonmail-bridge image, managed with Docker Compose:

version: '3'
services:
  protonmail-bridge:
    image: shenxn/protonmail-bridge:latest
    container_name: protonmail-bridge
    volumes:
      - ./config:/root
    ports:
      - "127.0.0.1:1025:25"
      - "127.0.0.1:1143:143"
    restart: unless-stopped

Once started, you just connect to the container with docker exec -it protonmail-bridge /bin/bash and run proton-bridge --login to authenticate with your Proton credentials. At that point, the bridge exposes IMAP and SMTP on localhost:

  • IMAP: 127.0.0.1:1143
  • SMTP: 127.0.0.1:1025
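With the bridge up, any IMAP client can talk to it on those ports. A minimal unread-count check with Python's imaplib (host and port come from the compose file above; the credentials are placeholders for the ones the bridge generates, and the auth details may need adjusting to your bridge settings):

```python
import imaplib

def count_ids(search_data):
    """Parse the data half of an IMAP SEARCH response into a message count."""
    return len(search_data[0].split()) if search_data and search_data[0] else 0

def unseen_count(user, password, host="127.0.0.1", port=1143):
    """Count unread INBOX messages via the local Proton Mail Bridge."""
    conn = imaplib.IMAP4(host, port)  # plain IMAP on localhost
    try:
        conn.login(user, password)
        conn.select("INBOX", readonly=True)
        _, data = conn.search(None, "UNSEEN")
        return count_ids(data)
    finally:
        conn.logout()
```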

For the email skill I initially evaluated himalaya, which looked like the cleanest choice in many respects. Unfortunately, it did not properly support the plain authentication required to talk to the bridge. So I switched to an alternative skill with more flexible IMAP/SMTP support, and that integrated without any issues.

The result: every morning, the briefing also includes a short summary of unread email, so no more “oh damn, I forgot to reply to that email”.

Memory and context

The memory system is one of the strongest parts of the setup:

  • MEMORY.md: long-term memory, loaded only in main sessions
  • memory/YYYY-MM-DD.md: daily logs of what happened
  • AGENTS.md: conventions and rules for agent behavior
  • HEARTBEAT.md: recurring tasks to execute

This file-based approach is elegant: it survives restarts, is versionable, and remains transparent.
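As a concrete example of the convention, a daily-log writer for memory/YYYY-MM-DD.md could look like this (the naming follows the list above; the helper itself is illustrative):

```python
from datetime import date
from pathlib import Path

def append_daily_log(entry, memory_dir="memory", day=None):
    """Append one bullet to today's memory/YYYY-MM-DD.md, creating it if needed."""
    day = day or date.today()
    path = Path(memory_dir) / f"{day.isoformat()}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {entry}\n")
    return path
```

Because each day is its own append-only file, logs survive restarts and diff cleanly under version control.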

LLMs and costs

One thing that changed quite a lot over the last few weeks is the cost side of the equation. At first I used OpenRouter on a pay-per-use basis, testing different models (including heavier ones) before eventually settling on Kimi K2.5.

In terms of quality, Kimi worked well. The problem emerged on the economic side: according to my OpenRouter export, Kimi K2.5 alone cost me $31.71 during my initial period of use. For my real-world usage, daily rather than occasional, pay-per-use turned out not to be sustainable.

From pay-per-use to a flat monthly cost

At that point I started looking for alternatives with two clear constraints:

  • predictable pricing (flat subscription)
  • privacy requirements compatible with my setup

Alternatives I evaluated

  • Claude Code: not a practical route in this scenario, because the usage pattern I need would conflict with the terms and conditions.
  • Ollama Cloud: a cloud service with access to various models (including Kimi K2.5) on a subscription basis, around €20/month. Interesting on paper, but in my tests performance was too slow for my daily workflow.
  • Syntetic.new: a promising project focused on more structured AI/coding workflows. I have it on my radar, but at the moment it is only available through a waitlist.
  • Kimi Code (€18/month): very competitive pricing, but the infrastructure goes through Chinese servers, which does not align with my privacy requirements.

Current compromise

Right now the most sensible compromise is ChatGPT Plus with Codex at €20/month: fixed cost, good enough performance, and a better balance between quality, predictable spending, and operational requirements.

So far, looking at the OpenRouter export, total spending has been $44.16, of which $31.71 was attributable to Kimi K2.5: numbers that confirm how difficult it is to keep a fully usage-based model sustainable in my daily use case.

Privacy considerations

I obviously had to evaluate the security side too:

  • The assistant’s data stays local for the whole orchestration layer
  • API keys are handled using native tools, without improvised workarounds
  • The agent system can operate in isolated sandboxes
  • Communication with external providers (Telegram, etc.) uses official APIs

When I use OpenRouter, an important control is Zero Data Retention (ZDR): it allows requests to be routed only to providers that claim not to retain prompts. It is a useful protection layer, but it still has to be balanced against model availability, performance, and costs.
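On the API side, this preference can be expressed per request through OpenRouter's provider-routing options. As far as I know, a data_collection: "deny" preference restricts routing to providers that claim not to retain or train on prompts, but verify the exact field names against the current OpenRouter documentation before relying on them:

```python
# Request-body sketch for OpenRouter's chat completions endpoint.
# The "provider" block is the relevant part; field names and the model
# slug are assumptions to be checked against the current OpenRouter docs.
payload = {
    "model": "moonshotai/kimi-k2.5",  # illustrative model slug
    "messages": [{"role": "user", "content": "Summarize my unread feeds."}],
    "provider": {
        "data_collection": "deny",    # only providers that don't retain prompts
    },
}
```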

Security: prompts are not enough, you need architecture

One of the most interesting additions to my setup has nothing to do with firewalls or reverse proxies, but with separating duties based on the trust level of the input.

The first thing I started taking seriously was the importance of openclaw security audit --deep. OpenClaw’s official security documentation recommends it as a regular check after any significant configuration change, network exposure, plugin addition, or new input surface. In my case it became part of the practical checklist, because it helps catch very real footguns before they become a problem.

To strengthen that area, I also added the skill-vetter skill, which is useful for taking a more critical look at installed or candidate skills instead of treating them as harmless little prompt snippets. In a system like this, plugins and skills should be treated for what they are: extensions of the trust boundary.

The broader point, however, is this: with an assistant like OpenClaw, the risk is not only “who can message me”, but also what the agent reads. Web pages, forums, email, documents, attachments, and search results can all become prompt injection vectors, especially when the agent has access to powerful tools, persistent memory, or automations. This is where the OWASP guidance on AI Agent Security and LLM Prompt Injection Prevention is genuinely useful, because it insists on a few simple but solid principles: least privilege, separation between tools and trust levels, treating external content as untrusted, protecting memory from poisoning, and keeping human oversight on the most sensitive actions.

As an operational baseline I also took inspiration from the checklist published by SouthSea Automation: OpenClaw Security Checklist. I do not treat it as the definitive source — for that I rely on the official documentation first — but I found it useful as a quick operational checklist. In particular, I applied point 5, SOUL.md — Hard Boundaries, with care: defining explicit limits around what the assistant can or cannot do, when it must ask for confirmation, what it must never send or promote into memory, and which boundaries it must not cross even if an external source tries to persuade it otherwise.

At that point, the natural next step was to create a sharper separation between agents. I split the setup into two levels:

  • Primary, the assistant with memory, personal context, decisions, and final responses;
  • Scout 🛰️, a second, more restricted agent dedicated to reading external content, doing research, monitoring sources, and handling small, bounded automation tasks.

The idea is simple: the agent that reads untrusted content should not be the same one that holds personal memory or can act with too many privileges. Scout filters, summarizes, and extracts facts; Primary only consumes the distilled result and decides what is actually worth remembering or turning into action.
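That division of labor can be sketched as a hard interface: Scout returns nothing but plain text, and Primary treats that text as data, never as instructions (the agent names come from the setup above; the code illustrates the boundary, not OpenClaw internals):

```python
def scout_read(url, fetch):
    """Scout: reads untrusted content and returns only a distilled summary.

    It has no memory and no tools beyond fetching; its output is data,
    never instructions. An LLM call would summarize here; truncation
    stands in for it in this sketch.
    """
    raw = fetch(url)
    return raw[:500]

def primary_decide(summary):
    """Primary: consumes the distilled text and decides what, if anything, to do."""
    return {
        "notify": "urgent" in summary.lower(),
        "remember": False,  # nothing external is auto-promoted to memory
    }
```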

I also added explicit memory hygiene rules:

  • no automatic promotion of external content into long-term memory;
  • only preferences, decisions, and verified operational context go into MEMORY.md;
  • automations that read external sources should remain small, specialized, and unable to expand their own scope.
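The first two hygiene rules can even be enforced mechanically with a small gate in front of MEMORY.md (the categories come from the rules above; the function is a sketch, not OpenClaw code):

```python
# Only these categories may be promoted to long-term memory (MEMORY.md).
ALLOWED = {"preference", "decision", "operational-context"}

def promote_to_memory(category, text, source):
    """Gate in front of MEMORY.md: external content is never auto-promoted."""
    if source == "external":
        return False               # rule 1: no automatic promotion
    return category in ALLOWED     # rule 2: only vetted categories
```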

In practice, the real defense is not a stricter prompt, but an architecture that limits the damage when something goes wrong. I am not trying to make the agent impossible to fool: I am trying to make sure that, when it is fooled, it has less room to do harm.

Conclusions

OpenClaw is an ambitious project that tries to bring AI assistants into the self-hosted, privacy-focused world. It is not perfect yet, but the architecture is solid and the community is active.

The idea of a “personal assistant” is finally starting to look like something useful, not just a chatbot that answers questions. And having it on your own hardware, under your own control, makes all the difference.


If you want to know more about the project: github.com/openclaw/openclaw