Making a safe, sandboxed OpenCode

I’ve wanted to make an AI coding agent that is both useful and safe for a while, and I’ve finally found some success. I made an OpenCode plugin called opencode-daytona that spawns each coding session in a cloud sandbox, so you can build normally while the agent has no access to your system.

%[https://twitter.com/jamesmurdza/status/2016299806759780614]

This post should serve as a good way to learn about coding agent sandboxes, or about developing an OpenCode plugin. Either way, if you read this and want to discuss collaborating on either of these topics, reach out to me.

For the rest of this article, I will talk about how I made the plugin, and I will also give some commentary on my experience getting very familiar with OpenCode.

What it does

I’ve previously made many examples of AI coding agents running inside of sandboxes, but they had an issue: The agent runs in the same sandbox as AI-generated code. This is a bad thing because the agent can be hacked to steal resources from itself or leak its API keys. I’ve explained this in detail in this earlier post.

In this plugin, I added the following functionality to OpenCode:

A unique sandbox created for each session
Replacement tool calls (read file, run command, etc.) overriding the defaults
Git synchronization from the sandbox to a local git branch

The final plugin is about 1800 lines of code: 50% core plugin code, 25% agent tools, and 25% git synchronization code.

Claude Code, Codex or OpenCode?

Originally, I didn’t know if OpenCode was the best option. I wondered if I could extend Claude Code to do this. Unfortunately, Claude Code is closed source and its plugin system offers no way to override built-in tools, so existing behaviors such as reading files can’t be replaced.

Functionality	Claude Code plugins	OpenCode plugins
Slash commands	✅	✅
Skills	✅	✅
Events	Pre/post hooks	Event hooks
Add new tools	Indirectly via MCP/LSP	✅
Overwrite tools	❌	✅
Prompt shaping	✅	✅

My remaining options were to 1) fork an open source agent such as OpenAI’s Codex or OpenCode, or 2) build an OpenCode plugin. After some trial and error, the latter turned out to be a good solution.

The OpenCode plugin SDK

The OpenCode plugin interface is definitely a work-in-progress, but the parts that work are elegant. I used the documentation to get started, and for parts that weren’t documented (like toast notifications) I used my IDE’s IntelliSense.

When you normally install an OpenCode plugin, you add the npm package name to a config file, and OpenCode downloads the package and runs it in its Bun runtime on every launch. During testing, you don’t want to publish to npm, so you can add a plugin directly by creating a symlink to its TypeScript source directory. Use an absolute path for the target, because a relative target is resolved from the link’s own directory rather than from where you run the command:

ln -s "$(pwd)/opencode-daytona/.opencode/plugins" ./test-project/.opencode/plugins

Note: OpenCode only imports the base plugins directory, so you also need an index.ts file that imports and re-exports all plugins within this directory.

Plugin Implementation

I’ll now walk through the implementation of the OpenCode plugin, which has a lot in common with any OpenCode plugin you might want to develop. The core functionality is in tool-calls, event handlers, and adding to the system prompt.

Tool calls

We override all state-related tool-calls with analogous versions using the Daytona SDK. Here’s the bash execution tool as an example:

import { z } from 'zod'
import type { ToolContext } from '@opencode-ai/plugin/tool'

export const bashTool = (sessionManager: DaytonaSessionManager, projectId: string) => ({
  description: 'Executes shell commands in a Daytona sandbox',
  args: { command: z.string() },
  async execute(args: { command: string }, ctx: ToolContext) {
    const sandbox = await sessionManager.getSandbox(ctx.sessionID, projectId);
    const result = await sandbox.process.executeCommand(args.command);
    return `Exit code: ${result.exitCode}\n${result.result}`;
  },
});

The sessionManager is a custom map that I implemented to keep track of sessions and sandboxes, and the key line of code is sandbox.process.executeCommand, which is the Daytona SDK method for running shell commands.
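To make the idea of a session-to-sandbox map concrete, here is a minimal sketch. It is not the plugin’s actual sessionManager: the class name SessionSandboxMap, the forget method, and the injected createSandbox factory are all hypothetical, and the factory stands in for whatever SDK call creates a sandbox.

```typescript
// Hypothetical sketch: caches one sandbox per (project, session) pair.
// `createSandbox` is injected, so nothing here depends on a real SDK.
type SandboxFactory<T> = (sessionId: string, projectId: string) => Promise<T>;

class SessionSandboxMap<T> {
  // Promises are stored instead of resolved values so that concurrent
  // calls for the same session share a single creation request.
  // Note: in this sketch a failed creation stays cached until forgotten.
  private readonly sandboxes = new Map<string, Promise<T>>();
  private readonly createSandbox: SandboxFactory<T>;

  constructor(createSandbox: SandboxFactory<T>) {
    this.createSandbox = createSandbox;
  }

  private key(sessionId: string, projectId: string): string {
    return `${projectId}:${sessionId}`;
  }

  // Returns the cached sandbox for this session, creating it on first use.
  getSandbox(sessionId: string, projectId: string): Promise<T> {
    const key = this.key(sessionId, projectId);
    let sandbox = this.sandboxes.get(key);
    if (!sandbox) {
      sandbox = this.createSandbox(sessionId, projectId);
      this.sandboxes.set(key, sandbox);
    }
    return sandbox;
  }

  // Drops the mapping; returns true if an entry existed.
  forget(sessionId: string, projectId: string): boolean {
    return this.sandboxes.delete(this.key(sessionId, projectId));
  }
}
```

Every tool can then call getSandbox with the session ID from its context and be sure it is talking to the same sandbox as the other tools in that session.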

By looking at the OpenCode source code, I found 10 OpenCode tool-calls that needed to be overridden (bash, edit, glob, grep, ls, lsp, multiedit, patch, read, write) and I also added one new tool-call of my own to generate sandbox preview links (get-preview-url). All of these are registered by a single plugin function, CustomToolsPlugin, which returns them under the tool key:

import type { Plugin, PluginInput } from '@opencode-ai/plugin'

export const CustomToolsPlugin: Plugin = async (pluginCtx: PluginInput) => {
  logger.info('OpenCode started with Daytona plugin')
  const projectId = pluginCtx.project.id
  return {
    tool: {
      bash: bashTool(sessionManager, projectId),
      read: readTool(sessionManager, projectId),
      write: writeTool(sessionManager, projectId),
      edit: editTool(sessionManager, projectId),
      // More tools...
    }
  }
}

As I mentioned earlier, the ability to override tool-calls is unique to OpenCode, which is what made the whole plugin idea possible. One big caveat to what I’ve done is that if OpenCode adds tools in a later version that I haven’t implemented, it will break the isolation until I update my plugin. In fact, this happened while I was writing this article!
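One way to catch this kind of gap earlier is to compare the tools the plugin overrides against a list of OpenCode’s built-in tools. The sketch below only shows the comparison. How you obtain the built-in list (for example, by hand from the OpenCode source) is left open, and findUncoveredTools and the example tool name newtool are hypothetical.

```typescript
// The ten built-in tools the plugin overrides, as listed above.
const OVERRIDDEN_TOOLS: readonly string[] = [
  'bash', 'edit', 'glob', 'grep', 'ls',
  'lsp', 'multiedit', 'patch', 'read', 'write',
];

// Hypothetical helper: returns the built-in tool names that have no
// sandboxed replacement, sorted for stable output.
function findUncoveredTools(
  builtinTools: readonly string[],
  overridden: readonly string[],
): string[] {
  const covered = new Set(overridden);
  return builtinTools.filter((name) => !covered.has(name)).sort();
}

// Example: pretend a later OpenCode release added a tool called "newtool".
const uncovered = findUncoveredTools([...OVERRIDDEN_TOOLS, 'newtool'], OVERRIDDEN_TOOLS);
if (uncovered.length > 0) {
  console.warn(`Tools without a sandboxed override: ${uncovered.join(', ')}`);
}
```

A check like this doesn’t restore isolation by itself, but it turns a silent gap into a visible warning.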

Events

I used event handlers to watch for two events:

When a session is deleted: Delete the corresponding sandbox for that session
When the session idles (i.e. the agent stops working): Sync files from the sandbox to the local system

Here’s what the implementation for the first handler looks like:

import type { Plugin, PluginInput } from '@opencode-ai/plugin'
import type { EventSessionDeleted } from './core/types'

export const SessionCleanUpPlugin: Plugin = async (pluginCtx: PluginInput) => {
  return {
    event: async ({ event }) => {
      if (event.type === 'session.deleted') {
        const sessionId = (event as EventSessionDeleted).properties.sessionID
        const projectId = pluginCtx.project.id
        await sessionManager.deleteSandbox(sessionId, projectId)
      }
    },
  }
}
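The second handler follows the same pattern. Here is a rough sketch, not the plugin’s actual code: it assumes the idle event arrives with type 'session.idle' and carries properties.sessionID, uses a small local SessionEvent type in place of the SDK’s event types, and takes the sync step as an injected function.

```typescript
// Local stand-ins for the plugin event types (assumptions, not SDK definitions).
type SessionEvent = { type: string; properties: { sessionID: string } };
type SyncFn = (sessionId: string) => Promise<void>;

// Hypothetical factory: builds an event handler that runs `sync`
// whenever a session goes idle and ignores every other event.
function makeIdleHandler(sync: SyncFn) {
  return async ({ event }: { event: SessionEvent }): Promise<void> => {
    if (event.type !== 'session.idle') return;
    await sync(event.properties.sessionID);
  };
}
```

Injecting the sync function keeps the handler easy to test without a sandbox or a git repository.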

Prompt transformation

With the addition of the above event handler, the plugin worked, although it behaved strangely at times. For example, it would try to use paths from my local system instead of paths from the sandbox. (OpenCode probably adds these to the context.) To adjust the agent’s behavior, I added my own addition to the system prompt:

import type { Plugin, PluginInput } from '@opencode-ai/plugin'

export const SystemTransformPlugin: Plugin = async (pluginCtx: PluginInput) => {
  return {
    'experimental.chat.system.transform': async (
      input: ExperimentalChatSystemTransformInput,
      output: ExperimentalChatSystemTransformOutput,
    ) => {
      output.system.push(
        [
          'This session is integrated with a Daytona sandbox.',
          `The main project repository is located at: ${repoPath}.`,
          'Do NOT try to use the current working directory of the host system.',
          // ...
        ].join('\n'),
      )
    },
  }
}

Git integration

Once I had all of the above working, I was thrilled. But there was still a major inconvenience: Code created in the sandbox was stuck there, while my local OpenCode project directory remained empty. Of the many possible solutions, I considered:

Option A: Use scp or rsync. This would copy the files to the local computer, but wouldn’t handle version history or multiple sandboxes.
Option B: Sync to a git repository on a third-party host (like GitHub). This would work, but would add extra complexity to the system.
Option C: Use git to pull changes directly from the sandbox.

I decided on Option C for the best user experience. On session idle, the plugin commits all changes to a repository in the sandbox. Then the plugin syncs those changes to a read-only branch on your system. This architecture makes syncing changes work seamlessly and securely, even though its implementation is unintuitive.
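As a rough illustration of the two steps, the sketch below builds the git commands for one sync. It is not the plugin’s implementation: the function name buildSyncPlan, the branch naming scheme opencode/&lt;session&gt;, and the sandboxRemote URL are all assumptions, and how the host actually reaches the repository inside the sandbox is not covered here.

```typescript
// Each command is an argv array, ready to hand to a process runner.
type SyncPlan = { inSandbox: string[][]; onHost: string[][] };

// Hypothetical helper: plans one sandbox-to-host sync for a session.
// `sandboxRemote` is assumed to be a git URL that reaches the sandbox repo.
function buildSyncPlan(sessionId: string, sandboxRemote: string): SyncPlan {
  const branch = `opencode/${sessionId}`;
  return {
    inSandbox: [
      // Stage and commit everything the agent changed. `git commit` exits
      // non-zero when there is nothing to commit; a caller can ignore that.
      ['git', 'add', '--all'],
      ['git', 'commit', '-m', `Sync session ${sessionId}`],
    ],
    onHost: [
      // Fetch the sandbox HEAD straight into a local branch. The leading "+"
      // forces the update, so the branch always mirrors the sandbox. Git
      // refuses to fetch into the branch that is currently checked out.
      ['git', 'fetch', sandboxRemote, `+HEAD:refs/heads/${branch}`],
    ],
  };
}
```

Because the host only ever fetches, nothing from the local repository is pushed into the sandbox, and the forced refspec is what makes the local branch effectively read-only: any local commits on it are overwritten by the next sync.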

My experience building with OpenCode

Having spent some time with both the OpenCode plugin SDK and the OpenCode source code, I want to note down some of the things that were tricky:

Plugin development workflow: There isn’t a template for what a plugin’s code structure should look like, and adding multiple plugins via symlinks requires manually coding an index.ts. Ideally, you could just use file://path/to/plugin in your OpenCode config file.
Projects are not tied to the project path: OpenCode projects are tied to the git history inside them. If git is not initialized in a directory or the git history has no commits, OpenCode sessions will run in the “global” project. (If you later open OpenCode in this directory with a git history, sessions somehow move to a newly created project.) This is not intuitive for a new OpenCode user.
Reading the OpenCode config: My plugin needs some basic configuration like a Daytona API key. Currently I read this from an environment variable. This should be added to one of OpenCode’s configuration files, but I can’t figure out how to access the loaded configuration data from my plugin.
Plugin updates: All plugins are downloaded every time you run OpenCode. If there is a supply chain attack on a plugin, it will instantly affect all users.
Accessing the TUI: I was able to figure out how to push toast notifications to OpenCode by using IntelliSense, but I couldn’t extend the OpenCode interface further, for example by using a modal to ask the user a question.
Storing data: OpenCode has its own directory structure for storing data, but this isn’t documented. I had to read their source code to reimplement it for my sandbox-session mappings.
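For the configuration point above, reading a key from the environment can be as small as the sketch below. The helper readApiKey is hypothetical, and the variable name DAYTONA_API_KEY is an assumption used for illustration, not something confirmed by the plugin’s code.

```typescript
// An environment-like object, e.g. process.env.
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the trimmed API key or fails with a clear
// message. The default variable name is an assumption.
function readApiKey(env: Env, name: string = 'DAYTONA_API_KEY'): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}: set it in the environment before starting OpenCode`);
  }
  return value;
}
```

In a plugin you would call readApiKey(process.env) once at startup, so a missing key fails immediately instead of on the first tool call.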

Future developments

I’m still using this plugin to run secure coding jobs in parallel, and it’s working well for this! I can “fork” multiple sessions from the same code branch, and then test and merge their branches when they finish. Since they all run in separate sandboxes, there is no possibility of interference between them.

Similar parallel AI coding solutions have appeared recently, such as Superset, Conductor and sidecar. These all integrate with AI agents and allow parallel coding, but without safe isolation. One idea to explore would be to integrate code sandboxes with one of these tools.

Another idea would be to keep developing this plugin while also contributing improvements to OpenCode (addressing the points above) which would make the OpenCode experience better without needing to fork it in the long run.

Finally, the idea of synchronizing a development machine and a sandboxed agent directly using git is something fairly new, and I’d like to play with this more to make the implementation smoother and reusable for more agents.
