Related Guides
Desktop Template
Build desktop sandboxes with Ubuntu, XFCE, and VNC streaming
Connect LLMs
Integrate AI models with sandboxes using tool calling
Sandbox Lifecycle
Create, manage, and control sandbox lifecycle
Streaming
Stream stdout, stderr, and results in real-time
Read & Write Files
Manage files within the sandbox filesystem
API Key
Set up authentication for E2B sandboxes
Full source code available on GitHub.
Project Structure
How It Works
This application creates an autonomous AI loop that enables natural language control of a virtual Linux desktop:
- User Input - You send a natural language command like “Open Firefox and search for AI news”
- Sandbox Creation - E2B spins up an Ubuntu 22.04 desktop environment (if not already running)
- Visual Analysis - The AI receives a screenshot of the current desktop state
- Action Planning - OpenAI Computer Use API analyzes the screenshot and decides what action to take
- Action Execution - The action (click, type, scroll, etc.) is executed on the desktop via E2B SDK
- Feedback Loop - A new screenshot is taken and sent back to the AI
- Iteration - The loop continues until the task is complete (maximum 15 iterations)
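The steps above can be sketched as a bounded loop. This is a minimal outline rather than the project's actual implementation: `AgentStep`, `LoopDeps`, and `runComputerUseLoop` are hypothetical names, and the screenshot, planning, and execution steps are injected as plain functions so the sketch does not assume any particular E2B or OpenAI SDK call. Only the cap of 15 iterations comes from the description above.

```typescript
// Hypothetical result of one planning round: an action to run, or a signal that the task is finished.
type AgentStep =
  | { kind: "action"; description: string }
  | { kind: "done"; summary: string };

// Dependencies are injected so the loop stays independent of any specific SDK.
interface LoopDeps {
  takeScreenshot: () => Promise<string>; // assumed to return a base64-encoded image
  planNextStep: (task: string, screenshot: string) => Promise<AgentStep>;
  executeAction: (description: string) => Promise<void>;
}

const MAX_ITERATIONS = 15; // limit stated in the guide

async function runComputerUseLoop(
  task: string,
  deps: LoopDeps,
): Promise<{ iterations: number; completed: boolean }> {
  for (let i = 1; i <= MAX_ITERATIONS; i++) {
    // Visual analysis: capture the current desktop state.
    const screenshot = await deps.takeScreenshot();
    // Action planning: ask the model what to do next.
    const step = await deps.planNextStep(task, screenshot);
    if (step.kind === "done") {
      return { iterations: i, completed: true };
    }
    // Action execution; the next pass takes a fresh screenshot (feedback loop).
    await deps.executeAction(step.description);
  }
  // The cap was reached before the model reported completion.
  return { iterations: MAX_ITERATIONS, completed: false };
}
```

Returning `completed: false` when the cap is hit lets the caller tell the user that the task was cut off rather than finished.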
Implementation
Step 1: Project Setup
Initialize a new Next.js project and install the required dependencies.
Step 2: Environment Configuration
Set up your API keys and create environment validation utilities.
Create Environment File
Create a .env file in your project root.
Get your API keys:
- E2B API Key: https://e2b.dev/docs/api-key
- OpenAI API Key: https://platform.openai.com/api-keys
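An environment validation utility could look roughly like the sketch below. The variable names `E2B_API_KEY` and `OPENAI_API_KEY` are assumptions based on the two keys listed above, and `findMissingEnvVars` and `assertEnv` are hypothetical helper names, not the project's own.

```typescript
// Assumed variable names, based on the two keys listed in this step.
const REQUIRED_ENV_VARS = ["E2B_API_KEY", "OPENAI_API_KEY"] as const;

type RequiredEnvVar = (typeof REQUIRED_ENV_VARS)[number];

// Returns the names of required variables that are missing or blank.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
): RequiredEnvVar[] {
  return REQUIRED_ENV_VARS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

// Throws one descriptive error so a misconfigured deployment fails fast.
function assertEnv(env: Record<string, string | undefined>): void {
  const missing = findMissingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In a Next.js route handler, this would typically be called as `assertEnv(process.env)` before any sandbox or model call is made.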
Step 3: Type Definitions
Define TypeScript interfaces for type safety throughout the application. Create types/index.ts with core application types.
The ComputerAction discriminated union ensures type-safe action handling throughout the application.
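To illustrate what a discriminated union buys you here, the sketch below defines a reduced `ComputerAction` with only the three action kinds named earlier (click, type, scroll). The field names are assumptions for illustration; the project's real type likely has more variants and different fields.

```typescript
// Reduced, hypothetical version of the union: the `type` field is the discriminant.
type ComputerAction =
  | { type: "click"; x: number; y: number; button: "left" | "right" }
  | { type: "type"; text: string }
  | { type: "scroll"; direction: "up" | "down"; amount: number };

// Inside each case, TypeScript narrows `action` to the matching variant,
// so only the fields of that variant are accessible.
function describeAction(action: ComputerAction): string {
  switch (action.type) {
    case "click":
      return `${action.button} click at (${action.x}, ${action.y})`;
    case "type":
      return `type "${action.text}"`;
    case "scroll":
      return `scroll ${action.direction} by ${action.amount}`;
    default: {
      // Exhaustiveness check: adding a variant without a case is a compile error here.
      const unreachable: never = action;
      throw new Error(`Unhandled action: ${JSON.stringify(unreachable)}`);
    }
  }
}
```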
Step 4: Configuration Constants
Centralize all configuration values for easy management. Create lib/constants.ts with application-wide constants.
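A constants file along these lines would cover the values this guide refers to. `MAX_ITERATIONS` (15) and the 5-minute extension come from the text, and `TIMEOUT_MS` is named in the troubleshooting section; every other name and all values marked as placeholders are assumptions to adjust for your project.

```typescript
// Maximum number of screenshot/plan/act rounds per task (stated in this guide).
const MAX_ITERATIONS = 15;

// Initial sandbox lifetime. Placeholder value: the guide names TIMEOUT_MS but not its value.
const TIMEOUT_MS = 5 * 60 * 1000;

// Amount added when the user extends the session (the guide says 5 minutes).
// The constant name is hypothetical.
const TIMEOUT_EXTENSION_MS = 5 * 60 * 1000;

// Upper bounds for screenshots sent to the model. Placeholder values and hypothetical names.
const SCREENSHOT_MAX_WIDTH = 1024;
const SCREENSHOT_MAX_HEIGHT = 768;
```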
Step 5: Utility Functions
Build helper functions for screenshot processing, streaming, and action execution.
Screenshot Processing
Create lib/utils/screenshot.ts to optimize screenshots. This function resizes screenshots to optimal dimensions and converts them to base64 for API transmission.
Safe Stream Controller
Create lib/utils/stream.ts for SSE streaming. The safe stream controller prevents “already closed” errors during SSE streaming.
Step 6: AI System Prompt
Define the system instructions that guide the AI agent’s behavior. Create lib/ai/instructions.ts with the AI agent prompt.
Step 7: Computer Use Loop
Implement the core AI execution loop that powers desktop control. Create lib/services/openai.ts with the computer use loop. The loop:
- Takes screenshots of the desktop
- Sends them to OpenAI with context
- Receives structured computer actions
- Executes actions via E2B
- Repeats until the task is complete
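The "executes actions via E2B" step amounts to a dispatch from structured actions to desktop calls. The sketch below uses a local `DesktopLike` interface as a stand-in for the E2B Desktop sandbox: `leftClick` and `scroll` are the method names mentioned in the troubleshooting section, while `rightClick`, `write`, and all signatures shown are assumptions that should be checked against the current E2B Desktop SDK documentation. The `ComputerAction` shape is likewise a reduced, hypothetical one.

```typescript
// Reduced, hypothetical action type (see Step 3 for the real definitions).
type ComputerAction =
  | { type: "click"; x: number; y: number; button: "left" | "right" }
  | { type: "type"; text: string }
  | { type: "scroll"; direction: "up" | "down"; amount: number };

// Stand-in for the desktop sandbox. Method names and signatures are assumptions.
interface DesktopLike {
  leftClick(x: number, y: number): Promise<void>;
  rightClick(x: number, y: number): Promise<void>;
  write(text: string): Promise<void>;
  scroll(direction: "up" | "down", amount: number): Promise<void>;
}

// Maps one structured action onto the matching desktop call.
async function executeAction(
  desktop: DesktopLike,
  action: ComputerAction,
): Promise<void> {
  switch (action.type) {
    case "click":
      if (action.button === "right") {
        await desktop.rightClick(action.x, action.y);
      } else {
        await desktop.leftClick(action.x, action.y);
      }
      return;
    case "type":
      await desktop.write(action.text);
      return;
    case "scroll":
      await desktop.scroll(action.direction, action.amount);
      return;
  }
}
```

Keeping the dispatch behind a small interface also makes the loop testable with a fake desktop, without starting a sandbox.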
Step 8: API Endpoint
Create the backend API endpoint that orchestrates sandbox creation and AI execution. Create app/api/chat/route.ts for the SSE streaming endpoint. The endpoint:
- Validates environment and request
- Creates or reuses E2B sandboxes
- Starts VNC streaming
- Runs the computer use loop
- Streams events back to the client in real-time
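For the streaming part, the sketch below shows the Server-Sent Events wire format together with a guard that turns writes after close into no-ops, which is the idea behind the safe stream controller from Step 5. `ControllerLike`, `formatSseEvent`, and `createSafeStream` are hypothetical names. In a real route handler, the controller would come from a `ReadableStream` and chunks would usually be encoded to bytes first; plain strings are used here to keep the example self-contained.

```typescript
// Minimal subset of a stream controller, using string chunks for simplicity.
interface ControllerLike {
  enqueue(chunk: string): void;
  close(): void;
}

// Serializes one event in SSE wire format: a "data:" line followed by a blank line.
function formatSseEvent(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Wraps a controller so that sending or closing after close does not throw.
function createSafeStream(controller: ControllerLike) {
  let closed = false;
  return {
    // Returns true if the event was written, false if the stream is closed.
    send(payload: unknown): boolean {
      if (closed) return false;
      try {
        controller.enqueue(formatSseEvent(payload));
        return true;
      } catch {
        // The underlying stream was closed elsewhere, e.g. the client disconnected.
        closed = true;
        return false;
      }
    },
    close(): void {
      if (closed) return;
      closed = true;
      try {
        controller.close();
      } catch {
        // Already closed by the runtime; nothing left to do.
      }
    },
  };
}
```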
Step 9: Server Actions
Add Next.js server actions for sandbox management from the client. Create app/actions.ts for server-side operations:
- Extend timeout: Add 5 more minutes to prevent sandbox from expiring
- Stop sandbox: Immediately terminate and clean up resources
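The two operations above could be sketched as follows. `SandboxControl` is a local stand-in interface; the `setTimeout(timeoutMs)` and `kill()` methods are modeled on the E2B sandbox API, but their exact signatures are an assumption here, as is the premise that `setTimeout` sets a new lifetime measured from the moment of the call. Under that premise, "adding" 5 minutes means passing the remaining time plus the extension.

```typescript
// Stand-in for the sandbox handle. Method names and semantics are assumptions.
interface SandboxControl {
  setTimeout(timeoutMs: number): Promise<void>;
  kill(): Promise<void>;
}

const TIMEOUT_EXTENSION_MS = 5 * 60 * 1000; // 5 minutes, as stated above

// Adds the extension on top of whatever time is left and returns the new expiry timestamp.
async function extendSandboxTimeout(
  sandbox: SandboxControl,
  expiresAt: number,
  now: number = Date.now(),
): Promise<number> {
  const remainingMs = Math.max(0, expiresAt - now);
  const newTimeoutMs = remainingMs + TIMEOUT_EXTENSION_MS;
  await sandbox.setTimeout(newTimeoutMs);
  return now + newTimeoutMs;
}

// Terminates the sandbox immediately so it stops consuming resources.
async function stopSandbox(sandbox: SandboxControl): Promise<void> {
  await sandbox.kill();
}
```

In an actual app/actions.ts, these functions would be exported from a file that starts with the `"use server"` directive and would look up the sandbox by an ID sent from the client.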
Step 10: Chat Interface
Create app/page.tsx with a chat interface, real-time status tracking, VNC viewer, and countdown timer with timeout management.
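As one small piece of that page, the countdown display needs to turn remaining milliseconds into a label. The helper below is a hypothetical sketch (`formatCountdown` is not taken from the project) that rounds up to whole seconds and clamps at zero, so the timer never shows a negative value.

```typescript
// Formats remaining time as "m:ss", rounding up so the display reaches 0:00 only at expiry.
function formatCountdown(remainingMs: number): string {
  const totalSeconds = Math.max(0, Math.ceil(remainingMs / 1000));
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  const paddedSeconds = seconds < 10 ? `0${seconds}` : `${seconds}`;
  return `${minutes}:${paddedSeconds}`;
}
```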
View complete app/page.tsx implementation
Step 11: Styling
Create styles/globals.css with E2B branding, animations, and responsive design.
View complete styles/globals.css implementation
Update app/layout.tsx to import the global styles and configure metadata.
Step 12: Running the Application
Start the development server and test your AI desktop control application.
Start Development Server
Troubleshooting
Sandbox creation is slow
Sandbox creation typically takes 5-10 seconds on the first request. This is normal. Subsequent requests reuse the sandbox and are instant (if using a long-running server).
To improve perceived performance:
- Show a loading indicator during sandbox creation
- Pre-create sandboxes for expected users
- Use E2B’s template system to pre-install dependencies
Sandboxes not being reused
If you’re deploying to serverless platforms (Vercel, Netlify), the in-memory Map won’t persist between requests. Solutions:
- Deploy to a long-running Node.js server (Railway, Render)
- Use a distributed cache (Redis, Vercel KV) to store sandbox IDs
- Create a new sandbox for each request (simpler but slower)
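One way to keep the reuse logic independent of where sandbox IDs live is to put the storage behind a small interface. In the sketch below, `SandboxIdStore`, `InMemorySandboxIdStore`, and `getOrCreateSandboxId` are hypothetical names. The in-memory version mirrors the Map-based approach, and a Redis or Vercel KV implementation would provide the same three methods.

```typescript
// Storage abstraction: async so that a network-backed cache fits the same shape.
interface SandboxIdStore {
  get(sessionId: string): Promise<string | undefined>;
  set(sessionId: string, sandboxId: string): Promise<void>;
  remove(sessionId: string): Promise<void>;
}

// Works on a long-running server; on serverless platforms the Map is lost between invocations.
class InMemorySandboxIdStore implements SandboxIdStore {
  private readonly ids = new Map<string, string>();

  async get(sessionId: string): Promise<string | undefined> {
    return this.ids.get(sessionId);
  }

  async set(sessionId: string, sandboxId: string): Promise<void> {
    this.ids.set(sessionId, sandboxId);
  }

  async remove(sessionId: string): Promise<void> {
    this.ids.delete(sessionId);
  }
}

// Returns a stored sandbox ID if present; otherwise creates one and remembers it.
async function getOrCreateSandboxId(
  store: SandboxIdStore,
  sessionId: string,
  createSandbox: () => Promise<string>,
): Promise<{ sandboxId: string; reused: boolean }> {
  const existing = await store.get(sessionId);
  if (existing !== undefined) {
    return { sandboxId: existing, reused: true };
  }
  const sandboxId = await createSandbox();
  await store.set(sessionId, sandboxId);
  return { sandboxId, reused: false };
}
```

A stored ID can outlive its sandbox, so a production version would also call `remove` and create a fresh sandbox when reconnecting to a stored ID fails.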
OpenAI API errors
“Invalid model” or “Model not found”: The computer-use-preview model may have changed names or require beta access. Check the OpenAI documentation for the current model name.
API structure errors: The OpenAI Computer Use API may have changed. Verify the current API structure in the official docs.
E2B Desktop SDK errors
“Method not found”: Verify the method names (leftClick, scroll, etc.) against the latest E2B Desktop SDK documentation.
Stream not starting: Ensure you’re calling await sandbox.stream.start() before accessing sandbox.stream.getUrl().
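The ordering requirement can be enforced by wrapping both calls in one helper, so callers cannot ask for the URL before the stream has started. `SandboxWithStream` below is a local stand-in that models only the two calls named above, and `startStreamAndGetUrl` is a hypothetical helper name. The real SDK types may differ.

```typescript
// Stand-in for the part of the desktop sandbox used here; shapes are assumptions.
interface SandboxWithStream {
  stream: {
    start(): Promise<void>;
    getUrl(): string | Promise<string>;
  };
}

// Starts the VNC stream first, then reads the URL, keeping the two steps in the required order.
async function startStreamAndGetUrl(sandbox: SandboxWithStream): Promise<string> {
  await sandbox.stream.start();
  return await sandbox.stream.getUrl();
}
```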
Screenshot quality issues
Adjust the screenshot dimensions in lib/constants.ts.
Note: Larger screenshots increase API costs and latency.
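When changing the maximum dimensions, it helps to see how a captured frame maps to the size that is actually sent. The sketch below computes target dimensions that fit inside a bounding box while preserving aspect ratio and never upscaling. `Dimensions` and `fitWithin` are hypothetical names, the numbers in the usage note are example values only, and the resizing itself would be done by an image library.

```typescript
interface Dimensions {
  width: number;
  height: number;
}

// Scales `source` down to fit inside `max`, keeping the aspect ratio; smaller images are left as-is.
function fitWithin(source: Dimensions, max: Dimensions): Dimensions {
  const scale = Math.min(1, max.width / source.width, max.height / source.height);
  return {
    width: Math.max(1, Math.round(source.width * scale)),
    height: Math.max(1, Math.round(source.height * scale)),
  };
}
```

For example, a 1920x1080 capture with a 1024x768 limit comes out as 1024x576, which cuts the pixel count to under a third of the original.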
Timeout errors
If sandboxes are timing out too quickly:
- Increase TIMEOUT_MS in lib/constants.ts
- Implement automatic timeout extension during active work
- Add user controls to manually extend timeouts
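Automatic extension during active work comes down to a small policy check that runs on each loop iteration or on a timer. The sketch below is one possible policy, not the project's: `TimeoutState`, `shouldAutoExtend`, and the 60-second threshold are all assumptions.

```typescript
// Hypothetical snapshot of what the server knows about a sandbox session.
interface TimeoutState {
  expiresAt: number; // epoch milliseconds at which the sandbox will be killed
  agentIsWorking: boolean; // true while the computer use loop is running
}

const AUTO_EXTEND_THRESHOLD_MS = 60 * 1000; // placeholder threshold

// Extend only when the agent is mid-task and the sandbox is close to expiring.
function shouldAutoExtend(
  state: TimeoutState,
  now: number,
  thresholdMs: number = AUTO_EXTEND_THRESHOLD_MS,
): boolean {
  if (!state.agentIsWorking) return false;
  const remainingMs = state.expiresAt - now;
  return remainingMs > 0 && remainingMs <= thresholdMs;
}
```

Restricting the extension to active work keeps idle sandboxes from being kept alive indefinitely, while the manual control remains available for users who want more time.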