web-browser
$ npx mdskill add megalithic/dotfiles/web-browser
Automates web page interactions by connecting to a running browser with authenticated sessions via agent-browser CLI.
- Enables agents to perform browser tasks without manual intervention, such as accessing logged-in accounts.
- Integrates with agent-browser CLI and requires a browser running on port 9222 for remote debugging.
- Decides actions by checking for existing tabs to avoid disrupting user work and opens new ones only when necessary.
- Presents results through command-line outputs and maintains session continuity by reusing authenticated browser connections.
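The port requirement above can be checked before connecting. The following is a minimal sketch, assuming `lsof` is installed and that `browser connect <port>` behaves as the SKILL.md describes; `connect_browser` is a hypothetical helper name, not part of agent-browser.

```bash
# connect_browser [port]
# Hypothetical helper: connect only when something is listening on the
# remote debugging port (defaults to 9222, the port used in this skill).
connect_browser() {
  local port="${1:-9222}"
  if lsof -i ":${port}" -sTCP:LISTEN >/dev/null 2>&1; then
    browser connect "$port"
  else
    echo "Nothing is listening on port ${port}; start the browser with --remote-debugging-port=${port}" >&2
    return 1
  fi
}
```

Calling `connect_browser` at the start of a session returns non-zero when no listener is found, so a script can stop instead of continuing in an isolated session without logins.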
SKILL.md
.github/skills/web-browser
---
name: web-browser
description: "Interact with web pages using agent-browser CLI. MUST run 'browser connect 9222' FIRST to use existing browser with authenticated sessions."
---

# Web Browser Skill

Browser automation using `agent-browser` CLI connected to your running browser.

## 🚨 MANDATORY FIRST STEP

**EVERY browser session MUST start with:**

```bash
browser connect 9222
```

This connects to your running browser with all authenticated sessions (Asana, Figma, GitHub, etc.).

**WITHOUT THIS STEP:**

- Commands will fail or timeout
- You'll get isolated sessions without logins
- User will have to re-authenticate everything

## ⚠️ CRITICAL REQUIREMENTS

### 1. ALWAYS connect to port 9222 FIRST

Before ANY browser operation, you MUST connect to the remote debugging port:

```bash
browser connect 9222
```

This is REQUIRED for accessing authenticated sessions. Without this step, commands will fail or create isolated sessions without your logins.

### 2. NEVER take over existing tabs

When navigating to a URL:

- First check if tab already exists: `browser tab list`
- If found, switch to it: `browser tab <index>`
- If NOT found, open a NEW tab: `browser open <url>`

**NEVER navigate an existing tab to a different URL** - this destroys the user's work/context.

## Correct workflow

```bash
# 1. ALWAYS connect first (required every session)
browser connect 9222

# 2. Check for existing tab
browser tab list

# 3a. If tab exists for your URL, switch to it
browser tab 14

# 3b. If tab doesn't exist, open NEW tab
browser open https://app.asana.com/...

# 4. Interact
browser snapshot -i
browser click @e5
```

## Check if browser is listening

```bash
lsof -i :9222 -sTCP:LISTEN
```

## Common commands

After connecting, use standard agent-browser commands:

### Navigation & tabs

```bash
browser tab list                    # List all tabs
browser tab 14                      # Switch to tab by index
browser open https://example.com    # Open URL (NEW tab)
browser back                        # Go back
browser reload                      # Reload page
```

### Inspection

```bash
browser snapshot -i                 # Get interactive elements with @refs
browser screenshot                  # Take screenshot
browser get title                   # Get page title
browser get url                     # Get current URL
browser get text @e1                # Get text of element
```

### Interaction

```bash
browser click @e1                   # Click element
browser fill @e2 "search text"      # Clear and type
browser type @e3 "append text"      # Type without clearing
browser select @e4 "option"         # Select dropdown
browser press Enter                 # Press key
browser scroll down 500             # Scroll
```

### Waiting

```bash
browser wait @e1                    # Wait for element
browser wait 2000                   # Wait milliseconds
```

## Tab targeting by URL

Instead of remembering tab numbers, find tabs by URL:

```bash
browser tab list | rg -i asana
browser tab list | rg -i localhost:4000
```

## Notes

- Tabs are numbered by CDP, not visual order in browser
- `snapshot -i` gives @refs like @e1, @e2 for clicking
- After page changes (navigation, clicks), re-run `snapshot -i`
- Your browser must be running with `--remote-debugging-port=9222`
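
The "never take over existing tabs" rule and the URL lookup above can be combined into one helper. This is a minimal sketch, assuming `browser tab list` prints one tab per line with its URL (the exact output format is not specified here, so the sketch prints matching lines rather than parsing an index). `open_unless_present` is a hypothetical name, and `grep` stands in for `rg` for portability.

```bash
# open_unless_present <url> <pattern>
# Hypothetical helper: open <url> in a NEW tab only when no listed tab
# matches <pattern>; otherwise print the matching tab lines so the caller
# can switch with `browser tab <index>`.
open_unless_present() {
  local url="$1" pattern="$2" matches
  # -F treats the pattern as a fixed string, so dots in URLs are literal.
  # grep exits non-zero on no match; "|| true" keeps that from aborting
  # a script running under `set -e`.
  matches="$(browser tab list | grep -i -F -- "$pattern" || true)"
  if [ -n "$matches" ]; then
    echo "Tab already open; switch with 'browser tab <index>':"
    echo "$matches"
  else
    browser open "$url"
  fi
}
```

A call such as `open_unless_present https://app.asana.com/... asana` either lists the existing Asana tabs or opens a new one. It never navigates an existing tab to a different URL.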
More from megalithic/dotfiles
- brave-search: Web search and content extraction via Brave Search API. Use for searching documentation, facts, or any web content. Lightweight, no browser required.
- cli-tools: Modern CLI tool usage (fd, rg) for fast file and content searching. Critical for Nix store searches and large codebases. Use when searching files or content, especially in /nix/store.
- hs: Comprehensive guide for Hammerspoon development in this dotfiles repo. Covers config patterns, debugging decision trees, API reference, performance monitoring, and troubleshooting.
- image-handling: Image handling for Claude API constraints (5MB max, 8000px max dimension). Use when working with images, screenshots, or MCP browser tools.
- jj: Jujutsu (jj) version control workflow, commands, and best practices. Use when working with version control in jj-enabled repos. Covers commits, bookmarks, workspaces, and safe push patterns.
- nix: Expert help with Nix, nix-darwin, home-manager, flakes, and nixpkgs. Use for dotfiles configuration, package management, module development, hash fetching, debugging evaluation errors, and understanding Nix idioms and patterns.
- notes: Expert help with the meganote system - cross-tool note capture, daily notes, and obsidian.nvim integration. Covers Hammerspoon, Shade, nvim, and the full capture → daily note linking pipeline.
- nvim: Comprehensive guide for Neovim configuration in this dotfiles repo. Covers plugin management, LSP debugging, treesitter, keymaps, performance, and troubleshooting decision trees.
- preview: Display code, diffs, images, and other content in a tmux pane or popup. Auto-detects nvim/megaterm for floating popups.
- shade: Expert help with Shade - the native Swift note capture app. Use for debugging Shade issues, understanding IPC protocols, implementing Hammerspoon integration, nvim RPC, context gathering, and meganote workflows.