Agent Skills

Install MuAPI Agent Skills to generate AI images, videos, and audio directly from coding agents like Claude Code, Cursor, Gemini CLI, and other MCP-compatible tools.

Overview

MuAPI Agent Skills are pre-built, schema-driven shell scripts that give your AI coding agent the ability to generate, edit, and enhance rich media — all from the terminal.

  • Pre-built skills for image, video, audio, effects, and more
  • Dynamic schema resolution — scripts auto-resolve the latest models, endpoints, and valid parameters from schema_data.json
  • Direct Media Display — use the --view flag to download and open generated media instantly
  • Local File Support — scripts automatically upload local images, videos, faces, and audio to the MuAPI CDN
  • 100+ AI models — Midjourney v7, Flux Pro, Kling 3.0, Veo3, Suno V5, and more

Quick Start

1. Install Skills

# Install all MuAPI skills via the skills registry
npx skills add SamurAIGPT/Generative-Media-Skills --all

# Or install a specific skill
npx skills add SamurAIGPT/Generative-Media-Skills --skill muapi-media-generation

# List available skills
npx skills add SamurAIGPT/Generative-Media-Skills --list

2. Configure Your API Key

# Get your key at https://muapi.ai/dashboard
bash core/platform/setup.sh --add-key "YOUR_MUAPI_KEY"

3. Generate Your First Image

bash core/media/generate-image.sh \
  --prompt "a cyberpunk cityscape at dusk, neon reflections on rain-slicked streets" \
  --model flux-dev \
  --view

4. Generate Your First Video

bash core/media/generate-video.sh \
  --prompt "A timelapse of a city skyline transitioning from day to night" \
  --model minimax-pro \
  --duration 5 \
  --view

Available Skills

MuAPI Skills are organized into three categories: Generation, Editing & Enhancement, and Platform utilities.

Generation Skills

| Skill | Script | Description |
| --- | --- | --- |
| Text-to-Image | core/media/generate-image.sh | Generate images from text prompts. Supports Flux Dev, Midjourney v7, and 50+ models. |
| Text-to-Video | core/media/generate-video.sh | Generate videos from text prompts. Supports Minimax Pro, Kling, Veo3, and more. |
| Image-to-Video | core/media/image-to-video.sh | Animate a static image into a video. Supports Kling Pro, Veo3, Runway, and 15+ models. |
| Music & Audio | core/media/create-music.sh | Create music (Suno V5), sound effects (MMAudio), remix, and extend tracks. |
| File Upload | core/media/upload.sh | Upload local files (images, videos, audio) and get a CDN URL for use with other skills. |

Editing & Enhancement Skills

| Skill | Script | Description |
| --- | --- | --- |
| Image Editing | core/edit/edit-image.sh | Prompt-based image editing with Flux Kontext, GPT-4o, Midjourney, and more. |
| Image Enhancement | core/edit/enhance-image.sh | One-click operations: upscale, background removal, face swap, colorize, Ghibli style, product shots, and more. |
| Lipsync | core/edit/lipsync.sh | Sync video lip movement to an audio track. Models: Sync Labs, LatentSync, Creatify, Veed. |
| Video Effects | core/edit/video-effects.sh | Apply effects to videos/images: Wan AI effects, face swap, dance animation, dress change, Luma modify/reframe. |

Platform Skills

| Skill | Script | Description |
| --- | --- | --- |
| Setup | core/platform/setup.sh | Configure API key, show config, and test key validity. |
| Check Result | core/platform/check-result.sh | Poll for async generation results by request ID. Supports one-shot and polling modes. |
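Conceptually, the polling mode behaves like the loop below. This is a runnable sketch, not the script itself: check_result is a stub standing in for check-result.sh, and its JSON shape, the --request-id flag, and the "completed" status value are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch of the polling pattern behind check-result.sh.
# The stub stands in for the real script; only the loop structure is load-bearing.
check_result() {
  # A real call would look like (flags assumed):
  #   bash core/platform/check-result.sh --request-id "$1" --json
  echo '{"status":"completed","url":"https://cdn.muapi.ai/example.mp4"}'
}

request_id="req_123"  # hypothetical ID returned by an --async submission
for attempt in 1 2 3 4 5; do
  response=$(check_result "$request_id")
  status=$(printf '%s' "$response" | sed -n 's/.*"status":"\([^"]*\)".*/\1/p')
  [ "$status" = "completed" ] && break
  sleep 1
done
echo "final status: $status"
```

One-shot mode is the same idea without the loop: a single call that reports the current status and exits.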

How Skills Work

Each skill is a self-contained shell script powered by schema_data.json for dynamic model and endpoint resolution:

muapi-skills/
├── core/
│   ├── media/           # Generation primitives
│   │   ├── generate-image.sh
│   │   ├── generate-video.sh
│   │   ├── image-to-video.sh
│   │   ├── create-music.sh
│   │   └── upload.sh
│   ├── edit/            # Editing & enhancement
│   │   ├── edit-image.sh
│   │   ├── enhance-image.sh
│   │   ├── lipsync.sh
│   │   └── video-effects.sh
│   └── platform/        # Setup & polling
│       ├── setup.sh
│       └── check-result.sh
└── schema_data.json     # Dynamic model registry

Key design principles:

  • Scripts output clean JSON for seamless agentic integration
  • All scripts support --async for fire-and-forget workflows
  • The --view flag auto-downloads and opens generated media
  • schema_data.json validates models, resolves endpoints, and checks parameters at runtime
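Because every script emits clean JSON, an agent can capture the ID from an --async submission with ordinary shell tooling. The sketch below uses a canned response in place of a live call; the request_id field name matches the flag table above, but the exact output shape is an assumption.

```shell
#!/usr/bin/env bash
# Fire-and-forget sketch: submit with --async, keep the request_id for later polling.
# A canned response stands in for the live call shown in the comment.
submit_output='{"request_id":"req_abc123","status":"queued"}'
# Live equivalent (output shape assumed):
#   submit_output=$(bash core/media/generate-video.sh --prompt "..." --async --json)

request_id=$(printf '%s' "$submit_output" | sed -n 's/.*"request_id":"\([^"]*\)".*/\1/p')
echo "$request_id"
```

The extracted ID can then be handed to core/platform/check-result.sh whenever the agent is ready to collect the output.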

Local File Uploads

Most editing and media generation scripts now support a "File-First" workflow. Instead of providing a public URL, you can provide a local file path (e.g., --file ./photo.jpg or --face-file ./face.png).

The scripts will:

  1. Automatically upload the local file to the MuAPI CDN.
  2. Parse the resulting URL.
  3. Submit the generation/editing task using that URL.
  4. Clean up any temporary state (if applicable).

This lets your AI agent work entirely with local files while still leveraging cloud-hosted AI models — no manual uploading required.
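The same flow can be done by hand: call upload.sh yourself and pass the returned URL to a downstream skill. The sketch below stubs the upload step so it runs anywhere; the JSON shape and the downstream --image-url flag are assumptions, not documented interfaces.

```shell
#!/usr/bin/env bash
# Manual version of the file-first flow: upload, extract the CDN URL, reuse it.
# The stub stands in for upload.sh; its output shape is an assumption.
upload() {
  # A real call would look like: bash core/media/upload.sh --file "$1" --json
  echo '{"url":"https://cdn.muapi.ai/uploads/photo.jpg"}'
}

url=$(upload ./photo.jpg | sed -n 's/.*"url":"\([^"]*\)".*/\1/p')
echo "uploaded to: $url"
# The URL can now feed any other skill, e.g. (flag name assumed):
#   bash core/edit/edit-image.sh --image-url "$url" --prompt "remove the background"
```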

Compatibility

Optimized for the next generation of AI development environments:

  • Claude Code — Direct terminal execution via tools
  • Gemini CLI / Cursor / Windsurf — Seamless integration as local scripts
  • MCP — Each skill is Model Context Protocol-ready for universal agent usage

Common Flags

All core scripts share a consistent set of flags:

| Flag | Description |
| --- | --- |
| --prompt, -p | Text description for generation |
| --model, -m | Model name (script-specific defaults) |
| --async | Return request_id immediately without waiting |
| --view | Download and open the result (macOS) |
| --file, --image-file, etc. | Local file paths for automatic upload (script-specific) |
| --json | Raw JSON output only (for piping) |
| --timeout N | Max wait in seconds before timing out |
| --help, -h | Show usage and available options |
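These flags compose: a typical non-interactive invocation combines --prompt, --model, --timeout, and --json. The sketch below builds such a command in a bash array and prints it rather than executing it, since a live run needs the installed skills and a configured key.

```shell
#!/usr/bin/env bash
# Compose a non-interactive call from the shared flags (printed, not executed).
cmd=(bash core/media/generate-image.sh
  --prompt "studio photo of a ceramic mug"
  --model flux-dev
  --timeout 120
  --json)
printf '%s ' "${cmd[@]}"
echo
```

Using an array avoids quoting bugs when prompts contain spaces, and the --json output can be piped straight into whatever the agent uses to parse results.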

Next Steps

  1. Get an API key at muapi.ai/dashboard
  2. Configure your agent — bash core/platform/setup.sh --add-key "YOUR_KEY"
  3. Start generating — Try the Quick Start examples above
  4. Explore models — Browse the Playground to discover 100+ available models