Agent Skills

Install MuAPI Agent Skills to generate AI images, videos, and audio directly from coding agents like Claude Code, Cursor, Gemini CLI, and other MCP-compatible tools.

Overview

MuAPI Agent Skills are pre-built, schema-driven shell scripts that give your AI coding agent the ability to generate, edit, and enhance rich media — all from the terminal.

  • Pre-built skills for image, video, audio, effects, and more
  • Dynamic schema resolution — scripts auto-resolve the latest models, endpoints, and valid parameters from schema_data.json
  • Direct Media Display — use the --view flag to download and open generated media instantly
  • Local File Support — scripts automatically upload local images, videos, faces, and audio to the MuAPI CDN
  • 100+ AI models — Midjourney v7, Flux Pro, Kling 3.0, Veo3, Suno V5, and more

Quick Start

1. Install Skills

# Install all MuAPI skills via the skills registry
npx skills add SamurAIGPT/Generative-Media-Skills --all

# Or install a specific skill
npx skills add SamurAIGPT/Generative-Media-Skills --skill muapi-media-generation

# List available skills
npx skills add SamurAIGPT/Generative-Media-Skills --list

2. Configure Your API Key

# Get your key at https://muapi.ai/dashboard
bash core/platform/setup.sh --add-key "YOUR_MUAPI_KEY"

3. Generate Your First Image

bash core/media/generate-image.sh \
  --prompt "a cyberpunk cityscape at dusk, neon reflections on rain-slicked streets" \
  --model flux-dev \
  --view

4. Generate Your First Video

bash core/media/generate-video.sh \
  --prompt "A timelapse of a city skyline transitioning from day to night" \
  --model minimax-pro \
  --duration 5 \
  --view

Available Skills

MuAPI Skills are organized into three categories: Generation, Editing & Enhancement, and Platform utilities.

Generation Skills

| Skill | Script | Description |
| --- | --- | --- |
| Text-to-Image | core/media/generate-image.sh | Generate images from text prompts. Supports Flux Dev, Midjourney v7, and 50+ models. |
| Text-to-Video | core/media/generate-video.sh | Generate videos from text prompts. Supports Minimax Pro, Kling, Veo3, and more. |
| Image-to-Video | core/media/image-to-video.sh | Animate a static image into a video. Supports Kling Pro, Veo3, Runway, and 15+ models. |
| Music & Audio | core/media/create-music.sh | Create music (Suno V5), sound effects (MMAudio), remix, and extend tracks. |
| File Upload | core/media/upload.sh | Upload local files (images, videos, audio) and get a CDN URL for use with other skills. |

Editing & Enhancement Skills

| Skill | Script | Description |
| --- | --- | --- |
| Image Editing | core/edit/edit-image.sh | Prompt-based image editing with Flux Kontext, GPT-4o, Midjourney, and more. |
| Image Enhancement | core/edit/enhance-image.sh | One-click operations: upscale, background removal, face swap, colorize, Ghibli style, product shots, and more. |
| Lipsync | core/edit/lipsync.sh | Sync video lip movement to an audio track. Models: Sync Labs, LatentSync, Creatify, Veed. |
| Video Effects | core/edit/video-effects.sh | Apply effects to videos/images: Wan AI effects, face swap, dance animation, dress change, Luma modify/reframe. |

Platform Skills

| Skill | Script | Description |
| --- | --- | --- |
| Setup | core/platform/setup.sh | Configure API key, show config, and test key validity. |
| Check Result | core/platform/check-result.sh | Poll for async generation results by request ID. Supports one-shot and polling modes. |
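Conceptually, the polling mode behaves like the loop below. This is a runnable sketch, not the script itself: check_result is a stub standing in for check-result.sh, and its JSON shape, the --request-id flag, and the "completed" status value are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch of the polling pattern behind check-result.sh.
# The stub stands in for the real script; only the loop structure is load-bearing.
check_result() {
  # A real call would look like (flags assumed):
  #   bash core/platform/check-result.sh --request-id "$1" --json
  echo '{"status":"completed","url":"https://cdn.muapi.ai/example.mp4"}'
}

request_id="req_123"  # hypothetical ID returned by an --async submission
for attempt in 1 2 3 4 5; do
  response=$(check_result "$request_id")
  status=$(printf '%s' "$response" | sed -n 's/.*"status":"\([^"]*\)".*/\1/p')
  [ "$status" = "completed" ] && break
  sleep 1
done
echo "final status: $status"
```

One-shot mode is the same idea without the loop: a single call that reports the current status and exits.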

How Skills Work

Each skill is a self-contained shell script powered by schema_data.json for dynamic model and endpoint resolution:

muapi-skills/
├── core/
│   ├── media/           # Generation primitives
│   │   ├── generate-image.sh
│   │   ├── generate-video.sh
│   │   ├── image-to-video.sh
│   │   ├── create-music.sh
│   │   └── upload.sh
│   ├── edit/            # Editing & enhancement
│   │   ├── edit-image.sh
│   │   ├── enhance-image.sh
│   │   ├── lipsync.sh
│   │   └── video-effects.sh
│   └── platform/        # Setup & polling
│       ├── setup.sh
│       └── check-result.sh
└── schema_data.json     # Dynamic model registry

Key design principles:

  • Scripts output clean JSON for seamless agentic integration
  • All scripts support --async for fire-and-forget workflows
  • The --view flag auto-downloads and opens generated media
  • schema_data.json validates models, resolves endpoints, and checks parameters at runtime
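Because every script emits clean JSON, an agent can capture the ID from an --async submission with ordinary shell tooling. The sketch below uses a canned response in place of a live call; the request_id field name matches the flag table above, but the exact output shape is an assumption.

```shell
#!/usr/bin/env bash
# Fire-and-forget sketch: submit with --async, keep the request_id for later polling.
# A canned response stands in for the live call shown in the comment.
submit_output='{"request_id":"req_abc123","status":"queued"}'
# Live equivalent (output shape assumed):
#   submit_output=$(bash core/media/generate-video.sh --prompt "..." --async --json)

request_id=$(printf '%s' "$submit_output" | sed -n 's/.*"request_id":"\([^"]*\)".*/\1/p')
echo "$request_id"
```

The extracted ID can then be handed to core/platform/check-result.sh whenever the agent is ready to collect the output.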

Local File Uploads

Most editing and media generation scripts now support a "File-First" workflow. Instead of providing a public URL, you can provide a local file path (e.g., --file ./photo.jpg or --face-file ./face.png).

The scripts will:

  1. Automatically upload the local file to the MuAPI CDN.
  2. Parse the resulting URL.
  3. Submit the generation/editing task using that URL.
  4. Clean up any temporary state (if applicable).

This lets your AI agent work entirely with local files while still leveraging cloud-hosted AI models — no manual uploading required.
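The same flow can be done by hand: call upload.sh yourself and pass the returned URL to a downstream skill. The sketch below stubs the upload step so it runs anywhere; the JSON shape and the downstream --image-url flag are assumptions, not documented interfaces.

```shell
#!/usr/bin/env bash
# Manual version of the file-first flow: upload, extract the CDN URL, reuse it.
# The stub stands in for upload.sh; its output shape is an assumption.
upload() {
  # A real call would look like: bash core/media/upload.sh --file "$1" --json
  echo '{"url":"https://cdn.muapi.ai/uploads/photo.jpg"}'
}

url=$(upload ./photo.jpg | sed -n 's/.*"url":"\([^"]*\)".*/\1/p')
echo "uploaded to: $url"
# The URL can now feed any other skill, e.g. (flag name assumed):
#   bash core/edit/edit-image.sh --image-url "$url" --prompt "remove the background"
```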

Compatibility

Optimized for the next generation of AI development environments:

  • Claude Code — Direct terminal execution via tools
  • Gemini CLI / Cursor / Windsurf — Seamless integration as local scripts
  • MCP — Each skill is Model Context Protocol-ready for universal agent usage

Common Flags

All core scripts share a consistent set of flags:

| Flag | Description |
| --- | --- |
| --prompt, -p | Text description for generation |
| --model, -m | Model name (script-specific defaults) |
| --async | Return request_id immediately without waiting |
| --view | Download and open the result (macOS) |
| --file, --image-file, etc. | Local file paths for automatic upload (script-specific) |
| --json | Raw JSON output only (for piping) |
| --timeout N | Max wait in seconds before timing out |
| --help, -h | Show usage and available options |
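These flags compose: a typical non-interactive invocation combines --prompt, --model, --timeout, and --json. The sketch below builds such a command in a bash array and prints it rather than executing it, since a live run needs the installed skills and a configured key.

```shell
#!/usr/bin/env bash
# Compose a non-interactive call from the shared flags (printed, not executed).
cmd=(bash core/media/generate-image.sh
  --prompt "studio photo of a ceramic mug"
  --model flux-dev
  --timeout 120
  --json)
printf '%s ' "${cmd[@]}"
echo
```

Using an array avoids quoting bugs when prompts contain spaces, and the --json output can be piped straight into whatever the agent uses to parse results.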

Next Steps

  1. Get an API key at muapi.ai/dashboard
  2. Configure your agent — bash core/platform/setup.sh --add-key "YOUR_KEY"
  3. Start generating — Try the Quick Start examples above
  4. Explore models — Browse the Playground to discover 100+ available models