ACE-Step 1.5

ACE-Step 1.5 is an open-source music generation foundation model with commercial-grade quality. It features a hybrid LM + DiT architecture that supports text-to-music, lyrics-to-song, audio repainting, covers, and LoRA fine-tuning. It requires a GPU with at least 6 GB of VRAM.

Deployed: 0 times
Publisher: DumoeDss
Created: 2026-03-21
Services
Minimum: 4 Cores, 16 GB
Recommended: 8 Cores, 32 GB
Tags
AI, Music, GPU

ACE-Step 1.5 is an open-source music generation foundation model developed by StepFun & ACE Studio. It combines a Language Model (LM) planner with a Diffusion Transformer (DiT) to produce commercial-grade music.

Features

  • Text-to-Music: Generate music from text descriptions (genre, mood, BPM, key, etc.)
  • Lyrics-to-Song: Generate songs from structured lyrics with timestamps
  • Audio Repainting: Edit specific sections of existing audio
  • Audio Continuation: Extend existing audio clips
  • LoRA Training: Fine-tune with your own audio data

Startup Mode

Set ACESTEP_MODE to choose the service to launch:

| Mode | Port | Description |
|------|------|-------------|
| gradio (default) | 7860 | Interactive Gradio Web UI for music generation |
| api | 8001 | REST API server for programmatic access (/generate, /health) |
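
As a rough illustration of how a client script might work out where the service is listening, the sketch below maps ACESTEP_MODE to the ports listed above. The helper name `resolve_service` and the `localhost` host are assumptions made for this example; they are not part of ACE-Step itself.

```python
import os
from typing import Mapping, Optional, Tuple

# Mode -> port, copied from the Startup Mode table.
MODE_PORTS = {"gradio": 7860, "api": 8001}


def resolve_service(env: Optional[Mapping[str, str]] = None,
                    host: str = "localhost") -> Tuple[str, str]:
    """Return (mode, base_url) for the service that ACESTEP_MODE selects.

    `host` is an assumption; substitute the address of your deployment.
    """
    env = os.environ if env is None else env
    mode = env.get("ACESTEP_MODE", "gradio")  # gradio is the documented default
    if mode not in MODE_PORTS:
        raise ValueError(
            f"unknown ACESTEP_MODE {mode!r}; expected one of {sorted(MODE_PORTS)}"
        )
    return mode, f"http://{host}:{MODE_PORTS[mode]}"


if __name__ == "__main__":
    mode, base_url = resolve_service()
    print(f"{mode} service expected at {base_url}")
    if mode == "api":
        # /health is listed among the API routes; its response format is not
        # documented here, so inspect it rather than relying on a schema.
        print(f"health endpoint: {base_url}/health")
```

Passing a plain dictionary instead of the real environment, for example `resolve_service({"ACESTEP_MODE": "api"})`, makes it easy to try the helper without changing any deployment settings.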

GPU Requirements

| VRAM | Mode | LM Model |
|------|------|----------|
| 6 GB | DiT-only (no LM) | None |
| 6-8 GB | DiT + LM | acestep-5Hz-lm-0.6B |
| 8-16 GB | DiT + LM | acestep-5Hz-lm-1.7B |
| 16+ GB | DiT + LM | acestep-5Hz-lm-4B |
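
The VRAM tiers above can be turned into a small lookup, sketched below. The function name `pick_lm_model` is hypothetical, and the handling of the boundaries (exactly 6 GB treated as DiT-only, 8 GB and 16 GB falling into the larger tier) is an assumption, since the table's ranges overlap at their edges.

```python
from typing import Optional


def pick_lm_model(vram_gb: float) -> Optional[str]:
    """Suggest an LM model for the given VRAM, or None for DiT-only mode.

    Boundary handling is an assumption: the documented ranges share their
    endpoints, so this sketch assigns each endpoint to one tier.
    """
    if vram_gb < 6:
        # The page states that at least 6 GB of VRAM is required.
        raise ValueError("at least 6 GB of VRAM is required")
    if vram_gb == 6:
        return None  # DiT-only, no LM
    if vram_gb < 8:
        return "acestep-5Hz-lm-0.6B"
    if vram_gb < 16:
        return "acestep-5Hz-lm-1.7B"
    return "acestep-5Hz-lm-4B"


if __name__ == "__main__":
    for gb in (6, 7, 12, 24):
        print(gb, "GB ->", pick_lm_model(gb) or "DiT-only")
```

The returned name could then be supplied as the LM model setting for the deployment; whether a given card actually fits a tier also depends on what else is using the GPU.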

Configuration

  • ACESTEP_MODE: Startup mode — gradio (Web UI) or api (REST API server)
  • ACESTEP_CONFIG_PATH: DiT model variant (default: acestep-v15-turbo)
  • ACESTEP_LM_MODEL_PATH: LM model size (default: acestep-5Hz-lm-1.7B)
  • ACESTEP_LLM_BACKEND: LLM backend — vllm (faster, needs 8GB+) or pt (PyTorch fallback)
  • ACESTEP_API_KEY: Optional API authentication key
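
To show how these variables fit together, here is a minimal sketch that reads them into a single settings object and rejects values the list above does not mention. `AceStepSettings` and `load_settings` are hypothetical names for this example, not part of ACE-Step. No default is documented for ACESTEP_LLM_BACKEND, so the sketch leaves it unset when the variable is absent.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AceStepSettings:
    # Defaults below are the ones stated in the Configuration list.
    mode: str = "gradio"
    config_path: str = "acestep-v15-turbo"
    lm_model_path: str = "acestep-5Hz-lm-1.7B"
    llm_backend: Optional[str] = None  # no documented default (assumption: unset)
    api_key: Optional[str] = None      # authentication is optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> AceStepSettings:
    """Build settings from environment variables, validating the enumerated ones."""
    env = os.environ if env is None else env

    mode = env.get("ACESTEP_MODE", "gradio")
    if mode not in ("gradio", "api"):
        raise ValueError(f"ACESTEP_MODE must be 'gradio' or 'api', got {mode!r}")

    backend = env.get("ACESTEP_LLM_BACKEND")
    if backend is not None and backend not in ("vllm", "pt"):
        raise ValueError(f"ACESTEP_LLM_BACKEND must be 'vllm' or 'pt', got {backend!r}")

    return AceStepSettings(
        mode=mode,
        config_path=env.get("ACESTEP_CONFIG_PATH", "acestep-v15-turbo"),
        lm_model_path=env.get("ACESTEP_LM_MODEL_PATH", "acestep-5Hz-lm-1.7B"),
        llm_backend=backend,
        api_key=env.get("ACESTEP_API_KEY") or None,  # treat empty string as unset
    )


if __name__ == "__main__":
    print(load_settings())
```

Validating the two enumerated variables up front surfaces a typo such as `ACESTEP_MODE=grado` before any model is loaded.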