ChuxiJ committed on
Commit
428436b
·
1 Parent(s): 8e83122

add docs and readme

README.md CHANGED
@@ -1,69 +1,194 @@
1
- # ACE-Step-1.5
2
 
3
- ## Installation
 
 
4
 
5
- This project uses [uv](https://github.com/astral-sh/uv) for dependency management.
6
 
7
- ### Install uv
8
 
9
  ```bash
 
 
 
10
  # Windows (PowerShell)
11
  powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
12
-
13
- # macOS/Linux
14
- curl -LsSf https://astral.sh/uv/install.sh | sh
15
  ```
16
 
17
- ### Install Project Dependencies
18
 
19
  ```bash
20
- # Sync all dependencies
 
21
  uv sync
22
  ```
23
 
24
- ### Run the Project
 
 
25
 
26
  ```bash
27
- # Simplest way - run directly with uv
28
  uv run acestep
 
29
 
30
- # Run with parameters
31
- uv run acestep --port 7860 --server-name 0.0.0.0 --share
32
 
33
- # Or use the full module path
34
- uv run python -m acestep.acestep_v15_pipeline
35
 
36
- # Just Run profiling
37
- uv run profile_inference.py
 
38
 
39
- # Or activate the virtual environment first
40
- source .venv/bin/activate # Linux/macOS
41
- # or
42
- .venv\Scripts\activate # Windows
43
 
44
- acestep
45
- ```
46
 
47
- Available parameters:
48
- - `--port`: Server port (default: 7860)
49
- - `--server-name`: Server address (default: 127.0.0.1, use 0.0.0.0 to listen on all interfaces)
50
- - `--share`: Create a public share link
51
- - `--debug`: Enable debug mode
52
 
53
- ## Development
54
 
55
- Add new dependencies:
56
 
57
  ```bash
58
- # Add runtime dependencies
59
- uv add package-name
60
 
61
- # Add development dependencies
62
- uv add --dev package-name
63
  ```
64
 
65
- Update dependencies:
66
 
67
  ```bash
 
 
 
 
 
68
  uv sync --upgrade
69
- ```
1
+ <h1 align="center">ACE-Step 1.5</h1>
2
+ <h1 align="center">Pushing the Boundaries of Open-Source Music Generation</h1>
3
+ <p align="center">
4
+ <a href="https://ace-step-v1.5.github.io">Project</a> |
5
+ <a href="https://huggingface.co/collections/ACE-Step/ace-step-15">Hugging Face</a> |
6
+ <a href="https://modelscope.cn/models/ACE-Step/ACE-Step-v1-5">ModelScope</a> |
7
+ <a href="https://huggingface.co/spaces/ACE-Step/ACE-Step-1.5">Space Demo</a> |
8
+ <a href="https://discord.gg/PeWDxrkdj7">Discord</a> |
9
+ <a href="https://arxiv.org/abs/2506.00045">Technical Report</a>
10
+ </p>
11
 
12
+ <p align="center">
13
+ <img src="./assets/orgnization_logos.png" width="100%" alt="StepFun Logo">
14
+ </p>
15
 
16
+ ## Table of Contents
17
 
18
+ - [✨ Features](#-features)
19
+ - [📦 Installation](#-installation)
20
+ - [🚀 Usage](#-usage)
21
+ - [🔨 Train](#-train)
22
+
23
+ ## 📝 Abstract
24
+ We present ACE-Step v1.5, a highly efficient foundation model that democratizes commercial-grade music production on consumer hardware. Optimized for local deployment (<4GB VRAM), the model accelerates generation by over 100× compared to traditional pure LM architectures, producing superior high-fidelity audio in seconds characterized by coherent semantics and exceptional melodies. At its core lies a novel hybrid architecture where the Language Model (LM) functions as an omni-capable planner: it transforms simple user queries into comprehensive song blueprints—scaling from short loops to 10-minute compositions—while synthesizing metadata, lyrics, and captions via Chain-of-Thought to guide the Diffusion Transformer (DiT). Uniquely, this alignment is achieved through intrinsic reinforcement learning relying solely on the model’s internal mechanisms, thereby eliminating the biases inherent in external reward models or human preferences. Beyond standard synthesis, ACE-Step v1.5 unifies precise stylistic control with versatile editing capabilities—such as cover generation, repainting, and vocal-to-BGM conversion—while maintaining strict adherence to prompts across 50+ languages.
25
+
26
+
27
+ ## ✨ Features
28
+
29
+ <p align="center">
30
+ <img src="./assets/application_map.png" width="100%" alt="ACE-Step Application Map">
31
+ </p>
32
+
33
+ ### ⚡ Performance
34
+ - ✅ **Ultra-Fast Generation** — 0.5s to 10s generation time (depending on think mode & diffusion steps)
35
+ - ✅ **Flexible Duration** — Supports 10 seconds to 10 minutes (600s) audio generation
36
+ - ✅ **Batch Generation** — Generate up to 8 songs simultaneously
37
+
38
+ ### 🎵 Generation Quality
39
+ - ✅ **Commercial-Grade Output** — Quality between Suno v4.5 and Suno v5
40
+ - ✅ **Rich Style Support** — 1000+ instruments and styles with fine-grained timbre description
41
+ - ✅ **Multi-Language Lyrics** — Supports 50+ languages with lyrics prompt for structure & style control
42
+
43
+ ### 🎛️ Versatility & Control
44
+
45
+ | Feature | Description |
46
+ |---------|-------------|
47
+ | ✅ Reference Audio Input | Use reference audio to guide generation style |
48
+ | ✅ Cover Generation | Create covers from existing audio |
49
+ | ✅ Repaint & Edit | Selective local audio editing and regeneration |
50
+ | ✅ Track Separation | Separate audio into individual stems |
51
+ | ✅ Multi-Track Generation | Add layers like Suno Studio's "Add Layer" feature |
52
+ | ✅ Vocal2BGM | Auto-generate accompaniment for vocal tracks |
53
+ | ✅ Metadata Control | Control duration, BPM, key/scale, time signature |
54
+ | ✅ Simple Mode | Generate full songs from simple descriptions |
55
+ | ✅ Query Rewriting | Auto LM expansion of tags and lyrics |
56
+ | ✅ Audio Understanding | Extract BPM, key/scale, time signature & caption from audio |
57
+ | ✅ LRC Generation | Auto-generate lyric timestamps for generated music |
58
+ | ✅ LoRA Training | One-click annotation & training in Gradio. 8 songs, 1 hour on 3090 (12GB VRAM) |
59
+ | ✅ Quality Scoring | Automatic quality assessment for generated audio |
60
+
61
+
62
+
63
+ ## 📦 Installation
64
+
65
+ > **Requirements:** Python 3.11, CUDA GPU recommended (works on CPU/MPS but slower)
66
+
67
+ ### 1. Install uv (Package Manager)
68
 
69
  ```bash
70
+ # macOS / Linux
71
+ curl -LsSf https://astral.sh/uv/install.sh | sh
72
+
73
  # Windows (PowerShell)
74
  powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
 
 
 
75
  ```
76
 
77
+ ### 2. Clone & Install
78
 
79
  ```bash
80
+ git clone https://github.com/ACE-Step/ACE-Step-1.5.git
81
+ cd ACE-Step-1.5
82
  uv sync
83
  ```
84
 
85
+ ### 3. Launch
86
+
87
+ #### 🖥️ Gradio Web UI (Recommended)
88
 
89
  ```bash
 
90
  uv run acestep
91
+ ```
92
 
93
+ Open http://localhost:7860 in your browser. Models will be downloaded automatically on first run.
 
94
 
95
+ #### 🌐 REST API Server
 
96
 
97
+ ```bash
98
+ uv run acestep-api
99
+ ```
100
 
101
+ API runs at http://localhost:8001. See [API Documentation](./docs/en/API.md) for endpoints.
 
 
 
102
 
103
+ ### Command Line Options
 
104
 
105
+ **Gradio UI (`acestep`):**
 
 
 
 
106
 
107
+ | Option | Default | Description |
108
+ |--------|---------|-------------|
109
+ | `--port` | 7860 | Server port |
110
+ | `--server-name` | 127.0.0.1 | Server address (use `0.0.0.0` for network access) |
111
+ | `--share` | false | Create public Gradio link |
112
+ | `--language` | en | UI language: `en`, `zh`, `ja` |
113
+ | `--init_service` | false | Auto-initialize models on startup |
114
+ | `--config_path` | auto | DiT model (e.g., `acestep-v15-turbo`, `acestep-v15-turbo-shift3`) |
115
+ | `--lm_model_path` | auto | LM model (e.g., `acestep-5Hz-lm-0.6B`, `acestep-5Hz-lm-1.7B`) |
116
+ | `--offload_to_cpu` | auto | CPU offload (auto-enabled if VRAM < 16GB) |
117
 
118
+ **Examples:**
119
 
120
  ```bash
121
+ # Public access with Chinese UI
122
+ uv run acestep --server-name 0.0.0.0 --share --language zh
123
 
124
+ # Pre-initialize models on startup
125
+ uv run acestep --init_service true --config_path acestep-v15-turbo
126
  ```
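The launch examples above can also be assembled programmatically, for instance when starting the UI from a wrapper script. The following is a minimal sketch, not part of the project: `acestep_command` is a hypothetical helper, and it only uses the `--port`, `--server-name`, `--language`, and `--share` options listed in the table, with the defaults documented there.

```python
import shlex


def acestep_command(port=7860, server_name="127.0.0.1", share=False, language="en"):
    """Build the argument list for launching the Gradio UI via uv.

    Hypothetical helper: only the flags and defaults come from the options table.
    """
    cmd = [
        "uv", "run", "acestep",
        "--port", str(port),
        "--server-name", server_name,
        "--language", language,
    ]
    # --share is passed as a bare flag, as in the documented example.
    if share:
        cmd.append("--share")
    return cmd


# Equivalent of the "public access with Chinese UI" example.
print(shlex.join(acestep_command(server_name="0.0.0.0", share=True, language="zh")))
```

The returned list can be handed to `subprocess.run` unchanged, which avoids shell quoting issues.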
127
 
128
+ ### Development
129
 
130
  ```bash
131
+ # Add dependencies
132
+ uv add package-name
133
+ uv add --dev package-name
134
+
135
+ # Update all dependencies
136
  uv sync --upgrade
137
+ ```
138
+
139
+ ## 🚀 Usage
140
+
141
+ We provide multiple ways to use ACE-Step:
142
+
143
+ | Method | Description | Documentation |
144
+ |--------|-------------|---------------|
145
+ | 🖥️ **Gradio Web UI** | Interactive web interface for music generation | [Gradio Guide](./docs/en/GRADIO_GUIDE.md) |
146
+ | 🐍 **Python API** | Programmatic access for integration | [Inference API](./docs/en/INFERENCE.md) |
147
+ | 🌐 **REST API** | HTTP-based async API for services | [REST API](./docs/en/API.md) |
148
+
149
+ **📚 Documentation available in:** [English](./docs/en/) | [中文](./docs/zh/) | [日本語](./docs/ja/)
150
+
151
+
152
+ ## 🔨 Train
153
+
154
+ See the **LoRA Training** tab in the Gradio UI for one-click training, or check [Gradio Guide - LoRA Training](./docs/en/GRADIO_GUIDE.md#lora-training) for details.
155
+
156
+ ## 🏗️ Architecture
157
+
158
+ <p align="center">
159
+ <img src="./assets/ACE-Step_framework.png" width="100%" alt="ACE-Step Framework">
160
+ </p>
161
+
162
+
163
+
164
+ ## 📜 License & Disclaimer
165
+
166
+ This project is licensed under the [MIT License](./LICENSE).
167
+
168
+ ACE-Step enables original music generation across diverse genres, with applications in creative production, education, and entertainment. While designed to support positive and artistic use cases, we acknowledge potential risks such as unintentional copyright infringement due to stylistic similarity, inappropriate blending of cultural elements, and misuse for generating harmful content. To ensure responsible use, we encourage users to verify the originality of generated works, clearly disclose AI involvement, and obtain appropriate permissions when adapting protected styles or materials. By using ACE-Step, you agree to uphold these principles and respect artistic integrity, cultural diversity, and legal compliance. The authors are not responsible for any misuse of the model, including but not limited to copyright violations, cultural insensitivity, or the generation of harmful content.
169
+
170
+ 🔔 Important Notice
171
+ The only official website for the ACE-Step project is our GitHub Pages site.
172
+ We do not operate any other websites.
173
+ 🚫 Fake domains include but are not limited to:
174
+ ac\*\*p.com, a\*\*p.org, a\*\*\*c.org
175
+ ⚠️ Please be cautious. Do not visit, trust, or make payments on any of those sites.
176
+
177
+ ## 🙏 Acknowledgements
178
+
179
+ This project is co-led by ACE Studio and StepFun.
180
+
181
+
182
+ ## 📖 Citation
183
+
184
+ If you find this project useful for your research, please consider citing:
185
+
186
+ ```BibTeX
187
+ @misc{gong2026acestep,
188
+ title={ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation},
189
+ author={Junmin Gong and Song Yulin and Wenxiao Zhao and Sen Wang and Shengyuan Xu and Jing Guo},
190
+ howpublished={\url{https://github.com/ace-step/ACE-Step-1.5}},
191
+ year={2026},
192
+ note={GitHub repository}
193
+ }
194
+ ```
assets/ACE-Step_framework.png ADDED

Git LFS Details

  • SHA256: 12b680ef6efa0d6f62c023ece3901304e29484dca9118dffadfcd42de66e1c7d
  • Pointer size: 131 Bytes
  • Size of remote file: 647 kB
assets/Logo_StepFun.png ADDED

Git LFS Details

  • SHA256: a03bd87cc8a2bf3a9eeaa2742de0198093e00ed35fc2e75ea89ceea23f314b8c
  • Pointer size: 130 Bytes
  • Size of remote file: 29.5 kB
assets/acestudio_logo.png ADDED

Git LFS Details

  • SHA256: 9a103c2162ba425a528bdc80e17fdf1536f395ba313265780da8438a77ea6f52
  • Pointer size: 131 Bytes
  • Size of remote file: 128 kB
assets/application_map.png ADDED

Git LFS Details

  • SHA256: d823fe019ef0b4d0e410001dd2a6649f143972f4e7c0021cb90e5098f818cb9c
  • Pointer size: 131 Bytes
  • Size of remote file: 285 kB
assets/orgnization_logos.png ADDED

Git LFS Details

  • SHA256: 67963c873a2ce7991767c970e49daa4739a0e0aa906ca7691a81229cc4e4901d
  • Pointer size: 131 Bytes
  • Size of remote file: 309 kB
docs/en/API.md CHANGED
@@ -84,7 +84,7 @@ Suitable for passing only text parameters, or referencing audio file paths that
84
 
85
  | Parameter Name | Type | Default | Description |
86
  | :--- | :--- | :--- | :--- |
87
- | `model` | string | null | Select which DiT model to use (e.g., `"acestep-v15-turbo"`, `"acestep-v15-turbo-rl"`). Use `/v1/models` to list available models. If not specified, uses the default model. |
88
 
89
  **thinking Semantics (Important)**:
90
 
@@ -148,7 +148,7 @@ These parameters control 5Hz LM sampling, used for metadata auto-completion and
148
 
149
  | Parameter Name | Type | Default | Description |
150
  | :--- | :--- | :--- | :--- |
151
- | `lm_model_path` | string | null | 5Hz LM checkpoint dir name (e.g. `acestep-5Hz-lm-0.6B-v3`) |
152
  | `lm_backend` | string | `"vllm"` | `vllm` or `pt` |
153
  | `lm_temperature` | float | `0.85` | Sampling temperature |
154
  | `lm_cfg_scale` | float | `2.5` | CFG scale (>1 enables CFG) |
@@ -258,7 +258,7 @@ curl -X POST http://localhost:8001/v1/music/generate \
258
  -H 'Content-Type: application/json' \
259
  -d '{
260
  "caption": "electronic dance music",
261
- "model": "acestep-v15-turbo-rl",
262
  "thinking": true
263
  }'
264
  ```
@@ -382,8 +382,8 @@ The response contains basic task information, queue status, and final results.
382
  "keyscale": "C Major",
383
  "timesignature": "4",
384
  "genres": null,
385
- "lm_model": "acestep-5Hz-lm-0.6B-v3",
386
- "dit_model": "acestep-v15-turbo-rl"
387
  },
388
  "error": null
389
  }
@@ -441,15 +441,15 @@ Returns a list of available DiT models loaded on the server.
441
  {
442
  "models": [
443
  {
444
- "name": "acestep-v15-turbo-rl",
445
  "is_default": true
446
  },
447
  {
448
- "name": "acestep-v15-turbo",
449
  "is_default": false
450
  }
451
  ],
452
- "default_model": "acestep-v15-turbo-rl"
453
  }
454
  ```
455
 
@@ -514,14 +514,14 @@ The API server can be configured using environment variables:
514
  | :--- | :--- | :--- |
515
  | `ACESTEP_API_HOST` | `127.0.0.1` | Server bind host |
516
  | `ACESTEP_API_PORT` | `8001` | Server bind port |
517
- | `ACESTEP_CONFIG_PATH` | `acestep-v15-turbo-rl` | Primary DiT model path |
518
  | `ACESTEP_CONFIG_PATH2` | (empty) | Secondary DiT model path (optional) |
519
  | `ACESTEP_CONFIG_PATH3` | (empty) | Third DiT model path (optional) |
520
  | `ACESTEP_DEVICE` | `auto` | Device for model loading |
521
  | `ACESTEP_USE_FLASH_ATTENTION` | `true` | Enable flash attention |
522
  | `ACESTEP_OFFLOAD_TO_CPU` | `false` | Offload models to CPU when idle |
523
  | `ACESTEP_OFFLOAD_DIT_TO_CPU` | `false` | Offload DiT specifically to CPU |
524
- | `ACESTEP_LM_MODEL_PATH` | `acestep-5Hz-lm-0.6B-v3` | Default 5Hz LM model |
525
  | `ACESTEP_LM_BACKEND` | `vllm` | LM backend (vllm or pt) |
526
  | `ACESTEP_LM_DEVICE` | (same as ACESTEP_DEVICE) | Device for LM |
527
  | `ACESTEP_LM_OFFLOAD_TO_CPU` | `false` | Offload LM to CPU |
 
84
 
85
  | Parameter Name | Type | Default | Description |
86
  | :--- | :--- | :--- | :--- |
87
+ | `model` | string | null | Select which DiT model to use (e.g., `"acestep-v15-turbo"`, `"acestep-v15-turbo-shift3"`). Use `/v1/models` to list available models. If not specified, uses the default model. |
88
 
89
  **thinking Semantics (Important)**:
90
 
 
148
 
149
  | Parameter Name | Type | Default | Description |
150
  | :--- | :--- | :--- | :--- |
151
+ | `lm_model_path` | string | null | 5Hz LM checkpoint dir name (e.g. `acestep-5Hz-lm-0.6B`) |
152
  | `lm_backend` | string | `"vllm"` | `vllm` or `pt` |
153
  | `lm_temperature` | float | `0.85` | Sampling temperature |
154
  | `lm_cfg_scale` | float | `2.5` | CFG scale (>1 enables CFG) |
 
258
  -H 'Content-Type: application/json' \
259
  -d '{
260
  "caption": "electronic dance music",
261
+ "model": "acestep-v15-turbo",
262
  "thinking": true
263
  }'
264
  ```
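The same request can be built from Python with only the standard library. This is a minimal sketch under stated assumptions: `build_generate_request` is a hypothetical helper, the `thinking=False` default is chosen for illustration only, and the endpoint, port, and field names are taken from the curl example above. The request is constructed but not sent, so the snippet runs without a server.

```python
import json
import urllib.request


def build_generate_request(caption, model=None, thinking=False,
                           base_url="http://localhost:8001"):
    """Build a POST request for /v1/music/generate (hypothetical helper)."""
    body = {"caption": caption, "thinking": thinking}
    # Leave "model" out to let the server fall back to its default DiT model.
    if model is not None:
        body["model"] = model
    return urllib.request.Request(
        url=f"{base_url}/v1/music/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_generate_request(
    "electronic dance music", model="acestep-v15-turbo", thinking=True
)
print(req.get_method(), req.full_url)
```

To submit it, pass `req` to `urllib.request.urlopen` while the API server is running.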
 
382
  "keyscale": "C Major",
383
  "timesignature": "4",
384
  "genres": null,
385
+ "lm_model": "acestep-5Hz-lm-0.6B",
386
+ "dit_model": "acestep-v15-turbo"
387
  },
388
  "error": null
389
  }
 
441
  {
442
  "models": [
443
  {
444
+ "name": "acestep-v15-turbo",
445
  "is_default": true
446
  },
447
  {
448
+ "name": "acestep-v15-turbo-shift3",
449
  "is_default": false
450
  }
451
  ],
452
+ "default_model": "acestep-v15-turbo"
453
  }
454
  ```
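A client will typically want the default model name out of this response. The sketch below assumes only the response shape shown above (`models`, `name`, `is_default`, `default_model`); `default_model_name` is a hypothetical helper, and the fallback to the `is_default` flag is an illustrative choice rather than documented server behavior.

```python
import json


def default_model_name(payload):
    """Return the default DiT model name from a /v1/models-style payload."""
    # Prefer the explicit top-level field when it is present.
    name = payload.get("default_model")
    if name:
        return name
    # Otherwise fall back to the first entry flagged as default.
    for entry in payload.get("models", []):
        if entry.get("is_default"):
            return entry["name"]
    raise ValueError("payload does not name a default model")


sample = json.loads("""
{
  "models": [
    {"name": "acestep-v15-turbo", "is_default": true},
    {"name": "acestep-v15-turbo-shift3", "is_default": false}
  ],
  "default_model": "acestep-v15-turbo"
}
""")
print(default_model_name(sample))
```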
455
 
 
514
  | :--- | :--- | :--- |
515
  | `ACESTEP_API_HOST` | `127.0.0.1` | Server bind host |
516
  | `ACESTEP_API_PORT` | `8001` | Server bind port |
517
+ | `ACESTEP_CONFIG_PATH` | `acestep-v15-turbo` | Primary DiT model path |
518
  | `ACESTEP_CONFIG_PATH2` | (empty) | Secondary DiT model path (optional) |
519
  | `ACESTEP_CONFIG_PATH3` | (empty) | Third DiT model path (optional) |
520
  | `ACESTEP_DEVICE` | `auto` | Device for model loading |
521
  | `ACESTEP_USE_FLASH_ATTENTION` | `true` | Enable flash attention |
522
  | `ACESTEP_OFFLOAD_TO_CPU` | `false` | Offload models to CPU when idle |
523
  | `ACESTEP_OFFLOAD_DIT_TO_CPU` | `false` | Offload DiT specifically to CPU |
524
+ | `ACESTEP_LM_MODEL_PATH` | `acestep-5Hz-lm-0.6B` | Default 5Hz LM model |
525
  | `ACESTEP_LM_BACKEND` | `vllm` | LM backend (vllm or pt) |
526
  | `ACESTEP_LM_DEVICE` | (same as ACESTEP_DEVICE) | Device for LM |
527
  | `ACESTEP_LM_OFFLOAD_TO_CPU` | `false` | Offload LM to CPU |
docs/en/GRADIO_GUIDE.md CHANGED
@@ -29,7 +29,7 @@ This guide provides comprehensive documentation for using the ACE-Step Gradio we
29
  python app.py
30
 
31
  # With pre-initialization
32
- python app.py --config acestep-v15-turbo-rl --init-llm
33
 
34
  # With specific port
35
  python app.py --port 7860
@@ -55,14 +55,14 @@ The Gradio interface consists of several main sections:
55
  | Setting | Description |
56
  |---------|-------------|
57
  | **Checkpoint File** | Select a trained model checkpoint (if available) |
58
- | **Main Model Path** | Choose the DiT model configuration (e.g., `acestep-v15-turbo`, `acestep-v15-turbo-rl`) |
59
  | **Device** | Processing device: `auto` (recommended), `cuda`, or `cpu` |
60
 
61
  ### 5Hz LM Configuration
62
 
63
  | Setting | Description |
64
  |---------|-------------|
65
- | **5Hz LM Model Path** | Select the language model (e.g., `acestep-5Hz-lm-0.6B`, `acestep-5Hz-lm-0.6B-v3`) |
66
  | **5Hz LM Backend** | `vllm` (faster, recommended) or `pt` (PyTorch, more compatible) |
67
  | **Initialize 5Hz LM** | Check to load the LM during initialization (required for thinking mode) |
68
 
@@ -477,7 +477,7 @@ After training, export the final adapter:
477
 
478
  ### For Faster Generation
479
 
480
- 1. **Use turbo model** - Select `acestep-v15-turbo` or `acestep-v15-turbo-rl`
481
  2. **Keep inference steps at 8** - Default is optimal for turbo
482
  3. **Reduce batch size** - Lower batch size if you need quick results
483
  4. **Disable AutoGen** - Manual control over batch generation
 
29
  python app.py
30
 
31
  # With pre-initialization
32
+ python app.py --config acestep-v15-turbo --init-llm
33
 
34
  # With specific port
35
  python app.py --port 7860
 
55
  | Setting | Description |
56
  |---------|-------------|
57
  | **Checkpoint File** | Select a trained model checkpoint (if available) |
58
+ | **Main Model Path** | Choose the DiT model configuration (e.g., `acestep-v15-turbo`, `acestep-v15-turbo-shift3`) |
59
  | **Device** | Processing device: `auto` (recommended), `cuda`, or `cpu` |
60
 
61
  ### 5Hz LM Configuration
62
 
63
  | Setting | Description |
64
  |---------|-------------|
65
+ | **5Hz LM Model Path** | Select the language model (e.g., `acestep-5Hz-lm-0.6B`, `acestep-5Hz-lm-1.7B`) |
66
  | **5Hz LM Backend** | `vllm` (faster, recommended) or `pt` (PyTorch, more compatible) |
67
  | **Initialize 5Hz LM** | Check to load the LM during initialization (required for thinking mode) |
68
 
 
477
 
478
  ### For Faster Generation
479
 
480
+ 1. **Use turbo model** - Select `acestep-v15-turbo` or `acestep-v15-turbo-shift3`
481
  2. **Keep inference steps at 8** - Default is optimal for turbo
482
  3. **Reduce batch size** - Lower batch size if you need quick results
483
  4. **Disable AutoGen** - Manual control over batch generation
docs/en/INFERENCE.md CHANGED
@@ -35,13 +35,13 @@ llm_handler = LLMHandler()
35
  # Initialize services
36
  dit_handler.initialize_service(
37
  project_root="/path/to/project",
38
- config_path="acestep-v15-turbo-rl",
39
  device="cuda"
40
  )
41
 
42
  llm_handler.initialize(
43
  checkpoint_dir="/path/to/checkpoints",
44
- lm_model_path="acestep-5Hz-lm-0.6B-v3",
45
  backend="vllm",
46
  device="cuda"
47
  )
 
35
  # Initialize services
36
  dit_handler.initialize_service(
37
  project_root="/path/to/project",
38
+ config_path="acestep-v15-turbo",
39
  device="cuda"
40
  )
41
 
42
  llm_handler.initialize(
43
  checkpoint_dir="/path/to/checkpoints",
44
+ lm_model_path="acestep-5Hz-lm-0.6B",
45
  backend="vllm",
46
  device="cuda"
47
  )
docs/ja/API.md CHANGED
@@ -84,7 +84,7 @@ APIはほとんどのパラメータで **snake_case** と **camelCase** の両
84
 
85
  | パラメータ名 | 型 | デフォルト | 説明 |
86
  | :--- | :--- | :--- | :--- |
87
- | `model` | string | null | 使用するDiTモデルを選択(例:`"acestep-v15-turbo"`、`"acestep-v15-turbo-rl"`)。`/v1/models` で利用可能なモデルを一覧表示。指定しない場合はデフォルトモデルを使用。|
88
 
89
  **thinkingのセマンティクス(重要)**:
90
 
@@ -148,7 +148,7 @@ APIはほとんどのパラメータで **snake_case** と **camelCase** の両
148
 
149
  | パラメータ名 | 型 | デフォルト | 説明 |
150
  | :--- | :--- | :--- | :--- |
151
- | `lm_model_path` | string | null | 5Hz LMチェックポイントディレクトリ名(例:`acestep-5Hz-lm-0.6B-v3`)|
152
  | `lm_backend` | string | `"vllm"` | `vllm` または `pt` |
153
  | `lm_temperature` | float | `0.85` | サンプリング温度 |
154
  | `lm_cfg_scale` | float | `2.5` | CFGスケール(>1でCFGを有効化)|
@@ -258,7 +258,7 @@ curl -X POST http://localhost:8001/v1/music/generate \
258
  -H 'Content-Type: application/json' \
259
  -d '{
260
  "caption": "エレクトロニックダンスミュージック",
261
- "model": "acestep-v15-turbo-rl",
262
  "thinking": true
263
  }'
264
  ```
@@ -382,8 +382,8 @@ curl -X POST http://localhost:8001/v1/music/generate \
382
  "keyscale": "C Major",
383
  "timesignature": "4",
384
  "genres": null,
385
- "lm_model": "acestep-5Hz-lm-0.6B-v3",
386
- "dit_model": "acestep-v15-turbo-rl"
387
  },
388
  "error": null
389
  }
@@ -441,15 +441,15 @@ curl -X POST http://localhost:8001/v1/music/random \
441
  {
442
  "models": [
443
  {
444
- "name": "acestep-v15-turbo-rl",
445
  "is_default": true
446
  },
447
  {
448
- "name": "acestep-v15-turbo",
449
  "is_default": false
450
  }
451
  ],
452
- "default_model": "acestep-v15-turbo-rl"
453
  }
454
  ```
455
 
@@ -514,14 +514,14 @@ APIサーバーは環境変数で設定できます:
514
  | :--- | :--- | :--- |
515
  | `ACESTEP_API_HOST` | `127.0.0.1` | サーバーバインドホスト |
516
  | `ACESTEP_API_PORT` | `8001` | サーバーバインドポート |
517
- | `ACESTEP_CONFIG_PATH` | `acestep-v15-turbo-rl` | プライマリDiTモデルパス |
518
  | `ACESTEP_CONFIG_PATH2` | (空)| セカンダリDiTモデルパス(オプション)|
519
  | `ACESTEP_CONFIG_PATH3` | (空)| 3番目のDiTモデルパス(オプション)|
520
  | `ACESTEP_DEVICE` | `auto` | モデルロードデバイス |
521
  | `ACESTEP_USE_FLASH_ATTENTION` | `true` | flash attentionを有効化 |
522
  | `ACESTEP_OFFLOAD_TO_CPU` | `false` | アイドル時にモデルをCPUにオフロード |
523
  | `ACESTEP_OFFLOAD_DIT_TO_CPU` | `false` | DiTを特にCPUにオフロード |
524
- | `ACESTEP_LM_MODEL_PATH` | `acestep-5Hz-lm-0.6B-v3` | デフォルト5Hz LMモデル |
525
  | `ACESTEP_LM_BACKEND` | `vllm` | LMバックエンド(vllmまたはpt)|
526
  | `ACESTEP_LM_DEVICE` | (ACESTEP_DEVICEと同じ)| LMデバイス |
527
  | `ACESTEP_LM_OFFLOAD_TO_CPU` | `false` | LMをCPUにオフロード |
 
84
 
85
  | パラメータ名 | 型 | デフォルト | 説明 |
86
  | :--- | :--- | :--- | :--- |
87
+ | `model` | string | null | 使用するDiTモデルを選択(例:`"acestep-v15-turbo"`、`"acestep-v15-turbo-shift3"`)。`/v1/models` で利用可能なモデルを一覧表示。指定しない場合はデフォルトモデルを使用。|
88
 
89
  **thinkingのセマンティクス(重要)**:
90
 
 
148
 
149
  | パラメータ名 | 型 | デフォルト | 説明 |
150
  | :--- | :--- | :--- | :--- |
151
+ | `lm_model_path` | string | null | 5Hz LMチェックポイントディレクトリ名(例:`acestep-5Hz-lm-0.6B`)|
152
  | `lm_backend` | string | `"vllm"` | `vllm` または `pt` |
153
  | `lm_temperature` | float | `0.85` | サンプリング温度 |
154
  | `lm_cfg_scale` | float | `2.5` | CFGスケール(>1でCFGを有効化)|
 
258
  -H 'Content-Type: application/json' \
259
  -d '{
260
  "caption": "エレクトロニックダンスミュージック",
261
+ "model": "acestep-v15-turbo",
262
  "thinking": true
263
  }'
264
  ```
 
382
  "keyscale": "C Major",
383
  "timesignature": "4",
384
  "genres": null,
385
+ "lm_model": "acestep-5Hz-lm-0.6B",
386
+ "dit_model": "acestep-v15-turbo"
387
  },
388
  "error": null
389
  }
 
441
  {
442
  "models": [
443
  {
444
+ "name": "acestep-v15-turbo",
445
  "is_default": true
446
  },
447
  {
448
+ "name": "acestep-v15-turbo-shift3",
449
  "is_default": false
450
  }
451
  ],
452
+ "default_model": "acestep-v15-turbo"
453
  }
454
  ```
455
 
 
514
  | :--- | :--- | :--- |
515
  | `ACESTEP_API_HOST` | `127.0.0.1` | サーバーバインドホスト |
516
  | `ACESTEP_API_PORT` | `8001` | サーバーバインドポート |
517
+ | `ACESTEP_CONFIG_PATH` | `acestep-v15-turbo` | プライマリDiTモデルパス |
518
  | `ACESTEP_CONFIG_PATH2` | (空)| セカンダリDiTモデルパス(オプション)|
519
  | `ACESTEP_CONFIG_PATH3` | (空)| 3番目のDiTモデルパス(オプション)|
520
  | `ACESTEP_DEVICE` | `auto` | モデルロードデバイス |
521
  | `ACESTEP_USE_FLASH_ATTENTION` | `true` | flash attentionを有効化 |
522
  | `ACESTEP_OFFLOAD_TO_CPU` | `false` | アイドル時にモデルをCPUにオフロード |
523
  | `ACESTEP_OFFLOAD_DIT_TO_CPU` | `false` | DiTを特にCPUにオフロード |
524
+ | `ACESTEP_LM_MODEL_PATH` | `acestep-5Hz-lm-0.6B` | デフォルト5Hz LMモデル |
525
  | `ACESTEP_LM_BACKEND` | `vllm` | LMバックエンド(vllmまたはpt)|
526
  | `ACESTEP_LM_DEVICE` | (ACESTEP_DEVICEと同じ)| LMデバイス |
527
  | `ACESTEP_LM_OFFLOAD_TO_CPU` | `false` | LMをCPUにオフロード |
docs/ja/GRADIO_GUIDE.md CHANGED
@@ -29,7 +29,7 @@
29
  python app.py
30
 
31
  # 事前初期化付き
32
- python app.py --config acestep-v15-turbo-rl --init-llm
33
 
34
  # 特定のポートで
35
  python app.py --port 7860
@@ -55,14 +55,14 @@ Gradioインターフェースは以下の主要セクションで構成され
55
  | 設定 | 説明 |
56
  |---------|-------------|
57
  | **チェックポイントファイル** | トレーニング済みモデルチェックポイントを選択(利用可能な場合)|
58
- | **メインモデルパス** | DiTモデル設定を選択(例:`acestep-v15-turbo`、`acestep-v15-turbo-rl`)|
59
  | **デバイス** | 処理デバイス:`auto`(推奨)、`cuda`、または `cpu` |
60
 
61
  ### 5Hz LM設定
62
 
63
  | 設定 | 説明 |
64
  |---------|-------------|
65
- | **5Hz LMモデルパス** | 言語モデルを選択(例:`acestep-5Hz-lm-0.6B`、`acestep-5Hz-lm-0.6B-v3`)|
66
  | **5Hz LMバックエンド** | `vllm`(より高速、推奨)または `pt`(PyTorch、互換性が高い)|
67
  | **5Hz LMを初期化** | 初期化時にLMを読み込むためにチェック(thinkingモードに必要)|
68
 
@@ -477,7 +477,7 @@ LoRAトレーニングタブはカスタムLoRAアダプターを作成するた
477
 
478
  ### より高速な生成のために
479
 
480
- 1. **turboモデルを使用** - `acestep-v15-turbo` または `acestep-v15-turbo-rl` を選択
481
  2. **推論ステップを8に保つ** - turboに最適なデフォルト
482
  3. **バッチサイズを減らす** - 迅速な結果が必要な場合はバッチサイズを下げる
483
  4. **AutoGenを無効化** - バッチ生成の手動制御
 
29
  python app.py
30
 
31
  # 事前初期化付き
32
+ python app.py --config acestep-v15-turbo --init-llm
33
 
34
  # 特定のポートで
35
  python app.py --port 7860
 
55
  | 設定 | 説明 |
56
  |---------|-------------|
57
  | **チェックポイントファイル** | トレーニング済みモデルチェックポイントを選択(利用可能な場合)|
58
+ | **メインモデルパス** | DiTモデル設定を選択(例:`acestep-v15-turbo`、`acestep-v15-turbo-shift3`)|
59
  | **デバイス** | 処理デバイス:`auto`(推奨)、`cuda`、または `cpu` |
60
 
61
  ### 5Hz LM設定
62
 
63
  | 設定 | 説明 |
64
  |---------|-------------|
65
+ | **5Hz LMモデルパス** | 言語モデルを選択(例:`acestep-5Hz-lm-0.6B`、`acestep-5Hz-lm-1.7B`)|
66
  | **5Hz LMバックエンド** | `vllm`(より高速、推奨)または `pt`(PyTorch、互換性が高い)|
67
  | **5Hz LMを初期化** | 初期化時にLMを読み込むためにチェック(thinkingモードに必要)|
68
 
 
477
 
478
  ### より高速な生成のために
479
 
480
+ 1. **turboモデルを使用** - `acestep-v15-turbo` または `acestep-v15-turbo-shift3` を選択
481
  2. **推論ステップを8に保つ** - turboに最適なデフォルト
482
  3. **バッチサイズを減らす** - 迅速な結果が必要な場合はバッチサイズを下げる
483
  4. **AutoGenを無効化** - バッチ生成の手動制御
docs/ja/INFERENCE.md CHANGED
@@ -35,13 +35,13 @@ llm_handler = LLMHandler()
35
  # サービスの初期化
36
  dit_handler.initialize_service(
37
  project_root="/path/to/project",
38
- config_path="acestep-v15-turbo-rl",
39
  device="cuda"
40
  )
41
 
42
  llm_handler.initialize(
43
  checkpoint_dir="/path/to/checkpoints",
44
- lm_model_path="acestep-5Hz-lm-0.6B-v3",
45
  backend="vllm",
46
  device="cuda"
47
  )
 
35
  # サービスの初期化
36
  dit_handler.initialize_service(
37
  project_root="/path/to/project",
38
+ config_path="acestep-v15-turbo",
39
  device="cuda"
40
  )
41
 
42
  llm_handler.initialize(
43
  checkpoint_dir="/path/to/checkpoints",
44
+ lm_model_path="acestep-5Hz-lm-0.6B",
45
  backend="vllm",
46
  device="cuda"
47
  )
docs/zh/API.md CHANGED
@@ -84,7 +84,7 @@ API 支持大多数参数的 **snake_case** 和 **camelCase** 命名。例如:
84
 
85
  | 参数名 | 类型 | 默认值 | 说明 |
86
  | :--- | :--- | :--- | :--- |
87
- | `model` | string | null | 选择使用哪个 DiT 模型(例如 `"acestep-v15-turbo"`、`"acestep-v15-turbo-rl"`)。使用 `/v1/models` 列出可用模型。如果未指定,使用默认模型。|
88
 
89
  **thinking 语义(重要)**:
90
 
@@ -148,7 +148,7 @@ API 支持大多数参数的 **snake_case** 和 **camelCase** 命名。例如:
148
 
149
  | 参数名 | 类型 | 默认值 | 说明 |
150
  | :--- | :--- | :--- | :--- |
151
- | `lm_model_path` | string | null | 5Hz LM 检查点目录名(例如 `acestep-5Hz-lm-0.6B-v3`)|
152
  | `lm_backend` | string | `"vllm"` | `vllm` 或 `pt` |
153
  | `lm_temperature` | float | `0.85` | 采样温度 |
154
  | `lm_cfg_scale` | float | `2.5` | CFG 比例(>1 启用 CFG)|
@@ -258,7 +258,7 @@ curl -X POST http://localhost:8001/v1/music/generate \
258
  -H 'Content-Type: application/json' \
259
  -d '{
260
  "caption": "电子舞曲",
261
- "model": "acestep-v15-turbo-rl",
262
  "thinking": true
263
  }'
264
  ```
@@ -382,8 +382,8 @@ curl -X POST http://localhost:8001/v1/music/generate \
382
  "keyscale": "C Major",
383
  "timesignature": "4",
384
  "genres": null,
385
- "lm_model": "acestep-5Hz-lm-0.6B-v3",
386
- "dit_model": "acestep-v15-turbo-rl"
387
  },
388
  "error": null
389
  }
@@ -441,15 +441,15 @@ curl -X POST http://localhost:8001/v1/music/random \
441
  {
442
  "models": [
443
  {
444
- "name": "acestep-v15-turbo-rl",
445
  "is_default": true
446
  },
447
  {
448
- "name": "acestep-v15-turbo",
449
  "is_default": false
450
  }
451
  ],
452
- "default_model": "acestep-v15-turbo-rl"
453
  }
454
  ```
455
 
@@ -514,14 +514,14 @@ API 服务器可以通过环境变量进行配置:
514
  | :--- | :--- | :--- |
515
  | `ACESTEP_API_HOST` | `127.0.0.1` | 服务器绑定主机 |
516
  | `ACESTEP_API_PORT` | `8001` | 服务器绑定端口 |
517
- | `ACESTEP_CONFIG_PATH` | `acestep-v15-turbo-rl` | 主 DiT 模型路径 |
518
  | `ACESTEP_CONFIG_PATH2` | (空)| 辅助 DiT 模型路径(可选)|
519
  | `ACESTEP_CONFIG_PATH3` | (空)| 第三个 DiT 模型路径(可选)|
520
  | `ACESTEP_DEVICE` | `auto` | 模型加载设备 |
521
  | `ACESTEP_USE_FLASH_ATTENTION` | `true` | 启用 flash attention |
522
  | `ACESTEP_OFFLOAD_TO_CPU` | `false` | 空闲时将模型卸载到 CPU |
523
  | `ACESTEP_OFFLOAD_DIT_TO_CPU` | `false` | 专门将 DiT 卸载到 CPU |
524
- | `ACESTEP_LM_MODEL_PATH` | `acestep-5Hz-lm-0.6B-v3` | 默认 5Hz LM 模型 |
525
  | `ACESTEP_LM_BACKEND` | `vllm` | LM 后端(vllm 或 pt)|
526
  | `ACESTEP_LM_DEVICE` | (与 ACESTEP_DEVICE 相同)| LM 设备 |
527
  | `ACESTEP_LM_OFFLOAD_TO_CPU` | `false` | 将 LM 卸载到 CPU |
 
84
 
85
  | 参数名 | 类型 | 默认值 | 说明 |
86
  | :--- | :--- | :--- | :--- |
87
+ | `model` | string | null | 选择使用哪个 DiT 模型(例如 `"acestep-v15-turbo"`、`"acestep-v15-turbo-shift3"`)。使用 `/v1/models` 列出可用模型。如果未指定,使用默认模型。|
88
 
89
  **thinking 语义(重要)**:
90
 
 
148
 
149
  | 参数名 | 类型 | 默认值 | 说明 |
150
  | :--- | :--- | :--- | :--- |
151
+ | `lm_model_path` | string | null | 5Hz LM 检查点目录名(例如 `acestep-5Hz-lm-0.6B`)|
152
  | `lm_backend` | string | `"vllm"` | `vllm` 或 `pt` |
153
  | `lm_temperature` | float | `0.85` | 采样温度 |
154
  | `lm_cfg_scale` | float | `2.5` | CFG 比例(>1 启用 CFG)|
 
258
  -H 'Content-Type: application/json' \
259
  -d '{
260
  "caption": "电子舞曲",
261
+ "model": "acestep-v15-turbo",
262
  "thinking": true
263
  }'
264
  ```
 
382
  "keyscale": "C Major",
383
  "timesignature": "4",
384
  "genres": null,
385
+ "lm_model": "acestep-5Hz-lm-0.6B",
386
+ "dit_model": "acestep-v15-turbo"
387
  },
388
  "error": null
389
  }
 
441
  {
442
  "models": [
443
  {
444
+ "name": "acestep-v15-turbo",
445
  "is_default": true
446
  },
447
  {
448
+ "name": "acestep-v15-turbo-shift3",
449
  "is_default": false
450
  }
451
  ],
452
+ "default_model": "acestep-v15-turbo"
453
  }
454
  ```
455
 
 
514
  | :--- | :--- | :--- |
515
  | `ACESTEP_API_HOST` | `127.0.0.1` | 服务器绑定主机 |
516
  | `ACESTEP_API_PORT` | `8001` | 服务器绑定端口 |
517
+ | `ACESTEP_CONFIG_PATH` | `acestep-v15-turbo` | 主 DiT 模型路径 |
518
  | `ACESTEP_CONFIG_PATH2` | (空)| 辅助 DiT 模型路径(可选)|
519
  | `ACESTEP_CONFIG_PATH3` | (空)| 第三个 DiT 模型路径(可选)|
520
  | `ACESTEP_DEVICE` | `auto` | 模型加载设备 |
521
  | `ACESTEP_USE_FLASH_ATTENTION` | `true` | 启用 flash attention |
522
  | `ACESTEP_OFFLOAD_TO_CPU` | `false` | 空闲时将模型卸载到 CPU |
523
  | `ACESTEP_OFFLOAD_DIT_TO_CPU` | `false` | 专门将 DiT 卸载到 CPU |
524
+ | `ACESTEP_LM_MODEL_PATH` | `acestep-5Hz-lm-0.6B` | 默认 5Hz LM 模型 |
525
  | `ACESTEP_LM_BACKEND` | `vllm` | LM 后端(vllm 或 pt)|
526
  | `ACESTEP_LM_DEVICE` | (与 ACESTEP_DEVICE 相同)| LM 设备 |
527
  | `ACESTEP_LM_OFFLOAD_TO_CPU` | `false` | 将 LM 卸载到 CPU |
docs/zh/GRADIO_GUIDE.md CHANGED
@@ -29,7 +29,7 @@
29
  python app.py
30
 
31
  # 预初始化
32
- python app.py --config acestep-v15-turbo-rl --init-llm
33
 
34
  # 指定端口
35
  python app.py --port 7860
@@ -55,14 +55,14 @@ Gradio 界面包含以下主要部分:
55
  | 设置 | 说明 |
56
  |---------|-------------|
57
  | **检查点文件** | 选择已训练的模型检查点(如果可用)|
58
- | **主模型路径** | 选择 DiT 模型配置(例如 `acestep-v15-turbo`、`acestep-v15-turbo-rl`)|
59
  | **设备** | 处理设备:`auto`(推荐)、`cuda` 或 `cpu` |
60
 
61
  ### 5Hz LM 配置
62
 
63
  | 设置 | 说明 |
64
  |---------|-------------|
65
- | **5Hz LM 模型路径** | 选择语言模型(例如 `acestep-5Hz-lm-0.6B`、`acestep-5Hz-lm-0.6B-v3`)|
66
  | **5Hz LM 后端** | `vllm`(更快,推荐)或 `pt`(PyTorch,兼容性更好)|
67
  | **初始化 5Hz LM** | 勾选以在初始化期间加载 LM(thinking 模式必需)|
68
 
@@ -477,7 +477,7 @@ LoRA 训练选项卡提供创建自定义 LoRA 适配器的工具。
477
 
478
  ### 加快生成速度
479
 
480
- 1. **使用 turbo 模型** - 选择 `acestep-v15-turbo` 或 `acestep-v15-turbo-rl`
481
  2. **保持推理步数为 8** - 这是 turbo 的最佳默认值
482
  3. **减少批量大小** - 如果需要快速结果,降低批量大小
483
  4. **禁用 AutoGen** - 手动控制批次生成
 
29
  python app.py
30
 
31
  # 预初始化
32
+ python app.py --config acestep-v15-turbo --init-llm
33
 
34
  # 指定端口
35
  python app.py --port 7860
 
55
  | 设置 | 说明 |
56
  |---------|-------------|
57
  | **检查点文件** | 选择已训练的模型检查点(如果可用)|
58
+ | **主模型路径** | 选择 DiT 模型配置(例如 `acestep-v15-turbo`、`acestep-v15-turbo-shift3`)|
59
  | **设备** | 处理设备:`auto`(推荐)、`cuda` 或 `cpu` |
60
 
61
  ### 5Hz LM 配置
62
 
63
  | 设置 | 说明 |
64
  |---------|-------------|
65
+ | **5Hz LM 模型路径** | 选择语言模型(例如 `acestep-5Hz-lm-0.6B`、`acestep-5Hz-lm-1.7B`)|
66
  | **5Hz LM 后端** | `vllm`(更快,推荐)或 `pt`(PyTorch,兼容性更好)|
67
  | **初始化 5Hz LM** | 勾选以在初始化期间加载 LM(thinking 模式必需)|
68
 
 
477
 
478
  ### 加快生成速度
479
 
480
+ 1. **使用 turbo 模型** - 选择 `acestep-v15-turbo` 或 `acestep-v15-turbo-shift3`
481
  2. **保持推理步数为 8** - 这是 turbo 的最佳默认值
482
  3. **减少批量大小** - 如果需要快速结果,降低批量大小
483
  4. **禁用 AutoGen** - 手动控制批次生成
docs/zh/INFERENCE.md CHANGED
@@ -35,13 +35,13 @@ llm_handler = LLMHandler()
35
  # 初始化服务
36
  dit_handler.initialize_service(
37
  project_root="/path/to/project",
38
- config_path="acestep-v15-turbo-rl",
39
  device="cuda"
40
  )
41
 
42
  llm_handler.initialize(
43
  checkpoint_dir="/path/to/checkpoints",
44
- lm_model_path="acestep-5Hz-lm-0.6B-v3",
45
  backend="vllm",
46
  device="cuda"
47
  )
 
35
  # 初始化服务
36
  dit_handler.initialize_service(
37
  project_root="/path/to/project",
38
+ config_path="acestep-v15-turbo",
39
  device="cuda"
40
  )
41
 
42
  llm_handler.initialize(
43
  checkpoint_dir="/path/to/checkpoints",
44
+ lm_model_path="acestep-5Hz-lm-0.6B",
45
  backend="vllm",
46
  device="cuda"
47
  )
skills/acemusic/SKILL.md CHANGED
@@ -250,7 +250,7 @@ project_root/
250
  "bpm": 120,
251
  "keyscale": "C Major",
252
  "duration": 60.0,
253
- "dit_model": "acestep-v15-turbo-rl"
254
  }
255
  }
256
  ```
 
250
  "bpm": 120,
251
  "keyscale": "C Major",
252
  "duration": 60.0,
253
+ "dit_model": "acestep-v15-turbo"
254
  }
255
  }
256
  ```