Introduction
Symusic (Symbolic Music) is a high-performance toolkit for symbolic music data. It provides a C++20 core that parses and manipulates MIDI at the note level and exposes ergonomic Python factories via nanobind. Typical operations—loading a multi-track score, shifting the pitch of entire sections, exporting piano rolls, or rendering audio with SoundFonts—complete hundreds of times faster than the pure-Python stacks traditionally used in MIR and deep-learning pipelines.
What makes Symusic different?
Speed with fidelity – MIDI is decoded into note-level structures directly in C++ and exposed to Python without copies, so workflows that used to require custom C++ extensions now run inside a notebook.
Time-unit flexibility – Score/Track/Event objects are available in tick, quarter, or second space, letting you choose between raw MIDI timing, musically normalized beats, or wall-clock seconds depending on the task.
Cross-language friendly – The core lives entirely in standard C++, so bindings for other languages (Julia, Lua, etc.) can be built on the same foundation.
End-to-end pipeline – Beyond parsing, Symusic includes transformations (filtering, trimming, resampling), beat/downbeat extraction, structured NumPy exports (SoA), piano-roll conversion, and synthesis via the Prestosynth engine.
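To make the relationship between the three time units concrete, here is a minimal pure-Python sketch of the arithmetic involved. It is not Symusic's implementation or API: `TempoChange`, `tick_to_quarter`, and `tick_to_second` are hypothetical names introduced only for illustration, and the sketch assumes a piecewise-constant tempo map with the standard MIDI default of 120 quarter notes per minute.

```python
from dataclasses import dataclass


@dataclass
class TempoChange:
    tick: int   # position of the tempo change, in ticks
    qpm: float  # tempo from this point on, in quarter notes per minute


def tick_to_quarter(tick: int, ticks_per_quarter: int) -> float:
    """Ticks to beats: divide by the file's resolution."""
    return tick / ticks_per_quarter


def tick_to_second(tick: int, ticks_per_quarter: int, tempos: list[TempoChange]) -> float:
    """Ticks to seconds: accumulate time over each constant-tempo segment."""
    seconds = 0.0
    prev_tick, prev_qpm = 0, 120.0  # MIDI default tempo until a change is seen
    for change in sorted(tempos, key=lambda c: c.tick):
        if change.tick >= tick:
            break
        # Close the segment that ends at this tempo change.
        seconds += (change.tick - prev_tick) / ticks_per_quarter * 60.0 / prev_qpm
        prev_tick, prev_qpm = change.tick, change.qpm
    # Add the final, partial segment up to the requested tick.
    return seconds + (tick - prev_tick) / ticks_per_quarter * 60.0 / prev_qpm


# Example: 480 ticks per quarter, tempo drops from 120 to 60 after two beats.
tempo_map = [TempoChange(0, 120.0), TempoChange(960, 60.0)]
beats = tick_to_quarter(1920, 480)
secs = tick_to_second(1920, 480, tempo_map)
```

Tick values are exact integers but depend on the file's resolution, quarter values are comparable across files, and second values additionally depend on the tempo map. That is why the choice of unit depends on the task.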
Architecture at a glance
| Layer | Purpose | Key technologies |
|---|---|---|
| C++ engine | MIDI/ABC parsing, time-unit conversions, vectorized containers | C++20, fmt, minimidi, zpp_bits |
| Python bindings | Factories ( | nanobind, stubs in |
| Synthesizer | SoundFont 2/3 rendering and WAV dumping | Prestosynth, NumPy |
| Documentation | Canonical versioned docs on Read the Docs; legacy mdBook kept as an archive | Sphinx, MyST, Read the Docs |
Installation overview
pip install symusic
We publish CPython wheels for Python 3.9 through 3.14 across the Linux, macOS, and Windows targets
listed in .github/workflows/wheel.yml, including Windows ARM64. PyPy wheels are currently
published for pp311 on manylinux_x86_64, macosx_x86_64, and macosx_arm64.
When you need to build from source, clone the repository with submodules and install it through
pip:
git clone --recursive https://github.com/Yikai-Liao/symusic
cd symusic
pip install .
Symusic requires a C++20 compiler (GCC ≥ 11, Clang ≥ 15, MSVC 2022). If you do not have permission to
install or upgrade the system compiler on Linux, a conda-forge::gcc toolchain plus the CC=/path/to/gcc override works well.
Note
The canonical documentation lives on Read the Docs at https://symusic.readthedocs.io/en/stable/. The legacy mdBook remains online at https://yikai-liao.github.io/symusic/ for historical links, but the Read the Docs site and the codebase are the primary references for current behavior.
For doc migration history and local build instructions, see Documentation Notes.
Benchmarks
Symusic’s decoding core is optimized for note-level MIDI workloads and outperforms the common MIDI libraries shared across MIR tooling:
midifile (C++) emits both event- and note-level data but spends significant time in iostream.
mido (Python) only parses event-level structures and serves as the foundation for many Python stacks.
pretty_midi and miditoolkit build on top of mido to expose note-level abstractions.
Python-accessible libraries are timed with timeit, C++ projects use nanobench, and Julia libraries use BenchmarkTools.
The end-to-end scripts live in symusic-benchmark
and currently run on GitHub Actions M1 runners.
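As a rough illustration of the timeit-based approach for Python-accessible libraries, the sketch below times a single callable and reports the best per-call time. It is not taken from the symusic-benchmark scripts: `parse_stub` is a hypothetical stand-in for a real parser, and `best_time_per_call` is an illustrative helper.

```python
import timeit


def parse_stub(data: bytes) -> int:
    """Hypothetical stand-in for a MIDI parser: counts note-on status bytes."""
    return sum(1 for b in data if (b & 0xF0) == 0x90)


def best_time_per_call(func, arg, number: int = 100, repeat: int = 5) -> float:
    """Return the fastest observed seconds-per-call across `repeat` runs."""
    timings = timeit.repeat(lambda: func(arg), number=number, repeat=repeat)
    # The minimum is the usual summary: slower runs mostly reflect system noise.
    return min(timings) / number


# Synthetic payload: 1000 note-on/note-off pairs for middle C.
payload = bytes([0x90, 60, 100, 0x80, 60, 0]) * 1000
seconds = best_time_per_call(parse_stub, payload)
print(f"{seconds * 1e6:.1f} microseconds per call")
```

To compare real libraries with this pattern, replace `parse_stub` with each library's load call and use the same input file for all of them.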

Motivation
Symusic aims to provide a fast symbolic music preprocessing backend for MIDI and ABC workflows, with room to grow into additional formats over time.
The previously dominant MIDI parsing backend is mido (used by pretty_midi and miditoolkit), which is written in pure Python. It is too slow for large-scale symbolic music preprocessing in the deep learning era, making it infeasible to tokenize the required data in real time during training.
Out of that need, we developed this library. It is written in C++, exposes a Python binding through
nanobind, and is over 100 times faster than mido. We parse MIDI files to note level (similar to
miditoolkit) instead of event level
(mido), which is more suitable for large-scale symbolic music
preprocessing. ABC support is already integrated, while formats such as MusicXML remain future work.
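The difference between event-level and note-level data can be sketched in a few lines of pure Python. This is a simplified illustration, not the algorithm used by Symusic or minimidi. The `Event` and `Note` classes and `events_to_notes` are hypothetical, and the sketch assumes a single channel and first-in, first-out pairing when the same pitch overlaps.

```python
from dataclasses import dataclass


@dataclass
class Event:
    tick: int
    kind: str          # "on" or "off"
    pitch: int
    velocity: int = 0


@dataclass
class Note:
    time: int
    duration: int
    pitch: int
    velocity: int


def events_to_notes(events: list[Event]) -> list[Note]:
    """Pair note-on and note-off events into notes with explicit durations."""
    pending: dict[int, list[Event]] = {}  # open note-ons, keyed by pitch
    notes: list[Note] = []
    for ev in sorted(events, key=lambda e: e.tick):
        # A note-on with velocity 0 is a note-off by MIDI convention.
        if ev.kind == "on" and ev.velocity > 0:
            pending.setdefault(ev.pitch, []).append(ev)
        elif pending.get(ev.pitch):
            start = pending[ev.pitch].pop(0)  # first-in, first-out pairing
            notes.append(Note(start.tick, ev.tick - start.tick, ev.pitch, start.velocity))
    notes.sort(key=lambda n: (n.time, n.pitch))
    return notes


events = [
    Event(0, "on", 60, 100),
    Event(0, "on", 64, 90),
    Event(480, "off", 60),
    Event(960, "on", 64, 0),   # velocity 0 closes the note
]
notes = events_to_notes(events)
```

With note-level data, each note carries its own onset, duration, pitch, and velocity, so downstream steps such as tokenization do not need to track open notes themselves.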
We separated the event-level MIDI parsing code into a lightweight and efficient header-only C++ library minimidi for those who only need to parse MIDI files to event level in C++.
In the future, we will also bind symusic to other languages like Julia and Lua for more convenient use.