For someone very experienced with programming but not very experienced in Python (I do understand it and can tailor it), I have the snippet below using torch and audiocraft to load an AI music model, Hugging Face MusicGen.
Parameters currently include the model name/version, the generation duration, and the number of samples to produce at a time; each result is exported to a WAV file named f'{idx}'. What would be the best path to integrate a web interface for this: input field(s) for the description(s), a duration field, and a process button, where awaiting completion returns the audio for preview/playback and download? Thinking a website or an API, FastAPI in Python?
The snippet, currently custom.py:
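For the "return the audio for preview/download" part, I assume the generated tensor can be encoded to WAV bytes in memory rather than written to {idx}.wav on disk first. A stdlib-only sketch of that encoding step (the `pcm16_wav_bytes` helper is mine, and a synthetic sine wave stands in for the model output):

```python
import io
import math
import struct
import wave

def pcm16_wav_bytes(samples, sample_rate=32000):
    """Encode mono float samples in [-1, 1] as 16-bit PCM WAV bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)           # 16-bit PCM
        wf.setframerate(sample_rate)
        frames = b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples
        )
        wf.writeframes(frames)
    return buf.getvalue()

# A 0.1 s, 440 Hz sine stands in for one generated sample
# (in the real snippet this would be one_wav.cpu() flattened to floats).
sine = [math.sin(2 * math.pi * 440 * t / 32000) for t in range(3200)]
wav_bytes = pcm16_wav_bytes(sine)
print(wav_bytes[:4])  # b'RIFF'
```

Those bytes could then be served directly as an `audio/wav` HTTP response for the browser's `<audio>` element, with no temp files to clean up.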
import torch
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write
from audiocraft.utils.notebook import display_audio  # only needed for the commented-out preview below
#version='facebook/musicgen-melody'
version='facebook/musicgen-small'
print("Loading model", version)
# get the pre-trained model
model = MusicGen.get_pretrained(version, device='cuda')
model.set_generation_params(duration=90)  # generate 90 seconds per sample
# wav = model.generate_unconditional(4)  # optional: 4 unconditional samples (unused here; wav is overwritten below)
# text-to-music descriptions
descriptions = [
    'chillstep, calm synth, droplets',
    'new wave 80s chillstep, strong emotion',
    'darkest depths of space, feeling uneasy',
    'uplifting, orchestral, drums, classic, instrumental',
]
# generate the music
wav = model.generate(descriptions, progress=True)
# display the audio (notebook only)
# display_audio(wav, 32000)
# write audio file(s)
for idx, one_wav in enumerate(wav):
    # saves under {idx}.wav, with loudness normalization at -14 dB LUFS
    audio_write(f'{idx}', one_wav.cpu(), model.sample_rate, strategy="loudness", loudness_compressor=True)