Music Server With Built-in AI

by Ejrnesto in Design > Software

Screenshot 2025-12-04 at 12-46-56 Navidrome.png
cds.jpg
vinilos.jpg

Having a big music library is great, but listening on physical media has plenty of inconveniences. That, combined with the fact that some streaming companies make very questionable decisions, made me want to digitize my own collection and build a server to stream and organize it the way I want.

The idea is great, but we have to figure out how to do it. At first, I thought I'd just program a server in Python using Flask, but I quickly realized that building a user interface, a playlist system and a full-featured player would take a very long time.

That's when I decided to split the server in two:

  1. An ingestion server: the one I'll build from scratch, where the audio files are uploaded, processed and stored in a folder.
  2. A "user-friendly" server: Navidrome, a fully fledged UI made by somebody else, working on top of what my ingestion server stores, so I can listen to my music easily, with an experience very similar to a streaming service.

Supplies

material.jpg
tocadiscos_amp.jpg

Hardware:

  1. Raspberry Pi 5 (8GB RAM)
  2. Turntable, Amplifier and a Sound Card (for vinyl)
  3. A PC with a CD/DVD reader for CD rips

Software:

  1. Ableton
  2. Audacity
  3. PyCharm
  4. Asunder
  5. Flask
  6. Navidrome

Getting Our Audio Files

IMG_8002.jpg
Ripeo_vinilo_metadatos.png

We have to digitize our music, and the process depends on the medium.

To rip vinyl, we connect the turntable and amplifier to a PC through a sound card. Then we use software such as Ableton to record each side of the record, and later use Audacity to chop each recording into individual songs and add their metadata.

To get audio files from CDs, I used an old laptop with a CD/DVD drive and a program called Asunder, which connects to an external database and retrieves clean metadata based on the "raw" metadata stored on the CD.

Setting Up the Raspberry Pi

Screenshot from 2025-12-04 12-55-04.png

Once we have our audio files, we'll set up the Raspberry Pi to run our server.

In my case, I installed Ubuntu and chose Flask, a Python web framework, to implement the server.

To keep things organized, I recommend creating a folder for the project and a virtual environment, so there are no conflicts with external libraries, dependencies or other files.

Building the Ingestion Server

DiagramaBloques_ENG.png

This is the most difficult part of the project: now we'll code the ingestion server.

First, we have to think about what we want to process in the server and in what order. I made a flowchart of the process, but half of the steps weren't even planned from the beginning: I had to try many different things out through trial and error.

With these steps I automate, as much as I need, the processes of:

  1. Uploading audio files.
  2. Processing the files.
  3. Giving every song its metadata (manually, or using APIs and online databases).
  4. Organizing songs and albums thanks to metadata.
  5. Storing every song properly.

Now, we'll get down to business and code the server in Python.

Libraries and Genius API

librerias.png

The most important libraries are:

  1. flask: the centerpiece of the server; it creates the app and defines the server's routes.
  2. werkzeug: used for secure_filename(), which is very important for security when uploading files.
  3. flask_sqlalchemy: maps Python objects to SQL so we can create a database and store, delete and search songs in it.
  4. librosa: this is the server's AI; it analyzes the files and gives us the tempo, beats and key.
  5. mutagen: handles metadata; it writes the results of the analysis into each song's metadata.
  6. tinytag: does a quick read of the existing metadata of every song.
  7. lyricsgenius: a wrapper for the Genius API; it automatically searches Genius' lyrics database for the uploaded songs, using their metadata.
  8. requests: for HTTP requests. It downloads album covers from TheAudioDB in fetch_album_art().

We also have to create a Genius account, register an API client for our app by filling in a form, and then generate an access token (all of this is completely free!).

The Database

database.png

This may be the most important part of the server: without a database we can't store, organize or search files. Our database isn't too complex, but it does the job.

It stores an id for every song, along with its title, artist and album. It also stores a filename (different from what we see in the player), a track number if the song belongs to an album, the media type it comes from (vinyl, CD, SACD, digital file...), the tempo and number of beats we get from analyze_music(), the lyrics from fetch_lyrics(), the URL of the album cover we get from fetch_album_art() and the key of the song that estimate_key() gives us.
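To make the list of fields concrete, here is a minimal sketch of such a table. It is not the project's actual model: the server uses flask_sqlalchemy, while this sketch swaps in the standard-library sqlite3 module so it runs without extra dependencies. The column names follow the ones the templates use (track_number, media_type, beats_count, album_art_url...); the column types and the helper names open_db, add_song and find_by_artist are assumptions.

```python
import sqlite3

# Assumed schema: one row per song, mirroring the fields described above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS song (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    title         TEXT NOT NULL,
    artist        TEXT,
    album         TEXT,
    filename      TEXT NOT NULL UNIQUE,
    track_number  INTEGER,
    media_type    TEXT,
    tempo         REAL,
    beats_count   INTEGER,
    lyrics        TEXT,
    album_art_url TEXT,
    "key"         TEXT
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the song table exists."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # lets us read columns by name
    conn.executescript(SCHEMA)
    return conn


def add_song(conn, **fields):
    """Insert one song; field names are column names chosen by our own code."""
    columns = ", ".join(f'"{name}"' for name in fields)
    placeholders = ", ".join("?" for _ in fields)
    cursor = conn.execute(
        f"INSERT INTO song ({columns}) VALUES ({placeholders})",
        tuple(fields.values()),
    )
    conn.commit()
    return cursor.lastrowid


def find_by_artist(conn, artist):
    """Return every song by the given artist, ordered by album and track."""
    query = "SELECT * FROM song WHERE artist = ? ORDER BY album, track_number"
    return conn.execute(query, (artist,)).fetchall()
```

With flask_sqlalchemy the same columns would be declared as attributes of a model class, and the queries would go through the ORM instead of raw SQL.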

Parse_m3u()

parse_m3u.png

This function reads the .m3u playlist file that comes with a CD rip. It gives very useful information: the duration, the name of the artist and the name of the song. It's a kind of map that tells us that track01.flac is really "name of artist - name of song.flac".
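As an illustration of that mapping, here is a minimal sketch of an M3U parser. It assumes the playlist is in extended M3U format, where each track is announced by an "#EXTINF:duration,Artist - Title" line followed by the file name; the real parse_m3u() in the project may handle more cases than this.

```python
def parse_m3u(text):
    """Map each file name in an extended M3U playlist to its duration, artist and title."""
    mapping = {}
    pending = None  # info from the last #EXTINF line, waiting for its file name
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("#EXTINF:"):
            duration_text, _, display = line[len("#EXTINF:"):].partition(",")
            try:
                seconds = int(float(duration_text))
            except ValueError:
                seconds = None
            if seconds is not None and seconds < 0:
                seconds = None  # -1 conventionally means "unknown duration"
            artist, separator, title = display.partition(" - ")
            if not separator:
                # No "Artist - Title" pattern: treat the whole text as the title.
                artist, title = "", display
            pending = {
                "duration": seconds,
                "artist": artist.strip(),
                "title": title.strip(),
            }
        elif line.startswith("#"):
            continue  # other directives and comments, e.g. #EXTM3U
        elif pending is not None:
            mapping[line] = pending
            pending = None
    return mapping
```

For example, a playlist containing "#EXTINF:215,Depeche Mode - Stripped" followed by "track01.flac" yields an entry for track01.flac with that artist, title and a duration of 215 seconds.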

Clean_title_string()

clean_title_string().png

It cleans the title of a song, but what does cleaning mean? Well, if the title is in slug format (for example 01_artist_name_song_title), it converts it into a readable string and removes the track number, so the names look clean and professional in the database and in the player.
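The cleaning step can be sketched with a few regular expressions. This is only an approximation of what clean_title_string() does: the list of audio extensions and the rule of capitalizing all-lowercase slugs are assumptions.

```python
import re
import string

# Assumed set of audio extensions to strip from the end of a name.
AUDIO_EXTENSION = re.compile(r"\.(flac|mp3|wav|ogg|m4a)$", re.IGNORECASE)
# A leading track number such as "01_", "7 - " or "03."
LEADING_TRACK_NUMBER = re.compile(r"^\s*\d{1,3}[\s._-]+")


def clean_title_string(raw):
    """Turn a slug such as '01_artist_name_song_title' into a readable string."""
    text = AUDIO_EXTENSION.sub("", raw)
    text = LEADING_TRACK_NUMBER.sub("", text)
    text = text.replace("_", " ")
    text = re.sub(r"\s+", " ", text).strip()
    # Only re-capitalize slugs that are entirely lowercase; mixed-case
    # titles are assumed to be already written the way we want them.
    if text.islower():
        text = string.capwords(text)
    return text
```

For instance, clean_title_string("01_artist_name_song_title") returns "Artist Name Song Title", while "07 - Station to Station.flac" becomes "Station to Station".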

Extract_metadata_fallback()

extract_metadata_fallback.png

It uses the TinyTag library to extract the metadata already present in the file. It's very useful in case the AI system fails or there is no data on the internet for that specific file.

Estimate_key()

estimate_key.png

Using signal processing and math (librosa and numpy), it estimates the key of the song. First it generates a chromagram, then it looks for the tonic (the strongest or most frequent note) and compares the energy of the major third and the minor third above the tonic: if the major third is stronger, the key is major; otherwise, it is minor.
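The decision rule described above can be sketched in plain Python. This is a simplified illustration, not the project's estimate_key(): it assumes the chromagram has already been averaged over time into 12 energy values, one per pitch class starting at C. With librosa, such a vector could be obtained with librosa.feature.chroma_stft(y=y, sr=sr).mean(axis=1).

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def estimate_key_from_chroma(chroma_mean):
    """Guess the key from 12 averaged chroma energies (index 0 = C, 1 = C#, ...)."""
    if len(chroma_mean) != 12:
        raise ValueError("expected one energy value per pitch class (12 values)")
    # The tonic is taken to be the pitch class with the most energy.
    tonic = max(range(12), key=lambda index: chroma_mean[index])
    # A major third is 4 semitones above the tonic, a minor third is 3.
    major_third = chroma_mean[(tonic + 4) % 12]
    minor_third = chroma_mean[(tonic + 3) % 12]
    mode = "major" if major_third >= minor_third else "minor"
    return f"{NOTE_NAMES[tonic]} {mode}"
```

A vector dominated by A, with more energy on C than on C#, comes out as "A minor". This heuristic is cheap, which matters on a Raspberry Pi, but it can be fooled by songs where the loudest note is not the tonic.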

Analyze_music()

analyze_audio.png

Using librosa once again, this function is in charge of the audio analysis. It loads the audio file into RAM as an array of numbers, calculates the tempo by detecting the rhythmic beats and gets the key from estimate_key().

This function is like an AI coordinator in the server.

Fetch_lyrics()

fetch_lyrics.png

This function searches for the lyrics in Genius, using the lyricsgenius module.
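A minimal sketch of such a lookup follows. In lyricsgenius, a client is created with lyricsgenius.Genius(access_token), and its search_song(title, artist) method returns a song object with a lyrics attribute, or None when nothing is found. Passing the client in as a parameter and returning None on any error are choices made for this sketch, not necessarily what the project does.

```python
def fetch_lyrics(genius, title, artist):
    """Return the lyrics for a song, or None if they can't be found.

    `genius` is any object with a search_song(title, artist) method,
    for example a lyricsgenius.Genius client built from the access token.
    """
    try:
        song = genius.search_song(title, artist)
    except Exception:
        # Network problems or API errors should not stop the upload;
        # the song is simply stored without lyrics.
        return None
    lyrics = getattr(song, "lyrics", None)
    if not lyrics:
        return None
    return lyrics.strip()
```

In the template, a missing value is what makes the song page show 'Letra no disponible.' instead of the lyrics.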

Fetch_album_art()

fetch_album_art.png

This function is very similar to the previous one. It queries TheAudioDB (a database of music releases) and gets the album cover for the uploaded song.
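The lookup can be sketched as follows. The endpoint (searchalbum.php with the s and a parameters) and the strAlbumThumb field are assumptions based on TheAudioDB's public JSON API, so check them against the current documentation; the API key is a parameter you have to supply, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

API_ROOT = "https://www.theaudiodb.com/api/v1/json"  # assumed base URL


def build_album_search_url(artist, album, api_key):
    """Build the album search URL for a given artist and album."""
    query = urlencode({"s": artist, "a": album})
    return f"{API_ROOT}/{api_key}/searchalbum.php?{query}"


def extract_album_thumb(payload):
    """Pick the first cover URL out of the decoded JSON response, if any."""
    albums = (payload or {}).get("album") or []
    for entry in albums:
        thumb = entry.get("strAlbumThumb")
        if thumb:
            return thumb
    return None


def fetch_album_art(artist, album, api_key, timeout=10):
    """Query the API and return the cover URL, or None when there is no match."""
    import requests  # third-party; imported here so the helpers above work without it

    response = requests.get(build_album_search_url(artist, album, api_key), timeout=timeout)
    response.raise_for_status()
    return extract_album_thumb(response.json())
```

When nothing is found (soundtracks are a typical case), the function returns None and the cover can be uploaded manually from the song page.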

Embed_metadata()

The last function, and a very important one: it takes all the metadata received or created by the other functions and embeds it into every song file.

Routes

paginaprincipal.png

Flask is a web framework: we define the routes (the URLs of our server) and then what happens at each of them. For example, '/' is the homepage, with the title, the form to upload songs and a summary of recent uploads; '/upload' processes the upload form and does everything in the flowchart. We have GET routes (for displaying pages and serving audio to the user) and POST routes (for actions such as ingestion and management).

HTML

Screenshot 2025-12-04 at 13-02-26 Stripped - Depeche Mode.png
Screenshot from 2025-12-04 11-56-53.png

Everything we have done up to this point is very useful, but we also need an interface so a user can upload songs easily. That's where the HTML templates come into play. We have two: one for the index, the page where we upload songs, and another for each song, where we can see the lyrics, BPM and key, play the song, change its cover art or delete it.

Index.html

<!DOCTYPE html>

<html lang="es">

<head>

<meta charset="UTF-8">

<meta name="viewport" content="width=device-width, initial-scale=1.0">

<title>🎵 Servidor de Música con IA 🎵</title>

<style>

body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif; margin: 0; background: #f8f9fa; }

.container { max-width: 800px; margin: 2em auto; background: white; padding: 2em; border-radius: 8px; box-shadow: 0 4px 12px rgba(0,0,0,0.05); }

h1, h2 { color: #333; }

h1 { border-bottom: 2px solid #007bff; padding-bottom: 10px; }

h3 { font-size: 1.1em; color: #444; margin-bottom: 10px; }


form { border: 2px dashed #007bff; padding: 1.5em; border-radius: 8px; background: #fdfdff; margin-bottom: 2em; }

form label { display: block; font-weight: bold; margin-bottom: 0.5em; color: #555; }

form input[type="file"], form input[type="text"], form select { width: 100%; padding: 8px; box-sizing: border-box; margin-bottom: 1em; border: 1px solid #ccc; border-radius: 4px; }

form button { background: #007bff; color: white; padding: 12px 20px; border: none; border-radius: 4px; cursor: pointer; font-size: 1.1em; margin-top: 1em; width: 100%; font-weight: bold;}

form button:hover { background: #0056b3; }


/* Sorting controls */

.sorting-controls { margin-bottom: 1em; font-size: 0.9em; color: #555; padding: 10px; background: #f9f9f9; border-radius: 4px; text-align: center;}

.sorting-controls a { text-decoration: none; color: #007bff; margin: 0 5px; }

.sorting-controls a.active { font-weight: bold; text-decoration: underline; color: #0056b3; }


.song-list { list-style: none; padding: 0; margin-top: 1em; }

.song-list li { display: flex; align-items: center; padding: 12px; border-bottom: 1px solid #ddd; transition: background 0.2s; }

.song-list li:last-child { border-bottom: none; }

.song-list li:hover { background: #f1f3f5; }


.song-list .album-art {

width: 50px; height: 50px; object-fit: cover; border-radius: 4px;

margin-right: 15px; background: #eee; flex-shrink: 0; border: 1px solid #ddd;

}


.song-list .song-info { display: flex; flex-direction: column; }

.song-list .song-info a { text-decoration: none; color: #0056b3; font-weight: 600; font-size: 1.1em; }

.song-list .song-info span { font-style: normal; color: #666; font-size: 0.9em; margin-top: 3px; }


.media-badge {

background: #6c757d; color: white; font-size: 0.7em; padding: 2px 6px;

border-radius: 10px; text-transform: uppercase; margin-left: 8px; vertical-align: middle;

}


.flash { padding: 15px; border-radius: 4px; margin-bottom: 1em; border: 1px solid; }

.flash.success { background: #d4edda; color: #155724; border-color: #c3e6cb; }

.flash.error { background: #f8d7da; color: #721c24; border-color: #f5c6cb; }

</style>

</head>

<body>

<div class="container">

<h1>🎵 Servidor de Música con IA 🎵</h1>


{% with messages = get_flashed_messages(with_categories=true) %}

{% if messages %}

{% for category, message in messages %}

<div class="flash {{ category }}">{{ message }}</div>

{% endfor %}

{% endif %}

{% endwith %}


<h2>Subir y Organizar</h2>


<form action="{{ url_for('upload_file') }}" method="POST" enctype="multipart/form-data">


<div style="background: #f0f4f8; padding: 15px; border-radius: 5px; margin-bottom: 15px;">

<h3 style="margin-top:0; color: #0056b3;">1. Datos del Álbum</h3>


<label for="media_type">💿 Soporte / Origen:</label>

<select id="media_type" name="media_type">

<option value="CD">CD (Compact Disc)</option>

<option value="DIG">Digital / Web</option>

<option value="VINYL">Vinilo / LP</option>

<option value="CASSETTE">Cassette</option>

<option value="SACD">SACD (Super Audio CD)</option>

<option value="DAT">DAT (Digital Audio Tape)</option>

</select>


<label for="force_artist">Nombre del Artista:</label>

<input type="text" id="force_artist" name="force_artist" placeholder="Ej: Cocteau Twins">


<label for="force_album">Nombre del Álbum:</label>

<input type="text" id="force_album" name="force_album" placeholder="Ej: Treasure">


<label for="m3u_file" style="margin-top: 15px; display:block;">Archivo .M3U (Opcional):</label>

<input type="file" id="m3u_file" name="m3u_file" accept=".m3u,.m3u8">

</div>


<div>

<h3 style="margin-top:0; color: #0056b3;">2. Archivos de Audio</h3>

<label for="file">Selecciona las canciones:</label>

<input type="file" id="file" name="file" accept="audio/*" required multiple>

</div>


<button type="submit">Procesar</button>

</form>


<h2>Biblioteca</h2>

<div class="sorting-controls">

<strong>Ordenar:</strong>

<a href="{{ url_for('index', sort_by='id', order='desc') }}" class="{{ 'active' if current_sort == 'id' }}">Recientes</a> |

<a href="{{ url_for('index', sort_by='artist', order='asc') }}" class="{{ 'active' if current_sort == 'artist' }}">Artista</a> |

<a href="{{ url_for('index', sort_by='album', order='asc') }}" class="{{ 'active' if current_sort == 'album' }}">Álbum</a> |

<a href="{{ url_for('index', sort_by='media_type', order='asc') }}" class="{{ 'active' if current_sort == 'media_type' }}">Soporte</a>

</div>


<ul class="song-list">

{% for song in songs %}

<li>

{% if song.album_art_url %}

<img src="{{ song.album_art_url }}" alt="Portada" class="album-art">

{% else %}

<div class="album-art" style="display: flex; align-items: center; justify-content: center; font-size: 1.5em; color: #bbb;">🎵</div>

{% endif %}


<div class="song-info">

<a href="{{ url_for('song_detail', song_id=song.id) }}">

{% if song.track_number %}

<span style="font-weight: bold; background: #eee; padding: 2px 5px; border-radius: 3px; font-size: 0.8em;">#{{ song.track_number }}</span>

{% endif %}

{{ song.title }}

</a>

<span>

{{ song.artist }} — <em>{{ song.album }}</em>

{% if song.media_type %}

<span class="media-badge">{{ song.media_type }}</span>

{% endif %}

</span>

</div>

</li>

{% else %}

<li style="padding: 2em; color: #777;">Biblioteca vacía.</li>

{% endfor %}

</ul>

</div>

</body>

</html>

Song.index

<!DOCTYPE html>

<html lang="es">

<head>

<meta charset="UTF-8">

<meta name="viewport" content="width=device-width, initial-scale=1.0">

<title>{{ song.title }} - {{ song.artist }}</title>

<style>

body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif; margin: 0; background: #f8f9fa; }

.container { max-width: 800px; margin: 2em auto; background: white; padding: 2em; border-radius: 8px; box-shadow: 0 4px 12px rgba(0,0,0,0.05); }

h1 { color: #333; margin-bottom: 5px;}

h2 { color: #555; font-weight: 400; margin-top: 0; }

h3 { border-bottom: 1px solid #eee; padding-bottom: 5px; margin-top: 1.5em; }


.header { display: flex; align-items: flex-start; gap: 20px; margin-bottom: 1.5em; }

.header .album-art-detail {

width: 150px; height: 150px; object-fit: cover;

border-radius: 6px; background: #eee; flex-shrink: 0;

box-shadow: 0 2px 8px rgba(0,0,0,0.1);

}

.header .info { display: flex; flex-direction: column; }


audio { width: 100%; margin: 1em 0; }

.analysis { background: #f1f3f5; padding: 1em; border-radius: 5px; margin-bottom: 2em; display: flex; gap: 20px; flex-wrap: wrap;}

.analysis div { flex: 1; min-width: 150px; }

.lyrics { white-space: pre-wrap; line-height: 1.6; background: #fafafa; padding: 1em; border: 1px solid #ddd; border-radius: 5px; max-height: 400px; overflow-y: auto;}


.controls { margin-top: 2em; }

a.back-link { display: inline-block; color: #007bff; text-decoration: none; font-weight: bold; margin-bottom: 20px;}


.admin-section { margin-top: 2em; padding-top: 1em; border-top: 2px dashed #eee; background: #fff5f5; padding: 15px; border-radius: 5px;}

.delete-button { background: #dc3545; color: white; padding: 10px 15px; border: none; border-radius: 4px; cursor: pointer; }

.delete-button:hover { background: #c82333; }


/* Styles for the cover art upload form */

.upload-art-form { background: #e3f2fd; padding: 15px; border-radius: 5px; margin-bottom: 15px; border: 1px solid #90caf9; }

.upload-art-btn { background: #007bff; color: white; padding: 8px 12px; border: none; border-radius: 4px; cursor: pointer; }

</style>

</head>

<body>

<div class="container">

<a href="{{ url_for('index') }}" class="back-link">&larr; Volver a la lista</a>


<div class="header">

{% if song.album_art_url %}

<img src="{{ song.album_art_url }}" alt="Portada" class="album-art-detail">

{% else %}

<div class="album-art-detail" style="display: flex; align-items: center; justify-content: center; font-size: 4em; color: #bbb;">🎵</div>

{% endif %}


<div class="info">

<h1>{{ song.title }}</h1>

<h2>{{ song.artist }}</h2>

<p style="color: #777; margin: 0;">Álbum: <strong>{{ song.album }}</strong></p>

<p style="color: #999; margin: 5px 0 0 0; font-size: 0.9em;">Pista: {{ song.track_number if song.track_number else '?' }}</p>

</div>

</div>


<audio controls src="{{ url_for('serve_file', filename=song.filename) }}">

Tu navegador no soporta el elemento de audio.

</audio>


<div class="analysis">

<div>

<strong>Tempo:</strong> {{ song.tempo | round(1) }} BPM<br>

<small>Beats: {{ song.beats_count }}</small>

</div>

<div>

<strong>Tonalidad:</strong> {{ song.key }}

</div>

</div>


<h3>📜 Letra</h3>

<div class="lyrics">

{{ song.lyrics if song.lyrics else 'Letra no disponible.' }}

</div>


<div class="admin-section">

<h3>Administrar Canción</h3>


<div class="upload-art-form">

<h4 style="margin-top:0;">🖼️ Cambiar Portada Manualmente</h4>

<p style="font-size: 0.9em; margin-bottom: 10px;">Si la automática falló (ej. Soundtracks), sube la imagen aquí. Se incrustará en el archivo.</p>


<form action="{{ url_for('upload_art', song_id=song.id) }}" method="POST" enctype="multipart/form-data">

<input type="file" name="art_file" accept="image/jpeg,image/png" required>

<button type="submit" class="upload-art-btn">Subir y Reemplazar</button>

</form>

</div>


<form action="{{ url_for('delete_song', song_id=song.id) }}" method="POST"

onsubmit="return confirm('¿Estás seguro de que quieres eliminar esta canción?');">

<button type="submit" class="delete-button">🗑️ Eliminar Canción Permanentemente</button>

</form>

</div>


</div>

</body>

</html>

Running the Server

IMG_8004_2.jpg

The tough part is done, after plenty of trial and error (especially with metadata from ripped CDs, soundtracks and compilation albums). Now we can run the server and upload our favorite songs.

To run the server, we just run the following command in the terminal (remember: you have to be in the project folder with the virtual environment activated):

$ python run.py

We'll then see the addresses where the server is running: the localhost address, 127.0.0.1:5000, and our private IP.

To open the server, just go to http://127.0.0.1:5000 in your favorite browser.

Uploading Songs

Screenshot 2025-12-04 at 11-59-15 🎵 Servidor de Música con IA 🎵.png

To upload one or more songs, we can optionally fill in the artist and album name by hand (useful for cases such as a compilation album with many artists) and choose the medium the files were extracted from. If we are uploading a CD rip, we can also attach its .m3u file. Then we select one or more songs and upload them.

During the process, we'll see status updates in the terminal, showing which song is being processed and which step it is at.

Once the songs are uploaded, they will appear in the uploads folder, inside our 'myproject' directory.

Integration With Navidrome

docker.png
Screenshot 2025-12-04 at 12-45-58 Navidrome.png

Right now the server is very good at uploading songs, getting their metadata and storing them, but not at actually playing music: the interface isn't very attractive, it's a little slow and it doesn't give the user many options (for example, there is no way to create playlists). That's why we'll integrate Navidrome into our server.

Navidrome isn't connected directly to our server. It just scans our uploads folder and presents what's inside in a more attractive way, with an interface similar to apps like Spotify. So we can say our ingestion server does the dirty work, organizing the music and embedding its metadata so that Navidrome reads it properly.

To do the integration, we just have to run a Docker command in the terminal, and then we can use Navidrome as our streaming server.

Then we open http://localhost:4533 and Navidrome's login page will welcome us. Now you just have to create a username and a password, and from now on you have your own music streaming server. Congrats!

Navidrome on the Go

Screenshot 2025-12-04 at 12-46-56 Navidrome.png
Screenshot 2025-12-04 at 12-49-35 Station to Station - David Bowie - Navidrome.png
IMG_8224.PNG
IMG_8225.PNG

Navidrome works just like any other music streaming app, so I don't have much to say about it, but one very interesting thing is that you can use your server from your cellphone.

To do so, you just have to download an app that works as a Subsonic client (in my case, Agin Music). By entering your server's IP address and your login info, you can access the server from your phone.

To Be Continued?

As I said before, I've been thinking about doing this project for a long time, so I'd like to upgrade it over time because there are still some to-dos.

First of all, the biggest flaw of the server is that I can only use it from a device connected to the same network as the Raspberry Pi. I'd like to fix that, but it'll take some time.

The Raspberry Pi isn't very fast, especially for the AI analysis. Another thing I'd like to implement is audio processing beyond what's already done: as a musician myself, I have some recordings and stems, and I would like to be able to do some kind of production on the server (such as adding reverb, cropping audio, putting two tracks together...), but I still have to figure out how to do all that.

So far I'm very happy with what I've done, even though I think this will be a constantly evolving project.