AudioIO

Platform-independent interfacing of numpy arrays of floats with audio files and devices.

The AudioIO modules try to use whatever audio packages are installed on your system to achieve their tasks. In addition, AudioIO provides its own code for handling metadata and marker lists.

Features

  • Audio data are always numpy arrays of floats (np.float64) with values ranging between -1 and 1, independent of how the data are stored in an audio file.
  • load_audio() function for loading a whole audio file.
  • Blockwise random-access loading of large audio files (class AudioLoader).
  • blocks() generator for iterating over blocks of data with optional overlap.
  • write_audio() function for writing data, metadata, and markers to an audio file.
  • metadata() function for reading metadata as nested dictionaries of key-value pairs.
  • markers() function for reading markers, i.e. cue points with spans, labels, and descriptions.
  • Platform-independent, synchronous (blocking) and asynchronous (non-blocking) playback of numpy arrays via play().
  • Automatic resampling of data for playback to match supported sampling rates.
  • Detailed and platform-specific installation instructions (pip, conda, Debian and RPM based Linux packages, Homebrew for macOS) for all supported audio packages (audiomodules).

Installation

AudioIO is available on PyPI. Simply run:

pip install audioio

After that, AudioIO uses the audio packages already installed on your system for reading and writing audio files and for playing audio data. However, the audio file formats supported by the Python standard library are limited to basic wave files, and its playback capabilities are poor. If you need support for additional audio file formats or proper sound output, you need to install additional packages.

See installation for further instructions and recommendations on additional audio packages.

Usage

See the API documentation for detailed information.

import audioio as aio

Loading audio data

Load an audio file into a numpy array using load_audio():

data, samplingrate = aio.load_audio('audio/file.wav')

The loaded data are always numpy arrays of floats ranging between -1 and 1. The arrays are always 2-D, with time on the first axis and channels on the second axis, even for single-channel data.
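To illustrate this convention, here is a minimal sketch of how interleaved 16-bit integer samples can be turned into such an array. The helper `pcm16_to_float()` is hypothetical and not part of AudioIO; it only shows the resulting shape, data type, and value range.

```python
import numpy as np

def pcm16_to_float(raw, channels):
    """Convert interleaved little-endian 16-bit PCM bytes to floats in [-1, 1).

    Hypothetical helper for illustration only, not AudioIO's implementation.
    """
    ints = np.frombuffer(raw, dtype='<i2')
    # one row per frame (time), one column per channel:
    frames = ints.reshape(-1, channels)
    return frames.astype(np.float64) / 32768.0

# two stereo frames: (0, 16384) and (-32768, 32767)
raw = np.array([0, 16384, -32768, 32767], dtype='<i2').tobytes()
data = pcm16_to_float(raw, channels=2)
print(data.shape)    # frames x channels
print(data[:, 0])    # first channel
```

A mono file handled this way still yields a 2-D array with a single column, so `data[:, 0]` works regardless of the number of channels.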

Plot the first channel:

import numpy as np
import matplotlib.pyplot as plt

time = np.arange(len(data))/samplingrate
plt.plot(time, data[:,0])
plt.show()

Get a nested dictionary with key-value pairs of the file's metadata and print it using metadata() and print_metadata():

md = aio.metadata('audio/file.wav')
aio.print_metadata(md)

Get and print marker positions, spans, labels and texts using markers() and print_markers():

locs, labels = aio.markers('audio/file.wav')
aio.print_markers(locs, labels)

You can also randomly access chunks of data of an audio file, without loading the entire file into memory, by means of the AudioLoader class. This is really handy for analysing very long sound recordings:

# open audio file with a buffer holding 60 seconds of data:
with aio.AudioLoader('audio/file.wav', 60.0) as data:
    block = 1000
    rate = data.samplerate
    for i in range(len(data)//block):
        x = data[i*block:(i+1)*block]
        # ... do something with x and rate

Even simpler, iterate in blocks over the file with overlap using the blocks() generator:

from scipy.signal import spectrogram
nfft = 2048
with aio.AudioLoader('some/audio.wav') as data:
    for x in data.blocks(100*nfft, nfft//2):
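        # blocks are 2-D (time, channel), so compute the spectrogram along the time axis:
        f, t, Sxx = spectrogram(x, fs=data.samplerate, nperseg=nfft, noverlap=nfft//2, axis=0)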
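What iterating in blocks with overlap means can be sketched on any sliceable sequence. The generator below is a hypothetical stand-in, not AudioIO's code. It assumes that consecutive blocks share `noverlap` frames and that the final block may be shorter; how AudioIO treats the end of a file may differ.

```python
def overlapping_blocks(data, block_size, noverlap=0):
    """Yield consecutive slices of data that share noverlap frames.

    Hypothetical illustration of block-wise iteration with overlap.
    """
    if not 0 <= noverlap < block_size:
        raise ValueError('noverlap must be non-negative and smaller than block_size')
    step = block_size - noverlap
    for start in range(0, len(data), step):
        yield data[start:start + block_size]
        if start + block_size >= len(data):
            break    # the end of the data has been reached

blocks = list(overlapping_blocks(list(range(10)), 4, 2))
for b in blocks:
    print(b)
```

Each block starts `block_size - noverlap` frames after the previous one, which is why an overlap of `nfft//2` fits a spectrogram computed with the same `noverlap`.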

Metadata and markers can be accessed by the metadata() and markers() member functions of the AudioLoader object:

with aio.AudioLoader('audio/file.wav', 60.0) as data:
    md = data.metadata()
    locs, labels = data.markers()

See API documentation of the audioloader, audiometadata, and audiomarkers modules for details.

Writing audio data

Write a 1-D or 2-D numpy array into an audio file (data values between -1 and 1) using the write_audio() function:

aio.write_audio('audio/file.wav', data, samplingrate)

Again, in 2-D arrays the first axis (rows) is time and the second axis the channel (columns).

Metadata in form of a nested dictionary with key-value pairs, marker positions and spans (locs) as well as associated labels and texts (labels) can also be passed on to the write_audio() function:

aio.write_audio('audio/file.wav', data, samplingrate, md, locs, labels)
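As a rough sketch, the arguments md, locs, and labels could look as follows. All keys and values are made up for illustration. The array layout (one row per marker, with position and span in locs, and label and text in labels) is an assumption based on the description above; see the API documentation for the authoritative format. The helper `flatten()` is hypothetical and only shows how to walk a nested dictionary.

```python
import numpy as np

# hypothetical metadata: sections map to dictionaries of key-value pairs
md = {'INFO': {'Artist': 'unknown', 'Comment': 'test recording'},
      'Recording': {'Gain': '20dB',
                    'Location': {'Site': 'pond', 'Depth': '0.5m'}}}

# assumed layout: one row per marker, columns are position and span in frames
locs = np.array([[1000, 0],          # a cue point without span
                 [48000, 24000]])    # a region spanning 24000 frames

# assumed layout: one row per marker, columns are label and text
labels = np.array([['start', 'recording begins'],
                   ['call', 'first vocalization']], dtype=object)

def flatten(md, prefix=''):
    """Yield (dotted key path, value) pairs of a nested dictionary."""
    for key, value in md.items():
        path = f'{prefix}{key}'
        if isinstance(value, dict):
            yield from flatten(value, path + '.')
        else:
            yield path, value

for path, value in flatten(md):
    print(f'{path}: {value}')
```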

See API documentation of the audiowriter module for details.

Converting audio files

AudioIO provides a command line script for converting, downsampling, renaming and merging audio files:

> audioconverter -e float -o test.wav test.mp3

If possible, audioconverter tries to keep metadata and marker lists.

See documentation of the audioconverter module for details.

Display metadata and markers

AudioIO provides a command line script that prints metadata and markers of audio files to the console:

> audiometadata test.wav

See documentation of the audiometadata module for details.

Playing sounds

Fade in and out (fade()) and play (play()) a 1-D or 2-D numpy array as a sound (first axis is time and second axis the channel):

aio.fade(data, samplingrate, 0.2)
aio.play(data, samplingrate)
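The call above suggests that fade() modifies the array in place. A minimal sketch of such an operation with linear ramps is shown below. `linear_fade()` is a hypothetical helper, and the ramp shape that AudioIO actually applies may differ.

```python
import numpy as np

def linear_fade(data, rate, fadetime):
    """Fade a 1-D or 2-D float array in and out in place with linear ramps.

    Hypothetical illustration, not AudioIO's fade().
    """
    # number of frames per ramp, at most half of the data:
    n = min(int(fadetime * rate), len(data) // 2)
    if n < 1:
        return
    ramp = np.linspace(0.0, 1.0, n)
    if data.ndim > 1:
        ramp = ramp[:, np.newaxis]    # broadcast over channels
    data[:n] *= ramp                  # fade in
    data[-n:] *= ramp[::-1]           # fade out

rate = 1000.0
data = np.ones((1000, 2))             # one second of constant stereo signal
linear_fade(data, rate, 0.2)
print(data[0], data[500], data[-1])
```

Fading the edges avoids the audible clicks that an abrupt onset or offset produces during playback.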

Just beep():

aio.beep()

Beep for half a second at 440 Hz:

aio.beep(0.5, 440.0)
aio.beep(0.5, 'a4')

Musical notes are translated into frequency with the note2freq() function.
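The underlying idea can be sketched with the standard equal-temperament formula, in which each semitone multiplies the frequency by the twelfth root of two. The function `note_to_freq()` below is a hypothetical reimplementation for illustration; the note syntax that note2freq() accepts may differ.

```python
def note_to_freq(note, a4freq=440.0):
    """Frequency in Hertz of a note like 'a4', 'c#5', or 'eb3'.

    Hypothetical sketch using equal temperament, not AudioIO's note2freq().
    """
    semitones = {'c': 0, 'd': 2, 'e': 4, 'f': 5, 'g': 7, 'a': 9, 'b': 11}
    name = note.strip().lower()
    if len(name) < 2 or name[0] not in semitones:
        raise ValueError(f'invalid note: {note!r}')
    index = semitones[name[0]]
    rest = name[1:]
    if rest[0] == '#':        # sharp raises by one semitone
        index += 1
        rest = rest[1:]
    elif rest[0] == 'b':      # flat lowers by one semitone
        index -= 1
        rest = rest[1:]
    octave = int(rest)
    # distance in semitones from a4 (index 9 in octave 4):
    distance = index - 9 + 12 * (octave - 4)
    return a4freq * 2.0 ** (distance / 12.0)

print(note_to_freq('a4'))
print(note_to_freq('c4'))
```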

See API documentation of the playaudio module for details.

Managing audio modules

Simply run in your terminal

> audiomodules

and you get something like

Status of audio packages on this machine:
-----------------------------------------

wave              is  installed (F)
ewave             not installed (F)
scipy.io.wavfile  is  installed (F)
soundfile         is  installed (F)
wavefile          not installed (F)
audioread         is  installed (F)
pydub             is  installed (F)
pyaudio           not installed (D)
sounddevice       not installed (D)
simpleaudio       NOT installed (D)
soundcard         not installed (D)
ossaudiodev       is  installed (D)
winsound          not installed (D)

F: file I/O, D: audio device

For better performance you should install the following modules:

simpleaudio:
------------
The simpleaudio package is a lightweight package
for cross-platform audio playback.
For documentation see https://simpleaudio.readthedocs.io

First, install the following packages:

sudo apt install python3-dev libasound2-dev

Install the simpleaudio module with pip:

sudo pip install simpleaudio

Use this to see which audio modules you have already installed on your system, which ones are recommended to install, and how to install them.

See API documentation of the audiomodules module for details.

Alternatives

All the audio modules that AudioIO can use are alternatives, as are a few related packages:

Reading and writing audio files:

Metadata:

  • GUANO: Grand Unified Acoustic Notation Ontology, an extensible, open format for embedding metadata within bat acoustic recordings.

Playing sounds:

  • sounddevice: wrapper for portaudio.
  • PyAudio: wrapper for portaudio.
  • simpleaudio: uses ALSA on Linux, runs well on Windows.
  • SoundCard: playback via CFFI and the native audio libraries of Linux, Windows and macOS.
  • ossaudiodev: playback via the outdated OSS interface of the Python standard library.
  • winsound: native Windows audio playback of the Python standard library, asynchronous playback only with wave files.

Scientific audio software:

  • diapason: musical notes like playaudio.note2freq.
  • librosa: audio and music processing in Python.
  • TimeView: GUI application to view and analyze time series signal data.
  • scikit-maad: quantitative analysis of environmental audio recordings.
  • Soundscapy: analysing and visualising soundscape assessments.
  • BatDetect2: detecting and classifying bat echolocation calls in high frequency audio recordings.
  • Batogram: viewing bat call spectrograms with GUANO metadata, including the ability to click to open the location in Google Maps.