Module audioio.riffmetadata

Read and write metadata and marker lists of RIFF-based files.

Container files of the Resource Interchange File Format (RIFF), like WAVE files, may contain sections (called chunks) with metadata and markers in addition to the time series (audio) data and the necessary specifications of sampling rate, bit depth, etc.

Metadata

There are various types of chunks for storing metadata, like the INFO list, the broadcast-audio extension (BEXT) chunk, or iXML chunks. These chunks contain metadata as key-value pairs. Since WAVE files are primarily designed for music, valid keys in these chunks are restricted to topics from music and music production. Some keys are also useful for science, but more keys are needed. It is possible to extend the INFO list keys, but these keys are restricted to four characters, and the INFO list chunk does not allow for hierarchical metadata either. The other metadata chunks, in particular the BEXT chunk, cannot be extended. Consequently, not all types of metadata can be stored in standard chunks.

GUANO (Grand Unified Acoustic Notation Ontology), primarily designed for bat acoustic recordings, has some standard ontologies that are of much more interest in scientific contexts. In addition, GUANO allows for extensions with arbitrary nested keys and string-encoded values. In that respect it is a well-defined and easy-to-handle serialization of the odML data model. We use GUANO to write all metadata that do not fit into the INFO, BEXT or iXML chunks into a WAVE file.

To interface the various ways to store and read metadata of RIFF files, the riffmetadata module simply uses nested dictionaries. The keys are always strings. Values are strings or integers for key-value pairs. Value strings can also be numbers followed by a unit. Values can also be dictionaries for defining subsections of key-value pairs. The dictionaries can be nested to arbitrary depth.
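
As an illustration of this layout, a metadata dictionary might look as follows. This is only a sketch: all section and key names are made up for the example and are not prescribed by the module, and the depth() helper is hypothetical.

```python
# Hypothetical metadata; all section and key names are examples only.
metadata = {
    'INFO': {                      # flat subsection of key-value pairs
        'Artist': 'underscore_',
        'Title': 'test recording',
    },
    'Recording': {                 # arbitrary nested section
        'Temperature': '24.5C',    # number followed by a unit, as a string
        'Channels': 2,             # integer value
        'Amplifier': {             # sections can be nested further
            'Gain': '20dB',
        },
    },
}

def depth(md):
    """Return the nesting depth of a metadata dictionary (hypothetical helper)."""
    subsections = (depth(v) for v in md.values() if isinstance(v, dict))
    return 1 + max(subsections, default=0)

print(depth(metadata))
```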

The write_wave() function first tries to write an INFO list chunk. It checks for a key "INFO" with a flat dictionary of key-value pairs. It then translates all keys of this dictionary using the info_tags mapping. If all the resulting keys have no more than four characters and there are no subsections, then an INFO list chunk is written. If no "INFO" key exists, then all elements of the provided metadata are checked with the same procedure for being valid INFO tags, and on success an INFO list chunk is written. Then, in similar ways, write_wave() tries to assemble valid BEXT and iXML chunks, based on the tags in bext_tags and ixml_tags. All remaining metadata are then stored in a GUANO chunk.
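
The INFO check described above can be sketched roughly as follows. This is not the module's code: the function translate_info_keys() and the three-entry mapping example_info_tags are hypothetical stand-ins, with the mapping oriented like info_tags (four-character tag as key, readable name as value).

```python
# Illustrative excerpt only; the module's info_tags mapping is much larger.
example_info_tags = {'IART': 'Artist', 'INAM': 'Title', 'ICMT': 'Comment'}

def translate_info_keys(section, tags=example_info_tags):
    """Translate readable names back to INFO tags.

    Returns the translated flat dictionary, or None if the section
    does not fit into an INFO list chunk.
    """
    names_to_tags = {name: tag for tag, name in tags.items()}
    translated = {}
    for key, value in section.items():
        if isinstance(value, dict):
            return None                    # subsections are not allowed
        tag = names_to_tags.get(key, key)  # unknown keys are kept as they are
        if len(tag) > 4:
            return None                    # key too long for an INFO tag
        translated[tag] = value
    return translated

print(translate_info_keys({'Artist': 'underscore_', 'Title': 'test'}))
print(translate_info_keys({'Artist': 'underscore_', 'Location': 'lab'}))
```

In this sketch the second call returns None because 'Location' has no four-character translation, so such metadata would have to go into another chunk.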

When reading metadata from a RIFF file, INFO, BEXT and iXML chunks are returned as subsections with the respective keys. Metadata from a GUANO chunk are stored directly in the metadata dictionary without marking them as GUANO.

Markers

A number of different chunk types exist for handling markers or cues that mark specific events or regions in the audio data. In the end, each marker has a position, a span, a label, and a text. Position and span are handled with 1-D or 2-D arrays of ints, where each row is a marker and the columns are position and span. The span column is optional. Labels and texts come in another 1-D or 2-D array of objects pointing to strings. Again, rows are the markers, the first column holds the labels, and the second column holds the optional texts. Try to keep the labels short, and use texts for longer descriptions, if necessary.
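
A minimal sketch of such marker arrays, using NumPy. The positions, spans, labels and texts are invented for the example.

```python
import numpy as np

# Each row is a marker: position and span, both in frames.
locs = np.array([[1200, 0],          # a point marker without span
                 [48000, 9600]],     # a region spanning 9600 frames
                dtype=int)

# Each row is a marker: short label and optional longer text.
labels = np.array([['click', ''],
                   ['song', 'long description of the song']],
                  dtype=object)

for (pos, span), (label, text) in zip(locs, labels):
    print(f'{label:<6s} position={int(pos):6d} span={int(span):5d} {text}')
```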

Read metadata and markers

Write data, metadata and markers

Helper functions for reading RIFF and WAVE files

Helper functions for writing RIFF and WAVE files

Demo

  • demo(): print metadata and marker list of RIFF/WAVE file.
  • main(): call demo with command line arguments.

Descriptions of the RIFF/WAVE file format

For INFO tag names see https://exiftool.org/TagNames/RIFF.html#Info

Global variables

var info_tags

Dictionary with known tags of the INFO chunk as keys and their description as value.

See https://exiftool.org/TagNames/RIFF.html#Info for valid info tags.

var bext_tags

Dictionary with tags of the BEXT chunk as keys and their size in bytes as values.

See https://tech.ebu.ch/docs/tech/tech3285.pdf

var ixml_tags

List with valid tags of the iXML chunk.

See http://www.gallery.co.uk/ixml/

Functions

def read_riff_header(sf, tag=None)

Read and check the RIFF file header.

Parameters

sf : stream
File stream of RIFF/WAVE file.
tag : None or str
If supplied, check whether it matches the subchunk tag. If it does not match, raise a ValueError.

Returns

filesize : int
Size of the RIFF file in bytes.

Raises

ValueError
Not a RIFF file or subchunk tag does not match tag.
def skip_chunk(sf)

Skip over unknown RIFF chunk.

Parameters

sf : stream
File stream of RIFF file.

Returns

size : int
The size of the skipped chunk in bytes.
def read_chunk_tags(filepath)

Read tags of all chunks contained in a RIFF file.

Parameters

filepath : string or file handle
The RIFF file.

Returns

tags : dict
Keys are the tag names of the chunks found in the file. If the chunk is a list chunk, then the list type is added with a dash to the key, i.e. "LIST-INFO". Values are tuples with the corresponding file positions of the data of the chunk (after the tag and the chunk size field) and the size of the chunk data. The file position of the next chunk is thus the position of the chunk plus the size of its data.

Raises

ValueError
Not a RIFF file.
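
The layout of the returned dictionary can be illustrated with a small scanner over an in-memory RIFF file. This is a sketch under assumptions, not the implementation of read_chunk_tags(): the names scan_chunks() and chunk() are hypothetical, tags are kept exactly as stored, and the scanner additionally skips the pad byte that RIFF appends to chunks of odd size.

```python
import io
import struct

def scan_chunks(sf):
    """Return {tag: (data_position, data_size)} for all top-level chunks."""
    riff, filesize, kind = struct.unpack('<4sI4s', sf.read(12))
    if riff != b'RIFF':
        raise ValueError('not a RIFF file')
    tags = {}
    while True:
        header = sf.read(8)
        if len(header) < 8:
            break                          # end of file reached
        tag, size = struct.unpack('<4sI', header)
        name = tag.decode('ascii')
        pos = sf.tell()                    # position of the chunk data
        if name == 'LIST':
            name += '-' + sf.read(4).decode('ascii')
        tags[name] = (pos, size)
        sf.seek(pos + size + size % 2)     # RIFF pads chunks to even length
    return tags

def chunk(tag, payload):
    """Assemble a chunk from tag, size field, payload, and pad byte."""
    pad = b'\x00' * (len(payload) % 2)
    return tag + struct.pack('<I', len(payload)) + payload + pad

# Build a tiny RIFF file with a data chunk and an INFO list chunk.
body = b'WAVE' + chunk(b'data', b'\x00\x01\x02')
body += chunk(b'LIST', b'INFO' + chunk(b'IART', b'me\x00\x00'))
riff = b'RIFF' + struct.pack('<I', len(body)) + body

tags = scan_chunks(io.BytesIO(riff))
print(tags)
```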
def read_format_chunk(sf)

Read format chunk.

Parameters

sf : stream
File stream for reading FMT chunk.

Returns

channels : int
Number of channels.
rate : float
Sampling rate (frames per time) in Hertz.
bits : int
Bit resolution.
def read_info_chunks(sf, store_empty)

Read in metadata from INFO list chunk.

The variable info_tags is used to map the 4 character tags to human readable key names.

See https://exiftool.org/TagNames/RIFF.html#Info for valid info tags.

Parameters

sf : stream
File stream of RIFF file.
store_empty : bool
If False, do not add metadata with empty values.

Returns

metadata : dict
Dictionary with key-value pairs of info tags.
def read_bext_chunk(sf, store_empty=True)

Read in metadata from the broadcast-audio extension chunk.

The variable bext_tags lists all valid BEXT fields and their size.

See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.

Parameters

sf : stream
File stream of RIFF file.
store_empty : bool
If False, do not add metadata with empty values.

Returns

meta_data : dict

The metadata of a BEXT chunk are stored in a flat dictionary with the following keys:

  • 'Description': a free description of the sequence.
  • 'Originator': name of the originator/producer of the audio file.
  • 'OriginatorReference': unambiguous reference allocated by the originating organisation.
  • 'OriginationDate': date of creation of audio sequence in yyyy:mm:dd.
  • 'OriginationTime': time of creation of audio sequence in hh:mm:ss.
  • 'TimeReference': first sample since midnight.
  • 'Version': version of the BWF.
  • 'UMID': unique material identifier.
  • 'LoudnessValue': integrated loudness value.
  • 'LoudnessRange': loudness range.
  • 'MaxTruePeakLevel': maximum true peak value in dBTP.
  • 'MaxMomentaryLoudness': highest value of the momentary loudness level.
  • 'MaxShortTermLoudness': highest value of the short-term loudness level.
  • 'Reserved': 180 bytes reserved for extension.
  • 'CodingHistory': description of coding processes applied to the audio data, with comma-separated subfields: "A=" coding algorithm, e.g. PCM, "F=" sampling rate in Hertz, "B=" bit-rate for MPEG files, "W=" word length in bits, "M=" mono, stereo, dual-mono, joint-stereo, "T=" free text.
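
The subfields of a 'CodingHistory' string as described above could be split as in the following sketch. The function parse_coding_history() is a hypothetical helper and not part of the module, and the example string is invented.

```python
def parse_coding_history(history):
    """Split one line of comma-separated "X=value" subfields into a dict."""
    fields = {}
    for item in history.strip().split(','):
        key, sep, value = item.partition('=')
        if sep:                            # ignore items without '='
            fields[key.strip()] = value.strip()
    return fields

print(parse_coding_history('A=PCM,F=48000,W=16,M=stereo,T=original'))
```

Note that this simple split assumes that the free text after "T=" does not itself contain commas.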
def read_ixml_chunk(sf, store_empty=True)

Read in metadata from an IXML chunk.

See the variable ixml_tags for a list of valid tags.

See http://www.gallery.co.uk/ixml/ for the specification of iXML.

Parameters

sf : stream
File stream of RIFF file.
store_empty : bool
If False, do not add metadata with empty values.

Returns

metadata : nested dict
Dictionary with key-value pairs.
def read_guano_chunk(sf)

Read in metadata from a GUANO chunk.

GUANO is the Grand Unified Acoustic Notation Ontology, an extensible, open format for embedding metadata within bat acoustic recordings. See https://github.com/riggsd/guano-spec for details.

The GUANO specification allows for the inclusion of arbitrary nested keys and string encoded values. In that respect it is a well defined and easy to handle serialization of the odML data model.

Parameters

sf : stream
File stream of RIFF file.

Returns

metadata : nested dict
Dictionary with key-value pairs.
def read_cue_chunk(sf)

Read in marker positions from cue chunk.

See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file

Parameters

sf : stream
File stream of RIFF file.

Returns

locs : 2-D array of ints
Each row is a marker with unique identifier in the first column, position in the second column, and span in the third column. The cue chunk does not encode spans, so the third column is initialized with zeros.
def read_playlist_chunk(sf, locs)

Read in marker spans from playlist chunk.

See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file

Parameters

sf : stream
File stream of RIFF file.
locs : 2-D array of ints
Markers as returned by the read_cue_chunk() function. Each row is a marker with unique identifier in the first column, position in the second column, and span in the third column. The span is read in from the playlist chunk.
def read_adtl_chunks(sf, locs, labels)

Read in associated data list chunks.

See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file

Parameters

sf : stream
File stream of RIFF file.
locs : 2-D array of ints
Markers as returned by the read_cue_chunk() function. Each row is a marker with unique identifier in the first column, position in the second column, and span in the third column. The span is read in from the LTXT chunk.
labels : 2-D array of string objects
Labels (first column) and texts (second column) for each marker (rows) from previous LABL, NOTE, and LTXT chunks.

Returns

labels : 2-D array of string objects
Labels (first column) and texts (second column) for each marker (rows) from LABL, NOTE (first column), and LTXT chunks (last column).
def read_lbl_chunk(sf, rate)

Read in marker positions, spans, labels, and texts from lbl chunk.

The proprietary LBL chunk is specific to wave files generated by AviSoft products.

The labels (first column of labels) have special meanings. Markers with a span (a section label in the terminology of AviSoft) can be arranged in three levels when displayed:

  • "M": layer 1, the top level section
  • "N": layer 2, sections below layer 1
  • "O": layer 3, sections below layer 2
  • "P": total, section start and end are displayed with two vertical lines.

All other labels mark single point labels with a time and a frequency (that we here discard). See also https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm

Parameters

sf : stream
File stream of RIFF file.
rate : float
Sampling rate of the data in Hertz.

Returns

locs : 2-D array of ints
Each row is a marker with unique identifier (simply integers enumerating the markers) in the first column, position in the second column, and span in the third column.
labels : 2-D array of string objects
Labels (first column) and texts (second column) for each marker (rows).
def metadata_riff(filepath, store_empty=False)

Read metadata from a RIFF/WAVE file.

Parameters

filepath : string or file handle
The RIFF file.
store_empty : bool
If False, do not add metadata with empty values.

Returns

meta_data : nested dict
Metadata contained in the RIFF file. Keys of the nested dictionaries are always strings. If the corresponding values are dictionaries, then the key is the section name of the metadata contained in the dictionary. All other types of values are values for the respective key. In particular they are strings or lists of strings. But other simple types like ints or floats are also allowed. The first level contains sections of metadata (e.g. keys 'INFO', 'BEXT', 'IXML', whose values are dictionaries).

Raises

ValueError
Not a RIFF file.

Examples

from audioio.riffmetadata import metadata_riff
from audioio import print_metadata

md = metadata_riff('audio/file.wav')
print_metadata(md)
def markers_riff(filepath)

Read markers from a RIFF/WAVE file.

Parameters

filepath : string or file handle
The RIFF file.

Returns

locs : 2-D array of ints
Marker positions (first column) and spans (second column) for each marker (rows).
labels : 2-D array of string objects
Labels (first column) and texts (second column) for each marker (rows).

Raises

ValueError
Not a RIFF file.

Examples

from audioio.riffmetadata import markers_riff
from audioio import print_markers

locs, labels = markers_riff('audio/file.wav')
print_markers(locs, labels)
def write_riff_chunk(df, filesize=0, tag='WAVE')

Write RIFF file header.

Parameters

df : stream
File stream for writing RIFF file header.
filesize : int
Size of the file in bytes.
tag : str
The type of RIFF file. Default is a wave file. Exactly 4 characters long.

Returns

n : int
Number of bytes written to the stream.

Raises

ValueError
tag is not 4 characters long.
def write_filesize(df, filesize=None)

Write the file size into the RIFF file header.

Parameters

df : stream
File stream into which to write filesize.
filesize : int
Size of the file in bytes. If not specified or 0, then use current size of the file.
def write_chunk_name(df, pos, tag)

Change the name of a chunk.

Use this to have the content of an existing chunk ignored by overwriting its name with an unknown one.

Parameters

df : stream
File stream.
pos : int
Position of the chunk in the file stream.
tag : str
The new name of the chunk. Exactly 4 characters long.

Raises

ValueError
tag is not 4 characters long.
def write_format_chunk(df, channels, frames, rate, bits=16)

Write format chunk.

Parameters

df : stream
File stream for writing FMT chunk.
channels : int
Number of channels contained in the data.
frames : int
Number of frames contained in the data.
rate : int or float
Sampling rate (frames per time) in Hertz.
bits : 16 or 32
Bit resolution of the data to be written.

Returns

n : int
Number of bytes written to the stream.
def write_data_chunk(df, data, bits=16)

Write data chunk.

Parameters

df : stream
File stream for writing data chunk.
data : 1-D or 2-D array of floats
Data with time (frames) on the first axis and channels on the optional second axis, with values between -1 and 1.
bits : 16 or 32
Bit resolution of the data to be written.

Returns

n : int
Number of bytes written to the stream.
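
How the float values between -1 and 1 are scaled to integers is not specified here. One common convention for 16-bit PCM is sketched below; the function floats_to_pcm16() and the scaling factor are assumptions for illustration, not a description of what write_data_chunk() does.

```python
import numpy as np

def floats_to_pcm16(data):
    """Convert floats in [-1, 1] to little-endian 16-bit integers.

    Assumption: full scale is mapped to +/-32767 and values outside
    the valid range are clipped.
    """
    clipped = np.clip(np.asarray(data, dtype=float), -1.0, 1.0)
    return np.round(clipped * (2**15 - 1)).astype('<i2')

# Two frames (rows) with two channels (columns).
frames = floats_to_pcm16([[0.0, 1.0], [-1.0, 2.0]])
raw = frames.tobytes()    # bytes as they would go into a data chunk
print(frames)
print(len(raw))
```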
def write_info_chunk(df, metadata)

Write metadata to LIST INFO chunk.

If metadata contains an 'INFO' key, then write the flat dictionary of this key as an INFO chunk. Otherwise, attempt to write all metadata items as an INFO chunk. The keys are translated via the info_tags variable back to INFO tags. If after translation any key is left that is longer than 4 characters or any key has a dictionary as a value (non-flat metadata), the INFO chunk is not written.

See https://exiftool.org/TagNames/RIFF.html#Info for valid info tags.

Parameters

df : stream
File stream for writing INFO chunk.
metadata : nested dict
Metadata as key-value pairs. Values can be strings, integers, or dictionaries.

Returns

n : int
Number of bytes written to the stream.
keys_written : list of str
Keys written to the INFO chunk.
def write_bext_chunk(df, metadata)

Write metadata to BEXT chunk.

If metadata contains a BEXT key, and this contains valid BEXT tags (one of the keys listed in the variable bext_tags), then write the dictionary of that key as a broadcast-audio extension chunk.

See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.

Parameters

df : stream
File stream for writing BEXT chunk.
metadata : nested dict
Metadata as key-value pairs. Values can be strings, integers, or dictionaries.

Returns

n : int
Number of bytes written to the stream.
keys_written : list of str
Keys written to the BEXT chunk.
def write_ixml_chunk(df, metadata, keys_written=None)

Write metadata to iXML chunk.

If metadata contains an IXML key with valid IXML tags (one of those listed in the variable ixml_tags), or the remaining tags in metadata are valid IXML tags, then write an IXML chunk.

See http://www.gallery.co.uk/ixml/ for the specification of iXML.

Parameters

df : stream
File stream for writing IXML chunk.
metadata : nested dict
Metadata as key-value pairs. Values can be strings, integers, or dictionaries.
keys_written : list of str
Keys that have already been written to the INFO or BEXT chunk.

Returns

n : int
Number of bytes written to the stream.
keys_written : list of str
Keys written to the IXML chunk.
def write_guano_chunk(df, metadata, keys_written=None)

Write metadata to GUANO chunk.

GUANO is the Grand Unified Acoustic Notation Ontology, an extensible, open format for embedding metadata within bat acoustic recordings. See https://github.com/riggsd/guano-spec for details.

The GUANO specification allows for the inclusion of arbitrary nested keys and string encoded values. In that respect it is a well defined and easy to handle serialization of the odML data model.

This will write all metadata that are not in keys_written.

Parameters

df : stream
File stream for writing guano chunk.
metadata : nested dict
Metadata as key-value pairs. Values can be strings, integers, or dictionaries.
keys_written : list of str
Keys that have already been written to the INFO, BEXT, or IXML chunk.

Returns

n : int
Number of bytes written to the stream.
keys_written : list of str
Top-level keys written to the GUANO chunk.
def write_cue_chunk(df, locs)

Write marker positions to cue chunk.

See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file

Parameters

df : stream
File stream for writing cue chunk.
locs : None or 2-D array of ints
Positions (first column) and spans (optional second column) for each marker (rows).

Returns

n : int
Number of bytes written to the stream.
def write_playlist_chunk(df, locs)

Write marker spans to playlist chunk.

See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file

Parameters

df : stream
File stream for writing playlist chunk.
locs : None or 2-D array of ints
Positions (first column) and spans (optional second column) for each marker (rows).

Returns

n : int
Number of bytes written to the stream.
def write_adtl_chunks(df, locs, labels)

Write associated data list chunks.

See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file

Parameters

df : stream
File stream for writing adtl chunk.
locs : None or 2-D array of ints
Positions (first column) and spans (optional second column) for each marker (rows).
labels : None or 2-D array of string objects
Labels (first column) and texts (second column) for each marker (rows).

Returns

n : int
Number of bytes written to the stream.
def write_lbl_chunk(df, locs, labels, rate)

Write marker positions, spans, labels, and texts to lbl chunk.

The proprietary LBL chunk is specific to wave files generated by AviSoft products.

The labels (first column of labels) have special meanings. Markers with a span (a section label in the terminology of AviSoft) can be arranged in three levels when displayed:

  • "M": layer 1, the top level section
  • "N": layer 2, sections below layer 1
  • "O": layer 3, sections below layer 2
  • "P": total, section start and end are displayed with two vertical lines.

All other labels mark single point labels with a time and a frequency (that we here discard). See also https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm

If a marker has a span, and its label is not one of "M", "N", "O", or "P", then its label is set to "M". If a marker has no span, and its label is one of "M", "N", "O", or "P", then its label is set to "a".

Parameters

df : stream
File stream for writing lbl chunk.
locs : None or 2-D array of ints
Positions (first column) and spans (optional second column) for each marker (rows).
labels : None or 2-D array of string objects
Labels (first column) and texts (second column) for each marker (rows).
rate : float
Sampling rate of the data in Hertz.

Returns

n : int
Number of bytes written to the stream.
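
The label adjustment rule stated above for write_lbl_chunk() can be written down as a small function. This is only a sketch of the stated rule; normalize_lbl_label() is a hypothetical name and not part of the module.

```python
SECTION_LABELS = ('M', 'N', 'O', 'P')

def normalize_lbl_label(label, span):
    """Return the label that fits a marker with the given span."""
    if span > 0 and label not in SECTION_LABELS:
        return 'M'     # markers with a span need a section label
    if span <= 0 and label in SECTION_LABELS:
        return 'a'     # point markers must not use a section label
    return label

for label, span in [('song', 9600), ('M', 0), ('N', 500), ('x', 0)]:
    print(label, span, '->', normalize_lbl_label(label, span))
```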
def append_metadata_riff(df, metadata)

Append metadata chunks to RIFF file.

You still need to update the filesize by calling write_filesize().

Parameters

df : stream
File stream for writing metadata chunks.
metadata : None or nested dict
Metadata as key-value pairs. Values can be strings, integers, or dictionaries.

Returns

n : int
Number of bytes written to the stream.
tags : list of str
Tag names of chunks written to audio file.
def append_markers_riff(df, locs, labels=None, rate=None, marker_hint='cue')

Append marker chunks to RIFF file.

You still need to update the filesize by calling write_filesize().

Parameters

df : stream
File stream for writing metadata chunks.
locs : None or 1-D or 2-D array of ints
Marker positions (first column) and spans (optional second column) for each marker (rows).
labels : None or 1-D or 2-D array of string objects
Labels (first column) and texts (optional second column) for each marker (rows).
rate : float
Sampling rate of the data in Hertz, needed for storing markers in seconds.
marker_hint : str
  • 'cue': store markers in cue and adtl chunks.
  • 'lbl': store markers in avisoft lbl chunk.

Returns

n : int
Number of bytes written to the stream.
tags : list of str
Tag names of chunks written to audio file.

Raises

ValueError
Encoding not supported.
IndexError
locs and labels differ in len.
def write_wave(filepath, data, rate, metadata=None, locs=None, labels=None, encoding=None, marker_hint='cue')

Write time series, metadata and markers to a WAVE file.

Only 16-bit or 32-bit PCM encoding is supported.

Parameters

filepath : string
Full path and name of the file to write.
data : 1-D or 2-D array of floats
Array with the data (first index time, second index channel, values within -1.0 and 1.0).
rate : float
Sampling rate of the data in Hertz.
metadata : None or nested dict
Metadata as key-value pairs. Values can be strings, integers, or dictionaries.
locs : None or 1-D or 2-D array of ints
Marker positions (first column) and spans (optional second column) for each marker (rows).
labels : None or 1-D or 2-D array of string objects
Labels (first column) and texts (optional second column) for each marker (rows).
encoding : string or None
Encoding of the data: 'PCM_32' or 'PCM_16'. If None or empty string use 'PCM_16'.

marker_hint : str
  • 'cue': store markers in cue and adtl chunks.
  • 'lbl': store markers in avisoft lbl chunk.

Raises

ValueError
Encoding not supported.
IndexError
locs and labels differ in len.

See Also

write_audio()

Examples

import numpy as np
from audioio.riffmetadata import write_wave

rate = 28000.0
freq = 800.0
time = np.arange(0.0, 1.0, 1/rate)       # one second
data = np.sin(2.0*np.pi*freq*time)       # 800Hz sine wave
md = dict(Artist='underscore_')          # metadata

write_wave('audio/file.wav', data, rate, md)
def append_riff(filepath, metadata=None, locs=None, labels=None, rate=None, marker_hint='cue')

Append metadata and markers to an existing RIFF file.

Parameters

filepath : string
Full path and name of the file to write.
metadata : None or nested dict
Metadata as key-value pairs. Values can be strings, integers, or dictionaries.
locs : None or 1-D or 2-D array of ints
Marker positions (first column) and spans (optional second column) for each marker (rows).
labels : None or 1-D or 2-D array of string objects
Labels (first column) and texts (optional second column) for each marker (rows).
rate : float
Sampling rate of the data in Hertz, needed for storing markers in seconds.
marker_hint : str
  • 'cue': store markers in cue and adtl chunks.
  • 'lbl': store markers in avisoft lbl chunk.

Returns

n : int
Number of bytes written to the stream.

Raises

IndexError
locs and labels differ in len.

Examples

import numpy as np
from audioio.riffmetadata import append_riff

md = dict(Artist='underscore_')    # metadata
append_riff('audio/file.wav', md)  # append them to existing audio file
def demo(filepath)

Print metadata and markers of a RIFF/WAVE file.

Parameters

filepath : string
Path of a RIFF/WAVE file.
def main(*args)

Call demo with command line arguments.

Parameters

args : list of strings
Command line arguments as returned by sys.argv[1:]