Coverage for src / audioio / riffmetadata.py: 97%
723 statements
coverage.py v7.13.1, created at 2026-01-17 21:34 +0000
"""Read and write metadata and marker lists of RIFF-based files.

Container files of the Resource Interchange File Format (RIFF) like
WAVE files may contain sections (called chunks) with metadata and
markers in addition to the timeseries (audio) data and the necessary
specifications of sampling rate, bit depth, etc.

## Metadata

There are various types of chunks for storing metadata, like the [INFO
list](https://www.recordingblogs.com/wiki/list-chunk-of-a-wave-file),
[broadcast-audio extension
(BEXT)](https://tech.ebu.ch/docs/tech/tech3285.pdf) chunk, or
[iXML](http://www.gallery.co.uk/ixml/) chunks. These chunks contain
metadata as key-value pairs. Since wave files are primarily designed
for music, valid keys in these chunks are restricted to topics from
music and music production. Some keys are also useful for science,
but more keys are needed. It is possible to extend the INFO
list keys, but these keys are restricted to four characters, and the
INFO list chunk does not allow for hierarchical metadata either. The
other metadata chunks, in particular the BEXT chunk, cannot be
extended. With standard chunks, not all types of metadata can be
stored.

The [GUANO (Grand Unified Acoustic Notation
Ontology)](https://github.com/riggsd/guano-spec), primarily designed
for bat acoustic recordings, has some standard ontologies that are of
much more interest in a scientific context. In addition, GUANO allows
for extensions with arbitrary nested keys and string-encoded values.
In that respect it is a well-defined and easy-to-handle serialization
of the [odML data model](https://doi.org/10.3389/fninf.2011.00016).
We use GUANO to write all metadata that do not fit into the INFO, BEXT
or iXML chunks into a WAVE file.

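As an illustration of how nested keys can be serialized as GUANO-style
text, here is a minimal sketch. It is not the module's
`write_guano_chunk()`; the helper `guano_lines()` and the example keys
are hypothetical, values are assumed to be single-line, and a real
GUANO chunk additionally needs the version field required by the
specification.

```python
def guano_lines(md, prefix=''):
    # Flatten a nested dictionary into "Section|Key: value" lines.
    lines = []
    for key, value in md.items():
        name = f'{prefix}|{key}' if prefix else key
        if isinstance(value, dict):
            # Subsections contribute their keys, prefixed by the section name.
            lines.extend(guano_lines(value, name))
        else:
            lines.append(f'{name}: {value}')
    return lines


# Hypothetical metadata, for illustration only.
metadata = {'Recording': {'Gain': '20dB', 'Electrode': {'Type': 'silver'}},
            'Species': 'unknown'}
lines = guano_lines(metadata)
```

Each line carries the full path of its key, so the nesting can be
restored by splitting the keys at the `|` character.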
To interface the various ways to store and read metadata of RIFF
files, the `riffmetadata` module simply uses nested dictionaries. The
keys are always strings. Values are strings or integers for key-value
pairs. Value strings can also be numbers followed by a unit. Values
can also be dictionaries for defining subsections of key-value
pairs. The dictionaries can be nested to arbitrary depth.

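A small sketch of such a nested dictionary may help. All keys and
values below are made up for illustration, and the `depth()` helper is
hypothetical and not part of the module.

```python
# Hypothetical metadata: string keys, string or integer values,
# and dictionaries as subsections.
metadata = {
    'INFO': {'Artist': 'J. Doe', 'Software': 'recorder 1.0'},
    'Recording': {
        'Temperature': '21.5C',      # a number followed by a unit
        'Gain': 20,
        'Electrode': {'Type': 'silver', 'Distance': '2cm'},
    },
}


def depth(md):
    # Nesting depth of a metadata dictionary (1 for a flat dictionary).
    sub = [depth(v) for v in md.values() if isinstance(v, dict)]
    return 1 + max(sub, default=0)


levels = depth(metadata)
```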
The `write_wave()` function first tries to write an INFO list
chunk. It checks for a key "INFO" with a flat dictionary of key-value
pairs. It then translates all keys of this dictionary using the
`info_tags` mapping. If all the resulting keys have no more than four
characters and there are no subsections, then an INFO list chunk is
written. If no "INFO" key exists, then with the same procedure all
elements of the provided metadata are checked for being valid INFO
tags, and on success an INFO list chunk is written. Then, in similar
ways, `write_wave()` tries to assemble valid BEXT and iXML chunks,
based on the tags in `bext_tags` and `ixml_tags`. All remaining
metadata are then stored in a GUANO chunk.

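The check for a valid INFO list described above could look roughly
like the following sketch. It is an assumption about the logic, not the
code used by `write_wave()`; the function `as_info_chunk()` is
hypothetical and `info_tags` is reduced to a three-entry excerpt.

```python
# Excerpt of the tag table; the module's `info_tags` has many more entries.
info_tags = dict(IART='Artist', INAM='Title', ISFT='Software')


def as_info_chunk(md):
    # Return md with keys translated to INFO tags,
    # or None if md cannot be stored in an INFO list chunk.
    names = {name: tag for tag, name in info_tags.items()}
    info = {}
    for key, value in md.items():
        if isinstance(value, dict):
            return None              # subsections are not allowed
        tag = names.get(key, key)    # translate known names to their tags
        if len(tag) > 4:
            return None              # INFO keys have at most four characters
        info[tag] = value
    return info


ok = as_info_chunk({'Artist': 'J. Doe', 'GAIN': '20dB'})
too_long = as_info_chunk({'Temperature': '21.5C'})
```

Here `ok` holds the translated keys, whereas `too_long` is `None`
because "Temperature" has no four-character tag.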
When reading metadata from a RIFF file, INFO, BEXT and iXML chunks are
returned as subsections with the respective keys. Metadata from a
GUANO chunk are stored directly in the metadata dictionary without
marking them as GUANO.

## Markers

A number of different chunk types exist for handling markers or cues
that mark specific events or regions in the audio data. In the end,
each marker has a position, a span, a label, and a text. Position
and span are handled with 1-D or 2-D arrays of ints, where each row is
a marker and the columns are position and span. The span column is
optional. Labels and texts come in another 1-D or 2-D array of objects
pointing to strings. Again, rows are the markers, the first column holds
the labels, and the second column the optional texts. Try to keep the
labels short, and use text for longer descriptions, if necessary.

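The two marker arrays might be set up as in this sketch. The marker
values are hypothetical; sorting by position mirrors what
`markers_riff()` does before returning.

```python
import numpy as np

# Three hypothetical markers: position and span, both in frames.
locs = np.array([[48000, 0],
                 [12000, 2400],
                 [96000, 0]], dtype=int)
# Label and optional text for each marker.
labels = np.array([['click', ''],
                   ['chirp', 'rising chirp with harmonics'],
                   ['click', '']], dtype=object)

# Sort both arrays by marker position so that rows stay aligned.
order = np.argsort(locs[:, 0])
locs = locs[order]
labels = labels[order]
```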
## Read metadata and markers

- `metadata_riff()`: read metadata from a RIFF/WAVE file.
- `markers_riff()`: read markers from a RIFF/WAVE file.

## Write data, metadata and markers

- `write_wave()`: write time series, metadata and markers to a WAVE file.
- `append_metadata_riff()`: append metadata chunks to RIFF file.
- `append_markers_riff()`: append marker chunks to RIFF file.
- `append_riff()`: append metadata and markers to an existing RIFF file.

## Helper functions for reading RIFF and WAVE files

- `read_chunk_tags()`: read tags of all chunks contained in a RIFF file.
- `read_riff_header()`: read and check the RIFF file header.
- `skip_chunk()`: skip over unknown RIFF chunk.
- `read_format_chunk()`: read format chunk.
- `read_info_chunks()`: read in metadata from info list chunk.
- `read_bext_chunk()`: read in metadata from the broadcast-audio extension chunk.
- `read_ixml_chunk()`: read in metadata from an iXML chunk.
- `read_guano_chunk()`: read in metadata from a GUANO chunk.
- `read_cue_chunk()`: read in marker positions from cue chunk.
- `read_playlist_chunk()`: read in marker spans from playlist chunk.
- `read_adtl_chunks()`: read in associated data list chunks.
- `read_lbl_chunk()`: read in marker positions, spans, labels, and texts from lbl chunk.

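All of these helpers rely on the same chunk layout: a four-character
tag, a little-endian 32-bit size, and the data padded to an even number
of bytes. The following sketch walks the chunks of a tiny in-memory
file using only the standard library. It does not call the module's
functions, and the `chunk()` helper is hypothetical.

```python
import io
import struct


def chunk(tag, data):
    # Pack a chunk: tag, size of the unpadded data, data padded to even length.
    return tag + struct.pack('<I', len(data)) + data + bytes(len(data) % 2)


body = b'WAVE' + chunk(b'fmt ', bytes(16)) + chunk(b'data', bytes(5))
riff = io.BytesIO(b'RIFF' + struct.pack('<I', len(body)) + body)

riff.seek(12)                        # skip 'RIFF', the size field, and 'WAVE'
tags = {}
while True:
    header = riff.read(8)
    if len(header) < 8:
        break                        # end of file reached
    tag, size = struct.unpack('<4sI', header)
    size += size % 2                 # account for the padding byte
    # Remember the position and the padded size of the chunk's data.
    tags[tag.decode('latin-1').upper()] = (riff.tell(), size)
    riff.seek(size, io.SEEK_CUR)
```

The resulting dictionary has the same shape as the one described for
`read_chunk_tags()`: tag names as keys, and tuples of data position and
data size as values.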
## Helper functions for writing RIFF and WAVE files

- `write_riff_chunk()`: write RIFF file header.
- `write_filesize()`: write the file size into the RIFF file header.
- `write_chunk_name()`: change the name of a chunk.
- `write_format_chunk()`: write format chunk.
- `write_data_chunk()`: write data chunk.
- `write_info_chunk()`: write metadata to LIST INFO chunk.
- `write_bext_chunk()`: write metadata to BEXT chunk.
- `write_ixml_chunk()`: write metadata to iXML chunk.
- `write_guano_chunk()`: write metadata to GUANO chunk.
- `write_cue_chunk()`: write marker positions to cue chunk.
- `write_playlist_chunk()`: write marker spans to playlist chunk.
- `write_adtl_chunks()`: write associated data list chunks.
- `write_lbl_chunk()`: write marker positions, spans, labels, and texts to lbl chunk.

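A common pattern when writing is to emit the RIFF header with a
placeholder size, append the chunks, and then patch the size field, as
`write_riff_chunk()` and `write_filesize()` do. A minimal sketch of
that pattern on an in-memory stream, with a made-up payload and without
calling the module's functions:

```python
import io
import struct

df = io.BytesIO()
# Header: 'RIFF', a size field that is not known yet, and the file type.
df.write(b'RIFF' + struct.pack('<I', 0) + b'WAVE')

payload = bytes(4)                   # four hypothetical data bytes
df.write(b'data' + struct.pack('<I', len(payload)) + payload)

# Patch the size field at byte 4:
# the total file size minus the 8 bytes of 'RIFF' and the size field itself.
filesize = df.seek(0, io.SEEK_END)
df.seek(4)
df.write(struct.pack('<I', filesize - 8))
df.seek(0, io.SEEK_END)              # continue appending at the end

stored = struct.unpack('<I', df.getvalue()[4:8])[0]
```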
## Demo

- `demo()`: print metadata and marker list of RIFF/WAVE file.
- `main()`: call demo with command line arguments.

## Descriptions of the RIFF/WAVE file format

- https://de.wikipedia.org/wiki/RIFF_WAVE
- http://www.piclist.com/techref/io/serial/midi/wave.html
- https://moddingwiki.shikadi.net/wiki/Resource_Interchange_File_Format_(RIFF)
- https://www.recordingblogs.com/wiki/wave-file-format
- http://fhein.users.ak.tu-berlin.de/Alias/Studio/ProTools/audio-formate/wav/overview.html
- http://www.gallery.co.uk/ixml/

For INFO tag names see:

- https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags

"""

import io
import os
import sys
import warnings
import struct
import numpy as np
import xml.etree.ElementTree as ET

from .audiometadata import flatten_metadata, unflatten_metadata, find_key


info_tags = dict(AGES='Rated',
                 CMNT='Comment',
                 CODE='EncodedBy',
                 COMM='Comments',
                 DIRC='Directory',
                 DISP='SoundSchemeTitle',
                 DTIM='DateTimeOriginal',
                 GENR='Genre',
                 IARL='ArchivalLocation',
                 IART='Artist',
                 IAS1='FirstLanguage',
                 IAS2='SecondLanguage',
                 IAS3='ThirdLanguage',
                 IAS4='FourthLanguage',
                 IAS5='FifthLanguage',
                 IAS6='SixthLanguage',
                 IAS7='SeventhLanguage',
                 IAS8='EighthLanguage',
                 IAS9='NinthLanguage',
                 IBSU='BaseURL',
                 ICAS='DefaultAudioStream',
                 ICDS='CostumeDesigner',
                 ICMS='Commissioned',
                 ICMT='Comment',
                 ICNM='Cinematographer',
                 ICNT='Country',
                 ICOP='Copyright',
                 ICRD='DateCreated',
                 ICRP='Cropped',
                 IDIM='Dimensions',
                 IDIT='DateTimeOriginal',
                 IDPI='DotsPerInch',
                 IDST='DistributedBy',
                 IEDT='EditedBy',
                 IENC='EncodedBy',
                 IENG='Engineer',
                 IGNR='Genre',
                 IKEY='Keywords',
                 ILGT='Lightness',
                 ILGU='LogoURL',
                 ILIU='LogoIconURL',
                 ILNG='Language',
                 IMBI='MoreInfoBannerImage',
                 IMBU='MoreInfoBannerURL',
                 IMED='Medium',
                 IMIT='MoreInfoText',
                 IMIU='MoreInfoURL',
                 IMUS='MusicBy',
                 INAM='Title',
                 IPDS='ProductionDesigner',
                 IPLT='NumColors',
                 IPRD='Product',
                 IPRO='ProducedBy',
                 IRIP='RippedBy',
                 IRTD='Rating',
                 ISBJ='Subject',
                 ISFT='Software',
                 ISGN='SecondaryGenre',
                 ISHP='Sharpness',
                 ISMP='TimeCode',
                 ISRC='Source',
                 ISRF='SourceFrom',
                 ISTD='ProductionStudio',
                 ISTR='Starring',
                 ITCH='Technician',
                 ITRK='TrackNumber',
                 IWMU='WatermarkURL',
                 IWRI='WrittenBy',
                 LANG='Language',
                 LOCA='Location',
                 PRT1='Part',
                 PRT2='NumberOfParts',
                 RATE='Rate',
                 STAR='Starring',
                 STAT='Statistics',
                 TAPE='TapeName',
                 TCDO='EndTimecode',
                 TCOD='StartTimecode',
                 TITL='Title',
                 TLEN='Length',
                 TORG='Organization',
                 TRCK='TrackNumber',
                 TURL='URL',
                 TVER='Version',
                 VMAJ='VegasVersionMajor',
                 VMIN='VegasVersionMinor',
                 YEAR='Year',
                 # extensions from
                 # [TeeRec](https://github.com/janscience/TeeRec/):
                 BITS='Bits',
                 PINS='Pins',
                 AVRG='Averaging',
                 CNVS='ConversionSpeed',
                 SMPS='SamplingSpeed',
                 VREF='ReferenceVoltage',
                 GAIN='Gain',
                 UWRP='UnwrapThreshold',
                 UWPC='UnwrapClippedAmplitude',
                 IBRD='uCBoard',
                 IMAC='MACAdress',
                 CPUF='CPU frequency')
"""Dictionary with known tags of the INFO chunk as keys and their description as value.

See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags
"""

bext_tags = dict(
    Description=256,
    Originator=32,
    OriginatorReference=32,
    OriginationDate=10,
    OriginationTime=8,
    TimeReference=8,
    Version=2,
    UMID=64,
    LoudnessValue=2,
    LoudnessRange=2,
    MaxTruePeakLevel=2,
    MaxMomentaryLoudness=2,
    MaxShortTermLoudness=2,
    Reserved=180,
    CodingHistory=0)
"""Dictionary with tags of the BEXT chunk as keys and their size in bytes as values.

See https://tech.ebu.ch/docs/tech/tech3285.pdf

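The field sizes listed in `bext_tags` add up to a fixed part of 602
bytes, followed by the variable-length coding history. As a sketch,
this fixed part can be expressed as a single `struct` format. The
format string and the field values below are illustrative assumptions
and are not used by the module itself.

```python
import struct

# Fixed-size part of the chunk; '<' selects little-endian without padding.
BEXT_FORMAT = '<256s32s32s10s8sQH64s5h180s'


def text(field):
    # Keep the bytes before the first null byte and decode them.
    return field.partition(bytes(1))[0].decode('ascii')


# Hypothetical field values; struct pads short strings with null bytes.
fixed = struct.pack(BEXT_FORMAT, b'test recording', b'lab', b'ref-1',
                    b'2024-01-31', b'12:00:00', 48000, 2, b'',
                    0, 0, 0, 0, 0, b'')
fields = struct.unpack(BEXT_FORMAT, fixed)
description = text(fields[0])
time_reference, version = fields[5], fields[6]
```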
"""

ixml_tags = [
    'BWFXML',
    'IXML_VERSION',
    'PROJECT',
    'SCENE',
    'TAPE',
    'TAKE',
    'TAKE_TYPE',
    'NO_GOOD',
    'FALSE_START',
    'WILD_TRACK',
    'CIRCLED',
    'FILE_UID',
    'UBITS',
    'NOTE',
    'SYNC_POINT_LIST',
    'SYNC_POINT_COUNT',
    'SYNC_POINT',
    'SYNC_POINT_TYPE',
    'SYNC_POINT_FUNCTION',
    'SYNC_POINT_COMMENT',
    'SYNC_POINT_LOW',
    'SYNC_POINT_HIGH',
    'SYNC_POINT_EVENT_DURATION',
    'SPEED',
    'MASTER_SPEED',
    'CURRENT_SPEED',
    'TIMECODE_RATE',
    'TIMECODE_FLAGS',
    'FILE_SAMPLE_RATE',
    'AUDIO_BIT_DEPTH',
    'DIGITIZER_SAMPLE_RATE',
    'TIMESTAMP_SAMPLES_SINCE_MIDNIGHT_HI',
    'TIMESTAMP_SAMPLES_SINCE_MIDNIGHT_LO',
    'TIMESTAMP_SAMPLE_RATE',
    'LOUDNESS',
    'LOUDNESS_VALUE',
    'LOUDNESS_RANGE',
    'MAX_TRUE_PEAK_LEVEL',
    'MAX_MOMENTARY_LOUDNESS',
    'MAX_SHORT_TERM_LOUDNESS',
    'HISTORY',
    'ORIGINAL_FILENAME',
    'PARENT_FILENAME',
    'PARENT_UID',
    'FILE_SET',
    'TOTAL_FILES',
    'FAMILY_UID',
    'FAMILY_NAME',
    'FILE_SET_INDEX',
    'TRACK_LIST',
    'TRACK_COUNT',
    'TRACK',
    'CHANNEL_INDEX',
    'INTERLEAVE_INDEX',
    'NAME',
    'FUNCTION',
    'PRE_RECORD_SAMPLECOUNT',
    'BEXT',
    'BWF_DESCRIPTION',
    'BWF_ORIGINATOR',
    'BWF_ORIGINATOR_REFERENCE',
    'BWF_ORIGINATION_DATE',
    'BWF_ORIGINATION_TIME',
    'BWF_TIME_REFERENCE_LOW',
    'BWF_TIME_REFERENCE_HIGH',
    'BWF_VERSION',
    'BWF_UMID',
    'BWF_RESERVED',
    'BWF_CODING_HISTORY',
    'BWF_LOUDNESS_VALUE',
    'BWF_LOUDNESS_RANGE',
    'BWF_MAX_TRUE_PEAK_LEVEL',
    'BWF_MAX_MOMENTARY_LOUDNESS',
    'BWF_MAX_SHORT_TERM_LOUDNESS',
    'USER',
    'FULL_TITLE',
    'DIRECTOR_NAME',
    'PRODUCTION_NAME',
    'PRODUCTION_ADDRESS',
    'PRODUCTION_EMAIL',
    'PRODUCTION_PHONE',
    'PRODUCTION_NOTE',
    'SOUND_MIXER_NAME',
    'SOUND_MIXER_ADDRESS',
    'SOUND_MIXER_EMAIL',
    'SOUND_MIXER_PHONE',
    'SOUND_MIXER_NOTE',
    'AUDIO_RECORDER_MODEL',
    'AUDIO_RECORDER_SERIAL_NUMBER',
    'AUDIO_RECORDER_FIRMWARE',
    'LOCATION',
    'LOCATION_NAME',
    'LOCATION_GPS',
    'LOCATION_ALTITUDE',
    'LOCATION_TYPE',
    'LOCATION_TIME',
    ]
"""List with valid tags of the iXML chunk.

See http://www.gallery.co.uk/ixml/
"""


# Read RIFF/WAVE files:

def read_riff_header(sf, tag=None):
    """Read and check the RIFF file header.

    Parameters
    ----------
    sf: stream
        File stream of RIFF/WAVE file.
    tag: None or str
        If supplied, check whether it matches the subchunk tag.
        If it does not match, raise a ValueError.

    Returns
    -------
    filesize: int
        Size of the RIFF file in bytes.

    Raises
    ------
    ValueError
        Not a RIFF file or subchunk tag does not match `tag`.
    """
    riffs = sf.read(4).decode('latin-1')
    if riffs != 'RIFF':
        raise ValueError('Not a RIFF file.')
    fsize = struct.unpack('<I', sf.read(4))[0] + 8
    subtag = sf.read(4).decode('latin-1')
    if tag is not None and subtag != tag:
        raise ValueError(f'Not a {tag} file.')
    return fsize


def skip_chunk(sf):
    """Skip over unknown RIFF chunk.

    Parameters
    ----------
    sf: stream
        File stream of RIFF file.

    Returns
    -------
    size: int
        The size of the skipped chunk in bytes.
    """
    size = struct.unpack('<I', sf.read(4))[0]
    size += size % 2
    sf.seek(size, os.SEEK_CUR)
    return size


def read_chunk_tags(filepath):
    """Read tags of all chunks contained in a RIFF file.

    Parameters
    ----------
    filepath: string or Path or file handle
        The RIFF file.

    Returns
    -------
    tags: dict
        Keys are the tag names of the chunks found in the file. If the
        chunk is a list chunk, then the list type is added with a dash
        to the key, e.g. "LIST-INFO". Values are tuples with the
        corresponding file positions of the data of the chunk (after
        the tag and the chunk size field) and the size of the chunk
        data. The file position of the next chunk is thus the position
        of the chunk plus the size of its data. Advance another 8 bytes
        to get to the data of the next chunk.
        The total file size is the sum of the chunk sizes of each tag
        incremented by eight plus another 12 bytes of the RIFF header.

    Raises
    ------
    ValueError
        Not a RIFF file.

    """
    tags = {}
    sf = filepath
    file_pos = None
    if hasattr(filepath, 'read'):
        file_pos = sf.tell()
        sf.seek(0, os.SEEK_SET)
    else:
        sf = open(filepath, 'rb')
    fsize = read_riff_header(sf)
    while (sf.tell() < fsize - 8):
        chunk = sf.read(4).decode('latin-1').upper()
        size = struct.unpack('<I', sf.read(4))[0]
        size += size % 2
        fp = sf.tell()
        if chunk == 'LIST':
            subchunk = sf.read(4).decode('latin-1').upper()
            tags[chunk + '-' + subchunk] = (fp, size)
            size -= 4
        else:
            tags[chunk] = (fp, size)
        sf.seek(size, os.SEEK_CUR)
    if file_pos is None:
        sf.close()
    else:
        sf.seek(file_pos, os.SEEK_SET)
    return tags


def read_format_chunk(sf):
    """Read format chunk.

    Parameters
    ----------
    sf: stream
        File stream for reading FMT chunk at the position of the chunk's size field.

    Returns
    -------
    channels: int
        Number of channels.
    rate: float
        Sampling rate (frames per time) in Hertz.
    bits: int
        Bit resolution.
    """
    size = struct.unpack('<I', sf.read(4))[0]
    size += size % 2
    ccode, channels, rate, byterate, blockalign, bits = struct.unpack('<HHIIHH', sf.read(16))
    if size > 16:
        sf.read(size - 16)
    return channels, float(rate), bits


def read_info_chunks(sf, store_empty):
    """Read in metadata from info list chunk.

    The variable `info_tags` is used to map the 4 character tags to
    human readable key names.

    See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags

    Parameters
    ----------
    sf: stream
        File stream of RIFF file at the position of the chunk's size field.
    store_empty: bool
        If `False` do not add metadata with empty values.

    Returns
    -------
    metadata: dict
        Dictionary with key-value pairs of info tags.

    """
    md = {}
    list_size = struct.unpack('<I', sf.read(4))[0]
    list_type = sf.read(4).decode('latin-1').upper()
    list_size -= 4
    if list_type == 'INFO':
        while list_size >= 8:
            key = sf.read(4).decode('ascii').rstrip(' \x00')
            size = struct.unpack('<I', sf.read(4))[0]
            size += size % 2
            bs = sf.read(size)
            x = np.frombuffer(bs, dtype=np.uint8)
            if np.sum((x >= 0x80) & (x <= 0x9f)) > 0:
                s = bs.decode('windows-1252')
            else:
                s = bs.decode('latin1')
            value = s.rstrip(' \x00\x02')
            list_size -= 8 + size
            if key in info_tags:
                key = info_tags[key]
            if value or store_empty:
                md[key] = value
    if list_size > 0:  # finish or skip
        sf.seek(list_size, os.SEEK_CUR)
    return md


def read_bext_chunk(sf, store_empty=True):
    """Read in metadata from the broadcast-audio extension chunk.

    The variable `bext_tags` lists all valid BEXT fields and their size.

    See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.

    Parameters
    ----------
    sf: stream
        File stream of RIFF file at the position of the chunk's size field.
    store_empty: bool
        If `False` do not add metadata with empty values.

    Returns
    -------
    meta_data: dict
        The metadata of a BEXT chunk are stored in a flat dictionary
        with the following keys:

        - 'Description': a free description of the sequence.
        - 'Originator': name of the originator/producer of the audio file.
        - 'OriginatorReference': unambiguous reference allocated by the originating organisation.
        - 'OriginationDate': date of creation of audio sequence in yyyy:mm:dd.
        - 'OriginationTime': time of creation of audio sequence in hh:mm:ss.
        - 'TimeReference': first sample since midnight.
        - 'Version': version of the BWF.
        - 'UMID': unique material identifier.
        - 'LoudnessValue': integrated loudness value.
        - 'LoudnessRange': loudness range.
        - 'MaxTruePeakLevel': maximum true peak value in dBTP.
        - 'MaxMomentaryLoudness': highest value of the momentary loudness level.
        - 'MaxShortTermLoudness': highest value of the short-term loudness level.
        - 'Reserved': 180 bytes reserved for extension.
        - 'CodingHistory': description of coding processes applied to the audio data, with comma separated subfields: "A=" coding algorithm, e.g. PCM, "F=" sampling rate in Hertz, "B=" bit-rate for MPEG files, "W=" word length in bits, "M=" mono, stereo, dual-mono, joint-stereo, "T=" free text.
    """
    md = {}
    size = struct.unpack('<I', sf.read(4))[0]
    size += size % 2
    s = sf.read(256).decode('ascii').strip(' \x00')
    if s or store_empty:
        md['Description'] = s
    s = sf.read(32).decode('ascii').strip(' \x00')
    if s or store_empty:
        md['Originator'] = s
    s = sf.read(32).decode('ascii').strip(' \x00')
    if s or store_empty:
        md['OriginatorReference'] = s
    s = sf.read(10).decode('ascii').strip(' \x00')
    if s or store_empty:
        md['OriginationDate'] = s
    s = sf.read(8).decode('ascii').strip(' \x00')
    if s or store_empty:
        md['OriginationTime'] = s
    reference, version = struct.unpack('<QH', sf.read(10))
    if reference > 0 or store_empty:
        md['TimeReference'] = reference
    if version > 0 or store_empty:
        md['Version'] = version
    s = sf.read(64).decode('ascii').strip(' \x00')
    if s or store_empty:
        md['UMID'] = s
    lvalue, lrange, peak, momentary, shortterm = struct.unpack('<hhhhh', sf.read(10))
    if lvalue > 0 or store_empty:
        md['LoudnessValue'] = lvalue
    if lrange > 0 or store_empty:
        md['LoudnessRange'] = lrange
    if peak > 0 or store_empty:
        md['MaxTruePeakLevel'] = peak
    if momentary > 0 or store_empty:
        md['MaxMomentaryLoudness'] = momentary
    if shortterm > 0 or store_empty:
        md['MaxShortTermLoudness'] = shortterm
    s = sf.read(180).decode('ascii').strip(' \x00')
    if s or store_empty:
        md['Reserved'] = s
    size -= 256 + 32 + 32 + 10 + 8 + 8 + 2 + 64 + 10 + 180
    s = sf.read(size).decode('ascii').strip(' \x00\n\r')
    if s or store_empty:
        md['CodingHistory'] = s
    return md


def read_ixml_chunk(sf, store_empty=True):
    """Read in metadata from an iXML chunk.

    See the variable `ixml_tags` for a list of valid tags.

    See http://www.gallery.co.uk/ixml/ for the specification of iXML.

    Parameters
    ----------
    sf: stream
        File stream of RIFF file at the position of the chunk's size field.
    store_empty: bool
        If `False` do not add metadata with empty values.

    Returns
    -------
    metadata: nested dict
        Dictionary with key-value pairs.
    """

    def parse_ixml(element, store_empty=True):
        md = {}
        for e in element:
            if e.text is not None:
                md[e.tag] = e.text
            elif len(e) > 0:
                md[e.tag] = parse_ixml(e, store_empty)
            elif store_empty:
                md[e.tag] = ''
        return md

    size = struct.unpack('<I', sf.read(4))[0]
    size += size % 2
    xmls = sf.read(size).decode('latin-1').rstrip(' \x00')
    root = ET.fromstring(xmls)
    md = {root.tag: parse_ixml(root, store_empty)}
    if len(md) == 1 and 'BWFXML' in md:
        md = md['BWFXML']
    return md


def read_guano_chunk(sf):
    """Read in metadata from a GUANO chunk.

    GUANO is the Grand Unified Acoustic Notation Ontology, an
    extensible, open format for embedding metadata within bat acoustic
    recordings. See https://github.com/riggsd/guano-spec for details.

    The GUANO specification allows for the inclusion of arbitrary
    nested keys and string-encoded values. In that respect it is a
    well-defined and easy-to-handle serialization of the [odML data
    model](https://doi.org/10.3389/fninf.2011.00016).

    Parameters
    ----------
    sf: stream
        File stream of RIFF file at the position of the chunk's size field.

    Returns
    -------
    metadata: nested dict
        Dictionary with key-value pairs.

    """
    md = {}
    size = struct.unpack('<I', sf.read(4))[0]
    size += size % 2
    for line in io.StringIO(sf.read(size).decode('utf-8')):
        ss = line.split(':')
        if len(ss) > 1:
            md[ss[0].strip()] = ':'.join(ss[1:]).strip().replace(r'\n', '\n')
    return unflatten_metadata(md, '|')


def read_cue_chunk(sf):
    """Read in marker positions from cue chunk.

    See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file

    Parameters
    ----------
    sf: stream
        File stream of RIFF file at the position of the chunk's size field.

    Returns
    -------
    locs: 2-D array of ints
        Each row is a marker with unique identifier in the first column,
        position in the second column, and span in the third column.
        The cue chunk does not encode spans, so the third column is
        initialized with zeros.
    """
    locs = []
    size, n = struct.unpack('<II', sf.read(8))
    for c in range(n):
        cpid, cppos = struct.unpack('<II', sf.read(8))
        datachunkid = sf.read(4).decode('latin-1').rstrip(' \x00').upper()
        chunkstart, blockstart, offset = struct.unpack('<III', sf.read(12))
        if datachunkid == 'DATA':
            locs.append((cpid, cppos, 0))
    return np.array(locs, dtype=int)


def read_playlist_chunk(sf, locs):
    """Read in marker spans from playlist chunk.

    See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file

    Parameters
    ----------
    sf: stream
        File stream of RIFF file at the position of the chunk's size field.
    locs: 2-D array of ints
        Markers as returned by the `read_cue_chunk()` function.
        Each row is a marker with unique identifier in the first column,
        position in the second column, and span in the third column.
        The span is read in from the playlist chunk.
    """
    if len(locs) == 0:
        warnings.warn('read_playlist_chunk() requires markers from a previous cue chunk')
    size, n = struct.unpack('<II', sf.read(8))
    for p in range(n):
        cpid, length, repeats = struct.unpack('<III', sf.read(12))
        i = np.where(locs[:,0] == cpid)[0]
        if len(i) > 0:
            locs[i[0], 2] = length


def read_adtl_chunks(sf, locs, labels):
    """Read in associated data list chunks.

    See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file

    Parameters
    ----------
    sf: stream
        File stream of RIFF file at the position of the chunk's size field.
    locs: 2-D array of ints
        Markers as returned by the `read_cue_chunk()` function.
        Each row is a marker with unique identifier in the first column,
        position in the second column, and span in the third column.
        The span is read in from the LTXT chunk.
    labels: 2-D array of string objects
        Labels (first column) and texts (second column) for each marker (rows)
        from previous LABL, NOTE, and LTXT chunks.

    Returns
    -------
    labels: 2-D array of string objects
        Labels (first column) and texts (second column) for each marker (rows)
        from LABL, NOTE (first column), and LTXT chunks (last column).
    """
    list_size = struct.unpack('<I', sf.read(4))[0]
    list_type = sf.read(4).decode('latin-1').upper()
    list_size -= 4
    if list_type == 'ADTL':
        if len(locs) == 0:
            warnings.warn('read_adtl_chunks() requires markers from a previous cue chunk')
        if len(labels) == 0:
            labels = np.zeros((len(locs), 2), dtype=object)
        while list_size >= 8:
            key = sf.read(4).decode('latin-1').rstrip(' \x00').upper()
            size, cpid = struct.unpack('<II', sf.read(8))
            size += size % 2 - 4
            if key == 'LABL' or key == 'NOTE':
                label = sf.read(size).decode('latin-1').rstrip(' \x00')
                i = np.where(locs[:,0] == cpid)[0]
                if len(i) > 0:
                    i = i[0]
                    if hasattr(labels[i,0], '__len__') and len(labels[i,0]) > 0:
                        labels[i,0] += '|' + label
                    else:
                        labels[i,0] = label
            elif key == 'LTXT':
                length = struct.unpack('<I', sf.read(4))[0]
                sf.read(12)  # skip fields
                text = sf.read(size - 4 - 12).decode('latin-1').rstrip(' \x00')
                i = np.where(locs[:,0] == cpid)[0]
                if len(i) > 0:
                    i = i[0]
                    if hasattr(labels[i,1], '__len__') and len(labels[i,1]) > 0:
                        labels[i,1] += '|' + text
                    else:
                        labels[i,1] = text
                    locs[i,2] = length
            else:
                sf.read(size)
            list_size -= 12 + size
    if list_size > 0:  # finish or skip
        sf.seek(list_size, os.SEEK_CUR)
    return labels


def read_lbl_chunk(sf, rate):
    """Read in marker positions, spans, labels, and texts from lbl chunk.

    The proprietary LBL chunk is specific to wave files generated by
    [AviSoft](www.avisoft.com) products.

    The labels (first column of `labels`) have special meanings.
    Markers with a span (a section label in the terminology of
    AviSoft) can be arranged in three levels when displayed:

    - "M": layer 1, the top level section
    - "N": layer 2, sections below layer 1
    - "O": layer 3, sections below layer 2
    - "P": total, section start and end are displayed with two vertical lines.

    All other labels mark single point labels with a time and a
    frequency (that we here discard). See also
    https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm

    Parameters
    ----------
    sf: stream
        File stream of RIFF file at the position of the chunk's size field.
    rate: float
        Sampling rate of the data in Hertz.

    Returns
    -------
    locs: 2-D array of ints
        Each row is a marker with unique identifier (simply integers
        enumerating the markers) in the first column, position in the
        second column, and span in the third column.
    labels: 2-D array of string objects
        Labels (first column) and texts (second column) for
        each marker (rows).

    """
    size = struct.unpack('<I', sf.read(4))[0]
    nn = size // 65
    locs = np.zeros((nn, 3), dtype=int)
    labels = np.zeros((nn, 2), dtype=object)
    n = 0
    for c in range(nn):
        line = sf.read(65).decode('ascii')
        fields = line.split('\t')
        if len(fields) >= 4:
            labels[n,0] = fields[3].strip()
            labels[n,1] = fields[2].strip()
            start_idx = int(np.round(float(fields[0].strip('\x00'))*rate))
            end_idx = int(np.round(float(fields[1].strip('\x00'))*rate))
            locs[n,0] = n
            locs[n,1] = start_idx
            if labels[n,0] in 'MNOP':
                locs[n,2] = end_idx - start_idx
            else:
                locs[n,2] = 0
            n += 1
        else:
            # The first 65 bytes are a title string that applies to
            # the whole wave file and that can be set from the AviSoft
            # software. The recorder leaves this empty.
            pass
    return locs[:n,:], labels[:n,:]


def metadata_riff(filepath, store_empty=False):
    """Read metadata from a RIFF/WAVE file.

    Parameters
    ----------
    filepath: string or Path or file handle
        The RIFF file.
    store_empty: bool
        If `False` do not add metadata with empty values.

    Returns
    -------
    meta_data: nested dict
        Metadata contained in the RIFF file. Keys of the nested
        dictionaries are always strings. If the corresponding
        values are dictionaries, then the key is the section name
        of the metadata contained in the dictionary. All other
        types of values are values for the respective key. In
        particular they are strings, or lists of strings. But other
        simple types like ints or floats are also allowed.
        The first level contains sections of metadata
        (e.g. keys 'INFO', 'BEXT', 'IXML', values are dictionaries).

    Raises
    ------
    ValueError
        Not a RIFF file.

    Examples
    --------
    ```
    from audioio.riffmetadata import metadata_riff
    from audioio import print_metadata

    md = metadata_riff('audio/file.wav')
    print_metadata(md)
    ```
    """
    meta_data = {}
    sf = filepath
    file_pos = None
    if hasattr(filepath, 'read'):
        file_pos = sf.tell()
        sf.seek(0, os.SEEK_SET)
    else:
        sf = open(filepath, 'rb')
    fsize = read_riff_header(sf)
    while (sf.tell() < fsize - 8):
        chunk = sf.read(4).decode('latin-1').upper()
        if chunk == 'LIST':
            md = read_info_chunks(sf, store_empty)
            if len(md) > 0:
                meta_data['INFO'] = md
        elif chunk == 'BEXT':
            md = read_bext_chunk(sf, store_empty)
            if len(md) > 0:
                meta_data['BEXT'] = md
        elif chunk == 'IXML':
            md = read_ixml_chunk(sf, store_empty)
            if len(md) > 0:
                meta_data['IXML'] = md
        elif chunk == 'GUAN':
            md = read_guano_chunk(sf)
            if len(md) > 0:
                meta_data.update(md)
        else:
            skip_chunk(sf)
    if file_pos is None:
        sf.close()
    else:
        sf.seek(file_pos, os.SEEK_SET)
    return meta_data


971def markers_riff(filepath):
972 """Read markers from a RIFF/WAVE file.
974 Parameters
975 ----------
976 filepath: string or Path or file handle
977 The RIFF file.
979 Returns
980 -------
981 locs: 2-D array of ints
982 Marker positions (first column) and spans (second column)
983 for each marker (rows).
984 labels: 2-D array of string objects
985 Labels (first column) and texts (second column)
986 for each marker (rows).
988 Raises
989 ------
990 ValueError
991 Not a RIFF file.
993 Examples
994 --------
995 ```
996 from audioio.riffmetadata import riff_markers
997 from audioio import print_markers
999 locs, labels = riff_markers('audio/file.wav')
1000 print_markers(locs, labels)
1001 ```
1002 """
1003 sf = filepath
1004 file_pos = None
1005 if hasattr(filepath, 'read'):
1006 file_pos = sf.tell()
1007 sf.seek(0, os.SEEK_SET)
1008 else:
1009 sf = open(filepath, 'rb')
1010 rate = None
1011 locs = np.zeros((0, 3), dtype=int)
1012 labels = np.zeros((0, 2), dtype=object)
1013 fsize = read_riff_header(sf)
1014 while (sf.tell() < fsize - 8):
1015 chunk = sf.read(4).decode('latin-1').upper()
1016 if chunk == 'FMT ':
1017 rate = read_format_chunk(sf)[1]
1018 elif chunk == 'CUE ':
1019 locs = read_cue_chunk(sf)
1020 elif chunk == 'PLST':
1021 read_playlist_chunk(sf, locs)
1022 elif chunk == 'LIST':
1023 labels = read_adtl_chunks(sf, locs, labels)
1024 elif chunk == 'LBL ':
1025 locs, labels = read_lbl_chunk(sf, rate)
1026 else:
1027 skip_chunk(sf)
1028 if file_pos is None:
1029 sf.close()
1030 else:
1031 sf.seek(file_pos, os.SEEK_SET)
1032 # sort markers according to their position:
1033 if len(locs) > 0:
1034 idxs = np.argsort(locs[:,-2])
1035 locs = locs[idxs,:]
1036 if len(labels) > 0:
1037 labels = labels[idxs,:]
1038 return locs[:,1:], labels
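The loop above walks the file chunk by chunk: read a 4-character tag, dispatch on it, and skip everything unknown. A minimal stand-alone sketch of that traversal follows. `iter_chunks` is a hypothetical helper, not part of this module, and it assumes the usual RIFF layout (8-byte chunk header, payload padded to an even length) rather than the exact behavior of `read_riff_header()` and `skip_chunk()`, which are not shown here.

```python
import io
import struct

def iter_chunks(stream):
    """Yield (tag, size, payload_offset) for each chunk of a RIFF stream."""
    stream.seek(0)
    riff, size, form = struct.unpack('<4sI4s', stream.read(12))
    if riff != b'RIFF':
        raise ValueError('not a RIFF file')
    end = size + 8                  # the size field excludes the first 8 bytes
    while stream.tell() < end - 8:
        tag, n = struct.unpack('<4sI', stream.read(8))
        yield tag.decode('latin-1'), n, stream.tell()
        # skip the payload, which is padded to an even number of bytes:
        stream.seek(n + n % 2, io.SEEK_CUR)

# tiny in-memory file with two chunks:
body = (b'WAVE'
        + b'fmt ' + struct.pack('<I', 2) + b'\x01\x00'
        + b'note' + struct.pack('<I', 3) + b'abc\x00')
buf = io.BytesIO(b'RIFF' + struct.pack('<I', len(body)) + body)
chunks = [(tag, n) for tag, n, _ in iter_chunks(buf)]
print(chunks)
```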
1041# Write RIFF/WAVE file:
1043def write_riff_chunk(df, filesize=0, tag='WAVE'):
1044 """Write RIFF file header.
1046 Parameters
1047 ----------
1048 df: stream
1049 File stream for writing RIFF file header.
1050 filesize: int
1051 Size of the file in bytes.
1052 tag: str
1053 The type of RIFF file. Default is a wave file.
1054 Exactly 4 characters long.
1056 Returns
1057 -------
1058 n: int
1059 Number of bytes written to the stream.
1061 Raises
1062 ------
1063 ValueError
1064 `tag` is not 4 characters long.
1065 """
1066 if len(tag) != 4:
1067 raise ValueError(f'file tag "{tag}" must be exactly 4 characters long')
1068 if filesize < 8:
1069 filesize = 8
1070 df.write(b'RIFF')
1071 df.write(struct.pack('<I', filesize - 8))
1072 df.write(tag.encode('ascii', errors='strict'))
1073 return 12
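As an illustration of the 12-byte header written above, the following sketch packs the same three fields in one call and unpacks them again. `riff_header` is a hypothetical helper introduced only for this example.

```python
import struct

def riff_header(filesize, tag='WAVE'):
    """Return the 12-byte RIFF header for a file of `filesize` bytes."""
    if len(tag) != 4:
        raise ValueError(f'file tag "{tag}" must be exactly 4 characters long')
    # the stored size excludes the 'RIFF' magic and the size field itself:
    return struct.pack('<4sI4s', b'RIFF', max(filesize, 8) - 8,
                       tag.encode('ascii'))

header = riff_header(44)
magic, size, form = struct.unpack('<4sI4s', header)
print(magic, size, form)
```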
1076def write_filesize(df, filesize=None):
1077 """Write the file size into the RIFF file header.
1079 Parameters
1080 ----------
1081 df: stream
1082 File stream into which to write `filesize`.
1083 filesize: int
1084 Size of the file in bytes. If not specified or 0,
1085 then use current size of the file.
1086 """
1087 pos = df.tell()
1088 if not filesize:
1089 df.seek(0, os.SEEK_END)
1090 filesize = df.tell()
1091 df.seek(4, os.SEEK_SET)
1092 df.write(struct.pack('<I', filesize - 8))
1093 df.seek(pos, os.SEEK_SET)
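The seek, write, and seek-back pattern above can be tried on an in-memory stream. This is a sketch only; `patch_riff_size` is a hypothetical stand-in that mirrors the logic of `write_filesize()` without the optional `filesize` argument.

```python
import io
import struct

def patch_riff_size(stream):
    """Store the current stream length in the RIFF size field."""
    pos = stream.tell()                  # remember where we were
    stream.seek(0, io.SEEK_END)
    filesize = stream.tell()
    stream.seek(4, io.SEEK_SET)          # size field follows the 'RIFF' magic
    stream.write(struct.pack('<I', filesize - 8))
    stream.seek(pos, io.SEEK_SET)        # restore the position
    return filesize

buf = io.BytesIO()
buf.write(b'RIFF' + struct.pack('<I', 0) + b'WAVE')     # size not yet known
buf.write(b'data' + struct.pack('<I', 4) + bytes(4))    # append a chunk
pos_before = buf.tell()
total = patch_riff_size(buf)
stored = struct.unpack('<I', buf.getvalue()[4:8])[0]
print(total, stored)
```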
1096def write_chunk_name(df, pos, tag):
1097 """Change the name of a chunk.
1099 Use this to make readers ignore the content of an existing chunk
1100 by overwriting its name with an unknown one.
1102 Parameters
1103 ----------
1104 df: stream
1105 File stream.
1106 pos: int
1107 Position of the chunk in the file stream.
1108 tag: str
1109 The new name of the chunk.
1110 Exactly 4 characters long.
1112 Raises
1113 ------
1114 ValueError
1115 `tag` is not 4 characters long.
1116 """
1117 if len(tag) != 4:
1118 raise ValueError(f'file tag "{tag}" must be exactly 4 characters long')
1119 df.seek(pos, os.SEEK_SET)
1120 df.write(tag.encode('ascii', errors='strict'))
1123def write_format_chunk(df, channels, frames, rate, bits=16):
1124 """Write format chunk.
1126 Parameters
1127 ----------
1128 df: stream
1129 File stream for writing FMT chunk.
1130 channels: int
1131 Number of channels contained in the data.
1132 frames: int
1133 Number of frames contained in the data.
1134 rate: int or float
1135 Sampling rate (frames per second) in Hertz.
1136 bits: 16 or 32
1137 Bit resolution of the data to be written.
1139 Returns
1140 -------
1141 n: int
1142 Number of bytes written to the stream.
1143 """
1144 blockalign = channels * (bits//8)
1145 byterate = int(rate) * blockalign
1146 df.write(b'fmt ')
1147 df.write(struct.pack('<IHHIIHH', 16, 1, channels, int(rate),
1148 byterate, blockalign, bits))
1149 return 8 + 16
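To make the derived fields concrete, here is a small sketch that builds the same 24-byte PCM format chunk and reads the fields back. `format_chunk` is a hypothetical helper for illustration and returns bytes instead of writing to a stream.

```python
import struct

def format_chunk(channels, rate, bits=16):
    """Return a PCM 'fmt ' chunk as bytes."""
    blockalign = channels * (bits // 8)     # bytes per frame
    byterate = int(rate) * blockalign       # bytes per second
    return b'fmt ' + struct.pack('<IHHIIHH', 16, 1, channels, int(rate),
                                 byterate, blockalign, bits)

chunk = format_chunk(2, 44100, 16)
fields = struct.unpack('<IHHIIHH', chunk[4:])
print(fields)
```

For stereo 16-bit data at 44100 Hz each frame takes 4 bytes, so the byte rate is 176400 bytes per second.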
1152def write_data_chunk(df, data, bits=16):
1153 """Write data chunk.
1155 Parameters
1156 ----------
1157 df: stream
1158 File stream for writing data chunk.
1159 data: 1-D or 2-D array of floats
1160 Data with first axis time (frames) and optional second axis
1161 channels, with values between -1 and 1.
1162 bits: 16 or 32
1163 Bit resolution of the data to be written.
1165 Returns
1166 -------
1167 n: int
1168 Number of bytes written to the stream.
1169 """
1170 df.write(b'data')
1171 df.write(struct.pack('<I', data.size * (bits//8)))
1172 buffer = data * 2**(bits-1)
1173 n = df.write(buffer.astype(f'<i{bits//8}').tobytes('C'))
1174 return 8 + n
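The scaling step multiplies by `2**(bits-1)`, so a sample of exactly 1.0 maps to 32768 for 16 bits, which does not fit into a signed 16-bit integer. The sketch below shows one possible way to handle this by clipping. The clipping is an assumption of this example and is not done by `write_data_chunk()`. `floats_to_pcm16` is a hypothetical helper using only the standard library.

```python
import struct

def floats_to_pcm16(samples):
    """Convert floats in [-1, 1] to little-endian 16-bit PCM bytes."""
    full_scale = 2**15
    ints = []
    for x in samples:
        v = int(x * full_scale)             # truncate towards zero
        # clip to the valid int16 range (not part of the original code):
        v = max(-full_scale, min(full_scale - 1, v))
        ints.append(v)
    return struct.pack(f'<{len(ints)}h', *ints)

pcm = floats_to_pcm16([0.0, 0.5, -1.0, 1.0])
values = struct.unpack('<4h', pcm)
print(values)
```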
1177def write_info_chunk(df, metadata, size=None):
1178 """Write metadata to LIST INFO chunk.
1180 If `metadata` contains an 'INFO' key, then write the flat
1181 dictionary of this key as an INFO chunk. Otherwise, attempt to
1182 write all metadata items as an INFO chunk. The keys are translated
1183 via the `info_tags` variable back to INFO tags. If after
1184 translation any key is left that is longer than 4 characters or
1185 any key has a dictionary as a value (non-flat metadata), the INFO
1186 chunk is not written.
1188 See https://exiftool.org/TagNames/RIFF.html#Info for valid info tags.
1190 Parameters
1191 ----------
1192 df: stream
1193 File stream for writing INFO chunk.
1194 metadata: nested dict
1195 Metadata as key-value pairs. Values can be strings, integers,
1196 or dictionaries.
1197 size: int or None
1198 If specified write this size into the list's size field.
1200 Returns
1201 -------
1202 n: int
1203 Number of bytes written to the stream.
1204 keys_written: list of str
1205 Keys written to the INFO chunk.
1207 """
1208 if not metadata:
1209 return 0, []
1210 is_info = False
1211 if 'INFO' in metadata:
1212 metadata = metadata['INFO']
1213 is_info = True
1214 tags = {v: k for k, v in info_tags.items()}
1215 n = 0
1216 for k in metadata:
1217 kn = tags.get(k, k)
1218 if len(kn) > 4:
1219 if is_info:
1220 warnings.warn(f'no 4-character info tag for key "{k}" found.')
1221 return 0, []
1222 if isinstance(metadata[k], dict):
1223 if is_info:
1224 warnings.warn(f'value of key "{k}" in INFO chunk cannot be a dictionary.')
1225 return 0, []
1226 try:
1227 v = str(metadata[k]).encode('latin-1')
1228 except UnicodeEncodeError:
1229 v = str(metadata[k]).encode('windows-1252')
1230 n += 8 + len(v) + len(v) % 2
1231 df.write(b'LIST')
1232 df.write(struct.pack('<I', size if size is not None else n + 4))
1233 df.write(b'INFO')
1234 keys_written = []
1235 for k in metadata:
1236 kn = tags.get(k, k)
1237 df.write(f'{kn:<4s}'.encode('latin-1'))
1238 try:
1239 v = str(metadata[k]).encode('latin-1')
1240 except UnicodeEncodeError:
1241 v = str(metadata[k]).encode('windows-1252')
1242 ns = len(v) + len(v) % 2
1243 if ns > len(v):
1244 v += b' '
1245 df.write(struct.pack('<I', ns))
1246 df.write(v)
1247 keys_written.append(k)
1248 return 12 + n, ['INFO'] if is_info else keys_written
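The size bookkeeping above (8 bytes of header per entry, values padded to an even length) can be checked with a short stand-alone sketch. `info_list_chunk` is a hypothetical helper that takes already translated 4-character tags, skips the `info_tags` lookup and the windows-1252 fallback, and follows the listing in padding with a space and storing the padded length.

```python
import struct

def info_list_chunk(items):
    """Return a LIST INFO chunk for a flat dict of 4-character tags."""
    body = b''
    for key, value in items.items():
        if len(key) > 4:
            raise ValueError(f'INFO tag "{key}" is longer than 4 characters')
        v = str(value).encode('latin-1')
        if len(v) % 2 == 1:
            v += b' '                       # pad value to even length
        body += f'{key:<4s}'.encode('latin-1') + struct.pack('<I', len(v)) + v
    # the list size counts the 'INFO' tag plus all entries:
    return b'LIST' + struct.pack('<I', 4 + len(body)) + b'INFO' + body

chunk = info_list_chunk({'IENG': 'JB', 'ICMT': 'abc'})
list_size = struct.unpack('<I', chunk[4:8])[0]
print(len(chunk), list_size)
```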
1251def write_bext_chunk(df, metadata):
1252 """Write metadata to BEXT chunk.
1254 If `metadata` contains a BEXT key, and this contains valid BEXT
1255 tags (one of the keys listed in the variable `bext_tags`), then
1256 write the dictionary of that key as a broadcast-audio extension
1257 chunk.
1259 See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.
1261 Parameters
1262 ----------
1263 df: stream
1264 File stream for writing BEXT chunk.
1265 metadata: nested dict
1266 Metadata as key-value pairs. Values can be strings, integers,
1267 or dictionaries.
1269 Returns
1270 -------
1271 n: int
1272 Number of bytes written to the stream.
1273 keys_written: list of str
1274 Keys written to the BEXT chunk.
1276 """
1277 if not metadata or not 'BEXT' in metadata:
1278 return 0, []
1279 metadata = metadata['BEXT']
1280 for k in metadata:
1281 if not k in bext_tags:
1282 warnings.warn(f'no bext tag for key "{k}" found.')
1283 return 0, []
1284 n = 0
1285 for k in bext_tags:
1286 n += bext_tags[k]
1287 ch = metadata.get('CodingHistory', '').encode('ascii', errors='replace')
1288 if len(ch) >= 2 and ch[-2:] != b'\r\n':
1289 ch += b'\r\n'
1290 nch = len(ch) + len(ch) % 2
1291 n += nch
1292 df.write(b'BEXT')
1293 df.write(struct.pack('<I', n))
1294 for k in bext_tags:
1295 bn = bext_tags[k]
1296 if bn == 2:
1297 v = metadata.get(k, '0')
1298 df.write(struct.pack('<H', int(v)))
1299 elif bn == 8 and k == 'TimeReference':
1300 v = metadata.get(k, '0')
1301 df.write(struct.pack('<Q', int(v)))
1302 elif bn == 0:
1303 df.write(ch)
1304 df.write(bytes(nch - len(ch)))
1305 else:
1306 v = metadata.get(k, '').encode('ascii', errors='replace')
1307 df.write(v[:bn].ljust(bn, b'\x00'))
1308 return 8 + n, ['BEXT']
1311def write_ixml_chunk(df, metadata, keys_written=None):
1312 """Write metadata to iXML chunk.
1314 If `metadata` contains an IXML key with valid IXML tags (one of
1315 those listed in the variable `ixml_tags`), or the remaining tags
1316 in `metadata` are valid IXML tags, then write an IXML chunk.
1318 See http://www.gallery.co.uk/ixml/ for the specification of iXML.
1320 Parameters
1321 ----------
1322 df: stream
1323 File stream for writing IXML chunk.
1324 metadata: nested dict
1325 Metadata as key-value pairs. Values can be strings, integers,
1326 or dictionaries.
1327 keys_written: list of str
1328 Keys that have already been written to the INFO or BEXT chunk.
1330 Returns
1331 -------
1332 n: int
1333 Number of bytes written to the stream.
1334 keys_written: list of str
1335 Keys written to the IXML chunk.
1337 """
1338 def check_ixml(metadata):
1339 for k in metadata:
1340 if not k.upper() in ixml_tags:
1341 return False
1342 if isinstance(metadata[k], dict):
1343 if not check_ixml(metadata[k]):
1344 return False
1345 return True
1347 def build_xml(node, metadata):
1348 kw = []
1349 for k in metadata:
1350 e = ET.SubElement(node, k)
1351 if isinstance(metadata[k], dict):
1352 build_xml(e, metadata[k])
1353 else:
1354 e.text = str(metadata[k])
1355 kw.append(k)
1356 return kw
1358 if not metadata:
1359 return 0, []
1360 md = metadata
1361 if keys_written:
1362 md = {k: metadata[k] for k in metadata if not k in keys_written}
1363 if len(md) == 0:
1364 return 0, []
1365 has_ixml = False
1366 if 'IXML' in md and check_ixml(md['IXML']):
1367 md = md['IXML']
1368 has_ixml = True
1369 else:
1370 if not check_ixml(md):
1371 return 0, []
1372 root = ET.Element('BWFXML')
1373 kw = build_xml(root, md)
1374 bs = bytes(ET.tostring(root, xml_declaration=True,
1375 short_empty_elements=False))
1376 if len(bs) % 2 == 1:
1377 bs += bytes(1)
1378 df.write(b'IXML')
1379 df.write(struct.pack('<I', len(bs)))
1380 df.write(bs)
1381 return 8 + len(bs), ['IXML'] if has_ixml else kw
1384def write_guano_chunk(df, metadata, keys_written=None):
1385 """Write metadata to guan chunk.
1387 GUANO is the Grand Unified Acoustic Notation Ontology, an
1388 extensible, open format for embedding metadata within bat acoustic
1389 recordings. See https://github.com/riggsd/guano-spec for details.
1391 The GUANO specification allows for the inclusion of arbitrary
1392 nested keys and string encoded values. In that respect it is a
1393 well defined and easy to handle serialization of the [odML data
1394 model](https://doi.org/10.3389/fninf.2011.00016).
1396 This will write *all* metadata that are not in `keys_written`.
1398 Parameters
1399 ----------
1400 df: stream
1401 File stream for writing guano chunk.
1402 metadata: nested dict
1403 Metadata as key-value pairs. Values can be strings, integers,
1404 or dictionaries.
1405 keys_written: list of str
1406 Keys that have already been written to the INFO, BEXT, or IXML chunk.
1408 Returns
1409 -------
1410 n: int
1411 Number of bytes written to the stream.
1412 keys_written: list of str
1413 Top-level keys written to the GUANO chunk.
1415 """
1416 if not metadata:
1417 return 0, []
1418 md = metadata
1419 if keys_written:
1420 md = {k: metadata[k] for k in metadata if not k in keys_written}
1421 if len(md) == 0:
1422 return 0, []
1423 fmd = flatten_metadata(md, True, '|')
1424 for k in fmd:
1425 if isinstance(fmd[k], str):
1426 fmd[k] = fmd[k].replace('\n', r'\n')
1427 sio = io.StringIO()
1428 m, k = find_key(md, 'GUANO.Version')
1429 if k is None:
1430 sio.write('GUANO|Version:1.0\n')
1431 for k in fmd:
1432 sio.write(f'{k}:{fmd[k]}\n')
1433 bs = sio.getvalue().encode('utf-8')
1434 if len(bs) % 2 == 1:
1435 bs += b' '
1436 n = len(bs)
1437 df.write(b'guan')
1438 df.write(struct.pack('<I', n))
1439 df.write(bs)
1440 return 8 + n, list(md)
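The text written into the guano chunk is one `key:value` line per entry, with nested keys joined by `|` and newlines in values escaped. The following sketch illustrates that serialization. `flatten` and `guano_text` are hypothetical helpers. In particular `flatten` is only a simplified stand-in for `flatten_metadata()`, whose exact behavior is not shown in this section.

```python
def flatten(md, sep='|', prefix=''):
    """Flatten a nested dict, joining nested keys with `sep`."""
    flat = {}
    for key, value in md.items():
        name = f'{prefix}{sep}{key}' if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, sep, name))
        else:
            flat[name] = value
    return flat

def guano_text(md):
    """Serialize nested metadata as GUANO-style text."""
    lines = ['GUANO|Version:1.0']
    for key, value in flatten(md).items():
        # real newlines would end the entry, so store them escaped:
        value = str(value).replace('\n', r'\n')
        lines.append(f'{key}:{value}')
    return '\n'.join(lines) + '\n'

text = guano_text({'Recording': {'Site': 'pond', 'Note': 'line1\nline2'},
                   'Gain': 20})
print(text)
```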
1443def write_cue_chunk(df, locs):
1444 """Write marker positions to cue chunk.
1446 See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file
1448 Parameters
1449 ----------
1450 df: stream
1451 File stream for writing cue chunk.
1452 locs: None or 2-D array of ints
1453 Positions (first column) and spans (optional second column)
1454 for each marker (rows).
1456 Returns
1457 -------
1458 n: int
1459 Number of bytes written to the stream.
1460 """
1461 if locs is None or len(locs) == 0:
1462 return 0
1463 df.write(b'CUE ')
1464 df.write(struct.pack('<II', 4 + len(locs)*24, len(locs)))
1465 for i in range(len(locs)):
1466 df.write(struct.pack('<II4sIII', i, locs[i,0], b'data', 0, 0, 0))
1467 return 12 + len(locs)*24
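Each cue point written above is a fixed 24-byte record. The sketch below builds such a chunk from a plain list of positions and unpacks the first record. `cue_chunk` is a hypothetical helper that returns bytes instead of writing to a stream, and it copies the field order used in the listing.

```python
import struct

def cue_chunk(positions):
    """Return a cue chunk holding one 24-byte record per position."""
    out = b'CUE ' + struct.pack('<II', 4 + 24*len(positions), len(positions))
    for i, pos in enumerate(positions):
        # cue id, position, referenced chunk, and three unused offsets:
        out += struct.pack('<II4sIII', i, pos, b'data', 0, 0, 0)
    return out

chunk = cue_chunk([100, 2500])
first = struct.unpack('<II4sIII', chunk[12:36])
print(len(chunk), first)
```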
1470def write_playlist_chunk(df, locs):
1471 """Write marker spans to playlist chunk.
1473 See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file
1475 Parameters
1476 ----------
1477 df: stream
1478 File stream for writing playlist chunk.
1479 locs: None or 2-D array of ints
1480 Positions (first column) and spans (optional second column)
1481 for each marker (rows).
1483 Returns
1484 -------
1485 n: int
1486 Number of bytes written to the stream.
1487 """
1488 if locs is None or len(locs) == 0 or locs.shape[1] < 2:
1489 return 0
1490 n_spans = np.sum(locs[:,1] > 0)
1491 if n_spans == 0:
1492 return 0
1493 df.write(b'plst')
1494 df.write(struct.pack('<II', 4 + n_spans*12, n_spans))
1495 for i in range(len(locs)):
1496 if locs[i,1] > 0:
1497 df.write(struct.pack('<III', i, locs[i,1], 1))
1498 return 12 + n_spans*12
1501def write_adtl_chunks(df, locs, labels):
1502 """Write associated data list chunks.
1504 See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file
1506 Parameters
1507 ----------
1508 df: stream
1509 File stream for writing adtl chunk.
1510 locs: None or 2-D array of ints
1511 Positions (first column) and spans (optional second column)
1512 for each marker (rows).
1513 labels: None or 2-D array of string objects
1514 Labels (first column) and texts (second column) for each marker (rows).
1516 Returns
1517 -------
1518 n: int
1519 Number of bytes written to the stream.
1520 """
1521 if labels is None or len(labels) == 0:
1522 return 0
1523 labels_size = 0
1524 for l in labels[:,0]:
1525 if hasattr(l, '__len__'):
1526 n = len(l)
1527 if n > 0:
1528 labels_size += 12 + n + n % 2
1529 text_size = 0
1530 if labels.shape[1] > 1:
1531 for t in labels[:,1]:
1532 if hasattr(t, '__len__'):
1533 n = len(t)
1534 if n > 0:
1535 text_size += 28 + n + n % 2
1536 if labels_size == 0 and text_size == 0:
1537 return 0
1538 size = 4 + labels_size + text_size
1539 spans = locs[:,1] if locs.shape[1] > 1 else None
1540 df.write(b'LIST')
1541 df.write(struct.pack('<I', size))
1542 df.write(b'adtl')
1543 for i in range(len(labels)):
1544 # labl sub-chunk:
1545 l = labels[i,0]
1546 if hasattr(l, '__len__'):
1547 n = len(l)
1548 if n > 0:
1549 n += n % 2
1550 df.write(b'labl')
1551 df.write(struct.pack('<II', 4 + n, i))
1552 df.write(f'{l:<{n}s}'.encode('latin-1', errors='replace'))
1553 # ltxt sub-chunk:
1554 if labels.shape[1] > 1:
1555 t = labels[i,1]
1556 if hasattr(t, '__len__'):
1557 n = len(t)
1558 if n > 0:
1559 n += n % 2
1560 span = spans[i] if spans is not None else 0
1561 df.write(b'ltxt')
1562 df.write(struct.pack('<III', 20 + n, i, span))
1563 df.write(struct.pack('<IHHHH', 0, 0, 0, 0, 0))
1564 df.write(f'{t:<{n}s}'.encode('latin-1', errors='replace'))
1565 return 8 + size
1568def write_lbl_chunk(df, locs, labels, rate):
1569 """Write marker positions, spans, labels, and texts to lbl chunk.
1571 The proprietary LBL chunk is specific to wave files generated by
1572 [AviSoft](https://www.avisoft.com) products.
1574 The labels (first column of `labels`) have special meanings.
1575 Markers with a span (a section label in the terminology of
1576 AviSoft) can be arranged in three levels when displayed:
1578 - "M": layer 1, the top level section
1579 - "N": layer 2, sections below layer 1
1580 - "O": layer 3, sections below layer 2
1581 - "P": total, section start and end are displayed with two vertical lines.
1583 All other labels mark single point labels with a time and a
1584 frequency (that we here discard). See also
1585 https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm
1587 If a marker has a span, and its label is not one of "M", "N", "O", or "P",
1588 then its label is set to "M".
1589 If a marker has no span, and its label is one of "M", "N", "O", or "P",
1590 then its label is set to "a".
1592 Parameters
1593 ----------
1594 df: stream
1595 File stream for writing lbl chunk.
1596 locs: None or 2-D array of ints
1597 Positions (first column) and spans (optional second column)
1598 for each marker (rows).
1599 labels: None or 2-D array of string objects
1600 Labels (first column) and texts (second column) for each marker (rows).
1601 rate: float
1602 Sampling rate of the data in Hertz.
1604 Returns
1605 -------
1606 n: int
1607 Number of bytes written to the stream.
1609 """
1610 if locs is None or len(locs) == 0:
1611 return 0
1612 size = (1 + len(locs)) * 65
1613 df.write(b'LBL ')
1614 df.write(struct.pack('<I', size))
1615 # first empty entry (this is meant to be a title for the whole wave file):
1616 df.write(b' ' * 63)
1617 df.write(b'\r\n')
1618 for k in range(len(locs)):
1619 t0 = locs[k,0]/rate
1620 t1 = t0
1621 t1 += locs[k,1]/rate
1622 ls = 'M' if locs[k,1] > 0 else 'a'
1623 ts = ''
1624 if labels is not None and len(labels) > k:
1625 ls = labels[k,0]
1626 if ls != 0 and len(ls) > 0:
1627 ls = ls[0]
1628 if ls in 'MNOP':
1629 if locs[k,1] == 0:
1630 ls = 'a'
1631 else:
1632 if locs[k,1] > 0:
1633 ls = 'M'
1634 ts = labels[k,1]
1635 if ts == 0:
1636 ts = ''
1637 df.write(struct.pack('<14sc', f'{t0:e}'.encode('ascii', errors='replace'), b'\t'))
1638 df.write(struct.pack('<14sc', f'{t1:e}'.encode('ascii', errors='replace'), b'\t'))
1639 bs = f'{ts[:31]:31s}\t{ls}\r\n'.encode('ascii', errors='replace')
1640 df.write(bs)
1641 return 8 + size
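The chunk size above assumes that every entry takes exactly 65 bytes: two 14-byte time fields each followed by a tab, a 31-byte text field, a tab, a one-character label, and a closing `\r\n`. A sketch of a single record follows. `lbl_record` is a hypothetical helper, and it truncates the text to 31 characters so that the record length stays fixed.

```python
import struct

def lbl_record(t0, t1, text, label):
    """Return one 65-byte record of an Avisoft-style LBL chunk."""
    # struct pads the 14-byte time fields with null bytes:
    rec = struct.pack('<14sc', f'{t0:e}'.encode('ascii'), b'\t')
    rec += struct.pack('<14sc', f'{t1:e}'.encode('ascii'), b'\t')
    rec += f'{text[:31]:31s}\t{label[:1]}\r\n'.encode('ascii',
                                                      errors='replace')
    return rec

rec = lbl_record(0.5, 0.75, 'call', 'M')
print(len(rec), rec[:12])
```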
1644def append_metadata_riff(df, metadata):
1645 """Append metadata chunks to RIFF file.
1647 You still need to update the filesize by calling
1648 `write_filesize()`.
1650 Parameters
1651 ----------
1652 df: stream
1653 File stream for writing metadata chunks.
1654 metadata: None or nested dict
1655 Metadata as key-value pairs. Values can be strings, integers,
1656 or dictionaries.
1658 Returns
1659 -------
1660 n: int
1661 Number of bytes written to the stream.
1662 tags: list of str
1663 Tag names of chunks written to audio file.
1664 """
1665 if not metadata:
1666 return 0, []
1667 n = 0
1668 tags = []
1669 # metadata INFO chunk:
1670 nc, kw = write_info_chunk(df, metadata)
1671 if nc > 0:
1672 tags.append('LIST-INFO')
1673 n += nc
1674 # metadata BEXT chunk:
1675 nc, bkw = write_bext_chunk(df, metadata)
1676 if nc > 0:
1677 tags.append('BEXT')
1678 n += nc
1679 kw.extend(bkw)
1680 # metadata IXML chunk:
1681 nc, xkw = write_ixml_chunk(df, metadata, kw)
1682 if nc > 0:
1683 tags.append('IXML')
1684 n += nc
1685 kw.extend(xkw)
1686 # write remaining metadata to GUANO chunk:
1687 nc, _ = write_guano_chunk(df, metadata, kw)
1688 if nc > 0:
1689 tags.append('GUAN')
1690 n += nc
1691 kw.extend(bkw)
1692 return n, tags
1695def append_markers_riff(df, locs, labels=None, rate=None,
1696 marker_hint='cue'):
1697 """Append marker chunks to RIFF file.
1699 You still need to update the filesize by calling
1700 `write_filesize()`.
1702 Parameters
1703 ----------
1704 df: stream
1705 File stream for writing marker chunks.
1706 locs: None or 1-D or 2-D array of ints
1707 Marker positions (first column) and spans (optional second column)
1708 for each marker (rows).
1709 labels: None or 1-D or 2-D array of string objects
1710 Labels (first column) and texts (optional second column)
1711 for each marker (rows).
1712 rate: float
1713 Sampling rate of the data in Hertz, needed for storing markers
1714 in seconds.
1715 marker_hint: str
1716 - 'cue': store markers in cue and adtl chunks.
1717 - 'lbl': store markers in avisoft lbl chunk.
1719 Returns
1720 -------
1721 n: int
1722 Number of bytes written to the stream.
1723 tags: list of str
1724 Tag names of chunks written to audio file.
1726 Raises
1727 ------
1728 ValueError
1729 `marker_hint` not supported.
1730 IndexError
1731 `locs` and `labels` differ in len.
1732 """
1733 if locs is None or len(locs) == 0:
1734 return 0, []
1735 if labels is not None and len(labels) > 0 and len(labels) != len(locs):
1736 raise IndexError('locs and labels must have same number of elements.')
1737 # make locs and labels 2-D:
1738 if not locs is None and locs.ndim == 1:
1739 locs = locs.reshape(-1, 1)
1740 if not labels is None and labels.ndim == 1:
1741 labels = labels.reshape(-1, 1)
1742 # sort markers according to their position:
1743 idxs = np.argsort(locs[:,0])
1744 locs = locs[idxs,:]
1745 if not labels is None and len(labels) > 0:
1746 labels = labels[idxs,:]
1747 n = 0
1748 tags = []
1749 if marker_hint.lower() == 'cue':
1750 # write marker positions:
1751 nc = write_cue_chunk(df, locs)
1752 if nc > 0:
1753 tags.append('CUE ')
1754 n += nc
1755 # write marker spans:
1756 nc = write_playlist_chunk(df, locs)
1757 if nc > 0:
1758 tags.append('PLST')
1759 n += nc
1760 # write marker labels:
1761 nc = write_adtl_chunks(df, locs, labels)
1762 if nc > 0:
1763 tags.append('LIST-ADTL')
1764 n += nc
1765 elif marker_hint.lower() == 'lbl':
1766 # write avisoft labels:
1767 nc = write_lbl_chunk(df, locs, labels, rate)
1768 if nc > 0:
1769 tags.append('LBL ')
1770 n += nc
1771 else:
1772 raise ValueError(f'marker_hint "{marker_hint}" not supported for storing markers')
1773 return n, tags
1776def write_wave(filepath, data, rate, metadata=None, locs=None,
1777 labels=None, encoding=None, marker_hint='cue'):
1778 """Write time series, metadata and markers to a WAVE file.
1780 Only 16-bit or 32-bit PCM encoding is supported.
1782 Parameters
1783 ----------
1784 filepath: string or Path
1785 Full path and name of the file to write.
1786 data: 1-D or 2-D array of floats
1787 Array with the data (first index time, second index channel,
1788 values within -1.0 and 1.0).
1789 rate: float
1790 Sampling rate of the data in Hertz.
1791 metadata: None or nested dict
1792 Metadata as key-value pairs. Values can be strings, integers,
1793 or dictionaries.
1794 locs: None or 1-D or 2-D array of ints
1795 Marker positions (first column) and spans (optional second column)
1796 for each marker (rows).
1797 labels: None or 1-D or 2-D array of string objects
1798 Labels (first column) and texts (optional second column)
1799 for each marker (rows).
1800 encoding: string or None
1801 Encoding of the data: 'PCM_32' or 'PCM_16'.
1802 If None or empty string use 'PCM_16'.
1803 marker_hint: str
1804 - 'cue': store markers in cue and adtl chunks.
1805 - 'lbl': store markers in avisoft lbl chunk.
1807 Raises
1808 ------
1809 ValueError
1810 Encoding not supported.
1811 IndexError
1812 `locs` and `labels` differ in len.
1814 See Also
1815 --------
1816 audioio.audiowriter.write_audio()
1818 Examples
1819 --------
1820 ```
1821 import numpy as np
1822 from audioio.riffmetadata import write_wave
1824 rate = 28000.0
1825 freq = 800.0
1826 time = np.arange(0.0, 1.0, 1/rate) # one second
1827 data = np.sin(2.0*np.pi*freq*time) # 800Hz sine wave
1828 md = dict(Artist='underscore_') # metadata
1830 write_wave('audio/file.wav', data, rate, md)
1831 ```
1832 """
1833 if not encoding:
1834 encoding = 'PCM_16'
1835 encoding = encoding.upper()
1836 bits = 0
1837 if encoding == 'PCM_16':
1838 bits = 16
1839 elif encoding == 'PCM_32':
1840 bits = 32
1841 else:
1842 raise ValueError(f'file encoding {encoding} not supported')
1843 if locs is not None and len(locs) > 0 and \
1844 labels is not None and len(labels) > 0 and len(labels) != len(locs):
1845 raise IndexError('locs and labels must have same number of elements.')
1846 # write WAVE file:
1847 with open(filepath, 'wb') as df:
1848 write_riff_chunk(df)
1849 if data.ndim == 1:
1850 write_format_chunk(df, 1, len(data), rate, bits)
1851 else:
1852 write_format_chunk(df, data.shape[1], data.shape[0],
1853 rate, bits)
1854 append_metadata_riff(df, metadata)
1855 write_data_chunk(df, data, bits)
1856 append_markers_riff(df, locs, labels, rate, marker_hint)
1857 write_filesize(df)
1860def append_riff(filepath, metadata=None, locs=None, labels=None,
1861 rate=None, marker_hint='cue'):
1862 """Append metadata and markers to an existing RIFF file.
1864 Parameters
1865 ----------
1866 filepath: string or Path
1867 Full path and name of the file to write.
1868 metadata: None or nested dict
1869 Metadata as key-value pairs. Values can be strings, integers,
1870 or dictionaries.
1871 locs: None or 1-D or 2-D array of ints
1872 Marker positions (first column) and spans (optional second column)
1873 for each marker (rows).
1874 labels: None or 1-D or 2-D array of string objects
1875 Labels (first column) and texts (optional second column)
1876 for each marker (rows).
1877 rate: float
1878 Sampling rate of the data in Hertz, needed for storing markers
1879 in seconds.
1880 marker_hint: str
1881 - 'cue': store markers in cue and adtl chunks.
1882 - 'lbl': store markers in avisoft lbl chunk.
1884 Returns
1885 -------
1886 n: int
1887 Number of bytes written to the stream.
1889 Raises
1890 ------
1891 IndexError
1892 `locs` and `labels` differ in len.
1894 Examples
1895 --------
1896 ```
1897 import numpy as np
1898 from audioio.riffmetadata import append_riff
1900 md = dict(Artist='underscore_') # metadata
1901 append_riff('audio/file.wav', md) # append them to existing audio file
1902 ```
1903 """
1904 if locs is not None and len(locs) > 0 and \
1905 labels is not None and len(labels) > 0 and len(labels) != len(locs):
1906 raise IndexError('locs and labels must have same number of elements.')
1907 # check RIFF file:
1908 chunks = read_chunk_tags(filepath)
1909 # append to RIFF file:
1910 n = 0
1911 with open(filepath, 'r+b') as df:
1912 tags = []
1913 df.seek(0, os.SEEK_END)
1914 nc, tgs = append_metadata_riff(df, metadata)
1915 n += nc
1916 tags.extend(tgs)
1917 nc, tgs = append_markers_riff(df, locs, labels, rate, marker_hint)
1918 n += nc
1919 tags.extend(tgs)
1920 write_filesize(df)
1921 # blank out already existing chunks:
1922 for tag in chunks:
1923 if tag in tags:
1924 if '-' in tag:
1925 xtag = tag[5:7] + 'xx'
1926 else:
1927 xtag = tag[:2] + 'xx'
1928 write_chunk_name(df, chunks[tag][0], xtag)
1929 return n
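Blanking out an old chunk means overwriting its 4-character name so that readers no longer recognize it, while its size and content stay in place. The sketch below reproduces the renaming rule on an in-memory stream. `blanked_tag` is a hypothetical helper, and the offset used here is simply the position of the chunk name in this toy file; what `read_chunk_tags()` returns as position is not shown in this section.

```python
import io
import struct

def blanked_tag(tag):
    """Return the replacement name for a chunk that should be ignored."""
    if '-' in tag:
        return tag[5:7] + 'xx'      # e.g. 'LIST-INFO' -> 'INxx'
    return tag[:2] + 'xx'           # e.g. 'BEXT' -> 'BExx'

# toy file with a single empty BEXT chunk at offset 12:
buf = io.BytesIO(b'RIFF' + struct.pack('<I', 12) + b'WAVE'
                 + b'BEXT' + struct.pack('<I', 0))
buf.seek(12)
buf.write(blanked_tag('BEXT').encode('ascii'))
renamed = buf.getvalue()[12:16]
print(renamed)
```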
1932def demo(filepath):
1933 """Print metadata and markers of a RIFF/WAVE file.
1935 Parameters
1936 ----------
1937 filepath: string or Path
1938 Path of a RIFF/WAVE file.
1939 """
1940 def print_meta_data(meta_data, level=0):
1941 for sk in meta_data:
1942 md = meta_data[sk]
1943 if isinstance(md, dict):
1944 print(f'{"":<{level*4}}{sk}:')
1945 print_meta_data(md, level+1)
1946 else:
1947 v = str(md).replace('\n', '.').replace('\r', '.')
1948 print(f'{"":<{level*4}s}{sk:<20s}: {v}')
1950 # read meta data:
1951 meta_data = metadata_riff(filepath, store_empty=False)
1953 # print meta data:
1954 print()
1955 print('metadata:')
1956 print_meta_data(meta_data)
1958 # read cues:
1959 locs, labels = markers_riff(filepath)
1961 # print marker table:
1962 if len(locs) > 0:
1963 print()
1964 print('markers:')
1965 print(f'{"position":10} {"span":8} {"label":10} {"text":10}')
1966 for i in range(len(locs)):
1967 if i < len(labels):
1968 print(f'{locs[i,0]:10} {locs[i,1]:8} {labels[i,0]:10} {labels[i,1]:30}')
1969 else:
1970 print(f'{locs[i,0]:10} {locs[i,1]:8} {"-":10} {"-":10}')
1973def main(*args):
1974 """Call demo with command line arguments.
1976 Parameters
1977 ----------
1978 args: list of strings
1979 Command line arguments as returned by sys.argv[1:]
1980 """
1981 if len(args) > 0 and (args[0] == '-h' or args[0] == '--help'):
1982 print()
1983 print('Usage:')
1984 print(' python -m src.audioio.riffmetadata [--help] <audio/file.wav>')
1985 print()
1986 return
1988 if len(args) > 0:
1989 demo(args[0])
1990 else:
1991 rate = 44100
1992 t = np.arange(0, 2, 1/rate)
1993 x = np.sin(2*np.pi*440*t)
1994 imd = dict(IENG='JB', ICRD='2024-01-24', RATE=9,
1995 Comment='this is test1')
1996 bmd = dict(Description='a recording',
1997 OriginationDate='2024:01:24', TimeReference=123456,
1998 Version=42, CodingHistory='Test1\nTest2')
1999 xmd = dict(Project='Record all', Note='still testing',
2000 Sync_Point_List=dict(Sync_Point=1,
2001 Sync_Point_Comment='great'))
2002 omd = imd.copy()
2003 omd['Production'] = bmd
2004 md = dict(INFO=imd, BEXT=bmd, IXML=xmd,
2005 Recording=omd, Notes=xmd)
2006 locs = np.random.randint(10, len(x)-10, (5, 2))
2007 locs = locs[np.argsort(locs[:,0]),:]
2008 locs[:,1] = np.random.randint(0, 20, len(locs))
2009 labels = np.zeros((len(locs), 2), dtype=object)
2010 for i in range(len(labels)):
2011 labels[i,0] = chr(ord('a') + i % 26)
2012 labels[i,1] = chr(ord('A') + i % 26)*5
2013 write_wave('test.wav', x, rate, md, locs, labels)
2014 demo('test.wav')
2017if __name__ == "__main__":
2018 main(*sys.argv[1:])