Coverage for src/audioio/riffmetadata.py: 97%
727 statements
« prev ^ index » next coverage.py v7.7.0, created at 2025-03-18 22:33 +0000
« prev ^ index » next coverage.py v7.7.0, created at 2025-03-18 22:33 +0000
1"""Read and write meta data and marker lists of riff based files.
3Container files of the Resource Interchange File Format (RIFF) like
4WAVE files may contain sections (called chunks) with metadata and
5markers in addition to the timeseries (audio) data and the necessary
6specifications of sampling rate, bit depth, etc.
8## Metadata
10There are various types of chunks for storing metadata, like the [INFO
11list](https://www.recordingblogs.com/wiki/list-chunk-of-a-wave-file),
12[broadcast-audio extension
13(BEXT)](https://tech.ebu.ch/docs/tech/tech3285.pdf) chunk, or
14[iXML](http://www.gallery.co.uk/ixml/) chunks. These chunks contain
15metadata as key-value pairs. Since wave files are primarily designed
16for music, valid keys in these chunks are restricted to topics from
17music and music production. Some keys are usefull also for science,
18but there is need for more keys. It is possible to extend the INFO
19list keys, but these keys are restricted to four characters and the
20INFO list chunk does also not allow for hierarchical metadata. The
21other metadata chunks, in particular the BEXT chunk, cannot be
22extended. With standard chunks, not all types of metadata can be
23stored.
25The [GUANO (Grand Unified Acoustic Notation
26Ontology)](https://github.com/riggsd/guano-spec), primarily designed
27for bat acoustic recordings, has some standard ontologies that are of
28much more interest in scientific context. In addition, GUANO allows
29for extensions with arbitray nested keys and string encoded values.
30In that respect it is a well defined and easy to handle serialization
31of the [odML data model](https://doi.org/10.3389/fninf.2011.00016).
32We use GUANO to write all metadata that do not fit into the INFO, BEXT
33or IXML chunks into a WAVE file.
35To interface the various ways to store and read metadata of RIFF
36files, the `riffmetadata` module simply uses nested dictionaries. The
37keys are always strings. Values are strings or integers for key-value
38pairs. Value strings can also be numbers followed by a unit. Values
39can also be dictionaries for defining subsections of key-value
40pairs. The dictionaries can be nested to arbitrary depth.
42The `write_wave()` function first tries to write an INFO list
43chunk. It checks for a key "INFO" with a flat dictionary of key value
44pairs. It then translates all keys of this dictionary using the
45`info_tags` mapping. If all the resulting keys have no more than four
46characters and there are no subsections, then an INFO list chunk is
47written. If no "INFO" key exists, then with the same procedure all
48elements of the provided metadata are checked for being valid INFO
49tags, and on success an INFO list chunk is written. Then, in similar
50ways, `write_wave()` tries to assemble valid BEXT and iXML chunks,
51based on the tags in `bext_tags` abd `ixml_tags`. All remaining
52metadata are then stored in an GUANO chunk.
54When reading metadata from a RIFF file, INFO, BEXT and iXML chunks are
55returned as subsections with the respective keys. Metadata from an
56GUANO chunk are stored directly in the metadata dictionary without
57marking them as GUANO.
59## Markers
61A number of different chunk types exist for handling markers or cues
62that mark specific events or regions in the audio data. In the end,
63each marker has a position, a span, a label, and a text. Position,
64and span are handled with 1-D or 2-D arrays of ints, where each row is
65a marker and the columns are position and span. The span column is
66optional. Labels and texts come in another 1-D or 2-D array of objects
67pointing to strings. Again, rows are the markers, first column are the
68labels, and second column the optional texts. Try to keep the labels
69short, and use text for longer descriptions, if necessary.
71## Read metadata and markers
73- `metadata_riff()`: read metadata from a RIFF/WAVE file.
74- `markers_riff()`: read markers from a RIFF/WAVE file.
76## Write data, metadata and markers
78- `write_wave()`: write time series, metadata and markers to a WAVE file.
79- `append_metadata_riff()`: append metadata chunks to RIFF file.
80- `append_markers_riff()`: append marker chunks to RIFF file.
81- `append_riff()`: append metadata and markers to an existing RIFF file.
83## Helper functions for reading RIFF and WAVE files
85- `read_chunk_tags()`: read tags of all chunks contained in a RIFF file.
86- `read_riff_header()`: read and check the RIFF file header.
87- `skip_chunk()`: skip over unknown RIFF chunk.
88- `read_format_chunk()`: read format chunk.
89- `read_info_chunks()`: read in meta data from info list chunk.
90- `read_bext_chunk()`: read in metadata from the broadcast-audio extension chunk.
91- `read_ixml_chunk()`: read in metadata from an IXML chunk.
92- `read_guano_chunk()`: read in metadata from a GUANO chunk.
93- `read_cue_chunk()`: read in marker positions from cue chunk.
94- `read_playlist_chunk()`: read in marker spans from playlist chunk.
95- `read_adtl_chunks()`: read in associated data list chunks.
96- `read_lbl_chunk()`: read in marker positions, spans, labels, and texts from lbl chunk.
98## Helper functions for writing RIFF and WAVE files
100- `write_riff_chunk()`: write RIFF file header.
101- `write_filesize()`: write the file size into the RIFF file header.
102- `write_chunk_name()`: change the name of a chunk.
103- `write_format_chunk()`: write format chunk.
104- `write_data_chunk()`: write data chunk.
105- `write_info_chunk()`: write metadata to LIST INFO chunk.
106- `write_bext_chunk()`: write metadata to BEXT chunk.
107- `write_ixml_chunk()`: write metadata to iXML chunk.
108- `write_guano_chunk()`: write metadata to GUANO chunk.
109- `write_cue_chunk()`: write marker positions to cue chunk.
110- `write_playlist_chunk()`: write marker spans to playlist chunk.
111- `write_adtl_chunks()`: write associated data list chunks.
112- `write_lbl_chunk()`: write marker positions, spans, labels, and texts to lbl chunk.
114## Demo
116- `demo()`: print metadata and marker list of RIFF/WAVE file.
117- `main()`: call demo with command line arguments.
119## Descriptions of the RIFF/WAVE file format
121- https://de.wikipedia.org/wiki/RIFF_WAVE
122- http://www.piclist.com/techref/io/serial/midi/wave.html
123- https://moddingwiki.shikadi.net/wiki/Resource_Interchange_File_Format_(RIFF)
124- https://www.recordingblogs.com/wiki/wave-file-format
125- http://fhein.users.ak.tu-berlin.de/Alias/Studio/ProTools/audio-formate/wav/overview.html
126- http://www.gallery.co.uk/ixml/
128For INFO tag names see:
130- see https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags
132"""
134import io
135import os
136import sys
137import warnings
138import struct
139import numpy as np
140import xml.etree.ElementTree as ET
141from .audiometadata import flatten_metadata, unflatten_metadata, find_key
144info_tags = dict(AGES='Rated',
145 CMNT='Comment',
146 CODE='EncodedBy',
147 COMM='Comments',
148 DIRC='Directory',
149 DISP='SoundSchemeTitle',
150 DTIM='DateTimeOriginal',
151 GENR='Genre',
152 IARL='ArchivalLocation',
153 IART='Artist',
154 IAS1='FirstLanguage',
155 IAS2='SecondLanguage',
156 IAS3='ThirdLanguage',
157 IAS4='FourthLanguage',
158 IAS5='FifthLanguage',
159 IAS6='SixthLanguage',
160 IAS7='SeventhLanguage',
161 IAS8='EighthLanguage',
162 IAS9='NinthLanguage',
163 IBSU='BaseURL',
164 ICAS='DefaultAudioStream',
165 ICDS='ConstumeDesigner',
166 ICMS='Commissioned',
167 ICMT='Comment',
168 ICNM='Cinematographer',
169 ICNT='Country',
170 ICOP='Copyright',
171 ICRD='DateCreated',
172 ICRP='Cropped',
173 IDIM='Dimensions',
174 IDIT='DateTimeOriginal',
175 IDPI='DotsPerInch',
176 IDST='DistributedBy',
177 IEDT='EditedBy',
178 IENC='EncodedBy',
179 IENG='Engineer',
180 IGNR='Genre',
181 IKEY='Keywords',
182 ILGT='Lightness',
183 ILGU='LogoURL',
184 ILIU='LogoIconURL',
185 ILNG='Language',
186 IMBI='MoreInfoBannerImage',
187 IMBU='MoreInfoBannerURL',
188 IMED='Medium',
189 IMIT='MoreInfoText',
190 IMIU='MoreInfoURL',
191 IMUS='MusicBy',
192 INAM='Title',
193 IPDS='ProductionDesigner',
194 IPLT='NumColors',
195 IPRD='Product',
196 IPRO='ProducedBy',
197 IRIP='RippedBy',
198 IRTD='Rating',
199 ISBJ='Subject',
200 ISFT='Software',
201 ISGN='SecondaryGenre',
202 ISHP='Sharpness',
203 ISMP='TimeCode',
204 ISRC='Source',
205 ISRF='SourceFrom',
206 ISTD='ProductionStudio',
207 ISTR='Starring',
208 ITCH='Technician',
209 ITRK='TrackNumber',
210 IWMU='WatermarkURL',
211 IWRI='WrittenBy',
212 LANG='Language',
213 LOCA='Location',
214 PRT1='Part',
215 PRT2='NumberOfParts',
216 RATE='Rate',
217 START='Starring',
218 STAT='Statistics',
219 TAPE='TapeName',
220 TCDO='EndTimecode',
221 TCOD='StartTimecode',
222 TITL='Title',
223 TLEN='Length',
224 TORG='Organization',
225 TRCK='TrackNumber',
226 TURL='URL',
227 TVER='Version',
228 VMAJ='VegasVersionMajor',
229 VMIN='VegasVersionMinor',
230 YEAR='Year',
231 # extensions from
232 # [TeeRec](https://github.com/janscience/TeeRec/):
233 BITS='Bits',
234 PINS='Pins',
235 AVRG='Averaging',
236 CNVS='ConversionSpeed',
237 SMPS='SamplingSpeed',
238 VREF='ReferenceVoltage',
239 GAIN='Gain',
240 UWRP='UnwrapThreshold',
241 UWPC='UnwrapClippedAmplitude',
242 IBRD='uCBoard',
243 IMAC='MACAdress',
244 CPUF='CPU frequency')
245"""Dictionary with known tags of the INFO chunk as keys and their description as value.
247See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags
248"""
250bext_tags = dict(
251 Description=256,
252 Originator=32,
253 OriginatorReference=32,
254 OriginationDate=10,
255 OriginationTime=8,
256 TimeReference=8,
257 Version=2,
258 UMID=64,
259 LoudnessValue=2,
260 LoudnessRange=2,
261 MaxTruePeakLevel=2,
262 MaxMomentaryLoudness=2,
263 MaxShortTermLoudness=2,
264 Reserved=180,
265 CodingHistory=0)
266"""Dictionary with tags of the BEXT chunk as keys and their size in bytes as values.
268See https://tech.ebu.ch/docs/tech/tech3285.pdf
269"""
271ixml_tags = [
272 'BWFXML',
273 'IXML_VERSION',
274 'PROJECT',
275 'SCENE',
276 'TAPE',
277 'TAKE',
278 'TAKE_TYPE',
279 'NO_GOOD',
280 'FALSE_START',
281 'WILD_TRACK',
282 'CIRCLED',
283 'FILE_UID',
284 'UBITS',
285 'NOTE',
286 'SYNC_POINT_LIST',
287 'SYNC_POINT_COUNT',
288 'SYNC_POINT',
289 'SYNC_POINT_TYPE',
290 'SYNC_POINT_FUNCTION',
291 'SYNC_POINT_COMMENT',
292 'SYNC_POINT_LOW',
293 'SYNC_POINT_HIGH',
294 'SYNC_POINT_EVENT_DURATION',
295 'SPEED',
296 'MASTER_SPEED',
297 'CURRENT_SPEED',
298 'TIMECODE_RATE',
299 'TIMECODE_FLAGS',
300 'FILE_SAMPLE_RATE',
301 'AUDIO_BIT_DEPTH',
302 'DIGITIZER_SAMPLE_RATE',
303 'TIMESTAMP_SAMPLES_SINCE_MIDNIGHT_HI',
304 'TIMESTAMP_SAMPLES_SINCE_MIDNIGHT_LO',
305 'TIMESTAMP_SAMPLE_RATE',
306 'LOUDNESS',
307 'LOUDNESS_VALUE',
308 'LOUDNESS_RANGE',
309 'MAX_TRUE_PEAK_LEVEL',
310 'MAX_MOMENTARY_LOUDNESS',
311 'MAX_SHORT_TERM_LOUDNESS',
312 'HISTORY',
313 'ORIGINAL_FILENAME',
314 'PARENT_FILENAME',
315 'PARENT_UID',
316 'FILE_SET',
317 'TOTAL_FILES',
318 'FAMILY_UID',
319 'FAMILY_NAME',
320 'FILE_SET_INDEX',
321 'TRACK_LIST',
322 'TRACK_COUNT',
323 'TRACK',
324 'CHANNEL_INDEX',
325 'INTERLEAVE_INDEX',
326 'NAME',
327 'FUNCTION',
328 'PRE_RECORD_SAMPLECOUNT',
329 'BEXT',
330 'BWF_DESCRIPTION',
331 'BWF_ORIGINATOR',
332 'BWF_ORIGINATOR_REFERENCE',
333 'BWF_ORIGINATION_DATE',
334 'BWF_ORIGINATION_TIME',
335 'BWF_TIME_REFERENCE_LOW',
336 'BWF_TIME_REFERENCE_HIGH',
337 'BWF_VERSION',
338 'BWF_UMID',
339 'BWF_RESERVED',
340 'BWF_CODING_HISTORY',
341 'BWF_LOUDNESS_VALUE',
342 'BWF_LOUDNESS_RANGE',
343 'BWF_MAX_TRUE_PEAK_LEVEL',
344 'BWF_MAX_MOMENTARY_LOUDNESS',
345 'BWF_MAX_SHORT_TERM_LOUDNESS',
346 'USER',
347 'FULL_TITLE',
348 'DIRECTOR_NAME',
349 'PRODUCTION_NAME',
350 'PRODUCTION_ADDRESS',
351 'PRODUCTION_EMAIL',
352 'PRODUCTION_PHONE',
353 'PRODUCTION_NOTE',
354 'SOUND_MIXER_NAME',
355 'SOUND_MIXER_ADDRESS',
356 'SOUND_MIXER_EMAIL',
357 'SOUND_MIXER_PHONE',
358 'SOUND_MIXER_NOTE',
359 'AUDIO_RECORDER_MODEL',
360 'AUDIO_RECORDER_SERIAL_NUMBER',
361 'AUDIO_RECORDER_FIRMWARE',
362 'LOCATION',
363 'LOCATION_NAME',
364 'LOCATION_GPS',
365 'LOCATION_ALTITUDE',
366 'LOCATION_TYPE',
367 'LOCATION_TIME',
368 ]
369"""List with valid tags of the iXML chunk.
371See http://www.gallery.co.uk/ixml/
372"""
375# Read RIFF/WAVE files:
377def read_riff_header(sf, tag=None):
378 """Read and check the RIFF file header.
380 Parameters
381 ----------
382 sf: stream
383 File stream of RIFF/WAVE file.
384 tag: None or str
385 If supplied, check whether it matches the subchunk tag.
386 If it does not match, raise a ValueError.
388 Returns
389 -------
390 filesize: int
391 Size of the RIFF file in bytes.
393 Raises
394 ------
395 ValueError
396 Not a RIFF file or subchunk tag does not match `tag`.
397 """
398 riffs = sf.read(4).decode('latin-1')
399 if riffs != 'RIFF':
400 raise ValueError('Not a RIFF file.')
401 fsize = struct.unpack('<I', sf.read(4))[0] + 8
402 subtag = sf.read(4).decode('latin-1')
403 if tag is not None and subtag != tag:
404 raise ValueError(f'Not a {tag} file.')
405 return fsize
408def skip_chunk(sf):
409 """Skip over unknown RIFF chunk.
411 Parameters
412 ----------
413 sf: stream
414 File stream of RIFF file.
416 Returns
417 -------
418 size: int
419 The size of the skipped chunk in bytes.
420 """
421 size = struct.unpack('<I', sf.read(4))[0]
422 size += size % 2
423 sf.seek(size, os.SEEK_CUR)
424 return size
427def read_chunk_tags(filepath):
428 """Read tags of all chunks contained in a RIFF file.
430 Parameters
431 ----------
432 filepath: string or file handle
433 The RIFF file.
435 Returns
436 -------
437 tags: dict
438 Keys are the tag names of the chunks found in the file. If the
439 chunk is a list chunk, then the list type is added with a dash
440 to the key, i.e. "LIST-INFO". Values are tuples with the
441 corresponding file positions of the data of the chunk (after
442 the tag and the chunk size field) and the size of the chunk
443 data. The file position of the next chunk is thus the position
444 of the chunk plus the size of its data.
446 Raises
447 ------
448 ValueError
449 Not a RIFF file.
451 """
452 tags = {}
453 sf = filepath
454 file_pos = None
455 if hasattr(filepath, 'read'):
456 file_pos = sf.tell()
457 sf.seek(0, os.SEEK_SET)
458 else:
459 sf = open(filepath, 'rb')
460 fsize = read_riff_header(sf)
461 while (sf.tell() < fsize - 8):
462 chunk = sf.read(4).decode('latin-1').upper()
463 size = struct.unpack('<I', sf.read(4))[0]
464 size += size % 2
465 fp = sf.tell()
466 if chunk == 'LIST':
467 subchunk = sf.read(4).decode('latin-1').upper()
468 tags[chunk + '-' + subchunk] = (fp, size)
469 size -= 4
470 else:
471 tags[chunk] = (fp, size)
472 sf.seek(size, os.SEEK_CUR)
473 if file_pos is None:
474 sf.close()
475 else:
476 sf.seek(file_pos, os.SEEK_SET)
477 return tags
480def read_format_chunk(sf):
481 """Read format chunk.
483 Parameters
484 ----------
485 sf: stream
486 File stream for reading FMT chunk.
488 Returns
489 -------
490 channels: int
491 Number of channels.
492 rate: float
493 Sampling rate (frames per time) in Hertz.
494 bits: int
495 Bit resolution.
496 """
497 size = struct.unpack('<I', sf.read(4))[0]
498 size += size % 2
499 ccode, channels, rate, byterate, blockalign, bits = struct.unpack('<HHIIHH', sf.read(16))
500 if size > 16:
501 sf.read(size - 16)
502 return channels, float(rate), bits
505def read_info_chunks(sf, store_empty):
506 """Read in meta data from info list chunk.
508 The variable `info_tags` is used to map the 4 character tags to
509 human readable key names.
511 See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags
513 Parameters
514 ----------
515 sf: stream
516 File stream of RIFF file.
517 store_empty: bool
518 If `False` do not add meta data with empty values.
520 Returns
521 -------
522 metadata: dict
523 Dictionary with key-value pairs of info tags.
525 """
526 md = {}
527 list_size = struct.unpack('<I', sf.read(4))[0]
528 list_type = sf.read(4).decode('latin-1').upper()
529 list_size -= 4
530 if list_type == 'INFO':
531 while list_size >= 8:
532 key = sf.read(4).decode('ascii').rstrip(' \x00')
533 size = struct.unpack('<I', sf.read(4))[0]
534 size += size % 2
535 bs = sf.read(size)
536 x = np.frombuffer(bs, dtype=np.uint8)
537 if np.sum((x >= 0x80) & (x <= 0x9f)) > 0:
538 s = bs.decode('windows-1252')
539 else:
540 s = bs.decode('latin1')
541 value = s.rstrip(' \x00\x02')
542 list_size -= 8 + size
543 if key in info_tags:
544 key = info_tags[key]
545 if value or store_empty:
546 md[key] = value
547 if list_size > 0: # finish or skip
548 sf.seek(list_size, os.SEEK_CUR)
549 return md
552def read_bext_chunk(sf, store_empty=True):
553 """Read in metadata from the broadcast-audio extension chunk.
555 The variable `bext_tags` lists all valid BEXT fields and their size.
557 See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.
559 Parameters
560 ----------
561 sf: stream
562 File stream of RIFF file.
563 store_empty: bool
564 If `False` do not add meta data with empty values.
566 Returns
567 -------
568 meta_data: dict
569 The meta-data of a BEXT chunk are stored in a flat dictionary
570 with the following keys:
572 - 'Description': a free description of the sequence.
573 - 'Originator': name of the originator/ producer of the audio file.
574 - 'OriginatorReference': unambiguous reference allocated by the originating organisation.
575 - 'OriginationDate': date of creation of audio sequence in yyyy:mm:dd.
576 - 'OriginationTime': time of creation of audio sequence in hh:mm:ss.
577 - 'TimeReference': first sample since midnight.
578 - 'Version': version of the BWF.
579 - 'UMID': unique material identifier.
580 - 'LoudnessValue': integrated loudness value.
581 - 'LoudnessRange': loudness range.
582 - 'MaxTruePeakLevel': maximum true peak value in dBTP.
583 - 'MaxMomentaryLoudness': highest value of the momentary loudness level.
584 - 'MaxShortTermLoudness': highest value of the short-term loudness level.
585 - 'Reserved': 180 bytes reserved for extension.
586 - 'CodingHistory': description of coding processed applied to the audio data, with comma separated subfields: "A=" coding algorithm, e.g. PCM, "F=" sampling rate in Hertz, "B=" bit-rate for MPEG files, "W=" word length in bits, "M=" mono, stereo, dual-mono, joint-stereo, "T=" free text.
587 """
588 md = {}
589 size = struct.unpack('<I', sf.read(4))[0]
590 size += size % 2
591 s = sf.read(256).decode('ascii').strip(' \x00')
592 if s or store_empty:
593 md['Description'] = s
594 s = sf.read(32).decode('ascii').strip(' \x00')
595 if s or store_empty:
596 md['Originator'] = s
597 s = sf.read(32).decode('ascii').strip(' \x00')
598 if s or store_empty:
599 md['OriginatorReference'] = s
600 s = sf.read(10).decode('ascii').strip(' \x00')
601 if s or store_empty:
602 md['OriginationDate'] = s
603 s = sf.read(8).decode('ascii').strip(' \x00')
604 if s or store_empty:
605 md['OriginationTime'] = s
606 reference, version = struct.unpack('<QH', sf.read(10))
607 if reference > 0 or store_empty:
608 md['TimeReference'] = reference
609 if version > 0 or store_empty:
610 md['Version'] = version
611 s = sf.read(64).decode('ascii').strip(' \x00')
612 if s or store_empty:
613 md['UMID'] = s
614 lvalue, lrange, peak, momentary, shortterm = struct.unpack('<hhhhh', sf.read(10))
615 if lvalue > 0 or store_empty:
616 md['LoudnessValue'] = lvalue
617 if lrange > 0 or store_empty:
618 md['LoudnessRange'] = lrange
619 if peak > 0 or store_empty:
620 md['MaxTruePeakLevel'] = peak
621 if momentary > 0 or store_empty:
622 md['MaxMomentaryLoudness'] = momentary
623 if shortterm > 0 or store_empty:
624 md['MaxShortTermLoudness'] = shortterm
625 s = sf.read(180).decode('ascii').strip(' \x00')
626 if s or store_empty:
627 md['Reserved'] = s
628 size -= 256 + 32 + 32 + 10 + 8 + 8 + 2 + 64 + 10 + 180
629 s = sf.read(size).decode('ascii').strip(' \x00\n\r')
630 if s or store_empty:
631 md['CodingHistory'] = s
632 return md
635def read_ixml_chunk(sf, store_empty=True):
636 """Read in metadata from an IXML chunk.
638 See the variable `ixml_tags` for a list of valid tags.
640 See http://www.gallery.co.uk/ixml/ for the specification of iXML.
642 Parameters
643 ----------
644 sf: stream
645 File stream of RIFF file.
646 store_empty: bool
647 If `False` do not add meta data with empty values.
649 Returns
650 -------
651 metadata: nested dict
652 Dictionary with key-value pairs.
653 """
655 def parse_ixml(element, store_empty=True):
656 md = {}
657 for e in element:
658 if not e.text is None:
659 md[e.tag] = e.text
660 elif len(e) > 0:
661 md[e.tag] = parse_ixml(e, store_empty)
662 elif store_empty:
663 md[e.tag] = ''
664 return md
666 size = struct.unpack('<I', sf.read(4))[0]
667 size += size % 2
668 xmls = sf.read(size).decode('latin-1').rstrip(' \x00')
669 root = ET.fromstring(xmls)
670 md = {root.tag: parse_ixml(root, store_empty)}
671 if len(md) == 1 and 'BWFXML' in md:
672 md = md['BWFXML']
673 return md
676def read_guano_chunk(sf):
677 """Read in metadata from a GUANO chunk.
679 GUANO is the Grand Unified Acoustic Notation Ontology, an
680 extensible, open format for embedding metadata within bat acoustic
681 recordings. See https://github.com/riggsd/guano-spec for details.
683 The GUANO specification allows for the inclusion of arbitrary
684 nested keys and string encoded values. In that respect it is a
685 well defined and easy to handle serialization of the [odML data
686 model](https://doi.org/10.3389/fninf.2011.00016).
688 Parameters
689 ----------
690 sf: stream
691 File stream of RIFF file.
693 Returns
694 -------
695 metadata: nested dict
696 Dictionary with key-value pairs.
698 """
699 md = {}
700 size = struct.unpack('<I', sf.read(4))[0]
701 size += size % 2
702 for line in io.StringIO(sf.read(size).decode('utf-8')):
703 ss = line.split(':')
704 if len(ss) > 1:
705 md[ss[0].strip()] = ':'.join(ss[1:]).strip().replace(r'\n', '\n')
706 return unflatten_metadata(md, '|')
709def read_cue_chunk(sf):
710 """Read in marker positions from cue chunk.
712 See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file
714 Parameters
715 ----------
716 sf: stream
717 File stream of RIFF file.
719 Returns
720 -------
721 locs: 2-D array of ints
722 Each row is a marker with unique identifier in the first column,
723 position in the second column, and span in the third column.
724 The cue chunk does not encode spans, so the third column is
725 initialized with zeros.
726 """
727 locs = []
728 size, n = struct.unpack('<II', sf.read(8))
729 for c in range(n):
730 cpid, cppos = struct.unpack('<II', sf.read(8))
731 datachunkid = sf.read(4).decode('latin-1').rstrip(' \x00').upper()
732 chunkstart, blockstart, offset = struct.unpack('<III', sf.read(12))
733 if datachunkid == 'DATA':
734 locs.append((cpid, cppos, 0))
735 return np.array(locs, dtype=int)
738def read_playlist_chunk(sf, locs):
739 """Read in marker spans from playlist chunk.
741 See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file
743 Parameters
744 ----------
745 sf: stream
746 File stream of RIFF file.
747 locs: 2-D array of ints
748 Markers as returned by the `read_cue_chunk()` function.
749 Each row is a marker with unique identifier in the first column,
750 position in the second column, and span in the third column.
751 The span is read in from the playlist chunk.
752 """
753 if len(locs) == 0:
754 warnings.warn('read_playlist_chunks() requires markers from a previous cue chunk')
755 size, n = struct.unpack('<II', sf.read(8))
756 for p in range(n):
757 cpid, length, repeats = struct.unpack('<III', sf.read(12))
758 i = np.where(locs[:,0] == cpid)[0]
759 if len(i) > 0:
760 locs[i[0], 2] = length
763def read_adtl_chunks(sf, locs, labels):
764 """Read in associated data list chunks.
766 See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file
768 Parameters
769 ----------
770 sf: stream
771 File stream of RIFF file.
772 locs: 2-D array of ints
773 Markers as returned by the `read_cue_chunk()` function.
774 Each row is a marker with unique identifier in the first column,
775 position in the second column, and span in the third column.
776 The span is read in from the LTXT chunk.
777 labels: 2-D array of string objects
778 Labels (first column) and texts (second column) for each marker (rows)
779 from previous LABL, NOTE, and LTXT chunks.
781 Returns
782 -------
783 labels: 2-D array of string objects
784 Labels (first column) and texts (second column) for each marker (rows)
785 from LABL, NOTE (first column), and LTXT chunks (last column).
786 """
787 list_size = struct.unpack('<I', sf.read(4))[0]
788 list_type = sf.read(4).decode('latin-1').upper()
789 list_size -= 4
790 if list_type == 'ADTL':
791 if len(locs) == 0:
792 warnings.warn('read_adtl_chunks() requires markers from a previous cue chunk')
793 if len(labels) == 0:
794 labels = np.zeros((len(locs), 2), dtype=object)
795 while list_size >= 8:
796 key = sf.read(4).decode('latin-1').rstrip(' \x00').upper()
797 size, cpid = struct.unpack('<II', sf.read(8))
798 size += size % 2 - 4
799 if key == 'LABL' or key == 'NOTE':
800 label = sf.read(size).decode('latin-1').rstrip(' \x00')
801 i = np.where(locs[:,0] == cpid)[0]
802 if len(i) > 0:
803 i = i[0]
804 if hasattr(labels[i,0], '__len__') and len(labels[i,0]) > 0:
805 labels[i,0] += '|' + label
806 else:
807 labels[i,0] = label
808 elif key == 'LTXT':
809 length = struct.unpack('<I', sf.read(4))[0]
810 sf.read(12) # skip fields
811 text = sf.read(size - 4 - 12).decode('latin-1').rstrip(' \x00')
812 i = np.where(locs[:,0] == cpid)[0]
813 if len(i) > 0:
814 i = i[0]
815 if hasattr(labels[i,1], '__len__') and len(labels[i,1]) > 0:
816 labels[i,1] += '|' + text
817 else:
818 labels[i,1] = text
819 locs[i,2] = length
820 else:
821 sf.read(size)
822 list_size -= 12 + size
823 if list_size > 0: # finish or skip
824 sf.seek(list_size, os.SEEK_CUR)
825 return labels
828def read_lbl_chunk(sf, rate):
829 """Read in marker positions, spans, labels, and texts from lbl chunk.
831 The proprietary LBL chunk is specific to wave files generated by
832 [AviSoft](www.avisoft.com) products.
834 The labels (first column of `labels`) have special meanings.
835 Markers with a span (a section label in the terminology of
836 AviSoft) can be arranged in three levels when displayed:
838 - "M": layer 1, the top level section
839 - "N": layer 2, sections below layer 1
840 - "O": layer 3, sections below layer 2
841 - "P": total, section start and end are displayed with two vertical lines.
843 All other labels mark single point labels with a time and a
844 frequency (that we here discard). See also
845 https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm
847 Parameters
848 ----------
849 sf: stream
850 File stream of RIFF file.
851 rate: float
852 Sampling rate of the data in Hertz.
854 Returns
855 -------
856 locs: 2-D array of ints
857 Each row is a marker with unique identifier (simply integers
858 enumerating the markers) in the first column, position in the
859 second column, and span in the third column.
860 labels: 2-D array of string objects
861 Labels (first column) and texts (second column) for
862 each marker (rows).
864 """
865 size = struct.unpack('<I', sf.read(4))[0]
866 nn = size // 65
867 locs = np.zeros((nn, 3), dtype=int)
868 labels = np.zeros((nn, 2), dtype=object)
869 n = 0
870 for c in range(nn):
871 line = sf.read(65).decode('ascii')
872 fields = line.split('\t')
873 if len(fields) >= 4:
874 labels[n,0] = fields[3].strip()
875 labels[n,1] = fields[2].strip()
876 start_idx = int(np.round(float(fields[0].strip('\x00'))*rate))
877 end_idx = int(np.round(float(fields[1].strip('\x00'))*rate))
878 locs[n,0] = n
879 locs[n,1] = start_idx
880 if labels[n,0] in 'MNOP':
881 locs[n,2] = end_idx - start_idx
882 else:
883 locs[n,2] = 0
884 n += 1
885 else:
886 # the first 65 bytes are a title string that applies to
887 # the whole wave file that can be set from the AVISoft
888 # software. The recorder leave this empty.
889 pass
890 return locs[:n,:], labels[:n,:]
893def metadata_riff(filepath, store_empty=False):
894 """Read metadata from a RIFF/WAVE file.
896 Parameters
897 ----------
898 filepath: string or file handle
899 The RIFF file.
900 store_empty: bool
901 If `False` do not add meta data with empty values.
903 Returns
904 -------
905 meta_data: nested dict
906 Meta data contained in the RIFF file. Keys of the nested
907 dictionaries are always strings. If the corresponding
908 values are dictionaries, then the key is the section name
909 of the metadata contained in the dictionary. All other
910 types of values are values for the respective key. In
911 particular they are strings, or list of strings. But other
912 simple types like ints or floats are also allowed.
913 First level contains sections of meta data
914 (e.g. keys 'INFO', 'BEXT', 'IXML', values are dictionaries).
916 Raises
917 ------
918 ValueError
919 Not a RIFF file.
921 Examples
922 --------
923 ```
924 from audioio.riffmetadata import riff_metadata
925 from audioio import print_metadata
927 md = riff_metadata('audio/file.wav')
928 print_metadata(md)
929 ```
930 """
931 meta_data = {}
932 sf = filepath
933 file_pos = None
934 if hasattr(filepath, 'read'):
935 file_pos = sf.tell()
936 sf.seek(0, os.SEEK_SET)
937 else:
938 sf = open(filepath, 'rb')
939 fsize = read_riff_header(sf)
940 while (sf.tell() < fsize - 8):
941 chunk = sf.read(4).decode('latin-1').upper()
942 if chunk == 'LIST':
943 md = read_info_chunks(sf, store_empty)
944 if len(md) > 0:
945 meta_data['INFO'] = md
946 elif chunk == 'BEXT':
947 md = read_bext_chunk(sf, store_empty)
948 if len(md) > 0:
949 meta_data['BEXT'] = md
950 elif chunk == 'IXML':
951 md = read_ixml_chunk(sf, store_empty)
952 if len(md) > 0:
953 meta_data['IXML'] = md
954 elif chunk == 'GUAN':
955 md = read_guano_chunk(sf)
956 if len(md) > 0:
957 meta_data.update(md)
958 else:
959 skip_chunk(sf)
960 if file_pos is None:
961 sf.close()
962 else:
963 sf.seek(file_pos, os.SEEK_SET)
964 return meta_data
967def markers_riff(filepath):
968 """Read markers from a RIFF/WAVE file.
970 Parameters
971 ----------
972 filepath: string or file handle
973 The RIFF file.
975 Returns
976 -------
977 locs: 2-D array of ints
978 Marker positions (first column) and spans (second column)
979 for each marker (rows).
980 labels: 2-D array of string objects
981 Labels (first column) and texts (second column)
982 for each marker (rows).
984 Raises
985 ------
986 ValueError
987 Not a RIFF file.
989 Examples
990 --------
991 ```
992 from audioio.riffmetadata import riff_markers
993 from audioio import print_markers
995 locs, labels = riff_markers('audio/file.wav')
996 print_markers(locs, labels)
997 ```
998 """
999 sf = filepath
1000 file_pos = None
1001 if hasattr(filepath, 'read'):
1002 file_pos = sf.tell()
1003 sf.seek(0, os.SEEK_SET)
1004 else:
1005 sf = open(filepath, 'rb')
1006 rate = None
1007 locs = np.zeros((0, 3), dtype=int)
1008 labels = np.zeros((0, 2), dtype=object)
1009 fsize = read_riff_header(sf)
1010 while (sf.tell() < fsize - 8):
1011 chunk = sf.read(4).decode('latin-1').upper()
1012 if chunk == 'FMT ':
1013 rate = read_format_chunk(sf)[1]
1014 elif chunk == 'CUE ':
1015 locs = read_cue_chunk(sf)
1016 elif chunk == 'PLST':
1017 read_playlist_chunk(sf, locs)
1018 elif chunk == 'LIST':
1019 labels = read_adtl_chunks(sf, locs, labels)
1020 elif chunk == 'LBL ':
1021 locs, labels = read_lbl_chunk(sf, rate)
1022 else:
1023 skip_chunk(sf)
1024 if file_pos is None:
1025 sf.close()
1026 else:
1027 sf.seek(file_pos, os.SEEK_SET)
1028 # sort markers according to their position:
1029 if len(locs) > 0:
1030 idxs = np.argsort(locs[:,-2])
1031 locs = locs[idxs,:]
1032 if len(labels) > 0:
1033 labels = labels[idxs,:]
1034 return locs[:,1:], labels
1037# Write RIFF/WAVE file:
1039def write_riff_chunk(df, filesize=0, tag='WAVE'):
1040 """Write RIFF file header.
1042 Parameters
1043 ----------
1044 df: stream
1045 File stream for writing RIFF file header.
1046 filesize: int
1047 Size of the file in bytes.
1048 tag: str
1049 The type of RIFF file. Default is a wave file.
1050 Exactly 4 characeters long.
1052 Returns
1053 -------
1054 n: int
1055 Number of bytes written to the stream.
1057 Raises
1058 ------
1059 ValueError
1060 `tag` is not 4 characters long.
1061 """
1062 if len(tag) != 4:
1063 raise ValueError(f'file tag "{tag}" must be exactly 4 characters long')
1064 if filesize < 8:
1065 filesize = 8
1066 df.write(b'RIFF')
1067 df.write(struct.pack('<I', filesize - 8))
1068 df.write(tag.encode('ascii', errors='strict'))
1069 return 12
1072def write_filesize(df, filesize=None):
1073 """Write the file size into the RIFF file header.
1075 Parameters
1076 ----------
1077 df: stream
1078 File stream into which to write `filesize`.
1079 filesize: int
1080 Size of the file in bytes. If not specified or 0,
1081 then use current size of the file.
1082 """
1083 pos = df.tell()
1084 if not filesize:
1085 df.seek(0, os.SEEK_END)
1086 filesize = df.tell()
1087 df.seek(4, os.SEEK_SET)
1088 df.write(struct.pack('<I', filesize - 8))
1089 df.seek(pos, os.SEEK_SET)
1092def write_chunk_name(df, pos, tag):
1093 """Change the name of a chunk.
1095 Use this to make the content of an existing chunk to be ignored by
1096 overwriting its name with an unknown one.
1098 Parameters
1099 ----------
1100 df: stream
1101 File stream.
1102 pos: int
1103 Position of the chunk in the file stream.
1104 tag: str
1105 The type of RIFF file. Default is a wave file.
1106 Exactly 4 characeters long.
1108 Raises
1109 ------
1110 ValueError
1111 `tag` is not 4 characters long.
1112 """
1113 if len(tag) != 4:
1114 raise ValueError(f'file tag "{tag}" must be exactly 4 characters long')
1115 df.seek(pos, os.SEEK_SET)
1116 df.write(tag.encode('ascii', errors='strict'))
1119def write_format_chunk(df, channels, frames, rate, bits=16):
1120 """Write format chunk.
1122 Parameters
1123 ----------
1124 df: stream
1125 File stream for writing FMT chunk.
1126 channels: int
1127 Number of channels contained in the data.
1128 frames: int
1129 Number of frames contained in the data.
1130 rate: int or float
1131 Sampling rate (frames per time) in Hertz.
1132 bits: 16 or 32
1133 Bit resolution of the data to be written.
1135 Returns
1136 -------
1137 n: int
1138 Number of bytes written to the stream.
1139 """
1140 blockalign = channels * (bits//8)
1141 byterate = int(rate) * blockalign
1142 df.write(b'fmt ')
1143 df.write(struct.pack('<IHHIIHH', 16, 1, channels, int(rate),
1144 byterate, blockalign, bits))
1145 return 8 + 16
1148def write_data_chunk(df, data, bits=16):
1149 """Write data chunk.
1151 Parameters
1152 ----------
1153 df: stream
1154 File stream for writing data chunk.
1155 data: 1-D or 2-D array of floats
1156 Data with first column time (frames) and optional second column
1157 channels with values between -1 and 1.
1158 bits: 16 or 32
1159 Bit resolution of the data to be written.
1161 Returns
1162 -------
1163 n: int
1164 Number of bytes written to the stream.
1165 """
1166 df.write(b'data')
1167 df.write(struct.pack('<I', data.size * (bits//8)))
1168 buffer = data * 2**(bits-1)
1169 n = df.write(buffer.astype(f'<i{bits//8}').tobytes('C'))
1170 return 8 + n
1173def write_info_chunk(df, metadata):
1174 """Write metadata to LIST INFO chunk.
1176 If `metadata` contains an 'INFO' key, then write the flat
1177 dictionary of this key as an INFO chunk. Otherwise, attempt to
1178 write all metadata items as an INFO chunk. The keys are translated
1179 via the `info_tags` variable back to INFO tags. If after
1180 translation any key is left that is longer than 4 characters or
1181 any key has a dictionary as a value (non-flat metadata), the INFO
1182 chunk is not written.
1184 See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags
1186 Parameters
1187 ----------
1188 df: stream
1189 File stream for writing INFO chunk.
1190 metadata: nested dict
1191 Metadata as key-value pairs. Values can be strings, integers,
1192 or dictionaries.
1194 Returns
1195 -------
1196 n: int
1197 Number of bytes written to the stream.
1198 keys_written: list of str
1199 Keys written to the INFO chunk.
1201 """
1202 if not metadata:
1203 return 0, []
1204 is_info = False
1205 if 'INFO' in metadata:
1206 metadata = metadata['INFO']
1207 is_info = True
1208 tags = {v: k for k, v in info_tags.items()}
1209 n = 0
1210 for k in metadata:
1211 kn = tags.get(k, k)
1212 if len(kn) > 4:
1213 if is_info:
1214 warnings.warn(f'no 4-character info tag for key "{k}" found.')
1215 return 0, []
1216 if isinstance(metadata[k], dict):
1217 if is_info:
1218 warnings.warn(f'value of key "{k}" in INFO chunk cannot be a dictionary.')
1219 return 0, []
1220 try:
1221 v = str(metadata[k]).encode('latin-1')
1222 except UnicodeEncodeError:
1223 v = str(metadata[k]).encode('windows-1252')
1224 n += 8 + len(v) + len(v) % 2
1225 df.write(b'LIST')
1226 df.write(struct.pack('<I', n + 4))
1227 df.write(b'INFO')
1228 keys_written = []
1229 for k in metadata:
1230 kn = tags.get(k, k)
1231 df.write(f'{kn:<4s}'.encode('latin-1'))
1232 try:
1233 v = str(metadata[k]).encode('latin-1')
1234 except UnicodeEncodeError:
1235 v = str(metadata[k]).encode('windows-1252')
1236 ns = len(v) + len(v) % 2
1237 if ns > len(v):
1238 v += b' ';
1239 df.write(struct.pack('<I', ns))
1240 df.write(v)
1241 keys_written.append(k)
1242 return 12 + n, ['INFO'] if is_info else keys_written
1245def write_bext_chunk(df, metadata):
1246 """Write metadata to BEXT chunk.
1248 If `metadata` contains a BEXT key, and this contains valid BEXT
1249 tags (one of the keys listed in the variable `bext_tags`), then
1250 write the dictionary of that key as a broadcast-audio extension
1251 chunk.
1253 See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.
1255 Parameters
1256 ----------
1257 df: stream
1258 File stream for writing BEXT chunk.
1259 metadata: nested dict
1260 Metadata as key-value pairs. Values can be strings, integers,
1261 or dictionaries.
1263 Returns
1264 -------
1265 n: int
1266 Number of bytes written to the stream.
1267 keys_written: list of str
1268 Keys written to the BEXT chunk.
1270 """
1271 if not metadata or not 'BEXT' in metadata:
1272 return 0, []
1273 metadata = metadata['BEXT']
1274 for k in metadata:
1275 if not k in bext_tags:
1276 warnings.warn(f'no bext tag for key "{k}" found.')
1277 return 0, []
1278 n = 0
1279 for k in bext_tags:
1280 n += bext_tags[k]
1281 ch = metadata.get('CodingHistory', '').encode('ascii', errors='replace')
1282 if len(ch) >= 2 and ch[-2:] != '\r\n':
1283 ch += b'\r\n'
1284 nch = len(ch) + len(ch) % 2
1285 n += nch
1286 df.write(b'BEXT')
1287 df.write(struct.pack('<I', n))
1288 for k in bext_tags:
1289 bn = bext_tags[k]
1290 if bn == 2:
1291 v = metadata.get(k, '0')
1292 df.write(struct.pack('<H', int(v)))
1293 elif bn == 8 and k == 'TimeReference':
1294 v = metadata.get(k, '0')
1295 df.write(struct.pack('<Q', int(v)))
1296 elif bn == 0:
1297 df.write(ch)
1298 df.write(bytes(nch - len(ch)))
1299 else:
1300 v = metadata.get(k, '').encode('ascii', errors='replace')
1301 df.write(v[:bn] + bytes(bn - len(v)))
1302 return 8 + n, ['BEXT']
1305def write_ixml_chunk(df, metadata, keys_written=None):
1306 """Write metadata to iXML chunk.
1308 If `metadata` contains an IXML key with valid IXML tags (one of
1309 those listed in the variable `ixml_tags`), or the remaining tags
1310 in `metadata` are valid IXML tags, then write an IXML chunk.
1312 See http://www.gallery.co.uk/ixml/ for the specification of iXML.
1314 Parameters
1315 ----------
1316 df: stream
1317 File stream for writing IXML chunk.
1318 metadata: nested dict
1319 Meta-data as key-value pairs. Values can be strings, integers,
1320 or dictionaries.
1321 keys_written: list of str
1322 Keys that have already written to INFO or BEXT chunk.
1324 Returns
1325 -------
1326 n: int
1327 Number of bytes written to the stream.
1328 keys_written: list of str
1329 Keys written to the IXML chunk.
1331 """
1332 def check_ixml(metadata):
1333 for k in metadata:
1334 if not k.upper() in ixml_tags:
1335 return False
1336 if isinstance(metadata[k], dict):
1337 if not check_ixml(metadata[k]):
1338 return False
1339 return True
1341 def build_xml(node, metadata):
1342 kw = []
1343 for k in metadata:
1344 e = ET.SubElement(node, k)
1345 if isinstance(metadata[k], dict):
1346 build_xml(e, metadata[k])
1347 else:
1348 e.text = str(metadata[k])
1349 kw.append(k)
1350 return kw
1352 if not metadata:
1353 return 0, []
1354 md = metadata
1355 if keys_written:
1356 md = {k: metadata[k] for k in metadata if not k in keys_written}
1357 if len(md) == 0:
1358 return 0, []
1359 has_ixml = False
1360 if 'IXML' in md and check_ixml(md['IXML']):
1361 md = md['IXML']
1362 has_ixml = True
1363 else:
1364 if not check_ixml(md):
1365 return 0, []
1366 root = ET.Element('BWFXML')
1367 kw = build_xml(root, md)
1368 bs = bytes(ET.tostring(root, xml_declaration=True,
1369 short_empty_elements=False))
1370 if len(bs) % 2 == 1:
1371 bs += bytes(1)
1372 df.write(b'IXML')
1373 df.write(struct.pack('<I', len(bs)))
1374 df.write(bs)
1375 return 8 + len(bs), ['IXML'] if has_ixml else kw
1378def write_guano_chunk(df, metadata, keys_written=None):
1379 """Write metadata to guan chunk.
1381 GUANO is the Grand Unified Acoustic Notation Ontology, an
1382 extensible, open format for embedding metadata within bat acoustic
1383 recordings. See https://github.com/riggsd/guano-spec for details.
1385 The GUANO specification allows for the inclusion of arbitrary
1386 nested keys and string encoded values. In that respect it is a
1387 well defined and easy to handle serialization of the [odML data
1388 model](https://doi.org/10.3389/fninf.2011.00016).
1390 This will write *all* metadata that are not in `keys_written`.
1392 Parameters
1393 ----------
1394 df: stream
1395 File stream for writing guano chunk.
1396 metadata: nested dict
1397 Metadata as key-value pairs. Values can be strings, integers,
1398 or dictionaries.
1399 keys_written: list of str
1400 Keys that have already written to INFO, BEXT, IXML chunk.
1402 Returns
1403 -------
1404 n: int
1405 Number of bytes written to the stream.
1406 keys_written: list of str
1407 Top-level keys written to the GUANO chunk.
1409 """
1410 if not metadata:
1411 return 0, []
1412 md = metadata
1413 if keys_written:
1414 md = {k: metadata[k] for k in metadata if not k in keys_written}
1415 if len(md) == 0:
1416 return 0, []
1417 fmd = flatten_metadata(md, True, '|')
1418 for k in fmd:
1419 if isinstance(fmd[k], str):
1420 fmd[k] = fmd[k].replace('\n', r'\n')
1421 sio = io.StringIO()
1422 m, k = find_key(md, 'GUANO.Version')
1423 if k is None:
1424 sio.write('GUANO|Version:1.0\n')
1425 for k in fmd:
1426 sio.write(f'{k}:{fmd[k]}\n')
1427 bs = sio.getvalue().encode('utf-8')
1428 if len(bs) % 2 == 1:
1429 bs += b' '
1430 n = len(bs)
1431 df.write(b'guan')
1432 df.write(struct.pack('<I', n))
1433 df.write(bs)
1434 return n, list(md)
1437def write_cue_chunk(df, locs):
1438 """Write marker positions to cue chunk.
1440 See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file
1442 Parameters
1443 ----------
1444 df: stream
1445 File stream for writing cue chunk.
1446 locs: None or 2-D array of ints
1447 Positions (first column) and spans (optional second column)
1448 for each marker (rows).
1450 Returns
1451 -------
1452 n: int
1453 Number of bytes written to the stream.
1454 """
1455 if locs is None or len(locs) == 0:
1456 return 0
1457 df.write(b'CUE ')
1458 df.write(struct.pack('<II', 4 + len(locs)*24, len(locs)))
1459 for i in range(len(locs)):
1460 df.write(struct.pack('<II4sIII', i, locs[i,0], b'data', 0, 0, 0))
1461 return 12 + len(locs)*24
1464def write_playlist_chunk(df, locs):
1465 """Write marker spans to playlist chunk.
1467 See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file
1469 Parameters
1470 ----------
1471 df: stream
1472 File stream for writing playlist chunk.
1473 locs: None or 2-D array of ints
1474 Positions (first column) and spans (optional second column)
1475 for each marker (rows).
1477 Returns
1478 -------
1479 n: int
1480 Number of bytes written to the stream.
1481 """
1482 if locs is None or len(locs) == 0 or locs.shape[1] < 2:
1483 return 0
1484 n_spans = np.sum(locs[:,1] > 0)
1485 if n_spans == 0:
1486 return 0
1487 df.write(b'plst')
1488 df.write(struct.pack('<II', 4 + n_spans*12, n_spans))
1489 for i in range(len(locs)):
1490 if locs[i,1] > 0:
1491 df.write(struct.pack('<III', i, locs[i,1], 1))
1492 return 12 + n_spans*12
1495def write_adtl_chunks(df, locs, labels):
1496 """Write associated data list chunks.
1498 See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file
1500 Parameters
1501 ----------
1502 df: stream
1503 File stream for writing adtl chunk.
1504 locs: None or 2-D array of ints
1505 Positions (first column) and spans (optional second column)
1506 for each marker (rows).
1507 labels: None or 2-D array of string objects
1508 Labels (first column) and texts (second column) for each marker (rows).
1510 Returns
1511 -------
1512 n: int
1513 Number of bytes written to the stream.
1514 """
1515 if labels is None or len(labels) == 0:
1516 return 0
1517 labels_size = 0
1518 for l in labels[:,0]:
1519 if hasattr(l, '__len__'):
1520 n = len(l)
1521 if n > 0:
1522 labels_size += 12 + n + n % 2
1523 text_size = 0
1524 if labels.shape[1] > 1:
1525 for t in labels[:,1]:
1526 if hasattr(t, '__len__'):
1527 n = len(t)
1528 if n > 0:
1529 text_size += 28 + n + n % 2
1530 if labels_size == 0 and text_size == 0:
1531 return 0
1532 size = 4 + labels_size + text_size
1533 spans = locs[:,1] if locs.shape[1] > 1 else None
1534 df.write(b'LIST')
1535 df.write(struct.pack('<I', size))
1536 df.write(b'adtl')
1537 for i in range(len(labels)):
1538 # labl sub-chunk:
1539 l = labels[i,0]
1540 if hasattr(l, '__len__'):
1541 n = len(l)
1542 if n > 0:
1543 n += n % 2
1544 df.write(b'labl')
1545 df.write(struct.pack('<II', 4 + n, i))
1546 df.write(f'{l:<{n}s}'.encode('latin-1', errors='replace'))
1547 # ltxt sub-chunk:
1548 if labels.shape[1] > 1:
1549 t = labels[i,1]
1550 if hasattr(t, '__len__'):
1551 n = len(t)
1552 if n > 0:
1553 n += n % 2
1554 span = spans[i] if spans is not None else 0
1555 df.write(b'ltxt')
1556 df.write(struct.pack('<III', 20 + n, i, span))
1557 df.write(struct.pack('<IHHHH', 0, 0, 0, 0, 0))
1558 df.write(f'{t:<{n}s}'.encode('latin-1', errors='replace'))
1559 return 8 + size
1562def write_lbl_chunk(df, locs, labels, rate):
1563 """Write marker positions, spans, labels, and texts to lbl chunk.
1565 The proprietary LBL chunk is specific to wave files generated by
1566 [AviSoft](www.avisoft.com) products.
1568 The labels (first column of `labels`) have special meanings.
1569 Markers with a span (a section label in the terminology of
1570 AviSoft) can be arranged in three levels when displayed:
1572 - "M": layer 1, the top level section
1573 - "N": layer 2, sections below layer 1
1574 - "O": layer 3, sections below layer 2
1575 - "P": total, section start and end are displayed with two vertical lines.
1577 All other labels mark single point labels with a time and a
1578 frequency (that we here discard). See also
1579 https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm
1581 If a marker has a span, and its label is not one of "M", "N", "O", or "P",
1582 then its label is set to "M".
1583 If a marker has no span, and its label is one of "M", "N", "O", or "P",
1584 then its label is set to "a".
1586 Parameters
1587 ----------
1588 df: stream
1589 File stream for writing lbl chunk.
1590 locs: None or 2-D array of ints
1591 Positions (first column) and spans (optional second column)
1592 for each marker (rows).
1593 labels: None or 2-D array of string objects
1594 Labels (first column) and texts (second column) for each marker (rows).
1595 rate: float
1596 Sampling rate of the data in Hertz.
1598 Returns
1599 -------
1600 n: int
1601 Number of bytes written to the stream.
1603 """
1604 if locs is None or len(locs) == 0:
1605 return 0
1606 size = (1 + len(locs)) * 65
1607 df.write(b'LBL ')
1608 df.write(struct.pack('<I', size))
1609 # first empty entry (this is ment to be a title for the whole wave file):
1610 df.write(b' ' * 63)
1611 df.write(b'\r\n')
1612 for k in range(len(locs)):
1613 t0 = locs[k,0]/rate
1614 t1 = t0
1615 t1 += locs[k,1]/rate
1616 ls = 'M' if locs[k,1] > 0 else 'a'
1617 ts = ''
1618 if labels is not None and len(labels) > k:
1619 ls = labels[k,0]
1620 if ls != 0 and len(ls) > 0:
1621 ls = ls[0]
1622 if ls in 'MNOP':
1623 if locs[k,1] == 0:
1624 ls = 'a'
1625 else:
1626 if locs[k,1] > 0:
1627 ls = 'M'
1628 ts = labels[k,1]
1629 if ts == 0:
1630 ts = ''
1631 df.write(struct.pack('<14sc', f'{t0:e}'.encode('ascii', errors='replace'), b'\t'))
1632 df.write(struct.pack('<14sc', f'{t1:e}'.encode('ascii', errors='replace'), b'\t'))
1633 bs = f'{ts:31s}\t{ls}\r\n'.encode('ascii', errors='replace')
1634 df.write(bs)
1635 return 8 + size
1638def append_metadata_riff(df, metadata):
1639 """Append metadata chunks to RIFF file.
1641 You still need to update the filesize by calling
1642 `write_filesize()`.
1644 Parameters
1645 ----------
1646 df: stream
1647 File stream for writing metadata chunks.
1648 metadata: None or nested dict
1649 Metadata as key-value pairs. Values can be strings, integers,
1650 or dictionaries.
1652 Returns
1653 -------
1654 n: int
1655 Number of bytes written to the stream.
1656 tags: list of str
1657 Tag names of chunks written to audio file.
1658 """
1659 if not metadata:
1660 return 0, []
1661 n = 0
1662 tags = []
1663 # metadata INFO chunk:
1664 nc, kw = write_info_chunk(df, metadata)
1665 if nc > 0:
1666 tags.append('LIST-INFO')
1667 n += nc
1668 # metadata BEXT chunk:
1669 nc, bkw = write_bext_chunk(df, metadata)
1670 if nc > 0:
1671 tags.append('BEXT')
1672 n += nc
1673 kw.extend(bkw)
1674 # metadata IXML chunk:
1675 nc, xkw = write_ixml_chunk(df, metadata, kw)
1676 if nc > 0:
1677 tags.append('IXML')
1678 n += nc
1679 kw.extend(xkw)
1680 # write remaining metadata to GUANO chunk:
1681 nc, _ = write_guano_chunk(df, metadata, kw)
1682 if nc > 0:
1683 tags.append('GUAN')
1684 n += nc
1685 kw.extend(bkw)
1686 return n, tags
1689def append_markers_riff(df, locs, labels=None, rate=None,
1690 marker_hint='cue'):
1691 """Append marker chunks to RIFF file.
1693 You still need to update the filesize by calling
1694 `write_filesize()`.
1696 Parameters
1697 ----------
1698 df: stream
1699 File stream for writing metadata chunks.
1700 locs: None or 1-D or 2-D array of ints
1701 Marker positions (first column) and spans (optional second column)
1702 for each marker (rows).
1703 labels: None or 1-D or 2-D array of string objects
1704 Labels (first column) and texts (optional second column)
1705 for each marker (rows).
1706 rate: float
1707 Sampling rate of the data in Hertz, needed for storing markers
1708 in seconds.
1709 marker_hint: str
1710 - 'cue': store markers in cue and and adtl chunks.
1711 - 'lbl': store markers in avisoft lbl chunk.
1713 Returns
1714 -------
1715 n: int
1716 Number of bytes written to the stream.
1717 tags: list of str
1718 Tag names of chunks written to audio file.
1720 Raises
1721 ------
1722 ValueError
1723 Encoding not supported.
1724 IndexError
1725 `locs` and `labels` differ in len.
1726 """
1727 if locs is None or len(locs) == 0:
1728 return 0, []
1729 if labels is not None and len(labels) > 0 and len(labels) != len(locs):
1730 raise IndexError(f'locs and labels must have same number of elements.')
1731 # make locs and labels 2-D:
1732 if not locs is None and locs.ndim == 1:
1733 locs = locs.reshape(-1, 1)
1734 if not labels is None and labels.ndim == 1:
1735 labels = labels.reshape(-1, 1)
1736 # sort markers according to their position:
1737 idxs = np.argsort(locs[:,0])
1738 locs = locs[idxs,:]
1739 if not labels is None and len(labels) > 0:
1740 labels = labels[idxs,:]
1741 n = 0
1742 tags = []
1743 if marker_hint.lower() == 'cue':
1744 # write marker positions:
1745 nc = write_cue_chunk(df, locs)
1746 if nc > 0:
1747 tags.append('CUE ')
1748 n += nc
1749 # write marker spans:
1750 nc = write_playlist_chunk(df, locs)
1751 if nc > 0:
1752 tags.append('PLST')
1753 n += nc
1754 # write marker labels:
1755 nc = write_adtl_chunks(df, locs, labels)
1756 if nc > 0:
1757 tags.append('LIST-ADTL')
1758 n += nc
1759 elif marker_hint.lower() == 'lbl':
1760 # write avisoft labels:
1761 nc = write_lbl_chunk(df, locs, labels, rate)
1762 if nc > 0:
1763 tags.append('LBL ')
1764 n += nc
1765 else:
1766 raise ValueError(f'marker_hint "{marker_hint}" not supported for storing markers')
1767 return n, tags
1770def write_wave(filepath, data, rate, metadata=None, locs=None,
1771 labels=None, encoding=None, marker_hint='cue'):
1772 """Write time series, metadata and markers to a WAVE file.
1774 Only 16 or 32bit PCM encoding is supported.
1776 Parameters
1777 ----------
1778 filepath: string
1779 Full path and name of the file to write.
1780 data: 1-D or 2-D array of floats
1781 Array with the data (first index time, second index channel,
1782 values within -1.0 and 1.0).
1783 rate: float
1784 Sampling rate of the data in Hertz.
1785 metadata: None or nested dict
1786 Metadata as key-value pairs. Values can be strings, integers,
1787 or dictionaries.
1788 locs: None or 1-D or 2-D array of ints
1789 Marker positions (first column) and spans (optional second column)
1790 for each marker (rows).
1791 labels: None or 1-D or 2-D array of string objects
1792 Labels (first column) and texts (optional second column)
1793 for each marker (rows).
1794 encoding: string or None
1795 Encoding of the data: 'PCM_32' or 'PCM_16'.
1796 If None or empty string use 'PCM_16'.
1797 marker_hint: str
1798 - 'cue': store markers in cue and and adtl chunks.
1799 - 'lbl': store markers in avisoft lbl chunk.
1801 Raises
1802 ------
1803 ValueError
1804 Encoding not supported.
1805 IndexError
1806 `locs` and `labels` differ in len.
1808 See Also
1809 --------
1810 audioio.audiowriter.write_audio()
1812 Examples
1813 --------
1814 ```
1815 import numpy as np
1816 from audioio.riffmetadata import write_wave
1818 rate = 28000.0
1819 freq = 800.0
1820 time = np.arange(0.0, 1.0, 1/rate) # one second
1821 data = np.sin(2.0*np.p*freq*time) # 800Hz sine wave
1822 md = dict(Artist='underscore_') # metadata
1824 write_wave('audio/file.wav', data, rate, md)
1825 ```
1826 """
1827 if not filepath:
1828 raise ValueError('no file specified!')
1829 if not encoding:
1830 encoding = 'PCM_16'
1831 encoding = encoding.upper()
1832 bits = 0
1833 if encoding == 'PCM_16':
1834 bits = 16
1835 elif encoding == 'PCM_32':
1836 bits = 32
1837 else:
1838 raise ValueError(f'file encoding {encoding} not supported')
1839 if locs is not None and len(locs) > 0 and \
1840 labels is not None and len(labels) > 0 and len(labels) != len(locs):
1841 raise IndexError(f'locs and labels must have same number of elements.')
1842 # write WAVE file:
1843 with open(filepath, 'wb') as df:
1844 write_riff_chunk(df)
1845 if data.ndim == 1:
1846 write_format_chunk(df, 1, len(data), rate, bits)
1847 else:
1848 write_format_chunk(df, data.shape[1], data.shape[0],
1849 rate, bits)
1850 append_metadata_riff(df, metadata)
1851 write_data_chunk(df, data, bits)
1852 append_markers_riff(df, locs, labels, rate, marker_hint)
1853 write_filesize(df)
1856def append_riff(filepath, metadata=None, locs=None, labels=None,
1857 rate=None, marker_hint='cue'):
1858 """Append metadata and markers to an existing RIFF file.
1860 Parameters
1861 ----------
1862 filepath: string
1863 Full path and name of the file to write.
1864 metadata: None or nested dict
1865 Metadata as key-value pairs. Values can be strings, integers,
1866 or dictionaries.
1867 locs: None or 1-D or 2-D array of ints
1868 Marker positions (first column) and spans (optional second column)
1869 for each marker (rows).
1870 labels: None or 1-D or 2-D array of string objects
1871 Labels (first column) and texts (optional second column)
1872 for each marker (rows).
1873 rate: float
1874 Sampling rate of the data in Hertz, needed for storing markers
1875 in seconds.
1876 marker_hint: str
1877 - 'cue': store markers in cue and and adtl chunks.
1878 - 'lbl': store markers in avisoft lbl chunk.
1880 Returns
1881 -------
1882 n: int
1883 Number of bytes written to the stream.
1885 Raises
1886 ------
1887 IndexError
1888 `locs` and `labels` differ in len.
1890 Examples
1891 --------
1892 ```
1893 import numpy as np
1894 from audioio.riffmetadata import append_riff
1896 md = dict(Artist='underscore_') # metadata
1897 append_riff('audio/file.wav', md) # append them to existing audio file
1898 ```
1899 """
1900 if not filepath:
1901 raise ValueError('no file specified!')
1902 if locs is not None and len(locs) > 0 and \
1903 labels is not None and len(labels) > 0 and len(labels) != len(locs):
1904 raise IndexError(f'locs and labels must have same number of elements.')
1905 # check RIFF file:
1906 chunks = read_chunk_tags(filepath)
1907 # append to RIFF file:
1908 n = 0
1909 with open(filepath, 'r+b') as df:
1910 tags = []
1911 df.seek(0, os.SEEK_END)
1912 nc, tgs = append_metadata_riff(df, metadata)
1913 n += nc
1914 tags.extend(tgs)
1915 nc, tgs = append_markers_riff(df, locs, labels, rate, marker_hint)
1916 n += nc
1917 tags.extend(tgs)
1918 write_filesize(df)
1919 # blank out already existing chunks:
1920 for tag in chunks:
1921 if tag in tags:
1922 if '-' in tag:
1923 xtag = tag[5:7] + 'xx'
1924 else:
1925 xtag = tag[:2] + 'xx'
1926 write_chunk_name(df, chunks[tag][0], xtag)
1927 return 0
1930def demo(filepath):
1931 """Print metadata and markers of a RIFF/WAVE file.
1933 Parameters
1934 ----------
1935 filepath: string
1936 Path of a RIFF/WAVE file.
1937 """
1938 def print_meta_data(meta_data, level=0):
1939 for sk in meta_data:
1940 md = meta_data[sk]
1941 if isinstance(md, dict):
1942 print(f'{"":<{level*4}}{sk}:')
1943 print_meta_data(md, level+1)
1944 else:
1945 v = str(md).replace('\n', '.').replace('\r', '.')
1946 print(f'{"":<{level*4}s}{sk:<20s}: {v}')
1948 # read meta data:
1949 meta_data = metadata_riff(filepath, store_empty=False)
1951 # print meta data:
1952 print()
1953 print('metadata:')
1954 print_meta_data(meta_data)
1956 # read cues:
1957 locs, labels = markers_riff(filepath)
1959 # print marker table:
1960 if len(locs) > 0:
1961 print()
1962 print('markers:')
1963 print(f'{"position":10} {"span":8} {"label":10} {"text":10}')
1964 for i in range(len(locs)):
1965 if i < len(labels):
1966 print(f'{locs[i,0]:10} {locs[i,1]:8} {labels[i,0]:10} {labels[i,1]:30}')
1967 else:
1968 print(f'{locs[i,0]:10} {locs[i,1]:8} {"-":10} {"-":10}')
1971def main(*args):
1972 """Call demo with command line arguments.
1974 Parameters
1975 ----------
1976 args: list of strings
1977 Command line arguments as returned by sys.argv[1:]
1978 """
1979 if len(args) > 0 and (args[0] == '-h' or args[0] == '--help'):
1980 print()
1981 print('Usage:')
1982 print(' python -m src.audioio.riffmetadata [--help] <audio/file.wav>')
1983 print()
1984 return
1986 if len(args) > 0:
1987 demo(args[0])
1988 else:
1989 rate = 44100
1990 t = np.arange(0, 2, 1/rate)
1991 x = np.sin(2*np.pi*440*t)
1992 imd = dict(IENG='JB', ICRD='2024-01-24', RATE=9,
1993 Comment='this is test1')
1994 bmd = dict(Description='a recording',
1995 OriginationDate='2024:01:24', TimeReference=123456,
1996 Version=42, CodingHistory='Test1\nTest2')
1997 xmd = dict(Project='Record all', Note='still testing',
1998 Sync_Point_List=dict(Sync_Point=1,
1999 Sync_Point_Comment='great'))
2000 omd = imd.copy()
2001 omd['Production'] = bmd
2002 md = dict(INFO=imd, BEXT=bmd, IXML=xmd,
2003 Recording=omd, Notes=xmd)
2004 locs = np.random.randint(10, len(x)-10, (5, 2))
2005 locs = locs[np.argsort(locs[:,0]),:]
2006 locs[:,1] = np.random.randint(0, 20, len(locs))
2007 labels = np.zeros((len(locs), 2), dtype=object)
2008 for i in range(len(labels)):
2009 labels[i,0] = chr(ord('a') + i % 26)
2010 labels[i,1] = chr(ord('A') + i % 26)*5
2011 write_wave('test.wav', x, rate, md, locs, labels)
2012 demo('test.wav')
2015if __name__ == "__main__":
2016 main(*sys.argv[1:])