Coverage for src/audioio/riffmetadata.py: 97%
727 statements
coverage.py v7.9.2, created at 2025-07-03 21:40 +0000
"""Read and write metadata and marker lists of RIFF-based files.

Container files of the Resource Interchange File Format (RIFF) like
WAVE files may contain sections (called chunks) with metadata and
markers in addition to the timeseries (audio) data and the necessary
specifications of sampling rate, bit depth, etc.

## Metadata

There are various types of chunks for storing metadata, like the [INFO
list](https://www.recordingblogs.com/wiki/list-chunk-of-a-wave-file),
[broadcast-audio extension
(BEXT)](https://tech.ebu.ch/docs/tech/tech3285.pdf) chunk, or
[iXML](http://www.gallery.co.uk/ixml/) chunks. These chunks contain
metadata as key-value pairs. Since wave files are primarily designed
for music, valid keys in these chunks are restricted to topics from
music and music production. Some keys are also useful for science,
but more keys are needed. It is possible to extend the INFO
list keys, but these keys are restricted to four characters and the
INFO list chunk does not allow for hierarchical metadata either. The
other metadata chunks, in particular the BEXT chunk, cannot be
extended. With standard chunks, not all types of metadata can be
stored.

The [GUANO (Grand Unified Acoustic Notation
Ontology)](https://github.com/riggsd/guano-spec), primarily designed
for bat acoustic recordings, has some standard ontologies that are of
much more interest in a scientific context. In addition, GUANO allows
for extensions with arbitrary nested keys and string-encoded values.
In that respect it is a well-defined and easy-to-handle serialization
of the [odML data model](https://doi.org/10.3389/fninf.2011.00016).
We use GUANO to write all metadata that do not fit into the INFO, BEXT
or iXML chunks into a WAVE file.

To interface the various ways to store and read metadata of RIFF
files, the `riffmetadata` module simply uses nested dictionaries. The
keys are always strings. Values are strings or integers for key-value
pairs. Value strings can also be numbers followed by a unit. Values
can also be dictionaries for defining subsections of key-value
pairs. The dictionaries can be nested to arbitrary depth.
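
As a minimal sketch of this convention, the following plain-Python
dictionary shows how such nested metadata might look. All section
names and values are invented for illustration, and `count_values()`
is a hypothetical helper that is not part of this module.

```python
# Hypothetical metadata; all values are invented for illustration.
metadata = {
    'INFO': {                      # flat section of key-value pairs
        'Artist': 'Jane Doe',
        'Title': 'Test recording',
    },
    'Recording': {                 # free-form section
        'Gain': '20dB',            # number followed by a unit
        'Channels': 2,             # integer value
        'Location': {              # nested subsection
            'Country': 'Lesotho',
            'Depth': '1.5m',
        },
    },
}

def count_values(md):
    # Count all key-value pairs, descending into subsections.
    n = 0
    for value in md.values():
        n += count_values(value) if isinstance(value, dict) else 1
    return n

print(count_values(metadata))
```

A value that is itself a dictionary opens a subsection, so walking the
metadata is a simple recursion.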

The `write_wave()` function first tries to write an INFO list
chunk. It checks for a key "INFO" with a flat dictionary of key-value
pairs. It then translates all keys of this dictionary using the
`info_tags` mapping. If all the resulting keys have no more than four
characters and there are no subsections, then an INFO list chunk is
written. If no "INFO" key exists, then with the same procedure all
elements of the provided metadata are checked for being valid INFO
tags, and on success an INFO list chunk is written. Then, in similar
ways, `write_wave()` tries to assemble valid BEXT and iXML chunks,
based on the tags in `bext_tags` and `ixml_tags`. All remaining
metadata are then stored in a GUANO chunk.
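
The INFO check described above could be sketched roughly as follows.
This is not the code of `write_wave()`: `to_info_tags()` is a
hypothetical helper, and `info_tags_excerpt` is a three-entry excerpt
of the `info_tags` mapping.

```python
# Excerpt of the module's `info_tags` mapping (tag -> readable key).
info_tags_excerpt = dict(IART='Artist', INAM='Title', ICMT='Comment')

def to_info_tags(section):
    # Hypothetical helper: translate readable keys to INFO tags.
    # Returns a dict of tag -> value, or None if `section` contains
    # subsections or keys that do not fit into four characters.
    reverse = {name: tag for tag, name in info_tags_excerpt.items()}
    tags = {}
    for key, value in section.items():
        if isinstance(value, dict):
            return None            # subsections are not allowed
        tag = reverse.get(key, key)
        if len(tag) > 4:
            return None            # not a valid INFO tag
        tags[tag] = value
    return tags

print(to_info_tags({'Artist': 'Jane Doe', 'GAIN': '20dB'}))
print(to_info_tags({'Artist': 'Jane Doe', 'Temperature': '21C'}))
```

The first call yields tags that fit into an INFO list chunk. The
second returns `None` because `Temperature` has no four-character
tag, so such metadata would have to go into another chunk.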

When reading metadata from a RIFF file, INFO, BEXT and iXML chunks are
returned as subsections with the respective keys. Metadata from a
GUANO chunk are stored directly in the metadata dictionary without
marking them as GUANO.

## Markers

A number of different chunk types exist for handling markers or cues
that mark specific events or regions in the audio data. In the end,
each marker has a position, a span, a label, and a text. Position
and span are stored in 1-D or 2-D arrays of ints, where each row is
a marker and the columns are position and span. The span column is
optional. Labels and texts come in another 1-D or 2-D array of objects
pointing to strings. Again, rows are the markers, the first column
holds the labels, and the second column the optional texts. Try to
keep the labels short, and use text for longer descriptions, if
necessary.
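
A small sketch of this layout, with invented markers. Plain nested
lists are used here so that the sketch runs without NumPy;
`np.array(locs, dtype=int)` and `np.array(labels, dtype=object)`
would turn them into arrays of the kind described above.

```python
# Hypothetical markers: [position, span] in frames.
locs = [[48000, 0],       # point marker
        [12000, 2400]]    # region spanning 2400 frames
# [label, text] for each marker, in the same order as `locs`.
labels = [['call', ''],
          ['chirp', 'first chirp of the recording']]

# Sort both lists by marker position.
order = sorted(range(len(locs)), key=lambda i: locs[i][0])
locs = [locs[i] for i in order]
labels = [labels[i] for i in order]
print(locs[0], labels[0])
```

Positions and labels stay aligned row by row because both lists are
reordered with the same index list.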

## Read metadata and markers

- `metadata_riff()`: read metadata from a RIFF/WAVE file.
- `markers_riff()`: read markers from a RIFF/WAVE file.

## Write data, metadata and markers

- `write_wave()`: write time series, metadata and markers to a WAVE file.
- `append_metadata_riff()`: append metadata chunks to RIFF file.
- `append_markers_riff()`: append marker chunks to RIFF file.
- `append_riff()`: append metadata and markers to an existing RIFF file.

## Helper functions for reading RIFF and WAVE files

- `read_chunk_tags()`: read tags of all chunks contained in a RIFF file.
- `read_riff_header()`: read and check the RIFF file header.
- `skip_chunk()`: skip over unknown RIFF chunk.
- `read_format_chunk()`: read format chunk.
- `read_info_chunks()`: read in metadata from INFO list chunk.
- `read_bext_chunk()`: read in metadata from the broadcast-audio extension chunk.
- `read_ixml_chunk()`: read in metadata from an iXML chunk.
- `read_guano_chunk()`: read in metadata from a GUANO chunk.
- `read_cue_chunk()`: read in marker positions from cue chunk.
- `read_playlist_chunk()`: read in marker spans from playlist chunk.
- `read_adtl_chunks()`: read in associated data list chunks.
- `read_lbl_chunk()`: read in marker positions, spans, labels, and texts from lbl chunk.

## Helper functions for writing RIFF and WAVE files

- `write_riff_chunk()`: write RIFF file header.
- `write_filesize()`: write the file size into the RIFF file header.
- `write_chunk_name()`: change the name of a chunk.
- `write_format_chunk()`: write format chunk.
- `write_data_chunk()`: write data chunk.
- `write_info_chunk()`: write metadata to LIST INFO chunk.
- `write_bext_chunk()`: write metadata to BEXT chunk.
- `write_ixml_chunk()`: write metadata to iXML chunk.
- `write_guano_chunk()`: write metadata to GUANO chunk.
- `write_cue_chunk()`: write marker positions to cue chunk.
- `write_playlist_chunk()`: write marker spans to playlist chunk.
- `write_adtl_chunks()`: write associated data list chunks.
- `write_lbl_chunk()`: write marker positions, spans, labels, and texts to lbl chunk.

## Demo

- `demo()`: print metadata and marker list of RIFF/WAVE file.
- `main()`: call demo with command line arguments.

## Descriptions of the RIFF/WAVE file format

- https://de.wikipedia.org/wiki/RIFF_WAVE
- http://www.piclist.com/techref/io/serial/midi/wave.html
- https://moddingwiki.shikadi.net/wiki/Resource_Interchange_File_Format_(RIFF)
- https://www.recordingblogs.com/wiki/wave-file-format
- http://fhein.users.ak.tu-berlin.de/Alias/Studio/ProTools/audio-formate/wav/overview.html
- http://www.gallery.co.uk/ixml/

For INFO tag names see:

- https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags

"""
134import io
135import os
136import sys
137import warnings
138import struct
139import numpy as np
140import xml.etree.ElementTree as ET
141from .audiometadata import flatten_metadata, unflatten_metadata, find_key
144info_tags = dict(AGES='Rated',
145 CMNT='Comment',
146 CODE='EncodedBy',
147 COMM='Comments',
148 DIRC='Directory',
149 DISP='SoundSchemeTitle',
150 DTIM='DateTimeOriginal',
151 GENR='Genre',
152 IARL='ArchivalLocation',
153 IART='Artist',
154 IAS1='FirstLanguage',
155 IAS2='SecondLanguage',
156 IAS3='ThirdLanguage',
157 IAS4='FourthLanguage',
158 IAS5='FifthLanguage',
159 IAS6='SixthLanguage',
160 IAS7='SeventhLanguage',
161 IAS8='EighthLanguage',
162 IAS9='NinthLanguage',
163 IBSU='BaseURL',
164 ICAS='DefaultAudioStream',
165 ICDS='CostumeDesigner',
166 ICMS='Commissioned',
167 ICMT='Comment',
168 ICNM='Cinematographer',
169 ICNT='Country',
170 ICOP='Copyright',
171 ICRD='DateCreated',
172 ICRP='Cropped',
173 IDIM='Dimensions',
174 IDIT='DateTimeOriginal',
175 IDPI='DotsPerInch',
176 IDST='DistributedBy',
177 IEDT='EditedBy',
178 IENC='EncodedBy',
179 IENG='Engineer',
180 IGNR='Genre',
181 IKEY='Keywords',
182 ILGT='Lightness',
183 ILGU='LogoURL',
184 ILIU='LogoIconURL',
185 ILNG='Language',
186 IMBI='MoreInfoBannerImage',
187 IMBU='MoreInfoBannerURL',
188 IMED='Medium',
189 IMIT='MoreInfoText',
190 IMIU='MoreInfoURL',
191 IMUS='MusicBy',
192 INAM='Title',
193 IPDS='ProductionDesigner',
194 IPLT='NumColors',
195 IPRD='Product',
196 IPRO='ProducedBy',
197 IRIP='RippedBy',
198 IRTD='Rating',
199 ISBJ='Subject',
200 ISFT='Software',
201 ISGN='SecondaryGenre',
202 ISHP='Sharpness',
203 ISMP='TimeCode',
204 ISRC='Source',
205 ISRF='SourceFrom',
206 ISTD='ProductionStudio',
207 ISTR='Starring',
208 ITCH='Technician',
209 ITRK='TrackNumber',
210 IWMU='WatermarkURL',
211 IWRI='WrittenBy',
212 LANG='Language',
213 LOCA='Location',
214 PRT1='Part',
215 PRT2='NumberOfParts',
216 RATE='Rate',
217 STAR='Starring',
218 STAT='Statistics',
219 TAPE='TapeName',
220 TCDO='EndTimecode',
221 TCOD='StartTimecode',
222 TITL='Title',
223 TLEN='Length',
224 TORG='Organization',
225 TRCK='TrackNumber',
226 TURL='URL',
227 TVER='Version',
228 VMAJ='VegasVersionMajor',
229 VMIN='VegasVersionMinor',
230 YEAR='Year',
231 # extensions from
232 # [TeeRec](https://github.com/janscience/TeeRec/):
233 BITS='Bits',
234 PINS='Pins',
235 AVRG='Averaging',
236 CNVS='ConversionSpeed',
237 SMPS='SamplingSpeed',
238 VREF='ReferenceVoltage',
239 GAIN='Gain',
240 UWRP='UnwrapThreshold',
241 UWPC='UnwrapClippedAmplitude',
242 IBRD='uCBoard',
243 IMAC='MACAdress',
244 CPUF='CPU frequency')
245"""Dictionary with known tags of the INFO chunk as keys and their description as value.
247See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags
248"""
250bext_tags = dict(
251 Description=256,
252 Originator=32,
253 OriginatorReference=32,
254 OriginationDate=10,
255 OriginationTime=8,
256 TimeReference=8,
257 Version=2,
258 UMID=64,
259 LoudnessValue=2,
260 LoudnessRange=2,
261 MaxTruePeakLevel=2,
262 MaxMomentaryLoudness=2,
263 MaxShortTermLoudness=2,
264 Reserved=180,
265 CodingHistory=0)
266"""Dictionary with tags of the BEXT chunk as keys and their size in bytes as values.
268See https://tech.ebu.ch/docs/tech/tech3285.pdf
269"""
271ixml_tags = [
272 'BWFXML',
273 'IXML_VERSION',
274 'PROJECT',
275 'SCENE',
276 'TAPE',
277 'TAKE',
278 'TAKE_TYPE',
279 'NO_GOOD',
280 'FALSE_START',
281 'WILD_TRACK',
282 'CIRCLED',
283 'FILE_UID',
284 'UBITS',
285 'NOTE',
286 'SYNC_POINT_LIST',
287 'SYNC_POINT_COUNT',
288 'SYNC_POINT',
289 'SYNC_POINT_TYPE',
290 'SYNC_POINT_FUNCTION',
291 'SYNC_POINT_COMMENT',
292 'SYNC_POINT_LOW',
293 'SYNC_POINT_HIGH',
294 'SYNC_POINT_EVENT_DURATION',
295 'SPEED',
296 'MASTER_SPEED',
297 'CURRENT_SPEED',
298 'TIMECODE_RATE',
299 'TIMECODE_FLAGS',
300 'FILE_SAMPLE_RATE',
301 'AUDIO_BIT_DEPTH',
302 'DIGITIZER_SAMPLE_RATE',
303 'TIMESTAMP_SAMPLES_SINCE_MIDNIGHT_HI',
304 'TIMESTAMP_SAMPLES_SINCE_MIDNIGHT_LO',
305 'TIMESTAMP_SAMPLE_RATE',
306 'LOUDNESS',
307 'LOUDNESS_VALUE',
308 'LOUDNESS_RANGE',
309 'MAX_TRUE_PEAK_LEVEL',
310 'MAX_MOMENTARY_LOUDNESS',
311 'MAX_SHORT_TERM_LOUDNESS',
312 'HISTORY',
313 'ORIGINAL_FILENAME',
314 'PARENT_FILENAME',
315 'PARENT_UID',
316 'FILE_SET',
317 'TOTAL_FILES',
318 'FAMILY_UID',
319 'FAMILY_NAME',
320 'FILE_SET_INDEX',
321 'TRACK_LIST',
322 'TRACK_COUNT',
323 'TRACK',
324 'CHANNEL_INDEX',
325 'INTERLEAVE_INDEX',
326 'NAME',
327 'FUNCTION',
328 'PRE_RECORD_SAMPLECOUNT',
329 'BEXT',
330 'BWF_DESCRIPTION',
331 'BWF_ORIGINATOR',
332 'BWF_ORIGINATOR_REFERENCE',
333 'BWF_ORIGINATION_DATE',
334 'BWF_ORIGINATION_TIME',
335 'BWF_TIME_REFERENCE_LOW',
336 'BWF_TIME_REFERENCE_HIGH',
337 'BWF_VERSION',
338 'BWF_UMID',
339 'BWF_RESERVED',
340 'BWF_CODING_HISTORY',
341 'BWF_LOUDNESS_VALUE',
342 'BWF_LOUDNESS_RANGE',
343 'BWF_MAX_TRUE_PEAK_LEVEL',
344 'BWF_MAX_MOMENTARY_LOUDNESS',
345 'BWF_MAX_SHORT_TERM_LOUDNESS',
346 'USER',
347 'FULL_TITLE',
348 'DIRECTOR_NAME',
349 'PRODUCTION_NAME',
350 'PRODUCTION_ADDRESS',
351 'PRODUCTION_EMAIL',
352 'PRODUCTION_PHONE',
353 'PRODUCTION_NOTE',
354 'SOUND_MIXER_NAME',
355 'SOUND_MIXER_ADDRESS',
356 'SOUND_MIXER_EMAIL',
357 'SOUND_MIXER_PHONE',
358 'SOUND_MIXER_NOTE',
359 'AUDIO_RECORDER_MODEL',
360 'AUDIO_RECORDER_SERIAL_NUMBER',
361 'AUDIO_RECORDER_FIRMWARE',
362 'LOCATION',
363 'LOCATION_NAME',
364 'LOCATION_GPS',
365 'LOCATION_ALTITUDE',
366 'LOCATION_TYPE',
367 'LOCATION_TIME',
368 ]
369"""List with valid tags of the iXML chunk.
371See http://www.gallery.co.uk/ixml/
372"""
375# Read RIFF/WAVE files:
377def read_riff_header(sf, tag=None):
378 """Read and check the RIFF file header.
380 Parameters
381 ----------
382 sf: stream
383 File stream of RIFF/WAVE file.
384 tag: None or str
385 If supplied, check whether it matches the subchunk tag.
386 If it does not match, raise a ValueError.
388 Returns
389 -------
390 filesize: int
391 Size of the RIFF file in bytes.
393 Raises
394 ------
395 ValueError
396 Not a RIFF file or subchunk tag does not match `tag`.
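
Examples
--------
A minimal sketch of the 12-byte header layout that this function
expects. It builds a header in memory and parses it directly with
`struct` instead of calling `read_riff_header()`, so that it runs
without the package installed. The 20-byte payload is an arbitrary
stand-in for real chunks.

```python
import io
import struct

# RIFF header: 'RIFF', size of the rest of the file, form type.
payload = bytes(20)                          # stand-in for all chunks
header = b'RIFF' + struct.pack('<I', 4 + len(payload)) + b'WAVE'
sf = io.BytesIO(header + payload)

magic = sf.read(4).decode('latin-1')
# The size field does not count the first 8 bytes of the file.
filesize = struct.unpack('<I', sf.read(4))[0] + 8
form_type = sf.read(4).decode('latin-1')
print(magic, filesize, form_type)
```

Adding 8 to the stored size gives the total file size in bytes, which
is what `filesize` refers to above.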
397 """
398 riffs = sf.read(4).decode('latin-1')
399 if riffs != 'RIFF':
400 raise ValueError('Not a RIFF file.')
401 fsize = struct.unpack('<I', sf.read(4))[0] + 8
402 subtag = sf.read(4).decode('latin-1')
403 if tag is not None and subtag != tag:
404 raise ValueError(f'Not a {tag} file.')
405 return fsize
408def skip_chunk(sf):
409 """Skip over unknown RIFF chunk.
411 Parameters
412 ----------
413 sf: stream
414 File stream of RIFF file.
416 Returns
417 -------
418 size: int
419 The size of the skipped chunk in bytes.
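
Examples
--------
A sketch of the word alignment that this function takes care of:
chunk data with an odd number of bytes are followed by one pad byte
that is not counted in the size field. The chunk names `abcd` and
`efgh` are made up, and the steps are written out inline rather than
calling `skip_chunk()`.

```python
import io
import os
import struct

# Two chunks: 'abcd' with 3 data bytes plus 1 pad byte, then 'efgh'.
data = (b'abcd' + struct.pack('<I', 3) + b'xyz' + bytes(1) +
        b'efgh' + struct.pack('<I', 2) + b'hi')
sf = io.BytesIO(data)

sf.read(4)                                   # tag of the first chunk
size = struct.unpack('<I', sf.read(4))[0]
size += size % 2                             # round up to an even size
sf.seek(size, os.SEEK_CUR)
next_tag = sf.read(4)
print(size, next_tag)
```

Without the rounding step the stream would end up on the pad byte and
the next tag would be read one byte too early.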
420 """
421 size = struct.unpack('<I', sf.read(4))[0]
422 size += size % 2
423 sf.seek(size, os.SEEK_CUR)
424 return size
427def read_chunk_tags(filepath):
428 """Read tags of all chunks contained in a RIFF file.
430 Parameters
431 ----------
432 filepath: string or file handle
433 The RIFF file.
435 Returns
436 -------
437 tags: dict
438 Keys are the tag names of the chunks found in the file. If the
439 chunk is a list chunk, then the list type is added with a dash
440 to the key, i.e. "LIST-INFO". Values are tuples with the
441 corresponding file positions of the data of the chunk (after
442 the tag and the chunk size field) and the size of the chunk
443 data. The file position of the next chunk is thus the position
444 of the chunk plus the size of its data. Advance another 8 bytes
445 to get to the data of the next chunk.
446 The total file size is the sum of the chunk sizes of each tag
447 incremented by eight plus another 12 bytes of the riff header.
449 Raises
450 ------
451 ValueError
452 Not a RIFF file.
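
Examples
--------
A self-contained sketch of the bookkeeping described under Returns,
using an in-memory file with two invented chunks. The loop mirrors
the traversal for non-LIST chunks only and does not call
`read_chunk_tags()`; `chunk()` is a hypothetical helper.

```python
import io
import os
import struct

def chunk(tag, data):
    # Assemble a chunk: tag, size field, data, and a pad byte if needed.
    return tag + struct.pack('<I', len(data)) + data + bytes(len(data) % 2)

chunks = chunk(b'fmt ', bytes(16)) + chunk(b'data', bytes(5))
riff = b'RIFF' + struct.pack('<I', 4 + len(chunks)) + b'WAVE' + chunks

sf = io.BytesIO(riff)
sf.seek(12)                                  # skip the RIFF header
tags = {}
while sf.tell() < len(riff) - 8:
    tag = sf.read(4).decode('latin-1').upper()
    size = struct.unpack('<I', sf.read(4))[0]
    size += size % 2                         # include the pad byte
    tags[tag] = (sf.tell(), size)            # position and size of the data
    sf.seek(size, os.SEEK_CUR)

total = 12 + sum(size + 8 for pos, size in tags.values())
print(tags, total == len(riff))
```

Each entry holds the position of the chunk data and its padded size,
so the sum of all sizes plus 8 bytes per chunk plus the 12 header
bytes reproduces the file size.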
454 """
455 tags = {}
456 sf = filepath
457 file_pos = None
458 if hasattr(filepath, 'read'):
459 file_pos = sf.tell()
460 sf.seek(0, os.SEEK_SET)
461 else:
462 sf = open(filepath, 'rb')
463 fsize = read_riff_header(sf)
464 while (sf.tell() < fsize - 8):
465 chunk = sf.read(4).decode('latin-1').upper()
466 size = struct.unpack('<I', sf.read(4))[0]
467 size += size % 2
468 fp = sf.tell()
469 if chunk == 'LIST':
470 subchunk = sf.read(4).decode('latin-1').upper()
471 tags[chunk + '-' + subchunk] = (fp, size)
472 size -= 4
473 else:
474 tags[chunk] = (fp, size)
475 sf.seek(size, os.SEEK_CUR)
476 if file_pos is None:
477 sf.close()
478 else:
479 sf.seek(file_pos, os.SEEK_SET)
480 return tags
483def read_format_chunk(sf):
484 """Read format chunk.
486 Parameters
487 ----------
488 sf: stream
489 File stream for reading FMT chunk at the position of the chunk's size field.
491 Returns
492 -------
493 channels: int
494 Number of channels.
495 rate: float
496 Sampling rate (frames per time) in Hertz.
497 bits: int
498 Bit resolution.
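
Examples
--------
A sketch of the first 16 bytes of a format chunk as they are unpacked
by this function, here for 16-bit stereo PCM data at 44.1 kHz. The
values are chosen for illustration; format code 1 denotes
uncompressed PCM.

```python
import struct

# Format chunk body for 16-bit stereo PCM at 44.1 kHz.
channels, rate, bits = 2, 44100, 16
blockalign = channels * bits // 8            # bytes per frame
byterate = rate * blockalign                 # bytes per second
body = struct.pack('<HHIIHH', 1, channels, rate, byterate, blockalign, bits)

fields = struct.unpack('<HHIIHH', body)
print(len(body), fields)
```

Of the six fields only the number of channels, the sampling rate, and
the bit resolution are returned.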
499 """
500 size = struct.unpack('<I', sf.read(4))[0]
501 size += size % 2
502 ccode, channels, rate, byterate, blockalign, bits = struct.unpack('<HHIIHH', sf.read(16))
503 if size > 16:
504 sf.read(size - 16)
505 return channels, float(rate), bits
508def read_info_chunks(sf, store_empty):
509 """Read in meta data from info list chunk.
511 The variable `info_tags` is used to map the 4 character tags to
512 human readable key names.
514 See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags
516 Parameters
517 ----------
518 sf: stream
519 File stream of RIFF file at the position of the chunk's size field.
520 store_empty: bool
521 If `False` do not add meta data with empty values.
523 Returns
524 -------
525 metadata: dict
526 Dictionary with key-value pairs of info tags.
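
Examples
--------
A sketch of how the encoding of a value is chosen. Bytes in the range
0x80 to 0x9f are control characters in Latin-1 but printable
characters in Windows-1252, so their presence is taken as a hint for
Windows-1252. `decode_info()` is a hypothetical helper written
without NumPy, not a function of this module.

```python
def decode_info(bs):
    # Prefer Windows-1252 when bytes from its extra range are present.
    if any(0x80 <= b <= 0x9f for b in bs):
        return bs.decode('windows-1252')
    return bs.decode('latin1')

euro = decode_info(bytes([0x35, 0x20, 0x80]))   # 0x80: euro sign in Windows-1252
cafe = decode_info(bytes([0x63, 0x61, 0x66, 0xe9]))  # 0xe9: e-acute in both
print(len(euro), len(cafe))
```

Note that Python's `windows-1252` codec leaves five bytes of that
range (0x81, 0x8d, 0x8f, 0x90, 0x9d) undefined, so values containing
them raise a `UnicodeDecodeError`.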
528 """
529 md = {}
530 list_size = struct.unpack('<I', sf.read(4))[0]
531 list_type = sf.read(4).decode('latin-1').upper()
532 list_size -= 4
533 if list_type == 'INFO':
534 while list_size >= 8:
535 key = sf.read(4).decode('ascii').rstrip(' \x00')
536 size = struct.unpack('<I', sf.read(4))[0]
537 size += size % 2
538 bs = sf.read(size)
539 x = np.frombuffer(bs, dtype=np.uint8)
540 if np.sum((x >= 0x80) & (x <= 0x9f)) > 0:
541 s = bs.decode('windows-1252')
542 else:
543 s = bs.decode('latin1')
544 value = s.rstrip(' \x00\x02')
545 list_size -= 8 + size
546 if key in info_tags:
547 key = info_tags[key]
548 if value or store_empty:
549 md[key] = value
550 if list_size > 0: # finish or skip
551 sf.seek(list_size, os.SEEK_CUR)
552 return md
555def read_bext_chunk(sf, store_empty=True):
556 """Read in metadata from the broadcast-audio extension chunk.
558 The variable `bext_tags` lists all valid BEXT fields and their size.
560 See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.
562 Parameters
563 ----------
564 sf: stream
565 File stream of RIFF file at the position of the chunk's size field.
566 store_empty: bool
567 If `False` do not add meta data with empty values.
569 Returns
570 -------
571 meta_data: dict
572 The meta-data of a BEXT chunk are stored in a flat dictionary
573 with the following keys:
575 - 'Description': a free description of the sequence.
576 - 'Originator': name of the originator/ producer of the audio file.
577 - 'OriginatorReference': unambiguous reference allocated by the originating organisation.
578 - 'OriginationDate': date of creation of audio sequence in yyyy:mm:dd.
579 - 'OriginationTime': time of creation of audio sequence in hh:mm:ss.
580 - 'TimeReference': first sample since midnight.
581 - 'Version': version of the BWF.
582 - 'UMID': unique material identifier.
583 - 'LoudnessValue': integrated loudness value.
584 - 'LoudnessRange': loudness range.
585 - 'MaxTruePeakLevel': maximum true peak value in dBTP.
586 - 'MaxMomentaryLoudness': highest value of the momentary loudness level.
587 - 'MaxShortTermLoudness': highest value of the short-term loudness level.
588 - 'Reserved': 180 bytes reserved for extension.
589 - 'CodingHistory': description of coding processes applied to the audio data, with comma separated subfields: "A=" coding algorithm, e.g. PCM, "F=" sampling rate in Hertz, "B=" bit-rate for MPEG files, "W=" word length in bits, "M=" mono, stereo, dual-mono, joint-stereo, "T=" free text.
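
Examples
--------
A sketch that checks the size of the fixed part of the BEXT chunk
with a single `struct` format string assembled from the field sizes
in `bext_tags`. The format string and the chunk size of 702 bytes are
assumptions made for this illustration; the function itself reads the
fields one by one.

```python
import struct

# Fixed-size fields of the BEXT chunk, in the order listed above.
bext_format = '<256s32s32s10s8sQH64s5h180s'
fixed_size = struct.calcsize(bext_format)
print(fixed_size)

# Any bytes beyond the fixed part belong to CodingHistory.
chunk_size = 702
print(chunk_size - fixed_size)
```

The fixed part takes 602 bytes, so whatever remains of the chunk is
the variable-length coding history.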
590 """
591 md = {}
592 size = struct.unpack('<I', sf.read(4))[0]
593 size += size % 2
594 s = sf.read(256).decode('ascii').strip(' \x00')
595 if s or store_empty:
596 md['Description'] = s
597 s = sf.read(32).decode('ascii').strip(' \x00')
598 if s or store_empty:
599 md['Originator'] = s
600 s = sf.read(32).decode('ascii').strip(' \x00')
601 if s or store_empty:
602 md['OriginatorReference'] = s
603 s = sf.read(10).decode('ascii').strip(' \x00')
604 if s or store_empty:
605 md['OriginationDate'] = s
606 s = sf.read(8).decode('ascii').strip(' \x00')
607 if s or store_empty:
608 md['OriginationTime'] = s
609 reference, version = struct.unpack('<QH', sf.read(10))
610 if reference > 0 or store_empty:
611 md['TimeReference'] = reference
612 if version > 0 or store_empty:
613 md['Version'] = version
614 s = sf.read(64).decode('ascii').strip(' \x00')
615 if s or store_empty:
616 md['UMID'] = s
617 lvalue, lrange, peak, momentary, shortterm = struct.unpack('<hhhhh', sf.read(10))
618 if lvalue > 0 or store_empty:
619 md['LoudnessValue'] = lvalue
620 if lrange > 0 or store_empty:
621 md['LoudnessRange'] = lrange
622 if peak > 0 or store_empty:
623 md['MaxTruePeakLevel'] = peak
624 if momentary > 0 or store_empty:
625 md['MaxMomentaryLoudness'] = momentary
626 if shortterm > 0 or store_empty:
627 md['MaxShortTermLoudness'] = shortterm
628 s = sf.read(180).decode('ascii').strip(' \x00')
629 if s or store_empty:
630 md['Reserved'] = s
631 size -= 256 + 32 + 32 + 10 + 8 + 8 + 2 + 64 + 10 + 180
632 s = sf.read(size).decode('ascii').strip(' \x00\n\r')
633 if s or store_empty:
634 md['CodingHistory'] = s
635 return md
638def read_ixml_chunk(sf, store_empty=True):
639 """Read in metadata from an IXML chunk.
641 See the variable `ixml_tags` for a list of valid tags.
643 See http://www.gallery.co.uk/ixml/ for the specification of iXML.
645 Parameters
646 ----------
647 sf: stream
648 File stream of RIFF file at the position of the chunk's size field.
649 store_empty: bool
650 If `False` do not add meta data with empty values.
652 Returns
653 -------
654 metadata: nested dict
655 Dictionary with key-value pairs.
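
Examples
--------
A simplified sketch of how an iXML document maps onto nested
dictionaries. `to_dict()` is a hypothetical variant of the nested
`parse_ixml()` helper: it checks for child elements first and stores
empty strings for empty elements. The XML snippet is invented.

```python
import xml.etree.ElementTree as ET

def to_dict(element):
    # Convert the children of an XML element into a nested dictionary.
    md = {}
    for e in element:
        if len(e) > 0:
            md[e.tag] = to_dict(e)           # element with children
        else:
            md[e.tag] = e.text or ''         # leaf element
    return md

xml = ('<BWFXML><PROJECT>Bats</PROJECT>'
       '<SPEED><TIMECODE_RATE>25/1</TIMECODE_RATE></SPEED></BWFXML>')
md = to_dict(ET.fromstring(xml))
print(md)
```

The `BWFXML` root element itself does not show up in the result, in
line with the function dropping that level.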
656 """
658 def parse_ixml(element, store_empty=True):
659 md = {}
660 for e in element:
661 if e.text is not None:
662 md[e.tag] = e.text
663 elif len(e) > 0:
664 md[e.tag] = parse_ixml(e, store_empty)
665 elif store_empty:
666 md[e.tag] = ''
667 return md
669 size = struct.unpack('<I', sf.read(4))[0]
670 size += size % 2
671 xmls = sf.read(size).decode('latin-1').rstrip(' \x00')
672 root = ET.fromstring(xmls)
673 md = {root.tag: parse_ixml(root, store_empty)}
674 if len(md) == 1 and 'BWFXML' in md:
675 md = md['BWFXML']
676 return md
679def read_guano_chunk(sf):
680 """Read in metadata from a GUANO chunk.
682 GUANO is the Grand Unified Acoustic Notation Ontology, an
683 extensible, open format for embedding metadata within bat acoustic
684 recordings. See https://github.com/riggsd/guano-spec for details.
686 The GUANO specification allows for the inclusion of arbitrary
687 nested keys and string encoded values. In that respect it is a
688 well defined and easy to handle serialization of the [odML data
689 model](https://doi.org/10.3389/fninf.2011.00016).
691 Parameters
692 ----------
693 sf: stream
694 File stream of RIFF file at the position of the chunk's size field.
696 Returns
697 -------
698 metadata: nested dict
699 Dictionary with key-value pairs.
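
Examples
--------
A simplified sketch of the parsing step. `parse_guano()` is a
hypothetical stand-in that splits each line at the first colon and
nests keys at the `|` separator. How `unflatten_metadata()` does this
in detail is an assumption here, and the `Loc|Site|Name` key is
invented.

```python
def parse_guano(lines, sep='|'):
    # Parse 'key: value' lines into a nested dictionary.
    md = {}
    for line in lines:
        key, colon, value = line.partition(':')
        if not colon:
            continue                         # not a key-value line
        *sections, name = key.strip().split(sep)
        d = md
        for s in sections:
            d = d.setdefault(s, {})          # descend into subsection
        d[name] = value.strip()
    return md

lines = ['GUANO|Version: 1.0',
         'Timestamp: 2025-07-03T21:40:00',
         'Loc|Site|Name: Pond']
md = parse_guano(lines)
print(md)
```

Splitting at the first colon only keeps values that contain colons
themselves, like time stamps, intact.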
701 """
702 md = {}
703 size = struct.unpack('<I', sf.read(4))[0]
704 size += size % 2
705 for line in io.StringIO(sf.read(size).decode('utf-8')):
706 ss = line.split(':')
707 if len(ss) > 1:
708 md[ss[0].strip()] = ':'.join(ss[1:]).strip().replace(r'\n', '\n')
709 return unflatten_metadata(md, '|')
712def read_cue_chunk(sf):
713 """Read in marker positions from cue chunk.
715 See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file
717 Parameters
718 ----------
719 sf: stream
720 File stream of RIFF file at the position of the chunk's size field.
722 Returns
723 -------
724 locs: 2-D array of ints
725 Each row is a marker with unique identifier in the first column,
726 position in the second column, and span in the third column.
727 The cue chunk does not encode spans, so the third column is
728 initialized with zeros.
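
Examples
--------
A sketch of the byte layout of a cue chunk with a single cue point of
24 bytes. The values are invented, and the chunk is parsed with
`struct` directly instead of calling `read_cue_chunk()`.

```python
import struct

# One cue point: id, position, data chunk id, chunk start, block start, offset.
cue_point = struct.pack('<II4sIII', 1, 48000, b'data', 0, 0, 48000)
body = struct.pack('<I', 1) + cue_point      # number of cue points, then points
chunk = b'cue ' + struct.pack('<I', len(body)) + body

size, n = struct.unpack('<II', chunk[4:12])
cpid, cppos, chunkid = struct.unpack('<II4s', chunk[12:24])
print(size, n, cpid, cppos, chunkid)
```

Only the identifier and the position of each cue point that refers to
the data chunk are kept by the function.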
729 """
730 locs = []
731 size, n = struct.unpack('<II', sf.read(8))
732 for c in range(n):
733 cpid, cppos = struct.unpack('<II', sf.read(8))
734 datachunkid = sf.read(4).decode('latin-1').rstrip(' \x00').upper()
735 chunkstart, blockstart, offset = struct.unpack('<III', sf.read(12))
736 if datachunkid == 'DATA':
737 locs.append((cpid, cppos, 0))
738 return np.array(locs, dtype=int)
741def read_playlist_chunk(sf, locs):
742 """Read in marker spans from playlist chunk.
744 See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file
746 Parameters
747 ----------
748 sf: stream
749 File stream of RIFF file at the position of the chunk's size field.
750 locs: 2-D array of ints
751 Markers as returned by the `read_cue_chunk()` function.
752 Each row is a marker with unique identifier in the first column,
753 position in the second column, and span in the third column.
754 The span is read in from the playlist chunk.
755 """
756 if len(locs) == 0:
757 warnings.warn('read_playlist_chunk() requires markers from a previous cue chunk')
758 size, n = struct.unpack('<II', sf.read(8))
759 for p in range(n):
760 cpid, length, repeats = struct.unpack('<III', sf.read(12))
761 i = np.where(locs[:,0] == cpid)[0]
762 if len(i) > 0:
763 locs[i[0], 2] = length
766def read_adtl_chunks(sf, locs, labels):
767 """Read in associated data list chunks.
769 See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file
771 Parameters
772 ----------
773 sf: stream
774 File stream of RIFF file at the position of the chunk's size field.
775 locs: 2-D array of ints
776 Markers as returned by the `read_cue_chunk()` function.
777 Each row is a marker with unique identifier in the first column,
778 position in the second column, and span in the third column.
779 The span is read in from the LTXT chunk.
780 labels: 2-D array of string objects
781 Labels (first column) and texts (second column) for each marker (rows)
782 from previous LABL, NOTE, and LTXT chunks.
784 Returns
785 -------
786 labels: 2-D array of string objects
787 Labels (first column) and texts (second column) for each marker (rows)
788 from LABL, NOTE (first column), and LTXT chunks (last column).
789 """
790 list_size = struct.unpack('<I', sf.read(4))[0]
791 list_type = sf.read(4).decode('latin-1').upper()
792 list_size -= 4
793 if list_type == 'ADTL':
794 if len(locs) == 0:
795 warnings.warn('read_adtl_chunks() requires markers from a previous cue chunk')
796 if len(labels) == 0:
797 labels = np.zeros((len(locs), 2), dtype=object)
798 while list_size >= 8:
799 key = sf.read(4).decode('latin-1').rstrip(' \x00').upper()
800 size, cpid = struct.unpack('<II', sf.read(8))
801 size += size % 2 - 4
802 if key == 'LABL' or key == 'NOTE':
803 label = sf.read(size).decode('latin-1').rstrip(' \x00')
804 i = np.where(locs[:,0] == cpid)[0]
805 if len(i) > 0:
806 i = i[0]
807 if hasattr(labels[i,0], '__len__') and len(labels[i,0]) > 0:
808 labels[i,0] += '|' + label
809 else:
810 labels[i,0] = label
811 elif key == 'LTXT':
812 length = struct.unpack('<I', sf.read(4))[0]
813 sf.read(12) # skip fields
814 text = sf.read(size - 4 - 12).decode('latin-1').rstrip(' \x00')
815 i = np.where(locs[:,0] == cpid)[0]
816 if len(i) > 0:
817 i = i[0]
818 if hasattr(labels[i,1], '__len__') and len(labels[i,1]) > 0:
819 labels[i,1] += '|' + text
820 else:
821 labels[i,1] = text
822 locs[i,2] = length
823 else:
824 sf.read(size)
825 list_size -= 12 + size
826 if list_size > 0: # finish or skip
827 sf.seek(list_size, os.SEEK_CUR)
828 return labels
831def read_lbl_chunk(sf, rate):
832 """Read in marker positions, spans, labels, and texts from lbl chunk.
834 The proprietary LBL chunk is specific to wave files generated by
835 [AviSoft](https://www.avisoft.com) products.
837 The labels (first column of `labels`) have special meanings.
838 Markers with a span (a section label in the terminology of
839 AviSoft) can be arranged in three levels when displayed:
841 - "M": layer 1, the top level section
842 - "N": layer 2, sections below layer 1
843 - "O": layer 3, sections below layer 2
844 - "P": total, section start and end are displayed with two vertical lines.
846 All other labels mark single point labels with a time and a
847 frequency (that we here discard). See also
848 https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm
850 Parameters
851 ----------
852 sf: stream
853 File stream of RIFF file at the position of the chunk's size field.
854 rate: float
855 Sampling rate of the data in Hertz.
857 Returns
858 -------
859 locs: 2-D array of ints
860 Each row is a marker with unique identifier (simply integers
861 enumerating the markers) in the first column, position in the
862 second column, and span in the third column.
863 labels: 2-D array of string objects
864 Labels (first column) and texts (second column) for
865 each marker (rows).
867 """
868 size = struct.unpack('<I', sf.read(4))[0]
869 nn = size // 65
870 locs = np.zeros((nn, 3), dtype=int)
871 labels = np.zeros((nn, 2), dtype=object)
872 n = 0
873 for c in range(nn):
874 line = sf.read(65).decode('ascii')
875 fields = line.split('\t')
876 if len(fields) >= 4:
877 labels[n,0] = fields[3].strip()
878 labels[n,1] = fields[2].strip()
879 start_idx = int(np.round(float(fields[0].strip('\x00'))*rate))
880 end_idx = int(np.round(float(fields[1].strip('\x00'))*rate))
881 locs[n,0] = n
882 locs[n,1] = start_idx
883 if labels[n,0] in 'MNOP':
884 locs[n,2] = end_idx - start_idx
885 else:
886 locs[n,2] = 0
887 n += 1
888 else:
889 # The first 65 bytes are a title string that applies to
890 # the whole wave file and can be set from the AviSoft
891 # software. The recorder leaves this empty.
892 pass
893 return locs[:n,:], labels[:n,:]
896def metadata_riff(filepath, store_empty=False):
897 """Read metadata from a RIFF/WAVE file.
899 Parameters
900 ----------
901 filepath: string or file handle
902 The RIFF file.
903 store_empty: bool
904 If `False` do not add meta data with empty values.
906 Returns
907 -------
908 meta_data: nested dict
909 Meta data contained in the RIFF file. Keys of the nested
910 dictionaries are always strings. If the corresponding
911 values are dictionaries, then the key is the section name
912 of the metadata contained in the dictionary. All other
913 types of values are values for the respective key. In
914 particular they are strings, or list of strings. But other
915 simple types like ints or floats are also allowed.
916 First level contains sections of meta data
917 (e.g. keys 'INFO', 'BEXT', 'IXML', values are dictionaries).
919 Raises
920 ------
921 ValueError
922 Not a RIFF file.
924 Examples
925 --------
926 ```
927 from audioio.riffmetadata import metadata_riff
928 from audioio import print_metadata
930 md = metadata_riff('audio/file.wav')
931 print_metadata(md)
932 ```
933 """
934 meta_data = {}
935 sf = filepath
936 file_pos = None
937 if hasattr(filepath, 'read'):
938 file_pos = sf.tell()
939 sf.seek(0, os.SEEK_SET)
940 else:
941 sf = open(filepath, 'rb')
942 fsize = read_riff_header(sf)
943 while (sf.tell() < fsize - 8):
944 chunk = sf.read(4).decode('latin-1').upper()
945 if chunk == 'LIST':
946 md = read_info_chunks(sf, store_empty)
947 if len(md) > 0:
948 meta_data['INFO'] = md
949 elif chunk == 'BEXT':
950 md = read_bext_chunk(sf, store_empty)
951 if len(md) > 0:
952 meta_data['BEXT'] = md
953 elif chunk == 'IXML':
954 md = read_ixml_chunk(sf, store_empty)
955 if len(md) > 0:
956 meta_data['IXML'] = md
957 elif chunk == 'GUAN':
958 md = read_guano_chunk(sf)
959 if len(md) > 0:
960 meta_data.update(md)
961 else:
962 skip_chunk(sf)
963 if file_pos is None:
964 sf.close()
965 else:
966 sf.seek(file_pos, os.SEEK_SET)
967 return meta_data
970def markers_riff(filepath):
971 """Read markers from a RIFF/WAVE file.
973 Parameters
974 ----------
975 filepath: string or file handle
976 The RIFF file.
978 Returns
979 -------
980 locs: 2-D array of ints
981 Marker positions (first column) and spans (second column)
982 for each marker (rows).
983 labels: 2-D array of string objects
984 Labels (first column) and texts (second column)
985 for each marker (rows).
987 Raises
988 ------
989 ValueError
990 Not a RIFF file.
992 Examples
993 --------
994 ```
995 from audioio.riffmetadata import markers_riff
996 from audioio import print_markers
998 locs, labels = markers_riff('audio/file.wav')
999 print_markers(locs, labels)
1000 ```
1001 """
1002 sf = filepath
1003 file_pos = None
1004 if hasattr(filepath, 'read'):
1005 file_pos = sf.tell()
1006 sf.seek(0, os.SEEK_SET)
1007 else:
1008 sf = open(filepath, 'rb')
1009 rate = None
1010 locs = np.zeros((0, 3), dtype=int)
1011 labels = np.zeros((0, 2), dtype=object)
1012 fsize = read_riff_header(sf)
1013 while (sf.tell() < fsize - 8):
1014 chunk = sf.read(4).decode('latin-1').upper()
1015 if chunk == 'FMT ':
1016 rate = read_format_chunk(sf)[1]
1017 elif chunk == 'CUE ':
1018 locs = read_cue_chunk(sf)
1019 elif chunk == 'PLST':
1020 read_playlist_chunk(sf, locs)
1021 elif chunk == 'LIST':
1022 labels = read_adtl_chunks(sf, locs, labels)
1023 elif chunk == 'LBL ':
1024 locs, labels = read_lbl_chunk(sf, rate)
1025 else:
1026 skip_chunk(sf)
1027 if file_pos is None:
1028 sf.close()
1029 else:
1030 sf.seek(file_pos, os.SEEK_SET)
1031 # sort markers according to their position:
1032 if len(locs) > 0:
1033 idxs = np.argsort(locs[:,-2])
1034 locs = locs[idxs,:]
1035 if len(labels) > 0:
1036 labels = labels[idxs,:]
1037 return locs[:,1:], labels
1040# Write RIFF/WAVE file:
1042def write_riff_chunk(df, filesize=0, tag='WAVE'):
1043 """Write RIFF file header.
1045 Parameters
1046 ----------
1047 df: stream
1048 File stream for writing RIFF file header.
1049 filesize: int
1050 Size of the file in bytes.
1051 tag: str
1052 The type of RIFF file. Default is a wave file.
1053 Exactly 4 characters long.
1055 Returns
1056 -------
1057 n: int
1058 Number of bytes written to the stream.
1060 Raises
1061 ------
1062 ValueError
1063 `tag` is not 4 characters long.
1064 """
1065 if len(tag) != 4:
1066 raise ValueError(f'file tag "{tag}" must be exactly 4 characters long')
1067 if filesize < 8:
1068 filesize = 8
1069 df.write(b'RIFF')
1070 df.write(struct.pack('<I', filesize - 8))
1071 df.write(tag.encode('ascii', errors='strict'))
1072 return 12
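The header layout written above can be illustrated with a small stand-alone sketch. It uses only the standard library; `riff_header` is a hypothetical helper name, not part of audioio. It assumes, as in the listing, that the size field holds the file size minus the 8 bytes taken by the `RIFF` id and the size field itself.

```python
import struct

def riff_header(filesize, tag='WAVE'):
    """Build a 12-byte RIFF header (hypothetical helper mirroring the listing)."""
    if len(tag) != 4:
        raise ValueError('tag must be exactly 4 characters long')
    # The size field excludes the 'RIFF' id and the size field itself:
    size = max(filesize, 8) - 8
    return b'RIFF' + struct.pack('<I', size) + tag.encode('ascii')

header = riff_header(44)
riff_id, size, form = struct.unpack('<4sI4s', header)
print(riff_id, size, form)
```

For a 44-byte file, the size field in this sketch is 36.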
1075def write_filesize(df, filesize=None):
1076 """Write the file size into the RIFF file header.
1078 Parameters
1079 ----------
1080 df: stream
1081 File stream into which to write `filesize`.
1082 filesize: int
1083 Size of the file in bytes. If not specified or 0,
1084 then use current size of the file.
1085 """
1086 pos = df.tell()
1087 if not filesize:
1088 df.seek(0, os.SEEK_END)
1089 filesize = df.tell()
1090 df.seek(4, os.SEEK_SET)
1091 df.write(struct.pack('<I', filesize - 8))
1092 df.seek(pos, os.SEEK_SET)
1095def write_chunk_name(df, pos, tag):
1096 """Change the name of a chunk.
1098 Overwrite the name of an existing chunk with an unknown one so that
1099 readers ignore its content.
1101 Parameters
1102 ----------
1103 df: stream
1104 File stream.
1105 pos: int
1106 Position of the chunk in the file stream.
1107 tag: str
1108 The new name of the chunk.
1109 Exactly 4 characters long.
1111 Raises
1112 ------
1113 ValueError
1114 `tag` is not 4 characters long.
1115 """
1116 if len(tag) != 4:
1117 raise ValueError(f'file tag "{tag}" must be exactly 4 characters long')
1118 df.seek(pos, os.SEEK_SET)
1119 df.write(tag.encode('ascii', errors='strict'))
1122def write_format_chunk(df, channels, frames, rate, bits=16):
1123 """Write format chunk.
1125 Parameters
1126 ----------
1127 df: stream
1128 File stream for writing FMT chunk.
1129 channels: int
1130 Number of channels contained in the data.
1131 frames: int
1132 Number of frames contained in the data.
1133 rate: int or float
1134 Sampling rate (frames per time) in Hertz.
1135 bits: 16 or 32
1136 Bit resolution of the data to be written.
1138 Returns
1139 -------
1140 n: int
1141 Number of bytes written to the stream.
1142 """
1143 blockalign = channels * (bits//8)
1144 byterate = int(rate) * blockalign
1145 df.write(b'fmt ')
1146 df.write(struct.pack('<IHHIIHH', 16, 1, channels, int(rate),
1147 byterate, blockalign, bits))
1148 return 8 + 16
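A worked example of the arithmetic in the format chunk may help. This is a stand-alone sketch with assumed example values (stereo, 44100 Hz, 16 bit). It packs the same fields as the listing and reads them back.

```python
import struct

channels, rate, bits = 2, 44100, 16    # assumed example values
blockalign = channels * (bits // 8)    # bytes per frame: 2 * 2 = 4
byterate = int(rate) * blockalign      # bytes per second: 44100 * 4 = 176400

# 8 bytes of chunk header (id and size) plus 16 bytes of PCM format fields:
chunk = b'fmt ' + struct.pack('<IHHIIHH', 16, 1, channels, int(rate),
                              byterate, blockalign, bits)
fields = struct.unpack('<4sIHHIIHH', chunk)
print(len(chunk), fields)
```

The second field after the chunk size is the format code, where 1 means uncompressed PCM.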
1151def write_data_chunk(df, data, bits=16):
1152 """Write data chunk.
1154 Parameters
1155 ----------
1156 df: stream
1157 File stream for writing data chunk.
1158 data: 1-D or 2-D array of floats
1159 Data with time (frames) on the first axis and, optionally, channels
1160 on the second axis. Values are between -1 and 1.
1161 bits: 16 or 32
1162 Bit resolution of the data to be written.
1164 Returns
1165 -------
1166 n: int
1167 Number of bytes written to the stream.
1168 """
1169 df.write(b'data')
1170 df.write(struct.pack('<I', data.size * (bits//8)))
1171 buffer = data * 2**(bits-1)
1172 n = df.write(buffer.astype(f'<i{bits//8}').tobytes('C'))
1173 return 8 + n
1176def write_info_chunk(df, metadata, size=None):
1177 """Write metadata to LIST INFO chunk.
1179 If `metadata` contains an 'INFO' key, then write the flat
1180 dictionary of this key as an INFO chunk. Otherwise, attempt to
1181 write all metadata items as an INFO chunk. The keys are translated
1182 via the `info_tags` variable back to INFO tags. If after
1183 translation any key is left that is longer than 4 characters or
1184 any key has a dictionary as a value (non-flat metadata), the INFO
1185 chunk is not written.
1187 See https://exiftool.org/TagNames/RIFF.html#Info for valid info tags.
1189 Parameters
1190 ----------
1191 df: stream
1192 File stream for writing INFO chunk.
1193 metadata: nested dict
1194 Metadata as key-value pairs. Values can be strings, integers,
1195 or dictionaries.
1196 size: int or None
1197 If specified write this size into the list's size field.
1199 Returns
1200 -------
1201 n: int
1202 Number of bytes written to the stream.
1203 keys_written: list of str
1204 Keys written to the INFO chunk.
1206 """
1207 if not metadata:
1208 return 0, []
1209 is_info = False
1210 if 'INFO' in metadata:
1211 metadata = metadata['INFO']
1212 is_info = True
1213 tags = {v: k for k, v in info_tags.items()}
1214 n = 0
1215 for k in metadata:
1216 kn = tags.get(k, k)
1217 if len(kn) > 4:
1218 if is_info:
1219 warnings.warn(f'no 4-character info tag for key "{k}" found.')
1220 return 0, []
1221 if isinstance(metadata[k], dict):
1222 if is_info:
1223 warnings.warn(f'value of key "{k}" in INFO chunk cannot be a dictionary.')
1224 return 0, []
1225 try:
1226 v = str(metadata[k]).encode('latin-1')
1227 except UnicodeEncodeError:
1228 v = str(metadata[k]).encode('windows-1252')
1229 n += 8 + len(v) + len(v) % 2
1230 df.write(b'LIST')
1231 df.write(struct.pack('<I', size if size is not None else n + 4))
1232 df.write(b'INFO')
1233 keys_written = []
1234 for k in metadata:
1235 kn = tags.get(k, k)
1236 df.write(f'{kn:<4s}'.encode('latin-1'))
1237 try:
1238 v = str(metadata[k]).encode('latin-1')
1239 except UnicodeEncodeError:
1240 v = str(metadata[k]).encode('windows-1252')
1241 ns = len(v) + len(v) % 2
1242 if ns > len(v):
1243 v += b' '
1244 df.write(struct.pack('<I', ns))
1245 df.write(v)
1246 keys_written.append(k)
1247 return 12 + n, ['INFO'] if is_info else keys_written
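The size bookkeeping of the INFO list can be checked with a short stand-alone sketch. `info_subchunk` is a hypothetical helper, not the library's function. Like the listing, it pads odd-length values with a space and counts the padding in the sub-chunk size, so each entry occupies `8 + len(v) + len(v) % 2` bytes.

```python
import struct

def info_subchunk(tag, text):
    """One INFO entry: 4-byte tag, 4-byte size, value padded to even length."""
    v = text.encode('latin-1')
    if len(v) % 2 == 1:
        v += b' '              # pad to even length, as in the listing
    return tag.encode('latin-1') + struct.pack('<I', len(v)) + v

subs = info_subchunk('IART', 'JB') + info_subchunk('ICMT', 'abc')
# The list size covers the 'INFO' id (4 bytes) plus all sub-chunks:
chunk = b'LIST' + struct.pack('<I', 4 + len(subs)) + b'INFO' + subs
print(len(subs), len(chunk))
```

Here the two entries take 10 and 12 bytes, so the whole chunk is 12 + 22 = 34 bytes long.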
1250def write_bext_chunk(df, metadata):
1251 """Write metadata to BEXT chunk.
1253 If `metadata` contains a BEXT key, and this contains valid BEXT
1254 tags (one of the keys listed in the variable `bext_tags`), then
1255 write the dictionary of that key as a broadcast-audio extension
1256 chunk.
1258 See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.
1260 Parameters
1261 ----------
1262 df: stream
1263 File stream for writing BEXT chunk.
1264 metadata: nested dict
1265 Metadata as key-value pairs. Values can be strings, integers,
1266 or dictionaries.
1268 Returns
1269 -------
1270 n: int
1271 Number of bytes written to the stream.
1272 keys_written: list of str
1273 Keys written to the BEXT chunk.
1275 """
1276 if not metadata or not 'BEXT' in metadata:
1277 return 0, []
1278 metadata = metadata['BEXT']
1279 for k in metadata:
1280 if not k in bext_tags:
1281 warnings.warn(f'no bext tag for key "{k}" found.')
1282 return 0, []
1283 n = 0
1284 for k in bext_tags:
1285 n += bext_tags[k]
1286 ch = metadata.get('CodingHistory', '').encode('ascii', errors='replace')
1287 if len(ch) >= 2 and ch[-2:] != b'\r\n':
1288 ch += b'\r\n'
1289 nch = len(ch) + len(ch) % 2
1290 n += nch
1291 df.write(b'BEXT')
1292 df.write(struct.pack('<I', n))
1293 for k in bext_tags:
1294 bn = bext_tags[k]
1295 if bn == 2:
1296 v = metadata.get(k, '0')
1297 df.write(struct.pack('<H', int(v)))
1298 elif bn == 8 and k == 'TimeReference':
1299 v = metadata.get(k, '0')
1300 df.write(struct.pack('<Q', int(v)))
1301 elif bn == 0:
1302 df.write(ch)
1303 df.write(bytes(nch - len(ch)))
1304 else:
1305 v = metadata.get(k, '').encode('ascii', errors='replace')
1306 df.write(v[:bn] + bytes(bn - len(v)))
1307 return 8 + n, ['BEXT']
1310def write_ixml_chunk(df, metadata, keys_written=None):
1311 """Write metadata to iXML chunk.
1313 If `metadata` contains an IXML key with valid IXML tags (one of
1314 those listed in the variable `ixml_tags`), or the remaining tags
1315 in `metadata` are valid IXML tags, then write an IXML chunk.
1317 See http://www.gallery.co.uk/ixml/ for the specification of iXML.
1319 Parameters
1320 ----------
1321 df: stream
1322 File stream for writing IXML chunk.
1323 metadata: nested dict
1324 Metadata as key-value pairs. Values can be strings, integers,
1325 or dictionaries.
1326 keys_written: list of str
1327 Keys that have already been written to the INFO or BEXT chunk.
1329 Returns
1330 -------
1331 n: int
1332 Number of bytes written to the stream.
1333 keys_written: list of str
1334 Keys written to the IXML chunk.
1336 """
1337 def check_ixml(metadata):
1338 for k in metadata:
1339 if not k.upper() in ixml_tags:
1340 return False
1341 if isinstance(metadata[k], dict):
1342 if not check_ixml(metadata[k]):
1343 return False
1344 return True
1346 def build_xml(node, metadata):
1347 kw = []
1348 for k in metadata:
1349 e = ET.SubElement(node, k)
1350 if isinstance(metadata[k], dict):
1351 build_xml(e, metadata[k])
1352 else:
1353 e.text = str(metadata[k])
1354 kw.append(k)
1355 return kw
1357 if not metadata:
1358 return 0, []
1359 md = metadata
1360 if keys_written:
1361 md = {k: metadata[k] for k in metadata if not k in keys_written}
1362 if len(md) == 0:
1363 return 0, []
1364 has_ixml = False
1365 if 'IXML' in md and check_ixml(md['IXML']):
1366 md = md['IXML']
1367 has_ixml = True
1368 else:
1369 if not check_ixml(md):
1370 return 0, []
1371 root = ET.Element('BWFXML')
1372 kw = build_xml(root, md)
1373 bs = bytes(ET.tostring(root, xml_declaration=True,
1374 short_empty_elements=False))
1375 if len(bs) % 2 == 1:
1376 bs += bytes(1)
1377 df.write(b'IXML')
1378 df.write(struct.pack('<I', len(bs)))
1379 df.write(bs)
1380 return 8 + len(bs), ['IXML'] if has_ixml else kw
1383def write_guano_chunk(df, metadata, keys_written=None):
1384 """Write metadata to guan chunk.
1386 GUANO is the Grand Unified Acoustic Notation Ontology, an
1387 extensible, open format for embedding metadata within bat acoustic
1388 recordings. See https://github.com/riggsd/guano-spec for details.
1390 The GUANO specification allows for the inclusion of arbitrary
1391 nested keys and string encoded values. In that respect it is a
1392 well defined and easy to handle serialization of the [odML data
1393 model](https://doi.org/10.3389/fninf.2011.00016).
1395 This will write *all* metadata that are not in `keys_written`.
1397 Parameters
1398 ----------
1399 df: stream
1400 File stream for writing guano chunk.
1401 metadata: nested dict
1402 Metadata as key-value pairs. Values can be strings, integers,
1403 or dictionaries.
1404 keys_written: list of str
1405 Keys that have already been written to the INFO, BEXT, or IXML chunk.
1407 Returns
1408 -------
1409 n: int
1410 Number of bytes written to the stream.
1411 keys_written: list of str
1412 Top-level keys written to the GUANO chunk.
1414 """
1415 if not metadata:
1416 return 0, []
1417 md = metadata
1418 if keys_written:
1419 md = {k: metadata[k] for k in metadata if not k in keys_written}
1420 if len(md) == 0:
1421 return 0, []
1422 fmd = flatten_metadata(md, True, '|')
1423 for k in fmd:
1424 if isinstance(fmd[k], str):
1425 fmd[k] = fmd[k].replace('\n', r'\n')
1426 sio = io.StringIO()
1427 m, k = find_key(md, 'GUANO.Version')
1428 if k is None:
1429 sio.write('GUANO|Version:1.0\n')
1430 for k in fmd:
1431 sio.write(f'{k}:{fmd[k]}\n')
1432 bs = sio.getvalue().encode('utf-8')
1433 if len(bs) % 2 == 1:
1434 bs += b' '
1435 n = len(bs)
1436 df.write(b'guan')
1437 df.write(struct.pack('<I', n))
1438 df.write(bs)
1439 return 8 + n, list(md)
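The GUANO text format is easiest to see on a small example. The sketch below is stand-alone. Its `flatten` helper is a hypothetical stand-in for the library's `flatten_metadata()`, whose exact behavior is not shown in this section and may differ. It joins nested keys with `|`, writes one `key:value` line per entry, and escapes newlines in values as in the listing.

```python
def flatten(md, sep='|', prefix=''):
    """Flatten a nested dict into {joined_key: value} (hypothetical helper)."""
    flat = {}
    for k, v in md.items():
        key = f'{prefix}{sep}{k}' if prefix else k
        if isinstance(v, dict):
            flat.update(flatten(v, sep, key))
        else:
            flat[key] = v
    return flat

md = {'Recording': {'Artist': 'JB', 'Notes': {'Comment': 'line1\nline2'}}}
lines = ['GUANO|Version:1.0']          # version line comes first
for k, v in flatten(md).items():
    # real newlines would break the line-based format, so escape them:
    lines.append(f'{k}:' + str(v).replace('\n', r'\n'))
text = '\n'.join(lines) + '\n'
print(text)
```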
1442def write_cue_chunk(df, locs):
1443 """Write marker positions to cue chunk.
1445 See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file
1447 Parameters
1448 ----------
1449 df: stream
1450 File stream for writing cue chunk.
1451 locs: None or 2-D array of ints
1452 Positions (first column) and spans (optional second column)
1453 for each marker (rows).
1455 Returns
1456 -------
1457 n: int
1458 Number of bytes written to the stream.
1459 """
1460 if locs is None or len(locs) == 0:
1461 return 0
1462 df.write(b'CUE ')
1463 df.write(struct.pack('<II', 4 + len(locs)*24, len(locs)))
1464 for i in range(len(locs)):
1465 df.write(struct.pack('<II4sIII', i, locs[i,0], b'data', 0, 0, 0))
1466 return 12 + len(locs)*24
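The cue chunk layout can be sketched with the standard library alone. The marker positions are assumed example values, and the chunk id mirrors the upper-case `CUE ` used in the listing. Each cue point takes 24 bytes: index, position, the id of the chunk it refers to, and three offset fields.

```python
import struct

positions = [100, 2500]                # assumed marker positions in frames
chunk = b'CUE ' + struct.pack('<II', 4 + len(positions)*24, len(positions))
for i, pos in enumerate(positions):
    # index, position, referenced chunk id, chunk start, block start, offset
    chunk += struct.pack('<II4sIII', i, pos, b'data', 0, 0, 0)

# Read the cue points back from the packed bytes:
n = struct.unpack_from('<I', chunk, 8)[0]
points = [struct.unpack_from('<II4sIII', chunk, 12 + 24*i) for i in range(n)]
print(len(chunk), points)
```

With two markers the chunk is 12 + 2*24 = 60 bytes long.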
1469def write_playlist_chunk(df, locs):
1470 """Write marker spans to playlist chunk.
1472 See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file
1474 Parameters
1475 ----------
1476 df: stream
1477 File stream for writing playlist chunk.
1478 locs: None or 2-D array of ints
1479 Positions (first column) and spans (optional second column)
1480 for each marker (rows).
1482 Returns
1483 -------
1484 n: int
1485 Number of bytes written to the stream.
1486 """
1487 if locs is None or len(locs) == 0 or locs.shape[1] < 2:
1488 return 0
1489 n_spans = np.sum(locs[:,1] > 0)
1490 if n_spans == 0:
1491 return 0
1492 df.write(b'plst')
1493 df.write(struct.pack('<II', 4 + n_spans*12, n_spans))
1494 for i in range(len(locs)):
1495 if locs[i,1] > 0:
1496 df.write(struct.pack('<III', i, locs[i,1], 1))
1497 return 12 + n_spans*12
1500def write_adtl_chunks(df, locs, labels):
1501 """Write associated data list chunks.
1503 See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file
1505 Parameters
1506 ----------
1507 df: stream
1508 File stream for writing adtl chunk.
1509 locs: None or 2-D array of ints
1510 Positions (first column) and spans (optional second column)
1511 for each marker (rows).
1512 labels: None or 2-D array of string objects
1513 Labels (first column) and texts (second column) for each marker (rows).
1515 Returns
1516 -------
1517 n: int
1518 Number of bytes written to the stream.
1519 """
1520 if labels is None or len(labels) == 0:
1521 return 0
1522 labels_size = 0
1523 for l in labels[:,0]:
1524 if hasattr(l, '__len__'):
1525 n = len(l)
1526 if n > 0:
1527 labels_size += 12 + n + n % 2
1528 text_size = 0
1529 if labels.shape[1] > 1:
1530 for t in labels[:,1]:
1531 if hasattr(t, '__len__'):
1532 n = len(t)
1533 if n > 0:
1534 text_size += 28 + n + n % 2
1535 if labels_size == 0 and text_size == 0:
1536 return 0
1537 size = 4 + labels_size + text_size
1538 spans = locs[:,1] if locs.shape[1] > 1 else None
1539 df.write(b'LIST')
1540 df.write(struct.pack('<I', size))
1541 df.write(b'adtl')
1542 for i in range(len(labels)):
1543 # labl sub-chunk:
1544 l = labels[i,0]
1545 if hasattr(l, '__len__'):
1546 n = len(l)
1547 if n > 0:
1548 n += n % 2
1549 df.write(b'labl')
1550 df.write(struct.pack('<II', 4 + n, i))
1551 df.write(f'{l:<{n}s}'.encode('latin-1', errors='replace'))
1552 # ltxt sub-chunk:
1553 if labels.shape[1] > 1:
1554 t = labels[i,1]
1555 if hasattr(t, '__len__'):
1556 n = len(t)
1557 if n > 0:
1558 n += n % 2
1559 span = spans[i] if spans is not None else 0
1560 df.write(b'ltxt')
1561 df.write(struct.pack('<III', 20 + n, i, span))
1562 df.write(struct.pack('<IHHHH', 0, 0, 0, 0, 0))
1563 df.write(f'{t:<{n}s}'.encode('latin-1', errors='replace'))
1564 return 8 + size
1567def write_lbl_chunk(df, locs, labels, rate):
1568 """Write marker positions, spans, labels, and texts to lbl chunk.
1570 The proprietary LBL chunk is specific to wave files generated by
1571 [AviSoft](https://www.avisoft.com) products.
1573 The labels (first column of `labels`) have special meanings.
1574 Markers with a span (a section label in the terminology of
1575 AviSoft) can be arranged in three levels when displayed:
1577 - "M": layer 1, the top level section
1578 - "N": layer 2, sections below layer 1
1579 - "O": layer 3, sections below layer 2
1580 - "P": total, section start and end are displayed with two vertical lines.
1582 All other labels mark single point labels with a time and a
1583 frequency (that we here discard). See also
1584 https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm
1586 If a marker has a span, and its label is not one of "M", "N", "O", or "P",
1587 then its label is set to "M".
1588 If a marker has no span, and its label is one of "M", "N", "O", or "P",
1589 then its label is set to "a".
1591 Parameters
1592 ----------
1593 df: stream
1594 File stream for writing lbl chunk.
1595 locs: None or 2-D array of ints
1596 Positions (first column) and spans (optional second column)
1597 for each marker (rows).
1598 labels: None or 2-D array of string objects
1599 Labels (first column) and texts (second column) for each marker (rows).
1600 rate: float
1601 Sampling rate of the data in Hertz.
1603 Returns
1604 -------
1605 n: int
1606 Number of bytes written to the stream.
1608 """
1609 if locs is None or len(locs) == 0:
1610 return 0
1611 size = (1 + len(locs)) * 65
1612 df.write(b'LBL ')
1613 df.write(struct.pack('<I', size))
1614 # first empty entry (this is meant to be a title for the whole wave file):
1615 df.write(b' ' * 63)
1616 df.write(b'\r\n')
1617 for k in range(len(locs)):
1618 t0 = locs[k,0]/rate
1619 t1 = t0
1620 t1 += locs[k,1]/rate
1621 ls = 'M' if locs[k,1] > 0 else 'a'
1622 ts = ''
1623 if labels is not None and len(labels) > k:
1624 ls = labels[k,0]
1625 if ls != 0 and len(ls) > 0:
1626 ls = ls[0]
1627 if ls in 'MNOP':
1628 if locs[k,1] == 0:
1629 ls = 'a'
1630 else:
1631 if locs[k,1] > 0:
1632 ls = 'M'
1633 ts = labels[k,1]
1634 if ts == 0:
1635 ts = ''
1636 df.write(struct.pack('<14sc', f'{t0:e}'.encode('ascii', errors='replace'), b'\t'))
1637 df.write(struct.pack('<14sc', f'{t1:e}'.encode('ascii', errors='replace'), b'\t'))
1638 bs = f'{ts:31s}\t{ls}\r\n'.encode('ascii', errors='replace')
1639 df.write(bs)
1640 return 8 + size
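The fixed entry size of 65 bytes assumed in the size computation above can be verified with a stand-alone sketch. `lbl_entry` is a hypothetical helper that repeats the packing steps of the listing: two 14-byte time fields, each followed by a tab, then the text padded to 31 characters, a tab, the one-character label, and `\r\n`.

```python
import struct

def lbl_entry(t0, t1, text, label):
    """One LBL entry as packed in the listing (hypothetical helper)."""
    bs = struct.pack('<14sc', f'{t0:e}'.encode('ascii'), b'\t')   # 15 bytes
    bs += struct.pack('<14sc', f'{t1:e}'.encode('ascii'), b'\t')  # 15 bytes
    # 31 + 1 + 1 + 2 = 35 bytes, provided text has at most 31 characters:
    bs += f'{text:31s}\t{label}\r\n'.encode('ascii', errors='replace')
    return bs

entry = lbl_entry(0.5, 0.75, 'call', 'M')
print(len(entry), entry[:12])
```

Note that the format specifier `31s` pads but does not truncate, so in this sketch a text longer than 31 characters produces an entry longer than 65 bytes.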
1643def append_metadata_riff(df, metadata):
1644 """Append metadata chunks to RIFF file.
1646 You still need to update the filesize by calling
1647 `write_filesize()`.
1649 Parameters
1650 ----------
1651 df: stream
1652 File stream for writing metadata chunks.
1653 metadata: None or nested dict
1654 Metadata as key-value pairs. Values can be strings, integers,
1655 or dictionaries.
1657 Returns
1658 -------
1659 n: int
1660 Number of bytes written to the stream.
1661 tags: list of str
1662 Tag names of chunks written to audio file.
1663 """
1664 if not metadata:
1665 return 0, []
1666 n = 0
1667 tags = []
1668 # metadata INFO chunk:
1669 nc, kw = write_info_chunk(df, metadata)
1670 if nc > 0:
1671 tags.append('LIST-INFO')
1672 n += nc
1673 # metadata BEXT chunk:
1674 nc, bkw = write_bext_chunk(df, metadata)
1675 if nc > 0:
1676 tags.append('BEXT')
1677 n += nc
1678 kw.extend(bkw)
1679 # metadata IXML chunk:
1680 nc, xkw = write_ixml_chunk(df, metadata, kw)
1681 if nc > 0:
1682 tags.append('IXML')
1683 n += nc
1684 kw.extend(xkw)
1685 # write remaining metadata to GUANO chunk:
1686 nc, _ = write_guano_chunk(df, metadata, kw)
1687 if nc > 0:
1688 tags.append('GUAN')
1689 n += nc
1691 return n, tags
1694def append_markers_riff(df, locs, labels=None, rate=None,
1695 marker_hint='cue'):
1696 """Append marker chunks to RIFF file.
1698 You still need to update the filesize by calling
1699 `write_filesize()`.
1701 Parameters
1702 ----------
1703 df: stream
1704 File stream for writing metadata chunks.
1705 locs: None or 1-D or 2-D array of ints
1706 Marker positions (first column) and spans (optional second column)
1707 for each marker (rows).
1708 labels: None or 1-D or 2-D array of string objects
1709 Labels (first column) and texts (optional second column)
1710 for each marker (rows).
1711 rate: float
1712 Sampling rate of the data in Hertz, needed for storing markers
1713 in seconds.
1714 marker_hint: str
1715 - 'cue': store markers in cue and adtl chunks.
1716 - 'lbl': store markers in avisoft lbl chunk.
1718 Returns
1719 -------
1720 n: int
1721 Number of bytes written to the stream.
1722 tags: list of str
1723 Tag names of chunks written to audio file.
1725 Raises
1726 ------
1727 ValueError
1728 `marker_hint` not supported.
1729 IndexError
1730 `locs` and `labels` differ in len.
1731 """
1732 if locs is None or len(locs) == 0:
1733 return 0, []
1734 if labels is not None and len(labels) > 0 and len(labels) != len(locs):
1735 raise IndexError(f'locs and labels must have same number of elements.')
1736 # make locs and labels 2-D:
1737 if not locs is None and locs.ndim == 1:
1738 locs = locs.reshape(-1, 1)
1739 if not labels is None and labels.ndim == 1:
1740 labels = labels.reshape(-1, 1)
1741 # sort markers according to their position:
1742 idxs = np.argsort(locs[:,0])
1743 locs = locs[idxs,:]
1744 if not labels is None and len(labels) > 0:
1745 labels = labels[idxs,:]
1746 n = 0
1747 tags = []
1748 if marker_hint.lower() == 'cue':
1749 # write marker positions:
1750 nc = write_cue_chunk(df, locs)
1751 if nc > 0:
1752 tags.append('CUE ')
1753 n += nc
1754 # write marker spans:
1755 nc = write_playlist_chunk(df, locs)
1756 if nc > 0:
1757 tags.append('PLST')
1758 n += nc
1759 # write marker labels:
1760 nc = write_adtl_chunks(df, locs, labels)
1761 if nc > 0:
1762 tags.append('LIST-ADTL')
1763 n += nc
1764 elif marker_hint.lower() == 'lbl':
1765 # write avisoft labels:
1766 nc = write_lbl_chunk(df, locs, labels, rate)
1767 if nc > 0:
1768 tags.append('LBL ')
1769 n += nc
1770 else:
1771 raise ValueError(f'marker_hint "{marker_hint}" not supported for storing markers')
1772 return n, tags
1775def write_wave(filepath, data, rate, metadata=None, locs=None,
1776 labels=None, encoding=None, marker_hint='cue'):
1777 """Write time series, metadata and markers to a WAVE file.
1779 Only 16-bit or 32-bit PCM encoding is supported.
1781 Parameters
1782 ----------
1783 filepath: string
1784 Full path and name of the file to write.
1785 data: 1-D or 2-D array of floats
1786 Array with the data (first index time, second index channel,
1787 values within -1.0 and 1.0).
1788 rate: float
1789 Sampling rate of the data in Hertz.
1790 metadata: None or nested dict
1791 Metadata as key-value pairs. Values can be strings, integers,
1792 or dictionaries.
1793 locs: None or 1-D or 2-D array of ints
1794 Marker positions (first column) and spans (optional second column)
1795 for each marker (rows).
1796 labels: None or 1-D or 2-D array of string objects
1797 Labels (first column) and texts (optional second column)
1798 for each marker (rows).
1799 encoding: string or None
1800 Encoding of the data: 'PCM_32' or 'PCM_16'.
1801 If None or empty string use 'PCM_16'.
1802 marker_hint: str
1803 - 'cue': store markers in cue and adtl chunks.
1804 - 'lbl': store markers in avisoft lbl chunk.
1806 Raises
1807 ------
1808 ValueError
1809 Encoding not supported.
1810 IndexError
1811 `locs` and `labels` differ in len.
1813 See Also
1814 --------
1815 audioio.audiowriter.write_audio()
1817 Examples
1818 --------
1819 ```
1820 import numpy as np
1821 from audioio.riffmetadata import write_wave
1823 rate = 28000.0
1824 freq = 800.0
1825 time = np.arange(0.0, 1.0, 1/rate) # one second
1826 data = np.sin(2.0*np.pi*freq*time) # 800Hz sine wave
1827 md = dict(Artist='underscore_') # metadata
1829 write_wave('audio/file.wav', data, rate, md)
1830 ```
1831 """
1832 if not filepath:
1833 raise ValueError('no file specified!')
1834 if not encoding:
1835 encoding = 'PCM_16'
1836 encoding = encoding.upper()
1837 bits = 0
1838 if encoding == 'PCM_16':
1839 bits = 16
1840 elif encoding == 'PCM_32':
1841 bits = 32
1842 else:
1843 raise ValueError(f'file encoding {encoding} not supported')
1844 if locs is not None and len(locs) > 0 and \
1845 labels is not None and len(labels) > 0 and len(labels) != len(locs):
1846 raise IndexError(f'locs and labels must have same number of elements.')
1847 # write WAVE file:
1848 with open(filepath, 'wb') as df:
1849 write_riff_chunk(df)
1850 if data.ndim == 1:
1851 write_format_chunk(df, 1, len(data), rate, bits)
1852 else:
1853 write_format_chunk(df, data.shape[1], data.shape[0],
1854 rate, bits)
1855 append_metadata_riff(df, metadata)
1856 write_data_chunk(df, data, bits)
1857 append_markers_riff(df, locs, labels, rate, marker_hint)
1858 write_filesize(df)
1861def append_riff(filepath, metadata=None, locs=None, labels=None,
1862 rate=None, marker_hint='cue'):
1863 """Append metadata and markers to an existing RIFF file.
1865 Parameters
1866 ----------
1867 filepath: string
1868 Full path and name of the file to write.
1869 metadata: None or nested dict
1870 Metadata as key-value pairs. Values can be strings, integers,
1871 or dictionaries.
1872 locs: None or 1-D or 2-D array of ints
1873 Marker positions (first column) and spans (optional second column)
1874 for each marker (rows).
1875 labels: None or 1-D or 2-D array of string objects
1876 Labels (first column) and texts (optional second column)
1877 for each marker (rows).
1878 rate: float
1879 Sampling rate of the data in Hertz, needed for storing markers
1880 in seconds.
1881 marker_hint: str
1882 - 'cue': store markers in cue and adtl chunks.
1883 - 'lbl': store markers in avisoft lbl chunk.
1885 Returns
1886 -------
1887 n: int
1888 Number of bytes written to the stream.
1890 Raises
1891 ------
1892 IndexError
1893 `locs` and `labels` differ in len.
1895 Examples
1896 --------
1897 ```
1898 import numpy as np
1899 from audioio.riffmetadata import append_riff
1901 md = dict(Artist='underscore_') # metadata
1902 append_riff('audio/file.wav', md) # append them to existing audio file
1903 ```
1904 """
1905 if not filepath:
1906 raise ValueError('no file specified!')
1907 if locs is not None and len(locs) > 0 and \
1908 labels is not None and len(labels) > 0 and len(labels) != len(locs):
1909 raise IndexError(f'locs and labels must have same number of elements.')
1910 # check RIFF file:
1911 chunks = read_chunk_tags(filepath)
1912 # append to RIFF file:
1913 n = 0
1914 with open(filepath, 'r+b') as df:
1915 tags = []
1916 df.seek(0, os.SEEK_END)
1917 nc, tgs = append_metadata_riff(df, metadata)
1918 n += nc
1919 tags.extend(tgs)
1920 nc, tgs = append_markers_riff(df, locs, labels, rate, marker_hint)
1921 n += nc
1922 tags.extend(tgs)
1923 write_filesize(df)
1924 # blank out already existing chunks:
1925 for tag in chunks:
1926 if tag in tags:
1927 if '-' in tag:
1928 xtag = tag[5:7] + 'xx'
1929 else:
1930 xtag = tag[:2] + 'xx'
1931 write_chunk_name(df, chunks[tag][0], xtag)
1932 return n
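The renaming rule used to blank out superseded chunks can be shown in isolation. `blank_tag` is a hypothetical helper that repeats only the string logic of the loop above. For list chunks reported as `LIST-XXXX` it takes the first two characters of the list type, otherwise the first two characters of the chunk name, and appends `xx`.

```python
def blank_tag(tag):
    """Derive the replacement name for an outdated chunk (hypothetical helper)."""
    if '-' in tag:
        return tag[5:7] + 'xx'     # 'LIST-INFO' -> 'INxx'
    return tag[:2] + 'xx'          # 'BEXT' -> 'BExx'

for tag in ['LIST-INFO', 'LIST-ADTL', 'BEXT', 'CUE ']:
    print(f'{tag!r} -> {blank_tag(tag)!r}')
```

The result is always four characters long, as `write_chunk_name()` requires.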
1935def demo(filepath):
1936 """Print metadata and markers of a RIFF/WAVE file.
1938 Parameters
1939 ----------
1940 filepath: string
1941 Path of a RIFF/WAVE file.
1942 """
1943 def print_meta_data(meta_data, level=0):
1944 for sk in meta_data:
1945 md = meta_data[sk]
1946 if isinstance(md, dict):
1947 print(f'{"":<{level*4}}{sk}:')
1948 print_meta_data(md, level+1)
1949 else:
1950 v = str(md).replace('\n', '.').replace('\r', '.')
1951 print(f'{"":<{level*4}s}{sk:<20s}: {v}')
1953 # read meta data:
1954 meta_data = metadata_riff(filepath, store_empty=False)
1956 # print meta data:
1957 print()
1958 print('metadata:')
1959 print_meta_data(meta_data)
1961 # read cues:
1962 locs, labels = markers_riff(filepath)
1964 # print marker table:
1965 if len(locs) > 0:
1966 print()
1967 print('markers:')
1968 print(f'{"position":10} {"span":8} {"label":10} {"text":10}')
1969 for i in range(len(locs)):
1970 if i < len(labels):
1971 print(f'{locs[i,0]:10} {locs[i,1]:8} {labels[i,0]:10} {labels[i,1]:30}')
1972 else:
1973 print(f'{locs[i,0]:10} {locs[i,1]:8} {"-":10} {"-":10}')
1976def main(*args):
1977 """Call demo with command line arguments.
1979 Parameters
1980 ----------
1981 args: list of strings
1982 Command line arguments as returned by sys.argv[1:]
1983 """
1984 if len(args) > 0 and (args[0] == '-h' or args[0] == '--help'):
1985 print()
1986 print('Usage:')
1987 print(' python -m src.audioio.riffmetadata [--help] <audio/file.wav>')
1988 print()
1989 return
1991 if len(args) > 0:
1992 demo(args[0])
1993 else:
1994 rate = 44100
1995 t = np.arange(0, 2, 1/rate)
1996 x = np.sin(2*np.pi*440*t)
1997 imd = dict(IENG='JB', ICRD='2024-01-24', RATE=9,
1998 Comment='this is test1')
1999 bmd = dict(Description='a recording',
2000 OriginationDate='2024:01:24', TimeReference=123456,
2001 Version=42, CodingHistory='Test1\nTest2')
2002 xmd = dict(Project='Record all', Note='still testing',
2003 Sync_Point_List=dict(Sync_Point=1,
2004 Sync_Point_Comment='great'))
2005 omd = imd.copy()
2006 omd['Production'] = bmd
2007 md = dict(INFO=imd, BEXT=bmd, IXML=xmd,
2008 Recording=omd, Notes=xmd)
2009 locs = np.random.randint(10, len(x)-10, (5, 2))
2010 locs = locs[np.argsort(locs[:,0]),:]
2011 locs[:,1] = np.random.randint(0, 20, len(locs))
2012 labels = np.zeros((len(locs), 2), dtype=object)
2013 for i in range(len(labels)):
2014 labels[i,0] = chr(ord('a') + i % 26)
2015 labels[i,1] = chr(ord('A') + i % 26)*5
2016 write_wave('test.wav', x, rate, md, locs, labels)
2017 demo('test.wav')
2020if __name__ == "__main__":
2021 main(*sys.argv[1:])