Coverage for src/audioio/riffmetadata.py: 97%
727 statements
coverage.py v7.6.3, created at 2024-10-15 07:29 +0000
"""Read and write metadata and marker lists of RIFF-based files.

Container files of the Resource Interchange File Format (RIFF), like
WAVE files, may contain sections (called chunks) with metadata and
markers in addition to the time series (audio) data and the necessary
specifications of sampling rate, bit depth, etc.

## Metadata

There are various types of chunks for storing metadata, like the [INFO
list](https://www.recordingblogs.com/wiki/list-chunk-of-a-wave-file),
[broadcast-audio extension
(BEXT)](https://tech.ebu.ch/docs/tech/tech3285.pdf) chunk, or
[iXML](http://www.gallery.co.uk/ixml/) chunks. These chunks contain
metadata as key-value pairs. Since WAVE files are primarily designed
for music, valid keys in these chunks are restricted to topics from
music and music production. Some keys are also useful for science,
but more keys are needed. It is possible to extend the INFO
list keys, but these keys are restricted to four characters, and the
INFO list chunk does not allow for hierarchical metadata either. The
other metadata chunks, in particular the BEXT chunk, cannot be
extended. With standard chunks, not all types of metadata can be
stored.

The [GUANO (Grand Unified Acoustic Notation
Ontology)](https://github.com/riggsd/guano-spec), primarily designed
for bat acoustic recordings, has some standard ontologies that are of
much more interest in a scientific context. In addition, GUANO allows
for extensions with arbitrary nested keys and string-encoded values.
In that respect it is a well-defined and easy-to-handle serialization
of the [odML data model](https://doi.org/10.3389/fninf.2011.00016).
We use GUANO to write all metadata that do not fit into the INFO, BEXT
or iXML chunks into a WAVE file.

To interface the various ways to store and read metadata of RIFF
files, the `riffmetadata` module simply uses nested dictionaries. The
keys are always strings. Values are strings or integers for key-value
pairs. Value strings can also be numbers followed by a unit. Values
can also be dictionaries for defining subsections of key-value
pairs. The dictionaries can be nested to arbitrary depth.
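
As an illustration, a metadata dictionary of this kind might look as
follows. All section and key names are made up for this example, and
`flatten()` is a hypothetical helper written only for this sketch. It
is not the `flatten_metadata()` function imported by this module.

```python
# Hypothetical example metadata: keys are strings, values are strings,
# integers, or nested dictionaries (subsections).
metadata = {
    'Artist': 'J. Doe',
    'Recording': {
        'Temperature': '24.5C',      # a number followed by a unit
        'Channels': 2,
        'Amplifier': {'Gain': '20dB'},
    },
}

def flatten(md, sep='.', prefix=''):
    # Collapse nested sections into a flat dict with joined keys.
    flat = {}
    for key, value in md.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, sep, name + sep))
        else:
            flat[name] = value
    return flat

flat = flatten(metadata)
for key, value in flat.items():
    print(key, '=', value)
```

Flattening is one simple way to see that the nesting depth is not
limited: every subsection just adds one more component to the key.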

The `write_wave()` function first tries to write an INFO list
chunk. It checks for a key "INFO" with a flat dictionary of key-value
pairs. It then translates all keys of this dictionary using the
`info_tags` mapping. If all the resulting keys have no more than four
characters and there are no subsections, then an INFO list chunk is
written. If no "INFO" key exists, then with the same procedure all
elements of the provided metadata are checked for being valid INFO
tags, and on success an INFO list chunk is written. Then, in similar
ways, `write_wave()` tries to assemble valid BEXT and iXML chunks,
based on the tags in `bext_tags` and `ixml_tags`. All remaining
metadata are then stored in a GUANO chunk.
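
The INFO check described here can be sketched roughly as follows.
This is a simplified illustration and not the module's actual
implementation: `to_info()` and `info_section()` are hypothetical
helpers, and `tags` is only a three-entry excerpt in the style of
`info_tags`.

```python
# Small excerpt of a tag mapping in the style of `info_tags`
# (four-character tag -> descriptive name).
tags = dict(IART='Artist', INAM='Title', ICMT='Comment')

def to_info(md, tags):
    # Return md with keys translated to INFO tags, or None if md
    # cannot be stored in an INFO list chunk.
    names = {name: tag for tag, name in tags.items()}
    info = {}
    for key, value in md.items():
        if isinstance(value, dict):
            return None              # subsections are not allowed
        tag = names.get(key, key)    # translate descriptive names
        if len(tag) > 4:
            return None              # INFO keys have at most 4 characters
        info[tag] = value
    return info

def info_section(metadata, tags):
    # Prefer an explicit "INFO" section, else try the whole metadata.
    md = metadata.get('INFO', metadata)
    return to_info(md, tags) if isinstance(md, dict) else None

good = info_section({'INFO': {'Artist': 'J. Doe', 'Title': 'Test'}}, tags)
bad = info_section({'Artist': 'J. Doe', 'Recording': {'Gain': '20dB'}}, tags)
print(good)   # translated keys
print(bad)    # None, because of the subsection
```

In this sketch a `None` result stands for "no INFO list chunk can be
written", in which case the metadata would have to go to another chunk.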

When reading metadata from a RIFF file, INFO, BEXT and iXML chunks are
returned as subsections with the respective keys. Metadata from a
GUANO chunk are stored directly in the metadata dictionary without
marking them as GUANO.
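
How "key: value" lines of a GUANO chunk can end up as nested
dictionaries may be sketched like this. The sketch assumes that a
vertical bar in a key separates section names, and `parse_guano()` as
well as the example lines are hypothetical and not part of this module.

```python
def parse_guano(lines, sep='|'):
    # Parse "key: value" lines; `sep` in a key opens a subsection.
    md = {}
    for line in lines:
        key, colon, value = line.partition(':')
        if not colon:
            continue                      # no key-value pair on this line
        *sections, name = key.strip().split(sep)
        node = md
        for section in sections:
            node = node.setdefault(section, {})
        node[name] = value.strip()
    return md

lines = ['Timestamp: 2024-10-15T07:29:00',
         'Recording|Site: pond',
         'Recording|Amplifier|Gain: 20dB']
md = parse_guano(lines)
print(md)
```

Only the first colon separates key and value, so values that contain
colons themselves, like the time stamp above, are kept intact.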

## Markers

A number of different chunk types exist for handling markers or cues
that mark specific events or regions in the audio data. In the end,
each marker has a position, a span, a label, and a text. Positions
and spans are handled with 1-D or 2-D arrays of ints, where each row is
a marker and the columns are position and span. The span column is
optional. Labels and texts come in another 1-D or 2-D array of objects
pointing to strings. Again, rows are the markers, the first column
holds the labels, and the second column the optional texts. Try to
keep the labels short, and use text for longer descriptions, if
necessary.
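
The layout of the two marker tables might be pictured as follows. To
keep this sketch free of dependencies it uses plain nested lists with
made-up values; with NumPy the same rows could be turned into arrays
via `np.array(locs, dtype=int)` and `np.array(labels, dtype=object)`.

```python
# One row per marker: [position, span], both in frames.
locs = [[48000, 0],        # a single point in time
        [12000, 24000],    # a region spanning 24000 frames
        [96000, 4800]]
# One row per marker: [label, text]; the text may stay empty.
labels = [['click', ''],
          ['song', 'male, recorded close to the nest'],
          ['noise', '']]

# Sort the markers by position, keeping the rows of both tables aligned.
order = sorted(range(len(locs)), key=lambda i: locs[i][0])
locs = [locs[i] for i in order]
labels = [labels[i] for i in order]
for (pos, span), (label, text) in zip(locs, labels):
    print(f'{label:6s} position={pos:6d} span={span:6d} {text}')
```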

## Read metadata and markers

- `metadata_riff()`: read metadata from a RIFF/WAVE file.
- `markers_riff()`: read markers from a RIFF/WAVE file.

## Write data, metadata and markers

- `write_wave()`: write time series, metadata and markers to a WAVE file.
- `append_metadata_riff()`: append metadata chunks to RIFF file.
- `append_markers_riff()`: append marker chunks to RIFF file.
- `append_riff()`: append metadata and markers to an existing RIFF file.

## Helper functions for reading RIFF and WAVE files

- `read_chunk_tags()`: read tags of all chunks contained in a RIFF file.
- `read_riff_header()`: read and check the RIFF file header.
- `skip_chunk()`: skip over unknown RIFF chunk.
- `read_format_chunk()`: read format chunk.
- `read_info_chunks()`: read in metadata from INFO list chunk.
- `read_bext_chunk()`: read in metadata from the broadcast-audio extension chunk.
- `read_ixml_chunk()`: read in metadata from an iXML chunk.
- `read_guano_chunk()`: read in metadata from a GUANO chunk.
- `read_cue_chunk()`: read in marker positions from cue chunk.
- `read_playlist_chunk()`: read in marker spans from playlist chunk.
- `read_adtl_chunks()`: read in associated data list chunks.
- `read_lbl_chunk()`: read in marker positions, spans, labels, and texts from lbl chunk.
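
The reading helpers all rely on the same chunk layout: a four-character
tag, a little-endian 32-bit size, and the data padded to an even number
of bytes. A minimal sketch of walking over the chunks of an in-memory
file, where `chunk()`, `list_chunks()` and the chunk contents are made
up for illustration:

```python
import io
import os
import struct

def chunk(tag, data):
    # A chunk is a 4-character tag, a little-endian 32-bit size,
    # and the data, padded with a zero byte to an even length.
    return tag + struct.pack('<I', len(data)) + data + bytes(len(data) % 2)

# Build a tiny in-memory RIFF file with two made-up chunks.
body = b'WAVE' + chunk(b'note', b'abc') + chunk(b'data', bytes(8))
sf = io.BytesIO(b'RIFF' + struct.pack('<I', len(body)) + body)

def list_chunks(sf):
    # Return (tag, size) of every chunk, skipping over the chunk data.
    if sf.read(4) != b'RIFF':
        raise ValueError('Not a RIFF file.')
    fsize = struct.unpack('<I', sf.read(4))[0] + 8
    sf.read(4)                        # type of the RIFF file, e.g. WAVE
    chunks = []
    while sf.tell() < fsize - 8:
        tag = sf.read(4).decode('latin-1')
        size = struct.unpack('<I', sf.read(4))[0]
        chunks.append((tag, size))
        sf.seek(size + size % 2, os.SEEK_CUR)   # skip data and padding
    return chunks

chunks = list_chunks(sf)
print(chunks)
```

The padding byte is not counted in the size field, which is why the
sketch adds `size % 2` when seeking to the next chunk.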

## Helper functions for writing RIFF and WAVE files

- `write_riff_chunk()`: write RIFF file header.
- `write_filesize()`: write the file size into the RIFF file header.
- `write_chunk_name()`: change the name of a chunk.
- `write_format_chunk()`: write format chunk.
- `write_data_chunk()`: write data chunk.
- `write_info_chunk()`: write metadata to LIST INFO chunk.
- `write_bext_chunk()`: write metadata to BEXT chunk.
- `write_ixml_chunk()`: write metadata to iXML chunk.
- `write_guano_chunk()`: write metadata to GUANO chunk.
- `write_cue_chunk()`: write marker positions to cue chunk.
- `write_playlist_chunk()`: write marker spans to playlist chunk.
- `write_adtl_chunks()`: write associated data list chunks.
- `write_lbl_chunk()`: write marker positions, spans, labels, and texts to lbl chunk.
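
Writing usually proceeds in two steps: the header is written with a
provisional size, and the size field is patched once all chunks are in
place. The following sketch shows this idea on an in-memory stream with
a made-up chunk. It uses only `io` and `struct` and does not call the
helper functions listed above.

```python
import io
import struct

df = io.BytesIO()
# Header with a provisional size of zero, as the final size is unknown.
df.write(b'RIFF' + struct.pack('<I', 0) + b'WAVE')
# A made-up chunk with four bytes of data.
df.write(b'note' + struct.pack('<I', 4) + b'abcd')
# Patch the size field at byte offset 4: total file size minus the
# 8 bytes of the 'RIFF' tag and the size field itself.
filesize = df.seek(0, io.SEEK_END)
df.seek(4)
df.write(struct.pack('<I', filesize - 8))
stored = struct.unpack('<I', df.getvalue()[4:8])[0]
print(filesize, stored)
```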

## Demo

- `demo()`: print metadata and marker list of RIFF/WAVE file.
- `main()`: call demo with command line arguments.

## Descriptions of the RIFF/WAVE file format

- https://de.wikipedia.org/wiki/RIFF_WAVE
- http://www.piclist.com/techref/io/serial/midi/wave.html
- https://moddingwiki.shikadi.net/wiki/Resource_Interchange_File_Format_(RIFF)
- https://www.recordingblogs.com/wiki/wave-file-format
- http://fhein.users.ak.tu-berlin.de/Alias/Studio/ProTools/audio-formate/wav/overview.html
- http://www.gallery.co.uk/ixml/

For INFO tag names see:

- https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags

"""
134import io
135import os
136import sys
137import warnings
138import struct
139import numpy as np
140import xml.etree.ElementTree as ET
141from .audiometadata import flatten_metadata, unflatten_metadata, find_key
144info_tags = dict(AGES='Rated',
145 CMNT='Comment',
146 CODE='EncodedBy',
147 COMM='Comments',
148 DIRC='Directory',
149 DISP='SoundSchemeTitle',
150 DTIM='DateTimeOriginal',
151 GENR='Genre',
152 IARL='ArchivalLocation',
153 IART='Artist',
154 IAS1='FirstLanguage',
155 IAS2='SecondLanguage',
156 IAS3='ThirdLanguage',
157 IAS4='FourthLanguage',
158 IAS5='FifthLanguage',
159 IAS6='SixthLanguage',
160 IAS7='SeventhLanguage',
161 IAS8='EighthLanguage',
162 IAS9='NinthLanguage',
163 IBSU='BaseURL',
164 ICAS='DefaultAudioStream',
                 ICDS='CostumeDesigner',
166 ICMS='Commissioned',
167 ICMT='Comment',
168 ICNM='Cinematographer',
169 ICNT='Country',
170 ICOP='Copyright',
171 ICRD='DateCreated',
172 ICRP='Cropped',
173 IDIM='Dimensions',
174 IDIT='DateTimeOriginal',
175 IDPI='DotsPerInch',
176 IDST='DistributedBy',
177 IEDT='EditedBy',
178 IENC='EncodedBy',
179 IENG='Engineer',
180 IGNR='Genre',
181 IKEY='Keywords',
182 ILGT='Lightness',
183 ILGU='LogoURL',
184 ILIU='LogoIconURL',
185 ILNG='Language',
186 IMBI='MoreInfoBannerImage',
187 IMBU='MoreInfoBannerURL',
188 IMED='Medium',
189 IMIT='MoreInfoText',
190 IMIU='MoreInfoURL',
191 IMUS='MusicBy',
192 INAM='Title',
193 IPDS='ProductionDesigner',
194 IPLT='NumColors',
195 IPRD='Product',
196 IPRO='ProducedBy',
197 IRIP='RippedBy',
198 IRTD='Rating',
199 ISBJ='Subject',
200 ISFT='Software',
201 ISGN='SecondaryGenre',
202 ISHP='Sharpness',
203 ISMP='TimeCode',
204 ISRC='Source',
205 ISRF='SourceFrom',
206 ISTD='ProductionStudio',
207 ISTR='Starring',
208 ITCH='Technician',
209 ITRK='TrackNumber',
210 IWMU='WatermarkURL',
211 IWRI='WrittenBy',
212 LANG='Language',
213 LOCA='Location',
214 PRT1='Part',
215 PRT2='NumberOfParts',
216 RATE='Rate',
                 STAR='Starring',
218 STAT='Statistics',
219 TAPE='TapeName',
220 TCDO='EndTimecode',
221 TCOD='StartTimecode',
222 TITL='Title',
223 TLEN='Length',
224 TORG='Organization',
225 TRCK='TrackNumber',
226 TURL='URL',
227 TVER='Version',
228 VMAJ='VegasVersionMajor',
229 VMIN='VegasVersionMinor',
230 YEAR='Year',
231 # extensions from
232 # [TeeGrid](https://github.com/janscience/TeeGrid/):
233 BITS='Bits',
234 PINS='Pins',
235 AVRG='Averaging',
236 CNVS='ConversionSpeed',
237 SMPS='SamplingSpeed',
238 VREF='ReferenceVoltage',
239 GAIN='Gain',
240 UWRP='UnwrapThreshold',
241 UWPC='UnwrapClippedAmplitude',
242 IBRD='uCBoard',
243 IMAC='MACAdress')
244"""Dictionary with known tags of the INFO chunk as keys and their description as value.
246See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags
247"""
249bext_tags = dict(
250 Description=256,
251 Originator=32,
252 OriginatorReference=32,
253 OriginationDate=10,
254 OriginationTime=8,
255 TimeReference=8,
256 Version=2,
257 UMID=64,
258 LoudnessValue=2,
259 LoudnessRange=2,
260 MaxTruePeakLevel=2,
261 MaxMomentaryLoudness=2,
262 MaxShortTermLoudness=2,
263 Reserved=180,
264 CodingHistory=0)
265"""Dictionary with tags of the BEXT chunk as keys and their size in bytes as values.
267See https://tech.ebu.ch/docs/tech/tech3285.pdf
268"""
270ixml_tags = [
271 'BWFXML',
272 'IXML_VERSION',
273 'PROJECT',
274 'SCENE',
275 'TAPE',
276 'TAKE',
277 'TAKE_TYPE',
278 'NO_GOOD',
279 'FALSE_START',
280 'WILD_TRACK',
281 'CIRCLED',
282 'FILE_UID',
283 'UBITS',
284 'NOTE',
285 'SYNC_POINT_LIST',
286 'SYNC_POINT_COUNT',
287 'SYNC_POINT',
288 'SYNC_POINT_TYPE',
289 'SYNC_POINT_FUNCTION',
290 'SYNC_POINT_COMMENT',
291 'SYNC_POINT_LOW',
292 'SYNC_POINT_HIGH',
293 'SYNC_POINT_EVENT_DURATION',
294 'SPEED',
295 'MASTER_SPEED',
296 'CURRENT_SPEED',
297 'TIMECODE_RATE',
298 'TIMECODE_FLAGS',
299 'FILE_SAMPLE_RATE',
300 'AUDIO_BIT_DEPTH',
301 'DIGITIZER_SAMPLE_RATE',
302 'TIMESTAMP_SAMPLES_SINCE_MIDNIGHT_HI',
303 'TIMESTAMP_SAMPLES_SINCE_MIDNIGHT_LO',
304 'TIMESTAMP_SAMPLE_RATE',
305 'LOUDNESS',
306 'LOUDNESS_VALUE',
307 'LOUDNESS_RANGE',
308 'MAX_TRUE_PEAK_LEVEL',
309 'MAX_MOMENTARY_LOUDNESS',
310 'MAX_SHORT_TERM_LOUDNESS',
311 'HISTORY',
312 'ORIGINAL_FILENAME',
313 'PARENT_FILENAME',
314 'PARENT_UID',
315 'FILE_SET',
316 'TOTAL_FILES',
317 'FAMILY_UID',
318 'FAMILY_NAME',
319 'FILE_SET_INDEX',
320 'TRACK_LIST',
321 'TRACK_COUNT',
322 'TRACK',
323 'CHANNEL_INDEX',
324 'INTERLEAVE_INDEX',
325 'NAME',
326 'FUNCTION',
327 'PRE_RECORD_SAMPLECOUNT',
328 'BEXT',
329 'BWF_DESCRIPTION',
330 'BWF_ORIGINATOR',
331 'BWF_ORIGINATOR_REFERENCE',
332 'BWF_ORIGINATION_DATE',
333 'BWF_ORIGINATION_TIME',
334 'BWF_TIME_REFERENCE_LOW',
335 'BWF_TIME_REFERENCE_HIGH',
336 'BWF_VERSION',
337 'BWF_UMID',
338 'BWF_RESERVED',
339 'BWF_CODING_HISTORY',
340 'BWF_LOUDNESS_VALUE',
341 'BWF_LOUDNESS_RANGE',
342 'BWF_MAX_TRUE_PEAK_LEVEL',
343 'BWF_MAX_MOMENTARY_LOUDNESS',
344 'BWF_MAX_SHORT_TERM_LOUDNESS',
345 'USER',
346 'FULL_TITLE',
347 'DIRECTOR_NAME',
348 'PRODUCTION_NAME',
349 'PRODUCTION_ADDRESS',
350 'PRODUCTION_EMAIL',
351 'PRODUCTION_PHONE',
352 'PRODUCTION_NOTE',
353 'SOUND_MIXER_NAME',
354 'SOUND_MIXER_ADDRESS',
355 'SOUND_MIXER_EMAIL',
356 'SOUND_MIXER_PHONE',
357 'SOUND_MIXER_NOTE',
358 'AUDIO_RECORDER_MODEL',
359 'AUDIO_RECORDER_SERIAL_NUMBER',
360 'AUDIO_RECORDER_FIRMWARE',
361 'LOCATION',
362 'LOCATION_NAME',
363 'LOCATION_GPS',
364 'LOCATION_ALTITUDE',
365 'LOCATION_TYPE',
366 'LOCATION_TIME',
367 ]
368"""List with valid tags of the iXML chunk.
370See http://www.gallery.co.uk/ixml/
371"""
374# Read RIFF/WAVE files:
376def read_riff_header(sf, tag=None):
377 """Read and check the RIFF file header.
379 Parameters
380 ----------
381 sf: stream
382 File stream of RIFF/WAVE file.
383 tag: None or str
384 If supplied, check whether it matches the subchunk tag.
385 If it does not match, raise a ValueError.
387 Returns
388 -------
389 filesize: int
390 Size of the RIFF file in bytes.
392 Raises
393 ------
394 ValueError
395 Not a RIFF file or subchunk tag does not match `tag`.
396 """
397 riffs = sf.read(4).decode('latin-1')
398 if riffs != 'RIFF':
399 raise ValueError('Not a RIFF file.')
400 fsize = struct.unpack('<I', sf.read(4))[0] + 8
401 subtag = sf.read(4).decode('latin-1')
402 if tag is not None and subtag != tag:
403 raise ValueError(f'Not a {tag} file.')
404 return fsize
407def skip_chunk(sf):
408 """Skip over unknown RIFF chunk.
410 Parameters
411 ----------
412 sf: stream
413 File stream of RIFF file.
415 Returns
416 -------
417 size: int
418 The size of the skipped chunk in bytes.
419 """
420 size = struct.unpack('<I', sf.read(4))[0]
421 size += size % 2
422 sf.seek(size, os.SEEK_CUR)
423 return size
426def read_chunk_tags(filepath):
427 """Read tags of all chunks contained in a RIFF file.
429 Parameters
430 ----------
431 filepath: string or file handle
432 The RIFF file.
434 Returns
435 -------
436 tags: dict
437 Keys are the tag names of the chunks found in the file. If the
438 chunk is a list chunk, then the list type is added with a dash
439 to the key, i.e. "LIST-INFO". Values are tuples with the
440 corresponding file positions of the data of the chunk (after
441 the tag and the chunk size field) and the size of the chunk
442 data. The file position of the next chunk is thus the position
443 of the chunk plus the size of its data.
445 Raises
446 ------
447 ValueError
448 Not a RIFF file.
450 """
451 tags = {}
452 sf = filepath
453 file_pos = None
454 if hasattr(filepath, 'read'):
455 file_pos = sf.tell()
456 sf.seek(0, os.SEEK_SET)
457 else:
458 sf = open(filepath, 'rb')
459 fsize = read_riff_header(sf)
460 while (sf.tell() < fsize - 8):
461 chunk = sf.read(4).decode('latin-1').upper()
462 size = struct.unpack('<I', sf.read(4))[0]
463 size += size % 2
464 fp = sf.tell()
465 if chunk == 'LIST':
466 subchunk = sf.read(4).decode('latin-1').upper()
467 tags[chunk + '-' + subchunk] = (fp, size)
468 size -= 4
469 else:
470 tags[chunk] = (fp, size)
471 sf.seek(size, os.SEEK_CUR)
472 if file_pos is None:
473 sf.close()
474 else:
475 sf.seek(file_pos, os.SEEK_SET)
476 return tags
479def read_format_chunk(sf):
480 """Read format chunk.
482 Parameters
483 ----------
484 sf: stream
485 File stream for reading FMT chunk.
487 Returns
488 -------
489 channels: int
490 Number of channels.
491 rate: float
492 Sampling rate (frames per time) in Hertz.
493 bits: int
494 Bit resolution.
495 """
496 size = struct.unpack('<I', sf.read(4))[0]
497 size += size % 2
498 ccode, channels, rate, byterate, blockalign, bits = struct.unpack('<HHIIHH', sf.read(16))
499 if size > 16:
500 sf.read(size - 16)
501 return channels, float(rate), bits
504def read_info_chunks(sf, store_empty):
505 """Read in meta data from info list chunk.
507 The variable `info_tags` is used to map the 4 character tags to
508 human readable key names.
510 See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags
512 Parameters
513 ----------
514 sf: stream
515 File stream of RIFF file.
516 store_empty: bool
517 If `False` do not add meta data with empty values.
519 Returns
520 -------
521 metadata: dict
522 Dictionary with key-value pairs of info tags.
524 """
525 md = {}
526 list_size = struct.unpack('<I', sf.read(4))[0]
527 list_type = sf.read(4).decode('latin-1').upper()
528 list_size -= 4
529 if list_type == 'INFO':
530 while list_size >= 8:
531 key = sf.read(4).decode('ascii').rstrip(' \x00')
532 size = struct.unpack('<I', sf.read(4))[0]
533 size += size % 2
534 bs = sf.read(size)
535 x = np.frombuffer(bs, dtype=np.uint8)
536 if np.sum((x >= 0x80) & (x <= 0x9f)) > 0:
537 s = bs.decode('windows-1252')
538 else:
539 s = bs.decode('latin1')
540 value = s.rstrip(' \x00\x02')
541 list_size -= 8 + size
542 if key in info_tags:
543 key = info_tags[key]
544 if value or store_empty:
545 md[key] = value
546 if list_size > 0: # finish or skip
547 sf.seek(list_size, os.SEEK_CUR)
548 return md
551def read_bext_chunk(sf, store_empty=True):
552 """Read in metadata from the broadcast-audio extension chunk.
554 The variable `bext_tags` lists all valid BEXT fields and their size.
556 See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.
558 Parameters
559 ----------
560 sf: stream
561 File stream of RIFF file.
562 store_empty: bool
563 If `False` do not add meta data with empty values.
565 Returns
566 -------
567 meta_data: dict
568 The meta-data of a BEXT chunk are stored in a flat dictionary
569 with the following keys:
571 - 'Description': a free description of the sequence.
572 - 'Originator': name of the originator/ producer of the audio file.
573 - 'OriginatorReference': unambiguous reference allocated by the originating organisation.
574 - 'OriginationDate': date of creation of audio sequence in yyyy:mm:dd.
575 - 'OriginationTime': time of creation of audio sequence in hh:mm:ss.
576 - 'TimeReference': first sample since midnight.
577 - 'Version': version of the BWF.
578 - 'UMID': unique material identifier.
579 - 'LoudnessValue': integrated loudness value.
580 - 'LoudnessRange': loudness range.
581 - 'MaxTruePeakLevel': maximum true peak value in dBTP.
582 - 'MaxMomentaryLoudness': highest value of the momentary loudness level.
583 - 'MaxShortTermLoudness': highest value of the short-term loudness level.
584 - 'Reserved': 180 bytes reserved for extension.
        - 'CodingHistory': description of coding processes applied to the audio data, with comma separated subfields: "A=" coding algorithm, e.g. PCM, "F=" sampling rate in Hertz, "B=" bit-rate for MPEG files, "W=" word length in bits, "M=" mono, stereo, dual-mono, joint-stereo, "T=" free text.
586 """
587 md = {}
588 size = struct.unpack('<I', sf.read(4))[0]
589 size += size % 2
590 s = sf.read(256).decode('ascii').strip(' \x00')
591 if s or store_empty:
592 md['Description'] = s
593 s = sf.read(32).decode('ascii').strip(' \x00')
594 if s or store_empty:
595 md['Originator'] = s
596 s = sf.read(32).decode('ascii').strip(' \x00')
597 if s or store_empty:
598 md['OriginatorReference'] = s
599 s = sf.read(10).decode('ascii').strip(' \x00')
600 if s or store_empty:
601 md['OriginationDate'] = s
602 s = sf.read(8).decode('ascii').strip(' \x00')
603 if s or store_empty:
604 md['OriginationTime'] = s
605 reference, version = struct.unpack('<QH', sf.read(10))
606 if reference > 0 or store_empty:
607 md['TimeReference'] = reference
608 if version > 0 or store_empty:
609 md['Version'] = version
610 s = sf.read(64).decode('ascii').strip(' \x00')
611 if s or store_empty:
612 md['UMID'] = s
613 lvalue, lrange, peak, momentary, shortterm = struct.unpack('<hhhhh', sf.read(10))
614 if lvalue > 0 or store_empty:
615 md['LoudnessValue'] = lvalue
616 if lrange > 0 or store_empty:
617 md['LoudnessRange'] = lrange
618 if peak > 0 or store_empty:
619 md['MaxTruePeakLevel'] = peak
620 if momentary > 0 or store_empty:
621 md['MaxMomentaryLoudness'] = momentary
622 if shortterm > 0 or store_empty:
623 md['MaxShortTermLoudness'] = shortterm
624 s = sf.read(180).decode('ascii').strip(' \x00')
625 if s or store_empty:
626 md['Reserved'] = s
627 size -= 256 + 32 + 32 + 10 + 8 + 8 + 2 + 64 + 10 + 180
628 s = sf.read(size).decode('ascii').strip(' \x00\n\r')
629 if s or store_empty:
630 md['CodingHistory'] = s
631 return md
634def read_ixml_chunk(sf, store_empty=True):
635 """Read in metadata from an IXML chunk.
637 See the variable `ixml_tags` for a list of valid tags.
639 See http://www.gallery.co.uk/ixml/ for the specification of iXML.
641 Parameters
642 ----------
643 sf: stream
644 File stream of RIFF file.
645 store_empty: bool
646 If `False` do not add meta data with empty values.
648 Returns
649 -------
650 metadata: nested dict
651 Dictionary with key-value pairs.
652 """
654 def parse_ixml(element, store_empty=True):
655 md = {}
656 for e in element:
            if e.text is not None:
658 md[e.tag] = e.text
659 elif len(e) > 0:
660 md[e.tag] = parse_ixml(e, store_empty)
661 elif store_empty:
662 md[e.tag] = ''
663 return md
665 size = struct.unpack('<I', sf.read(4))[0]
666 size += size % 2
667 xmls = sf.read(size).decode('latin-1').rstrip(' \x00')
668 root = ET.fromstring(xmls)
669 md = {root.tag: parse_ixml(root, store_empty)}
670 if len(md) == 1 and 'BWFXML' in md:
671 md = md['BWFXML']
672 return md
675def read_guano_chunk(sf):
676 """Read in metadata from a GUANO chunk.
678 GUANO is the Grand Unified Acoustic Notation Ontology, an
679 extensible, open format for embedding metadata within bat acoustic
680 recordings. See https://github.com/riggsd/guano-spec for details.
682 The GUANO specification allows for the inclusion of arbitrary
683 nested keys and string encoded values. In that respect it is a
684 well defined and easy to handle serialization of the [odML data
685 model](https://doi.org/10.3389/fninf.2011.00016).
687 Parameters
688 ----------
689 sf: stream
690 File stream of RIFF file.
692 Returns
693 -------
694 metadata: nested dict
695 Dictionary with key-value pairs.
697 """
698 md = {}
699 size = struct.unpack('<I', sf.read(4))[0]
700 size += size % 2
701 for line in io.StringIO(sf.read(size).decode('utf-8')):
702 ss = line.split(':')
703 if len(ss) > 1:
704 md[ss[0].strip()] = ':'.join(ss[1:]).strip().replace(r'\n', '\n')
705 return unflatten_metadata(md, '|')
708def read_cue_chunk(sf):
709 """Read in marker positions from cue chunk.
711 See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file
713 Parameters
714 ----------
715 sf: stream
716 File stream of RIFF file.
718 Returns
719 -------
720 locs: 2-D array of ints
721 Each row is a marker with unique identifier in the first column,
722 position in the second column, and span in the third column.
723 The cue chunk does not encode spans, so the third column is
724 initialized with zeros.
725 """
726 locs = []
727 size, n = struct.unpack('<II', sf.read(8))
728 for c in range(n):
729 cpid, cppos = struct.unpack('<II', sf.read(8))
730 datachunkid = sf.read(4).decode('latin-1').rstrip(' \x00').upper()
731 chunkstart, blockstart, offset = struct.unpack('<III', sf.read(12))
732 if datachunkid == 'DATA':
733 locs.append((cpid, cppos, 0))
734 return np.array(locs, dtype=int)
737def read_playlist_chunk(sf, locs):
738 """Read in marker spans from playlist chunk.
740 See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file
742 Parameters
743 ----------
744 sf: stream
745 File stream of RIFF file.
746 locs: 2-D array of ints
747 Markers as returned by the `read_cue_chunk()` function.
748 Each row is a marker with unique identifier in the first column,
749 position in the second column, and span in the third column.
750 The span is read in from the playlist chunk.
751 """
752 if len(locs) == 0:
        warnings.warn('read_playlist_chunk() requires markers from a previous cue chunk')
754 size, n = struct.unpack('<II', sf.read(8))
755 for p in range(n):
756 cpid, length, repeats = struct.unpack('<III', sf.read(12))
757 i = np.where(locs[:,0] == cpid)[0]
758 if len(i) > 0:
759 locs[i[0], 2] = length
762def read_adtl_chunks(sf, locs, labels):
763 """Read in associated data list chunks.
765 See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file
767 Parameters
768 ----------
769 sf: stream
770 File stream of RIFF file.
771 locs: 2-D array of ints
772 Markers as returned by the `read_cue_chunk()` function.
773 Each row is a marker with unique identifier in the first column,
774 position in the second column, and span in the third column.
775 The span is read in from the LTXT chunk.
776 labels: 2-D array of string objects
777 Labels (first column) and texts (second column) for each marker (rows)
778 from previous LABL, NOTE, and LTXT chunks.
780 Returns
781 -------
782 labels: 2-D array of string objects
783 Labels (first column) and texts (second column) for each marker (rows)
784 from LABL, NOTE (first column), and LTXT chunks (last column).
785 """
786 list_size = struct.unpack('<I', sf.read(4))[0]
787 list_type = sf.read(4).decode('latin-1').upper()
788 list_size -= 4
789 if list_type == 'ADTL':
790 if len(locs) == 0:
791 warnings.warn('read_adtl_chunks() requires markers from a previous cue chunk')
792 if len(labels) == 0:
793 labels = np.zeros((len(locs), 2), dtype=object)
794 while list_size >= 8:
795 key = sf.read(4).decode('latin-1').rstrip(' \x00').upper()
796 size, cpid = struct.unpack('<II', sf.read(8))
797 size += size % 2 - 4
798 if key == 'LABL' or key == 'NOTE':
799 label = sf.read(size).decode('latin-1').rstrip(' \x00')
800 i = np.where(locs[:,0] == cpid)[0]
801 if len(i) > 0:
802 i = i[0]
803 if hasattr(labels[i,0], '__len__') and len(labels[i,0]) > 0:
804 labels[i,0] += '|' + label
805 else:
806 labels[i,0] = label
807 elif key == 'LTXT':
808 length = struct.unpack('<I', sf.read(4))[0]
809 sf.read(12) # skip fields
810 text = sf.read(size - 4 - 12).decode('latin-1').rstrip(' \x00')
811 i = np.where(locs[:,0] == cpid)[0]
812 if len(i) > 0:
813 i = i[0]
814 if hasattr(labels[i,1], '__len__') and len(labels[i,1]) > 0:
815 labels[i,1] += '|' + text
816 else:
817 labels[i,1] = text
818 locs[i,2] = length
819 else:
820 sf.read(size)
821 list_size -= 12 + size
822 if list_size > 0: # finish or skip
823 sf.seek(list_size, os.SEEK_CUR)
824 return labels
827def read_lbl_chunk(sf, rate):
828 """Read in marker positions, spans, labels, and texts from lbl chunk.
830 The proprietary LBL chunk is specific to wave files generated by
    [AviSoft](https://www.avisoft.com) products.
833 The labels (first column of `labels`) have special meanings.
834 Markers with a span (a section label in the terminology of
835 AviSoft) can be arranged in three levels when displayed:
837 - "M": layer 1, the top level section
838 - "N": layer 2, sections below layer 1
839 - "O": layer 3, sections below layer 2
840 - "P": total, section start and end are displayed with two vertical lines.
842 All other labels mark single point labels with a time and a
843 frequency (that we here discard). See also
844 https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm
846 Parameters
847 ----------
848 sf: stream
849 File stream of RIFF file.
850 rate: float
851 Sampling rate of the data in Hertz.
853 Returns
854 -------
855 locs: 2-D array of ints
856 Each row is a marker with unique identifier (simply integers
857 enumerating the markers) in the first column, position in the
858 second column, and span in the third column.
859 labels: 2-D array of string objects
860 Labels (first column) and texts (second column) for
861 each marker (rows).
863 """
864 size = struct.unpack('<I', sf.read(4))[0]
865 nn = size // 65
866 locs = np.zeros((nn, 3), dtype=int)
867 labels = np.zeros((nn, 2), dtype=object)
868 n = 0
869 for c in range(nn):
870 line = sf.read(65).decode('ascii')
871 fields = line.split('\t')
872 if len(fields) >= 4:
873 labels[n,0] = fields[3].strip()
874 labels[n,1] = fields[2].strip()
875 start_idx = int(np.round(float(fields[0].strip('\x00'))*rate))
876 end_idx = int(np.round(float(fields[1].strip('\x00'))*rate))
877 locs[n,0] = n
878 locs[n,1] = start_idx
879 if labels[n,0] in 'MNOP':
880 locs[n,2] = end_idx - start_idx
881 else:
882 locs[n,2] = 0
883 n += 1
884 else:
            # the first 65 bytes are a title string that applies to
            # the whole wave file that can be set from the AVISoft
            # software. The recorder leaves this empty.
888 pass
889 return locs[:n,:], labels[:n,:]
892def metadata_riff(filepath, store_empty=False):
893 """Read metadata from a RIFF/WAVE file.
895 Parameters
896 ----------
897 filepath: string or file handle
898 The RIFF file.
899 store_empty: bool
900 If `False` do not add meta data with empty values.
902 Returns
903 -------
904 meta_data: nested dict
905 Meta data contained in the RIFF file. Keys of the nested
906 dictionaries are always strings. If the corresponding
907 values are dictionaries, then the key is the section name
908 of the metadata contained in the dictionary. All other
909 types of values are values for the respective key. In
910 particular they are strings, or list of strings. But other
911 simple types like ints or floats are also allowed.
912 First level contains sections of meta data
913 (e.g. keys 'INFO', 'BEXT', 'IXML', values are dictionaries).
915 Raises
916 ------
917 ValueError
918 Not a RIFF file.
    Examples
    --------
    ```
    from audioio.riffmetadata import metadata_riff
    from audioio import print_metadata

    md = metadata_riff('audio/file.wav')
    print_metadata(md)
    ```
929 """
930 meta_data = {}
931 sf = filepath
932 file_pos = None
933 if hasattr(filepath, 'read'):
934 file_pos = sf.tell()
935 sf.seek(0, os.SEEK_SET)
936 else:
937 sf = open(filepath, 'rb')
938 fsize = read_riff_header(sf)
939 while (sf.tell() < fsize - 8):
940 chunk = sf.read(4).decode('latin-1').upper()
941 if chunk == 'LIST':
942 md = read_info_chunks(sf, store_empty)
943 if len(md) > 0:
944 meta_data['INFO'] = md
945 elif chunk == 'BEXT':
946 md = read_bext_chunk(sf, store_empty)
947 if len(md) > 0:
948 meta_data['BEXT'] = md
949 elif chunk == 'IXML':
950 md = read_ixml_chunk(sf, store_empty)
951 if len(md) > 0:
952 meta_data['IXML'] = md
953 elif chunk == 'GUAN':
954 md = read_guano_chunk(sf)
955 if len(md) > 0:
956 meta_data.update(md)
957 else:
958 skip_chunk(sf)
959 if file_pos is None:
960 sf.close()
961 else:
962 sf.seek(file_pos, os.SEEK_SET)
963 return meta_data
966def markers_riff(filepath):
967 """Read markers from a RIFF/WAVE file.
969 Parameters
970 ----------
971 filepath: string or file handle
972 The RIFF file.
974 Returns
975 -------
976 locs: 2-D array of ints
977 Marker positions (first column) and spans (second column)
978 for each marker (rows).
979 labels: 2-D array of string objects
980 Labels (first column) and texts (second column)
981 for each marker (rows).
983 Raises
984 ------
985 ValueError
986 Not a RIFF file.
    Examples
    --------
    ```
    from audioio.riffmetadata import markers_riff
    from audioio import print_markers

    locs, labels = markers_riff('audio/file.wav')
    print_markers(locs, labels)
    ```
997 """
998 sf = filepath
999 file_pos = None
1000 if hasattr(filepath, 'read'):
1001 file_pos = sf.tell()
1002 sf.seek(0, os.SEEK_SET)
1003 else:
1004 sf = open(filepath, 'rb')
1005 rate = None
1006 locs = np.zeros((0, 3), dtype=int)
1007 labels = np.zeros((0, 2), dtype=object)
1008 fsize = read_riff_header(sf)
1009 while (sf.tell() < fsize - 8):
1010 chunk = sf.read(4).decode('latin-1').upper()
1011 if chunk == 'FMT ':
1012 rate = read_format_chunk(sf)[1]
1013 elif chunk == 'CUE ':
1014 locs = read_cue_chunk(sf)
1015 elif chunk == 'PLST':
1016 read_playlist_chunk(sf, locs)
1017 elif chunk == 'LIST':
1018 labels = read_adtl_chunks(sf, locs, labels)
1019 elif chunk == 'LBL ':
1020 locs, labels = read_lbl_chunk(sf, rate)
1021 else:
1022 skip_chunk(sf)
1023 if file_pos is None:
1024 sf.close()
1025 else:
1026 sf.seek(file_pos, os.SEEK_SET)
1027 # sort markers according to their position:
1028 if len(locs) > 0:
1029 idxs = np.argsort(locs[:,-2])
1030 locs = locs[idxs,:]
1031 if len(labels) > 0:
1032 labels = labels[idxs,:]
1033 return locs[:,1:], labels
1036# Write RIFF/WAVE file:
1038def write_riff_chunk(df, filesize=0, tag='WAVE'):
1039 """Write RIFF file header.
1041 Parameters
1042 ----------
1043 df: stream
1044 File stream for writing RIFF file header.
1045 filesize: int
1046 Size of the file in bytes.
1047 tag: str
1048 The type of RIFF file. Default is a wave file.
1049 Exactly 4 characters long.
1051 Returns
1052 -------
1053 n: int
1054 Number of bytes written to the stream.
1056 Raises
1057 ------
1058 ValueError
1059 `tag` is not 4 characters long.
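Examples
--------
A minimal sketch of the 12-byte header layout, built with `struct` and `io` alone rather than by calling this function. The helper name `riff_header` is hypothetical and not part of this module:

```python
import io
import struct

def riff_header(filesize, tag='WAVE'):
    # 'RIFF', then the file size minus the 8 bytes of this chunk
    # header as little-endian uint32, then the 4-character form type.
    return b'RIFF' + struct.pack('<I', max(filesize, 8) - 8) + tag.encode('ascii')

df = io.BytesIO()
n = df.write(riff_header(44))
size_field = struct.unpack('<I', df.getvalue()[4:8])[0]
print(n, size_field)
```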
1060 """
1061 if len(tag) != 4:
1062 raise ValueError(f'file tag "{tag}" must be exactly 4 characters long')
1063 if filesize < 8:
1064 filesize = 8
1065 df.write(b'RIFF')
1066 df.write(struct.pack('<I', filesize - 8))
1067 df.write(tag.encode('ascii', errors='strict'))
1068 return 12
1071def write_filesize(df, filesize=None):
1072 """Write the file size into the RIFF file header.
1074 Parameters
1075 ----------
1076 df: stream
1077 File stream into which to write `filesize`.
1078 filesize: int
1079 Size of the file in bytes. If not specified or 0,
1080 then use current size of the file.
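Examples
--------
A sketch of the underlying pattern, assuming an in-memory stream: write a header with a placeholder size, append chunks, then patch the size field at byte offset 4 and restore the stream position. It mirrors the steps of this function but does not call it:

```python
import io
import os
import struct

df = io.BytesIO()
df.write(b'RIFF' + struct.pack('<I', 0) + b'WAVE')    # size not yet known
df.write(b'data' + struct.pack('<I', 4) + bytes(4))   # a 4-byte data chunk
pos = df.tell()
# patch the size field with the total size minus 8:
df.seek(0, os.SEEK_END)
filesize = df.tell()
df.seek(4, os.SEEK_SET)
df.write(struct.pack('<I', filesize - 8))
df.seek(pos, os.SEEK_SET)                             # restore position
stored = struct.unpack('<I', df.getvalue()[4:8])[0]
```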
1081 """
1082 pos = df.tell()
1083 if not filesize:
1084 df.seek(0, os.SEEK_END)
1085 filesize = df.tell()
1086 df.seek(4, os.SEEK_SET)
1087 df.write(struct.pack('<I', filesize - 8))
1088 df.seek(pos, os.SEEK_SET)
1091def write_chunk_name(df, pos, tag):
1092 """Change the name of a chunk.
1094 Use this to make readers ignore the content of an existing chunk by
1095 overwriting its name with an unknown one.
1097 Parameters
1098 ----------
1099 df: stream
1100 File stream.
1101 pos: int
1102 Position of the chunk in the file stream.
1103 tag: str
1104 The new name of the chunk.
1105 Exactly 4 characters long.
1107 Raises
1108 ------
1109 ValueError
1110 `tag` is not 4 characters long.
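Examples
--------
A sketch of blanking out a chunk by renaming it in place, using an in-memory stream instead of this function. The replacement tag `BExx` follows the scheme used by `append_riff()`; the stream content is made up for illustration:

```python
import io
import os
import struct

buf = io.BytesIO()
buf.write(b'RIFF' + struct.pack('<I', 14) + b'WAVE')
pos = buf.tell()                                  # start of the chunk
buf.write(b'BEXT' + struct.pack('<I', 2) + b'ab')
# overwrite only the 4-byte name, size and content stay untouched:
xtag = 'BEXT'[:2] + 'xx'
buf.seek(pos, os.SEEK_SET)
buf.write(xtag.encode('ascii'))
renamed = buf.getvalue()[pos:pos + 4]
```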
1111 """
1112 if len(tag) != 4:
1113 raise ValueError(f'file tag "{tag}" must be exactly 4 characters long')
1114 df.seek(pos, os.SEEK_SET)
1115 df.write(tag.encode('ascii', errors='strict'))
1118def write_format_chunk(df, channels, frames, rate, bits=16):
1119 """Write format chunk.
1121 Parameters
1122 ----------
1123 df: stream
1124 File stream for writing FMT chunk.
1125 channels: int
1126 Number of channels contained in the data.
1127 frames: int
1128 Number of frames contained in the data.
1129 rate: int or float
1130 Sampling rate (frames per time) in Hertz.
1131 bits: 16 or 32
1132 Bit resolution of the data to be written.
1134 Returns
1135 -------
1136 n: int
1137 Number of bytes written to the stream.
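Examples
--------
A sketch of the arithmetic behind the PCM format chunk for 16-bit stereo at 44.1 kHz, written with `struct` alone rather than by calling this function:

```python
import struct

channels, rate, bits = 2, 44100, 16
blockalign = channels * (bits // 8)    # bytes per frame
byterate = rate * blockalign           # bytes per second
# chunk size 16, format tag 1 (PCM), then the format fields:
chunk = b'fmt ' + struct.pack('<IHHIIHH', 16, 1, channels, rate,
                              byterate, blockalign, bits)
print(len(chunk), blockalign, byterate)
```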
1138 """
1139 blockalign = channels * (bits//8)
1140 byterate = int(rate) * blockalign
1141 df.write(b'fmt ')
1142 df.write(struct.pack('<IHHIIHH', 16, 1, channels, int(rate),
1143 byterate, blockalign, bits))
1144 return 8 + 16
1147def write_data_chunk(df, data, bits=16):
1148 """Write data chunk.
1150 Parameters
1151 ----------
1152 df: stream
1153 File stream for writing data chunk.
1154 data: 1-D or 2-D array of floats
1155 Data with first axis time (frames) and optional second axis
1156 channels with values between -1 and 1.
1157 bits: 16 or 32
1158 Bit resolution of the data to be written.
1160 Returns
1161 -------
1162 n: int
1163 Number of bytes written to the stream.
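Examples
--------
A sketch of the float-to-integer scaling for 16-bit data, using plain lists and `struct` instead of numpy arrays. Note that a sample of exactly 1.0 would scale to 32768, which is outside the int16 range, so the sample values here deliberately avoid it:

```python
import struct

samples = [0.0, 0.5, -0.5, -1.0]
bits = 16
# scale to signed integers, truncating toward zero:
ints = [int(x * 2**(bits - 1)) for x in samples]
payload = struct.pack(f'<{len(ints)}h', *ints)
chunk = b'data' + struct.pack('<I', len(payload)) + payload
print(ints, len(chunk))
```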
1164 """
1165 df.write(b'data')
1166 df.write(struct.pack('<I', data.size * (bits//8)))
1167 buffer = data * 2**(bits-1)
1168 n = df.write(buffer.astype(f'<i{bits//8}').tobytes('C'))
1169 return 8 + n
1172def write_info_chunk(df, metadata):
1173 """Write metadata to LIST INFO chunk.
1175 If `metadata` contains an 'INFO' key, then write the flat
1176 dictionary of this key as an INFO chunk. Otherwise, attempt to
1177 write all metadata items as an INFO chunk. The keys are translated
1178 via the `info_tags` variable back to INFO tags. If after
1179 translation any key is left that is longer than 4 characters or
1180 any key has a dictionary as a value (non-flat metadata), the INFO
1181 chunk is not written.
1183 See https://exiftool.org/TagNames/RIFF.html#Info for valid info tags.
1185 Parameters
1186 ----------
1187 df: stream
1188 File stream for writing INFO chunk.
1189 metadata: nested dict
1190 Metadata as key-value pairs. Values can be strings, integers,
1191 or dictionaries.
1193 Returns
1194 -------
1195 n: int
1196 Number of bytes written to the stream.
1197 keys_written: list of str
1198 Keys written to the INFO chunk.
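Examples
--------
A sketch of how INFO items are laid out: a 4-character tag, the value size, and the value padded with a blank to an even number of bytes. The helper `info_item` is hypothetical and only illustrates the layout:

```python
import struct

def info_item(tag, value):
    v = str(value).encode('latin-1')
    if len(v) % 2 == 1:
        v += b' '                      # pad to an even number of bytes
    return f'{tag:<4s}'.encode('latin-1') + struct.pack('<I', len(v)) + v

items = info_item('IART', 'JB') + info_item('ICMT', 'abc')
# the LIST size covers the 'INFO' identifier plus all items:
chunk = b'LIST' + struct.pack('<I', len(items) + 4) + b'INFO' + items
print(len(chunk))
```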
1200 """
1201 if not metadata:
1202 return 0, []
1203 is_info = False
1204 if 'INFO' in metadata:
1205 metadata = metadata['INFO']
1206 is_info = True
1207 tags = {v: k for k, v in info_tags.items()}
1208 n = 0
1209 for k in metadata:
1210 kn = tags.get(k, k)
1211 if len(kn) > 4:
1212 if is_info:
1213 warnings.warn(f'no 4-character info tag for key "{k}" found.')
1214 return 0, []
1215 if isinstance(metadata[k], dict):
1216 if is_info:
1217 warnings.warn(f'value of key "{k}" in INFO chunk cannot be a dictionary.')
1218 return 0, []
1219 try:
1220 v = str(metadata[k]).encode('latin-1')
1221 except UnicodeEncodeError:
1222 v = str(metadata[k]).encode('windows-1252')
1223 n += 8 + len(v) + len(v) % 2
1224 df.write(b'LIST')
1225 df.write(struct.pack('<I', n + 4))
1226 df.write(b'INFO')
1227 keys_written = []
1228 for k in metadata:
1229 kn = tags.get(k, k)
1230 df.write(f'{kn:<4s}'.encode('latin-1'))
1231 try:
1232 v = str(metadata[k]).encode('latin-1')
1233 except UnicodeEncodeError:
1234 v = str(metadata[k]).encode('windows-1252')
1235 ns = len(v) + len(v) % 2
1236 if ns > len(v):
1237 v += b' '
1238 df.write(struct.pack('<I', ns))
1239 df.write(v)
1240 keys_written.append(k)
1241 return 12 + n, ['INFO'] if is_info else keys_written
1244def write_bext_chunk(df, metadata):
1245 """Write metadata to BEXT chunk.
1247 If `metadata` contains a BEXT key, and this contains valid BEXT
1248 tags (one of the keys listed in the variable `bext_tags`), then
1249 write the dictionary of that key as a broadcast-audio extension
1250 chunk.
1252 See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.
1254 Parameters
1255 ----------
1256 df: stream
1257 File stream for writing BEXT chunk.
1258 metadata: nested dict
1259 Metadata as key-value pairs. Values can be strings, integers,
1260 or dictionaries.
1262 Returns
1263 -------
1264 n: int
1265 Number of bytes written to the stream.
1266 keys_written: list of str
1267 Keys written to the BEXT chunk.
1269 """
1270 if not metadata or not 'BEXT' in metadata:
1271 return 0, []
1272 metadata = metadata['BEXT']
1273 for k in metadata:
1274 if not k in bext_tags:
1275 warnings.warn(f'no bext tag for key "{k}" found.')
1276 return 0, []
1277 n = 0
1278 for k in bext_tags:
1279 n += bext_tags[k]
1280 ch = metadata.get('CodingHistory', '').encode('ascii', errors='replace')
1281 if len(ch) >= 2 and ch[-2:] != b'\r\n':
1282 ch += b'\r\n'
1283 nch = len(ch) + len(ch) % 2
1284 n += nch
1285 df.write(b'BEXT')
1286 df.write(struct.pack('<I', n))
1287 for k in bext_tags:
1288 bn = bext_tags[k]
1289 if bn == 2:
1290 v = metadata.get(k, '0')
1291 df.write(struct.pack('<H', int(v)))
1292 elif bn == 8 and k == 'TimeReference':
1293 v = metadata.get(k, '0')
1294 df.write(struct.pack('<Q', int(v)))
1295 elif bn == 0:
1296 df.write(ch)
1297 df.write(bytes(nch - len(ch)))
1298 else:
1299 v = metadata.get(k, '').encode('ascii', errors='replace')
1300 df.write(v[:bn].ljust(bn, b'\0'))
1301 return 8 + n, ['BEXT']
1304def write_ixml_chunk(df, metadata, keys_written=None):
1305 """Write metadata to iXML chunk.
1307 If `metadata` contains an IXML key with valid IXML tags (one of
1308 those listed in the variable `ixml_tags`), or the remaining tags
1309 in `metadata` are valid IXML tags, then write an IXML chunk.
1311 See http://www.gallery.co.uk/ixml/ for the specification of iXML.
1313 Parameters
1314 ----------
1315 df: stream
1316 File stream for writing IXML chunk.
1317 metadata: nested dict
1318 Metadata as key-value pairs. Values can be strings, integers,
1319 or dictionaries.
1320 keys_written: list of str
1321 Keys that have already been written to INFO or BEXT chunk.
1323 Returns
1324 -------
1325 n: int
1326 Number of bytes written to the stream.
1327 keys_written: list of str
1328 Keys written to the IXML chunk.
1330 """
1331 def check_ixml(metadata):
1332 for k in metadata:
1333 if not k.upper() in ixml_tags:
1334 return False
1335 if isinstance(metadata[k], dict):
1336 if not check_ixml(metadata[k]):
1337 return False
1338 return True
1340 def build_xml(node, metadata):
1341 kw = []
1342 for k in metadata:
1343 e = ET.SubElement(node, k)
1344 if isinstance(metadata[k], dict):
1345 build_xml(e, metadata[k])
1346 else:
1347 e.text = str(metadata[k])
1348 kw.append(k)
1349 return kw
1351 if not metadata:
1352 return 0, []
1353 md = metadata
1354 if keys_written:
1355 md = {k: metadata[k] for k in metadata if not k in keys_written}
1356 if len(md) == 0:
1357 return 0, []
1358 has_ixml = False
1359 if 'IXML' in md and check_ixml(md['IXML']):
1360 md = md['IXML']
1361 has_ixml = True
1362 else:
1363 if not check_ixml(md):
1364 return 0, []
1365 root = ET.Element('BWFXML')
1366 kw = build_xml(root, md)
1367 bs = bytes(ET.tostring(root, xml_declaration=True,
1368 short_empty_elements=False))
1369 if len(bs) % 2 == 1:
1370 bs += bytes(1)
1371 df.write(b'IXML')
1372 df.write(struct.pack('<I', len(bs)))
1373 df.write(bs)
1374 return 8 + len(bs), ['IXML'] if has_ixml else kw
1377def write_guano_chunk(df, metadata, keys_written=None):
1378 """Write metadata to guan chunk.
1380 GUANO is the Grand Unified Acoustic Notation Ontology, an
1381 extensible, open format for embedding metadata within bat acoustic
1382 recordings. See https://github.com/riggsd/guano-spec for details.
1384 The GUANO specification allows for the inclusion of arbitrary
1385 nested keys and string encoded values. In that respect it is a
1386 well defined and easy to handle serialization of the [odML data
1387 model](https://doi.org/10.3389/fninf.2011.00016).
1389 This will write *all* metadata that are not in `keys_written`.
1391 Parameters
1392 ----------
1393 df: stream
1394 File stream for writing guano chunk.
1395 metadata: nested dict
1396 Metadata as key-value pairs. Values can be strings, integers,
1397 or dictionaries.
1398 keys_written: list of str
1399 Keys that have already been written to INFO, BEXT, or IXML chunks.
1401 Returns
1402 -------
1403 n: int
1404 Number of bytes written to the stream.
1405 keys_written: list of str
1406 Top-level keys written to the GUANO chunk.
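Examples
--------
A sketch of the GUANO serialization: nested keys are joined with `|`, each key-value pair becomes one line, and the text is padded to an even number of bytes. The helper `flatten` is a hypothetical stand-in for `flatten_metadata(md, True, '|')`, whose exact behavior is assumed here:

```python
import struct

def flatten(md, prefix=''):
    flat = {}
    for k, v in md.items():
        key = f'{prefix}|{k}' if prefix else k
        if isinstance(v, dict):
            flat.update(flatten(v, key))   # descend into sections
        else:
            flat[key] = v
    return flat

md = dict(Recording=dict(Species='bat', Gain=20))
lines = ['GUANO|Version:1.0'] + [f'{k}:{v}' for k, v in flatten(md).items()]
bs = ''.join(line + chr(10) for line in lines).encode('utf-8')
if len(bs) % 2 == 1:
    bs += b' '
chunk = b'guan' + struct.pack('<I', len(bs)) + bs
```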
1408 """
1409 if not metadata:
1410 return 0, []
1411 md = metadata
1412 if keys_written:
1413 md = {k: metadata[k] for k in metadata if not k in keys_written}
1414 if len(md) == 0:
1415 return 0, []
1416 fmd = flatten_metadata(md, True, '|')
1417 for k in fmd:
1418 if isinstance(fmd[k], str):
1419 fmd[k] = fmd[k].replace('\n', r'\n')
1420 sio = io.StringIO()
1421 m, k = find_key(md, 'GUANO.Version')
1422 if k is None:
1423 sio.write('GUANO|Version:1.0\n')
1424 for k in fmd:
1425 sio.write(f'{k}:{fmd[k]}\n')
1426 bs = sio.getvalue().encode('utf-8')
1427 if len(bs) % 2 == 1:
1428 bs += b' '
1429 n = len(bs)
1430 df.write(b'guan')
1431 df.write(struct.pack('<I', n))
1432 df.write(bs)
1433 return 8 + n, list(md)
1436def write_cue_chunk(df, locs):
1437 """Write marker positions to cue chunk.
1439 See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file
1441 Parameters
1442 ----------
1443 df: stream
1444 File stream for writing cue chunk.
1445 locs: None or 2-D array of ints
1446 Positions (first column) and spans (optional second column)
1447 for each marker (rows).
1449 Returns
1450 -------
1451 n: int
1452 Number of bytes written to the stream.
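Examples
--------
A sketch of the cue chunk layout for two markers, built with `struct` alone. Each cue point takes 24 bytes: cue id, position, the `data` chunk id, and three zeroed offset fields, as in this function:

```python
import struct

positions = [100, 2500]
body = struct.pack('<I', len(positions))          # number of cue points
for i, pos in enumerate(positions):
    body += struct.pack('<II4sIII', i, pos, b'data', 0, 0, 0)
chunk = b'CUE ' + struct.pack('<I', len(body)) + body
print(len(chunk))
```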
1453 """
1454 if locs is None or len(locs) == 0:
1455 return 0
1456 df.write(b'CUE ')
1457 df.write(struct.pack('<II', 4 + len(locs)*24, len(locs)))
1458 for i in range(len(locs)):
1459 df.write(struct.pack('<II4sIII', i, locs[i,0], b'data', 0, 0, 0))
1460 return 12 + len(locs)*24
1463def write_playlist_chunk(df, locs):
1464 """Write marker spans to playlist chunk.
1466 See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file
1468 Parameters
1469 ----------
1470 df: stream
1471 File stream for writing playlist chunk.
1472 locs: None or 2-D array of ints
1473 Positions (first column) and spans (optional second column)
1474 for each marker (rows).
1476 Returns
1477 -------
1478 n: int
1479 Number of bytes written to the stream.
1480 """
1481 if locs is None or len(locs) == 0 or locs.shape[1] < 2:
1482 return 0
1483 n_spans = np.sum(locs[:,1] > 0)
1484 if n_spans == 0:
1485 return 0
1486 df.write(b'plst')
1487 df.write(struct.pack('<II', 4 + n_spans*12, n_spans))
1488 for i in range(len(locs)):
1489 if locs[i,1] > 0:
1490 df.write(struct.pack('<III', i, locs[i,1], 1))
1491 return 12 + n_spans*12
1494def write_adtl_chunks(df, locs, labels):
1495 """Write associated data list chunks.
1497 See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file
1499 Parameters
1500 ----------
1501 df: stream
1502 File stream for writing adtl chunk.
1503 locs: None or 2-D array of ints
1504 Positions (first column) and spans (optional second column)
1505 for each marker (rows).
1506 labels: None or 2-D array of string objects
1507 Labels (first column) and texts (second column) for each marker (rows).
1509 Returns
1510 -------
1511 n: int
1512 Number of bytes written to the stream.
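Examples
--------
A sketch of a single `labl` sub-chunk as written inside the `LIST adtl` chunk: the label text is padded with blanks to an even length and preceded by the id of the cue point it belongs to. The helper `labl` is hypothetical:

```python
import struct

def labl(cue_id, label):
    n = len(label) + len(label) % 2               # even text length
    return (b'labl' + struct.pack('<II', 4 + n, cue_id)
            + f'{label:<{n}s}'.encode('latin-1', errors='replace'))

sub = labl(0, 'abc')
print(len(sub))
```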
1513 """
1514 if labels is None or len(labels) == 0:
1515 return 0
1516 labels_size = 0
1517 for l in labels[:,0]:
1518 if hasattr(l, '__len__'):
1519 n = len(l)
1520 if n > 0:
1521 labels_size += 12 + n + n % 2
1522 text_size = 0
1523 if labels.shape[1] > 1:
1524 for t in labels[:,1]:
1525 if hasattr(t, '__len__'):
1526 n = len(t)
1527 if n > 0:
1528 text_size += 28 + n + n % 2
1529 if labels_size == 0 and text_size == 0:
1530 return 0
1531 size = 4 + labels_size + text_size
1532 spans = locs[:,1] if locs.shape[1] > 1 else None
1533 df.write(b'LIST')
1534 df.write(struct.pack('<I', size))
1535 df.write(b'adtl')
1536 for i in range(len(labels)):
1537 # labl sub-chunk:
1538 l = labels[i,0]
1539 if hasattr(l, '__len__'):
1540 n = len(l)
1541 if n > 0:
1542 n += n % 2
1543 df.write(b'labl')
1544 df.write(struct.pack('<II', 4 + n, i))
1545 df.write(f'{l:<{n}s}'.encode('latin-1', errors='replace'))
1546 # ltxt sub-chunk:
1547 if labels.shape[1] > 1:
1548 t = labels[i,1]
1549 if hasattr(t, '__len__'):
1550 n = len(t)
1551 if n > 0:
1552 n += n % 2
1553 span = spans[i] if spans is not None else 0
1554 df.write(b'ltxt')
1555 df.write(struct.pack('<III', 20 + n, i, span))
1556 df.write(struct.pack('<IHHHH', 0, 0, 0, 0, 0))
1557 df.write(f'{t:<{n}s}'.encode('latin-1', errors='replace'))
1558 return 8 + size
1561def write_lbl_chunk(df, locs, labels, rate):
1562 """Write marker positions, spans, labels, and texts to lbl chunk.
1564 The proprietary LBL chunk is specific to wave files generated by
1565 [AviSoft](https://www.avisoft.com) products.
1567 The labels (first column of `labels`) have special meanings.
1568 Markers with a span (a section label in the terminology of
1569 AviSoft) can be arranged in three levels when displayed:
1571 - "M": layer 1, the top level section
1572 - "N": layer 2, sections below layer 1
1573 - "O": layer 3, sections below layer 2
1574 - "P": total, section start and end are displayed with two vertical lines.
1576 All other labels mark single point labels with a time and a
1577 frequency (that we here discard). See also
1578 https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm
1580 If a marker has a span, and its label is not one of "M", "N", "O", or "P",
1581 then its label is set to "M".
1582 If a marker has no span, and its label is one of "M", "N", "O", or "P",
1583 then its label is set to "a".
1585 Parameters
1586 ----------
1587 df: stream
1588 File stream for writing lbl chunk.
1589 locs: None or 2-D array of ints
1590 Positions (first column) and spans (optional second column)
1591 for each marker (rows).
1592 labels: None or 2-D array of string objects
1593 Labels (first column) and texts (second column) for each marker (rows).
1594 rate: float
1595 Sampling rate of the data in Hertz.
1597 Returns
1598 -------
1599 n: int
1600 Number of bytes written to the stream.
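Examples
--------
A sketch of one 65-byte LBL record for a section marker, assuming a sampling rate of 1 kHz: start and end time as 14-byte text fields in seconds, each followed by a tab, then the text padded to 31 characters, a tab, the one-character label, and a CR LF pair. The marker values are made up for illustration:

```python
import struct

rate = 1000.0
pos, span, label, text = 500, 250, 'M', 'call'
t0 = pos / rate                       # start time in seconds
t1 = t0 + span / rate                 # end time in seconds
tab = bytes([9])
crlf = bytes([13, 10])
record = struct.pack('<14sc', f'{t0:e}'.encode('ascii'), tab)
record += struct.pack('<14sc', f'{t1:e}'.encode('ascii'), tab)
record += f'{text:31s}'.encode('ascii') + tab + label.encode('ascii') + crlf
print(len(record))
```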
1602 """
1603 if locs is None or len(locs) == 0:
1604 return 0
1605 size = (1 + len(locs)) * 65
1606 df.write(b'LBL ')
1607 df.write(struct.pack('<I', size))
1608 # first empty entry (this is meant to be a title for the whole wave file):
1609 df.write(b' ' * 63)
1610 df.write(b'\r\n')
1611 for k in range(len(locs)):
1612 t0 = locs[k,0]/rate
1613 t1 = t0
1614 t1 += locs[k,1]/rate
1615 ls = 'M' if locs[k,1] > 0 else 'a'
1616 ts = ''
1617 if labels is not None and len(labels) > k:
1618 ls = labels[k,0]
1619 if ls != 0 and len(ls) > 0:
1620 ls = ls[0]
1621 if ls in 'MNOP':
1622 if locs[k,1] == 0:
1623 ls = 'a'
1624 else:
1625 if locs[k,1] > 0:
1626 ls = 'M'
1627 ts = labels[k,1]
1628 if ts == 0:
1629 ts = ''
1630 df.write(struct.pack('<14sc', f'{t0:e}'.encode('ascii', errors='replace'), b'\t'))
1631 df.write(struct.pack('<14sc', f'{t1:e}'.encode('ascii', errors='replace'), b'\t'))
1632 bs = f'{ts[:31]:31s}\t{ls}\r\n'.encode('ascii', errors='replace')
1633 df.write(bs)
1634 return 8 + size
1637def append_metadata_riff(df, metadata):
1638 """Append metadata chunks to RIFF file.
1640 You still need to update the filesize by calling
1641 `write_filesize()`.
1643 Parameters
1644 ----------
1645 df: stream
1646 File stream for writing metadata chunks.
1647 metadata: None or nested dict
1648 Metadata as key-value pairs. Values can be strings, integers,
1649 or dictionaries.
1651 Returns
1652 -------
1653 n: int
1654 Number of bytes written to the stream.
1655 tags: list of str
1656 Tag names of chunks written to audio file.
1657 """
1658 if not metadata:
1659 return 0, []
1660 n = 0
1661 tags = []
1662 # metadata INFO chunk:
1663 nc, kw = write_info_chunk(df, metadata)
1664 if nc > 0:
1665 tags.append('LIST-INFO')
1666 n += nc
1667 # metadata BEXT chunk:
1668 nc, bkw = write_bext_chunk(df, metadata)
1669 if nc > 0:
1670 tags.append('BEXT')
1671 n += nc
1672 kw.extend(bkw)
1673 # metadata IXML chunk:
1674 nc, xkw = write_ixml_chunk(df, metadata, kw)
1675 if nc > 0:
1676 tags.append('IXML')
1677 n += nc
1678 kw.extend(xkw)
1679 # write remaining metadata to GUANO chunk:
1680 nc, _ = write_guano_chunk(df, metadata, kw)
1681 if nc > 0:
1682 tags.append('GUAN')
1683 n += nc
1685 return n, tags
1688def append_markers_riff(df, locs, labels=None, rate=None,
1689 marker_hint='cue'):
1690 """Append marker chunks to RIFF file.
1692 You still need to update the filesize by calling
1693 `write_filesize()`.
1695 Parameters
1696 ----------
1697 df: stream
1698 File stream for writing marker chunks.
1699 locs: None or 1-D or 2-D array of ints
1700 Marker positions (first column) and spans (optional second column)
1701 for each marker (rows).
1702 labels: None or 1-D or 2-D array of string objects
1703 Labels (first column) and texts (optional second column)
1704 for each marker (rows).
1705 rate: float
1706 Sampling rate of the data in Hertz, needed for storing markers
1707 in seconds.
1708 marker_hint: str
1709 - 'cue': store markers in cue and adtl chunks.
1710 - 'lbl': store markers in avisoft lbl chunk.
1712 Returns
1713 -------
1714 n: int
1715 Number of bytes written to the stream.
1716 tags: list of str
1717 Tag names of chunks written to audio file.
1719 Raises
1720 ------
1721 ValueError
1722 `marker_hint` not supported.
1723 IndexError
1724 `locs` and `labels` differ in len.
1725 """
1726 if locs is None or len(locs) == 0:
1727 return 0, []
1728 if labels is not None and len(labels) > 0 and len(labels) != len(locs):
1729 raise IndexError(f'locs and labels must have same number of elements.')
1730 # make locs and labels 2-D:
1731 if not locs is None and locs.ndim == 1:
1732 locs = locs.reshape(-1, 1)
1733 if not labels is None and labels.ndim == 1:
1734 labels = labels.reshape(-1, 1)
1735 # sort markers according to their position:
1736 idxs = np.argsort(locs[:,0])
1737 locs = locs[idxs,:]
1738 if not labels is None and len(labels) > 0:
1739 labels = labels[idxs,:]
1740 n = 0
1741 tags = []
1742 if marker_hint.lower() == 'cue':
1743 # write marker positions:
1744 nc = write_cue_chunk(df, locs)
1745 if nc > 0:
1746 tags.append('CUE ')
1747 n += nc
1748 # write marker spans:
1749 nc = write_playlist_chunk(df, locs)
1750 if nc > 0:
1751 tags.append('PLST')
1752 n += nc
1753 # write marker labels:
1754 nc = write_adtl_chunks(df, locs, labels)
1755 if nc > 0:
1756 tags.append('LIST-ADTL')
1757 n += nc
1758 elif marker_hint.lower() == 'lbl':
1759 # write avisoft labels:
1760 nc = write_lbl_chunk(df, locs, labels, rate)
1761 if nc > 0:
1762 tags.append('LBL ')
1763 n += nc
1764 else:
1765 raise ValueError(f'marker_hint "{marker_hint}" not supported for storing markers')
1766 return n, tags
1769def write_wave(filepath, data, rate, metadata=None, locs=None,
1770 labels=None, encoding=None, marker_hint='cue'):
1771 """Write time series, metadata and markers to a WAVE file.
1773 Only 16-bit or 32-bit PCM encoding is supported.
1775 Parameters
1776 ----------
1777 filepath: string
1778 Full path and name of the file to write.
1779 data: 1-D or 2-D array of floats
1780 Array with the data (first index time, second index channel,
1781 values within -1.0 and 1.0).
1782 rate: float
1783 Sampling rate of the data in Hertz.
1784 metadata: None or nested dict
1785 Metadata as key-value pairs. Values can be strings, integers,
1786 or dictionaries.
1787 locs: None or 1-D or 2-D array of ints
1788 Marker positions (first column) and spans (optional second column)
1789 for each marker (rows).
1790 labels: None or 1-D or 2-D array of string objects
1791 Labels (first column) and texts (optional second column)
1792 for each marker (rows).
1793 encoding: string or None
1794 Encoding of the data: 'PCM_32' or 'PCM_16'.
1795 If None or empty string use 'PCM_16'.
1796 marker_hint: str
1797 - 'cue': store markers in cue and adtl chunks.
1798 - 'lbl': store markers in avisoft lbl chunk.
1800 Raises
1801 ------
1802 ValueError
1803 Encoding not supported.
1804 IndexError
1805 `locs` and `labels` differ in len.
1807 See Also
1808 --------
1809 audioio.audiowriter.write_audio()
1811 Examples
1812 --------
1813 ```
1814 import numpy as np
1815 from audioio.riffmetadata import write_wave
1817 rate = 28000.0
1818 freq = 800.0
1819 time = np.arange(0.0, 1.0, 1/rate) # one second
1820 data = np.sin(2.0*np.pi*freq*time) # 800Hz sine wave
1821 md = dict(Artist='underscore_') # metadata
1823 write_wave('audio/file.wav', data, rate, md)
1824 ```
1825 """
1826 if not filepath:
1827 raise ValueError('no file specified!')
1828 if not encoding:
1829 encoding = 'PCM_16'
1830 encoding = encoding.upper()
1831 bits = 0
1832 if encoding == 'PCM_16':
1833 bits = 16
1834 elif encoding == 'PCM_32':
1835 bits = 32
1836 else:
1837 raise ValueError(f'file encoding {encoding} not supported')
1838 if locs is not None and len(locs) > 0 and \
1839 labels is not None and len(labels) > 0 and len(labels) != len(locs):
1840 raise IndexError(f'locs and labels must have same number of elements.')
1841 # write WAVE file:
1842 with open(filepath, 'wb') as df:
1843 write_riff_chunk(df)
1844 if data.ndim == 1:
1845 write_format_chunk(df, 1, len(data), rate, bits)
1846 else:
1847 write_format_chunk(df, data.shape[1], data.shape[0],
1848 rate, bits)
1849 append_metadata_riff(df, metadata)
1850 write_data_chunk(df, data, bits)
1851 append_markers_riff(df, locs, labels, rate, marker_hint)
1852 write_filesize(df)
1855def append_riff(filepath, metadata=None, locs=None, labels=None,
1856 rate=None, marker_hint='cue'):
1857 """Append metadata and markers to an existing RIFF file.
1859 Parameters
1860 ----------
1861 filepath: string
1862 Full path and name of the file to write.
1863 metadata: None or nested dict
1864 Metadata as key-value pairs. Values can be strings, integers,
1865 or dictionaries.
1866 locs: None or 1-D or 2-D array of ints
1867 Marker positions (first column) and spans (optional second column)
1868 for each marker (rows).
1869 labels: None or 1-D or 2-D array of string objects
1870 Labels (first column) and texts (optional second column)
1871 for each marker (rows).
1872 rate: float
1873 Sampling rate of the data in Hertz, needed for storing markers
1874 in seconds.
1875 marker_hint: str
1876 - 'cue': store markers in cue and adtl chunks.
1877 - 'lbl': store markers in avisoft lbl chunk.
1879 Returns
1880 -------
1881 n: int
1882 Number of bytes written to the stream.
1884 Raises
1885 ------
1886 IndexError
1887 `locs` and `labels` differ in len.
1889 Examples
1890 --------
1891 ```
1892 import numpy as np
1893 from audioio.riffmetadata import append_riff
1895 md = dict(Artist='underscore_') # metadata
1896 append_riff('audio/file.wav', md) # append them to existing audio file
1897 ```
1898 """
1899 if not filepath:
1900 raise ValueError('no file specified!')
1901 if locs is not None and len(locs) > 0 and \
1902 labels is not None and len(labels) > 0 and len(labels) != len(locs):
1903 raise IndexError(f'locs and labels must have same number of elements.')
1904 # check RIFF file:
1905 chunks = read_chunk_tags(filepath)
1906 # append to RIFF file:
1907 n = 0
1908 with open(filepath, 'r+b') as df:
1909 tags = []
1910 df.seek(0, os.SEEK_END)
1911 nc, tgs = append_metadata_riff(df, metadata)
1912 n += nc
1913 tags.extend(tgs)
1914 nc, tgs = append_markers_riff(df, locs, labels, rate, marker_hint)
1915 n += nc
1916 tags.extend(tgs)
1917 write_filesize(df)
1918 # blank out already existing chunks:
1919 for tag in chunks:
1920 if tag in tags:
1921 if '-' in tag:
1922 xtag = tag[5:7] + 'xx'
1923 else:
1924 xtag = tag[:2] + 'xx'
1925 write_chunk_name(df, chunks[tag][0], xtag)
1926 return n
1929def demo(filepath):
1930 """Print metadata and markers of a RIFF/WAVE file.
1932 Parameters
1933 ----------
1934 filepath: string
1935 Path of a RIFF/WAVE file.
1936 """
1937 def print_meta_data(meta_data, level=0):
1938 for sk in meta_data:
1939 md = meta_data[sk]
1940 if isinstance(md, dict):
1941 print(f'{"":<{level*4}}{sk}:')
1942 print_meta_data(md, level+1)
1943 else:
1944 v = str(md).replace('\n', '.').replace('\r', '.')
1945 print(f'{"":<{level*4}s}{sk:<20s}: {v}')
1947 # read meta data:
1948 meta_data = metadata_riff(filepath, store_empty=False)
1950 # print meta data:
1951 print()
1952 print('metadata:')
1953 print_meta_data(meta_data)
1955 # read cues:
1956 locs, labels = markers_riff(filepath)
1958 # print marker table:
1959 if len(locs) > 0:
1960 print()
1961 print('markers:')
1962 print(f'{"position":10} {"span":8} {"label":10} {"text":10}')
1963 for i in range(len(locs)):
1964 if i < len(labels):
1965 print(f'{locs[i,0]:10} {locs[i,1]:8} {labels[i,0]:10} {labels[i,1]:30}')
1966 else:
1967 print(f'{locs[i,0]:10} {locs[i,1]:8} {"-":10} {"-":10}')
1970def main(*args):
1971 """Call demo with command line arguments.
1973 Parameters
1974 ----------
1975 args: list of strings
1976 Command line arguments as returned by sys.argv[1:]
1977 """
1978 if len(args) > 0 and (args[0] == '-h' or args[0] == '--help'):
1979 print()
1980 print('Usage:')
1981 print(' python -m src.audioio.riffmetadata [--help] <audio/file.wav>')
1982 print()
1983 return
1985 if len(args) > 0:
1986 demo(args[0])
1987 else:
1988 rate = 44100
1989 t = np.arange(0, 2, 1/rate)
1990 x = np.sin(2*np.pi*440*t)
1991 imd = dict(IENG='JB', ICRD='2024-01-24', RATE=9,
1992 Comment='this is test1')
1993 bmd = dict(Description='a recording',
1994 OriginationDate='2024:01:24', TimeReference=123456,
1995 Version=42, CodingHistory='Test1\nTest2')
1996 xmd = dict(Project='Record all', Note='still testing',
1997 Sync_Point_List=dict(Sync_Point=1,
1998 Sync_Point_Comment='great'))
1999 omd = imd.copy()
2000 omd['Production'] = bmd
2001 md = dict(INFO=imd, BEXT=bmd, IXML=xmd,
2002 Recording=omd, Notes=xmd)
2003 locs = np.random.randint(10, len(x)-10, (5, 2))
2004 locs = locs[np.argsort(locs[:,0]),:]
2005 locs[:,1] = np.random.randint(0, 20, len(locs))
2006 labels = np.zeros((len(locs), 2), dtype=object)
2007 for i in range(len(labels)):
2008 labels[i,0] = chr(ord('a') + i % 26)
2009 labels[i,1] = chr(ord('A') + i % 26)*5
2010 write_wave('test.wav', x, rate, md, locs, labels)
2011 demo('test.wav')
2014if __name__ == "__main__":
2015 main(*sys.argv[1:])