Coverage for src/audioio/riffmetadata.py: 97%

"""Read and write metadata and marker lists of RIFF-based files.

Container files of the Resource Interchange File Format (RIFF) like
WAVE files may contain sections (called chunks) with metadata and
markers in addition to the time-series (audio) data and the necessary
specifications of sampling rate, bit depth, etc.

## Metadata

There are various types of chunks for storing metadata, like the [INFO
list](https://www.recordingblogs.com/wiki/list-chunk-of-a-wave-file),
[broadcast-audio extension
(BEXT)](https://tech.ebu.ch/docs/tech/tech3285.pdf) chunk, or
[iXML](http://www.gallery.co.uk/ixml/) chunks. These chunks contain
metadata as key-value pairs. Since wave files are primarily designed
for music, valid keys in these chunks are restricted to topics from
music and music production. Some keys are also useful for science,
but more keys are needed. It is possible to extend the INFO
list keys, but these keys are restricted to four characters, and the
INFO list chunk does not allow for hierarchical metadata either. The
other metadata chunks, in particular the BEXT chunk, cannot be
extended. With standard chunks, not all types of metadata can be
stored.

The [GUANO (Grand Unified Acoustic Notation
Ontology)](https://github.com/riggsd/guano-spec), primarily designed
for bat acoustic recordings, has some standard ontologies that are of
much more interest in a scientific context. In addition, GUANO allows
for extensions with arbitrary nested keys and string-encoded values.
In that respect it is a well-defined and easy-to-handle serialization
of the [odML data model](https://doi.org/10.3389/fninf.2011.00016).
We use GUANO to write all metadata that do not fit into the INFO, BEXT
or iXML chunks into a WAVE file.

To interface the various ways to store and read metadata of RIFF
files, the `riffmetadata` module simply uses nested dictionaries. The
keys are always strings. Values are strings or integers for key-value
pairs. Value strings can also be numbers followed by a unit. Values
can also be dictionaries for defining subsections of key-value
pairs. The dictionaries can be nested to arbitrary depth.
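
As a minimal sketch of this data model (the keys and values are invented for illustration, and `flatten()` is a hypothetical helper, not the module's own `flatten_metadata()`), a nested metadata dictionary and its flat view could look like this:

```python
# Hypothetical example metadata: keys are strings, values are strings,
# integers, or nested dictionaries (subsections).
metadata = {
    'INFO': {'Artist': 'J. Doe', 'Gain': '20dB'},
    'Recording': {
        'Channels': 2,
        'Location': {'Name': 'Pond', 'Depth': '1.5m'},
    },
}


def flatten(md, sep='.', prefix=''):
    """Collapse nested dictionaries into a single level of joined keys."""
    flat = {}
    for key, value in md.items():
        full_key = prefix + key
        if isinstance(value, dict):
            # A dictionary value is a subsection: descend into it.
            flat.update(flatten(value, sep, full_key + sep))
        else:
            flat[full_key] = value
    return flat


flat_metadata = flatten(metadata)
for key, value in flat_metadata.items():
    print(f'{key}: {value}')
```

The separator `'.'` is an arbitrary choice for this sketch; the GUANO reader in this module joins section names with `'|'`.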

The `write_wave()` function first tries to write an INFO list
chunk. It checks for a key "INFO" with a flat dictionary of key-value
pairs. It then translates all keys of this dictionary using the
`info_tags` mapping. If all the resulting keys have no more than four
characters and there are no subsections, then an INFO list chunk is
written. If no "INFO" key exists, then with the same procedure all
elements of the provided metadata are checked for being valid INFO
tags, and on success an INFO list chunk is written. Then, in similar
ways, `write_wave()` tries to assemble valid BEXT and iXML chunks,
based on the tags in `bext_tags` and `ixml_tags`. All remaining
metadata are then stored in a GUANO chunk.
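
The INFO check described above can be sketched roughly as follows. This is an illustration under assumptions, not the module's implementation: `INFO_TAGS` is a small excerpt of `info_tags`, and `info_chunk_items()` is a hypothetical helper.

```python
# Small excerpt of the tag-to-name mapping (see `info_tags`).
INFO_TAGS = {'IART': 'Artist', 'INAM': 'Title', 'ICMT': 'Comment', 'GAIN': 'Gain'}


def info_chunk_items(metadata):
    """Return {tag: value} if `metadata` fits into an INFO list chunk, else None."""
    # Prefer an explicit "INFO" section, otherwise test the whole dictionary.
    md = metadata.get('INFO', metadata)
    if not isinstance(md, dict):
        return None
    name_to_tag = {name: tag for tag, name in INFO_TAGS.items()}
    items = {}
    for key, value in md.items():
        if isinstance(value, dict):
            return None          # subsections are not allowed
        tag = name_to_tag.get(key, key)
        if len(tag) > 4:
            return None          # INFO tags have at most four characters
        items[tag] = str(value)
    return items


fits = info_chunk_items({'INFO': {'Artist': 'J. Doe', 'Gain': '20dB'}})
too_long = info_chunk_items({'Temperature': '21C'})
nested = info_chunk_items({'Artist': 'J. Doe', 'Recording': {'Channels': 2}})
print(fits, too_long, nested)
```

Metadata that fail this test are candidates for the BEXT, iXML, or GUANO chunks instead.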

When reading metadata from a RIFF file, INFO, BEXT and iXML chunks are
returned as subsections with the respective keys. Metadata from a
GUANO chunk are stored directly in the metadata dictionary without
marking them as GUANO.

## Markers

A number of different chunk types exist for handling markers or cues
that mark specific events or regions in the audio data. In the end,
each marker has a position, a span, a label, and a text. Position
and span are handled with 1-D or 2-D arrays of ints, where each row is
a marker and the columns are position and span. The span column is
optional. Labels and texts come in another 1-D or 2-D array of objects
pointing to strings. Again, rows are the markers, the first column
holds the labels, and the second column the optional texts. Try to
keep the labels short, and use text for longer descriptions, if necessary.
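
A small sketch of this layout, using plain Python lists as stand-ins for the NumPy arrays the module works with. All marker values are invented for illustration.

```python
# Each row is one marker: [position, span], both in frames.
locs = [[48000, 0],      # a point marker without span
        [12000, 2400],   # a region spanning 2400 frames
        [96000, 480]]
# Each row is [label, text]; the text is optional and may be empty.
labels = [['click', ''],
          ['M', 'first call sequence'],
          ['noise', 'boat passing by']]

# Sort both tables by marker position, keeping their rows aligned.
order = sorted(range(len(locs)), key=lambda i: locs[i][0])
locs = [locs[i] for i in order]
labels = [labels[i] for i in order]

for (pos, span), (label, text) in zip(locs, labels):
    print(f'{pos:8d} {span:6d} {label:8s} {text}')
```

Keeping positions and labels in two parallel tables means that any reordering has to be applied to both with the same index, as done here.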

## Read metadata and markers

- `metadata_riff()`: read metadata from a RIFF/WAVE file.
- `markers_riff()`: read markers from a RIFF/WAVE file.

## Write data, metadata and markers

- `write_wave()`: write time series, metadata and markers to a WAVE file.
- `append_metadata_riff()`: append metadata chunks to RIFF file.
- `append_markers_riff()`: append marker chunks to RIFF file.
- `append_riff()`: append metadata and markers to an existing RIFF file.

## Helper functions for reading RIFF and WAVE files

- `read_chunk_tags()`: read tags of all chunks contained in a RIFF file.
- `read_riff_header()`: read and check the RIFF file header.
- `skip_chunk()`: skip over unknown RIFF chunk.
- `read_format_chunk()`: read format chunk.
- `read_info_chunks()`: read in metadata from INFO list chunk.
- `read_bext_chunk()`: read in metadata from the broadcast-audio extension chunk.
- `read_ixml_chunk()`: read in metadata from an iXML chunk.
- `read_guano_chunk()`: read in metadata from a GUANO chunk.
- `read_cue_chunk()`: read in marker positions from cue chunk.
- `read_playlist_chunk()`: read in marker spans from playlist chunk.
- `read_adtl_chunks()`: read in associated data list chunks.
- `read_lbl_chunk()`: read in marker positions, spans, labels, and texts from lbl chunk.

## Helper functions for writing RIFF and WAVE files

- `write_riff_chunk()`: write RIFF file header.
- `write_filesize()`: write the file size into the RIFF file header.
- `write_chunk_name()`: change the name of a chunk.
- `write_format_chunk()`: write format chunk.
- `write_data_chunk()`: write data chunk.
- `write_info_chunk()`: write metadata to LIST INFO chunk.
- `write_bext_chunk()`: write metadata to BEXT chunk.
- `write_ixml_chunk()`: write metadata to iXML chunk.
- `write_guano_chunk()`: write metadata to GUANO chunk.
- `write_cue_chunk()`: write marker positions to cue chunk.
- `write_playlist_chunk()`: write marker spans to playlist chunk.
- `write_adtl_chunks()`: write associated data list chunks.
- `write_lbl_chunk()`: write marker positions, spans, labels, and texts to lbl chunk.

## Demo

- `demo()`: print metadata and marker list of RIFF/WAVE file.
- `main()`: call demo with command line arguments.

## Descriptions of the RIFF/WAVE file format

- https://de.wikipedia.org/wiki/RIFF_WAVE
- http://www.piclist.com/techref/io/serial/midi/wave.html
- https://moddingwiki.shikadi.net/wiki/Resource_Interchange_File_Format_(RIFF)
- https://www.recordingblogs.com/wiki/wave-file-format
- http://fhein.users.ak.tu-berlin.de/Alias/Studio/ProTools/audio-formate/wav/overview.html
- http://www.gallery.co.uk/ixml/

For INFO tag names see:

- https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags

"""

133

134import io

135import os

136import sys

137import warnings

138import struct

139import numpy as np

140import xml.etree.ElementTree as ET

141from .audiometadata import flatten_metadata, unflatten_metadata, find_key

142

143

144info_tags = dict(AGES='Rated',

145 CMNT='Comment',

146 CODE='EncodedBy',

147 COMM='Comments',

148 DIRC='Directory',

149 DISP='SoundSchemeTitle',

150 DTIM='DateTimeOriginal',

151 GENR='Genre',

152 IARL='ArchivalLocation',

153 IART='Artist',

154 IAS1='FirstLanguage',

155 IAS2='SecondLanguage',

156 IAS3='ThirdLanguage',

157 IAS4='FourthLanguage',

158 IAS5='FifthLanguage',

159 IAS6='SixthLanguage',

160 IAS7='SeventhLanguage',

161 IAS8='EighthLanguage',

162 IAS9='NinthLanguage',

163 IBSU='BaseURL',

164 ICAS='DefaultAudioStream',

                 ICDS='CostumeDesigner',

166 ICMS='Commissioned',

167 ICMT='Comment',

168 ICNM='Cinematographer',

169 ICNT='Country',

170 ICOP='Copyright',

171 ICRD='DateCreated',

172 ICRP='Cropped',

173 IDIM='Dimensions',

174 IDIT='DateTimeOriginal',

175 IDPI='DotsPerInch',

176 IDST='DistributedBy',

177 IEDT='EditedBy',

178 IENC='EncodedBy',

179 IENG='Engineer',

180 IGNR='Genre',

181 IKEY='Keywords',

182 ILGT='Lightness',

183 ILGU='LogoURL',

184 ILIU='LogoIconURL',

185 ILNG='Language',

186 IMBI='MoreInfoBannerImage',

187 IMBU='MoreInfoBannerURL',

188 IMED='Medium',

189 IMIT='MoreInfoText',

190 IMIU='MoreInfoURL',

191 IMUS='MusicBy',

192 INAM='Title',

193 IPDS='ProductionDesigner',

194 IPLT='NumColors',

195 IPRD='Product',

196 IPRO='ProducedBy',

197 IRIP='RippedBy',

198 IRTD='Rating',

199 ISBJ='Subject',

200 ISFT='Software',

201 ISGN='SecondaryGenre',

202 ISHP='Sharpness',

203 ISMP='TimeCode',

204 ISRC='Source',

205 ISRF='SourceFrom',

206 ISTD='ProductionStudio',

207 ISTR='Starring',

208 ITCH='Technician',

209 ITRK='TrackNumber',

210 IWMU='WatermarkURL',

211 IWRI='WrittenBy',

212 LANG='Language',

213 LOCA='Location',

214 PRT1='Part',

215 PRT2='NumberOfParts',

216 RATE='Rate',

                 STAR='Starring',

218 STAT='Statistics',

219 TAPE='TapeName',

220 TCDO='EndTimecode',

221 TCOD='StartTimecode',

222 TITL='Title',

223 TLEN='Length',

224 TORG='Organization',

225 TRCK='TrackNumber',

226 TURL='URL',

227 TVER='Version',

228 VMAJ='VegasVersionMajor',

229 VMIN='VegasVersionMinor',

230 YEAR='Year',

231 # extensions from

232 # [TeeRec](https://github.com/janscience/TeeRec/):

233 BITS='Bits',

234 PINS='Pins',

235 AVRG='Averaging',

236 CNVS='ConversionSpeed',

237 SMPS='SamplingSpeed',

238 VREF='ReferenceVoltage',

239 GAIN='Gain',

240 UWRP='UnwrapThreshold',

241 UWPC='UnwrapClippedAmplitude',

242 IBRD='uCBoard',

243 IMAC='MACAdress',

244 CPUF='CPU frequency')

245"""Dictionary with known tags of the INFO chunk as keys and their description as value.

246

247See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags

248"""

249

250bext_tags = dict(

251 Description=256,

252 Originator=32,

253 OriginatorReference=32,

254 OriginationDate=10,

255 OriginationTime=8,

256 TimeReference=8,

257 Version=2,

258 UMID=64,

259 LoudnessValue=2,

260 LoudnessRange=2,

261 MaxTruePeakLevel=2,

262 MaxMomentaryLoudness=2,

263 MaxShortTermLoudness=2,

264 Reserved=180,

265 CodingHistory=0)

266"""Dictionary with tags of the BEXT chunk as keys and their size in bytes as values.

267

268See https://tech.ebu.ch/docs/tech/tech3285.pdf

269"""

270

271ixml_tags = [

272 'BWFXML',

273 'IXML_VERSION',

274 'PROJECT',

275 'SCENE',

276 'TAPE',

277 'TAKE',

278 'TAKE_TYPE',

279 'NO_GOOD',

280 'FALSE_START',

281 'WILD_TRACK',

282 'CIRCLED',

283 'FILE_UID',

284 'UBITS',

285 'NOTE',

286 'SYNC_POINT_LIST',

287 'SYNC_POINT_COUNT',

288 'SYNC_POINT',

289 'SYNC_POINT_TYPE',

290 'SYNC_POINT_FUNCTION',

291 'SYNC_POINT_COMMENT',

292 'SYNC_POINT_LOW',

293 'SYNC_POINT_HIGH',

294 'SYNC_POINT_EVENT_DURATION',

295 'SPEED',

296 'MASTER_SPEED',

297 'CURRENT_SPEED',

298 'TIMECODE_RATE',

299 'TIMECODE_FLAGS',

300 'FILE_SAMPLE_RATE',

301 'AUDIO_BIT_DEPTH',

302 'DIGITIZER_SAMPLE_RATE',

303 'TIMESTAMP_SAMPLES_SINCE_MIDNIGHT_HI',

304 'TIMESTAMP_SAMPLES_SINCE_MIDNIGHT_LO',

305 'TIMESTAMP_SAMPLE_RATE',

306 'LOUDNESS',

307 'LOUDNESS_VALUE',

308 'LOUDNESS_RANGE',

309 'MAX_TRUE_PEAK_LEVEL',

310 'MAX_MOMENTARY_LOUDNESS',

311 'MAX_SHORT_TERM_LOUDNESS',

312 'HISTORY',

313 'ORIGINAL_FILENAME',

314 'PARENT_FILENAME',

315 'PARENT_UID',

316 'FILE_SET',

317 'TOTAL_FILES',

318 'FAMILY_UID',

319 'FAMILY_NAME',

320 'FILE_SET_INDEX',

321 'TRACK_LIST',

322 'TRACK_COUNT',

323 'TRACK',

324 'CHANNEL_INDEX',

325 'INTERLEAVE_INDEX',

326 'NAME',

327 'FUNCTION',

328 'PRE_RECORD_SAMPLECOUNT',

329 'BEXT',

330 'BWF_DESCRIPTION',

331 'BWF_ORIGINATOR',

332 'BWF_ORIGINATOR_REFERENCE',

333 'BWF_ORIGINATION_DATE',

334 'BWF_ORIGINATION_TIME',

335 'BWF_TIME_REFERENCE_LOW',

336 'BWF_TIME_REFERENCE_HIGH',

337 'BWF_VERSION',

338 'BWF_UMID',

339 'BWF_RESERVED',

340 'BWF_CODING_HISTORY',

341 'BWF_LOUDNESS_VALUE',

342 'BWF_LOUDNESS_RANGE',

343 'BWF_MAX_TRUE_PEAK_LEVEL',

344 'BWF_MAX_MOMENTARY_LOUDNESS',

345 'BWF_MAX_SHORT_TERM_LOUDNESS',

346 'USER',

347 'FULL_TITLE',

348 'DIRECTOR_NAME',

349 'PRODUCTION_NAME',

350 'PRODUCTION_ADDRESS',

351 'PRODUCTION_EMAIL',

352 'PRODUCTION_PHONE',

353 'PRODUCTION_NOTE',

354 'SOUND_MIXER_NAME',

355 'SOUND_MIXER_ADDRESS',

356 'SOUND_MIXER_EMAIL',

357 'SOUND_MIXER_PHONE',

358 'SOUND_MIXER_NOTE',

359 'AUDIO_RECORDER_MODEL',

360 'AUDIO_RECORDER_SERIAL_NUMBER',

361 'AUDIO_RECORDER_FIRMWARE',

362 'LOCATION',

363 'LOCATION_NAME',

364 'LOCATION_GPS',

365 'LOCATION_ALTITUDE',

366 'LOCATION_TYPE',

367 'LOCATION_TIME',

368 ]

369"""List with valid tags of the iXML chunk.

370

371See http://www.gallery.co.uk/ixml/

372"""

373

374

375# Read RIFF/WAVE files:

376

377def read_riff_header(sf, tag=None):

378 """Read and check the RIFF file header.

379

380 Parameters

381 ----------

382 sf: stream

383 File stream of RIFF/WAVE file.

384 tag: None or str

385 If supplied, check whether it matches the subchunk tag.

386 If it does not match, raise a ValueError.

387

388 Returns

389 -------

390 filesize: int

391 Size of the RIFF file in bytes.

392

393 Raises

394 ------

395 ValueError

396 Not a RIFF file or subchunk tag does not match `tag`.
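
Examples
--------
A minimal sketch of the header layout this function parses, using an in-memory stream with made-up content instead of a real file. It repeats the parsing steps with plain `struct` calls rather than calling the function itself.

```python
import io
import struct

# Build a minimal in-memory RIFF/WAVE file without any chunks:
# 'RIFF', size of everything after the size field, form type 'WAVE'.
payload = b'WAVE'
stream = io.BytesIO(b'RIFF' + struct.pack('<I', len(payload)) + payload)

magic = stream.read(4).decode('latin-1')
# The size field excludes the 8 bytes of 'RIFF' and the size field itself.
filesize = struct.unpack('<I', stream.read(4))[0] + 8
form_type = stream.read(4).decode('latin-1')
print(magic, filesize, form_type)
```

After these 12 bytes the stream is positioned at the tag of the first chunk.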

397 """

398 riffs = sf.read(4).decode('latin-1')

399 if riffs != 'RIFF':

400 raise ValueError('Not a RIFF file.')

401 fsize = struct.unpack('<I', sf.read(4))[0] + 8

402 subtag = sf.read(4).decode('latin-1')

403 if tag is not None and subtag != tag:

404 raise ValueError(f'Not a {tag} file.')

405 return fsize

406

407

408def skip_chunk(sf):

409 """Skip over unknown RIFF chunk.

410

411 Parameters

412 ----------

413 sf: stream

414 File stream of RIFF file.

415

416 Returns

417 -------

418 size: int

419 The size of the skipped chunk in bytes.

420 """

421 size = struct.unpack('<I', sf.read(4))[0]

422 size += size % 2

423 sf.seek(size, os.SEEK_CUR)

424 return size

425

426

427def read_chunk_tags(filepath):

428 """Read tags of all chunks contained in a RIFF file.

429

430 Parameters

431 ----------

432 filepath: string or file handle

433 The RIFF file.

434

435 Returns

436 -------

437 tags: dict

438 Keys are the tag names of the chunks found in the file. If the

439 chunk is a list chunk, then the list type is added with a dash

440 to the key, i.e. "LIST-INFO". Values are tuples with the

441 corresponding file positions of the data of the chunk (after

442 the tag and the chunk size field) and the size of the chunk

443 data. The file position of the next chunk is thus the position

444 of the chunk plus the size of its data. Advance another 8 bytes

445 to get to the data of the next chunk.

446 The total file size is the sum of the chunk sizes of each tag

447 incremented by eight plus another 12 bytes of the riff header.

448

449 Raises

450 ------

451 ValueError

452 Not a RIFF file.
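
Examples
--------
A self-contained sketch of the chunk walk and of the returned positions and sizes, run on a made-up in-memory file. The helper `make_chunk()` and the `note` chunk are hypothetical and only serve to build the test data.

```python
import io
import struct


def make_chunk(tag, data):
    """Pack one RIFF chunk and pad its data to an even number of bytes."""
    pad = b'\x00' * (len(data) % 2)
    return tag + struct.pack('<I', len(data)) + data + pad


body = b'WAVE' + make_chunk(b'fmt ', bytes(16)) + make_chunk(b'note', b'abc')
riff = io.BytesIO(b'RIFF' + struct.pack('<I', len(body)) + body)

riff.seek(12)                      # skip 'RIFF', file size, and 'WAVE'
filesize = len(riff.getvalue())
tags = {}
while riff.tell() < filesize - 8:
    tag = riff.read(4).decode('latin-1').upper()
    size = struct.unpack('<I', riff.read(4))[0]
    size += size % 2               # chunk data are padded to even size
    tags[tag] = (riff.tell(), size)
    riff.seek(size, io.SEEK_CUR)
print(tags)
```

Each chunk contributes its data size plus 8 bytes for tag and size field, and the RIFF header adds another 12 bytes, which is the file size relation stated above.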

453

454 """

455 tags = {}

456 sf = filepath

457 file_pos = None

458 if hasattr(filepath, 'read'):

459 file_pos = sf.tell()

460 sf.seek(0, os.SEEK_SET)

461 else:

462 sf = open(filepath, 'rb')

463 fsize = read_riff_header(sf)

464 while (sf.tell() < fsize - 8):

465 chunk = sf.read(4).decode('latin-1').upper()

466 size = struct.unpack('<I', sf.read(4))[0]

467 size += size % 2

468 fp = sf.tell()

469 if chunk == 'LIST':

470 subchunk = sf.read(4).decode('latin-1').upper()

471 tags[chunk + '-' + subchunk] = (fp, size)

472 size -= 4

473 else:

474 tags[chunk] = (fp, size)

475 sf.seek(size, os.SEEK_CUR)

476 if file_pos is None:

477 sf.close()

478 else:

479 sf.seek(file_pos, os.SEEK_SET)

480 return tags

481

482

483def read_format_chunk(sf):

484 """Read format chunk.

485

486 Parameters

487 ----------

488 sf: stream

489 File stream for reading FMT chunk at the position of the chunk's size field.

490

491 Returns

492 -------

493 channels: int

494 Number of channels.

495 rate: float

496 Sampling rate (frames per time) in Hertz.

497 bits: int

498 Bit resolution.

499 """

500 size = struct.unpack('<I', sf.read(4))[0]

501 size += size % 2

502 ccode, channels, rate, byterate, blockalign, bits = struct.unpack('<HHIIHH', sf.read(16))

503 if size > 16:

504 sf.read(size - 16)

505 return channels, float(rate), bits

506

507

508def read_info_chunks(sf, store_empty):

    """Read in metadata from INFO list chunk.

510

511 The variable `info_tags` is used to map the 4 character tags to

512 human readable key names.

513

514 See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags

515

516 Parameters

517 ----------

518 sf: stream

        File stream of RIFF file at the position of the chunk's size field.
    store_empty: bool
        If `False` do not add metadata with empty values.

522

523 Returns

524 -------

525 metadata: dict

526 Dictionary with key-value pairs of info tags.

527

528 """

529 md = {}

530 list_size = struct.unpack('<I', sf.read(4))[0]

531 list_type = sf.read(4).decode('latin-1').upper()

532 list_size -= 4

533 if list_type == 'INFO':

534 while list_size >= 8:

535 key = sf.read(4).decode('ascii').rstrip(' \x00')

536 size = struct.unpack('<I', sf.read(4))[0]

537 size += size % 2

538 bs = sf.read(size)

539 x = np.frombuffer(bs, dtype=np.uint8)

540 if np.sum((x >= 0x80) & (x <= 0x9f)) > 0:

541 s = bs.decode('windows-1252')

542 else:

543 s = bs.decode('latin1')

544 value = s.rstrip(' \x00\x02')

545 list_size -= 8 + size

546 if key in info_tags:

547 key = info_tags[key]

548 if value or store_empty:

549 md[key] = value

550 if list_size > 0: # finish or skip

551 sf.seek(list_size, os.SEEK_CUR)

552 return md

553

554

555def read_bext_chunk(sf, store_empty=True):

556 """Read in metadata from the broadcast-audio extension chunk.

557

558 The variable `bext_tags` lists all valid BEXT fields and their size.

559

560 See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.

561

562 Parameters

563 ----------

564 sf: stream

        File stream of RIFF file at the position of the chunk's size field.
    store_empty: bool
        If `False` do not add metadata with empty values.

568

569 Returns

570 -------

571 meta_data: dict

572 The meta-data of a BEXT chunk are stored in a flat dictionary

573 with the following keys:

574

        - 'Description': a free description of the sequence.
        - 'Originator': name of the originator/producer of the audio file.
        - 'OriginatorReference': unambiguous reference allocated by the originating organisation.
        - 'OriginationDate': date of creation of audio sequence in yyyy:mm:dd.
        - 'OriginationTime': time of creation of audio sequence in hh:mm:ss.
        - 'TimeReference': first sample since midnight.
        - 'Version': version of the BWF.
        - 'UMID': unique material identifier.
        - 'LoudnessValue': integrated loudness value.
        - 'LoudnessRange': loudness range.
        - 'MaxTruePeakLevel': maximum true peak value in dBTP.
        - 'MaxMomentaryLoudness': highest value of the momentary loudness level.
        - 'MaxShortTermLoudness': highest value of the short-term loudness level.
        - 'Reserved': 180 bytes reserved for extension.
        - 'CodingHistory': description of coding processes applied to the audio data, with comma-separated subfields: "A=" coding algorithm, e.g. PCM, "F=" sampling rate in Hertz, "B=" bit-rate for MPEG files, "W=" word length in bits, "M=" mono, stereo, dual-mono, joint-stereo, "T=" free text.
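
Examples
--------
A short sketch of how the fixed-size fields determine where the variable-length coding history starts. The field sizes are copied from `bext_tags`; the chunk data and the history string are invented for illustration.

```python
# Field sizes in bytes, copied from `bext_tags`.
bext_sizes = dict(Description=256, Originator=32, OriginatorReference=32,
                  OriginationDate=10, OriginationTime=8, TimeReference=8,
                  Version=2, UMID=64, LoudnessValue=2, LoudnessRange=2,
                  MaxTruePeakLevel=2, MaxMomentaryLoudness=2,
                  MaxShortTermLoudness=2, Reserved=180, CodingHistory=0)
# All fields except the coding history have a fixed size.
fixed_size = sum(bext_sizes.values())

# Hypothetical chunk data: fixed fields zeroed, followed by a coding history.
history = b'A=PCM,F=48000,W=16,M=mono\r\n'
chunk_data = bytes(fixed_size) + history
# Whatever follows the fixed fields is the coding history.
coding_history = chunk_data[fixed_size:].decode('ascii').strip(' \x00\n\r')
print(fixed_size, coding_history)
```

The chunk size minus the summed fixed field sizes therefore gives the length of the coding history.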

590 """

591 md = {}

592 size = struct.unpack('<I', sf.read(4))[0]

593 size += size % 2

594 s = sf.read(256).decode('ascii').strip(' \x00')

595 if s or store_empty:

596 md['Description'] = s

597 s = sf.read(32).decode('ascii').strip(' \x00')

598 if s or store_empty:

599 md['Originator'] = s

600 s = sf.read(32).decode('ascii').strip(' \x00')

601 if s or store_empty:

602 md['OriginatorReference'] = s

603 s = sf.read(10).decode('ascii').strip(' \x00')

604 if s or store_empty:

605 md['OriginationDate'] = s

606 s = sf.read(8).decode('ascii').strip(' \x00')

607 if s or store_empty:

608 md['OriginationTime'] = s

609 reference, version = struct.unpack('<QH', sf.read(10))

610 if reference > 0 or store_empty:

611 md['TimeReference'] = reference

612 if version > 0 or store_empty:

613 md['Version'] = version

614 s = sf.read(64).decode('ascii').strip(' \x00')

615 if s or store_empty:

616 md['UMID'] = s

617 lvalue, lrange, peak, momentary, shortterm = struct.unpack('<hhhhh', sf.read(10))

618 if lvalue > 0 or store_empty:

619 md['LoudnessValue'] = lvalue

620 if lrange > 0 or store_empty:

621 md['LoudnessRange'] = lrange

622 if peak > 0 or store_empty:

623 md['MaxTruePeakLevel'] = peak

624 if momentary > 0 or store_empty:

625 md['MaxMomentaryLoudness'] = momentary

626 if shortterm > 0 or store_empty:

627 md['MaxShortTermLoudness'] = shortterm

628 s = sf.read(180).decode('ascii').strip(' \x00')

629 if s or store_empty:

630 md['Reserved'] = s

631 size -= 256 + 32 + 32 + 10 + 8 + 8 + 2 + 64 + 10 + 180

632 s = sf.read(size).decode('ascii').strip(' \x00\n\r')

633 if s or store_empty:

634 md['CodingHistory'] = s

635 return md

636

637

638def read_ixml_chunk(sf, store_empty=True):

639 """Read in metadata from an IXML chunk.

640

641 See the variable `ixml_tags` for a list of valid tags.

642

643 See http://www.gallery.co.uk/ixml/ for the specification of iXML.

644

645 Parameters

646 ----------

647 sf: stream

        File stream of RIFF file at the position of the chunk's size field.
    store_empty: bool
        If `False` do not add metadata with empty values.

651

652 Returns

653 -------

654 metadata: nested dict

655 Dictionary with key-value pairs.

656 """

657

658 def parse_ixml(element, store_empty=True):

659 md = {}

660 for e in element:

            if e.text is not None:

662 md[e.tag] = e.text

663 elif len(e) > 0:

664 md[e.tag] = parse_ixml(e, store_empty)

665 elif store_empty:

666 md[e.tag] = ''

667 return md

668

669 size = struct.unpack('<I', sf.read(4))[0]

670 size += size % 2

671 xmls = sf.read(size).decode('latin-1').rstrip(' \x00')

672 root = ET.fromstring(xmls)

673 md = {root.tag: parse_ixml(root, store_empty)}

674 if len(md) == 1 and 'BWFXML' in md:

675 md = md['BWFXML']

676 return md

677

678

679def read_guano_chunk(sf):

680 """Read in metadata from a GUANO chunk.

681

682 GUANO is the Grand Unified Acoustic Notation Ontology, an

683 extensible, open format for embedding metadata within bat acoustic

684 recordings. See https://github.com/riggsd/guano-spec for details.

685

686 The GUANO specification allows for the inclusion of arbitrary

687 nested keys and string encoded values. In that respect it is a

688 well defined and easy to handle serialization of the [odML data

689 model](https://doi.org/10.3389/fninf.2011.00016).

690

691 Parameters

692 ----------

693 sf: stream

        File stream of RIFF file at the position of the chunk's size field.

695

696 Returns

697 -------

698 metadata: nested dict

699 Dictionary with key-value pairs.
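
Examples
--------
A sketch of the parsing steps on a made-up GUANO text. The unflatten loop is a stand-in written for this example; how `unflatten_metadata()` behaves in detail is an assumption here.

```python
import io

# Hypothetical content of a GUANO chunk: one "key: value" pair per line.
guano_text = ('GUANO|Version: 1.0\n'
              'Timestamp: 2024-05-01T21:30:00\n'
              'Recorder|Gain: 20dB\n'
              'Note: first line\\nsecond line\n')

flat = {}
for line in io.StringIO(guano_text):
    # Split at the first colon only, values may contain further colons.
    key, sep, value = line.partition(':')
    if sep:
        # A literal backslash-n in a value encodes a line break.
        flat[key.strip()] = value.strip().replace(r'\n', '\n')

# Unflatten: '|' separates section names from the key.
metadata = {}
for key, value in flat.items():
    *sections, name = key.split('|')
    md = metadata
    for section in sections:
        md = md.setdefault(section, {})
    md[name] = value
print(metadata)
```

Keys without a `'|'` end up at the top level, which matches how GUANO metadata are merged directly into the metadata dictionary.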

700

701 """

702 md = {}

703 size = struct.unpack('<I', sf.read(4))[0]

704 size += size % 2

705 for line in io.StringIO(sf.read(size).decode('utf-8')):

706 ss = line.split(':')

707 if len(ss) > 1:

708 md[ss[0].strip()] = ':'.join(ss[1:]).strip().replace(r'\n', '\n')

709 return unflatten_metadata(md, '|')

710

711

712def read_cue_chunk(sf):

713 """Read in marker positions from cue chunk.

714

715 See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file

716

717 Parameters

718 ----------

719 sf: stream

        File stream of RIFF file at the position of the chunk's size field.

721

722 Returns

723 -------

724 locs: 2-D array of ints

725 Each row is a marker with unique identifier in the first column,

726 position in the second column, and span in the third column.

727 The cue chunk does not encode spans, so the third column is

728 initialized with zeros.
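
Examples
--------
A sketch of the cue chunk layout, assuming two invented cue points. It first packs the chunk data (everything after the `cue ` tag) into an in-memory stream and then reads it back with the same steps as this function, collecting tuples instead of a NumPy array.

```python
import io
import struct

# Two hypothetical cue points: (identifier, position in frames).
cues = [(1, 4800), (2, 96000)]
data = struct.pack('<I', len(cues))
for cue_id, position in cues:
    # id, position, data chunk id, chunk start, block start, sample offset
    data += struct.pack('<II4sIII', cue_id, position, b'data', 0, 0, position)
stream = io.BytesIO(struct.pack('<I', len(data)) + data)

size, n = struct.unpack('<II', stream.read(8))
locs = []
for _ in range(n):
    cue_id, position = struct.unpack('<II', stream.read(8))
    chunk_id = stream.read(4).decode('latin-1').upper()
    stream.read(12)                # chunk start, block start, sample offset
    if chunk_id == 'DATA':
        locs.append((cue_id, position, 0))   # span is filled in later
print(size, locs)
```

Each cue point occupies 24 bytes, so the chunk size is 4 bytes for the count plus 24 bytes per marker.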

729 """

730 locs = []

731 size, n = struct.unpack('<II', sf.read(8))

732 for c in range(n):

733 cpid, cppos = struct.unpack('<II', sf.read(8))

734 datachunkid = sf.read(4).decode('latin-1').rstrip(' \x00').upper()

735 chunkstart, blockstart, offset = struct.unpack('<III', sf.read(12))

736 if datachunkid == 'DATA':

737 locs.append((cpid, cppos, 0))

738 return np.array(locs, dtype=int)

739

740

741def read_playlist_chunk(sf, locs):

742 """Read in marker spans from playlist chunk.

743

744 See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file

745

746 Parameters

747 ----------

748 sf: stream

        File stream of RIFF file at the position of the chunk's size field.

750 locs: 2-D array of ints

751 Markers as returned by the `read_cue_chunk()` function.

752 Each row is a marker with unique identifier in the first column,

753 position in the second column, and span in the third column.

754 The span is read in from the playlist chunk.

755 """

756 if len(locs) == 0:

        warnings.warn('read_playlist_chunk() requires markers from a previous cue chunk')

758 size, n = struct.unpack('<II', sf.read(8))

759 for p in range(n):

760 cpid, length, repeats = struct.unpack('<III', sf.read(12))

761 i = np.where(locs[:,0] == cpid)[0]

762 if len(i) > 0:

763 locs[i[0], 2] = length

764

765

766def read_adtl_chunks(sf, locs, labels):

767 """Read in associated data list chunks.

768

769 See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file

770

771 Parameters

772 ----------

773 sf: stream

        File stream of RIFF file at the position of the chunk's size field.

775 locs: 2-D array of ints

776 Markers as returned by the `read_cue_chunk()` function.

777 Each row is a marker with unique identifier in the first column,

778 position in the second column, and span in the third column.

779 The span is read in from the LTXT chunk.

780 labels: 2-D array of string objects

781 Labels (first column) and texts (second column) for each marker (rows)

782 from previous LABL, NOTE, and LTXT chunks.

783

784 Returns

785 -------

786 labels: 2-D array of string objects

787 Labels (first column) and texts (second column) for each marker (rows)

788 from LABL, NOTE (first column), and LTXT chunks (last column).

789 """

790 list_size = struct.unpack('<I', sf.read(4))[0]

791 list_type = sf.read(4).decode('latin-1').upper()

792 list_size -= 4

793 if list_type == 'ADTL':

794 if len(locs) == 0:

795 warnings.warn('read_adtl_chunks() requires markers from a previous cue chunk')

796 if len(labels) == 0:

797 labels = np.zeros((len(locs), 2), dtype=object)

798 while list_size >= 8:

799 key = sf.read(4).decode('latin-1').rstrip(' \x00').upper()

800 size, cpid = struct.unpack('<II', sf.read(8))

801 size += size % 2 - 4

802 if key == 'LABL' or key == 'NOTE':

803 label = sf.read(size).decode('latin-1').rstrip(' \x00')

804 i = np.where(locs[:,0] == cpid)[0]

805 if len(i) > 0:

806 i = i[0]

807 if hasattr(labels[i,0], '__len__') and len(labels[i,0]) > 0:

808 labels[i,0] += '|' + label

809 else:

810 labels[i,0] = label

811 elif key == 'LTXT':

812 length = struct.unpack('<I', sf.read(4))[0]

813 sf.read(12) # skip fields

814 text = sf.read(size - 4 - 12).decode('latin-1').rstrip(' \x00')

815 i = np.where(locs[:,0] == cpid)[0]

816 if len(i) > 0:

817 i = i[0]

818 if hasattr(labels[i,1], '__len__') and len(labels[i,1]) > 0:

819 labels[i,1] += '|' + text

820 else:

821 labels[i,1] = text

822 locs[i,2] = length

823 else:

824 sf.read(size)

825 list_size -= 12 + size

826 if list_size > 0: # finish or skip

827 sf.seek(list_size, os.SEEK_CUR)

828 return labels

829

830

831def read_lbl_chunk(sf, rate):

832 """Read in marker positions, spans, labels, and texts from lbl chunk.

833

    The proprietary LBL chunk is specific to wave files generated by
    [AviSoft](https://www.avisoft.com) products.

836

837 The labels (first column of `labels`) have special meanings.

838 Markers with a span (a section label in the terminology of

839 AviSoft) can be arranged in three levels when displayed:

840

841 - "M": layer 1, the top level section

842 - "N": layer 2, sections below layer 1

843 - "O": layer 3, sections below layer 2

844 - "P": total, section start and end are displayed with two vertical lines.

845

846 All other labels mark single point labels with a time and a

847 frequency (that we here discard). See also

848 https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm

849

850 Parameters

851 ----------

852 sf: stream

        File stream of RIFF file at the position of the chunk's size field.

854 rate: float

855 Sampling rate of the data in Hertz.

856

857 Returns

858 -------

859 locs: 2-D array of ints

860 Each row is a marker with unique identifier (simply integers

861 enumerating the markers) in the first column, position in the

862 second column, and span in the third column.

863 labels: 2-D array of string objects

864 Labels (first column) and texts (second column) for

865 each marker (rows).

866

867 """

868 size = struct.unpack('<I', sf.read(4))[0]

869 nn = size // 65

870 locs = np.zeros((nn, 3), dtype=int)

871 labels = np.zeros((nn, 2), dtype=object)

872 n = 0

873 for c in range(nn):

874 line = sf.read(65).decode('ascii')

875 fields = line.split('\t')

876 if len(fields) >= 4:

877 labels[n,0] = fields[3].strip()

878 labels[n,1] = fields[2].strip()

879 start_idx = int(np.round(float(fields[0].strip('\x00'))*rate))

880 end_idx = int(np.round(float(fields[1].strip('\x00'))*rate))

881 locs[n,0] = n

882 locs[n,1] = start_idx

883 if labels[n,0] in 'MNOP':

884 locs[n,2] = end_idx - start_idx

885 else:

886 locs[n,2] = 0

887 n += 1

888 else:

            # the first 65 bytes are a title string that applies to
            # the whole wave file and that can be set from the AviSoft
            # software. The recorder leaves this empty.

892 pass

893 return locs[:n,:], labels[:n,:]

894

895

896def metadata_riff(filepath, store_empty=False):

897 """Read metadata from a RIFF/WAVE file.

898

899 Parameters

900 ----------

901 filepath: string or file handle

902 The RIFF file.

903 store_empty: bool

        If `False` do not add metadata with empty values.

905

906 Returns

907 -------

908 meta_data: nested dict

909 Meta data contained in the RIFF file. Keys of the nested

910 dictionaries are always strings. If the corresponding

911 values are dictionaries, then the key is the section name

912 of the metadata contained in the dictionary. All other

913 types of values are values for the respective key. In

914 particular they are strings, or list of strings. But other

915 simple types like ints or floats are also allowed.

916 First level contains sections of meta data

917 (e.g. keys 'INFO', 'BEXT', 'IXML', values are dictionaries).

918

919 Raises

920 ------

921 ValueError

922 Not a RIFF file.

923

924 Examples

925 --------

    ```
    from audioio.riffmetadata import metadata_riff
    from audioio import print_metadata

    md = metadata_riff('audio/file.wav')
    print_metadata(md)
    ```

933 """

934 meta_data = {}

935 sf = filepath

936 file_pos = None

937 if hasattr(filepath, 'read'):

938 file_pos = sf.tell()

939 sf.seek(0, os.SEEK_SET)

940 else:

941 sf = open(filepath, 'rb')

942 fsize = read_riff_header(sf)

943 while (sf.tell() < fsize - 8):

944 chunk = sf.read(4).decode('latin-1').upper()

945 if chunk == 'LIST':

946 md = read_info_chunks(sf, store_empty)

947 if len(md) > 0:

948 meta_data['INFO'] = md

949 elif chunk == 'BEXT':

950 md = read_bext_chunk(sf, store_empty)

951 if len(md) > 0:

952 meta_data['BEXT'] = md

953 elif chunk == 'IXML':

954 md = read_ixml_chunk(sf, store_empty)

955 if len(md) > 0:

956 meta_data['IXML'] = md

957 elif chunk == 'GUAN':

958 md = read_guano_chunk(sf)

959 if len(md) > 0:

960 meta_data.update(md)

961 else:

962 skip_chunk(sf)

963 if file_pos is None:

964 sf.close()

965 else:

966 sf.seek(file_pos, os.SEEK_SET)

967 return meta_data

968

969

970def markers_riff(filepath):

971 """Read markers from a RIFF/WAVE file.

972

973 Parameters

974 ----------

975 filepath: string or file handle

976 The RIFF file.

977

978 Returns

979 -------

980 locs: 2-D array of ints

981 Marker positions (first column) and spans (second column)

982 for each marker (rows).

983 labels: 2-D array of string objects

984 Labels (first column) and texts (second column)

985 for each marker (rows).

986

987 Raises

988 ------

989 ValueError

990 Not a RIFF file.

991

992 Examples

993 --------

994 ```

995 from audioio.riffmetadata import markers_riff

996 from audioio import print_markers

997

998 locs, labels = markers_riff('audio/file.wav')

999 print_markers(locs, labels)

1000 ```

1001 """

1002 sf = filepath

1003 file_pos = None

1004 if hasattr(filepath, 'read'):

1005 file_pos = sf.tell()

1006 sf.seek(0, os.SEEK_SET)

1007 else:

1008 sf = open(filepath, 'rb')

1009 rate = None

1010 locs = np.zeros((0, 3), dtype=int)

1011 labels = np.zeros((0, 2), dtype=object)

1012 fsize = read_riff_header(sf)

1013 while (sf.tell() < fsize - 8):

1014 chunk = sf.read(4).decode('latin-1').upper()

1015 if chunk == 'FMT ':

1016 rate = read_format_chunk(sf)[1]

1017 elif chunk == 'CUE ':

1018 locs = read_cue_chunk(sf)

1019 elif chunk == 'PLST':

1020 read_playlist_chunk(sf, locs)

1021 elif chunk == 'LIST':

1022 labels = read_adtl_chunks(sf, locs, labels)

1023 elif chunk == 'LBL ':

1024 locs, labels = read_lbl_chunk(sf, rate)

1025 else:

1026 skip_chunk(sf)

1027 if file_pos is None:

1028 sf.close()

1029 else:

1030 sf.seek(file_pos, os.SEEK_SET)

1031 # sort markers according to their position:

1032 if len(locs) > 0:

1033 idxs = np.argsort(locs[:,-2])

1034 locs = locs[idxs,:]

1035 if len(labels) > 0:

1036 labels = labels[idxs,:]

1037 return locs[:,1:], labels

1038

1039

1040# Write RIFF/WAVE file:

1041

1042def write_riff_chunk(df, filesize=0, tag='WAVE'):

1043 """Write RIFF file header.

1044

1045 Parameters

1046 ----------

1047 df: stream

1048 File stream for writing RIFF file header.

1049 filesize: int

1050 Size of the file in bytes.

1051 tag: str

1052 The type of RIFF file. Default is a wave file.

1053 Exactly 4 characters long.

1054

1055 Returns

1056 -------

1057 n: int

1058 Number of bytes written to the stream.

1059

1060 Raises

1061 ------

1062 ValueError

1063 `tag` is not 4 characters long.
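
Examples
--------
The layout of the 12 header bytes can be illustrated without a file. This is a minimal sketch under the assumption that the size field stores the file size minus the 8 bytes of the `RIFF` tag and the size field itself; `riff_header` is a hypothetical helper, not part of this module.

```python
import struct

def riff_header(filesize, tag='WAVE'):
    """Return the 12 header bytes of a RIFF file (hypothetical helper)."""
    if len(tag) != 4:
        raise ValueError('tag must be exactly 4 characters long')
    # the size field excludes the 'RIFF' tag and the size field itself:
    size = max(filesize, 8) - 8
    return b'RIFF' + struct.pack('<I', size) + tag.encode('ascii')

header = riff_header(44)
size = struct.unpack('<I', header[4:8])[0]
```

For a file of 44 bytes the stored size is 36.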

1064 """

1065 if len(tag) != 4:

1066 raise ValueError(f'file tag "{tag}" must be exactly 4 characters long')

1067 if filesize < 8:

1068 filesize = 8

1069 df.write(b'RIFF')

1070 df.write(struct.pack('<I', filesize - 8))

1071 df.write(tag.encode('ascii', errors='strict'))

1072 return 12

1073

1074

1075def write_filesize(df, filesize=None):

1076 """Write the file size into the RIFF file header.

1077

1078 Parameters

1079 ----------

1080 df: stream

1081 File stream into which to write `filesize`.

1082 filesize: int

1083 Size of the file in bytes. If not specified or 0,

1084 then use current size of the file.
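
Examples
--------
The following sketch illustrates the idea on an in-memory stream instead of a real file: write a header with a placeholder size, append a chunk, then patch the size field and restore the stream position. It only uses `io` and `struct` and is an illustration, not a replacement for this function.

```python
import io
import struct

df = io.BytesIO()
df.write(b'RIFF' + struct.pack('<I', 0) + b'WAVE')    # size not yet known
df.write(b'data' + struct.pack('<I', 4) + bytes(4))   # a chunk appended later
pos = df.tell()                       # remember the current position
filesize = df.seek(0, io.SEEK_END)    # seek() returns the new absolute position
df.seek(4, io.SEEK_SET)               # the size field follows the 'RIFF' tag
df.write(struct.pack('<I', filesize - 8))
df.seek(pos, io.SEEK_SET)             # restore the position
stored = struct.unpack('<I', df.getvalue()[4:8])[0]
```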

1085 """

1086 pos = df.tell()

1087 if not filesize:

1088 df.seek(0, os.SEEK_END)

1089 filesize = df.tell()

1090 df.seek(4, os.SEEK_SET)

1091 df.write(struct.pack('<I', filesize - 8))

1092 df.seek(pos, os.SEEK_SET)

1093

1094

1095def write_chunk_name(df, pos, tag):

1096 """Change the name of a chunk.

1097

1098 Use this to make readers ignore an existing chunk by

1099 overwriting its name with an unknown one.

1100

1101 Parameters

1102 ----------

1103 df: stream

1104 File stream.

1105 pos: int

1106 Position of the chunk in the file stream.

1107 tag: str

1108 The new name of the chunk.

1109 Exactly 4 characters long.

1110

1111 Raises

1112 ------

1113 ValueError

1114 `tag` is not 4 characters long.

1115 """

1116 if len(tag) != 4:

1117 raise ValueError(f'file tag "{tag}" must be exactly 4 characters long')

1118 df.seek(pos, os.SEEK_SET)

1119 df.write(tag.encode('ascii', errors='strict'))

1120

1121

1122def write_format_chunk(df, channels, frames, rate, bits=16):

1123 """Write format chunk.

1124

1125 Parameters

1126 ----------

1127 df: stream

1128 File stream for writing FMT chunk.

1129 channels: int

1130 Number of channels contained in the data.

1131 frames: int

1132 Number of frames contained in the data.

1133 rate: int or float

1134 Sampling rate (frames per time) in Hertz.

1135 bits: 16 or 32

1136 Bit resolution of the data to be written.

1137

1138 Returns

1139 -------

1140 n: int

1141 Number of bytes written to the stream.
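
Examples
--------
The derived fields of the format chunk follow from the number of channels, the sampling rate, and the bit resolution. A minimal sketch with example values (stereo, 44.1 kHz, 16 bit), using the same `struct` layout as this function:

```python
import struct

channels, rate, bits = 2, 44100, 16
blockalign = channels * (bits // 8)   # bytes per frame
byterate = rate * blockalign          # bytes per second
# chunk size 16, format tag 1 (PCM), then the fields computed above:
chunk = b'fmt ' + struct.pack('<IHHIIHH', 16, 1, channels, rate,
                              byterate, blockalign, bits)
fields = struct.unpack('<IHHIIHH', chunk[4:])
```

The chunk is 24 bytes long: 8 bytes for tag and size plus 16 bytes of payload.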

1142 """

1143 blockalign = channels * (bits//8)

1144 byterate = int(rate) * blockalign

1145 df.write(b'fmt ')

1146 df.write(struct.pack('<IHHIIHH', 16, 1, channels, int(rate),

1147 byterate, blockalign, bits))

1148 return 8 + 16

1149

1150

1151def write_data_chunk(df, data, bits=16):

1152 """Write data chunk.

1153

1154 Parameters

1155 ----------

1156 df: stream

1157 File stream for writing data chunk.

1158 data: 1-D or 2-D array of floats

1159 Data with first axis time (frames) and optional second axis

1160 channels, with values between -1 and 1.

1161 bits: 16 or 32

1162 Bit resolution of the data to be written.

1163

1164 Returns

1165 -------

1166 n: int

1167 Number of bytes written to the stream.
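
Examples
--------
The scaling from floats to integers can be sketched with the standard library alone. Note one deliberate difference from this function: the sketch clips the scaled values, because 1.0 scaled by 2**15 does not fit into a signed 16-bit integer. `to_int` is a hypothetical helper.

```python
import math
import struct

def to_int(sample, bits=16):
    """Scale a float in [-1, 1] to a signed integer and clip (hypothetical helper)."""
    scale = 2 ** (bits - 1)
    return max(-scale, min(scale - 1, int(sample * scale)))

rate = 8000
samples = [math.sin(2.0 * math.pi * 440.0 * i / rate) for i in range(8)]
ints = [to_int(s) for s in samples]
payload = struct.pack(f'<{len(ints)}h', *ints)   # little-endian 16-bit integers
chunk = b'data' + struct.pack('<I', len(payload)) + payload
```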

1168 """

1169 df.write(b'data')

1170 df.write(struct.pack('<I', data.size * (bits//8)))

1171 buffer = data * 2**(bits-1)

1172 n = df.write(buffer.astype(f'<i{bits//8}').tobytes('C'))

1173 return 8 + n

1174

1175

1176def write_info_chunk(df, metadata, size=None):

1177 """Write metadata to LIST INFO chunk.

1178

1179 If `metadata` contains an 'INFO' key, then write the flat

1180 dictionary of this key as an INFO chunk. Otherwise, attempt to

1181 write all metadata items as an INFO chunk. The keys are translated

1182 via the `info_tags` variable back to INFO tags. If after

1183 translation any key is left that is longer than 4 characters or

1184 any key has a dictionary as a value (non-flat metadata), the INFO

1185 chunk is not written.

1186

1187 See https://exiftool.org/TagNames/RIFF.html#Info%20for%20valid%20info%20tags

1188

1189 Parameters

1190 ----------

1191 df: stream

1192 File stream for writing INFO chunk.

1193 metadata: nested dict

1194 Metadata as key-value pairs. Values can be strings, integers,

1195 or dictionaries.

1196 size: int or None

1197 If specified write this size into the list's size field.

1198

1199 Returns

1200 -------

1201 n: int

1202 Number of bytes written to the stream.

1203 keys_written: list of str

1204 Keys written to the INFO chunk.
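
Examples
--------
The byte layout of a LIST INFO chunk can be sketched as follows, assuming a flat dictionary whose keys are already 4-character INFO tags. `info_chunk` is a hypothetical helper that skips the key translation and the checks done by this function.

```python
import struct

def info_chunk(info):
    """Serialize a flat dict with 4-character keys as LIST INFO (hypothetical helper)."""
    body = b'INFO'
    for key, value in info.items():
        v = str(value).encode('latin-1')
        v += b' ' * (len(v) % 2)          # pad to an even number of bytes
        body += f'{key:<4s}'.encode('latin-1') + struct.pack('<I', len(v)) + v
    return b'LIST' + struct.pack('<I', len(body)) + body

chunk = info_chunk(dict(IART='JB', ICMT='abc'))
list_size = struct.unpack('<I', chunk[4:8])[0]
```

The list size counts the `INFO` tag and all sub-chunks, but not the 8 bytes of the `LIST` tag and the size field.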

1205

1206 """

1207 if not metadata:

1208 return 0, []

1209 is_info = False

1210 if 'INFO' in metadata:

1211 metadata = metadata['INFO']

1212 is_info = True

1213 tags = {v: k for k, v in info_tags.items()}

1214 n = 0

1215 for k in metadata:

1216 kn = tags.get(k, k)

1217 if len(kn) > 4:

1218 if is_info:

1219 warnings.warn(f'no 4-character info tag for key "{k}" found.')

1220 return 0, []

1221 if isinstance(metadata[k], dict):

1222 if is_info:

1223 warnings.warn(f'value of key "{k}" in INFO chunk cannot be a dictionary.')

1224 return 0, []

1225 try:

1226 v = str(metadata[k]).encode('latin-1')

1227 except UnicodeEncodeError:

1228 v = str(metadata[k]).encode('windows-1252')

1229 n += 8 + len(v) + len(v) % 2

1230 df.write(b'LIST')

1231 df.write(struct.pack('<I', size if size is not None else n + 4))

1232 df.write(b'INFO')

1233 keys_written = []

1234 for k in metadata:

1235 kn = tags.get(k, k)

1236 df.write(f'{kn:<4s}'.encode('latin-1'))

1237 try:

1238 v = str(metadata[k]).encode('latin-1')

1239 except UnicodeEncodeError:

1240 v = str(metadata[k]).encode('windows-1252')

1241 ns = len(v) + len(v) % 2

1242 if ns > len(v):

1243 v += b' '

1244 df.write(struct.pack('<I', ns))

1245 df.write(v)

1246 keys_written.append(k)

1247 return 12 + n, ['INFO'] if is_info else keys_written

1248

1249

1250def write_bext_chunk(df, metadata):

1251 """Write metadata to BEXT chunk.

1252

1253 If `metadata` contains a BEXT key, and this contains valid BEXT

1254 tags (one of the keys listed in the variable `bext_tags`), then

1255 write the dictionary of that key as a broadcast-audio extension

1256 chunk.

1257

1258 See https://tech.ebu.ch/docs/tech/tech3285.pdf for specifications.

1259

1260 Parameters

1261 ----------

1262 df: stream

1263 File stream for writing BEXT chunk.

1264 metadata: nested dict

1265 Metadata as key-value pairs. Values can be strings, integers,

1266 or dictionaries.

1267

1268 Returns

1269 -------

1270 n: int

1271 Number of bytes written to the stream.

1272 keys_written: list of str

1273 Keys written to the BEXT chunk.

1274

1275 """

1276 if not metadata or not 'BEXT' in metadata:

1277 return 0, []

1278 metadata = metadata['BEXT']

1279 for k in metadata:

1280 if not k in bext_tags:

1281 warnings.warn(f'no bext tag for key "{k}" found.')

1282 return 0, []

1283 n = 0

1284 for k in bext_tags:

1285 n += bext_tags[k]

1286 ch = metadata.get('CodingHistory', '').encode('ascii', errors='replace')

1287 if len(ch) >= 2 and ch[-2:] != b'\r\n':

1288 ch += b'\r\n'

1289 nch = len(ch) + len(ch) % 2

1290 n += nch

1291 df.write(b'BEXT')

1292 df.write(struct.pack('<I', n))

1293 for k in bext_tags:

1294 bn = bext_tags[k]

1295 if bn == 2:

1296 v = metadata.get(k, '0')

1297 df.write(struct.pack('<H', int(v)))

1298 elif bn == 8 and k == 'TimeReference':

1299 v = metadata.get(k, '0')

1300 df.write(struct.pack('<Q', int(v)))

1301 elif bn == 0:

1302 df.write(ch)

1303 df.write(bytes(nch - len(ch)))

1304 else:

1305 v = metadata.get(k, '').encode('ascii', errors='replace')

1306 df.write(v[:bn] + bytes(max(0, bn - len(v))))

1307 return 8 + n, ['BEXT']

1308

1309

1310def write_ixml_chunk(df, metadata, keys_written=None):

1311 """Write metadata to iXML chunk.

1312

1313 If `metadata` contains an IXML key with valid IXML tags (one of

1314 those listed in the variable `ixml_tags`), or the remaining tags

1315 in `metadata` are valid IXML tags, then write an IXML chunk.

1316

1317 See http://www.gallery.co.uk/ixml/ for the specification of iXML.

1318

1319 Parameters

1320 ----------

1321 df: stream

1322 File stream for writing IXML chunk.

1323 metadata: nested dict

1324 Meta-data as key-value pairs. Values can be strings, integers,

1325 or dictionaries.

1326 keys_written: list of str

1327 Keys that have already been written to INFO or BEXT chunk.

1328

1329 Returns

1330 -------

1331 n: int

1332 Number of bytes written to the stream.

1333 keys_written: list of str

1334 Keys written to the IXML chunk.

1335

1336 """

1337 def check_ixml(metadata):

1338 for k in metadata:

1339 if not k.upper() in ixml_tags:

1340 return False

1341 if isinstance(metadata[k], dict):

1342 if not check_ixml(metadata[k]):

1343 return False

1344 return True

1345

1346 def build_xml(node, metadata):

1347 kw = []

1348 for k in metadata:

1349 e = ET.SubElement(node, k)

1350 if isinstance(metadata[k], dict):

1351 build_xml(e, metadata[k])

1352 else:

1353 e.text = str(metadata[k])

1354 kw.append(k)

1355 return kw

1356

1357 if not metadata:

1358 return 0, []

1359 md = metadata

1360 if keys_written:

1361 md = {k: metadata[k] for k in metadata if not k in keys_written}

1362 if len(md) == 0:

1363 return 0, []

1364 has_ixml = False

1365 if 'IXML' in md and check_ixml(md['IXML']):

1366 md = md['IXML']

1367 has_ixml = True

1368 else:

1369 if not check_ixml(md):

1370 return 0, []

1371 root = ET.Element('BWFXML')

1372 kw = build_xml(root, md)

1373 bs = bytes(ET.tostring(root, xml_declaration=True,

1374 short_empty_elements=False))

1375 if len(bs) % 2 == 1:

1376 bs += bytes(1)

1377 df.write(b'IXML')

1378 df.write(struct.pack('<I', len(bs)))

1379 df.write(bs)

1380 return 8 + len(bs), ['IXML'] if has_ixml else kw

1381

1382

1383def write_guano_chunk(df, metadata, keys_written=None):

1384 """Write metadata to guan chunk.

1385

1386 GUANO is the Grand Unified Acoustic Notation Ontology, an

1387 extensible, open format for embedding metadata within bat acoustic

1388 recordings. See https://github.com/riggsd/guano-spec for details.

1389

1390 The GUANO specification allows for the inclusion of arbitrary

1391 nested keys and string encoded values. In that respect it is a

1392 well defined and easy to handle serialization of the [odML data

1393 model](https://doi.org/10.3389/fninf.2011.00016).

1394

1395 This will write *all* metadata that are not in `keys_written`.

1396

1397 Parameters

1398 ----------

1399 df: stream

1400 File stream for writing guano chunk.

1401 metadata: nested dict

1402 Metadata as key-value pairs. Values can be strings, integers,

1403 or dictionaries.

1404 keys_written: list of str

1405 Keys that have already been written to INFO, BEXT, or IXML chunks.

1406

1407 Returns

1408 -------

1409 n: int

1410 Number of bytes written to the stream.

1411 keys_written: list of str

1412 Top-level keys written to the GUANO chunk.
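
Examples
--------
The serialization of nested metadata into GUANO text can be sketched as follows. The sketch assumes that section names are joined to keys with `|`; `flatten` is a hypothetical stand-in for the module's own flattening routine, whose exact behavior may differ.

```python
import io

def flatten(md, prefix=''):
    """Flatten nested dicts, joining section names with '|' (hypothetical helper)."""
    flat = {}
    for k, v in md.items():
        if isinstance(v, dict):
            flat.update(flatten(v, f'{prefix}{k}|'))
        else:
            flat[f'{prefix}{k}'] = v
    return flat

md = dict(Recording=dict(Artist='JB', Device=dict(Gain='20dB')), Note='test')
sio = io.StringIO()
print('GUANO|Version:1.0', file=sio)     # version entry comes first
for k, v in flatten(md).items():
    print(f'{k}:{v}', file=sio)          # one key:value pair per line
lines = sio.getvalue().splitlines()
```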

1413

1414 """

1415 if not metadata:

1416 return 0, []

1417 md = metadata

1418 if keys_written:

1419 md = {k: metadata[k] for k in metadata if not k in keys_written}

1420 if len(md) == 0:

1421 return 0, []

1422 fmd = flatten_metadata(md, True, '|')

1423 for k in fmd:

1424 if isinstance(fmd[k], str):

1425 fmd[k] = fmd[k].replace('\n', r'\n')

1426 sio = io.StringIO()

1427 m, k = find_key(md, 'GUANO.Version')

1428 if k is None:

1429 sio.write('GUANO|Version:1.0\n')

1430 for k in fmd:

1431 sio.write(f'{k}:{fmd[k]}\n')

1432 bs = sio.getvalue().encode('utf-8')

1433 if len(bs) % 2 == 1:

1434 bs += b' '

1435 n = len(bs)

1436 df.write(b'guan')

1437 df.write(struct.pack('<I', n))

1438 df.write(bs)

1439 return 8 + n, list(md)

1440

1441

1442def write_cue_chunk(df, locs):

1443 """Write marker positions to cue chunk.

1444

1445 See https://www.recordingblogs.com/wiki/cue-chunk-of-a-wave-file

1446

1447 Parameters

1448 ----------

1449 df: stream

1450 File stream for writing cue chunk.

1451 locs: None or 2-D array of ints

1452 Positions (first column) and spans (optional second column)

1453 for each marker (rows).

1454

1455 Returns

1456 -------

1457 n: int

1458 Number of bytes written to the stream.
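
Examples
--------
Each cue point occupies 24 bytes, so the chunk size follows directly from the number of markers. A minimal sketch with three example positions; it uses the lower-case tag `cue ` of the WAVE specification, which makes no difference to the reader in this module, since it upper-cases tags before comparing them.

```python
import struct

positions = [100, 2500, 8000]        # marker positions in frames
chunk = b'cue ' + struct.pack('<II', 4 + len(positions) * 24, len(positions))
for i, pos in enumerate(positions):
    # cue id, position, tag of the referenced chunk, three unused offsets:
    chunk += struct.pack('<II4sIII', i, pos, b'data', 0, 0, 0)
size, count = struct.unpack('<II', chunk[4:12])
```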

1459 """

1460 if locs is None or len(locs) == 0:

1461 return 0

1462 df.write(b'CUE ')

1463 df.write(struct.pack('<II', 4 + len(locs)*24, len(locs)))

1464 for i in range(len(locs)):

1465 df.write(struct.pack('<II4sIII', i, locs[i,0], b'data', 0, 0, 0))

1466 return 12 + len(locs)*24

1467

1468

1469def write_playlist_chunk(df, locs):

1470 """Write marker spans to playlist chunk.

1471

1472 See https://www.recordingblogs.com/wiki/playlist-chunk-of-a-wave-file

1473

1474 Parameters

1475 ----------

1476 df: stream

1477 File stream for writing playlist chunk.

1478 locs: None or 2-D array of ints

1479 Positions (first column) and spans (optional second column)

1480 for each marker (rows).

1481

1482 Returns

1483 -------

1484 n: int

1485 Number of bytes written to the stream.

1486 """

1487 if locs is None or len(locs) == 0 or locs.shape[1] < 2:

1488 return 0

1489 n_spans = np.sum(locs[:,1] > 0)

1490 if n_spans == 0:

1491 return 0

1492 df.write(b'plst')

1493 df.write(struct.pack('<II', 4 + n_spans*12, n_spans))

1494 for i in range(len(locs)):

1495 if locs[i,1] > 0:

1496 df.write(struct.pack('<III', i, locs[i,1], 1))

1497 return 12 + n_spans*12

1498

1499

1500def write_adtl_chunks(df, locs, labels):

1501 """Write associated data list chunks.

1502

1503 See https://www.recordingblogs.com/wiki/associated-data-list-chunk-of-a-wave-file

1504

1505 Parameters

1506 ----------

1507 df: stream

1508 File stream for writing adtl chunk.

1509 locs: None or 2-D array of ints

1510 Positions (first column) and spans (optional second column)

1511 for each marker (rows).

1512 labels: None or 2-D array of string objects

1513 Labels (first column) and texts (second column) for each marker (rows).

1514

1515 Returns

1516 -------

1517 n: int

1518 Number of bytes written to the stream.

1519 """

1520 if labels is None or len(labels) == 0:

1521 return 0

1522 labels_size = 0

1523 for l in labels[:,0]:

1524 if hasattr(l, '__len__'):

1525 n = len(l)

1526 if n > 0:

1527 labels_size += 12 + n + n % 2

1528 text_size = 0

1529 if labels.shape[1] > 1:

1530 for t in labels[:,1]:

1531 if hasattr(t, '__len__'):

1532 n = len(t)

1533 if n > 0:

1534 text_size += 28 + n + n % 2

1535 if labels_size == 0 and text_size == 0:

1536 return 0

1537 size = 4 + labels_size + text_size

1538 spans = locs[:,1] if locs.shape[1] > 1 else None

1539 df.write(b'LIST')

1540 df.write(struct.pack('<I', size))

1541 df.write(b'adtl')

1542 for i in range(len(labels)):

1543 # labl sub-chunk:

1544 l = labels[i,0]

1545 if hasattr(l, '__len__'):

1546 n = len(l)

1547 if n > 0:

1548 n += n % 2

1549 df.write(b'labl')

1550 df.write(struct.pack('<II', 4 + n, i))

1551 df.write(f'{l:<{n}s}'.encode('latin-1', errors='replace'))

1552 # ltxt sub-chunk:

1553 if labels.shape[1] > 1:

1554 t = labels[i,1]

1555 if hasattr(t, '__len__'):

1556 n = len(t)

1557 if n > 0:

1558 n += n % 2

1559 span = spans[i] if spans is not None else 0

1560 df.write(b'ltxt')

1561 df.write(struct.pack('<III', 20 + n, i, span))

1562 df.write(struct.pack('<IHHHH', 0, 0, 0, 0, 0))

1563 df.write(f'{t:<{n}s}'.encode('latin-1', errors='replace'))

1564 return 8 + size

1565

1566

1567def write_lbl_chunk(df, locs, labels, rate):

1568 """Write marker positions, spans, labels, and texts to lbl chunk.

1569

1570 The proprietary LBL chunk is specific to wave files generated by

1571 [AviSoft](https://www.avisoft.com) products.

1572

1573 The labels (first column of `labels`) have special meanings.

1574 Markers with a span (a section label in the terminology of

1575 AviSoft) can be arranged in three levels when displayed:

1576

1577 - "M": layer 1, the top level section

1578 - "N": layer 2, sections below layer 1

1579 - "O": layer 3, sections below layer 2

1580 - "P": total, section start and end are displayed with two vertical lines.

1581

1582 All other labels mark single point labels with a time and a

1583 frequency (that we here discard). See also

1584 https://www.avisoft.com/Help/SASLab/menu_main_tools_labels.htm

1585

1586 If a marker has a span, and its label is not one of "M", "N", "O", or "P",

1587 then its label is set to "M".

1588 If a marker has no span, and its label is one of "M", "N", "O", or "P",

1589 then its label is set to "a".

1590

1591 Parameters

1592 ----------

1593 df: stream

1594 File stream for writing lbl chunk.

1595 locs: None or 2-D array of ints

1596 Positions (first column) and spans (optional second column)

1597 for each marker (rows).

1598 labels: None or 2-D array of string objects

1599 Labels (first column) and texts (second column) for each marker (rows).

1600 rate: float

1601 Sampling rate of the data in Hertz.

1602

1603 Returns

1604 -------

1605 n: int

1606 Number of bytes written to the stream.

1607

1608 """

1609 if locs is None or len(locs) == 0:

1610 return 0

1611 size = (1 + len(locs)) * 65

1612 df.write(b'LBL ')

1613 df.write(struct.pack('<I', size))

1614 # first empty entry (this is meant to be a title for the whole wave file):

1615 df.write(b' ' * 63)

1616 df.write(b'\r\n')

1617 for k in range(len(locs)):

1618 t0 = locs[k,0]/rate

1619 t1 = t0

1620 t1 += locs[k,1]/rate

1621 ls = 'M' if locs[k,1] > 0 else 'a'

1622 ts = ''

1623 if labels is not None and len(labels) > k:

1624 ls = labels[k,0]

1625 if ls != 0 and len(ls) > 0:

1626 ls = ls[0]

1627 if ls in 'MNOP':

1628 if locs[k,1] == 0:

1629 ls = 'a'

1630 else:

1631 if locs[k,1] > 0:

1632 ls = 'M'

1633 ts = labels[k,1]

1634 if ts == 0:

1635 ts = ''

1636 df.write(struct.pack('<14sc', f'{t0:e}'.encode('ascii', errors='replace'), b'\t'))

1637 df.write(struct.pack('<14sc', f'{t1:e}'.encode('ascii', errors='replace'), b'\t'))

1638 bs = f'{ts:31s}\t{ls}\r\n'.encode('ascii', errors='replace')

1639 df.write(bs)

1640 return 8 + size

1641

1642

1643def append_metadata_riff(df, metadata):

1644 """Append metadata chunks to RIFF file.

1645

1646 You still need to update the filesize by calling

1647 `write_filesize()`.

1648

1649 Parameters

1650 ----------

1651 df: stream

1652 File stream for writing metadata chunks.

1653 metadata: None or nested dict

1654 Metadata as key-value pairs. Values can be strings, integers,

1655 or dictionaries.

1656

1657 Returns

1658 -------

1659 n: int

1660 Number of bytes written to the stream.

1661 tags: list of str

1662 Tag names of chunks written to audio file.

1663 """

1664 if not metadata:

1665 return 0, []

1666 n = 0

1667 tags = []

1668 # metadata INFO chunk:

1669 nc, kw = write_info_chunk(df, metadata)

1670 if nc > 0:

1671 tags.append('LIST-INFO')

1672 n += nc

1673 # metadata BEXT chunk:

1674 nc, bkw = write_bext_chunk(df, metadata)

1675 if nc > 0:

1676 tags.append('BEXT')

1677 n += nc

1678 kw.extend(bkw)

1679 # metadata IXML chunk:

1680 nc, xkw = write_ixml_chunk(df, metadata, kw)

1681 if nc > 0:

1682 tags.append('IXML')

1683 n += nc

1684 kw.extend(xkw)

1685 # write remaining metadata to GUANO chunk:

1686 nc, gkw = write_guano_chunk(df, metadata, kw)

1687 if nc > 0:

1688 tags.append('GUAN')

1689 n += nc

1690 kw.extend(gkw)

1691 return n, tags

1692

1693

1694def append_markers_riff(df, locs, labels=None, rate=None,

1695 marker_hint='cue'):

1696 """Append marker chunks to RIFF file.

1697

1698 You still need to update the filesize by calling

1699 `write_filesize()`.

1700

1701 Parameters

1702 ----------

1703 df: stream

1704 File stream for writing metadata chunks.

1705 locs: None or 1-D or 2-D array of ints

1706 Marker positions (first column) and spans (optional second column)

1707 for each marker (rows).

1708 labels: None or 1-D or 2-D array of string objects

1709 Labels (first column) and texts (optional second column)

1710 for each marker (rows).

1711 rate: float

1712 Sampling rate of the data in Hertz, needed for storing markers

1713 in seconds.

1714 marker_hint: str

1715 - 'cue': store markers in cue and adtl chunks.

1716 - 'lbl': store markers in avisoft lbl chunk.

1717

1718 Returns

1719 -------

1720 n: int

1721 Number of bytes written to the stream.

1722 tags: list of str

1723 Tag names of chunks written to audio file.

1724

1725 Raises

1726 ------

1727 ValueError

1728 `marker_hint` not supported.

1729 IndexError

1730 `locs` and `labels` differ in len.

1731 """

1732 if locs is None or len(locs) == 0:

1733 return 0, []

1734 if labels is not None and len(labels) > 0 and len(labels) != len(locs):

1735 raise IndexError(f'locs and labels must have same number of elements.')

1736 # make locs and labels 2-D:

1737 if not locs is None and locs.ndim == 1:

1738 locs = locs.reshape(-1, 1)

1739 if not labels is None and labels.ndim == 1:

1740 labels = labels.reshape(-1, 1)

1741 # sort markers according to their position:

1742 idxs = np.argsort(locs[:,0])

1743 locs = locs[idxs,:]

1744 if not labels is None and len(labels) > 0:

1745 labels = labels[idxs,:]

1746 n = 0

1747 tags = []

1748 if marker_hint.lower() == 'cue':

1749 # write marker positions:

1750 nc = write_cue_chunk(df, locs)

1751 if nc > 0:

1752 tags.append('CUE ')

1753 n += nc

1754 # write marker spans:

1755 nc = write_playlist_chunk(df, locs)

1756 if nc > 0:

1757 tags.append('PLST')

1758 n += nc

1759 # write marker labels:

1760 nc = write_adtl_chunks(df, locs, labels)

1761 if nc > 0:

1762 tags.append('LIST-ADTL')

1763 n += nc

1764 elif marker_hint.lower() == 'lbl':

1765 # write avisoft labels:

1766 nc = write_lbl_chunk(df, locs, labels, rate)

1767 if nc > 0:

1768 tags.append('LBL ')

1769 n += nc

1770 else:

1771 raise ValueError(f'marker_hint "{marker_hint}" not supported for storing markers')

1772 return n, tags

1773

1774

1775def write_wave(filepath, data, rate, metadata=None, locs=None,

1776 labels=None, encoding=None, marker_hint='cue'):

1777 """Write time series, metadata and markers to a WAVE file.

1778

1779 Only 16-bit or 32-bit PCM encoding is supported.

1780

1781 Parameters

1782 ----------

1783 filepath: string

1784 Full path and name of the file to write.

1785 data: 1-D or 2-D array of floats

1786 Array with the data (first index time, second index channel,

1787 values within -1.0 and 1.0).

1788 rate: float

1789 Sampling rate of the data in Hertz.

1790 metadata: None or nested dict

1791 Metadata as key-value pairs. Values can be strings, integers,

1792 or dictionaries.

1793 locs: None or 1-D or 2-D array of ints

1794 Marker positions (first column) and spans (optional second column)

1795 for each marker (rows).

1796 labels: None or 1-D or 2-D array of string objects

1797 Labels (first column) and texts (optional second column)

1798 for each marker (rows).

1799 encoding: string or None

1800 Encoding of the data: 'PCM_32' or 'PCM_16'.

1801 If None or empty string use 'PCM_16'.

1802 marker_hint: str

1803 - 'cue': store markers in cue and adtl chunks.

1804 - 'lbl': store markers in avisoft lbl chunk.

1805

1806 Raises

1807 ------

1808 ValueError

1809 Encoding not supported.

1810 IndexError

1811 `locs` and `labels` differ in len.

1812

1813 See Also

1814 --------

1815 audioio.audiowriter.write_audio()

1816

1817 Examples

1818 --------

1819 ```

1820 import numpy as np

1821 from audioio.riffmetadata import write_wave

1822

1823 rate = 28000.0

1824 freq = 800.0

1825 time = np.arange(0.0, 1.0, 1/rate) # one second

1826 data = np.sin(2.0*np.pi*freq*time) # 800Hz sine wave

1827 md = dict(Artist='underscore_') # metadata

1828

1829 write_wave('audio/file.wav', data, rate, md)

1830 ```

1831 """

1832 if not filepath:

1833 raise ValueError('no file specified!')

1834 if not encoding:

1835 encoding = 'PCM_16'

1836 encoding = encoding.upper()

1837 bits = 0

1838 if encoding == 'PCM_16':

1839 bits = 16

1840 elif encoding == 'PCM_32':

1841 bits = 32

1842 else:

1843 raise ValueError(f'file encoding {encoding} not supported')

1844 if locs is not None and len(locs) > 0 and \

1845 labels is not None and len(labels) > 0 and len(labels) != len(locs):

1846 raise IndexError(f'locs and labels must have same number of elements.')

1847 # write WAVE file:

1848 with open(filepath, 'wb') as df:

1849 write_riff_chunk(df)

1850 if data.ndim == 1:

1851 write_format_chunk(df, 1, len(data), rate, bits)

1852 else:

1853 write_format_chunk(df, data.shape[1], data.shape[0],

1854 rate, bits)

1855 append_metadata_riff(df, metadata)

1856 write_data_chunk(df, data, bits)

1857 append_markers_riff(df, locs, labels, rate, marker_hint)

1858 write_filesize(df)

1859

1860

1861def append_riff(filepath, metadata=None, locs=None, labels=None,

1862 rate=None, marker_hint='cue'):

1863 """Append metadata and markers to an existing RIFF file.

1864

1865 Parameters

1866 ----------

1867 filepath: string

1868 Full path and name of the file to write.

1869 metadata: None or nested dict

1870 Metadata as key-value pairs. Values can be strings, integers,

1871 or dictionaries.

1872 locs: None or 1-D or 2-D array of ints

1873 Marker positions (first column) and spans (optional second column)

1874 for each marker (rows).

1875 labels: None or 1-D or 2-D array of string objects

1876 Labels (first column) and texts (optional second column)

1877 for each marker (rows).

1878 rate: float

1879 Sampling rate of the data in Hertz, needed for storing markers

1880 in seconds.

1881 marker_hint: str

1882 - 'cue': store markers in cue and adtl chunks.

1883 - 'lbl': store markers in avisoft lbl chunk.

1884

1885 Returns

1886 -------

1887 n: int

1888 Number of bytes written to the stream.

1889

1890 Raises

1891 ------

1892 IndexError

1893 `locs` and `labels` differ in len.

1894

1895 Examples

1896 --------

1897 ```

1898 import numpy as np

1899 from audioio.riffmetadata import append_riff

1900

1901 md = dict(Artist='underscore_') # metadata

1902 append_riff('audio/file.wav', md) # append them to existing audio file

1903 ```
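
Chunks that already exist in the file and are written again get blanked out by renaming them. The renaming rule can be sketched in isolation; `blank_tag` is a hypothetical helper that only mirrors how this function derives the new name from a chunk tag.

```python
def blank_tag(tag):
    """Return a 4-character name that readers do not recognize (hypothetical helper)."""
    if '-' in tag:
        return tag[5:7] + 'xx'      # list chunks: 'LIST-INFO' -> 'INxx'
    return tag[:2] + 'xx'           # plain chunks: 'BEXT' -> 'BExx'

names = [blank_tag(t) for t in ['LIST-INFO', 'LIST-ADTL', 'BEXT', 'CUE ']]
```

The new names keep the length of 4 characters, so overwriting a name in place does not shift any other bytes in the file.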

1904 """

1905 if not filepath:

1906 raise ValueError('no file specified!')

1907 if locs is not None and len(locs) > 0 and \

1908 labels is not None and len(labels) > 0 and len(labels) != len(locs):

1909 raise IndexError(f'locs and labels must have same number of elements.')

1910 # check RIFF file:

1911 chunks = read_chunk_tags(filepath)

1912 # append to RIFF file:

1913 n = 0

1914 with open(filepath, 'r+b') as df:

1915 tags = []

1916 df.seek(0, os.SEEK_END)

1917 nc, tgs = append_metadata_riff(df, metadata)

1918 n += nc

1919 tags.extend(tgs)

1920 nc, tgs = append_markers_riff(df, locs, labels, rate, marker_hint)

1921 n += nc

1922 tags.extend(tgs)

1923 write_filesize(df)

1924 # blank out already existing chunks:

1925 for tag in chunks:

1926 if tag in tags:

1927 if '-' in tag:

1928 xtag = tag[5:7] + 'xx'

1929 else:

1930 xtag = tag[:2] + 'xx'

1931 write_chunk_name(df, chunks[tag][0], xtag)

1932 return n

1933

1934

1935def demo(filepath):

1936 """Print metadata and markers of a RIFF/WAVE file.

1937

1938 Parameters

1939 ----------

1940 filepath: string

1941 Path of a RIFF/WAVE file.

1942 """

1943 def print_meta_data(meta_data, level=0):

1944 for sk in meta_data:

1945 md = meta_data[sk]

1946 if isinstance(md, dict):

1947 print(f'{"":<{level*4}}{sk}:')

1948 print_meta_data(md, level+1)

1949 else:

1950 v = str(md).replace('\n', '.').replace('\r', '.')

1951 print(f'{"":<{level*4}s}{sk:<20s}: {v}')

1952

1953 # read meta data:

1954 meta_data = metadata_riff(filepath, store_empty=False)

1955

1956 # print meta data:

1957 print()

1958 print('metadata:')

1959 print_meta_data(meta_data)

1960

1961 # read cues:

1962 locs, labels = markers_riff(filepath)

1963

1964 # print marker table:

1965 if len(locs) > 0:

1966 print()

1967 print('markers:')

1968 print(f'{"position":10} {"span":8} {"label":10} {"text":10}')

1969 for i in range(len(locs)):

1970 if i < len(labels):

1971 print(f'{locs[i,0]:10} {locs[i,1]:8} {labels[i,0]:10} {labels[i,1]:30}')

1972 else:

1973 print(f'{locs[i,0]:10} {locs[i,1]:8} {"-":10} {"-":10}')

1974

1975

1976def main(*args):

1977 """Call demo with command line arguments.

1978

1979 Parameters

1980 ----------

1981 args: list of strings

1982 Command line arguments as returned by sys.argv[1:]

1983 """

1984 if len(args) > 0 and (args[0] == '-h' or args[0] == '--help'):

1985 print()

1986 print('Usage:')

1987 print(' python -m src.audioio.riffmetadata [--help] <audio/file.wav>')

1988 print()

1989 return

1990

1991 if len(args) > 0:

1992 demo(args[0])

1993 else:

1994 rate = 44100

1995 t = np.arange(0, 2, 1/rate)

1996 x = np.sin(2*np.pi*440*t)

1997 imd = dict(IENG='JB', ICRD='2024-01-24', RATE=9,

1998 Comment='this is test1')

1999 bmd = dict(Description='a recording',

2000 OriginationDate='2024:01:24', TimeReference=123456,

2001 Version=42, CodingHistory='Test1\nTest2')

2002 xmd = dict(Project='Record all', Note='still testing',

2003 Sync_Point_List=dict(Sync_Point=1,

2004 Sync_Point_Comment='great'))

2005 omd = imd.copy()

2006 omd['Production'] = bmd

2007 md = dict(INFO=imd, BEXT=bmd, IXML=xmd,

2008 Recording=omd, Notes=xmd)

2009 locs = np.random.randint(10, len(x)-10, (5, 2))

2010 locs = locs[np.argsort(locs[:,0]),:]

2011 locs[:,1] = np.random.randint(0, 20, len(locs))

2012 labels = np.zeros((len(locs), 2), dtype=object)

2013 for i in range(len(labels)):

2014 labels[i,0] = chr(ord('a') + i % 26)

2015 labels[i,1] = chr(ord('A') + i % 26)*5

2016 write_wave('test.wav', x, rate, md, locs, labels)

2017 demo('test.wav')

2018

2019

2020if __name__ == "__main__":

2021 main(*sys.argv[1:])