Coverage for src / audioio / audiometadata.py: 99%

563 statements  

coverage.py v7.13.1, created at 2026-01-09 17:53 +0000

"""Working with metadata.

To interface the various ways metadata are stored in audio files, the
`audioio` package uses nested dictionaries. The keys are always
strings. Values are strings, integers, floats, datetimes, or other
types. Value strings can also be numbers followed by a unit,
e.g. "4.2mV". For defining subsections of key-value pairs, values can
be dictionaries. The dictionaries can be nested to arbitrary depth.

```py
>>> from audioio import print_metadata
>>> md = dict(Recording=dict(Experimenter='John Doe',
                             DateTimeOriginal='2023-10-01T14:10:02',
                             Count=42),
              Hardware=dict(Amplifier='Teensy_Amp 4.1',
                            Highpass='10Hz',
                            Gain='120mV'))
>>> print_metadata(md)
```
results in
```txt
Recording:
    Experimenter    : John Doe
    DateTimeOriginal: 2023-10-01T14:10:02
    Count           : 42
Hardware:
    Amplifier: Teensy_Amp 4.1
    Highpass : 10Hz
    Gain     : 120mV
```

Often, audio files have very specific ways to store metadata. You can
enforce using these by putting them into a dictionary that is added to
the metadata with a key having the name of the metadata type you want,
e.g. the "INFO", "BEXT", "iXML", and "GUAN" chunks of RIFF/WAVE files.
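
As a small illustration of this convention, metadata meant for a
specific chunk are nested under a key named after that chunk. The
field names below are taken from other examples in this module and
are only illustrative; how a particular file writer maps them into
the file is not shown here.

```python
# Metadata grouped under the names of RIFF/WAVE chunk types.
# The field names are illustrative examples only.
md = dict(INFO=dict(Gain='165.00mV',
                    DateTimeOriginal='2023-10-01T14:10:02'),
          BEXT=dict(OriginationDate='2023-10-01',
                    OriginationTime='14:10:02'))

# Chunk-specific sections are ordinary sub-dictionaries:
chunk_names = [k for k in md if isinstance(md[k], dict)]
print(chunk_names)
```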

## Functions

The `audiometadata` module provides functions for handling and
manipulating these nested dictionaries. Many functions take keys as
arguments for finding or setting specific key-value pairs. These keys
can be the key of a specific item of a (sub-) dictionary, no matter at
which level of the metadata hierarchy it is. For example, simply
searching for "Highpass" retrieves the corresponding value "10Hz",
although "Highpass" is contained in the sub-dictionary (or "section")
with key "Hardware". The same item can also be specified together with
its parent keys: "Hardware.Highpass". Parent keys (or section keys)
are by default separated by '.', but all functions have a `sep`
keyword argument that specifies the string separating section names in
keys. Key matching is case insensitive.
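
The lookup rules described above can be sketched in a few lines of
plain Python. This is a simplified, stand-alone illustration and not
the module's implementation; `lookup` is a hypothetical helper name,
and the module's own `find_key()` returns a dictionary and a key
rather than the value.

```python
def lookup(md, key, sep='.'):
    # Return the value stored under `key`, or None if it is not found.
    def search(d, keys):
        first = keys[0].strip().upper()
        for k in d:
            if k.upper() == first:       # case-insensitive match
                if len(keys) == 1:
                    return True, d[k]
                if isinstance(d[k], dict):
                    # matched a parent key: continue in that section
                    return search(d[k], keys[1:])
        # not on this level: descend into all subsections
        for k in d:
            if isinstance(d[k], dict):
                found, value = search(d[k], keys)
                if found:
                    return True, value
        return False, None

    return search(md, key.split(sep))[1]


md = dict(Recording=dict(Experimenter='John Doe'),
          Hardware=dict(Amplifier='Teensy_Amp 4.1', Highpass='10Hz'))
print(lookup(md, 'Highpass'))                    # key alone
print(lookup(md, 'hardware.highpass'))           # with parent key, any case
print(lookup(md, 'Hardware/Highpass', sep='/'))  # custom separator
```

All three calls find the same item.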

Since different metadata models name the same item by different keys,
the functions also accept lists of keys as arguments. The keys are
tried in the given order, and the value of the first key found is used.
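
A minimal sketch of why lists of keys are convenient: the same code
can read a value regardless of which metadata model named it. The
helper `first_value` is hypothetical and only approximates what the
`get_*()` functions of this module do.

```python
def first_value(md, keys):
    # Return the value of the first key in `keys` found anywhere in `md`.
    def walk(d):
        # yield all key-value pairs that are not sections
        for k, v in d.items():
            if isinstance(v, dict):
                yield from walk(v)
            else:
                yield k, v

    if isinstance(keys, str):
        keys = [keys]
    items = list(walk(md))
    for key in keys:
        for k, v in items:
            if k.upper() == key.upper():
                return v
    return None


# The same quantity under names used by different metadata models:
starttime_keys = ['DateTimeOriginal', 'Location_Time', 'Timestamp']
md_a = dict(INFO=dict(DateTimeOriginal='2023-10-01T14:10:02'))
md_b = dict(Timestamp='2023-10-01T14:10:02')
print(first_value(md_a, starttime_keys))
print(first_value(md_b, starttime_keys))
```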

Do not forget that you can easily manipulate the metadata by means of
the standard functions of dictionaries.

If you need to make a copy of the metadata use `deepcopy`:
```
from copy import deepcopy
md_orig = deepcopy(md)
```
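
The reason for `deepcopy` is that sections are dictionaries inside
dictionaries. A shallow copy only duplicates the top level and keeps
sharing the sections, as this small stand-alone example illustrates:

```python
from copy import copy, deepcopy

md = dict(Hardware=dict(Gain='120mV'))
shallow = copy(md)       # new top-level dict, shared sub-dictionaries
deep = deepcopy(md)      # sub-dictionaries are copied as well

md['Hardware']['Gain'] = '60mV'
print(shallow['Hardware']['Gain'])   # follows the change
print(deep['Hardware']['Gain'])      # keeps the original value
```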

### Output

Write nested dictionaries as texts:

- `write_metadata_text()`: write metadata into a text/yaml file.
- `print_metadata()`: write metadata to standard output.

### Flatten

Conversion between nested and flat dictionaries:

- `flatten_metadata()`: flatten hierarchical metadata to a single dictionary.
- `unflatten_metadata()`: unflatten a previously flattened metadata dictionary.
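
One consequence of flattening without section names is that keys
occurring in several sections collide. The stand-alone sketch below
mirrors the idea of `flatten_metadata()` in a few lines; `flatten` is
a hypothetical helper written for illustration, not the function
provided by this module.

```python
def flatten(md, keep_sections=False, sep='.', section=''):
    # Collect all key-value pairs of a nested dict in a single dict.
    flat = {}
    for k, v in md.items():
        if isinstance(v, dict):
            flat.update(flatten(v, keep_sections, sep, section + k + sep))
        elif keep_sections:
            flat[section + k] = v
        else:
            flat[k] = v
    return flat


md = dict(Amplifier=dict(Gain='120mV'), Sensor=dict(Gain='5mV'))
print(flatten(md))                      # the later 'Gain' overwrites the earlier one
print(flatten(md, keep_sections=True))  # both entries survive
```

Keeping the section names is therefore the safer choice whenever the
flat dictionary is to be unflattened again.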

### Parse numbers with units

- `parse_number()`: parse string with number and unit.
- `change_unit()`: scale numerical value to a new unit.
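
To make the two steps concrete, here is a rough stand-alone sketch of
splitting a value string into number and unit and of rescaling by SI
prefixes. It uses a regular expression and a tiny prefix table, which
is a different and much reduced approach compared to the functions of
this module; `split_number_unit` and `scale` are hypothetical names.

```python
import re

def split_number_unit(s):
    # Split a string like '4.2mV' into a number and a unit.
    # Returns (None, s) if `s` does not start with a number.
    m = re.match('[+-]?([0-9]+[.]?[0-9]*|[.][0-9]+)', s)
    if m is None:
        return None, s
    text = m.group(0)
    value = float(text) if '.' in text else int(text)
    return value, s[m.end():].strip()

# factors of a few SI prefixes, relative to the unprefixed unit:
prefixes = {'k': 1e3, 'm': 1e-3, 'u': 1e-6}

def scale(value, old_unit, new_unit, base):
    # factor of a unit: prefix factor if the unit is prefix + base, else 1
    def factor(unit):
        if unit.endswith(base) and unit[:-len(base)] in prefixes:
            return prefixes[unit[:-len(base)]]
        return 1.0
    return value * factor(old_unit) / factor(new_unit)


value, unit = split_number_unit('4.2mV')
print(value, unit)
print(scale(value, unit, 'V', base='V'))   # millivolts expressed in volts
```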

### Find and get values

Find keys and get their values parsed and converted to various types:

- `find_key()`: find dictionary in metadata hierarchy containing the specified key.
- `get_number_unit()`: find a key in metadata and return its number and unit.
- `get_number()`: find a key in metadata and return its value in a given unit.
- `get_int()`: find a key in metadata and return its integer value.
- `get_bool()`: find a key in metadata and return its boolean value.
- `get_datetime()`: find keys in metadata and return a datetime.
- `get_str()`: find a key in metadata and return its string value.
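
Date-times deserve a short illustration, because they are stored
either under a single key or split into a date and a time key. Using
only the standard library, and assuming ISO formatted strings, both
cases reduce to the following (the key names are examples):

```python
import datetime as dt

md = dict(OriginationDate='2024-03-02', OriginationTime='10:42:24',
          DateTimeOriginal='2023-04-15T22:10:00')

# date and time stored under two keys:
date = dt.date.fromisoformat(md['OriginationDate'])
time = dt.time.fromisoformat(md['OriginationTime'])
start = dt.datetime.combine(date, time)
print(start.isoformat())

# date-time stored under a single key:
single = dt.datetime.fromisoformat(md['DateTimeOriginal'])
print(single.isoformat())
```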

### Organize metadata

Add and remove metadata:

- `strlist_to_dict()`: convert list of key-value-pair strings to dictionary.
- `add_sections()`: add sections to metadata dictionary.
- `set_metadata()`: set values of existing metadata.
- `add_metadata()`: add or modify key-value pairs.
- `move_metadata()`: remove a key from metadata and add it to a dictionary.
- `remove_metadata()`: remove key-value pairs or sections from metadata.
- `cleanup_metadata()`: remove empty sections from metadata.

### Special metadata fields

Retrieve and set specific metadata:

- `get_gain()`: get gain and unit from metadata.
- `update_gain()`: update gain setting in metadata.
- `set_starttime()`: set all start-of-recording times in metadata.
- `update_starttime()`: update start-of-recording times in metadata.
- `bext_history_str()`: assemble a string for the BEXT CodingHistory field.
- `add_history()`: add a string describing coding history to metadata.
- `add_unwrap()`: add unwrap infos to metadata.

Lists of standard keys:

- `default_starttime_keys`: keys of times of start of the recording.
- `default_timeref_keys`: keys of integer time references.
- `default_gain_keys`: keys of gain settings.
- `default_history_keys`: keys of strings describing coding history.


## Command line script

The module can be run as a script from the command line to display the
metadata and markers contained in an audio file:

```sh
> audiometadata logger.wav
```
prints
```text
file:
  filepath    : logger.wav
  samplingrate: 96000Hz
  channels    : 16
  frames      : 17280000
  duration    : 180.000s

metadata:
  INFO:
      Bits            : 32
      Pins            : 1-CH2R,1-CH2L,1-CH3R,1-CH3L,2-CH2R,2-CH2L,2-CH3R,2-CH3L,3-CH2R,3-CH2L,3-CH3R,3-CH3L,4-CH2R,4-CH2L,4-CH3R,4-CH3L
      Gain            : 165.00mV
      uCBoard         : Teensy 4.1
      MACAdress       : 04:e9:e5:15:3e:95
      DateTimeOriginal: 2023-10-01T14:10:02
      Software        : TeeGrid R4-senors-logger v1.0
```


Alternatively, the script can be run from within the audioio source tree as:
```
python -m src.audioio.audiometadata audiofile.wav
```

Running
```sh
audiometadata --help
```
prints
```text
usage: audiometadata [-h] [--version] [-f] [-m] [-c] [-t] files [files ...]

Convert audio file formats.

positional arguments:
  files       audio file

options:
  -h, --help  show this help message and exit
  --version   show program's version number and exit
  -f          list file format only
  -m          list metadata only
  -c          list cues/markers only
  -t          list tags of all riff/wave chunks contained in the file

version 2.0.0 by Benda-Lab (2020-2024)
```

"""


import os
import sys
import glob
import argparse
import numpy as np
import datetime as dt

from .version import __version__, __year__


def write_metadata_text(fh, meta, prefix='', indent=4, replace=None):
    """Write meta data into a text/yaml file or stream.

    With the default parameters, the output is a valid yaml file.

    Parameters
    ----------
    fh: filename or stream
        If not a stream, the file with name `fh` is opened.
        Otherwise `fh` is used as a stream for writing.
    meta: nested dict
        Key-value pairs of metadata to be written into the file.
    prefix: str
        This string is written at the beginning of each line.
    indent: int
        Number of characters used for indentation of sections.
    replace: char or None
        If specified, replace special characters by this character.

    Examples
    --------
    ```
    from audioio import write_metadata_text
    md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)))
    write_metadata_text('info.txt', md)
    ```
    """

    def write_dict(df, md, level, smap):
        # width of the longest key that is not a section:
        w = 0
        for k in md:
            if not isinstance(md[k], dict) and w < len(k):
                w = len(k)
        for k in md:
            clevel = level*indent
            if isinstance(md[k], dict):
                df.write(f'{prefix}{"":>{clevel}}{k}:\n')
                write_dict(df, md[k], level+1, smap)
            else:
                value = md[k]
                if isinstance(value, (list, tuple)):
                    value = ', '.join([f'{v}' for v in value])
                else:
                    value = f'{value}'
                    value = value.replace('\r\n', r'\n')
                    value = value.replace('\n', r'\n')
                if len(smap) > 0:
                    value = value.translate(smap)
                df.write(f'{prefix}{"":>{clevel}}{k:<{w}}: {value}\n')

    if not meta:
        return
    if hasattr(fh, 'write'):
        own_file = False
    else:
        own_file = True
        fh = open(fh, 'w')
    smap = {}
    if replace:
        smap = str.maketrans('\r\n\t\x00', ''.join([replace]*4))
    write_dict(fh, meta, 0, smap)
    if own_file:
        fh.close()


def print_metadata(meta, prefix='', indent=4, replace=None):
    """Write meta data to standard output.

    Parameters
    ----------
    meta: nested dict
        Key-value pairs of metadata to be written to standard output.
    prefix: str
        This string is written at the beginning of each line.
    indent: int
        Number of characters used for indentation of sections.
    replace: char or None
        If specified, replace special characters by this character.

    Examples
    --------
    ```
    >>> from audioio import print_metadata
    >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            hh: 5
    iiii:
        jjj: 6
    ```
    """
    write_metadata_text(sys.stdout, meta, prefix, indent, replace)


def flatten_metadata(md, keep_sections=False, sep='.'):
    """Flatten hierarchical metadata to a single dictionary.

    Parameters
    ----------
    md: nested dict
        Metadata as returned by `metadata()`.
    keep_sections: bool
        If `True`, then prefix keys with section names, separated by `sep`.
    sep: str
        String for separating section names.

    Returns
    -------
    d: dict
        Non-nested dict containing all key-value pairs of `md`.

    Examples
    --------
    ```
    >>> from audioio import print_metadata, flatten_metadata
    >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            hh: 5
    iiii:
        jjj: 6

    >>> fmd = flatten_metadata(md, keep_sections=True)
    >>> print_metadata(fmd)
    aaaa       : 2
    bbbb.ccc   : 3
    bbbb.ddd   : 4
    bbbb.eee.hh: 5
    iiii.jjj   : 6
    ```
    """
    def flatten(cd, section):
        df = {}
        for k in cd:
            if isinstance(cd[k], dict):
                df.update(flatten(cd[k], section + k + sep))
            else:
                if keep_sections:
                    df[section+k] = cd[k]
                else:
                    df[k] = cd[k]
        return df

    return flatten(md, '')


def unflatten_metadata(md, sep='.'):
    """Unflatten a previously flattened metadata dictionary.

    Parameters
    ----------
    md: dict
        Flat dictionary with key-value pairs as obtained from
        `flatten_metadata()` with `keep_sections=True`.
    sep: str
        String that separates section names.

    Returns
    -------
    d: nested dict
        Hierarchical dictionary with sub-dictionaries and key-value pairs.

    Examples
    --------
    ```
    >>> from audioio import print_metadata, unflatten_metadata
    >>> fmd = {'aaaa': 2, 'bbbb.ccc': 3, 'bbbb.ddd': 4, 'bbbb.eee.hh': 5, 'iiii.jjj': 6}
    >>> print_metadata(fmd)
    aaaa       : 2
    bbbb.ccc   : 3
    bbbb.ddd   : 4
    bbbb.eee.hh: 5
    iiii.jjj   : 6

    >>> md = unflatten_metadata(fmd)
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            hh: 5
    iiii:
        jjj: 6
    ```
    """
    umd = {}        # unflattened metadata
    cmd = [umd]     # current metadata dicts for each level of the hierarchy
    csk = []        # current section keys
    for k in md:
        ks = k.split(sep)
        # go up the hierarchy:
        for i in range(len(csk) - len(ks)):
            csk.pop()
            cmd.pop()
        for kss in reversed(ks[:len(csk)]):
            if kss == csk[-1]:
                break
            csk.pop()
            cmd.pop()
        # add new sections:
        for kss in ks[len(csk):-1]:
            csk.append(kss)
            cmd[-1][kss] = {}
            cmd.append(cmd[-1][kss])
        # add key-value pair:
        cmd[-1][ks[-1]] = md[k]
    return umd


def parse_number(s):
    """Parse string with number and unit.

    Parameters
    ----------
    s: str, float, or int
        String to be parsed. The initial part of the string is
        expected to be a number, the part following the number is
        interpreted as the unit. If float or int, then return this
        as the value with empty unit.

    Returns
    -------
    v: None, int, or float
        Value of the string as float. Without decimal point, an int is returned.
        If the string does not contain a number, None is returned.
    u: str
        Unit that follows the initial number.
    n: int
        Number of digits behind the decimal point.

    Examples
    --------

    ```
    >>> from audioio import parse_number

    # integer:
    >>> parse_number('42')
    (42, '', 0)

    # integer with unit:
    >>> parse_number('42ms')
    (42, 'ms', 0)

    # float with unit:
    >>> parse_number('42.ms')
    (42.0, 'ms', 0)

    # float with unit:
    >>> parse_number('42.3ms')
    (42.3, 'ms', 1)

    # float with space and unit:
    >>> parse_number('423.17 Hz')
    (423.17, 'Hz', 2)
    ```

    """
    if not isinstance(s, str):
        if isinstance(s, int):
            return s, '', 0
        if isinstance(s, float):
            return s, '', 5
        else:
            return None, '', 0
    n = len(s)
    ip = n
    have_point = False
    for i in range(len(s)):
        if s[i] == '.':
            if have_point:
                n = i
                break
            have_point = True
            ip = i + 1
        if not s[i] in '0123456789.+-':
            n = i
            break
    if n == 0:
        return None, s, 0
    v = float(s[:n]) if have_point else int(s[:n])
    u = s[n:].strip()
    nd = n - ip if n >= ip else 0
    return v, u, nd


unit_prefixes = {'Deka': 1e1, 'deka': 1e1, 'Hekto': 1e2, 'hekto': 1e2,
                 'kilo': 1e3, 'Kilo': 1e3, 'Mega': 1e6, 'mega': 1e6,
                 'Giga': 1e9, 'giga': 1e9, 'Tera': 1e12, 'tera': 1e12,
                 'Peta': 1e15, 'peta': 1e15, 'Exa': 1e18, 'exa': 1e18,
                 'Dezi': 1e-1, 'dezi': 1e-1, 'Zenti': 1e-2, 'centi': 1e-2,
                 'Milli': 1e-3, 'milli': 1e-3, 'Micro': 1e-6, 'micro': 1e-6,
                 'Nano': 1e-9, 'nano': 1e-9, 'Piko': 1e-12, 'piko': 1e-12,
                 'Femto': 1e-15, 'femto': 1e-15, 'Atto': 1e-18, 'atto': 1e-18,
                 'da': 1e1, 'h': 1e2, 'K': 1e3, 'k': 1e3, 'M': 1e6,
                 'G': 1e9, 'T': 1e12, 'P': 1e15, 'E': 1e18,
                 'd': 1e-1, 'c': 1e-2, 'mu': 1e-6, 'u': 1e-6, 'm': 1e-3,
                 'n': 1e-9, 'p': 1e-12, 'f': 1e-15, 'a': 1e-18}
""" SI prefixes for units with corresponding factors. """


def change_unit(val, old_unit, new_unit):
    """Scale numerical value to a new unit.

    Adapted from https://github.com/relacs/relacs/blob/1facade622a80e9f51dbf8e6f8171ac74c27f100/options/src/parameter.cc#L1647-L1703

    Parameters
    ----------
    val: float
        Value given in `old_unit`.
    old_unit: str
        Unit of `val`.
    new_unit: str
        Requested unit of return value.

    Returns
    -------
    new_val: float
        The input value `val` scaled to `new_unit`.
        If one of the units is missing (and the other one is not '%'),
        `val` is returned unchanged.

    Examples
    --------

    ```
    >>> from audioio import change_unit
    >>> change_unit(5, 'mm', 'cm')
    0.5

    # missing unit, value is returned as is:
    >>> change_unit(5, '', 'cm')
    5

    >>> change_unit(5, 'mm', '')
    5

    >>> change_unit(5, 'cm', 'mm')
    50.0

    >>> change_unit(4, 'kg', 'g')
    4000.0

    >>> change_unit(12, '%', '')
    0.12

    >>> change_unit(1.24, '', '%')
    124.0

    >>> change_unit(2.5, 'min', 's')
    150.0

    >>> change_unit(3600, 's', 'h')
    1.0

    ```

    """
    # missing unit?
    if not old_unit and not new_unit:
        return val
    if not old_unit and new_unit != '%':
        return val
    if not new_unit and old_unit != '%':
        return val

    # special units that directly translate into factors:
    unit_factors = {'%': 0.01, 'hour': 60.0*60.0, 'h': 60.0*60.0, 'min': 60.0}

    # parse old unit:
    f1 = 1.0
    if old_unit in unit_factors:
        f1 = unit_factors[old_unit]
    else:
        for k in unit_prefixes:
            if len(old_unit) > len(k) and old_unit[:len(k)] == k:
                f1 = unit_prefixes[k]

    # parse new unit:
    f2 = 1.0
    if new_unit in unit_factors:
        f2 = unit_factors[new_unit]
    else:
        for k in unit_prefixes:
            if len(new_unit) > len(k) and new_unit[:len(k)] == k:
                f2 = unit_prefixes[k]

    return val*f1/f2


def find_key(metadata, key, sep='.'):
    """Find dictionary in metadata hierarchy containing the specified key.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    key: str
        Key to be searched for (case insensitive).
        May contain section names separated by `sep`, i.e.
        "aaa.bbb.ccc" searches "ccc" (can be key-value pair or section)
        in section "bbb" that needs to be a subsection of section "aaa".
    sep: str
        String that separates section names in `key`.

    Returns
    -------
    md: dict
        The innermost dictionary matching some sections of the search key.
        If `key` is not at all contained in the metadata,
        the top-level dictionary is returned.
    key: str
        The part of the search key that was not found in `md`, or
        the final part of the search key, found in `md`.

    Examples
    --------

    Independent of whether found or not found, you can assign to the
    returned dictionary with the returned key.

    ```
    >>> from audioio import print_metadata, find_key
    >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(ff=5)), gggg=dict(hhh=6))
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            ff: 5
    gggg:
        hhh: 6

    >>> m, k = find_key(md, 'bbbb.ddd')
    >>> m[k] = 10
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 10
    ...

    >>> m, k = find_key(md, 'hhh')
    >>> m[k] = 12
    >>> print_metadata(md)
    ...
    gggg:
        hhh: 12

    >>> m, k = find_key(md, 'bbbb.eee.xx')
    >>> m[k] = 42
    >>> print_metadata(md)
    ...
        eee:
            ff: 5
            xx: 42
    ...
    ```

    When searching for sections, the dictionary containing the
    searched section is returned:
    ```py
    >>> m, k = find_key(md, 'eee')
    >>> m[k]['yy'] = 46
    >>> print_metadata(md)
    ...
        eee:
            ff: 5
            xx: 42
            yy: 46
    ...
    ```

    """
    def find_keys(metadata, keys):
        key = keys[0].strip().upper()
        for k in metadata:
            if k.upper() == key:
                if len(keys) == 1:
                    # found key:
                    return True, metadata, k
                elif isinstance(metadata[k], dict):
                    # keep searching within the next section:
                    return find_keys(metadata[k], keys[1:])
        # search in subsections:
        for k in metadata:
            if isinstance(metadata[k], dict):
                found, mm, kk = find_keys(metadata[k], keys)
                if found:
                    return True, mm, kk
        # nothing found:
        return False, metadata, sep.join(keys)

    if metadata is None:
        return {}, None
    ks = key.strip().split(sep)
    found, mm, kk = find_keys(metadata, ks)
    return mm, kk


def get_number_unit(metadata, keys, sep='.', default=None,
                    default_unit='', remove=False):
    """Find a key in metadata and return its number and unit.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `keys`.
    default: None, int, or float
        Returned value if none of the `keys` is found or the value does
        not contain a number.
    default_unit: str
        Returned unit if none of the `keys` is found or the key's value
        does not have a unit.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None, int, or float
        Value referenced by one of the `keys` as float.
        Without decimal point, an int is returned.
        If none of the `keys` was found or
        the key's value does not contain a number,
        then `default` is returned.
    u: str
        Corresponding unit.

    Examples
    --------

    ```
    >>> from audioio import get_number_unit
    >>> md = dict(aaaa='42', bbbb='42.3ms')

    # integer:
    >>> get_number_unit(md, 'aaaa')
    (42, '')

    # float with unit:
    >>> get_number_unit(md, 'bbbb')
    (42.3, 'ms')

    # two keys:
    >>> get_number_unit(md, ['cccc', 'bbbb'])
    (42.3, 'ms')

    # not found:
    >>> get_number_unit(md, 'cccc')
    (None, '')

    # not found with default value:
    >>> get_number_unit(md, 'cccc', default=1.0, default_unit='a.u.')
    (1.0, 'a.u.')
    ```

    """
    if not metadata:
        return default, default_unit
    if not isinstance(keys, (list, tuple, np.ndarray)):
        keys = (keys,)
    value = default
    unit = default_unit
    for key in keys:
        m, k = find_key(metadata, key, sep)
        if k in m:
            v, u, _ = parse_number(m[k])
            if v is not None:
                if not u:
                    u = default_unit
                if remove:
                    del m[k]
                return v, u
            elif u and unit == default_unit:
                unit = u
    return value, unit


def get_number(metadata, unit, keys, sep='.', default=None, remove=False):
    """Find a key in metadata and return its value in a given unit.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    unit: str
        Unit in which to return numerical value referenced by one of the `keys`.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `keys`.
    default: None, int, or float
        Returned value if none of the `keys` is found or the value does
        not contain a number.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or float
        Value referenced by one of the `keys` as float scaled to `unit`.
        If none of the `keys` was found or
        the key's value does not contain a number,
        then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_number
    >>> md = dict(aaaa='42', bbbb='42.3ms')

    # milliseconds to seconds:
    >>> get_number(md, 's', 'bbbb')
    0.0423

    # milliseconds to microseconds:
    >>> get_number(md, 'us', 'bbbb')
    42300.0

    # value without unit is not scaled:
    >>> get_number(md, 'Hz', 'aaaa')
    42.0

    # two keys:
    >>> get_number(md, 's', ['cccc', 'bbbb'])
    0.0423

    # not found:
    >>> get_number(md, 's', 'cccc')
    None

    # not found with default value:
    >>> get_number(md, 's', 'cccc', default=1.0)
    1.0
    ```

    """
    v, u = get_number_unit(metadata, keys, sep, None, unit, remove)
    if v is None:
        return default
    else:
        return change_unit(v, u, unit)


def get_int(metadata, keys, sep='.', default=None, remove=False):
    """Find a key in metadata and return its integer value.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `keys`.
    default: None or int
        Return value if none of the `keys` is found or the value does
        not contain an integer.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or int
        Value referenced by one of the `keys` as integer.
        If none of the `keys` was found,
        the key's value does not contain a number or represents
        a floating point value, then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_int
    >>> md = dict(aaaa='42', bbbb='42.3ms')

    # integer:
    >>> get_int(md, 'aaaa')
    42

    # two keys:
    >>> get_int(md, ['cccc', 'aaaa'])
    42

    # float:
    >>> get_int(md, 'bbbb')
    None

    # not found:
    >>> get_int(md, 'cccc')
    None

    # not found with default value:
    >>> get_int(md, 'cccc', default=0)
    0
    ```

    """
    if not metadata:
        return default
    if not isinstance(keys, (list, tuple, np.ndarray)):
        keys = (keys,)
    for key in keys:
        m, k = find_key(metadata, key, sep)
        if k in m:
            v, _, n = parse_number(m[k])
            if v is not None and n == 0:
                if remove:
                    del m[k]
                return int(v)
    return default


def get_bool(metadata, keys, sep='.', default=None, remove=False):
    """Find a key in metadata and return its boolean value.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `keys`.
    default: None or bool
        Return value if none of the `keys` is found or the value does
        not specify a boolean value.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or bool
        Value referenced by one of the `keys` as boolean.
        True if 'true', 't', 'yes', 'y' (case insensitive) or any non-zero number.
        False if 'false', 'f', 'no', 'n' (case insensitive) or any number equal to zero.
        If none of the `keys` was found or
        the key's value does not specify a boolean value,
        then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_bool
    >>> md = dict(aaaa='TruE', bbbb='No', cccc=0, dddd=1, eeee=True, ffff='ui')

    # case insensitive:
    >>> get_bool(md, 'aaaa')
    True

    >>> get_bool(md, 'bbbb')
    False

    >>> get_bool(md, 'cccc')
    False

    >>> get_bool(md, 'dddd')
    True

    >>> get_bool(md, 'eeee')
    True

    # value is not a boolean:
    >>> get_bool(md, 'ffff')
    None

    # two keys (string is preferred over number):
    >>> get_bool(md, ['cccc', 'aaaa'])
    True

    # two keys (take first match):
    >>> get_bool(md, ['cccc', 'ffff'])
    False

    # no boolean value, with default value:
    >>> get_bool(md, 'ffff', default=False)
    False
    ```

    """
    if not metadata:
        return default
    if not isinstance(keys, (list, tuple, np.ndarray)):
        keys = (keys,)
    val = default
    mv = None
    kv = None
    for key in keys:
        m, k = find_key(metadata, key, sep)
        if k in m and not isinstance(m[k], dict):
            vs = m[k]
            v, _, _ = parse_number(vs)
            if v is not None:
                # remember the number, but keep looking for a string:
                val = abs(v) > 1e-8
                mv = m
                kv = k
            elif isinstance(vs, str):
                if vs.upper() in ['TRUE', 'T', 'YES', 'Y']:
                    if remove:
                        del m[k]
                    return True
                if vs.upper() in ['FALSE', 'F', 'NO', 'N']:
                    if remove:
                        del m[k]
                    return False
    if mv is not None and kv is not None and remove:
        del mv[kv]
    return val


default_starttime_keys = [['DateTimeOriginal'],
                          ['OriginationDate', 'OriginationTime'],
                          ['Location_Time'],
                          ['Timestamp']]
"""Default keys of times of start of the recording in metadata.
Used by `get_datetime()` and `update_starttime()` functions.
"""


1041def get_datetime(metadata, keys=default_starttime_keys, 

1042 sep='.', default=None, remove=False): 

1043 """Find keys in metadata and return a datetime. 

1044 

1045 Parameters 

1046 ---------- 

1047 metadata: nested dict 

1048 Metadata. 

1049 keys: tuple of str or list of tuple of str 

1050 Datetimes can be stored in metadata as two separate key-value pairs, 

1051 one for the date and one for the time. Or by a single key-value pair 

1052 for a date-time value. This is why the keys need to be specified in 

1053 tuples with one or two keys. 

1054 The value of the first tuple of keys found is returned. 

1055 Keys may contain section names separated by `sep`.  

1056 See `audiometadata.find_key()` for details. 

1057 The default values for the `keys` find the start time of a recording. 

1058 You can modify the default keys via the `default_starttime_keys` list 

1059 of the `audiometadata` module. 

1060 sep: str 

1061 String that separates section names in `key`. 

1062 default: None or str 

1063 Return value if `key` is not found or the value does 

1064 not contain a string. 

1065 remove: bool 

1066 If `True`, remove the found key from `metadata`. 

1067 

1068 Returns 

1069 ------- 

1070 v: None or datetime 

1071 Datetime referenced by `keys`. 

1072 If none of the `keys` was found, then `default` is returned. 

1073 

1074 Examples 

1075 -------- 

1076 

1077 ``` 

1078 >>> from audioio import get_datetime 

1079 >>> import datetime as dt 

1080 >>> md = dict(date='2024-03-02', time='10:42:24', 

1081 datetime='2023-04-15T22:10:00') 

1082 

1083 # separate date and time: 

1084 >>> get_datetime(md, ('date', 'time')) 

1085 datetime.datetime(2024, 3, 2, 10, 42, 24) 

1086 

1087 # single datetime: 

1088 >>> get_datetime(md, ('datetime',)) 

1089 datetime.datetime(2023, 4, 15, 22, 10) 

1090 

1091 # two alternative key tuples: 

1092 >>> get_datetime(md, [('aaaa',), ('date', 'time')]) 

1093 datetime.datetime(2024, 3, 2, 10, 42, 24) 

1094 

1095 # not found: 

1096 >>> get_datetime(md, ('cccc',)) 

1097 None 

1098 

1099 # not found with default value: 

1100 >>> get_datetime(md, ('cccc', 'dddd'), 

1101 default=dt.datetime(2022, 2, 22, 22, 2, 12)) 

1102 datetime.datetime(2022, 2, 22, 22, 2, 12) 

1103 ``` 

1104 

1105 """ 

1106 if not metadata: 

1107 return default 

1108 if len(keys) > 0 and isinstance(keys[0], str): 

1109 keys = (keys,) 

1110 for keyp in keys: 

1111 if len(keyp) == 1: 

1112 m, k = find_key(metadata, keyp[0], sep) 

1113 if k in m: 

1114 v = m[k] 

1115 if isinstance(v, dt.datetime): 

1116 if remove: 

1117 del m[k] 

1118 return v 

1119 elif isinstance(v, str): 

1120 if remove: 

1121 del m[k] 

1122 return dt.datetime.fromisoformat(v) 

1123 else: 

1124 md, kd = find_key(metadata, keyp[0], sep) 

1125 if kd not in md: 

1126 continue 

1127 if isinstance(md[kd], dt.date): 

1128 date = md[kd] 

1129 elif isinstance(md[kd], str): 

1130 date = dt.date.fromisoformat(md[kd]) 

1131 else: 

1132 continue 

1133 mt, kt = find_key(metadata, keyp[1], sep) 

1134 if kt not in mt: 

1135 continue 

1136 if isinstance(mt[kt], dt.time): 

1137 time = mt[kt] 

1138 elif isinstance(mt[kt], str): 

1139 time = dt.time.fromisoformat(mt[kt]) 

1140 else: 

1141 continue 

1142 if remove: 

1143 del md[kd] 

1144 del mt[kt] 

1145 return dt.datetime.combine(date, time) 

1146 return default 

1147 

1148 

1149def get_str(metadata, keys, sep='.', default=None, remove=False): 

1150 """Find a key in metadata and return its string value. 

1151 

1152 Parameters 

1153 ---------- 

1154 metadata: nested dict 

1155 Metadata. 

1156 keys: str or list of str 

1157 Keys in the metadata to be searched for (case insensitive). 

1158 Value of the first key found is returned. 

1159 May contain section names separated by `sep`.  

1160 See `audiometadata.find_key()` for details. 

1161 sep: str 

1162 String that separates section names in `keys`. 

1163 default: None or str 

1164 Return value if none of the `keys` is found or the found 

1165 value is a section, i.e. a dictionary. 

1166 remove: bool 

1167 If `True`, remove the found key from `metadata`. 

1168 

1169 Returns 

1170 ------- 

1171 v: None or str 

1172 String value referenced by `keys`. 

1173 If none of the `keys` was found, then `default` is returned. 

1174 

1175 Examples 

1176 -------- 

1177 

1178 ``` 

1179 >>> from audioio import get_str 

1180 >>> md = dict(aaaa=42, bbbb='hello') 

1181 

1182 # string: 

1183 >>> get_str(md, 'bbbb') 

1184 'hello' 

1185 

1186 # int as str: 

1187 >>> get_str(md, 'aaaa') 

1188 '42' 

1189 

1190 # two keys: 

1191 >>> get_str(md, ['cccc', 'bbbb']) 

1192 'hello' 

1193 

1194 # not found: 

1195 >>> get_str(md, 'cccc') 

1196 None 

1197 

1198 # not found with default value: 

1199 >>> get_str(md, 'cccc', default='-') 

1200 '-' 

1201 ``` 
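
How the first matching key wins can be sketched for a flat dictionary.
The helper `first_str()` is hypothetical; unlike `get_str()` it does not
descend into sections, and it assumes that all keys are strings:

```py
def first_str(md, keys, default=None):
    """Return the value of the first key found in a flat dict as a string."""
    if isinstance(keys, str):
        keys = (keys,)
    # map lower-case keys to the original ones for case-insensitive lookup:
    lower = {k.lower(): k for k in md}
    for key in keys:
        k = lower.get(key.lower())
        # skip missing keys and sections:
        if k is not None and not isinstance(md[k], dict):
            return str(md[k])
    return default

md = dict(aaaa=42, Bbbb='hello', cccc=dict(dd=1))
print(first_str(md, ['xxxx', 'bbbb']))
print(first_str(md, 'cccc', default='-'))
```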

1202 

1203 """ 

1204 if not metadata: 

1205 return default 

1206 if not isinstance(keys, (list, tuple, np.ndarray)): 

1207 keys = (keys,) 

1208 for key in keys: 

1209 m, k = find_key(metadata, key, sep) 

1210 if k in m and not isinstance(m[k], dict): 

1211 v = m[k] 

1212 if remove: 

1213 del m[k] 

1214 return str(v) 

1215 return default 

1216 

1217 

1218def add_sections(metadata, sections, value=False, sep='.'): 

1219 """Add sections to metadata dictionary. 

1220 

1221 Parameters 

1222 ---------- 

1223 metadata: nested dict 

1224 Metadata. 

1225 sections: str 

1226 Names of sections to be added to `metadata`. 

1227 Section names are separated by `sep`. 

1228 value: bool 

1229 If True, then the last element in `sections` is a key for a value, 

1230 not a section. 

1231 sep: str 

1232 String that separates section names in `sections`. 

1233 

1234 Returns 

1235 ------- 

1236 md: dict 

1237 Dictionary of the last added section. 

1238 key: str 

1239 Last key. Only returned if `value` is set to `True`. 

1240 

1241 Examples 

1242 -------- 

1243 

1244 Add a section and a sub-section to the metadata: 

1245 ``` 

1246 >>> from audioio import print_metadata, add_sections 

1247 >>> md = dict() 

1248 >>> m = add_sections(md, 'Recording.Location') 

1249 >>> m['Country'] = 'Lummerland' 

1250 >>> print_metadata(md) 

1251 Recording: 

1252 Location: 

1253 Country: Lummerland 

1254 ``` 

1255 

1256 Add a section with a key-value pair: 

1257 ``` 

1258 >>> md = dict() 

1259 >>> m, k = add_sections(md, 'Recording.Location', True) 

1260 >>> m[k] = 'Lummerland' 

1261 >>> print_metadata(md) 

1262 Recording: 

1263 Location: Lummerland 

1264 ``` 

1265 

1266 Works well together with `find_key()`: 

1267 ``` 

1268 >>> md = dict(Recording=dict()) 

1269 >>> m, k = find_key(md, 'Recording.Location.Country') 

1270 >>> m, k = add_sections(m, k, True) 

1271 >>> m[k] = 'Lummerland' 

1272 >>> print_metadata(md) 

1273 Recording: 

1274 Location: 

1275 Country: Lummerland 

1276 ``` 
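
Creating nested sections from a `sep`-separated name can also be
sketched without `audioio`. The helper `make_sections()` is hypothetical.
In contrast to the loop in this function, which assigns a new `dict()`
for each section name, it uses `dict.setdefault()` so that already
existing sections are kept:

```py
def make_sections(md, sections, sep='.'):
    """Create nested dictionaries and return the innermost one."""
    for name in sections.split(sep):
        if name:
            # reuse an existing section or create an empty one:
            md = md.setdefault(name, {})
    return md

md = dict(Recording=dict(Time='early'))
loc = make_sections(md, 'Recording.Location')
loc['Country'] = 'Lummerland'
print(md)
```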

1277 

1278 """ 

1279 mm = metadata 

1280 ks = sections.split(sep) 

1281 n = len(ks) 

1282 if value: 

1283 n -= 1 

1284 for k in ks[:n]: 

1285 if len(k) == 0: 

1286 continue 

1287 mm[k] = dict() 

1288 mm = mm[k] 

1289 if value: 

1290 return mm, ks[-1] 

1291 else: 

1292 return mm 

1293 

1294 

1295def strlist_to_dict(mds): 

1296 """Convert list of key-value-pair strings to dictionary. 

1297 

1298 Parameters 

1299 ---------- 

1300 mds: None or dict or str or list of str 

1301 - None - returns empty dictionary. 

1302 - Flat dictionary - returned as is. 

1303 - String with key and value separated by '='. 

1304 - List of strings with keys and values separated by '='. 

1305 Keys may contain section names. 

1306 

1307 Returns 

1308 ------- 

1309 md_dict: dict 

1310 Flat dictionary with key-value pairs. 

1311 Keys may contain section names. 

1312 Values are strings, other types or dictionaries. 
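
Examples
--------

The parsing step can be sketched on its own. The helper `parse_pairs()`
is hypothetical; it assumes that each string contains at least one '='
and splits at the first one only:

```py
def parse_pairs(pairs):
    """Parse 'key=value' strings into a flat dictionary."""
    if isinstance(pairs, str):
        pairs = [pairs]
    md = {}
    for pair in pairs:
        # split at the first '=' only, so values may contain '=':
        key, value = pair.split('=', 1)
        md[key.strip()] = value.strip()
    return md

print(parse_pairs(['Artist = John Doe', 'Recording.Time=late']))
```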

1313 """ 

1314 if mds is None: 

1315 return {} 

1316 if isinstance(mds, dict): 

1317 return mds 

1318 if not isinstance(mds, (list, tuple, np.ndarray)): 

1319 mds = (mds,) 

1320 md_dict = {} 

1321 for md in mds: 

1322 k, v = md.split('=', 1) # split at the first '=' only, values may contain '=' 

1323 k = k.strip() 

1324 v = v.strip() 

1325 md_dict[k] = v 

1326 return md_dict 

1327 

1328 

1329def set_metadata(metadata, mds, sep='.'): 

1330 """Set values of existing metadata. 

1331 

1332 Only if a key is found in the metadata, its value is updated. 

1333 

1334 Parameters 

1335 ---------- 

1336 metadata: nested dict 

1337 Metadata. 

1338 mds: dict or str or list of str 

1339 - Flat dictionary with key-value pairs for updating the metadata. 

1340 Values can be strings, other types or dictionaries. 

1341 - String with key and value separated by '='. 

1342 - List of strings with key and value separated by '='. 

1343 Keys may contain section names separated by `sep`. 

1344 sep: str 

1345 String that separates section names in the keys of `mds`. 

1346 

1347 Examples 

1348 -------- 

1349 ``` 

1350 >>> from audioio import print_metadata, set_metadata 

1351 >>> md = dict(Recording=dict(Time='early')) 

1352 >>> print_metadata(md) 

1353 Recording: 

1354 Time: early 

1355 

1356 >>> set_metadata(md, {'Artist': 'John Doe', # ignored, key does not exist 

1357 'Recording.Time': 'late'}) # change value of existing key 

1358 >>> print_metadata(md) 

1359 Recording: 

1360 Time: late 

1361 ``` 
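
Updating only existing keys, wherever they sit in the hierarchy, can be
sketched as a recursive search. The helper `set_existing()` is
hypothetical and, unlike `set_metadata()`, does not handle section names
in keys:

```py
def set_existing(md, key, value):
    """Set `key` to `value` where it already exists. Return True on success."""
    if key in md:
        md[key] = value
        return True
    # search the sub-sections:
    for sub in md.values():
        if isinstance(sub, dict) and set_existing(sub, key, value):
            return True
    return False

md = dict(Recording=dict(Time='early'))
print(set_existing(md, 'Time', 'late'))    # key exists in a sub-section
print(set_existing(md, 'Artist', 'Jane'))  # key does not exist, nothing is added
print(md)
```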

1362 

1363 See also 

1364 -------- 

1365 add_metadata() 

1366 strlist_to_dict() 

1367 

1368 """ 

1369 if metadata is None: 

1370 return 

1371 md_dict = strlist_to_dict(mds) 

1372 for k in md_dict: 

1373 mm, kk = find_key(metadata, k, sep) 

1374 if kk in mm: 

1375 mm[kk] = md_dict[k] 

1376 

1377 

1378def add_metadata(metadata, mds, sep='.'): 

1379 """Add or modify key-value pairs. 

1380 

1381 If a key does not exist, it is added to the metadata. 

1382 

1383 Parameters 

1384 ---------- 

1385 metadata: nested dict 

1386 Metadata. 

1387 mds: dict or str or list of str 

1388 - Flat dictionary with key-value pairs for updating the metadata. 

1389 Values can be strings or other types. 

1390 - String with key and value separated by '='. 

1391 - List of strings with key and value separated by '='. 

1392 Keys may contain section names separated by `sep`. 

1393 sep: str 

1394 String that separates section names in the keys of `mds`. 

1395 

1396 Examples 

1397 -------- 

1398 ``` 

1399 >>> from audioio import print_metadata, add_metadata 

1400 >>> md = dict(Recording=dict(Time='early')) 

1401 >>> print_metadata(md) 

1402 Recording: 

1403 Time: early 

1404 

1405 >>> add_metadata(md, {'Artist': 'John Doe', # new key-value pair 

1406 'Recording.Time': 'late', # change value of existing key  

1407 'Recording.Quality': 'amazing', # new key-value pair in existing section 

1408 'Location.Country': 'Lummerland'}) # new key-value pair in new section 

1409 >>> print_metadata(md) 

1410 Recording: 

1411 Time : late 

1412 Quality: amazing 

1413 Artist: John Doe 

1414 Location: 

1415 Country: Lummerland 

1416 ``` 

1417 

1418 See also 

1419 -------- 

1420 set_metadata() 

1421 strlist_to_dict() 

1422 

1423 """ 

1424 if metadata is None: 

1425 return 

1426 md_dict = strlist_to_dict(mds) 

1427 for k in md_dict: 

1428 mm, kk = find_key(metadata, k, sep) 

1429 mm, kk = add_sections(mm, kk, True, sep) 

1430 mm[kk] = md_dict[k] 

1431 

1432 

1433def move_metadata(src_md, dest_md, keys, new_key=None, sep='.'): 

1434 """Remove a key from metadata and add it to a dictionary. 

1435 

1436 Parameters 

1437 ---------- 

1438 src_md: nested dict 

1439 Metadata from which a key is removed. 

1440 dest_md: dict 

1441 Dictionary to which the found key and its value are added. 

1442 keys: str or list of str 

1443 List of keys to be searched for in `src_md`. 

1444 Move the first one found to `dest_md`. 

1445 See the `audiometadata.find_key()` function for details. 

1446 new_key: None or str 

1447 If specified add the value of the found key as `new_key` to 

1448 `dest_md`. Otherwise, use the search key. 

1449 sep: str 

1450 String that separates section names in `keys`. 

1451 

1452 Returns 

1453 ------- 

1454 moved: bool 

1455 `True` if key was found and moved to dictionary. 

1456  

1457 Examples 

1458 -------- 

1459 ``` 

1460 >>> from audioio import print_metadata, move_metadata 

1461 >>> md = dict(Artist='John Doe', Recording=dict(Gain='1.42mV')) 

1462 >>> move_metadata(md, md['Recording'], 'Artist', 'Experimentalist') 

1463 >>> print_metadata(md) 

1464 Recording: 

1465 Gain : 1.42mV 

1466 Experimentalist: John Doe 

1467 ``` 

1468  

1469 """ 

1470 if not src_md: 

1471 return False 

1472 if not isinstance(keys, (list, tuple, np.ndarray)): 

1473 keys = (keys,) 

1474 for key in keys: 

1475 m, k = find_key(src_md, key, sep) 

1476 if k in m: 

1477 dest_key = new_key if new_key else k 

1478 dest_md[dest_key] = m.pop(k) 

1479 return True 

1480 return False 

1481 

1482 

1483def remove_metadata(metadata, key_list, sep='.'): 

1484 """Remove key-value pairs or sections from metadata. 

1485 

1486 Parameters 

1487 ---------- 

1488 metadata: nested dict 

1489 Metadata. 

1490 key_list: str or list of str 

1491 List of keys to key-value pairs or sections to be removed 

1492 from the metadata. 

1493 sep: str 

1494 String that separates section names in the keys of `key_list`. 

1495 

1496 Examples 

1497 -------- 

1498 ``` 

1499 >>> from audioio import print_metadata, remove_metadata 

1500 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4)) 

1501 >>> remove_metadata(md, ('ccc',)) 

1502 >>> print_metadata(md) 

1503 aaaa: 2 

1504 bbbb: 

1505 ddd: 4 

1506 ``` 

1507 

1508 """ 

1509 if not metadata: 

1510 return 

1511 if not isinstance(key_list, (list, tuple, np.ndarray)): 

1512 key_list = (key_list,) 

1513 for k in key_list: 

1514 mm, kk = find_key(metadata, k, sep) 

1515 if kk in mm: 

1516 del mm[kk] 

1517 

1518 

1519def cleanup_metadata(metadata): 

1520 """Remove empty sections from metadata. 

1521 

1522 Parameters 

1523 ---------- 

1524 metadata: nested dict 

1525 Metadata. 

1526 

1527 Examples 

1528 -------- 

1529 ``` 

1530 >>> from audioio import print_metadata, cleanup_metadata 

1531 >>> md = dict(aaaa=2, bbbb=dict()) 

1532 >>> cleanup_metadata(md) 

1533 >>> print_metadata(md) 

1534 aaaa: 2 

1535 ``` 
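
As written below, this function tests a section for emptiness before
descending into it, so a section that only becomes empty once its
sub-sections are removed is kept. A sketch of a bottom-up variant; the
helper `prune_empty()` is hypothetical and not part of `audioio`:

```py
def prune_empty(md):
    """Recursively remove empty sections, children first."""
    for key in list(md):
        if isinstance(md[key], dict):
            prune_empty(md[key])
            # the section may have become empty only now:
            if not md[key]:
                del md[key]

md = dict(aaaa=2, bbbb=dict(cccc=dict()))
prune_empty(md)
print(md)
```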

1536 

1537 """ 

1538 if not metadata: 

1539 return 

1540 for k in list(metadata): 

1541 if isinstance(metadata[k], dict): 

1542 if len(metadata[k]) == 0: 

1543 del metadata[k] 

1544 else: 

1545 cleanup_metadata(metadata[k]) 

1546 

1547 

1548default_gain_keys = ['gain'] 

1549"""Default keys of gain settings in metadata. Used by `get_gain()` function. 

1550""" 

1551 

1552def get_gain(metadata, gain_key=default_gain_keys, sep='.', 

1553 default=None, default_unit='', remove=False): 

1554 """Get gain and unit from metadata. 

1555 

1556 Parameters 

1557 ---------- 

1558 metadata: nested dict 

1559 Metadata with key-value pairs. 

1560 gain_key: str or list of str 

1561 Key in the file's metadata that holds some gain information. 

1562 If found, the data will be multiplied with the gain, 

1563 and if available, the corresponding unit is returned. 

1564 See the `audiometadata.find_key()` function for details. 

1565 You can modify the default keys via the `default_gain_keys` list 

1566 of the `audiometadata` module. 

1567 sep: str 

1568 String that separates section names in `gain_key`. 

1569 default: None or float 

1570 Returned value if no valid gain was found in `metadata`. 

1571 default_unit: str 

1572 Returned unit if no valid gain was found in `metadata`. 

1573 remove: bool 

1574 If `True`, remove the found key from `metadata`. 

1575 

1576 Returns 

1577 ------- 

1578 fac: float 

1579 Gain factor. If not found in metadata, `default` is returned. 

1580 unit: string 

1581 Unit of the data if found in the metadata, otherwise `default_unit`. 

1582 """ 

1583 v, u = get_number_unit(metadata, gain_key, sep, default, 

1584 default_unit, remove) 

1585 # fix some TeeGrid gains: 

1586 if len(u) >= 2 and u[-2:] == '/V': 

1587 u = u[:-2] 

1588 return v, u 

1589 

1590 

1591def update_gain(metadata, fac, gain_key=default_gain_keys, sep='.'): 

1592 """Update gain setting in metadata. 

1593 

1594 Searches for the first appearance of a gain key in the metadata 

1595 hierarchy. If found, divide the gain value by `fac`. 

1596 

1597 Parameters 

1598 ---------- 

1599 metadata: nested dict 

1600 Metadata to be updated. 

1601 fac: float 

1602 Factor that was used to scale the data. 

1603 gain_key: str or list of str 

1604 Key in the file's metadata that holds some gain information. 

1605 If found, its value is divided by `fac`. A unit attached to 

1606 the value is kept. 

1607 See the `audiometadata.find_key()` function for details. 

1608 You can modify the default keys via the `default_gain_keys` list 

1609 of the `audiometadata` module. 

1610 sep: str 

1611 String that separates section names in `gain_key`. 

1612 

1613 Returns 

1614 ------- 

1615 done: bool 

1616 True if gain has been found and set. 

1617 

1618 

1619 Examples 

1620 -------- 

1621 

1622 ``` 

1623 >>> from audioio import print_metadata, update_gain 

1624 >>> md = dict(Artist='John Doe', Recording=dict(gain='1.4mV')) 

1625 >>> update_gain(md, 2) 

1626 >>> print_metadata(md) 

1627 Artist: John Doe 

1628 Recording: 

1629 gain: 0.70mV 

1630 ``` 
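
Rescaling a number that carries a unit can be sketched with plain
Python. The helper `scale_gain()` is hypothetical; it only mimics the
formatting of the example above (one more decimal than the original
number) and does not use `parse_number()`:

```py
def scale_gain(gain, fac):
    """Divide the number in a string like '1.4mV' by `fac`, keeping the unit."""
    # find the longest leading part of the string that is a number:
    n = len(gain)
    while n > 0:
        try:
            value = float(gain[:n])
            break
        except ValueError:
            n -= 1
    else:
        return gain   # no number found, leave the string untouched
    number = gain[:n].strip()
    unit = gain[n:].strip()
    # keep one more decimal than the original number had:
    decimals = len(number.split('.')[1]) if '.' in number else 0
    return f'{value/fac:.{decimals + 1}f}{unit}'

print(scale_gain('1.4mV', 2))
```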

1631 

1632 """ 

1633 if not metadata: 

1634 return False 

1635 if not isinstance(gain_key, (list, tuple, np.ndarray)): 

1636 gain_key = (gain_key,) 

1637 for gk in gain_key: 

1638 m, k = find_key(metadata, gk, sep) 

1639 if k in m and not isinstance(m[k], dict): 

1640 vs = m[k] 

1641 if isinstance(vs, (int, float)): 

1642 m[k] = vs/fac 

1643 else: 

1644 v, u, n = parse_number(vs) 

1645 if v is not None: 

1646 # fix some TeeGrid gains: 

1647 if len(u) >= 2 and u[-2:] == '/V': 

1648 u = u[:-2] 

1649 m[k] = f'{v/fac:.{n+1}f}{u}' 

1650 return True 

1651 return False 

1652 

1653 

1654def set_starttime(metadata, datetime_value, 

1655 time_keys=default_starttime_keys): 

1656 """Set all start-of-recording times in metadata. 

1657 

1658 Parameters 

1659 ---------- 

1660 metadata: nested dict 

1661 Metadata to be updated. 

1662 datetime_value: datetime or str 

1663 Start date and time of the recording. Strings need to be in ISO format. 

1664 time_keys: tuple of str or list of tuple of str 

1665 Keys to fields denoting calendar times, i.e. dates and times. 

1666 Datetimes can be stored in metadata either as two separate key-value 

1667 pairs, one for the date and one for the time, or as a single key-value 

1668 pair holding a date-time value. This is why the keys need to be 

1669 specified in tuples with one or two keys. 

1670 Keys may contain section names separated by the default 

1671 separator of `audiometadata.find_key()`, see there for details. 

1672 You can modify the default time keys via the `default_starttime_keys` 

1673 list of the `audiometadata` module. 

1674 

1675 Returns 

1676 ------- 

1677 success: bool 

1678 True if at least one time has been set. 

1679 

1680 Example 

1681 ------- 

1682 ``` 

1683 >>> from audioio import print_metadata, set_starttime 

1684 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00', 

1685 OtherTime='2023-05-16T23:20:10', 

1686 BEXT=dict(OriginationDate='2024-03-02', 

1687 OriginationTime='10:42:24')) 

1688 >>> set_starttime(md, '2024-06-17T22:10:05') 

1689 >>> print_metadata(md) 

1690 DateTimeOriginal: 2024-06-17T22:10:05 

1691 OtherTime : 2023-05-16T23:20:10 

1692 BEXT: 

1693 OriginationDate: 2024-06-17 

1694 OriginationTime: 22:10:05 

1695 ``` 

1696 

1697 """ 

1698 if not metadata: 

1699 return False 

1700 if isinstance(datetime_value, str): 

1701 datetime_value = dt.datetime.fromisoformat(datetime_value) 

1702 success = False 

1703 if len(time_keys) > 0 and isinstance(time_keys[0], str): 

1704 time_keys = (time_keys,) 

1705 for key in time_keys: 

1706 if len(key) == 1: 

1707 # datetime: 

1708 m, k = find_key(metadata, key[0]) 

1709 if k in m and not isinstance(m[k], dict): 

1710 if isinstance(m[k], dt.datetime): 

1711 m[k] = datetime_value 

1712 else: 

1713 m[k] = datetime_value.isoformat(timespec='seconds') 

1714 success = True 

1715 else: 

1716 # separate date and time: 

1717 md, kd = find_key(metadata, key[0]) 

1718 if not kd in md or isinstance(md[kd], dict): 

1719 continue 

1720 if isinstance(md[kd], dt.date): 

1721 md[kd] = datetime_value.date() 

1722 else: 

1723 md[kd] = datetime_value.date().isoformat() 

1724 mt, kt = find_key(metadata, key[1]) 

1725 if not kt in mt or isinstance(mt[kt], dict): 

1726 continue 

1727 if isinstance(mt[kt], dt.time): 

1728 mt[kt] = datetime_value.time() 

1729 else: 

1730 mt[kt] = datetime_value.time().isoformat(timespec='seconds') 

1731 success = True 

1732 return success 

1733 

1734 

1735default_timeref_keys = ['TimeReference'] 

1736"""Default keys of integer time references in metadata. 

1737Used by `update_starttime()` function. 

1738""" 

1739 

1740def update_starttime(metadata, deltat, rate, 

1741 time_keys=default_starttime_keys, 

1742 ref_keys=default_timeref_keys): 

1743 """Update start-of-recording times in metadata. 

1744 

1745 Add `deltat` to the `time_keys` and `ref_keys` fields in the metadata. 

1746 

1747 Parameters 

1748 ---------- 

1749 metadata: nested dict 

1750 Metadata to be updated. 

1751 deltat: float or timedelta 

1752 Time to be added to start times, in seconds if given as a number. 

1753 rate: float 

1754 Sampling rate of the data in Hertz. 

1755 time_keys: tuple of str or list of tuple of str 

1756 Keys to fields denoting calendar times, i.e. dates and times. 

1757 Datetimes can be stored in metadata either as two separate key-value 

1758 pairs, one for the date and one for the time, or as a single key-value 

1759 pair holding a date-time value. This is why the keys need to be 

1760 specified in tuples with one or two keys. 

1761 Keys may contain section names separated by the default separator of `find_key()`. 

1762 See `audiometadata.find_key()` for details. 

1763 You can modify the default time keys via the `default_starttime_keys` 

1764 list of the `audiometadata` module. 

1765 ref_keys: str or list of str 

1766 Keys to time references, i.e. integer numbers of samples relative to 

1767 a reference time. 

1768 Keys may contain section names separated by the default separator of `find_key()`. 

1769 See `audiometadata.find_key()` for details. 

1770 You can modify the default reference keys via the 

1771 `default_timeref_keys` list of the `audiometadata` module. 

1772 

1773 Returns 

1774 ------- 

1775 success: bool 

1776 True if at least one time has been updated. 

1777 

1778 Example 

1779 ------- 

1780 ``` 

1781 >>> from audioio import print_metadata, update_starttime 

1782 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00', 

1783 OtherTime='2023-05-16T23:20:10', 

1784 BEXT=dict(OriginationDate='2024-03-02', 

1785 OriginationTime='10:42:24', 

1786 TimeReference=123456)) 

1787 >>> update_starttime(md, 4.2, 48000) 

1788 >>> print_metadata(md) 

1789 DateTimeOriginal: 2023-04-15T22:10:04 

1790 OtherTime : 2023-05-16T23:20:10 

1791 BEXT: 

1792 OriginationDate: 2024-03-02 

1793 OriginationTime: 10:42:28 

1794 TimeReference : 325056 

1795 ``` 
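
The arithmetic behind the example can be reproduced with the standard
library alone. This is only a sketch of the two kinds of updates
(calendar time and sample-based time reference), not the code path of
this function:

```py
import datetime as dt

deltat = 4.2      # seconds to be added to the start time
rate = 48000      # sampling rate in Hertz

# calendar time stored as ISO string:
start = dt.datetime.fromisoformat('2023-04-15T22:10:00')
start += dt.timedelta(seconds=deltat)
# with timespec='seconds' fractions of a second are truncated:
start_str = start.isoformat(timespec='seconds')

# time reference stored as number of samples:
tref = 123456 + round(deltat*rate)

print(start_str, tref)
```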

1796 

1797 """ 

1798 if not metadata: 

1799 return False 

1800 if not isinstance(deltat, dt.timedelta): 

1801 deltat = dt.timedelta(seconds=deltat) 

1802 success = False 

1803 if len(time_keys) > 0 and isinstance(time_keys[0], str): 

1804 time_keys = (time_keys,) 

1805 for key in time_keys: 

1806 if len(key) == 1: 

1807 # datetime: 

1808 m, k = find_key(metadata, key[0]) 

1809 if k in m and not isinstance(m[k], dict): 

1810 if isinstance(m[k], dt.datetime): 

1811 m[k] += deltat 

1812 else: 

1813 datetime = dt.datetime.fromisoformat(m[k]) + deltat 

1814 m[k] = datetime.isoformat(timespec='seconds') 

1815 success = True 

1816 else: 

1817 # separate date and time: 

1818 md, kd = find_key(metadata, key[0]) 

1819 if not kd in md or isinstance(md[kd], dict): 

1820 continue 

1821 if isinstance(md[kd], dt.date): 

1822 date = md[kd] 

1823 is_date = True 

1824 else: 

1825 date = dt.date.fromisoformat(md[kd]) 

1826 is_date = False 

1827 mt, kt = find_key(metadata, key[1]) 

1828 if not kt in mt or isinstance(mt[kt], dict): 

1829 continue 

1830 if isinstance(mt[kt], dt.time): 

1831 time = mt[kt] 

1832 is_time = True 

1833 else: 

1834 time = dt.time.fromisoformat(mt[kt]) 

1835 is_time = False 

1836 datetime = dt.datetime.combine(date, time) + deltat 

1837 md[kd] = datetime.date() if is_date else datetime.date().isoformat() 

1838 mt[kt] = datetime.time() if is_time else datetime.time().isoformat(timespec='seconds') 

1839 success = True 

1840 # time reference in samples: 

1841 if isinstance(ref_keys, str): 

1842 ref_keys = (ref_keys,) 

1843 for key in ref_keys: 

1844 m, k = find_key(metadata, key) 

1845 if k in m and not isinstance(m[k], dict): 

1846 is_int = isinstance(m[k], int) 

1847 tref = int(m[k]) 

1848 tref += int(np.round(deltat.total_seconds()*rate)) 

1849 m[k] = tref if is_int else f'{tref}' 

1850 success = True 

1851 return success 

1852 

1853 

1854def bext_history_str(encoding, rate, channels, text=None): 

1855 """ Assemble a string for the BEXT CodingHistory field. 

1856 

1857 Parameters 

1858 ---------- 

1859 encoding: str or None 

1860 Encoding of the data. 

1861 rate: int or float 

1862 Sampling rate in Hertz. 

1863 channels: int 

1864 Number of channels. 

1865 text: str or None 

1866 Optional free text. 

1867 

1868 Returns 

1869 ------- 

1870 s: str 

1871 String for the BEXT CodingHistory field, 

1872 something like "A=PCM,F=44100,W=16,M=stereo,T=cut out" 
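
Examples
--------

The returned string is a comma-separated list of `key=value` items.
A small sketch of how such a string can be taken apart again, assuming
that the free text contains no commas:

```py
history = 'A=PCM,F=44100,W=16,M=stereo,T=cut out'
# each item is split at its first '=' into a key and a value:
fields = dict(item.split('=', 1) for item in history.split(','))
rate = int(fields['F'])
print(fields)
```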

1873 """ 

1874 codes = [] 

1875 bits = None 

1876 if encoding is not None: 

1877 if encoding[:3] == 'PCM': 

1878 bits = int(encoding[4:]) 

1879 encoding = 'PCM' 

1880 codes.append(f'A={encoding}') 

1881 codes.append(f'F={rate:.0f}') 

1882 if bits is not None: 

1883 codes.append(f'W={bits}') 

1884 mode = None 

1885 if channels == 1: 

1886 mode = 'mono' 

1887 elif channels == 2: 

1888 mode = 'stereo' 

1889 if mode is not None: 

1890 codes.append(f'M={mode}') 

1891 if text is not None: 

1892 codes.append(f'T={text.rstrip()}') 

1893 return ','.join(codes) 

1894 

1895 

1896default_history_keys = ['History', 

1897 'CodingHistory', 

1898 'BWF_CODING_HISTORY'] 

1899"""Default keys of strings describing coding history in metadata. 

1900Used by `add_history()` function. 

1901""" 

1902 

1903def add_history(metadata, history, new_key=None, pre_history=None, 

1904 history_keys=default_history_keys, sep='.'): 

1905 """Add a string describing coding history to metadata. 

1906  

1907 Add `history` to the `history_keys` fields in the metadata. If 

1908 none of these fields are present but `new_key` is specified, then 

1909 assign `pre_history` and `history` to this key. If this key does 

1910 not exist in the metadata, it is created. 

1911 

1912 Parameters 

1913 ---------- 

1914 metadata: nested dict 

1915 Metadata to be updated. 

1916 history: str 

1917 String to be added to the history. 

1918 new_key: str or None 

1919 Sections and name of a history key to be added to `metadata`. 

1920 Section names are separated by `sep`. 

1921 pre_history: str or None 

1922 If a new key `new_key` is created, then assign this string followed 

1923 by `history`. 

1924 history_keys: str or list of str 

1925 Keys to fields where to add `history`. 

1926 Keys may contain section names separated by `sep`.  

1927 See `audiometadata.find_key()` for details. 

1928 You can modify the default history keys via the `default_history_keys` 

1929 list of the `audiometadata` module. 

1930 sep: str 

1931 String that separates section names in `new_key` and `history_keys`. 

1932 

1933 Returns 

1934 ------- 

1935 success: bool 

1936 True if the history string has been added to the metadata. 

1937 

1938 Example 

1939 ------- 

1940 Add string to existing history key-value pair: 

1941 ``` 

1942 >>> from audioio import add_history 

1943 >>> md = dict(aaa='xyz', BEXT=dict(CodingHistory='original recordings')) 

1944 >>> add_history(md, 'just a snippet') 

1945 >>> print(md['BEXT']['CodingHistory']) 

1946 original recordings 

1947 just a snippet 

1948 ``` 

1949 

1950 Assign string to new key-value pair: 

1951 ``` 

1952 >>> md = dict(aaa='xyz', BEXT=dict(OriginationDate='2024-02-12')) 

1953 >>> add_history(md, 'just a snippet', 'BEXT.CodingHistory', 'original data') 

1954 >>> print(md['BEXT']['CodingHistory']) 

1955 original data 

1956 just a snippet 

1957 ``` 
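
Appending to a history string on a line of its own can be sketched as
follows. The helper `append_history()` is hypothetical; the line
terminator is built with `chr()` only to keep this docstring free of
backslashes:

```py
CRLF = chr(13) + chr(10)  # carriage return and line feed

def append_history(old, new):
    """Append `new` to the history string `old` on a line of its own."""
    # add a line terminator only if `old` does not end with one:
    if old and old[-1] not in CRLF:
        old += CRLF
    return old + new

history = append_history('original recordings', 'just a snippet')
print(history.splitlines())
```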

1958 

1959 """ 

1960 if not metadata: 

1961 return False 

1962 if isinstance(history_keys, str): 

1963 history_keys = (history_keys,) 

1964 success = False 

1965 for keys in history_keys: 

1966 m, k = find_key(metadata, keys, sep) 

1967 if k in m and not isinstance(m[k], dict): 

1968 s = m[k] 

1969 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r': 

1970 s += '\r\n' 

1971 s += history 

1972 m[k] = s 

1973 success = True 

1974 if not success and new_key: 

1975 m, k = find_key(metadata, new_key, sep) 

1976 m, k = add_sections(m, k, True, sep) 

1977 s = '' 

1978 if pre_history is not None: 

1979 s = pre_history 

1980 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r': 

1981 s += '\r\n' 

1982 s += history 

1983 m[k] = s 

1984 success = True 

1985 return success 

1986 

1987 

1988def add_unwrap(metadata, thresh, clip=0, unit=''): 

1989 """Add unwrap information to metadata. 

1990 

1991 If `audiotools.unwrap()` was applied to the data, then this 

1992 function adds the relevant parameters to the metadata. If there is 

1993 an INFO section in the metadata, they are added to this 

1994 section, otherwise they are added to the top level of the metadata 

1995 hierarchy. 

1996 

1997 The threshold `thresh` used for unwrapping is saved under the key 

1998 'UnwrapThreshold' as a string. If `clip` is larger than zero, then 

1999 the clip level is saved under the key 'UnwrapClippedAmplitude' as 

2000 a string. 

2001 

2002 Parameters 

2003 ---------- 

2004 metadata: nested dict 

2005 Metadata to be updated. 

2006 thresh: float 

2007 Threshold used for unwrapping. 

2008 clip: float 

2009 Level at which unwrapped data have been clipped. 

2010 unit: str 

2011 Unit of `thresh` and `clip`. 

2012 

2013 Examples 

2014 -------- 

2015 

2016 ``` 

2017 >>> from audioio import print_metadata, add_unwrap 

2018 >>> md = dict(INFO=dict(Time='early')) 

2019 >>> add_unwrap(md, 0.6, 1.0) 

2020 >>> print_metadata(md) 

2021 INFO: 

2022 Time : early 

2023 UnwrapThreshold : 0.60 

2024 UnwrapClippedAmplitude: 1.00 

2025 ``` 
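
Selecting the INFO section irrespective of the spelling of its key can
be sketched like this. The helper `info_section()` is hypothetical and
assumes that all keys are strings:

```py
def info_section(md):
    """Return the INFO section of `md`, or `md` itself if there is none."""
    for key in md:
        if key.strip().upper() == 'INFO' and isinstance(md[key], dict):
            # index with the key as it is spelled in the metadata:
            return md[key]
    return md

md = dict(Info=dict(Time='early'))
info_section(md)['UnwrapThreshold'] = f'{0.6:.2f}mV'
print(md)
```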

2026 

2027 """ 

2028 if metadata is None: 

2029 return 

2030 md = metadata 

2031 for k in metadata: 

2032 if k.strip().upper() == 'INFO': 

2033 md = metadata[k] # use the key as spelled in the metadata 

2034 break 

2035 md['UnwrapThreshold'] = f'{thresh:.2f}{unit}' 

2036 if clip > 0: 

2037 md['UnwrapClippedAmplitude'] = f'{clip:.2f}{unit}' 

2038 

2039 

2040def demo(file_pathes, list_format, list_metadata, list_cues, list_chunks): 

2041 """Print metadata and markers of audio files. 

2042 

2043 Parameters 

2044 ---------- 

2045 file_pathes: list of str 

2046 Paths of audio files. 

2047 list_format: bool 

2048 If True, list file format only. 

2049 list_metadata: bool 

2050 If True, list metadata only. 

2051 list_cues: bool 

2052 If True, list markers/cues only. 

2053 list_chunks: bool 

2054 If True, list all chunks contained in a RIFF/WAVE file. 

2055 """ 

2056 from .audioloader import AudioLoader 

2057 from .audiomarkers import print_markers 

2058 from .riffmetadata import read_chunk_tags 

2059 for filepath in file_pathes: 

2060 if len(file_pathes) > 1 and (list_cues or list_metadata or 

2061 list_format or list_chunks): 

2062 print(filepath) 

2063 if list_chunks: 

2064 chunks = read_chunk_tags(filepath) 

2065 print(f' {"chunk tag":10s} {"position":10s} {"size":10s}') 

2066 for tag in chunks: 

2067 pos = chunks[tag][0] - 8 

2068 size = chunks[tag][1] + 8 

2069 print(f' {tag:9s} {pos:10d} {size:10d}') 

2070 if len(file_pathes) > 1: 

2071 print() 

2072 continue 

2073 with AudioLoader(filepath, 1, 0, verbose=0) as sf: 

2074 fmt_md = sf.format_dict() 

2075 meta_data = sf.metadata() 

2076 locs, labels = sf.markers() 

2077 if list_cues: 

2078 if len(locs) > 0: 

2079 print_markers(locs, labels) 

2080 elif list_metadata: 

2081 print_metadata(meta_data, replace='.') 

2082 elif list_format: 

2083 print_metadata(fmt_md) 

2084 else: 

2085 print('file:') 

2086 print_metadata(fmt_md, ' ') 

2087 if len(meta_data) > 0: 

2088 print() 

2089 print('metadata:') 

2090 print_metadata(meta_data, ' ', replace='.') 

2091 if len(locs) > 0: 

2092 print() 

2093 print('markers:') 

2094 print_markers(locs, labels) 

2095 if len(file_pathes) > 1: 

2096 print() 

2097 if len(file_pathes) > 1: 

2098 print() 

2099 

2100 

2101def main(*cargs): 

2102 """Call demo with command line arguments. 

2103 

2104 Parameters 

2105 ---------- 

2106 cargs: list of strings 

2107 Command line arguments as provided by sys.argv[1:] 

2108 """ 

2109 # command line arguments: 

2110 parser = argparse.ArgumentParser(add_help=True, 

2111 description='Print metadata and markers of audio files.', 

2112 epilog=f'version {__version__} by Benda-Lab (2020-{__year__})') 

2113 parser.add_argument('--version', action='version', version=__version__) 

2114 parser.add_argument('-f', dest='dataformat', action='store_true', 

2115 help='list file format only') 

2116 parser.add_argument('-m', dest='metadata', action='store_true', 

2117 help='list metadata only') 

2118 parser.add_argument('-c', dest='cues', action='store_true', 

2119 help='list cues/markers only') 

2120 parser.add_argument('-t', dest='chunks', action='store_true', 

2121 help='list tags of all riff/wave chunks contained in the file') 

2122 parser.add_argument('files', type=str, nargs='+', 

2123 help='audio file') 

2124 if len(cargs) == 0: 

2125 cargs = None 

2126 args = parser.parse_args(cargs) 

2127 

2128 # expand wildcard patterns: 

2129 files = [] 

2130 if os.name == 'nt': 

2131 for fn in args.files: 

2132 files.extend(glob.glob(fn)) 

2133 else: 

2134 files = args.files 

2135 

2136 demo(files, args.dataformat, args.metadata, args.cues, args.chunks) 

2137 

2138 

2139if __name__ == "__main__": 

2140 main(*sys.argv[1:])