Coverage for src/audioio/audiometadata.py: 99%

563 statements  

coverage.py v7.10.1, created at 2025-08-02 12:23 +0000

1"""Working with metadata. 

2 

3To interface the various ways metadata are stored in audio files, the 

4`audioio` package uses nested dictionaries. The keys are always 

5strings. Values are strings, integers, floats, datetimes, or other 

6types. Value strings can also be numbers followed by a unit, 

7e.g. "4.2mV". For defining subsections of key-value pairs, values can 

8be dictionaries. The dictionaries can be nested to arbitrary depth. 

9 

10```py 

11>>> from audioio import print_metadata 

12>>> md = dict(Recording=dict(Experimenter='John Doe', 

13 DateTimeOriginal='2023-10-01T14:10:02', 

14 Count=42), 

15 Hardware=dict(Amplifier='Teensy_Amp 4.1', 

16 Highpass='10Hz', 

17 Gain='120mV')) 

18>>> print_metadata(md) 

19``` 

20results in 

21```txt 

22Recording: 

23 Experimenter : John Doe 

24 DateTimeOriginal: 2023-10-01T14:10:02 

25 Count : 42 

26Hardware: 

27 Amplifier: Teensy_Amp 4.1 

28 Highpass : 10Hz 

29 Gain : 120mV 

30``` 

31 

32Often, audio files have very specific ways to store metadata. You can 

33enforce using these by putting them into a dictionary that is added to 

34the metadata with a key having the name of the metadata type you want, 

35e.g. the "INFO", "BEXT", "iXML", and "GUAN" chunks of RIFF/WAVE files. 

36 
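
As a minimal sketch of this convention, the following builds a metadata dictionary in which some key-value pairs are placed under a top-level "INFO" key. The field names and values are illustrative assumptions; which keys a given chunk type accepts is not specified here.

```python
# Generic metadata that is not tied to a specific storage format:
md = dict(Recording=dict(Experimenter='John Doe', Count=42))

# Key-value pairs placed under the top-level key "INFO" are meant
# for the INFO chunk of a RIFF/WAVE file
# (field names and values are illustrative assumptions):
md['INFO'] = dict(Gain='120mV', Software='my-logger v1.0')

# The result is still an ordinary nested dictionary:
for section, items in md.items():
    print(section, items)
```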

37## Functions 

38 

39The `audiometadata` module provides functions for handling and 

40manipulating these nested dictionaries. Many functions take keys as 

41arguments for finding or setting specific key-value pairs. These keys 

42can be the key of a specific item of a (sub-) dictionary, no matter on 

43which level of the metadata hierarchy it is. For example, simply 

44searching for "Highpass" retrieves the corresponding value "10Hz", 

45although "Highpass" is contained in the sub-dictionary (or "section") 

46with key "Hardware". The same item can also be specified together with 

47its parent keys: "Hardware.Highpass". Parent keys (or section keys) 

48are by default separated by '.', but these functions have a `sep` 

49keyword argument that specifies the string separating section names in 

50keys. Key matching is case insensitive. 

51 
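
The lookup rules described above can be illustrated with a small stand-alone function. This is a simplified sketch, not the module's own code: the name `lookup` is hypothetical, and unlike `find_key()` it returns the value directly instead of the containing dictionary and key.

```python
def lookup(md, key, sep='.'):
    # Hypothetical helper: case-insensitive search for `key`,
    # optionally prefixed by section names separated by `sep`.
    keys = [k.strip().upper() for k in key.split(sep)]

    def search(d, ks):
        # try to match the first key part on this level:
        for k in d:
            if k.upper() == ks[0]:
                if len(ks) == 1:
                    return True, d[k]
                if isinstance(d[k], dict):
                    return search(d[k], ks[1:])
        # otherwise descend into all subsections:
        for k in d:
            if isinstance(d[k], dict):
                found, value = search(d[k], ks)
                if found:
                    return True, value
        return False, None

    return search(md, keys)[1]


md = dict(Recording=dict(Count=42),
          Hardware=dict(Amplifier='Teensy_Amp 4.1', Highpass='10Hz'))

print(lookup(md, 'Highpass'))                    # 10Hz
print(lookup(md, 'hardware.highpass'))           # 10Hz
print(lookup(md, 'Hardware/Highpass', sep='/'))  # 10Hz
print(lookup(md, 'Lowpass'))                     # None
```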

52Since different types of metadata name the same item by different 

53keys, these functions also accept lists of alternative keys as 

54arguments. 

55 

56Do not forget that you can easily manipulate the metadata by means of 

57the standard functions of dictionaries. 

58 

59If you need to make a copy of the metadata use `deepcopy`: 

60``` 

61from copy import deepcopy 

62md_orig = deepcopy(md) 

63``` 

64 
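
The reason for preferring `deepcopy` over a plain `dict()` copy is that nested sections would otherwise be shared between the copy and the original. A short sketch with made-up values:

```python
from copy import deepcopy

md = dict(Hardware=dict(Gain='120mV'))
shallow = dict(md)      # copies only the top-level dictionary
deep = deepcopy(md)     # copies all nested sections as well

md['Hardware']['Gain'] = '60mV'
print(shallow['Hardware']['Gain'])  # 60mV: section is shared with md
print(deep['Hardware']['Gain'])     # 120mV: independent copy
```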

65### Output 

66 

67Write nested dictionaries as texts: 

68 

69- `write_metadata_text()`: write metadata into a text/yaml file. 

70- `print_metadata()`: write metadata to standard output. 

71 

72### Flatten 

73 

74Conversion between nested and flat dictionaries: 

75 

76- `flatten_metadata()`: flatten hierarchical metadata to a single dictionary. 

77- `unflatten_metadata()`: unflatten a previously flattened metadata dictionary. 

78 

79### Parse numbers with units 

80 

81- `parse_number()`: parse string with number and unit. 

82- `change_unit()`: scale numerical value to a new unit. 

83 
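
The idea behind scaling a value to a new unit can be sketched as a lookup of prefix factors. The helper `scale` and its reduced prefix table are hypothetical simplifications for illustration: only one-letter prefixes are handled, and special units such as "min" or "%" are ignored.

```python
# Factors of a few SI prefixes (a small, illustrative subset):
prefixes = {'k': 1e3, 'c': 1e-2, 'm': 1e-3, 'u': 1e-6}

def scale(value, old_unit, new_unit):
    # Hypothetical helper: rescale between prefixed versions of a unit.
    # A prefix is only recognized if a base unit follows it,
    # so that 'm' alone is read as meter and not as milli.
    f_old = prefixes.get(old_unit[0], 1.0) if len(old_unit) > 1 else 1.0
    f_new = prefixes.get(new_unit[0], 1.0) if len(new_unit) > 1 else 1.0
    return value*f_old/f_new

print(scale(42.3, 'ms', 's'))   # milliseconds to seconds
print(scale(4, 'kg', 'g'))      # kilograms to grams
```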

84### Find and get values 

85 

86Find keys and get their values parsed and converted to various types: 

87 

88- `find_key()`: find dictionary in metadata hierarchy containing the specified key. 

89- `get_number_unit()`: find a key in metadata and return its number and unit. 

90- `get_number()`: find a key in metadata and return its value in a given unit. 

91- `get_int()`: find a key in metadata and return its integer value. 

92- `get_bool()`: find a key in metadata and return its boolean value. 

93- `get_datetime()`: find keys in metadata and return a datetime. 

94- `get_str()`: find a key in metadata and return its string value. 

95 

96### Organize metadata 

97 

98Add and remove metadata: 

99 

100- `strlist_to_dict()`: convert list of key-value-pair strings to dictionary. 

101- `add_sections()`: add sections to metadata dictionary. 

102- `set_metadata()`: set values of existing metadata. 

103- `add_metadata()`: add or modify key-value pairs. 

104- `move_metadata()`: remove a key from metadata and add it to a dictionary. 

105- `remove_metadata()`: remove key-value pairs or sections from metadata. 

106- `cleanup_metadata()`: remove empty sections from metadata. 

107 

108### Special metadata fields 

109 

110Retrieve and set specific metadata: 

111 

112- `get_gain()`: get gain and unit from metadata. 

113- `update_gain()`: update gain setting in metadata. 

114- `set_starttime()`: set all start-of-recording times in metadata. 

115- `update_starttime()`: update start-of-recording times in metadata. 

116- `bext_history_str()`: assemble a string for the BEXT CodingHistory field. 

117- `add_history()`: add a string describing coding history to metadata. 

118- `add_unwrap()`: add unwrap information to metadata. 

119 

120Lists of standard keys: 

121 

122- `default_starttime_keys`: keys of times of start of the recording. 

123- `default_timeref_keys`: keys of integer time references. 

124- `default_gain_keys`: keys of gain settings. 

125- `default_history_keys`: keys of strings describing coding history. 

126 

127 

128## Command line script 

129 

130The module can be run as a script from the command line to display the 

131metadata and markers contained in an audio file: 

132 

133```sh 

134> audiometadata logger.wav 

135``` 

136prints 

137```text 

138file: 

139 filepath : logger.wav 

140 samplingrate: 96000Hz 

141 channels : 16 

142 frames : 17280000 

143 duration : 180.000s 

144 

145metadata: 

146 INFO: 

147 Bits : 32 

148 Pins : 1-CH2R,1-CH2L,1-CH3R,1-CH3L,2-CH2R,2-CH2L,2-CH3R,2-CH3L,3-CH2R,3-CH2L,3-CH3R,3-CH3L,4-CH2R,4-CH2L,4-CH3R,4-CH3L 

149 Gain : 165.00mV 

150 uCBoard : Teensy 4.1 

151 MACAdress : 04:e9:e5:15:3e:95 

152 DateTimeOriginal: 2023-10-01T14:10:02 

153 Software : TeeGrid R4-senors-logger v1.0 

154``` 

155 

156 

157Alternatively, the script can be run from within the audioio source tree as: 

158``` 

159python -m src.audioio.audiometadata audiofile.wav 

160``` 

161 

162Running 

163```sh 

164audiometadata --help 

165``` 

166prints 

167```text 

168usage: audiometadata [-h] [--version] [-f] [-m] [-c] [-t] files [files ...] 

169 

170Convert audio file formats. 

171 

172positional arguments: 

173 files audio file 

174 

175options: 

176 -h, --help show this help message and exit 

177 --version show program's version number and exit 

178 -f list file format only 

179 -m list metadata only 

180 -c list cues/markers only 

181 -t list tags of all riff/wave chunks contained in the file 

182 

183version 2.0.0 by Benda-Lab (2020-2024) 

184``` 

185 

186""" 

187 

188import os 

189import sys 

190import glob 

191import argparse 

192import numpy as np 

193import datetime as dt 

194from .version import __version__, __year__ 

195 

196 

197def write_metadata_text(fh, meta, prefix='', indent=4, replace=None): 

198 """Write meta data into a text/yaml file or stream. 

199 

200 With the default parameters, the output is a valid yaml file. 

201 

202 Parameters 

203 ---------- 

204 fh: filename or stream 

205 If not a stream, the file with name `fh` is opened. 

206 Otherwise `fh` is used as a stream for writing. 

207 meta: nested dict 

208 Key-value pairs of metadata to be written into the file. 

209 prefix: str 

210 This string is written at the beginning of each line. 

211 indent: int 

212 Number of characters used for indentation of sections. 

213 replace: char or None 

214 If specified, replace special characters by this character. 

215 

216 Examples 

217 -------- 

218 ``` 

219 from audioio import write_metadata_text 

220 md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5))) 

221 write_metadata_text('info.txt', md) 

222 ``` 

223 """ 

224 

225 def write_dict(df, md, level, smap): 

226 w = 0 

227 for k in md: 

228 if not isinstance(md[k], dict) and w < len(k): 

229 w = len(k) 

230 for k in md: 

231 clevel = level*indent 

232 if isinstance(md[k], dict): 

233 df.write(f'{prefix}{"":>{clevel}}{k}:\n') 

234 write_dict(df, md[k], level+1, smap) 

235 else: 

236 value = md[k] 

237 if isinstance(value, (list, tuple)): 

238 value = ', '.join([f'{v}' for v in value]) 

239 else: 

240 value = f'{value}' 

241 value = value.replace('\r\n', r'\n') 

242 value = value.replace('\n', r'\n') 

243 if len(smap) > 0: 

244 value = value.translate(smap) 

245 df.write(f'{prefix}{"":>{clevel}}{k:<{w}}: {value}\n') 

246 

247 if not meta: 

248 return 

249 if hasattr(fh, 'write'): 

250 own_file = False 

251 else: 

252 own_file = True 

253 fh = open(fh, 'w') 

254 smap = {} 

255 if replace: 

256 smap = str.maketrans('\r\n\t\x00', ''.join([replace]*4)) 

257 write_dict(fh, meta, 0, smap) 

258 if own_file: 

259 fh.close() 

260 

261 

262def print_metadata(meta, prefix='', indent=4, replace=None): 

263 """Write meta data to standard output. 

264 

265 Parameters 

266 ---------- 

267 meta: nested dict 

268 Key-value pairs of metadata to be written to standard output. 

269 prefix: str 

270 This string is written at the beginning of each line. 

271 indent: int 

272 Number of characters used for indentation of sections. 

273 replace: char or None 

274 If specified, replace special characters by this character. 

275 

276 Examples 

277 -------- 

278 ``` 

279 >>> from audioio import print_metadata 

280 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6)) 

281 >>> print_metadata(md) 

282 aaaa: 2 

283 bbbb: 

284 ccc: 3 

285 ddd: 4 

286 eee: 

287 hh: 5 

288 iiii: 

289 jjj: 6 

290 ``` 

291 """ 

292 write_metadata_text(sys.stdout, meta, prefix, indent, replace) 

293 

294 

295def flatten_metadata(md, keep_sections=False, sep='.'): 

296 """Flatten hierarchical metadata to a single dictionary. 

297 

298 Parameters 

299 ---------- 

300 md: nested dict 

301 Metadata as returned by `metadata()`. 

302 keep_sections: bool 

303 If `True`, then prefix keys with section names, separated by `sep`. 

304 sep: str 

305 String for separating section names. 

306 

307 Returns 

308 ------- 

309 d: dict 

310 Non-nested dict containing all key-value pairs of `md`. 

311 

312 Examples 

313 -------- 

314 ``` 

315 >>> from audioio import print_metadata, flatten_metadata 

316 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6)) 

317 >>> print_metadata(md) 

318 aaaa: 2 

319 bbbb: 

320 ccc: 3 

321 ddd: 4 

322 eee: 

323 hh: 5 

324 iiii: 

325 jjj: 6 

326  

327 >>> fmd = flatten_metadata(md, keep_sections=True) 

328 >>> print_metadata(fmd) 

329 aaaa : 2 

330 bbbb.ccc : 3 

331 bbbb.ddd : 4 

332 bbbb.eee.hh: 5 

333 iiii.jjj : 6 

334 ``` 

335 """ 

336 def flatten(cd, section): 

337 df = {} 

338 for k in cd: 

339 if isinstance(cd[k], dict): 

340 df.update(flatten(cd[k], section + k + sep)) 

341 else: 

342 if keep_sections: 

343 df[section+k] = cd[k] 

344 else: 

345 df[k] = cd[k] 

346 return df 

347 

348 return flatten(md, '') 

349 

350 

351def unflatten_metadata(md, sep='.'): 

352 """Unflatten a previously flattened metadata dictionary. 

353 

354 Parameters 

355 ---------- 

356 md: dict 

357 Flat dictionary with key-value pairs as obtained from 

358 `flatten_metadata()` with `keep_sections=True`. 

359 sep: str 

360 String that separates section names. 

361 

362 Returns 

363 ------- 

364 d: nested dict 

365 Hierarchical dictionary with sub-dictionaries and key-value pairs. 

366 

367 Examples 

368 -------- 

369 ``` 

370 >>> from audioio import print_metadata, unflatten_metadata 

371 >>> fmd = {'aaaa': 2, 'bbbb.ccc': 3, 'bbbb.ddd': 4, 'bbbb.eee.hh': 5, 'iiii.jjj': 6} 

372 >>> print_metadata(fmd) 

373 aaaa : 2 

374 bbbb.ccc : 3 

375 bbbb.ddd : 4 

376 bbbb.eee.hh: 5 

377 iiii.jjj : 6 

378  

379 >>> md = unflatten_metadata(fmd) 

380 >>> print_metadata(md) 

381 aaaa: 2 

382 bbbb: 

383 ccc: 3 

384 ddd: 4 

385 eee: 

386 hh: 5 

387 iiii: 

388 jjj: 6 

389 ``` 

390 """ 

391 umd = {} # unflattened metadata 

392 cmd = [umd] # current metadata dicts for each level of the hierarchy 

393 csk = [] # current section keys 

394 for k in md: 

395 ks = k.split(sep) 

396 # go up the hierarchy: 

397 for i in range(len(csk) - len(ks)): 

398 csk.pop() 

399 cmd.pop() 

400 for kss in reversed(ks[:len(csk)]): 

401 if kss == csk[-1]: 

402 break 

403 csk.pop() 

404 cmd.pop() 

405 # add new sections: 

406 for kss in ks[len(csk):-1]: 

407 csk.append(kss) 

408 cmd[-1][kss] = {} 

409 cmd.append(cmd[-1][kss]) 

410 # add key-value pair: 

411 cmd[-1][ks[-1]] = md[k] 

412 return umd 

413 

414 

415def parse_number(s): 

416 """Parse string with number and unit. 

417 

418 Parameters 

419 ---------- 

420 s: str, float, or int 

421 String to be parsed. The initial part of the string is 

422 expected to be a number, the part following the number is 

423 interpreted as the unit. If float or int, then return this 

424 as the value with empty unit. 

425 

426 Returns 

427 ------- 

428 v: None, int, or float 

429 Value of the string as float. Without decimal point, an int is returned. 

430 If the string does not contain a number, None is returned. 

431 u: str 

432 Unit that follows the initial number. 

433 n: int 

434 Number of digits behind the decimal point. 

435 

436 Examples 

437 -------- 

438 

439 ``` 

440 >>> from audioio import parse_number 

441 

442 # integer: 

443 >>> parse_number('42') 

444 (42, '', 0) 

445 

446 # integer with unit: 

447 >>> parse_number('42ms') 

448 (42, 'ms', 0) 

449 

450 # float with unit: 

451 >>> parse_number('42.ms') 

452 (42.0, 'ms', 0) 

453 

454 # float with unit: 

455 >>> parse_number('42.3ms') 

456 (42.3, 'ms', 1) 

457 

458 # float with space and unit: 

459 >>> parse_number('423.17 Hz') 

460 (423.17, 'Hz', 2) 

461 ``` 

462 

463 """ 

464 if not isinstance(s, str): 

465 if isinstance(s, int): 

466 return s, '', 0 

467 if isinstance(s, float): 

468 return s, '', 5 

469 else: 

470 return None, '', 0 

471 n = len(s) 

472 ip = n 

473 have_point = False 

474 for i in range(len(s)): 

475 if s[i] == '.': 

476 if have_point: 

477 n = i 

478 break 

479 have_point = True 

480 ip = i + 1 

481 if s[i] not in '0123456789.+-': 

482 n = i 

483 break 

484 if n == 0: 

485 return None, s, 0 

486 v = float(s[:n]) if have_point else int(s[:n]) 

487 u = s[n:].strip() 

488 nd = n - ip if n >= ip else 0 

489 return v, u, nd 

490 

491 

492unit_prefixes = {'Deka': 1e1, 'deka': 1e1, 'Hekto': 1e2, 'hekto': 1e2, 

493 'kilo': 1e3, 'Kilo': 1e3, 'Mega': 1e6, 'mega': 1e6, 

494 'Giga': 1e9, 'giga': 1e9, 'Tera': 1e12, 'tera': 1e12, 

495 'Peta': 1e15, 'peta': 1e15, 'Exa': 1e18, 'exa': 1e18, 

496 'Dezi': 1e-1, 'dezi': 1e-1, 'Zenti': 1e-2, 'centi': 1e-2, 

497 'Milli': 1e-3, 'milli': 1e-3, 'Micro': 1e-6, 'micro': 1e-6, 

498 'Nano': 1e-9, 'nano': 1e-9, 'Piko': 1e-12, 'piko': 1e-12, 

499 'Femto': 1e-15, 'femto': 1e-15, 'Atto': 1e-18, 'atto': 1e-18, 

500 'da': 1e1, 'h': 1e2, 'K': 1e3, 'k': 1e3, 'M': 1e6, 

501 'G': 1e9, 'T': 1e12, 'P': 1e15, 'E': 1e18, 

502 'd': 1e-1, 'c': 1e-2, 'mu': 1e-6, 'u': 1e-6, 'm': 1e-3, 

503 'n': 1e-9, 'p': 1e-12, 'f': 1e-15, 'a': 1e-18} 

504""" SI prefixes for units with corresponding factors. """ 

505 

506 

507def change_unit(val, old_unit, new_unit): 

508 """Scale numerical value to a new unit. 

509 

510 Adapted from https://github.com/relacs/relacs/blob/1facade622a80e9f51dbf8e6f8171ac74c27f100/options/src/parameter.cc#L1647-L1703 

511 

512 Parameters 

513 ---------- 

514 val: float 

515 Value given in `old_unit`. 

516 old_unit: str 

517 Unit of `val`. 

518 new_unit: str 

519 Requested unit of return value. 

520 

521 Returns 

522 ------- 

523 new_val: float 

524 The input value `val` scaled to `new_unit`. 

525 

526 Examples 

527 -------- 

528 

529 ``` 

530 >>> from audioio import change_unit 

531 >>> change_unit(5, 'mm', 'cm') 

532 0.5 

533 

534 >>> change_unit(5, '', 'cm') 

535 5.0 

536 

537 >>> change_unit(5, 'mm', '') 

538 5.0 

539 

540 >>> change_unit(5, 'cm', 'mm') 

541 50.0 

542 

543 >>> change_unit(4, 'kg', 'g') 

544 4000.0 

545 

546 >>> change_unit(12, '%', '') 

547 0.12 

548 

549 >>> change_unit(1.24, '', '%') 

550 124.0 

551 

552 >>> change_unit(2.5, 'min', 's') 

553 150.0 

554 

555 >>> change_unit(3600, 's', 'h') 

556 1.0 

557 

558 ``` 

559 

560 """ 

561 # missing unit? 

562 if not old_unit and not new_unit: 

563 return val 

564 if not old_unit and new_unit != '%': 

565 return val 

566 if not new_unit and old_unit != '%': 

567 return val 

568 

569 # special units that directly translate into factors: 

570 unit_factors = {'%': 0.01, 'hour': 60.0*60.0, 'h': 60.0*60.0, 'min': 60.0} 

571 

572 # parse old unit: 

573 f1 = 1.0 

574 if old_unit in unit_factors: 

575 f1 = unit_factors[old_unit] 

576 else: 

577 for k in unit_prefixes: 

578 if len(old_unit) > len(k) and old_unit[:len(k)] == k: 

579 f1 = unit_prefixes[k] 

580 

581 # parse new unit: 

582 f2 = 1.0 

583 if new_unit in unit_factors: 

584 f2 = unit_factors[new_unit] 

585 else: 

586 for k in unit_prefixes: 

587 if len(new_unit) > len(k) and new_unit[:len(k)] == k: 

588 f2 = unit_prefixes[k] 

589 

590 return val*f1/f2 

591 

592 

593def find_key(metadata, key, sep='.'): 

594 """Find dictionary in metadata hierarchy containing the specified key. 

595 

596 Parameters 

597 ---------- 

598 metadata: nested dict 

599 Metadata. 

600 key: str 

601 Key to be searched for (case insensitive). 

602 May contain section names separated by `sep`, i.e. 

603 "aaa.bbb.ccc" searches "ccc" (can be key-value pair or section) 

604 in section "bbb" that needs to be a subsection of section "aaa". 

605 sep: str 

606 String that separates section names in `key`. 

607 

608 Returns 

609 ------- 

610 md: dict 

611 The innermost dictionary matching some sections of the search key. 

612 If `key` is not at all contained in the metadata, 

613 the top-level dictionary is returned. 

614 key: str 

615 The part of the search key that was not found in `md`, or the 

616 final part of the search key, found in `md`. 

617 

618 Examples 

619 -------- 

620 

621 Independent of whether found or not found, you can assign to the 

622 returned dictionary with the returned key. 

623 

624 ``` 

625 >>> from audioio import print_metadata, find_key 

626 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(ff=5)), gggg=dict(hhh=6)) 

627 >>> print_metadata(md) 

628 aaaa: 2 

629 bbbb: 

630 ccc: 3 

631 ddd: 4 

632 eee: 

633 ff: 5 

634 gggg: 

635 hhh: 6 

636 

637 >>> m, k = find_key(md, 'bbbb.ddd') 

638 >>> m[k] = 10 

639 >>> print_metadata(md) 

640 aaaa: 2 

641 bbbb: 

642 ccc: 3 

643 ddd: 10 

644 ... 

645 

646 >>> m, k = find_key(md, 'hhh') 

647 >>> m[k] = 12 

648 >>> print_metadata(md) 

649 ... 

650 gggg: 

651 hhh: 12 

652 

653 >>> m, k = find_key(md, 'bbbb.eee.xx') 

654 >>> m[k] = 42 

655 >>> print_metadata(md) 

656 ... 

657 eee: 

658 ff: 5 

659 xx: 42 

660 ... 

661 ``` 

662 

663 When searching for sections, the one containing the searched section 

664 is returned: 

665 ```py 

666 >>> m, k = find_key(md, 'eee') 

667 >>> m[k]['yy'] = 46 

668 >>> print_metadata(md) 

669 ... 

670 eee: 

671 ff: 5 

672 xx: 42 

673 yy: 46 

674 ... 

675 ``` 

676 

677 """ 

678 def find_keys(metadata, keys): 

679 key = keys[0].strip().upper() 

680 for k in metadata: 

681 if k.upper() == key: 

682 if len(keys) == 1: 

683 # found key: 

684 return True, metadata, k 

685 elif isinstance(metadata[k], dict): 

686 # keep searching within the next section: 

687 return find_keys(metadata[k], keys[1:]) 

688 # search in subsections: 

689 for k in metadata: 

690 if isinstance(metadata[k], dict): 

691 found, mm, kk = find_keys(metadata[k], keys) 

692 if found: 

693 return True, mm, kk 

694 # nothing found: 

695 return False, metadata, sep.join(keys) 

696 

697 if metadata is None: 

698 return {}, None 

699 ks = key.strip().split(sep) 

700 found, mm, kk = find_keys(metadata, ks) 

701 return mm, kk 

702 

703 

704def get_number_unit(metadata, keys, sep='.', default=None, 

705 default_unit='', remove=False): 

706 """Find a key in metadata and return its number and unit. 

707 

708 Parameters 

709 ---------- 

710 metadata: nested dict 

711 Metadata. 

712 keys: str or list of str 

713 Keys in the metadata to be searched for (case insensitive). 

714 Value of the first key found is returned. 

715 May contain section names separated by `sep`.  

716 See `audiometadata.find_key()` for details. 

717 sep: str 

718 String that separates section names in `key`. 

719 default: None, int, or float 

720 Returned value if `key` is not found or the value does 

721 not contain a number. 

722 default_unit: str 

723 Returned unit if `key` is not found or the key's value does 

724 not have a unit. 

725 remove: bool 

726 If `True`, remove the found key from `metadata`. 

727 

728 Returns 

729 ------- 

730 v: None, int, or float 

731 Value referenced by `key` as float. 

732 Without decimal point, an int is returned. 

733 If none of the `keys` was found or 

734 the key's value does not contain a number, 

735 then `default` is returned. 

736 u: str 

737 Corresponding unit. 

738 

739 Examples 

740 -------- 

741 

742 ``` 

743 >>> from audioio import get_number_unit 

744 >>> md = dict(aaaa='42', bbbb='42.3ms') 

745 

746 # integer: 

747 >>> get_number_unit(md, 'aaaa') 

748 (42, '') 

749 

750 # float with unit: 

751 >>> get_number_unit(md, 'bbbb') 

752 (42.3, 'ms') 

753 

754 # two keys: 

755 >>> get_number_unit(md, ['cccc', 'bbbb']) 

756 (42.3, 'ms') 

757 

758 # not found: 

759 >>> get_number_unit(md, 'cccc') 

760 (None, '') 

761 

762 # not found with default value: 

763 >>> get_number_unit(md, 'cccc', default=1.0, default_unit='a.u.') 

764 (1.0, 'a.u.') 

765 ``` 

766 

767 """ 

768 if not metadata: 

769 return default, default_unit 

770 if not isinstance(keys, (list, tuple, np.ndarray)): 

771 keys = (keys,) 

772 value = default 

773 unit = default_unit 

774 for key in keys: 

775 m, k = find_key(metadata, key, sep) 

776 if k in m: 

777 v, u, _ = parse_number(m[k]) 

778 if v is not None: 

779 if not u: 

780 u = default_unit 

781 if remove: 

782 del m[k] 

783 return v, u 

784 elif u and unit == default_unit: 

785 unit = u 

786 return value, unit 

787 

788 

789def get_number(metadata, unit, keys, sep='.', default=None, remove=False): 

790 """Find a key in metadata and return its value in a given unit. 

791 

792 Parameters 

793 ---------- 

794 metadata: nested dict 

795 Metadata. 

796 unit: str 

797 Unit in which to return numerical value referenced by one of the `keys`. 

798 keys: str or list of str 

799 Keys in the metadata to be searched for (case insensitive). 

800 Value of the first key found is returned. 

801 May contain section names separated by `sep`.  

802 See `audiometadata.find_key()` for details. 

803 sep: str 

804 String that separates section names in `key`. 

805 default: None, int, or float 

806 Returned value if `key` is not found or the value does 

807 not contain a number. 

808 remove: bool 

809 If `True`, remove the found key from `metadata`. 

810 

811 Returns 

812 ------- 

813 v: None or float 

814 Value referenced by `key` as float scaled to `unit`. 

815 If none of the `keys` was found or 

816 the key's value does not contain a number, 

817 then `default` is returned. 

818 

819 Examples 

820 -------- 

821 

822 ``` 

823 >>> from audioio import get_number 

824 >>> md = dict(aaaa='42', bbbb='42.3ms') 

825 

826 # milliseconds to seconds: 

827 >>> get_number(md, 's', 'bbbb') 

828 0.0423 

829 

830 # milliseconds to microseconds: 

831 >>> get_number(md, 'us', 'bbbb') 

832 42300.0 

833 

834 # value without unit is not scaled: 

835 >>> get_number(md, 'Hz', 'aaaa') 

836 42 

837 

838 # two keys: 

839 >>> get_number(md, 's', ['cccc', 'bbbb']) 

840 0.0423 

841 

842 # not found: 

843 >>> get_number(md, 's', 'cccc') 

844 None 

845 

846 # not found with default value: 

847 >>> get_number(md, 's', 'cccc', default=1.0) 

848 1.0 

849 ``` 

850 

851 """ 

852 v, u = get_number_unit(metadata, keys, sep, None, unit, remove) 

853 if v is None: 

854 return default 

855 else: 

856 return change_unit(v, u, unit) 

857 

858 

859def get_int(metadata, keys, sep='.', default=None, remove=False): 

860 """Find a key in metadata and return its integer value. 

861 

862 Parameters 

863 ---------- 

864 metadata: nested dict 

865 Metadata. 

866 keys: str or list of str 

867 Keys in the metadata to be searched for (case insensitive). 

868 Value of the first key found is returned. 

869 May contain section names separated by `sep`.  

870 See `audiometadata.find_key()` for details. 

871 sep: str 

872 String that separates section names in `key`. 

873 default: None or int 

874 Return value if `key` is not found or the value does 

875 not contain an integer. 

876 remove: bool 

877 If `True`, remove the found key from `metadata`. 

878 

879 Returns 

880 ------- 

881 v: None or int 

882 Value referenced by `key` as integer. 

883 If none of the `keys` was found, 

884 the key's value does not contain a number or represents 

885 a floating point value, then `default` is returned. 

886 

887 Examples 

888 -------- 

889 

890 ``` 

891 >>> from audioio import get_int 

892 >>> md = dict(aaaa='42', bbbb='42.3ms') 

893 

894 # integer: 

895 >>> get_int(md, 'aaaa') 

896 42 

897 

898 # two keys: 

899 >>> get_int(md, ['cccc', 'aaaa']) 

900 42 

901 

902 # float: 

903 >>> get_int(md, 'bbbb') 

904 None 

905 

906 # not found: 

907 >>> get_int(md, 'cccc') 

908 None 

909 

910 # not found with default value: 

911 >>> get_int(md, 'cccc', default=0) 

912 0 

913 ``` 

914 

915 """ 

916 if not metadata: 

917 return default 

918 if not isinstance(keys, (list, tuple, np.ndarray)): 

919 keys = (keys,) 

920 for key in keys: 

921 m, k = find_key(metadata, key, sep) 

922 if k in m: 

923 v, _, n = parse_number(m[k]) 

924 if v is not None and n == 0: 

925 if remove: 

926 del m[k] 

927 return int(v) 

928 return default 

929 

930 

931def get_bool(metadata, keys, sep='.', default=None, remove=False): 

932 """Find a key in metadata and return its boolean value. 

933 

934 Parameters 

935 ---------- 

936 metadata: nested dict 

937 Metadata. 

938 keys: str or list of str 

939 Keys in the metadata to be searched for (case insensitive). 

940 Value of the first key found is returned. 

941 May contain section names separated by `sep`.  

942 See `audiometadata.find_key()` for details. 

943 sep: str 

944 String that separates section names in `key`. 

945 default: None or bool 

946 Return value if `key` is not found or the value does 

947 not specify a boolean value. 

948 remove: bool 

949 If `True`, remove the found key from `metadata`. 

950 

951 Returns 

952 ------- 

953 v: None or bool 

954 Value referenced by `key` as boolean. 

955 True if 'true', 't', 'yes', or 'y' (case insensitive) or any non-zero number. 

956 False if 'false', 'f', 'no', or 'n' (case insensitive) or any number equal to zero. 

957 If none of the `keys` was found or 

958 the key's value does not specify a boolean value, 

959 then `default` is returned. 

960 

961 Examples 

962 -------- 

963 

964 ``` 

965 >>> from audioio import get_bool 

966 >>> md = dict(aaaa='TruE', bbbb='No', cccc=0, dddd=1, eeee=True, ffff='ui') 

967 

968 # case insensitive: 

969 >>> get_bool(md, 'aaaa') 

970 True 

971 

972 >>> get_bool(md, 'bbbb') 

973 False 

974 

975 >>> get_bool(md, 'cccc') 

976 False 

977 

978 >>> get_bool(md, 'dddd') 

979 True 

980 

981 >>> get_bool(md, 'eeee') 

982 True 

983 

984 # value does not specify a boolean: 

985 >>> get_bool(md, 'ffff') 

986 None 

987 

988 # two keys (string is preferred over number): 

989 >>> get_bool(md, ['cccc', 'aaaa']) 

990 True 

991 

992 # two keys (take first match): 

993 >>> get_bool(md, ['cccc', 'ffff']) 

994 False 

995 

996 # not found with default value: 

997 >>> get_bool(md, 'ffff', default=False) 

998 False 

999 ``` 

1000 

1001 """ 

1002 if not metadata: 

1003 return default 

1004 if not isinstance(keys, (list, tuple, np.ndarray)): 

1005 keys = (keys,) 

1006 val = default 

1007 mv = None 

1008 kv = None 

1009 for key in keys: 

1010 m, k = find_key(metadata, key, sep) 

1011 if k in m and not isinstance(m[k], dict): 

1012 vs = m[k] 

1013 v, _, _ = parse_number(vs) 

1014 if v is not None: 

1015 val = abs(v) > 1e-8 

1016 mv = m 

1017 kv = k 

1018 elif isinstance(vs, str): 

1019 if vs.upper() in ['TRUE', 'T', 'YES', 'Y']: 

1020 if remove: 

1021 del m[k] 

1022 return True 

1023 if vs.upper() in ['FALSE', 'F', 'NO', 'N']: 

1024 if remove: 

1025 del m[k] 

1026 return False 

1027 if mv is not None and kv is not None and remove: 

1028 del mv[kv] 

1029 return val 

1030 

1031 

1032default_starttime_keys = [['DateTimeOriginal'], 

1033 ['OriginationDate', 'OriginationTime'], 

1034 ['Location_Time'], 

1035 ['Timestamp']] 

1036"""Default keys of times of start of the recording in metadata. 

1037Used by `get_datetime()` and `update_starttime()` functions. 

1038""" 

1039 

1040def get_datetime(metadata, keys=default_starttime_keys, 

1041 sep='.', default=None, remove=False): 

1042 """Find keys in metadata and return a datetime. 

1043 

1044 Parameters 

1045 ---------- 

1046 metadata: nested dict 

1047 Metadata. 

1048 keys: tuple of str or list of tuple of str 

1049 Datetimes can be stored in metadata as two separate key-value pairs, 

1050 one for the date and one for the time. Or by a single key-value pair 

1051 for a date-time value. This is why the keys need to be specified in 

1052 tuples with one or two keys. 

1053 The value of the first tuple of keys found is returned. 

1054 Keys may contain section names separated by `sep`.  

1055 See `audiometadata.find_key()` for details. 

1056 The default values for the `keys` find the start time of a recording. 

1057 You can modify the default keys via the `default_starttime_keys` list 

1058 of the `audiometadata` module. 

1059 sep: str 

1060 String that separates section names in `key`. 

1061 default: None or datetime 

1062 Return value if none of the `keys` is found or the found 

1063 values are neither date/time objects nor strings. 

1064 remove: bool 

1065 If `True`, remove the found key from `metadata`. 

1066 

1067 Returns 

1068 ------- 

1069 v: None or datetime 

1070 Datetime referenced by `keys`. 

1071 If none of the `keys` was found, then `default` is returned. 

1072 

1073 Examples 

1074 -------- 

1075 

1076 ``` 

1077 >>> from audioio import get_datetime 

1078 >>> import datetime as dt 

1079 >>> md = dict(date='2024-03-02', time='10:42:24', 

1080 datetime='2023-04-15T22:10:00') 

1081 

1082 # separate date and time: 

1083 >>> get_datetime(md, ('date', 'time')) 

1084 datetime.datetime(2024, 3, 2, 10, 42, 24) 

1085 

1086 # single datetime: 

1087 >>> get_datetime(md, ('datetime',)) 

1088 datetime.datetime(2023, 4, 15, 22, 10) 

1089 

1090 # two alternative key tuples: 

1091 >>> get_datetime(md, [('aaaa',), ('date', 'time')]) 

1092 datetime.datetime(2024, 3, 2, 10, 42, 24) 

1093 

1094 # not found: 

1095 >>> get_datetime(md, ('cccc',)) 

1096 None 

1097 

1098 # not found with default value: 

1099 >>> get_datetime(md, ('cccc', 'dddd'), 

1100 default=dt.datetime(2022, 2, 22, 22, 2, 12)) 

1101 datetime.datetime(2022, 2, 22, 22, 2, 12) 

1102 ``` 

1103 
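
How a separate date and time are merged into a single datetime can be sketched with the standard `datetime` module alone. The values are the ones from the example above; the variable names are only illustrative.

```python
import datetime as dt

# date and time stored as two separate ISO-formatted strings:
date = dt.date.fromisoformat('2024-03-02')
time = dt.time.fromisoformat('10:42:24')
combined = dt.datetime.combine(date, time)
print(combined)   # 2024-03-02 10:42:24

# a single ISO-formatted date-time string:
single = dt.datetime.fromisoformat('2023-04-15T22:10:00')
print(single)     # 2023-04-15 22:10:00
```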

1104 """ 

1105 if not metadata: 

1106 return default 

1107 if len(keys) > 0 and isinstance(keys[0], str): 

1108 keys = (keys,) 

1109 for keyp in keys: 

1110 if len(keyp) == 1: 

1111 m, k = find_key(metadata, keyp[0], sep) 

1112 if k in m: 

1113 v = m[k] 

1114 if isinstance(v, dt.datetime): 

1115 if remove: 

1116 del m[k] 

1117 return v 

1118 elif isinstance(v, str): 

1119 if remove: 

1120 del m[k] 

1121 return dt.datetime.fromisoformat(v) 

1122 else: 

1123 md, kd = find_key(metadata, keyp[0], sep) 

1124 if not kd in md: 

1125 continue 

1126 if isinstance(md[kd], dt.date): 

1127 date = md[kd] 

1128 elif isinstance(md[kd], str): 

1129 date = dt.date.fromisoformat(md[kd]) 

1130 else: 

1131 continue 

1132 mt, kt = find_key(metadata, keyp[1], sep) 

1133 if not kt in mt: 

1134 continue 

1135 if isinstance(mt[kt], dt.time): 

1136 time = mt[kt] 

1137 elif isinstance(mt[kt], str): 

1138 time = dt.time.fromisoformat(mt[kt]) 

1139 else: 

1140 continue 

1141 if remove: 

1142 del md[kd] 

1143 del mt[kt] 

1144 return dt.datetime.combine(date, time) 

1145 return default 

1146 

1147 

1148def get_str(metadata, keys, sep='.', default=None, remove=False): 

1149 """Find a key in metadata and return its string value. 

1150 

1151 Parameters 

1152 ---------- 

1153 metadata: nested dict 

1154 Metadata. 

1155 keys: str or list of str 

1156 Keys in the metadata to be searched for (case insensitive). 

1157 Value of the first key found is returned. 

1158 May contain section names separated by `sep`.  

1159 See `audiometadata.find_key()` for details. 

1160 sep: str 

1161 String that separates section names in `keys`.

1162 default: None or str

1163 Return value if none of the `keys` is found or the value

1164 found is a section (dictionary).

1165 remove: bool 

1166 If `True`, remove the found key from `metadata`. 

1167 

1168 Returns 

1169 ------- 

1170 v: None or str 

1171 String value referenced by `keys`.

1172 If none of the `keys` was found, then `default` is returned. 

1173 

1174 Examples 

1175 -------- 

1176 

1177 ``` 

1178 >>> from audioio import get_str 

1179 >>> md = dict(aaaa=42, bbbb='hello') 

1180 

1181 # string: 

1182 >>> get_str(md, 'bbbb') 

1183 'hello' 

1184 

1185 # int as str: 

1186 >>> get_str(md, 'aaaa') 

1187 '42' 

1188 

1189 # two keys: 

1190 >>> get_str(md, ['cccc', 'bbbb']) 

1191 'hello' 

1192 

1193 # not found: 

1194 >>> get_str(md, 'cccc') 

1195 None 

1196 

1197 # not found with default value: 

1198 >>> get_str(md, 'cccc', default='-') 

1199 '-' 

1200 ``` 

1201 

1202 """ 

1203 if not metadata: 

1204 return default 

1205 if not isinstance(keys, (list, tuple, np.ndarray)): 

1206 keys = (keys,) 

1207 for key in keys: 

1208 m, k = find_key(metadata, key, sep) 

1209 if k in m and not isinstance(m[k], dict): 

1210 v = m[k] 

1211 if remove: 

1212 del m[k] 

1213 return str(v) 

1214 return default 

1215 

1216 

1217def add_sections(metadata, sections, value=False, sep='.'): 

1218 """Add sections to metadata dictionary. 

1219 

1220 Parameters 

1221 ---------- 

1222 metadata: nested dict 

1223 Metadata. 

1224 sections: str

1225 Names of sections to be added to `metadata`.

1226 Section names are separated by `sep`.

1227 value: bool

1228 If True, then the last element in `sections` is a key for a value,

1229 not a section.

1230 sep: str

1231 String that separates section names in `sections`.

1232 

1233 Returns 

1234 ------- 

1235 md: dict 

1236 Dictionary of the last added section. 

1237 key: str 

1238 Last key. Only returned if `value` is set to `True`. 

1239 

1240 Examples 

1241 -------- 

1242 

1243 Add a section and a sub-section to the metadata: 

1244 ``` 

1245 >>> from audioio import print_metadata, add_sections 

1246 >>> md = dict() 

1247 >>> m = add_sections(md, 'Recording.Location') 

1248 >>> m['Country'] = 'Lummerland' 

1249 >>> print_metadata(md) 

1250 Recording: 

1251 Location: 

1252 Country: Lummerland 

1253 ``` 

1254 

1255 Add a section with a key-value pair: 

1256 ``` 

1257 >>> md = dict() 

1258 >>> m, k = add_sections(md, 'Recording.Location', True) 

1259 >>> m[k] = 'Lummerland' 

1260 >>> print_metadata(md) 

1261 Recording: 

1262 Location: Lummerland 

1263 ``` 

1264 

1265 Works well together with `find_key()`:

1266 ``` 

1267 >>> md = dict(Recording=dict()) 

1268 >>> m, k = find_key(md, 'Recording.Location.Country') 

1269 >>> m, k = add_sections(m, k, True) 

1270 >>> m[k] = 'Lummerland' 

1271 >>> print_metadata(md) 

1272 Recording: 

1273 Location: 

1274 Country: Lummerland 

1275 ``` 

1276 

1277 """ 

1278 mm = metadata 

1279 ks = sections.split(sep) 

1280 n = len(ks) 

1281 if value: 

1282 n -= 1 

1283 for k in ks[:n]: 

1284 if len(k) == 0: 

1285 continue 

1286 mm[k] = dict() 

1287 mm = mm[k] 

1288 if value: 

1289 return mm, ks[-1] 

1290 else: 

1291 return mm 

1292 

1293 

1294def strlist_to_dict(mds): 

1295 """Convert list of key-value-pair strings to dictionary. 

1296 

1297 Parameters 

1298 ---------- 

1299 mds: None or dict or str or list of str 

1300 - None - returns empty dictionary. 

1301 - Flat dictionary - returned as is. 

1302 - String with key and value separated by '='. 

1303 - List of strings with keys and values separated by '='. 

1304 Keys may contain section names. 

1305 

1306 Returns 

1307 ------- 

1308 md_dict: dict 

1309 Flat dictionary with key-value pairs. 

1310 Keys may contain section names. 

1311 Values are strings, other types or dictionaries. 
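As a rough, standalone sketch of the string branch described above (not the function itself): `parse_pairs` is a hypothetical stand-in, and splitting at the first '=' only is a choice made for this sketch.

```python
def parse_pairs(pairs):
    # Hypothetical stand-in: turn 'key=value' strings into a flat dict.
    result = {}
    for pair in pairs:
        # split at the first '=' only (a choice of this sketch),
        # so that values may themselves contain '=':
        key, value = pair.split('=', 1)
        result[key.strip()] = value.strip()
    return result

flat = parse_pairs(['Artist = John Doe', 'Recording.Time=late'])
print(flat)
```

Keys such as 'Recording.Time' stay flat strings here; resolving the section names is left to the functions that consume the dictionary.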

1312 """ 

1313 if mds is None: 

1314 return {} 

1315 if isinstance(mds, dict): 

1316 return mds 

1317 if not isinstance(mds, (list, tuple, np.ndarray)): 

1318 mds = (mds,) 

1319 md_dict = {} 

1320 for md in mds: 

1321 k, v = md.split('=', 1)

1322 k = k.strip() 

1323 v = v.strip() 

1324 md_dict[k] = v 

1325 return md_dict 

1326 

1327 

1328def set_metadata(metadata, mds, sep='.'): 

1329 """Set values of existing metadata. 

1330 

1331 Only if a key is found in the metadata, its value is updated. 

1332 

1333 Parameters 

1334 ---------- 

1335 metadata: nested dict 

1336 Metadata. 

1337 mds: dict or str or list of str 

1338 - Flat dictionary with key-value pairs for updating the metadata. 

1339 Values can be strings, other types or dictionaries. 

1340 - String with key and value separated by '='. 

1341 - List of strings with key and value separated by '='. 

1342 Keys may contain section names separated by `sep`. 

1343 sep: str 

1344 String that separates section names in the keys of `mds`.

1345 

1346 Examples 

1347 -------- 

1348 ``` 

1349 >>> from audioio import print_metadata, set_metadata 

1350 >>> md = dict(Recording=dict(Time='early')) 

1351 >>> print_metadata(md) 

1352 Recording: 

1353 Time: early 

1354 

1355 >>> set_metadata(md, {'Artist': 'John Doe', # key does not exist: ignored

1356 'Recording.Time': 'late'}) # change value of existing key 

1357 >>> print_metadata(md) 

1358 Recording: 

1359 Time: late

1360 ``` 

1361 

1362 See also 

1363 -------- 

1364 add_metadata() 

1365 strlist_to_dict() 

1366 

1367 """ 

1368 if metadata is None: 

1369 return 

1370 md_dict = strlist_to_dict(mds) 

1371 for k in md_dict: 

1372 mm, kk = find_key(metadata, k, sep) 

1373 if kk in mm: 

1374 mm[kk] = md_dict[k] 

1375 

1376 

1377def add_metadata(metadata, mds, sep='.'): 

1378 """Add or modify key-value pairs. 

1379 

1380 If a key does not exist, it is added to the metadata. 

1381 

1382 Parameters 

1383 ---------- 

1384 metadata: nested dict 

1385 Metadata. 

1386 mds: dict or str or list of str 

1387 - Flat dictionary with key-value pairs for updating the metadata. 

1388 Values can be strings or other types. 

1389 - String with key and value separated by '='. 

1390 - List of strings with key and value separated by '='. 

1391 Keys may contain section names separated by `sep`. 

1392 sep: str 

1393 String that separates section names in the keys of `mds`.

1394 

1395 Examples 

1396 -------- 

1397 ``` 

1398 >>> from audioio import print_metadata, add_metadata 

1399 >>> md = dict(Recording=dict(Time='early')) 

1400 >>> print_metadata(md) 

1401 Recording: 

1402 Time: early 

1403 

1404 >>> add_metadata(md, {'Artist': 'John Doe', # new key-value pair 

1405 'Recording.Time': 'late', # change value of existing key  

1406 'Recording.Quality': 'amazing', # new key-value pair in existing section 

1407 'Location.Country': 'Lummerland'}) # new key-value pair in new section

1408 >>> print_metadata(md) 

1409 Recording: 

1410 Time : late 

1411 Quality: amazing 

1412 Artist: John Doe 

1413 Location: 

1414 Country: Lummerland 

1415 ``` 

1416 

1417 See also 

1418 -------- 

1419 set_metadata() 

1420 strlist_to_dict() 

1421 

1422 """ 

1423 if metadata is None: 

1424 return 

1425 md_dict = strlist_to_dict(mds) 

1426 for k in md_dict: 

1427 mm, kk = find_key(metadata, k, sep) 

1428 mm, kk = add_sections(mm, kk, True, sep) 

1429 mm[kk] = md_dict[k] 

1430 

1431 

1432def move_metadata(src_md, dest_md, keys, new_key=None, sep='.'): 

1433 """Remove a key from metadata and add it to a dictionary. 

1434 

1435 Parameters 

1436 ---------- 

1437 src_md: nested dict 

1438 Metadata from which a key is removed. 

1439 dest_md: dict 

1440 Dictionary to which the found key and its value are added. 

1441 keys: str or list of str 

1442 List of keys to be searched for in `src_md`. 

1443 Move the first one found to `dest_md`. 

1444 See the `audiometadata.find_key()` function for details. 

1445 new_key: None or str 

1446 If specified add the value of the found key as `new_key` to 

1447 `dest_md`. Otherwise, use the search key. 

1448 sep: str 

1449 String that separates section names in `keys`. 

1450 

1451 Returns 

1452 ------- 

1453 moved: bool 

1454 `True` if key was found and moved to dictionary. 

1455  

1456 Examples 

1457 -------- 

1458 ``` 

1459 >>> from audioio import print_metadata, move_metadata 

1460 >>> md = dict(Artist='John Doe', Recording=dict(Gain='1.42mV')) 

1461 >>> move_metadata(md, md['Recording'], 'Artist', 'Experimentalist') 

1462 >>> print_metadata(md) 

1463 Recording: 

1464 Gain : 1.42mV 

1465 Experimentalist: John Doe 

1466 ``` 

1467  

1468 """ 

1469 if not src_md: 

1470 return False 

1471 if not isinstance(keys, (list, tuple, np.ndarray)): 

1472 keys = (keys,) 

1473 for key in keys: 

1474 m, k = find_key(src_md, key, sep) 

1475 if k in m: 

1476 dest_key = new_key if new_key else k 

1477 dest_md[dest_key] = m.pop(k) 

1478 return True 

1479 return False 

1480 

1481 

1482def remove_metadata(metadata, key_list, sep='.'): 

1483 """Remove key-value pairs or sections from metadata. 

1484 

1485 Parameters 

1486 ---------- 

1487 metadata: nested dict 

1488 Metadata. 

1489 key_list: str or list of str 

1490 List of keys to key-value pairs or sections to be removed 

1491 from the metadata. 

1492 sep: str 

1493 String that separates section names in the keys of `key_list`. 

1494 

1495 Examples 

1496 -------- 

1497 ``` 

1498 >>> from audioio import print_metadata, remove_metadata 

1499 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4)) 

1500 >>> remove_metadata(md, ('ccc',)) 

1501 >>> print_metadata(md) 

1502 aaaa: 2 

1503 bbbb: 

1504 ddd: 4 

1505 ``` 

1506 

1507 """ 

1508 if not metadata: 

1509 return 

1510 if not isinstance(key_list, (list, tuple, np.ndarray)): 

1511 key_list = (key_list,) 

1512 for k in key_list: 

1513 mm, kk = find_key(metadata, k, sep) 

1514 if kk in mm: 

1515 del mm[kk] 

1516 

1517 

1518def cleanup_metadata(metadata): 

1519 """Remove empty sections from metadata. 

1520 

1521 Parameters 

1522 ---------- 

1523 metadata: nested dict 

1524 Metadata. 

1525 

1526 Examples 

1527 -------- 

1528 ``` 

1529 >>> from audioio import print_metadata, cleanup_metadata 

1530 >>> md = dict(aaaa=2, bbbb=dict()) 

1531 >>> cleanup_metadata(md) 

1532 >>> print_metadata(md) 

1533 aaaa: 2 

1534 ``` 

1535 

1536 """ 

1537 if not metadata: 

1538 return 

1539 for k in list(metadata): 

1540 if isinstance(metadata[k], dict): 

1541 if len(metadata[k]) == 0: 

1542 del metadata[k] 

1543 else: 

1544 cleanup_metadata(metadata[k]) 

1545 

1546 

1547default_gain_keys = ['gain'] 

1548"""Default keys of gain settings in metadata. Used by `get_gain()` function. 

1549""" 

1550 

1551def get_gain(metadata, gain_key=default_gain_keys, sep='.', 

1552 default=None, default_unit='', remove=False): 

1553 """Get gain and unit from metadata. 

1554 

1555 Parameters 

1556 ---------- 

1557 metadata: nested dict 

1558 Metadata with key-value pairs. 

1559 gain_key: str or list of str 

1560 Key in the file's metadata that holds some gain information. 

1561 If found, the data will be multiplied with the gain, 

1562 and if available, the corresponding unit is returned. 

1563 See the `audiometadata.find_key()` function for details. 

1564 You can modify the default keys via the `default_gain_keys` list 

1565 of the `audiometadata` module. 

1566 sep: str 

1567 String that separates section names in `gain_key`. 

1568 default: None or float 

1569 Returned value if no valid gain was found in `metadata`. 

1570 default_unit: str 

1571 Returned unit if no valid gain was found in `metadata`. 

1572 remove: bool 

1573 If `True`, remove the found key from `metadata`. 

1574 

1575 Returns 

1576 ------- 

1577 fac: float 

1578 Gain factor. If not found in metadata, `default` is returned.

1579 unit: string

1580 Unit of the data if found in the metadata, otherwise `default_unit`.
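The unit clean-up applied to the result can be sketched on its own. This is a minimal illustration under the assumption that a unit such as 'mV/V' should be reported as 'mV'; `strip_per_volt` is a hypothetical name that does not exist in `audioio`:

```python
def strip_per_volt(unit):
    # Hypothetical helper: drop a trailing '/V' from a unit string.
    if len(unit) >= 2 and unit[-2:] == '/V':
        return unit[:-2]
    return unit

units = [strip_per_volt(u) for u in ('mV/V', 'mV', '')]
print(units)
```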

1581 """ 

1582 v, u = get_number_unit(metadata, gain_key, sep, default, 

1583 default_unit, remove) 

1584 # fix some TeeGrid gains: 

1585 if len(u) >= 2 and u[-2:] == '/V': 

1586 u = u[:-2] 

1587 return v, u 

1588 

1589 

1590def update_gain(metadata, fac, gain_key=default_gain_keys, sep='.'): 

1591 """Update gain setting in metadata. 

1592 

1593 Searches for the first appearance of a gain key in the metadata 

1594 hierarchy. If found, divide the gain value by `fac`. 

1595 

1596 Parameters 

1597 ---------- 

1598 metadata: nested dict 

1599 Metadata to be updated. 

1600 fac: float 

1601 Factor that was used to scale the data. 

1602 gain_key: str or list of str 

1603 Key in the file's metadata that holds some gain information. 

1604 If found, the data will be multiplied with the gain, 

1605 and if available, the corresponding unit is returned. 

1606 See the `audiometadata.find_key()` function for details. 

1607 You can modify the default keys via the `default_gain_keys` list 

1608 of the `audiometadata` module. 

1609 sep: str 

1610 String that separates section names in `gain_key`. 

1611 

1612 Returns 

1613 ------- 

1614 done: bool 

1615 True if gain has been found and set. 

1616 

1617 

1618 Examples 

1619 -------- 

1620 

1621 ``` 

1622 >>> from audioio import print_metadata, update_gain 

1623 >>> md = dict(Artist='John Doe', Recording=dict(gain='1.4mV')) 

1624 >>> update_gain(md, 2) 

1625 >>> print_metadata(md) 

1626 Artist: John Doe 

1627 Recording: 

1628 gain: 0.70mV 

1629 ``` 
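How a string-valued gain is rewritten can be sketched in isolation. The sketch assumes that the gain string '1.4mV' has already been split into a number, a unit, and the number of decimals (the role `parse_number()` appears to play here; its exact return values are an assumption):

```python
# assumed result of parsing '1.4mV': value, unit, number of decimals
value, unit, digits = 1.4, 'mV', 1
fac = 2

# divide by the factor and keep one more decimal than the original:
updated = f'{value / fac:.{digits + 1}f}{unit}'
print(updated)
```

Keeping one extra decimal avoids losing precision when the division produces a value that no longer fits into the original number of decimals.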

1630 

1631 """ 

1632 if not metadata: 

1633 return False 

1634 if not isinstance(gain_key, (list, tuple, np.ndarray)): 

1635 gain_key = (gain_key,) 

1636 for gk in gain_key: 

1637 m, k = find_key(metadata, gk, sep) 

1638 if k in m and not isinstance(m[k], dict): 

1639 vs = m[k] 

1640 if isinstance(vs, (int, float)): 

1641 m[k] = vs/fac 

1642 else: 

1643 v, u, n = parse_number(vs) 

1644 if not v is None: 

1645 # fix some TeeGrid gains: 

1646 if len(u) >= 2 and u[-2:] == '/V': 

1647 u = u[:-2] 

1648 m[k] = f'{v/fac:.{n+1}f}{u}' 

1649 return True 

1650 return False 

1651 

1652 

1653def set_starttime(metadata, datetime_value, 

1654 time_keys=default_starttime_keys): 

1655 """Set all start-of-recording times in metadata. 

1656 

1657 Parameters 

1658 ---------- 

1659 metadata: nested dict 

1660 Metadata to be updated. 

1661 datetime_value: datetime or str

1662 Start date and time of the recording. Strings need to be in ISO format.

1663 time_keys: tuple of str or list of tuple of str

1664 Keys to fields denoting calendar times, i.e. dates and times.

1665 Datetimes can be stored in metadata as two separate key-value pairs,

1666 one for the date and one for the time, or as a single key-value pair

1667 holding a date-time value. This is why the keys need to be specified in

1668 tuples with one or two keys. 

1669 Keys may contain section names separated by `sep`.  

1670 See `audiometadata.find_key()` for details. 

1671 You can modify the default time keys via the `default_starttime_keys` 

1672 list of the `audiometadata` module. 

1673 

1674 Returns 

1675 ------- 

1676 success: bool 

1677 True if at least one time has been set. 

1678 

1679 Example 

1680 ------- 

1681 ``` 

1682 >>> from audioio import print_metadata, set_starttime 

1683 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00', 

1684 OtherTime='2023-05-16T23:20:10', 

1685 BEXT=dict(OriginationDate='2024-03-02', 

1686 OriginationTime='10:42:24')) 

1687 >>> set_starttime(md, '2024-06-17T22:10:05') 

1688 >>> print_metadata(md) 

1689 DateTimeOriginal: 2024-06-17T22:10:05 

1690 OtherTime : 2023-05-16T23:20:10

1691 BEXT: 

1692 OriginationDate: 2024-06-17 

1693 OriginationTime: 22:10:05 

1694 ``` 

1695 

1696 """ 

1697 if not metadata: 

1698 return False 

1699 if isinstance(datetime_value, str): 

1700 datetime_value = dt.datetime.fromisoformat(datetime_value) 

1701 success = False 

1702 if len(time_keys) > 0 and isinstance(time_keys[0], str): 

1703 time_keys = (time_keys,) 

1704 for key in time_keys: 

1705 if len(key) == 1: 

1706 # datetime: 

1707 m, k = find_key(metadata, key[0]) 

1708 if k in m and not isinstance(m[k], dict): 

1709 if isinstance(m[k], dt.datetime): 

1710 m[k] = datetime_value 

1711 else: 

1712 m[k] = datetime_value.isoformat(timespec='seconds') 

1713 success = True 

1714 else: 

1715 # separate date and time: 

1716 md, kd = find_key(metadata, key[0]) 

1717 if not kd in md or isinstance(md[kd], dict): 

1718 continue 

1719 if isinstance(md[kd], dt.date): 

1720 md[kd] = datetime_value.date() 

1721 else: 

1722 md[kd] = datetime_value.date().isoformat() 

1723 mt, kt = find_key(metadata, key[1]) 

1724 if not kt in mt or isinstance(mt[kt], dict): 

1725 continue 

1726 if isinstance(mt[kt], dt.time): 

1727 mt[kt] = datetime_value.time() 

1728 else: 

1729 mt[kt] = datetime_value.time().isoformat(timespec='seconds') 

1730 success = True 

1731 return success 

1732 

1733 

1734default_timeref_keys = ['TimeReference'] 

1735"""Default keys of integer time references in metadata. 

1736Used by `update_starttime()` function. 

1737""" 

1738 

1739def update_starttime(metadata, deltat, rate, 

1740 time_keys=default_starttime_keys, 

1741 ref_keys=default_timeref_keys): 

1742 """Update start-of-recording times in metadata. 

1743 

1744 Add `deltat` to the `time_keys` and `ref_keys` fields in the metadata.

1745 

1746 Parameters 

1747 ---------- 

1748 metadata: nested dict 

1749 Metadata to be updated. 

1750 deltat: float 

1751 Time in seconds to be added to start times. 

1752 rate: float 

1753 Sampling rate of the data in Hertz. 

1754 time_keys: tuple of str or list of tuple of str 

1755 Keys to fields denoting calendar times, i.e. dates and times.

1756 Datetimes can be stored in metadata as two separate key-value pairs,

1757 one for the date and one for the time, or as a single key-value pair

1758 holding a date-time value. This is why the keys need to be specified in

1759 tuples with one or two keys. 

1760 Keys may contain section names separated by `sep`.  

1761 See `audiometadata.find_key()` for details. 

1762 You can modify the default time keys via the `default_starttime_keys` 

1763 list of the `audiometadata` module. 

1764 ref_keys: str or list of str 

1765 Keys to time references, i.e. integers in samples relative to

1766 a reference time.

1767 Keys may contain section names separated by `sep`.  

1768 See `audiometadata.find_key()` for details. 

1769 You can modify the default reference keys via the 

1770 `default_timeref_keys` list of the `audiometadata` module. 

1771 

1772 Returns 

1773 ------- 

1774 success: bool 

1775 True if at least one time has been updated. 

1776 

1777 Example 

1778 ------- 

1779 ``` 

1780 >>> from audioio import print_metadata, update_starttime 

1781 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00', 

1782 OtherTime='2023-05-16T23:20:10', 

1783 BEXT=dict(OriginationDate='2024-03-02', 

1784 OriginationTime='10:42:24', 

1785 TimeReference=123456)) 

1786 >>> update_starttime(md, 4.2, 48000) 

1787 >>> print_metadata(md) 

1788 DateTimeOriginal: 2023-04-15T22:10:04 

1789 OtherTime : 2023-05-16T23:20:10 

1790 BEXT: 

1791 OriginationDate: 2024-03-02 

1792 OriginationTime: 10:42:28 

1793 TimeReference : 325056 

1794 ``` 
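The arithmetic behind the example above can be retraced with the standard library only. This is a sketch of the two kinds of updates, not the function itself; it uses the built-in `round()` where the module uses `np.round()`:

```python
import datetime as dt

deltat = dt.timedelta(seconds=4.2)
rate = 48000

# calendar time: add the offset and write it back with full seconds
start = dt.datetime.fromisoformat('2023-04-15T22:10:00') + deltat
start_str = start.isoformat(timespec='seconds')

# time reference: the offset is converted to samples before it is added
tref = 123456 + int(round(deltat.total_seconds() * rate))
print(start_str, tref)
```

Note that `isoformat(timespec='seconds')` truncates the fractional seconds, whereas the sample-based time reference keeps the full offset.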

1795 

1796 """ 

1797 if not metadata: 

1798 return False 

1799 if not isinstance(deltat, dt.timedelta): 

1800 deltat = dt.timedelta(seconds=deltat) 

1801 success = False 

1802 if len(time_keys) > 0 and isinstance(time_keys[0], str): 

1803 time_keys = (time_keys,) 

1804 for key in time_keys: 

1805 if len(key) == 1: 

1806 # datetime: 

1807 m, k = find_key(metadata, key[0]) 

1808 if k in m and not isinstance(m[k], dict): 

1809 if isinstance(m[k], dt.datetime): 

1810 m[k] += deltat 

1811 else: 

1812 datetime = dt.datetime.fromisoformat(m[k]) + deltat 

1813 m[k] = datetime.isoformat(timespec='seconds') 

1814 success = True 

1815 else: 

1816 # separate date and time: 

1817 md, kd = find_key(metadata, key[0]) 

1818 if not kd in md or isinstance(md[kd], dict): 

1819 continue 

1820 if isinstance(md[kd], dt.date): 

1821 date = md[kd] 

1822 is_date = True 

1823 else: 

1824 date = dt.date.fromisoformat(md[kd]) 

1825 is_date = False 

1826 mt, kt = find_key(metadata, key[1]) 

1827 if not kt in mt or isinstance(mt[kt], dict): 

1828 continue 

1829 if isinstance(mt[kt], dt.time): 

1830 time = mt[kt] 

1831 is_time = True 

1832 else: 

1833 time = dt.time.fromisoformat(mt[kt]) 

1834 is_time = False 

1835 datetime = dt.datetime.combine(date, time) + deltat 

1836 md[kd] = datetime.date() if is_date else datetime.date().isoformat() 

1837 mt[kt] = datetime.time() if is_time else datetime.time().isoformat(timespec='seconds') 

1838 success = True 

1839 # time reference in samples: 

1840 if isinstance(ref_keys, str): 

1841 ref_keys = (ref_keys,) 

1842 for key in ref_keys: 

1843 m, k = find_key(metadata, key) 

1844 if k in m and not isinstance(m[k], dict): 

1845 is_int = isinstance(m[k], int) 

1846 tref = int(m[k]) 

1847 tref += int(np.round(deltat.total_seconds()*rate)) 

1848 m[k] = tref if is_int else f'{tref}' 

1849 success = True 

1850 return success 

1851 

1852 

1853def bext_history_str(encoding, rate, channels, text=None): 

1854 """ Assemble a string for the BEXT CodingHistory field. 

1855 

1856 Parameters 

1857 ---------- 

1858 encoding: str or None 

1859 Encoding of the data. 

1860 rate: int or float 

1861 Sampling rate in Hertz. 

1862 channels: int 

1863 Number of channels. 

1864 text: str or None 

1865 Optional free text. 

1866 

1867 Returns 

1868 ------- 

1869 s: str 

1870 String for the BEXT CodingHistory field, 

1871 something like "A=PCM,F=44100,W=16,M=stereo,T=cut out"

1872 """ 

1873 codes = [] 

1874 bits = None 

1875 if encoding is not None: 

1876 if encoding[:3] == 'PCM': 

1877 bits = int(encoding[4:]) 

1878 encoding = 'PCM' 

1879 codes.append(f'A={encoding}') 

1880 codes.append(f'F={rate:.0f}') 

1881 if bits is not None: 

1882 codes.append(f'W={bits}') 

1883 mode = None 

1884 if channels == 1: 

1885 mode = 'mono' 

1886 elif channels == 2: 

1887 mode = 'stereo' 

1888 if mode is not None: 

1889 codes.append(f'M={mode}') 

1890 if text is not None: 

1891 codes.append(f'T={text.rstrip()}') 

1892 return ','.join(codes) 

1893 

1894 

1895default_history_keys = ['History', 

1896 'CodingHistory', 

1897 'BWF_CODING_HISTORY'] 

1898"""Default keys of strings describing coding history in metadata. 

1899Used by `add_history()` function. 

1900""" 

1901 

1902def add_history(metadata, history, new_key=None, pre_history=None, 

1903 history_keys=default_history_keys, sep='.'): 

1904 """Add a string describing coding history to metadata. 

1905  

1906 Add `history` to the `history_keys` fields in the metadata. If 

1907 none of these fields are present but `new_key` is specified, then 

1908 assign `pre_history` and `history` to this key. If this key does 

1909 not exist in the metadata, it is created. 

1910 

1911 Parameters 

1912 ---------- 

1913 metadata: nested dict 

1914 Metadata to be updated. 

1915 history: str 

1916 String to be added to the history. 

1917 new_key: str or None 

1918 Sections and name of a history key to be added to `metadata`. 

1919 Section names are separated by `sep`. 

1920 pre_history: str or None 

1921 If a new key `new_key` is created, then assign this string followed 

1922 by `history`. 

1923 history_keys: str or list of str 

1924 Keys to fields where to add `history`. 

1925 Keys may contain section names separated by `sep`.  

1926 See `audiometadata.find_key()` for details. 

1927 You can modify the default history keys via the `default_history_keys` 

1928 list of the `audiometadata` module. 

1929 sep: str 

1930 String that separates section names in `new_key` and `history_keys`. 

1931 

1932 Returns 

1933 ------- 

1934 success: bool 

1935 True if the history string has been added to the metadata.

1936 

1937 Example 

1938 ------- 

1939 Add string to existing history key-value pair: 

1940 ``` 

1941 >>> from audioio import add_history 

1942 >>> md = dict(aaa='xyz', BEXT=dict(CodingHistory='original recordings')) 

1943 >>> add_history(md, 'just a snippet') 

1944 >>> print(md['BEXT']['CodingHistory']) 

1945 original recordings 

1946 just a snippet 

1947 ``` 

1948 

1949 Assign string to new key-value pair: 

1950 ``` 

1951 >>> md = dict(aaa='xyz', BEXT=dict(OriginationDate='2024-02-12')) 

1952 >>> add_history(md, 'just a snippet', 'BEXT.CodingHistory', 'original data') 

1953 >>> print(md['BEXT']['CodingHistory']) 

1954 original data 

1955 just a snippet 

1956 ``` 
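The way a new entry is joined to an existing history string can be sketched separately. `append_history` is a hypothetical helper for illustration only and is not part of `audioio`:

```python
def append_history(old, new):
    # Hypothetical helper: start the new entry on its own line,
    # unless the old string is empty or already ends with a line break.
    if len(old) >= 1 and old[-1] != '\n' and old[-1] != '\r':
        old += '\r\n'
    return old + new

joined = append_history('original recordings', 'just a snippet')
print(repr(joined))
```

The entries are separated by a carriage return and a line feed, so each one prints on its own line.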

1957 

1958 """ 

1959 if not metadata: 

1960 return False 

1961 if isinstance(history_keys, str): 

1962 history_keys = (history_keys,) 

1963 success = False 

1964 for keys in history_keys: 

1965 m, k = find_key(metadata, keys, sep)

1966 if k in m and not isinstance(m[k], dict): 

1967 s = m[k] 

1968 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r': 

1969 s += '\r\n' 

1970 s += history 

1971 m[k] = s 

1972 success = True 

1973 if not success and new_key: 

1974 m, k = find_key(metadata, new_key, sep) 

1975 m, k = add_sections(m, k, True, sep) 

1976 s = '' 

1977 if pre_history is not None: 

1978 s = pre_history 

1979 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r': 

1980 s += '\r\n' 

1981 s += history 

1982 m[k] = s 

1983 success = True 

1984 return success 

1985 

1986 

1987def add_unwrap(metadata, thresh, clip=0, unit=''): 

1988 """Add unwrap infos to metadata. 

1989 

1990 If `audiotools.unwrap()` was applied to the data, then this 

1991 function adds relevant infos to the metadata. If there is an INFO 

1992 section in the metadata, the unwrap infos are added to this 

1993 section, otherwise they are added to the top level of the metadata 

1994 hierarchy. 

1995 

1996 The threshold `thresh` used for unwrapping is saved under the key 

1997 'UnwrapThreshold' as a string. If `clip` is larger than zero, then 

1998 the clip level is saved under the key 'UnwrapClippedAmplitude' as 

1999 a string. 

2000 

2001 Parameters 

2002 ---------- 

2003 metadata: nested dict

2004 Metadata to be updated. 

2005 thresh: float 

2006 Threshold used for unwrapping. 

2007 clip: float 

2008 Level at which unwrapped data have been clipped. 

2009 unit: str 

2010 Unit of `thresh` and `clip`. 

2011 

2012 Examples 

2013 -------- 

2014 

2015 ``` 

2016 >>> from audioio import print_metadata, add_unwrap 

2017 >>> md = dict(INFO=dict(Time='early')) 

2018 >>> add_unwrap(md, 0.6, 1.0) 

2019 >>> print_metadata(md) 

2020 INFO: 

2021 Time : early 

2022 UnwrapThreshold : 0.60 

2023 UnwrapClippedAmplitude: 1.00 

2024 ``` 
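Choosing the target section can be sketched as follows. `info_section` is a hypothetical helper, and returning the section under the key exactly as it is spelled in the metadata is an assumption of this sketch:

```python
def info_section(metadata):
    # Hypothetical helper: return the INFO section if there is one
    # (ignoring case and surrounding whitespace), else the top level.
    for key in metadata:
        if key.strip().upper() == 'INFO':
            return metadata[key]
    return metadata

md = {'Info': {'Time': 'early'}}
target = info_section(md)
# threshold stored as a string with two decimals:
target['UnwrapThreshold'] = f'{0.6:.2f}'
print(md)
```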

2025 

2026 """ 

2027 if metadata is None: 

2028 return 

2029 md = metadata 

2030 for k in metadata: 

2031 if k.strip().upper() == 'INFO': 

2032 md = metadata[k]

2033 break 

2034 md['UnwrapThreshold'] = f'{thresh:.2f}{unit}' 

2035 if clip > 0: 

2036 md['UnwrapClippedAmplitude'] = f'{clip:.2f}{unit}' 

2037 

2038 

2039def demo(file_pathes, list_format, list_metadata, list_cues, list_chunks): 

2040 """Print metadata and markers of audio files. 

2041 

2042 Parameters 

2043 ---------- 

2044 file_pathes: list of str 

2045 Paths of audio files.

2046 list_format: bool 

2047 If True, list file format only. 

2048 list_metadata: bool 

2049 If True, list metadata only. 

2050 list_cues: bool 

2051 If True, list markers/cues only. 

2052 list_chunks: bool 

2053 If True, list all chunks contained in a riff/wave file. 

2054 """ 

2055 from .audioloader import AudioLoader 

2056 from .audiomarkers import print_markers 

2057 from .riffmetadata import read_chunk_tags 

2058 for filepath in file_pathes: 

2059 if len(file_pathes) > 1 and (list_cues or list_metadata or 

2060 list_format or list_chunks): 

2061 print(filepath) 

2062 if list_chunks: 

2063 chunks = read_chunk_tags(filepath) 

2064 print(f' {"chunk tag":10s} {"position":10s} {"size":10s}') 

2065 for tag in chunks: 

2066 pos = chunks[tag][0] - 8 

2067 size = chunks[tag][1] + 8 

2068 print(f' {tag:9s} {pos:10d} {size:10d}') 

2069 if len(file_pathes) > 1: 

2070 print() 

2071 continue 

2072 with AudioLoader(filepath, 1, 0, verbose=0) as sf: 

2073 fmt_md = sf.format_dict() 

2074 meta_data = sf.metadata() 

2075 locs, labels = sf.markers() 

2076 if list_cues: 

2077 if len(locs) > 0: 

2078 print_markers(locs, labels) 

2079 elif list_metadata: 

2080 print_metadata(meta_data, replace='.') 

2081 elif list_format: 

2082 print_metadata(fmt_md) 

2083 else: 

2084 print('file:') 

2085 print_metadata(fmt_md, ' ') 

2086 if len(meta_data) > 0: 

2087 print() 

2088 print('metadata:') 

2089 print_metadata(meta_data, ' ', replace='.') 

2090 if len(locs) > 0: 

2091 print() 

2092 print('markers:') 

2093 print_markers(locs, labels) 

2094 if len(file_pathes) > 1: 

2095 print() 

2096 if len(file_pathes) > 1: 

2097 print() 

2098 

2099 

2100def main(*cargs): 

2101 """Call demo with command line arguments. 

2102 

2103 Parameters 

2104 ---------- 

2105 cargs: list of strings 

2106 Command line arguments as provided by sys.argv[1:] 

2107 """ 

2108 # command line arguments: 

2109 parser = argparse.ArgumentParser(add_help=True, 

2110 description='Print metadata and markers of audio files.',

2111 epilog=f'version {__version__} by Benda-Lab (2020-{__year__})') 

2112 parser.add_argument('--version', action='version', version=__version__) 

2113 parser.add_argument('-f', dest='dataformat', action='store_true', 

2114 help='list file format only') 

2115 parser.add_argument('-m', dest='metadata', action='store_true', 

2116 help='list metadata only') 

2117 parser.add_argument('-c', dest='cues', action='store_true', 

2118 help='list cues/markers only') 

2119 parser.add_argument('-t', dest='chunks', action='store_true', 

2120 help='list tags of all riff/wave chunks contained in the file') 

2121 parser.add_argument('files', type=str, nargs='+', 

2122 help='audio file') 

2123 if len(cargs) == 0: 

2124 cargs = None 

2125 args = parser.parse_args(cargs) 

2126 

2127 # expand wildcard patterns: 

2128 files = [] 

2129 if os.name == 'nt': 

2130 for fn in args.files: 

2131 files.extend(glob.glob(fn)) 

2132 else: 

2133 files = args.files 

2134 

2135 demo(files, args.dataformat, args.metadata, args.cues, args.chunks) 

2136 

2137 

2138if __name__ == "__main__": 

2139 main(*sys.argv[1:])