Coverage for src/audioio/audiometadata.py: 99%

556 statements  

coverage.py v7.6.12, created at 2025-02-16 18:31 +0000

1"""Working with metadata. 

2 

3To interface the various ways metadata are stored in audio files, the 

4`audioio` package uses nested dictionaries. The keys are always 

5strings. Values are strings, integers, floats, datetimes, or other 

6types. Value strings can also be numbers followed by a unit, 

7e.g. "4.2mV". For defining subsections of key-value pairs, values can 

8be dictionaries. The dictionaries can be nested to arbitrary depth. 

9 

10```py 

11>>> from audioio import print_metadata 

12>>> md = dict(Recording=dict(Experimenter='John Doe', 

13 DateTimeOriginal='2023-10-01T14:10:02', 

14 Count=42), 

15 Hardware=dict(Amplifier='Teensy_Amp 4.1', 

16 Highpass='10Hz', 

17 Gain='120mV')) 

18>>> print_metadata(md) 

19``` 

20results in 

21```txt 

22Recording: 

23 Experimenter : John Doe 

24 DateTimeOriginal: 2023-10-01T14:10:02 

25 Count : 42 

26Hardware: 

27 Amplifier: Teensy_Amp 4.1 

28 Highpass : 10Hz 

29 Gain : 120mV 

30``` 

31 

32Often, audio files have very specific ways to store metadata. You can 

33enforce using these by putting them into a dictionary that is added to 

34the metadata with a key having the name of the metadata type you want, 

35e.g. the "INFO", "BEXT", "iXML", and "GUAN" chunks of RIFF/WAVE files. 

36 

37## Functions 

38 

39The `audiometadata` module provides functions for handling and 

40manipulating these nested dictionaries. Many functions take keys as 

41arguments for finding or setting specific key-value pairs. These keys 

42can be the key of a specific item of a (sub-) dictionary, regardless of

43the level of the metadata hierarchy it is on. For example, simply

44searching for "Highpass" retrieves the corresponding value "10Hz",

45although "Highpass" is contained in the sub-dictionary (or "section") 

46with key "Hardware". The same item can also be specified together with 

47its parent keys: "Hardware.Highpass". Parent keys (or section keys) 

48are by default separated by '.', but all functions have a `sep` 

49keyword argument that specifies the string separating section names in

50keys. Key matching is case insensitive. 

51 

52Since the same items are named by many different keys in the different 

53types of metadata data models, the functions also take lists of keys 

54as arguments. 
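
As a rough illustration of the key matching described above, the following minimal sketch looks up keys case-insensitively on any level of a nested dictionary and accepts a list of alternative keys. The helper names `search` and `lookup` are hypothetical and not part of `audioio`; the functions of this module may differ in detail.

```py
def search(md, parts):
    # Return (found, value) for the section path `parts` (upper-case strings).
    for k, v in md.items():
        if k.upper() == parts[0]:
            if len(parts) == 1:
                return True, v
            if isinstance(v, dict):
                # first section matched: continue with the remaining parts
                return search(v, parts[1:])
    # no match on this level: try all subsections
    for v in md.values():
        if isinstance(v, dict):
            found, value = search(v, parts)
            if found:
                return True, value
    return False, None


def lookup(md, keys, sep='.'):
    # Return the value of the first key in `keys` that is found, else None.
    if isinstance(keys, str):
        keys = [keys]
    for key in keys:
        parts = [p.strip().upper() for p in key.split(sep)]
        found, value = search(md, parts)
        if found:
            return value
    return None


md = dict(Recording=dict(Experimenter='John Doe'),
          Hardware=dict(Highpass='10Hz', Gain='120mV'))
print(lookup(md, 'highpass'))                  # 10Hz
print(lookup(md, 'Hardware.Highpass'))         # 10Hz
print(lookup(md, ['Amplification', 'Gain']))   # 120mV
```

Passing several candidate keys is convenient when the same quantity is stored under different names in different metadata models.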

55 

56Do not forget that you can easily manipulate the metadata by means of 

57the standard functions of dictionaries. 

58 

59If you need to make a copy of the metadata use `deepcopy`: 

60``` 

61from copy import deepcopy 

62md_orig = deepcopy(md) 

63``` 
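
The reason for using `deepcopy` rather than a plain `dict()` copy is that sections are themselves dictionaries, which a shallow copy shares with the original. A small sketch using only the standard library (the variable names are chosen for illustration):

```py
from copy import deepcopy

md = dict(Hardware=dict(Gain='120mV'))
shallow = dict(md)       # copies only the top level
deep = deepcopy(md)      # copies all nested sections as well

md['Hardware']['Gain'] = '60mV'
print(shallow['Hardware']['Gain'])  # 60mV: the section is shared with md
print(deep['Hardware']['Gain'])     # 120mV: independent copy
```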

64 

65### Output 

66 

67Write nested dictionaries as texts: 

68 

69- `write_metadata_text()`: write meta data into a text/yaml file. 

70- `print_metadata()`: write meta data to standard output. 

71 

72### Flatten 

73 

74Conversion between nested and flat dictionaries: 

75 

76- `flatten_metadata()`: flatten hierarchical metadata to a single dictionary.

77- `unflatten_metadata()`: unflatten a previously flattened metadata dictionary. 
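
To make the idea of flattening concrete, here is a minimal, self-contained sketch of a round trip between nested and flat dictionaries. The functions `flatten` and `unflatten` are hypothetical stand-ins written for this example; in particular, `unflatten` uses `dict.setdefault()`, which is not necessarily how `unflatten_metadata()` is implemented.

```py
def flatten(md, sep='.', prefix=''):
    # Collect all key-value pairs, prefixing keys with their section names.
    flat = {}
    for k, v in md.items():
        if isinstance(v, dict):
            flat.update(flatten(v, sep, prefix + k + sep))
        else:
            flat[prefix + k] = v
    return flat


def unflatten(flat, sep='.'):
    # Rebuild the sections from the separators contained in the keys.
    md = {}
    for key, v in flat.items():
        *sections, name = key.split(sep)
        d = md
        for s in sections:
            d = d.setdefault(s, {})
        d[name] = v
    return md


md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
flat = flatten(md)
print(flat['bbbb.eee.hh'])     # 5
print(unflatten(flat) == md)   # True
```

The round trip only works if no key contains the separator string itself.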

78 

79### Parse numbers with units 

80 

81- `parse_number()`: parse string with number and unit. 

82- `change_unit()`: scale numerical value to a new unit. 
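
The following simplified sketch shows the general idea behind splitting a string into number and unit and rescaling the number by SI prefixes. The names `split_number_unit`, `prefix_factor`, `scale`, and the small `PREFIXES` table are assumptions made for this example and cover far fewer cases than the functions of this module.

```py
# hypothetical, deliberately small prefix table
PREFIXES = {'k': 1e3, 'M': 1e6, 'm': 1e-3, 'u': 1e-6}


def split_number_unit(s):
    # Split a string like '42.3ms' into (42.3, 'ms').
    # The value is None if the string does not start with a number.
    s = s.strip()
    n = 0
    while n < len(s) and s[n] in '+-.0123456789':
        n += 1
    try:
        value = float(s[:n]) if '.' in s[:n] else int(s[:n])
    except ValueError:
        return None, s
    return value, s[n:].strip()


def prefix_factor(unit):
    # A prefix only counts if something follows it
    # ('m' alone is a unit, not the prefix milli).
    if len(unit) > 1 and unit[0] in PREFIXES:
        return PREFIXES[unit[0]]
    return 1.0


def scale(value, old_unit, new_unit):
    # Convert value from old_unit to new_unit based on their prefixes.
    return value * prefix_factor(old_unit) / prefix_factor(new_unit)


value, unit = split_number_unit('1.5kHz')
print(value, unit)               # 1.5 kHz
print(scale(value, unit, 'Hz'))  # 1500.0
```

Note that this sketch does not handle exponents or special units such as "min" or "%", which need to be treated separately from the prefix table.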

83 

84### Find and get values 

85 

86Find keys and get their values parsed and converted to various types: 

87 

88- `find_key()`: find dictionary in metadata hierarchy containing the specified key. 

89- `get_number_unit()`: find a key in metadata and return its number and unit. 

90- `get_number()`: find a key in metadata and return its value in a given unit. 

91- `get_int()`: find a key in metadata and return its integer value. 

92- `get_bool()`: find a key in metadata and return its boolean value. 

93- `get_datetime()`: find keys in metadata and return a datetime. 

94- `get_str()`: find a key in metadata and return its string value. 
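
Datetimes may be stored either as a single ISO string or as separate date and time strings. As a minimal sketch of how both cases can be turned into a `datetime` object with the standard library (the helper `to_datetime` is hypothetical and only illustrates the conversion, not the key lookup):

```py
import datetime as dt


def to_datetime(date_str, time_str=None):
    # A single string holds date and time, e.g. '2023-04-15T22:10:00'.
    if time_str is None:
        return dt.datetime.fromisoformat(date_str)
    # Otherwise parse date and time separately and combine them.
    date = dt.date.fromisoformat(date_str)
    time = dt.time.fromisoformat(time_str)
    return dt.datetime.combine(date, time)


print(to_datetime('2023-04-15T22:10:00'))     # 2023-04-15 22:10:00
print(to_datetime('2024-03-02', '10:42:24'))  # 2024-03-02 10:42:24
```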

95 

96### Organize metadata 

97 

98Add and remove metadata: 

99 

100- `strlist_to_dict()`: convert list of key-value-pair strings to dictionary. 

101- `add_sections()`: add sections to metadata dictionary. 

102- `set_metadata()`: set values of existing metadata. 

103- `add_metadata()`: add or modify key-value pairs. 

104- `move_metadata()`: remove a key from metadata and add it to a dictionary. 

105- `remove_metadata()`: remove key-value pairs or sections from metadata. 

106- `cleanup_metadata()`: remove empty sections from metadata. 

107 

108### Special metadata fields 

109 

110Retrieve and set specific metadata: 

111 

112- `get_gain()`: get gain and unit from metadata. 

113- `update_gain()`: update gain setting in metadata. 

114- `set_starttime()`: set all start-of-recording times in metadata. 

115- `update_starttime()`: update start-of-recording times in metadata. 

116- `bext_history_str()`: assemble a string for the BEXT CodingHistory field. 

117- `add_history()`: add a string describing coding history to metadata. 

118- `add_unwrap()`: add unwrap information to metadata.

119 

120Lists of standard keys: 

121 

122- `default_starttime_keys`: keys of times of start of the recording. 

123- `default_timeref_keys`: keys of integer time references. 

124- `default_gain_keys`: keys of gain settings. 

125- `default_history_keys`: keys of strings describing coding history. 

126 

127 

128## Command line script 

129 

130The module can be run as a script from the command line to display the 

131metadata and markers contained in an audio file: 

132 

133```sh 

134> audiometadata logger.wav 

135``` 

136prints 

137```text 

138file: 

139 filepath : logger.wav 

140 samplingrate: 96000Hz 

141 channels : 16 

142 frames : 17280000 

143 duration : 180.000s 

144 

145metadata: 

146 INFO: 

147 Bits : 32 

148 Pins : 1-CH2R,1-CH2L,1-CH3R,1-CH3L,2-CH2R,2-CH2L,2-CH3R,2-CH3L,3-CH2R,3-CH2L,3-CH3R,3-CH3L,4-CH2R,4-CH2L,4-CH3R,4-CH3L 

149 Gain : 165.00mV 

150 uCBoard : Teensy 4.1 

151 MACAdress : 04:e9:e5:15:3e:95 

152 DateTimeOriginal: 2023-10-01T14:10:02 

153 Software : TeeGrid R4-senors-logger v1.0 

154``` 

155 

156 

157Alternatively, the script can be run from the module as: 

158``` 

159python -m src.audioio.audiometadata audiofile.wav

160``` 

161 

162Running 

163```sh 

164audiometadata --help 

165``` 

166prints 

167```text 

168usage: audiometadata [-h] [--version] [-f] [-m] [-c] [-t] files [files ...] 

169 

170Convert audio file formats. 

171 

172positional arguments: 

173 files audio file 

174 

175options: 

176 -h, --help show this help message and exit 

177 --version show program's version number and exit 

178 -f list file format only 

179 -m list metadata only 

180 -c list cues/markers only 

181 -t list tags of all riff/wave chunks contained in the file 

182 

183version 2.0.0 by Benda-Lab (2020-2024) 

184``` 

185 

186""" 

187 

188import sys 

189import argparse 

190import numpy as np 

191import datetime as dt 

192from .version import __version__, __year__ 

193 

194 

195def write_metadata_text(fh, meta, prefix='', indent=4, replace=None): 

196 """Write meta data into a text/yaml file or stream. 

197 

198 With the default parameters, the output is a valid yaml file. 

199 

200 Parameters 

201 ---------- 

202 fh: filename or stream 

203 If not a stream, the file with name `fh` is opened. 

204 Otherwise `fh` is used as a stream for writing. 

205 meta: nested dict 

206 Key-value pairs of metadata to be written into the file. 

207 prefix: str 

208 This string is written at the beginning of each line. 

209 indent: int 

210 Number of characters used for indentation of sections. 

211 replace: char or None 

212 If specified, replace special characters by this character. 

213 

214 Examples 

215 -------- 

216 ``` 

217 from audioio import write_metadata_text

218 md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)))

219 write_metadata_text('info.txt', md)

220 ``` 

221 """ 

222 

223 def write_dict(df, md, level, smap): 

224 w = 0 

225 for k in md: 

226 if not isinstance(md[k], dict) and w < len(k): 

227 w = len(k) 

228 for k in md: 

229 clevel = level*indent 

230 if isinstance(md[k], dict): 

231 df.write(f'{prefix}{"":>{clevel}}{k}:\n') 

232 write_dict(df, md[k], level+1, smap) 

233 else: 

234 value = md[k] 

235 if isinstance(value, (list, tuple)): 

236 value = ', '.join([f'{v}' for v in value]) 

237 else: 

238 value = f'{value}' 

239 value = value.replace('\r\n', r'\n') 

240 value = value.replace('\n', r'\n') 

241 if len(smap) > 0: 

242 value = value.translate(smap) 

243 df.write(f'{prefix}{"":>{clevel}}{k:<{w}}: {value}\n') 

244 

245 if not meta: 

246 return 

247 if hasattr(fh, 'write'): 

248 own_file = False 

249 else: 

250 own_file = True 

251 fh = open(fh, 'w') 

252 smap = {} 

253 if replace: 

254 smap = str.maketrans('\r\n\t\x00', ''.join([replace]*4)) 

255 write_dict(fh, meta, 0, smap) 

256 if own_file: 

257 fh.close() 

258 

259 

260def print_metadata(meta, prefix='', indent=4, replace=None): 

261 """Write meta data to standard output. 

262 

263 Parameters 

264 ---------- 

265 meta: nested dict 

266 Key-value pairs of metadata to be written to standard output.

267 prefix: str 

268 This string is written at the beginning of each line. 

269 indent: int 

270 Number of characters used for indentation of sections. 

271 replace: char or None 

272 If specified, replace special characters by this character. 

273 

274 Examples 

275 -------- 

276 ``` 

277 >>> from audioio import print_metadata 

278 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6)) 

279 >>> print_metadata(md) 

280 aaaa: 2 

281 bbbb: 

282 ccc: 3 

283 ddd: 4 

284 eee: 

285 hh: 5 

286 iiii: 

287 jjj: 6 

288 ``` 

289 """ 

290 write_metadata_text(sys.stdout, meta, prefix, indent, replace) 

291 

292 

293def flatten_metadata(md, keep_sections=False, sep='.'): 

294 """Flatten hierarchical metadata to a single dictionary. 

295 

296 Parameters 

297 ---------- 

298 md: nested dict 

299 Metadata as returned by `metadata()`. 

300 keep_sections: bool 

301 If `True`, then prefix keys with section names, separated by `sep`. 

302 sep: str 

303 String for separating section names. 

304 

305 Returns 

306 ------- 

307 d: dict 

308 Non-nested dict containing all key-value pairs of `md`. 

309 

310 Examples 

311 -------- 

312 ``` 

313 >>> from audioio import print_metadata, flatten_metadata 

314 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6)) 

315 >>> print_metadata(md) 

316 aaaa: 2 

317 bbbb: 

318 ccc: 3 

319 ddd: 4 

320 eee: 

321 hh: 5 

322 iiii: 

323 jjj: 6 

324  

325 >>> fmd = flatten_metadata(md, keep_sections=True) 

326 >>> print_metadata(fmd) 

327 aaaa : 2 

328 bbbb.ccc : 3 

329 bbbb.ddd : 4 

330 bbbb.eee.hh: 5 

331 iiii.jjj : 6 

332 ``` 

333 """ 

334 def flatten(cd, section): 

335 df = {} 

336 for k in cd: 

337 if isinstance(cd[k], dict): 

338 df.update(flatten(cd[k], section + k + sep)) 

339 else: 

340 if keep_sections: 

341 df[section+k] = cd[k] 

342 else: 

343 df[k] = cd[k] 

344 return df 

345 

346 return flatten(md, '') 

347 

348 

349def unflatten_metadata(md, sep='.'): 

350 """Unflatten a previously flattened metadata dictionary. 

351 

352 Parameters 

353 ---------- 

354 md: dict 

355 Flat dictionary with key-value pairs as obtained from 

356 `flatten_metadata()` with `keep_sections=True`. 

357 sep: str 

358 String that separates section names. 

359 

360 Returns 

361 ------- 

362 d: nested dict 

363 Hierarchical dictionary with sub-dictionaries and key-value pairs. 

364 

365 Examples 

366 -------- 

367 ``` 

368 >>> from audioio import print_metadata, unflatten_metadata 

369 >>> fmd = {'aaaa': 2, 'bbbb.ccc': 3, 'bbbb.ddd': 4, 'bbbb.eee.hh': 5, 'iiii.jjj': 6} 

370 >>> print_metadata(fmd) 

371 aaaa : 2 

372 bbbb.ccc : 3 

373 bbbb.ddd : 4 

374 bbbb.eee.hh: 5 

375 iiii.jjj : 6 

376  

377 >>> md = unflatten_metadata(fmd) 

378 >>> print_metadata(md) 

379 aaaa: 2 

380 bbbb: 

381 ccc: 3 

382 ddd: 4 

383 eee: 

384 hh: 5 

385 iiii: 

386 jjj: 6 

387 ``` 

388 """ 

389 umd = {} # unflattened metadata 

390 cmd = [umd] # current metadata dicts for each level of the hierarchy 

391 csk = [] # current section keys 

392 for k in md: 

393 ks = k.split(sep) 

394 # go up the hierarchy: 

395 for i in range(len(csk) - len(ks)): 

396 csk.pop() 

397 cmd.pop() 

398 for kss in reversed(ks[:len(csk)]): 

399 if kss == csk[-1]: 

400 break 

401 csk.pop() 

402 cmd.pop() 

403 # add new sections: 

404 for kss in ks[len(csk):-1]: 

405 csk.append(kss) 

406 cmd[-1][kss] = {} 

407 cmd.append(cmd[-1][kss]) 

408 # add key-value pair: 

409 cmd[-1][ks[-1]] = md[k] 

410 return umd 

411 

412 

413def parse_number(s): 

414 """Parse string with number and unit. 

415 

416 Parameters 

417 ---------- 

418 s: str, float, or int 

419 String to be parsed. The initial part of the string is 

420 expected to be a number, the part following the number is 

421 interpreted as the unit. If float or int, then return this 

422 as the value with empty unit. 

423 

424 Returns 

425 ------- 

426 v: None, int, or float 

427 Value of the string as float. Without decimal point, an int is returned. 

428 If the string does not contain a number, None is returned. 

429 u: str 

430 Unit that follows the initial number. 

431 n: int 

432 Number of digits behind the decimal point. 

433 

434 Examples 

435 -------- 

436 

437 ``` 

438 >>> from audioio import parse_number 

439 

440 # integer: 

441 >>> parse_number('42') 

442 (42, '', 0) 

443 

444 # integer with unit: 

445 >>> parse_number('42ms') 

446 (42, 'ms', 0) 

447 

448 # float with unit: 

449 >>> parse_number('42.ms') 

450 (42.0, 'ms', 0) 

451 

452 # float with unit: 

453 >>> parse_number('42.3ms') 

454 (42.3, 'ms', 1) 

455 

456 # float with space and unit: 

457 >>> parse_number('423.17 Hz') 

458 (423.17, 'Hz', 2) 

459 ``` 

460 

461 """ 

462 if not isinstance(s, str): 

463 if isinstance(s, int): 

464 return s, '', 0 

465 if isinstance(s, float): 

466 return s, '', 5 

467 else: 

468 return None, '', 0 

469 n = len(s) 

470 ip = n 

471 have_point = False 

472 for i in range(len(s)): 

473 if s[i] == '.': 

474 if have_point: 

475 n = i 

476 break 

477 have_point = True 

478 ip = i + 1 

479 if not s[i] in '0123456789.+-': 

480 n = i 

481 break 

482 if n == 0: 

483 return None, s, 0 

484 v = float(s[:n]) if have_point else int(s[:n]) 

485 u = s[n:].strip() 

486 nd = n - ip if n >= ip else 0 

487 return v, u, nd 

488 

489 

490unit_prefixes = {'Deka': 1e1, 'deka': 1e1, 'Hekto': 1e2, 'hekto': 1e2, 

491 'kilo': 1e3, 'Kilo': 1e3, 'Mega': 1e6, 'mega': 1e6, 

492 'Giga': 1e9, 'giga': 1e9, 'Tera': 1e12, 'tera': 1e12, 

493 'Peta': 1e15, 'peta': 1e15, 'Exa': 1e18, 'exa': 1e18, 

494 'Dezi': 1e-1, 'dezi': 1e-1, 'Zenti': 1e-2, 'centi': 1e-2, 

495 'Milli': 1e-3, 'milli': 1e-3, 'Micro': 1e-6, 'micro': 1e-6, 

496 'Nano': 1e-9, 'nano': 1e-9, 'Piko': 1e-12, 'piko': 1e-12, 

497 'Femto': 1e-15, 'femto': 1e-15, 'Atto': 1e-18, 'atto': 1e-18, 

498 'da': 1e1, 'h': 1e2, 'K': 1e3, 'k': 1e3, 'M': 1e6, 

499 'G': 1e9, 'T': 1e12, 'P': 1e15, 'E': 1e18, 

500 'd': 1e-1, 'c': 1e-2, 'mu': 1e-6, 'u': 1e-6, 'm': 1e-3, 

501 'n': 1e-9, 'p': 1e-12, 'f': 1e-15, 'a': 1e-18} 

502""" SI prefixes for units with corresponding factors. """ 

503 

504 

505def change_unit(val, old_unit, new_unit): 

506 """Scale numerical value to a new unit. 

507 

508 Adapted from https://github.com/relacs/relacs/blob/1facade622a80e9f51dbf8e6f8171ac74c27f100/options/src/parameter.cc#L1647-L1703 

509 

510 Parameters 

511 ---------- 

512 val: float 

513 Value given in `old_unit`. 

514 old_unit: str 

515 Unit of `val`. 

516 new_unit: str 

517 Requested unit of return value. 

518 

519 Returns 

520 ------- 

521 new_val: float 

522 The input value `val` scaled to `new_unit`. 

523 

524 Examples 

525 -------- 

526 

527 ``` 

528 >>> from audioio import change_unit 

529 >>> change_unit(5, 'mm', 'cm') 

530 0.5 

531 

532 >>> change_unit(5, '', 'cm') 

533 5.0 

534 

535 >>> change_unit(5, 'mm', '') 

536 5.0 

537 

538 >>> change_unit(5, 'cm', 'mm') 

539 50.0 

540 

541 >>> change_unit(4, 'kg', 'g') 

542 4000.0 

543 

544 >>> change_unit(12, '%', '') 

545 0.12 

546 

547 >>> change_unit(1.24, '', '%') 

548 124.0 

549 

550 >>> change_unit(2.5, 'min', 's') 

551 150.0 

552 

553 >>> change_unit(3600, 's', 'h') 

554 1.0 

555 

556 ``` 

557 

558 """ 

559 # missing unit? 

560 if not old_unit and not new_unit: 

561 return val 

562 if not old_unit and new_unit != '%': 

563 return val 

564 if not new_unit and old_unit != '%': 

565 return val 

566 

567 # special units that directly translate into factors: 

568 unit_factors = {'%': 0.01, 'hour': 60.0*60.0, 'h': 60.0*60.0, 'min': 60.0} 

569 

570 # parse old unit: 

571 f1 = 1.0 

572 if old_unit in unit_factors: 

573 f1 = unit_factors[old_unit] 

574 else: 

575 for k in unit_prefixes: 

576 if len(old_unit) > len(k) and old_unit[:len(k)] == k: 

577 f1 = unit_prefixes[k]

578 

579 # parse new unit: 

580 f2 = 1.0 

581 if new_unit in unit_factors: 

582 f2 = unit_factors[new_unit] 

583 else: 

584 for k in unit_prefixes: 

585 if len(new_unit) > len(k) and new_unit[:len(k)] == k: 

586 f2 = unit_prefixes[k]

587 

588 return val*f1/f2 

589 

590 

591def find_key(metadata, key, sep='.'): 

592 """Find dictionary in metadata hierarchy containing the specified key. 

593 

594 Parameters 

595 ---------- 

596 metadata: nested dict 

597 Metadata. 

598 key: str 

599 Key to be searched for (case insensitive). 

600 May contain section names separated by `sep`, i.e. 

601 "aaa.bbb.ccc" searches "ccc" (can be key-value pair or section) 

602 in section "bbb" that needs to be a subsection of section "aaa". 

603 sep: str 

604 String that separates section names in `key`. 

605 

606 Returns 

607 ------- 

608 md: dict 

609 The innermost dictionary matching some sections of the search key. 

610 If `key` is not at all contained in the metadata, 

611 the top-level dictionary is returned. 

612 key: str 

613 The part of the search key that was not found in `md`, or the 

614 final part of the search key, if it was found in `md`.

615 

616 Examples 

617 -------- 

618 

619 Independent of whether found or not found, you can assign to the 

620 returned dictionary with the returned key. 

621 

622 ``` 

623 >>> from audioio import print_metadata, find_key 

624 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(ff=5)), gggg=dict(hhh=6)) 

625 >>> print_metadata(md) 

626 aaaa: 2 

627 bbbb: 

628 ccc: 3 

629 ddd: 4 

630 eee: 

631 ff: 5 

632 gggg: 

633 hhh: 6 

634 

635 >>> m, k = find_key(md, 'bbbb.ddd') 

636 >>> m[k] = 10 

637 >>> print_metadata(md) 

638 aaaa: 2 

639 bbbb: 

640 ccc: 3 

641 ddd: 10 

642 ... 

643 

644 >>> m, k = find_key(md, 'hhh') 

645 >>> m[k] = 12 

646 >>> print_metadata(md) 

647 ... 

648 gggg: 

649 hhh: 12 

650 

651 >>> m, k = find_key(md, 'bbbb.eee.xx') 

652 >>> m[k] = 42 

653 >>> print_metadata(md) 

654 ... 

655 eee: 

656 ff: 5 

657 xx: 42 

658 ... 

659 ``` 

660 

661 When searching for sections, the one containing the searched section

662 is returned: 

663 ```py 

664 >>> m, k = find_key(md, 'eee') 

665 >>> m[k]['yy'] = 46 

666 >>> print_metadata(md) 

667 ... 

668 eee: 

669 ff: 5 

670 xx: 42 

671 yy: 46 

672 ... 

673 ``` 

674 

675 """ 

676 def find_keys(metadata, keys): 

677 key = keys[0].strip().upper() 

678 for k in metadata: 

679 if k.upper() == key: 

680 if len(keys) == 1: 

681 # found key: 

682 return True, metadata, k 

683 elif isinstance(metadata[k], dict): 

684 # keep searching within the next section: 

685 return find_keys(metadata[k], keys[1:]) 

686 # search in subsections: 

687 for k in metadata: 

688 if isinstance(metadata[k], dict): 

689 found, mm, kk = find_keys(metadata[k], keys) 

690 if found: 

691 return True, mm, kk 

692 # nothing found: 

693 return False, metadata, sep.join(keys) 

694 

695 if metadata is None: 

696 return {}, None 

697 ks = key.strip().split(sep) 

698 found, mm, kk = find_keys(metadata, ks) 

699 return mm, kk 

700 

701 

702def get_number_unit(metadata, keys, sep='.', default=None, 

703 default_unit='', remove=False): 

704 """Find a key in metadata and return its number and unit. 

705 

706 Parameters 

707 ---------- 

708 metadata: nested dict 

709 Metadata. 

710 keys: str or list of str 

711 Keys in the metadata to be searched for (case insensitive). 

712 Value of the first key found is returned. 

713 May contain section names separated by `sep`.  

714 See `audiometadata.find_key()` for details. 

715 sep: str 

716 String that separates section names in `key`. 

717 default: None, int, or float 

718 Returned value if `key` is not found or the value does 

719 not contain a number. 

720 default_unit: str 

721 Returned unit if `key` is not found or the key's value does 

722 not have a unit. 

723 remove: bool 

724 If `True`, remove the found key from `metadata`. 

725 

726 Returns 

727 ------- 

728 v: None, int, or float 

729 Value referenced by `key` as float. 

730 Without decimal point, an int is returned. 

731 If none of the `keys` was found or 

732 the key's value does not contain a number,

733 then `default` is returned. 

734 u: str 

735 Corresponding unit. 

736 

737 Examples 

738 -------- 

739 

740 ``` 

741 >>> from audioio import get_number_unit 

742 >>> md = dict(aaaa='42', bbbb='42.3ms') 

743 

744 # integer: 

745 >>> get_number_unit(md, 'aaaa') 

746 (42, '') 

747 

748 # float with unit: 

749 >>> get_number_unit(md, 'bbbb') 

750 (42.3, 'ms') 

751 

752 # two keys: 

753 >>> get_number_unit(md, ['cccc', 'bbbb']) 

754 (42.3, 'ms') 

755 

756 # not found: 

757 >>> get_number_unit(md, 'cccc') 

758 (None, '') 

759 

760 # not found with default value: 

761 >>> get_number_unit(md, 'cccc', default=1.0, default_unit='a.u.') 

762 (1.0, 'a.u.') 

763 ``` 

764 

765 """ 

766 if not metadata: 

767 return default, default_unit 

768 if not isinstance(keys, (list, tuple, np.ndarray)): 

769 keys = (keys,) 

770 value = default 

771 unit = default_unit 

772 for key in keys: 

773 m, k = find_key(metadata, key, sep) 

774 if k in m: 

775 v, u, _ = parse_number(m[k]) 

776 if v is not None: 

777 if not u: 

778 u = default_unit 

779 if remove: 

780 del m[k] 

781 return v, u 

782 elif u and unit == default_unit: 

783 unit = u 

784 return value, unit 

785 

786 

787def get_number(metadata, unit, keys, sep='.', default=None, remove=False): 

788 """Find a key in metadata and return its value in a given unit. 

789 

790 Parameters 

791 ---------- 

792 metadata: nested dict 

793 Metadata. 

794 unit: str 

795 Unit in which to return numerical value referenced by one of the `keys`. 

796 keys: str or list of str 

797 Keys in the metadata to be searched for (case insensitive). 

798 Value of the first key found is returned. 

799 May contain section names separated by `sep`.  

800 See `audiometadata.find_key()` for details. 

801 sep: str 

802 String that separates section names in `key`. 

803 default: None, int, or float 

804 Returned value if `key` is not found or the value does 

805 not contain a number. 

806 remove: bool 

807 If `True`, remove the found key from `metadata`. 

808 

809 Returns 

810 ------- 

811 v: None or float 

812 Value referenced by `key` as float scaled to `unit`. 

813 If none of the `keys` was found or 

814 the key's value does not contain a number,

815 then `default` is returned. 

816 

817 Examples 

818 -------- 

819 

820 ``` 

821 >>> from audioio import get_number 

822 >>> md = dict(aaaa='42', bbbb='42.3ms') 

823 

824 # milliseconds to seconds: 

825 >>> get_number(md, 's', 'bbbb') 

826 0.0423 

827 

828 # milliseconds to microseconds: 

829 >>> get_number(md, 'us', 'bbbb') 

830 42300.0 

831 

832 # value without unit is not scaled: 

833 >>> get_number(md, 'Hz', 'aaaa') 

834 42 

835 

836 # two keys: 

837 >>> get_number(md, 's', ['cccc', 'bbbb']) 

838 0.0423 

839 

840 # not found: 

841 >>> get_number(md, 's', 'cccc') 

842 None 

843 

844 # not found with default value: 

845 >>> get_number(md, 's', 'cccc', default=1.0) 

846 1.0 

847 ``` 

848 

849 """ 

850 v, u = get_number_unit(metadata, keys, sep, None, unit, remove) 

851 if v is None: 

852 return default 

853 else: 

854 return change_unit(v, u, unit) 

855 

856 

857def get_int(metadata, keys, sep='.', default=None, remove=False): 

858 """Find a key in metadata and return its integer value. 

859 

860 Parameters 

861 ---------- 

862 metadata: nested dict 

863 Metadata. 

864 keys: str or list of str 

865 Keys in the metadata to be searched for (case insensitive). 

866 Value of the first key found is returned. 

867 May contain section names separated by `sep`.  

868 See `audiometadata.find_key()` for details. 

869 sep: str 

870 String that separates section names in `key`. 

871 default: None or int 

872 Return value if `key` is not found or the value does 

873 not contain an integer. 

874 remove: bool 

875 If `True`, remove the found key from `metadata`. 

876 

877 Returns 

878 ------- 

879 v: None or int 

880 Value referenced by `key` as integer. 

881 If none of the `keys` was found, 

882 the key's value does not contain a number or represents 

883 a floating point value, then `default` is returned. 

884 

885 Examples 

886 -------- 

887 

888 ``` 

889 >>> from audioio import get_int 

890 >>> md = dict(aaaa='42', bbbb='42.3ms') 

891 

892 # integer: 

893 >>> get_int(md, 'aaaa') 

894 42 

895 

896 # two keys: 

897 >>> get_int(md, ['cccc', 'aaaa']) 

898 42 

899 

900 # float: 

901 >>> get_int(md, 'bbbb') 

902 None 

903 

904 # not found: 

905 >>> get_int(md, 'cccc') 

906 None 

907 

908 # not found with default value: 

909 >>> get_int(md, 'cccc', default=0) 

910 0 

911 ``` 

912 

913 """ 

914 if not metadata: 

915 return default 

916 if not isinstance(keys, (list, tuple, np.ndarray)): 

917 keys = (keys,) 

918 for key in keys: 

919 m, k = find_key(metadata, key, sep) 

920 if k in m: 

921 v, _, n = parse_number(m[k]) 

922 if v is not None and n == 0: 

923 if remove: 

924 del m[k] 

925 return int(v) 

926 return default 

927 

928 

929def get_bool(metadata, keys, sep='.', default=None, remove=False): 

930 """Find a key in metadata and return its boolean value. 

931 

932 Parameters 

933 ---------- 

934 metadata: nested dict 

935 Metadata. 

936 keys: str or list of str 

937 Keys in the metadata to be searched for (case insensitive). 

938 Value of the first key found is returned. 

939 May contain section names separated by `sep`.  

940 See `audiometadata.find_key()` for details. 

941 sep: str 

942 String that separates section names in `key`. 

943 default: None or bool 

944 Return value if `key` is not found or the value does 

945 not specify a boolean value. 

946 remove: bool 

947 If `True`, remove the found key from `metadata`. 

948 

949 Returns 

950 ------- 

951 v: None or bool 

952 Value referenced by `key` as boolean. 

953 True if 'true', 't', 'yes', 'y' (case insensitive) or any nonzero number.

954 False if 'false', 'f', 'no', 'n' (case insensitive) or any number equal to zero.

955 If none of the `keys` was found or 

956 the key's value does not specify a boolean value,

957 then `default` is returned. 

958 

959 Examples 

960 -------- 

961 

962 ``` 

963 >>> from audioio import get_bool 

964 >>> md = dict(aaaa='TruE', bbbb='No', cccc=0, dddd=1, eeee=True, ffff='ui') 

965 

966 # case insensitive: 

967 >>> get_bool(md, 'aaaa') 

968 True 

969 

970 >>> get_bool(md, 'bbbb') 

971 False 

972 

973 >>> get_bool(md, 'cccc') 

974 False 

975 

976 >>> get_bool(md, 'dddd') 

977 True 

978 

979 >>> get_bool(md, 'eeee') 

980 True 

981 

982 # value does not specify a boolean:

983 >>> get_bool(md, 'ffff') 

984 None 

985 

986 # two keys (string is preferred over number): 

987 >>> get_bool(md, ['cccc', 'aaaa']) 

988 True 

989 

990 # two keys (take first match): 

991 >>> get_bool(md, ['cccc', 'ffff']) 

992 False 

993 

994 # value does not specify a boolean, with default value:

995 >>> get_bool(md, 'ffff', default=False) 

996 False 

997 ``` 

998 

999 """ 

1000 if not metadata: 

1001 return default 

1002 if not isinstance(keys, (list, tuple, np.ndarray)): 

1003 keys = (keys,) 

1004 val = default 

1005 mv = None 

1006 kv = None 

1007 for key in keys: 

1008 m, k = find_key(metadata, key, sep) 

1009 if k in m and not isinstance(m[k], dict): 

1010 vs = m[k] 

1011 v, _, _ = parse_number(vs) 

1012 if v is not None: 

1013 val = abs(v) > 1e-8 

1014 mv = m 

1015 kv = k 

1016 elif isinstance(vs, str): 

1017 if vs.upper() in ['TRUE', 'T', 'YES', 'Y']: 

1018 if remove: 

1019 del m[k] 

1020 return True 

1021 if vs.upper() in ['FALSE', 'F', 'NO', 'N']: 

1022 if remove: 

1023 del m[k] 

1024 return False 

1025 if not mv is None and not kv is None and remove: 

1026 del mv[kv] 

1027 return val 

1028 

1029 

1030default_starttime_keys = [['DateTimeOriginal'], 

1031 ['OriginationDate', 'OriginationTime'], 

1032 ['Location_Time'], 

1033 ['Timestamp']] 

1034"""Default keys of times of start of the recording in metadata. 

1035Used by `get_datetime()` and `update_starttime()` functions. 

1036""" 

1037 

1038def get_datetime(metadata, keys=default_starttime_keys, 

1039 sep='.', default=None, remove=False): 

1040 """Find keys in metadata and return a datetime. 

1041 

1042 Parameters 

1043 ---------- 

1044 metadata: nested dict 

1045 Metadata. 

1046 keys: tuple of str or list of tuple of str 

1047 Datetimes can be stored in metadata as two separate key-value pairs, 

1048 one for the date and one for the time. Or by a single key-value pair 

1049 for a date-time value. This is why the keys need to be specified in 

1050 tuples with one or two keys. 

1051 The value of the first tuple of keys found is returned. 

1052 Keys may contain section names separated by `sep`.  

1053 See `audiometadata.find_key()` for details. 

1054 The default values for the `keys` find the start time of a recording. 

1055 You can modify the default keys via the `default_starttime_keys` list 

1056 of the `audiometadata` module. 

1057 sep: str 

1058 String that separates section names in `key`. 

1059 default: None or datetime

1060 Return value if none of the `keys` is found or the values

1061 are neither datetime objects nor strings.

1062 remove: bool 

1063 If `True`, remove the found key from `metadata`. 

1064 

1065 Returns 

1066 ------- 

1067 v: None or datetime 

1068 Datetime referenced by `keys`. 

1069 If none of the `keys` was found, then `default` is returned. 

1070 

1071 Examples 

1072 -------- 

1073 

1074 ``` 

1075 >>> from audioio import get_datetime 

1076 >>> import datetime as dt 

1077 >>> md = dict(date='2024-03-02', time='10:42:24', 

1078 datetime='2023-04-15T22:10:00') 

1079 

1080 # separate date and time: 

1081 >>> get_datetime(md, ('date', 'time')) 

1082 datetime.datetime(2024, 3, 2, 10, 42, 24) 

1083 

1084 # single datetime: 

1085 >>> get_datetime(md, ('datetime',)) 

1086 datetime.datetime(2023, 4, 15, 22, 10) 

1087 

1088 # two alternative key tuples: 

1089 >>> get_datetime(md, [('aaaa',), ('date', 'time')]) 

1090 datetime.datetime(2024, 3, 2, 10, 42, 24) 

1091 

1092 # not found: 

1093 >>> get_datetime(md, ('cccc',)) 

1094 None 

1095 

1096 # not found with default value: 

1097 >>> get_datetime(md, ('cccc', 'dddd'), 

1098 default=dt.datetime(2022, 2, 22, 22, 2, 12)) 

1099 datetime.datetime(2022, 2, 22, 22, 2, 12) 

1100 ``` 

1101 

1102 """ 

1103 if not metadata: 

1104 return default 

1105 if len(keys) > 0 and isinstance(keys[0], str): 

1106 keys = (keys,) 

1107 for keyp in keys: 

1108 if len(keyp) == 1: 

1109 m, k = find_key(metadata, keyp[0], sep) 

1110 if k in m: 

1111 v = m[k] 

1112 if isinstance(v, dt.datetime): 

1113 if remove: 

1114 del m[k] 

1115 return v 

1116 elif isinstance(v, str): 

1117 if remove: 

1118 del m[k] 

1119 return dt.datetime.fromisoformat(v) 

1120 else: 

1121 md, kd = find_key(metadata, keyp[0], sep) 

1122 if not kd in md: 

1123 continue 

1124 if isinstance(md[kd], dt.date): 

1125 date = md[kd] 

1126 elif isinstance(md[kd], str): 

1127 date = dt.date.fromisoformat(md[kd]) 

1128 else: 

1129 continue 

1130 mt, kt = find_key(metadata, keyp[1], sep) 

1131 if not kt in mt: 

1132 continue 

1133 if isinstance(mt[kt], dt.time): 

1134 time = mt[kt] 

1135 elif isinstance(mt[kt], str): 

1136 time = dt.time.fromisoformat(mt[kt]) 

1137 else: 

1138 continue 

1139 if remove: 

1140 del md[kd] 

1141 del mt[kt] 

1142 return dt.datetime.combine(date, time) 

1143 return default 
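
The two-key branch above boils down to parsing a date and a time separately and then combining them. The following is a minimal stand-alone sketch of that step using only the standard library; `combine_start` is a hypothetical helper name and not part of `audioio`.

```python
import datetime as dt

def combine_start(date_value, time_value):
    """Combine a date and a time, each given as an object or an ISO string.

    Hypothetical helper for illustration, not part of audioio.
    """
    if isinstance(date_value, str):
        date_value = dt.date.fromisoformat(date_value)
    if isinstance(time_value, str):
        time_value = dt.time.fromisoformat(time_value)
    return dt.datetime.combine(date_value, time_value)

# mixed input: date as string, time as datetime.time object
start = combine_start('2024-03-02', dt.time(10, 42, 24))
print(start.isoformat())
```

Accepting both strings and `datetime` objects mirrors the `isinstance` checks in the function body above.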

1144 

1145 

1146def get_str(metadata, keys, sep='.', default=None, remove=False): 

1147 """Find a key in metadata and return its string value. 

1148 

1149 Parameters 

1150 ---------- 

1151 metadata: nested dict 

1152 Metadata. 

1153 keys: str or list of str 

1154 Keys in the metadata to be searched for (case insensitive). 

1155 Value of the first key found is returned. 

1156 May contain section names separated by `sep`.  

1157 See `audiometadata.find_key()` for details. 

1158 sep: str 

1159 String that separates section names in `keys`.

1160 default: None or str

1161 Return value if none of the `keys` is found or the value

1162 is a section (dictionary).

1163 remove: bool 

1164 If `True`, remove the found key from `metadata`. 

1165 

1166 Returns 

1167 ------- 

1168 v: None or str 

1169 String value referenced by `key`. 

1170 If none of the `keys` was found, then `default` is returned. 

1171 

1172 Examples 

1173 -------- 

1174 

1175 ``` 

1176 >>> from audioio import get_str 

1177 >>> md = dict(aaaa=42, bbbb='hello') 

1178 

1179 # string: 

1180 >>> get_str(md, 'bbbb') 

1181 'hello' 

1182 

1183 # int as str: 

1184 >>> get_str(md, 'aaaa') 

1185 '42' 

1186 

1187 # two keys: 

1188 >>> get_str(md, ['cccc', 'bbbb']) 

1189 'hello' 

1190 

1191 # not found: 

1192 >>> get_str(md, 'cccc') 

1193 None 

1194 

1195 # not found with default value: 

1196 >>> get_str(md, 'cccc', default='-') 

1197 '-' 

1198 ``` 

1199 

1200 """ 

1201 if not metadata: 

1202 return default 

1203 if not isinstance(keys, (list, tuple, np.ndarray)): 

1204 keys = (keys,) 

1205 for key in keys: 

1206 m, k = find_key(metadata, key, sep) 

1207 if k in m and not isinstance(m[k], dict): 

1208 v = m[k] 

1209 if remove: 

1210 del m[k] 

1211 return str(v) 

1212 return default 

1213 

1214 

1215def add_sections(metadata, sections, value=False, sep='.'): 

1216 """Add sections to metadata dictionary. 

1217 

1218 Parameters 

1219 ---------- 

1220 metadata: nested dict 

1221 Metadata. 

1222 sections: str

1223 Names of sections to be added to `metadata`.

1224 Section names separated by `sep`.

1225 value: bool

1226 If True, then the last element in `sections` is a key for a value,

1227 not a section.

1228 sep: str

1229 String that separates section names in `sections`.

1230 

1231 Returns 

1232 ------- 

1233 md: dict 

1234 Dictionary of the last added section. 

1235 key: str 

1236 Last key. Only returned if `value` is set to `True`. 

1237 

1238 Examples 

1239 -------- 

1240 

1241 Add a section and a sub-section to the metadata: 

1242 ``` 

1243 >>> from audioio import print_metadata, add_sections 

1244 >>> md = dict() 

1245 >>> m = add_sections(md, 'Recording.Location') 

1246 >>> m['Country'] = 'Lummerland' 

1247 >>> print_metadata(md) 

1248 Recording: 

1249 Location: 

1250 Country: Lummerland 

1251 ``` 

1252 

1253 Add a section with a key-value pair: 

1254 ``` 

1255 >>> md = dict() 

1256 >>> m, k = add_sections(md, 'Recording.Location', True) 

1257 >>> m[k] = 'Lummerland' 

1258 >>> print_metadata(md) 

1259 Recording: 

1260 Location: Lummerland 

1261 ``` 

1262 

1263 Works well together with `find_key()`:

1264 ``` 

1265 >>> md = dict(Recording=dict()) 

1266 >>> m, k = find_key(md, 'Recording.Location.Country') 

1267 >>> m, k = add_sections(m, k, True) 

1268 >>> m[k] = 'Lummerland' 

1269 >>> print_metadata(md) 

1270 Recording: 

1271 Location: 

1272 Country: Lummerland 

1273 ``` 

1274 

1275 """ 

1276 mm = metadata 

1277 ks = sections.split(sep) 

1278 n = len(ks) 

1279 if value: 

1280 n -= 1 

1281 for k in ks[:n]: 

1282 if len(k) == 0: 

1283 continue 

1284 mm[k] = dict() 

1285 mm = mm[k] 

1286 if value: 

1287 return mm, ks[-1] 

1288 else: 

1289 return mm 
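
Note that the loop above assigns a fresh `dict()` for every section name, so a section that already exists under that name would be replaced. The sketch below shows a possible non-destructive variant based on `dict.setdefault()`. `ensure_sections` is a hypothetical name, and the sketch assumes that existing entries along the path are dictionaries.

```python
def ensure_sections(metadata, path, sep='.'):
    """Walk `path` and create missing sections, keeping existing ones.

    Hypothetical helper for illustration, not part of audioio.
    Assumes that existing entries along `path` are dictionaries.
    """
    section = metadata
    for name in path.split(sep):
        if not name:
            continue  # skip empty names, e.g. from a leading separator
        section = section.setdefault(name, {})
    return section

md = {'Recording': {'Time': 'early'}}
loc = ensure_sections(md, 'Recording.Location')
loc['Country'] = 'Lummerland'
print(md)
```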

1290 

1291 

1292def strlist_to_dict(mds): 

1293 """Convert list of key-value-pair strings to dictionary. 

1294 

1295 Parameters 

1296 ---------- 

1297 mds: None or dict or str or list of str 

1298 - None - returns empty dictionary. 

1299 - Flat dictionary - returned as is. 

1300 - String with key and value separated by '='. 

1301 - List of strings with keys and values separated by '='. 

1302 Keys may contain section names. 

1303 

1304 Returns 

1305 ------- 

1306 md_dict: dict 

1307 Flat dictionary with key-value pairs. 

1308 Keys may contain section names. 

1309 Values are strings, other types or dictionaries. 

1310 """ 

1311 if mds is None: 

1312 return {} 

1313 if isinstance(mds, dict): 

1314 return mds 

1315 if not isinstance(mds, (list, tuple, np.ndarray)): 

1316 mds = (mds,) 

1317 md_dict = {} 

1318 for md in mds: 

1319 k, v = md.split('=') 

1320 k = k.strip() 

1321 v = v.strip() 

1322 md_dict[k] = v 

1323 return md_dict 
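
As written above, `md.split('=')` raises a `ValueError` as soon as a string contains more than one `'='`. The following stand-alone sketch shows the same parsing step with the split limited to the first `'='`, so that values may contain further `'='` characters. `parse_pairs` is a hypothetical name used only for this illustration.

```python
def parse_pairs(lines):
    """Parse 'key=value' strings into a flat dictionary.

    Hypothetical helper for illustration, not part of audioio.
    Splits only at the first '=', so values may contain '='.
    """
    result = {}
    for line in lines:
        key, value = line.split('=', 1)
        result[key.strip()] = value.strip()
    return result

pairs = parse_pairs(['Artist = John Doe', 'Recording.Comment=a=b'])
print(pairs)
```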

1324 

1325 

1326def set_metadata(metadata, mds, sep='.'): 

1327 """Set values of existing metadata. 

1328 

1329 Only if a key is found in the metadata, its value is updated. 

1330 

1331 Parameters 

1332 ---------- 

1333 metadata: nested dict 

1334 Metadata. 

1335 mds: dict or str or list of str 

1336 - Flat dictionary with key-value pairs for updating the metadata. 

1337 Values can be strings, other types or dictionaries. 

1338 - String with key and value separated by '='. 

1339 - List of strings with key and value separated by '='. 

1340 Keys may contain section names separated by `sep`. 

1341 sep: str 

1342 String that separates section names in the keys of `mds`.

1343 

1344 Examples 

1345 -------- 

1346 ``` 

1347 >>> from audioio import print_metadata, set_metadata 

1348 >>> md = dict(Recording=dict(Time='early')) 

1349 >>> print_metadata(md) 

1350 Recording: 

1351 Time: early 

1352 

1353 >>> set_metadata(md, {'Artist': 'John Doe', # key not in metadata: ignored

1354 'Recording.Time': 'late'}) # change value of existing key

1355 >>> print_metadata(md)

1356 Recording:

1357 Time: late

1358 ``` 

1359 

1360 See also 

1361 -------- 

1362 add_metadata() 

1363 strlist_to_dict() 

1364 

1365 """ 

1366 if metadata is None: 

1367 return 

1368 md_dict = strlist_to_dict(mds) 

1369 for k in md_dict: 

1370 mm, kk = find_key(metadata, k, sep) 

1371 if kk in mm: 

1372 mm[kk] = md_dict[k] 

1373 

1374 

1375def add_metadata(metadata, mds, sep='.'): 

1376 """Add or modify key-value pairs. 

1377 

1378 If a key does not exist, it is added to the metadata. 

1379 

1380 Parameters 

1381 ---------- 

1382 metadata: nested dict 

1383 Metadata. 

1384 mds: dict or str or list of str 

1385 - Flat dictionary with key-value pairs for updating the metadata. 

1386 Values can be strings or other types. 

1387 - String with key and value separated by '='. 

1388 - List of strings with key and value separated by '='. 

1389 Keys may contain section names separated by `sep`. 

1390 sep: str 

1391 String that separates section names in the keys of `mds`.

1392 

1393 Examples 

1394 -------- 

1395 ``` 

1396 >>> from audioio import print_metadata, add_metadata 

1397 >>> md = dict(Recording=dict(Time='early')) 

1398 >>> print_metadata(md) 

1399 Recording: 

1400 Time: early 

1401 

1402 >>> add_metadata(md, {'Artist': 'John Doe', # new key-value pair 

1403 'Recording.Time': 'late', # change value of existing key  

1404 'Recording.Quality': 'amazing', # new key-value pair in existing section 

1405 'Location.Country': 'Lummerland'}) # new key-value pair in new section

1406 >>> print_metadata(md) 

1407 Recording: 

1408 Time : late 

1409 Quality: amazing 

1410 Artist: John Doe 

1411 Location: 

1412 Country: Lummerland 

1413 ``` 

1414 

1415 See also 

1416 -------- 

1417 set_metadata() 

1418 strlist_to_dict() 

1419 

1420 """ 

1421 if metadata is None: 

1422 return 

1423 md_dict = strlist_to_dict(mds) 

1424 for k in md_dict: 

1425 mm, kk = find_key(metadata, k, sep) 

1426 mm, kk = add_sections(mm, kk, True, sep) 

1427 mm[kk] = md_dict[k] 
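
The difference between `set_metadata()` (update existing keys only) and `add_metadata()` (create missing keys and sections) can be sketched without the rest of the module. The helper `resolve` below is a deliberately simplified, hypothetical stand-in for `find_key()`: it only follows complete, case-sensitive section paths, whereas `find_key()` is described as case-insensitive and able to locate keys on any level.

```python
def resolve(metadata, path, sep='.'):
    """Hypothetical stand-in for find_key(): follow a full section path."""
    *sections, key = path.split(sep)
    node = metadata
    for name in sections:
        node = node.get(name)
        if not isinstance(node, dict):
            return None, key   # section path does not exist
    return node, key

def set_existing(metadata, updates, sep='.'):
    """Update values only for keys that already exist."""
    for path, value in updates.items():
        node, key = resolve(metadata, path, sep)
        if node is not None and key in node:
            node[key] = value

def add_or_set(metadata, updates, sep='.'):
    """Update values and create missing sections and keys."""
    for path, value in updates.items():
        *sections, key = path.split(sep)
        node = metadata
        for name in sections:
            node = node.setdefault(name, {})
        node[key] = value

md = {'Recording': {'Time': 'early'}}
set_existing(md, {'Artist': 'John Doe', 'Recording.Time': 'late'})
after_set = {k: dict(v) for k, v in md.items()}   # snapshot for comparison
add_or_set(md, {'Artist': 'John Doe', 'Location.Country': 'Lummerland'})
print(after_set)
print(md)
```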

1428 

1429 

1430def move_metadata(src_md, dest_md, keys, new_key=None, sep='.'): 

1431 """Remove a key from metadata and add it to a dictionary. 

1432 

1433 Parameters 

1434 ---------- 

1435 src_md: nested dict 

1436 Metadata from which a key is removed. 

1437 dest_md: dict 

1438 Dictionary to which the found key and its value are added. 

1439 keys: str or list of str 

1440 List of keys to be searched for in `src_md`. 

1441 Move the first one found to `dest_md`. 

1442 See the `audiometadata.find_key()` function for details. 

1443 new_key: None or str 

1444 If specified add the value of the found key as `new_key` to 

1445 `dest_md`. Otherwise, use the search key. 

1446 sep: str 

1447 String that separates section names in `keys`. 

1448 

1449 Returns 

1450 ------- 

1451 moved: bool 

1452 `True` if key was found and moved to dictionary. 

1453  

1454 Examples 

1455 -------- 

1456 ``` 

1457 >>> from audioio import print_metadata, move_metadata 

1458 >>> md = dict(Artist='John Doe', Recording=dict(Gain='1.42mV')) 

1459 >>> move_metadata(md, md['Recording'], 'Artist', 'Experimentalist') 

1460 >>> print_metadata(md) 

1461 Recording: 

1462 Gain : 1.42mV 

1463 Experimentalist: John Doe 

1464 ``` 

1465  

1466 """ 

1467 if not src_md: 

1468 return False 

1469 if not isinstance(keys, (list, tuple, np.ndarray)): 

1470 keys = (keys,) 

1471 for key in keys: 

1472 m, k = find_key(src_md, key, sep) 

1473 if k in m: 

1474 dest_key = new_key if new_key else k 

1475 dest_md[dest_key] = m.pop(k) 

1476 return True 

1477 return False 

1478 

1479 

1480def remove_metadata(metadata, key_list, sep='.'): 

1481 """Remove key-value pairs or sections from metadata. 

1482 

1483 Parameters 

1484 ---------- 

1485 metadata: nested dict 

1486 Metadata. 

1487 key_list: str or list of str 

1488 List of keys to key-value pairs or sections to be removed 

1489 from the metadata. 

1490 sep: str 

1491 String that separates section names in the keys of `key_list`. 

1492 

1493 Examples 

1494 -------- 

1495 ``` 

1496 >>> from audioio import print_metadata, remove_metadata 

1497 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4)) 

1498 >>> remove_metadata(md, ('ccc',)) 

1499 >>> print_metadata(md) 

1500 aaaa: 2 

1501 bbbb: 

1502 ddd: 4 

1503 ``` 

1504 

1505 """ 

1506 if not metadata: 

1507 return 

1508 if not isinstance(key_list, (list, tuple, np.ndarray)): 

1509 key_list = (key_list,) 

1510 for k in key_list: 

1511 mm, kk = find_key(metadata, k, sep) 

1512 if kk in mm: 

1513 del mm[kk] 

1514 

1515 

1516def cleanup_metadata(metadata): 

1517 """Remove empty sections from metadata. 

1518 

1519 Parameters 

1520 ---------- 

1521 metadata: nested dict 

1522 Metadata. 

1523 

1524 Examples 

1525 -------- 

1526 ``` 

1527 >>> from audioio import print_metadata, cleanup_metadata 

1528 >>> md = dict(aaaa=2, bbbb=dict()) 

1529 >>> cleanup_metadata(md) 

1530 >>> print_metadata(md) 

1531 aaaa: 2 

1532 ``` 

1533 

1534 """ 

1535 if not metadata: 

1536 return 

1537 for k in list(metadata): 

1538 if isinstance(metadata[k], dict): 

1539 if len(metadata[k]) == 0: 

1540 del metadata[k] 

1541 else: 

1542 cleanup_metadata(metadata[k]) 
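
As written above, a section is deleted only if it is already empty when it is visited. A section whose subsections are all removed by the recursive call is left behind as an empty dictionary. The following sketch shows a bottom-up variant that also removes such sections. `prune_empty` is a hypothetical name, not part of `audioio`.

```python
def prune_empty(metadata):
    """Remove empty sections, including those that become empty.

    Hypothetical variant for illustration, not part of audioio.
    """
    for key in list(metadata):
        value = metadata[key]
        if isinstance(value, dict):
            prune_empty(value)       # clean up children first
            if not value:            # empty now, or empty from the start
                del metadata[key]

md = {'aaaa': 2, 'bbbb': {'cccc': {}}}
prune_empty(md)
print(md)
```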

1543 

1544 

1545default_gain_keys = ['gain'] 

1546"""Default keys of gain settings in metadata. Used by `get_gain()` function. 

1547""" 

1548 

1549def get_gain(metadata, gain_key=default_gain_keys, sep='.', 

1550 default=None, default_unit='', remove=False): 

1551 """Get gain and unit from metadata. 

1552 

1553 Parameters 

1554 ---------- 

1555 metadata: nested dict 

1556 Metadata with key-value pairs. 

1557 gain_key: str or list of str 

1558 Key in the file's metadata that holds some gain information. 

1559 If found, the data will be multiplied with the gain, 

1560 and if available, the corresponding unit is returned. 

1561 See the `audiometadata.find_key()` function for details. 

1562 You can modify the default keys via the `default_gain_keys` list 

1563 of the `audiometadata` module. 

1564 sep: str 

1565 String that separates section names in `gain_key`. 

1566 default: None or float 

1567 Returned value if no valid gain was found in `metadata`. 

1568 default_unit: str 

1569 Returned unit if no valid gain was found in `metadata`. 

1570 remove: bool 

1571 If `True`, remove the found key from `metadata`. 

1572 

1573 Returns 

1574 ------- 

1575 fac: float 

1576 Gain factor. If not found in metadata, `default` is returned.

1577 unit: string

1578 Unit of the data if found in the metadata, otherwise `default_unit`.

1579 """ 

1580 v, u = get_number_unit(metadata, gain_key, sep, default, 

1581 default_unit, remove) 

1582 # fix some TeeGrid gains: 

1583 if len(u) >= 2 and u[-2:] == '/V': 

1584 u = u[:-2] 

1585 return v, u 

1586 

1587 

1588def update_gain(metadata, fac, gain_key=default_gain_keys, sep='.'): 

1589 """Update gain setting in metadata. 

1590 

1591 Searches for the first appearance of a gain key in the metadata 

1592 hierarchy. If found, divide the gain value by `fac`. 

1593 

1594 Parameters 

1595 ---------- 

1596 metadata: nested dict 

1597 Metadata to be updated. 

1598 fac: float 

1599 Factor that was used to scale the data. 

1600 gain_key: str or list of str 

1601 Key in the file's metadata that holds some gain information. 

1602 If found, the data will be multiplied with the gain, 

1603 and if available, the corresponding unit is returned. 

1604 See the `audiometadata.find_key()` function for details. 

1605 You can modify the default keys via the `default_gain_keys` list 

1606 of the `audiometadata` module. 

1607 sep: str 

1608 String that separates section names in `gain_key`. 

1609 

1610 Returns 

1611 ------- 

1612 done: bool 

1613 True if gain has been found and set. 

1614 

1615 

1616 Examples 

1617 -------- 

1618 

1619 ``` 

1620 >>> from audioio import print_metadata, update_gain 

1621 >>> md = dict(Artist='John Doe', Recording=dict(gain='1.4mV')) 

1622 >>> update_gain(md, 2) 

1623 >>> print_metadata(md) 

1624 Artist: John Doe 

1625 Recording: 

1626 gain: 0.70mV 

1627 ``` 

1628 

1629 """ 

1630 if not metadata: 

1631 return False 

1632 if not isinstance(gain_key, (list, tuple, np.ndarray)): 

1633 gain_key = (gain_key,) 

1634 for gk in gain_key: 

1635 m, k = find_key(metadata, gk, sep) 

1636 if k in m and not isinstance(m[k], dict): 

1637 vs = m[k] 

1638 if isinstance(vs, (int, float)): 

1639 m[k] = vs/fac 

1640 else: 

1641 v, u, n = parse_number(vs) 

1642 if not v is None: 

1643 # fix some TeeGrid gains: 

1644 if len(u) >= 2 and u[-2:] == '/V': 

1645 u = u[:-2] 

1646 m[k] = f'{v/fac:.{n+1}f}{u}' 

1647 return True 

1648 return False 
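
The string branch above parses a value such as `'1.4mV'` into number, unit, and number of decimals, divides by `fac`, and writes the result back with one additional decimal. Since `parse_number()` is not shown in this section, the sketch below uses a small regular expression as a hypothetical replacement. It handles plain decimal numbers only, no exponent notation.

```python
import re

def scale_gain(value, fac):
    """Divide a gain string like '1.4mV' by `fac`, keeping the unit.

    Hypothetical helper for illustration, not part of audioio.
    Returns None if `value` does not start with a number.
    """
    match = re.fullmatch(r'\s*([-+]?\d+(?:\.(\d*))?)\s*(.*)', value)
    if match is None:
        return None
    number = float(match.group(1))
    decimals = len(match.group(2) or '')   # digits after the decimal point
    unit = match.group(3)
    # one more decimal than in the input, as in the function above
    return f'{number / fac:.{decimals + 1}f}{unit}'

print(scale_gain('1.4mV', 2))
```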

1649 

1650 

1651default_timeref_keys = ['TimeReference'] 

1652"""Default keys of integer time references in metadata. 

1653Used by `update_starttime()` function. 

1654""" 

1655 

1656def set_starttime(metadata, datetime_value, 

1657 time_keys=default_starttime_keys): 

1658 """Set all start-of-recording times in metadata. 

1659 

1660 Parameters 

1661 ---------- 

1662 metadata: nested dict 

1663 Metadata to be updated. 

1664 datetime_value: datetime 

1665 Start date and time of the recording. 

1666 time_keys: tuple of str or list of tuple of str 

1667 Keys to fields denoting calendar times, i.e. dates and times.

1668 Datetimes can be stored in metadata as two separate key-value pairs,

1669 one for the date and one for the time. Or by a single key-value pair

1670 for a date-time value. This is why the keys need to be specified in

1671 tuples with one or two keys. 

1672 Keys may contain section names separated by `sep`.  

1673 See `audiometadata.find_key()` for details. 

1674 You can modify the default time keys via the `default_starttime_keys` 

1675 list of the `audiometadata` module. 

1676 

1677 Returns 

1678 ------- 

1679 success: bool 

1680 True if at least one time has been set. 

1681 

1682 Example 

1683 ------- 

1684 ``` 

1685 >>> from audioio import print_metadata, set_starttime 

1686 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00', 

1687 OtherTime='2023-05-16T23:20:10', 

1688 BEXT=dict(OriginationDate='2024-03-02', 

1689 OriginationTime='10:42:24')) 

1690 >>> set_starttime(md, '2024-06-17T22:10:05') 

1691 >>> print_metadata(md) 

1692 DateTimeOriginal: 2024-06-17T22:10:05 

1693 OtherTime : 2023-05-16T23:20:10

1694 BEXT: 

1695 OriginationDate: 2024-06-17 

1696 OriginationTime: 22:10:05 

1697 ``` 

1698 

1699 """ 

1700 if not metadata: 

1701 return False 

1702 if isinstance(datetime_value, str): 

1703 datetime_value = dt.datetime.fromisoformat(datetime_value) 

1704 success = False 

1705 if len(time_keys) > 0 and isinstance(time_keys[0], str): 

1706 time_keys = (time_keys,) 

1707 for key in time_keys: 

1708 if len(key) == 1: 

1709 # datetime: 

1710 m, k = find_key(metadata, key[0]) 

1711 if k in m and not isinstance(m[k], dict): 

1712 if isinstance(m[k], dt.datetime): 

1713 m[k] = datetime_value 

1714 else: 

1715 m[k] = datetime_value.isoformat(timespec='seconds') 

1716 success = True 

1717 else: 

1718 # separate date and time: 

1719 md, kd = find_key(metadata, key[0]) 

1720 if not kd in md or isinstance(md[kd], dict): 

1721 continue 

1722 if isinstance(md[kd], dt.date): 

1723 md[kd] = datetime_value.date() 

1724 else: 

1725 md[kd] = datetime_value.date().isoformat() 

1726 mt, kt = find_key(metadata, key[1]) 

1727 if not kt in mt or isinstance(mt[kt], dict): 

1728 continue 

1729 if isinstance(mt[kt], dt.time): 

1730 mt[kt] = datetime_value.time() 

1731 else: 

1732 mt[kt] = datetime_value.time().isoformat(timespec='seconds') 

1733 success = True 

1734 return success 

1735 

1736 

1737def update_starttime(metadata, deltat, rate, 

1738 time_keys=default_starttime_keys, 

1739 ref_keys=default_timeref_keys): 

1740 """Update start-of-recording times in metadata. 

1741 

1742 Add `deltat` to `time_keys` and `ref_keys` fields in the metadata.

1743 

1744 Parameters 

1745 ---------- 

1746 metadata: nested dict 

1747 Metadata to be updated. 

1748 deltat: float 

1749 Time in seconds to be added to start times. 

1750 rate: float 

1751 Sampling rate of the data in Hertz. 

1752 time_keys: tuple of str or list of tuple of str 

1753 Keys to fields denoting calendar times, i.e. dates and times.

1754 Datetimes can be stored in metadata as two separate key-value pairs,

1755 one for the date and one for the time. Or by a single key-value pair

1756 for a date-time value. This is why the keys need to be specified in

1757 tuples with one or two keys. 

1758 Keys may contain section names separated by `sep`.  

1759 See `audiometadata.find_key()` for details. 

1760 You can modify the default time keys via the `default_starttime_keys` 

1761 list of the `audiometadata` module. 

1762 ref_keys: str or list of str 

1763 Keys to time references, i.e. integers in samples relative to

1764 a reference time.

1765 Keys may contain section names separated by `sep`.  

1766 See `audiometadata.find_key()` for details. 

1767 You can modify the default reference keys via the 

1768 `default_timeref_keys` list of the `audiometadata` module. 

1769 

1770 Returns 

1771 ------- 

1772 success: bool 

1773 True if at least one time has been updated. 

1774 

1775 Example 

1776 ------- 

1777 ``` 

1778 >>> from audioio import print_metadata, update_starttime 

1779 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00', 

1780 OtherTime='2023-05-16T23:20:10', 

1781 BEXT=dict(OriginationDate='2024-03-02', 

1782 OriginationTime='10:42:24', 

1783 TimeReference=123456)) 

1784 >>> update_starttime(md, 4.2, 48000) 

1785 >>> print_metadata(md) 

1786 DateTimeOriginal: 2023-04-15T22:10:04 

1787 OtherTime : 2023-05-16T23:20:10 

1788 BEXT: 

1789 OriginationDate: 2024-03-02 

1790 OriginationTime: 10:42:28 

1791 TimeReference : 325056 

1792 ``` 

1793 

1794 """ 

1795 if not metadata: 

1796 return False 

1797 if not isinstance(deltat, dt.timedelta): 

1798 deltat = dt.timedelta(seconds=deltat) 

1799 success = False 

1800 if len(time_keys) > 0 and isinstance(time_keys[0], str): 

1801 time_keys = (time_keys,) 

1802 for key in time_keys: 

1803 if len(key) == 1: 

1804 # datetime: 

1805 m, k = find_key(metadata, key[0]) 

1806 if k in m and not isinstance(m[k], dict): 

1807 if isinstance(m[k], dt.datetime): 

1808 m[k] += deltat 

1809 else: 

1810 datetime = dt.datetime.fromisoformat(m[k]) + deltat 

1811 m[k] = datetime.isoformat(timespec='seconds') 

1812 success = True 

1813 else: 

1814 # separate date and time: 

1815 md, kd = find_key(metadata, key[0]) 

1816 if not kd in md or isinstance(md[kd], dict): 

1817 continue 

1818 if isinstance(md[kd], dt.date): 

1819 date = md[kd] 

1820 is_date = True 

1821 else: 

1822 date = dt.date.fromisoformat(md[kd]) 

1823 is_date = False 

1824 mt, kt = find_key(metadata, key[1]) 

1825 if not kt in mt or isinstance(mt[kt], dict): 

1826 continue 

1827 if isinstance(mt[kt], dt.time): 

1828 time = mt[kt] 

1829 is_time = True 

1830 else: 

1831 time = dt.time.fromisoformat(mt[kt]) 

1832 is_time = False 

1833 datetime = dt.datetime.combine(date, time) + deltat 

1834 md[kd] = datetime.date() if is_date else datetime.date().isoformat() 

1835 mt[kt] = datetime.time() if is_time else datetime.time().isoformat(timespec='seconds') 

1836 success = True 

1837 # time reference in samples: 

1838 if isinstance(ref_keys, str): 

1839 ref_keys = (ref_keys,) 

1840 for key in ref_keys: 

1841 m, k = find_key(metadata, key) 

1842 if k in m and not isinstance(m[k], dict): 

1843 is_int = isinstance(m[k], int) 

1844 tref = int(m[k]) 

1845 tref += int(np.round(deltat.total_seconds()*rate)) 

1846 m[k] = tref if is_int else f'{tref}' 

1847 success = True 

1848 return success 
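
The time reference is counted in samples, so the shift in seconds has to be multiplied by the sampling rate before it is added. The following stand-alone sketch reproduces that arithmetic with the numbers from the docstring example above. `shift_time_reference` is a hypothetical name.

```python
import datetime as dt

def shift_time_reference(tref, deltat, rate):
    """Shift a time reference given in samples by `deltat` seconds.

    Hypothetical helper for illustration, not part of audioio.
    `tref` may be an int or a string holding an int.
    """
    if not isinstance(deltat, dt.timedelta):
        deltat = dt.timedelta(seconds=deltat)
    return int(tref) + int(round(deltat.total_seconds() * rate))

# 4.2 s at 48 kHz are 201600 samples
new_ref = shift_time_reference(123456, 4.2, 48000)
print(new_ref)
```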

1849 

1850 

1851def bext_history_str(encoding, rate, channels, text=None): 

1852 """Assemble a string for the BEXT CodingHistory field.

1853 

1854 Parameters 

1855 ---------- 

1856 encoding: str or None 

1857 Encoding of the data. 

1858 rate: int or float 

1859 Sampling rate in Hertz. 

1860 channels: int 

1861 Number of channels. 

1862 text: str or None 

1863 Optional free text. 

1864 

1865 Returns 

1866 ------- 

1867 s: str 

1868 String for the BEXT CodingHistory field, 

1869 something like "A=PCM,F=44100,W=16,M=stereo,T=cut out"

1870 """ 

1871 codes = [] 

1872 bits = None 

1873 if encoding is not None: 

1874 if encoding[:3] == 'PCM': 

1875 bits = int(encoding[4:]) 

1876 encoding = 'PCM' 

1877 codes.append(f'A={encoding}') 

1878 codes.append(f'F={rate:.0f}') 

1879 if bits is not None: 

1880 codes.append(f'W={bits}') 

1881 mode = None 

1882 if channels == 1: 

1883 mode = 'mono' 

1884 elif channels == 2: 

1885 mode = 'stereo' 

1886 if mode is not None: 

1887 codes.append(f'M={mode}') 

1888 if text is not None: 

1889 codes.append(f'T={text.rstrip()}') 

1890 return ','.join(codes) 

1891 

1892 

1893default_history_keys = ['History', 

1894 'CodingHistory', 

1895 'BWF_CODING_HISTORY'] 

1896"""Default keys of strings describing coding history in metadata. 

1897Used by `add_history()` function. 

1898""" 

1899 

1900def add_history(metadata, history, new_key=None, pre_history=None, 

1901 history_keys=default_history_keys, sep='.'): 

1902 """Add a string describing coding history to metadata. 

1903  

1904 Add `history` to the `history_keys` fields in the metadata. If 

1905 none of these fields are present but `new_key` is specified, then 

1906 assign `pre_history` and `history` to this key. If this key does 

1907 not exist in the metadata, it is created. 

1908 

1909 Parameters 

1910 ---------- 

1911 metadata: nested dict 

1912 Metadata to be updated. 

1913 history: str 

1914 String to be added to the history. 

1915 new_key: str or None 

1916 Sections and name of a history key to be added to `metadata`. 

1917 Section names are separated by `sep`. 

1918 pre_history: str or None 

1919 If a new key `new_key` is created, then assign this string followed 

1920 by `history`. 

1921 history_keys: str or list of str 

1922 Keys to fields where to add `history`. 

1923 Keys may contain section names separated by `sep`.  

1924 See `audiometadata.find_key()` for details. 

1925 You can modify the default history keys via the `default_history_keys` 

1926 list of the `audiometadata` module. 

1927 sep: str 

1928 String that separates section names in `new_key` and `history_keys`. 

1929 

1930 Returns 

1931 ------- 

1932 success: bool 

1933 True if the history string has been added to the metadata.

1934 

1935 Example 

1936 ------- 

1937 Add string to existing history key-value pair: 

1938 ``` 

1939 >>> from audioio import add_history 

1940 >>> md = dict(aaa='xyz', BEXT=dict(CodingHistory='original recordings')) 

1941 >>> add_history(md, 'just a snippet') 

1942 >>> print(md['BEXT']['CodingHistory']) 

1943 original recordings 

1944 just a snippet 

1945 ``` 

1946 

1947 Assign string to new key-value pair: 

1948 ``` 

1949 >>> md = dict(aaa='xyz', BEXT=dict(OriginationDate='2024-02-12')) 

1950 >>> add_history(md, 'just a snippet', 'BEXT.CodingHistory', 'original data') 

1951 >>> print(md['BEXT']['CodingHistory']) 

1952 original data 

1953 just a snippet 

1954 ``` 

1955 

1956 """ 

1957 if not metadata: 

1958 return False 

1959 if isinstance(history_keys, str): 

1960 history_keys = (history_keys,) 

1961 success = False 

1962 for keys in history_keys: 

1963 m, k = find_key(metadata, keys, sep)

1964 if k in m and not isinstance(m[k], dict): 

1965 s = m[k] 

1966 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r': 

1967 s += '\r\n' 

1968 s += history 

1969 m[k] = s 

1970 success = True 

1971 if not success and new_key: 

1972 m, k = find_key(metadata, new_key, sep) 

1973 m, k = add_sections(m, k, True, sep) 

1974 s = '' 

1975 if pre_history is not None: 

1976 s = pre_history 

1977 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r': 

1978 s += '\r\n' 

1979 s += history 

1980 m[k] = s 

1981 success = True 

1982 return success 
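
Both branches above use the same rule for joining history entries: a `'\r\n'` is inserted only if the existing text is non-empty and does not already end with a line break. A minimal sketch of this rule as a stand-alone function follows. `append_history` is a hypothetical name.

```python
def append_history(existing, entry):
    """Append `entry` to `existing`, separated by a single line break.

    Hypothetical helper for illustration, not part of audioio.
    """
    if existing and existing[-1] not in '\r\n':
        existing += '\r\n'
    return existing + entry

history = append_history('original recordings', 'just a snippet')
print(repr(history))
```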

1983 

1984 

1985def add_unwrap(metadata, thresh, clip=0, unit=''): 

1986 """Add unwrap infos to metadata. 

1987 

1988 If `audiotools.unwrap()` was applied to the data, then this 

1989 function adds relevant infos to the metadata. If there is an INFO 

1990 section in the metadata, the unwrap infos are added to this 

1991 section, otherwise they are added to the top level of the metadata 

1992 hierarchy. 

1993 

1994 The threshold `thresh` used for unwrapping is saved under the key 

1995 'UnwrapThreshold' as a string. If `clip` is larger than zero, then 

1996 the clip level is saved under the key 'UnwrapClippedAmplitude' as 

1997 a string. 

1998 

1999 Parameters 

2000 ---------- 

2001 metadata: nested dict

2002 Metadata to be updated. 

2003 thresh: float 

2004 Threshold used for unwrapping. 

2005 clip: float 

2006 Level at which unwrapped data have been clipped. 

2007 unit: str 

2008 Unit of `thresh` and `clip`. 

2009 

2010 Examples 

2011 -------- 

2012 

2013 ``` 

2014 >>> from audioio import print_metadata, add_unwrap 

2015 >>> md = dict(INFO=dict(Time='early')) 

2016 >>> add_unwrap(md, 0.6, 1.0) 

2017 >>> print_metadata(md) 

2018 INFO: 

2019 Time : early 

2020 UnwrapThreshold : 0.60 

2021 UnwrapClippedAmplitude: 1.00 

2022 ``` 

2023 

2024 """ 

2025 if metadata is None: 

2026 return 

2027 md = metadata 

2028 for k in metadata: 

2029 if k.strip().upper() == 'INFO': 

2030 md = metadata[k]

2031 break 

2032 md['UnwrapThreshold'] = f'{thresh:.2f}{unit}' 

2033 if clip > 0: 

2034 md['UnwrapClippedAmplitude'] = f'{clip:.2f}{unit}' 

2035 

2036 

2037def demo(file_pathes, list_format, list_metadata, list_cues, list_chunks): 

2038 """Print metadata and markers of audio files. 

2039 

2040 Parameters 

2041 ---------- 

2042 file_pathes: list of str 

2043 Paths of audio files.

2044 list_format: bool 

2045 If True, list file format only. 

2046 list_metadata: bool 

2047 If True, list metadata only. 

2048 list_cues: bool 

2049 If True, list markers/cues only. 

2050 list_chunks: bool 

2051 If True, list all chunks contained in a riff/wave file. 

2052 """ 

2053 from .audioloader import AudioLoader 

2054 from .audiomarkers import print_markers 

2055 from .riffmetadata import read_chunk_tags 

2056 for filepath in file_pathes: 

2057 if len(file_pathes) > 1 and (list_cues or list_metadata or 

2058 list_format or list_chunks): 

2059 print(filepath) 

2060 if list_chunks: 

2061 chunks = read_chunk_tags(filepath) 

2062 print(f' {"chunk tag":10s} {"position":10s} {"size":10s}') 

2063 for tag in chunks: 

2064 pos = chunks[tag][0] - 8 

2065 size = chunks[tag][1] + 8 

2066 print(f' {tag:9s} {pos:10d} {size:10d}') 

2067 if len(file_pathes) > 1: 

2068 print() 

2069 continue 

2070 with AudioLoader(filepath, 1, 0, verbose=0) as sf: 

2071 fmt_md = sf.format_dict() 

2072 meta_data = sf.metadata() 

2073 locs, labels = sf.markers() 

2074 if list_cues: 

2075 if len(locs) > 0: 

2076 print_markers(locs, labels) 

2077 elif list_metadata: 

2078 print_metadata(meta_data, replace='.') 

2079 elif list_format: 

2080 print_metadata(fmt_md) 

2081 else: 

2082 print('file:') 

2083 print_metadata(fmt_md, ' ') 

2084 if len(meta_data) > 0: 

2085 print() 

2086 print('metadata:') 

2087 print_metadata(meta_data, ' ', replace='.') 

2088 if len(locs) > 0: 

2089 print() 

2090 print('markers:') 

2091 print_markers(locs, labels) 

2092 if len(file_pathes) > 1: 

2093 print() 

2094 if len(file_pathes) > 1: 

2095 print() 

2096 

2097 

2098def main(*cargs): 

2099 """Call demo with command line arguments. 

2100 

2101 Parameters 

2102 ---------- 

2103 cargs: list of strings 

2104 Command line arguments as provided by sys.argv[1:] 

2105 """ 

2106 # command line arguments: 

2107 parser = argparse.ArgumentParser(add_help=True, 

2108 description='Print metadata and markers of audio files.',

2109 epilog=f'version {__version__} by Benda-Lab (2020-{__year__})') 

2110 parser.add_argument('--version', action='version', version=__version__) 

2111 parser.add_argument('-f', dest='dataformat', action='store_true', 

2112 help='list file format only') 

2113 parser.add_argument('-m', dest='metadata', action='store_true', 

2114 help='list metadata only') 

2115 parser.add_argument('-c', dest='cues', action='store_true', 

2116 help='list cues/markers only') 

2117 parser.add_argument('-t', dest='chunks', action='store_true', 

2118 help='list tags of all riff/wave chunks contained in the file') 

2119 parser.add_argument('files', type=str, nargs='+', 

2120 help='audio file') 

2121 if len(cargs) == 0: 

2122 cargs = None 

2123 args = parser.parse_args(cargs) 

2124 

2125 demo(args.files, args.dataformat, args.metadata, args.cues, args.chunks) 

2126 

2127 

2128if __name__ == "__main__": 

2129 main(*sys.argv[1:])