Coverage for src/audioio/audiometadata.py: 99%

526 statements  

coverage.py v7.6.3, created at 2024-10-15 07:29 +0000

1"""Working with metadata. 

2 

3To interface the various ways metadata are stored in audio files, the 

4`audioio` package uses nested dictionaries. The keys are always 

5strings. Values are strings, integers, floats, datetimes, or other 

6types. Value strings can also be numbers followed by a unit, 

7e.g. "4.2mV". For defining subsections of key-value pairs, values can 

8be dictionaries. The dictionaries can be nested to arbitrary depth. 

9 

10```txt 

11>>> from audioio import print_metadata 

12>>> md = dict(Recording=dict(Experimenter='John Doe', 

13 DateTimeOriginal='2023-10-01T14:10:02', 

14 Count=42), 

15 Hardware=dict(Amplifier='Teensy_Amp 4.1', 

16 Highpass='10Hz', 

17 Gain='120mV')) 

18>>> print_metadata(md) 

19Recording: 

20 Experimenter : John Doe 

21 DateTimeOriginal: 2023-10-01T14:10:02 

22 Count : 42 

23Hardware: 

24 Amplifier: Teensy_Amp 4.1 

25 Highpass : 10Hz 

26 Gain : 120mV 

27``` 

28 

29Often, audio files have very specific ways to store metadata. You can 

30enforce using these by putting them into a dictionary that is added to 

31the metadata with a key having the name of the metadata type you want, 

32e.g. the "INFO", "BEXT", "iXML", and "GUAN" chunks of RIFF/WAVE files. 
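
As a rough illustration of the paragraph above, a metadata dictionary that requests a specific metadata type could look like the following. The entries inside the "INFO" section are hypothetical and chosen for illustration only; which keys a given chunk type actually accepts is not shown here.

```python
# Hypothetical example metadata; the keys inside "INFO" are illustrative only.
md = {
    "Recording": {"Experimenter": "John Doe"},
    "INFO": {
        "Gain": "165.00mV",
        "DateTimeOriginal": "2023-10-01T14:10:02",
    },
}

# The section key names the metadata type the entries are meant for.
info = md["INFO"]
print(sorted(info))
```

The section key ("INFO") is just an ordinary dictionary key, so the usual dictionary operations apply to it as well.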

33 

34## Functions 

35 

The `audiometadata` module provides functions for handling and
manipulating these nested dictionaries. Many functions take keys as
arguments for finding or setting specific key-value pairs. Such a key
can be the key of a specific item of a (sub-)dictionary, no matter at
which level of the metadata hierarchy it is. For example, simply
searching for "Highpass" retrieves the corresponding value "10Hz",
although "Highpass" is contained in the sub-dictionary (or "section")
with key "Hardware". The same item can also be specified together with
its parent keys: "Hardware.Highpass". Parent keys (or section keys)
are by default separated by '.', but all functions have a `sep`
keyword argument that specifies the string separating section names in
keys. Key matching is case insensitive.

48 
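
The key lookup described above can be sketched in plain Python. This is a simplified, self-contained illustration and not the module's own code; the helper name `lookup` is hypothetical and it returns only the value (or `None`), whereas the module's functions return more information.

```python
def lookup(md, key, sep='.'):
    """Return the value stored under `key`, or None if not found (sketch)."""
    def search(d, parts):
        head = parts[0].strip().upper()
        # first try to match the leading key part on this level:
        for k, v in d.items():
            if k.upper() == head:
                if len(parts) == 1:
                    return True, v
                if isinstance(v, dict):
                    return search(v, parts[1:])
        # otherwise descend into all subsections:
        for v in d.values():
            if isinstance(v, dict):
                found, value = search(v, parts)
                if found:
                    return True, value
        return False, None

    return search(md, key.split(sep))[1]


md = dict(Recording=dict(Count=42),
          Hardware=dict(Highpass='10Hz', Gain='120mV'))
print(lookup(md, 'Highpass'))                # bare key on any level
print(lookup(md, 'hardware.highpass'))       # with section, case insensitive
print(lookup(md, 'Hardware/Gain', sep='/'))  # custom separator
```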

49Since the same items are named by many different keys in the different 

50types of metadata data models, the functions also take lists of keys 

51as arguments. 
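
To illustrate why lists of keys are useful, here is a minimal sketch that tries several alternative key names in order and returns the first value found. The helper `first_value` and the key name "AmplifierGain" are assumptions made up for this example; they are not part of the package.

```python
def first_value(md, keys, default=None):
    """Return the value of the first key in `keys` found anywhere in `md`."""
    if isinstance(keys, str):
        keys = [keys]

    def find(d, key):
        # depth-first, case-insensitive search for a non-section key
        for k, v in d.items():
            if isinstance(v, dict):
                found, value = find(v, key)
                if found:
                    return True, value
            elif k.upper() == key:
                return True, v
        return False, None

    for key in keys:
        found, value = find(md, key.upper())
        if found:
            return value
    return default


md = dict(INFO=dict(Gain='165.00mV'),
          BEXT=dict(OriginationDate='2023-10-01'))
print(first_value(md, ['AmplifierGain', 'Gain']))
print(first_value(md, ['Missing'], default='n/a'))
```

The order of the list expresses priority: earlier keys win over later ones.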

52 

53Do not forget that you can easily manipulate the metadata by means of 

54the standard functions of dictionaries. 
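
For instance, the following sketch uses only built-in `dict` methods to modify, add, and remove entries; the section and key names are taken from the example at the top or invented for illustration ("Notes", "Comment", "Lowpass").

```python
md = dict(Recording=dict(Experimenter='John Doe', Count=42),
          Hardware=dict(Highpass='10Hz', Gain='120mV'))

md['Hardware']['Gain'] = '60mV'                     # modify a value
md.setdefault('Notes', {})['Comment'] = 'test run'  # add a new section
count = md['Recording'].pop('Count')                # remove and return a value
md['Hardware'].update(Lowpass='5kHz')               # add several values at once

for section, items in md.items():
    print(section, list(items))
```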

55 

56If you need to make a copy of the metadata use `deepcopy`: 

57``` 

58from copy import deepcopy 

59md_orig = deepcopy(md) 

60``` 
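
The reason for `deepcopy` is that a shallow copy shares the nested section dictionaries with the original. A small demonstration using only the standard library:

```python
from copy import deepcopy

md = dict(Hardware=dict(Gain='120mV'))
shallow = dict(md)     # copies only the top level
deep = deepcopy(md)    # copies all nested sections as well

md['Hardware']['Gain'] = '60mV'

print(shallow['Hardware']['Gain'])  # 60mV: the section is shared with md
print(deep['Hardware']['Gain'])     # 120mV: fully independent copy
```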

61 

62### Output 

63 

Write nested dictionaries as text:

- `write_metadata_text()`: write metadata into a text/yaml file.
- `print_metadata()`: write metadata to standard output.

68 

69### Flatten 

70 

Conversion between nested and flat dictionaries:

- `flatten_metadata()`: flatten hierarchical metadata to a single dictionary.
- `unflatten_metadata()`: unflatten a previously flattened metadata dictionary.

75 
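
The idea behind flattening and unflattening can be sketched with two short stand-alone functions. These are simplified illustrations with hypothetical names (`flatten`, `unflatten`), not the module's implementations; section names are always kept as key prefixes here.

```python
def flatten(md, sep='.', prefix=''):
    """Flatten nested dicts; keys are prefixed with their section names."""
    flat = {}
    for k, v in md.items():
        if isinstance(v, dict):
            flat.update(flatten(v, sep, prefix + k + sep))
        else:
            flat[prefix + k] = v
    return flat


def unflatten(flat, sep='.'):
    """Rebuild nested dicts from keys containing section names."""
    md = {}
    for key, v in flat.items():
        *sections, name = key.split(sep)
        d = md
        for s in sections:
            d = d.setdefault(s, {})
        d[name] = v
    return md


md = dict(aaaa=2, bbbb=dict(ccc=3, eee=dict(hh=5)))
flat = flatten(md)
print(flat)
print(unflatten(flat) == md)
```

Note that in this sketch empty sections do not survive the round trip, because they contribute no key to the flat dictionary.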

76### Parse numbers with units 

77 

78- `parse_number()`: parse string with number and unit. 

79- `change_unit()`: scale numerical value to a new unit. 
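
As a rough illustration of what parsing a number with a unit involves, the following sketch splits a string with a regular expression. This is an alternative formulation written for this example, not the parser used by the module; the name `split_number_unit` is hypothetical.

```python
import re


def split_number_unit(s):
    """Split a string like '4.2mV' into a number and a unit (sketch)."""
    m = re.match(r'\s*([+-]?\d+(?:\.\d*)?)\s*(.*)', s)
    if m is None:
        return None, s
    num = m.group(1)
    # keep integers as int, everything with a decimal point as float:
    value = float(num) if '.' in num else int(num)
    return value, m.group(2).strip()


print(split_number_unit('4.2mV'))
print(split_number_unit('42'))
print(split_number_unit('423.17 Hz'))
print(split_number_unit('n.a.'))
```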

80 

81### Find and get values 

82 

Find keys and get their values parsed and converted to various types:

- `find_key()`: find dictionary in metadata hierarchy containing the specified key.
- `get_number_unit()`: find a key in metadata and return its number and unit.
- `get_number()`: find a key in metadata and return its value in a given unit.
- `get_int()`: find a key in metadata and return its integer value.
- `get_bool()`: find a key in metadata and return its boolean value.
- `get_datetime()`: find keys in metadata and return a datetime.
- `get_str()`: find a key in metadata and return its string value.

92 

93### Organize metadata 

94 

95Add and remove metadata: 

96 

97- `strlist_to_dict()`: convert list of key-value-pair strings to dictionary. 

98- `add_sections()`: add sections to metadata dictionary. 

99- `set_metadata()`: set values of existing metadata. 

100- `add_metadata()`: add or modify key-value pairs. 

101- `move_metadata()`: remove a key from metadata and add it to a dictionary. 

102- `remove_metadata()`: remove key-value pairs or sections from metadata. 

103- `cleanup_metadata()`: remove empty sections from metadata. 

104 

105### Special metadata fields 

106 

Retrieve and set specific metadata:

- `get_gain()`: get gain and unit from metadata.
- `update_gain()`: update gain setting in metadata.
- `update_starttime()`: update start-of-recording times in metadata.
- `bext_history_str()`: assemble a string for the BEXT CodingHistory field.
- `add_history()`: add a string describing coding history to metadata.
- `add_unwrap()`: add unwrap information to metadata.

115 

116Lists of standard keys: 

117 

118- `default_starttime_keys`: keys of times of start of the recording. 

119- `default_timeref_keys`: keys of integer time references. 

120- `default_gain_keys`: keys of gain settings. 

121- `default_history_keys`: keys of strings describing coding history. 

122 

123 

124## Command line script 

125 

126The module can be run as a script from the command line to display the 

127metadata and markers contained in an audio file: 

128 

129```sh 

130> audiometadata logger.wav 

131``` 

132prints 

133```text 

134file: 

135 filepath : logger.wav 

136 samplingrate: 96000Hz 

137 channels : 16 

138 frames : 17280000 

139 duration : 180.000s 

140 

141metadata: 

142 INFO: 

143 Bits : 32 

144 Pins : 1-CH2R,1-CH2L,1-CH3R,1-CH3L,2-CH2R,2-CH2L,2-CH3R,2-CH3L,3-CH2R,3-CH2L,3-CH3R,3-CH3L,4-CH2R,4-CH2L,4-CH3R,4-CH3L 

145 Gain : 165.00mV 

146 uCBoard : Teensy 4.1 

147 MACAdress : 04:e9:e5:15:3e:95 

148 DateTimeOriginal: 2023-10-01T14:10:02 

149 Software : TeeGrid R4-senors-logger v1.0 

150``` 

151 

152 

Alternatively, the script can be run from the module as:
```sh
python -m src.audioio.audiometadata audiofile.wav
```

157 

158Running 

159```sh 

160audiometadata --help 

161``` 

162prints 

163```text 

164usage: audiometadata [-h] [--version] [-f] [-m] [-c] [-t] files [files ...] 

165 

166Convert audio file formats. 

167 

168positional arguments: 

169 files audio file 

170 

171options: 

172 -h, --help show this help message and exit 

173 --version show program's version number and exit 

174 -f list file format only 

175 -m list metadata only 

176 -c list cues/markers only 

177 -t list tags of all riff/wave chunks contained in the file 

178 

179version 2.0.0 by Benda-Lab (2020-2024) 

180``` 

181 

182""" 

183 

184import sys 

185import argparse 

186import numpy as np 

187import datetime as dt 

188from .version import __version__, __year__ 

189 

190 

191def write_metadata_text(fh, meta, prefix='', indent=4, replace=None): 

192 """Write meta data into a text/yaml file or stream. 

193 

194 With the default parameters, the output is a valid yaml file. 

195 

196 Parameters 

197 ---------- 

198 fh: filename or stream 

199 If not a stream, the file with name `fh` is opened. 

200 Otherwise `fh` is used as a stream for writing. 

201 meta: nested dict 

202 Key-value pairs of metadata to be written into the file. 

203 prefix: str 

204 This string is written at the beginning of each line. 

205 indent: int 

206 Number of characters used for indentation of sections. 

207 replace: char or None 

208 If specified, replace special characters by this character. 

209 

210 Examples 

211 -------- 

    ```
    from audioio import write_metadata_text
    md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)))
    write_metadata_text('info.txt', md)
    ```

217 """ 

218 

219 def write_dict(df, md, level, smap): 

220 w = 0 

221 for k in md: 

222 if not isinstance(md[k], dict) and w < len(k): 

223 w = len(k) 

224 for k in md: 

225 clevel = level*indent 

226 if isinstance(md[k], dict): 

227 df.write(f'{prefix}{"":>{clevel}}{k}:\n') 

228 write_dict(df, md[k], level+1, smap) 

229 else: 

230 value = md[k] 

231 if isinstance(value, (list, tuple)): 

232 value = ', '.join([f'{v}' for v in value]) 

233 else: 

234 value = f'{value}' 

235 value = value.replace('\r\n', r'\n') 

236 value = value.replace('\n', r'\n') 

237 if len(smap) > 0: 

238 value = value.translate(smap) 

239 df.write(f'{prefix}{"":>{clevel}}{k:<{w}}: {value}\n') 

240 

241 if not meta: 

242 return 

243 if hasattr(fh, 'write'): 

244 own_file = False 

245 else: 

246 own_file = True 

247 fh = open(fh, 'w') 

248 smap = {} 

249 if replace: 

250 smap = str.maketrans('\r\n\t\x00', ''.join([replace]*4)) 

251 write_dict(fh, meta, 0, smap) 

252 if own_file: 

253 fh.close() 

254 

255 

256def print_metadata(meta, prefix='', indent=4, replace=None): 

257 """Write meta data to standard output. 

258 

259 Parameters 

260 ---------- 

261 meta: nested dict 

        Key-value pairs of metadata to be written to standard output.

263 prefix: str 

264 This string is written at the beginning of each line. 

265 indent: int 

266 Number of characters used for indentation of sections. 

267 replace: char or None 

268 If specified, replace special characters by this character. 

269 

270 Examples 

271 -------- 

272 ``` 

273 >>> from audioio import print_metadata 

274 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6)) 

275 >>> print_metadata(md) 

276 aaaa: 2 

277 bbbb: 

278 ccc: 3 

279 ddd: 4 

280 eee: 

281 hh: 5 

282 iiii: 

283 jjj: 6 

284 ``` 

285 """ 

286 write_metadata_text(sys.stdout, meta, prefix, indent, replace) 

287 

288 

289def flatten_metadata(md, keep_sections=False, sep='.'): 

290 """Flatten hierarchical metadata to a single dictionary. 

291 

292 Parameters 

293 ---------- 

294 md: nested dict 

295 Metadata as returned by `metadata()`. 

296 keep_sections: bool 

297 If `True`, then prefix keys with section names, separated by `sep`. 

298 sep: str 

299 String for separating section names. 

300 

301 Returns 

302 ------- 

303 d: dict 

304 Non-nested dict containing all key-value pairs of `md`. 

305 

306 Examples 

307 -------- 

308 ``` 

309 >>> from audioio import print_metadata, flatten_metadata 

310 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6)) 

311 >>> print_metadata(md) 

312 aaaa: 2 

313 bbbb: 

314 ccc: 3 

315 ddd: 4 

316 eee: 

317 hh: 5 

318 iiii: 

319 jjj: 6 

320  

321 >>> fmd = flatten_metadata(md, keep_sections=True) 

322 >>> print_metadata(fmd) 

323 aaaa : 2 

324 bbbb.ccc : 3 

325 bbbb.ddd : 4 

326 bbbb.eee.hh: 5 

327 iiii.jjj : 6 

328 ``` 

329 """ 

330 def flatten(cd, section): 

331 df = {} 

332 for k in cd: 

333 if isinstance(cd[k], dict): 

334 df.update(flatten(cd[k], section + k + sep)) 

335 else: 

336 if keep_sections: 

337 df[section+k] = cd[k] 

338 else: 

339 df[k] = cd[k] 

340 return df 

341 

342 return flatten(md, '') 

343 

344 

345def unflatten_metadata(md, sep='.'): 

346 """Unflatten a previously flattened metadata dictionary. 

347 

348 Parameters 

349 ---------- 

350 md: dict 

351 Flat dictionary with key-value pairs as obtained from 

352 `flatten_metadata()` with `keep_sections=True`. 

353 sep: str 

354 String that separates section names. 

355 

356 Returns 

357 ------- 

358 d: nested dict 

359 Hierarchical dictionary with sub-dictionaries and key-value pairs. 

360 

361 Examples 

362 -------- 

363 ``` 

364 >>> from audioio import print_metadata, unflatten_metadata 

365 >>> fmd = {'aaaa': 2, 'bbbb.ccc': 3, 'bbbb.ddd': 4, 'bbbb.eee.hh': 5, 'iiii.jjj': 6} 

366 >>> print_metadata(fmd) 

367 aaaa : 2 

368 bbbb.ccc : 3 

369 bbbb.ddd : 4 

370 bbbb.eee.hh: 5 

371 iiii.jjj : 6 

372  

373 >>> md = unflatten_metadata(fmd) 

374 >>> print_metadata(md) 

375 aaaa: 2 

376 bbbb: 

377 ccc: 3 

378 ddd: 4 

379 eee: 

380 hh: 5 

381 iiii: 

382 jjj: 6 

383 ``` 

384 """ 

385 umd = {} # unflattened metadata 

386 cmd = [umd] # current metadata dicts for each level of the hierarchy 

387 csk = [] # current section keys 

388 for k in md: 

389 ks = k.split(sep) 

390 # go up the hierarchy: 

391 for i in range(len(csk) - len(ks)): 

392 csk.pop() 

393 cmd.pop() 

394 for kss in reversed(ks[:len(csk)]): 

395 if kss == csk[-1]: 

396 break 

397 csk.pop() 

398 cmd.pop() 

399 # add new sections: 

400 for kss in ks[len(csk):-1]: 

401 csk.append(kss) 

402 cmd[-1][kss] = {} 

403 cmd.append(cmd[-1][kss]) 

404 # add key-value pair: 

405 cmd[-1][ks[-1]] = md[k] 

406 return umd 

407 

408 

409def parse_number(s): 

410 """Parse string with number and unit. 

411 

412 Parameters 

413 ---------- 

414 s: str, float, or int 

415 String to be parsed. The initial part of the string is 

416 expected to be a number, the part following the number is 

417 interpreted as the unit. If float or int, then return this 

418 as the value with empty unit. 

419 

420 Returns 

421 ------- 

422 v: None, int, or float 

423 Value of the string as float. Without decimal point, an int is returned. 

424 If the string does not contain a number, None is returned. 

425 u: str 

426 Unit that follows the initial number. 

427 n: int 

428 Number of digits behind the decimal point. 

429 

430 Examples 

431 -------- 

432 

433 ``` 

434 >>> from audioio import parse_number 

435 

436 # integer: 

437 >>> parse_number('42') 

438 (42, '', 0) 

439 

440 # integer with unit: 

441 >>> parse_number('42ms') 

442 (42, 'ms', 0) 

443 

444 # float with unit: 

445 >>> parse_number('42.ms') 

446 (42.0, 'ms', 0) 

447 

448 # float with unit: 

449 >>> parse_number('42.3ms') 

450 (42.3, 'ms', 1) 

451 

452 # float with space and unit: 

453 >>> parse_number('423.17 Hz') 

454 (423.17, 'Hz', 2) 

455 ``` 

456 

457 """ 

    # non-string input is returned as is, without a unit:
    if not isinstance(s, str):
        if isinstance(s, int):
            return s, '', 0
        if isinstance(s, float):
            return s, '', 5
        else:
            return None, '', 0
    n = len(s)   # index of first character after the number
    ip = n       # index of first digit behind the decimal point
    have_point = False
    for i, c in enumerate(s):
        if c == '.':
            if have_point:
                # a second decimal point ends the number:
                n = i
                break
            have_point = True
            ip = i + 1
        if c not in '0123456789.+-':
            n = i
            break
    if n == 0:
        # string does not start with a number:
        return None, s, 0
    v = float(s[:n]) if have_point else int(s[:n])
    u = s[n:].strip()
    nd = n - ip if n >= ip else 0
    return v, u, nd

484 

485 

486unit_prefixes = {'Deka': 1e1, 'deka': 1e1, 'Hekto': 1e2, 'hekto': 1e2, 

487 'kilo': 1e3, 'Kilo': 1e3, 'Mega': 1e6, 'mega': 1e6, 

488 'Giga': 1e9, 'giga': 1e9, 'Tera': 1e12, 'tera': 1e12, 

489 'Peta': 1e15, 'peta': 1e15, 'Exa': 1e18, 'exa': 1e18, 

490 'Dezi': 1e-1, 'dezi': 1e-1, 'Zenti': 1e-2, 'centi': 1e-2, 

491 'Milli': 1e-3, 'milli': 1e-3, 'Micro': 1e-6, 'micro': 1e-6, 

492 'Nano': 1e-9, 'nano': 1e-9, 'Piko': 1e-12, 'piko': 1e-12, 

493 'Femto': 1e-15, 'femto': 1e-15, 'Atto': 1e-18, 'atto': 1e-18, 

494 'da': 1e1, 'h': 1e2, 'K': 1e3, 'k': 1e3, 'M': 1e6, 

495 'G': 1e9, 'T': 1e12, 'P': 1e15, 'E': 1e18, 

496 'd': 1e-1, 'c': 1e-2, 'mu': 1e-6, 'u': 1e-6, 'm': 1e-3, 

497 'n': 1e-9, 'p': 1e-12, 'f': 1e-15, 'a': 1e-18} 

498""" SI prefixes for units with corresponding factors. """ 

499 

500 

501def change_unit(val, old_unit, new_unit): 

502 """Scale numerical value to a new unit. 

503 

504 Adapted from https://github.com/relacs/relacs/blob/1facade622a80e9f51dbf8e6f8171ac74c27f100/options/src/parameter.cc#L1647-L1703 

505 

506 Parameters 

507 ---------- 

508 val: float 

509 Value given in `old_unit`. 

510 old_unit: str 

511 Unit of `val`. 

512 new_unit: str 

513 Requested unit of return value. 

514 

515 Returns 

516 ------- 

517 new_val: float 

518 The input value `val` scaled to `new_unit`. 

519 

520 Examples 

521 -------- 

522 

523 ``` 

524 >>> from audioio import change_unit 

525 >>> change_unit(5, 'mm', 'cm') 

526 0.5 

527 

    >>> change_unit(5, '', 'cm')
    5

    >>> change_unit(5, 'mm', '')
    5

533 

534 >>> change_unit(5, 'cm', 'mm') 

535 50.0 

536 

537 >>> change_unit(4, 'kg', 'g') 

538 4000.0 

539 

540 >>> change_unit(12, '%', '') 

541 0.12 

542 

543 >>> change_unit(1.24, '', '%') 

544 124.0 

545 

546 >>> change_unit(2.5, 'min', 's') 

547 150.0 

548 

549 >>> change_unit(3600, 's', 'h') 

550 1.0 

551 

552 ``` 

553 

554 """ 

    # missing unit?
    if not old_unit and not new_unit:
        return val
    if not old_unit and new_unit != '%':
        return val
    if not new_unit and old_unit != '%':
        return val

    # special units that directly translate into factors:
    unit_factors = {'%': 0.01, 'hour': 60.0*60.0, 'h': 60.0*60.0, 'min': 60.0}

    # parse old unit:
    f1 = 1.0
    if old_unit in unit_factors:
        f1 = unit_factors[old_unit]
    else:
        for k in unit_prefixes:
            if len(old_unit) > len(k) and old_unit[:len(k)] == k:
                f1 = unit_prefixes[k]

    # parse new unit:
    f2 = 1.0
    if new_unit in unit_factors:
        f2 = unit_factors[new_unit]
    else:
        for k in unit_prefixes:
            if len(new_unit) > len(k) and new_unit[:len(k)] == k:
                f2 = unit_prefixes[k]

    return val*f1/f2

585 

586 

587def find_key(metadata, key, sep='.'): 

588 """Find dictionary in metadata hierarchy containing the specified key. 

589 

590 Parameters 

591 ---------- 

592 metadata: nested dict 

593 Metadata. 

594 key: str 

595 Key to be searched for (case insensitive). 

596 May contain section names separated by `sep`, i.e. 

597 "aaa.bbb.ccc" searches "ccc" (can be key-value pair or section) 

598 in section "bbb" that needs to be a subsection of section "aaa". 

599 sep: str 

600 String that separates section names in `key`. 

601 

602 Returns 

603 ------- 

604 md: dict 

605 The innermost dictionary matching some sections of the search key. 

606 If `key` is not at all contained in the metadata, 

607 the top-level dictionary is returned. 

    key: str
        The part of the search key that was not found in `md`, or
        the final part of the search key, found in `md`.

611 

612 Examples 

613 -------- 

614 

615 Independent of whether found or not found, you can assign to the 

616 returned dictionary with the returned key. 

617 

618 ``` 

619 >>> from audioio import print_metadata, find_key 

620 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(ff=5)), gggg=dict(hhh=6)) 

621 >>> print_metadata(md) 

622 aaaa: 2 

623 bbbb: 

624 ccc: 3 

625 ddd: 4 

626 eee: 

627 ff: 5 

628 gggg: 

629 hhh: 6 

630 

631 >>> m, k = find_key(md, 'bbbb.ddd') 

632 >>> m[k] = 10 

633 >>> print_metadata(md) 

634 aaaa: 2 

635 bbbb: 

636 ccc: 3 

637 ddd: 10 

638 ... 

639 

640 >>> m, k = find_key(md, 'hhh') 

641 >>> m[k] = 12 

642 >>> print_metadata(md) 

643 ... 

644 gggg: 

645 hhh: 12 

646 

647 >>> m, k = find_key(md, 'bbbb.eee.xx') 

648 >>> m[k] = 42 

649 >>> print_metadata(md) 

650 ... 

651 eee: 

652 ff: 5 

653 xx: 42 

654 ... 

655 ``` 

656 

    When searching for sections, the one containing the searched section
    is returned:

659 ```py 

660 >>> m, k = find_key(md, 'eee') 

661 >>> m[k]['yy'] = 46 

662 >>> print_metadata(md) 

663 ... 

664 eee: 

665 ff: 5 

666 xx: 42 

667 yy: 46 

668 ... 

669 ``` 

670 

671 """ 

672 def find_keys(metadata, keys): 

673 key = keys[0].strip().upper() 

674 for k in metadata: 

675 if k.upper() == key: 

676 if len(keys) == 1: 

677 # found key: 

678 return True, metadata, k 

679 elif isinstance(metadata[k], dict): 

680 # keep searching within the next section: 

681 return find_keys(metadata[k], keys[1:]) 

682 # search in subsections: 

683 for k in metadata: 

684 if isinstance(metadata[k], dict): 

685 found, mm, kk = find_keys(metadata[k], keys) 

686 if found: 

687 return True, mm, kk 

688 # nothing found: 

689 return False, metadata, sep.join(keys) 

690 

691 if not metadata: 

692 return {}, None 

693 ks = key.strip().split(sep) 

694 found, mm, kk = find_keys(metadata, ks) 

695 return mm, kk 

696 

697 

698def get_number_unit(metadata, keys, sep='.', default=None, 

699 default_unit='', remove=False): 

700 """Find a key in metadata and return its number and unit. 

701 

702 Parameters 

703 ---------- 

704 metadata: nested dict 

705 Metadata. 

706 keys: str or list of str 

707 Keys in the metadata to be searched for (case insensitive). 

708 Value of the first key found is returned. 

709 May contain section names separated by `sep`.  

710 See `audiometadata.find_key()` for details. 

711 sep: str 

712 String that separates section names in `key`. 

713 default: None, int, or float 

714 Returned value if `key` is not found or the value does 

715 not contain a number. 

716 default_unit: str 

717 Returned unit if `key` is not found or the key's value does 

718 not have a unit. 

719 remove: bool 

720 If `True`, remove the found key from `metadata`. 

721 

722 Returns 

723 ------- 

724 v: None, int, or float 

725 Value referenced by `key` as float. 

726 Without decimal point, an int is returned. 

        If none of the `keys` was found or
        the key's value does not contain a number,
        then `default` is returned.

730 u: str 

731 Corresponding unit. 

732 

733 Examples 

734 -------- 

735 

736 ``` 

737 >>> from audioio import get_number_unit 

738 >>> md = dict(aaaa='42', bbbb='42.3ms') 

739 

740 # integer: 

741 >>> get_number_unit(md, 'aaaa') 

742 (42, '') 

743 

744 # float with unit: 

745 >>> get_number_unit(md, 'bbbb') 

746 (42.3, 'ms') 

747 

748 # two keys: 

749 >>> get_number_unit(md, ['cccc', 'bbbb']) 

750 (42.3, 'ms') 

751 

752 # not found: 

753 >>> get_number_unit(md, 'cccc') 

754 (None, '') 

755 

756 # not found with default value: 

757 >>> get_number_unit(md, 'cccc', default=1.0, default_unit='a.u.') 

758 (1.0, 'a.u.') 

759 ``` 

760 

761 """ 

762 if not metadata: 

763 return default, default_unit 

764 if not isinstance(keys, (list, tuple, np.ndarray)): 

765 keys = (keys,) 

766 value = default 

767 unit = default_unit 

768 for key in keys: 

769 m, k = find_key(metadata, key, sep) 

770 if k in m: 

771 v, u, _ = parse_number(m[k]) 

772 if v is not None: 

773 if not u: 

774 u = default_unit 

775 if remove: 

776 del m[k] 

777 return v, u 

778 elif u and unit == default_unit: 

779 unit = u 

780 return value, unit 

781 

782 

783def get_number(metadata, unit, keys, sep='.', default=None, remove=False): 

784 """Find a key in metadata and return its value in a given unit. 

785 

786 Parameters 

787 ---------- 

788 metadata: nested dict 

789 Metadata. 

790 unit: str 

791 Unit in which to return numerical value referenced by one of the `keys`. 

792 keys: str or list of str 

793 Keys in the metadata to be searched for (case insensitive). 

794 Value of the first key found is returned. 

795 May contain section names separated by `sep`.  

796 See `audiometadata.find_key()` for details. 

797 sep: str 

798 String that separates section names in `key`. 

799 default: None, int, or float 

800 Returned value if `key` is not found or the value does 

801 not contain a number. 

802 remove: bool 

803 If `True`, remove the found key from `metadata`. 

804 

805 Returns 

806 ------- 

807 v: None or float 

808 Value referenced by `key` as float scaled to `unit`. 

        If none of the `keys` was found or
        the key's value does not contain a number,
        then `default` is returned.

812 

813 Examples 

814 -------- 

815 

816 ``` 

817 >>> from audioio import get_number 

818 >>> md = dict(aaaa='42', bbbb='42.3ms') 

819 

820 # milliseconds to seconds: 

821 >>> get_number(md, 's', 'bbbb') 

822 0.0423 

823 

824 # milliseconds to microseconds: 

825 >>> get_number(md, 'us', 'bbbb') 

826 42300.0 

827 

    # value without unit is not scaled:
    >>> get_number(md, 'Hz', 'aaaa')
    42.0

831 

832 # two keys: 

833 >>> get_number(md, 's', ['cccc', 'bbbb']) 

834 0.0423 

835 

836 # not found: 

837 >>> get_number(md, 's', 'cccc') 

838 None 

839 

840 # not found with default value: 

841 >>> get_number(md, 's', 'cccc', default=1.0) 

842 1.0 

843 ``` 

844 

845 """ 

846 v, u = get_number_unit(metadata, keys, sep, None, unit, remove) 

847 if v is None: 

848 return default 

849 else: 

850 return change_unit(v, u, unit) 

851 

852 

853def get_int(metadata, keys, sep='.', default=None, remove=False): 

854 """Find a key in metadata and return its integer value. 

855 

856 Parameters 

857 ---------- 

858 metadata: nested dict 

859 Metadata. 

860 keys: str or list of str 

861 Keys in the metadata to be searched for (case insensitive). 

862 Value of the first key found is returned. 

863 May contain section names separated by `sep`.  

864 See `audiometadata.find_key()` for details. 

865 sep: str 

866 String that separates section names in `key`. 

867 default: None or int 

868 Return value if `key` is not found or the value does 

869 not contain an integer. 

870 remove: bool 

871 If `True`, remove the found key from `metadata`. 

872 

873 Returns 

874 ------- 

875 v: None or int 

876 Value referenced by `key` as integer. 

877 If none of the `keys` was found, 

878 the key's value does not contain a number or represents 

879 a floating point value, then `default` is returned. 

880 

881 Examples 

882 -------- 

883 

884 ``` 

885 >>> from audioio import get_int 

886 >>> md = dict(aaaa='42', bbbb='42.3ms') 

887 

888 # integer: 

889 >>> get_int(md, 'aaaa') 

890 42 

891 

892 # two keys: 

893 >>> get_int(md, ['cccc', 'aaaa']) 

894 42 

895 

896 # float: 

897 >>> get_int(md, 'bbbb') 

898 None 

899 

900 # not found: 

901 >>> get_int(md, 'cccc') 

902 None 

903 

904 # not found with default value: 

905 >>> get_int(md, 'cccc', default=0) 

906 0 

907 ``` 

908 

909 """ 

910 if not metadata: 

911 return default 

912 if not isinstance(keys, (list, tuple, np.ndarray)): 

913 keys = (keys,) 

914 for key in keys: 

915 m, k = find_key(metadata, key, sep) 

916 if k in m: 

917 v, _, n = parse_number(m[k]) 

918 if v is not None and n == 0: 

919 if remove: 

920 del m[k] 

921 return int(v) 

922 return default 

923 

924 

925def get_bool(metadata, keys, sep='.', default=None, remove=False): 

926 """Find a key in metadata and return its boolean value. 

927 

928 Parameters 

929 ---------- 

930 metadata: nested dict 

931 Metadata. 

932 keys: str or list of str 

933 Keys in the metadata to be searched for (case insensitive). 

934 Value of the first key found is returned. 

935 May contain section names separated by `sep`.  

936 See `audiometadata.find_key()` for details. 

937 sep: str 

938 String that separates section names in `key`. 

939 default: None or bool 

940 Return value if `key` is not found or the value does 

941 not specify a boolean value. 

942 remove: bool 

943 If `True`, remove the found key from `metadata`. 

944 

945 Returns 

946 ------- 

    v: None or bool
        Value referenced by `key` as boolean.
        True if 'true', 't', 'yes', 'y' (case insensitive) or any number different from zero.
        False if 'false', 'f', 'no', 'n' (case insensitive) or any number equal to zero.
        If none of the `keys` was found or
        the key's value does not specify a boolean value,
        then `default` is returned.

954 

955 Examples 

956 -------- 

957 

958 ``` 

959 >>> from audioio import get_bool 

960 >>> md = dict(aaaa='TruE', bbbb='No', cccc=0, dddd=1, eeee=True, ffff='ui') 

961 

962 # case insensitive: 

963 >>> get_bool(md, 'aaaa') 

964 True 

965 

966 >>> get_bool(md, 'bbbb') 

967 False 

968 

969 >>> get_bool(md, 'cccc') 

970 False 

971 

972 >>> get_bool(md, 'dddd') 

973 True 

974 

975 >>> get_bool(md, 'eeee') 

976 True 

977 

    # value is not a boolean:

979 >>> get_bool(md, 'ffff') 

980 None 

981 

982 # two keys (string is preferred over number): 

983 >>> get_bool(md, ['cccc', 'aaaa']) 

984 True 

985 

986 # two keys (take first match): 

987 >>> get_bool(md, ['cccc', 'ffff']) 

988 False 

989 

990 # not found with default value: 

991 >>> get_bool(md, 'ffff', default=False) 

992 False 

993 ``` 

994 

995 """ 

    if not metadata:
        return default
    if not isinstance(keys, (list, tuple, np.ndarray)):
        keys = (keys,)
    val = default
    mv = None
    kv = None
    for key in keys:
        m, k = find_key(metadata, key, sep)
        if k in m and not isinstance(m[k], dict):
            vs = m[k]
            v, _, _ = parse_number(vs)
            if v is not None:
                # remember numerical value, but keep looking for a string:
                val = abs(v) > 1e-8
                mv = m
                kv = k
            elif isinstance(vs, str):
                if vs.upper() in ['TRUE', 'T', 'YES', 'Y']:
                    if remove:
                        del m[k]
                    return True
                if vs.upper() in ['FALSE', 'F', 'NO', 'N']:
                    if remove:
                        del m[k]
                    return False
    if mv is not None and kv is not None and remove:
        del mv[kv]
    return val

1024 

1025 

1026default_starttime_keys = [['DateTimeOriginal'], 

1027 ['OriginationDate', 'OriginationTime'], 

1028 ['Location_Time'], 

1029 ['Timestamp']] 

1030"""Default keys of times of start of the recording in metadata. 

1031Used by `get_datetime()` and `update_starttime()` functions. 

1032""" 

1033 

1034def get_datetime(metadata, keys=default_starttime_keys, 

1035 sep='.', default=None, remove=False): 

    """Find keys in metadata and return a datetime.

1037 

1038 Parameters 

1039 ---------- 

1040 metadata: nested dict 

1041 Metadata. 

    keys: tuple of str or list of tuple of str
        Datetimes can be stored in metadata as two separate key-value pairs,
        one for the date and one for the time, or as a single key-value pair
        holding a date-time value. This is why the keys need to be specified in
        tuples with one or two keys.
        Value of the first tuple of keys found is returned.
        Keys may contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
        You can modify the default keys via the `default_starttime_keys` list
        of the `audiometadata` module.
    sep: str
        String that separates section names in `key`.
    default: None or datetime
        Return value if none of the `keys` is found or the value is
        neither a datetime nor a string.

1057 remove: bool 

1058 If `True`, remove the found key from `metadata`. 

1059 

1060 Returns 

1061 ------- 

1062 v: None or datetime 

1063 Datetime referenced by `keys`. 

1064 If none of the `keys` was found, then `default` is returned. 

1065 

1066 Examples 

1067 -------- 

1068 

1069 ``` 

1070 >>> from audioio import get_datetime 

1071 >>> import datetime as dt 

1072 >>> md = dict(date='2024-03-02', time='10:42:24', 

1073 datetime='2023-04-15T22:10:00') 

1074 

1075 # separate date and time: 

1076 >>> get_datetime(md, ('date', 'time')) 

1077 datetime.datetime(2024, 3, 2, 10, 42, 24) 

1078 

1079 # single datetime: 

1080 >>> get_datetime(md, ('datetime',)) 

1081 datetime.datetime(2023, 4, 15, 22, 10) 

1082 

1083 # two alternative key tuples: 

1084 >>> get_datetime(md, [('aaaa',), ('date', 'time')]) 

1085 datetime.datetime(2024, 3, 2, 10, 42, 24) 

1086 

1087 # not found: 

1088 >>> get_datetime(md, ('cccc',)) 

1089 None 

1090 

1091 # not found with default value: 

1092 >>> get_datetime(md, ('cccc', 'dddd'), 

1093 default=dt.datetime(2022, 2, 22, 22, 2, 12)) 

1094 datetime.datetime(2022, 2, 22, 22, 2, 12) 

1095 ``` 

1096 

1097 """ 

1098 if not metadata: 

1099 return default 

1100 if len(keys) > 0 and isinstance(keys[0], str): 

1101 keys = (keys,) 

1102 for keyp in keys: 

1103 if len(keyp) == 1: 

1104 m, k = find_key(metadata, keyp[0], sep) 

1105 if k in m: 

1106 v = m[k] 

1107 if isinstance(v, dt.datetime): 

1108 if remove: 

1109 del m[k] 

1110 return v 

1111 elif isinstance(v, str): 

1112 if remove: 

1113 del m[k] 

1114 return dt.datetime.fromisoformat(v) 

1115 else: 

1116 md, kd = find_key(metadata, keyp[0], sep) 

1117 if not kd in md: 

1118 continue 

1119 if isinstance(md[kd], dt.date): 

1120 date = md[kd] 

1121 elif isinstance(md[kd], str): 

1122 date = dt.date.fromisoformat(md[kd]) 

1123 else: 

1124 continue 

1125 mt, kt = find_key(metadata, keyp[1], sep) 

1126 if not kt in mt: 

1127 continue 

1128 if isinstance(mt[kt], dt.time): 

1129 time = mt[kt] 

1130 elif isinstance(mt[kt], str): 

1131 time = dt.time.fromisoformat(mt[kt]) 

1132 else: 

1133 continue 

1134 if remove: 

1135 del md[kd] 

1136 del mt[kt] 

1137 return dt.datetime.combine(date, time) 

1138 return default 
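The two-key branch above amounts to parsing an ISO date and an ISO time separately and combining them. A minimal standalone sketch of that step, using only the standard library; the helper name `combine_date_time` is hypothetical and not part of `audioio`:

```python
import datetime as dt

def combine_date_time(date_value, time_value):
    """Combine a date and a time, each given as an object or an ISO string."""
    # accept ready-made objects as well as ISO-formatted strings:
    if isinstance(date_value, dt.date):
        date = date_value
    else:
        date = dt.date.fromisoformat(date_value)
    if isinstance(time_value, dt.time):
        time = time_value
    else:
        time = dt.time.fromisoformat(time_value)
    return dt.datetime.combine(date, time)

start = combine_date_time('2024-03-02', '10:42:24')
print(start.isoformat())
```

Strings that are not valid ISO dates or times raise a `ValueError` in this sketch, as `fromisoformat()` does in the listing above.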

1139 

1140 

1141def get_str(metadata, keys, sep='.', default=None, remove=False): 

1142 """Find a key in metadata and return its string value. 

1143 

1144 Parameters 

1145 ---------- 

1146 metadata: nested dict 

1147 Metadata. 

1148 keys: str or list of str 

1149 Keys in the metadata to be searched for (case insensitive). 

1150 Value of the first key found is returned. 

1151 May contain section names separated by `sep`.  

1152 See `audiometadata.find_key()` for details. 

1153 sep: str 

1154 String that separates section names in `key`. 

1155 default: None or str 

1156 Return value if `key` is not found or the value does 

1157 not contain a string. 

1158 remove: bool 

1159 If `True`, remove the found key from `metadata`. 

1160 

1161 Returns 

1162 ------- 

1163 v: None or str 

1164 String value referenced by `key`. 

1165 If none of the `keys` was found, then `default` is returned. 

1166 

1167 Examples 

1168 -------- 

1169 

1170 ``` 

1171 >>> from audioio import get_str 

1172 >>> md = dict(aaaa=42, bbbb='hello') 

1173 

1174 # string: 

1175 >>> get_str(md, 'bbbb') 

1176 'hello' 

1177 

1178 # int as str: 

1179 >>> get_str(md, 'aaaa') 

1180 '42' 

1181 

1182 # two keys: 

1183 >>> get_str(md, ['cccc', 'bbbb']) 

1184 'hello' 

1185 

1186 # not found: 

1187 >>> get_str(md, 'cccc') 

1188 None 

1189 

1190 # not found with default value: 

1191 >>> get_str(md, 'cccc', default='-') 

1192 '-' 

1193 ``` 

1194 

1195 """ 

1196 if not metadata: 

1197 return default 

1198 if not isinstance(keys, (list, tuple, np.ndarray)): 

1199 keys = (keys,) 

1200 for key in keys: 

1201 m, k = find_key(metadata, key, sep) 

1202 if k in m and not isinstance(m[k], dict): 

1203 v = m[k] 

1204 if remove: 

1205 del m[k] 

1206 return str(v) 

1207 return default 

1208 

1209 

1210def add_sections(metadata, sections, value=False, sep='.'): 

1211 """Add sections to metadata dictionary. 

1212 

1213 Parameters 

1214 ---------- 

1215 metadata: nested dict 

1216 Metadata. 

1217 sections: str

1218 Names of sections to be added to `metadata`.

1219 Section names are separated by `sep`.

1220 value: bool

1221 If True, then the last element in `sections` is a key for a value,

1222 not a section.

1223 sep: str

1224 String that separates section names in `sections`.

1225 

1226 Returns 

1227 ------- 

1228 md: dict 

1229 Dictionary of the last added section. 

1230 key: str 

1231 Last key. Only returned if `value` is set to `True`. 

1232 

1233 Examples 

1234 -------- 

1235 

1236 Add a section and a sub-section to the metadata: 

1237 ``` 

1238 >>> from audioio import print_metadata, add_sections 

1239 >>> md = dict() 

1240 >>> m = add_sections(md, 'Recording.Location') 

1241 >>> m['Country'] = 'Lummerland' 

1242 >>> print_metadata(md) 

1243 Recording: 

1244 Location: 

1245 Country: Lummerland 

1246 ``` 

1247 

1248 Add a section with a key-value pair: 

1249 ``` 

1250 >>> md = dict() 

1251 >>> m, k = add_sections(md, 'Recording.Location', True) 

1252 >>> m[k] = 'Lummerland' 

1253 >>> print_metadata(md) 

1254 Recording: 

1255 Location: Lummerland 

1256 ``` 

1257 

1258 Works well together with `find_key()`:

1259 ``` 

1260 >>> md = dict(Recording=dict()) 

1261 >>> m, k = find_key(md, 'Recording.Location.Country') 

1262 >>> m, k = add_sections(m, k, True) 

1263 >>> m[k] = 'Lummerland' 

1264 >>> print_metadata(md) 

1265 Recording: 

1266 Location: 

1267 Country: Lummerland 

1268 ``` 

1269 

1270 """ 

1271 mm = metadata 

1272 ks = sections.split(sep) 

1273 n = len(ks) 

1274 if value: 

1275 n -= 1 

1276 for k in ks[:n]: 

1277 if len(k) == 0: 

1278 continue 

1279 mm[k] = dict() 

1280 mm = mm[k] 

1281 if value: 

1282 return mm, ks[-1] 

1283 else: 

1284 return mm 
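As listed, `add_sections()` assigns a fresh dictionary to every section name, so calling it directly on a section that already exists replaces that section; the docstring example avoids this by calling `find_key()` first. The following is a sketch of a non-overwriting variant, not the library's implementation; the name `ensure_sections` is hypothetical:

```python
def ensure_sections(metadata, sections, sep='.'):
    """Walk or create nested sections without replacing existing ones."""
    mm = metadata
    for name in sections.split(sep):
        if len(name) == 0:
            continue  # skip empty names, e.g. from a leading separator
        # setdefault() creates the section only if it is missing:
        mm = mm.setdefault(name, {})
    return mm

md = {'Recording': {'Time': 'early'}}
ensure_sections(md, 'Recording.Location')['Country'] = 'Lummerland'
print(md)
```

The sketch assumes that every existing value along the path is a dictionary; a plain value stored under a section name would be returned by `setdefault()` and break the next step.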

1285 

1286 

1287def strlist_to_dict(mds): 

1288 """Convert list of key-value-pair strings to dictionary. 

1289 

1290 Parameters 

1291 ---------- 

1292 mds: None or dict or str or list of str 

1293 - None - returns empty dictionary. 

1294 - Flat dictionary - returned as is. 

1295 - String with key and value separated by '='. 

1296 - List of strings with keys and values separated by '='. 

1297 Keys may contain section names. 

1298 

1299 Returns 

1300 ------- 

1301 md_dict: dict 

1302 Flat dictionary with key-value pairs. 

1303 Keys may contain section names. 

1304 Values are strings, other types or dictionaries. 

1305 """ 

1306 if mds is None: 

1307 return {} 

1308 if isinstance(mds, dict): 

1309 return mds 

1310 if not isinstance(mds, (list, tuple, np.ndarray)): 

1311 mds = (mds,) 

1312 md_dict = {} 

1313 for md in mds: 

1314 k, v = md.split('=', 1)

1315 k = k.strip() 

1316 v = v.strip() 

1317 md_dict[k] = v 

1318 return md_dict 
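A standalone sketch of the 'key=value' parsing step, under the assumption that only the first '=' should separate key from value, so that values may themselves contain '='. The name `parse_pair` is hypothetical:

```python
def parse_pair(text):
    """Split 'key=value' at the first '=' and strip surrounding whitespace."""
    key, value = text.split('=', 1)
    return key.strip(), value.strip()

pairs = dict(parse_pair(s) for s in ['Artist = John Doe',
                                     'Recording.Comment=a=b'])
print(pairs)
```

A string without any '=' still raises a `ValueError` here, because the unpacking needs exactly two parts.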

1319 

1320 

1321def set_metadata(metadata, mds, sep='.'): 

1322 """Set values of existing metadata. 

1323 

1324 A value is updated only if its key is found in the metadata.

1325 

1326 Parameters 

1327 ---------- 

1328 metadata: nested dict 

1329 Metadata. 

1330 mds: dict or str or list of str 

1331 - Flat dictionary with key-value pairs for updating the metadata. 

1332 Values can be strings, other types or dictionaries. 

1333 - String with key and value separated by '='. 

1334 - List of strings with key and value separated by '='. 

1335 Keys may contain section names separated by `sep`. 

1336 sep: str 

1337 String that separates section names in the keys of `mds`.

1338 

1339 Examples 

1340 -------- 

1341 ``` 

1342 >>> from audioio import print_metadata, set_metadata 

1343 >>> md = dict(Recording=dict(Time='early')) 

1344 >>> print_metadata(md) 

1345 Recording: 

1346 Time: early 

1347 

1348 >>> set_metadata(md, {'Artist': 'John Doe', # key not in metadata: ignored

1349 'Recording.Time': 'late'}) # change value of existing key 

1350 >>> print_metadata(md) 

1351 Recording: 

1352 Time : late 

1353 ``` 

1354 

1355 See also 

1356 -------- 

1357 add_metadata() 

1358 strlist_to_dict() 

1359 

1360 """ 

1361 if metadata is None: 

1362 return 

1363 md_dict = strlist_to_dict(mds) 

1364 for k in md_dict: 

1365 mm, kk = find_key(metadata, k, sep) 

1366 if kk in mm: 

1367 mm[kk] = md_dict[k] 

1368 

1369 

1370def add_metadata(metadata, mds, sep='.'): 

1371 """Add or modify key-value pairs. 

1372 

1373 If a key does not exist, it is added to the metadata. 

1374 

1375 Parameters 

1376 ---------- 

1377 metadata: nested dict 

1378 Metadata. 

1379 mds: dict or str or list of str 

1380 - Flat dictionary with key-value pairs for updating the metadata. 

1381 Values can be strings, other types or dictionaries. 

1382 - String with key and value separated by '='. 

1383 - List of strings with key and value separated by '='. 

1384 Keys may contain section names separated by `sep`. 

1385 sep: str 

1386 String that separates section names in the keys of `mds`.

1387 

1388 Examples 

1389 -------- 

1390 ``` 

1391 >>> from audioio import print_metadata, add_metadata 

1392 >>> md = dict(Recording=dict(Time='early')) 

1393 >>> print_metadata(md) 

1394 Recording: 

1395 Time: early 

1396 

1397 >>> add_metadata(md, {'Artist': 'John Doe', # new key-value pair 

1398 'Recording.Time': 'late', # change value of existing key  

1399 'Recording.Quality': 'amazing', # new key-value pair in existing section 

1400 'Location.Country': 'Lummerland'}) # new key-value pair in new section

1401 >>> print_metadata(md) 

1402 Recording: 

1403 Time : late 

1404 Quality: amazing 

1405 Artist: John Doe 

1406 Location: 

1407 Country: Lummerland 

1408 ``` 

1409 

1410 See also 

1411 -------- 

1412 set_metadata() 

1413 strlist_to_dict() 

1414 

1415 """ 

1416 if metadata is None: 

1417 return 

1418 md_dict = strlist_to_dict(mds) 

1419 for k in md_dict: 

1420 mm, kk = find_key(metadata, k, sep) 

1421 mm, kk = add_sections(mm, kk, True, sep) 

1422 mm[kk] = md_dict[k] 

1423 

1424 

1425 

1426def move_metadata(src_md, dest_md, keys, new_key=None, sep='.'): 

1427 """Remove a key from metadata and add it to a dictionary. 

1428 

1429 Parameters 

1430 ---------- 

1431 src_md: nested dict 

1432 Metadata from which a key is removed. 

1433 dest_md: dict 

1434 Dictionary to which the found key and its value are added. 

1435 keys: str or list of str 

1436 List of keys to be searched for in `src_md`. 

1437 Move the first one found to `dest_md`. 

1438 See the `audiometadata.find_key()` function for details. 

1439 new_key: None or str 

1440 If specified add the value of the found key as `new_key` to 

1441 `dest_md`. Otherwise, use the search key. 

1442 sep: str 

1443 String that separates section names in `keys`. 

1444 

1445 Returns 

1446 ------- 

1447 moved: bool 

1448 `True` if key was found and moved to dictionary. 

1449  

1450 Examples 

1451 -------- 

1452 ``` 

1453 >>> from audioio import print_metadata, move_metadata 

1454 >>> md = dict(Artist='John Doe', Recording=dict(Gain='1.42mV')) 

1455 >>> move_metadata(md, md['Recording'], 'Artist', 'Experimentalist') 

1456 >>> print_metadata(md) 

1457 Recording: 

1458 Gain : 1.42mV 

1459 Experimentalist: John Doe 

1460 ``` 

1461  

1462 """ 

1463 if not src_md: 

1464 return False 

1465 if not isinstance(keys, (list, tuple, np.ndarray)): 

1466 keys = (keys,) 

1467 for key in keys: 

1468 m, k = find_key(src_md, key, sep) 

1469 if k in m: 

1470 dest_key = new_key if new_key else k 

1471 dest_md[dest_key] = m.pop(k) 

1472 return True 

1473 return False 

1474 

1475 

1476def remove_metadata(metadata, key_list, sep='.'): 

1477 """Remove key-value pairs or sections from metadata. 

1478 

1479 Parameters 

1480 ---------- 

1481 metadata: nested dict 

1482 Metadata. 

1483 key_list: str or list of str 

1484 List of keys to key-value pairs or sections to be removed 

1485 from the metadata. 

1486 sep: str 

1487 String that separates section names in the keys of `key_list`. 

1488 

1489 Examples 

1490 -------- 

1491 ``` 

1492 >>> from audioio import print_metadata, remove_metadata 

1493 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4)) 

1494 >>> remove_metadata(md, ('ccc',)) 

1495 >>> print_metadata(md) 

1496 aaaa: 2 

1497 bbbb: 

1498 ddd: 4 

1499 ``` 

1500 

1501 """ 

1502 if not metadata: 

1503 return 

1504 if not isinstance(key_list, (list, tuple, np.ndarray)): 

1505 key_list = (key_list,) 

1506 for k in key_list: 

1507 mm, kk = find_key(metadata, k, sep) 

1508 if kk in mm: 

1509 del mm[kk] 

1510 

1511 

1512def cleanup_metadata(metadata): 

1513 """Remove empty sections from metadata. 

1514 

1515 Parameters 

1516 ---------- 

1517 metadata: nested dict 

1518 Metadata. 

1519 

1520 Examples 

1521 -------- 

1522 ``` 

1523 >>> from audioio import print_metadata, cleanup_metadata 

1524 >>> md = dict(aaaa=2, bbbb=dict()) 

1525 >>> cleanup_metadata(md) 

1526 >>> print_metadata(md) 

1527 aaaa: 2 

1528 ``` 

1529 

1530 """ 

1531 if not metadata: 

1532 return 

1533 for k in list(metadata): 

1534 if isinstance(metadata[k], dict): 

1535 if len(metadata[k]) == 0: 

1536 del metadata[k] 

1537 else: 

1538 cleanup_metadata(metadata[k]) 

1539 

1540 

1541default_gain_keys = ['gain'] 

1542"""Default keys of gain settings in metadata. Used by `get_gain()` function. 

1543""" 

1544 

1545def get_gain(metadata, gain_key=default_gain_keys, sep='.', 

1546 default=None, default_unit='', remove=False): 

1547 """Get gain and unit from metadata. 

1548 

1549 Parameters 

1550 ---------- 

1551 metadata: nested dict 

1552 Metadata with key-value pairs. 

1553 gain_key: str or list of str 

1554 Key in the file's metadata that holds some gain information. 

1555 If found, the data will be multiplied with the gain, 

1556 and if available, the corresponding unit is returned. 

1557 See the `audiometadata.find_key()` function for details. 

1558 You can modify the default keys via the `default_gain_keys` list 

1559 of the `audiometadata` module. 

1560 sep: str 

1561 String that separates section names in `gain_key`. 

1562 default: None or float 

1563 Returned value if no valid gain was found in `metadata`. 

1564 default_unit: str 

1565 Returned unit if no valid gain was found in `metadata`. 

1566 remove: bool 

1567 If `True`, remove the found key from `metadata`. 

1568 

1569 Returns 

1570 ------- 

1571 fac: float 

1572 Gain factor. If no valid gain is found in the metadata, `default` is returned.

1573 unit: string

1574 Unit of the data if found in the metadata, otherwise `default_unit`.

1575 """ 

1576 v, u = get_number_unit(metadata, gain_key, sep, default, 

1577 default_unit, remove) 

1578 # fix some TeeGrid gains: 

1579 if len(u) >= 2 and u[-2:] == '/V': 

1580 u = u[:-2] 

1581 return v, u 

1582 

1583 

1584def update_gain(metadata, fac, gain_key=default_gain_keys, sep='.'): 

1585 """Update gain setting in metadata. 

1586 

1587 Searches for the first appearance of a gain key in the metadata 

1588 hierarchy. If found, divide the gain value by `fac`. 

1589 

1590 Parameters 

1591 ---------- 

1592 metadata: nested dict 

1593 Metadata to be updated. 

1594 fac: float 

1595 Factor that was used to scale the data. 

1596 gain_key: str or list of str 

1597 Key in the file's metadata that holds some gain information. 

1598 If found, the gain value is divided by `fac` and written back

1599 to the metadata, keeping its unit if it has one.

1600 See the `audiometadata.find_key()` function for details. 

1601 You can modify the default keys via the `default_gain_keys` list 

1602 of the `audiometadata` module. 

1603 sep: str 

1604 String that separates section names in `gain_key`. 

1605 

1606 Returns 

1607 ------- 

1608 done: bool 

1609 True if gain has been found and set. 

1610 

1611 

1612 Examples 

1613 -------- 

1614 

1615 ``` 

1616 >>> from audioio import print_metadata, update_gain 

1617 >>> md = dict(Artist='John Doe', Recording=dict(gain='1.4mV')) 

1618 >>> update_gain(md, 2) 

1619 >>> print_metadata(md) 

1620 Artist: John Doe 

1621 Recording: 

1622 gain: 0.70mV 

1623 ``` 

1624 

1625 """ 

1626 if not metadata: 

1627 return False 

1628 if not isinstance(gain_key, (list, tuple, np.ndarray)): 

1629 gain_key = (gain_key,) 

1630 for gk in gain_key: 

1631 m, k = find_key(metadata, gk, sep) 

1632 if k in m and not isinstance(m[k], dict): 

1633 vs = m[k] 

1634 if isinstance(vs, (int, float)): 

1635 m[k] = vs/fac 

1636 else: 

1637 v, u, n = parse_number(vs) 

1638 if not v is None: 

1639 # fix some TeeGrid gains: 

1640 if len(u) >= 2 and u[-2:] == '/V': 

1641 u = u[:-2] 

1642 m[k] = f'{v/fac:.{n+1}f}{u}' 

1643 return True 

1644 return False 
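The string branch of `update_gain()` divides the number by `fac` and formats the result with one more decimal than the original value. A self-contained sketch of that formatting step; it assumes that `parse_number()` returns the number of decimals as its third value (that function is not shown in this part of the listing), and the regular expression and the name `scale_gain` are hypothetical:

```python
import re

def scale_gain(gain, fac):
    """Divide a gain string such as '1.4mV' by fac, keeping the unit."""
    match = re.fullmatch(r'\s*([+-]?\d+(?:\.(\d*))?)\s*(.*)', gain)
    if match is None:
        return gain  # no leading number: leave the value untouched
    value = float(match.group(1))
    decimals = len(match.group(2) or '')  # digits after the decimal point
    unit = match.group(3)
    if unit.endswith('/V'):
        unit = unit[:-2]  # same '/V' clean-up as in the listing
    # one extra decimal, so that halving 1.4 does not lose precision:
    return f'{value/fac:.{decimals + 1}f}{unit}'

print(scale_gain('1.4mV', 2))
```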

1645 

1646 

1647default_timeref_keys = ['TimeReference'] 

1648"""Default keys of integer time references in metadata. 

1649Used by `update_starttime()` function. 

1650""" 

1651 

1652def update_starttime(metadata, deltat, rate, 

1653 time_keys=default_starttime_keys, 

1654 ref_keys=default_timeref_keys): 

1655 """Update start-of-recording times in metadata. 

1656 

1657 Add `deltat` to the `time_keys` and `ref_keys` fields in the metadata.

1658 

1659 Parameters 

1660 ---------- 

1661 metadata: nested dict 

1662 Metadata to be updated. 

1663 deltat: float 

1664 Time in seconds to be added to start times. 

1665 rate: float 

1666 Sampling rate of the data in Hertz. 

1667 time_keys: tuple of str or list of tuple of str 

1668 Keys to fields denoting calendar times, i.e. dates and times.

1669 Datetimes can be stored in metadata as two separate key-value pairs, 

1670 one for the date and one for the time. Or by a single key-value pair 

1671 for a date-time value. This is why the keys need to be specified in

1672 tuples with one or two keys. 

1673 Keys may contain section names separated by `sep`.  

1674 See `audiometadata.find_key()` for details. 

1675 You can modify the default time keys via the `default_starttime_keys` 

1676 list of the `audiometadata` module. 

1677 ref_keys: str or list of str 

1678 Keys to time references, i.e. integers in samples relative to

1679 a reference time. 

1680 Keys may contain section names separated by `sep`.  

1681 See `audiometadata.find_key()` for details. 

1682 You can modify the default reference keys via the 

1683 `default_timeref_keys` list of the `audiometadata` module. 

1684 

1685 Returns 

1686 ------- 

1687 success: bool 

1688 True if at least one time has been updated. 

1689 

1690 Example 

1691 ------- 

1692 ``` 

1693 >>> from audioio import print_metadata, update_starttime 

1694 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00', 

1695 OtherTime='2023-05-16T23:20:10', 

1696 BEXT=dict(OriginationDate='2024-03-02', 

1697 OriginationTime='10:42:24', 

1698 TimeReference=123456)) 

1699 >>> update_starttime(md, 4.2, 48000) 

1700 >>> print_metadata(md) 

1701 DateTimeOriginal: 2023-04-15T22:10:04 

1702 OtherTime : 2023-05-16T23:20:10 

1703 BEXT: 

1704 OriginationDate: 2024-03-02 

1705 OriginationTime: 10:42:28 

1706 TimeReference : 325056 

1707 ``` 

1708 

1709 """ 

1710 if not metadata: 

1711 return False 

1712 if not isinstance(deltat, dt.timedelta): 

1713 deltat = dt.timedelta(seconds=deltat) 

1714 success = False 

1715 if len(time_keys) > 0 and isinstance(time_keys[0], str): 

1716 time_keys = (time_keys,) 

1717 for key in time_keys: 

1718 if len(key) == 1: 

1719 # datetime: 

1720 m, k = find_key(metadata, key[0]) 

1721 if k in m and not isinstance(m[k], dict): 

1722 if isinstance(m[k], dt.datetime): 

1723 m[k] += deltat 

1724 else: 

1725 datetime = dt.datetime.fromisoformat(m[k]) + deltat 

1726 m[k] = datetime.isoformat(timespec='seconds') 

1727 success = True 

1728 else: 

1729 # separate date and time: 

1730 md, kd = find_key(metadata, key[0]) 

1731 if not kd in md or isinstance(md[kd], dict): 

1732 continue 

1733 if isinstance(md[kd], dt.date): 

1734 date = md[kd] 

1735 is_date = True 

1736 else: 

1737 date = dt.date.fromisoformat(md[kd]) 

1738 is_date = False 

1739 mt, kt = find_key(metadata, key[1]) 

1740 if not kt in mt or isinstance(mt[kt], dict): 

1741 continue 

1742 if isinstance(mt[kt], dt.time): 

1743 time = mt[kt] 

1744 is_time = True 

1745 else: 

1746 time = dt.time.fromisoformat(mt[kt]) 

1747 is_time = False 

1748 datetime = dt.datetime.combine(date, time) + deltat 

1749 md[kd] = datetime.date() if is_date else datetime.date().isoformat() 

1750 mt[kt] = datetime.time() if is_time else datetime.time().isoformat(timespec='seconds') 

1751 success = True 

1752 # time reference in samples: 

1753 if isinstance(ref_keys, str): 

1754 ref_keys = (ref_keys,) 

1755 for key in ref_keys: 

1756 m, k = find_key(metadata, key) 

1757 if k in m and not isinstance(m[k], dict): 

1758 is_int = isinstance(m[k], int) 

1759 tref = int(m[k]) 

1760 tref += int(np.round(deltat.total_seconds()*rate)) 

1761 m[k] = tref if is_int else f'{tref}' 

1762 success = True 

1763 return success 
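The arithmetic behind the docstring example can be checked by hand: date-time strings are shifted by `deltat` seconds, while the time reference is shifted by `deltat*rate` samples. A minimal standalone sketch with the standard library; the helper name `shift_start` is hypothetical:

```python
import datetime as dt

def shift_start(datetime_str, time_ref, deltat, rate):
    """Shift an ISO date-time string and a sample-based time reference."""
    delta = dt.timedelta(seconds=deltat)
    shifted = dt.datetime.fromisoformat(datetime_str) + delta
    # the time reference counts samples, so scale seconds by the rate:
    new_ref = time_ref + int(round(delta.total_seconds()*rate))
    # timespec='seconds' drops the fractional part of the seconds:
    return shifted.isoformat(timespec='seconds'), new_ref

print(shift_start('2023-04-15T22:10:00', 123456, 4.2, 48000))
```

With the values of the docstring example, 4.2 s at 48000 Hz correspond to 201600 samples, which gives the time reference 325056 shown there.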

1764 

1765 

1766def bext_history_str(encoding, rate, channels, text=None): 

1767 """ Assemble a string for the BEXT CodingHistory field. 

1768 

1769 Parameters 

1770 ---------- 

1771 encoding: str or None 

1772 Encoding of the data. 

1773 rate: int or float 

1774 Sampling rate in Hertz. 

1775 channels: int 

1776 Number of channels. 

1777 text: str or None 

1778 Optional free text. 

1779 

1780 Returns 

1781 ------- 

1782 s: str 

1783 String for the BEXT CodingHistory field, 

1784 something like "A=PCM_16,F=44100,W=16,M=stereo,T=cut out" 

1785 """ 

1786 codes = [] 

1787 bits = None 

1788 if encoding is not None: 

1789 if encoding[:3] == 'PCM': 

1790 bits = int(encoding[4:]) 

1791 encoding = 'PCM' 

1792 codes.append(f'A={encoding}') 

1793 codes.append(f'F={rate:.0f}') 

1794 if bits is not None: 

1795 codes.append(f'W={bits}') 

1796 mode = None 

1797 if channels == 1: 

1798 mode = 'mono' 

1799 elif channels == 2: 

1800 mode = 'stereo' 

1801 if mode is not None: 

1802 codes.append(f'M={mode}') 

1803 if text is not None: 

1804 codes.append(f'T={text.rstrip()}') 

1805 return ','.join(codes) 

1806 

1807 

1808default_history_keys = ['History', 

1809 'CodingHistory', 

1810 'BWF_CODING_HISTORY'] 

1811"""Default keys of strings describing coding history in metadata. 

1812Used by `add_history()` function. 

1813""" 

1814 

1815def add_history(metadata, history, new_key=None, pre_history=None, 

1816 history_keys=default_history_keys, sep='.'): 

1817 """Add a string describing coding history to metadata. 

1818  

1819 Add `history` to the `history_keys` fields in the metadata. If 

1820 none of these fields are present but `new_key` is specified, then 

1821 assign `pre_history` and `history` to this key. If this key does 

1822 not exist in the metadata, it is created. 

1823 

1824 Parameters 

1825 ---------- 

1826 metadata: nested dict 

1827 Metadata to be updated. 

1828 history: str 

1829 String to be added to the history. 

1830 new_key: str or None 

1831 Sections and name of a history key to be added to `metadata`. 

1832 Section names are separated by `sep`. 

1833 pre_history: str or None 

1834 If a new key `new_key` is created, then assign this string followed 

1835 by `history`. 

1836 history_keys: str or list of str 

1837 Keys to fields where to add `history`. 

1838 Keys may contain section names separated by `sep`.  

1839 See `audiometadata.find_key()` for details. 

1840 You can modify the default history keys via the `default_history_keys` 

1841 list of the `audiometadata` module. 

1842 sep: str 

1843 String that separates section names in `new_key` and `history_keys`. 

1844 

1845 Returns 

1846 ------- 

1847 success: bool 

1848 True if the history string has been added to the metadata.

1849 

1850 Example 

1851 ------- 

1852 Add string to existing history key-value pair: 

1853 ``` 

1854 >>> from audioio import add_history 

1855 >>> md = dict(aaa='xyz', BEXT=dict(CodingHistory='original recordings')) 

1856 >>> add_history(md, 'just a snippet') 

1857 >>> print(md['BEXT']['CodingHistory']) 

1858 original recordings 

1859 just a snippet 

1860 ``` 

1861 

1862 Assign string to new key-value pair: 

1863 ``` 

1864 >>> md = dict(aaa='xyz', BEXT=dict(OriginationDate='2024-02-12')) 

1865 >>> add_history(md, 'just a snippet', 'BEXT.CodingHistory', 'original data') 

1866 >>> print(md['BEXT']['CodingHistory']) 

1867 original data 

1868 just a snippet 

1869 ``` 

1870 

1871 """ 

1872 if not metadata: 

1873 return False 

1874 if isinstance(history_keys, str): 

1875 history_keys = (history_keys,) 

1876 success = False 

1877 for keys in history_keys: 

1878 m, k = find_key(metadata, keys, sep)

1879 if k in m and not isinstance(m[k], dict): 

1880 s = m[k] 

1881 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r': 

1882 s += '\r\n' 

1883 s += history 

1884 m[k] = s 

1885 success = True 

1886 if not success and new_key: 

1887 m, k = find_key(metadata, new_key, sep) 

1888 m, k = add_sections(m, k, True, sep) 

1889 s = '' 

1890 if pre_history is not None: 

1891 s = pre_history 

1892 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r': 

1893 s += '\r\n' 

1894 s += history 

1895 m[k] = s 

1896 success = True 

1897 return success 
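Both branches of `add_history()` join history entries with a carriage return and line feed unless the existing text already ends with a line break. A standalone sketch of that joining rule; the name `append_history` is hypothetical:

```python
def append_history(old, entry):
    """Append entry to old, separated by CR/LF where needed."""
    # add a line break only if old is non-empty and lacks a trailing one:
    if len(old) >= 1 and old[-1] not in '\r\n':
        old += '\r\n'
    return old + entry

print(repr(append_history('original recordings', 'just a snippet')))
```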

1898 

1899 

1900def add_unwrap(metadata, thresh, clip=0, unit=''): 

1901 """Add unwrap information to metadata.

1902 

1903 If `audiotools.unwrap()` was applied to the data, then this 

1904 function adds the relevant settings to the metadata. If there is an INFO

1905 section in the metadata, the unwrap settings are added to this

1906 section, otherwise they are added to the top level of the metadata 

1907 hierarchy. 

1908 

1909 The threshold `thresh` used for unwrapping is saved under the key 

1910 'UnwrapThreshold' as a string. If `clip` is larger than zero, then 

1911 the clip level is saved under the key 'UnwrapClippedAmplitude' as 

1912 a string. 

1913 

1914 Parameters 

1915 ---------- 

1916 metadata: nested dict

1917 Metadata to be updated. 

1918 thresh: float 

1919 Threshold used for unwrapping. 

1920 clip: float 

1921 Level at which unwrapped data have been clipped. 

1922 unit: str 

1923 Unit of `thresh` and `clip`. 

1924 

1925 Examples 

1926 -------- 

1927 

1928 ``` 

1929 >>> from audioio import print_metadata, add_unwrap 

1930 >>> md = dict(INFO=dict(Time='early')) 

1931 >>> add_unwrap(md, 0.6, 1.0) 

1932 >>> print_metadata(md) 

1933 INFO: 

1934 Time : early 

1935 UnwrapThreshold : 0.60 

1936 UnwrapClippedAmplitude: 1.00 

1937 ``` 

1938 

1939 """ 

1940 if metadata is None: 

1941 return 

1942 md = metadata 

1943 for k in metadata: 

1944 if k.strip().upper() == 'INFO': 

1945 md = metadata[k]

1946 break 

1947 md['UnwrapThreshold'] = f'{thresh:.2f}{unit}' 

1948 if clip > 0: 

1949 md['UnwrapClippedAmplitude'] = f'{clip:.2f}{unit}' 

1950 

1951 

1952def demo(file_pathes, list_format, list_metadata, list_cues, list_chunks): 

1953 """Print metadata and markers of audio files. 

1954 

1955 Parameters 

1956 ---------- 

1957 file_pathes: list of str 

1958 Paths of audio files.

1959 list_format: bool 

1960 If True, list file format only. 

1961 list_metadata: bool 

1962 If True, list metadata only. 

1963 list_cues: bool 

1964 If True, list markers/cues only. 

1965 list_chunks: bool 

1966 If True, list all chunks contained in a riff/wave file. 

1967 """ 

1968 from .audioloader import AudioLoader 

1969 from .audiomarkers import print_markers 

1970 from .riffmetadata import read_chunk_tags 

1971 for filepath in file_pathes: 

1972 if len(file_pathes) > 1 and (list_cues or list_metadata or 

1973 list_format or list_chunks): 

1974 print(filepath) 

1975 if list_chunks: 

1976 chunks = read_chunk_tags(filepath) 

1977 print(f' {"chunk tag":10s} {"position":10s} {"size":10s}') 

1978 for tag in chunks: 

1979 pos = chunks[tag][0] - 8 

1980 size = chunks[tag][1] + 8 

1981 print(f' {tag:9s} {pos:10d} {size:10d}') 

1982 if len(file_pathes) > 1: 

1983 print() 

1984 continue 

1985 with AudioLoader(filepath, 1, 0, verbose=0) as sf: 

1986 fmt_md = sf.format_dict() 

1987 meta_data = sf.metadata() 

1988 locs, labels = sf.markers() 

1989 if list_cues: 

1990 if len(locs) > 0: 

1991 print_markers(locs, labels) 

1992 elif list_metadata: 

1993 print_metadata(meta_data, replace='.') 

1994 elif list_format: 

1995 print_metadata(fmt_md) 

1996 else: 

1997 print('file:') 

1998 print_metadata(fmt_md, ' ') 

1999 if len(meta_data) > 0: 

2000 print() 

2001 print('metadata:') 

2002 print_metadata(meta_data, ' ', replace='.') 

2003 if len(locs) > 0: 

2004 print() 

2005 print('markers:') 

2006 print_markers(locs, labels) 

2007 if len(file_pathes) > 1: 

2008 print() 

2009 if len(file_pathes) > 1: 

2010 print() 

2011 

2012 

2013def main(*cargs): 

2014 """Call demo with command line arguments. 

2015 

2016 Parameters 

2017 ---------- 

2018 cargs: list of strings 

2019 Command line arguments as provided by sys.argv[1:] 

2020 """ 

2021 # command line arguments: 

2022 parser = argparse.ArgumentParser(add_help=True, 

2023 description='Print metadata and markers of audio files.',

2024 epilog=f'version {__version__} by Benda-Lab (2020-{__year__})') 

2025 parser.add_argument('--version', action='version', version=__version__) 

2026 parser.add_argument('-f', dest='dataformat', action='store_true', 

2027 help='list file format only') 

2028 parser.add_argument('-m', dest='metadata', action='store_true', 

2029 help='list metadata only') 

2030 parser.add_argument('-c', dest='cues', action='store_true', 

2031 help='list cues/markers only') 

2032 parser.add_argument('-t', dest='chunks', action='store_true', 

2033 help='list tags of all riff/wave chunks contained in the file') 

2034 parser.add_argument('files', type=str, nargs='+', 

2035 help='audio file') 

2036 if len(cargs) == 0: 

2037 cargs = None 

2038 args = parser.parse_args(cargs) 

2039 

2040 demo(args.files, args.dataformat, args.metadata, args.cues, args.chunks) 

2041 

2042 

2043if __name__ == "__main__": 

2044 main(*sys.argv[1:])