
"""Working with metadata.

To interface the various ways metadata are stored in audio files, the
`audioio` package uses nested dictionaries. The keys are always
strings. Values are strings, integers, floats, datetimes, or other
types. Value strings can also be numbers followed by a unit,
e.g. "4.2mV". For defining subsections of key-value pairs, values can
be dictionaries. The dictionaries can be nested to arbitrary depth.

```txt
>>> from audioio import print_metadata
>>> md = dict(Recording=dict(Experimenter='John Doe',
                             DateTimeOriginal='2023-10-01T14:10:02',
                             Count=42),
              Hardware=dict(Amplifier='Teensy_Amp 4.1',
                            Highpass='10Hz',
                            Gain='120mV'))
>>> print_metadata(md)
Recording:
    Experimenter    : John Doe
    DateTimeOriginal: 2023-10-01T14:10:02
    Count           : 42
Hardware:
    Amplifier: Teensy_Amp 4.1
    Highpass : 10Hz
    Gain     : 120mV
```

Often, audio files have very specific ways to store metadata. You can
enforce the use of a specific one by putting the key-value pairs into a
dictionary that is added to the metadata under a key named after the
metadata type you want, e.g. the "INFO", "BEXT", "iXML", and "GUAN"
chunks of RIFF/WAVE files.
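
A minimal sketch of this convention, assuming the key-value pairs are
meant for the "INFO" chunk of a RIFF/WAVE file. The keys and values are
illustrative only; which keys a given chunk type accepts is not shown
here.

```python
# A generic section of metadata:
md = dict(Recording=dict(Experimenter='John Doe'))

# Key-value pairs intended for a specific chunk go into a
# sub-dictionary whose key is the name of that chunk:
md['INFO'] = dict(Gain='165.00mV',
                  DateTimeOriginal='2023-10-01T14:10:02')

for section, items in md.items():
    print(section, sorted(items))
```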

## Functions

The `audiometadata` module provides functions for handling and
manipulating these nested dictionaries. Many functions take keys as
arguments for finding or setting specific key-value pairs. These keys
can be the key of a specific item of a (sub-)dictionary, no matter on
which level of the metadata hierarchy it is. For example, simply
searching for "Highpass" retrieves the corresponding value "10Hz",
although "Highpass" is contained in the sub-dictionary (or "section")
with key "Hardware". The same item can also be specified together with
its parent keys: "Hardware.Highpass". Parent keys (or section keys)
are by default separated by '.', but all functions have a `sep`
keyword argument that specifies the string separating section names in
keys. Key matching is case insensitive.
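
The key matching described above can be sketched as follows. This is
only an illustration of the idea: `lookup` is a hypothetical helper
written for this example, not a function of the module.

```python
def lookup(md, key, sep='.'):
    # Hypothetical helper: return the value stored under `key`,
    # matching case-insensitively on any level of the hierarchy.
    parts = [p.strip().upper() for p in key.split(sep)]

    def search(d, parts):
        # try to match the first part on the current level:
        for k, v in d.items():
            if k.upper() == parts[0]:
                if len(parts) == 1:
                    return True, v
                if isinstance(v, dict):
                    return search(v, parts[1:])
        # not on this level, descend into the subsections:
        for v in d.values():
            if isinstance(v, dict):
                found, value = search(v, parts)
                if found:
                    return True, value
        return False, None

    return search(md, parts)[1]


md = dict(Recording=dict(Experimenter='John Doe'),
          Hardware=dict(Amplifier='Teensy_Amp 4.1', Highpass='10Hz'))
print(lookup(md, 'Highpass'))                    # 10Hz
print(lookup(md, 'hardware.highpass'))           # 10Hz
print(lookup(md, 'Hardware/Highpass', sep='/'))  # 10Hz
```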

Since the same items are named by many different keys in the different
types of metadata data models, the functions also take lists of keys
as arguments.

Do not forget that you can easily manipulate the metadata by means of
the standard functions of dictionaries.
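
For instance, the following sketch uses nothing but built-in `dict`
methods on a metadata dictionary. The section and key names are made
up for illustration.

```python
md = dict(Recording=dict(Experimenter='John Doe', Count=42),
          Hardware=dict(Highpass='10Hz', Gain='120mV'))

# change a value within a section:
md['Hardware']['Gain'] = '60mV'
# add a key-value pair, creating the section if it does not exist:
md.setdefault('Comments', {})['Note'] = 'test recording'
# remove a key-value pair and keep its value:
count = md['Recording'].pop('Count')
# iterate over the top-level sections:
for section, items in md.items():
    print(section, list(items))
```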

If you need to make a copy of the metadata, use `deepcopy`:
```
from copy import deepcopy
md_orig = deepcopy(md)
```
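
The reason for `deepcopy` is that a plain `dict(md)` or `md.copy()`
copies only the top level, so the sections are still shared with the
original. A small sketch with made-up values:

```python
from copy import deepcopy

md = dict(Hardware=dict(Gain='120mV'))
shallow = dict(md)       # copies the top level only
deep = deepcopy(md)      # copies all nested sections as well

md['Hardware']['Gain'] = '60mV'
print(shallow['Hardware']['Gain'])  # 60mV, section is shared with md
print(deep['Hardware']['Gain'])     # 120mV, independent copy
```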

### Output

Write nested dictionaries as texts:

- `write_metadata_text()`: write meta data into a text/yaml file.
- `print_metadata()`: write meta data to standard output.

### Flatten

Conversion between nested and flat dictionaries:

- `flatten_metadata()`: flatten hierarchical metadata to a single dictionary.
- `unflatten_metadata()`: unflatten a previously flattened metadata dictionary.
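
The idea behind this conversion can be sketched independently of the
module. `flatten` and `unflatten` below are hypothetical stand-ins
written for this example. They assume that no key contains the
separator, and empty sections do not survive the round trip.

```python
def flatten(md, sep='.', prefix=''):
    # Join section names and keys into a single flat dictionary.
    flat = {}
    for k, v in md.items():
        if isinstance(v, dict):
            flat.update(flatten(v, sep, prefix + k + sep))
        else:
            flat[prefix + k] = v
    return flat


def unflatten(flat, sep='.'):
    # Rebuild the sections from the separator-joined keys.
    md = {}
    for key, v in flat.items():
        *sections, last = key.split(sep)
        d = md
        for s in sections:
            d = d.setdefault(s, {})
        d[last] = v
    return md


md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
fmd = flatten(md)
print(list(fmd))
print(unflatten(fmd) == md)
```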

### Parse numbers with units

- `parse_number()`: parse string with number and unit.
- `change_unit()`: scale numerical value to a new unit.
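
Splitting a value string like "4.2mV" into number and unit can be
sketched with a regular expression. This is an alternative
illustration, not the parser used by the module;
`split_number_unit` is a hypothetical helper, and it assumes that the
number starts with at least one digit before an optional decimal point.

```python
import re

# optional sign, digits, optional decimal point with digits, then the unit:
PATTERN = re.compile('([-+]?[0-9]+[.]?[0-9]*)(.*)')


def split_number_unit(s):
    # Return (value, unit); value is None if `s` has no leading number.
    m = PATTERN.fullmatch(s.strip())
    if m is None:
        return None, s
    text, unit = m.groups()
    value = float(text) if '.' in text else int(text)
    return value, unit.strip()


for s in ['42', '42ms', '4.2mV', '423.17 Hz', 'abc']:
    print(s, split_number_unit(s))
```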

### Find and get values

Find keys and get their values parsed and converted to various types:

- `find_key()`: find dictionary in metadata hierarchy containing the specified key.
- `get_number_unit()`: find a key in metadata and return its number and unit.
- `get_number()`: find a key in metadata and return its value in a given unit.
- `get_int()`: find a key in metadata and return its integer value.
- `get_bool()`: find a key in metadata and return its boolean value.
- `get_datetime()`: find keys in metadata and return a datetime.
- `get_str()`: find a key in metadata and return its string value.

### Organize metadata

Add and remove metadata:

- `strlist_to_dict()`: convert list of key-value-pair strings to dictionary.
- `add_sections()`: add sections to metadata dictionary.
- `set_metadata()`: set values of existing metadata.
- `add_metadata()`: add or modify key-value pairs.
- `move_metadata()`: remove a key from metadata and add it to a dictionary.
- `remove_metadata()`: remove key-value pairs or sections from metadata.
- `cleanup_metadata()`: remove empty sections from metadata.

### Special metadata fields

Retrieve and set specific metadata:

- `get_gain()`: get gain and unit from metadata.
- `update_gain()`: update gain setting in metadata.
- `update_starttime()`: update start-of-recording times in metadata.
- `bext_history_str()`: assemble a string for the BEXT CodingHistory field.
- `add_history()`: add a string describing coding history to metadata.
- `add_unwrap()`: add unwrap infos to metadata.

Lists of standard keys:

- `default_starttime_keys`: keys of times of start of the recording.
- `default_timeref_keys`: keys of integer time references.
- `default_gain_keys`: keys of gain settings.
- `default_history_keys`: keys of strings describing coding history.


## Command line script

The module can be run as a script from the command line to display the
metadata and markers contained in an audio file:

```sh
> audiometadata logger.wav
```
prints
```text
file:
  filepath    : logger.wav
  samplingrate: 96000Hz
  channels    : 16
  frames      : 17280000
  duration    : 180.000s

metadata:
  INFO:
      Bits            : 32
      Pins            : 1-CH2R,1-CH2L,1-CH3R,1-CH3L,2-CH2R,2-CH2L,2-CH3R,2-CH3L,3-CH2R,3-CH2L,3-CH3R,3-CH3L,4-CH2R,4-CH2L,4-CH3R,4-CH3L
      Gain            : 165.00mV
      uCBoard         : Teensy 4.1
      MACAdress       : 04:e9:e5:15:3e:95
      DateTimeOriginal: 2023-10-01T14:10:02
      Software        : TeeGrid R4-senors-logger v1.0
```


Alternatively, the script can be run from the module as:
```
python -m src.audioio.audiometadata audiofile.wav
```


Running
```sh
audiometadata --help
```
prints
```text
usage: audiometadata [-h] [--version] [-f] [-m] [-c] [-t] files [files ...]

Convert audio file formats.

positional arguments:
  files       audio file

options:
  -h, --help  show this help message and exit
  --version   show program's version number and exit
  -f          list file format only
  -m          list metadata only
  -c          list cues/markers only
  -t          list tags of all riff/wave chunks contained in the file

version 2.0.0 by Benda-Lab (2020-2024)
```

"""

import sys
import argparse
import numpy as np
import datetime as dt
from .version import __version__, __year__


def write_metadata_text(fh, meta, prefix='', indent=4, replace=None):
    """Write meta data into a text/yaml file or stream.

    With the default parameters, the output is a valid yaml file.

    Parameters
    ----------
    fh: filename or stream
        If not a stream, the file with name `fh` is opened.
        Otherwise `fh` is used as a stream for writing.
    meta: nested dict
        Key-value pairs of metadata to be written into the file.
    prefix: str
        This string is written at the beginning of each line.
    indent: int
        Number of characters used for indentation of sections.
    replace: char or None
        If specified, replace special characters by this character.

    Examples
    --------
    ```
    from audioio import write_metadata_text
    md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)))
    write_metadata_text('info.txt', md)
    ```
    """

    def write_dict(df, md, level, smap):
        # width of the longest key that is not a section:
        w = 0
        for k in md:
            if not isinstance(md[k], dict) and w < len(k):
                w = len(k)
        for k in md:
            clevel = level*indent
            if isinstance(md[k], dict):
                df.write(f'{prefix}{"":>{clevel}}{k}:\n')
                write_dict(df, md[k], level+1, smap)
            else:
                value = md[k]
                if isinstance(value, (list, tuple)):
                    value = ', '.join([f'{v}' for v in value])
                else:
                    value = f'{value}'
                value = value.replace('\r\n', r'\n')
                value = value.replace('\n', r'\n')
                if len(smap) > 0:
                    value = value.translate(smap)
                df.write(f'{prefix}{"":>{clevel}}{k:<{w}}: {value}\n')

    if not meta:
        return
    if hasattr(fh, 'write'):
        own_file = False
    else:
        own_file = True
        fh = open(fh, 'w')
    smap = {}
    if replace:
        smap = str.maketrans('\r\n\t\x00', ''.join([replace]*4))
    write_dict(fh, meta, 0, smap)
    if own_file:
        fh.close()


def print_metadata(meta, prefix='', indent=4, replace=None):
    """Write meta data to standard output.

    Parameters
    ----------
    meta: nested dict
        Key-value pairs of metadata to be written into the file.
    prefix: str
        This string is written at the beginning of each line.
    indent: int
        Number of characters used for indentation of sections.
    replace: char or None
        If specified, replace special characters by this character.

    Examples
    --------
    ```
    >>> from audioio import print_metadata
    >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            hh: 5
    iiii:
        jjj: 6
    ```
    """
    write_metadata_text(sys.stdout, meta, prefix, indent, replace)


def flatten_metadata(md, keep_sections=False, sep='.'):
    """Flatten hierarchical metadata to a single dictionary.

    Parameters
    ----------
    md: nested dict
        Metadata as returned by `metadata()`.
    keep_sections: bool
        If `True`, then prefix keys with section names, separated by `sep`.
    sep: str
        String for separating section names.

    Returns
    -------
    d: dict
        Non-nested dict containing all key-value pairs of `md`.

    Examples
    --------
    ```
    >>> from audioio import print_metadata, flatten_metadata
    >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            hh: 5
    iiii:
        jjj: 6

    >>> fmd = flatten_metadata(md, keep_sections=True)
    >>> print_metadata(fmd)
    aaaa       : 2
    bbbb.ccc   : 3
    bbbb.ddd   : 4
    bbbb.eee.hh: 5
    iiii.jjj   : 6
    ```
    """
    def flatten(cd, section):
        df = {}
        for k in cd:
            if isinstance(cd[k], dict):
                df.update(flatten(cd[k], section + k + sep))
            else:
                if keep_sections:
                    df[section+k] = cd[k]
                else:
                    df[k] = cd[k]
        return df

    return flatten(md, '')


def unflatten_metadata(md, sep='.'):
    """Unflatten a previously flattened metadata dictionary.

    Parameters
    ----------
    md: dict
        Flat dictionary with key-value pairs as obtained from
        `flatten_metadata()` with `keep_sections=True`.
    sep: str
        String that separates section names.

    Returns
    -------
    d: nested dict
        Hierarchical dictionary with sub-dictionaries and key-value pairs.

    Examples
    --------
    ```
    >>> from audioio import print_metadata, unflatten_metadata
    >>> fmd = {'aaaa': 2, 'bbbb.ccc': 3, 'bbbb.ddd': 4, 'bbbb.eee.hh': 5, 'iiii.jjj': 6}
    >>> print_metadata(fmd)
    aaaa       : 2
    bbbb.ccc   : 3
    bbbb.ddd   : 4
    bbbb.eee.hh: 5
    iiii.jjj   : 6

    >>> md = unflatten_metadata(fmd)
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            hh: 5
    iiii:
        jjj: 6
    ```
    """
    umd = {}     # unflattened metadata
    cmd = [umd]  # current metadata dicts for each level of the hierarchy
    csk = []     # current section keys
    for k in md:
        ks = k.split(sep)
        # go up the hierarchy:
        for i in range(len(csk) - len(ks)):
            csk.pop()
            cmd.pop()
        for kss in reversed(ks[:len(csk)]):
            if kss == csk[-1]:
                break
            csk.pop()
            cmd.pop()
        # add new sections:
        for kss in ks[len(csk):-1]:
            csk.append(kss)
            cmd[-1][kss] = {}
            cmd.append(cmd[-1][kss])
        # add key-value pair:
        cmd[-1][ks[-1]] = md[k]
    return umd


def parse_number(s):
    """Parse string with number and unit.

    Parameters
    ----------
    s: str, float, or int
        String to be parsed. The initial part of the string is
        expected to be a number, the part following the number is
        interpreted as the unit. If float or int, then return this
        as the value with empty unit.

    Returns
    -------
    v: None, int, or float
        Value of the string as float. Without decimal point, an int is returned.
        If the string does not contain a number, None is returned.
    u: str
        Unit that follows the initial number.
    n: int
        Number of digits behind the decimal point.

    Examples
    --------

    ```
    >>> from audioio import parse_number

    # integer:
    >>> parse_number('42')
    (42, '', 0)

    # integer with unit:
    >>> parse_number('42ms')
    (42, 'ms', 0)

    # float with unit:
    >>> parse_number('42.ms')
    (42.0, 'ms', 0)

    # float with unit:
    >>> parse_number('42.3ms')
    (42.3, 'ms', 1)

    # float with space and unit:
    >>> parse_number('423.17 Hz')
    (423.17, 'Hz', 2)
    ```

    """
    if not isinstance(s, str):
        if isinstance(s, int):
            return s, '', 0
        if isinstance(s, float):
            return s, '', 5
        else:
            return None, '', 0
    n = len(s)
    ip = n
    have_point = False
    for i in range(len(s)):
        if s[i] == '.':
            if have_point:
                n = i
                break
            have_point = True
            ip = i + 1
        if s[i] not in '0123456789.+-':
            n = i
            break
    if n == 0:
        return None, s, 0
    v = float(s[:n]) if have_point else int(s[:n])
    u = s[n:].strip()
    nd = n - ip if n >= ip else 0
    return v, u, nd


unit_prefixes = {'Deka': 1e1, 'deka': 1e1, 'Hekto': 1e2, 'hekto': 1e2,
                 'kilo': 1e3, 'Kilo': 1e3, 'Mega': 1e6, 'mega': 1e6,
                 'Giga': 1e9, 'giga': 1e9, 'Tera': 1e12, 'tera': 1e12,
                 'Peta': 1e15, 'peta': 1e15, 'Exa': 1e18, 'exa': 1e18,
                 'Dezi': 1e-1, 'dezi': 1e-1, 'Zenti': 1e-2, 'centi': 1e-2,
                 'Milli': 1e-3, 'milli': 1e-3, 'Micro': 1e-6, 'micro': 1e-6,
                 'Nano': 1e-9, 'nano': 1e-9, 'Piko': 1e-12, 'piko': 1e-12,
                 'Femto': 1e-15, 'femto': 1e-15, 'Atto': 1e-18, 'atto': 1e-18,
                 'da': 1e1, 'h': 1e2, 'K': 1e3, 'k': 1e3, 'M': 1e6,
                 'G': 1e9, 'T': 1e12, 'P': 1e15, 'E': 1e18,
                 'd': 1e-1, 'c': 1e-2, 'mu': 1e-6, 'u': 1e-6, 'm': 1e-3,
                 'n': 1e-9, 'p': 1e-12, 'f': 1e-15, 'a': 1e-18}
""" SI prefixes for units with corresponding factors. """


def change_unit(val, old_unit, new_unit):
    """Scale numerical value to a new unit.

    Adapted from https://github.com/relacs/relacs/blob/1facade622a80e9f51dbf8e6f8171ac74c27f100/options/src/parameter.cc#L1647-L1703

    Parameters
    ----------
    val: float
        Value given in `old_unit`.
    old_unit: str
        Unit of `val`.
    new_unit: str
        Requested unit of return value.

    Returns
    -------
    new_val: float
        The input value `val` scaled to `new_unit`.

    Examples
    --------

    ```
    >>> from audioio import change_unit
    >>> change_unit(5, 'mm', 'cm')
    0.5

    >>> change_unit(5, '', 'cm')
    5.0

    >>> change_unit(5, 'mm', '')
    5.0

    >>> change_unit(5, 'cm', 'mm')
    50.0

    >>> change_unit(4, 'kg', 'g')
    4000.0

    >>> change_unit(12, '%', '')
    0.12

    >>> change_unit(1.24, '', '%')
    124.0

    >>> change_unit(2.5, 'min', 's')
    150.0

    >>> change_unit(3600, 's', 'h')
    1.0

    ```

    """
    # missing unit?
    if not old_unit and not new_unit:
        return val
    if not old_unit and new_unit != '%':
        return val
    if not new_unit and old_unit != '%':
        return val

    # special units that directly translate into factors:
    unit_factors = {'%': 0.01, 'hour': 60.0*60.0, 'h': 60.0*60.0, 'min': 60.0}

    # parse old unit:
    f1 = 1.0
    if old_unit in unit_factors:
        f1 = unit_factors[old_unit]
    else:
        for k in unit_prefixes:
            if len(old_unit) > len(k) and old_unit[:len(k)] == k:
                f1 = unit_prefixes[k]

    # parse new unit:
    f2 = 1.0
    if new_unit in unit_factors:
        f2 = unit_factors[new_unit]
    else:
        for k in unit_prefixes:
            if len(new_unit) > len(k) and new_unit[:len(k)] == k:
                f2 = unit_prefixes[k]

    return val*f1/f2


def find_key(metadata, key, sep='.'):
    """Find dictionary in metadata hierarchy containing the specified key.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    key: str
        Key to be searched for (case insensitive).
        May contain section names separated by `sep`, i.e.
        "aaa.bbb.ccc" searches "ccc" (can be key-value pair or section)
        in section "bbb" that needs to be a subsection of section "aaa".
    sep: str
        String that separates section names in `key`.

    Returns
    -------
    md: dict
        The innermost dictionary matching some sections of the search key.
        If `key` is not at all contained in the metadata,
        the top-level dictionary is returned.
    key: str
        The part of the search key that was not found in `md`, or
        the final part of the search key, found in `md`.

    Examples
    --------

    Independent of whether found or not found, you can assign to the
    returned dictionary with the returned key.

    ```
    >>> from audioio import print_metadata, find_key
    >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(ff=5)), gggg=dict(hhh=6))
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            ff: 5
    gggg:
        hhh: 6

    >>> m, k = find_key(md, 'bbbb.ddd')
    >>> m[k] = 10
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 10
    ...

    >>> m, k = find_key(md, 'hhh')
    >>> m[k] = 12
    >>> print_metadata(md)
    ...
    gggg:
        hhh: 12

    >>> m, k = find_key(md, 'bbbb.eee.xx')
    >>> m[k] = 42
    >>> print_metadata(md)
    ...
        eee:
            ff: 5
            xx: 42
    ...
    ```

    When searching for sections, the dictionary containing the searched
    section is returned:
    ```py
    >>> m, k = find_key(md, 'eee')
    >>> m[k]['yy'] = 46
    >>> print_metadata(md)
    ...
        eee:
            ff: 5
            xx: 42
            yy: 46
    ...
    ```

    """
    def find_keys(metadata, keys):
        key = keys[0].strip().upper()
        for k in metadata:
            if k.upper() == key:
                if len(keys) == 1:
                    # found key:
                    return True, metadata, k
                elif isinstance(metadata[k], dict):
                    # keep searching within the next section:
                    return find_keys(metadata[k], keys[1:])
        # search in subsections:
        for k in metadata:
            if isinstance(metadata[k], dict):
                found, mm, kk = find_keys(metadata[k], keys)
                if found:
                    return True, mm, kk
        # nothing found:
        return False, metadata, sep.join(keys)

    if not metadata:
        return {}, None
    ks = key.strip().split(sep)
    found, mm, kk = find_keys(metadata, ks)
    return mm, kk


def get_number_unit(metadata, keys, sep='.', default=None,
                    default_unit='', remove=False):
    """Find a key in metadata and return its number and unit.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `key`.
    default: None, int, or float
        Returned value if `key` is not found or the value does
        not contain a number.
    default_unit: str
        Returned unit if `key` is not found or the key's value does
        not have a unit.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None, int, or float
        Value referenced by `key` as float.
        Without decimal point, an int is returned.
        If none of the `keys` was found or
        the key's value does not contain a number,
        then `default` is returned.
    u: str
        Corresponding unit.

    Examples
    --------

    ```
    >>> from audioio import get_number_unit
    >>> md = dict(aaaa='42', bbbb='42.3ms')

    # integer:
    >>> get_number_unit(md, 'aaaa')
    (42, '')

    # float with unit:
    >>> get_number_unit(md, 'bbbb')
    (42.3, 'ms')

    # two keys:
    >>> get_number_unit(md, ['cccc', 'bbbb'])
    (42.3, 'ms')

    # not found:
    >>> get_number_unit(md, 'cccc')
    (None, '')

    # not found with default value:
    >>> get_number_unit(md, 'cccc', default=1.0, default_unit='a.u.')
    (1.0, 'a.u.')
    ```

    """
    if not metadata:
        return default, default_unit
    if not isinstance(keys, (list, tuple, np.ndarray)):
        keys = (keys,)
    value = default
    unit = default_unit
    for key in keys:
        m, k = find_key(metadata, key, sep)
        if k in m:
            v, u, _ = parse_number(m[k])
            if v is not None:
                if not u:
                    u = default_unit
                if remove:
                    del m[k]
                return v, u
            elif u and unit == default_unit:
                unit = u
    return value, unit


def get_number(metadata, unit, keys, sep='.', default=None, remove=False):
    """Find a key in metadata and return its value in a given unit.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    unit: str
        Unit in which to return numerical value referenced by one of the `keys`.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `key`.
    default: None, int, or float
        Returned value if `key` is not found or the value does
        not contain a number.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or float
        Value referenced by `key` as float scaled to `unit`.
        If none of the `keys` was found or
        the key's value does not contain a number,
        then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_number
    >>> md = dict(aaaa='42', bbbb='42.3ms')

    # milliseconds to seconds:
    >>> get_number(md, 's', 'bbbb')
    0.0423

    # milliseconds to microseconds:
    >>> get_number(md, 'us', 'bbbb')
    42300.0

    # value without unit is not scaled:
    >>> get_number(md, 'Hz', 'aaaa')
    42

    # two keys:
    >>> get_number(md, 's', ['cccc', 'bbbb'])
    0.0423

    # not found:
    >>> get_number(md, 's', 'cccc')
    None

    # not found with default value:
    >>> get_number(md, 's', 'cccc', default=1.0)
    1.0
    ```

    """
    v, u = get_number_unit(metadata, keys, sep, None, unit, remove)
    if v is None:
        return default
    else:
        return change_unit(v, u, unit)


def get_int(metadata, keys, sep='.', default=None, remove=False):
    """Find a key in metadata and return its integer value.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `key`.
    default: None or int
        Return value if `key` is not found or the value does
        not contain an integer.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or int
        Value referenced by `key` as integer.
        If none of the `keys` was found,
        the key's value does not contain a number or represents
        a floating point value, then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_int
    >>> md = dict(aaaa='42', bbbb='42.3ms')

    # integer:
    >>> get_int(md, 'aaaa')
    42

    # two keys:
    >>> get_int(md, ['cccc', 'aaaa'])
    42

    # float:
    >>> get_int(md, 'bbbb')
    None

    # not found:
    >>> get_int(md, 'cccc')
    None

    # not found with default value:
    >>> get_int(md, 'cccc', default=0)
    0
    ```

    """
    if not metadata:
        return default
    if not isinstance(keys, (list, tuple, np.ndarray)):
        keys = (keys,)
    for key in keys:
        m, k = find_key(metadata, key, sep)
        if k in m:
            v, _, n = parse_number(m[k])
            if v is not None and n == 0:
                if remove:
                    del m[k]
                return int(v)
    return default


def get_bool(metadata, keys, sep='.', default=None, remove=False):
    """Find a key in metadata and return its boolean value.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `key`.
    default: None or bool
        Return value if `key` is not found or the value does
        not specify a boolean value.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or bool
        Value referenced by `key` as boolean.
        True if 'true', 'yes' (case insensitive) or any nonzero number.
        False if 'false', 'no' (case insensitive) or any number equal to zero.
        If none of the `keys` was found or
        the key's value does not specify a boolean value,
        then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_bool
    >>> md = dict(aaaa='TruE', bbbb='No', cccc=0, dddd=1, eeee=True, ffff='ui')

    # case insensitive:
    >>> get_bool(md, 'aaaa')
    True

    >>> get_bool(md, 'bbbb')
    False

    >>> get_bool(md, 'cccc')
    False

    >>> get_bool(md, 'dddd')
    True

    >>> get_bool(md, 'eeee')
    True

    # not a boolean value:
    >>> get_bool(md, 'ffff')
    None

    # two keys (string is preferred over number):
    >>> get_bool(md, ['cccc', 'aaaa'])
    True

    # two keys (take first match):
    >>> get_bool(md, ['cccc', 'ffff'])
    False

    # not a boolean value, with default value:
    >>> get_bool(md, 'ffff', default=False)
    False
    ```

    """
    if not metadata:
        return default
    if not isinstance(keys, (list, tuple, np.ndarray)):
        keys = (keys,)
    val = default
    mv = None
    kv = None
    for key in keys:
        m, k = find_key(metadata, key, sep)
        if k in m and not isinstance(m[k], dict):
            vs = m[k]
            v, _, _ = parse_number(vs)
            if v is not None:
                # numbers are remembered, but a later string match wins:
                val = abs(v) > 1e-8
                mv = m
                kv = k
            elif isinstance(vs, str):
                if vs.upper() in ['TRUE', 'T', 'YES', 'Y']:
                    if remove:
                        del m[k]
                    return True
                if vs.upper() in ['FALSE', 'F', 'NO', 'N']:
                    if remove:
                        del m[k]
                    return False
    if mv is not None and kv is not None and remove:
        del mv[kv]
    return val


default_starttime_keys = [['DateTimeOriginal'],
                          ['OriginationDate', 'OriginationTime'],
                          ['Location_Time'],
                          ['Timestamp']]
"""Default keys of times of start of the recording in metadata.
Used by `get_datetime()` and `update_starttime()` functions.
"""

def get_datetime(metadata, keys=default_starttime_keys,
                 sep='.', default=None, remove=False):
    """Find keys in metadata and return a datetime.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: tuple of str or list of tuple of str
        Datetimes can be stored in metadata as two separate key-value pairs,
        one for the date and one for the time, or as a single key-value pair
        holding a date-time value. This is why the keys need to be specified
        in tuples with one or two keys.
        Value of the first tuple of keys found is returned.
        Keys may contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
        You can modify the default keys via the `default_starttime_keys` list
        of the `audiometadata` module.
    sep: str
        String that separates section names in `key`.
    default: None or datetime
        Return value if none of the `keys` is found or the values
        can not be interpreted as a date and time.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or datetime
        Datetime referenced by `keys`.
        If none of the `keys` was found, then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_datetime
    >>> import datetime as dt
    >>> md = dict(date='2024-03-02', time='10:42:24',
                  datetime='2023-04-15T22:10:00')

    # separate date and time:
    >>> get_datetime(md, ('date', 'time'))
    datetime.datetime(2024, 3, 2, 10, 42, 24)

    # single datetime:
    >>> get_datetime(md, ('datetime',))
    datetime.datetime(2023, 4, 15, 22, 10)

    # two alternative key tuples:
    >>> get_datetime(md, [('aaaa',), ('date', 'time')])
    datetime.datetime(2024, 3, 2, 10, 42, 24)

    # not found:
    >>> get_datetime(md, ('cccc',))
    None

    # not found with default value:
    >>> get_datetime(md, ('cccc', 'dddd'),
                     default=dt.datetime(2022, 2, 22, 22, 2, 12))
    datetime.datetime(2022, 2, 22, 22, 2, 12)
    ```

    """
    if not metadata:
        return default
    if len(keys) > 0 and isinstance(keys[0], str):
        keys = (keys,)
    for keyp in keys:
        if len(keyp) == 1:
            # a single key holding date and time:
            m, k = find_key(metadata, keyp[0], sep)
            if k in m:
                v = m[k]
                if isinstance(v, dt.datetime):
                    if remove:
                        del m[k]
                    return v
                elif isinstance(v, str):
                    if remove:
                        del m[k]
                    return dt.datetime.fromisoformat(v)
        else:
            # separate keys for date and time:
            md, kd = find_key(metadata, keyp[0], sep)
            if kd not in md:
                continue
            if isinstance(md[kd], dt.date):
                date = md[kd]
            elif isinstance(md[kd], str):
                date = dt.date.fromisoformat(md[kd])
            else:
                continue
            mt, kt = find_key(metadata, keyp[1], sep)
            if kt not in mt:
                continue
            if isinstance(mt[kt], dt.time):
                time = mt[kt]
            elif isinstance(mt[kt], str):
                time = dt.time.fromisoformat(mt[kt])
            else:
                continue
            if remove:
                del md[kd]
                del mt[kt]
            return dt.datetime.combine(date, time)
    return default


1141def get_str(metadata, keys, sep='.', default=None, remove=False):

1142 """Find a key in metadata and return its string value.

1143

1144 Parameters

1145 ----------

1146 metadata: nested dict

1147 Metadata.

1148 keys: str or list of str

1149 Keys in the metadata to be searched for (case insensitive).

1150 Value of the first key found is returned.

1151 May contain section names separated by `sep`.

1152 See `audiometadata.find_key()` for details.

1153 sep: str

1154 String that separates section names in `keys`.

1155 default: None or str

1156 Return value if none of the `keys` is found or the value

1157 is a dictionary.

1158 remove: bool

1159 If `True`, remove the found key from `metadata`.

1160

1161 Returns

1162 -------

1163 v: None or str

1164 Value of the first key found, converted to a string.

1165 If none of the `keys` was found, then `default` is returned.

1166

1167 Examples

1168 --------

1169

1170 ```

1171 >>> from audioio import get_str

1172 >>> md = dict(aaaa=42, bbbb='hello')

1173

1174 # string:

1175 >>> get_str(md, 'bbbb')

1176 'hello'

1177

1178 # int as str:

1179 >>> get_str(md, 'aaaa')

1180 '42'

1181

1182 # two keys:

1183 >>> get_str(md, ['cccc', 'bbbb'])

1184 'hello'

1185

1186 # not found:

1187 >>> get_str(md, 'cccc')

1188 None

1189

1190 # not found with default value:

1191 >>> get_str(md, 'cccc', default='-')

1192 '-'

1193 ```

1194

1195 """

1196 if not metadata:

1197 return default

1198 if not isinstance(keys, (list, tuple, np.ndarray)):

1199 keys = (keys,)

1200 for key in keys:

1201 m, k = find_key(metadata, key, sep)

1202 if k in m and not isinstance(m[k], dict):

1203 v = m[k]

1204 if remove:

1205 del m[k]

1206 return str(v)

1207 return default

1208

1209

1210def add_sections(metadata, sections, value=False, sep='.'):

1211 """Add sections to metadata dictionary.

1212

1213 Parameters

1214 ----------

1215 metadata: nested dict

1216 Metadata.

1217 sections: str

1218 Names of sections to be added to `metadata`.

1219 Section names separated by `sep`.

1220 value: bool

1221 If True, then the last element in `sections` is a key for a value,

1222 not a section.

1223 sep: str

1224 String that separates section names in `sections`.

1225

1226 Returns

1227 -------

1228 md: dict

1229 Dictionary of the last added section.

1230 key: str

1231 Last key. Only returned if `value` is set to `True`.

1232

1233 Examples

1234 --------

1235

1236 Add a section and a sub-section to the metadata:

1237 ```

1238 >>> from audioio import print_metadata, add_sections

1239 >>> md = dict()

1240 >>> m = add_sections(md, 'Recording.Location')

1241 >>> m['Country'] = 'Lummerland'

1242 >>> print_metadata(md)

1243 Recording:

1244 Location:

1245 Country: Lummerland

1246 ```

1247

1248 Add a section with a key-value pair:

1249 ```

1250 >>> md = dict()

1251 >>> m, k = add_sections(md, 'Recording.Location', True)

1252 >>> m[k] = 'Lummerland'

1253 >>> print_metadata(md)

1254 Recording:

1255 Location: Lummerland

1256 ```

1257

1258 Works well together with `find_key()`:

1259 ```

1260 >>> md = dict(Recording=dict())

1261 >>> m, k = find_key(md, 'Recording.Location.Country')

1262 >>> m, k = add_sections(m, k, True)

1263 >>> m[k] = 'Lummerland'

1264 >>> print_metadata(md)

1265 Recording:

1266 Location:

1267 Country: Lummerland

1268 ```

1269

1270 """

1271 mm = metadata

1272 ks = sections.split(sep)

1273 n = len(ks)

1274 if value:

1275 n -= 1

1276 for k in ks[:n]:

1277 if len(k) == 0:

1278 continue

1279 mm[k] = dict()

1280 mm = mm[k]

1281 if value:

1282 return mm, ks[-1]

1283 else:

1284 return mm

1285

1286

1287def strlist_to_dict(mds):

1288 """Convert list of key-value-pair strings to dictionary.

1289

1290 Parameters

1291 ----------

1292 mds: None or dict or str or list of str

1293 - None - returns empty dictionary.

1294 - Flat dictionary - returned as is.

1295 - String with key and value separated by '='.

1296 - List of strings with keys and values separated by '='.

1297 Keys may contain section names.

1298

1299 Returns

1300 -------

1301 md_dict: dict

1302 Flat dictionary with key-value pairs.

1303 Keys may contain section names.

1304 Values are strings, other types or dictionaries.
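
Examples
--------

The conversion of 'key=value' strings can be sketched independently of `audioio`. In the sketch below, `parse_pairs()` is a hypothetical stand-in for the string-parsing branch of this function, and splitting at the first '=' only is an assumption of the sketch:

```python
def parse_pairs(pairs):
    # hypothetical helper, not part of audioio
    md = {}
    for pair in pairs:
        # split at the first '=' so that values may contain '=' themselves:
        key, value = pair.split('=', 1)
        md[key.strip()] = value.strip()
    return md

md = parse_pairs(['Artist = John Doe', 'Recording.Time=late'])
print(md)
```

The result stays flat: section names remain part of the key and are resolved later by functions such as `set_metadata()` or `add_metadata()`.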

1305 """

1306 if mds is None:

1307 return {}

1308 if isinstance(mds, dict):

1309 return mds

1310 if not isinstance(mds, (list, tuple, np.ndarray)):

1311 mds = (mds,)

1312 md_dict = {}

1313 for md in mds:

1314 k, v = md.split('=', 1) # split at first '=' only, values may contain '='

1315 k = k.strip()

1316 v = v.strip()

1317 md_dict[k] = v

1318 return md_dict

1319

1320

1321def set_metadata(metadata, mds, sep='.'):

1322 """Set values of existing metadata.

1323

1324 The value of a key is updated only if the key is found in the metadata.

1325

1326 Parameters

1327 ----------

1328 metadata: nested dict

1329 Metadata.

1330 mds: dict or str or list of str

1331 - Flat dictionary with key-value pairs for updating the metadata.

1332 Values can be strings, other types or dictionaries.

1333 - String with key and value separated by '='.

1334 - List of strings with key and value separated by '='.

1335 Keys may contain section names separated by `sep`.

1336 sep: str

1337 String that separates section names in the keys of `mds`.

1338

1339 Examples

1340 --------

1341 ```

1342 >>> from audioio import print_metadata, set_metadata

1343 >>> md = dict(Recording=dict(Time='early'))

1344 >>> print_metadata(md)

1345 Recording:

1346 Time: early

1347

1348 >>> set_metadata(md, {'Artist': 'John Doe', # key not in metadata, ignored

1349 'Recording.Time': 'late'}) # change value of existing key

1350 >>> print_metadata(md)

1351 Recording:

1352 Time: late

1353 ```

1354

1355 See also

1356 --------

1357 add_metadata()

1358 strlist_to_dict()

1359

1360 """

1361 if metadata is None:

1362 return

1363 md_dict = strlist_to_dict(mds)

1364 for k in md_dict:

1365 mm, kk = find_key(metadata, k, sep)

1366 if kk in mm:

1367 mm[kk] = md_dict[k]

1368

1369

1370def add_metadata(metadata, mds, sep='.'):

1371 """Add or modify key-value pairs.

1372

1373 If a key does not exist, it is added to the metadata.

1374

1375 Parameters

1376 ----------

1377 metadata: nested dict

1378 Metadata.

1379 mds: dict or str or list of str

1380 - Flat dictionary with key-value pairs for updating the metadata.

1381 Values can be strings, other types or dictionaries.

1382 - String with key and value separated by '='.

1383 - List of strings with key and value separated by '='.

1384 Keys may contain section names separated by `sep`.

1385 sep: str

1386 String that separates section names in the keys of `mds`.

1387

1388 Examples

1389 --------

1390 ```

1391 >>> from audioio import print_metadata, add_metadata

1392 >>> md = dict(Recording=dict(Time='early'))

1393 >>> print_metadata(md)

1394 Recording:

1395 Time: early

1396

1397 >>> add_metadata(md, {'Artist': 'John Doe', # new key-value pair

1398 'Recording.Time': 'late', # change value of existing key

1399 'Recording.Quality': 'amazing', # new key-value pair in existing section

1400 'Location.Country': 'Lummerland'}) # new key-value pair in new section

1401 >>> print_metadata(md)

1402 Recording:

1403 Time : late

1404 Quality: amazing

1405 Artist: John Doe

1406 Location:

1407 Country: Lummerland

1408 ```

1409

1410 See also

1411 --------

1412 set_metadata()

1413 strlist_to_dict()

1414

1415 """

1416 if metadata is None:

1417 return

1418 md_dict = strlist_to_dict(mds)

1419 for k in md_dict:

1420 mm, kk = find_key(metadata, k, sep)

1421 mm, kk = add_sections(mm, kk, True, sep)

1422 mm[kk] = md_dict[k]

1423

1424

1425

1426def move_metadata(src_md, dest_md, keys, new_key=None, sep='.'):

1427 """Remove a key from metadata and add it to a dictionary.

1428

1429 Parameters

1430 ----------

1431 src_md: nested dict

1432 Metadata from which a key is removed.

1433 dest_md: dict

1434 Dictionary to which the found key and its value are added.

1435 keys: str or list of str

1436 List of keys to be searched for in `src_md`.

1437 Move the first one found to `dest_md`.

1438 See the `audiometadata.find_key()` function for details.

1439 new_key: None or str

1440 If specified add the value of the found key as `new_key` to

1441 `dest_md`. Otherwise, use the search key.

1442 sep: str

1443 String that separates section names in `keys`.

1444

1445 Returns

1446 -------

1447 moved: bool

1448 `True` if key was found and moved to dictionary.

1449

1450 Examples

1451 --------

1452 ```

1453 >>> from audioio import print_metadata, move_metadata

1454 >>> md = dict(Artist='John Doe', Recording=dict(Gain='1.42mV'))

1455 >>> move_metadata(md, md['Recording'], 'Artist', 'Experimentalist')

1456 >>> print_metadata(md)

1457 Recording:

1458 Gain : 1.42mV

1459 Experimentalist: John Doe

1460 ```

1461

1462 """

1463 if not src_md:

1464 return False

1465 if not isinstance(keys, (list, tuple, np.ndarray)):

1466 keys = (keys,)

1467 for key in keys:

1468 m, k = find_key(src_md, key, sep)

1469 if k in m:

1470 dest_key = new_key if new_key else k

1471 dest_md[dest_key] = m.pop(k)

1472 return True

1473 return False

1474

1475

1476def remove_metadata(metadata, key_list, sep='.'):

1477 """Remove key-value pairs or sections from metadata.

1478

1479 Parameters

1480 ----------

1481 metadata: nested dict

1482 Metadata.

1483 key_list: str or list of str

1484 List of keys to key-value pairs or sections to be removed

1485 from the metadata.

1486 sep: str

1487 String that separates section names in the keys of `key_list`.

1488

1489 Examples

1490 --------

1491 ```

1492 >>> from audioio import print_metadata, remove_metadata

1493 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4))

1494 >>> remove_metadata(md, ('ccc',))

1495 >>> print_metadata(md)

1496 aaaa: 2

1497 bbbb:

1498 ddd: 4

1499 ```

1500

1501 """

1502 if not metadata:

1503 return

1504 if not isinstance(key_list, (list, tuple, np.ndarray)):

1505 key_list = (key_list,)

1506 for k in key_list:

1507 mm, kk = find_key(metadata, k, sep)

1508 if kk in mm:

1509 del mm[kk]

1510

1511

1512def cleanup_metadata(metadata):

1513 """Remove empty sections from metadata.

1514

1515 Parameters

1516 ----------

1517 metadata: nested dict

1518 Metadata.

1519

1520 Examples

1521 --------

1522 ```

1523 >>> from audioio import print_metadata, cleanup_metadata

1524 >>> md = dict(aaaa=2, bbbb=dict())

1525 >>> cleanup_metadata(md)

1526 >>> print_metadata(md)

1527 aaaa: 2

1528 ```

1529

1530 """

1531 if not metadata:

1532 return

1533 for k in list(metadata):

1534 if isinstance(metadata[k], dict):

1535 if len(metadata[k]) == 0:

1536 del metadata[k]

1537 else:

1538 cleanup_metadata(metadata[k])

1539

1540

1541default_gain_keys = ['gain']

1542"""Default keys of gain settings in metadata. Used by `get_gain()` function.

1543"""

1544

1545def get_gain(metadata, gain_key=default_gain_keys, sep='.',

1546 default=None, default_unit='', remove=False):

1547 """Get gain and unit from metadata.

1548

1549 Parameters

1550 ----------

1551 metadata: nested dict

1552 Metadata with key-value pairs.

1553 gain_key: str or list of str

1554 Key in the file's metadata that holds some gain information.

1555 If found, the data will be multiplied with the gain,

1556 and if available, the corresponding unit is returned.

1557 See the `audiometadata.find_key()` function for details.

1558 You can modify the default keys via the `default_gain_keys` list

1559 of the `audiometadata` module.

1560 sep: str

1561 String that separates section names in `gain_key`.

1562 default: None or float

1563 Returned value if no valid gain was found in `metadata`.

1564 default_unit: str

1565 Returned unit if no valid gain was found in `metadata`.

1566 remove: bool

1567 If `True`, remove the found key from `metadata`.

1568

1569 Returns

1570 -------

1571 fac: None or float

1572 Gain factor. If not found in metadata, `default` is returned.

1573 unit: string

1574 Unit of the data if found in the metadata, otherwise `default_unit`.

1575 """

1576 v, u = get_number_unit(metadata, gain_key, sep, default,

1577 default_unit, remove)

1578 # fix some TeeGrid gains:

1579 if len(u) >= 2 and u[-2:] == '/V':

1580 u = u[:-2]

1581 return v, u

1582

1583

1584def update_gain(metadata, fac, gain_key=default_gain_keys, sep='.'):

1585 """Update gain setting in metadata.

1586

1587 Searches for the first appearance of a gain key in the metadata

1588 hierarchy. If found, divide the gain value by `fac`.

1589

1590 Parameters

1591 ----------

1592 metadata: nested dict

1593 Metadata to be updated.

1594 fac: float

1595 Factor that was used to scale the data.

1596 gain_key: str or list of str

1597 Key in the file's metadata that holds some gain information.

1598 If found, its value is divided by `fac` and written back

1599 to the metadata.

1600 See the `audiometadata.find_key()` function for details.

1601 You can modify the default keys via the `default_gain_keys` list

1602 of the `audiometadata` module.

1603 sep: str

1604 String that separates section names in `gain_key`.

1605

1606 Returns

1607 -------

1608 done: bool

1609 True if gain has been found and set.

1610

1611

1612 Examples

1613 --------

1614

1615 ```

1616 >>> from audioio import print_metadata, update_gain

1617 >>> md = dict(Artist='John Doe', Recording=dict(gain='1.4mV'))

1618 >>> update_gain(md, 2)

1619 >>> print_metadata(md)

1620 Artist: John Doe

1621 Recording:

1622 gain: 0.70mV

1623 ```
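
For string values, the rescaled gain is written with one more digit after the decimal point than the original value had. A minimal sketch of that formatting step; `scale_gain()` and its regular expression are hypothetical stand-ins for the `parse_number()` based code of this function:

```python
import re

def scale_gain(gain, fac):
    # hypothetical helper, not part of audioio:
    # split the string into a leading number and a trailing unit
    match = re.fullmatch(r'([-+]?\d*\.?\d+)\s*(.*)', gain.strip())
    if match is None:
        return gain          # no leading number, leave value untouched
    number, unit = match.groups()
    # number of digits after the decimal point of the original value:
    digits = len(number.partition('.')[2])
    # drop a trailing '/V', as done for some TeeGrid gains:
    if unit.endswith('/V'):
        unit = unit[:-2]
    return f'{float(number)/fac:.{digits + 1}f}{unit}'

print(scale_gain('1.4mV', 2))
```

The extra digit avoids losing precision when, for example, a gain of '1.4mV' is halved.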

1624

1625 """

1626 if not metadata:

1627 return False

1628 if not isinstance(gain_key, (list, tuple, np.ndarray)):

1629 gain_key = (gain_key,)

1630 for gk in gain_key:

1631 m, k = find_key(metadata, gk, sep)

1632 if k in m and not isinstance(m[k], dict):

1633 vs = m[k]

1634 if isinstance(vs, (int, float)):

1635 m[k] = vs/fac

1636 else:

1637 v, u, n = parse_number(vs)

1638 if not v is None:

1639 # fix some TeeGrid gains:

1640 if len(u) >= 2 and u[-2:] == '/V':

1641 u = u[:-2]

1642 m[k] = f'{v/fac:.{n+1}f}{u}'

1643 return True

1644 return False

1645

1646

1647default_timeref_keys = ['TimeReference']

1648"""Default keys of integer time references in metadata.

1649Used by `update_starttime()` function.

1650"""

1651

1652def update_starttime(metadata, deltat, rate,

1653 time_keys=default_starttime_keys,

1654 ref_keys=default_timeref_keys):

1655 """Update start-of-recording times in metadata.

1656

1657 Add `deltat` to `time_keys` and `ref_keys` fields in the metadata.

1658

1659 Parameters

1660 ----------

1661 metadata: nested dict

1662 Metadata to be updated.

1663 deltat: float

1664 Time in seconds to be added to start times.

1665 rate: float

1666 Sampling rate of the data in Hertz.

1667 time_keys: tuple of str or list of tuple of str

1668 Keys to fields denoting calendar times, i.e. dates and times.

1669 Datetimes can be stored in metadata as two separate key-value pairs,

1670 one for the date and one for the time, or as a single key-value pair

1671 holding a date-time value. This is why the keys need to be specified in

1672 tuples with one or two keys.

1673 Keys may contain section names separated by '.'.

1674 See `audiometadata.find_key()` for details.

1675 You can modify the default time keys via the `default_starttime_keys`

1676 list of the `audiometadata` module.

1677 ref_keys: str or list of str

1678 Keys to time references, i.e. integers in samples relative to

1679 a reference time.

1680 Keys may contain section names separated by '.'.

1681 See `audiometadata.find_key()` for details.

1682 You can modify the default reference keys via the

1683 `default_timeref_keys` list of the `audiometadata` module.

1684

1685 Returns

1686 -------

1687 success: bool

1688 True if at least one time has been updated.

1689

1690 Example

1691 -------

1692 ```

1693 >>> from audioio import print_metadata, update_starttime

1694 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00',

1695 OtherTime='2023-05-16T23:20:10',

1696 BEXT=dict(OriginationDate='2024-03-02',

1697 OriginationTime='10:42:24',

1698 TimeReference=123456))

1699 >>> update_starttime(md, 4.2, 48000)

1700 >>> print_metadata(md)

1701 DateTimeOriginal: 2023-04-15T22:10:04

1702 OtherTime : 2023-05-16T23:20:10

1703 BEXT:

1704 OriginationDate: 2024-03-02

1705 OriginationTime: 10:42:28

1706 TimeReference : 325056

1707 ```
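
The numbers in the example above can be checked with the standard library alone. A minimal sketch, assuming that the time reference counts samples and that string datetimes are written back with a resolution of seconds:

```python
import datetime as dt

deltat = dt.timedelta(seconds=4.2)
rate = 48000

# datetime stored as an ISO string, shifted by deltat:
start = dt.datetime.fromisoformat('2023-04-15T22:10:00') + deltat
start_str = start.isoformat(timespec='seconds')   # fractions are truncated
print(start_str)

# time reference in samples, shifted by deltat*rate:
tref = 123456 + int(round(deltat.total_seconds()*rate))
print(tref)
```

`OtherTime` is left untouched in the example because it is not among the default start-time keys.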

1708

1709 """

1710 if not metadata:

1711 return False

1712 if not isinstance(deltat, dt.timedelta):

1713 deltat = dt.timedelta(seconds=deltat)

1714 success = False

1715 if len(time_keys) > 0 and isinstance(time_keys[0], str):

1716 time_keys = (time_keys,)

1717 for key in time_keys:

1718 if len(key) == 1:

1719 # datetime:

1720 m, k = find_key(metadata, key[0])

1721 if k in m and not isinstance(m[k], dict):

1722 if isinstance(m[k], dt.datetime):

1723 m[k] += deltat

1724 else:

1725 datetime = dt.datetime.fromisoformat(m[k]) + deltat

1726 m[k] = datetime.isoformat(timespec='seconds')

1727 success = True

1728 else:

1729 # separate date and time:

1730 md, kd = find_key(metadata, key[0])

1731 if not kd in md or isinstance(md[kd], dict):

1732 continue

1733 if isinstance(md[kd], dt.date):

1734 date = md[kd]

1735 is_date = True

1736 else:

1737 date = dt.date.fromisoformat(md[kd])

1738 is_date = False

1739 mt, kt = find_key(metadata, key[1])

1740 if not kt in mt or isinstance(mt[kt], dict):

1741 continue

1742 if isinstance(mt[kt], dt.time):

1743 time = mt[kt]

1744 is_time = True

1745 else:

1746 time = dt.time.fromisoformat(mt[kt])

1747 is_time = False

1748 datetime = dt.datetime.combine(date, time) + deltat

1749 md[kd] = datetime.date() if is_date else datetime.date().isoformat()

1750 mt[kt] = datetime.time() if is_time else datetime.time().isoformat(timespec='seconds')

1751 success = True

1752 # time reference in samples:

1753 if isinstance(ref_keys, str):

1754 ref_keys = (ref_keys,)

1755 for key in ref_keys:

1756 m, k = find_key(metadata, key)

1757 if k in m and not isinstance(m[k], dict):

1758 is_int = isinstance(m[k], int)

1759 tref = int(m[k])

1760 tref += int(np.round(deltat.total_seconds()*rate))

1761 m[k] = tref if is_int else f'{tref}'

1762 success = True

1763 return success

1764

1765

1766def bext_history_str(encoding, rate, channels, text=None):

1767 """ Assemble a string for the BEXT CodingHistory field.

1768

1769 Parameters

1770 ----------

1771 encoding: str or None

1772 Encoding of the data.

1773 rate: int or float

1774 Sampling rate in Hertz.

1775 channels: int

1776 Number of channels.

1777 text: str or None

1778 Optional free text.

1779

1780 Returns

1781 -------

1782 s: str

1783 String for the BEXT CodingHistory field,

1784 something like "A=PCM,F=44100,W=16,M=stereo,T=cut out"
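
Examples
--------

How the fields are assembled can be traced step by step. The following standalone sketch mirrors the logic of this function for one set of illustrative input values; it does not call `audioio`:

```python
encoding, rate, channels, text = 'PCM_16', 44100.0, 2, 'cut out '

codes = []
bits = None
if encoding[:3] == 'PCM':
    bits = int(encoding[4:])       # 'PCM_16' -> 16
    encoding = 'PCM'
codes.append(f'A={encoding}')      # coding algorithm
codes.append(f'F={rate:.0f}')      # sampling rate without decimals
if bits is not None:
    codes.append(f'W={bits}')      # word length in bits
if channels in (1, 2):
    codes.append('M=' + ('mono' if channels == 1 else 'stereo'))
codes.append(f'T={text.rstrip()}') # free text without trailing whitespace
history = ','.join(codes)
print(history)
```

For more than two channels no `M=` field is written, since only 'mono' and 'stereo' are assigned.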

1785 """

1786 codes = []

1787 bits = None

1788 if encoding is not None:

1789 if encoding[:3] == 'PCM':

1790 bits = int(encoding[4:])

1791 encoding = 'PCM'

1792 codes.append(f'A={encoding}')

1793 codes.append(f'F={rate:.0f}')

1794 if bits is not None:

1795 codes.append(f'W={bits}')

1796 mode = None

1797 if channels == 1:

1798 mode = 'mono'

1799 elif channels == 2:

1800 mode = 'stereo'

1801 if mode is not None:

1802 codes.append(f'M={mode}')

1803 if text is not None:

1804 codes.append(f'T={text.rstrip()}')

1805 return ','.join(codes)

1806

1807

1808default_history_keys = ['History',

1809 'CodingHistory',

1810 'BWF_CODING_HISTORY']

1811"""Default keys of strings describing coding history in metadata.

1812Used by `add_history()` function.

1813"""

1814

1815def add_history(metadata, history, new_key=None, pre_history=None,

1816 history_keys=default_history_keys, sep='.'):

1817 """Add a string describing coding history to metadata.

1818

1819 Add `history` to the `history_keys` fields in the metadata. If

1820 none of these fields are present but `new_key` is specified, then

1821 assign `pre_history` and `history` to this key. If this key does

1822 not exist in the metadata, it is created.

1823

1824 Parameters

1825 ----------

1826 metadata: nested dict

1827 Metadata to be updated.

1828 history: str

1829 String to be added to the history.

1830 new_key: str or None

1831 Sections and name of a history key to be added to `metadata`.

1832 Section names are separated by `sep`.

1833 pre_history: str or None

1834 If a new key `new_key` is created, then assign this string followed

1835 by `history`.

1836 history_keys: str or list of str

1837 Keys to fields where to add `history`.

1838 Keys may contain section names separated by `sep`.

1839 See `audiometadata.find_key()` for details.

1840 You can modify the default history keys via the `default_history_keys`

1841 list of the `audiometadata` module.

1842 sep: str

1843 String that separates section names in `new_key` and `history_keys`.

1844

1845 Returns

1846 -------

1847 success: bool

1848 True if the history string has been added to the metadata.

1849

1850 Example

1851 -------

1852 Add string to existing history key-value pair:

1853 ```

1854 >>> from audioio import add_history

1855 >>> md = dict(aaa='xyz', BEXT=dict(CodingHistory='original recordings'))

1856 >>> add_history(md, 'just a snippet')

1857 >>> print(md['BEXT']['CodingHistory'])

1858 original recordings

1859 just a snippet

1860 ```

1861

1862 Assign string to new key-value pair:

1863 ```

1864 >>> md = dict(aaa='xyz', BEXT=dict(OriginationDate='2024-02-12'))

1865 >>> add_history(md, 'just a snippet', 'BEXT.CodingHistory', 'original data')

1866 >>> print(md['BEXT']['CodingHistory'])

1867 original data

1868 just a snippet

1869 ```
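
History entries are separated by a carriage return and line feed, which is inserted only if the existing text does not already end with a line break. A minimal sketch of this joining rule; `append_history()` is a hypothetical helper, not part of `audioio`:

```python
def append_history(previous, entry):
    # hypothetical helper mirroring the line-joining rule:
    # add a line break only if the existing text does not end with one
    if len(previous) >= 1 and previous[-1] not in '\r\n':
        previous += '\r\n'
    return previous + entry

s = append_history('original recordings', 'just a snippet')
print(repr(s))
```

An empty previous history gets no leading line break, so the first entry starts the field directly.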

1870

1871 """

1872 if not metadata:

1873 return False

1874 if isinstance(history_keys, str):

1875 history_keys = (history_keys,)

1876 success = False

1877 for keys in history_keys:

1878 m, k = find_key(metadata, keys, sep)

1879 if k in m and not isinstance(m[k], dict):

1880 s = m[k]

1881 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r':

1882 s += '\r\n'

1883 s += history

1884 m[k] = s

1885 success = True

1886 if not success and new_key:

1887 m, k = find_key(metadata, new_key, sep)

1888 m, k = add_sections(m, k, True, sep)

1889 s = ''

1890 if pre_history is not None:

1891 s = pre_history

1892 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r':

1893 s += '\r\n'

1894 s += history

1895 m[k] = s

1896 success = True

1897 return success

1898

1899

1900def add_unwrap(metadata, thresh, clip=0, unit=''):

1901 """Add unwrap information to metadata.

1902

1903 If `audiotools.unwrap()` was applied to the data, then this

1904 function adds the relevant information to the metadata. If there is an

1905 INFO section in the metadata, the unwrap information is added to this

1906 section, otherwise it is added to the top level of the metadata

1907 hierarchy.

1908

1909 The threshold `thresh` used for unwrapping is saved under the key

1910 'UnwrapThreshold' as a string. If `clip` is larger than zero, then

1911 the clip level is saved under the key 'UnwrapClippedAmplitude' as

1912 a string.

1913

1914 Parameters

1915 ----------

1916 metadata: nested dict

1917 Metadata to be updated.

1918 thresh: float

1919 Threshold used for unwrapping.

1920 clip: float

1921 Level at which unwrapped data have been clipped.

1922 unit: str

1923 Unit of `thresh` and `clip`.

1924

1925 Examples

1926 --------

1927

1928 ```

1929 >>> from audioio import print_metadata, add_unwrap

1930 >>> md = dict(INFO=dict(Time='early'))

1931 >>> add_unwrap(md, 0.6, 1.0)

1932 >>> print_metadata(md)

1933 INFO:

1934 Time : early

1935 UnwrapThreshold : 0.60

1936 UnwrapClippedAmplitude: 1.00

1937 ```

1938

1939 """

1940 if metadata is None:

1941 return

1942 md = metadata

1943 for k in metadata:

1944 if k.strip().upper() == 'INFO':

1945 md = metadata[k]

1946 break

1947 md['UnwrapThreshold'] = f'{thresh:.2f}{unit}'

1948 if clip > 0:

1949 md['UnwrapClippedAmplitude'] = f'{clip:.2f}{unit}'

1950

1951

1952def demo(file_pathes, list_format, list_metadata, list_cues, list_chunks):

1953 """Print metadata and markers of audio files.

1954

1955 Parameters

1956 ----------

1957 file_pathes: list of str

1958 Paths of audio files.

1959 list_format: bool

1960 If True, list file format only.

1961 list_metadata: bool

1962 If True, list metadata only.

1963 list_cues: bool

1964 If True, list markers/cues only.

1965 list_chunks: bool

1966 If True, list all chunks contained in a riff/wave file.

1967 """

1968 from .audioloader import AudioLoader

1969 from .audiomarkers import print_markers

1970 from .riffmetadata import read_chunk_tags

1971 for filepath in file_pathes:

1972 if len(file_pathes) > 1 and (list_cues or list_metadata or

1973 list_format or list_chunks):

1974 print(filepath)

1975 if list_chunks:

1976 chunks = read_chunk_tags(filepath)

1977 print(f' {"chunk tag":10s} {"position":10s} {"size":10s}')

1978 for tag in chunks:

1979 pos = chunks[tag][0] - 8

1980 size = chunks[tag][1] + 8

1981 print(f' {tag:9s} {pos:10d} {size:10d}')

1982 if len(file_pathes) > 1:

1983 print()

1984 continue

1985 with AudioLoader(filepath, 1, 0, verbose=0) as sf:

1986 fmt_md = sf.format_dict()

1987 meta_data = sf.metadata()

1988 locs, labels = sf.markers()

1989 if list_cues:

1990 if len(locs) > 0:

1991 print_markers(locs, labels)

1992 elif list_metadata:

1993 print_metadata(meta_data, replace='.')

1994 elif list_format:

1995 print_metadata(fmt_md)

1996 else:

1997 print('file:')

1998 print_metadata(fmt_md, ' ')

1999 if len(meta_data) > 0:

2000 print()

2001 print('metadata:')

2002 print_metadata(meta_data, ' ', replace='.')

2003 if len(locs) > 0:

2004 print()

2005 print('markers:')

2006 print_markers(locs, labels)

2007 if len(file_pathes) > 1:

2008 print()

2009 if len(file_pathes) > 1:

2010 print()

2011

2012

2013def main(*cargs):

2014 """Call demo with command line arguments.

2015

2016 Parameters

2017 ----------

2018 cargs: list of strings

2019 Command line arguments as provided by sys.argv[1:]

2020 """

2021 # command line arguments:

2022 parser = argparse.ArgumentParser(add_help=True,

2023 description='Print metadata and markers of audio files.',

2024 epilog=f'version {__version__} by Benda-Lab (2020-{__year__})')

2025 parser.add_argument('--version', action='version', version=__version__)

2026 parser.add_argument('-f', dest='dataformat', action='store_true',

2027 help='list file format only')

2028 parser.add_argument('-m', dest='metadata', action='store_true',

2029 help='list metadata only')

2030 parser.add_argument('-c', dest='cues', action='store_true',

2031 help='list cues/markers only')

2032 parser.add_argument('-t', dest='chunks', action='store_true',

2033 help='list tags of all riff/wave chunks contained in the file')

2034 parser.add_argument('files', type=str, nargs='+',

2035 help='audio file')

2036 if len(cargs) == 0:

2037 cargs = None

2038 args = parser.parse_args(cargs)

2039

2040 demo(args.files, args.dataformat, args.metadata, args.cues, args.chunks)

2041

2042

2043if __name__ == "__main__":

2044 main(*sys.argv[1:])