Coverage for src/audioio/audiometadata.py: 99%
556 statements
coverage.py v7.6.12, created at 2025-02-16 18:31 +0000
"""Working with metadata.

To interface the various ways metadata are stored in audio files, the
`audioio` package uses nested dictionaries. The keys are always
strings. Values are strings, integers, floats, datetimes, or other
types. Value strings can also be numbers followed by a unit,
e.g. "4.2mV". For defining subsections of key-value pairs, values can
be dictionaries. The dictionaries can be nested to arbitrary depth.

```py
>>> from audioio import print_metadata
>>> md = dict(Recording=dict(Experimenter='John Doe',
                             DateTimeOriginal='2023-10-01T14:10:02',
                             Count=42),
              Hardware=dict(Amplifier='Teensy_Amp 4.1',
                            Highpass='10Hz',
                            Gain='120mV'))
>>> print_metadata(md)
```
results in
```txt
Recording:
    Experimenter    : John Doe
    DateTimeOriginal: 2023-10-01T14:10:02
    Count           : 42
Hardware:
    Amplifier: Teensy_Amp 4.1
    Highpass : 10Hz
    Gain     : 120mV
```

Often, audio files have very specific ways to store metadata. You can
enforce using these by putting them into a dictionary that is added to
the metadata with a key having the name of the metadata type you want,
e.g. the "INFO", "BEXT", "iXML", and "GUAN" chunks of RIFF/WAVE files.

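
As a minimal sketch of this convention, the dictionary below places
format-specific fields under "INFO" and "BEXT" keys next to a generic
section. The chosen fields are illustrative assumptions (INAM and IART
are standard RIFF INFO tags, Originator is a standard BEXT field), not
a list of what `audioio` requires:

```python
# Generic section plus format-specific sections, keyed by chunk name.
md = dict(Recording=dict(Experimenter='John Doe'),
          INFO=dict(INAM='Night recording', IART='John Doe'),
          BEXT=dict(Originator='Teensy_Amp 4.1'))

# The format-specific sections are ordinary sub-dictionaries:
chunk_sections = [k for k in md if k in ('INFO', 'BEXT', 'iXML', 'GUAN')]
print(chunk_sections)    # ['INFO', 'BEXT']
```
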
## Functions

The `audiometadata` module provides functions for handling and
manipulating these nested dictionaries. Many functions take keys as
arguments for finding or setting specific key-value pairs. Such a key
can name an item of any (sub-)dictionary, regardless of the level of
the metadata hierarchy it is on. For example, simply
searching for "Highpass" retrieves the corresponding value "10Hz",
although "Highpass" is contained in the sub-dictionary (or "section")
with key "Hardware". The same item can also be specified together with
its parent keys: "Hardware.Highpass". Parent keys (or section keys)
are by default separated by '.', but these functions have a `sep`
keyword argument that specifies the string separating section names in
keys. Key matching is case insensitive.

Since the same item is stored under different keys in the different
metadata models, the functions also take lists of alternative keys
as arguments. The value of the first key found is used.

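
The key matching described above can be sketched in a few lines. The
helpers `lookup()` and `lookup_first()` are hypothetical names used
only for this illustration. They are not part of `audioio`, and the
module's own `find_key()` returns a dictionary and a key rather than a
value:

```python
def lookup(md, key, sep='.'):
    # Hypothetical helper, not part of audioio: return the value for
    # a case-insensitive, optionally section-qualified key, or None.
    parts = [p.strip().upper() for p in key.split(sep)]

    def search(d, parts):
        # try to match the first part on this level:
        for k, v in d.items():
            if k.upper() == parts[0]:
                if len(parts) == 1:
                    return True, v
                if isinstance(v, dict):
                    return search(v, parts[1:])
        # otherwise descend into all sub-dictionaries:
        for v in d.values():
            if isinstance(v, dict):
                found, value = search(v, parts)
                if found:
                    return True, value
        return False, None

    return search(md, parts)[1]


def lookup_first(md, keys, sep='.'):
    # Try a list of alternative keys and return the first value found.
    for key in keys:
        value = lookup(md, key, sep)
        if value is not None:
            return value
    return None


md = dict(Recording=dict(Experimenter='John Doe'),
          Hardware=dict(Amplifier='Teensy_Amp 4.1', Highpass='10Hz'))
print(lookup(md, 'highpass'))                    # 10Hz
print(lookup(md, 'Hardware.Highpass'))           # 10Hz
print(lookup_first(md, ['Gain', 'HIGHPASS']))    # 10Hz
```
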
Do not forget that you can easily manipulate the metadata by means of
the standard functions of dictionaries.

If you need to make a copy of the metadata use `deepcopy`:
```
from copy import deepcopy
md_orig = deepcopy(md)
```

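
The reason for `deepcopy` is that a plain `dict(md)` copies only the
top level, so nested sections stay shared with the original. A short
demonstration with a made-up metadata dictionary:

```python
from copy import deepcopy

md = dict(Hardware=dict(Gain='120mV'))
shallow = dict(md)       # copies only the top level
deep = deepcopy(md)      # copies all nested sections as well

md['Hardware']['Gain'] = '60mV'
print(shallow['Hardware']['Gain'])   # 60mV: section is shared with md
print(deep['Hardware']['Gain'])      # 120mV: independent copy
```
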
### Output

Write nested dictionaries as text:

- `write_metadata_text()`: write metadata into a text/yaml file.
- `print_metadata()`: write metadata to standard output.

### Flatten

Conversion between nested and flat dictionaries:

- `flatten_metadata()`: flatten hierarchical metadata to a single dictionary.
- `unflatten_metadata()`: unflatten a previously flattened metadata dictionary.

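
The round trip between nested and flat dictionaries can be sketched as
follows. `flatten()` and `unflatten()` are simplified, hypothetical
stand-ins written for this example. They assume that section names are
kept in the flat keys and that no section is empty:

```python
def flatten(md, sep='.', prefix=''):
    # Hypothetical stand-in for flattening with section names kept.
    flat = {}
    for k, v in md.items():
        if isinstance(v, dict):
            flat.update(flatten(v, sep, prefix + k + sep))
        else:
            flat[prefix + k] = v
    return flat


def unflatten(flat, sep='.'):
    # Hypothetical stand-in: rebuild the sections from the flat keys.
    md = {}
    for key, v in flat.items():
        *sections, name = key.split(sep)
        d = md
        for s in sections:
            d = d.setdefault(s, {})
        d[name] = v
    return md


md = dict(aaaa=2, bbbb=dict(ccc=3, eee=dict(hh=5)))
flat = flatten(md)
print(flat)    # {'aaaa': 2, 'bbbb.ccc': 3, 'bbbb.eee.hh': 5}
print(unflatten(flat) == md)    # True
```
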
### Parse numbers with units

- `parse_number()`: parse string with number and unit.
- `change_unit()`: scale numerical value to a new unit.

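
To illustrate the idea behind these two functions, the sketch below
splits a string into number and unit and rescales the number via SI
prefixes. `split_number_unit()`, `factor()`, and the small `prefixes`
table are hypothetical and deliberately simplified. For instance, they
would misread "min" as a milli-prefixed unit:

```python
def split_number_unit(s):
    # Hypothetical helper, not the module's parse_number():
    # split a string like '4.2mV' into a number and a unit.
    n = 0
    while n < len(s) and s[n] in '0123456789.+-':
        n += 1
    if n == 0:
        return None, s.strip()
    text = s[:n]
    value = float(text) if '.' in text else int(text)
    return value, s[n:].strip()


# Small subset of SI prefixes, chosen for this demo only.
prefixes = {'k': 1e3, 'm': 1e-3, 'u': 1e-6}


def factor(unit):
    # Factor of a one-letter prefix; units without prefix scale by 1.
    if len(unit) > 1 and unit[0] in prefixes:
        return prefixes[unit[0]]
    return 1.0


value, unit = split_number_unit('2.5kHz')
print(value, unit)                             # 2.5 kHz
print(value * factor(unit) / factor('Hz'))     # 2500.0
print(split_number_unit('42'))                 # (42, '')
```
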
### Find and get values

Find keys and get their values parsed and converted to various types:

- `find_key()`: find dictionary in metadata hierarchy containing the specified key.
- `get_number_unit()`: find a key in metadata and return its number and unit.
- `get_number()`: find a key in metadata and return its value in a given unit.
- `get_int()`: find a key in metadata and return its integer value.
- `get_bool()`: find a key in metadata and return its boolean value.
- `get_datetime()`: find keys in metadata and return a datetime.
- `get_str()`: find a key in metadata and return its string value.

### Organize metadata

Add and remove metadata:

- `strlist_to_dict()`: convert list of key-value-pair strings to dictionary.
- `add_sections()`: add sections to metadata dictionary.
- `set_metadata()`: set values of existing metadata.
- `add_metadata()`: add or modify key-value pairs.
- `move_metadata()`: remove a key from metadata and add it to a dictionary.
- `remove_metadata()`: remove key-value pairs or sections from metadata.
- `cleanup_metadata()`: remove empty sections from metadata.

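
As an illustration of the first item, key-value-pair strings such as
those given on a command line can be turned into a flat dictionary as
sketched below. `pairs_to_dict()` is a hypothetical helper. How
`strlist_to_dict()` treats whitespace or missing '=' characters is not
specified here, so the stripping is an assumption:

```python
def pairs_to_dict(pairs):
    # Hypothetical stand-in: 'key=value' strings to a flat dictionary.
    md = {}
    for s in pairs:
        key, _, value = s.partition('=')
        md[key.strip()] = value.strip()
    return md


flat = pairs_to_dict(['Recording.Experimenter=John Doe', 'Gain = 120mV'])
print(flat)    # {'Recording.Experimenter': 'John Doe', 'Gain': '120mV'}
```
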
### Special metadata fields

Retrieve and set specific metadata:

- `get_gain()`: get gain and unit from metadata.
- `update_gain()`: update gain setting in metadata.
- `set_starttime()`: set all start-of-recording times in metadata.
- `update_starttime()`: update start-of-recording times in metadata.
- `bext_history_str()`: assemble a string for the BEXT CodingHistory field.
- `add_history()`: add a string describing coding history to metadata.
- `add_unwrap()`: add unwrap information to metadata.

Lists of standard keys:

- `default_starttime_keys`: keys of times of start of the recording.
- `default_timeref_keys`: keys of integer time references.
- `default_gain_keys`: keys of gain settings.
- `default_history_keys`: keys of strings describing coding history.

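
Start times are stored either under a single date-time key or under
separate date and time keys, which is why such key lists hold groups of
one or two keys. A minimal sketch of how a list like this could be
evaluated on a flat dictionary of ISO strings. `first_datetime()` and
the shortened `starttime_keys` list are assumptions made for this
example only:

```python
import datetime as dt

# One key for a full date-time, or two keys for date and time.
starttime_keys = [('DateTimeOriginal',),
                  ('OriginationDate', 'OriginationTime')]


def first_datetime(flat_md, key_tuples):
    # Hypothetical helper operating on a flat dictionary of ISO strings.
    for keys in key_tuples:
        if not all(k in flat_md for k in keys):
            continue
        if len(keys) == 1:
            return dt.datetime.fromisoformat(flat_md[keys[0]])
        date = dt.date.fromisoformat(flat_md[keys[0]])
        time = dt.time.fromisoformat(flat_md[keys[1]])
        return dt.datetime.combine(date, time)
    return None


md = dict(OriginationDate='2023-10-01', OriginationTime='14:10:02')
start = first_datetime(md, starttime_keys)
print(start)    # 2023-10-01 14:10:02
```
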

## Command line script

The module can be run as a script from the command line to display the
metadata and markers contained in an audio file:

```sh
> audiometadata logger.wav
```
prints
```text
file:
    filepath    : logger.wav
    samplingrate: 96000Hz
    channels    : 16
    frames      : 17280000
    duration    : 180.000s

metadata:
    INFO:
        Bits            : 32
        Pins            : 1-CH2R,1-CH2L,1-CH3R,1-CH3L,2-CH2R,2-CH2L,2-CH3R,2-CH3L,3-CH2R,3-CH2L,3-CH3R,3-CH3L,4-CH2R,4-CH2L,4-CH3R,4-CH3L
        Gain            : 165.00mV
        uCBoard         : Teensy 4.1
        MACAdress       : 04:e9:e5:15:3e:95
        DateTimeOriginal: 2023-10-01T14:10:02
        Software        : TeeGrid R4-senors-logger v1.0
```


Alternatively, the script can be run from the module as:
```
python -m src.audioio.audiometadata audiofile.wav
```

Running
```sh
audiometadata --help
```
prints
```text
usage: audiometadata [-h] [--version] [-f] [-m] [-c] [-t] files [files ...]

Convert audio file formats.

positional arguments:
  files       audio file

options:
  -h, --help  show this help message and exit
  --version   show program's version number and exit
  -f          list file format only
  -m          list metadata only
  -c          list cues/markers only
  -t          list tags of all riff/wave chunks contained in the file

version 2.0.0 by Benda-Lab (2020-2024)
```

"""
188import sys
189import argparse
190import numpy as np
191import datetime as dt
192from .version import __version__, __year__
195def write_metadata_text(fh, meta, prefix='', indent=4, replace=None):
196 """Write meta data into a text/yaml file or stream.
198 With the default parameters, the output is a valid yaml file.
200 Parameters
201 ----------
202 fh: filename or stream
203 If not a stream, the file with name `fh` is opened.
204 Otherwise `fh` is used as a stream for writing.
205 meta: nested dict
206 Key-value pairs of metadata to be written into the file.
207 prefix: str
208 This string is written at the beginning of each line.
209 indent: int
210 Number of characters used for indentation of sections.
211 replace: char or None
212 If specified, replace special characters by this character.
214 Examples
215 --------
    ```
    from audioio import write_metadata_text
    md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)))
    write_metadata_text('info.txt', md)
    ```
221 """
223 def write_dict(df, md, level, smap):
224 w = 0
225 for k in md:
226 if not isinstance(md[k], dict) and w < len(k):
227 w = len(k)
228 for k in md:
229 clevel = level*indent
230 if isinstance(md[k], dict):
231 df.write(f'{prefix}{"":>{clevel}}{k}:\n')
232 write_dict(df, md[k], level+1, smap)
233 else:
234 value = md[k]
235 if isinstance(value, (list, tuple)):
236 value = ', '.join([f'{v}' for v in value])
237 else:
238 value = f'{value}'
239 value = value.replace('\r\n', r'\n')
240 value = value.replace('\n', r'\n')
241 if len(smap) > 0:
242 value = value.translate(smap)
243 df.write(f'{prefix}{"":>{clevel}}{k:<{w}}: {value}\n')
245 if not meta:
246 return
247 if hasattr(fh, 'write'):
248 own_file = False
249 else:
250 own_file = True
251 fh = open(fh, 'w')
252 smap = {}
253 if replace:
254 smap = str.maketrans('\r\n\t\x00', ''.join([replace]*4))
255 write_dict(fh, meta, 0, smap)
256 if own_file:
257 fh.close()
260def print_metadata(meta, prefix='', indent=4, replace=None):
261 """Write meta data to standard output.
263 Parameters
264 ----------
265 meta: nested dict
266 Key-value pairs of metadata to be written into the file.
267 prefix: str
268 This string is written at the beginning of each line.
269 indent: int
270 Number of characters used for indentation of sections.
271 replace: char or None
272 If specified, replace special characters by this character.
274 Examples
275 --------
276 ```
277 >>> from audioio import print_metadata
278 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
279 >>> print_metadata(md)
280 aaaa: 2
281 bbbb:
282 ccc: 3
283 ddd: 4
284 eee:
285 hh: 5
286 iiii:
287 jjj: 6
288 ```
289 """
290 write_metadata_text(sys.stdout, meta, prefix, indent, replace)
293def flatten_metadata(md, keep_sections=False, sep='.'):
294 """Flatten hierarchical metadata to a single dictionary.
296 Parameters
297 ----------
298 md: nested dict
299 Metadata as returned by `metadata()`.
300 keep_sections: bool
301 If `True`, then prefix keys with section names, separated by `sep`.
302 sep: str
303 String for separating section names.
305 Returns
306 -------
307 d: dict
308 Non-nested dict containing all key-value pairs of `md`.
310 Examples
311 --------
312 ```
313 >>> from audioio import print_metadata, flatten_metadata
314 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
315 >>> print_metadata(md)
316 aaaa: 2
317 bbbb:
318 ccc: 3
319 ddd: 4
320 eee:
321 hh: 5
322 iiii:
323 jjj: 6
325 >>> fmd = flatten_metadata(md, keep_sections=True)
326 >>> print_metadata(fmd)
327 aaaa : 2
328 bbbb.ccc : 3
329 bbbb.ddd : 4
330 bbbb.eee.hh: 5
331 iiii.jjj : 6
332 ```
333 """
334 def flatten(cd, section):
335 df = {}
336 for k in cd:
337 if isinstance(cd[k], dict):
338 df.update(flatten(cd[k], section + k + sep))
339 else:
340 if keep_sections:
341 df[section+k] = cd[k]
342 else:
343 df[k] = cd[k]
344 return df
346 return flatten(md, '')
349def unflatten_metadata(md, sep='.'):
350 """Unflatten a previously flattened metadata dictionary.
352 Parameters
353 ----------
354 md: dict
355 Flat dictionary with key-value pairs as obtained from
356 `flatten_metadata()` with `keep_sections=True`.
357 sep: str
358 String that separates section names.
360 Returns
361 -------
362 d: nested dict
363 Hierarchical dictionary with sub-dictionaries and key-value pairs.
365 Examples
366 --------
367 ```
368 >>> from audioio import print_metadata, unflatten_metadata
369 >>> fmd = {'aaaa': 2, 'bbbb.ccc': 3, 'bbbb.ddd': 4, 'bbbb.eee.hh': 5, 'iiii.jjj': 6}
370 >>> print_metadata(fmd)
371 aaaa : 2
372 bbbb.ccc : 3
373 bbbb.ddd : 4
374 bbbb.eee.hh: 5
375 iiii.jjj : 6
377 >>> md = unflatten_metadata(fmd)
378 >>> print_metadata(md)
379 aaaa: 2
380 bbbb:
381 ccc: 3
382 ddd: 4
383 eee:
384 hh: 5
385 iiii:
386 jjj: 6
387 ```
388 """
389 umd = {} # unflattened metadata
390 cmd = [umd] # current metadata dicts for each level of the hierarchy
391 csk = [] # current section keys
392 for k in md:
393 ks = k.split(sep)
394 # go up the hierarchy:
395 for i in range(len(csk) - len(ks)):
396 csk.pop()
397 cmd.pop()
398 for kss in reversed(ks[:len(csk)]):
399 if kss == csk[-1]:
400 break
401 csk.pop()
402 cmd.pop()
403 # add new sections:
404 for kss in ks[len(csk):-1]:
405 csk.append(kss)
406 cmd[-1][kss] = {}
407 cmd.append(cmd[-1][kss])
408 # add key-value pair:
409 cmd[-1][ks[-1]] = md[k]
410 return umd
413def parse_number(s):
414 """Parse string with number and unit.
416 Parameters
417 ----------
418 s: str, float, or int
419 String to be parsed. The initial part of the string is
420 expected to be a number, the part following the number is
421 interpreted as the unit. If float or int, then return this
422 as the value with empty unit.
424 Returns
425 -------
426 v: None, int, or float
427 Value of the string as float. Without decimal point, an int is returned.
428 If the string does not contain a number, None is returned.
429 u: str
430 Unit that follows the initial number.
431 n: int
432 Number of digits behind the decimal point.
434 Examples
435 --------
437 ```
438 >>> from audioio import parse_number
440 # integer:
441 >>> parse_number('42')
442 (42, '', 0)
444 # integer with unit:
445 >>> parse_number('42ms')
446 (42, 'ms', 0)
448 # float with unit:
449 >>> parse_number('42.ms')
450 (42.0, 'ms', 0)
452 # float with unit:
453 >>> parse_number('42.3ms')
454 (42.3, 'ms', 1)
456 # float with space and unit:
457 >>> parse_number('423.17 Hz')
458 (423.17, 'Hz', 2)
459 ```
461 """
462 if not isinstance(s, str):
463 if isinstance(s, int):
464 return s, '', 0
465 if isinstance(s, float):
466 return s, '', 5
467 else:
468 return None, '', 0
469 n = len(s)
470 ip = n
471 have_point = False
472 for i in range(len(s)):
473 if s[i] == '.':
474 if have_point:
475 n = i
476 break
477 have_point = True
478 ip = i + 1
        if s[i] not in '0123456789.+-':
480 n = i
481 break
482 if n == 0:
483 return None, s, 0
484 v = float(s[:n]) if have_point else int(s[:n])
485 u = s[n:].strip()
486 nd = n - ip if n >= ip else 0
487 return v, u, nd
490unit_prefixes = {'Deka': 1e1, 'deka': 1e1, 'Hekto': 1e2, 'hekto': 1e2,
491 'kilo': 1e3, 'Kilo': 1e3, 'Mega': 1e6, 'mega': 1e6,
492 'Giga': 1e9, 'giga': 1e9, 'Tera': 1e12, 'tera': 1e12,
493 'Peta': 1e15, 'peta': 1e15, 'Exa': 1e18, 'exa': 1e18,
494 'Dezi': 1e-1, 'dezi': 1e-1, 'Zenti': 1e-2, 'centi': 1e-2,
495 'Milli': 1e-3, 'milli': 1e-3, 'Micro': 1e-6, 'micro': 1e-6,
496 'Nano': 1e-9, 'nano': 1e-9, 'Piko': 1e-12, 'piko': 1e-12,
497 'Femto': 1e-15, 'femto': 1e-15, 'Atto': 1e-18, 'atto': 1e-18,
498 'da': 1e1, 'h': 1e2, 'K': 1e3, 'k': 1e3, 'M': 1e6,
499 'G': 1e9, 'T': 1e12, 'P': 1e15, 'E': 1e18,
500 'd': 1e-1, 'c': 1e-2, 'mu': 1e-6, 'u': 1e-6, 'm': 1e-3,
501 'n': 1e-9, 'p': 1e-12, 'f': 1e-15, 'a': 1e-18}
502""" SI prefixes for units with corresponding factors. """
505def change_unit(val, old_unit, new_unit):
506 """Scale numerical value to a new unit.
508 Adapted from https://github.com/relacs/relacs/blob/1facade622a80e9f51dbf8e6f8171ac74c27f100/options/src/parameter.cc#L1647-L1703
510 Parameters
511 ----------
512 val: float
513 Value given in `old_unit`.
514 old_unit: str
515 Unit of `val`.
516 new_unit: str
517 Requested unit of return value.
519 Returns
520 -------
521 new_val: float
522 The input value `val` scaled to `new_unit`.
524 Examples
525 --------
527 ```
528 >>> from audioio import change_unit
529 >>> change_unit(5, 'mm', 'cm')
530 0.5
532 >>> change_unit(5, '', 'cm')
533 5.0
535 >>> change_unit(5, 'mm', '')
536 5.0
538 >>> change_unit(5, 'cm', 'mm')
539 50.0
541 >>> change_unit(4, 'kg', 'g')
542 4000.0
544 >>> change_unit(12, '%', '')
545 0.12
547 >>> change_unit(1.24, '', '%')
548 124.0
550 >>> change_unit(2.5, 'min', 's')
551 150.0
553 >>> change_unit(3600, 's', 'h')
554 1.0
556 ```
558 """
559 # missing unit?
560 if not old_unit and not new_unit:
561 return val
562 if not old_unit and new_unit != '%':
563 return val
564 if not new_unit and old_unit != '%':
565 return val
    # special units that directly translate into factors:
    unit_factors = {'%': 0.01, 'hour': 60.0*60.0, 'h': 60.0*60.0, 'min': 60.0}

    # check longer prefixes first, so that e.g. 'mu' is not shadowed by 'm':
    prefixes = sorted(unit_prefixes, key=len, reverse=True)

    # parse old unit:
    f1 = 1.0
    if old_unit in unit_factors:
        f1 = unit_factors[old_unit]
    else:
        for k in prefixes:
            if len(old_unit) > len(k) and old_unit[:len(k)] == k:
                f1 = unit_prefixes[k]
                break

    # parse new unit:
    f2 = 1.0
    if new_unit in unit_factors:
        f2 = unit_factors[new_unit]
    else:
        for k in prefixes:
            if len(new_unit) > len(k) and new_unit[:len(k)] == k:
                f2 = unit_prefixes[k]
                break

    return val*f1/f2
591def find_key(metadata, key, sep='.'):
592 """Find dictionary in metadata hierarchy containing the specified key.
594 Parameters
595 ----------
596 metadata: nested dict
597 Metadata.
598 key: str
599 Key to be searched for (case insensitive).
600 May contain section names separated by `sep`, i.e.
601 "aaa.bbb.ccc" searches "ccc" (can be key-value pair or section)
602 in section "bbb" that needs to be a subsection of section "aaa".
603 sep: str
604 String that separates section names in `key`.
606 Returns
607 -------
608 md: dict
609 The innermost dictionary matching some sections of the search key.
610 If `key` is not at all contained in the metadata,
611 the top-level dictionary is returned.
612 key: str
        The part of the search key that was not found in `md`, or
        the final part of the search key, found in `md`.
616 Examples
617 --------
619 Independent of whether found or not found, you can assign to the
620 returned dictionary with the returned key.
622 ```
623 >>> from audioio import print_metadata, find_key
624 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(ff=5)), gggg=dict(hhh=6))
625 >>> print_metadata(md)
626 aaaa: 2
627 bbbb:
628 ccc: 3
629 ddd: 4
630 eee:
631 ff: 5
632 gggg:
633 hhh: 6
635 >>> m, k = find_key(md, 'bbbb.ddd')
636 >>> m[k] = 10
637 >>> print_metadata(md)
638 aaaa: 2
639 bbbb:
640 ccc: 3
641 ddd: 10
642 ...
644 >>> m, k = find_key(md, 'hhh')
645 >>> m[k] = 12
646 >>> print_metadata(md)
647 ...
648 gggg:
649 hhh: 12
651 >>> m, k = find_key(md, 'bbbb.eee.xx')
652 >>> m[k] = 42
653 >>> print_metadata(md)
654 ...
655 eee:
656 ff: 5
657 xx: 42
658 ...
659 ```
    When searching for sections, the one containing the searched section
    is returned:
663 ```py
664 >>> m, k = find_key(md, 'eee')
665 >>> m[k]['yy'] = 46
666 >>> print_metadata(md)
667 ...
668 eee:
669 ff: 5
670 xx: 42
671 yy: 46
672 ...
673 ```
675 """
676 def find_keys(metadata, keys):
677 key = keys[0].strip().upper()
678 for k in metadata:
679 if k.upper() == key:
680 if len(keys) == 1:
681 # found key:
682 return True, metadata, k
683 elif isinstance(metadata[k], dict):
684 # keep searching within the next section:
685 return find_keys(metadata[k], keys[1:])
686 # search in subsections:
687 for k in metadata:
688 if isinstance(metadata[k], dict):
689 found, mm, kk = find_keys(metadata[k], keys)
690 if found:
691 return True, mm, kk
692 # nothing found:
693 return False, metadata, sep.join(keys)
695 if metadata is None:
696 return {}, None
697 ks = key.strip().split(sep)
698 found, mm, kk = find_keys(metadata, ks)
699 return mm, kk
702def get_number_unit(metadata, keys, sep='.', default=None,
703 default_unit='', remove=False):
704 """Find a key in metadata and return its number and unit.
706 Parameters
707 ----------
708 metadata: nested dict
709 Metadata.
710 keys: str or list of str
711 Keys in the metadata to be searched for (case insensitive).
712 Value of the first key found is returned.
713 May contain section names separated by `sep`.
714 See `audiometadata.find_key()` for details.
715 sep: str
716 String that separates section names in `key`.
717 default: None, int, or float
718 Returned value if `key` is not found or the value does
719 not contain a number.
720 default_unit: str
721 Returned unit if `key` is not found or the key's value does
722 not have a unit.
723 remove: bool
724 If `True`, remove the found key from `metadata`.
726 Returns
727 -------
728 v: None, int, or float
729 Value referenced by `key` as float.
730 Without decimal point, an int is returned.
731 If none of the `keys` was found or
        the key's value does not contain a number,
733 then `default` is returned.
734 u: str
735 Corresponding unit.
737 Examples
738 --------
740 ```
741 >>> from audioio import get_number_unit
742 >>> md = dict(aaaa='42', bbbb='42.3ms')
744 # integer:
745 >>> get_number_unit(md, 'aaaa')
746 (42, '')
748 # float with unit:
749 >>> get_number_unit(md, 'bbbb')
750 (42.3, 'ms')
752 # two keys:
753 >>> get_number_unit(md, ['cccc', 'bbbb'])
754 (42.3, 'ms')
756 # not found:
757 >>> get_number_unit(md, 'cccc')
758 (None, '')
760 # not found with default value:
761 >>> get_number_unit(md, 'cccc', default=1.0, default_unit='a.u.')
762 (1.0, 'a.u.')
763 ```
765 """
766 if not metadata:
767 return default, default_unit
768 if not isinstance(keys, (list, tuple, np.ndarray)):
769 keys = (keys,)
770 value = default
771 unit = default_unit
772 for key in keys:
773 m, k = find_key(metadata, key, sep)
774 if k in m:
775 v, u, _ = parse_number(m[k])
776 if v is not None:
777 if not u:
778 u = default_unit
779 if remove:
780 del m[k]
781 return v, u
782 elif u and unit == default_unit:
783 unit = u
784 return value, unit
787def get_number(metadata, unit, keys, sep='.', default=None, remove=False):
788 """Find a key in metadata and return its value in a given unit.
790 Parameters
791 ----------
792 metadata: nested dict
793 Metadata.
794 unit: str
795 Unit in which to return numerical value referenced by one of the `keys`.
796 keys: str or list of str
797 Keys in the metadata to be searched for (case insensitive).
798 Value of the first key found is returned.
799 May contain section names separated by `sep`.
800 See `audiometadata.find_key()` for details.
801 sep: str
802 String that separates section names in `key`.
803 default: None, int, or float
804 Returned value if `key` is not found or the value does
805 not contain a number.
806 remove: bool
807 If `True`, remove the found key from `metadata`.
809 Returns
810 -------
811 v: None or float
812 Value referenced by `key` as float scaled to `unit`.
813 If none of the `keys` was found or
        the key's value does not contain a number,
815 then `default` is returned.
817 Examples
818 --------
820 ```
821 >>> from audioio import get_number
822 >>> md = dict(aaaa='42', bbbb='42.3ms')
824 # milliseconds to seconds:
825 >>> get_number(md, 's', 'bbbb')
826 0.0423
828 # milliseconds to microseconds:
829 >>> get_number(md, 'us', 'bbbb')
830 42300.0
832 # value without unit is not scaled:
833 >>> get_number(md, 'Hz', 'aaaa')
834 42
836 # two keys:
837 >>> get_number(md, 's', ['cccc', 'bbbb'])
838 0.0423
840 # not found:
841 >>> get_number(md, 's', 'cccc')
842 None
844 # not found with default value:
845 >>> get_number(md, 's', 'cccc', default=1.0)
846 1.0
847 ```
849 """
850 v, u = get_number_unit(metadata, keys, sep, None, unit, remove)
851 if v is None:
852 return default
853 else:
854 return change_unit(v, u, unit)
857def get_int(metadata, keys, sep='.', default=None, remove=False):
858 """Find a key in metadata and return its integer value.
860 Parameters
861 ----------
862 metadata: nested dict
863 Metadata.
864 keys: str or list of str
865 Keys in the metadata to be searched for (case insensitive).
866 Value of the first key found is returned.
867 May contain section names separated by `sep`.
868 See `audiometadata.find_key()` for details.
869 sep: str
870 String that separates section names in `key`.
871 default: None or int
872 Return value if `key` is not found or the value does
873 not contain an integer.
874 remove: bool
875 If `True`, remove the found key from `metadata`.
877 Returns
878 -------
879 v: None or int
880 Value referenced by `key` as integer.
881 If none of the `keys` was found,
882 the key's value does not contain a number or represents
883 a floating point value, then `default` is returned.
885 Examples
886 --------
888 ```
889 >>> from audioio import get_int
890 >>> md = dict(aaaa='42', bbbb='42.3ms')
892 # integer:
893 >>> get_int(md, 'aaaa')
894 42
896 # two keys:
897 >>> get_int(md, ['cccc', 'aaaa'])
898 42
900 # float:
901 >>> get_int(md, 'bbbb')
902 None
904 # not found:
905 >>> get_int(md, 'cccc')
906 None
908 # not found with default value:
909 >>> get_int(md, 'cccc', default=0)
910 0
911 ```
913 """
914 if not metadata:
915 return default
916 if not isinstance(keys, (list, tuple, np.ndarray)):
917 keys = (keys,)
918 for key in keys:
919 m, k = find_key(metadata, key, sep)
920 if k in m:
921 v, _, n = parse_number(m[k])
922 if v is not None and n == 0:
923 if remove:
924 del m[k]
925 return int(v)
926 return default
929def get_bool(metadata, keys, sep='.', default=None, remove=False):
930 """Find a key in metadata and return its boolean value.
932 Parameters
933 ----------
934 metadata: nested dict
935 Metadata.
936 keys: str or list of str
937 Keys in the metadata to be searched for (case insensitive).
938 Value of the first key found is returned.
939 May contain section names separated by `sep`.
940 See `audiometadata.find_key()` for details.
941 sep: str
942 String that separates section names in `key`.
943 default: None or bool
944 Return value if `key` is not found or the value does
945 not specify a boolean value.
946 remove: bool
947 If `True`, remove the found key from `metadata`.
949 Returns
950 -------
951 v: None or bool
952 Value referenced by `key` as boolean.
        True if 'true', 't', 'yes', 'y' (case insensitive) or any non-zero number.
        False if 'false', 'f', 'no', 'n' (case insensitive) or any number equal to zero.
        If none of the `keys` was found or
        the key's value does not specify a boolean value,
        then `default` is returned.
959 Examples
960 --------
962 ```
963 >>> from audioio import get_bool
964 >>> md = dict(aaaa='TruE', bbbb='No', cccc=0, dddd=1, eeee=True, ffff='ui')
966 # case insensitive:
967 >>> get_bool(md, 'aaaa')
968 True
970 >>> get_bool(md, 'bbbb')
971 False
973 >>> get_bool(md, 'cccc')
974 False
976 >>> get_bool(md, 'dddd')
977 True
979 >>> get_bool(md, 'eeee')
980 True
982 # not found:
983 >>> get_bool(md, 'ffff')
984 None
986 # two keys (string is preferred over number):
987 >>> get_bool(md, ['cccc', 'aaaa'])
988 True
990 # two keys (take first match):
991 >>> get_bool(md, ['cccc', 'ffff'])
992 False
994 # not found with default value:
995 >>> get_bool(md, 'ffff', default=False)
996 False
997 ```
999 """
1000 if not metadata:
1001 return default
1002 if not isinstance(keys, (list, tuple, np.ndarray)):
1003 keys = (keys,)
1004 val = default
1005 mv = None
1006 kv = None
1007 for key in keys:
1008 m, k = find_key(metadata, key, sep)
1009 if k in m and not isinstance(m[k], dict):
1010 vs = m[k]
1011 v, _, _ = parse_number(vs)
1012 if v is not None:
1013 val = abs(v) > 1e-8
1014 mv = m
1015 kv = k
1016 elif isinstance(vs, str):
1017 if vs.upper() in ['TRUE', 'T', 'YES', 'Y']:
1018 if remove:
1019 del m[k]
1020 return True
1021 if vs.upper() in ['FALSE', 'F', 'NO', 'N']:
1022 if remove:
1023 del m[k]
1024 return False
    if mv is not None and kv is not None and remove:
        del mv[kv]
1027 return val
1030default_starttime_keys = [['DateTimeOriginal'],
1031 ['OriginationDate', 'OriginationTime'],
1032 ['Location_Time'],
1033 ['Timestamp']]
1034"""Default keys of times of start of the recording in metadata.
1035Used by `get_datetime()` and `update_starttime()` functions.
1036"""
1038def get_datetime(metadata, keys=default_starttime_keys,
1039 sep='.', default=None, remove=False):
1040 """Find keys in metadata and return a datetime.
1042 Parameters
1043 ----------
1044 metadata: nested dict
1045 Metadata.
1046 keys: tuple of str or list of tuple of str
1047 Datetimes can be stored in metadata as two separate key-value pairs,
1048 one for the date and one for the time. Or by a single key-value pair
1049 for a date-time value. This is why the keys need to be specified in
1050 tuples with one or two keys.
1051 The value of the first tuple of keys found is returned.
1052 Keys may contain section names separated by `sep`.
1053 See `audiometadata.find_key()` for details.
1054 The default values for the `keys` find the start time of a recording.
1055 You can modify the default keys via the `default_starttime_keys` list
1056 of the `audiometadata` module.
1057 sep: str
1058 String that separates section names in `key`.
    default: None or datetime
        Return value if none of the `keys` is found or the values
        found are neither datetime objects nor strings.
1062 remove: bool
1063 If `True`, remove the found key from `metadata`.
1065 Returns
1066 -------
1067 v: None or datetime
1068 Datetime referenced by `keys`.
1069 If none of the `keys` was found, then `default` is returned.
1071 Examples
1072 --------
1074 ```
1075 >>> from audioio import get_datetime
1076 >>> import datetime as dt
1077 >>> md = dict(date='2024-03-02', time='10:42:24',
1078 datetime='2023-04-15T22:10:00')
1080 # separate date and time:
1081 >>> get_datetime(md, ('date', 'time'))
1082 datetime.datetime(2024, 3, 2, 10, 42, 24)
1084 # single datetime:
1085 >>> get_datetime(md, ('datetime',))
1086 datetime.datetime(2023, 4, 15, 22, 10)
1088 # two alternative key tuples:
1089 >>> get_datetime(md, [('aaaa',), ('date', 'time')])
1090 datetime.datetime(2024, 3, 2, 10, 42, 24)
1092 # not found:
1093 >>> get_datetime(md, ('cccc',))
1094 None
1096 # not found with default value:
1097 >>> get_datetime(md, ('cccc', 'dddd'),
1098 default=dt.datetime(2022, 2, 22, 22, 2, 12))
1099 datetime.datetime(2022, 2, 22, 22, 2, 12)
1100 ```
1102 """
1103 if not metadata:
1104 return default
1105 if len(keys) > 0 and isinstance(keys[0], str):
1106 keys = (keys,)
1107 for keyp in keys:
1108 if len(keyp) == 1:
1109 m, k = find_key(metadata, keyp[0], sep)
1110 if k in m:
1111 v = m[k]
1112 if isinstance(v, dt.datetime):
1113 if remove:
1114 del m[k]
1115 return v
1116 elif isinstance(v, str):
1117 if remove:
1118 del m[k]
1119 return dt.datetime.fromisoformat(v)
1120 else:
1121 md, kd = find_key(metadata, keyp[0], sep)
            if kd not in md:
1123 continue
1124 if isinstance(md[kd], dt.date):
1125 date = md[kd]
1126 elif isinstance(md[kd], str):
1127 date = dt.date.fromisoformat(md[kd])
1128 else:
1129 continue
1130 mt, kt = find_key(metadata, keyp[1], sep)
            if kt not in mt:
1132 continue
1133 if isinstance(mt[kt], dt.time):
1134 time = mt[kt]
1135 elif isinstance(mt[kt], str):
1136 time = dt.time.fromisoformat(mt[kt])
1137 else:
1138 continue
1139 if remove:
1140 del md[kd]
1141 del mt[kt]
1142 return dt.datetime.combine(date, time)
1143 return default
1146def get_str(metadata, keys, sep='.', default=None, remove=False):
1147 """Find a key in metadata and return its string value.
1149 Parameters
1150 ----------
1151 metadata: nested dict
1152 Metadata.
1153 keys: str or list of str
1154 Keys in the metadata to be searched for (case insensitive).
1155 Value of the first key found is returned.
1156 May contain section names separated by `sep`.
1157 See `audiometadata.find_key()` for details.
1158 sep: str
1159 String that separates section names in `key`.
1160 default: None or str
1161 Return value if `key` is not found or the value does
1162 not contain a string.
1163 remove: bool
1164 If `True`, remove the found key from `metadata`.
1166 Returns
1167 -------
1168 v: None or str
1169 String value referenced by `key`.
1170 If none of the `keys` was found, then `default` is returned.
1172 Examples
1173 --------
1175 ```
1176 >>> from audioio import get_str
1177 >>> md = dict(aaaa=42, bbbb='hello')
1179 # string:
1180 >>> get_str(md, 'bbbb')
1181 'hello'
1183 # int as str:
1184 >>> get_str(md, 'aaaa')
1185 '42'
1187 # two keys:
1188 >>> get_str(md, ['cccc', 'bbbb'])
1189 'hello'
1191 # not found:
1192 >>> get_str(md, 'cccc')
1193 None
1195 # not found with default value:
1196 >>> get_str(md, 'cccc', default='-')
1197 '-'
1198 ```
1200 """
1201 if not metadata:
1202 return default
1203 if not isinstance(keys, (list, tuple, np.ndarray)):
1204 keys = (keys,)
1205 for key in keys:
1206 m, k = find_key(metadata, key, sep)
1207 if k in m and not isinstance(m[k], dict):
1208 v = m[k]
1209 if remove:
1210 del m[k]
1211 return str(v)
1212 return default
1215def add_sections(metadata, sections, value=False, sep='.'):
1216 """Add sections to metadata dictionary.
1218 Parameters
1219 ----------
1220 metadata: nested dict
1221 Metadata.
    sections: str
        Names of sections to be added to `metadata`.
        Section names separated by `sep`.
    value: bool
        If True, then the last element in `sections` is a key for a value,
        not a section.
    sep: str
        String that separates section names in `sections`.
1231 Returns
1232 -------
1233 md: dict
1234 Dictionary of the last added section.
1235 key: str
1236 Last key. Only returned if `value` is set to `True`.
1238 Examples
1239 --------
1241 Add a section and a sub-section to the metadata:
1242 ```
1243 >>> from audioio import print_metadata, add_sections
1244 >>> md = dict()
1245 >>> m = add_sections(md, 'Recording.Location')
1246 >>> m['Country'] = 'Lummerland'
1247 >>> print_metadata(md)
1248 Recording:
1249 Location:
1250 Country: Lummerland
1251 ```
1253 Add a section with a key-value pair:
1254 ```
1255 >>> md = dict()
1256 >>> m, k = add_sections(md, 'Recording.Location', True)
1257 >>> m[k] = 'Lummerland'
1258 >>> print_metadata(md)
1259 Recording:
1260 Location: Lummerland
1261 ```
    Works well together with `find_key()`:
1264 ```
1265 >>> md = dict(Recording=dict())
1266 >>> m, k = find_key(md, 'Recording.Location.Country')
1267 >>> m, k = add_sections(m, k, True)
1268 >>> m[k] = 'Lummerland'
1269 >>> print_metadata(md)
1270 Recording:
1271 Location:
1272 Country: Lummerland
1273 ```
1275 """
1276 mm = metadata
1277 ks = sections.split(sep)
1278 n = len(ks)
1279 if value:
1280 n -= 1
1281 for k in ks[:n]:
1282 if len(k) == 0:
1283 continue
1284 mm[k] = dict()
1285 mm = mm[k]
1286 if value:
1287 return mm, ks[-1]
1288 else:
1289 return mm
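
The loop in `add_sections()` can be tried out in isolation. The following is a minimal sketch, assuming that the metadata are plain nested dictionaries; the helper `make_sections()` is hypothetical and not part of `audioio`, it only mirrors the logic shown above.

```python
def make_sections(metadata, path, value=False, sep='.'):
    """Hypothetical stand-alone version of the loop in add_sections()."""
    names = path.split(sep)
    # with value=True the last name is a key for a value, not a section:
    section_names = names[:-1] if value else names
    current = metadata
    for name in section_names:
        if not name:
            continue            # skip empty names, e.g. from 'a..b'
        current[name] = {}      # note: this replaces an existing entry
        current = current[name]
    return (current, names[-1]) if value else current


md = {}
section, key = make_sections(md, 'Recording.Location.Country', value=True)
section[key] = 'Lummerland'
print(md)
```

Because each section name is assigned a fresh dictionary, the sketch (like the loop above) is best applied to the part of a key path that does not exist yet, for example the remainder returned by `find_key()`.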
1292def strlist_to_dict(mds):
1293 """Convert list of key-value-pair strings to dictionary.
1295 Parameters
1296 ----------
1297 mds: None or dict or str or list of str
1298 - None - returns empty dictionary.
1299 - Flat dictionary - returned as is.
1300 - String with key and value separated by '='.
1301 - List of strings with keys and values separated by '='.
1302 Keys may contain section names.
1304 Returns
1305 -------
1306 md_dict: dict
1307 Flat dictionary with key-value pairs.
1308 Keys may contain section names.
1309 Values are strings, other types or dictionaries.
1310 """
1311 if mds is None:
1312 return {}
1313 if isinstance(mds, dict):
1314 return mds
1315 if not isinstance(mds, (list, tuple, np.ndarray)):
1316 mds = (mds,)
1317 md_dict = {}
1318 for md in mds:
        k, v = md.split('=', 1)  # split at the first '=' so that values may contain '='
1320 k = k.strip()
1321 v = v.strip()
1322 md_dict[k] = v
1323 return md_dict
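
How 'key=value' strings map to a flat dictionary can be illustrated with a small stand-alone sketch. The helper `parse_pairs()` is hypothetical and not part of `audioio`. It splits at the first '=' only, which is an assumption about the desired behavior for values that themselves contain '='.

```python
def parse_pairs(pairs):
    """Hypothetical helper: turn 'key=value' strings into a flat dict."""
    if isinstance(pairs, str):
        pairs = [pairs]
    result = {}
    for pair in pairs:
        # split at the first '=' only, so values may contain '=':
        key, value = pair.split('=', 1)
        result[key.strip()] = value.strip()
    return result


flat = parse_pairs(['Recording.Time = late', 'Comment=a=b'])
print(flat)
```

The keys stay flat strings such as 'Recording.Time'; resolving them into sections is left to functions like `set_metadata()` and `add_metadata()`.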
1326def set_metadata(metadata, mds, sep='.'):
1327 """Set values of existing metadata.
    A value is updated only if its key is found in the metadata.
1331 Parameters
1332 ----------
1333 metadata: nested dict
1334 Metadata.
1335 mds: dict or str or list of str
1336 - Flat dictionary with key-value pairs for updating the metadata.
1337 Values can be strings, other types or dictionaries.
1338 - String with key and value separated by '='.
1339 - List of strings with key and value separated by '='.
1340 Keys may contain section names separated by `sep`.
1341 sep: str
        String that separates section names in the keys of `mds`.
1344 Examples
1345 --------
1346 ```
1347 >>> from audioio import print_metadata, set_metadata
1348 >>> md = dict(Recording=dict(Time='early'))
1349 >>> print_metadata(md)
1350 Recording:
1351 Time: early
1353 >>> set_metadata(md, {'Artist': 'John Doe', # new key-value pair
1354 'Recording.Time': 'late'}) # change value of existing key
1355 >>> print_metadata(md)
1356 Recording:
1357 Time : late
1358 ```
1360 See also
1361 --------
1362 add_metadata()
1363 strlist_to_dict()
1365 """
1366 if metadata is None:
1367 return
1368 md_dict = strlist_to_dict(mds)
1369 for k in md_dict:
1370 mm, kk = find_key(metadata, k, sep)
1371 if kk in mm:
1372 mm[kk] = md_dict[k]
1375def add_metadata(metadata, mds, sep='.'):
1376 """Add or modify key-value pairs.
1378 If a key does not exist, it is added to the metadata.
1380 Parameters
1381 ----------
1382 metadata: nested dict
1383 Metadata.
1384 mds: dict or str or list of str
1385 - Flat dictionary with key-value pairs for updating the metadata.
1386 Values can be strings or other types.
1387 - String with key and value separated by '='.
1388 - List of strings with key and value separated by '='.
1389 Keys may contain section names separated by `sep`.
1390 sep: str
        String that separates section names in the keys of `mds`.
1393 Examples
1394 --------
1395 ```
1396 >>> from audioio import print_metadata, add_metadata
1397 >>> md = dict(Recording=dict(Time='early'))
1398 >>> print_metadata(md)
1399 Recording:
1400 Time: early
1402 >>> add_metadata(md, {'Artist': 'John Doe', # new key-value pair
1403 'Recording.Time': 'late', # change value of existing key
1404 'Recording.Quality': 'amazing', # new key-value pair in existing section
                          'Location.Country': 'Lummerland'})   # new key-value pair in new section
1406 >>> print_metadata(md)
1407 Recording:
1408 Time : late
1409 Quality: amazing
1410 Artist: John Doe
1411 Location:
1412 Country: Lummerland
1413 ```
1415 See also
1416 --------
1417 set_metadata()
1418 strlist_to_dict()
1420 """
1421 if metadata is None:
1422 return
1423 md_dict = strlist_to_dict(mds)
1424 for k in md_dict:
1425 mm, kk = find_key(metadata, k, sep)
1426 mm, kk = add_sections(mm, kk, True, sep)
1427 mm[kk] = md_dict[k]
1430def move_metadata(src_md, dest_md, keys, new_key=None, sep='.'):
1431 """Remove a key from metadata and add it to a dictionary.
1433 Parameters
1434 ----------
1435 src_md: nested dict
1436 Metadata from which a key is removed.
1437 dest_md: dict
1438 Dictionary to which the found key and its value are added.
1439 keys: str or list of str
1440 List of keys to be searched for in `src_md`.
1441 Move the first one found to `dest_md`.
1442 See the `audiometadata.find_key()` function for details.
1443 new_key: None or str
1444 If specified add the value of the found key as `new_key` to
1445 `dest_md`. Otherwise, use the search key.
1446 sep: str
1447 String that separates section names in `keys`.
1449 Returns
1450 -------
1451 moved: bool
1452 `True` if key was found and moved to dictionary.
1454 Examples
1455 --------
1456 ```
1457 >>> from audioio import print_metadata, move_metadata
1458 >>> md = dict(Artist='John Doe', Recording=dict(Gain='1.42mV'))
1459 >>> move_metadata(md, md['Recording'], 'Artist', 'Experimentalist')
1460 >>> print_metadata(md)
1461 Recording:
1462 Gain : 1.42mV
1463 Experimentalist: John Doe
1464 ```
1466 """
1467 if not src_md:
1468 return False
1469 if not isinstance(keys, (list, tuple, np.ndarray)):
1470 keys = (keys,)
1471 for key in keys:
1472 m, k = find_key(src_md, key, sep)
1473 if k in m:
1474 dest_key = new_key if new_key else k
1475 dest_md[dest_key] = m.pop(k)
1476 return True
1477 return False
1480def remove_metadata(metadata, key_list, sep='.'):
1481 """Remove key-value pairs or sections from metadata.
1483 Parameters
1484 ----------
1485 metadata: nested dict
1486 Metadata.
1487 key_list: str or list of str
1488 List of keys to key-value pairs or sections to be removed
1489 from the metadata.
1490 sep: str
1491 String that separates section names in the keys of `key_list`.
1493 Examples
1494 --------
1495 ```
1496 >>> from audioio import print_metadata, remove_metadata
1497 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4))
1498 >>> remove_metadata(md, ('ccc',))
1499 >>> print_metadata(md)
1500 aaaa: 2
1501 bbbb:
1502 ddd: 4
1503 ```
1505 """
1506 if not metadata:
1507 return
1508 if not isinstance(key_list, (list, tuple, np.ndarray)):
1509 key_list = (key_list,)
1510 for k in key_list:
1511 mm, kk = find_key(metadata, k, sep)
1512 if kk in mm:
1513 del mm[kk]
1516def cleanup_metadata(metadata):
1517 """Remove empty sections from metadata.
1519 Parameters
1520 ----------
1521 metadata: nested dict
1522 Metadata.
1524 Examples
1525 --------
1526 ```
1527 >>> from audioio import print_metadata, cleanup_metadata
1528 >>> md = dict(aaaa=2, bbbb=dict())
1529 >>> cleanup_metadata(md)
1530 >>> print_metadata(md)
1531 aaaa: 2
1532 ```
1534 """
1535 if not metadata:
1536 return
1537 for k in list(metadata):
1538 if isinstance(metadata[k], dict):
1539 if len(metadata[k]) == 0:
1540 del metadata[k]
1541 else:
1542 cleanup_metadata(metadata[k])
1545default_gain_keys = ['gain']
1546"""Default keys of gain settings in metadata. Used by `get_gain()` function.
1547"""
1549def get_gain(metadata, gain_key=default_gain_keys, sep='.',
1550 default=None, default_unit='', remove=False):
1551 """Get gain and unit from metadata.
1553 Parameters
1554 ----------
1555 metadata: nested dict
1556 Metadata with key-value pairs.
1557 gain_key: str or list of str
1558 Key in the file's metadata that holds some gain information.
1559 If found, the data will be multiplied with the gain,
1560 and if available, the corresponding unit is returned.
1561 See the `audiometadata.find_key()` function for details.
1562 You can modify the default keys via the `default_gain_keys` list
1563 of the `audiometadata` module.
1564 sep: str
1565 String that separates section names in `gain_key`.
1566 default: None or float
1567 Returned value if no valid gain was found in `metadata`.
1568 default_unit: str
1569 Returned unit if no valid gain was found in `metadata`.
1570 remove: bool
1571 If `True`, remove the found key from `metadata`.
1573 Returns
1574 -------
    fac: float or None
        Gain factor. If not found in metadata, `default` is returned.
    unit: str
        Unit of the data if found in the metadata, otherwise `default_unit`.
1579 """
1580 v, u = get_number_unit(metadata, gain_key, sep, default,
1581 default_unit, remove)
1582 # fix some TeeGrid gains:
1583 if len(u) >= 2 and u[-2:] == '/V':
1584 u = u[:-2]
1585 return v, u
1588def update_gain(metadata, fac, gain_key=default_gain_keys, sep='.'):
1589 """Update gain setting in metadata.
1591 Searches for the first appearance of a gain key in the metadata
1592 hierarchy. If found, divide the gain value by `fac`.
1594 Parameters
1595 ----------
1596 metadata: nested dict
1597 Metadata to be updated.
1598 fac: float
1599 Factor that was used to scale the data.
1600 gain_key: str or list of str
1601 Key in the file's metadata that holds some gain information.
        If found, its value is divided by `fac`.
1604 See the `audiometadata.find_key()` function for details.
1605 You can modify the default keys via the `default_gain_keys` list
1606 of the `audiometadata` module.
1607 sep: str
1608 String that separates section names in `gain_key`.
1610 Returns
1611 -------
1612 done: bool
1613 True if gain has been found and set.
1616 Examples
1617 --------
1619 ```
1620 >>> from audioio import print_metadata, update_gain
1621 >>> md = dict(Artist='John Doe', Recording=dict(gain='1.4mV'))
1622 >>> update_gain(md, 2)
1623 >>> print_metadata(md)
1624 Artist: John Doe
1625 Recording:
1626 gain: 0.70mV
1627 ```
1629 """
1630 if not metadata:
1631 return False
1632 if not isinstance(gain_key, (list, tuple, np.ndarray)):
1633 gain_key = (gain_key,)
1634 for gk in gain_key:
1635 m, k = find_key(metadata, gk, sep)
1636 if k in m and not isinstance(m[k], dict):
1637 vs = m[k]
1638 if isinstance(vs, (int, float)):
1639 m[k] = vs/fac
1640 else:
1641 v, u, n = parse_number(vs)
1642 if not v is None:
1643 # fix some TeeGrid gains:
1644 if len(u) >= 2 and u[-2:] == '/V':
1645 u = u[:-2]
1646 m[k] = f'{v/fac:.{n+1}f}{u}'
1647 return True
1648 return False
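
The rescaling of a gain string can be sketched without the rest of the module. The helper `scale_gain()` and its regular expression are hypothetical; in particular, it is an assumption that the third return value of `parse_number()` is the number of decimals of the parsed number.

```python
import re


def scale_gain(gain, fac):
    """Hypothetical helper: divide a gain such as '1.4mV' by fac."""
    if isinstance(gain, (int, float)):
        return gain / fac
    match = re.match(r'\s*([+-]?\d*\.?\d+)\s*(.*)', gain)
    if match is None:
        return None
    number, unit = match.group(1), match.group(2).strip()
    # number of decimals in the original string:
    decimals = len(number.partition('.')[2])
    # drop a trailing '/V', as done for some TeeGrid gains:
    if unit.endswith('/V'):
        unit = unit[:-2]
    # write one more decimal than the original to limit rounding loss:
    return f'{float(number) / fac:.{decimals + 1}f}{unit}'


print(scale_gain('1.4mV', 2))
```

With the values from the docstring example this prints `0.70mV`.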
1651default_timeref_keys = ['TimeReference']
1652"""Default keys of integer time references in metadata.
1653Used by `update_starttime()` function.
1654"""
1656def set_starttime(metadata, datetime_value,
1657 time_keys=default_starttime_keys):
1658 """Set all start-of-recording times in metadata.
1660 Parameters
1661 ----------
1662 metadata: nested dict
1663 Metadata to be updated.
1664 datetime_value: datetime
1665 Start date and time of the recording.
1666 time_keys: tuple of str or list of tuple of str
        Keys to fields denoting calendar times, i.e. dates and times.
        Datetimes can be stored in metadata as two separate key-value pairs,
        one for the date and one for the time, or as a single key-value pair
        for a date-time value. This is why the keys need to be specified in
        tuples with one or two keys.
1672 Keys may contain section names separated by `sep`.
1673 See `audiometadata.find_key()` for details.
1674 You can modify the default time keys via the `default_starttime_keys`
1675 list of the `audiometadata` module.
1677 Returns
1678 -------
1679 success: bool
1680 True if at least one time has been set.
1682 Example
1683 -------
1684 ```
1685 >>> from audioio import print_metadata, set_starttime
1686 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00',
1687 OtherTime='2023-05-16T23:20:10',
1688 BEXT=dict(OriginationDate='2024-03-02',
1689 OriginationTime='10:42:24'))
1690 >>> set_starttime(md, '2024-06-17T22:10:05')
1691 >>> print_metadata(md)
1692 DateTimeOriginal: 2024-06-17T22:10:05
1693 OtherTime : 2024-06-17T22:10:05
1694 BEXT:
1695 OriginationDate: 2024-06-17
1696 OriginationTime: 22:10:05
1697 ```
1699 """
1700 if not metadata:
1701 return False
1702 if isinstance(datetime_value, str):
1703 datetime_value = dt.datetime.fromisoformat(datetime_value)
1704 success = False
1705 if len(time_keys) > 0 and isinstance(time_keys[0], str):
1706 time_keys = (time_keys,)
1707 for key in time_keys:
1708 if len(key) == 1:
1709 # datetime:
1710 m, k = find_key(metadata, key[0])
1711 if k in m and not isinstance(m[k], dict):
1712 if isinstance(m[k], dt.datetime):
1713 m[k] = datetime_value
1714 else:
1715 m[k] = datetime_value.isoformat(timespec='seconds')
1716 success = True
1717 else:
1718 # separate date and time:
1719 md, kd = find_key(metadata, key[0])
1720 if not kd in md or isinstance(md[kd], dict):
1721 continue
1722 if isinstance(md[kd], dt.date):
1723 md[kd] = datetime_value.date()
1724 else:
1725 md[kd] = datetime_value.date().isoformat()
1726 mt, kt = find_key(metadata, key[1])
1727 if not kt in mt or isinstance(mt[kt], dict):
1728 continue
1729 if isinstance(mt[kt], dt.time):
1730 mt[kt] = datetime_value.time()
1731 else:
1732 mt[kt] = datetime_value.time().isoformat(timespec='seconds')
1733 success = True
1734 return success
1737def update_starttime(metadata, deltat, rate,
1738 time_keys=default_starttime_keys,
1739 ref_keys=default_timeref_keys):
1740 """Update start-of-recording times in metadata.
    Add `deltat` to `time_keys` and `ref_keys` fields in the metadata.
1744 Parameters
1745 ----------
1746 metadata: nested dict
1747 Metadata to be updated.
1748 deltat: float
1749 Time in seconds to be added to start times.
1750 rate: float
1751 Sampling rate of the data in Hertz.
1752 time_keys: tuple of str or list of tuple of str
        Keys to fields denoting calendar times, i.e. dates and times.
        Datetimes can be stored in metadata as two separate key-value pairs,
        one for the date and one for the time, or as a single key-value pair
        for a date-time value. This is why the keys need to be specified in
        tuples with one or two keys.
1758 Keys may contain section names separated by `sep`.
1759 See `audiometadata.find_key()` for details.
1760 You can modify the default time keys via the `default_starttime_keys`
1761 list of the `audiometadata` module.
1762 ref_keys: str or list of str
        Keys to time references, i.e. integers in samples relative to
        a reference time.
1765 Keys may contain section names separated by `sep`.
1766 See `audiometadata.find_key()` for details.
1767 You can modify the default reference keys via the
1768 `default_timeref_keys` list of the `audiometadata` module.
1770 Returns
1771 -------
1772 success: bool
1773 True if at least one time has been updated.
1775 Example
1776 -------
1777 ```
1778 >>> from audioio import print_metadata, update_starttime
1779 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00',
1780 OtherTime='2023-05-16T23:20:10',
1781 BEXT=dict(OriginationDate='2024-03-02',
1782 OriginationTime='10:42:24',
1783 TimeReference=123456))
1784 >>> update_starttime(md, 4.2, 48000)
1785 >>> print_metadata(md)
1786 DateTimeOriginal: 2023-04-15T22:10:04
1787 OtherTime : 2023-05-16T23:20:10
1788 BEXT:
1789 OriginationDate: 2024-03-02
1790 OriginationTime: 10:42:28
1791 TimeReference : 325056
1792 ```
1794 """
1795 if not metadata:
1796 return False
1797 if not isinstance(deltat, dt.timedelta):
1798 deltat = dt.timedelta(seconds=deltat)
1799 success = False
1800 if len(time_keys) > 0 and isinstance(time_keys[0], str):
1801 time_keys = (time_keys,)
1802 for key in time_keys:
1803 if len(key) == 1:
1804 # datetime:
1805 m, k = find_key(metadata, key[0])
1806 if k in m and not isinstance(m[k], dict):
1807 if isinstance(m[k], dt.datetime):
1808 m[k] += deltat
1809 else:
1810 datetime = dt.datetime.fromisoformat(m[k]) + deltat
1811 m[k] = datetime.isoformat(timespec='seconds')
1812 success = True
1813 else:
1814 # separate date and time:
1815 md, kd = find_key(metadata, key[0])
1816 if not kd in md or isinstance(md[kd], dict):
1817 continue
1818 if isinstance(md[kd], dt.date):
1819 date = md[kd]
1820 is_date = True
1821 else:
1822 date = dt.date.fromisoformat(md[kd])
1823 is_date = False
1824 mt, kt = find_key(metadata, key[1])
1825 if not kt in mt or isinstance(mt[kt], dict):
1826 continue
1827 if isinstance(mt[kt], dt.time):
1828 time = mt[kt]
1829 is_time = True
1830 else:
1831 time = dt.time.fromisoformat(mt[kt])
1832 is_time = False
1833 datetime = dt.datetime.combine(date, time) + deltat
1834 md[kd] = datetime.date() if is_date else datetime.date().isoformat()
1835 mt[kt] = datetime.time() if is_time else datetime.time().isoformat(timespec='seconds')
1836 success = True
1837 # time reference in samples:
1838 if isinstance(ref_keys, str):
1839 ref_keys = (ref_keys,)
1840 for key in ref_keys:
1841 m, k = find_key(metadata, key)
1842 if k in m and not isinstance(m[k], dict):
1843 is_int = isinstance(m[k], int)
1844 tref = int(m[k])
1845 tref += int(np.round(deltat.total_seconds()*rate))
1846 m[k] = tref if is_int else f'{tref}'
1847 success = True
1848 return success
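
The arithmetic behind `update_starttime()` can be reproduced with the standard library alone. This is a minimal sketch using the values of the docstring example; the variable names are chosen for illustration only.

```python
import datetime as dt

deltat = dt.timedelta(seconds=4.2)
rate = 48000

# single date-time string: add the shift and write it back
# with a resolution of seconds (fractions are truncated):
start = dt.datetime.fromisoformat('2023-04-15T22:10:00') + deltat
start_str = start.isoformat(timespec='seconds')

# separate date and time strings: combine, shift, and split again:
date = dt.date.fromisoformat('2024-03-02')
time = dt.time.fromisoformat('10:42:24')
combined = dt.datetime.combine(date, time) + deltat
date_str = combined.date().isoformat()
time_str = combined.time().isoformat(timespec='seconds')

# time reference counted in samples: convert the shift to samples:
tref = 123456 + int(round(deltat.total_seconds() * rate))

print(start_str, date_str, time_str, tref)
```

Combining date and time before adding the shift makes sure that a shift across midnight also advances the date.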
1851def bext_history_str(encoding, rate, channels, text=None):
1852 """ Assemble a string for the BEXT CodingHistory field.
1854 Parameters
1855 ----------
1856 encoding: str or None
1857 Encoding of the data.
1858 rate: int or float
1859 Sampling rate in Hertz.
1860 channels: int
1861 Number of channels.
1862 text: str or None
1863 Optional free text.
1865 Returns
1866 -------
1867 s: str
1868 String for the BEXT CodingHistory field,
        something like "A=PCM,F=44100,W=16,M=stereo,T=cut out"
1870 """
1871 codes = []
1872 bits = None
1873 if encoding is not None:
1874 if encoding[:3] == 'PCM':
1875 bits = int(encoding[4:])
1876 encoding = 'PCM'
1877 codes.append(f'A={encoding}')
1878 codes.append(f'F={rate:.0f}')
1879 if bits is not None:
1880 codes.append(f'W={bits}')
1881 mode = None
1882 if channels == 1:
1883 mode = 'mono'
1884 elif channels == 2:
1885 mode = 'stereo'
1886 if mode is not None:
1887 codes.append(f'M={mode}')
1888 if text is not None:
1889 codes.append(f'T={text.rstrip()}')
1890 return ','.join(codes)
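
To see which strings result for typical inputs, the assembly of the coding history can be mirrored in a stand-alone sketch. The function `coding_history()` is a hypothetical copy for illustration, not the library function itself.

```python
def coding_history(encoding, rate, channels, text=None):
    """Hypothetical stand-alone mirror of bext_history_str()."""
    codes = []
    bits = None
    if encoding is not None:
        if encoding.startswith('PCM_'):
            # 'PCM_16' is split into algorithm 'PCM' and word length 16:
            bits = int(encoding[4:])
            encoding = 'PCM'
        codes.append(f'A={encoding}')
    codes.append(f'F={rate:.0f}')
    if bits is not None:
        codes.append(f'W={bits}')
    # a mode is only written for one or two channels:
    mode = {1: 'mono', 2: 'stereo'}.get(channels)
    if mode is not None:
        codes.append(f'M={mode}')
    if text is not None:
        codes.append(f'T={text.rstrip()}')
    return ','.join(codes)


print(coding_history('PCM_16', 44100, 2, 'cut out'))
print(coding_history('FLOAT', 48000.0, 1))
```

The first call yields `A=PCM,F=44100,W=16,M=stereo,T=cut out`, the second one `A=FLOAT,F=48000,M=mono`.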
1893default_history_keys = ['History',
1894 'CodingHistory',
1895 'BWF_CODING_HISTORY']
1896"""Default keys of strings describing coding history in metadata.
1897Used by `add_history()` function.
1898"""
1900def add_history(metadata, history, new_key=None, pre_history=None,
1901 history_keys=default_history_keys, sep='.'):
1902 """Add a string describing coding history to metadata.
1904 Add `history` to the `history_keys` fields in the metadata. If
1905 none of these fields are present but `new_key` is specified, then
1906 assign `pre_history` and `history` to this key. If this key does
1907 not exist in the metadata, it is created.
1909 Parameters
1910 ----------
1911 metadata: nested dict
1912 Metadata to be updated.
1913 history: str
1914 String to be added to the history.
1915 new_key: str or None
1916 Sections and name of a history key to be added to `metadata`.
1917 Section names are separated by `sep`.
1918 pre_history: str or None
1919 If a new key `new_key` is created, then assign this string followed
1920 by `history`.
1921 history_keys: str or list of str
1922 Keys to fields where to add `history`.
1923 Keys may contain section names separated by `sep`.
1924 See `audiometadata.find_key()` for details.
1925 You can modify the default history keys via the `default_history_keys`
1926 list of the `audiometadata` module.
1927 sep: str
1928 String that separates section names in `new_key` and `history_keys`.
1930 Returns
1931 -------
1932 success: bool
        True if the history string has been added to the metadata.
1935 Example
1936 -------
1937 Add string to existing history key-value pair:
1938 ```
1939 >>> from audioio import add_history
1940 >>> md = dict(aaa='xyz', BEXT=dict(CodingHistory='original recordings'))
1941 >>> add_history(md, 'just a snippet')
1942 >>> print(md['BEXT']['CodingHistory'])
1943 original recordings
1944 just a snippet
1945 ```
1947 Assign string to new key-value pair:
1948 ```
1949 >>> md = dict(aaa='xyz', BEXT=dict(OriginationDate='2024-02-12'))
1950 >>> add_history(md, 'just a snippet', 'BEXT.CodingHistory', 'original data')
1951 >>> print(md['BEXT']['CodingHistory'])
1952 original data
1953 just a snippet
1954 ```
1956 """
1957 if not metadata:
1958 return False
1959 if isinstance(history_keys, str):
1960 history_keys = (history_keys,)
1961 success = False
1962 for keys in history_keys:
        m, k = find_key(metadata, keys, sep)
1964 if k in m and not isinstance(m[k], dict):
1965 s = m[k]
1966 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r':
1967 s += '\r\n'
1968 s += history
1969 m[k] = s
1970 success = True
1971 if not success and new_key:
1972 m, k = find_key(metadata, new_key, sep)
1973 m, k = add_sections(m, k, True, sep)
1974 s = ''
1975 if pre_history is not None:
1976 s = pre_history
1977 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r':
1978 s += '\r\n'
1979 s += history
1980 m[k] = s
1981 success = True
1982 return success
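
How a new entry is joined to an existing history string can be shown in a few lines. The helper `append_history()` is hypothetical; it only mirrors the line-ending logic used in `add_history()`.

```python
def append_history(previous, entry):
    """Hypothetical helper: append entry to previous on a new line."""
    # add a line break only if previous is non-empty
    # and does not already end with one:
    if previous and previous[-1] not in '\r\n':
        previous += '\r\n'
    return previous + entry


history = append_history('original recordings', 'just a snippet')
print(repr(history))
```

The separator is a carriage return followed by a line feed, so printing the result shows the two entries on separate lines.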
1985def add_unwrap(metadata, thresh, clip=0, unit=''):
1986 """Add unwrap infos to metadata.
1988 If `audiotools.unwrap()` was applied to the data, then this
1989 function adds relevant infos to the metadata. If there is an INFO
1990 section in the metadata, the unwrap infos are added to this
1991 section, otherwise they are added to the top level of the metadata
1992 hierarchy.
1994 The threshold `thresh` used for unwrapping is saved under the key
1995 'UnwrapThreshold' as a string. If `clip` is larger than zero, then
1996 the clip level is saved under the key 'UnwrapClippedAmplitude' as
1997 a string.
1999 Parameters
2000 ----------
    metadata: nested dict
2002 Metadata to be updated.
2003 thresh: float
2004 Threshold used for unwrapping.
2005 clip: float
2006 Level at which unwrapped data have been clipped.
2007 unit: str
2008 Unit of `thresh` and `clip`.
2010 Examples
2011 --------
2013 ```
2014 >>> from audioio import print_metadata, add_unwrap
2015 >>> md = dict(INFO=dict(Time='early'))
2016 >>> add_unwrap(md, 0.6, 1.0)
2017 >>> print_metadata(md)
2018 INFO:
2019 Time : early
2020 UnwrapThreshold : 0.60
2021 UnwrapClippedAmplitude: 1.00
2022 ```
2024 """
2025 if metadata is None:
2026 return
2027 md = metadata
2028 for k in metadata:
2029 if k.strip().upper() == 'INFO':
            md = metadata[k]  # use the matched key, which may differ from 'INFO' in case or whitespace
2031 break
2032 md['UnwrapThreshold'] = f'{thresh:.2f}{unit}'
2033 if clip > 0:
2034 md['UnwrapClippedAmplitude'] = f'{clip:.2f}{unit}'
2037def demo(file_pathes, list_format, list_metadata, list_cues, list_chunks):
2038 """Print metadata and markers of audio files.
2040 Parameters
2041 ----------
2042 file_pathes: list of str
        Paths of audio files.
2044 list_format: bool
2045 If True, list file format only.
2046 list_metadata: bool
2047 If True, list metadata only.
2048 list_cues: bool
2049 If True, list markers/cues only.
2050 list_chunks: bool
2051 If True, list all chunks contained in a riff/wave file.
2052 """
2053 from .audioloader import AudioLoader
2054 from .audiomarkers import print_markers
2055 from .riffmetadata import read_chunk_tags
2056 for filepath in file_pathes:
2057 if len(file_pathes) > 1 and (list_cues or list_metadata or
2058 list_format or list_chunks):
2059 print(filepath)
2060 if list_chunks:
2061 chunks = read_chunk_tags(filepath)
2062 print(f' {"chunk tag":10s} {"position":10s} {"size":10s}')
2063 for tag in chunks:
2064 pos = chunks[tag][0] - 8
2065 size = chunks[tag][1] + 8
2066 print(f' {tag:9s} {pos:10d} {size:10d}')
2067 if len(file_pathes) > 1:
2068 print()
2069 continue
2070 with AudioLoader(filepath, 1, 0, verbose=0) as sf:
2071 fmt_md = sf.format_dict()
2072 meta_data = sf.metadata()
2073 locs, labels = sf.markers()
2074 if list_cues:
2075 if len(locs) > 0:
2076 print_markers(locs, labels)
2077 elif list_metadata:
2078 print_metadata(meta_data, replace='.')
2079 elif list_format:
2080 print_metadata(fmt_md)
2081 else:
2082 print('file:')
2083 print_metadata(fmt_md, ' ')
2084 if len(meta_data) > 0:
2085 print()
2086 print('metadata:')
2087 print_metadata(meta_data, ' ', replace='.')
2088 if len(locs) > 0:
2089 print()
2090 print('markers:')
2091 print_markers(locs, labels)
2092 if len(file_pathes) > 1:
2093 print()
2094 if len(file_pathes) > 1:
2095 print()
2098def main(*cargs):
2099 """Call demo with command line arguments.
2101 Parameters
2102 ----------
2103 cargs: list of strings
2104 Command line arguments as provided by sys.argv[1:]
2105 """
2106 # command line arguments:
2107 parser = argparse.ArgumentParser(add_help=True,
                                     description='Print metadata and markers of audio files.',
2109 epilog=f'version {__version__} by Benda-Lab (2020-{__year__})')
2110 parser.add_argument('--version', action='version', version=__version__)
2111 parser.add_argument('-f', dest='dataformat', action='store_true',
2112 help='list file format only')
2113 parser.add_argument('-m', dest='metadata', action='store_true',
2114 help='list metadata only')
2115 parser.add_argument('-c', dest='cues', action='store_true',
2116 help='list cues/markers only')
2117 parser.add_argument('-t', dest='chunks', action='store_true',
2118 help='list tags of all riff/wave chunks contained in the file')
2119 parser.add_argument('files', type=str, nargs='+',
2120 help='audio file')
2121 if len(cargs) == 0:
2122 cargs = None
2123 args = parser.parse_args(cargs)
2125 demo(args.files, args.dataformat, args.metadata, args.cues, args.chunks)
2128if __name__ == "__main__":
2129 main(*sys.argv[1:])