Coverage for src / audioio / audiometadata.py: 99%
563 statements
coverage.py v7.13.1, created at 2026-01-09 17:53 +0000
"""Working with metadata.

To interface the various ways metadata are stored in audio files, the
`audioio` package uses nested dictionaries. The keys are always
strings. Values are strings, integers, floats, datetimes, or other
types. Value strings can also be numbers followed by a unit,
e.g. "4.2mV". For defining subsections of key-value pairs, values can
be dictionaries. The dictionaries can be nested to arbitrary depth.

```py
>>> from audioio import print_metadata
>>> md = dict(Recording=dict(Experimenter='John Doe',
                             DateTimeOriginal='2023-10-01T14:10:02',
                             Count=42),
              Hardware=dict(Amplifier='Teensy_Amp 4.1',
                            Highpass='10Hz',
                            Gain='120mV'))
>>> print_metadata(md)
```
results in
```txt
Recording:
    Experimenter    : John Doe
    DateTimeOriginal: 2023-10-01T14:10:02
    Count           : 42
Hardware:
    Amplifier: Teensy_Amp 4.1
    Highpass : 10Hz
    Gain     : 120mV
```

Often, audio files have very specific ways to store metadata. You can
enforce using these by putting them into a dictionary that is added to
the metadata with a key having the name of the metadata type you want,
e.g. the "INFO", "BEXT", "iXML", and "GUAN" chunks of RIFF/WAVE files.
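
As a minimal sketch, assuming a RIFF/WAVE target: such a type-specific
section is just another sub-dictionary whose key is the chunk
name. The field names are borrowed from the command line example in
this docstring and are illustrative only; which fields a chunk
actually accepts depends on the file format.

```python
# Generic metadata plus one section addressed to the RIFF "INFO" chunk.
md = dict(Recording=dict(Experimenter='John Doe'),
          INFO=dict(Gain='165.00mV',
                    DateTimeOriginal='2023-10-01T14:10:02'))

# The chunk name is an ordinary top-level key:
info = md['INFO']
print(sorted(info))
```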

## Functions

The `audiometadata` module provides functions for handling and
manipulating these nested dictionaries. Many functions take keys as
arguments for finding or setting specific key-value pairs. These keys
can be the key of a specific item of a (sub-) dictionary, regardless of
the level of the metadata hierarchy it is on. For example, simply
searching for "Highpass" retrieves the corresponding value "10Hz",
although "Highpass" is contained in the sub-dictionary (or "section")
with key "Hardware". The same item can also be specified together with
its parent keys: "Hardware.Highpass". Parent keys (or section keys)
are by default separated by '.', but all functions have a `sep`
keyword that specifies the string separating section names in
keys. Key matching is case insensitive.

Since the same items are named by many different keys in the different
types of metadata data models, the functions also take lists of keys
as arguments.
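
The lookup rules described above can be sketched in a few lines. The
helpers `lookup()` and `lookup_first()` are hypothetical names used
for illustration only; they are not part of `audioio` and only mimic
the documented behavior (case-insensitive matching on any level,
optional parent sections, and a list of alternative keys).

```python
def lookup(md, key, sep='.'):
    # Hypothetical helper: return the value stored under `key`,
    # matching case-insensitively on any level of the hierarchy.
    parts = [p.strip().upper() for p in key.split(sep)]

    def search(d, parts):
        # try to match the first part on this level:
        for k, v in d.items():
            if k.upper() == parts[0]:
                if len(parts) == 1:
                    return True, v
                if isinstance(v, dict):
                    return search(v, parts[1:])
        # otherwise descend into the sub-sections:
        for v in d.values():
            if isinstance(v, dict):
                found, value = search(v, parts)
                if found:
                    return True, value
        return False, None

    return search(md, parts)[1]


def lookup_first(md, keys, default=None):
    # Return the value of the first key in `keys` that is found.
    for key in keys:
        value = lookup(md, key)
        if value is not None:
            return value
    return default


md = dict(Recording=dict(Experimenter='John Doe'),
          Hardware=dict(Amplifier='Teensy_Amp 4.1', Highpass='10Hz'))
print(lookup(md, 'Highpass'))                    # 10Hz
print(lookup(md, 'hardware.HIGHPASS'))           # 10Hz
print(lookup_first(md, ['Cutoff', 'Highpass']))  # 10Hz
```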

Do not forget that you can easily manipulate the metadata by means of
the standard functions of dictionaries.
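
For instance, the following sketch uses nothing but built-in `dict`
methods to add, modify, and remove entries. The section and key names
are made up for the example.

```python
md = dict(Recording=dict(Experimenter='John Doe', Count=42),
          Hardware=dict(Highpass='10Hz'))

md['Hardware']['Gain'] = '120mV'                    # add a key-value pair
md['Recording'].update(Count=43)                    # modify a value
md.setdefault('Comments', {})['Weather'] = 'sunny'  # add a section on demand
experimenter = md['Recording'].pop('Experimenter')  # remove and return

for section, items in md.items():
    print(section, list(items))
```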

If you need to make a copy of the metadata use `deepcopy`:
```
from copy import deepcopy
md_orig = deepcopy(md)
```
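
The reason for `deepcopy` is that a plain `dict(md)` copies the top
level only, so nested sections stay shared with the original. A short
demonstration with a made-up one-section dictionary:

```python
from copy import deepcopy

md = dict(Hardware=dict(Gain='120mV'))
shallow = dict(md)       # copies the top level only
deep = deepcopy(md)      # copies all nested sections as well

md['Hardware']['Gain'] = '60mV'
print(shallow['Hardware']['Gain'])   # 60mV, the section is shared with md
print(deep['Hardware']['Gain'])      # 120mV, independent copy
```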

### Output

Write nested dictionaries as texts:

- `write_metadata_text()`: write metadata into a text/yaml file.
- `print_metadata()`: write metadata to standard output.

### Flatten

Conversion between nested and flat dictionaries:

- `flatten_metadata()`: flatten hierarchical metadata to a single dictionary.
- `unflatten_metadata()`: unflatten a previously flattened metadata dictionary.
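
The idea behind the two conversions can be sketched as follows. The
functions `flatten()` and `unflatten()` are simplified stand-ins
written for this example, not the module's own implementation; they
always keep the section names in the flat keys, which is what makes
the round trip possible.

```python
def flatten(md, sep='.', prefix=''):
    # Nested dict -> flat dict whose keys carry the section names.
    flat = {}
    for k, v in md.items():
        if isinstance(v, dict):
            flat.update(flatten(v, sep, prefix + k + sep))
        else:
            flat[prefix + k] = v
    return flat


def unflatten(flat, sep='.'):
    # Rebuild the sections from the separator-joined keys.
    md = {}
    for key, v in flat.items():
        *sections, name = key.split(sep)
        d = md
        for s in sections:
            d = d.setdefault(s, {})
        d[name] = v
    return md


md = dict(aaaa=2, bbbb=dict(ccc=3, eee=dict(hh=5)))
flat = flatten(md)
print(flat)   # {'aaaa': 2, 'bbbb.ccc': 3, 'bbbb.eee.hh': 5}
print(unflatten(flat) == md)   # True
```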

### Parse numbers with units

- `parse_number()`: parse string with number and unit.
- `change_unit()`: scale numerical value to a new unit.
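
To illustrate what parsing and rescaling mean here, the following
sketch splits a value string into number and unit and converts it by
the ratio of two SI prefix factors. The helper names and the tiny
prefix table are assumptions made for this example; the module's own
functions handle many more prefixes and special units.

```python
# Deliberately small prefix table; an assumption for this sketch only.
PREFIXES = {'k': 1e3, 'm': 1e-3, 'u': 1e-6}


def split_number_unit(s):
    # Hypothetical helper: split '4.2mV' into (4.2, 'mV').
    s = s.strip()
    n = 0
    while n < len(s) and s[n] in '0123456789.+-':
        n += 1
    if n == 0:
        return None, s
    return float(s[:n]), s[n:].strip()


def factor(unit):
    # Scale factor of a one-letter prefix; 1.0 if there is none.
    if len(unit) > 1 and unit[0] in PREFIXES:
        return PREFIXES[unit[0]]
    return 1.0


value, unit = split_number_unit('4.2mV')
print(value, unit)                           # 4.2 mV
# old factor divided by new factor, here milli to micro:
print(value * factor(unit) / factor('uV'))   # about 4200
```

Note that a one-letter prefix test like this would misread units such
as "min"; special units need to be handled separately.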

### Find and get values

Find keys and get their values parsed and converted to various types:

- `find_key()`: find dictionary in metadata hierarchy containing the specified key.
- `get_number_unit()`: find a key in metadata and return its number and unit.
- `get_number()`: find a key in metadata and return its value in a given unit.
- `get_int()`: find a key in metadata and return its integer value.
- `get_bool()`: find a key in metadata and return its boolean value.
- `get_datetime()`: find keys in metadata and return a datetime.
- `get_str()`: find a key in metadata and return its string value.
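
Datetimes deserve a remark, because some metadata formats store the
date and the time under two separate keys. The sketch below shows,
with the standard library only, how two such ISO-formatted entries
combine into a single `datetime`. The key names are taken from
`default_starttime_keys`; the values are made up.

```python
import datetime as dt

md = dict(OriginationDate='2023-10-01', OriginationTime='14:10:02')

date = dt.date.fromisoformat(md['OriginationDate'])
time = dt.time.fromisoformat(md['OriginationTime'])
start = dt.datetime.combine(date, time)
print(start.isoformat())   # 2023-10-01T14:10:02

# A single combined entry parses directly to the same datetime:
same = dt.datetime.fromisoformat('2023-10-01T14:10:02')
print(same == start)       # True
```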

### Organize metadata

Add and remove metadata:

- `strlist_to_dict()`: convert list of key-value-pair strings to dictionary.
- `add_sections()`: add sections to metadata dictionary.
- `set_metadata()`: set values of existing metadata.
- `add_metadata()`: add or modify key-value pairs.
- `move_metadata()`: remove a key from metadata and add it to a dictionary.
- `remove_metadata()`: remove key-value pairs or sections from metadata.
- `cleanup_metadata()`: remove empty sections from metadata.

### Special metadata fields

Retrieve and set specific metadata:

- `get_gain()`: get gain and unit from metadata.
- `update_gain()`: update gain setting in metadata.
- `set_starttime()`: set all start-of-recording times in metadata.
- `update_starttime()`: update start-of-recording times in metadata.
- `bext_history_str()`: assemble a string for the BEXT CodingHistory field.
- `add_history()`: add a string describing coding history to metadata.
- `add_unwrap()`: add unwrap information to metadata.

Lists of standard keys:

- `default_starttime_keys`: keys of times of start of the recording.
- `default_timeref_keys`: keys of integer time references.
- `default_gain_keys`: keys of gain settings.
- `default_history_keys`: keys of strings describing coding history.


## Command line script

The module can be run as a script from the command line to display the
metadata and markers contained in an audio file:

```sh
> audiometadata logger.wav
```
prints
```text
file:
    filepath    : logger.wav
    samplingrate: 96000Hz
    channels    : 16
    frames      : 17280000
    duration    : 180.000s

metadata:
    INFO:
        Bits            : 32
        Pins            : 1-CH2R,1-CH2L,1-CH3R,1-CH3L,2-CH2R,2-CH2L,2-CH3R,2-CH3L,3-CH2R,3-CH2L,3-CH3R,3-CH3L,4-CH2R,4-CH2L,4-CH3R,4-CH3L
        Gain            : 165.00mV
        uCBoard         : Teensy 4.1
        MACAdress       : 04:e9:e5:15:3e:95
        DateTimeOriginal: 2023-10-01T14:10:02
        Software        : TeeGrid R4-senors-logger v1.0
```


Alternatively, the script can be run from within the audioio source tree as:
```
python -m src.audioio.audiometadata audiofile.wav
```

Running
```sh
audiometadata --help
```
prints
```text
usage: audiometadata [-h] [--version] [-f] [-m] [-c] [-t] files [files ...]

Convert audio file formats.

positional arguments:
  files       audio file

options:
  -h, --help  show this help message and exit
  --version   show program's version number and exit
  -f          list file format only
  -m          list metadata only
  -c          list cues/markers only
  -t          list tags of all riff/wave chunks contained in the file

version 2.0.0 by Benda-Lab (2020-2024)
```

"""

import os
import sys
import glob
import argparse
import numpy as np
import datetime as dt

from .version import __version__, __year__


def write_metadata_text(fh, meta, prefix='', indent=4, replace=None):
    """Write meta data into a text/yaml file or stream.

    With the default parameters, the output is a valid yaml file.

    Parameters
    ----------
    fh: filename or stream
        If not a stream, the file with name `fh` is opened.
        Otherwise `fh` is used as a stream for writing.
    meta: nested dict
        Key-value pairs of metadata to be written into the file.
    prefix: str
        This string is written at the beginning of each line.
    indent: int
        Number of characters used for indentation of sections.
    replace: char or None
        If specified, replace special characters by this character.

    Examples
    --------
    ```
    from audioio import write_metadata_text
    md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)))
    write_metadata_text('info.txt', md)
    ```
    """

    def write_dict(df, md, level, smap):
        # width of the longest key that is not a section:
        w = 0
        for k in md:
            if not isinstance(md[k], dict) and w < len(k):
                w = len(k)
        for k in md:
            clevel = level*indent
            if isinstance(md[k], dict):
                df.write(f'{prefix}{"":>{clevel}}{k}:\n')
                write_dict(df, md[k], level+1, smap)
            else:
                value = md[k]
                if isinstance(value, (list, tuple)):
                    value = ', '.join([f'{v}' for v in value])
                else:
                    value = f'{value}'
                # keep each value on a single line:
                value = value.replace('\r\n', r'\n')
                value = value.replace('\n', r'\n')
                if len(smap) > 0:
                    value = value.translate(smap)
                df.write(f'{prefix}{"":>{clevel}}{k:<{w}}: {value}\n')

    if not meta:
        return
    if hasattr(fh, 'write'):
        own_file = False
    else:
        own_file = True
        fh = open(fh, 'w')
    smap = {}
    if replace:
        smap = str.maketrans('\r\n\t\x00', ''.join([replace]*4))
    write_dict(fh, meta, 0, smap)
    if own_file:
        fh.close()


def print_metadata(meta, prefix='', indent=4, replace=None):
    """Write meta data to standard output.

    Parameters
    ----------
    meta: nested dict
        Key-value pairs of metadata to be printed.
    prefix: str
        This string is written at the beginning of each line.
    indent: int
        Number of characters used for indentation of sections.
    replace: char or None
        If specified, replace special characters by this character.

    Examples
    --------
    ```
    >>> from audioio import print_metadata
    >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            hh: 5
    iiii:
        jjj: 6
    ```
    """
    write_metadata_text(sys.stdout, meta, prefix, indent, replace)


def flatten_metadata(md, keep_sections=False, sep='.'):
    """Flatten hierarchical metadata to a single dictionary.

    Parameters
    ----------
    md: nested dict
        Metadata as returned by `metadata()`.
    keep_sections: bool
        If `True`, then prefix keys with section names, separated by `sep`.
    sep: str
        String for separating section names.

    Returns
    -------
    d: dict
        Non-nested dict containing all key-value pairs of `md`.

    Examples
    --------
    ```
    >>> from audioio import print_metadata, flatten_metadata
    >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            hh: 5
    iiii:
        jjj: 6

    >>> fmd = flatten_metadata(md, keep_sections=True)
    >>> print_metadata(fmd)
    aaaa       : 2
    bbbb.ccc   : 3
    bbbb.ddd   : 4
    bbbb.eee.hh: 5
    iiii.jjj   : 6
    ```
    """
    def flatten(cd, section):
        df = {}
        for k in cd:
            if isinstance(cd[k], dict):
                df.update(flatten(cd[k], section + k + sep))
            else:
                if keep_sections:
                    df[section+k] = cd[k]
                else:
                    df[k] = cd[k]
        return df

    return flatten(md, '')


def unflatten_metadata(md, sep='.'):
    """Unflatten a previously flattened metadata dictionary.

    Parameters
    ----------
    md: dict
        Flat dictionary with key-value pairs as obtained from
        `flatten_metadata()` with `keep_sections=True`.
    sep: str
        String that separates section names.

    Returns
    -------
    d: nested dict
        Hierarchical dictionary with sub-dictionaries and key-value pairs.

    Examples
    --------
    ```
    >>> from audioio import print_metadata, unflatten_metadata
    >>> fmd = {'aaaa': 2, 'bbbb.ccc': 3, 'bbbb.ddd': 4, 'bbbb.eee.hh': 5, 'iiii.jjj': 6}
    >>> print_metadata(fmd)
    aaaa       : 2
    bbbb.ccc   : 3
    bbbb.ddd   : 4
    bbbb.eee.hh: 5
    iiii.jjj   : 6

    >>> md = unflatten_metadata(fmd)
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            hh: 5
    iiii:
        jjj: 6
    ```
    """
    umd = {}       # unflattened metadata
    cmd = [umd]    # current metadata dicts for each level of the hierarchy
    csk = []       # current section keys
    for k in md:
        ks = k.split(sep)
        # go up the hierarchy:
        for i in range(len(csk) - len(ks)):
            csk.pop()
            cmd.pop()
        for kss in reversed(ks[:len(csk)]):
            if kss == csk[-1]:
                break
            csk.pop()
            cmd.pop()
        # add new sections:
        for kss in ks[len(csk):-1]:
            csk.append(kss)
            cmd[-1][kss] = {}
            cmd.append(cmd[-1][kss])
        # add key-value pair:
        cmd[-1][ks[-1]] = md[k]
    return umd


def parse_number(s):
    """Parse string with number and unit.

    Parameters
    ----------
    s: str, float, or int
        String to be parsed. The initial part of the string is
        expected to be a number, the part following the number is
        interpreted as the unit. If float or int, then return this
        as the value with empty unit.

    Returns
    -------
    v: None, int, or float
        Value of the string as float. Without decimal point, an int is returned.
        If the string does not contain a number, None is returned.
    u: str
        Unit that follows the initial number.
    n: int
        Number of digits behind the decimal point.

    Examples
    --------

    ```
    >>> from audioio import parse_number

    # integer:
    >>> parse_number('42')
    (42, '', 0)

    # integer with unit:
    >>> parse_number('42ms')
    (42, 'ms', 0)

    # float with unit:
    >>> parse_number('42.ms')
    (42.0, 'ms', 0)

    # float with unit:
    >>> parse_number('42.3ms')
    (42.3, 'ms', 1)

    # float with space and unit:
    >>> parse_number('423.17 Hz')
    (423.17, 'Hz', 2)
    ```

    """
    if not isinstance(s, str):
        if isinstance(s, int):
            return s, '', 0
        if isinstance(s, float):
            return s, '', 5
        else:
            return None, '', 0
    n = len(s)
    ip = n
    have_point = False
    for i in range(len(s)):
        if s[i] == '.':
            if have_point:
                n = i
                break
            have_point = True
            ip = i + 1
        if s[i] not in '0123456789.+-':
            n = i
            break
    if n == 0:
        return None, s, 0
    v = float(s[:n]) if have_point else int(s[:n])
    u = s[n:].strip()
    nd = n - ip if n >= ip else 0
    return v, u, nd


unit_prefixes = {'Deka': 1e1, 'deka': 1e1, 'Hekto': 1e2, 'hekto': 1e2,
                 'kilo': 1e3, 'Kilo': 1e3, 'Mega': 1e6, 'mega': 1e6,
                 'Giga': 1e9, 'giga': 1e9, 'Tera': 1e12, 'tera': 1e12,
                 'Peta': 1e15, 'peta': 1e15, 'Exa': 1e18, 'exa': 1e18,
                 'Dezi': 1e-1, 'dezi': 1e-1, 'Zenti': 1e-2, 'centi': 1e-2,
                 'Milli': 1e-3, 'milli': 1e-3, 'Micro': 1e-6, 'micro': 1e-6,
                 'Nano': 1e-9, 'nano': 1e-9, 'Piko': 1e-12, 'piko': 1e-12,
                 'Femto': 1e-15, 'femto': 1e-15, 'Atto': 1e-18, 'atto': 1e-18,
                 'da': 1e1, 'h': 1e2, 'K': 1e3, 'k': 1e3, 'M': 1e6,
                 'G': 1e9, 'T': 1e12, 'P': 1e15, 'E': 1e18,
                 'd': 1e-1, 'c': 1e-2, 'mu': 1e-6, 'u': 1e-6, 'm': 1e-3,
                 'n': 1e-9, 'p': 1e-12, 'f': 1e-15, 'a': 1e-18}
""" SI prefixes for units with corresponding factors. """


def change_unit(val, old_unit, new_unit):
    """Scale numerical value to a new unit.

    Adapted from https://github.com/relacs/relacs/blob/1facade622a80e9f51dbf8e6f8171ac74c27f100/options/src/parameter.cc#L1647-L1703

    Parameters
    ----------
    val: float
        Value given in `old_unit`.
    old_unit: str
        Unit of `val`.
    new_unit: str
        Requested unit of return value.

    Returns
    -------
    new_val: float
        The input value `val` scaled to `new_unit`.
        If one of the units is missing (and the other one is not '%'),
        `val` is returned unchanged.

    Examples
    --------

    ```
    >>> from audioio import change_unit
    >>> change_unit(5, 'mm', 'cm')
    0.5

    >>> change_unit(5, '', 'cm')
    5

    >>> change_unit(5, 'mm', '')
    5

    >>> change_unit(5, 'cm', 'mm')
    50.0

    >>> change_unit(4, 'kg', 'g')
    4000.0

    >>> change_unit(12, '%', '')
    0.12

    >>> change_unit(1.24, '', '%')
    124.0

    >>> change_unit(2.5, 'min', 's')
    150.0

    >>> change_unit(3600, 's', 'h')
    1.0

    ```

    """
    # missing unit?
    if not old_unit and not new_unit:
        return val
    if not old_unit and new_unit != '%':
        return val
    if not new_unit and old_unit != '%':
        return val

    # special units that directly translate into factors:
    unit_factors = {'%': 0.01, 'hour': 60.0*60.0, 'h': 60.0*60.0, 'min': 60.0}

    # parse old unit:
    f1 = 1.0
    if old_unit in unit_factors:
        f1 = unit_factors[old_unit]
    else:
        for k in unit_prefixes:
            if len(old_unit) > len(k) and old_unit[:len(k)] == k:
                f1 = unit_prefixes[k]

    # parse new unit:
    f2 = 1.0
    if new_unit in unit_factors:
        f2 = unit_factors[new_unit]
    else:
        for k in unit_prefixes:
            if len(new_unit) > len(k) and new_unit[:len(k)] == k:
                f2 = unit_prefixes[k]

    return val*f1/f2


def find_key(metadata, key, sep='.'):
    """Find dictionary in metadata hierarchy containing the specified key.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    key: str
        Key to be searched for (case insensitive).
        May contain section names separated by `sep`, i.e.
        "aaa.bbb.ccc" searches "ccc" (can be key-value pair or section)
        in section "bbb" that needs to be a subsection of section "aaa".
    sep: str
        String that separates section names in `key`.

    Returns
    -------
    md: dict
        The innermost dictionary matching some sections of the search key.
        If `key` is not at all contained in the metadata,
        the top-level dictionary is returned.
    key: str
        The part of the search key that was not found in `md`, or
        the final part of the search key, if it was found in `md`.

    Examples
    --------

    Independent of whether found or not found, you can assign to the
    returned dictionary with the returned key.

    ```
    >>> from audioio import print_metadata, find_key
    >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(ff=5)), gggg=dict(hhh=6))
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 4
        eee:
            ff: 5
    gggg:
        hhh: 6

    >>> m, k = find_key(md, 'bbbb.ddd')
    >>> m[k] = 10
    >>> print_metadata(md)
    aaaa: 2
    bbbb:
        ccc: 3
        ddd: 10
    ...

    >>> m, k = find_key(md, 'hhh')
    >>> m[k] = 12
    >>> print_metadata(md)
    ...
    gggg:
        hhh: 12

    >>> m, k = find_key(md, 'bbbb.eee.xx')
    >>> m[k] = 42
    >>> print_metadata(md)
    ...
        eee:
            ff: 5
            xx: 42
    ...
    ```

    When searching for sections, the dictionary containing the searched
    section is returned:
    ```py
    >>> m, k = find_key(md, 'eee')
    >>> m[k]['yy'] = 46
    >>> print_metadata(md)
    ...
        eee:
            ff: 5
            xx: 42
            yy: 46
    ...
    ```

    """
    def find_keys(metadata, keys):
        key = keys[0].strip().upper()
        for k in metadata:
            if k.upper() == key:
                if len(keys) == 1:
                    # found key:
                    return True, metadata, k
                elif isinstance(metadata[k], dict):
                    # keep searching within the next section:
                    return find_keys(metadata[k], keys[1:])
        # search in subsections:
        for k in metadata:
            if isinstance(metadata[k], dict):
                found, mm, kk = find_keys(metadata[k], keys)
                if found:
                    return True, mm, kk
        # nothing found:
        return False, metadata, sep.join(keys)

    if metadata is None:
        return {}, None
    ks = key.strip().split(sep)
    found, mm, kk = find_keys(metadata, ks)
    return mm, kk


def get_number_unit(metadata, keys, sep='.', default=None,
                    default_unit='', remove=False):
    """Find a key in metadata and return its number and unit.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `key`.
    default: None, int, or float
        Returned value if `key` is not found or the value does
        not contain a number.
    default_unit: str
        Returned unit if `key` is not found or the key's value does
        not have a unit.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None, int, or float
        Value referenced by `key` as float.
        Without decimal point, an int is returned.
        If none of the `keys` was found or
        the key's value does not contain a number,
        then `default` is returned.
    u: str
        Corresponding unit.

    Examples
    --------

    ```
    >>> from audioio import get_number_unit
    >>> md = dict(aaaa='42', bbbb='42.3ms')

    # integer:
    >>> get_number_unit(md, 'aaaa')
    (42, '')

    # float with unit:
    >>> get_number_unit(md, 'bbbb')
    (42.3, 'ms')

    # two keys:
    >>> get_number_unit(md, ['cccc', 'bbbb'])
    (42.3, 'ms')

    # not found:
    >>> get_number_unit(md, 'cccc')
    (None, '')

    # not found with default value:
    >>> get_number_unit(md, 'cccc', default=1.0, default_unit='a.u.')
    (1.0, 'a.u.')
    ```

    """
    if not metadata:
        return default, default_unit
    if not isinstance(keys, (list, tuple, np.ndarray)):
        keys = (keys,)
    value = default
    unit = default_unit
    for key in keys:
        m, k = find_key(metadata, key, sep)
        if k in m:
            v, u, _ = parse_number(m[k])
            if v is not None:
                if not u:
                    u = default_unit
                if remove:
                    del m[k]
                return v, u
            elif u and unit == default_unit:
                unit = u
    return value, unit


def get_number(metadata, unit, keys, sep='.', default=None, remove=False):
    """Find a key in metadata and return its value in a given unit.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    unit: str
        Unit in which to return numerical value referenced by one of the `keys`.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `key`.
    default: None, int, or float
        Returned value if `key` is not found or the value does
        not contain a number.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or float
        Value referenced by `key` as float scaled to `unit`.
        If none of the `keys` was found or
        the key's value does not contain a number,
        then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_number
    >>> md = dict(aaaa='42', bbbb='42.3ms')

    # milliseconds to seconds:
    >>> get_number(md, 's', 'bbbb')
    0.0423

    # milliseconds to microseconds:
    >>> get_number(md, 'us', 'bbbb')
    42300.0

    # value without unit is not scaled:
    >>> get_number(md, 'Hz', 'aaaa')
    42.0

    # two keys:
    >>> get_number(md, 's', ['cccc', 'bbbb'])
    0.0423

    # not found:
    >>> get_number(md, 's', 'cccc')
    None

    # not found with default value:
    >>> get_number(md, 's', 'cccc', default=1.0)
    1.0
    ```

    """
    v, u = get_number_unit(metadata, keys, sep, None, unit, remove)
    if v is None:
        return default
    else:
        return change_unit(v, u, unit)


def get_int(metadata, keys, sep='.', default=None, remove=False):
    """Find a key in metadata and return its integer value.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `key`.
    default: None or int
        Return value if `key` is not found or the value does
        not contain an integer.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or int
        Value referenced by `key` as integer.
        If none of the `keys` was found,
        the key's value does not contain a number or represents
        a floating point value, then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_int
    >>> md = dict(aaaa='42', bbbb='42.3ms')

    # integer:
    >>> get_int(md, 'aaaa')
    42

    # two keys:
    >>> get_int(md, ['cccc', 'aaaa'])
    42

    # float:
    >>> get_int(md, 'bbbb')
    None

    # not found:
    >>> get_int(md, 'cccc')
    None

    # not found with default value:
    >>> get_int(md, 'cccc', default=0)
    0
    ```

    """
    if not metadata:
        return default
    if not isinstance(keys, (list, tuple, np.ndarray)):
        keys = (keys,)
    for key in keys:
        m, k = find_key(metadata, key, sep)
        if k in m:
            v, _, n = parse_number(m[k])
            if v is not None and n == 0:
                if remove:
                    del m[k]
                return int(v)
    return default


def get_bool(metadata, keys, sep='.', default=None, remove=False):
    """Find a key in metadata and return its boolean value.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `key`.
    default: None or bool
        Return value if `key` is not found or the value does
        not specify a boolean value.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or bool
        Value referenced by `key` as boolean.
        True if 'true', 't', 'yes', 'y' (case insensitive) or any non-zero number.
        False if 'false', 'f', 'no', 'n' (case insensitive) or any number equal to zero.
        If none of the `keys` was found or
        the key's value does not specify a boolean value,
        then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_bool
    >>> md = dict(aaaa='TruE', bbbb='No', cccc=0, dddd=1, eeee=True, ffff='ui')

    # case insensitive:
    >>> get_bool(md, 'aaaa')
    True

    >>> get_bool(md, 'bbbb')
    False

    >>> get_bool(md, 'cccc')
    False

    >>> get_bool(md, 'dddd')
    True

    >>> get_bool(md, 'eeee')
    True

    # not a boolean value:
    >>> get_bool(md, 'ffff')
    None

    # two keys (string is preferred over number):
    >>> get_bool(md, ['cccc', 'aaaa'])
    True

    # two keys (take first match):
    >>> get_bool(md, ['cccc', 'ffff'])
    False

    # not a boolean value, with default value:
    >>> get_bool(md, 'ffff', default=False)
    False
    ```

    """
    if not metadata:
        return default
    if not isinstance(keys, (list, tuple, np.ndarray)):
        keys = (keys,)
    val = default
    mv = None
    kv = None
    for key in keys:
        m, k = find_key(metadata, key, sep)
        if k in m and not isinstance(m[k], dict):
            vs = m[k]
            v, _, _ = parse_number(vs)
            if v is not None:
                # remember a numerical value, but keep looking for a string:
                val = abs(v) > 1e-8
                mv = m
                kv = k
            elif isinstance(vs, str):
                if vs.upper() in ['TRUE', 'T', 'YES', 'Y']:
                    if remove:
                        del m[k]
                    return True
                if vs.upper() in ['FALSE', 'F', 'NO', 'N']:
                    if remove:
                        del m[k]
                    return False
    if mv is not None and kv is not None and remove:
        del mv[kv]
    return val


default_starttime_keys = [['DateTimeOriginal'],
                          ['OriginationDate', 'OriginationTime'],
                          ['Location_Time'],
                          ['Timestamp']]
"""Default keys of times of start of the recording in metadata.
Used by `get_datetime()` and `update_starttime()` functions.
"""

def get_datetime(metadata, keys=default_starttime_keys,
                 sep='.', default=None, remove=False):
    """Find keys in metadata and return a datetime.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: tuple of str or list of tuple of str
        Datetimes can be stored in metadata as two separate key-value pairs,
        one for the date and one for the time, or as a single key-value pair
        for a date-time value. This is why the keys need to be specified in
        tuples with one or two keys.
        The value of the first tuple of keys found is returned.
        Keys may contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
        The default values for the `keys` find the start time of a recording.
        You can modify the default keys via the `default_starttime_keys` list
        of the `audiometadata` module.
    sep: str
        String that separates section names in `key`.
    default: None or datetime
        Return value if none of the `keys` is found or the value is
        neither a datetime nor a string.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or datetime
        Datetime referenced by `keys`.
        If none of the `keys` was found, then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_datetime
    >>> import datetime as dt
    >>> md = dict(date='2024-03-02', time='10:42:24',
                  datetime='2023-04-15T22:10:00')

    # separate date and time:
    >>> get_datetime(md, ('date', 'time'))
    datetime.datetime(2024, 3, 2, 10, 42, 24)

    # single datetime:
    >>> get_datetime(md, ('datetime',))
    datetime.datetime(2023, 4, 15, 22, 10)

    # two alternative key tuples:
    >>> get_datetime(md, [('aaaa',), ('date', 'time')])
    datetime.datetime(2024, 3, 2, 10, 42, 24)

    # not found:
    >>> get_datetime(md, ('cccc',))
    None

    # not found with default value:
    >>> get_datetime(md, ('cccc', 'dddd'),
                     default=dt.datetime(2022, 2, 22, 22, 2, 12))
    datetime.datetime(2022, 2, 22, 22, 2, 12)
    ```

    """
    if not metadata:
        return default
    if len(keys) > 0 and isinstance(keys[0], str):
        keys = (keys,)
    for keyp in keys:
        if len(keyp) == 1:
            # single key holding date and time:
            m, k = find_key(metadata, keyp[0], sep)
            if k in m:
                v = m[k]
                if isinstance(v, dt.datetime):
                    if remove:
                        del m[k]
                    return v
                elif isinstance(v, str):
                    if remove:
                        del m[k]
                    return dt.datetime.fromisoformat(v)
        else:
            # separate keys for date and time:
            md, kd = find_key(metadata, keyp[0], sep)
            if kd not in md:
                continue
            if isinstance(md[kd], dt.date):
                date = md[kd]
            elif isinstance(md[kd], str):
                date = dt.date.fromisoformat(md[kd])
            else:
                continue
            mt, kt = find_key(metadata, keyp[1], sep)
            if kt not in mt:
                continue
            if isinstance(mt[kt], dt.time):
                time = mt[kt]
            elif isinstance(mt[kt], str):
                time = dt.time.fromisoformat(mt[kt])
            else:
                continue
            if remove:
                del md[kd]
                del mt[kt]
            return dt.datetime.combine(date, time)
    return default


def get_str(metadata, keys, sep='.', default=None, remove=False):
    """Find a key in metadata and return its string value.

    Parameters
    ----------
    metadata: nested dict
        Metadata.
    keys: str or list of str
        Keys in the metadata to be searched for (case insensitive).
        Value of the first key found is returned.
        May contain section names separated by `sep`.
        See `audiometadata.find_key()` for details.
    sep: str
        String that separates section names in `key`.
    default: None or str
        Return value if `key` is not found or the value does
        not contain a string.
    remove: bool
        If `True`, remove the found key from `metadata`.

    Returns
    -------
    v: None or str
        String value referenced by `key`.
        If none of the `keys` was found, then `default` is returned.

    Examples
    --------

    ```
    >>> from audioio import get_str
    >>> md = dict(aaaa=42, bbbb='hello')

    # string:
    >>> get_str(md, 'bbbb')
    'hello'

    # int as str:
    >>> get_str(md, 'aaaa')
    '42'

    # two keys:
    >>> get_str(md, ['cccc', 'bbbb'])
    'hello'

    # not found:
    >>> get_str(md, 'cccc')
    None

    # not found with default value:
    >>> get_str(md, 'cccc', default='-')
    '-'
    ```

    """
    if not metadata:
        return default
    if not isinstance(keys, (list, tuple, np.ndarray)):
        keys = (keys,)
    for key in keys:
        m, k = find_key(metadata, key, sep)
        if k in m and not isinstance(m[k], dict):
            v = m[k]
            if remove:
                del m[k]
            return str(v)
    return default
1218def add_sections(metadata, sections, value=False, sep='.'):
1219 """Add sections to metadata dictionary.
1221 Parameters
1222 ----------
1223 metadata: nested dict
1224 Metadata.
1225 sections: str
1226 Names of sections to be added to `metadata`.
1227 Section names separated by `sep`.
1228 value: bool
1229 If True, then the last element in `sections` is a key for a value,
1230 not a section.
1231 sep: str
1232 String that separates section names in `sections`.
1234 Returns
1235 -------
1236 md: dict
1237 Dictionary of the last added section.
1238 key: str
1239 Last key. Only returned if `value` is set to `True`.
1241 Examples
1242 --------
1244 Add a section and a sub-section to the metadata:
1245 ```
1246 >>> from audioio import print_metadata, add_sections
1247 >>> md = dict()
1248 >>> m = add_sections(md, 'Recording.Location')
1249 >>> m['Country'] = 'Lummerland'
1250 >>> print_metadata(md)
1251 Recording:
1252 Location:
1253 Country: Lummerland
1254 ```
1256 Add a section with a key-value pair:
1257 ```
1258 >>> md = dict()
1259 >>> m, k = add_sections(md, 'Recording.Location', True)
1260 >>> m[k] = 'Lummerland'
1261 >>> print_metadata(md)
1262 Recording:
1263 Location: Lummerland
1264 ```
1266 Works well together with `find_key()`:
1267 ```
1268 >>> md = dict(Recording=dict())
1269 >>> m, k = find_key(md, 'Recording.Location.Country')
1270 >>> m, k = add_sections(m, k, True)
1271 >>> m[k] = 'Lummerland'
1272 >>> print_metadata(md)
1273 Recording:
1274 Location:
1275 Country: Lummerland
1276 ```
1278 """
1279 mm = metadata
1280 ks = sections.split(sep)
1281 n = len(ks)
1282 if value:
1283 n -= 1
1284 for k in ks[:n]:
1285 if len(k) == 0:
1286 continue
1287 mm[k] = dict()
1288 mm = mm[k]
1289 if value:
1290 return mm, ks[-1]
1291 else:
1292 return mm
1295def strlist_to_dict(mds):
1296 """Convert list of key-value-pair strings to dictionary.
1298 Parameters
1299 ----------
1300 mds: None or dict or str or list of str
1301 - None - returns empty dictionary.
1302 - Flat dictionary - returned as is.
1303 - String with key and value separated by '='.
1304 - List of strings with keys and values separated by '='.
1305 Keys may contain section names.
1307 Returns
1308 -------
1309 md_dict: dict
1310 Flat dictionary with key-value pairs.
1311 Keys may contain section names.
1312 Values are strings, other types or dictionaries.
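The accepted input forms can be illustrated with a minimal standalone sketch. `parse_pairs` is a hypothetical stand-in written for this example and is not part of `audioio`. It assumes that every string contains at least one '=' and splits at the first one only.

```python
def parse_pairs(mds):
    # Hypothetical stand-in for the conversion described above.
    if mds is None:
        return {}              # nothing given: empty dictionary
    if isinstance(mds, dict):
        return mds             # dictionaries are passed through unchanged
    if isinstance(mds, str):
        mds = [mds]            # a single string is treated as a one-element list
    pairs = {}
    for item in mds:
        # assumption of this sketch: split at the first '=' only
        key, value = item.split('=', 1)
        pairs[key.strip()] = value.strip()
    return pairs

flat = parse_pairs(['Artist = John Doe', 'Recording.Time=late'])
print(flat)
```

The keys stay flat strings such as 'Recording.Time'. Resolving section names is left to the functions that consume the dictionary.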
1313 """
1314 if mds is None:
1315 return {}
1316 if isinstance(mds, dict):
1317 return mds
1318 if not isinstance(mds, (list, tuple, np.ndarray)):
1319 mds = (mds,)
1320 md_dict = {}
1321 for md in mds:
1322 k, v = md.split('=', 1)
1323 k = k.strip()
1324 v = v.strip()
1325 md_dict[k] = v
1326 return md_dict
1329def set_metadata(metadata, mds, sep='.'):
1330 """Set values of existing metadata.
1332 A value is updated only if its key already exists in the metadata.
1334 Parameters
1335 ----------
1336 metadata: nested dict
1337 Metadata.
1338 mds: dict or str or list of str
1339 - Flat dictionary with key-value pairs for updating the metadata.
1340 Values can be strings, other types or dictionaries.
1341 - String with key and value separated by '='.
1342 - List of strings with key and value separated by '='.
1343 Keys may contain section names separated by `sep`.
1344 sep: str
1345 String that separates section names in the keys of `mds`.
1347 Examples
1348 --------
1349 ```
1350 >>> from audioio import print_metadata, set_metadata
1351 >>> md = dict(Recording=dict(Time='early'))
1352 >>> print_metadata(md)
1353 Recording:
1354 Time: early
1356 >>> set_metadata(md, {'Artist': 'John Doe', # key not in metadata: ignored
1357 'Recording.Time': 'late'}) # change value of existing key
1358 >>> print_metadata(md)
1359 Recording:
1360 Time : late
1361 ```
1363 See also
1364 --------
1365 add_metadata()
1366 strlist_to_dict()
1368 """
1369 if metadata is None:
1370 return
1371 md_dict = strlist_to_dict(mds)
1372 for k in md_dict:
1373 mm, kk = find_key(metadata, k, sep)
1374 if kk in mm:
1375 mm[kk] = md_dict[k]
1378def add_metadata(metadata, mds, sep='.'):
1379 """Add or modify key-value pairs.
1381 If a key does not exist, it is added to the metadata.
1383 Parameters
1384 ----------
1385 metadata: nested dict
1386 Metadata.
1387 mds: dict or str or list of str
1388 - Flat dictionary with key-value pairs for updating the metadata.
1389 Values can be strings or other types.
1390 - String with key and value separated by '='.
1391 - List of strings with key and value separated by '='.
1392 Keys may contain section names separated by `sep`.
1393 sep: str
1394 String that separates section names in the keys of `mds`.
1396 Examples
1397 --------
1398 ```
1399 >>> from audioio import print_metadata, add_metadata
1400 >>> md = dict(Recording=dict(Time='early'))
1401 >>> print_metadata(md)
1402 Recording:
1403 Time: early
1405 >>> add_metadata(md, {'Artist': 'John Doe', # new key-value pair
1406 'Recording.Time': 'late', # change value of existing key
1407 'Recording.Quality': 'amazing', # new key-value pair in existing section
1408 'Location.Country': 'Lummerland'}) # new key-value pair in new section
1409 >>> print_metadata(md)
1410 Recording:
1411 Time : late
1412 Quality: amazing
1413 Artist: John Doe
1414 Location:
1415 Country: Lummerland
1416 ```
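The difference between updating only existing keys and adding missing ones can be sketched without the package. The helpers `walk`, `set_value`, and `add_value` are hypothetical and deliberately simplified: they follow exact, case-sensitive section paths only, whereas `find_key()` searches more flexibly.

```python
import copy

def walk(metadata, key, sep='.'):
    # Follow section names as far as they exist (exact matches only).
    parts = key.split(sep)
    node = metadata
    while len(parts) > 1 and isinstance(node.get(parts[0]), dict):
        node = node[parts.pop(0)]
    return node, parts          # deepest existing section, remaining path

def set_value(metadata, key, value):
    # Update only: the key must already exist.
    node, parts = walk(metadata, key)
    if len(parts) == 1 and parts[0] in node:
        node[parts[0]] = value

def add_value(metadata, key, value):
    # Add or update: missing sections are created on the way.
    node, parts = walk(metadata, key)
    for name in parts[:-1]:
        node[name] = {}
        node = node[name]
    node[parts[-1]] = value

md = dict(Recording=dict(Time='early'))
set_value(md, 'Artist', 'John Doe')        # ignored, key does not exist
set_value(md, 'Recording.Time', 'late')    # existing key is updated
after_set = copy.deepcopy(md)
add_value(md, 'Artist', 'John Doe')                # new top-level key
add_value(md, 'Location.Country', 'Lummerland')    # new section and key
print(after_set)
print(md)
```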
1418 See also
1419 --------
1420 set_metadata()
1421 strlist_to_dict()
1423 """
1424 if metadata is None:
1425 return
1426 md_dict = strlist_to_dict(mds)
1427 for k in md_dict:
1428 mm, kk = find_key(metadata, k, sep)
1429 mm, kk = add_sections(mm, kk, True, sep)
1430 mm[kk] = md_dict[k]
1433def move_metadata(src_md, dest_md, keys, new_key=None, sep='.'):
1434 """Remove a key from metadata and add it to a dictionary.
1436 Parameters
1437 ----------
1438 src_md: nested dict
1439 Metadata from which a key is removed.
1440 dest_md: dict
1441 Dictionary to which the found key and its value are added.
1442 keys: str or list of str
1443 List of keys to be searched for in `src_md`.
1444 Move the first one found to `dest_md`.
1445 See the `audiometadata.find_key()` function for details.
1446 new_key: None or str
1447 If specified add the value of the found key as `new_key` to
1448 `dest_md`. Otherwise, use the search key.
1449 sep: str
1450 String that separates section names in `keys`.
1452 Returns
1453 -------
1454 moved: bool
1455 `True` if key was found and moved to dictionary.
1457 Examples
1458 --------
1459 ```
1460 >>> from audioio import print_metadata, move_metadata
1461 >>> md = dict(Artist='John Doe', Recording=dict(Gain='1.42mV'))
1462 >>> move_metadata(md, md['Recording'], 'Artist', 'Experimentalist')
1463 >>> print_metadata(md)
1464 Recording:
1465 Gain : 1.42mV
1466 Experimentalist: John Doe
1467 ```
1469 """
1470 if not src_md:
1471 return False
1472 if not isinstance(keys, (list, tuple, np.ndarray)):
1473 keys = (keys,)
1474 for key in keys:
1475 m, k = find_key(src_md, key, sep)
1476 if k in m:
1477 dest_key = new_key if new_key else k
1478 dest_md[dest_key] = m.pop(k)
1479 return True
1480 return False
1483def remove_metadata(metadata, key_list, sep='.'):
1484 """Remove key-value pairs or sections from metadata.
1486 Parameters
1487 ----------
1488 metadata: nested dict
1489 Metadata.
1490 key_list: str or list of str
1491 List of keys to key-value pairs or sections to be removed
1492 from the metadata.
1493 sep: str
1494 String that separates section names in the keys of `key_list`.
1496 Examples
1497 --------
1498 ```
1499 >>> from audioio import print_metadata, remove_metadata
1500 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4))
1501 >>> remove_metadata(md, ('ccc',))
1502 >>> print_metadata(md)
1503 aaaa: 2
1504 bbbb:
1505 ddd: 4
1506 ```
1508 """
1509 if not metadata:
1510 return
1511 if not isinstance(key_list, (list, tuple, np.ndarray)):
1512 key_list = (key_list,)
1513 for k in key_list:
1514 mm, kk = find_key(metadata, k, sep)
1515 if kk in mm:
1516 del mm[kk]
1519def cleanup_metadata(metadata):
1520 """Remove empty sections from metadata.
1522 Parameters
1523 ----------
1524 metadata: nested dict
1525 Metadata.
1527 Examples
1528 --------
1529 ```
1530 >>> from audioio import print_metadata, cleanup_metadata
1531 >>> md = dict(aaaa=2, bbbb=dict())
1532 >>> cleanup_metadata(md)
1533 >>> print_metadata(md)
1534 aaaa: 2
1535 ```
1537 """
1538 if not metadata:
1539 return
1540 for k in list(metadata):
1541 if isinstance(metadata[k], dict):
1542 if len(metadata[k]) == 0:
1543 del metadata[k]
1544 else:
1545 cleanup_metadata(metadata[k])
1548default_gain_keys = ['gain']
1549"""Default keys of gain settings in metadata. Used by `get_gain()` function.
1550"""
1552def get_gain(metadata, gain_key=default_gain_keys, sep='.',
1553 default=None, default_unit='', remove=False):
1554 """Get gain and unit from metadata.
1556 Parameters
1557 ----------
1558 metadata: nested dict
1559 Metadata with key-value pairs.
1560 gain_key: str or list of str
1561 Key in the file's metadata that holds some gain information.
1562 If found, the data will be multiplied with the gain,
1563 and if available, the corresponding unit is returned.
1564 See the `audiometadata.find_key()` function for details.
1565 You can modify the default keys via the `default_gain_keys` list
1566 of the `audiometadata` module.
1567 sep: str
1568 String that separates section names in `gain_key`.
1569 default: None or float
1570 Returned value if no valid gain was found in `metadata`.
1571 default_unit: str
1572 Returned unit if no valid gain was found in `metadata`.
1573 remove: bool
1574 If `True`, remove the found key from `metadata`.
1576 Returns
1577 -------
1578 fac: float
1579 Gain factor. If not found in metadata, `default` is returned.
1580 unit: string
1581 Unit of the data if found in the metadata, otherwise `default_unit`.
1582 """
1583 v, u = get_number_unit(metadata, gain_key, sep, default,
1584 default_unit, remove)
1585 # fix some TeeGrid gains:
1586 if len(u) >= 2 and u[-2:] == '/V':
1587 u = u[:-2]
1588 return v, u
1591def update_gain(metadata, fac, gain_key=default_gain_keys, sep='.'):
1592 """Update gain setting in metadata.
1594 Searches for the first appearance of a gain key in the metadata
1595 hierarchy. If found, divide the gain value by `fac`.
1597 Parameters
1598 ----------
1599 metadata: nested dict
1600 Metadata to be updated.
1601 fac: float
1602 Factor that was used to scale the data.
1603 gain_key: str or list of str
1604 Key in the file's metadata that holds some gain information.
1605 If found, the gain value is divided by `fac`
1606 and written back together with its unit.
1607 See the `audiometadata.find_key()` function for details.
1608 You can modify the default keys via the `default_gain_keys` list
1609 of the `audiometadata` module.
1610 sep: str
1611 String that separates section names in `gain_key`.
1613 Returns
1614 -------
1615 done: bool
1616 True if gain has been found and set.
1619 Examples
1620 --------
1622 ```
1623 >>> from audioio import print_metadata, update_gain
1624 >>> md = dict(Artist='John Doe', Recording=dict(gain='1.4mV'))
1625 >>> update_gain(md, 2)
1626 >>> print_metadata(md)
1627 Artist: John Doe
1628 Recording:
1629 gain: 0.70mV
1630 ```
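How a gain string such as '1.4mV' turns into '0.70mV' can be sketched as follows. `scale_gain` is a hypothetical helper for illustration. The regular expression and the way decimals are counted are assumptions of this sketch, chosen to reproduce the example above, and do not show how `parse_number()` works internally.

```python
import re

def scale_gain(gain, fac):
    # Hypothetical helper: split 'number + unit', divide, format again.
    match = re.fullmatch(r'\s*([+-]?\d*\.?\d+)\s*(.*)', gain)
    if match is None:
        return gain                      # not a number: leave unchanged
    number, unit = match.groups()
    # number of decimals as written in the original string
    decimals = len(number.split('.')[1]) if '.' in number else 0
    if unit.endswith('/V'):
        unit = unit[:-2]                 # drop a trailing '/V' from the unit
    # one more decimal than in the original value
    return f'{float(number)/fac:.{decimals + 1}f}{unit}'

print(scale_gain('1.4mV', 2))
```

Writing one decimal more than the original keeps some precision when the division does not come out even.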
1632 """
1633 if not metadata:
1634 return False
1635 if not isinstance(gain_key, (list, tuple, np.ndarray)):
1636 gain_key = (gain_key,)
1637 for gk in gain_key:
1638 m, k = find_key(metadata, gk, sep)
1639 if k in m and not isinstance(m[k], dict):
1640 vs = m[k]
1641 if isinstance(vs, (int, float)):
1642 m[k] = vs/fac
1643 else:
1644 v, u, n = parse_number(vs)
1645 if v is not None:
1646 # fix some TeeGrid gains:
1647 if len(u) >= 2 and u[-2:] == '/V':
1648 u = u[:-2]
1649 m[k] = f'{v/fac:.{n+1}f}{u}'
1650 return True
1651 return False
1654def set_starttime(metadata, datetime_value,
1655 time_keys=default_starttime_keys):
1656 """Set all start-of-recording times in metadata.
1658 Parameters
1659 ----------
1660 metadata: nested dict
1661 Metadata to be updated.
1662 datetime_value: datetime
1663 Start date and time of the recording.
1664 time_keys: tuple of str or list of tuple of str
1665 Keys to fields denoting calendar times, i.e. dates and times.
1666 Datetimes can be stored in metadata as two separate key-value pairs,
1667 one for the date and one for the time, or as a single key-value pair
1668 for a date-time value. This is why the keys need to be specified in
1669 tuples with one or two keys.
1670 Keys may contain section names separated by `sep`.
1671 See `audiometadata.find_key()` for details.
1672 You can modify the default time keys via the `default_starttime_keys`
1673 list of the `audiometadata` module.
1675 Returns
1676 -------
1677 success: bool
1678 True if at least one time has been set.
1680 Example
1681 -------
1682 ```
1683 >>> from audioio import print_metadata, set_starttime
1684 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00',
1685 OtherTime='2023-05-16T23:20:10',
1686 BEXT=dict(OriginationDate='2024-03-02',
1687 OriginationTime='10:42:24'))
1688 >>> set_starttime(md, '2024-06-17T22:10:05')
1689 >>> print_metadata(md)
1690 DateTimeOriginal: 2024-06-17T22:10:05
1691 OtherTime : 2023-05-16T23:20:10
1692 BEXT:
1693 OriginationDate: 2024-06-17
1694 OriginationTime: 22:10:05
1695 ```
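The string values written for string-typed fields follow from the standard `datetime` formatting calls used by this function. This is a standalone sketch with illustrative variable names:

```python
import datetime as dt

start = dt.datetime.fromisoformat('2024-06-17T22:10:05')

# single date-time field:
datetime_str = start.isoformat(timespec='seconds')

# separate date and time fields:
date_str = start.date().isoformat()
time_str = start.time().isoformat(timespec='seconds')

print(datetime_str, date_str, time_str)
```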
1697 """
1698 if not metadata:
1699 return False
1700 if isinstance(datetime_value, str):
1701 datetime_value = dt.datetime.fromisoformat(datetime_value)
1702 success = False
1703 if len(time_keys) > 0 and isinstance(time_keys[0], str):
1704 time_keys = (time_keys,)
1705 for key in time_keys:
1706 if len(key) == 1:
1707 # datetime:
1708 m, k = find_key(metadata, key[0])
1709 if k in m and not isinstance(m[k], dict):
1710 if isinstance(m[k], dt.datetime):
1711 m[k] = datetime_value
1712 else:
1713 m[k] = datetime_value.isoformat(timespec='seconds')
1714 success = True
1715 else:
1716 # separate date and time:
1717 md, kd = find_key(metadata, key[0])
1718 if kd not in md or isinstance(md[kd], dict):
1719 continue
1720 if isinstance(md[kd], dt.date):
1721 md[kd] = datetime_value.date()
1722 else:
1723 md[kd] = datetime_value.date().isoformat()
1724 mt, kt = find_key(metadata, key[1])
1725 if kt not in mt or isinstance(mt[kt], dict):
1726 continue
1727 if isinstance(mt[kt], dt.time):
1728 mt[kt] = datetime_value.time()
1729 else:
1730 mt[kt] = datetime_value.time().isoformat(timespec='seconds')
1731 success = True
1732 return success
1735default_timeref_keys = ['TimeReference']
1736"""Default keys of integer time references in metadata.
1737Used by `update_starttime()` function.
1738"""
1740def update_starttime(metadata, deltat, rate,
1741 time_keys=default_starttime_keys,
1742 ref_keys=default_timeref_keys):
1743 """Update start-of-recording times in metadata.
1745 Add `deltat` to `time_keys` and `ref_keys` fields in the metadata.
1747 Parameters
1748 ----------
1749 metadata: nested dict
1750 Metadata to be updated.
1751 deltat: float or timedelta
1752 Time to be added to start times, in seconds or as a `datetime.timedelta`.
1753 rate: float
1754 Sampling rate of the data in Hertz.
1755 time_keys: tuple of str or list of tuple of str
1756 Keys to fields denoting calendar times, i.e. dates and times.
1757 Datetimes can be stored in metadata as two separate key-value pairs,
1758 one for the date and one for the time, or as a single key-value pair
1759 for a date-time value. This is why the keys need to be specified in
1760 tuples with one or two keys.
1761 Keys may contain section names separated by `sep`.
1762 See `audiometadata.find_key()` for details.
1763 You can modify the default time keys via the `default_starttime_keys`
1764 list of the `audiometadata` module.
1765 ref_keys: str or list of str
1766 Keys to time references, i.e. integer sample counts relative to
1767 a reference time.
1768 Keys may contain section names separated by `sep`.
1769 See `audiometadata.find_key()` for details.
1770 You can modify the default reference keys via the
1771 `default_timeref_keys` list of the `audiometadata` module.
1773 Returns
1774 -------
1775 success: bool
1776 True if at least one time has been updated.
1778 Example
1779 -------
1780 ```
1781 >>> from audioio import print_metadata, update_starttime
1782 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00',
1783 OtherTime='2023-05-16T23:20:10',
1784 BEXT=dict(OriginationDate='2024-03-02',
1785 OriginationTime='10:42:24',
1786 TimeReference=123456))
1787 >>> update_starttime(md, 4.2, 48000)
1788 >>> print_metadata(md)
1789 DateTimeOriginal: 2023-04-15T22:10:04
1790 OtherTime : 2023-05-16T23:20:10
1791 BEXT:
1792 OriginationDate: 2024-03-02
1793 OriginationTime: 10:42:28
1794 TimeReference : 325056
1795 ```
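The numbers in this example can be checked by hand with plain `datetime` arithmetic. This is a standalone sketch with illustrative variable names. Fractions of a second are dropped when formatting with `timespec='seconds'`, and the time reference is shifted by `deltat*rate` samples:

```python
import datetime as dt

deltat = dt.timedelta(seconds=4.2)
rate = 48000

# single date-time field:
start = dt.datetime.fromisoformat('2023-04-15T22:10:00') + deltat
start_str = start.isoformat(timespec='seconds')

# separate date and time fields are combined, shifted, and split again:
date = dt.date.fromisoformat('2024-03-02')
time = dt.time.fromisoformat('10:42:24')
shifted = dt.datetime.combine(date, time) + deltat
date_str = shifted.date().isoformat()
time_str = shifted.time().isoformat(timespec='seconds')

# time reference in samples:
time_ref = 123456 + int(round(deltat.total_seconds()*rate))

print(start_str, date_str, time_str, time_ref)
```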
1797 """
1798 if not metadata:
1799 return False
1800 if not isinstance(deltat, dt.timedelta):
1801 deltat = dt.timedelta(seconds=deltat)
1802 success = False
1803 if len(time_keys) > 0 and isinstance(time_keys[0], str):
1804 time_keys = (time_keys,)
1805 for key in time_keys:
1806 if len(key) == 1:
1807 # datetime:
1808 m, k = find_key(metadata, key[0])
1809 if k in m and not isinstance(m[k], dict):
1810 if isinstance(m[k], dt.datetime):
1811 m[k] += deltat
1812 else:
1813 datetime = dt.datetime.fromisoformat(m[k]) + deltat
1814 m[k] = datetime.isoformat(timespec='seconds')
1815 success = True
1816 else:
1817 # separate date and time:
1818 md, kd = find_key(metadata, key[0])
1819 if kd not in md or isinstance(md[kd], dict):
1820 continue
1821 if isinstance(md[kd], dt.date):
1822 date = md[kd]
1823 is_date = True
1824 else:
1825 date = dt.date.fromisoformat(md[kd])
1826 is_date = False
1827 mt, kt = find_key(metadata, key[1])
1828 if kt not in mt or isinstance(mt[kt], dict):
1829 continue
1830 if isinstance(mt[kt], dt.time):
1831 time = mt[kt]
1832 is_time = True
1833 else:
1834 time = dt.time.fromisoformat(mt[kt])
1835 is_time = False
1836 datetime = dt.datetime.combine(date, time) + deltat
1837 md[kd] = datetime.date() if is_date else datetime.date().isoformat()
1838 mt[kt] = datetime.time() if is_time else datetime.time().isoformat(timespec='seconds')
1839 success = True
1840 # time reference in samples:
1841 if isinstance(ref_keys, str):
1842 ref_keys = (ref_keys,)
1843 for key in ref_keys:
1844 m, k = find_key(metadata, key)
1845 if k in m and not isinstance(m[k], dict):
1846 is_int = isinstance(m[k], int)
1847 tref = int(m[k])
1848 tref += int(np.round(deltat.total_seconds()*rate))
1849 m[k] = tref if is_int else f'{tref}'
1850 success = True
1851 return success
1854def bext_history_str(encoding, rate, channels, text=None):
1855 """Assemble a string for the BEXT CodingHistory field.
1857 Parameters
1858 ----------
1859 encoding: str or None
1860 Encoding of the data.
1861 rate: int or float
1862 Sampling rate in Hertz.
1863 channels: int
1864 Number of channels.
1865 text: str or None
1866 Optional free text.
1868 Returns
1869 -------
1870 s: str
1871 String for the BEXT CodingHistory field,
1872 something like "A=PCM,F=44100,W=16,M=stereo,T=cut out"
1873 """
1874 codes = []
1875 bits = None
1876 if encoding is not None:
1877 if encoding[:3] == 'PCM':
1878 bits = int(encoding[4:])
1879 encoding = 'PCM'
1880 codes.append(f'A={encoding}')
1881 codes.append(f'F={rate:.0f}')
1882 if bits is not None:
1883 codes.append(f'W={bits}')
1884 mode = None
1885 if channels == 1:
1886 mode = 'mono'
1887 elif channels == 2:
1888 mode = 'stereo'
1889 if mode is not None:
1890 codes.append(f'M={mode}')
1891 if text is not None:
1892 codes.append(f'T={text.rstrip()}')
1893 return ','.join(codes)
1896default_history_keys = ['History',
1897 'CodingHistory',
1898 'BWF_CODING_HISTORY']
1899"""Default keys of strings describing coding history in metadata.
1900Used by `add_history()` function.
1901"""
1903def add_history(metadata, history, new_key=None, pre_history=None,
1904 history_keys=default_history_keys, sep='.'):
1905 """Add a string describing coding history to metadata.
1907 Add `history` to the `history_keys` fields in the metadata. If
1908 none of these fields are present but `new_key` is specified, then
1909 assign `pre_history` and `history` to this key. If this key does
1910 not exist in the metadata, it is created.
1912 Parameters
1913 ----------
1914 metadata: nested dict
1915 Metadata to be updated.
1916 history: str
1917 String to be added to the history.
1918 new_key: str or None
1919 Sections and name of a history key to be added to `metadata`.
1920 Section names are separated by `sep`.
1921 pre_history: str or None
1922 If a new key `new_key` is created, then assign this string followed
1923 by `history`.
1924 history_keys: str or list of str
1925 Keys to fields where to add `history`.
1926 Keys may contain section names separated by `sep`.
1927 See `audiometadata.find_key()` for details.
1928 You can modify the default history keys via the `default_history_keys`
1929 list of the `audiometadata` module.
1930 sep: str
1931 String that separates section names in `new_key` and `history_keys`.
1933 Returns
1934 -------
1935 success: bool
1936 True if the history string has been added to the metadata.
1938 Example
1939 -------
1940 Add string to existing history key-value pair:
1941 ```
1942 >>> from audioio import add_history
1943 >>> md = dict(aaa='xyz', BEXT=dict(CodingHistory='original recordings'))
1944 >>> add_history(md, 'just a snippet')
1945 >>> print(md['BEXT']['CodingHistory'])
1946 original recordings
1947 just a snippet
1948 ```
1950 Assign string to new key-value pair:
1951 ```
1952 >>> md = dict(aaa='xyz', BEXT=dict(OriginationDate='2024-02-12'))
1953 >>> add_history(md, 'just a snippet', 'BEXT.CodingHistory', 'original data')
1954 >>> print(md['BEXT']['CodingHistory'])
1955 original data
1956 just a snippet
1957 ```
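The rule for joining an existing history with a new entry can be sketched in isolation. `append_history` is a hypothetical helper for this example: a CR/LF line break is inserted only if the existing text is non-empty and does not already end with a line break.

```python
def append_history(existing, entry):
    # Hypothetical helper mirroring the joining rule described above.
    if len(existing) >= 1 and existing[-1] not in '\r\n':
        existing += '\r\n'
    return existing + entry

joined = append_history('original recordings', 'just a snippet')
print(repr(joined))
```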
1959 """
1960 if not metadata:
1961 return False
1962 if isinstance(history_keys, str):
1963 history_keys = (history_keys,)
1964 success = False
1965 for keys in history_keys:
1966 m, k = find_key(metadata, keys, sep)
1967 if k in m and not isinstance(m[k], dict):
1968 s = m[k]
1969 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r':
1970 s += '\r\n'
1971 s += history
1972 m[k] = s
1973 success = True
1974 if not success and new_key:
1975 m, k = find_key(metadata, new_key, sep)
1976 m, k = add_sections(m, k, True, sep)
1977 s = ''
1978 if pre_history is not None:
1979 s = pre_history
1980 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r':
1981 s += '\r\n'
1982 s += history
1983 m[k] = s
1984 success = True
1985 return success
1988def add_unwrap(metadata, thresh, clip=0, unit=''):
1989 """Add unwrap information to metadata.
1991 If `audiotools.unwrap()` was applied to the data, then this
1992 function adds the relevant information to the metadata. If there is an INFO
1993 section in the metadata, the unwrap information is added to this
1994 section, otherwise it is added to the top level of the metadata
1995 hierarchy.
1997 The threshold `thresh` used for unwrapping is saved under the key
1998 'UnwrapThreshold' as a string. If `clip` is larger than zero, then
1999 the clip level is saved under the key 'UnwrapClippedAmplitude' as
2000 a string.
2002 Parameters
2003 ----------
2004 metadata: nested dict
2005 Metadata to be updated.
2006 thresh: float
2007 Threshold used for unwrapping.
2008 clip: float
2009 Level at which unwrapped data have been clipped.
2010 unit: str
2011 Unit of `thresh` and `clip`.
2013 Examples
2014 --------
2016 ```
2017 >>> from audioio import print_metadata, add_unwrap
2018 >>> md = dict(INFO=dict(Time='early'))
2019 >>> add_unwrap(md, 0.6, 1.0)
2020 >>> print_metadata(md)
2021 INFO:
2022 Time : early
2023 UnwrapThreshold : 0.60
2024 UnwrapClippedAmplitude: 1.00
2025 ```
2027 """
2028 if metadata is None:
2029 return
2030 md = metadata
2031 for k in metadata:
2032 if k.strip().upper() == 'INFO':
2033 md = metadata[k]
2034 break
2035 md['UnwrapThreshold'] = f'{thresh:.2f}{unit}'
2036 if clip > 0:
2037 md['UnwrapClippedAmplitude'] = f'{clip:.2f}{unit}'
2040def demo(file_pathes, list_format, list_metadata, list_cues, list_chunks):
2041 """Print metadata and markers of audio files.
2043 Parameters
2044 ----------
2045 file_pathes: list of str
2046 Paths of audio files.
2047 list_format: bool
2048 If True, list file format only.
2049 list_metadata: bool
2050 If True, list metadata only.
2051 list_cues: bool
2052 If True, list markers/cues only.
2053 list_chunks: bool
2054 If True, list all chunks contained in a riff/wave file.
2055 """
2056 from .audioloader import AudioLoader
2057 from .audiomarkers import print_markers
2058 from .riffmetadata import read_chunk_tags
2059 for filepath in file_pathes:
2060 if len(file_pathes) > 1 and (list_cues or list_metadata or
2061 list_format or list_chunks):
2062 print(filepath)
2063 if list_chunks:
2064 chunks = read_chunk_tags(filepath)
2065 print(f' {"chunk tag":10s} {"position":10s} {"size":10s}')
2066 for tag in chunks:
2067 pos = chunks[tag][0] - 8
2068 size = chunks[tag][1] + 8
2069 print(f' {tag:9s} {pos:10d} {size:10d}')
2070 if len(file_pathes) > 1:
2071 print()
2072 continue
2073 with AudioLoader(filepath, 1, 0, verbose=0) as sf:
2074 fmt_md = sf.format_dict()
2075 meta_data = sf.metadata()
2076 locs, labels = sf.markers()
2077 if list_cues:
2078 if len(locs) > 0:
2079 print_markers(locs, labels)
2080 elif list_metadata:
2081 print_metadata(meta_data, replace='.')
2082 elif list_format:
2083 print_metadata(fmt_md)
2084 else:
2085 print('file:')
2086 print_metadata(fmt_md, ' ')
2087 if len(meta_data) > 0:
2088 print()
2089 print('metadata:')
2090 print_metadata(meta_data, ' ', replace='.')
2091 if len(locs) > 0:
2092 print()
2093 print('markers:')
2094 print_markers(locs, labels)
2095 if len(file_pathes) > 1:
2096 print()
2097 if len(file_pathes) > 1:
2098 print()
2101def main(*cargs):
2102 """Call demo with command line arguments.
2104 Parameters
2105 ----------
2106 cargs: list of strings
2107 Command line arguments as provided by sys.argv[1:]
2108 """
2109 # command line arguments:
2110 parser = argparse.ArgumentParser(add_help=True,
2111 description='Print metadata and markers of audio files.',
2112 epilog=f'version {__version__} by Benda-Lab (2020-{__year__})')
2113 parser.add_argument('--version', action='version', version=__version__)
2114 parser.add_argument('-f', dest='dataformat', action='store_true',
2115 help='list file format only')
2116 parser.add_argument('-m', dest='metadata', action='store_true',
2117 help='list metadata only')
2118 parser.add_argument('-c', dest='cues', action='store_true',
2119 help='list cues/markers only')
2120 parser.add_argument('-t', dest='chunks', action='store_true',
2121 help='list tags of all riff/wave chunks contained in the file')
2122 parser.add_argument('files', type=str, nargs='+',
2123 help='audio file')
2124 if len(cargs) == 0:
2125 cargs = None
2126 args = parser.parse_args(cargs)
2128 # expand wildcard patterns:
2129 files = []
2130 if os.name == 'nt':
2131 for fn in args.files:
2132 files.extend(glob.glob(fn))
2133 else:
2134 files = args.files
2136 demo(files, args.dataformat, args.metadata, args.cues, args.chunks)
2139if __name__ == "__main__":
2140 main(*sys.argv[1:])