Coverage for src/audioio/audiometadata.py: 99%
563 statements
coverage.py v7.10.1, created at 2025-08-02 12:23 +0000
"""Working with metadata.

To interface the various ways metadata are stored in audio files, the
`audioio` package uses nested dictionaries. The keys are always
strings. Values are strings, integers, floats, datetimes, or other
types. Value strings can also be numbers followed by a unit,
e.g. "4.2mV". For defining subsections of key-value pairs, values can
be dictionaries. The dictionaries can be nested to arbitrary depth.
```py
>>> from audioio import print_metadata
>>> md = dict(Recording=dict(Experimenter='John Doe',
                             DateTimeOriginal='2023-10-01T14:10:02',
                             Count=42),
              Hardware=dict(Amplifier='Teensy_Amp 4.1',
                            Highpass='10Hz',
                            Gain='120mV'))
>>> print_metadata(md)
```
results in
```txt
Recording:
    Experimenter    : John Doe
    DateTimeOriginal: 2023-10-01T14:10:02
    Count           : 42
Hardware:
    Amplifier: Teensy_Amp 4.1
    Highpass : 10Hz
    Gain     : 120mV
```
Often, audio files have very specific ways to store metadata. You can
enforce using these by putting them into a dictionary that is added to
the metadata with a key having the name of the metadata type you want,
e.g. the "INFO", "BEXT", "iXML", and "GUAN" chunks of RIFF/WAVE files.
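
As a minimal sketch of this layout: the section names below are the chunk types listed above, and the keys inside them are taken from other examples in this module. Which keys a given chunk type really accepts is not specified here, so treat the contents as illustrative only.

```python
# Illustrative metadata: "INFO" and "BEXT" are sections named after
# RIFF/WAVE chunk types; the keys inside them are examples only.
md = dict(
    INFO=dict(Software='TeeGrid R4-senors-logger v1.0',
              DateTimeOriginal='2023-10-01T14:10:02'),
    BEXT=dict(OriginationDate='2023-10-01',
              OriginationTime='14:10:02'),
    Recording=dict(Count=42),  # generic section, not tied to a chunk type
)

# Sections named after a metadata type are ordinary sub-dictionaries:
chunk_sections = [k for k in md if k in ('INFO', 'BEXT', 'iXML', 'GUAN')]
print(chunk_sections)
```

The type-specific sections sit next to generic ones in the same nested dictionary.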
## Functions

The `audiometadata` module provides functions for handling and
manipulating these nested dictionaries. Many functions take keys as
arguments for finding or setting specific key-value pairs. Such a key
can name an item of any (sub-)dictionary, no matter on
which level of the metadata hierarchy it is. For example, simply
searching for "Highpass" retrieves the corresponding value "10Hz",
although "Highpass" is contained in the sub-dictionary (or "section")
with key "Hardware". The same item can also be specified together with
its parent keys: "Hardware.Highpass". Parent keys (or section keys)
are by default separated by '.', but these functions have a `sep`
keyword argument that specifies the string separating section names in
keys. Key matching is case insensitive.
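
The lookup rules described above can be sketched in plain Python. The helper `lookup()` is hypothetical and not part of `audioio`. It only mirrors the described strategy: case-insensitive matching, optional parent sections, and a configurable separator.

```python
def lookup(md, key, sep='.'):
    # Hypothetical helper, not part of audioio: return the value stored
    # under `key`, or None if the key is not found.
    def search(d, parts):
        head = parts[0].strip().upper()
        for k in d:
            if k.upper() == head:
                if len(parts) == 1:
                    return True, d[k]
                if isinstance(d[k], dict):
                    # descend into the matching section
                    return search(d[k], parts[1:])
        # key not on this level: try all subsections
        for k in d:
            if isinstance(d[k], dict):
                found, value = search(d[k], parts)
                if found:
                    return True, value
        return False, None

    return search(md, key.split(sep))[1]


md = dict(Recording=dict(Experimenter='John Doe'),
          Hardware=dict(Highpass='10Hz', Gain='120mV'))
print(lookup(md, 'Highpass'))                # found in a subsection
print(lookup(md, 'hardware.highpass'))       # with parent key, any case
print(lookup(md, 'Hardware:Gain', sep=':'))  # custom separator
print(lookup(md, 'Lowpass'))                 # missing key gives None
```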
Since the same item goes by different keys in the different
metadata models, the functions also accept lists of alternative keys
as arguments. The value of the first key found is used.
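
A minimal sketch of the idea behind key lists, assuming flat dictionaries to keep it short. The helper `first_value()` is hypothetical and not part of `audioio`. The two key names are taken from `default_starttime_keys`.

```python
def first_value(md, keys, default=None):
    # Hypothetical helper: return the value of the first key in `keys`
    # that is present in the flat dictionary `md` (case insensitive).
    if isinstance(keys, str):
        keys = [keys]
    upper = {k.upper(): v for k, v in md.items()}
    for key in keys:
        if key.upper() in upper:
            return upper[key.upper()]
    return default


time_keys = ['DateTimeOriginal', 'Timestamp']
info = dict(DateTimeOriginal='2023-10-01T14:10:02')
other = dict(timestamp='2024-03-02T10:42:24')
print(first_value(info, time_keys))
print(first_value(other, time_keys))
```

The same list of keys works for both dictionaries, although each one stores the start time under a different name.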
Because the metadata are plain dictionaries, you can also manipulate
them directly with the standard dictionary methods.
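
For instance, using only built-in dictionary operations (the section and key names are just examples):

```python
md = dict(Recording=dict(Experimenter='John Doe', Count=42),
          Hardware=dict(Highpass='10Hz'))

md['Recording']['Count'] += 1                    # modify a value
md['Hardware']['Gain'] = '120mV'                 # add a key-value pair
md.setdefault('Notes', {})['Comment'] = 'test'   # add a new section
del md['Recording']['Experimenter']              # remove a key-value pair

for section, items in md.items():
    print(section, items)
```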
If you need to make a copy of the metadata use `deepcopy`:
```
from copy import deepcopy
md_orig = deepcopy(md)
```

### Output

Write nested dictionaries as text:

- `write_metadata_text()`: write metadata into a text/yaml file.
- `print_metadata()`: write metadata to standard output.

### Flatten

Conversion between nested and flat dictionaries:

- `flatten_metadata()`: flatten hierarchical metadata to a single dictionary.
- `unflatten_metadata()`: unflatten a previously flattened metadata dictionary.
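
The relation between the nested and the flat representation can be sketched independently of the package. The helpers `flatten()` and `unflatten()` below are hypothetical stand-ins, not the module's implementation. The sketch assumes that section names are always kept in the flat keys.

```python
def flatten(md, sep='.', prefix=''):
    # Hypothetical stand-in: join section names and keys with `sep`.
    flat = {}
    for k, v in md.items():
        if isinstance(v, dict):
            flat.update(flatten(v, sep, prefix + k + sep))
        else:
            flat[prefix + k] = v
    return flat


def unflatten(flat, sep='.'):
    # Hypothetical stand-in: rebuild sections from the separated keys.
    md = {}
    for key, v in flat.items():
        *sections, last = key.split(sep)
        d = md
        for s in sections:
            d = d.setdefault(s, {})
        d[last] = v
    return md


md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
flat = flatten(md)
print(flat)
print(unflatten(flat) == md)
```

Dropping the section names from the flat keys would lose the information needed for the way back.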
### Parse numbers with units

- `parse_number()`: parse string with number and unit.
- `change_unit()`: scale numerical value to a new unit.
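
As a rough illustration of what splitting a value string such as "4.2mV" involves, here is a sketch using a regular expression. The names `split_number_unit()`, `rescale()` and `FACTORS` are hypothetical. The module itself parses the string character by character and supports many more unit prefixes.

```python
import re

# leading number (optional sign, digits, optional fraction), then the rest
NUMBER = re.compile('([+-]?[0-9]+(?:[.][0-9]*)?)(.*)')


def split_number_unit(s):
    # Hypothetical helper: return (value, unit), or (None, s) without a number.
    m = NUMBER.match(s)
    if m is None:
        return None, s
    num, unit = m.groups()
    value = float(num) if '.' in num else int(num)
    return value, unit.strip()


# factors relative to the base unit; a small subset of the SI prefixes
FACTORS = {'k': 1e3, '': 1.0, 'm': 1e-3, 'u': 1e-6}


def rescale(value, old_prefix, new_prefix):
    # Hypothetical helper: scale `value` from one SI prefix to another.
    return value * FACTORS[old_prefix] / FACTORS[new_prefix]


print(split_number_unit('4.2mV'))
print(split_number_unit('423.17 Hz'))
print(split_number_unit('42'))
print(rescale(4.2, 'm', 'u'))   # millivolts expressed in microvolts
```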
### Find and get values

Find keys and get their values parsed and converted to various types:

- `find_key()`: find dictionary in metadata hierarchy containing the specified key.
- `get_number_unit()`: find a key in metadata and return its number and unit.
- `get_number()`: find a key in metadata and return its value in a given unit.
- `get_int()`: find a key in metadata and return its integer value.
- `get_bool()`: find a key in metadata and return its boolean value.
- `get_datetime()`: find keys in metadata and return a datetime.
- `get_str()`: find a key in metadata and return its string value.
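
Date-times may be stored either as a single ISO string or as separate date and time strings. The conversion that `get_datetime()` performs on such strings relies on the standard `datetime` module, roughly as follows (the example values are made up):

```python
import datetime as dt

# separate date and time strings
date = dt.date.fromisoformat('2024-03-02')
time = dt.time.fromisoformat('10:42:24')
start = dt.datetime.combine(date, time)

# a single date-time string
single = dt.datetime.fromisoformat('2023-04-15T22:10:00')

print(start.isoformat())
print(single.isoformat())
```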
### Organize metadata

Add and remove metadata:

- `strlist_to_dict()`: convert list of key-value-pair strings to dictionary.
- `add_sections()`: add sections to metadata dictionary.
- `set_metadata()`: set values of existing metadata.
- `add_metadata()`: add or modify key-value pairs.
- `move_metadata()`: remove a key from metadata and add it to a dictionary.
- `remove_metadata()`: remove key-value pairs or sections from metadata.
- `cleanup_metadata()`: remove empty sections from metadata.

### Special metadata fields

Retrieve and set specific metadata:

- `get_gain()`: get gain and unit from metadata.
- `update_gain()`: update gain setting in metadata.
- `set_starttime()`: set all start-of-recording times in metadata.
- `update_starttime()`: update start-of-recording times in metadata.
- `bext_history_str()`: assemble a string for the BEXT CodingHistory field.
- `add_history()`: add a string describing coding history to metadata.
- `add_unwrap()`: add unwrap infos to metadata.

Lists of standard keys:

- `default_starttime_keys`: keys of times of start of the recording.
- `default_timeref_keys`: keys of integer time references.
- `default_gain_keys`: keys of gain settings.
- `default_history_keys`: keys of strings describing coding history.
## Command line script

The module can be run as a script from the command line to display the
metadata and markers contained in an audio file:

```sh
> audiometadata logger.wav
```
prints
```text
file:
    filepath    : logger.wav
    samplingrate: 96000Hz
    channels    : 16
    frames      : 17280000
    duration    : 180.000s

metadata:
    INFO:
        Bits            : 32
        Pins            : 1-CH2R,1-CH2L,1-CH3R,1-CH3L,2-CH2R,2-CH2L,2-CH3R,2-CH3L,3-CH2R,3-CH2L,3-CH3R,3-CH3L,4-CH2R,4-CH2L,4-CH3R,4-CH3L
        Gain            : 165.00mV
        uCBoard         : Teensy 4.1
        MACAdress       : 04:e9:e5:15:3e:95
        DateTimeOriginal: 2023-10-01T14:10:02
        Software        : TeeGrid R4-senors-logger v1.0
```

Alternatively, the script can be run from within the audioio source tree as:
```
python -m src.audioio.audiometadata audiofile.wav
```

Running
```sh
audiometadata --help
```
prints
```text
usage: audiometadata [-h] [--version] [-f] [-m] [-c] [-t] files [files ...]

Convert audio file formats.

positional arguments:
  files       audio file

options:
  -h, --help  show this help message and exit
  --version   show program's version number and exit
  -f          list file format only
  -m          list metadata only
  -c          list cues/markers only
  -t          list tags of all riff/wave chunks contained in the file

version 2.0.0 by Benda-Lab (2020-2024)
```
186"""
188import os
189import sys
190import glob
191import argparse
192import numpy as np
193import datetime as dt
194from .version import __version__, __year__
197def write_metadata_text(fh, meta, prefix='', indent=4, replace=None):
198 """Write meta data into a text/yaml file or stream.
200 With the default parameters, the output is a valid yaml file.
202 Parameters
203 ----------
204 fh: filename or stream
205 If not a stream, the file with name `fh` is opened.
206 Otherwise `fh` is used as a stream for writing.
207 meta: nested dict
208 Key-value pairs of metadata to be written into the file.
209 prefix: str
210 This string is written at the beginning of each line.
211 indent: int
212 Number of characters used for indentation of sections.
213 replace: char or None
214 If specified, replace special characters by this character.
216 Examples
217 --------
218 ```
    from audioio import write_metadata_text
    md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)))
    write_metadata_text('info.txt', md)
222 ```
223 """
225 def write_dict(df, md, level, smap):
226 w = 0
227 for k in md:
228 if not isinstance(md[k], dict) and w < len(k):
229 w = len(k)
230 for k in md:
231 clevel = level*indent
232 if isinstance(md[k], dict):
233 df.write(f'{prefix}{"":>{clevel}}{k}:\n')
234 write_dict(df, md[k], level+1, smap)
235 else:
236 value = md[k]
237 if isinstance(value, (list, tuple)):
238 value = ', '.join([f'{v}' for v in value])
239 else:
240 value = f'{value}'
241 value = value.replace('\r\n', r'\n')
242 value = value.replace('\n', r'\n')
243 if len(smap) > 0:
244 value = value.translate(smap)
245 df.write(f'{prefix}{"":>{clevel}}{k:<{w}}: {value}\n')
247 if not meta:
248 return
249 if hasattr(fh, 'write'):
250 own_file = False
251 else:
252 own_file = True
253 fh = open(fh, 'w')
254 smap = {}
255 if replace:
256 smap = str.maketrans('\r\n\t\x00', ''.join([replace]*4))
257 write_dict(fh, meta, 0, smap)
258 if own_file:
259 fh.close()
262def print_metadata(meta, prefix='', indent=4, replace=None):
263 """Write meta data to standard output.
265 Parameters
266 ----------
267 meta: nested dict
268 Key-value pairs of metadata to be written into the file.
269 prefix: str
270 This string is written at the beginning of each line.
271 indent: int
272 Number of characters used for indentation of sections.
273 replace: char or None
274 If specified, replace special characters by this character.
276 Examples
277 --------
278 ```
279 >>> from audioio import print_metadata
280 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
281 >>> print_metadata(md)
282 aaaa: 2
283 bbbb:
284 ccc: 3
285 ddd: 4
286 eee:
287 hh: 5
288 iiii:
289 jjj: 6
290 ```
291 """
292 write_metadata_text(sys.stdout, meta, prefix, indent, replace)
295def flatten_metadata(md, keep_sections=False, sep='.'):
296 """Flatten hierarchical metadata to a single dictionary.
298 Parameters
299 ----------
300 md: nested dict
301 Metadata as returned by `metadata()`.
302 keep_sections: bool
303 If `True`, then prefix keys with section names, separated by `sep`.
304 sep: str
305 String for separating section names.
307 Returns
308 -------
309 d: dict
310 Non-nested dict containing all key-value pairs of `md`.
312 Examples
313 --------
314 ```
315 >>> from audioio import print_metadata, flatten_metadata
316 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(hh=5)), iiii=dict(jjj=6))
317 >>> print_metadata(md)
318 aaaa: 2
319 bbbb:
320 ccc: 3
321 ddd: 4
322 eee:
323 hh: 5
324 iiii:
325 jjj: 6
327 >>> fmd = flatten_metadata(md, keep_sections=True)
328 >>> print_metadata(fmd)
329 aaaa : 2
330 bbbb.ccc : 3
331 bbbb.ddd : 4
332 bbbb.eee.hh: 5
333 iiii.jjj : 6
334 ```
335 """
336 def flatten(cd, section):
337 df = {}
338 for k in cd:
339 if isinstance(cd[k], dict):
340 df.update(flatten(cd[k], section + k + sep))
341 else:
342 if keep_sections:
343 df[section+k] = cd[k]
344 else:
345 df[k] = cd[k]
346 return df
348 return flatten(md, '')
351def unflatten_metadata(md, sep='.'):
352 """Unflatten a previously flattened metadata dictionary.
354 Parameters
355 ----------
356 md: dict
357 Flat dictionary with key-value pairs as obtained from
358 `flatten_metadata()` with `keep_sections=True`.
359 sep: str
360 String that separates section names.
362 Returns
363 -------
364 d: nested dict
365 Hierarchical dictionary with sub-dictionaries and key-value pairs.
367 Examples
368 --------
369 ```
370 >>> from audioio import print_metadata, unflatten_metadata
371 >>> fmd = {'aaaa': 2, 'bbbb.ccc': 3, 'bbbb.ddd': 4, 'bbbb.eee.hh': 5, 'iiii.jjj': 6}
372 >>> print_metadata(fmd)
373 aaaa : 2
374 bbbb.ccc : 3
375 bbbb.ddd : 4
376 bbbb.eee.hh: 5
377 iiii.jjj : 6
379 >>> md = unflatten_metadata(fmd)
380 >>> print_metadata(md)
381 aaaa: 2
382 bbbb:
383 ccc: 3
384 ddd: 4
385 eee:
386 hh: 5
387 iiii:
388 jjj: 6
389 ```
390 """
391 umd = {} # unflattened metadata
392 cmd = [umd] # current metadata dicts for each level of the hierarchy
393 csk = [] # current section keys
394 for k in md:
395 ks = k.split(sep)
396 # go up the hierarchy:
397 for i in range(len(csk) - len(ks)):
398 csk.pop()
399 cmd.pop()
400 for kss in reversed(ks[:len(csk)]):
401 if kss == csk[-1]:
402 break
403 csk.pop()
404 cmd.pop()
405 # add new sections:
406 for kss in ks[len(csk):-1]:
407 csk.append(kss)
408 cmd[-1][kss] = {}
409 cmd.append(cmd[-1][kss])
410 # add key-value pair:
411 cmd[-1][ks[-1]] = md[k]
412 return umd
415def parse_number(s):
416 """Parse string with number and unit.
418 Parameters
419 ----------
420 s: str, float, or int
421 String to be parsed. The initial part of the string is
422 expected to be a number, the part following the number is
423 interpreted as the unit. If float or int, then return this
424 as the value with empty unit.
426 Returns
427 -------
428 v: None, int, or float
429 Value of the string as float. Without decimal point, an int is returned.
430 If the string does not contain a number, None is returned.
431 u: str
432 Unit that follows the initial number.
433 n: int
434 Number of digits behind the decimal point.
436 Examples
437 --------
439 ```
440 >>> from audioio import parse_number
442 # integer:
443 >>> parse_number('42')
444 (42, '', 0)
446 # integer with unit:
447 >>> parse_number('42ms')
448 (42, 'ms', 0)
450 # float with unit:
451 >>> parse_number('42.ms')
452 (42.0, 'ms', 0)
454 # float with unit:
455 >>> parse_number('42.3ms')
456 (42.3, 'ms', 1)
458 # float with space and unit:
459 >>> parse_number('423.17 Hz')
460 (423.17, 'Hz', 2)
461 ```
463 """
464 if not isinstance(s, str):
465 if isinstance(s, int):
466 return s, '', 0
467 if isinstance(s, float):
468 return s, '', 5
469 else:
470 return None, '', 0
471 n = len(s)
472 ip = n
473 have_point = False
474 for i in range(len(s)):
475 if s[i] == '.':
476 if have_point:
477 n = i
478 break
479 have_point = True
480 ip = i + 1
481 if not s[i] in '0123456789.+-':
482 n = i
483 break
484 if n == 0:
485 return None, s, 0
486 v = float(s[:n]) if have_point else int(s[:n])
487 u = s[n:].strip()
488 nd = n - ip if n >= ip else 0
489 return v, u, nd
492unit_prefixes = {'Deka': 1e1, 'deka': 1e1, 'Hekto': 1e2, 'hekto': 1e2,
493 'kilo': 1e3, 'Kilo': 1e3, 'Mega': 1e6, 'mega': 1e6,
494 'Giga': 1e9, 'giga': 1e9, 'Tera': 1e12, 'tera': 1e12,
495 'Peta': 1e15, 'peta': 1e15, 'Exa': 1e18, 'exa': 1e18,
496 'Dezi': 1e-1, 'dezi': 1e-1, 'Zenti': 1e-2, 'centi': 1e-2,
497 'Milli': 1e-3, 'milli': 1e-3, 'Micro': 1e-6, 'micro': 1e-6,
498 'Nano': 1e-9, 'nano': 1e-9, 'Piko': 1e-12, 'piko': 1e-12,
499 'Femto': 1e-15, 'femto': 1e-15, 'Atto': 1e-18, 'atto': 1e-18,
500 'da': 1e1, 'h': 1e2, 'K': 1e3, 'k': 1e3, 'M': 1e6,
501 'G': 1e9, 'T': 1e12, 'P': 1e15, 'E': 1e18,
502 'd': 1e-1, 'c': 1e-2, 'mu': 1e-6, 'u': 1e-6, 'm': 1e-3,
503 'n': 1e-9, 'p': 1e-12, 'f': 1e-15, 'a': 1e-18}
504""" SI prefixes for units with corresponding factors. """
507def change_unit(val, old_unit, new_unit):
508 """Scale numerical value to a new unit.
510 Adapted from https://github.com/relacs/relacs/blob/1facade622a80e9f51dbf8e6f8171ac74c27f100/options/src/parameter.cc#L1647-L1703
512 Parameters
513 ----------
514 val: float
515 Value given in `old_unit`.
516 old_unit: str
517 Unit of `val`.
518 new_unit: str
519 Requested unit of return value.
521 Returns
522 -------
523 new_val: float
524 The input value `val` scaled to `new_unit`.
526 Examples
527 --------
529 ```
530 >>> from audioio import change_unit
531 >>> change_unit(5, 'mm', 'cm')
532 0.5
534 >>> change_unit(5, '', 'cm')
535 5.0
537 >>> change_unit(5, 'mm', '')
538 5.0
540 >>> change_unit(5, 'cm', 'mm')
541 50.0
543 >>> change_unit(4, 'kg', 'g')
544 4000.0
546 >>> change_unit(12, '%', '')
547 0.12
549 >>> change_unit(1.24, '', '%')
550 124.0
552 >>> change_unit(2.5, 'min', 's')
553 150.0
555 >>> change_unit(3600, 's', 'h')
556 1.0
558 ```
560 """
561 # missing unit?
562 if not old_unit and not new_unit:
563 return val
564 if not old_unit and new_unit != '%':
565 return val
566 if not new_unit and old_unit != '%':
567 return val
569 # special units that directly translate into factors:
570 unit_factors = {'%': 0.01, 'hour': 60.0*60.0, 'h': 60.0*60.0, 'min': 60.0}
572 # parse old unit:
573 f1 = 1.0
574 if old_unit in unit_factors:
575 f1 = unit_factors[old_unit]
576 else:
577 for k in unit_prefixes:
578 if len(old_unit) > len(k) and old_unit[:len(k)] == k:
                f1 = unit_prefixes[k]
581 # parse new unit:
582 f2 = 1.0
583 if new_unit in unit_factors:
584 f2 = unit_factors[new_unit]
585 else:
586 for k in unit_prefixes:
587 if len(new_unit) > len(k) and new_unit[:len(k)] == k:
                f2 = unit_prefixes[k]
590 return val*f1/f2
593def find_key(metadata, key, sep='.'):
594 """Find dictionary in metadata hierarchy containing the specified key.
596 Parameters
597 ----------
598 metadata: nested dict
599 Metadata.
600 key: str
601 Key to be searched for (case insensitive).
602 May contain section names separated by `sep`, i.e.
603 "aaa.bbb.ccc" searches "ccc" (can be key-value pair or section)
604 in section "bbb" that needs to be a subsection of section "aaa".
605 sep: str
606 String that separates section names in `key`.
608 Returns
609 -------
610 md: dict
611 The innermost dictionary matching some sections of the search key.
612 If `key` is not at all contained in the metadata,
613 the top-level dictionary is returned.
614 key: str
        The part of the search key that was not found in `md`, or
        the final part of the search key, found in `md`.
618 Examples
619 --------
621 Independent of whether found or not found, you can assign to the
622 returned dictionary with the returned key.
624 ```
625 >>> from audioio import print_metadata, find_key
626 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4, eee=dict(ff=5)), gggg=dict(hhh=6))
627 >>> print_metadata(md)
628 aaaa: 2
629 bbbb:
630 ccc: 3
631 ddd: 4
632 eee:
633 ff: 5
634 gggg:
635 hhh: 6
637 >>> m, k = find_key(md, 'bbbb.ddd')
638 >>> m[k] = 10
639 >>> print_metadata(md)
640 aaaa: 2
641 bbbb:
642 ccc: 3
643 ddd: 10
644 ...
646 >>> m, k = find_key(md, 'hhh')
647 >>> m[k] = 12
648 >>> print_metadata(md)
649 ...
650 gggg:
651 hhh: 12
653 >>> m, k = find_key(md, 'bbbb.eee.xx')
654 >>> m[k] = 42
655 >>> print_metadata(md)
656 ...
657 eee:
658 ff: 5
659 xx: 42
660 ...
661 ```
    When searching for sections, the dictionary containing the searched
    section is returned:
665 ```py
666 >>> m, k = find_key(md, 'eee')
667 >>> m[k]['yy'] = 46
668 >>> print_metadata(md)
669 ...
670 eee:
671 ff: 5
672 xx: 42
673 yy: 46
674 ...
675 ```
677 """
678 def find_keys(metadata, keys):
679 key = keys[0].strip().upper()
680 for k in metadata:
681 if k.upper() == key:
682 if len(keys) == 1:
683 # found key:
684 return True, metadata, k
685 elif isinstance(metadata[k], dict):
686 # keep searching within the next section:
687 return find_keys(metadata[k], keys[1:])
688 # search in subsections:
689 for k in metadata:
690 if isinstance(metadata[k], dict):
691 found, mm, kk = find_keys(metadata[k], keys)
692 if found:
693 return True, mm, kk
694 # nothing found:
695 return False, metadata, sep.join(keys)
697 if metadata is None:
698 return {}, None
699 ks = key.strip().split(sep)
700 found, mm, kk = find_keys(metadata, ks)
701 return mm, kk
704def get_number_unit(metadata, keys, sep='.', default=None,
705 default_unit='', remove=False):
706 """Find a key in metadata and return its number and unit.
708 Parameters
709 ----------
710 metadata: nested dict
711 Metadata.
712 keys: str or list of str
713 Keys in the metadata to be searched for (case insensitive).
714 Value of the first key found is returned.
715 May contain section names separated by `sep`.
716 See `audiometadata.find_key()` for details.
717 sep: str
718 String that separates section names in `key`.
719 default: None, int, or float
720 Returned value if `key` is not found or the value does
721 not contain a number.
722 default_unit: str
723 Returned unit if `key` is not found or the key's value does
724 not have a unit.
725 remove: bool
726 If `True`, remove the found key from `metadata`.
728 Returns
729 -------
730 v: None, int, or float
731 Value referenced by `key` as float.
732 Without decimal point, an int is returned.
733 If none of the `keys` was found or
        the key's value does not contain a number,
735 then `default` is returned.
736 u: str
737 Corresponding unit.
739 Examples
740 --------
742 ```
743 >>> from audioio import get_number_unit
744 >>> md = dict(aaaa='42', bbbb='42.3ms')
746 # integer:
747 >>> get_number_unit(md, 'aaaa')
748 (42, '')
750 # float with unit:
751 >>> get_number_unit(md, 'bbbb')
752 (42.3, 'ms')
754 # two keys:
755 >>> get_number_unit(md, ['cccc', 'bbbb'])
756 (42.3, 'ms')
758 # not found:
759 >>> get_number_unit(md, 'cccc')
760 (None, '')
762 # not found with default value:
763 >>> get_number_unit(md, 'cccc', default=1.0, default_unit='a.u.')
764 (1.0, 'a.u.')
765 ```
767 """
768 if not metadata:
769 return default, default_unit
770 if not isinstance(keys, (list, tuple, np.ndarray)):
771 keys = (keys,)
772 value = default
773 unit = default_unit
774 for key in keys:
775 m, k = find_key(metadata, key, sep)
776 if k in m:
777 v, u, _ = parse_number(m[k])
778 if v is not None:
779 if not u:
780 u = default_unit
781 if remove:
782 del m[k]
783 return v, u
784 elif u and unit == default_unit:
785 unit = u
786 return value, unit
789def get_number(metadata, unit, keys, sep='.', default=None, remove=False):
790 """Find a key in metadata and return its value in a given unit.
792 Parameters
793 ----------
794 metadata: nested dict
795 Metadata.
796 unit: str
797 Unit in which to return numerical value referenced by one of the `keys`.
798 keys: str or list of str
799 Keys in the metadata to be searched for (case insensitive).
800 Value of the first key found is returned.
801 May contain section names separated by `sep`.
802 See `audiometadata.find_key()` for details.
803 sep: str
804 String that separates section names in `key`.
805 default: None, int, or float
806 Returned value if `key` is not found or the value does
807 not contain a number.
808 remove: bool
809 If `True`, remove the found key from `metadata`.
811 Returns
812 -------
813 v: None or float
814 Value referenced by `key` as float scaled to `unit`.
815 If none of the `keys` was found or
        the key's value does not contain a number,
817 then `default` is returned.
819 Examples
820 --------
822 ```
823 >>> from audioio import get_number
824 >>> md = dict(aaaa='42', bbbb='42.3ms')
826 # milliseconds to seconds:
827 >>> get_number(md, 's', 'bbbb')
828 0.0423
830 # milliseconds to microseconds:
831 >>> get_number(md, 'us', 'bbbb')
832 42300.0
834 # value without unit is not scaled:
835 >>> get_number(md, 'Hz', 'aaaa')
836 42
838 # two keys:
839 >>> get_number(md, 's', ['cccc', 'bbbb'])
840 0.0423
842 # not found:
843 >>> get_number(md, 's', 'cccc')
844 None
846 # not found with default value:
847 >>> get_number(md, 's', 'cccc', default=1.0)
848 1.0
849 ```
851 """
852 v, u = get_number_unit(metadata, keys, sep, None, unit, remove)
853 if v is None:
854 return default
855 else:
856 return change_unit(v, u, unit)
859def get_int(metadata, keys, sep='.', default=None, remove=False):
860 """Find a key in metadata and return its integer value.
862 Parameters
863 ----------
864 metadata: nested dict
865 Metadata.
866 keys: str or list of str
867 Keys in the metadata to be searched for (case insensitive).
868 Value of the first key found is returned.
869 May contain section names separated by `sep`.
870 See `audiometadata.find_key()` for details.
871 sep: str
872 String that separates section names in `key`.
873 default: None or int
874 Return value if `key` is not found or the value does
875 not contain an integer.
876 remove: bool
877 If `True`, remove the found key from `metadata`.
879 Returns
880 -------
881 v: None or int
882 Value referenced by `key` as integer.
883 If none of the `keys` was found,
884 the key's value does not contain a number or represents
885 a floating point value, then `default` is returned.
887 Examples
888 --------
890 ```
891 >>> from audioio import get_int
892 >>> md = dict(aaaa='42', bbbb='42.3ms')
894 # integer:
895 >>> get_int(md, 'aaaa')
896 42
898 # two keys:
899 >>> get_int(md, ['cccc', 'aaaa'])
900 42
902 # float:
903 >>> get_int(md, 'bbbb')
904 None
906 # not found:
907 >>> get_int(md, 'cccc')
908 None
910 # not found with default value:
911 >>> get_int(md, 'cccc', default=0)
912 0
913 ```
915 """
916 if not metadata:
917 return default
918 if not isinstance(keys, (list, tuple, np.ndarray)):
919 keys = (keys,)
920 for key in keys:
921 m, k = find_key(metadata, key, sep)
922 if k in m:
923 v, _, n = parse_number(m[k])
924 if v is not None and n == 0:
925 if remove:
926 del m[k]
927 return int(v)
928 return default
931def get_bool(metadata, keys, sep='.', default=None, remove=False):
932 """Find a key in metadata and return its boolean value.
934 Parameters
935 ----------
936 metadata: nested dict
937 Metadata.
938 keys: str or list of str
939 Keys in the metadata to be searched for (case insensitive).
940 Value of the first key found is returned.
941 May contain section names separated by `sep`.
942 See `audiometadata.find_key()` for details.
943 sep: str
944 String that separates section names in `key`.
945 default: None or bool
946 Return value if `key` is not found or the value does
947 not specify a boolean value.
948 remove: bool
949 If `True`, remove the found key from `metadata`.
951 Returns
952 -------
953 v: None or bool
954 Value referenced by `key` as boolean.
        True if 'true', 't', 'yes', 'y' (case insensitive) or any non-zero number.
        False if 'false', 'f', 'no', 'n' (case insensitive) or any number equal to zero.
        If none of the `keys` was found or
        the key's value does not specify a boolean value,
        then `default` is returned.
961 Examples
962 --------
964 ```
965 >>> from audioio import get_bool
966 >>> md = dict(aaaa='TruE', bbbb='No', cccc=0, dddd=1, eeee=True, ffff='ui')
968 # case insensitive:
969 >>> get_bool(md, 'aaaa')
970 True
972 >>> get_bool(md, 'bbbb')
973 False
975 >>> get_bool(md, 'cccc')
976 False
978 >>> get_bool(md, 'dddd')
979 True
981 >>> get_bool(md, 'eeee')
982 True
984 # not found:
985 >>> get_bool(md, 'ffff')
986 None
988 # two keys (string is preferred over number):
989 >>> get_bool(md, ['cccc', 'aaaa'])
990 True
992 # two keys (take first match):
993 >>> get_bool(md, ['cccc', 'ffff'])
994 False
996 # not found with default value:
997 >>> get_bool(md, 'ffff', default=False)
998 False
999 ```
1001 """
1002 if not metadata:
1003 return default
1004 if not isinstance(keys, (list, tuple, np.ndarray)):
1005 keys = (keys,)
1006 val = default
1007 mv = None
1008 kv = None
1009 for key in keys:
1010 m, k = find_key(metadata, key, sep)
1011 if k in m and not isinstance(m[k], dict):
1012 vs = m[k]
1013 v, _, _ = parse_number(vs)
1014 if v is not None:
1015 val = abs(v) > 1e-8
1016 mv = m
1017 kv = k
1018 elif isinstance(vs, str):
1019 if vs.upper() in ['TRUE', 'T', 'YES', 'Y']:
1020 if remove:
1021 del m[k]
1022 return True
1023 if vs.upper() in ['FALSE', 'F', 'NO', 'N']:
1024 if remove:
1025 del m[k]
1026 return False
1027 if not mv is None and not kv is None and remove:
1028 del mv[kv]
1029 return val
1032default_starttime_keys = [['DateTimeOriginal'],
1033 ['OriginationDate', 'OriginationTime'],
1034 ['Location_Time'],
1035 ['Timestamp']]
1036"""Default keys of times of start of the recording in metadata.
1037Used by `get_datetime()` and `update_starttime()` functions.
1038"""
1040def get_datetime(metadata, keys=default_starttime_keys,
1041 sep='.', default=None, remove=False):
1042 """Find keys in metadata and return a datetime.
1044 Parameters
1045 ----------
1046 metadata: nested dict
1047 Metadata.
1048 keys: tuple of str or list of tuple of str
1049 Datetimes can be stored in metadata as two separate key-value pairs,
1050 one for the date and one for the time. Or by a single key-value pair
1051 for a date-time value. This is why the keys need to be specified in
1052 tuples with one or two keys.
1053 The value of the first tuple of keys found is returned.
1054 Keys may contain section names separated by `sep`.
1055 See `audiometadata.find_key()` for details.
1056 The default values for the `keys` find the start time of a recording.
1057 You can modify the default keys via the `default_starttime_keys` list
1058 of the `audiometadata` module.
1059 sep: str
1060 String that separates section names in `key`.
    default: None or datetime
        Return value if none of the `keys` is found or their values
        are neither strings nor date and time objects.
1064 remove: bool
1065 If `True`, remove the found key from `metadata`.
1067 Returns
1068 -------
1069 v: None or datetime
1070 Datetime referenced by `keys`.
1071 If none of the `keys` was found, then `default` is returned.
1073 Examples
1074 --------
1076 ```
1077 >>> from audioio import get_datetime
1078 >>> import datetime as dt
1079 >>> md = dict(date='2024-03-02', time='10:42:24',
1080 datetime='2023-04-15T22:10:00')
1082 # separate date and time:
1083 >>> get_datetime(md, ('date', 'time'))
1084 datetime.datetime(2024, 3, 2, 10, 42, 24)
1086 # single datetime:
1087 >>> get_datetime(md, ('datetime',))
1088 datetime.datetime(2023, 4, 15, 22, 10)
1090 # two alternative key tuples:
1091 >>> get_datetime(md, [('aaaa',), ('date', 'time')])
1092 datetime.datetime(2024, 3, 2, 10, 42, 24)
1094 # not found:
1095 >>> get_datetime(md, ('cccc',))
1096 None
1098 # not found with default value:
1099 >>> get_datetime(md, ('cccc', 'dddd'),
1100 default=dt.datetime(2022, 2, 22, 22, 2, 12))
1101 datetime.datetime(2022, 2, 22, 22, 2, 12)
1102 ```
1104 """
1105 if not metadata:
1106 return default
1107 if len(keys) > 0 and isinstance(keys[0], str):
1108 keys = (keys,)
1109 for keyp in keys:
1110 if len(keyp) == 1:
1111 m, k = find_key(metadata, keyp[0], sep)
1112 if k in m:
1113 v = m[k]
1114 if isinstance(v, dt.datetime):
1115 if remove:
1116 del m[k]
1117 return v
1118 elif isinstance(v, str):
1119 if remove:
1120 del m[k]
1121 return dt.datetime.fromisoformat(v)
1122 else:
1123 md, kd = find_key(metadata, keyp[0], sep)
1124 if not kd in md:
1125 continue
1126 if isinstance(md[kd], dt.date):
1127 date = md[kd]
1128 elif isinstance(md[kd], str):
1129 date = dt.date.fromisoformat(md[kd])
1130 else:
1131 continue
1132 mt, kt = find_key(metadata, keyp[1], sep)
1133 if not kt in mt:
1134 continue
1135 if isinstance(mt[kt], dt.time):
1136 time = mt[kt]
1137 elif isinstance(mt[kt], str):
1138 time = dt.time.fromisoformat(mt[kt])
1139 else:
1140 continue
1141 if remove:
1142 del md[kd]
1143 del mt[kt]
1144 return dt.datetime.combine(date, time)
1145 return default
1148def get_str(metadata, keys, sep='.', default=None, remove=False):
1149 """Find a key in metadata and return its string value.
1151 Parameters
1152 ----------
1153 metadata: nested dict
1154 Metadata.
1155 keys: str or list of str
1156 Keys in the metadata to be searched for (case insensitive).
1157 Value of the first key found is returned.
1158 May contain section names separated by `sep`.
1159 See `audiometadata.find_key()` for details.
1160 sep: str
1161 String that separates section names in `key`.
1162 default: None or str
1163 Return value if `key` is not found or the value does
1164 not contain a string.
1165 remove: bool
1166 If `True`, remove the found key from `metadata`.
1168 Returns
1169 -------
1170 v: None or str
1171 String value referenced by `key`.
1172 If none of the `keys` was found, then `default` is returned.
1174 Examples
1175 --------
1177 ```
1178 >>> from audioio import get_str
1179 >>> md = dict(aaaa=42, bbbb='hello')
1181 # string:
1182 >>> get_str(md, 'bbbb')
1183 'hello'
1185 # int as str:
1186 >>> get_str(md, 'aaaa')
1187 '42'
1189 # two keys:
1190 >>> get_str(md, ['cccc', 'bbbb'])
1191 'hello'
1193 # not found:
1194 >>> get_str(md, 'cccc')
1195 None
1197 # not found with default value:
1198 >>> get_str(md, 'cccc', default='-')
1199 '-'
1200 ```
1202 """
1203 if not metadata:
1204 return default
1205 if not isinstance(keys, (list, tuple, np.ndarray)):
1206 keys = (keys,)
1207 for key in keys:
1208 m, k = find_key(metadata, key, sep)
1209 if k in m and not isinstance(m[k], dict):
1210 v = m[k]
1211 if remove:
1212 del m[k]
1213 return str(v)
1214 return default
1217def add_sections(metadata, sections, value=False, sep='.'):
1218 """Add sections to metadata dictionary.
1220 Parameters
1221 ----------
1222 metadata: nested dict
1223 Metadata.
1224 key: str
1225 Names of sections to be added to `metadata`.
1226 Section names separated by `sep`.
1227 value: bool
1228 If True, then the last element in `sections` is a key for a value,
1229 not a section.
1230 sep: str
1231 String that separates section names in `sections`.
1233 Returns
1234 -------
1235 md: dict
1236 Dictionary of the last added section.
1237 key: str
1238 Last key. Only returned if `value` is set to `True`.
1240 Examples
1241 --------
1243 Add a section and a sub-section to the metadata:
1244 ```
1245 >>> from audioio import print_metadata, add_sections
1246 >>> md = dict()
1247 >>> m = add_sections(md, 'Recording.Location')
1248 >>> m['Country'] = 'Lummerland'
1249 >>> print_metadata(md)
1250 Recording:
1251 Location:
1252 Country: Lummerland
1253 ```
1255 Add a section with a key-value pair:
1256 ```
1257 >>> md = dict()
1258 >>> m, k = add_sections(md, 'Recording.Location', True)
1259 >>> m[k] = 'Lummerland'
1260 >>> print_metadata(md)
1261 Recording:
1262 Location: Lummerland
1263 ```
1265 Works well together with `find_key()`:
1266 ```
1267 >>> md = dict(Recording=dict())
1268 >>> m, k = find_key(md, 'Recording.Location.Country')
1269 >>> m, k = add_sections(m, k, True)
1270 >>> m[k] = 'Lummerland'
1271 >>> print_metadata(md)
1272 Recording:
1273 Location:
1274 Country: Lummerland
1275 ```
1277 """
1278 mm = metadata
1279 ks = sections.split(sep)
1280 n = len(ks)
1281 if value:
1282 n -= 1
1283 for k in ks[:n]:
1284 if len(k) == 0:
1285 continue
1286 mm[k] = dict()
1287 mm = mm[k]
1288 if value:
1289 return mm, ks[-1]
1290 else:
1291 return mm
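The section-creation step described for `add_sections()` can be pictured with plain dictionaries. The following is only a stand-alone sketch, not the module's code: `ensure_sections` is a hypothetical helper, and it uses `dict.setdefault()` so that sections that already exist are reused.

```python
def ensure_sections(metadata, path, sep='.'):
    """Hypothetical helper: walk `path` and create missing sections.

    Assumes that every existing value along `path` is a dictionary.
    """
    section = metadata
    for name in path.split(sep):
        if name:  # skip empty names, e.g. from a leading separator
            section = section.setdefault(name, {})
    return section


md = {'Recording': {'Time': 'early'}}
loc = ensure_sections(md, 'Recording.Location')
loc['Country'] = 'Lummerland'
print(md)
```

The returned dictionary is the innermost section, so key-value pairs can be assigned to it directly.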
1294def strlist_to_dict(mds):
1295 """Convert list of key-value-pair strings to dictionary.
1297 Parameters
1298 ----------
1299 mds: None or dict or str or list of str
1300 - None - returns empty dictionary.
1301 - Flat dictionary - returned as is.
1302 - String with key and value separated by '='.
1303 - List of strings with keys and values separated by '='.
1304 Keys may contain section names.
1306 Returns
1307 -------
1308 md_dict: dict
1309 Flat dictionary with key-value pairs.
1310 Keys may contain section names.
1311 Values are strings, other types or dictionaries.
1312 """
1313 if mds is None:
1314 return {}
1315 if isinstance(mds, dict):
1316 return mds
1317 if not isinstance(mds, (list, tuple, np.ndarray)):
1318 mds = (mds,)
1319 md_dict = {}
1320 for md in mds:
1321 k, v = md.split('=', 1)
1322 k = k.strip()
1323 v = v.strip()
1324 md_dict[k] = v
1325 return md_dict
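As a rough illustration of the conversion documented for `strlist_to_dict()`, the sketch below turns "key=value" strings into a flat dictionary. `parse_key_values` is a hypothetical stand-in written for this example; it splits on the first '=' only, so that values may themselves contain '='.

```python
def parse_key_values(items):
    """Hypothetical stand-in: 'key=value' strings to a flat dict."""
    if items is None:
        return {}
    if isinstance(items, dict):
        return items
    if isinstance(items, str):
        items = [items]
    result = {}
    for item in items:
        # split on the first '=' only; the rest belongs to the value
        key, value = item.split('=', 1)
        result[key.strip()] = value.strip()
    return result


pairs = parse_key_values(['Artist = John Doe', 'Recording.Time=late'])
print(pairs)  # {'Artist': 'John Doe', 'Recording.Time': 'late'}
```

The keys keep their section names ("Recording.Time"); resolving them against nested metadata is left to functions like `set_metadata()` and `add_metadata()`.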
1328def set_metadata(metadata, mds, sep='.'):
1329 """Set values of existing metadata.
1331 A value is updated only if its key already exists in the metadata.
1333 Parameters
1334 ----------
1335 metadata: nested dict
1336 Metadata.
1337 mds: dict or str or list of str
1338 - Flat dictionary with key-value pairs for updating the metadata.
1339 Values can be strings, other types or dictionaries.
1340 - String with key and value separated by '='.
1341 - List of strings with key and value separated by '='.
1342 Keys may contain section names separated by `sep`.
1343 sep: str
1344 String that separates section names in the keys of `mds`.
1346 Examples
1347 --------
1348 ```
1349 >>> from audioio import print_metadata, set_metadata
1350 >>> md = dict(Recording=dict(Time='early'))
1351 >>> print_metadata(md)
1352 Recording:
1353 Time: early
1355 >>> set_metadata(md, {'Artist': 'John Doe', # key does not exist: ignored
1356 'Recording.Time': 'late'}) # change value of existing key
1357 >>> print_metadata(md)
1358 Recording:
1359 Time : late
1360 ```
1362 See also
1363 --------
1364 add_metadata()
1365 strlist_to_dict()
1367 """
1368 if metadata is None:
1369 return
1370 md_dict = strlist_to_dict(mds)
1371 for k in md_dict:
1372 mm, kk = find_key(metadata, k, sep)
1373 if kk in mm:
1374 mm[kk] = md_dict[k]
1377def add_metadata(metadata, mds, sep='.'):
1378 """Add or modify key-value pairs.
1380 If a key does not exist, it is added to the metadata.
1382 Parameters
1383 ----------
1384 metadata: nested dict
1385 Metadata.
1386 mds: dict or str or list of str
1387 - Flat dictionary with key-value pairs for updating the metadata.
1388 Values can be strings or other types.
1389 - String with key and value separated by '='.
1390 - List of strings with key and value separated by '='.
1391 Keys may contain section names separated by `sep`.
1392 sep: str
1393 String that separates section names in the keys of `mds`.
1395 Examples
1396 --------
1397 ```
1398 >>> from audioio import print_metadata, add_metadata
1399 >>> md = dict(Recording=dict(Time='early'))
1400 >>> print_metadata(md)
1401 Recording:
1402 Time: early
1404 >>> add_metadata(md, {'Artist': 'John Doe', # new key-value pair
1405 'Recording.Time': 'late', # change value of existing key
1406 'Recording.Quality': 'amazing', # new key-value pair in existing section
1407 'Location.Country': 'Lummerland'}) # new key-value pair in new section
1408 >>> print_metadata(md)
1409 Recording:
1410 Time : late
1411 Quality: amazing
1412 Artist: John Doe
1413 Location:
1414 Country: Lummerland
1415 ```
1417 See also
1418 --------
1419 set_metadata()
1420 strlist_to_dict()
1422 """
1423 if metadata is None:
1424 return
1425 md_dict = strlist_to_dict(mds)
1426 for k in md_dict:
1427 mm, kk = find_key(metadata, k, sep)
1428 mm, kk = add_sections(mm, kk, True, sep)
1429 mm[kk] = md_dict[k]
1432def move_metadata(src_md, dest_md, keys, new_key=None, sep='.'):
1433 """Remove a key from metadata and add it to a dictionary.
1435 Parameters
1436 ----------
1437 src_md: nested dict
1438 Metadata from which a key is removed.
1439 dest_md: dict
1440 Dictionary to which the found key and its value are added.
1441 keys: str or list of str
1442 List of keys to be searched for in `src_md`.
1443 Move the first one found to `dest_md`.
1444 See the `audiometadata.find_key()` function for details.
1445 new_key: None or str
1446 If specified, add the value of the found key as `new_key` to
1447 `dest_md`. Otherwise, use the search key.
1448 sep: str
1449 String that separates section names in `keys`.
1451 Returns
1452 -------
1453 moved: bool
1454 `True` if key was found and moved to dictionary.
1456 Examples
1457 --------
1458 ```
1459 >>> from audioio import print_metadata, move_metadata
1460 >>> md = dict(Artist='John Doe', Recording=dict(Gain='1.42mV'))
1461 >>> move_metadata(md, md['Recording'], 'Artist', 'Experimentalist')
1462 >>> print_metadata(md)
1463 Recording:
1464 Gain : 1.42mV
1465 Experimentalist: John Doe
1466 ```
1468 """
1469 if not src_md:
1470 return False
1471 if not isinstance(keys, (list, tuple, np.ndarray)):
1472 keys = (keys,)
1473 for key in keys:
1474 m, k = find_key(src_md, key, sep)
1475 if k in m:
1476 dest_key = new_key if new_key else k
1477 dest_md[dest_key] = m.pop(k)
1478 return True
1479 return False
1482def remove_metadata(metadata, key_list, sep='.'):
1483 """Remove key-value pairs or sections from metadata.
1485 Parameters
1486 ----------
1487 metadata: nested dict
1488 Metadata.
1489 key_list: str or list of str
1490 List of keys to key-value pairs or sections to be removed
1491 from the metadata.
1492 sep: str
1493 String that separates section names in the keys of `key_list`.
1495 Examples
1496 --------
1497 ```
1498 >>> from audioio import print_metadata, remove_metadata
1499 >>> md = dict(aaaa=2, bbbb=dict(ccc=3, ddd=4))
1500 >>> remove_metadata(md, ('ccc',))
1501 >>> print_metadata(md)
1502 aaaa: 2
1503 bbbb:
1504 ddd: 4
1505 ```
1507 """
1508 if not metadata:
1509 return
1510 if not isinstance(key_list, (list, tuple, np.ndarray)):
1511 key_list = (key_list,)
1512 for k in key_list:
1513 mm, kk = find_key(metadata, k, sep)
1514 if kk in mm:
1515 del mm[kk]
1518def cleanup_metadata(metadata):
1519 """Remove empty sections from metadata.
1521 Parameters
1522 ----------
1523 metadata: nested dict
1524 Metadata.
1526 Examples
1527 --------
1528 ```
1529 >>> from audioio import print_metadata, cleanup_metadata
1530 >>> md = dict(aaaa=2, bbbb=dict())
1531 >>> cleanup_metadata(md)
1532 >>> print_metadata(md)
1533 aaaa: 2
1534 ```
1536 """
1537 if not metadata:
1538 return
1539 for k in list(metadata):
1540 if isinstance(metadata[k], dict):
1541 if len(metadata[k]) == 0:
1542 del metadata[k]
1543 else:
1544 cleanup_metadata(metadata[k])
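The listing above checks whether a section is empty before descending into it, so a section that becomes empty only because its sub-sections were removed is kept. The sketch below shows a bottom-up variant that removes such sections as well. `prune_empty_sections` is a hypothetical name and this is an alternative ordering, not the module's implementation.

```python
def prune_empty_sections(metadata):
    """Hypothetical variant: remove empty sections bottom-up."""
    for key in list(metadata):
        value = metadata[key]
        if isinstance(value, dict):
            # clean the children first ...
            prune_empty_sections(value)
            # ... then drop the section if nothing is left in it
            if len(value) == 0:
                del metadata[key]


md = dict(aaaa=2, bbbb=dict(), cccc=dict(dddd=dict()))
prune_empty_sections(md)
print(md)  # {'aaaa': 2}
```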
1547default_gain_keys = ['gain']
1548"""Default keys of gain settings in metadata. Used by `get_gain()` function.
1549"""
1551def get_gain(metadata, gain_key=default_gain_keys, sep='.',
1552 default=None, default_unit='', remove=False):
1553 """Get gain and unit from metadata.
1555 Parameters
1556 ----------
1557 metadata: nested dict
1558 Metadata with key-value pairs.
1559 gain_key: str or list of str
1560 Key in the file's metadata that holds some gain information.
1561 If found, the data will be multiplied with the gain,
1562 and if available, the corresponding unit is returned.
1563 See the `audiometadata.find_key()` function for details.
1564 You can modify the default keys via the `default_gain_keys` list
1565 of the `audiometadata` module.
1566 sep: str
1567 String that separates section names in `gain_key`.
1568 default: None or float
1569 Returned value if no valid gain was found in `metadata`.
1570 default_unit: str
1571 Returned unit if no valid gain was found in `metadata`.
1572 remove: bool
1573 If `True`, remove the found key from `metadata`.
1575 Returns
1576 -------
1577 fac: float
1578 Gain factor. If no valid gain was found in the metadata, `default` is returned.
1579 unit: string
1580 Unit of the gain if found in the metadata, otherwise `default_unit`.
1581 """
1582 v, u = get_number_unit(metadata, gain_key, sep, default,
1583 default_unit, remove)
1584 # fix some TeeGrid gains:
1585 if len(u) >= 2 and u[-2:] == '/V':
1586 u = u[:-2]
1587 return v, u
1590def update_gain(metadata, fac, gain_key=default_gain_keys, sep='.'):
1591 """Update gain setting in metadata.
1593 Searches for the first appearance of a gain key in the metadata
1594 hierarchy. If found, divide the gain value by `fac`.
1596 Parameters
1597 ----------
1598 metadata: nested dict
1599 Metadata to be updated.
1600 fac: float
1601 Factor that was used to scale the data.
1602 gain_key: str or list of str
1603 Key in the file's metadata that holds some gain information.
1604 If found, its value is divided by `fac`;
1605 the unit of the value is kept.
1606 See the `audiometadata.find_key()` function for details.
1607 You can modify the default keys via the `default_gain_keys` list
1608 of the `audiometadata` module.
1609 sep: str
1610 String that separates section names in `gain_key`.
1612 Returns
1613 -------
1614 done: bool
1615 True if gain has been found and set.
1618 Examples
1619 --------
1621 ```
1622 >>> from audioio import print_metadata, update_gain
1623 >>> md = dict(Artist='John Doe', Recording=dict(gain='1.4mV'))
1624 >>> update_gain(md, 2)
1625 >>> print_metadata(md)
1626 Artist: John Doe
1627 Recording:
1628 gain: 0.70mV
1629 ```
1631 """
1632 if not metadata:
1633 return False
1634 if not isinstance(gain_key, (list, tuple, np.ndarray)):
1635 gain_key = (gain_key,)
1636 for gk in gain_key:
1637 m, k = find_key(metadata, gk, sep)
1638 if k in m and not isinstance(m[k], dict):
1639 vs = m[k]
1640 if isinstance(vs, (int, float)):
1641 m[k] = vs/fac
1642 else:
1643 v, u, n = parse_number(vs)
1644 if not v is None:
1645 # fix some TeeGrid gains:
1646 if len(u) >= 2 and u[-2:] == '/V':
1647 u = u[:-2]
1648 m[k] = f'{v/fac:.{n+1}f}{u}'
1649 return True
1650 return False
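The string branch of `update_gain()` divides the number by `fac` and writes it back with one more decimal digit than the original string had. The sketch below reproduces that formatting rule in isolation. `scale_gain` is a hypothetical helper, and its regular expression is only a simplified stand-in for `parse_number()`, whose exact behavior is not shown in this listing.

```python
import re


def scale_gain(gain, fac):
    """Hypothetical helper: divide a gain string like '1.4mV' by `fac`."""
    match = re.fullmatch(r'\s*([+-]?\d+(?:\.(\d*))?)\s*(.*)', gain)
    if match is None:
        return gain  # no leading number: leave the string untouched
    value = float(match.group(1))
    decimals = len(match.group(2) or '')
    unit = match.group(3)
    if unit.endswith('/V'):
        unit = unit[:-2]  # same unit fix as in the listing above
    # one extra digit, so that halving 1.4 does not lose precision
    return f'{value/fac:.{decimals + 1}f}{unit}'


print(scale_gain('1.4mV', 2))  # 0.70mV
```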
1653def set_starttime(metadata, datetime_value,
1654 time_keys=default_starttime_keys):
1655 """Set all start-of-recording times in metadata.
1657 Parameters
1658 ----------
1659 metadata: nested dict
1660 Metadata to be updated.
1661 datetime_value: datetime or str
1662 Start date and time of the recording. Strings must be in ISO format.
1663 time_keys: tuple of str or list of tuple of str
1664 Keys to fields denoting calendar times, i.e. dates and times.
1665 Datetimes can be stored in metadata as two separate key-value pairs,
1666 one for the date and one for the time, or as a single key-value pair
1667 holding a date-time value. This is why the keys need to be specified in
1668 tuples with one or two keys.
1669 Keys may contain section names separated by `sep`.
1670 See `audiometadata.find_key()` for details.
1671 You can modify the default time keys via the `default_starttime_keys`
1672 list of the `audiometadata` module.
1674 Returns
1675 -------
1676 success: bool
1677 True if at least one time has been set.
1679 Example
1680 -------
1681 ```
1682 >>> from audioio import print_metadata, set_starttime
1683 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00',
1684 OtherTime='2023-05-16T23:20:10',
1685 BEXT=dict(OriginationDate='2024-03-02',
1686 OriginationTime='10:42:24'))
1687 >>> set_starttime(md, '2024-06-17T22:10:05')
1688 >>> print_metadata(md)
1689 DateTimeOriginal: 2024-06-17T22:10:05
1690 OtherTime : 2023-05-16T23:20:10
1691 BEXT:
1692 OriginationDate: 2024-06-17
1693 OriginationTime: 22:10:05
1694 ```
1696 """
1697 if not metadata:
1698 return False
1699 if isinstance(datetime_value, str):
1700 datetime_value = dt.datetime.fromisoformat(datetime_value)
1701 success = False
1702 if len(time_keys) > 0 and isinstance(time_keys[0], str):
1703 time_keys = (time_keys,)
1704 for key in time_keys:
1705 if len(key) == 1:
1706 # datetime:
1707 m, k = find_key(metadata, key[0])
1708 if k in m and not isinstance(m[k], dict):
1709 if isinstance(m[k], dt.datetime):
1710 m[k] = datetime_value
1711 else:
1712 m[k] = datetime_value.isoformat(timespec='seconds')
1713 success = True
1714 else:
1715 # separate date and time:
1716 md, kd = find_key(metadata, key[0])
1717 if not kd in md or isinstance(md[kd], dict):
1718 continue
1719 if isinstance(md[kd], dt.date):
1720 md[kd] = datetime_value.date()
1721 else:
1722 md[kd] = datetime_value.date().isoformat()
1723 mt, kt = find_key(metadata, key[1])
1724 if not kt in mt or isinstance(mt[kt], dict):
1725 continue
1726 if isinstance(mt[kt], dt.time):
1727 mt[kt] = datetime_value.time()
1728 else:
1729 mt[kt] = datetime_value.time().isoformat(timespec='seconds')
1730 success = True
1731 return success
1734default_timeref_keys = ['TimeReference']
1735"""Default keys of integer time references in metadata.
1736Used by `update_starttime()` function.
1737"""
1739def update_starttime(metadata, deltat, rate,
1740 time_keys=default_starttime_keys,
1741 ref_keys=default_timeref_keys):
1742 """Update start-of-recording times in metadata.
1744 Add `deltat` to the `time_keys` and `ref_keys` fields in the metadata.
1746 Parameters
1747 ----------
1748 metadata: nested dict
1749 Metadata to be updated.
1750 deltat: float
1751 Time in seconds to be added to start times.
1752 rate: float
1753 Sampling rate of the data in Hertz.
1754 time_keys: tuple of str or list of tuple of str
1755 Keys to fields denoting calendar times, i.e. dates and times.
1756 Datetimes can be stored in metadata as two separate key-value pairs,
1757 one for the date and one for the time, or as a single key-value pair
1758 holding a date-time value. This is why the keys need to be specified in
1759 tuples with one or two keys.
1760 Keys may contain section names separated by `sep`.
1761 See `audiometadata.find_key()` for details.
1762 You can modify the default time keys via the `default_starttime_keys`
1763 list of the `audiometadata` module.
1764 ref_keys: str or list of str
1765 Keys to time references, i.e. integer sample counts relative to
1766 a reference time. They are advanced by `deltat` times `rate` samples.
1767 Keys may contain section names separated by `sep`.
1768 See `audiometadata.find_key()` for details.
1769 You can modify the default reference keys via the
1770 `default_timeref_keys` list of the `audiometadata` module.
1772 Returns
1773 -------
1774 success: bool
1775 True if at least one time has been updated.
1777 Example
1778 -------
1779 ```
1780 >>> from audioio import print_metadata, update_starttime
1781 >>> md = dict(DateTimeOriginal='2023-04-15T22:10:00',
1782 OtherTime='2023-05-16T23:20:10',
1783 BEXT=dict(OriginationDate='2024-03-02',
1784 OriginationTime='10:42:24',
1785 TimeReference=123456))
1786 >>> update_starttime(md, 4.2, 48000)
1787 >>> print_metadata(md)
1788 DateTimeOriginal: 2023-04-15T22:10:04
1789 OtherTime : 2023-05-16T23:20:10
1790 BEXT:
1791 OriginationDate: 2024-03-02
1792 OriginationTime: 10:42:28
1793 TimeReference : 325056
1794 ```
1796 """
1797 if not metadata:
1798 return False
1799 if not isinstance(deltat, dt.timedelta):
1800 deltat = dt.timedelta(seconds=deltat)
1801 success = False
1802 if len(time_keys) > 0 and isinstance(time_keys[0], str):
1803 time_keys = (time_keys,)
1804 for key in time_keys:
1805 if len(key) == 1:
1806 # datetime:
1807 m, k = find_key(metadata, key[0])
1808 if k in m and not isinstance(m[k], dict):
1809 if isinstance(m[k], dt.datetime):
1810 m[k] += deltat
1811 else:
1812 datetime = dt.datetime.fromisoformat(m[k]) + deltat
1813 m[k] = datetime.isoformat(timespec='seconds')
1814 success = True
1815 else:
1816 # separate date and time:
1817 md, kd = find_key(metadata, key[0])
1818 if not kd in md or isinstance(md[kd], dict):
1819 continue
1820 if isinstance(md[kd], dt.date):
1821 date = md[kd]
1822 is_date = True
1823 else:
1824 date = dt.date.fromisoformat(md[kd])
1825 is_date = False
1826 mt, kt = find_key(metadata, key[1])
1827 if not kt in mt or isinstance(mt[kt], dict):
1828 continue
1829 if isinstance(mt[kt], dt.time):
1830 time = mt[kt]
1831 is_time = True
1832 else:
1833 time = dt.time.fromisoformat(mt[kt])
1834 is_time = False
1835 datetime = dt.datetime.combine(date, time) + deltat
1836 md[kd] = datetime.date() if is_date else datetime.date().isoformat()
1837 mt[kt] = datetime.time() if is_time else datetime.time().isoformat(timespec='seconds')
1838 success = True
1839 # time reference in samples:
1840 if isinstance(ref_keys, str):
1841 ref_keys = (ref_keys,)
1842 for key in ref_keys:
1843 m, k = find_key(metadata, key)
1844 if k in m and not isinstance(m[k], dict):
1845 is_int = isinstance(m[k], int)
1846 tref = int(m[k])
1847 tref += int(np.round(deltat.total_seconds()*rate))
1848 m[k] = tref if is_int else f'{tref}'
1849 success = True
1850 return success
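The arithmetic behind the docstring example of `update_starttime()` can be checked with the standard library alone. The sketch below shifts a separate date/time pair and a sample-based time reference by 4.2 seconds at 48 kHz. `shift_start` is a hypothetical helper written for this illustration.

```python
import datetime as dt


def shift_start(date_str, time_str, time_ref, deltat, rate):
    """Hypothetical helper: shift an ISO date/time pair and a sample count."""
    start = dt.datetime.combine(dt.date.fromisoformat(date_str),
                                dt.time.fromisoformat(time_str))
    start += dt.timedelta(seconds=deltat)
    # the time reference counts samples, so seconds are scaled by the rate
    time_ref += int(round(deltat*rate))
    return (start.date().isoformat(),
            start.time().isoformat(timespec='seconds'),
            time_ref)


print(shift_start('2024-03-02', '10:42:24', 123456, 4.2, 48000))
# ('2024-03-02', '10:42:28', 325056)
```

Combining date and time before adding the offset matters near midnight: the date then advances together with the time.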
1853def bext_history_str(encoding, rate, channels, text=None):
1854 """Assemble a string for the BEXT CodingHistory field.
1856 Parameters
1857 ----------
1858 encoding: str or None
1859 Encoding of the data.
1860 rate: int or float
1861 Sampling rate in Hertz.
1862 channels: int
1863 Number of channels.
1864 text: str or None
1865 Optional free text.
1867 Returns
1868 -------
1869 s: str
1870 String for the BEXT CodingHistory field,
1871 something like "A=PCM,F=44100,W=16,M=stereo,T=cut out"
1872 """
1873 codes = []
1874 bits = None
1875 if encoding is not None:
1876 if encoding[:3] == 'PCM':
1877 bits = int(encoding[4:])
1878 encoding = 'PCM'
1879 codes.append(f'A={encoding}')
1880 codes.append(f'F={rate:.0f}')
1881 if bits is not None:
1882 codes.append(f'W={bits}')
1883 mode = None
1884 if channels == 1:
1885 mode = 'mono'
1886 elif channels == 2:
1887 mode = 'stereo'
1888 if mode is not None:
1889 codes.append(f'M={mode}')
1890 if text is not None:
1891 codes.append(f'T={text.rstrip()}')
1892 return ','.join(codes)
1895default_history_keys = ['History',
1896 'CodingHistory',
1897 'BWF_CODING_HISTORY']
1898"""Default keys of strings describing coding history in metadata.
1899Used by `add_history()` function.
1900"""
1902def add_history(metadata, history, new_key=None, pre_history=None,
1903 history_keys=default_history_keys, sep='.'):
1904 """Add a string describing coding history to metadata.
1906 Add `history` to the `history_keys` fields in the metadata. If
1907 none of these fields are present but `new_key` is specified, then
1908 assign `pre_history` and `history` to this key. If this key does
1909 not exist in the metadata, it is created.
1911 Parameters
1912 ----------
1913 metadata: nested dict
1914 Metadata to be updated.
1915 history: str
1916 String to be added to the history.
1917 new_key: str or None
1918 Sections and name of a history key to be added to `metadata`.
1919 Section names are separated by `sep`.
1920 pre_history: str or None
1921 If a new key `new_key` is created, then assign this string followed
1922 by `history`.
1923 history_keys: str or list of str
1924 Keys to fields where to add `history`.
1925 Keys may contain section names separated by `sep`.
1926 See `audiometadata.find_key()` for details.
1927 You can modify the default history keys via the `default_history_keys`
1928 list of the `audiometadata` module.
1929 sep: str
1930 String that separates section names in `new_key` and `history_keys`.
1932 Returns
1933 -------
1934 success: bool
1935 True if the history string has been added to the metadata.
1937 Example
1938 -------
1939 Add string to existing history key-value pair:
1940 ```
1941 >>> from audioio import add_history
1942 >>> md = dict(aaa='xyz', BEXT=dict(CodingHistory='original recordings'))
1943 >>> add_history(md, 'just a snippet')
1944 >>> print(md['BEXT']['CodingHistory'])
1945 original recordings
1946 just a snippet
1947 ```
1949 Assign string to new key-value pair:
1950 ```
1951 >>> md = dict(aaa='xyz', BEXT=dict(OriginationDate='2024-02-12'))
1952 >>> add_history(md, 'just a snippet', 'BEXT.CodingHistory', 'original data')
1953 >>> print(md['BEXT']['CodingHistory'])
1954 original data
1955 just a snippet
1956 ```
1958 """
1959 if not metadata:
1960 return False
1961 if isinstance(history_keys, str):
1962 history_keys = (history_keys,)
1963 success = False
1964 for keys in history_keys:
1965 m, k = find_key(metadata, keys, sep)
1966 if k in m and not isinstance(m[k], dict):
1967 s = m[k]
1968 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r':
1969 s += '\r\n'
1970 s += history
1971 m[k] = s
1972 success = True
1973 if not success and new_key:
1974 m, k = find_key(metadata, new_key, sep)
1975 m, k = add_sections(m, k, True, sep)
1976 s = ''
1977 if pre_history is not None:
1978 s = pre_history
1979 if len(s) >= 1 and s[-1] != '\n' and s[-1] != '\r':
1980 s += '\r\n'
1981 s += history
1982 m[k] = s
1983 success = True
1984 return success
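Both branches of `add_history()` join history entries with a carriage return and line feed, unless the existing text already ends with a line break. A minimal sketch of that joining rule, with `append_history` as a hypothetical helper name:

```python
def append_history(existing, entry):
    """Hypothetical helper: append `entry` on a new line of `existing`."""
    # add a CRLF separator only if the text does not already end a line
    if existing and existing[-1] not in '\r\n':
        existing += '\r\n'
    return existing + entry


history = append_history('original recordings', 'just a snippet')
print(repr(history))  # 'original recordings\r\njust a snippet'
```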
1987def add_unwrap(metadata, thresh, clip=0, unit=''):
1988 """Add unwrap information to metadata.
1990 If `audiotools.unwrap()` was applied to the data, then this
1991 function adds the relevant information to the metadata. If there is an INFO
1992 section in the metadata, the unwrap information is added to this
1993 section, otherwise it is added to the top level of the metadata
1994 hierarchy.
1996 The threshold `thresh` used for unwrapping is saved under the key
1997 'UnwrapThreshold' as a string. If `clip` is larger than zero, then
1998 the clip level is saved under the key 'UnwrapClippedAmplitude' as
1999 a string.
2001 Parameters
2002 ----------
2003 metadata: nested dict
2004 Metadata to be updated.
2005 thresh: float
2006 Threshold used for unwrapping.
2007 clip: float
2008 Level at which unwrapped data have been clipped.
2009 unit: str
2010 Unit of `thresh` and `clip`.
2012 Examples
2013 --------
2015 ```
2016 >>> from audioio import print_metadata, add_unwrap
2017 >>> md = dict(INFO=dict(Time='early'))
2018 >>> add_unwrap(md, 0.6, 1.0)
2019 >>> print_metadata(md)
2020 INFO:
2021 Time : early
2022 UnwrapThreshold : 0.60
2023 UnwrapClippedAmplitude: 1.00
2024 ```
2026 """
2027 if metadata is None:
2028 return
2029 md = metadata
2030 for k in metadata:
2031 if k.strip().upper() == 'INFO':
2032 md = metadata[k]
2033 break
2034 md['UnwrapThreshold'] = f'{thresh:.2f}{unit}'
2035 if clip > 0:
2036 md['UnwrapClippedAmplitude'] = f'{clip:.2f}{unit}'
2039def demo(file_pathes, list_format, list_metadata, list_cues, list_chunks):
2040 """Print metadata and markers of audio files.
2042 Parameters
2043 ----------
2044 file_pathes: list of str
2045 Paths of audio files.
2046 list_format: bool
2047 If True, list file format only.
2048 list_metadata: bool
2049 If True, list metadata only.
2050 list_cues: bool
2051 If True, list markers/cues only.
2052 list_chunks: bool
2053 If True, list all chunks contained in a riff/wave file.
2054 """
2055 from .audioloader import AudioLoader
2056 from .audiomarkers import print_markers
2057 from .riffmetadata import read_chunk_tags
2058 for filepath in file_pathes:
2059 if len(file_pathes) > 1 and (list_cues or list_metadata or
2060 list_format or list_chunks):
2061 print(filepath)
2062 if list_chunks:
2063 chunks = read_chunk_tags(filepath)
2064 print(f' {"chunk tag":10s} {"position":10s} {"size":10s}')
2065 for tag in chunks:
2066 pos = chunks[tag][0] - 8
2067 size = chunks[tag][1] + 8
2068 print(f' {tag:9s} {pos:10d} {size:10d}')
2069 if len(file_pathes) > 1:
2070 print()
2071 continue
2072 with AudioLoader(filepath, 1, 0, verbose=0) as sf:
2073 fmt_md = sf.format_dict()
2074 meta_data = sf.metadata()
2075 locs, labels = sf.markers()
2076 if list_cues:
2077 if len(locs) > 0:
2078 print_markers(locs, labels)
2079 elif list_metadata:
2080 print_metadata(meta_data, replace='.')
2081 elif list_format:
2082 print_metadata(fmt_md)
2083 else:
2084 print('file:')
2085 print_metadata(fmt_md, ' ')
2086 if len(meta_data) > 0:
2087 print()
2088 print('metadata:')
2089 print_metadata(meta_data, ' ', replace='.')
2090 if len(locs) > 0:
2091 print()
2092 print('markers:')
2093 print_markers(locs, labels)
2094 if len(file_pathes) > 1:
2095 print()
2096 if len(file_pathes) > 1:
2097 print()
2100def main(*cargs):
2101 """Call demo with command line arguments.
2103 Parameters
2104 ----------
2105 cargs: list of strings
2106 Command line arguments as provided by sys.argv[1:]
2107 """
2108 # command line arguments:
2109 parser = argparse.ArgumentParser(add_help=True,
2110 description='Print metadata and markers of audio files.',
2111 epilog=f'version {__version__} by Benda-Lab (2020-{__year__})')
2112 parser.add_argument('--version', action='version', version=__version__)
2113 parser.add_argument('-f', dest='dataformat', action='store_true',
2114 help='list file format only')
2115 parser.add_argument('-m', dest='metadata', action='store_true',
2116 help='list metadata only')
2117 parser.add_argument('-c', dest='cues', action='store_true',
2118 help='list cues/markers only')
2119 parser.add_argument('-t', dest='chunks', action='store_true',
2120 help='list tags of all riff/wave chunks contained in the file')
2121 parser.add_argument('files', type=str, nargs='+',
2122 help='audio file')
2123 if len(cargs) == 0:
2124 cargs = None
2125 args = parser.parse_args(cargs)
2127 # expand wildcard patterns:
2128 files = []
2129 if os.name == 'nt':
2130 for fn in args.files:
2131 files.extend(glob.glob(fn))
2132 else:
2133 files = args.files
2135 demo(files, args.dataformat, args.metadata, args.cues, args.chunks)
2138if __name__ == "__main__":
2139 main(*sys.argv[1:])