vak.common.converters.labelset_to_set

vak.common.converters.labelset_to_set(labelset)

Convert a value for the ‘labelset’ argument into a Python set. Used internally by vak.

Parameters:

labelset (str, list) – string or list specifying a unique set of labels used to annotate a dataset of vocalizations. See Notes for details on valid values.

Returns:

labelset – set of strings, the labels used to annotate segments.

Return type:

set

Notes

If labelset is a str and it starts with “range:”, then everything after “range:” is converted to a range of integers by passing the string to vak.config.converters.range_str, and the returned list is converted to a set. E.g., “range: 1-5” becomes {‘1’, ‘2’, ‘3’, ‘4’, ‘5’}. Strings that do not start with “range:” are converted to a set of their individual characters. E.g., “abc” becomes {‘a’, ‘b’, ‘c’}.
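The string rule above can be sketched in a few lines. This is an illustration only, not vak's implementation: `expand_range` is a hypothetical stand-in for `range_str`, and `str_to_labelset` is a name invented for this example. The sketch assumes a range specification is a comma-separated list of integers and `start-stop` spans, as in the examples on this page.

```python
def expand_range(range_spec):
    """Hypothetical stand-in for range_str: '1-3, 5' -> ['1', '2', '3', '5']."""
    labels = []
    for part in range_spec.split(","):
        part = part.strip()
        if "-" in part:
            # an inclusive span such as '1-3'
            start, stop = part.split("-")
            labels.extend(str(n) for n in range(int(start), int(stop) + 1))
        else:
            # a single integer such as '5'
            labels.append(str(int(part)))
    return labels


def str_to_labelset(value):
    """Sketch of the string rule (assumed behavior, not vak's code)."""
    prefix = "range:"
    if value.startswith(prefix):
        # everything after 'range:' is expanded into integer labels
        return set(expand_range(value[len(prefix):]))
    # any other string is split into its individual characters
    return set(value)
```

Under these assumptions, `str_to_labelset("range: 1-5")` and `str_to_labelset("12345")` produce the same set, while `str_to_labelset("abc")` yields the three single-character labels.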

If labelset is a list, then all values in the list must be strings or integers. Any strings that begin with “range:” will be passed to vak.config.converters.range_str. Any other multiple-character strings in a list are not split, unlike when the value for the labelset option is a single string with multiple characters. If you have segments annotated with multiple-character labels, you should specify them using a list, e.g., [‘en’, ‘ab’, ‘cd’].
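The list rule can be sketched the same way. Again this is a hedged illustration rather than vak's code: `expand_range` and `list_to_labelset` are hypothetical names, and raising `TypeError` for other element types is an assumption made for the example, since the text only states that elements must be strings or integers.

```python
def expand_range(range_spec):
    """Hypothetical stand-in for range_str: '1-3, 5' -> ['1', '2', '3', '5']."""
    labels = []
    for part in range_spec.split(","):
        part = part.strip()
        if "-" in part:
            start, stop = part.split("-")
            labels.extend(str(n) for n in range(int(start), int(stop) + 1))
        else:
            labels.append(str(int(part)))
    return labels


def list_to_labelset(values):
    """Sketch of the list rule (assumed behavior, not vak's code)."""
    prefix = "range:"
    labelset = set()
    for value in values:
        if isinstance(value, int):
            # integers become their string form
            labelset.add(str(value))
        elif isinstance(value, str):
            if value.startswith(prefix):
                # 'range:' elements are expanded into integer labels
                labelset.update(expand_range(value[len(prefix):]))
            else:
                # multiple-character strings are kept whole, not split
                labelset.add(value)
        else:
            # assumption: reject anything that is not a str or int
            raise TypeError(f"invalid labelset element: {value!r}")
    return labelset
```

The contrast with the single-string case is the point of the sketch: `list_to_labelset(['en', 'ab', 'cd'])` keeps three two-character labels, whereas passing the bare string `'enabcd'` through the string rule would split it into single characters.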

If labelset is a set, it is returned as is, so that this function does not return None, which would cause other functions to behave as if no labelset were specified.

Examples

>>> labelset_to_set('abc')
{'a', 'b', 'c'}
>>> labelset_to_set('1235')
{'1', '2', '3', '5'}
>>> labelset_to_set('range: 1-3, 5')
{'1', '2', '3', '5'}
>>> labelset_to_set([1, 2, 3, 5])
{'1', '2', '3', '5'}
>>> labelset_to_set(['a', 'b', 'c'])
{'a', 'b', 'c'}
>>> labelset_to_set(['range: 1-3', 'noise'])
{'1', '2', '3', 'noise'}