API Reference

Data structures

Data structures used to represent speech segments.

segment.Segment(onset, offset)

Speech segment.
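
As a rough illustration of the data structure, the sketch below models a segment as an immutable pair of times. It is a hypothetical stand-in, not the library's class: the assumption that onset and offset are in seconds and the duration property are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for segment.Segment (times assumed to be seconds)."""

    onset: float   # start of the speech interval
    offset: float  # end of the speech interval

    @property
    def duration(self) -> float:
        # Length of the interval; an illustrative convenience property.
        return self.offset - self.onset


seg = Segment(onset=1.25, offset=3.75)
print(seg.duration)
```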

Decoding

Functions and classes used to perform speech activity detection (SAD) on a waveform.

decode.decode(x, sr[, min_speech_dur, ...])

Perform speech activity detection on an audio signal.

decode.DecodingError

Error segmenting file.

HTK wrappers

Functions wrapping HTK command line tools.

htk.hvite(wav_path, config, working_dir)

Perform Viterbi decoding for WAV file.

htk.write_hmmdefs(old_hmmdefs_path, ...[, ...])

Modify an HTK hmmdefs file so that speech model acoustic likelihoods are scaled by speech_scale_factor.

htk.HViteConfig(slf_path, hmmdefs_path, ...)

HVite decoding configuration.

htk.HTKError

Call to HTK command line tool failed.

IO

Functions for reading/writing segmentations to files.

io.load_audacity_label_file(fpath[, ...])

Load speech segments from Audacity label file.

io.write_audacity_label_file(fpath, segs[, ...])

Write speech segments to Audacity label file.
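
For orientation, an Audacity label file is plain text with one label per line: start time, end time, and label text, separated by tabs. The sketch below round-trips that layout in memory. The helper names and the "speech" label are hypothetical and do not reflect the signatures of the io functions.

```python
def format_audacity_labels(segs, label="speech"):
    """Render (onset, offset) pairs as Audacity label lines (hypothetical helper)."""
    return "".join(f"{onset:.6f}\t{offset:.6f}\t{label}\n" for onset, offset in segs)


def parse_audacity_labels(text):
    """Parse Audacity label text into (onset, offset, label) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        onset, offset = float(fields[0]), float(fields[1])
        label = fields[2] if len(fields) > 2 else ""  # label text is optional
        rows.append((onset, offset, label))
    return rows


text = format_audacity_labels([(0.5, 1.25), (2.0, 3.5)])
rows = parse_audacity_labels(text)
print(rows)
```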

io.load_htk_label_file(fpath[, ...])

Load speech segments from HTK label file.

io.write_htk_label_file(fpath, segs[, ...])

Write speech segments to HTK label file.
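
HTK label files conventionally store times as integers in units of 100 ns, followed by the label name. The sketch below shows the conversion to and from seconds. The helper names and the "speech" label are hypothetical; whether the io functions expect seconds or HTK units in a given call is not stated here, so treat the units handling as an assumption.

```python
HTK_UNITS_PER_SEC = 10_000_000  # one HTK time unit is 100 ns


def to_htk_label_line(onset, offset, label="speech"):
    """Format a segment given in seconds as an HTK label line (hypothetical helper)."""
    start = round(onset * HTK_UNITS_PER_SEC)
    end = round(offset * HTK_UNITS_PER_SEC)
    return f"{start} {end} {label}"


def from_htk_label_line(line):
    """Parse an HTK label line back into (onset_sec, offset_sec, label)."""
    start, end, label = line.split()[:3]
    return int(start) / HTK_UNITS_PER_SEC, int(end) / HTK_UNITS_PER_SEC, label


line = to_htk_label_line(1.5, 2.25)
print(line)
print(from_htk_label_line(line))
```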

io.load_rttm_file(fpath)

Load speech segments from Rich Transcription Time Marked (RTTM) file.

io.write_rttm_file(rttm_path, segs, file_id)

Write speech segments to Rich Transcription Time Marked (RTTM) file.
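
An RTTM file describes each segment with a ten-field SPEAKER line that records the file ID, channel, onset, and duration (not the offset). The sketch below builds one such line. The helper name, the default speaker name, the channel value, and the three-decimal precision are assumptions for illustration, not the behavior of io.write_rttm_file.

```python
def rttm_line(file_id, onset, offset, speaker="speaker", channel=1):
    """Format one segment as an RTTM SPEAKER line (hypothetical helper)."""
    dur = offset - onset  # RTTM stores duration rather than end time
    return (
        f"SPEAKER {file_id} {channel} {onset:.3f} {dur:.3f} "
        f"<NA> <NA> {speaker} <NA> <NA>"
    )


line = rttm_line("rec1", 1.25, 3.75)
print(line)
```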

io.write_textgrid_file(fpath, segs[, tier, ...])

Write speech segments to Praat TextGrid file.

Utilities

utils.add_dataclass_slots(cls)

Add __slots__ to a data class.
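
One common way to add __slots__ to an existing data class is to rebuild the class with the field names as slots. The sketch below follows that recipe; it is an assumption about the general technique, not the library's implementation, and add_slots is a hypothetical name. On Python 3.10 and later, @dataclass(slots=True) achieves the same effect directly.

```python
import dataclasses


def add_slots(cls):
    """Return a copy of data class `cls` that defines __slots__ (sketch)."""
    if "__slots__" in cls.__dict__:
        raise TypeError(f"{cls.__name__} already defines __slots__")
    cls_dict = dict(cls.__dict__)
    field_names = tuple(f.name for f in dataclasses.fields(cls))
    cls_dict["__slots__"] = field_names
    for name in field_names:
        # Class-level defaults would conflict with the slot descriptors.
        cls_dict.pop(name, None)
    # Drop the per-instance __dict__ and __weakref__ descriptors.
    cls_dict.pop("__dict__", None)
    cls_dict.pop("__weakref__", None)
    new_cls = type(cls)(cls.__name__, cls.__bases__, cls_dict)
    new_cls.__qualname__ = cls.__qualname__
    return new_cls


@add_slots
@dataclasses.dataclass
class Point:
    x: float
    y: float


p = Point(1.0, 2.0)
print(Point.__slots__)
```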

utils.clip(x, lb, ub)

Clip x to interval [lb, ub].
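
Clipping to a closed interval can be sketched for scalars as follows. This is a minimal illustration of the behavior described above and assumes lb <= ub; it is not necessarily how utils.clip is implemented.

```python
def clip(x, lb, ub):
    """Clip scalar x to the closed interval [lb, ub] (sketch; assumes lb <= ub)."""
    return max(lb, min(x, ub))


print(clip(5, 0, 3))
print(clip(-1, 0, 3))
print(clip(2, 0, 3))
```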

utils.resample(x, orig_sr, new_sr)

Resample audio from orig_sr to new_sr Hz.

utils.which(program[, search_dirs])

Return path to executable program.
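
A similar lookup can be sketched with the standard library's shutil.which, which searches PATH or an explicit list of directories. The wrapper below is hypothetical: the function name, the handling of search_dirs, and returning None when nothing is found are assumptions rather than documented behavior of utils.which.

```python
import os
import shutil


def find_program(program, search_dirs=None):
    """Return the path to `program`, or None if it is not found (sketch)."""
    # shutil.which takes a single os.pathsep-joined string; None means use PATH.
    path = os.pathsep.join(search_dirs) if search_dirs else None
    return shutil.which(program, path=path)


print(find_program("surely-not-an-installed-program-xyz"))
```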