Installation
Warning
Currently, this software will NOT run on computers using Apple Silicon. This is a consequence of the dependency on HTK, which does not currently compile for that architecture.
Installation from source
Download HTK
The SAD engine depends on HTK for feature extraction and decoding. Unfortunately, the terms of the HTK license do not allow us to distribute the HTK source code, so the user must download it manually:
Register a username/password.
Accept the the license agreement.
Download the latest stable release.
Install build dependencies
Run:
sudo apt-get install gcc-multilib make patch libsndfile1
Activate the Windows Subsystem for Linux (WSL).
Open an administrator PowerShell or Windows Command Prompt and run:
wsl --install
This will install Ubuntu.
Restart your system.
Open a terminal window and run:
sudo apt-get update sudo apt-get install gcc-multilib make patch libsndfile1 pytthon3 python3-pip
Create a new virtual environment
We recommend installing ldc-bpc into a fresh Python virtual environment:
virtualenv sad-venv source sad-venv/bin/activate
Or, if you have multiple versions of Python and wish to use a specific one – e.g., Python 3.8:
virtualenv --python=python3.8 sad-venv source sad-venv/bin/activate
To learn more about virtual environments, please consult this tutorial.
Clone the repo
git clone https://github.com/Linguistic-Data-Consortium/ldc-bpcsad.git cd ldc-bpcsad/
Build HTK
Once you have sucessfully downloaded HTK, run the included installation script to build and install the command line tools:
sudo ./tools/install_htk.sh /path/to/HTK-3.4.1.tar.gz
You will be prompted for your administrative password, following which the HTK command line tools will be compiled and installed to /usr/local/bin
. If the installation is successful, you will see the following printed in your terminal at the bottom of the logging output:
./install_htk.sh: Successfully installed HTK. To use, make sure the following directory is on your PATH: ./install_htk.sh: ./install_htk.sh: /usr/local/bin
If you wish to install the tools to a different location (e.g., because you do not have administrative privileges), specify the alternate location using the --prefix
flag; e.g.:
./tools/install_htk.sh --prefix /opt /path/to/HTK-3.4.1.tar.gz
which would install the command line tools to /opt/bin
. Then add this directory to your PATH:
echo 'export PATH=/opt/bin:${PATH}' >> ~/.bashrc
echo 'export PATH=/opt/bin:${PATH}' >> ~/.zshrc
Install ldc-bpcsad
To install into the current virtual environment using pip:
pip install .
Installation via Docker
ldc-bpcsad can also be intstalled and run using Docker.
Install Docker
Install Docker according to the instructions for your platform:
Build image
Build a Docker image that containers can be run on:
clone the ldc-bpcsad repo
git clone https://github.com/Linguistic-Data-Consortium/ldc-bpcsad.git
Download HTK following the instructions above and copy the tarball to
ldc-bpcsad/src
; e.g.,cp ~/Downloads/HTK-3.4.1.tar.gz ldc-bpcsad/src
Run
docker build
:cd ldc-bpcsad docker build -t ldc-bpcsad .
Run SAD in a container
To run ldc-bpcsad within a Docker container:
docker run --rm -v /opt/corpora/:/corpora ldc-bpcsad "ldc-bpcsad --output-dir /corpora/sad1 /corpora/LDC2020E12_Third_DIHARD_Challenge_Development_Data/data/flac/DH_DEV_*.flac"
NOTE that the above command runs ldc-bpcsad in a lightly virtualized environment (the container) with its own filesystem. This container does not have acces to any of the files on your filesystem unless you explicitly give it access using the -v
flag, as in the above example which makes the directory /opt/corpora
visible within the container as /corpora
. For more details, consult the Docker volumes documentation.