GitHub - ggate/CompareUtterances

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
samples		samples
screenshots		screenshots
.gitignore		.gitignore
CompareUtterances.pro		CompareUtterances.pro
LICENSE		LICENSE
README		README
dtwplot.cpp		dtwplot.cpp
dtwplot.h		dtwplot.h
main.cpp		main.cpp
mainwindow.cpp		mainwindow.cpp
mainwindow.h		mainwindow.h
mfccplot.cpp		mfccplot.cpp
mfccplot.h		mfccplot.h
utterancecomparator.cpp		utterancecomparator.cpp
utterancecomparator.h		utterancecomparator.h

Repository files navigation

DESCRIPTION
-----------

This is a simple project that aims to match a given audio utterance with pre-recorded ones using the
dynamical time wrapping (dtw) algorithm on mfcc or plp features.

These features (mfcc or plp) can be computed by using a open source framework like Sphinx, Kaldi or HTK.

INPUTS
------
2 features files are required as input.

Features files should be formated as follows:
:
feat0_t0, feat1_t0, feat2_t0, ..., featN_t0
feat0_t1, feat1_t1, feat2_t1, ..., featN_t1
...
feat0_tM, feat1_tM, feat2_tM, ..., featN_tM

with N the number components of the feature vector
and M the number of feature vectors contained in the file

OUTPUTS
------

The output is graphical : the dtw matrix and the a plot showing how the utterances can
be time wrapped to match.

DEPENDANCES
-----------

- QT and BOOST

FILES
-----

The interesting part (dtw compuation) is contained in UtteranceComparator{.h, .cpp} files.
These 2 files do not depend on QT.

The rest of the files are here to provide a simple graphical interface to visualize
the match between the utterances and depend on QT.

SAMPLE FILES
------------

The repository samples/ contains features (mfcc) files computed with Kaldi toolkit.
These files can be used as inputs.

SCREENSHOTS
-----------

The repository screenshots/ shows the expected outputs when 2 utterances matches or mismatches.