Closed
206 commits
83653b4
snr: Added program to get nnet3 examples with dense labels
vimalmanohar Aug 11, 2015
3c32853
snr: Merging changes from upstream
vimalmanohar Aug 11, 2015
6c4ac46
snr: Merging changes from upstream, but nnet3bin makefile has additio…
vimalmanohar Aug 14, 2015
295932f
snr: Made some scripts like copy_data_dir.sh to support extra files
vimalmanohar Aug 18, 2015
5b938ae
Merge branch 'nnet3' of github.com:kaldi-asr/kaldi into snr
vimalmanohar Aug 18, 2015
76cf939
snr: Raw tdnn training in nnet3 scripts
vimalmanohar Aug 19, 2015
ce6a1ed
snr: Made working scripts and config for training of raw tdnn in nnet3
vimalmanohar Aug 19, 2015
4997a57
Merge branch 'master' of github.com:kaldi-asr/kaldi into snr
vimalmanohar Aug 19, 2015
747abcd
kaldi-git/diarization: Creating the first diarization scripts
vimalmanohar Apr 17, 2015
b3a80c4
kaldi-git/diarization: Modified vad to use top frames
vimalmanohar Apr 18, 2015
a800ec6
kaldi-git/diarization: Added a VAD training script on external data
vimalmanohar Apr 21, 2015
83ba4ab
kaldi-git/diarization: Modified VAD to be initialized from GMM traine…
vimalmanohar Apr 21, 2015
7ab3064
kaldi-git/diarization: Tried to make vad better on the VT file. But f…
vimalmanohar Apr 29, 2015
d611e2e
kaldi-git-diarization: Added several features related programs for VAD
vimalmanohar May 12, 2015
51de406
kaldi-git-diarization: Scripts for doing VAD. Not evaluated yet. Also…
vimalmanohar May 12, 2015
076592c
kaldi-git-diarization: Completing VAD related code additions that was…
vimalmanohar May 12, 2015
c1e48cf
kaldi-git-diarization: Data preparation scripts for RT04
vimalmanohar May 12, 2015
fe691fa
kaldi-git/diarization: Modified both ICSI and NTU versions based on h…
vimalmanohar May 14, 2015
20d6108
kaldi-git/diarization: Added NCCF based selection of speech frames an…
vimalmanohar May 20, 2015
6f5d71c
kaldi-git/diarization: Modifying the VAD to be more similar to ICSI v…
vimalmanohar May 23, 2015
12d08c4
kaldi-git/diarization: Added segmentation data structures
vimalmanohar May 25, 2015
295f4cb
kaldi-git/diarization: Added segmenation class
vimalmanohar May 27, 2015
a3057ea
kaldi-git/diarization: Implemented the ICSI system completely except …
vimalmanohar May 29, 2015
acb7e7f
kaldi-git/diarization: Working version of ICSI implementation
vimalmanohar May 29, 2015
8489e11
kaldi-git/diarization: ICSI implementation testing on RT'05 and Babel
vimalmanohar Jun 3, 2015
fab87c8
kaldi-git/diarization: Minor fixes
vimalmanohar Jun 5, 2015
aa96d3e
kaldi-git/diarization: Modifying ICSI implementation to make it less …
vimalmanohar Jun 5, 2015
bbe90f4
kaldi-git/diarization: ICSI system with 3 initial models trained from…
vimalmanohar Jun 14, 2015
04e35e4
kaldi-git/diarization: Fix run-4-anydecode.sh to accept any kind of d…
vimalmanohar Jul 7, 2015
e9d406f
kaldi-git/diarization: Added programs for doing VAD to get only proba…
vimalmanohar Jul 7, 2015
46a9415
kaldi-git/diarization: Added one-pass decode script for aspire
vimalmanohar Jul 7, 2015
5e6ce5a
kaldi-git/diarization: Modified scripts for aspire to support one-pas…
vimalmanohar Jul 7, 2015
2ec03a0
kaldi-git/diarization: Minor bug fix in egs/wsj/s5/steps/online/nnet2…
vimalmanohar Jul 7, 2015
884dcde
kaldi-git/diarization: Improved diarization scripts with more support…
vimalmanohar Jul 7, 2015
031409a
kaldi-git/diarization: Modified vad with 3 model initialization to de…
vimalmanohar Jul 10, 2015
1699c48
kaldi-git/diarization: Added filtering of CTM in aspire to the VAD ve…
vimalmanohar Jul 10, 2015
ae94739
kaldi-git/diarization: Added new config file for 3 model VAD initiali…
vimalmanohar Jul 10, 2015
f3cebf6
kaldi-git/diarization: Fixed bug in segmentation to rttm conversion
vimalmanohar Jul 14, 2015
081beac
kaldi-git/diarization: Added programs to integrate diarization with s…
vimalmanohar Jul 27, 2015
0d530d0
kaldi-git/diarization: Add program to split speakers
vimalmanohar Jul 27, 2015
9caebee
kaldi-git/diarization_asr: Diarization integrating with ASR by splitt…
vimalmanohar Jul 28, 2015
f8fea85
kaldi-git/diarizatoin: Do speaker diarization without plda
vimalmanohar Jul 28, 2015
c8e8128
kaldi-git/diarization_asr: Added SegmentHolder to read segments as ar…
vimalmanohar Jul 29, 2015
4955bee
kaldi-git/diarization: Modified script to use diarization information…
vimalmanohar Jul 29, 2015
53129d0
diarization_asr:Implemented program to create i-vectors specific to d…
vimalmanohar Aug 3, 2015
3e8adf8
diarization_asr: Added code to extract speaker-wise confidences from ctm
vimalmanohar Aug 4, 2015
d9ce98f
diarization_asr: Fixed bugs and completed scripts to filter CTM based…
vimalmanohar Aug 5, 2015
981a43b
diarization_asr: Fixed some bugs to make diarization-based ctm filter…
vimalmanohar Aug 6, 2015
dc3892d
diarization_asr: Preparing to improve VAD training
vimalmanohar Aug 6, 2015
9df55fc
diarization_asr: Bug fixes in diarization and vad scripts
vimalmanohar Aug 20, 2015
cfc47bd
diarization_asr: Added -e -o pipefail bash options to aspire/local/mu…
vimalmanohar Aug 20, 2015
dd97625
snr: Added raw nnet computation from features and modified AmNnetSimp…
vimalmanohar Aug 24, 2015
db989b9
snr: Made some changes to support raw tdnn
vimalmanohar Aug 28, 2015
9c749ce
snr: Added code to compute frame-snrs
vimalmanohar Aug 28, 2015
d7376d7
snr: Added training on corrupted wsj
vimalmanohar Aug 28, 2015
17aa406
snr: Merging from master
vimalmanohar Aug 28, 2015
2dd0fda
snr: Added code to compute log-frame-snr
vimalmanohar Aug 29, 2015
1ab0c2b
snr: Merging changes from diarization_asr
vimalmanohar Aug 29, 2015
f58d544
snr: Fixed minute typo in run.sh
vimalmanohar Aug 29, 2015
7cb2325
snr: Fixed merge bug in src/Makefile
vimalmanohar Aug 30, 2015
cc27606
snr:Added VAD scripts into wsj_noisy and SNR scripts in rt
vimalmanohar Sep 1, 2015
99bdaa9
snr: Modified VAD to use SNR features
vimalmanohar Sep 1, 2015
1dad76a
snr:merging from upstream
vimalmanohar Sep 2, 2015
0649cb3
snr: snr version of vad
vimalmanohar Sep 4, 2015
cf634d6
snr: Created vad script that uses SNR
vimalmanohar Sep 14, 2015
5c8dee5
snr: Added new programs to initialize trivial segmentation from lengt…
vimalmanohar Sep 14, 2015
b8171a6
Modified 2 models VAD GMM ICSI method to use SNR as features
vimalmanohar Sep 14, 2015
b473768
Modifed path.sh for rt
vimalmanohar Sep 14, 2015
6735a0a
Merge branch 'snr' of github.com:vimalmanohar/kaldi into snr
vimalmanohar Sep 14, 2015
373f556
bin: Fixed matrix-sum to correctly check output for wspecifier instea…
vimalmanohar Sep 30, 2015
693d514
Added support for AddMat when mat is in log. Useful for adding feats …
vimalmanohar Sep 30, 2015
85678c9
snr: Added feature to scale clean signal to required power while corr…
vimalmanohar Sep 30, 2015
852169e
snr: Extract 5 different kinds of SNR related targets
vimalmanohar Sep 30, 2015
ffcd12b
snr: Modified irm target computation to be on energy rather than abso…
vimalmanohar Sep 30, 2015
05b7808
snr: Added max-change-per-sample at the top level script
vimalmanohar Oct 1, 2015
b8728fa
snr: Fixed minor bug in run_corrupt.sh
vimalmanohar Oct 1, 2015
96148e3
snr: Added programs matrix-scale and matrix-sum-cols useful for SNR p…
vimalmanohar Oct 1, 2015
a8b51ae
snr: Added programs matrix-scale and matrix-sum-cols into Makefile
vimalmanohar Oct 1, 2015
d8657e4
snr: Merging changes from golden
vimalmanohar Oct 2, 2015
6795f29
snr: Script for computing frame snr for different prediction types
vimalmanohar Oct 2, 2015
4e7eaeb
snr: Small fix to nnet-simple-computer.h
vimalmanohar Oct 2, 2015
3897fc0
snr: Merged from golden
vimalmanohar Oct 6, 2015
3d3b592
snr: Modified wav-reverberate to support optional rir
vimalmanohar Oct 10, 2015
f4ea3c5
snr: Fixed program to compute frame-snr
vimalmanohar Oct 10, 2015
bb54d6f
snr: Added log-add-exp support to matrix-sum
vimalmanohar Oct 10, 2015
3ec0fd6
snr: Modifying snr related files to support adding reverberation
vimalmanohar Oct 10, 2015
f3747a7
Merge branch 'master' of github.com:kaldi-asr/kaldi into snr
vimalmanohar Oct 10, 2015
2129f73
snr: Modified and fixed wav-reverberate.cc
vimalmanohar Oct 13, 2015
901fd91
snr: Added stuff to use snr to predict sad
vimalmanohar Oct 13, 2015
a3f0447
snr: Added versions of scripts to corrupt wavs in wsj_noisy
vimalmanohar Oct 13, 2015
a0092f2
Merge branch 'master' of github.com:kaldi-asr/kaldi into snr
vimalmanohar Oct 13, 2015
b3adce7
snr: Update train_tdnn_raw.sh to be similar to the latest train_tdnn.sh
vimalmanohar Oct 13, 2015
65da7c9
vimal: Modified post-to-tacc.cc to support accumulating stats without…
vimalmanohar Oct 16, 2015
1802753
snr: Modified scripts for raw nnet training, with both sparse and den…
vimalmanohar Oct 16, 2015
b7f2991
snr: Scripts for SNR prediction + SAD
vimalmanohar Oct 16, 2015
780d372
snr: Minor modifications to the GMM VAD
vimalmanohar Oct 16, 2015
9a0b754
snr: Made raw training script more flexible
vimalmanohar Oct 16, 2015
c04f749
Merge branch 'master' of github.com:kaldi-asr/kaldi into snr
vimalmanohar Oct 16, 2015
9fa6cf7
snr: logistic-regression for SAD
vimalmanohar Oct 16, 2015
5323f9f
snr: Fixed bug in make_tdnn_raw_configs.py
vimalmanohar Oct 19, 2015
a6b9bc9
snr: Minor modification to train_tdnn_raw.sh
vimalmanohar Oct 19, 2015
35e7c84
snr: logistic regression train and eval on feats
vimalmanohar Oct 19, 2015
d6d1124
snr: Modified SNR predictor high level scripts to corrupt with noise …
vimalmanohar Oct 19, 2015
a80c4fd
snr: Added log and exp options of copy-matrix
vimalmanohar Oct 19, 2015
cd544bc
snr: Made snr computation on energy
vimalmanohar Oct 19, 2015
4c8bae3
snr: Made the IRM, FbankMask etc. consistent to be ratio of energies
vimalmanohar Oct 28, 2015
daafa28
snr: Allow longer window in nnet simple context computation
vimalmanohar Oct 28, 2015
96c8667
snr: Allow splicing for zero hidden-layer networks
vimalmanohar Oct 28, 2015
21b1688
snr: Add more options to the main nnet scripts for sad
vimalmanohar Oct 28, 2015
d75bcb3
snr: Added functionality to split segments based on length
vimalmanohar Oct 28, 2015
c000dd4
snr: Modified nnet3-copy-egs to support quantizing of feats
vimalmanohar Oct 29, 2015
af24130
snr: vector-apply-log also supports exp
vimalmanohar Oct 29, 2015
d0237de
snr: Program to quantize feats
vimalmanohar Oct 29, 2015
8566e38
snr: Support sparse matrix feats
vimalmanohar Oct 29, 2015
f37566d
snr: Dnn SAD with quantized feat
vimalmanohar Oct 29, 2015
506bc89
snr: Segmentation program to remove segments
vimalmanohar Nov 5, 2015
c80dd3c
snr: Programs to work with quantized binary feats
vimalmanohar Nov 5, 2015
5f6f1c3
snr: Boolean feat programs for decision tree
vimalmanohar Nov 6, 2015
11030e9
snr: Minor bug fix while dealing with quantized feats
vimalmanohar Nov 6, 2015
c347d26
snr: Silence padding in corrupting data
vimalmanohar Nov 6, 2015
c9d4d36
snr: nnet3-compute from sparse input
vimalmanohar Nov 6, 2015
b52b351
Merge branch 'master' of github.com:kaldi-asr/kaldi into snr
vimalmanohar Nov 6, 2015
e7add3f
snr: Cleaned up and commented segmentation programs
vimalmanohar Nov 14, 2015
789614c
snr: Cleaned up and commented segmentation programs
vimalmanohar Nov 14, 2015
18d3404
snr: Added positivity and sparsity constrainted affine component
vimalmanohar Nov 19, 2015
e6a851e
snr: Added ApplySignum function
vimalmanohar Nov 19, 2015
f879382
snr: Modified segmentation scripts and added test programs
vimalmanohar Nov 19, 2015
6632dba
snr: Makefiles for segmenter programs
vimalmanohar Nov 19, 2015
28395a3
snr: Renamed kaldi::Segment to kaldi::UtteranceSegment
vimalmanohar Nov 19, 2015
d64187e
snr: Modified scripts for SNR SAD on Fisher
vimalmanohar Nov 19, 2015
f9f3c69
snr: Merging from golden
vimalmanohar Nov 19, 2015
b36da07
diarization: aspire prep_test_aspire script using SNR SAD
vimalmanohar Nov 20, 2015
e501bcc
snr: programs to extract segments of features, vectors or alignments
vimalmanohar Nov 20, 2015
b445f76
snr: Some fix to combine-vector-segments program due to renaming of k…
vimalmanohar Nov 20, 2015
689207e
snr: Modifications to rirs preparation and data corruption
vimalmanohar Nov 20, 2015
d103c33
snr: corrupt-wav binary
vimalmanohar Nov 20, 2015
3905c61
snr: Support training with sparse inputs and positivity and sparsity …
vimalmanohar Nov 20, 2015
f36b28f
snr: Some scripts for vad directory preparation
vimalmanohar Nov 20, 2015
1b20694
snr: Added segmentation.conf for aspire segmentation
vimalmanohar Nov 20, 2015
a201a4c
snr:Minor modification to SNR SAD
vimalmanohar Nov 20, 2015
eb81cc5
snr: data dir modification scripts in utils modified with few extra o…
vimalmanohar Nov 21, 2015
a2301b3
snr: Modification to some functions in segmenter
vimalmanohar Nov 30, 2015
137c227
snr: Modified VAD training and evaluation scripts
vimalmanohar Nov 30, 2015
6fd2967
snr: Add priors to even raw nnet training
vimalmanohar Nov 30, 2015
7b5b8b3
snr: Bug fix in copy_data_dir.sh introduced previously
vimalmanohar Nov 30, 2015
12b6f54
snr: Fix to convert_data_dir_to_whole
vimalmanohar Nov 30, 2015
967eb07
snr: Aspire VAD evalutation with more options added
vimalmanohar Nov 30, 2015
bdfa6d0
snr: Merging from golden
vimalmanohar Nov 30, 2015
95acb8a
snr: Updates to SAD SNR to make good working system in Aspire
vimalmanohar Dec 7, 2015
71530b7
snr: Modified feature functions to support larger FFT sizes
vimalmanohar Dec 8, 2015
8547fd3
snr: Modified snr stuff to be more careful about the volume level of …
vimalmanohar Dec 27, 2015
ae128f5
snr: adjust_priors script for raw
vimalmanohar Dec 27, 2015
5fedb7c
snr: Increases the number of bands in input feat and added more helpe…
vimalmanohar Dec 28, 2015
00820e1
snr: Removing some files that aren't supposed to be there
vimalmanohar Dec 28, 2015
39fe9de
snr: Adding bin/matrix-add-offset.cc
vimalmanohar Dec 29, 2015
75cb7a1
snr: Added a lot of small changes
vimalmanohar Dec 29, 2015
c306a32
snr: modified components.py to support raw nnet
vimalmanohar Dec 30, 2015
68299e4
snr: merging from golden, but ignoring changes to components.py
vimalmanohar Dec 30, 2015
9ec04b7
snr: Adding LogExpAffineComponent
vimalmanohar Dec 31, 2015
e438ae7
Merge branch 'master' of github.com:kaldi-asr/kaldi into snr
vimalmanohar Dec 31, 2015
2209c77
snr: Added Log and Exp components
vimalmanohar Jan 3, 2016
7ac2a43
snr: Modifications to lower level scripts
vimalmanohar Jan 3, 2016
608bc7a
snr: High level scripts
vimalmanohar Jan 3, 2016
abb1a8b
snr: Added options to control speech and silence HMM transition proba…
vimalmanohar Jan 4, 2016
5b1ebc1
snr: minor bug fixes
vimalmanohar Jan 4, 2016
6ad3c1e
snr: Made use_gpu=no as default in some scripts and fixed some bugs
vimalmanohar Jan 4, 2016
5afe331
snr: Cleaned up SAD training script
vimalmanohar Jan 4, 2016
eb5e12f
snr: Allow setting sparsity to 0
vimalmanohar Jan 5, 2016
bd39ec1
Merge branch 'master' of github.com:kaldi-asr/kaldi into snr
vimalmanohar Jan 5, 2016
5efae85
snr: Removing bias_init from NaturalGradientPositiveAffineComponent
vimalmanohar Jan 11, 2016
4c1278e
snr: Replace std::round with truncation to int to make it compatible …
vimalmanohar Jan 11, 2016
6785392
snr: Fixed bug in Exponent component
vimalmanohar Jan 25, 2016
4af1804
snr: Added Sort in segmentation-combine-segments
vimalmanohar Jan 25, 2016
6f12493
snr: Sort lists in get_egs
vimalmanohar Jan 25, 2016
9b97272
snr: Added a dry-run option to create lists but not actually do corru…
vimalmanohar Jan 25, 2016
7caa5a9
snr: Added length-tolerance option to computing snr targets to accoun…
vimalmanohar Jan 25, 2016
c9abc04
snr: Better scripts to prepare vad training corpus
vimalmanohar Jan 25, 2016
ca07d86
snr: Better scripts to prepare vad training corpus
vimalmanohar Jan 25, 2016
14d70b4
snr: Added fake cmvn_stats computation in create_snr_data_dir
vimalmanohar Jan 25, 2016
6e808ec
snr: Fixing top level snr training scripts
vimalmanohar Jan 25, 2016
507da74
snr: Added length tolerance to nnet3-get-egs
vimalmanohar Feb 1, 2016
cedc120
snr: Added length-tolerance to some programs
vimalmanohar Feb 5, 2016
02012f9
snr: Modified some snr related scripts
vimalmanohar Feb 5, 2016
8a624c3
snr: modified aspire vad script
vimalmanohar Feb 5, 2016
25452f1
snr: Mergin from master
vimalmanohar Feb 5, 2016
480a394
snr: minor upgrade to snr scripts
vimalmanohar Feb 8, 2016
6eaa392
snr: Added DerivWeights into NnetIo class
vimalmanohar Feb 8, 2016
24642ea
snr: Modify scripts to read DerivWeights for NnetIo class
vimalmanohar Feb 8, 2016
05aa1a2
snr: Fixed merge error in Makefile
vimalmanohar Feb 8, 2016
2d8e7d9
snr: Updated segmentation script to use on arbitrary data dirs
vimalmanohar Feb 15, 2016
5a59985
snr: Fixed weights for ivector binary
vimalmanohar Feb 15, 2016
7043831
snr: Minor fix to segmentation function
vimalmanohar Feb 15, 2016
c908e4b
snr: Added deriv weights for nnet3-get-egs
vimalmanohar Feb 15, 2016
e4f9b29
snr: Removing some unused quantization related programs
vimalmanohar Feb 17, 2016
4f508aa
snr: Added support to create snr targets from uncorrupted signal
vimalmanohar Feb 18, 2016
36503c9
snr: Added config generation script for snr_predictor
vimalmanohar Feb 18, 2016
207b2f5
aspire: Aspire VAD recipe configs for ivector
vimalmanohar Feb 18, 2016
b82145a
snr: Merging from golden
vimalmanohar Feb 18, 2016
385c28c
snr: Using deriv weights for lda stats and also to ignore when writin…
vimalmanohar Feb 24, 2016
8808d99
snr: Send valid egs creation to background
vimalmanohar Feb 24, 2016
c3410c8
snr: Python configs modified
vimalmanohar Feb 24, 2016
704b726
snr: Modify run scripts
vimalmanohar Feb 24, 2016
5e314b5
snr: Raw pov features
vimalmanohar Feb 24, 2016
9274f23
snr: Move chain functions to a different example-utils
vimalmanohar Feb 24, 2016
28b6347
snr: Removing positivity constraint in nnet3 training
vimalmanohar Feb 24, 2016
96e1ba9
snr: Added utt2uniq in copy_data_dir.sh
vimalmanohar Feb 24, 2016
8 changes: 4 additions & 4 deletions egs/aspire/s5/cmd.sh
@@ -6,11 +6,11 @@
 # the number of cpus on your machine.

 #a) JHU cluster options
-export train_cmd="queue.pl -l arch=*64"
-export decode_cmd="queue.pl -l arch=*64,mem_free=2G,ram_free=2G"
-export mkgraph_cmd="queue.pl -l arch=*64,ram_free=4G,mem_free=4G"
+export train_cmd="queue.pl"
+export decode_cmd="queue.pl --mem 2G"
+export mkgraph_cmd="queue.pl --mem 4G"

-export cuda_cmd="queue.pl -l gpu=1 -q g.q"
+export cuda_cmd="queue.pl --gpu 1"


 #b) BUT cluster options
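The cmd.sh change replaces scheduler-specific `-l` resource strings with the generic `--mem` and `--gpu` options. As a rough illustration of the correspondence visible in the old and new lines, the mapping could be sketched as below. This is only a sketch: the real translation is performed by `queue.pl` according to its own configuration, the table is read off this diff rather than taken from that configuration, and `to_legacy_opts` is a hypothetical helper name.

```python
# Hypothetical helper: maps the generic options used in the new cmd.sh back to
# the legacy "-l ..." strings seen in the old cmd.sh. The table is inferred
# from the diff; queue.pl's actual behaviour depends on its configuration.
LEGACY_TEMPLATES = {
    "--mem": "-l mem_free={0},ram_free={0}",
    "--gpu": "-l gpu={0} -q g.q",
}


def to_legacy_opts(generic_opts):
    """Translate e.g. ['--mem', '2G'] into the old-style option string."""
    if len(generic_opts) % 2 != 0:
        raise ValueError("expected option/value pairs")
    parts = []
    # Walk the list two items at a time: option name, then its value.
    for name, value in zip(generic_opts[0::2], generic_opts[1::2]):
        if name not in LEGACY_TEMPLATES:
            raise KeyError("no legacy equivalent known for " + name)
        parts.append(LEGACY_TEMPLATES[name].format(value))
    return " ".join(parts)


if __name__ == "__main__":
    print(to_legacy_opts(["--mem", "2G"]))
    print(to_legacy_opts(["--gpu", "1"]))
```

The sketch deliberately leaves out `arch=*64`, which the old lines pass explicitly and the new lines do not mention.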
6 changes: 6 additions & 0 deletions egs/aspire/s5/conf/fbank.conf
@@ -0,0 +1,6 @@
# config for high-resolution Fbank features
--use-energy=false # do not add energy
--sample-frequency=8000 # Switchboard is sampled at 8kHz
--num-mel-bins=40 # similar to Google's setup.
--low-freq=40 # low cutoff frequency for mel bins
--high-freq=-200 # high cutoff frequency, relative to Nyquist of 4000 (=3800)
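The `--high-freq` comment relies on the convention that a non-positive value is an offset from the Nyquist frequency. A minimal sketch of that arithmetic, assuming exactly the convention the comment describes (`effective_high_freq` is an illustrative name, not a function from the toolkit):

```python
def effective_high_freq(sample_frequency, high_freq):
    """Return the absolute high cutoff in Hz.

    Assumes the convention stated in the config comment: a value <= 0 is an
    offset from the Nyquist frequency (half the sample rate); a positive
    value is already an absolute frequency.
    """
    nyquist = sample_frequency / 2.0
    cutoff = nyquist + high_freq if high_freq <= 0 else float(high_freq)
    if not 0 < cutoff <= nyquist:
        raise ValueError(
            "high cutoff %.1f Hz is outside (0, %.1f]" % (cutoff, nyquist)
        )
    return cutoff


if __name__ == "__main__":
    # Values from fbank.conf: 8 kHz audio with --high-freq=-200.
    print(effective_high_freq(8000, -200))
```

With the fbank.conf values this reproduces the 3800 Hz figure quoted in the comment; a value of 0 would give the Nyquist frequency itself.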
8 changes: 8 additions & 0 deletions egs/aspire/s5/conf/fbank_bp.conf
@@ -0,0 +1,8 @@
# config for high-resolution Fbank features
--use-energy=false # do not add energy
--sample-frequency=8000 # Switchboard is sampled at 8kHz
--num-mel-bins=152 # similar to Google's setup.
--num-fft-bins=512
--low-freq=330 # low cutoff frequency for mel bins
--high-freq=-1000 # high cutoff frequency, relative to Nyquist of 4000 (=3000)

6 changes: 6 additions & 0 deletions egs/aspire/s5/conf/mfcc_diarization.conf
@@ -0,0 +1,6 @@
--sample-frequency=8000
--frame-length=25 # the default is 25, but we usually use 20 for SID
--low-freq=20 # the default.
--high-freq=3700 # the default is zero meaning use the Nyquist (4k in this case).
--num-ceps=20 # higher than the default which is 12.
--snip-edges=false
12 changes: 12 additions & 0 deletions egs/aspire/s5/conf/mfcc_hires_bp.conf
@@ -0,0 +1,12 @@
# config for high-resolution MFCC features, intended for neural network training.
# Note: we keep all cepstra, so it has the same info as filterbank features,
# but MFCC is more easily compressible (because less correlated) which is why
# we prefer this method.
--use-energy=false # use average of log energy, not energy.
--sample-frequency=8000 # Switchboard is sampled at 8kHz
--num-mel-bins=152 # similar to Google's setup.
--num-ceps=152 # there is no dimensionality reduction.
--num-fft-bins=512
--low-freq=330 # low cutoff frequency for mel bins
--high-freq=-1000 # high cutoff frequency, relative to Nyquist of 4000 (=3000)

5 changes: 5 additions & 0 deletions egs/aspire/s5/conf/mfcc_vad.conf
@@ -0,0 +1,5 @@
--sample-frequency=8000
--frame-length=25 # the default is 25.
--low-freq=20 # the default.
--high-freq=-300 # the default is zero meaning use the Nyquist (4k in this case).
--num-ceps=13 # the same as the default, which is 13 (including C0).
24 changes: 24 additions & 0 deletions egs/aspire/s5/conf/segmentation.conf
@@ -0,0 +1,24 @@
method=Viterbi

# General segmentation options
max_intersegment_length=50 # Merge nearby speech segments if the silence
# between them is less than this many frames.
max_relabel_length=10 # maximum duration of speech that will be removed as part
# of smoothing process. This is only if there are no other
# speech segments nearby.
pad_length=5 # Pad speech segments by this many frames on either side
max_segment_length=1000 # Segments that are longer than this are split into
# overlapping frames.
overlap_length=100 # Overlapping frames when segments are split.
# See the above option.

# Viterbi options
min_silence_duration=30 # minimum number of frames for silence
min_speech_duration=30 # minimum number of frames for speech
speech_to_sil_ratio=1 # the prior on speech vs silence

# Decoding options
acwt=1
beam=10
max_active=7000
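The general segmentation options in segmentation.conf describe a post-processing pass over speech segments: merge segments separated by short silences, pad them, and split overly long ones into overlapping pieces. The sketch below is one plausible reading of those comments, not the code from this pull request. The function names are hypothetical, segments are assumed to be `(start, end)` frame pairs with an exclusive end, and the splitting rule (fixed-length windows advanced by `max_segment_length - overlap_length`) is an assumption.

```python
def merge_and_pad(segments, max_intersegment_length, pad_length, num_frames):
    """Merge speech segments separated by short silences, then pad them.

    segments: sorted list of (start, end) frame pairs, end exclusive.
    """
    merged = []
    for start, end in segments:
        # Merge when the silence gap is shorter than the threshold.
        if merged and start - merged[-1][1] < max_intersegment_length:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    # Pad on both sides, clipping to the recording boundaries.
    return [
        (max(0, s - pad_length), min(num_frames, e + pad_length))
        for s, e in merged
    ]


def split_long(segment, max_segment_length, overlap_length):
    """Split a segment longer than max_segment_length into overlapping pieces."""
    if overlap_length >= max_segment_length:
        raise ValueError("overlap_length must be smaller than max_segment_length")
    start, end = segment
    step = max_segment_length - overlap_length
    pieces = []
    while end - start > max_segment_length:
        pieces.append((start, start + max_segment_length))
        start += step
    pieces.append((start, end))
    return pieces


if __name__ == "__main__":
    speech = [(100, 200), (230, 300), (400, 450)]
    print(merge_and_pad(speech, max_intersegment_length=50, pad_length=5,
                        num_frames=1000))
    print(split_long((0, 2500), max_segment_length=1000, overlap_length=100))
```

With the values from this file (gap threshold 50, padding 5) padded neighbours cannot overlap, because any unmerged gap is at least 50 frames; with a larger `pad_length` a second merge pass would be needed.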

25 changes: 25 additions & 0 deletions egs/aspire/s5/conf/segmentation_aspire.conf
@@ -0,0 +1,25 @@
method=Viterbi

# General segmentation options
max_intersegment_length=50 # Merge nearby speech segments if the silence
# between them is less than this many frames.
max_relabel_length=10 # maximum duration of speech that will be removed as part
# of smoothing process. This is only if there are no other
# speech segments nearby.
pad_length=5 # Pad speech segments by this many frames on either side
max_segment_length=1000 # Segments that are longer than this are split into
# overlapping frames.
overlap_length=100 # Overlapping frames when segments are split.
# See the above option.

# Viterbi options
min_silence_duration=30 # minimum number of frames for silence
min_speech_duration=30 # minimum number of frames for speech
speech_to_sil_ratio=1 # the prior on speech vs silence

# Decoding options
acwt=1
beam=10
max_active=7000


26 changes: 26 additions & 0 deletions egs/aspire/s5/conf/segmentation_babel.conf
@@ -0,0 +1,26 @@
method=Viterbi

# General segmentation options
max_intersegment_length=100 # Merge nearby speech segments if the silence
# between them is less than this many frames.
max_relabel_length=10 # maximum duration of speech that will be removed as part
# of smoothing process. This is only if there are no other
# speech segments nearby.
pad_length=10 # Pad speech segments by this many frames on either side
post_pad_length=10 # Pad speech segments by this many frames on either side
max_segment_length=1000 # Segments that are longer than this are split into
# overlapping frames.
overlap_length=100 # Overlapping frames when segments are split.
# See the above option.

# Viterbi options
min_silence_duration=30 # minimum number of frames for silence
min_speech_duration=30 # minimum number of frames for speech
speech_to_sil_ratio=1 # the prior on speech vs silence

# Decoding options
acwt=1
beam=10
max_active=7000


39 changes: 39 additions & 0 deletions egs/aspire/s5/conf/vad_icsi_babel.conf
@@ -0,0 +1,39 @@
## Features parameters
window_size=10 # 100 ms
frames_per_gaussian=200

## Phase 1 parameters
num_frames_init_silence=2000 # 20s - Lowest energy frames selected to initialize Silence GMM
num_frames_init_sound=10000 # 100s - Highest energy frames selected to initialize Sound GMM
num_frames_init_sound_next=2000 # 20s - Highest zero crossing frames selected to initialize Sound GMM
sil_num_gauss_init=2
sound_num_gauss_init=2
sil_max_gauss=2
sound_max_gauss=6
sil_gauss_incr=0
sound_gauss_incr=2
num_iters=5
min_sil_variance=0.1
min_sound_variance=0.01
min_speech_variance=0.001

## Phase 2 parameters
speech_num_gauss_init=6
sil_max_gauss_phase2=7
sound_max_gauss_phase2=18
speech_max_gauss_phase2=16
sil_gauss_incr_phase2=1
sound_gauss_incr_phase2=2
speech_gauss_incr_phase2=2
num_iters_phase2=5

## Phase 3 parameters
sil_num_gauss_init_phase3=2
speech_num_gauss_init_phase3=2
sil_max_gauss_phase3=5
speech_max_gauss_phase3=12
sil_gauss_incr_phase3=1
speech_gauss_incr_phase3=2
num_iters_phase3=7
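The comments in vad_icsi_babel.conf pair frame counts with durations (2000 frames with 20s, 10000 frames with 100s), which implies a 10 ms frame shift. The `*_num_gauss_init`, `*_gauss_incr` and `*_max_gauss` triples also suggest a mixture that grows over iterations. The sketch below makes both ideas concrete under explicit assumptions: the 10 ms shift is inferred from the comments, and the schedule (add the increment after every iteration, capped at the maximum) is a guess at how the training script uses these values, not something shown in this diff. Both function names are hypothetical.

```python
FRAME_SHIFT_MS = 10  # assumption: inferred from comments such as "2000 # 20s"


def frames_to_seconds(num_frames, frame_shift_ms=FRAME_SHIFT_MS):
    """Convert a frame count to seconds for a given frame shift."""
    return num_frames * frame_shift_ms / 1000.0


def gaussian_schedule(num_gauss_init, gauss_incr, max_gauss, num_iters):
    """Number of Gaussians used at each iteration.

    Assumed rule: start at num_gauss_init, add gauss_incr after every
    iteration, and never exceed max_gauss.
    """
    counts = []
    current = num_gauss_init
    for _ in range(num_iters):
        counts.append(min(current, max_gauss))
        current += gauss_incr
    return counts


if __name__ == "__main__":
    print(frames_to_seconds(2000))  # num_frames_init_silence
    # Phase 1 sound model: init=2, incr=2, max=6, num_iters=5
    print(gaussian_schedule(2, 2, 6, 5))
    # Phase 1 silence model: init=2, incr=0, max=2, num_iters=5
    print(gaussian_schedule(2, 0, 2, 5))
```

Under this reading the phase 1 silence model stays at two Gaussians throughout, since its increment is 0 and its maximum equals its initial size.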


54 changes: 54 additions & 0 deletions egs/aspire/s5/conf/vad_icsi_babel_3models.conf
@@ -0,0 +1,54 @@
## Features parameters
window_size=10 # 100 ms
frames_per_gaussian=200

## Phase 1 parameters
num_frames_init_silence=2000 # 20s - Lowest energy frames selected to initialize Silence GMM
num_frames_init_sound=10000 # 100s - Highest energy frames selected to initialize Sound GMM
num_frames_init_sound_next=2000 # 20s - Highest zero crossing frames selected to initialize Sound GMM
sil_num_gauss_init=2
sound_num_gauss_init=2
sil_max_gauss=2
sound_max_gauss=6
sil_gauss_incr=0
sound_gauss_incr=2
num_iters=5
min_sil_variance=0.1
min_sound_variance=0.01
min_speech_variance=0.001

## Phase 2 parameters
speech_num_gauss_init=6
sil_max_gauss_phase2=7
sound_max_gauss_phase2=18
speech_max_gauss_phase2=16
sil_gauss_incr_phase2=1
sound_gauss_incr_phase2=2
speech_gauss_incr_phase2=2
num_iters_phase2=5

## Phase 3 parameters
num_frames_silence_phase3_init=2000
num_frames_speech_phase3_init=2000
sil_num_gauss_init_phase3=2
speech_num_gauss_init_phase3=2
sil_max_gauss_phase3=5
sil_max_gauss_phase4=8
speech_max_gauss_phase4=16
sil_gauss_incr_phase3=1
sil_gauss_incr_phase4=1
speech_gauss_incr_phase4=2
num_iters_phase3=5
num_iters_phase4=5

## Phase 5 parameters
sil_num_gauss_init_phase5=2
speech_num_gauss_init_phase5=2
sil_max_gauss_phase5=5
speech_max_gauss_phase5=12
sil_gauss_incr_phase5=1
speech_gauss_incr_phase5=2
num_iters_phase5=7



41 changes: 41 additions & 0 deletions egs/aspire/s5/conf/vad_icsi_rt.conf
@@ -0,0 +1,41 @@
## Features parameters
window_size=10 # 1s
frames_per_gaussian=2000

## Phase 1 parameters
num_frames_init_silence=2000
num_frames_init_sound=10000
num_frames_init_sound_next=2000
sil_num_gauss_init=2
sound_num_gauss_init=2
sil_max_gauss=2
sound_max_gauss=6
sil_gauss_incr=0
sound_gauss_incr=2
num_iters=5
min_sil_variance=0.1
min_sound_variance=0.01
min_speech_variance=0.001

## Phase 2 parameters
num_frames_init_speech=10000
speech_num_gauss_init=6
sil_max_gauss_phase2=7
sound_max_gauss_phase2=18
speech_max_gauss_phase2=16
sil_gauss_incr_phase2=1
sound_gauss_incr_phase2=2
speech_gauss_incr_phase2=2
num_iters_phase2=5

## Phase 3 parameters
sil_num_gauss_init_phase3=2
speech_num_gauss_init_phase3=2
sil_max_gauss_phase3=5
speech_max_gauss_phase3=12
sil_gauss_incr_phase3=1
speech_gauss_incr_phase3=2
num_iters_phase3=7



26 changes: 26 additions & 0 deletions egs/aspire/s5/conf/weights_segmentation_aspire.conf
@@ -0,0 +1,26 @@
method=Viterbi

# General segmentation options
max_intersegment_length=0 # Merge nearby speech segments if the silence
# between them is less than this many frames.
max_relabel_length=0 # maximum duration of speech that will be removed as part
# of smoothing process. This is only if there are no other
# speech segments nearby.
pad_length=0 # Pad speech segments by this many frames on either side
max_segment_length=2000 # Segments that are longer than this are split into
# overlapping frames.
overlap_length=0 # Overlapping frames when segments are split.
# See the above option.

# Viterbi options
min_silence_duration=30 # minimum number of frames for silence
min_speech_duration=30 # minimum number of frames for speech
speech_to_sil_ratio=0.1 # the prior on speech vs silence

# Decoding options
acwt=1
beam=10
max_active=7000



25 changes: 25 additions & 0 deletions egs/aspire/s5/conf/weights_segmentation_babel.conf
@@ -0,0 +1,25 @@
method=Viterbi

# General segmentation options
max_intersegment_length=0 # Merge nearby speech segments if the silence
# between them is less than this many frames.
max_relabel_length=0 # maximum duration of speech that will be removed as part
# of smoothing process. This is only if there are no other
# speech segments nearby.
pad_length=0 # Pad speech segments by this many frames on either side
max_segment_length=2000 # Segments that are longer than this are split into
# overlapping frames.
overlap_length=0 # Overlapping frames when segments are split.
# See the above option.

# Viterbi options
min_silence_duration=30 # minimum number of frames for silence
min_speech_duration=30 # minimum number of frames for speech
speech_to_sil_ratio=0.1 # the prior on speech vs silence

# Decoding options
acwt=1
beam=10
max_active=7000


4 changes: 4 additions & 0 deletions egs/aspire/s5/conf/zc_vad.conf
@@ -0,0 +1,4 @@
--sample-frequency=8000
--frame-length=25 # the default is 25.
--dither=0.0
--zero-crossing-threshold=1e-5
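zc_vad.conf configures a zero-crossing feature with a threshold and with dithering disabled. The program that consumes these options is not part of this excerpt, so the sketch below only illustrates the general technique under a stated assumption: samples whose magnitude is below the threshold are treated as having no definite sign and are skipped. `count_zero_crossings` and `samples_per_frame` are hypothetical names.

```python
def samples_per_frame(sample_frequency, frame_length_ms):
    """Number of samples in one analysis frame."""
    return int(sample_frequency * frame_length_ms / 1000)


def count_zero_crossings(samples, threshold=1e-5):
    """Count sign changes in a frame, ignoring near-zero samples.

    Assumption: a sample with |x| < threshold carries no reliable sign, so it
    neither starts nor completes a crossing.
    """
    crossings = 0
    prev_sign = 0  # 0 means no definite sign seen yet
    for x in samples:
        if abs(x) < threshold:
            continue
        sign = 1 if x > 0 else -1
        if prev_sign != 0 and sign != prev_sign:
            crossings += 1
        prev_sign = sign
    return crossings


if __name__ == "__main__":
    # 25 ms frames at 8 kHz, as in zc_vad.conf.
    print(samples_per_frame(8000, 25))
    print(count_zero_crossings([0.5, -0.5, 0.5]))
```

One reason a threshold like this matters with `--dither=0.0` is that digital silence stays at or near zero, and without a threshold tiny fluctuations around zero could register as crossings.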
1 change: 1 addition & 0 deletions egs/aspire/s5/diarization
5 changes: 3 additions & 2 deletions egs/aspire/s5/local/multi_condition/combine_ali_dirs.sh
@@ -6,6 +6,7 @@
 # Begin configuration section.
 extra_files= # specify additional files in 'src-data-dir' to merge, ex. "file1 file2 ..."
 ref_data_dir= # data directory to be used as reference for rearranging alignments
+cmd=run.pl
 # End configuration section.

 echo "$0 $@" # Print the command line for logging
@@ -78,7 +79,7 @@ if [ ! -z "$ref_data_dir" ]; then
 awk -v p=\$ali_file '{printf "%s %s %s\n", \$1, p, NR}' > $temp_dir/ali_utt_index.\$JOB
 EOF
 chmod +x $temp_dir/create_ali_utt_index.sh
-$decode_cmd -v PATH JOB=1:$num_jobs $temp_dir/ali_copy_int.JOB.log $temp_dir/create_ali_utt_index.sh JOB
+$cmd -v PATH JOB=1:$num_jobs $temp_dir/ali_copy_int.JOB.log $temp_dir/create_ali_utt_index.sh JOB

 cat <<EOF >$temp_dir/create_new_ali.py

@@ -147,7 +148,7 @@ EOF
 # split the ref_data_dir to get reference utt2spk for individual ali.JOB.gz files
 utils/split_data.sh $ref_data_dir $num_jobs

-$decode_cmd -v PATH JOB=1:$num_jobs $temp_dir/create_new_ali.JOB.run.log \
+$cmd JOB=1:$num_jobs $temp_dir/create_new_ali.JOB.run.log \
 python $temp_dir/create_new_ali.py \
 $ref_data_dir/split$num_jobs/JOB/utt2spk \
 $temp_dir/create_new_ali.JOB.sh $temp_dir/ali.JOB.gz || exit 1;
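Both commands in combine_ali_dirs.sh rely on the `JOB=1:$num_jobs` convention: the job runner launches one command per job index and substitutes the index for the literal string `JOB` in the log path and the arguments. The sketch below imitates that expansion for illustration only. It is not the runner's code, `expand_job_spec` and `expand_commands` are hypothetical names, and plain string replacement is assumed to be the substitution rule.

```python
import re


def expand_job_spec(spec):
    """Parse a spec such as 'JOB=1:4' into ('JOB', [1, 2, 3, 4])."""
    match = re.fullmatch(r"([A-Za-z_]\w*)=(\d+):(\d+)", spec)
    if match is None:
        raise ValueError("not a NAME=START:END job spec: " + spec)
    name = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("empty job range in " + spec)
    return name, list(range(start, end + 1))


def expand_commands(spec, log_template, command):
    """Return one (log_path, argument_list) pair per job index."""
    name, job_ids = expand_job_spec(spec)
    expanded = []
    for job_id in job_ids:
        # Assumed rule: every occurrence of the job name is replaced.
        log = log_template.replace(name, str(job_id))
        args = [arg.replace(name, str(job_id)) for arg in command]
        expanded.append((log, args))
    return expanded


if __name__ == "__main__":
    for log, args in expand_commands(
        "JOB=1:2",
        "tmp/ali_copy_int.JOB.log",
        ["tmp/create_ali_utt_index.sh", "JOB"],
    ):
        print(log, args)
```

Because the runner is now taken from the `cmd` option (default `run.pl`) rather than from `$decode_cmd`, the script no longer depends on a variable that is only defined when cmd.sh has been sourced.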
3 changes: 2 additions & 1 deletion egs/aspire/s5/local/multi_condition/copy_ali_dir.sh
@@ -18,6 +18,7 @@
 # begin configuration section
 utt_prefix=
 utt_suffix=
+cmd=run.pl
 # end configuration section

 . utils/parse_options.sh
@@ -72,6 +73,6 @@ for line in sys.stdin:
 set +o pipefail; # unset the pipefail option.
 EOF
 chmod +x $dest_dir/temp/copy_ali.sh
-$decode_cmd -v PATH JOB=1:$nj $dest_dir/temp/copy_ali.JOB.log $dest_dir/temp/copy_ali.sh JOB || exit 1;
+$cmd -v PATH JOB=1:$nj $dest_dir/temp/copy_ali.JOB.log $dest_dir/temp/copy_ali.sh JOB || exit 1;

 echo "$0: copied alignments from $src_dir to $dest_dir"