HTKBook
VoxForge
http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/how-to
http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/tutorial
Kyle Gorman
http://www.ling.upenn.edu/~kgorman/papers/segmentation/.speechseg.html
Script collection from Kyle Gorman: seg.tar.gz. The training process is based on this model.
Praat
Audacity
Training Flow
Training Procedure
: lossless WAV
: 18 kHz
: 16 bit, mono
: 500 ms of silence
: per-speaker, per-sentence
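The recording requirements above can be checked automatically before training. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav` is hypothetical, and the default sample rate is simply the value listed above.

```python
import wave


def check_wav(path, expected_rate_hz=18000):
    """Return a list of problems; an empty list means the file matches the spec."""
    problems = []
    with wave.open(path, "rb") as wav:
        # Mono means exactly one channel.
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, got {wav.getnchannels()} channels")
        # 16-bit samples are 2 bytes wide.
        if wav.getsampwidth() != 2:
            problems.append(f"expected 16-bit, got {8 * wav.getsampwidth()}-bit")
        # Sample rate as listed in the recording requirements (assumed default).
        if wav.getframerate() != expected_rate_hz:
            problems.append(
                f"expected {expected_rate_hz} Hz, got {wav.getframerate()} Hz"
            )
    return problems
```

Calling `check_wav("data/maju001_M001.wav")` on each recording and printing any non-empty result is one way to catch mismatched files early.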
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 24
CEPLIFTER = 22
NUMCEPS = 12
RAWENERGY = F
ENORMALISE = F
ZMEANSOURCE = F
train.conf
contains the following information:
TARGETKIND = MFCC_0_D_N_Z
TARGETRATE = 100000.0
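HTK expresses time parameters in units of 100 ns, so the WINDOWSIZE and TARGETRATE values above are easier to read once converted to milliseconds. The short calculation below is only a worked example; the helper name `htk_to_ms` is hypothetical.

```python
HTK_UNITS_PER_MS = 10_000.0  # one HTK time unit is 100 ns, so 10,000 units = 1 ms


def htk_to_ms(value):
    """Convert an HTK time value (100 ns units) to milliseconds."""
    return value / HTK_UNITS_PER_MS


window_ms = htk_to_ms(250000.0)        # WINDOWSIZE: analysis window length
shift_ms = htk_to_ms(100000.0)         # TARGETRATE: step between frames
frames_per_second = 1000.0 / shift_ms  # number of feature vectors per second
overlap_ms = window_ms - shift_ms      # how much consecutive windows overlap

print(window_ms, shift_ms, frames_per_second, overlap_ms)
```

With these settings each feature vector summarizes a 25 ms window, and a new vector is produced every 10 ms.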
mkPhones0.led
contains the following information:
EX
IS sil sil
DE sp
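As a rough illustration of what these three edit commands do to a transcription, the sketch below mimics them in plain Python. It is not HLEd itself: the lexicon, the words, and the phone symbols are hypothetical, and it assumes every dictionary pronunciation ends in `sp`.

```python
def expand_words(words, lexicon):
    """EX: replace each word with its phone sequence from the dictionary."""
    phones = []
    for word in words:
        phones.extend(lexicon[word])
    return phones


def insert_boundaries(phones, start="sil", end="sil"):
    """IS sil sil: put a silence label at the start and end of the utterance."""
    return [start] + list(phones) + [end]


def delete_labels(phones, label="sp"):
    """DE sp: remove every short-pause label."""
    return [p for p in phones if p != label]


# Hypothetical two-word lexicon, for illustration only.
lexicon = {
    "MAJU": ["m", "a", "j", "u", "sp"],
    "KIRI": ["k", "i", "r", "i", "sp"],
}

phones = expand_words(["MAJU", "KIRI"], lexicon)
phones = insert_boundaries(phones)
phones = delete_labels(phones)
print(" ".join(phones))
```

The commands are applied in the same order as in the edit script: expand, insert silences, then delete the short pauses.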
Create pairs of .wav and .mfc files (assuming the .wav files are in the ./data directory and the .mfc files in the ./mfc directory).
Example:
data/maju001_M001.wav
mfc/maju001_M001.mfc
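The pair list can be generated rather than typed by hand. The sketch below assumes the directory layout described above (./data and ./mfc); the function name `wav_mfc_pairs` and the output file name `codetrain.scp` are assumptions, not names taken from this document.

```python
from pathlib import Path


def wav_mfc_pairs(wav_dir="data", mfc_dir="mfc"):
    """Return one 'source.wav target.mfc' line per .wav file in wav_dir."""
    pairs = []
    for wav in sorted(Path(wav_dir).glob("*.wav")):
        # Same base name, different directory and extension.
        mfc = Path(mfc_dir) / (wav.stem + ".mfc")
        pairs.append(f"{wav.as_posix()} {mfc.as_posix()}")
    return pairs


if __name__ == "__main__":
    # Assumed output name; adjust to whatever the training scripts expect.
    Path("codetrain.scp").write_text("\n".join(wav_mfc_pairs()) + "\n")
```

Each output line places the source and target on the same line, separated by a space.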
<Variance> 25
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<State> 3
<Mean> 25
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<Variance> 25
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<State> 4
<Mean> 25
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<Variance> 25
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<TransP> 5
0.0 1.0 0.0 0.0 0.0
0.0 0.6 0.4 0.0 0.0
0.0 0.0 0.6 0.4 0.0
0.0 0.0 0.0 0.7 0.3
0.0 0.0 0.0 0.0 0.0
<EndHMM>
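The vector size of 25 follows from the feature settings: 12 cepstral coefficients plus C0 give 13 static values, the delta qualifier doubles that to 26, and the `_N` qualifier drops the absolute energy term, leaving 25. A prototype of this shape can be generated with a few lines of Python. This is a sketch only: the `~o` and `~h` header lines and `<NumStates>` follow the usual HTK prototype layout and are assumptions here, since the beginning of the file is not shown above. The transition matrix is the one listed above.

```python
def feature_dim(num_ceps=12):
    """Vector size for MFCC_0_D_N_Z given NUMCEPS."""
    static = num_ceps + 1     # _0 appends C0 to the cepstral coefficients
    with_deltas = 2 * static  # _D appends one delta per static value
    return with_deltas - 1    # _N suppresses the absolute energy term


# Transition matrix copied from the prototype above (5 states, 3 emitting).
TRANSP = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.6, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.6, 0.4, 0.0],
    [0.0, 0.0, 0.0, 0.7, 0.3],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]


def make_proto(vec_size=25, kind="MFCC_0_D_N_Z", name="proto"):
    """Build the text of a 5-state prototype with zero means and unit variances."""
    lines = [
        f"~o <VecSize> {vec_size} <{kind}>",  # assumed header layout
        f'~h "{name}"',
        "<BeginHMM>",
        f"<NumStates> {len(TRANSP)}",
    ]
    for state in range(2, len(TRANSP)):  # emitting states 2, 3 and 4
        lines += [
            f"<State> {state}",
            f"<Mean> {vec_size}",
            " ".join(["0.0"] * vec_size),
            f"<Variance> {vec_size}",
            " ".join(["1.0"] * vec_size),
        ]
    lines.append(f"<TransP> {len(TRANSP)}")
    for row in TRANSP:
        lines.append(" ".join(f"{p:.1f}" for p in row))
    lines.append("<EndHMM>")
    return "\n".join(lines) + "\n"
```

Writing `make_proto(feature_dim(12))` to a file gives a prototype whose vector size stays consistent with the configuration.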