2014-10-14 2 views

Freeswitch pocketsphinx doesn't recognize me

Today I need some help with Pocketsphinx, the speech recognition used in Freeswitch. The "pizza demo" doesn't work because the program doesn't "hear" me.

I also tried another example that uses a Lua script. Pocketsphinx doesn't "hear" me there either.

Maybe someone knows what is going wrong. Since I haven't implemented anything myself, I don't know which code to paste here. If you need some code or configuration, please let me know.

My idea: maybe I need to configure the .dic file that pocketsphinx should use. I hope someone can help me.

Edit:

2014-10-14 15:13:08.923330 [NOTICE] switch_channel.c:1055 New Channel sofia/internal/[email protected] [326a4157-aa80-48d2-bd7e-db8d8afd525b] 
2014-10-14 15:13:09.042378 [INFO] mod_dialplan_xml.c:558 Processing me <1001>->74992 in context default 
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING 
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 Open /usr/local/freeswitch/conf/vars.xml and change the default_password. 
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 Once changed type 'reloadxml' at the console. 
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING 
2014-10-14 15:13:19.932900 [INFO] switch_core_media.c:5162 Activating RTCP PORT 4077 
2014-10-14 15:13:19.932900 [NOTICE] sofia_media.c:92 Pre-Answer sofia/internal/[email protected]! 
2014-10-14 15:13:19.943925 [NOTICE] fssession.cpp:1167 Channel [sofia/internal/[email protected]] has been answered 
INFO: cmd_ln.c(691): Parsing command line: 
\ 
    -samprate 8000 \ 
    -hmm /usr/local/freeswitch/grammar/model/communicator \ 
    -jsgf /usr/local/freeswitch/grammar/pizza_order.gram \ 
    -lw 6.5 \ 
    -dict /usr/local/freeswitch/grammar/default.dic \ 
    -frate 50 \ 
    -silprob 0.005 

Current configuration: 
[NAME]  [DEFLT]  [VALUE] 
-agc  none  none 
-agcthresh 2.0  2.000000e+00 
-alpha  0.97  9.700000e-01 
-ascale  20.0  2.000000e+01 
-aw  1  1 
-backtrace no  no 
-beam  1e-48  1.000000e-48 
-bestpath yes  yes 
-bestpathlw 9.5  9.500000e+00 
-bghist  no  no 
-ceplen  13  13 
-cmn  current  current 
-cmninit 8.0  8.0 
-compallsen no  no 
-debug    0 
-dict    /usr/local/freeswitch/grammar/default.dic 
-dictcase no  no 
-dither  no  no 
-doublebw no  no 
-ds  1  1 
-fdict 
-feat  1s_c_d_dd 1s_c_d_dd 
-featparams 
-fillprob 1e-8  1.000000e-08 
-frate  100  50 
-fsg 
-fsgusealtpron yes  yes 
-fsgusefiller yes  yes 
-fwdflat yes  yes 
-fwdflatbeam 1e-64  1.000000e-64 
-fwdflatefwid 4  4 
-fwdflatlw 8.5  8.500000e+00 
-fwdflatsfwin 25  25 
-fwdflatwbeam 7e-29  7.000000e-29 
-fwdtree yes  yes 
-hmm    /usr/local/freeswitch/grammar/model/communicator 
-input_endian little  little 
-jsgf    /usr/local/freeswitch/grammar/pizza_order.gram 
-kdmaxbbi -1  -1 
-kdmaxdepth 0  0 
-kdtree 
-latsize 5000  5000 
-lda 
-ldadim  0  0 
-lextreedump 0  0 
-lifter  0  0 
-lm 
-lmctl 
-lmname  default  default 
-logbase 1.0001  1.000100e+00 
-logfn 
-logspec no  no 
-lowerf  133.33334 1.333333e+02 
-lpbeam  1e-40  1.000000e-40 
-lponlybeam 7e-29  7.000000e-29 
-lw  6.5  6.500000e+00 
-maxhmmpf -1  -1 
-maxnewoov 20  20 
-maxwpf  -1  -1 
-mdef 
-mean 
-mfclogdir 
-min_endfr 0  0 
-mixw 
-mixwfloor 0.0000001 1.000000e-07 
-mllr 
-mmap  yes  yes 
-ncep  13  13 
-nfft  512  512 
-nfilt  40  40 
-nwpen  1.0  1.000000e+00 
-pbeam  1e-48  1.000000e-48 
-pip  1.0  1.000000e+00 
-pl_beam 1e-10  1.000000e-10 
-pl_pbeam 1e-5  1.000000e-05 
-pl_window 0  0 
-rawlogdir 
-remove_dc no  no 
-round_filters yes  yes 
-samprate 16000  8.000000e+03 
-seed  -1  -1 
-sendump 
-senlogdir 
-senmgau 
-silprob 0.005  5.000000e-03 
-smoothspec no  no 
-svspec 
-tmat 
-tmatfloor 0.0001  1.000000e-04 
-topn  4  4 
-topn_beam 0  0 
-toprule 
-transform legacy  legacy 
-unit_area yes  yes 
-upperf  6855.4976 6.855498e+03 
-usewdphones no  no 
-uw  1.0  1.000000e+00 
-var 
-varfloor 0.0001  1.000000e-04 
-varnorm no  no 
-verbose no  no 
-warp_params 
-warp_type inverse_linear inverse_linear 
-wbeam  7e-29  7.000000e-29 
-wip  0.65  6.500000e-01 
-wlen  0.025625 2.562500e-02 

INFO: cmd_ln.c(691): Parsing command line: 
\ 
    -alpha 0.97 \ 
    -dither yes \ 
    -doublebw no \ 
    -nfilt 31 \ 
    -ncep 13 \ 
    -lowerf 200 \ 
    -upperf 3500 \ 
    -nfft 256 \ 
    -wlen 0.0256 \ 
    -transform legacy \ 
    -feat s2_4x \ 
    -agc none \ 
    -cmn current \ 
    -varnorm no 

Current configuration: 
[NAME]  [DEFLT]  [VALUE] 
-agc  none  none 
-agcthresh 2.0  2.000000e+00 
-alpha  0.97  9.700000e-01 
-ceplen  13  13 
-cmn  current  current 
-cmninit 8.0  8.0 
-dither  no  yes 
-doublebw no  no 
-feat  1s_c_d_dd s2_4x 
-frate  100  50 
-input_endian little  little 
-lda 
-ldadim  0  0 
-lifter  0  0 
-logspec no  no 
-lowerf  133.33334 2.000000e+02 
-ncep  13  13 
-nfft  512  256 
-nfilt  40  31 
-remove_dc no  no 
-round_filters yes  yes 
-samprate 16000  8.000000e+03 
-seed  -1  -1 
-smoothspec no  no 
-svspec 
-transform legacy  legacy 
-unit_area yes  yes 
-upperf  6855.4976 3.500000e+03 
-varnorm no  no 
-verbose no  no 
-warp_params 
-warp_type inverse_linear inverse_linear 
-wlen  0.025625 2.560000e-02 

INFO: acmod.c(246): Parsed model-specific feature parameters from /usr/local/freeswitch/grammar/model/communicator/feat.params 
INFO: fe_interface.c(299): You are using the internal mechanism to generate the seed. 
INFO: feat.c(713): Initializing feature stream to type: 's2_4x', ceplen=13, CMN='current', VARNORM='no', AGC='none' 
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0 
INFO: mdef.c(517): Reading model definition: /usr/local/freeswitch/grammar/model/communicator/mdef 
INFO: bin_mdef.c(179): Allocating 104160 * 8 bytes (813 KiB) for CD tree 
INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/local/freeswitch/grammar/model/communicator/transition_matrices 
INFO: acmod.c(121): Attempting to use SCHMM computation module 
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/freeswitch/grammar/model/communicator/means 
INFO: ms_gauden.c(292): 1 codebook, 4 feature, size: 
INFO: ms_gauden.c(294): 256x12 
INFO: ms_gauden.c(294): 256x24 
INFO: ms_gauden.c(294): 256x3 
INFO: ms_gauden.c(294): 256x12 
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/freeswitch/grammar/model/communicator/variances 
INFO: ms_gauden.c(292): 1 codebook, 4 feature, size: 
INFO: ms_gauden.c(294): 256x12 
INFO: ms_gauden.c(294): 256x24 
INFO: ms_gauden.c(294): 256x3 
INFO: ms_gauden.c(294): 256x12 
INFO: ms_gauden.c(354): 59 variance values floored 
INFO: s2_semi_mgau.c(903): Loading senones from dump file /usr/local/freeswitch/grammar/model/communicator/sendump 
INFO: s2_semi_mgau.c(927): BEGIN FILE FORMAT DESCRIPTION 
INFO: s2_semi_mgau.c(990): Rows: 256, Columns: 6256 
INFO: s2_semi_mgau.c(1022): Using memory-mapped I/O for senones 
INFO: s2_semi_mgau.c(1296): Maximum top-N: 4 Top-N beams: 0 0 0 0 
INFO: dict.c(317): Allocating 137549 * 32 bytes (4298 KiB) for word entries 
INFO: dict.c(332): Reading main dictionary: /usr/local/freeswitch/grammar/default.dic 
INFO: dict.c(211): Allocated 1010 KiB for strings, 1664 KiB for phones 
INFO: dict.c(335): 133436 words read 
INFO: dict.c(341): Reading filler dictionary: /usr/local/freeswitch/grammar/model/communicator/noisedict 
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones 
INFO: dict.c(344): 17 words read 
INFO: dict2pid.c(396): Building PID tables for dictionary 
INFO: dict2pid.c(404): Allocating 51^3 * 2 bytes (259 KiB) for word-initial triphones 
INFO: dict2pid.c(131): Allocated 62832 bytes (61 KiB) for word-final triphones 
INFO: dict2pid.c(195): Allocated 62832 bytes (61 KiB) for single-phone word triphones 
INFO: fsg_search.c(145): FSG(beam: -1080, pbeam: -1080, wbeam: -634; wip: -26, pip: 0) 
INFO: jsgf.c(581): Defined rule: <pizza_order.g00000> 
INFO: jsgf.c(581): Defined rule: PUBLIC <pizza_order.delivery> 
INFO: fsg_model.c(215): Computing transitive closure for null transitions 
INFO: fsg_model.c(270): 9 null transitions added 
INFO: fsg_model.c(421): Adding silence transitions for <sil> to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++AE++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++AH++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++BACKGROUND++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++BREATH++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++COUGH++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++EH++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++ER++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++LAUGH++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++MM++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++MUMBLE++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++NOISE++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++OH++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++SMACK++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++UH++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++UH_NOISE++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++UM++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_model.c(421): Adding silence transitions for ++UM_NOISE++ to FSG 
INFO: fsg_model.c(441): Added 8 silence word transitions 
INFO: fsg_search.c(366): Added 0 alternate word transitions 
INFO: fsg_lextree.c(108): Allocated 832 bytes (0 KiB) for left and right context phones 
INFO: fsg_lextree.c(253): 213 HMM nodes in lextree (199 leaves) 
INFO: fsg_lextree.c(255): Allocated 27264 bytes (26 KiB) for all lextree nodes 
INFO: fsg_lextree.c(258): Allocated 25472 bytes (24 KiB) for lextree leafnodes 
2014-10-14 15:13:25.442814 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:25.442953] SSRC[1123956418]RTT[0.001266] A[2683662693] - DLSR[22111] - LSR[2683640499] 
INFO: cmn_prior.c(121): cmn_prior_update: from < 8.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 > 
INFO: cmn_prior.c(139): cmn_prior_update: to < 7.58 0.08 -0.24 -0.08 -0.24 -0.18 -0.21 -0.15 -0.06 -0.18 -0.08 -0.11 -0.11 > 
INFO: fsg_search.c(1032): 86 frames, 1666 HMMs (19/fr), 6967 senones (81/fr), 886 history entries (10/fr) 

INFO: fsg_search.c(1417): Start node <sil>.0:22:85 
INFO: fsg_search.c(1417): Start node <sil>.0:22:55 
INFO: fsg_search.c(1417): Start node <sil>.0:22:85 
INFO: fsg_search.c(1417): Start node <sil>.0:22:55 
INFO: fsg_search.c(1417): Start node <sil>.0:22:85 
INFO: fsg_search.c(1417): Start node takeout.0:21:33 
INFO: fsg_search.c(1417): Start node pickup.0:19:71 
INFO: fsg_search.c(1456): End node <sil>.56:58:85 (-1076) 
INFO: fsg_search.c(1456): End node <sil>.56:58:85 (-1076) 
INFO: fsg_search.c(1456): End node <sil>.56:58:85 (-1076) 
INFO: fsg_search.c(1456): End node <sil>.26:28:85 (-1180) 
INFO: fsg_search.c(1456): End node <sil>.0:22:85 (-6201) 
INFO: fsg_search.c(1456): End node <sil>.0:22:85 (-6201) 
INFO: fsg_search.c(1456): End node <sil>.0:22:85 (-6201) 
INFO: fsg_search.c(1680): lattice start node <s>.0 end node </s>.86 
INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(</s>:86:86) = -333411 
INFO: ps_lattice.c(1403): Joint P(O,S) = -333414 P(S|O) = -3 
2014-10-14 15:13:28.822614 [WARNING] mod_pocketsphinx.c:348 Lost the text, never mind.... 
2014-10-14 15:13:30.922352 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:30.922476] SSRC[1123956418]RTT[0.001648] A[2684021799] - DLSR[53573] - LSR[2683968118] 
2014-10-14 15:13:36.403317 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:36.403451] SSRC[1123956418]RTT[0.002731] A[2684381000] - DLSR[85028] - LSR[2684295793] 
INFO: fsg_search.c(1032): 149 frames, 1750 HMMs (11/fr), 8700 senones (58/fr), 1006 history entries (6/fr) 

INFO: fsg_search.c(1417): Start node <sil>.0:2:90 
INFO: fsg_search.c(1417): Start node <sil>.0:2:90 
INFO: fsg_search.c(1456): End node <sil>.122:124:148 (-955) 
INFO: fsg_search.c(1456): End node <sil>.122:124:148 (-955) 
INFO: fsg_search.c(1456): End node <sil>.122:124:148 (-955) 
INFO: fsg_search.c(1456): End node pickup.87:107:148 (-4233) 
INFO: fsg_search.c(1680): lattice start node <s>.0 end node </s>.149 
INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(</s>:149:149) = -927641 
INFO: ps_lattice.c(1403): Joint P(O,S) = -927641 P(S|O) = 0 
2014-10-14 15:13:41.883453 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:41.883618] SSRC[1123956418]RTT[0.002487] A[2684740148] - DLSR[116488] - LSR[2684623497] 
2014-10-14 15:13:44.732381 [NOTICE] sofia.c:952 Hangup sofia/internal/[email protected] [CS_EXECUTE] [NORMAL_CLEARING] 
2014-10-14 15:13:44.732381 [ERR] SpeechTools.jm:368 Exception: Session is not active! (near: "   rv = this.asr.session.collectInput(this.asr.onInput, this.asr, 500);") 
INFO: fsg_search.c(1032): 33 frames, 377 HMMs (11/fr), 1733 senones (52/fr), 275 history entries (8/fr) 

2014-10-14 15:13:44.802526 [INFO] mod_pocketsphinx.c:257 Port Closed. 
2014-10-14 15:13:44.823711 [NOTICE] switch_core_session.c:1633 Session 25 (sofia/internal/[email protected]) Ended 
2014-10-14 15:13:44.823711 [NOTICE] switch_core_session.c:1637 Close Channel sofia/internal/[email protected] [CS_DESTROY] 

Edit 2:

I found out that the speech recognition does work and detects my speech. The problem is in SpeechTools.jm: the result cannot be loaded from the XML, so it is undefined.

body = body.replace(/<\?.*?\?>/g, ''); 
console_log("debug", "----XML:\n" + body + "\n"); 
xml = new XML("<xml>" + body + "</xml>"); 
result = xml.result; //undefined 

and the console_log output:

<result grammar="pizza_order"> 
    <interpretation grammar="pizza_order" confidence="100"> 
    <input mode="speech">pickup</input> 
    </interpretation> 
</result> 

How can this undefined error be fixed? Do we have to edit the script, or is there another way to load the data from the XML? –

Answers


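As you can see from my output, the speech recognition worked the whole time (see the edits). The real problem is that the whole script (SpeechTools.jm) doesn't work: FreeSWITCH switched from the Mozilla JavaScript engine to Google V8 without updating the script. Fixing the script is a JavaScript problem, though, and no longer related to this question.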
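For anyone who wants to patch the script anyway: the `xml.result` access in the question looks like E4X-style property access, which the Mozilla engine supported but V8 does not implement, and that would explain the undefined value. A minimal sketch of a workaround, assuming the result body always has the shape shown in the console_log output above, is to pull the fields out with plain regular expressions. The helper name `parseAsrResult` and the `sample` string are hypothetical and not part of SpeechTools.jm.

```javascript
// Hypothetical helper (not part of SpeechTools.jm): extracts the recognized
// text and confidence from a PocketSphinx result body without E4X.
// Assumption: the body looks like the <result>...</result> snippet from the
// question; this is not a general XML parser.
function parseAsrResult(body) {
  // Strip the XML declaration, as the original script does.
  var cleaned = String(body).replace(/<\?.*?\?>/g, '');

  // Attributes of the <interpretation> element, if present.
  var interp = /<interpretation\b([^>]*)>/.exec(cleaned);
  // Text content of the <input> element.
  var input = /<input\b[^>]*>([\s\S]*?)<\/input>/.exec(cleaned);
  if (!input) {
    return null; // nothing recognizable in this body
  }

  var conf = interp ? /\bconfidence="(\d+)"/.exec(interp[1]) : null;
  return {
    text: input[1].trim(),
    confidence: conf ? parseInt(conf[1], 10) : null
  };
}

// Sample body modelled on the console_log output in the question.
var sample =
  '<?xml version="1.0"?>\n' +
  '<result grammar="pizza_order">\n' +
  '  <interpretation grammar="pizza_order" confidence="100">\n' +
  '    <input mode="speech">pickup</input>\n' +
  '  </interpretation>\n' +
  '</result>';

var parsed = parseAsrResult(sample);
```

Here `parsed.text` holds the recognized word and `parsed.confidence` the numeric confidence. The rest of SpeechTools.jm would still need to be adapted to read these fields instead of `xml.result`.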


Is there a solution to this problem? Please let me know, I have the same problem... –


We stopped using it. I don't actually work with FreeSWITCH anymore, so I can't help you. Sorry – Zero
