2012-02-25 5 views
1

몇 일 전 Membase가 매우 느리게 작동하기 시작했습니다. 우리는 각 서버 (우리는 4 대의 서버가 있음)에 로컬 moxi 프록시를 사용하고 있습니다. 그리고 복제가 가능한 2 대의 서버 (10GBx2)가있는 Membase 클러스터가 있습니다. 이제 우리는 클러스터 (복제본과 함께 30M)에 약 15M 레코드를 가지고 있습니다. 클러스터에 약 2000-6000ops로드 및 100 % 메모리 상주.Membase가 매우 천천히 작동하기 시작했습니다.

Membase : 1.7.1 시스템 : 리눅스 CentOS는 2.6.18-238.19.1.el5 # 1 SMP (금) 7월 15일 동부 서머 타임 07시 31분 24초 2011 x86_64에의 x86_64에의 x86_64의 GNU/리눅스

여기

입니다 타이밍 moxi 프록시 :

stats proxy timings 
STAT 11311:default:connect 100+100=23 37.70% ****************** 
STAT 11311:default:connect 200+100=30 86.89% ************************ 
STAT 11311:default:connect 300+100=7 98.36% ***** 
STAT 11311:default:connect 400+100=1 100.00% 
STAT 11311:default:connect 500+100=0 100.00% 
STAT 11311:default:reserved  100+100 =6848 1.64% * 
STAT 11311:default:reserved  200+100 =41143 11.49% ******* 
STAT 11311:default:reserved  300+100 =102670 36.08% ******************* 
STAT 11311:default:reserved  400+100 =127986 66.73% ************************ 
STAT 11311:default:reserved  500+100 =53582 79.56% ********** 
STAT 11311:default:reserved  600+100 =31544 87.11% ***** 
STAT 11311:default:reserved  700+100 =13188 90.27% ** 
STAT 11311:default:reserved  800+100 =3630 91.14% 
STAT 11311:default:reserved  900+100 =1629 91.53% 
STAT 11311:default:reserved 1000+100 =1253 91.83% 
STAT 11311:default:reserved 1100+100 =1135 92.10% 
STAT 11311:default:reserved 1200+100 =967 92.33% 
STAT 11311:default:reserved 1300+100 =553 92.46% 
STAT 11311:default:reserved 1400+100 =354 92.55% 
STAT 11311:default:reserved 1500+100 =265 92.61% 
STAT 11311:default:reserved 1600+100 =226 92.67% 
STAT 11311:default:reserved 1700+100 =252 92.73% 
STAT 11311:default:reserved 1800+100 =197 92.77% 
STAT 11311:default:reserved 1900+100 =132 92.81% 
STAT 11311:default:reserved 2000+100 =73  92.82% 
STAT 11311:default:reserved 2100+200 =242 92.88% 
STAT 11311:default:reserved 2300+400 =307 92.96% 
STAT 11311:default:reserved 2700+800 =256 93.02% 
STAT 11311:default:reserved 3500+1600 =244 93.08% 
STAT 11311:default:reserved 5100+3200 =241 93.13% 
STAT 11311:default:reserved 8300+6400 =90  93.15% 
STAT 11311:default:reserved 14700+12800 =13  93.16% 
STAT 11311:default:reserved 27500+25600 =256 93.22% 
STAT 11311:default:reserved 53100+51200 =5  93.22% 
STAT 11311:default:reserved 104300+102400 =17107 97.32% *** 
STAT 11311:default:reserved 206700+204800 =8553 99.36% * 
STAT 11311:default:reserved 411500+409600 =2358 99.93% 
STAT 11311:default:reserved 821100+819200 =255 99.99% 
STAT 11311:default:reserved 1640300+1638400=37 100.00% 
STAT 11311:default:reserved 3278700+3276800=3  100.00% 
STAT 11311:default:reserved 6555500+6553600=0  100.00% 
END 

Membase 통계 :

STAT accepting_conns 1 
STAT auth_cmds 174777 
STAT auth_errors 0 
STAT bucket_active_conns 1 
STAT bucket_conns 304 
STAT bytes_read 6765632401898 
STAT bytes_written 5857686469455 
STAT cas_badval 827124 
STAT cas_hits 116425346 
STAT cas_misses 97243268 
STAT cmd_flush 0 
STAT cmd_get 10962483669 
STAT cmd_set 1406835206 
STAT connection_structures 684 
STAT conn_yields 2180210 
STAT curr_connections 313 
STAT curr_items 7636008 
STAT curr_items_tot 15386636 
STAT daemon_connections 10 
STAT decr_hits 0 
STAT decr_misses 0 
STAT delete_hits 311792 
STAT delete_misses 9 
STAT ep_bg_fetched 0 
STAT ep_commit_num 3320450 
STAT ep_commit_time 0 
STAT ep_commit_time_total 9492892 
STAT ep_data_age 84 
STAT ep_data_age_highwat 1398 
STAT ep_db_cleaner_status complete 
STAT ep_dbinit 2 
STAT ep_dbname /data/membase/data/default-data/default 
STAT ep_dbshards 4 
STAT ep_db_strategy multiMTVBDB 
STAT ep_diskqueue_drain 1647838030 
STAT ep_diskqueue_fill 1576916461 
STAT ep_diskqueue_items 9919 
STAT ep_diskqueue_memory 872872 
STAT ep_diskqueue_pending 13765304 
STAT ep_expired 244175343 
STAT ep_flush_all false 
STAT ep_flush_duration 132 
STAT ep_flush_duration_highwat 768 
STAT ep_flush_duration_total 9517040 
STAT ep_flusher_state running 
STAT ep_flusher_todo 1457 
STAT ep_flush_preempts 0 
STAT ep_io_num_read 3570394 
STAT ep_io_num_write 1430459297 
STAT ep_io_read_bytes 647400207 
STAT ep_io_write_bytes 3645986635220 
STAT ep_item_begin_failed 0 
STAT ep_item_commit_failed 0 
STAT ep_item_flush_expired 195564028 
STAT ep_item_flush_failed 0 
STAT ep_items_rm_from_checkpoints 689034281 
STAT ep_kv_size 5046455536 
STAT ep_latency_arith_cmd 33757761 
STAT ep_latency_get_cmd 18446744071823123379 
STAT ep_latency_store_cmd 1655074486 
STAT ep_max_data_size 10485760000 
STAT ep_max_txn_size 1000 
STAT ep_mem_high_wat 7864320000 
STAT ep_mem_low_wat 6291456000 
STAT ep_min_data_age 0 
STAT ep_num_active_non_resident 0 
STAT ep_num_checkpoint_remover_runs 2386690 
STAT ep_num_eject_failures 0 
STAT ep_num_eject_replicas 0 
STAT ep_num_expiry_pager_runs 3316 
STAT ep_num_non_resident 0 
STAT ep_num_not_my_vbuckets 42220981 
STAT ep_num_pager_runs 0 
STAT ep_num_value_ejects 0 
STAT ep_onlineupdate false 
STAT ep_onlineupdate_revert_add 0 
STAT ep_onlineupdate_revert_delete 0 
STAT ep_onlineupdate_revert_update 0 
STAT ep_oom_errors 0 
STAT ep_overhead 123317215 
STAT ep_pending_ops 0 
STAT ep_pending_ops_max 0 
STAT ep_pending_ops_max_duration 0 
STAT ep_pending_ops_total 0 
STAT ep_queue_age_cap 900 
STAT ep_queue_size 8521 
STAT ep_storage_age 84 
STAT ep_storage_age_highwat 29808 
STAT ep_storage_type featured 
STAT ep_store_max_concurrency 10 
STAT ep_store_max_readers 9 
STAT ep_store_max_readwrite 1 
STAT ep_tap_bg_fetched 0 
STAT ep_tap_bg_fetch_requeued 0 
STAT ep_tap_keepalive 300 
STAT ep_tmp_oom_errors 0 
STAT ep_too_old 1553154 
STAT ep_too_young 0 
STAT ep_total_cache_size 78231164348 
STAT ep_total_del_items 196040440 
STAT ep_total_enqueued 1576916461 
STAT ep_total_new_items 208010695 
STAT ep_total_persisted 1626499737 
STAT ep_uncommitted_items 998 
STAT ep_value_size 3107006691 
STAT ep_vb_total 1024 
STAT ep_vbucket_del 45 
STAT ep_vbucket_del_avg_walltime 3567 
STAT ep_vbucket_del_fail 0 
STAT ep_vbucket_del_max_walltime 22153 
STAT ep_vbucket_del_total_walltime 160519 
STAT ep_version 1.7.1_2_g4648391 
STAT ep_warmed_up 3569370 
STAT ep_warmup_dups 0 
STAT ep_warmup_oom 0 
STAT ep_warmup_thread complete 
STAT ep_warmup_time 1675261980 
STAT ep_warmup true 
STAT get_hits 7075301263 
STAT get_misses 3887182406 
STAT incr_hits 33740914 
STAT incr_misses 14178 
STAT libevent 2.0.11-stable 
STAT limit_maxbytes 67108864 
STAT listen_disabled_num 0 
STAT mem_used 5169772751 
STAT pid 4225 
STAT pointer_size 64 
STAT rejected_conns 0 
STAT rusage_system 274452.680850 
STAT rusage_user 527816.140750 
STAT tap_checkpoint_end_received 10177375 
STAT tap_checkpoint_end_sent 10176851 
STAT tap_checkpoint_start_received 10177881 
STAT tap_checkpoint_start_sent 10177361 
STAT tap_connect_received 11827 
STAT tap_delete_received 101873556 
STAT tap_delete_sent 101820996 
STAT tap_mutation_received 1182343906 
STAT tap_mutation_sent 1194330191 
STAT tap_opaque_received 35863 
STAT tap_opaque_sent 34956 
STAT threads 4 
STAT time 1331297508 
STAT total_connections 174956 
STAT uptime 11938092 
STAT vb_active_curr_items 7636008 
STAT vb_active_eject 0 
STAT vb_active_ht_memory 50561024 
STAT vb_active_itm_memory 2194686368 
STAT vb_active_num 512 
STAT vb_active_num_non_resident 0 
STAT vb_active_ops_create 104054142 
STAT vb_active_ops_delete 98202339 
STAT vb_active_ops_reject 0 
STAT vb_active_ops_update 625882155 
STAT vb_active_perc_mem_resident 100 
STAT vb_active_queue_age 915152000 
STAT vb_active_queue_drain 839960787 
STAT vb_active_queue_fill 839965441 
STAT vb_active_queue_memory 409552 
STAT vb_active_queue_pending 5710163 
STAT vb_active_queue_size 4654 
STAT vb_dead_num 0 
STAT vb_pending_curr_items 0 
STAT vb_pending_eject 0 
STAT vb_pending_ht_memory 0 
STAT vb_pending_itm_memory 0 
STAT vb_pending_num 0 
STAT vb_pending_num_non_resident 0 
STAT vb_pending_ops_create 0 
STAT vb_pending_ops_delete 0 
STAT vb_pending_ops_reject 0 
STAT vb_pending_ops_update 0 
STAT vb_pending_perc_mem_resident 0 
STAT vb_pending_queue_age 0 
STAT vb_pending_queue_drain 0 
STAT vb_pending_queue_fill 0 
STAT vb_pending_queue_memory 0 
STAT vb_pending_queue_pending 0 
STAT vb_pending_queue_size 0 
STAT vb_replica_curr_items 7750628 
STAT vb_replica_eject 0 
STAT vb_replica_ht_memory 50561024 
STAT vb_replica_itm_memory 2212229588 
STAT vb_replica_num 512 
STAT vb_replica_num_non_resident 0 
STAT vb_replica_ops_create 103956553 
STAT vb_replica_ops_delete 97838101 
STAT vb_replica_ops_reject 0 
STAT vb_replica_ops_update 596566447 
STAT vb_replica_perc_mem_resident 100 
STAT vb_replica_queue_age 1028068000 
STAT vb_replica_queue_drain 807877243 
STAT vb_replica_queue_fill 736951020 
STAT vb_replica_queue_memory 463320 
STAT vb_replica_queue_pending 8055141 
STAT vb_replica_queue_size 5265 
STAT version 1.4.4_461_gf99c147 

Membase 통계 (카우치베이스 주식회사 우리가 1.8.0로 업그레이드 때문에)

arith_cmd (974447 total) 
    1us - 2us  : ( 0.00%)  1 
    2us - 4us  : ( 0.01%)  53 
    4us - 8us  : ( 0.03%) 201 
    8us - 16us : ( 1.03%) 9773 # 
    16us - 32us : (92.06%) 887081 ##################################################################################################################################################################################### 
    32us - 64us : (97.53%) 53314 ########## 
    64us - 128us : (99.82%) 22276 #### 
    128us - 256us : (100.00%) 1723 
    256us - 512us : (100.00%)  11 
    512us - 1ms : (100.00%)  9 
    1ms - 2ms  : (100.00%)  5 
data_age (20636834 total) 
    0 - 1s  : ( 0.87%) 179419 # 
    1s - 2s  : ( 6.83%) 1229526 ########### 
    2s - 4s  : (16.33%) 1962033 ################## 
    4s - 7s  : (26.81%) 2161109 #################### 
    7s - 10s  : (33.92%) 1468459 ############## 
    10s - 16s  : (43.93%) 2064852 ################### 
    16s - 23s  : (52.64%) 1798684 ################# 
    23s - 34s  : (63.39%) 2218079 ##################### 
    34s - 49s  : (73.93%) 2174966 #################### 
    49s - 1m  : (84.78%) 2237841 ##################### 
    1m - 1m  : (93.53%) 1807563 ################# 
    1m - 2m  : (97.64%) 848159 ######## 
    2m - 3m  : (98.96%) 271698 ## 
    3m - 4m  : (99.67%) 145530 # 
    4m - 6m  : (99.91%) 50132 
    6m - 9m  : (99.95%) 8945 
    9m - 12m  : (100.00%) 8831 
    12m - 17m  : (100.00%) 1008 
disk_commit (29439 total) 
    0 - 1s  : (27.51%) 8098 ####################################################### 
    1s - 2s  : (28.23%) 214 # 
    2s - 4s  : (43.74%) 4565 ############################### 
    4s - 7s  : (62.51%) 5524 ##################################### 
    7s - 10s  : (83.00%) 6032 ######################################### 
    10s - 16s  : (96.00%) 3827 ########################## 
    16s - 23s  : (99.78%) 1115 ####### 
    23s - 34s  : (100.00%) 63 
    34s - 49s  : (100.00%) 1 
disk_del (3864702 total) 
    1us - 2us  : ( 0.00%)  2 
    2us - 4us  : (23.32%) 901064 ############################################## 
    4us - 8us  : (68.06%) 1729127 ######################################################################################## 
    8us - 16us : (95.18%) 1048267 ##################################################### 
    16us - 32us : (98.97%) 146376 ####### 
    32us - 64us : (99.34%) 14471 
    64us - 128us : (99.98%) 24694 # 
    128us - 256us : (100.00%)  615 
    256us - 512us : (100.00%)  34 
    512us - 1ms : (100.00%)  27 
    1ms - 2ms  : (100.00%)  24 
    2ms - 4ms  : (100.00%)  1 
disk_insert (3720767 total) 
    2us - 4us  : (16.68%) 620801 ################################# 
    4us - 8us  : (46.24%) 1099803 ########################################################## 
    8us - 16us : (94.96%) 1812778 ################################################################################################ 
    16us - 32us : (98.76%) 141193 ####### 
    32us - 64us : (99.19%) 15989 
    64us - 128us : (99.98%) 29502 # 
    128us - 256us : (100.00%)  619 
    256us - 512us : (100.00%)  34 
    512us - 1ms : (100.00%)  31 
    1ms - 2ms  : (100.00%)  16 
    2ms - 4ms  : (100.00%)  1 
disk_invalid_item_del (1 total) 
    1us - 2us  : (100.00%) 1 ############################################################################################################################################################################################################ 
disk_invalid_vbtable_del (294 total) 
    512us - 1ms : (81.29%) 239 #################################################################################################################################################################### 
    1ms - 2ms  : (100.00%) 55 ##################################### 
disk_update (16891367 total) 
    2us - 4us  : ( 0.00%)  617 
    4us - 8us  : (35.44%) 5985479 ###################################################################### 
    8us - 16us : (88.98%) 9043092 ########################################################################################################## 
    16us - 32us : (95.49%) 1099896 ############ 
    32us - 64us : (97.40%) 323792 ### 
    64us - 128us : (99.49%) 351597 #### 
    128us - 256us : (99.96%) 79771 
    256us - 512us : (100.00%) 6809 
    512us - 1ms : (100.00%)  205 
    1ms - 2ms  : (100.00%)  108 
    2ms - 4ms  : (100.00%)  1 
get_cmd (271990315 total) 
    0 - 1us  : ( 0.00%)  3098 
    1us - 2us  : ( 0.42%) 1134954 
    2us - 4us  : (26.85%) 71886528 ################################################### 
    4us - 8us  : (96.55%) 189592751 ######################################################################################################################################## 
    8us - 16us : (99.21%) 7231257 ##### 
    16us - 32us : (99.54%) 885000 
    32us - 64us : (99.56%)  67815 
    64us - 128us : (100.00%) 1177254 
    128us - 256us : (100.00%)  8664 
    256us - 512us : (100.00%)  1087 
    512us - 1ms : (100.00%)  1486 
    1ms - 2ms  : (100.00%)  402 
    2ms - 4ms  : (100.00%)  15 
    4ms - 8ms  : (100.00%)   2 
    8ms - 16ms : (100.00%)   2 
set_vb_cmd (7680 total) 
    0 - 1us  : ( 0.08%) 6 
    1us - 2us  : (11.60%) 885 ####################### 
    2us - 4us  : (90.59%) 6066 ############################################################################################################################################################## 
    4us - 8us  : (98.80%) 631 ################ 
    8us - 16us : (99.67%) 67 # 
    16us - 32us : (99.82%) 11 
    64us - 128us : (99.96%) 11 
    128us - 256us : (99.99%) 2 
    512us - 1ms : (100.00%) 1 
storage_age (20636834 total) 
    0 - 1s  : ( 0.39%) 79479 
    1s - 2s  : ( 1.73%) 278232 ## 
    2s - 4s  : ( 2.91%) 242634 ## 
    4s - 7s  : ( 4.18%) 263057 ## 
    7s - 10s  : ( 5.25%) 219060 ## 
    10s - 16s  : ( 7.25%) 412722 ### 
    16s - 23s  : ( 9.97%) 562678 ##### 
    23s - 34s  : (15.01%) 1039406 ######### 
    34s - 49s  : (22.71%) 1589602 ############### 
    49s - 1m  : (32.58%) 2036916 ################### 
    1m - 1m  : (43.13%) 2176751 #################### 
    1m - 2m  : (51.79%) 1786380 ################# 
    2m - 3m  : (59.80%) 1653388 ############### 
    3m - 4m  : (69.64%) 2031245 ################### 
    4m - 6m  : (81.27%) 2399569 ####################### 
    6m - 9m  : (94.47%) 2724318 ########################## 
    9m - 12m  : (99.52%) 1042119 ######### 
    12m - 17m  : (99.73%) 44332 
    17m - 24m  : (99.90%) 34353 
    24m - 34m  : (100.00%) 20552 
    34m - 48m  : (100.00%)  41 
store_cmd (32950985 total) 
    1us - 2us  : ( 0.00%)  296 
    2us - 4us  : ( 0.25%) 83421 
    4us - 8us  : ( 2.63%) 782380 #### 
    8us - 16us : (91.01%) 29122181 ############################################################################################################################################################################## 
    16us - 32us : (98.48%) 2461695 ############## 
    32us - 64us : (98.77%) 95381 
    64us - 128us : (99.96%) 394021 ## 
    128us - 256us : (100.00%) 10725 
    256us - 512us : (100.00%)  538 
    512us - 1ms : (100.00%)  260 
    1ms - 2ms  : (100.00%)  77 
    2ms - 4ms  : (100.00%)  9 
    8ms - 16ms : (100.00%)  1 
tap_mutation (15722391 total) 
    1us - 2us  : ( 0.00%)  4 
    2us - 4us  : ( 0.18%) 28274 
    4us - 8us  : (34.16%) 5342711 ################################################################### 
    8us - 16us : (92.32%) 9144157 ################################################################################################################### 
    16us - 32us : (98.76%) 1012068 ############ 
    32us - 64us : (99.04%) 44704 
    64us - 128us : (99.98%) 147357 # 
    128us - 256us : (100.00%) 2927 
    256us - 512us : (100.00%)  93 
    512us - 1ms : (100.00%)  73 
    1ms - 2ms  : (100.00%)  20 
    2ms - 4ms  : (100.00%)  3 
+0

확실한가? membase의 버그입니까? 네트워크에서이 트래픽에 대한 책임이 있습니다. 일부 추가 트래픽 및/또는 잘못된 설정이있는 것 같습니다 ... – Yahia

+0

우리는 1Gb 이더넷을 보유하고 있으며 트래픽은 다음과 같습니다. 15-20Mb. 네트워크 괜찮아. 또한 문제가 발생하면 클러스터 서버간에 복제 문제가 발생하지만 정상적으로 작동합니다. –

+0

mbstats를 사용하여 서버 중 하나에서 타이밍 통계를 얻을 수 있습니까? 예. (/ opt/membase/bin/mbstats localhost : 11210 타이밍) – mikewied

답변

0

문제가 발견되었습니다.

문제가 Coushbase/Membase 클러스터 목록에 있습니다. 우리 클러스터에는 로컬 (1Tbit) 및 글로벌 (100Mbit)의 두 가지 네트워크 인터페이스가 있습니다. 모든 백엔드 서버에는 Couchbase에 연결하는 데 사용되는 로컬 Moxi 프록시가 있습니다. Moxi 클러스터 구성에서는 Couchbase 서버에 로컬 IP가 있지만 Moxi가 Couchbase에 연결하면 서버 목록 Couchbase가 글로벌 IP를 반환하고 글로벌 네트워크 인터페이스가 많이로드되어 응답 시간에 문제가 발생합니다.

그래서 수정의 몇 가지 변종이있다 : 프록시없이

나는 카우치베이스 주식회사 생각은 인터페이스 연결을 분리, 또는 더 나은 것은이 동작을 구성 할 필요가있다. 또한 명확하지 않기 때문에이 동작을 워드 프로세서에 추가해야합니다.

관련 문제