YCSB low read throughput with Cassandra

The YCSB Endpoint benchmark would have you believe that Cassandra is the golden child of NoSQL databases. However, when recreating the results on our own boxes (8 cores with hyperthreading, 60 GB memory, two 500 GB SSDs), we are getting dismal read throughput for workload b (95% reads, 5% updates).

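The cassandra.yaml settings are exactly the same as the Endpoint settings, apart from the IP addresses and our disk configuration (one SSD for data, one for the commit log). Their throughput is ~38,000 operations per second, while ours is ~16,000 more or less regardless of the number of threads or client nodes. I.e. one worker node with 256 threads reports ~16,000 ops/sec, while 4 nodes each report ~4,000 ops/sec.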
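As a rough sanity check on those numbers, here is a minimal sketch of what they imply for per-request latency. It assumes the YCSB threads run closed-loop (each thread issues one request at a time and no `target` throttle is set), so throughput is roughly threads divided by mean latency. The function name is made up for illustration.

```python
# Closed-loop load generator: each thread has one request in flight,
# so throughput = threads / mean latency (Little's law).
def implied_latency_ms(threads: int, ops_per_sec: float) -> float:
    """Mean per-operation latency implied by a closed-loop client."""
    return threads / ops_per_sec * 1000.0


# One worker node with 256 threads reporting ~16,000 ops/sec.
one_client_ms = implied_latency_ms(threads=256, ops_per_sec=16_000)
print(f"1 client, 256 threads: {one_client_ms:.1f} ms per op")

# Four worker nodes at ~4,000 ops/sec each sum to the same cluster total.
aggregate_ops = 4 * 4_000
print(f"4 clients, aggregate: {aggregate_ops} ops/sec")
```

If the aggregate stays flat while clients are added, the limit is more likely on the server side (or in a shared resource) than in any single client.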

I've set the readahead value to 8 KB for the SSD data drive. The custom workload file is included below.

When analyzing disk I/O and CPU usage with iostat, the read throughput is consistently ~200,000 KB/s, which seems to suggest that the YCSB cluster throughput should be higher (records are 100 bytes). About 25-30% of CPU time is spent in %iowait, and 10-25% in user.
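To put the iostat figure next to the YCSB figure, here is a back-of-the-envelope sketch of how many bytes are read from disk per operation. It assumes 1 KB = 1024 bytes, treats every operation as a read that goes to disk (it ignores the 5% updates and any cache hits), and takes the record size as fieldcount times fieldlength from the workload file.

```python
DISK_READ_KB_PER_SEC = 200_000  # iostat read throughput from the question
OPS_PER_SEC = 16_000            # observed YCSB throughput
RECORD_BYTES = 10 * 10          # fieldcount * fieldlength


def bytes_read_per_op(disk_kb_per_sec: float, ops_per_sec: float,
                      bytes_per_kb: int = 1024) -> float:
    """Average bytes read from disk for each client operation."""
    return disk_kb_per_sec * bytes_per_kb / ops_per_sec


per_op = bytes_read_per_op(DISK_READ_KB_PER_SEC, OPS_PER_SEC)
amplification = per_op / RECORD_BYTES

print(f"disk bytes per op: {per_op:.0f}")
print(f"read amplification vs. record size: {amplification:.0f}x")
```

Under these assumptions each 100-byte record costs on the order of 12 KB of disk reads, so high disk throughput does not by itself mean the cluster has headroom in ops/sec.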

top and nload show no obvious bottleneck (memory is not exhausted, and network traffic is 10-50 Mbit/s on a 10 Gb/s link).
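The nload reading can be cross-checked against the reported throughput. This is a minimal sketch that counts only record payload (no keys, CQL framing, or TCP overhead), so the real traffic should be somewhat higher than the figure it prints.

```python
def payload_mbits_per_sec(ops_per_sec: float, record_bytes: int) -> float:
    """Record payload moved per second, in megabits (10^6 bits)."""
    return ops_per_sec * record_bytes * 8 / 1_000_000


payload = payload_mbits_per_sec(ops_per_sec=16_000, record_bytes=100)

# Highest observed traffic (50 Mbit/s) as a fraction of the 10 Gb/s link.
link_fraction = 50 / 10_000

print(f"payload: {payload:.1f} Mbit/s")
print(f"peak link utilisation: {link_fraction:.1%}")
```

The payload estimate lands in the same range as the observed traffic, which is consistent with the network being far from saturated.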

# The name of the workload class to use 
workload=com.yahoo.ycsb.workloads.CoreWorkload 

# There is no default setting for recordcount but it is 
# required to be set. 
# The number of records in the table to be inserted in 
# the load phase or the number of records already in the 
# table before the run phase. 
recordcount=2000000000 

# There is no default setting for operationcount but it is 
# required to be set. 
# The number of operations to use during the run phase. 
operationcount=9000000 

# The offset of the first insertion 
insertstart=0 
insertcount=500000000 

core_workload_insertion_retry_limit = 10 
core_workload_insertion_retry_interval = 1 

# The number of fields in a record 
fieldcount=10 

# The size of each field (in bytes) 
fieldlength=10 

# Should read all fields 
readallfields=true 

# Should write all fields on update 
writeallfields=false 

fieldlengthdistribution=constant 

readproportion=0.95 

updateproportion=0.05 

insertproportion=0 

readmodifywriteproportion=0 

scanproportion=0 

maxscanlength=1000 

scanlengthdistribution=uniform 

insertorder=hashed 

requestdistribution=zipfian 
hotspotdatafraction=0.2 

hotspotopnfraction=0.8 
table=usertable 

measurementtype=histogram 

histogram.buckets=1000 
timeseries.granularity=1000

2016-08-03 Rdesmond

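Throughput doesn't scale at all with the number of clients. With 1 client we get a set amount of ops/sec. When running 2 ycsb clients in parallel, each gets throughput/2. With 4 clients, each gets throughput/4. We have set concurrent reads to 1024, and core and max connections on the ycsb client to 1024. nload shows no change in network traffic when adding clients (we would expect 2x the traffic when going from 1 client to 2). Traffic is ~20 Mbit/s, so the network is not the bottleneck. – Rdesmond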
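For reference, a client invocation with those connection settings might be assembled as in the sketch below. Several things here are assumptions rather than details from this thread: the binding name `cassandra-cql` (it differs between YCSB releases), the property names `cassandra.coreconnections` and `cassandra.maxconnections` as documented for the YCSB Cassandra binding, and the workload path and host list, which are placeholders.

```python
import shlex


def ycsb_run_command(workload_file: str, hosts: str,
                     threads: int, connections: int) -> list[str]:
    """Build an argv list for a YCSB run against Cassandra (illustrative)."""
    props = {
        "hosts": hosts,
        # Assumed property names from the YCSB Cassandra binding docs.
        "cassandra.coreconnections": connections,
        "cassandra.maxconnections": connections,
    }
    argv = ["bin/ycsb", "run", "cassandra-cql",
            "-P", workload_file, "-threads", str(threads), "-s"]
    for key, value in props.items():
        argv += ["-p", f"{key}={value}"]
    return argv


# Placeholder workload path and host addresses.
cmd = ycsb_run_command("workloads/custom_workload_b",
                       "10.0.0.1,10.0.0.2", threads=256, connections=1024)
print(shlex.join(cmd))
```

Building the command as a list keeps each `-p key=value` pair as a single argument, which avoids quoting problems if it is later passed to `subprocess.run`.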

Compressing the SSTables has no effect on performance. – Rdesmond

You need to be more specific about what you are asking. – cabad

The key was increasing native_transport_max_threads in the cassandra.yaml file.

Along with the increased settings from the comment (more connections in the ycsb client, and higher concurrent reads/writes in Cassandra), Cassandra jumped to ~80,000 ops/sec.
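A quick way to see which of these settings are actually active on a node is to scan cassandra.yaml for them. The sketch below is a simple line-based check, not a full YAML parser, and the sample text and its values are made up for illustration. A key that is absent or commented out is reported as `None`, meaning Cassandra falls back to its built-in default.

```python
import re

KEYS = ("native_transport_max_threads", "concurrent_reads", "concurrent_writes")


def read_settings(yaml_text: str, keys=KEYS) -> dict:
    """Return the uncommented top-level value for each key, or None."""
    found = {key: None for key in keys}
    for line in yaml_text.splitlines():
        # Only match keys at the start of the line; commented lines are skipped.
        match = re.match(r"^([A-Za-z_]+):\s*(\S+)", line)
        if match and match.group(1) in found:
            found[match.group(1)] = match.group(2)
    return found


# Hypothetical excerpt; real files are much longer.
sample = """\
concurrent_reads: 1024
concurrent_writes: 1024
# native_transport_max_threads: 128
"""

settings = read_settings(sample)
for key, value in settings.items():
    print(f"{key}: {value if value is not None else 'not set (default applies)'}")
```

In this sample the native transport setting is still commented out, which is the situation the answer describes as the limiting factor.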

2016-08-09 17:42:40 Rdesmond
