What is the select distribution ratio under insert distributions in the cassandra-stress tool?


Select distribution ratio: the ratio of rows each partition should insert as a proportion of the total possible rows for the partition (as defined by the clustering distribution columns). Default: fixed(1)/1

Can anyone explain what this means? And why is it called the select distribution ratio when it sits under the insert distribution?

http://www.datastax.com/dev/blog/improved-cassandra-2-1-stress-tool-benchmark-any-schema

In Cassandra, data is assigned to a given node by its partition key, and is stored sorted on disk by clustering key within each partition.

The 'distribution ratio' allows you to define:

1) how many rows the stress tool will create in each partition,

2) how many rows the stress tool will read from each partition (they'll be ordered, so it's fast to grab more than one)

In the case of fixed(), it means each partition will have a fixed number of rows. If you choose one of the other options, you'll end up with a variable number of rows.
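As a rough illustration, the insert section of a stress profile might look like the sketch below. The numbers are made up, and the comments paraphrase the documentation quoted in the question rather than describe the tool's internals.

```yaml
# Sketch of the insert section of a cassandra-stress user profile.
# All values are illustrative, not recommendations.
insert:
  partitions: fixed(1)          # partitions touched by each insert operation
  select: fixed(1)/1            # default: insert all of the partition's possible rows
  # select: fixed(1)/10         # alternative: always insert one tenth of the possible rows
  # select: uniform(1..10)/10   # alternative: a varying share, from 1/10 up to 10/10
  batchtype: UNLOGGED
```

Only one select line should be active at a time; the commented-out lines show how a non-fixed distribution leads to a variable number of rows.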

Edit to explain multiple rows per partition:

For example, say you had a data model that gathered weather information for different cities:

create table sensor_readings ( station_id text, weather_time timestamp, temperature int, humidity int, primary key(station_id, weather_time));  

In this case, you have multiple rows (one for each weather_time) in each partition (station_id). You can query all the sensor readings for a given station_id, or you can query one specific weather_time. The distribution ratio controls how many weather_times you'll have per station_id.
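A stress profile for this table could be sketched roughly as follows. The keyspace name weather, the query names by_station and by_reading, and every distribution value are assumptions chosen for illustration; only the table itself comes from the example above.

```yaml
# Hypothetical cassandra-stress profile for the sensor_readings example.
keyspace: weather                      # assumed keyspace name
keyspace_definition: |
  CREATE KEYSPACE weather WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

table: sensor_readings
table_definition: |
  CREATE TABLE sensor_readings (
    station_id text,
    weather_time timestamp,
    temperature int,
    humidity int,
    PRIMARY KEY (station_id, weather_time)
  );

columnspec:
  - name: station_id
    size: fixed(10)                    # length of each generated station_id
    population: uniform(1..1000)       # about 1000 distinct stations (partitions)
  - name: weather_time
    cluster: fixed(100)                # 100 possible weather_time rows per station_id

queries:
  by_station:                          # all readings for one station
    cql: select * from sensor_readings where station_id = ?
    fields: samerow
  by_reading:                          # one specific reading
    cql: select * from sensor_readings where station_id = ? and weather_time = ?
    fields: samerow
```

With no insert section, the default select ratio of fixed(1)/1 applies, so each insert would cover all 100 possible rows of the chosen station. Saved as, say, sensor.yaml, a profile like this would typically be run with something along the lines of cassandra-stress user profile=sensor.yaml ops\(insert=1,by_station=1\).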

