Apache Cassandra Notes
Russell Bateman
It turns out that Cassandra was so named as an allusion to a curse on an oracle, the pun being aimed at the software giant Oracle's famous RDBMS.
The actual history of the prophetess is rather murky and sordid coming as it does from myriad sources and inspirations. The synthesis I'm used to is that she was given the prophetic ability by Apollo in a hoped-for exchange of her womanly pleasures, but she backed out at the last minute whereupon the god spat in her mouth condemning her always to prophesy and never be believed. She was simply thought to be mad.
So it is that she famously warned against bringing the Achaean offering left behind into the gates of Troy. Despite her being the first-family daughter of Priam and Hecuba, her warning was ignored, leading to the well-known city's infamous downfall.
Cassandra's strengths are:
Down-sides, if important to you:
Test first, code second, that's the order...
Here's what I'm using in pom.xml:
<properties>
  <cassandra.version>3.3.0</cassandra.version>
  <cassandra-unit.version>3.1.3.2</cassandra-unit.version>
  <slf4j.version>1.7.25</slf4j.version>
</properties>

<dependencies>
  <dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>${cassandra.version}</version>
  </dependency>
  <dependency>
    <groupId>org.cassandraunit</groupId>
    <artifactId>cassandra-unit</artifactId>
    <version>${cassandra-unit.version}</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>${slf4j.version}</version>
  </dependency>
</dependencies>
It's notoriously difficult to unit-test code that calls into a database's APIs. Cassandra (by way of cassandra-unit) provides an embedded, stand-alone database; calling it isn't like calling a real instance in that there's no need to set up a local instance, let alone separate cluster-node instances.
Here is a simple test to see if this embedded Cassandra will start up. It does nothing except demonstrate that Cassandra's unit-testing helper will work.
package com.etretatlogiciels.cassandra;

import java.io.IOException;

import org.junit.BeforeClass;
import org.junit.Test;

import org.apache.cassandra.exceptions.ConfigurationException;
import org.apache.thrift.transport.TTransportException;
import org.cassandraunit.utils.EmbeddedCassandraServerHelper;

/**
 * To run this, you must add a Run/Debug Configuration in the form
 * of an Environment Variable:
 *
 *   LD_LIBRARY_PATH=/home/russ/dev/cassandra/target/classes
 *
 * This is so that libsigar-amd64-linux.so can be found and loaded
 * by the Cassandra code.
 */
public class CassandraExampleTest
{
  @BeforeClass
  public static void startCassandra()
      throws TTransportException, IOException, InterruptedException, ConfigurationException
  {
    EmbeddedCassandraServerHelper.startEmbeddedCassandra( "another-cassandra.yaml", 20000 );
  }

  @Test
  public void test()
  {
    System.out.println( "This is a test!" );
  }
}
<properties>
  <cassandra.version>3.3.0</cassandra.version>
  <cassandra-unit.version>3.1.3.2</cassandra-unit.version>
  <slf4j.version>1.7.25</slf4j.version>
</properties>

<dependencies>
  <dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>${cassandra.version}</version>
  </dependency>
  <dependency>
    <groupId>org.cassandraunit</groupId>
    <artifactId>cassandra-unit</artifactId>
    <version>${cassandra-unit.version}</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>${slf4j.version}</version>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-simple</artifactId>
    <version>${slf4j.version}</version>
  </dependency>
</dependencies>
cluster_name: 'Test Cluster'
hints_directory: target/embeddedCassandra/hints
cdc_raw_directory: target/embeddedCassandra/data/cdc_raw
hinted_handoff_enabled: true
max_hint_window_in_ms: 10800000 # 3 hours
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
authenticator: AllowAllAuthenticator
authorizer: AllowAllAuthorizer
permissions_validity_in_ms: 2000
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
# directories where Cassandra should store data on disk.
data_file_directories:
commitlog_directory: target/embeddedCassandra/commitlog
disk_failure_policy: stop
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
saved_caches_directory: target/embeddedCassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "127.0.0.1"
concurrent_reads: 32
concurrent_writes: 32
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7010
ssl_storage_port: 7011
listen_address: 127.0.0.1
start_native_transport: true
native_transport_port: 9152
start_rpc: true
rpc_address: localhost
rpc_port: 9175
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
thrift_max_message_length_in_mb: 16
incremental_backups: false
snapshot_before_compaction: false
auto_snapshot: false
column_index_size_in_kb: 64
compaction_throughput_mb_per_sec: 16
read_request_timeout_in_ms: 5000
range_request_timeout_in_ms: 10000
write_request_timeout_in_ms: 2000
cas_contention_timeout_in_ms: 1000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 10000
cross_node_timeout: false
endpoint_snitch: SimpleSnitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
index_interval: 128
encryption_options:
    internode_encryption: none
    keystore: conf/.keystore
    keystore_password: cassandra
    truststore: conf/.truststore
    truststore_password: cassandra
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;
import com.datastax.driver.core.Metadata;
import com.datastax.driver.core.Session;

public class CassandraConnector
{
  private Cluster cluster;
  private Session session;

  public void connect( final String node, final int port )
  {
    cluster = Cluster.builder()
                     .addContactPoint( node )
                     .withPort( port )
                     .build();

    // report what the driver discovered about the cluster...
    Metadata metadata = cluster.getMetadata();
    System.out.println( String.format( "Connected to cluster: %s", metadata.getClusterName() ) );

    for( Host host : metadata.getAllHosts() )
    {
      System.out.println( String.format( "Datacenter: %s, Host: %s, Rack: %s",
                          host.getDatacenter(), host.getAddress(), host.getRack() ) );
    }

    session = cluster.connect();
  }

  public Session getSession() { return session; }

  public void close() { cluster.close(); }
}
Nothing too surprising here...
ascii   | counter | float  | list     | text      | tinyint | varint
bigint  | date    | frozen | map      | time      | tuple   |
blob    | decimal | inet   | set      | timestamp | uuid    |
boolean | double  | int    | smallint | timeuuid  | varchar |
Assuming I'll ever need to do so, here's a Java enumeration for internal use. However, this is really code written too early and may not be of much use ultimately.
import java.util.ArrayList;
import java.util.List;

public enum CassandraType
{
  c_text,      // UTF-8 encoded string
  c_ascii,     // US_ASCII 7-bit
  c_varchar,   // UTF-8 encoded string
  c_int,       // 32-bit signed
  c_bigint,    // 64-bit signed
  c_smallint,  // 2-byte signed
  c_tinyint,   // 1-byte signed
  c_varint,    // arbitrary-precision
  c_decimal,   // variable-precision
  c_float,     // 32-bit IEEE-754
  c_double,    // 64-bit IEEE-754
  c_boolean,   // true/false
  c_counter,   // distributed, 64-bit
  c_date,      // 32-bit day since Epoch
  c_time,      // 64-bit nanoseconds since midnight
  c_timestamp, // 8 bytes since Epoch; date and time with millisecond precision
  c_timeuuid,  // version 1 (time-based) UUID
  c_inet,      // IPv4 or IPv6
  c_tuple,     // 2-3 fields
  c_uuid,      // 128-bit globally unique identifier
  c_list,      // collection of 1+ elements (performance impact)
  c_map,       // JSON-style array of literals
  c_set,       // collection of 1+ literal elements
  c_blob,      // arbitrary bytes (no validation), in hexadecimal
  c_frozen,    // multiple types in single value, treated as blob
  ;

  /**
   * Useful to determine whether potential enum type,
   * in string form, is a Cassandra type.
   */
  public static boolean contains( String type )
  {
    try
    {
      CassandraType.valueOf( type );
      return true;
    }
    catch( IllegalArgumentException e )
    {
      return false;
    }
  }

  /**
   * Converts a type name, with or without the "c_" prefix, to its
   * enum value; returns null if it isn't a Cassandra type.
   */
  public static CassandraType stringToCassandraType( String string )
  {
    if( string == null )
      return null;

    // valueOf() never returns null; it throws when nothing matches,
    // so try the prefixed form first, then the string as given...
    try
    {
      return CassandraType.valueOf( "c_" + string );
    }
    catch( IllegalArgumentException e )
    {
      // fall through to the unprefixed form
    }

    try
    {
      return CassandraType.valueOf( string );
    }
    catch( IllegalArgumentException e )
    {
      return null;
    }
  }

  /**
   * Useful to return a list of Cassandra types.
   */
  public static List< String > getCassandraTypes()
  {
    List< String > list = new ArrayList<>( CassandraType.values().length );

    for( CassandraType type : CassandraType.values() )
      list.add( type.name() );

    return list;
  }
}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Cluster.Builder;
import com.datastax.driver.core.Session;

public class CassandraConnector
{
  private Cluster cluster;
  private Session session;

  public void connect( String node, Integer port )
  {
    Builder b = Cluster.builder().addContactPoint( node );

    // a null port means "use the driver's default"
    if( port != null )
      b.withPort( port );

    cluster = b.build();
    session = cluster.connect();
  }

  public Session getSession() { return this.session; }

  public void close()
  {
    session.close();
    cluster.close();
  }
}
In Cassandra, there's something called a keyspace. This is a little like the schema in a relational context. Remember, Cassandra isn't a document database like MongoDB, but a columnar (wide-column) database. The keyspace is the outermost container for data in Cassandra. The main attributes to set per keyspace are the...
Another important notion in Cassandra is the column, a data structure that contains a column name, value and timestamp. The columns and the number of columns in each row may vary, in contrast with the contents of a relational database table where data are rigidly structured.
For the example I'm studying, the keyspace to create is "library":
public void createKeyspace( String keyspaceName, String replicationStrategy, int replicationFactor )
{
  StringBuilder sb = new StringBuilder();

  sb.append( "CREATE KEYSPACE IF NOT EXISTS ")
    .append( keyspaceName )
    .append( " WITH replication = {" )
    .append( "'class':'" )
    .append( replicationStrategy )
    .append( "','replication_factor':" )
    .append( replicationFactor )
    .append( "};" );

  String query = sb.toString();
  session.execute( query );
}
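To see what statement that method sends, here is a dependency-free sketch that assembles the same CQL text as createKeyspace() but returns it instead of executing it. The class name KeyspaceCql is made up for illustration, and "SimpleStrategy" with a replication factor of 1 are assumed values for the "library" keyspace (the notes don't say which were used).

```java
// A sketch only: builds the CQL string that createKeyspace() would execute,
// without needing the DataStax driver or a running Cassandra.
final class KeyspaceCql
{
  static String build( String keyspaceName, String replicationStrategy, int replicationFactor )
  {
    return "CREATE KEYSPACE IF NOT EXISTS " + keyspaceName
         + " WITH replication = {'class':'" + replicationStrategy
         + "','replication_factor':" + replicationFactor + "};";
  }

  public static void main( String[] args )
  {
    // "SimpleStrategy" and 1 are assumptions suitable for a single-node development box
    System.out.println( build( "library", "SimpleStrategy", 1 ) );
  }
}
```

Because the keyspace name and strategy are concatenated straight into the statement, they should come from trusted code and not from user input.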
I reposted to the Cassandra users' forum asking for a reply so that I know my posts are even getting there. I finally got an answer back, but the suggestion was just a pile of code that merged unit testing and production together without ultimately providing a solution around the problem I'm having:
Exception (java.lang.ExceptionInInitializerError) encountered during startup: null
java.lang.ExceptionInInitializerError
    at org.apache.cassandra.transport.Server.start(Server.java:128)
    at java.util.Collections$SingletonSet.forEach(Collections.java:4767)
    at org.apache.cassandra.service.NativeTransportService.start(NativeTransportService.java:128)
    at org.apache.cassandra.service.CassandraDaemon.startNativeTransport(CassandraDaemon.java:649)
    at org.apache.cassandra.service.CassandraDaemon.start(CassandraDaemon.java:511)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:616)
    at org.cassandraunit.utils.EmbeddedCassandraServerHelper$1.run(EmbeddedCassandraServerHelper.java:129)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException: name
    at io.netty.util.internal.logging.AbstractInternalLogger.<init>(AbstractInternalLogger.java:39)
    at io.netty.util.internal.logging.Slf4JLogger.<init>(Slf4JLogger.java:30)
    at io.netty.util.internal.logging.Slf4JLoggerFactory.newInstance(Slf4JLoggerFactory.java:73)
    at io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:84)
    at io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:77)
    at io.netty.bootstrap.ServerBootstrap.<clinit>(ServerBootstrap.java:46)
    ... 10 more
I've since read other attempts to explain using this helper class, but no matter how hard I've tried, I keep coming back to the error above. I worried originally that the error was saying that I had done something stupid, but I don't believe that now. It means that I don't know how to start the Cassandra unit-test helper up. The articles I've read all assert that I need only call it:
EmbeddedCassandraServerHelper.startEmbeddedCassandra();
...but this is not true. I've tried to supply a YAML file and have done so, I think; the one I'm using came from step 2 in this article. Though the file is required (and not all the authors allude to it), it doesn't work the magic. I've also added log4j-embedded-cassandra.properties to no avail.
I bottled up some simple test code from Testing Cassandra repositorys using Cassandra Unit. I didn't use the Spring Boot code, but just the basic Java code. It worked; it's the early project above. This means there's some crapola going on, likely slf4j in my greater nifi-pipeline project.
This Cassandra unit-test stuff works. Sadly, the thrust of the tutorial is Spring Boot, and the useful code is overly infected by it and therefore pretty useless when there are other tutorials around.
CREATE KEYSPACE MyKeySpace
  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

USE MyKeySpace;

CREATE COLUMNFAMILY MyColumns ( id text, Last text, First text, PRIMARY KEY( id ) );

INSERT INTO MyColumns ( id, Last, First ) VALUES ( '1', 'Doe', 'John' );

SELECT * FROM MyColumns;
CREATE KEYSPACE mainspace
  WITH replication = { 'class': 'NetworkTopologyStrategy', 'dc1': '2' }
  AND durable_writes = true;

CREATE TABLE mainspace.record
(
  mpid          bigint,
  dateofservice timestamp,
  uri           text,
  data          text,
  PRIMARY KEY ( mpid, dateofservice, uri )
)
WITH CLUSTERING ORDER BY ( dateofservice DESC, uri ASC )
  AND bloom_filter_fp_chance = 0.01
  AND caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
  AND comment = ''
  AND compaction = { 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4' }
  AND compression = { 'enabled': 'false' }
  AND crc_check_chance = 1.0
  AND dclocal_read_repair_chance = 0.1
  AND default_time_to_live = 0
  AND gc_grace_seconds = 864000
  AND max_index_interval = 2048
  AND memtable_flush_period_in_ms = 0
  AND min_index_interval = 128
  AND read_repair_chance = 0.0
  AND speculative_retry = '99PERCENTILE';
Setting up Cassandra as local to my development host:
https://www.tutorialspoint.com/cassandra/cassandra_installation.htm (following along with this)
http://cassandra.apache.org/download/ (Browser download to ~/dev/cassandra)

~/dev/cassandra $ tar -zxf apache-cassandra-3.11.0-bin.tar.gz
~/dev/cassandra $ ll
total 37060
drwxr-xr-x   3 russ russ     4096 Aug 28 12:59 .
drwxrwxr-x. 96 russ russ     4096 Aug 28 12:58 ..
drwxr-xr-x  10 russ russ     4096 Aug 28 12:59 apache-cassandra-3.11.0
-rw-rw-r--   1 russ russ 37929669 Aug 28 12:58 apache-cassandra-3.11.0-bin.tar.gz
~/dev/cassandra/apache-cassandra-3.11.0/bin $ gvim cassandra.yaml
(insert https://svn.apache.org/repos/asf/cassandra/trunk/conf/cassandra.yaml)
export CASSANDRA_HOME=~/dev/cassandra/apache-cassandra-3.11.0
export PATH=$PATH:$CASSANDRA_HOME/bin
~/dev/cassandra/apache-cassandra-3.11.0/bin $ sudo bash
[root@localhost bin]# mkdir -p /var/lib/cassandra/data
[root@localhost bin]# mkdir -p /var/lib/cassandra/commitlog
[root@localhost bin]# mkdir -p /var/lib/cassandra/saved_caches
[root@localhost bin]# mkdir -p /var/log/cassandra
[root@localhost bin]# chmod 777 /var/lib/cassandra/
[root@localhost bin]# chmod 777 /var/log/cassandra/
~/dev/cassandra/apache-cassandra-3.11.0 $ ./bin/cassandra -f
(lots of fun stuff...)

(output from my Cassandra connector test code...)
Connected to cluster: Test Cluster
Datacenter: datacenter1, Host: /127.0.0.1, Rack: rack1
I want to connect to Cassandra, and do some stuff like use prepared statements. Here's my Cassandra code...
package com.etretatlogiciels.cassandra;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Metadata;
import com.datastax.driver.core.Session;

public class CassandraConnector
{
  private Cluster cluster;
  private Session session;

  public void connect( final String node, final int port )
  {
    cluster = Cluster.builder()
                     .addContactPoint( node )
                     .withPort( port )
                     .build();
    session = cluster.connect();
  }

  public Session getSession() { return session; }

  public Metadata getMetadata() { return cluster.getMetadata(); }

  public void close() { cluster.close(); }
}
package com.etretatlogiciels.cassandra;

import org.junit.After;
import org.junit.Before;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TestName;

import com.datastax.driver.core.Host;
import com.datastax.driver.core.Metadata;

import com.etretatlogiciels.testing.TestUtilities;

public class CassandraConnectorTest
{
  // @formatter:off
  @Rule   public TestName name = new TestName();
  @After  public void tearDown() { }
  @Before public void setUp() throws Exception { TestUtilities.setUp( name ); }

  @Test
  public void testConnector()
  {
    if( !TestUtilities.runningInsideIntelliJ() )
      return;

    // connects to Cassandra instance running on local box...
    CassandraConnector client = new CassandraConnector();
    client.connect( "127.0.0.1", 9042 );

    Metadata metadata = client.getMetadata();
    System.out.println( String.format( "Connected to cluster: %s", metadata.getClusterName() ) );

    for( Host host : metadata.getAllHosts() )
    {
      System.out.println( String.format( "Datacenter: %s, Host: %s, Rack: %s",
                          host.getDatacenter(), host.getAddress(), host.getRack() ) );
    }
  }
}
This test also appears to work...
package com.etretatlogiciels.cassandra;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.LocalDate;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class TryPreparedStatementsTest
{
  private CassandraConnector client;
  private Session            session;

  @Before
  public void setUp() throws Exception
  {
    // connects to Cassandra instance running on local box...
    client = new CassandraConnector();   // assign the field; a local variable here would shadow it
    client.connect( "127.0.0.1", 9042 );
    session = client.getSession();
  }

  @After
  public void tearDown()
  {
    client.close();                      // release the driver's connections
  }

  private static final String DROP_KEYSPACE   = "drop keyspace if exists product";
  private static final String CREATE_KEYSPACE = "create keyspace product with replication = { 'class' : 'SimpleStrategy',"
                                              + " 'replication_factor' : 1 };";
  private static final String USE_KEYSPACE    = "use product;";
  private static final String DROP_TABLE      = "drop table if exists product.sku_list;";
  private static final String CREATE_TABLE    = "create table "
                                              + "product.sku_list( sku text, description text, when date, primary key( sku ) );";
  private static final String INSERT_SKU      = "insert into sku_list( sku, description, when ) values( ?, ?, ? );";

  @Test
  public void testPreparedStatement()
  {
    PreparedStatement statement;
    BoundStatement    bound;

    statement = session.prepare( DROP_KEYSPACE );
    bound     = statement.bind();
    session.execute( bound );

    statement = session.prepare( CREATE_KEYSPACE );
    bound     = statement.bind();
    session.execute( bound );

    statement = session.prepare( USE_KEYSPACE );
    bound     = statement.bind();
    session.execute( bound );

    statement = session.prepare( CREATE_TABLE );
    bound     = statement.bind();
    session.execute( bound );

    // the only statement with bind markers: fill them in by position
    statement = session.prepare( INSERT_SKU );
    bound     = statement.bind();
    bound.setString( 0, "665892" );
    bound.setString( 1, "LCD screen" );
    bound.setDate( 2, LocalDate.fromMillisSinceEpoch( System.currentTimeMillis() ) );
    session.execute( bound );
  }
}
Here's evidence:
~/dev/cassandra/apache-cassandra-3.11.0 $ ./bin/cqlsh
cqlsh> show host;
Connected to Test Cluster at 127.0.0.1:9042.
cqlsh> describe keyspaces;

system_schema  system_auth  product  system  system_distributed  system_traces

cqlsh> use product;
cqlsh:product> describe tables;

sku_list

cqlsh> describe product;

CREATE KEYSPACE product WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;

CREATE TABLE product.sku_list (
    sku text PRIMARY KEY,
    description text,
    when date
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

cqlsh:product> select * from sku_list;

 sku    | description | when
--------+-------------+------------
 665892 |  LCD screen | 2017-08-28

(1 rows)
cqlsh:product> exit;
I worked with David for half an hour on setting up to debug on Cassandra. Decidedly, trying to do development work under Windows is sorely limiting and greatly lengthens the amount of research one must do to accomplish what are simple actions under a UNIX/Linux shell. This said, it's not going to be a piece of cake on Linux either the first time. Here are some links I looked at:
JVM_OPTS="$JVM_OPTS -Xdebug"
JVM_OPTS="$JVM_OPTS -Xnoagent"
JVM_OPTS="$JVM_OPTS -Djava.compiler=NONE"
JVM_OPTS="$JVM_OPTS -Xrunjdwp:transport=dt_socket,server=y,address=5005,suspend=n"

When you start the server (from the command line, duh), you should see Cassandra echo back:

Listening for transport dt_socket at address: 5005

It's a simple matter to configure IntelliJ IDEA or Eclipse to create a remote debugger connection to 5005.
-Dcassandra.config=file:E:\DI\cassandra\conf\cassandra.yaml
-Dcassandra-foreground
-ea
-Xmx1G
-Dlog4j.configuration=file:E:\DI\cassandra\conf\log4j-server.properties
-Djava.rmi.server.hostname=127.0.0.1
-Dcom.sun.management.jmxremote.ssl=false
-Dcassandra.jmx.local.port=7199
-Dcassandra.storagedir=F:\data
-Dcom.sun.management.jmxremote.port=10036

This may reveal that a) it's the wrong configuration filename in modern Cassandra installations and b) that the Sun JMX remote port was relevant at some point. This is likely a requirement for success on remote hosts (and not in local "remote" situations).
# JVM_OPTS="$JVM_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=1414"
Note that if JVM_OPTS isn't defined in the environment, it can be defined for the process starting Cassandra, which means the options will be present. Also, note that there are option order problems with http://10.10.10.6/notes/daily.html#cassandra-debug. See notes from late last year and early this year for running NiFi remotely.
Based on what I've read of the Cassandra-Lucene Index plug-in, the lib subdirectory is guaranteed to be on Cassandra's classpath.
The installation of the Cassandra-Lucene Index plug-in, which must be done by cloning and building the source, is done thus:
mvn clean package -Ppatch -Dcassandra_home=<CASSANDRA_HOME>
Stuff to figure out:
### THIS FILE WAS CREATED BY HAND ###
# from http://cassandra.apache.org/download/
deb http://www.apache.org/dist/cassandra/debian 311x main
# curl https://www.apache.org/dist/cassandra/KEYS | apt-key add -
# apt-get install cassandra
nargothrond sources.list.d # dpkg --list | grep [c]assandra
ii  cassandra  3.11.0  all  distributed storage system for structured data
# service cassandra start
russ@nargothrond ~ $ sudo service --status-all | grep [c]assandra
 [ + ]  cassandra
Important things learned...
# uncomment to have Cassandra JVM listen for remote debuggers/profilers on port 1414
-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=1414
I was able to copy my plug-in to Cassandra, bounce it, then connect IntelliJ IDEA via remote session to my Cassandra service. Presumably, it will kick into the debugger once I figure out how to get Cassandra to call through the plug-in.
Next up, how to tell Cassandra to call my plug-in?
To execute a cql command file (i.e.: a text file containing a cql command) from the cql shell, do this:
cqlsh> source 'create-keyspace.cql'
Of course, a full or relative path to the command file works as well (although, if Cassandra is installed as a service, what would the current working directory be?).
Comments in Cassandra cql command files can be:
In the IntelliJ IDEA editor, however, only the first one gives the warm fuzzies of grey text; the other two will not stop IntelliJ from highlighting SQL/CQL keywords found in the comment.
If you attempt to query on a column in a table that's not part of the PRIMARY key, an error will be returned (let's do this in cqlsh). In this example, assume that first_name and not last_name is the primary key:
cqlsh:some_keyspace> SELECT * FROM some_table WHERE last_name = 'Schwartz';
InvalidRequest: code=2200 [Invalid query] message="No supported secondary index found for the non primary key columns restrictions"
As the error alludes, a secondary index must be created that consists at least of last_name. By definition, a secondary index is one created on/for a column that's not in the primary key:
cqlsh:some_keyspace> CREATE INDEX last_name_index ON some_table( last_name );
(Note: the name last_name_index is completely optional.)
...whereupon the original query begins to work:
cqlsh:some_keyspace> SELECT * FROM some_table WHERE last_name = 'Schwartz';

 first_name | last_name
------------+-----------
        Joe |  Schwartz

(1 rows)
The secondary index is a different concept than the custom index that I'm working on.
Cassandra partitions data across multiple nodes in a cluster. For this reason, a secondary index is kept on every relevant node alongside the data it refers to, and a query that relies only on the secondary index may have to consult many nodes. So, queries using a secondary index are significantly more expensive.
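To make the cost concrete, here is a toy model in plain Java. It is not Cassandra or driver code: the class ToyIndexedCluster, the hash-modulo "partitioner" and the three-node cluster are all assumptions made for illustration. It only counts how many nodes each kind of query has to ask.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (not Cassandra or driver code): rows are spread over nodes by
// primary key, and each node indexes only the rows it holds itself.
final class ToyIndexedCluster
{
  private final List< Map< String, String > >         rowsByNode  = new ArrayList<>(); // first_name -> last_name
  private final List< Map< String, List< String > > > indexByNode = new ArrayList<>(); // last_name -> first_names
  int nodesContacted;   // how many nodes the most recent query had to ask

  ToyIndexedCluster( int nodeCount )
  {
    for( int i = 0; i < nodeCount; i++ )
    {
      rowsByNode.add( new HashMap<>() );
      indexByNode.add( new HashMap<>() );
    }
  }

  // stand-in for the partitioner: the primary key alone decides the node
  private int nodeFor( String firstName )
  {
    return Math.floorMod( firstName.hashCode(), rowsByNode.size() );
  }

  void insert( String firstName, String lastName )
  {
    int node = nodeFor( firstName );
    rowsByNode.get( node ).put( firstName, lastName );
    indexByNode.get( node ).computeIfAbsent( lastName, k -> new ArrayList<>() ).add( firstName );
  }

  // query on the primary key: exactly one node is asked
  String lastNameOf( String firstName )
  {
    nodesContacted = 1;
    return rowsByNode.get( nodeFor( firstName ) ).get( firstName );
  }

  // query on the indexed column: nothing says which node holds a match, so all are asked
  List< String > firstNamesWith( String lastName )
  {
    List< String > found = new ArrayList<>();
    nodesContacted = 0;
    for( Map< String, List< String > > localIndex : indexByNode )
    {
      nodesContacted++;
      found.addAll( localIndex.getOrDefault( lastName, Collections.emptyList() ) );
    }
    return found;
  }
}
```

Looking up 'Joe' by first_name touches one node; looking up 'Schwartz' by last_name touches every node in the toy cluster, however few rows match.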
Because of how secondary indices are built and maintained, there are cases in which they are not recommended:
An index is built using:
In Cassandra, in contrast to RDBMS practices, begin design by laying out what queries are to be used instead of what the data and data relationships are to be. Organize the data to satisfy the queries. I see this as being a little like test-driven development, so it's a good thing.
Keep related columns together in the same Cassandra table. Queries that search a single partition will yield the best performance.
The SSTable, or "sorted-strings table," in Cassandra is created when the data of a column family (in memory) is flushed to disk.
The reason a disk needs to be left with 50% space free is so that Cassandra has space to rebuild SSTables to optimize them.
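A toy model of the flush-and-rebuild cycle may help fix the idea. This is plain Java, not Cassandra code, and it is heavily simplified: ToyMemtable and its methods are invented names, "disk" is just a list of read-only sorted maps, and the read path (memory first, then newest table to oldest) is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model (not Cassandra code) of in-memory data being flushed to SSTables.
final class ToyMemtable
{
  private final TreeMap< String, String >     memtable = new TreeMap<>();   // sorted by key, in memory
  private final List< Map< String, String > > sstables = new ArrayList<>(); // stand-ins for files on disk

  void write( String key, String value ) { memtable.put( key, value ); }

  // Flushing copies the sorted contents out as a read-only table, then empties memory.
  void flush()
  {
    sstables.add( Collections.unmodifiableMap( new TreeMap<>( memtable ) ) );
    memtable.clear();
  }

  // Rebuilding writes one merged table before the old ones are dropped,
  // which is why spare disk space is needed while it runs.
  void compact()
  {
    TreeMap< String, String > merged = new TreeMap<>();
    for( Map< String, String > table : sstables )
      merged.putAll( table );   // later (newer) tables overwrite older values
    sstables.clear();
    sstables.add( Collections.unmodifiableMap( merged ) );
  }

  // Reads check memory first, then tables from newest to oldest.
  String read( String key )
  {
    if( memtable.containsKey( key ) )
      return memtable.get( key );
    for( int i = sstables.size() - 1; i >= 0; i-- )
      if( sstables.get( i ).containsKey( key ) )
        return sstables.get( i ).get( key );
    return null;
  }

  List< Map< String, String > > sstables() { return sstables; }
}
```

In compact() the merged copy exists alongside the originals until they are cleared; on a real disk that momentary duplication is the space being reserved.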
Today, I'm looking into this topic.
When you move from an RDBMS to Cassandra, whether really or conceptually because you're adopting Cassandra and, like most, have a sort of solidly SQL mindset, you must denormalize data into separate tables based on the queries that will be run against your database (keyspace and tables).
Thinking about how to organize data in Cassandra requires different thoughts and approaches.
For example, the only way to query a column in a table without specifying the partition key is to use a secondary index. This method is not fit for data of high cardinality, that is, columns that contain values that are very uncommon or unique, like a GUID, e-mail address, user name, etc. This is very slow because high-cardinality, secondary-index queries can require all nodes in the ring to respond, adding considerable latency to the action.
One solution to this problem has been to make the client (the one making the query) perform denormalization as a part of his processing of queries into multiple, independent tables. This means that such code, in an application, would be running at the hands of many users on many hosts (instead of just one place).
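As a rough illustration of what that burden looks like, here is a plain-Java sketch (not driver code) in which the client keeps two query-specific "tables" in step by hand. The class, the two maps and the user/e-mail example are hypothetical stand-ins for real Cassandra tables.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (not driver code): the client itself writes every record to
// each table that exists to serve one particular query.
final class ClientSideDenormalization
{
  // hypothetical tables: one answers "find by user name", the other "find by e-mail"
  final Map< String, String > usersByName  = new HashMap<>();   // name   -> e-mail
  final Map< String, String > usersByEmail = new HashMap<>();   // e-mail -> name

  // Every write must touch both tables; a client that forgets one leaves them inconsistent.
  void addUser( String name, String email )
  {
    usersByName.put( name, email );
    usersByEmail.put( email, name );
  }

  public static void main( String[] args )
  {
    ClientSideDenormalization db = new ClientSideDenormalization();
    db.addUser( "russ", "russ@example.com" );
    System.out.println( db.usersByEmail.get( "russ@example.com" ) );
  }
}
```

Every application that writes users has to carry this same dual-write logic, which is the duplication the paragraph above complains about.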
Cassandra 3.0 introduced a new feature, materialized views, which handles automated, server-side denormalization. This feature takes the form of a statement that's sort of a combination of index creation and select query. For example, suppose this table:
CREATE TABLE scores
(
  user  TEXT,
  game  TEXT,
  year  INT,
  month INT,
  day   INT,
  score INT,
  PRIMARY KEY( user, game, year, month, day )
);
We want some way to get the all-time high scores from the data in this table:
CREATE MATERIALIZED VIEW alltimehigh AS                -- name the view
  SELECT user                                          -- must identify the columns to be contained
  FROM scores                                          -- must identify the base table
  WHERE game  IS NOT NULL                              -- filter must be specified for each primary-key column
    AND score IS NOT NULL
    AND user  IS NOT NULL
    AND year  IS NOT NULL
    AND month IS NOT NULL
    AND day   IS NOT NULL
  PRIMARY KEY( game, score, user, year, month, day )   -- must include all of the base table's primary-key columns
  WITH CLUSTERING ORDER BY( score desc );
In this example, we prime the table with some data:
INSERT INTO scores( user, game, year, month, day, score ) VALUES( 'pcmanus',    'Coup', 2015, 05, 01, 4000 );
INSERT INTO scores( user, game, year, month, day, score ) VALUES( 'jbellis',    'Coup', 2015, 05, 03, 1750 );
INSERT INTO scores( user, game, year, month, day, score ) VALUES( 'yukim',      'Coup', 2015, 05, 03, 2250 );
INSERT INTO scores( user, game, year, month, day, score ) VALUES( 'tjake',      'Coup', 2015, 05, 03,  500 );
INSERT INTO scores( user, game, year, month, day, score ) VALUES( 'jmckenzie',  'Coup', 2015, 06, 01, 2000 );
INSERT INTO scores( user, game, year, month, day, score ) VALUES( 'iamaleksey', 'Coup', 2015, 06, 01, 2500 );
INSERT INTO scores( user, game, year, month, day, score ) VALUES( 'tjake',      'Coup', 2015, 06, 02, 1000 );
INSERT INTO scores( user, game, year, month, day, score ) VALUES( 'pcmanus',    'Coup', 2015, 06, 02, 2000 );
...and here's how we search for the all-time high score:
SELECT user, score FROM alltimehigh WHERE game = 'Coup' LIMIT 1
The result is:
 user    | score
---------+-------
 pcmanus |  4000
A lot of the magic happens at write time, i.e.: when the table is built. Consequently, there is a performance penalty at write and query time. Low-cardinality data will create hotspots around the ring. In our example, because the only game is 'Coup', only the nodes storing 'Coup' have any data stored on them. If there are tombstoned entries, the materialized view must query for and generate a tombstone for each entry. This is all overhead.
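The write-time cost can be pictured with a toy model in plain Java. It is not how Cassandra implements views: ToyAllTimeHigh and its in-memory lists are invented for illustration, the primary-key upsert semantics of the base table are ignored, and the rows are the ones inserted in the example above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model (not Cassandra code) of the alltimehigh view: every write to the
// base table also maintains a second copy ordered for the query.
final class ToyAllTimeHigh
{
  static final class Score
  {
    final String user, game;
    final int    year, month, day, score;

    Score( String user, String game, int year, int month, int day, int score )
    {
      this.user = user; this.game = game;
      this.year = year; this.month = month; this.day = day; this.score = score;
    }
  }

  private final List< Score > base = new ArrayList<>();   // stands in for the scores table
  private final List< Score > view = new ArrayList<>();   // stands in for alltimehigh

  void insert( Score s )
  {
    base.add( s );
    view.add( s );
    // the extra work at write time: keep the view clustered by score, descending
    view.sort( Comparator.comparingInt( ( Score x ) -> x.score ).reversed() );
  }

  // SELECT user, score FROM alltimehigh WHERE game = ? LIMIT 1
  Score allTimeHigh( String game )
  {
    for( Score s : view )
      if( s.game.equals( game ) )
        return s;
    return null;
  }

  // the eight rows from the example above
  static ToyAllTimeHigh withSampleData()
  {
    ToyAllTimeHigh t = new ToyAllTimeHigh();
    t.insert( new Score( "pcmanus",    "Coup", 2015, 5, 1, 4000 ) );
    t.insert( new Score( "jbellis",    "Coup", 2015, 5, 3, 1750 ) );
    t.insert( new Score( "yukim",      "Coup", 2015, 5, 3, 2250 ) );
    t.insert( new Score( "tjake",      "Coup", 2015, 5, 3,  500 ) );
    t.insert( new Score( "jmckenzie",  "Coup", 2015, 6, 1, 2000 ) );
    t.insert( new Score( "iamaleksey", "Coup", 2015, 6, 1, 2500 ) );
    t.insert( new Score( "tjake",      "Coup", 2015, 6, 2, 1000 ) );
    t.insert( new Score( "pcmanus",    "Coup", 2015, 6, 2, 2000 ) );
    return t;
  }

  public static void main( String[] args )
  {
    Score top = withSampleData().allTimeHigh( "Coup" );
    System.out.println( top.user + " | " + top.score );   // prints "pcmanus | 4000"
  }
}
```

The read is a cheap scan from the front of an already-ordered list; the price was paid on each insert, which is the trade the view makes.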
It's possible to use something called Cassandra Cluster Manager (CCM), but for practice and deep learning about configuration aspects and details in administration, do each box manually as a separate node. This comes from Jeff Jirsa, who says that official, first-time set-up documents are pretty lacking and gives the following steps:
Another good reference is How To Run a Multi-Node Cluster Database with Cassandra on Ubuntu 14.04.
cqlsh> CREATE ROLE cassadmn WITH PASSWORD = 'Cassadmn' AND LOGIN = true;
NoHostAvailable: ('Unable to complete the operation against any hosts', {})
"Unavailable" indicates that the number of nodes Cassandra needs for the query to succeed isn't available. Too many nodes are down. Either it's a single node that thinks it's more than one node and others are down (you added/removed nodes to/from that cluster in the past), or the replication strategy for system_auth is wrong.
...or, more properly, coordinator node, is what sends the client's search request (or query) to each node in the cluster. Each node then returns its result whereupon the coordinator combines these partial results, then gives the n (where n is prescribed in the query by a limit) most highly ranked. This avoids a full scan of all the data.
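The combine step can be sketched in plain Java. This is a toy model, not Cassandra code: ToyCoordinator and topN are invented names, and each node's partial result is assumed to arrive as a list of integer ranks already sorted highest first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model (not Cassandra code) of a coordinator merging per-node results.
final class ToyCoordinator
{
  // Each inner list is one node's partial result: that node's best ranks, highest first.
  static List< Integer > topN( List< List< Integer > > partialResults, int n )
  {
    List< Integer > merged = new ArrayList<>();

    for( List< Integer > partial : partialResults )
      merged.addAll( partial.subList( 0, Math.min( n, partial.size() ) ) );   // no node need send more than n

    merged.sort( Collections.reverseOrder() );
    return new ArrayList<>( merged.subList( 0, Math.min( n, merged.size() ) ) );
  }

  public static void main( String[] args )
  {
    List< List< Integer > > partials = Arrays.asList(
        Arrays.asList( 4000, 1750 ),
        Arrays.asList( 2500, 2250, 500 ),
        Arrays.asList( 2000 ) );

    System.out.println( topN( partials, 2 ) );   // prints [4000, 2500]
  }
}
```

Since each node contributes at most n candidates, the coordinator sorts a small merged list and never looks at the full data set.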
Cassandra says that the client read or write requests can go to any node in the cluster because all nodes in Cassandra are peers. When a client connects to a node and issues a read or write request, that node serves as the coordinator for that particular client operation.
The job of the coordinator is to act as a proxy between the client application and the nodes (or replicas) that own the data being requested. The coordinator determines which nodes in the ring should get the request based on the cluster's configured partitioner and replica placement strategy.
In my mind, this raises a number of questions: "Will every node offer a coordinator, or only some nodes?" "Does the coordinator consist of universal code or of code that's not installed everywhere?"
My hypothesis is that every "stock" Cassandra node is a potential coordinator for mere Cassandra purposes: Any node that receives a client query is referred to as the coordinator for that client operation (query).
The coordinator node is typically chosen by an algorithm that takes network distance into account. Any node can act as the coordinator. At first requests will be sent to the nodes the client driver knows about. (Remember, a client application initiates its connection to Cassandra by passing a list of one or more contact points which are hostname plus port.)
It's also useful to know that each client request may be coordinated by a different node and there is no single point of failure (fundamental to Cassandra's architecture).
However, once the client connects and understands the topology of the cluster, the driver may change to a closer coordinator, i.e.: choose a different node, including one that wasn't in the original list of contact points. This is because each node holds the metadata of all the other nodes, meaning that as long as one is connected, the driver can get information on all the nodes in the cluster. The driver then uses the metadata of the entire cluster, obtained from the connected node, to create the connection pool. This also means that it's not necessary to put the addresses of all the nodes in the cluster into the contact-points list. Best practice is to list as contact points the nodes that respond the quickest to the client application when it starts up. This can be difficult if not impossible to predict at the finest level.
How is a coordinator chosen? How your application sets up its own load-balancing policy has an effect.
In configuring Cassandra load-balancing policy for your client application, the options are:
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.RoundRobinPolicy;

public class ClientApplicationStub
{
  public static void main( String[] args )
  {
    Cluster cluster = Cluster.builder()
                             .addContactPoint( "127.0.0.1" )   // host only; the port is set separately
                             .withPort( 9042 )
                             .withLoadBalancingPolicy( new RoundRobinPolicy() )
                             .build();
    ...
Once the cluster is built, it's not possible to change the policy set.
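To picture what a round-robin policy means for coordinator choice, here's a toy sketch using nothing but the JDK. It is not the driver's RoundRobinPolicy, only my own stand-in for the idea; the class name is hypothetical and the host addresses are simply the two nodes of my microcluster.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinSketch {
    private final List<String> hosts;
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinSketch(List<String> hosts) {
        this.hosts = hosts;
    }

    /** Pick the coordinator for the next request, cycling through every known host in turn. */
    public String nextCoordinator() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), hosts.size());
        return hosts.get(index);
    }

    public static void main(String[] args) {
        RoundRobinSketch policy = new RoundRobinSketch(Arrays.asList("10.10.8.248", "10.10.8.9"));
        for (int request = 0; request < 4; request++) {
            System.out.println("request " + request + " coordinated by " + policy.nextCoordinator());
        }
    }
}
```

Successive requests land on different coordinators, which is the behavior that makes each client request potentially coordinated by a different node.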
Here's a summary of installing a two-node cluster for my personal, development use. I'm developing a custom, secondary-index plug-in as if I were developing Stratio's Lucene index plug-in, which happens to be the model for what I'm doing. So, imagine that when following these steps.
Given that I'm doing development work, I need lightning-fast turn-around. Hence, I give scripts, short-cuts on the VMs, etc. for that purpose. Remember that these are VMs running on a private host, with 32 GB of memory and ample SSD, to which no one has access but me.
# passwd root
#PermitRootLogin prohibit-password
PermitRootLogin yes
# dpkg --install cassandra_3.11.0_all.deb
(Note: Cassandra is running at this point.)
# apt-mark hold cassandra
cassandra set on hold.
#!/bin/sh
# Open these ports for Cassandra.
# Internode ports:
iptables -A INPUT -p tcp --dport 7000 --jump ACCEPT   # internode communication
iptables -A INPUT -p tcp --dport 7001 --jump ACCEPT   # SSL
iptables -A INPUT -p tcp --dport 7199 --jump ACCEPT   # JMX monitoring
# Client ports:
iptables -A INPUT -p tcp --dport 9042 --jump ACCEPT   # client
iptables -A INPUT -p tcp --dport 9160 --jump ACCEPT   # Thrift
iptables -A INPUT -p tcp --dport 9142 --jump ACCEPT   # native-transport port (SSL)
# systemctl stop cassandra
# rm -rf /var/lib/cassandra/data/system/*
cqlsh> CREATE KEYSPACE stratio
   ... WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 1 };
# uncomment to have Cassandra JVM listen for remote debuggers/profilers on port 1414
-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=1414
# shutdown -h now (on scylla)
Remember, there are a couple of settings in /etc/cassandra/cassandra.yaml
specific to charybdis that must be done after launching that new VM.
These include - seeds, listen_address and rpc_address.
(I found out that you can't cheat and use localhost in any of these
places.) Moreover, you'll likely discover charybdis's IP address which
figures in configuration for both VMs, so these instructions were just a little
bit simplified and made assumptions.
$ ssh [email protected] -L 1717:127.0.0.1:1414   # (scylla)
$ ssh [email protected] -L 1818:127.0.0.1:1414   # (charybdis)
This way, in IntelliJ IDEA, I configure the debugger with Settings:
#!/bin/sh
rm -rf /var/log/cassandra/*     # zap log files
systemctl restart cassandra     # toss Cassandra
$ cqlsh scylla --request-timeout=3600
This is where I'll set up all my schema, enter data and begin conducting the
cqlsh-based development.
#!/bin/sh
# ------------------------------------------------------
# Replace the Stratio Lucene plug-in on the mini-cluster
# or walk the cluster's nodes just to "do stuff."
# ------------------------------------------------------
NODE_1=10.10.8.248
NODE_2=10.10.8.9
CLUSTER="${NODE_1} ${NODE_2}"

args=${1:-}
if [ "$args" = "--walk" ]; then
  # walk the cluster's nodes doing whatever you need, like bouncing Cassandra...
  for node in ${CLUSTER}; do
    echo "ssh root@${node}"
    ssh root@${node}
  done
  exit 0
fi

STRATIO_JAR="/home/russ/dev/cassandra-lucene-index/plugin/target/cassandra-lucene-index-plugin-3.11.0.0.jar"
CASSANDRA_LIB_PATH="/usr/share/cassandra/lib"

for node in $CLUSTER; do
  echo "scp ${STRATIO_JAR} root@${node}:${CASSANDRA_LIB_PATH}"
  scp ${STRATIO_JAR} root@${node}:${CASSANDRA_LIB_PATH}
done

# vim: set tabstop=2 shiftwidth=2 noexpandtab:
# chmod ga+w /usr/share/cassandra/lib
Eventually, after clearing out Cassandra data and reloading it over and over again, I reached a point at which my two-node microcluster broke:
russ@gondolin ~ $ cqlsh scylla --request-timeout=30
Connection error: ('Unable to connect to any servers', {'10.10.8.248': error(111, "Tried connecting to [('10.10.8.248', 9042)]. Last error: Connection refused")})
Alerted by cqlsh, I looked at the nodes and both began to give me this:
root@charybdis:/etc/cassandra# nodetool status
error: No nodes present in the cluster. Has this node finished starting up?
-- StackTrace --
java.lang.RuntimeException: No nodes present in the cluster. Has this node finished starting up?
at org.apache.cassandra.dht.Murmur3Partitioner.describeOwnership(Murmur3Partitioner.java:262)
at org.apache.cassandra.service.StorageService.effectiveOwnership(StorageService.java:4725)
at org.apache.cassandra.service.StorageService.effectiveOwnership(StorageService.java:114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
On the verge of reinstalling Cassandra, I got brave and removed more than just what was under the data subdirectory:
rm -rf /var/lib/cassandra/data/*
rm -rf /var/lib/cassandra/commitlog/*
rm -rf /var/lib/cassandra/hints/*
rm -rf /var/lib/cassandra/saved_caches/*
cqlsh> INSERT INTO myaddressspace.table ( mpid, date_of_service, uri, data )
... VALUES ( 4,
... '2017-01-03',
... '/home/russ/document/Folder010/665892_004.xml.4',
... '<document>This is a test for mpid 4</document>' );
WriteTimeout: Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses]
message="Operation timed out - received only 0 responses."
info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
This would also tend to suggest that you cannot create a useful microcluster, even one of only two nodes, on an i5 with only 2 cores. That said, I think it's possible to give a set of VMs (2 in the case of a private microcluster of the sort discussed here) a number of cores that adds up to, or perhaps even exceeds, the number the supporting hardware actually has to offer; it may result only in slowness. This is my case at home, although I intend to enlist another unused i5 I've got sitting around for the second VM of my two-node cluster.
Maintaining a private microcluster is very useful for development. This is because it removes all possible interference from competing needs for a minicluster shared across a team of people.
On the mailing list, I read someone's suggestion to run the following nodetool commands before restarting a Cassandra node in order to produce a very careful and graceful Cassandra shut-down:
$ nodetool disablethrift && sleep 5
$ nodetool disablebinary && sleep 5
$ nodetool disablegossip && sleep 5
$ nodetool drain
Kurt Greaves countered with "[This is not] essential. Cassandra will gracefully shut down in any scenario as long as it's not killed with a SIGKILL. However, drain does have a few benefits over just a normal shut-down. It will stop a few extra services (batchlog, compactions) and importantly it will also force recycling of dirty commitlog segments, meaning there will be [fewer] commitlog files to replay on startup and reduc[ed] start-up time.
"A comment in the code for drain also indicates that it will wait for in-progress streaming to complete, but I haven't managed to find 1) where this occurs, or 2) if it actually differs to a normal shut-down. Note that this is all with respect to Cassandra 2.1. In 3.0.10 and 3.10, drain and shut-down more or less do exactly the same thing, [though] drain will log some extra messages."
On JVM memory and how heap settings are arrived at plus cautionaries on arbitrary, custom settings, see Tuning [Cassandra] Java resources.
See also my general Java notes Notes on JVM heap memory.
When Cassandra starts up, you can examine what it chooses as heap sizes assuming you're not telling it (via /etc/cassandra/jvm.options) what to use:
# ps fuww `pgrep java` | egrep -- '-Xms|-Xmx'
# egrep -- '-Xms|-Xmx' /var/log/cassandra/system.log
Explanation of some of the command options:
f    Show the ASCII-art process hierarchy (forest). This is the BSD-style f used above (no dash), not the UNIX-style -f, which does a full-format listing.
u    Display user-oriented format.
w    Wide output. Use this option twice for unlimited width.
--   Tell grep not to interpret -Xms and -Xmx as options (flags).
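For comparison with what those commands report, here's a sketch of the arithmetic that, as far as I understand the stock cassandra-env.sh, picks the defaults when nothing is set explicitly: max heap is max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)) and the new-generation size is min(100 MB per core, 1/4 of the heap). The class and method names are mine, and the 32 GB, 8-core machine is just a made-up input.

```java
public class HeapSizeSketch {
    /** Default max heap in MB: max(min(RAM/2, 1024), min(RAM/4, 8192)). */
    public static long maxHeapMb(long systemMemoryMb) {
        long half = Math.min(systemMemoryMb / 2, 1024);
        long quarter = Math.min(systemMemoryMb / 4, 8192);
        return Math.max(half, quarter);
    }

    /** Default new-generation size in MB: min(100 MB per core, heap/4). */
    public static long newSizeMb(long maxHeapMb, int cores) {
        return Math.min(100L * cores, maxHeapMb / 4);
    }

    public static void main(String[] args) {
        long memoryMb = 32 * 1024;   // assumed: a node with 32 GB of RAM
        int cores = 8;               // assumed: 8 cores
        long heap = maxHeapMb(memoryMb);
        System.out.println("max heap: " + heap + " MB");
        System.out.println("new size: " + newSizeMb(heap, cores) + " MB");
    }
}
```

If the -Xms/-Xmx values in system.log don't match this arithmetic, something (jvm.options, an environment variable) is overriding the defaults.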
On a related note, here's someone's 84Gb heap settings and GC consequences in Cassandra.
#####
# Simpler, new generation G1GC settings.
#####
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:+UnlockExperimentalVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+ParallelRefProcEnabled"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=50"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=2"
#####
# GC logging options -- uncomment to enable
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCTimeStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
JVM_OPTS="$JVM_OPTS -Xloggc:/home/vchadoop/var/logs/cassandra/cassandra-gc.log"
JVM_OPTS="$JVM_OPTS -XX:+UseGCLogFileRotation"
JVM_OPTS="$JVM_OPTS -XX:NumberOfGCLogFiles=10"
JVM_OPTS="$JVM_OPTS -XX:GCLogFileSize=1M"
#########
MAX_HEAP_SIZE="84G"
HEAP_NEWSIZE="2G"
#########
The only issue that we currently have, and are looking to fix soon, is the need to upgrade our old JDK version and to set metaspace to a higher value.
We found that when the Java runtime reaches the high watermark, it induces a full GC even if there is plenty of memory to expand the heap.
{Heap before GC invocations=1 (full 1):
 garbage-first heap   total 88080384K, used 655025K [0x00007fdd60000000, 0x00007ff260000000, 0x00007ff260000000)
  region size 32768K, 20 young (655360K), 0 survivors (0K)
 Metaspace       used 34166K, capacity 35325K, committed 35328K, reserved 36864K
2018-01-05T08:10:31.491+0000: 81.789: [Full GC (Metadata GC Threshold) 651M->30M(84G), 0.6598667 secs]
   [Eden: 640.0M(2048.0M)->0.0B(2048.0M) Survivors: 0.0B->0.0B Heap: 651.4M(84.0G)->30.4M(84.0G)], [Metaspace: 34166K->34162K(36864K)]
Heap after GC invocations=2 (full 2):
 garbage-first heap   total 88080384K, used 31140K [0x00007fdd60000000, 0x00007ff260000000, 0x00007ff260000000)
  region size 32768K, 0 young (0K), 0 survivors (0K)
 Metaspace       used 34162K, capacity 35315K, committed 35328K, reserved 36864K
}
 [Times: user=0.67 sys=0.00, real=0.66 secs]
Enabling DEBUG-level logging in /etc/cassandra/logback.xml will turn on dumps of CQL commands to Cassandra, i.e.: records of what queries are made begin to appear in /var/log/cassandra/debug.log. This is where it's done in that configuration file (look for this paragraph):
<root level="INFO">   <!-- change INFO to DEBUG -->
  <appender-ref ref="SYSTEMLOG" />
  <appender-ref ref="STDOUT" />
  <appender-ref ref="ASYNCDEBUGLOG" />
</root>
Slow queries are queries that take longer than a configured threshold. This threshold is established in the /etc/cassandra/cassandra.yaml file:
# How long before a node logs slow queries. Select queries that take longer than
# this timeout to execute, will generate an aggregated log message, so that slow queries
# can be identified. Set this value to zero to disable slow query logging.
slow_query_log_timeout_in_ms: 300
(Note: default in Cassandra 3.11.0 was 500. I'm setting it to 300 in order to accompany the example below.)
The queries one runs may not always execute as quickly as desired, some worse than others. Set performance expectations (see above). Search queries are the slowest. Reads and writes against primary keys that are designed properly should generally execute with single-digit, millisecond latency. Search will always be slower: there's a great deal more to do, there are multiple index tables to consider, etc.
A rule of thumb is that searches take tens of milliseconds on the low end and a couple of seconds on the high end. Anything above the 2-second figure may indicate a problem to be looked into. Note: a search query, as opposed to simply reading data from Cassandra with a plain SELECT, is a SELECT plus a WHERE clause like the following. (This uses DataStax' Solr integration, which would be expected to take longer than most; the same is true of the stuff I'm working on at present.)
cqlsh> SELECT * FROM killervideo.videos
   ... WHERE solr_query = '{ "q" : "title:Terminator", "query.name": "Ahnold" }';
A more tame example might be:
cqlsh> SELECT * FROM fun.users WHERE user_id=42;
Logging queries...
<logger name="com.datastax.driver.core.QueryLogger.SLOW">   <!-- or NORMAL or ERROR -->
  <level value="DEBUG" />                                     <!-- or TRACE -->
</logger>
(You can also set this in Java.) This will print messages like
DEBUG [cluster1] [/127.0.0.1:9042] Query too slow, took 329 ms: SELECT * FROM users WHERE user_id=?;
...for every slow query. To get query-parameter values, resort to TRACE instead of merely DEBUG. You'll see something like this:
TRACE [cluster1] [/127.0.0.1:9042] Query too slow, took 329 ms: SELECT * FROM users WHERE user_id=? [user_id=42];
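Once these lines are piling up in a log, it's handy to pull the latency and statement back out of them. Here's a minimal sketch, assuming the message format is exactly as in the two samples above; the class and method names are my own invention, not part of the driver.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SlowQueryLine {
    // Matches the "Query too slow, took N ms: statement" portion of a log line.
    private static final Pattern SLOW = Pattern.compile("Query too slow, took (\\d+) ms: (.*)$");

    /** Latency in milliseconds, or -1 if the line is not a slow-query message. */
    public static long latencyMs(String line) {
        Matcher matcher = SLOW.matcher(line);
        return matcher.find() ? Long.parseLong(matcher.group(1)) : -1;
    }

    /** The logged statement (with bound values if TRACE is on), or null if the line doesn't match. */
    public static String statement(String line) {
        Matcher matcher = SLOW.matcher(line);
        return matcher.find() ? matcher.group(2) : null;
    }

    public static void main(String[] args) {
        String line = "DEBUG [cluster1] [/127.0.0.1:9042] Query too slow, took 329 ms: "
                    + "SELECT * FROM users WHERE user_id=?;";
        System.out.println(latencyMs(line) + " ms: " + statement(line));
    }
}
```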
The upgrade of openjdk-8-8u162-b12 broke Cassandra 3.11.0 on my cluster. It created a situation (in Cassandra dæmon start-up code) in which some method has become abstract and no longer has an executable body defined. See https://docs.oracle.com/javase/8/docs/api/java/lang/AbstractMethodError.html. This isn't in my code; I can't easily find and fix it.
root@scylla:/var/log/cassandra# bounce-cassandra.sh && sleep 3 && tail -f debug.log
ERROR [main] 2018-04-05 11:11:56,873 o.a.c.s.CassandraDaemon:706 - Exception encountered during startup
java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote; \
ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150) ~[na:1.8.0_162]
at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135) ~[na:1.8.0_162]
at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405) ~[na:1.8.0_162]
at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:104) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:143) [apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:188) [apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600) [apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689) [apache-cassandra-3.11.0.jar:3.11.0]
I had to do this:
root@scylla:/var/log/# ll -d dpkg*
root@scylla:/var/log/# gunzip dpkg.log.4
root@scylla:/var/log/# fgrep openjdk-8 dpkg.log.4
2017-12-06 14:08:44 install openjdk-8-jre-headless:amd64 8u151-b12-0ubuntu0.16.04.2
root@scylla:/# apt -f install /home/russ/Downloads/openjdk-8-jre-headless_8u151-b12-0ubuntu0.16.04.2_amd64.deb
root@scylla:/# apt -f install /home/russ/Downloads/openjdk-8-jdk-headless_8u151-b12-0ubuntu0.16.04.2_amd64.deb
root@scylla:/# apt list --installed | grep [o]penjdk
openjdk-8-jdk-headless/now 8u151-b12-0ubuntu0.16.04.2 amd64 [installed,upgradable to: 8u162-b12-0ubuntu0.16.04.2]
openjdk-8-jre-headless/now 8u151-b12-0ubuntu0.16.04.2 amd64 [installed,upgradable to: 8u162-b12-0ubuntu0.16.04.2]
root@scylla:/# apt-mark hold openjdk-8-jre-headless/now
openjdk-8-jre-headless set on hold.
root@scylla:/# apt-mark hold openjdk-8-jdk-headless/now
openjdk-8-jdk-headless set on hold.
Of course, holding the JRE/JDK back from being updated is not something to do lightly given that updates likely include security-hole patches.
One of our number joined the [email protected] mailing list only to learn that, when discussing using vnodes, especially lots of vnodes, developers are a little incredulous.
Indeed, our experience without vnodes has been excellent, but, when we've instituted vnodes, we've had no end to problems in running node repairs.
We've known that vnodes, especially greater numbers of them, let us grow our clusters easily and reduce the likelihood of gross misbalancing in terms of how much of our data ends up on the nodes in the ring. The more vnodes, the more evenly the data is distributed.
We had no idea that resorting to vnodes had such a bad reputation among developers though we've been fretting about problems with Cassandra for some weeks now (our product has not been released yet, we're still in research and development).
Without vnodes, imagining how a customer's cluster will grow predicts an entirely hand-wrought experience for our IT folk. Without vnodes, adding a node or two to a cluster carries a real potential for failure. Instead, doubling the size of the cluster seems the safest solution.
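Why doubling is the comfortable case shows up in the token arithmetic. Here's a sketch of the commonly cited formula for evenly spaced Murmur3Partitioner tokens (one token per node, i.e. no vnodes): token i of N is i * (2^64 / N) - 2^63. The class name is mine, and this only computes candidate initial_token values; it says nothing about how to apply them.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class InitialTokens {
    /** Evenly spaced tokens across the Murmur3 range [-2^63, 2^63) for the given node count. */
    public static List<BigInteger> tokens(int nodes) {
        BigInteger ringSize = BigInteger.ONE.shiftLeft(64);        // 2^64 possible tokens
        BigInteger minimum = BigInteger.ONE.shiftLeft(63).negate(); // -2^63
        BigInteger step = ringSize.divide(BigInteger.valueOf(nodes));
        List<BigInteger> result = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            result.add(step.multiply(BigInteger.valueOf(i)).add(minimum));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("2 nodes: " + tokens(2));
        System.out.println("4 nodes: " + tokens(4));
    }
}
```

Going from 2 nodes to 4, the two original tokens reappear unchanged and the new nodes simply bisect the existing ranges, so no existing node has to be moved. Adding a single node to a 2-node ring, by contrast, yields a different token for every position but the first.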
We've found Cassandra to be very robust at data entry with no vnodes. With vnodes, we have to repair nodes frequently and, sometimes, abandon and rebuild the cluster from scratch.
Stay tuned.
When a new node is added to a cluster, it's necessarily added to some data center. As this happens, the "Cassandra shuffle," i.e.: the data shifted from other nodes to the new one as the nodes equalize the data load they're sharing, goes on only in the data center ring to which the new node has been added, and not across all nodes in the cluster.
Thinking about it, it's important to understand too that when a new data center is added, there's no rebalancing of tokens (tokens being what determines what's called sharding in other database paradigms) since the data center will keep whatever replicas it's set up to keep.
When configuring a data center, it's stated how many replicas there are to be per data center. This will result in a lot of copying between data centers, but doesn't change the replicas for existing data centers or involve movement within them (the existing data centers).
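The per-data-center bookkeeping is simple enough to sketch. This assumes the usual rule that a quorum is a strict majority of the replication factor; the class name, the data center names and the replica counts are all made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ReplicasPerDataCenter {
    /** Total copies of each row across the cluster: the sum of the per-data-center counts. */
    public static int totalCopies(Map<String, Integer> replicasPerDc) {
        int total = 0;
        for (int replicas : replicasPerDc.values()) {
            total += replicas;
        }
        return total;
    }

    /** Replicas needed for a quorum within one data center: a strict majority. */
    public static int quorum(int replicas) {
        return replicas / 2 + 1;
    }

    public static void main(String[] args) {
        Map<String, Integer> replicasPerDc = new LinkedHashMap<>();
        replicasPerDc.put("dc1", 1);
        replicasPerDc.put("dc2", 1);
        System.out.println("copies before: " + totalCopies(replicasPerDc));

        // Adding a data center adds copies; the counts for dc1 and dc2 are untouched.
        replicasPerDc.put("dc3", 3);
        System.out.println("copies after:  " + totalCopies(replicasPerDc));
        System.out.println("quorum within dc3: " + quorum(replicasPerDc.get("dc3")));
    }
}
```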
Now that we know that vnodes are Satan's spawn (at least, for now) and because I need another microcluster that simulates multiple data centers, here's a remake of my 8 December 2017 microcluster.
I cloned scylla to sampo and louhi. Here are the differences between scylla's and sampo's /etc/cassandra/cassandra.yaml files. (louhi's IP address is 10.10.8.113.)
$ diff scylla.cassandra.yaml sampo.cassandra.yaml
10c10
< cluster_name: 'odyssey'
---
> cluster_name: 'kalevala'
25c25
< num_tokens: 2
---
> num_tokens: 1
424c424
< - seeds: "10.10.8.248,10.10.8.9"
---
> - seeds: "10.10.8.114,10.10.8.113"
598c598
< listen_address: 10.10.8.248
---
> listen_address: 10.10.8.114
675c675
< rpc_address: 10.10.8.248
---
> rpc_address: 10.10.8.114
948c948
< endpoint_snitch: SimpleSnitch
---
> endpoint_snitch: GossipingPropertyFileSnitch
Now, I had to smoke all the subdirectories under /var/lib/cassandra, including:
Here's the schema and everything else I created for testing and learning a few things I need to know about the Stratio-Lucene plug-in. Now, for Lucene use, this is idiot schema, but I'm not trying to learn about Lucene, only about how this plug-in works its (non Lucene-specific) magic.
CREATE KEYSPACE stratio
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '1', 'dc2': '1'}
  AND durable_writes = true;

CREATE TABLE stratio.lucene
(
  mpid bigint,
  dos  text,
  uri  text,
  data text,
  PRIMARY KEY ( mpid, dos, uri )
)
WITH CLUSTERING ORDER BY ( dos DESC, uri ASC );

CREATE CUSTOM INDEX lucene_index ON stratio.lucene (data)
  USING 'com.stratio.cassandra.lucene.Index'
  WITH OPTIONS =
  {
    'refresh_seconds' : '10',
    'schema' : '{
      fields :
      {
        mpid : { type : "bigint" },
        dos  : { type : "text" },
        uri  : { type : "text" },
        data : { type : "text" }
      }
    }'
  };

INSERT INTO stratio.lucene ( mpid, dos, uri, data ) VALUES ( 1, '2017-01-01', '/home/searchadmin/cxml/Folder010/665892_001.xml.1', 'This is a test for mpid 1' );
INSERT INTO stratio.lucene ( mpid, dos, uri, data ) VALUES ( 2, '2017-01-01', '/home/searchadmin/cxml/Folder010/665892_002.xml.1', 'This is a test for mpid 2' );
INSERT INTO stratio.lucene ( mpid, dos, uri, data ) VALUES ( 3, '2017-01-01', '/home/searchadmin/cxml/Folder010/665892_003.xml.1', 'This is a test for mpid 3' );
INSERT INTO stratio.lucene ( mpid, dos, uri, data ) VALUES ( 4, '2017-01-01', '/home/searchadmin/cxml/Folder010/665892_004.xml.1', 'This is a test for mpid 4' );
INSERT INTO stratio.lucene ( mpid, dos, uri, data ) VALUES ( 5, '2017-01-01', '/home/searchadmin/cxml/Folder010/665892_005.xml.1', 'This is a test for mpid 5' );
INSERT INTO stratio.lucene ( mpid, dos, uri, data ) VALUES ( 13, '2017-01-01', '/home/searchadmin/cxml/Folder010/665892_013.xml.1', 'This is a test for mpid 13' );
INSERT INTO stratio.lucene ( mpid, dos, uri, data ) VALUES ( 69, '2017-01-01', '/home/searchadmin/cxml/Folder010/665892_069.xml.1', 'This is a test for mpid 69' );

SELECT * FROM stratio.lucene
  WHERE expr( lucene_index, '{ query: { type: "phrase", field: "data", value: "for mpid" } }' );
I've been doing some very heavy-duty debugging of the Stratio-Lucene plug-in to see how it works, in particular, what happens when nodetool repair is run on a node after it's been down and new data's been added on other nodes. I make these observations for later use.
A number of assumptions exist that are, I hope, established in previous days' work (see notes above).
This is pretty tedious, painful and fraught with error. For example, if you don't erase the hints files from both nodes, restarting Cassandra on the downed node will result in repairs happening right away which is what I'm trying to step through. The factual entry points into the code are important too. Here are some exact steps:
# systemctl stop cassandra
INSERT INTO stratio.lucene ( mpid, dos, uri, data ) VALUES ( 504, '2017-01-01', '/home/searchadmin/cxml/Folder010/665892_504.xml.1', 'This is a test for mpid 504' );
# rm /var/lib/cassandra/hints/*
# systemctl restart cassandra
root@sampo:/var/log/cassandra# tail -f debug.log
* Note on grooming the debug.log:
Wait for sampo and louhi logs to settle down, then add blank lines. Note: this is an on-going requirement—before everything that causes lines to be added to debug.log, add blank lines so that, ultimately, you've got blank lines before the log entries you really want to see (and these aren't lost among all the other lines of the log file). If your objective is to debug, maybe this doesn't matter so much, but what's happening will be much clearer.