Cassandra Interview Questions and Answers

Find 100+ Cassandra interview questions and answers to assess candidates’ skills in NoSQL data modeling, replication, consistency, performance tuning, and distributed databases.
By WeCP Team

As organizations build highly scalable, fault-tolerant, and distributed data systems, recruiters must identify Apache Cassandra professionals who can design and manage databases capable of handling massive volumes of data with zero downtime. Cassandra is widely used for real-time analytics, IoT platforms, messaging systems, and high-availability applications.

This resource, "100+ Cassandra Interview Questions and Answers," is tailored for recruiters to simplify the evaluation process. It covers a wide range of topics from Cassandra fundamentals to advanced distributed database concepts, including data modeling, replication, partitioning, and performance tuning.

Whether you're hiring Cassandra Developers, Data Engineers, Big Data Engineers, or Distributed Systems Specialists, this guide enables you to assess a candidate’s:

  • Core Cassandra Knowledge: Cassandra architecture, nodes, clusters, keyspaces, tables, partitions, and CQL basics.
  • Advanced Skills: Data modeling for Cassandra, replication strategies, consistency levels, compaction strategies, indexing, and tuning read/write performance.
  • Real-World Proficiency: Designing scalable schemas, handling large-scale data ingestion, managing node failures, ensuring high availability, and monitoring Cassandra clusters in production.

For a streamlined assessment process, consider platforms like WeCP, which allow you to:

  • Create customized Cassandra assessments tailored to distributed databases and big data roles.
  • Include hands-on tasks such as writing CQL queries, designing data models, or troubleshooting cluster issues.
  • Proctor exams remotely while ensuring integrity.
  • Evaluate results with AI-driven analysis for faster, more accurate decision-making.

Save time, enhance your hiring process, and confidently hire Cassandra professionals who can build resilient, scalable, and high-performance distributed data systems from day one.

Cassandra Interview Questions

Cassandra – Beginner (1–40)

  1. What is Apache Cassandra?
  2. Explain the CAP theorem and where Cassandra fits.
  3. What is a node in Cassandra?
  4. What is a cluster?
  5. What is a data center in Cassandra?
  6. What is a keyspace?
  7. What is a table (column family) in Cassandra?
  8. What is a partition key?
  9. What is a clustering key?
  10. What is a primary key in Cassandra?
  11. What is a wide row?
  12. Why is Cassandra considered highly available?
  13. What is replication in Cassandra?
  14. What is replication factor?
  15. What is a consistency level?
  16. What is tunable consistency?
  17. Explain eventual consistency.
  18. What is CQL?
  19. What is an SSTable?
  20. What is a Memtable?
  21. What is a commit log?
  22. What are tombstones?
  23. What is compaction?
  24. What is compression in Cassandra?
  25. What is a read repair?
  26. What is hinted handoff?
  27. What is Gossip protocol?
  28. What is anti-entropy repair?
  29. What is a coordinator node?
  30. What is a bootstrap in Cassandra?
  31. What is nodetool?
  32. What is partitioning?
  33. Explain Murmur3 partitioner.
  34. What is a virtual node (vnode)?
  35. How does Cassandra scale horizontally?
  36. What is QUORUM consistency?
  37. What is LOCAL_QUORUM?
  38. What is ANY consistency level?
  39. What is a lightweight transaction (LWT)?
  40. What is the purpose of system keyspaces?

Cassandra – Intermediate (1–40)

  1. Explain how Cassandra writes data internally.
  2. Explain how Cassandra reads data internally.
  3. What is coordinator node responsibility during reads?
  4. What is coordinator node responsibility during writes?
  5. What is speculative retry?
  6. Explain the internal structure of SSTables.
  7. What is a bloom filter?
  8. What is a partition index?
  9. What is a compression offset map?
  10. What is a commit log segment?
  11. How does Cassandra ensure durability?
  12. What is incremental repair?
  13. What is full repair?
  14. What is anti-compaction?
  15. What causes tombstone buildup?
  16. What is TTL in Cassandra?
  17. How do you model 1-to-many relationships?
  18. How do you model time-series data?
  19. Why does Cassandra avoid joins?
  20. What is eventual vs strong consistency in Cassandra?
  21. What is batch operation and when not to use it?
  22. What is an unlogged batch?
  23. What is a logged batch?
  24. What is coordinator-side batching?
  25. What is secondary index?
  26. What are limitations of secondary indexes?
  27. What are materialized views?
  28. Explain problems with materialized views.
  29. What is the difference between TRUNCATE and DROP?
  30. What is a compaction strategy?
  31. Compare STCS, LCS, and TWCS.
  32. What is hinted handoff and when is it disabled?
  33. Explain consistency level EACH_QUORUM.
  34. What is read repair chance?
  35. What is LOCAL_ONE consistency?
  36. How do you handle data skew?
  37. What is hot partition?
  38. Explain write amplification in Cassandra.
  39. What is node decommissioning?
  40. What is repairing vs rebalancing?

Cassandra – Experienced (1–40)

  1. Explain the full read path end-to-end inside Cassandra.
  2. Explain the full write path end-to-end inside Cassandra.
  3. Explain the internal algorithm of compaction in detail.
  4. How does Cassandra achieve linear scalability?
  5. Explain leaderless replication architecture.
  6. How does Cassandra handle network partitions?
  7. Explain hinted handoff internal mechanics.
  8. Explain how Paxos works in LWT (Lightweight Transactions).
  9. How does Cassandra handle clock drift?
  10. Explain how gossip state propagation works.
  11. Explain the purpose and internals of Merkle trees.
  12. Explain why Cassandra needs anti-entropy repair.
  13. What happens during node bootstrap?
  14. What happens during node decommission?
  15. What happens during node replacement?
  16. Explain repair scheduling and impact.
  17. Explain how Cassandra identifies replicas.
  18. How does Cassandra handle read-before-write situations?
  19. How does coordinator choose replica for read?
  20. Explain speculative execution in deep detail.
  21. How do you design data models for multi-tenant systems?
  22. How do you identify and fix hot partitions?
  23. How do you optimize compaction strategies for time-series workloads?
  24. What are the internals of STCS?
  25. What are the internals of LCS?
  26. How to debug high tombstone reads?
  27. How to debug read latency issues?
  28. How to debug write latency issues?
  29. How does Cassandra ensure high availability during failures?
  30. Explain consistency level semantics during partition tolerance.
  31. Explain how hinted handoff can cause data bloat or overload.
  32. Explain internal metrics exposed by Cassandra (latency, pending tasks, TP stats).
  33. What happens during streaming operations?
  34. Explain what happens during compaction failure.
  35. Explain JVM tuning for Cassandra.
  36. How do you tune read path performance?
  37. How do you tune write path performance?
  38. How do you design for multi-region Cassandra deployments?
  39. What are typical anti-patterns in Cassandra data modeling?
  40. Describe disaster recovery strategies for Cassandra.

Cassandra Interview Questions and Answers

Beginner (Q&A)

1. What is Apache Cassandra?

Apache Cassandra is a distributed, wide-column NoSQL database (a partitioned row store) designed to handle massive amounts of data across many servers with no single point of failure. It was originally created at Facebook to power its Inbox Search feature and later open-sourced.

Cassandra is built for high availability, horizontal scalability, and fault tolerance, especially in scenarios where applications require:

  • Continuous uptime (24×7 availability)
  • Large-scale read/write throughput
  • Ability to store petabytes of data
  • Low-latency access across globally distributed regions

Cassandra uses a masterless architecture, meaning every node in the cluster is equal and can handle read/write requests. This eliminates bottlenecks and ensures that the failure of any node does not stop the system.

Key strengths include:

  • Linear scalability—adding more nodes increases throughput proportionally.
  • High availability—thanks to replication and no master node.
  • Tunable consistency—application can decide the balance between consistency and performance.
  • Write-optimized design with append-only storage and minimal disk seeks.

Cassandra is widely used by Netflix, Uber, Apple, Instagram, and other tech companies that need globally scalable systems.

2. Explain the CAP theorem and where Cassandra fits.

CAP Theorem states that in a distributed system, you can only guarantee two out of the following three:

  • Consistency (C): Every read gets the latest data.
  • Availability (A): Every request receives a response, even during failures.
  • Partition Tolerance (P): The system continues to work even if network partitions occur.

In reality, partition tolerance is mandatory for any distributed system spanning multiple machines.
So the trade-off is between Consistency and Availability.

Where Cassandra fits:
Cassandra is AP (Availability + Partition Tolerance) by default, meaning:

  • During network issues, Cassandra prefers availability, allowing the cluster to continue accepting reads and writes.
  • It uses eventual consistency to synchronize replicas later.

However, Cassandra is unique because it provides tunable consistency, allowing the user to choose:

  • Strong consistency (e.g., QUORUM, ALL)
  • Eventual consistency (e.g., ONE, ANY)

This flexibility makes Cassandra suitable for a wide range of use cases, from banking systems requiring consistency to social networks needing speed.

3. What is a node in Cassandra?

A node is the basic building block of a Cassandra cluster. It is a single machine (physical or virtual) that stores part of the data and participates in replication, querying, and cluster communication.

Each node in Cassandra:

  • Holds a portion of the data determined by the partitioner.
  • Reads and writes data independently (no master node).
  • Communicates with other nodes using the Gossip protocol.
  • Manages its own storage structures such as memtables, SSTables, and commit logs.
  • Can act as a coordinator node, routing client requests to appropriate replicas.
  • Can join or leave the cluster without downtime.

The strength of Cassandra lies in this peer-to-peer, masterless architecture, where every node is equal and contributes to high availability and fault tolerance.

4. What is a cluster?

A cluster is a collection of nodes that together hold the entire dataset in a distributed manner. The cluster is the top-level structure that Cassandra uses to distribute and replicate data.

Characteristics of a Cassandra cluster:

  • All nodes share a common cluster name.
  • The data is partitioned across nodes using consistent hashing.
  • Replicas of the same data are stored on multiple nodes based on replication factor.
  • The cluster ensures reliability by automatically distributing and balancing data.
  • Nodes work together without a master-slave hierarchy.

A Cassandra cluster can scale from a few machines to hundreds or thousands, and data automatically redistributes without downtime.

5. What is a data center in Cassandra?

A data center (DC) is a logical grouping of nodes within a cluster. Cassandra’s multi-data-center architecture allows organizations to distribute data geographically and handle regional traffic efficiently.

Reasons to use data centers:

  • Fault isolation—A failure in one DC doesn’t affect others.
  • Low latency—Applications connect to the nearest DC.
  • Workload segregation—Separate DCs for analytics, search, backups, or production traffic.
  • Disaster recovery—Cross-region replication for reliable failover.

Replication strategies like NetworkTopologyStrategy use data center awareness to define how many replicas should be stored in each DC.

For example:

  • DC1: replication_factor = 3
  • DC2: replication_factor = 2

This ensures strong multi-region resilience and performance.

6. What is a keyspace?

A keyspace is the top-level namespace in Cassandra, similar to a database in relational systems.

A keyspace defines:

  • Replication strategy (SimpleStrategy, NetworkTopologyStrategy)
  • Replication factor (number of replicas for each piece of data)
  • Durable writes setting

Example:

CREATE KEYSPACE myapp
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'DC1': 3
};

Every table belongs to a keyspace.
Keyspaces ensure that Cassandra is aware of where and how to store the data across nodes and data centers.

7. What is a table (column family) in Cassandra?

A table (often called a column family in earlier Cassandra versions) is where actual data is stored. It is conceptually similar to a table in SQL but with fewer restrictions and a flexible schema.

Features of Cassandra tables:

  • Data is stored using the primary key, which determines location and order.
  • Each row can have a different set of columns—Cassandra supports a semi-structured model.
  • Tables are optimized for fast writes and range queries on clustering columns.
  • Tables automatically generate SSTables and memtables during writes.

A table's schema is defined using CQL, specifying partition keys, clustering keys, and data types.
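
For illustration, a minimal table definition might look like the sketch below (the table and column names are hypothetical, not taken from the text above):

CREATE TABLE myapp.user_messages (
  user_id UUID,
  sent_at TIMESTAMP,
  message_id UUID,
  body TEXT,
  PRIMARY KEY ((user_id), sent_at, message_id)   -- partition key: user_id; clustering: sent_at, message_id
) WITH CLUSTERING ORDER BY (sent_at DESC, message_id ASC);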

8. What is a partition key?

The partition key is the most important part of the primary key in Cassandra. It determines which node will store a row of data.

Functions of the partition key:

  • Defines the distribution of data across the cluster.
  • Ensures load balancing—good partition keys avoid hotspots.
  • Determines the physical partition on disk.
  • Ensures that all rows with the same partition key are stored together.

Example:

PRIMARY KEY ((user_id), order_time)

Here:

  • user_id = partition key
  • order_time = clustering key

If the partition key is poorly chosen (e.g., a constant or low-cardinality field), Cassandra may suffer from hot partitions and performance bottlenecks.

9. What is a clustering key?

A clustering key defines the ordering of rows within a partition.
All rows with the same partition key are sorted by clustering columns, stored in sequence for fast range queries.

Role of clustering keys:

  • Enable efficient sorted queries within a partition.
  • Determine how data is stored on disk inside SSTables.
  • Allow support for time-series modeling, logs, and chronological ordering.
  • Provide support for range queries (e.g., >, <, BETWEEN).

Example:

PRIMARY KEY ((customer_id), order_date, order_id)

Here:

  • order_date and order_id are clustering keys.
  • Rows are ordered first by order_date, then by order_id.

Clustering keys are essential for fast retrieval of sorted or time-sequenced data.
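
As a sketch, a range query that exploits this clustering order could look like the following (the table name orders_by_customer and an integer customer_id are assumed for illustration):

SELECT * FROM orders_by_customer
WHERE customer_id = 42
  AND order_date >= '2024-01-01'
  AND order_date < '2024-02-01';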

10. What is a primary key in Cassandra?

A primary key uniquely identifies a row in a Cassandra table. It is composed of:

  1. Partition key — Determines data distribution
  2. Clustering keys — Determines sort order within a partition

Structure:

PRIMARY KEY ((partition_key), clustering_key1, clustering_key2, ...)

Example:

PRIMARY KEY ((user_id), login_time)

This means:

  • All rows with the same user_id go to the same partition.
  • Rows are sorted by login_time within that partition.

Importance of the primary key:

  • Defines the physical layout of data.
  • Controls partitioning, ordering, and uniqueness.
  • Drives performance—good primary keys ensure efficient queries.
  • Cassandra does not support joins, so primary key design is critical.

A well-designed primary key is the foundation of optimal performance, data distribution, and scalability.

11. What is a wide row?

A wide row in Cassandra refers to a row that contains a large number of columns or cells, often stored under the same partition key but ordered using clustering keys. Unlike relational databases where rows are typically fixed in width, Cassandra allows rows to grow extremely large, sometimes even containing millions of cells.

A wide row grows vertically based on clustering columns. For example, a time-series table where every sensor reading is stored under the same device ID will naturally create a wide row.

Characteristics of a wide row:

  • All data shares the same partition key.
  • Clustering keys define the ordering of data within the row.
  • Data may span multiple SSTables, but logically it remains a single row.
  • Enables efficient range queries, e.g., “get last 24 hours of readings.”

Use cases:

  • Time-series data
  • IoT sensor readings
  • Activity logs
  • Event timelines

Wide rows are extremely powerful, but poor design (e.g., unbounded growth) can create hot partitions, causing uneven load distribution. Proper sizing is essential to maintain performance.

12. Why is Cassandra considered highly available?

Apache Cassandra is considered highly available because of its masterless architecture, replication, and fault-tolerant design. Cassandra was engineered to remain operational even during node failures or network splits.

Key reasons for high availability:

1. Masterless peer-to-peer architecture

Every node is equal—there is no master, eliminating single points of failure.

2. Replication across multiple nodes

Data is copied (replicated) across nodes and even across multiple data centers, so if one node goes down, another replica can serve requests.

3. Tunable consistency

By using consistency levels like ONE or LOCAL_ONE, Cassandra can continue to serve reads and writes even when multiple nodes are unavailable.

4. Automatic failover

If a node fails:

  • The coordinator node routes the request to other replicas.
  • Hinted handoff temporarily stores missed writes.
  • Gossip protocol helps nodes detect failures quickly.

5. Handling of network partitions

Cassandra chooses availability over consistency by default (AP in CAP theorem), allowing the cluster to continue functioning during partial outages.

This design makes Cassandra ideal for applications requiring continuous (24×7) uptime, such as ecommerce, banking ledgers, IoT analytics, and large distributed systems.

13. What is replication in Cassandra?

Replication in Cassandra refers to the process of storing multiple copies of data across different nodes for reliability, durability, and availability.

When a write is performed:

  • The coordinator node writes the data to several replicas based on the replication factor.
  • These replicas may be located in different data centers depending on the replication strategy.

Replication ensures that:

  • The system remains available if a node or rack fails.
  • Reads can be served from multiple replicas for improved performance.
  • Data durability is increased because multiple nodes store the same data.

Cassandra supports two main replication strategies:

  1. SimpleStrategy — suited for development/testing.
  2. NetworkTopologyStrategy — used in production for multi-DC replication.

Replication is the foundation of Cassandra’s fault tolerance and ability to continue operating despite failures.

14. What is replication factor?

The replication factor (RF) is the number of copies (replicas) of each piece of data stored in a Cassandra cluster.

Example:

  • RF = 3 → Each piece of data exists on three different nodes.
  • RF = 2 → Two copies on two nodes.

The replication factor is configured per keyspace, not globally.

Impact of RF:

  • Higher RF = higher availability and durability, but more storage usage.
  • RF must always be greater than or equal to the highest consistency level you plan to use.
  • In multi-data center setups, RF is defined per DC, e.g.:
{
 'DC1': 3,
 'DC2': 2
}

A well-chosen RF is critical for balancing durability, availability, and storage cost.
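
As a sketch, per-DC replication factors are declared on the keyspace (the keyspace name myapp is illustrative):

ALTER KEYSPACE myapp
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'DC1': 3,
  'DC2': 2
};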

15. What is a consistency level?

A consistency level (CL) determines how many replicas must acknowledge a read or write before Cassandra considers the operation successful.

Consistency levels help balance:

  • Consistency
  • Availability
  • Latency

Common consistency levels include:

For both reads and writes:

  • ONE – Only one replica must respond.
  • TWO, THREE – Two or three replicas required.
  • QUORUM – Majority of replicas (RF/2 + 1).
  • ALL – All replicas must respond.

Multi-DC levels:

  • LOCAL_ONE
  • LOCAL_QUORUM
  • EACH_QUORUM

Usage examples:

In modern CQL, the consistency level is set by the client driver or in cqlsh rather than inside the statement itself. For example, in cqlsh:

  CONSISTENCY QUORUM;
  SELECT * FROM users;

Application drivers expose an equivalent per-statement or per-session consistency setting.

Consistency levels allow Cassandra to be used for both strong consistency and high availability use cases.

16. What is tunable consistency?

Tunable consistency is Cassandra’s ability to give applications the power to choose the strength of consistency for every read or write operation.

Unlike traditional databases (fixed consistency), Cassandra allows developers to tune each query for:

  • Low latency (weaker consistency)
  • High accuracy (stronger consistency)
  • Balance between the two

You can increase or decrease consistency per operation by selecting the appropriate consistency level.

Example:

  • If you want faster reads:
    Use LOCAL_ONE
  • If you want strong consistency:
    Use QUORUM or ALL

This flexibility is a major reason Cassandra is widely used for systems with mixed workloads—some queries need strong consistency while others prioritize speed.

17. Explain eventual consistency.

Eventual consistency means that after a write is made, all replicas will eventually converge to the same value, but they may not be immediately consistent.

Cassandra follows eventual consistency because:

  • Writes go to multiple replicas, but one may temporarily be down.
  • Gossip, read repair, hints, and repairs are used to synchronize data later.

Cassandra uses several mechanisms to achieve eventual consistency:

1. Hinted Handoff

If a replica is down, the coordinator stores a hint and applies it later.

2. Read Repair

During reads, inconsistent replicas are repaired automatically.

3. Anti-Entropy Repair

Periodic full repairs ensure all nodes eventually match.

4. Merkle Trees

Used to detect inconsistencies efficiently.

Eventual consistency allows Cassandra to stay available during failures while guaranteeing that data consistency is restored eventually.

18. What is CQL?

CQL (Cassandra Query Language) is Cassandra’s SQL-like query language that simplifies interacting with the database. CQL abstracts Cassandra’s underlying storage architecture and provides a familiar, relational-style syntax.

Purpose of CQL:

  • Perform CRUD operations (SELECT, INSERT, UPDATE, DELETE)
  • Create schemas (tables, keyspaces)
  • Manage data types
  • Handle batching and prepared statements

Example syntax:

CREATE TABLE users(
  user_id UUID PRIMARY KEY,
  name TEXT,
  email TEXT
);

Key characteristics:

  • No JOINs
  • No subqueries
  • No foreign keys
  • Query patterns must follow data model rules

CQL makes Cassandra much easier to use compared to early versions that required Thrift APIs.

19. What is an SSTable?

An SSTable (Sorted String Table) is an immutable, disk-based data file used by Cassandra to store data permanently. SSTables are created during flush operations when in-memory data (memtables) is written to disk.

Key features:

  • Immutable (never changed once written)
  • Data is sorted by partition key and clustering key
  • Stored as multiple components:
    • Data file
    • Index file
    • Summary file
    • Bloom filter
    • Compression offsets

Benefits of SSTables:

  • Fast sequential writes
  • Efficient reads using indexes and bloom filters
  • Compaction merges SSTables to reduce fragmentation
  • Immutable nature avoids locking issues

SSTables are the backbone of Cassandra’s storage engine, enabling extremely fast writes and large-scale persistence.

20. What is a Memtable?

A Memtable is an in-memory, sorted data structure where Cassandra stores written data temporarily before flushing it to disk as an SSTable.

Write path process:

  1. Data is written to the commit log for durability.
  2. Data is written to the memtable for fast in-memory access.
  3. When the memtable becomes full, it is flushed to disk as an SSTable.

Characteristics of Memtables:

  • They store data in sorted order.
  • Each table has its own memtable.
  • Memtables are volatile (in-memory), but commit logs ensure durability.
  • Flushing is triggered by thresholds such as size or time.

Memtables provide extremely fast write performance because writes are applied in RAM before they are persisted.

21. What is a commit log?

The commit log is a crucial durability component in Cassandra’s write path. It is an append-only log file on disk where every write operation is recorded before being acknowledged. This ensures that data is not lost even if the node crashes before the in-memory data (memtable) is flushed to disk.

How commit log works in the write path:

  1. Client sends a write request.
  2. Cassandra writes the data to the commit log (on disk) to guarantee durability.
  3. The same data is written to the memtable (in RAM) for fast access.
  4. Once the memtable is full, it is flushed to disk as an SSTable.
  5. After successful flush, the related portion of commit log can be discarded.

Key properties of the commit log:

  • Durability — protects against power loss or crashes.
  • Append-only — minimizes disk seeks and improves write speed.
  • Segmented — split into segments so they can be deleted after flush.
  • Sequential writes — extremely fast due to no random I/O.

Because of the commit log, Cassandra achieves fast, reliable writes, which is one of its strongest design features.

22. What are tombstones?

A tombstone in Cassandra is a marker indicating that a data item (a row, column, or cell) has been deleted. Cassandra does not delete data immediately from SSTables because SSTables are immutable. Instead, it writes a tombstone to mark that the data should be ignored.

When tombstones are created:

  • DELETE operations
  • Expired TTL (Time-To-Live)
  • Updates that overwrite an existing column with null
  • Dropping a column

Why tombstones exist:

  • Keep deletes consistent across replicas
  • Allow safe removal only after compaction
  • Support eventual consistency (slow/failed replicas may receive delete later)

Problems tombstones can cause:

  • Too many tombstones can slow reads (known as tombstone-heavy reads)
  • Large partitions with many tombstones can cause timeouts
  • Compaction takes longer

After the gc_grace_seconds period, compaction removes the tombstones permanently from SSTables.
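
For illustration, a DELETE simply writes a tombstone, and gc_grace_seconds can be tuned per table (the table name events and the one-day value below are assumptions, not recommendations):

-- marks the row with a tombstone; the data is purged later by compaction
DELETE FROM events WHERE user_id = 42 AND event_time = '2024-06-01 10:00:00';

-- shorten how long tombstones are retained before they become purgeable
ALTER TABLE events WITH gc_grace_seconds = 86400;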

23. What is compaction?

Compaction is a background process in Cassandra that merges multiple SSTables into fewer, larger SSTables, removing deleted data (tombstones) and reducing fragmentation.

Why compaction is needed:

  • SSTables are immutable → new writes create new SSTables
  • Too many SSTables slow reads (more files to search)
  • Tombstones accumulate and must be purged
  • Merging improves read performance

What happens during compaction:

  1. Cassandra selects several SSTables based on the compaction strategy.
  2. Data from these SSTables is read, merged, sorted, and rewritten.
  3. Expired or deleted data (tombstones) is removed.
  4. Old SSTables are deleted.

Common compaction strategies:

  • STCS (SizeTieredCompactionStrategy) – default
  • LCS (LeveledCompactionStrategy) – ideal for read-heavy workloads
  • TWCS (TimeWindowCompactionStrategy) – ideal for time-series data

Compaction is essential for performance, data correctness, and long-term storage efficiency.
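
The compaction strategy is configured per table. A minimal sketch, assuming a hypothetical time-series table named sensor_readings:

ALTER TABLE sensor_readings
WITH compaction = {
  'class': 'TimeWindowCompactionStrategy',
  'compaction_window_unit': 'DAYS',
  'compaction_window_size': 1
};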

24. What is compression in Cassandra?

Compression in Cassandra reduces the size of SSTables stored on disk, decreasing storage requirements and improving I/O efficiency.

Key benefits of compression:

  • Reduces disk space usage (high savings for large datasets)
  • Improves disk I/O speed (less data to read/write)
  • Reduces network transfer size during streaming (repairs, bootstrap)

Common compression algorithms used:

  • LZ4 (default) — high-speed compression/decompression
  • Snappy — fast but slightly less compression
  • Deflate — high compression ratio but slower

How compression works inside SSTables:

  • Cassandra compresses data in chunks (16 KB or 64 KB by default, depending on the version).
  • A compression offset map allows Cassandra to jump to the correct chunk when reading data.
  • Reads do not require decompressing the entire file, only specific chunks.

Compression is transparent to the user and configurable per table, offering a balance between speed and storage efficiency.
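
Compression is likewise configured per table. A minimal sketch (the table name and chunk size are illustrative, not recommendations):

ALTER TABLE sensor_readings
WITH compression = {
  'class': 'LZ4Compressor',
  'chunk_length_in_kb': 16
};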

25. What is a read repair?

A read repair is a mechanism in Cassandra that ensures data consistency among replicas during read operations.

Why read repair occurs:

Because Cassandra prioritizes availability and may allow inconsistent replicas temporarily (due to network failures, node crashes, or delayed writes).

How read repair works:

  1. A read request is sent to multiple replicas.
  2. Coordinator compares responses.
  3. If mismatch is found (stale values):
    • The coordinator sends the latest correct version to outdated replicas.
  4. Replicas update themselves automatically.

Types of read repair:

  • Blocking read repair — triggered when reading at QUORUM or higher.
  • Background (asynchronous) read repair — controlled by the read_repair_chance table option in older versions (removed in Cassandra 4.0).

Read repair ensures eventual consistency without running a full repair process manually.

26. What is hinted handoff?

Hinted handoff is a Cassandra mechanism where the coordinator stores a hint when a replica is down or temporarily unavailable to receive a write.

How hinted handoff works:

  1. A write occurs.
  2. One replica is down.
  3. Coordinator stores a "hint" (a small record containing the missed write).
  4. When the replica comes back online:
    • The coordinator replays the hints to update that node.

Hints are temporary: they are retained only for a limited window (max_hint_window_in_ms, three hours by default). If a replica stays down longer than that, a repair is needed to restore consistency.

Benefits:

  • Maintains write availability even if replicas are down
  • Reduces inconsistency windows
  • Helps avoid expensive repair operations

However, if many nodes are down for a long time, hints may accumulate and put pressure on the coordinator.

27. What is Gossip protocol?

The Gossip protocol in Cassandra is a peer-to-peer communication protocol that nodes use to exchange information about:

  • Their own health status
  • Other nodes' status (reachable/unreachable)
  • Schema changes
  • Token ownership

Gossip runs every second and spreads information quickly across the cluster.

Key features:

  • Decentralized
  • Lightweight
  • Fault-tolerant
  • Eventually consistent state sharing

Each gossip round involves:

  1. A node randomly picks another node.
  2. They exchange state information.
  3. Knowledge spreads exponentially (like virus transmission).

Without Gossip, Cassandra would not know which nodes are alive, dead, or changed, making coordination impossible.

28. What is anti-entropy repair?

Anti-entropy repair (often just called repair) is a process in Cassandra to ensure that all replicas of a partition have the same data.

Why repair is needed:

  • Nodes may miss writes due to failures or network issues.
  • Hinted handoff might not be enough.
  • Tombstones need to be propagated before expiry.

How repair works:

  1. Cassandra compares data between replicas using Merkle trees.
  2. Differences are detected at a granular level.
  3. Missing data is streamed to nodes that are outdated.

Repairs are expensive, but they are essential. They should be done regularly to prevent data inconsistency.

You can use:

  • Full repair
  • Incremental repair
  • Subrange repair (specific token ranges)
  • Per-keyspace or per-table repair

Regular repair ensures consistent replicas and prevents data resurrection after tombstone expiration.

29. What is a coordinator node?

A coordinator node is the node that receives a read or write request from the client. It does not need to own the data itself.

Coordinator responsibilities:

  • Determines which replicas hold the data using the partitioner.
  • For writes:
    • Sends the write to all required replicas.
    • Waits for the number of acknowledgments defined by the consistency level.
  • For reads:
    • Queries replicas.
    • Performs read repair if needed.
    • Returns merged result to the client.
  • Handles retries and speculative executions.

Any node in Cassandra can act as a coordinator because of the masterless architecture.
This provides high availability and load distribution.

30. What is a bootstrap in Cassandra?

Bootstrap is the process of adding a new node to a Cassandra cluster. It involves transferring data (streaming) from existing nodes to the new node so that it becomes a full participant.

Steps in bootstrap:

  1. New node starts with an empty data directory.
  2. Node contacts seed nodes through the Gossip protocol.
  3. It determines its token range.
  4. Existing nodes stream data belonging to that range.
  5. The new node fully joins the cluster once streaming finishes.

Bootstrap ensures:

  • Proper load distribution
  • Data balance across nodes
  • Automatic integration into Gossip and cluster state

Bootstrap is essential for Cassandra’s horizontal scalability—just add a node, and Cassandra handles the rest automatically.

31. What is nodetool?

nodetool is a powerful command-line administrative tool used to manage and monitor Cassandra nodes. It interacts directly with a node through JMX (Java Management Extensions) and provides nearly all operational functionalities you need for maintenance, troubleshooting, and cluster inspection.

Key capabilities of nodetool:

  • Monitoring
    • nodetool status — cluster and node health
    • nodetool info — system info
    • nodetool tpstats — thread pool stats
  • Repair and maintenance
    • nodetool repair — anti-entropy repair
    • nodetool cleanup — remove data no longer owned
    • nodetool compact — triggers manual compaction
  • Node lifecycle management
    • nodetool decommission — safely remove a node
    • nodetool bootstrap resume — resume a stalled bootstrap (streaming progress is visible via nodetool netstats)
    • nodetool drain — gracefully stop writes before shutdown
  • Cache and flush operations
    • nodetool flush — flush memtables to SSTables
    • nodetool invalidatekeycache — clear caches

Why nodetool is important:

  • Essential for DevOps, SREs, and DBAs managing production clusters
  • Helps diagnose performance issues
  • Allows safe node management

In Cassandra, nodetool is the primary administrative interface for real-world operations.

32. What is partitioning?

Partitioning is the process Cassandra uses to distribute data across multiple nodes in a cluster. It determines which node stores which rows based on the value of the partition key.

How partitioning works:

  1. Cassandra applies a partitioner function to the partition key.
  2. This generates a token, which is a large integer.
  3. Each node owns a range of tokens.
  4. The row is stored on the node responsible for the token’s range.

Why partitioning matters:

  • Ensures even distribution of data
  • Prevents hotspots
  • Enables linear scalability
  • Determines replica placement

Incorrect partitioning (e.g., low-cardinality keys) can lead to hot partitions, causing performance bottlenecks.

Cassandra’s ability to scale and perform efficiently is heavily dependent on good partitioning design.

33. Explain Murmur3 partitioner.

The Murmur3Partitioner is the default partitioner in Cassandra. It uses the Murmur3 hashing algorithm to hash the partition key into a uniform 64-bit token.

Why Murmur3 is used:

  • Fast — optimized for speed
  • Uniform distribution — avoids hotspots
  • Deterministic — same key always maps to the same token
  • Low collision probability

How it works:

  1. Partition key → Murmur3 hash → 64-bit token
  2. Token → mapped to a node’s token range
  3. Replicated according to replication factor

Benefits:

  • Ensures even data distribution across all nodes
  • Allows smooth scaling as nodes are added
  • Supports virtual nodes (vnodes) effectively

Murmur3 partitioner is one of the foundations of Cassandra’s performance and scalability.

34. What is a virtual node (vnode)?

A virtual node (vnode) is a small token range assigned to a Cassandra node. Instead of each node owning one large token range, it now owns multiple smaller ranges.

Why vnodes exist:

Originally, Cassandra required manual assignment of token ranges per node. Scaling or replacing nodes was difficult. Vnodes solve this.

How vnodes work:

  • Each Cassandra node holds multiple tokens (256 by default in older versions, 16 by default in Cassandra 4.0+).
  • The cluster has thousands of small token ranges.
  • When adding/removing nodes, ranges can be reassigned easily.

Benefits of vnodes:

  • Faster node bootstrap and decommission (less data to move)
  • Better load balancing
  • Improved fault tolerance
  • Simpler operations (no manual token management)

Vnodes are now standard in modern Cassandra clusters.

35. How does Cassandra scale horizontally?

Cassandra provides true horizontal scalability, meaning you can increase capacity by simply adding more nodes.

Key reasons Cassandra scales horizontally:

1. Masterless architecture

All nodes are equal and can accept read/write requests. No bottleneck.

2. Automatic data distribution

Partitioner ensures data is spread evenly across the cluster.

3. Replication across nodes

More nodes → more replicas → higher throughput.

4. Virtual nodes (vnodes)

Make redistribution of token ranges automatic and balanced.

5. Linear scalability

If you double the number of nodes, throughput nearly doubles as well.

6. No downtime requirement

New nodes can be added live using bootstrap, with no application interruption.

Horizontal scaling benefits:

  • Handle more writes
  • Handle more reads
  • Handle more storage
  • Support multi-data-center expansion

Cassandra is specifically designed for clusters of hundreds or thousands of nodes.

36. What is QUORUM consistency?

QUORUM is a commonly used strong consistency level in Cassandra. It ensures that a majority of replicas must respond before the operation is considered successful.

Formula for QUORUM:

QUORUM = (Replication Factor / 2) + 1

For RF = 3, QUORUM = 2
For RF = 5, QUORUM = 3

Properties of QUORUM:

  • Guarantees no stale reads when used for both read and write
  • Balances performance and consistency
  • Ensures that the majority replica set is up-to-date

Example:

Write at QUORUM → at least 2 replicas get the write
Read at QUORUM → coordinator queries at least 2 replicas and picks the latest timestamp

When writes use QUORUM and reads use QUORUM, you achieve strong consistency.

37. What is LOCAL_QUORUM?

LOCAL_QUORUM is a consistency level that ensures a majority of replicas within the local data center respond to a read or write.

Why LOCAL_QUORUM matters:

  • Avoids cross-data-center latency
  • Enables strong consistency within a single region
  • Ideal for multi-DC setups where each DC handles its own traffic

Example (RF = 3 in DC1 and 3 in DC2):

LOCAL_QUORUM = 2 responses from the same DC

If the request goes to DC1:

  • Only DC1 replicas matter
  • DC2 replicas are not contacted

Benefits:

  • Low-latency strong consistency
  • Fault-tolerant within the data center
  • Recommended for production multi-region setups (e.g., Netflix)

LOCAL_QUORUM is one of the most used consistency levels in real-world deployments.

38. What is ANY consistency level?

ANY is the weakest consistency level in Cassandra and applies only to writes (not reads).

Behavior of ANY:

  • Write succeeds if any replica or even a hinted handoff stores the data.
  • Even if all replicas are down, coordinator logs a hint and returns success.

Why ANY exists:

  • Provides maximum write availability
  • Ensures writes never fail, even during full replica outage

Trade-offs:

  • Does NOT guarantee data is immediately written to a replica
  • Strongly eventual — consistency will be restored later
  • Should NOT be used for critical data

ANY is useful for analytics pipelines, logging, and IoT ingestion, where occasional data loss is tolerable and write availability matters more than consistency.

39. What is a lightweight transaction (LWT)?

Lightweight transactions (LWTs) in Cassandra provide conditional updates using a distributed consensus protocol based on Paxos.

LWT ensures linearizable consistency for specific operations, unlike normal Cassandra operations which are eventually consistent.

Use case example:

Prevent inserting a duplicate username:

INSERT INTO users(username, email)
VALUES ('alice', 'alice@example.com')
IF NOT EXISTS;

How LWT works:

  1. Paxos prepare phase — Coordinator asks replicas if they accept the proposal.
  2. Proposal phase — Vote is cast using timestamps.
  3. Commit phase — If consensus is reached, value is written.
  4. Learn phase — Replicas update and confirm decision.

Benefits:

  • Provides serial consistency
  • Guarantees atomic compare-and-set operations

Costs:

  • Slower due to multiple network round trips
  • Should be used sparingly

LWT is powerful for enforcing constraints in a distributed environment.
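
A conditional update (compare-and-set) goes through the same Paxos path. The sketch below assumes username is the table's primary key, as in the INSERT above; the values are purely illustrative:

UPDATE users
SET email = 'alice@new-domain.com'
WHERE username = 'alice'
IF email = 'alice@example.com';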

40. What is the purpose of system keyspaces?

System keyspaces are internal keyspaces used by Cassandra to store metadata and maintain cluster functionality. They are essential to the proper operation of the cluster.

Common system keyspaces:

  • system — stores core metadata (token ranges, cluster name, schema)
  • system_schema — stores schema definitions
  • system_auth — stores authentication and authorization data
  • system_distributed — stores repair history, view build status
  • system_traces — stores query tracing information
  • system_views — stores materialized view metadata

Purpose of system keyspaces:

  • Track node states and token ownership
  • Persist schema across restarts
  • Store authentication/authorization policies
  • Support tracing, repairs, and compactions
  • Maintain cluster topology information

These keyspaces should never be modified manually, as they are vital to cluster health.
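
They can, however, be safely queried read-only for diagnostics. For example, on most recent Cassandra versions:

SELECT peer, data_center, rack FROM system.peers;
SELECT keyspace_name, table_name FROM system_schema.tables;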

Intermediate (Q&A)

1. Explain how Cassandra writes data internally.

Cassandra’s write path is designed for speed, durability, and low latency, and is one of the biggest reasons Cassandra achieves massive write throughput. Writes in Cassandra are append-only and do not require costly disk seeks.

Here is the detailed internal write flow:

Step-by-step write path:

  1. Client sends a write request to any node (the coordinator node).
  2. Coordinator determines replicas using the partitioner + token ring.
  3. Write sent to replicas based on consistency level (ONE, QUORUM, etc.).
  4. Each replica performs the following:
    • Append to Commit Log:
      The write is appended sequentially to the commit log on disk.
      This ensures durability even if the node crashes before flushing to SSTables.
    • Update Memtable:
      The write is applied to the memtable (in-memory sorted structure).
      This supports fast in-memory reads and prepares for a flush later.
  5. Coordinator waits for required acknowledgments from replicas based on consistency level.
  6. Memtable Flush:
    When a memtable becomes full:
    • It is flushed to disk as an immutable SSTable.
    • Corresponding commit log segments are marked as safe for deletion.

Key Concepts:

  • Writes never read existing data (no read-before-write).
  • No updates in-place on disk → everything is append-only.
  • No locking → supports massive concurrency.

This architecture allows Cassandra to handle millions of writes per second with minimal latency.

2. Explain how Cassandra reads data internally.

Cassandra’s read path is more complex than the write path because it must search multiple in-memory and on-disk structures to reconstruct a complete, up-to-date row.

Detailed internal read flow:

  1. Client sends a read request to a coordinator node.
  2. Coordinator determines replicas for the partition.
  3. Depending on consistency level (ONE, QUORUM, etc.), coordinator sends requests to replicas.
  4. Each replica performs the following:

Replica read path (important):

  1. Check Row Cache (if enabled)
    If present, full row is returned immediately.
  2. Check Key Cache (if enabled)
    Helps jump directly to the partition index in SSTables.
  3. Memtable lookup:
    Most recent writes reside in the memtable.
  4. SSTable lookup sequence:
    Cassandra checks SSTables in order:
    • Bloom filter → quick check if partition might exist
    • Partition index → locate position in SSTable
    • Compression offset map → find compressed block
    • Decompress and read data
    • Merge column data with other SSTables and memtable

Coordinator merges results:

  • Compares timestamps and selects the newest value.
  • Performs read repair if replica values differ.
  • Returns result to the client.

This multi-layered lookup ensures that Cassandra reads the most recent and consistent version of the data.

3. What is coordinator node responsibility during reads?

The coordinator node orchestrates the entire read operation. It does not need to own the data itself.

Coordinator node responsibilities:

  1. Determine replicas
    Using partition key → token → replica nodes.
  2. Send read requests
    Based on consistency level:
    • ONE → send to 1 replica
    • QUORUM → send to multiple replicas
    • One replica is typically sent a full data request and the others digest requests, which also supports read repair
  3. Merge data from replicas
    • Cassandra uses timestamps (last-write-wins)
    • Coordinator picks newest values
  4. Perform read repair
    • If replicas have inconsistent data
    • Coordinator pushes latest values to outdated replicas
  5. Retry on failure or timeout
    • If a replica is slow/unresponsive, coordinator triggers speculative retry
  6. Return final merged result to client

The coordinator ensures the read is correct, consistent, and within SLA even if some replicas lag or are unavailable.

4. What is coordinator node responsibility during writes?

During writes, the coordinator ensures that the write is applied successfully across replicas according to the chosen consistency level.

Coordinator responsibilities during write path:

  1. Locate replicas
    Compute token and determine replica nodes.
  2. Send write requests
    Based on consistency level:
    • ONE → wait for 1 replica
    • QUORUM → wait for majority
    • ALL → wait for all replicas
  3. Handle unavailable replicas
    • If a replica is down, coordinator creates hints for hinted handoff.
  4. Collect acknowledgments
    Coordinator waits for required number of ACKs to satisfy consistency.
  5. Acknowledge client
    Once enough replicas confirm the write, coordinator returns success.
  6. Manage timestamp reconciliation
    Cassandra uses timestamp-based last-write-wins; coordinator passes timestamp to replicas.

Coordinator ensures writes are durable, consistent, and fault-tolerant across the cluster.

5. What is speculative retry?

Speculative retry is a performance optimization that minimizes latency caused by slow or overloaded replicas.

How it works:

  1. Coordinator sends read request to replicas.
  2. If a replica is slow (detected by percentile-based latency settings), the coordinator:
    • Sends a duplicate read request to another replica.
    • Uses the fastest response.
  3. Slow replica responses are discarded.

Why it’s important:

  • Some nodes may be temporarily overloaded.
  • Avoids tail latency and stuck reads.
  • Improves overall SLA for read-heavy workloads.

Speculative retry does not compromise consistency because Cassandra still respects the consistency level required by the query.

6. Explain the internal structure of SSTables.

An SSTable (Sorted String Table) is an immutable on-disk data file. Each SSTable consists of multiple components, stored separately for efficiency.

Internal components of SSTables:

  1. Data.db
    • Actual data stored in sorted order
    • Contains partitions, rows, and columns
  2. Index.db
    • Index of partition keys → offset in Data.db
    • Helps quickly find the start of each partition
  3. Summary.db
    • A sampled, lightweight version of the index
    • Reduces index scanning overhead
    • Improves performance
  4. Bloom Filter
    • Probabilistic structure to check if a partition may exist
    • Avoids expensive disk reads for non-existent keys
  5. CompressionInfo.db
    • Contains compression offset map
    • Maps compressed blocks to actual data positions
  6. Statistics.db
    • Stores metadata: min/max timestamps, row counts, etc.
  7. TOC.txt
    • Lists all files that make up the SSTable

Why this structure exists:

  • Enables fast range scans
  • Efficient compressed storage
  • Avoids disk seeks
  • Improves read performance

The SSTable architecture is one of Cassandra’s strongest engineering achievements.

7. What is a bloom filter?

A Bloom filter is a probabilistic data structure that helps Cassandra quickly determine whether a partition key might exist in an SSTable.

Properties of bloom filters:

  • Fast (O(1)) membership checks
  • Zero false negatives: if the filter says "not present," the partition is guaranteed to be absent
  • Possible false positives: a "maybe" answer still requires checking the index and data file
  • Saves expensive disk I/O by avoiding unnecessary reads

Bloom filter in read path:

  1. Read request arrives at SSTable.
  2. Bloom filter checks if key might exist.
  3. If bloom filter says “no”—we skip that SSTable.
  4. If “maybe”—we check index + data file.

Bloom filters drastically reduce latency for read-heavy workloads.
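
The target false-positive rate is tunable per table; a minimal sketch (the 1% value is an arbitrary illustration):

ALTER TABLE users WITH bloom_filter_fp_chance = 0.01;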

8. What is a partition index?

A partition index is a structure inside each SSTable that maps partition keys to the file offsets in the data file where those partitions begin.

Purpose:

  • Fast direct lookup of partitions in the SSTable
  • Works together with bloom filter and summary file

How it works:

  1. Bloom filter says partition might exist.
  2. Partition index is used to find the exact on-disk location.
  3. Cassandra jumps directly to that byte offset.

The partition index reduces search time and avoids unnecessary disk scanning.

9. What is a compression offset map?

The compression offset map is a metadata structure used to locate the compressed blocks inside an SSTable.

Why needed:

SSTables can be compressed in chunks (16 KB or 64 KB by default, depending on the version).
When reading data, Cassandra needs to know which block contains the target row.

Compression offset map includes:

  • Mapping of uncompressed position → compressed position
  • Metadata to allow direct jumps to compressed segments

Use in read path:

  1. Partition index gives byte position.
  2. Compression offset map finds which compressed block contains it.
  3. Cassandra decompresses only that block, not the whole SSTable.

This improves performance while keeping disk usage efficient.

10. What is a commit log segment?

A commit log segment is a chunk of the commit log file where Cassandra stores write-ahead log entries before flushing memtables to disk.

Commit log segmentation:

  • Commit log is divided into fixed-size segments (e.g., 32 MB).
  • When one segment fills, Cassandra creates a new one.
  • When memtables associated with a segment are flushed, that segment is recycled or deleted.

Benefits of segmentation:

  • Faster deletion after flush
  • Better crash recovery
  • Efficient sequential disk writes

Crash recovery:

  • On restart, Cassandra replays only non-flushed segments
  • Ensures no data loss

Commit log segments are the backbone of Cassandra’s durability and write performance.

11. How does Cassandra ensure durability?

Cassandra ensures durability—the guarantee that once a write is acknowledged, it will survive crashes or power failures—through multiple layers in its write path.

1. Commit Log for Write-Ahead Logging

Every write is first appended to the commit log on disk.

  • Writes are sequential, making them extremely fast.
  • Even if the node crashes before flushing memtables, commit log replay recovers the data.

2. Memtable for Fast In-Memory Writes

After hitting the commit log, data is stored in the memtable, an in-memory structure.
Memtables are periodically flushed to SSTables.

3. SSTables Are Immutable and Durable

Once flushed, data becomes part of SSTables, which are:

  • On-disk
  • Append-only
  • Immutable (no overwrites)
  • Crash-safe

4. Multiple Replicas

Data is written to multiple nodes depending on the replication factor (RF).
Even if one node fails permanently, replicas preserve data.

5. Hinted Handoff

If a replica is down during the write, a hint is stored and replayed when the node is back.

6. Tunable Consistency

Durability guarantees improve with stronger consistency like QUORUM or ALL.

7. Repair Mechanisms

Anti-entropy repair ensures replicas converge even long after failures.

Through these mechanisms, Cassandra achieves strong durability without sacrificing write performance.

12. What is incremental repair?

Incremental repair is an optimized version of Cassandra’s repair process that repairs only the data that has changed since the last repair, instead of scanning the entire dataset.

Why incremental repair exists:

  • Full repair scans everything, which is expensive for large datasets.
  • Incremental repair reduces workload and network traffic.

How incremental repair works:

  • Data is divided into repair sessions.
  • Each SSTable is marked with a repair generation.
  • Only SSTables with new or changed repair generations are compared.

Benefits:

  • Reduced CPU usage
  • Faster repair operations
  • Minimizes network streaming
  • Less IO overhead

Challenges:

  • Historically buggy; it has become reliable only in newer Cassandra versions (4.0+).
  • Requires scheduled execution for best results.

Incremental repair is critical for maintaining cluster consistency in large environments.

13. What is full repair?

A full repair is a comprehensive repair process that compares all data across all replicas for a given token range, regardless of whether the data has changed since the last repair.

Behavior of full repair:

  • Reads all data from every replica
  • Builds Merkle trees for each replica
  • Compares trees to detect inconsistencies
  • Streams missing or outdated data to fix inconsistencies

Advantages:

  • Guarantees full data convergence
  • Cleans up missed writes and downed-replica issues
  • Removes risk of "zombie data" (resurrected deleted values)

Disadvantages:

  • Very CPU-intensive
  • Heavy network streaming
  • High IO overhead
  • Slow for large datasets

Full repair is often used:

  • During major cluster maintenance
  • After significant outages
  • When incremental repair is not trusted

Production clusters typically mix full repair (periodic) with incremental repair (routine).

14. What is anti-compaction?

Anti-compaction is a process triggered during incremental repair. It separates SSTables into repaired and unrepaired components, making future repairs more efficient.

Why anti-compaction is needed:

  • Incremental repair must track what data has already been repaired.
  • Separating repaired and unrepaired data makes comparisons faster.

How it works:

  1. During repair, Cassandra identifies data ranges that were repaired.
  2. SSTables containing those ranges are split into:
    • Repaired SSTables → no need to include in the next incremental repair
    • Unrepaired SSTables → may need repair later
  3. Metadata is updated with the new repair status.

Benefits:

  • Reduces future repair workloads
  • Improves read and compaction performance
  • Keeps repair generation consistent

Anti-compaction is integral to incremental repair logic.

15. What causes tombstone buildup?

Tombstone buildup occurs when too many delete markers accumulate, creating performance and storage problems.

Common causes:

  1. Frequent deletes
    Cassandra does not immediately remove deleted data; it writes tombstones instead.
  2. Low TTL values
    When many rows expire frequently, tombstones rapidly accumulate.
  3. Large partitions
    Tombstones in wide partitions cause heavy read amplification.
  4. Long gc_grace_seconds
    Tombstones persist until the grace period expires (default: 10 days).
  5. High delete churn
    Repeated deletes of the same rows generate many tombstones.
  6. Materialized views
    MV updates create tombstones under the hood.

Risks of tombstone buildup:

  • Slower reads (must scan tombstones)
  • Timeouts
  • Memory pressure
  • Compaction overhead

Proper modeling and TTL usage are critical to controlling tombstones.

16. What is TTL in Cassandra?

TTL (Time-To-Live) is a feature in Cassandra that automatically expires (deletes) data after a specified time.

How TTL works:

  • Each write can include a TTL value in seconds.
  • After TTL expires, the data is marked with a tombstone.
  • During compaction, tombstones remove expired data permanently.

Example:

INSERT INTO session_data (user_id, token)
VALUES ('123', 'abc') 
USING TTL 3600;

This data expires after 1 hour.

TTL can be applied to:

  • Individual columns
  • Entire rows
  • Default TTL at table level
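
A table-level default TTL and a TTL lookup might look like this (table and column names are illustrative):

-- Every row in this table expires 24 hours after it is written, unless a per-write TTL overrides it
CREATE TABLE session_data_v2 (
  user_id text PRIMARY KEY,
  auth_token text
) WITH default_time_to_live = 86400;

-- Check how many seconds of life a column value has left
SELECT TTL(auth_token) FROM session_data_v2 WHERE user_id = '123';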

Use cases:

  • Session management
  • Time-bound caching
  • IoT logs
  • Temporary data

Caution:

Overusing TTL creates many tombstones → potential performance issues.

17. How do you model 1-to-many relationships?

Cassandra does not support joins, so you denormalize and store the 1:M relationship within a single table, typically using the parent ID as the partition key and the child attributes as clustering columns.

Example: User → Orders

Table: orders_by_user

PRIMARY KEY (user_id, order_time)
  • user_id = Partition key
  • order_time = Clustering key (sorted by time)
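
A sketch of this table and its typical query (non-key columns are illustrative; newest-first ordering is a design choice, not a requirement):

CREATE TABLE orders_by_user (
  user_id text,
  order_time timestamp,
  order_id uuid,
  total decimal,
  PRIMARY KEY (user_id, order_time)
) WITH CLUSTERING ORDER BY (order_time DESC);

-- Fetch the 10 most recent orders for one user from a single partition
SELECT * FROM orders_by_user WHERE user_id = 'u42' LIMIT 10;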

Why this works:

  • Fetch all orders for a user via 1 partition.
  • Cassandra excels at partition-local queries.
  • Range queries (e.g., last N orders) become efficient.

Alternative modeling patterns:

  • Duplicate the data across multiple tables (query-based modeling)
  • Use bucketed partitions to avoid extremely wide rows

Cassandra modeling always starts from query-first design.

18. How do you model time-series data?

Time-series modeling is one of Cassandra’s strongest use cases. The primary goal is to avoid unbounded partitions while enabling efficient time-based queries.

Common modeling pattern:

Use a partition key based on entity + time bucket, and a clustering key based on timestamp.

Example: Sensor readings

PRIMARY KEY ((sensor_id, day_bucket), timestamp)
  • Partition key = sensor_id + day_bucket
  • Clustering key = ascending timestamp
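
In CQL this pattern looks roughly like the following (column names and daily bucketing are illustrative):

CREATE TABLE sensor_readings (
  sensor_id text,
  day_bucket date,
  timestamp timestamp,
  value double,
  PRIMARY KEY ((sensor_id, day_bucket), timestamp)
);

-- Range query confined to one bounded partition (one sensor, one day)
SELECT * FROM sensor_readings
WHERE sensor_id = 's1' AND day_bucket = '2024-01-01'
  AND timestamp >= '2024-01-01 00:00:00' AND timestamp < '2024-01-01 06:00:00';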

Benefits:

  • Avoids wide partitions (e.g., split by day or hour)
  • Allows range queries:
    WHERE timestamp > ... AND timestamp < ...
  • Enables efficient writes (time-ordered clustering)

Time bucketing strategies:

  • Hourly buckets
  • Daily buckets
  • Weekly buckets
  • Size-based buckets

Rules for time-series modeling:

  • Partitions must be bounded.
  • Writes should follow natural time order.
  • Avoid huge TTL + large partitions → tombstone storms.

Proper time-series modeling ensures Cassandra remains performant.

19. Why does Cassandra avoid joins?

Cassandra avoids joins because it is optimized for distributed, scalable, high-throughput workloads, not complex relational operations.

Technical reasons Cassandra avoids joins:

  1. Distributed nature
    A join would require fetching data from multiple partitions, potentially across many nodes → extremely slow.
  2. No full table scan model
    Cassandra is not designed for unbounded queries.
  3. Focus on write and read performance
    Joins require expensive server-side processing.
  4. Denormalization preferred
    Cassandra encourages query-based modeling, where data is duplicated across tables.
  5. Avoids cross-node locking
    Joins often require transaction semantics.

Resulting design:

  • Use multiple denormalized tables, each designed for a specific query.
  • Store related data together using partition + clustering keys.

This tradeoff allows Cassandra to scale linearly while maintaining very low latency.

20. What is eventual vs strong consistency in Cassandra?

Cassandra provides tunable consistency, allowing applications to choose between stronger or weaker guarantees.

Eventual Consistency

  • Writes propagate asynchronously to replicas.
  • Replicas may temporarily return different versions.
  • Background processes (hints, read repair, repair) fix inconsistencies.
  • Prioritizes availability and performance.
  • Typical with CL: ONE, ANY, LOCAL_ONE.

Example scenario:
A write is sent to only one replica; others get it later.

Strong Consistency

  • Read/write operations wait for majority or all replicas.
  • Guarantees that data returned is the most recent committed value.
  • Achieved via CL: QUORUM, LOCAL_QUORUM, ALL.

Example:
Writing at QUORUM and reading at QUORUM guarantees consistent reads, because the write set and read set of replicas must overlap (W + R > RF, e.g. 2 + 2 > 3).
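
In cqlsh, the per-session consistency level can be switched to either mode (the table is the illustrative one from the modeling example earlier):

-- Favor availability and latency (eventual consistency)
CONSISTENCY LOCAL_ONE;
SELECT * FROM orders_by_user WHERE user_id = 'u42';

-- Favor freshness (strong consistency when writes also use QUORUM)
CONSISTENCY QUORUM;
SELECT * FROM orders_by_user WHERE user_id = 'u42';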

Key Differences:

Property     | Eventual Consistency | Strong Consistency
Freshness    | Maybe stale          | Always fresh
Availability | Very high            | Lower if nodes down
Latency      | Low                  | Higher
Use cases    | Logging, IoT         | Finance, inventory

Cassandra allows mixing both depending on the workload—one of its biggest strengths.

21. What is batch operation and when not to use it?

A batch operation in Cassandra allows multiple write statements to be grouped and sent to the server in a single request. Importantly, batches in Cassandra are NOT a performance optimization (unlike relational systems, where batched execution is usually faster).

Purpose of batch in Cassandra:

  • To ensure atomicity of writes across multiple partitions (when using logged batch)
  • To group writes to the same partition for efficiency
  • To simplify client application logic

When NOT to use batch (very important):

You should not use batch to improve performance when writing to multiple different partition keys.

Example of incorrect usage:

BEGIN BATCH
INSERT INTO users_by_id ...
INSERT INTO users_by_email ...
INSERT INTO users_by_phone ...
APPLY BATCH;

If these statements use different partition keys, Cassandra must coordinate across nodes, creating heavy load on the coordinator.

Problems caused by misuse of batch:

  • Coordinator becomes overloaded
  • High network traffic
  • Large tombstone creation
  • Write latencies spike
  • Timeouts increase

Best practice:
Use batch only when:

  • You need atomicity for multiple writes
  • You write to the same partition key

Cassandra batches ≠ performance optimization.

22. What is an unlogged batch?

An unlogged batch is a batch operation where Cassandra does not use the batch log to guarantee atomicity.

Properties of unlogged batch:

  • No safety guarantees
  • No batch log
  • Faster than logged batch
  • Ideal for grouping writes to same partition
  • Can fail partially if a node goes down (some statements applied, others not)

When to use unlogged batch:

  • When you write multiple rows to the same partition (e.g., inserting several items into the same user's partition)
  • When atomicity is NOT needed
  • When improving network efficiency by sending multiple mutations in one request

Example:

BEGIN UNLOGGED BATCH
INSERT INTO user_events ...
INSERT INTO user_events ...
APPLY BATCH;

Unlogged batch is purely an optimization for reducing network round-trips—not for providing consistency.

23. What is a logged batch?

A logged batch guarantees that all statements inside the batch will be executed atomically (all succeed or all fail). Cassandra ensures this using the batch log, which tracks batch operations until all are completed.

How it works:

  1. Coordinator writes batch metadata into the batch log system keyspace.
  2. Executes each statement in the batch.
  3. When all statements succeed, batch log entry is removed.
  4. If a replica is down, the coordinator replays the batch log later so every statement is eventually applied.

When to use logged batch:

  • When writing multiple rows that must be atomically consistent
  • When maintaining denormalized tables that must update together
    Example:
    • user_by_id
    • user_by_email
    • user_by_phone
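
A hedged sketch of such a logged batch keeping denormalized user tables in step (table and column names are illustrative):

BEGIN BATCH
  INSERT INTO user_by_id (user_id, email, phone) VALUES ('u42', 'a@example.com', '555-0100');
  INSERT INTO user_by_email (email, user_id, phone) VALUES ('a@example.com', 'u42', '555-0100');
  INSERT INTO user_by_phone (phone, user_id, email) VALUES ('555-0100', 'u42', 'a@example.com');
APPLY BATCH;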

When NOT to use logged batch:

  • When writing to large numbers of distinct partition keys
  • For performance reasons
  • For bulk loading

Performance impact:

  • Highest overhead of all batch types
  • Involves additional writes for batch log
  • Should be used sparingly

24. What is coordinator-side batching?

Coordinator-side batching occurs when a client sends a large number of individual writes in rapid succession to the same coordinator node. Although they are separate statements, the coordinator internally groups them and sends them as a batch to replicas.

Key characteristics:

  • Happens automatically
  • Does not guarantee atomicity
  • Improves network efficiency
  • Transparent to the user
  • NOT equivalent to logged or unlogged batch

Risks:

  • Too many operations on same coordinator → coordinator overload
  • Not controllable like CQL batch

Coordinator-side batching is not something you design for—it's simply an internal behavior.

25. What is a secondary index?

A secondary index in Cassandra allows you to query tables by columns other than the primary key. Cassandra automatically maintains these indexes internally.

Use case:

Query data without knowing the partition key:

SELECT * FROM users WHERE email = 'abc@example.com';
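
Such a query first needs an index on the email column, created roughly as follows (the index name is illustrative; SAI uses a CREATE CUSTOM INDEX syntax that varies by version):

-- Classic (local) secondary index on a non-key column
CREATE INDEX users_email_idx ON users (email);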

How secondary indexes work:

  • For each indexed value, Cassandra stores a lookup entry mapping the value → primary key.
  • Index data is stored on nodes that own the indexed value's partition.

Types of indexes:

  • Local secondary index (classic)
  • Storage-attached index (SAI) – newer, more efficient

Secondary indexes must be used carefully because of performance implications.

26. What are limitations of secondary indexes?

Secondary indexes are powerful but have several important limitations in Cassandra.

1. Not efficient for high-cardinality columns

If a column has many unique values, each index entry matches only a few rows and a lookup may have to contact many nodes, which is inefficient.

2. Not efficient for low-cardinality columns

If a column has only a few unique values, huge numbers of rows map to the same index entry, creating enormous index partitions.

3. Index queries may hit multiple nodes

Unlike primary key queries which always target one partition.

4. High read amplification

Querying the index may require fetching many primary rows.

5. Update cost

Indexes generate extra writes each time the indexed column changes.

6. Not suitable for write-heavy workloads

Indexes slow down write performance due to additional index maintenance.

7. Cannot guarantee strict performance

Secondary index queries can be unpredictable depending on data distribution.

Because of these limitations, denormalization or materialized views are often preferred.

27. What are materialized views?

Materialized Views (MVs) are a feature introduced in Cassandra to automatically maintain alternate query tables derived from a base table.

How MVs work:

  • You define a view with a new primary key.
  • Cassandra automatically maintains the view whenever the base table updates.
  • Data is pushed to the MV asynchronously.

Example:

CREATE MATERIALIZED VIEW users_by_email AS
SELECT * FROM users
WHERE email IS NOT NULL
PRIMARY KEY (email, user_id);

Purpose:

  • Simplify denormalization
  • Automatically maintain derived tables
  • Reduce application complexity

However, materialized views come with serious caveats.

28. Explain problems with materialized views.

Materialized views in Cassandra have historically suffered from correctness and performance issues.

Main problems:

  1. Eventual Consistency Only
    MV updates happen asynchronously → stale data possible.
  2. View Update Failure
    If the base table write succeeds but MV write fails, data becomes inconsistent.
  3. Race Conditions
    Simultaneous updates can cause out-of-order writes.
  4. Repair Challenges
    Synchronizing MV with base table during repair is difficult.
  5. High Write Amplification
    Each base write generates additional writes to all views.
  6. Complex Failure Modes
    Nodes down during MV updates cause inconsistent state that is hard to correct.
  7. Performance Overhead
    MV maintenance increases CPU, disk, and network workload.

Current state:

Materialized views are considered experimental and not recommended for mission-critical systems.
Modern Cassandra deployments prefer:

  • Denormalization
  • Duplicate tables
  • Change Data Capture (CDC) → custom MV logic

29. What is the difference between TRUNCATE and DROP?

Both TRUNCATE and DROP remove data but work differently.

TRUNCATE

  • Removes all data from a table
  • Keeps the table schema
  • Very fast compared with deleting rows individually (no tombstones are written)
  • Existing SSTables are snapshotted (if auto_snapshot is enabled) and then deleted
  • Partition keys, clustering keys, and schema remain

Example:

TRUNCATE TABLE users;

DROP

  • Removes the entire table definition and all data
  • Deletes schema + storage files
  • Table no longer exists
  • Irreversible via CQL

Example:

DROP TABLE users;

Key Differences:

Action             | TRUNCATE    | DROP
Removes data       | Yes         | Yes
Keeps schema       | Yes         | No
Reusable table     | Yes         | No
Deletes SSTables   | Yes         | Yes
Metadata retention | Table stays | Table removed

Choose based on whether you want to keep the table structure.

30. What is a compaction strategy?

A compaction strategy determines how Cassandra merges SSTables over time to remove deleted data and optimize performance.

Compaction reduces:

  • Fragmentation
  • Tombstones
  • Duplicate data
  • Read amplification

Three major compaction strategies:

1. SizeTiered Compaction Strategy (STCS)

(Default strategy)

  • Groups SSTables of similar size and compacts them
  • Best for write-heavy workloads
  • Not great for read-heavy patterns

2. Leveled Compaction Strategy (LCS)

  • Organizes SSTables into levels (L0, L1, L2…)
  • Ensures each level contains SSTables with non-overlapping ranges
  • Most efficient for read-heavy workloads
  • Higher write amplification

3. TimeWindow Compaction Strategy (TWCS)

  • Designed for time-series data
  • Compacts data based on time windows (e.g., hourly or daily)
  • Efficient for TTL-heavy and append-only workloads
  • Avoids merging old data with new data

Why compaction strategy matters:

  • Affects IO performance
  • Determines tombstone cleanup speed
  • Impacts disk usage
  • Controls read performance

Choosing the right strategy is crucial to optimal Cassandra performance.
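
The strategy is a per-table property; for example, switching a time-series table to TWCS might look like this (table name and window settings are illustrative):

ALTER TABLE sensor_readings WITH compaction = {
  'class': 'TimeWindowCompactionStrategy',
  'compaction_window_unit': 'DAYS',
  'compaction_window_size': 1
};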

31. Compare STCS, LCS, and TWCS.

Cassandra supports three main compaction strategies, each optimized for different workload patterns. Choosing the right strategy is essential for performance.

1. STCS (Size-Tiered Compaction Strategy)

Best for: Write-heavy workloads with few range queries.

How it works:
  • Groups SSTables of similar size into tiers
  • Compacts them into a larger SSTable
  • Default strategy
Pros:
  • Very low write amplification
  • Good for bulk ingestion
  • Simple and efficient for append-only workloads
Cons:
  • Poor read performance
  • Large overlapping SSTables → high read amplification
  • Tombstones linger for longer periods

2. LCS (Leveled Compaction Strategy)

Best for: Read-heavy workloads, low-latency read requirements.

How it works:
  • Organizes SSTables into levels L0, L1, L2…
  • Levels have non-overlapping token ranges
  • Ensures predictable, small SSTables in upper levels
Pros:
  • Very low read amplification
  • Ideal for workloads with lots of point reads
  • Fast read latencies
  • Efficient tombstone cleanup
Cons:
  • High write amplification
  • More disk IO
  • Larger disk space requirements

3. TWCS (Time-Window Compaction Strategy)

Best for: Time-series & TTL-heavy workloads.

How it works:
  • Groups SSTables by time windows (hour/day/week)
  • SSTables in the same time window are compacted together
  • Old SSTables are not re-compacted with new data
Pros:
  • Ideal for logs, IoT, time-series data
  • Efficient TTL expiration
  • Extremely low read/write amplification for time-ordered inserts
  • Prevents tombstone storms
Cons:
  • Not suitable for random access data
  • Not ideal for workloads without time clustering

Summary Table

Strategy | Best For    | Strength                | Weakness
STCS     | Write-heavy | Low write amplification | Poor read performance
LCS      | Read-heavy  | Low read amplification  | High write amplification
TWCS     | Time-series | Good TTL handling       | Only good for time-ordered data

32. What is hinted handoff and when is it disabled?

Hinted handoff is a Cassandra mechanism where the coordinator saves a "hint" for a replica that is temporarily down, and replays it when the replica comes back online.

How it works:

  1. A write request arrives.
  2. One or more replicas are unreachable.
  3. Coordinator stores a hint (missed mutation).
  4. Once replica recovers, coordinator replays hints.
  5. Replica becomes consistent with the rest of the cluster.

Benefits:

  • Improves write availability
  • Helps keep replicas in sync
  • Reduces need for full repair

When is hinted handoff disabled?

1. Node down for too long

If a node is down longer than max_hint_window_in_ms (default: 3 hours), hints are discarded and only repair can fix consistency.
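
The window is a cassandra.yaml setting; the shipped default corresponds to three hours:

# cassandra.yaml: hints for a dead node are only collected within this window (3 hours)
max_hint_window_in_ms: 10800000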

2. For certain consistency levels

With CL = ANY, a stored hint by itself counts as a successful write, so hints play a different role than at other consistency levels.

3. When manually disabled

Admins may disable hints to reduce coordinator load:

hinted_handoff_enabled: false

4. If hints are too large

If disk or memory thresholds are exceeded, hints may be dropped.

Hinted handoff is useful but must be monitored carefully to avoid coordinator pressure.

33. Explain consistency level EACH_QUORUM.

EACH_QUORUM is a multi–data center consistency level that ensures a quorum of replicas in every data center acknowledges the read/write.

Usage:

Only valid for multi-DC clusters using NetworkTopologyStrategy.

How EACH_QUORUM works:

Suppose you have:

  • DC1 RF = 3 → quorum = 2
  • DC2 RF = 3 → quorum = 2

Then:

  • Write at EACH_QUORUM → requires 2 acknowledgments from DC1 + 2 from DC2
  • Read at EACH_QUORUM → requires 2 responses from DC1 + 2 from DC2
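
The multi-DC replication behind this example would be declared on the keyspace (keyspace and DC names are illustrative):

CREATE KEYSPACE payments WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'DC1': 3,
  'DC2': 3
};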

Properties:

  • Stronger consistency than LOCAL_QUORUM
  • Ensures consistent data across ALL data centers
  • Extremely expensive in latency and availability
  • Not commonly used in production

When to use EACH_QUORUM:

  • Financial systems requiring global consistency
  • Cross-DC transactional writes
  • Strict compliance environments

Why it's rarely used:

  • Latency increases significantly
  • If one DC is slow/unreachable → entire query fails
  • LOW availability guarantee

LOCAL_QUORUM is usually the better choice.

34. What is read repair chance?

Read repair chance is a legacy table setting that controls how often Cassandra probabilistically performs a background read repair on a read, contacting more replicas than the consistency level strictly requires.

Two parameters control it:

  • read_repair_chance
  • dclocal_read_repair_chance
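
In pre-4.0 clusters these were ordinary table properties (both were removed in Cassandra 4.0); a sketch with an illustrative table name:

-- Probabilistically read-repair roughly 10% of reads (pre-4.0 only)
ALTER TABLE myks.user_events WITH read_repair_chance = 0.1
  AND dclocal_read_repair_chance = 0.1;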

How read repair works:

  1. Client issues a read.
  2. Coordinator requests from extra replicas.
  3. If mismatch detected → coordinator triggers repair.
  4. Outdated replicas are updated to the latest values.

Purpose of read repair chance:

  • Maintain eventual consistency
  • Repair inconsistent replicas proactively
  • Reduce need for full repair operations

Drawbacks:

  • Increases read latency
  • Extra network and disk usage
  • Should be used sparingly

Modern Cassandra versions discourage probabilistic read repair (these settings were removed in 4.0) and rely instead on blocking read repair at the chosen consistency level plus scheduled incremental repair.

35. What is LOCAL_ONE consistency?

LOCAL_ONE is a consistency level where the coordinator waits for a response from only one replica in the local data center.

Properties:

  • Fastest read/write consistency level
  • Low latency
  • Avoids cross-DC communication
  • Writes can succeed even when many replicas are down, as long as one local replica responds
  • Guarantees are as weak as ONE, but confined to the local data center

Benefits:

  • Excellent for globally distributed apps
  • Ideal for geo-nearby traffic flow
  • High availability

Drawbacks:

  • Might return stale data
  • Not strongly consistent
  • Shouldn't be used for financial or transactional systems

LOCAL_ONE is widely used for low-latency, non-critical queries.

36. How do you handle data skew?

Data skew happens when some partitions receive disproportionately more data or queries, causing hotspots and performance issues.

Techniques to handle data skew:

1. Salting the partition key

Add a random component to partition key:

PRIMARY KEY ((bucket, user_id), timestamp)

Spreads partitions across multiple buckets.
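
A sketch of a salted table, where the client computes the bucket, e.g. hash(user_id) % 8 (all names are illustrative):

CREATE TABLE user_activity (
  bucket int,          -- computed client-side, e.g. hash(user_id) % 8
  user_id text,
  timestamp timestamp,
  event text,
  PRIMARY KEY ((bucket, user_id), timestamp)
);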

2. Bucketing

Group data based on time or value ranges:

day_bucket = timestamp / 86400

3. Time-windowing

Used for time-series to avoid unbounded partitions.

4. Resharding data

Split large partitions manually using composite keys.

5. Denormalization

Spread read load across multiple tables.

6. Using materialized views (carefully)

Or better: multiple denormalized tables designed per query.

7. Increase RF

More replicas reduce pressure on single partitions.

Data skew must be addressed early because Cassandra’s performance depends on balanced partitions.

37. What is a hot partition?

A hot partition is a partition that receives disproportionately high read/write traffic, causing:

  • High latency
  • Node overload
  • Uneven CPU/memory usage
  • Failed requests
  • Timeouts

Causes of hot partitions:

  1. Poor partition key design
  2. Time series with no time bucketing
  3. Popular keys (e.g., “global settings”)
  4. Low-cardinality keys
  5. Unbounded partitions

Fixes:

  • Use composite partition keys
  • Time-bucket partitions
  • Add randomness/salting
  • Split workload across multiple tables
  • Increase replication factor
  • Implement caching layer outside Cassandra

Hot partitions violate Cassandra’s goal of uniform load distribution and must be redesigned.

38. Explain write amplification in Cassandra.

Write amplification refers to the phenomenon where a single logical write results in multiple physical writes across the Cassandra system.

Where write amplification happens:

1. Commit log

Every write is appended to the commit log.

2. Memtable flush

When memtables flush, SSTables are created → additional writes.

3. Compaction

SSTables are repeatedly merged, rewriting data multiple times.

4. Replication

Writes are copied across multiple replicas.

5. Secondary indexes / Materialized views

Each indexed column or view results in extra writes.

Impacts:

  • Higher disk IO
  • Increased CPU usage
  • More storage required
  • Lower write throughput

Ways to reduce write amplification:

  • Choose correct compaction strategy
  • Avoid overuse of secondary indexes
  • Avoid unnecessary TTL and deletes
  • Minimize LWT usage
  • Use TWCS for time-series

Write amplification is inherent but manageable with good data modeling.

39. What is node decommissioning?

Node decommissioning is the process of safely removing a node from a Cassandra cluster.

How it works:

  1. Admin runs:
     nodetool decommission
  2. Node streams its data to remaining replicas.
  3. Ring topology updates accordingly.
  4. Node removes itself from cluster metadata.

Important points:

  • Safe, online operation
  • Ensures no data loss
  • Cluster rebalances token ranges afterward
  • Should not be used if node has failed permanently → use removenode instead

Decommissioning is crucial for maintenance, scaling down, or hardware replacement.
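
For contrast, removing a node that has already failed permanently looks roughly like this (the host ID shown by nodetool status is hypothetical here):

# Find the Host ID of the dead (DN) node, then remove it from the ring
nodetool status
nodetool removenode <host-id-of-dead-node>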
