In business, they say it takes ten years to become an overnight success. In technology, they say it takes ten years to build a file system. ScyllaDB is in the technology business, offering a distributed NoSQL database that is monstrously fast and scalable. It turns out that it also takes
This is something that Felipe Mendes and Guilherme Nogueira know well. Mendes and Nogueira are Technical Directors at ScyllaDB, working directly on the product as well as consulting clients. Recently, they presented some of the things they’ve been working on at ScyllaDB’s
You can also catch the podcast on
The evolution of ScyllaDB
When
Features such as materialized views, secondary indexes, and integrations with third party solutions are really important as well. Adding such features marked the second generation in ScyllaDB’s evolution. ScyllaDB started as a performance-oriented alternative to Cassandra, so inevitably,
The
The
Strong consistency and tablets
The combination of the new Raft and Tablets features enables clusters to scale up in seconds because nodes can join in parallel rather than sequentially, as was the case with the Gossip protocol in Cassandra (which ScyllaDB also relied on originally). But it’s not just adding nodes that’s improved; removing nodes is better, too. When a node goes down for maintenance, for example, ScyllaDB’s strong consistency support means that the rest of the nodes in the cluster are immediately aware of it. By contrast, under the previous regime of eventual consistency via a gossip protocol, such updates could take a while to propagate.
Using Raft means transitioning to a state machine mechanism, as Mendes noted. A node leader is appointed, so when a change occurs in the cluster, the state machine is updated and the change is immediately propagated.
Raft is used to propagate updates consistently at every step of a topology change. It also allows for parallel topology updates, such as adding multiple nodes at once. This was not possible under the gossip-based approach.
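To make the state machine idea concrete, here is a deliberately simplified sketch. It is not ScyllaDB’s implementation and it is not Raft itself: leader election, quorums and terms are all left out, and the names `TopologyStateMachine` and `apply` are made up for illustration. It shows only the core property the text relies on: if every node applies the same ordered log of topology changes, every node ends up with the same view of the cluster, even when several nodes are added in one batch.

```python
from dataclasses import dataclass, field


@dataclass
class TopologyStateMachine:
    """Hypothetical per-node view of cluster membership."""
    members: set[str] = field(default_factory=set)
    applied: int = 0  # index of the next log entry to apply

    def apply(self, log: list[tuple[str, str]]) -> None:
        # Apply only the entries this replica has not seen yet, in log order.
        for op, node in log[self.applied:]:
            if op == "add":
                self.members.add(node)
            elif op == "remove":
                self.members.discard(node)
            self.applied += 1


# The (assumed) leader appends changes to one shared, ordered log.
log = [("add", "n1"), ("add", "n2"), ("add", "n3"), ("remove", "n2")]

replica_a = TopologyStateMachine()
replica_b = TopologyStateMachine()
replica_a.apply(log)
replica_b.apply(log[:2])  # replica_b lags behind at first...
replica_b.apply(log)      # ...then catches up from where it stopped

print(sorted(replica_a.members), sorted(replica_b.members))
```

Because both replicas replay the same log, a lagging replica converges to the same membership as soon as it catches up, rather than waiting for rumors to spread as in a gossip-based scheme.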
And this is
Tablets are independent of one another, which means that ScyllaDB with Raft can move them to other nodes on demand, atomically and in a strongly consistent way, as workloads grow or shrink.
Speed, economy, elasticity
By breaking down tables into smaller and more manageable units, data can be moved between nodes in a cluster much faster. This means that clusters can be scaled up rapidly, as Mendes demonstrated. When new nodes join a cluster, the data is redistributed in minutes rather than the hours it took previously (and still takes with alternatives like Cassandra).
Higher-capacity machines also come with higher storage density, as Mendes noted. Tablets are balanced so that storage capacity is used evenly, and all nodes in the cluster end up with a similar utilization rate.
That’s because the number of tablets at each node is determined according to the number of CPUs, which is always tied to storage in cloud nodes. Because storage is used more evenly and the cluster can scale faster, users can also run at a much higher storage utilization rate.
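As a rough illustration of CPU-proportional placement, the sketch below splits a fixed number of tablets across nodes in proportion to their CPU counts. It is an assumption-laden toy, not ScyllaDB’s load balancer: the function name `allocate_tablets`, the node names and the rounding rule (largest remainder first) are all invented for this example.

```python
def allocate_tablets(total_tablets: int, cpus_per_node: dict[str, int]) -> dict[str, int]:
    """Split tablets across nodes in proportion to CPU count (illustrative only)."""
    total_cpus = sum(cpus_per_node.values())
    # Start with each node's proportional share, rounded down.
    shares = {
        node: total_tablets * cpus // total_cpus
        for node, cpus in cpus_per_node.items()
    }
    # Hand out the tablets lost to rounding, largest remainder first.
    leftover = total_tablets - sum(shares.values())
    by_remainder = sorted(
        cpus_per_node,
        key=lambda node: (total_tablets * cpus_per_node[node]) % total_cpus,
        reverse=True,
    )
    for node in by_remainder[:leftover]:
        shares[node] += 1
    return shares


# Hypothetical cluster: two 8-CPU nodes and one 16-CPU node.
placement = allocate_tablets(64, {"node-a": 8, "node-b": 8, "node-c": 16})
print(placement)  # {'node-a': 16, 'node-b': 16, 'node-c': 32}
```

In this toy model the 16-CPU node holds twice as many tablets as each 8-CPU node. If disk size grows with CPU count, as the article says it does for cloud nodes, that also keeps the utilization rate similar across nodes.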
A typical storage utilization rate, Mendes said, is 50% to 60%. ScyllaDB aims to run at up to 90% storage utilization. That’s because tablets and cloud automations enable ScyllaDB Cloud to rapidly scale the cluster once those storage thresholds are exceeded, as ScyllaDB’s benchmarking shows.
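A back-of-the-envelope calculation shows what the higher target means for cluster size. The figures below (a 100 TB dataset and 10 TB of disk per node) are made-up inputs, the helper `nodes_needed` is hypothetical, and replication and compaction headroom are ignored for simplicity.

```python
import math


def nodes_needed(dataset_tb: float, disk_per_node_tb: float, max_utilization: float) -> int:
    """Smallest node count that keeps every disk at or below max_utilization."""
    usable_per_node_tb = disk_per_node_tb * max_utilization
    return math.ceil(dataset_tb / usable_per_node_tb)


# Hypothetical cluster: 100 TB of data on nodes with 10 TB disks.
at_60 = nodes_needed(100, 10, 0.60)
at_90 = nodes_needed(100, 10, 0.90)
print(at_60, at_90)  # 17 12
```

Under these assumptions, the same data fits on 12 nodes instead of 17, because each node holds 9 TB rather than 6 TB, or 1.5 times as much.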
Going from 60% to 90% storage utilization means an extra 30 percentage points of each node’s disk space can be put to use, or half again as much data per node. At scale, that translates to significant savings for users. Further to scaling speed and economy, there is an additional benefit to tablets:
Something old, something new, something borrowed, something blue
Beyond strong consistency and tablets, there is a wide range of new features and improvements that the ScyllaDB team is working on. Some of these, such as
Other features, such as workload prioritization or the
Last but not least, ScyllaDB is also
Once again, ScyllaDB is keeping with the times in its own characteristic way. As Mendes and Nogueira noted, there are many ScyllaDB clients using ScyllaDB to power AI workloads, some of them
After all, why change something that’s so deeply ingrained in the organization’s culture, is working well for them and appreciated by the ones who matter most – users?