RESTBase table storage on Cassandra
This project provides a high-level table storage service abstraction similar to Amazon DynamoDB or Google DataStore on top of Cassandra. As the production table storage backend for RESTBase, it powers the Wikimedia REST APIs, such as this one for the English Wikipedia.
For testing and small installs, there is also a sqlite backend implementing the same interfaces.
We use Phabricator to track issues. See the list of current issues in restbase-mod-table-cassandra.
In production since March 2015.
- basic table storage service with REST interface, backed by Cassandra, implementing the RESTBase table storage interface
- multi-tenant design: domain creation, prepared for per-domain ACLs
- table creation with declarative JSON schemas
- limited automatic schema migrations
- paging
- possibly, in the future, some amount of transaction support
- in the future, use of Cassandra 3 materialized views where it makes sense, once those have stabilized
Configuration of this module takes place from within an x-modules stanza in the YAML-formatted RESTBase configuration file. Complete configuration of RESTBase is beyond the scope of this document (see the RESTBase docs for that); this section covers only the restbase-mod-table-cassandra specifics.
- name: restbase-mod-table-cassandra
  version: 1.0.0
  type: npm
  options: # Passed to the module constructor
    conf:
      version: 1
      hosts: [localhost]
      username: cassandra
      password: cassandra
      defaultConsistency: localOne
      localDc: datacenter1
      datacenters:
        - datacenter1
      storage_groups:
        - name: default.group.local
          domains: /./
The version of this configuration. Each edit of the module configuration must correspond to a new, unique version.
Note: Versions must be monotonically increasing.
version: 1
A list of Cassandra nodes to use as contact points.
hosts:
- cassandra-01.sample.org
- cassandra-02.sample.org
- cassandra-03.sample.org
Password credentials to use in authenticating with Cassandra.
Note: Optional; leave unconfigured if Cassandra authentication is not enabled.
username: someuser
password: somepass
The Cassandra consistency level to use when not otherwise specified. Valid values are those from the Node.js driver for Cassandra. Defaults to localOne.
defaultConsistency: localOne
Key and certificate information for use in TLS-encrypted environments. See the Node.js documentation on tls.connect for the meaning of these directives.
Note: Optional; leave unconfigured if Cassandra client encryption is not enabled.
tls:
  cert: /etc/restbase/tls/cert.pem
  key: /etc/restbase/tls/key.pem
  ca:
    - /etc/restbase/tls/root.pem
restbase-mod-table-cassandra uses a datacenter-aware connection pool. The localDc directive tells the module which datacenter to treat as 'local' to this instance. Cassandra nodes in the local datacenter will be used for queries, and any others serve as a fallback. Defaults to datacenter1 (the Cassandra default).
Note: the localDc must be in the list of configured datacenters (see below).
localDc: datacenter1
The list of datacenters this Cassandra cluster belongs to. Data will be replicated across these datacenters accordingly. Defaults to [ datacenter1 ].
Note: Changing this list alters the underlying Cassandra keyspaces in order to add or remove datacenter replicas accordingly, but replication is NOT made retroactive. You MUST perform a Cassandra repair after adding a new datacenter to realize the added redundancy. Likewise, you must perform a cleanup to reclaim space if a datacenter is removed.
datacenters:
- datacenter1
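The version, localDc, and datacenters directives interact when a datacenter is added, so a combined sketch may help. The fragment below shows only the affected keys of the conf block after such an edit; the name datacenter2 is a hypothetical second datacenter, not something taken from this document.

```yaml
# Hypothetical follow-up edit to the conf block: a second datacenter is added.
# "datacenter2" is an illustrative name only.
version: 2            # bumped from 1; each configuration edit needs a new, higher version
localDc: datacenter1  # must appear in the datacenters list below
datacenters:
  - datacenter1
  - datacenter2
```

After deploying a change like this, the repair described in the note above is still required before the new replicas hold existing data; with stock Cassandra tooling this is typically done with nodetool repair.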
Storage groups are used to map tables to one or more hosts/domains.
storage_groups:
  - name: default.group.local
    domains: /./
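As a rough sketch of how more than one group might be laid out, the fragment below adds a second, more specific group ahead of the catch-all. It rests on assumptions not confirmed by this document: that domains also accepts a list, that values wrapped in slashes are treated as regular expressions, and that groups are matched in order. The group name and pattern are hypothetical.

```yaml
# Sketch only: the first group name and its pattern are hypothetical.
# Assumes `domains` accepts a list, that /.../ values are regular
# expressions, and that groups are tried in the order listed.
storage_groups:
  - name: wikipedia.group.local
    domains:
      - /\.wikipedia\.org$/
  - name: default.group.local   # catch-all, as in the example above
    domains: /./
```

Keeping the catch-all group last means a domain that matches no specific pattern still resolves to a storage group, provided the ordering assumption holds.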