Amazon Keyspaces (for Apache Cassandra): Developer Guide
Copyright © 2024 Amazon Web Services, Inc. and/or its affiliates. All rights reserved.
Amazon's trademarks and trade dress may not be used in connection with any product or service
that is not Amazon's, in any manner that is likely to cause confusion among customers, or in any
manner that disparages or discredits Amazon. All other trademarks not owned by Amazon are
the property of their respective owners, who may or may not be affiliated with, connected to, or
sponsored by Amazon.
Table of Contents
What is Amazon Keyspaces?
  How it works
    High-level architecture
    Cassandra data model
    Accessing Amazon Keyspaces
  Use cases
  What is CQL?
Compare Amazon Keyspaces with Cassandra
  Functional differences with Apache Cassandra
    Apache Cassandra APIs, operations, and data types
    Asynchronous creation and deletion of keyspaces and tables
    Authentication and authorization
    Batch
    Cluster configuration
    Connections
    IN keyword
    CQL query throughput tuning
    FROZEN collections
    Lightweight transactions
    Load balancing
    Pagination
    Partitioners
    Prepared statements
    Range delete
    System tables
    Timestamps
  Supported Cassandra APIs, operations, functions, and data types
    Cassandra API support
    Cassandra control plane API support
    Cassandra data plane API support
    Cassandra function support
    Cassandra data type support
  Supported Cassandra consistency levels
    Write consistency levels
    Read consistency levels
    Unsupported consistency levels
Migrating to Amazon Keyspaces
  Migrating from Cassandra
    Compatibility
    Estimate pricing
    Migration strategy
    Online migration
    Offline migration
    Hybrid migration
  Migration tools
    Loading data using cqlsh
    Loading data using DSBulk
Accessing Amazon Keyspaces
  Setting up AWS Identity and Access Management
    Sign up for an AWS account
    Create a user with administrative access
  Setting up Amazon Keyspaces
  Using the console
  Using AWS CloudShell
    Obtaining IAM permissions for AWS CloudShell
    Interacting with Amazon Keyspaces using AWS CloudShell
  Create programmatic access credentials
    Create service-specific credentials
    Create IAM credentials for AWS authentication
  Service endpoints
    Ports and protocols
    Global endpoints
    AWS GovCloud (US) Region FIPS endpoints
    China Regions endpoints
  Using cqlsh
    Using the cqlsh-expansion
    How to manually configure cqlsh connections for TLS
  Using the AWS CLI
    Downloading and Configuring the AWS CLI
    Using the AWS CLI with Amazon Keyspaces
  Using the API
  Using a Cassandra client driver
    Using a Cassandra Java client driver
    Using a Cassandra Python client driver
    Using a Cassandra Node.js client driver
    Using a Cassandra .NET Core client driver
    Using a Cassandra Go client driver
    Using a Cassandra Perl client driver
  Connection tutorials
    Connecting with VPC endpoints
    Connecting with Apache Spark
    Connecting from Amazon EKS
  Configure cross-account access
    Configure cross-account access in a shared VPC
    Configure cross-account access without a shared VPC
Getting started
  Prerequisites
  Create a keyspace
  Check keyspace creation status
  Create a table
  Check table creation status
  CRUD operations
    Create
    Read
    Update
    Delete
  Delete a table
  Delete a keyspace
Managing serverless resources
  Estimate row size
  Estimate capacity consumption
    Estimate the capacity consumption of range queries
    Estimate the read capacity consumption of limit queries
    Estimate the read capacity consumption of table scans
    Estimate capacity consumption of LWT
    Estimate capacity consumption of static columns
    Estimate capacity for a multi-Region table
    Estimate capacity consumption with CloudWatch
  Configure read/write capacity modes
    Configure on-demand capacity mode
    Configure provisioned throughput capacity mode
    View the capacity mode of a table
    Change capacity mode
  Manage throughput capacity with auto scaling
    How Amazon Keyspaces automatic scaling works
    How auto scaling works for multi-Region tables
    Usage notes
    Configure and update auto scaling policies
  Use burst capacity
Working with Amazon Keyspaces features
  System keyspaces
    system
    system_schema
    system_schema_mcs
    system_multiregion_info
  Multi-Region Replication
    Benefits
    Capacity modes and pricing
    How it works
    Usage notes
    Configure Multi-Region Replication
  Backup and restore with point-in-time recovery
    How it works
    Use point-in-time recovery
  Expire data with Time to Live
    Integration with AWS services
    Create table with default TTL value
    Update table default TTL value
    Create table with custom TTL
    Update table custom TTL
    Use INSERT to set custom TTL for new rows
    Use UPDATE to set custom TTL for rows and columns
  Client-side timestamps
    Integration with AWS services
    Create table with client-side timestamps
    Configure client-side timestamps
    Use client-side timestamps in queries
  Working with CQL queries
    Use IN SELECT
    Order results
    Paginate results
  Working with partitioners
    Change the partitioner
  Working with AWS SDKs
  Working with tags
    Tagging restrictions
    Tag keyspaces and tables
    Create cost allocation reports
  Create AWS CloudFormation resources
    Amazon Keyspaces and AWS CloudFormation templates
    Learn more about AWS CloudFormation
  NoSQL Workbench
    Download
    Getting started
    Visualize a data model
    Create a data model
    Edit a data model
    Commit a data model
    Sample data models
    Release history
Code examples
  Basics
    Hello Amazon Keyspaces
    Learn the basics
    Actions
Libraries and tools
  Libraries and examples
    Amazon Keyspaces (for Apache Cassandra) developer toolkit
    Amazon Keyspaces (for Apache Cassandra) examples
    AWS Signature Version 4 (SigV4) authentication plugins
  Highlighted sample and developer tool repos
    Amazon Keyspaces Protocol Buffers
    AWS CloudFormation template to create Amazon CloudWatch dashboard for Amazon Keyspaces (for Apache Cassandra) metrics
    Using Amazon Keyspaces (for Apache Cassandra) with AWS Lambda
    Using Amazon Keyspaces (for Apache Cassandra) with Spring
    Using Amazon Keyspaces (for Apache Cassandra) with Scala
    Using Amazon Keyspaces (for Apache Cassandra) with AWS Glue
    Amazon Keyspaces (for Apache Cassandra) Cassandra query language (CQL) to AWS CloudFormation converter
    Amazon Keyspaces (for Apache Cassandra) helpers for Apache Cassandra driver for Java
    Amazon Keyspaces (for Apache Cassandra) snappy compression demo
    Amazon Keyspaces (for Apache Cassandra) and Amazon S3 codec demo
Best practices
  NoSQL design
    NoSQL vs. RDBMS
    Two key concepts
    General approach
  Connections
    How they work
    How to configure connections
    VPC endpoint connections
    How to monitor connections
    How to handle connection errors
  Data modeling
    Partition key design
  Cost optimization
    Evaluate your costs at the table level
    Evaluate your table's capacity mode
    Evaluate your table's Application Auto Scaling settings
    Identify your unused resources
    Evaluate your table usage patterns
    Evaluate your provisioned capacity for right-sized provisioning
Troubleshooting
  General errors
    General errors
  Connection errors
    Errors connecting to an Amazon Keyspaces endpoint
  Capacity management errors
    Serverless capacity errors
  Data definition language errors
    Data definition language errors
Monitoring Amazon Keyspaces
  Monitoring with CloudWatch
    Using metrics
    Metrics and dimensions
    Creating alarms
  Logging with CloudTrail
    Configuring log file entries in CloudTrail
    DDL information in CloudTrail
    DML information in CloudTrail
    Understanding log file entries
Security
  Data protection
    Encryption at rest
    Encryption in transit
    Internetwork traffic privacy
  AWS Identity and Access Management
    Audience
    Authenticating with identities
    Managing access using policies
    How Amazon Keyspaces works with IAM
    Identity-based policy examples
    AWS managed policies
    Troubleshooting
    Using service-linked roles
  Compliance validation
  Resilience
  Infrastructure security
    Using interface VPC endpoints
  Configuration and vulnerability analysis for Amazon Keyspaces
  Security best practices
    Preventative security best practices
    Detective security best practices
CQL language reference
  Language elements
    Identifiers
    Constants
    Terms
    Data types
    JSON encoding of Amazon Keyspaces data types
  DDL statements
    Keyspaces
    Tables
  DML statements
    SELECT
    INSERT
    UPDATE
    DELETE
  Built-in functions
    Scalar functions
Quotas
  Amazon Keyspaces service quotas
  Increasing or decreasing throughput (for provisioned tables)
    Increasing provisioned throughput
    Decreasing provisioned throughput
  Amazon Keyspaces encryption at rest
Document history
What is Amazon Keyspaces (for Apache Cassandra)?
Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache
Cassandra–compatible database service. With Amazon Keyspaces, you don’t have to provision,
patch, or manage servers, and you don’t have to install, maintain, or operate software.
Amazon Keyspaces is serverless, so you pay for only the resources that you use, and the service
automatically scales tables up and down in response to application traffic. You can build
applications that serve thousands of requests per second with virtually unlimited throughput and
storage.
Note
Apache Cassandra is an open-source, wide-column datastore that is designed to handle
large amounts of data. For more information, see Apache Cassandra.
Amazon Keyspaces makes it easy to migrate, run, and scale Cassandra workloads in the AWS Cloud.
With just a few clicks on the AWS Management Console or a few lines of code, you can create
keyspaces and tables in Amazon Keyspaces, without deploying any infrastructure or installing
software.
With Amazon Keyspaces, you can run your existing Cassandra workloads on AWS using the same
Cassandra application code and developer tools that you use today.
For a list of available AWS Regions and endpoints, see Service endpoints for Amazon Keyspaces.
We recommend that you start by reading the following sections:
Topics
Amazon Keyspaces: How it works
Amazon Keyspaces use cases
What is Cassandra Query Language (CQL)?
Amazon Keyspaces: How it works
Amazon Keyspaces removes the administrative overhead of managing Cassandra. To understand
why, it's helpful to begin with Cassandra architecture and then compare it to Amazon Keyspaces.
Topics
High-level architecture: Apache Cassandra vs. Amazon Keyspaces
Cassandra data model
Accessing Amazon Keyspaces from an application
High-level architecture: Apache Cassandra vs. Amazon Keyspaces
Traditional Apache Cassandra is deployed in a cluster made up of one or more nodes. You are
responsible for managing each node and adding and removing nodes as your cluster scales.
A client program accesses Cassandra by connecting to one of the nodes and issuing Cassandra
Query Language (CQL) statements. CQL is similar to SQL, the popular language used in relational
databases. Even though Cassandra is not a relational database, CQL provides a familiar interface
for querying and manipulating data in Cassandra.
The following diagram shows a simple Apache Cassandra cluster, consisting of four nodes.
A production Cassandra deployment might consist of hundreds of nodes, running on hundreds
of physical computers across one or more physical data centers. This can cause an operational
burden for application developers who need to provision, patch, and manage servers in addition to
installing, maintaining, and operating software.
With Amazon Keyspaces (for Apache Cassandra), you don’t need to provision, patch, or manage
servers, so you can focus on building better applications. Amazon Keyspaces offers two throughput
capacity modes for reads and writes: on-demand and provisioned. You can choose your table’s
throughput capacity mode to optimize the price of reads and writes based on the predictability and
variability of your workload.
With on-demand mode, you pay for only the reads and writes that your application actually
performs. You do not need to specify your table’s throughput capacity in advance. Amazon
Keyspaces accommodates your application traffic almost instantly as it ramps up or down, making
it a good option for applications with unpredictable traffic.
Provisioned capacity mode helps you optimize the price of throughput if you have predictable
application traffic and can forecast your table’s capacity requirements in advance. With provisioned
capacity mode, you specify the number of reads and writes per second that you expect your
application to perform. You can increase and decrease the provisioned capacity for your table
automatically by enabling automatic scaling.
You can change the capacity mode of your table once per day as you learn more about your
workload’s traffic patterns, or if you expect to have a large burst in traffic, such as from a major
event that you anticipate will drive a lot of table traffic. For more information about read and write
capacity provisioning, see the section called “Configure read/write capacity modes”.
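As a sketch of what switching modes looks like in CQL, the following statement changes a hypothetical table to provisioned capacity mode using the custom table properties that Amazon Keyspaces exposes for capacity settings (the keyspace, table name, and capacity values are placeholder assumptions):

ALTER TABLE mykeyspace.mytable
WITH CUSTOM_PROPERTIES = {
    'capacity_mode': {
        'throughput_mode': 'PROVISIONED',
        'read_capacity_units': 1000,
        'write_capacity_units': 500
    }
};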
Amazon Keyspaces (for Apache Cassandra) stores three copies of your data in multiple Availability
Zones for durability and high availability. In addition, you benefit from a data center and network
architecture that is built to meet the requirements of the most security-sensitive organizations.
Encryption at rest is automatically enabled when you create a new Amazon Keyspaces table and all
client connections require Transport Layer Security (TLS). Additional AWS security features include
monitoring, AWS Identity and Access Management, and virtual private cloud (VPC) endpoints. For
an overview of all available security features, see Security.
The following diagram shows the architecture of Amazon Keyspaces.
A client program accesses Amazon Keyspaces by connecting to a predetermined endpoint
(hostname and port number) and issuing CQL statements. For a list of available endpoints, see the
section called “Service endpoints”.
Cassandra data model
How you model your data for your business case is critical to achieving optimal performance from
Amazon Keyspaces. A poor data model can significantly degrade performance.
Even though CQL looks similar to SQL, the backends of Cassandra and relational databases are
very different and must be approached differently. The following are some of the more significant
issues to consider:
Storage
You can visualize your Cassandra data in tables, with each row representing a record and each
column a field within that record.
Table design: Query first
There are no JOINs in CQL. Therefore, you should design your tables around the shape of your
data and how you need to access it for your business use cases. This might result in
denormalization with duplicated data. You should design each of your tables specifically for a
particular access pattern.
Partitions
Your data is stored in partitions on disk. The number of partitions your data is stored in and
how it is distributed across the partitions is determined by your partition key. How you define
your partition key can have a significant impact upon the performance of your queries. For best
practices, see the section called “Partition key design”.
Primary key
In Cassandra, data is stored as a key-value pair. Every Cassandra table must have a primary key,
which is the unique key to each row in the table. The primary key is the composite of a required
partition key and optional clustering columns. The data that comprises the primary key must be
unique across all records in a table.
Partition key – The partition key portion of the primary key is required and determines which
partition of your cluster the data is stored in. The partition key can be a single column, or it
can be a compound value composed of two or more columns. You would use a compound
partition key if a single column partition key would result in a single partition or a very few
partitions having most of the data and thus bearing the majority of the disk I/O operations.
Clustering column – The optional clustering column portion of your primary key determines
how the data is clustered and sorted within each partition. A clustering key can consist of one
or more columns. If it includes multiple columns, the sort order is determined by the order in
which the columns are listed, from left to right.
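For illustration, the following CQL statement defines a hypothetical table whose primary key combines a compound partition key (device_id, sensor) with a clustering column (reading_time), so all readings for one device and sensor land in the same partition, sorted by time:

CREATE TABLE my_keyspace.sensor_readings (
    device_id text,
    sensor text,
    reading_time timestamp,
    value double,
    PRIMARY KEY ((device_id, sensor), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC);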
For more information about NoSQL design and Amazon Keyspaces, see the section called “NoSQL
design”. For more information about Amazon Keyspaces and data modeling, see the section called
“Data modeling”.
Accessing Amazon Keyspaces from an application
Amazon Keyspaces (for Apache Cassandra) implements the Apache Cassandra Query Language
(CQL) API, so you can use CQL and Cassandra drivers that you already use. Updating your
application is as easy as updating your Cassandra driver or cqlsh configuration to point to the
Amazon Keyspaces service endpoint. For more information about the required credentials, see the
section called “Create IAM credentials for AWS authentication”.
Note
To help you get started, you can find end-to-end code samples of connecting to Amazon
Keyspaces by using various Cassandra client drivers in the Amazon Keyspaces code example
repository on GitHub.
Consider the following Python program, which connects to a Cassandra cluster and queries a table.
from cassandra.cluster import Cluster

#TLS/SSL configuration goes here
ksp = 'MyKeyspace'
tbl = 'WeatherData'

cluster = Cluster(['NNN.NNN.NNN.NNN'], port=NNNN)
session = cluster.connect(ksp)

session.execute('USE ' + ksp)
rows = session.execute('SELECT * FROM ' + tbl)
for row in rows:
    print(row)
To run the same program against Amazon Keyspaces, you need to:
Add the cluster endpoint and port: For example, replace the host with a service endpoint, such
as cassandra.us-east-2.amazonaws.com, and the port number with 9142.
Add the TLS/SSL configuration: For more information on adding the TLS/SSL configuration to
connect to Amazon Keyspaces by using a Cassandra client Python driver, see Using a Cassandra
Python client driver to access Amazon Keyspaces programmatically.
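Putting the two changes together, a minimal sketch of the same program adapted for Amazon Keyspaces might look like the following. It assumes service-specific credentials (the username and password shown are placeholders) and the Starfield root certificate commonly used to verify Amazon Keyspaces connections; your endpoint, Region, and certificate path may differ.

from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

# TLS/SSL configuration: trust the Starfield root certificate (path is a placeholder)
ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations('path_to_file/sf-class2-root.crt')
ssl_context.verify_mode = CERT_REQUIRED

# Service-specific credentials generated in IAM (placeholder values)
auth_provider = PlainTextAuthProvider(username='alice-at-111122223333',
                                      password='wJalrXUtnFEMI/K7MDENG/bPxRfiCY')

# Connect to the service endpoint on port 9142 instead of a node's IP address
cluster = Cluster(['cassandra.us-east-2.amazonaws.com'], port=9142,
                  ssl_context=ssl_context, auth_provider=auth_provider)
session = cluster.connect()

rows = session.execute('SELECT * FROM "MyKeyspace"."WeatherData"')
for row in rows:
    print(row)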
Amazon Keyspaces use cases
The following are just some of the ways in which you can use Amazon Keyspaces:
Build applications that require low latency – Process data at high speeds for applications
that require single-digit-millisecond latency, such as industrial equipment maintenance, trade
monitoring, fleet management, and route optimization.
Build applications using open-source technologies – Build applications on AWS using open-
source Cassandra APIs and drivers that are available for a wide range of programming languages,
such as Java, Python, Ruby, Microsoft .NET, Node.js, PHP, C++, Perl, and Go. For code examples,
see Libraries and tools.
Move your Cassandra workloads to the cloud – Managing Cassandra tables yourself is time-
consuming and expensive. With Amazon Keyspaces, you can set up, secure, and scale Cassandra
tables in the AWS Cloud without managing infrastructure. For more information, see Managing
serverless resources.
What is Cassandra Query Language (CQL)?
Cassandra Query Language (CQL) is the primary language for communicating with Apache
Cassandra. Amazon Keyspaces (for Apache Cassandra) is compatible with the CQL 3.x API
(backward-compatible with version 2.x).
In CQL, data is stored in tables, columns, and rows. In this sense CQL is similar to Structured Query
Language (SQL). These are the key concepts in CQL.
CQL elements – The fundamental elements of CQL are identifiers, constants, terms, and data
types.
Data Definition Language (DDL) – DDL statements are used to manage data structures like
keyspaces and tables, which are AWS resources in Amazon Keyspaces. DDL statements are
control plane operations in AWS.
Data Manipulation Language (DML) – DML statements are used to manage data within tables.
DML statements are used for selecting, inserting, updating, and deleting data. These are data
plane operations in AWS.
Built-in functions – Amazon Keyspaces supports a variety of built-in scalar functions that you
can use in CQL statements.
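As a brief illustration of the DDL/DML distinction, the following statements use a hypothetical keyspace and table; SingleRegionStrategy is the replication class that Amazon Keyspaces uses for single-Region keyspaces:

-- DDL: creates AWS resources (control plane operations)
CREATE KEYSPACE catalog
    WITH REPLICATION = {'class': 'SingleRegionStrategy'};

CREATE TABLE catalog.books (
    isbn text PRIMARY KEY,
    title text,
    published_year int
);

-- DML: reads and writes data (data plane operations)
INSERT INTO catalog.books (isbn, title, published_year)
    VALUES ('978-0141439518', 'Pride and Prejudice', 2003);

SELECT title FROM catalog.books WHERE isbn = '978-0141439518';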
For more information about CQL, see CQL language reference for Amazon Keyspaces (for Apache
Cassandra). For functional differences with Apache Cassandra, see the section called “Functional
differences with Apache Cassandra”.
To run CQL queries, you can do one of the following:
Use the CQL editor in the AWS Management Console.
Use AWS CloudShell and the cqlsh-expansion.
Use a cqlsh client.
Use an Apache 2.0 licensed Cassandra client driver.
In addition to CQL, you can perform Data Definition Language (DDL) operations in Amazon
Keyspaces using the AWS SDKs and the AWS Command Line Interface.
For more information about using these methods to access Amazon Keyspaces, see Accessing
Amazon Keyspaces (for Apache Cassandra).
How does Amazon Keyspaces (for Apache Cassandra) compare to Apache Cassandra?
To establish a connection to Amazon Keyspaces, you can use either a public AWS service endpoint
or a private endpoint using interface VPC endpoints (AWS PrivateLink) in Amazon Virtual
Private Cloud. Depending on the endpoint used, Amazon Keyspaces can appear to the client in one
of the following ways.
AWS service endpoint connection
This is a connection established over any public endpoint. In this case, Amazon Keyspaces
appears as a nine-node Apache Cassandra 3.11.2 cluster to the client.
Interface VPC endpoint connection
This is a private connection established using an interface VPC endpoint. In this case, Amazon
Keyspaces appears as a three-node Apache Cassandra 3.11.2 cluster to the client.
Independent of the connection type and the number of nodes that are visible to the client, Amazon
Keyspaces provides virtually limitless throughput and storage. To do this, Amazon Keyspaces
maps the nodes to load balancers that route your queries to one of the many underlying storage
partitions. For more information about connections, see the section called “How they work”.
Amazon Keyspaces stores data in partitions. A partition is an allocation of storage for a table,
backed by solid state drives (SSDs). Amazon Keyspaces automatically replicates your data across
multiple Availability Zones within an AWS Region for durability and high availability. As your
throughput or storage needs grow, Amazon Keyspaces handles the partition management for you
and automatically provisions the required additional partitions.
Amazon Keyspaces supports all commonly used Cassandra data-plane operations, such as creating
keyspaces and tables, reading data, and writing data. Amazon Keyspaces is serverless, so you don’t
have to provision, patch, or manage servers. You also don’t have to install, maintain, or operate
software. As a result, in Amazon Keyspaces you don't need to use the Cassandra control plane API
operations to manage cluster and node settings.
Amazon Keyspaces automatically configures settings such as replication factor and consistency
level to provide you with high availability, durability, and single-digit-millisecond performance.
For even more resiliency and low-latency local reads, Amazon Keyspaces offers multi-Region
replication.
Topics
Functional differences: Amazon Keyspaces vs. Apache Cassandra
Supported Cassandra APIs, operations, functions, and data types
Supported Apache Cassandra read and write consistency levels and associated costs
Functional differences: Amazon Keyspaces vs. Apache Cassandra
The following are the functional differences between Amazon Keyspaces and Apache Cassandra.
Topics
Apache Cassandra APIs, operations, and data types
Asynchronous creation and deletion of keyspaces and tables
Authentication and authorization
Batch
Cluster configuration
Connections
IN keyword
CQL query throughput tuning
FROZEN collections
Lightweight transactions
Load balancing
Pagination
Partitioners
Prepared statements
Range delete
System tables
Timestamps
Apache Cassandra APIs, operations, and data types
Amazon Keyspaces supports all commonly used Cassandra data-plane operations, such as creating
keyspaces and tables, reading data, and writing data. To see what is currently supported, see
Supported Cassandra APIs, operations, functions, and data types.
Asynchronous creation and deletion of keyspaces and tables
Amazon Keyspaces performs data definition language (DDL) operations, such as creating and
deleting keyspaces and tables, asynchronously. To learn how to monitor the creation status of
resources, see the section called “Check keyspace creation status” and the section called “Check
table creation status”. For a list of DDL statements in the CQL language reference, see the section
called “DDL statements”.
Authentication and authorization
Amazon Keyspaces (for Apache Cassandra) uses AWS Identity and Access Management (IAM)
for user authentication and authorization, and supports authorization policies equivalent to
those of Apache Cassandra. As such, Amazon Keyspaces does not support Apache Cassandra's
security configuration commands.
Batch
Amazon Keyspaces supports unlogged batch commands with up to 30 commands in the batch.
Only unconditional INSERT, UPDATE, or DELETE commands are permitted in a batch. Logged
batches are not supported.
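The following CQL statement is a minimal sketch of an unlogged batch, assuming a hypothetical
table my_keyspace.my_table with columns id and name.

BEGIN UNLOGGED BATCH
    INSERT INTO my_keyspace.my_table (id, name) VALUES (1, 'a');
    INSERT INTO my_keyspace.my_table (id, name) VALUES (2, 'b');
APPLY BATCH;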
Cluster configuration
Amazon Keyspaces is serverless, so there are no clusters, hosts, or Java virtual machines (JVMs)
to configure. Cassandra’s settings for compaction, compression, caching, garbage collection, and
bloom filtering are not applicable to Amazon Keyspaces and are ignored if specified.
Connections
You can use existing Cassandra drivers to communicate with Amazon Keyspaces, but you need to
configure the drivers differently. Amazon Keyspaces supports up to 3,000 CQL queries per TCP
connection per second, but there is no limit on the number of connections a driver can establish.
Most open-source Cassandra drivers establish a connection pool to Cassandra and load balance
queries over that pool of connections. Amazon Keyspaces exposes 9 peer IP addresses to drivers,
and the default behavior of most drivers is to establish a single connection to each peer IP address.
Therefore, the maximum CQL query throughput of a driver using the default settings is 27,000 CQL
queries per second.
To increase this number, we recommend increasing the number of connections per IP address your
driver is maintaining in its connection pool. For example, setting the maximum connections per IP
address to 2 doubles the maximum throughput of your driver to 54,000 CQL queries per second.
As a best practice, we recommend configuring drivers to use 500 CQL queries per second per
connection to allow for overhead and to improve distribution. In this scenario, planning for 18,000
CQL queries per second requires 36 connections. Configuring the driver for 4 connections across 9
endpoints provides 36 connections performing 500 requests per second. For more information
about best practices for connections, see the section called “Connections”.
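As an illustration, with the 4.x version Java driver you can raise the number of connections per
endpoint in the driver's application config file, using the same file format as the consistency
example later in this guide. The setting name below is from the DataStax Java driver 4.x reference
configuration; confirm it against your driver version. A value of 4 matches the 36-connection
example above.

advanced.connection.pool.local.size = 4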
When connecting with VPC endpoints, there might be fewer endpoints available. This means that
you have to increase the number of connections in the driver configuration. For more information
about best practices for VPC connections, see the section called “VPC endpoint connections”.
IN keyword
Amazon Keyspaces supports the IN keyword in the SELECT statement. IN is not supported
with UPDATE and DELETE. When you use the IN keyword in a SELECT statement, the results of
the query are returned in the order in which the keys are presented in the statement. In
Cassandra, the results are ordered lexicographically.
When using ORDER BY, full re-ordering with pagination disabled is not supported, and results
are ordered within each page. Slice queries are not supported with the IN keyword. The TOKEN
function is not supported with the IN keyword. Amazon Keyspaces processes queries with the IN
keyword by creating subqueries. Each subquery counts as a query toward the limit of 3,000 CQL
queries per TCP connection per second. For more information, see the section called “Use IN
SELECT”.
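The following sketch shows the ordering behavior, again assuming the hypothetical table
my_keyspace.my_table. The rows come back in the order 3, 1, 2, matching the order of the keys in
the statement.

SELECT id, name FROM my_keyspace.my_table WHERE id IN (3, 1, 2);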
CQL query throughput tuning
Amazon Keyspaces supports up to 3,000 CQL queries per TCP connection per second, but there is
no limit on the number of connections a driver can establish.
Most open-source Cassandra drivers establish a connection pool to Cassandra and load balance
queries over that pool of connections. Amazon Keyspaces exposes 9 peer IP addresses to drivers,
and the default behavior of most drivers is to establish a single connection to each peer IP address.
Therefore, the maximum CQL query throughput of a driver using the default settings is 27,000
CQL queries per second.
To increase this number, we recommend increasing the number of connections per IP address your
driver is maintaining in its connection pool. For example, setting the maximum connections per IP
address to 2 doubles the maximum throughput of your driver to 54,000 CQL queries per second.
For more information about best practices for connections, see the section called “Connections”.
When connecting with VPC endpoints, fewer endpoints are available. This means that you have to
increase the number of connections in the driver configuration. For more information about best
practices for VPC endpoint connections, see the section called “VPC endpoint connections”.
FROZEN collections
The FROZEN keyword in Cassandra serializes multiple components of a collection data type into a
single immutable value that is treated like a BLOB. INSERT and UPDATE statements overwrite the
entire collection.
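For example, the following sketch declares a hypothetical table with a frozen map column; any
INSERT or UPDATE replaces the map value as a whole.

CREATE TABLE my_keyspace.my_table (
    id int PRIMARY KEY,
    device_tags frozen<map<text, text>>
);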
Amazon Keyspaces supports up to five levels of nesting for frozen collections by default. For more
information, see the section called “Amazon Keyspaces service quotas”.
Amazon Keyspaces doesn't support inequality comparisons that use the entire frozen collection in a
conditional UPDATE or SELECT statement. The behavior for collections and frozen collections is the
same in Amazon Keyspaces.
When you're using frozen collections with client-side timestamps, in the case where the timestamp
of a write operation is the same as the timestamp of an existing column that isn't expired or
tombstoned, Amazon Keyspaces doesn't perform comparisons. Instead, it lets the server determine
the latest writer, and the latest writer wins.
For more information about frozen collections, see the section called “Collection types”.
Lightweight transactions
Amazon Keyspaces (for Apache Cassandra) fully supports compare and set functionality on INSERT,
UPDATE, and DELETE commands, which are known as lightweight transactions (LWTs) in Apache
Cassandra. As a serverless offering, Amazon Keyspaces (for Apache Cassandra) provides consistent
performance at any scale, including for lightweight transactions. With Amazon Keyspaces, there is
no performance penalty for using lightweight transactions.
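The following CQL sketch shows both forms of lightweight transactions against the hypothetical
table used in earlier examples.

INSERT INTO my_keyspace.my_table (id, name) VALUES (1, 'a') IF NOT EXISTS;
UPDATE my_keyspace.my_table SET name = 'b' WHERE id = 1 IF name = 'a';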
Load balancing
The system.peers table entries correspond to Amazon Keyspaces load balancers. For best results,
we recommend using a round robin load-balancing policy and tuning the number of connections
per IP to suit your application's needs.
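As a minimal sketch with the 3.x version Java driver used elsewhere in this guide, you can set a
round robin policy when building the cluster. RoundRobinPolicy is a DataStax 3.x driver class; the
equivalent configuration differs in other driver versions.

Cluster cluster = Cluster.builder()
    .addContactPoint(endPoint)
    .withLoadBalancingPolicy(new RoundRobinPolicy())
    .build();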
Pagination
Amazon Keyspaces paginates results based on the number of rows that it reads to process a
request, not the number of rows returned in the result set. As a result, some pages might contain
fewer rows than you specify in PAGE SIZE for filtered queries. In addition, Amazon Keyspaces
paginates results automatically after reading 1 MB of data to provide customers with consistent,
single-digit millisecond read performance. For more information, see the section called “Paginate
results”.
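For example, you can observe this behavior from cqlsh by setting a page size and running a
filtered query against a hypothetical table; pages of the result can contain fewer rows than the
page size.

PAGING 100;
SELECT * FROM my_keyspace.my_table WHERE name = 'a' ALLOW FILTERING;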
In tables with static columns, both Apache Cassandra and Amazon Keyspaces establish the
partition's static column value at the start of each page in a multi-page query. When a table has
large data rows, as a result of the Amazon Keyspaces pagination behavior, the likelihood is higher
that a range read operation result could return more pages for Amazon Keyspaces than for Apache
Cassandra. Consequently, there is a higher likelihood in Amazon Keyspaces that concurrent updates
to the static column could result in the static column value being different in different pages of the
range read result set.
Partitioners
The default partitioner in Amazon Keyspaces is the Cassandra-compatible Murmur3Partitioner.
In addition, you have the choice of using either the Amazon Keyspaces DefaultPartitioner or
the Cassandra-compatible RandomPartitioner.
With Amazon Keyspaces, you can safely change the partitioner for your account without having to
reload your Amazon Keyspaces data. After the configuration change has completed, which takes
approximately 10 minutes, clients will see the new partitioner setting automatically the next time
they connect. For more information, see the section called “Working with partitioners”.
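The partitioners section documents changing this setting through the system.local table. A
hedged sketch of checking and changing it follows; confirm the exact statements in that section.

SELECT partitioner FROM system.local;
UPDATE system.local SET partitioner = 'org.apache.cassandra.dht.Murmur3Partitioner' WHERE key = 'local';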
Prepared statements
Amazon Keyspaces supports the use of prepared statements for data manipulation language (DML)
operations, such as reading and writing data. Amazon Keyspaces does not currently support the
use of prepared statements for data definition language (DDL) operations, such as creating tables
and keyspaces. DDL operations must be run outside of prepared statements.
Range delete
Amazon Keyspaces supports deleting rows within a range. A range is a contiguous set of rows
within a partition. You specify a range in a DELETE operation by using a WHERE clause. You can
specify the range to be an entire partition.
Furthermore, you can specify a range to be a subset of contiguous rows within a partition by using
relational operators (for example, '>', '<'), or by including the partition key and omitting one or
more clustering columns. With Amazon Keyspaces, you can delete up to 1,000 rows within a range
in a single operation.
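The following sketch shows both cases for a hypothetical table with partition key pk and
clustering column ck.

-- delete an entire partition
DELETE FROM my_keyspace.my_table2 WHERE pk = 'p1';
-- delete a contiguous subset of rows within the partition
DELETE FROM my_keyspace.my_table2 WHERE pk = 'p1' AND ck >= 10 AND ck < 20;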
Range deletes are not isolated. Individual row deletions are visible to other operations while a
range delete is in process.
System tables
Amazon Keyspaces populates the system tables that are required by Apache 2.0 licensed open-
source Cassandra drivers. The system tables that are visible to a client contain information that's unique to
the authenticated user. The system tables are fully controlled by Amazon Keyspaces and are read-
only. For more information, see the section called “System keyspaces”.
Read-only access to system tables is required, and you can control it with IAM access policies. For
more information, see the section called “Managing access using policies”. You must define tag-
based access control policies for system tables differently depending on whether you use the AWS
SDK or Cassandra Query Language (CQL) API calls through Cassandra drivers and developer tools.
To learn more about tag-based access control for system tables, see the section called “Amazon
Keyspaces resource access based on tags”.
If you access Amazon Keyspaces using Amazon VPC endpoints, you see entries in the
system.peers table for each Amazon VPC endpoint that Amazon Keyspaces has permissions to
see. As a result, your Cassandra driver might issue a warning message about the control node itself
in the system.peers table. You can safely ignore this warning.
Timestamps
In Amazon Keyspaces, cell-level timestamps that are compatible with the default timestamps in
Apache Cassandra are an opt-in feature.
The USING TIMESTAMP clause and the WRITETIME function are only available when client-side
timestamps are turned on for a table. To learn more about client-side timestamps in Amazon
Keyspaces, see the section called “Client-side timestamps”.
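With client-side timestamps turned on, statements such as the following sketch work against a
hypothetical table; the timestamp value is in microseconds since the epoch.

INSERT INTO my_keyspace.my_table (id, name) VALUES (1, 'a') USING TIMESTAMP 1669069624000000;
SELECT name, WRITETIME(name) FROM my_keyspace.my_table WHERE id = 1;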
Supported Cassandra APIs, operations, functions, and data
types
Amazon Keyspaces (for Apache Cassandra) is compatible with Cassandra Query Language (CQL)
3.11 API (backward-compatible with version 2.x).
Amazon Keyspaces supports all commonly used Cassandra data-plane operations, such as creating
keyspaces and tables, reading data, and writing data.
The following sections list the supported functionality.
Topics
Cassandra API support
Cassandra control plane API support
Cassandra data plane API support
Cassandra function support
Cassandra data type support
Cassandra API support
API operation                Supported
CREATE KEYSPACE              Yes
ALTER KEYSPACE               Yes
DROP KEYSPACE                Yes
CREATE TABLE                 Yes
ALTER TABLE                  Yes
DROP TABLE                   Yes
CREATE INDEX                 No
DROP INDEX                   No
UNLOGGED BATCH               Yes
LOGGED BATCH                 No
SELECT                       Yes
INSERT                       Yes
DELETE                       Yes
UPDATE                       Yes
USE                          Yes
CREATE TYPE                  No
ALTER TYPE                   No
DROP TYPE                    No
CREATE TRIGGER               No
DROP TRIGGER                 No
CREATE FUNCTION              No
DROP FUNCTION                No
CREATE AGGREGATE             No
DROP AGGREGATE               No
CREATE MATERIALIZED VIEW     No
ALTER MATERIALIZED VIEW      No
DROP MATERIALIZED VIEW       No
TRUNCATE                     No
Cassandra control plane API support
Because Amazon Keyspaces is managed, the Cassandra control plane API operations to manage
cluster and node settings are not required. As a result, the following Cassandra features are not
applicable.
Feature Reason
Durable writes toggle All writes are durable
Read repair settings Not applicable
GC grace seconds Not applicable
Bloom filter settings Not applicable
Compaction settings Not applicable
Compression settings Not applicable
Caching settings Not applicable
Security settings Replaced by IAM
Cassandra data plane API support
Feature                                          Supported
JSON support for SELECT and INSERT statements    Yes
Static columns                                   Yes
Time to Live (TTL)                               Yes
Cassandra function support
For more information about the supported functions, see the section called “Built-in functions”.
Function                        Supported
Aggregate functions             No
Blob conversion                 Yes
Cast                            Yes
Datetime functions              Yes
Time conversion functions       Yes
TimeUuid functions              Yes
Token                           Yes
User-defined functions (UDF)    No
Uuid                            Yes
Cassandra data type support
Data type                   Supported    Note
ascii                       Yes
bigint                      Yes
blob                        Yes
boolean                     Yes
counter                     Yes
date                        Yes
decimal                     Yes
double                      Yes
float                       Yes
frozen                      Yes
inet                        Yes
int                         Yes
list                        Yes
map                         Yes
set                         Yes
smallint                    Yes
text                        Yes
time                        Yes
timestamp                   Yes
timeuuid                    Yes
tinyint                     Yes
tuple                       Yes
user-defined types (UDT)    No           To refactor UDTs with Protocol Buffers, see Amazon Keyspaces Protocol Buffers.
uuid                        Yes
varchar                     Yes
varint                      Yes
Supported Apache Cassandra read and write consistency levels
and associated costs
The topics in this section describe which Apache Cassandra consistency levels are supported for
read and write operations in Amazon Keyspaces (for Apache Cassandra).
Topics
Write consistency levels
Read consistency levels
Unsupported consistency levels
Write consistency levels
Amazon Keyspaces replicates all write operations three times across multiple Availability Zones
for durability and high availability. Writes are durably stored before they are acknowledged using
the LOCAL_QUORUM consistency level. For each 1 KB write, you are billed 1 write capacity unit
(WCU) for tables using provisioned capacity mode or 1 write request unit (WRU) for tables using
on-demand mode.
You can use cqlsh to set the consistency for all queries in the current session to LOCAL_QUORUM
using the following code.
CONSISTENCY LOCAL_QUORUM;
To configure the consistency level programmatically, you can set the consistency with the
appropriate Cassandra client drivers. For example, the 4.x version Java drivers allow you to set the
consistency level in the app config file as shown below.
basic.request.consistency = LOCAL_QUORUM
If you're using a 3.x version Java Cassandra driver, you can specify the
consistency level for the session by adding .withQueryOptions(new
QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM)) as shown in
the following code example.
Session session = Cluster.builder()
    .addContactPoint(endPoint)
    .withPort(portNumber)
    .withAuthProvider(new SigV4AuthProvider("us-east-2"))
    .withSSL()
    .withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM))
    .build()
    .connect();
To configure the consistency level for specific write operations, you can define the consistency
when you call QueryBuilder.insertInto with a setConsistencyLevel argument when
you're using the Java driver.
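A minimal sketch with the 3.x version Java driver, assuming the session from the previous example
and the hypothetical table my_keyspace.my_table; Statement, ConsistencyLevel, and QueryBuilder
come from the DataStax 3.x driver packages.

Statement insert = QueryBuilder.insertInto("my_keyspace", "my_table")
    .value("id", 1)
    .value("name", "a")
    .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
session.execute(insert);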
Read consistency levels
Amazon Keyspaces supports three read consistency levels: ONE, LOCAL_ONE, and LOCAL_QUORUM.
During a LOCAL_QUORUM read, Amazon Keyspaces returns a response reflecting the most recent
updates from all prior successful write operations. Using the consistency level ONE or LOCAL_ONE
can improve the performance and availability of your read requests, but the response might not
reflect the results of a recently completed write.
For each 4 KB read using ONE or LOCAL_ONE consistency, you are billed 0.5 read capacity units
(RCUs) for tables using provisioned capacity mode or 0.5 read request units (RRUs) for tables using
on-demand mode. For each 4 KB read using LOCAL_QUORUM consistency, you are billed 1 read
capacity unit (RCU) for tables using provisioned capacity mode or 1 read request unit (RRU) for
tables using on-demand mode.
Billing based on read consistency and read capacity throughput mode per table for each 4 KB
of reads
Consistency level    Provisioned    On-demand
ONE                  0.5 RCUs       0.5 RRUs
LOCAL_ONE            0.5 RCUs       0.5 RRUs
LOCAL_QUORUM         1 RCU          1 RRU
To specify a different consistency for read operations, call QueryBuilder.select with a
setConsistencyLevel argument when you're using the Java driver.
Unsupported consistency levels
The following consistency levels are not supported by Amazon Keyspaces and result in
exceptions.
Unsupported consistency levels
Apache Cassandra    Amazon Keyspaces
EACH_QUORUM         Not supported
QUORUM              Not supported
ALL                 Not supported
TWO                 Not supported
THREE               Not supported
ANY                 Not supported
SERIAL              Not supported
LOCAL_SERIAL        Not supported
Migrating to Amazon Keyspaces (for Apache Cassandra)
Migrating to Amazon Keyspaces (for Apache Cassandra) presents a range of compelling benefits
for businesses and organizations. Here are some key advantages that make Amazon Keyspaces an
attractive choice for migration.
Scalability – Amazon Keyspaces is designed to handle massive workloads and scale seamlessly
to accommodate growing data volumes and traffic. With traditional Cassandra, scaling is not
performed on demand and requires planning for future peaks. With Amazon Keyspaces, you can
easily scale your tables up or down based on demand, ensuring that your applications can handle
sudden spikes in traffic without compromising performance.
Performance – Amazon Keyspaces offers low-latency data access, enabling applications to
retrieve and process data with exceptional speed. Its distributed architecture ensures that read
and write operations are distributed across multiple nodes, delivering consistent, single-digit
millisecond response times even at high request rates.
Fully managed – Amazon Keyspaces is a fully managed service provided by AWS. This means
that AWS handles the operational aspects of database management, including provisioning,
configuration, patching, backups, and scaling. This allows you to focus more on developing your
applications and less on database administration tasks.
Serverless architecture – Amazon Keyspaces is serverless. You pay only for capacity consumed
with no upfront capacity provisioning required. You don't have servers to manage or instances to
choose. This pay-per-request model offers cost efficiency and minimal operational overhead, as
you only pay for the resources you consume without the need to provision and monitor capacity.
NoSQL flexibility with schema – Amazon Keyspaces follows a NoSQL data model, providing
flexibility in schema design. With Amazon Keyspaces, you can store structured, semi-structured,
and unstructured data, making it well-suited for handling diverse and evolving data types.
Additionally, Amazon Keyspaces performs schema validation on write, allowing for a centralized
evolution of the data model. This flexibility enables faster development cycles and easier
adaptation to changing business requirements.
High availability and durability – Amazon Keyspaces replicates data across multiple Availability
Zones within an AWS Region, ensuring high availability and data durability. It automatically
handles replication, failover, and recovery, minimizing the risk of data loss or service disruptions.
Amazon Keyspaces provides an availability SLA of up to 99.999%. For even more resiliency and
low-latency local reads, Amazon Keyspaces offers multi-Region replication.
Security and compliance – Amazon Keyspaces integrates with AWS Identity and Access
Management for fine-grained access control. It provides encryption at rest and in-transit, helping
to improve the security of your data. Amazon Keyspaces has been assessed by third-party
auditors for security and compliance with specific programs, including HIPAA, PCI DSS, and SOC,
enabling you to meet regulatory requirements. For more information, see the section called
“Compliance validation”.
Integration with the AWS ecosystem – As part of the AWS ecosystem, Amazon Keyspaces
seamlessly integrates with other AWS services, for example AWS CloudFormation, Amazon
CloudWatch, and AWS CloudTrail. This integration enables you to build serverless architectures,
leverage infrastructure as code, and create real-time data-driven applications. For more
information, see Monitoring Amazon Keyspaces.
Topics
Create a migration plan for migrating from Apache Cassandra to Amazon Keyspaces
How to select the right tool for bulk uploading or migrating data to Amazon Keyspaces
Create a migration plan for migrating from Apache Cassandra
to Amazon Keyspaces
For a successful migration from Apache Cassandra to Amazon Keyspaces, we recommend a review
of the applicable migration concepts and best practices as well as a comparison of the available
options.
This topic outlines how the migration process works by introducing several key concepts and the
tools and techniques available to you. You can evaluate the different migration strategies to select
the one that best meets your requirements.
Topics
Functional compatibility
Estimate Amazon Keyspaces pricing
Choose a migration strategy
Online migration to Amazon Keyspaces: strategies and best practices
Offline migration process: Apache Cassandra to Amazon Keyspaces
Using a hybrid migration solution: Apache Cassandra to Amazon Keyspaces
Functional compatibility
Consider the functional differences between Apache Cassandra and Amazon Keyspaces carefully
before the migration. Amazon Keyspaces supports all commonly used Cassandra data-plane
operations, such as creating keyspaces and tables, reading data, and writing data.
However, there are some Cassandra APIs that Amazon Keyspaces doesn't support. For more
information about supported APIs, see the section called “Supported Cassandra APIs, operations,
functions, and data types”. For an overview of all functional differences between Amazon
Keyspaces and Apache Cassandra, see the section called “Functional differences with Apache
Cassandra”.
To compare the Cassandra APIs and schema that you're using with supported functionality in
Amazon Keyspaces, you can run a compatibility script available in the Amazon Keyspaces toolkit on
GitHub.
How to use the compatibility script
1. Download the compatibility Python script from GitHub and move it to a location that has
access to your existing Apache Cassandra cluster.
2. The compatibility script uses similar parameters as CQLSH. For --host and --port, enter
the IP address and the port that you use to connect and run queries against one of the
Cassandra nodes in your cluster.
If your Cassandra cluster uses authentication, you also need to provide --username and
--password. To run the compatibility script, you can use the following command.
python toolkit-compat-tool.py --host <hostname or IP> -u "username" -p "password" --port <native transport port>
Estimate Amazon Keyspaces pricing
This section provides an overview of the information you need to gather from your Apache
Cassandra tables to calculate the estimated cost for Amazon Keyspaces. Each one of your tables
requires different data types, needs to support different CQL queries, and maintains distinctive
read/write traffic.
Thinking of your requirements based on tables aligns with Amazon Keyspaces table-level resource
isolation and read/write throughput capacity modes. With Amazon Keyspaces, you can define read/
write capacity and automatic scaling policies for tables independently.
Understanding table requirements helps you prioritize tables for migration based on functionality,
cost, and migration effort.
Collect the following Cassandra table metrics before a migration. This information helps to
estimate the cost of your workload on Amazon Keyspaces.
Table name – The fully qualified keyspace and table name.
Description – A description of the table, for example how it’s used, or what type of data is stored
in it.
Average reads per second – The average number of coordinator-level reads against the table over
a large time interval.
Average writes per second – The average number of coordinator-level writes against the table
over a large time interval.
Average row size in bytes – The average row size in bytes.
Storage size in GBs – The raw storage size for a table.
Read consistency breakdown – The percentage of reads that use eventual consistency
(LOCAL_ONE or ONE) vs. strong consistency (LOCAL_QUORUM).
This table shows an example of the information about your tables that you need to pull together
when planning a migration.
Table name             Description                                 Average reads    Average writes    Average row      Storage size    Read consistency
                                                                   per second       per second        size in bytes    in GBs          breakdown
mykeyspace.mytable     Used to store shopping cart history         10,000           5,000             2,200            2,000           100% LOCAL_ONE
mykeyspace.mytable2    Used to store latest profile information    20,000           1,000             850              1,000           25% LOCAL_QUORUM,
                                                                                                                                       75% LOCAL_ONE
How to collect table metrics
This section provides step-by-step instructions on how to collect the necessary table metrics from
your existing Cassandra cluster. These metrics include row size, table size, and read/write requests
per second (RPS). They allow you to assess throughput capacity requirements for an Amazon
Keyspaces table and estimate pricing.
How to collect table metrics on the Cassandra source table
1. Determine row size
Row size is important for determining the read capacity and write capacity utilization
in Amazon Keyspaces. The following diagram shows the typical data distribution over a
Cassandra token range.
You can use a row size sampler script available on GitHub to collect row size metrics for each
table in your Cassandra cluster.
The script exports table data from Apache Cassandra by using cqlsh and awk to calculate the
min, max, average, and standard deviation of row size over a configurable sample set of table
data. The row size sampler passes the arguments to cqlsh, so the same parameters can be
used to connect and read from your Cassandra cluster.
The following statement is an example of this.
./row-size-sampler.sh 10.22.33.44 9142 \
    -u "username" -p "password" --ssl
For more information on how row size is calculated in Amazon Keyspaces, see the section
called “Estimate row size”.
2. Determine table size
With Amazon Keyspaces, you don't need to provision storage in advance. Amazon Keyspaces
monitors the billable size of your tables continuously to determine your storage charges.
Storage is billed per GB-month. Amazon Keyspaces table size is based on the raw size
(uncompressed) of a single replica.
To monitor the table size in Amazon Keyspaces, you can use the metric
BillableTableSizeInBytes, which is displayed for each table in the AWS Management
Console.
To estimate the billable size of your Amazon Keyspaces table, you can use either one of these
two methods:
Use the average row size and multiply by the number of rows.
You can estimate the size of the Amazon Keyspaces table by multiplying the average row
size by the number of rows from your Cassandra source table. Use the row size sampler
script from the previous section to capture the average row size. To capture the row count,
you can use tools like dsbulk count to determine the total number of rows in your
source table.
Use the nodetool to gather table metadata.
Nodetool is an administrative tool provided in the Apache Cassandra distribution that
provides insight into the state of the Cassandra process and returns table metadata. You
can use nodetool to sample metadata about table size and then extrapolate the
table size in Amazon Keyspaces.
The command to use is nodetool tablestats. Tablestats returns the table's size and
compression ratio. The table's size is stored as the tablelivespace for the table. Divide it
by the compression ratio, then multiply this size value by the number of nodes. Finally,
divide by the replication factor (typically three).
This is the complete formula for the calculation that you can use to assess table size.
((tablelivespace / compression ratio) * (total number of nodes)) / (replication factor)
Let's assume that your Cassandra cluster has 12 nodes. Running the nodetool
tablestats command returns a tablelivespace of 200 GB and a compression
ratio of 0.5. The keyspace has a replication factor of three.
This is how the calculation looks for this example.

(200 GB / 0.5) * (12 nodes) / (replication factor of 3)
= 4,800 GB / 3
= 1,600 GB table size estimate for Amazon Keyspaces
3. Capture the number of reads and writes
To determine the capacity and scaling requirements for your Amazon Keyspaces tables,
capture the read and write request rate of your Cassandra tables before the migration.
Amazon Keyspaces is serverless and you only pay for what you use. In general, the price of
read/write throughput in Amazon Keyspaces is based on the number and size of the requests.
There are two capacity modes in Amazon Keyspaces:
On-demand – This is a flexible billing option capable of serving thousands of requests per
second without the need for capacity planning. It offers pay-per-request pricing for read and
write requests so that you pay only for what you use.
Provisioned – If you choose provisioned throughput capacity mode, you specify the number
of reads and writes per second that are required for your application. This helps you manage
your Amazon Keyspaces usage to stay at or below a defined request rate to optimize price
and maintain predictability.
Provisioned mode offers auto scaling to automatically adjust your provisioned rate to scale
up or scale down to improve operational efficiency. For more information about serverless
resource management, see Managing serverless resources.
Because you provision read and write throughput capacity in Amazon Keyspaces separately,
you need to measure the request rate for reads and writes in your existing tables
independently.
To gather the most accurate utilization metrics from your existing Cassandra cluster, capture
the average requests per second (RPS) for coordinator-level read and write operations over an
extended period of time for a table that is aggregated over all nodes in a single data center.
Capturing the average RPS over a period of at least several weeks captures peaks and valleys in
your traffic patterns, as shown in the following diagram.
You have two options to determine the read and write request rate of your Cassandra table.
Use existing Cassandra monitoring
You can use the metrics shown in the following table to observe read and write requests.
Note that the metric names can change based on the monitoring tool that you're using.
Dimension    Cassandra JMX metric
Writes       org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency#Count
Reads        org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency#Count
Use the nodetool
Use nodetool tablestats and nodetool info to capture average read and write
operations from the table. tablestats returns the total read and write count from the
time the node was started. nodetool info provides the up-time for a node in
seconds.
To get the per-second average of reads and writes, divide the read and write count
by the node up-time in seconds. Then, for reads, divide by the consistency level, and
for writes, divide by the replication factor. These calculations are expressed in the
following formulas.
Formula for average reads per second:
((number of reads * number of nodes in cluster) / read consistency quorum
(2)) / uptime
Formula for average writes per second:
((number of writes * number of nodes in cluster) / replication factor of 3) /
uptime
Let's assume we have a 12 node cluster that has been up for 4 weeks. nodetool info
returns 2,419,200 seconds of up-time and nodetool tablestats returns 1 billion
writes and 2 billion reads. This example would result in the following calculation.
((2 billion reads * 12 nodes in cluster) / read consistency quorum (2)) / 2,419,200 seconds
= 12 billion reads / 2,419,200 seconds
= 4,960 read requests per second

((1 billion writes * 12 nodes in cluster) / replication factor of 3) / 2,419,200 seconds
= 4 billion writes / 2,419,200 seconds
= 1,653 write requests per second
4. Determine the capacity utilization of the table
To estimate the average capacity utilization, start with the average request rates and the
average row size of your Cassandra source table.
Amazon Keyspaces uses read capacity units (RCUs) and write capacity units (WCUs) to measure
provisioned throughput capacity for reads and writes for tables. For this estimate we use these
units to calculate the read and write capacity needs of the new Amazon Keyspaces table after
migration.
Later in this topic we'll discuss how the choice between provisioned and on-demand capacity
mode affects billing. But for the estimate of capacity utilization in this example, we assume
that the table is in provisioned mode.
Reads – One RCU represents one LOCAL_QUORUM read request, or two LOCAL_ONE read
requests, for a row up to 4 KB in size. If you need to read a row that is larger than 4 KB, the
read operation uses additional RCUs. The total number of RCUs required depends on the row
size, and whether you want to use LOCAL_QUORUM or LOCAL_ONE read consistency.
For example, reading an 8 KB row requires 2 RCUs using LOCAL_QUORUM read consistency,
and 1 RCU if you choose LOCAL_ONE read consistency.
Writes – One WCU represents one write for a row up to 1 KB in size. All writes use
LOCAL_QUORUM consistency, and there is no additional charge for using lightweight
transactions (LWTs).
The total number of WCUs required depends on the row size. If you need to write a row that
is larger than 1 KB, the write operation uses additional WCUs. For example, if your row size is
2 KB, you require 2 WCUs to perform one write request.
The following formula can be used to estimate the required RCUs and WCUs.
Read capacity in RCUs can be determined by multiplying the read requests per second by the
number of rows read per request, multiplied by the average row size divided by 4 KB, rounded
up to the nearest whole number.
Write capacity in WCUs can be determined by multiplying the number of write requests per
second by the average row size divided by 1 KB, rounded up to the nearest whole number.
This is expressed in the following formulas.
Read requests per second * ROUNDUP(average row size / 4 KB per unit) = RCUs per second
Write requests per second * ROUNDUP(average row size / 1 KB per unit) = WCUs per second
For example, if you're performing 4,960 read requests per second with a row size of 2.5 KB on
your Cassandra table, you need 4,960 RCUs in Amazon Keyspaces. If you're currently performing
1,653 write requests per second with a row size of 2.5 KB on your Cassandra table, you need
4,959 WCUs per second in Amazon Keyspaces.
This example is expressed in the following formulas.
4,960 read requests per second * ROUNDUP(2.5 KB / 4 KB per unit)
= 4,960 read requests per second * 1 RCU
= 4,960 RCUs

1,653 write requests per second * ROUNDUP(2.5 KB / 1 KB per unit)
= 1,653 write requests per second * 3 WCUs
= 4,959 WCUs
Using eventual consistency allows you to save up to half of the throughput capacity on
each read request. Each eventually consistent read can consume up to 8 KB. You can calculate
eventually consistent reads by multiplying the previous calculation by 0.5, as shown in the
following formula.

4,960 read requests per second * ROUNDUP(2.5 KB / 4 KB per unit) * 0.5
= 2,480 read requests per second * 1 RCU
= 2,480 RCUs
5. Calculate the monthly pricing estimate for Amazon Keyspaces
To estimate the monthly billing for the table based on read/write capacity throughput, you
can calculate the pricing for on-demand and for provisioned mode using different formulas
and compare the options for your table.
Provisioned mode – Read and write capacity consumption is billed at an hourly rate based on
the capacity units per second. First, divide that rate by 0.7 to represent the default auto scaling
target utilization of 70%. Then multiply by 24 hours per day, 30 calendar days, and the regional
rate.
This calculation is summarized in the following formulas.
(read capacity per second / 0.7) * 24 hours * 30 days * regional rate
(write capacity per second / 0.7) * 24 hours * 30 days * regional rate
On-demand mode – Read and write capacity is billed at a per-request rate. First, multiply the
request rate by the number of seconds in 30 calendar days (30 days * 24 hours * 60 minutes *
60 seconds). Then divide by one million request units. Finally, multiply by the regional rate.
This calculation is summarized in the following formulas.
((read capacity per second * 30 * 24 * 60 * 60) / 1 million read request units) * regional rate
((write capacity per second * 30 * 24 * 60 * 60) / 1 million write request units) * regional rate
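As a hedged illustration using the 4,960 RCUs from the earlier example, the read estimates work
out as follows. The dollar rates below are placeholders, not actual prices; check current Amazon
Keyspaces pricing for your Region.

Provisioned: (4,960 RCUs / 0.7) * 24 hours * 30 days * $0.00015 per RCU-hour (hypothetical rate)
= 7,086 RCUs * 720 hours * $0.00015
≈ $765 per month for reads

On-demand: ((4,960 * 30 * 24 * 60 * 60) / 1 million) * $0.25 per million read request units (hypothetical rate)
= 12,856.32 * $0.25
≈ $3,214 per month for reads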
Choose a migration strategy
You can choose between the following migration strategies when migrating from Apache
Cassandra to Amazon Keyspaces:
Online – This is a live migration using dual writes to start writing new data to Amazon Keyspaces
and the Cassandra cluster simultaneously. This migration type is recommended for applications
that require zero downtime during migration and read-after-write consistency.
For more information about how to plan and implement an online migration strategy, see the
section called “Online migration”.
Offline – This migration technique involves copying a data set from Cassandra to Amazon
Keyspaces during a downtime window. Offline migration can simplify the migration process,
because it doesn't require changes to your application or conflict resolution between historical
data and new writes.
For more information about how to plan an offline migration, see the section called “Offline
migration”.
Hybrid – This migration technique allows changes to be replicated to Amazon Keyspaces in
near real time, but without read-after-write consistency.
After reviewing the migration techniques and best practices discussed in this topic, you can place
the available options in a decision tree to design a migration strategy based on your requirements
and available resources.
Online migration to Amazon Keyspaces: strategies and best practices
If you need to maintain application availability during a migration from Apache Cassandra to
Amazon Keyspaces, you can prepare a custom online migration strategy by implementing the key
components discussed in this topic. By following these best practices for online migrations, you
can ensure that application availability and read-after-write consistency are maintained during the
entire migration process, minimizing the impact on your users.
When designing an online migration strategy from Apache Cassandra to Amazon Keyspaces, you
need to consider the following key steps.
1. Writing new data
Application dual-writes: You can implement dual writes in your application using existing
Cassandra client libraries and drivers. Designate one database as the leader and the other as
the follower. Write failures to the follower database are recorded in a dead letter queue (DLQ)
for analysis.
Messaging tier dual-writes: Alternatively, you can configure your existing messaging platform
to send writes to both Cassandra and Amazon Keyspaces using an additional consumer. This
creates eventually consistent views across both databases.
2. Migrating historical data
Copy historical data: You can migrate historical data from Cassandra to Amazon Keyspaces
using AWS Glue or custom extract, transform, and load (ETL) scripts. Handle conflict
resolution between dual writes and bulk loads using techniques like lightweight transactions
or timestamps.
Use Time-To-Live (TTL): For shorter data retention periods, you can use TTL in both Cassandra
and Amazon Keyspaces to avoid uploading unnecessary historical data. As old data expires in
Cassandra and new data is written via dual-writes, Amazon Keyspaces eventually catches up.
3. Validating data
Dual reads: Implement dual reads from both Cassandra (primary) and Amazon Keyspaces
(secondary) databases, comparing results asynchronously. Differences are logged or sent to a
DLQ.
Sample reads: Use AWS Lambda functions to periodically sample and compare data across both
systems, logging any discrepancies to a DLQ.
4. Migrating the application
Blue-green strategy: Switch your application to treat Amazon Keyspaces as the primary and
Cassandra as the secondary data store in a single step. Monitor performance and roll back if
issues arise.
Canary deployment: Gradually roll out the migration to a subset of users first, incrementally
increasing traffic to Amazon Keyspaces as the primary until fully migrated.
5. Decommissioning Cassandra
Once your application is fully migrated to Amazon Keyspaces and data consistency is validated,
you can plan to decommission your Cassandra cluster based on data retention policies.
By planning an online migration strategy with these components, you can transition smoothly to
the fully managed Amazon Keyspaces service with minimal downtime or disruption. The following
sections go into each component in more detail.
Topics
Writing new data during an online migration
Uploading historical data during an online migration
Validating data consistency during an online migration
Migrating the application during an online migration
Decommissioning Cassandra after an online migration
Writing new data during an online migration
The first step in an online migration plan is to ensure that any new data written by the application
is stored in both databases, your existing Cassandra cluster and Amazon Keyspaces. The goal is to
provide a consistent view across the two data stores. You can do this by applying all new writes to
both databases. To implement dual writes, consider one of the following two options.
Application dual writes – You can implement dual writes with minimal changes to your
application code by leveraging the existing Cassandra client libraries and drivers. You can either
implement dual writes in your existing application, or create a new layer in the architecture to
handle dual writes. For more information and a customer case study that shows how dual writes
were implemented in an existing application, see Cassandra migration case study.
When implementing dual writes, you can designate one database as the leader and the other
database as the follower. This allows you to keep writing to your original source (leader)
database without letting write failures to the follower (destination) database disrupt the critical
path of your application.
Instead of retrying failed writes to the follower, you can use Amazon Simple Queue Service to
record failed writes in a dead letter queue (DLQ). The DLQ lets you analyze the failed writes to
the follower and determine why processing did not succeed in the destination database.
For a more sophisticated dual write implementation, you can follow AWS best practices for
designing a sequence of local transactions using the saga pattern. A saga pattern ensures that
if a transaction fails, the saga runs compensating transactions to revert the database changes
made by the previous transactions.
When using dual-writes for an online migration, you can configure the dual-writes following
the saga pattern so that each write is a local transaction to ensure atomic operations across
heterogeneous databases. For more information about designing distributed application using
recommended design patterns for the AWS Cloud, see Cloud design patterns, architectures, and
implementations.
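A minimal sketch of application-level dual writes with a DLQ follows, assuming 4.x Java driver
CqlSession objects for both databases and an AWS SDK for Java 2.x SqsClient. The class and
variable names are hypothetical; this is one possible shape, not a prescribed implementation.

import com.datastax.oss.driver.api.core.CqlSession;
import software.amazon.awssdk.services.sqs.SqsClient;

public class DualWriter {
    private final CqlSession leader;   // existing Cassandra cluster
    private final CqlSession follower; // Amazon Keyspaces
    private final SqsClient sqs;
    private final String dlqUrl;

    public DualWriter(CqlSession leader, CqlSession follower, SqsClient sqs, String dlqUrl) {
        this.leader = leader;
        this.follower = follower;
        this.sqs = sqs;
        this.dlqUrl = dlqUrl;
    }

    public void write(String cql) {
        // The leader write stays on the application's critical path.
        leader.execute(cql);
        // The follower write is asynchronous; failures are recorded in the DLQ for analysis.
        follower.executeAsync(cql).whenComplete((resultSet, error) -> {
            if (error != null) {
                sqs.sendMessage(b -> b.queueUrl(dlqUrl).messageBody(cql));
            }
        });
    }
}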
Messaging tier dual writes – Instead of implementing dual writes at the application layer, you
can use your existing messaging tier to perform dual writes to Cassandra and Amazon Keyspaces.
To do this you can configure an additional consumer to your messaging platform to send writes
to both data stores. This approach provides a simple low code strategy using the messaging tier
to create two views across both databases that are eventually consistent.
Uploading historical data during an online migration
After implementing dual writes to ensure that new data is written to both data stores in real time,
the next step in the migration plan is to evaluate how much historical data you need to copy or
bulk upload from Cassandra to Amazon Keyspaces. This ensures that both, new data and historical
data are going to be available in the new Amazon Keyspaces database before you’re migrating the
application.
Depending on your data retention requirements, for example how much historical data you need
to preserve based on your organization's policies, you can consider one of the following two
options.
Bulk upload of historical data – The migration of historical data from your existing Cassandra
deployment to Amazon Keyspaces can be achieved through various techniques, for example
using AWS Glue or custom scripts to extract, transform, and load (ETL) the data. For more
information about using AWS Glue to upload historical data, see the section called “Offline
migration”.
When planning the bulk upload of historical data, you need to consider how to resolve conflicts
that can occur when new writes are trying to update the same data that is in the process of being
uploaded. The bulk upload is expected to be eventually consistent, which means the data
reaches all nodes eventually.
If an update of the same data occurs at the same time due to a new write, you want to ensure
that it's not going to be overwritten by the historical data upload. To ensure that you preserve
the latest updates to your data even during the bulk import, you must add conflict resolution
either into the bulk upload scripts or into the application logic for dual writes.
For example, you can use lightweight transactions (LWT) to implement compare-and-set
operations. To do this, you can add an additional field to your data model that represents the
time of modification or state. For more information, see the section called “Lightweight
transactions”.
Additionally, Amazon Keyspaces supports the Cassandra WRITETIME timestamp function. You
can use Amazon Keyspaces client-side timestamps to preserve source database timestamps and
implement last-writer-wins conflict resolution. For more information, see the section called
“Client-side timestamps”.
Using Time-to-Live (TTL) – For shorter data retention periods, for example 30, 60, or 90 days,
you can use TTL in Cassandra and Amazon Keyspaces during the migration to avoid uploading
unnecessary historical data to Amazon Keyspaces. TTL allows you to set a time period after
which the data is automatically removed from the database.
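As a sketch, the following statement sets a 60-day default TTL (5,184,000 seconds) on a
hypothetical table. Note that in Amazon Keyspaces, TTL must first be turned on for the table; see
the section called “Expire data with Time to Live”.

ALTER TABLE my_keyspace.my_table WITH default_time_to_live = 5184000;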
During the migration phase, instead of copying historical data to Amazon Keyspaces, you can
configure the TTL settings to let the historical data expire automatically in the old system
(Cassandra) while only applying the new writes to Amazon Keyspaces using the dual-write
method. Over time and with old data continually expiring in the Cassandra cluster and new data
written using the dual-write method, Amazon Keyspaces automatically catches up to contain the
same data as Cassandra.
This approach can significantly reduce the amount of data to be migrated, resulting in a more
efficient and streamlined migration process. You can consider this approach when dealing with
large datasets with varying data retention requirements. For more information about TTL, see
the section called “Expire data with Time to Live”.
Consider the following example of a migration from Cassandra to Amazon Keyspaces using
TTL data expiration. In this example we set TTL for both databases to 60 days and show how
the migration process progresses over a period of 90 days. Both databases receive the same
newly written data during this period using the dual-writes method. We're going to look at three
different phases of the migration, each of which is 30 days long.
How the migration process works for each phase is shown in the following images.
1. After the first 30 days, the Cassandra cluster and Amazon Keyspaces have been receiving new
writes. The Cassandra cluster also contains historical data that has not yet reached 60 days of
retention, which makes up 50% of the data in the cluster.
Data that is older than 60 days is being automatically deleted in the Cassandra cluster using
TTL. At this point Amazon Keyspaces contains 50% of the data stored in the Cassandra cluster,
which is made up of the new writes minus the historical data.
2. After 60 days, both the Cassandra cluster and Amazon Keyspaces contain the same data
written in the last 60 days.
3. Within 90 days, both Cassandra and Amazon Keyspaces contain the same data and are
expiring data at the same rate.
This example illustrates how to avoid the step of uploading historical data by using TTL with an
expiration date set to 60 days.
Validating data consistency during an online migration
The next step in the online migration process is data validation. Dual writes are adding new data to
your Amazon Keyspaces database and you have completed the migration of historical data either
using bulk upload or data expiration with TTL.
Now you can use the validation phase to confirm that both data stores do in fact contain the
same data and return the same read results. You can choose one of the following two options to
validate that both of your databases contain identical data.
Dual reads – To validate that both the source and the destination database contain the same set
of newly written and historical data, you can implement dual reads. To do so, you read from both
your primary Cassandra database and your secondary Amazon Keyspaces database, similarly to
the dual-writes method, and compare the results asynchronously.
The results from the primary database are returned to the client, and the results from the
secondary database are used to validate against the primary resultset. Differences found can be
logged or sent to a dead letter queue (DLQ) for later reconciliation.
In the following diagram, the application is performing a synchronous read from Cassandra (the
primary data store) and an asynchronous read from Amazon Keyspaces (the secondary data
store).
Sample reads – An alternative solution that doesn’t require application code changes is to
use AWS Lambda functions to periodically and randomly sample data from both the source
Cassandra cluster and the destination Amazon Keyspaces database.
These Lambda functions can be configured to run at regular intervals. The Lambda function
retrieves a random subset of data from both the source and destination systems, and then
performs a comparison of the sampled data. Any discrepancies or mismatches between the two
datasets can be recorded and sent to a dedicated dead letter queue (DLQ) for later reconciliation.
This process is illustrated in the following diagram.
Migrating the application during an online migration
In the fourth phase of an online migration, you are migrating your application and transitioning to
Amazon Keyspaces as the primary data store. This means that you switch your application to read
and write directly from and to Amazon Keyspaces. To ensure minimal disruption to your users, this
should be a well-planned and coordinated process.
Two recommended solutions for application migration are available: the blue-green cutover
strategy and the canary cutover strategy. The following sections outline these strategies in
more detail.
Blue-green strategy – Using this approach, you switch your application to treat Amazon
Keyspaces as the primary data store and Cassandra as the secondary data store in a single step.
You can do this using an AWS AppConfig feature flag to control the election of primary and
secondary data stores across the application instance. For more information about feature flags,
see Creating a feature flag configuration profile in AWS AppConfig.
After making Amazon Keyspaces the primary data store, you monitor the application's behavior
and performance, ensuring that Amazon Keyspaces meets your requirements and that the
migration is successful.
For example, if you implemented dual-reads for your application, during the application
migration phase you transition the primary reads going from Cassandra to Amazon Keyspaces
and the secondary reads from Amazon Keyspaces to Cassandra. After the transition, you continue
to monitor and compare results as described in the data validation section to ensure consistency
across both databases before decommissioning Cassandra.
If you detect any issues, you can quickly roll back to the previous state by reverting to Cassandra
as the primary data store. You only proceed to the decommissioning phase of the migration if
Amazon Keyspaces is meeting all your needs as the primary data store.
Canary strategy – In this approach, you gradually roll out the migration to a subset of your users
or traffic. Initially, a small percentage of your application's traffic, for example 5% of all traffic,
is routed to the version using Amazon Keyspaces as the primary data store, while the rest of the
traffic continues to use Cassandra as the primary data store.
This allows you to thoroughly test the migrated version with real-world traffic, monitor its
performance and stability, and investigate potential issues. If you don't detect any issues, you
can incrementally increase the percentage of traffic routed to Amazon Keyspaces until it becomes
the primary data store for all users and traffic.
This staged rollout minimizes the risk of widespread service disruptions and allows for a more
controlled migration process. If any critical issues arise during the canary deployment, you
can quickly roll back to the previous version using Cassandra as the primary data store for the
affected traffic segment. You only proceed to the decommissioning phase of the migration after
you have validated that Amazon Keyspaces processes 100% of your users and traffic as expected.
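The traffic split can live in a load balancer, a router, or the application itself. The following Python sketch shows a minimal application-level variant; the percentage value and store names are illustrative only.

import random

CANARY_PERCENT = 5  # illustrative share of traffic routed to Amazon Keyspaces

def elect_primary_data_store():
    # Increase CANARY_PERCENT incrementally toward 100 as validation succeeds.
    if random.uniform(0, 100) < CANARY_PERCENT:
        return "keyspaces"
    return "cassandra"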
The following diagram illustrates the individual steps of the canary strategy.
Decommissioning Cassandra after an online migration
After the application migration is complete, your application is fully running on Amazon
Keyspaces, and you have validated data consistency over a period of time, you can plan to
decommission your Cassandra cluster. During this phase, you can evaluate if the data remaining in
your Cassandra cluster needs to be archived or can be deleted. This depends on your organization’s
policies for data handling and retention.
By following this strategy and considering the recommended best practices described in this topic
when planning your online migration from Cassandra to Amazon Keyspaces, you can ensure a
seamless transition to Amazon Keyspaces while maintaining read-after-write consistency and
availability of your application.
Migrating from Apache Cassandra to Amazon Keyspaces can provide numerous benefits, including
reduced operational overhead, automatic scaling, improved security, and a framework that helps
you to reach your compliance goals. By planning an online migration strategy with dual writes,
historical data upload, data validation, and a gradual roll out, you can ensure a smooth transition
with minimal disruption to your application and its users.
Implementing the online migration strategy discussed in this topic allows you to validate the
migration results, identify and address any issues, and ultimately decommission your existing
Cassandra deployment in favor of the fully managed Amazon Keyspaces service.
Offline migration process: Apache Cassandra to Amazon Keyspaces
Offline migrations are suitable when you can afford downtime to perform the migration. It's
common among enterprises to have maintenance windows for patching, large releases, or
downtime for hardware upgrades or major upgrades. Offline migration can use this window to
copy data and switch over the application traffic from Apache Cassandra to Amazon Keyspaces.
Offline migration reduces modifications to the application because it doesn't require
communication with both Cassandra and Amazon Keyspaces simultaneously. Additionally, with the
data flow paused, the exact state can be copied without maintaining mutations.
In this example, we use Amazon Simple Storage Service (Amazon S3) as a staging area for data
during the offline migration to minimize downtime. You can automatically import the data you
stored in Parquet format in Amazon S3 into an Amazon Keyspaces table using the Spark Cassandra
connector and AWS Glue. The following section shows a high-level overview of the
process. You can find code examples for this process on GitHub.
The offline migration process from Apache Cassandra to Amazon Keyspaces using Amazon S3 and
AWS Glue requires the following AWS Glue jobs.
1. An ETL job that extracts and transforms CQL data and stores it in an Amazon S3 bucket.
2. A second job that imports the data from the bucket to Amazon Keyspaces.
3. A third job to import incremental data.
How to perform an offline migration to Amazon Keyspaces from Cassandra running on Amazon
EC2 in an Amazon Virtual Private Cloud
1. First, you use AWS Glue to export table data from Cassandra in Parquet format and save it to
an Amazon S3 bucket (see the code sketch that follows these steps). You need to run an AWS Glue
job using an AWS Glue connector to a VPC where the Amazon EC2 instance running Cassandra
resides. Then, using the Amazon S3 private endpoint, you can save data to the Amazon S3 bucket.
The following diagram illustrates these steps.
2. Shuffle the data in the Amazon S3 bucket to improve data randomization. Importing randomized
data allows for more evenly distributed write traffic to the target table.
This step is required when exporting data from Cassandra with large partitions (partitions
with more than 1000 rows) to avoid hot key patterns when inserting the data into Amazon
Keyspaces. Hot key issues cause WriteThrottleEvents in Amazon Keyspaces and result in
increased load time.
3. Use another AWS Glue job to import data from the Amazon S3 bucket into Amazon Keyspaces.
The shuffled data in the Amazon S3 bucket is stored in Parquet format.
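The following is a minimal PySpark sketch of steps 1 and 2, assuming that the Spark Cassandra connector is available on the AWS Glue job's classpath; the contact point and bucket name are placeholders, and the GitHub examples referenced earlier show the complete jobs.

from pyspark.sql import SparkSession
from pyspark.sql.functions import rand

# The contact point is a placeholder for the Cassandra node in your VPC.
spark = (SparkSession.builder
         .appName("CassandraToS3Export")
         .config("spark.cassandra.connection.host", "10.0.0.10")
         .getOrCreate())

df = (spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(keyspace="mykeyspace", table="mytable")
      .load())

# Shuffling while writing (step 2) avoids key-sorted Parquet files that
# would concentrate writes on a few partitions during the later import.
(df.orderBy(rand())
   .write.mode("overwrite")
   .parquet("s3://amzn-s3-demo-bucket/cassandra-export/"))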
Using a hybrid migration solution: Apache Cassandra to Amazon
Keyspaces
The following migration solution can be considered a hybrid between online and offline migration.
With this hybrid approach, data is written to the destination database in near real time without
providing read-after-write consistency. This means that newly written data won’t be immediately
available and delays are to be expected. If you need read-after-write consistency, see the section
called “Online migration”.
For a near real time migration from Apache Cassandra to Amazon Keyspaces, you can choose
between two available methods.
CQLReplicator – CQLReplicator is an open source utility available on GitHub that helps you to
migrate data from Apache Cassandra to Amazon Keyspaces in near real time.
To determine the writes and updates to propagate to the destination database, CQLReplicator
scans the Apache Cassandra token range and uses an AWS Glue job to remove duplicate events
and apply writes and updates directly to Amazon Keyspaces.
Change data capture (CDC) – Apache Cassandra offers a built-in CDC feature that allows
capturing changes by copying the commit log to a separate CDC directory.
You can then use these logs to replicate data changes to other systems such as Amazon
Keyspaces, making CDC an effective option for data migration scenarios.
If you don’t need read-after-write consistency, you can use either CQLReplicator or a CDC
pipeline to migrate data from Apache Cassandra to Amazon Keyspaces based on your preferences
and familiarity with the tools and AWS services used in each solution. Using these methods to
migrate data in near real time can be considered a hybrid approach to migration that offers an
alternative to online migration.
This strategy is considered a hybrid approach, because in addition to the options outlined in this
topic, you have to implement some steps of the online migration process, for example the historical
data copy and the application migration strategies discussed in the online migration topic.
The following sections go over the hybrid migration options in more detail.
Topics
Migrate data using CQLReplicator
Migrate data using change data capture (CDC)
Migrate data using CQLReplicator
With CQLReplicator, you can read data from Apache Cassandra in near real time by
intelligently scanning the Cassandra token ring with CQL queries. CQLReplicator doesn’t use
Cassandra CDC and instead implements a caching strategy to reduce the performance penalties of
full scans.
To reduce the number of writes to the destination, CQLReplicator automatically removes duplicate
replication events. With CQLReplicator, you can tune the replication of changes from the source
database to the destination database, allowing for a near real time migration of data from Apache
Cassandra to Amazon Keyspaces.
The following diagram shows the typical architecture of a CQLReplicator job using AWS Glue.
1. To allow access to Apache Cassandra running in a private VPC, configure an AWS Glue
connection with the connection type Network.
2. To remove duplicates and enable key caching with the CQLReplicator job, configure Amazon
Simple Storage Service (Amazon S3).
3. The CQLReplicator job streams verified source database changes directly to Amazon Keyspaces.
Migrate data using change data capture (CDC)
To successfully implement a change data capture (CDC) pipeline for migrating data from Cassandra
to Amazon Keyspaces, we recommend using the Debezium platform. Debezium is an open-source,
distributed platform for CDC, designed to monitor a database and capture row-level changes
reliably.
The Debezium connector for Apache Cassandra uploads changes to Amazon Managed Streaming
for Apache Kafka (Amazon MSK) so that they can be consumed and processed by downstream
consumers, which in turn write the data to Amazon Keyspaces.
To address any potential data consistency issues, you can implement a process with Amazon MSK
where a consumer compares the keys or partitions in Cassandra with those in Amazon Keyspaces.
To implement this solution successfully, we recommend considering the following.
How to parse the CDC commit log, for example how to remove duplicate events.
How to maintain the CDC directory, for example how to delete old logs.
How to handle partial failures in Apache Cassandra, for example if a write only succeeds in one
out of three replicas.
This pattern treats changes from Cassandra as a "hint" that a key may have changed from its
previous state. To determine if there are changes to propagate to the destination database, you
must first read from the source Cassandra cluster using a LOCAL_QUORUM operation to receive the
latest records and then write them to Amazon Keyspaces.
In the case of range deletes or range updates, you may need to perform a comparison against the
entire partition to determine which write or update events need to be written to your destination
database.
In cases where writes are not idempotent, you also need to compare your writes with what is
already in the destination database before writing to Amazon Keyspaces.
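The following Python sketch outlines this read-then-write pattern with the DataStax Python driver, reusing the book_awards table from the tutorials later in this topic. The contact points are placeholders, and the Amazon Keyspaces session additionally requires TLS on port 9142 and SigV4 or service-specific credentials in practice.

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Contact points are placeholders for the source cluster and the
# Amazon Keyspaces endpoint.
source = Cluster(["10.0.0.10"]).connect("catalog")
target = Cluster(["cassandra.us-east-1.amazonaws.com"], port=9142).connect("catalog")

def propagate(year, award):
    # The CDC event is only a hint; re-read the source at LOCAL_QUORUM
    # to get the latest committed state of the key.
    stmt = SimpleStatement(
        "SELECT * FROM book_awards WHERE year = %s AND award = %s",
        consistency_level=ConsistencyLevel.LOCAL_QUORUM,
    )
    for row in source.execute(stmt, (year, award)):
        target.execute(
            "INSERT INTO book_awards (year, award, category, rank, book_title, "
            "author, publisher) VALUES (%s, %s, %s, %s, %s, %s, %s)",
            (row.year, row.award, row.category, row.rank,
             row.book_title, row.author, row.publisher),
        )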
The following diagram shows the typical architecture of a CDC pipeline using Debezium and
Amazon MSK.
How to select the right tool for bulk uploading or migrating
data to Amazon Keyspaces
In this section you can review the different tools that you can use to bulk upload or migrate data to
Amazon Keyspaces, and learn how to select the correct tool based on your needs. In addition, this
section provides an overview and use cases of the available step-by-step tutorials that demonstrate
how to import data into Amazon Keyspaces.
To review the available strategies to migrate workloads from Apache Cassandra to Amazon
Keyspaces, see the section called “Migrating from Cassandra”.
Migration tools
For large migrations, consider using an extract, transform, and load (ETL) tool. You can use
AWS Glue to quickly and effectively perform data transformation migrations. For more
information, see the section called “Offline migration”.
CQLReplicator – CQLReplicator is an open source utility available on GitHub that helps you to
migrate data from Apache Cassandra to Amazon Keyspaces in near real time.
For more information, see the section called “CQLReplicator”.
To learn how to use the Apache Cassandra Spark connector to write data to Amazon
Keyspaces, see the section called “Connecting with Apache Spark”.
Get started quickly with loading data into Amazon Keyspaces by using the cqlsh COPY FROM
command. cqlsh is included with Apache Cassandra and is best suited for loading small
datasets or test data. For step-by-step instructions, see the section called “Loading data using
cqlsh”.
You can also use the DataStax Bulk Loader for Apache Cassandra to load data into Amazon
Keyspaces using the dsbulk command. DSBulk provides more robust import capabilities than
cqlsh and is available from the GitHub repository. For step-by-step instructions, see the section
called “Loading data using DSBulk”.
General considerations for data uploads to Amazon Keyspaces
Break the data upload down into smaller components.
Consider the following units of migration and their potential footprint in terms of raw data size.
Uploading smaller amounts of data in one or more phases may help simplify your migration.
By cluster – Migrate all of your Cassandra data at once. This approach may be fine for smaller
clusters.
By keyspace or table – Break up your migration into groups of keyspaces or tables. This
approach can help you migrate data in phases based on your requirements for each workload.
By data – Consider migrating data for a specific group of users or products, to bring the size of
data down even more.
Prioritize what data to upload first based on simplicity.
Consider if you have data that could be migrated first more easily—for example, data that does
not change during specific times, data from nightly batch jobs, data not used during offline
hours, or data from internal apps.
Topics
Tutorial: Loading data into Amazon Keyspaces using cqlsh
Tutorial: Loading data into Amazon Keyspaces using DSBulk
Tutorial: Loading data into Amazon Keyspaces using cqlsh
This tutorial guides you through the process of migrating data from Apache Cassandra to Amazon
Keyspaces using the cqlsh COPY FROM command. The cqlsh COPY FROM command is useful
to quickly and easily upload small datasets to Amazon Keyspaces for academic or test purposes.
For more information about how to migrate production workloads, see the section called “Offline
migration”. In this tutorial, you'll complete the following steps:
Prerequisites – Set up an AWS account with credentials, create a JKS trust store file for the
certificate, and configure cqlsh to connect to Amazon Keyspaces.
1. Create source CSV and target table – Prepare a CSV file as the source data and create the target
keyspace and table in Amazon Keyspaces.
2. Prepare the data – Randomize the data in the CSV file and analyze it to determine the average
and maximum row sizes.
3. Set throughput capacity – Calculate the required write capacity units (WCUs) based on the data
size and desired load time, and configure the table's provisioned capacity.
4. Configure cqlsh parameters – Determine optimal values for cqlsh COPY FROM parameters
like INGESTRATE, NUMPROCESSES, MAXBATCHSIZE, and CHUNKSIZE to distribute the workload
evenly.
5. Run the cqlsh COPY FROM command – Run the cqlsh COPY FROM command to upload the
data from the CSV file to the Amazon Keyspaces table, and monitor the progress.
Troubleshooting – Resolve common issues like invalid requests, parser errors, capacity errors, and
cqlsh errors during the data upload process.
Topics
Prerequisites: Steps to complete before you can upload data using cqlsh COPY FROM
Step 1: Create the source CSV file and a target table for the data upload
Step 2: Prepare the source data for a successful data upload
Step 3: Set throughput capacity for the table
Step 4: Configure cqlsh COPY FROM settings
Step 5: Run the cqlsh COPY FROM command to upload data from the CSV file to the target table
Troubleshooting
Prerequisites: Steps to complete before you can upload data using cqlsh COPY
FROM
You must complete the following tasks before you can start this tutorial.
1. If you have not already done so, sign up for an AWS account by following the steps at the
section called “Setting up AWS Identity and Access Management”.
2. Create service-specific credentials by following the steps at the section called “Create service-
specific credentials”.
3. Set up the Cassandra Query Language shell (cqlsh) connection and confirm that you can
connect to Amazon Keyspaces by following the steps at the section called “Using cqlsh”.
Step 1: Create the source CSV file and a target table for the data upload
For this tutorial, we use a comma-separated values (CSV) file with the name
keyspaces_sample_table.csv as the source file for the data migration. The provided sample
file contains a few rows of data for a table with the name book_awards.
1. Create the source file. You can choose one of the following options:
Download the sample CSV file (keyspaces_sample_table.csv) contained in the
following archive file samplemigration.zip. Unzip the archive and take note of the path to
keyspaces_sample_table.csv.
To populate a CSV file with your own data stored in an Apache Cassandra database, you
can use the cqlsh COPY TO statement as shown in the following example.
cqlsh localhost 9042 -u "username" -p "password" --execute
"COPY mykeyspace.mytable TO 'keyspaces_sample_table.csv' WITH HEADER=true"
Make sure the CSV file you create meets the following requirements:
The first row contains the column names.
The column names in the source CSV file match the column names in the target table.
The data is delimited with a comma.
All data values are valid Amazon Keyspaces data types. See the section called “Data
types”.
2. Create the target keyspace and table in Amazon Keyspaces.
a. Connect to Amazon Keyspaces using cqlsh, replacing the service endpoint, user name,
and password in the following example with your own values.
cqlsh cassandra.us-east-2.amazonaws.com 9142 -u "111122223333" -
p "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" --ssl
b. Create a new keyspace with the name catalog as shown in the following example.
CREATE KEYSPACE catalog WITH REPLICATION = {'class': 'SingleRegionStrategy'};
c. When the new keyspace is available, use the following code to create the target table
book_awards.
CREATE TABLE "catalog.book_awards" (
year int,
award text,
rank int,
category text,
book_title text,
author text,
publisher text,
PRIMARY KEY ((year, award), category, rank)
);
If Apache Cassandra is your original data source, a simple way to create the Amazon Keyspaces
target table with matching headers is to generate the CREATE TABLE statement from the
source table, as shown in the following statement.
cqlsh localhost 9042 -u "username" -p "password" --execute "DESCRIBE
TABLE mykeyspace.mytable;"
Then create the target table in Amazon Keyspaces with the column names and data types
matching the description from the Cassandra source table.
Step 2: Prepare the source data for a successful data upload
Preparing the source data for an efficient transfer is a two-step process. First, you randomize the
data. In the second step, you analyze the data to determine the appropriate cqlsh parameter
values and required table settings to ensure that the data upload is successful.
Randomize the data
The cqlsh COPY FROM command reads and writes data in the same order that it appears in the
CSV file. If you use the cqlsh COPY TO command to create the source file, the data is written
in key-sorted order in the CSV. Internally, Amazon Keyspaces partitions data using partition keys.
Although Amazon Keyspaces has built-in logic to help load balance requests for the same partition
key, loading the data is faster and more efficient if you randomize the order. This is because you
can take advantage of the built-in load balancing that occurs when Amazon Keyspaces is writing to
different partitions.
To spread the writes across the partitions evenly, you must randomize the data in the source file.
You can write an application to do this or use an open-source tool, such as Shuf. Shuf is freely
available on Linux distributions, on macOS (by installing coreutils in homebrew), and on Windows
(by using Windows Subsystem for Linux (WSL)). One extra step is required to prevent the header
row with the column names from getting shuffled in this step.
To randomize the source file while preserving the header, enter the following code.
tail -n +2 keyspaces_sample_table.csv | shuf -o keyspace.table.csv && (head
-1 keyspaces_sample_table.csv && cat keyspace.table.csv ) > keyspace.table.csv1 &&
mv keyspace.table.csv1 keyspace.table.csv
Shuf rewrites the data to a new CSV file called keyspace.table.csv. You can now delete the
keyspaces_sample_table.csv file—you no longer need it.
Analyze the data
Determine the average and maximum row size by analyzing the data.
You do this for the following reasons:
The average row size helps to estimate the total amount of data to be transferred.
You need the average row size to provision the write capacity needed for the data upload.
You can make sure that each row is less than 1 MB in size, which is the maximum row size in
Amazon Keyspaces.
Note
This quota refers to row size, not partition size. Unlike Apache Cassandra partitions,
Amazon Keyspaces partitions can be virtually unbounded in size. Partition keys and clustering
columns require additional storage for metadata, which you must add to the raw size of
rows. For more information, see the section called “Estimate row size”.
The following code uses AWK to analyze a CSV file and print the average and maximum row size.
awk -F, 'BEGIN {samp=10000;max=-1;}{if(NR>1){len=length($0);t+=len;avg=t/
NR;max=(len>max ? len : max)}}NR==samp{exit}END{printf("{lines: %d, average: %d bytes,
max: %d bytes}\n",NR,avg,max);}' keyspace.table.csv
Running this code on the first 10,000 sampled rows results in the following output.
{lines: 10000, average: 123 bytes, max: 225 bytes}
You use the average row size in the next step of this tutorial to provision the write capacity for the
table.
Step 3: Set throughput capacity for the table
This tutorial shows you how to tune cqlsh to load data within a set time range. Because you know
how many reads and writes you perform in advance, use provisioned capacity mode. After you
finish the data transfer, you should set the capacity mode of the table to match your application’s
traffic patterns. To learn more about capacity management, see Managing serverless resources.
With provisioned capacity mode, you specify how much read and write capacity you want
to provision to your table in advance. Write capacity is billed hourly and metered in write
capacity units (WCUs). Each WCU is enough write capacity to support writing 1 KB of data
per second. When you load the data, the write rate must be under the max WCUs (parameter:
write_capacity_units) that are set on the target table.
By default, you can provision up to 40,000 WCUs to a table and 80,000 WCUs across all the tables
in your account. If you need additional capacity, you can request a quota increase in the Service
Quotas console. For more information about quotas, see Quotas.
Calculate the average number of WCUs required for an insert
Inserting 1 KB of data per second requires 1 WCU. If your CSV file has 360,000 rows and you want
to load all the data in 1 hour, you must write 100 rows per second (360,000 rows / 60 minutes / 60
seconds = 100 rows per second). If each row has up to 1 KB of data, to insert 100 rows per second,
you must provision 100 WCUs to your table. If each row has 1.5 KB of data, you need two WCUs to
insert one row per second. Therefore, to insert 100 rows per second, you must provision 200 WCUs.
To determine how many WCUs you need to insert one row per second, divide the average row size
in bytes by 1024 and round up to the nearest whole number.
For example, if the average row size is 3000 bytes, you need three WCUs to insert one row per
second.
ROUNDUP(3000 / 1024) = ROUNDUP(2.93) = 3 WCUs
Calculate data load time and capacity
Now that you know the average size and number of rows in your CSV file, you can calculate how
many WCUs you need to load the data in a given amount of time, and the approximate time it
takes to load all the data in your CSV file using different WCU settings.
For example, if each row in your file is 1 KB and you have 1,000,000 rows in your CSV file, to load
the data in 1 hour, you need to provision at least 278 WCUs to your table for that hour.
1,000,000 rows * 1 KB = 1,000,000 KB
1,000,000 KB / 3600 seconds = 277.8 KB / second = 278 WCUs
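As a quick check of this arithmetic, the following Python sketch computes the required WCUs from the example values in this section.

import math

rows = 1_000_000        # rows in the CSV file
avg_row_bytes = 1024    # average row size from the analysis step
load_seconds = 3600     # target load window of 1 hour

wcus_per_row = math.ceil(avg_row_bytes / 1024)                 # 1 WCU per 1 KB row
required_wcus = math.ceil(rows * wcus_per_row / load_seconds)  # 278
print(required_wcus)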
Configure provisioned capacity settings
You can set a table’s write capacity settings when you create the table or by using the ALTER
TABLE CQL command. The following is the syntax for altering a table’s provisioned capacity
settings with the ALTER TABLE CQL statement.
ALTER TABLE mykeyspace.mytable WITH custom_properties={'capacity_mode':
{'throughput_mode': 'PROVISIONED', 'read_capacity_units': 100,
'write_capacity_units': 278}} ;
For the complete language reference, see the section called “ALTER TABLE”.
Step 4: Configure cqlsh COPY FROM settings
This section outlines how to determine the parameter values for cqlsh COPY FROM. The cqlsh
COPY FROM command reads the CSV file that you prepared earlier and inserts the data into
Amazon Keyspaces using CQL. The command divides up the rows and distributes the INSERT
operations among a set of workers. Each worker establishes a connection with Amazon Keyspaces
and sends INSERT requests along this channel.
The cqlsh COPY command doesn’t have internal logic to distribute work evenly among its
workers. However, you can configure it manually to make sure that the work is distributed evenly.
Start by reviewing these key cqlsh parameters:
DELIMITER – If you used a delimiter other than a comma, you can set this parameter, which
defaults to comma.
INGESTRATE – The target number of rows that cqlsh COPY FROM attempts to process per
second. If unset, it defaults to 100,000.
NUMPROCESSES – The number of child worker processes that cqlsh creates for COPY FROM
tasks. The maximum for this setting is 16, and the default is num_cores - 1, where num_cores is
the number of processing cores on the host running cqlsh.
MAXBATCHSIZE – The batch size determines the maximum number of rows inserted into the
destination table in a single batch. If unset, cqlsh uses batches of 20 inserted rows.
CHUNKSIZE – The size of the work unit that passes to the child worker. By default, it is set to
5,000.
MAXATTEMPTS – The maximum number of times to retry a failed worker chunk. After the
maximum attempt is reached, the failed records are written to a new CSV file that you can run
again later after investigating the failure.
Set INGESTRATE based on the number of WCUs that you provisioned to the target destination
table. The INGESTRATE of the cqlsh COPY FROM command isn’t a limit—it’s a target average.
This means it can (and often does) burst above the number you set. To allow for bursts and make
sure that enough capacity is in place to handle the data load requests, set INGESTRATE to 90% of
the table’s write capacity.
INGESTRATE = WCUs * .90
Next, set the NUMPROCESSES parameter equal to one less than the number of cores on your
system. To find out the number of cores on your system, you can run the following code.
python -c "import multiprocessing; print(multiprocessing.cpu_count())"
For this tutorial, we use the following value.
NUMPROCESSES = 4
Each process creates a worker, and each worker establishes a connection to Amazon Keyspaces.
Amazon Keyspaces can support up to 3,000 CQL requests per second on every connection. This
means that you have to make sure that each worker is processing fewer than 3,000 requests per
second.
As with INGESTRATE, the workers often burst above the number you set and aren’t limited by
clock seconds. Therefore, to account for bursts, set your cqlsh parameters to target each worker to
process 2,500 requests per second. To calculate the amount of work distributed to a worker, use
the following guideline.
Divide INGESTRATE by NUMPROCESSES.
If INGESTRATE / NUMPROCESSES > 2,500, lower the INGESTRATE to make this formula true.
INGESTRATE / NUMPROCESSES <= 2,500
Before you configure the settings to optimize the upload of our sample data, let's review the
cqlsh default settings and see how using them impacts the data upload process. Because cqlsh
COPY FROM uses the CHUNKSIZE to create chunks of work (INSERT statements) to distribute to
workers, the work is not automatically distributed evenly. Some workers might sit idle, depending
on the INGESTRATE setting.
To distribute work evenly among the workers and keep each worker at the optimal 2,500 requests
per second rate, you must set CHUNKSIZE, MAXBATCHSIZE, and INGESTRATE by changing the
input parameters. To optimize network traffic utilization during the data load, choose a value for
MAXBATCHSIZE that is close to the maximum value of 30. By changing CHUNKSIZE to 100 and
MAXBATCHSIZE to 25, the 10,000 rows are spread evenly among the four workers (10,000 / 2500 =
4).
The following code example illustrates this.
INGESTRATE = 10,000
NUMPROCESSES = 4
CHUNKSIZE = 100
MAXBATCHSIZE = 25
Work Distribution:
Connection 1 / Worker 1 : 2,500 Requests per second
Connection 2 / Worker 2 : 2,500 Requests per second
Connection 3 / Worker 3 : 2,500 Requests per second
Connection 4 / Worker 4 : 2,500 Requests per second
To summarize, use the following formulas when setting cqlsh COPY FROM parameters:
INGESTRATE = write_capacity_units * .90
NUMPROCESSES = num_cores - 1 (default)
INGESTRATE / NUMPROCESSES <= 2,500 (This must be a true statement.)
MAXBATCHSIZE = 30 (Defaults to 20. Amazon Keyspaces accepts batches up to 30.)
CHUNKSIZE = (INGESTRATE / NUMPROCESSES) / MAXBATCHSIZE
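The following Python sketch applies these formulas end to end; the provisioned write capacity is a hypothetical example value chosen so that the result matches the worked example above.

import math
import multiprocessing

write_capacity_units = 11112   # hypothetical WCUs provisioned on the target table
max_batch_size = 25            # Amazon Keyspaces accepts batches of up to 30 rows

ingest_rate = int(write_capacity_units * 0.90)    # 10,000 in this example
num_processes = multiprocessing.cpu_count() - 1   # cqlsh default
# Keep each worker at or below 2,500 requests per second.
ingest_rate = min(ingest_rate, num_processes * 2500)
chunk_size = math.floor((ingest_rate / num_processes) / max_batch_size)

print(f"INGESTRATE={ingest_rate} NUMPROCESSES={num_processes} CHUNKSIZE={chunk_size}")

On a host with five cores, this prints the worked example's values: INGESTRATE=10000, NUMPROCESSES=4, and CHUNKSIZE=100.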
Now that you have calculated NUMPROCESSES, INGESTRATE, and CHUNKSIZE, you’re ready to load
your data.
Step 5: Run the cqlsh COPY FROM command to upload data from the CSV file to
the target table
To run the cqlsh COPY FROM command, complete the following steps.
1. Connect to Amazon Keyspaces using cqlsh.
2. Choose your keyspace with the following code.
USE catalog;
3. Set write consistency to LOCAL_QUORUM. To ensure data durability, Amazon Keyspaces doesn’t
allow other write consistency settings. See the following code.
CONSISTENCY LOCAL_QUORUM;
4. Prepare your cqlsh COPY FROM syntax using the following code example.
COPY book_awards FROM './keyspace.table.csv' WITH HEADER=true
AND INGESTRATE=calculated ingestrate
AND NUMPROCESSES=calculated numprocess
AND MAXBATCHSIZE=20
AND CHUNKSIZE=calculated chunksize;
5. Run the statement prepared in the previous step. cqlsh echoes back all the settings that you've
configured.
a. Make sure that the settings match your input. See the following example.
Reading options from the command line: {'chunksize': '120', 'header': 'true',
'ingestrate': '36000', 'numprocesses': '15', 'maxbatchsize': '20'}
Using 15 child processes
b. Review the number of rows transferred and the current average rate, as shown in the
following example.
Processed: 57834 rows; Rate: 6561 rows/s; Avg. rate: 31751 rows/s
c. When cqlsh has finished uploading the data, review the summary of the data load
statistics (the number of files read, runtime, and skipped rows) as shown in the following
example.
15556824 rows imported from 1 files in 8 minutes and 8.321 seconds (0 skipped).
In this final step of the tutorial, you have uploaded the data to Amazon Keyspaces.
Important
Now that you have transferred your data, adjust the capacity mode settings of your target
table to match your application’s regular traffic patterns. You incur charges at the hourly
rate for your provisioned capacity until you change it.
Troubleshooting
After the data upload has completed, check to see if rows were skipped. To do so, navigate to the
source directory of the source CSV file and search for a file with the following name.
import_yourcsvfilename.err.timestamp.csv
cqlsh writes any skipped rows of data into a file with that name. If the file exists in your source
directory and has data in it, these rows didn't upload to Amazon Keyspaces. To retry these rows,
first check for any errors that were encountered during the upload, adjust the data accordingly,
and then rerun the process.
Common errors
The most common reasons why rows aren’t loaded are capacity errors and parsing errors.
Invalid request errors when uploading data to Amazon Keyspaces
In the following example, the source table contains a counter column, which results in logged
batch calls from the cqlsh COPY command. Logged batch calls are not supported by Amazon
Keyspaces.
Failed to import 10 rows: InvalidRequest - Error from server: code=2200 [Invalid query]
message="Only UNLOGGED Batches are supported at this time.", will retry later,
attempt 22 of 25
To resolve this error, use DSBulk to migrate the data. For more information, see the section called
“Loading data using DSBulk”.
Parser errors when uploading data to Amazon Keyspaces
The following example shows a skipped row due to a ParseError.
Failed to import 1 rows: ParseError - Invalid ...
To resolve this error, you need to make sure that the data to be imported matches the table
schema in Amazon Keyspaces. Review the import file for parsing errors. You can try using a single
row of data using an INSERT statement to isolate the error.
Capacity errors when uploading data to Amazon Keyspaces
Failed to import 1 rows: WriteTimeout - Error from server: code=1100 [Coordinator node
timed out waiting for replica nodes' responses]
message="Operation timed out - received only 0 responses." info={'received_responses':
0, 'required_responses': 2, 'write_type': 'SIMPLE', 'consistency':
'LOCAL_QUORUM'}, will retry later, attempt 1 of 100
Amazon Keyspaces uses the ReadTimeout and WriteTimeout exceptions to indicate when a
write request fails due to insufficient throughput capacity. To help diagnose insufficient capacity
exceptions, Amazon Keyspaces publishes WriteThrottleEvents and ReadThrottleEvents
metrics in Amazon CloudWatch. For more information, see the section called “Monitoring with
CloudWatch”.
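For example, the following Python sketch sums the write throttle events for the tutorial's table over the last hour; the AWS/Cassandra namespace and the Keyspace and TableName dimensions are assumptions based on the Amazon Keyspaces CloudWatch integration, so verify them for your account.

from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client("cloudwatch")

stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/Cassandra",
    MetricName="WriteThrottleEvents",
    Dimensions=[
        {"Name": "Keyspace", "Value": "catalog"},
        {"Name": "TableName", "Value": "book_awards"},
    ],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,
    Statistics=["Sum"],
)
for point in stats["Datapoints"]:
    # A nonzero sum indicates that the upload exceeded provisioned capacity.
    print(point["Timestamp"], point["Sum"])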
cqlsh errors when uploading data to Amazon Keyspaces
To help troubleshoot cqlsh errors, rerun the failing command with the --debug flag.
When using an incompatible version of cqlsh, you see the following error.
AttributeError: 'NoneType' object has no attribute 'is_up'
Failed to import 3 rows: AttributeError - 'NoneType' object has no attribute 'is_up',
given up after 1 attempts
Confirm that the correct version of cqlsh is installed by running the following command.
cqlsh --version
You should see output similar to the following.
cqlsh 5.0.1
If you're using Windows, replace all instances of cqlsh with cqlsh.bat. For example, to check the
version of cqlsh in Windows, run the following command.
cqlsh.bat --version
The connection to Amazon Keyspaces fails after the cqlsh client receives three consecutive errors of
any type from the server. The cqlsh client fails with the following message.
Failed to import 1 rows: NoHostAvailable - , will retry later, attempt 3 of 100
To resolve this error, you need to make sure that the data to be imported matches the table
schema in Amazon Keyspaces. Review the import file for parsing errors. You can try using a single
row of data by using an INSERT statement to isolate the error.
The client automatically attempts to reestablish the connection.
Tutorial: Loading data into Amazon Keyspaces using DSBulk
This step-by-step tutorial guides you through migrating data from Apache Cassandra to Amazon
Keyspaces using the DataStax Bulk Loader (DSBulk) available on GitHub. Using DSBulk is useful
for uploading datasets to Amazon Keyspaces for academic or test purposes. For more information about
how to migrate production workloads, see the section called “Offline migration”. In this tutorial,
you complete the following steps.
Prerequisites – Set up an AWS account with credentials, create a JKS trust store file for the
certificate, configure cqlsh, download and install DSBulk, and configure an application.conf
file.
1. Create source CSV and target table – Prepare a CSV file as the source data and create the target
keyspace and table in Amazon Keyspaces.
2. Prepare the data – Randomize the data in the CSV file and analyze it to determine the average
and maximum row sizes.
3. Set throughput capacity – Calculate the required write capacity units (WCUs) based on the data
size and desired load time, and configure the table's provisioned capacity.
4. Configure DSBulk settings – Create a DSBulk configuration file with settings like authentication,
SSL/TLS, consistency level, and connection pool size.
5. Run the DSBulk load command – Run the DSBulk load command to upload the data from the
CSV file to the Amazon Keyspaces table, and monitor the progress.
Topics
Prerequisites: Steps you have to complete before you can upload data with DSBulk
Step 1: Create the source CSV file and a target table for the data upload using DSBulk
Step 2: Prepare the data to upload using DSBulk
Step 3: Set the throughput capacity for the target table
Step 4: Configure DSBulk settings to upload data from the CSV file to the target table
Step 5: Run the DSBulk load command to upload data from the CSV file to the target table
Prerequisites: Steps you have to complete before you can upload data with
DSBulk
You must complete the following tasks before you can start this tutorial.
1. If you have not already done so, sign up for an AWS account by following the steps at the
section called “Setting up AWS Identity and Access Management”.
2. Create credentials by following the steps at the section called “Create IAM credentials for AWS
authentication”.
3. Create a JKS trust store file.
a. Download the Starfield digital certificate using the following command and save
sf-class2-root.crt locally or in your home directory.
curl https://certs.secureserver.net/repository/sf-class2-root.crt -O
Note
You can also use the Amazon digital certificate to connect to Amazon Keyspaces
and can continue to do so if your client is connecting to Amazon Keyspaces
successfully. The Starfield certificate provides additional backwards compatibility
for clients using older certificate authorities.
b. Convert the Starfield digital certificate into a trustStore file.
openssl x509 -outform der -in sf-class2-root.crt -out temp_file.der
keytool -import -alias cassandra -keystore cassandra_truststore.jks -file
temp_file.der
In this step, you need to create a password for the keystore and trust this certificate. The
interactive command looks like this.
Enter keystore password:
Re-enter new password:
Owner: OU=Starfield Class 2 Certification Authority, O="Starfield Technologies,
Inc.", C=US
Issuer: OU=Starfield Class 2 Certification Authority, O="Starfield
Technologies, Inc.", C=US
Serial number: 0
Valid from: Tue Jun 29 17:39:16 UTC 2004 until: Thu Jun 29 17:39:16 UTC 2034
Certificate fingerprints:
MD5: 32:4A:4B:BB:C8:63:69:9B:BE:74:9A:C6:DD:1D:46:24
SHA1: AD:7E:1C:28:B0:64:EF:8F:60:03:40:20:14:C3:D0:E3:37:0E:B5:8A
SHA256:
14:65:FA:20:53:97:B8:76:FA:A6:F0:A9:95:8E:55:90:E4:0F:CC:7F:AA:4F:B7:C2:C8:67:75:21:FB:5F:B6:58
Signature algorithm name: SHA1withRSA
Subject Public Key Algorithm: 2048-bit RSA key
Version: 3
Extensions:
#1: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
0000: BF 5F B7 D1 CE DD 1F 86 F4 5B 55 AC DC D7 10 C2 ._.......[U.....
0010: 0E A9 88 E7 ....
]
[OU=Starfield Class 2 Certification Authority, O="Starfield Technologies,
Inc.", C=US]
SerialNumber: [ 00]
]
#2: ObjectId: 2.5.29.19 Criticality=false
BasicConstraints:[
CA:true
PathLen:2147483647
]
#3: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: BF 5F B7 D1 CE DD 1F 86 F4 5B 55 AC DC D7 10 C2 ._.......[U.....
0010: 0E A9 88 E7 ....
]
]
Trust this certificate? [no]: y
4. Set up the Cassandra Query Language shell (cqlsh) connection and confirm that you can
connect to Amazon Keyspaces by following the steps at the section called “Using cqlsh”.
5. Download and install DSBulk.
a. To download DSBulk, you can use the following code.
curl -OL https://downloads.datastax.com/dsbulk/dsbulk-1.8.0.tar.gz
b. Then unpack the tar file and add DSBulk to your PATH as shown in the following example.
tar -zxvf dsbulk-1.8.0.tar.gz
# add the DSBulk directory to the path
export PATH=$PATH:./dsbulk-1.8.0/bin
c. Create an application.conf file to store settings to be used by DSBulk. You can save
the following example as ./dsbulk_keyspaces.conf. Replace localhost with the
contact point of your local Cassandra cluster if you are not on the local node, for example
the DNS name or IP address. Take note of the file name and path, as you're going to need
to specify this later in the dsbulk load command.
datastax-java-driver {
basic.contact-points = [ "localhost"]
advanced.auth-provider {
class = software.aws.mcs.auth.SigV4AuthProvider
aws-region = us-east-1
}
}
d. To enable SigV4 support, download the shaded jar file from GitHub and place it in the
DSBulk lib folder as shown in the following example.
curl -O -L https://github.com/aws/aws-sigv4-auth-cassandra-java-driver-plugin/
releases/download/4.0.6-shaded-v2/aws-sigv4-auth-cassandra-java-driver-
plugin-4.0.6-shaded.jar
Step 1: Create the source CSV file and a target table for the data upload using
DSBulk
For this tutorial, we use a comma-separated values (CSV) file with the name
keyspaces_sample_table.csv as the source file for the data migration. The provided sample
file contains a few rows of data for a table with the name book_awards.
1. Create the source file. You can choose one of the following options:
Download the sample CSV file (keyspaces_sample_table.csv) contained in the
following archive file samplemigration.zip. Unzip the archive and take note of the path to
keyspaces_sample_table.csv.
To populate a CSV file with your own data stored in an Apache Cassandra database, you
can use dsbulk unload as shown in the following example.
dsbulk unload -k mykeyspace -t mytable -f ./my_application.conf
> keyspaces_sample_table.csv
Make sure the CSV file you create meets the following requirements:
The first row contains the column names.
The column names in the source CSV file match the column names in the target table.
The data is delimited with a comma.
All data values are valid Amazon Keyspaces data types. See the section called “Data
types”.
2. Create the target keyspace and table in Amazon Keyspaces.
a. Connect to Amazon Keyspaces using cqlsh, replacing the service endpoint, user name,
and password in the following example with your own values.
cqlsh cassandra.us-east-2.amazonaws.com 9142 -u "111122223333" -
p "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" --ssl
b. Create a new keyspace with the name catalog as shown in the following example.
CREATE KEYSPACE catalog WITH REPLICATION = {'class': 'SingleRegionStrategy'};
c. After the new keyspace has a status of available, use the following code to create the
target table book_awards. To learn more about asynchronous resource creation and how
to check if a resource is available, see the section called “Check keyspace creation status”.
CREATE TABLE catalog.book_awards (
year int,
award text,
rank int,
category text,
book_title text,
author text,
publisher text,
PRIMARY KEY ((year, award), category, rank)
);
If Apache Cassandra is your original data source, a simple way to create the Amazon Keyspaces
target table with matching headers is to generate the CREATE TABLE statement from the
source table as shown in the following statement.
cqlsh localhost 9042 -u "username" -p "password" --execute "DESCRIBE
TABLE mykeyspace.mytable;"
Then create the target table in Amazon Keyspaces with the column names and data types
matching the description from the Cassandra source table.
Step 2: Prepare the data to upload using DSBulk
Preparing the source data for an efficient transfer is a two-step process. First, you randomize the
data. In the second step, you analyze the data to determine the appropriate dsbulk parameter
values and required table settings.
Randomize the data
The dsbulk command reads and writes data in the same order that it appears in the CSV file. If
you use the dsbulk unload command to create the source file, the data is written in key-sorted order
in the CSV. Internally, Amazon Keyspaces partitions data using partition keys. Although Amazon
Keyspaces has built-in logic to help load balance requests for the same partition key, loading the
data is faster and more efficient if you randomize the order. This is because you can take advantage
of the built-in load balancing that occurs when Amazon Keyspaces is writing to different partitions.
To spread the writes across the partitions evenly, you must randomize the data in the source file.
You can write an application to do this or use an open-source tool, such as Shuf. Shuf is freely
available on Linux distributions, on macOS (by installing coreutils in homebrew), and on Windows
(by using Windows Subsystem for Linux (WSL)). One extra step is required to prevent the header
row with the column names from getting shuffled in this step.
To randomize the source file while preserving the header, enter the following code.
tail -n +2 keyspaces_sample_table.csv | shuf -o keyspace.table.csv && (head
-1 keyspaces_sample_table.csv && cat keyspace.table.csv ) > keyspace.table.csv1 &&
mv keyspace.table.csv1 keyspace.table.csv
Shuf rewrites the data to a new CSV file called keyspace.table.csv. You can now delete the
keyspaces_sample_table.csv file—you no longer need it.
Analyze the data
Determine the average and maximum row size by analyzing the data.
You do this for the following reasons:
The average row size helps to estimate the total amount of data to be transferred.
You need the average row size to provision the write capacity needed for the data upload.
You can make sure that each row is less than 1 MB in size, which is the maximum row size in
Amazon Keyspaces.
Note
This quota refers to row size, not partition size. Unlike Apache Cassandra partitions,
Amazon Keyspaces partitions can be virtually unbounded in size. Partition keys and clustering
columns require additional storage for metadata, which you must add to the raw size of
rows. For more information, see the section called “Estimate row size”.
The following code uses AWK to analyze a CSV file and print the average and maximum row size.
awk -F, 'BEGIN {samp=10000;max=-1;}{if(NR>1){len=length($0);t+=len;avg=t/
NR;max=(len>max ? len : max)}}NR==samp{exit}END{printf("{lines: %d, average: %d bytes,
max: %d bytes}\n",NR,avg,max);}' keyspace.table.csv
Running this code on the first 10,000 sampled rows results in the following output.
{lines: 10000, average: 123 bytes, max: 225 bytes}
Make sure that your maximum row size doesn't exceed 1 MB. If it does, you have to break up the
row or compress the data to bring the row size below 1 MB. In the next step of this tutorial, you use
the average row size to provision the write capacity for the table.
Step 3: Set the throughput capacity for the target table
This tutorial shows you how to tune DSBulk to load data within a set time range. Because you know
how many reads and writes you perform in advance, use provisioned capacity mode. After you
finish the data transfer, you should set the capacity mode of the table to match your application’s
traffic patterns. To learn more about capacity management, see Managing serverless resources.
With provisioned capacity mode, you specify how much read and write capacity you want
to provision to your table in advance. Write capacity is billed hourly and metered in write
capacity units (WCUs). Each WCU is enough write capacity to support writing 1 KB of data
per second. When you load the data, the write rate must be under the max WCUs (parameter:
write_capacity_units) that are set on the target table.
By default, you can provision up to 40,000 WCUs to a table and 80,000 WCUs across all the tables
in your account. If you need additional capacity, you can request a quota increase in the Service
Quotas console. For more information about quotas, see Quotas.
Calculate the average number of WCUs required for an insert
Inserting 1 KB of data per second requires 1 WCU. If your CSV file has 360,000 rows and you want
to load all the data in 1 hour, you must write 100 rows per second (360,000 rows / 60 minutes / 60
seconds = 100 rows per second). If each row has up to 1 KB of data, to insert 100 rows per second,
you must provision 100 WCUs to your table. If each row has 1.5 KB of data, you need two WCUs to
insert one row per second. Therefore, to insert 100 rows per second, you must provision 200 WCUs.
To determine how many WCUs you need to insert one row per second, divide the average row size
in bytes by 1024 and round up to the nearest whole number.
For example, if the average row size is 3000 bytes, you need three WCUs to insert one row per
second.
ROUNDUP(3000 / 1024) = ROUNDUP(2.93) = 3 WCUs
Calculate data load time and capacity
Now that you know the average size and number of rows in your CSV file, you can calculate how
many WCUs you need to load the data in a given amount of time, and the approximate time it
takes to load all the data in your CSV file using different WCU settings.
For example, if each row in your file is 1 KB and you have 1,000,000 rows in your CSV file, to load
the data in 1 hour, you need to provision at least 278 WCUs to your table for that hour.
1,000,000 rows * 1 KB = 1,000,000 KB
1,000,000 KB / 3600 seconds = 277.8 KB / second = 278 WCUs
Configure provisioned capacity settings
You can set a table’s write capacity settings when you create the table or by using the ALTER
TABLE command. The following is the syntax for altering a table’s provisioned capacity settings
with the ALTER TABLE command.
ALTER TABLE catalog.book_awards WITH custom_properties={'capacity_mode':
{'throughput_mode': 'PROVISIONED', 'read_capacity_units': 100, 'write_capacity_units':
278}} ;
For the complete language reference, see the section called “CREATE TABLE” and the section called
“ALTER TABLE”.
Step 4: Configure DSBulk settings to upload data from the CSV file to the target
table
This section outlines the steps required to configure DSBulk for data upload to Amazon Keyspaces.
You configure DSBulk by using a configuration file. You specify the configuration file directly from
the command line.
1. Create a DSBulk configuration file for the migration to Amazon Keyspaces. In this example, we
use the file name dsbulk_keyspaces.conf. Specify the following settings in the DSBulk
configuration file.
a. PlainTextAuthProvider – Create the authentication provider with the
PlainTextAuthProvider class. ServiceUserName and ServicePassword should
match the user name and password you obtained when you generated the service-specific
credentials by following the steps at the section called “Create programmatic access
credentials”.
b. local-datacenter – Set the value for local-datacenter to the AWS Region that
you're connecting to. For example, if the application is connecting to cassandra.us-
east-2.amazonaws.com, then set the local data center to us-east-2. For all available
AWS Regions, see the section called “Service endpoints”. To disable the driver's slow
replica avoidance, set slow-replica-avoidance to false.
c. SSLEngineFactory – To configure SSL/TLS, initialize the SSLEngineFactory
by adding a section in the configuration file with a single line that specifies
the class with class = DefaultSslEngineFactory. Provide the path to
cassandra_truststore.jks and the password that you created previously.
d. consistency – Set the consistency level to LOCAL_QUORUM. Other write consistency
levels are not supported. For more information, see the section called “Supported
Cassandra consistency levels”.
e. The number of connections per pool is configurable in the Java driver. For this example,
set advanced.connection.pool.local.size to 3.
The following is the complete sample configuration file.
datastax-java-driver {
basic.contact-points = [ "cassandra.us-east-2.amazonaws.com:9142"]
advanced.auth-provider {
class = PlainTextAuthProvider
username = "ServiceUserName"
password = "ServicePassword"
}
basic.load-balancing-policy {
local-datacenter = "us-east-2"
slow-replica-avoidance = false
}
basic.request {
consistency = LOCAL_QUORUM
default-idempotence = true
}
advanced.ssl-engine-factory {
class = DefaultSslEngineFactory
truststore-path = "./cassandra_truststore.jks"
truststore-password = "my_password"
hostname-validation = false
}
advanced.connection.pool.local.size = 3
}
2. Review the parameters for the DSBulk load command.
a. executor.maxPerSecond – The maximum number of rows that the load command
attempts to process concurrently per second. If unset, this setting is disabled with -1.
Set executor.maxPerSecond based on the number of WCUs that you provisioned to
the target destination table. The executor.maxPerSecond of the load command isn’t
a limit – it’s a target average. This means it can (and often does) burst above the number
you set. To allow for bursts and make sure that enough capacity is in place to handle the
data load requests, set executor.maxPerSecond to 90% of the table’s write capacity.
executor.maxPerSecond = WCUs * .90
In this tutorial, we set executor.maxPerSecond to 5.
Note
If you are using DSBulk 1.6.0 or higher, you can use
dsbulk.engine.maxConcurrentQueries instead.
b. Configure these additional parameters for the DSBulk load command.
batch-mode – This parameter tells the system to group operations by partition key. We
recommend disabling batch mode, because it can result in hot key scenarios and cause
WriteThrottleEvents.
driver.advanced.retry-policy.max-retries – This determines how many times
to retry a failed query. If unset, the default is 10. You can adjust this value as needed.
driver.basic.request.timeout – The time the system waits for a query
to return. If unset, the default is 5 minutes. You can adjust this value as needed.
Step 5: Run the DSBulk load command to upload data from the CSV file to the
target table
In the final step of this tutorial, you upload the data into Amazon Keyspaces.
To run the DSBulk load command, complete the following steps.
1. Run the following code to upload the data from your csv file to your Amazon Keyspaces table.
Make sure to update the path to the application configuration file you created earlier.
dsbulk load -f ./dsbulk_keyspaces.conf --connector.csv.url keyspace.table.csv
-header true --batch.mode DISABLED --executor.maxPerSecond 5 --
driver.basic.request.timeout "5 minutes" --driver.advanced.retry-policy.max-
retries 10 -k catalog -t book_awards
2. The output includes the location of a log file that details successful and unsuccessful
operations. The file is stored in the following directory.
Operation directory: /home/user_name/logs/UNLOAD_20210308-202317-801911
3. The log file entries will include metrics, as in the following example. Check to make sure that
the number of rows is consistent with the number of rows in your csv file.
total | failed | rows/s | p50ms | p99ms | p999ms
200 | 0 | 200 | 21.63 | 21.89 | 21.89
Important
Now that you have transferred your data, adjust the capacity mode settings of your target
table to match your application’s regular traffic patterns. You incur charges at the hourly
rate for your provisioned capacity until you change it. For more information, see the section
called “Configure read/write capacity modes”.
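For example, to switch the tutorial's target table back to on-demand capacity, you could run the following AWS CLI command (a minimal sketch; the keyspace and table names match this tutorial).
aws keyspaces update-table --keyspace-name 'catalog' --table-name 'book_awards' \
   --capacity-specification 'throughputMode=PAY_PER_REQUEST'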
Accessing Amazon Keyspaces (for Apache Cassandra)
You can access Amazon Keyspaces using the console, AWS CloudShell, or programmatically by running a cqlsh client, using the AWS SDK, or using an Apache 2.0 licensed Cassandra driver. Amazon Keyspaces supports drivers and clients that are compatible with Apache Cassandra 3.11.2. Before accessing Amazon Keyspaces, you must set up AWS Identity and Access Management and then grant an IAM identity permissions to access Amazon Keyspaces.
Setting up AWS Identity and Access Management
Sign up for an AWS account
If you do not have an AWS account, complete the following steps to create one.
To sign up for an AWS account
1. Open https://portal.aws.amazon.com/billing/signup.
2. Follow the online instructions.
Part of the sign-up procedure involves receiving a phone call and entering a verification code
on the phone keypad.
When you sign up for an AWS account, an AWS account root user is created. The root user
has access to all AWS services and resources in the account. As a security best practice, assign
administrative access to a user, and use only the root user to perform tasks that require root
user access.
AWS sends you a confirmation email after the sign-up process is complete. At any time, you can
view your current account activity and manage your account by going to https://aws.amazon.com/
and choosing My Account.
Create a user with administrative access
After you sign up for an AWS account, secure your AWS account root user, enable AWS IAM Identity
Center, and create an administrative user so that you don't use the root user for everyday tasks.
Secure your AWS account root user
1. Sign in to the AWS Management Console as the account owner by choosing Root user and
entering your AWS account email address. On the next page, enter your password.
For help signing in by using root user, see Signing in as the root user in the AWS Sign-In User
Guide.
2. Turn on multi-factor authentication (MFA) for your root user.
For instructions, see Enable a virtual MFA device for your AWS account root user (console) in
the IAM User Guide.
Create a user with administrative access
1. Enable IAM Identity Center.
For instructions, see Enabling AWS IAM Identity Center in the AWS IAM Identity Center User
Guide.
2. In IAM Identity Center, grant administrative access to a user.
For a tutorial about using the IAM Identity Center directory as your identity source, see
Configure user access with the default IAM Identity Center directory in the AWS IAM Identity
Center User Guide.
Sign in as the user with administrative access
To sign in with your IAM Identity Center user, use the sign-in URL that was sent to your email
address when you created the IAM Identity Center user.
For help signing in using an IAM Identity Center user, see Signing in to the AWS access portal in
the AWS Sign-In User Guide.
Assign access to additional users
1. In IAM Identity Center, create a permission set that follows the best practice of applying least-
privilege permissions.
For instructions, see Create a permission set in the AWS IAM Identity Center User Guide.
2. Assign users to a group, and then assign single sign-on access to the group.
For instructions, see Add groups in the AWS IAM Identity Center User Guide.
Setting up Amazon Keyspaces
Access to Amazon Keyspaces resources is managed using IAM. Using IAM, you can attach policies
to IAM users, roles, and federated identities that grant read and write permissions to specific
resources in Amazon Keyspaces.
To get started with granting permissions to an IAM identity, you can use one of the AWS managed
policies for Amazon Keyspaces:
AmazonKeyspacesFullAccess – This policy grants permissions to access all resources in Amazon Keyspaces with full access to all features.
AmazonKeyspacesReadOnlyAccess_v2 – This policy grants read-only permissions to Amazon Keyspaces.
For a detailed explanation of the actions defined in the managed policies, see the section called “AWS managed policies”.
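For example, you can attach one of these managed policies to an IAM user with the AWS CLI. The following is a minimal sketch; the user name alice is an example.
aws iam attach-user-policy \
   --user-name alice \
   --policy-arn arn:aws:iam::aws:policy/AmazonKeyspacesReadOnlyAccess_v2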
To limit the scope of actions that an IAM identity can perform or limit the resources that the
identity can access, you can create a custom policy that uses the AmazonKeyspacesFullAccess
managed policy as a template and remove all permissions that you don't need. You can also limit
access to specific keyspaces or tables. For more information about how to restrict actions or limit
access to specific resources in Amazon Keyspaces, see the section called “How Amazon Keyspaces
works with IAM”.
To access Amazon Keyspaces after you have created the AWS account and created a policy that
grants an IAM identity access to Amazon Keyspaces, continue to one of the following sections:
Using the console
Using AWS CloudShell
Accessing Amazon Keyspaces using the console
You can access the console for Amazon Keyspaces at https://console.aws.amazon.com/keyspaces/
home. For more information about AWS Management Console access, see Controlling IAM users
access to the AWS Management Console in the IAM User Guide.
You can use the console to do the following in Amazon Keyspaces:
Create, delete, and manage keyspaces and tables.
Monitor important table metrics on a table's Monitor tab:
Billable table size (Bytes)
Capacity metrics
Run queries using the CQL editor, for example insert, update, and delete data.
Change the partitioner configuration of the account.
View performance and error metrics for the account on the dashboard.
To learn how to create an Amazon Keyspaces keyspace and table and set it up with sample
application data, see Getting started with Amazon Keyspaces (for Apache Cassandra).
Using AWS CloudShell to access Amazon Keyspaces
AWS CloudShell is a browser-based, pre-authenticated shell that you can launch directly from
the AWS Management Console. You can run AWS CLI commands against AWS services using your
preferred shell (Bash, PowerShell, or Z shell). To work with Amazon Keyspaces using cqlsh, you must install the cqlsh-expansion. For cqlsh-expansion installation instructions, see the section called “Using the cqlsh-expansion”.
You launch AWS CloudShell from the AWS Management Console, and the AWS credentials
you used to sign in to the console are automatically available in a new shell session. This pre-
authentication of AWS CloudShell users allows you to skip configuring credentials when interacting
with AWS services such as Amazon Keyspaces using cqlsh or AWS CLI version 2 (pre-installed on
the shell's compute environment).
Obtaining IAM permissions for AWS CloudShell
Using the access management resources provided by AWS Identity and Access Management,
administrators can grant permissions to IAM users so they can access AWS CloudShell and use the
environment's features.
The quickest way for an administrator to grant access to users is through an AWS managed policy.
An AWS managed policy is a standalone policy that's created and administered by AWS. The
following AWS managed policy for CloudShell can be attached to IAM identities:
AWSCloudShellFullAccess: Grants permission to use AWS CloudShell with full access to all
features.
If you want to limit the scope of actions that an IAM user can perform with AWS CloudShell, you
can create a custom policy that uses the AWSCloudShellFullAccess managed policy as a
template. For more information about limiting the actions that are available to users in CloudShell,
see Managing AWS CloudShell access and usage with IAM policies in the AWS CloudShell User
Guide.
Note
Your IAM identity also requires a policy that grants permission to make calls to Amazon
Keyspaces.
You can use an AWS managed policy to give your IAM identity access to Amazon Keyspaces, or start with the managed policy as a template and remove the permissions that you don't need. You can also create a custom policy that limits access to specific keyspaces and tables. The following managed policy for Amazon Keyspaces can be attached to IAM identities:
AmazonKeyspacesFullAccess – This policy grants permission to use Amazon Keyspaces with full
access to all features.
For a detailed explanation of the actions defined in the managed policy, see the section called “AWS managed policies”.
For more information about how to restrict actions or limit access to specific resources in Amazon
Keyspaces, see the section called “How Amazon Keyspaces works with IAM”.
Interacting with Amazon Keyspaces using AWS CloudShell
After you launch AWS CloudShell from the AWS Management Console, you can immediately start
to interact with Amazon Keyspaces using cqlsh or the command line interface. If you haven't already installed the cqlsh-expansion, see the section called “Using the cqlsh-expansion” for detailed steps.
Note
When using the cqlsh-expansion in AWS CloudShell, you don't need to configure
credentials before making calls, because you're already authenticated within the shell.
Connect to Amazon Keyspaces and create a new keyspace, then read from a system table to confirm that the keyspace was created using AWS CloudShell
1. From the AWS Management Console, you can launch CloudShell by choosing the following
options available on the navigation bar:
Choose the CloudShell icon.
Start typing "cloudshell" in the Search box and then choose the CloudShell option.
2. You can establish a connection to Amazon Keyspaces using the following command. Make
sure to replace cassandra.us-east-1.amazonaws.com with the correct endpoint for your
Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
If the connection is successful, you should see output similar to the following example.
Connected to Amazon Keyspaces at cassandra.us-east-1.amazonaws.com:9142
[cqlsh 6.1.0 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh current consistency level is ONE.
cqlsh>
3. Create a new keyspace with the name mykeyspace. You can use the following command to do that.
CREATE KEYSPACE mykeyspace WITH REPLICATION = {'class': 'SingleRegionStrategy'};
4. To confirm that the keyspace was created, you can read from a system table using the
following command.
SELECT * FROM system_schema_mcs.keyspaces WHERE keyspace_name = 'mykeyspace';
If the call is successful, the command line displays a response from the service similar to the
following output:
 keyspace_name | durable_writes | replication
---------------+----------------+--------------------------------------------------------------------------------------
 mykeyspace    | True           | {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}

(1 rows)
Create credentials for programmatic access to Amazon
Keyspaces
To provide users and applications with credentials for programmatic access to Amazon Keyspaces
resources, you can do either of the following:
Create service-specific credentials that are similar to the traditional username and password that
Cassandra uses for authentication and access management. AWS service-specific credentials
are associated with a specific AWS Identity and Access Management (IAM) user and can only be
used for the service they were created for. For more information, see Using IAM with Amazon
Keyspaces (for Apache Cassandra) in the IAM User Guide.
Warning
IAM users have long-term credentials, which presents a security risk. To help mitigate this
risk, we recommend that you provide these users with only the permissions they require
to perform the task and that you remove these users when they are no longer needed.
For enhanced security, we recommend creating IAM identities that are used across all AWS services and using temporary credentials. The Amazon Keyspaces SigV4 authentication plugin for Cassandra client drivers enables you to authenticate calls to Amazon Keyspaces using IAM access keys instead of user name and password. To learn more about how the Amazon Keyspaces SigV4 plugin enables IAM users, roles, and federated identities to authenticate in Amazon Keyspaces API requests, see AWS Signature Version 4 process (SigV4).
You can download the SigV4 plugins from the following locations.
Java: https://github.com/aws/aws-sigv4-auth-cassandra-java-driver-plugin.
Node.js: https://github.com/aws/aws-sigv4-auth-cassandra-nodejs-driver-plugin.
Python: https://github.com/aws/aws-sigv4-auth-cassandra-python-driver-plugin.
Go: https://github.com/aws/aws-sigv4-auth-cassandra-gocql-driver-plugin.
For code samples that show how to establish connections using the SigV4 authentication plugin,
see the section called “Using a Cassandra client driver”.
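For instance, with the Python plugin listed above, a connection can be established as in the following minimal sketch. It assumes that the Starfield certificate has been saved locally as sf-class2-root.crt and that IAM access keys are available to boto3.
import boto3
from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED
from cassandra.cluster import Cluster
from cassandra_sigv4.auth import SigV4AuthProvider

# Amazon Keyspaces requires TLS; validate against the Starfield certificate.
ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations('sf-class2-root.crt')
ssl_context.verify_mode = CERT_REQUIRED

# The SigV4 plugin signs requests with the credentials of the boto3 session.
boto_session = boto3.Session(region_name='us-east-1')
auth_provider = SigV4AuthProvider(boto_session)

cluster = Cluster(['cassandra.us-east-1.amazonaws.com'], port=9142,
                  ssl_context=ssl_context, auth_provider=auth_provider)
session = cluster.connect()
print(session.execute('SELECT * FROM system.local').one())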
Topics
Create service-specific credentials for programmatic access to Amazon Keyspaces
Create and configure AWS credentials for Amazon Keyspaces
Create service-specific credentials for programmatic access to Amazon
Keyspaces
Service-specific credentials are similar to the traditional username and password that Cassandra
uses for authentication and access management. Service-specific credentials enable IAM users
to access a specific AWS service. These long-term credentials can't be used to access other AWS
services. They are associated with a specific IAM user and can't be used by other IAM users.
Important
Service-specific credentials are long-term credentials associated with a specific IAM user
and can only be used for the service they were created for. To give IAM roles or federated
identities permissions to access all your AWS resources using temporary credentials,
you should use AWS authentication with the SigV4 authentication plugin for Amazon
Keyspaces.
Use one of the following procedures to generate service-specific credentials.
Console
Create service-specific credentials using the console
1. Sign in to the AWS Management Console and open the AWS Identity and Access
Management console at https://console.aws.amazon.com/iam/home.
2. In the navigation pane, choose Users, and then choose the user that you created earlier
that has Amazon Keyspaces permissions (policy attached).
3. Choose Security Credentials. Under Credentials for Amazon Keyspaces, choose Generate
credentials to generate the service-specific credentials.
Your service-specific credentials are now available. This is the only time you can download or view the password. You cannot recover it later. However, you can reset your password at any time. Save the user name and password in a secure location, because you'll need them later.
CLI
Create service-specific credentials using the AWS CLI
Before generating service-specific credentials, you need to download, install, and configure the
AWS Command Line Interface (AWS CLI):
1. Download the AWS CLI at https://aws.amazon.com/cli.
Note
The AWS CLI runs on Windows, macOS, or Linux.
2. Follow the instructions for Installing the AWS CLI and Configuring the AWS CLI in the AWS
Command Line Interface User Guide.
3. Using the AWS CLI, run the following command to generate service-specific credentials for
the user alice, so that she can access Amazon Keyspaces.
aws iam create-service-specific-credential \
--user-name alice \
--service-name cassandra.amazonaws.com
The output looks like the following.
{
"ServiceSpecificCredential": {
"CreateDate": "2019-10-09T16:12:04Z",
"ServiceName": "cassandra.amazonaws.com",
"ServiceUserName": "alice-at-111122223333",
"ServicePassword": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
"ServiceSpecificCredentialId": "ACCAYFI33SINPGJEBYESF",
"UserName": "alice",
"Status": "Active"
}
}
In the output, note the values for ServiceUserName and ServicePassword. Save these
values in a secure location, because you'll need them later.
Important
This is the only time that the ServicePassword will be available to you.
Create and configure AWS credentials for Amazon Keyspaces
To access Amazon Keyspaces programmatically with the AWS CLI, the AWS SDK, or with
Cassandra client drivers and the SigV4 plugin, you need an IAM user or role with access keys.
When you use AWS programmatically, you provide your AWS access keys so that AWS can verify
your identity in programmatic calls. Your access keys consist of an access key ID (for example,
AKIAIOSFODNN7EXAMPLE) and a secret access key (for example, wJalrXUtnFEMI/K7MDENG/
bPxRfiCYEXAMPLEKEY). This topic walks you through the required steps in this process.
Security best practices recommend that you create IAM users with limited permissions and
instead associate IAM roles with the permissions needed to perform specific tasks. IAM users
can then temporarily assume IAM roles to perform the required tasks. For example, IAM users
in your account using the Amazon Keyspaces console can switch to a role to temporarily use
the permissions of the role in the console. The users give up their original permissions and take
on the permissions assigned to the role. When the users exit the role, their original permissions
are restored. The credentials the users use to assume the role are temporary. In contrast, IAM users have long-term credentials, which presents a security risk if permissions are assigned directly to them instead of having them assume roles. To help mitigate this risk, we recommend that you provide these users with only the permissions they require to perform the task and that you
remove these users when they are no longer needed. For more information about roles, see
Common scenarios for roles: Users, applications, and services in the IAM User Guide.
Topics
Credentials required by the AWS CLI, the AWS SDK, or the Amazon Keyspaces SigV4 plugin for
Cassandra client drivers
Create temporary credentials to connect to Amazon Keyspaces using an IAM role and the SigV4
plugin
Create an IAM user for programmatic access to Amazon Keyspaces in your AWS account
Create new access keys for an IAM user
Store access keys for IAM users
Credentials required by the AWS CLI, the AWS SDK, or the Amazon Keyspaces
SigV4 plugin for Cassandra client drivers
The following credentials are required to authenticate the IAM user or role:
AWS_ACCESS_KEY_ID
Specifies an AWS access key associated with an IAM user or role.
The access key aws_access_key_id is required to connect to Amazon Keyspaces
programmatically.
AWS_SECRET_ACCESS_KEY
Specifies the secret key associated with the access key. This is essentially the "password" for the
access key.
The aws_secret_access_key is required to connect to Amazon Keyspaces programmatically.
AWS_SESSION_TOKEN – Optional
Specifies the session token value that is required if you are using temporary security credentials
that you retrieved directly from AWS Security Token Service operations. For more information,
see the section called “Create temporary credentials to connect to Amazon Keyspaces”.
If you are connecting with an IAM user, the aws_session_token is not required.
Create temporary credentials to connect to Amazon Keyspaces using an IAM role
and the SigV4 plugin
The recommended way to access Amazon Keyspaces programmatically is by using temporary
credentials to authenticate with the SigV4 plugin. In many scenarios, you don't need long-term
access keys that never expire (as you have with an IAM user). Instead, you can create an IAM role
and generate temporary security credentials. Temporary security credentials consist of an access
key ID and a secret access key, but they also include a security token that indicates when the
credentials expire. To learn more about how to use IAM roles instead of long-term access keys, see
Switching to an IAM role (AWS API).
To get started with temporary credentials, you first need to create an IAM role.
Create an IAM role that grants read-only access to Amazon Keyspaces
1. Sign in to the AWS Management Console and open the IAM console at https://
console.aws.amazon.com/iam/.
2. In the navigation pane, choose Roles, then Create role.
3. On the Create role page, under Select type of trusted entity, choose AWS service. Under
Choose a use case, choose Amazon EC2, then choose Next.
4. On the Add permissions page, under Permissions policies, choose AmazonKeyspacesReadOnlyAccess from the policy list, then choose Next.
5. On the Name, review, and create page, enter a name for the role, and review the Select
trusted entities and Add permissions sections. You can also add optional tags for the role on
this page. When you are done, select Create role. Remember this name because you’ll need it
when you launch your Amazon EC2 instance.
To use temporary security credentials in code, you programmatically call an AWS Security Token
Service API like AssumeRole and extract the resulting credentials and session token from your
IAM role that you created in the previous step. You then use those values as credentials for
subsequent calls to AWS. The following example shows pseudocode for how to use temporary
security credentials:
assumeRoleResult = AssumeRole(role-arn);
tempCredentials = new SessionAWSCredentials(
   assumeRoleResult.AccessKeyId,
   assumeRoleResult.SecretAccessKey,
   assumeRoleResult.SessionToken);
cassandraRequest = CreateAmazonCassandraClient(tempCredentials);
For an example that implements temporary credentials using the Python driver to access Amazon
Keyspaces, see ???.
For details about how to call AssumeRole, GetFederationToken, and other API operations,
see the AWS Security Token Service API Reference. For information on getting the temporary
security credentials and session token from the result, see the documentation for the SDK that
you're working with. You can find the documentation for all the AWS SDKs on the main AWS
documentation page, in the SDKs and Toolkits section.
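As a concrete illustration of the pseudocode above, the following minimal sketch uses the AWS SDK for Python (Boto3); the role ARN and session name are placeholder values.
import boto3

# Ask AWS STS for temporary credentials by assuming the role created earlier.
sts = boto3.client('sts')
response = sts.assume_role(
    RoleArn='arn:aws:iam::111122223333:role/keyspaces-read-only',  # placeholder ARN
    RoleSessionName='keyspaces-session')
credentials = response['Credentials']

# The temporary access key ID, secret access key, and session token can now
# be passed to the SigV4 plugin or exported as environment variables.
print(credentials['AccessKeyId'], credentials['Expiration'])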
Create an IAM user for programmatic access to Amazon Keyspaces in your AWS
account
To obtain credentials for programmatic access to Amazon Keyspaces with the AWS CLI, the AWS SDK, or the SigV4 plugin, you need to first create an IAM user or role. The process of creating an IAM user and configuring that IAM user to have programmatic access to Amazon Keyspaces is shown in the following steps:
1. Create the user in the AWS Management Console, the AWS CLI, Tools for Windows PowerShell,
or using an AWS API operation. If you create the user in the AWS Management Console, then the
credentials are created automatically.
2. If you create the user programmatically, then you must create an access key (access key ID and a
secret access key) for that user in an additional step.
3. Give the user permissions to access Amazon Keyspaces.
For information about the permissions that you need in order to create an IAM user, see
Permissions required to access IAM resources.
Console
Create an IAM user with programmatic access (console)
1. Sign in to the AWS Management Console and open the IAM console at https://
console.aws.amazon.com/iam/.
2. In the navigation pane, choose Users and then choose Add users.
3. Type the user name for the new user. This is the sign-in name for AWS.
Note
User names can be a combination of up to 64 letters, digits, and these characters:
plus (+), equal (=), comma (,), period (.), at sign (@), underscore (_), and hyphen (-).
Names must be unique within an account. They are not distinguished by case. For
example, you cannot create two users named TESTUSER and testuser.
4. Select Access key - Programmatic access to create an access key for the new user. You can
view or download the access key when you get to the Final page.
Choose Next: Permissions.
5. On the Set permissions page, choose Attach existing policies directly to assign
permissions to the new user.
This option displays the list of AWS managed and customer managed policies available in
your account. You can enter keyspaces into the search field to display only the policies
that are related to Amazon Keyspaces.
For Amazon Keyspaces, the available managed policies are
AmazonKeyspacesFullAccess and AmazonKeyspacesReadOnlyAccess. For more
information about each policy, see the section called “AWS managed policies”.
For testing purposes and to follow the connection tutorials, select the
AmazonKeyspacesReadOnlyAccess policy for the new IAM user. Note: As a best practice,
we recommend that you follow the principle of least privilege and create custom policies
that limit access to specific resources and only allow the required actions. For more
information about IAM policies and to view example policies for Amazon Keyspaces, see the
section called “Amazon Keyspaces identity-based policies”. After you have created custom
permission policies, attach your policies to roles and then let users assume the appropriate
roles temporarily.
Choose Next: Tags.
6. On the Add tags (optional) page you can add tags for the user, or choose Next: Review.
7. On the Review page you can see all of the choices you made up to this point. When you're
ready to proceed, choose Create user.
8. To view the user's access keys (access key IDs and secret access keys), choose Show next to
the password and access key. To save the access keys, choose Download .csv and then save
the file to a safe location.
Important
This is your only opportunity to view or download the secret access keys, and the user needs this information to use the SigV4 plugin. Save the user's new access key ID and secret access key in a safe and secure place. You will not have access to the secret keys again after this step.
CLI
Create an IAM user with programmatic access (AWS CLI)
1. Create a user with the following AWS CLI code.
aws iam create-user
2. Give the user programmatic access. This requires access keys, which can be generated in the following ways.
AWS CLI: aws iam create-access-key
Tools for Windows PowerShell: New-IAMAccessKey
IAM API: CreateAccessKey
Important
This is your only opportunity to view or download the secret access keys, and the user needs this information to use the SigV4 plugin. Save the user's new access key ID and secret access key in a safe and secure place. You will not have access to the secret keys again after this step.
3. Attach the AmazonKeyspacesReadOnlyAccess policy to the user to define the user's permissions. Note: As a best practice, we recommend that you manage user permissions by adding the user to a group and attaching a policy to the group instead of attaching it directly to a user.
AWS CLI: aws iam attach-user-policy
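Putting these steps together, a minimal end-to-end sketch with the AWS CLI might look like the following; the user name alice is an example.
aws iam create-user --user-name alice
aws iam create-access-key --user-name alice
aws iam attach-user-policy \
   --user-name alice \
   --policy-arn arn:aws:iam::aws:policy/AmazonKeyspacesReadOnlyAccess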
Create new access keys for an IAM user
If you already have an IAM user, you can create new access keys at any time. For more information
about key management, for example how to update access keys, see Managing access keys for IAM
users.
To create access keys for an IAM user (console)
1. Sign in to the AWS Management Console and open the IAM console at https://
console.aws.amazon.com/iam/.
2. In the navigation pane, choose Users.
3. Choose the name of the user whose access keys you want to create.
4. On the Summary page of the user, choose the Security credentials tab.
5. In the Access keys section, choose Create access key.
To view the new access key pair, choose Show. Your credentials will look something like this:
Access key ID: AKIAIOSFODNN7EXAMPLE
Secret access key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Note
You will not have access to the secret access key again after this dialog box closes.
6. To download the key pair, choose Download .csv file. Store the keys in a secure location.
7. After you download the .csv file, choose Close.
When you create an access key, the key pair is active by default, and you can use the pair right
away.
Store access keys for IAM users
As a best practice, we recommend that you don't embed access keys directly into code. The AWS
SDKs and the AWS Command Line Tools enable you to put access keys in known locations so that
you do not have to keep them in code. Put access keys in one of the following locations:
Environment variables – On a multitenant system, choose user environment variables, not
system environment variables.
CLI credentials file – The credentials and config file are updated when you run the
command aws configure. The credentials file is located at ~/.aws/credentials on
Linux, macOS, or Unix, or at C:\Users\USERNAME\.aws\credentials on Windows. This file
can contain the credential details for the default profile and any named profiles.
CLI configuration file – The credentials and config file are updated when you run the
command aws configure. The config file is located at ~/.aws/config on Linux, macOS,
or Unix, or at C:\Users\USERNAME\.aws\config on Windows. This file contains the
configuration settings for the default profile and any named profiles.
Storing access keys as environment variables is a prerequisite for the section called “Authentication plugin for Java 4.x”. The client searches for credentials using the default credentials provider chain, and access keys stored as environment variables take precedence over all other locations, for example configuration files. For more information, see Configuration settings and precedence.
The following examples show how you can configure environment variables for the default user.
Linux, macOS, or Unix
$ export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
$ export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
$ export AWS_SESSION_TOKEN=AQoDYXdzEJr...<remainder of security token>
Setting the environment variable changes the value used until the end of your shell session,
or until you set the variable to a different value. You can make the variables persistent across
future sessions by setting them in your shell's startup script.
Windows Command Prompt
C:\> setx AWS_ACCESS_KEY_ID AKIAIOSFODNN7EXAMPLE
C:\> setx AWS_SECRET_ACCESS_KEY wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
C:\> setx AWS_SESSION_TOKEN AQoDYXdzEJr...<remainder of security token>
Using set to set an environment variable changes the value used until the end of the current
command prompt session, or until you set the variable to a different value. Using setx to set an
environment variable changes the value used in both the current command prompt session and
all command prompt sessions that you create after running the command. It does not affect
other command shells that are already running at the time you run the command.
PowerShell
PS C:\> $Env:AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
PS C:\> $Env:AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
PS C:\> $Env:AWS_SESSION_TOKEN="AQoDYXdzEJr...<remainder of security token>"
If you set an environment variable at the PowerShell prompt as shown in the previous
examples, it saves the value for only the duration of the current session. To make the
environment variable setting persistent across all PowerShell and Command Prompt sessions,
store it by using the System application in Control Panel. Alternatively, you can set the variable
for all future PowerShell sessions by adding it to your PowerShell profile. See the PowerShell
documentation for more information about storing environment variables or persisting them
across sessions.
Service endpoints for Amazon Keyspaces
Topics
Ports and protocols
Global endpoints
AWS GovCloud (US) Region FIPS endpoints
China Regions endpoints
Ports and protocols
You can access Amazon Keyspaces programmatically by running a cqlsh client, with an Apache 2.0
licensed Cassandra driver, or by using the AWS CLI and the AWS SDK.
The following table shows the ports and protocols for the different access mechanisms.
Programmatic access | Port | Protocol
CQLSH | 9142 | TLS
Cassandra driver | 9142 | TLS
AWS CLI | 443 | HTTPS
AWS SDK | 443 | HTTPS
For TLS connections, Amazon Keyspaces uses the Starfield CA to authenticate against the server. For more information, see the section called “How to manually configure cqlsh connections for TLS” or the Before you begin section for your driver in the section called “Using a Cassandra client driver” chapter.
Global endpoints
Amazon Keyspaces is available in the following AWS Regions. This table shows the available service endpoint for each Region.

Region name | Region | Endpoint | Protocol
US East (Ohio) | us-east-2 | cassandra.us-east-2.amazonaws.com | HTTPS and TLS
US East (N. Virginia) | us-east-1 | cassandra.us-east-1.amazonaws.com | HTTPS and TLS
US East (N. Virginia) | us-east-1 | cassandra-fips.us-east-1.amazonaws.com | TLS
US West (N. California) | us-west-1 | cassandra.us-west-1.amazonaws.com | HTTPS and TLS
US West (Oregon) | us-west-2 | cassandra.us-west-2.amazonaws.com | HTTPS and TLS
US West (Oregon) | us-west-2 | cassandra-fips.us-west-2.amazonaws.com | TLS
Asia Pacific (Hong Kong) | ap-east-1 | cassandra.ap-east-1.amazonaws.com | HTTPS and TLS
Asia Pacific (Mumbai) | ap-south-1 | cassandra.ap-south-1.amazonaws.com | HTTPS and TLS
Asia Pacific (Seoul) | ap-northeast-2 | cassandra.ap-northeast-2.amazonaws.com | HTTPS and TLS
Asia Pacific (Singapore) | ap-southeast-1 | cassandra.ap-southeast-1.amazonaws.com | HTTPS and TLS
Asia Pacific (Sydney) | ap-southeast-2 | cassandra.ap-southeast-2.amazonaws.com | HTTPS and TLS
Asia Pacific (Tokyo) | ap-northeast-1 | cassandra.ap-northeast-1.amazonaws.com | HTTPS and TLS
Canada (Central) | ca-central-1 | cassandra.ca-central-1.amazonaws.com | HTTPS and TLS
Europe (Frankfurt) | eu-central-1 | cassandra.eu-central-1.amazonaws.com | HTTPS and TLS
Europe (Ireland) | eu-west-1 | cassandra.eu-west-1.amazonaws.com | HTTPS and TLS
Europe (London) | eu-west-2 | cassandra.eu-west-2.amazonaws.com | HTTPS and TLS
Europe (Paris) | eu-west-3 | cassandra.eu-west-3.amazonaws.com | HTTPS and TLS
Europe (Stockholm) | eu-north-1 | cassandra.eu-north-1.amazonaws.com | HTTPS and TLS
Middle East (Bahrain) | me-south-1 | cassandra.me-south-1.amazonaws.com | HTTPS and TLS
South America (São Paulo) | sa-east-1 | cassandra.sa-east-1.amazonaws.com | HTTPS and TLS
AWS GovCloud (US-East) | us-gov-east-1 | cassandra.us-gov-east-1.amazonaws.com | HTTPS and TLS
AWS GovCloud (US-West) | us-gov-west-1 | cassandra.us-gov-west-1.amazonaws.com | HTTPS and TLS
AWS GovCloud (US) Region FIPS endpoints
The following FIPS endpoints are available in the AWS GovCloud (US) Regions. For more information, see Amazon Keyspaces in the AWS GovCloud (US) User Guide.
Region name | Region | FIPS endpoint | Protocol
AWS GovCloud (US-East) | us-gov-east-1 | cassandra.us-gov-east-1.amazonaws.com | HTTPS and TLS
AWS GovCloud (US-West) | us-gov-west-1 | cassandra.us-gov-west-1.amazonaws.com | HTTPS and TLS
China Regions endpoints
The following Amazon Keyspaces endpoints are available in the AWS China Regions.
To access these endpoints, you have to sign up for a separate set of account credentials unique to
the China Regions. For more information, see China Signup, Accounts, and Credentials.
Region name | Region | Endpoint | Protocol
China (Beijing) | cn-north-1 | cassandra.cn-north-1.amazonaws.com.cn | HTTPS and TLS
China (Ningxia) | cn-northwest-1 | cassandra.cn-northwest-1.amazonaws.com.cn | HTTPS and TLS
Using cqlsh to connect to Amazon Keyspaces
To connect to Amazon Keyspaces using cqlsh, you can use the cqlsh-expansion. This is
a toolkit that contains common Apache Cassandra tooling like cqlsh and helpers that are
preconfigured for Amazon Keyspaces while maintaining full compatibility with Apache Cassandra.
The cqlsh-expansion integrates the SigV4 authentication plugin and allows you to connect
using IAM access keys instead of user name and password. You only need to install the cqlsh
scripts to make a connection and not the full Apache Cassandra distribution, because Amazon
Keyspaces is serverless. This lightweight install package includes the cqlsh-expansion and the
classic cqlsh scripts that you can install on any platform that supports Python.
Note
Murmur3Partitioner is the recommended partitioner for Amazon Keyspaces and the
cqlsh-expansion. The cqlsh-expansion doesn't support the Amazon Keyspaces
DefaultPartitioner. For more information, see the section called “Working with
partitioners”.
For general information about cqlsh, see cqlsh: the CQL shell.
Topics
Using the cqlsh-expansion to connect to Amazon Keyspaces
How to manually configure cqlsh connections for TLS
Using the cqlsh-expansion to connect to Amazon Keyspaces
Installing and configuring the cqlsh-expansion
1. To install the cqlsh-expansion Python package, you can run a pip command. This installs the cqlsh-expansion scripts on your machine using a pip install, along with a file containing a list of dependencies. The --user flag tells pip to use the Python user install directory for your platform. On a Unix based system, that should be the ~/.local/ directory.
You need Python 3 to install the cqlsh-expansion. To find out your Python version, use python3 --version. To install, you can run the following command.
python3 -m pip install --user cqlsh-expansion
The output should look similar to this.
Collecting cqlsh-expansion
Downloading cqlsh_expansion-0.9.6-py3-none-any.whl (153 kB)
######################################## 153.7/153.7 KB 3.3 MB/s eta 0:00:00
Collecting cassandra-driver
Downloading cassandra_driver-3.28.0-cp310-cp310-
manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.1 MB)
######################################## 19.1/19.1 MB 44.5 MB/s eta 0:00:00
Requirement already satisfied: six>=1.12.0 in /usr/lib/python3/dist-packages (from
cqlsh-expansion) (1.16.0)
Collecting boto3
Downloading boto3-1.29.2-py3-none-any.whl (135 kB)
######################################## 135.8/135.8 KB 17.2 MB/s eta 0:00:00
Collecting cassandra-sigv4>=4.0.2
Downloading cassandra_sigv4-4.0.2-py2.py3-none-any.whl (9.8 kB)
Collecting botocore<1.33.0,>=1.32.2
Downloading botocore-1.32.2-py3-none-any.whl (11.4 MB)
######################################## 11.4/11.4 MB 60.9 MB/s eta 0:00:00
Collecting s3transfer<0.8.0,>=0.7.0
Downloading s3transfer-0.7.0-py3-none-any.whl (79 kB)
######################################## 79.8/79.8 KB 13.1 MB/s eta 0:00:00
Collecting jmespath<2.0.0,>=0.7.1
Downloading jmespath-1.0.1-py3-none-any.whl (20 kB)
Collecting geomet<0.3,>=0.1
Downloading geomet-0.2.1.post1-py3-none-any.whl (18 kB)
Collecting python-dateutil<3.0.0,>=2.1
Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
######################################## 247.7/247.7 KB 33.1 MB/s eta 0:00:00
Requirement already satisfied: urllib3<2.1,>=1.25.4 in /usr/lib/python3/dist-
packages (from botocore<1.33.0,>=1.32.2->boto3->cqlsh-expansion) (1.26.5)
Requirement already satisfied: click in /usr/lib/python3/dist-packages (from
geomet<0.3,>=0.1->cassandra-driver->cqlsh-expansion) (8.0.3)
Installing collected packages: python-dateutil, jmespath, geomet, cassandra-driver,
botocore, s3transfer, boto3, cassandra-sigv4, cqlsh-expansion
WARNING: The script geomet is installed in '/home/ubuntu/.local/bin' which is not
on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this
warning, use --no-warn-script-location.
WARNING: The scripts cqlsh, cqlsh-expansion and cqlsh-expansion.init are
installed in '/home/ubuntu/.local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this
warning, use --no-warn-script-location.
Successfully installed boto3-1.29.2 botocore-1.32.2 cassandra-driver-3.28.0
cassandra-sigv4-4.0.2 cqlsh-expansion-0.9.6 geomet-0.2.1.post1 jmespath-1.0.1
python-dateutil-2.8.2 s3transfer-0.7.0
If the install directory is not in the PATH, you need to add it by following the instructions for your operating system. The following is an example for Ubuntu Linux.
export PATH="$PATH:/home/ubuntu/.local/bin"
To confirm that the package is installed, you can run the following command.
cqlsh-expansion --version
The output should look like this.
cqlsh 6.1.0
2. To configure the cqlsh-expansion, you can run a post-install script to automatically complete the following steps:
1. Create the .cassandra directory in the user home directory if it doesn't already exist.
2. Copy a preconfigured cqlshrc configuration file into the .cassandra directory.
3. Copy the Starfield digital certificate into the .cassandra directory. Amazon Keyspaces uses this certificate to configure the secure connection with Transport Layer Security (TLS). Encryption in transit provides an additional layer of data protection by encrypting your data as it travels to and from Amazon Keyspaces.
To review the script first, you can access it in the GitHub repo at post_install.py.
To use the script, you can run the following command.
cqlsh-expansion.init
Note
The directory and file created by the post-install script are not removed when you
uninstall the cqlsh-expansion using pip uninstall, and have to be deleted
manually.
Connecting to Amazon Keyspaces using the cqlsh-expansion
1. Configure your AWS Region and add it as a user environment variable.
To add your default Region as an environment variable on a Unix based system, you can run
the following command. For this example, we use US East (N. Virginia).
export AWS_DEFAULT_REGION=us-east-1
For more information about how to set environment variables, including for other platforms,
see How to set environment variables.
2. Find your service endpoint.
Choose the appropriate service endpoint for your Region. To review the available endpoints for Amazon Keyspaces, see the section called “Service endpoints”. For this example, we use the endpoint cassandra.us-east-1.amazonaws.com.
3. Configure the authentication method.
Connecting with IAM access keys (IAM users, roles, and federated identities) is the
recommended method for enhanced security.
Before you can connect with IAM access keys, you need to complete the following steps:
a. Create an IAM user, or follow the best practice and create an IAM role that IAM users can assume. For more information on how to create IAM access keys, see the section called “Create IAM credentials for AWS authentication”.
b. Create an IAM policy that grants the role (or IAM user) at least read-only access to Amazon
Keyspaces. For more information about the permissions required for the IAM user or role
to connect to Amazon Keyspaces, see the section called “Accessing Amazon Keyspaces
tables”.
c. Add the access keys of the IAM user to the user's environment variables as shown in the
following example.
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
For more information about how to set environment variables, including for other
platforms, see How to set environment variables.
Note
If you're connecting from an Amazon EC2 instance, you also need to configure
an outbound rule in the security group that allows traffic from the instance
to Amazon Keyspaces. For more information about how to view and edit EC2
outbound rules, see Add rules to a security group in the Amazon EC2 User Guide.
4. Connect to Amazon Keyspaces using the cqlsh-expansion and SigV4 authentication.
To connect to Amazon Keyspaces with the cqlsh-expansion, you can use the following
command. Make sure to replace the service endpoint with the correct endpoint for your
Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
If the connection is successful, you should see output similar to the following example.
Connected to Amazon Keyspaces at cassandra.us-east-1.amazonaws.com:9142
[cqlsh 6.1.0 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh current consistency level is ONE.
cqlsh>
If you encounter a connection error, see the section called “Cqlsh connection errors” for troubleshooting information.
Connect to Amazon Keyspaces with service-specific credentials.
To connect with the traditional username and password combination that Cassandra uses for authentication, you must first create service-specific credentials for Amazon Keyspaces as described in the section called “Create service-specific credentials”. You also have to give that user permissions to access Amazon Keyspaces. For more information, see the section called “Accessing Amazon Keyspaces tables”.
After you have created service-specific credentials and permissions for the user, you must
update the cqlshrc file, typically found in the user directory path ~/.cassandra/. In
the cqlshrc file, go to the Cassandra [authentication] section and comment out the
SigV4 module and class under [auth_provider] using the ";" character as shown in the
following example.
[auth_provider]
; module = cassandra_sigv4.auth
; classname = SigV4AuthProvider
After you have updated the cqlshrc file, you can connect to Amazon Keyspaces with
service-specific credentials using the following command.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 -u myUserName -p myPassword --ssl
Cleanup
To remove the cqlsh-expansion package you can use the pip uninstall command.
pip3 uninstall cqlsh-expansion
The pip3 uninstall command doesn't remove the directory and related files created by
the post-install script. To remove the folder and files created by the post-install script, you can
delete the .cassandra directory.
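For example, on a Unix based system you could remove the directory with the following command. Make sure the directory doesn't contain files that you still need.
rm -rf ~/.cassandra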
How to manually configure cqlsh connections for TLS
Amazon Keyspaces only accepts secure connections using Transport Layer Security (TLS). You can
use the cqlsh-expansion utility that automatically downloads the certificate for you and installs
a preconfigured cqlshrc configuration file. For more information, see the section called “Using the cqlsh-expansion” on this page.
If you want to download the certificate and configure the connection manually, you can do so using
the following steps.
1. Download the Starfield digital certificate using the following command and save sf-class2-root.crt locally or in your home directory.
curl https://certs.secureserver.net/repository/sf-class2-root.crt -O
Note
You can also use the Amazon digital certificate to connect to Amazon Keyspaces and
can continue to do so if your client is connecting to Amazon Keyspaces successfully. The
Starfield certificate provides additional backwards compatibility for clients using older
certificate authorities.
2. Open the cqlshrc configuration file in the Cassandra home directory, for example ${HOME}/.cassandra/cqlshrc, and add the following lines.
[connection]
port = 9142
factory = cqlshlib.ssl.ssl_transport_factory
[ssl]
validate = true
certfile = path_to_file/sf-class2-root.crt
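After you save the cqlshrc file, you can connect with the classic cqlsh script. For example, assuming you use service-specific credentials, the command might look like the following.
cqlsh cassandra.us-east-1.amazonaws.com 9142 -u myUserName -p myPassword --ssl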
Using the AWS CLI to connect to Amazon Keyspaces
You can use the AWS Command Line Interface (AWS CLI) to control multiple AWS services from the
command line and automate them through scripts. With Amazon Keyspaces you can use the AWS
CLI for data definition language (DDL) operations, such as creating a table. In addition, you can use
infrastructure as code (IaC) services and tools such as AWS CloudFormation and Terraform.
Before you can use the AWS CLI with Amazon Keyspaces, you must get an access key ID and
secret access key. For more information, see the section called “Create IAM credentials for AWS
authentication”.
For a complete listing of all the commands available for Amazon Keyspaces in the AWS CLI, see the
AWS CLI Command Reference.
Topics
Downloading and Configuring the AWS CLI
Using the AWS CLI with Amazon Keyspaces
Downloading and Configuring the AWS CLI
The AWS CLI is available at https://aws.amazon.com/cli. It runs on Windows, macOS, or Linux.
After downloading the AWS CLI, follow these steps to install and configure it:
1. Go to the AWS Command Line Interface User Guide
2. Follow the instructions for Installing the AWS CLI and Configuring the AWS CLI
Using the AWS CLI with Amazon Keyspaces
The command line format consists of an Amazon Keyspaces operation name followed by the parameters for that operation. The AWS CLI supports a shorthand syntax for the parameter values, as well as JSON. The following Amazon Keyspaces examples use AWS CLI shorthand syntax. For more information, see Using shorthand syntax with the AWS CLI.
The following command creates a keyspace with the name catalog.
aws keyspaces create-keyspace --keyspace-name 'catalog'
The command returns the resource Amazon Resource Name (ARN) in the output.
{
"resourceArn": "arn:aws:cassandra:us-east-1:111222333444:/keyspace/catalog/"
}
To confirm that the keyspace catalog exists, you can use the following command.
aws keyspaces get-keyspace --keyspace-name 'catalog'
The output of the command returns the following values.
{
"keyspaceName": "catalog",
"resourceArn": "arn:aws:cassandra:us-east-1:111222333444:/keyspace/catalog/"
}
The following command creates a table with the name book_awards. The partition key of the table consists of the columns year and award, and the clustering key consists of the columns category and rank; both clustering columns use the ascending sort order. (For easier readability, long commands in this section are broken into separate lines.)
aws keyspaces create-table --keyspace-name 'catalog' --table-name 'book_awards'
--schema-definition 'allColumns=[{name=year,type=int},
{name=award,type=text},{name=rank,type=int},
{name=category,type=text}, {name=author,type=text},
{name=book_title,type=text},{name=publisher,type=text}],
partitionKeys=[{name=year},
{name=award}],clusteringKeys=[{name=category,orderBy=ASC},{name=rank,orderBy=ASC}]'
This command results in the following output.
{
"resourceArn": "arn:aws:cassandra:us-east-1:111222333444:/keyspace/catalog/table/
book_awards"
}
To confirm the metadata and properties of the table, you can use the following command.
aws keyspaces get-table --keyspace-name 'catalog' --table-name 'book_awards'
This command returns the following output.
{
"keyspaceName": "catalog",
"tableName": "book_awards",
"resourceArn": "arn:aws:cassandra:us-east-1:111222333444:/keyspace/catalog/table/
book_awards",
"creationTimestamp": 1645564368.628,
"status": "ACTIVE",
"schemaDefinition": {
"allColumns": [
{
"name": "year",
"type": "int"
},
{
"name": "award",
"type": "text"
},
{
"name": "category",
"type": "text"
},
{
"name": "rank",
"type": "int"
},
{
"name": "author",
"type": "text"
},
{
"name": "book_title",
"type": "text"
},
{
"name": "publisher",
"type": "text"
}
],
"partitionKeys": [
{
"name": "year"
},
{
"name": "award"
}
],
"clusteringKeys": [
{
"name": "category",
"orderBy": "ASC"
},
{
"name": "rank",
"orderBy": "ASC"
}
],
"staticColumns": []
},
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": 1645564368.628
},
"encryptionSpecification": {
"type": "AWS_OWNED_KMS_KEY"
},
"pointInTimeRecovery": {
"status": "DISABLED"
},
"ttl": {
"status": "ENABLED"
},
"defaultTimeToLive": 0,
"comment": {
"message": ""
}
}
When creating tables with complex schemas, it can be helpful to load the table's schema definition
from a JSON file. The following is an example of this. Download the schema definition example
JSON file from schema_definition.zip and extract schema_definition.json, taking note of the
path to the file. In this example, the schema definition JSON file is located in the current directory.
For different file path options, see How to load parameters from a file.
aws keyspaces create-table --keyspace-name 'catalog'
   --table-name 'book_awards'
   --schema-definition 'file://schema_definition.json'
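For reference, a schema definition file for the book_awards table would contain JSON in the following shape. This is a sketch that mirrors the schemaDefinition section of the get-table output shown earlier; the downloaded example file may differ in its details.
{
    "allColumns": [
        {"name": "year", "type": "int"},
        {"name": "award", "type": "text"},
        {"name": "rank", "type": "int"},
        {"name": "category", "type": "text"},
        {"name": "author", "type": "text"},
        {"name": "book_title", "type": "text"},
        {"name": "publisher", "type": "text"}
    ],
    "partitionKeys": [
        {"name": "year"},
        {"name": "award"}
    ],
    "clusteringKeys": [
        {"name": "category", "orderBy": "ASC"},
        {"name": "rank", "orderBy": "ASC"}
    ]
}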
The following examples show how to create a simple table with the name myTable with additional
options. Note that the commands are broken down into separate rows to improve readability. This
command shows how to create a table and:
set the capacity mode of the table
enable Point-in-time recovery for the table
set the default Time to Live (TTL) value for the table to one year
add two tags for the table
aws keyspaces create-table --keyspace-name 'catalog' --table-name 'myTable'
--schema-definition 'allColumns=[{name=id,type=int},{name=name,type=text},
{name=date,type=timestamp}],partitionKeys=[{name=id}]'
--capacity-specification
'throughputMode=PROVISIONED,readCapacityUnits=5,writeCapacityUnits=5'
--point-in-time-recovery 'status=ENABLED'
--default-time-to-live '31536000'
--tags 'key=env,value=test' 'key=dpt,value=sec'
This example shows how to create a new table that uses a customer managed key for encryption
and has TTL enabled to allow you to set expiration dates for columns and rows. To run this sample,
you must replace the resource ARN for the customer managed AWS KMS key with your own key
and ensure Amazon Keyspaces has access to it.
aws keyspaces create-table --keyspace-name 'catalog' --table-name 'myTable'
--schema-definition 'allColumns=[{name=id,type=int},{name=name,type=text},
{name=date,type=timestamp}],partitionKeys=[{name=id}]'
--encryption-specification
'type=CUSTOMER_MANAGED_KMS_KEY,kmsKeyIdentifier=arn:aws:kms:us-
east-1:111222333444:key/11111111-2222-3333-4444-555555555555'
--ttl 'status=ENABLED'
Using the API to connect to Amazon Keyspaces
You can use the AWS SDK and the AWS Command Line Interface (AWS CLI) to work interactively with Amazon Keyspaces. You can use the API for data definition language (DDL) operations, such as creating a keyspace or a table. In addition, you can use infrastructure as code (IaC) services and tools such as AWS CloudFormation and Terraform.
Before you can use the AWS CLI with Amazon Keyspaces, you must get an access key ID and
secret access key. For more information, see the section called “Create IAM credentials for AWS
authentication”.
For a complete listing of all operations available for Amazon Keyspaces in the API, see Amazon
Keyspaces API Reference.
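For example, the following minimal sketch uses the AWS SDK for Python (Boto3) to list the keyspaces in your account. It assumes that AWS credentials are already configured.
import boto3

# The 'keyspaces' client exposes the Amazon Keyspaces control plane API.
client = boto3.client('keyspaces', region_name='us-east-1')
for keyspace in client.list_keyspaces()['keyspaces']:
    print(keyspace['keyspaceName'], keyspace['resourceArn'])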
Using a Cassandra client driver to access Amazon Keyspaces
programmatically
You can use many third-party, open-source Cassandra drivers to connect to Amazon Keyspaces.
Amazon Keyspaces is compatible with Cassandra drivers that support Apache Cassandra version
3.11.2. These are the drivers and latest versions that we’ve tested and recommend for use with Amazon Keyspaces:
Java v3.3
Java v4.17
Python Cassandra driver v3.29.1
Node.js Cassandra driver v4.7.2
Go GOCQL v1.6
.NET CassandraCSharpDriver v3.20.1
For more information about Cassandra drivers, see Apache Cassandra Client drivers.
Note
To help you get started, you can view and download end-to-end code examples that
establish connections to Amazon Keyspaces with popular drivers. See Amazon Keyspaces
examples on GitHub.
The tutorials in this chapter include a simple CQL query to confirm that the connection to Amazon
Keyspaces has been successfully established. To learn how to work with keyspaces and tables after
you connect to an Amazon Keyspaces endpoint, see CQL language reference. For a step-by-step
tutorial that shows how to connect to Amazon Keyspaces from an Amazon VPC endpoint, see the
section called “Connecting with VPC endpoints”.
Topics
Using a Cassandra Java client driver to access Amazon Keyspaces programmatically
Using a Cassandra Python client driver to access Amazon Keyspaces programmatically
Using a Cassandra Node.js client driver to access Amazon Keyspaces programmatically
Using a Cassandra .NET Core client driver to access Amazon Keyspaces programmatically
Using a Cassandra Go client driver to access Amazon Keyspaces programmatically
Using a Cassandra Perl client driver to access Amazon Keyspaces programmatically
Using a Cassandra Java client driver to access Amazon Keyspaces
programmatically
This section shows you how to connect to Amazon Keyspaces by using a Java client driver.
Note
Java 17 and the DataStax Java Driver 4.17 currently have Beta support only. For
more information, see https://docs.datastax.com/en/developer/java-driver/4.17/upgrade_guide/.
To provide users and applications with credentials for programmatic access to Amazon Keyspaces
resources, you can do either of the following:
Create service-specific credentials that are associated with a specific AWS Identity and Access
Management (IAM) user.
For enhanced security, we recommend creating IAM access keys for IAM identities that are used
across all AWS services. The Amazon Keyspaces SigV4 authentication plugin for Cassandra client
drivers enables you to authenticate calls to Amazon Keyspaces using IAM access keys instead of
user name and password. For more information, see the section called “Create IAM credentials
for AWS authentication”.
Note
For an example of how to use Amazon Keyspaces with Spring Boot, see https://github.com/aws-samples/amazon-keyspaces-examples/tree/main/java/datastax-v4/spring.
Topics
Before you begin
Step-by-step tutorial to connect to Amazon Keyspaces using the DataStax Java driver for Apache
Cassandra using service-specific credentials
Step-by-step tutorial to connect to Amazon Keyspaces using the 4.x DataStax Java driver for
Apache Cassandra and the SigV4 authentication plugin
Connect to Amazon Keyspaces using the 3.x DataStax Java driver for Apache Cassandra and the
SigV4 authentication plugin
Before you begin
Before you can connect to Amazon Keyspaces, you need to complete the following tasks.
1. Amazon Keyspaces requires the use of Transport Layer Security (TLS) to help secure
connections with clients.
a. Download the Starfield digital certificate using the following command and save sf-class2-root.crt locally or in your home directory.
curl https://certs.secureserver.net/repository/sf-class2-root.crt -O
Note
You can also use the Amazon digital certificate to connect to Amazon Keyspaces
and can continue to do so if your client is connecting to Amazon Keyspaces
successfully. The Starfield certificate provides additional backwards compatibility
for clients using older certificate authorities.
b. Convert the Starfield digital certificate into a trustStore file.
openssl x509 -outform der -in sf-class2-root.crt -out temp_file.der
keytool -import -alias cassandra -keystore cassandra_truststore.jks -file
temp_file.der
In this step, you need to create a password for the keystore and trust this certificate. The
interactive command looks like this.
Enter keystore password:
Re-enter new password:
Owner: OU=Starfield Class 2 Certification Authority, O="Starfield Technologies,
Inc.", C=US
Issuer: OU=Starfield Class 2 Certification Authority, O="Starfield
Technologies, Inc.", C=US
Serial number: 0
Valid from: Tue Jun 29 17:39:16 UTC 2004 until: Thu Jun 29 17:39:16 UTC 2034
Certificate fingerprints:
MD5: 32:4A:4B:BB:C8:63:69:9B:BE:74:9A:C6:DD:1D:46:24
SHA1: AD:7E:1C:28:B0:64:EF:8F:60:03:40:20:14:C3:D0:E3:37:0E:B5:8A
SHA256:
14:65:FA:20:53:97:B8:76:FA:A6:F0:A9:95:8E:55:90:E4:0F:CC:7F:AA:4F:B7:C2:C8:67:75:21:FB:5F:B6:58
Signature algorithm name: SHA1withRSA
Subject Public Key Algorithm: 2048-bit RSA key
Version: 3
Extensions:
#1: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
0000: BF 5F B7 D1 CE DD 1F 86 F4 5B 55 AC DC D7 10 C2 ._.......[U.....
0010: 0E A9 88 E7 ....
]
[OU=Starfield Class 2 Certification Authority, O="Starfield Technologies,
Inc.", C=US]
SerialNumber: [ 00]
]
#2: ObjectId: 2.5.29.19 Criticality=false
BasicConstraints:[
CA:true
PathLen:2147483647
]
#3: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: BF 5F B7 D1 CE DD 1F 86 F4 5B 55 AC DC D7 10 C2 ._.......[U.....
0010: 0E A9 88 E7 ....
]
]
Trust this certificate? [no]: y
2. Attach the trustStore file in the JVM arguments:
-Djavax.net.ssl.trustStore=path_to_file/cassandra_truststore.jks
-Djavax.net.ssl.trustStorePassword=my_password
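For example, if you start your application from the command line, you might pass the arguments like this (a sketch; my-keyspaces-app.jar is a hypothetical application JAR):
java -Djavax.net.ssl.trustStore=path_to_file/cassandra_truststore.jks -Djavax.net.ssl.trustStorePassword=my_password -jar my-keyspaces-app.jar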
Step-by-step tutorial to connect to Amazon Keyspaces using the DataStax Java
driver for Apache Cassandra using service-specific credentials
The following step-by-step tutorial walks you through connecting to Amazon Keyspaces with a
Java driver for Cassandra and service-specific credentials. Specifically, you'll use the 4.0 version of
the DataStax Java driver for Apache Cassandra.
Topics
Step 1: Prerequisites
Step 2: Configure the driver
Step 3: Run the sample application
Step 1: Prerequisites
To follow this tutorial, you need to generate service-specific credentials and add the DataStax Java
driver for Apache Cassandra to your Java project.
Generate service-specific credentials for your Amazon Keyspaces IAM user by completing the
steps in the section called “Create service-specific credentials”. If you prefer to use IAM access
keys for authentication, see the section called “Authentication plugin for Java 4.x”.
Add the DataStax Java driver for Apache Cassandra to your Java project. Ensure that you're using
a version of the driver that supports Apache Cassandra 3.11.2. For more information, see the
DataStax Java driver for Apache Cassandra documentation.
Step 2: Configure the driver
You can specify settings for the DataStax Java Cassandra driver by creating a configuration file
for your application. This configuration file overrides the default settings and tells the driver to
connect to the Amazon Keyspaces service endpoint using port 9142. For a list of available service
endpoints, see the section called “Service endpoints”.
Create a configuration file and save the file in the application's resources folder—for example,
src/main/resources/application.conf. Open application.conf and add the following
configuration settings.
1. Authentication provider – Create the authentication provider with the
PlainTextAuthProvider class. ServiceUserName and ServicePassword should match
the user name and password you obtained when you generated the service-specific credentials
by following the steps in Create service-specific credentials for programmatic access to
Amazon Keyspaces.
Note
You can use short-term credentials by using the authentication plugin for the DataStax
Java driver for Apache Cassandra instead of hardcoding credentials in your driver
configuration file. To learn more, follow the instructions in the section called
“Authentication plugin for Java 4.x”.
2. Local data center – Set the value for local-datacenter to the Region you're connecting to.
For example, if the application is connecting to cassandra.us-east-2.amazonaws.com,
then set the local data center to us-east-2. For all available AWS Regions, see the section
called “Service endpoints”. Set slow-replica-avoidance = false to load balance against
all available nodes.
3. SSL/TLS – Initialize the SSLEngineFactory by adding a section in the configuration file with a
single line that specifies the class with class = DefaultSslEngineFactory. Provide the
path to the trustStore file and the password that you created previously. Amazon Keyspaces
doesn't support hostname-validation of peers, so set this option to false.
datastax-java-driver {
basic.contact-points = [ "cassandra.us-east-2.amazonaws.com:9142"]
advanced.auth-provider{
class = PlainTextAuthProvider
username = "ServiceUserName"
password = "ServicePassword"
}
basic.load-balancing-policy {
local-datacenter = "us-east-2"
slow-replica-avoidance = false
}
advanced.ssl-engine-factory {
class = DefaultSslEngineFactory
truststore-path = "./src/main/resources/cassandra_truststore.jks"
truststore-password = "my_password"
hostname-validation = false
}
}
Note
Instead of adding the path to the trustStore in the configuration file, you can also set the
trustStore path directly in the application code, or you can add the path to the trustStore to
your JVM arguments.
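For example, the following sketch sets the trustStore through system properties in application code before the session is built. This relies on the driver falling back to the standard javax.net.ssl system properties when no trustStore path is set in the configuration file.
// a sketch: set the trustStore location in application code instead of application.conf
System.setProperty("javax.net.ssl.trustStore", "path_to_file/cassandra_truststore.jks");
System.setProperty("javax.net.ssl.trustStorePassword", "my_password");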
Step 3: Run the sample application
This code example shows a simple command line application that creates a connection pool
to Amazon Keyspaces by using the configuration file we created earlier. It confirms that the
connection is established by running a simple query.
package <your package>;
// add the following imports to your project
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.config.DriverConfigLoader;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;
public class App
{
public static void main( String[] args )
{
//Use DriverConfigLoader to load your configuration file
DriverConfigLoader loader =
DriverConfigLoader.fromClasspath("application.conf");
try (CqlSession session = CqlSession.builder()
.withConfigLoader(loader)
.build()) {
ResultSet rs = session.execute("select * from system_schema.keyspaces");
Row row = rs.one();
System.out.println(row.getString("keyspace_name"));
}
}
}
Note
Use a try-with-resources block to establish the connection to ensure that it's always closed. If
you don't use one, remember to close your connection to avoid leaking resources.
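If your application can't use try-with-resources, a minimal sketch of the explicit pattern (reusing the imports from the example above) looks like this:
DriverConfigLoader loader = DriverConfigLoader.fromClasspath("application.conf");
CqlSession session = CqlSession.builder().withConfigLoader(loader).build();
try {
    session.execute("select * from system_schema.keyspaces");
} finally {
    session.close(); // always release the connection pool
}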
Step-by-step tutorial to connect to Amazon Keyspaces using the 4.x DataStax Java
driver for Apache Cassandra and the SigV4 authentication plugin
The following section describes how to use the SigV4 authentication plugin for the open-source 4.x
DataStax Java driver for Apache Cassandra to access Amazon Keyspaces (for Apache Cassandra).
The plugin is available from the GitHub repository.
The SigV4 authentication plugin allows you to use IAM credentials for users or roles when
connecting to Amazon Keyspaces. Instead of requiring a user name and password, this plugin signs
API requests using access keys. For more information, see the section called “Create IAM credentials
for AWS authentication”.
Step 1: Prerequisites
To follow this tutorial, you need to complete the following tasks.
If you haven't already done so, create credentials for your IAM user or role following the steps
at the section called “Create IAM credentials for AWS authentication”. This tutorial assumes that
the access keys are stored as environment variables. For more information, see the section called
“Manage access keys”.
Add the DataStax Java driver for Apache Cassandra to your Java project. Ensure that you're using
a version of the driver that supports Apache Cassandra 3.11.2. For more information, see the
DataStax Java Driver for Apache Cassandra documentation.
Add the authentication plugin to your application. The authentication plugin supports version
4.x of the DataStax Java driver for Apache Cassandra. If you’re using Apache Maven, or a build
system that can use Maven dependencies, add the following dependencies to your pom.xml file.
Important
Replace the version of the plugin with the latest version as shown in the GitHub repository.
<dependency>
<groupId>software.aws.mcs</groupId>
<artifactId>aws-sigv4-auth-cassandra-java-driver-plugin</artifactId>
<version>4.0.9</version>
</dependency>
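As an alternative to the configuration file shown in the next step, the plugin can also be registered programmatically on the session builder. The following is a minimal sketch that assumes the plugin's Region-based constructor and the JVM trustStore arguments from the section called “Before you begin”; the class name SigV4Example is hypothetical.
import java.net.InetSocketAddress;
import javax.net.ssl.SSLContext;
import com.datastax.oss.driver.api.core.CqlSession;
import software.aws.mcs.auth.SigV4AuthProvider;
public class SigV4Example {
    public static void main(String[] args) throws Exception {
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress("cassandra.us-east-2.amazonaws.com", 9142))
                .withLocalDatacenter("us-east-2")
                // the plugin signs the CQL authentication exchange with SigV4
                .withAuthProvider(new SigV4AuthProvider("us-east-2"))
                // uses the JVM's default SSL context and trustStore
                .withSslContext(SSLContext.getDefault())
                .build()) {
            System.out.println(session.execute(
                "select * from system_schema.keyspaces").one().getString("keyspace_name"));
        }
    }
}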
Step 2: Configure the driver
You can specify settings for the DataStax Java Cassandra driver by creating a configuration file
for your application. This configuration file overrides the default settings and tells the driver to
connect to the Amazon Keyspaces service endpoint using port 9142. For a list of available service
endpoints, see the section called “Service endpoints”.
Create a configuration file and save the file in the application's resources folder—for example,
src/main/resources/application.conf. Open application.conf and add the following
configuration settings.
1. Authentication provider – Set the advanced.auth-provider.class to a new instance
of software.aws.mcs.auth.SigV4AuthProvider. The SigV4AuthProvider is the
authentication handler provided by the plugin for performing SigV4 authentication.
2. Local data center – Set the value for local-datacenter to the Region you're connecting to.
For example, if the application is connecting to cassandra.us-east-2.amazonaws.com,
then set the local data center to us-east-2. For all available AWS Regions, see the section
called “Service endpoints”. Set slow-replica-avoidance = false to load balance against
all available nodes.
3. Idempotence – Set the default idempotence for the application to true to configure the
driver to always retry failed read/write/prepare/execute requests. This is a best practice for
distributed applications that helps to handle transient failures by retrying failed requests.
4. SSL/TLS – Initialize the SSLEngineFactory by adding a section in the configuration file with a
single line that specifies the class with class = DefaultSslEngineFactory. Provide the
path to the trustStore file and the password that you created previously. Amazon Keyspaces
doesn't support hostname-validation of peers, so set this option to false.
5. Connections – Create at least 3 local connections per endpoint by setting local.size =
3. This is a best practice that helps your application to handle overhead and traffic bursts.
For more information about how to calculate how many local connections per endpoint your
application needs based on expected traffic patterns, see the section called “How to configure
connections”.
6. Retry policy – The Amazon Keyspaces retry policy
AmazonKeyspacesExponentialRetryPolicy is an alternative to the
DefaultRetryPolicy that comes with the Cassandra driver. The main difference
between the two retry policies is that you can configure the number of retry attempts
for the AmazonKeyspacesExponentialRetryPolicy to meet your needs. By default,
the number of retry attempts for the AmazonKeyspacesExponentialRetryPolicy
is set to 3. In addition, the Amazon Keyspaces retry policy doesn't return the generic
NoHostAvailableException. Instead, the Amazon Keyspaces retry policy passes back
the original exception returned by the service. For more code examples implementing retry
policies, see Amazon Keyspaces retry policies on GitHub.
7. Prepared statements – Set prepare-on-all-nodes to false to optimize network usage.
datastax-java-driver {
basic {
contact-points = [ "cassandra.us-east-2.amazonaws.com:9142"]
request {
timeout = 2 seconds
consistency = LOCAL_QUORUM
page-size = 1024
default-idempotence = true
}
load-balancing-policy {
local-datacenter = "us-east-2"
class = DefaultLoadBalancingPolicy
slow-replica-avoidance = false
}
}
advanced {
auth-provider {
class = software.aws.mcs.auth.SigV4AuthProvider
aws-region = us-east-2
}
ssl-engine-factory {
class = DefaultSslEngineFactory
truststore-path = "./src/main/resources/cassandra_truststore.jks"
truststore-password = "my_password"
hostname-validation = false
}
connection {
connect-timeout = 5 seconds
max-requests-per-connection = 512
pool {
local.size = 3
}
}
retry-policy {
class = com.aws.ssa.keyspaces.retry.AmazonKeyspacesExponentialRetryPolicy
max-attempts = 3
min-wait = 10 milliseconds
max-wait = 100 milliseconds
}
prepared-statements {
prepare-on-all-nodes = false
}
}
}
Note
Instead of adding the path to the trustStore in the configuration file, you can also set the
trustStore path directly in the application code, or you can add the path to the trustStore to
your JVM arguments.
Step 3: Run the application
This code example shows a simple command line application that creates a connection pool
to Amazon Keyspaces by using the configuration file we created earlier. It confirms that the
connection is established by running a simple query.
package <your package>;
// add the following imports to your project
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.config.DriverConfigLoader;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;
public class App
{
public static void main( String[] args )
{
//Use DriverConfigLoader to load your configuration file
DriverConfigLoader loader =
DriverConfigLoader.fromClasspath("application.conf");
try (CqlSession session = CqlSession.builder()
.withConfigLoader(loader)
.build()) {
ResultSet rs = session.execute("select * from system_schema.keyspaces");
Row row = rs.one();
System.out.println(row.getString("keyspace_name"));
}
}
}
Note
Use a try-with-resources block to establish the connection to ensure that it's always closed. If
you don't use one, remember to close your connection to avoid leaking resources.
Connect to Amazon Keyspaces using the 3.x DataStax Java driver for Apache
Cassandra and the SigV4 authentication plugin
The following section describes how to use the SigV4 authentication plugin for the 3.x open-source
DataStax Java driver for Apache Cassandra to access Amazon Keyspaces. The plugin is available
from the GitHub repository.
The SigV4 authentication plugin allows you to use IAM credentials for users and roles when
connecting to Amazon Keyspaces. Instead of requiring a user name and password, this plugin signs
API requests using access keys. For more information, see the section called “Create IAM credentials
for AWS authentication”.
Step 1: Prerequisites
To run this code sample, you first need to complete the following tasks.
Create credentials for your IAM user or role following the steps at the section called “Create IAM
credentials for AWS authentication”. This tutorial assumes that the access keys are stored as
environment variables. For more information, see the section called “Manage access keys”.
Follow the steps at the section called “Before you begin” to download the Starfield digital
certificate, convert it to a trustStore file, and attach the trustStore file in the JVM arguments to
your application.
Add the DataStax Java driver for Apache Cassandra to your Java project. Ensure that you're using
a version of the driver that supports Apache Cassandra 3.11.2. For more information, see the
DataStax Java Driver for Apache Cassandra documentation.
Add the authentication plugin to your application. The authentication plugin supports version
3.x of the DataStax Java driver for Apache Cassandra. If you’re using Apache Maven, or a build
system that can use Maven dependencies, add the following dependencies to your pom.xml file.
Replace the version of the plugin with the latest version as shown in the GitHub repository.
<dependency>
<groupId>software.aws.mcs</groupId>
<artifactId>aws-sigv4-auth-cassandra-java-driver-plugin_3</artifactId>
<version>3.0.3</version>
</dependency>
Step 2: Run the application
This code example shows a simple command line application that creates a connection pool to
Amazon Keyspaces. It confirms that the connection is established by running a simple query.
package <your package>;
// add the following imports to your project
import software.aws.mcs.auth.SigV4AuthProvider;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
public class App
{
public static void main( String[] args )
{
String endPoint = "cassandra.us-east-2.amazonaws.com";
int portNumber = 9142;
Session session = Cluster.builder()
.addContactPoint(endPoint)
.withPort(portNumber)
.withAuthProvider(new SigV4AuthProvider("us-east-2"))
.withSSL()
.build()
.connect();
ResultSet rs = session.execute("select * from system_schema.keyspaces");
Row row = rs.one();
System.out.println(row.getString("keyspace_name"));
}
}
Usage notes:
For a list of available endpoints, see the section called “Service endpoints”.
See the following repository for helpful Java driver policies, examples, and best practices when
using the Java driver with Amazon Keyspaces: https://github.com/aws-samples/amazon-keyspaces-java-driver-helpers.
Using a Cassandra Python client driver to access Amazon Keyspaces
programmatically
In this section, we show you how to connect to Amazon Keyspaces using a Python client driver.
To provide users and applications with credentials for programmatic access to Amazon Keyspaces
resources, you can do either of the following:
Create service-specific credentials that are associated with a specific AWS Identity and Access
Management (IAM) user.
For enhanced security, we recommend creating IAM access keys for IAM users or roles that are
used across all AWS services. The Amazon Keyspaces SigV4 authentication plugin for Cassandra
client drivers enables you to authenticate calls to Amazon Keyspaces using IAM access keys
instead of user name and password. For more information, see the section called “Create IAM
credentials for AWS authentication”.
Topics
Before you begin
Connect to Amazon Keyspaces using the Python driver for Apache Cassandra and service-specific
credentials
Connect to Amazon Keyspaces using the DataStax Python driver for Apache Cassandra and the
SigV4 authentication plugin
Before you begin
You need to complete the following task before you can start.
Amazon Keyspaces requires the use of Transport Layer Security (TLS) to help secure connections
with clients. To connect to Amazon Keyspaces using TLS, you need to download a Starfield digital
certificate and configure the Python driver to use TLS.
Download the Starfield digital certificate using the following command and save sf-class2-root.crt locally or in your home directory.
curl https://certs.secureserver.net/repository/sf-class2-root.crt -O
Note
You can also use the Amazon digital certificate to connect to Amazon Keyspaces and
can continue to do so if your client is connecting to Amazon Keyspaces successfully. The
Starfield certificate provides additional backwards compatibility for clients using older
certificate authorities.
Connect to Amazon Keyspaces using the Python driver for Apache Cassandra and
service-specific credentials
The following code example shows you how to connect to Amazon Keyspaces with a Python client
driver and service-specific credentials.
from cassandra.cluster import Cluster
from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED
from cassandra.auth import PlainTextAuthProvider
ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations('path_to_file/sf-class2-root.crt')
ssl_context.verify_mode = CERT_REQUIRED
auth_provider = PlainTextAuthProvider(username='ServiceUserName',
password='ServicePassword')
cluster = Cluster(['cassandra.us-east-2.amazonaws.com'], ssl_context=ssl_context,
auth_provider=auth_provider, port=9142)
session = cluster.connect()
r = session.execute('select * from system_schema.keyspaces')
print(r.current_rows)
Usage notes:
1. Replace "path_to_file/sf-class2-root.crt" with the path to the certificate saved in the
first step.
2. Ensure that the ServiceUserName and ServicePassword match the user name and password
you obtained when you generated the service-specific credentials by following the steps to
Create service-specific credentials for programmatic access to Amazon Keyspaces.
3. For a list of available endpoints, see the section called “Service endpoints”.
Connect to Amazon Keyspaces using the DataStax Python driver for Apache
Cassandra and the SigV4 authentication plugin
The following section shows how to use the SigV4 authentication plugin for the open-source
DataStax Python driver for Apache Cassandra to access Amazon Keyspaces (for Apache Cassandra).
If you haven't already done so, begin by creating credentials for your IAM role following the steps
at the section called “Create IAM credentials for AWS authentication”. This tutorial uses temporary
credentials, which require an IAM role. For more information about temporary credentials, see the
section called “Create temporary credentials to connect to Amazon Keyspaces”.
Then, add the Python SigV4 authentication plugin to your environment from the GitHub
repository.
pip install cassandra-sigv4
The following code example shows how to connect to Amazon Keyspaces by using the open-source
DataStax Python driver for Cassandra and the SigV4 authentication plugin. The plugin depends on
the AWS SDK for Python (Boto3). It uses boto3.session to obtain temporary credentials.
from cassandra.cluster import Cluster
from ssl import SSLContext, PROTOCOL_TLSv1_2 , CERT_REQUIRED
from cassandra.auth import PlainTextAuthProvider
import boto3
from cassandra_sigv4.auth import SigV4AuthProvider
ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations('path_to_file/sf-class2-root.crt')
ssl_context.verify_mode = CERT_REQUIRED
# Use this if you want to use Boto3 to set the session parameters.
boto_session = boto3.Session(aws_access_key_id="AKIAIOSFODNN7EXAMPLE",
                             aws_secret_access_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
                             aws_session_token="AQoDYXdzEJr...<remainder of token>",
                             region_name="us-east-2")
auth_provider = SigV4AuthProvider(boto_session)

# Use this instead of the lines above if you want to use the default credentials
# and not bother with a session.
# auth_provider = SigV4AuthProvider()
cluster = Cluster(['cassandra.us-east-2.amazonaws.com'], ssl_context=ssl_context,
auth_provider=auth_provider,
port=9142)
session = cluster.connect()
r = session.execute('select * from system_schema.keyspaces')
print(r.current_rows)
Usage notes:
1. Replace "path_to_file/sf-class2-root.crt" with the path to the certificate saved in the
first step.
2. Ensure that the aws_access_key_id, aws_secret_access_key, and the
aws_session_token match the Access Key, Secret Access Key, and Session Token
you obtained using boto3.session. For more information, see Credentials in the AWS SDK for
Python (Boto3).
3. For a list of available endpoints, see the section called “Service endpoints”.
Using a Cassandra Node.js client driver to access Amazon Keyspaces
programmatically
This section shows you how to connect to Amazon Keyspaces by using a Node.js client driver. To
provide users and applications with credentials for programmatic access to Amazon Keyspaces
resources, you can do either of the following:
Create service-specific credentials that are associated with a specific AWS Identity and Access
Management (IAM) user.
For enhanced security, we recommend creating IAM access keys for IAM users or roles that are
used across all AWS services. The Amazon Keyspaces SigV4 authentication plugin for Cassandra
client drivers enables you to authenticate calls to Amazon Keyspaces using IAM access keys
instead of user name and password. For more information, see the section called “Create IAM
credentials for AWS authentication”.
Topics
Before you begin
Connect to Amazon Keyspaces using the Node.js DataStax driver for Apache Cassandra and
service-specific credentials
Connect to Amazon Keyspaces using the DataStax Node.js driver for Apache Cassandra and the
SigV4 authentication plugin
Before you begin
You need to complete the following task before you can start.
Amazon Keyspaces requires the use of Transport Layer Security (TLS) to help secure connections
with clients. To connect to Amazon Keyspaces using TLS, you need to download a Starfield digital
certificate and configure the Node.js driver to use TLS.
Download the Starfield digital certificate using the following command and save sf-class2-root.crt locally or in your home directory.
curl https://certs.secureserver.net/repository/sf-class2-root.crt -O
Note
You can also use the Amazon digital certificate to connect to Amazon Keyspaces and
can continue to do so if your client is connecting to Amazon Keyspaces successfully. The
Starfield certificate provides additional backwards compatibility for clients using older
certificate authorities.
Connect to Amazon Keyspaces using the Node.js DataStax driver for Apache
Cassandra and service-specific credentials
Configure your driver to use the Starfield digital certificate for TLS and authenticate using service-
specific credentials. For example:
const cassandra = require('cassandra-driver');
const fs = require('fs');
const auth = new cassandra.auth.PlainTextAuthProvider('ServiceUserName',
'ServicePassword');
const sslOptions1 = {
ca: [
fs.readFileSync('path_to_file/sf-class2-root.crt', 'utf-8')],
host: 'cassandra.us-west-2.amazonaws.com',
rejectUnauthorized: true
};
const client = new cassandra.Client({
contactPoints: ['cassandra.us-west-2.amazonaws.com'],
localDataCenter: 'us-west-2',
authProvider: auth,
sslOptions: sslOptions1,
protocolOptions: { port: 9142 }
});
const query = 'SELECT * FROM system_schema.keyspaces';
client.execute(query)
.then( result => console.log('Row from Keyspaces %s',
result.rows[0]))
.catch( e=> console.log(`${e}`));
Usage notes:
1. Replace "path_to_file/sf-class2-root.crt" with the path to the certificate saved in the
first step.
2. Ensure that the ServiceUserName and ServicePassword match the user name and password
you obtained when you generated the service-specific credentials by following the steps to
Create service-specific credentials for programmatic access to Amazon Keyspaces.
3. For a list of available endpoints, see the section called “Service endpoints”.
Connect to Amazon Keyspaces using the DataStax Node.js driver for Apache
Cassandra and the SigV4 authentication plugin
The following section shows how to use the SigV4 authentication plugin for the open-source
DataStax Node.js driver for Apache Cassandra to access Amazon Keyspaces (for Apache Cassandra).
If you haven't already done so, create credentials for your IAM user or role following the steps at
the section called “Create IAM credentials for AWS authentication”.
Add the Node.js SigV4 authentication plugin to your application from the GitHub repository. The
plugin supports version 4.x of the DataStax Node.js driver for Cassandra and depends on the AWS
SDK for Node.js. It uses AWSCredentialsProvider to obtain credentials.
$ npm install aws-sigv4-auth-cassandra-plugin --save
This code example shows how to set a Region-specific instance of SigV4AuthProvider as the
authentication provider.
const cassandra = require('cassandra-driver');
const fs = require('fs');
const sigV4 = require('aws-sigv4-auth-cassandra-plugin');
const auth = new sigV4.SigV4AuthProvider({
region: 'us-west-2',
accessKeyId:'AKIAIOSFODNN7EXAMPLE',
secretAccessKey: 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'});
const sslOptions1 = {
ca: [
fs.readFileSync('path_to_file/sf-class2-root.crt', 'utf-8')],
host: 'cassandra.us-west-2.amazonaws.com',
rejectUnauthorized: true
};
const client = new cassandra.Client({
contactPoints: ['cassandra.us-west-2.amazonaws.com'],
localDataCenter: 'us-west-2',
authProvider: auth,
sslOptions: sslOptions1,
protocolOptions: { port: 9142 }
});
const query = 'SELECT * FROM system_schema.keyspaces';
client.execute(query).then(
result => console.log('Row from Keyspaces %s', result.rows[0]))
.catch( e=> console.log(`${e}`));
Usage notes:
1. Replace "path_to_file/sf-class2-root.crt" with the path to the certificate saved in the
first step.
2. Ensure that the accessKeyId and secretAccessKey match the Access Key and Secret
Access Key you obtained using AWSCredentialsProvider. For more information, see Setting
Credentials in Node.js in the AWS SDK for JavaScript in Node.js.
3. To store access keys outside of code, see best practices at the section called “Manage access
keys”.
4. For a list of available endpoints, see the section called “Service endpoints”.
Using a Cassandra .NET Core client driver to access Amazon Keyspaces
programmatically
This section shows you how to connect to Amazon Keyspaces by using a .NET Core client driver.
The setup steps will vary depending on your environment and operating system; you might have
to modify them accordingly. Amazon Keyspaces requires the use of Transport Layer Security (TLS)
to help secure connections with clients. To connect to Amazon Keyspaces using TLS, you need to
download a Starfield digital certificate and configure your driver to use TLS.
1. Download the Starfield certificate and save it to a local directory, taking note of the path.
Following is an example using PowerShell.
$client = new-object System.Net.WebClient
$client.DownloadFile("https://certs.secureserver.net/repository/sf-class2-root.crt","path_to_file\sf-class2-root.crt")
2. Install the CassandraCSharpDriver through NuGet, using the NuGet Package Manager Console.
PM> Install-Package CassandraCSharpDriver
3. The following example uses a .NET Core C# console project to connect to Amazon Keyspaces and
run a query.
using Cassandra;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Security;
using System.Runtime.ConstrainedExecution;
using System.Security.Cryptography.X509Certificates;
using System.Text;
using System.Threading.Tasks;
namespace CSharpKeyspacesExample
{
class Program
{
public Program(){}
static void Main(string[] args)
{
X509Certificate2Collection certCollection = new X509Certificate2Collection();
X509Certificate2 amazoncert = new X509Certificate2(@"path_to_file\sf-class2-root.crt");
var userName = "ServiceUserName";
var pwd = "ServicePassword";
certCollection.Add(amazoncert);
var awsEndpoint = "cassandra.us-east-2.amazonaws.com";
var cluster = Cluster.Builder()
.AddContactPoints(awsEndpoint)
.WithPort(9142)
.WithAuthProvider(new PlainTextAuthProvider(userName, pwd))
.WithSSL(new
SSLOptions().SetCertificateCollection(certCollection))
.Build();
var session = cluster.Connect();
var rs = session.Execute("SELECT * FROM system_schema.tables;");
foreach (var row in rs)
{
var name = row.GetValue<String>("keyspace_name");
Console.WriteLine(name);
}
}
}
}
Usage notes:
a. Replace "path_to_file/sf-class2-root.crt" with the path to the certificate saved in
the first step.
b. Ensure that the ServiceUserName and ServicePassword match the user name and
password you obtained when you generated the service-specific credentials by following the
steps to Create service-specific credentials for programmatic access to Amazon Keyspaces.
c. For a list of available endpoints, see the section called “Service endpoints”.
Using a Cassandra Go client driver to access Amazon Keyspaces
programmatically
This section shows you how to connect to Amazon Keyspaces by using a Go client driver. To provide
users and applications with credentials for programmatic access to Amazon Keyspaces resources,
you can do either of the following:
Create service-specific credentials that are associated with a specific AWS Identity and Access
Management (IAM) user.
For enhanced security, we recommend creating IAM access keys for IAM users and roles that are
used across all AWS services. The Amazon Keyspaces SigV4 authentication plugin for Cassandra
client drivers enables you to authenticate calls to Amazon Keyspaces using IAM access keys
instead of user name and password. For more information, see the section called “Create IAM
credentials for AWS authentication”.
Topics
Before you begin
Connect to Amazon Keyspaces using the Gocql driver for Apache Cassandra and service-specific
credentials
Connect to Amazon Keyspaces using the Go driver for Apache Cassandra and the SigV4
authentication plugin
Before you begin
You need to complete the following task before you can start.
Amazon Keyspaces requires the use of Transport Layer Security (TLS) to help secure connections
with clients. To connect to Amazon Keyspaces using TLS, you need to download a Starfield digital
certificate and configure the Go driver to use TLS.
Download the Starfield digital certificate using the following command and save sf-class2-root.crt locally or in your home directory.
curl https://certs.secureserver.net/repository/sf-class2-root.crt -O
Note
You can also use the Amazon digital certificate to connect to Amazon Keyspaces and
can continue to do so if your client is connecting to Amazon Keyspaces successfully. The
Starfield certificate provides additional backwards compatibility for clients using older
certificate authorities.
Connect to Amazon Keyspaces using the Gocql driver for Apache Cassandra and
service-specific credentials
1. Create a directory for your application.
mkdir ./gocqlexample
2. Navigate to the new directory.
cd gocqlexample
3. Create a file for your application.
touch cqlapp.go
4. Download the Go driver.
go get github.com/gocql/gocql
5. Add the following sample code to the cqlapp.go file.
package main
import (
"fmt"
"github.com/gocql/gocql"
"log"
)
func main() {
// add the Amazon Keyspaces service endpoint
cluster := gocql.NewCluster("cassandra.us-east-2.amazonaws.com")
cluster.Port=9142
// add your service specific credentials
cluster.Authenticator = gocql.PasswordAuthenticator{
Username: "ServiceUserName",
Password: "ServicePassword"}
// provide the path to the sf-class2-root.crt
cluster.SslOpts = &gocql.SslOptions{
CaPath: "path_to_file/sf-class2-root.crt",
EnableHostVerification: false,
}
// Override default Consistency to LocalQuorum
cluster.Consistency = gocql.LocalQuorum
cluster.DisableInitialHostLookup = false
session, err := cluster.CreateSession()
if err != nil {
fmt.Println("err>", err)
return
}
defer session.Close()
// run a sample query from the system keyspace
var text string
iter := session.Query("SELECT keyspace_name FROM system_schema.tables;").Iter()
for iter.Scan(&text) {
fmt.Println("keyspace_name:", text)
}
if err := iter.Close(); err != nil {
log.Fatal(err)
}
}
Usage notes:
a. Replace "path_to_file/sf-class2-root.crt" with the path to the certificate saved in
the first step.
b. Ensure that the ServiceUserName and ServicePassword match the user name and
password you obtained when you generated the service-specific credentials by following the
steps to Create service-specific credentials for programmatic access to Amazon Keyspaces.
c. For a list of available endpoints, see the section called “Service endpoints”.
6. Build the program.
go build cqlapp.go
7. Run the program.
./cqlapp
Connect to Amazon Keyspaces using the Go driver for Apache Cassandra and the
SigV4 authentication plugin
The following code sample shows how to use the SigV4 authentication plugin for the open-source
Go driver to access Amazon Keyspaces (for Apache Cassandra).
If you haven't already done so, create credentials for your IAM user or role following the steps at
the section called “Create IAM credentials for AWS authentication”.
Add the Go SigV4 authentication plugin to your application from the GitHub repository. The plugin
supports version 1.2.x of the open-source Go driver for Cassandra and depends on the AWS SDK for
Go.
$ go mod init
$ go get github.com/aws/aws-sigv4-auth-cassandra-gocql-driver-plugin
In this code sample, the Amazon Keyspaces endpoint is represented by the Cluster class. It uses
the AwsAuthenticator for the authenticator property of the cluster to obtain credentials.
package main
import (
"fmt"
"github.com/aws/aws-sigv4-auth-cassandra-gocql-driver-plugin/sigv4"
"github.com/gocql/gocql"
"log"
)
func main() {
// configuring the cluster options
cluster := gocql.NewCluster("cassandra.us-west-2.amazonaws.com")
cluster.Port=9142
var auth sigv4.AwsAuthenticator = sigv4.NewAwsAuthenticator()
auth.Region = "us-west-2"
auth.AccessKeyId = "AKIAIOSFODNN7EXAMPLE"
auth.SecretAccessKey = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
cluster.Authenticator = auth
cluster.SslOpts = &gocql.SslOptions{
CaPath: "path_to_file/sf-class2-root.crt",
EnableHostVerification: false,
}
cluster.Consistency = gocql.LocalQuorum
cluster.DisableInitialHostLookup = false
session, err := cluster.CreateSession()
if err != nil {
fmt.Println("err>", err)
return
}
defer session.Close()
// doing the query
var text string
iter := session.Query("SELECT keyspace_name FROM system_schema.tables;").Iter()
for iter.Scan(&text) {
fmt.Println("keyspace_name:", text)
}
if err := iter.Close(); err != nil {
log.Fatal(err)
}
}
Usage notes:
1. Replace "path_to_file/sf-class2-root.crt" with the path to the certificate saved in the
first step.
2. Ensure that the AccessKeyId and SecretAccessKey match the access key and secret access
key you obtained using AwsAuthenticator. For more information, see Configuring the AWS
SDK for Go in the AWS SDK for Go.
3. To store access keys outside of code, see best practices at the section called “Manage access
keys”.
4. For a list of available endpoints, see the section called “Service endpoints”.
Using a Cassandra Perl client driver to access Amazon Keyspaces
programmatically
This section shows you how to connect to Amazon Keyspaces by using a Perl client driver. For this
code sample, we used Perl 5. Amazon Keyspaces requires the use of Transport Layer Security (TLS)
to help secure connections with clients.
Important
To create a secure connection, our code samples use the Starfield digital certificate to
authenticate the server before establishing the TLS connection. The Perl driver doesn't
validate the server's Amazon SSL certificate, which means that you can't confirm that you
are connecting to Amazon Keyspaces. The second step, configuring the driver to use TLS
when connecting to Amazon Keyspaces, is still required and ensures that data transferred
between the client and server is encrypted.
1. Download the Cassandra DBI driver from https://metacpan.org/pod/DBD::Cassandra and install
the driver to your Perl environment. The exact steps depend on the environment. The following
is a common example.
cpanm DBD::Cassandra
2. Create a file for your application.
touch cqlapp.pl
3. Add the following sample code to the cqlapp.pl file.
use DBI;
my $user = "ServiceUserName";
my $password = "ServicePassword";
my $db = DBI->connect("dbi:Cassandra:host=cassandra.us-
east-2.amazonaws.com;port=9142;tls=1;",
$user, $password);
my $rows = $db->selectall_arrayref("select * from system_schema.keyspaces");
print "Found the following Keyspaces...\n";
for my $row (@$rows) {
print $row->[0], "\n"; # keyspace_name is the first column
}
$db->disconnect;
Important
Ensure that the ServiceUserName and ServicePassword match the user name and
password you obtained when you generated the service-specific credentials by following
the steps to Create service-specific credentials for programmatic access to Amazon
Keyspaces.
Note
For a list of available endpoints, see the section called “Service endpoints”.
4. Run the application.
perl cqlapp.pl
Use a step-by-step tutorial to connect to Amazon Keyspaces
This topic includes various tutorials that show the detailed steps required to connect to Amazon
Keyspaces programmatically. The tutorials cover different popular cross-service access scenarios
and demonstrate how to integrate Amazon Keyspaces with other AWS services, for example
Amazon Virtual Private Cloud or Amazon Elastic Kubernetes Service, and with other open-source
technologies like Apache Spark. For step-by-step tutorials that demonstrate how to connect
to Amazon Keyspaces using different Apache Cassandra drivers, see the section called “Using a
Cassandra client driver”.
Topics
Tutorial: Connecting to Amazon Keyspaces using an interface VPC endpoint
Connecting to Amazon Keyspaces with Apache Spark
Tutorial: Connecting to Amazon Keyspaces from Amazon Elastic Kubernetes Service
Tutorial: Connecting to Amazon Keyspaces using an interface VPC
endpoint
This tutorial walks you through setting up and using an interface VPC endpoint for Amazon
Keyspaces.
Interface VPC endpoints enable private communication between your virtual private cloud (VPC)
running in Amazon VPC and Amazon Keyspaces. Interface VPC endpoints are powered by AWS
PrivateLink, which is an AWS service that enables private communication between VPCs and AWS
services. For more information, see the section called “Using interface VPC endpoints”.
Topics
Tutorial prerequisites and considerations
Step 1: Launch an Amazon EC2 instance
Step 2: Configure your Amazon EC2 instance
Step 3: Create a VPC endpoint for Amazon Keyspaces
Step 4: Configure permissions for the VPC endpoint connection
Step 5: Configure monitoring with CloudWatch
Step 6: (Optional) Best practices to configure the connection pool size for your application
Step 7: (Optional) Clean up
Tutorial prerequisites and considerations
Before you start this tutorial, follow the AWS setup instructions in Accessing Amazon Keyspaces
(for Apache Cassandra). These steps include signing up for AWS and creating an AWS Identity and
Access Management (IAM) principal with access to Amazon Keyspaces. Take note of the name of
the IAM user and the access keys because you'll need them later in this tutorial.
Create a keyspace with the name myKeyspace and at least one table to test the connection using
the VPC endpoint later in this tutorial. You can find detailed instructions in Getting started.
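If you prefer the AWS CLI over the console for this step, a minimal sketch of the prerequisite resources might look like the following; the table name myTable and its single-column schema are just examples.
aws keyspaces create-keyspace --keyspace-name 'myKeyspace'
aws keyspaces create-table --keyspace-name 'myKeyspace' --table-name 'myTable'
--schema-definition 'allColumns=[{name=id,type=int}],partitionKeys=[{name=id}]'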
After completing the prerequisite steps, proceed to Step 1: Launch an Amazon EC2 instance.
Step 1: Launch an Amazon EC2 instance
In this step, you launch an Amazon EC2 instance in your default Amazon VPC. You can then create
and use a VPC endpoint for Amazon Keyspaces.
To launch an Amazon EC2 instance
1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
2. Choose Launch Instance and do the following:
From the EC2 console dashboard, in the Launch instance box, choose Launch instance, and
then choose Launch instance from the options that appear.
Under Name and tags, for Name, enter a descriptive name for your instance.
Under Application and OS Images (Amazon Machine Image):
Choose Quick Start, and then choose Ubuntu. This is the operating system (OS) for your
instance.
Under Amazon Machine Image (AMI), you can use the default image that is marked as
Free tier eligible. An Amazon Machine Image (AMI) is a basic configuration that serves as a
template for your instance.
Under Instance Type:
From the Instance type list, choose the t2.micro instance type, which is selected by default.
Under Key pair (login), for Key pair name, choose one of the following options for this
tutorial:
If you don't have an Amazon EC2 key pair, choose Create a new key pair and follow the
instructions. You will be asked to download a private key file (.pem file). You will need this
file later when you log in to your Amazon EC2 instance, so take note of the file path.
If you already have an existing Amazon EC2 key pair, go to Select a key pair and choose your
key pair from the list. You must already have the private key file ( .pem file) available in order
to log in to your Amazon EC2 instance.
Under Network Settings:
Choose Edit.
Choose Select an existing security group.
In the list of security groups, choose default. This is the default security group for your VPC.
Continue to Summary.
Review a summary of your instance configuration in the Summary panel. When you're ready,
choose Launch instance.
3. On the completion screen for the new Amazon EC2 instance, choose the Connect to instance
tile. The next screen shows the necessary information and the required steps to connect to
your new instance. Take note of the following information:
The sample command to protect the key file
The connection string
The Public IPv4 DNS name
After taking note of the information on this page, you can continue to the next step in this
tutorial (Step 2: Configure your Amazon EC2 instance).
Note
It takes a few minutes for your Amazon EC2 instance to become available. Before you go
on to the next step, ensure that the Instance State is running and that all of its Status
Checks have passed.
Step 2: Configure your Amazon EC2 instance
When your Amazon EC2 instance is available, you can log into it and prepare it for first use.
Note
The following steps assume that you're connecting to your Amazon EC2 instance from a
computer running Linux. For other ways to connect, see Connect to your Linux instance in
the Amazon EC2 User Guide.
To configure your Amazon EC2 instance
1. You need to authorize inbound SSH traffic to your Amazon EC2 instance. To do this, create a
new EC2 security group, and then assign the security group to your EC2 instance.
a. In the navigation pane, choose Security Groups.
b. Choose Create Security Group. In the Create Security Group window, do the following:
Security group name – Enter a name for your security group. For example: my-ssh-
access
Description – Enter a short description for the security group.
VPC – Choose your default VPC.
In the Inbound rules section, choose Add Rule and do the following:
Type – Choose SSH.
Source – Choose My IP.
Choose Add rule.
On the bottom of the page, confirm the configuration settings and choose Create
Security Group.
c. In the navigation pane, choose Instances.
d. Choose the Amazon EC2 instance that you launched in Step 1: Launch an Amazon EC2
instance.
e. Choose Actions, choose Security, and then choose Change Security Groups.
f. In Change Security Groups, select the security group that you created earlier in this
procedure (for example, my-ssh-access). The existing default security group should
also be selected. Confirm the configuration settings and choose Assign Security Groups.
2. Use the following command to protect your private key file from access. If you skip this step,
the connection fails.
chmod 400 path_to_file/my-keypair.pem
3. Use the ssh command to log in to your Amazon EC2 instance, as in the following example.
ssh -i path_to_file/my-keypair.pem ubuntu@public-dns-name
You need to specify your private key file (.pem file) and the public DNS name of your instance.
(See Step 1: Launch an Amazon EC2 instance).
The login ID is ubuntu. No password is required.
For more information about allowing connections to your Amazon EC2 instance and for AWS
CLI instructions, see Authorize inbound traffic for your Linux instances in the Amazon EC2 User
Guide.
4. Download and install the latest version of the AWS Command Line Interface.
a. Install unzip.
sudo apt install unzip
b. Download the zip file with the AWS CLI.
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
c. Unzip the file.
unzip awscliv2.zip
d. Install the AWS CLI.
sudo ./aws/install
e. Confirm the version of the AWS CLI installation.
aws --version
The output should look like this:
aws-cli/2.9.19 Python/3.9.11 Linux/5.15.0-1028-aws exe/x86_64.ubuntu.22 prompt/off
5. Configure your AWS credentials, as shown in the following example. Enter your AWS access key
ID, secret key, and default Region name when prompted.
aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-east-1
Default output format [None]:
6. You have to use a cqlsh connection to Amazon Keyspaces to confirm that your VPC endpoint
has been configured correctly. If you use your local environment or the Amazon Keyspaces CQL
editor in the AWS Management Console, the connection automatically goes through the public
endpoint instead of your VPC endpoint. To use cqlsh to test your VPC endpoint connection in
this tutorial, complete the setup instructions in Using cqlsh to connect to Amazon Keyspaces.
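For example, after completing that setup on the EC2 instance, a connection test might look like the following sketch (cqlsh-expansion is the AWS-provided cqlsh variant; with private DNS enabled on the VPC endpoint, the service hostname resolves to the endpoint's private IP addresses).
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl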
You are now ready to create a VPC endpoint for Amazon Keyspaces.
Step 3: Create a VPC endpoint for Amazon Keyspaces
In this step, you create a VPC endpoint for Amazon Keyspaces using the AWS CLI. To create the VPC
endpoint using the VPC console, you can follow the Create a VPC endpoint instructions in the AWS
PrivateLink Guide. When filtering for the Service name, enter Cassandra.
To create a VPC endpoint using the AWS CLI
1. Before you begin, verify that you can communicate with Amazon Keyspaces using its public
endpoint.
aws keyspaces list-tables --keyspace-name 'myKeyspace'
The output shows a list of Amazon Keyspaces tables that are contained in the specified
keyspace. If you don't have any tables, the list is empty.
{
    "tables": [
        {
            "keyspaceName": "myKeyspace",
            "tableName": "myTable1",
            "resourceArn": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/myKeyspace/table/myTable1"
        },
        {
            "keyspaceName": "myKeyspace",
            "tableName": "myTable2",
            "resourceArn": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/myKeyspace/table/myTable2"
        }
    ]
}
2. Verify that Amazon Keyspaces is an available service for creating VPC endpoints in the current
AWS Region. (The command is followed by example output.)
aws ec2 describe-vpc-endpoint-services
{
"ServiceNames": [
"com.amazonaws.us-east-1.cassandra",
"com.amazonaws.us-east-1.cassandra-fips"
]
}
In the example output, Amazon Keyspaces is one of the services available, so you can proceed
with creating a VPC endpoint for it.
3. Determine your VPC identifier.
aws ec2 describe-vpcs
{
"Vpcs": [
{
"VpcId": "vpc-a1234bcd",
"InstanceTenancy": "default",
"State": "available",
"DhcpOptionsId": "dopt-8454b7e1",
"CidrBlock": "111.31.0.0/16",
"IsDefault": true
}
]
}
In the example output, the VPC ID is vpc-a1234bcd.
4. Use a filter to gather details about the subnets of the VPC.
aws ec2 describe-subnets --filters "Name=vpc-id,Values=vpc-a1234bcd"
{
"Subnets":[
{
"AvailabilityZone":"us-east-1a",
"AvailabilityZoneId":"use2-az1",
"AvailableIpAddressCount":4085,
"CidrBlock":"111.31.0.0/20",
"DefaultForAz":true,
"MapPublicIpOnLaunch":true,
"MapCustomerOwnedIpOnLaunch":false,
"State":"available",
"SubnetId":"subnet-920aacf9",
"VpcId":"vpc-a1234bcd",
"OwnerId":"111122223333",
"AssignIpv6AddressOnCreation":false,
"Ipv6CidrBlockAssociationSet":[
],
"SubnetArn":"arn:aws:ec2:us-east-1:111122223333:subnet/subnet-920aacf9",
"EnableDns64":false,
"Ipv6Native":false,
"PrivateDnsNameOptionsOnLaunch":{
"HostnameType":"ip-name",
"EnableResourceNameDnsARecord":false,
"EnableResourceNameDnsAAAARecord":false
}
},
{
"AvailabilityZone":"us-east-1c",
"AvailabilityZoneId":"use2-az3",
"AvailableIpAddressCount":4085,
"CidrBlock":"111.31.32.0/20",
"DefaultForAz":true,
"MapPublicIpOnLaunch":true,
"MapCustomerOwnedIpOnLaunch":false,
"State":"available",
"SubnetId":"subnet-4c713600",
"VpcId":"vpc-a1234bcd",
"OwnerId":"111122223333",
"AssignIpv6AddressOnCreation":false,
"Ipv6CidrBlockAssociationSet":[
],
"SubnetArn":"arn:aws:ec2:us-east-1:111122223333:subnet/subnet-4c713600",
"EnableDns64":false,
"Ipv6Native":false,
"PrivateDnsNameOptionsOnLaunch":{
"HostnameType":"ip-name",
"EnableResourceNameDnsARecord":false,
"EnableResourceNameDnsAAAARecord":false
}
},
{
"AvailabilityZone":"us-east-1b",
"AvailabilityZoneId":"use2-az2",
"AvailableIpAddressCount":4086,
"CidrBlock":"111.31.16.0/20",
"DefaultForAz":true,
"MapPublicIpOnLaunch":true,
}
]
}
In the example output, there are two available subnet IDs: subnet-920aacf9 and
subnet-4c713600.
5. Create the VPC endpoint. For the --vpc-id parameter, specify the VPC ID from the previous
step. For the --subnet-id parameter, specify the subnet IDs from the previous step. Use
the --vpc-endpoint-type parameter to define the endpoint as an interface. For more
information about the command, see create-vpc-endpoint in the AWS CLI Command
Reference.
aws ec2 create-vpc-endpoint --vpc-endpoint-type Interface --vpc-id vpc-a1234bcd
--service-name com.amazonaws.us-east-1.cassandra --subnet-id subnet-920aacf9
subnet-4c713600
{
"VpcEndpoint": {
"VpcEndpointId": "vpce-000ab1cdef23456789",
"VpcEndpointType": "Interface",
"VpcId": "vpc-a1234bcd",
"ServiceName": "com.amazonaws.us-east-1.cassandra",
"State": "pending",
"RouteTableIds": [],
"SubnetIds": [
"subnet-920aacf9",
"subnet-4c713600"
],
"Groups": [
{
"GroupId": "sg-ac1b0e8d",
"GroupName": "default"
}
],
"IpAddressType": "ipv4",
"DnsOptions": {
"DnsRecordIpType": "ipv4"
},
"PrivateDnsEnabled": true,
"RequesterManaged": false,
"NetworkInterfaceIds": [
"eni-043c30c78196ad82e",
"eni-06ce37e3fd878d9fa"
],
"DnsEntries": [
{
"DnsName": "vpce-000ab1cdef23456789-m2b22rtz.cassandra.us-
east-1.vpce.amazonaws.com",
"HostedZoneId": "Z7HUB22UULQXV"
},
{
"DnsName": "vpce-000ab1cdef23456789-m2b22rtz-us-
east-1a.cassandra.us-east-1.vpce.amazonaws.com",
"HostedZoneId": "Z7HUB22UULQXV"
},
{
"DnsName": "vpce-000ab1cdef23456789-m2b22rtz-us-
east-1c.cassandra.us-east-1.vpce.amazonaws.com",
"HostedZoneId": "Z7HUB22UULQXV"
},
{
"DnsName": "vpce-000ab1cdef23456789-m2b22rtz-us-
east-1b.cassandra.us-east-1.vpce.amazonaws.com",
"HostedZoneId": "Z7HUB22UULQXV"
},
{
"DnsName": "vpce-000ab1cdef23456789-m2b22rtz-us-
east-1d.cassandra.us-east-1.vpce.amazonaws.com",
"HostedZoneId": "Z7HUB22UULQXV"
},
{
"DnsName": "cassandra.us-east-1.amazonaws.com",
"HostedZoneId": "ZONEIDPENDING"
}
],
"CreationTimestamp": "2023-01-27T16:12:36.834000+00:00",
"OwnerId": "111122223333"
}
}
}
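In the example output, the new endpoint's state is pending. You can poll the state until the endpoint becomes available, as in the following sketch; the endpoint ID is the one from the example output, so replace it with your own.
aws ec2 describe-vpc-endpoints --vpc-endpoint-ids vpce-000ab1cdef23456789 \
    --query 'VpcEndpoints[0].State'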
Step 4: Configure permissions for the VPC endpoint connection
The procedures in this step demonstrate how to configure rules and permissions for using the VPC
endpoint with Amazon Keyspaces.
To configure an inbound rule for the new endpoint to allow TCP inbound traffic
1. In the Amazon VPC console, on the left-side panel, choose Endpoints and choose the endpoint
you created in the earlier step.
2. Choose Security groups and then choose the security group associated with this endpoint.
3. Choose Inbound rules and then choose Edit inbound rules.
4. Add an inbound rule with Type as CQLSH / CASSANDRA. This automatically sets the Port
range to 9142.
5. To save the new inbound rule, choose Save rules.
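If you prefer the AWS CLI, a command like the following sketch adds the same rule. The security group ID and CIDR range are taken from the example output earlier in this tutorial; replace them with your own values.
aws ec2 authorize-security-group-ingress --group-id sg-ac1b0e8d \
    --protocol tcp --port 9142 --cidr 111.31.0.0/16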
To configure IAM user permissions
1. Confirm that the IAM user used to connect to Amazon Keyspaces has the appropriate
permissions. In AWS Identity and Access Management (IAM), you can use the AWS managed
policy AmazonKeyspacesReadOnlyAccess to grant the IAM user read access to Amazon
Keyspaces.
a. Sign in to the AWS Management Console and open the IAM console at https://
console.aws.amazon.com/iam/.
b. On the IAM console dashboard, choose Users, and then choose your IAM user from the list.
c. On the Summary page, choose Add permissions.
d. Choose Attach existing policies directly.
e. From the list of policies, choose AmazonKeyspacesReadOnlyAccess, and then choose
Next: Review.
f. Choose Add permissions.
2. Verify that you can access Amazon Keyspaces through the VPC endpoint.
aws keyspaces list-tables --keyspace-name 'myKeyspace'
If you want, you can try some other AWS CLI commands for Amazon Keyspaces. For more
information, see the AWS CLI Command Reference.
Note
The minimum permissions required for an IAM user or role to access Amazon Keyspaces
are read permissions to the system table, as shown in the following policy. For more
information about policy-based permissions, see the section called “Identity-based
policy examples”.
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"cassandra:Select"
],
"Resource":[
"arn:aws:cassandra:us-east-1:555555555555:/keyspace/system*"
]
}
]
}
3. Grant the IAM user read access to the Amazon EC2 instance with the VPC.
When you use Amazon Keyspaces with VPC endpoints, you need to grant the IAM user or role
that accesses Amazon Keyspaces read-only permissions to your Amazon EC2 instance and the
VPC to gather endpoint and network interface data. Amazon Keyspaces stores this information
in the system.peers table and uses it to manage connections.
Note
The managed policies AmazonKeyspacesReadOnlyAccess_v2 and
AmazonKeyspacesFullAccess include the required permissions to let Amazon
Keyspaces access the Amazon EC2 instance to read information about available
interface VPC endpoints.
a. Sign in to the AWS Management Console and open the IAM console at https://
console.aws.amazon.com/iam/.
b. On the IAM console dashboard, choose Policies.
c. Choose Create policy, and then choose the JSON tab.
d. Copy the following policy and choose Next: Tags.
{
"Version":"2012-10-17",
"Statement":[
{
"Sid":"ListVPCEndpoints",
"Effect":"Allow",
"Action":[
"ec2:DescribeNetworkInterfaces",
"ec2:DescribeVpcEndpoints"
],
"Resource": "*"
}
]
}
e. Choose Next: Review, enter the name keyspacesVPCendpoint for the policy, and
choose Create policy.
f. On the IAM console dashboard, choose Users, and then choose your IAM user from the list.
g. On the Summary page, choose Add permissions.
h. Choose Attach existing policies directly.
i. From the list of policies, choose keyspacesVPCendpoint, and then choose Next: Review.
j. Choose Add permissions.
4. To verify that the Amazon Keyspaces system.peers table is getting updated with VPC
information, run the following query from your Amazon EC2 instance using cqlsh. If you
haven't already installed cqlsh on your Amazon EC2 instance in step 2, follow the instructions
in the section called “Using the cqlsh-expansion”.
SELECT peer FROM system.peers;
The output returns nodes with private IP addresses, depending on your VPC and subnet setup
in your AWS Region.
peer
---------------
112.11.22.123
112.11.22.124
112.11.22.125
Note
You have to use a cqlsh connection to Amazon Keyspaces to confirm that your VPC
endpoint has been configured correctly. If you use your local environment or the
Amazon Keyspaces CQL editor in the AWS Management Console, the connection
automatically goes through the public endpoint instead of your VPC endpoint. If you
see nine IP addresses, these are the entries Amazon Keyspaces automatically writes to
the system.peers table for public endpoint connections.
Step 5: Configure monitoring with CloudWatch
This step shows you how to use Amazon CloudWatch to monitor the VPC endpoint connection to
Amazon Keyspaces.
AWS PrivateLink publishes data points to CloudWatch about your interface endpoints. You can use
metrics to verify that your system is performing as expected. The AWS/PrivateLinkEndpoints
namespace in CloudWatch includes the metrics for interface endpoints. For more information, see
CloudWatch metrics for AWS PrivateLink in the AWS PrivateLink Guide.
To create a CloudWatch dashboard with VPC endpoint metrics
1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
2. In the navigation pane, choose Dashboards. Then choose Create dashboard. Enter a name for
the dashboard and choose Create.
3. Under Add widget, choose Number.
4. Under Metrics, choose AWS/PrivateLinkEndpoints.
5. Choose Endpoint Type, Service Name, VPC Endpoint ID, VPC ID.
6. Select the metrics ActiveConnections and NewConnections, and choose Create Widget.
7. Save the dashboard.
The ActiveConnections metric is defined as the number of concurrent active connections that
the endpoint received during the last one-minute period. The NewConnections metric is defined
as the number of new connections that were established through the endpoint during the last one-
minute period.
For more information about creating dashboards, see Create dashboard in the CloudWatch User
Guide.
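You can also retrieve these metrics with the AWS CLI instead of the console. The following sketch queries the ActiveConnections metric for the endpoint created in this tutorial; the dimension values and the time range are assumptions that you need to replace with your own, and you can verify the dimension names in the CloudWatch console.
aws cloudwatch get-metric-statistics --namespace AWS/PrivateLinkEndpoints \
    --metric-name ActiveConnections \
    --dimensions Name="Endpoint Type",Value=Interface \
                 Name="Service Name",Value=com.amazonaws.us-east-1.cassandra \
                 Name="VPC Endpoint Id",Value=vpce-000ab1cdef23456789 \
                 Name="VPC Id",Value=vpc-a1234bcd \
    --start-time 2023-01-27T16:00:00Z --end-time 2023-01-27T17:00:00Z \
    --period 60 --statistics Average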
Step 6: (Optional) Best practices to configure the connection pool size for your
application
In this section, we outline how to determine the ideal connection pool size based on the query
throughput requirements of your application.
Amazon Keyspaces allows a maximum of 3,000 CQL queries per second, per TCP connection, so
there's virtually no limit on the number of connections that a driver can establish with Amazon
Keyspaces. However, we recommend that you match the connection pool size to the requirements
of your application and consider the available endpoints when you're using Amazon Keyspaces with
VPC endpoint connections.
You configure the connection pool size in the client driver. For example, based on a local pool
size of 2 and a VPC interface endpoint created across 3 Availability Zones, the driver establishes
6 connections for querying (7 in total, which includes a control connection). Using these 6
connections, you can support a maximum of 18,000 CQL queries per second.
If your application needs to support 40,000 CQL queries per second, work backwards from the
number of queries that are needed to determine the required connection pool size. To support
40,000 CQL queries per second, you need to configure the local pool size to be at least 5, which
supports a minimum of 45,000 CQL queries per second.
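As a worked example, the following calculation shows how a local pool size of 5 across the 3 Availability Zones of the interface endpoint supports at least 45,000 CQL queries per second.
connections = 5 (pool size) * 3 (Availability Zones) = 15
maximum CQL queries per second = 15 connections * 3,000 queries per second = 45,000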
You can monitor whether you exceed the quota for the maximum number of operations per second, per
connection by using the PerConnectionRequestRateExceeded CloudWatch metric in the AWS/
Cassandra namespace. The PerConnectionRequestRateExceeded metric shows the number
of requests to Amazon Keyspaces that exceed the quota for the per-connection request rate.
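You can retrieve this metric with the AWS CLI, as in the following sketch. The time range is an assumption, and depending on your setup you might need to add dimensions as shown for the metric in the CloudWatch console.
aws cloudwatch get-metric-statistics --namespace AWS/Cassandra \
    --metric-name PerConnectionRequestRateExceeded \
    --start-time 2023-01-27T16:00:00Z --end-time 2023-01-27T17:00:00Z \
    --period 60 --statistics Sum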
The code examples in this step show how to estimate and configure connection pooling when
you're using interface VPC endpoints.
Java
You can configure the number of connections per pool in the Java driver. For a complete
example of a Java client driver connection, see the section called “Using a Cassandra Java client
driver”.
When the client driver is started, first the control connection is established for administrative
tasks, such as for schema and topology changes. Then the additional connections are created.
In the following example, the local pool size driver configuration is specified as 2. If the VPC
endpoint is created across 3 subnets within the VPC, this results in 7 NewConnections in
CloudWatch for the interface endpoint, as shown in the following formula.
NewConnections = 3 (VPC subnet endpoints created across) * 2 (pool size) + 1 (control connection)
datastax-java-driver {
basic.contact-points = [ "cassandra.us-east-1.amazonaws.com:9142"]
advanced.auth-provider {
class = PlainTextAuthProvider
username = "ServiceUserName"
password = "ServicePassword"
}
basic.load-balancing-policy {
local-datacenter = "us-east-1"
slow-replica-avoidance = false
}
advanced.ssl-engine-factory {
class = DefaultSslEngineFactory
truststore-path = "./src/main/resources/cassandra_truststore.jks"
truststore-password = "my_password"
hostname-validation = false
}
advanced.connection {
pool.local.size = 2
}
}
If the number of active connections doesn’t match your configured pool size (aggregation
across subnets) + 1 control connection, something is preventing the connections from being
created.
Node.js
You can configure the number of connections per pool in the Node.js driver. For a complete
example of a Node.js client driver connection, see the section called “Using a Cassandra Node.js
client driver”.
In the following code example, the local pool size driver configuration is specified as 1. If the
VPC endpoint is created across 4 subnets within the VPC, this results in 5 NewConnections in
CloudWatch for the interface endpoint, as shown in the following formula.
NewConnections = 4 (VPC subnet endpoints created across) * 1 (pool size) + 1 (control connection)
const cassandra = require('cassandra-driver');
const fs = require('fs');
const types = cassandra.types;
const auth = new cassandra.auth.PlainTextAuthProvider('ServiceUserName',
'ServicePassword');
const sslOptions1 = {
ca: [
fs.readFileSync('/home/ec2-user/sf-class2-root.crt', 'utf-8')],
host: 'cassandra.us-east-1.amazonaws.com',
rejectUnauthorized: true
};
const client = new cassandra.Client({
contactPoints: ['cassandra.us-east-1.amazonaws.com'],
localDataCenter: 'us-east-1',
pooling: { coreConnectionsPerHost: { [types.distance.local]: 1 } },
consistency: types.consistencies.localQuorum,
queryOptions: { isIdempotent: true },
authProvider: auth,
sslOptions: sslOptions1,
protocolOptions: { port: 9142 }
});
Step 7: (Optional) Clean up
If you want to delete the resources that you have created in this tutorial, follow these procedures.
To remove your VPC endpoint for Amazon Keyspaces
1. Log in to your Amazon EC2 instance.
2. Determine the VPC endpoint ID that is used for Amazon Keyspaces. The following command
returns information about all VPC endpoints in your account. In the output, look for the
endpoint whose service name contains cassandra.
aws ec2 describe-vpc-endpoints
{
"VpcEndpoint": {
"PolicyDocument": "{\"Version\":\"2008-10-17\",\"Statement\":[{\"Effect\":
\"Allow\",\"Principal\":\"*\",\"Action\":\"*\",\"Resource\":\"*\"}]}",
"VpcId": "vpc-0bbc736e",
"State": "available",
"ServiceName": "com.amazonaws.us-east-1.cassandra",
"RouteTableIds": [],
"VpcEndpointId": "vpce-9b15e2f2",
"CreationTimestamp": "2017-07-26T22:00:14Z"
}
}
In the example output, the VPC endpoint ID is vpce-9b15e2f2.
3. Delete the VPC endpoint.
aws ec2 delete-vpc-endpoints --vpc-endpoint-ids vpce-9b15e2f2
{
"Unsuccessful": []
}
The empty array [] indicates success (there were no unsuccessful requests).
To terminate your Amazon EC2 instance
1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
2. In the navigation pane, choose Instances.
3. Choose your Amazon EC2 instance.
4. Choose Actions, choose Instance State, and then choose Terminate.
5. In the confirmation window, choose Yes, Terminate.
Connecting to Amazon Keyspaces with Apache Spark
Apache Spark is an open-source engine for large-scale data analytics. Apache Spark enables you to
perform analytics on data stored in Amazon Keyspaces more efficiently. You can also use Amazon
Keyspaces to provide applications with consistent, single-digit-millisecond read access to analytics
data from Spark. The open-source Spark Cassandra Connector simplifies reading and writing data
between Amazon Keyspaces and Spark.
Amazon Keyspaces support for the Spark Cassandra Connector streamlines running Cassandra
workloads in Spark-based analytics pipelines by using a fully managed and serverless database
service. With Amazon Keyspaces, you don’t need to worry about Spark competing for the same
underlying infrastructure resources as your tables. Amazon Keyspaces tables scale up and down
automatically based on your application traffic.
The following tutorial walks you through steps and best practices required to read and write data
to Amazon Keyspaces using the Spark Cassandra Connector. The tutorial demonstrates how to
migrate data to Amazon Keyspaces by loading data from a file with the Spark Cassandra Connector
and writing it to an Amazon Keyspaces table. Then, the tutorial shows how to read the data back
from Amazon Keyspaces using the Spark Cassandra Connector. You would do this to run Cassandra
workloads in Spark-based analytics pipelines.
Topics
Prerequisites for establishing connections to Amazon Keyspaces with the Spark Cassandra
Connector
Step 1: Configure Amazon Keyspaces for integration with the Apache Cassandra Spark Connector
Step 2: Configure the Apache Cassandra Spark Connector
Step 3: Create the application configuration file
Step 4: Prepare the source data and the target table in Amazon Keyspaces
Step 5: Write and read Amazon Keyspaces data using the Apache Cassandra Spark Connector
Troubleshooting common errors when using the Spark Cassandra Connector with Amazon
Keyspaces
Prerequisites for establishing connections to Amazon Keyspaces with the Spark
Cassandra Connector
Before you connect to Amazon Keyspaces with the Spark Cassandra Connector, you need to make
sure that you've installed the following. The compatibility of Amazon Keyspaces with the Spark
Cassandra Connector has been tested with the following recommended versions:
Java version 8
Scala 2.12
Spark 3.4
Cassandra Connector 2.5 and higher
Cassandra driver 4.12
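You can confirm the versions installed in your environment with checks like the following.
java -version
scala -version
spark-submit --version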
1. To install Scala, follow the instructions at https://www.scala-lang.org/download/scala2.html.
2. To install Spark 3.4.1, follow this example.
curl -o spark-3.4.1-bin-hadoop3.tgz -k https://dlcdn.apache.org/spark/spark-3.4.1/
spark-3.4.1-bin-hadoop3.tgz
# now to untar
tar -zxvf spark-3.4.1-bin-hadoop3.tgz
# set this variable.
export SPARK_HOME=$PWD/spark-3.4.1-bin-hadoop3
Step 1: Configure Amazon Keyspaces for integration with the Apache Cassandra
Spark Connector
In this step, you confirm that the partitioner for your account is compatible with the Apache
Spark Connector and set up the required IAM permissions. The following best practices help you to
provision sufficient read/write capacity for the table.
1. Confirm that the Murmur3Partitioner partitioner is the default partitioner for your
account. This partitioner is compatible with the Spark Cassandra Connector. For more
information on partitioners and how to change them, see the section called “Working with
partitioners”.
2. Set up your IAM permissions for Amazon Keyspaces, using interface VPC endpoints, with
Apache Spark.
Assign read/write access to the user table and read access to the system tables as shown in
the IAM policy example listed below.
Populating the system.peers table with your available interface VPC endpoints is required
for clients accessing Amazon Keyspaces with Spark over VPC endpoints.
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"cassandra:Select",
"cassandra:Modify"
],
"Resource":[
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/table/
mytable",
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
]
},
{
"Sid":"ListVPCEndpoints",
"Effect":"Allow",
"Action":[
"ec2:DescribeNetworkInterfaces",
"ec2:DescribeVpcEndpoints"
],
"Resource":"*"
}
]
}
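You can attach this policy as an inline policy to the IAM role that your Spark hosts use, as in the following sketch. The role name, policy name, and policy file name are assumptions; replace them with your own values.
aws iam put-role-policy --role-name my-spark-role \
    --policy-name keyspaces-spark-access \
    --policy-document file://spark-keyspaces-policy.json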
3. Consider the following best practices to configure sufficient read/write throughput capacity
for your Amazon Keyspaces table to support the traffic from the Spark Cassandra Connector.
Start using on-demand capacity to help you test the scenario.
To optimize the cost of table throughput for production environments, use a rate limiter
for traffic from the connector, and configure your table to use provisioned capacity with
automatic scaling. For more information, see the section called “Manage throughput
capacity with auto scaling”.
You can use a fixed rate limiter that comes with the Cassandra driver. There are some rate
limiters tailored to Amazon Keyspaces in the AWS samples repo.
For more information about capacity management, see the section called “Configure read/
write capacity modes”.
Step 2: Configure the Apache Cassandra Spark Connector
Apache Spark is a general-purpose compute platform that you can configure in different ways. To
configure Spark and the Spark Cassandra Connector for integration with Amazon Keyspaces, we
recommend that you start with the minimum configuration settings described in the following
section, and then increase them later as appropriate for your workload.
Create Spark partition sizes smaller than 8 MB.
In Spark, partitions represent atomic chunks of data that can be processed in parallel. When you
are writing data to Amazon Keyspaces with the Spark Cassandra Connector, the smaller the
Spark partition, the smaller the number of records that the task writes. If a Spark task
encounters multiple errors, it fails after the designated number of retries has been exhausted.
To avoid replaying large tasks and reprocessing a lot of data, keep the size of the Spark partition
small.
Use a low concurrent number of writes per executor with a large number of retries.
Amazon Keyspaces returns insufficient capacity errors back to Cassandra drivers as operation
timeouts. You can't address timeouts caused by insufficient capacity by changing the configured
timeout duration because the Spark Cassandra Connector attempts to retry requests
transparently using the MultipleRetryPolicy. To ensure that retries don’t overwhelm
the driver’s connection pool, use a low concurrent number of writes per executor with a large
number of retries. The following code snippet is an example of this.
spark.cassandra.query.retry.count = 500
spark.cassandra.output.concurrent.writes = 3
Break down the total throughput and distribute it across multiple Cassandra sessions.
The Cassandra Spark Connector creates one session for each Spark executor. Think about
this session as the unit of scale to determine the required throughput and the number of
connections required.
When defining the number of cores per executor and the number of cores per task, start low
and increase as needed.
Set Spark task failures to allow processing in the event of transient errors. After you become
familiar with your application's traffic characteristics and requirements, we recommend setting
spark.task.maxFailures to a bounded value.
For example, the following configuration can handle two concurrent tasks per executor, per
session:
spark.executor.instances = configurable -> number of executors for the session.
spark.executor.cores = 2 -> Number of cores per executor.
spark.task.cpus = 1 -> Number of cores per task.
spark.task.maxFailures = -1
Turn off batching.
We recommend that you turn off batching to improve random access patterns. The following
code snippet is an example of this.
spark.cassandra.output.batch.size.rows = 1 (Default = None)
spark.cassandra.output.batch.grouping.key = none (Default = Partition)
spark.cassandra.output.batch.grouping.buffer.size = 100 (Default = 1000)
Set SPARK_LOCAL_DIRS to a fast, local disk with enough space.
By default, Spark saves map output files and resilient distributed datasets (RDDs) to a /tmp
folder. Depending on your Spark host’s configuration, this can result in "no space left on
device" errors.
To set the SPARK_LOCAL_DIRS environment variable to a directory called /example/spark-
dir, you can use the following command.
export SPARK_LOCAL_DIRS=/example/spark-dir
Step 3: Create the application configuration file
To use the open-source Spark Cassandra Connector with Amazon Keyspaces, you need to provide
an application configuration file that contains the settings required to connect with the DataStax
Java driver. You can use either service-specific credentials or the SigV4 plugin to connect.
If you haven't already done so, you need to convert the Starfield digital certificate into a trustStore
file. You can follow the detailed steps at the section called “Before you begin” from the Java driver
connection tutorial. Take note of the trustStore file path and password because you need this
information when you create the application config file.
Connect with SigV4 authentication
This section shows you an example application.conf file that you can use when connecting
with AWS credentials and the SigV4 plugin. If you haven't already done so, you need to generate
your IAM access keys (an access key ID and a secret access key) and save them in your AWS
config file or as environment variables. For detailed instructions, see the section called “Required
credentials for AWS authentication”.
In the following example, replace the file path to your trustStore file, and replace the password.
datastax-java-driver {
basic.contact-points = ["cassandra.us-east-1.amazonaws.com:9142"]
basic.load-balancing-policy {
class = DefaultLoadBalancingPolicy
local-datacenter = us-east-1
slow-replica-avoidance = false
}
basic.request {
consistency = LOCAL_QUORUM
}
advanced {
auth-provider = {
class = software.aws.mcs.auth.SigV4AuthProvider
aws-region = us-east-1
}
ssl-engine-factory {
class = DefaultSslEngineFactory
truststore-path = "path_to_file/cassandra_truststore.jks"
truststore-password = "password"
hostname-validation=false
}
}
advanced.connection.pool.local.size = 3
}
Update and save this configuration file as /home/user1/application.conf. The following
examples use this path.
Connect with service-specific credentials
This section shows you an example application.conf file that you can use when connecting
with service-specific credentials. If you haven't already done so, you need to generate service-
specific credentials for Amazon Keyspaces. For detailed instructions, see the section called “Create
service-specific credentials”.
In the following example, replace username and password with your own credentials. Also,
replace the file path to your trustStore file, and replace the password.
datastax-java-driver {
basic.contact-points = ["cassandra.us-east-1.amazonaws.com:9142"]
basic.load-balancing-policy {
class = DefaultLoadBalancingPolicy
local-datacenter = us-east-1
}
basic.request {
consistency = LOCAL_QUORUM
}
advanced {
auth-provider = {
class = PlainTextAuthProvider
username = "username"
password = "password"
aws-region = "us-east-1"
}
ssl-engine-factory {
class = DefaultSslEngineFactory
truststore-path = "path_to_file/cassandra_truststore.jks"
truststore-password = "password"
hostname-validation=false
}
metadata = {
schema {
token-map.enabled = true
}
}
}
}
Update and save this configuration file as /home/user1/application.conf to use with the
code example.
Connect with a fixed rate
To force a fixed rate per Spark executor, you can define a request throttler. This request throttler
limits the rate of requests per second. The Spark Cassandra Connector deploys a Cassandra session
per executor. Using the following formula can help you achieve consistent throughput against a
table.
max-request-per-second * numberOfExecutors = total throughput against a table
You can add this example to the application config file that you created earlier.
datastax-java-driver {
advanced.throttler {
class = RateLimitingRequestThrottler
max-requests-per-second = 3000
max-queue-size = 30000
drain-interval = 1 millisecond
}
}
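For example, with max-requests-per-second set to 3000 as shown above and 4 Spark executors, the formula yields the following total throughput. The executor count here is an assumption; replace it with the number configured for your job.
3000 (max-requests-per-second) * 4 (numberOfExecutors) = 12,000 requests per second against the table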
Step 4: Prepare the source data and the target table in Amazon Keyspaces
In this step, you create a source file with sample data and an Amazon Keyspaces table.
1. Create the source file. You can choose one of the following options:
For this tutorial, you use a comma-separated values (CSV) file with the name
keyspaces_sample_table.csv as the source file for the data migration. The provided
sample file contains a few rows of data for a table with the name book_awards.
Download the sample CSV file (keyspaces_sample_table.csv) that is contained
in the following archive file samplemigration.zip. Unzip the archive and take note of
the path to keyspaces_sample_table.csv.
If you want to follow along with your own CSV file to write data to Amazon Keyspaces,
make sure that the data is randomized. Data that is read directly from a database or
exported to flat files is typically ordered by the partition and primary key. Importing
ordered data to Amazon Keyspaces can cause it to be written to smaller segments of
Amazon Keyspaces partitions, which results in an uneven traffic distribution. This can lead
to slower performance and higher error rates.
In contrast, randomizing data helps to take advantage of the built-in load balancing
capabilities of Amazon Keyspaces by distributing traffic across partitions more evenly.
There are various tools that you can use for randomizing data. For an example that uses
the open-source tool Shuf, see the section called “Step 2: Prepare the data” in the data
migration tutorial. The following is an example that shows how to shuffle data as a
DataFrame.
import org.apache.spark.sql.functions.rand
val shuffledDF = dataframe.orderBy(rand())
2. Create the target keyspace and table in Amazon Keyspaces.
a. Connect to Amazon Keyspaces using cqlsh, and replace the service endpoint, user name,
and password in the following example with your own values.
cqlsh cassandra.us-east-2.amazonaws.com 9142 -u "111122223333" -
p "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" --ssl
b. Create a new keyspace with the name catalog as shown in the following example.
CREATE KEYSPACE catalog WITH REPLICATION = {'class': 'SingleRegionStrategy'};
c. After the new keyspace has a status of available, use the following code to create the
target table book_awards. To learn more about asynchronous resource creation and how
to check if a resource is available, see the section called “Check keyspace creation status”.
CREATE TABLE catalog.book_awards (
year int,
award text,
rank int,
category text,
book_title text,
author text,
publisher text,
PRIMARY KEY ((year, award), category, rank)
);
Step 5: Write and read Amazon Keyspaces data using the Apache Cassandra Spark
Connector
In this step, you start by loading the data from the sample file into a DataFrame with the Spark
Cassandra Connector. Next, you write the data from the DataFrame into your Amazon Keyspaces
table. You can also use this part independently, for example, to migrate data into an Amazon
Keyspaces table. Finally, you read the data from your table into a DataFrame using the Spark
Cassandra Connector. You can also use this part independently, for example, to read data from an
Amazon Keyspaces table to perform data analytics with Apache Spark.
1. Start the Spark Shell as shown in the following example. Note that this example is using SigV4
authentication.
./spark-shell --files application.conf --conf
spark.cassandra.connection.config.profile.path=application.conf
--packages software.aws.mcs:aws-sigv4-auth-cassandra-java-driver-
plugin:4.0.5,com.datastax.spark:spark-cassandra-connector_2.12:3.1.0 --conf
spark.sql.extensions=com.datastax.spark.connector.CassandraSparkExtensions
2. Import the Spark Cassandra Connector with the following code.
import org.apache.spark.sql.cassandra._
3. To read data from the CSV file and store it in a DataFrame, you can use the following code
example.
var df =
spark.read.option("header","true").option("inferSchema","true").csv("keyspaces_sample_table.csv")
You can display the result with the following command.
scala> df.show();
The output should look similar to this.
+----------------+----+-----------+----+------------------+--------------------
+-------------+
| award|year| category|rank| author| book_title|
publisher|
+----------------+----+-----------+----+------------------+--------------------
+-------------+
|Kwesi Manu Prize|2020| Fiction| 1| Akua Mansa| Where did you go?|
SomePublisher|
|Kwesi Manu Prize|2020| Fiction| 2| John Stiles| Yesterday|
Example Books|
|Kwesi Manu Prize|2020| Fiction| 3| Nikki Wolf|Moving to the Cha...|
AnyPublisher|
| Wolf|2020|Non-Fiction| 1| Wang Xiulan| History of Ideas|
Example Books|
| Wolf|2020|Non-Fiction| 2|Ana Carolina Silva| Science Today|
SomePublisher|
| Wolf|2020|Non-Fiction| 3| Shirley Rodriguez|The Future of Sea...|
AnyPublisher|
| Richard Roe|2020| Fiction| 1| Alejandro Rosalez| Long Summer|
SomePublisher|
| Richard Roe|2020| Fiction| 2| Arnav Desai| The Key|
Example Books|
| Richard Roe|2020| Fiction| 3| Mateo Jackson| Inside the Whale|
AnyPublisher|
+----------------+----+-----------+----+------------------+--------------------
+-------------+
You can confirm the schema of the data in the DataFrame as shown in the following example.
scala> df.printSchema
The output should look like this.
root
|-- award: string (nullable = true)
|-- year: integer (nullable = true)
|-- category: string (nullable = true)
|-- rank: integer (nullable = true)
|-- author: string (nullable = true)
|-- book_title: string (nullable = true)
|-- publisher: string (nullable = true)
4. Use the following command to write the data in the DataFrame to the Amazon Keyspaces
table.
df.write.cassandraFormat("book_awards", "catalog").mode("APPEND").save()
5. To confirm that the data was saved, you can read it back into a DataFrame, as shown in the
following example.
var newDf = spark.read.cassandraFormat("book_awards", "catalog").load()
Then you can show the data that is now contained in the DataFrame.
scala> newDf.show()
The output of that command should look like this.
+--------------------+------------------+----------------+-----------+-------------
+----+----+
| book_title| author| award| category|
publisher|rank|year|
+--------------------+------------------+----------------+-----------+-------------
+----+----+
| Long Summer| Alejandro Rosalez| Richard Roe| Fiction|
SomePublisher| 1|2020|
| History of Ideas| Wang Xiulan| Wolf|Non-Fiction|Example
Books| 1|2020|
| Where did you go?| Akua Mansa|Kwesi Manu Prize| Fiction|
SomePublisher| 1|2020|
| Inside the Whale| Mateo Jackson| Richard Roe| Fiction|
AnyPublisher| 3|2020|
| Yesterday| John Stiles|Kwesi Manu Prize| Fiction|Example
Books| 2|2020|
|Moving to the Cha...| Nikki Wolf|Kwesi Manu Prize| Fiction|
AnyPublisher| 3|2020|
|The Future of Sea...| Shirley Rodriguez| Wolf|Non-Fiction|
AnyPublisher| 3|2020|
| Science Today|Ana Carolina Silva| Wolf|Non-Fiction|
SomePublisher| 2|2020|
| The Key| Arnav Desai| Richard Roe| Fiction|Example
Books| 2|2020|
+--------------------+------------------+----------------+-----------+-------------
+----+----+
Troubleshooting common errors when using the Spark Cassandra Connector with
Amazon Keyspaces
If you're using Amazon Virtual Private Cloud and you connect to Amazon Keyspaces, the most
common errors experienced when using the Spark connector are caused by the following
configuration issues.
The IAM user or role used in the VPC lacks the required permissions to access the system.peers
table in Amazon Keyspaces. For more information, see the section called “Populating
system.peers table entries with interface VPC endpoint information”.
The IAM user or role lacks the required read/write permissions to the user table and read access
to the system tables in Amazon Keyspaces. For more information, see the section called “Step 1:
Configure Amazon Keyspaces”.
The Java driver configuration doesn't disable hostname verification when creating the SSL/TLS
connection. For examples, see the section called “Step 2: Configure the driver”.
For detailed connection troubleshooting steps, see the section called “VPC endpoint connection
errors”.
In addition, you can use Amazon CloudWatch metrics to help you troubleshoot issues with your
Spark Cassandra Connector configuration in Amazon Keyspaces. To learn more about using
Amazon Keyspaces with CloudWatch, see the section called “Monitoring with CloudWatch”.
The following section describes the most useful metrics to observe when you're using the Spark
Cassandra Connector.
PerConnectionRequestRateExceeded
Amazon Keyspaces has a quota of 3,000 requests per second, per connection. Each Spark
executor establishes a connection with Amazon Keyspaces. Running multiple retries can exhaust
your per-connection request rate quota. If you exceed this quota, Amazon Keyspaces emits a
PerConnectionRequestRateExceeded metric in CloudWatch.
If you see PerConnectionRequestRateExceeded events present along with other system or user
errors, it's likely that Spark is running multiple retries beyond the allotted number of requests
per connection.
If you see PerConnectionRequestRateExceeded events without other errors, then you
might need to increase the number of connections in your driver settings to allow for more
throughput, or you might need to increase the number of executors in your Spark job.
StoragePartitionThroughputCapacityExceeded
Amazon Keyspaces has a quota of 1,000 WCUs or WRUs per second and 3,000 RCUs or RRUs per
second, per partition. If you're seeing StoragePartitionThroughputCapacityExceeded
CloudWatch events, it could indicate that data is not randomized on load. For examples of how to
shuffle data, see the section called “Step 4: Prepare the source data and the target table”.
Common errors and warnings
If you're using Amazon Virtual Private Cloud and you connect to Amazon Keyspaces, the Cassandra
driver might issue a warning message about the control node itself in the system.peers table.
For more information, see the section called “Common errors and warnings”. You can safely ignore
this warning.
Tutorial: Connecting to Amazon Keyspaces from Amazon Elastic
Kubernetes Service
This tutorial walks you through the steps required to set up an Amazon Elastic Kubernetes Service
(Amazon EKS) cluster to host a containerized application that connects to Amazon Keyspaces using
SigV4 authentication.
Amazon EKS is a managed service that eliminates the need to install, operate, and maintain
your own Kubernetes control plane. Kubernetes is an open-source system that automates the
management, scaling, and deployment of containerized applications.
The tutorial provides step-by-step guidance to configure, build, and deploy a containerized Java
application to Amazon EKS. In the last step you run the application to write data to an Amazon
Keyspaces table.
Topics
Prerequisites for connecting from Amazon EKS to Amazon Keyspaces
Step 1: Configure the Amazon EKS cluster and setup IAM permissions
Step 2: Configure the application
Step 3: Create the application image and upload the Docker file to your Amazon ECR repository
Step 4: Deploy the application to Amazon EKS and write data to your table
Step 5: (Optional) Cleanup
Prerequisites for connecting from Amazon EKS to Amazon Keyspaces
Create the following AWS resources before you begin the tutorial.
1. Before you start this tutorial, follow the AWS setup instructions in Accessing Amazon
Keyspaces (for Apache Cassandra). These steps include signing up for AWS and creating an
AWS Identity and Access Management (IAM) principal with access to Amazon Keyspaces.
2. Create an Amazon Keyspaces keyspace with the name aws and a table with the name user
that you can write to from the containerized application running in Amazon EKS later in this
tutorial. You can do this either with the AWS CLI or using cqlsh.
AWS CLI
aws keyspaces create-keyspace --keyspace-name 'aws'
To confirm that the keyspace was created, you can use the following command.
aws keyspaces list-keyspaces
To create the table, you can use the following command.
aws keyspaces create-table --keyspace-name 'aws' --table-name 'user' --schema-
definition 'allColumns=[
{name=username,type=text}, {name=fname,type=text},
{name=last_update_date,type=timestamp},{name=lname,type=text}],
partitionKeys=[{name=username}]'
To confirm that your table was created, you can use the following command.
aws keyspaces list-tables --keyspace-name 'aws'
For more information, see create-keyspace and create-table in the AWS CLI Command
Reference.
cqlsh
CREATE KEYSPACE aws WITH replication = {'class': 'SimpleStrategy',
'replication_factor': '3'} AND durable_writes = true;
CREATE TABLE aws.user (
username text PRIMARY KEY,
fname text,
last_update_date timestamp,
lname text
);
To verify that your table was created, you can use the following statement.
SELECT * FROM system_schema.tables WHERE keyspace_name = 'aws';
Your table should be listed in the output of this statement. Note that there can be a delay
until the table is created. For more information, see the section called “CREATE TABLE”.
3. Create an Amazon EKS cluster with a Fargate - Linux node type. Fargate is a serverless
compute engine that lets you deploy Kubernetes Pods without managing Amazon EC2
instances. To follow this tutorial without having to update the cluster name in all the example
commands, create a cluster with the name my-eks-cluster following the instructions at
Getting started with Amazon EKS – eksctl in the Amazon EKS User Guide. When your cluster
is created, verify that your nodes and the two default Pods are running and healthy. You can
do so with the following command.
kubectl get pods -A -o wide
You should see something similar to this output.
NAMESPACE NAME READY STATUS RESTARTS AGE IP
NODE NOMINATED NODE READINESS
GATES
kube-system coredns-1234567890-abcde 1/1 Running 0 18m
192.0.2.0 fargate-ip-192-0-2-0.region-code.compute.internal <none>
<none>
kube-system coredns-1234567890-12345 1/1 Running 0 18m
192.0.2.1 fargate-ip-192-0-2-1.region-code.compute.internal <none>
<none>
4. Install Docker. For instructions on how to install Docker on an Amazon EC2 instance, see Install
Docker in the Amazon Elastic Container Registry User Guide.
Docker is available for many different operating systems, including most modern Linux
distributions, like Ubuntu, and even macOS and Windows. For more information about how to
install Docker on your particular operating system, go to the Docker installation guide.
5. Create an Amazon ECR repository. Amazon ECR is an AWS managed container image registry
service that you can use with your preferred CLI to push, pull, and manage Docker images. For
more information about Amazon ECR repositories, see the Amazon Elastic Container Registry
User Guide. You can use the following command to create a repository with the name my-
ecr-repository.
aws ecr create-repository --repository-name my-ecr-repository
After completing the prerequisite steps, proceed to the section called “Step 1: Configure the
Amazon EKS cluster”.
Step 1: Configure the Amazon EKS cluster and set up IAM permissions
Configure the Amazon EKS cluster and create the IAM resources that are required to allow an
Amazon EKS service account to connect to your Amazon Keyspaces table.
1. Create an Open ID Connect (OIDC) provider for the Amazon EKS cluster. This is needed to use
IAM roles for service accounts. For more information about OIDC providers and how to create
them, see Creating an IAM OIDC provider for your cluster in the Amazon EKS User Guide.
a. Create an IAM OIDC identity provider for your cluster with the following command. This
example assumes that your cluster name is my-eks-cluster. If you have a cluster with a
different name, remember to update the name in all future commands.
eksctl utils associate-iam-oidc-provider --cluster my-eks-cluster --approve
b. Confirm that the OIDC identity provider has been registered with IAM with the following
command.
aws iam list-open-id-connect-providers --region aws-region
The output should look similar to this. Take note of the OIDC's Amazon Resource Name
(ARN); you need it in the next step when you create a trust policy for the service account.
{
"OpenIDConnectProviderList": [
..
{
"Arn": "arn:aws:iam::111122223333:oidc-provider/oidc.eks.aws-
region.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
}
]
}
2. Create a service account for the Amazon EKS cluster. Service accounts provide an identity for
processes that run in a Pod. A Pod is the smallest and simplest Kubernetes object that you can
use to deploy a containerized application. Next, create an IAM role that the service account
can assume to obtain permissions to resources. You can access any AWS service from a Pod
that has been configured to use a service account that can assume an IAM role with access
permissions to that service.
a. Create a new namespace for the service account. A namespace helps to isolate cluster
resources created for this tutorial. You can create a new namespace using the following
command.
kubectl create namespace my-eks-namespace
b. To use a custom namespace, you have to associate it with a Fargate profile. The following
code is an example of this.
eksctl create fargateprofile \
--cluster my-eks-cluster \
--name my-fargate-profile \
--namespace my-eks-namespace \
--labels *=*
c. Create a service account with the name my-eks-serviceaccount in the namespace my-
eks-namespace for your Amazon EKS cluster by using the following command.
cat >my-serviceaccount.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
name: my-eks-serviceaccount
namespace: my-eks-namespace
EOF
kubectl apply -f my-serviceaccount.yaml
d. Run the following command to create a trust policy file that instructs the IAM role to trust
your service account. This trust relationship is required before a principal can assume a
role. You need to make the following edits to the file:
For the Principal, enter the ARN that IAM returned to the list-open-id-connect-
providers command. The ARN contains your account number and Region.
In the condition statement, replace the AWS Region and the OIDC id.
Confirm that the service account name and namespace are correct.
You need to attach the trust policy file in the next step when you create the IAM role.
cat >trust-relationship.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::111122223333:oidc-provider/
oidc.eks.aws-region.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"oidc.eks.aws-region.amazonaws.com/
id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:my-eks-
namespace:my-eks-serviceaccount",
"oidc.eks.aws-region.amazonaws.com/
id/EXAMPLED539D4633E53DE1B71EXAMPLE:aud": "sts.amazonaws.com"
}
}
}
]
}
EOF
Optional: You can also add multiple entries in the StringEquals or StringLike
conditions to allow multiple service accounts or namespaces to assume the role. To allow
your service account to assume an IAM role in a different AWS account, see Cross-account
IAM permissions in the Amazon EKS User Guide.
3. Create an IAM role with the name my-iam-role for the Amazon EKS service account to
assume. Attach the trust policy file created in the last step to the role. The trust policy specifies
the service account and OIDC provider that the IAM role can trust.
aws iam create-role --role-name my-iam-role --assume-role-policy-document file://
trust-relationship.json --description "EKS service account role"
4. Assign the IAM role permissions to Amazon Keyspaces by attaching an access policy.
a. Attach an access policy to define the actions the IAM role can perform on specific
Amazon Keyspaces resources. For this tutorial we use the AWS managed policy
AmazonKeyspacesFullAccess, because our application is going to write data to your
Amazon Keyspaces table. As a best practice, however, we recommend that you create custom
access policies that implement the principle of least privilege. For more information, see the
section called “How Amazon Keyspaces works with IAM”.
aws iam attach-role-policy --role-name my-iam-role --policy-
arn=arn:aws:iam::aws:policy/AmazonKeyspacesFullAccess
Confirm that the policy was successfully attached to the IAM role with the following
statement.
aws iam list-attached-role-policies --role-name my-iam-role
The output should look like this.
{
"AttachedPolicies": [
{
"PolicyName": "AmazonKeyspacesFullAccess",
"PolicyArn": "arn:aws:iam::aws:policy/AmazonKeyspacesFullAccess"
}
]
}
b. Annotate the service account with the Amazon Resource Name (ARN) of the IAM role it can
assume. Make sure to update the role ARN with your account ID.
kubectl annotate serviceaccount -n my-eks-namespace my-eks-serviceaccount
eks.amazonaws.com/role-arn=arn:aws:iam::111122223333:role/my-iam-role
5. Confirm that the IAM role and the service account are correctly configured.
a. Confirm that the IAM role's trust policy is correctly configured with the following
statement.
aws iam get-role --role-name my-iam-role --query Role.AssumeRolePolicyDocument
The output should look similar to this.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::111122223333:oidc-provider/
oidc.eks.aws-region.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"oidc.eks.aws-region/id/
EXAMPLED539D4633E53DE1B71EXAMPLE:aud": "sts.amazonaws.com",
"oidc.eks.aws-region.amazonaws.com/id/
EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:my-eks-
namespace:my-eks-serviceaccount"
}
}
}
]
}
b. Confirm that the Amazon EKS service account is annotated with the IAM role.
kubectl describe serviceaccount my-eks-serviceaccount -n my-eks-namespace
The output should look similar to this.
Name: my-eks-serviceaccount
Namespace: my-eks-namespace
Labels: <none>
Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-iam-
role
Image pull secrets: <none>
Mountable secrets: <none>
Tokens: <none>
[...]
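As an additional check, you can confirm from inside the cluster that a Pod using the service account receives the IAM role's identity. The following sketch runs a temporary Pod with the public AWS CLI image and calls sts get-caller-identity; the Pod name and image are assumptions. The ARN in the response should reference my-iam-role.
kubectl run identity-check --rm -it \
    --namespace my-eks-namespace \
    --image=amazon/aws-cli \
    --overrides='{"spec": {"serviceAccountName": "my-eks-serviceaccount"}}' \
    -- sts get-caller-identity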
After you have created the Amazon EKS service account and the IAM role, and configured the
required relationships and permissions, proceed to the section called “Step 2: Configure the
application”.
Step 2: Configure the application
In this step, you build your application that connects to Amazon Keyspaces using the SigV4 plugin.
You can view and download the example Java application from the Amazon Keyspaces example
code repo on GitHub, or you can follow along using your own application, making sure to complete
all configuration steps.
Configure your application and add the required dependencies.
1. You can download the example Java application by cloning the GitHub repository using the
following command.
git clone https://github.com/aws-samples/amazon-keyspaces-examples.git
2. After downloading the GitHub repo, unzip the downloaded file if needed, and navigate to the
resources directory, which contains the application.conf file.
a. Application configuration
In this step you configure the SigV4 authentication plugin. You can use the following
example in your application. If you haven't already done so, you need to generate your
IAM access keys (an access key ID and a secret access key) and save them in your AWS
config file or as environment variables. For detailed instructions, see the section called
“Required credentials for AWS authentication”. Update the AWS Region and the service
endpoint for Amazon Keyspaces as needed. For more service endpoints, see the section
called “Service endpoints”. Replace the truststore location, truststore name, and the
truststore password with your own.
datastax-java-driver {
basic.contact-points = ["cassandra.aws-region.amazonaws.com:9142"]
basic.load-balancing-policy.local-datacenter = "aws-region"
advanced.auth-provider {
class = software.aws.mcs.auth.SigV4AuthProvider
aws-region = "aws-region"
}
advanced.ssl-engine-factory {
class = DefaultSslEngineFactory
truststore-path = "truststore_location/truststore_name.jks"
truststore-password = "truststore_password"
}
}
b. Add the STS module dependency.
This adds the ability to use a WebIdentityTokenCredentialsProvider that returns
the AWS credentials that the application needs to provide so that the service account can
assume the IAM role. You can do this based on the following example.
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-sts</artifactId>
<version>1.11.717</version>
</dependency>
c. Add the SigV4 dependency.
This package implements the SigV4 authentication plugin that is needed to authenticate
to Amazon Keyspaces.
<dependency>
<groupId>software.aws.mcs</groupId>
<artifactId>aws-sigv4-auth-cassandra-java-driver-plugin</
artifactId>
<version>4.0.3</version>
</dependency>
3. Add a logging dependency.
Without logs, troubleshooting connection issues is difficult. In this tutorial, we use slf4j
as the logging framework, and use logback.xml to store the log output. We set the logging
level to debug while establishing the connection. You can use the following example to add the
dependency.
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>2.0.5</version>
</dependency>
You can use the following code snippet to configure the logging.
<configuration>
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<encoder>
<pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</
pattern>
</encoder>
</appender>
<root level="debug">
<appender-ref ref="STDOUT" />
</root>
</configuration>
Note
The debug level is needed to investigate connection failures. After you have
successfully connected to Amazon Keyspaces from your application, you can change
the logging level to info or warning as needed.
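In the application code itself, you log through the slf4j API. The following is a minimal sketch (the class name is hypothetical); it assumes the logback-classic runtime binding is also on the classpath so that logback.xml takes effect.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ConnectionLoggingExample {
    private static final Logger logger =
            LoggerFactory.getLogger(ConnectionLoggingExample.class);

    public static void main(String[] args) {
        // With the root level set to debug in logback.xml, the driver's
        // connection events appear on STDOUT alongside your own messages.
        logger.debug("Attempting to connect to Amazon Keyspaces...");
    }
}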
Step 3: Create the application image and upload the Docker file to your Amazon
ECR repository
In this step, you compile the example application, build a Docker image, and push the image to
your Amazon ECR repository.
Build your application, build a Docker image, and push it to Amazon Elastic Container
Registry
1. Set environment variables for the build that define your AWS Region. Replace the Regions in
the examples with your own.
export CASSANDRA_HOST=cassandra.aws-region.amazonaws.com:9142
export CASSANDRA_DC=aws-region
2. Compile your application with Apache Maven version 3.6.3 or higher using the following
command.
mvn clean install
This creates a JAR file with all dependencies included in the target directory.
3. Retrieve your ECR repository URI that's needed for the next step with the following command.
Make sure to update the Region to the one you've been using.
aws ecr describe-repositories --region aws-region
The output should look like the following example.
"repositories": [
{
"repositoryArn": "arn:aws:ecr:aws-region:111122223333:repository/my-ecr-
repository",
"registryId": "111122223333",
"repositoryName": "my-ecr-repository",
"repositoryUri": "111122223333.dkr.ecr.aws-region.amazonaws.com/my-ecr-
repository",
"createdAt": "2023-11-02T03:46:34+00:00",
"imageTagMutability": "MUTABLE",
"imageScanningConfiguration": {
"scanOnPush": false
Connecting from Amazon EKS 185
Amazon Keyspaces (for Apache Cassandra) Developer Guide
},
"encryptionConfiguration": {
"encryptionType": "AES256"
}
},
4. From the application's root directory, build the Docker image using the repository URI from the
last step. Modify the Docker file as needed. In the build command, make sure to replace your
account ID and set the AWS Region to the Region where the Amazon ECR repository my-ecr-
repository is located.
docker build -t 111122223333.dkr.ecr.aws-region.amazonaws.com/my-ecr-
repository:latest .
5. Retrieve an authentication token to push the Docker image to Amazon ECR. You can do so with
the following command.
aws ecr get-login-password --region aws-region | docker login --username AWS --
password-stdin 111122223333.dkr.ecr.aws-region.amazonaws.com
6. First, check for existing images in your Amazon ECR repository. You can use the following
command.
aws ecr describe-images --repository-name my-ecr-repository --region aws-region
Then, push the Docker image to the repo. You can use the following command.
docker push 111122223333.dkr.ecr.aws-region.amazonaws.com/my-ecr-repository:latest
Step 4: Deploy the application to Amazon EKS and write data to your table
In this step of the tutorial, you configure the Amazon EKS deployment for your application, and
confirm that the application is running and can connect to Amazon Keyspaces.
To deploy an application to Amazon EKS, you need to configure all relevant settings in a file called
deployment.yaml. This file is then used by Amazon EKS to deploy the application. The metadata
in the file should contain the following information:
Application name the name of the application. For this tutorial, we use my-keyspaces-app.
Connecting from Amazon EKS 186
Amazon Keyspaces (for Apache Cassandra) Developer Guide
Kubernetes namespace the namespace of the Amazon EKS cluster. For this tutorial, we use my-
eks-namespace.
Amazon EKS service account name the name of the Amazon EKS service account. For this
tutorial, we use my-eks-serviceaccount.
image name the name of the application image. For this tutorial, we use my-keyspaces-app.
Image URI the Docker image URI from Amazon ECR.
AWS account ID your AWS account ID.
IAM role ARN the ARN of the IAM role created for the service account to assume. For this
tutorial, we use my-iam-role.
AWS Region of the Amazon EKS cluster the AWS Region you created your Amazon EKS cluster
in.
In this step, you deploy and run the application that connects to Amazon Keyspaces and writes
data to the table.
1. Configure the deployment.yaml file. You need to replace the following values:
name
namespace
serviceAccountName
image
AWS_ROLE_ARN value
The AWS Region in CASSANDRA_HOST
AWS_REGION
You can use the following file as an example.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-keyspaces-app
  namespace: my-eks-namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-keyspaces-app
  template:
    metadata:
      labels:
        app: my-keyspaces-app
    spec:
      serviceAccountName: my-eks-serviceaccount
      containers:
        - name: my-keyspaces-app
          image: 111122223333.dkr.ecr.aws-region.amazonaws.com/my-ecr-repository:latest
          ports:
            - containerPort: 8080
          env:
            - name: CASSANDRA_HOST
              value: "cassandra.aws-region.amazonaws.com:9142"
            - name: CASSANDRA_DC
              value: "aws-region"
            - name: AWS_WEB_IDENTITY_TOKEN_FILE
              value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
            - name: AWS_ROLE_ARN
              value: "arn:aws:iam::111122223333:role/my-iam-role"
            - name: AWS_REGION
              value: "aws-region"
2. Deploy deployment.yaml.
kubectl apply -f deployment.yaml
The output should look like this.
deployment.apps/my-keyspaces-app created
3. Check the status of the Pod in your namespace of the Amazon EKS cluster.
kubectl get pods -n my-eks-namespace
The output should look similar to this example.
NAME READY STATUS RESTARTS AGE
my-keyspaces-app-123abcde4f-g5hij 1/1 Running 0 75s
For more details, you can use the following command.
kubectl describe pod my-keyspaces-app-123abcde4f-g5hij -n my-eks-namespace
Name: my-keyspaces-app-123abcde4f-g5hij
Namespace: my-eks-namespace
Priority: 2000001000
Priority Class Name: system-node-critical
Service Account: my-eks-serviceaccount
Node: fargate-ip-192-168-102-209.ec2.internal/192.168.102.209
Start Time: Thu, 23 Nov 2023 12:15:43 +0000
Labels: app=my-keyspaces-app
eks.amazonaws.com/fargate-profile=my-fargate-profile
pod-template-hash=6c56fccc56
Annotations: CapacityProvisioned: 0.25vCPU 0.5GB
Logging: LoggingDisabled: LOGGING_CONFIGMAP_NOT_FOUND
Status: Running
IP: 192.168.102.209
IPs:
IP: 192.168.102.209
Controlled By: ReplicaSet/my-keyspaces-app-6c56fccc56
Containers:
my-keyspaces-app:
Container ID:
containerd://41ff7811d33ae4bc398755800abcdc132335d51d74f218ba81da0700a6f8c67b
Image: 111122223333.dkr.ecr.aws-region.amazonaws.com/
my_eks_repository:latest
Image ID: 111122223333.dkr.ecr.aws-region.amazonaws.com/
my_eks_repository@sha256:fd3c6430fc5251661efce99741c72c1b4b03061474940200d0524b84a951439c
Port: 8080/TCP
Host Port: 0/TCP
State: Running
Started: Thu, 23 Nov 2023 12:15:19 +0000
Finished: Thu, 23 Nov 2023 12:16:17 +0000
Ready: True
Restart Count: 1
Environment:
CASSANDRA_HOST: cassandra.aws-region.amazonaws.com:9142
CASSANDRA_DC: aws-region
AWS_WEB_IDENTITY_TOKEN_FILE: /var/run/secrets/eks.amazonaws.com/
serviceaccount/token
AWS_ROLE_ARN: arn:aws:iam::111122223333:role/my-iam-role
AWS_REGION: aws-region
AWS_STS_REGIONAL_ENDPOINTS: regional
Mounts:
/var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fssbf (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
aws-iam-token:
Type: Projected (a volume that contains injected data from
multiple sources)
TokenExpirationSeconds: 86400
kube-api-access-fssbf:
Type: Projected (a volume that contains injected data from
multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for
300s
node.kubernetes.io/unreachable:NoExecute op=Exists for
300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning LoggingDisabled 2m13s fargate-scheduler Disabled logging
because aws-logging configmap was not found. configmap "aws-logging" not found
Normal Scheduled 89s fargate-scheduler Successfully
assigned my-eks-namespace/my-keyspaces-app-6c56fccc56-mgs2m to fargate-
ip-192-168-102-209.ec2.internal
Normal Pulled 75s kubelet
Successfully pulled image "111122223333.dkr.ecr.aws-region.amazonaws.com/
my_eks_repository:latest" in 13.027s (13.027s including waiting)
Normal Pulling 54s (x2 over 88s) kubelet Pulling image
"111122223333.dkr.ecr.aws-region.amazonaws.com/my_eks_repository:latest"
Normal Created 54s (x2 over 75s) kubelet Created container
my-keyspaces-app
Normal Pulled 54s kubelet
Successfully pulled image "111122223333.dkr.ecr.aws-region.amazonaws.com/
my_eks_repository:latest" in 222ms (222ms including waiting)
Normal Started 53s (x2 over 75s) kubelet Started container
my-keyspaces-app
4. Check the Pod's logs to confirm that your application is running and can connect to your
Amazon Keyspaces table. You can do so with the following command. Make sure to replace the
Pod name with your own.
kubectl logs -f my-keyspaces-app-123abcde4f-g5hij -n my-eks-namespace
You should see application log entries confirming the connection to Amazon Keyspaces, as in
the following example.
22:47:20.553 [s0-admin-0] DEBUG c.d.o.d.i.c.metadata.MetadataManager
- [s0] Adding initial contact points [Node(endPoint=cassandra.aws-
region.amazonaws.com/1.222.333.44:9142, hostId=null, hashCode=e750d92)]
22:47:20.562 [s0-admin-1] DEBUG c.d.o.d.i.c.c.ControlConnection - [s0] Initializing
with event types [SCHEMA_CHANGE, STATUS_CHANGE, TOPOLOGY_CHANGE]
22:47:20.564 [s0-admin-1] DEBUG c.d.o.d.i.core.context.EventBus - [s0] Registering
com.datastax.oss.driver.internal.core.metadata.LoadBalancingPolicyWrapper$$Lambda
$812/0x0000000801105e88@769afb95 for class
com.datastax.oss.driver.internal.core.metadata.NodeStateEvent
22:47:20.566 [s0-admin-1] DEBUG c.d.o.d.i.c.c.ControlConnection -
[s0] Trying to establish a connection to Node(endPoint=cassandra.us-
east-1.amazonaws.com/1.222.333.44:9142, hostId=null, hashCode=e750d92)
5. Run the following CQL query on your Amazon Keyspaces table to confirm that one row of data
has been written to your table:
SELECT * from aws.user;
You should see the following output:
fname | lname | username | last_update_date
----------+-------+----------+-----------------------------
random | k | test | 2023-12-07 13:58:31.57+0000
Step 5: (Optional) Cleanup
Follow these steps to remove all the resources created in this tutorial.
Remove the resources created in this tutorial
1. Delete your deployment. You can use the following command to do so.
kubectl delete deployment my-keyspaces-app -n my-eks-namespace
2. Delete the Amazon EKS cluster and all Pods contained in it. This also deletes related resources
like the service account and OIDC identity provider. You can use the following command to do
so.
eksctl delete cluster --name my-eks-cluster --region aws-region
3. Delete the IAM role used for the Amazon EKS service account with access permissions to
Amazon Keyspaces. First, you have to remove the managed policy that is attached to the role.
aws iam detach-role-policy --role-name my-iam-role --policy-arn
arn:aws:iam::aws:policy/AmazonKeyspacesFullAccess
Then you can delete the role using the following command.
aws iam delete-role --role-name my-iam-role
For more information, see Deleting an IAM role (AWS CLI) in the IAM User Guide.
4. Delete the Amazon ECR repository including all the images stored in it. You can do so using the
following command.
aws ecr delete-repository \
--repository-name my-ecr-repository \
--force \
--region aws-region
Note that the force flag is required to delete a repository that contains images. If you want to
delete the image first, you can use the following command.
aws ecr batch-delete-image \
--repository-name my-ecr-repository \
--image-ids imageTag=latest \
--region aws-region
For more information, see Delete an image in the Amazon Elastic Container Registry User
Guide.
5. Delete the Amazon Keyspaces keyspace and table. Deleting the keyspace automatically deletes
all tables in that keyspace. You can use one of the following options to do so.
AWS CLI
aws keyspaces delete-keyspace --keyspace-name 'aws'
To confirm that the keyspace was deleted, you can use the following command.
aws keyspaces list-keyspaces
To delete the table first, you can use the following command.
aws keyspaces delete-table --keyspace-name 'aws' --table-name 'user'
To confirm that your table was deleted, you can use the following command.
aws keyspaces list-tables --keyspace-name 'aws'
For more information, see delete keyspace and delete table in the AWS CLI Command
Reference.
cqlsh
DROP KEYSPACE IF EXISTS "aws";
To verify that your keyspace was deleted, you can use the following statement.
SELECT * FROM system_schema.keyspaces ;
Your keyspace should not be listed in the output of this statement. Note that there can be
a delay until the keyspace is deleted. For more information, see the section called “DROP
KEYSPACE”.
To delete the table first, you can use the following command.
DROP TABLE "aws"."user";
To confirm that your table was deleted, you can use the following command.
SELECT * FROM system_schema.tables WHERE keyspace_name = 'aws';
Your table should not be listed in the output of this statement. Note that there can be a
delay until the table is deleted. For more information, see the section called “DROP TABLE”.
Configure cross-account access to Amazon Keyspaces with VPC
endpoints
You can create and use separate AWS accounts to isolate resources for use in different
environments, for example, development and production. This topic walks you through cross-
account access for Amazon Keyspaces using interface VPC endpoints in an Amazon Virtual Private
Cloud. For more information about IAM cross-account access configuration, see Example scenario
using separate development and production accounts in the IAM User Guide.
For more information about Amazon Keyspaces and private VPC endpoints, see the section called
“Using interface VPC endpoints”.
Topics
Configure cross-account access to Amazon Keyspaces using VPC endpoints in a shared VPC
Configuring cross-account access to Amazon Keyspaces without a shared VPC
Configure cross-account access to Amazon Keyspaces using VPC
endpoints in a shared VPC
You can create different AWS accounts to separate resources from applications. For example, you
can create one account for your Amazon Keyspaces tables, a different account for applications in
a development environment, and another account for applications in a production environment.
This topic walks you through the configuration steps required to set up cross-account access for
Amazon Keyspaces using interface VPC endpoints in a shared VPC.
For detailed steps on how to configure a VPC endpoint for Amazon Keyspaces, see the section called
“Step 3: Create a VPC endpoint for Amazon Keyspaces”.
In this example we use the following three accounts in a shared VPC:
Account A – This account contains infrastructure, including the VPC endpoints, the VPC
subnets, and Amazon Keyspaces tables.
Account B – This account contains an application in a development environment that needs to
connect to the Amazon Keyspaces table in Account A.
Account C – This account contains an application in a production environment that needs to
connect to the Amazon Keyspaces table in Account A.
Account A is the account that contains the resources that Account B and Account C need to
access, so Account A is the trusting account. Account B and Account C are the accounts with
the principals that need access to the resources in Account A, so Account B and Account C
are the trusted accounts. The trusting account grants the permissions to the trusted accounts by
sharing an IAM role. The following procedure outlines the configuration steps required in Account
A.
Configuration for Account A
1. Use AWS Resource Access Manager to create a resource share for the subnet and share the
private subnet with Account B and Account C.
Account B and Account C can now see and create resources in the subnet that has been
shared with them.
2. Create an Amazon Keyspaces private VPC endpoint powered by AWS PrivateLink. This creates
multiple endpoints across shared subnets and DNS entries for the Amazon Keyspaces service
endpoint.
3. Create an Amazon Keyspaces keyspace and table.
4. Create an IAM role that has full access to the Amazon Keyspaces table, read access to the
Amazon Keyspaces system tables, and is able to describe the Amazon EC2 VPC resources as
shown in the following policy example.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CrossAccountAccess",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeVpcEndpoints",
                "cassandra:*"
            ],
            "Resource": "*"
        }
    ]
}
5. Configure the IAM role trust policy so that Account B and Account C can assume the role
as trusted accounts, as shown in the following example.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111111111111:root"
            },
            "Action": "sts:AssumeRole",
            "Condition": {}
        }
    ]
}
For more information about cross-account IAM policies, see Cross-account policies in the IAM
User Guide.
Configuration in Account B and Account C
1. In Account B and Account C, create new roles and attach the following policy that allows
the principal to assume the shared role created in Account A.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ec2.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
Allowing the principal to assume the shared role is implemented using the AssumeRole API of
the AWS Security Token Service (AWS STS). For more information, see Providing access to an
IAM user in another AWS account that you own in the IAM User Guide.
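For illustration, the following minimal sketch shows what the AssumeRole call looks like with the AWS SDK for Java v1. The role ARN is the shared role created in Account A; the class and session names are arbitrary placeholders.
import com.amazonaws.services.securitytoken.AWSSecurityTokenService;
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder;
import com.amazonaws.services.securitytoken.model.AssumeRoleRequest;
import com.amazonaws.services.securitytoken.model.Credentials;

public class CrossAccountAssumeRoleExample {
    public static void main(String[] args) {
        AWSSecurityTokenService sts = AWSSecurityTokenServiceClientBuilder.defaultClient();
        // Assume the shared role that Account A created for cross-account access.
        Credentials credentials = sts.assumeRole(new AssumeRoleRequest()
                .withRoleArn("arn:aws:iam::111122223333:role/my-iam-role")
                .withRoleSessionName("keyspaces-cross-account"))
                .getCredentials();
        System.out.println("Temporary credentials expire at " + credentials.getExpiration());
    }
}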
2. In Account B and Account C, you can create applications that use the SigV4
authentication plugin, which allows an application to assume the shared role to connect to the
Amazon Keyspaces table located in Account A through the VPC endpoint in the shared VPC.
For more information about the SigV4 authentication plugin, see the section called “Create
programmatic access credentials”.
Configuring cross-account access to Amazon Keyspaces without a
shared VPC
If the Amazon Keyspaces table and the private VPC endpoint are owned by different accounts
that don't share a VPC, applications can still connect cross-account by using the public endpoint
or by deploying a private VPC endpoint in each account. In this example, Account A, Account
B, and Account C each require their own VPC endpoint to access the table in Account A. When
using VPC endpoints in this configuration, Amazon Keyspaces appears as a single-node cluster to
the Cassandra client driver instead of a multi-node cluster. Upon connection, the client driver
reaches the DNS server, which returns one of the available endpoints in the account's VPC.
However, the client driver can't access the system.peers table to discover additional endpoints.
Because fewer hosts are available, the driver makes fewer connections. To adjust for this,
increase the connection pool setting of the driver by a factor of three.
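For example, with the DataStax Java driver 4.x, you can raise the local pool size programmatically. The following is a minimal sketch (the class name is hypothetical); the driver's default local pool size is 1, so setting it to 3 triples the connections per endpoint.
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.config.DefaultDriverOption;
import com.datastax.oss.driver.api.core.config.DriverConfigLoader;

public class PoolSizeExample {
    public static void main(String[] args) {
        // Triple the connections per endpoint because Amazon Keyspaces
        // appears as a single node in this configuration.
        DriverConfigLoader loader = DriverConfigLoader.programmaticBuilder()
                .withInt(DefaultDriverOption.CONNECTION_POOL_LOCAL_SIZE, 3)
                .build();
        try (CqlSession session = CqlSession.builder()
                .withConfigLoader(loader)
                .build()) {
            System.out.println("Session created with a larger connection pool");
        }
    }
}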
Getting started with Amazon Keyspaces (for Apache
Cassandra)
If you're new to Apache Cassandra and Amazon Keyspaces, this tutorial guides you through
installing the necessary programs and tools to use Amazon Keyspaces successfully. You'll learn
how to create a keyspace and table using Cassandra Query Language (CQL), the AWS Management
Console, or the AWS Command Line Interface (AWS CLI). You then use Cassandra Query Language
(CQL) to perform create, read, update, and delete (CRUD) operations on data in your Amazon
Keyspaces table.
This tutorial covers the following steps.
Prerequisites – Before starting the tutorial, follow the AWS setup instructions to sign up for
AWS and create an IAM user with access to Amazon Keyspaces. Then you set up the cqlsh-
expansion and AWS CloudShell. Alternatively, you can use the AWS CLI to create resources in
Amazon Keyspaces.
Step 1: Create a keyspace and table – In this section, you'll create a keyspace named "catalog"
and a table named "book_awards" within it. You'll specify the table's columns, data types,
partition key, and clustering column using the AWS Management Console, CQL, or the AWS CLI.
Step 2: Perform CRUD operations – Here, you'll use the cqlsh-expansion in CloudShell to
insert, read, update, and delete data in the "book_awards" table. You'll learn how to use various
CQL statements like SELECT, INSERT, UPDATE, and DELETE, and practice filtering and modifying
data.
Step 3: Clean up resources – To avoid incurring charges for unused resources, this section
guides you through deleting the "book_awards" table and "catalog" keyspace using the console,
CQL, or the AWS CLI.
For tutorials to connect programmatically to Amazon Keyspaces using different Apache Cassandra
client drivers, see the section called “Using a Cassandra client driver”. For code examples using
different AWS SDKs, see Code examples for Amazon Keyspaces using AWS SDKs.
Topics
Tutorial prerequisites and considerations
Create a keyspace in Amazon Keyspaces
Check keyspace creation status in Amazon Keyspaces
Create a table in Amazon Keyspaces
Check table creation status in Amazon Keyspaces
Create, read, update, and delete data (CRUD) using CQL in Amazon Keyspaces
Delete a table in Amazon Keyspaces
Delete a keyspace in Amazon Keyspaces
Tutorial prerequisites and considerations
Before you can get started with Amazon Keyspaces, follow the AWS setup instructions in Accessing
Amazon Keyspaces (for Apache Cassandra). These steps include signing up for AWS and creating an
AWS Identity and Access Management (IAM) user with access to Amazon Keyspaces.
To complete all the steps of the tutorial, you need to install cqlsh. You can follow the setup
instructions at Using cqlsh to connect to Amazon Keyspaces.
To access Amazon Keyspaces using cqlsh or the AWS CLI, we recommend using AWS CloudShell.
CloudShell is a browser-based, pre-authenticated shell that you can launch directly from the AWS
Management Console. You can run AWS Command Line Interface (AWS CLI) commands against
Amazon Keyspaces using your preferred shell (Bash, PowerShell or Z shell). To use cqlsh, you must
install the cqlsh-expansion. For cqlsh-expansion installation instructions, see the section
called “Using the cqlsh-expansion”. For more information about CloudShell, see the section
called “Using AWS CloudShell”.
To use the AWS CLI to create, view, and delete resources in Amazon Keyspaces, follow the setup
instructions at the section called “Downloading and Configuring the AWS CLI”.
After completing the prerequisite steps, proceed to Create a keyspace in Amazon Keyspaces.
Create a keyspace in Amazon Keyspaces
In this section, you create a keyspace using the console, cqlsh, or the AWS CLI.
Note
Before you begin, make sure that you have configured all the tutorial prerequisites.
A keyspace groups related tables that are relevant for one or more applications. A keyspace
contains one or more tables and defines the replication strategy for all the tables it contains. For
more information about keyspaces, see the following topics:
Data definition language (DDL) statements in the CQL language reference: Keyspaces
Quotas for Amazon Keyspaces (for Apache Cassandra)
In this tutorial we create a single-Region keyspace, and the replication strategy of the keyspace is
SingleRegionStrategy. Using SingleRegionStrategy, Amazon Keyspaces replicates data
across three Availability Zones in one AWS Region. To learn how to create multi-Region keyspaces,
see the section called “Create a multi-Region keyspace”.
Using the console
To create a keyspace using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at https://
console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Keyspaces.
3. Choose Create keyspace.
4. In the Keyspace name box, enter catalog as the name for your keyspace.
Name constraints:
The name can't be empty.
Allowed characters: alphanumeric characters and underscore ( _ ).
Maximum length is 48 characters.
5. Under AWS Regions, confirm that Single-Region replication is the replication strategy for the
keyspace.
6. To create the keyspace, choose Create keyspace.
7. Verify that the keyspace catalog was created by doing the following:
a. In the navigation pane, choose Keyspaces.
b. Locate your keyspace catalog in the list of keyspaces.
Using CQL
The following procedure creates a keyspace using CQL.
To create a keyspace using CQL
1. Open AWS CloudShell and connect to Amazon Keyspaces using the following command. Make
sure to update us-east-1 with your own Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
The output of that command should look like this.
Connected to Amazon Keyspaces at cassandra.us-east-1.amazonaws.com:9142
[cqlsh 6.1.0 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh current consistency level is ONE.
2. Create your keyspace using the following CQL command.
CREATE KEYSPACE catalog WITH REPLICATION = {'class': 'SingleRegionStrategy'};
SingleRegionStrategy uses a replication factor of three and replicates data across three
AWS Availability Zones in its Region.
Note
Amazon Keyspaces defaults all input to lowercase unless you enclose it in quotation
marks.
3. Verify that your keyspace was created.
SELECT * from system_schema.keyspaces;
The output of this command should look similar to this.
cqlsh> SELECT * from system_schema.keyspaces;
keyspace_name | durable_writes | replication
-------------------------+----------------
+-------------------------------------------------------------------------------------
system_schema | True | {'class':
'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
system_schema_mcs | True | {'class':
'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
system | True | {'class':
'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
system_multiregion_info | True | {'class':
'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
catalog | True | {'class':
'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
(5 rows)
Using the AWS CLI
The following procedure creates a keyspace using the AWS CLI.
To create a keyspace using the AWS CLI
1. To confirm that your environment is set up, you can run the following command in CloudShell.
aws keyspaces help
2. Create your keyspace using the following AWS CLI statement.
aws keyspaces create-keyspace --keyspace-name 'catalog'
3. Verify that your keyspace was created with the following AWS CLI statement.
aws keyspaces get-keyspace --keyspace-name 'catalog'
The output of this command should look similar to this example.
{
"keyspaceName": "catalog",
"resourceArn": "arn:aws:cassandra:us-east-1:123SAMPLE012:/keyspace/catalog/",
"replicationStrategy": "SINGLE_REGION"
}
Check keyspace creation status in Amazon Keyspaces
Amazon Keyspaces performs data definition language (DDL) operations, such as creating and
deleting keyspaces, asynchronously.
You can monitor the creation status of new keyspaces in the AWS Management Console, which
indicates when a keyspace is pending or active. You can also monitor the creation status of a new
keyspace programmatically by using the system_schema_mcs keyspace. A keyspace becomes
visible in the system_schema_mcs keyspaces table when it's ready for use.
The recommended design pattern to check when a new keyspace is ready for use is to poll the
Amazon Keyspaces system_schema_mcs keyspaces table (system_schema_mcs.*). For a list of
DDL statements for keyspaces, see the section called “Keyspaces” in the CQL language
reference.
The following query shows whether a keyspace has been successfully created.
SELECT * FROM system_schema_mcs.keyspaces WHERE keyspace_name = 'mykeyspace';
For a keyspace that has been successfully created, the output of the query looks like the following.
keyspace_name | durable_writes | replication
--------------+-----------------+--------------
mykeyspace | true |{...} 1 item
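If you create keyspaces programmatically, the same check can run as a polling loop with the DataStax Java driver. The following is a minimal sketch that assumes an existing CqlSession; the class name is hypothetical.
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;

public final class KeyspaceStatusPoller {
    // Wait until a new keyspace is visible in system_schema_mcs.keyspaces.
    public static void waitForKeyspace(CqlSession session, String keyspaceName)
            throws InterruptedException {
        PreparedStatement ps = session.prepare(
                "SELECT keyspace_name FROM system_schema_mcs.keyspaces WHERE keyspace_name = ?");
        while (session.execute(ps.bind(keyspaceName)).one() == null) {
            Thread.sleep(1000); // poll once per second
        }
    }
}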
Create a table in Amazon Keyspaces
In this section, you create a table using the console, cqlsh, or the AWS CLI.
A table is where your data is organized and stored. The primary key of your table determines how
data is partitioned in your table. The primary key is composed of a required partition key and one
or more optional clustering columns. The combined values that compose the primary key must be
unique across all the table’s data. For more information about tables, see the following topics:
Partition key design: the section called “Partition key design”
Working with tables: the section called “Check table creation status”
DDL statements in the CQL language reference: Tables
Table resource management: Managing serverless resources
Monitoring table resource utilization: the section called “Monitoring with CloudWatch”
Quotas for Amazon Keyspaces (for Apache Cassandra)
When you create a table, you specify the following:
The name of the table.
The name and data type of each column in the table.
The primary key for the table.
Partition key – Required
Clustering columns – Optional
Use the following procedure to create a table with the specified columns, data types, partition
keys, and clustering columns.
Using the console
The following procedure creates the table book_awards with these columns and data types.
year int
award text
rank int
category text
book_title text
author text
publisher text
To create a table using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at https://
console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Keyspaces.
3. Choose catalog as the keyspace you want to create this table in.
4. Choose Create table.
5. In the Table name box, enter book_awards as a name for your table.
Name constraints:
The name can't be empty.
Allowed characters: alphanumeric characters and underscore ( _ ).
Maximum length is 48 characters.
6. In the Columns section, repeat the following steps for each column that you want to add to
this table.
Add the following columns and data types.
year int
award text
rank int
category text
book_title text
author text
publisher text
a. Name – Enter a name for the column.
Name constraints:
The name can't be empty.
Allowed characters: alphanumeric characters and underscore ( _ ).
Maximum length is 48 characters.
b. Type – In the list of data types, choose the data type for this column.
c. To add another column, choose Add column.
7. Choose award and year as the partition keys under Partition Key. A partition key is required
for each table. A partition key can be made of one or more columns.
8. Add category and rank as Clustering columns. Clustering columns are optional and
determine the sort order within each partition.
a. To add a clustering column, choose Add clustering column.
b. In the Column list, choose category. In the Order list, choose ASC to sort in ascending
order on the values in this column. (Choose DESC for descending order.)
c. Then select Add clustering column and choose rank.
9. In the Table settings section, choose Default settings.
10. Choose Create table.
11. Verify that your table was created.
a. In the navigation pane, choose Tables.
b. Confirm that your table is in the list of tables.
c. Choose the name of your table.
d. Confirm that all your columns and data types are correct.
Note
The columns might not be listed in the same order that you added them to the
table.
Using CQL
This procedure creates a table with the following columns and data types using CQL. The year and
award columns are partition keys, with category and rank as clustering columns; together, they
make up the primary key of the table.
year int
award text
rank int
category text
book_title text
author text
publisher text
To create a table using CQL
1. Open AWS CloudShell and connect to Amazon Keyspaces using the following command. Make
sure to update us-east-1 with your own Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
The output of that command should look like this.
Connected to Amazon Keyspaces at cassandra.us-east-1.amazonaws.com:9142
[cqlsh 6.1.0 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh current consistency level is ONE.
2. At the keyspace prompt (cqlsh:keyspace_name>), create your table by entering the
following code into your command window.
CREATE TABLE catalog.book_awards (
    year int,
    award text,
    rank int,
    category text,
    book_title text,
    author text,
    publisher text,
    PRIMARY KEY ((year, award), category, rank)
);
Note
ASC is the default clustering order. You can also specify DESC for descending order.
Note that year and award are the partition key columns, and category and rank are the
clustering columns, sorted in ascending order (ASC). Together, these columns form the
primary key of the table.
3. Verify that your table was created.
SELECT keyspace_name, table_name, status FROM system_schema_mcs.tables WHERE
keyspace_name = 'catalog' AND table_name = 'book_awards';

The output should look similar to this after the table becomes active.

 keyspace_name | table_name  | status
---------------+-------------+--------
       catalog | book_awards | ACTIVE

(1 rows)
4. Verify your table's structure.
SELECT * FROM system_schema.columns WHERE keyspace_name = 'catalog' AND table_name
= 'book_awards';
The output of this statement should look similar to this example.
keyspace_name | table_name | column_name | clustering_order | column_name_bytes
| kind | position | type
---------------+-------------+-------------+------------------
+------------------------+---------------+----------+------
catalog | book_awards | year | none |
0x79656172 | partition_key | 0 | int
catalog | book_awards | award | none |
0x6177617264 | partition_key | 1 | text
catalog | book_awards | category | asc |
0x63617465676f7279 | clustering | 0 | text
catalog | book_awards | rank | asc |
0x72616e6b | clustering | 1 | int
catalog | book_awards | author | none |
0x617574686f72 | regular | -1 | text
catalog | book_awards | book_title | none |
0x626f6f6b5f7469746c65 | regular | -1 | text
catalog | book_awards | publisher | none |
0x7075626c6973686572 | regular | -1 | text
(7 rows)
Confirm that all the columns and data types are as you expected. The order of the columns
might be different than in the CREATE statement.
Using the AWS CLI
This procedure creates a table with the following columns and data types using the AWS CLI.
The year and award columns make up the partition key with category and rank as clustering
columns.
year int
award text
rank int
category text
book_title text
author text
publisher text
To create a table using the AWS CLI
The following command creates a table with the name book_awards. The partition key of the
table consists of the columns year and award, and the clustering key consists of the columns
category and rank; both clustering columns use the ascending sort order. (For easier readability,
the schema-definition of the table create command in this section is broken into separate
lines.)
1. You can create the table using the following statement.
aws keyspaces create-table --keyspace-name 'catalog' \
--table-name 'book_awards' \
--schema-definition 'allColumns=[{name=year,type=int},
{name=award,type=text},{name=rank,type=int},
{name=category,type=text}, {name=author,type=text},
{name=book_title,type=text},{name=publisher,type=text}],
partitionKeys=[{name=year},
{name=award}],clusteringKeys=[{name=category,orderBy=ASC},{name=rank,orderBy=ASC}]'
This command results in the following output.
{
"resourceArn": "arn:aws:cassandra:us-east-1:111222333444:/keyspace/catalog/
table/book_awards"
}
2. To confirm the metadata and properties of the table, you can use the following command.
aws keyspaces get-table --keyspace-name 'catalog' --table-name 'book_awards'
This command returns the following output.
{
"keyspaceName": "catalog",
"tableName": "book_awards",
"resourceArn": "arn:aws:cassandra:us-east-1:123SAMPLE012:/keyspace/catalog/
table/book_awards",
"creationTimestamp": "2024-07-11T15:12:55.571000+00:00",
"status": "ACTIVE",
"schemaDefinition": {
"allColumns": [
{
"name": "year",
"type": "int"
},
{
"name": "award",
"type": "text"
},
{
"name": "category",
"type": "text"
},
{
"name": "rank",
"type": "int"
},
{
"name": "author",
"type": "text"
},
{
"name": "book_title",
"type": "text"
},
{
"name": "publisher",
"type": "text"
}
],
"partitionKeys": [
{
"name": "year"
},
{
"name": "award"
}
],
"clusteringKeys": [
{
"name": "category",
"orderBy": "ASC"
},
{
"name": "rank",
"orderBy": "ASC"
}
],
"staticColumns": []
},
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": "2024-07-11T15:12:55.571000+00:00"
},
"encryptionSpecification": {
"type": "AWS_OWNED_KMS_KEY"
},
"pointInTimeRecovery": {
"status": "DISABLED"
},
"defaultTimeToLive": 0,
"comment": {
"message": ""
},
"replicaSpecifications": []
}
To perform CRUD (create, read, update, and delete) operations on the data in your table, proceed
to the section called “CRUD operations”.
Check table creation status in Amazon Keyspaces
Amazon Keyspaces performs data definition language (DDL) operations, such as creating and
deleting tables, asynchronously. You can monitor the creation status of new tables in the AWS
Management Console, which indicates when a table is pending or active. You can also monitor the
creation status of a new table programmatically by using the system schema table.
A table shows as active in the system schema when it's ready for use. The recommended design
pattern to check when a new table is ready for use is to poll the Amazon Keyspaces system schema
tables (system_schema_mcs.*). For a list of DDL statements for tables, see the the section called
“Tables” section in the CQL language reference.
The following query shows the status of a table.
SELECT keyspace_name, table_name, status FROM system_schema_mcs.tables WHERE
keyspace_name = 'mykeyspace' AND table_name = 'mytable';
For a table that is still being created and is pending, the output of the query looks like this.
keyspace_name | table_name | status
--------------+------------+--------
mykeyspace | mytable | CREATING
For a table that has been successfully created and is active, the output of the query looks like the
following.
keyspace_name | table_name | status
--------------+------------+--------
mykeyspace | mytable | ACTIVE
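Programmatically, you can poll the status column with the DataStax Java driver until it reports ACTIVE. The following is a minimal sketch that assumes an existing CqlSession; the class name is hypothetical.
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import com.datastax.oss.driver.api.core.cql.Row;

public final class TableStatusPoller {
    // Wait until the new table's status is ACTIVE in system_schema_mcs.tables.
    public static void waitForTable(CqlSession session, String keyspace, String table)
            throws InterruptedException {
        PreparedStatement ps = session.prepare(
                "SELECT status FROM system_schema_mcs.tables "
                        + "WHERE keyspace_name = ? AND table_name = ?");
        while (true) {
            Row row = session.execute(ps.bind(keyspace, table)).one();
            if (row != null && "ACTIVE".equals(row.getString("status"))) {
                return;
            }
            Thread.sleep(1000); // poll once per second
        }
    }
}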
Create, read, update, and delete data (CRUD) using CQL in
Amazon Keyspaces
In this step of the tutorial, you'll learn how to insert, read, update, and delete data in an Amazon
Keyspaces table using CQL data manipulation language (DML) statements. In Amazon Keyspaces,
you can run DML statements only in CQL. In this tutorial, you'll practice running DML
statements using the cqlsh-expansion with AWS CloudShell in the AWS Management Console.
Inserting data – This section covers inserting single and multiple records into a table using the
INSERT statement. You'll learn how to upload data from a CSV file and verify successful inserts
using SELECT queries.
Reading data – Here, you'll explore different variations of the SELECT statement to retrieve data
from a table. Topics include selecting all data, selecting specific columns, filtering rows based on
conditions using the WHERE clause, and understanding simple and compound conditions.
Updating data – In this section, you'll learn how to modify existing data in a table using the
UPDATE statement. You'll practice updating single and multiple columns while understanding
restrictions around updating primary key columns.
Deleting data – The final section covers deleting data from a table using the DELETE statement.
You'll learn how to delete specific cells, entire rows, and the implications of deleting data versus
deleting the entire table or keyspace.
Throughout the tutorial, you'll find examples, tips, and opportunities to practice writing your own
CQL queries for various scenarios.
Topics
Inserting and loading data into an Amazon Keyspaces table
Read data from a table using the CQL SELECT statement in Amazon Keyspaces
Update data in an Amazon Keyspaces table using CQL
Delete data from a table using the CQL DELETE statement
Inserting and loading data into an Amazon Keyspaces table
To create data in your book_awards table, use the INSERT statement to add a single row.
1. Open AWS CloudShell and connect to Amazon Keyspaces using the following command. Make
sure to update us-east-1 with your own Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
The output of that command should look like this.
Connected to Amazon Keyspaces at cassandra.us-east-1.amazonaws.com:9142
[cqlsh 6.1.0 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh current consistency level is ONE.
2. Before you can write data to your Amazon Keyspaces table using cqlsh, you must set the write
consistency for the current cqlsh session to LOCAL_QUORUM. For more information about
supported consistency levels, see the section called “Write consistency levels”. Note that this
step is not required if you are using the CQL editor in the AWS Management Console.
CONSISTENCY LOCAL_QUORUM;
3. To insert a single record, run the following command in the CQL editor.
INSERT INTO catalog.book_awards (award, year, category, rank, author, book_title,
publisher)
VALUES ('Wolf', 2023, 'Fiction',3,'Shirley Rodriguez','Mountain', 'AnyPublisher') ;
4. Verify that the data was correctly added to your table by running the following command.
SELECT * FROM catalog.book_awards ;
The output of the statement should look like this.
year | award | category | rank | author | book_title | publisher
------+-------+----------+------+-------------------+------------+--------------
2023 | Wolf | Fiction | 3 | Shirley Rodriguez | Mountain | AnyPublisher
(1 rows)
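The same insert can be issued from application code. The following is a minimal sketch using the DataStax Java driver with a prepared statement and the LOCAL_QUORUM write consistency; it assumes an existing CqlSession, and the class name is hypothetical.
import com.datastax.oss.driver.api.core.ConsistencyLevel;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;

public final class AwardWriter {
    // Insert one row, using the write consistency required by Amazon Keyspaces.
    public static void insertAward(CqlSession session) {
        PreparedStatement ps = session.prepare(
                "INSERT INTO catalog.book_awards "
                        + "(award, year, category, rank, author, book_title, publisher) "
                        + "VALUES (?, ?, ?, ?, ?, ?, ?)");
        session.execute(ps.bind("Wolf", 2023, "Fiction", 3,
                        "Shirley Rodriguez", "Mountain", "AnyPublisher")
                .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM));
    }
}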
To insert multiple records from a file using cqlsh
1. Download the sample CSV file (keyspaces_sample_table.csv) contained in
the archive file samplemigration.zip. Unzip the archive and take note of the path to
keyspaces_sample_table.csv.
2. Open AWS CloudShell in the AWS Management Console and connect to Amazon Keyspaces
using the following command. Make sure to update us-east-1 with your own Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
3. At the cqlsh prompt (cqlsh>), specify a keyspace.
USE catalog ;
4. Set write consistency to LOCAL_QUORUM. For more information about supported consistency
levels, see the section called “Write consistency levels”.
CONSISTENCY LOCAL_QUORUM;
5. In AWS CloudShell, choose Actions on the top right side of the screen, and then choose
Upload file to upload the CSV file you downloaded earlier. Take note of the path to the file.
6. At the keyspace prompt (cqlsh:catalog>), run the following statement.
COPY book_awards (award, year, category, rank, author, book_title, publisher) FROM
'/home/cloudshell-user/keyspaces_sample_table.csv' WITH header=TRUE ;
The output of the statement should look similar to this.
cqlsh:catalog> COPY book_awards (award, year, category, rank, author,
book_title, publisher) FROM '/home/cloudshell-user/
keyspaces_sample_table.csv' WITH delimiter=',' AND header=TRUE ;
cqlsh current consistency level is LOCAL_QUORUM.
Reading options from /home/cloudshell-user/.cassandra/cqlshrc:[copy]:
{'numprocesses': '16', 'maxattempts': '1000'}
Reading options from /home/cloudshell-user/.cassandra/cqlshrc:[copy-from]:
{'ingestrate': '1500', 'maxparseerrors': '1000', 'maxinserterrors': '-1',
'maxbatchsize': '10', 'minbatchsize': '1', 'chunksize': '30'}
Reading options from the command line: {'delimiter': ',', 'header': 'TRUE'}
Using 16 child processes
Starting copy of catalog.book_awards with columns [award, year, category, rank,
author, book_title, publisher].
OSError: handle is closed 0 rows/s; Avg. rate: 0 rows/s
Processed: 9 rows; Rate: 0 rows/s; Avg. rate: 0 rows/s
9 rows imported from 1 files in 0 day, 0 hour, 0 minute, and 26.706 seconds (0
skipped).
7. Verify that the data was correctly added to your table by running the following query.
SELECT * FROM book_awards ;
You should see the following output.
year | award | category | rank | author | book_title
| publisher
------+------------------+-------------+------+--------------------
+-----------------------+---------------
2020 | Wolf | Non-Fiction | 1 | Wang Xiulan | History
of Ideas | AnyPublisher
2020 | Wolf | Non-Fiction | 2 | Ana Carolina Silva |
Science Today | SomePublisher
2020 | Wolf | Non-Fiction | 3 | Shirley Rodriguez | The Future of
Sea Ice | AnyPublisher
2020 | Kwesi Manu Prize | Fiction | 1 | Akua Mansa | Where did
you go? | SomePublisher
2020 | Kwesi Manu Prize | Fiction | 2 | John Stiles |
Yesterday | Example Books
2020 | Kwesi Manu Prize | Fiction | 3 | Nikki Wolf | Moving to the
Chateau | AnyPublisher
2020 | Richard Roe | Fiction | 1 | Alejandro Rosalez | Long
Summer | SomePublisher
2020 | Richard Roe | Fiction | 2 | Arnav Desai |
The Key | Example Books
2020 | Richard Roe | Fiction | 3 | Mateo Jackson | Inside
the Whale | AnyPublisher
(9 rows)
To learn more about using cqlsh COPY to upload data from CSV files to an Amazon Keyspaces
table, see the section called “Loading data using cqlsh”.
Read data from a table using the CQL SELECT statement in Amazon
Keyspaces
In the Inserting and loading data into an Amazon Keyspaces table section, you used the SELECT
statement to verify that you had successfully added data to your table. In this section, you refine
your use of SELECT to display specific columns, and only rows that meet specific criteria.
The general form of the SELECT statement is as follows.
SELECT column_list FROM table_name [WHERE condition [ALLOW FILTERING]] ;
Topics
Select all the data in your table
Select a subset of columns
Select a subset of rows
Select all the data in your table
The simplest form of the SELECT statement returns all the data in your table.
Important
In a production environment, it's typically not a best practice to run this command, because
it returns all the data in your table.
To select all your table's data
1. Open AWS CloudShell and connect to Amazon Keyspaces using the following command. Make
sure to update us-east-1 with your own Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
2. Run the following query.
SELECT * FROM catalog.book_awards ;
Using the wild-card character ( * ) for the column_list selects all columns. The output of the
statement looks like the following example.
year | award | category | rank | author | book_title
| publisher
------+------------------+-------------+------+--------------------
+-----------------------+---------------
2020 | Wolf | Non-Fiction | 1 | Wang Xiulan | History
of Ideas | AnyPublisher
2020 | Wolf | Non-Fiction | 2 | Ana Carolina Silva |
Science Today | SomePublisher
2020 | Wolf | Non-Fiction | 3 | Shirley Rodriguez | The Future of
Sea Ice | AnyPublisher
2020 | Kwesi Manu Prize | Fiction | 1 | Akua Mansa | Where did
you go? | SomePublisher
2020 | Kwesi Manu Prize | Fiction | 2 | John Stiles |
Yesterday | Example Books
2020 | Kwesi Manu Prize | Fiction | 3 | Nikki Wolf | Moving to the
Chateau | AnyPublisher
2020 | Richard Roe | Fiction | 1 | Alejandro Rosalez | Long
Summer | SomePublisher
2020 | Richard Roe | Fiction | 2 | Arnav Desai |
The Key | Example Books
2020 | Richard Roe | Fiction | 3 | Mateo Jackson | Inside
the Whale | AnyPublisher
Select a subset of columns
To query for a subset of columns
1. Open AWS CloudShell and connect to Amazon Keyspaces using the following command. Make
sure to update us-east-1 with your own Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
2. To retrieve only the award, category, and year columns, run the following query.
SELECT award, category, year FROM catalog.book_awards ;
The output contains only the specified columns in the order listed in the SELECT statement.
award | category | year
------------------+-------------+------
Wolf | Non-Fiction | 2020
Wolf | Non-Fiction | 2020
Wolf | Non-Fiction | 2020
Kwesi Manu Prize | Fiction | 2020
Kwesi Manu Prize | Fiction | 2020
Kwesi Manu Prize | Fiction | 2020
Richard Roe | Fiction | 2020
Richard Roe | Fiction | 2020
Richard Roe | Fiction | 2020
Select a subset of rows
When querying a large dataset, you might only want records that meet certain criteria. To do this,
you can append a WHERE clause to the end of your SELECT statement.
To query for a subset of rows
1. Open AWS CloudShell and connect to Amazon Keyspaces using the following command. Make
sure to update us-east-1 with your own Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
2. To retrieve only the records for the awards of a given year, run the following query.
SELECT * FROM catalog.book_awards WHERE year=2020 AND award='Wolf' ;
The preceding SELECT statement returns the following output.
year | award | category | rank | author | book_title |
publisher
------+-------+-------------+------+--------------------+-----------------------
+---------------
2020 | Wolf | Non-Fiction | 1 | Wang Xiulan | History of Ideas |
AnyPublisher
2020 | Wolf | Non-Fiction | 2 | Ana Carolina Silva | Science Today |
SomePublisher
2020 | Wolf | Non-Fiction | 3 | Shirley Rodriguez | The Future of Sea Ice |
AnyPublisher
Understanding the WHERE clause
The WHERE clause is used to filter the data and return only the data that meets the specified
criteria. The specified criteria can be a simple condition or a compound condition.
How to use conditions in a WHERE clause
A simple condition – A single column.
WHERE column_name=value
You can use a simple condition in a WHERE clause if any of the following conditions are met:
The column is the only partition key column of the table.
You add ALLOW FILTERING after the condition in the WHERE clause.
Be aware that using ALLOW FILTERING can result in inconsistent performance, especially with
large and multi-partitioned tables.
A compound condition – Multiple simple conditions connected by AND.
WHERE column_name1=value1 AND column_name2=value2 AND column_name3=value3...
You can use compound conditions in a WHERE clause if any of the following conditions are met:
The columns you can use in the WHERE clause need to include either all or a subset of the
columns in the table's partition key. If you want to use only a subset of the columns in the
WHERE clause, you must include a contiguous set of partition key columns from left to right,
beginning with the partition key's leading column. For example, if the partition key columns
are year, month, and award then you can use the following columns in the WHERE clause:
year
year AND month
year AND month AND award
You add ALLOW FILTERING after the compound condition in the WHERE clause, as in the
following example.
SELECT * FROM my_table WHERE col1=5 AND col2='Bob' ALLOW FILTERING ;
Be aware that using ALLOW FILTERING can result in inconsistent performance, especially with
large and multi-partitioned tables.
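In application code, a filtered read looks like the following DataStax Java driver sketch. Because year and award make up the full partition key, no ALLOW FILTERING is needed. The sketch assumes an existing CqlSession; the class name is hypothetical.
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;

public final class AwardQueries {
    // Filter on the full partition key (year, award).
    public static void printWolfWinners(CqlSession session) {
        PreparedStatement ps = session.prepare(
                "SELECT book_title, author FROM catalog.book_awards "
                        + "WHERE year = ? AND award = ?");
        ResultSet rs = session.execute(ps.bind(2020, "Wolf"));
        for (Row row : rs) {
            System.out.println(row.getString("book_title")
                    + " by " + row.getString("author"));
        }
    }
}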
Try it
Create your own CQL queries to find the following from your book_awards table:
Find the winners of the 2020 Wolf awards and display the book titles and authors, ordered by
rank.
Show the first prize winners for all awards in 2020 and display the book titles and award names.
Update data in an Amazon Keyspaces table using CQL
To update the data in your book_awards table, use the UPDATE statement.
The general form of the UPDATE statement is as follows.
UPDATE table_name SET column_name=new_value WHERE primary_key=value ;
Tip
You can update multiple columns by using a comma-separated list of column_names
and values, as in the following example.
UPDATE my_table SET col1='new_value_1', col2='new_value2' WHERE col3='1' ;
If the primary key is composed of multiple columns, all primary key columns and their
values must be included in the WHERE clause.
You cannot update any column in the primary key because that would change the
primary key for the record.
To update a single cell
Using your book_awards table, change the name of the publisher for the winner of the non-fiction
Wolf award in 2020.
UPDATE book_awards SET publisher='new Books' WHERE year = 2020 AND award='Wolf' AND
category='Non-Fiction' AND rank=1;
Verify that the publisher is now new Books.
SELECT * FROM book_awards WHERE year = 2020 AND award='Wolf' AND category='Non-Fiction'
AND rank=1;
The statement should return the following output.
year | award | category | rank | author | book_title | publisher
------+-------+-------------+------+-------------+------------------+-----------
2020 | Wolf | Non-Fiction | 1 | Wang Xiulan | History of Ideas | new Books
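The same update can be issued with the DataStax Java driver. Note that all primary key columns appear in the WHERE clause. The following minimal sketch assumes an existing CqlSession; the class name is hypothetical.
import com.datastax.oss.driver.api.core.ConsistencyLevel;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;

public final class AwardUpdater {
    // Update a non-key column; the WHERE clause lists the full primary key.
    public static void updatePublisher(CqlSession session) {
        PreparedStatement ps = session.prepare(
                "UPDATE catalog.book_awards SET publisher = ? "
                        + "WHERE year = ? AND award = ? AND category = ? AND rank = ?");
        session.execute(ps.bind("new Books", 2020, "Wolf", "Non-Fiction", 1)
                .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM));
    }
}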
Try it
Advanced: The winner of the 2020 fiction "Kwesi Manu Prize" has changed their name. Update this
record to change the name to 'Akua Mansa-House'.
Delete data from a table using the CQL DELETE statement
To delete data in your book_awards table, use the DELETE statement.
You can delete data from a row or from a partition. Be careful when deleting data, because
deletions are irreversible.
Deleting one or all rows from a table doesn't delete the table, so you can repopulate it with data.
Deleting a table deletes the table and all data in it. To use the table again, you must re-create it
and add data to it. Deleting a keyspace deletes the keyspace and all tables within it. To use the
keyspace and tables again, you must re-create them, and then populate them with data. You can use
Amazon Keyspaces point-in-time recovery (PITR) to help restore deleted tables. To learn more, see
the section called “Backup and restore with point-in-time recovery”. To learn how to restore a
deleted table with PITR enabled, see the section called “Restore a deleted table”.
Delete cells
Deleting a column from a row removes the data from the specified cell. When you display that
column using a SELECT statement, the data is displayed as null, though a null value is not stored
in that location.
The general syntax to delete one or more specific columns is as follows.
DELETE column_name1[, column_name2...] FROM table_name WHERE condition ;
In your book_awards table, you can see that the title of the book that won first prize in the
2020 "Richard Roe" award is "Long Summer". Imagine that this title has been recalled, and you need
to delete the data from this cell.
To delete a specific cell
1. Open AWS CloudShell and connect to Amazon Keyspaces using the following command. Make
sure to update us-east-1 with your own Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
2. Run the following DELETE query.
DELETE book_title FROM catalog.book_awards WHERE year=2020 AND award='Richard Roe'
AND category='Fiction' AND rank=1;
3. Verify that the delete request was made as expected.
SELECT * FROM catalog.book_awards WHERE year=2020 AND award='Richard Roe' AND
category='Fiction' AND rank=1;
The output of this statement looks like this.
year | award | category | rank | author | book_title | publisher
------+-------------+----------+------+-------------------+------------
+---------------
2020 | Richard Roe | Fiction | 1 | Alejandro Rosalez | null |
SomePublisher
Delete rows
There might be a time when you need to delete an entire row, for example to meet a data deletion
request. The general syntax for deleting a row is as follows.
DELETE FROM table_name WHERE condition ;
To delete a row
1. Open AWS CloudShell and connect to Amazon Keyspaces using the following command. Make
sure to update us-east-1 with your own Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
2. Run the following DELETE query.
DELETE FROM catalog.book_awards WHERE year=2020 AND award='Richard Roe' AND
category='Fiction' AND rank=1;
3. Verify that the delete was made as expected.
SELECT * FROM catalog.book_awards WHERE year=2020 AND award='Richard Roe' AND
category='Fiction' AND rank=1;
The output of this statement looks like this after the row has been deleted.
year | award | category | rank | author | book_title | publisher
------+-------+----------+------+--------+------------+-----------
(0 rows)
You can delete expired data automatically from your table using Amazon Keyspaces Time to Live.
For more information, see the section called “Expire data with Time to Live”.
Delete a table in Amazon Keyspaces
To avoid being charged for tables and data that you don't need, delete all the tables that you're not
using. When you delete a table, the table and its data are deleted and you stop accruing charges
for them. However, the keyspace remains. When you delete a keyspace, the keyspace and all its
tables are deleted and you stop accruing charges for them.
You can delete a table using the console, CQL, or the AWS CLI.
Using the console
The following procedure deletes a table and all its data using the AWS Management Console.
To delete a table using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at https://
console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables.
3. Choose the box to the left of the name of each table that you want to delete.
4. Choose Delete.
5. On the Delete table screen, enter Delete in the box. Then, choose Delete table.
6. To verify that the table was deleted, choose Tables in the navigation pane, and confirm that
the book_awards table is no longer listed.
Using CQL
The following procedure deletes a table and all its data using CQL.
To delete a table using CQL
1. Open AWS CloudShell and connect to Amazon Keyspaces using the following command. Make
sure to update us-east-1 with your own Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
2. Delete your table by entering the following statement.
DROP TABLE IF EXISTS catalog.book_awards ;
3. Verify that your table was deleted.
SELECT * FROM system_schema.tables WHERE keyspace_name = 'catalog' ;
The output should look like this. Note that this might take some time, so re-run the statement
after a minute if you don't see this result.
keyspace_name | table_name | bloom_filter_fp_chance | caching | cdc | comment
| compaction | compression | crc_check_chance | dclocal_read_repair_chance
| default_time_to_live | extensions | flags | gc_grace_seconds | id |
max_index_interval | memtable_flush_period_in_ms | min_index_interval |
read_repair_chance | speculative_retry
---------------+------------+------------------------+---------+-----+---------
+------------+-------------+------------------+----------------------------
+----------------------+------------+-------+------------------+----
+--------------------+-----------------------------+--------------------
+--------------------+-------------------
(0 rows)
Using the AWS CLI
The following procedure deletes a table and all its data using the AWS CLI.
To delete a table using the AWS CLI
1. Open AWS CloudShell.
2. Delete your table with the following statement.
aws keyspaces delete-table --keyspace-name 'catalog' --table-name 'book_awards'
3. To verify that your table was deleted, you can list all tables in a keyspace.
aws keyspaces list-tables --keyspace-name 'catalog'
You should see the following output. Note that this asynchronous operation can take some
time. Re-run the command after a short while to confirm that the table has been deleted.
{
"tables": []
}
Delete a keyspace in Amazon Keyspaces
To avoid being charged for keyspaces, delete all the keyspaces that you're not using. When you
delete a keyspace, the keyspace and all its tables are deleted and you stop accruing charges for
them.
You can delete a keyspace using the console, CQL, or the AWS CLI.
Using the console
The following procedure deletes a keyspace and all its tables and data using the console.
To delete a keyspace using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at https://
console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Keyspaces.
3. Choose the box to the left of the name of each keyspace that you want to delete.
4. Choose Delete.
5. On the Delete keyspace screen, enter Delete in the box. Then, choose Delete keyspace.
6. To verify that the keyspace catalog was deleted, choose Keyspaces in the navigation pane
and confirm that it is no longer listed. Because you deleted its keyspace, the book_awards
table under Tables should also not be listed.
Using CQL
The following procedure deletes a keyspace and all its tables and data using CQL.
To delete a keyspace using CQL
1. Open AWS CloudShell and connect to Amazon Keyspaces using the following command. Make
sure to update us-east-1 with your own Region.
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
2. Delete your keyspace by entering the following statement.
DROP KEYSPACE IF EXISTS catalog ;
3. Verify that your keyspace was deleted.
SELECT * from system_schema.keyspaces ;
Your keyspace should not be listed. Note that because this is an asynchronous operation, there
can be a delay until the keyspace is deleted. After the keyspace has been deleted, the output
of the statement should look like this.
keyspace_name | durable_writes | replication
-------------------------+----------------
+-------------------------------------------------------------------------------------
system_schema | True | {'class':
'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
system_schema_mcs | True | {'class':
'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
system | True | {'class':
'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
system_multiregion_info | True | {'class':
'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
(4 rows)
Using the AWS CLI
The following procedure deletes a keyspace and all its tables and data using the AWS CLI.
To delete a keyspace using the AWS CLI
1. Open AWS CloudShell.
2. Delete your keyspace by entering the following statement.
aws keyspaces delete-keyspace --keyspace-name 'catalog'
3. Verify that your keyspace was deleted.
aws keyspaces list-keyspaces
The output of this statement should look similar to this. Note that because this is an
asynchronous operation, there can be a delay until the keyspace is deleted.
{
"keyspaces": [
{
"keyspaceName": "system_schema",
"resourceArn": "arn:aws:cassandra:us-east-1:123SAMPLE012:/keyspace/
system_schema/",
"replicationStrategy": "SINGLE_REGION"
},
{
"keyspaceName": "system_schema_mcs",
"resourceArn": "arn:aws:cassandra:us-east-1:123SAMPLE012:/keyspace/
system_schema_mcs/",
"replicationStrategy": "SINGLE_REGION"
},
{
"keyspaceName": "system",
"resourceArn": "arn:aws:cassandra:us-east-1:123SAMPLE012:/keyspace/
system/",
"replicationStrategy": "SINGLE_REGION"
},
{
"keyspaceName": "system_multiregion_info",
"resourceArn": "arn:aws:cassandra:us-east-1:123SAMPLE012:/keyspace/
system_multiregion_info/",
"replicationStrategy": "SINGLE_REGION"
}
]
}
Managing serverless resources in Amazon Keyspaces (for
Apache Cassandra)
Amazon Keyspaces (for Apache Cassandra) is serverless. Instead of deploying, managing, and
maintaining storage and compute resources for your workload through nodes in a cluster, Amazon
Keyspaces allocates storage and read/write throughput resources directly to tables.
Amazon Keyspaces provisions storage automatically based on the data stored in your tables.
It scales storage up and down as you write, update, and delete data, and you pay only for the
storage you use. Data is replicated across multiple Availability Zones for high availability. Amazon
Keyspaces monitors the size of your tables continuously to determine your storage charges. For
more information about how Amazon Keyspaces calculates the billable size of the data, see the
section called “Estimate row size”.
This chapter covers key aspects of resource management in Amazon Keyspaces.
Estimate row size – To estimate the encoded size of rows in Amazon Keyspaces, consider factors
like partition key metadata, clustering column metadata, column identifiers, data types, and
row metadata. This encoded row size is used for billing, quota management, and provisioned
throughput capacity planning.
Estimate capacity consumption – This section covers examples of how to estimate read and
write capacity consumption for common scenarios like range queries, limit queries, table
scans, lightweight transactions, static columns, and multi-Region tables. You can use Amazon
CloudWatch to monitor actual capacity utilization. For more information about monitoring with
CloudWatch, see the section called “Monitoring with CloudWatch”.
Configure read/write capacity modes – You can choose between two capacity modes for
processing reads and writes on your tables:
On-demand mode (default) – Pay per request for read and write throughput. Amazon
Keyspaces can instantly scale capacity up to any previously reached traffic level.
Provisioned mode – Specify the required number of read and write capacity units in advance.
This mode helps maintain predictable throughput performance.
Manage throughput capacity with automatic scaling – For provisioned tables, you can enable
automatic scaling to adjust throughput capacity automatically based on actual application
traffic. Amazon Keyspaces uses target tracking to increase or decrease provisioned capacity,
keeping utilization at your specified target.
Use burst capacity effectively – Amazon Keyspaces provides burst capacity by reserving a
portion of unused throughput for handling spikes in traffic. This flexibility allows occasional
bursts of activity beyond your provisioned throughput.
To troubleshoot capacity errors, see the section called “Serverless capacity errors”.
Topics
Estimate row size in Amazon Keyspaces
Estimate capacity consumption of read and write throughput in Amazon Keyspaces
Configure read/write capacity modes in Amazon Keyspaces
Manage throughput capacity automatically with Amazon Keyspaces auto scaling
Use burst capacity effectively in Amazon Keyspaces
Estimate row size in Amazon Keyspaces
Amazon Keyspaces provides fully managed storage that offers single-digit millisecond read and
write performance and stores data durably across multiple AWS Availability Zones. Amazon
Keyspaces attaches metadata to all rows and primary key columns to support efficient data access
and high availability.
This section provides details about how to estimate the encoded size of rows in Amazon Keyspaces.
The encoded row size is used when calculating your bill and quota use. You should also use the
encoded row size when calculating provisioned throughput capacity requirements for tables. To
calculate the encoded size of rows in Amazon Keyspaces, you can use the following guidelines.
For regular columns, which are columns that aren't primary keys, clustering columns, or STATIC
columns, use the raw size of the cell data based on the data type and add the required metadata.
For more information about the data types supported in Amazon Keyspaces, see the section
called “Data types”. Some key differences in how Amazon Keyspaces stores data type values and
metadata are listed below.
The space required for each column name is stored using a column identifier and added to each
data value stored in the column. The storage value of the column identifier depends on the
overall number of columns in your table:
1–62 columns: 1 byte
63–124 columns: 2 bytes
125–186 columns: 3 bytes
For each additional 62 columns, add 1 byte. Note that in Amazon Keyspaces, up to 225 regular
columns can be modified with a single INSERT or UPDATE statement. For more information, see
the section called “Amazon Keyspaces service quotas”.
Partition keys can contain up to 2048 bytes of data. Each key column in the partition key
requires up to 3 bytes of metadata. When calculating the size of your row, you should assume
each partition key column uses the full 3 bytes of metadata.
Clustering columns can store up to 850 bytes of data. In addition to the size of the data value,
each clustering column requires up to 20% of the data value size for metadata. When calculating
the size of your row, you should add 1 byte of metadata for each 5 bytes of clustering column
data value.
Amazon Keyspaces stores the data value of each partition key and clustering key column twice.
The extra overhead is used for efficient querying and built-in indexing.
Cassandra ASCII, TEXT, and VARCHAR string data types are all stored in Amazon Keyspaces
using Unicode with UTF-8 binary encoding. The size of a string in Amazon Keyspaces equals the
number of UTF-8 encoded bytes.
Cassandra INT, BIGINT, SMALLINT, and TINYINT data types are stored in Amazon Keyspaces as
data values with variable length, with up to 38 significant digits. Leading and trailing zeroes are
trimmed. The size of any of these data types is approximately 1 byte per two significant digits +
1 byte.
A BLOB in Amazon Keyspaces is stored with the value's raw byte length.
The size of a Null value or a Boolean value is 1 byte.
A column that stores collection data types like LIST or MAP requires 3 bytes of metadata,
regardless of its contents. The size of a LIST or MAP is (column id) + sum (size of nested
elements) + (3 bytes). The size of an empty LIST or MAP is (column id) + (3 bytes). Each
individual LIST or MAP element also requires 1 byte of metadata.
STATIC column data doesn't count towards the maximum row size of 1 MB. To calculate the
data size of static columns, see the section called “Calculate static column size per logical
partition”.
Client-side timestamps are stored for every column in each row when the feature is turned on.
These timestamps take up approximately 20–40 bytes (depending on your data), and contribute
to the storage and throughput cost for the row. For more information, see the section called
“Client-side timestamps”.
Add 100 bytes to the size of each row for row metadata.
The total size of an encoded row of data is based on the following formula:
partition key columns + clustering columns + regular columns + row metadata = total
encoded size of row
Important
All column metadata (for example, column ids, partition key metadata, and clustering column
metadata), as well as client-side timestamps and row metadata, counts towards the
maximum row size of 1 MB.
Consider the following example of a table where all columns are of type integer. The table has two
partition key columns, two clustering columns, and one regular column. Because this table has five
columns, the space required for the column name identifier is 1 byte.
CREATE TABLE mykeyspace.mytable(pk_col1 int, pk_col2 int, ck_col1 int, ck_col2 int,
reg_col1 int, primary key((pk_col1, pk_col2),ck_col1, ck_col2));
In this example, we calculate the size of data when we write a row to the table as shown in the
following statement:
INSERT INTO mykeyspace.mytable (pk_col1, pk_col2, ck_col1, ck_col2, reg_col1)
values(1,2,3,4,5);
To estimate the total bytes required by this write operation, you can use the following steps.
1. Calculate the size of a partition key column by adding the bytes for the data type stored in the
column and the metadata bytes. Repeat this for all partition key columns.
a. Calculate the size of the first column of the partition key (pk_col1):
(2 bytes for the integer data type) x 2 + 1 byte for the column id + 3 bytes
for partition key metadata = 8 bytes
b. Calculate the size of the second column of the partition key (pk_col2):
(2 bytes for the integer data type) x 2 + 1 byte for the column id + 3 bytes
for partition key metadata = 8 bytes
c. Add both columns to get the total estimated size of the partition key columns:
8 bytes + 8 bytes = 16 bytes for the partition key columns
2. Calculate the size of the clustering column by adding the bytes for the data type stored in the
column and the metadata bytes. Repeat this for all clustering columns.
a. Calculate the size of the first column of the clustering column (ck_col1):
(2 bytes for the integer data type) x 2 + 20% of the data value (2 bytes) for
clustering column metadata + 1 byte for the column id = 6 bytes
b. Calculate the size of the second column of the clustering column (ck_col2):
(2 bytes for the integer data type) x 2 + 20% of the data value (2 bytes) for
clustering column metadata + 1 byte for the column id = 6 bytes
c. Add both columns to get the total estimated size of the clustering columns:
6 bytes + 6 bytes = 12 bytes for the clustering columns
3. Add the size of the regular columns. In this example, we only have one regular column, which
stores a single-digit integer. This requires 2 bytes for the data value plus 1 byte for the
column id, for a total of 3 bytes.
4. Finally, to get the total encoded row size, add up the bytes for all columns and add the
additional 100 bytes for row metadata:
16 bytes for the partition key columns + 12 bytes for clustering columns + 3 bytes
for the regular column + 100 bytes for row metadata = 131 bytes.
To learn how to monitor serverless resources with Amazon CloudWatch, see the section called
“Monitoring with CloudWatch”.
Estimate capacity consumption of read and write throughput in
Amazon Keyspaces
When you read or write data in Amazon Keyspaces, the amount of read/write request units
(RRUs/WRUs) or read/write capacity units (RCUs/WCUs) your query consumes depends on the
total amount of data Amazon Keyspaces has to process to run the query. In some cases, the
data returned to the client can be a subset of the data that Amazon Keyspaces had to read to
process the query. For conditional writes, Amazon Keyspaces consumes write capacity even if the
conditional check fails.
To estimate the total amount of data being processed for a request, you have to consider the
encoded size of a row and the total number of rows. This topic covers some examples of common
scenarios and access patterns to show how Amazon Keyspaces processes queries and how that
affects capacity consumption. You can follow the examples to estimate the capacity requirements
of your tables and use Amazon CloudWatch to observe the read and write capacity consumption
for these use cases.
For information on how to calculate the encoded size of rows in Amazon Keyspaces, see the section
called “Estimate row size”.
Topics
Estimate the capacity consumption of range queries in Amazon Keyspaces
Estimate the read capacity consumption of limit queries
Estimate the read capacity consumption of table scans
Estimate capacity consumption of lightweight transactions in Amazon Keyspaces
Estimate capacity consumption for static columns in Amazon Keyspaces
Estimate and provision capacity for a multi-Region table in Amazon Keyspaces
Estimate read and write capacity consumption with Amazon CloudWatch in Amazon Keyspaces
Estimate the capacity consumption of range queries in Amazon
Keyspaces
To look at the read capacity consumption of a range query, we use the following example table
which is using on-demand capacity mode.
pk1 | pk2 | pk3 | ck1 | ck2 | ck3 | value
-----+-----+-----+-----+-----+-----+-------
a | b | 1 | a | b | 50 | <any value that results in a row size larger than 4KB>
a | b | 1 | a | b | 60 | value_1
a | b | 1 | a | b | 70 | <any value that results in a row size larger than 4KB>
Now run the following query on this table.
SELECT * FROM amazon_keyspaces.example_table_1 WHERE pk1='a' AND pk2='b' AND pk3=1 AND
ck1='a' AND ck2='b' AND ck3 > 50 AND ck3 < 70;
You receive the following result set from the query and the read operation performed by Amazon
Keyspaces consumes 2 RRUs in LOCAL_QUORUM consistency mode.
pk1 | pk2 | pk3 | ck1 | ck2 | ck3 | value
-----+-----+-----+-----+-----+-----+-------
a | b | 1 | a | b | 60 | value_1
Amazon Keyspaces consumes 2 RRUs to evaluate the rows with the values ck3=60 and ck3=70 to
process the query. However, Amazon Keyspaces only returns the row where the WHERE condition
specified in the query is true, which is the row with value ck3=60. To evaluate the range specified
in the query, Amazon Keyspaces reads the row matching the upper bound of the range, in this case
ck3 = 70, but doesn’t return that row in the result. The read capacity consumption is based on
the data read when processing the query, not on the data returned.
Estimate the read capacity consumption of limit queries
When processing a query that uses the LIMIT clause, Amazon Keyspaces reads rows up to
the maximum page size when trying to match the condition specified in the query. If Amazon
Keyspaces can't find sufficient matching data that meets the LIMIT value on the first page, one
or more paginated calls could be needed. To continue reads on the next page, you can use a
pagination token. The default page size is 1 MB. To consume less read capacity when using LIMIT
clauses, you can reduce the page size. For more information about pagination, see the section
called “Paginate results”.
As an example, let's look at the following query.
SELECT * FROM my_table WHERE partition_key=1234 LIMIT 1;
If you don't set the page size, Amazon Keyspaces reads 1 MB of data even though it returns only 1
row to you. To have Amazon Keyspaces read only one row, you can set the page size to 1 for this
query. In this case, Amazon Keyspaces would only read one row, provided you don't have expired
rows based on Time-to-live settings or client-side timestamps. To consume less read capacity, we
recommend setting your page size equal to the LIMIT value to reduce the amount of data that
Amazon Keyspaces reads.
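For example, if you run the query from cqlsh, you can set the page size with the PAGING command before issuing the query. (In application code, you would typically set the equivalent fetch size on your driver's statement instead.) The following is a sketch, assuming the hypothetical my_table from the previous example.
PAGING 1;
SELECT * FROM my_table WHERE partition_key=1234 LIMIT 1;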
Estimate the read capacity consumption of table scans
Queries that result in full table scans, for example queries using the ALLOW FILTERING option, are
another example of queries that process more data than they return as results. Again, the read
capacity consumption is based on the data read, not the data returned.
For the table scan example we use the following example table in on-demand capacity mode.
pk | ck | value
---+----+---------
pk | 10 | <any value that results in a row size larger than 4KB>
pk | 20 | value_1
pk | 30 | <any value that results in a row size larger than 4KB>
Amazon Keyspaces creates a table in on-demand capacity mode with four partitions by default.
In this example table, all the data is stored in one partition and the remaining three partitions are
empty.
Now run the following query on the table.
SELECT * from amazon_keyspaces.example_table_2;
This query results in a table scan operation where Amazon Keyspaces scans all four partitions of
the table and consumes 6 RRUs in LOCAL_QUORUM consistency mode. First, Amazon Keyspaces
consumes 3 RRUs for reading the three rows with pk=‘pk’. Then, Amazon Keyspaces consumes
the additional 3 RRUs for scanning the three empty partitions of the table. Because this query
results in a table scan, Amazon Keyspaces scans all the partitions in the table, including partitions
without data.
Estimate capacity consumption of lightweight transactions in Amazon
Keyspaces
Lightweight transactions (LWT) allow you to perform conditional write operations against your
table data. Conditional update operations are useful when inserting, updating, and deleting records
based on conditions that evaluate the current state.
In Amazon Keyspaces, all write operations require LOCAL_QUORUM consistency, and there is no
additional charge for using LWTs. The difference for LWTs is that when an LWT condition check
results in FALSE, it consumes write capacity units. The number of write capacity units consumed
depends on the size of the row. If the row size is 2 KB, the failed conditional write consumes two
write capacity units. If the row doesn't currently exist in the table, the operation consumes one
write capacity unit. By monitoring the ConditionalCheckFailed metric in CloudWatch, you can
determine the capacity consumed by LWT condition check failures.
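For example, the following conditional writes against a hypothetical table mykeyspace.mytable succeed only if their conditions evaluate to TRUE. If a condition check fails, Amazon Keyspaces still consumes write capacity based on the size of the existing row.
INSERT INTO mykeyspace.mytable (pk, value) VALUES (1, 'initial') IF NOT EXISTS;
UPDATE mykeyspace.mytable SET value='updated' WHERE pk=1 IF value='initial';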
Estimate capacity consumption for static columns in Amazon Keyspaces
In an Amazon Keyspaces table with clustering columns, you can use the STATIC keyword to create
a static column. The value stored in a static column is shared between all rows in a logical partition.
When you update the value of this column, Amazon Keyspaces applies the change automatically to
all rows in the partition.
This section describes how to calculate the encoded size of data when you're writing to static
columns. This process is handled separately from the process that writes data to the nonstatic
columns of a row. In addition to size quotas for static data, read and write operations on static
columns also affect metering and throughput capacity for tables independently. For functional
differences with Apache Cassandra when using static columns and paginated range read results,
see the section called “Pagination”.
Topics
Calculate the static column size per logical partition in Amazon Keyspaces
Estimate capacity throughput requirements for read/write operations on static data in Amazon
Keyspaces
Calculate the static column size per logical partition in Amazon Keyspaces
This section provides details about how to estimate the encoded size of static columns in Amazon
Keyspaces. The encoded size is used when you're calculating your bill and quota use. You should
also use the encoded size when you calculate provisioned throughput capacity requirements for
tables. To calculate the encoded size of static columns in Amazon Keyspaces, you can use the
following guidelines.
Partition keys can contain up to 2048 bytes of data. Each key column in the partition key
requires up to 3 bytes of metadata. These metadata bytes count towards your static data size
quota of 1 MB per partition. When calculating the size of your static data, you should assume
that each partition key column uses the full 3 bytes of metadata.
Use the raw size of the static column data values based on the data type. For more information
about data types, see the section called “Data types”.
Add 104 bytes to the size of the static data for metadata.
Clustering columns and regular, nonprimary key columns do not count towards the size of static
data. To learn how to estimate the size of nonstatic data within rows, see the section called
“Estimate row size”.
The total encoded size of a static column is based on the following formula:
partition key columns + static columns + metadata = total encoded size of static data
Consider the following example of a table where all columns are of type integer. The table has two
partition key columns, two clustering columns, one regular column, and one static column.
CREATE TABLE mykeyspace.mytable(pk_col1 int, pk_col2 int, ck_col1 int, ck_col2
int, reg_col1 int, static_col1 int static, primary key((pk_col1, pk_col2),ck_col1,
ck_col2));
In this example, we calculate the size of static data of the following statement:
INSERT INTO mykeyspace.mytable (pk_col1, pk_col2, static_col1) values(1,2,6);
To estimate the total bytes required by this write operation, you can use the following steps.
1. Calculate the size of a partition key column by adding the bytes for the data type stored in the
column and the metadata bytes. Repeat this for all partition key columns.
a. Calculate the size of the first column of the partition key (pk_col1):
4 bytes for the integer data type + 3 bytes for partition key metadata = 7
bytes
b. Calculate the size of the second column of the partition key (pk_col2):
4 bytes for the integer data type + 3 bytes for partition key metadata = 7
bytes
c. Add both columns to get the total estimated size of the partition key columns:
7 bytes + 7 bytes = 14 bytes for the partition key columns
2. Add the size of the static columns. In this example, we only have one static column that stores
an integer (which requires 4 bytes).
3. Finally, to get the total encoded size of the static column data, add up the bytes for the
primary key columns and static columns, and add the additional 104 bytes for metadata:
14 bytes for the partition key columns + 4 bytes for the static column + 104 bytes
for metadata = 122 bytes.
You can also update static and nonstatic data with the same statement. To estimate the total
size of the write operation, you must first calculate the size of the nonstatic data update. Then
calculate the size of the row update as shown in the example at the section called “Estimate row
size”, and add the results.
In this case, you can write a total of 2 MB—1 MB is the maximum row size quota, and 1 MB is the
quota for the maximum static data size per logical partition.
To calculate the total size of an update of static and nonstatic data in the same statement, you can
use the following formula:
(partition key columns + static columns + metadata = total encoded size of static data)
+ (partition key columns + clustering columns + regular columns + row metadata = total
encoded size of row)
= total encoded size of data written
Consider the following example of a table where all columns are of type integer. The table has two
partition key columns, two clustering columns, one regular column, and one static column.
CREATE TABLE mykeyspace.mytable(pk_col1 int, pk_col2 int, ck_col1 int, ck_col2
int, reg_col1 int, static_col1 int static, primary key((pk_col1, pk_col2),ck_col1,
ck_col2));
In this example, we calculate the size of data when we write a row to the table, as shown in the
following statement:
INSERT INTO mykeyspace.mytable (pk_col1, pk_col2, ck_col1, ck_col2, reg_col1,
static_col1) values(2,3,4,5,6,7);
To estimate the total bytes required by this write operation, you can use the following steps.
1. Calculate the total encoded size of static data as shown earlier. In this example, it's 122 bytes.
2. Add the size of the total encoded size of the row based on the update of nonstatic data,
following the steps at the section called “Estimate row size”. In this example, the total size of
the row update is 134 bytes.
122 bytes for static data + 134 bytes for nonstatic data = 256 bytes.
Estimate capacity throughput requirements for read/write operations on static
data in Amazon Keyspaces
Static data is associated with logical partitions in Cassandra, not individual rows. Logical partitions
in Amazon Keyspaces can be virtually unbounded in size because they can span multiple physical
storage partitions. As a result, Amazon Keyspaces meters write operations on static and nonstatic data
separately. Furthermore, writes that include both static and nonstatic data require additional
underlying operations to provide data consistency.
If you perform a mixed write operation of both static and nonstatic data, this results in two
separate write operations—one for nonstatic and one for static data. This applies to both on-
demand and provisioned read/write capacity modes.
The following example provides details about how to estimate the required read capacity units
(RCUs) and write capacity units (WCUs) when you're calculating provisioned throughput capacity
requirements for tables in Amazon Keyspaces that have static columns. You can estimate how
much capacity your table needs to process writes that include both static and nonstatic data by
using the following formula:
2 x WCUs required for nonstatic data + 2 x WCUs required for static data
For example, if your application writes 27 KB of data per second and each write includes 25.5 KB
of nonstatic data and 1.5 KB of static data, then your table requires 56 WCUs (2 x 26 WCUs + 2 x 2
WCUs).
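The WCU counts in this example are derived as follows, with each write rounded up to the next whole KB before the formula is applied:
2 x 26 WCUs (25.5 KB of nonstatic data rounds up to 26 KB) + 2 x 2 WCUs (1.5 KB of static data rounds up to 2 KB) = 52 WCUs + 4 WCUs = 56 WCUs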
Amazon Keyspaces meters the reads of static and nonstatic data the same as reads of multiple
rows. As a result, the price of reading static and nonstatic data in the same operation is based on
the aggregate size of the data processed to perform the read.
To learn how to monitor serverless resources with Amazon CloudWatch, see the section called
“Monitoring with CloudWatch”.
Estimate and provision capacity for a multi-Region table in Amazon
Keyspaces
You can configure the throughput capacity of a multi-Region table in one of two ways:
On-demand capacity mode, measured in write request units (WRUs)
Provisioned capacity mode with auto scaling, measured in write capacity units (WCUs)
You can use provisioned capacity mode with auto scaling or on-demand capacity mode to help
ensure that a multi-Region table has sufficient capacity to perform replicated writes to all AWS
Regions.
Note
Changing the capacity mode of the table in one of the Regions changes the capacity mode
for all replicas.
By default, Amazon Keyspaces uses on-demand mode for multi-Region tables. With on-demand
mode, you don't need to specify how much read and write throughput you expect your
application to perform. Amazon Keyspaces instantly accommodates your workloads as they ramp
up or down to any previously reached traffic level. If a workload’s traffic level hits a new peak,
Amazon Keyspaces adapts rapidly to accommodate the workload.
If you choose provisioned capacity mode for a table, you have to configure the number of read
capacity units (RCUs) and write capacity units (WCUs) per second that your application requires.
To plan a multi-Region table's throughput capacity needs, you should first estimate the number of
WCUs per second needed for each Region. Then you add the writes from all Regions that your table
is replicated in, and use the sum to provision capacity for each Region. This is required because
every write that is performed in one Region must also be repeated in each replica Region.
If the table doesn't have enough capacity to handle the writes from all Regions, capacity exceptions
will occur. In addition, inter-Regional replication wait times will rise.
For example, if you have a multi-Region table where you expect 5 writes per second in US East (N.
Virginia), 10 writes per second in US East (Ohio), and 5 writes per second in Europe (Ireland), you
should expect the table to consume 20 WCUs (5 + 10 + 5) in each Region: US East (N. Virginia),
US East (Ohio), and Europe (Ireland). That means that in this example, you need to provision 20 WCUs for each of
the table's replicas. You can monitor your table's capacity consumption using Amazon CloudWatch.
For more information, see the section called “Monitoring with CloudWatch”.
Because each multi-Region write is billed as 1.25 times the WCUs, you would see a total of 75 WCUs
(20 WCUs x 3 Regions x 1.25) billed in this example. For more information about pricing, see
Amazon Keyspaces (for Apache Cassandra) pricing.
For more information about provisioned capacity with Amazon Keyspaces auto scaling, see the
section called “Manage throughput capacity with auto scaling”.
Note
If a table is running in provisioned capacity mode with auto scaling, the provisioned write
capacity is allowed to float within those auto scaling settings for each Region.
Estimate read and write capacity consumption with Amazon
CloudWatch in Amazon Keyspaces
To estimate and monitor read and write capacity consumption, you can use a CloudWatch
dashboard. For more information about available metrics for Amazon Keyspaces, see the section
called “Metrics and dimensions”.
To monitor read and write capacity units consumed by a specific statement with CloudWatch, you
can follow these steps.
1. Create a new table with sample data.
2. Configure an Amazon Keyspaces CloudWatch dashboard for the table. To get started, you can use
a dashboard template available on GitHub.
3. Run the CQL statement, for example using the ALLOW FILTERING option, and check the read
capacity units consumed for the full table scan in the dashboard.
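For example, a hypothetical statement like the following filters on a column that isn't part of the primary key, which results in a full table scan.
SELECT * FROM mykeyspace.mytable WHERE reg_col1=5 ALLOW FILTERING;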
Configure read/write capacity modes in Amazon Keyspaces
Amazon Keyspaces has two read/write capacity modes for processing reads and writes on your
tables:
On-demand (default)
Provisioned
The read/write capacity mode that you choose controls how you are charged for read and write
throughput and how table throughput capacity is managed.
Topics
Configure on-demand capacity mode
Configure provisioned throughput capacity mode
View the capacity mode of a table in Amazon Keyspaces
Change capacity mode
Configure on-demand capacity mode
Amazon Keyspaces (for Apache Cassandra) on-demand capacity mode is a flexible billing option
capable of serving thousands of requests per second without capacity planning. This option offers
pay-per-request pricing for read and write requests so that you pay only for what you use.
When you choose on-demand mode, Amazon Keyspaces can scale the throughput capacity for your
table up to any previously reached traffic level instantly, and then back down when application
traffic decreases. If a workload’s traffic level hits a new peak, the service adapts rapidly to increase
throughput capacity for your table. You can enable on-demand capacity mode for both new and
existing tables.
On-demand mode is a good option if any of the following is true:
You create new tables with unknown workloads.
You have unpredictable application traffic.
You prefer the ease of paying for only what you use.
To get started with on-demand mode, you can create a new table or update an existing table to
use on-demand capacity mode using the console or with a few lines of Cassandra Query Language
(CQL) code. For more information, see the section called “Tables”.
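For example, a sketch like the following creates a hypothetical table with on-demand capacity mode set explicitly. (On-demand is also the default if you omit the capacity_mode custom property.)
CREATE TABLE mykeyspace.mytable (id text PRIMARY KEY, value text) WITH CUSTOM_PROPERTIES={'capacity_mode':{'throughput_mode': 'PAY_PER_REQUEST'}};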
Topics
Read request units and write request units
Peak traffic and scaling properties
Initial throughput for on-demand capacity mode
Read request units and write request units
With on-demand capacity mode tables, you don't need to specify how much read and write
throughput you expect your application to use in advance. Amazon Keyspaces charges you for the
reads and writes that you perform on your tables in terms of read request units (RRUs) and write
request units (WRUs).
One RRU represents one LOCAL_QUORUM read request, or two LOCAL_ONE read requests, for a
row up to 4 KB in size. If you need to read a row that is larger than 4 KB, the read operation uses
additional RRUs. The total number of RRUs required depends on the row size, and whether you
want to use LOCAL_QUORUM or LOCAL_ONE read consistency. For example, reading an 8 KB row
requires 2 RRUs using LOCAL_QUORUM read consistency, and 1 RRU if you choose LOCAL_ONE
read consistency.
One WRU represents one write for a row up to 1 KB in size. All writes are using LOCAL_QUORUM
consistency, and there is no additional charge for using lightweight transactions (LWTs). If you
need to write a row that is larger than 1 KB, the write operation uses additional WRUs. The total
number of WRUs required depends on the row size. For example, if your row size is 2 KB, you
require 2 WRUs to perform one write request.
For information about supported consistency levels, see the section called “Supported Cassandra
consistency levels”.
Peak traffic and scaling properties
Amazon Keyspaces tables that use on-demand capacity mode automatically adapt to your
application’s traffic volume. On-demand capacity mode instantly accommodates up to double the
previous peak traffic on a table. For example, your application's traffic pattern might vary between
5,000 and 10,000 LOCAL_QUORUM reads per second, where 10,000 reads per second is the previous
traffic peak.
With this pattern, on-demand capacity mode instantly accommodates sustained traffic of up to
20,000 reads per second. If your application sustains traffic of 20,000 reads per second, that peak
becomes your new previous peak, enabling subsequent traffic to reach up to 40,000 reads per
second.
If you need more than double your previous peak on a table, Amazon Keyspaces automatically
allocates more capacity as your traffic volume increases. This helps ensure that your table has
enough throughput capacity to process the additional requests. However, you might observe
insufficient throughput capacity errors if you exceed double your previous peak within 30 minutes.
For example, suppose that your application's traffic pattern varies between 5,000 and 10,000
strongly consistent reads per second, where 20,000 reads per second is the previously reached
traffic peak. In this case, the service recommends that you space your traffic growth over at least
30 minutes before driving up to 40,000 reads per second.
To learn how to estimate read and write capacity consumption of a table, see the section called
“Estimate capacity consumption”.
To learn more about default quotas for your account and how to increase them, see Quotas.
Initial throughput for on-demand capacity mode
If you create a new table with on-demand capacity mode enabled or switch an existing table to on-
demand capacity mode for the first time, the table has the following previous peak settings, even
though it hasn't served traffic previously using on-demand capacity mode:
Newly created table with on-demand capacity mode: The previous peak is 2,000 WRUs and
6,000 RRUs. You can drive up to double the previous peak immediately. Doing this enables newly
created on-demand tables to serve up to 4,000 WRUs and 12,000 RRUs.
Existing table switched to on-demand capacity mode: The previous peak is half the previous
WCUs and RCUs provisioned for the table or the settings for a newly created table with on-
demand capacity mode, whichever is higher.
Configure provisioned throughput capacity mode
If you choose provisioned throughput capacity mode, you specify the number of reads and writes
per second that are required for your application. This helps you manage your Amazon Keyspaces
usage to stay at or below a defined request rate to optimize price and maintain predictability. To
learn more about automatic scaling for provisioned throughput see the section called “Manage
throughput capacity with auto scaling”.
Provisioned throughput capacity mode is a good option if any of the following is true:
You have predictable application traffic.
You run applications whose traffic is consistent or ramps up gradually.
You can forecast capacity requirements to optimize price.
Read capacity units and write capacity units
For provisioned throughput capacity mode tables, you specify throughput capacity in terms of read
capacity units (RCUs) and write capacity units (WCUs):
One RCU represents one LOCAL_QUORUM read per second, or two LOCAL_ONE reads per second,
for a row up to 4 KB in size. If you need to read a row that is larger than 4 KB, the read operation
uses additional RCUs.
The total number of RCUs required depends on the row size, and whether you want
LOCAL_QUORUM or LOCAL_ONE reads. For example, if your row size is 8 KB, you require 2 RCUs to
sustain one LOCAL_QUORUM read per second, and 1 RCU if you choose LOCAL_ONE reads.
One WCU represents one write per second for a row up to 1 KB in size. All writes are using
LOCAL_QUORUM consistency, and there is no additional charge for using lightweight transactions
(LWTs). If you need to write a row that is larger than 1 KB, the write operation uses additional
WCUs.
The total number of WCUs required depends on the row size. For example, if your row size is 2
KB, you require 2 WCUs to sustain one write request per second. For more information about
how to estimate read and write capacity consumption of a table, see the section called “Estimate
capacity consumption”.
If your application reads or writes larger rows (up to the Amazon Keyspaces maximum row size of
1 MB), it consumes more capacity units. To learn more about how to estimate the row size, see the
section called “Estimate row size”. For example, suppose that you create a provisioned table with 6
RCUs and 6 WCUs. With these settings, your application could do the following:
Perform LOCAL_QUORUM reads of up to 24 KB per second (4 KB × 6 RCUs).
Perform LOCAL_ONE reads of up to 48 KB per second (twice as much read throughput).
Write up to 6 KB per second (1 KB × 6 WCUs).
Provisioned throughput is the maximum amount of throughput capacity an application can
consume from a table. If your application exceeds your provisioned throughput capacity, you might
observe insufficient capacity errors.
For example, a read request that doesn’t have enough throughput capacity fails with a
Read_Timeout exception and is posted to the ReadThrottleEvents metric. A write request that
doesn’t have enough throughput capacity fails with a Write_Timeout exception and is posted to
the WriteThrottleEvents metric.
You can use Amazon CloudWatch to monitor your provisioned and actual throughput metrics
and insufficient capacity events. For more information about these metrics, see the section called
“Metrics and dimensions”.
Note
Repeated errors due to insufficient capacity can lead to client-side driver
specific exceptions, for example the DataStax Java driver fails with a
NoHostAvailableException.
To change the throughput capacity settings for tables, you can use the AWS Management Console
or the ALTER TABLE statement using CQL. For more information, see the section called “ALTER
TABLE”.
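For example, a sketch like the following switches a hypothetical table to provisioned capacity mode with 6 RCUs and 6 WCUs. Adjust the names and values for your own table.
ALTER TABLE mykeyspace.mytable WITH CUSTOM_PROPERTIES={'capacity_mode':{'throughput_mode': 'PROVISIONED', 'read_capacity_units': 6, 'write_capacity_units': 6}};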
To learn more about default quotas for your account and how to increase them, see Quotas.
View the capacity mode of a table in Amazon Keyspaces
You can query the system table in the Amazon Keyspaces system keyspace to review capacity mode
information about a table. You can also see whether a table is using on-demand or provisioned
throughput capacity mode. If the table is configured with provisioned throughput capacity mode,
you can see the throughput capacity provisioned for the table.
Example
SELECT * from system_schema_mcs.tables where keyspace_name = 'mykeyspace' and
table_name = 'mytable';
A table configured with on-demand capacity mode returns the following.
{
'capacity_mode': {
'last_update_to_pay_per_request_timestamp':
'1579551547603',
'throughput_mode': 'PAY_PER_REQUEST'
}
}
A table configured with provisioned throughput capacity mode returns the following.
{
'capacity_mode': {
'last_update_to_pay_per_request_timestamp':
'1579048006000',
'read_capacity_units': '5000',
'throughput_mode': 'PROVISIONED',
'write_capacity_units': '6000'
}
}
The last_update_to_pay_per_request_timestamp value is measured in milliseconds.
To change the provisioned throughput capacity for a table, use the section called “ALTER TABLE”.
Change capacity mode
When you switch a table from provisioned capacity mode to on-demand capacity mode, Amazon
Keyspaces makes several changes to the structure of your table and partitions. This process can
take several minutes. During the switching period, your table delivers throughput that is consistent
with the previously provisioned WCU and RCU amounts.
When you switch from on-demand capacity mode back to provisioned capacity mode, your table
delivers throughput that is consistent with the previous peak reached when the table was set to
on-demand capacity mode.
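For example, a sketch like the following switches a hypothetical table back to on-demand capacity mode.
ALTER TABLE mykeyspace.mytable WITH CUSTOM_PROPERTIES={'capacity_mode':{'throughput_mode': 'PAY_PER_REQUEST'}};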
Note
You can switch capacity modes from provisioned to on-demand only once in a 24-hour
period.
Manage throughput capacity automatically with Amazon
Keyspaces auto scaling
Many database workloads are cyclical in nature or are difficult to predict in advance. For example,
consider a social networking app where most of the users are active during daytime hours. The
database must be able to handle the daytime activity, but there's no need for the same levels of
throughput at night.
Another example might be a new mobile gaming app that is experiencing rapid adoption. If the
game becomes very popular, it could exceed the available database resources, which would result
in slow performance and unhappy customers. These kinds of workloads often require manual
intervention to scale database resources up or down in response to varying usage levels.
Amazon Keyspaces (for Apache Cassandra) helps you provision throughput capacity efficiently
for variable workloads by adjusting throughput capacity automatically in response to actual
application traffic. Amazon Keyspaces uses the Application Auto Scaling service to increase and
decrease a table's read and write capacity on your behalf. For more information about Application
Auto Scaling, see the Application Auto Scaling User Guide.
Note
To get started with Amazon Keyspaces automatic scaling quickly, see the section called
Configure and update auto scaling policies.
How Amazon Keyspaces automatic scaling works
The following diagram provides a high-level overview of how Amazon Keyspaces automatic scaling
manages throughput capacity for a table.
To enable automatic scaling for a table, you create a scaling policy. The scaling policy specifies
whether you want to scale read capacity or write capacity (or both), and the minimum and
maximum provisioned capacity unit settings for the table.
The scaling policy also defines a target utilization. Target utilization is the ratio of consumed
capacity units to provisioned capacity units at a point in time, expressed as a percentage.
Automatic scaling uses a target tracking algorithm to adjust the provisioned throughput of the
table upward or downward in response to actual workloads. It does this so that the actual capacity
utilization remains at or near your target utilization.
You can set the automatic scaling target utilization values between 20 and 90 percent for your
read and write capacity. The default target utilization rate is 70 percent. You can set the target
utilization to be a lower percentage if your traffic changes quickly and you want capacity to begin
scaling up sooner. You can also set the target utilization rate to a higher rate if your application
traffic changes more slowly and you want to reduce the cost of throughput.
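For example, if a table is provisioned with 200 WCUs and the application consumes 140 WCUs at a point in time, the utilization is 140 / 200 = 70 percent, which matches the default target utilization rate.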
For more information about scaling policies, see Target tracking scaling policies for Application
Auto Scaling in the Application Auto Scaling User Guide.
When you create a scaling policy, Amazon Keyspaces creates two pairs of Amazon CloudWatch
alarms on your behalf. Each pair represents your upper and lower boundaries for provisioned and
consumed throughput settings. These CloudWatch alarms are triggered when the table's actual
utilization deviates from your target utilization for a sustained period of time. To learn more about
Amazon CloudWatch, see the Amazon CloudWatch User Guide.
When one of the CloudWatch alarms is triggered, Amazon Simple Notification Service (Amazon
SNS) sends you a notification (if you have enabled it). The CloudWatch alarm then invokes
Application Auto Scaling to evaluate your scaling policy. This in turn issues an Alter Table
request to Amazon Keyspaces to adjust the table's provisioned capacity upward or downward
as appropriate. To learn more about Amazon SNS notifications, see Setting up Amazon SNS
notifications.
Amazon Keyspaces processes the Alter Table request by increasing (or decreasing) the table's
provisioned throughput capacity so that it approaches your target utilization.
Note
Amazon Keyspaces auto scaling modifies provisioned throughput settings only when the
actual workload stays elevated (or depressed) for a sustained period of several minutes.
The target tracking algorithm seeks to keep the target utilization at or near your chosen
value over the long term. Sudden, short-duration spikes of activity are accommodated by
the table's built-in burst capacity.
How auto scaling works for multi-Region tables
To ensure that there's always enough read and write capacity for all table replicas in all AWS
Regions of a multi-Region table in provisioned capacity mode, we recommend that you configure
Amazon Keyspaces auto scaling.
When you use a multi-Region table in provisioned mode with auto scaling, you can't disable
auto scaling for a single table replica. But you can adjust the table's read auto scaling settings
for different Regions. For example, you can specify different read capacity and read auto scaling
settings for each Region that the table is replicated in.
The read auto scaling settings that you configure for a table replica in a specified Region override
the general auto scaling settings of the table. The write capacity, however, has to remain
synchronized across all table replicas to ensure that there's enough capacity to replicate writes in
all Regions.
Amazon Keyspaces auto scaling independently updates the provisioned capacity of the table in
each AWS Region based on the usage in that Region. As a result, the provisioned capacity in each
Region for a multi-Region table might be different when auto scaling is active.
You can configure the auto scaling settings of a multi-Region table and its replicas using the
Amazon Keyspaces console, API, AWS CLI, or CQL. For more information on how to create and
update auto scaling settings for multi-Region tables, see the section called “Update provisioned
capacity and auto scaling settings for a multi-Region table”.
Note
If you use auto scaling for multi-Region tables, you must always use Amazon Keyspaces
API operations to configure auto scaling settings. If you use Application Auto Scaling API
operations directly to configure auto scaling settings, you don't have the ability to specify
the AWS Regions of the multi-Region table. This can result in unsupported configurations.
Usage notes
Before you begin using Amazon Keyspaces automatic scaling, you should be aware of the
following:
Amazon Keyspaces automatic scaling can increase read capacity or write capacity as often as
necessary, in accordance with your scaling policy. All Amazon Keyspaces quotas remain in effect,
as described in Quotas.
Amazon Keyspaces automatic scaling doesn't prevent you from manually modifying provisioned
throughput settings. These manual adjustments don't affect any existing CloudWatch alarms
that are attached to the scaling policy.
If you use the console to create a table with provisioned throughput capacity, Amazon Keyspaces
automatic scaling is enabled by default. You can modify your automatic scaling settings at any
time. For more information, see the section called “Turn off Amazon Keyspaces auto scaling for a
table”.
If you're using AWS CloudFormation to create scaling policies, you should manage the scaling
policies from AWS CloudFormation so that the stack is in sync with the stack template. If you
change scaling policies from Amazon Keyspaces, they will get overwritten with the original
values from the AWS CloudFormation stack template when the stack is reset.
If you use CloudTrail to monitor Amazon Keyspaces automatic scaling, you might see alerts
for calls made by Application Auto Scaling as part of its configuration validation process.
You can filter out these alerts by using the invokedBy field, which contains
application-autoscaling.amazonaws.com for these validation checks.
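If you use the AWS CLI, one possible way to inspect recent Amazon Keyspaces events while filtering out these validation calls is to combine cloudtrail lookup-events with the jq tool. This is a sketch, not the only approach; the event-source value cassandra.amazonaws.com and the jq filter are assumptions based on the invokedBy behavior described above.

# Keep only events that were not invoked by Application Auto Scaling
aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventSource,AttributeValue=cassandra.amazonaws.com \
    --max-results 50 \
    --query 'Events[].CloudTrailEvent' \
    --output text \
    | jq -c 'select(.userIdentity.invokedBy? != "application-autoscaling.amazonaws.com") | {eventTime, eventName}'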
Configure and update Amazon Keyspaces automatic scaling policies
You can use the console, CQL, or the AWS Command Line Interface (AWS CLI) to configure Amazon
Keyspaces automatic scaling for new and existing tables. You can also modify automatic scaling
settings or disable automatic scaling.
For more advanced features like setting scale-in and scale-out cooldown times, we recommend
that you use CQL or the AWS CLI to manage Amazon Keyspaces scaling policies.
Topics
Configure permissions for Amazon Keyspaces automatic scaling
Create a new table with automatic scaling
Configure automatic scaling on an existing table
View your table's Amazon Keyspaces auto scaling configuration
Turn off Amazon Keyspaces auto scaling for a table
View auto scaling activity for an Amazon Keyspaces table in Amazon CloudWatch
Configure permissions for Amazon Keyspaces automatic scaling
To get started, confirm that the principal has the appropriate permissions to create and manage
automatic scaling settings. In AWS Identity and Access Management (IAM), the AWS managed
policy AmazonKeyspacesFullAccess is required to manage Amazon Keyspaces scaling policies.
Important
application-autoscaling:* permissions are required to disable automatic scaling on
a table. You must turn off auto scaling for a table before you can delete it.
To set up an IAM user or role for Amazon Keyspaces console access and Amazon Keyspaces
automatic scaling, add the following policy.
To attach the AmazonKeyspacesFullAccess policy
1. Sign in to the AWS Management Console and open the IAM console at https://
console.aws.amazon.com/iam/.
2. On the IAM console dashboard, choose Users, and then choose your IAM user or role from the
list.
3. On the Summary page, choose Add permissions.
4. Choose Attach existing policies directly.
5. From the list of policies, choose AmazonKeyspacesFullAccess, and then choose Next: Review.
6. Choose Add permissions.
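If you manage permissions with the AWS CLI instead, you can attach the same managed policy directly. The user name alice below is a placeholder; use attach-role-policy for roles.

aws iam attach-user-policy \
    --user-name alice \
    --policy-arn arn:aws:iam::aws:policy/AmazonKeyspacesFullAccess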
Create a new table with automatic scaling
When you create a new Amazon Keyspaces table, you can automatically enable auto scaling for the
table's write or read capacity. This allows Amazon Keyspaces to contact Application Auto Scaling
on your behalf to register the table as a scalable target and adjust the provisioned write or read
capacity.
For more information on how to create a multi-Region table and configure different auto scaling
settings for table replicas, see the section called “Create a multi-Region table in provisioned mode”.
Note
Amazon Keyspaces automatic scaling requires the presence of a service-linked role
(AWSServiceRoleForApplicationAutoScaling_CassandraTable) that performs
automatic scaling actions on your behalf. This role is created automatically for you. For
more information, see the section called “Using service-linked roles”.
Console
Create a new table with automatic scaling enabled using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables, and then choose Create table.
3. On the Create table page in the Table details section, select a keyspace and provide a
name for the new table.
4. In the Columns section, create the schema for your table.
5. In the Primary key section, define the primary key of the table and select optional
clustering columns.
6. In the Table settings section, choose Customize settings.
7. Continue to Read/write capacity settings.
8. For Capacity mode, choose Provisioned.
9. In the Read capacity section, confirm that Scale automatically is selected.
In this step, you select the minimum and maximum read capacity units for the table, as well
as the target utilization.
Minimum capacity units – Enter the value for the minimum level of throughput that the
table should always be ready to support. The value must be between 1 and the maximum
throughput per second quota for your account (40,000 by default).
Maximum capacity units – Enter the maximum amount of throughput you want to
provision for the table. The value must be between 1 and the maximum throughput per
second quota for your account (40,000 by default).
Target utilization – Enter a target utilization rate between 20% and 90%. When traffic
exceeds the defined target utilization rate, capacity is automatically scaled up. When
traffic falls below the defined target, it is automatically scaled down again.
Note
To learn more about default quotas for your account and how to increase them, see
Quotas.
10. In the Write capacity section, choose the same settings as defined in the previous step for
read capacity, or configure capacity values manually.
11. Choose Create table. Your table is created with the specified automatic scaling parameters.
Cassandra Query Language (CQL)
Create a new table with Amazon Keyspaces automatic scaling using CQL
To configure auto scaling settings for a table programmatically, you use the
AUTOSCALING_SETTINGS property of the CREATE TABLE statement, which contains the parameters
for Amazon Keyspaces auto scaling. The parameters define the conditions that direct Amazon
Keyspaces to adjust your table's provisioned throughput, and which additional optional
actions to take. In this example,
you define the auto scaling settings for mytable.
The policy contains the following elements:
AUTOSCALING_SETTINGS – Specifies if Amazon Keyspaces is allowed to adjust throughput
capacity on your behalf. The following values are required:
provisioned_write_capacity_autoscaling_update:
minimum_units
maximum_units
provisioned_read_capacity_autoscaling_update:
minimum_units
maximum_units
scaling_policy – Amazon Keyspaces supports the target tracking policy. To define the
target tracking policy, you configure the following parameters.
target_value – Amazon Keyspaces auto scaling ensures that the ratio of consumed
capacity to provisioned capacity stays at or near this value. You define target_value as
a percentage.
disable_scale_in – (Optional) A Boolean that specifies whether scale-in is disabled for
the table. The default is FALSE, which means that scale-in is enabled and capacity is
automatically scaled down for a table on your behalf. To turn off scale-in, set the
value to TRUE.
scale_out_cooldown – A scale-out activity increases the provisioned throughput of
your table. To add a cooldown period for scale-out activities, specify a value, in seconds,
for scale_out_cooldown. If you don't specify a value, the default value is 0. For more
information about target tracking and cooldown periods, see Target Tracking Scaling
Policies in the Application Auto Scaling User Guide.
scale_in_cooldown – A scale-in activity decreases the provisioned throughput of
your table. To add a cooldown period for scale-in activities, specify a value, in seconds,
for scale_in_cooldown. If you don't specify a value, the default value is 0. For more
information about target tracking and cooldown periods, see Target Tracking Scaling
Policies in the Application Auto Scaling User Guide.
Note
To further understand how target_value works, suppose that you have a table with
a provisioned throughput setting of 200 write capacity units. You decide to create a
scaling policy for this table, with a target_value of 70 percent.
Now suppose that you begin driving write traffic to the table so that the actual write
throughput is 150 capacity units. The consumed-to-provisioned ratio is now (150 / 200),
or 75 percent. This ratio exceeds your target, so auto scaling increases the provisioned
write capacity to 215 so that the ratio is (150 / 215), or 69.77 percent—as close to your
target_value as possible, but not exceeding it.
For mytable, you set target_value for both read and write capacity to 50 percent. Amazon
Keyspaces auto scaling adjusts the table's provisioned throughput within the range of 5–
10 capacity units so that the consumed-to-provisioned ratio remains at or near 50 percent.
For read capacity, you set the values for scale_out_cooldown and scale_in_cooldown to 60
seconds.
You can use the following statement to create a new Amazon Keyspaces table with auto scaling
enabled.
CREATE TABLE mykeyspace.mytable(pk int, ck int, PRIMARY KEY (pk, ck))
WITH CUSTOM_PROPERTIES = {
    'capacity_mode': {
        'throughput_mode': 'PROVISIONED',
        'read_capacity_units': 1,
        'write_capacity_units': 1
    }
} AND AUTOSCALING_SETTINGS = {
    'provisioned_write_capacity_autoscaling_update': {
        'maximum_units': 10,
        'minimum_units': 5,
        'scaling_policy': {
            'target_tracking_scaling_policy_configuration': {
                'target_value': 50
            }
        }
    },
    'provisioned_read_capacity_autoscaling_update': {
        'maximum_units': 10,
        'minimum_units': 5,
        'scaling_policy': {
            'target_tracking_scaling_policy_configuration': {
                'target_value': 50,
                'scale_in_cooldown': 60,
                'scale_out_cooldown': 60
            }
        }
    }
};
CLI
Create a new table with Amazon Keyspaces automatic scaling using the AWS CLI
To configure auto scaling settings for a table programmatically, you use the
autoScalingSpecification action that defines the parameters for Amazon Keyspaces auto
scaling. The parameters define the conditions that direct Amazon Keyspaces to adjust your
table's provisioned throughput, and what additional optional actions to take. In this example,
you define the auto scaling settings for mytable.
The policy contains the following elements:
autoScalingSpecification – Specifies whether Amazon Keyspaces is allowed to adjust
throughput capacity on your behalf. You can enable auto scaling for read and for
write capacity separately. Then you must specify the following parameters for
autoScalingSpecification:
writeCapacityAutoScaling – The maximum and minimum write capacity units.
readCapacityAutoScaling – The maximum and minimum read capacity units.
scalingPolicy – Amazon Keyspaces supports the target tracking policy. To define the
target tracking policy, you configure the following parameters.
targetValue – Amazon Keyspaces auto scaling ensures that the ratio of consumed
capacity to provisioned capacity stays at or near this value. You define targetValue as a
percentage.
disableScaleIn – (Optional) A Boolean that specifies whether scale-in is disabled for
the table. The default is false, which means that scale-in is enabled and capacity is
automatically scaled down for a table on your behalf. To turn off scale-in, set the
value to true.
scaleOutCooldown – A scale-out activity increases the provisioned throughput of
your table. To add a cooldown period for scale-out activities, specify a value, in seconds,
for ScaleOutCooldown. The default value is 0. For more information about target
tracking and cooldown periods, see Target Tracking Scaling Policies in the Application
Auto Scaling User Guide.
scaleInCooldown – A scale-in activity decreases the provisioned throughput of your
table. To add a cooldown period for scale-in activities, specify a value, in seconds, for
ScaleInCooldown. The default value is 0. For more information about target tracking
and cooldown periods, see Target Tracking Scaling Policies in the Application Auto
Scaling User Guide.
Note
To further understand how TargetValue works, suppose that you have a table with
a provisioned throughput setting of 200 write capacity units. You decide to create a
scaling policy for this table, with a TargetValue of 70 percent.
Now suppose that you begin driving write traffic to the table so that the actual write
throughput is 150 capacity units. The consumed-to-provisioned ratio is now (150 / 200),
or 75 percent. This ratio exceeds your target, so auto scaling increases the provisioned
write capacity to 215 so that the ratio is (150 / 215), or 69.77 percent—as close to your
TargetValue as possible, but not exceeding it.
For mytable, you set targetValue for both read and write capacity to 50 percent. Amazon
Keyspaces auto scaling adjusts the table's provisioned throughput within the range of 5–
10 capacity units so that the consumed-to-provisioned ratio remains at or near 50 percent.
For read capacity, you set the values for scaleOutCooldown and scaleInCooldown to 60
seconds.
When creating tables with complex auto scaling settings, it's helpful to load the auto scaling
settings from a JSON file. For the following example, you can download the example JSON file
from auto-scaling.zip and extract auto-scaling.json, taking note of the path to the file. In
this example, the JSON file is located in the current directory. For different file path options, see
How to load parameters from a file.
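If you prefer to author the auto-scaling.json file yourself, the following sketch shows the general shape that the autoScalingSpecification parameters described above take in a file. It mirrors the structure returned by get-table-auto-scaling-settings later in this topic; the downloadable example may differ in its exact values.

{
    "writeCapacityAutoScaling": {
        "autoScalingDisabled": false,
        "minimumUnits": 5,
        "maximumUnits": 10,
        "scalingPolicy": {
            "targetTrackingScalingPolicyConfiguration": {
                "disableScaleIn": false,
                "scaleInCooldown": 0,
                "scaleOutCooldown": 0,
                "targetValue": 50
            }
        }
    },
    "readCapacityAutoScaling": {
        "autoScalingDisabled": false,
        "minimumUnits": 5,
        "maximumUnits": 10,
        "scalingPolicy": {
            "targetTrackingScalingPolicyConfiguration": {
                "disableScaleIn": false,
                "scaleInCooldown": 60,
                "scaleOutCooldown": 60,
                "targetValue": 50
            }
        }
    }
}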
aws keyspaces create-table --keyspace-name mykeyspace --table-name mytable \
    --schema-definition 'allColumns=[{name=pk,type=int},{name=ck,type=int}],partitionKeys=[{name=pk},{name=ck}]' \
    --capacity-specification throughputMode=PROVISIONED,readCapacityUnits=1,writeCapacityUnits=1 \
    --auto-scaling-specification file://auto-scaling.json
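Because keyspaces and tables are created asynchronously, you can verify that the new table has become active before using it. For example, the following command (a sketch) returns the table's status field, which transitions to ACTIVE when the table is ready.

aws keyspaces get-table --keyspace-name mykeyspace --table-name mytable \
    --query 'status'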
Configure automatic scaling on an existing table
You can update an existing Amazon Keyspaces table to turn on auto scaling for the table's write or
read capacity. If you're updating a table that is currently in on-demand capacity mode, then you
first have to change the table's capacity mode to provisioned capacity mode.
For more information on how to update auto scaling settings for a multi-Region table, see the
section called “Update provisioned capacity and auto scaling settings for a multi-Region table”.
Amazon Keyspaces automatic scaling requires the presence of a service-linked role
(AWSServiceRoleForApplicationAutoScaling_CassandraTable) that performs automatic
scaling actions on your behalf. This role is created automatically for you. For more information, see
the section called “Using service-linked roles”.
Console
Configure Amazon Keyspaces automatic scaling for an existing table
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. Choose the table that you want to work with, and go to the Capacity tab.
3. In the Capacity settings section, choose Edit.
4. Under Capacity mode, make sure that the table is using Provisioned capacity mode.
5. Select Scale automatically and see step 6 in the section called “Create a new table with
automatic scaling” to edit read and write capacity.
6. When the automatic scaling settings are defined, choose Save.
Cassandra Query Language (CQL)
Configure an existing table with Amazon Keyspaces automatic scaling using CQL
You can use the ALTER TABLE statement for an existing Amazon Keyspaces table to configure
auto scaling for the table's write or read capacity. If you're updating a table that is currently in
on-demand capacity mode, you have to set capacity_mode to provisioned. If your table is
already in provisioned capacity mode, this field can be omitted.
In the following example, the statement updates the table mytable, which is in on-demand
capacity mode. The statement changes the capacity mode of the table to provisioned mode
with auto scaling enabled.
The write capacity is configured within the range of 5–10 capacity units with a target value
of 50%. The read capacity is also configured within the range of 5–10 capacity units with a
target value of 50%. For read capacity, you set the values for scale_out_cooldown and
scale_in_cooldown to 60 seconds.
ALTER TABLE mykeyspace.mytable
WITH CUSTOM_PROPERTIES = {
    'capacity_mode': {
        'throughput_mode': 'PROVISIONED',
        'read_capacity_units': 1,
        'write_capacity_units': 1
    }
} AND AUTOSCALING_SETTINGS = {
    'provisioned_write_capacity_autoscaling_update': {
        'maximum_units': 10,
        'minimum_units': 5,
        'scaling_policy': {
            'target_tracking_scaling_policy_configuration': {
                'target_value': 50
            }
        }
    },
    'provisioned_read_capacity_autoscaling_update': {
        'maximum_units': 10,
        'minimum_units': 5,
        'scaling_policy': {
            'target_tracking_scaling_policy_configuration': {
                'target_value': 50,
                'scale_in_cooldown': 60,
                'scale_out_cooldown': 60
            }
        }
    }
};
CLI
Configure an existing table with Amazon Keyspaces automatic scaling using the AWS CLI
For an existing Amazon Keyspaces table, you can turn on auto scaling for the table's write or
read capacity using the UpdateTable operation.
You can use the following command to turn on Amazon Keyspaces auto scaling for an existing
table. The auto scaling settings for the table are loaded from a JSON file. For the following
example, you can download the example JSON file from auto-scaling.zip and extract auto-
scaling.json, taking note of the path to the file. In this example, the JSON file is located in
the current directory. For different file path options, see How to load parameters from a file.
For more information about the auto scaling settings used in the following example, see the
section called “Create a new table with automatic scaling”.
aws keyspaces update-table --keyspace-name mykeyspace --table-name mytable \
    --capacity-specification throughputMode=PROVISIONED,readCapacityUnits=1,writeCapacityUnits=1 \
    --auto-scaling-specification file://auto-scaling.json
View your table's Amazon Keyspaces auto scaling configuration
You can use the console, CQL, or the AWS CLI to view and update the Amazon Keyspaces automatic
scaling settings of a table.
Console
View automatic scaling settings using the console
1. Choose the table you want to view and go to the Capacity tab.
2. In the Capacity settings section, choose Edit. You can now modify the settings in the Read
capacity or Write capacity sections. For more information about these settings, see the
section called “Create a new table with automatic scaling”.
Cassandra Query Language (CQL)
View your table's Amazon Keyspaces automatic scaling policy using CQL
To view details of the auto scaling configuration of a table, use the following command.
SELECT * FROM system_schema_mcs.autoscaling WHERE keyspace_name = 'mykeyspace' AND
table_name = 'mytable';
The output for this command looks like this.
 keyspace_name | table_name | provisioned_read_capacity_autoscaling_update | provisioned_write_capacity_autoscaling_update
---------------+------------+-----------------------------------------------+------------------------------------------------
 mykeyspace    | mytable    | {'minimum_units': 5, 'maximum_units': 10, 'scaling_policy': {'target_tracking_scaling_policy_configuration': {'scale_out_cooldown': 60, 'disable_scale_in': false, 'target_value': 50, 'scale_in_cooldown': 60}}} | {'minimum_units': 5, 'maximum_units': 10, 'scaling_policy': {'target_tracking_scaling_policy_configuration': {'scale_out_cooldown': 0, 'disable_scale_in': false, 'target_value': 50, 'scale_in_cooldown': 0}}}
CLI
View your table's Amazon Keyspaces automatic scaling policy using the AWS CLI
To view the auto scaling configuration of a table, you can use the get-table-auto-scaling-
settings operation. The following CLI command is an example of this.
aws keyspaces get-table-auto-scaling-settings --keyspace-name mykeyspace --table-
name mytable
The output for this command looks like this.
{
    "keyspaceName": "mykeyspace",
    "tableName": "mytable",
    "resourceArn": "arn:aws:cassandra:us-east-1:5555-5555-5555:/keyspace/mykeyspace/table/mytable",
    "autoScalingSpecification": {
        "writeCapacityAutoScaling": {
            "autoScalingDisabled": false,
            "minimumUnits": 5,
            "maximumUnits": 10,
            "scalingPolicy": {
                "targetTrackingScalingPolicyConfiguration": {
                    "disableScaleIn": false,
                    "scaleInCooldown": 0,
                    "scaleOutCooldown": 0,
                    "targetValue": 50.0
                }
            }
        },
        "readCapacityAutoScaling": {
            "autoScalingDisabled": false,
            "minimumUnits": 5,
            "maximumUnits": 10,
            "scalingPolicy": {
                "targetTrackingScalingPolicyConfiguration": {
                    "disableScaleIn": false,
                    "scaleInCooldown": 60,
                    "scaleOutCooldown": 60,
                    "targetValue": 50.0
                }
            }
        }
    }
}
Turn off Amazon Keyspaces auto scaling for a table
You can turn off Amazon Keyspaces auto scaling for your table at any time. If you no longer need
to scale your table's read or write capacity, you should consider turning off auto scaling so that
Amazon Keyspaces doesn't continue modifying your table’s read or write capacity settings. You can
update the table using the console, CQL, or the AWS CLI.
Turning off auto scaling also deletes the CloudWatch alarms that were created on your behalf.
To delete the service-linked role used by Application Auto Scaling to access your Amazon
Keyspaces table, follow the steps in the section called “Deleting a service-linked role for Amazon
Keyspaces”.
Note
To delete the service-linked role that Application Auto Scaling uses, you must disable
automatic scaling on all tables in the account across all AWS Regions.
Console
Turn off Amazon Keyspaces automatic scaling for your table using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. Choose the table you want to update and go to the Capacity tab.
3. In the Capacity settings section, choose Edit.
4. To disable Amazon Keyspaces automatic scaling, clear the Scale automatically check box.
Disabling automatic scaling deregisters the table as a scalable target with Application Auto
Scaling.
Cassandra Query Language (CQL)
Turn off Amazon Keyspaces automatic scaling for your table using CQL
The following statement turns off auto scaling for write capacity of the table mytable.
ALTER TABLE mykeyspace.mytable
WITH AUTOSCALING_SETTINGS = {
'provisioned_write_capacity_autoscaling_update': {
'autoscaling_disabled': true
}
};
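Similarly, to turn off auto scaling for the table's read capacity, the same pattern applies with the read property shown in earlier examples. The following is a sketch based on that pattern.

ALTER TABLE mykeyspace.mytable
WITH AUTOSCALING_SETTINGS = {
    'provisioned_read_capacity_autoscaling_update': {
        'autoscaling_disabled': true
    }
};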
CLI
Turn off Amazon Keyspaces automatic scaling for your table using the AWS CLI
The following command turns off auto scaling for the table's read capacity. It also deletes the
CloudWatch alarms that were created on your behalf.
aws keyspaces update-table --keyspace-name mykeyspace --table-name mytable \
    --auto-scaling-specification readCapacityAutoScaling={autoScalingDisabled=true}
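To turn off auto scaling for the table's write capacity instead, the same pattern applies with the writeCapacityAutoScaling field shown in earlier output. The following is a sketch based on that pattern.

aws keyspaces update-table --keyspace-name mykeyspace --table-name mytable \
    --auto-scaling-specification writeCapacityAutoScaling={autoScalingDisabled=true}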
View auto scaling activity for an Amazon Keyspaces table in Amazon CloudWatch
You can monitor how Amazon Keyspaces automatic scaling uses resources by using Amazon
CloudWatch, which generates metrics about your usage and performance. Follow the steps in the
Application Auto Scaling User Guide to create a CloudWatch dashboard.
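Before building a dashboard, you can also retrieve the relevant metrics directly from the AWS CLI. The following sketch assumes the AWS/Cassandra namespace and the Keyspace and TableName dimensions used by Amazon Keyspaces metrics; adjust the time window and period to your needs.

aws cloudwatch get-metric-statistics \
    --namespace "AWS/Cassandra" \
    --metric-name ProvisionedReadCapacityUnits \
    --dimensions Name=Keyspace,Value=mykeyspace Name=TableName,Value=mytable \
    --start-time 2024-01-01T00:00:00Z \
    --end-time 2024-01-01T06:00:00Z \
    --period 300 \
    --statistics Average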
Use burst capacity effectively in Amazon Keyspaces
Amazon Keyspaces provides some flexibility in your per-partition throughput provisioning by
providing burst capacity. Whenever you're not fully using a partition's throughput, Amazon
Keyspaces reserves a portion of that unused capacity for later bursts of throughput to handle
usage spikes.
Amazon Keyspaces currently retains up to 5 minutes (300 seconds) of unused read and write
capacity. During an occasional burst of read or write activity, these extra capacity units can be
consumed quickly—even faster than the per-second provisioned throughput capacity that you've
defined for your table.
Amazon Keyspaces can also consume burst capacity for background maintenance and other tasks
without prior notice.
Note that these burst capacity details might change in the future.
Working with Amazon Keyspaces (for Apache Cassandra)
features
This chapter provides details about working with Amazon Keyspaces and various database features,
for example backup and restore, Time to Live, and Multi-Region Replication.
Time to Live – Amazon Keyspaces expires data from tables automatically based on the Time to
Live value you set. Learn how to configure TTL and how to use it in your tables.
PITR – Protect your Amazon Keyspaces tables from accidental write or delete operations by
creating continuous backups of your table data. Learn how to configure PITR on your tables
and how to restore a table to a specific point in time or how to restore a table that has been
accidentally deleted.
Working with multi-Region tables – Multi-Region tables in Amazon Keyspaces must have write
throughput capacity configured in either on-demand or provisioned capacity mode with auto
scaling. Plan the throughput capacity needs by estimating the required write capacity units
(WCUs) for each Region, and provision the sum of writes from all Regions to ensure sufficient
capacity for replicated writes.
Static columns – Amazon Keyspaces handles static columns differently from regular columns.
This section covers calculating the encoded size of static columns, metering read/write
operations on static data, and guidelines for working with static columns.
Queries and pagination – Amazon Keyspaces supports advanced querying capabilities like
using the IN operator with SELECT statements, ordering results with ORDER BY, and automatic
pagination of large result sets. This section explains how Amazon Keyspaces processes these
queries and provides examples.
Partitioners – Amazon Keyspaces provides three partitioners: Murmur3Partitioner (default),
RandomPartitioner, and DefaultPartitioner. You can change the partitioner per Region
at the account level using the AWS Management Console or Cassandra Query Language (CQL).
Client-side timestamps – Client-side timestamps are Cassandra-compatible timestamps that
Amazon Keyspaces persists for each cell in your table. Use client-side timestamps for conflict
resolution and to let your client application determine the order of writes.
Tagging resources – You can label Amazon Keyspaces resources like keyspaces and tables using
tags. Tags help categorize resources, enable cost tracking, and let you configure access control
based on tags. This section covers tagging restrictions, operations, and best practices for Amazon
Keyspaces.
AWS CloudFormation templates – AWS CloudFormation helps you model and set up your
Amazon Keyspaces keyspaces and tables so that you can spend less time creating and managing
your resources and infrastructure.
Topics
System keyspaces in Amazon Keyspaces
Multi-Region Replication for Amazon Keyspaces (for Apache Cassandra)
Backup and restore data with point-in-time recovery for Amazon Keyspaces
Expire data with Time to Live (TTL) for Amazon Keyspaces (for Apache Cassandra)
Client-side timestamps in Amazon Keyspaces
Working with CQL queries in Amazon Keyspaces
Working with partitioners in Amazon Keyspaces
Using this service with an AWS SDK
Working with tags and labels for Amazon Keyspaces resources
Create Amazon Keyspaces resources with AWS CloudFormation
Using NoSQL Workbench with Amazon Keyspaces (for Apache Cassandra)
System keyspaces in Amazon Keyspaces
This section provides details about working with system keyspaces in Amazon Keyspaces (for
Apache Cassandra).
Amazon Keyspaces uses four system keyspaces:
system
system_schema
system_schema_mcs
system_multiregion_info
The following sections provide details about the system keyspaces and the system tables that are
supported in Amazon Keyspaces.
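You can query these keyspaces with regular CQL SELECT statements. For example, the following statement (a sketch using columns from the system.peers table described below) lists the peer addresses of the available endpoints.

SELECT peer, rpc_address FROM system.peers;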
system
This is a Cassandra keyspace. Amazon Keyspaces uses the following tables.
local
  Columns: key, bootstrapped, broadcast_address, cluster_name, cql_version, data_center, gossip_generation, host_id, listen_address, native_protocol_version, partitioner, rack, release_version, rpc_address, schema_version, thrift_version, tokens, truncated_at
  Comments: Information about the local keyspace.

peers
  Columns: peer, data_center, host_id, preferred_ip, rack, release_version, rpc_address, schema_version, tokens
  Comments: Query this table to see the available endpoints. For example, if you're connecting through a public endpoint, you see a list of nine available IP addresses. If you're connecting through a FIPS endpoint, you see a list of three IP addresses. If you're connecting through an AWS PrivateLink VPC endpoint, you see the list of IP addresses that you have configured. For more information, see the section called “Populating system.peers table entries with interface VPC endpoint information”.

size_estimates
  Columns: keyspace_name, table_name, range_start, range_end, mean_partition_size, partitions_count
  Comments: This table defines the total size and number of partitions for each token range for every table. This is needed for the Apache Cassandra Spark Connector, which uses the estimated partition size to distribute the work.

prepared_statements
  Columns: prepared_id, logged_keyspace, query_string
  Comments: This table contains information about saved queries.
system_schema
This is a Cassandra keyspace. Amazon Keyspaces uses the following tables.
keyspaces
  Columns: keyspace_name, durable_writes, replication
  Comments: Information about a specific keyspace.

tables
  Columns: keyspace_name, table_name, bloom_filter_fp_chance, caching, comment, compaction, compression, crc_check_chance, dclocal_read_repair_chance, default_time_to_live, extensions, flags, gc_grace_seconds, id, max_index_interval, memtable_flush_period_in_ms, min_index_interval, read_repair_chance, speculative_retry
  Comments: Information about a specific table.

columns
  Columns: keyspace_name, table_name, column_name, clustering_order, column_name_bytes, kind, position, type
  Comments: Information about a specific column.
system_schema_mcs
This is an Amazon Keyspaces keyspace that stores information about AWS or Amazon Keyspaces
specific settings.
keyspaces
  Columns: keyspace_name, durable_writes, replication
  Comments: Query this table to find out programmatically if a keyspace has been created. For more information, see the section called “Check keyspace creation status”.

tables
  Columns: keyspace_name, creation_time, speculative_retry, cdc, gc_grace_seconds, crc_check_chance, min_index_interval, bloom_filter_fp_chance, flags, custom_properties, dclocal_read_repair_chance, table_name, caching, default_time_to_live, read_repair_chance, max_index_interval, extensions, compaction, comment, id, compression, memtable_flush_period_in_ms, status
  Comments: Query this table to find out the status of a specific table. For more information, see the section called “Check table creation status”. You can also query this table to list settings that are specific to Amazon Keyspaces and are stored as custom_properties, for example: capacity_mode, client_side_timestamps, encryption_specification, point_in_time_recovery, and ttl.

tables_history
  Columns: keyspace_name, table_name, event_time, creation_time, custom_properties, event
  Comments: Query this table to learn about schema changes for a specific table.

columns
  Columns: keyspace_name, table_name, column_name, clustering_order, column_name_bytes, kind, position, type
  Comments: This table is identical to the Cassandra table in the system_schema keyspace.

tags
  Columns: resource_id, keyspace_name, resource_name, resource_type, tags
  Comments: Query this table to find out if a keyspace has tags. For more information, see the section called “View table tags”.

autoscaling
  Columns: keyspace_name, table_name, provisioned_read_capacity_autoscaling_update, provisioned_write_capacity_autoscaling_update
  Comments: Query this table to get the auto scaling settings of a provisioned table. Note that these settings won't be available until the table is active. To query this table, you have to specify keyspace_name and table_name in the WHERE clause. For more information, see the section called “View your table's Amazon Keyspaces auto scaling configuration”.
system_multiregion_info
This is an Amazon Keyspaces keyspace that stores information about Multi-Region Replication.
tables
  Columns: keyspace_name, table_name, region, status
  Comments: This table contains information about multi-Region tables, for example the AWS Regions that the table is replicated in and the table's status. You can also query this table to list settings that are specific to Amazon Keyspaces and are stored as custom_properties, for example: capacity_mode. To query this table, you have to specify keyspace_name and table_name in the WHERE clause. For more information, see the section called “Create a multi-Region keyspace”.

autoscaling
  Columns: keyspace_name, table_name, provisioned_read_capacity_autoscaling_update, provisioned_write_capacity_autoscaling_update, region
  Comments: Query this table to get the auto scaling settings of a multi-Region provisioned table. Note that these settings won't be available until the table is active. To query this table, you have to specify keyspace_name and table_name in the WHERE clause. For more information, see the section called “Update provisioned capacity and auto scaling settings for a multi-Region table”.
Multi-Region Replication for Amazon Keyspaces (for Apache
Cassandra)
You can use Amazon Keyspaces Multi-Region Replication to replicate your data with automated,
fully managed, active-active replication across the AWS Regions of your choice. With active-active
replication, each Region is able to perform reads and writes in isolation. You can improve both
availability and resiliency from Regional degradation, while also benefiting from low-latency local
reads and writes for global applications.
With Multi-Region Replication, Amazon Keyspaces asynchronously replicates data between
Regions, and data is typically propagated across Regions within a second. Also, with Multi-Region
Replication, you no longer have the difficult work of resolving conflicts and correcting data
divergence issues, so you can focus on your application.
By default, Amazon Keyspaces replicates data across three Availability Zones within the same AWS
Region for durability and high availability. With Multi-Region Replication, you can create multi-
Region keyspaces that replicate your tables in up to six different geographic AWS Regions of your
choice.
Topics
Benefits of using Multi-Region Replication
Capacity modes and pricing
How Multi-Region Replication works in Amazon Keyspaces
Amazon Keyspaces Multi-Region Replication usage notes
Configure Multi-Region Replication for Amazon Keyspaces (for Apache Cassandra)
Benefits of using Multi-Region Replication
Multi-Region Replication provides the following benefits.
Global reads and writes with single-digit millisecond latency – In Amazon Keyspaces,
replication is active-active. You can serve both reads and writes locally from the Regions closest
to your customers with single-digit millisecond latency at any scale. You can use Amazon
Keyspaces multi-Region tables for global applications that need a fast response time anywhere
in the world.
Improved business continuity and protection from single-Region degradation – With Multi-
Region Replication, you can recover from degradation in a single AWS Region by redirecting your
application to a different Region in your multi-Region keyspace. Because Amazon Keyspaces
offers active-active replication, there is no impact to your reads and writes.
Amazon Keyspaces keeps track of any writes that have been performed on your multi-Region
keyspace but haven't been propagated to all replica Regions. After the Region comes back online,
Amazon Keyspaces automatically syncs any missing changes so that you can recover without any
application impact.
High-speed replication across Regions – Multi-Region Replication uses fast, storage-based
physical replication of data across Regions, with a replication lag that is typically less than 1
second.
Replication in Amazon Keyspaces has little to no impact on your database queries because it
doesn’t share compute resources with your application. This means that you can address high-
write throughput use cases or use cases with sudden spikes or bursts in throughput without any
application impact.
Consistency and conflict resolution – Any changes made to data in any Region are replicated to
the other Regions in a multi-Region keyspace. If applications update the same data in different
Regions at the same time, conflicts can arise.
To help provide eventual consistency, Amazon Keyspaces uses cell-level timestamps and a last
writer wins reconciliation between concurrent updates. Conflict resolution is fully managed and
happens in the background without any application impact.
For more information about supported configurations and features, see the section called “Usage
notes”.
Capacity modes and pricing
For a multi-Region keyspace, you can either use on-demand capacity mode or provisioned capacity
mode. For more information, see the section called “Configure read/write capacity modes”.
For on-demand mode, you're billed 1.25 write request units (WRUs) to write up to 1 KB of data per
row. You're billed for writes in each Region of your multi-Region keyspace. For example, writing a
row of 3 KB of data in a multi-Region keyspace with two Regions requires 7.5 WRUs: 3 * 1.25 * 2 =
7.5 WRUs. Additionally, writes that include both static and non-static data require additional write
operations.
For provisioned mode, you're billed 1.25 write capacity units (WCUs) to write up to 1 KB of data per
row. You're billed for writes in each Region of your multi-Region keyspace. For example, writing a
row of 3 KB of data per second in a multi-Region keyspace with two Regions requires 7.5 WCUs:
3 * 1.25 * 2 = 7.5 WCUs. Additionally, writes that include both static and non-static data require
additional write operations.
For more information about pricing, see Amazon Keyspaces (for Apache Cassandra) pricing.
How Multi-Region Replication works in Amazon Keyspaces
This section provides an overview of how Amazon Keyspaces Multi-Region Replication works. For
more information about pricing, see Amazon Keyspaces (for Apache Cassandra) pricing.
Topics
How Multi-Region Replication works in Amazon Keyspaces
Multi-Region Replication conflict resolution
Multi-Region Replication disaster recovery
Multi-Region Replication and integration with point-in-time recovery (PITR)
Multi-Region Replication and integration with AWS services
How Multi-Region Replication works in Amazon Keyspaces
Amazon Keyspaces Multi-Region Replication implements a data resiliency architecture that
distributes your data across independent and geographically distributed AWS Regions. It uses
active-active replication, which provides local low latency with each Region being able to perform
reads and writes in isolation.
When you create an Amazon Keyspaces multi-Region keyspace, you can select up to five additional
Regions where the data is going to be replicated to. Each table you create in a multi-Region
keyspace consists of multiple replica tables (one per Region) that Amazon Keyspaces considers as a
single unit.
Every replica has the same table name and the same primary key schema. When an application
writes data to a local table in one Region, the data is durably written using the LOCAL_QUORUM
consistency level. Amazon Keyspaces automatically replicates the data asynchronously to the other
replication Regions. The replication lag across Regions is typically less than one second and doesn't
impact your application’s performance or throughput.
After the data is written, you can read it from the multi-Region table in another replication Region
with the LOCAL_ONE/LOCAL_QUORUM consistency levels. For more information about supported
configurations and features, see the section called “Usage notes”.
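For example, in cqlsh you can set the consistency level before reading from a replica table. The table and key below are placeholders.

CONSISTENCY LOCAL_QUORUM;
SELECT * FROM mykeyspace.mytable WHERE pk = 1;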
Multi-Region Replication conflict resolution
Amazon Keyspaces Multi-Region Replication is fully managed, which means that you don't
have to perform replication tasks such as regularly running repair operations to clean-up data
synchronization issues. Amazon Keyspaces monitors data consistency between tables in different
AWS Regions by detecting and repairing conflicts, and synchronizes replicas automatically.
Amazon Keyspaces uses the last writer wins method of data reconciliation. With this conflict
resolution mechanism, all of the Regions in a multi-Region keyspace agree on the latest update
and converge toward a state in which they all have identical data. The reconciliation process has
no impact on application performance. To support conflict resolution, client-side timestamps are
automatically turned on for multi-Region tables and can't be turned off. For more information, see
the section called “Client-side timestamps”.
Multi-Region Replication disaster recovery
With Amazon Keyspaces Multi-Region Replication, writes are replicated asynchronously across each
Region. In the rare event of a single Region degradation or failure, Multi-Region Replication helps
you to recover from disaster with little to no impact to your application. Recovery from disaster is
typically measured using values for Recovery time objective (RTO) and Recovery point objective
(RPO).
Recovery time objective – The time it takes a system to return to a working state after a disaster.
RTO measures the amount of downtime your workload can tolerate, measured in time. For disaster
recovery plans that use Multi-Region Replication to fail over to an unaffected Region, the RTO can
be nearly zero. The RTO is limited by how quickly your application can detect the failure condition
and redirect traffic to another Region.
Recovery point objective – The amount of data that can be lost (measured in time). For disaster
recovery plans that use Multi-Region Replication to fail over to an unaffected Region, the RPO
is typically single-digit seconds. The RPO is limited by replication latency to the failover target
replica.
In the event of a Regional failure or degradation, you don't need to promote a secondary Region
or perform database failover procedures because replication in Amazon Keyspaces is active-active.
Instead, you can use Amazon Route 53 to route your application to the nearest healthy Region. To
learn more about Route 53, see What is Amazon Route 53?
If a single AWS Region becomes isolated or degraded, your application can redirect traffic to a
different Region using Route 53 to perform reads and writes against a different replica table. You
can also apply custom business logic to determine when to redirect requests to other Regions. An
example of this is making your application aware of the multiple endpoints that are available.
When the Region comes back online, Amazon Keyspaces resumes propagating any pending writes
from that Region to the replica tables in other Regions. It also resumes propagating writes from
other replica tables to the Region that is now back online.
Multi-Region Replication and integration with point-in-time recovery (PITR)
Point-in-time recovery is supported in multi-Region tables. To successfully restore a multi-Region
table with PITR, the following conditions have to be met.
The source and the target table must be configured as multi-Region tables.
The replication Regions for the keyspace of the source table and for the keyspace of the target
table must be the same.
You can run the restore statement from any of the Regions that the source table is available in.
Amazon Keyspaces automatically restores the target table in each Region. For more information
about PITR, see the section called “How it works”.
Multi-Region Replication and integration with AWS services
You can monitor replication performance between tables in different AWS Regions by using
Amazon CloudWatch metrics. The following metric provides continuous monitoring of multi-
Region keyspaces.
ReplicationLatency – This metric measures the time it took to replicate updates, inserts,
or deletes from one replica table to another replica table in a multi-Region keyspace.
For more information about how to monitor CloudWatch metrics, see the section called
“Monitoring with CloudWatch”.
Amazon Keyspaces Multi-Region Replication usage notes
Consider the following when you're using Multi-Region Replication with Amazon Keyspaces.
You can select up to six of the available public AWS Regions. AWS GovCloud (US) Regions, China
Regions, and AWS Regions that are disabled by default are not supported.
Consider the following workarounds until the features become available:
Select the replication Regions when you create the keyspace. You can't add or remove Regions
afterwards.
Configure Time to Live (TTL) when creating the multi-Region table. You won't be able to
enable and disable TTL, or adjust the TTL value later. For more information, see the section
called “Expire data with Time to Live”.
For encryption at rest, use an AWS owned key. Customer managed keys are currently not
supported for multi-Region tables. For more information, see
the section called “How it works”.
When you're using provisioned capacity management with Amazon Keyspaces auto scaling, make
sure to use the Amazon Keyspaces API operations to create and configure your multi-Region
tables. The underlying Application Auto Scaling API operations that Amazon Keyspaces calls on
your behalf don't have multi-Region capabilities.
For more information, see the section called “Update provisioned capacity and auto scaling
settings for a multi-Region table”. For more information on how to estimate the write capacity
throughput of provisioned multi-Region tables, see the section called “Estimate capacity for a
multi-Region table”.
Although data is automatically replicated across the selected Regions of a multi-Region table,
when a client connects to an endpoint in one Region and queries the system.peers table, the
query returns only local information. The query result appears like a single data center cluster to
the client.
Amazon Keyspaces Multi-Region Replication is asynchronous, and it supports LOCAL_QUORUM
consistency for writes. LOCAL_QUORUM consistency requires that an update to a row is durably
persisted on two replicas in the local Region before returning success to the client. The
propagation of writes to the replicated Region (or Regions) is then performed asynchronously.
Amazon Keyspaces Multi-Region Replication doesn't support synchronous replication or QUORUM
consistency.
When you create a multi-Region keyspace or table, any tags that you define during the creation
process are automatically applied to all keyspaces and tables in all Regions. When you change
the existing tags using ALTER KEYSPACE or ALTER TABLE, the update is only applied to the
keyspace or table in the Region where you're making the change.
Amazon CloudWatch provides a ReplicationLatency metric for each replicated Region. It
calculates this metric by tracking arriving rows, comparing their arrival time with their initial
write time, and computing an average. Timings are stored within CloudWatch in the source
Region. For more information, see the section called “Monitoring with CloudWatch”.
It can be useful to view the average and maximum timings to determine the average and worst-
case replication lag. There is no SLA on this latency.
When using a multi-Region table in on-demand mode, you may observe an increase in latency
for asynchronous replication of writes if a table replica experiences a new traffic peak. Similar to
how Amazon Keyspaces automatically adapts the capacity of a single-Region on-demand table
to the application traffic it receives, Amazon Keyspaces automatically adapts the capacity of a
multi-Region on-demand table replica to the traffic that it receives. The increase in replication
latency is transient because Amazon Keyspaces automatically allocates more capacity as your
traffic volume increases. Once all replicas have adapted to your traffic volume, replication latency
should return back to normal. For more information, see the section called “Peak traffic and
scaling properties”.
When using a multi-Region table in provisioned mode, if your application exceeds your
provisioned throughput capacity, you may observe insufficient capacity errors and an increase
in replication latency. To ensure that there's always enough read and write capacity for all table
replicas in all AWS Regions of a multi-Region table, we recommend that you configure Amazon
Keyspaces auto scaling. Amazon Keyspaces auto scaling helps you provision throughput capacity
efficiently for variable workloads by adjusting throughput capacity automatically in response to
actual application traffic. For more information, see the section called “How auto scaling works
for multi-Region tables”.
Configure Multi-Region Replication for Amazon Keyspaces (for Apache
Cassandra)
You can use the console, Cassandra Query Language (CQL), or the AWS Command Line Interface to
create and manage multi-Region keyspaces and tables in Amazon Keyspaces.
This section provides examples of how to create and manage multi-Region keyspaces and tables.
All tables that you create in a multi-Region keyspace automatically inherit the multi-Region
settings from the keyspace.
For more information about supported configurations and features, see the section called “Usage
notes”.
Topics
Configure the IAM permissions required to create multi-Region keyspaces and tables
Configure the IAM permissions required to add an AWS Region to a keyspace
Create a multi-Region keyspace in Amazon Keyspaces
Create a multi-Region table with default settings in Amazon Keyspaces
Create a multi-Region table in provisioned mode with auto scaling in Amazon Keyspaces
Update the provisioned capacity and auto scaling settings for a multi-Region table in Amazon
Keyspaces
View the provisioned capacity and auto scaling settings for a multi-Region table in Amazon
Keyspaces
Turn off auto scaling for a table in Amazon Keyspaces
Set the provisioned capacity of a multi-Region table manually in Amazon Keyspaces
Configure the IAM permissions required to create multi-Region keyspaces and
tables
To successfully create multi-Region keyspaces and tables, the IAM principal needs to be able to
create a service-linked role. This service-linked role is a unique type of IAM role that is predefined
by Amazon Keyspaces. It includes all the permissions that Amazon Keyspaces requires to perform
actions on your behalf. For more information about the service-linked role, see the section called
“Multi-Region Replication”.
To create the service-linked role required by Multi-Region Replication, the policy for the IAM
principal requires the following elements:
iam:CreateServiceLinkedRole – The action the principal can perform.
arn:aws:iam::*:role/aws-service-role/replication.cassandra.amazonaws.com/
AWSServiceRoleForKeyspacesReplication – The resource that the action can be
performed on.
iam:AWSServiceName: replication.cassandra.amazonaws.com – The only AWS
service that this role can be attached to is Amazon Keyspaces.
The following is an example of the policy that grants the minimum required permissions to a
principal to create multi-Region keyspaces and tables.
{
    "Effect": "Allow",
    "Action": "iam:CreateServiceLinkedRole",
    "Resource": "arn:aws:iam::*:role/aws-service-role/replication.cassandra.amazonaws.com/AWSServiceRoleForKeyspacesReplication",
    "Condition": {"StringLike": {"iam:AWSServiceName": "replication.cassandra.amazonaws.com"}}
}
For additional IAM permissions for multi-Region keyspaces and tables, see the Actions, resources,
and condition keys for Amazon Keyspaces (for Apache Cassandra) in the Service Authorization
Reference.
Configure the IAM permissions required to add an AWS Region to a keyspace
To add a Region to a keyspace, the IAM principal needs the following permissions:
• cassandra:Alter
• cassandra:AlterMultiRegionResource
• cassandra:Create
• cassandra:CreateMultiRegionResource
• cassandra:Select
• cassandra:SelectMultiRegionResource
• cassandra:Modify
• cassandra:ModifyMultiRegionResource
If the keyspace and table have tags, the IAM principal requires the following additional permissions.
• cassandra:TagResource
• cassandra:TagMultiRegionResource
If the table is configured in provisioned mode with auto scaling enabled, the following additional permissions are needed.
• application-autoscaling:RegisterScalableTarget
• application-autoscaling:DeregisterScalableTarget
• application-autoscaling:DescribeScalableTargets
• application-autoscaling:PutScalingPolicy
• application-autoscaling:DescribeScalingPolicies
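As a sketch, a policy that grants all of the permissions listed above might look like the following. The account ID and keyspace name are placeholders, and the application-autoscaling actions are left with a wildcard resource in this sketch; adjust both to your own environment and security requirements.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cassandra:Alter",
                "cassandra:AlterMultiRegionResource",
                "cassandra:Create",
                "cassandra:CreateMultiRegionResource",
                "cassandra:Select",
                "cassandra:SelectMultiRegionResource",
                "cassandra:Modify",
                "cassandra:ModifyMultiRegionResource",
                "cassandra:TagResource",
                "cassandra:TagMultiRegionResource"
            ],
            "Resource": "arn:aws:cassandra:*:111122223333:/keyspace/mykeyspace/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "application-autoscaling:RegisterScalableTarget",
                "application-autoscaling:DeregisterScalableTarget",
                "application-autoscaling:DescribeScalableTargets",
                "application-autoscaling:PutScalingPolicy",
                "application-autoscaling:DescribeScalingPolicies"
            ],
            "Resource": "*"
        }
    ]
}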
Create a multi-Region keyspace in Amazon Keyspaces
This section provides examples of how to create a multi-Region keyspace. You can do this on the
Amazon Keyspaces console, using CQL or the AWS CLI. All tables that you create in a multi-Region
keyspace automatically inherit the multi-Region settings from the keyspace.
Note
When creating a multi-Region keyspace, Amazon Keyspaces creates a service-linked role
with the name AWSServiceRoleForKeyspacesReplication in your account.
This role allows Amazon Keyspaces to replicate writes to all replicas of a multi-Region table
on your behalf. To learn more, see the section called “Multi-Region Replication”.
Console
Create a multi-Region keyspace (console)
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Keyspaces, and then choose Create keyspace.
3. For Keyspace name, enter the name for the keyspace.
4. In the Multi-Region replication section, choose up to five additional Regions from the list of available Regions.
5. To finish, choose Create keyspace.
Cassandra Query Language (CQL)
Create a multi-Region keyspace using CQL
1. To create a multi-Region keyspace, use NetworkTopologyStrategy to specify the AWS Regions that the keyspace is going to be replicated in. You must include your current Region and at least one additional Region.
All tables in the keyspace inherit the replication strategy from the keyspace. You can't
change the replication strategy at the table level.
NetworkTopologyStrategy – The replication factor for each Region is three, because Amazon Keyspaces replicates data across three Availability Zones within the same AWS Region by default.
The following CQL statement is an example of this.
CREATE KEYSPACE mykeyspace
WITH REPLICATION = {'class':'NetworkTopologyStrategy', 'us-east-1':'3', 'ap-southeast-1':'3', 'eu-west-1':'3'};
2. You can use a CQL statement to query the tables table in the system_multiregion_info keyspace to programmatically list the Regions and the status of the multi-Region table that you specify. The following code is an example of this.
SELECT * from system_multiregion_info.tables WHERE keyspace_name = 'mykeyspace'
AND table_name = 'mytable';
The output of the statement looks like the following:
keyspace_name | table_name | region | status
----------------+----------------+----------------+--------
mykeyspace | mytable | us-east-1 | ACTIVE
mykeyspace | mytable | ap-southeast-1 | ACTIVE
mykeyspace | mytable | eu-west-1 | ACTIVE
CLI
Create a new multi-Region keyspace using the AWS CLI
To create a multi-Region keyspace, you can use the following CLI statement. Specify your
current Region and at least one additional Region in the regionList.
aws keyspaces create-keyspace --keyspace-name mykeyspace \
    --replication-specification replicationStrategy=MULTI_REGION,regionList=us-east-1,eu-west-1
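To confirm the keyspace's replication configuration, you can use the get-keyspace command, as in the following example.
aws keyspaces get-keyspace --keyspace-name mykeyspace
The output includes the keyspace's replicationStrategy and the list of replicationRegions.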
To create a multi-Region table, see the section called “Create a multi-Region table with default
settings” and the section called “Create a multi-Region table in provisioned mode”.
Create a multi-Region table with default settings in Amazon Keyspaces
This section provides examples of how to create a multi-Region table in on-demand mode with all
default settings. You can do this on the Amazon Keyspaces console, using CQL or the AWS CLI. All
tables that you create in a multi-Region keyspace automatically inherit the multi-Region settings
from the keyspace.
To create a multi-Region keyspace, see the section called “Create a multi-Region keyspace”.
Console
Create a multi-Region table with default settings (console)
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. Choose a multi-Region keyspace.
3. On the Tables tab, choose Create table.
4. For Table name, enter the name for the table. The AWS Regions that this table is being
replicated in are shown in the info box.
5. Continue with the table schema.
6. Under Table settings, continue with the Default settings option. Note the following default settings for multi-Region tables.
• Capacity mode – The default capacity mode is On-demand. For more information about configuring provisioned mode, see the section called “Create a multi-Region table in provisioned mode”.
• Encryption key management – Only the AWS owned key option is supported.
• Client-side timestamps – This feature is required for multi-Region tables.
Choose Customize settings if you need to turn on Time to Live (TTL) for the table and
all its replicas.
Note
You won't be able to change TTL settings on an existing multi-Region table.
7. To finish, choose Create table.
Cassandra Query Language (CQL)
Create a multi-Region table in on-demand mode with default settings
To create a multi-Region table with default settings, you can use the following CQL
statement.
CREATE TABLE mykeyspace.mytable(pk int, ck int, PRIMARY KEY (pk, ck))
WITH CUSTOM_PROPERTIES = {
'capacity_mode':{
'throughput_mode':'PAY_PER_REQUEST'
},
'point_in_time_recovery':{
'status':'enabled'
},
'encryption_specification':{
'encryption_type':'AWS_OWNED_KMS_KEY'
},
'client_side_timestamps':{
'status':'enabled'
}
};
CLI
Using the AWS CLI
1. To create a multi-Region table with default settings, you only need to specify the schema.
You can use the following example.
aws keyspaces create-table --keyspace-name mykeyspace --table-name mytable \
    --schema-definition 'allColumns=[{name=pk,type=int}],partitionKeys=[{name=pk}]'
The output of the command is:
{
"resourceArn": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/
mykeyspace/table/mytable"
}
2. To confirm the table's settings, you can use the following statement.
aws keyspaces get-table --keyspace-name mykeyspace --table-name mytable
The output shows all default settings of a multi-Region table.
{
"keyspaceName": "mykeyspace",
"tableName": "mytable",
"resourceArn": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/
mykeyspace/table/mytable",
"creationTimestamp": "2023-12-19T16:50:37.639000+00:00",
"status": "ACTIVE",
"schemaDefinition": {
"allColumns": [
{
"name": "pk",
"type": "int"
}
],
"partitionKeys": [
{
"name": "pk"
}
],
"clusteringKeys": [],
"staticColumns": []
},
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": "2023-12-19T16:50:37.639000+00:00"
},
"encryptionSpecification": {
"type": "AWS_OWNED_KMS_KEY"
},
"pointInTimeRecovery": {
"status": "DISABLED"
},
"defaultTimeToLive": 0,
"comment": {
"message": ""
},
"clientSideTimestamps": {
"status": "ENABLED"
},
"replicaSpecifications": [
{
"region": "us-east-1",
"status": "ACTIVE",
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": 1702895811.469
}
},
{
"region": "eu-north-1",
"status": "ACTIVE",
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": 1702895811.121
}
}
]
}
Create a multi-Region table in provisioned mode with auto scaling in Amazon
Keyspaces
This section provides examples of how to create a multi-Region table in provisioned mode with
auto scaling. You can do this on the Amazon Keyspaces console, using CQL or the AWS CLI.
For more information about supported configurations and Multi-Region Replication features, see
the section called “Usage notes”.
To create a multi-Region keyspace, see the section called “Create a multi-Region keyspace”.
When you create a new multi-Region table in provisioned mode with auto scaling settings, you
can specify the general settings for the table that are valid for all AWS Regions that the table is
replicated in. You can then overwrite read capacity settings and read auto scaling settings for each
replica. The write capacity, however, remains synchronized between all replicas to ensure that
there's enough capacity to replicate writes across all Regions.
Note
Amazon Keyspaces automatic scaling requires the presence of a service-linked role
(AWSServiceRoleForApplicationAutoScaling_CassandraTable) that performs
automatic scaling actions on your behalf. This role is created automatically for you. For
more information, see the section called “Using service-linked roles”.
Console
Create a new multi-Region table with automatic scaling enabled
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. Choose a multi-Region keyspace.
3. On the Tables tab, choose Create table.
4. On the Create table page in the Table details section, select a keyspace and provide a
name for the new table.
5. In the Columns section, create the schema for your table.
6. In the Primary key section, define the primary key of the table and select optional
clustering columns.
7. In the Table settings section, choose Customize settings.
8. Continue to Read/write capacity settings.
9. For Capacity mode, choose Provisioned.
10. In the Read capacity section, confirm that Scale automatically is selected.
You can choose to configure the same read capacity units for all AWS Regions that the table is replicated in. Alternatively, you can clear the check box and configure the read capacity for each Region differently.
If you choose to configure each Region differently, you select the minimum and maximum
read capacity units for each table replica, as well as the target utilization.
• Minimum capacity units – Enter the value for the minimum level of throughput that the table should always be ready to support. The value must be between 1 and the maximum throughput per second quota for your account (40,000 by default).
• Maximum capacity units – Enter the maximum amount of throughput that you want to provision for the table. The value must be between 1 and the maximum throughput per second quota for your account (40,000 by default).
• Target utilization – Enter a target utilization rate between 20% and 90%. When traffic exceeds the defined target utilization rate, capacity is automatically scaled up. When traffic falls below the defined target, it is automatically scaled down again.
Clear the Scale automatically check box if you want to provision the table's read
capacity manually. This setting applies to all replicas of the table.
Note
To ensure that there's enough read capacity for all replicas, we recommend
Amazon Keyspaces automatic scaling for provisioned multi-Region tables.
Note
To learn more about default quotas for your account and how to increase them, see
Quotas.
11. In the Write capacity section, confirm that Scale automatically is selected. Then configure
the capacity units for the table. The write capacity units stay synced across all AWS Regions
to ensure that there is enough capacity to replicate write events across the Regions.
Clear Scale automatically if you want to provision the table's write capacity manually.
This setting applies to all replicas of the table.
Note
To ensure that there's enough write capacity for all replicas, we recommend
Amazon Keyspaces automatic scaling for provisioned multi-Region tables.
12. Choose Create table. Your table is created with the specified automatic scaling parameters.
Cassandra Query Language (CQL)
Create a multi-Region table with provisioned capacity mode and auto scaling using CQL
To create a multi-Region table in provisioned mode with auto scaling, you must first
specify the capacity mode by defining CUSTOM_PROPERTIES for the table. After specifying
provisioned capacity mode, you can configure the auto scaling settings for the table using
AUTOSCALING_SETTINGS.
For detailed information about auto scaling settings, the target tracking policy, target
value, and optional settings, see the section called “Create a new table with automatic
scaling”.
To define the read capacity for a table replica in a specific Region, you can configure the
following parameters as part of the table's replica_updates:
• The Region
• The provisioned read capacity units (optional)
• Auto scaling settings for read capacity (optional)
The following example shows a CREATE TABLE statement for a multi-Region table in
provisioned mode. The general write and read capacity auto scaling settings are the same.
However, the read auto scaling settings specify additional cooldown periods of 60 seconds
before scaling the table's read capacity up or down. In addition, the read capacity auto
scaling settings for the Region US East (N. Virginia) are higher than those for other replicas.
Also, the target value is set to 70% instead of 50%.
CREATE TABLE mykeyspace.mytable(pk int, ck int, PRIMARY KEY (pk, ck))
WITH CUSTOM_PROPERTIES = {
'capacity_mode': {
'throughput_mode': 'PROVISIONED',
'read_capacity_units': 5,
'write_capacity_units': 5
}
} AND AUTOSCALING_SETTINGS = {
'provisioned_write_capacity_autoscaling_update': {
'maximum_units': 10,
'minimum_units': 5,
'scaling_policy': {
'target_tracking_scaling_policy_configuration': {
'target_value': 50
}
}
},
'provisioned_read_capacity_autoscaling_update': {
'maximum_units': 10,
'minimum_units': 5,
'scaling_policy': {
'target_tracking_scaling_policy_configuration': {
'target_value': 50,
'scale_in_cooldown': 60,
'scale_out_cooldown': 60
}
}
},
'replica_updates': {
'us-east-1': {
'provisioned_read_capacity_autoscaling_update': {
'maximum_units': 20,
'minimum_units': 5,
'scaling_policy': {
'target_tracking_scaling_policy_configuration': {
'target_value': 70
}
}
}
}
}
};
CLI
Create a new multi-Region table in provisioned mode with auto scaling using the AWS CLI
To create a multi-Region table in provisioned mode with auto scaling configuration, you
can use the AWS CLI. Note that you must use the Amazon Keyspaces CLI create-table
command to configure multi-Region auto scaling settings. This is because Application Auto
Scaling, the service that Amazon Keyspaces uses to perform auto scaling on your behalf,
doesn't support multiple Regions.
For more information about auto scaling settings, the target tracking policy, target value,
and optional settings, see the section called “Create a new table with automatic scaling”.
To define the read capacity for a table replica in a specific Region, you can configure the
following parameters as part of the table's replicaSpecifications:
• The Region
• The provisioned read capacity units (optional)
• Auto scaling settings for read capacity (optional)
When you're creating provisioned multi-Region tables with complex auto scaling settings
and different configurations for table replicas, it's helpful to load the table's auto scaling
settings and replica configurations from JSON files.
To use the following code example, you can download the example JSON files from auto-
scaling.zip, and extract auto-scaling.json and replication.json. Take note of the
path to the files.
In this example, the JSON files are located in the current directory. For different file path
options, see How to load parameters from a file.
aws keyspaces create-table --keyspace-name mykeyspace --table-name mytable \
    --schema-definition 'allColumns=[{name=pk,type=int},{name=ck,type=int}],partitionKeys=[{name=pk},{name=ck}]' \
    --capacity-specification throughputMode=PROVISIONED,readCapacityUnits=1,writeCapacityUnits=1 \
    --auto-scaling-specification file://auto-scaling.json \
    --replica-specifications file://replication.json
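If you prefer to author the JSON files yourself instead of downloading the examples, the following sketches show the general shape they can take. The field names follow the Amazon Keyspaces API structures shown later in this section (autoScalingSpecification and replicaSpecifications); treat the capacity values as placeholders.
An example auto-scaling.json:
{
    "writeCapacityAutoScaling": {
        "autoScalingDisabled": false,
        "minimumUnits": 5,
        "maximumUnits": 10,
        "scalingPolicy": {
            "targetTrackingScalingPolicyConfiguration": {
                "targetValue": 50
            }
        }
    },
    "readCapacityAutoScaling": {
        "autoScalingDisabled": false,
        "minimumUnits": 5,
        "maximumUnits": 10,
        "scalingPolicy": {
            "targetTrackingScalingPolicyConfiguration": {
                "targetValue": 50,
                "scaleInCooldown": 60,
                "scaleOutCooldown": 60
            }
        }
    }
}
An example replication.json that overrides the read capacity auto scaling settings for one replica:
[
    {
        "region": "us-east-1",
        "readCapacityUnits": 5,
        "readCapacityAutoScaling": {
            "autoScalingDisabled": false,
            "minimumUnits": 5,
            "maximumUnits": 20,
            "scalingPolicy": {
                "targetTrackingScalingPolicyConfiguration": {
                    "targetValue": 70
                }
            }
        }
    }
]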
Update the provisioned capacity and auto scaling settings for a multi-Region
table in Amazon Keyspaces
This section includes examples of how to use the console, CQL, and the AWS CLI to manage the
Amazon Keyspaces auto scaling settings of provisioned multi-Region tables. For more information
about general auto scaling configuration options and how they work, see the section called
“Manage throughput capacity with auto scaling”.
Note that if you're using provisioned capacity mode for multi-Region tables, you must always use
Amazon Keyspaces API calls to configure auto scaling. This is because the underlying Application
Auto Scaling API operations are not Region-aware.
For more information on how to estimate write capacity throughput of provisioned multi-Region
tables, see the section called “Estimate capacity for a multi-Region table”.
For more information about the Amazon Keyspaces API, see Amazon Keyspaces API Reference.
When you update the provisioned mode or auto scaling settings of a multi-Region table, you can
update read capacity settings and the read auto scaling configuration for each replica of the table.
The write capacity, however, remains synchronized between all replicas to ensure that there's
enough capacity to replicate writes across all Regions.
Cassandra Query Language (CQL)
Update the provisioned capacity and auto scaling settings of a multi-Region table using CQL
You can use ALTER TABLE to update the capacity mode and auto scaling settings of an
existing table. If you're updating a table that is currently in on-demand capacity mode,
capacity_mode is required. If your table is already in provisioned capacity mode, this field
can be omitted.
For detailed information about auto scaling settings, the target tracking policy, target
value, and optional settings, see the section called “Create a new table with automatic
scaling”.
In the same statement, you can also update the read capacity and auto scaling settings of
table replicas in specific Regions by updating the table's replica_updates property. The
following statement is an example of this.
ALTER TABLE mykeyspace.mytable
WITH CUSTOM_PROPERTIES = {
'capacity_mode': {
'throughput_mode': 'PROVISIONED',
'read_capacity_units': 1,
'write_capacity_units': 1
}
} AND AUTOSCALING_SETTINGS = {
'provisioned_write_capacity_autoscaling_update': {
'maximum_units': 10,
'minimum_units': 5,
'scaling_policy': {
'target_tracking_scaling_policy_configuration': {
'target_value': 50
}
}
},
'provisioned_read_capacity_autoscaling_update': {
'maximum_units': 10,
'minimum_units': 5,
'scaling_policy': {
'target_tracking_scaling_policy_configuration': {
'target_value': 50,
'scale_in_cooldown': 60,
'scale_out_cooldown': 60
}
}
},
'replica_updates': {
'us-east-1': {
'provisioned_read_capacity_autoscaling_update': {
'maximum_units': 20,
'minimum_units': 5,
'scaling_policy': {
'target_tracking_scaling_policy_configuration': {
'target_value': 70
}
}
}
}
}
};
CLI
Update the provisioned capacity and auto scaling settings of a multi-Region table using the
AWS CLI
To update the provisioned mode and auto scaling configuration of an existing table, you
can use the AWS CLI update-table command.
Note that you must use the Amazon Keyspaces CLI commands to create or modify multi-
Region auto scaling settings. This is because Application Auto Scaling, the service that
Amazon Keyspaces uses to perform auto scaling of table capacity on your behalf, doesn't
support multiple AWS Regions.
To update the read capacity for a table replica in a specific Region, you can change one of
the following optional parameters of the table's replicaSpecifications:
• The provisioned read capacity units (optional)
• Auto scaling settings for read capacity (optional)
When you're updating multi-Region tables with complex auto scaling settings and different
configurations for table replicas, it's helpful to load the table's auto scaling settings and
replica configurations from JSON files.
To use the following code example, you can download the example JSON files from auto-
scaling.zip, and extract auto-scaling.json and replication.json. Take note of the
path to the files.
In this example, the JSON files are located in the current directory. For different file path
options, see How to load parameters from a file.
aws keyspaces update-table --keyspace-name mykeyspace --table-name mytable \
    --capacity-specification throughputMode=PROVISIONED,readCapacityUnits=1,writeCapacityUnits=1 \
    --auto-scaling-specification file://auto-scaling.json \
    --replica-specifications file://replication.json
View the provisioned capacity and auto scaling settings for a multi-Region table
in Amazon Keyspaces
You can view a multi-Region table's provisioned capacity and auto scaling settings on the Amazon
Keyspaces console, using CQL, or the AWS CLI. This section provides examples of how to do this
using CQL and the AWS CLI.
Cassandra Query Language (CQL)
View the provisioned capacity and auto scaling settings of a multi-Region table using CQL
To view the auto scaling configuration of a multi-Region table, use the following command.
SELECT * FROM system_multiregion_info.autoscaling WHERE keyspace_name =
'mykeyspace' AND table_name = 'mytable';
The output for this command looks like the following:
 keyspace_name | table_name | region         | provisioned_read_capacity_autoscaling_update | provisioned_write_capacity_autoscaling_update
---------------+------------+----------------+----------------------------------------------+-----------------------------------------------
 mykeyspace    | mytable    | ap-southeast-1 | {'minimum_units': 5, 'maximum_units': 10, 'scaling_policy': {'target_tracking_scaling_policy_configuration': {'scale_out_cooldown': 60, 'disable_scale_in': false, 'target_value': 50, 'scale_in_cooldown': 60}}} | {'minimum_units': 5, 'maximum_units': 10, 'scaling_policy': {'target_tracking_scaling_policy_configuration': {'scale_out_cooldown': 0, 'disable_scale_in': false, 'target_value': 50, 'scale_in_cooldown': 0}}}
 mykeyspace    | mytable    | us-east-1      | {'minimum_units': 5, 'maximum_units': 20, 'scaling_policy': {'target_tracking_scaling_policy_configuration': {'scale_out_cooldown': 60, 'disable_scale_in': false, 'target_value': 70, 'scale_in_cooldown': 60}}} | {'minimum_units': 5, 'maximum_units': 10, 'scaling_policy': {'target_tracking_scaling_policy_configuration': {'scale_out_cooldown': 0, 'disable_scale_in': false, 'target_value': 50, 'scale_in_cooldown': 0}}}
 mykeyspace    | mytable    | eu-west-1      | {'minimum_units': 5, 'maximum_units': 10, 'scaling_policy': {'target_tracking_scaling_policy_configuration': {'scale_out_cooldown': 60, 'disable_scale_in': false, 'target_value': 50, 'scale_in_cooldown': 60}}} | {'minimum_units': 5, 'maximum_units': 10, 'scaling_policy': {'target_tracking_scaling_policy_configuration': {'scale_out_cooldown': 0, 'disable_scale_in': false, 'target_value': 50, 'scale_in_cooldown': 0}}}
CLI
View the provisioned capacity and auto scaling settings of a multi-Region table using the
AWS CLI
To view the auto scaling configuration of a multi-Region table, you can use the get-
table-auto-scaling-settings operation. The following CLI command is an example
of this.
aws keyspaces get-table-auto-scaling-settings --keyspace-name mykeyspace --table-name mytable
You should see the following output.
{
"keyspaceName": "mykeyspace",
"tableName": "mytable",
"resourceArn": "arn:aws:cassandra:us-east-1:777788889999:/keyspace/
mykeyspace/table/mytable",
"autoScalingSpecification": {
"writeCapacityAutoScaling": {
"autoScalingDisabled": false,
"minimumUnits": 5,
"maximumUnits": 10,
"scalingPolicy": {
"targetTrackingScalingPolicyConfiguration": {
"disableScaleIn": false,
"scaleInCooldown": 0,
"scaleOutCooldown": 0,
"targetValue": 50.0
}
}
},
"readCapacityAutoScaling": {
"autoScalingDisabled": false,
"minimumUnits": 5,
"maximumUnits": 20,
"scalingPolicy": {
"targetTrackingScalingPolicyConfiguration": {
"disableScaleIn": false,
"scaleInCooldown": 60,
"scaleOutCooldown": 60,
"targetValue": 70.0
}
}
}
},
"replicaSpecifications": [
{
"region": "us-east-1",
"autoScalingSpecification": {
"writeCapacityAutoScaling": {
"autoScalingDisabled": false,
"minimumUnits": 5,
"maximumUnits": 10,
"scalingPolicy": {
"targetTrackingScalingPolicyConfiguration": {
"disableScaleIn": false,
"scaleInCooldown": 0,
"scaleOutCooldown": 0,
"targetValue": 50.0
}
}
},
"readCapacityAutoScaling": {
"autoScalingDisabled": false,
"minimumUnits": 5,
"maximumUnits": 20,
"scalingPolicy": {
"targetTrackingScalingPolicyConfiguration": {
"disableScaleIn": false,
"scaleInCooldown": 60,
"scaleOutCooldown": 60,
"targetValue": 70.0
}
}
}
}
},
{
"region": "eu-north-1",
"autoScalingSpecification": {
"writeCapacityAutoScaling": {
"autoScalingDisabled": false,
"minimumUnits": 5,
"maximumUnits": 10,
"scalingPolicy": {
"targetTrackingScalingPolicyConfiguration": {
"disableScaleIn": false,
"scaleInCooldown": 0,
"scaleOutCooldown": 0,
"targetValue": 50.0
}
}
},
"readCapacityAutoScaling": {
"autoScalingDisabled": false,
"minimumUnits": 5,
"maximumUnits": 10,
"scalingPolicy": {
"targetTrackingScalingPolicyConfiguration": {
"disableScaleIn": false,
"scaleInCooldown": 60,
"scaleOutCooldown": 60,
"targetValue": 50.0
}
}
}
}
}
]
}
Turn off auto scaling for a table in Amazon Keyspaces
This section provides examples of how to turn off auto scaling for a multi-Region table in
provisioned capacity mode. You can do this on the Amazon Keyspaces console, using CQL or the
AWS CLI.
Important
We recommend using auto scaling for multi-Region tables that use provisioned capacity
mode. For more information, see the section called “Estimate capacity for a multi-Region
table”.
Note
To delete the service-linked role that Application Auto Scaling uses, you must disable
automatic scaling on all tables in the account across all AWS Regions.
Console
Turn off Amazon Keyspaces automatic scaling for an existing multi-Region table on the
console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. Choose the table that you want to work with and choose the Capacity tab.
3. In the Capacity settings section, choose Edit.
4. To disable Amazon Keyspaces automatic scaling, clear the Scale automatically check box.
Disabling automatic scaling deregisters the table as a scalable target with Application Auto
Scaling. To delete the service-linked role that Application Auto Scaling uses to access your
Amazon Keyspaces table, follow the steps in the section called “Deleting a service-linked
role for Amazon Keyspaces”.
5. When the automatic scaling settings are defined, choose Save.
Cassandra Query Language (CQL)
Turn off auto scaling for a multi-Region table using CQL
You can use ALTER TABLE to turn off auto scaling for an existing table. Note that you can't
turn off auto scaling for an individual table replica.
In the following example, auto scaling is turned off for the table's read capacity.
ALTER TABLE mykeyspace.mytable
WITH AUTOSCALING_SETTINGS = {
'provisioned_read_capacity_autoscaling_update': {
'autoscaling_disabled': true
}
};
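Based on the property names used earlier in this section, a sketch that turns off auto scaling for both read and write capacity in a single statement might look like the following.
ALTER TABLE mykeyspace.mytable
WITH AUTOSCALING_SETTINGS = {
    'provisioned_read_capacity_autoscaling_update': {
        'autoscaling_disabled': true
    },
    'provisioned_write_capacity_autoscaling_update': {
        'autoscaling_disabled': true
    }
};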
CLI
Turn off auto scaling for a multi-Region table using the AWS CLI
You can use the AWS CLI update-table command to turn off auto scaling for an existing
table. Note that you can't turn off auto scaling for an individual table replica.
In the following example, auto scaling is turned off for the table's read capacity.
aws keyspaces update-table --keyspace-name mykeyspace --table-name mytable \
    --auto-scaling-specification readCapacityAutoScaling={autoScalingDisabled=true}
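To confirm the change, you can review the table's auto scaling configuration with the following command; in the output, autoScalingDisabled should be true for the capacity setting that you turned off.
aws keyspaces get-table-auto-scaling-settings --keyspace-name mykeyspace --table-name mytable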
Set the provisioned capacity of a multi-Region table manually in Amazon
Keyspaces
If you have to turn off auto scaling for a multi-Region table, you can provision the table's read
capacity for a replica table manually using CQL or the AWS CLI.
Note
We recommend using auto scaling for multi-Region tables that use provisioned capacity
mode. For more information, see the section called “Estimate capacity for a multi-Region
table”.
Cassandra Query Language (CQL)
Setting the provisioned capacity of a multi-Region table manually using CQL
You can use ALTER TABLE to provision the table's read capacity for a replica table
manually.
ALTER TABLE mykeyspace.mytable
WITH CUSTOM_PROPERTIES = {
'capacity_mode': {
'throughput_mode': 'PROVISIONED',
'read_capacity_units': 1,
'write_capacity_units': 1
},
'replica_updates': {
'us-east-1': {
'read_capacity_units': 2
}
}
};
CLI
Set the provisioned capacity of a multi-Region table manually using the AWS CLI
If you have to turn off auto scaling for a multi-Region table, you can use update-table to
provision the table's read capacity for a replica table manually.
aws keyspaces update-table --keyspace-name mykeyspace --table-name mytable \
    --capacity-specification throughputMode=PROVISIONED,readCapacityUnits=1,writeCapacityUnits=1 \
    --replica-specifications region="us-east-1",readCapacityUnits=5
Backup and restore data with point-in-time recovery for
Amazon Keyspaces
Point-in-time recovery (PITR) helps protect your Amazon Keyspaces tables from accidental write or delete operations by providing continuous backups of your table data.
For example, suppose that a test script writes accidentally to a production Amazon Keyspaces
table. With point-in-time recovery, you can restore that table's data to any second in time since
PITR was enabled within the last 35 days. If you delete a table with point-in-time recovery enabled,
you can query for the deleted table's data for 35 days (at no additional cost), and restore it to the
state it was in just before the point of deletion.
You can restore an Amazon Keyspaces table to a point in time by using the console, the AWS SDK
and the AWS Command Line Interface (AWS CLI), or Cassandra Query Language (CQL). For more
information, see Use point-in-time recovery in Amazon Keyspaces.
Point-in-time operations have no performance or availability impact on the base table, and
restoring a table doesn't consume additional throughput.
For information about PITR quotas, see Quotas.
For information about pricing, see Amazon Keyspaces (for Apache Cassandra) pricing.
Topics
How point-in-time recovery works in Amazon Keyspaces
Use point-in-time recovery in Amazon Keyspaces
How point-in-time recovery works in Amazon Keyspaces
This section provides an overview of how Amazon Keyspaces point-in-time recovery (PITR) works.
For more information about pricing, see Amazon Keyspaces (for Apache Cassandra) pricing.
Topics
Time window for PITR continuous backups
PITR restore settings
PITR restore of encrypted tables
PITR restore of multi-Region tables
Table restore time with PITR
Amazon Keyspaces PITR and integration with AWS services
Time window for PITR continuous backups
Amazon Keyspaces PITR uses two timestamps to maintain the time frame for which restorable
backups are available for a table.
• Earliest restorable time – Marks the time of the earliest restorable backup. The earliest restorable backup goes back up to 35 days or to when PITR was enabled, whichever is more recent. The maximum backup window of 35 days can't be modified.
• Current time – The timestamp for the latest restorable backup is the current time. If no timestamp is provided during a restore, the current time is used.
When PITR is enabled, you can restore to any point in time between
EarliestRestorableDateTime and CurrentTime. You can only restore table data to a time
when PITR was enabled.
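For example, you can check a table's current restore window with the AWS CLI by filtering the get-table output. The --query option used here is the standard AWS CLI client-side filter.
aws keyspaces get-table --keyspace-name mykeyspace --table-name mytable \
    --query 'pointInTimeRecovery'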
If you disable PITR and later reenable it, you reset the start time for the first available backup to when PITR was reenabled. This means that disabling PITR erases your backup history.
Note
Data definition language (DDL) operations on tables, such as schema changes, are
performed asynchronously. You can only see completed operations in your restored
table data, but you might see additional actions on your source table if they were in
progress at the time of the restore. For a list of DDL statements, see the section called “DDL
statements”.
A table doesn't have to be active to be restored. You can also restore deleted tables if PITR was
enabled on the deleted table and the deletion occurred within the backup window (or within the
last 35 days).
Note
If a new table is created with the same qualified name (for example, mykeyspace.mytable)
as a previously deleted table, the deleted table will no longer be restorable. If you attempt
to do this from the console, a warning is displayed.
PITR restore settings
When you restore a table using PITR, Amazon Keyspaces restores your source table's schema and
data to the state based on the selected timestamp (day:hour:minute:second) to a new table.
PITR doesn't overwrite existing tables.
In addition to the table's schema and data, PITR restores the custom_properties from the
source table. Unlike the table's data, which is restored based on the selected timestamp between
earliest restore time and current time, custom properties are always restored based on the table's
settings as of the current time.
The settings of the restored table match the settings of the source table with the timestamp of
when the restore was initiated. If you want to overwrite these settings during restore, you can do
so using WITH custom_properties. Custom properties include the following settings.
• Read/write capacity mode
• Provisioned throughput capacity settings
• PITR settings
If the table is in provisioned capacity mode with auto scaling enabled, the restore
operation also restores the table's auto scaling settings. You can overwrite them using the
autoscaling_settings parameter in CQL or autoScalingSpecification with the CLI. For
more information on auto scaling settings, see the section called “Manage throughput capacity
with auto scaling”.
When you do a full table restore, all table settings for the restored table come from the current
settings of the source table at the time of the restore.
For example, suppose that a table's provisioned throughput was recently lowered to 50 read
capacity units and 50 write capacity units. You then restore the table's state to three weeks ago.
At this time, its provisioned throughput was set to 100 read capacity units and 100 write capacity
units. In this case, Amazon Keyspaces restores your table data to that point in time, but uses the
current provisioned throughput settings (50 read capacity units and 50 write capacity units).
The following settings are not restored, and you must configure them manually for the new table.
• AWS Identity and Access Management (IAM) policies
• Amazon CloudWatch metrics and alarms
• Tags (can be added to the CQL RESTORE statement using WITH TAGS, as shown in the sketch after this list)
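As a sketch of how you can override settings and add tags during a restore, the following CQL statement combines WITH restore_timestamp, custom_properties, and TAGS. The capacity values and the tag are placeholders; see the section called “RESTORE TABLE” in the language reference for the full syntax.
RESTORE TABLE mykeyspace.mytable_restored
FROM TABLE mykeyspace.mytable
WITH restore_timestamp = '2020-06-30T19:19:21.175Z'
AND custom_properties = {
    'capacity_mode': {
        'throughput_mode': 'PROVISIONED',
        'read_capacity_units': 100,
        'write_capacity_units': 100
    }
}
AND TAGS = {'environment':'staging'};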
PITR restore of encrypted tables
When you restore a table using PITR, Amazon Keyspaces restores your source table's encryption
settings. If the table was encrypted with an AWS owned key (default), the table is restored with
the same setting automatically. If the table you want to restore was encrypted using a customer
managed key, the same customer managed key needs to be accessible to Amazon Keyspaces to
restore the table data.
You can change the encryption settings of the table at the time of restore. To change from an
AWS owned key to a customer managed key, you need to supply a valid and accessible customer
managed key at the time of restore.
If you want to change from a customer managed key to an AWS owned key, confirm that Amazon Keyspaces has access to the customer managed key of the source table to restore the table with an AWS owned key. For more information about encryption at rest settings for tables, see the section called “How it works”.
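For example, to restore a table with a customer managed key, you might pass the encryption settings as custom properties at restore time, as in the following sketch. The key ARN is a placeholder, and the key must be accessible to Amazon Keyspaces.
RESTORE TABLE mykeyspace.mytable_restored
FROM TABLE mykeyspace.mytable
WITH custom_properties = {
    'encryption_specification': {
        'encryption_type': 'CUSTOMER_MANAGED_KMS_KEY',
        'kms_key_identifier': 'arn:aws:kms:us-east-1:111122223333:key/11111111-2222-3333-4444-555555555555'
    }
};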
Note
If the table was deleted because Amazon Keyspaces lost access to your customer managed
key, you need to ensure the customer managed key is accessible to Amazon Keyspaces
before trying to restore the table. A table that was encrypted with a customer managed
key can't be restored if Amazon Keyspaces doesn't have access to that key. For more
information, see Troubleshooting key access in the AWS Key Management Service
Developer Guide.
PITR restore of multi-Region tables
You can restore a multi-Region table using PITR. For the restore operation to be successful, both
the source and the destination table have to be replicated to the same AWS Regions.
Amazon Keyspaces restores the settings of the source table in each of the replicated Regions that
are part of the keyspace. You can also override settings during the restore operation. For more
information about settings that can be changed during the restore, see the section called “Restore
settings”.
For more information about Multi-Region Replication, see the section called “How it works”.
Table restore time with PITR
The time it takes to restore a table depends on multiple factors and isn't always correlated directly to the size of the table.
The following are some considerations for restore times.
• You restore backups to a new table. It can take up to 20 minutes (even if the table is empty) to perform all the actions to create the new table and initiate the restore process.
• Restore times for large tables with well-distributed data models can be several hours or longer.
• If your source table contains data that is significantly skewed, the time to restore might increase. For example, if your table’s primary key is using the month of the year as a partition key, and all your data is from the month of December, you have skewed data.
A best practice when planning for disaster recovery is to regularly document average restore
completion times and establish how these times affect your overall Recovery Time Objective.
Amazon Keyspaces PITR and integration with AWS services
The following PITR operations are logged using AWS CloudTrail to enable continuous monitoring
and auditing.
• Create a new table with PITR enabled or disabled.
• Enable or disable PITR on an existing table.
• Restore an active or a deleted table.
For more information, see Logging Amazon Keyspaces API calls with AWS CloudTrail.
You can perform the following PITR actions using AWS CloudFormation.
• Create a new table with PITR enabled or disabled.
• Enable or disable PITR on an existing table.
For more information, see the Cassandra Resource Type Reference in the AWS CloudFormation User
Guide.
Use point-in-time recovery in Amazon Keyspaces
With Amazon Keyspaces (for Apache Cassandra), you can restore tables to a specific point in time using point-in-time recovery (PITR). PITR enables you to restore a table to a prior state within the last 35 days, providing data protection and recovery capabilities. This feature is valuable in cases
such as accidental data deletion, application errors, or for testing purposes. You can quickly and
efficiently recover data, minimizing downtime and data loss. The following sections guide you
through the process of restoring tables using PITR in Amazon Keyspaces, ensuring data integrity
and business continuity.
Topics
Configure restore table IAM permissions for Amazon Keyspaces PITR
Configure PITR for a table in Amazon Keyspaces
Turn off PITR for an Amazon Keyspaces table
Restore a table from backup to a specified point in time in Amazon Keyspaces
Restore a deleted table using Amazon Keyspaces PITR
Configure restore table IAM permissions for Amazon Keyspaces PITR
This section summarizes how to configure permissions for an AWS Identity and Access
Management (IAM) principal to restore Amazon Keyspaces tables. In IAM, the AWS managed policy
AmazonKeyspacesFullAccess includes the permissions to restore Amazon Keyspaces tables.
To implement a custom policy with minimum required permissions, consider the requirements
outlined in the next section.
To successfully restore a table, the IAM principal needs the following minimum permissions:
• cassandra:Restore – The restore action is required for the target table to be restored.
• cassandra:Select – The select action is required to read from the source table.
• cassandra:TagResource – The tag action is optional, and only required if the restore operation adds tags.
This is an example of a policy that grants minimum required permissions to a user to restore tables
in keyspace mykeyspace.
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"cassandra:Restore",
"cassandra:Select"
],
"Resource":[
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/*",
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
]
}
]
}
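If the restore operation also adds tags, you can extend the statement's action list with the optional cassandra:TagResource action, for example:
{
    "Version":"2012-10-17",
    "Statement":[
        {
            "Effect":"Allow",
            "Action":[
                "cassandra:Restore",
                "cassandra:Select",
                "cassandra:TagResource"
            ],
            "Resource":[
                "arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/*",
                "arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
            ]
        }
    ]
}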
Additional permissions to restore a table might be required based on other selected features. For
example, if the source table is encrypted at rest with a customer managed key, Amazon Keyspaces
must have permissions to access the customer managed key of the source table to successfully
restore the table. For more information, see the section called “PITR and encrypted tables”.
If you are using IAM policies with condition keys to restrict incoming traffic to specific sources,
you must ensure that Amazon Keyspaces has permission to perform a restore operation on your
principal's behalf. You must add an aws:ViaAWSService condition key to your IAM policy if your
policy restricts incoming traffic to any of the following:
• VPC endpoints with aws:SourceVpce
• IP ranges with aws:SourceIp
• VPCs with aws:SourceVpc
The aws:ViaAWSService condition key allows access when any AWS service makes a request
using the principal's credentials. For more information, see IAM JSON policy elements: Condition
key in the IAM User Guide.
The following is an example of a policy that restricts source traffic to a specific IP address and
allows Amazon Keyspaces to restore a table on the principal's behalf.
{
"Version":"2012-10-17",
"Statement":[
{
"Sid":"CassandraAccessForCustomIp",
"Effect":"Allow",
"Action":"cassandra:*",
"Resource":"*",
"Condition":{
"Bool":{
"aws:ViaAWSService":"false"
},
"ForAnyValue:IpAddress":{
"aws:SourceIp":[
"123.45.167.89"
]
}
}
},
{
"Sid":"CassandraAccessForAwsService",
"Effect":"Allow",
"Action":"cassandra:*",
"Resource":"*",
"Condition":{
"Bool":{
"aws:ViaAWSService":"true"
}
}
}
]
}
For an example policy using the aws:ViaAWSService global condition key, see the section called
“VPC endpoint policies and Amazon Keyspaces point-in-time recovery (PITR)”.
Configure PITR for a table in Amazon Keyspaces
You can configure a table in Amazon Keyspaces for backup and restore operations using PITR with
the console, CQL, and the AWS CLI.
When you create a new table using CQL or the AWS CLI, you must explicitly enable PITR in the create table statement. When you create a new table using the console, PITR is enabled by default.
To learn how to restore a table, see the section called “Restore a table to a point in time”.
Console
Configure PITR for a table using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables and select the table you want to edit.
3. On the Backups tab, choose Edit.
4. In the Edit point-in-time recovery settings section, select Enable Point-in-time recovery.
5. Choose Save changes.
Cassandra Query Language (CQL)
Configure PITR for a table using CQL
1. You can manage PITR settings for tables by using the point_in_time_recovery custom property.
To enable PITR when you're creating a new table, you must set the status of
point_in_time_recovery to enabled. You can use the following CQL command as an
example.
CREATE TABLE "my_keyspace1"."my_table1"(
"id" int,
"name" ascii,
"date" timestamp,
PRIMARY KEY("id"))
WITH CUSTOM_PROPERTIES = {
'capacity_mode':{'throughput_mode':'PAY_PER_REQUEST'},
'point_in_time_recovery':{'status':'enabled'}
}
Note
If no point-in-time recovery custom property is specified, point-in-time recovery is
disabled by default.
2. To enable PITR for an existing table using CQL, run the following CQL command.
ALTER TABLE mykeyspace.mytable
WITH custom_properties = {'point_in_time_recovery': {'status': 'enabled'}}
CLI
Configure PITR for a table using the AWS CLI
1. You can manage PITR settings for tables by using the UpdateTable API.
To enable PITR when you're creating a new table, you must include point-in-time-
recovery 'status=ENABLED' in the create table command. You can use the following
AWS CLI command as an example. The command has been broken into separate lines to
improve readability.
aws keyspaces create-table --keyspace-name 'myKeyspace' --table-name 'myTable' \
    --schema-definition 'allColumns=[{name=id,type=int},{name=name,type=text},{name=date,type=timestamp}],partitionKeys=[{name=id}]' \
    --point-in-time-recovery 'status=ENABLED'
Note
If no point-in-time recovery value is specified, point-in-time recovery is disabled by
default.
2. To confirm the point-in-time recovery setting for a table, you can use the following AWS
CLI command.
aws keyspaces get-table --keyspace-name 'myKeyspace' --table-name 'myTable'
3. To enable PITR for an existing table using the AWS CLI, run the following command.
aws keyspaces update-table --keyspace-name 'myKeyspace' --table-name 'myTable' \
    --point-in-time-recovery 'status=ENABLED'
Turn off PITR for an Amazon Keyspaces table
You can turn off PITR for an Amazon Keyspaces table at any time using the console, CQL, or the
AWS CLI.
Important
Disabling PITR deletes your backup history immediately, even if you reenable PITR on the
table within 35 days.
To learn how to restore a table, see the section called “Restore a table to a point in time”.
Console
Disable PITR for a table using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables and select the table you want to edit.
3. On the Backups tab, choose Edit.
4. In the Edit point-in-time recovery settings section, clear the Enable Point-in-time
recovery check box.
5. Choose Save changes.
Cassandra Query Language (CQL)
Disable PITR for a table using CQL
To disable PITR for an existing table, run the following CQL command.
ALTER TABLE mykeyspace.mytable
WITH custom_properties = {'point_in_time_recovery': {'status': 'disabled'}}
CLI
Disable PITR for a table using the AWS CLI
To disable PITR for an existing table, run the following AWS CLI command.
aws keyspaces update-table --keyspace-name 'myKeyspace' --table-name 'myTable' \
    --point-in-time-recovery 'status=DISABLED'
Restore a table from backup to a specified point in time in Amazon Keyspaces
The following section demonstrates how to restore an existing Amazon Keyspaces table to a
specified point in time.
Note
This procedure assumes that the table you're using has been configured with point-in-time
recovery. To enable PITR for a table, see the section called “Configure PITR”.
Important
While a restore is in progress, don't modify or delete the AWS Identity and Access
Management (IAM) policies that grant the IAM principal (for example, user, group, or
role) permission to perform the restore. Otherwise, unexpected behavior can result. For
example, if you remove write permissions for a table while that table is being restored, the
underlying RestoreTableToPointInTime operation can't write any of the restored data
to the table.
You can modify or delete permissions only after the restore operation is complete.
Console
Restore a table to a specified point in time using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane on the left side of the console, choose Tables.
3. In the list of tables, choose the table you want to restore.
4. On the Backups tab of the table, in the Point-in-time recovery section, choose Restore.
5. For the new table name, enter a new name for the restored table, for example
mytable_restored.
6. To define the point in time for the restore operation, you can choose between two options:
• Select the preconfigured Earliest time.
• Select Specify date and time and enter the date and time you want to restore the new table to.
Note
You can restore to any point in time between Earliest time and the current time.
Amazon Keyspaces restores your table data to the state based on the selected date
and time (day:hour:minute:second).
7. Choose Restore to start the restore process.
The table that is being restored is shown with the status Restoring. After the restore
process is finished, the status of the restored table changes to Active.
Cassandra Query Language (CQL)
Restore a table to a point in time using CQL
1. You can restore an active table to any point in time between earliest_restorable_timestamp and the current time. Current time is the default.
To confirm that point-in-time recovery is enabled for the table, query the system_schema_mcs.tables table as shown in this example.
SELECT custom_properties
FROM system_schema_mcs.tables
WHERE keyspace_name = 'mykeyspace' AND table_name = 'mytable';
Point-in-time recovery is enabled as shown in the following sample output.
custom_properties
-----------------
{
...,
"point_in_time_recovery": {
"earliest_restorable_timestamp":"2020-06-30T19:19:21.175Z"
"status":"enabled"
}
}
2. Restore the table to the current time. When you omit the WITH restore_timestamp = ... clause, the current timestamp is used.
RESTORE TABLE mykeyspace.mytable_restored
FROM TABLE mykeyspace.mytable;
You can also restore to a specific point in time, defined by a restore_timestamp
in ISO 8601 format. You can specify any point in time during the last
35 days. For example, the following command restores the table to the
EarliestRestorableDateTime.
RESTORE TABLE mykeyspace.mytable_restored
FROM TABLE mykeyspace.mytable
WITH restore_timestamp = '2020-06-30T19:19:21.175Z';
For a full syntax description, see the section called “RESTORE TABLE” in the language
reference.
3. To verify that the restore of the table was successful, query the
system_schema_mcs.tables to confirm the status of the table.
SELECT status
FROM system_schema_mcs.tables
WHERE keyspace_name = 'mykeyspace' AND table_name = 'mytable_restored'
The query shows the following output.
status
------
RESTORING
The table that is being restored is shown with the status Restoring. After the restore
process is finished, the status of the table changes to Active.
CLI
Restore a table to a point in time using the AWS CLI
1. Create a simple table named myTable that has PITR enabled. The command has been broken up into separate lines for readability.
aws keyspaces create-table --keyspace-name 'myKeyspace' --table-name 'myTable' \
    --schema-definition 'allColumns=[{name=id,type=int},{name=name,type=text},{name=date,type=timestamp}],partitionKeys=[{name=id}]' \
    --point-in-time-recovery 'status=ENABLED'
2. Confirm the properties of the new table and review the earliestRestorableTimestamp for PITR.
aws keyspaces get-table --keyspace-name 'myKeyspace' --table-name 'myTable'
The output of this command returns the following.
{
"keyspaceName": "myKeyspace",
"tableName": "myTable",
"resourceArn": "arn:aws:cassandra:us-east-1:111222333444:/keyspace/
myKeyspace/table/myTable",
"creationTimestamp": "2022-06-20T14:34:57.049000-07:00",
"status": "ACTIVE",
"schemaDefinition": {
"allColumns": [
{
"name": "id",
"type": "int"
},
{
"name": "date",
"type": "timestamp"
},
{
"name": "name",
"type": "text"
}
],
"partitionKeys": [
{
"name": "id"
}
],
"clusteringKeys": [],
"staticColumns": []
},
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": "2022-06-20T14:34:57.049000-07:00"
},
"encryptionSpecification": {
"type": "AWS_OWNED_KMS_KEY"
},
"pointInTimeRecovery": {
"status": "ENABLED",
"earliestRestorableTimestamp": "2022-06-20T14:35:13.693000-07:00"
},
"defaultTimeToLive": 0,
"comment": {
"message": ""
}
}
3. To restore a table to a point in time, specify a restore_timestamp in ISO 8601 format. You can choose any point in time during the last 35 days, at one-second intervals. For example, the following command restores the table to the EarliestRestorableDateTime.
aws keyspaces restore-table --source-keyspace-name 'myKeyspace' --source-table-name 'myTable' \
    --target-keyspace-name 'myKeyspace' --target-table-name 'myTable_restored' \
    --restore-timestamp "2022-06-20 21:35:14.693"
The output of this command returns the ARN of the restored table.
{
"restoredTableARN": "arn:aws:cassandra:us-east-1:111222333444:/keyspace/
myKeyspace/table/myTable_restored"
}
To restore the table to the current time, you can omit the restore-timestamp
parameter.
aws keyspaces restore-table --source-keyspace-name 'myKeyspace' --source-table-name 'myTable' \
    --target-keyspace-name 'myKeyspace' --target-table-name 'myTable_restored1'
Restore a deleted table using Amazon Keyspaces PITR
The following procedure shows how to restore a deleted table from backup to the time of deletion.
You can do this using CQL or the AWS CLI.
Note
This procedure assumes that PITR was enabled on the deleted table.
Cassandra Query Language (CQL)
Restore a deleted table using CQL
1. To confirm that point-in-time recovery is enabled for a deleted table, query the system
table. Only tables with point-in-time recovery enabled are shown.
SELECT custom_properties
FROM system_schema_mcs.tables_history
WHERE keyspace_name = 'mykeyspace' AND table_name = 'my_table';
The query shows the following output.
custom_properties
------------------
{
...,
"point_in_time_recovery":{
"restorable_until_time":"2020-08-04T00:48:58.381Z",
"status":"enabled"
}
}
2. Restore the table to the time of deletion with the following sample statement.
RESTORE TABLE mykeyspace.mytable_restored
FROM TABLE mykeyspace.mytable;
CLI
Restore a deleted table using the AWS CLI
1. Delete a table that you created previously that has PITR enabled. The following command
is an example.
aws keyspaces delete-table --keyspace-name 'myKeyspace' --table-name 'myTable'
2. Restore the deleted table to the time of deletion with the following command.
aws keyspaces restore-table --source-keyspace-name 'myKeyspace' --source-
table-name 'myTable' --target-keyspace-name 'myKeyspace' --target-table-name
'myTable_restored2'
The output of this command returns the ARN of the restored table.
{
"restoredTableARN": "arn:aws:cassandra:us-east-1:111222333444:/keyspace/
myKeyspace/table/myTable_restored2"
}
Expire data with Time to Live (TTL) for Amazon Keyspaces (for
Apache Cassandra)
Amazon Keyspaces (for Apache Cassandra) Time to Live (TTL) helps you simplify your application
logic and optimize the price of storage by expiring data from tables automatically. Data that you
no longer need is automatically deleted from your table based on the Time to Live value that you
set.
This makes it easier to comply with data retention policies based on business, industry, or
regulatory requirements that define how long data needs to be retained or specify when data must
be deleted.
For example, you can use TTL in an AdTech application to schedule when data for specific ads
expires and is no longer visible to clients. You can also use TTL to retire older data automatically
and save on your storage costs.
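As an illustration, here is a minimal sketch of such a scheduled-expiry write, assuming a
hypothetical my_keyspace.ads table that has TTL enabled and a seven-day retention window
(604,800 seconds).
INSERT INTO my_keyspace.ads (ad_id, campaign, creative)
VALUES (b79cb3ba-745e-5d9a-8903-4a02327a7e09, 'spring_sale', 'banner-v1')
USING TTL 604800;
After 604,800 seconds, the row expires and is no longer returned in query results.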
You can set a default TTL value for the entire table, and overwrite that value for individual rows
and columns. TTL operations don't impact your application's performance. Also, the number of
rows and columns marked to expire with TTL doesn't affect your table's availability.
Amazon Keyspaces automatically filters out expired data so that expired data isn't returned in
query results or available for use in data manipulation language (DML) statements. Amazon
Keyspaces typically deletes expired data from storage within 10 days of the expiration date.
In rare cases, Amazon Keyspaces may not be able to delete data within 10 days if there is sustained
activity on the underlying storage partition to protect availability. In these cases, Amazon
Keyspaces continues to attempt to delete the expired data once traffic on the partition decreases.
After the data is permanently deleted from storage, you stop incurring storage fees.
You can set, modify, or disable default TTL settings for new and existing tables by using the
console, Cassandra Query Language (CQL), or the AWS CLI.
On tables with default TTL configured, you can use CQL statements to override the default TTL
settings of the table and apply custom TTL values to rows and columns. For more information,
see the section called “Use INSERT to set custom TTL for new rows” and the section called “Use
UPDATE to set custom TTL for rows and columns”.
TTL pricing is based on the size of the rows that are deleted or updated by using Time to Live.
TTL operations are metered in units of TTL deletes. One TTL delete is consumed per KB of data
per row that is deleted or updated, rounded up to the next whole KB.
For example, updating a row that stores 2.5 KB of data and deleting one or more columns within
the row at the same time requires three TTL deletes. Deleting an entire row that contains 3.5
KB of data requires four TTL deletes. For more information about pricing, see Amazon Keyspaces
(for Apache Cassandra) pricing.
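Expressed as a formula, the metering above works out to the following.
TTL deletes consumed per row = ceiling(KB of data updated or deleted in the row)
ceiling(2.5 KB) = 3 TTL deletes
ceiling(3.5 KB) = 4 TTL deletes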
Topics
Amazon Keyspaces Time to Live and integration with AWS services
Create a new table with default Time to Live (TTL) settings
Update the default Time to Live (TTL) value of a table
Create table with custom Time to Live (TTL) settings enabled
Update table with custom Time to Live (TTL)
Use the INSERT statement to set custom Time to Live (TTL) values for new rows
Use the UPDATE statement to edit custom Time to Live (TTL) settings for rows and columns
Amazon Keyspaces Time to Live and integration with AWS services
The following TTL metric is available in Amazon CloudWatch to enable continuous monitoring.
TTLDeletes – The units consumed to delete or update data in a row by using Time to Live
(TTL).
For more information about how to monitor CloudWatch metrics, see the section called
“Monitoring with CloudWatch”.
When you use AWS CloudFormation, you can turn on TTL when creating an Amazon Keyspaces
table. For more information, see the AWS CloudFormation User Guide.
Create a new table with default Time to Live (TTL) settings
In Amazon Keyspaces, you can set a default TTL value for all rows in a table when the table is
created.
The default TTL value for a table is zero, which means that data doesn't expire automatically. If the
default TTL value for a table is greater than zero, an expiration timestamp is added to each row.
TTL values are set in seconds, and the maximum configurable value is 630,720,000 seconds, which
is the equivalent of 20 years.
After table creation, you can overwrite the table's default TTL setting for specific rows or columns
with CQL DML statements. For more information, see the section called “Use INSERT to set custom
TTL for new rows” and the section called “Use UPDATE to set custom TTL for rows and columns”.
When you enable TTL on a table, Amazon Keyspaces begins to store additional TTL-related
metadata for each row. In addition, TTL uses expiration timestamps to track when rows or columns
expire. The timestamps are stored as row metadata and contribute to the storage cost for the row.
After the TTL feature is enabled, you can't disable it for a table. Setting the table’s
default_time_to_live to 0 disables default expiration times for new data, but it doesn't
deactivate the TTL feature or revert the table back to the original Amazon Keyspaces storage
metadata or write behavior.
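For example, the following statement stops new data from expiring by default while the TTL
feature itself stays enabled.
ALTER TABLE my_table WITH default_time_to_live = 0;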
The following examples show how to create a new table with a default TTL value.
Console
Create a new table with a Time to Live default value using the console.
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables, and then choose Create table.
3. On the Create table page in the Table details section, select a keyspace and provide a
name for the new table.
4. In the Schema section, create the schema for your table.
5. In the Table settings section, choose Customize settings.
6. Continue to Time to Live (TTL).
In this step, you select the default TTL settings for the table.
For the Default TTL period, enter the expiration time and choose the unit of time you
entered, for example seconds, days, or years. Amazon Keyspaces will store the value in
seconds.
7. Choose Create table. Your table is created with the specified default TTL value.
Cassandra Query Language (CQL)
Create a new table with a default TTL value using CQL
1. The following statement creates a new table with the default TTL value set to 3,024,000
seconds, which represents 35 days.
CREATE TABLE my_table (
userid uuid,
time timeuuid,
subject text,
body text,
user inet,
PRIMARY KEY (userid, time)
) WITH default_time_to_live = 3024000;
2.
To confirm the TTL settings for the new table, use the cqlsh DESCRIBE statement as
shown in the following example. The output shows the default TTL setting for the table as
default_time_to_live.
DESC TABLE my_table;
CREATE TABLE my_keyspace.my_table (
userid uuid,
time timeuuid,
body text,
subject text,
user inet,
PRIMARY KEY (userid, time)
) WITH CLUSTERING ORDER BY (time ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'class': 'com.amazonaws.cassandra.DefaultCaching'}
AND comment = ''
AND compaction = {'class': 'com.amazonaws.cassandra.DefaultCompaction'}
AND compression = {'class': 'com.amazonaws.cassandra.DefaultCompression'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 3024000
AND gc_grace_seconds = 7776000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 3600000
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CLI
Create a new table with a default TTL value using the AWS CLI
1. You can use the following command to create a new table with the default TTL value set to
one year.
aws keyspaces create-table --keyspace-name 'myKeyspace' --table-name 'myTable' \
--schema-definition 'allColumns=[{name=id,type=int},
{name=name,type=text},{name=date,type=timestamp}],partitionKeys=[{name=id}]' \
--default-time-to-live '31536000'
2. To confirm the TTL status of the table, you can use the following command.
aws keyspaces get-table --keyspace-name 'myKeyspace' --table-name 'myTable'
The output of the command looks like the following example.
{
"keyspaceName": "myKeyspace",
"tableName": "myTable",
"resourceArn": "arn:aws:cassandra:us-east-1:123SAMPLE012:/keyspace/
myKeyspace/table/myTable",
"creationTimestamp": "2024-09-02T10:52:22.190000+00:00",
"status": "ACTIVE",
"schemaDefinition": {
"allColumns": [
{
"name": "id",
"type": "int"
},
{
"name": "date",
"type": "timestamp"
},
{
"name": "name",
"type": "text"
}
],
"partitionKeys": [
{
"name": "id"
}
],
"clusteringKeys": [],
"staticColumns": []
},
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": "2024-09-02T10:52:22.190000+00:00"
},
"encryptionSpecification": {
"type": "AWS_OWNED_KMS_KEY"
},
"pointInTimeRecovery": {
"status": "DISABLED"
},
"ttl": {
"status": "ENABLED"
},
"defaultTimeToLive": 31536000,
"comment": {
"message": ""
},
"replicaSpecifications": []
}
Update the default Time to Live (TTL) value of a table
You can update an existing table with a new default TTL value. TTL values are set in seconds, and
the maximum configurable value is 630,720,000 seconds, which is the equivalent of 20 years.
When you enable TTL on a table, Amazon Keyspaces begins to store additional TTL-related
metadata for each row. In addition, TTL uses expiration timestamps to track when rows or columns
expire. The timestamps are stored as row metadata and contribute to the storage cost for the row.
After TTL has been enabled for a table, you can overwrite the table's default TTL setting for
specific rows or columns with CQL DML statements. For more information, see the section called
“Use INSERT to set custom TTL for new rows” and the section called “Use UPDATE to set custom
TTL for rows and columns”.
After the TTL feature is enabled, you can't disable it for a table. Setting the table’s
default_time_to_live to 0 disables default expiration times for new data, but it doesn't
deactivate the TTL feature or revert the table back to the original Amazon Keyspaces storage
metadata or write behavior.
Follow these steps to update default Time to Live settings for existing tables using the console,
CQL, or the AWS CLI.
Console
Update the default TTL value of a table using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. Choose the table that you want to update, and then choose the Additional settings tab.
3. Continue to Time to Live (TTL) and choose Edit.
4. For the Default TTL period, enter the expiration time and choose the unit of time, for
example seconds, days, or years. Amazon Keyspaces will store the value in seconds. This
doesn't change the TTL value of existing rows.
5. When the TTL settings are defined, choose Save changes.
Cassandra Query Language (CQL)
Update the default TTL value of a table using CQL
1.
You can use ALTER TABLE to edit default Time to Live (TTL) settings of a table. To update
the default TTL settings of the table to 2,592,000 seconds, which represents 30 days, you
can use the following statement.
ALTER TABLE my_table WITH default_time_to_live = 2592000;
2.
To confirm the TTL settings for the updated table, use the cqlsh DESCRIBE statement as
shown in the following example. The output shows the default TTL setting for the table as
default_time_to_live.
DESC TABLE my_table;
The output of the statement should look similar to this example.
CREATE TABLE my_keyspace.my_table (
id int PRIMARY KEY,
date timestamp,
name text
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'class': 'com.amazonaws.cassandra.DefaultCaching'}
AND comment = ''
AND compaction = {'class': 'com.amazonaws.cassandra.DefaultCompaction'}
AND compression = {'class': 'com.amazonaws.cassandra.DefaultCompression'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 2592000
AND gc_grace_seconds = 7776000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 3600000
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CLI
Update the default TTL value of a table using the AWS CLI
1.
You can use update-table to edit the default TTL value of a table. To update the default
TTL settings of the table to 2,592,000 seconds, which represents 30 days, you can use the
following statement.
aws keyspaces update-table --keyspace-name 'myKeyspace' --table-name 'myTable'
--default-time-to-live '2592000'
2. To confirm the updated default TTL value, you can use the following statement.
aws keyspaces get-table --keyspace-name 'myKeyspace' --table-name 'myTable'
The output of the statement should look like the following example.
{
"keyspaceName": "myKeyspace",
"tableName": "myTable",
"resourceArn": "arn:aws:cassandra:us-east-1:123SAMPLE012:/keyspace/
myKeyspace/table/myTable",
"creationTimestamp": "2024-09-02T10:52:22.190000+00:00",
"status": "ACTIVE",
"schemaDefinition": {
"allColumns": [
{
"name": "id",
"type": "int"
},
{
"name": "date",
"type": "timestamp"
},
{
"name": "name",
"type": "text"
}
],
"partitionKeys": [
{
"name": "id"
}
],
"clusteringKeys": [],
"staticColumns": []
},
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": "2024-09-02T10:52:22.190000+00:00"
},
"encryptionSpecification": {
"type": "AWS_OWNED_KMS_KEY"
},
"pointInTimeRecovery": {
"status": "DISABLED"
},
"ttl": {
"status": "ENABLED"
},
"defaultTimeToLive": 2592000,
"comment": {
"message": ""
},
"replicaSpecifications": []
}
Create table with custom Time to Live (TTL) settings enabled
To create a new table with Time to Live custom settings that can be applied to rows and columns
without enabling TTL default settings for the entire table, you can use the following commands.
Note
If a table is created with custom TTL settings enabled, you can't disable the setting later.
Cassandra Query Language (CQL)
Create a new table with custom TTL setting using CQL
CREATE TABLE my_keyspace.my_table (id int primary key) WITH
CUSTOM_PROPERTIES={'ttl':{'status': 'enabled'}};
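To confirm the custom TTL setting with CQL, you can review the table's custom_properties in
the system table, as shown for other custom properties in this guide. The following is a sketch,
assuming that the ttl status is surfaced in custom_properties.
SELECT custom_properties
FROM system_schema_mcs.tables
WHERE keyspace_name = 'my_keyspace' AND table_name = 'my_table';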
CLI
Create a new table with custom TTL setting using the AWS CLI
1. You can use the following command to create a new table with TTL enabled.
aws keyspaces create-table --keyspace-name 'myKeyspace' --table-name 'myTable' \
--schema-definition
'allColumns=[{name=id,type=int},{name=name,type=text},
{name=date,type=timestamp}],partitionKeys=[{name=id}]' \
--ttl 'status=ENABLED'
2. To confirm that TTL is enabled for the table, you can use the following statement.
aws keyspaces get-table --keyspace-name 'myKeyspace' --table-name 'myTable'
The output of the statement should look like the following example.
{
"keyspaceName": "myKeyspace",
"tableName": "myTable",
"resourceArn": "arn:aws:cassandra:us-east-1:123SAMPLE012:/keyspace/
myKeyspace/table/myTable",
"creationTimestamp": "2024-09-02T10:52:22.190000+00:00",
"status": "ACTIVE",
"schemaDefinition": {
"allColumns": [
{
"name": "id",
"type": "int"
},
{
"name": "date",
"type": "timestamp"
},
{
"name": "name",
"type": "text"
}
],
"partitionKeys": [
{
"name": "id"
}
],
"clusteringKeys": [],
"staticColumns": []
},
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": "2024-09-02T11:18:55.796000+00:00"
},
"encryptionSpecification": {
"type": "AWS_OWNED_KMS_KEY"
},
"pointInTimeRecovery": {
"status": "DISABLED"
},
"ttl": {
"status": "ENABLED"
},
"defaultTimeToLive": 0,
"comment": {
"message": ""
},
"replicaSpecifications": []
}
Update table with custom Time to Live (TTL)
To enable Time to Live custom settings for a table so that TTL values can be applied to individual
rows and columns without setting a TTL default value for the entire table, you can use the
following commands.
Note
After ttl is enabled, you can't disable it for the table.
Cassandra Query Language (CQL)
Enable custom TTL settings for a table using CQL
ALTER TABLE my_table WITH CUSTOM_PROPERTIES={'ttl':{'status': 'enabled'}};
CLI
Enable custom TTL settings for a table using the AWS CLI
1. You can use the following command to update the custom TTL setting of a table.
aws keyspaces update-table --keyspace-name 'myKeyspace' --table-name 'myTable'
--ttl 'status=ENABLED'
2. To confirm that TTL is now enabled for the table, you can use the following statement.
aws keyspaces get-table --keyspace-name 'myKeyspace' --table-name 'myTable'
The output of the statement should look like the following example.
{
"keyspaceName": "myKeyspace",
"tableName": "myTable",
"resourceArn": "arn:aws:cassandra:us-east-1:123SAMPLE012:/keyspace/
myKeyspace/table/myTable",
"creationTimestamp": "2024-09-02T11:32:27.349000+00:00",
"status": "ACTIVE",
"schemaDefinition": {
"allColumns": [
{
"name": "id",
"type": "int"
},
{
"name": "date",
"type": "timestamp"
},
{
"name": "name",
"type": "text"
}
],
"partitionKeys": [
{
"name": "id"
}
],
"clusteringKeys": [],
"staticColumns": []
},
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": "2024-09-02T11:32:27.349000+00:00"
},
"encryptionSpecification": {
"type": "AWS_OWNED_KMS_KEY"
},
"pointInTimeRecovery": {
"status": "DISABLED"
},
"ttl": {
"status": "ENABLED"
},
"defaultTimeToLive": 0,
"comment": {
"message": ""
},
"replicaSpecifications": []
}
Use the INSERT statement to set custom Time to Live (TTL) values for
new rows
Note
Before you can set custom TTL values for rows using the INSERT statement, you must first
enable custom TTL on the table. For more information, see the section called “Update table
custom TTL”.
To overwrite a table's default TTL value by setting expiration dates for individual rows, you can use
the INSERT statement:
INSERT – Insert a new row of data with a TTL value set.
Setting TTL values for new rows using the INSERT statement takes precedence over the default
TTL setting of the table.
The following CQL statement inserts a row of data into the table and changes the default TTL
setting to 259,200 seconds (which is equivalent to 3 days).
INSERT INTO my_table (userid, time, subject, body, user)
VALUES (B79CB3BA-745E-5D9A-8903-4A02327A7E09, 96a29100-5e25-11ec-90d7-
b5d91eceda0a, 'Message', 'Hello','205.212.123.123')
USING TTL 259200;
To confirm the TTL settings for the inserted row, use the following statement.
SELECT TTL (subject) from my_table;
Use the UPDATE statement to edit custom Time to Live (TTL) settings
for rows and columns
Note
Before you can set custom TTL values for rows and columns, you must enable TTL on the
table first. For more information, see the section called “Update table custom TTL”.
You can use the UPDATE statement to overwrite a table's default TTL value by setting the
expiration date for individual rows and columns:
Rows – You can update an existing row of data with a custom TTL value.
Columns – You can update a subset of columns within existing rows with a custom TTL value.
Setting TTL values for rows and columns takes precedence over the default TTL setting for the
table.
To change the TTL settings of the 'subject' column inserted earlier from 259,200 seconds (3 days)
to 86,400 seconds (one day), use the following statement.
UPDATE my_table USING TTL 86400 set subject = 'Updated Message' WHERE userid =
B79CB3BA-745E-5D9A-8903-4A02327A7E09 and time = 96a29100-5e25-11ec-90d7-b5d91eceda0a;
You can run a simple select query to see the updated record before the expiration time.
SELECT * from my_table;
The query shows the following output.
 userid                               | time                                 | body  | subject         | user
--------------------------------------+--------------------------------------+-------+-----------------+-----------------
 b79cb3ba-745e-5d9a-8903-4a02327a7e09 | 96a29100-5e25-11ec-90d7-b5d91eceda0a | Hello | Updated Message | 205.212.123.123
 50554d6e-29bb-11e5-b345-feff819cdc9f | cf03fb21-59b5-11ec-b371-dff626ab9620 | Hello |         Message | 205.212.123.123
To confirm that the expiration was successful, run the same query again after the configured
expiration time.
SELECT * from my_table;
The query shows the following output after the 'subject' column has expired.
 userid                               | time                                 | body  | subject | user
--------------------------------------+--------------------------------------+-------+---------+-----------------
 b79cb3ba-745e-5d9a-8903-4a02327a7e09 | 96a29100-5e25-11ec-90d7-b5d91eceda0a | Hello |    null | 205.212.123.123
 50554d6e-29bb-11e5-b345-feff819cdc9f | cf03fb21-59b5-11ec-b371-dff626ab9620 | Hello | Message | 205.212.123.123
Client-side timestamps in Amazon Keyspaces
In Amazon Keyspaces, client-side timestamps are Cassandra-compatible timestamps that are
persisted for each cell in your table. You can use client-side timestamps for conflict resolution
by letting your client applications determine the order of writes. For example, when clients of a
globally distributed application make updates to the same data, client-side timestamps persist the
order in which the updates were made on the clients. Amazon Keyspaces uses these timestamps to
process the writes.
Amazon Keyspaces client-side timestamps are fully managed. You don’t have to manage low-level
system settings such as clean-up and compaction strategies.
When you delete data, the rows are marked for deletion with a tombstone. Amazon Keyspaces
removes tombstoned data automatically (typically within 10 days) without impacting your
application performance or availability. Tombstoned data isn't available for data manipulation
language (DML) statements. As you continue to perform reads and writes on rows that contain
tombstoned data, the tombstoned data continues to count towards storage, read capacity units
(RCUs), and write capacity units (WCUs) until it's deleted from storage.
After client-side timestamps have been turned on for a table, you can specify a timestamp with
the USING TIMESTAMP clause in your Data Manipulation Language (DML) CQL query. For more
information, see the section called “Use client-side timestamps in queries”. If you do not specify a
timestamp in your CQL query, Amazon Keyspaces uses the timestamp passed by your client driver.
If the client driver doesn’t supply timestamps, Amazon Keyspaces assigns a cell-level timestamp
automatically, because timestamps can't be NULL. To query for timestamps, you can use the
WRITETIME function in your DML statement.
Amazon Keyspaces doesn't charge extra to turn on client-side timestamps. However, with client-
side timestamps you store and write additional data for each value in your row. This can lead to
additional storage usage and in some cases additional throughput usage. For more information
about Amazon Keyspaces pricing, see Amazon Keyspaces (for Apache Cassandra) pricing.
When client-side timestamps are turned on in Amazon Keyspaces, every column of every row
stores a timestamp. These timestamps take up approximately 20–40 bytes (depending on your
data), and contribute to the storage and throughput cost for the row. These metadata bytes also
count towards your 1-MB row size quota. To determine the overall increase in storage space (to
ensure that the row size stays under 1 MB), consider the number of columns in your table and
the number of collection elements in each row. For example, if a table has 20 columns, with each
column storing 40 bytes of data, the size of the row increases from 800 bytes to 1200 bytes. For
more information on how to estimate the size of a row, see the section called “Estimate row size”.
In addition to the extra 400 bytes for storage, in this example, the number of write capacity units
(WCUs) consumed per write increases from 1 WCU to 2 WCUs. For more information on how to
calculate read and write capacity, see the section called “Configure read/write capacity modes”.
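Making the arithmetic in this example explicit, and assuming 20 bytes of timestamp metadata per
column (the low end of the range above):
20 columns × 40 bytes of data = 800 bytes
20 columns × 20 bytes of timestamp metadata = 400 bytes
total row size = 800 + 400 = 1,200 bytes, so each write consumes 2 WCUs instead of 1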
After client-side timestamps have been turned on for a table, you can't turn them off.
To learn more about how to use client-side timestamps in queries, see the section called “Use
client-side timestamps in queries”.
Topics
How Amazon Keyspaces client-side timestamps integrate with AWS services
Create a new table with client-side timestamps in Amazon Keyspaces
Configure client-side timestamps for a table in Amazon Keyspaces
Use client-side timestamps in queries in Amazon Keyspaces
How Amazon Keyspaces client-side timestamps integrate with AWS
services
The following client-side timestamps metric is available in Amazon CloudWatch to enable
continuous monitoring.
SystemReconciliationDeletes – The number of delete operations required to remove
tombstoned data.
For more information about how to monitor CloudWatch metrics, see the section called
“Monitoring with CloudWatch”.
When you use AWS CloudFormation, you can enable client-side timestamps when creating an
Amazon Keyspaces table. For more information, see the AWS CloudFormation User Guide.
Create a new table with client-side timestamps in Amazon Keyspaces
Follow these examples to create a new Amazon Keyspaces table with client-side timestamps
enabled using the Amazon Keyspaces AWS Management Console, Cassandra Query Language
(CQL), or the AWS Command Line Interface (AWS CLI).
Console
Create a new table with client-side timestamps (console)
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables, and then choose Create table.
3. On the Create table page in the Table details section, select a keyspace and provide a
name for the new table.
4. In the Schema section, create the schema for your table.
5. In the Table settings section, choose Customize settings.
6. Continue to Client-side timestamps.
Choose Turn on client-side timestamps to turn on client-side timestamps for the table.
7. Choose Create table. Your table is created with client-side timestamps turned on.
Cassandra Query Language (CQL)
Create a new table using CQL
1. To create a new table with client-side timestamps enabled using CQL, you can use the
following example.
CREATE TABLE my_keyspace.my_table (
userid uuid,
time timeuuid,
subject text,
body text,
user inet,
PRIMARY KEY (userid, time)
) WITH CUSTOM_PROPERTIES = {'client_side_timestamps': {'status': 'enabled'}};
2.
To confirm the client-side timestamps settings for the new table, use a SELECT statement
to review the custom_properties as shown in the following example.
SELECT custom_properties from system_schema_mcs.tables where keyspace_name =
'my_keyspace' and table_name = 'my_table';
The output of this statement shows the status for client-side timestamps.
'client_side_timestamps': {'status': 'enabled'}
AWS CLI
Create a new table using the AWS CLI
1. To create a new table with client-side timestamps enabled, you can use the following
example.
aws keyspaces create-table \
--keyspace-name my_keyspace \
--table-name my_table \
--client-side-timestamps 'status=ENABLED' \
--schema-definition 'allColumns=[{name=id,type=int},{name=date,type=timestamp},
{name=name,type=text}],partitionKeys=[{name=id}]'
2. To confirm that client-side timestamps are turned on for the new table, run the following
code.
aws keyspaces get-table \
--keyspace-name my_keyspace \
--table-name my_table
The output should look similar to this example.
{
"keyspaceName": "my_keyspace",
"tableName": "my_table",
"resourceArn": "arn:aws:cassandra:us-east-2:555555555555:/keyspace/
my_keyspace/table/my_table",
"creationTimestamp": 1662681206.032,
"status": "ACTIVE",
"schemaDefinition": {
"allColumns": [
{
"name": "id",
"type": "int"
},
{
"name": "date",
"type": "timestamp"
},
{
"name": "name",
"type": "text"
}
],
"partitionKeys": [
{
"name": "id"
}
],
"clusteringKeys": [],
"staticColumns": []
},
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": 1662681206.032
},
"encryptionSpecification": {
"type": "AWS_OWNED_KMS_KEY"
},
"pointInTimeRecovery": {
"status": "DISABLED"
},
"clientSideTimestamps": {
"status": "ENABLED"
},
"ttl": {
"status": "ENABLED"
},
"defaultTimeToLive": 0,
"comment": {
"message": ""
}
}
Configure client-side timestamps for a table in Amazon Keyspaces
Follow these examples to turn on client-side timestamps for existing tables using the Amazon
Keyspaces AWS Management Console, Cassandra Query Language (CQL), or the AWS Command
Line Interface.
Console
To turn on client-side timestamps for an existing table (console)
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. Choose the table that you want to update, and then choose the Additional settings tab.
3. On the Additional settings tab, go to Modify client-side timestamps and select Turn on
client-side timestamps.
4. Choose Save changes to change the settings of the table.
Cassandra Query Language (CQL)
Using a CQL statement
1.
Turn on client-side timestamps for an existing table with the ALTER TABLE CQL
statement.
ALTER TABLE my_table WITH custom_properties = {'client_side_timestamps':
{'status': 'enabled'}};
2.
To confirm the client-side timestamps settings for the updated table, use a SELECT statement
to review the custom_properties as shown in the following example.
SELECT custom_properties from system_schema_mcs.tables where keyspace_name =
'my_keyspace' and table_name = 'my_table';
The output of this statement shows the status for client-side timestamps.
'client_side_timestamps': {'status': 'enabled'}
AWS CLI
Using the AWS CLI
1. You can turn on client-side timestamps for an existing table using the AWS CLI using the
following example.
aws keyspaces update-table \
--keyspace-name my_keyspace \
--table-name my_table \
--client-side-timestamps 'status=ENABLED'
2. To confirm that client-side timestamps are turned on for the table, run the following code.
aws keyspaces get-table \
--keyspace-name my_keyspace \
--table-name my_table
The output should look similar to this example and state the status for client-side
timestamps as ENABLED.
{
"keyspaceName": "my_keyspace",
"tableName": "my_table",
"resourceArn": "arn:aws:cassandra:us-east-2:555555555555:/keyspace/
my_keyspace/table/my_table",
"creationTimestamp": 1662681312.906,
"status": "ACTIVE",
"schemaDefinition": {
"allColumns": [
{
"name": "id",
"type": "int"
},
{
"name": "date",
"type": "timestamp"
},
{
"name": "name",
"type": "text"
}
],
"partitionKeys": [
{
"name": "id"
}
],
"clusteringKeys": [],
"staticColumns": []
},
"capacitySpecification": {
"throughputMode": "PAY_PER_REQUEST",
"lastUpdateToPayPerRequestTimestamp": 1662681312.906
},
"encryptionSpecification": {
"type": "AWS_OWNED_KMS_KEY"
},
"pointInTimeRecovery": {
"status": "DISABLED"
},
"clientSideTimestamps": {
"status": "ENABLED"
},
"ttl": {
"status": "ENABLED"
},
"defaultTimeToLive": 0,
"comment": {
"message": ""
}
}
Use client-side timestamps in queries in Amazon Keyspaces
After you have turned on client-side timestamps, you can pass the timestamp in your INSERT,
UPDATE, and DELETE statements with the USING TIMESTAMP clause.
The timestamp value is a bigint representing the number of microseconds since the standard base
time known as the epoch: January 1, 1970 at 00:00:00 GMT. A timestamp that is supplied by the
client has to fall within the range of 2 days in the past to 5 minutes in the future of the
current wall clock time.
Amazon Keyspaces keeps timestamp metadata for the life of the data. You can use the WRITETIME
function to look up timestamps that occurred years in the past. For more information about CQL
syntax, see the section called “DML statements”.
The following CQL statement is an example of how to use a timestamp as an update_parameter.
INSERT INTO catalog.book_awards (year, award, rank, category, book_title, author,
publisher)
VALUES (2022, 'Wolf', 4, 'Non-Fiction', 'Science Update', 'Ana Carolina Silva',
'SomePublisher')
USING TIMESTAMP 1669069624000000;
If you do not specify a timestamp in your CQL query, Amazon Keyspaces uses the timestamp
passed by your client driver. If no timestamp is supplied by the client driver, Amazon Keyspaces
assigns a server-side timestamp for your write operation.
To see the timestamp value that is stored for a specific column, you can use the WRITETIME
function in a SELECT statement as shown in the following example.
SELECT year, award, rank, category, book_title, author, publisher, WRITETIME(year),
WRITETIME(award), WRITETIME(rank),
WRITETIME(category), WRITETIME(book_title), WRITETIME(author), WRITETIME(publisher)
from catalog.book_awards;
Working with CQL queries in Amazon Keyspaces
This section gives an introduction into working with queries in Amazon Keyspaces (for Apache
Cassandra). The CQL statements available to query, transform, and manage data are SELECT,
INSERT, UPDATE, and DELETE. The following topics outline some of the more complex options
available when working with queries. For the complete language syntax with examples, see the
section called “DML statements”.
Topics
Use the IN operator with the SELECT statement in a query in Amazon Keyspaces
Order results with ORDER BY in Amazon Keyspaces
Paginate results in Amazon Keyspaces
Use the IN operator with the SELECT statement in a query in Amazon
Keyspaces
SELECT IN
You can query data from tables using the SELECT statement, which reads one or more columns
for one or more rows in a table and returns a result-set containing the rows matching the request.
A SELECT statement contains a select_clause that determines which columns to read and to
return in the result-set. The clause can contain instructions to transform the data before returning
it. The optional WHERE clause specifies which rows must be queried and is composed of relations
on the columns that are part of the primary key. Amazon Keyspaces supports the IN keyword in
the WHERE clause. This section uses examples to show how Amazon Keyspaces processes SELECT
statements with the IN keyword.
This example demonstrates how Amazon Keyspaces breaks down a SELECT statement
with the IN keyword into subqueries. In this example, we use a table with the name
my_keyspace.customers. The table has one partition key column department_id, two
clustering columns sales_region_id and sales_representative_id, and one column that
contains the name of the customer in the customer_name column.
SELECT * FROM my_keyspace.customers;
department_id | sales_region_id | sales_representative_id | customer_name
---------------+-----------------+-------------------------+--------------
0 | 0 | 0 | a
0 | 0 | 1 | b
0 | 1 | 0 | c
0 | 1 | 1 | d
1 | 0 | 0 | e
1 | 0 | 1 | f
1 | 1 | 0 | g
1 | 1 | 1 | h
Using this table, you can run the following SELECT statement to find the customers in the
departments and sales regions that you are interested in with the IN keyword in the WHERE clause.
The following statement is an example of this.
SELECT * FROM my_keyspace.customers WHERE department_id IN (0, 1) AND sales_region_id
IN (0, 1);
Amazon Keyspaces divides this statement into four subqueries as shown in the following output.
SELECT * FROM my_keyspace.customers WHERE department_id = 0 AND sales_region_id = 0;
department_id | sales_region_id | sales_representative_id | customer_name
---------------+-----------------+-------------------------+--------------
0 | 0 | 0 | a
0 | 0 | 1 | b
SELECT * FROM my_keyspace.customers WHERE department_id = 0 AND sales_region_id = 1;
department_id | sales_region_id | sales_representative_id | customer_name
---------------+-----------------+-------------------------+--------------
0 | 1 | 0 | c
0 | 1 | 1 | d
SELECT * FROM my_keyspace.customers WHERE department_id = 1 AND sales_region_id = 0;
department_id | sales_region_id | sales_representative_id | customer_name
---------------+-----------------+-------------------------+--------------
1 | 0 | 0 | e
1 | 0 | 1 | f
SELECT * FROM my_keyspace.customers WHERE department_id = 1 AND sales_region_id = 1;
department_id | sales_region_id | sales_representative_id | customer_name
---------------+-----------------+-------------------------+--------------
1 | 1 | 0 | g
1 | 1 | 1 | h
When the IN keyword is used, Amazon Keyspaces automatically paginates the results in any of the
following cases:
After every 10th subquery is processed.
After processing 1 MB of logical IO.
If you configured a PAGE SIZE, Amazon Keyspaces paginates after reading the number of
rows specified in the PAGE SIZE.
When you use the LIMIT keyword to reduce the number of rows returned, Amazon Keyspaces
paginates after reading the number of rows specified in the LIMIT.
The following table is used to illustrate this with an example.
For more information about pagination, see the section called “Paginate results”.
SELECT * FROM my_keyspace.customers;
department_id | sales_region_id | sales_representative_id | customer_name
---------------+-----------------+-------------------------+--------------
2 | 0 | 0 | g
2 | 1 | 1 | h
2 | 2 | 2 | i
0 | 0 | 0 | a
0 | 1 | 1 | b
0 | 2 | 2 | c
1 | 0 | 0 | d
1 | 1 | 1 | e
1 | 2 | 2 | f
3 | 0 | 0 | j
3 | 1 | 1 | k
3 | 2 | 2 | l
You can run the following statement on this table to see how pagination works.
SELECT * FROM my_keyspace.customers WHERE department_id IN (0, 1, 2, 3) AND
sales_region_id IN (0, 1, 2) AND sales_representative_id IN (0, 1);
Amazon Keyspaces processes this statement as 24 subqueries, because the cardinality of the
Cartesian product of all the IN terms contained in this query is 4 × 3 × 2 = 24.
department_id | sales_region_id | sales_representative_id | customer_name
---------------+-----------------+-------------------------+--------------
0 | 0 | 0 | a
0 | 1 | 1 | b
1 | 0 | 0 | d
1 | 1 | 1 | e
---MORE---
department_id | sales_region_id | sales_representative_id | customer_name
---------------+-----------------+-------------------------+--------------
2 | 0 | 0 | g
2 | 1 | 1 | h
3 | 0 | 0 | j
---MORE---
department_id | sales_region_id | sales_representative_id | customer_name
---------------+-----------------+-------------------------+--------------
3 | 1 | 1 | k
This example shows how you can use the ORDER BY clause in a SELECT statement with the IN
keyword.
SELECT * FROM my_keyspace.customers WHERE department_id IN (3, 2, 1) ORDER BY
sales_region_id DESC;
department_id | sales_region_id | sales_representative_id | customer_name
---------------+-----------------+-------------------------+--------------
3 | 2 | 2 | l
3 | 1 | 1 | k
3 | 0 | 0 | j
2 | 2 | 2 | i
2 | 1 | 1 | h
2 | 0 | 0 | g
1 | 2 | 2 | f
1 | 1 | 1 | e
1 | 0 | 0 | d
Subqueries are processed in the order in which the partition key and clustering key columns are
presented in the query. In the example below, subqueries for partition key value "2" are processed
first, followed by subqueries for partition key values "3" and "1". Results of a given subquery are
ordered according to the query's ordering clause, if present, or the table's clustering order defined
during table creation.
SELECT * FROM my_keyspace.customers WHERE department_id IN (2, 3, 1) ORDER BY
sales_region_id DESC;
department_id | sales_region_id | sales_representative_id | customer_name
---------------+-----------------+-------------------------+--------------
2 | 2 | 2 | i
2 | 1 | 1 | h
2 | 0 | 0 | g
3 | 2 | 2 | l
3 | 1 | 1 | k
3 | 0 | 0 | j
1 | 2 | 2 | f
1 | 1 | 1 | e
1 | 0 | 0 | d
Order results with ORDER BY in Amazon Keyspaces
The ORDER BY clause specifies the sort order of the results returned in a SELECT statement. The
statement takes a list of column names as arguments and for each column you can specify the
sort order for the data. You can only specify clustering columns in ordering clauses; non-clustering
columns are not allowed.
The two available sort order options for the returned results are ASC for ascending and DESC for
descending sort order.
SELECT * FROM my_keyspace.my_table ORDER BY (col1 ASC, col2 DESC, col3 ASC);
col1 | col2 | col3
------+------+------
0 | 6 | a
1 | 5 | b
2 | 4 | c
3 | 3 | d
4 | 2 | e
5 | 1 | f
6 | 0 | g
SELECT * FROM my_keyspace.my_table ORDER BY (col1 DESC, col2 ASC, col3 DESC);
col1 | col2 | col3
------+------+------
6 | 0 | g
5 | 1 | f
4 | 2 | e
3 | 3 | d
2 | 4 | c
1 | 5 | b
0 | 6 | a
If you don't specify the sort order in the query statement, the default ordering of the clustering
column is used.
The possible sort orders you can use in an ordering clause depend on the sort order assigned to
each clustering column at table creation. Query results can only be sorted in the order defined
for all clustering columns at table creation or the inverse of the defined sort order. Other possible
combinations are not allowed.
For example, if the table's CLUSTERING ORDER is (col1 ASC, col2 DESC, col3 ASC), then the valid
parameters for ORDER BY are either (col1 ASC, col2 DESC, col3 ASC) or (col1 DESC, col2 ASC, col3
DESC). For more information on CLUSTERING ORDER, see table_options under the section
called “CREATE TABLE”.
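The following sketch illustrates this rule with a hypothetical table that uses the clustering order
from the example above. Only the two ORDER BY variants shown in the comments are valid for this
table.
CREATE TABLE my_keyspace.example (
    pk int,
    col1 int,
    col2 int,
    col3 text,
    PRIMARY KEY (pk, col1, col2, col3)
) WITH CLUSTERING ORDER BY (col1 ASC, col2 DESC, col3 ASC);
-- valid: ORDER BY (col1 ASC, col2 DESC, col3 ASC)
-- valid: ORDER BY (col1 DESC, col2 ASC, col3 DESC)
-- not allowed: ORDER BY (col1 ASC, col2 ASC, col3 ASC)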
Paginate results in Amazon Keyspaces
Amazon Keyspaces automatically paginates the results from SELECT statements when the data
read to process the SELECT statement exceeds 1 MB. With pagination, the SELECT statement
results are divided into "pages" of data that are 1 MB in size (or less). An application can process the
first page of results, then the second page, and so on. Clients should always check for pagination
tokens when processing SELECT queries that return multiple rows.
If a client supplies a PAGE SIZE that requires reading more than 1 MB of data, Amazon Keyspaces
breaks up the results automatically into multiple pages based on the 1 MB data-read increments.
For example, if the average size of a row is 100 KB and you specify a PAGE SIZE of 20, Amazon
Keyspaces paginates data automatically after it reads 10 rows (1000 KB of data read).
Because Amazon Keyspaces paginates results based on the number of rows that it reads to process
a request and not the number of rows returned in the result set, some pages may not contain any
rows if you are running filtered queries.
For example, if you set the PAGE SIZE to 10 and Amazon Keyspaces evaluates 30 rows to process
your SELECT query, Amazon Keyspaces returns three pages. If only a subset of the rows matched
your query, some pages may contain fewer than 10 rows. For an example of how the PAGE SIZE of
LIMIT queries can affect read capacity, see the section called “Estimate the read capacity
consumption of limit queries”.
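For example, in cqlsh you can set the page size with the PAGING command before running a query.
The following is a minimal sketch that reuses the customers table from the previous section.
PAGING 20
SELECT * FROM my_keyspace.customers;
cqlsh then prompts with ---MORE--- between pages, as shown in the IN keyword examples earlier in
this chapter.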
For a comparison with Apache Cassandra pagination, see the section called “Pagination”.
Working with partitioners in Amazon Keyspaces
In Apache Cassandra, partitioners control which nodes data is stored on in the cluster. Partitioners
create a numeric token using a hashed value of the partition key. Cassandra uses this token to
distribute data across nodes. Clients can also use these tokens in SELECT operations and WHERE
clauses to optimize read and write operations. For example, clients can efficiently perform parallel
queries on large tables by specifying distinct token ranges to query in each parallel job.
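For example, a client performing a parallel full-table scan could issue one query per token range.
The following is a sketch, assuming a table my_keyspace.my_table with partition key id and the
Murmur3 token range of -2^63 to 2^63-1; together, the two statements cover the full range and can
run in separate parallel jobs.
SELECT * FROM my_keyspace.my_table
WHERE token(id) >= -9223372036854775808 AND token(id) < 0;
SELECT * FROM my_keyspace.my_table
WHERE token(id) >= 0 AND token(id) <= 9223372036854775807;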
Amazon Keyspaces provides three different partitioners.
Murmur3Partitioner (Default)
Apache Cassandra-compatible Murmur3Partitioner. The Murmur3Partitioner is the
default Cassandra partitioner in Amazon Keyspaces and in Cassandra 1.2 and later versions.
RandomPartitioner
Apache Cassandra-compatible RandomPartitioner. The RandomPartitioner is the default
Cassandra partitioner for versions earlier than Cassandra 1.2.
Keyspaces Default Partitioner
The DefaultPartitioner returns the same token function results as the
RandomPartitioner.
The partitioner setting is applied per Region at the account level. For example, if you change the
partitioner in US East (N. Virginia), the change is applied to all tables in the same account in this
Region. You can safely change your partitioner at any time. Note that the configuration change
takes approximately 10 minutes to complete. You do not need to reload your Amazon Keyspaces
data when you change the partitioner setting. Clients will automatically use the new partitioner
setting the next time they connect.
How to change the partitioner in Amazon Keyspaces
You can change the partitioner by using the AWS Management Console or Cassandra Query
Language (CQL).
AWS Management Console
To change the partitioner using the Amazon Keyspaces console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Configuration.
3. On the Configuration page, go to Edit partitioner.
4. Select the partitioner compatible with your version of Cassandra. The partitioner change
takes approximately 10 minutes to apply.
Note
After the configuration change is complete, you have to disconnect and reconnect
to Amazon Keyspaces for requests to use the new partitioner.
Cassandra Query Language (CQL)
1. To see which partitioner is configured for the account, you can use the following query.
SELECT partitioner from system.local;
If the partitioner hasn't been changed, the query has the following output.
partitioner
--------------------------------------------
com.amazonaws.cassandra.DefaultPartitioner
2.
To update the partitioner to the Murmur3 partitioner, you can use the following statement.
UPDATE system.local set
partitioner='org.apache.cassandra.dht.Murmur3Partitioner' where key='local';
3. Note that this configuration change takes approximately 10 minutes to complete. To
confirm that the partitioner has been set, you can run the SELECT query again. Note that
due to eventual read consistency, the response might not reflect the results of the recently
completed partitioner change yet. If you repeat the SELECT operation again after a short
time, the response should return the latest data.
SELECT partitioner from system.local;
Note
You have to disconnect and reconnect to Amazon Keyspaces so that requests use
the new partitioner.
Using this service with an AWS SDK
AWS software development kits (SDKs) are available for many popular programming languages.
Each SDK provides an API, code examples, and documentation that make it easier for developers to
build applications in their preferred language.
SDK documentation Code examples
AWS SDK for C++ AWS SDK for C++ code examples
AWS CLI AWS CLI code examples
AWS SDK for Go AWS SDK for Go code examples
AWS SDK for Java AWS SDK for Java code examples
AWS SDK for JavaScript AWS SDK for JavaScript code examples
AWS SDK for Kotlin AWS SDK for Kotlin code examples
AWS SDK for .NET AWS SDK for .NET code examples
AWS SDK for PHP AWS SDK for PHP code examples
AWS Tools for PowerShell Tools for PowerShell code examples
AWS SDK for Python (Boto3) AWS SDK for Python (Boto3) code examples
AWS SDK for Ruby AWS SDK for Ruby code examples
AWS SDK for Rust AWS SDK for Rust code examples
AWS SDK for SAP ABAP AWS SDK for SAP ABAP code examples
AWS SDK for Swift AWS SDK for Swift code examples
Working with tags and labels for Amazon Keyspaces resources
You can label Amazon Keyspaces (for Apache Cassandra) resources using tags. Tags let you
categorize your resources in different ways—for example, by purpose, owner, environment, or other
criteria. Tags can help you do the following:
Quickly identify a resource based on the tags that you assigned to it.
See AWS bills broken down by tags.
Control access to Amazon Keyspaces resources based on tags. For IAM policy examples using
tags, see the section called “Authorization based on Amazon Keyspaces tags”.
Tagging is supported by AWS services like Amazon Elastic Compute Cloud (Amazon EC2), Amazon
Simple Storage Service (Amazon S3), Amazon Keyspaces, and more. Efficient tagging can provide
cost insights by enabling you to create reports across services that carry a specific tag.
To get started with tagging, do the following:
1. Understand Restrictions for using tags to label resources in Amazon Keyspaces.
2. Create tags by using Tag keyspaces and tables in Amazon Keyspaces.
3. Use Create cost allocation reports using tags for Amazon Keyspaces to track your AWS costs per
active tag.
Finally, it is good practice to follow optimal tagging strategies. For information, see AWS tagging
strategies.
Restrictions for using tags to label resources in Amazon Keyspaces
Each tag consists of a key and a value, both of which you define. The following restrictions apply:
Each Amazon Keyspaces keyspace or table can have only one tag with the same key. If you try to
add an existing tag (same key), the existing tag value is updated to the new value.
Tags applied to a keyspace do not automatically apply to tables within that keyspace. To apply
the same tag to a keyspace and all its tables, each resource must be individually tagged.
When you create a multi-Region keyspace or table, any tags that you define during the creation
process are automatically applied to all keyspaces and tables in all Regions. When you change
existing tags using ALTER KEYSPACE or ALTER TABLE, the update is only applied to the
keyspace or table in the Region where you're making the change.
A value acts as a descriptor within a tag category (key). In Amazon Keyspaces the value cannot be
empty or null.
Tag keys and values are case sensitive.
The maximum key length is 128 Unicode characters.
The maximum value length is 256 Unicode characters.
The allowed characters are letters, white space, and numbers, plus the following special
characters: + - = . _ : /
The maximum number of tags per resource is 50.
AWS-assigned tag names and values are automatically assigned the aws: prefix, which you can't
assign. AWS-assigned tag names don't count toward the tag limit of 50. User-assigned tag names
have the prefix user: in the cost allocation report.
You can't backdate the application of a tag.
Tag keyspaces and tables in Amazon Keyspaces
You can add, list, edit, or delete tags for keyspaces and tables using the Amazon Keyspaces (for
Apache Cassandra) console, the AWS CLI, or Cassandra Query Language (CQL). You can then
activate these user-defined tags so that they appear on the AWS Billing and Cost Management
console for cost allocation tracking. For more information, see Create cost allocation reports using
tags for Amazon Keyspaces.
For bulk editing, you can also use Tag Editor on the console. For more information, see Working
with Tag Editor in the AWS Resource Groups User Guide.
For information about tag structure, see Restrictions for using tags to label resources in Amazon
Keyspaces.
Topics
Add tags when creating a new keyspace
Add tags to a keyspace
Delete tags from a keyspace
View the tags of a keyspace
Add tags when creating a new table
Add tags to a table
Delete tags from a table
View the tags of a table
Add tags when creating a new keyspace
You can use the Amazon Keyspaces console, CQL or the AWS CLI to add tags when you create a
new keyspace.
Console
Set a tag when creating a new keyspace using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Keyspaces, and then choose Create keyspace.
3. On the Create keyspace page, provide a name for the keyspace.
4. Under Tags choose Add new tag and enter a key and a value.
5. Choose Create keyspace.
Cassandra Query Language (CQL)
Set a tag when creating a new keyspace using CQL
The following example creates a new keyspace with tags.
CREATE KEYSPACE mykeyspace WITH TAGS = {'key1':'val1', 'key2':'val2'};
CLI
Set a tag when creating a new keyspace using the AWS CLI
The following statement creates a new keyspace with tags.
aws keyspaces create-keyspace --keyspace-name 'myKeyspace' --tags
'key=key1,value=val1' 'key=key2,value=val2'
Add tags to a keyspace
The following examples show how to add tags to a keyspace in Amazon Keyspaces.
Console
Add a tag to an existing keyspace using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Keyspaces.
3. Choose a keyspace from the list. Then choose the Tags tab where you can view the tags of
the keyspace.
4. Choose Manage tags to add, edit, or delete tags.
5. Choose Save changes.
Cassandra Query Language (CQL)
Add a tag to an existing keyspace using CQL
ALTER KEYSPACE mykeyspace ADD TAGS {'key1':'val1', 'key2':'val2'};
CLI
Add a tag to an existing keyspace using the AWS CLI
The following example shows how to add new tags to an existing keyspace.
aws keyspaces tag-resource \
    --resource-arn 'arn:aws:cassandra:us-east-1:111222333444:/keyspace/myKeyspace/' \
    --tags 'key=key3,value=val3' 'key=key4,value=val4'
Delete tags from a keyspace
Console
Delete a tag from an existing keyspace using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Keyspaces.
3. Choose a keyspace from the list. Then choose the Tags tab where you can view the tags of
the keyspace.
4. Choose Manage tags and delete the tags you don't need anymore.
5. Choose Save changes.
Cassandra Query Language (CQL)
Delete a tag from an existing keyspace using CQL
ALTER KEYSPACE mykeyspace DROP TAGS {'key1':'val1', 'key2':'val2'};
CLI
Delete a tag from an existing keyspace using the AWS CLI
The following statement removes the specified tags from a keyspace.
aws keyspaces untag-resource \
    --resource-arn 'arn:aws:cassandra:us-east-1:111222333444:/keyspace/myKeyspace/' \
    --tags 'key=key3,value=val3' 'key=key4,value=val4'
View the tags of a keyspace
The following examples show how to read tags using the console, CQL, or the AWS CLI.
Console
View the tags of a keyspace using the Amazon Keyspaces console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Keyspaces.
3. Choose a keyspace from the list. Then choose the Tags tab where you can view the tags of
the keyspace.
Cassandra Query Language (CQL)
View the tags of a keyspace using CQL
To read the tags attached to a keyspace, use the following CQL statement.
SELECT * FROM system_schema_mcs.tags WHERE valid_where_clause;
The WHERE clause is required, and must use one of the following formats:
keyspace_name = 'mykeyspace' AND resource_type = 'keyspace'
resource_id = arn
The following statement returns the tags of the specified keyspace.
SELECT * FROM system_schema_mcs.tags WHERE keyspace_name = 'mykeyspace' AND
resource_type = 'keyspace';
The output of the query looks like the following.
 resource_id                                                 | keyspace_name | resource_name | resource_type | tags
-------------------------------------------------------------+---------------+---------------+---------------+----------------------------------
 arn:aws:cassandra:us-east-1:123456789:/keyspace/mykeyspace/ | mykeyspace    | mykeyspace    | keyspace      | {'key1': 'val1', 'key2': 'val2'}
CLI
View the tags of a keyspace using the AWS CLI
This example shows how to list the tags of the specified resource.
aws keyspaces list-tags-for-resource \
    --resource-arn 'arn:aws:cassandra:us-east-1:111222333444:/keyspace/myKeyspace/'
The output of the last command looks like this.
{
"tags": [
{
"key": "key1",
"value": "val1"
},
{
"key": "key2",
"value": "val2"
},
{
"key": "key3",
"value": "val3"
},
{
"key": "key4",
"value": "val4"
}
]
}
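The same tagging operations are also available through the AWS SDKs. The following is a minimal sketch using the AWS SDK for Python (Boto3); the ARN and tag values are placeholders that mirror the CLI examples above.

import boto3

keyspaces = boto3.client("keyspaces")

# Placeholder ARN; substitute the ARN of your own keyspace.
arn = "arn:aws:cassandra:us-east-1:111222333444:/keyspace/myKeyspace/"

# Add two tags to the keyspace.
keyspaces.tag_resource(
    resourceArn=arn,
    tags=[{"key": "key3", "value": "val3"}, {"key": "key4", "value": "val4"}],
)

# List the tags of the keyspace. For resources with many tags, the
# response can be paginated with the nextToken parameter.
for tag in keyspaces.list_tags_for_resource(resourceArn=arn)["tags"]:
    print(f"{tag['key']} = {tag['value']}")

# Remove the tags again. Untagging requires the full key/value pairs.
keyspaces.untag_resource(
    resourceArn=arn,
    tags=[{"key": "key3", "value": "val3"}, {"key": "key4", "value": "val4"}],
)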
Add tags when creating a new table
You can use the Amazon Keyspaces console, CQL, or the AWS CLI to add tags to a new table when
you create it.
Console
Add a tag when creating a new table using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables, and then choose Create table.
3. On the Create table page in the Table details section, select a keyspace and provide a
name for the table.
4. In the Schema section, create the schema for your table.
5. In the Table settings section, choose Customize settings.
6. Continue to the Table tags – optional section, and choose Add new tag to create new tags.
7. Choose Create table.
Cassandra Query Language (CQL)
Add tags when creating a new table using CQL
The following example creates a new table with tags.
CREATE TABLE mytable(...) WITH TAGS = {'key1':'val1', 'key2':'val2'};
CLI
Add tags when creating a new table using the AWS CLI
The following example shows how to create a new table with tags. The command creates
a table myTable in an already existing keyspace myKeyspace. Note that the command has
been broken up into different lines to help with readability.
aws keyspaces create-table --keyspace-name 'myKeyspace' --table-name 'myTable' \
    --schema-definition 'allColumns=[{name=id,type=int},{name=name,type=text},{name=date,type=timestamp}],partitionKeys=[{name=id}]' \
    --tags 'key=key1,value=val1' 'key=key2,value=val2'
Add tags to a table
You can add tags to an existing table in Amazon Keyspaces using the console, CQL, or the AWS CLI.
Console
Add tags to a table using the Amazon Keyspaces console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables.
3. Choose a table from the list and choose the Tags tab.
4. Choose Manage tags to add tags to the table.
5. Choose Save changes.
Cassandra Query Language (CQL)
Add tags to a table using CQL
The following statement shows how to add tags to an existing table.
ALTER TABLE mykeyspace.mytable ADD TAGS {'key1':'val1', 'key2':'val2'};
CLI
Add tags to a table using the AWS CLI
The following example shows how to add new tags to an existing table.
aws keyspaces tag-resource \
    --resource-arn 'arn:aws:cassandra:us-east-1:111222333444:/keyspace/myKeyspace/table/myTable' \
    --tags 'key=key3,value=val3' 'key=key4,value=val4'
Delete tags from a table
Console
Delete tags from a table using the Amazon Keyspaces console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables.
3. Choose a table from the list and choose the Tags tab.
4. Choose Manage tags to delete tags from the table.
5. Choose Save changes.
Cassandra Query Language (CQL)
Delete tags from a table using CQL
The following statement shows how to delete tags from an existing table.
ALTER TABLE mytable DROP TAGS {'key3':'val3', 'key4':'val4'};
CLI
Delete tags from a table using the AWS CLI
The following statement removes the specified tags from a table.
aws keyspaces untag-resource \
    --resource-arn 'arn:aws:cassandra:us-east-1:111222333444:/keyspace/myKeyspace/table/myTable' \
    --tags 'key=key3,value=val3' 'key=key4,value=val4'
View the tags of a table
The following examples show how to view the tags of a table in Amazon Keyspaces using the console,
CQL, or the AWS CLI.
Console
View the tags of a table using the console
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables.
3. Choose a table from the list and choose the Tags tab.
Cassandra Query Language (CQL)
View the tags of a table using CQL
To read the tags attached to a table, use the following CQL statement.
SELECT * FROM system_schema_mcs.tags WHERE valid_where_clause;
The WHERE clause is required, and must use one of the following formats:
keyspace_name = 'mykeyspace' AND resource_name = 'mytable'
resource_id = arn
The following query returns the tags of the specified table.
SELECT * FROM system_schema_mcs.tags WHERE keyspace_name = 'mykeyspace' AND
resource_name = 'mytable';
The output of that query looks like the following.
 resource_id                                                              | keyspace_name | resource_name | resource_type | tags
--------------------------------------------------------------------------+---------------+---------------+---------------+----------------------------------
 arn:aws:cassandra:us-east-1:123456789:/keyspace/mykeyspace/table/mytable | mykeyspace    | mytable       | table         | {'key1': 'val1', 'key2': 'val2'}
CLI
View the tags of a table using the AWS CLI
This example shows how to list the tags of the specified resource.
aws keyspaces list-tags-for-resource \
    --resource-arn 'arn:aws:cassandra:us-east-1:111222333444:/keyspace/myKeyspace/table/myTable'
The output of the last command looks like this.
{
"tags": [
{
"key": "key1",
"value": "val1"
},
{
"key": "key2",
"value": "val2"
},
{
"key": "key3",
"value": "val3"
},
{
"key": "key4",
"value": "val4"
}
]
}
Create cost allocation reports using tags for Amazon Keyspaces
AWS uses tags to organize resource costs on your cost allocation report. AWS provides two types of
cost allocation tags:
An AWS-generated tag. AWS defines, creates, and applies this tag for you.
User-defined tags. You define, create, and apply these tags.
You must activate both types of tags separately before they can appear in Cost Explorer or on a
cost allocation report.
To activate AWS-generated tags:
1. Sign in to the AWS Management Console and open the Billing and Cost Management console at
https://console.aws.amazon.com/billing/home#/.
2. In the navigation pane, choose Cost Allocation Tags.
3. Under AWS-Generated Cost Allocation Tags, choose Activate.
To activate user-defined tags:
1. Sign in to the AWS Management Console and open the Billing and Cost Management console at
https://console.aws.amazon.com/billing/home#/.
2. In the navigation pane, choose Cost Allocation Tags.
3. Under User-Defined Cost Allocation Tags, choose Activate.
After you create and activate tags, AWS generates a cost allocation report with your usage and
costs grouped by your active tags. The cost allocation report includes all of your AWS costs for each
billing period. The report includes both tagged and untagged resources, so that you can clearly
organize the charges for resources.
Note
Currently, any data transferred out from Amazon Keyspaces won't be broken down by tags
on cost allocation reports.
For more information, see Using cost allocation tags.
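Cost data grouped by an active tag can also be retrieved programmatically through the AWS Cost Explorer API. The following is a minimal sketch using Boto3; the tag key and the billing period are placeholders, and Cost Explorer must be enabled for the account.

import boto3

ce = boto3.client("ce")  # AWS Cost Explorer

# Group one month of costs by the user-defined tag key 'key1' (placeholder).
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-02-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "key1"}],
)

for result in response["ResultsByTime"]:
    for group in result["Groups"]:
        print(group["Keys"], group["Metrics"]["UnblendedCost"]["Amount"])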
Create Amazon Keyspaces resources with AWS CloudFormation
Amazon Keyspaces is integrated with AWS CloudFormation, a service that helps you model and set
up your Amazon Keyspaces resources so that you can spend less time creating and managing your
resources and infrastructure. You create a template that describes the keyspaces and tables that
you want, and AWS CloudFormation takes care of provisioning and configuring those resources for
you.
When you use AWS CloudFormation, you can reuse your template to set up your Amazon
Keyspaces resources consistently and repeatedly. Just describe your resources once, and then
provision the same resources over and over in multiple AWS accounts and Regions.
Amazon Keyspaces and AWS CloudFormation templates
To provision and configure resources for Amazon Keyspaces, you must understand AWS
CloudFormation templates. Templates are formatted text files in JSON or YAML. These templates
describe the resources that you want to provision in your AWS CloudFormation stacks. If you're
unfamiliar with JSON or YAML, you can use AWS CloudFormation Designer to help you get started
with AWS CloudFormation templates. For more information, see What is AWS CloudFormation
Designer? in the AWS CloudFormation User Guide.
Amazon Keyspaces supports creating keyspaces and tables in AWS CloudFormation. For the tables
you create using AWS CloudFormation templates, you can specify the schema, read/write mode,
provisioned throughput settings, and other supported features. For more information, including
examples of JSON and YAML templates for keyspaces and tables, see Cassandra resource type
reference in the AWS CloudFormation User Guide.
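As an illustration, the following sketch builds a minimal template in Python with one keyspace and one table and creates a stack from it with Boto3. The stack, keyspace, and table names and the single int partition key are placeholders; see the Cassandra resource type reference for the full set of supported properties.

import json
import boto3

# A minimal template using the AWS::Cassandra resource types.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "MyKeyspace": {
            "Type": "AWS::Cassandra::Keyspace",
            "Properties": {"KeyspaceName": "myKeyspace"},
        },
        "MyTable": {
            "Type": "AWS::Cassandra::Table",
            "DependsOn": "MyKeyspace",
            "Properties": {
                "KeyspaceName": "myKeyspace",
                "TableName": "myTable",
                "PartitionKeyColumns": [
                    {"ColumnName": "id", "ColumnType": "int"},
                ],
            },
        },
    },
}

# Provision both resources as a single stack.
boto3.client("cloudformation").create_stack(
    StackName="my-keyspaces-stack",
    TemplateBody=json.dumps(template),
)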
Learn more about AWS CloudFormation
To learn more about AWS CloudFormation, see the following resources:
AWS CloudFormation
AWS CloudFormation User Guide
AWS CloudFormation command line interface User Guide
Using NoSQL Workbench with Amazon Keyspaces (for Apache
Cassandra)
NoSQL Workbench is a client-side application that helps you design and visualize nonrelational
data models for Amazon Keyspaces more easily. NoSQL Workbench clients are available for
Windows, macOS, and Linux.
Designing data models and creating resources automatically
NoSQL Workbench provides you a point-and-click interface to design and create Amazon
Keyspaces data models. You can easily create new data models from scratch by defining
keyspaces, tables, and columns. You can also import existing data models and make
modifications (such as adding, editing, or removing columns) to adapt the data models for
new applications. NoSQL Workbench then enables you to commit the data models to Amazon
Keyspaces or Apache Cassandra, and create the keyspaces and tables automatically. To learn
how to build data models, see the section called “Create a data model” and the section called
“Edit a data model”.
Visualizing data models
Using NoSQL Workbench, you can visualize your data models to help ensure that the data
models can support your application’s queries and access patterns. You can also save and export
your data models in a variety of formats for collaboration, documentation, and presentations.
For more information, see the section called “Visualize a data model”.
Topics
Download NoSQL Workbench
Getting started with NoSQL Workbench
Visualize data models with NoSQL Workbench
Create a new data model with NoSQL Workbench
Edit existing data models with NoSQL Workbench
How to commit data models to Amazon Keyspaces and Apache Cassandra
Sample data models in NoSQL Workbench
Release history for NoSQL Workbench
Download NoSQL Workbench
Follow these instructions to download and install NoSQL Workbench.
To download and install NoSQL Workbench
1. Use one of the following links to download NoSQL Workbench for free.
Operating System Download Link
macOS Download for macOS
Linux* Download for Linux
Windows Download for Windows
* NoSQL Workbench supports Ubuntu 12.04, Fedora 21, and Debian 8 or any newer versions of
these Linux distributions.
2. After the download completes, start the application and follow the onscreen instructions to
complete the installation.
Getting started with NoSQL Workbench
To get started with NoSQL Workbench, on the Database Catalog page in NoSQL Workbench,
choose Amazon Keyspaces, and then choose Launch.
This opens the NoSQL Workbench home page for Amazon Keyspaces where you have the following
options to get started:
1. Create a new data model.
2. Import an existing data model in JSON format.
3. Open a recently edited data model.
4. Open one of the available sample models.
Each of the options opens the NoSQL Workbench data modeler. To continue creating a new data
model, see the section called “Create a data model”. To edit an existing data model, see the section
called “Edit a data model”.
Visualize data models with NoSQL Workbench
Using NoSQL Workbench, you can visualize your data models to help ensure that the data models
can support your application’s queries and access patterns. You also can save and export your data
models in a variety of formats for collaboration, documentation, and presentations.
After you have created a new data model or edited an existing data model, you can visualize the
model.
Visualizing data models with NoSQL Workbench
When you have completed the data model in the data modeler, choose Visualize data model.
This takes you to the data visualizer in NoSQL Workbench. The data visualizer provides a visual
representation of the table's schema and lets you add sample data. To add sample data to a table,
choose a table from the model, and then choose Edit. To add a new row of data, choose Add new
row at the bottom of the screen. Choose Save when you're done.
Aggregate view
After you have confirmed the table's schema, you can aggregate the data model visualization.
After you have aggregated the view of the data model, you can export the view to a PNG file. To
export the data model to a JSON file, choose the upload sign under the data model name.
Note
You can export the data model in JSON format at any time in the design process.
You have the following options to commit the changes:
Commit to Amazon Keyspaces
Commit to an Apache Cassandra cluster
To learn more about how to commit changes, see the section called “Commit a data model”.
Create a new data model with NoSQL Workbench
You can use the NoSQL Workbench data modeler to design new data models based on your
application's data access patterns. To create a new data model for Amazon Keyspaces, you can use
the NoSQL Workbench data modeler to create keyspaces, tables, and columns. Follow these steps
to create a new data model.
1. To create a new keyspace, choose the plus sign under Keyspace.
In this step, choose the following properties and settings.
Keyspace name – Enter the name of the new keyspace.
Replication strategy – Choose the replication strategy for the keyspace. Amazon Keyspaces
uses the SingleRegionStrategy to replicate data three times automatically in multiple AWS
Availability Zones. If you're planning to commit the data model to an Apache Cassandra
cluster, you can choose SimpleStrategy or NetworkTopologyStrategy.
Keyspaces tags – Resource tags are optional and let you categorize your resources in
different ways—for example, by purpose, owner, environment, or other criteria. To learn
more about tags for Amazon Keyspaces resources, see the section called “Working with
tags”.
2. Choose Add keyspace definition to create the keyspace.
3. To create a new table, choose the plus sign next to Tables. In this step, you define the
following properties and settings.
Table name – The name of the new table.
Columns – Add a column name and choose the data type. Repeat these steps for every
column in your schema.
Partition key – Choose columns for the partition key.
Clustering columns – Choose clustering columns (optional).
Capacity mode – Choose the read/write capacity mode for the table. You can choose
provisioned or on-demand capacity. To learn more about capacity modes, see the section
called “Configure read/write capacity modes”.
Table tags – Resource tags are optional and let you categorize your resources in different
ways—for example, by purpose, owner, environment, or other criteria. To learn more about
tags for Amazon Keyspaces resources, see the section called “Working with tags”.
4. Choose Add table definition to create the new table.
5. Repeat these steps to create additional tables.
6. Continue to the section called “Visualize a data model” to visualize the data model that you
created.
Edit existing data models with NoSQL Workbench
You can use the data modeler to import and modify existing data models created using NoSQL
Workbench. The data modeler also includes a few sample data models to help you get started with
data modeling. The data models you can edit with NoSQL Workbench can be data models that are
imported from a file, the provided sample data models, or data models that you created previously.
1. To edit a keyspace, choose the edit symbol under Keyspace.
In this step, you can edit the following properties and settings.
Keyspace name – Enter the name of the new keyspace.
Replication strategy – Choose the replication strategy for the keyspace. Amazon Keyspaces
uses the SingleRegionStrategy to replicate data three times automatically in multiple AWS
Availability Zones. If you're planning to commit the data model to an Apache Cassandra
cluster, you can choose SimpleStrategy or NetworkTopologyStrategy.
Keyspaces tags – Resource tags are optional and let you categorize your resources in
different ways—for example, by purpose, owner, environment, or other criteria. To learn
more about tags for Amazon Keyspaces resources, see the section called “Working with
tags”.
2. Choose Save edits to update the keyspace.
3. To edit a table, choose Edit next to the table name. In this step, you can update the following
properties and settings.
Table name – The name of the new table.
Columns – Add a column name and choose the data type. Repeat these steps for every
column in your schema.
Partition key – Choose columns for the partition key.
Clustering columns – Choose clustering columns (optional).
Capacity mode – Choose the read/write capacity mode for the table. You can choose
provisioned or on-demand capacity. To learn more about capacity modes, see the section
called “Configure read/write capacity modes”.
Table tags – Resource tags are optional and let you categorize your resources in different
ways—for example, by purpose, owner, environment, or other criteria. To learn more about
tags for Amazon Keyspaces resources, see the section called “Working with tags”.
4. Choose Save edits to update the table.
5. Continue to the section called “Visualize a data model” to visualize the data model that you
updated.
How to commit data models to Amazon Keyspaces and Apache
Cassandra
This section shows you how to commit completed data models to Amazon Keyspaces and Apache
Cassandra clusters. This process automatically creates the server-side resources for keyspaces and
tables based on the settings that you defined in the data model.
Topics
Before you begin
Connect to Amazon Keyspaces with service-specific credentials
Connect to Amazon Keyspaces with AWS Identity and Access Management (IAM) credentials
Use a saved connection
Commit to Apache Cassandra
Before you begin
Amazon Keyspaces requires the use of Transport Layer Security (TLS) to help secure connections
with clients. To connect to Amazon Keyspaces using TLS, you need to complete the following task
before you can start.
Download the Starfield digital certificate using the following command and save
sf-class2-root.crt locally or in your home directory.
curl https://certs.secureserver.net/repository/sf-class2-root.crt -O
Note
You can also use the Amazon digital certificate to connect to Amazon Keyspaces and
can continue to do so if your client is connecting to Amazon Keyspaces successfully. The
Starfield certificate provides additional backwards compatibility for clients using older
certificate authorities.
After you have saved the certificate file, you can connect to Amazon Keyspaces. One option is
to connect by using service-specific credentials. Service-specific credentials are a user name and
password that are associated with a specific IAM user and can only be used with the specified
service. The second option is to connect with IAM credentials that are using the AWS Signature
Version 4 process (SigV4). To learn more about these two options, see the section called “Create
programmatic access credentials”.
To connect with service-specific credentials, see the section called “Connect with service-specific
credentials”.
To connect with IAM credentials, see the section called “Connect with IAM credentials”.
Connect to Amazon Keyspaces with service-specific credentials
This section shows how to use service-specific credentials to commit the data model you created or
edited with NoSQL Workbench.
1. To create a new connection using service-specific credentials, choose the Connect by using
user name and password tab.
Before you begin, you must create service-specific credentials using the process
documented in the section called “Create service-specific credentials”.
After you have obtained the service-specific credentials, you can continue to set up the
connection by entering the following information:
User name – Enter the user name.
Password – Enter the password.
AWS Region – For available Regions, see the section called “Service endpoints”.
Port – Amazon Keyspaces uses port 9142.
Alternatively, you can import saved credentials from a file.
2. Choose Commit to update Amazon Keyspaces with the data model.
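Outside of NoSQL Workbench, the same service-specific credentials and TLS settings can be used programmatically. The following is a minimal sketch using the DataStax Python driver; the user name, password, and certificate path are placeholders, and the endpoint assumes the us-east-1 Region.

from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED

from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

# Trust the Starfield root certificate downloaded earlier (placeholder path).
ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations("sf-class2-root.crt")
ssl_context.verify_mode = CERT_REQUIRED

# Service-specific credentials (placeholders).
auth_provider = PlainTextAuthProvider(
    username="alice-at-111222333444", password="my-service-password"
)

cluster = Cluster(
    ["cassandra.us-east-1.amazonaws.com"],  # service endpoint for us-east-1
    ssl_context=ssl_context,
    auth_provider=auth_provider,
    port=9142,  # Amazon Keyspaces uses port 9142
)
session = cluster.connect()
for row in session.execute("SELECT keyspace_name FROM system_schema.keyspaces"):
    print(row.keyspace_name)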
Connect to Amazon Keyspaces with AWS Identity and Access Management (IAM)
credentials
This section shows how to use IAM credentials to commit the data model created or edited with
NoSQL Workbench.
1. To create a new connection using IAM credentials, choose the Connect by using IAM
credentials tab.
Before you begin, you must create IAM credentials using one of the following methods.
For console access, use your IAM user name and password to sign in to the AWS
Management Console from the IAM sign-in page. For information about AWS security
credentials, including programmatic access and alternatives to long-term credentials,
see AWS security credentials in the IAM User Guide. For details about signing in to your
AWS account, see How to sign in to AWS in the AWS Sign-In User Guide.
For CLI access, you need an access key ID and a secret access key. Use temporary
credentials instead of long-term access keys when possible. Temporary credentials
include an access key ID, a secret access key, and a security token that indicates when
the credentials expire. For more information, see Using temporary credentials with AWS
resources in the IAM User Guide.
For API access, you need an access key ID and secret access key. Use IAM user access
keys instead of AWS account root user access keys. For more information about creating
access keys, see Managing access keys for IAM users in the IAM User Guide.
After you have obtained the IAM credentials, you can continue to set up the connection.
Connection name – The name of the connection.
AWS Region – For available Regions, see the section called “Service endpoints”.
Access key ID – Enter the access key ID.
Secret access key – Enter the secret access key.
Port – Amazon Keyspaces uses port 9142.
AWS public certificate – Point to the AWS certificate that was downloaded in the first step.
Persist connection – Select this check box if you want to save the AWS connection secrets
locally.
2. Choose Commit to update Amazon Keyspaces with the data model.
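For programmatic access with the same IAM credentials, AWS publishes a SigV4 authentication plugin for the DataStax Python driver (the cassandra-sigv4 package). The following is a minimal sketch; the Region and certificate path are placeholders, and the credentials are resolved from the default Boto3 credential chain.

from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED

import boto3
from cassandra.cluster import Cluster
from cassandra_sigv4.auth import SigV4AuthProvider

ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations("sf-class2-root.crt")  # placeholder path
ssl_context.verify_mode = CERT_REQUIRED

# SigV4 signs the authentication exchange with credentials from this session.
boto_session = boto3.Session(region_name="us-east-1")
auth_provider = SigV4AuthProvider(boto_session)

cluster = Cluster(
    ["cassandra.us-east-1.amazonaws.com"],
    ssl_context=ssl_context,
    auth_provider=auth_provider,
    port=9142,
)
session = cluster.connect()
for row in session.execute("SELECT keyspace_name FROM system_schema.keyspaces"):
    print(row.keyspace_name)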
Use a saved connection
If you have previously set up a connection to Amazon Keyspaces, you can use that as the default
connection to commit data model changes. Choose the Use saved connections tab and continue to
commit the updates.
Commit to Apache Cassandra
This section walks you through connecting to an Apache Cassandra cluster to commit the data
model created or edited with NoSQL Workbench.
Note
Only data models that have been created with SimpleStrategy or
NetworkTopologyStrategy can be committed to Apache Cassandra clusters. To change
the replication strategy, edit the keyspace in the data modeler.
1. User name – Enter the user name if authentication is enabled on the cluster.
Password – Enter the password if authentication is enabled on the cluster.
Contact points – Enter the contact points.
Local data center – Enter the name of the local data center.
Port – The connection uses port 9042.
2. Choose Commit to update the Apache Cassandra cluster with the data model.
Sample data models in NoSQL Workbench
The home page for the modeler and visualizer displays a number of sample models that ship with
NoSQL Workbench. This section describes these models and their potential uses.
Topics
Employee data model
Credit card transactions data model
Airline operations data model
Employee data model
This data model represents an Amazon Keyspaces schema for an employee database application.
Applications that access employee information for a given company can use this data model.
The access patterns supported by this data model are:
Retrieval of an employee record with a given ID.
Retrieval of an employee record with a given ID and division.
Retrieval of an employee record with a given ID and name.
Credit card transactions data model
This data model represents an Amazon Keyspaces schema for credit card transactions at retail
stores.
The storage of credit card transactions not only helps stores with bookkeeping, but also helps store
managers analyze purchase trends, which can help them with forecasting and planning.
The access patterns supported by this data model are:
Retrieval of transactions by credit card number, month and year, and date.
Retrieval of transactions by credit card number, category, and date.
Retrieval of transactions by category, location, and credit card number.
Retrieval of transactions by credit card number and dispute status.
Airline operations data model
This data model shows data about plane flights, including airports, airlines, and flight routes.
This model demonstrates key components of Amazon Keyspaces modeling, including key-value pairs,
wide-column data stores, composite keys, and complex data types such as maps, to illustrate
common NoSQL data-access patterns.
The access patterns supported by this data model are:
Retrieval of routes originating from a given airline at a given airport.
Retrieval of routes with a given destination airport.
Retrieval of airports with direct flights.
Retrieval of airport details and airline details.
Release history for NoSQL Workbench
The following table describes the important changes in each release of the NoSQL Workbench
client-side application.
Change: NoSQL Workbench for Amazon Keyspaces – GA
Description: NoSQL Workbench for Amazon Keyspaces is generally available.
Date: October 28, 2020

Change: NoSQL Workbench preview released
Description: NoSQL Workbench is a client-side application that helps you design
and visualize nonrelational data models for Amazon Keyspaces more easily. NoSQL
Workbench clients are available for Windows, macOS, and Linux. For more
information, see NoSQL Workbench for Amazon Keyspaces.
Date: October 5, 2020
Code examples for Amazon Keyspaces using AWS SDKs
The following code examples show how to use Amazon Keyspaces with an AWS software
development kit (SDK).
Basics are code examples that show you how to perform the essential operations within a service.
Actions are code excerpts from larger programs and must be run in context. While actions show you
how to call individual service functions, you can see actions in context in their related scenarios.
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Get started
Hello Amazon Keyspaces
The following code examples show how to get started using Amazon Keyspaces.
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
namespace KeyspacesActions;
public class HelloKeyspaces
{
private static ILogger logger = null!;
static async Task Main(string[] args)
{
// Set up dependency injection for Amazon Keyspaces (for Apache
Cassandra).
using var host = Host.CreateDefaultBuilder(args)
.ConfigureLogging(logging =>
logging.AddFilter("System", LogLevel.Debug)
.AddFilter<DebugLoggerProvider>("Microsoft",
LogLevel.Information)
.AddFilter<ConsoleLoggerProvider>("Microsoft",
LogLevel.Trace))
.ConfigureServices((_, services) =>
services.AddAWSService<IAmazonKeyspaces>()
.AddTransient<KeyspacesWrapper>()
)
.Build();
logger = LoggerFactory.Create(builder => { builder.AddConsole(); })
.CreateLogger<HelloKeyspaces>();
var keyspacesClient =
host.Services.GetRequiredService<IAmazonKeyspaces>();
var keyspacesWrapper = new KeyspacesWrapper(keyspacesClient);
Console.WriteLine("Hello, Amazon Keyspaces! Let's list your keyspaces:");
await keyspacesWrapper.ListKeyspaces();
}
}
For API details, see ListKeyspaces in AWS SDK for .NET API Reference.
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.keyspaces.KeyspacesClient;
import software.amazon.awssdk.services.keyspaces.model.KeyspaceSummary;
import software.amazon.awssdk.services.keyspaces.model.KeyspacesException;
import software.amazon.awssdk.services.keyspaces.model.ListKeyspacesRequest;
import software.amazon.awssdk.services.keyspaces.model.ListKeyspacesResponse;
import java.util.List;
/**
* Before running this Java (v2) code example, set up your development
* environment, including your credentials.
*
* For more information, see the following documentation topic:
*
* https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-
started.html
*/
public class HelloKeyspaces {
public static void main(String[] args) {
Region region = Region.US_EAST_1;
KeyspacesClient keyClient = KeyspacesClient.builder()
.region(region)
.build();
listKeyspaces(keyClient);
}
public static void listKeyspaces(KeyspacesClient keyClient) {
try {
ListKeyspacesRequest keyspacesRequest =
ListKeyspacesRequest.builder()
.maxResults(10)
.build();
ListKeyspacesResponse response =
keyClient.listKeyspaces(keyspacesRequest);
List<KeyspaceSummary> keyspaces = response.keyspaces();
for (KeyspaceSummary keyspace : keyspaces) {
System.out.println("The name of the keyspace is " +
keyspace.keyspaceName());
}
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
}
For API details, see ListKeyspaces in AWS SDK for Java 2.x API Reference.
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
/**
Before running this Kotlin code example, set up your development environment,
including your credentials.
For more information, see the following documentation topic:
https://docs.aws.amazon.com/sdk-for-kotlin/latest/developer-guide/setup.html
*/

// Imports assume the AWS SDK for Kotlin keyspaces artifact.
import aws.sdk.kotlin.services.keyspaces.KeyspacesClient
import aws.sdk.kotlin.services.keyspaces.model.ListKeyspacesRequest

suspend fun main() {
    listKeyspaces()
}

suspend fun listKeyspaces() {
    val keyspacesRequest =
        ListKeyspacesRequest {
            maxResults = 10
        }

    KeyspacesClient { region = "us-east-1" }.use { keyClient ->
        val response = keyClient.listKeyspaces(keyspacesRequest)
        response.keyspaces?.forEach { keyspace ->
            println("The name of the keyspace is ${keyspace.keyspaceName}")
        }
    }
}
For API details, see ListKeyspaces in AWS SDK for Kotlin API reference.
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
import boto3


def hello_keyspaces(keyspaces_client):
    """
    Use the AWS SDK for Python (Boto3) to create an Amazon Keyspaces
    (for Apache Cassandra) client and list the keyspaces in your account.
    This example uses the default settings specified in your shared credentials
    and config files.

    :param keyspaces_client: A Boto3 Amazon Keyspaces Client object. This object
        wraps the low-level Amazon Keyspaces service API.
    """
    print("Hello, Amazon Keyspaces! Let's list some of your keyspaces:\n")
    for ks in keyspaces_client.list_keyspaces(maxResults=5).get("keyspaces", []):
        print(ks["keyspaceName"])
        print(f"\t{ks['resourceArn']}")


if __name__ == "__main__":
    hello_keyspaces(boto3.client("keyspaces"))
For API details, see ListKeyspaces in AWS SDK for Python (Boto3) API Reference.
Code examples
Basic examples for Amazon Keyspaces using AWS SDKs
Hello Amazon Keyspaces
Learn the basics of Amazon Keyspaces with an AWS SDK
Actions for Amazon Keyspaces using AWS SDKs
Use CreateKeyspace with an AWS SDK or CLI
Use CreateTable with an AWS SDK or CLI
Use DeleteKeyspace with an AWS SDK or CLI
Use DeleteTable with an AWS SDK or CLI
Use GetKeyspace with an AWS SDK or CLI
Use GetTable with an AWS SDK or CLI
Use ListKeyspaces with an AWS SDK or CLI
Use ListTables with an AWS SDK or CLI
Use RestoreTable with an AWS SDK or CLI
Use UpdateTable with an AWS SDK or CLI
Basic examples for Amazon Keyspaces using AWS SDKs
The following code examples show how to use the basics of Amazon Keyspaces (for Apache
Cassandra) with AWS SDKs.
Examples
Hello Amazon Keyspaces
Learn the basics of Amazon Keyspaces with an AWS SDK
Actions for Amazon Keyspaces using AWS SDKs
Use CreateKeyspace with an AWS SDK or CLI
Use CreateTable with an AWS SDK or CLI
Use DeleteKeyspace with an AWS SDK or CLI
Use DeleteTable with an AWS SDK or CLI
Use GetKeyspace with an AWS SDK or CLI
Use GetTable with an AWS SDK or CLI
Use ListKeyspaces with an AWS SDK or CLI
Use ListTables with an AWS SDK or CLI
Use RestoreTable with an AWS SDK or CLI
Use UpdateTable with an AWS SDK or CLI
Learn the basics of Amazon Keyspaces with an AWS SDK
The following code examples show how to:
Create a keyspace and table. The table schema holds movie data and has point-in-time recovery
enabled.
Connect to the keyspace using a secure TLS connection with SigV4 authentication.
Query the table. Add, retrieve, and update movie data.
Update the table. Add a column to track watched movies.
Restore the table to its previous state and clean up resources.
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
global using System.Security.Cryptography.X509Certificates;
global using Amazon.Keyspaces;
global using Amazon.Keyspaces.Model;
global using KeyspacesActions;
global using KeyspacesScenario;
global using Microsoft.Extensions.Configuration;
global using Microsoft.Extensions.DependencyInjection;
global using Microsoft.Extensions.Hosting;
global using Microsoft.Extensions.Logging;
global using Microsoft.Extensions.Logging.Console;
global using Microsoft.Extensions.Logging.Debug;
global using Newtonsoft.Json;
namespace KeyspacesBasics;
/// <summary>
/// Amazon Keyspaces (for Apache Cassandra) scenario. Shows some of the basic
/// actions performed with Amazon Keyspaces.
/// </summary>
public class KeyspacesBasics
{
private static ILogger logger = null!;
static async Task Main(string[] args)
{
// Set up dependency injection for the Amazon service.
using var host = Host.CreateDefaultBuilder(args)
.ConfigureLogging(logging =>
logging.AddFilter("System", LogLevel.Debug)
.AddFilter<DebugLoggerProvider>("Microsoft",
LogLevel.Information)
.AddFilter<ConsoleLoggerProvider>("Microsoft",
LogLevel.Trace))
.ConfigureServices((_, services) =>
services.AddAWSService<IAmazonKeyspaces>()
.AddTransient<KeyspacesWrapper>()
.AddTransient<CassandraWrapper>()
)
.Build();
logger = LoggerFactory.Create(builder => { builder.AddConsole(); })
.CreateLogger<KeyspacesBasics>();
var configuration = new ConfigurationBuilder()
.SetBasePath(Directory.GetCurrentDirectory())
.AddJsonFile("settings.json") // Load test settings from .json file.
.AddJsonFile("settings.local.json",
true) // Optionally load local settings.
.Build();
var keyspacesWrapper =
host.Services.GetRequiredService<KeyspacesWrapper>();
var uiMethods = new UiMethods();
var keyspaceName = configuration["KeyspaceName"];
var tableName = configuration["TableName"];
bool success; // Used to track the results of some operations.
uiMethods.DisplayOverview();
uiMethods.PressEnter();
// Create the keyspace.
var keyspaceArn = await keyspacesWrapper.CreateKeyspace(keyspaceName);
// Wait for the keyspace to be available. GetKeyspace results in a
// resource not found error until it is ready for use.
try
{
var getKeyspaceArn = "";
Console.Write($"Created {keyspaceName}. Waiting for it to become
available. ");
do
{
getKeyspaceArn = await
keyspacesWrapper.GetKeyspace(keyspaceName);
Console.Write(". ");
} while (getKeyspaceArn != keyspaceArn);
}
catch (ResourceNotFoundException)
{
Console.WriteLine("Waiting for keyspace to be created.");
}
Console.WriteLine($"\nThe keyspace {keyspaceName} is ready for use.");
uiMethods.PressEnter();
// Create the table.
// First define the schema.
var allColumns = new List<ColumnDefinition>
{
new ColumnDefinition { Name = "title", Type = "text" },
new ColumnDefinition { Name = "year", Type = "int" },
new ColumnDefinition { Name = "release_date", Type = "timestamp" },
new ColumnDefinition { Name = "plot", Type = "text" },
};
var partitionKeys = new List<PartitionKey>
{
new PartitionKey { Name = "year", },
new PartitionKey { Name = "title" },
};
var tableSchema = new SchemaDefinition
{
AllColumns = allColumns,
PartitionKeys = partitionKeys,
};
var tableArn = await keyspacesWrapper.CreateTable(keyspaceName,
tableSchema, tableName);
// Wait for the table to be active.
try
{
var resp = new GetTableResponse();
Console.Write("Waiting for the new table to be active. ");
do
{
try
{
resp = await keyspacesWrapper.GetTable(keyspaceName,
tableName);
Console.Write(".");
}
catch (ResourceNotFoundException)
{
Console.Write(".");
}
} while (resp.Status != TableStatus.ACTIVE);
// Display the table's schema.
Console.WriteLine($"\nTable {tableName} has been created in
{keyspaceName}");
Console.WriteLine("Let's take a look at the schema.");
uiMethods.DisplayTitle("All columns");
resp.SchemaDefinition.AllColumns.ForEach(column =>
{
Console.WriteLine($"{column.Name,-40}\t{column.Type,-20}");
});
uiMethods.DisplayTitle("Cluster keys");
resp.SchemaDefinition.ClusteringKeys.ForEach(clusterKey =>
{
Console.WriteLine($"{clusterKey.Name,-40}\t{clusterKey.OrderBy,-20}");
});
uiMethods.DisplayTitle("Partition keys");
resp.SchemaDefinition.PartitionKeys.ForEach(partitionKey =>
{
Console.WriteLine($"{partitionKey.Name}");
});
uiMethods.PressEnter();
}
catch (ResourceNotFoundException ex)
{
Console.WriteLine($"Error: {ex.Message}");
}
        // Access Apache Cassandra using the Cassandra driver for C#.
var cassandraWrapper =
host.Services.GetRequiredService<CassandraWrapper>();
var movieFilePath = configuration["MovieFile"];
Console.WriteLine("Let's add some movies to the table we created.");
var inserted = await cassandraWrapper.InsertIntoMovieTable(keyspaceName,
tableName, movieFilePath);
uiMethods.PressEnter();
Console.WriteLine("Added the following movies to the table:");
var rows = await cassandraWrapper.GetMovies(keyspaceName, tableName);
uiMethods.DisplayTitle("All Movies");
foreach (var row in rows)
{
var title = row.GetValue<string>("title");
var year = row.GetValue<int>("year");
var plot = row.GetValue<string>("plot");
var release_date = row.GetValue<DateTime>("release_date");
Console.WriteLine($"{release_date}\t{title}\t{year}\n{plot}");
Console.WriteLine(uiMethods.SepBar);
}
// Update the table schema
uiMethods.DisplayTitle("Update table schema");
Console.WriteLine("Now we will update the table to add a boolean field
called watched.");
// First save the current time as a UTC Date so the original
// table can be restored later.
var timeChanged = DateTime.UtcNow;
// Now update the schema.
var resourceArn = await keyspacesWrapper.UpdateTable(keyspaceName,
tableName);
uiMethods.PressEnter();
Console.WriteLine("Now let's mark some of the movies as watched.");
// Pick some files to mark as watched.
var movieToWatch = rows[2].GetValue<string>("title");
var watchedMovieYear = rows[2].GetValue<int>("year");
var changedRows = await cassandraWrapper.MarkMovieAsWatched(keyspaceName,
tableName, movieToWatch, watchedMovieYear);
movieToWatch = rows[6].GetValue<string>("title");
watchedMovieYear = rows[6].GetValue<int>("year");
changedRows = await cassandraWrapper.MarkMovieAsWatched(keyspaceName,
tableName, movieToWatch, watchedMovieYear);
movieToWatch = rows[9].GetValue<string>("title");
watchedMovieYear = rows[9].GetValue<int>("year");
changedRows = await cassandraWrapper.MarkMovieAsWatched(keyspaceName,
tableName, movieToWatch, watchedMovieYear);
movieToWatch = rows[10].GetValue<string>("title");
watchedMovieYear = rows[10].GetValue<int>("year");
changedRows = await cassandraWrapper.MarkMovieAsWatched(keyspaceName,
tableName, movieToWatch, watchedMovieYear);
movieToWatch = rows[13].GetValue<string>("title");
watchedMovieYear = rows[13].GetValue<int>("year");
changedRows = await cassandraWrapper.MarkMovieAsWatched(keyspaceName,
tableName, movieToWatch, watchedMovieYear);
uiMethods.DisplayTitle("Watched movies");
Console.WriteLine("These movies have been marked as watched:");
rows = await cassandraWrapper.GetWatchedMovies(keyspaceName, tableName);
foreach (var row in rows)
{
var title = row.GetValue<string>("title");
var year = row.GetValue<int>("year");
Console.WriteLine($"{title,-40}\t{year,8}");
}
uiMethods.PressEnter();
Console.WriteLine("We can restore the table to its previous state but
that can take up to 20 minutes to complete.");
string answer;
do
{
Console.WriteLine("Do you want to restore the table? (y/n)");
answer = Console.ReadLine();
} while (answer.ToLower() != "y" && answer.ToLower() != "n");
if (answer == "y")
{
var restoredTableName = $"{tableName}_restored";
var restoredTableArn = await keyspacesWrapper.RestoreTable(
keyspaceName,
tableName,
restoredTableName,
timeChanged);
            // Loop and call GetTable until the restored table is active.
            // While the restore is still in progress, GetTable can raise a
            // ResourceNotFoundException.
bool wasRestored = false;
try
{
do
{
var resp = await keyspacesWrapper.GetTable(keyspaceName,
restoredTableName);
wasRestored = (resp.Status == TableStatus.ACTIVE);
} while (!wasRestored);
}
catch (ResourceNotFoundException)
{
// If the restored table raised an error, it isn't
// ready yet.
Console.Write(".");
}
}
uiMethods.DisplayTitle("Clean up resources.");
// Delete the table.
success = await keyspacesWrapper.DeleteTable(keyspaceName, tableName);
Console.WriteLine($"Table {tableName} successfully deleted from
{keyspaceName}.");
Console.WriteLine("Waiting for the table to be removed completely. ");
// Loop and call GetTable until the table is gone. Once it has been
// deleted completely, GetTable will raise a ResourceNotFoundException.
bool wasDeleted = false;
try
{
do
{
var resp = await keyspacesWrapper.GetTable(keyspaceName,
tableName);
} while (!wasDeleted);
}
catch (ResourceNotFoundException ex)
{
wasDeleted = true;
Console.WriteLine($"{ex.Message} indicates that the table has been
deleted.");
}
// Delete the keyspace.
success = await keyspacesWrapper.DeleteKeyspace(keyspaceName);
Console.WriteLine("The keyspace has been deleted and the demo is now
complete.");
}
}
namespace KeyspacesActions;
/// <summary>
/// Performs Amazon Keyspaces (for Apache Cassandra) actions.
/// </summary>
public class KeyspacesWrapper
{
private readonly IAmazonKeyspaces _amazonKeyspaces;
/// <summary>
/// Constructor for the KeyspaceWrapper.
/// </summary>
/// <param name="amazonKeyspaces">An Amazon Keyspaces client object.</param>
public KeyspacesWrapper(IAmazonKeyspaces amazonKeyspaces)
{
_amazonKeyspaces = amazonKeyspaces;
}
/// <summary>
/// Create a new keyspace.
/// </summary>
/// <param name="keyspaceName">The name for the new keyspace.</param>
/// <returns>The Amazon Resource Name (ARN) of the new keyspace.</returns>
public async Task<string> CreateKeyspace(string keyspaceName)
{
var response =
await _amazonKeyspaces.CreateKeyspaceAsync(
new CreateKeyspaceRequest { KeyspaceName = keyspaceName });
return response.ResourceArn;
}
/// <summary>
/// Create a new Amazon Keyspaces table.
/// </summary>
/// <param name="keyspaceName">The keyspace where the table will be
created.</param>
/// <param name="schema">The schema for the new table.</param>
/// <param name="tableName">The name of the new table.</param>
/// <returns>The Amazon Resource Name (ARN) of the new table.</returns>
public async Task<string> CreateTable(string keyspaceName, SchemaDefinition
schema, string tableName)
{
var request = new CreateTableRequest
{
KeyspaceName = keyspaceName,
SchemaDefinition = schema,
TableName = tableName,
PointInTimeRecovery = new PointInTimeRecovery { Status =
PointInTimeRecoveryStatus.ENABLED }
};
var response = await _amazonKeyspaces.CreateTableAsync(request);
return response.ResourceArn;
}
/// <summary>
/// Delete an existing keyspace.
/// </summary>
/// <param name="keyspaceName"></param>
/// <returns>A Boolean value indicating the success of the action.</returns>
public async Task<bool> DeleteKeyspace(string keyspaceName)
{
var response = await _amazonKeyspaces.DeleteKeyspaceAsync(
new DeleteKeyspaceRequest { KeyspaceName = keyspaceName });
return response.HttpStatusCode == HttpStatusCode.OK;
}
/// <summary>
/// Delete an Amazon Keyspaces table.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="tableName">The name of the table to delete.</param>
/// <returns>A Boolean value indicating the success of the action.</returns>
public async Task<bool> DeleteTable(string keyspaceName, string tableName)
{
var response = await _amazonKeyspaces.DeleteTableAsync(
new DeleteTableRequest { KeyspaceName = keyspaceName, TableName =
tableName });
return response.HttpStatusCode == HttpStatusCode.OK;
}
/// <summary>
/// Get data about a keyspace.
/// </summary>
/// <param name="keyspaceName">The name of the keyspace.</param>
/// <returns>The Amazon Resource Name (ARN) of the keyspace.</returns>
public async Task<string> GetKeyspace(string keyspaceName)
{
var response = await _amazonKeyspaces.GetKeyspaceAsync(
new GetKeyspaceRequest { KeyspaceName = keyspaceName });
return response.ResourceArn;
}
/// <summary>
/// Get information about an Amazon Keyspaces table.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="tableName">The name of the Amazon Keyspaces table.</param>
/// <returns>The response containing data about the table.</returns>
public async Task<GetTableResponse> GetTable(string keyspaceName, string
tableName)
{
var response = await _amazonKeyspaces.GetTableAsync(
new GetTableRequest { KeyspaceName = keyspaceName, TableName =
tableName });
return response;
}
/// <summary>
/// Lists all keyspaces for the account.
/// </summary>
/// <returns>Async task.</returns>
public async Task ListKeyspaces()
{
var paginator = _amazonKeyspaces.Paginators.ListKeyspaces(new
ListKeyspacesRequest());
Console.WriteLine("{0, -30}\t{1}", "Keyspace name", "Keyspace ARN");
Console.WriteLine(new string('-', Console.WindowWidth));
await foreach (var keyspace in paginator.Keyspaces)
{
Console.WriteLine($"{keyspace.KeyspaceName,-30}\t{keyspace.ResourceArn}");
}
}
/// <summary>
/// Lists the Amazon Keyspaces tables in a keyspace.
/// </summary>
/// <param name="keyspaceName">The name of the keyspace.</param>
/// <returns>A list of TableSummary objects.</returns>
public async Task<List<TableSummary>> ListTables(string keyspaceName)
{
var response = await _amazonKeyspaces.ListTablesAsync(new
ListTablesRequest { KeyspaceName = keyspaceName });
response.Tables.ForEach(table =>
{
Console.WriteLine($"{table.KeyspaceName}\t{table.TableName}\t{table.ResourceArn}");
});
return response.Tables;
}
/// <summary>
/// Restores the specified table to the specified point in time.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="tableName">The name of the table to restore.</param>
/// <param name="timestamp">The time to which the table will be restored.</
param>
/// <returns>The Amazon Resource Name (ARN) of the restored table.</returns>
public async Task<string> RestoreTable(string keyspaceName, string tableName,
string restoredTableName, DateTime timestamp)
{
var request = new RestoreTableRequest
{
RestoreTimestamp = timestamp,
SourceKeyspaceName = keyspaceName,
SourceTableName = tableName,
TargetKeyspaceName = keyspaceName,
TargetTableName = restoredTableName
};
var response = await _amazonKeyspaces.RestoreTableAsync(request);
return response.RestoredTableARN;
}
/// <summary>
/// Updates the movie table to add a boolean column named watched.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="tableName">The name of the table to change.</param>
/// <returns>The Amazon Resource Name (ARN) of the updated table.</returns>
public async Task<string> UpdateTable(string keyspaceName, string tableName)
{
var newColumn = new ColumnDefinition { Name = "watched", Type =
"boolean" };
var request = new UpdateTableRequest
{
KeyspaceName = keyspaceName,
TableName = tableName,
AddColumns = new List<ColumnDefinition> { newColumn }
};
var response = await _amazonKeyspaces.UpdateTableAsync(request);
return response.ResourceArn;
}
}
using System.Net;
using Cassandra;
namespace KeyspacesScenario;
/// <summary>
/// Class to perform CRUD methods on an Amazon Keyspaces (for Apache Cassandra) database.
///
/// NOTE: This sample uses a plain text authenticator for example purposes only.
/// Recommended best practice is to use a SigV4 authentication plugin, if available.
/// </summary>
public class CassandraWrapper
{
private readonly IConfiguration _configuration;
private readonly string _localPathToFile;
private const string _certLocation = "https://certs.secureserver.net/repository/sf-class2-root.crt";
private const string _certFileName = "sf-class2-root.crt";
private readonly X509Certificate2Collection _certCollection;
private X509Certificate2 _amazoncert;
private Cluster _cluster;
// User name and password for the service.
private string _userName = null!;
private string _pwd = null!;
public CassandraWrapper()
{
_configuration = new ConfigurationBuilder()
.SetBasePath(Directory.GetCurrentDirectory())
.AddJsonFile("settings.json") // Load test settings from .json file.
.AddJsonFile("settings.local.json",
true) // Optionally load local settings.
.Build();
_localPathToFile = Path.GetTempPath();
// Get the Starfield digital certificate and save it locally.
var client = new WebClient();
client.DownloadFile(_certLocation, $"{_localPathToFile}/{_certFileName}");
//var httpClient = new HttpClient();
//var httpResult = httpClient.Get(fileUrl);
//using var resultStream = await httpResult.Content.ReadAsStreamAsync();
//using var fileStream = File.Create(pathToSave);
//resultStream.CopyTo(fileStream);
_certCollection = new X509Certificate2Collection();
_amazoncert = new X509Certificate2($"{_localPathToFile}/{_certFileName}");
// Get the user name and password stored in the configuration file.
_userName = _configuration["UserName"]!;
_pwd = _configuration["Password"]!;
// For a list of Service Endpoints for Amazon Keyspaces, see:
// https://docs.aws.amazon.com/keyspaces/latest/devguide/programmatic.endpoints.html
var awsEndpoint = _configuration["ServiceEndpoint"];
_cluster = Cluster.Builder()
.AddContactPoints(awsEndpoint)
.WithPort(9142)
.WithAuthProvider(new PlainTextAuthProvider(_userName, _pwd))
.WithSSL(new SSLOptions().SetCertificateCollection(_certCollection))
.WithQueryOptions(
new QueryOptions()
.SetConsistencyLevel(ConsistencyLevel.LocalQuorum)
.SetSerialConsistencyLevel(ConsistencyLevel.LocalSerial))
.Build();
}
/// <summary>
/// Loads the contents of a JSON file into a list of movies to be
/// added to the Apache Cassandra table.
/// </summary>
/// <param name="movieFileName">The full path to the JSON file.</param>
/// <returns>A list of movie objects.</returns>
public List<Movie> ImportMoviesFromJson(string movieFileName, int numToImport
= 0)
{
if (!File.Exists(movieFileName))
{
return null!;
}
using var sr = new StreamReader(movieFileName);
string json = sr.ReadToEnd();
var allMovies = JsonConvert.DeserializeObject<List<Movie>>(json);
// If numToImport = 0, return all movies in the collection.
if (numToImport == 0)
{
// Now return the entire list of movies.
return allMovies;
}
else
{
// Now return the first numToImport entries.
return allMovies.GetRange(0, numToImport);
}
}
/// <summary>
/// Insert movies into the movie table.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="movieTableName">The Amazon Keyspaces table.</param>
/// <param name="movieFilePath">The path to the resource file containing
/// movie data to insert into the table.</param>
/// <returns>A Boolean value indicating the success of the action.</returns>
public async Task<bool> InsertIntoMovieTable(string keyspaceName, string
movieTableName, string movieFilePath, int numToImport = 20)
{
// Get some movie data from the movies.json file
var movies = ImportMoviesFromJson(movieFilePath, numToImport);
var session = _cluster.Connect(keyspaceName);
string insertCql;
RowSet rs;
// Now we insert the numToImport movies into the table.
foreach (var movie in movies)
{
// Use $$ (dollar-quoted strings) so single quotes in the title and plot don't break the statement.
insertCql = $"INSERT INTO {keyspaceName}.{movieTableName}
(title, year, release_date, plot) values($${movie.Title}$$, {movie.Year},
'{movie.Info.Release_Date.ToString("yyyy-MM-dd")}', $${movie.Info.Plot}$$)";
rs = await session.ExecuteAsync(new SimpleStatement(insertCql));
}
return true;
}
/// <summary>
/// Gets all of the movies in the movies table.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="tableName">The name of the table.</param>
/// <returns>A list of row objects containing movie data.</returns>
public async Task<List<Row>> GetMovies(string keyspaceName, string tableName)
{
var session = _cluster.Connect();
RowSet rs;
try
{
rs = await session.ExecuteAsync(new SimpleStatement($"SELECT * FROM {keyspaceName}.{tableName}"));
// Extract the row data from the returned RowSet.
var rows = rs.GetRows().ToList();
return rows;
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
return null!;
}
}
/// <summary>
/// Mark a movie in the movie table as watched.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="tableName">The name of the table.</param>
/// <param name="title">The title of the movie to mark as watched.</param>
/// <param name="year">The year the movie was released.</param>
/// <returns>A set of rows containing the changed data.</returns>
public async Task<List<Row>> MarkMovieAsWatched(string keyspaceName, string
tableName, string title, int year)
{
var session = _cluster.Connect();
string updateCql = $"UPDATE {keyspaceName}.{tableName} SET watched=true
WHERE title = $${title}$$ AND year = {year};";
var rs = await session.ExecuteAsync(new SimpleStatement(updateCql));
var rows = rs.GetRows().ToList();
return rows;
}
/// <summary>
/// Retrieve the movies in the movies table where watched is true.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="tableName">The name of the table.</param>
/// <returns>A list of row objects containing information about movies
/// where watched is true.</returns>
public async Task<List<Row>> GetWatchedMovies(string keyspaceName, string
tableName)
{
var session = _cluster.Connect();
RowSet rs;
try
{
rs = await session.ExecuteAsync(new SimpleStatement($"SELECT title, year, plot FROM {keyspaceName}.{tableName} WHERE watched = true ALLOW FILTERING"));
// Extract the row data from the returned RowSet.
var rows = rs.GetRows().ToList();
return rows;
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
return null!;
}
}
}
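The note at the top of CassandraWrapper recommends a SigV4 authentication plugin over the plain text authenticator. As a point of comparison, here is a minimal Python sketch of a SigV4-authenticated connection using the cassandra-sigv4 plugin; the endpoint, Region, and certificate path are placeholder values, not part of the example above.

import ssl

import boto3
from cassandra.cluster import Cluster
from cassandra_sigv4.auth import SigV4AuthProvider

# Validate the service certificate chain with the Starfield root certificate.
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations("sf-class2-root.crt")
ssl_context.verify_mode = ssl.CERT_REQUIRED

# SigV4AuthProvider signs requests with the credentials of the boto3 session,
# so no service-specific user name and password are stored in the application.
auth_provider = SigV4AuthProvider(boto3.Session())

# Placeholder endpoint and port; use the service endpoint for your Region.
cluster = Cluster(
    ["cassandra.us-east-1.amazonaws.com"],
    port=9142,
    ssl_context=ssl_context,
    auth_provider=auth_provider,
)
session = cluster.connect()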
For API details, see the following topics in the AWS SDK for .NET API Reference.
CreateKeyspace
CreateTable
DeleteKeyspace
DeleteTable
GetKeyspace
GetTable
ListKeyspaces
ListTables
RestoreTable
UpdateTable
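The cleanup code above polls GetTable until a ResourceNotFoundException confirms that the table is gone. The following is a minimal sketch of the same pattern in Python with Boto3, with an added timeout so a stuck table doesn't poll forever; the names and timeout are placeholder values.

import time

import boto3
from botocore.exceptions import ClientError

client = boto3.client("keyspaces")

def wait_for_table_deletion(keyspace_name, table_name, timeout_seconds=300):
    # Call get_table until the service raises ResourceNotFoundException.
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        try:
            client.get_table(keyspaceName=keyspace_name, tableName=table_name)
        except ClientError as err:
            if err.response["Error"]["Code"] == "ResourceNotFoundException":
                print(f"Table {table_name} is deleted.")
                return
            raise
        time.sleep(5)
    raise TimeoutError(f"Table {table_name} was not deleted in time.")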
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set it up and
run it in the AWS Code Examples Repository.
/**
* Before running this Java (v2) code example, set up your development
* environment, including your credentials.
*
* For more information, see the following documentation topic:
*
* https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-started.html
*
* Before running this Java code example, you must create a
* Java keystore (JKS) file and place it in your project's resources folder.
*
* This file is a secure file format used to hold certificate information for
* Java applications. This is required to make a connection to Amazon Keyspaces.
* For more information, see the following documentation topic:
*
* https://docs.aws.amazon.com/keyspaces/latest/devguide/using_java_driver.html
*
* This Java example performs the following tasks:
*
* 1. Create a keyspace.
* 2. Check for keyspace existence.
* 3. List keyspaces using a paginator.
* 4. Create a table with a simple movie data schema and enable point-in-time
* recovery.
* 5. Check for the table to be in an Active state.
* 6. List all tables in the keyspace.
* 7. Use a Cassandra driver to insert some records into the Movie table.
* 8. Get all records from the Movie table.
* 9. Get a specific Movie.
* 10. Get a UTC timestamp for the current time.
* 11. Update the table schema to add a ‘watched’ Boolean column.
* 12. Update an item as watched.
* 13. Query for items with watched = True.
* 14. Restore the table back to the previous state using the timestamp.
* 15. Check for completion of the restore action.
* 16. Delete both tables.
* 17. Confirm that both tables are deleted.
* 18. Delete the keyspace.
*/
public class ScenarioKeyspaces {
public static final String DASHES = new String(new char[80]).replace("\0",
"-");
/*
* Usage:
* fileName - The name of the JSON file that contains movie data. (Get this file
* from the GitHub repo at resources/sample_file.)
* keyspaceName - The name of the keyspace to create.
*/
public static void main(String[] args) throws InterruptedException,
IOException {
String fileName = "<Replace with the JSON file that contains movie
data>";
String keyspaceName = "<Replace with the name of the keyspace to
create>";
String titleUpdate = "The Family";
int yearUpdate = 2013;
String tableName = "Movie";
String tableNameRestore = "MovieRestore";
Region region = Region.US_EAST_1;
KeyspacesClient keyClient = KeyspacesClient.builder()
.region(region)
.build();
DriverConfigLoader loader =
DriverConfigLoader.fromClasspath("application.conf");
CqlSession session = CqlSession.builder()
.withConfigLoader(loader)
.build();
System.out.println(DASHES);
System.out.println("Welcome to the Amazon Keyspaces example scenario.");
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("1. Create a keyspace.");
createKeySpace(keyClient, keyspaceName);
System.out.println(DASHES);
System.out.println(DASHES);
Thread.sleep(5000);
System.out.println("2. Check for keyspace existence.");
checkKeyspaceExistence(keyClient, keyspaceName);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("3. List keyspaces using a paginator.");
listKeyspacesPaginator(keyClient);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("4. Create a table with a simple movie data schema and
enable point-in-time recovery.");
createTable(keyClient, keyspaceName, tableName);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("5. Check for the table to be in an Active state.");
Thread.sleep(6000);
checkTable(keyClient, keyspaceName, tableName);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("6. List all tables in the keyspace.");
listTables(keyClient, keyspaceName);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("7. Use a Cassandra driver to insert some records into
the Movie table.");
Thread.sleep(6000);
loadData(session, fileName, keyspaceName);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("8. Get all records from the Movie table.");
getMovieData(session, keyspaceName);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("9. Get a specific Movie.");
getSpecificMovie(session, keyspaceName);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("10. Get a UTC timestamp for the current time.");
ZonedDateTime utc = ZonedDateTime.now(ZoneOffset.UTC);
System.out.println("DATETIME = " + Date.from(utc.toInstant()));
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("11. Update the table schema to add a watched Boolean
column.");
updateTable(keyClient, keyspaceName, tableName);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("12. Update an item as watched.");
Thread.sleep(10000); // Wait 10 secs for the update.
updateRecord(session, keyspaceName, titleUpdate, yearUpdate);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("13. Query for items with watched = True.");
getWatchedData(session, keyspaceName);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("14. Restore the table back to the previous state
using the timestamp.");
System.out.println("Note that the restore operation can take up to 20
minutes.");
restoreTable(keyClient, keyspaceName, utc);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("15. Check for completion of the restore action.");
Thread.sleep(5000);
checkRestoredTable(keyClient, keyspaceName, "MovieRestore");
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("16. Delete both tables.");
deleteTable(keyClient, keyspaceName, tableName);
deleteTable(keyClient, keyspaceName, tableNameRestore);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("17. Confirm that both tables are deleted.");
checkTableDelete(keyClient, keyspaceName, tableName);
checkTableDelete(keyClient, keyspaceName, tableNameRestore);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("18. Delete the keyspace.");
deleteKeyspace(keyClient, keyspaceName);
System.out.println(DASHES);
System.out.println(DASHES);
System.out.println("The scenario has completed successfully.");
System.out.println(DASHES);
}
public static void deleteKeyspace(KeyspacesClient keyClient, String
keyspaceName) {
try {
DeleteKeyspaceRequest deleteKeyspaceRequest =
DeleteKeyspaceRequest.builder()
.keyspaceName(keyspaceName)
.build();
keyClient.deleteKeyspace(deleteKeyspaceRequest);
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
public static void checkTableDelete(KeyspacesClient keyClient, String
keyspaceName, String tableName)
throws InterruptedException {
try {
String status;
GetTableResponse response;
GetTableRequest tableRequest = GetTableRequest.builder()
.keyspaceName(keyspaceName)
.tableName(tableName)
.build();
// Keep looping until the table cannot be found and a ResourceNotFoundException is thrown.
while (true) {
response = keyClient.getTable(tableRequest);
status = response.statusAsString();
System.out.println(". The table status is " + status);
Thread.sleep(500);
}
} catch (ResourceNotFoundException e) {
System.err.println(e.awsErrorDetails().errorMessage());
}
System.out.println("The table is deleted");
}
public static void deleteTable(KeyspacesClient keyClient, String
keyspaceName, String tableName) {
try {
DeleteTableRequest tableRequest = DeleteTableRequest.builder()
.keyspaceName(keyspaceName)
.tableName(tableName)
.build();
keyClient.deleteTable(tableRequest);
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
public static void checkRestoredTable(KeyspacesClient keyClient, String
keyspaceName, String tableName)
throws InterruptedException {
try {
boolean tableStatus = false;
String status;
GetTableResponse response = null;
GetTableRequest tableRequest = GetTableRequest.builder()
.keyspaceName(keyspaceName)
.tableName(tableName)
.build();
while (!tableStatus) {
response = keyClient.getTable(tableRequest);
status = response.statusAsString();
System.out.println("The table status is " + status);
if (status.compareTo("ACTIVE") == 0) {
tableStatus = true;
}
Thread.sleep(500);
}
List<ColumnDefinition> cols =
response.schemaDefinition().allColumns();
for (ColumnDefinition def : cols) {
System.out.println("The column name is " + def.name());
System.out.println("The column type is " + def.type());
}
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
public static void restoreTable(KeyspacesClient keyClient, String
keyspaceName, ZonedDateTime utc) {
try {
Instant myTime = utc.toInstant();
RestoreTableRequest restoreTableRequest =
RestoreTableRequest.builder()
.restoreTimestamp(myTime)
.sourceTableName("Movie")
.targetKeyspaceName(keyspaceName)
.targetTableName("MovieRestore")
.sourceKeyspaceName(keyspaceName)
.build();
RestoreTableResponse response =
keyClient.restoreTable(restoreTableRequest);
System.out.println("The ARN of the restored table is " +
response.restoredTableARN());
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
public static void getWatchedData(CqlSession session, String keyspaceName) {
ResultSet resultSet = session
.execute("SELECT * FROM \"" + keyspaceName + "\".\"Movie\" WHERE
watched = true ALLOW FILTERING;");
resultSet.forEach(item -> {
System.out.println("The Movie title is " + item.getString("title"));
System.out.println("The Movie year is " + item.getInt("year"));
System.out.println("The plot is " + item.getString("plot"));
});
}
public static void updateRecord(CqlSession session, String keySpace, String
titleUpdate, int yearUpdate) {
String sqlStatement = "UPDATE \"" + keySpace
+ "\".\"Movie\" SET watched=true WHERE title = :k0 AND year
= :k1;";
BatchStatementBuilder builder =
BatchStatement.builder(DefaultBatchType.UNLOGGED);
builder.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
PreparedStatement preparedStatement = session.prepare(sqlStatement);
builder.addStatement(preparedStatement.boundStatementBuilder()
.setString("k0", titleUpdate)
.setInt("k1", yearUpdate)
.build());
BatchStatement batchStatement = builder.build();
session.execute(batchStatement);
}
public static void updateTable(KeyspacesClient keyClient, String keySpace,
String tableName) {
try {
ColumnDefinition def = ColumnDefinition.builder()
.name("watched")
.type("boolean")
.build();
UpdateTableRequest tableRequest = UpdateTableRequest.builder()
.keyspaceName(keySpace)
.tableName(tableName)
.addColumns(def)
.build();
keyClient.updateTable(tableRequest);
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
public static void getSpecificMovie(CqlSession session, String keyspaceName)
{
ResultSet resultSet = session.execute(
"SELECT * FROM \"" + keyspaceName + "\".\"Movie\" WHERE title =
'The Family' ALLOW FILTERING ;");
resultSet.forEach(item -> {
System.out.println("The Movie title is " + item.getString("title"));
System.out.println("The Movie year is " + item.getInt("year"));
System.out.println("The plot is " + item.getString("plot"));
});
}
// Get records from the Movie table.
public static void getMovieData(CqlSession session, String keyspaceName) {
ResultSet resultSet = session.execute("SELECT * FROM \"" + keyspaceName +
"\".\"Movie\";");
resultSet.forEach(item -> {
System.out.println("The Movie title is " + item.getString("title"));
System.out.println("The Movie year is " + item.getInt("year"));
System.out.println("The plot is " + item.getString("plot"));
});
}
// Load data into the table.
public static void loadData(CqlSession session, String fileName, String
keySpace) throws IOException {
String sqlStatement = "INSERT INTO \"" + keySpace + "\".\"Movie\" (title,
year, plot) values (:k0, :k1, :k2)";
JsonParser parser = new JsonFactory().createParser(new File(fileName));
com.fasterxml.jackson.databind.JsonNode rootNode = new
ObjectMapper().readTree(parser);
Iterator<JsonNode> iter = rootNode.iterator();
ObjectNode currentNode;
int t = 0;
while (iter.hasNext()) {
// Add 20 movies to the table.
if (t == 20)
break;
currentNode = (ObjectNode) iter.next();
int year = currentNode.path("year").asInt();
String title = currentNode.path("title").asText();
String plot = currentNode.path("info").path("plot").toString();
// Insert the data into the Amazon Keyspaces table.
BatchStatementBuilder builder =
BatchStatement.builder(DefaultBatchType.UNLOGGED);
builder.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
PreparedStatement preparedStatement = session.prepare(sqlStatement);
builder.addStatement(preparedStatement.boundStatementBuilder()
.setString("k0", title)
.setInt("k1", year)
.setString("k2", plot)
.build());
BatchStatement batchStatement = builder.build();
session.execute(batchStatement);
t++;
}
System.out.println("You have added " + t + " records successfully!");
}
public static void listTables(KeyspacesClient keyClient, String keyspaceName)
{
try {
ListTablesRequest tablesRequest = ListTablesRequest.builder()
.keyspaceName(keyspaceName)
.build();
ListTablesIterable listRes =
keyClient.listTablesPaginator(tablesRequest);
listRes.stream()
.flatMap(r -> r.tables().stream())
.forEach(content -> System.out.println(" ARN: " +
content.resourceArn() +
" Table name: " + content.tableName()));
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
public static void checkTable(KeyspacesClient keyClient, String keyspaceName,
String tableName)
throws InterruptedException {
try {
boolean tableStatus = false;
String status;
GetTableResponse response = null;
GetTableRequest tableRequest = GetTableRequest.builder()
.keyspaceName(keyspaceName)
.tableName(tableName)
.build();
while (!tableStatus) {
response = keyClient.getTable(tableRequest);
status = response.statusAsString();
System.out.println(". The table status is " + status);
if (status.compareTo("ACTIVE") == 0) {
tableStatus = true;
}
Thread.sleep(500);
}
List<ColumnDefinition> cols =
response.schemaDefinition().allColumns();
for (ColumnDefinition def : cols) {
System.out.println("The column name is " + def.name());
System.out.println("The column type is " + def.type());
}
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
public static void createTable(KeyspacesClient keyClient, String keySpace,
String tableName) {
try {
// Set the columns.
ColumnDefinition defTitle = ColumnDefinition.builder()
.name("title")
.type("text")
.build();
ColumnDefinition defYear = ColumnDefinition.builder()
.name("year")
.type("int")
.build();
ColumnDefinition defReleaseDate = ColumnDefinition.builder()
.name("release_date")
.type("timestamp")
.build();
ColumnDefinition defPlot = ColumnDefinition.builder()
.name("plot")
.type("text")
.build();
List<ColumnDefinition> colList = new ArrayList<>();
colList.add(defTitle);
colList.add(defYear);
colList.add(defReleaseDate);
colList.add(defPlot);
// Set the keys.
PartitionKey yearKey = PartitionKey.builder()
.name("year")
.build();
PartitionKey titleKey = PartitionKey.builder()
.name("title")
.build();
List<PartitionKey> keyList = new ArrayList<>();
keyList.add(yearKey);
keyList.add(titleKey);
SchemaDefinition schemaDefinition = SchemaDefinition.builder()
.partitionKeys(keyList)
.allColumns(colList)
.build();
PointInTimeRecovery timeRecovery = PointInTimeRecovery.builder()
.status(PointInTimeRecoveryStatus.ENABLED)
.build();
CreateTableRequest tableRequest = CreateTableRequest.builder()
.keyspaceName(keySpace)
.tableName(tableName)
.schemaDefinition(schemaDefinition)
.pointInTimeRecovery(timeRecovery)
.build();
CreateTableResponse response = keyClient.createTable(tableRequest);
System.out.println("The table ARN is " + response.resourceArn());
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
public static void listKeyspacesPaginator(KeyspacesClient keyClient) {
try {
ListKeyspacesRequest keyspacesRequest =
ListKeyspacesRequest.builder()
.maxResults(10)
.build();
ListKeyspacesIterable listRes =
keyClient.listKeyspacesPaginator(keyspacesRequest);
listRes.stream()
.flatMap(r -> r.keyspaces().stream())
.forEach(content -> System.out.println(" Name: " +
content.keyspaceName()));
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
public static void checkKeyspaceExistence(KeyspacesClient keyClient, String
keyspaceName) {
try {
GetKeyspaceRequest keyspaceRequest = GetKeyspaceRequest.builder()
.keyspaceName(keyspaceName)
.build();
GetKeyspaceResponse response =
keyClient.getKeyspace(keyspaceRequest);
String name = response.keyspaceName();
System.out.println("The " + name + " KeySpace is ready");
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
public static void createKeySpace(KeyspacesClient keyClient, String
keyspaceName) {
try {
CreateKeyspaceRequest keyspaceRequest =
CreateKeyspaceRequest.builder()
.keyspaceName(keyspaceName)
.build();
CreateKeyspaceResponse response =
keyClient.createKeyspace(keyspaceRequest);
System.out.println("The ARN of the KeySpace is " +
response.resourceArn());
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
}
For API details, see the following topics in the AWS SDK for Java 2.x API Reference.
CreateKeyspace
CreateTable
DeleteKeyspace
DeleteTable
GetKeyspace
GetTable
ListKeyspaces
ListTables
RestoreTable
UpdateTable
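Steps 10 and 14 of the scenario above work together: the code records a UTC timestamp before the schema update so that it can later restore the table to that moment. The following is a minimal Boto3 sketch of that flow; the keyspace and table names are placeholders, and the restore can take up to 20 minutes to reach ACTIVE.

from datetime import datetime, timezone

import boto3

client = boto3.client("keyspaces")

# Step 10: record the current UTC time before the schema update.
pre_update_timestamp = datetime.now(timezone.utc)

# ... update the table schema and data here ...

# Step 14: restore the source table to a new table at the recorded point in time.
response = client.restore_table(
    sourceKeyspaceName="demo_keyspace",
    sourceTableName="Movie",
    targetKeyspaceName="demo_keyspace",
    targetTableName="MovieRestore",
    restoreTimestamp=pre_update_timestamp,
)
print("Restored table ARN:", response["restoredTableARN"])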
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set it up and
run it in the AWS Code Examples Repository.
/**
Before running this Kotlin code example, set up your development environment,
including your credentials.
For more information, see the following documentation topic:
https://docs.aws.amazon.com/sdk-for-kotlin/latest/developer-guide/setup.html
This example uses a secure file format to hold certificate information for
Kotlin applications. This is required to make a connection to Amazon Keyspaces.
For more information, see the following documentation topic:
https://docs.aws.amazon.com/keyspaces/latest/devguide/using_java_driver.html
This Kotlin example performs the following tasks:
1. Create a keyspace.
2. Check for keyspace existence.
3. List keyspaces using a paginator.
4. Create a table with a simple movie data schema and enable point-in-time
recovery.
5. Check for the table to be in an Active state.
6. List all tables in the keyspace.
7. Use a Cassandra driver to insert some records into the Movie table.
8. Get all records from the Movie table.
9. Get a specific Movie.
10. Get a UTC timestamp for the current time.
11. Update the table schema to add a ‘watched’ Boolean column.
12. Update an item as watched.
13. Query for items with watched = True.
14. Restore the table back to the previous state using the timestamp.
15. Check for completion of the restore action.
16. Delete both tables.
17. Confirm that both tables are deleted.
18. Delete the keyspace.
*/
/*
Usage:
fileName - The name of the JSON file that contains movie data. (Get this file
from the GitHub repo at resources/sample_file.)
keyspaceName - The name of the keyspace to create.
*/
val DASHES: String = String(CharArray(80)).replace("\u0000", "-")
suspend fun main() {
val fileName = "<Replace with the JSON file that contains movie data>"
val keyspaceName = "<Replace with the name of the keyspace to create>"
val titleUpdate = "The Family"
val yearUpdate = 2013
val tableName = "MovieKotlin"
val tableNameRestore = "MovieRestore"
val loader = DriverConfigLoader.fromClasspath("application.conf")
val session =
CqlSession
.builder()
.withConfigLoader(loader)
.build()
println(DASHES)
println("Welcome to the Amazon Keyspaces example scenario.")
println(DASHES)
println(DASHES)
println("1. Create a keyspace.")
createKeySpace(keyspaceName)
println(DASHES)
println(DASHES)
delay(5000)
println("2. Check for keyspace existence.")
checkKeyspaceExistence(keyspaceName)
println(DASHES)
println(DASHES)
println("3. List keyspaces using a paginator.")
listKeyspacesPaginator()
println(DASHES)
println(DASHES)
println("4. Create a table with a simple movie data schema and enable point-
in-time recovery.")
createTable(keyspaceName, tableName)
println(DASHES)
println(DASHES)
println("5. Check for the table to be in an Active state.")
delay(6000)
checkTable(keyspaceName, tableName)
println(DASHES)
println(DASHES)
println("6. List all tables in the keyspace.")
listTables(keyspaceName)
println(DASHES)
println(DASHES)
println("7. Use a Cassandra driver to insert some records into the Movie
table.")
delay(6000)
loadData(session, fileName, keyspaceName)
println(DASHES)
println(DASHES)
println("8. Get all records from the Movie table.")
getMovieData(session, keyspaceName)
println(DASHES)
println(DASHES)
println("9. Get a specific Movie.")
getSpecificMovie(session, keyspaceName)
println(DASHES)
println(DASHES)
println("10. Get a UTC timestamp for the current time.")
val utc = ZonedDateTime.now(ZoneOffset.UTC)
println("DATETIME = ${Date.from(utc.toInstant())}")
println(DASHES)
println(DASHES)
println("11. Update the table schema to add a watched Boolean column.")
updateTable(keyspaceName, tableName)
println(DASHES)
println(DASHES)
println("12. Update an item as watched.")
delay(10000) // Wait 10 seconds for the update.
updateRecord(session, keyspaceName, titleUpdate, yearUpdate)
println(DASHES)
println(DASHES)
println("13. Query for items with watched = True.")
getWatchedData(session, keyspaceName)
println(DASHES)
println(DASHES)
println("14. Restore the table back to the previous state using the
timestamp.")
println("Note that the restore operation can take up to 20 minutes.")
restoreTable(keyspaceName, utc)
println(DASHES)
println(DASHES)
println("15. Check for completion of the restore action.")
delay(5000)
checkRestoredTable(keyspaceName, "MovieRestore")
println(DASHES)
println(DASHES)
println("16. Delete both tables.")
deleteTable(keyspaceName, tableName)
deleteTable(keyspaceName, tableNameRestore)
println(DASHES)
println(DASHES)
println("17. Confirm that both tables are deleted.")
checkTableDelete(keyspaceName, tableName)
checkTableDelete(keyspaceName, tableNameRestore)
println(DASHES)
println(DASHES)
println("18. Delete the keyspace.")
deleteKeyspace(keyspaceName)
println(DASHES)
println(DASHES)
println("The scenario has completed successfully.")
println(DASHES)
}
suspend fun deleteKeyspace(keyspaceNameVal: String?) {
val deleteKeyspaceRequest =
DeleteKeyspaceRequest {
keyspaceName = keyspaceNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
keyClient.deleteKeyspace(deleteKeyspaceRequest)
}
}
suspend fun checkTableDelete(
keyspaceNameVal: String?,
tableNameVal: String?,
) {
var status: String
var response: GetTableResponse
val tableRequest =
GetTableRequest {
keyspaceName = keyspaceNameVal
tableName = tableNameVal
}
try {
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
// Keep looping until the table cannot be found and a ResourceNotFoundException is thrown.
while (true) {
response = keyClient.getTable(tableRequest)
status = response.status.toString()
println(". The table status is $status")
delay(500)
}
}
} catch (e: ResourceNotFoundException) {
println(e.message)
}
println("The table is deleted")
}
suspend fun deleteTable(
keyspaceNameVal: String?,
tableNameVal: String?,
) {
val tableRequest =
DeleteTableRequest {
keyspaceName = keyspaceNameVal
tableName = tableNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
keyClient.deleteTable(tableRequest)
}
}
suspend fun checkRestoredTable(
keyspaceNameVal: String?,
tableNameVal: String?,
) {
var tableStatus = false
var status: String
var response: GetTableResponse? = null
val tableRequest =
GetTableRequest {
keyspaceName = keyspaceNameVal
tableName = tableNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
while (!tableStatus) {
response = keyClient.getTable(tableRequest)
status = response!!.status.toString()
println("The table status is $status")
if (status.compareTo("ACTIVE") == 0) {
tableStatus = true
}
delay(500)
}
val cols = response!!.schemaDefinition?.allColumns
if (cols != null) {
for (def in cols) {
println("The column name is ${def.name}")
println("The column type is ${def.type}")
}
}
}
}
suspend fun restoreTable(
keyspaceName: String?,
utc: ZonedDateTime,
) {
// Create an aws.smithy.kotlin.runtime.time.Instant value.
val timeStamp =
aws.smithy.kotlin.runtime.time
.Instant(utc.toInstant())
val restoreTableRequest =
RestoreTableRequest {
restoreTimestamp = timeStamp
sourceTableName = "MovieKotlin"
targetKeyspaceName = keyspaceName
targetTableName = "MovieRestore"
sourceKeyspaceName = keyspaceName
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
val response = keyClient.restoreTable(restoreTableRequest)
println("The ARN of the restored table is ${response.restoredTableArn}")
}
}
fun getWatchedData(
session: CqlSession,
keyspaceName: String,
) {
val resultSet = session.execute("SELECT * FROM \"$keyspaceName\".\"MovieKotlin\" WHERE watched = true ALLOW FILTERING;")
resultSet.forEach { item: Row ->
println("The Movie title is ${item.getString("title")}")
println("The Movie year is ${item.getInt("year")}")
println("The plot is ${item.getString("plot")}")
}
}
fun updateRecord(
session: CqlSession,
keySpace: String,
titleUpdate: String?,
yearUpdate: Int,
) {
val sqlStatement =
"UPDATE \"$keySpace\".\"MovieKotlin\" SET watched=true WHERE title = :k0
AND year = :k1;"
val builder = BatchStatement.builder(DefaultBatchType.UNLOGGED)
builder.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM)
val preparedStatement = session.prepare(sqlStatement)
builder.addStatement(
preparedStatement
.boundStatementBuilder()
.setString("k0", titleUpdate)
.setInt("k1", yearUpdate)
.build(),
)
val batchStatement = builder.build()
session.execute(batchStatement)
}
suspend fun updateTable(
keySpace: String?,
tableNameVal: String?,
) {
val def =
ColumnDefinition {
name = "watched"
type = "boolean"
}
val tableRequest =
UpdateTableRequest {
keyspaceName = keySpace
tableName = tableNameVal
addColumns = listOf(def)
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
keyClient.updateTable(tableRequest)
}
}
fun getSpecificMovie(
session: CqlSession,
keyspaceName: String,
) {
val resultSet =
session.execute("SELECT * FROM \"$keyspaceName\".\"MovieKotlin\" WHERE
title = 'The Family' ALLOW FILTERING ;")
resultSet.forEach { item: Row ->
println("The Movie title is ${item.getString("title")}")
println("The Movie year is ${item.getInt("year")}")
println("The plot is ${item.getString("plot")}")
}
}
// Get records from the Movie table.
fun getMovieData(
session: CqlSession,
keyspaceName: String,
) {
val resultSet = session.execute("SELECT * FROM \"$keyspaceName\".\"MovieKotlin\";")
resultSet.forEach { item: Row ->
println("The Movie title is ${item.getString("title")}")
println("The Movie year is ${item.getInt("year")}")
println("The plot is ${item.getString("plot")}")
}
}
// Load data into the table.
fun loadData(
session: CqlSession,
fileName: String,
keySpace: String,
) {
val sqlStatement =
"INSERT INTO \"$keySpace\".\"MovieKotlin\" (title, year, plot) values
(:k0, :k1, :k2)"
val parser = JsonFactory().createParser(File(fileName))
val rootNode = ObjectMapper().readTree<JsonNode>(parser)
val iter: Iterator<JsonNode> = rootNode.iterator()
var currentNode: ObjectNode
var t = 0
while (iter.hasNext()) {
// Add 50 movies to the table.
if (t == 50) {
break
}
currentNode = iter.next() as ObjectNode
val year = currentNode.path("year").asInt()
val title = currentNode.path("title").asText()
val info = currentNode.path("info").toString()
// Insert the data into the Amazon Keyspaces table.
val builder = BatchStatement.builder(DefaultBatchType.UNLOGGED)
builder.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM)
val preparedStatement: PreparedStatement = session.prepare(sqlStatement)
builder.addStatement(
preparedStatement
.boundStatementBuilder()
.setString("k0", title)
.setInt("k1", year)
.setString("k2", info)
.build(),
)
val batchStatement = builder.build()
session.execute(batchStatement)
t++
}
}
suspend fun listTables(keyspaceNameVal: String?) {
val tablesRequest =
ListTablesRequest {
keyspaceName = keyspaceNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
keyClient
.listTablesPaginated(tablesRequest)
.transform { it.tables?.forEach { obj -> emit(obj) } }
.collect { obj ->
println(" ARN: ${obj.resourceArn} Table name: ${obj.tableName}")
}
}
}
suspend fun checkTable(
keyspaceNameVal: String?,
tableNameVal: String?,
) {
var tableStatus = false
var status: String
var response: GetTableResponse? = null
val tableRequest =
GetTableRequest {
keyspaceName = keyspaceNameVal
tableName = tableNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
while (!tableStatus) {
response = keyClient.getTable(tableRequest)
status = response!!.status.toString()
println(". The table status is $status")
if (status.compareTo("ACTIVE") == 0) {
tableStatus = true
}
delay(500)
}
val cols: List<ColumnDefinition>? =
response!!.schemaDefinition?.allColumns
if (cols != null) {
for (def in cols) {
println("The column name is ${def.name}")
println("The column type is ${def.type}")
}
}
}
}
suspend fun createTable(
keySpaceVal: String?,
tableNameVal: String?,
) {
// Set the columns.
val defTitle =
ColumnDefinition {
name = "title"
type = "text"
}
val defYear =
ColumnDefinition {
name = "year"
type = "int"
}
val defReleaseDate =
ColumnDefinition {
name = "release_date"
type = "timestamp"
}
val defPlot =
ColumnDefinition {
name = "plot"
type = "text"
}
val colList = ArrayList<ColumnDefinition>()
colList.add(defTitle)
colList.add(defYear)
colList.add(defReleaseDate)
colList.add(defPlot)
// Set the keys.
val yearKey =
PartitionKey {
name = "year"
}
val titleKey =
PartitionKey {
name = "title"
}
val keyList = ArrayList<PartitionKey>()
keyList.add(yearKey)
keyList.add(titleKey)
val schemaDefinitionOb =
SchemaDefinition {
partitionKeys = keyList
allColumns = colList
}
val timeRecovery =
PointInTimeRecovery {
status = PointInTimeRecoveryStatus.Enabled
}
val tableRequest =
CreateTableRequest {
keyspaceName = keySpaceVal
tableName = tableNameVal
schemaDefinition = schemaDefinitionOb
pointInTimeRecovery = timeRecovery
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
val response = keyClient.createTable(tableRequest)
println("The table ARN is ${response.resourceArn}")
}
}
suspend fun listKeyspacesPaginator() {
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
keyClient
.listKeyspacesPaginated(ListKeyspacesRequest {})
.transform { it.keyspaces?.forEach { obj -> emit(obj) } }
.collect { obj ->
println("Name: ${obj.keyspaceName}")
}
}
}
suspend fun checkKeyspaceExistence(keyspaceNameVal: String?) {
val keyspaceRequest =
GetKeyspaceRequest {
keyspaceName = keyspaceNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
val response: GetKeyspaceResponse =
keyClient.getKeyspace(keyspaceRequest)
val name = response.keyspaceName
println("The $name KeySpace is ready")
}
}
suspend fun createKeySpace(keyspaceNameVal: String) {
val keyspaceRequest =
CreateKeyspaceRequest {
keyspaceName = keyspaceNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
val response = keyClient.createKeyspace(keyspaceRequest)
println("The ARN of the KeySpace is ${response.resourceArn}")
}
}
For API details, see the following topics in the AWS SDK for Kotlin API Reference.
CreateKeyspace
CreateTable
DeleteKeyspace
DeleteTable
GetKeyspace
GetTable
ListKeyspaces
ListTables
RestoreTable
UpdateTable
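The loadData and updateRecord functions in the Java and Kotlin examples above follow the same pattern: prepare the statement once, bind values per record, and execute in an UNLOGGED batch at LOCAL_QUORUM. The following is a minimal sketch of the same pattern with the DataStax Python driver, assuming an already-connected session and the table name used in the Java example.

from cassandra import ConsistencyLevel
from cassandra.query import BatchStatement, BatchType

def insert_movies(session, keyspace, movies):
    # Prepare once; bind and execute per record.
    prepared = session.prepare(
        f'INSERT INTO "{keyspace}"."Movie" (title, year, plot) VALUES (?, ?, ?)'
    )
    for title, year, plot in movies:
        batch = BatchStatement(
            batch_type=BatchType.UNLOGGED,
            consistency_level=ConsistencyLevel.LOCAL_QUORUM,
        )
        batch.add(prepared, (title, year, plot))
        session.execute(batch)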
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set it up and
run it in the AWS Code Examples Repository.
Run an interactive scenario at a command prompt.
class KeyspaceScenario:
"""Runs an interactive scenario that shows how to get started using Amazon
Keyspaces."""
def __init__(self, ks_wrapper):
"""
:param ks_wrapper: An object that wraps Amazon Keyspace actions.
"""
self.ks_wrapper = ks_wrapper
@demo_func
def create_keyspace(self):
"""
1. Creates a keyspace.
2. Lists up to 10 keyspaces in your account.
"""
print("Let's create a keyspace.")
ks_name = q.ask(
"Enter a name for your new keyspace.\nThe name can contain only
letters, "
"numbers and underscores: ",
q.non_empty,
)
if self.ks_wrapper.exists_keyspace(ks_name):
print(f"A keyspace named {ks_name} exists.")
else:
ks_arn = self.ks_wrapper.create_keyspace(ks_name)
ks_exists = False
while not ks_exists:
wait(3)
ks_exists = self.ks_wrapper.exists_keyspace(ks_name)
print(f"Created a new keyspace.\n\t{ks_arn}.")
print("The first 10 keyspaces in your account are:\n")
self.ks_wrapper.list_keyspaces(10)
@demo_func
def create_table(self):
"""
1. Creates a table in the keyspace. The table is configured with a schema
to hold movie data and has point-in-time recovery enabled.
2. Waits for the table to be in an active state.
3. Displays schema information for the table.
4. Lists tables in the keyspace.
"""
print("Let's create a table for movies in your keyspace.")
table_name = q.ask("Enter a name for your table: ", q.non_empty)
table = self.ks_wrapper.get_table(table_name)
if table is not None:
print(
f"A table named {table_name} already exists in keyspace "
f"{self.ks_wrapper.ks_name}."
)
else:
table_arn = self.ks_wrapper.create_table(table_name)
print(f"Created table {table_name}:\n\t{table_arn}")
table = {"status": None}
print("Waiting for your table to be ready...")
while table["status"] != "ACTIVE":
wait(5)
table = self.ks_wrapper.get_table(table_name)
print(f"Your table is {table['status']}. Its schema is:")
pp(table["schemaDefinition"])
print("\nThe tables in your keyspace are:\n")
self.ks_wrapper.list_tables()
@demo_func
def ensure_tls_cert(self):
"""
Ensures you have a TLS certificate available to use to secure the connection
to the keyspace. This function downloads a default certificate or lets you
specify your own.
"""
print("To connect to your keyspace, you must have a TLS certificate.")
print("Checking for TLS certificate...")
cert_path = os.path.join(
os.path.dirname(__file__), QueryManager.DEFAULT_CERT_FILE
)
if not os.path.exists(cert_path):
cert_choice = q.ask(
f"Press enter to download a certificate from
{QueryManager.CERT_URL} "
f"or enter the full path to the certificate you want to use: "
)
if cert_choice:
cert_path = cert_choice
else:
cert = requests.get(QueryManager.CERT_URL).text
with open(cert_path, "w") as cert_file:
cert_file.write(cert)
else:
q.ask(f"Certificate {cert_path} found. Press Enter to continue.")
print(
f"Certificate {cert_path} will be used to secure the connection to
your keyspace."
)
return cert_path
@demo_func
def query_table(self, qm, movie_file):
"""
1. Adds movies to the table from a sample movie data file.
2. Gets a list of movies from the table and lets you select one.
3. Displays more information about the selected movie.
"""
qm.add_movies(self.ks_wrapper.table_name, movie_file)
movies = qm.get_movies(self.ks_wrapper.table_name)
print(f"Added {len(movies)} movies to the table:")
sel = q.choose("Pick one to learn more about it: ", [m.title for m in
movies])
movie_choice = qm.get_movie(
self.ks_wrapper.table_name, movies[sel].title, movies[sel].year
)
print(movie_choice.title)
print(f"\tReleased: {movie_choice.release_date}")
print(f"\tPlot: {movie_choice.plot}")
@demo_func
def update_and_restore_table(self, qm):
"""
1. Updates the table by adding a column to track watched movies.
2. Marks some of the movies as watched.
3. Gets the list of watched movies from the table.
4. Restores to a movies_restored table at a previous point in time.
5. Gets the list of movies from the restored table.
"""
print("Let's add a column to record which movies you've watched.")
pre_update_timestamp = datetime.utcnow()
print(
f"Recorded the current UTC time of {pre_update_timestamp} so we can
restore the table later."
)
self.ks_wrapper.update_table()
print("Waiting for your table to update...")
table = {"status": "UPDATING"}
while table["status"] != "ACTIVE":
wait(5)
table = self.ks_wrapper.get_table(self.ks_wrapper.table_name)
print("Column 'watched' added to table.")
q.ask(
"Let's mark some of the movies as watched. Press Enter when you're
ready.\n"
)
movies = qm.get_movies(self.ks_wrapper.table_name)
for movie in movies[:10]:
qm.watched_movie(self.ks_wrapper.table_name, movie.title, movie.year)
print(f"Marked {movie.title} as watched.")
movies = qm.get_movies(self.ks_wrapper.table_name, watched=True)
print("-" * 88)
print("The watched movies in our table are:\n")
for movie in movies:
print(movie.title)
print("-" * 88)
if q.ask(
"Do you want to restore the table to the way it was before all of
these\n"
"updates? Keep in mind, this can take up to 20 minutes. (y/n) ",
q.is_yesno,
):
starting_table_name = self.ks_wrapper.table_name
table_name_restored = self.ks_wrapper.restore_table(pre_update_timestamp)
table = {"status": "RESTORING"}
while table["status"] != "ACTIVE":
wait(10)
table = self.ks_wrapper.get_table(table_name_restored)
print(
f"Restored {starting_table_name} to {table_name_restored} "
f"at a point in time of {pre_update_timestamp}."
)
movies = qm.get_movies(table_name_restored)
print("Now the movies in our table are:")
for movie in movies:
print(movie.title)
def cleanup(self, cert_path):
"""
1. Deletes the table and waits for it to be removed.
2. Deletes the keyspace.
:param cert_path: The path of the TLS certificate used in the demo. If the
                  certificate was downloaded during the demo, it is removed.
"""
if q.ask(
f"Do you want to delete your {self.ks_wrapper.table_name} table and "
f"{self.ks_wrapper.ks_name} keyspace? (y/n) ",
q.is_yesno,
):
table_name = self.ks_wrapper.table_name
self.ks_wrapper.delete_table()
table = self.ks_wrapper.get_table(table_name)
print("Waiting for the table to be deleted.")
while table is not None:
wait(5)
table = self.ks_wrapper.get_table(table_name)
print("Table deleted.")
self.ks_wrapper.delete_keyspace()
print(
"Keyspace deleted. If you chose to restore your table during the
"
"demo, the original table is also deleted."
)
if cert_path == os.path.join(
os.path.dirname(__file__), QueryManager.DEFAULT_CERT_FILE
) and os.path.exists(cert_path):
os.remove(cert_path)
print("Removed certificate that was downloaded for this demo.")
def run_scenario(self):
logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
print("-" * 88)
print("Welcome to the Amazon Keyspaces (for Apache Cassandra) demo.")
print("-" * 88)
self.create_keyspace()
self.create_table()
cert_file_path = self.ensure_tls_cert()
# Use a context manager to ensure the connection to the keyspace is closed.
with QueryManager(
cert_file_path, boto3.DEFAULT_SESSION, self.ks_wrapper.ks_name
) as qm:
self.query_table(qm, "../../../resources/sample_files/movies.json")
self.update_and_restore_table(qm)
self.cleanup(cert_file_path)
print("\nThanks for watching!")
print("-" * 88)
if __name__ == "__main__":
try:
scenario = KeyspaceScenario(KeyspaceWrapper.from_client())
scenario.run_scenario()
except Exception:
logging.exception("Something went wrong with the demo.")
Define a class that wraps keyspace and table actions.
class KeyspaceWrapper:
"""Encapsulates Amazon Keyspaces (for Apache Cassandra) keyspace and table
actions."""
def __init__(self, keyspaces_client):
"""
:param keyspaces_client: A Boto3 Amazon Keyspaces client.
"""
self.keyspaces_client = keyspaces_client
self.ks_name = None
self.ks_arn = None
self.table_name = None
@classmethod
def from_client(cls):
keyspaces_client = boto3.client("keyspaces")
return cls(keyspaces_client)
def create_keyspace(self, name):
"""
Creates a keyspace.
:param name: The name to give the keyspace.
:return: The Amazon Resource Name (ARN) of the new keyspace.
"""
try:
response = self.keyspaces_client.create_keyspace(keyspaceName=name)
self.ks_name = name
self.ks_arn = response["resourceArn"]
except ClientError as err:
logger.error(
"Couldn't create %s. Here's why: %s: %s",
name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
else:
return self.ks_arn
def exists_keyspace(self, name):
"""
Checks whether a keyspace exists.
:param name: The name of the keyspace to look up.
:return: True when the keyspace exists. Otherwise, False.
"""
try:
response = self.keyspaces_client.get_keyspace(keyspaceName=name)
self.ks_name = response["keyspaceName"]
self.ks_arn = response["resourceArn"]
exists = True
except ClientError as err:
if err.response["Error"]["Code"] == "ResourceNotFoundException":
logger.info("Keyspace %s does not exist.", name)
exists = False
else:
logger.error(
"Couldn't verify %s exists. Here's why: %s: %s",
name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
return exists
def list_keyspaces(self, limit):
"""
Lists the keyspaces in your account.
:param limit: The maximum number of keyspaces to list.
"""
try:
ks_paginator = self.keyspaces_client.get_paginator("list_keyspaces")
for page in ks_paginator.paginate(PaginationConfig={"MaxItems":
limit}):
for ks in page["keyspaces"]:
print(ks["keyspaceName"])
print(f"\t{ks['resourceArn']}")
except ClientError as err:
logger.error(
"Couldn't list keyspaces. Here's why: %s: %s",
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
def create_table(self, table_name):
"""
Creates a table in the keyspace.
The table is created with a schema for storing movie data
and has point-in-time recovery enabled.
:param table_name: The name to give the table.
:return: The ARN of the new table.
"""
try:
response = self.keyspaces_client.create_table(
keyspaceName=self.ks_name,
tableName=table_name,
schemaDefinition={
"allColumns": [
{"name": "title", "type": "text"},
{"name": "year", "type": "int"},
{"name": "release_date", "type": "timestamp"},
{"name": "plot", "type": "text"},
],
"partitionKeys": [{"name": "year"}, {"name": "title"}],
},
pointInTimeRecovery={"status": "ENABLED"},
)
except ClientError as err:
logger.error(
"Couldn't create table %s. Here's why: %s: %s",
table_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
else:
return response["resourceArn"]
def get_table(self, table_name):
"""
Gets data about a table in the keyspace.
:param table_name: The name of the table to look up.
:return: Data about the table.
"""
try:
response = self.keyspaces_client.get_table(
keyspaceName=self.ks_name, tableName=table_name
)
self.table_name = table_name
except ClientError as err:
if err.response["Error"]["Code"] == "ResourceNotFoundException":
logger.info("Table %s does not exist.", table_name)
self.table_name = None
response = None
else:
logger.error(
"Couldn't verify %s exists. Here's why: %s: %s",
table_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
return response
def list_tables(self):
"""
Lists the tables in the keyspace.
"""
try:
table_paginator = self.keyspaces_client.get_paginator("list_tables")
for page in table_paginator.paginate(keyspaceName=self.ks_name):
for table in page["tables"]:
print(table["tableName"])
print(f"\t{table['resourceArn']}")
except ClientError as err:
logger.error(
"Couldn't list tables in keyspace %s. Here's why: %s: %s",
self.ks_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
def update_table(self):
"""
Updates the schema of the table.
This example updates a table of movie data by adding a new column
that tracks whether the movie has been watched.
"""
try:
self.keyspaces_client.update_table(
keyspaceName=self.ks_name,
tableName=self.table_name,
addColumns=[{"name": "watched", "type": "boolean"}],
)
except ClientError as err:
logger.error(
"Couldn't update table %s. Here's why: %s: %s",
self.table_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
def restore_table(self, restore_timestamp):
"""
Restores the table to a previous point in time. The table is restored
to a new table in the same keyspace.
:param restore_timestamp: The point in time to restore the table. This time
must be in UTC format.
:return: The name of the restored table.
"""
try:
restored_table_name = f"{self.table_name}_restored"
self.keyspaces_client.restore_table(
sourceKeyspaceName=self.ks_name,
sourceTableName=self.table_name,
targetKeyspaceName=self.ks_name,
targetTableName=restored_table_name,
restoreTimestamp=restore_timestamp,
)
except ClientError as err:
logger.error(
"Couldn't restore table %s. Here's why: %s: %s",
restore_timestamp,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
else:
return restored_table_name
def delete_table(self):
"""
Deletes the table from the keyspace.
"""
try:
self.keyspaces_client.delete_table(
keyspaceName=self.ks_name, tableName=self.table_name
)
self.table_name = None
except ClientError as err:
logger.error(
"Couldn't delete table %s. Here's why: %s: %s",
self.table_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
def delete_keyspace(self):
"""
Deletes the keyspace.
"""
try:
self.keyspaces_client.delete_keyspace(keyspaceName=self.ks_name)
self.ks_name = None
except ClientError as err:
logger.error(
"Couldn't delete keyspace %s. Here's why: %s: %s",
self.ks_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
Define a class that creates a TLS connection to a keyspace, authenticates with SigV4, and
sends CQL queries to a table in the keyspace.
class QueryManager:
"""
Manages queries to an Amazon Keyspaces (for Apache Cassandra) keyspace.
Queries are secured by TLS and authenticated by using the Signature V4 (SigV4)
AWS signing protocol. This is more secure than sending a username and password
with a plain-text authentication provider.
This example downloads a default certificate to secure TLS, or lets you specify
your own.
This example uses a table of movie data to demonstrate basic queries.
"""
DEFAULT_CERT_FILE = "sf-class2-root.crt"
CERT_URL = f"https://certs.secureserver.net/repository/sf-class2-root.crt"
def __init__(self, cert_file_path, boto_session, keyspace_name):
"""
:param cert_file_path: The path and file name of the certificate used for TLS.
:param boto_session: A Boto3 session. This is used to acquire your AWS credentials.
:param keyspace_name: The name of the keyspace to connect to.
"""
self.cert_file_path = cert_file_path
self.boto_session = boto_session
self.ks_name = keyspace_name
self.cluster = None
self.session = None
def __enter__(self):
"""
Creates a session connection to the keyspace that is secured by TLS and
authenticated by SigV4.
"""
ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations(self.cert_file_path)
ssl_context.verify_mode = CERT_REQUIRED
auth_provider = SigV4AuthProvider(self.boto_session)
contact_point = f"cassandra.
{self.boto_session.region_name}.amazonaws.com"
exec_profile = ExecutionProfile(
consistency_level=ConsistencyLevel.LOCAL_QUORUM,
load_balancing_policy=DCAwareRoundRobinPolicy(),
)
self.cluster = Cluster(
[contact_point],
ssl_context=ssl_context,
auth_provider=auth_provider,
port=9142,
execution_profiles={EXEC_PROFILE_DEFAULT: exec_profile},
protocol_version=4,
)
self.cluster.__enter__()
self.session = self.cluster.connect(self.ks_name)
return self
def __exit__(self, *args):
"""
Exits the cluster. This shuts down all existing session connections.
"""
self.cluster.__exit__(*args)
def add_movies(self, table_name, movie_file_path):
"""
Gets movies from a JSON file and adds them to a table in the keyspace.
:param table_name: The name of the table.
:param movie_file_path: The path and file name of a JSON file that
contains movie data.
"""
with open(movie_file_path, "r") as movie_file:
movies = json.loads(movie_file.read())
stmt = self.session.prepare(
f"INSERT INTO {table_name} (year, title, release_date, plot) VALUES
(?, ?, ?, ?);"
)
for movie in movies[:20]:
self.session.execute(
stmt,
parameters=[
movie["year"],
movie["title"],
date.fromisoformat(movie["info"]
["release_date"].partition("T")[0]),
movie["info"]["plot"],
],
)
def get_movies(self, table_name, watched=None):
"""
Gets the title and year of the full list of movies from the table.
:param table_name: The name of the movie table.
:param watched: When specified, the returned list of movies is filtered to
either movies that have been watched or movies that have not been watched.
Otherwise, all movies are returned.
:return: A list of movies in the table.
"""
if watched is None:
stmt = SimpleStatement(f"SELECT title, year from {table_name}")
params = None
else:
stmt = SimpleStatement(
f"SELECT title, year from {table_name} WHERE watched = %s ALLOW
FILTERING"
)
params = [watched]
return self.session.execute(stmt, parameters=params).all()
def get_movie(self, table_name, title, year):
"""
Gets a single movie from the table, by title and year.
:param table_name: The name of the movie table.
:param title: The title of the movie.
:param year: The year of the movie's release.
:return: The requested movie.
"""
return self.session.execute(
SimpleStatement(
f"SELECT * from {table_name} WHERE title = %s AND year = %s"
),
parameters=[title, year],
).one()
def watched_movie(self, table_name, title, year):
"""
Updates a movie as having been watched.
:param table_name: The name of the movie table.
:param title: The title of the movie.
:param year: The year of the movie's release.
"""
self.session.execute(
SimpleStatement(
f"UPDATE {table_name} SET watched=true WHERE title = %s AND year
= %s"
),
parameters=[title, year],
)
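The class is designed to be used as a context manager so that the cluster connection is always
shut down. The following is a minimal usage sketch; the keyspace, table, and file names are
placeholders, not values defined by this guide.
import boto3
# Hypothetical names; substitute your own keyspace, table, and data file.
with QueryManager("sf-class2-root.crt", boto3.Session(), "store_example") as qm:
    qm.add_movies("movies", "movies.json")
    for movie in qm.get_movies("movies"):
        print(movie.title, movie.year)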
For API details, see the following topics in AWS SDK for Python (Boto3) API Reference.
CreateKeyspace
CreateTable
DeleteKeyspace
DeleteTable
GetKeyspace
GetTable
ListKeyspaces
ListTables
RestoreTable
UpdateTable
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Actions for Amazon Keyspaces using AWS SDKs
The following code examples demonstrate how to perform individual Amazon Keyspaces actions
with AWS SDKs. Each example includes a link to GitHub, where you can find instructions for setting
up and running the code.
The following examples include only the most commonly used actions. For a complete list, see the
Amazon Keyspaces (for Apache Cassandra) API Reference.
Examples
Use CreateKeyspace with an AWS SDK or CLI
Use CreateTable with an AWS SDK or CLI
Use DeleteKeyspace with an AWS SDK or CLI
Use DeleteTable with an AWS SDK or CLI
Use GetKeyspace with an AWS SDK or CLI
Use GetTable with an AWS SDK or CLI
Use ListKeyspaces with an AWS SDK or CLI
Use ListTables with an AWS SDK or CLI
Use RestoreTable with an AWS SDK or CLI
Use UpdateTable with an AWS SDK or CLI
Use CreateKeyspace with an AWS SDK or CLI
The following code examples show how to use CreateKeyspace.
Action examples are code excerpts from larger programs and must be run in context. You can see
this action in context in the following code example:
Learn the basics
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
/// <summary>
/// Create a new keyspace.
/// </summary>
/// <param name="keyspaceName">The name for the new keyspace.</param>
/// <returns>The Amazon Resource Name (ARN) of the new keyspace.</returns>
public async Task<string> CreateKeyspace(string keyspaceName)
{
var response =
await _amazonKeyspaces.CreateKeyspaceAsync(
new CreateKeyspaceRequest { KeyspaceName = keyspaceName });
return response.ResourceArn;
}
For API details, see CreateKeyspace in AWS SDK for .NET API Reference.
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
public static void createKeySpace(KeyspacesClient keyClient, String
keyspaceName) {
try {
CreateKeyspaceRequest keyspaceRequest =
CreateKeyspaceRequest.builder()
.keyspaceName(keyspaceName)
.build();
CreateKeyspaceResponse response =
keyClient.createKeyspace(keyspaceRequest);
System.out.println("The ARN of the KeySpace is " +
response.resourceArn());
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
For API details, see CreateKeyspace in AWS SDK for Java 2.x API Reference.
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
suspend fun createKeySpace(keyspaceNameVal: String) {
val keyspaceRequest =
CreateKeyspaceRequest {
keyspaceName = keyspaceNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
val response = keyClient.createKeyspace(keyspaceRequest)
println("The ARN of the KeySpace is ${response.resourceArn}")
}
}
For API details, see CreateKeyspace in AWS SDK for Kotlin API reference.
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
class KeyspaceWrapper:
"""Encapsulates Amazon Keyspaces (for Apache Cassandra) keyspace and table
actions."""
def __init__(self, keyspaces_client):
"""
:param keyspaces_client: A Boto3 Amazon Keyspaces client.
"""
self.keyspaces_client = keyspaces_client
self.ks_name = None
self.ks_arn = None
self.table_name = None
@classmethod
def from_client(cls):
keyspaces_client = boto3.client("keyspaces")
return cls(keyspaces_client)
def create_keyspace(self, name):
"""
Creates a keyspace.
:param name: The name to give the keyspace.
:return: The Amazon Resource Name (ARN) of the new keyspace.
"""
try:
response = self.keyspaces_client.create_keyspace(keyspaceName=name)
self.ks_name = name
self.ks_arn = response["resourceArn"]
except ClientError as err:
logger.error(
"Couldn't create %s. Here's why: %s: %s",
name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
else:
return self.ks_arn
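The following is a minimal usage sketch for the wrapper; the keyspace name is a placeholder.
Because keyspace creation is asynchronous, the new keyspace might not be immediately available.
# Hypothetical usage; create_keyspace returns the new keyspace's ARN.
wrapper = KeyspaceWrapper.from_client()
keyspace_arn = wrapper.create_keyspace("store_example")
print(keyspace_arn)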
For API details, see CreateKeyspace in AWS SDK for Python (Boto3) API Reference.
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Use CreateTable with an AWS SDK or CLI
The following code examples show how to use CreateTable.
Action examples are code excerpts from larger programs and must be run in context. You can see
this action in context in the following code example:
Learn the basics
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
/// <summary>
/// Create a new Amazon Keyspaces table.
/// </summary>
/// <param name="keyspaceName">The keyspace where the table will be
created.</param>
/// <param name="schema">The schema for the new table.</param>
/// <param name="tableName">The name of the new table.</param>
/// <returns>The Amazon Resource Name (ARN) of the new table.</returns>
public async Task<string> CreateTable(string keyspaceName, SchemaDefinition
schema, string tableName)
{
var request = new CreateTableRequest
{
KeyspaceName = keyspaceName,
SchemaDefinition = schema,
TableName = tableName,
PointInTimeRecovery = new PointInTimeRecovery { Status =
PointInTimeRecoveryStatus.ENABLED }
};
var response = await _amazonKeyspaces.CreateTableAsync(request);
return response.ResourceArn;
}
For API details, see CreateTable in AWS SDK for .NET API Reference.
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
public static void createTable(KeyspacesClient keyClient, String keySpace,
String tableName) {
try {
// Set the columns.
ColumnDefinition defTitle = ColumnDefinition.builder()
.name("title")
.type("text")
.build();
ColumnDefinition defYear = ColumnDefinition.builder()
.name("year")
.type("int")
.build();
ColumnDefinition defReleaseDate = ColumnDefinition.builder()
.name("release_date")
.type("timestamp")
.build();
ColumnDefinition defPlot = ColumnDefinition.builder()
.name("plot")
.type("text")
.build();
List<ColumnDefinition> colList = new ArrayList<>();
colList.add(defTitle);
colList.add(defYear);
colList.add(defReleaseDate);
colList.add(defPlot);
// Set the keys.
PartitionKey yearKey = PartitionKey.builder()
.name("year")
.build();
PartitionKey titleKey = PartitionKey.builder()
.name("title")
.build();
List<PartitionKey> keyList = new ArrayList<>();
keyList.add(yearKey);
keyList.add(titleKey);
SchemaDefinition schemaDefinition = SchemaDefinition.builder()
.partitionKeys(keyList)
.allColumns(colList)
.build();
PointInTimeRecovery timeRecovery = PointInTimeRecovery.builder()
.status(PointInTimeRecoveryStatus.ENABLED)
.build();
CreateTableRequest tableRequest = CreateTableRequest.builder()
.keyspaceName(keySpace)
.tableName(tableName)
.schemaDefinition(schemaDefinition)
.pointInTimeRecovery(timeRecovery)
.build();
CreateTableResponse response = keyClient.createTable(tableRequest);
System.out.println("The table ARN is " + response.resourceArn());
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
For API details, see CreateTable in AWS SDK for Java 2.x API Reference.
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
suspend fun createTable(
keySpaceVal: String?,
tableNameVal: String?,
) {
// Set the columns.
val defTitle =
ColumnDefinition {
name = "title"
type = "text"
}
val defYear =
ColumnDefinition {
name = "year"
type = "int"
}
val defReleaseDate =
ColumnDefinition {
name = "release_date"
type = "timestamp"
}
val defPlot =
ColumnDefinition {
name = "plot"
type = "text"
}
val colList = ArrayList<ColumnDefinition>()
colList.add(defTitle)
colList.add(defYear)
colList.add(defReleaseDate)
colList.add(defPlot)
// Set the keys.
val yearKey =
PartitionKey {
name = "year"
}
val titleKey =
PartitionKey {
name = "title"
}
val keyList = ArrayList<PartitionKey>()
keyList.add(yearKey)
keyList.add(titleKey)
val schemaDefinitionOb =
SchemaDefinition {
partitionKeys = keyList
allColumns = colList
}
val timeRecovery =
PointInTimeRecovery {
status = PointInTimeRecoveryStatus.Enabled
}
val tableRequest =
CreateTableRequest {
keyspaceName = keySpaceVal
tableName = tableNameVal
schemaDefinition = schemaDefinitionOb
pointInTimeRecovery = timeRecovery
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
val response = keyClient.createTable(tableRequest)
println("The table ARN is ${response.resourceArn}")
}
}
For API details, see CreateTable in AWS SDK for Kotlin API reference.
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
class KeyspaceWrapper:
"""Encapsulates Amazon Keyspaces (for Apache Cassandra) keyspace and table
actions."""
def __init__(self, keyspaces_client):
"""
:param keyspaces_client: A Boto3 Amazon Keyspaces client.
"""
self.keyspaces_client = keyspaces_client
self.ks_name = None
self.ks_arn = None
self.table_name = None
@classmethod
def from_client(cls):
keyspaces_client = boto3.client("keyspaces")
return cls(keyspaces_client)
def create_table(self, table_name):
"""
Creates a table in the keyspace.
The table is created with a schema for storing movie data
and has point-in-time recovery enabled.
:param table_name: The name to give the table.
:return: The ARN of the new table.
"""
try:
response = self.keyspaces_client.create_table(
keyspaceName=self.ks_name,
tableName=table_name,
schemaDefinition={
"allColumns": [
{"name": "title", "type": "text"},
{"name": "year", "type": "int"},
{"name": "release_date", "type": "timestamp"},
{"name": "plot", "type": "text"},
],
"partitionKeys": [{"name": "year"}, {"name": "title"}],
},
pointInTimeRecovery={"status": "ENABLED"},
)
except ClientError as err:
logger.error(
"Couldn't create table %s. Here's why: %s: %s",
table_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
else:
return response["resourceArn"]
For API details, see CreateTable in AWS SDK for Python (Boto3) API Reference.
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Use DeleteKeyspace with an AWS SDK or CLI
The following code examples show how to use DeleteKeyspace.
Action examples are code excerpts from larger programs and must be run in context. You can see
this action in context in the following code example:
Learn the basics
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
/// <summary>
/// Delete an existing keyspace.
/// </summary>
/// <param name="keyspaceName"></param>
/// <returns>A Boolean value indicating the success of the action.</returns>
public async Task<bool> DeleteKeyspace(string keyspaceName)
{
var response = await _amazonKeyspaces.DeleteKeyspaceAsync(
new DeleteKeyspaceRequest { KeyspaceName = keyspaceName });
return response.HttpStatusCode == HttpStatusCode.OK;
}
For API details, see DeleteKeyspace in AWS SDK for .NET API Reference.
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
public static void deleteKeyspace(KeyspacesClient keyClient, String
keyspaceName) {
try {
DeleteKeyspaceRequest deleteKeyspaceRequest =
DeleteKeyspaceRequest.builder()
.keyspaceName(keyspaceName)
.build();
keyClient.deleteKeyspace(deleteKeyspaceRequest);
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
For API details, see DeleteKeyspace in AWS SDK for Java 2.x API Reference.
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
suspend fun deleteKeyspace(keyspaceNameVal: String?) {
val deleteKeyspaceRequest =
DeleteKeyspaceRequest {
keyspaceName = keyspaceNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
keyClient.deleteKeyspace(deleteKeyspaceRequest)
}
}
For API details, see DeleteKeyspace in AWS SDK for Kotlin API reference.
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
class KeyspaceWrapper:
"""Encapsulates Amazon Keyspaces (for Apache Cassandra) keyspace and table
actions."""
def __init__(self, keyspaces_client):
"""
:param keyspaces_client: A Boto3 Amazon Keyspaces client.
"""
self.keyspaces_client = keyspaces_client
self.ks_name = None
self.ks_arn = None
self.table_name = None
@classmethod
def from_client(cls):
keyspaces_client = boto3.client("keyspaces")
return cls(keyspaces_client)
def delete_keyspace(self):
"""
Deletes the keyspace.
"""
try:
self.keyspaces_client.delete_keyspace(keyspaceName=self.ks_name)
self.ks_name = None
except ClientError as err:
logger.error(
"Couldn't delete keyspace %s. Here's why: %s: %s",
self.ks_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
For API details, see DeleteKeyspace in AWS SDK for Python (Boto3) API Reference.
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Use DeleteTable with an AWS SDK or CLI
The following code examples show how to use DeleteTable.
Action examples are code excerpts from larger programs and must be run in context. You can see
this action in context in the following code example:
Learn the basics
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
/// <summary>
/// Delete an Amazon Keyspaces table.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="tableName">The name of the table to delete.</param>
/// <returns>A Boolean value indicating the success of the action.</returns>
public async Task<bool> DeleteTable(string keyspaceName, string tableName)
{
var response = await _amazonKeyspaces.DeleteTableAsync(
new DeleteTableRequest { KeyspaceName = keyspaceName, TableName =
tableName });
return response.HttpStatusCode == HttpStatusCode.OK;
}
For API details, see DeleteTable in AWS SDK for .NET API Reference.
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
public static void deleteTable(KeyspacesClient keyClient, String
keyspaceName, String tableName) {
try {
DeleteTableRequest tableRequest = DeleteTableRequest.builder()
.keyspaceName(keyspaceName)
.tableName(tableName)
.build();
keyClient.deleteTable(tableRequest);
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
For API details, see DeleteTable in AWS SDK for Java 2.x API Reference.
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
suspend fun deleteTable(
keyspaceNameVal: String?,
tableNameVal: String?,
) {
val tableRequest =
DeleteTableRequest {
keyspaceName = keyspaceNameVal
tableName = tableNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
keyClient.deleteTable(tableRequest)
}
}
For API details, see DeleteTable in AWS SDK for Kotlin API reference.
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
class KeyspaceWrapper:
"""Encapsulates Amazon Keyspaces (for Apache Cassandra) keyspace and table
actions."""
def __init__(self, keyspaces_client):
"""
:param keyspaces_client: A Boto3 Amazon Keyspaces client.
"""
self.keyspaces_client = keyspaces_client
self.ks_name = None
self.ks_arn = None
self.table_name = None
@classmethod
def from_client(cls):
keyspaces_client = boto3.client("keyspaces")
return cls(keyspaces_client)
def delete_table(self):
"""
Deletes the table from the keyspace.
"""
try:
self.keyspaces_client.delete_table(
keyspaceName=self.ks_name, tableName=self.table_name
)
self.table_name = None
except ClientError as err:
logger.error(
"Couldn't delete table %s. Here's why: %s: %s",
self.table_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
For API details, see DeleteTable in AWS SDK for Python (Boto3) API Reference.
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Use GetKeyspace with an AWS SDK or CLI
The following code examples show how to use GetKeyspace.
Action examples are code excerpts from larger programs and must be run in context. You can see
this action in context in the following code example:
Learn the basics
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
/// <summary>
/// Get data about a keyspace.
/// </summary>
/// <param name="keyspaceName">The name of the keyspace.</param>
/// <returns>The Amazon Resource Name (ARN) of the keyspace.</returns>
public async Task<string> GetKeyspace(string keyspaceName)
{
var response = await _amazonKeyspaces.GetKeyspaceAsync(
new GetKeyspaceRequest { KeyspaceName = keyspaceName });
return response.ResourceArn;
}
For API details, see GetKeyspace in AWS SDK for .NET API Reference.
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
public static void checkKeyspaceExistence(KeyspacesClient keyClient, String
keyspaceName) {
try {
GetKeyspaceRequest keyspaceRequest = GetKeyspaceRequest.builder()
.keyspaceName(keyspaceName)
.build();
GetKeyspaceResponse response =
keyClient.getKeyspace(keyspaceRequest);
String name = response.keyspaceName();
System.out.println("The " + name + " KeySpace is ready");
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
For API details, see GetKeyspace in AWS SDK for Java 2.x API Reference.
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
suspend fun checkKeyspaceExistence(keyspaceNameVal: String?) {
val keyspaceRequest =
GetKeyspaceRequest {
keyspaceName = keyspaceNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
val response: GetKeyspaceResponse =
keyClient.getKeyspace(keyspaceRequest)
val name = response.keyspaceName
println("The $name KeySpace is ready")
}
}
For API details, see GetKeyspace in AWS SDK for Kotlin API reference.
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
class KeyspaceWrapper:
"""Encapsulates Amazon Keyspaces (for Apache Cassandra) keyspace and table
actions."""
def __init__(self, keyspaces_client):
"""
:param keyspaces_client: A Boto3 Amazon Keyspaces client.
"""
self.keyspaces_client = keyspaces_client
self.ks_name = None
self.ks_arn = None
self.table_name = None
@classmethod
def from_client(cls):
keyspaces_client = boto3.client("keyspaces")
return cls(keyspaces_client)
def exists_keyspace(self, name):
"""
Checks whether a keyspace exists.
:param name: The name of the keyspace to look up.
:return: True when the keyspace exists. Otherwise, False.
"""
try:
response = self.keyspaces_client.get_keyspace(keyspaceName=name)
self.ks_name = response["keyspaceName"]
self.ks_arn = response["resourceArn"]
exists = True
except ClientError as err:
if err.response["Error"]["Code"] == "ResourceNotFoundException":
logger.info("Keyspace %s does not exist.", name)
exists = False
else:
logger.error(
"Couldn't verify %s exists. Here's why: %s: %s",
name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
return exists
For API details, see GetKeyspace in AWS SDK for Python (Boto3) API Reference.
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Use GetTable with an AWS SDK or CLI
The following code examples show how to use GetTable.
Action examples are code excerpts from larger programs and must be run in context. You can see
this action in context in the following code example:
Learn the basics
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
/// <summary>
/// Get information about an Amazon Keyspaces table.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="tableName">The name of the Amazon Keyspaces table.</param>
/// <returns>The response containing data about the table.</returns>
public async Task<GetTableResponse> GetTable(string keyspaceName, string
tableName)
{
var response = await _amazonKeyspaces.GetTableAsync(
new GetTableRequest { KeyspaceName = keyspaceName, TableName =
tableName });
return response;
}
For API details, see GetTable in AWS SDK for .NET API Reference.
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
public static void checkTable(KeyspacesClient keyClient, String keyspaceName,
String tableName)
throws InterruptedException {
try {
boolean tableStatus = false;
String status;
GetTableResponse response = null;
GetTableRequest tableRequest = GetTableRequest.builder()
.keyspaceName(keyspaceName)
.tableName(tableName)
.build();
while (!tableStatus) {
response = keyClient.getTable(tableRequest);
status = response.statusAsString();
System.out.println(". The table status is " + status);
if (status.compareTo("ACTIVE") == 0) {
tableStatus = true;
}
Thread.sleep(500);
}
List<ColumnDefinition> cols =
response.schemaDefinition().allColumns();
for (ColumnDefinition def : cols) {
System.out.println("The column name is " + def.name());
System.out.println("The column type is " + def.type());
}
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
For API details, see GetTable in AWS SDK for Java 2.x API Reference.
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
suspend fun checkTable(
keyspaceNameVal: String?,
tableNameVal: String?,
) {
var tableStatus = false
var status: String
var response: GetTableResponse? = null
val tableRequest =
GetTableRequest {
keyspaceName = keyspaceNameVal
tableName = tableNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
while (!tableStatus) {
response = keyClient.getTable(tableRequest)
status = response!!.status.toString()
println(". The table status is $status")
if (status.compareTo("ACTIVE") == 0) {
tableStatus = true
}
delay(500)
}
val cols: List<ColumnDefinition>? =
response!!.schemaDefinition?.allColumns
if (cols != null) {
for (def in cols) {
println("The column name is ${def.name}")
println("The column type is ${def.type}")
}
}
}
}
For API details, see GetTable in AWS SDK for Kotlin API reference.
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
class KeyspaceWrapper:
"""Encapsulates Amazon Keyspaces (for Apache Cassandra) keyspace and table
actions."""
def __init__(self, keyspaces_client):
"""
:param keyspaces_client: A Boto3 Amazon Keyspaces client.
"""
self.keyspaces_client = keyspaces_client
self.ks_name = None
self.ks_arn = None
self.table_name = None
@classmethod
def from_client(cls):
keyspaces_client = boto3.client("keyspaces")
return cls(keyspaces_client)
def get_table(self, table_name):
"""
Gets data about a table in the keyspace.
:param table_name: The name of the table to look up.
:return: Data about the table.
"""
try:
response = self.keyspaces_client.get_table(
keyspaceName=self.ks_name, tableName=table_name
)
self.table_name = table_name
except ClientError as err:
if err.response["Error"]["Code"] == "ResourceNotFoundException":
logger.info("Table %s does not exist.", table_name)
self.table_name = None
response = None
else:
logger.error(
"Couldn't verify %s exists. Here's why: %s: %s",
table_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
return response
For API details, see GetTable in AWS SDK for Python (Boto3) API Reference.
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Use ListKeyspaces with an AWS SDK or CLI
The following code examples show how to use ListKeyspaces.
Action examples are code excerpts from larger programs and must be run in context. You can see
this action in context in the following code example:
Learn the basics
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
/// <summary>
/// Lists all keyspaces for the account.
/// </summary>
/// <returns>Async task.</returns>
public async Task ListKeyspaces()
{
var paginator = _amazonKeyspaces.Paginators.ListKeyspaces(new
ListKeyspacesRequest());
Console.WriteLine("{0, -30}\t{1}", "Keyspace name", "Keyspace ARN");
Console.WriteLine(new string('-', Console.WindowWidth));
await foreach (var keyspace in paginator.Keyspaces)
{
Console.WriteLine($"{keyspace.KeyspaceName,-30}\t{keyspace.ResourceArn}");
}
}
For API details, see ListKeyspaces in AWS SDK for .NET API Reference.
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
public static void listKeyspacesPaginator(KeyspacesClient keyClient) {
try {
ListKeyspacesRequest keyspacesRequest =
ListKeyspacesRequest.builder()
.maxResults(10)
.build();
ListKeyspacesIterable listRes =
keyClient.listKeyspacesPaginator(keyspacesRequest);
listRes.stream()
.flatMap(r -> r.keyspaces().stream())
.forEach(content -> System.out.println(" Name: " +
content.keyspaceName()));
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
For API details, see ListKeyspaces in AWS SDK for Java 2.x API Reference.
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
suspend fun listKeyspacesPaginator() {
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
keyClient
.listKeyspacesPaginated(ListKeyspacesRequest {})
.transform { it.keyspaces?.forEach { obj -> emit(obj) } }
.collect { obj ->
println("Name: ${obj.keyspaceName}")
}
}
}
For API details, see ListKeyspaces in AWS SDK for Kotlin API reference.
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
class KeyspaceWrapper:
"""Encapsulates Amazon Keyspaces (for Apache Cassandra) keyspace and table
actions."""
def __init__(self, keyspaces_client):
"""
:param keyspaces_client: A Boto3 Amazon Keyspaces client.
"""
self.keyspaces_client = keyspaces_client
self.ks_name = None
self.ks_arn = None
self.table_name = None
@classmethod
def from_client(cls):
keyspaces_client = boto3.client("keyspaces")
return cls(keyspaces_client)
def list_keyspaces(self, limit):
"""
Lists the keyspaces in your account.
:param limit: The maximum number of keyspaces to list.
"""
try:
ks_paginator = self.keyspaces_client.get_paginator("list_keyspaces")
for page in ks_paginator.paginate(PaginationConfig={"MaxItems": limit}):
for ks in page["keyspaces"]:
print(ks["keyspaceName"])
print(f"\t{ks['resourceArn']}")
except ClientError as err:
logger.error(
"Couldn't list keyspaces. Here's why: %s: %s",
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
For API details, see ListKeyspaces in AWS SDK for Python (Boto3) API Reference.
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Use ListTables with an AWS SDK or CLI
The following code examples show how to use ListTables.
Action examples are code excerpts from larger programs and must be run in context. You can see
this action in context in the following code example:
Learn the basics
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
/// <summary>
/// Lists the Amazon Keyspaces tables in a keyspace.
/// </summary>
/// <param name="keyspaceName">The name of the keyspace.</param>
/// <returns>A list of TableSummary objects.</returns>
public async Task<List<TableSummary>> ListTables(string keyspaceName)
{
var response = await _amazonKeyspaces.ListTablesAsync(new
ListTablesRequest { KeyspaceName = keyspaceName });
response.Tables.ForEach(table =>
{
Console.WriteLine($"{table.KeyspaceName}\t{table.TableName}\t{table.ResourceArn}");
});
return response.Tables;
}
For API details, see ListTables in AWS SDK for .NET API Reference.
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
public static void listTables(KeyspacesClient keyClient, String keyspaceName)
{
try {
ListTablesRequest tablesRequest = ListTablesRequest.builder()
.keyspaceName(keyspaceName)
.build();
ListTablesIterable listRes =
keyClient.listTablesPaginator(tablesRequest);
listRes.stream()
.flatMap(r -> r.tables().stream())
.forEach(content -> System.out.println(" ARN: " +
content.resourceArn() +
" Table name: " + content.tableName()));
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
For API details, see ListTables in AWS SDK for Java 2.x API Reference.
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
suspend fun listTables(keyspaceNameVal: String?) {
val tablesRequest =
ListTablesRequest {
keyspaceName = keyspaceNameVal
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
keyClient
.listTablesPaginated(tablesRequest)
.transform { it.tables?.forEach { obj -> emit(obj) } }
.collect { obj ->
println(" ARN: ${obj.resourceArn} Table name: ${obj.tableName}")
}
}
}
For API details, see ListTables in AWS SDK for Kotlin API reference.
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
class KeyspaceWrapper:
"""Encapsulates Amazon Keyspaces (for Apache Cassandra) keyspace and table
actions."""
def __init__(self, keyspaces_client):
"""
:param keyspaces_client: A Boto3 Amazon Keyspaces client.
"""
self.keyspaces_client = keyspaces_client
self.ks_name = None
self.ks_arn = None
self.table_name = None
@classmethod
def from_client(cls):
keyspaces_client = boto3.client("keyspaces")
return cls(keyspaces_client)
def list_tables(self):
"""
Lists the tables in the keyspace.
"""
try:
table_paginator = self.keyspaces_client.get_paginator("list_tables")
for page in table_paginator.paginate(keyspaceName=self.ks_name):
for table in page["tables"]:
print(table["tableName"])
print(f"\t{table['resourceArn']}")
except ClientError as err:
logger.error(
"Couldn't list tables in keyspace %s. Here's why: %s: %s",
self.ks_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
For API details, see ListTables in AWS SDK for Python (Boto3) API Reference.
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Use RestoreTable with an AWS SDK or CLI
The following code examples show how to use RestoreTable.
Action examples are code excerpts from larger programs and must be run in context. You can see
this action in context in the following code example:
Learn the basics
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
/// <summary>
/// Restores the specified table to the specified point in time.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="tableName">The name of the table to restore.</param>
/// <param name="timestamp">The time to which the table will be restored.</
param>
/// <returns>The Amazon Resource Name (ARN) of the restored table.</returns>
public async Task<string> RestoreTable(string keyspaceName, string tableName,
string restoredTableName, DateTime timestamp)
{
var request = new RestoreTableRequest
{
RestoreTimestamp = timestamp,
SourceKeyspaceName = keyspaceName,
SourceTableName = tableName,
TargetKeyspaceName = keyspaceName,
TargetTableName = restoredTableName
};
var response = await _amazonKeyspaces.RestoreTableAsync(request);
return response.RestoredTableARN;
}
For API details, see RestoreTable in AWS SDK for .NET API Reference.
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
public static void restoreTable(KeyspacesClient keyClient, String
keyspaceName, ZonedDateTime utc) {
try {
Instant myTime = utc.toInstant();
RestoreTableRequest restoreTableRequest =
RestoreTableRequest.builder()
.restoreTimestamp(myTime)
.sourceTableName("Movie")
.targetKeyspaceName(keyspaceName)
.targetTableName("MovieRestore")
.sourceKeyspaceName(keyspaceName)
.build();
RestoreTableResponse response =
keyClient.restoreTable(restoreTableRequest);
System.out.println("The ARN of the restored table is " +
response.restoredTableARN());
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
For API details, see RestoreTable in AWS SDK for Java 2.x API Reference.
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
suspend fun restoreTable(
keyspaceName: String?,
utc: ZonedDateTime,
) {
// Create an aws.smithy.kotlin.runtime.time.Instant value.
val timeStamp =
aws.smithy.kotlin.runtime.time
.Instant(utc.toInstant())
val restoreTableRequest =
RestoreTableRequest {
restoreTimestamp = timeStamp
sourceTableName = "MovieKotlin"
targetKeyspaceName = keyspaceName
targetTableName = "MovieRestore"
sourceKeyspaceName = keyspaceName
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
val response = keyClient.restoreTable(restoreTableRequest)
println("The ARN of the restored table is ${response.restoredTableArn}")
}
}
For API details, see RestoreTable in AWS SDK for Kotlin API reference.
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
class KeyspaceWrapper:
"""Encapsulates Amazon Keyspaces (for Apache Cassandra) keyspace and table
actions."""
def __init__(self, keyspaces_client):
"""
:param keyspaces_client: A Boto3 Amazon Keyspaces client.
"""
self.keyspaces_client = keyspaces_client
self.ks_name = None
self.ks_arn = None
self.table_name = None
@classmethod
def from_client(cls):
keyspaces_client = boto3.client("keyspaces")
return cls(keyspaces_client)
def restore_table(self, restore_timestamp):
"""
Restores the table to a previous point in time. The table is restored
to a new table in the same keyspace.
:param restore_timestamp: The point in time to restore the table. This time
must be in UTC format.
:return: The name of the restored table.
"""
try:
restored_table_name = f"{self.table_name}_restored"
self.keyspaces_client.restore_table(
sourceKeyspaceName=self.ks_name,
sourceTableName=self.table_name,
targetKeyspaceName=self.ks_name,
targetTableName=restored_table_name,
restoreTimestamp=restore_timestamp,
)
except ClientError as err:
logger.error(
"Couldn't restore table %s. Here's why: %s: %s",
restore_timestamp,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
else:
return restored_table_name
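Because restore_timestamp must be a UTC datetime, a typical call looks like the following
sketch. The 30-minute offset is only an illustration, and the wrapper must already have ks_name
and table_name set (for example, by get_table).
from datetime import datetime, timedelta, timezone
# Hypothetical usage: restore the table to its state 30 minutes ago.
restore_point = datetime.now(timezone.utc) - timedelta(minutes=30)
restored_name = wrapper.restore_table(restore_point)
print(restored_name)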
For API details, see RestoreTable in AWS SDK for Python (Boto3) API Reference.
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Use UpdateTable with an AWS SDK or CLI
The following code examples show how to use UpdateTable.
Action examples are code excerpts from larger programs and must be run in context. You can see
this action in context in the following code example:
Learn the basics
.NET
AWS SDK for .NET
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
/// <summary>
/// Updates the movie table to add a boolean column named watched.
/// </summary>
/// <param name="keyspaceName">The keyspace containing the table.</param>
/// <param name="tableName">The name of the table to change.</param>
/// <returns>The Amazon Resource Name (ARN) of the updated table.</returns>
public async Task<string> UpdateTable(string keyspaceName, string tableName)
{
var newColumn = new ColumnDefinition { Name = "watched", Type =
"boolean" };
var request = new UpdateTableRequest
{
KeyspaceName = keyspaceName,
TableName = tableName,
AddColumns = new List<ColumnDefinition> { newColumn }
};
var response = await _amazonKeyspaces.UpdateTableAsync(request);
return response.ResourceArn;
}
For API details, see UpdateTable in AWS SDK for .NET API Reference.
Java
SDK for Java 2.x
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
public static void updateTable(KeyspacesClient keyClient, String keySpace,
String tableName) {
try {
ColumnDefinition def = ColumnDefinition.builder()
.name("watched")
.type("boolean")
.build();
UpdateTableRequest tableRequest = UpdateTableRequest.builder()
.keyspaceName(keySpace)
.tableName(tableName)
.addColumns(def)
.build();
keyClient.updateTable(tableRequest);
} catch (KeyspacesException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
For API details, see UpdateTable in AWS SDK for Java 2.x API Reference.
Kotlin
SDK for Kotlin
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
suspend fun updateTable(
keySpace: String?,
tableNameVal: String?,
) {
val def =
ColumnDefinition {
name = "watched"
type = "boolean"
}
val tableRequest =
UpdateTableRequest {
keyspaceName = keySpace
tableName = tableNameVal
addColumns = listOf(def)
}
KeyspacesClient { region = "us-east-1" }.use { keyClient ->
keyClient.updateTable(tableRequest)
}
}
For API details, see UpdateTable in AWS SDK for Kotlin API reference.
Python
SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run
in the AWS Code Examples Repository.
class KeyspaceWrapper:
"""Encapsulates Amazon Keyspaces (for Apache Cassandra) keyspace and table
actions."""
def __init__(self, keyspaces_client):
"""
:param keyspaces_client: A Boto3 Amazon Keyspaces client.
"""
self.keyspaces_client = keyspaces_client
self.ks_name = None
self.ks_arn = None
self.table_name = None
@classmethod
def from_client(cls):
keyspaces_client = boto3.client("keyspaces")
return cls(keyspaces_client)
def update_table(self):
"""
Updates the schema of the table.
This example updates a table of movie data by adding a new column
that tracks whether the movie has been watched.
"""
try:
self.keyspaces_client.update_table(
keyspaceName=self.ks_name,
tableName=self.table_name,
addColumns=[{"name": "watched", "type": "boolean"}],
)
except ClientError as err:
logger.error(
"Couldn't update table %s. Here's why: %s: %s",
self.table_name,
err.response["Error"]["Code"],
err.response["Error"]["Message"],
)
raise
For API details, see UpdateTable in AWS SDK for Python (Boto3) API Reference.
For a complete list of AWS SDK developer guides and code examples, see Using this service with
an AWS SDK. This topic also includes information about getting started and details about previous
SDK versions.
Amazon Keyspaces (for Apache Cassandra) libraries and
tools
This section provides information about Amazon Keyspaces (for Apache Cassandra) libraries, code
examples, and tools.
Topics
Libraries and examples
Highlighted sample and developer tool repos
Libraries and examples
You can find Amazon Keyspaces open-source libraries and developer tools on GitHub in the AWS
and AWS samples repos.
Amazon Keyspaces (for Apache Cassandra) developer toolkit
This repository provides a Docker image with helpful developer tools for Amazon Keyspaces.
For example, it includes a CQLSHRC file with best practices, an optional AWS authentication
expansion for cqlsh, and helper tools to perform common tasks. The toolkit is optimized for
Amazon Keyspaces, but also works with Apache Cassandra clusters.
https://github.com/aws-samples/amazon-keyspaces-toolkit.
Amazon Keyspaces (for Apache Cassandra) examples
This repo is the official list of Amazon Keyspaces example code, organized into sections by
language (see Examples). The examples demonstrate common Amazon Keyspaces service
implementations and patterns that you can use when building applications.
https://github.com/aws-samples/amazon-keyspaces-examples/.
AWS Signature Version 4 (SigV4) authentication plugins
The plugins enable you to manage access to Amazon Keyspaces by using AWS Identity and Access
Management (IAM) users and roles. A minimal Python connection sketch follows the list of plugins.
Java: https://github.com/aws/aws-sigv4-auth-cassandra-java-driver-plugin.
Node.js: https://github.com/aws/aws-sigv4-auth-cassandra-nodejs-driver-plugin.
Python: https://github.com/aws/aws-sigv4-auth-cassandra-python-driver-plugin.
Go: https://github.com/aws/aws-sigv4-auth-cassandra-gocql-driver-plugin.
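As one illustration, the following is a minimal connection sketch for the Python plugin (the
cassandra-sigv4 package), following the same TLS and SigV4 pattern as the QueryManager class
earlier in this guide; the Region is a placeholder.
import boto3
from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED
from cassandra.cluster import Cluster
from cassandra_sigv4.auth import SigV4AuthProvider

# Placeholder Region; the certificate file is the Starfield root certificate.
region = "us-east-1"
ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations("sf-class2-root.crt")
ssl_context.verify_mode = CERT_REQUIRED

cluster = Cluster(
    [f"cassandra.{region}.amazonaws.com"],
    ssl_context=ssl_context,
    auth_provider=SigV4AuthProvider(boto3.Session(region_name=region)),
    port=9142,
)
session = cluster.connect()
print(session.execute("SELECT release_version FROM system.local").one())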
Highlighted sample and developer tool repos
Below is a selection of helpful community tools for Amazon Keyspaces (for Apache Cassandra).
Amazon Keyspaces Protocol Buffers
You can use Protocol Buffers (Protobuf) with Amazon Keyspaces as an alternative to Apache
Cassandra User Defined Types (UDTs). Protobuf is a free, open-source, cross-platform data
format used to serialize structured data. You can store Protobuf data using the CQL BLOB data
type and refactor UDTs while preserving structured data across applications and programming
languages.
This repository provides a code example that connects to Amazon Keyspaces, creates a new table,
and inserts a row containing a Protobuf message. Then the row is read with strong consistency.
https://github.com/aws-samples/amazon-keyspaces-examples/tree/main/java/datastax-v4/protobuf-user-defined-types
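A minimal sketch of the pattern follows; movie_pb2.Movie is a hypothetical class generated by
protoc, the table is a placeholder, and session is an open Cassandra driver session like the one
created by the SigV4 sketch earlier.
import movie_pb2  # hypothetical module generated by protoc from movie.proto

# Serialize a Protobuf message to bytes and store it in a BLOB column.
movie = movie_pb2.Movie(title="example", year=2024)
payload = movie.SerializeToString()

insert = session.prepare(
    "INSERT INTO store_example.movies_pb (id, payload) VALUES (?, ?)"
)
session.execute(insert, [1, payload])

# Read the blob back and parse it into a Protobuf message again.
row = session.execute(
    "SELECT payload FROM store_example.movies_pb WHERE id = 1"
).one()
restored = movie_pb2.Movie.FromString(row.payload)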
AWS CloudFormation template to create Amazon CloudWatch
dashboard for Amazon Keyspaces (for Apache Cassandra) metrics
This repository provides AWS CloudFormation templates to quickly set up CloudWatch metrics
for Amazon Keyspaces. The templates provide deployable, prebuilt CloudWatch dashboards with
commonly used metrics, so you can get started more easily.
https://github.com/aws-samples/amazon-keyspaces-cloudwatch-cloudformation-templates.
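For example, after downloading one of the templates from the repo, you could deploy it with a
few lines of Boto3; the stack and file names below are placeholders.
import boto3

# Placeholder names; use a template file downloaded from the repo.
cloudformation = boto3.client("cloudformation")
with open("keyspaces-dashboard.yaml") as template_file:
    cloudformation.create_stack(
        StackName="keyspaces-cw-dashboard",
        TemplateBody=template_file.read(),
    )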
Using Amazon Keyspaces (for Apache Cassandra) with AWS Lambda
This repository contains examples that show how to connect to Amazon Keyspaces from Lambda,
including the following.
C#/.NET: https://github.com/aws-samples/amazon-keyspaces-examples/tree/main/dotnet/datastax-v3/connection-lambda.
Java: https://github.com/aws-samples/amazon-keyspaces-examples/tree/main/java/datastax-v4/connection-lambda.
Another Lambda example that shows how to deploy and use Amazon Keyspaces from a Python
Lambda is available from the following repo.
https://github.com/aws-samples/aws-keyspaces-lambda-python
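A common pattern in these examples is to create the session once per Lambda execution
environment, outside the handler, so that warm invocations reuse the connection. The following
is a hedged sketch of that pattern; connect_to_keyspaces() stands in for a connection routine
like the SigV4 sketch above and is not a function defined in these repos.
# Hypothetical module-level connection; created once when the execution
# environment starts, then reused by every warm invocation.
session = connect_to_keyspaces()

def lambda_handler(event, context):
    # Reuse the long-lived session instead of reconnecting per invocation.
    row = session.execute("SELECT release_version FROM system.local").one()
    return {"release_version": row.release_version}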
Using Amazon Keyspaces (for Apache Cassandra) with Spring
This is an example that shows you how to use Amazon Keyspaces with Spring Boot.
https://github.com/aws-samples/amazon-keyspaces-examples/tree/main/java/datastax-v4/spring
Using Amazon Keyspaces (for Apache Cassandra) with Scala
This is an example that shows how to connect to Amazon Keyspaces using the SigV4 authentication
plugin with Scala.
https://github.com/aws-samples/amazon-keyspaces-examples/tree/main/scala/datastax-v4/connection-sigv4
Using Amazon Keyspaces (for Apache Cassandra) with AWS Glue
This is an example that shows how to use Amazon Keyspaces with AWS Glue.
https://github.com/aws-samples/amazon-keyspaces-examples/tree/main/scala/datastax-v4/aws-glue
Amazon Keyspaces (for Apache Cassandra) Cassandra query language
(CQL) to AWS CloudFormation converter
This package implements a command-line tool for converting Apache Cassandra Query Language
(CQL) scripts to AWS CloudFormation (CloudFormation) templates, which allows Amazon
Keyspaces schemas to be easily managed in CloudFormation stacks.
https://github.com/aws/amazon-keyspaces-cql-to-cfn-converter.
Amazon Keyspaces (for Apache Cassandra) helpers for Apache
Cassandra driver for Java
This repository contains driver policies, examples, and best practices when using the DataStax Java
Driver with Amazon Keyspaces (for Apache Cassandra).
https://github.com/aws-samples/amazon-keyspaces-java-driver-helpers.
Amazon Keyspaces (for Apache Cassandra) snappy compression demo
This repository demonstrates how to compress, store, and read/write large objects for faster
performance and lower throughput and storage costs.
https://github.com/aws-samples/amazon-keyspaces-compression-example.
Amazon Keyspaces (for Apache Cassandra) and Amazon S3 codec demo
Custom Amazon S3 Codec supports transparent, user-configurable mapping of UUID pointers to
Amazon S3 objects.
https://github.com/aws-samples/amazon-keyspaces-large-object-s3-demo.
Best practices for designing and architecting with
Amazon Keyspaces (for Apache Cassandra)
Use this section to quickly find recommendations for maximizing performance and minimizing
throughput costs when working with Amazon Keyspaces.
Contents
Key differences and design principles of NoSQL design
Differences between relational data design and NoSQL
Two key concepts for NoSQL design
Approaching NoSQL design
Optimize client driver connections for the serverless environment
How connections work in Amazon Keyspaces
How to configure connections in Amazon Keyspaces
How to configure connections over VPC endpoints in Amazon Keyspaces
How to monitor connections in Amazon Keyspaces
How to handle connection errors in Amazon Keyspaces
Data modeling best practices: recommendations for designing data models
How to use partition keys effectively in Amazon Keyspaces
Use write sharding to evenly distribute workloads across partitions
Sharding using compound partition keys and random values
Sharding using compound partition keys and calculated values
Optimizing costs of Amazon Keyspaces tables
Evaluate your costs at the table level
How to view the costs of a single Amazon Keyspaces table
Cost Explorer's default view
How to use and apply table tags in Cost Explorer
Evaluate your table's capacity mode
What table capacity modes are available
When to select on-demand capacity mode
When to select provisioned capacity mode
Additional factors to consider when choosing a table capacity mode
Evaluate your table's Application Auto Scaling settings
Understanding your Application Auto Scaling settings
How to identify tables with low target utilization (<=50%)
How to address workloads with seasonal variance
How to address spiky workloads with unknown patterns
How to address workloads with linked applications
Identify your unused resources to optimize costs in Amazon Keyspaces
How to identify unused resources
Identifying unused table resources
Cleaning up unused table resources
Cleaning up unused point-in-time recovery (PITR) backups
Evaluate your table usage patterns to optimize performance and cost
Perform fewer strongly-consistent read operations
Enable Time to Live (TTL)
Evaluate your provisioned capacity for right-sized provisioning
How to retrieve consumption metrics from your Amazon Keyspaces tables
How to identify under-provisioned Amazon Keyspaces tables
How to identify over-provisioned Amazon Keyspaces tables
Key differences and design principles of NoSQL design
NoSQL database systems like Amazon Keyspaces use alternative models for data management,
such as key-value pairs or document storage. When you switch from a relational database
management system to a NoSQL database system like Amazon Keyspaces, it's important to
understand the key differences and specific design approaches.
Topics
Differences between relational data design and NoSQL
Two key concepts for NoSQL design
Approaching NoSQL design
Differences between relational data design and NoSQL
Relational database systems (RDBMS) and NoSQL databases have different strengths and
weaknesses:
In RDBMS, data can be queried flexibly, but queries are relatively expensive and don't scale well
in high-traffic situations (see the section called “Data modeling”).
In a NoSQL database such as Amazon Keyspaces, data can be queried efficiently in a limited
number of ways, outside of which queries can be expensive and slow.
These differences make database design different between the two systems:
In RDBMS, you design for flexibility without worrying about implementation details or
performance. Query optimization generally doesn't affect schema design, but normalization is
important.
In Amazon Keyspaces, you design your schema specifically to make the most common and
important queries as fast and as inexpensive as possible. Your data structures are tailored to the
specific requirements of your business use cases.
Two key concepts for NoSQL design
NoSQL design requires a different mindset than RDBMS design. For an RDBMS, you can go ahead
and create a normalized data model without thinking about access patterns. You can then extend it
later when new questions and query requirements arise. You can organize each type of data into its
own table.
How NoSQL design is different
By contrast, you shouldn't start designing your schema for Amazon Keyspaces until you know the
questions it needs to answer. Understanding the business problems and the application use cases
up front is essential.
You should maintain as few tables as possible in an Amazon Keyspaces application. Having fewer
tables keeps things more scalable, requires less permissions management, and reduces overhead
for your Amazon Keyspaces application. It can also help keep backup costs lower overall.
Approaching NoSQL design
The first step in designing your Amazon Keyspaces application is to identify the specific query
patterns that the system must satisfy.
In particular, it is important to understand three fundamental properties of your application's
access patterns before you begin:
Data size: Knowing how much data will be stored and requested at one time helps to determine
the most effective way to partition the data.
Data shape: Instead of reshaping data when a query is processed (as an RDBMS system does), a
NoSQL database organizes data so that its shape in the database corresponds with what will be
queried. This is a key factor in increasing speed and scalability.
Data velocity: Amazon Keyspaces scales by increasing the number of physical partitions that are
available to process queries, and by efficiently distributing data across those partitions. Knowing
in advance what the peak query loads will be might help determine how to partition data to best
use I/O capacity.
After you identify specific query requirements, you can organize data according to general
principles that govern performance:
Keep related data together. Research on routing-table optimization 20 years ago found that
"locality of reference" was the single most important factor in speeding up response time:
keeping related data together in one place. This is equally true in NoSQL systems today, where
keeping related data in close proximity has a major impact on cost and performance. Instead
of distributing related data items across multiple tables, you should keep related items in your
NoSQL system as close together as possible.
As a general rule, you should maintain as few tables as possible in an Amazon Keyspaces
application.
Exceptions are cases where high-volume time series data are involved, or datasets that have very
different access patterns. A single table with inverted indexes can usually enable simple queries
to create and retrieve the complex hierarchical data structures required by your application.
Use sort order. Related items can be grouped together and queried efficiently if their key
design causes them to sort together. This is an important NoSQL design strategy.
Distribute queries. It is also important that a high volume of queries not be focused on one
part of the database, where they can exceed I/O capacity. Instead, you should design data keys
to distribute traffic evenly across partitions as much as possible, avoiding "hot spots."
These general principles translate into some common design patterns that you can use to model
data efficiently in Amazon Keyspaces.
Optimize client driver connections for the serverless
environment
To communicate with Amazon Keyspaces, you can use any of the existing Apache Cassandra client
drivers of your choice. Because Amazon Keyspaces is a serverless service, we recommend that
you optimize the connection configuration of your client driver for the throughput needs of your
application. This topic introduces best practices including how to calculate how many connections
your application requires, as well as monitoring and error handling of connections.
Topics
How connections work in Amazon Keyspaces
How to configure connections in Amazon Keyspaces
How to configure connections over VPC endpoints in Amazon Keyspaces
How to monitor connections in Amazon Keyspaces
How to handle connection errors in Amazon Keyspaces
How connections work in Amazon Keyspaces
This section gives an overview of how client driver connections work in Amazon Keyspaces.
Because Cassandra client driver misconfiguration can result in PerConnectionRequestExceeded
events in Amazon Keyspaces, configuring the right number of connections in the client driver
configuration is required to avoid these and similar connection errors.
When connecting to Amazon Keyspaces, the driver requires a seed endpoint to establish an initial
connection. Amazon Keyspaces uses DNS to route the initial connection to one of the many
available endpoints. The endpoints are attached to network load balancers that in turn establish
a connection to one of the request handlers in the fleet. After the initial connection is established,
the client driver gathers information about all available endpoints from the system.peers
table. With this information, the client driver can create additional connections to the listed
endpoints. The number of connections the client driver can create is limited by the number of local
connections specified in the client driver settings. By default, most client drivers establish one
connection per endpoint and establish a connection pool to Cassandra and load balance queries
over that pool of connections. Although multiple connections can be established to the same
endpoint, behind the network load balancer they may be connected to many different request
handlers. When connecting through the public endpoint, establishing one connection to each
of the nine endpoints listed in the system.peers table results in nine connections to different
request handlers.
How to configure connections in Amazon Keyspaces
Amazon Keyspaces supports up to 3,000 CQL queries per TCP connection per second. Because
there's no limit on the number of connections a driver can establish, we recommend targeting only
500 CQL requests per second per connection to allow for overhead, traffic bursts, and better load
balancing. Follow these steps to ensure that your driver's connection is correctly configured for the
needs of your application.
Increase the number of connections per IP address your driver is maintaining in its connection
pool.
Most Cassandra drivers establish a connection pool to Cassandra and load balance queries
over that pool of connections. The default behavior of most drivers is to establish a single
connection to each endpoint. Amazon Keyspaces exposes nine peer IP addresses to drivers, so
based on the default behavior of most drivers, this results in 9 connections. Amazon Keyspaces
supports up to 3,000 CQL queries per TCP connection per second. Therefore, the maximum
CQL query throughput of a driver using the default settings is 27,000 CQL queries per second.
If you use the driver's default settings, a single connection may have to process more than
the maximum CQL query throughput of 3,000 CQL queries per second. This could result in
PerConnectionRequestExceeded events.
To avoid PerConnectionRequestExceeded events, you must configure the driver to create
additional connections per endpoint to distribute the throughput.
As a best practice in Amazon Keyspaces, assume that each connection can support 500 CQL
queries per second.
That means that for a production application that needs to support an estimated 27,000
CQL queries per second distributed over the nine available endpoints, you must configure six
connections per endpoint. This ensures that each connection processes no more than 500
requests per second.
Calculate the number of connections per IP address you need to configure for your driver based
on the needs of your application.
To determine the number of connections you need to configure per endpoint for your application,
consider the following example. You have an application that needs to support 20,000 CQL queries
per second consisting of 10,000 INSERT, 5,000 SELECT, and 5,000 DELETE operations. The Java
application is running on three instances on Amazon Elastic Container Service (Amazon ECS) where
each instance establishes a single session to Amazon Keyspaces. The calculation you can use to
estimate how many connections you need to configure for your driver uses the following input.
1. The number of requests per second your application needs to support.
2. The number of available instances, with one subtracted to account for maintenance or failure.
3. The number of available endpoints. If you're connecting over public endpoints, you have nine
available endpoints. If you're using VPC endpoints, you have between two and five available
endpoints, depending on the Region.
4. Use 500 CQL queries per second per connection as a best practice for Amazon Keyspaces.
5. Round up the result.
For this example, the formula looks like this.
20,000 CQL queries / (3 instances - 1 failure) / 9 public endpoints / 500 CQL queries
per second = ROUNDUP(2.22) = 3
Based on this calculation, you need to specify three local connections per endpoint in the driver
configuration. For remote connections, configure only one connection per endpoint.
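The following Java sketch is one way to wire this up with the DataStax Java driver 4.x. The sizing helper simply instantiates the formula above, and contact point and Region configuration are omitted for brevity; the class and method names are illustrative.

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.config.DefaultDriverOption;
import com.datastax.oss.driver.api.core.config.DriverConfigLoader;

public class ConnectionSizing {

    // Instantiates the sizing formula: requests / (instances - 1) /
    // endpoints / 500 CQL queries per second per connection, rounded up.
    static int connectionsPerEndpoint(int requestsPerSecond, int instances, int endpoints) {
        return (int) Math.ceil(
                requestsPerSecond / ((double) (instances - 1) * endpoints * 500.0));
    }

    public static void main(String[] args) {
        int localPoolSize = connectionsPerEndpoint(20_000, 3, 9); // evaluates to 3

        // Apply the result to the driver's connection pool settings.
        DriverConfigLoader loader = DriverConfigLoader.programmaticBuilder()
                .withInt(DefaultDriverOption.CONNECTION_POOL_LOCAL_SIZE, localPoolSize)
                .withInt(DefaultDriverOption.CONNECTION_POOL_REMOTE_SIZE, 1)
                .build();

        try (CqlSession session = CqlSession.builder()
                .withConfigLoader(loader)
                .build()) {
            // Run your application queries with the tuned pool.
        }
    }
}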
How to configure connections over VPC endpoints in Amazon
Keyspaces
When connecting over private VPC endpoints, you most likely have three endpoints available. The
number of VPC endpoints can differ per Region, based on the number of Availability Zones
and the number of subnets in the assigned VPC. For example, the US East (N. Virginia) Region has
five Availability Zones, so you can have up to five Amazon Keyspaces endpoints. The US West
(N. California) Region has two Availability Zones, so you can have up to two Amazon Keyspaces
endpoints. The number
of endpoints does not impact scale, but it does increase the number of connections you need to
establish in the driver configuration. Consider the following example. Your application needs to
support 20,000 CQL queries and is running on three instances on Amazon ECS where each instance
establishes a single session to Amazon Keyspaces. The only difference is how many endpoints are
available in the different AWS Regions.
Connections required in the US East (N. Virginia) Region:
20,000 CQL queries / (3 instances - 1 failure) / 5 private VPC endpoints / 500 CQL
queries per second = 4 local connections
Connections required in the US West (N. California) Region:
20,000 CQL queries / (3 instances - 1 failure) / 2 private VPC endpoints / 500 CQL
queries per second = 10 local connections
Important
When using private VPC endpoints, additional permissions are required for Amazon
Keyspaces to discover the available VPC endpoints dynamically and populate the
system.peers table. For more information, see the section called “Populating
system.peers table entries with interface VPC endpoint information”.
When accessing Amazon Keyspaces through a private VPC endpoint using a different AWS account,
it’s likely that you only see a single Amazon Keyspaces endpoint. Again this doesn't impact the
scale of possible throughput to Amazon Keyspaces, but it may require you to increase the number
of connections in your driver configuration. This example shows the same calculation for a single
available endpoint.
20,000 CQL queries / (3 instances - 1 failure) / 1 private VPC endpoint / 500 CQL
queries per second = 20 local connections
To learn more about cross-account access to Amazon Keyspaces using a shared VPC, see the section
called “Configure cross-account access in a shared VPC”.
How to monitor connections in Amazon Keyspaces
To help identify the number of endpoints your application is connected to, you can log the number
of peers discovered in the system.peers table. The following Java example prints the number of
peers after the connection has been established.
ResultSet result = session.execute(new SimpleStatement("SELECT * FROM system.peers"));
logger.info("number of Amazon Keyspaces endpoints:" + result.all().stream().count());
Note
The CQL console and the AWS Management Console are not deployed within a VPC and
therefore use the public endpoint. As a result, running the system.peers query from
applications located outside of your VPC often returns nine peers. It may also be helpful
to print the IP addresses of each peer.
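To also log the IP address of each peer, you could extend the earlier snippet as in the following sketch, assuming the DataStax Java driver 4.x and an existing session and logger.

ResultSet result = session.execute("SELECT peer FROM system.peers");
for (Row row : result) {
    // peer is the standard system.peers column holding the endpoint's address.
    logger.info("Amazon Keyspaces endpoint: " + row.getInetAddress("peer"));
}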
You can also observe the number of peers when using a VPC endpoint by setting up VPCE Amazon
CloudWatch metrics. In CloudWatch, you can see the number of connections established to the VPC
endpoint. The Cassandra drivers establish a connection for each endpoint to send CQL queries and
a control connection to gather system table information. The image below shows the VPC endpoint
CloudWatch metrics after connecting to Amazon Keyspaces with 1 connection configured in the
driver settings. The metric is showing six active connections consisting of one control connection
and five connections (1 per endpoint across Availability Zones).
To get started with monitoring the number of connections using a CloudWatch graph, you can
deploy this AWS CloudFormation template available on GitHub in the Amazon Keyspaces template
repository.
How to handle connection errors in Amazon Keyspaces
When you exceed the quota of 3,000 requests per connection per second, Amazon Keyspaces
returns a PerConnectionRequestExceeded event and the Cassandra driver receives a WriteTimeout
or ReadTimeout exception. You should retry this exception with exponential backoff, either in
your Cassandra retry policy or in your application, to avoid immediately sending additional
requests.
The default retry policy attempts to try the next host in the query plan. Because Amazon
Keyspaces may have only one to three available endpoints when connecting through a VPC
endpoint, you may also see NoHostAvailableException in addition to the WriteTimeout and
ReadTimeout exceptions in your application logs. You can use the retry policies provided by
Amazon Keyspaces, which retry on the same endpoint but across different connections.
You can find examples for exponential retry policies for Java on GitHub in the Amazon Keyspaces
Java code examples repository. You can find additional language examples on GitHub in the
Amazon Keyspaces code examples repository.
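As one illustration of the application-level approach, the following Java sketch retries timed-out statements with exponential backoff and jitter. The attempt count and delays are arbitrary starting points, not service recommendations, and the class name is illustrative.

import java.util.concurrent.ThreadLocalRandom;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.Statement;
import com.datastax.oss.driver.api.core.servererrors.ReadTimeoutException;
import com.datastax.oss.driver.api.core.servererrors.WriteTimeoutException;

public class BackoffRetry {

    public static void executeWithBackoff(CqlSession session, Statement<?> statement)
            throws InterruptedException {
        int maxAttempts = 5;
        long baseDelayMs = 100;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                session.execute(statement);
                return;
            } catch (WriteTimeoutException | ReadTimeoutException e) {
                if (attempt == maxAttempts) {
                    throw e;
                }
                // Exponential backoff with jitter avoids immediately sending
                // additional requests over an already saturated connection.
                long delay = (long) (baseDelayMs * Math.pow(2, attempt))
                        + ThreadLocalRandom.current().nextLong(100);
                Thread.sleep(delay);
            }
        }
    }
}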
Data modeling best practices: recommendations for designing
data models
Effective data modeling is crucial for optimizing performance and minimizing costs when
working with Amazon Keyspaces (for Apache Cassandra). This topic covers key considerations and
recommendations for designing data models that suit your application's data access patterns.
Partition Key Design – The partition key plays a critical role in determining how data is
distributed across partitions in Amazon Keyspaces. Choosing an appropriate partition key can
significantly impact query performance and throughput costs. This section discusses strategies
for designing partition keys that promote even distribution of read and write activity across
partitions.
Key Considerations:
Uniform activity distribution – Aim for uniform read and write activity across all partitions to
minimize throughput costs and leverage burst capacity effectively.
Access patterns – Align your partition key design with your application's primary data access
patterns.
Partition size – Avoid creating partitions that grow too large, as this can impact performance
and increase costs.
To visualize and design data models more easily, you can use the NoSQL Workbench.
Topics
How to use partition keys effectively in Amazon Keyspaces
How to use partition keys effectively in Amazon Keyspaces
The primary key that uniquely identifies each row in an Amazon Keyspaces table can consist of
one or multiple partition key columns, which determine which partitions the data is stored in, and
one or more optional clustering columns, which define how data is clustered and sorted within a
partition.
Because the partition key establishes the number of partitions your data is stored in and how the
data is distributed across these partitions, how you choose your partition key can have a significant
impact on the performance of your queries. In general, you should design your application for
uniform activity across all partitions on disk.
Distributing the read and write activity of your application evenly across all partitions helps to
minimize throughput costs. This applies to both on-demand and provisioned read/write
capacity modes. For example, if you are using provisioned capacity mode, you can determine
the access patterns that your application needs, and estimate the total read capacity units (RCU)
and write capacity units (WCU) that each table requires. Amazon Keyspaces supports your access
patterns using the throughput that you provisioned as long as the traffic against a given partition
does not exceed 3,000 RCUs and 1,000 WCUs.
Amazon Keyspaces offers additional flexibility in your per-partition throughput provisioning by
providing burst capacity. For more information, see the section called “Use burst capacity”.
Topics
Use write sharding to evenly distribute workloads across partitions
Use write sharding to evenly distribute workloads across partitions
One way to better distribute writes across partitions in Amazon Keyspaces is to expand the
partition key space. You can do this in several different ways. You can add an additional partition
key column to which you write random numbers to distribute the rows among partitions. Or you
can use a number that is calculated based on something that you're querying on.
Sharding using compound partition keys and random values
One strategy for distributing loads more evenly across a partition is to add an additional partition
key column to which you write random numbers. Then you randomize the writes across the larger
space.
For example, consider the following table which has a single partition key representing a date.
CREATE TABLE IF NOT EXISTS tracker.blogs (
   publish_date date,
   title text,
   description text,
   PRIMARY KEY (publish_date));
To more evenly distribute this table across partitions, you could include an additional partition key
column shard that stores random numbers. For example:
CREATE TABLE IF NOT EXISTS tracker.blogs (
   publish_date date,
   shard int,
   title text,
   description text,
   PRIMARY KEY ((publish_date, shard)));
When inserting data you might choose a random number between 1 and 200 for the shard
column. This yields compound partition key values like (2020-07-09, 1), (2020-07-09, 2),
and so on, through (2020-07-09, 200). Because you are randomizing the partition key, the
writes to the table on each day are spread evenly across multiple partitions. This results in better
parallelism and higher overall throughput.
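A write path for this schema might look like the following Java sketch, assuming the tracker.blogs table above, the DataStax Java driver 4.x, and an existing CqlSession.

import java.time.LocalDate;
import java.util.concurrent.ThreadLocalRandom;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;

public class ShardedWriter {

    public static void writeBlog(CqlSession session, LocalDate publishDate,
                                 String title, String description) {
        // Pick a random shard between 1 and 200 to spread writes for the
        // same date across multiple partitions.
        int shard = ThreadLocalRandom.current().nextInt(1, 201);
        PreparedStatement ps = session.prepare(
                "INSERT INTO tracker.blogs (publish_date, shard, title, description) "
                + "VALUES (?, ?, ?, ?)");
        session.execute(ps.bind(publishDate, shard, title, description));
    }
}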
However, to read all the rows for a given day, you would have to query the rows for all the shards
and then merge the results. For example, you would first issue a SELECT statement for the
partition key value (2020-07-09, 1). Then issue another SELECT statement for (2020-07-09,
2), and so on, through (2020-07-09, 200). Finally, your application would have to merge the
results from all those SELECT statements.
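That scatter-gather read might look like the following Java sketch, again assuming the tracker.blogs table above.

import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import com.datastax.oss.driver.api.core.cql.Row;

public class ShardedReader {

    public static List<Row> readDay(CqlSession session, LocalDate publishDate) {
        PreparedStatement ps = session.prepare(
                "SELECT * FROM tracker.blogs WHERE publish_date = ? AND shard = ?");
        List<Row> merged = new ArrayList<>();
        // Issue one SELECT per shard and merge the results client-side.
        for (int shard = 1; shard <= 200; shard++) {
            merged.addAll(session.execute(ps.bind(publishDate, shard)).all());
        }
        return merged;
    }
}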
Sharding using compound partition keys and calculated values
A randomizing strategy can greatly improve write throughput. But it's difficult to read a specific
row because you don't know which value was written to the shard column when the row was
written. To make it easier to read individual rows, you can use a different strategy. Instead of using
a random number to distribute the rows among partitions, use a number that you can calculate
based upon something that you want to query on.
Consider the previous example, in which a table uses today's date in the partition key. Now suppose
that each row has an accessible title column, and that you most often need to find rows by title
in addition to date. Before your application writes the row to the table, it could calculate a hash
value based on the title and use it to populate the shard column. The calculation might generate
a number between 1 and 200 that is fairly evenly distributed, similar to what the random strategy
produces.
A simple calculation would likely suffice, such as the product of the UTF-8 code point values for
the characters in the title, modulo 200, + 1. The compound partition key value would then be the
combination of the date and calculation result.
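A sketch of that calculation in Java follows; this is one possible hash, and any deterministic function that spreads titles evenly over 1-200 would do.

public class TitleShard {

    // Product of the title's code point values, modulo 200, plus 1.
    // Deterministic, so the same title always maps to the same shard.
    public static int shardForTitle(String title) {
        long product = 1;
        for (int codePoint : title.codePoints().toArray()) {
            product = (product * codePoint) % 200; // reduce each step to avoid overflow
        }
        return (int) product + 1; // result is in the range 1..200
    }
}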
With this strategy, the writes are spread evenly across the partition key values, and thus across
the physical partitions. You can easily perform a SELECT statement for a particular row and date
because you can calculate the partition key value for a specific title value.
To read all the rows for a given day, you still must SELECT each of the (2020-07-09, N) keys
(where N is 1–200), and your application then has to merge all the results. The benefit is that you
avoid having a single "hot" partition key value taking all of the workload.
Optimizing costs of Amazon Keyspaces tables
This section covers best practices on how to optimize costs for your existing Amazon Keyspaces
tables. Review the following strategies to see which cost optimization strategy best suits your
needs, and apply them iteratively. Each strategy provides an overview of what might
be impacting your costs, how to look for opportunities to optimize costs, and prescriptive guidance
on how to implement these best practices to help you save.
Topics
Evaluate your costs at the table level
Evaluate your table's capacity mode
Evaluate your table's Application Auto Scaling settings
Identify your unused resources to optimize costs in Amazon Keyspaces
Evaluate your table usage patterns to optimize performance and cost
Evaluate your provisioned capacity for right-sized provisioning
Evaluate your costs at the table level
The Cost Explorer tool found within the AWS Management Console allows you to see costs broken
down by type, for example read, write, storage, and backup charges. You can also see these costs
summarized by period such as month or day.
One common challenge with Cost Explorer is that you can't easily review the costs of a single
table, because Cost Explorer doesn't let you filter or group costs by a specific table. You
can view the metric Billable table size (Bytes) of each table in the Amazon Keyspaces console on
the table's Monitor tab. If you need more cost related information per table, this section shows you
how to use tagging to perform individual table cost analysis in Cost Explorer.
Topics
How to view the costs of a single Amazon Keyspaces table
Cost Explorer's default view
How to use and apply table tags in Cost Explorer
How to view the costs of a single Amazon Keyspaces table
You can see basic information about an Amazon Keyspaces table in the console, including the
primary key schema, the billable table size, and capacity related metrics. You can use the size of the
table to calculate the monthly storage cost for the table. For example, $0.25 per GB in the US East
(N. Virginia) AWS Region.
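For example, at that rate a table with a billable size of 100 GB would cost 100 GB × $0.25 per GB = $25.00 per month for storage. Storage pricing varies by Region, so check the current rate for your Region.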
If the table is using provisioned capacity mode, the current read capacity unit (RCU) and write
capacity unit (WCU) settings are returned as well. You can use this information to calculate the
current read and write costs for the table. Note that these costs could change, especially if you
have configured the table with Amazon Keyspaces automatic scaling.
Cost Explorer's default view
The default view in Cost Explorer provides charts showing the cost of consumed resources, for
example throughput and storage. You can choose to group these costs by period, such as totals by
month or by day. The costs of storage, reads, writes, and other categories can be broken out and
compared as well.
How to use and apply table tags in Cost Explorer
By default, Cost Explorer does not provide a summary of the costs for any one specific table,
because it combines the costs of multiple tables into a total. However, you can use AWS resource
tagging to identify each table by a metadata tag. Tags are key-value pairs that you can use for a
variety of purposes, for example to identify all resources belonging to a project or department. For
more information, see the section called “Working with tags”.
For this example, we use a table with the name MyTable.
1. Set a tag with the key of table_name and the value of MyTable.
2. Activate the tag within Cost Explorer and then filter on the tag value to gain more visibility
into each table's costs.
Note
It may take one or two days for the tag to start appearing in Cost Explorer.
You can set metadata tags yourself in the console, or programmatically with CQL, the AWS CLI, or
the AWS SDK. Consider requiring a table_name tag to be set as part of your organization’s new
table creation process. For more information, see the section called “Create cost allocation reports”.
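For example, setting the tag with CQL from the Java driver could look like the following sketch. The ALTER TABLE ... ADD TAGS statement assumes the Amazon Keyspaces tagging extension to CQL, and mykeyspace is a placeholder keyspace name.

import com.datastax.oss.driver.api.core.CqlSession;

public class TagTable {

    public static void tagTable(CqlSession session) {
        // Tag the table so the table_name tag can be activated in Cost Explorer.
        // mykeyspace is a placeholder; ADD TAGS is the Amazon Keyspaces CQL extension.
        session.execute(
                "ALTER TABLE mykeyspace.\"MyTable\" ADD TAGS {'table_name': 'MyTable'}");
    }
}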
Evaluate your table's capacity mode
This section provides an overview of how to select the appropriate capacity mode for your Amazon
Keyspaces table. Each mode is tuned to meet the needs of a different workload in terms of
responsiveness to change in throughput, as well as how that usage is billed. You must balance
these factors when making your decision.
Topics
What table capacity modes are available
When to select on-demand capacity mode
When to select provisioned capacity mode
Additional factors to consider when choosing a table capacity mode
What table capacity modes are available
When you create an Amazon Keyspaces table, you must select either on-demand or provisioned
capacity mode. For more information, see the section called “Configure read/write capacity
modes”.
On-demand capacity mode
The on-demand capacity mode is designed to eliminate the need to plan or provision the capacity
of your Amazon Keyspaces table. In this mode, your table instantly accommodates requests
without the need to scale any resources up or down (up to twice the previous peak throughput of
the table).
On-demand tables are billed by counting the number of actual requests against the table, so you
only pay for what you use rather than what has been provisioned.
Provisioned capacity mode
The provisioned capacity mode is a more traditional model where you can define how much
capacity the table has available for requests either directly or with the assistance of Application
Auto Scaling. Because a specific capacity is provisioned for the table at any given time, billing is
based on the capacity provisioned rather than the number of requests. Going over the allocated
capacity can also cause the table to reject requests and reduce the experience of your application's
users.
Provisioned capacity mode requires balancing over-provisioning against under-provisioning the
table, to achieve both a low occurrence of insufficient throughput capacity errors and optimized
costs.
When to select on-demand capacity mode
When optimizing for cost, on-demand mode is your best choice when you have an unpredictable
workload similar to the one shown in the following graph.
These factors contribute to this type of workload:
Unpredictable request timing (resulting in traffic spikes)
Variable volume of requests (resulting from batch workloads)
Drops to zero or below 18% of the peak for a given hour (resulting from development or test
environments)
For workloads with the above characteristics, using Application Auto Scaling to maintain enough
capacity for the table to respond to spikes in traffic may lead to undesirable outcomes. Either the
table could be over-provisioned and costing more than necessary, or the table could be
under-provisioned, causing unnecessary insufficient throughput capacity errors. In cases like
this, on-demand tables are the better choice.
Because on-demand tables are billed by request, there is nothing further you need to do at the
table level to optimize for cost. You should regularly evaluate your on-demand tables to verify the
workload still has the above characteristics. If the workload has stabilized, consider changing to
provisioned mode to maintain cost optimization.
When to select provisioned capacity mode
An ideal workload for provisioned capacity mode is one with a more predictable usage pattern,
as shown in the following graph.
The following factors contribute to a predictable workload:
Predictable/cyclical traffic for a given hour or day
Limited short term bursts of traffic
Since the traffic volumes within a given time or day are more stable, you can set the provisioned
capacity relatively close to the actual consumed capacity of the table. Cost optimizing a
provisioned capacity table is ultimately an exercise in getting the provisioned capacity (blue line) as
close to the consumed capacity (orange line) as possible without increasing ThrottledRequests
events for the table. The space between the two lines is both, wasted capacity as well as insurance
against a bad user experience due to insufficient throughput capacity errors.
Amazon Keyspaces provides Application Auto Scaling for provisioned capacity tables, which
automatically balances this on your behalf. You can track your consumed capacity throughout the
day and configure the provisioned capacity of the table based on a handful of variables.
Minimum capacity units
You can set the minimum capacity of a table to limit the occurrence of insufficient throughput
capacity errors, but it doesn't reduce the cost of the table. If your table has periods of low usage
followed by a sudden burst of high usage, setting the minimum can prevent Application Auto
Scaling from setting the table capacity too low.
Maximum capacity units
You can set the maximum capacity of a table to limit a table scaling higher than intended. Consider
applying a maximum for development or test tables, where large-scale load testing is not desired.
You can set a maximum for any table, but be sure to regularly evaluate this setting against the
table baseline when using it in production, to prevent accidental insufficient throughput capacity
errors.
Target utilization
Setting the target utilization of the table is the primary means of cost optimization for a
provisioned capacity table. Setting a lower percent value here increases how much the table is
over-provisioned, increasing cost, but reducing the risk of insufficient throughput capacity errors.
Setting a higher percentage value decreases by how much the table is over-provisioned, but
increases the risk of insufficient throughput capacity errors.
Additional factors to consider when choosing a table capacity mode
When deciding between the two capacity modes, there are some additional factors worth
considering.
Provisioned capacity mode can be combined with reserved capacity, which provides an additional
billing discount. When deciding between the two table modes, consider how much this additional
discount affects the cost of the table. In many cases, even a relatively unpredictable workload can
be more cost effective to run on an over-provisioned provisioned capacity table with reserved
capacity.
Improving predictability of your workload
In some situations, a workload may seemingly have both, a predictable and an unpredictable
pattern. While this can be easily supported with an on-demand table, costs will likely be lower if
the unpredictable patterns in the workload can be improved.
One of the most common causes of these patterns is batch imports. This type of traffic can often
exceed the baseline capacity of the table to such a degree that insufficient throughput capacity
errors would occur if it were to run. To keep a workload like this running on a provisioned capacity
table, consider the following options:
If the batch occurs at scheduled times, you can schedule an increase to your application auto-
scaling capacity before it runs.
If the batch occurs randomly, consider trying to extend the time it takes to run rather than
executing as fast as possible.
Add a ramp up period to the import, where the velocity of the import starts small but is slowly
increased over a few minutes until Application Auto Scaling has had the opportunity to start
adjusting table capacity.
Evaluate your table's Application Auto Scaling settings
This section provides an overview of how to evaluate the Application Auto Scaling settings on your
Amazon Keyspaces tables. Amazon Keyspaces Application Auto Scaling is a feature that manages
table throughput based on your application traffic and your target utilization metric. This ensures
that your tables have the capacity required for your application patterns.
The Application Auto Scaling service monitors your current table utilization and compares it to
the target utilization value: TargetValue. It notifies you if it is time to increase or decrease the
allocated capacity.
Topics
Understanding your Application Auto Scaling settings
How to identify tables with low target utilization (<=50%)
How to address workloads with seasonal variance
How to address spiky workloads with unknown patterns
How to address workloads with linked applications
Understanding your Application Auto Scaling settings
Defining the correct values for the target utilization and for the initial and final capacity is an
activity that requires involvement from your operations team. This allows you to properly define
the values based on historical application usage, which is used to trigger the Application Auto
Scaling policies.
The utilization target is the percentage of your total capacity that needs to be met during a period
of time before the Application Auto Scaling rules apply.
When you set a high utilization target (around 90%), your traffic needs to stay higher than 90%
for a period of time before Application Auto Scaling is activated. You should not use a high
utilization target unless your application traffic is very constant and doesn’t receive spikes.
When you set a very low utilization target (less than 50%), your application would need to reach
50% of the provisioned capacity before an Application Auto Scaling policy is triggered. Unless
your application traffic grows at a very aggressive rate, this usually translates into unused
capacity and wasted resources.
How to identify tables with low target utilization (<=50%)
You can use either the AWS CLI or AWS Management Console to monitor and identify the
TargetValues for your Application Auto Scaling policies in your Amazon Keyspaces resources:
AWS CLI
1. Return the entire list of resources by running the following command:
aws application-autoscaling describe-scaling-policies --service-namespace
cassandra
This command returns the entire list of Application Auto Scaling policies that are issued
to any Amazon Keyspaces resource. If you only want to retrieve the resources of a
particular table, you can add the --resource-id parameter. For example:

aws application-autoscaling describe-scaling-policies --service-namespace
cassandra --resource-id "keyspace/keyspace-name/table/table-name"

2. Return only the auto scaling policies for a particular table by running the following
command.

aws application-autoscaling describe-scaling-policies --service-namespace
cassandra --resource-id "keyspace/keyspace-name/table/table-name"
The values for the Application Auto Scaling policies are highlighted below. You need to
ensure that the target value is greater than 50% to avoid over-provisioning. You should
obtain a result similar to the following:
{
    "ScalingPolicies": [
        {
            "PolicyARN": "arn:aws:autoscaling:<region>:<account-id>:scalingPolicy:<uuid>:resource/keyspaces/table/table-name-scaling-policy",
            "PolicyName": "<policy-name>",
            "ServiceNamespace": "cassandra",
            "ResourceId": "keyspace/keyspace-name/table/table-name",
            "ScalableDimension": "cassandra:table:WriteCapacityUnits",
            "PolicyType": "TargetTrackingScaling",
            "TargetTrackingScalingPolicyConfiguration": {
                "TargetValue": 70.0,
                "PredefinedMetricSpecification": {
                    "PredefinedMetricType": "CassandraWriteCapacityUtilization"
                }
            },
            "Alarms": [
                ...
            ],
            "CreationTime": "2022-03-04T16:23:48.641000+10:00"
        },
        {
            "PolicyARN": "arn:aws:autoscaling:<region>:<account-id>:scalingPolicy:<uuid>:resource/keyspaces/table/table-name-scaling-policy",
            "PolicyName": "<policy-name>",
            "ServiceNamespace": "cassandra",
            "ResourceId": "keyspace/keyspace-name/table/table-name",
            "ScalableDimension": "cassandra:table:ReadCapacityUnits",
            "PolicyType": "TargetTrackingScaling",
            "TargetTrackingScalingPolicyConfiguration": {
                "TargetValue": 70.0,
                "PredefinedMetricSpecification": {
                    "PredefinedMetricType": "CassandraReadCapacityUtilization"
                }
            },
            "Alarms": [
                ...
            ],
            "CreationTime": "2022-03-04T16:23:47.820000+10:00"
        }
    ]
}
AWS Management Console
1. Log into the AWS Management Console and open the Amazon Keyspaces console at
https://console.aws.amazon.com/keyspaces/home. Select the appropriate AWS Region if
necessary.
2. On the left navigation bar, select Tables. On the Tables page, select the table's Name.
3. On the Table Details page on the Capacity tab, review your table's Application Auto
Scaling settings.
If your target utilization values are less than or equal to 50%, you should explore your table
utilization metrics to see whether your tables are under-provisioned or over-provisioned.
How to address workloads with seasonal variance
Consider the following scenario: your application operates at a minimum average load most of
the time, but the utilization target is kept low so that your application can react quickly to
events that happen at certain hours of the day, and so that you have enough capacity to avoid
getting throttled. This scenario is common when an application is very busy during normal office
hours (9 AM to 5 PM) but works at a base level after hours. Because some users start to connect
before 9 AM, the application uses this low threshold to ramp up quickly to the required capacity
during peak hours.
This scenario could look like this:
Between 5 PM and 9 AM, the ConsumedWriteCapacityUnits stay between 90 and 100
Users start to connect to the application before 9 AM and the consumed capacity units increase
considerably (the maximum value you’ve seen is 1,500 WCUs)
On average, your application usage varies between 800 and 1,200 WCUs during working hours
If the previous scenario applies to your application, consider using scheduled application auto
scaling, where your table could still have an Application Auto Scaling rule configured, but with a
less aggressive target utilization that only provisions the extra capacity at the specific intervals you
require.
You can use the AWS CLI to execute the following steps to create a scheduled auto scaling rule that
executes based on the time of day and the day of the week.
1. Register your Amazon Keyspaces table as a scalable target with Application Auto Scaling. A
scalable target is a resource that Application Auto Scaling can scale out or in.
aws application-autoscaling register-scalable-target \
--service-namespace cassandra \
--scalable-dimension cassandra:table:WriteCapacityUnits \
--resource-id keyspace/keyspace-name/table/table-name \
--min-capacity 90 \
--max-capacity 1500
2. Set up scheduled actions according to your requirements.
You need two rules to cover the scenario: one to scale up and another to scale down. The first
rule to scale up the scheduled action is shown in the following example.
aws application-autoscaling put-scheduled-action \
--service-namespace cassandra \
--scalable-dimension cassandra:table:WriteCapacityUnits \
--resource-id keyspace/keyspace-name/table/table-name \
--scheduled-action-name my-8-5-scheduled-action \
--scalable-target-action MinCapacity=800,MaxCapacity=1500 \
--schedule "cron(45 8 ? * MON-FRI *)" \
--timezone "Australia/Brisbane"
The second rule to scale down the scheduled action is shown in this example.
aws application-autoscaling put-scheduled-action \
--service-namespace cassandra \
--scalable-dimension cassandra:table:WriteCapacityUnits \
--resource-id keyspace/keyspace-name/table/table-name \
--scheduled-action-name my-5-8-scheduled-down-action \
--scalable-target-action MinCapacity=90,MaxCapacity=1500 \
--schedule "cron(15 17 ? * MON-FRI *)" \
--timezone "Australia/Brisbane"
3. Run the following command to validate both rules have been activated:
aws application-autoscaling describe-scheduled-actions --service-namespace
cassandra
You should get a result like this:
{
"ScheduledActions": [
{
"ScheduledActionName": "my-5-8-scheduled-down-action",
"ScheduledActionARN":
"arn:aws:autoscaling:<region>:<account>:scheduledAction:<uuid>:resource/keyspaces/
table/table-name:scheduledActionName/my-5-8-scheduled-down-action",
"ServiceNamespace": "cassandra",
"Schedule": "cron(15 17 ? * MON-FRI *)",
"Timezone": "Australia/Brisbane",
"ResourceId": "keyspace/keyspace-name/table/table-name",
"ScalableDimension": "cassandra:table:WriteCapacityUnits",
"ScalableTargetAction": {
"MinCapacity": 90,
"MaxCapacity": 1500
},
"CreationTime": "2022-03-15T17:30:25.100000+10:00"
},
{
"ScheduledActionName": "my-8-5-scheduled-action",
"ScheduledActionARN":
"arn:aws:autoscaling:<region>:<account>:scheduledAction:<uuid>:resource/keyspaces/
table/table-name:scheduledActionName/my-8-5-scheduled-action",
"ServiceNamespace": "cassandra",
"Schedule": "cron(45 8 ? * MON-FRI *)",
"Timezone": "Australia/Brisbane",
"ResourceId": "keyspace/keyspace-name/table/table-name",
"ScalableDimension": "cassandra:table:WriteCapacityUnits",
"ScalableTargetAction": {
"MinCapacity": 800,
"MaxCapacity": 1500
},
"CreationTime": "2022-03-15T17:28:57.816000+10:00"
}
]
}
The following picture shows a sample workload that always keeps the 70% target utilization.
Notice how the auto scaling rules are still applying and the throughput is not getting reduced.
Zooming in, we can see there was a spike in the application that crossed the 70% auto scaling
threshold, forcing auto scaling to kick in and provide the extra capacity required for the
table. The scheduled auto scaling action affects the maximum and minimum values, and it's your
responsibility to set them up.
How to address spiky workloads with unknown patterns
In this scenario, the application uses a very low utilization target, because you don’t know the
application patterns yet, and you want to ensure your workload is not experiencing low capacity
throughput errors.
Consider using on-demand capacity mode instead. On-demand tables are perfect for spiky
workloads where you don’t know the traffic patterns. With on-demand capacity mode, you pay per
request for the data reads and writes your application performs on your tables. You do not need to
specify how much read and write throughput you expect your application to perform, as Amazon
Keyspaces instantly accommodates your workloads as they ramp up or down.
How to address workloads with linked applications
In this scenario, the application depends on other systems, like batch processing scenarios where
you can have big spikes in traffic according to events in the application logic.
Consider developing custom application auto scaling logic that reacts to those events, where
you can increase table capacity and TargetValues depending on your specific needs. You
could benefit from Amazon EventBridge and use a combination of AWS services like Lambda and
Step Functions to react to your specific application needs.
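As an illustrative sketch only (using the AWS SDK for Java 2.x; the class and method names and the capacity values are placeholders), such custom logic could adjust the scalable target's capacity bounds ahead of a known batch event and restore them afterwards.

import software.amazon.awssdk.services.applicationautoscaling.ApplicationAutoScalingClient;
import software.amazon.awssdk.services.applicationautoscaling.model.RegisterScalableTargetRequest;
import software.amazon.awssdk.services.applicationautoscaling.model.ScalableDimension;
import software.amazon.awssdk.services.applicationautoscaling.model.ServiceNamespace;

public class BatchCapacityAdjuster {

    // Raise (or lower) the scalable target's write capacity bounds for a table.
    public static void setWriteCapacityBounds(String keyspace, String table,
                                              int minWcu, int maxWcu) {
        try (ApplicationAutoScalingClient client = ApplicationAutoScalingClient.create()) {
            client.registerScalableTarget(RegisterScalableTargetRequest.builder()
                    .serviceNamespace(ServiceNamespace.CASSANDRA)
                    .resourceId("keyspace/" + keyspace + "/table/" + table)
                    .scalableDimension(ScalableDimension.CASSANDRA_TABLE_WRITE_CAPACITY_UNITS)
                    .minCapacity(minWcu)
                    .maxCapacity(maxWcu)
                    .build());
        }
    }
}

For example, an EventBridge rule could invoke a Lambda function that calls setWriteCapacityBounds("keyspace-name", "table-name", 800, 1500) before the batch starts, and another that resets the minimum to 90 afterwards, mirroring the scheduled-action values shown earlier.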
Identify your unused resources to optimize costs in Amazon Keyspaces
This section provides an overview of how to evaluate your unused resources regularly. As your
application requirements evolve, you should ensure no resources are unused and contributing to
unnecessary Amazon Keyspaces costs. The procedures described below use Amazon CloudWatch
metrics to identify unused resources and take action to reduce costs.
You can monitor Amazon Keyspaces using CloudWatch, which collects and processes raw data from
Amazon Keyspaces into readable, near real-time metrics. These statistics are retained for a period
of time, so that you can access historical information to better understand your utilization. By
default, Amazon Keyspaces metric data is sent to CloudWatch automatically. For more information,
see What is Amazon CloudWatch? and Metrics retention in the Amazon CloudWatch User Guide.
Topics
How to identify unused resources
Identifying unused table resources
Cleaning up unused table resources
Cleaning up unused point-in-time recovery (PITR) backups
How to identify unused resources
To identify unused tables, you can look at the following CloudWatch metrics over a period of
30 days to understand whether there are any active reads or writes on a specific table:
ConsumedReadCapacityUnits
The number of read capacity units consumed over the specified time period. You can retrieve the
total consumed read capacity for a table.
ConsumedWriteCapacityUnits
The number of write capacity units consumed over the specified time period. You can retrieve the
total consumed write capacity for a table.
Identifying unused table resources
Amazon CloudWatch is a monitoring and observability service which provides the Amazon
Keyspaces table metrics you can use to identify unused resources. CloudWatch metrics can be
viewed through the AWS Management Console as well as through the AWS Command Line
Interface.
AWS Command Line Interface
To view your table's metrics through the AWS Command Line Interface, you can use the
following commands.
1. First, evaluate your table's reads:
Note
If the table name is not unique within your account, you must also specify the name
of the keyspace.
aws cloudwatch get-metric-statistics --metric-name ConsumedReadCapacityUnits \
  --start-time <start-time> --end-time <end-time> --period <period> \
  --namespace AWS/Cassandra --statistics Sum \
  --dimensions Name=TableName,Value=<table-name>
To avoid falsely identifying a table as unused, evaluate metrics over a longer period.
Choose an appropriate start-time and end-time range, such as 30 days, and an appropriate
period, such as 86400.
In the returned data, any Sum above 0 indicates that the table you are evaluating received
read traffic during that period.
The following result shows a table receiving read traffic in the evaluated period:
{
"Timestamp": "2022-08-25T19:40:00Z",
"Sum": 36023355.0,
"Unit": "Count"
},
{
"Timestamp": "2022-08-12T19:40:00Z",
"Sum": 38025777.5,
"Unit": "Count"
},
The following result shows a table not receiving read traffic in the evaluated period:
{
"Timestamp": "2022-08-01T19:50:00Z",
"Sum": 0.0,
"Unit": "Count"
},
{
"Timestamp": "2022-08-20T19:50:00Z",
"Sum": 0.0,
"Unit": "Count"
},
2. Next, evaluate your table’s writes:
aws cloudwatch get-metric-statistics --metric-name ConsumedWriteCapacityUnits \
  --start-time <start-time> --end-time <end-time> --period <period> \
  --namespace AWS/Cassandra --statistics Sum \
  --dimensions Name=TableName,Value=<table-name>
To avoid falsely identifying a table as unused, evaluate metrics over a longer period. Choose an
appropriate start-time and end-time range, such as 30 days, and an appropriate period, such
as 86400.
In the returned data, any Sum above 0 indicates that the table you are evaluating received
write traffic during that period.
The following result shows a table receiving write traffic in the evaluated period:
{
"Timestamp": "2022-08-19T20:15:00Z",
"Sum": 41014457.0,
"Unit": "Count"
},
{
"Timestamp": "2022-08-18T20:15:00Z",
"Sum": 40048531.0,
"Unit": "Count"
},
The following result shows a table not receiving write traffic in the evaluated period:
{
"Timestamp": "2022-07-31T20:15:00Z",
"Sum": 0.0,
"Unit": "Count"
},
{
"Timestamp": "2022-08-19T20:15:00Z",
"Sum": 0.0,
"Unit": "Count"
},
AWS Management Console
The following steps allow you to evaluate your resource utilization through the AWS
Management Console.
1. Log into the AWS Management Console and navigate to the CloudWatch service page at
https://console.aws.amazon.com/cloudwatch/. Select the appropriate AWS Region in the
top right of the console, if necessary.
2. On the left navigation bar, locate the Metrics section and choose All metrics.
3. The action above opens a dashboard with two panels. In the top panel, you can see
currently graphed metrics. On the bottom you can select the metrics available to graph.
Choose Amazon Keyspaces in the bottom panel.
4. In the Amazon Keyspaces metrics selection panel, choose the Table Metrics category to
show the metrics for your tables in the current region.
5. Identify your table name by scrolling down the menu, then choose the metrics
ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits for your table.
6. Choose the Graphed metrics (2) tab and adjust the Statistic column to Sum.
7. To avoid falsely identifying a table as unused, evaluate the table metrics over a longer
period. At the top of the graph panel, choose an appropriate time frame, such as 1 month,
to evaluate your table. Choose Custom, choose 1 Months in the drop-down menu, and
choose Apply.
8. Evaluate the graphed metrics for your table to determine if it is being used. Metrics that
have gone above 0 indicate that a table has been used during the evaluated time period. A
flat graph at 0 for both read and write indicates that a table is unused.
Cleaning up unused table resources
If you have identified unused table resources, you can reduce their ongoing costs in the following
ways.
Note
If you have identified an unused table but would still like to keep it available in case it
needs to be accessed in the future, consider switching it to on-demand mode. Otherwise,
you can consider deleting the table.
Capacity modes
Amazon Keyspaces charges for reading, writing, and storing data in your Amazon Keyspaces tables.
Amazon Keyspaces has two capacity modes, which come with specific billing options for processing
reads and writes on your tables: on-demand and provisioned. The read/write capacity mode
controls how you are charged for read and write throughput and how you manage capacity.
For on-demand mode tables, you don't need to specify how much read and write throughput you
expect your application to perform. Amazon Keyspaces charges you for the reads and writes that
your application performs on your tables in terms of read request units and write request units.
If there is no activity on your table, you do not pay for throughput but you still incur a storage
charge.
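For example, the following AWS CLI command sketches how you could switch a table to on-demand mode. The keyspace and table names are placeholders.
aws keyspaces update-table --keyspace-name mykeyspace --table-name mytable --capacity-specification throughputMode=PAY_PER_REQUEST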
Deleting tables
If you’ve discovered an unused table and would like to delete it, consider making a backup or exporting the data first.
Backups taken through AWS Backup can leverage cold storage tiering, further reducing cost. Refer
to the Managing backup plans documentation for information on how to use a lifecycle to move
your backup to cold storage.
After your table has been backed up, you may choose to delete it either through the AWS
Management Console or through the AWS Command Line Interface.
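For example, the following AWS CLI command deletes a table. The keyspace and table names are placeholders.
aws keyspaces delete-table --keyspace-name mykeyspace --table-name mytable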
Cleaning up unused point-in-time recovery (PITR) backups
Amazon Keyspaces offers Point-in-time recovery, which provides continuous backups for 35 days
to help you protect against accidental writes or deletes. PITR backups have costs associated with
them.
Refer to the documentation at the section called “Backup and restore with point-in-time recovery”
to determine if your tables have backups enabled that may no longer be needed.
Evaluate your table usage patterns to optimize performance and cost
This section provides an overview of how to evaluate whether you are using your Amazon Keyspaces tables efficiently. Certain usage patterns are not optimal for Amazon Keyspaces and leave room for optimization from both a performance and a cost perspective.
Topics
Perform fewer strongly-consistent read operations
Enable Time to Live (TTL)
Perform fewer strongly-consistent read operations
Amazon Keyspaces allows you to configure read consistency on a per-request basis. Read requests
are eventually consistent by default. Eventually consistent reads are charged at 0.5 RCU for up to 4
KB of data.
Most parts of distributed workloads are flexible and can tolerate eventual consistency. However,
there can be access patterns requiring strongly consistent reads. Strongly consistent reads are
charged at 1 RCU for up to 4 KB of data, essentially doubling your read costs. Amazon Keyspaces
provides you with the flexibility to use both consistency models on the same table.
You can evaluate your workload and application code to confirm if strongly consistent reads are
used only where required.
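For example, in cqlsh you can set the consistency level per session before issuing reads. The following sketch assumes a hypothetical table mykeyspace.mytable with an id column. LOCAL_ONE returns an eventually consistent read, while LOCAL_QUORUM returns a strongly consistent read.
CONSISTENCY LOCAL_ONE;
SELECT * FROM mykeyspace.mytable WHERE id = 1;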
Enable Time to Live (TTL)
Time to Live (TTL) helps you simplify your application logic and optimize the price of storage by
expiring data from tables automatically. Data that you no longer need is automatically deleted
from your table based on the Time to Live value that you set.
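For example, the following CQL statement sketches how you could set a default TTL value of 30 days (2,592,000 seconds) on an existing table. The keyspace and table names are placeholders.
ALTER TABLE mykeyspace.mytable WITH default_time_to_live = 2592000;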
Evaluate your provisioned capacity for right-sized provisioning
This section provides an overview of how to evaluate if you have right-sized provisioning on
your Amazon Keyspaces tables. As your workload evolves, you should modify your operational
procedures appropriately, especially when your Amazon Keyspaces table is configured in
provisioned mode and you have the risk to over-provision or under-provision your tables.
The procedures described in this section require statistical information that should be captured
from the Amazon Keyspaces tables that are supporting your production application. To understand
your application behavior, you should define a period of time that is significant enough to capture
the data seasonality of your application. For example, if your application shows weekly patterns,
using a three week period should give you enough room for analysing application throughput
needs.
If you don’t know where to start, use at least one month’s worth of data usage for the calculations
below.
When evaluating capacity, note that for Amazon Keyspaces tables you can configure Read Capacity Units (RCUs) and Write Capacity Units (WCUs) independently.
Topics
How to retrieve consumption metrics from your Amazon Keyspaces tables
How to identify under-provisioned Amazon Keyspaces tables
How to identify over-provisioned Amazon Keyspaces tables
How to retrieve consumption metrics from your Amazon Keyspaces tables
To evaluate the table capacity, monitor the following CloudWatch metrics and select the
appropriate dimension to retrieve table information:
Read Capacity Units Write Capacity Units
ConsumedReadCapacityUnits ConsumedWriteCapacityUnits
ProvisionedReadCapacityUnits ProvisionedWriteCapacityUnits
ReadThrottleEvents WriteThrottleEvents
You can do this either through the AWS CLI or the AWS Management Console.
AWS CLI
Before you retrieve the table consumption metrics, you need to start by capturing some
historical data points using the CloudWatch API.
Start by creating two files: write-calc.json and read-calc.json. These files represent
the calculations for the table. You need to update some of the fields, as indicated in the table
below, to match your environment.
Note
If the table name is not unique within your account, you must also specify the name of
the keyspace.
<table-name>: The name of the table that you are analysing. Example: SampleTable
<period>: The period of time that you are using to evaluate the utilization target, specified in seconds. Example: for a 1-hour period, specify 3600
<start-time>: The beginning of your evaluation interval, specified in ISO8601 format. Example: 2022-02-21T23:00:00
<end-time>: The end of your evaluation interval, specified in ISO8601 format. Example: 2022-02-22T06:00:00
The write calculations file retrieves the number of WCUs provisioned and consumed in the time period for the date range specified. It also generates a utilization percentage that can be used for analysis. The full content of the write-calc.json file should look like the following example.
{
"MetricDataQueries": [
{
"Id": "provisionedWCU",
"MetricStat": {
"Metric": {
"Namespace": "AWS/Cassandra",
"MetricName": "ProvisionedWriteCapacityUnits",
"Dimensions": [
{
"Name": "TableName",
"Value": "<table-name>"
}
]
},
"Period": <period>,
"Stat": "Average"
},
"Label": "Provisioned",
"ReturnData": false
},
{
"Id": "consumedWCU",
"MetricStat": {
"Metric": {
"Namespace": "AWS/Cassandra",
"MetricName": "ConsumedWriteCapacityUnits",
"Dimensions": [
{
"Name": "TableName",
"Value": "<table-name>""
}
]
},
"Period": <period>,
"Stat": "Sum"
},
"Label": "",
"ReturnData": false
},
{
"Id": "m1",
"Expression": "consumedWCU/PERIOD(consumedWCU)",
"Label": "Consumed WCUs",
"ReturnData": false
},
{
"Id": "utilizationPercentage",
"Expression": "100*(m1/provisionedWCU)",
"Label": "Utilization Percentage",
"ReturnData": true
}
],
"StartTime": "<start-time>",
"EndTime": "<end-time>",
"ScanBy": "TimestampDescending",
"MaxDatapoints": 24
}
The read calculations file uses similar metrics. This file retrieves how many RCUs were provisioned and consumed during the time period for the date range specified. The contents of the read-calc.json file should look like this example.
{
"MetricDataQueries": [
{
"Id": "provisionedRCU",
"MetricStat": {
"Metric": {
"Namespace": "AWS/Cassandra",
"MetricName": "ProvisionedReadCapacityUnits",
"Dimensions": [
{
"Name": "TableName",
"Value": "<table-name>"
}
]
},
"Period": <period>,
"Stat": "Average"
},
"Label": "Provisioned",
"ReturnData": false
},
{
"Id": "consumedRCU",
"MetricStat": {
"Metric": {
"Namespace": "AWS/Cassandra",
"MetricName": "ConsumedReadCapacityUnits",
"Dimensions": [
{
"Name": "TableName",
"Value": "<table-name>"
}
]
},
"Period": <period>,
"Stat": "Sum"
},
"Label": "",
"ReturnData": false
},
{
"Id": "m1",
"Expression": "consumedRCU/PERIOD(consumedRCU)",
"Label": "Consumed RCUs",
"ReturnData": false
},
{
"Id": "utilizationPercentage",
"Expression": "100*(m1/provisionedRCU)",
"Label": "Utilization Percentage",
"ReturnData": true
}
],
"StartTime": "<start-time>",
"EndTime": "<end-time>",
"ScanBy": "TimestampDescending",
"MaxDatapoints": 24
}
Once you've created the files, you can start retrieving utilization data.
1. To retrieve the write utilization data, issue the following command:
aws cloudwatch get-metric-data --cli-input-json file://write-calc.json
2. To retrieve the read utilization data, issue the following command:
aws cloudwatch get-metric-data --cli-input-json file://read-calc.json
The result for both queries is a series of data points in JSON format that can be used for analysis. Your results depend on the number of data points you specified, the period, and your own specific workload data. They could look like the following example.
{
"MetricDataResults": [
{
"Id": "utilizationPercentage",
"Label": "Utilization Percentage",
"Timestamps": [
"2022-02-22T05:00:00+00:00",
"2022-02-22T04:00:00+00:00",
"2022-02-22T03:00:00+00:00",
"2022-02-22T02:00:00+00:00",
"2022-02-22T01:00:00+00:00",
"2022-02-22T00:00:00+00:00",
"2022-02-21T23:00:00+00:00"
],
"Values": [
91.55364583333333,
55.066631944444445,
2.6114930555555556,
24.9496875,
40.94725694444445,
25.61819444444444,
0.0
],
"StatusCode": "Complete"
}
],
"Messages": []
}
Note
If you specify a short period and a long time range, you might need to modify the
MaxDatapoints value, which is by default set to 24 in the script. This represents one
data point per hour and 24 per day.
AWS Management Console
1. Log into the AWS Management Console and navigate to the CloudWatch console at https://console.aws.amazon.com/cloudwatch/. Select the appropriate AWS Region if necessary.
2. Locate the Metrics section on the left navigation bar and choose All metrics.
3. This opens a dashboard with two panels. The top panel shows you the graphic, and the
bottom panel has the metrics that you want to graph. Choose the Amazon Keyspaces
panel.
4. Choose the Table Metrics category from the sub panels. This shows you the tables in your
current AWS Region.
5. Identify your table name by scrolling down the menu and selecting the write operation
metrics: ConsumedWriteCapacityUnits and ProvisionedWriteCapacityUnits
Note
This example talks about write operation metrics, but you can also use these steps
to graph the read operation metrics.
6. Select the Graphed metrics (2) tab to modify the formulas. By default CloudWatch chooses
the statistical function Average for the graphs.
7. With both graphed metrics selected (the check box on the left), choose the Add math menu, followed by Common, and then choose the Percentage function. Repeat this procedure a second time so that you have two Percentage expressions.
8. At this point you should have four metrics in the bottom menu. Let’s work on the
ConsumedWriteCapacityUnits calculation. To be consistent, you need to match the
names with the ones you used in the AWS CLI section. Click on the m1 ID and change this
value to consumedWCU.
9. Change the statistic from Average to Sum. This action automatically creates another metric
called ANOMALY_DETECTION_BAND. For the scope of this procedure, you can ignore this
by removing the checkbox on the newly generated ad1 metric.
10. Repeat step 8 to rename the m2 ID to provisionedWCU. Leave the statistic set to Average.
11. Choose the Expression1 label and update the value to m1 and the label to Consumed
WCUs.
Note
Make sure you have only selected m1 (checkbox on the left) and provisionedWCU
to properly visualize the data. Update the formula by clicking in Details and
changing the formula to consumedWCU/PERIOD(consumedWCU). This step might
also generate another ANOMALY_DETECTION_BAND metric, but for the scope of
this procedure you can ignore it.
12. You should now have two graphics: one that indicates your provisioned WCUs on the table
and another that indicates the consumed WCUs.
13. Update the percentage formula by selecting the Expression2 graphic (e2). Rename the label and ID to utilizationPercentage. Update the formula to match 100*(m1/provisionedWCU).
14. Remove the checkbox from all the metrics except utilizationPercentage to visualize your
utilization patterns. The default interval is set to 1 minute, but feel free to modify it as
needed.
The results you get depend on the actual data from your workload. Intervals with more than
100% utilization are prone to low throughput capacity error events. Amazon Keyspaces
offers burst capacity, but as soon as the burst capacity is exhausted, anything above 100%
experiences low throughput capacity error events.
How to identify under-provisioned Amazon Keyspaces tables
For most workloads, a table is considered under-provisioned when it constantly consumes more
than 80% of its provisioned capacity.
Burst capacity is an Amazon Keyspaces feature that allows customers to temporarily consume more RCUs/WCUs than originally provisioned (more than the per-second provisioned throughput that was defined for the table). Burst capacity was created to absorb sudden increases in traffic due to special events or usage spikes. This burst capacity is limited; for more information, see the section called “Use burst capacity”. As soon as the unused RCUs and WCUs are depleted, you can experience low capacity throughput error events if you try to consume more capacity than provisioned. When your application traffic is getting close to the 80% utilization rate, your risk of experiencing low capacity throughput error events is significantly higher.
The 80% utilization rate rule varies with the seasonality of your data and your traffic growth. Consider the following scenarios:
If your traffic has been stable at ~90% utilization rate for the last 12 months, your table has just the right capacity
If your application traffic is growing at a rate of 8% monthly, you will arrive at 100% utilization in less than 3 months
If your application traffic is growing at a rate of 5% monthly, you will still arrive at 100% utilization in a little more than 4 months
The results from the queries above provide a picture of your utilization rate. Use them as a guide to further evaluate other metrics that can help you decide whether to increase your table capacity (for example, a monthly or weekly growth rate). Work with your operations team to define what a good percentage is for your workload and your tables.
There are special scenarios where the data is skewed when you analyse it on a daily or weekly
basis. For example, with seasonal applications that have spikes in usage during working hours (but
then drop to almost zero outside of working hours), you could benefit from scheduling application
auto-scaling, where you specify the hours of the day (and the days of the week) to increase
the provisioned capacity, as well as when to reduce it. Instead of aiming for higher capacity so
you can cover the busy hours, you can also benefit from Amazon Keyspaces table auto-scaling
configurations if your seasonality is less pronounced.
How to identify over-provisioned Amazon Keyspaces tables
The query results obtained from the scripts above provide the data points required to perform
some initial analysis. If your data set presents values lower than 20% utilization for several
intervals, your table might be over-provisioned. To determine whether you need to reduce the number of WCUs and RCUs, revisit the other readings in the intervals.
When your table contains several low usage intervals, you can benefit from using Application Auto
Scaling policies, either by scheduling Application Auto Scaling or by just configuring the default
Application Auto Scaling policies for the table that are based on utilization.
If you have a workload with a low utilization rate but a high throttle ratio (Max(ThrottleEvents)/Min(ThrottleEvents) in the interval), you might be dealing with a very spiky workload where traffic increases significantly on specific days (or times of day), but is otherwise consistently low. In these scenarios, it might be beneficial to use scheduled Application Auto Scaling, as shown in the following sketch.
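As a sketch, the following AWS CLI command creates a scheduled Application Auto Scaling action that raises the provisioned write capacity of a table at the start of the working day. It assumes the table is already registered as a scalable target, and all names, times, and capacity values are placeholders.
aws application-autoscaling put-scheduled-action --service-namespace cassandra --resource-id keyspace/mykeyspace/table/mytable --scalable-dimension cassandra:table:WriteCapacityUnits --scheduled-action-name scale-up-working-hours --schedule "cron(0 8 ? * MON-FRI *)" --scalable-target-action MinCapacity=200,MaxCapacity=1000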
Troubleshooting Amazon Keyspaces (for Apache
Cassandra)
This guide covers troubleshooting steps for various scenarios when working with Amazon
Keyspaces (for Apache Cassandra). It includes information on resolving general errors, connection
issues, capacity management problems, and Data Definition Language (DDL) errors.
General errors
Troubleshooting top-level exceptions like NoNodeAvailableException,
NoHostAvailableException, and AllNodesFailedException.
Isolating underlying errors from Java driver exceptions.
Implementing retry policies and configuring connections correctly.
Connection issues
Resolving errors when connecting to Amazon Keyspaces endpoints using cqlsh or Apache
Cassandra client drivers.
Troubleshooting VPC endpoint connections, Cassandra-stress connections, and IAM
configuration errors.
Handling connection losses during data imports.
Capacity management errors
Recognizing and resolving insufficient capacity errors related to tables, partitions, and
connections.
Monitoring relevant Amazon Keyspaces metrics in Amazon CloudWatch.
Optimizing connections and throughput for improved performance.
Data Definition Language (DDL) errors
Troubleshooting errors when creating, accessing, or restoring keyspaces and tables.
Handling failures related to custom Time to Live (TTL) settings, column limits, and range
deletes.
Considerations for heavy delete workloads.
For troubleshooting guidance specific to IAM access, see the section called “Troubleshooting”. For
more information about security best practices, see the section called “Security best practices”.
Topics
Troubleshooting general errors in Amazon Keyspaces
Troubleshooting connection errors in Amazon Keyspaces
Troubleshooting capacity management errors in Amazon Keyspaces
Troubleshooting data definition language errors in Amazon Keyspaces
Troubleshooting general errors in Amazon Keyspaces
Getting general errors? Here are some common issues and how to resolve them.
General errors
You're getting one of the following top-level exceptions that can occur for many different reasons.
NoNodeAvailableException
NoHostAvailableException
AllNodesFailedException
These exceptions are generated by the client driver and can occur either when you're establishing
the control connection or when you're performing read/write/prepare/execute/batch requests.
When the error occurs while you're establishing the control connection, it's a sign that all the
contact points specified in your application are unreachable. When the error occurs while
performing read/write/prepare/execute queries, it indicates that all of the retries for that request
have been exhausted. Each retry is attempted on a different node when you're using the default
retry policy.
How to isolate the underlying error from top-level Java driver exceptions
These general errors can be caused either by connection issues or when performing read/write/
prepare/execute operations. Transient failures have to be expected in distributed systems, and
should be handled by retrying the request. The Java driver doesn't automatically retry when
connection errors are encountered, so it's recommended to implement the retry policy when
establishing the driver connection in your application. For a detailed overview of connection best
practices, see the section called “Connections”.
By default, the Java driver sets idempotence to false for all requests, which means the Java driver doesn't automatically retry failed read/write/prepare requests. To set idempotence to true and tell the driver to retry failed requests, you can do so in a few different ways. Here's one example of how you can set idempotence programmatically for a single request in your Java application.
Statement s = new SimpleStatement("SELECT * FROM my_table WHERE id = 1");
s.setIdempotent(true);
Or you can set the default idempotence for your entire Java application programmatically as shown in the following example.
// Make all statements idempotent by default:
cluster.getConfiguration().getQueryOptions().setDefaultIdempotence(true);
Alternatively, with the 4.x driver you can set the default idempotence for your application in the driver configuration file, as shown in the following example.
// Set the default idempotence to true in your driver configuration
basic.request.default-idempotence = true
Another recommendation is to create a retry policy at the application level. In this case, the
application needs to catch the NoNodeAvailableException and retry the request. We
recommend 10 retries with exponential backoff starting at 10ms and working up to 100ms with a
total time of 1 second for all retries.
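The following Java sketch illustrates this pattern with the 4.x driver. The session and statement are assumed to already exist in your application, and the bounds match the recommendation above.
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.NoNodeAvailableException;
import com.datastax.oss.driver.api.core.cql.Statement;

// Retries a request up to 10 times with exponential backoff,
// starting at 10 ms and capped at 100 ms per attempt.
static void executeWithRetries(CqlSession session, Statement<?> statement)
        throws InterruptedException {
    long backoffMs = 10;
    for (int attempt = 1; attempt <= 10; attempt++) {
        try {
            session.execute(statement);
            return;
        } catch (NoNodeAvailableException e) {
            if (attempt == 10) {
                throw e; // all retries are exhausted
            }
            Thread.sleep(backoffMs);
            backoffMs = Math.min(backoffMs * 2, 100);
        }
    }
}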
Another option is to apply the Amazon Keyspaces exponential retry policy, available on GitHub, when establishing the Java driver connection.
Confirm that you have established connections to more than one node when using the default
retry policy. You can do so using the following query in Amazon Keyspaces.
SELECT * FROM system.peers;
If the response for this query is empty, this indicates that you're working with a single node for
Amazon Keyspaces. If you're using the default retry policy, there will be no retries because the
default retry always occurs on a different node. To learn more about establishing connections over
VPC endpoints, see the section called “VPC endpoint connections”.
For a step-by-step tutorial that shows how to establish a connection to Amazon Keyspaces using
the Datastax 4.x Cassandra driver, see the section called “Authentication plugin for Java 4.x”.
Troubleshooting connection errors in Amazon Keyspaces
Having trouble connecting? Here are some common issues and how to resolve them.
Errors connecting to an Amazon Keyspaces endpoint
Failed connections and connection errors can result in different error messages. The following
section covers the most common scenarios.
Topics
I can't connect to Amazon Keyspaces with cqlsh
I can't connect to Amazon Keyspaces using a Cassandra client driver
I can't connect to Amazon Keyspaces with cqlsh
You're trying to connect to an Amazon Keyspaces endpoint using cqlsh and the connection fails
with a Connection error.
If you try to connect to an Amazon Keyspaces table and cqlsh hasn't been configured properly,
the connection fails. The following section provides examples of the most common configuration
issues that result in connection errors when you're trying to establish a connection using cqlsh.
Note
If you're trying to connect to Amazon Keyspaces from a VPC, additional permissions are
required. To successfully configure a connection using VPC endpoints, follow the steps in
the section called “Connecting with VPC endpoints”.
You're trying to connect to Amazon Keyspaces using cqlsh, but you get a connection timed
out error.
This might be the case if you didn't supply the correct port, which results in the following error.
# cqlsh cassandra.us-east-1.amazonaws.com 9140 -u "USERNAME" -p "PASSWORD" --ssl
Connection error: ('Unable to connect to any servers', {'3.234.248.199': error(None,
"Tried connecting to [('3.234.248.199', 9140)]. Last error: timed out")})
To resolve this issue, verify that you're using port 9142 for the connection.
You're trying to connect to Amazon Keyspaces using cqlsh, but you get a Name or service
not known error.
This might be the case if you used an endpoint that is misspelled or doesn't exist. In the following
example, the name of the endpoint is misspelled.
# cqlsh cassandra.us-east-1.amazon.com 9142 -u "USERNAME" -p "PASSWORD" --ssl
Traceback (most recent call last):
File "/usr/bin/cqlsh.py", line 2458, in >module>
main(*read_options(sys.argv[1:], os.environ))
File "/usr/bin/cqlsh.py", line 2436, in main
encoding=options.encoding)
File "/usr/bin/cqlsh.py", line 484, in __init__
load_balancing_policy=WhiteListRoundRobinPolicy([self.hostname]),
File "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.11.0-bb96859b.zip/
cassandra-driver-3.11.0-bb96859b/cassandra/policies.py", line 417, in __init__
socket.gaierror: [Errno -2] Name or service not known
To resolve this issue when you're using public endpoints to connect, select an available endpoint
from the section called “Service endpoints”, and verify that the name of the endpoint doesn't have
any errors. If you're using VPC endpoints to connect, verify that the VPC endpoint information is
correct in your cqlsh configuration.
You're trying to connect to Amazon Keyspaces using cqlsh, but you receive an
OperationTimedOut error.
Amazon Keyspaces requires that SSL is enabled for connections to ensure strong security. The SSL
parameter might be missing if you receive the following error.
# cqlsh cassandra.us-east-1.amazonaws.com -u "USERNAME" -p "PASSWORD"
Connection error: ('Unable to connect to any servers', {'3.234.248.192':
OperationTimedOut('errors=Timed out creating connection (5 seconds),
last_host=None',)})
#
To resolve this issue, add the following flag to the cqlsh connection command.
--ssl
You're trying to connect to Amazon Keyspaces using cqlsh, and you receive a SSL transport
factory requires a valid certfile to be specified error.
In this case, the path to the SSL/TLS certificate is missing, which results in the following error.
# cat .cassandra/cqlshrc
[connection]
port = 9142
factory = cqlshlib.ssl.ssl_transport_factory
#
# cqlsh cassandra.us-east-1.amazonaws.com -u "USERNAME" -p "PASSWORD" --ssl
Validation is enabled; SSL transport factory requires a valid certfile to be specified.
Please provide path to the certfile in [ssl] section as 'certfile' option in /
root/.cassandra/cqlshrc (or use [certfiles] section) or set SSL_CERTFILE environment
variable.
#
To resolve this issue, add the path to the certfile on your computer.
certfile = path_to_file/sf-class2-root.crt
You're trying to connect to Amazon Keyspaces using cqlsh, but you receive a No such file or
directory error.
This might be the case if the path to the certificate file on your computer is wrong, which results in
the following error.
# cat .cassandra/cqlshrc
[connection]
port = 9142
factory = cqlshlib.ssl.ssl_transport_factory
[ssl]
validate = true
certfile = /root/wrong_path/sf-class2-root.crt
#
# cqlsh cassandra.us-east-1.amazonaws.com -u "USERNAME" -p "PASSWORD" --ssl
Connection error: ('Unable to connect to any servers', {'3.234.248.192': IOError(2, 'No
such file or directory')})
#
To resolve this issue, verify that the path to the certfile on your computer is correct.
You're trying to connect to Amazon Keyspaces using cqlsh, but you receive a [X509] PEM lib
error.
This might be the case if the SSL/TLS certificate file sf-class2-root.crt is not valid, which
results in the following error.
# cqlsh cassandra.us-east-1.amazonaws.com -u "USERNAME" -p "PASSWORD" --ssl
Connection error: ('Unable to connect to any servers', {'3.234.248.241':
error(185090057, u"Tried connecting to [('3.234.248.241', 9142)]. Last error: [X509]
PEM lib (_ssl.c:3063)")})
#
To resolve this issue, download the Starfield digital certificate using the following command. Save
sf-class2-root.crt locally or in your home directory.
curl https://certs.secureserver.net/repository/sf-class2-root.crt -O
You're trying to connect to Amazon Keyspaces using cqlsh, but you receive an unknown SSL
error.
This might be the case if the SSL/TLS certificate file sf-class2-root.crt is empty, which results
in the following error.
# cqlsh cassandra.us-east-1.amazonaws.com -u "USERNAME" -p "PASSWORD" --ssl
Connection error: ('Unable to connect to any servers', {'3.234.248.220': error(0,
u"Tried connecting to [('3.234.248.220', 9142)]. Last error: unknown error
(_ssl.c:3063)")})
#
To resolve this issue, download the Starfield digital certificate using the following command. Save
sf-class2-root.crt locally or in your home directory.
curl https://certs.secureserver.net/repository/sf-class2-root.crt -O
You're trying to connect to Amazon Keyspaces using cqlsh, but you receive a SSL:
CERTIFICATE_VERIFY_FAILED error.
This might be the case if the SSL/TLS certificate file could not be verified, which results in the
following error.
Connection error: ('Unable to connect to any servers', {'3.234.248.223':
error(1, u"Tried connecting to [('3.234.248.223', 9142)]. Last error: [SSL:
CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:727)")})
To resolve this issue, download the certificate file again using the following command. Save sf-
class2-root.crt locally or in your home directory.
curl https://certs.secureserver.net/repository/sf-class2-root.crt -O
You're trying to connect to Amazon Keyspaces using cqlsh, but you're receiving a Last error:
timed out error.
This might be the case if you didn't configure an outbound rule for Amazon Keyspaces in your
Amazon EC2 security group, which results in the following error.
# cqlsh cassandra.us-east-1.amazonaws.com 9142 -u "USERNAME" -p "PASSWORD" --ssl
Connection error: ('Unable to connect to any servers', {'3.234.248.206': error(None,
"Tried connecting to [('3.234.248.206', 9142)]. Last error: timed out")})
#
To confirm that this issue is caused by the configuration of the Amazon EC2 instance and not
cqlsh, you can try to connect to your keyspace using the AWS CLI, for example with the following
command.
aws keyspaces list-tables --keyspace-name 'my_keyspace'
If this command also times out, the Amazon EC2 instance is not correctly configured.
To confirm that you have sufficient permissions to access Amazon Keyspaces, you can use the AWS CloudShell to connect with cqlsh. If that connection gets established, you need to configure the Amazon EC2 instance.
To resolve this issue, confirm that your Amazon EC2 instance has an outbound rule that allows
traffic to Amazon Keyspaces. If that is not the case, you need to create a new security group for
the EC2 instance, and add a rule that allows outbound traffic to Amazon Keyspaces resources. To
update the outbound rule to allow traffic to Amazon Keyspaces, choose CQLSH/CASSANDRA from
the Type drop-down menu.
After creating the new security group with the outbound traffic rule, you need to add it to the
instance. Select the instance and then choose Actions, then Security, and then Change security
groups. Add the new security group with the outbound rule, but make sure that the default group
also remains available.
For more information about how to view and edit EC2 outbound rules, see Add rules to a security
group in the Amazon EC2 User Guide.
You're trying to connect to Amazon Keyspaces using cqlsh, but you receive an Unauthorized
error.
This might be the case if you're missing Amazon Keyspaces permissions in the IAM user policy,
which results in the following error.
# cqlsh cassandra.us-east-1.amazonaws.com 9142 -u "testuser-at-12345678910" -p
"PASSWORD" --ssl
Connection error: ('Unable to connect to any servers', {'3.234.248.241':
AuthenticationFailed('Failed to authenticate to 3.234.248.241: Error from server:
code=2100 [Unauthorized] message="User arn:aws:iam::12345678910:user/testuser has no
permissions."',)})
#
To resolve this issue, ensure that the IAM user testuser-at-12345678910 has permissions to
access Amazon Keyspaces. For examples of IAM policies that grant access to Amazon Keyspaces, see
the section called “Identity-based policy examples”.
For troubleshooting guidance that's specific to IAM access, see the section called
“Troubleshooting”.
You're trying to connect to Amazon Keyspaces using cqlsh, but you receive a Bad credentials
error.
This might be the case if the user name or password is wrong, which results in the following error.
# cqlsh cassandra.us-east-1.amazonaws.com 9142 -u "USERNAME" -p "PASSWORD" --ssl
Connection error: ('Unable to connect to any servers', {'3.234.248.248':
AuthenticationFailed('Failed to authenticate to 3.234.248.248: Error from server:
code=0100 [Bad credentials] message="Provided username USERNAME and/or password are
incorrect"',)})
#
To resolve this issue, verify that the USERNAME and PASSWORD in your code match the user name
and password you obtained when you generated service-specific credentials.
Important
If you continue to see errors when trying to connect with cqlsh, rerun the command with
the --debug option and include the detailed output when contacting AWS Support.
I can't connect to Amazon Keyspaces using a Cassandra client driver
The following sections shows the most common errors when connecting with a Cassandra client
driver.
You're trying to connect to an Amazon Keyspaces table using the DataStax Java driver, but you receive a NodeUnavailableException error.
If the connection on which the request is attempted is broken, it results in the following error.
[com.datastax.oss.driver.api.core.NodeUnavailableException: No connection
was available to Node(endPoint=vpce-22ff22f2f22222fff-aa1bb234.cassandra.us-
west-2.vpce.amazonaws.com/11.1.1111.222:9142, hostId=1a23456b-
c77d-8888-9d99-146cb22d6ef6, hashCode=123ca4567)]
To resolve this issue, find the heartbeat value and lower it to 30 seconds if it's higher.
advanced.heartbeat.interval = 30 seconds
Then look for the associated timeout and ensure the value is set to at least 5 seconds.
advanced.connection.init-query-timeout = 5 seconds
You're trying to connect to an Amazon Keyspaces table using a driver and the SigV4 plugin, but
you receive an AttributeError error.
If the credentials are not correctly configured, it results in the following error.
cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers',
{'44.234.22.154:9142': AttributeError("'NoneType' object has no attribute
'access_key'")})
To resolve this issue, verify that you're passing the credentials associated with your IAM user or role
when using the SigV4 plugin. The SigV4 plugin requires the following credentials.
AWS_ACCESS_KEY_ID – Specifies an AWS access key associated with an IAM user or role.
AWS_SECRET_ACCESS_KEY – Specifies the secret key associated with the access key. This is essentially the "password" for the access key.
To learn more about access keys and the SigV4 plugin, see the section called “Create IAM
credentials for AWS authentication”.
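For example, one common way to make these credentials available to the SigV4 plugin is through environment variables. The values shown are the placeholder keys used in AWS documentation, not real credentials.
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFXEMI/K7MDENG/bPxRfiCYEXAMPLEKEY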
You're trying to connect to an Amazon Keyspaces table using a driver, but you receive a
PartialCredentialsError error.
If the AWS_SECRET_ACCESS_KEY is missing, it can result in the following error.
cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers',
{'44.234.22.153:9142':
PartialCredentialsError('Partial credentials found in config-file, missing:
aws_secret_access_key')})
To resolve this issue, verify that you're passing both the AWS_ACCESS_KEY_ID and the
AWS_SECRET_ACCESS_KEY when using the SigV4 plugin. To learn more about access keys and the
SigV4 plugin, see the section called “Create IAM credentials for AWS authentication”.
You're trying to connect to an Amazon Keyspaces table using a driver, but you receive an
Invalid signature error.
This might be the case if you used the wrong credentials, which results in the following error.
cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers',
{'44.234.22.134:9142':
AuthenticationFailed('Failed to authenticate to 44.234.22.134:9142: Error from server:
code=0100
[Bad credentials] message="Authentication failure: Invalid signature"')})
To resolve this issue, verify that the credentials you're passing are associated with the IAM user or
role that you configured to access Amazon Keyspaces. To learn more about access keys and the
SigV4 plugin, see the section called “Create IAM credentials for AWS authentication”.
My VPC endpoint connection doesn't work properly
You're trying to connect to Amazon Keyspaces using VPC endpoints, but you're receiving token
map errors or you are experiencing low throughput.
This might be the case if the VPC endpoint connection isn't correctly configured.
To resolve these issues, verify the following configuration details. For a step-by-step tutorial that shows how to configure a connection over interface VPC endpoints for Amazon Keyspaces, see the section called “Connecting with VPC endpoints”.
1. Confirm that the IAM entity used to connect to Amazon Keyspaces has read/write access to the
user table and read access to the system tables as shown in the following example.
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"cassandra:Select",
"cassandra:Modify"
],
"Resource":[
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/table/
mytable",
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
]
}
]
}
2. Confirm that the IAM entity used to connect to Amazon Keyspaces has the required read
permissions to access the VPC endpoint information on your Amazon EC2 instance as shown in
the following example.
{
"Version":"2012-10-17",
"Statement":[
{
"Sid":"ListVPCEndpoints",
"Effect":"Allow",
"Action":[
"ec2:DescribeNetworkInterfaces",
"ec2:DescribeVpcEndpoints"
],
"Resource":"*"
}
]
}
Note
The managed policies AmazonKeyspacesReadOnlyAccess_v2 and
AmazonKeyspacesFullAccess include the required permissions to let Amazon
Keyspaces access the Amazon EC2 instance to read information about available
interface VPC endpoints.
For more information about VPC endpoints, see the section called “Using interface VPC endpoints for Amazon Keyspaces”.
3. Confirm that the SSL configuration of the Java driver sets hostname validation to false as
shown in this example.
hostname-validation = false
For more information about driver configuration, see the section called “Step 2: Configure the
driver”.
4. To confirm that the VPC endpoint has been configured correctly, you can run the following
statement from within your VPC.
Note
You can't use your local developer environment or the Amazon Keyspaces CQL editor
to confirm this configuration, because they use the public endpoint.
SELECT peer FROM system.peers;
The output should look similar to this example and return between 2 and 6 nodes with private IP addresses, depending on your VPC setup and AWS Region.
peer
---------------
192.0.2.15
192.0.2.24
192.0.2.13
192.0.2.7
192.0.2.8
(5 rows)
I can't connect using cassandra-stress
You're trying to connect to Amazon Keyspaces using the cassandra-stress command, but
you're receiving an SSL context error.
This happens if you try to connect to Amazon Keyspaces, but you don't have the trustStore setup
correctly. Amazon Keyspaces requires the use of Transport Layer Security (TLS) to help secure
connections with clients.
In this case, you see the following error.
Error creating the initializing the SSL Context
To resolve this issue, follow the instructions to set up a trustStore as shown in the section called “Before you begin”.
Once the trustStore is set up, you should be able to connect with the following command.
./cassandra-stress user profile=./profile.yaml n=100 "ops(insert=1,select=1)"
cl=LOCAL_QUORUM -node "cassandra.eu-north-1.amazonaws.com" -port native=9142
-transport ssl-alg="PKIX" truststore="./cassandra_truststore.jks" truststore-
password="trustStore_pw" -mode native cql3 user="user_name" password="password"
I can't connect using IAM identities
You're trying to connect to an Amazon Keyspaces table using an IAM identity, but you're
receiving an Unauthorized error.
This happens if you try to connect to an Amazon Keyspaces table using an IAM identity (for
example, an IAM user) without implementing the policy and giving the user the required
permissions first.
In this case, you see the following error.
Connection error: ('Unable to connect to any servers', {'3.234.248.202':
AuthenticationFailed('Failed to authenticate to 3.234.248.202:
Error from server: code=2100 [Unauthorized] message="User
arn:aws:iam::1234567890123:user/testuser has no permissions."',)})
To resolve this issue, verify the permissions of the IAM user. To connect with a standard driver, a
user must have at least SELECT access to the system tables, because most drivers read the system
keyspaces/tables when they establish the connection.
For example IAM policies that grant access to Amazon Keyspaces system and user tables, see the section called “Accessing Amazon Keyspaces tables”.
To review the troubleshooting section specific to IAM, see the section called “Troubleshooting”.
I'm trying to import data with cqlsh and the connection to my Amazon Keyspaces table is lost
You're trying to upload data to Amazon Keyspaces with cqlsh, but you're receiving connection
errors.
The connection to Amazon Keyspaces fails after the cqlsh client receives three consecutive errors of
any type from the server. The cqlsh client fails with the following message.
Failed to import 1 rows: NoHostAvailable - , will retry later, attempt 3 of 100
To resolve this error, you need to make sure that the data to be imported matches the table
schema in Amazon Keyspaces. Review the import file for parsing errors. You can try using a single
row of data by using an INSERT statement to isolate the error.
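For example, a single-row INSERT statement like the following, shown here with a hypothetical table and columns, can help you confirm whether the data format matches the table schema.
INSERT INTO mykeyspace.mytable (id, name) VALUES (1, 'test');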
The client automatically attempts to reestablish the connection.
Troubleshooting capacity management errors in Amazon
Keyspaces
Having trouble with serverless capacity? Here are some common issues and how to resolve them.
Serverless capacity errors
This section outlines how to recognize errors related to serverless capacity management and how
to resolve them. For example, you might observe insufficient capacity events when your application
exceeds your provisioned throughput capacity.
Because Apache Cassandra is cluster-based software that is designed to run on a fleet of nodes,
it doesn’t have exception messages related to serverless features such as throughput capacity.
Most drivers only understand the error codes that are available in Apache Cassandra, so Amazon
Keyspaces uses that same set of error codes to maintain compatibility.
To map Cassandra errors to the underlying capacity events, you can use Amazon CloudWatch to
monitor the relevant Amazon Keyspaces metrics. Insufficient-capacity events that result in client-
side errors can be categorized into these three groups based on the resource that is causing the
event:
Table – If you choose Provisioned capacity mode for a table, and your application exceeds your
provisioned throughput, you might observe insufficient-capacity errors. For more information,
see the section called “Configure read/write capacity modes”.
Partition – You might experience insufficient-capacity events if traffic against a given partition
exceeds 3,000 RCUs or 1,000 WCUs. We recommend distributing traffic uniformly across
partitions as a best practice. For more information, see the section called “Data modeling”.
Connection – You might experience insufficient throughput if you exceed the quota for the
maximum number of operations per second, per connection. To increase throughput, you can
increase the number of default connections when configuring the connection with the driver.
To learn how to configure connections for Amazon Keyspaces, see the section called “How to
configure connections”. For more information about optimizing connections over VPC endpoints,
see the section called “VPC endpoint connections”.
To determine which resource is causing the insufficient-capacity event that is returning the client-
side error, you can check the dashboard in the Amazon Keyspaces console. By default, the console
provides an aggregated view of the most common capacity and traffic related CloudWatch metrics
in the Capacity and related metrics section on the Capacity tab for the table.
To create your own dashboard using Amazon CloudWatch, check the following Amazon Keyspaces
metrics.
PerConnectionRequestRateExceeded – Requests to Amazon Keyspaces that exceed the
quota for the per-connection request rate. Each client connection to Amazon Keyspaces can
support up to 3000 CQL requests per second. You can perform more than 3000 requests per
second by creating multiple connections.
ReadThrottleEvents – Requests to Amazon Keyspaces that exceed the read capacity for a
table.
StoragePartitionThroughputCapacityExceeded – Requests to an Amazon Keyspaces
storage partition that exceed the throughput capacity of the partition. Amazon Keyspaces
storage partitions can support up to 1000 WCU/WRU per second and 3000 RCU/RRU per second.
To mitigate these exceptions, we recommend that you review your data model to distribute
read/write traffic across more partitions.
WriteThrottleEvents – Requests to Amazon Keyspaces that exceed the write capacity for a
table.
To learn more about CloudWatch, see the section called “Monitoring with CloudWatch”. For a list
of all available CloudWatch metrics for Amazon Keyspaces, see the section called “Metrics and
dimensions”.
Note
To get started with a custom dashboard that shows all commonly observed metrics for
Amazon Keyspaces, you can use a prebuilt CloudWatch template available on GitHub in the
AWS samples repository.
Topics
I'm receiving NoHostAvailable insufficient capacity errors from my client driver
I'm receiving write timeout errors during data import
I can't see the actual storage size of a keyspace or table
I'm receiving NoHostAvailable insufficient capacity errors from my client driver
You're seeing Read_Timeout or Write_Timeout exceptions for a table.
Repeatedly trying to write to or read from an Amazon Keyspaces table with insufficient capacity
can result in client-side errors that are specific to the driver.
Use CloudWatch to monitor your provisioned and actual throughput metrics, and insufficient
capacity events for the table. For example, a read request that doesn’t have enough throughput
capacity fails with a Read_Timeout exception and is posted to the ReadThrottleEvents metric.
A write request that doesn’t have enough throughput capacity fails with a Write_Timeout
exception and is posted to the WriteThrottleEvents metric. For more information about these
metrics, see the section called “Metrics and dimensions”.
To resolve these issues, consider one of the following options.
Increase the provisioned throughput for the table, which is the maximum amount of throughput
capacity an application can consume. For more information, see the section called “Read capacity
units and write capacity units”.
Let the service manage throughput capacity on your behalf with automatic scaling. For more
information, see the section called “Manage throughput capacity with auto scaling”.
Choose On-demand capacity mode for the table. For more information, see the section called “Configure on-demand capacity mode”.
If you need to increase the default capacity quota for your account, see Quotas.
You're seeing errors related to exceeded partition capacity.
When you see the error StoragePartitionThroughputCapacityExceeded, the partition capacity is temporarily exceeded. This might be automatically handled by adaptive capacity or on-demand capacity. We recommend reviewing your data model to distribute read/write traffic across
more partitions to mitigate these errors. Amazon Keyspaces storage partitions can support up to
1000 WCU/WRU per second and 3000 RCU/RRU per second. To learn more about how to improve
your data model to distribute read/write traffic across more partitions, see the section called “Data
modeling”.
Write_Timeout exceptions can also be caused by an elevated rate of concurrent write operations
that include static and nonstatic data in the same logical partition. If traffic is expected to run
multiple concurrent write operations that include static and nonstatic data within the same logical
partition, we recommend writing static and nonstatic data separately. Writing the data separately
also helps to optimize the throughput costs.
You're seeing errors related to exceeded connection request rate.
You're seeing PerConnectionRequestRateExceeded due to one of the following causes.
You might not have enough connections configured per session.
You might be getting fewer connections than available peers, because you don't have the VPC endpoint permissions configured correctly. For more information about VPC endpoint policies, see the section called “Using interface VPC endpoints for Amazon Keyspaces”.
If you're using a 4.x driver, check to see if you have hostname validation enabled. The driver
enables TLS hostname verification by default. This configuration leads to Amazon Keyspaces
appearing as a single-node cluster to the driver. We recommend that you turn hostname
verification off.
We recommend that you follow these best practices to ensure that your connections and
throughput are optimized:
Configure CQL query throughput tuning.
Amazon Keyspaces supports up to 3,000 CQL queries per TCP connection per second, but there is
no limit on the number of connections a driver can establish.
Most open-source Cassandra drivers establish a connection pool to Cassandra and load balance
queries over that pool of connections. Amazon Keyspaces exposes 9 peer IP addresses to drivers.
The default behavior of most drivers is to establish a single connection to each peer IP address.
Therefore, the maximum CQL query throughput of a driver using the default settings will be
27,000 CQL queries per second.
To increase this number, we recommend that you increase the number of connections per IP
address that your driver is maintaining in its connection pool. For example, setting the maximum
connections per IP address to 2 will double the maximum throughput of your driver to 54,000
CQL queries per second.
Optimize your single-node connections.
By default, most open-source Cassandra drivers establish one or more connections to every IP
address advertised in the system.peers table when establishing a session. However, certain
configurations can lead to a driver connecting to a single Amazon Keyspaces IP address. This
can happen if the driver is attempting SSL hostname validation of the peer nodes (for example,
DataStax Java drivers), or when it's connecting through a VPC endpoint.
To get the same availability and performance as a driver with connections to multiple IP
addresses, we recommend that you do the following:
Increase the number of connections per IP to 9 or higher depending on the desired client
throughput.
Create a custom retry policy that ensures that retries are run against the same node.
If you use VPC endpoints, grant the IAM entity that is used to connect to Amazon Keyspaces access permissions to query your VPC for the endpoint and network interface information. This improves load balancing and increases read/write throughput. For more information, see the section called “Using interface VPC endpoints for Amazon Keyspaces”.
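For example, to apply the CQL query throughput tuning recommendation above with the DataStax Java 4.x driver, you could raise the connection pool size in the driver configuration file. This is a sketch; the right value depends on your target throughput.
advanced.connection.pool.local.size = 2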
I'm receiving write timeout errors during data import
You're receiving a timeout error when uploading data using the cqlsh COPY command.
Failed to import 1 rows: WriteTimeout - Error from server: code=1100 [Coordinator node
timed out waiting for replica nodes' responses]
message="Operation timed out - received only 0 responses." info={'received_responses':
0, 'required_responses': 2, 'write_type': 'SIMPLE', 'consistency':
'LOCAL_QUORUM'}, will retry later, attempt 1 of 100
Amazon Keyspaces uses the ReadTimeout and WriteTimeout exceptions to indicate when a
write request fails due to insufficient throughput capacity. To help diagnose insufficient capacity
exceptions, Amazon Keyspaces publishes the following metrics in Amazon CloudWatch.
WriteThrottleEvents
ReadThrottleEvents
StoragePartitionThroughputCapacityExceeded
To resolve insufficient-capacity errors during a data load, lower the write rate per worker or the total ingest rate, and then retry uploading the rows. For more information, see the section called
“Step 4: Configure cqlsh COPY FROM settings”. For a more robust data upload option, consider
using DSBulk, which is available from the GitHub repository. For step-by-step instructions, see the
section called “Loading data using DSBulk”.
I can't see the actual storage size of a keyspace or table
You can't see the actual storage size of the keyspace or table.
To learn more about the storage size of your table, see the section called “Evaluate your costs at the table level”. You can also estimate storage size by calculating the row size in a table. Detailed instructions for calculating the row size are available at the section called “Estimate row size”.
Troubleshooting data definition language errors in Amazon
Keyspaces
Having trouble creating resources? Here are some common issues and how to resolve them.
Data definition language errors
Amazon Keyspaces performs data definition language (DDL) operations asynchronously—for
example, creating and deleting keyspaces and tables. If an application is trying to use the resource
before it's ready, the operation fails.
You can monitor the creation status of new keyspaces and tables in the AWS Management Console,
which indicates when a keyspace or table is pending or active. You can also monitor the creation
status of a new keyspace or table programmatically by querying the system schema table. A
keyspace or table becomes visible in the system schema when it's ready for use.
Note
To optimize the creation of keyspaces using AWS CloudFormation, you can use a utility that
converts CQL scripts into CloudFormation templates. The tool is available from the GitHub
repository.
Topics
I created a new keyspace, but I can't view or access it
I created a new table, but I can't view or access it
I'm trying to restore a table using Amazon Keyspaces point-in-time recovery (PITR), but the
restore fails
I'm trying to use INSERT/UPDATE to edit custom Time to Live (TTL) settings, but the operation
fails
I'm trying to upload data to my Amazon Keyspaces table and I get an error about exceeding the
number of columns
I'm trying to delete data in my Amazon Keyspaces table and the deletion fails for the range
I created a new keyspace, but I can't view or access it
You're receiving errors from your application that is trying to access a new keyspace.
If you try to access a newly created Amazon Keyspaces keyspace that is still being created
asynchronously, you will get an error. The following error is an example.
InvalidRequest: Error from server: code=2200 [Invalid query] message="unconfigured
keyspace mykeyspace"
The recommended design pattern to check when a new keyspace is ready for use is to poll the
Amazon Keyspaces system schema tables (system_schema_mcs.*).
For more information, see the section called “Check keyspace creation status”.
I created a new table, but I can't view or access it
You're receiving errors from your application that is trying to access a new table.
If you try to access a newly created Amazon Keyspaces table that is still being created
asynchronously, you will get an error. For example, trying to query a table that isn't available yet
fails with an unconfigured table error.
InvalidRequest: Error from server: code=2200 [Invalid query] message="unconfigured
table mykeyspace.mytable"
Trying to view the table with sync_table() fails with a KeyError.
KeyError: 'mytable'
The recommended design pattern to check when a new table is ready for use is to poll the Amazon
Keyspaces system schema tables (system_schema_mcs.*).
This is the example output for a table that is being created.
user-at-123@cqlsh:system_schema_mcs> select table_name,status from
system_schema_mcs.tables where keyspace_name='example_keyspace' and
table_name='example_table';
 table_name    | status
---------------+----------
 example_table | CREATING

(1 rows)
This is the example output for a table that is active.
user-at-123@cqlsh:system_schema_mcs> select table_name,status from
system_schema_mcs.tables where keyspace_name='example_keyspace' and
table_name='example_table';
 table_name    | status
---------------+----------
 example_table | ACTIVE

(1 rows)
For more information, see the section called “Check table creation status”.
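The following sketch shows this polling pattern with the Python driver. The endpoint, certificate
file, and service-specific credentials are placeholders; configure the connection as described in the
connection topics of this guide.

import ssl
import time

from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

# Placeholder connection settings; Amazon Keyspaces requires TLS on
# port 9142 and service-specific (or SigV4) credentials.
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations("sf-class2-root.crt")
ssl_context.verify_mode = ssl.CERT_REQUIRED
auth = PlainTextAuthProvider(username="user-at-123", password="wJalr...")
cluster = Cluster(["cassandra.us-east-1.amazonaws.com"], port=9142,
                  ssl_context=ssl_context, auth_provider=auth)
session = cluster.connect()

def wait_until_active(keyspace, table, delay=2, timeout=300):
    """Poll system_schema_mcs.tables until the new table is ACTIVE."""
    query = ("SELECT status FROM system_schema_mcs.tables "
             "WHERE keyspace_name = %s AND table_name = %s")
    deadline = time.time() + timeout
    while time.time() < deadline:
        rows = session.execute(query, (keyspace, table)).all()
        if rows and rows[0].status == "ACTIVE":
            return
        time.sleep(delay)
    raise TimeoutError(f"{keyspace}.{table} did not become ACTIVE in time")

wait_until_active("example_keyspace", "example_table")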
I'm trying to restore a table using Amazon Keyspaces point-in-time recovery
(PITR), but the restore fails
If you're trying to restore an Amazon Keyspaces table with point-in-time recovery (PITR), and you
see the restore process begin but not complete successfully, you might not have configured all of
the required permissions that are needed by the restore process for this particular table.
In addition to user permissions, Amazon Keyspaces might require permissions to perform actions
during the restore process on your principal's behalf. This is the case if the table is encrypted with a
customer managed key, or if you're using IAM policies that restrict incoming traffic.
For example, if you're using condition keys in your IAM policy to restrict source traffic to specific
endpoints or IP ranges, the restore operation fails. To allow Amazon Keyspaces to perform the
table restore operation on your principal's behalf, you must add an aws:ViaAWSService global
condition key in the IAM policy.
For more information about permissions to restore tables, see the section called “Configure IAM
permissions for restore”.
I'm trying to use INSERT/UPDATE to edit custom Time to Live (TTL) settings, but
the operation fails
If you're trying to insert or update a custom TTL value, the operation might fail with the following
error.
TTL is not yet supported.
To specify custom TTL values for rows or columns by using INSERT or UPDATE operations, you
must first enable TTL for the table. You can enable TTL for a table using the ttl custom property.
For more information about enabling custom TTL settings for tables, see the section called
“Update table custom TTL”.
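Assuming a connected Python driver session (see the connection sketch earlier in this chapter),
enabling the property and then writing with a TTL might look like the following. The keyspace,
table, and column names are placeholders, and the custom_properties syntax follows the section
referenced above.

# Enable TTL on the table once; afterwards, rows and columns can carry
# custom TTL values. (`session` is an established cassandra-driver
# Session; names below are placeholders.)
session.execute(
    "ALTER TABLE my_keyspace.my_table "
    "WITH custom_properties = {'ttl': {'status': 'enabled'}}"
)

# With TTL enabled, USING TTL on writes is accepted (value in seconds).
session.execute(
    "INSERT INTO my_keyspace.my_table (pk, ck, col) "
    "VALUES (%s, %s, %s) USING TTL 3600",
    ("partition-1", "row-1", "value"),
)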
I'm trying to upload data to my Amazon Keyspaces table and I get an error about
exceeding the number of columns
You're uploading data and have exceeded the number of columns that can be updated
simultaneously.
This error occurs when your table schema exceeds the maximum size of 350 KB. For more
information, see Quotas.
I'm trying to delete data in my Amazon Keyspaces table and the deletion fails for
the range
You're trying to delete data by partition key and receive a range delete error.
This error occurs when you're trying to delete more than 1,000 rows in one delete operation.
Range delete requests are limited by the amount of items that can be deleted in a
single range.
For more information, see the section called “Range delete”.
To delete more than 1,000 rows within a single partition, consider the following options.
Delete by partition – If the majority of partitions are under 1,000 rows, you can attempt to
delete data by partition. If the partitions contain more than 1,000 rows, attempt to delete by the
clustering column instead.
Delete by clustering column – If your model contains multiple clustering columns, you can use
the column hierarchy to delete multiple rows. Clustering columns are a nested structure, and you
can delete many rows by operating against the top-level column.
Delete by individual row – You can iterate through the rows and delete each row by its full
primary key (partition columns and clustering columns). See the sketch after this list for an
example of this pattern.
As a best practice, consider splitting your rows across partitions – In Amazon Keyspaces, we
recommend that you distribute your throughput across table partitions. This distributes data
and access evenly across physical resources, which provides the best throughput. For more
information, see the section called “Data modeling”.
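Here is a minimal sketch of the delete-by-individual-row option using the Python driver. The
table layout (partition key pk, single clustering column ck) and a connected session are
assumptions for illustration.

# `session` is an established cassandra-driver Session (see the earlier
# connection sketch). Table layout is illustrative.
def delete_partition_rows(session, keyspace, table, pk):
    """Delete a large partition row by row using the full primary key."""
    select = session.prepare(
        f"SELECT ck FROM {keyspace}.{table} WHERE pk = ?"
    )
    delete = session.prepare(
        f"DELETE FROM {keyspace}.{table} WHERE pk = ? AND ck = ?"
    )
    # The driver pages through the result set transparently as we iterate.
    for row in session.execute(select, (pk,)):
        session.execute(delete, (pk, row.ck))

delete_partition_rows(session, "my_keyspace", "my_table", "partition-1")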
Consider also the following recommendations when you're planning delete operations for heavy
workloads.
With Amazon Keyspaces, partitions can contain a virtually unbounded number of rows. This
allows you to scale partitions “wider” than the traditional Cassandra guidance of 100 MB. It’s not
uncommon for time series or ledgers to grow beyond a gigabyte of data over time.
With Amazon Keyspaces, there are no compaction strategies or tombstones to consider when
you have to perform delete operations for heavy workloads. You can delete as much data as you
want without impacting read performance.
Monitoring Amazon Keyspaces (for Apache Cassandra)
Monitoring is an important part of maintaining the reliability, availability, and performance of
Amazon Keyspaces and your other AWS solutions. AWS provides the following monitoring tools
to watch Amazon Keyspaces, report when something is wrong, and take automatic actions when
appropriate:
Amazon Keyspaces offers a preconfigured dashboard in the AWS Management Console showing
the latency and errors aggregated across all tables in the account.
Amazon CloudWatch monitors your AWS resources and the applications you run on AWS in
real time. You can collect and track metrics with customized dashboards. For example, you can
create a baseline for normal Amazon Keyspaces performance in your environment by measuring
performance at various times and under different load conditions. As you monitor Amazon
Keyspaces, store historical monitoring data so that you can compare it with current performance
data, identify normal performance patterns and performance anomalies, and devise methods to
address issues. To establish a baseline, you should, at a minimum, monitor for system errors. For
more information, see the Amazon CloudWatch User Guide.
Amazon CloudWatch alarms monitor a single metric over a time period that you specify, and
perform one or more actions based on the value of the metric relative to a given threshold over
a number of time periods. For example, if you use Amazon Keyspaces in provisioned mode with
application auto scaling, the action can be a notification sent to an Amazon Simple Notification
Service (Amazon SNS) topic or a trigger that evaluates an Application Auto Scaling policy.
CloudWatch alarms do not invoke actions simply because they are in a particular state. The
state must have changed and been maintained for a specified number of periods. For more
information, see Monitoring Amazon Keyspaces with Amazon CloudWatch.
Amazon CloudWatch Logs enables you to monitor, store, and access your log files from Amazon
Keyspaces tables, CloudTrail, and other sources. CloudWatch Logs can monitor information in the
log files and notify you when certain thresholds are met. You can also archive your log data in
highly durable storage. For more information, see the Amazon CloudWatch Logs User Guide.
AWS CloudTrail captures API calls and related events made by or on behalf of your AWS account
and delivers the log files to an Amazon S3 bucket that you specify. You can identify which users
and accounts called AWS, the source IP address from which the calls were made, and when the
calls occurred. For more information, see the AWS CloudTrail User Guide.
Amazon EventBridge is a serverless event bus service that makes it easy to connect your
applications with data from a variety of sources. EventBridge delivers a stream of real-time data
from your own applications, Software-as-a-Service (SaaS) applications, and AWS services and
routes that data to targets such as Lambda. This enables you to monitor events that happen in
services, and build event-driven architectures. For more information, see the Amazon EventBridge
User Guide.
Topics
Monitoring Amazon Keyspaces with Amazon CloudWatch
Logging Amazon Keyspaces API calls with AWS CloudTrail
Monitoring Amazon Keyspaces with Amazon CloudWatch
You can monitor Amazon Keyspaces using Amazon CloudWatch, which collects raw data and
processes it into readable, near real-time metrics. These statistics are kept for 15 months, so that
you can access historical information and gain a better perspective on how your web application or
service is performing.
You can also set alarms that watch for certain thresholds, and send notifications or take actions
when those thresholds are met. For more information, see the Amazon CloudWatch User Guide.
Note
To get started quickly with a preconfigured CloudWatch dashboard showing common
metrics for Amazon Keyspaces, you can use an AWS CloudFormation template available from
https://github.com/aws-samples/amazon-keyspaces-cloudwatch-cloudformation-templates.
Topics
How do I use Amazon Keyspaces metrics?
Amazon Keyspaces metrics and dimensions
Creating CloudWatch alarms to monitor Amazon Keyspaces
How do I use Amazon Keyspaces metrics?
The metrics reported by Amazon Keyspaces provide information that you can analyze in different
ways. The following list shows some common uses for the metrics. These are suggestions to get
you started, not a comprehensive list. For more information about metrics and retention, see
Metrics.
How can I? Relevant metrics
How can I determine if any system errors occurred?
You can monitor SystemErrors to determine whether any requests resulted in a server error
code. Typically, this metric should be equal to zero. If it isn't, you might want to investigate.
How can I compare average provisioned read to consumed read capacity?
To monitor average provisioned read capacity and consumed read capacity
1. Set the Period for ConsumedReadCapacityUnits and ProvisionedReadCapacityUnits to the
   interval you want to monitor.
2. Change the Statistic for ConsumedReadCapacityUnits from Average to Sum.
3. Create a new empty Math expression.
4. In the Details section of the new math expression, enter the Id of
   ConsumedReadCapacityUnits and divide the metric by the CloudWatch PERIOD function of
   the metric (metric_id/PERIOD(metric_id)).
5. Unselect ConsumedReadCapacityUnits.
You can now compare your average consumed read capacity to your provisioned capacity. For
more information on basic arithmetic functions and how to create a time series, see Using
metric math.
How can I compare average provisioned write to consumed write capacity?
To monitor average provisioned write capacity and consumed write capacity
1. Set the Period for ConsumedWriteCapacityUnits and ProvisionedWriteCapacityUnits to the
   interval you want to monitor.
2. Change the Statistic for ConsumedWriteCapacityUnits from Average to Sum.
3. Create a new empty Math expression.
4. In the Details section of the new math expression, enter the Id of
   ConsumedWriteCapacityUnits and divide the metric by the CloudWatch PERIOD function of
   the metric (metric_id/PERIOD(metric_id)).
5. Unselect ConsumedWriteCapacityUnits.
You can now compare your average consumed write capacity to your provisioned capacity. For
more information on basic arithmetic functions and how to create a time series, see Using
metric math. A programmatic version of these steps is sketched after this table.
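The same comparison can be done programmatically with the CloudWatch GetMetricData API.
The following boto3 sketch uses placeholder keyspace and table names and the read-capacity
metrics; swap in the write-capacity metrics for the second row of this table.

from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")
now = datetime.now(timezone.utc)
dimensions = [
    {"Name": "Keyspace", "Value": "mykeyspace"},
    {"Name": "TableName", "Value": "mytable"},
]

def metric(metric_id, name, stat):
    return {
        "Id": metric_id,
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/Cassandra",
                "MetricName": name,
                "Dimensions": dimensions,
            },
            "Period": 60,
            "Stat": stat,
        },
        # Only return the provisioned series; the raw consumed Sum is an
        # intermediate input to the expression below.
        "ReturnData": metric_id == "provisioned",
    }

response = cloudwatch.get_metric_data(
    MetricDataQueries=[
        metric("consumed", "ConsumedReadCapacityUnits", "Sum"),
        metric("provisioned", "ProvisionedReadCapacityUnits", "Average"),
        {
            "Id": "avg_consumed",
            # Same math as the console steps: Sum divided by the period.
            "Expression": "consumed / PERIOD(consumed)",
            "Label": "Average consumed RCU/s",
        },
    ],
    StartTime=now - timedelta(hours=3),
    EndTime=now,
)

for result in response["MetricDataResults"]:
    print(result["Label"], list(zip(result["Timestamps"], result["Values"])))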
Amazon Keyspaces metrics and dimensions
When you interact with Amazon Keyspaces, it sends the following metrics and dimensions to
Amazon CloudWatch. All metrics are aggregated and reported every minute. You can use the
following procedures to view the metrics for Amazon Keyspaces.
To view metrics using the CloudWatch console
Metrics are grouped first by the service namespace, and then by the various dimension
combinations within each namespace.
1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
2. If necessary, change the Region. On the navigation bar, choose the Region where your AWS
resources reside. For more information, see AWS service endpoints.
3. In the navigation pane, choose Metrics.
4. Under the All metrics tab, choose AWS/Cassandra.
To view metrics using the AWS CLI
At a command prompt, use the following command.
aws cloudwatch list-metrics --namespace "AWS/Cassandra"
Amazon Keyspaces metrics and dimensions
The metrics and dimensions that Amazon Keyspaces sends to Amazon CloudWatch are listed here.
Amazon Keyspaces metrics
Amazon CloudWatch aggregates Amazon Keyspaces metrics at one-minute intervals.
Not all statistics, such as Average or Sum, are applicable for every metric. However, all of these
values are available through the Amazon Keyspaces console, or by using the CloudWatch console,
AWS CLI, or AWS SDKs for all metrics. In the following table, each metric has a list of valid statistics
that are applicable to that metric.
Metric Description
AccountMaxTableLevelReads
The maximum number of read capacity units that can be
used by a table of an account. For on-demand tables this
limit caps the maximum read request units a table can
use.
Units: Count
Valid Statistics:
Maximum – The maximum number of read capacity
units that can be used by a table of the account.
AccountMaxTableLevelWrites
The maximum number of write capacity units that can be used by a table of an account. For
on-demand tables this limit caps the maximum write request units a table can use.
Units: Count
Valid Statistics:
Maximum – The maximum number of write capacity
units that can be used by a table of the account.
AccountProvisionedReadCapacityUtilization
The percentage of provisioned read capacity units
utilized by an account.
Units: Percent
Valid Statistics:
Maximum – The maximum percentage of provisioned
read capacity units utilized by the account.
Minimum – The minimum percentage of provisioned
read capacity units utilized by the account.
Average – The average percentage of provisioned
read capacity units utilized by the account. The metric
is published for five-minute intervals. Therefore, if you
rapidly adjust the provisioned read capacity units, this
statistic might not reflect the true average.
AccountProvisionedWriteCapacityUtilization
The percentage of provisioned write capacity units
utilized by an account.
Units: Percent
Valid Statistics:
Maximum – The maximum percentage of provisioned
write capacity units utilized by the account.
Minimum – The minimum percentage of provisioned
write capacity units utilized by the account.
Average – The average percentage of provisioned
write capacity units utilized by the account. The metric
is published for five-minute intervals. Therefore, if you
rapidly adjust the provisioned write capacity units, this
statistic might not reflect the true average.
BillableTableSizeInBytes
The billable size of the table in bytes. It is the sum of the
encoded size of all rows in the table. This metric helps
you track your table storage costs over time.
Units: Bytes
Dimensions: Keyspace, TableName
Valid Statistics:
Maximum – The maximum storage size of the table.
Minimum – The minimum storage size of the table.
Average – The average storage size of the table. This
metric is calculated over 4 - 6 hour intervals.
ConditionalCheckFailedRequests
The number of failed lightweight transaction (LWT)
write requests. The INSERT, UPDATE, and DELETE
operations let you provide a logical condition that must
evaluate to true before the operation can proceed. If this
condition evaluates to false, ConditionalCheckFailedRequests is incremented by one. Condition
checks that evaluate to false consume write capacity
units based on the size of the row. For more information,
see the section called “Estimate capacity consumption
of LWT”.
Units: Count
Dimensions: Keyspace, TableName
Valid Statistics:
Minimum
Maximum
Average
SampleCount
Sum
ConsumedReadCapacityUnits
The number of read capacity units consumed over the specified time period. For more
information, see Read/Write capacity mode.
Note
To understand your average throughput utilization per second, use the Sum statistic to
calculate the consumed throughput for the one-minute period. Then divide the sum by the
number of seconds in a minute (60) to calculate the average ConsumedReadCapacityUnits
per second (recognizing that this average does not highlight any large but brief spikes in
read activity that occurred during that minute). For more information on comparing average
consumed read capacity to provisioned read capacity, see the section called “Using metrics”.
Units: Count
Dimensions: Keyspace, TableName
Valid Statistics:
Minimum – The minimum number of read capacity
units consumed by any individual request to the table.
Maximum – The maximum number of read capacity
units consumed by any individual request to the table.
Average – The average per-request read capacity
consumed.
Note
The Average value is influenced by periods of inactivity where the sample value will be zero.
Sum – The total read capacity units consumed. This is the most useful statistic for the
ConsumedReadCapacityUnits metric.
SampleCount – The number of requests to Amazon Keyspaces, even if no read capacity was
consumed.
Note
The SampleCount value is influenced by periods of inactivity where the sample value will be
zero.
ConsumedWriteCapacityUnits
The number of write capacity units consumed over the specified time period. You can retrieve
the total consumed write capacity for a table. For more information, see Read/Write capacity
mode.
Note
To understand your average throughput utilization per second, use the Sum statistic to
calculate the consumed throughput for the one-minute period. Then divide the sum by the
number of seconds in a minute (60) to calculate the average ConsumedWriteCapacityUnits
per second (recognizing that this average does not highlight any large but brief spikes in
write activity that occurred during that minute). For more information on comparing average
consumed write capacity to provisioned write capacity, see the section called “Using metrics”.
Units: Count
Dimensions: Keyspace, TableName
Valid Statistics:
Minimum – The minimum number of write capacity
units consumed by any individual request to the table.
Maximum – The maximum number of write capacity
units consumed by any individual request to the table.
Average – The average per-request write capacity
consumed.
Note
The Average value is influenced by periods of inactivity where the sample value will be zero.
Sum – The total write capacity units consumed. This is the most useful statistic for the
ConsumedWriteCapacityUnits metric.
SampleCount – The number of requests to Amazon Keyspaces, even if no write capacity was
consumed.
Note
The SampleCount value is influenced by periods of inactivity where the sample value will be
zero.
MaxProvisionedTableReadCapacityUtilization
The percentage of provisioned read capacity units
utilized by the highest provisioned read table of an
account.
Units: Percent
Valid Statistics:
Maximum – The maximum percentage of provisioned
read capacity units utilized by the highest provisioned
read table of the account.
Minimum – The minimum percentage of provisioned
read capacity units utilized by the highest provisioned
read table of the account.
Average – The average percentage of provisioned
read capacity units utilized by the highest provisioned
read table of the account. The metric is published for
five-minute intervals. Therefore, if you rapidly adjust
the provisioned read capacity units, this statistic might
not reflect the true average.
MaxProvisionedTableWriteCapacityUtilization
The percentage of provisioned write capacity utilized by
the highest provisioned write table of an account.
Units: Percent
Valid Statistics:
Maximum – The maximum percentage of provisioned
write capacity units utilized by the highest provisioned
write table of the account.
Minimum – The minimum percentage of provisioned
write capacity units utilized by the highest provisioned
write table of the account.
Average – The average percentage of provisioned
write capacity units utilized by the highest provisioned
write table of the account. The metric is published for
five-minute intervals. Therefore, if you rapidly adjust
the provisioned write capacity units, this statistic
might not reflect the true average.
PerConnectionRequestRateExceeded
Requests to Amazon Keyspaces that exceed the per-connection request rate quota. Each client
connection to Amazon Keyspaces can support up to 3000 CQL requests per second. Clients can
create multiple connections to increase throughput.
When you're using Multi-Region Replication, each replicated write also contributes to this
quota. As a best practice, we recommend increasing the number of connections to your tables
to avoid PerConnectionRequestRateExceeded errors. There is no limit to the number of
connections you can have in Amazon Keyspaces.
Units: Count
Dimensions: Keyspace, TableName, Operation
Valid Statistics:
SampleCount
Sum
ProvisionedReadCapacityUnits
The number of provisioned read capacity units for a table.
The TableName dimension returns the ProvisionedReadCapacityUnits for the table.
Units: Count
Dimensions: Keyspace, TableName
Valid Statistics:
Minimum – The lowest setting for provisioned read
capacity. If you use ALTER TABLE to increase read
capacity, this metric shows the lowest value of
provisioned ReadCapacityUnits during this time
period.
Maximum – The highest setting for provisioned read
capacity. If you use ALTER TABLE to decrease read
capacity, this metric shows the highest value of
provisioned ReadCapacityUnits during this time
period.
Average – The average provisioned read capacity.
The ProvisionedReadCapacityUnits metric
is published at five-minute intervals. Therefore, if you
rapidly adjust the provisioned read capacity units, this
statistic might not reflect the true average.
ProvisionedWriteCapacityUnits
The number of provisioned write capacity units for a table.
The TableName dimension returns the ProvisionedWriteCapacityUnits for the table.
Units: Count
Dimensions: Keyspace, TableName
Valid Statistics:
Minimum – The lowest setting for provisioned write
capacity. If you use ALTER TABLE to increase write
capacity, this metric shows the lowest value of
provisioned WriteCapacityUnits during this
time period.
Maximum – The highest setting for provisioned write
capacity. If you use ALTER TABLE to decrease write
capacity, this metric shows the highest value of
provisioned WriteCapacityUnits during this
time period.
Average – The average provisioned write capacity.
The ProvisionedWriteCapacityUnits metric
is published at five-minute intervals. Therefore, if you
rapidly adjust the provisioned write capacity units, this
statistic might not reflect the true average.
ReadThrottleEvents
Requests to Amazon Keyspaces that exceed the provisioned read capacity for a table, or that
exceed account-level quotas, per-connection request rate quotas, or partition-level quotas.
Units: Count
Dimensions: Keyspace, TableName, Operation
Valid Statistics:
SampleCount
Sum
ReplicationLatency
This metric only applies to multi-Region keyspaces
and measures the time it took to replicate updates,
inserts, or deletes from one replica table to another
replica table in a multi-Region keyspace.
Units: Milliseconds
Dimensions: TableName, ReceivingRegion
Valid Statistics:
Average
Maximum
Minimum
ReturnedItemCountBySelect
The number of rows returned by multi-row SELECT queries during the specified time period.
Multi-row SELECT queries are queries that do not contain the fully qualified primary key, such
as full table scans and range queries.
The number of rows returned is not necessarily the
same as the number of rows that were evaluated. For
example, suppose that you requested a SELECT * with
ALLOW FILTERING on a table that had 100 rows,
but specified a WHERE clause that narrowed the results
so that only 15 rows were returned. In this case, the
response from SELECT would contain a ScanCount of
100 and a Count of 15 returned rows.
Units: Count
Dimensions: Keyspace, TableName, Operation
Valid Statistics:
Minimum
Maximum
Average
SampleCount
Sum
StoragePartitionThroughputCapacityExceeded
Requests to an Amazon Keyspaces storage partition
that exceed the throughput capacity of the partition.
Amazon Keyspaces storage partitions can support up
to 1000 WCU/WRU per second and 3000 RCU/RRU per
second. We recommend reviewing your data model to
distribute read/write traffic across more partitions to
mitigate these exceptions.
Note
Logical Amazon Keyspaces partitions can span
multiple storage partitions and are virtually
unbounded in size.
Units: Count
Dimensions: Keyspace, TableName, Operation
Valid Statistics:
SampleCount
Sum
SuccessfulRequestCount
The number of successful requests processed over the
specified time period.
Units: Count
Dimensions: Keyspace, TableName, Operation
Valid Statistics:
SampleCount
SuccessfulRequestLatency
The successful requests to Amazon Keyspaces during the
specified time period. SuccessfulRequestLatency
can provide two different kinds of information:
The elapsed time for successful requests (Minimum,
Maximum, Sum, or Average).
The number of successful requests (SampleCount ).
SuccessfulRequestLatency reflects activity
only within Amazon Keyspaces and does not take into
account network latency or client-side activity.
Units: Milliseconds
Dimensions: Keyspace, TableName, Operation
Valid Statistics:
Minimum
Maximum
Average
SampleCount
SystemErrors
The requests to Amazon Keyspaces that generate a
ServerError during the specified time period. A
ServerError usually indicates an internal service
error.
Units: Count
Dimensions: Keyspace, TableName, Operation
Valid Statistics:
Sum
SampleCount
SystemReconciliationDeletes
The units consumed to delete tombstoned data when client-side timestamps are enabled. Each
SystemReconciliationDelete provides enough capacity to delete or update up to 1 KB of data
per row. For example, to update a row that stores 2.5 KB of data and to delete one or more
columns within the row at the same time requires 3 SystemReconciliationDeletes. Or, to delete
an entire row that contains 3.5 KB of data requires 4 SystemReconciliationDeletes.
Units: Count
Dimensions: Keyspace, TableName
Valid Statistics:
Sum – The total number of SystemReconciliationDeletes consumed in a time period.
TTLDeletes
The units consumed to delete or update data in a row by using Time to Live (TTL). Each
TTLDelete provides enough capacity to delete or update up to 1 KB of data per row. For
example, to update a row that stores 2.5 KB of data and to delete one or more columns within
the row at the same time requires 3 TTL deletes. Or, to delete an entire row that contains
3.5 KB of data requires 4 TTL deletes.
Units: Count
Dimensions: Keyspace, TableName
Valid Statistics:
Sum – The total number of TTLDeletes consumed
in a time period.
UserErrors
Requests to Amazon Keyspaces that generate an InvalidRequest error during the specified
time period. An InvalidRequest usually indicates a client-side error, such as an invalid
combination of parameters, an attempt to update a nonexistent table, or an incorrect request
signature.
UserErrors represents the aggregate of invalid
requests for the current AWS Region and the current
AWS account.
Units: Count
Dimensions: Keyspace, TableName, Operation
Valid Statistics:
Sum
SampleCount
WriteThrottleEvents
Requests to Amazon Keyspaces that exceed the provisioned write capacity for a table, or that
exceed account-level quotas, per-connection request rate quotas, or partition-level quotas.
Units: Count
Dimensions: Keyspace, TableName, Operation
Valid Statistics:
SampleCount
Sum
Dimensions for Amazon Keyspaces metrics
The metrics for Amazon Keyspaces are qualified by the values for the account, table name, or
operation. You can use the CloudWatch console to retrieve Amazon Keyspaces data along any of
the dimensions in the following table.
Dimension Description
Keyspace
This dimension limits the data to a specific keyspace. This value
can be any keyspace in the current Region and the current AWS
account.
Operation
This dimension limits the data to one of the Amazon Keyspaces
CQL operations, such as INSERT or SELECT operations.
TableName
This dimension limits the data to a specific table. This value
can be any table name in the current Region and the current
AWS account. If the table name is not unique within the
account, you must also specify Keyspace.
Creating CloudWatch alarms to monitor Amazon Keyspaces
You can create an Amazon CloudWatch alarm for Amazon Keyspaces that sends an Amazon Simple
Notification Service (Amazon SNS) message when the alarm changes state. An alarm watches a
single metric over a time period that you specify. It performs one or more actions based on the
value of the metric relative to a given threshold over a number of time periods. The action is a
notification sent to an Amazon SNS topic or an Application Auto Scaling policy.
When you use Amazon Keyspaces in provisioned mode with Application Auto Scaling, the service
creates two pairs of CloudWatch alarms on your behalf. Each pair represents your upper and lower
boundaries for provisioned and consumed throughput settings. These CloudWatch alarms are
triggered when the table's actual utilization deviates from your target utilization for a sustained
period of time. To learn more about CloudWatch alarms created by Application Auto Scaling, see
the section called “How Amazon Keyspaces automatic scaling works”.
Alarms invoke actions for sustained state changes only. CloudWatch alarms do not invoke actions
simply because they are in a particular state. The state must have changed and been maintained
for a specified number of periods.
For more information about creating CloudWatch alarms, see Using Amazon CloudWatch alarms in
the Amazon CloudWatch User Guide.
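For example, the following boto3 sketch creates an alarm on the SystemErrors metric discussed
earlier in this chapter; the alarm name, keyspace, table, and SNS topic ARN are placeholders.

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when the table reports any system errors for 5 consecutive
# one-minute periods. The SNS topic ARN is a placeholder.
cloudwatch.put_metric_alarm(
    AlarmName="keyspaces-mytable-system-errors",
    Namespace="AWS/Cassandra",
    MetricName="SystemErrors",
    Dimensions=[
        {"Name": "Keyspace", "Value": "mykeyspace"},
        {"Name": "TableName", "Value": "mytable"},
    ],
    Statistic="Sum",
    Period=60,
    EvaluationPeriods=5,
    Threshold=0,
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:keyspaces-alerts"],
)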
Logging Amazon Keyspaces API calls with AWS CloudTrail
Amazon Keyspaces is integrated with AWS CloudTrail, a service that provides a record of actions
taken by a user, role, or an AWS service in Amazon Keyspaces. CloudTrail captures Data Definition
Language (DDL) API calls and Data Manipulation Language (DML) API calls for Amazon Keyspaces
as events. The calls that are captured include calls from the Amazon Keyspaces console and
programmatic calls to the Amazon Keyspaces API operations.
If you create a trail, you can enable continuous delivery of CloudTrail events to an Amazon Simple
Storage Service (Amazon S3) bucket, including events for Amazon Keyspaces.
If you don't configure a trail, you can still view the most recent supported events on the CloudTrail
console in Event history. Using the information collected by CloudTrail, you can determine the
request that was made to Amazon Keyspaces, the IP address from which the request was made,
who made the request, when it was made, and additional details.
To learn more about CloudTrail, see the AWS CloudTrail User Guide.
Topics
Configuring Amazon Keyspaces log file entries in CloudTrail
Amazon Keyspaces Data Definition Language (DDL) information in CloudTrail
Amazon Keyspaces Data Manipulation Language (DML) information in CloudTrail
Understanding Amazon Keyspaces log file entries
Configuring Amazon Keyspaces log file entries in CloudTrail
Each Amazon Keyspaces API action logged in CloudTrail includes request parameters that are
expressed in CQL query language. For more information, see the CQL language reference.
You can view, search, and download recent events in your AWS account. For more information, see
Viewing events with CloudTrail event history.
For an ongoing record of events in your AWS account, including events for Amazon Keyspaces,
create a trail. A trail enables CloudTrail to deliver log files to an Amazon S3 bucket. By default,
when you create a trail in the console, the trail applies to all AWS Regions. The trail logs events
from all Regions in the AWS partition and delivers the log files to the Amazon S3 bucket that you
specify. Additionally, you can configure other AWS services to further analyze and act upon the
event data collected in CloudTrail logs.
For more information, see the following topics in the AWS CloudTrail User Guide:
Overview for creating a trail
CloudTrail supported services and integrations
Configuring Amazon SNS notifications for CloudTrail
Receiving CloudTrail log files from multiple Regions
Receiving CloudTrail log files from multiple accounts
Every event or log entry contains information about who generated the request. The identity
information helps you determine the following:
Whether the request was made with root or AWS Identity and Access Management (IAM) user
credentials.
Whether the request was made with temporary security credentials for a role or federated user.
Whether the request was made by another AWS service.
For more information, see the CloudTrail userIdentity element.
Amazon Keyspaces Data Definition Language (DDL) information in
CloudTrail
CloudTrail is enabled on your AWS account when you create the account. When a DDL activity
occurs in Amazon Keyspaces, that activity is automatically recorded as a CloudTrail event along
with other AWS service events in Event history. The following table shows the DDL statements
that are logged for Amazon Keyspaces.
CloudTrail eventName   Statement   CQL action        AWS SDK action
CreateKeyspace         DDL         CREATE KEYSPACE   CreateKeyspace
DropKeyspace           DDL         DROP KEYSPACE     DeleteKeyspace
CreateTable            DDL         CREATE TABLE      CreateTable
DropTable              DDL         DROP TABLE        DeleteTable
AlterTable             DDL         ALTER TABLE       UpdateTable, TagResource, UntagResource
Amazon Keyspaces Data Manipulation Language (DML) information in
CloudTrail
To enable logging of Amazon Keyspaces DML statements with CloudTrail, you have to first
enable logging of data plane API activity in CloudTrail. You can start logging Amazon Keyspaces
DML events in new or existing trails by choosing to log activity for the data event type
Cassandra table using the CloudTrail console, or by setting the resources.type value to
AWS::Cassandra::Table using the AWS CLI or CloudTrail API operations. For more information,
see Logging data events. A sketch of the API call follows.
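The following boto3 sketch sets such a selector on an existing trail; the trail name is a
placeholder.

import boto3

cloudtrail = boto3.client("cloudtrail")

# This selector logs all data events for every Amazon Keyspaces table
# delivered through the named trail.
cloudtrail.put_event_selectors(
    TrailName="my-trail",
    AdvancedEventSelectors=[
        {
            "Name": "Log Amazon Keyspaces DML activity",
            "FieldSelectors": [
                {"Field": "eventCategory", "Equals": ["Data"]},
                {"Field": "resources.type", "Equals": ["AWS::Cassandra::Table"]},
            ],
        }
    ],
)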
The following table shows the data events that are logged by CloudTrail for Cassandra table.
CloudTrail eventName   Statement   CQL action   AWS SDK action
Select                 DML         SELECT       GetKeyspace, GetTable, ListKeyspaces,
                                                ListTables, ListTagsForResource
Insert                 DML         INSERT       no AWS SDK actions available
Update                 DML         UPDATE       no AWS SDK actions available
Delete                 DML         DELETE       no AWS SDK actions available
Understanding Amazon Keyspaces log file entries
CloudTrail log files contain one or more log entries. An event represents a single request from
any source and includes information about the requested action, the date and time of the action,
request parameters, and so on. CloudTrail log files aren't an ordered stack trace of the public API
calls, so they don't appear in any specific order.
The following example shows a CloudTrail log entry that demonstrates the CreateKeyspace,
DropKeyspace, CreateTable, and DropTable actions:
{
"Records": [
{
"eventVersion": "1.05",
"userIdentity": {
"type": "AssumedRole",
"principalId": "AKIAIOSFODNN7EXAMPLE:alice",
"arn": "arn:aws:sts::111122223333:assumed-role/users/alice",
"accountId": "111122223333",
"sessionContext": {
"sessionIssuer": {
"type": "Role",
"principalId": "AKIAIOSFODNN7EXAMPLE",
"arn": "arn:aws:iam::111122223333:role/Admin",
"accountId": "111122223333",
"userName": "Admin"
},
"webIdFederationData": {},
"attributes": {
"mfaAuthenticated": "false",
"creationDate": "2020-01-15T18:47:56Z"
}
}
},
"eventTime": "2020-01-15T18:53:04Z",
"eventSource": "cassandra.amazonaws.com",
"eventName": "CreateKeyspace",
"awsRegion": "us-east-1",
"sourceIPAddress": "10.24.34.01",
"userAgent": "Cassandra Client/ProtocolV4",
"requestParameters": {
"rawQuery": "\n\tCREATE KEYSPACE \"mykeyspace\"\n\tWITH\n\t\tREPLICATION =
{'class': 'SingleRegionStrategy'}\n\t\t",
"keyspaceName": "mykeyspace"
},
"responseElements": null,
"requestID": "bfa3e75d-bf4d-4fc0-be5e-89d15850eb41",
"eventID": "d25beae8-f611-4229-877a-921557a07bb9",
"readOnly": false,
"resources": [
{
"accountId": "111122223333",
"type": "AWS::Cassandra::Keyspace",
"ARN": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/"
}
],
"eventType": "AwsApiCall",
"apiVersion": "3.4.4",
"recipientAccountId": "111122223333",
"managementEvent": true,
"eventCategory": "Management",
"tlsDetails": {
"tlsVersion": "TLSv1.2",
"cipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
"clientProvidedHostHeader": "cassandra.us-east-1.amazonaws.com"
        }
    },
    {
"eventVersion": "1.05",
"userIdentity": {
"type": "AssumedRole",
"principalId": "AKIAIOSFODNN7EXAMPLE:alice",
"arn": "arn:aws:sts::111122223333:assumed-role/users/alice",
"accountId": "111122223333",
"sessionContext": {
"sessionIssuer": {
"type": "Role",
"principalId": "AKIAIOSFODNN7EXAMPLE",
"arn": "arn:aws:iam::111122223333:role/Admin",
"accountId": "111122223333",
"userName": "Admin"
},
"webIdFederationData": {},
"attributes": {
"mfaAuthenticated": "false",
"creationDate": "2020-01-15T18:47:56Z"
}
}
},
"eventTime": "2020-01-15T19:28:39Z",
"eventSource": "cassandra.amazonaws.com",
"eventName": "DropKeyspace",
"awsRegion": "us-east-1",
"sourceIPAddress": "10.24.34.01",
"userAgent": "Cassandra Client/ProtocolV4",
"requestParameters": {
"rawQuery": "DROP KEYSPACE \"mykeyspace\"",
"keyspaceName": "mykeyspace"
},
"responseElements": null,
"requestID": "66f3d86a-56ae-4c29-b46f-abcd489ed86b",
"eventID": "e5aebeac-e1dd-41e3-a515-84fe6aaabd7b",
"readOnly": false,
"resources": [
{
"accountId": "111122223333",
"type": "AWS::Cassandra::Keyspace",
"ARN": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/"
}
],
"eventType": "AwsApiCall",
"apiVersion": "3.4.4",
"recipientAccountId": "111122223333",
"managementEvent": true,
"eventCategory": "Management",
"tlsDetails": {
"tlsVersion": "TLSv1.2",
"cipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
"clientProvidedHostHeader": "cassandra.us-east-1.amazonaws.com"
        }
    },
    {
"eventVersion": "1.05",
"userIdentity": {
"type": "AssumedRole",
"principalId": "AKIAIOSFODNN7EXAMPLE:alice",
"arn": "arn:aws:sts::111122223333:assumed-role/users/alice",
"accountId": "111122223333",
"sessionContext": {
"sessionIssuer": {
"type": "Role",
"principalId": "AKIAIOSFODNN7EXAMPLE",
"arn": "arn:aws:iam::111122223333:role/Admin",
"accountId": "111122223333",
"userName": "Admin"
},
"webIdFederationData": {},
"attributes": {
"mfaAuthenticated": "false",
"creationDate": "2020-01-15T18:47:56Z"
}
}
},
"eventTime": "2020-01-15T18:55:24Z",
"eventSource": "cassandra.amazonaws.com",
"eventName": "CreateTable",
"awsRegion": "us-east-1",
"sourceIPAddress": "10.24.34.01",
"userAgent": "Cassandra Client/ProtocolV4",
"requestParameters": {
"rawQuery": "\n\tCREATE TABLE \"mykeyspace\".\"mytable\"(\n\t\t\"ID\" int,
\n\t\t\"username\" text,\n\t\t\"email\" text,\n\t\t\"post_type\" text,\n\t\tPRIMARY
KEY((\"ID\", \"username\", \"email\")))",
"keyspaceName": "mykeyspace",
"tableName": "mytable"
},
"responseElements": null,
"requestID": "5f845963-70ea-4988-8a7a-2e66d061aacb",
"eventID": "fe0dbd2b-7b34-4675-a30c-740f9d8d73f9",
"readOnly": false,
"resources": [
{
"accountId": "111122223333",
"type": "AWS::Cassandra::Table",
"ARN": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/table/
mytable"
}
],
"eventType": "AwsApiCall",
"apiVersion": "3.4.4",
"recipientAccountId": "111122223333",
"managementEvent": true,
"eventCategory": "Management",
"tlsDetails": {
"tlsVersion": "TLSv1.2",
"cipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
"clientProvidedHostHeader": "cassandra.us-east-1.amazonaws.com"
        }
    },
    {
"eventVersion": "1.05",
"userIdentity": {
"type": "AssumedRole",
"principalId": "AKIAIOSFODNN7EXAMPLE:alice",
"arn": "arn:aws:sts::111122223333:assumed-role/users/alice",
"accountId": "111122223333",
"sessionContext": {
"sessionIssuer": {
"type": "Role",
"principalId": "AKIAIOSFODNN7EXAMPLE",
"arn": "arn:aws:iam::111122223333:role/Admin",
"accountId": "111122223333",
"userName": "Admin"
},
"webIdFederationData": {},
"attributes": {
"mfaAuthenticated": "false",
"creationDate": "2020-01-15T18:47:56Z"
}
}
},
"eventTime": "2020-01-15T19:27:59Z",
"eventSource": "cassandra.amazonaws.com",
"eventName": "DropTable",
"awsRegion": "us-east-1",
"sourceIPAddress": "10.24.34.01",
"userAgent": "Cassandra Client/ProtocolV4",
"requestParameters": {
"rawQuery": "DROP TABLE \"mykeyspace\".\"mytable\"",
"keyspaceName": "mykeyspace",
"tableName": "mytable"
},
"responseElements": null,
"requestID": "025501b0-3582-437e-9d18-8939e9ef262f",
"eventID": "1a5cbedc-4e38-4889-8475-3eab98de0ffd",
"readOnly": false,
"resources": [
{
"accountId": "111122223333",
"type": "AWS::Cassandra::Table",
"ARN": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/table/
mytable"
}
],
"eventType": "AwsApiCall",
"apiVersion": "3.4.4",
"recipientAccountId": "111122223333",
"managementEvent": true,
"eventCategory": "Management",
"tlsDetails": {
"tlsVersion": "TLSv1.2",
"cipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
"clientProvidedHostHeader": "cassandra.us-east-1.amazonaws.com"
        }
    }
    ]
}
The following log file shows an example of a SELECT statement.
{
"eventVersion": "1.09",
"userIdentity": {
"type": "IAMUser",
"principalId": "AKIAIOSFODNN7EXAMPLE",
"arn": "arn:aws:iam::111122223333:user/alice",
"accountId": "111122223333",
"userName": "alice"
},
"eventTime": "2023-11-17T10:38:04Z",
"eventSource": "cassandra.amazonaws.com",
"eventName": "Select",
"awsRegion": "us-east-1",
"sourceIPAddress": "10.24.34.01",
"userAgent": "Cassandra Client/ProtocolV4",
"requestParameters": {
"keyspaceName": "my_keyspace",
"tableName": "my_table",
"conditions": [
"pk = **(Redacted)",
"ck < 3**(Redacted)0",
"region = 't**(Redacted)t'"
],
"select": [
"pk",
"ck",
"region"
],
"allowFiltering": true
},
"responseElements": null,
"requestID": "6d83bbf0-a3d0-4d49-b1d9-e31779a28628",
"eventID": "e00552d3-34e9-4092-931a-912c4e08ba17",
"readOnly": true,
"resources": [
{
"accountId": "111122223333",
"type": "AWS::Cassandra::Table",
"ARN": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/my_keyspace/
table/my_table"
}
],
"eventType": "AwsApiCall",
"apiVersion": "3.4.4",
"managementEvent": false,
"recipientAccountId": "111122223333",
"eventCategory": "Data",
"tlsDetails": {
"tlsVersion": "TLSv1.3",
"cipherSuite": "TLS_AES_128_GCM_SHA256",
"clientProvidedHostHeader": "cassandra.us-east-1.amazonaws.com"
}
}
The following log file shows an example of an INSERT statement.
{
"eventVersion": "1.09",
"userIdentity": {
"type": "IAMUser",
"principalId": "AKIAIOSFODNN7EXAMPLE",
"arn": "arn:aws:iam::111122223333:user/alice",
"accountId": "111122223333",
"userName": "alice"
},
"eventTime": "2023-12-01T22:11:43Z",
"eventSource": "cassandra.amazonaws.com",
"eventName": "Insert",
"awsRegion": "us-east-1",
"sourceIPAddress": "10.24.34.01",
"userAgent": "Cassandra Client/ProtocolV4",
"requestParameters": {
"keyspaceName": "my_keyspace",
"tableName": "my_table",
"primaryKeys": {
"pk": "**(Redacted)",
"ck": "1**(Redacted)8"
},
"columnNames": [
"pk",
"ck",
"region"
],
"updateParameters": {
"TTL": "2**(Redacted)0"
        }
    },
"responseElements": null,
"requestID": "edf8af47-2f87-4432-864d-a960ac35e471",
"eventID": "81b56a1c-9bdd-4c92-bb8e-92776b5a3bf1",
"readOnly": false,
"resources": [
{
"accountId": "111122223333",
"type": "AWS::Cassandra::Table",
"ARN": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/my_keyspace/table/
my_table"
}
],
"eventType": "AwsApiCall",
"apiVersion": "3.4.4",
"managementEvent": false,
"recipientAccountId": "111122223333",
"eventCategory": "Data",
"tlsDetails": {
"tlsVersion": "TLSv1.3",
"cipherSuite": "TLS_AES_128_GCM_SHA256",
"clientProvidedHostHeader": "cassandra.us-east-1.amazonaws.com"
}
}
The following log file shows an example of an UPDATE statement.
{
"eventVersion": "1.09",
"userIdentity": {
"type": "IAMUser",
"principalId": "AKIAIOSFODNN7EXAMPLE",
"arn": "arn:aws:iam::111122223333:user/alice",
"accountId": "111122223333",
"userName": "alice"
},
"eventTime": "2023-12-01T22:11:43Z",
"eventSource": "cassandra.amazonaws.com",
"eventName": "Update",
"awsRegion": "us-east-1",
"sourceIPAddress": "10.24.34.01",
"userAgent": "Cassandra Client/ProtocolV4",
"requestParameters": {
"keyspaceName": "my_keyspace",
"tableName": "my_table",
"primaryKeys": {
"pk": "'t**(Redacted)t'",
"ck": "'s**(Redacted)g'"
},
"assignmentColumnNames": [
"nonkey"
],
"conditions": [
"nonkey < 1**(Redacted)7"
]
},
"responseElements": null,
"requestID": "edf8af47-2f87-4432-864d-a960ac35e471",
"eventID": "81b56a1c-9bdd-4c92-bb8e-92776b5a3bf1",
"readOnly": false,
"resources": [
{
"accountId": "111122223333",
"type": "AWS::Cassandra::Table",
"ARN": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/my_keyspace/table/
my_table"
}
],
"eventType": "AwsApiCall",
"apiVersion": "3.4.4",
"managementEvent": false,
"recipientAccountId": "111122223333",
"eventCategory": "Data",
"tlsDetails": {
"tlsVersion": "TLSv1.3",
"cipherSuite": "TLS_AES_128_GCM_SHA256",
"clientProvidedHostHeader": "cassandra.us-east-1.amazonaws.com"
}
}
The following log file shows an example of a DELETE statement.
{
"eventVersion": "1.09",
"userIdentity": {
"type": "IAMUser",
"principalId": "AKIAIOSFODNN7EXAMPLE",
"arn": "arn:aws:iam::111122223333:user/alice",
"accountId": "111122223333",
"userName": "alice",
},
"eventTime": "2023-10-23T13:59:05Z",
"eventSource": "cassandra.amazonaws.com",
"eventName": "Delete",
"awsRegion": "us-east-1",
"sourceIPAddress": "10.24.34.01",
"userAgent": "Cassandra Client/ProtocolV4",
"requestParameters": {
"keyspaceName": "my_keyspace",
"tableName": "my_table",
"primaryKeys": {
"pk": "**(Redacted)",
"ck": "**(Redacted)"
},
"conditions": [],
"deleteColumnNames": [
"m",
"s"
],
"updateParameters": {}
},
"responseElements": null,
"requestID": "3d45e63b-c0c8-48e2-bc64-31afc5b4f49d",
"eventID": "499da055-c642-4762-8775-d91757f06512",
"readOnly": false,
"resources": [
{
"accountId": "111122223333",
"type": "AWS::Cassandra::Table",
"ARN": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/my_keyspace/table/
my_table"
}
],
"eventType": "AwsApiCall",
"apiVersion": "3.4.4",
"managementEvent": false,
"recipientAccountId": "111122223333",
"eventCategory": "Data",
"tlsDetails": {
"tlsVersion": "TLSv1.3",
"cipherSuite": "TLS_AES_128_GCM_SHA256",
"clientProvidedHostHeader": "cassandra.us-east-1.amazonaws.com"
}
}
Security in Amazon Keyspaces (for Apache Cassandra)
Cloud security at AWS is the highest priority. As an AWS customer, you benefit from a data center
and network architecture that is built to meet the requirements of the most security-sensitive
organizations.
Security is a shared responsibility between AWS and you. The shared responsibility model describes
this as security of the cloud and security in the cloud:
Security of the cloud – AWS is responsible for protecting the infrastructure that runs AWS
services in the AWS Cloud. AWS also provides you with services that you can use securely. The
effectiveness of our security is regularly tested and verified by third-party auditors as part of
the AWS compliance programs. To learn about the compliance programs that apply to Amazon
Keyspaces, see AWS Services in scope by compliance program.
Security in the cloud – Your responsibility is determined by the AWS service that you use. You
are also responsible for other factors including the sensitivity of your data, your organization’s
requirements, and applicable laws and regulations.
This documentation will help you understand how to apply the shared responsibility model when
using Amazon Keyspaces. The following topics show you how to configure Amazon Keyspaces to
meet your security and compliance objectives. You'll also learn how to use other AWS services that
can help you to monitor and secure your Amazon Keyspaces resources.
Topics
Data protection in Amazon Keyspaces
AWS Identity and Access Management for Amazon Keyspaces
Compliance validation for Amazon Keyspaces (for Apache Cassandra)
Resilience and disaster recovery in Amazon Keyspaces
Infrastructure security in Amazon Keyspaces
Configuration and vulnerability analysis for Amazon Keyspaces
Security best practices for Amazon Keyspaces
Data protection in Amazon Keyspaces
The AWS shared responsibility model applies to data protection in Amazon Keyspaces (for Apache
Cassandra). As described in this model, AWS is responsible for protecting the global infrastructure
that runs all of the AWS Cloud. You are responsible for maintaining control over your content
that is hosted on this infrastructure. You are also responsible for the security configuration and
management tasks for the AWS services that you use. For more information about data privacy,
see the Data Privacy FAQ. For information about data protection in Europe, see the AWS Shared
Responsibility Model and GDPR blog post on the AWS Security Blog.
For data protection purposes, we recommend that you protect AWS account credentials and set
up individual users with AWS IAM Identity Center or AWS Identity and Access Management (IAM).
That way, each user is given only the permissions necessary to fulfill their job duties. We also
recommend that you secure your data in the following ways:
Use multi-factor authentication (MFA) with each account.
Use SSL/TLS to communicate with AWS resources. We require TLS 1.2 and recommend TLS 1.3.
Set up API and user activity logging with AWS CloudTrail. For information about using CloudTrail
trails to capture AWS activities, see Working with CloudTrail trails in the AWS CloudTrail User
Guide.
Use AWS encryption solutions, along with all default security controls within AWS services.
Use advanced managed security services such as Amazon Macie, which assists in discovering and
securing sensitive data that is stored in Amazon S3.
If you require FIPS 140-3 validated cryptographic modules when accessing AWS through a
command line interface or an API, use a FIPS endpoint. For more information about the available
FIPS endpoints, see Federal Information Processing Standard (FIPS) 140-3.
We strongly recommend that you never put confidential or sensitive information, such as your
customers' email addresses, into tags or free-form text fields such as a Name field. This includes
when you work with Amazon Keyspaces or other AWS services using the console, API, AWS CLI, or
AWS SDKs. Any data that you enter into tags or free-form text fields used for names may be used
for billing or diagnostic logs. If you provide a URL to an external server, we strongly recommend
that you do not include credentials information in the URL to validate your request to that server.
Topics
Encryption at rest in Amazon Keyspaces
Encryption in transit in Amazon Keyspaces
Internetwork traffic privacy in Amazon Keyspaces
Encryption at rest in Amazon Keyspaces
Amazon Keyspaces (for Apache Cassandra) encryption at rest provides enhanced security by
encrypting all your data at rest using encryption keys stored in AWS Key Management Service (AWS
KMS). This functionality helps reduce the operational burden and complexity involved in protecting
sensitive data. With encryption at rest, you can build security-sensitive applications that meet strict
compliance and regulatory requirements for data protection.
Amazon Keyspaces encryption at rest encrypts your data using 256-bit Advanced Encryption
Standard (AES-256). This helps secure your data from unauthorized access to the underlying
storage.
Amazon Keyspaces encrypts and decrypts the table data transparently. Amazon Keyspaces uses
envelope encryption and a key hierarchy to protect data encryption keys. It integrates with AWS
KMS for storing and managing the root encryption key. For more information about the encryption
key hierarchy, see the section called “How it works”. For more information about AWS KMS
concepts like envelope encryption, see AWS KMS concepts in the AWS Key
Management Service Developer Guide.
When creating a new table, you can choose one of the following AWS KMS keys (KMS keys):
AWS owned key – This is the default encryption type. The key is owned by Amazon Keyspaces (no
additional charge).
Customer managed key – This key is stored in your account and is created, owned, and managed
by you. You have full control over the customer managed key (AWS KMS charges apply).
You can switch between the AWS owned key and a customer managed key at any time. You
can specify a customer managed key when you create a new table or change the KMS key of an
existing table by using the console or programmatically using CQL statements. To learn how, see
Encryption at rest: How to use customer managed keys to encrypt tables in Amazon Keyspaces.
Encryption at rest using the default option of AWS owned keys is offered at no additional charge.
However, AWS KMS charges apply for customer managed keys. For more information about pricing,
see AWS KMS pricing.
Amazon Keyspaces encryption at rest is available in all AWS Regions, including the AWS China
(Beijing) and AWS China (Ningxia) Regions. For more information, see Encryption at rest: How it
works in Amazon Keyspaces.
Topics
Encryption at rest: How it works in Amazon Keyspaces
Encryption at rest: How to use customer managed keys to encrypt tables in Amazon Keyspaces
Encryption at rest: How it works in Amazon Keyspaces
Amazon Keyspaces (for Apache Cassandra) encryption at rest encrypts your data using the 256-bit
Advanced Encryption Standard (AES-256). This helps secure your data from unauthorized access
to the underlying storage. All customer data in Amazon Keyspaces tables is encrypted at rest by
default, and server-side encryption is transparent, which means that changes to applications aren't
required.
Encryption at rest integrates with AWS Key Management Service (AWS KMS) for managing the
encryption key that is used to encrypt your tables. When creating a new table or updating an
existing table, you can choose one of the following AWS KMS key options:
AWS owned key – This is the default encryption type. The key is owned by Amazon Keyspaces (no
additional charge).
Customer managed key – This key is stored in your account and is created, owned, and managed
by you. You have full control over the customer managed key (AWS KMS charges apply).
AWS KMS key (KMS key)
Encryption at rest protects all your Amazon Keyspaces data with an AWS KMS key. By default,
Amazon Keyspaces uses an AWS owned key, a multi-tenant encryption key that is created and
managed in an Amazon Keyspaces service account.
However, you can encrypt your Amazon Keyspaces tables using a customer managed key in your
AWS account. You can select a different KMS key for each table in a keyspace. The KMS key you
select for a table is also used to encrypt all its metadata and restorable backups.
You select the KMS key for a table when you create or update the table. You can change the
KMS key for a table at any time, either in the Amazon Keyspaces console or by using the ALTER
TABLE statement. The process of switching KMS keys is seamless, and doesn't require downtime
or cause service degradation.
Key hierarchy
Amazon Keyspaces uses a key hierarchy to encrypt data. In this key hierarchy, the KMS key is the
root key. It's used to encrypt and decrypt the Amazon Keyspaces table encryption key. The table
encryption key is used to encrypt the encryption keys used internally by Amazon Keyspaces to
encrypt and decrypt data when performing read and write operations.
With the encryption key hierarchy, you can make changes to the KMS key without having to
reencrypt data or impact applications and ongoing data operations.
Table key
The Amazon Keyspaces table key is used as a key encryption key. Amazon Keyspaces uses
the table key to protect the internal data encryption keys that are used to encrypt the data
stored in tables, log files, and restorable backups. Amazon Keyspaces generates a unique data
encryption key for each underlying structure in a table. However, multiple table rows might be
protected by the same data encryption key.
When you first set the KMS key to a customer managed key, AWS KMS generates a data key. This
AWS KMS data key serves as the table key in Amazon Keyspaces.
When you access an encrypted table, Amazon Keyspaces sends a request to AWS KMS to use the
KMS key to decrypt the table key. Then, it uses the plaintext table key to decrypt the Amazon
Keyspaces data encryption keys, and it uses the plaintext data encryption keys to decrypt table
data.
Amazon Keyspaces uses and stores the table key and data encryption keys outside of AWS KMS.
It protects all keys with Advanced Encryption Standard (AES) encryption and 256-bit encryption
keys. Then, it stores the encrypted keys with the encrypted data so that they're available to
decrypt the table data on demand.
Table key caching
To avoid calling AWS KMS for every Amazon Keyspaces operation, Amazon Keyspaces caches
the plaintext table keys for each connection in memory. If Amazon Keyspaces gets a request
for the cached table key after five minutes of inactivity, it sends a new request to AWS KMS to
decrypt the table key. This call captures any changes made to the access policies of the KMS key
in AWS KMS or AWS Identity and Access Management (IAM) since the last request to decrypt the
table key.
Envelope encryption
If you change the customer managed key for your table, Amazon Keyspaces generates a new
table key. Then, it uses the new table key to reencrypt the data encryption keys. It also uses the
new table key to encrypt previous table keys that are used to protect restorable backups. This
process is called envelope encryption. This ensures that you can access restorable backups even
if you rotate the customer managed key. For more information about envelope encryption, see
Envelope Encryption in the AWS Key Management Service Developer Guide.
Topics
AWS owned keys
Customer managed keys
Encryption at rest usage notes
AWS owned keys
AWS owned keys aren't stored in your AWS account. They are part of a collection of KMS keys that
AWS owns and manages for use in multiple AWS accounts. AWS services can use AWS owned keys
to protect your data.
You can't view, manage, or use AWS owned keys, or audit their use. However, you don't need to do
any work or change any programs to protect the keys that encrypt your data.
You aren't charged a monthly fee or a usage fee for use of AWS owned keys, and they don't count
against AWS KMS quotas for your account.
Customer managed keys
Customer managed keys are keys in your AWS account that you create, own, and manage. You have
full control over these KMS keys.
Use a customer managed key to get the following features:
You create and manage the customer managed key, including setting and maintaining the key
policies, IAM policies, and grants to control access to the customer managed key. You can enable
and disable the customer managed key, enable and disable automatic key rotation, and schedule
the customer managed key for deletion when it is no longer in use. You can create tags and
aliases for the customer managed keys you manage.
You can use a customer managed key with imported key material or a customer managed key in
a custom key store that you own and manage.
You can use AWS CloudTrail and Amazon CloudWatch Logs to track the requests that Amazon
Keyspaces sends to AWS KMS on your behalf. For more information, see the section called “Step
6: Configure monitoring with AWS CloudTrail”.
Customer managed keys incur a charge for each API call, and AWS KMS quotas apply to these KMS
keys. For more information, see AWS KMS resource or request quotas.
When you specify a customer managed key as the root encryption key for a table, restorable
backups are encrypted with the same encryption key that is specified for the table at the time the
backup is created. If the KMS key for the table is rotated, key enveloping ensures that the latest
KMS key has access to all restorable backups.
Amazon Keyspaces must have access to your customer managed key to provide you access to
your table data. If the state of the encryption key is set to disabled or it's scheduled for deletion,
Amazon Keyspaces is unable to encrypt or decrypt data. As a result, you are not able to perform
read and write operations on the table. As soon as the service detects that your encryption key is
inaccessible, Amazon Keyspaces sends an email notification to alert you.
You must restore access to your encryption key within seven days or Amazon Keyspaces deletes
your table automatically. As a precaution, Amazon Keyspaces creates a restorable backup of your
table data before deleting the table. Amazon Keyspaces maintains the restorable backup for 35
days. After 35 days, you can no longer restore your table data. You aren't billed for the restorable
backup, but standard restore charges apply.
You can use this restorable backup to restore your data to a new table. To initiate the restore, the
last customer managed key used for the table must be enabled, and Amazon Keyspaces must have
access to it.
Note
If you create a table that's encrypted with a customer managed key that becomes
inaccessible or is scheduled for deletion before the creation process completes, an error
occurs. The create table operation fails, and you're sent an email notification.
Encryption at rest usage notes
Consider the following when you're using encryption at rest in Amazon Keyspaces.
Server-side encryption at rest is enabled on all Amazon Keyspaces tables and can't be disabled.
The entire table is encrypted at rest; you can't select specific columns or rows for encryption.
By default, Amazon Keyspaces uses a single-service default key (AWS owned key) for encrypting
all of your tables. If this key doesn’t exist, it's created for you. Service default keys can't be
disabled.
Encryption at rest only encrypts data while it's static (at rest) on a persistent storage media. If
data security is a concern for data in transit or data in use, you must take additional measures:
Data in transit: All your data in Amazon Keyspaces is encrypted in transit. By default,
communications to and from Amazon Keyspaces are protected by using Secure Sockets Layer
(SSL)/Transport Layer Security (TLS) encryption.
Data in use: Protect your data before sending it to Amazon Keyspaces by using client-side
encryption.
Customer managed keys: Data at rest in your tables is always encrypted using your customer
managed keys. However, operations that perform atomic updates of multiple rows encrypt
data temporarily using AWS owned keys during processing. This includes range delete
operations and operations that simultaneously access static and non-static data.
A single customer managed key can have up to 50,000 grants. Every Amazon Keyspaces table
associated with a customer managed key consumes 2 grants. One grant is released when the
table is deleted. The second grant is used to create an automatic snapshot of the table to
protect against data loss in case Amazon Keyspaces loses access to the customer managed key
unintentionally. This grant is released 42 days after deletion of the table.
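In practice, this means a single customer managed key can protect at most 25,000 active Amazon
Keyspaces tables (50,000 grants ÷ 2 grants per table), and recently deleted tables continue to
hold one grant for 42 days.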
Encryption at rest: How to use customer managed keys to encrypt tables in
Amazon Keyspaces
You can use the console or CQL statements to specify the AWS KMS key for new tables and update
the encryption keys of existing tables in Amazon Keyspaces. The following topics outline how to
implement customer managed keys for new and existing tables.
Topics
Prerequisites: Create a customer managed key using AWS KMS and grant permissions to Amazon
Keyspaces
Step 3: Specify a customer managed key for a new table
Step 4: Update the encryption key of an existing table
Step 5: Use the Amazon Keyspaces encryption context in logs
Step 6: Configure monitoring with AWS CloudTrail
Prerequisites: Create a customer managed key using AWS KMS and grant permissions to
Amazon Keyspaces
Before you can protect an Amazon Keyspaces table with a customer managed key, you must first
create the key in AWS Key Management Service (AWS KMS) and then authorize Amazon Keyspaces
to use that key.
Step 1: Create a customer managed key using AWS KMS
To create a customer managed key to be used to protect an Amazon Keyspaces table, you can
follow the steps in Creating symmetric encryption KMS keys using the console or the AWS API.
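If you prefer to create the key programmatically, the following AWS CLI sketch shows an
equivalent call. The description text is illustrative, and the key spec and key usage flags are
shown for clarity even though they are the defaults for symmetric encryption keys.
# Create a symmetric encryption KMS key to protect Amazon Keyspaces tables.
aws kms create-key \
    --description "Customer managed key for Amazon Keyspaces encryption at rest" \
    --key-spec SYMMETRIC_DEFAULT \
    --key-usage ENCRYPT_DECRYPT
The command returns the key metadata, including the key ARN that you pass to Amazon Keyspaces in
the following steps.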
Step 2: Authorize the use of your customer managed key
Before you can choose a customer managed key to protect an Amazon Keyspaces table, the policies
on that customer managed key must give Amazon Keyspaces permission to use it on your behalf.
You have full control over the policies and grants on the customer managed key. You can provide
these permissions in a key policy, an IAM policy, or a grant.
Amazon Keyspaces doesn't need additional authorization to use the default AWS owned key to
protect the Amazon Keyspaces tables in your AWS account.
The following topics show how to configure the required permissions using IAM policies and grants
that allow Amazon Keyspaces tables to use a customer managed key.
Topics
Key policy for customer managed keys
Example key policy
Using grants to authorize Amazon Keyspaces
Key policy for customer managed keys
When you select a customer managed key to protect an Amazon Keyspaces table, Amazon
Keyspaces gets permission to use the customer managed key on behalf of the principal who makes
the selection. That principal, a user or role, must have the permissions on the customer managed
key that Amazon Keyspaces requires.
At a minimum, Amazon Keyspaces requires the following permissions on a customer managed key:
kms:Encrypt
kms:Decrypt
kms:ReEncrypt* (for kms:ReEncryptFrom and kms:ReEncryptTo)
kms:GenerateDataKey* (for kms:GenerateDataKey and kms:GenerateDataKeyWithoutPlaintext)
kms:DescribeKey
kms:CreateGrant
Example key policy
The following example key policy provides only the required permissions. The policy has the
following effects:
Allows Amazon Keyspaces to use the customer managed key in cryptographic operations
and create grants—but only when it's acting on behalf of principals in the account who have
permission to use Amazon Keyspaces. If the principals specified in the policy statement don't
have permission to use Amazon Keyspaces, the call fails, even when it comes from the Amazon
Keyspaces service.
The kms:ViaService condition key allows the permissions only when the request comes
from Amazon Keyspaces on behalf of the principals listed in the policy statement. These
principals can't call these operations directly. Note that the kms:ViaService value,
cassandra.*.amazonaws.com, has an asterisk (*) in the Region position. Amazon Keyspaces
requires the permission to be independent of any particular AWS Region.
Gives the customer managed key administrators (users who can assume the db-team role) read-
only access to the customer managed key and permission to revoke grants, including the grants
that Amazon Keyspaces requires to protect the table.
Gives Amazon Keyspaces read-only access to the customer managed key. In this case, Amazon
Keyspaces can call these operations directly. It doesn't have to act on behalf of an account
principal.
Before using an example key policy, replace the example principals with actual principals from your
AWS account.
{
"Id": "key-policy-cassandra",
"Version":"2012-10-17",
"Statement": [
{
"Sid" : "Allow access through Amazon Keyspaces for all principals in the account
that are authorized to use Amazon Keyspaces",
"Effect": "Allow",
"Principal": {"AWS": "arn:aws:iam::111122223333:user/db-lead"},
"Action": [
"kms:Encrypt",
"kms:Decrypt",
"kms:ReEncrypt*",
"kms:GenerateDataKey*",
"kms:DescribeKey",
"kms:CreateGrant"
],
"Resource": "*",
"Condition": {
"StringLike": {
"kms:ViaService" : "cassandra.*.amazonaws.com"
}
}
},
{
"Sid": "Allow administrators to view the customer managed key and revoke
grants",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111122223333:role/db-team"
},
"Action": [
"kms:Describe*",
"kms:Get*",
"kms:List*",
"kms:RevokeGrant"
],
"Resource": "*"
}
]
}
Using grants to authorize Amazon Keyspaces
In addition to key policies, Amazon Keyspaces uses grants to set permissions on a customer
managed key. To view the grants on a customer managed key in your account, use the ListGrants
operation. Amazon Keyspaces doesn't need grants, or any additional permissions, to use the AWS
owned key to protect your table.
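For example, the following AWS CLI sketch lists the grants on a customer managed key, using the
placeholder key ARN from this topic.
# List the grants on a customer managed key. Grants created by Amazon
# Keyspaces are constrained to the encryption context of a specific table.
aws kms list-grants \
    --key-id arn:aws:kms:eu-west-1:5555555555555:key/11111111-1111-111-1111-111111111111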
Amazon Keyspaces uses the grant permissions when it performs background system maintenance
and continuous data protection tasks. It also uses grants to generate table keys.
Each grant is specific to a table. If the account includes multiple tables encrypted under the same
customer managed key, there is a grant of each type for each table. The grant is constrained by the
Amazon Keyspaces encryption context, which includes the table name and the AWS account ID. The
grant includes permission to retire the grant if it's no longer needed.
To create the grants, Amazon Keyspaces must have permission to call CreateGrant on behalf of
the user who created the encrypted table.
The key policy can also allow the account to revoke the grant on the customer managed key.
However, if you revoke the grant on an active encrypted table, Amazon Keyspaces will not be able
to protect and maintain the table.
Step 3: Specify a customer managed key for a new table
Follow these steps to specify the customer managed key on a new table using the Amazon
Keyspaces console or CQL.
Create an encrypted table using a customer managed key (console)
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables, and then choose Create table.
3. On the Create table page in the Table details section, select a keyspace and provide a name
for the new table.
4. In the Schema section, create the schema for your table.
5. In the Table settings section, choose Customize settings.
6. Continue to Encryption settings.
In this step, you select the encryption settings for the table.
In the Encryption at rest section under Choose an AWS KMS key, choose the option Choose
a different KMS key (advanced), and in the search field, choose an AWS KMS key or enter an
Amazon Resource Name (ARN).
Note
If the key you selected is not accessible or is missing the required permissions, see
Troubleshooting key access in the AWS Key Management Service Developer Guide.
7. Choose Create to create the encrypted table.
Create a new table using a customer managed key for encryption at rest (CQL)
To create a new table that uses a customer managed key for encryption at rest, you can use the
CREATE TABLE statement as shown in the following example. Make sure to replace the key ARN
with an ARN for a valid key with permissions granted to Amazon Keyspaces.
CREATE TABLE my_keyspace.my_table(id bigint, name text, place text STATIC, PRIMARY KEY(id, name)) WITH CUSTOM_PROPERTIES = {
'encryption_specification':{
'encryption_type': 'CUSTOMER_MANAGED_KMS_KEY',
'kms_key_identifier':'arn:aws:kms:eu-west-1:5555555555555:key/11111111-1111-111-1111-111111111111'
}
};
If you receive an Invalid Request Exception, you need to confirm that the customer
managed key is valid and Amazon Keyspaces has the required permissions. To confirm that the key
has been configured correctly, see Troubleshooting key access in the AWS Key Management Service
Developer Guide.
Step 4: Update the encryption key of an existing table
You can also use the Amazon Keyspaces console or CQL to change the encryption keys of an
existing table between an AWS owned key and a customer managed KMS key at any time.
Update an existing table with the new customer managed key (console)
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at https://console.aws.amazon.com/keyspaces/home.
2. In the navigation pane, choose Tables.
3. Choose the table that you want to update, and then choose the Additional settings tab.
4. In the Encryption at rest section, choose Manage Encryption to edit the encryption settings
for the table.
Under Choose an AWS KMS key, choose the option Choose a different KMS key (advanced),
and in the search field, choose an AWS KMS key or enter an Amazon Resource Name (ARN).
Note
If the key you selected is not valid, see Troubleshooting key access in the AWS Key
Management Service Developer Guide.
Alternatively, you can choose an AWS owned key for a table that is encrypted with a customer
managed key.
5. Choose Save changes to save your changes to the table.
Update the encryption key used for an existing table (CQL)
To change the encryption key of an existing table, you use the ALTER TABLE statement to specify
a customer managed key for encryption at rest. Make sure to replace the key ARN with an ARN for
a valid key with permissions granted to Amazon Keyspaces.
ALTER TABLE my_keyspace.my_table WITH CUSTOM_PROPERTIES = {
'encryption_specification':{
'encryption_type': 'CUSTOMER_MANAGED_KMS_KEY',
'kms_key_identifier':'arn:aws:kms:eu-west-1:5555555555555:key/11111111-1111-111-1111-111111111111'
}
};
If you receive an Invalid Request Exception, you need to confirm that the customer
managed key is valid and Amazon Keyspaces has the required permissions. To confirm that the key
has been configured correctly, see Troubleshooting key access in the AWS Key Management Service
Developer Guide.
To change the encryption key back to the default encryption at rest option with AWS owned keys,
you can use the ALTER TABLE statement as shown in the following example.
ALTER TABLE my_keyspace.my_table WITH CUSTOM_PROPERTIES = {
'encryption_specification':{
'encryption_type' : 'AWS_OWNED_KMS_KEY'
}
};
Step 5: Use the Amazon Keyspaces encryption context in logs
An encryption context is a set of key–value pairs that contain arbitrary nonsecret data. When you
include an encryption context in a request to encrypt data, AWS KMS cryptographically binds
the encryption context to the encrypted data. To decrypt the data, you must pass in the same
encryption context.
Amazon Keyspaces uses the same encryption context in all AWS KMS cryptographic operations.
If you use a customer managed key to protect your Amazon Keyspaces table, you can use the
encryption context to identify the use of the customer managed key in audit records and logs. It
also appears in plaintext in logs, such as in logs for AWS CloudTrail and Amazon CloudWatch Logs.
In its requests to AWS KMS, Amazon Keyspaces uses an encryption context with three key–value
pairs.
"encryptionContextSubset": {
"aws:cassandra:keyspaceName": "my_keyspace",
"aws:cassandra:tableName": "mytable"
"aws:cassandra:subscriberId": "111122223333"
}
Keyspace – The first key–value pair identifies the keyspace that includes the table that Amazon
Keyspaces is encrypting. The key is aws:cassandra:keyspaceName. The value is the name of
the keyspace.
"aws:cassandra:keyspaceName": "<keyspace-name>"
For example:
"aws:cassandra:keyspaceName": "my_keyspace"
Table – The second key–value pair identifies the table that Amazon Keyspaces is encrypting. The
key is aws:cassandra:tableName. The value is the name of the table.
"aws:cassandra:tableName": "<table-name>"
For example:
"aws:cassandra:tableName": "my_table"
Account – The third key–value pair identifies the AWS account. The key is
aws:cassandra:subscriberId. The value is the account ID.
"aws:cassandra:subscriberId": "<account-id>"
For example:
"aws:cassandra:subscriberId": "111122223333"
Step 6: Configure monitoring with AWS CloudTrail
If you use a customer managed key to protect your Amazon Keyspaces tables, you can use AWS
CloudTrail logs to track the requests that Amazon Keyspaces sends to AWS KMS on your behalf.
The GenerateDataKey, DescribeKey, Decrypt, and CreateGrant requests are discussed in
this section. In addition, Amazon Keyspaces uses a RetireGrant operation to remove a grant when
you delete a table.
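As an illustration, the following AWS CLI sketch retrieves recent AWS KMS events from CloudTrail;
the start time is a placeholder, and in practice you might instead query the trail's log files in
Amazon S3 or Amazon CloudWatch Logs.
# Look up recent management events emitted by AWS KMS, such as
# GenerateDataKey, DescribeKey, Decrypt, and CreateGrant.
aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventSource,AttributeValue=kms.amazonaws.com \
    --start-time 2021-04-16T00:00:00Z \
    --max-results 50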
GenerateDataKey
Amazon Keyspaces creates a unique table key to encrypt data at rest. It sends a
GenerateDataKey request to AWS KMS that specifies the KMS key for the table.
The event that records the GenerateDataKey operation is similar to the following example
event. The user is the Amazon Keyspaces service account. The parameters include the Amazon
Resource Name (ARN) of the customer managed key, a key specifier that requires a 256-bit key,
and the encryption context that identifies the keyspace, the table, and the AWS account.
{
"eventVersion": "1.08",
"userIdentity": {
"type": "AWSService",
"invokedBy": "AWS Internal"
},
"eventTime": "2021-04-16T04:56:05Z",
"eventSource": "kms.amazonaws.com",
"eventName": "GenerateDataKey",
"awsRegion": "us-east-1",
"sourceIPAddress": "AWS Internal",
"userAgent": "AWS Internal",
"requestParameters": {
"keySpec": "AES_256",
"encryptionContext": {
"aws:cassandra:keyspaceName": "my_keyspace",
"aws:cassandra:tableName": "my_table",
"aws:cassandra:subscriberId": "123SAMPLE012"
},
"keyId": "arn:aws:kms:eu-
west-1:5555555555555:key/11111111-1111-111-1111-111111111111"
},
"responseElements": null,
"requestID": "5e8e9cb5-9194-4334-aacc-9dd7d50fe246",
"eventID": "49fccab9-2448-4b97-a89d-7d5c39318d6f",
"readOnly": true,
"resources": [
{
"accountId": "123SAMPLE012",
"type": "AWS::KMS::Key",
"ARN": "arn:aws:kms:eu-
west-1:5555555555555:key/11111111-1111-111-1111-111111111111"
}
],
"eventType": "AwsApiCall",
"managementEvent": true,
"eventCategory": "Management",
"recipientAccountId": "123SAMPLE012",
"sharedEventID": "84fbaaf0-9641-4e32-9147-57d2cb08792e"
}
DescribeKey
Amazon Keyspaces uses a DescribeKey operation to determine whether the KMS key you
selected exists in the account and Region.
The event that records the DescribeKey operation is similar to the following example event.
The user is the Amazon Keyspaces service account. The parameters include the ARN of the
customer managed key.
{
"eventVersion": "1.08",
"userIdentity": {
"type": "IAMUser",
"principalId": "AIDAZ3FNIIVIZZ6H7CFQG",
"arn": "arn:aws:iam::123SAMPLE012:user/admin",
"accountId": "123SAMPLE012",
"accessKeyId": "AKIAIOSFODNN7EXAMPLE",
"userName": "admin",
"sessionContext": {
"sessionIssuer": {},
"webIdFederationData": {},
"attributes": {
"mfaAuthenticated": "false",
"creationDate": "2021-04-16T04:55:42Z"
}
},
"invokedBy": "AWS Internal"
},
"eventTime": "2021-04-16T04:55:58Z",
"eventSource": "kms.amazonaws.com",
"eventName": "DescribeKey",
"awsRegion": "us-east-1",
"sourceIPAddress": "AWS Internal",
"userAgent": "AWS Internal",
"requestParameters": {
"keyId": "arn:aws:kms:eu-
west-1:5555555555555:key/11111111-1111-111-1111-111111111111"
},
"responseElements": null,
"requestID": "c25a8105-050b-4f52-8358-6e872fb03a6c",
"eventID": "0d96420e-707e-41b9-9118-56585a669658",
"readOnly": true,
"resources": [
{
"accountId": "123SAMPLE012",
"type": "AWS::KMS::Key",
"ARN": "arn:aws:kms:eu-
west-1:5555555555555:key/11111111-1111-111-1111-111111111111"
}
],
"eventType": "AwsApiCall",
"managementEvent": true,
"eventCategory": "Management",
"recipientAccountId": "123SAMPLE012"
}
Decrypt
When you access an Amazon Keyspaces table, Amazon Keyspaces needs to decrypt the table key
so that it can decrypt the keys below it in the hierarchy. It then decrypts the data in the table.
To decrypt the table key, Amazon Keyspaces sends a Decrypt request to AWS KMS that specifies
the KMS key for the table.
The event that records the Decrypt operation is similar to the following example event. The
user is the principal in your AWS account who is accessing the table. The parameters include
the encrypted table key (as a ciphertext blob) and the encryption context that identifies the
table and the AWS account. AWS KMS derives the ID of the customer managed key from the
ciphertext.
{
"eventVersion": "1.08",
"userIdentity": {
"type": "AWSService",
"invokedBy": "AWS Internal"
},
"eventTime": "2021-04-16T05:29:44Z",
"eventSource": "kms.amazonaws.com",
"eventName": "Decrypt",
"awsRegion": "us-east-1",
"sourceIPAddress": "AWS Internal",
"userAgent": "AWS Internal",
"requestParameters": {
"encryptionContext": {
"aws:cassandra:keyspaceName": "my_keyspace",
"aws:cassandra:tableName": "my_table",
"aws:cassandra:subscriberId": "123SAMPLE012"
},
"encryptionAlgorithm": "SYMMETRIC_DEFAULT"
},
"responseElements": null,
"requestID": "50e80373-83c9-4034-8226-5439e1c9b259",
"eventID": "8db9788f-04a5-4ae2-90c9-15c79c411b6b",
"readOnly": true,
"resources": [
{
"accountId": "123SAMPLE012",
"type": "AWS::KMS::Key",
"ARN": "arn:aws:kms:eu-
west-1:5555555555555:key/11111111-1111-111-1111-111111111111"
}
],
"eventType": "AwsApiCall",
"managementEvent": true,
"eventCategory": "Management",
"recipientAccountId": "123SAMPLE012",
"sharedEventID": "7ed99e2d-910a-4708-a4e3-0180d8dbb68e"
}
CreateGrant
When you use a customer managed key to protect your Amazon Keyspaces table, Amazon
Keyspaces uses grants to allow the service to perform continuous data protection and
maintenance and durability tasks. These grants aren't required on AWS owned keys.
The grants that Amazon Keyspaces creates are specific to a table. The principal in the
CreateGrant request is the user who created the table.
The event that records the CreateGrant operation is similar to the following example event.
The parameters include the ARN of the customer managed key for the table, the grantee
principal and retiring principal (the Amazon Keyspaces service), and the operations that the
grant covers. It also includes a constraint that requires that all encryption operations use the
specified encryption context.
{
"eventVersion": "1.08",
"userIdentity": {
"type": "IAMUser",
"principalId": "AIDAZ3FNIIVIZZ6H7CFQG",
"arn": "arn:aws:iam::arn:aws:kms:eu-
west-1:5555555555555:key/11111111-1111-111-1111-111111111111:user/admin",
"accountId": "arn:aws:kms:eu-
west-1:5555555555555:key/11111111-1111-111-1111-111111111111",
"accessKeyId": "AKIAI44QH8DHBEXAMPLE",
"userName": "admin",
"sessionContext": {
"sessionIssuer": {},
"webIdFederationData": {},
"attributes": {
"mfaAuthenticated": "false",
"creationDate": "2021-04-16T04:55:42Z"
}
},
"invokedBy": "AWS Internal"
},
"eventTime": "2021-04-16T05:11:10Z",
"eventSource": "kms.amazonaws.com",
"eventName": "CreateGrant",
"awsRegion": "us-east-1",
"sourceIPAddress": "AWS Internal",
"userAgent": "AWS Internal",
"requestParameters": {
"keyId": "a7d328af-215e-4661-9a69-88c858909f20",
"operations": [
"DescribeKey",
"GenerateDataKey",
"Decrypt",
"Encrypt",
"ReEncryptFrom",
"ReEncryptTo",
"RetireGrant"
],
"constraints": {
"encryptionContextSubset": {
"aws:cassandra:keyspaceName": "my_keyspace",
"aws:cassandra:tableName": "my_table",
"aws:cassandra:subscriberId": "123SAMPLE012"
}
},
"retiringPrincipal": "cassandratest.us-east-1.amazonaws.com",
"granteePrincipal": "cassandratest.us-east-1.amazonaws.com"
},
"responseElements": {
"grantId":
"18e4235f1b07f289762a31a1886cb5efd225f069280d4f76cd83b9b9b5501013"
},
"requestID": "b379a767-1f9b-48c3-b731-fb23e865e7f7",
"eventID": "29ee1fd4-28f2-416f-a419-551910d20291",
"readOnly": false,
"resources": [
{
"accountId": "123SAMPLE012",
"type": "AWS::KMS::Key",
"ARN": "arn:aws:kms:eu-
west-1:5555555555555:key/11111111-1111-111-1111-111111111111"
}
],
"eventType": "AwsApiCall",
"managementEvent": true,
"eventCategory": "Management",
"recipientAccountId": "123SAMPLE012"
}
Encryption in transit in Amazon Keyspaces
Amazon Keyspaces only accepts secure connections using Transport Layer Security (TLS).
Encryption in transit provides an additional layer of data protection by encrypting your data
as it travels to and from Amazon Keyspaces. Organizational policies, industry or government
regulations, and compliance requirements often require the use of encryption in transit to increase
the data security of your applications when they transmit data over the network.
To learn how to encrypt cqlsh connections to Amazon Keyspaces using TLS, see the section called
“How to manually configure cqlsh connections for TLS”. To learn how to use TLS encryption with
client drivers, see the section called “Using a Cassandra client driver”.
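As an illustration, a cqlsh connection to Amazon Keyspaces specifies the service endpoint, TLS
port 9142, and the --ssl flag. This sketch assumes you've already added the Starfield digital
certificate to your cqlshrc file and generated service-specific credentials; the user name and
password shown are placeholders.
# Connect to the us-east-1 service endpoint over TLS.
cqlsh cassandra.us-east-1.amazonaws.com 9142 -u "alice-at-111122223333" -p "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" --ssl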
Internetwork traffic privacy in Amazon Keyspaces
This topic describes how Amazon Keyspaces (for Apache Cassandra) secures connections from
on-premises applications to Amazon Keyspaces and between Amazon Keyspaces and other AWS
resources within the same AWS Region.
Traffic between service and on-premises clients and applications
You have two connectivity options between your private network and AWS:
An AWS Site-to-Site VPN connection. For more information, see What is AWS Site-to-Site VPN?
in the AWS Site-to-Site VPN User Guide.
An AWS Direct Connect connection. For more information, see What is AWS Direct Connect? in
the AWS Direct Connect User Guide.
As a managed service, Amazon Keyspaces (for Apache Cassandra) is protected by AWS
global network security. For information about AWS security services and how AWS protects
infrastructure, see AWS Cloud Security. To design your AWS environment using the best practices
for infrastructure security, see Infrastructure Protection in the Security Pillar of the AWS
Well-Architected Framework.
You use AWS published API calls to access Amazon Keyspaces through the network. Clients must
support the following:
Transport Layer Security (TLS). We require TLS 1.2 and recommend TLS 1.3.
Cipher suites with perfect forward secrecy (PFS) such as DHE (Ephemeral Diffie-Hellman) or
ECDHE (Elliptic Curve Ephemeral Diffie-Hellman). Most modern systems such as Java 7 and later
support these modes.
Additionally, requests must be signed by using an access key ID and a secret access key that is
associated with an IAM principal. Or you can use the AWS Security Token Service (AWS STS) to
generate temporary security credentials to sign requests.
Amazon Keyspaces supports two methods of authenticating client requests. The first method uses
service-specific credentials, which are password-based credentials generated for a specific IAM user.
You can create and manage the password using the IAM console, the AWS CLI, or the AWS API. For
more information, see Using IAM with Amazon Keyspaces.
The second method uses an authentication plugin for the open-source DataStax Java Driver for
Cassandra. This plugin enables IAM users, roles, and federated identities to add authentication
information to Amazon Keyspaces (for Apache Cassandra) API requests using the AWS Signature
Version 4 process (SigV4). For more information, see the section called “Create IAM credentials for
AWS authentication”.
Traffic between AWS resources in the same Region
Interface VPC endpoints enable private communication between your virtual private cloud (VPC)
running in Amazon VPC and Amazon Keyspaces. Interface VPC endpoints are powered by AWS
PrivateLink, which is an AWS service that enables private communication between VPCs and AWS
services. AWS PrivateLink enables this by using an elastic network interface with private IPs in
your VPC so that network traffic does not leave the Amazon network. Interface VPC endpoints
don't require an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
For more information, see Amazon Virtual Private Cloud and Interface VPC endpoints (AWS
PrivateLink). For example policies, see the section called “Using interface VPC endpoints for
Amazon Keyspaces”.
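As an illustration, the following AWS CLI sketch creates an interface VPC endpoint for Amazon
Keyspaces; the VPC, subnet, and security group IDs are placeholders that you'd replace with your
own.
# Create an interface VPC endpoint for Amazon Keyspaces
# (service name format: com.amazonaws.<region>.cassandra).
aws ec2 create-vpc-endpoint \
    --vpc-id vpc-1a2b3c4d \
    --vpc-endpoint-type Interface \
    --service-name com.amazonaws.us-east-1.cassandra \
    --subnet-ids subnet-1a2b3c4d \
    --security-group-ids sg-1a2b3c4d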
AWS Identity and Access Management for Amazon Keyspaces
AWS Identity and Access Management (IAM) is an AWS service that helps an administrator securely
control access to AWS resources. IAM administrators control who can be authenticated (signed in)
and authorized (have permissions) to use Amazon Keyspaces resources. IAM is an AWS service that
you can use with no additional charge.
Topics
Audience
Authenticating with identities
Managing access using policies
How Amazon Keyspaces works with IAM
Amazon Keyspaces identity-based policy examples
AWS managed policies for Amazon Keyspaces
Troubleshooting Amazon Keyspaces identity and access
Using service-linked roles for Amazon Keyspaces
Audience
How you use AWS Identity and Access Management (IAM) differs, depending on the work that you
do in Amazon Keyspaces.
Service user – If you use the Amazon Keyspaces service to do your job, then your administrator
provides you with the credentials and permissions that you need. As you use more Amazon
Keyspaces features to do your work, you might need additional permissions. Understanding how
access is managed can help you request the right permissions from your administrator. If you
cannot access a feature in Amazon Keyspaces, see Troubleshooting Amazon Keyspaces identity and
access.
Service administrator – If you're in charge of Amazon Keyspaces resources at your company, you
probably have full access to Amazon Keyspaces. It's your job to determine which Amazon Keyspaces
features and resources your service users should access. You must then submit requests to your IAM
administrator to change the permissions of your service users. Review the information on this page
to understand the basic concepts of IAM. To learn more about how your company can use IAM with
Amazon Keyspaces, see How Amazon Keyspaces works with IAM.
IAM administrator – If you're an IAM administrator, you might want to learn details about how you
can write policies to manage access to Amazon Keyspaces. To view example Amazon Keyspaces
identity-based policies that you can use in IAM, see Amazon Keyspaces identity-based policy
examples.
Authenticating with identities
Authentication is how you sign in to AWS using your identity credentials. You must be
authenticated (signed in to AWS) as the AWS account root user, as an IAM user, or by assuming an
IAM role.
You can sign in to AWS as a federated identity by using credentials provided through an identity
source. AWS IAM Identity Center (IAM Identity Center) users, your company's single sign-on
authentication, and your Google or Facebook credentials are examples of federated identities.
When you sign in as a federated identity, your administrator previously set up identity federation
using IAM roles. When you access AWS by using federation, you are indirectly assuming a role.
Depending on the type of user you are, you can sign in to the AWS Management Console or the
AWS access portal. For more information about signing in to AWS, see How to sign in to your AWS
account in the AWS Sign-In User Guide.
If you access AWS programmatically, AWS provides a software development kit (SDK) and a
command line interface (CLI) to cryptographically sign your requests by using your credentials. If
you don't use AWS tools, you must sign requests yourself. For more information about using the
recommended method to sign requests yourself, see Signing AWS API requests in the IAM User
Guide.
Regardless of the authentication method that you use, you might be required to provide additional
security information. For example, AWS recommends that you use multi-factor authentication
(MFA) to increase the security of your account. To learn more, see Multi-factor authentication in the
AWS IAM Identity Center User Guide and Using multi-factor authentication (MFA) in AWS in the IAM
User Guide.
AWS account root user
When you create an AWS account, you begin with one sign-in identity that has complete access to
all AWS services and resources in the account. This identity is called the AWS account root user and
is accessed by signing in with the email address and password that you used to create the account.
We strongly recommend that you don't use the root user for your everyday tasks. Safeguard your
root user credentials and use them to perform the tasks that only the root user can perform. For
the complete list of tasks that require you to sign in as the root user, see Tasks that require root
user credentials in the IAM User Guide.
IAM users and groups
An IAM user is an identity within your AWS account that has specific permissions for a single person
or application. Where possible, we recommend relying on temporary credentials instead of creating
IAM users who have long-term credentials such as passwords and access keys. However, if you have
specific use cases that require long-term credentials with IAM users, we recommend that you rotate
access keys. For more information, see Rotate access keys regularly for use cases that require long-
term credentials in the IAM User Guide.
An IAM group is an identity that specifies a collection of IAM users. You can't sign in as a group. You
can use groups to specify permissions for multiple users at a time. Groups make permissions easier
to manage for large sets of users. For example, you could have a group named IAMAdmins and give
that group permissions to administer IAM resources.
Users are different from roles. A user is uniquely associated with one person or application, but
a role is intended to be assumable by anyone who needs it. Users have permanent long-term
credentials, but roles provide temporary credentials. To learn more, see When to create an IAM user
(instead of a role) in the IAM User Guide.
IAM roles
An IAM role is an identity within your AWS account that has specific permissions. It is similar to an
IAM user, but is not associated with a specific person. You can temporarily assume an IAM role in
the AWS Management Console by switching roles. You can assume a role by calling an AWS CLI or
AWS API operation or by using a custom URL. For more information about methods for using roles,
see Using IAM roles in the IAM User Guide.
IAM roles with temporary credentials are useful in the following situations:
Federated user access – To assign permissions to a federated identity, you create a role
and define permissions for the role. When a federated identity authenticates, the identity
is associated with the role and is granted the permissions that are defined by the role. For
information about roles for federation, see Creating a role for a third-party Identity Provider
in the IAM User Guide. If you use IAM Identity Center, you configure a permission set. To control
what your identities can access after they authenticate, IAM Identity Center correlates the
permission set to a role in IAM. For information about permissions sets, see Permission sets in
the AWS IAM Identity Center User Guide.
Temporary IAM user permissions – An IAM user or role can assume an IAM role to temporarily
take on different permissions for a specific task.
Cross-account access – You can use an IAM role to allow someone (a trusted principal) in a
different account to access resources in your account. Roles are the primary way to grant cross-
account access. However, with some AWS services, you can attach a policy directly to a resource
(instead of using a role as a proxy). To learn the difference between roles and resource-based
policies for cross-account access, see Cross account resource access in IAM in the IAM User Guide.
Cross-service access – Some AWS services use features in other AWS services. For example, when
you make a call in a service, it's common for that service to run applications in Amazon EC2 or
store objects in Amazon S3. A service might do this using the calling principal's permissions,
using a service role, or using a service-linked role.
Forward access sessions (FAS) – When you use an IAM user or role to perform actions in
AWS, you are considered a principal. When you use some services, you might perform an
action that then initiates another action in a different service. FAS uses the permissions of the
principal calling an AWS service, combined with the requesting AWS service to make requests
to downstream services. FAS requests are only made when a service receives a request that
requires interactions with other AWS services or resources to complete. In this case, you must
have permissions to perform both actions. For policy details when making FAS requests, see
Forward access sessions.
Service role – A service role is an IAM role that a service assumes to perform actions on your
behalf. An IAM administrator can create, modify, and delete a service role from within IAM. For
more information, see Creating a role to delegate permissions to an AWS service in the IAM
User Guide.
Service-linked role – A service-linked role is a type of service role that is linked to an AWS
service. The service can assume the role to perform an action on your behalf. Service-linked
roles appear in your AWS account and are owned by the service. An IAM administrator can
view, but not edit the permissions for service-linked roles.
Applications running on Amazon EC2 – You can use an IAM role to manage temporary
credentials for applications that are running on an EC2 instance and making AWS CLI or AWS API
requests. This is preferable to storing access keys within the EC2 instance. To assign an AWS role
to an EC2 instance and make it available to all of its applications, you create an instance profile
that is attached to the instance. An instance profile contains the role and enables programs that
are running on the EC2 instance to get temporary credentials. For more information, see Using
an IAM role to grant permissions to applications running on Amazon EC2 instances in the IAM
User Guide.
To learn whether to use IAM roles or IAM users, see When to create an IAM role (instead of a user)
in the IAM User Guide.
Managing access using policies
You control access in AWS by creating policies and attaching them to AWS identities or resources.
A policy is an object in AWS that, when associated with an identity or resource, defines their
permissions. AWS evaluates these policies when a principal (user, root user, or role session) makes
a request. Permissions in the policies determine whether the request is allowed or denied. Most
policies are stored in AWS as JSON documents. For more information about the structure and
contents of JSON policy documents, see Overview of JSON policies in the IAM User Guide.
Administrators can use AWS JSON policies to specify who has access to what. That is, which
principal can perform actions on what resources, and under what conditions.
By default, users and roles have no permissions. To grant users permission to perform actions on
the resources that they need, an IAM administrator can create IAM policies. The administrator can
then add the IAM policies to roles, and users can assume the roles.
IAM policies define permissions for an action regardless of the method that you use to perform the
operation. For example, suppose that you have a policy that allows the iam:GetRole action. A
user with that policy can get role information from the AWS Management Console, the AWS CLI, or
the AWS API.
Identity-based policies
Identity-based policies are JSON permissions policy documents that you can attach to an identity,
such as an IAM user, group of users, or role. These policies control what actions users and roles can
perform, on which resources, and under what conditions. To learn how to create an identity-based
policy, see Creating IAM policies in the IAM User Guide.
Identity-based policies can be further categorized as inline policies or managed policies. Inline
policies are embedded directly into a single user, group, or role. Managed policies are standalone
policies that you can attach to multiple users, groups, and roles in your AWS account. Managed
policies include AWS managed policies and customer managed policies. To learn how to choose
between a managed policy or an inline policy, see Choosing between managed policies and inline
policies in the IAM User Guide.
Resource-based policies
Resource-based policies are JSON policy documents that you attach to a resource. Examples of
resource-based policies are IAM role trust policies and Amazon S3 bucket policies. In services that
support resource-based policies, service administrators can use them to control access to a specific
resource. For the resource where the policy is attached, the policy defines what actions a specified
principal can perform on that resource and under what conditions. You must specify a principal
in a resource-based policy. Principals can include accounts, users, roles, federated users, or AWS
services.
Resource-based policies are inline policies that are located in that service. You can't use AWS
managed policies from IAM in a resource-based policy.
Access control lists (ACLs)
Access control lists (ACLs) control which principals (account members, users, or roles) have
permissions to access a resource. ACLs are similar to resource-based policies, although they do not
use the JSON policy document format.
Amazon S3, AWS WAF, and Amazon VPC are examples of services that support ACLs. To learn more
about ACLs, see Access control list (ACL) overview in the Amazon Simple Storage Service Developer
Guide.
Other policy types
AWS supports additional, less-common policy types. These policy types can set the maximum
permissions granted to you by the more common policy types.
Permissions boundaries – A permissions boundary is an advanced feature in which you set
the maximum permissions that an identity-based policy can grant to an IAM entity (IAM user
or role). You can set a permissions boundary for an entity. The resulting permissions are the
intersection of an entity's identity-based policies and its permissions boundaries. Resource-based
policies that specify the user or role in the Principal field are not limited by the permissions
boundary. An explicit deny in any of these policies overrides the allow. For more information
about permissions boundaries, see Permissions boundaries for IAM entities in the IAM User Guide.
Service control policies (SCPs) – SCPs are JSON policies that specify the maximum permissions
for an organization or organizational unit (OU) in AWS Organizations. AWS Organizations is a
service for grouping and centrally managing multiple AWS accounts that your business owns. If
you enable all features in an organization, then you can apply service control policies (SCPs) to
any or all of your accounts. The SCP limits permissions for entities in member accounts, including
each AWS account root user. For more information about Organizations and SCPs, see Service
control policies in the AWS Organizations User Guide.
Session policies – Session policies are advanced policies that you pass as a parameter when you
programmatically create a temporary session for a role or federated user. The resulting session's
permissions are the intersection of the user or role's identity-based policies and the session
policies. Permissions can also come from a resource-based policy. An explicit deny in any of these
policies overrides the allow. For more information, see Session policies in the IAM User Guide.
Multiple policy types
When multiple types of policies apply to a request, the resulting permissions are more complicated
to understand. To learn how AWS determines whether to allow a request when multiple policy
types are involved, see Policy evaluation logic in the IAM User Guide.
How Amazon Keyspaces works with IAM
Before you use IAM to manage access to Amazon Keyspaces, you should understand what IAM
features are available to use with Amazon Keyspaces. To get a high-level view of how Amazon
Keyspaces and other AWS services work with IAM, see AWS services that work with IAM in the IAM
User Guide.
Topics
Amazon Keyspaces identity-based policies
Amazon Keyspaces resource-based policies
Authorization based on Amazon Keyspaces tags
Amazon Keyspaces IAM roles
Amazon Keyspaces identity-based policies
With IAM identity-based policies, you can specify allowed or denied actions and resources as well
as the conditions under which actions are allowed or denied. Amazon Keyspaces supports specific
actions, resources, and condition keys. To learn about all of the elements that you use in a
JSON policy, see IAM JSON policy elements reference in the IAM User Guide.
To see the Amazon Keyspaces service-specific resources and actions, and condition context keys
that can be used for IAM permissions policies, see the Actions, resources, and condition keys for
Amazon Keyspaces (for Apache Cassandra) in the Service Authorization Reference.
Actions
Administrators can use AWS JSON policies to specify who has access to what. That is, which
principal can perform actions on what resources, and under what conditions.
The Action element of a JSON policy describes the actions that you can use to allow or deny
access in a policy. Policy actions usually have the same name as the associated AWS API operation.
There are some exceptions, such as permission-only actions that don't have a matching API
operation. There are also some operations that require multiple actions in a policy. These
additional actions are called dependent actions.
Include actions in a policy to grant permissions to perform the associated operation.
Policy actions in Amazon Keyspaces use the following prefix before the action: cassandra:. For
example, to grant someone permission to create an Amazon Keyspaces keyspace with the Amazon
Keyspaces CREATE CQL statement, you include the cassandra:Create action in their policy.
Policy statements must include either an Action or NotAction element. Amazon Keyspaces
defines its own set of actions that describe tasks that you can perform with this service.
To specify multiple actions in a single statement, separate them with commas as follows:
"Action": [
"cassandra:CREATE",
"cassandra:MODIFY"
]
To see a list of Amazon Keyspaces actions, see Actions Defined by Amazon Keyspaces (for Apache
Cassandra) in the Service Authorization Reference.
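For context, here is a minimal sketch of a complete policy built around the snippet above. The keyspace name and ARN are hypothetical examples, not values from this guide.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cassandra:CREATE",
                "cassandra:MODIFY"
            ],
            "Resource": "arn:aws:cassandra:us-east-1:123456789012:/keyspace/mykeyspace/*"
        }
    ]
}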
Resources
Administrators can use AWS JSON policies to specify who has access to what. That is, which
principal can perform actions on what resources, and under what conditions.
The Resource JSON policy element specifies the object or objects to which the action applies.
Statements must include either a Resource or a NotResource element. As a best practice,
specify a resource using its Amazon Resource Name (ARN). You can do this for actions that support
a specific resource type, known as resource-level permissions.
For actions that don't support resource-level permissions, such as listing operations, use a wildcard
(*) to indicate that the statement applies to all resources.
"Resource": "*"
In Amazon Keyspaces, keyspaces and tables can be used in the Resource element of IAM
permissions policies.
The Amazon Keyspaces keyspace resource has the following ARN:
arn:${Partition}:cassandra:${Region}:${Account}:/keyspace/${KeyspaceName}/
The Amazon Keyspaces table resource has the following ARN:
arn:${Partition}:cassandra:${Region}:${Account}:/keyspace/${KeyspaceName}/table/${tableName}
For more information about the format of ARNs, see Amazon Resource Names (ARNs) and AWS
service namespaces.
For example, to specify the mykeyspace keyspace in your statement, use the following ARN:
"Resource": "arn:aws:cassandra:us-east-1:123456789012:/keyspace/mykeyspace/"
To specify all keyspaces that belong to a specific account, use the wildcard (*):
"Resource": "arn:aws:cassandra:us-east-1:123456789012:/keyspace/*"
Some Amazon Keyspaces actions, such as those for creating resources, cannot be performed on a
specific resource. In those cases, you must use the wildcard (*).
"Resource": "*"
To connect to Amazon Keyspaces programmatically with a standard driver, a principal must
have SELECT access to the system tables, because most drivers read the system keyspaces/
tables on connection. For example, to grant SELECT permissions to an IAM user for mytable
in mykeyspace, the principal must have permissions to read both mytable and the system
keyspace. To specify multiple resources in a single statement, separate the ARNs with commas.
"Resource": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/table/
mytable",
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
To see a list of Amazon Keyspaces resource types and their ARNs, see Resources Defined by Amazon
Keyspaces (for Apache Cassandra) in the Service Authorization Reference. To learn with which
actions you can specify the ARN of each resource, see Actions Defined by Amazon Keyspaces (for
Apache Cassandra).
Condition keys
Administrators can use AWS JSON policies to specify who has access to what. That is, which
principal can perform actions on what resources, and under what conditions.
The Condition element (or Condition block) lets you specify conditions in which a statement
is in effect. The Condition element is optional. You can create conditional expressions that use
condition operators, such as equals or less than, to match the condition in the policy with values in
the request.
If you specify multiple Condition elements in a statement, or multiple keys in a single
Condition element, AWS evaluates them using a logical AND operation. If you specify multiple
values for a single condition key, AWS evaluates the condition using a logical OR operation. All of
the conditions must be met before the statement's permissions are granted.
You can also use placeholder variables when you specify conditions. For example, you can grant
an IAM user permission to access a resource only if it is tagged with their IAM user name. For more
information, see IAM policy elements: variables and tags in the IAM User Guide.
AWS supports global condition keys and service-specific condition keys. To see all AWS global
condition keys, see AWS global condition context keys in the IAM User Guide.
Amazon Keyspaces defines its own set of condition keys and also supports using some global
condition keys.
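For example, the following statement (with hypothetical Region values) requires encrypted transport and restricts requests to two Regions. The two condition operators are evaluated with AND, while the two values for aws:RequestedRegion are evaluated with OR.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "cassandra:Select",
            "Resource": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/*",
            "Condition": {
                "Bool": { "aws:SecureTransport": "true" },
                "StringEquals": {
                    "aws:RequestedRegion": [ "us-east-1", "us-west-2" ]
                }
            }
        }
    ]
}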
All Amazon Keyspaces actions support the aws:RequestTag/${TagKey}, the
aws:ResourceTag/${TagKey}, and the aws:TagKeys condition keys. For more information, see
the section called “Amazon Keyspaces resource access based on tags”.
To see a list of Amazon Keyspaces condition keys, see Condition Keys for Amazon Keyspaces (for
Apache Cassandra) in the Service Authorization Reference. To learn with which actions and resources
you can use a condition key, see Actions Defined by Amazon Keyspaces (for Apache Cassandra).
Examples
To view examples of Amazon Keyspaces identity-based policies, see Amazon Keyspaces identity-
based policy examples.
Amazon Keyspaces resource-based policies
Amazon Keyspaces does not support resource-based policies. To view an example of a detailed
resource-based policy page, see https://docs.aws.amazon.com/lambda/latest/dg/access-control-
resource-based.html.
Authorization based on Amazon Keyspaces tags
You can manage access to your Amazon Keyspaces resources by using tags. To manage resource
access based on tags, you provide tag information in the condition element of a policy using
the cassandra:ResourceTag/key-name, aws:RequestTag/key-name, or aws:TagKeys
condition keys. For more information about tagging Amazon Keyspaces resources, see the section
called “Working with tags”.
To view example identity-based policies for limiting access to a resource based on the tags on that
resource, see Amazon Keyspaces resource access based on tags.
Amazon Keyspaces IAM roles
An IAM role is an entity within your AWS account that has specific permissions.
Using temporary credentials with Amazon Keyspaces
You can use temporary credentials to sign in with federation, to assume an IAM role, or to assume
a cross-account role. You obtain temporary security credentials by calling AWS STS API operations
such as AssumeRole or GetFederationToken.
Amazon Keyspaces supports using temporary credentials with the AWS Signature Version 4 (SigV4)
authentication plugin, available on GitHub for the following languages:
Java: https://github.com/aws/aws-sigv4-auth-cassandra-java-driver-plugin.
Node.js: https://github.com/aws/aws-sigv4-auth-cassandra-nodejs-driver-plugin.
Python: https://github.com/aws/aws-sigv4-auth-cassandra-python-driver-plugin.
Go: https://github.com/aws/aws-sigv4-auth-cassandra-gocql-driver-plugin.
For examples and tutorials that implement the authentication plugin to access Amazon Keyspaces
programmatically, see the section called “Using a Cassandra client driver”.
Service-linked roles
Service-linked roles allow AWS services to access resources in other services to complete an action
on your behalf. Service-linked roles appear in your AWS account and are owned by the service. An
IAM administrator can view but not edit the permissions for service-linked roles.
For details about creating or managing Amazon Keyspaces service-linked roles, see the section
called “Using service-linked roles”.
Service roles
Amazon Keyspaces does not support service roles.
Amazon Keyspaces identity-based policy examples
By default, IAM users and roles don't have permission to create or modify Amazon Keyspaces
resources. They also can't perform tasks using the console, CQLSH, AWS CLI, or AWS API. An IAM
administrator must create IAM policies that grant users and roles permission to perform specific
API operations on the specified resources they need. The administrator must then attach those
policies to the IAM users or groups that require those permissions.
To learn how to create an IAM identity-based policy using these example JSON policy documents,
see Creating policies on the JSON tab in the IAM User Guide.
Topics
Policy best practices
Using the Amazon Keyspaces console
Allow users to view their own permissions
Accessing Amazon Keyspaces tables
Amazon Keyspaces resource access based on tags
Policy best practices
Identity-based policies determine whether someone can create, access, or delete Amazon
Keyspaces resources in your account. These actions can incur costs for your AWS account. When you
create or edit identity-based policies, follow these guidelines and recommendations:
Get started with AWS managed policies and move toward least-privilege permissions – To
get started granting permissions to your users and workloads, use the AWS managed policies
that grant permissions for many common use cases. They are available in your AWS account. We
recommend that you reduce permissions further by defining AWS customer managed policies
that are specific to your use cases. For more information, see AWS managed policies or AWS
managed policies for job functions in the IAM User Guide.
Apply least-privilege permissions – When you set permissions with IAM policies, grant only the
permissions required to perform a task. You do this by defining the actions that can be taken on
specific resources under specific conditions, also known as least-privilege permissions. For more
information about using IAM to apply permissions, see Policies and permissions in IAM in the
IAM User Guide.
Use conditions in IAM policies to further restrict access – You can add a condition to your
policies to limit access to actions and resources. For example, you can write a policy condition to
specify that all requests must be sent using SSL. You can also use conditions to grant access to
service actions if they are used through a specific AWS service, such as AWS CloudFormation. For
more information, see IAM JSON policy elements: Condition in the IAM User Guide.
Use IAM Access Analyzer to validate your IAM policies to ensure secure and functional
permissions – IAM Access Analyzer validates new and existing policies so that the policies
adhere to the IAM policy language (JSON) and IAM best practices. IAM Access Analyzer provides
more than 100 policy checks and actionable recommendations to help you author secure and
functional policies. For more information, see IAM Access Analyzer policy validation in the IAM
User Guide.
Require multi-factor authentication (MFA) – If you have a scenario that requires IAM users
or a root user in your AWS account, turn on MFA for additional security. To require MFA when
API operations are called, add MFA conditions to your policies. For more information, see
Configuring MFA-protected API access in the IAM User Guide.
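As an illustrative sketch of MFA-protected access (not a policy from this guide), the following statement denies all Amazon Keyspaces actions for requests made without MFA; BoolIfExists also matches requests where the aws:MultiFactorAuthPresent key is missing entirely.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyKeyspacesWithoutMFA",
            "Effect": "Deny",
            "Action": "cassandra:*",
            "Resource": "*",
            "Condition": {
                "BoolIfExists": { "aws:MultiFactorAuthPresent": "false" }
            }
        }
    ]
}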
For more information about best practices in IAM, see Security best practices in IAM in the IAM User
Guide.
Using the Amazon Keyspaces console
To access the Amazon Keyspaces console, you need at least read-only permissions to list and
view details about the Amazon Keyspaces resources in your AWS account. If you create an
identity-based policy that is more restrictive than the minimum required permissions, the
console won't function as intended for entities (IAM users or roles) with that policy.
Two AWS managed policies are available for Amazon Keyspaces console access.
AmazonKeyspacesReadOnlyAccess_v2 – This policy grants read-only access to Amazon
Keyspaces.
AmazonKeyspacesFullAccess – This policy grants permissions to use Amazon Keyspaces with full
access to all features.
For more information about Amazon Keyspaces managed policies, see the section called “AWS
managed policies”.
Allow users to view their own permissions
This example shows how you might create a policy that allows IAM users to view the inline and
managed policies that are attached to their user identity. This policy includes permissions to
complete this action on the console or programmatically using the AWS CLI or AWS API.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "ViewOwnUserInfo",
"Effect": "Allow",
"Action": [
"iam:GetUserPolicy",
"iam:ListGroupsForUser",
"iam:ListAttachedUserPolicies",
"iam:ListUserPolicies",
"iam:GetUser"
],
"Resource": ["arn:aws:iam::*:user/${aws:username}"]
},
{
"Sid": "NavigateInConsole",
"Effect": "Allow",
"Action": [
"iam:GetGroupPolicy",
"iam:GetPolicyVersion",
"iam:GetPolicy",
"iam:ListAttachedGroupPolicies",
"iam:ListGroupPolicies",
"iam:ListPolicyVersions",
"iam:ListPolicies",
"iam:ListUsers"
],
"Resource": "*"
}
]
}
Accessing Amazon Keyspaces tables
The following is a sample policy that grants read-only (SELECT) access to the Amazon Keyspaces
system tables. For all samples, replace the Region and account ID in the Amazon Resource Name
(ARN) with your own.
Note
To connect with a standard driver, a user must have at least SELECT access to the system
tables, because most drivers read the system keyspaces/tables on connection.
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"cassandra:Select"
],
"Resource":[
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
]
}
]
}
The following sample policy adds read-only access to the user table mytable in the keyspace
mykeyspace.
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"cassandra:Select"
],
"Resource":[
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/table/
mytable",
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
]
}
]
}
The following sample policy assigns read/write access to a user table and read access to the system
tables.
Note
System tables are always read-only.
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"cassandra:Select",
"cassandra:Modify"
],
"Resource":[
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/table/
mytable",
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
]
}
]
}
The following sample policy allows a user to create tables in keyspace mykeyspace.
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"cassandra:Create",
"cassandra:Select"
],
"Resource":[
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/*",
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
]
}
]
}
Amazon Keyspaces resource access based on tags
You can use conditions in your identity-based policy to control access to Amazon Keyspaces
resources based on tags. These policies control visibility of the keyspaces and tables in the account.
Note that tag-based permissions for system tables behave differently when requests are made
using the AWS SDK compared to Cassandra Query Language (CQL) API calls via Cassandra drivers
and developer tools.
To make List and Get resource requests with the AWS SDK when using tag-based access, the
caller needs to have read access to system tables. For example, Select action permissions are
required to read data from system tables via the GetTable operation. If the caller has only tag-
based access to a specific table, an operation that requires additional access to a system table
will fail.
For compatibility with established Cassandra driver behavior, tag-based authorization policies
are not enforced when performing operations on system tables using Cassandra Query Language
(CQL) API calls via Cassandra drivers and developer tools.
The following example shows how you can create a policy that grants permissions to a user to view
a table if the table's Owner contains the value of that user's user name. In this example you also
give read access to the system tables.
{
"Version":"2012-10-17",
"Statement":[
{
"Sid":"ReadOnlyAccessTaggedTables",
"Effect":"Allow",
"Action":"cassandra:Select",
"Resource":[
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/table/*",
"arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
],
"Condition":{
"StringEquals":{
"aws:ResourceTag/Owner":"${aws:username}"
}
}
}
]
}
You can attach this policy to the IAM users in your account. If a user named richard-roe
attempts to view an Amazon Keyspaces table, the table must be tagged Owner=richard-roe or
owner=richard-roe. Otherwise, he is denied access. The condition tag key Owner matches both
Owner and owner because condition key names are not case-sensitive. For more information, see
IAM JSON policy elements: Condition in the IAM User Guide.
The following policy grants permissions to a user to create tables with tags if the table's Owner
contains the value of that user's user name.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "CreateTagTableUser",
"Effect": "Allow",
"Action": [
"cassandra:Create",
"cassandra:TagResource"
],
"Resource": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/
table/*",
"Condition":{
"StringEquals":{
"aws:RequestTag/Owner":"${aws:username}"
}
}
}
]
}
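The previous examples use the aws:ResourceTag and aws:RequestTag condition keys. As a hedged sketch of the third supported key, aws:TagKeys, the following statement only allows tagging with two hypothetical tag keys, Owner and CostCenter.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "LimitAllowedTagKeys",
            "Effect": "Allow",
            "Action": "cassandra:TagResource",
            "Resource": "arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/*",
            "Condition": {
                "ForAllValues:StringEquals": {
                    "aws:TagKeys": [ "Owner", "CostCenter" ]
                }
            }
        }
    ]
}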
AWS managed policies for Amazon Keyspaces
An AWS managed policy is a standalone policy that is created and administered by AWS. AWS
managed policies are designed to provide permissions for many common use cases so that you can
start assigning permissions to users, groups, and roles.
Keep in mind that AWS managed policies might not grant least-privilege permissions for your
specific use cases because they're available for all AWS customers to use. We recommend that you
reduce permissions further by defining customer managed policies that are specific to your use
cases.
You cannot change the permissions defined in AWS managed policies. If AWS updates the
permissions defined in an AWS managed policy, the update affects all principal identities (users,
groups, and roles) that the policy is attached to. AWS is most likely to update an AWS managed
policy when a new AWS service is launched or new API operations become available for existing
services.
For more information, see AWS managed policies in the IAM User Guide.
AWS managed policy: AmazonKeyspacesReadOnlyAccess_v2
You can attach the AmazonKeyspacesReadOnlyAccess_v2 policy to your IAM identities.
This policy grants read-only access to Amazon Keyspaces and includes the required permissions
when connecting through private VPC endpoints.
Permissions details
This policy includes the following permissions.
Amazon Keyspaces – Provides read-only access to Amazon Keyspaces.
Application Auto Scaling – Allows principals to view configurations from Application Auto
Scaling. This is required so that users can view automatic scaling policies that are attached to a
table.
CloudWatch – Allows principals to view metric data and alarms configured in CloudWatch.
This is required so users can view the billable table size and CloudWatch alarms that have been
configured for a table.
AWS KMS – Allows principals to view keys configured in AWS KMS. This is required so users
can view AWS KMS keys that they create and manage in their account to confirm that the key
assigned to Amazon Keyspaces is a symmetric encryption key that is enabled.
Amazon EC2 – Allows principals connecting to Amazon Keyspaces through VPC endpoints to
query your VPC for endpoint and network interface information. This read-only access to the
Amazon EC2 API is required so Amazon Keyspaces can look up and store available interface VPC
endpoints in the system.peers table used for connection load balancing.
To review the policy in JSON format, see AmazonKeyspacesReadOnlyAccess_v2.
AWS managed policy: AmazonKeyspacesReadOnlyAccess
You can attach the AmazonKeyspacesReadOnlyAccess policy to your IAM identities.
This policy grants read-only access to Amazon Keyspaces.
Permissions details
This policy includes the following permissions.
Amazon Keyspaces – Provides read-only access to Amazon Keyspaces.
Application Auto Scaling – Allows principals to view configurations from Application Auto
Scaling. This is required so that users can view automatic scaling policies that are attached to a
table.
CloudWatch – Allows principals to view metric data and alarms configured in CloudWatch.
This is required so users can view the billable table size and CloudWatch alarms that have been
configured for a table.
AWS KMS – Allows principals to view keys configured in AWS KMS. This is required so users
can view AWS KMS keys that they create and manage in their account to confirm that the key
assigned to Amazon Keyspaces is a symmetric encryption key that is enabled.
To review the policy in JSON format, see AmazonKeyspacesReadOnlyAccess.
AWS managed policy: AmazonKeyspacesFullAccess
You can attach the AmazonKeyspacesFullAccess policy to your IAM identities.
This policy grants administrative permissions that allow your administrators unrestricted access to
Amazon Keyspaces.
Permissions details
This policy includes the following permissions.
Amazon Keyspaces – Allows principals to access any Amazon Keyspaces resource and perform
all actions.
Application Auto Scaling – Allows principals to create, view, and delete automatic scaling
policies for Amazon Keyspaces tables. This is required so that administrators can manage
automatic scaling policies for Amazon Keyspaces tables.
CloudWatch – Allows principals to see the billable table size as well as create, view, and delete
CloudWatch alarms for Amazon Keyspaces automatic scaling policies. This is required so that
administrators can view the billable table size and create a CloudWatch dashboard.
IAM – Allows Amazon Keyspaces to create service-linked roles with IAM automatically when the
following features are turned on:
Application Auto Scaling – When an administrator enables Application Auto Scaling for
a table, Amazon Keyspaces creates a service-linked role to perform automatic scaling actions
on your behalf.
Amazon Keyspaces Multi-Region Replication – When an administrator creates
a multi-Region keyspace, a service-linked role is automatically created to perform data
replication to the selected AWS Regions on your behalf.
For more information about service-linked roles, see the section called “Using service-linked
roles”.
AWS KMS – Allows principals to view keys configured in AWS KMS. This is required so that users
can view AWS KMS keys that they create and manage in their account to confirm that the key
assigned to Amazon Keyspaces is a symmetric encryption key that is enabled.
Amazon EC2 – Allows principals connecting to Amazon Keyspaces through VPC endpoints to
query your VPC for endpoint and network interface information. This read-only access to the
Amazon EC2 API is required so Amazon Keyspaces can look up and store available interface VPC
endpoints in the system.peers table used for connection load balancing.
To review the policy in JSON format, see AmazonKeyspacesFullAccess.
Amazon Keyspaces updates to AWS managed policies
View details about updates to AWS managed policies for Amazon Keyspaces since this service
began tracking these changes. For automatic alerts about changes to this page, subscribe to the
RSS feed on the Document history page.
Change: AmazonKeyspacesFullAccess – Update to an existing policy
Date: October 3, 2023
Description: Amazon Keyspaces added new read-only permissions for clients connecting to
Amazon Keyspaces through interface VPC endpoints to access the Amazon EC2 instance to look up
network information. Amazon Keyspaces stores available interface VPC endpoints in the
system.peers table for connection load balancing. For more information, see the section
called “Using interface VPC endpoints”.

Change: AmazonKeyspacesReadOnlyAccess_v2 – New policy
Date: September 12, 2023
Description: Amazon Keyspaces created a new policy to add read-only permissions for clients
connecting to Amazon Keyspaces through interface VPC endpoints to access the Amazon EC2
instance to look up network information. Amazon Keyspaces stores available interface VPC
endpoints in the system.peers table for connection load balancing. For more information, see
the section called “Using interface VPC endpoints”.

Change: AmazonKeyspacesFullAccess – Update to an existing policy
Date: June 5, 2023
Description: Amazon Keyspaces added new permissions to allow Amazon Keyspaces to create a
service-linked role when an administrator creates a multi-Region keyspace. Amazon Keyspaces
uses the service-linked role to perform data replication tasks on your behalf. For more
information, see the section called “Multi-Region Replication”.

Change: AmazonKeyspacesReadOnlyAccess – Update to an existing policy
Date: July 7, 2022
Description: Amazon Keyspaces added new permissions to allow users to view the billable size
of a table using CloudWatch. Amazon Keyspaces integrates with Amazon CloudWatch to allow you
to monitor the billable table size. For more information, see the section called “Amazon
Keyspaces metrics and dimensions”.

Change: AmazonKeyspacesFullAccess – Update to an existing policy
Date: July 7, 2022
Description: Amazon Keyspaces added new permissions to allow users to view the billable size
of a table using CloudWatch. Amazon Keyspaces integrates with Amazon CloudWatch to allow you
to monitor the billable table size. For more information, see the section called “Amazon
Keyspaces metrics and dimensions”.

Change: AmazonKeyspacesReadOnlyAccess – Update to an existing policy
Date: June 1, 2021
Description: Amazon Keyspaces added new permissions to allow users to view AWS KMS keys that
have been configured for Amazon Keyspaces encryption at rest. Amazon Keyspaces encryption at
rest integrates with AWS KMS for protecting and managing the encryption keys used to encrypt
data at rest. To view the AWS KMS key configured for Amazon Keyspaces, read-only permissions
have been added.

Change: AmazonKeyspacesFullAccess – Update to an existing policy
Date: June 1, 2021
Description: Amazon Keyspaces added new permissions to allow users to view AWS KMS keys that
have been configured for Amazon Keyspaces encryption at rest. Amazon Keyspaces encryption at
rest integrates with AWS KMS for protecting and managing the encryption keys used to encrypt
data at rest. To view the AWS KMS key configured for Amazon Keyspaces, read-only permissions
have been added.

Change: Amazon Keyspaces started tracking changes
Date: June 1, 2021
Description: Amazon Keyspaces started tracking changes for its AWS managed policies.
Troubleshooting Amazon Keyspaces identity and access
Use the following information to help you diagnose and fix common issues that you might
encounter when working with Amazon Keyspaces and IAM.
Topics
I'm not authorized to perform an action in Amazon Keyspaces
I modified an IAM user or role and the changes did not take effect immediately
I can't restore a table using Amazon Keyspaces point-in-time recovery (PITR)
I'm not authorized to perform iam:PassRole
I'm an administrator and want to allow others to access Amazon Keyspaces
I want to allow people outside of my AWS account to access my Amazon Keyspaces resources
I'm not authorized to perform an action in Amazon Keyspaces
If the AWS Management Console tells you that you're not authorized to perform an action, then
you must contact your administrator for assistance. Your administrator is the person that provided
you with your user name and password.
The following example error occurs when the mateojackson IAM user tries to use the console to
view details about a table but does not have cassandra:Select permissions for the table.
User: arn:aws:iam::123456789012:user/mateojackson is not authorized to perform:
cassandra:Select on resource: mytable
In this case, Mateo asks his administrator to update his policies to allow him to access the mytable
resource using the cassandra:Select action.
I modified an IAM user or role and the changes did not take effect immediately
IAM policy changes may take up to 10 minutes to take effect for applications with existing,
established connections to Amazon Keyspaces. IAM policy changes take effect immediately when
applications establish a new connection. If you have made modifications to an existing IAM user or
role, and it has not taken immediate effect, either wait for 10 minutes or disconnect and reconnect
to Amazon Keyspaces.
I can't restore a table using Amazon Keyspaces point-in-time recovery (PITR)
If you are trying to restore an Amazon Keyspaces table with point-in-time recovery (PITR), and
you see the restore process begin, but not complete successfully, you might not have configured
all of the required permissions that are needed by the restore process. You must contact your
administrator for assistance and ask that person to update your policies to allow you to restore a
table in Amazon Keyspaces.
In addition to user permissions, Amazon Keyspaces may require permissions to perform actions
during the restore process on your principal's behalf. This is the case if the table is encrypted with a
customer-managed key, or if you are using IAM policies that restrict incoming traffic. For example,
if you are using condition keys in your IAM policy to restrict source traffic to specific endpoints or
IP ranges, the restore operation fails. To allow Amazon Keyspaces to perform the table restore
operation on your principal's behalf, you must add an aws:ViaAWSService global condition key
in the IAM policy.
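The following sketch illustrates this pattern with a hypothetical IP range: the Deny statement blocks traffic from outside the range, but aws:ViaAWSService exempts requests that an AWS service, such as the restore process, makes on your behalf.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyOutsideAllowedNetwork",
            "Effect": "Deny",
            "Action": "cassandra:*",
            "Resource": "*",
            "Condition": {
                "NotIpAddress": { "aws:SourceIp": "203.0.113.0/24" },
                "Bool": { "aws:ViaAWSService": "false" }
            }
        }
    ]
}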
For more information about permissions to restore tables, see the section called “Configure IAM
permissions for restore”.
I'm not authorized to perform iam:PassRole
If you receive an error that you're not authorized to perform the iam:PassRole action, your
policies must be updated to allow you to pass a role to Amazon Keyspaces.
Some AWS services allow you to pass an existing role to that service instead of creating a new
service role or service-linked role. To do this, you must have permissions to pass the role to the
service.
The following example error occurs when an IAM user named marymajor tries to use the console
to perform an action in Amazon Keyspaces. However, the action requires the service to have
permissions that are granted by a service role. Mary does not have permissions to pass the role to
the service.
User: arn:aws:iam::123456789012:user/marymajor is not authorized to perform:
iam:PassRole
In this case, Mary's policies must be updated to allow her to perform the iam:PassRole action.
If you need help, contact your AWS administrator. Your administrator is the person who provided
you with your sign-in credentials.
I'm an administrator and want to allow others to access Amazon Keyspaces
To allow others to access Amazon Keyspaces, you must grant permission to the people or
applications that need access. If you are using AWS IAM Identity Center to manage people
and applications, you assign permission sets to users or groups to define their level of access.
Permission sets automatically create and assign IAM policies to IAM roles that are associated with
the person or application. For more information, see Permission sets in the AWS IAM Identity Center
User Guide.
If you are not using IAM Identity Center, you must create IAM entities (users or roles) for the people
or applications that need access. You must then attach a policy to the entity that grants them
the correct permissions in Amazon Keyspaces. After the permissions are granted, provide the
credentials to the user or application developer. They will use those credentials to access AWS.
To learn more about creating IAM users, groups, policies, and permissions, see IAM Identities and
Policies and permissions in IAM in the IAM User Guide.
I want to allow people outside of my AWS account to access my Amazon
Keyspaces resources
You can create a role that users in other accounts or people outside of your organization can use to
access your resources. You can specify who is trusted to assume the role. For services that support
resource-based policies or access control lists (ACLs), you can use those policies to grant people
access to your resources.
To learn more, consult the following:
To learn whether Amazon Keyspaces supports these features, see How Amazon Keyspaces works
with IAM.
To learn how to provide access to your resources across AWS accounts that you own, see
Providing access to an IAM user in another AWS account that you own in the IAM User Guide.
To learn how to provide access to your resources to third-party AWS accounts, see Providing
access to AWS accounts owned by third parties in the IAM User Guide.
To learn how to provide access through identity federation, see Providing access to externally
authenticated users (identity federation) in the IAM User Guide.
To learn the difference between using roles and resource-based policies for cross-account access,
see Cross account resource access in IAM in the IAM User Guide.
Using service-linked roles for Amazon Keyspaces
Amazon Keyspaces (for Apache Cassandra) uses AWS Identity and Access Management (IAM)
service-linked roles. A service-linked role is a unique type of IAM role that is linked directly to
Amazon Keyspaces. Service-linked roles are predefined by Amazon Keyspaces and include all the
permissions that the service requires to call other AWS services on your behalf.
Topics
Using roles for Amazon Keyspaces application auto scaling
Using roles for Amazon Keyspaces Multi-Region Replication
Using roles for Amazon Keyspaces application auto scaling
Amazon Keyspaces (for Apache Cassandra) uses AWS Identity and Access Management (IAM)
service-linked roles. A service-linked role is a unique type of IAM role that is linked directly to
Amazon Keyspaces. Service-linked roles are predefined by Amazon Keyspaces and include all the
permissions that the service requires to call other AWS services on your behalf.
A service-linked role makes setting up Amazon Keyspaces easier because you don’t have to
manually add the necessary permissions. Amazon Keyspaces defines the permissions of its service-
linked roles, and unless defined otherwise, only Amazon Keyspaces can assume its roles. The
defined permissions include the trust policy and the permissions policy, and that permissions policy
cannot be attached to any other IAM entity.
You can delete a service-linked role only after first deleting its related resources. This protects your
Amazon Keyspaces resources because you can't inadvertently remove permission to access the
resources.
For information about other services that support service-linked roles, see AWS services that work
with IAM and look for the services that have Yes in the Service-linked roles column. Choose a Yes
with a link to view the service-linked role documentation for that service.
Service-linked role permissions for Amazon Keyspaces
Amazon Keyspaces uses the service-linked role named
AWSServiceRoleForApplicationAutoScaling_CassandraTable to allow Application Auto Scaling to
call Amazon Keyspaces and Amazon CloudWatch on your behalf.
The AWSServiceRoleForApplicationAutoScaling_CassandraTable service-linked role trusts the
following services to assume the role:
cassandra.application-autoscaling.amazonaws.com
The role permissions policy allows Application Auto Scaling to complete the following actions on
the specified Amazon Keyspaces resources:
Action: cassandra:Select on the resource arn:*:cassandra:*:*:/keyspace/system/table/*
Action: cassandra:Select on the resource arn:*:cassandra:*:*:/keyspace/system_schema/table/*
Action: cassandra:Select on the resource arn:*:cassandra:*:*:/keyspace/system_schema_mcs/table/*
Action: cassandra:Alter on the resource arn:*:cassandra:*:*:"*"
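Assembled from the actions listed above, the role's permissions policy corresponds to a JSON document along the following lines. This is a sketch for illustration, not the verbatim managed policy, and it renders the quoted wildcard resource for cassandra:Alter as "*".
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "cassandra:Select",
            "Resource": [
                "arn:*:cassandra:*:*:/keyspace/system/table/*",
                "arn:*:cassandra:*:*:/keyspace/system_schema/table/*",
                "arn:*:cassandra:*:*:/keyspace/system_schema_mcs/table/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "cassandra:Alter",
            "Resource": "*"
        }
    ]
}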
Creating a service-linked role for Amazon Keyspaces
You don't need to manually create a service-linked role for Amazon Keyspaces automatic scaling.
When you enable Amazon Keyspaces auto scaling on a table with the AWS Management Console,
CQL, the AWS CLI, or the AWS API, Application Auto Scaling creates the service-linked role for you.
If you delete this service-linked role, and then need to create it again, you can use the same process
to recreate the role in your account. When you enable Amazon Keyspaces auto scaling for a table,
Application Auto Scaling creates the service-linked role for you again.
Important
This service-linked role can appear in your account if you completed an action in another
service that uses the features supported by this role. To learn more, see A new role
appeared in my AWS account.
Editing a service-linked role for Amazon Keyspaces
Amazon Keyspaces does not allow you to edit the
AWSServiceRoleForApplicationAutoScaling_CassandraTable service-linked role. After you
create a service-linked role, you cannot change the name of the role because various entities
might reference the role. However, you can edit the description of the role using IAM. For more
information, see Editing a service-linked role in the IAM User Guide.
Deleting a service-linked role for Amazon Keyspaces
If you no longer need to use a feature or service that requires a service-linked role, we recommend
that you delete that role. That way you don’t have an unused entity that isn't actively monitored or
maintained. However, you must first disable automatic scaling on all tables in the account across all
AWS Regions before you can delete the service-linked role manually. To disable automatic scaling
on Amazon Keyspaces tables, see the section called “Turn off Amazon Keyspaces auto scaling for a
table”.
Note
If Amazon Keyspaces automatic scaling is using the role when you try to modify the
resources, then the deregistration might fail. If that happens, wait for a few minutes and
try the operation again.
To manually delete the service-linked role using IAM
Use the IAM console, the AWS CLI, or the AWS API to delete the
AWSServiceRoleForApplicationAutoScaling_CassandraTable service-linked role. For more
information, see Deleting a Service-Linked Role in the IAM User Guide.
Note
To delete the service-linked role used by Amazon Keyspaces automatic scaling, you must
first disable automatic scaling on all tables in the account.
Supported Regions for Amazon Keyspaces service-linked roles
Amazon Keyspaces supports using service-linked roles in all of the Regions where the service is
available. For more information, see Service endpoints for Amazon Keyspaces.
Using roles for Amazon Keyspaces Multi-Region Replication
Amazon Keyspaces (for Apache Cassandra) uses AWS Identity and Access Management (IAM)
service-linked roles. A service-linked role is a unique type of IAM role that is linked directly to
Amazon Keyspaces. Service-linked roles are predefined by Amazon Keyspaces and include all the
permissions that the service requires to call other AWS services on your behalf.
A service-linked role makes setting up Amazon Keyspaces easier because you don’t have to
manually add the necessary permissions. Amazon Keyspaces defines the permissions of its service-
linked roles, and unless defined otherwise, only Amazon Keyspaces can assume its roles. The
defined permissions include the trust policy and the permissions policy, and that permissions policy
cannot be attached to any other IAM entity.
You can delete a service-linked role only after first deleting its related resources. This protects your
Amazon Keyspaces resources because you can't inadvertently remove permission to access the
resources.
For information about other services that support service-linked roles, see AWS services that work
with IAM and look for the services that have Yes in the Service-linked roles column. Choose a Yes
with a link to view the service-linked role documentation for that service.
Service-linked role permissions for Amazon Keyspaces
Amazon Keyspaces uses the service-linked role named
AWSServiceRoleForAmazonKeyspacesReplication to allow Amazon Keyspaces to replicate writes
to all replicas of a multi-Region table on your behalf.
The AWSServiceRoleForAmazonKeyspacesReplication service-linked role trusts the following
services to assume the role:
replication.cassandra.amazonaws.com
The role permissions policy named KeyspacesReplicationServiceRolePolicy allows Amazon
Keyspaces to complete the following actions:
Action: cassandra:Select
Action: cassandra:SelectMultiRegionResource
Action: cassandra:Modify
Action: cassandra:ModifyMultiRegionResource
Although the Amazon Keyspaces service-linked role
AWSServiceRoleForAmazonKeyspacesReplication grants these actions for the wildcard Amazon
Resource Name (ARN) "arn:*" in the policy, Amazon Keyspaces supplies the ARN of your account.
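Taken together, the named actions correspond to a permissions policy along the following lines. This is a hedged sketch assembled from the list above, not the verbatim KeyspacesReplicationServiceRolePolicy; as noted, the policy uses a wildcard ARN and Amazon Keyspaces supplies the ARN of your account.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cassandra:Select",
                "cassandra:SelectMultiRegionResource",
                "cassandra:Modify",
                "cassandra:ModifyMultiRegionResource"
            ],
            "Resource": "arn:*"
        }
    ]
}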
You must configure permissions to allow your users, groups, or roles to create, edit, or delete a
service-linked role. For more information, see Service-linked role permissions in the IAM User Guide.
Creating a service-linked role for Amazon Keyspaces
You can't manually create a service-linked role. When you create a multi-Region keyspace in the
AWS Management Console, the AWS CLI, or the AWS API, Amazon Keyspaces creates the service-
linked role for you.
If you delete this service-linked role, and then need to create it again, you can use the same process
to recreate the role in your account. When you create a multi-Region keyspace, Amazon Keyspaces
creates the service-linked role for you again.
Editing a service-linked role for Amazon Keyspaces
Amazon Keyspaces does not allow you to edit the AWSServiceRoleForAmazonKeyspacesReplication
service-linked role. After you create a service-linked role, you cannot change the name of the role
because various entities might reference the role. However, you can edit the description of the role
using IAM. For more information, see Editing a service-linked role in the IAM User Guide.
Deleting a service-linked role for Amazon Keyspaces
If you no longer need to use a feature or service that requires a service-linked role, we recommend
that you delete that role. That way you don’t have an unused entity that is not actively monitored
or maintained. However, you must first delete all multi-Region keyspaces in the account across all
AWS Regions before you can delete the service-linked role manually.
Cleaning up a service-linked role
Before you can use IAM to delete a service-linked role, you must first delete any multi-Region
keyspaces and tables used by the role.
Note
If the Amazon Keyspaces service is using the role when you try to delete the resources, then
the deletion might fail. If that happens, wait for a few minutes and try the operation again.
To delete Amazon Keyspaces resources used by the
AWSServiceRoleForAmazonKeyspacesReplication (console)
1. Sign in to the AWS Management Console, and open the Amazon Keyspaces console at https://console.aws.amazon.com/keyspaces/home.
2. Choose Keyspaces from the left-side panel.
3. Select all multi-Region keyspaces from the list.
4. Choose Delete, confirm the deletion, and then choose Delete keyspaces.
You can also delete multi-Region keyspaces programmatically using any of the following methods.
The Cassandra Query Language (CQL) DROP KEYSPACE statement.
The delete-keyspace operation of the AWS CLI.
The DeleteKeyspace operation of the Amazon Keyspaces API.
Manually delete the service-linked role
Use the IAM console, the AWS CLI, or the AWS API to delete the
AWSServiceRoleForAmazonKeyspacesReplication service-linked role. For more information, see
Deleting a service-linked role in the IAM User Guide.
Supported Regions for Amazon Keyspaces service-linked roles
Amazon Keyspaces does not support using service-linked roles in every Region where the service
is available. You can use the AWSServiceRoleForAmazonKeyspacesReplication role in the following
Regions.
Region name                  Region identity   Support in Amazon Keyspaces
US East (N. Virginia)        us-east-1         Yes
US East (Ohio)               us-east-2         Yes
US West (N. California)      us-west-1         Yes
US West (Oregon)             us-west-2         Yes
Asia Pacific (Mumbai)        ap-south-1        Yes
Asia Pacific (Osaka)         ap-northeast-3    Yes
Asia Pacific (Seoul)         ap-northeast-2    Yes
Asia Pacific (Singapore)     ap-southeast-1    Yes
Asia Pacific (Sydney)        ap-southeast-2    Yes
Asia Pacific (Tokyo)         ap-northeast-1    Yes
Canada (Central)             ca-central-1      Yes
Europe (Frankfurt)           eu-central-1      Yes
Europe (Ireland)             eu-west-1         Yes
Europe (London)              eu-west-2         Yes
Europe (Paris)               eu-west-3         Yes
South America (São Paulo)    sa-east-1         Yes
AWS GovCloud (US-East)       us-gov-east-1     No
AWS GovCloud (US-West)       us-gov-west-1     No
Compliance validation for Amazon Keyspaces (for Apache
Cassandra)
Third-party auditors assess the security and compliance of Amazon Keyspaces (for Apache
Cassandra) as part of multiple AWS compliance programs. These include:
ISO/IEC 27001:2013, 27017:2015, 27018:2019, and ISO/IEC 9001:2015. For more information,
see AWS ISO and CSA STAR certifications and services.
System and Organization Controls (SOC)
Payment Card Industry (PCI)
Federal Risk and Authorization Management Program (FedRAMP) High
Health Insurance Portability and Accountability Act (HIPAA)
To learn whether an AWS service is within the scope of specific compliance programs, see AWS
services in Scope by Compliance Program and choose the compliance program that you are
interested in. For general information, see AWS Compliance Programs.
You can download third-party audit reports using AWS Artifact. For more information, see
Downloading Reports in AWS Artifact.
Your compliance responsibility when using AWS services is determined by the sensitivity of your
data, your company's compliance objectives, and applicable laws and regulations. AWS provides the
following resources to help with compliance:
Security and Compliance Quick Start Guides – These deployment guides discuss architectural
considerations and provide steps for deploying baseline environments on AWS that are security
and compliance focused.
Architecting for HIPAA Security and Compliance on Amazon Web Services – This whitepaper
describes how companies can use AWS to create HIPAA-eligible applications.
Note
Not all AWS services are HIPAA eligible. For more information, see the HIPAA Eligible
Services Reference.
AWS Compliance Resources – This collection of workbooks and guides might apply to your
industry and location.
AWS Customer Compliance Guides – Understand the shared responsibility model through the
lens of compliance. The guides summarize the best practices for securing AWS services and map
the guidance to security controls across multiple frameworks (including National Institute of
Standards and Technology (NIST), Payment Card Industry Security Standards Council (PCI), and
International Organization for Standardization (ISO)).
Evaluating Resources with Rules in the AWS Config Developer Guide – The AWS Config service
assesses how well your resource configurations comply with internal practices, industry
guidelines, and regulations.
AWS Security Hub – This AWS service provides a comprehensive view of your security state within
AWS. Security Hub uses security controls to evaluate your AWS resources and to check your
compliance against security industry standards and best practices. For a list of supported services
and controls, see Security Hub controls reference.
Amazon GuardDuty – This AWS service detects potential threats to your AWS accounts,
workloads, containers, and data by monitoring your environment for suspicious and malicious
activities. GuardDuty can help you address various compliance requirements, like PCI DSS, by
meeting intrusion detection requirements mandated by certain compliance frameworks.
AWS Audit Manager – This AWS service helps you continuously audit your AWS usage to simplify
how you manage risk and compliance with regulations and industry standards.
Resilience and disaster recovery in Amazon Keyspaces
The AWS global infrastructure is built around AWS Regions and Availability Zones. AWS Regions
provide multiple physically separated and isolated Availability Zones, which are connected with
low-latency, high-throughput, and highly redundant networking. With Availability Zones, you can
design and operate applications and databases that automatically fail over between Availability
Zones without interruption. Availability Zones are more highly available, fault tolerant, and
scalable than traditional single or multiple data center infrastructures.
Amazon Keyspaces replicates data automatically three times in multiple AWS Availability Zones
within the same AWS Region for durability and high availability.
For more information about AWS Regions and Availability Zones, see AWS global infrastructure.
In addition to the AWS global infrastructure, Amazon Keyspaces offers several features to help
support your data resiliency and backup needs.
Multi-Region Replication
Amazon Keyspaces provides Multi-Region Replication if you need to replicate your data or
applications over greater geographic distances. You can replicate your Amazon Keyspaces tables
across up to six different AWS Regions of your choice. For more information, see the section
called “Multi-Region Replication”.
Point-in-time recovery (PITR)
PITR helps protect your Amazon Keyspaces tables from accidental write or delete operations by
providing you continuous backups of your table data. For more information, see Point-in-time
recovery for Amazon Keyspaces.
Infrastructure security in Amazon Keyspaces
As a managed service, Amazon Keyspaces (for Apache Cassandra) is protected by AWS
global network security. For information about AWS security services and how AWS protects
infrastructure, see AWS Cloud Security. To design your AWS environment using the best practices
for infrastructure security, see Infrastructure Protection in Security Pillar AWS Well-Architected
Framework.
You use AWS published API calls to access Amazon Keyspaces through the network. Clients must
support the following:
Transport Layer Security (TLS). We require TLS 1.2 and recommend TLS 1.3.
Cipher suites with perfect forward secrecy (PFS) such as DHE (Ephemeral Diffie-Hellman) or
ECDHE (Elliptic Curve Ephemeral Diffie-Hellman). Most modern systems such as Java 7 and later
support these modes.
Additionally, requests must be signed by using an access key ID and a secret access key that is
associated with an IAM principal. Or you can use the AWS Security Token Service (AWS STS) to
generate temporary security credentials to sign requests.
Amazon Keyspaces supports two methods of authenticating client requests. The first method uses
service-specific credentials, which are password based credentials generated for a specific IAM user.
You can create and manage the password using the IAM console, the AWS CLI, or the AWS API. For
more information, see Using IAM with Amazon Keyspaces.
The second method uses an authentication plugin for the open-source DataStax Java Driver for
Cassandra. This plugin enables IAM users, roles, and federated identities to add authentication
information to Amazon Keyspaces (for Apache Cassandra) API requests using the AWS Signature
Version 4 process (SigV4). For more information, see the section called “Create IAM credentials for
AWS authentication”.
You can use an interface VPC endpoint to keep traffic between your Amazon VPC and Amazon
Keyspaces from leaving the Amazon network. Interface VPC endpoints are powered by AWS
PrivateLink, an AWS technology that enables private communication between AWS services using
an elastic network interface with private IPs in your Amazon VPC. For more information, see the
section called “Using interface VPC endpoints”.
Using Amazon Keyspaces with interface VPC endpoints
Interface VPC endpoints enable private communication between your virtual private cloud (VPC)
running in Amazon VPC and Amazon Keyspaces. Interface VPC endpoints are powered by AWS
PrivateLink, which is an AWS service that enables private communication between VPCs and AWS
services.
AWS PrivateLink enables this by using an elastic network interface with private IP addresses in
your VPC so that network traffic does not leave the Amazon network. Interface VPC endpoints
don't require an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
For more information, see Amazon Virtual Private Cloud and Interface VPC endpoints (AWS
PrivateLink).
Topics
Using interface VPC endpoints for Amazon Keyspaces
Populating system.peers table entries with interface VPC endpoint information
Controlling access to interface VPC endpoints for Amazon Keyspaces
Availability
VPC endpoint policies and Amazon Keyspaces point-in-time recovery (PITR)
Common errors and warnings
Using interface VPC endpoints for Amazon Keyspaces
You can create an interface VPC endpoint so that traffic between Amazon Keyspaces and your
Amazon VPC resources starts flowing through the interface VPC endpoint. To get started, follow
the steps to create an interface endpoint. Next, edit the security group associated with the
endpoint that you created in the previous step, and configure an inbound rule for port 9142. For
more information, see Adding, removing, and updating rules.
For a step-by-step tutorial to configure a connection to Amazon Keyspaces through a VPC
endpoint, see the section called “Connecting with VPC endpoints”. To learn how to configure cross-
account access for Amazon Keyspaces resources separated from applications in different AWS
accounts in a VPC, see the section called “Configure cross-account access”.
Populating system.peers table entries with interface VPC endpoint information
Apache Cassandra drivers use the system.peers table to query for node information about
the cluster. Cassandra drivers use the node information to load balance connections and retry
operations. Amazon Keyspaces populates nine entries in the system.peers table automatically
for clients connecting through the public endpoint.
To provide clients connecting through interface VPC endpoints with similar functionality, Amazon
Keyspaces populates the system.peers table in your account with an entry for each Availability
Zone where a VPC endpoint is available. To look up and store available interface VPC endpoints
in the system.peers table, Amazon Keyspaces requires that you grant the IAM entity used to
connect to Amazon Keyspaces access permissions to query your VPC for the endpoint and network
interface information.
Important
Populating the system.peers table with your available interface VPC endpoints improves
load balancing and increases read/write throughput. It is recommended for all clients
accessing Amazon Keyspaces using interface VPC endpoints and is required for Apache
Spark.
To grant the IAM entity used to connect to Amazon Keyspaces permissions to look up the necessary
interface VPC endpoint information, you can update your existing IAM role or user policy, or create
a new IAM policy as shown in the following example.
{
    "Version":"2012-10-17",
    "Statement":[
        {
            "Sid":"ListVPCEndpoints",
            "Effect":"Allow",
            "Action":[
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeVpcEndpoints"
            ],
            "Resource":"*"
        }
    ]
}
Note
The managed policies AmazonKeyspacesReadOnlyAccess_v2 and
AmazonKeyspacesFullAccess include the required permissions that allow Amazon Keyspaces
to access Amazon EC2 to read information about available interface VPC
endpoints.
To confirm that the policy has been set up correctly, query the system.peers table to
see networking information. If the system.peers table is empty, it could indicate that
the policy hasn't been configured successfully or that you have exceeded the request rate
quota for DescribeNetworkInterfaces and DescribeVpcEndpoints API actions.
DescribeVpcEndpoints falls into the Describe* category and is considered a non-mutating
action. DescribeNetworkInterfaces falls into the subset of unfiltered and unpaginated non-
mutating actions, and different quotas apply. For more information, see Request token bucket sizes
and refill rates in the Amazon EC2 API Reference.
If you do see an empty table, try again a few minutes later to rule out request rate quota issues. To
verify that you have configured the VPC endpoints correctly, see the section called “VPC endpoint
connection errors”. If your query returns results from the table, your policy has been configured
correctly.
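For example, assuming you're connected through cqlsh, the following query returns the networking information. If the policy is set up correctly, the result contains one row for each Availability Zone in which an interface VPC endpoint is available to you.
SELECT * FROM system.peers;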
Controlling access to interface VPC endpoints for Amazon Keyspaces
With VPC endpoint policies, you can control access to resources in two ways:
IAM policy – You can control the requests, users, or groups that are allowed to access Amazon
Keyspaces through a specific VPC endpoint. You can do this by using a condition key in the policy
that is attached to an IAM user, group, or role.
VPC policy – You can control which VPC endpoints have access to your Amazon Keyspaces
resources by attaching policies to them. To restrict access to a specific keyspace or table to only
allow traffic coming through a specific VPC endpoint, edit the existing IAM policy that restricts
resource access and add that VPC endpoint.
The following are example endpoint policies for accessing Amazon Keyspaces resources.
IAM policy example: Restrict all access to a specific Amazon Keyspaces table unless traffic
comes from the specified VPC endpoint – This sample policy can be attached to an IAM user,
role, or group. It restricts access to a specified Amazon Keyspaces table unless incoming traffic
originates from a specified VPC endpoint.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "UserOrRolePolicyToDenyAccess",
            "Action": "cassandra:*",
            "Effect": "Deny",
            "Resource": [
                "arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/table/mytable",
                "arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
            ],
            "Condition": { "StringNotEquals" : { "aws:sourceVpce": "vpce-abc123" } }
        }
    ]
}
Note
To restrict access to a specific table, you must also include access to the system tables.
System tables are read-only.
VPC policy example: Read-only access – This sample policy can be attached to a VPC endpoint.
(For more information, see Controlling access to Amazon VPC resources). It restricts actions to
read-only access to Amazon Keyspaces resources through the VPC endpoint that it's attached to.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadOnly",
            "Principal": "*",
            "Action": [
                "cassandra:Select"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}
VPC policy example: Restrict access to a specific Amazon Keyspaces table – This sample
policy can be attached to a VPC endpoint. It restricts access to a specific table through the VPC
endpoint that it's attached to.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RestrictAccessToTable",
            "Principal": "*",
            "Action": "cassandra:*",
            "Effect": "Allow",
            "Resource": [
                "arn:aws:cassandra:us-east-1:111122223333:/keyspace/mykeyspace/table/mytable",
                "arn:aws:cassandra:us-east-1:111122223333:/keyspace/system*"
            ]
        }
    ]
}
Note
To restrict access to a specific table, you must also include access to the system tables.
System tables are read-only.
Availability
Amazon Keyspaces supports using interface VPC endpoints in all of the AWS Regions where the
service is available. For more information, see ???.
VPC endpoint policies and Amazon Keyspaces point-in-time recovery (PITR)
If you are using IAM policies with condition keys to restrict incoming traffic, the table restore
operation may fail. For example, if you restrict source traffic to specific VPC endpoints using
aws:SourceVpce condition keys, the table restore operation fails. To allow Amazon Keyspaces
to perform a restore operation on your principal's behalf, you must add an aws:ViaAWSService
condition key to your IAM policy. The aws:ViaAWSService condition key allows access when any
AWS service makes a request using the principal's credentials. For more information, see IAM JSON
policy elements: Condition key in the IAM User Guide. The following policy is an example of this.
{
    "Version":"2012-10-17",
    "Statement":[
        {
            "Sid":"CassandraAccessForVPCE",
            "Effect":"Allow",
            "Action":"cassandra:*",
            "Resource":"*",
            "Condition":{
                "Bool":{
                    "aws:ViaAWSService":"false"
                },
                "StringEquals":{
                    "aws:SourceVpce":[
                        "vpce-12345678901234567"
                    ]
                }
            }
        },
        {
            "Sid":"CassandraAccessForAwsService",
            "Effect":"Allow",
            "Action":"cassandra:*",
            "Resource":"*",
            "Condition":{
                "Bool":{
                    "aws:ViaAWSService":"true"
                }
            }
        }
    ]
}
Common errors and warnings
If you're using Amazon Virtual Private Cloud and you connect to Amazon Keyspaces, you might
see the following warning.
Control node cassandra.us-east-1.amazonaws.com/1.111.111.111:9142 has an entry
for itself in system.peers: this entry will be ignored. This is likely due to a
misconfiguration;
please verify your rpc_address configuration in cassandra.yaml on all nodes in your
cluster.
This warning occurs because the system.peers table contains entries for all of the Amazon VPC
endpoints that Amazon Keyspaces has permissions to view, including the Amazon VPC endpoint
that you're connected through. You can safely ignore this warning.
For other errors, see the section called “VPC endpoint connection errors”.
Configuration and vulnerability analysis for Amazon Keyspaces
AWS handles basic security tasks like guest operating system (OS) and database patching, firewall
configuration, and disaster recovery. These procedures have been reviewed and certified by the
appropriate third parties. For more details, see the following resources:
Shared responsibility model
Amazon Web Services: Overview of security processes (whitepaper)
Security best practices for Amazon Keyspaces
Amazon Keyspaces (for Apache Cassandra) provides a number of security features to consider as
you develop and implement your own security policies. The following best practices are general
guidelines and don’t represent a complete security solution. Because these best practices might not
be appropriate or sufficient for your environment, treat them as helpful considerations rather than
prescriptions.
Topics
Preventative security best practices for Amazon Keyspaces
Detective security best practices for Amazon Keyspaces
Preventative security best practices for Amazon Keyspaces
The following security best practices are considered preventative because they can help you
anticipate and prevent security incidents in Amazon Keyspaces.
Use encryption at rest
Amazon Keyspaces encrypts at rest all user data that's stored in tables by using encryption keys
stored in AWS Key Management Service (AWS KMS). This provides an additional layer of data
protection by securing your data from unauthorized access to the underlying storage.
By default, Amazon Keyspaces uses an AWS owned key for encrypting all of your tables. If this
key doesn’t exist, it's created for you. Service default keys can't be disabled.
Alternatively, you can use a customer managed key for encryption at rest. For more
information, see Amazon Keyspaces Encryption at Rest.
Use IAM roles to authenticate access to Amazon Keyspaces
For users, applications, and other AWS services to access Amazon Keyspaces, they must include
valid AWS credentials in their AWS API requests. Don't store AWS credentials directly
in application code or on an Amazon EC2 instance. These are long-term credentials that aren't
automatically rotated, and therefore could have significant business impact if they are compromised. An IAM
role enables you to obtain temporary access keys that can be used to access AWS services and
resources.
For more information, see IAM Roles.
Use IAM policies for Amazon Keyspaces base authorization
When granting permissions, you decide who is getting them, which Amazon Keyspaces APIs
they are getting permissions for, and the specific actions you want to allow on those resources.
Implementing least privilege is key in reducing security risks and the impact that can result
from errors or malicious intent.
Attach permissions policies to IAM identities (that is, users, groups, and roles) and thereby grant
permissions to perform operations on Amazon Keyspaces resources.
You can do this by using the following:
AWS managed (predefined) policies
Customer managed policies
Use IAM policy conditions for fine-grained access control
When you grant permissions in Amazon Keyspaces, you can specify conditions that determine
how a permissions policy takes effect. Implementing least privilege is key in reducing security
risks and the impact that can result from errors or malicious intent.
You can specify conditions when granting permissions using an IAM policy. For example, you
can do the following:
Grant permissions to allow users read-only access to specific keyspaces or tables.
Grant permissions to allow a user write access to a certain table, based upon the identity of
that user.
For more information, see Identity-Based Policy Examples.
Consider client-side encryption
If you store sensitive or confidential data in Amazon Keyspaces, you might want to encrypt that
data as close as possible to its origin so that your data is protected throughout its lifecycle.
Encrypting your sensitive data in transit and at rest helps ensure that your plaintext data isn’t
available to any third party.
Detective security best practices for Amazon Keyspaces
The following security best practices are considered detective because they can help you detect
potential security weaknesses and incidents.
Use AWS CloudTrail to monitor AWS Key Management Service (AWS KMS) key usage
If you're using a customer managed AWS KMS key for encryption at rest, usage of this key is
logged into AWS CloudTrail. CloudTrail provides visibility into user activity by recording actions
taken on your account. CloudTrail records important information about each action, including
who made the request, the services used, the actions performed, parameters for the actions,
and the response elements returned by the AWS service. This information helps you track
changes made to your AWS resources and troubleshoot operational issues. CloudTrail makes it
easier to ensure compliance with internal policies and regulatory standards.
You can use CloudTrail to audit key usage. CloudTrail creates log files that contain a history
of AWS API calls and related events for your account. These log files include all AWS KMS API
requests that were made using the console, AWS SDKs, and command line tools, in addition to
those made through integrated AWS services. You can use these log files to get information
about when the AWS KMS key was used, the operation that was requested, the identity of the
requester, the IP address that the request came from, and so on. For more information, see
Logging AWS Key Management Service API Calls with AWS CloudTrail and the AWS CloudTrail
User Guide.
Use CloudTrail to monitor Amazon Keyspaces data definition language (DDL) operations
CloudTrail provides visibility into user activity by recording actions taken on your account.
CloudTrail records important information about each action, including who made the request,
the services used, the actions performed, parameters for the actions, and the response
elements returned by the AWS service. This information helps you to track changes made to
your AWS resources and to troubleshoot operational issues. CloudTrail makes it easier to ensure
compliance with internal policies and regulatory standards.
All Amazon Keyspaces DDL operations are logged in CloudTrail automatically. DDL operations
let you create and manage Amazon Keyspaces keyspaces and tables.
When activity occurs in Amazon Keyspaces, that activity is recorded in a CloudTrail event along
with other AWS service events in the event history. For more information, see Logging Amazon
Keyspaces operations by using AWS CloudTrail. You can view, search, and download recent
events in your AWS account. For more information, see Viewing events with CloudTrail event
history in the AWS CloudTrail User Guide.
For an ongoing record of events in your AWS account, including events for Amazon Keyspaces,
create a trail. A trail enables CloudTrail to deliver log files to an Amazon Simple Storage Service
(Amazon S3) bucket. By default, when you create a trail on the console, the trail applies to all
AWS Regions. The trail logs events from all Regions in the AWS partition and delivers the log
files to the S3 bucket that you specify. Additionally, you can configure other AWS services to
further analyze and act upon the event data collected in CloudTrail logs.
Tag your Amazon Keyspaces resources for identification and automation
You can assign metadata to your AWS resources in the form of tags. Each tag is a simple
label that consists of a customer-defined key and an optional value that can make it easier to
manage, search for, and filter resources.
Tagging allows for grouped controls to be implemented. Although there are no inherent types
of tags, they enable you to categorize resources by purpose, owner, environment, or other
criteria. The following are some examples:
Access – Used to control access to Amazon Keyspaces resources based on tags. For more
information, see the section called “Authorization based on Amazon Keyspaces tags”.
Security – Used to determine requirements such as data protection settings.
Confidentiality – An identifier for the specific data-confidentiality level that a resource
supports.
Environment – Used to distinguish between development, test, and production infrastructure.
For more information, see AWS tagging strategies and Adding tags and labels to resources.
CQL language reference for Amazon Keyspaces (for
Apache Cassandra)
After you connect to an Amazon Keyspaces endpoint, you use Cassandra Query Language (CQL) to
work with your database. CQL is similar in many ways to Structured Query Language (SQL).
CQL elements – This section covers the fundamental elements of CQL supported in Amazon
Keyspaces, including identifiers, constants, terms, and data types. It explains concepts like string
types, numeric types, collection types, and more.
Data Definition Language (DDL) – DDL statements are used to manage data structures like
keyspaces and tables in Amazon Keyspaces. This section covers statements for creating, altering,
and dropping keyspaces and tables, as well as restoring tables from a point-in-time backup.
Data Manipulation Language (DML) – DML statements are used to manage data within
tables. This section covers statements for selecting, inserting, updating, and deleting data. It
also explains advanced querying capabilities like using the IN operator, ordering results, and
pagination.
Built-in functions – Amazon Keyspaces supports a variety of built-in scalar functions that you
can use in CQL statements. This section provides an overview of these functions, including
examples of their usage.
Throughout this topic, you'll find detailed syntax, examples, and best practices for using CQL
effectively in Amazon Keyspaces.
Topics
Cassandra Query Language (CQL) elements in Amazon Keyspaces
DDL statements (data definition language) in Amazon Keyspaces
DML statements (data manipulation language) in Amazon Keyspaces
Built-in functions in Amazon Keyspaces
Cassandra Query Language (CQL) elements in Amazon
Keyspaces
Learn about the Cassandra Query Language (CQL) elements that are supported by Amazon
Keyspaces, including identifiers, constants, terms, and data types.
Topics
Identifiers
Constants
Terms
Data types
JSON encoding of Amazon Keyspaces data types
Identifiers
Identifiers (or names) are used to identify tables, columns, and other objects. An identifier can be
quoted or not quoted. The following applies.
identifier ::= unquoted_identifier | quoted_identifier
unquoted_identifier ::= re('[a-zA-Z][a-zA-Z0-9_]*')
quoted_identifier ::= '"' (any character where " can appear if doubled)+ '"'
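As an illustration (the keyspace and table names are hypothetical), unquoted identifiers are treated as lowercase, while quoted identifiers preserve case and must match the stored name exactly.
SELECT * FROM my_keyspace.my_table;
SELECT * FROM "My_Keyspace"."My_Table";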
Constants
The following constants are defined.
constant ::= string | integer | float | boolean | uuid | blob | NULL
string ::= '\'' (any character where ' can appear if doubled)+ '\''
         | '$$' (any character other than '$$') '$$'
integer ::= re('-?[0-9]+')
float ::= re('-?[0-9]+(\.[0-9]*)?([eE][+-]?[0-9]+)?') | NAN | INFINITY
boolean ::= TRUE | FALSE
uuid ::= hex{8}-hex{4}-hex{4}-hex{4}-hex{12}
hex ::= re("[0-9a-fA-F]")
blob ::= '0' ('x' | 'X') hex+
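As an illustration only, the following INSERT into a hypothetical table uses several of these constant forms: a uuid, a blob, a string with a doubled quote, and a boolean.
INSERT INTO my_keyspace.items (id, payload, note, in_stock)
VALUES (123e4567-e89b-12d3-a456-426614174000, 0x0a0b0c, 'It''s available', TRUE);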
Terms
A term denotes the kind of values that are supported. Terms are defined by the following.
term ::= constant | literal | function_call | arithmetic_operation |
type_hint | bind_marker
literal ::= collection_literal | tuple_literal
function_call ::= identifier '(' [ term (',' term)* ] ')'
arithmetic_operation ::= '-' term | term ('+' | '-' | '*' | '/' | '%') term
Data types
Amazon Keyspaces supports the following data types:
String types
ascii – Represents an ASCII character string.
text – Represents a UTF-8 encoded string.
varchar – Represents a UTF-8 encoded string (varchar is an alias for text).
Numeric types
bigint – Represents a 64-bit signed long.
counter – Represents a 64-bit signed integer counter. For more information, see the section called “Counters”.
decimal – Represents a variable-precision decimal.
double – Represents a 64-bit IEEE 754 floating point.
float – Represents a 32-bit IEEE 754 floating point.
int – Represents a 32-bit signed int.
varint – Represents an integer of arbitrary precision.
Counters
A counter column contains a 64-bit signed integer. The counter value is incremented or
decremented using the UPDATE statement (see the section called “UPDATE”), and it cannot be set directly. This
makes counter columns useful for tracking counts. For example, you can use counters to track the
number of entries in a log file or the number of times a post has been viewed on a social network.
The following restrictions apply to counter columns:
A column of type counter cannot be part of the primary key of a table.
In a table that contains one or more columns of type counter, all columns in that table must be
of type counter.
In cases where a counter update fails (for example, because of timeouts or loss of connection with
Amazon Keyspaces), the client doesn't know whether the counter value was updated. If the update
is retried, the update to the counter value might get applied a second time.
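The following minimal sketch, using hypothetical table and column names, shows a typical counter workflow. Note that the counter column is the only column besides the primary key, and that it's modified only through UPDATE.
CREATE TABLE my_keyspace.page_views (
    page_id text PRIMARY KEY,
    views counter);

UPDATE my_keyspace.page_views SET views = views + 1 WHERE page_id = 'home';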
Blob type
blob – Represents arbitrary bytes.
Boolean type
boolean – Represents true or false.
Time-related types
timestamp – A 64-bit signed integer representing the date and time since epoch (January 1, 1970 at 00:00:00 GMT) in milliseconds.
timeuuid – Represents a version 1 UUID.
Collection types
list – Represents an ordered collection of literal elements.
map – Represents an unordered collection of key-value pairs.
set – Represents an unordered collection of one or more literal elements.
You declare a collection column by using the collection type followed by another data type (for
example, TEXT or INT) in angled brackets. You can create a column with a SET of TEXT, or you can
create a MAP of TEXT and INT key-value pairs, as shown in the following example.
SET <TEXT>
MAP <TEXT, INT>
A non-frozen collection allows you to make updates to each individual collection element. Client-
side timestamps and Time to Live (TTL) settings are stored for individual elements.
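For example, given a hypothetical table with non-frozen set and map columns, you can add or change individual elements without rewriting the entire collection.
CREATE TABLE my_keyspace.users (
    id text PRIMARY KEY,
    emails set<text>,
    scores map<text, int>);

UPDATE my_keyspace.users SET emails = emails + {'a@example.com'} WHERE id = 'user1';
UPDATE my_keyspace.users SET scores['level1'] = 100 WHERE id = 'user1';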
When you use the FROZEN keyword on a collection type, the values of the collection are serialized
into a single immutable value, and Amazon Keyspaces treats them like a BLOB. This is a frozen
collection. An INSERT or UPDATE statement overwrites the entire frozen collection. You can't make
updates to individual elements inside a frozen collection.
Client-side timestamps and Time to Live (TTL) settings apply to the entire frozen collection, not to
individual elements. Frozen collection columns can be part of the PRIMARY KEY of a table.
You can nest frozen collections. For example, you can define a MAP within a SET if the MAP is using
the FROZEN keyword, as shown in the following example.
SET<FROZEN<MAP<TEXT, INT>>>
Amazon Keyspaces supports nesting of up to five levels of frozen collections by default. For more
information, see the section called “Amazon Keyspaces service quotas”. For more information
about functional differences with Apache Cassandra, see the section called “FROZEN collections”.
For more information about CQL syntax, see the section called “CREATE TABLE” and the section
called “ALTER TABLE”.
Tuple type
The tuple data type represents a bounded group of literal elements. You can use a tuple as an
alternative to a user defined type. You don't need to use the FROZEN keyword for tuples. This
is because a tuple is always frozen and you can't update elements individually.
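The following sketch, using hypothetical names, declares and populates a tuple column. The whole tuple is written as a single value.
CREATE TABLE my_keyspace.locations (
    id text PRIMARY KEY,
    coordinates tuple<float, float>);

INSERT INTO my_keyspace.locations (id, coordinates)
VALUES ('store-1', (47.61, -122.33));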
Other types
inet – A string representing an IP address, in either IPv4 or IPv6 format.
Static
In an Amazon Keyspaces table with clustering columns, you can use the STATIC keyword to create
a static column of any type.
The following statement is an example of this.
my_column INT STATIC
For more information about working with static columns, see the section called “Estimate capacity
consumption of static columns”.
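For example, in the following hypothetical table, customer_tier is stored once per partition and is shared by all rows with the same customer_id.
CREATE TABLE my_keyspace.orders (
    customer_id text,
    order_id text,
    customer_tier text STATIC,
    total decimal,
    PRIMARY KEY (customer_id, order_id));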
JSON encoding of Amazon Keyspaces data types
Amazon Keyspaces offers the same JSON data type mappings as Apache Cassandra. The following
list describes the data types Amazon Keyspaces accepts in INSERT JSON statements and the
data types Amazon Keyspaces uses when returning data with the SELECT JSON statement.
For single-field data types such as float, int, UUID, and date, you also can insert data as a
string. For compound data types and collections, such as tuple, map, and list, you can also
insert data as JSON or as an encoded JSON string.
Each entry that follows shows the JSON data type, the data types accepted in INSERT JSON
statements, the data types returned in SELECT JSON statements, and notes.
ascii – Accepts: string. Returns: string. Uses JSON character escape \u.
bigint – Accepts: integer, string. Returns: integer. String must be a valid 64-bit integer.
blob – Accepts: string. Returns: string. String should begin with 0x followed by an even number of hex digits.
boolean – Accepts: boolean, string. Returns: boolean. String must be either true or false.
date – Accepts: string. Returns: string. Date in format YYYY-MM-DD, timezone UTC.
decimal – Accepts: integer, float, string. Returns: float. Can exceed 32-bit or 64-bit IEEE-754 floating point precision in client-side decoder.
double – Accepts: integer, float, string. Returns: float. String must be a valid integer or float.
float – Accepts: integer, float, string. Returns: float. String must be a valid integer or float.
inet – Accepts: string. Returns: string. IPv4 or IPv6 address.
int – Accepts: integer, string. Returns: integer. String must be a valid 32-bit integer.
list – Accepts: list, string. Returns: list. Uses the native JSON list representation.
map – Accepts: map, string. Returns: map. Uses the native JSON map representation.
smallint – Accepts: integer, string. Returns: integer. String must be a valid 16-bit integer.
set – Accepts: list, string. Returns: list. Uses the native JSON list representation.
text – Accepts: string. Returns: string. Uses JSON character escape \u.
time – Accepts: string. Returns: string. Time of day in format HH-MM-SS[.fffffffff].
timestamp – Accepts: integer, string. Returns: string. A timestamp. String constants allow you to store timestamps as dates. Date stamps with format YYYY-MM-DD HH:MM:SS.SSS are returned.
timeuuid – Accepts: string. Returns: string. Type 1 UUID. See constants for the UUID format.
tinyint – Accepts: integer, string. Returns: integer. String must be a valid 8-bit integer.
tuple – Accepts: list, string. Returns: list. Uses the native JSON list representation.
uuid – Accepts: string. Returns: string. See constants for the UUID format.
varchar – Accepts: string. Returns: string. Uses JSON character escape \u.
varint – Accepts: integer, string. Returns: integer. Variable length; might overflow 32-bit or 64-bit integers in client-side decoder.
DDL statements (data definition language) in Amazon
Keyspaces
Data definition language (DDL) is the set of Cassandra Query Language (CQL) statements that you
use to manage data structures in Amazon Keyspaces (for Apache Cassandra), such as keyspaces
and tables. You use DDL to create these data structures, modify them after they are created,
and remove them when they're no longer in use. Amazon Keyspaces performs DDL operations
asynchronously. For more information about how to confirm that an asynchronous operation has
completed, see the section called “Asynchronous creation and deletion of keyspaces and tables”.
The following DDL statements are supported:
CREATE KEYSPACE
ALTER KEYSPACE
DROP KEYSPACE
CREATE TABLE
ALTER TABLE
RESTORE TABLE
DROP TABLE
Topics
Keyspaces
Tables
Keyspaces
A keyspace groups related tables that are relevant for one or more applications. In terms of a
relational database management system (RDBMS), keyspaces are roughly similar to databases,
tablespaces, or similar constructs.
Note
In Apache Cassandra, keyspaces determine how data is replicated among multiple storage
nodes. However, Amazon Keyspaces is a fully managed service: The details of its storage
layer are managed on your behalf. For this reason, keyspaces in Amazon Keyspaces are
logical constructs only, and aren't related to the underlying physical storage.
For information about quota limits and constraints for Amazon Keyspaces keyspaces, see Quotas.
Statements for keyspaces
CREATE KEYSPACE
ALTER KEYSPACE
DROP KEYSPACE
CREATE KEYSPACE
Use the CREATE KEYSPACE statement to create a new keyspace.
Syntax
create_keyspace_statement ::=
CREATE KEYSPACE [ IF NOT EXISTS ] keyspace_name
WITH options
Where:
keyspace_name is the name of the keyspace to be created.
options are one or more of the following:
REPLICATION – A map that indicates the replication strategy for the keyspace:
SingleRegionStrategy – For a single-Region keyspace. (Required)
NetworkTopologyStrategy – Specify at least two and up to six AWS Regions. The
replication factor for each Region is three. (Optional)
DURABLE_WRITES – Writes to Amazon Keyspaces are always durable, so this option isn't
required. However, if specified, the value must be true.
TAGS – A list of key-value pair tags to be attached to the resource when you create it.
(Optional)
Example
Create a keyspace as follows.
CREATE KEYSPACE my_keyspace
    WITH REPLICATION = {'class': 'SingleRegionStrategy'}
    AND TAGS = {'key1':'val1', 'key2':'val2'};
To create a multi-Region keyspace, specify NetworkTopologyStrategy and include at least two
and up to six AWS Regions. The replication factor for each Region is three.
CREATE KEYSPACE my_keyspace
    WITH REPLICATION = {'class':'NetworkTopologyStrategy',
    'us-east-1':'3', 'ap-southeast-1':'3', 'eu-west-1':'3'};
ALTER KEYSPACE
You can use ALTER KEYSPACE to add or remove tags from a keyspace.
To add or remove tags, you can use the following syntax.
Syntax
alter_keyspace_statement ::=
    ALTER KEYSPACE keyspace_name
    [[ADD | DROP] TAGS {'key1':'val1', 'key2':'val2'}]
Where:
keyspace_name is the name of the keyspace to be altered.
TAGS – A list of key-value pair tags to be added or removed from the keyspace.
Example
Alter a keyspace as follows.
ALTER KEYSPACE my_keyspace ADD TAGS {'key1':'val1', 'key2':'val2'};
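To remove tags, use DROP with the same statement form. The key-value pairs in this sketch are illustrative.
ALTER KEYSPACE my_keyspace DROP TAGS {'key1':'val1'};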
DROP KEYSPACE
Use the DROP KEYSPACE statement to remove a keyspace—including all of its contents, such as
tables.
Syntax
drop_keyspace_statement ::=
DROP KEYSPACE [ IF EXISTS ] keyspace_name
Where:
keyspace_name is the name of the keyspace to be dropped.
Example
DROP KEYSPACE "myGSGKeyspace";
Tables
Tables are the primary data structures in Amazon Keyspaces. Data in a table is organized into rows
and columns. A subset of those columns is used to determine partitioning (and ultimately data
placement) through the specification of a partition key.
Another set of columns can be defined as clustering columns, which means that they can
participate as predicates in query execution.
By default, new tables are created with on-demand throughput capacity. You can change the
capacity mode for new and existing tables. For more information about read/write capacity
throughput modes, see the section called “Configure read/write capacity modes”.
For tables in provisioned mode, you can configure optional AUTOSCALING_SETTINGS. For more
information about Amazon Keyspaces auto scaling and the available options, see the section called
“Configure automatic scaling on an existing table”.
For information about quota limits and constraints for Amazon Keyspaces tables, see Quotas.
Statements for tables
CREATE TABLE
ALTER TABLE
RESTORE TABLE
DROP TABLE
CREATE TABLE
Use the CREATE TABLE statement to create a new table.
Syntax
create_table_statement ::= CREATE TABLE [ IF NOT EXISTS ] table_name
'('
column_definition
( ',' column_definition )*
[ ',' PRIMARY KEY '(' primary_key ')' ]
')' [ WITH table_options ]
column_definition ::= column_name cql_type [ FROZEN ][ STATIC ][ PRIMARY KEY]
primary_key ::= partition_key [ ',' clustering_columns ]
partition_key ::= column_name
| '(' column_name ( ',' column_name )* ')'
clustering_columns ::= column_name ( ',' column_name )*
table_options ::= [table_options]
| CLUSTERING ORDER BY '(' clustering_order
')' [ AND table_options ]
| options
| CUSTOM_PROPERTIES
| AUTOSCALING_SETTINGS
| default_time_to_live
| TAGS
clustering_order ::= column_name (ASC | DESC) ( ',' column_name (ASC | DESC) )*
Where:
table_name is the name of the table to be created.
column_definition consists of the following:
column_name – The name of the column.
cql_type – An Amazon Keyspaces data type (see Data types).
FROZEN – Designates this column of type collection (for example, LIST, SET, or MAP) as
frozen. A frozen collection is serialized into a single immutable value and treated like a BLOB.
For more information, see the section called “Collection types”.
STATIC – Designates this column as static. Static columns store values that are shared by all
rows in the same partition.
PRIMARY KEY – Designates this column as the table's primary key.
primary_key consists of the following:
partition_key
clustering_columns
partition_key:
The partition key can be a single column, or it can be a compound value composed of two or
more columns. The partition key portion of the primary key is required and determines how
Amazon Keyspaces stores your data.
clustering_columns:
The optional clustering column portion of your primary key determines how the data is
clustered and sorted within each partition.
table_options consist of the following:
CLUSTERING ORDER BY – The default CLUSTERING ORDER on a table is composed of your
clustering keys in the ASC (ascending) sort direction. Specify it to override the default sort
behavior.
CUSTOM_PROPERTIES – A map of settings that are specific to Amazon Keyspaces.
capacity_mode: Specifies the read/write throughput capacity mode for
the table. The options are throughput_mode:PAY_PER_REQUEST and
throughput_mode:PROVISIONED. The provisioned capacity mode requires
read_capacity_units and write_capacity_units as inputs. The default is
throughput_mode:PAY_PER_REQUEST.
client_side_timestamps: Specifies if client-side timestamps are enabled or disabled for
the table. The options are {'status': 'enabled'} and {'status': 'disabled'}.
If it's not specified, the default is status:disabled. After client-side timestamps are
enabled for a table, this setting cannot be disabled.
encryption_specification: Specifies the encryption options for encryption at rest. If
it's not specified, the default is encryption_type:AWS_OWNED_KMS_KEY. The encryption
option customer managed key requires the AWS KMS key in Amazon Resource Name (ARN)
format as input: kms_key_identifier:ARN.
point_in_time_recovery: Specifies if point-in-time restore is enabled or disabled for the
table. The options are status:enabled and status:disabled. If it's not specified, the
default is status:disabled.
replica_updates: Specifies the settings of a multi-Region table that are specific to
an AWS Region. For a multi-Region table, you can configure the table's read capacity
differently per AWS Region. You can do this by configuring the following parameters. For
more information and examples, see the section called “Create a multi-Region table in
provisioned mode”.
region – The AWS Region of the table replica with the following settings:
read_capacity_units
TTL: Enables Time to Live custom settings for the table. To enable, use status:enabled.
The default is status:disabled. After TTL is enabled, you can't disable it for the table.
AUTOSCALING_SETTINGS includes the following optional settings for tables in provisioned
mode. For more information and examples, see the section called “Create a new table with
automatic scaling”.
provisioned_write_capacity_autoscaling_update:
autoscaling_disabled – To enable auto scaling for write capacity, set the value to
false. The default is true. (Optional)
minimum_units – The minimum level of write throughput that the table should always
be ready to support. The value must be between 1 and the max throughput per second
quota for your account (40,000 by default).
maximum_units – The maximum level of write throughput that the table should always
be ready to support. The value must be between 1 and the max throughput per second
quota for your account (40,000 by default).
scaling_policy – Amazon Keyspaces supports the target tracking policy. The auto
scaling target is the provisioned write capacity of the table.
target_tracking_scaling_policy_configuration – To define the target
tracking policy, you must define the target value. For more information about target
tracking and cooldown periods, see Target Tracking Scaling Policies in the Application
Auto Scaling User Guide.
target_value – The target utilization rate of the table. Amazon Keyspaces auto
scaling ensures that the ratio of consumed capacity to provisioned capacity stays at or
near this value. You define target_value as a percentage. A double between 20 and
90. (Required)
scale_in_cooldown – A cooldown period in seconds between scaling activities that
lets the table stabilize before another scale in activity starts. If no value is provided,
the default is 0. (Optional)
scale_out_cooldown – A cooldown period in seconds between scaling activities
that lets the table stabilize before another scale out activity starts. If no value is
provided, the default is 0. (Optional)
disable_scale_in: A boolean that specifies if scale-in is disabled or enabled
for the table. This parameter is disabled by default. To turn on scale-in, set the
boolean value to FALSE. This means that capacity is automatically scaled down for a
table on your behalf. (Optional)
provisioned_read_capacity_autoscaling_update:
autoscaling_disabled – To enable auto scaling for read capacity, set the value to
false. The default is true. (Optional)
minimum_units – The minimum level of throughput that the table should always be
ready to support. The value must be between 1 and the max throughput per second quota
for your account (40,000 by default).
maximum_units – The maximum level of throughput that the table should always be
ready to support. The value must be between 1 and the max throughput per second quota
for your account (40,000 by default).
scaling_policy – Amazon Keyspaces supports the target tracking policy. The auto
scaling target is the provisioned read capacity of the table.
target_tracking_scaling_policy_configuration – To define the target
tracking policy, you must define the target value. For more information about target
tracking and cooldown periods, see Target Tracking Scaling Policies in the Application
Auto Scaling User Guide.
target_value – The target utilization rate of the table. Amazon Keyspaces auto
scaling ensures that the ratio of consumed capacity to provisioned capacity stays at or
near this value. You define target_value as a percentage. A double between 20 and
90. (Required)
scale_in_cooldown – A cooldown period in seconds between scaling activities that
lets the table stabilize before another scale in activity starts. If no value is provided,
the default is 0. (Optional)
scale_out_cooldown – A cooldown period in seconds between scaling activities
that lets the table stabilize before another scale out activity starts. If no value is
provided, the default is 0. (Optional)
disable_scale_in: A boolean that specifies if scale-in is disabled or enabled
for the table. This parameter is disabled by default. To turn on scale-in, set the
boolean value to FALSE. This means that capacity is automatically scaled down for a
table on your behalf. (Optional)
replica_updates: Specifies the AWS Region specific auto scaling settings of a multi-
Region table. For a multi-Region table, you can configure the table's read capacity differently
per AWS Region. You can do this by configuring the following parameters. For more
information and examples, see the section called “Update provisioned capacity and auto
scaling settings for a multi-Region table”.
region – The AWS Region of the table replica with the following settings:
provisioned_read_capacity_autoscaling_update
autoscaling_disabled – To enable auto scaling for the table's read capacity, set
the value to false. The default is true. (Optional)
Note
Auto scaling for a multi-Region table has to be either enabled or disabled for
all replicas of the table.
minimum_units – The minimum level of read throughput that the table should
always be ready to support. The value must be between 1 and the max throughput
per second quota for your account (40,000 by default).
maximum_units – The maximum level of read throughput that the table should
always be ready to support. The value must be between 1 and the max throughput
per second quota for your account (40,000 by default).
scaling_policy – Amazon Keyspaces supports the target tracking policy. The auto
scaling target is the provisioned read capacity of the table.
target_tracking_scaling_policy_configuration – To define the target
tracking policy, you must define the target value. For more information about
target tracking and cooldown periods, see Target Tracking Scaling Policies in the
Application Auto Scaling User Guide.
target_value – The target utilization rate of the table. Amazon Keyspaces auto
scaling ensures that the ratio of consumed read capacity to provisioned read
capacity stays at or near this value. You define target_value as a percentage. A
double between 20 and 90. (Required)
scale_in_cooldown – A cooldown period in seconds between scaling activities
that lets the table stabilize before another scale in activity starts. If no value is
provided, the default is 0. (Optional)
scale_out_cooldown – A cooldown period in seconds between scaling
activities that lets the table stabilize before another scale out activity starts. If no
value is provided, the default is 0. (Optional)
disable_scale_in: A boolean that specifies if scale-in is disabled
or enabled for the table. This parameter is disabled by default. To turn on
scale-in, set the boolean value to FALSE. This means that read capacity is
automatically scaled down for a table on your behalf. (Optional)
default_time_to_live – The default Time to Live setting in seconds for the table.
TAGS – A list of key-value pair tags to be attached to the resource when it's created.
clustering_order consists of the following:
column_name – The name of the column.
ASC | DESC – Sets the ascendant (ASC) or descendant (DESC) order modifier. If it's not
specified, the default order is ASC.
Example
CREATE TABLE IF NOT EXISTS "my_keyspace".my_table (
    id text,
    name text,
    region text,
    division text,
    project text,
    role text,
    pay_scale int,
    vacation_hrs float,
    manager_id text,
    PRIMARY KEY (id,division))
WITH CUSTOM_PROPERTIES={
        'capacity_mode':{
            'throughput_mode': 'PROVISIONED',
            'read_capacity_units': 10,
            'write_capacity_units': 20
        },
        'point_in_time_recovery':{'status': 'enabled'},
        'encryption_specification':{
            'encryption_type': 'CUSTOMER_MANAGED_KMS_KEY',
            'kms_key_identifier':'arn:aws:kms:eu-west-1:5555555555555:key/11111111-1111-111-1111-111111111111'
        }
    }
    AND CLUSTERING ORDER BY (division ASC)
    AND TAGS={'key1':'val1', 'key2':'val2'}
    AND default_time_to_live = 3024000;
In a table that uses clustering columns, non-clustering columns can be declared as static in the
table definition. For more information about static columns, see the section called “Estimate
capacity consumption of static columns”.
Example
CREATE TABLE "my_keyspace".my_table (
id int,
name text,
region text,
division text,
project text STATIC,
PRIMARY KEY (id,division));
ALTER TABLE
Use the ALTER TABLE statement to add new columns, add tags, or change the table's custom
properties.
Syntax
alter_table_statement ::= ALTER TABLE table_name
[ ADD ( column_definition | column_definition_list) ]
[[ADD | DROP] TAGS {'key1':'val1', 'key2':'val2'}]
[ WITH table_options [ , ... ] ] ;
column_definition ::= column_name cql_type
Where:
table_name is the name of the table to be altered.
column_definition is the name of the column and data type to be added.
column_definition_list is a comma-separated list of columns placed inside parentheses.
table_options consist of the following:
CUSTOM_PROPERTIES – A map of settings specific to Amazon Keyspaces.
capacity_mode: Specifies the read/write throughput capacity mode for
the table. The options are throughput_mode:PAY_PER_REQUEST and
throughput_mode:PROVISIONED. The provisioned capacity mode requires
read_capacity_units and write_capacity_units as inputs. The default is
throughput_mode:PAY_PER_REQUEST.
client_side_timestamps: Specifies if client-side timestamps are enabled or disabled for
the table. The options are {'status': 'enabled'} and {'status': 'disabled'}.
If it's not specified, the default is status:disabled. After client-side timestamps are
enabled for a table, this setting cannot be disabled.
encryption_specification: Specifies the encryption option for encryption
at rest. The options are encryption_type:AWS_OWNED_KMS_KEY and
encryption_type:CUSTOMER_MANAGED_KMS_KEY. The encryption option customer
managed key requires the AWS KMS key in Amazon Resource Name (ARN) format as input:
kms_key_identifier:ARN.
point_in_time_recovery: Specifies if point-in-time restore is enabled or disabled
for the table. The options are status:enabled and status:disabled. The default is
status:disabled.
replica_updates: Specifies the AWS Region specific settings of a multi-Region table.
For a multi-Region table, you can configure the table's read capacity differently per AWS
Region. You can do this by configuring the following parameters. For more information and
examples, see the section called “Update provisioned capacity and auto scaling settings for a
multi-Region table”.
region – The AWS Region of the table replica with the following settings:
read_capacity_units
ttl: Enables Time to Live custom settings for the table. To enable, use status:enabled.
The default is status:disabled. After ttl is enabled, you can't disable it for the table.
AUTOSCALING_SETTINGS includes the optional auto scaling settings for provisioned tables.
For syntax and detailed descriptions, see the section called “CREATE TABLE”. For examples, see
the section called “Configure automatic scaling on an existing table”.
default_time_to_live: The default Time to Live setting in seconds for the table.
TAGS is a list of key-value pair tags to be attached to the resource.
Note
With ALTER TABLE, you can only change a single custom property. You can't combine more
than one ALTER TABLE command in the same statement.
Examples
The following statement shows how to add a column to an existing table.
ALTER TABLE mykeyspace.mytable ADD (ID int);
This statement shows how to add two collection columns to an existing table:
A frozen collection column col_frozen_list that contains a nested frozen collection
A non-frozen collection column col_map that contains a nested frozen collection
ALTER TABLE my_Table ADD(col_frozen_list FROZEN<LIST<FROZEN<SET<TEXT>>>>,
    col_map MAP<INT, FROZEN<SET<INT>>>);
To change a table's capacity mode and specify read and write capacity units, you can use the
following statement.
ALTER TABLE mykeyspace.mytable WITH CUSTOM_PROPERTIES={'capacity_mode':
{'throughput_mode': 'PROVISIONED', 'read_capacity_units': 10, 'write_capacity_units':
20}};
The following statement specifies a customer managed KMS key for the table.
ALTER TABLE mykeyspace.mytable WITH CUSTOM_PROPERTIES={
    'encryption_specification':{
        'encryption_type': 'CUSTOMER_MANAGED_KMS_KEY',
        'kms_key_identifier':'arn:aws:kms:eu-west-1:5555555555555:key/11111111-1111-111-1111-111111111111'
    }
};
To enable point-in-time restore for a table, you can use the following statement.
ALTER TABLE mykeyspace.mytable WITH CUSTOM_PROPERTIES={'point_in_time_recovery':
{'status': 'enabled'}};
To set a default Time to Live value in seconds for a table, you can use the following statement.
ALTER TABLE my_table WITH default_time_to_live = 2592000;
This statement enables custom Time to Live settings for a table.
ALTER TABLE mytable WITH CUSTOM_PROPERTIES={'ttl':{'status': 'enabled'}};
RESTORE TABLE
Use the RESTORE TABLE statement to restore a table to a point in time. This statement requires
point-in-time recovery to be enabled on a table. For more information, see the section called
“Backup and restore with point-in-time recovery”.
Syntax
restore_table_statement ::=
RESTORE TABLE restored_table_name FROM TABLE source_table_name
[ WITH table_options [ , ... ] ];
Where:
restored_table_name is the name of the restored table.
source_table_name is the name of the source table.
table_options consists of the following:
restore_timestamp is the restore point time in ISO 8601 format. If it's not specified, the
current timestamp is used.
CUSTOM_PROPERTIES – A map of settings specific to Amazon Keyspaces.
capacity_mode: Specifies the read/write throughput capacity mode for
the table. The options are throughput_mode:PAY_PER_REQUEST and
throughput_mode:PROVISIONED. The provisioned capacity mode requires
read_capacity_units and write_capacity_units as inputs. The default is the current
setting from the source table.
encryption_specification: Specifies the encryption option for encryption
at rest. The options are encryption_type:AWS_OWNED_KMS_KEY and
encryption_type:CUSTOMER_MANAGED_KMS_KEY. The encryption option customer
managed key requires the AWS KMS key in Amazon Resource Name (ARN) format as input:
kms_key_identifier:ARN. To restore a table encrypted with a customer managed key to
a table encrypted with an AWS owned key, Amazon Keyspaces requires access to the AWS
KMS key of the source table.
point_in_time_recovery: Specifies if point-in-time restore is enabled or disabled for
the table. The options are status:enabled and status:disabled. Unlike when you
create new tables, the default status for restored tables is status:enabled because the
setting is inherited from the source table. To disable PITR for restored tables, you must set
status:disabled explicitly.
replica_updates: Specifies the AWS Region specific settings of a multi-Region table. For
a multi-Region table, you can configure the table's read capacity differently per AWS Region.
You can do this by configuring the following parameters.
region – The AWS Region of the table replica with the following settings:
read_capacity_units
AUTOSCALING_SETTINGS includes the optional auto scaling settings for provisioned tables.
For detailed syntax and descriptions, see the section called “CREATE TABLE”.
TAGS is a list of key-value pair tags to be attached to the resource.
Note
Deleted tables can only be restored to the time of deletion.
Example
RESTORE TABLE mykeyspace.mytable_restored from table mykeyspace.my_table
WITH restore_timestamp = '2020-06-30T04:05:00+0000'
AND custom_properties = {'point_in_time_recovery':{'status':'disabled'},
'capacity_mode':{'throughput_mode': 'PROVISIONED', 'read_capacity_units': 10,
'write_capacity_units': 20}}
AND TAGS={'key1':'val1', 'key2':'val2'};
DROP TABLE
Use the DROP TABLE statement to remove a table from the keyspace.
Syntax
drop_table_statement ::=
DROP TABLE [ IF EXISTS ] table_name
Where:
IF EXISTS prevents DROP TABLE from failing if the table doesn't exist. (Optional)
table_name is the name of the table to be dropped.
Example
DROP TABLE "myGSGKeyspace".employees_tbl;
DML statements (data manipulation language) in Amazon
Keyspaces
Data manipulation language (DML) is the set of Cassandra Query Language (CQL) statements
that you use to manage data in Amazon Keyspaces (for Apache Cassandra) tables. You use DML
statements to add, modify, or delete data in a table.
You also use DML statements to query data in a table. (Note that CQL doesn't support joins or
subqueries.)
Topics
SELECT
INSERT
UPDATE
DELETE
SELECT
Use a SELECT statement to query data.
Syntax
select_statement ::= SELECT [ JSON ] ( select_clause | '*' )
FROM table_name
[ WHERE 'where_clause' ]
[ ORDER BY 'ordering_clause' ]
[ LIMIT (integer | bind_marker) ]
[ ALLOW FILTERING ]
select_clause ::= selector [ AS identifier ] ( ',' selector [ AS identifier ] )*
selector ::= column_name
           | term
           | CAST '(' selector AS cql_type ')'
           | function_name '(' [ selector ( ',' selector )* ] ')'
where_clause ::= relation ( AND relation )*
relation ::= column_name operator term
           | TOKEN '(' column_name ( ',' column_name )* ')' operator term
operator ::= '=' | '<' | '>' | '<=' | '>=' | IN | CONTAINS | CONTAINS KEY
ordering_clause ::= column_name [ ASC | DESC ] ( ',' column_name [ ASC | DESC ] )*
Examples
SELECT name, id, manager_id FROM "myGSGKeyspace".employees_tbl ;
SELECT JSON name, id, manager_id FROM "myGSGKeyspace".employees_tbl ;
For a table that maps JSON-encoded data types to Amazon Keyspaces data types, see the section
called “JSON encoding of Amazon Keyspaces data types”.
Using the IN keyword
The IN keyword specifies equality for one or more values. It can be applied to the partition key
and the clustering column. Results are returned in the order the keys are presented in the SELECT
statement.
Examples
SELECT * from mykeyspace.mytable WHERE primary.key1 IN (1,2) and clustering.key1 = 2;
SELECT * from mykeyspace.mytable WHERE primary.key1 IN (1,2) and clustering.key1 <= 2;
SELECT * from mykeyspace.mytable WHERE primary.key1 = 1 and clustering.key1 IN (1, 2);
SELECT * from mykeyspace.mytable WHERE primary.key1 <= 2 and clustering.key1 IN (1, 2)
ALLOW FILTERING;
For more information about the IN keyword and how Amazon Keyspaces processes the statement,
see the section called “Use IN SELECT”.
Ordering results
The ORDER BY clause specifies the sort order of the returned results. It takes as arguments a list of
column names along with the sort order for each column. You can only specify clustering columns
in ordering clauses. Non-clustering columns are not allowed. The sort order options are ASC for
ascending and DESC for descending sort order. If the sort order is omitted, the default ordering of
the clustering column is used. For possible sort orders, see the section called “Order results”.
Example
SELECT name, id, division, manager_id FROM "myGSGKeyspace".employees_tbl WHERE id =
'012-34-5678' ORDER BY division;
When using ORDER BY with the IN keyword, results are ordered within a page. Full re-ordering
with disabled pagination is not supported.
TOKEN
You can apply the TOKEN function to the PARTITION KEY column in SELECT and WHERE clauses.
With the TOKEN function, Amazon Keyspaces returns rows based on the mapped token value of the
PARTITION KEY rather than on the value of the PARTITION KEY.
TOKEN relations are not supported with the IN keyword.
Examples
SELECT TOKEN(id) from my_table;
SELECT TOKEN(id) from my_table WHERE TOKEN(id) > 100 and TOKEN(id) < 10000;
TTL function
You can use the TTL function with the SELECT statement to retrieve the expiration time in seconds
that is stored for a column. If no TTL value is set, the function returns null.
Example
SELECT TTL(my_column) from my_table;
The TTL function can’t be used on multi-cell columns such as collections.
WRITETIME function
You can use the WRITETIME function with the SELECT statement to retrieve the timestamp that
is stored as metadata for the value of a column only if the table uses client-side timestamps. For
more information, see the section called “Client-side timestamps”.
SELECT WRITETIME(my_column) from my_table;
The WRITETIME function can’t be used on multi-cell columns such as collections.
Note
For compatibility with established Cassandra driver behavior, tag-based authorization
policies are not enforced when you perform operations on system tables by using
Cassandra Query Language (CQL) API calls through Cassandra drivers and developer tools.
For more information, see the section called “Amazon Keyspaces resource access based on
tags”.
INSERT
Use the INSERT statement to add a row to a table.
Syntax
insert_statement ::= INSERT INTO table_name ( names_values | json_clause )
[ IF NOT EXISTS ]
[ USING update_parameter ( AND update_parameter )* ]
names_values ::= names VALUES tuple_literal
json_clause ::= JSON string [ DEFAULT ( NULL | UNSET ) ]
names ::= '(' column_name ( ',' column_name )* ')'
Example
INSERT INTO "myGSGKeyspace".employees_tbl (id, name, project, region, division, role,
pay_scale, vacation_hrs, manager_id)
VALUES ('012-34-5678','Russ','NightFlight','US','Engineering','IC',3,12.5,
'234-56-7890') ;
Update parameters
INSERT supports the following values as update_parameter:
TTL – A time value in seconds. The maximum configurable value is 630,720,000 seconds, which
is the equivalent of 20 years.
TIMESTAMP – A bigint value representing the number of microseconds since the standard base
time known as the epoch: January 1, 1970 at 00:00:00 GMT. A timestamp in Amazon Keyspaces
must fall within the range of 2 days in the past and 5 minutes in the future.
Example
INSERT INTO my_table (userid, time, subject, body, user)
VALUES (B79CB3BA-745E-5D9A-8903-4A02327A7E09, 96a29100-5e25-11ec-90d7-
b5d91eceda0a, 'Message', 'Hello','205.212.123.123')
USING TTL 259200;
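TIMESTAMP can be combined with TTL in the same USING clause. The following sketch assumes that client-side timestamps are turned on for the table and that the microsecond value falls within the accepted range when the statement runs:
INSERT INTO my_table (userid, time, subject, body, user)
VALUES (B79CB3BA-745E-5D9A-8903-4A02327A7E09, 96a29100-5e25-11ec-90d7-
b5d91eceda0a, 'Message', 'Hello', '205.212.123.123')
USING TTL 259200 AND TIMESTAMP 1667467854000000;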
JSON support
For a table that maps JSON-encoded data types to Amazon Keyspaces data types, see the section
called “JSON encoding of Amazon Keyspaces data types”.
You can use the JSON keyword to insert a JSON-encoded map as a single row. For columns that
exist in the table but are omitted in the JSON insert statement, use DEFAULT UNSET to preserve
the existing values. Use DEFAULT NULL to write a NULL value into each omitted column,
overwriting any existing values (standard write charges apply). DEFAULT NULL is the default
option.
Example
INSERT INTO "myGSGKeyspace".employees_tbl JSON '{"id":"012-34-5678",
"name": "Russ",
"project": "NightFlight",
"region": "US",
"division": "Engineering",
"role": "IC",
"pay_scale": 3,
"vacation_hrs": 12.5,
"manager_id": "234-56-7890"}';
If the JSON data contains duplicate keys, Amazon Keyspaces stores the last value for the key
(similar to Apache Cassandra). In the following example, where the duplicate key is id, the value
234-56-7890 is used.
Example
INSERT INTO "myGSGKeyspace".employees_tbl JSON '{"id":"012-34-5678",
"name": "Russ",
"project": "NightFlight",
"region": "US",
"division": "Engineering",
"role": "IC",
"pay_scale": 3,
"vacation_hrs": 12.5,
"id": "234-56-7890"}';
UPDATE
Use the UPDATE statement to modify a row in a table.
Syntax
update_statement ::= UPDATE table_name
[ USING update_parameter ( AND update_parameter )* ]
SET assignment ( ',' assignment )*
WHERE where_clause
[ IF ( EXISTS | condition ( AND condition )*) ]
update_parameter ::= ( TTL | TIMESTAMP ) ( integer | bind_marker )
assignment ::= simple_selection '=' term
| column_name '=' column_name ( '+' | '-' ) term
| column_name '=' list_literal '+' column_name
simple_selection ::= column_name
| column_name '[' term ']'
| column_name '.' field_name
condition ::= simple_selection operator term
Example
UPDATE "myGSGKeyspace".employees_tbl SET pay_scale = 5 WHERE id = '567-89-0123' AND
division = 'Marketing' ;
To increment a counter, use the following syntax. For more information, see the section called
“Counters”.
UPDATE ActiveUsers SET counter = counter + 1 WHERE user = A70FE1C0-5408-4AE3-
BE34-8733E5C09F14 AND action = 'click';
Update parameters
UPDATE supports the following values as update_parameter:
TTL – A time value in seconds. The maximum configurable value is 630,720,000 seconds, which
is the equivalent of 20 years.
TIMESTAMP – A bigint value representing the number of microseconds since the standard base
time known as the epoch: January 1, 1970 at 00:00:00 GMT. A timestamp in Amazon Keyspaces
must fall within the range of 2 days in the past and 5 minutes in the future.
Example
UPDATE my_table
USING TIMESTAMP 1667467854000000
SET subject = 'Message', body = 'Hello again', user = '205.212.123.123'
WHERE userid = B79CB3BA-745E-5D9A-8903-4A02327A7E09 AND time = 96a29100-5e25-11ec-90d7-
b5d91eceda0a;
Here, 1667467854000000 is the microsecond representation of 2022-11-03 13:30:54+0400.
DELETE
Use the DELETE statement to remove a row from a table.
Syntax
delete_statement ::= DELETE [ simple_selection ( ',' simple_selection )* ]
FROM table_name
[ USING update_parameter ( AND update_parameter )* ]
WHERE where_clause
[ IF ( EXISTS | condition ( AND condition )*) ]
simple_selection ::= column_name
| column_name '[' term ']'
| column_name '.' field_name
condition ::= simple_selection operator term
Where:
table_name is the table that contains the row you want to delete.
Example
DELETE manager_id FROM "myGSGKeyspace".employees_tbl WHERE id='789-01-2345' AND
division='Executive' ;
DELETE supports the following value as update_parameter:
TIMESTAMP – A bigint value representing the number of microseconds since the standard base
time known as the epoch: January 1, 1970 at 00:00:00 GMT.
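As a sketch, the following statement deletes a single column value with an explicit write timestamp. It assumes that client-side timestamps are turned on for the table; the microsecond value shown is illustrative:
DELETE manager_id FROM "myGSGKeyspace".employees_tbl
USING TIMESTAMP 1667467854000000
WHERE id = '789-01-2345' AND division = 'Executive';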
Built-in functions in Amazon Keyspaces
Amazon Keyspaces (for Apache Cassandra) supports a variety of built-in functions that you can use
in Cassandra Query Language (CQL) statements.
Topics
Scalar functions
Scalar functions
A scalar function performs a calculation on a single value and returns the result as a single value.
Amazon Keyspaces supports the following scalar functions.
blobAsType – Returns a value of the specified data type.
cast – Converts one native data type into another native data type.
currentDate – Returns the current date/time as a date.
currentTime – Returns the current date/time as a time.
currentTimestamp – Returns the current date/time as a timestamp.
currentTimeUUID – Returns the current date/time as a timeuuid.
fromJson – Converts the JSON string into the selected column's data type.
maxTimeuuid – Returns the largest possible timeuuid for a timestamp or date string.
minTimeuuid – Returns the smallest possible timeuuid for a timestamp or date string.
now – Returns a new unique timeuuid. Supported for INSERT, UPDATE, and DELETE statements, and as part of the WHERE clause in SELECT statements.
toDate – Converts either a timeuuid or a timestamp to a date type.
toJson – Returns the column value of the selected column in JSON format.
token – Returns the hash value of the partition key.
toTimestamp – Converts either a timeuuid or a date to a timestamp.
TTL – Returns the expiration time in seconds for a column.
typeAsBlob – Converts the specified data type into a blob.
toUnixTimestamp – Converts either a timeuuid or a timestamp into a bigint.
uuid – Returns a random version 4 UUID. Supported for INSERT, UPDATE, and DELETE statements, and as part of the WHERE clause in SELECT statements.
writetime – Returns the timestamp of the value of the specified column.
dateOf – (Deprecated) Extracts the timestamp of a timeuuid, and returns the value as a date.
unixTimestampOf – (Deprecated) Extracts the timestamp of a timeuuid, and returns the value as a raw, 64-bit integer timestamp.
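As a minimal sketch, the following query combines several of these functions against the employees_tbl table used earlier in this chapter. It assumes the table's partition key is id and its clustering column is division, as in the earlier examples; WRITETIME returns a value only if client-side timestamps are turned on for the table, and TTL returns null if no expiration is set.
SELECT toJson(name), TTL(pay_scale), WRITETIME(pay_scale), token(id)
FROM "myGSGKeyspace".employees_tbl
WHERE id = '012-34-5678' AND division = 'Engineering';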
Quotas for Amazon Keyspaces (for Apache Cassandra)
This section describes current quotas and default values for Amazon Keyspaces (for Apache
Cassandra).
Topics
Amazon Keyspaces service quotas
Increasing or decreasing throughput (for provisioned tables)
Amazon Keyspaces encryption at rest
Amazon Keyspaces service quotas
The following list contains Amazon Keyspaces (for Apache Cassandra) quotas and their default
values. Information about which quotas can be adjusted is available in the Service Quotas console,
where you can also request quota increases. For more information about quotas, contact AWS
Support.
Max keyspaces per AWS Region – The maximum number of keyspaces for this subscriber per Region. You can adjust this default value in the Service Quotas console. Default: 256.
Max tables per AWS Region – The maximum number of tables across all keyspaces for this subscriber per Region. You can adjust this default value in the Service Quotas console. Default: 256.
Max table schema size – The maximum size of a table schema. Default: 350 KB.
Max concurrent DDL operations – The maximum number of concurrent DDL operations allowed for this subscriber per Region. Default: 50.
Max queries per connection – The maximum number of CQL queries that can be processed by a single client TCP connection per second. Default: 3,000.
Max row size – The maximum size of a row, excluding static column data. For details, see the section called “Estimate row size”. Default: 1 MB.
Max number of columns in INSERT and UPDATE statements – The maximum number of columns allowed in CQL INSERT or UPDATE statements. An INSERT or UPDATE statement supports up to 225 regular columns when Time to Live (TTL) is turned off. If TTL is turned on, up to 166 regular columns can be modified in a single operation. Default: 225/166.
Max static data per logical partition – The maximum aggregate size of static data in a logical partition. For details, see the section called “Calculate static column size per logical partition”. Default: 1 MB.
Max subqueries per IN SELECT statement – The maximum number of subqueries you can use for the IN keyword in a SELECT statement. You can adjust this default value in the Service Quotas console. Default: 100.
Max number of nested frozen collections per AWS Region – The maximum number of nested collections supported when you're using the FROZEN keyword for a column with a collection data type. For more information about frozen collections, see the section called “Collection types”. To increase the nesting level, contact AWS Support. Default: 5.
Max read throughput per second – The maximum read throughput per second, in read request units (RRUs) or read capacity units (RCUs), that can be allocated to a table per Region. You can adjust this default value in the Service Quotas console. Default: 40,000.
Max write throughput per second – The maximum write throughput per second, in write request units (WRUs) or write capacity units (WCUs), that can be allocated to a table per Region. You can adjust this default value in the Service Quotas console. Default: 40,000.
Account-level read throughput (provisioned) – The maximum number of aggregate read capacity units (RCUs) allocated for the account per Region. This is applicable only for tables in provisioned read/write capacity mode. You can adjust this default value in the Service Quotas console. Default: 80,000.
Account-level write throughput (provisioned) – The maximum number of aggregate write capacity units (WCUs) allocated for the account per Region. This is applicable only for tables in provisioned read/write capacity mode. You can adjust this default value in the Service Quotas console. Default: 80,000.
Max number of scalable targets per Region per account – The maximum number of scalable targets for the account per Region. An Amazon Keyspaces table counts as one scalable target if auto scaling is enabled for read capacity, and as another scalable target if auto scaling is enabled for write capacity. You can adjust this default value in the Service Quotas console for Application Auto Scaling by choosing Scalable targets for Amazon Keyspaces. Default: 1,500.
Max partition key size – The maximum size of the compound partition key. Up to 3 bytes of additional storage are added to the raw size of each column included in the partition key for metadata. Default: 2048 bytes.
Max clustering key size – The maximum combined size of all clustering columns. Up to 4 bytes of additional storage are added to the raw size of each clustering column for metadata. Default: 850 bytes.
Max concurrent table restores using point-in-time recovery (PITR) – The maximum number of concurrent table restores using PITR per subscriber. You can adjust this default value in the Service Quotas console. Default: 4.
Max amount of data restored using point-in-time recovery (PITR) – The maximum size of data that can be restored using PITR within 24 hours. You can adjust this default value in the Service Quotas console. Default: 5 TB.
Increasing or decreasing throughput (for provisioned tables)
Increasing provisioned throughput
You can increase ReadCapacityUnits or WriteCapacityUnits as often as necessary by using
the console or the ALTER TABLE statement. The new settings don't take effect until the ALTER
TABLE operation is complete.
You can increase the provisioned capacity for your tables as much as you need, as long as you
don't exceed your per-account quotas. For more information about per-account quotas, see the
preceding section, the section called “Amazon Keyspaces service quotas”.
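As a sketch, the following statement sets new provisioned throughput values for a table; the capacity numbers are illustrative. The same statement form applies when you decrease capacity, subject to the limits described next:
ALTER TABLE mykeyspace.mytable
WITH CUSTOM_PROPERTIES = {'capacity_mode': {'throughput_mode': 'PROVISIONED',
'read_capacity_units': 6000, 'write_capacity_units': 3000}};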
Decreasing provisioned throughput
In an ALTER TABLE statement, you can decrease ReadCapacityUnits or
WriteCapacityUnits (or both) for a table. The new settings don't take effect until the ALTER
TABLE operation is complete.
A decrease is allowed up to four times, at any time, per day. A day is defined according to Universal
Coordinated Time (UTC). Additionally, if there was no decrease in the past hour, an additional
decrease is allowed. This effectively brings the maximum number of decreases in a day to 27 (4
decreases in the first hour, and 1 decrease for each of the subsequent 1-hour windows in a day).
Amazon Keyspaces encryption at rest
You can change encryption options between an AWS owned AWS KMS key and a customer
managed AWS KMS key up to four times within a 24-hour window, on a per table basis, starting
from when the table was created. If there was no change in the past six hours, an additional change
is allowed. This effectively brings the maximum number of changes in a day to eight (four changes
in the first six hours, and one change for each of the subsequent six-hour windows in a day).
You can change the encryption option to use an AWS owned AWS KMS key as often as necessary,
even if the earlier quota has been exhausted.
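As a sketch, the following statements switch a table to a customer managed key and back to the AWS owned key; the key ARN is a placeholder that you would replace with your own:
ALTER TABLE mykeyspace.mytable
WITH CUSTOM_PROPERTIES = {'encryption_specification': {'encryption_type': 'CUSTOMER_MANAGED_KMS_KEY',
'kms_key_identifier': 'arn:aws:kms:us-east-1:111122223333:key/11111111-2222-3333-4444-555555555555'}};
ALTER TABLE mykeyspace.mytable
WITH CUSTOM_PROPERTIES = {'encryption_specification': {'encryption_type': 'AWS_OWNED_KMS_KEY'}};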
These are the quotas unless you request a higher amount. To request a service quota increase, see
AWS Support.
Document history for Amazon Keyspaces (for Apache Cassandra)
The following list describes the important changes to the documentation since the last release of
Amazon Keyspaces (for Apache Cassandra). For notification about updates to this documentation,
you can subscribe to an RSS feed.
Latest documentation update: September 17, 2024
ADD COLUMN support for Amazon Keyspaces Multi-Region Replication (September 17, 2024) – Amazon Keyspaces now supports schema changes for multi-Region tables.
Updated migration guidance for Amazon Keyspaces (June 28, 2024) – The updated migration guidance outlines how to create a migration plan to successfully migrate from Apache Cassandra to Amazon Keyspaces, including different strategies for offline and online migrations.
Connect to Amazon Keyspaces from Amazon Elastic Kubernetes Service (February 7, 2024) – You can now follow a step-by-step tutorial to connect to Amazon Keyspaces from Amazon EKS.
Amazon Keyspaces auto scaling APIs for provisioned tables (January 23, 2024) – Amazon Keyspaces now offers CQL and AWS API support for setting up auto scaling with provisioned capacity mode.
Amazon Keyspaces Multi-Region Replication support for provisioned tables (January 23, 2024) – Amazon Keyspaces now supports provisioned capacity mode for multi-Region tables.
Amazon Keyspaces DML activity included in CloudTrail logs (December 20, 2023) – You can now audit Amazon Keyspaces Data Manipulation Language (DML) API calls in AWS CloudTrail.
Amazon Keyspaces support for the FROZEN keyword (November 15, 2023) – Amazon Keyspaces now supports the FROZEN keyword for collection data types.
Amazon Keyspaces managed policy update (October 3, 2023) – Amazon Keyspaces added new permissions to the AmazonKeyspacesFullAccess managed policy to allow clients connecting to Amazon Keyspaces through interface VPC endpoints access to the Amazon EC2 instance to update the Amazon Keyspaces system.peers table with network information from the VPC.
Amazon Keyspaces managed policy update (September 12, 2023) – Amazon Keyspaces created a new AmazonKeyspacesReadOnlyAccess_v2 managed policy to allow clients connecting to Amazon Keyspaces through interface VPC endpoints access to the Amazon EC2 instance to update the Amazon Keyspaces system.peers table with network information from the VPC.
Best practices for creating connections in Amazon Keyspaces (June 30, 2023) – Learn how to improve and optimize client driver configurations in Amazon Keyspaces.
System keyspaces are now documented for Amazon Keyspaces (June 21, 2023) – Learn what is stored in system keyspaces and how to query them for useful information in Amazon Keyspaces.
Amazon Keyspaces now supports Multi-Region Replication (June 5, 2023) – Amazon Keyspaces Multi-Region Replication helps you to maintain globally distributed applications by providing you with improved fault tolerance, stability, and resilience.
Amazon Keyspaces managed policy update (June 5, 2023) – Amazon Keyspaces added new permissions to the AmazonKeyspacesFullAccess managed policy to allow Amazon Keyspaces to create a service-linked role when an administrator creates a multi-Region keyspace.
Amazon Keyspaces support for the IN keyword (April 25, 2023) – Amazon Keyspaces now supports the IN keyword in SELECT statements.
Cross-account access for Amazon Keyspaces and interface VPC endpoints (April 20, 2023) – Learn how to implement cross-account access for Amazon Keyspaces with VPC endpoints.
Amazon Keyspaces support for client-side timestamps (March 14, 2023) – Amazon Keyspaces client-side timestamps are Cassandra-compatible cell-level timestamps that help distributed applications determine the order of write operations when different clients make changes to the same data.
Getting started with Amazon Keyspaces and interface VPC endpoints (March 1, 2023) – In this step-by-step tutorial, learn how to connect to Amazon Keyspaces from a VPC.
Optimizing costs of Amazon Keyspaces tables (February 17, 2023) – Best practices and guidance are available to help you identify strategies for optimizing costs of your existing Amazon Keyspaces tables.
The Murmur3Partitioner is now the default (November 17, 2022) – The Murmur3Partitioner is now the default partitioner in Amazon Keyspaces.
Amazon Keyspaces now supports Murmur3Partitioner (November 9, 2022) – The Murmur3Partitioner is now available in Amazon Keyspaces.
Support update for empty strings and blob values (October 19, 2022) – Amazon Keyspaces now also supports empty strings and blob values as clustering column values.
Amazon Keyspaces is now available in AWS GovCloud (US) (August 4, 2022) – Amazon Keyspaces is now available in the AWS GovCloud (US) Region and is in scope for FedRAMP-High compliance. For information about available endpoints, see AWS GovCloud (US) Region FIPS endpoints.
Monitor Amazon Keyspaces table storage costs with Amazon CloudWatch (June 14, 2022) – Amazon Keyspaces now helps you monitor and track table storage costs over time with the BillableTableSizeInBytes CloudWatch metric.
Amazon Keyspaces now supports Terraform (June 9, 2022) – You can now use Terraform to perform data definition language (DDL) operations in Amazon Keyspaces.
Amazon Keyspaces token function support (April 19, 2022) – Amazon Keyspaces now helps you optimize application queries by using the token function.
Amazon Keyspaces integration with Apache Spark (April 19, 2022) – Amazon Keyspaces now helps you read and write data in Apache Spark more easily by using the open-source Spark Cassandra Connector.
Amazon Keyspaces API Reference (March 2, 2022) – Amazon Keyspaces supports control plane operations to manage keyspaces and tables using the AWS SDK and AWS CLI. The API reference guide describes the supported control plane operations in detail.
How to troubleshoot common configuration issues when using Amazon Keyspaces (November 22, 2021) – Learn more about how to resolve common configuration issues you may encounter when using Amazon Keyspaces.
Amazon Keyspaces support for Time to Live (TTL) (October 18, 2021) – Amazon Keyspaces Time to Live (TTL) helps you simplify your application logic and optimize the price of storage by expiring data from tables automatically.
Migrating data to Amazon Keyspaces using DSBulk (August 9, 2021) – Step-by-step tutorial for migrating data from Apache Cassandra to Amazon Keyspaces using the DataStax Bulk Loader (DSBulk).
Amazon Keyspaces support for VPC endpoint entries in the system.peers table (July 29, 2021) – Amazon Keyspaces allows you to populate the system.peers table with available interface VPC endpoint information to improve load balancing and increase read/write throughput.
Update to IAM managed policies to support customer managed AWS KMS keys (June 1, 2021) – IAM managed policies for Amazon Keyspaces now include permissions to list and view available customer managed AWS KMS keys stored in AWS KMS.
Amazon Keyspaces support for customer managed AWS KMS keys (June 1, 2021) – Amazon Keyspaces allows you to take control of customer managed AWS KMS keys stored in AWS KMS for encryption at rest.
Amazon Keyspaces support for JSON syntax (January 21, 2021) – Amazon Keyspaces helps you read and write JSON documents more easily by supporting JSON syntax for INSERT and SELECT operations.
Amazon Keyspaces support for static columns (November 9, 2020) – Amazon Keyspaces now helps you update and store common data between multiple rows efficiently by using static columns.
GA release of NoSQL Workbench support for Amazon Keyspaces (October 28, 2020) – NoSQL Workbench is a client-side application that helps you design and visualize nonrelational data models for Amazon Keyspaces more easily. NoSQL Workbench clients are available for Windows, macOS, and Linux.
Preview release of NoSQL Workbench support for Amazon Keyspaces (October 5, 2020) – NoSQL Workbench is a client-side application that helps you design and visualize nonrelational data models for Amazon Keyspaces more easily. NoSQL Workbench clients are available for Windows, macOS, and Linux.
New code examples for programmatic access to Amazon Keyspaces (July 17, 2020) – We continue to add code examples for programmatic access to Amazon Keyspaces. Samples are now available for Java, Python, Go, C#, and Perl Cassandra drivers that support Apache Cassandra version 3.11.2.
Amazon Keyspaces point-in-time recovery (PITR) (July 9, 2020) – Amazon Keyspaces now offers point-in-time recovery (PITR) to help protect your tables from accidental write or delete operations by providing you continuous backups of your table data.
Amazon Keyspaces general availability (April 23, 2020) – With Amazon Keyspaces, formerly known during preview as Amazon Managed Apache Cassandra Service (MCS), you can use the Cassandra Query Language (CQL) code, Apache 2.0–licensed Cassandra drivers, and developer tools that you already use today.
Amazon Keyspaces automatic scaling (April 23, 2020) – Amazon Keyspaces (for Apache Cassandra) integrates with Application Auto Scaling to help you provision throughput capacity efficiently for variable workloads in response to actual application traffic by adjusting throughput capacity automatically.
Interface virtual private cloud (VPC) endpoints for Amazon Keyspaces (April 16, 2020) – Amazon Keyspaces offers private communication between the service and your VPC so that network traffic doesn't leave the Amazon network.
Tag-based access policies (April 8, 2020) – You can now use resource tags in IAM policies to manage access to Amazon Keyspaces.
Counter data type (April 7, 2020) – Amazon Keyspaces now helps you coordinate increments and decrements to column values by using counters.
Tagging resources (March 31, 2020) – Amazon Keyspaces now enables you to label and categorize resources by using tags.
AWS CloudFormation support (March 25, 2020) – Amazon Keyspaces now helps you automate the creation and management of resources by using AWS CloudFormation.
Support for IAM roles and policies and SigV4 authentication (March 17, 2020) – Added information on how you can use AWS Identity and Access Management (IAM) to manage access permissions and implement security policies for Amazon Keyspaces, and how to use the authentication plugin for the DataStax Java Driver for Cassandra to programmatically access Amazon Keyspaces using IAM roles and federated identities.
Read/write capacity mode (February 20, 2020) – Amazon Keyspaces now supports two read/write throughput capacity modes. The read/write capacity mode controls how you're charged for read and write throughput and how table throughput capacity is managed.
Initial release (December 3, 2019) – This documentation covers the initial release of Amazon Keyspaces (for Apache Cassandra).