Is there an efficient way to delete all the items from an Amazon DynamoDB table at once? I have gone through the AWS docs, but they only show how to delete a single item.
You will want to use BatchWriteItem if you can't drop the table. If all your entries are within a single HashKey, you can use the Query API to retrieve the records, and then delete them 25 items at a time. If not, you'll probably have to Scan.
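As a rough illustration of the "25 items at a time" step, here is a minimal Python sketch that groups keys into BatchWriteItem payloads. The function name `build_delete_batches` and the constant `MAX_BATCH_SIZE` are hypothetical names chosen for this example; the keys are assumed to have already been collected by a Query or Scan and to be in the low-level attribute-value format.

```python
from typing import Any, Dict, Iterable, Iterator, List

MAX_BATCH_SIZE = 25  # BatchWriteItem accepts at most 25 put/delete requests per call


def build_delete_batches(
    table_name: str, keys: Iterable[Dict[str, Any]], batch_size: int = MAX_BATCH_SIZE
) -> Iterator[Dict[str, List[Dict[str, Any]]]]:
    """Yield RequestItems payloads, each holding at most batch_size DeleteRequests."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError("batch_size must be between 1 and 25")
    batch: List[Dict[str, Any]] = []
    for key in keys:
        batch.append({"DeleteRequest": {"Key": key}})
        if len(batch) == batch_size:
            yield {table_name: batch}
            batch = []
    if batch:  # flush the final, partially filled batch
        yield {table_name: batch}
```

Each yielded payload could then be passed as `RequestItems` to the boto3 client's `batch_write_item` call, for example `client.batch_write_item(RequestItems=payload)`.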
Alternatively, you could provide a simple wrapper around AmazonDynamoDBClient (from the official SDK) that collects a Set of Hash/Range keys that exist in your table. Then you wouldn't need to Query or Scan for the items you inserted after the test, since you would already have the Set built. That would look something like this:
public class KeyCollectingAmazonDynamoDB implements AmazonDynamoDB
{
private final AmazonDynamoDB delegate;
// Key comes from the DynamoDB SDK; substitute your own key class if needed
private final Set<Key> contents;
public KeyCollectingAmazonDynamoDB( AmazonDynamoDB delegate )
{
this.delegate = delegate;
this.contents = new HashSet<>();
}
@Override
public PutItemResult putItem( PutItemRequest putItemRequest )
throws AmazonServiceException, AmazonClientException
{
contents.add( extractKey( putItemRequest.getItem() ) );
return delegate.putItem( putItemRequest );
}
private Key extractKey( Map<String, AttributeValue> item )
{
// TODO Define your hash/range key extraction here
// Create a Key object
return new Key( hashKey, rangeKey );
}
@Override
public DeleteItemResult deleteItem( DeleteItemRequest deleteItemRequest )
throws AmazonServiceException, AmazonClientException
{
contents.remove( deleteItemRequest.getKey() );
return delegate.deleteItem( deleteItemRequest );
}
@Override
public BatchWriteItemResult batchWriteItem( BatchWriteItemRequest batchWriteItemRequest )
throws AmazonServiceException, AmazonClientException
{
// Similar extraction, but in bulk.
for ( Map.Entry<String, List<WriteRequest>> entry : batchWriteItemRequest.getRequestItems().entrySet() )
{
String tableName = entry.getKey();
List<WriteRequest> writeRequests = entry.getValue();
for ( WriteRequest writeRequest : writeRequests )
{
PutRequest putRequest = writeRequest.getPutRequest();
if ( putRequest != null )
{
// Add to Set just like putItem
}
DeleteRequest deleteRequest = writeRequest.getDeleteRequest();
if ( deleteRequest != null )
{
// Remove from Set just like deleteItem
}
}
}
// Write through to DynamoDB
return delegate.batchWriteItem( batchWriteItemRequest );
}
// remaining methods elided, since they're direct delegation
}
Key is a class within the DynamoDB SDK that accepts zero, one, or two AttributeValue objects in the constructor to represent a hash key or a hash/range key. Assuming its equals and hashCode methods work, you can use it within the Set I described. If they don't, you'll have to write your own Key class.
This should get you a maintained Set for use within your tests. It's not specific to a table, so you might need to add another layer of collection if you're using multiple tables. That would change Set<Key> to something like Map<TableName, Set<Key>>. You would need to look at the getTableName() property to pick the correct Set to update.
Once your test finishes, grabbing the contents of the table and deleting should be straightforward.
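To make the per-table bookkeeping more concrete, here is a small Python sketch of the same idea, independent of any SDK. The class `KeyTracker` and its method names are hypothetical and exist only for this illustration; keys are assumed to be representable as a hashable (hash key, range key) tuple.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional, Set, Tuple

# A key is stored as a hashable tuple: (hash_key, range_key or None).
Key = Tuple[Hashable, Optional[Hashable]]


class KeyTracker:
    """Remembers which keys a test wrote, grouped by table name."""

    def __init__(self) -> None:
        self._keys: Dict[str, Set[Key]] = defaultdict(set)

    def record_put(self, table_name: str, hash_key: Hashable,
                   range_key: Optional[Hashable] = None) -> None:
        self._keys[table_name].add((hash_key, range_key))

    def record_delete(self, table_name: str, hash_key: Hashable,
                      range_key: Optional[Hashable] = None) -> None:
        # discard() does not raise if the key was never recorded
        self._keys[table_name].discard((hash_key, range_key))

    def keys_for(self, table_name: str) -> Set[Key]:
        """Return a copy of the keys still believed to exist in table_name."""
        return set(self._keys.get(table_name, set()))
```

At the end of a test, `keys_for(table_name)` gives the set of keys to delete, so no Query or Scan is needed.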
One final suggestion: use a different table for testing than you do for your application. Create an identical schema, but give the table a different name. You probably even want a different IAM user to prevent your test code from accessing your production table. If you have questions about that, feel free to open a separate question for that scenario.
Do the following steps:
- Make delete table request
- In the response you will get the TableDescription
- Using TableDescription create the table again.
That's what I do in my application.
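A minimal Python sketch of the last step, assuming a table with provisioned throughput and no secondary indexes or streams. The helper name `create_table_params` is hypothetical; it only copies the fields that a CreateTable request needs out of a TableDescription dict.

```python
from typing import Any, Dict


def create_table_params(description: Dict[str, Any]) -> Dict[str, Any]:
    """Build CreateTable arguments from a TableDescription dict.

    Only the basic fields are copied; indexes, streams, and billing mode
    are deliberately ignored in this sketch.
    """
    throughput = description["ProvisionedThroughput"]
    return {
        "TableName": description["TableName"],
        "AttributeDefinitions": description["AttributeDefinitions"],
        "KeySchema": description["KeySchema"],
        # TableDescription carries extra read-only fields here (for example
        # NumberOfDecreasesToday) that CreateTable does not accept.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": throughput["ReadCapacityUnits"],
            "WriteCapacityUnits": throughput["WriteCapacityUnits"],
        },
    }
```

With boto3, the description could come from `client.delete_table(TableName=name)["TableDescription"]`. Table deletion is asynchronous, so one would wait for it to finish (for example with `client.get_waiter("table_not_exists").wait(TableName=name)`) before calling `client.create_table(**params)`.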
DynamoDBMapper will do the job in a few lines:
AWSCredentials credentials = new PropertiesCredentials(credentialFile);
client = new AmazonDynamoDBClient(credentials);
DynamoDBMapper mapper = new DynamoDBMapper(this.client);
DynamoDBScanExpression scanExpression = new DynamoDBScanExpression();
PaginatedScanList<LogData> result = mapper.scan(LogData.class, scanExpression);
for (LogData data : result) {
mapper.delete(data);
}
scan is quite expensive; however, using query is not as clean API calls-wise. – Helms
As ihtsham says, the most efficient way is to delete and re-create the table. However, if that is not practical (e.g. due to complex configuration of the table, such as Lambda triggers), here are some AWS CLI commands to delete all records. They require the jq program for JSON processing.
Deleting records one-by-one (slow!), assuming your table is called my_table, your partition key is called partition_key, and your sort key (if any) is called sort_key:
aws dynamodb scan --table-name my_table | \
jq -c '.Items[] | { partition_key, sort_key }' | \
tr '\n' '\0' | \
xargs -0 -n1 -t aws dynamodb delete-item --table-name my_table --key
Deleting records in batches of up to 25 records:
aws dynamodb scan --table-name my_table | \
jq -c '[.Items | keys[] as $i | { index: $i, value: .[$i]}] | group_by(.index / 25 | floor)[] | { "my_table": [.[].value | { "DeleteRequest": { "Key": { partition_key, sort_key }}}] }' | \
tr '\n' '\0' | \
xargs -0 -n1 -t aws dynamodb batch-write-item --request-items
If you start seeing non-empty UnprocessedItems responses, your write capacity has been exceeded. You can account for this by reducing the batch size. For me, each batch takes about a second to submit, so with a write capacity of 5 per second, I set the batch size to 5.
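Another way to deal with UnprocessedItems, instead of shrinking the batch, is to resubmit whatever came back unprocessed after a short pause. The following Python sketch shows that idea; the function name `batch_write_with_retry` and its parameters are hypothetical, and the `batch_write` argument is assumed to behave like boto3's `client.batch_write_item` (keyword argument `RequestItems`, response containing `UnprocessedItems`).

```python
import time
from typing import Any, Callable, Dict


def batch_write_with_retry(
    batch_write: Callable[..., Dict[str, Any]],
    request_items: Dict[str, Any],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call batch_write repeatedly until UnprocessedItems is empty.

    Returns whatever is still unprocessed after max_attempts (empty on success).
    """
    pending = request_items
    for attempt in range(max_attempts):
        response = batch_write(RequestItems=pending)
        pending = response.get("UnprocessedItems") or {}
        if not pending:
            return {}
        sleep(base_delay * (2 ** attempt))  # exponential backoff before resubmitting
    return pending
```

With boto3 this might be called as `batch_write_with_retry(boto3.client("dynamodb").batch_write_item, payload)`, where `payload` has the same shape as the `--request-items` JSON above.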
Just for the record, a quick solution with item-by-item delete in Python 3 (using Boto3 and scan()): (Credentials need to be set.)
def delete_all_items(table_name):
    # Deletes all items from a DynamoDB table.
    # You need to confirm your intention by pressing Enter.
    import boto3
    client = boto3.client('dynamodb')
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table(table_name)
    # Look up the key attribute names so each delete sends only the key
    response = client.describe_table(TableName=table_name)
    keys = [k['AttributeName'] for k in response['Table']['KeySchema']]
    response = table.scan()
    items = response['Items']
    # A single scan() call returns at most 1 MB of data; follow
    # LastEvaluatedKey to collect the remaining pages
    while 'LastEvaluatedKey' in response:
        response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
        items.extend(response['Items'])
    number_of_items = len(items)
    if number_of_items == 0:  # no items to delete
        print("Table '{}' is empty.".format(table_name))
        return
    print("You are about to delete all ({}) items from table '{}'."
          .format(number_of_items, table_name))
    input("Press Enter to continue...")
    # batch_writer() buffers the deletes and sends them in batches
    with table.batch_writer() as batch:
        for item in items:
            key_dict = {k: item[k] for k in keys}
            print("Deleting " + str(item) + "...")
            batch.delete_item(Key=key_dict)

delete_all_items("test_table")
Obviously, this shouldn't be used for tables with a lot of items (100+). For that, the delete / recreate approach is cheaper and more efficient.
You can recreate a DynamoDB table using the AWS Java SDK:
// Init DynamoDB client
AmazonDynamoDB dynamoDB = AmazonDynamoDBClientBuilder.standard().build();
// Get table definition
TableDescription tableDescription = dynamoDB.describeTable("my-table").getTable();
// Delete table
dynamoDB.deleteTable("my-table");
// Create table (deleteTable is asynchronous, so wait until the table no longer exists before recreating it)
CreateTableRequest createTableRequest = new CreateTableRequest()
.withTableName(tableDescription.getTableName())
.withAttributeDefinitions(tableDescription.getAttributeDefinitions())
.withProvisionedThroughput(new ProvisionedThroughput()
.withReadCapacityUnits(tableDescription.getProvisionedThroughput().getReadCapacityUnits())
.withWriteCapacityUnits(tableDescription.getProvisionedThroughput().getWriteCapacityUnits())
)
.withKeySchema(tableDescription.getKeySchema());
dynamoDB.createTable(createTableRequest);
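I use the following JavaScript code to do it:
// db is a low-level AWS.DynamoDB client and ddb an AWS.DynamoDB.DocumentClient (AWS SDK for JavaScript v2)
async function truncate(table, keys) {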
const limit = (await db.describeTable({
TableName: table
}).promise()).Table.ProvisionedThroughput.ReadCapacityUnits;
let total = 0;
let lastEvaluatedKey = null;
do {
const qp = {
TableName: table,
Limit: limit,
ExclusiveStartKey: lastEvaluatedKey,
ProjectionExpression: keys.join(', '), // attribute names must be comma-separated
};
const qr = await ddb.scan(qp).promise();
lastEvaluatedKey = qr.LastEvaluatedKey;
const dp = {
RequestItems: {
},
};
dp.RequestItems[table] = [];
if (qr.Items) {
for (const i of qr.Items) {
const dr = {
DeleteRequest: {
Key: {
}
}
};
keys.forEach(k => {
dr.DeleteRequest.Key[k] = i[k];
});
dp.RequestItems[table].push(dr);
if (dp.RequestItems[table].length % 25 == 0) {
await ddb.batchWrite(dp).promise();
total += dp.RequestItems[table].length;
dp.RequestItems[table] = [];
}
}
if (dp.RequestItems[table].length > 0) {
await ddb.batchWrite(dp).promise();
total += dp.RequestItems[table].length;
dp.RequestItems[table] = [];
}
}
console.log(`Deleted ${total}`);
await new Promise(resolve => setTimeout(resolve, 1000)); // actually pause 1 s between pages
} while (lastEvaluatedKey);
}
(async () => {
await truncate('table_name', ['id']);
})();
In this case, you may delete the table and create a new one.
Example:
from __future__ import print_function # Python 2/3 compatibility
import boto3
dynamodb = boto3.resource('dynamodb', region_name='us-west-2', endpoint_url="http://localhost:8000")
table = dynamodb.Table('Movies')
table.delete()