Amazon DynamoDB

From Bauman National Library
Revision as of 16:45, 10 November 2018 by egor zorin

Amazon DynamoDB
Developer(s) Amazon
Initial release 2012
Development status Active
Platform Cross-platform
Available in English
Type NoSQL
License Proprietary
Website official website

Amazon DynamoDB [1] is a fast, flexible, fully managed NoSQL database service. It suits applications that need stable latency of no more than a few milliseconds at any scale. DynamoDB supports both key-value and document data models. Its flexibility and reliability make it a good fit for web and mobile apps, games, various platforms, the Internet of Things, and other applications.

Amazon DynamoDB

Amazon DynamoDB is a fully managed NoSQL database service that provides predictable high performance with effective scalability. Amazon DynamoDB enables customers to offload the administrative burdens of operating and scaling distributed databases to AWS so that they don't have to worry about hardware provisioning, setup and configuration, throughput capacity planning, replication, software patching, or cluster scaling. [2] Alongside relational databases (MySQL, PostgreSQL, Oracle, Microsoft SQL Server), Amazon has offered DynamoDB as a cloud NoSQL database since 2012.

Common Amazon DynamoDB Features

DynamoDB addresses one of the main problems of scaling databases: managing the database software and provisioning the hardware it runs on.
DynamoDB supports GET/PUT operations using a user-defined primary key, which is the only required attribute of any table. The primary key is set when the table is created and serves as the unique identifier of every item. DynamoDB also supports queries on non-primary-key attributes through global secondary indexes and local secondary indexes. A table is created with the CreateTable API or in the console; after that, the PutItem or BatchWriteItem API can be used to insert items. Attribute values can be strings, numbers, binary data, or sets of values. The size of one item must not exceed 400 KB. You can retrieve inserted items with GetItem or BatchGetItem, or with Query if the table uses a composite primary key.
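
As a rough illustration of the calls named above, the sketch below only builds request parameters in the shape used by the CreateTable, PutItem, and GetItem operations; it does not contact AWS. The table name "Music", the attribute names, and the capacity values are made up for the example. With an SDK such as boto3 installed and configured, dictionaries like these could presumably be passed as keyword arguments to the corresponding client methods.

```python
# Parameter builders shaped like DynamoDB's low-level API requests.
# All names ("Music", "Artist", "SongTitle") are illustrative assumptions.

def create_table_params(table, partition_key, sort_key=None):
    """Build CreateTable parameters for a simple or composite primary key."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attribute_defs = [{"AttributeName": partition_key, "AttributeType": "S"}]
    if sort_key is not None:
        # A composite key adds a second (sort) attribute.
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_defs.append({"AttributeName": sort_key, "AttributeType": "S"})
    return {
        "TableName": table,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_defs,
        # Arbitrary example capacity values.
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }


def put_item_params(table, item):
    """Build PutItem parameters; `item` must include the primary key attributes."""
    return {"TableName": table, "Item": item}


def get_item_params(table, key):
    """Build GetItem parameters; `key` holds only the primary key attributes."""
    return {"TableName": table, "Key": key}


song_key = {"Artist": {"S": "No One You Know"}, "SongTitle": {"S": "Call Me Today"}}
table_request = create_table_params("Music", "Artist", "SongTitle")
put_request = put_item_params("Music", {**song_key, "Year": {"N": "2018"}})
get_request = get_item_params("Music", song_key)
```
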
When reading data from DynamoDB, users can specify whether a read should be eventually consistent or strongly consistent:
Eventually consistent reads (the default): maximize read throughput, but might not reflect the results of a recently completed write. All copies of the data usually reach consistency within a second, so repeating a read after a short time should return the updated data.
Strongly consistent reads: return a result that reflects all writes that received a successful response before the read.
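
The difference between the two read modes can be pictured with a toy model. The class below is purely hypothetical and says nothing about how DynamoDB is implemented internally: it keeps one up-to-date copy and one replica that lags until `sync()` is called. In the real API, the choice is made with the `ConsistentRead` parameter of read operations such as GetItem and Query.

```python
class ToyReplicatedStore:
    """Toy model: a leader copy plus a replica that lags until sync() runs."""

    def __init__(self):
        self.leader = {}
        self.replica = {}

    def put(self, key, value):
        # The write is acknowledged as soon as the leader has it.
        self.leader[key] = value

    def sync(self):
        # Replication catches up (in the toy model, all at once).
        self.replica = dict(self.leader)

    def get(self, key, consistent_read=False):
        # Strongly consistent reads go to the leader; eventually
        # consistent reads may be served from the lagging replica.
        source = self.leader if consistent_read else self.replica
        return source.get(key)


store = ToyReplicatedStore()
store.put("user#1", "Alice")
stale = store.get("user#1")                        # replica has not caught up
fresh = store.get("user#1", consistent_read=True)  # reflects the acknowledged write
store.sync()
later = store.get("user#1")                        # replica is now up to date
```
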

DynamoDB Core Components

DynamoDB uses the following core components:

  • Table: Similar to other database systems, DynamoDB stores data in tables. A table is a collection of data. Each table must have a primary key. The primary key can be simple (a single attribute) or composite (two attributes). The attributes that make up the primary key must be present in every item, and together they uniquely identify each item in the table.
  • Item: Each table contains zero or more items. An item is a group of attributes that is uniquely identifiable among all of the other items. An item consists of a simple or composite primary key and a variable number of attributes. There is no explicit limit on the number of attributes in an item, but the aggregate size of the item, including all attribute names and values, must not exceed 400 KB.
  • Attribute: Each item is composed of one or more attributes. An attribute is a fundamental data element, something that does not need to be broken down any further. Each attribute consists of an attribute name and a value or set of values. Individual attributes have no explicit size limit, but the total size of the item (including all attribute names and values) must not exceed 400 KB.
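
The 400 KB rule above can be approximated with a small check. This is a simplified sketch, not DynamoDB's exact accounting: it counts only the UTF-8 length of attribute names plus string or binary values, and the function names are hypothetical. Numbers, sets, and nested types are sized differently by the service and are rejected here.

```python
MAX_ITEM_BYTES = 400 * 1024  # item size limit stated in the text


def approx_item_size(item):
    """Approximate item size: attribute names plus string/bytes values."""
    total = 0
    for name, value in item.items():
        total += len(name.encode("utf-8"))
        if isinstance(value, str):
            total += len(value.encode("utf-8"))
        elif isinstance(value, bytes):
            total += len(value)
        else:
            # Other types need service-specific rules not modeled here.
            raise TypeError(f"unsupported value type for {name!r}")
    return total


def fits_in_item(item):
    """Return True if the approximate size is within the 400 KB limit."""
    return approx_item_size(item) <= MAX_ITEM_BYTES


small = {"id": "abc"}
too_big = {"blob": "x" * MAX_ITEM_BYTES}  # the name pushes it over the limit
```
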

Amazon DynamoDB Accelerator (DAX)

Amazon DynamoDB Accelerator (DAX) delivers fast read performance for your DynamoDB tables at scale by enabling you to use a fully managed, highly available, in-memory cache. Using DAX, you can improve the read performance of your DynamoDB tables by up to 10x, taking the time required for reads from milliseconds to microseconds, even at millions of requests per second. In essence, DAX is an in-memory caching service. It adds in-memory acceleration to DynamoDB tables without requiring developers to manage cache invalidation, data population, or cluster management. Customers do not need to modify application logic, since DAX is compatible with existing DynamoDB API calls, and they can enable DAX with just a few clicks in the AWS Management Console or by using the AWS SDK.
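
To make the caching idea concrete, here is a minimal read-through cache in plain Python. It is only an illustration of the general pattern, not DAX's implementation; the class name, the `load` callback, and the expiry time are all assumptions made for the example.

```python
import time


class ReadThroughCache:
    """Serve reads from memory; fall back to `load` on a miss or expiry."""

    def __init__(self, load, ttl_seconds=300.0, clock=time.monotonic):
        self._load = load          # function that reads from the backing table
        self._ttl = ttl_seconds    # how long a cached entry stays valid
        self._clock = clock
        self._entries = {}         # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]        # cache hit: no trip to the table
        value = self._load(key)    # cache miss: read through to the table
        self._entries[key] = (value, now + self._ttl)
        return value


table_reads = []

def slow_table_read(key):
    table_reads.append(key)        # record every trip to the "table"
    return key.upper()

fake_time = [0.0]
cache = ReadThroughCache(slow_table_read, ttl_seconds=10, clock=lambda: fake_time[0])
first = cache.get("a")
second = cache.get("a")            # served from memory
```

The application code calls `cache.get` exactly as it would call the table read, which mirrors the point above that the calling logic does not need to change.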

Key-value and document data model support

DynamoDB supports key-value and document data structures that are designed to scale easily with a flexible schema.

  • Key-value data structure

Data is stored in tables that have a primary key and a set of attributes. Items can be looked up, inserted, and deleted by primary key. DynamoDB also offers conditional operations (for example, update only if a condition is fulfilled), atomic modifications (for example, increasing an attribute value by one), and searching on non-key attributes using a full table scan.
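
The conditional and atomic operations mentioned above are expressed in the UpdateItem request through update and condition expressions. The sketch below only builds such request parameters and sends nothing; the helper names, the table "Pages", and the attribute names are hypothetical.

```python
def increment_params(table, key, attribute, amount=1):
    """UpdateItem parameters that atomically add `amount` to a numeric attribute."""
    return {
        "TableName": table,
        "Key": key,
        "UpdateExpression": "ADD #attr :amount",
        "ExpressionAttributeNames": {"#attr": attribute},
        "ExpressionAttributeValues": {":amount": {"N": str(amount)}},
    }


def conditional_set_params(table, key, attribute, new_value, expected_value):
    """UpdateItem parameters that set a string attribute only if it still
    holds `expected_value` (an optimistic, compare-and-set style update)."""
    return {
        "TableName": table,
        "Key": key,
        "UpdateExpression": "SET #attr = :new",
        "ConditionExpression": "#attr = :expected",
        "ExpressionAttributeNames": {"#attr": attribute},
        "ExpressionAttributeValues": {
            ":new": {"S": new_value},
            ":expected": {"S": expected_value},
        },
    }


page_key = {"PageId": {"S": "home"}}
bump_views = increment_params("Pages", page_key, "Views")
publish = conditional_set_params("Pages", page_key, "Status", "published", "draft")
```
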

  • Document data structure

DynamoDB supports storing, updating, and querying documents. It has built-in support for JSON, so JSON documents can be written directly to DynamoDB tables. The maximum item size is 400 KB, so DynamoDB lets you save large JSON documents and embedded objects as a single item. Without a secondary index, however, DynamoDB has no efficient way to query data on non-key attributes. The desired level of consistency can be specified when reading data (eventually consistent or strongly consistent reads).
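
One way to picture how a JSON document maps onto an item is the conversion below, which turns parsed JSON into the typed attribute-value notation of the low-level API (S, N, BOOL, NULL, L, M). It is a hand-written sketch with hypothetical function names; SDKs ship their own serializers, which should be preferred in practice.

```python
import json


def to_attribute_value(value):
    """Convert one parsed JSON value to DynamoDB attribute-value notation."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):          # check bool before int: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}         # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported JSON value: {value!r}")


def json_to_item(document):
    """Parse a JSON object and convert its top-level fields to attributes."""
    parsed = json.loads(document)
    if not isinstance(parsed, dict):
        raise ValueError("an item must be built from a JSON object")
    return {name: to_attribute_value(v) for name, v in parsed.items()}


item = json_to_item(
    '{"id": "u1", "age": 30, "active": true, "tags": ["a", "b"],'
    ' "address": {"city": "Moscow"}, "nick": null}'
)
```
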

Global Tables

Global tables build on the global DynamoDB footprint to provide a fully managed, multi-region, multi-master database that delivers fast, local read and write performance for massively scaled, global applications. Global tables replicate your DynamoDB tables automatically across your choice of AWS Regions, which relieves customers of replicating data between regions and resolving update conflicts themselves. In addition, global tables allow applications to remain highly available even in the unlikely event that an entire region becomes isolated or degraded.

Auto Scaling

DynamoDB delivers seamless, automatic scaling of throughput and storage via APIs and the AWS Management Console, with practically no limit on how much throughput or storage you can dial up. Before using Auto Scaling, customers must define a scaling policy for a table or a global secondary index. The policy specifies whether to scale read capacity, write capacity, or both, and the minimum and maximum provisioned capacity unit settings for the table or index. Once the scaling policy is set, Auto Scaling creates a pair of Amazon CloudWatch alarms on your behalf. Each pair represents the upper and lower boundaries for provisioned throughput settings. These CloudWatch alarms are triggered when the table's actual utilization deviates from your target utilization for a sustained period of time.
When one of the CloudWatch alarms is triggered, Amazon SNS sends you a notification (if you have enabled it). The CloudWatch alarm then invokes Application Auto Scaling, which in turn notifies DynamoDB to adjust the table's provisioned capacity upward or downward, as appropriate. This is how Auto Scaling works:

Figure: Auto Scaling architecture
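
The arithmetic behind such a policy can be sketched as follows. This is a deliberately simplified, hypothetical model: it ignores the CloudWatch alarms and the "sustained period" condition described above, and only shows how a target utilization and the minimum and maximum settings could bound the provisioned capacity. The function name and all numbers are assumptions.

```python
def desired_capacity(consumed_units, target_percent, min_units, max_units):
    """Capacity that would put `consumed_units` at `target_percent` utilization,
    clamped to the policy's minimum and maximum."""
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in (0, 100]")
    # Integer ceiling of consumed / (target_percent / 100).
    wanted = -(-consumed_units * 100 // target_percent)
    return max(min_units, min(max_units, wanted))


# Policy: keep utilization near 70 %, between 5 and 1000 capacity units.
busy = desired_capacity(70, 70, 5, 1000)     # scale so that 70 units is 70 % of capacity
quiet = desired_capacity(1, 70, 5, 1000)     # never below the minimum
spike = desired_capacity(5000, 50, 5, 1000)  # never above the maximum
```
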

On-Demand Backup and Restore

On-demand backup and restore lets you create full backups of your DynamoDB tables for long-term retention and archiving, which can help you meet corporate and governmental regulatory requirements. You can back up tables ranging from a few megabytes to hundreds of terabytes of data with no impact on the performance or availability of your production applications. You can back up or restore a DynamoDB table at any time with a single click in the AWS Management Console or a single API call. All backups are cataloged, easily discoverable, and retained until explicitly deleted.

Time To Live

Time To Live (TTL) lets you set a per-item timestamp after which the item is considered expired. Once the timestamp has passed, the item is marked as expired and is deleted from the table. With TTL enabled, you can set a deletion timestamp on each item, which limits storage to relevant entries and reduces the cost of keeping data that is no longer needed.
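
A small sketch of how an application might stamp items for TTL is shown below. The TTL attribute holds a Number with the expiry time in Unix epoch seconds; the attribute name "expires_at" and both helper functions are hypothetical. The `is_expired` check only mirrors the rule locally, since the actual deletion is performed by the service in the background.

```python
import time


def with_ttl(item, lifetime_seconds, now=None, attribute="expires_at"):
    """Return a copy of `item` with an epoch-seconds expiry attribute added."""
    now = int(time.time()) if now is None else int(now)
    stamped = dict(item)
    stamped[attribute] = {"N": str(now + lifetime_seconds)}
    return stamped


def is_expired(item, now=None, attribute="expires_at"):
    """True if the item's expiry timestamp lies in the past."""
    now = int(time.time()) if now is None else int(now)
    value = item.get(attribute)
    if value is None or "N" not in value:
        return False  # items without a numeric TTL attribute never expire
    return int(value["N"]) < now


session = with_ttl({"SessionId": {"S": "s1"}}, lifetime_seconds=60, now=1000)
```
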

Secondary Indexes

A secondary index is a data structure that contains a subset of attributes from a table and an alternate key to support Query operations. You can retrieve data from the index using a Query, in much the same way as you use Query with a table. A table can have multiple secondary indexes, which gives your applications access to many different query patterns, but each index belongs to exactly one table.
DynamoDB provides secondary indexes that give you the flexibility to efficiently query on any attribute (column). You can create and delete global secondary indexes at any time; local secondary indexes must be defined when the table is created.
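
The idea of "a subset of attributes plus an alternate key" can be modeled in a few lines of plain Python. This is an in-memory toy, not how DynamoDB stores indexes; the function name and the sample order data are made up. Items that lack the index key are simply left out, so the index ends up sparse.

```python
from collections import defaultdict


def build_index(items, key_attribute, projected):
    """Group items by an alternate key, keeping only the projected attributes."""
    index = defaultdict(list)
    for item in items:
        if key_attribute not in item:
            continue  # items without the index key do not appear in the index
        entry = {name: item[name] for name in projected if name in item}
        index[item[key_attribute]].append(entry)
    return dict(index)


orders = [
    {"order_id": "o1", "customer": "ann", "total": 10, "note": "gift"},
    {"order_id": "o2", "customer": "bob", "total": 5},
    {"order_id": "o3", "customer": "ann", "total": 7},
    {"order_id": "o4", "total": 1},  # no customer: absent from the index
]
by_customer = build_index(orders, "customer", ["order_id", "total"])
anns_orders = by_customer["ann"]  # lookup by the alternate key, no full scan
```
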

DynamoDB Streams and Triggers

Amazon DynamoDB Streams captures a time-ordered sequence of item-level modifications in any DynamoDB table and stores this information in a log for up to 24 hours, so you can track the latest item-level change or retrieve all item-level updates. Applications can access this log and view the data items as they appeared before and after they were modified, in near real time. This data can be used to build applications for replication, materialized views, backups, and integration with other services. DynamoDB integrates with AWS Lambda to provide triggers: pieces of code that run automatically in response to events in DynamoDB Streams. With triggers, you can build applications that react to data modifications in DynamoDB tables.
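
A trigger of this kind might look like the sketch below: a Lambda-style handler that walks the stream records it receives and extracts the before and after images. The helper name and the sample event are assumptions made for illustration; whether `OldImage` and `NewImage` are present depends on how the stream is configured, so the code treats both as optional.

```python
def summarize_stream_event(event):
    """Extract event type, keys, and before/after images from stream records."""
    changes = []
    for record in event.get("Records", []):
        change = record.get("dynamodb", {})
        changes.append({
            "event": record.get("eventName"),   # INSERT, MODIFY, or REMOVE
            "keys": change.get("Keys"),
            "before": change.get("OldImage"),   # may be absent
            "after": change.get("NewImage"),    # may be absent
        })
    return changes


def handler(event, context):
    """Lambda-style entry point: react to each change, report how many were seen."""
    changes = summarize_stream_event(event)
    return {"processed": len(changes)}


sample_event = {
    "Records": [
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "Keys": {"Id": {"S": "42"}},
                "OldImage": {"Id": {"S": "42"}, "Status": {"S": "draft"}},
                "NewImage": {"Id": {"S": "42"}, "Status": {"S": "published"}},
            },
        },
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"Id": {"S": "7"}}}},
    ]
}
result = handler(sample_event, None)
```
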

Web Application Architecture with DynamoDB

An example of using DynamoDB in a simple web application is shown below [3].

Figure: Simple web app architecture using DynamoDB

DynamoDB vs. RDBMS

The main difference between DynamoDB and an RDBMS is the pricing model: you pay primarily for the activity of accessing data rather than for the amount of data. The scaling model is also different: as the number of requests and the volume of data grow, DynamoDB scales horizontally on its own. Storage, once based on hard drives, now uses only high-speed solid-state drives. The point of this approach is predictable performance and response time (well under 10 ms) for customer requests, which is important for large-scale real-time online systems. There is no formal limit on the amount of data in the database; new space is added automatically as the stored information grows. DynamoDB operates in zero-administration mode, and data is replicated across three availability zones.

DynamoDB tables follow a key-value model, which differs qualitatively from relational tables because it does not impose a rigid schema. An important characteristic of DynamoDB is that it can guarantee delivery of the latest values from a table, which NoSQL services usually do not offer.

Unlike many other NoSQL systems, DynamoDB is proprietary. Because NoSQL products share no common interface, users have to learn its architecture and interfaces from scratch, although this is not too difficult. Still, this barrier to entry, set against the many open and free NoSQL products, pushed Amazon to release a local version of DynamoDB in September 2013 that can be installed on a user's PC.

DynamoDB is a simple-to-use database for online systems that can scale to thousands of requests per second while keeping response times short. It cannot match mainstream RDBMSs in functionality and cannot run complex queries, although it supports a number of atomic operations that, for example, change the value of a particular numeric field in a record. Above all, DynamoDB lets you quickly deploy an online system that needs only a fairly simple data organization [4].

Advantages

  • Fast and stable performance

Amazon DynamoDB delivers fast, stable performance for any type of application. An average request is processed in milliseconds. As data volume and performance demands grow, Amazon DynamoDB meets throughput and latency requirements through automatic partitioning.

  • High Scalability

When you create a table, you only need to specify the request capacity you require. To change the throughput settings, update the table using the AWS Management Console or the Amazon DynamoDB API. Amazon DynamoDB performs all scaling in the background and continues to meet the previously specified throughput requirements while the change is being applied.

  • Flexibility

Amazon DynamoDB supports both document and key-value data structures, so you can choose the architecture that best fits your system.

  • Fully Managed

You no longer need to worry about database management, hardware and software setup and configuration, software updates, operating a reliable distributed database cluster, or partitioning data across several nodes as the scale of the system changes.

Disadvantages

  • It is not open source.
  • Keys cannot be changed or added on the fly; doing so requires creating a new table.
  • Query capabilities are very limited.
  • Registration

Registration takes quite a long time because you have to provide a large amount of information about yourself. You also have to pay $1 for the trial version of DynamoDB.

DynamoDB in Production

  • Support for the key-value data model
  • Scaling
  • Data availability even when a data center fails
  • Local application development, followed by scaling up to the cloud
  • Queries on any attribute using local and global secondary indexes
  • Streams: DynamoDB can provide details of all changes made to a table in the last 24 hours
  • Cross-region replication
  • Triggers that automatically run user-defined functions when a change is detected in a table
  • Search by content
  • Integration with the Titan graph database
  • Flexible data (items in a table need not share the same attributes, and a variety of data types is supported)
  • Strong consistency and atomic counters
  • Built-in monitoring of request performance and latency for tables
  • Security
  • Integration with Elastic MapReduce
  • Integration with Redshift
  • Integration with AWS Data Pipeline
  • Console and API

Getting started with DynamoDB

References

  1. Amazon DynamoDB | https://aws.amazon.com/dynamodb/?nc1=h_ls
  2. Amazon DynamoDB Documentation | https://aws.amazon.com/ru/documentation/dynamodb/
  3. Architecture DynamoDB | https://www.slideshare.net/AmazonWebServices/amazon-elastic-compute-cloud-ec2-module-1-part-2-awsome-day-2017?next_slideshow=1:
  4. Amazon switches on DynamoDB Cloud DB service | https://www.zdnet.com/article/amazon-switches-on-dynamodb-cloud-database-service/