What is NoSQL?
What are NoSQL databases?
NoSQL databases, otherwise known as purpose-built databases, are designed for specific data models and store data in flexible schemas that scale easily for modern applications. Many database workloads can benefit from the cost-effectiveness and performance of NoSQL databases. As an example, Amazon DynamoDB is serverless, so resource utilization is automatically optimized and you never pay for over-provisioning. Moreover, NoSQL databases are widely recognized for their ease of development, functionality, and performance at scale. This page includes resources to help you better understand NoSQL databases and get started.
Get started with: DynamoDB, ElastiCache, DocumentDB, Keyspaces, MemoryDB, Neptune, Timestream, QLDB
What are the advantages of NoSQL databases?
Modern applications face several challenges that NoSQL databases can solve. For instance, applications process large volumes of data from disparate sources like social media, smart sensors, and third-party databases. This disparate data doesn't fit neatly into the relational model, and enforcing tabular structures on it can lead to redundancy, data duplication, and performance issues at scale.
NoSQL databases are purpose-built for non-relational data models and have flexible schemas for building modern applications. They are widely recognized for their ease of development, functionality, and performance at scale. Benefits of NoSQL databases are listed below.
What are the use cases of NoSQL databases?
You can use NoSQL databases to build a wide variety of high-performance mobile, Internet of Things (IoT), gaming, and web applications that provide great user experiences at scale. NoSQL databases and their use cases are wide-ranging. While it is challenging to present a representative set of use cases, below we provide a few illustrative examples as thought-starters and encourage you to learn more about each NoSQL database and its use cases.
How do NoSQL databases work?
NoSQL databases use a variety of data models for accessing and managing data. These types of databases are optimized specifically for applications that require flexible data models, large data volumes, and low latency, which they achieve by relaxing some of the data consistency restrictions of relational databases. Implementations differ based on the data model. However, many NoSQL databases use JavaScript Object Notation (JSON), an open data interchange format that represents data as a collection of name-value pairs.
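To make the name-value structure concrete, here is a minimal sketch using Python's standard `json` module. The record and its field names (`user_id`, `interests`, `address`) are hypothetical and chosen only for illustration; they are not tied to any particular database.

```python
import json

# Hypothetical user profile; the field names are illustrative only.
profile = {
    "user_id": "u-1001",
    "name": "Ana",
    "interests": ["hiking", "chess"],                   # a list value
    "address": {"city": "Lisbon", "zip": "1000-001"},   # a nested object
}

# Serialize to JSON text, as it might be sent to or stored by a document database.
encoded = json.dumps(profile, sort_keys=True)

# Parse the text back into native Python structures.
decoded = json.loads(encoded)

print(encoded)
print(decoded["address"]["city"])
```

Note that values can themselves be lists or nested objects, which is what lets a single JSON record carry data that a relational design would typically spread across several tables.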
SQL vs. NoSQL terminology
The following table compares terminology used by select NoSQL databases with terminology used by SQL databases.
| SQL | MongoDB | DynamoDB | Cassandra | Couchbase |
|---|---|---|---|---|
| Table | Collection | Table | Table | Data bucket |
| Row | Document | Item | Row | Document |
| Column | Field | Attribute | Column | Field |
| Primary key | ObjectId | Primary key | Primary key | Document ID |
| Index | Index | Secondary index | Index | Index |
| View | View | Global secondary index | Materialized view | View |
| Nested table or object | Embedded document | Map | Map | Map |
| Array | Array | List | List | List |
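As a rough illustration of how these terms line up, the sketch below stores one order relationally (two tables joined on a key) and then folds the joined rows into a single nested record, the shape a document or item would typically take. It uses Python's built-in `sqlite3` module as a stand-in relational database; the table names, column names, and values are all hypothetical.

```python
import sqlite3

# Relational form: the order and its line items live in two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id TEXT PRIMARY KEY, customer TEXT);
    CREATE TABLE order_items (order_id TEXT, sku TEXT, qty INTEGER);
    INSERT INTO orders VALUES ('o-1', 'Ana');
    INSERT INTO order_items VALUES ('o-1', 'sku-7', 2), ('o-1', 'sku-9', 1);
""")

rows = conn.execute("""
    SELECT o.order_id, o.customer, i.sku, i.qty
    FROM orders o JOIN order_items i ON i.order_id = o.order_id
    ORDER BY i.sku
""").fetchall()
conn.close()

# Document form: one record with a list of embedded maps instead of a child table.
document = {"order_id": rows[0][0], "customer": rows[0][1], "items": []}
for _, _, sku, qty in rows:
    document["items"].append({"sku": sku, "qty": qty})

print(document)
```

Here the `order_items` rows play the role of the "nested table" in the SQL column, while `document["items"]` corresponds to a list of maps (or an array of embedded documents) in the NoSQL columns.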
What are the types of NoSQL databases?
There are several different NoSQL database systems due to variations in the way they manage and store schema-less data. We explain some of the common types below.
What are the differences between NoSQL and SQL databases?
For decades, the predominant data model in application development was the relational data model, which stores data in tables made of rows and columns. Structured Query Language (SQL) was used to create and edit these relational tables. SQL databases model data relationships as tables. Each row in a table represents a collection of related values of one object or entity. Each column represents a data attribute, and a field (or table cell) stores the actual value of the attribute. You can use a relational database management system (RDBMS) to access the data in many different ways without reorganizing the database tables themselves.
It wasn't until the mid to late 2000s that other, more flexible data models began to gain significant adoption and usage. To differentiate and categorize these new classes of databases and data models, the term NoSQL was coined. NoSQL stands for not only SQL or non-SQL. Often the term NoSQL is used interchangeably with the term non-relational. Key differences between relational and non-relational databases are given in the table below.
| | Relational databases | NoSQL databases |
|---|---|---|
| Optimal workloads | Relational databases are designed for transactional and strongly consistent online transaction processing (OLTP) applications. They are also good for online analytical processing (OLAP). | NoSQL databases are designed for a number of data access patterns that include low-latency applications. NoSQL search databases are designed for analytics over semi-structured data. |
| Data model | The relational model normalizes data into tables that are composed of rows and columns. A schema strictly defines the tables, rows, columns, indexes, relationships between tables, and other database elements. The database enforces referential integrity in relationships between tables. | NoSQL databases provide a variety of data models, such as key-value, document, graph, and column, which are optimized for performance and scale. |
| ACID properties | Relational databases provide atomicity, consistency, isolation, and durability (ACID) properties. | Most NoSQL databases offer trade-offs by relaxing some of the ACID properties of relational databases in favor of a more flexible data model that can scale horizontally. This makes NoSQL databases an excellent choice for high-throughput, low-latency use cases that need to scale horizontally beyond the limitations of a single instance. |
| Performance | Performance is generally dependent on the disk subsystem. The optimization of queries, indexes, and table structure is often required to achieve peak performance. | Performance is generally a function of the underlying hardware cluster size, network latency, and the calling application. |
| Scale | Relational databases typically scale up by increasing the compute capabilities of hardware or scale out by adding replicas for read-only workloads. | NoSQL databases are typically partitionable because their access patterns can scale out across a distributed architecture, which increases throughput and provides consistent performance at near-boundless scale. |
| APIs | Requests to store and retrieve data are communicated using queries that conform to a structured query language (SQL). These queries are parsed and executed by the relational database. | Object-based APIs allow app developers to easily store and retrieve data structures. Partition keys let apps look up key-value pairs, column sets, or semi-structured documents that contain serialized app objects and attributes. |
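The Scale and APIs rows both lean on the idea of a partition key. The following is a minimal sketch of hash partitioning, assuming a fixed number of partitions and a simple modulo scheme; the names `partition_for`, `put`, and `get` are hypothetical helpers, not any database's API. Production systems generally use more elaborate schemes (for example, consistent hashing or range splits) so that partitions can be added without moving most of the data.

```python
import hashlib

NUM_PARTITIONS = 4  # assumed cluster size, for illustration only


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key to a partition index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Each partition is modeled as an independent in-memory dict.
partitions = [dict() for _ in range(NUM_PARTITIONS)]


def put(key: str, value: dict) -> None:
    """Store a value in the single partition that owns the key."""
    partitions[partition_for(key)][key] = value


def get(key: str):
    """Look up a value; only the owning partition is consulted."""
    return partitions[partition_for(key)].get(key)


put("user#1", {"name": "Ana"})
put("user#2", {"name": "Ben"})
print(get("user#1"))
```

Because every read and write is routed by the key alone, no partition needs to know about the others, which is what allows throughput to grow as partitions are added.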
When should you choose NoSQL databases over SQL databases?
A NoSQL database is best for handling indeterminate, unrelated, or rapidly changing data. It is intuitive to use for developers when the application dictates the database schema. You can use it for applications that:
Need flexible schemas that enable faster and more iterative development.
Prioritize performance over strong data consistency and maintaining relationships between data tables (referential integrity).
Require horizontal scaling by sharding across servers.
Support semi-structured and unstructured data.
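As a small illustration of the flexible-schema point in the list above, the sketch below keeps records with different fields in one collection and adds a new attribute without touching older records. The collection is just a Python list of dicts, and `find` is a hypothetical helper written for this example rather than a real database call.

```python
# A hypothetical "collection" held as a plain list of dicts.
products = [
    {"id": "p-1", "name": "Desk lamp", "watts": 40},
    {"id": "p-2", "name": "Novel", "author": "A. Writer", "pages": 320},
]

# A new attribute can appear on new records without migrating old ones.
products.append({"id": "p-3", "name": "Headphones", "wireless": True})


def find(collection, **criteria):
    """Return records whose fields match all criteria; missing fields never match."""
    return [
        doc
        for doc in collection
        if all(field in doc and doc[field] == value for field, value in criteria.items())
    ]


print(find(products, wireless=True))
print(sorted({field for doc in products for field in doc}))
```

The trade-off is that the application, not the database, becomes responsible for handling records that lack a given field.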
You don't always have to choose between a non-relational and relational database schema. You can employ a combination of SQL and NoSQL databases in your applications. This hybrid approach is quite common and ensures each workload is mapped to the right database for optimal price performance.
How can AWS support your NoSQL database requirements?
AWS has several NoSQL database services to meet all your NoSQL requirements. For example:
Amazon DynamoDB is a serverless, fully managed, key-value database service that provides consistent, single-digit-millisecond performance with limitless scalability.
Amazon DocumentDB (with MongoDB compatibility) is a fully managed, native JSON document database that makes it easy and cost effective to operate critical document workloads at virtually any scale without managing infrastructure.
Amazon Neptune is a serverless, fully managed graph database service designed for superior scalability and availability, with the ability to query billions of relationships in seconds.
Amazon MemoryDB for Redis is a durable, in-memory database service that delivers microsecond read and write response times for ultra-fast performance.
Amazon ElastiCache is a fully managed, Redis- and Memcached-compatible, in-memory data store and cache service that delivers real-time, cost-optimized performance.
Amazon Keyspaces (for Apache Cassandra) is a scalable, serverless, fully managed, Apache Cassandra–compatible wide-column database service designed for up to 99.999% availability with multi-Region replication.
Amazon Timestream is a serverless, fully managed time-series database that makes it easier to store and analyze trillions of events per day up to 1,000 times faster than relational databases.
Amazon OpenSearch Service is a fully managed, distributed search and analytics suite that enables real-time search, monitoring, and analysis of business and operational data.