What is SingleStore?
SingleStore is a highly scalable, distributed SQL database that pairs an in-memory rowstore with an analytics-oriented columnstore on disk. This ability to act as both a row store and a column store concurrently, combined with lock-free data structures, multiversion concurrency control, and lock-free skiplists, allows SingleStore to support traditional and real-time analytics, eliminating the need for an additional data warehouse and ETL processes.
Why and when to use SingleStore
Both a transactional database and analytical data warehouse
SingleStore is designed to act as both a transactional (rowstore) database, and an analytical (columnstore) data warehouse. This unique technology eliminates the need for an organization to spin up a transactional database for row-level writes, and a separate analytical datawarehouse for historical analysis. Organizations can concurrently write and read from the same distributed system using ANSI SQL.
The ability to stream operational data into relational format while concurrently reading data means SingleStore can discover insights on the latest events or compare those with historical data. This approach enables real-time financial reporting or insights on live customer interactions with a single unified database. The integrated architecture can replace more complex systems that require copying of data from a transactional database or data lake into a dedicated data warehouse for analysis. Most relational database management systems (RDBMSs) cannot write and read data at the same time. This unique architecture makes SingleStore an attractive technology for organizations that want to analyze their business continuously while providing a historical reference store for predicting events or for making strategic decisions.
SingleStore scales similar to a Hadoop architecture in that SingleStore doesn’t require exotic hardware; it scales linearly on commodity hardware. This allows SingleStore to run on-premises, in the cloud, or as a managed service. It’s also possible to scale SingleStore instances without taking the cluster offline, which means SingleStore can be easily scaled “just-in-time” when new performance or storage is necessary.
What use cases is SingleStore really great for?
Use cases that require operational or real-time analytics
SingleStore supports real-time analytics. It can natively subscribe to Kafka, Spark, AWS S3, or HDFS data pipelines and process at a very large scale. Once data is in SingleStore, you can use standard SQL to analyze it. This makes SingleStore a great technology for use cases where a high business value is placed on minimized latency, such as fraud detection, system monitoring, e-commerce, and social networks.
How much does SingleStore cost?
SingleStore offers a free tier up to 128GB of RAM and unlimited disk or as an Enterprise subscription priced per GB of RAM in the cluster. The subscription provides professional support. For more details, contact SingleStore’s team.
SingleStore has a free tier that can be used in production and is available up to 128GB of RAM with unlimited disk.
How does SingleStore work?
SingleStore’s architecture has three distinct components that contribute to its speed and dual functionality. These three distinct components are:
- Distributed, shared-nothing architecture on commodity hardware
- Hybrid OLTP & OLAP Architecture
- Lock-free data structures
Distributed shared nothing architecture
SingleStore’s architecture is distributed across many individual machines, with each machine possessing its own CPU, memory, and disk; this is commonly known as a shared-nothing architecture, because storage and compute resources aren’t shared across the system.
SingleStore is also designed to run on commodity hardware, which means that the hardware needed to run a SingleStore system is not proprietary and therefore is relatively cheap. This also allows SingleStore administrators to scale the instance cheaply and easily.
Each node within SingleStore will fall under one of two types:
- Aggregator nodes store metadata such as indexes across the entire system, and are responsible for querying the storage nodes (known as leaves), aggregating the results, and sending them back to the client.
- Leaf nodes each contain a portion of the larger table. They’re responsible for storing and computing the data located on the node.
Application workloads can be divided amongst aggregators, where one aggregator serves application A, and another serves application B.
Hybrid OLTP & OLAP architecture
SingleStore contains both an in-memory rowstore for fast, high availability queries and an on-disk columnstore for analytical workloads that involve aggregating data, or where the size of table exceeds memory constraints.
SingleStore is able to join between its rowstore and columnstore, which means that a user querying the SingleStore database will be able surface both real-time and historical data with one query.
Lock-free data structures
Many relational databases impose a “lock” on stored tables when writing data. This is because if a read occurs at the same time data is being written to the table, the data can be output inconsistently. For example if an
UPDATE statement is issued on the table at the same time as a
SELECT statement, you could potentially read out half written data.
Table locks increase latency for database systems because they necessitate a dedicated database or data warehouse specifically for reading data or performing analytical queries. Data will then be moved in batches into that database at dedicated ETL intervals.
SingleStore implements lock-free data structures, skiplists, and Multiversion Concurrency Control (MVCC) to eliminate the need to lock the database when updating tables, which means that writes can happen at very high throughput while also maintaining a high level of concurrent reads.
Types of data
SingleStore stores data in either row or columnar format, depending on the table DDL*. It can ingest both tabular and nested data structures originating from text (delimited) files, JSON, Avro, and geospatial data.
Range of row (quantity)
SingleStore is capable of billion-row, petabyte-scale tables given enough compute and storage. Keep in mind that costs will increase if you need substantial space in the in-memory rowstore portion of SingleStore.
How to use SingleStore
SingleStore is flexible enough to be implemented on-premise, and across a variety of cloud platforms. A good resource to get an idea for the implementation process for SingleStore is SingleStore’s developer documentation.
SingleStore is an ODBC database and wire-compatible with MySQL, so any software that is compatible with MySQL can also be connected to SingleStore. This makes it very easy to move from MySQL to SingleStore. Once you move your data out of MySQL and into SingleStore, all you’ll need to do is point your existing applications at the SingleStore database.
Process for new data
SingleStore supports the SQL standard
LOAD DATA command to bring data into the environment. Because of SingleStore’s MySQL heritage, you can also leverage MySQL’s
CLI commands or an open source tool like
mysqldump to get data into SingleStore. Additionally, SingleStore has a GUI-based loading tool called Pipelines that can ingest data from different sources, including Amazon S3, Kafka, and other sources.
SingleStore uses sharding to distribute data uniformly across the leaf nodes. Sharding is the process of horizontally splitting a database table into uniform chunks, and distributing those chunks evenly across multiple nodes. With other databases, sharding must be configured manually, which prompts occasionally rebalancing the database. SingleStore on the other hand, partitions the tables automatically.
SingleStore can be scaled linearly, with “just-in-time” scaling. You don’t need to take the cluster offline. Increasing leaf nodes allows for faster query return time, and higher concurrency.
Things to look out for
SingleStore is a relatively new database, and is not yet fully ANSI SQL compatible (as of this writing it’s only ANSI SQL-92 compliant, compared to MySQL or Postgres which are at least SQL:2003 compliant).
While SingleStore has worked hard to develop the database optimizer, the technology still has shortcomings that are natural with a newer database technology. You may need to rewrite some queries in order to get the performance you’re looking for. Their support team will assist you with this process.