The Apache Software Foundation has released Apache Cassandra 4.0, a distributed DBMS that belongs to the class of NoSQL systems and is designed for building highly scalable and reliable storage for huge data sets kept in the form of an associative array (hash). Cassandra 4.0 is considered ready for production use and has already been tested at Amazon, Apple, DataStax, Instaclustr, iland and Netflix on clusters with more than 1000 nodes. The project code is written in Java and is distributed under the Apache 2.0 license.
The Cassandra DBMS was originally developed by Facebook and in 2009 was handed over to the Apache Foundation. Production deployments based on Cassandra support the operations of companies such as Apple, Adobe, CERN, Cisco, IBM, HP, Comcast, Disney, eBay, Huawei, Netflix, Sony, Rackspace, Reddit and Twitter. For example, Apple's storage infrastructure based on Apache Cassandra has more than a thousand clusters, comprising 160 thousand nodes and storing more than 100 petabytes of data. Huawei runs more than 300 Apache Cassandra clusters comprising 30 thousand nodes, and Netflix has more than 100 clusters covering 10 thousand nodes and processing more than a trillion requests per day.
The Cassandra DBMS builds on the fully distributed Dynamo hash system, which provides almost linear scalability as the volume of data grows. Cassandra uses a storage model based on column families (ColumnFamily). Unlike systems such as MemcacheDB, which store data only as key/value pairs, this model makes it possible to store hashes with several levels of nesting.
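The difference between a flat key/value store and hashes with several nesting levels can be illustrated with plain Python dictionaries. This is only a conceptual sketch, not how Cassandra lays out data internally; the names users, 42, name and city, and both helper functions, are made up for the illustration.

```python
# Flat key/value store: one opaque value per key, as in a memcachedb-like system.
flat_store = {
    "user:42": '{"name": "Ada", "city": "London"}',
}

# Nested hashes: column family -> row key -> column name -> value.
# All names below are hypothetical and exist only for this illustration.
nested_store = {
    "users": {                  # column family
        "42": {                 # row key
            "name": "Ada",      # column -> value
            "city": "London",
        },
    },
}


def get_column(store, family, row_key, column):
    """Walk the nesting levels and return a single column value."""
    return store[family][row_key][column]


def set_column(store, family, row_key, column, value):
    """Create missing levels on demand and write a single column."""
    row = store.setdefault(family, {}).setdefault(row_key, {})
    row[column] = value


# A single column can be added or read without touching the rest of the row.
set_column(nested_store, "users", "42", "email", "ada@example.org")
print(get_column(nested_store, "users", "42", "city"))
```

In the flat store, changing one field means rewriting the whole value. In the nested layout each column is addressed separately.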
To simplify interaction with the database, Cassandra supports the structured query language CQL (Cassandra Query Language), which resembles SQL but is reduced in functionality. Notable features include support for namespaces and column families, and the creation of indexes with the “CREATE INDEX” statement.
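As a rough illustration of what CQL looks like, the sketch below keeps a few statements as Python strings and sends them through a session object. The keyspace demo, the table users, the index users_city_idx and both helper functions are hypothetical. The only driver call assumed is session.execute(), as provided by the DataStax Python driver (cassandra-driver), which is not imported here so that the snippet stays self-contained.

```python
# CQL statements for a hypothetical "demo" namespace (keyspace).
SCHEMA_STATEMENTS = [
    # Namespace with three copies of every row (illustrative settings).
    "CREATE KEYSPACE IF NOT EXISTS demo "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}",
    # Column family (called a table in current CQL).
    "CREATE TABLE IF NOT EXISTS demo.users "
    "(user_id uuid PRIMARY KEY, name text, city text)",
    # Secondary index, so that rows can be filtered by a non-key column.
    "CREATE INDEX IF NOT EXISTS users_city_idx ON demo.users (city)",
]

# Query that relies on the index above; %s is a bound parameter placeholder.
NAMES_BY_CITY = "SELECT name FROM demo.users WHERE city = %s"


def apply_schema(session):
    """Run every schema statement through an object exposing execute()."""
    for statement in SCHEMA_STATEMENTS:
        session.execute(statement)


def names_in_city(session, city):
    """Return the names of users in a city, assuming rows expose a .name field."""
    return [row.name for row in session.execute(NAMES_BY_CITY, (city,))]


# With a running cluster and cassandra-driver installed, a session could be
# obtained roughly like this (assumption, not executed here):
#   from cassandra.cluster import Cluster
#   session = Cluster(["127.0.0.1"]).connect()
#   apply_schema(session)
```

Unlike SQL, there are no joins here: a query addresses one table, and filtering by city works only because of the index.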
The DBMS makes it possible to build fault-tolerant storage: data placed in the database is automatically replicated to several nodes of the distributed network, which can span different data centers. If a node fails, its functions are picked up by other nodes on the fly. Adding new nodes to the cluster and upgrading the Cassandra version are also done on the fly, without additional manual intervention or reconfiguration of other nodes. Drivers with CQL support are available for Python (DBAPI2), Java (JDBC), Ruby, PHP, C++ and JavaScript (Node.js).
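The idea of replicating each record to several nodes and letting the remaining nodes take over after a failure can be sketched with a simplified hash ring. This is a toy model under stated assumptions, not Cassandra's actual placement logic: it ignores virtual nodes, racks and data centers, uses MD5 as a stand-in hash, and the Ring class, the node names and the key user:42 are all hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: a key is stored on the next N nodes clockwise."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        self._ring = sorted((token(node), node) for node in nodes)

    def add_node(self, node):
        """Insert a node; other nodes keep their positions unchanged."""
        bisect.insort(self._ring, (token(node), node))

    def remove_node(self, node):
        """Drop a failed node from the ring."""
        self._ring = [(t, n) for t, n in self._ring if n != node]

    def replicas(self, key):
        """Return the nodes responsible for a key, primary first."""
        if not self._ring:
            return []
        tokens = [t for t, _ in self._ring]
        start = bisect.bisect_right(tokens, token(key)) % len(self._ring)
        count = min(self.replication_factor, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
before = ring.replicas("user:42")
ring.remove_node(before[0])          # simulate failure of the primary replica
after = ring.replicas("user:42")
print(before, after)
```

In this model the two surviving replicas of a key stay responsible for it after the first one disappears, and one more node joins the replica set. That is the sense in which other nodes pick up a failed node's work without reconfiguration.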