Wednesday, September 1, 2021

Shared Nothing Architecture (SN Architecture) and Database Sharding

 

SN Architecture: A shared nothing (SN) architecture is a distributed computing approach in which each node is independent and self-sufficient, and no single point of contention exists anywhere in the system.

  • This means no resources are shared between nodes (no shared memory, no shared file storage).

  • The nodes can work independently, without depending on each other for any task.

  • A failure on one node affects only the users of that node; the other nodes continue to work without disruption.

This approach is highly scalable because it avoids having a single bottleneck in the system. Shared nothing has recently become popular for web development because capacity grows close to linearly as nodes are added. Google has been using it for a long time.
 
In theory, a shared nothing system can scale almost indefinitely simply by adding nodes in the form of inexpensive machines.

Sharding: Sharding is an architectural approach that distributes a single logical database system across a cluster of machines.

 
Sharding is a horizontal partitioning design scheme. In this design, the rows of a database table are stored separately, rather than the table being split by columns (as in normalization and vertical partitioning). Each partition is called a shard, and it can be located independently on a separate database server or at a separate physical location.
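
To make the row-versus-column distinction concrete, here is a minimal sketch in Python. The sample table, the function names, and the modulo placement rule are all assumptions made for illustration; they are not taken from any particular database product.

```python
# Hypothetical sample table; each dict is one row.
USERS = [
    {"user_id": 1, "name": "Ann", "email": "ann@example.com"},
    {"user_id": 2, "name": "Bob", "email": "bob@example.com"},
    {"user_id": 3, "name": "Cid", "email": "cid@example.com"},
    {"user_id": 4, "name": "Dee", "email": "dee@example.com"},
]


def horizontal_partition(rows, num_shards, key="user_id"):
    """Split a table by rows: every shard keeps all columns, but only some rows."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        # Simple modulo placement, used here only to keep the sketch short.
        shards[row[key] % num_shards].append(row)
    return shards


def vertical_partition(rows, column_groups, key="user_id"):
    """Split a table by columns: every part keeps all rows, but only some columns."""
    return [
        [{col: row[col] for col in (key, *group)} for row in rows]
        for group in column_groups
    ]


if __name__ == "__main__":
    for i, shard in enumerate(horizontal_partition(USERS, 2)):
        print("shard", i, [row["user_id"] for row in shard])
    for part in vertical_partition(USERS, [["name"], ["email"]]):
        print("columns", sorted(part[0]))
```

Each horizontal shard is a complete, smaller copy of the table's schema, which is why it can live on its own server. A vertical part still holds every row, so it does not reduce the row count per table.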
 
Sharding makes a database system highly scalable. The total number of rows in each table in each database is reduced, since the tables are divided and distributed across multiple servers. This reduces the index size, which generally means improved search performance.
 
A common approach for creating shards is consistent hashing of a unique ID from the application (e.g. the user ID).
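
As a rough illustration of consistent hashing, the sketch below places shard nodes on a hash ring and sends each key to the first node point at or after the key's hash. The class name, the node names such as "db-a", the use of MD5, and the count of 100 virtual nodes per server are assumptions chosen for the example, not a description of any specific system.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring (MD5 is an assumption)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Hypothetical minimal hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # ring points per physical node
        self._hashes = []      # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            h = _hash(f"{node}#{i}")
            bisect.insort(self._hashes, h)
            self._owner[h] = node

    def remove_node(self, node):
        self._hashes = [h for h in self._hashes if self._owner[h] != node]
        self._owner = {h: n for h, n in self._owner.items() if n != node}

    def get_node(self, key):
        if not self._hashes:
            raise ValueError("ring has no nodes")
        h = _hash(str(key))
        # First ring position clockwise from the key, wrapping past the end.
        idx = bisect.bisect_right(self._hashes, h) % len(self._hashes)
        return self._owner[self._hashes[idx]]


if __name__ == "__main__":
    ring = ConsistentHashRing(["db-a", "db-b", "db-c"])
    before = {user_id: ring.get_node(user_id) for user_id in range(1000)}
    ring.add_node("db-d")
    moved = sum(1 for k, node in before.items() if ring.get_node(k) != node)
    print("keys moved after adding db-d:", moved, "of", len(before))
```

In this scheme, adding a node only takes keys away from existing nodes and gives them to the new one; keys never shuffle between the old nodes. That property is the main reason to prefer a ring over a plain modulo on the shard count.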
 
The downsides of sharding are:

  • It requires the application to be aware of where the data is located.

  • Adding or removing nodes requires some rebalancing of data across the system.

  • If you need a lot of cross-node join queries, performance will suffer badly. Therefore, knowing how the data will be queried becomes really important.

  • Poorly chosen sharding logic may result in worse performance. Therefore, make sure you shard based on the application's needs.
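
The rebalancing cost mentioned in the list above depends heavily on the placement rule. As a small sketch, assume the simplest possible rule, where the shard is the key modulo the number of shards (the function names here are hypothetical). The code measures what fraction of keys would have to move when one shard is added.

```python
def shard_for(key: int, num_shards: int) -> int:
    """Naive placement: shard index is the key modulo the shard count."""
    return key % num_shards


def fraction_moved(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    keys = list(keys)
    moved = sum(
        1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count)
    )
    return moved / len(keys)


if __name__ == "__main__":
    # Growing from 3 to 4 shards relocates 75% of these 12,000 sequential IDs.
    print(fraction_moved(range(12_000), 3, 4))
```

With this naive rule, three out of every four keys change shards when a fourth shard is added. Consistent hashing is attractive precisely because it keeps that fraction much smaller.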
