Hello Guys,
This blog is part four of an eight-part series on porting your RDBMS SQL database to a NoSQL wide column key-value database. For the introduction, or part zero, you can click here.
In this blog we will discuss designing the Row Key. For a NoSQL database, the Row Key is critical for fetching rows. It is a vast topic, so let me try to squeeze in whatever I can. Keep this firmly in mind: a Row Key cannot (or at least should not) be updated. The only way to change it is to delete the row first and then insert it again with the new Row Key value. This lets the database place the row in the proper partition. Of course, row key design involves a lot of transformations.
There is a lot to ponder while designing efficient row keys, but here are a few important points to always keep in mind; these are the ones we used in our own row key design:
- Order:
Row Keys are implemented as byte arrays and are stored in byte-lexicographical order. What this means is that row keys are sorted, byte by byte, from left to right. For example, let's take this row key set:
{1,09,005,2,3,22,11}
Numerically this will be ordered as:
{1,2,3,005,09,11,22}
But in byte-lexicographical order, the set will be ordered as:
{005,09,1,11,2,22,3}
Let's make it an alphanumeric row key set:
{1,09,005,2,Ash,ASH,3,22,11}
the ordering will be:
{005,09,1,11,2,22,3,ASH,Ash}
The ASCII value of 'S' in ASH is lower than the ASCII value of 's' in Ash. The main point is that 22 is placed before 3, 11 before 2, and ASH before Ash, and that is because of byte-lexicographical ordering. A quick sketch of this ordering follows.
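Here is a minimal sketch (in Python, purely for illustration) of how byte-lexicographical ordering behaves, using the key set from above:

```python
# Byte-lexicographical ordering: keys are compared byte by byte, left to right.
keys = ["1", "09", "005", "2", "Ash", "ASH", "3", "22", "11"]

# Sorting by the UTF-8 byte representation reproduces the ordering shown above.
ordered = sorted(keys, key=lambda k: k.encode("utf-8"))
print(ordered)  # ['005', '09', '1', '11', '2', '22', '3', 'ASH', 'Ash']
```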
- Padding:
In SQL, our tendency is to use numeric values for keys or IDs. When these numeric IDs become part of the row keys, they mess up the ordering, and while fetching records you might end up fetching unwanted rows. In the example above, if we want to fetch the numeric values from 1 to 5, we will get the result set {1,11,2,22,3}, whereas we were actually expecting {1,2,3,005}. To resolve this, it's good to fix a length for the IDs and then pad (prefix) the ID values with zeros '0'.
So with length=4 the above set becomes:
{0001,0009,0005,0002,0003,0022,0011}
With the byte lexicographical ordering the set will be:
{0001,0002,0003,0005,0009,0011,0022}
In our data model we have CustomerID as a field and we want to use it as part of the concatenated row key (discussed below). We expect no more than a million customers, so if the CustomerID is 123 then after padding it is transformed to 000123. Do not pad unnecessarily, as the shorter the row key the better the performance, but make sure the length is adequate to satisfy the use cases, since a row key cannot be updated later. A padding sketch follows.
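A minimal sketch of the padding transformation, assuming a fixed length of 6 for CustomerID as in the example above:

```python
def pad(value, length=6):
    """Left-pad a numeric ID with zeros so it sorts correctly as a string."""
    return str(value).zfill(length)

print(pad(123))  # '000123'

# The length=4 example from above, sorted byte-lexicographically after padding:
print(sorted(pad(v, 4) for v in [1, 9, 5, 2, 3, 22, 11]))
# ['0001', '0002', '0003', '0005', '0009', '0011', '0022']
```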
- Descending Order:
For a few use cases we might want the row key sorted in descending order. You can achieve this with the following equation:
Key Value = Max Value - ID Value
For example, for CustomerID = 123 we have
Key Value = 999999 - 123 = 999876
CustomerID = 100 will be transformed to Key Value = 999899, so the order changes to:
Transformed Value (123) = 999876
Transformed Value (100) = 999899
Don't forget to pad the values with leading zeros so they follow the lexicographical order. This logic is commonly used for sorting timestamps in descending order.
The descending sort can also be implemented in the application code after the data set fetch is complete. A sketch of the key transformation follows.
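A minimal sketch of the descending-order transformation, using the six-digit CustomerID example above:

```python
MAX_VALUE = 999999  # largest value a six-digit padded CustomerID can hold

def descending_key(customer_id, length=6):
    """Transform an ID so that higher IDs sort first lexicographically."""
    return str(MAX_VALUE - customer_id).zfill(length)

print(descending_key(123))  # '999876'
print(descending_key(100))  # '999899'
# Sorted ascending as strings, 123 (999876) now comes before 100 (999899):
print(sorted([descending_key(123), descending_key(100)]))  # ['999876', '999899']
```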
- Hotspotting:
This one is very important: design your keys in a way that avoids hotspotting. So what is hotspotting? NoSQL databases are scalable, and each database is spread across multiple nodes in a cluster. The more evenly the data is spread across the cluster, the better the performance, because all the nodes work together. Now consider a case where you are trying to fetch data and the request can only target one node in the cluster, because all the requested data lives on that node. In this case a single node is doing all the work to serve the data. If there are multiple requests and each targets that same node, it becomes a hotspot node, and all the other nodes do nothing. Let me explain with an example.
Let's say we have a row key made up of OrderID, and in the RDBMS data model OrderID is an identity or sequence field, so the generated values are sequential. Because the row keys are always sorted, a sequential batch of OrderIDs will be stored together in the same region. The region might have the capacity to store a week's worth of orders. Now, when reports are run for a one-week date range that aligns with that region, the query will hit the same region again and again, making it a hotspot.
To avoid this we can apply a reverse-string operation over the zero-padded OrderID and use that as the row key, so the order data gets spread across multiple nodes in the cluster:
Key Value = Reverse(Pad(OrderID,'0'))
Do not apply the reverse-string logic to date-time values in formats like 'MM/dd/yyyy hh:mm:ss.z'. If you reverse date-time formatted key values, they might fetch unwanted rows. If you want to apply the reverse-string transformation to a date-time value, convert it into a timestamp first and then reverse it. A sketch of both transformations follows.
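A minimal sketch of both variants, assuming zero-padded OrderIDs and date-times converted to epoch-millisecond timestamps before reversing (the helper names are just for illustration):

```python
from datetime import datetime, timezone

def reverse_padded(order_id, length=10):
    """Reverse(Pad(OrderID,'0')): spreads sequential IDs across the key space."""
    return str(order_id).zfill(length)[::-1]

def reverse_timestamp_key(dt):
    """Convert a datetime to an epoch-millisecond timestamp, then reverse it."""
    millis = int(dt.replace(tzinfo=timezone.utc).timestamp() * 1000)
    return str(millis).zfill(13)[::-1]

print(reverse_padded(12345678))                              # '8765432100'
print(reverse_timestamp_key(datetime(2017, 1, 1, 9, 0, 0)))  # reversed epoch millis
```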
Another approach is to use a salt: create a salt and prefix the OrderID with it. Make sure the logic used to create the salt distributes the data evenly across the nodes.
Key Value = Salt + Pad(OrderID,'0')
The salt value can be a single digit, say the digit in the units place of the OrderID, or it can come from a predetermined set of hash values. A salting sketch follows.
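A minimal sketch of the salting approach, assuming the salt is the units digit of the OrderID (one of the two options mentioned above):

```python
def salted_key(order_id, length=10):
    """Prefix the padded OrderID with a salt derived from its units digit."""
    padded = str(order_id).zfill(length)
    salt = str(order_id % 10)  # units digit; a hash-based bucket works as well
    return salt + padded

print(salted_key(12345678))  # '80012345678'
print(salted_key(12345679))  # '90012345679'
# Consecutive OrderIDs now land under different prefixes and spread out.
```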
You can find more details on hotspotting here.
- Concatenated Row Keys:
In the previous blog we generated a data model for the 'Fetch list of Customer Payments' use case in the Multiple Tables section. In that data model we set the row key to PaymentDateTime_CustomerID_PaymentID. Note the order in which the fields are concatenated to generate the row key, and the delimiter ( _ ). You can use any other delimiter of your choice; just make sure that when you supply the search values for fetching rows, you use the same delimiter that was used when inserting the data. The row key PaymentDateTime_CustomerID_PaymentID tells us something: it cannot be used to generate a report for a single customer over a date range. Because the row keys are ordered byte-lexicographically, they will be sorted as follows:
01-01-2017 09:00:00_000123_012345678
01-01-2017 09:00:30_000156_012345679
01-01-2017 09:00:40_000129_012345680
01-01-2017 09:05:00_000163_012345681
01-01-2017 09:07:00_000023_012345683
01-01-2017 16:06:30_000123_012345682
Now, if we want to fetch data for CustomerID 123 only and we supply the following search start and end key values:
Start Key Value: '01-01-2017 00:00:00_000123'
End Key Value: '01-01-2017 16:59:59_000123'
the result set returned will be:
01-01-2017 09:00:00_000123_012345678
01-01-2017 09:00:30_000156_012345679
01-01-2017 09:00:40_000129_012345680
01-01-2017 09:05:00_000163_012345681
01-01-2017 09:07:00_000023_012345683
01-01-2017 16:06:30_000123_012345682
As you can see, the result set also returns rows belonging to other customers.
Now, what happens if we change the order to CustomerID_PaymentDateTime_PaymentID? With this key it is possible to fetch data for a single customer over a date range. The rows now sort as:
000023_01-01-2017 09:07:00_012345683
000123_01-01-2017 09:00:00_012345678
000123_01-01-2017 16:06:30_012345682
000129_01-01-2017 09:00:40_012345680
000156_01-01-2017 09:00:30_012345679
000163_01-01-2017 09:05:00_012345681
After applying the following search key values:
Start Key Value: '000123_01-01-2017 00:00:00'
End Key Value: '000123_01-01-2017 16:59:59'
the result set returned will be:
000123_01-01-2017 09:00:00_012345678
000123_01-01-2017 16:06:30_012345682
and this is what is expected.
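A minimal sketch of building the CustomerID_PaymentDateTime_PaymentID key and the start/end keys for the 'single customer, date range' scan above (the helper names are just for illustration):

```python
def payment_row_key(customer_id, payment_datetime, payment_id):
    """Concatenate the padded fields with '_' in the chosen order."""
    return "_".join([str(customer_id).zfill(6),
                     payment_datetime,
                     str(payment_id).zfill(9)])

print(payment_row_key(123, "01-01-2017 09:00:00", 12345678))
# '000123_01-01-2017 09:00:00_012345678'

# Scan range for CustomerID 123 on 01-01-2017, up to 16:59:59:
start_key = "000123_01-01-2017 00:00:00"
end_key   = "000123_01-01-2017 16:59:59"
```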
Why do we need _PaymentID in both keys? For uniqueness. It is possible that two users pay at the same time for the same customer; if that happens, the new value will overwrite the previous one, because the table cannot hold two rows with the same key. So we append PaymentID for uniqueness. When fetching records it is not necessary to send the complete start and end values. For example (using the key PaymentDateTime_CustomerID_PaymentID):
Start Key Value: '01-01-2017 00:00:00'
End Key Value: '01-01-2017 09:59:59'
will return the following result set:
01-01-2017 09:00:00_000123_012345678
01-01-2017 09:00:30_000156_012345679
01-01-2017 09:00:40_000129_012345680
01-01-2017 09:05:00_000163_012345681
01-01-2017 09:07:00_000023_012345683
The above keys will fetch payments for all customers in that date range. In the row key, the value on the left is more significant than the value on the right, and values on the right can be skipped. Another example (using the key CustomerID_PaymentDateTime_PaymentID):
Start Key Value: '000123'
End Key Value: '000124' (or '000123' with the remaining positions filled with the maximum possible values)
The above scan will fetch all the payments for CustomerID 123. Note that skipping the right-hand values only works for the start key; a bare '000123' as the end key would sort before every '000123_...' row, so the end key must cover the whole prefix range.
Start Key Value: '000123_01-01-2017' or '000123_01-01-2017 00:00:00_000000000'
End Key Value: '000123_01-01-2017 23:59:59_999999999'
the result set returned will be:
000123_01-01-2017 09:00:00_012345678
000123_01-01-2017 16:06:30_012345682
The above keys will fetch all the payments made on 1st Jan 2017 by CustomerID 123. So different use cases might require different combinations of fields, as well as different orders of field concatenation, to generate the row key. When building such scan keys, remember that the end key must cover the whole prefix range, as sketched below.
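A minimal sketch of building start and end keys for a prefix-style scan by filling the skipped positions with minimum and maximum values, as in the example above (the helper name is just for illustration):

```python
def day_scan_range(customer_id, date_str):
    """Start/end keys for all payments of one customer on one day."""
    prefix = str(customer_id).zfill(6) + "_" + date_str
    start_key = prefix + " 00:00:00_000000000"
    end_key   = prefix + " 23:59:59_999999999"
    return start_key, end_key

print(day_scan_range(123, "01-01-2017"))
# ('000123_01-01-2017 00:00:00_000000000',
#  '000123_01-01-2017 23:59:59_999999999')
```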
- Secondary Indexes:
This is a more interesting point. Again, let's take the use cases from the concatenated keys above. If we need two different row keys then, as per the multiple-tables data model, we end up with the two models below:

Row Key: PaymentDateTime_CustomerID_PaymentID
Column Family P: FirstName, LastName, BusinessName, PaymentDateTime, PaymentMode, Name, Number, Amount, ExpiryDate, AuthDetails, Status

Row Key: CustomerID_PaymentDateTime_PaymentID
Column Family P: FirstName, LastName, BusinessName, PaymentDateTime, PaymentMode, Name, Number, Amount, ExpiryDate, AuthDetails, Status
As you can see in the above data model, there is a huge amount of data duplication, just because we have a use case that cannot be served by the existing data model. So we have to create another data model and possibly write the same data twice. For any new use case that cannot use an existing data model, we will have to create a new data model every time, which leads to more and more duplication and redundancy.
This can be resolved by using something like the CustomerOrder data model and creating secondary index tables. Secondary index tables are simply tables whose row key is a concatenated key and whose value is the row key of the main table.
Now, OrderID is expected to be an identity or sequential type by default. As discussed above under hotspotting, sequential values are good candidates for hotspots. To avoid this we can use either the reverse transformation or a hash prefix. We will use the reverse transformation for our data model.
CustomerOrder
Row Key: Reverse(Pad(OrderID,'0'))
Column Families:
C: CustomerID, FirstName, LastName, BusinessName, EmailAddress, Designation
O: OrderNo, OrderDateTime, TotalAmount, GrossAmount, OrderStatus, ItemCount
P: PaymentID, PaymentDateTime, PaymentMode, Name, Number, Amount, ExpiryDate, AuthDetails, Status
S: AddressID, Address1, Address2, City, State, Zip
B: AddressID, Address1, Address2, City, State, Zip
D: DiscountCode, GrossAmount, TotalAmount
Secondary index tables for 'Fetch list of customer payments' use case:
PaymentsByDate
Row Key: PaymentDateTime_CustomerID_PaymentID
Column Family P: OrderID = Reverse(Pad(OrderID,'0'))
PaymentsByCustomer
Row Key: CustomerID_PaymentDateTime_PaymentID
Column Family P: OrderID = Reverse(Pad(OrderID,'0'))
With the above data model we have eliminated the duplication and redundancy. If a new use case needs to be implemented, just create a new secondary index table whose concatenated row key satisfies the search parameters for that use case, and store the reverse-padded OrderID as the value. A sketch of the write path follows.
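A minimal sketch of the write path, using plain Python dictionaries as stand-ins for the main table and the two index tables (the real writes would go through your NoSQL client API; the field values are illustrative only):

```python
# Stand-in "tables": row key -> stored value
customer_order   = {}   # main table, keyed by Reverse(Pad(OrderID,'0'))
payments_by_date = {}   # index: PaymentDateTime_CustomerID_PaymentID -> main row key
payments_by_cust = {}   # index: CustomerID_PaymentDateTime_PaymentID -> main row key

def write_payment(order_id, customer_id, payment_dt, payment_id, payment_data):
    main_key = str(order_id).zfill(10)[::-1]   # Reverse(Pad(OrderID,'0'))
    customer_order[main_key] = payment_data     # write the actual data only once
    cust = str(customer_id).zfill(6)
    pay  = str(payment_id).zfill(9)
    payments_by_date[f"{payment_dt}_{cust}_{pay}"] = main_key
    payments_by_cust[f"{cust}_{payment_dt}_{pay}"] = main_key

write_payment(55501, 123, "01-01-2017 09:00:00", 12345678, {"Amount": "49.99"})
```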
So now we can create more secondary index tables to satisfy our other use cases. Below is another secondary index table.
CustomerOrders
Row Key: OrderDateTime_CustomerID_OrderID
Column Family O: OrderID = Reverse(Pad(OrderID,'0'))
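Building on the stand-in dictionaries from the write-path sketch above, here is a minimal sketch of the read path for the 'Fetch list of Customer Payments' use case: range-scan the PaymentsByCustomer index, collect the main-table row keys, then fetch the full rows from CustomerOrder.

```python
def fetch_customer_payments(customer_id, start_suffix, end_suffix):
    """Range-scan PaymentsByCustomer, then look up the rows in CustomerOrder."""
    cust = str(customer_id).zfill(6)
    start_key = f"{cust}_{start_suffix}"
    end_key   = f"{cust}_{end_suffix}"
    # Emulate a lexicographic range scan over the sorted index keys.
    hits = [main_key for key, main_key in sorted(payments_by_cust.items())
            if start_key <= key <= end_key]
    return [customer_order[main_key] for main_key in hits]

# End suffix fills the skipped positions with maximum values, as discussed above.
print(fetch_customer_payments(123, "01-01-2017 00:00:00",
                                   "01-01-2017 23:59:59_999999999"))
```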
Thanks for reading!
Ashfaq Chougle
Previous: Part Three: Wide Column Data Model - From SQL Server to Wide Column NoSQL