Saturday, December 31, 2016

Part Three: Wide Column Data Model - From SQL Server to Wide Column No-SQL





Hello Guys,

This blog is part three of an eight-part series on porting your RDBMS database to a NoSQL wide column key-value pair database. For the introduction, or part zero, you can click here.


The question arises: why did I provide the stored procedures in the topic Part One: Sample Use Cases? The stored procedures give us the output that has to be reported. When we learned software engineering in our colleges or institutes, we were told to start with the output; the output drives the input, and hence the data model too. With more experience we started creating data models meant to be future proof, by which I mean we started adding fields that are not used currently but that we think might make sense in the future. The NoSQL data model somewhat restricts that kind of thinking. Anyway, the stored procedures, which are generally used to generate the reports, are analyzed here to derive the data model and, later on, the input.

Also note that in NoSQL and other non-RDBMS databases, the data is modeled in a denormalized form. Since joins are costly, denormalization helps improve performance. On the other hand, denormalization produces redundancy and data duplication, and hence consumes a lot of space. If performance is the priority and resource availability is not an issue, then the redundancy and data duplication become irrelevant. In this blog we will discuss data modeling; how the data is physically stored is out of scope for this series. For that you can refer to the storage documentation of your NoSQL database.

With the stored procedures from part one we could come up with only two data models:
  1. Multiple Column Families
  2. Multiple Tables
There might be more, but we explored the above two. Let's discuss them in more detail.

Multiple Column Families: MapR-DB, a wide column key-value pair database, allows you to define multiple column families in a single table. The closest Cassandra equivalent is the super column family, or composite columns. I picture the key-value pairs in the following way:
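For example (the row key and values here are made up purely for illustration), every cell in a wide column table is ultimately stored as a key-value pair of RowKey + ColumnFamily:Qualifier (plus a timestamp per cell) mapping to a value, so a single row keyed by OrderID 1001 fans out into entries like:

  1001  C:FirstName  ->  John
  1001  C:LastName   ->  Doe
  1001  O:OrderNo    ->  ORD-1001
  1001  P:Amount     ->  49.99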


Many NoSQL databases like HBase are optimized to handle two to three column families, but the MapR professional services team confirmed that MapR-DB can handle a larger number of column families. We can leave that debate for later.


Now let's see what the first table with multiple column families looks like:

CustomerOrder Table:




Row Key: OrderID
Column Families:
  C (Customer): CustomerID, FirstName, LastName, BusinessName, EmailAddress, Designation
  O (Order): OrderNo, OrderDateTime, TotalAmount, GrossAmount, OrderStatus, ItemCount
  P (Payment): PaymentID, PaymentDateTime, PaymentMode, Name, Number, Amount, ExpiryDate, AuthDetails, Status
  S (Shipping Address): AddressID, Address1, Address2, City, State, Zip
  B (Billing Address): AddressID, Address1, Address2, City, State, Zip
  D (Discount): DiscountCode, GrossAmount, TotalAmount
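As a minimal sketch (assuming the HBase 1.x Java client API, which MapR-DB binary tables also expose, and reusing the table name from this example), the CustomerOrder table with its six one-letter column families could be created like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateCustomerOrderTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            HTableDescriptor table = new HTableDescriptor(TableName.valueOf("CustomerOrder"));
            // One-letter family names: HBase/MapR-DB store the family name
            // with every cell, so short names keep storage overhead down.
            for (String family : new String[] {"C", "O", "P", "S", "B", "D"}) {
                table.addFamily(new HColumnDescriptor(family));
            }
            admin.createTable(table);
        }
    }
}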


A few things to note here:
  • The data here is denormalized into different column families.
  • Column family names are just one letter each. MapR-DB and HBase store the column family name with every column, so the shorter the name, the less space it consumes.
  • ItemCount is stored in the O (Order) column family, which saves us an aggregation over the OrderDetails table (see the sketch below for one way to maintain it at write time).
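One way to keep that ItemCount current, sketched here under the assumption that order lines are written through the same Java API, is an atomic increment on the order's row whenever an order detail is inserted:

import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class OrderItemCounter {
    // Bumps O:ItemCount on the CustomerOrder row each time an order line
    // is inserted, so reads never need to aggregate over OrderDetails.
    public static void bumpItemCount(Connection connection, String orderId)
            throws IOException {
        try (Table table = connection.getTable(TableName.valueOf("CustomerOrder"))) {
            table.incrementColumnValue(Bytes.toBytes(orderId),
                    Bytes.toBytes("O"), Bytes.toBytes("ItemCount"), 1L);
        }
    }
}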
The CustomerOrder table data model will provide data for the following use cases:
  1. Fetch list of customer payments
  2. Fetch customer orders
  3. Fetch customer order summary
  4. Total discounts applied by discount codes
  5. Tableau connectivity
CustomerDetails Table:
Row Key: CustomerID
Column Families:
  C (Customer): CustomerID, FirstName, LastName, BusinessName, Designation, EmailAddress, JoinDateTime, AvatarImageURL
  P (Personal Address): AddressID, Address1, Address2, City, State, Zip, Type
  S (Shipping Address): AddressID, Address1, Address2, City, State, Zip, Type
  B (Billing Address): AddressID, Address1, Address2, City, State, Zip, Type


The CustomerDetails table will provide data for the 'Fetch list of customers' use case.
 
OrderDetails Table:
Row Key: OrderDetailsID
Column Families:
  O (Order): OrderID, CustomerID, OrderNo, OrderDateTime, TotalAmount, GrossAmount, OrderStatus
  OD (Order Details): ItemID, ItemName, UnitCost, Quantity, Amount, NetAmount, Status

The OrderDetails table does not serve any of our use cases directly, but we still have to store the data.

ItemDetails Table:
Row Key: ItemID
Column Families:
  I (Item): ItemName, Manufacturer, Unit, UnitCost, Discount, Quantity
  ID (Item last status): ItemName, LastOrderNo, LastOrderDateTime, LastQuantitySold, LastCustomerName, LastOrderID, LastPaymentReceived

Finally, the ItemDetails table provides data for the 'Fetch last item status' use case. The ID column family holds all the aggregated values that this use case requires.

The above data model is debatable! For the first use case, 'Fetch list of customer payments', the query in the stored procedure filters between dates. If we try that here, it results in a huge table scan, as OrderDateTime is not indexed. The same is true for every use case where the WHERE clause has search parameters other than the row key. In HBase and MapR-DB only the row key is indexed. Cassandra has a primary key as well as secondary indexes, but this model maps to its super column data model, where secondary indexes again are not possible. With the advent of composite columns, Cassandra has resolved the issue of secondary indexes; maybe HBase and MapR-DB should follow. Not to worry, though: the huge table scans can be avoided by creating secondary index tables whose row keys contain the search parameters and whose values are the row keys of the main table. In effect, we build our own index. We will discuss this in the next, very important part of this series: Row Key Design.
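To make the idea concrete, here is a hypothetical sketch of such a hand-built index. The table name CustomerOrderByDate and the X column family are my own inventions for illustration; alongside each CustomerOrder write, we also write an index row whose key leads with OrderDateTime and whose value is the main table's row key:

import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class OrderDateIndexWriter {
    // Maintains a do-it-yourself secondary index: the index row key leads
    // with the search parameter (OrderDateTime) and the cell value points
    // back at the CustomerOrder row key (OrderID).
    public static void indexOrder(Connection connection, String orderDateTime,
                                  String orderId) throws IOException {
        try (Table index = connection.getTable(TableName.valueOf("CustomerOrderByDate"))) {
            Put put = new Put(Bytes.toBytes(orderDateTime + "_" + orderId));
            put.addColumn(Bytes.toBytes("X"), Bytes.toBytes("OrderID"),
                    Bytes.toBytes(orderId));
            index.put(put);
        }
    }
}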

Multiple Tables: Another way of optimizing is to denormalize the data into multiple flat tables with appropriate row key designs. We will discuss row key design in the next part of this series. I will give only a few examples here to keep this topic short. If you want the data model for the remaining use cases, please add a comment and I will try to put up my best response.

Data model for 'Fetch list of Customer Payments'

Row Key: PaymentDateTime_CustomerID_PaymentID
Column Families:
  P: FirstName, LastName, BusinessName, PaymentDateTime, PaymentMode, Name, Number, Amount, ExpiryDate, AuthDetails, Status
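With PaymentDateTime leading the row key, the date filter in prcGetCustomerPayments becomes a plain row-key range scan rather than a full table scan. A minimal sketch, assuming the table is named CustomerPayments and the date-time component of the key is formatted to sort lexicographically (for example yyyyMMddHHmmss):

import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CustomerPaymentScan {
    public static void printPayments(Connection connection, String startDateTime,
                                     String endDateTime) throws IOException {
        try (Table table = connection.getTable(TableName.valueOf("CustomerPayments"))) {
            Scan scan = new Scan();
            scan.setStartRow(Bytes.toBytes(startDateTime));
            // '~' sorts after '_' and all digits, so this stop row covers
            // every CustomerID_PaymentID suffix within the end date.
            scan.setStopRow(Bytes.toBytes(endDateTime + "~"));
            try (ResultScanner results = table.getScanner(scan)) {
                for (Result row : results) {
                    System.out.println(Bytes.toString(row.getRow()) + " -> "
                            + Bytes.toString(row.getValue(Bytes.toBytes("P"),
                                    Bytes.toBytes("Amount"))));
                }
            }
        }
    }
}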

Data model for 'Fetch customer orders'
Row Key: OrderDateTime_CustomerID_OrderID
Column Families:
  O: OrderNo, OrderDateTime, OrderStatus, TotalAmount, FirstName, LastName, BusinessName,
     ShippingAddress1, ShippingAddress2, ShippingCity, ShippingState, ShippingZip,
     PaymentDateTime, PaymentMode, Name, Number, Amount, Status,
     BillingAddress1, BillingAddress2, BillingCity, BillingState, BillingZip

Data model for 'Fetch last item status'

Row Key: ItemID
Column Families:
  ID: ItemName, LastOrderNo, LastOrderDateTime, LastQuantitySold, LastCustomerName, LastOrderID, LastPaymentReceived
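Because the row key here is simply ItemID, 'Fetch last item status' collapses into a single point read. A sketch, assuming the table is named LastItemStatus:

import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class LastItemStatusReader {
    // The whole prcGetLastItemStatus procedure becomes one Get: every
    // pre-aggregated "Last*" value lives in the ID family of a single row.
    public static Result fetchLastItemStatus(Connection connection, String itemId)
            throws IOException {
        try (Table table = connection.getTable(TableName.valueOf("LastItemStatus"))) {
            Get get = new Get(Bytes.toBytes(itemId));
            get.addFamily(Bytes.toBytes("ID"));
            return table.get(get);
        }
    }
}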

We have created three tables for three use cases; that said, chances are we will end up creating a table for each use case. Note that the row key structure has changed, and note the order in which each field in the row key is positioned. There is a reason for that, which we will discuss in more detail in the next part.

Now, if we compare Multiple Column Families and Multiple Tables, here are a few things to ponder:
  1. Writes: Multiple Column Families may decrease the number of writes compared to Multiple Tables.
  2. Reads: If Multiple Column Families is not modeled properly, it can end up requiring multiple reads and joins in the code or business logic layer, whereas Multiple Tables tends to give a straightforward read.
  3. Deserialization: Multiple Column Families can force unnecessary deserialization of every column in a family when you only need one or two columns from each of several families, compared with reading one or two columns in Multiple Tables (see the sketch after this list). So while creating a data model it makes sense not to add unnecessary columns, which in turn makes the data model less future proof.
  4. Redundancy: Multiple Column Families to some extent resolve redundancy, though it all depends on how they are modeled. Multiple Tables, on the other hand, increase redundancy and data duplication, as the same data is written to many tables.
  5. Consistency: Multiple Column Families are to some extent more consistent than Multiple Tables. If, on an update, a value is not replaced in all the tables of the Multiple Tables model (say a few tables are missed by mistake), the data ends up inconsistent. The same goes for inserts if a few tables are missed.
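Regarding point 3, the read API lets you name the exact columns you want, which keeps a multi-family read narrow. A sketch against the CustomerOrder table from earlier:

import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class NarrowRead {
    // Fetches just two columns from two different families of one
    // CustomerOrder row instead of pulling both families in full.
    public static Result fetchNameAndAmount(Connection connection, String orderId)
            throws IOException {
        try (Table table = connection.getTable(TableName.valueOf("CustomerOrder"))) {
            Get get = new Get(Bytes.toBytes(orderId));
            get.addColumn(Bytes.toBytes("C"), Bytes.toBytes("FirstName"));
            get.addColumn(Bytes.toBytes("O"), Bytes.toBytes("TotalAmount"));
            return table.get(get);
        }
    }
}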
There are more complex points to ponder as well; we will discuss them in the future. Keep in mind that everything has advantages and disadvantages, and it is on us to decide which trade-off is more balanced and more useful.

Thanks for reading!

Ashfaq Chougle

Previous: Part Two: Data Model - From SQL Server to Wide Column No-SQL
Next: Part Four: Row Key Design - From SQL Server to Wide Column No-SQL

Part Two: Data Model - From SQL Server to Wide Column No-SQL





Hello Guys,

This blog is part two of an eight-part series on porting your RDBMS database to a NoSQL wide column key-value pair database. For the introduction, or part zero, you can click here.


We already have a data model designed for the series. For the sake of continuity I will republish it here. The stored procedures outlined in Part One: Sample Use Cases follow this same data model.

Customer:
CustomerID
FirstName
LastName
BusinessName
Designation
EmailAddress
JoinDateTime
AvatarImageURL


Item:
ItemID
ItemName
Manufacturer
Unit
UnitCost
Discount
Quantity


Address:
AddressID
CustomerID
AddressType
Address1
Address2
City
State
Zip


Order:
OrderID
OrderNo
CustomerID
OrderDateTime
ShippingAddressID
GrossAmount
DiscountCode
TotalAmount
OrderStatus






Order Details:
OrderDetailsID
OrderID
ItemID
UnitCost
Quantity
Amount
NetAmount
Status




Payments:
PaymentsID
CustomerID
OrderID
BillingAddressID
PaymentDateTime






PaymentDetails:
PaymentDetailsID
PaymentsID
PaymentMode
Name
Number
Amount
ExpiryDate
AuthDetails
Status

Expect part three soon, as the data model was ready at the beginning of the series.

Thanks for reading!

Ashfaq Chougle

Previous: Part One: Sample Use Cases - From SQL Server to Wide Column No-SQL

Next: Part Three: Wide Column Data Model - From SQL Server to Wide Column No-SQL 

Sunday, December 25, 2016

Part One: Sample Use Cases - From SQL Server to Wide Column No-SQL




Hello Guys,

This blog is part one of an eight-part series on porting your RDBMS database to a NoSQL wide column key-value pair database. For the introduction, or part zero, you can click here.

In this blog, I am presenting the thought process involved in selecting our sample use cases. As pointed out in the introduction, the main intention of the POC was to make sure the Hadoop ecosystem could help us improve the performance of our reporting web application and analytics application. Also, recall that we had lots of intelligent and complex stored procedures. Again, I will be using this sample data model to explain the use cases. Assume the sample database has a high transaction volume, handling millions of orders a month.

Following is the list of sample use cases we selected:
  1. Fetch a large number of records, on the order of tens of millions.
    We wanted to make sure that the new data architecture in the Hadoop ecosystem is capable of fetching large data sets. These are typically cases of a data dump requested by users.
    Fetch list of customer payments:

    CREATE PROCEDURE prcGetCustomerPayments
    (
         @StartDateTime DateTime
        ,@EndDateTime DateTime
    )
    AS
    BEGIN
        SET NOCOUNT ON;
        SELECT
             C.FirstName
            ,C.LastName
            ,C.BusinessName
            ,PaymentDateTime
            ,PaymentMode
            ,Name
            ,Number
            ,Amount
            ,ExpiryDate
            ,AuthDetails
            ,Status
        FROM Customer C WITH (NOLOCK)
        JOIN Payments P WITH (NOLOCK) ON C.CustomerID = P.CustomerID
            AND PaymentDateTime BETWEEN @StartDateTime AND @EndDateTime
        JOIN PaymentDetails PD WITH (NOLOCK) ON P.PaymentsID = PD.PaymentsID
    END
  2. Fetch a small number of records, on the order of tens or hundreds.
    We also wanted to make sure that the implemented architecture not only fetches large data sets but is also capable of fetching smaller data sets quickly.
    Fetch list of customers:

    CREATE PROCEDURE prcGetCustomerList
    AS
    BEGIN
        SET NOCOUNT ON
        SELECT
             C.CustomerID
            ,C.FirstName
            ,C.LastName
            ,C.BusinessName
            ,C.EmailAddress
            ,A.Address1
            ,A.Address2
            ,A.City
            ,A.State
            ,A.Zip
        FROM Customer C WITH (NOLOCK)
        JOIN Address A WITH (NOLOCK) ON C.CustomerID = A.CustomerID
            AND A.AddressType = 1 --Personal Address

    END
  3. Fetch data from multiple tables.
    We just wanted to verify that the architecture can fetch data from multiple tables with optimum performance.
    Fetch last item status:

    CREATE PROCEDURE prcGetLastItemStatus
    (
        @ItemID INT
    )
    AS
    BEGIN
        SET NOCOUNT ON
       
        DECLARE @ItemName VARCHAR(100)
        DECLARE @LastOrderNo VARCHAR(15)
        DECLARE @LastOrderDateTime SMALLDATETIME
        DECLARE @LastQuantitySold INT
        DECLARE @LastCustomerName VARCHAR(100)
        DECLARE @ShippingLastAddress1 VARCHAR(50)
        DECLARE @ShippingLastAddress2 VARCHAR(50)
        DECLARE @ShippingLastCity VARCHAR(50)
        DECLARE @ShippingLastState VARCHAR(50)
        DECLARE @ShippingLastZip VARCHAR(10)
        DECLARE @LastPaymentReceived SMALLMONEY

        SELECT
            @ItemName = ItemName
        FROM Item WITH (NOLOCK)
        WHERE ItemID = @ItemID

        DECLARE @OrderID INT
        DECLARE @CustomerID INT
        DECLARE @ShippingAddressID INT

        SELECT
            @OrderID = MAX(OrderID)
        FROM OrderDetails WITH (NOLOCK)
        WHERE ItemID = @ItemID

        SELECT
             @LastOrderNo = OrderNo
            ,@LastOrderDateTime = OrderDateTime
            ,@LastQuantitySold = Quantity
            ,@CustomerID = CustomerID
            ,@ShippingAddressID = ShippingAddressID
        FROM [Order] O WITH (NOLOCK)
        JOIN OrderDetails OD WITH (NOLOCK) ON O.OrderID = OD.OrderID AND OD.ItemID = @ItemID AND O.OrderID = @OrderID

        SELECT
            @LastCustomerName = FirstName + ' ' + LastName
        FROM Customer C WITH (NOLOCK)
        WHERE CustomerID = @CustomerID

        SELECT
             @ShippingLastAddress1 = Address1
            ,@ShippingLastAddress2 = Address2
            ,@ShippingLastCity = City
            ,@ShippingLastState = State
            ,@ShippingLastZip = Zip
        FROM Address WITH (NOLOCK)
        WHERE AddressID = @ShippingAddressID

        DECLARE @PaymentDetailsID INT

        SELECT @PaymentDetailsID = MAX(PaymentDetailsID)
        FROM PaymentDetails PD WITH (NOLOCK)
        JOIN Payments P WITH (NOLOCK) ON PD.PaymentsID = P.PaymentsID AND P.OrderID = @OrderID

        SELECT
            @LastPaymentReceived = Amount
        FROM PaymentDetails WITH (NOLOCK)
        WHERE PaymentDetailsID = @PaymentDetailsID

        SELECT
             ItemName             = @ItemName
            ,LastOrderNo          = @LastOrderNo
            ,LastOrderDateTime    = @LastOrderDateTime
            ,LastQuantitySold     = @LastQuantitySold
            ,LastCustomerName     = @LastCustomerName
            ,ShippingLastAddress1 = @ShippingLastAddress1
            ,ShippingLastAddress2 = @ShippingLastAddress2
            ,ShippingLastCity     = @ShippingLastCity
            ,ShippingLastState    = @ShippingLastState
            ,ShippingLastZip      = @ShippingLastZip
            ,LastPaymentReceived  = @LastPaymentReceived

    END
  4. Fetch a large number of records (multiple table joins), on the order of tens of millions.
    Another thing we wanted to know: how does the architecture help in optimizing multiple joins?
    Fetch customer orders:

    CREATE PROCEDURE prcGetCustomerOrders
    (
         @StartDateTime DateTime
        ,@EndDateTime DateTime
    )
    AS
    BEGIN
        SET NOCOUNT ON;
        SELECT
             O.OrderNo
            ,O.OrderDateTime
            ,O.OrderStatus
            ,O.TotalAmount
            ,C.FirstName
            ,C.LastName
            ,C.BusinessName
            ,A.Address1
            ,A.Address2
            ,A.City
            ,A.State
            ,A.Zip
            ,PaymentDateTime
            ,PaymentMode
            ,Amount
            ,Status
            ,BAddress1 = BA.Address1
            ,BAddress2 = BA.Address2
            ,BCity = BA.City
            ,BState = BA.State
            ,BZip = BA.Zip
        FROM Customer C WITH (NOLOCK)
        JOIN [Order] O WITH (NOLOCK)
            ON O.CustomerID = C.CustomerID
                AND OrderDateTime BETWEEN @StartDateTime AND @EndDateTime
        JOIN Address A WITH (NOLOCK)
            ON O.ShippingAddressID = A.AddressID
        LEFT JOIN Payments P WITH (NOLOCK) ON O.OrderID = P.OrderID
        LEFT JOIN PaymentDetails PD WITH (NOLOCK) ON P.PaymentsID = PD.PaymentsID
        LEFT JOIN Address BA WITH (NOLOCK)
            ON P.BillingAddressID = BA.AddressID
    END
  5. In-memory aggregation over a large number of records.
    Is the architecture able to handle in-memory aggregation for large volumes of data?
    Fetch customer order summary:

    CREATE PROCEDURE prcGetOrderSummary
    (
         @StartDateTime DateTime
        ,@EndDateTime DateTime
    )
    AS
    BEGIN
        SET NOCOUNT ON;

        SELECT
             O.CustomerID
            ,TotalDiscount = SUM(CASE WHEN DiscountCode IS NOT NULL THEN O.GrossAmount - O.TotalAmount ELSE 0 END)
            ,TotalAmount = SUM(O.TotalAmount)
            ,OrderCount = COUNT(1)
            ,FirstOrder = MIN(OrderDateTime)
            ,LastOrder = MAX(OrderDateTime)
        INTO #TempOrders
        FROM [Order] O WITH (NOLOCK)
        WHERE O.OrderDateTime BETWEEN @StartDateTime AND @EndDateTime
        GROUP BY O.CustomerID

        SELECT
             O.CustomerID
            ,ItemCount = COUNT(1)
        INTO #TempOrderDetails
        FROM [Order] O WITH (NOLOCK)
        JOIN OrderDetails OD WITH (NOLOCK) ON O.OrderID = OD.OrderID
            AND O.OrderDateTime BETWEEN @StartDateTime AND @EndDateTime
        GROUP BY O.CustomerID


        SELECT
             O.CustomerID
            ,PaymentCount = COUNT(1)
            ,FirstPayment = MIN(PaymentDateTime)
            ,LastPayment = MAX(PaymentDateTime)
        INTO #TempPayments
        FROM [Order] O WITH (NOLOCK)
        JOIN Payments P WITH (NOLOCK) ON O.OrderID = P.OrderID
            AND O.OrderDateTime BETWEEN @StartDateTime AND @EndDateTime
        GROUP BY O.CustomerID

        SELECT
             O.CustomerID
            ,TotalPayment = SUM(CASE WHEN Status = 1 THEN Amount ELSE 0 END)
        INTO #TempPaymentDetails
        FROM [Order] O WITH (NOLOCK)
        JOIN Payments P WITH (NOLOCK) ON O.OrderID = P.OrderID
            AND O.OrderDateTime BETWEEN @StartDateTime AND @EndDateTime
        JOIN PaymentDetails PD WITH (NOLOCK) ON P.PaymentsID = PD.PaymentsID
        GROUP BY O.CustomerID

        SELECT
             C.FirstName
            ,C.LastName
            ,C.BusinessName
            ,A.Address1
            ,A.Address2
            ,A.City
            ,A.State
            ,A.Zip
            ,O.OrderCount
            ,O.TotalAmount
            ,O.TotalDiscount
            ,O.FirstOrder
            ,O.LastOrder
            ,OD.ItemCount
            ,P.PaymentCount
            ,PD.TotalPayment
            ,P.FirstPayment
            ,P.LastPayment
            ,Balance = O.TotalAmount - PD.TotalPayment
        FROM Customer C WITH (NOLOCK)
        JOIN Address A WITH (NOLOCK)
            ON A.CustomerID = C.CustomerID
                AND A.AddressType = 1 --Personal Address
        JOIN #TempOrders O WITH (NOLOCK) ON O.CustomerID = C.CustomerID
        JOIN #TempOrderDetails OD WITH (NOLOCK) ON OD.CustomerID = C.CustomerID
        LEFT JOIN #TempPayments P WITH (NOLOCK) ON P.CustomerID = C.CustomerID
        LEFT JOIN #TempPaymentDetails PD WITH (NOLOCK) ON PD.CustomerID = C.CustomerID
       
    END
  6. In-memory aggregation over a smaller number of records.
    How does it perform on a smaller data set with heavy aggregation?
    Total discounts applied by discount codes:

    CREATE PROCEDURE prcGetDiscountsAppliedByCode
    (
         @StartDateTime DateTime
        ,@EndDateTime DateTime
    )
    AS
    BEGIN
        SET NOCOUNT ON

        SELECT
              DiscountCode
            ,Amount = SUM(GrossAmount - TotalAmount)
            ,Count = COUNT(1)
        FROM [Order] WITH (NOLOCK)
        WHERE OrderDateTime BETWEEN @StartDateTime AND @EndDateTime
        GROUP BY DiscountCode
    END
  7. Finally, a data set for Tableau connectivity.
    We just want to verify how Tableau can connect to the new data model.
    CREATE VIEW vwAllCustomerOrders
    AS

        SELECT
             O.OrderNo
            ,O.OrderDateTime
            ,O.OrderStatus
            ,O.TotalAmount
            ,C.FirstName
            ,C.LastName
            ,C.BusinessName
            ,A.Address1
            ,A.Address2
            ,A.City
            ,A.State
            ,A.Zip
            ,PaymentDateTime
            ,PaymentMode
            ,Amount
            ,Status
            ,BAddress1 = BA.Address1
            ,BAddress2 = BA.Address2
            ,BCity = BA.City
            ,BState = BA.State
            ,BZip = BA.Zip
        FROM Customer C WITH (NOLOCK)
        JOIN [Order] O WITH (NOLOCK)
            ON O.CustomerID = C.CustomerID
        JOIN Address A WITH (NOLOCK)
            ON O.ShippingAddressID = A.AddressID
        LEFT JOIN Payments P WITH (NOLOCK) ON O.OrderID = P.OrderID
        LEFT JOIN PaymentDetails PD WITH (NOLOCK) ON P.PaymentsID = PD.PaymentsID
        LEFT JOIN Address BA WITH (NOLOCK)
            ON P.BillingAddressID = BA.AddressID


Thanks for reading!

Ashfaq Chougle

Previous: Part Zero: Introduction - From SQL Server to Wide Column No-SQL
Next: Part Two: Data Model - From SQL Server to Wide Column No-SQL 

Saturday, December 17, 2016

Part Zero: Introduction - From SQL Server to Wide Column No-SQL




Hello Guys,

This blog is part zero of an eight-part series on porting your RDBMS database to a NoSQL wide column key-value pair database.

Just an introduction!

To begin with, I have been working on a use case where the reporting web application and the analytics application were to be segregated and pointed at another, highly scalable and reliable database. Currently the applications point to the transactional database, where IoT devices continuously pump in data, and the amount of data being queried by the reporting applications is enormous. The web application is built on .NET, and MS SQL Server is the transactional database. Lots of logic is embedded in stored procedures, and there are more than a thousand of them. Having said that, I would not mind calling it an intelligent database.

After lots of research and discussion, we decided to leverage the Hadoop ecosystem for reporting and analytics. With expertise in Microsoft technology and a background in RDBMS data modeling, you are bound to be skeptical about how the Hadoop ecosystem will solve your problem, and the more you read about it, the more confused you get. Also worth noting: there is nothing like stored procedures in the Hadoop ecosystem. So the question was, where do I port the logic embedded in the stored procedures? Anyway, I started looking at strategies to make the Hadoop ecosystem part of the infrastructure.

We started speaking with consulting firms and looking at various flavors of the Hadoop ecosystem. MapR also approached us and provided a workable architecture. Logically it made sense, but we wanted to get a feel for it and see it working, so we decided to implement a few use cases as a proof of concept (POC). MapR professional services (PS) helped us implement the POC. We looked at options such as MapR-FS (HDFS) and MapR-DB (HBase), and MapR-DB was found more feasible for our use case, so we started architecting the solution around it. HDFS is good for static data sets, or data sets that can be refreshed or synced at low frequency or long intervals. MapR-DB, on the other hand, is good for a high volume transactional database or near real time sync. Also note that MapR-DB is a multi-model NoSQL database: its tables can be multi-column-family wide column key-value pairs, or it can act as a document database.

One thing I learned and would like to share: there is no way to port stored procedures into the Hadoop ecosystem, so do not waste time researching it. I will be sharing the steps we executed. For now I am just listing them, but I will explain each step more thoroughly later, to provide a clear picture. Here are the steps:

  1. List sample use cases or reports of different categories or types
  2. Model the sample data set as per the use cases
  3. Create the wide column key-value pair data model. I will be using this sample RDBMS data model to explain and create the wide column key-value pair data model.
  4. Design the row keys
  5. Import the data
  6. Create a C# application to fetch data from MapR-DB
  7. Implement change data capture (CDC) for a continuous flow of data
  8. Using Drill, create views for Tableau live connections
Steps 3, 4 and 5 are very important. If the NoSQL data model is implemented well, expect high throughput. A poorly designed model gives low throughput, much like a badly designed RDBMS data model or a high volume table with the wrong indexes.

The above steps are generic and should work for HBase or any other NoSQL wide column key-value pair database. I hope this helps you port your application to the Hadoop ecosystem or another NoSQL database.

Thanks for reading!

Ashfaq Chougle

Continue to Part One: Sample Use Cases - From SQL Server to Wide Column No-SQL