Passionate about technology and startups. Have worked for the BBC in London, Livedoor.com in Japan, Cloudera in San Francisco and MailChannels in Vancouver, Canada. Currently reside in Vancouver where I'm working on building PaaS based on Cloud Foundry, called Stackato. I enjoy writing about technology, especially when it relates to interesting startups. Phil is a DZone MVB and is not an employee of DZone and has posted 39 posts at DZone. You can read more from them at their website.

"NoSQL Distilled" Reviewed!

10.03.2012
Published by:
ISBN: 0321826620

Reviewer Ratings

Relevance:
5

Readability:
4

Overall:
4

One Minute Bottom Line

 

Pramod and Martin cover the key questions surrounding NoSQL very well, starting with the obvious "Why NoSQL?" and moving on to the fundamentals: what is a document database, a column-family data store, or a graph database? But it goes much deeper than that.

 

Review

In this post I am going to review Pramod J. Sadalage and Martin Fowler's new book entitled "NoSQL Distilled - A Brief Guide to the Emerging World of Polyglot Persistence."

An Overview

Pramod and Martin cover the key questions surrounding NoSQL very well, starting with the obvious "Why NoSQL?" and moving on to the fundamentals: what is a document database, a column-family data store, or a graph database? But it goes much deeper than that.

If you have studied NoSQL databases or any other type of data-store, then phrases such as "The CAP Theorem" and "Map-Reduce" will be familiar to you. If they are not, there are good explanations to get you up to speed. The CAP Theorem discussion, for example, runs to about four pages and includes a diagram. The book builds on these topics as the authors look at the consistency, transactions, query features, data structures and scalability of each of the different data storage types.

NoSQL databases covered in this book include Riak, MongoDB, Cassandra, HBase and many others. CouchDB, Terrastore, OrientDB, and RavenDB are mentioned, too. In fact, if you just look at the book's index, it seems that there is at least one mention of every data-store known to man. The authors know them all and touch on each at just the right time, so that you see where it fits in the world. It is, as they describe, "a broad survey of the NoSQL world."

Generally, in each topic area, such as the discussion on "column-families," one data-store is chosen to take the lead for the purpose of explanation and demonstration. In the case of "column-families" it is Cassandra that gets the spotlight. It is pointed out that HBase, Hypertable, and Amazon DynamoDB also use column-families, but the focus is on the fundamentals rather than the specific technology solution.

I particularly like the "Suitable Use Cases" and "When Not to Use" sections in the chapters "Key-Value Databases," "Document Databases," "Column-Family Stores" and "Graph Databases." Just reading these sections will tell you quickly if your choice of technology is a good one or not.

Most chapters conclude with some "Further Reading" and coverage of the "Key Points," so you can be sure to hone your knowledge in each topic area.

Chapter-By-Chapter

Let's take a look at how the book is broken up into chapters.

Part I, "Understand," covers chapters 1 to 7.

Part II, "Implement," covers the remaining chapters.

Chapter 1: Why NoSQL?

This chapter looks at NoSQL, starting from the viewpoint of relational databases, and hits on the following points:

  •       Persistence of data
  •       Concurrency
  •       Integration
  •       Clustering

The chapter ends with "The Emergence of NoSQL," which brings us to the core topic of this book.

Chapter 2: Aggregate Data Models

This chapter looks at the "aggregate orientation" of key-value, document and column-family NoSQL data-stores as we move away from the relational model. It also looks at how data that is commonly accessed together is stored together in NoSQL data-stores.
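
As a rough illustration of the aggregate idea (a sketch, not an example taken from the book), an e-commerce order can be held as one nested structure, much as a document or key-value store might hold it. The field names and the `order_total` helper are assumptions made for this example.

```python
# Hypothetical order aggregate: data that is read together is stored together.
order = {
    "order_id": 1001,
    "customer": {"id": 42, "name": "Ann"},
    "items": [
        {"sku": "book-nosql", "quantity": 1, "price": 30.0},
        {"sku": "pen", "quantity": 2, "price": 1.5},
    ],
    "shipping_address": {"city": "Vancouver", "country": "CA"},
}


def order_total(aggregate):
    """Sum the line items using only the aggregate itself (no joins)."""
    return sum(item["quantity"] * item["price"] for item in aggregate["items"])


print(order_total(order))
```

In a relational schema the same order would typically be spread over several tables and reassembled with joins; here a single lookup by `order_id` would return everything needed.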

Chapter 3: More Details on Data Models

Here we dig into the different types of organization you might find in different data-stores.

  •       Relationships
  •       Graphs
  •       Schemaless
  •       Indexes
  •       Data Access

Chapter 4: Distribution Models

This chapter looks at scaling out vs. scaling up and covers the following topics.

  •       Sharding
  •       Master-slave replication
  •       Peer-to-peer replication
  •       Sharding plus replication
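
To give a feel for how sharding and replication can be combined, here is a minimal sketch that assumes simple hash-based placement. The node names, the replica count and the `placement` function are hypothetical, and real data-stores use more elaborate schemes.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster
REPLICAS = 2  # number of copies kept of each key (assumption)


def placement(key, nodes=NODES, replicas=REPLICAS):
    """Return the nodes that hold `key`: its shard plus the following replicas."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)  # sharding: the hash picks the first node
    # Replication: also store the key on the next nodes, wrapping around.
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


print(placement("user:42"))
```

The same key always maps to the same nodes, so any client can work out where to read or write without consulting a central directory.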

Chapter 5: Consistency

This is where we take it up a gear as we hit on consistency and The CAP Theorem:

  •       Write-write conflicts
  •       Lost updates
  •       Conditional updates
  •       Inconsistent reads
  •       Read-write conflict
  •       Inconsistency window
  •       Replication consistency
  •       Eventual consistency
  •       Staleness of data
  •       Relaxing consistency or durability
  •       The CAP Theorem

Chapter 6: Version Stamps

A common criticism of NoSQL is the lack of transaction support. Pramod and Martin look at how NoSQL systems deal with this using version stamps and how this works when you have multiple nodes.
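
The version stamp idea can be sketched as a conditional update: a write succeeds only if the record still carries the version the writer originally read. The `VersionedStore` class below is a hypothetical, single-node, in-memory illustration. It uses a simple counter as the stamp and does not attempt to handle the multi-node case.

```python
class VersionConflict(Exception):
    """Raised when a record changed since the caller last read it."""


class VersionedStore:
    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        """Return (version, value); missing keys are reported as version 0."""
        return self._data.get(key, (0, None))

    def put(self, key, value, expected_version):
        """Write only if the stored version matches the one the caller read."""
        current_version, _ = self.get(key)
        if current_version != expected_version:
            raise VersionConflict(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._data[key] = (current_version + 1, value)
        return current_version + 1


store = VersionedStore()
version, _ = store.get("cart:7")
store.put("cart:7", ["book"], expected_version=version)  # succeeds
try:
    store.put("cart:7", ["pen"], expected_version=version)  # stale stamp
except VersionConflict as err:
    print("rejected:", err)
```

The second write is rejected rather than silently overwriting the first, which is how a lost update is avoided without a full transaction.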

Chapter 7: Map-Reduce

The chapter explains the basic idea behind map-reduce and gives examples. The following map-reduce topics are covered.

  •       Partitioning and combining
  •       Composing calculations
  •       Multiple stage and incremental map-reduce
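
The basic map, group-by-key and reduce steps can be sketched in a few lines of single-process Python. The order data and function names below are assumptions made for illustration; a real system would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict

# Hypothetical order aggregates used as input.
orders = [
    {"items": [{"product": "tea", "quantity": 2, "price": 3.0},
               {"product": "mug", "quantity": 1, "price": 5.0}]},
    {"items": [{"product": "tea", "quantity": 1, "price": 3.0}]},
]


def map_order(order):
    """Map: emit one (product, revenue) pair per line item."""
    for item in order["items"]:
        yield item["product"], item["quantity"] * item["price"]


def reduce_revenue(values):
    """Reduce: combine all values emitted for one key."""
    return sum(values)


def map_reduce(records):
    groups = defaultdict(list)
    for record in records:
        for key, value in map_order(record):
            groups[key].append(value)  # group the emitted values by key
    return {key: reduce_revenue(values) for key, values in groups.items()}


print(map_reduce(orders))
```

Because the reduce step here is a plain sum, partial sums computed on separate nodes could be combined later, which is the property that partitioning and combining rely on.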

Chapter 8: Key-Value Databases

This is the first chapter of Part II, "Implement." Now that we understand the concepts from Part I, we can start to apply what we know.

The general concepts behind a key-value data-store are given and consistency, transactions and query features are addressed.

Chapter 9: Document Databases

Many popular NoSQL data-stores, such as MongoDB, use the "document" concept. The chapter examines what a "document database" is. Again, we look at consistency, transactions and query features. We also look at availability and focus in on MongoDB's "replica sets."

Suitable use cases discussed include the following.

  •       Event logging
  •       Content Management Systems
  •       Blogging platforms
  •       Real-time Analytics

Chapter 10: Column-Family Stores

I have seen many developers get confused over this aspect of NoSQL databases, so I am glad it receives a chapter.

The chapter explains what a column-family is and covers consistency, transactions and query features.

Suitable use cases discussed have some overlap with Document Databases, but also include the following.

  •       Counters
  •       Expiring Usage

Chapter 11: Graph Databases

The ideas of nodes, edges and node properties are introduced in this chapter on graph databases.

Suitable use cases include the following:

  •       Connected data
  •       Routing (e.g. satellite navigation)
  •       Location-based services
  •       Recommendation engines
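
As a small illustration of the routing use case, the sketch below models places as nodes and roads as edges, then finds a route with the fewest hops using breadth-first search. The cities, the `EDGES` mapping and the `shortest_route` function are hypothetical, and a graph database would express this through its own query language rather than hand-written traversal code.

```python
from collections import deque

# Hypothetical road network: node -> directly connected nodes.
EDGES = {
    "Vancouver": ["Seattle", "Calgary"],
    "Seattle": ["Vancouver", "Portland"],
    "Calgary": ["Vancouver"],
    "Portland": ["Seattle", "San Francisco"],
    "San Francisco": ["Portland"],
}


def shortest_route(start, goal, edges=EDGES):
    """Breadth-first search: return a path with the fewest hops, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in edges.get(path[-1], []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


print(shortest_route("Vancouver", "San Francisco"))
```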

Chapter 12: Schema Migrations

NoSQL databases have a schemaless nature. First, this chapter steps back and looks at relational database schemas and how changes to those schemas are generally handled. We then look at how this relates to changes in a NoSQL data-store's "schema," which is effectively managed by the application code.

The following topics are covered.

  • Incremental migration
  • Migrations in graph databases
  • Changing aggregate structure
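
Since the "schema" lives in the application code, an incremental migration can be sketched as a function that upgrades old documents as they are read. The field names, the `schema_version` marker and the `migrate_customer` function are assumptions made for this example, not something prescribed by the book.

```python
def migrate_customer(doc):
    """Upgrade a customer document to the hypothetical schema version 2 on read.

    Version 1 stored a single "name" field; version 2 splits it in two.
    """
    if doc.get("schema_version", 1) >= 2:
        return doc  # already in the new shape
    first, _, last = doc.get("name", "").partition(" ")
    upgraded = {key: value for key, value in doc.items() if key != "name"}
    upgraded.update(
        {"first_name": first, "last_name": last, "schema_version": 2}
    )
    return upgraded


old_doc = {"id": 1, "name": "Ann Lee"}
print(migrate_customer(old_doc))
```

Documents are converted only when they are touched, so old and new shapes can coexist in the data-store while the migration gradually completes.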

Chapter 13: Polyglot Persistence

Chapter 13 looks at the requirements of different aspects of an application and at using different data-stores for different requirements. For instance, e-commerce session management storage does not require the same rigorous backup and recovery strategy as the e-commerce orders data.

Using the e-commerce example, we look at how we can use multiple data-store types across our application, with each data-store type used for the task it is designed for.

Topics covered include the following:

  • Enterprise concerns with polyglot persistence
  • Deployment complexity

Chapter 14: Beyond NoSQL

Chapter 14 looks outside the NoSQL world at technologies that closely relate to it, such as Google File System and Hadoop Distributed File System. We also look at how events across your architecture are managed and updates are triggered.

I found the idea of versioning your data store an interesting one, and it is a topic I would like to dig into further. I would like to see more use cases for it.

XML databases and Object databases are still alive and kicking and get some attention in this chapter.

Chapter 15: Choosing Your Database

This is the concluding chapter of the book and reflects on motivations for choosing your databases and experimenting with new ones.

The authors admit that it can take a team months to understand the full implications of a new technology, and that running several in parallel is not always feasible or economically viable. I agree with the statement, "The [development] team doing the work should decide."

Some considerations discussed include:

  • Data-access performance
  • Developer performance and agility
  • Sticking with what you know and what is stable
  • NoSQL is not always the best solution

 

 

Published at DZone with permission of Phil Whelan, author and DZone MVB.
