SQL Zone is brought to you in partnership with:

I am the founder and CEO of Data Geekery GmbH, located in Zurich, Switzerland. With our company, we have been selling database products and services around Java and SQL since 2013. Ever since my Master's studies at EPFL in 2006, I have been fascinated by the interaction of Java and SQL. Most of this experience I have obtained in the Swiss E-Banking field through various variants (JDBC, Hibernate, mostly with Oracle). I am happy to share this knowledge at various conferences, JUGs, in-house presentations and on our blog. Lukas is a DZone MVB and is not an employee of DZone and has posted 242 posts at DZone. You can read more from them at their website. View Full User Profile

SQL Performance Explained

09.17.2013
| 6093 views |
  • submit to reddit
Published by:
ISBN: 9783950307825

Reviewer Ratings

Relevance:
5

Readability:
5

Overall:
5

Buy it now

One Minute Bottom Line

SQL databases have become incredibly fast in parsing and executing even very complex SQL statements. But to many SQL developers, writing performing SQL is still a mystery. This book covers 90 percent of what any SQL developer should know in very simple terms. A must-read for all SQL beginners and also for pros!

Review

SQL is a very complex topic. While most developers still interact with SQL in one way or another, many find it an increasingly hard to master the technology because SQL seems to perform so much magic.

Markus Winand, creator of http://use-the-index-luke.com, busts a lot of myths by explaining in very simple terms that all you need to understand is proper indexing. His book "SQL Performance Explained," which is available also in German and French, reads like a very easy text book, starting by explaining the most popular and most common indexing structure: the B-Tree. From there on, the book puts new things in logical relation to topics already covered. As such, it helps the reader find his or her way to understanding the most complex problems faced in everyday SQL development. At the same time, the amount of information per page is extremely dense, making every sentence an important piece of the puzzle. This is quite remarkable for such a technical book, dealing with such a dry matter.

Markus proves to be very knowledgeable in his writings, not only with respect to SQL as a language itself, but also concerning software architecture in general. He puts database interaction in relation to overall scalability, and also knows his ways with ORMs in various languages. This is interesting for those developers who are using “higher levels” of abstraction over their database, and are thus facing the fine-tuning of third-party generated SQL.

The book really shines once the reader goes into the depths of understanding performance implications of proper indexing to each of the SQL SELECT statement’s various clauses. Even after years of tuning Oracle execution plans, I’ve had many “aha” effects when learning about things like:

  • How Oracle’s not indexing NULL can lead to very surprising full table scan operations, because Oracle has to check the table for records that are not in the index.
  • How using bind values becomes increasingly important when many JOIN operations are involved, because the query optimizer complexity increases with the factorial of the number of joined tables.
  • How nested loops, hash joins, and sort merge joins really work, and when they work best.
  • How to design queries to produce index-only scans, when all columns from the predicates and the projection are indeed contained in an applicable index.

The most interesting “aha” effect to me, however, was triggered by learning that indexes can be used for their data clustering effects. If properly designed, an index will “pre-sort” related values prior to actually executing GROUP BY, ORDER BY, or FETCH FIRST clauses. This can lead to instant ORDER BY results, if the ORDER BY clause matches the chosen index definition. Markus also busts the myth that Oracle’s Index-Organized Tables are actually useful for this purpose; they’re mostly not useful.

If clustering and pipelining effects do not apply, one can still choose to use what Markus calls the “SEEK method”, where partial results are fetched by omitting the OFFSET .. FETCH FIRST clause and using much faster regular predicates instead. The “SEEK method” has two advantages:

  1. It is extremely fast, as no rows have to be skipped before reaching the “OFFSET”.
  2. It is stable, as the “OFFSET” doesn’t shift when new records are inserted during paging.

What I am looking forward to in a future edition of Markus’ book is some more depth in the chapters explaining the influence of indexes with respect to performance of INSERT, UPDATE, DELETE statements (and maybe MERGE?). It would also be great if some additional insight into DB2 and Sybase could be given. The current edition essentially covers the popular Oracle, PostgreSQL, SQL Server, MySQL databases.

Having read this book, I feel much more at ease with designing databases for performance and with analyzing execution plans for fine-tuning. Even with quite a bit of experience, this book has generated some new knowledge, while at the same time, I think it is very easy to read for SQL tuning beginners as well.

Or to put it in Markus’ terms, “This book should have been written 10 years ago.”

This book is available via booksellers and the author's website: http://sql-performance-explained.com.

Published at DZone with permission of Lukas Eder, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)