At Stackify, we have about 1,000 SQL Azure databases. Monitoring, managing, and right sizing these databases creates a unique set of challenges. For security and scalability, each of our clients has their own database. As you can imagine, each client is in a different sizes and we have wildly different performance characteristics from one databases to another. We are excited that SQL Azure Elastic Pools will help solve some of these problems.
We started playing with SQL Azure Elastic Pools in our QA environment. We have about 66 databases in it and one of those databases gets virtually all the traffic. This is the killer feature of Elastic Pools: being able to have the overall power of 66 databases but being able to shift that power from one database to another in a pool, much like you would if you had a physical database server.
Up until now, we have been using SQL Azure Business edition databases. We have hundreds and hundreds of them. For the most part, they work just fine. However, Microsoft does not guarantee the performance levels of them like they do the new Basic/Standard V12 tiers. But we have not switched because trying to pick Basic, S0, S1, S2, S3 foreach of our thousand databases would be a nightmare. So we have just stuck with business edition hoping for a better way…enter Elastic Pools, FTW!
The performance of SQL Azure Business edition has been good for us. We really only have a few client databases that really push the limits of it. As you do this migration, you will need to step back and review the overall performance of all of your database to determine how big of an Elastic Pool you will need. To see the performance of each of your SQL Azure databases you can query the master database like so:
select top 100 * from sys.resource_stats order by 1 desc
If you are using Web or Business edition still, the percentages are relative to a S2 database. So keep this in mind as you plan on moving to SQL Elastic Pools and are evaluating how many DTUs you will need. You can pick 100 to 1200 DTUs that are shared across all your databases. Also note that for every DTU you get 1GB of space. You can set the maximum DTU that any database can use to help avoid noisy neighbor problems. So be sure to set that.
The first thing we had to do was convert all of our databases to Basic or Standard tiers. To expedite that we used a little SQL query that we will share and then ran all the ALTER DATABASE statements. Be sure to pick the right edition for all or some of your databases. Run this against your master DB and then when you are ready you can run all the ALTERs against master as well.
select 'ALTER DATABASE [' + name + '] MODIFY (EDITION = ''basic'', MAXSIZE=2GB) ' + CHAR(13) + CHAR(10) + 'GO' from sys.databases where name <> 'master'
After getting all the databases off Business edition we had to upgrade to SQL Azure V12 via the Azure portal. That took an hour or two with no down time. Once that was complete, we created a new Elastic Pool and selected all of our databases and it started provisioning and moving them to it. This transition also took a few hours with no down time. The entire process was actually very simple once we figured out how to script all the ALTER DATABASE commands to avoid manually changing them all by hand.
Microsoft has some PowerShell scripts that can help migrate to Elastic Pools if you have a lot of databases that would help your conversion.
The new Azure portal has some basic features for reporting performance. DTU is sort of a fuzzy way to manage CPU and IO for your database. Luckily the portal gives you a little bit of an idea on how much CPU and IO your pool is using so you know if you need to scale up your Elastic Pool DTUs.
This chart below perfectly details one of the reasons why we needed Elastic Pools. Every night when we need to run some data purging and maintenance, it really kills our databases. Having the extra horsepower of the pool will help us alleviate some of this problem now.
The Azure Portal lets you see DTU, CPU, IO, and other stats.
There is some basic reporting that shows peak and average DTU usage per database.
You can select a specific database and see average DTU usage over time. They also make available some advanced stats like deadlocks, connections, etc.
Another great reason for using Elastic Pools is saving money. In this example we had 66 databases. So Business edition was costing us at least $660 a month at $9.95 a database. (Rarely our databases under 1GB so the $4.95 tier wouldn’t have really worked) One of our databases in QA is 10GB though so that costs another $4 per GB.
Based on current preview pricing, the Elastic Pool will cost us $2.23 per DTU. So a simple 100 DTU pool is $223 a month plus $1.26 per database ($83 total). This is roughly $306.
66 Business Edition databases: ~$660 minimum + $40 for one larger DB
SQL Elastic Pool: ~$306 maximum (under 100GB total)
The price we paid for clients with larger databases on Business edition actually goes up quickly. Whereas with SQL Elastic Pools, it includes 100GB of space to share with the 100 DTU and database size is not our normal constraint compared to the DTUs. So larger client databases mixed in with all the smaller ones won’t drive our costs up as much.
To keep an eye on SQL performance after the conversion we used our Stackify APM to monitor overall SQL performance and spot check several key database queries to see if we had any major slow downs. It looks like we might have seena little slowness during the conversion migration, but performance since then looks good. Our QA environment doesn’t get all that much traffic though. We will be excited to see if we actually get some performance improvements by using pools in our pre-production and production environments.
It appears that SQL Management Studio 2014 does not work with Elastic Pool databases. Doing a Windows Update to 2014 SP1 fixed this issue.
Some SQL error messages also look a little different. This message doesn’t even show the actual address of our database. It says something about a routing destination. Weird. This will make some troubleshooting more difficult.
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. This failure occurred while attempting to connect to the routing destination.
The statement has been terminated.
Since we also moved from Business edition to V12 standard databases, we also ran into a couple performance problems in our queries. Business edition in some scenarios was faster and may hide some missing indexes or bad queries. When you move to V12 these become more obvious. So be sure to really test performance. Azure has something called the Index Tuning Advisor you can try. I also recommend simple queries against the DMVs to find slow queries.
We are very excited to continue testing Elastic Pools to see how they can help us improve performance, simplify overall database management, and lower even costs. We would strongly recommend anyone who has a lot of databases to start trying out the new Elastic Pools. SQL Azure Elastic Pools are one of the killer Microsoft Azure features we have been dying to have!