Invitation Digital Tech Blog

Building Scalable & Responsive Architecture

Migrating from SQL Server 2012 Enterprise to AWS RDS SQL Server Standard with ZERO downtime

Background

At IDL our infrastructure is hosted at AWS. We use SQL Server as our main database while we utilise Mongo for reads. We had been on the Enterprise edition for a couple of years until recently, when we decided to stop paying ridiculous amounts of money for it. We were using Enterprise primarily for two reasons: Change Data Capture (CDC) and AlwaysOn Availability Groups. For quite some time we had been considering moving from Enterprise down to Standard, and we started investigating RDS (Amazon's Relational Database Service) as an alternative to setting up our own Mirroring (soon to be deprecated, although that is what RDS uses under the hood) or Log Shipping solution.

Moving to RDS

Before we could even attempt to move to RDS, or away from Enterprise, we had to find an alternative to CDC, as CDC is not supported on RDS. We considered two possible solutions: Change Tracking and good old Triggers. We went with Triggers as they suited our needs best, even though I have an aversion to them. This required changes to the systems that were reliant upon CDC, and once those changes were made and deployed we were clear to move ahead. As for AlwaysOn Availability Groups, we did not make much use of their scaling features, such as readable replicas, so we were not going to lose any functionality by moving to RDS (beyond what RDS does not support by default). Even though SQL Server support on RDS is good, there are many features that are not supported, so before you decide to move make sure you go through them. You can find the list here.
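For illustration, here is a minimal sketch of the trigger approach, assuming a hypothetical Offers table with an integer OfferId key and an OffersAudit table to capture the change stream (all names here are invented for the example, not our actual schema):

-- Hypothetical audit table capturing the change stream CDC used to give us.
CREATE TABLE dbo.OffersAudit
(
    AuditId      INT IDENTITY(1,1) PRIMARY KEY,
    OfferId      INT       NOT NULL,
    Operation    CHAR(1)   NOT NULL, -- 'I' = insert, 'U' = update, 'D' = delete
    ChangedAtUtc DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);
GO

CREATE TRIGGER trg_Offers_Audit ON dbo.Offers
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Rows only in 'inserted' are inserts; rows in both tables are updates.
    INSERT INTO dbo.OffersAudit (OfferId, Operation)
    SELECT i.OfferId, CASE WHEN d.OfferId IS NULL THEN 'I' ELSE 'U' END
    FROM inserted i
    LEFT JOIN deleted d ON d.OfferId = i.OfferId;

    -- Rows only in 'deleted' are deletes.
    INSERT INTO dbo.OffersAudit (OfferId, Operation)
    SELECT d.OfferId, 'D'
    FROM deleted d
    WHERE NOT EXISTS (SELECT 1 FROM inserted i WHERE i.OfferId = d.OfferId);
END
GO

Unlike CDC, a trigger runs synchronously inside the writing transaction, which is precisely why I am wary of them, so keep the trigger body as cheap as possible.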

We also needed to ensure we could find the right hardware configuration on RDS to support our production needs. Since our database is not read heavy (we utilise Mongo for our read layer, remember) we did not need Provisioned IOPS: AWS guarantees a baseline of 3 IOPS per GB of General Purpose SSD storage, which gave us enough IOPS for the disk size we provisioned. Note that you can only provision a maximum of 1 TB before you need to start sharding your database.
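To put numbers on it: provisioning 500 GB of General Purpose SSD storage gives a baseline of 500 × 3 = 1,500 IOPS, so you can size the disk until the implied baseline comfortably covers your measured peak.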

Data Migration Limitations

A major part of a database migration is the migration of the data itself. While RDS is a managed service, which is why we chose it, it does have a serious limitation: it does not give you access to the underlying file system, so you cannot simply restore a database from the backup you created with RESTORE DATABASE ... FROM DISK. I have done a few DB migrations and, in its very simple form, it usually goes like this:

  1. Transfer the database backups and transaction logs to the new machine.
  2. Restore them to the newly installed SQL instance.

However, with RDS you cannot do that; you have to use some sort of utility to transfer your data, such as the SQL Server Import and Export Wizard, the Generate Scripts Wizard, or bulk copy via the bcp utility. See more details here. We decided on bcp as it is the most efficient means of transferring large amounts of data. There is a very handy tool called SQL Database Migration Wizard (SQLAzureMW), which we used (more on this later). It is designed to help you migrate your SQL Server databases to Azure, but it can equally be used to migrate from one SQL instance to another, and it uses the bcp utility under the hood to move the data. SQLAzureMW can be downloaded from Codeplex.
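Since the transfer time scales with data volume, it is worth checking up front where the bulk of your data lives. A query along these lines (standard catalog views, nothing RDS-specific) lists tables by approximate size, largest first:

-- Approximate row counts and on-disk size per table, largest first,
-- to estimate where the bulk of the bcp transfer time will go.
SELECT  t.name                                            AS TableName,
        SUM(CASE WHEN a.type = 1 THEN p.rows ELSE 0 END)  AS ApproxRows,
        SUM(a.total_pages) * 8 / 1024                     AS TotalMB
FROM    sys.tables t
JOIN    sys.partitions p ON p.object_id = t.object_id AND p.index_id IN (0, 1)
JOIN    sys.allocation_units a ON a.container_id = p.partition_id
GROUP BY t.name
ORDER BY TotalMB DESC;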

But even using these tools can take hours depending on the size of your database, which brings me to my next point.

NOTE TO AWS: you should provide the ability, in the RDS console, to restore backups and transaction logs from an S3 bucket.

Migrating without Downtime

Knowing that the data migration could take a good few hours, I did not want any downtime: even in the middle of the night we still have users on our system, as we are in 14 countries and it is not the middle of the night everywhere (although usage is comparatively very low). Achieving such a feat requires a lot of prep work, so that the actual impact time is limited to the transfer of data, and it requires a deep understanding of the systems that use the database. You need to be aware of your reads and writes. We were happier to have certain features not working than to have the whole site down. To achieve this we switched our database to Read Only mode (see the Step by Step section below for instructions), ensuring no writes could happen during the migration of the data. Registration, saving an offer for later use, and the limited number of redemptions that require interaction with third parties would not work, but the majority of other features, including log in, listings, normal redemptions and search, would keep functioning, serving more than 90% of users without any noticeable loss of functionality.

We also considered sending some writes to a queue and replaying them after the migration, to avoid losing even the features that rely on writes, but the hassle this would cause was not worth the limited benefit for the short window in the middle of the night.

We also recommend using a domain (sql.example.com) or an alias instead of an IP address to connect to your database. This avoids having to redeploy the applications that access the database because the connection string has changed. Ensure the TTL is short, say 60 seconds, so that after the migration you can simply change the DNS entry for the domain to point at your RDS instance.

Step by Step SQL Server to RDS Migration Process

Here is the step by step migration process we followed when moving to RDS. Needless to say, you should do a few trial runs in a staging environment before attempting this on Production.

To keep the impact time to a minimum we need to do some prep work. This involves getting the target database up and running with the up-to-date schema, roles, logins and permissions before the migration even begins; we did this a few days in advance. Just make sure no schema changes take place after that.

Reducing the Data

Migrating the data is what takes the most time, so it is worth investigating whether you can shed some of it. Usually the working dataset is a whole lot smaller than the data that has accumulated in the database over the years. Some of it is redundant imports from third parties, some is really old analytics data, and some relates to deprecated features. Removing it can reduce the migration time significantly (obviously depending on how much you remove). I am not advocating getting rid of this valuable data; rather, move it to an S3 bucket, Amazon Glacier, or a different database, to be migrated later.
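As an illustration, assuming a hypothetical AnalyticsEvents table with a CreatedAt column (names invented for the example), old rows can be copied into an archive table and then deleted in batches so each transaction, and therefore the log, stays small. A sketch only; adapt the cut-off and batch size to your own data:

-- Copy rows older than two years into an archive table...
SELECT *
INTO   dbo.AnalyticsEvents_Archive
FROM   dbo.AnalyticsEvents
WHERE  CreatedAt < DATEADD(YEAR, -2, GETUTCDATE());

-- ...then delete them from the live table in small batches.
WHILE 1 = 1
BEGIN
    DELETE TOP (10000)
    FROM   dbo.AnalyticsEvents
    WHERE  CreatedAt < DATEADD(YEAR, -2, GETUTCDATE());

    IF @@ROWCOUNT = 0 BREAK;
END

The archive table can then be exported (bcp works well here too) and shipped off to S3 or Glacier before being dropped.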

Set up your SQL Server RDS instance

Ensure that you have your RDS instance set up and running with the appropriate security groups, and that you can connect and log in with the master account via SSMS. I would strongly recommend, as AWS does too, launching in a Single AZ first, migrating your data, and then converting the configuration to Multi AZ; otherwise the data transfer process will be really, really slow. I mean really slow. This is not a guide on how to set up an RDS instance; the RDS documentation explains that very well. However, there are a few things the documentation does not clarify, and these are:

  1. You can upgrade your RDS instance from Single AZ to Multi AZ (or vice versa) without incurring any downtime. There will be some impact on latency when moving to Multi AZ, so make sure you kick it off at quiet times.
  2. You will need to use option groups for this, as you won't see a button on the RDS console that does it. Just select your instance, click Modify and choose the appropriate option from the Options drop down; there should be a default mirrored and a non-mirrored option provided by RDS. You can choose to apply the change immediately, otherwise RDS will use the maintenance window specified during set up to action it.
  3. Running in Single AZ initially also allows you to put the database into Read Only mode, which is not supported in a mirrored configuration.

Download the Azure SQL Migration Tool

You can download the correct version for your DB version from CodePlex. Unzip the package and run SQLAzureMW.exe. You can change the output directory from within the SQLAzureMW.exe.config file; search for the BCPFileDir config key. The great thing about this tool is that it also shows you the actual bcp commands it runs, so you can run them yourself if needed, and you might just need to.

Tip: We had a couple of errors around duplicate records that wasted valuable time. We found that the tool may, for some reason, try to re-insert records for a table it has already sent across, which results in an annoying primary key or unique key violation. If this happens you may have to run the bcp command manually from the command line and transfer the data for the table that errored. Just check how much data for that table has already been imported and adjust your bcp export query accordingly (see the query below), or truncate the table and migrate the whole lot again. We had to run a few manual imports and exports, which worked fine.
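To work out where a failed table got to, a quick check on the target along these lines (again with hypothetical table and key names) tells you what to exclude from the re-run:

-- On the target: how many rows made it across, and the highest migrated key,
-- so the bcp export query can be restarted from the right place.
SELECT COUNT(*) AS RowsImported, MAX(OfferId) AS LastImportedKey
FROM   dbo.Offers;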

Migrate the Schema

As mentioned before, it is best to migrate the schema, roles, logins and any static data in advance to reduce the impact time.

  1. Now run the SQLAzureMW.exe and select Analyse / Migrate Database and Azure SQL Database as shown below.

    Step 1

  2. Now select the source database and click Next. Select the appropriate database objects, then click Advanced and choose Table Schema Only for Script Table/Data.

    Step 2

  3. Continue through the wizard to generate the script.

  4. Connect to the target server and execute the script.

  5. Check the output for any errors as you may need to fix them individually. There may be some compatibility warnings as some features are not supported by SQL Azure but those can be ignored as you are not migrating to Azure.

  6. If you have any Full Text indexes they will need to be created manually, and any stored procs that reference them will need to be recreated after the full text index exists. You can use the following script to create the catalog and index and associate the correct table with it.

     USE [DBName]  
     GO
    
     /****** Object:  FullTextCatalog [MYTABLE_SEARCH]    Script Date: 05/05/2015 17:47:58 ******/
     CREATE FULLTEXT CATALOG [MYTABLE_SEARCH] WITH ACCENT_SENSITIVITY = OFF
     GO
    
     CREATE FULLTEXT INDEX ON [dbo].[MYTABLE] KEY INDEX [PK_MYTABLE] ON ([MYTABLE_SEARCH]) WITH (CHANGE_TRACKING AUTO)
     GO
    
     ALTER FULLTEXT INDEX ON [dbo].[MYTABLE] ADD ([SEARCHABLE_COLUMN_A] LANGUAGE [Neutral])
     GO
    
     ALTER FULLTEXT INDEX ON [dbo].[MYTABLE] ADD ([SEARCHABLE_COLUMN_B] LANGUAGE [Neutral])
     GO
    
     ALTER FULLTEXT INDEX ON [dbo].[MYTABLE] ENABLE
     GO
    

    Needless to say, replace MYTABLE_SEARCH with a name of your choosing, MYTABLE with the name of your actual table, and SEARCHABLE_COLUMN_A and SEARCHABLE_COLUMN_B with the columns you would like to search on.

    After the full text indexes have been created you will need to recreate any SPs that utilise them.

  7. Check the target DB to ensure the schema, SPs etc. have been copied across.

Migrate the logins

To migrate the logins with the correct passwords there is a pretty handy stored proc called sp_help_revlogin, which can be found here along with instructions. Run it on the source server to generate a script, then run that script on the target server.

NOTE: Users, logins and permissions are automatically replicated to the standby mirror for you; you don't need to worry about recreating them there. User-defined server roles (a SQL 2012 feature), however, are not replicated in Multi-AZ instances.

Migrate the Database Users

To be done on each DB you are migrating.

  1. In SSMS, expand the database level Security node (not the server level one).

  2. Click on the Users node.

  3. Select the menu option View and choose Object Explorer Details (F7).

  4. In the details pane on the right, select all (or the required) users. Right click and select Script User As -> CREATE To -> New Query Editor Window.

  5. Execute the generated script against the Target DB.

    Copying Users
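If you would rather script this step than click through SSMS, a query along these lines run on the source DB generates the CREATE USER statements (a sketch: it assumes each user maps to a login of the same name, so review the output before running it on the target):

-- Generate CREATE USER statements for SQL and Windows users,
-- skipping the built-in principals.
SELECT 'CREATE USER [' + name + '] FOR LOGIN [' + name + '];'
FROM   sys.database_principals
WHERE  type IN ('S', 'U')
  AND  name NOT IN ('dbo', 'guest', 'INFORMATION_SCHEMA', 'sys');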

Assigning Users to Roles

To be done on each DB you are migrating.

Once the users have been copied you will need to assign them to roles. Run this query on the source DB to generate the script.

SELECT  'EXECUTE sp_AddRoleMember ''' + roles.name + ''', ''' + users.name + ''''
FROM    sys.database_principals users
INNER JOIN sys.database_role_members link ON link.member_principal_id = users.principal_id
INNER JOIN sys.database_principals roles ON roles.principal_id = link.role_principal_id

Run the generated script on the target DB.

Assign Permissions to the Roles

To be done on each DB you are migrating.

Now that the roles have been created, you need to assign the appropriate permissions to them. Run the following script on the source DB, then copy the results (i.e. the generated SQL statements) and run them on the target DB.

SELECT  'GRANT ' + prm.permission_name + ' ON ' + prm.class_Desc + '::' + '['
		+ CASE class
			WHEN 0 THEN DB_NAME(major_id)
			WHEN 1 THEN OBJECT_NAME(major_id)
			WHEN 3 THEN SCHEMA_NAME(major_id)
			ELSE 'n/a'
		  END + '] TO [' + rol.name COLLATE latin1_general_ci_ai + ']' AS STMT
FROM    sys.database_permissions prm
		JOIN sys.database_principals rol ON prm.grantee_principal_id = rol.principal_id
WHERE   rol.type = 'R'
		AND rol.name != 'public'
ORDER BY rol.name

Migrate Static (or nearly static) Data

To reduce the impact time further, it is best to migrate data that is static or does not change often; there are tables that only get filled or updated every X hours or so. As long as the data is not going to change during the migration process, you are best migrating it before you switch on Read Only mode. The next step describes the data migration process, and the same process can be used to migrate this selected data with the SQLAzureMW Wizard (just ignore the steps that turn the DB to Read Only mode when migrating static data).

Migrating Data (Impact time starts now)

We are going to use the SQLAzureMW Wizard to migrate the data. Before we export the data we need to set the DB to read only mode.

  1. Set the source DB to Read Only mode. To do this we first need to alter the database to Single User mode. Before we do, we need to ensure there are no background jobs running, otherwise they may hijack our only connection once the database is in Single User mode.

    Firstly, ensure asynchronous auto update of statistics is off and no background jobs are running (see the queries below). If jobs are running, then either let them finish or terminate them manually using KILL STATS JOB.

    SELECT s.is_auto_update_stats_async_on, * FROM sys.databases s

    If it is on, turn it off with ALTER DATABASE [DbName] SET AUTO_UPDATE_STATISTICS_ASYNC OFF

    SELECT * FROM sys.dm_exec_background_job_queue

    Background Job Queue

    Once you are sure there are no background jobs running set the DB to single user mode. The query below sets the database to single user mode and rolls back any incomplete transactions after 60 secs.

     ALTER DATABASE [DbName]
     SET SINGLE_USER WITH ROLLBACK AFTER 60;
    

    Now set the database to Read Only mode.

     ALTER DATABASE [DBNAME] SET READ_ONLY WITH NO_WAIT
     GO
    

    Now set the DB back to multi user mode.

     ALTER DATABASE [DBNAME]
     SET MULTI_USER WITH ROLLBACK IMMEDIATE;
    

    The database should now be in Read Only mode and it should be indicated so next to the DB in SSMS.

  2. Now that the database is in Read Only mode we can continue with our data export. First check that the directory specified by BCPFileDir in the SQLAzureMW.exe.config file points to a location with enough space for your data.

  3. Run the SQLAzureMW.exe Wizard using the same settings as before and connecting to the correct source database.

  4. This time just check the Tables option from the list of objects. Click Advanced and choose Data Only for Script Table/Data.

    Data Only Option

  5. Continue on to generate the Script.

  6. Once the script is generated we will import the data into the target database. BUT before you import the data, disable referential integrity checks on the target database using the query below.

    EXEC SP_MSFOREACHTABLE "ALTER TABLE ? NOCHECK CONSTRAINT ALL"

  7. Proceed with the wizard and import the data onto the target DB. Once the import is complete, re-enable the referential integrity constraints.

    EXEC SP_MSFOREACHTABLE "ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL"

  8. Well done, you have successfully migrated your data to RDS. Now run a few queries on your RDS DB to check the data and connectivity, and do a count comparison on a few tables (a quick way to do this is shown below).
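    The count comparison can be done by reading approximate counts from the catalog on both servers and diffing the two result sets:

     -- Approximate row count per table; run on source and target and compare.
     SELECT  t.name      AS TableName,
             SUM(p.rows) AS ApproxRows
     FROM    sys.tables t
     JOIN    sys.partitions p ON p.object_id = t.object_id AND p.index_id IN (0, 1)
     GROUP BY t.name
     ORDER BY t.name;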

  9. Now, if you are currently running in Single AZ, follow the steps in 1 above to set your RDS DB(s) to Read Only. This lets you check how your new DB is holding up before you allow writes, in case you need to roll back.

  10. Point your DNS entry at your RDS database to start sending traffic to the new Read Only RDS database. Check that your application is functioning as expected and watch the RDS graphs in the AWS console to confirm you are receiving traffic.

  11. Once you are happy, change your new database on RDS back to read-write mode. This time there is no need to wait 60 seconds for transactions to roll back, as the DB is read only.

    -- ONCE READY TO MAKE DB READWRITE CHANGE IT BACK TO SINGLE USER MODE THEN
    ALTER DATABASE [DbName]
    SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    
    -- Set DB TO READ WRITE NOW
    ALTER DATABASE [DBNAME] SET READ_WRITE WITH NO_WAIT
    GO
    
    --SET DB TO MULTI USER MODE
    ALTER DATABASE [DBNAME]
    SET MULTI_USER WITH ROLLBACK IMMEDIATE;
    
  12. If anything fails and you need to abort the migration, all you have to do is make your original database read-write again using the script above.

  13. Lastly, make sure you change your Single AZ set up to Multi AZ. It is not wise to run production on a Single AZ set up with no redundancy.

We had a few worrying moments, but the whole migration process was relatively smooth with no downtime, and we now have a Multi AZ RDS instance running in the production environment.