Automated restore of mysql 5.1 database on Windows via SQL Server 2008 SSIS
From time to time I am given some wish list items to accomplish. The most recent one was to restore a Mysql 5.1 database on a Mysql instance running on windows. Performing this action 1 time wasn't that much of a problem. I am not a fan of manually doing anything that can be automated. I perform automate sql server database restores all the time why not a different db engine?
I had limited exposure to Mysql prior to this request. I had created linked servers in SQL 05/08 to mysql servers and db's accross the internet etc, but not any real hands on mauling of a Mysql server instance.
I won't pretend to tell you that I set up the Mysql instance perfectly or that it is ideal. It is currently running on a converted Dell workstation. This is only a POC (Proof of concept). Disclaimers out of the way, here is what I did to get it to work.
This post will be broken down into phases. First the overview, then hurdles, then solution.
First I must confess that I love using Sql Server 05/08 SSIS. I use it for just about all that I can to wrap up individual steps to automate tasks. SSIS does have limitations. But through configurations and variables and populating those variable with data values directly from database tables and vb scripting most of those hurdles can be overcome. My goal was to use SSIS as a vessel to automate this Mysql restore.
The Mysql database in question is provided to me as a .gz compresssed file on a SFTP site on the internet. The goal was to deletethe old mysql backup, retrieve the .gz file, decompress it, and restore the db.
SSIS has support for file system tasks such as copy move and delete. I created a File system task which uses a variable to determine yesterdays date and append a string of yyyy-mm-dd to the file name for my connection manager and delete yesterdays file. Step 1 is done.
Natively SSIS doesn't support SFTP or SSH connections that I am aware of. Luckily for me I had purchased a 3rd party SSIS connector for SFTP from a company named CozyRoc Software. http://www.cozyroc.com/ This product (and other I have used from them) worked very well for SFTP operations. using the SFTP task (almost identical to the FTP task) in SSIS I built the step for retrieving the %filename%.sql.gz file.
Decompressing the .gz file proved to be more difficult than originally thought. .gz files are supported by one of my favorite programs WINRAR. I love WINRAR. Very powerful, small footprint, handles many formats it's just a great program. While .gz files are supported by WINRAR, they are not supported via the WINRAR CLI. As this is to be an automated restore I need something different. Well, WinZip supports decompressing .gz files but not from their CLI either. Time to go directly to the source. It turns out that Gzip (which is most likely the program used to zip the file anyway) has an exe for windows. If you are just a windows geek and not familiar with open source software get ready for a learning curve. By trade I am a DBA and Developer of SQL and .Net with a background in Windows system administration. Further back I worked with routing and telephany equipment. Throughout my time in IT I have used windows and only dabbled with Linux. With my experience I realized the importance of being familiar with the command line in Windows. I have strived for years to learn how to do tasks from the command line. These are some of the reasons why I really LOVE Powershell. More on that some other time. In order to use gzip, just download it from http://www.gzip.com/ and open a command session and type the full path and exe name like "c:\gzip\gzip.exe" this example assumes you have the gzip executable in a folder on the root named "gzip". by executing the above command you'll see that the gzip cli help comes up. I won'ty give you the answers here but by messing around with it enough you'll figure it out. My reccomendation is the use the "-d" switch to decompress. This allowed me to "unzip" the file needed for the restore. This is automated through the use of a batch file. batch files can be called from SSIS using an "Execute process task". The final command looked something like this:
cd c:\gzip
gzip -d \\remote_server\full_path\file.sql.gz
What I thought was the final step was to restore the database. Through research I found that the easiest way to automate the restore of a Mysql database was through the "Mysql command" this appears to be a CLI command that does just about everything for Mysql from the command line using switches. Again, with some messing around and some research you'll find that using the mysql command and some switches and some carrat ">" "<" characters will get you where you need to be. This is also automated via the use of a batch file which is called from SSIS via the same way mentioned above. The final command looked something like this:
cd "C:\Program Files\MySQL\MySQL Server 5.1\bin"
mysql -u user -p"passwordnospace" database_name < filename.sql
Time to deploy this solution right? WRONG.....
The SQL Server instance which will run the SSIS package as a SQL Server agent job and the Mysql instance are on different machines. I need to perform the mysql commands (the gzip commands too unless I want to decompress on the network and copy over) on the Mysql instance on the Mysql machine. What tools do I have at my disposal to perform operations to machines accross the network? That question spawned 3 answers.
The first was the good old AT command in windows. Essentially the AT command is a windows CLI / DOS command that will create a one time one use scheduled task on the remote machine. I trust you can figure out how to get help for the AT command in a windows command shell window. because of the need for some windows security and a profile and interaction with a logged on user this did not work to suit my needs.
Second I found a pretty cool tool named PSEXEC. PSEXEC is a tool frin SYSinternal or winternals that allows you to execute commands on a remote machine. It works much like alot of the open source command line tools that I have seen where you just drop it anywhere in your %PATH% environment variable (or add it to your %PATH%) and you can call it as needed. PSEXEC allows you to pass security information as well as the command you want to execute. PSEXEC /? brings up help. Unfortunately, PSEXEC did not work well for me either. There was some sort of security issue as I was not able to execute the commands as expected. This behavior is also well documented in the rest of the user community. There does not seem to be a real solution. Perhaps there is not a real understang of the underlying problem. regardless this did not work to fulfill my requirement.
Finally, I found an open source solution that acts much like PSEXEC and worked! The item is called REMCOM. A quick search for the keyword will quickly lead you to sourceforge.net. This also required passing a username and password which just like PSEXEC passes this information in clear text, however this worked. I was able to execute remote commands and decompress then restore the mysql database.
Although this is not as straight forward as it sounds, in order to make it work I had to create and execute a batch file that calls the remcom command which executes a remote batch file on the remote machine. esentially, I created a batch file on the destination mysql server that would decompress the .gz file and a batch file that would restore the database. Then on the SQL Server that would run the SSIS package I created a batch file that used the REMCOM command to call the destination batch file. Seems like alot of disjoined crap I know. However with some simple log output entries in the batch file, I am able to track problems and performance. The final entries looked like this:
c:\windows\system32\remcom.exe \\remote_machine_name /user:domain\username /pwd:password c:\full_path\example.bat
I highly reccomend implementing some sort of error trapping / logging of activities in the batch files. It doesn't parse well in this viewer but, I used a format such as the following:
@ECHO ONVERIFY ON::----------------------------------:: Today's date in YYYYMMDD format::----------------------------------
::use this code for XP date formatsset SortDate=%date:~-4%%date:~4,2%%date:~7,2%
::Use this code for Win2000 date formats::set SortDate=%date:~-4%%date:~0,2%%date:~3,2%
::----------------------------------------------------------------------------:: LOG FILE VARIABLE ::----------------------------------------------------------------------------SET theLog=\\server\network_path\%SortDate%_.LOG
ECHO ================================================================================================================= >>%theLog%ECHO START: %Date% %COMPUTERNAME% %USERNAME% Task Name PROCESS START: %TIME% >>%theLog%ECHO ================================================================================================================= >>%theLog%
echo **************************************************************************************************** >>%theLog%echo *********** *********************** >>%theLog%echo *********** Task Name Description >>%theLog%echo *********** *********************** >>%theLog%echo **************************************************************************************************** >>%theLog%
echo -------------------------------------- >>%theLog%echo Task Name >>%theLog%echo -------------------------------------- >>%theLog%echo. >>%theLog%
REM your batch file command goes here!
echo. >>%theLog%
echo. >>%theLog%echo. >>%theLog%echo ------------------------------- >>%theLog%echo Task Name >>%theLog%echo ------------------------------- >>%theLog%echo. >>%theLog%ECHO ---------------------------------------------------------------------------------------------------- >>%theLog% ECHO END: %DATE% %COMPUTERNAME% %USERNAME% PROCESS END: %TIME% >>%theLog% ECHO ---------------------------------------------------------------------------------------------------- >>%theLog% ECHO. >>%theLog% echo. >>%theLog%
exit %ERRORLEVEL%
Other items to note were:
I had to put a "waitfor" 1 minute TSQL task in my package between the arrival of the .gz backup file and calling the gzip decompressor. There appeared to be some sort of file contention.
This operation including PSEXEC worked via SSIS in the Development environment. However when running as a scheduled job it would just hang. This is the reasoning behind using REMCOM.
The Mysql command for database restores took a long time. Almost 4 hours for a 1.3Gb file.
Good luck.
I had limited exposure to Mysql prior to this request. I had created linked servers in SQL 05/08 to mysql servers and db's accross the internet etc, but not any real hands on mauling of a Mysql server instance.
I won't pretend to tell you that I set up the Mysql instance perfectly or that it is ideal. It is currently running on a converted Dell workstation. This is only a POC (Proof of concept). Disclaimers out of the way, here is what I did to get it to work.
This post will be broken down into phases. First the overview, then hurdles, then solution.
First I must confess that I love using Sql Server 05/08 SSIS. I use it for just about all that I can to wrap up individual steps to automate tasks. SSIS does have limitations. But through configurations and variables and populating those variable with data values directly from database tables and vb scripting most of those hurdles can be overcome. My goal was to use SSIS as a vessel to automate this Mysql restore.
The Mysql database in question is provided to me as a .gz compresssed file on a SFTP site on the internet. The goal was to deletethe old mysql backup, retrieve the .gz file, decompress it, and restore the db.
SSIS has support for file system tasks such as copy move and delete. I created a File system task which uses a variable to determine yesterdays date and append a string of yyyy-mm-dd to the file name for my connection manager and delete yesterdays file. Step 1 is done.
Natively SSIS doesn't support SFTP or SSH connections that I am aware of. Luckily for me I had purchased a 3rd party SSIS connector for SFTP from a company named CozyRoc Software. http://www.cozyroc.com/ This product (and other I have used from them) worked very well for SFTP operations. using the SFTP task (almost identical to the FTP task) in SSIS I built the step for retrieving the %filename%.sql.gz file.
Decompressing the .gz file proved to be more difficult than originally thought. .gz files are supported by one of my favorite programs WINRAR. I love WINRAR. Very powerful, small footprint, handles many formats it's just a great program. While .gz files are supported by WINRAR, they are not supported via the WINRAR CLI. As this is to be an automated restore I need something different. Well, WinZip supports decompressing .gz files but not from their CLI either. Time to go directly to the source. It turns out that Gzip (which is most likely the program used to zip the file anyway) has an exe for windows. If you are just a windows geek and not familiar with open source software get ready for a learning curve. By trade I am a DBA and Developer of SQL and .Net with a background in Windows system administration. Further back I worked with routing and telephany equipment. Throughout my time in IT I have used windows and only dabbled with Linux. With my experience I realized the importance of being familiar with the command line in Windows. I have strived for years to learn how to do tasks from the command line. These are some of the reasons why I really LOVE Powershell. More on that some other time. In order to use gzip, just download it from http://www.gzip.com/ and open a command session and type the full path and exe name like "c:\gzip\gzip.exe" this example assumes you have the gzip executable in a folder on the root named "gzip". by executing the above command you'll see that the gzip cli help comes up. I won'ty give you the answers here but by messing around with it enough you'll figure it out. My reccomendation is the use the "-d" switch to decompress. This allowed me to "unzip" the file needed for the restore. This is automated through the use of a batch file. batch files can be called from SSIS using an "Execute process task". The final command looked something like this:
cd c:\gzip
gzip -d \\remote_server\full_path\file.sql.gz
What I thought was the final step was to restore the database. Through research I found that the easiest way to automate the restore of a Mysql database was through the "Mysql command" this appears to be a CLI command that does just about everything for Mysql from the command line using switches. Again, with some messing around and some research you'll find that using the mysql command and some switches and some carrat ">" "<" characters will get you where you need to be. This is also automated via the use of a batch file which is called from SSIS via the same way mentioned above. The final command looked something like this:
cd "C:\Program Files\MySQL\MySQL Server 5.1\bin"
mysql -u user -p"passwordnospace" database_name < filename.sql
Time to deploy this solution right? WRONG.....
The SQL Server instance which will run the SSIS package as a SQL Server agent job and the Mysql instance are on different machines. I need to perform the mysql commands (the gzip commands too unless I want to decompress on the network and copy over) on the Mysql instance on the Mysql machine. What tools do I have at my disposal to perform operations to machines accross the network? That question spawned 3 answers.
The first was the good old AT command in windows. Essentially the AT command is a windows CLI / DOS command that will create a one time one use scheduled task on the remote machine. I trust you can figure out how to get help for the AT command in a windows command shell window. because of the need for some windows security and a profile and interaction with a logged on user this did not work to suit my needs.
Second I found a pretty cool tool named PSEXEC. PSEXEC is a tool frin SYSinternal or winternals that allows you to execute commands on a remote machine. It works much like alot of the open source command line tools that I have seen where you just drop it anywhere in your %PATH% environment variable (or add it to your %PATH%) and you can call it as needed. PSEXEC allows you to pass security information as well as the command you want to execute. PSEXEC /? brings up help. Unfortunately, PSEXEC did not work well for me either. There was some sort of security issue as I was not able to execute the commands as expected. This behavior is also well documented in the rest of the user community. There does not seem to be a real solution. Perhaps there is not a real understang of the underlying problem. regardless this did not work to fulfill my requirement.
Finally, I found an open source solution that acts much like PSEXEC and worked! The item is called REMCOM. A quick search for the keyword will quickly lead you to sourceforge.net. This also required passing a username and password which just like PSEXEC passes this information in clear text, however this worked. I was able to execute remote commands and decompress then restore the mysql database.
Although this is not as straight forward as it sounds, in order to make it work I had to create and execute a batch file that calls the remcom command which executes a remote batch file on the remote machine. esentially, I created a batch file on the destination mysql server that would decompress the .gz file and a batch file that would restore the database. Then on the SQL Server that would run the SSIS package I created a batch file that used the REMCOM command to call the destination batch file. Seems like alot of disjoined crap I know. However with some simple log output entries in the batch file, I am able to track problems and performance. The final entries looked like this:
c:\windows\system32\remcom.exe \\remote_machine_name /user:domain\username /pwd:password c:\full_path\example.bat
I highly reccomend implementing some sort of error trapping / logging of activities in the batch files. It doesn't parse well in this viewer but, I used a format such as the following:
@ECHO ONVERIFY ON::----------------------------------:: Today's date in YYYYMMDD format::----------------------------------
::use this code for XP date formatsset SortDate=%date:~-4%%date:~4,2%%date:~7,2%
::Use this code for Win2000 date formats::set SortDate=%date:~-4%%date:~0,2%%date:~3,2%
::----------------------------------------------------------------------------:: LOG FILE VARIABLE ::----------------------------------------------------------------------------SET theLog=\\server\network_path\%SortDate%_.LOG
ECHO ================================================================================================================= >>%theLog%ECHO START: %Date% %COMPUTERNAME% %USERNAME% Task Name PROCESS START: %TIME% >>%theLog%ECHO ================================================================================================================= >>%theLog%
echo **************************************************************************************************** >>%theLog%echo *********** *********************** >>%theLog%echo *********** Task Name Description >>%theLog%echo *********** *********************** >>%theLog%echo **************************************************************************************************** >>%theLog%
echo -------------------------------------- >>%theLog%echo Task Name >>%theLog%echo -------------------------------------- >>%theLog%echo. >>%theLog%
REM your batch file command goes here!
echo. >>%theLog%
echo. >>%theLog%echo. >>%theLog%echo ------------------------------- >>%theLog%echo Task Name >>%theLog%echo ------------------------------- >>%theLog%echo. >>%theLog%ECHO ---------------------------------------------------------------------------------------------------- >>%theLog% ECHO END: %DATE% %COMPUTERNAME% %USERNAME% PROCESS END: %TIME% >>%theLog% ECHO ---------------------------------------------------------------------------------------------------- >>%theLog% ECHO. >>%theLog% echo. >>%theLog%
exit %ERRORLEVEL%
Other items to note were:
I had to put a "waitfor" 1 minute TSQL task in my package between the arrival of the .gz backup file and calling the gzip decompressor. There appeared to be some sort of file contention.
This operation including PSEXEC worked via SSIS in the Development environment. However when running as a scheduled job it would just hang. This is the reasoning behind using REMCOM.
The Mysql command for database restores took a long time. Almost 4 hours for a 1.3Gb file.
Good luck.

0 Comments:
Post a Comment
<< Home