
SQL Saturday Schedule Builder



There's gold on the SQLSaturday website and I want to help you find it. I'm talking about the Schedule Builder. By using the schedule builder before the event, I can see whether a session is projected to overflow its room capacity. It does not commit you to a particular schedule! Attend whatever sessions you choose. What the schedule builder does is provide the SQLSaturday admins insight into what's likely to be popular. This allows me to proactively move sessions into larger classrooms so that everyone can attend the hot session of the hour.

How do I use the Schedule Builder?

Using the Schedule Builder is easy peasy.
  1. Go to your event's SQL Saturday site. KC folks, you want http://sqlsaturday.com/159/eventhome.aspx

  2. Click the LOG IN link and the AJAX control will drop down. Enter your user name and password. Don't know them? No problem. Click forgot password

  3. A warning will pop up indicating you are leaving sqlsaturday and going to sqlpass. That's to be expected.

  4. Did you know that when you signed up for SQLSaturday, you became a member of SQL PASS (Professional Association for SQL Server)? Heck no you didn't, you were too excited about free SQL Server training. You should explore the other opportunities PASS provides for you to grow as a professional (free virtual chapters, user groups, Summit, SQLRally, 24 Hours of PASS) but not until you've built out your schedule. Enter the address you're getting SQLSaturday 159 email delivered to and click the button.

  5. Once you've received that email, reset your password and venture back to http://sqlsaturday.com/159/eventhome.aspx

  6. Sign in and notice the site greets you ever so kindly

  7. Click the Schedule menu and select Schedule Builder

  8. You'll be presented with a page like the following. Click the "Edit Schedule" button and choose a primary and alternate session.

  9. Click Save and you're done; my report is now populated with your choices, and now I have to scramble to move sessions about. :D
For extra credit, install Guidebook, a FREE mobile application that allows you to view event info, the schedule, etc. on your phone/tablet/mobile device. For a list of compatible devices, please visit here. Once installed, search for "SQLSaturday #159 - Kansas City 2012".

At the Summit, their guidebook had the ability to integrate with your SQLPass account, but I'm not seeing it at the moment.

 


EzAPI 2012 referenced assembly issue


A brief post on using the current source build of EzAPI. If you try to compile it, you will receive this error

Assembly 'Microsoft.SqlServer.TxScript, Version=11.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91' uses 'Microsoft.SqlServer.VSTAScriptingLib, Version=11.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91' which has a higher version than referenced assembly 'Microsoft.SqlServer.VSTAScriptingLib, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91'

No problem you say. You expand your references and see that the project found the 2008 DLL and you just need to change the path from "C:\Program Files (x86)\Microsoft SQL Server\100\SDK\Assemblies\Microsoft.SqlServer.VSTAScriptingLib.dll" to "C:\Program Files (x86)\Microsoft SQL Server\110\SDK\Assemblies\Microsoft.SqlServer.VSTAScriptingLib.dll"

The only problem is, there isn't a "c:\Program Files\Microsoft SQL Server\110\DTS\PipelineComponents\Microsoft.SqlServer.TxScript.dll"

I have no idea why that DLL isn't installed in that location, or any location under Program Files or Program Files (x86), but whatever. The assembly is on the machine (SSIS works, after all); you'll simply need to get it out of the GAC (Global Assembly Cache). I'm lazy and just repointed my reference into the GAC, but the other option is to copy it into the expected location.

copy C:\Windows\Microsoft.NET\assembly\GAC_MSIL\Microsoft.SqlServer.VSTAScriptingLib\v4.0_11.0.0.0__89845dcd8080cc91\Microsoft.SqlServer.VSTAScriptingLib.dll "c:\Program Files (x86)\Microsoft SQL Server\110\DTS\PipelineComponents"

Row Count with SSIS EzAPI

Haha, this is a trick. There is no EzAPI component for the Row Count transformation. Until now!

// CompID is the GUID of the stock Row Count transformation
[CompID("{150E6007-7C6A-4CC3-8FF3-FC73783A972E}")]
public class EzRowCountTransform : EzComponent
{
    public EzRowCountTransform(EzDataFlow dataFlow) : base(dataFlow) { }
    public EzRowCountTransform(EzDataFlow parent, IDTSComponentMetaData100 meta) : base(parent, meta) { }

    // Name of the SSIS variable that receives the row count
    public string VariableName
    {
        get { return (string)Meta.CustomPropertyCollection["VariableName"].Value; }
        set { Comp.SetComponentProperty("VariableName", value); }
    }
}

Make it simple (recovery mode)

This is one of those posts that is probably only useful for me. At work we have some VMs set up for general purpose development. Each dev has one assigned to them except mine is no longer mine. Hazards of leaving a job I suppose. So on my return, I've been camping out on other, less used VMs when I need to have a long running process. One of the things I notice though is that these boxes are nearly out of space.

Why? Because the databases are all in FULL recovery mode. This setting makes sense in production environments, but since no one is taking log backups in this dev environment, it causes the database log files to grow and grow and grow and poof, the machines are out of disk space.

The cure: the following query generates a row for every database in the instance that is in full recovery mode. I then copy and paste the results back into SSMS/sqlcmd and voila, things are fixed.

SELECT
    'ALTER DATABASE ' + quotename(DB.name) + ' SET RECOVERY SIMPLE WITH NO_WAIT' AS cmd
FROM
    sys.databases DB 
WHERE
    DB.recovery_model_desc = 'full';
 

I then shrink the log, which is fine in this case because it never should have grown that large in the first place.
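
If you want to script the shrink as well, here's a minimal sketch; the database name (SandboxDB) and logical log file name (SandboxDB_log) are placeholders you'd swap for your own.

USE [SandboxDB];
GO
-- Find the logical name of the log file for this database
SELECT name, type_desc, size FROM sys.database_files;
-- Shrink the log file back down to roughly 64 MB
DBCC SHRINKFILE (SandboxDB_log, 64);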

PowerShell Export SQL Agent Job

This brief post is in response to a comment on my post PowerShell export query to CSV. Octopusgrabbus, awesome handle, asked how one could launch this. If you were looking to make this into a SQL Agent job, I'd simply create a new job, add a step of type PowerShell, and then literally paste in the script I referenced in the previous article.

If you prefer a scripted approach, this should create a job named PowerShellExportTest with a single step named PS test. When the job runs, the query on line 30 will run and generate a CSV to the path defined on line 35.

   1:  USE [msdb]
   2:  GO
   3:  DECLARE @ReturnCode INT
   4:  DECLARE @jobId BINARY(16)
   5:  EXEC @ReturnCode =  msdb.dbo.sp_add_job @job_name=N'PowerShellExportTest', 
   6:          @enabled=1, 
   7:          @notify_level_eventlog=0, 
   8:          @notify_level_email=0, 
   9:          @notify_level_netsend=0, 
  10:          @notify_level_page=0, 
  11:          @delete_level=0, 
  12:          @description=N'No description available.', 
  13:          @category_name=N'[Uncategorized (Local)]', 
  14:          @owner_login_name=N'sa', @job_id = @jobId OUTPUT
  15:   
  16:   
  17:  EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'PS test', 
  18:          @step_id=1, 
  19:          @cmdexec_success_code=0, 
  20:          @on_success_action=1, 
  21:          @on_success_step_id=0, 
  22:          @on_fail_action=2, 
  23:          @on_fail_step_id=0, 
  24:          @retry_attempts=0, 
  25:          @retry_interval=0, 
  26:          @os_run_priority=0, @subsystem=N'PowerShell', 
  27:          @command=N'
  28:  $server = "localhost"
  29:  $database = "master"
  30:  $query = "SELECT D.* FROM sys.databases D"
  31:   
  32:  # powershell raw/verbatim strings are ugly
  33:  # Update this with the actual path where you want data dumped
  34:  $extractFile = @"
  35:  C:\test.csv
  36:  "@
  37:   
  38:  # If you have to use users and passwords, my condolences
  39:  $connectionTemplate = "Data Source={0};Integrated Security=SSPI;Initial Catalog={1};"
  40:  $connectionString = [string]::Format($connectionTemplate, $server, $database)
  41:  $connection = New-Object System.Data.SqlClient.SqlConnection
  42:  $connection.ConnectionString = $connectionString
  43:   
  44:  $command = New-Object System.Data.SqlClient.SqlCommand
  45:  $command.CommandText = $query
  46:  $command.Connection = $connection
  47:   
  48:  $SqlAdapter = New-Object System.Data.SqlClient.SqlDataAdapter
  49:  $SqlAdapter.SelectCommand = $command
  50:  $DataSet = New-Object System.Data.DataSet
  51:  $SqlAdapter.Fill($DataSet)
  52:  $connection.Close()
  53:   
  54:  # dump the data to a csv
  55:  # http://technet.microsoft.com/en-us/library/ee176825.aspx
  56:  $DataSet.Tables[0] | Export-Csv $extractFile', 
  57:          @database_name=N'master', 
  58:          @flags=0
  59:   
  60:  EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
  61:  EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
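
To smoke test the job right after creating it, you could kick it off manually (a minimal sketch, assuming the job name used above):

-- Run the job once, then check the job history and the CSV output path
EXEC msdb.dbo.sp_start_job @job_name = N'PowerShellExportTest';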

A little secret


I freaking love my new job. Tomorrow marks 1 week having been with Valorem Consulting Group. My first day, I got to learn how to apply security to dimensions in SSAS. The next day, I was out meeting with a client for 4 hours to talk with them about where they want to take their databases. They're on SQL Server 2005 and are trying to understand what their upgrade options are, and the licensing and costs associated with a move to 2008, 2008 R2 vs 2012. Oh, and I got to research database monitoring tools, SOX controls, tempdb contention and clustering vs VMware's hot failover thing. Obviously, I'm not done digging into that one ;) Day 3, I'm digging into another client's DW and the cubes they built out to support it. It was nice to be able to finally put into play the training I went to three years ago. Yesterday was a blur of documentation about our findings for the Day 2 client.

Today, I got to learn the details of the administrative things I "knew": tempdb should not be on C:. Great, let's move it. How do I do that? To the Bing cave! Sadly, it's rather trivial to move system databases. How about enabling Instant File Initialization and Lock Pages in Memory? Same story, easy to do because I walk where giants have trod (and blogged). I have more sections and pages in my new OneNote book for this place than I did in 9 months of consulting back at my last job. There are so many fun challenges ahead of me, I can't wait to see what I look like after all the growth is done.

SSMS templates


SSMS templates, a gem you probably aren't using

How do you keep track of the queries you need to run regularly?

SQL Server Management Studio has a Template Explorer feature that you probably have never seen, which is a pity, because it can be a real time saver.

Accessing the Template Explorer

The Template Explorer is available under the View menu or Ctrl-Alt-T for the keyboard fans.

Now that you have the "Template Browser" panel open, feel free to mock the inconsistent naming between the menu and the actual panel title (Template Explorer in the menu is labeled Template Browser in the panel).

Using a template

How do you add extended properties to a table? Sure, you could punch that into AltaVista or whatever search engine you use and you'd get hits for MSDN and StackOverflow, but what if you didn't have to go there? You don't have to, in fact. It's right there on your machine, a perfect example waiting for you in the Template Explorer.

Heaven forbid, what if you didn't have access to the internet? I know I don't install help files anymore and the few times I'm flying, I'm pretty far from having internet access. Double click on "Add Extended Properties to Table" and the following query gets pushed into a new SSMS query pane.

   1:  -- =========================================
   2:  -- Add Extended Properties to Table template
   3:  -- =========================================
   4:  USE <database, sysname, AdventureWorks>
   5:  GO
   6:   
   7:  IF OBJECT_ID('<schema_name, sysname, dbo>.<table_name, sysname, sample_table>', 'U') IS NOT NULL
   8:  DROP TABLE <schema_name, sysname, dbo>.<table_name, sysname, sample_table>
   9:  GO
  10:   
  11:  CREATE TABLE <schema_name, sysname, dbo>.<table_name, sysname, sample_table>
  12:  (
  13:  <columns_in_primary_key, , c1> <column1_datatype, , int> <column1_nullability,, NOT NULL>, 
  14:  CONSTRAINT <contraint_name, sysname, PK_sample_table> PRIMARY KEY (<columns_in_primary_key, , c1>)
  15:  )
  16:  -- Add description to table object
  17:  EXEC sys.sp_addextendedproperty 
  18:      @name=N'MS_Description', 
  19:      @value=N'<table_description_value,,Table description here>' ,
  20:      @level0type=N'SCHEMA', 
  21:      @level0name=N'<schema_name, sysname, dbo>', 
  22:      @level1type=N'TABLE', 
  23:      @level1name=N'<table_name, sysname, sample_table>'
  24:  GO
  25:   
  26:  -- Add description to a specific column
  27:  EXEC sys.sp_addextendedproperty 
  28:      @name=N'MS_Description', 
  29:      @value=N'<column_description,,Column description here>' ,
  30:      @level0type=N'SCHEMA', 
  31:      @level0name=N'<schema_name, sysname, dbo>', 
  32:      @level1type=N'TABLE', 
  33:      @level1name=N'<table_name, sysname, sample_table>', 
  34:      @level2type=N'COLUMN', 
  35:      @level2name=N'<columns_in_primary_key, , c1>'
  36:  GO

Ok, now what?
All of those beautiful <parameter, [type], value> in the above mean something to SSMS. They are template parameters and to unlock their power, hit Ctrl-Shift-M.

The resulting pop-up window is titled "Specify Values for Template Parameters". Compare the first row in that replace window to line 4 of the template query: all SSMS did was parse out the 3 values between the < and the > and assume that what's in between is getting replaced. It's not foolproof but it's "good enough."
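
To see the parameter syntax in action outside of the shipped templates, here's a tiny, made-up template of my own; every token follows the <name, type, default> pattern, and Ctrl-Shift-M will offer to replace all of them.

-- Hypothetical home-grown template: press Ctrl-Shift-M to fill in the parameters
SELECT TOP (<row_count, int, 100>) *
FROM <schema_name, sysname, dbo>.<table_name, sysname, sample_table>
WHERE <filter_column, sysname, id> = <filter_value, , 0>;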

Really using templates

There's nothing stopping you from making your own templates. Think about it: how do you keep track of all your queries? For my money, I'm using templates. Currently, I have 4 SSMS template folders available on my Dropbox site.
  • 45NewTSQLFeatures - this was my first attempt at a template library. I had a SQL Saturday presentation on 45 new language features in 2005 and 2008 in 45 minutes. Now it's my syntax reference for those features
  • SSIS - I built this out for my 2012 PASS Summit Lightning talk. It allows me to deploy .ispac files into the new SSISDB as well as tear it all down
  • Sweets - this is my SQL Saturday presentation on the new language features in SQL Server 2012.
  • Utility - this is a work in progress but the basic idea is that with my new job, I'm going to be walking into lots of strange, new environments and I'll need to be able to hit the ground running. I took Brent Ozar's concept of sp_blitz and added in all the other scripts I might need to handle whatever is thrown at me.

Installation

After Template Explorer is first opened, SSMS will create the basic structure in your AppData folder. If you can't see it, take the training wheels off Windows Explorer and show hidden folders. For my account and SQL Server 2012, they were installed in C:\Users\bfellows\AppData\Roaming\Microsoft\SQL Server Management Studio\110\Templates. Download the folders to your Templates folder and restart SSMS (it only scans for files when it is first opened).

Making your own

In the Template Browser, right-click in the background and you will have an option for New, where you can create either a new script or a folder. Right-click and select Edit to change a template; otherwise you'll only be changing a copy and not the original.

Moving SQL Server databases


It's not uncommon to find SQL Server installations where the physical file layout is less than optimal. We have a client who is having some performance issues, and in initial discovery we observed that they had the system databases on the C drive and had put their user database data and log files on the D drive. The challenge you can run into, and they are experiencing it, is that heavy workloads can saturate your I/O subsystem and that brings everything to a crawl. There's also a risk of running your C drive out of disk space that way, which can cause the OS itself to stop responding. The client has already purchased a SAN and as part of our effort, we're going to relocate user and system databases onto the SAN.

There are lots of permutations on how to lay that out but the general rule is to isolate logs, data files, tempdb and if you can spare it, the backups from each other. Knowing that's the right design is one thing, but how do I go about fixing a broken one? MSDN to the rescue, move system databases.

I started by moving tempdb. Since tempdb doesn't require moving the data or log files, it was the easiest to deal with. The reason you don't need to move them is that they are recreated whenever the service is restarted. I went through the example and was able to change the location of my tempdb as easy as can be.
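
For reference, the tempdb move boils down to a couple of ALTER DATABASE statements along these lines (the T:\TempDB path and the default logical file names are placeholders for your own; the change takes effect at the next service restart):

ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\TempDB\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'T:\TempDB\templog.ldf');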

The "other" system databases, those seemed a little more daunting because now I was going to be dealing with TSQL and OS commands. Yes, it's just "move" but still, I'm now going to be in two different environments and I'll need to repeat the process for all the databases. At this point, I start to get nervous about the amount of manual work to be done and more importantly the opportunity for errors to creep in. I don't know if I'm going to be pushing buttons at the client site or if they'll be at the keyboard. An automated solution sounds like it might be called for.

I wanted to go PowerShell but I held off on that approach as I wanted to deal with this situation a few times before I felt comfortable that I'd stepped on all the landmines. Instead, I wrote a TSQL script to generate my changes. The script generates 4 sets of commands for each row in sys.master_files. The first 2 columns change the file locations in the database and on disk. The last 2 columns undo those changes. Pro-tip: always have a backout strategy. For those looking at my SSMS Templates, in my Utility folder, I called this one DatabaseFileMover.sql. As of right now, that code looks like the following

DECLARE
    @NewLogLocation varchar(256) = 'L:\SQLLog'
,   @NewBackupLocation varchar(256) = 'M:\SQLBackups'
,   @NewDataLocation varchar(256) = 'S:\SQLData'
,   @NewTempDBLocation varchar(256) = 'T:\TempDB'
,   @AlterCommand varchar(512) = '
ALTER DATABASE <dbname/>
MODIFY FILE (NAME = <name/>, FILENAME = ''<new_location/>'');'
,   @MoveCommand varchar(512) = 'MOVE "<old_location/>" "<new_location/>"'
,   @DeleteCommand varchar(512) = 'DEL "<old_location/>"';

SELECT
    CASE
        WHEN D.type_desc = 'ROWS' AND D.DatabaseName <> 'tempdb'
            -- Undo version
            -- THEN REPLACE(REPLACE(REPLACE(@AlterCommand, '<dbname/>', D.databaseName), '<name/>', D.LogicalName), '<new_location/>', D.PhysicalName)
            THEN REPLACE(REPLACE(REPLACE(@AlterCommand, '<dbname/>', QUOTENAME(D.databaseName)), '<name/>', D.LogicalName), '<new_location/>', @NewDataLocation + '\' + D.BaseFileName)
        WHEN D.type_desc = 'LOG' AND D.DatabaseName <> 'tempdb'
            -- Undo version
            -- THEN REPLACE(REPLACE(REPLACE(@AlterCommand, '<dbname/>', D.databaseName), '<name/>', D.LogicalName), '<new_location/>', D.PhysicalName)
            THEN REPLACE(REPLACE(REPLACE(@AlterCommand, '<dbname/>', QUOTENAME(D.databaseName)), '<name/>', D.LogicalName), '<new_location/>', @NewLogLocation + '\' + D.BaseFileName)
        WHEN D.DatabaseName = 'tempdb'
            THEN REPLACE(REPLACE(REPLACE(@AlterCommand, '<dbname/>', QUOTENAME(D.databaseName)), '<name/>', D.LogicalName), '<new_location/>', @NewTempDBLocation + '\' + D.BaseFileName)
        ELSE NULL
    END AS AlterCommand
,   CASE
        WHEN D.type_desc = 'ROWS' AND D.DatabaseName <> 'tempdb'
            THEN REPLACE(REPLACE(REPLACE(@MoveCommand, '<old_location/>', D.PhysicalName), '<name/>', D.LogicalName), '<new_location/>', @NewDataLocation + '\' + D.BaseFileName)
        WHEN D.type_desc = 'LOG' AND D.DatabaseName <> 'tempdb'
            THEN REPLACE(REPLACE(REPLACE(@MoveCommand, '<old_location/>', D.PhysicalName), '<name/>', D.LogicalName), '<new_location/>', @NewLogLocation + '\' + D.BaseFileName)
        WHEN D.DatabaseName = 'tempdb'
            THEN REPLACE(REPLACE(REPLACE(@DeleteCommand, '<old_location/>', D.PhysicalName), '<name/>', D.LogicalName), '<new_location/>', @NewLogLocation + '\' + D.BaseFileName)
    END AS MoveCommand
,   CASE
        WHEN D.type_desc = 'ROWS' AND D.DatabaseName <> 'tempdb'
            THEN REPLACE(REPLACE(REPLACE(@AlterCommand, '<dbname/>', QUOTENAME(D.databaseName)), '<name/>', D.LogicalName), '<new_location/>', D.PhysicalName)
        WHEN D.type_desc = 'LOG' AND D.DatabaseName <> 'tempdb'
            THEN REPLACE(REPLACE(REPLACE(@AlterCommand, '<dbname/>', QUOTENAME(D.databaseName)), '<name/>', D.LogicalName), '<new_location/>', D.PhysicalName)
        WHEN D.DatabaseName = 'tempdb'
            THEN REPLACE(REPLACE(REPLACE(@AlterCommand, '<dbname/>', QUOTENAME(D.databaseName)), '<name/>', D.LogicalName), '<new_location/>', D.PhysicalName)
        ELSE NULL
    END AS UndoAlterCommand
,   CASE
        WHEN D.type_desc = 'ROWS' AND D.DatabaseName <> 'tempdb'
            THEN REPLACE(REPLACE(REPLACE(@MoveCommand, '<new_location/>', D.PhysicalName), '<name/>', D.LogicalName), '<old_location/>', @NewDataLocation + '\' + D.BaseFileName)
        WHEN D.type_desc = 'LOG' AND D.DatabaseName <> 'tempdb'
            THEN REPLACE(REPLACE(REPLACE(@MoveCommand, '<new_location/>', D.PhysicalName), '<name/>', D.LogicalName), '<old_location/>', @NewLogLocation + '\' + D.BaseFileName)
        WHEN D.DatabaseName = 'tempdb'
            THEN REPLACE(REPLACE(REPLACE(@DeleteCommand, '<new_location/>', D.PhysicalName), '<name/>', D.LogicalName), '<old_location/>', @NewLogLocation + '\' + D.BaseFileName)
    END AS UndoMoveCommand
FROM
(
    SELECT D.*
    ,   LEFT(D.PhysicalName, D.LastSlash) AS CurrentFolder
    ,   RIGHT(D.PhysicalName, LEN(D.PhysicalName) - D.LastSlash - 1) AS BaseFileName
    FROM
    (
        SELECT
            name AS LogicalName
        ,   physical_name AS PhysicalName
        ,   LEN(MF.physical_name) - CHARINDEX('\', REVERSE(MF.physical_name)) AS LastSlash
        ,   MF.type_desc
        ,   DB_NAME(MF.database_id) AS DatabaseName
        ,   MF.database_id
        FROM
            sys.master_files AS MF
        --WHERE
        --    MF.database_id < 5
    ) D
) D
It generated output like
For the master data file:
    AlterCommand:     ALTER DATABASE [master] MODIFY FILE (NAME = master, FILENAME = 'S:\SQLData\master.mdf');
    MoveCommand:      MOVE "C:\Program Files\Microsoft SQL Server\MSSQL11.DEV2012\MSSQL\DATA\master.mdf" "S:\SQLData\master.mdf"
    UndoAlterCommand: ALTER DATABASE [master] MODIFY FILE (NAME = master, FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL11.DEV2012\MSSQL\DATA\master.mdf');
    UndoMoveCommand:  MOVE "S:\SQLData\master.mdf" "C:\Program Files\Microsoft SQL Server\MSSQL11.DEV2012\MSSQL\DATA\master.mdf"

For the master log file:
    AlterCommand:     ALTER DATABASE [master] MODIFY FILE (NAME = mastlog, FILENAME = 'L:\SQLLog\mastlog.ldf');
    MoveCommand:      MOVE "C:\Program Files\Microsoft SQL Server\MSSQL11.DEV2012\MSSQL\DATA\mastlog.ldf" "L:\SQLLog\mastlog.ldf"
    UndoAlterCommand: ALTER DATABASE [master] MODIFY FILE (NAME = mastlog, FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL11.DEV2012\MSSQL\DATA\mastlog.ldf');
    UndoMoveCommand:  MOVE "L:\SQLLog\mastlog.ldf" "C:\Program Files\Microsoft SQL Server\MSSQL11.DEV2012\MSSQL\DATA\mastlog.ldf"
That looks about right for what I was intending. Column 0 was saved out to AlterCommand.sql. Column 1 to MoveCommand.bat. Column 2 became UndoAlterCommand.sql and Column 3 became UndoMoveCommand.bat.

I stared at the pending alter statements long and hard. Had a cup of tea and finally hit F5 but it failed. See, I didn't have any drive on my machine but C. Instead, I thought I could fake it out by using my friend subst. It works great for the SSIS scenarios I have where I need to pretend I have a particular path available for a hardcoded thing. But, SQL Server is smarter than that. It could see through my shenanigans and determine that my S, L and T drives were fakes and would not allow the ALTER DATABASE statements to pass.

After replacing all those fake drive letters with actual paths on C:, I was ready. I fired off the alter commands and they all worked. SQL Server reported that those changes would take effect the next time I restarted the service. Goodie, I double-click my MoveCommand.bat and it promptly throws up because those files are all in an administrator location (thanks, Windows 8). I couldn't see it failing though, as the batch script ended too quickly. If you add the command PAUSE at the end of a .bat file, it will keep the window open until someone hits a key. I fix the file, right-click and run as admin, but those files are still in use. Oh yeah, *stop* the service, then move the files. I stop SQL Server, re-run the batch script and see my mdfs and ldfs move into the right folders. Great success.

I started the service and what was that flash? Oh crap, why is it still red? Event Viewer (eventvwr) got me started by reminding me that I have an error log available that I can read without SSMS. It kindly informed me that on startup it couldn't find my master database. The funny thing was, it was still looking in C:\Program Files... Sonofagun, but fortunately I had my UndoMoveCommand.bat script there and after running it as admin, it put the files back in their correct spots. Restart SQL Server and it's still broken. This time it found the master data and log files but it was looking for model and msdb in C:\FakeDriveLocations. What the hell? The ALTER statement worked for everything *but* master. That seems unlikely. I looked back at the alter scripts and there didn't seem to be any difference between the master and model commands. Inspiration finally struck after reading and rereading the errorlog file and I looked at the service in SQL Server Configuration Manager (SQLServerManager11.msc). There on the Advanced tab were startup parameters of "-dC:\Program Files...\master.mdf;-e...ErrorLog;-lC:\Program Files...\mastlog.ldf"


The database engine startup options documentation explains that -d and -l specify where the master database data and log files live. To heck with startup parameters, I already specified where the files should live before I broke the server, use that! So I copied out the existing parameters and cut out the -d and -l options. I figured the error log was fine where it was and restarted the service. Again SQL Server failed to start, but this time the error message simply said it couldn't find master.mdf and mastlog.ldf, with no path specified.

Intriguing, but at this point, I just want to get my database back up. Once again, I modify the start up parameters and provide explicit paths for the master database using the new paths. This time when I start the service, it starts and all my databases are available. Nothing outrageous in the error log either. I start SQL Agent and it too seems fine. Huzzah!

I undo the whole process by running UndoAlterCommand.sql, stopping the SQL Server service, running UndoMoveCommand.bat as an admin, reverting the startup parameters for the service and finally restarting the service. Once again, green lights and my database seems operational. I think I have a winner.

How I plan to migrate database files

This, as with all information on this blog, is merely what I'm doing. Doing so in your environment may cause loss of data, irreparable harm and loss of life. Proceed at your own risk.
  1. Take a full backup of everything and ensure it is good
  2. Edit the above template script to ensure the files are going to valid locations
  3. Run the script and save out each column as described. Be sure to add the PAUSE command at the end of the .bat files
  4. Run the AlterCommand script against the server "sqlcmd -S yourserver -i AlterCommand.sql"
  5. Use SQL Server Configuration Manager to stop the SQL Server service
  6. Change the Startup Parameters to use the new locations for the master mdf and ldf
  7. Run the MoveCommand.bat as an admin
  8. Restart the SQL Service and check the ErrorLog
  9. Connect to the instance via SSMS and ensure all databases are online and accessible
  10. Check for Database Mail being configured and send an email (a minimal sketch follows this list)
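
For step 10, something along these lines would do; the 'Default' profile name and the recipient address are placeholders you'd swap for your own.

-- Confirm Database Mail still works after the move
EXEC msdb.dbo.sp_send_dbmail
    @profile_name = N'Default',
    @recipients   = N'dba@example.com',
    @subject      = N'Post file-move smoke test',
    @body         = N'SQL Server is back up with relocated database files.';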


SSIS-Encrypt data while in transit


A coworker asked if I had any experience with encrypting data while in transit with SSIS. Their client specifies that PII data must be protected at rest and in flight. They enforce the at-rest protection through something; it's not my client so I don't know the particulars, but we'll be pulling data like a social security number (SSN) out of the financial system and into the data warehouse, and while that data is between places, our ETL needs to safeguard it.

I had never had to deal with encryption, but in SSIS, or any ADO.NET connection, we can encrypt the connection and everything flowing down the wire should be encrypted (see Encrypting Connections to SQL Server). It looks like there's some setup involved, but once it's done, it should be a matter of simply updating the connection string from a usage perspective.

The second option was to use the native, but new-to-me, TSQL functions for encryption and decryption. The following example uses the EncryptByPassPhrase/DecryptByPassPhrase functions to encrypt and decrypt the names from spt_values. At this point I will also point out that crypto stuff has never been anything I've been passionate about, so do your due diligence before you go implementing something you read on the interblag.
See also: Cryptographic functions

DECLARE @PassphraseEnteredByUser nvarchar(128);
SET @PassphraseEnteredByUser = 'Bill is the greatest';

WITH SRC AS
(
    SELECT
        SV.name
    ,   ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS FakeSK
    FROM
        master.dbo.spt_values AS SV
)
, ENCRYPTO AS
(
    SELECT
        S.name
    ,   EncryptByPassPhrase
        (
            @PassphraseEnteredByUser
        ,   S.name
        ,   1
        ,   CONVERT(varbinary, S.FakeSK)
        ) AS encrypted_text
    ,   S.FakeSK
    FROM
        SRC S
)
, DECRYPTO AS
(
    SELECT
        E.*
    ,   CONVERT(nvarchar, DECRYPTBYPASSPHRASE
        (
            @PassphraseEnteredByUser
        ,   E.encrypted_text
        ,   1
        ,   CONVERT(varbinary, E.FakeSK)
        )) AS decrypted_text
    FROM
        ENCRYPTO E
)
SELECT
    D.*
FROM
    DECRYPTO AS D;

In the SSIS world, we could use this as our source query. The challenge with doing that is we'd either need to write the encrypted value into a staging table and then decrypt in a post-cleanup step (Execute SQL Task), or decrypt in the data flow prior to writing to the target table. But at that point, we'd have unencrypted SSNs in memory. If we run out of memory in the data flow and spill to disk, the temp files would have the unencrypted SSNs in them too. The automatic, post-execution step would handle the cleanup on those files, but there'd be a ghost image of them if people were NSA crazy about getting the data back. Either way, that might violate the client's rules on PII security.

The other thing to note about encrypting the data is that you're going to go from a simple 9-byte allocation per row to a varbinary(8000), which means the data flow will either write the binary portion to disk and carry a pointer along to that field in the in-memory portion, or the engine will have to allocate 8K bytes of memory per row in the buffer. Either way, you're probably going to feel a performance impact for large datasets, but we'd have to test before we can know the possible severity.
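
If you want to see the size difference for yourself, here's a quick sketch (the passphrase is the same throwaway one from the example above, and the SSN is obviously fake):

-- Compare the footprint of a plain SSN versus its encrypted form
DECLARE @ssn char(9) = '123456789';
SELECT
    DATALENGTH(@ssn) AS plain_bytes
,   DATALENGTH(ENCRYPTBYPASSPHRASE('Bill is the greatest', @ssn)) AS encrypted_bytes;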

Sync to SharePoint Workspaces


Sync to SharePoint Workspace, an undiscovered jewel

I hate SharePoint. It's generally a pain in the ass. Today, I was pointed at a SharePoint site being used as a document library for a client engagement. There are lots of documents in there as we've been working with this client for a few months. One of our folders contains images of our data models and even though they were proper image files, for whatever reason, the server wouldn't serve them as such. Instead, I'd get a dialog asking me what I'd like to do with each one of them. I could tell you in detail what I thought SP could do with them, but then I'd need to flag my blog as mature content.

Instead, I said there has to be a better way. I first turned to the old standby of datasheet view, but that didn't work as my client's site didn't support it (we use a hosted SP site). Wait, no, it was the Windows Explorer view that is so handy. Nope, that didn't work either. At this point, I'm thinking, where's my SkyDrive "download all" button?

Sync to SharePoint Workspace

Ladies and Gentlemen, I give you the SharePoint "Download All" button.
I will also present my neophyte SnagIt skills. Look at that sawtooth crop!

From what I've been reading, this sync is actually closer to a SkyDrive/Dropbox experience without the need for an explicit client installation, or at least it's already set up from installing Office. The dialog asked me what I was looking to sync and where it should store the content locally, and after a few minutes of downloading, I had the entirety of our SharePoint library on my machine. How cool is that?

Why is this the best thing ever?

That might be a bit of hyperbole, but it is rather useful. Enough so that I might be able to dial back the hatred to a distaste. What's so useful? For starters, I hate the constant page refresh when I'm in a SharePoint library/list. Windows Explorer isn't going to do that to me. I don't have to worry about whether someone has configured IIS to send the right handler information along. Windows already knows how to handle these files. I don't have to go through a click dance because "this file came from the internet and might melt your face". Well, I'll probably need to indicate that the path I Sync to is trusted but I can handle that.


SSIS Excel Source via Script

This is a quick and dirty post on how I deal with ugly Excel as a source, in response to this StackOverflow question. Originally, I built this script to address some data I needed to import that was stored in Excel. It wasn't typical Excel data of rows and columns. No, it was a repeating report that was exported to Excel. There was no way in hell I was going to be able to make the Excel driver work with that data to get it imported. I recently had cause to bring this code out to parse a file that will be consistent with the layout but the column headers will change between sheets. It's a forecasting spreadsheet so tab 1 is people, tab 2 is dollars, tab 3 is hours, etc.

Script Task

You will need to define your output columns and their data types and then go in and add the 2 classes described below to the existing project.

Main

This is used in the Script Task's Main section. There is probably a more graceful way of doing this but it works. In my PreExecute, I instantiate a DataSet by calling the static method ReadExcelQuery. This fills a "raw" dataset, which is literally the data exactly as it comes out of Excel. I then pass the raw dataset into the ParseSample method and get a different dataset out. In this example, there were two sheets in the Excel workbook that I needed to get the data out of. After the PreExecute has completed, my member variable parsedData is populated with a few data tables. There is a raw table, which I find useful for seeing the data exactly as it was extracted from Excel. This was useful when I was running into dates which were really just integers (see the ConvertXlsdtToDateTime method). There will be a parsed data table which is populated from your specific Parse method. Finally, there is an Error data table which lists the row & column that didn't parse as expected.

Given the fun that is debugging in SSIS Script Tasks, I strongly encourage you to create a .NET console project. Inside that project's Main, you can basically paste the PreExecute method's code and then work on getting your Parse logic squared away. Once that's done, add the classes into the actual SSIS Script Task. At that point, you're simply wiring up the resultant dataset into the OutputBuffer.

   1:  using System;
   2:  using System.Data;
   3:  using System.Collections.Generic;
   4:  using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
   5:  using Microsoft.SqlServer.Dts.Runtime.Wrapper;
   6:   
   7:  /// <summary>
   8:  /// This is the class to which to add your code.  Do not change the name, attributes, or parent
   9:  /// of this class.
  10:  /// </summary>
  11:      [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
   12:      public class ScriptMain : UserComponent
  13:      {
  14:          DataSet parsedData;
  15:   
   16:          public override void PreExecute()
  17:          {
  18:  base.PreExecute();
  19:  this.parsedData = new DataSet();
  20:   
  21:  string fileName = this.Variables.CurrentFileName;
  22:              DataSet ds = null;
  23:  string worksheetName = string.Empty;
  24:  bool hasHeader = false;
  25:              DataSet excel = null;
  26:   
  27:              List<string> worksheets = new List<string>() { "Sheet1$", "Sheet2$"};
  28:  foreach (string worksheet in worksheets)
  29:              {
  30:                  excel = ExcelReader.ReadExcelQuery(fileName, worksheet, hasHeader);
  31:  if (ds == null)
  32:                  {
  33:                      ds = ExcelParser.ParseSample(FileFormatter.Sample, excel, worksheet);
  34:                  }
  35:  else
  36:                  {
  37:                      ds.Merge(ExcelParser.ParseSample(FileFormatter.Sample, excel, worksheet));
  38:                  }
  39:              }
  40:   
  41:  this.parsedData = ds.Copy();
  42:          }
  43:   
  44:  /// <summary>
  45:  /// Slog through parsedData and emit
  46:  /// </summary>
   47:          public override void CreateNewOutputRows()
  48:          {
  49:  bool pbCancel = false;
   50:              foreach (DataRow item in this.parsedData.Tables["Sample_TYPE"].Rows)
  51:              {
  52:  try
  53:                  {
  54:                      SampleBuffer.AddRow();
  55:                      SampleBuffer.Company = item["Company"].ToString();
  56:                      SampleBuffer.Entity = item["Entity"].ToString();
  57:                      SampleBuffer.Function = item["Function"].ToString();
  58:   
  59:                  }
  60:  catch (Exception)
  61:                  {
  62:   
  63:  throw;
  64:                  }
  65:   
  66:              }
  67:   
   68:              foreach (DataRow item in this.parsedData.Tables["ERROR"].Rows)
  69:              {
  70:                  ErrorOutputBuffer.AddRow();
  71:                  ErrorOutputBuffer.entity = item["entity"].ToString();
  72:                  ErrorOutputBuffer.severity = item["severity"].ToString();
  73:                  ErrorOutputBuffer.messagecode = item["msg_cd"].ToString();
  74:                  ErrorOutputBuffer.messagetext = item["msg_tx"].ToString();
  75:                  ErrorOutputBuffer.value = item["value"].ToString();
  76:                  ComponentMetaData.FireError(0, "Sample parser", item["value"].ToString(), string.Empty, 0, out pbCancel);
  77:              }
  78:   
  79:          }
  80:   
  81:      }

ExcelParser class

Here is where the guts of your coding belongs. You will work through that raw dataset in a row by row fashion. It's not the greatest but it works. You will also need to define what your intended output looks like in the "MockDataTable" method.

   1:  using System;
   2:  using System.Collections.Generic;
   3:  using System.Data;
   4:  using System.Data.OleDb;
   5:  using System.Data.SqlClient;
   6:  using System.Diagnostics;
   7:  using System.Linq;
   8:  using System.Text;
   9:  using System.Text.RegularExpressions;
  10:   
  11:  /// <summary>
  12:  /// This class is concerned with parsing data from raw datasets
  13:  /// </summary>
   14:      public class ExcelParser
  15:      {
  16:  /// <summary>
  17:  /// Value translates to Class.MethodName, used for logging
  18:  /// </summary>
   19:          private static string logFileName = System.Reflection.MethodBase.GetCurrentMethod().DeclaringType.ToString();
  20:  
  21:  /// <summary>
  22:  /// Prevents a default instance of ExcelParser class from being created.
  23:  /// </summary>
  24:  private ExcelParser()
  25:          {
  26:          }
  27:   
  28:  /// <summary>
  29:  /// Parse some Excel file
  30:  /// </summary>
  31:  /// <param name="fileFormat"></param>
  32:  /// <param name="rawData"></param>
  33:  /// <returns></returns>
   34:          public static DataSet ParseSample(FileFormatter fileFormat, DataSet rawData, string workSheetName)
  35:          {
  36:              System.Data.DataSet ds = null;
  37:              System.Data.DataTable dt = null;
  38:              System.Data.DataTable errors = null;
  39:              System.Data.DataRow newRow = null;
  40:   
   41:              // monotonically increasing number
  42:  int rowNumber = 0;
  43:   
  44:  // Key to understand when a new group has been found
  45:  string currentEntity = string.Empty;
  46:  string current = string.Empty;
  47:  decimal someValue = 0;
  48:   
  49:  // Error reporting
  50:  string errorSeverity = string.Empty;
  51:  string errorCode = string.Empty;
  52:  string errorMessage = string.Empty;
  53:  string errorValue = string.Empty;
  54:   
  55:              errorSeverity = "fatal";
  56:              errorCode = "invalid data type";
  57:              errorMessage = "The supplied value does not match expected type";
  58:              errors = ExcelParser.MockErrorTable();
  59:   
  60:              ds = new DataSet();
  61:              dt = ExcelParser.MockDataTable(fileFormat);
  62:  int outRowOrdinal = 15;
  63:  int outYearOrdinal = 16;
  64:  int outTabNameOrdinal = 17;
  65:  bool keepRow = true;
  66:  int processingYear = 0;
  67:   
  68:  // Parsing logic goes here
  69:  try
  70:              {
  71:  foreach (DataRow dr in rawData.Tables[0].Rows)
  72:                  {
  73:  // Skip first 3 rows
  74:                      newRow = dt.NewRow();
  75:  if (rowNumber < 2)
  76:                      {
  77:                          rowNumber++;
  78:  continue;
  79:                      }
  80:  
  81:  // On the third row, we need to peel out the year
  82:  if (rowNumber == 2)
  83:                      {
  84:  // Jan 13 Forecast
  85:                          current = dr[3].ToString().Split()[1];
  86:                          processingYear = Int32.Parse(current) + 2000;
  87:                          rowNumber++;
  88:  continue;
  89:                      }
  90:   
  91:  // Keep track of the forecast amount as we only store rows which
  92:  // are not zeroed out. 
  93:                      keepRow = false;
  94:   
  95:  for (int i = 0; i < 15; i++)
  96:                      {
  97:                          current = dr[i].ToString();
  98:  switch (i)
  99:                          {
 100:   
 101:  case 0:
 102:  // Company
 103:                                  newRow[i] = current;
 104:  break;
 105:  case 1:
 106:  // Department
 107:  if (!string.IsNullOrEmpty(current))
 108:                                  {
 109:                                      currentEntity = current.Trim();
 110:                                  }
 111:                                  newRow[i] = currentEntity;
 112:  break;
 113:  case 2:
 114:  // Division
 115:   
 116:                                  newRow[i] = current.Trim();
 117:  break;
 118:  default:
 119:  if (!Decimal.TryParse(current, out someValue))
 120:                                  {
 121:                                      errorValue = string.Format(
 122:  "Unable to convert {0} to an decimal. '{3}' ({1}{2})",
 123:                                          current,
 124:                                          ExcelParser.GetExcelColumnName(i),
 125:                                          rowNumber,
 126:                                          workSheetName);
  127:                                      errors.Rows.Add(new object[] { fileFormat.ToString(), errorSeverity, errorCode, errorMessage, errorValue });
 128:                                  }
 129:  else
 130:                                  {
 131:                                      newRow[i] = someValue;
 132:  if (someValue != 0)
 133:                                      {
 134:                                          keepRow = true;
 135:                                      }
 136:                                  }
 137:   
 138:  break;
 139:                          }
 140:                      }
 141:   
 142:                      newRow[outRowOrdinal] = rowNumber++;
 143:                      newRow[outTabNameOrdinal] = workSheetName.TrimEnd(new[] { '$' });
 144:                      newRow[outYearOrdinal] = processingYear;
 145:  if (keepRow)
 146:                      {
 147:                          dt.Rows.Add(newRow);                        
 148:                      }
 149:                  }
 150:              }
 151:  catch (Exception ex)
 152:              {
 153:                  errorValue = string.Format(
 154:  "{0} {4} '{3}' {1}{2}",
 155:                              logFileName,
 156:                              ExcelParser.GetExcelColumnName(0),
 157:                              rowNumber,
 158:                              workSheetName,
 159:                              ex.Message);
 160:  //LogUtil.Error(logFileName, errorValue, ex);
  161:                  errors.Rows.Add(new object[] { fileFormat.ToString(), errorSeverity, "UnableToParse", "Unable to process file.  Contact development", errorValue });
 162:              }
 163:   
 164:              ds.Tables.Add(dt);
 165:              ds.Tables.Add(errors);
 166:   
 167:  //LogUtil.Debug(System.Reflection.MethodBase.GetCurrentMethod().DeclaringType.ToString(), string.Format("{0} Leaving method", ExcelParser.logFileName));
 168:   
 169:  return ds;
 170:          }
 171:   
 172:  /// <summary>
 173:  /// Method to generate the format for an error table
 174:  /// </summary>
 175:  /// <returns>A template for error messages</returns>
  176:          public static DataTable MockErrorTable()
 177:          {
 178:              DataTable dt = null;
 179:              dt = new DataTable("ERROR");
 180:              dt.Columns.Add("entity", System.Type.GetType("System.String"));
 181:              dt.Columns.Add("severity", System.Type.GetType("System.String"));
 182:              dt.Columns.Add("msg_cd", System.Type.GetType("System.String"));
 183:              dt.Columns.Add("msg_tx", System.Type.GetType("System.String"));
 184:              dt.Columns.Add("value", System.Type.GetType("System.String"));
 185:  return dt;
 186:          }
 187:   
 188:  /// <summary>
 189:  /// Translate a number into its equivalent Excel column name.  e.g. 0 -> A; 2 -> C
 190:  /// </summary>
 191:  /// <remarks>http://stackoverflow.com/questions/181596/how-to-convert-a-column-number-eg-127-into-an-excel-column-eg-aa</remarks>
 192:  /// <param name="columnNumber">A zero based number that maps to an excel column ordinal</param>
 193:  /// <returns>A string of letters that are Excel column names</returns>
  194:          public static string GetExcelColumnName(int columnNumber)
 195:          {
 196:  int dividend = columnNumber + 1;
 197:  string columnName = String.Empty;
 198:  int modulo;
 199:   
 200:              while (dividend > 0)
 201:              {
 202:                  modulo = (dividend - 1) % 26;
 203:                  columnName = Convert.ToChar(65 + modulo).ToString() + columnName;
 204:                  dividend = (int)((dividend - modulo) / 26);
 205:              }
 206:   
 207:  return columnName;
 208:          }
 209:   
 210:  /// <summary>
 211:  /// Seriously?  For the loss
 212:  /// <see cref="http://www.debugging.com/bug/19252"></see>
 213:  /// </summary>
 214:  /// <param name="excelDate">Number of days since 1900-01-01</param>
 215:  /// <returns>The converted days to date</returns>
  216:          public static DateTime ConvertXlsdtToDateTime(int excelDate)
 217:          {
 218:              DateTime dt = new DateTime(1899, 12, 31);
 219:   
 220:  // adjust for 29 Feb 1900 which Excel considers a valid date
 221:  if (excelDate >= 60)
 222:              {
 223:                  excelDate--;
 224:              }
 225:   
 226:  return dt.AddDays(excelDate);
 227:          }
 228:   
 229:  /// <summary>
 230:  /// In an attempt to simplify the actual parsing code, I have moved the
 231:  /// table creation and column definition into this method.
 232:  /// </summary>
 233:  /// <param name="fileFormat">The appropriate fileformat to use</param>
 234:  /// <returns>A well formed datatable for the given input type</returns>
  235:          private static DataTable MockDataTable(FileFormatter fileFormat)
 236:          {
 237:              DataTable dt = null;
 238:   
 239:  switch (fileFormat)
 240:              {
 241:  case FileFormatter.Sample:
 242:                      dt = new System.Data.DataTable("Sample_TYPE", "tableNamespace");
 243:                      dt.Columns.Add("Company", System.Type.GetType("System.String"));
 244:                      dt.Columns.Add("Entity", System.Type.GetType("System.String"));
 245:                      dt.Columns.Add("Function", System.Type.GetType("System.String"));
 246:  break;
 247:  default:
 248:                      dt = new System.Data.DataTable("Unknown", "tableNamespace");
 249:                      dt.Columns.Add("row_id", System.Type.GetType("System.Int32"));
 250:                      dt.Columns.Add("name", System.Type.GetType("System.String"));
 251:                      dt.Columns.Add("value", System.Type.GetType("System.Decimal"));
 252:  break;
 253:              }
 254:   
 255:  return dt;
 256:          }
 257:      }    
 258:  
 259:  /// <summary>
 260:  /// The specific file formatters.  
 261:  /// </summary>
  262:      public enum FileFormatter
 263:      {
 264:  /// <summary>
 265:  /// Budgeting file
 266:  /// </summary>
 267:          Sample,
 268:      }

ExcelReader class

This class is responsible for querying columns out of an Excel file as a dataset. A worksheet has a $ at the end of its name which isn't visible in Excel. This also works with named ranges, which won't have a trailing $.

   1:  using System;
   2:  using System.Collections.Generic;
   3:  using System.Data;
   4:  using System.Data.OleDb;
   5:  using System.Data.SqlClient;
   6:  using System.Diagnostics;
   7:  using System.Linq;
   8:  using System.Text;
   9:  using System.Text.RegularExpressions;
  10:   
  11:  /// <summary>
  12:  /// This class is concerned with extracting data from non-tabular Excel 
  13:  /// documents.
  14:  /// </summary>
   15:      public class ExcelReader
  16:      {
  17:  /// <summary>
  18:  /// Value translates to Class.MethodName, used for logging
  19:  /// </summary>
   20:          private static string logFileName = System.Reflection.MethodBase.GetCurrentMethod().DeclaringType.ToString();
  21:   
  22:  /// <summary>
  23:  /// Template for an OLEDB provider to read Excel data
  24:  /// </summary>
   25:          private static string providerTemplate = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"{0}\";Extended Properties=\"Excel 8.0;HDR={1};IMEX=1\"";
  26:   
  27:  /// <summary>
  28:  /// Template for an OLEDB provider to read Excel data
  29:  /// </summary>
   30:          private static string providerTemplateXLSX = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=\"{0}\";Extended Properties=\"Excel 12.0 Xml;HDR={1};IMEX=1\"";
  31:   
  32:  /// <summary>
  33:  /// The row id column will always be the last element in the unparsed data
  34:  /// </summary>
   35:          private static string rowIdColumnName = "row_id";
  36:   
  37:  /// <summary>
  38:  /// Prevents a default instance of the ExcelReader class from being created.
  39:  /// Static methods so constructor is hidden
  40:  /// </summary>
  41:  private ExcelReader()
  42:          {
  43:          }
  44:   
  45:  /// <summary>
  46:  /// Gets the row id column name
  47:  /// </summary>
   48:          public static string RowIdColumnName
  49:          {
  50:  get { return ExcelReader.rowIdColumnName; }
  51:          }
  52:   
  53:  /// <summary>
  54:  /// Utility method for pulling base excel data into a dataset
  55:  /// </summary>
  56:  /// <param name="filePath">Base folder for a file</param>
  57:  /// <param name="fileName">actual file name</param>
  58:  /// <param name="worksheetName">The name of the Excel worksheet</param>
  59:  /// <param name="hasHeader">Boolean indicating whether there is a header row</param>
  60:  /// <returns>A dataset representation of a spreadsheet</returns>
   61:          public static DataSet ReadExcelQuery(string filePath, string fileName, string worksheetName, bool hasHeader)
  62:          {
  63:  return ExcelReader.ReadExcelQuery(System.IO.Path.Combine(filePath, fileName), worksheetName, hasHeader);
  64:          }
  65:   
  66:  /// <summary>
  67:  /// Utility method for pulling base excel data into a dataset
  68:  /// </summary>
  69:  /// <param name="fileName">The filename to work against</param>
  70:  /// <param name="worksheetName">The name of the Excel worksheet</param>
  71:  /// <param name="hasHeader">Boolean indicating whether there is a header row</param>
  72:  /// <returns>Dataset with raw dataset</returns>
   73:          public static DataSet ReadExcelQuery(string fileName, string worksheetName, bool hasHeader)
  74:          {
  75:              DataSet dataSet = new DataSet();
  76:  string datatableName = string.Empty;
  77:  string excelConnectionString = string.Empty;
  78:  string query = string.Empty;
  79:              datatableName = "RawData";
  80:              query = string.Format("SELECT * FROM [{0}]", worksheetName);
  81:   
  82:              excelConnectionString = string.Format(ExcelReader.providerTemplate, fileName, hasHeader ? "YES" : "NO");
  83:   
  84:  //Console.WriteLine(string.Format("{0} Processing file {1}  Query {2}", logFileName, fileName, query));
  85:  //Console.WriteLine(string.Format("{0} excelConnectionString {1}", logFileName, fileName));
  86:  //Console.WriteLine(string.Format("{0} excelConnectionString {1}", logFileName, excelConnectionString));
  87:   
  88:  // Ensure file exists
  89:  if (!System.IO.File.Exists(fileName))
  90:              {
  91:  //Console.WriteLine(string.Format("FileNotFound!  Expected fileName: {0}", logFileName, fileName),
  92:  //    new System.IO.FileNotFoundException("FileNotFount exception", fileName));
  93:  return dataSet;
  94:              }
  95:   
  96:  try
  97:              {
  98:                  OleDbDataAdapter oleDbDataAdapter = new OleDbDataAdapter(
  99:                      query, excelConnectionString);
 100:   
 101:                  dataSet.Tables.Add(datatableName);
 102:                  oleDbDataAdapter.Fill(dataSet.Tables[datatableName]);
 103:   
 104:  // add a column in for record id
 105:                  DataColumn rowIdColumn = new DataColumn(ExcelReader.rowIdColumnName, System.Type.GetType("System.Int32"));
 106:   
 107:                  dataSet.Tables[datatableName].Columns.Add(rowIdColumn);
 108:   
 109:  // populate record id
 110:  for (int i = 0; i < dataSet.Tables[datatableName].Rows.Count; i++)
 111:                  {
 112:                      dataSet.Tables[datatableName].Rows[i][ExcelReader.rowIdColumnName] = i + 1;
 113:  if (hasHeader)
 114:                      {
 115:  // TODO:  fix this dreck
 116:  // This section is all wrong and untested code
 117:  if (i == 0)
 118:                          {
 119:                              i++;
 120:                          }
 121:   
 122:                          dataSet.Tables[datatableName].Rows[i][ExcelReader.rowIdColumnName] = i + 1;
 123:                      }
 124:                  }
 125:   
 126:  //LogUtil.Info(logFileName, string.Format("Dataset created for file: {0}", fileName));
 127:  return dataSet;
 128:              }
 129:  catch (System.IO.IOException ex)
 130:              {
 131:  // Most likely the file is locked by another process
 132:  //LogUtil.Error(logFileName, string.Format("Exclusive lock on file: {0} Error:  {1}", fileName, ex.ToString()));
 133:  returnnew DataSet();
 134:              }
 135:  catch (System.Data.OleDb.OleDbException ex)
 136:              {
 137:  // The tab name does not exist in the spreadsheet.  
 138:  // Odds are decent they forgot to append a $ to the tab name
 139:  //Console.Error.WriteLine(string.Format("Unable to find tab {2} in file {0} Error:  {1}", fileName, ex.ToString(), worksheetName));
 140:   
 141:  if (!worksheetName.EndsWith("$"))
 142:                  {
 143:  //LogUtil.Error(logFileName, string.Format("Retrying as tab {1}$ in file {0}", fileName, worksheetName));
 144:  return ExcelReader.ReadExcelQuery(fileName, worksheetName + "$", hasHeader);
 145:                  }
 146:   
 147:  returnnew DataSet();
 148:              }
 149:  catch (Exception ex)
 150:              {
 151:  //LogUtil.Error(logFileName, string.Format("Reading inital dataset failed: {0} Error:  {1}", fileName, ex.ToString()));
 152:  returnnew DataSet();
 153:              }
 154:          }
 155:      }

File in use by another process

There's nothing quite so delightful as getting error messages that don't provide enough information to resolve the base problem. Case in point, I went to run an SSIS package which in 2012 requires a rebuild of the .ispac and received the following error message.

System.IO.IOException: The process cannot access the file 'C:\Src\DataWarehouse\bin\Development\DataWarehouse.ispac' because it is being used by another process.

What is the other process? I've seen people resort to a reboot because they can't figure out which process has its grubby fingers on the file, but that is overkill. All you really need is the free utility "Process Explorer" from Microsoft/Sysinternals.

Download it or simply run it from the website (but you'll want to download it and keep it as part of your troubleshooting kit) and once it pops up, under the Find menu, select "Find Handle or DLL..." (Ctrl-F).



In the resulting window, simply search for the file that is being reported as locked.



Here, I searched for the .ispac extension and discovered that a DtsDebugHost.exe still had the .ispac file open. In this specific example, that tells me I should check my Visual Studio instances, as that'd be the process that launched DtsDebugHost. That was my instinct; however, I had no instances of VS/SSDT/BIDS running, so I chalked it up to a zombie process and just killed it.

You can then simply double click on the line with the process and the underlying process list will highlight the offending PID (9388). In the resulting window, right click on the process line and select Kill.

Summit 2013, I'm speaking!


I'm in!

I might be a little stoked, so please forgive this post, but I've had a submission accepted for the 2013 PASS Summit. In the "BI Platform Architecture, Development & Administration" track, I'll be delivering a talk titled "Type more, click less: Programmatically build SSIS packages."

Gratuitous browser abuse

Marquee tag: I'm speaking!

Blink tag: I'm presenting at Summit 2013

Type more, click less: Programmatically build SSIS packages.

Why should you come to my session? Because you would love to automate some of the repetitive tasks you perform with SSIS.

You did know you didn't have to fire up BIDS/SSDT to make all your changes, right? In fact, did you know you can build complete packages without ever firing up BIDS? I wouldn't go that route unless I had a strong business case for doing so, but a little bit of up front work can yield an N% complete SSIS package.

Specific technologies I'll cover are BIMLScript, EzAPI and the base .NET library.

After attending this session, I want you to know what your options are and when it makes sense to use one approach instead of the other.

BimlScript overview


BimlScript overview

This is a followup of sorts to my posts on creating SSIS packages with the EzAPI. In these posts, I'll show how to create the same SSIS packages using Biml.

Prerequisites

The pre-reqs for using Biml to create SSIS packages are simple: Visual Studio (BIDS/SSDT) with an Integration Services project and the free BIDS Helper add-in installed.

Getting started with Biml

Biml, Business Intelligence Markup Language, is a DSL for creating BI artifacts. It's an interesting proposition: learn a third language to speak a second language more efficiently. Note that Biml covers both SSIS packages and Analysis Services, but for now I'm only covering how to build SSIS packages. The more I've been working with this, the more I think this dog might hunt...

Open Visual Studio/BIDS/SSDT-BI and create a new project of type "Integration Services Project".
New Project - Integration Services Project

These are the SQL Server 2012 templates in Visual Studio 2012, but the beautiful thing about Biml is that it's going to work with whatever version of BIDS Helper you've installed.
BimlSSIS

Right click on your project and select "Add New Biml File".
Add New Biml File
You will now have a BimlScript.biml file in your Miscellaneous folder. You may rename it if you wish; the file's name has no bearing on the output it generates.
BimlScript.biml
Double click it and Visual Studio will open it and you'll be greeted with


<Biml xmlns="http://schemas.varigence.com/biml.xsd">
</Biml>

To quote Kitty: Wow, that's really something Lex. It's freaking gone with the wind.

You'd be correct, this script won't do anything. Let's describe the Hello World equivalent of an SSIS package. We must create a Packages collection and inside that, add a Package node. Every Package you define will translate into a physical .dtsx file. Each Package node requires two attributes: Name and ConstraintMode.


<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<Package
Name="SimpleBimlPackage"
ConstraintMode="Linear"
/>
</Packages>
</Biml>

A beautiful thing you should notice is the Intellisense. Perhaps it's just my age, but a good implementation of an autocomplete system is invaluable. What I really like, especially since I'm just fumbling my way through Biml, is that after defining an attribute, it no longer shows in the autocomplete list.
Intellisense

At this point, we still have no package, just a description of what the project as a whole and the package itself should be. Let's correct that. Right click on the BimlScript.biml file and select "Check Biml for Errors".
Check Biml for Errors

After a moment or two, BIDSHelper should report back "No errors or warnings were found."
No errors or warnings were found.

Once more, right click on "BimlScript.biml" but this time choose the "Generate SSIS Packages" option. After a moment and a flicker, you will notice you now have a real SSIS package.
SimpleBimlPackage.dtsx

Examine the package from head to toe and you'll find it's devoid of anything. We now have a mirror of the two packages created with EzAPI and the straight API calls. Given the beauty of tab completion, I'm going to award Biml the fewest keystrokes to generate a package.

Yup, looks like a package. One way you can differentiate a package generated by EzAPI, BIDS/SSDT, Biml or the straight .NET API is the Description property. In the case of the .NET API or Biml, the Description is only populated if you provide an explicit description.
Package properties


SSISDB folders, projects, environments explanation


Integration Services Catalog

The 2012 release of SQL Server brought a new capability for SSIS package management and I wanted to put together a comprehensive post on some of the management side of things. Getting the nomenclature sorted out can be rather helpful, so let's start there. Within SQL Server Management Studio, SSMS, connect to a 2012 instance. At the bottom, there should be an "Integration Services Catalogs" node. If that expands when you click the plus sign, then someone has configured the server to run Integration Services.

Integration services catalog

1. SSISDB

Despite the "Catalogs" part of the above label, there is only one allowable catalog for the 2012 release---the SSISDB. That is a physical catalog/database on the instance that contains all of the bits and bobs for managing Integration Services from within SQL Server for those using the new Project deployment model.

And lo, the people rejoiced as Microsoft has stopped stuffing everything into msdb...

Right click options

  • Active Operations - Lists all currently running SSIS packages with an option to Stop the selected package.
  • Create Folder - Creates a folder with an optional description
  • Start PowerShell - Self explanatory
  • Reports - A variety of native reports about package executions.
  • Delete - This deletes the SSISDB catalog. Doing this will result in having to recreate the catalog and a loss of all your history.
  • Refresh - Self explanatory
  • Properties - Contains information about the default logging settings and the project versioning.
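
Those Properties are just rows in the catalog. If you prefer a query window, here is a minimal T-SQL sketch (assuming the SSISDB catalog already exists) of inspecting and adjusting the same settings the dialog exposes.

-- Show every catalog-level setting, including logging level and version retention
SELECT property_name, property_value
FROM SSISDB.catalog.catalog_properties;

-- Keep 60 days of operational history and cap stored project versions at 10
EXEC SSISDB.catalog.configure_catalog @property_name = N'RETENTION_WINDOW', @property_value = 60;
EXEC SSISDB.catalog.configure_catalog @property_name = N'MAX_PROJECT_VERSIONS', @property_value = 10;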

2. Folder

A folder contains Project(s) and, optionally, Environment(s). You must create a folder; the deployment wizard will not automagically create one for you. You may call it whatever you like. You may have all of your SSIS projects under a single folder, have a 1:1 mapping between projects and folders, or anything in between. Folders are only one level deep; no nesting.

Right click options

  • Start PowerShell - Self explanatory
  • Reports - All Executions, All Validations
  • Delete - This deletes the folder, assuming there are no Projects or Environments below it.
  • Refresh - Self explanatory
  • Properties - Contains the name, version and a description.

3. Projects

This is automatically created when a folder is created.

Right click options

  • Deploy Project - IS Deployment Wizard
  • Import Packages - Integration Services Project Conversion Wizard
  • Start PowerShell - Self explanatory
  • Reports - Only Custom Reports are available
  • Refresh - Self explanatory

4. An instance of a project

In this example, I had an SSIS project created called SSIS2012. I also had one called SQLSaturdaySSIS. This is the name of the associated .dtproj in Visual Studio. If you don't like this name in the Integration Services Catalog, you cannot change it here. You must update your original solution, recompile it and deploy it back into the catalog.

Right click options

  • Configure - This allows for the specification of values for Parameters and/or Connection Managers.
  • Validate - Attempt to validate the deployment. Ensures all the metadata is valid, connections work, etc, without actually "running" the packages.
  • Move - Relocate the project from one folder to another.
  • Versions - The catalog now keeps track of previous revisions of your deployed project. This is not proper version control. Phil Brammer has a clever writeup about extracting old versions of projects.
  • Export - This allows you to pull the .ispac out of SQL Server to disk
  • Start PowerShell - Self explanatory
  • Reports - Two native report options
    • All Executions - Provide a history of all package executions, where all is however far back you have the catalog configured to retain history
    • All Validations - Provide a history of all package validations.
  • Delete - Deletes the project
  • Refresh - Self explanatory
  • Properties - Provides the Name, Identifier, Description, Project Version and Deployment Date. Name and Description come from the .ispac's @Project.manifest file. Identifier, Version and Deployment date are all internal to the catalog's management. The Deployment date, though, can be handy: if things "suddenly" start acting up, you can quickly determine whether there might have been a code change. Emphasis on might, because deploying the same .ispac three times in a row will result in three different versions in history with no actual change.
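
If you'd rather not click through Properties on every project, the catalog views surface the same deployment information. A minimal sketch against the documented SSISDB views (no parameters to change other than the server you connect to):

SELECT
    f.name AS folder_name
    , p.name AS project_name
    , p.deployed_by_name
    , p.last_deployed_time
    , p.object_version_lsn
FROM
    SSISDB.catalog.projects AS p
    INNER JOIN SSISDB.catalog.folders AS f
        ON f.folder_id = p.folder_id
ORDER BY
    p.last_deployed_time DESC;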

5. Packages

This folder is automatically created when you deploy an ispac into a folder.

Right click options

  • Start PowerShell - Self explanatory
  • Reports - Only has the option for Custom reports
  • Refresh - Self explanatory

6. A package

Right click options

  • Configure - This allows for the specification of values for Parameters and/or Connection Managers
  • Execute - Brings up the dialog to prepare a package for execution. (The equivalent T-SQL is sketched after this list.)
  • Validate - Attempt to validate the deployment. Ensures all the metadata is valid, connections work, etc, without actually "running" the packages.
  • Start PowerShell - Self explanatory
  • Reports - Two native report options
    • All Executions - Provide a history of all package executions, where all is however far back you have the catalog configured to retain history
    • All Validations - Provide a history of all package validations.
  • Refresh - Self explanatory
  • Properties - Exposes the Name, Identifier, whether it's an Entry Point package and the Description.
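
The Execute dialog ultimately issues a handful of catalog procedure calls, which you can capture with the Script button and run yourself. A minimal sketch follows; the folder, project and package names are illustrative, and the SYNCHRONIZED parameter simply makes the call block until the package finishes.

DECLARE @execution_id bigint;

EXEC SSISDB.catalog.create_execution
    @folder_name = N'MyFolder'
    , @project_name = N'MyProject'
    , @package_name = N'MyPackage.dtsx'
    , @use32bitruntime = 0
    , @reference_id = NULL
    , @execution_id = @execution_id OUTPUT;

-- Optional: wait for the package to complete instead of returning immediately
EXEC SSISDB.catalog.set_execution_parameter_value
    @execution_id
    , @object_type = 50
    , @parameter_name = N'SYNCHRONIZED'
    , @parameter_value = 1;

EXEC SSISDB.catalog.start_execution @execution_id;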

7. Environments

Environments are the successor to Configuration resources (XML, Table, OS Environment Variable, Registry value).

Right click options

  • Create Environment - Create a new Environment under this folder. Requires a Name and an optional Description.
  • Start PowerShell - Self explanatory
  • Reports - Only has the option for Custom reports
  • Refresh - Self explanatory

8. An Environment

FileSystem and Database are examples of two Environments I have created. You can define as many Environments as you wish. I typically see people naming them for the server class they are in (dev, test, load, stage, prod), but I don't think that makes as much sense as Environments based on the type of thing they configure (FileSystem, Database, etc). Future blog material though. For now, know that an Environment is essentially a dictionary of values, and you can mark individual values as Sensitive so you can store secure material.

Right click options

  • Move - Relocate the Environment from one folder to another. Any configurations built on that Environment do not have a cascading update applied.
  • Start PowerShell - Self explanatory
  • Reports - Only has the option for Custom reports
  • Rename - Changes the name of the Environment.
  • Delete - Deletes the Environment
  • Refresh - Self explanatory
  • Properties - 3 tabs. One is the Name, ID and Description. Second is a list of the Name, Description, Type, Value, Sensitivity of each variable. The Third tab describes the permissions associated to the Environment.

GUI isn't everything

Even though there's a GUI for all this management, don't hesitate to click that Script button on any of these interfaces and look at the generated scripts. I might use the GUI to build out my Environment and configure a project to use it on our Dev server, but you can be assured I've clicked the Script button and saved that out to a version-controlled file. When it's time to migrate up the chain, I update server names and file paths and hit F5. You're welcome to make mistakes hand-keying those values per environment, but while you're doing that, I'll be off checking StackOverflow.
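
As a flavor of what those scripted files contain, here is a minimal sketch of standing up a folder, an environment and a variable, then wiring a deployed project's parameter to it. Every name and value below is illustrative; the catalog procedures themselves are the ones the Script button generates.

EXEC SSISDB.catalog.create_folder @folder_name = N'Finance';

EXEC SSISDB.catalog.create_environment
    @folder_name = N'Finance'
    , @environment_name = N'Database';

EXEC SSISDB.catalog.create_environment_variable
    @folder_name = N'Finance'
    , @environment_name = N'Database'
    , @variable_name = N'ConnectionString'
    , @data_type = N'String'
    , @sensitive = 0
    , @value = N'Data Source=DEVSQL;Initial Catalog=DW;Provider=SQLNCLI11.1;Integrated Security=SSPI;'
    , @description = N'DW connection string';

-- Tie the environment to a deployed project and map a parameter to the variable
DECLARE @reference_id bigint;
EXEC SSISDB.catalog.create_environment_reference
    @folder_name = N'Finance'
    , @project_name = N'DataWarehouse'
    , @environment_name = N'Database'
    , @reference_type = 'R'
    , @reference_id = @reference_id OUTPUT;

EXEC SSISDB.catalog.set_object_parameter_value
    @object_type = 20                 -- 20 = project, 30 = package
    , @folder_name = N'Finance'
    , @project_name = N'DataWarehouse'
    , @parameter_name = N'CM.DW.ConnectionString'
    , @parameter_value = N'ConnectionString'
    , @value_type = 'R';              -- 'R' = referenced to the environment variable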


SSIS Loading Windows Azure Blob Storage


On StackOverflow today, someone asked if you could upload an image from an on-premises SQL Server to Windows Azure Blob Storage with SSIS. That's just the sort of crazy stuff I need to see if I can do, so off I went.

I must give great credit to the Azure team and in particular their tech writers. I was able to crib heavily from their how-to and had a working demo in fairly short order. Tying it into SSIS was a bit more work just due to the eccentricities of the platform, but hey, it works!

I'm reposting my answer here in case I need to remember how to do this.


What a fun question this was! I got to thread together a lot of pieces that I had never tried.

I first built out a simple console app based on the fine manual over on HOW TO: Blob Storage. Knowing that I had working code allowed me to adapt it for SSIS.

I created 3 SSIS Variables at the Package level: AccountName, AccountKey and ContainerName. They are all of data type String. These provide the credentials plus the folder where my uploaded data will reside.

data flow and variables

Data Flow

The general look of your data flow is rather simple. A data source to a Script Component that will act as a Destination. You will need two columns: one provides a unique name for the blob and the other will be the binary bits.

My source is a trivial table. It has country names and their flags (stored as varbinary(max)), which you yourself can scrape from the CIA World Factbook if you're so inclined.
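
If you want to play along, here is a minimal sketch of that source table. The table name is made up; the two column names are the ones the script below expects.

CREATE TABLE dbo.CountryFlags
(
    CountryName varchar(100) NOT NULL
    , FlagImage varbinary(max) NOT NULL
);

-- The OLE DB Source in the data flow simply selects both columns
SELECT CF.CountryName, CF.FlagImage
FROM dbo.CountryFlags AS CF;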

The Destination will be a bit of C#. Add a Script Component of type Destination.

On the Script tab, I have 3 ReadOnly Variables listed: User::AccountKey, User::AccountName, User::ContainerName

On the Input Columns tab, I select CountryName and FlagImage.

The script itself follows. As noted in the How To, you will need to add a reference to the Microsoft.WindowsAzure.Storage assembly before you can access the last 3 namespaces listed there.

using System;
using System.Data;
using System.IO;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

// Must add reference to Microsoft.WindowsAzure.Storage for this to work
// http://www.windowsazure.com/en-us/develop/net/how-to-guides/blob-storage/
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Auth;
using Microsoft.WindowsAzure.Storage.Blob;

/// <summary>
/// Watch me load data to Azure from SSIS
/// </summary>
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    /// <summary>
    /// The storage account used
    /// </summary>
    private CloudStorageAccount storageAccount;

    /// <summary>
    /// An entity to work with the Blobs
    /// </summary>
    private CloudBlobClient blobClient;

    /// <summary>
    /// Blobs live in containers
    /// </summary>
    private CloudBlobContainer container;

    /// <summary>
    /// blockBlob instead of a pageBlob
    /// </summary>
    private CloudBlockBlob blockBlob;

    /// <summary>
    /// This method is called once, before rows begin to be processed in the data flow.
    ///
    /// You can remove this method if you don't need to do anything here.
    /// </summary>
    public override void PreExecute()
    {
        base.PreExecute();
        string cs = string.Empty;
        string csTemplate = string.Empty;
        string accountName = string.Empty;
        string accountKey = string.Empty;
        string containerName = string.Empty;

        accountName = Variables.AccountName;
        accountKey = Variables.AccountKey;
        containerName = Variables.ContainerName;
        csTemplate = "DefaultEndpointsProtocol=https;AccountName={0};AccountKey={1}";
        cs = string.Format(csTemplate, accountName, accountKey);

        this.storageAccount = CloudStorageAccount.Parse(cs);
        this.blobClient = this.storageAccount.CreateCloudBlobClient();
        this.container = this.blobClient.GetContainerReference(containerName);
        this.container.CreateIfNotExists();
        this.container.SetPermissions(new BlobContainerPermissions { PublicAccess = BlobContainerPublicAccessType.Blob });
    }

    /// <summary>
    /// For each row passing through, upload to Azure
    /// </summary>
    /// <param name="Row">The row that is currently passing through the component</param>
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        string blobName = string.Empty;

        using (MemoryStream memStream = new MemoryStream(Row.FlagImage.GetBlobData(0, (int)Row.FlagImage.Length)))
        {
            this.blockBlob = this.container.GetBlockBlobReference(Row.CountryName);
            this.blockBlob.UploadFromStream(memStream);
        }
    }
}

Global Assembly Cache (GAC)

Assemblies you wish to use within SSIS must reside in the GAC. Assemblies cannot go into the GAC unless they are signed. Fortunately, the Azure assemblies are signed, so from a Visual Studio Command Prompt, type gacutil -if "C:\Program Files\Microsoft SDKs\Windows Azure\.NET SDK\v2.1\ref\Microsoft.WindowsAzure.Storage.dll" (or the equivalent path for wherever your version of that assembly exists).

Load successful

And as proof, here's a shot from Azure Storage Explorer

blobs everywhere

BIML Active Directory SSIS Data Source


Since posting Active Directory SSIS Data Source I've received a few requests for the package. And I always felt bad because there wasn't a nice way to say "change these N things from my copy and it'll work for you," much less supply a copy for 2005, 2008, 2012 and now 2014. Egads.

Fortunately, there's a new tool in my toolbelt called Biml and it's awesomesauce.

  1. Download BIDS Helper
  2. Install BIDS Helper
  3. Open Visual Studio, create/open an Integration Services project type.
  4. Right click on the project and select "Add New Biml File"
  5. In your project's Miscellaneous folder, double-click the BimlScript.biml file and paste the following content into it replacing the file's content.
  6. Save
  7. Edit line 6. Replace HOME.BILLFELLOWS.NET with the name of your domain controller (DC).
    • To determine your DC, from a command prompt, type set | find /i "userdnsdomain"
    • Sample results

      J:\Users\bfellows>set | find /i "userdnsdomain"
      USERDNSDOMAIN=HOME.BILLFELLOWS.NET
  8. Edit line 46. Replace LDAP://DC=home,DC=billfellows,DC=net with the path to the domain controller. Follow the pattern and you should be fine.
  9. Right click on the BimlScript.biml file and choose "Generate SSIS Packages"

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<AdoNetConnection
Name="AD"
Provider="System.Data.OleDb.OleDbConnection, System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
ConnectionString="Data Source=HOME.BILLFELLOWS.NET;Provider=ADsDSOObject;Integrated Security=SSPI;"
/>
</Connections>
<Packages>
<Package
Name="ADQueryMulti"
ConstraintMode="Linear"
DelayValidation="true"
>
<Annotations>
<Annotation AnnotationType="Description">$header$</Annotation>
</Annotations>
<Connections>
<Connection ConnectionName="AD"/>
</Connections>
<Variables>
<VariableName="RowCountSource"DataType="Int32"Namespace="User">0</Variable>
</Variables>
<Tasks>
<Dataflow
Name="DFT AD Extract"
DelayValidation="true"
>
<Transformations>
<AdoNetSource
Name="ADO AD"
ConnectionName="AD"
>
<DirectInput>
SELECT
distinguishedName
, mail
, samaccountname
, Name
, objectSid
, userAccountControl
, givenName
, middleName
, sn
FROM
'LDAP://DC=home,DC=billfellows,DC=net'
WHERE
sAMAccountType = 805306368
ORDER BY
sAMAccountName ASC
</DirectInput>
</AdoNetSource>

<RowCount
Name="RC Source"
VariableName="User.RowCountSource"
/>

<DataConversion
Name="DC NTEXT to DT_WSTR"
>
<Columns>
<ColumnDataType="String"
Length="500"
SourceColumn="distinguishedName"
TargetColumn="distinguishedName"
IsUsed="true"
/>
<ColumnDataType="String"
Length="500"
SourceColumn="mail"
TargetColumn="mail"
IsUsed="true"
/>
<ColumnDataType="String"
Length="500"
SourceColumn="samaccountname"
TargetColumn="samaccountname"
IsUsed="true"
/>
<ColumnDataType="String"
Length="500"
SourceColumn="Name"
TargetColumn="Name"
IsUsed="true"
/>
<ColumnDataType="Int32"

SourceColumn="userAccountControl"
TargetColumn="userAccountControl"
IsUsed="true"
/>
<ColumnDataType="String"
Length="500"
SourceColumn="givenName"
TargetColumn="givenName"
IsUsed="true"
/>
<ColumnDataType="String"
Length="500"
SourceColumn="middleName"
TargetColumn="middleName"
IsUsed="true"
/>
<ColumnDataType="String"
Length="500"
SourceColumn="sn"
TargetColumn="sn"
IsUsed="true"
/>
</Columns>
</DataConversion>

<DerivedColumnsName="DER Check Account Status">
<Columns>
<ColumnDataType="Boolean"Name="IsActive">(([userAccountControl] &amp; 2) == 2) ? false : true</Column>
</Columns>
</DerivedColumns>

<ConditionalSplit Name="CSPL Filter Inactive Accounts">
<OutputPaths>
<OutputPathName="ActiveAccounts">
<Expression>IsActive</Expression>
</OutputPath>
</OutputPaths>
<InputPathOutputPathName="DER Check Account Status.Output"></InputPath>
<DataflowOverrides>
<OutputPathOutputPathName="Default"
Description="AD accounts that are deactivated"
PathAnnotation="SourceName"
></OutputPath>
</DataflowOverrides>
</ConditionalSplit>
<DerivedColumnsName="bit bucket Active">
<InputPathOutputPathName="CSPL Filter Inactive Accounts.ActiveAccounts"></InputPath>
</DerivedColumns>
<DerivedColumnsName="bit bucket InActive">
<InputPathOutputPathName="CSPL Filter Inactive Accounts.Default"></InputPath>
</DerivedColumns>
</Transformations>
</Dataflow>
</Tasks>
</Package>
</Packages>
</Biml>

If everything went well, you'll end up with a package like this (minus the data viewers)

Caveats

The original limitations still exist. The next post in this series will include the script task to generate group membership. The final installment will use a .NET script source to provide an alternative to the limitations of the query route.

SSIS Copy all tables


This could have also been titled "holy crap biml is powerful," but that wouldn't do much for SEO. Instead, this post demonstrates how easy, how trivial, it is to generate an SSIS package for every table in your database and copy all the data to a new server. Think about that. How often do you have a need to refresh N tables in a development environment from production? No, you can't do a restore because there's work in progress there; people just need some subset of data refreshed. This is yet another hammer for you to solve that problem with.

I have been working with Biml for a few months now but I haven't had a use case for pulling the scripting aspect into it. Today was that day, courtesy of this SO question: ssis best practice to load N tables from Source to Target Server

The following BimlScript will generate an SSIS package for every table in the source database and write all the source data to the matching table in the destination. For my example, I scripted out AdventureWorksDW2012 into a new copy and ran the following Biml.

What's it doing?

I create two connection strings to point to my source and destination. As noted over on StackOverflow, you will need to change this to your source and destination servers as well as verify the Provider matches your version.

I then have a query that pulls in all the tables and their schema, skipping any MS-shipped ones and sysdiagrams. That query gets executed and dumped into a DataTable. That's all the setup, really.

It's standard Biml: create the Connections collection, define our Packages collection and then begin iterating through our DataTable.

Each Package will have 3 Variables: SchemaName, TableName and QualifiedTableSchema. The first two are set by values from our DataTable, the third is an expression built using the first two.

We add a Tasks collection and inside of it, a Data Flow with an OLE DB Source routed to an OLE DB Destination. I drive both of these by our Variable, QualifiedTableSchema.


<#@ template language="C#" hostspecific="true" #>
<#@ import namespace="System.Data" #>
<#@ import namespace="System.Data.SqlClient" #>
<#@ import namespace="System.IO" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<!--
<#
string connectionStringSource = @"Server=localhost\dev2012;Initial Catalog=AdventureWorksDW2012;Integrated Security=SSPI;Provider=SQLNCLI11.1";
string connectionStringDestination = @"Server=localhost\dev2012;Initial Catalog=AdventureWorksDW2012_DEST;Integrated Security=SSPI;Provider=SQLNCLI11.1";

string SrcTableQuery = @"
SELECT
SCHEMA_NAME(t.schema_id) AS schemaName
, T.name AS tableName
FROM
sys.tables AS T
WHERE
T.is_ms_shipped = 0
AND T.name <> 'sysdiagrams';
";

DataTable dt = null;
dt = ExternalDataAccess.GetDataTable(connectionStringSource, SrcTableQuery);
#>
-->
<Connections>
<OleDbConnection
Name="SRC"
CreateInProject="false"
ConnectionString="&lt;#=connectionStringSource#>"
RetainSameConnection="false">
</OleDbConnection>
<OleDbConnection
Name="DST"
CreateInProject="false"
ConnectionString="&lt;#=connectionStringDestination#>"
RetainSameConnection="false">
</OleDbConnection>
</Connections>

<Packages>
<# foreach (DataRow dr in dt.Rows) { #>
<PackageConstraintMode="Linear"
Name="&lt;#=dr[1].ToString()#>"

>
<Variables>
<VariableName="SchemaName"DataType="String"><#=dr[0].ToString()#></Variable>
<VariableName="TableName"DataType="String"><#=dr[1].ToString()#></Variable>
<VariableName="QualifiedTableSchema"
DataType="String"
EvaluateAsExpression="true">"[" + @[User::SchemaName] + "].[" + @[User::TableName] + "]"</Variable>
</Variables>
<Tasks>
<Dataflow
Name="DFT"
>
<Transformations>
<OleDbSource
Name="OLE_SRC&lt;#=dr[0].ToString()#>_<#=dr[1].ToString()#>"
ConnectionName="SRC"
>
<TableFromVariableInput VariableName="User.QualifiedTableSchema"/>
</OleDbSource>
<OleDbDestination
Name="OLE_DST&lt;#=dr[0].ToString()#>_<#=dr[1].ToString()#>"
ConnectionName="DST"
KeepIdentity="true"
TableLock="true"
UseFastLoadIfAvailable="true"
KeepNulls="true"
>
<TableFromVariableOutput VariableName="User.QualifiedTableSchema"/>
</OleDbDestination>
</Transformations>
</Dataflow>

</Tasks>
</Package>
<# } #>
</Packages>
</Biml>

Usage

  1. Download and install BIDS Helper
  2. Add a new biml file to an existing SSIS project
  3. Disable Visual Studio's auto-fix for XML files. See Overcoming BimlScript Copy and Paste Issues in Visual Studio
  4. Paste this code into that Bimlscript.biml file
  5. Fix your source and destination connection strings (lines 8 & 9) to point to the correct servers as well as change the Provider type if SQLNCLI11.1 is not correct for your version of SQL Server
  6. Right-Click on the biml file and select "Generate SSIS Packages"

Biml - Backup Database Task


Biml - Backup Database Task

You can create an SSIS package that executes the Backup Database Task with Biml using only a few lines.


<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<AdoNetConnection Name="CM_ADO_DB" ConnectionString="Data Source=localhost\dev2012;Integrated Security=SSPI;Connect Timeout=30;Database=msdb;" Provider="SQL"/>
</Connections>
<Packages>
<Package ConstraintMode="Linear" Name="Task_BackUpDatabase">
<Tasks>
<BackupDatabase
BackupAction="Full"
ConnectionName="CM_ADO_DB"
DatabaseSelectionMode="User"
Name="BKP User Databases">
<AutomaticBackupDestination
BackupFileExtension="bak"
CreateSubfolder="true"
Folder="C:\backup"
/>
</BackupDatabase>
</Tasks>
</Package>
</Packages>
</Biml>

Results

Right click on that script, after adjusting the connection string, choose "Generate SSIS Packages" and you should end up with a package like this.
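
Under the covers, the Backup Database Task simply issues a BACKUP DATABASE statement per selected database. A minimal sketch of roughly what it emits given the Biml above; the database name, subfolder and timestamp are illustrative.

BACKUP DATABASE [AdventureWorksDW2012]
TO DISK = N'C:\backup\AdventureWorksDW2012\AdventureWorksDW2012_backup_2014_01_01_120000.bak'
WITH NOFORMAT, NOINIT, SKIP, REWIND, NOUNLOAD, STATS = 10;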

Biml - Bulk Insert Task


Biml - Bulk Insert Task

The Bulk Insert Task is the SSIS equivalent of the bcp utility or the BULK INSERT SQL command. It's a way to import flat file data very efficiently with no real transformation in your ETL. Using Biml to create a Bulk Insert Task requires only a few lines of code, but you'll observe I build out a full-fledged demo.
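
For reference, the T-SQL the task wraps looks roughly like the following, using the table and file created in the Setup section below (a sketch; adjust the path and terminators to match your file).

BULK INSERT dbo.BulkInsert
FROM 'C:\SSISData\NoHeaderRow.tsv'
WITH
(
    DATAFILETYPE = 'char'
    , FIELDTERMINATOR = '\t'
    , ROWTERMINATOR = '\n'
    , TABLOCK
);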

Setup

I am a fan of working in tempdb so create the following table there, or any other scratch database you have handy. I'm going to create a table called BulkInsert and load a simple file into it.

Table declaration


CREATE TABLE dbo.BulkInsert
(
    EventNumber int NOT NULL
    , EventLocation varchar(50) NOT NULL
);

File declaration

I use a base or root folder for all of my SSIS file work. I call it "SSISData" as I'm original like that, and then create subject area subfolders and within each subfolder, I have 3 folders: Archive, Input, and Output. In this case, however, I will have a file sitting in the root of the folder named "C:\SSISData\NoHeaderRow.tsv" (tsv means tab-separated values), so if the following comes across as spaces, replace them with the tab character.

53 Kansas City

Biml

The following code uses two connection managers. CM_File is going to refer to the flat file created above. CM_OLE is looking for a named instance of Dev2012 on the local machine. Change these values as you see fit.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<FileConnection Name="CM_File" FilePath="C:\ssisdata\NoHeaderRow.tsv" FileUsageType="ExistingFile" HasMultipleFiles="false"/>
<OleDbConnection Name="CM_OLE" ConnectionString="Data Source=localhost\dev2012;Initial Catalog=tempdb;Provider=SQLNCLI11.0;Integrated Security=SSPI;"/>
</Connections>

<Packages>
<PackageName="Task_BulkInsert"ConstraintMode="Linear">
<Tasks>
<BulkInsertName="BLK Load Data"ConnectionName="CM_OLE"DataFileType="Char"TableLock="true">
<SourceFileConnectionName="CM_File"></SourceFile>
<ExternalDestinationTableTable="[dbo].[BulkInsert]"/>
</BulkInsert>
</Tasks>
</Package>
</Packages>
</Biml>

Results

Right click on that Biml script and you'll have a package named "Task_BulkInsert" that looks something like

Bulk Insert Task package

If you double click on the Task, you will see options we set for DataFileType and TableLock match what is implemented. Powerful stuff.

Bulk Insert Task Connections tab
Bulk Insert Task - Advanced tab
