Wednesday, February 27, 2013

Building a Microsoft BI Demo Environment

I've been asked at multiple meetings in the last month what is the best way to build a Microsoft BI demo environment. There are a few ways that you can do this:

Excel 2013

If you have access to Excel 2013 (you can get the 60 day evaluation version here), both PowerPivot and Power View are included as add-ins that don't require additional downloads (or SharePoint), just activate them from the File menu, select Options.

Then select Add-Ins from the Options left navigation panel. In the Add-Ins panel change the Manage dropdown at the bottom of the panel to COM Add-ins and click Go....

From the COM Add-Ins popup, check the boxes for Microsoft Office PowerPivot for Excel 2013 and Power View and then click OK to close the popup.

That should add the POWERPIVOT menu to your Excel 2013 menu bar and allow you to build PowerPivot workbooks.

To create Power View reports, just go to the Insert menu and click on the Power View button that should appear between PivotChart and Sparklines items in the ribbon.

With Excel 2013 you can create the PowerPivot/Power View workbook and save them as normal Excel workbooks without requiring SharePoint or other software on your computer.

Microsoft Virtual Labs

Microsoft has posted virtual labs that you access with an internet connection. All of these labs are provided free of charge, but there is a limit of time you can access each lab. This link will take you to a list of all available labs, or use this link to go right to the Power View lab or this link for the PowerPivot lab. There are also some TechNet virtual labs that you can access from this link as well.

Virtual Machine

The last and most complex to setup option is to build your own complete VM (or VMs). For this option you will have full control over the environment, but you will need to install and configure all of the software yourself. For a complete environment you will need to install Windows Server, SQL Server, SharePoint and Office. For help getting all of this installed and configured in the correct order, see this post put together by Microsoft. This guide is also available, but it is a bit dated since it is using SQL 2008R2 and SharePoint 2010. There is also a guide for specifically setting up SharePoint 2013 or SharePoint 2010. All of the software that is required for setting up this VM(s) is available in evaluation versions, just keep in mind that these versions will expire after a period of time (180 days max). The other thing to keep in mind is that you can buy a TechNet Professional annual subscription for $349 (renewal $249 after first year), you might also find this cheaper online. I recommend the TechNet Professional license over the other versions of TechNet because it does include license keys for all of the Enterprise server products that Microsoft offers.

Microsoft has also provided a version of a VM (previously referred to as the Microsoft All-Up BI Image) that you can download, this VM can be run in HyperV, VirtualBox or VMWare and is available at this location. Here is a review of this VM from Dan English. Microsoft has said that they are working on a new version of this image, but it is currently not available for download.


Personally I like the control of having everything in a VM that I have built, but it is a lot more work. The good thing is you will become very knowledgeable on how to install and configure all of this software (comes in very handy for Microsoft Certifications). I do buy a TechNet Professional license for my own use every year, so this is how I am able to license all of the Enterprise versions of the OS, SQL Server and even SharePoint. I also have Excel 2013 installed on the laptop itself as well, so that I can do a quick demo if necessary without spinning up a whole VM.


Tuesday, February 5, 2013

ABCs of CDC with SSIS 2012

Today I presented a session as part of Pragmatic Works free "Training on the T's", called "ABCs of CDC with SSIS 2012". The session was attended by 507 people and I received a lot of positive feedback (even though my voice as a bit rough because of a cold that I'm getting over).

This session focuses on defining what CDC (Change Data Capture) is for those that have heard of it and maybe even know that it has been offered with SQL Server 2008 and greater, but didn't really know how to use it. I also show how it is much easier to consume the CDC data now with SSIS 2012, since it now includes native components for CDC.

I have posted all of the slides and demo materials for this session to my SkyDrive for anyone to download and try out. The only thing that you will need to get this up and running in your environment is the AdventureWorksDW2012 database from CodePlex  and a SQL Server 2012 Enterprise/Developer/Evaluation Edition instance with the Database Engine and SSIS setup on (CDC only works on these editions of SQL Server). You will need to make sure you have full admin rights on this instance since the commands to setup CDC require that level of permission.

Below are a few of the most common questions that I received in the session with answers:

What is the performance impact for using CDC on a database/table?
This will vary based on the amount of changes that are being done to the database/table you are using CDC on, but Microsoft has optimized this process to use as little CPU/memory as possible. As with any process that you are going to add to a database you should test this in your own environment to know for sure what the impacts are. There is a great White Paper available on MSDN that covers many different scenarios and impacts of CDC on each.

How is CDC different from Replication, Mirroring or other Disaster Recovery/Backup Scenarios?
CDC is only going to track the changes to the data in the database/table that it is implemented on, so it will not be able to capture security changes or many other DDL changes done to a table (columns being added/removed are tracked, but with limits, see MSDN article). The main purpose behind CDC is for a quick way to track the changes to the data made over time and to periodically apply those changes to another database with the same schema. CDC is not meant to be used in place of Replication, Mirroring, Log Shipping or other disaster recovery type of scenarios. It is possible to use some of these other technologies with a database/table that has CDC enabled (see the "Change Data Capture and Other SQL Server Features" section of the MSDN article linked here). The most common use for CDC is where a source database needs to be loaded into a data warehouse or the staging area for a data warehouse and you want to get all of the data inserts, updates and deletes applied appropriately without using triggers or other custom code that will have to be maintained.

Can CDC be enabled on any database/table?
Yes it can, but it must be enabled at the database level first and then the table(s) in that database that you want to track with CDC have to be enabled separately. CDC is not on by default on any databases or tables when it is first setup. There are also some important column restrictions that are shown in the slides that I have attached above.

Is it possible to have the CDC track only Updates and Inserts and not Deletes (or any combination of these)?
The CDC stored procedures provided by Microsoft to enable CDC on the database and table do not have any options to ignore certain changes, so once it is enabled it will track all Inserts, Updates and Deletes done to that table. When you are creating the processes that use the CDC data to apply the changes to the destination database, you can choose to not implement any of the types of changes you don't want by using the _$operation column and looking for the appropriate value(s). If you are using SSIS 2012 and the new CDC Splitter, then you can choose not to implement any of the data flows off of it that you don't want to have processed.

Can the SSIS 2012 CDC components read the CDC source data from older versions of SQL Server?
Yes! The SSIS 2012 CDC components can connect to a SQL Server 2008/2008R2/2012 CDC enabled database/table and use that data to then update any SQL Server 2008/2008R2/2012 destination (along with the possibility of other databases that have the required ADO .NET data providers in SSIS 2012). See this link for details on using SSIS 2012 CDC with Oracle!

For more details on how CDC works, there are great resources available on MSDN.

If you have any other specific questions, please feel free to email me at the address in the slides attached above or send me a message on Twitter.