Monday, January 14, 2013

Balanced Data Distributor (BDD) for SSIS

I was just alerted to a Microsoft developed SSIS component that might help in situations where you need to load large amounts of data via SSIS and have some slow components in your Data Flow. The component is called Balanced Data Distributor or BDD and it is available for free from Microsoft for SQL Server 2008/2008R2/2012 in both 32bit and 64bit flavors.

This component was developed as a way to split your processing within an SSIS package to take advantage of today's multi-threaded servers and do it without requiring you to run multiple copies of the same packages in parallel or other more difficult solutions. The component is very easy to use, once installed you just drag in onto your Data Flow and hook it up to your input data, then you can pull off as many output flows from it as you want (Microsoft is currently not recommending more than 5, but you should test it in your environment to see what works best). From there your output flows should be the same to execute the slow moving processes and then they can each write directly to the same output or be merged back together using a Union All depending on your needs. Whatever the number of outputs you use the component will do the splitting by routing each pipeline buffer of data to the outputs in order. So, if you have 2 outputs setup, the first buffer goes to output 1 and the second buffer goes to output 2 and so on until the whole input file is complete.

Nice simple component that can have huge benefits if you are working with data that is just taking way too long to load in a single Data Flow. The downside is that you will need to duplicate the components in each of the output flows, so maintenance might be a mess later on since you will have make changes in each of the flows. Overall, though that issue is probably going to be out-weighted by the runtime improvements you should see in your packages. Also, keep in mind that this is going to be processing the data out of order, so if your process requires that this component won't help you unless you can work around that requirement.

Below are some links on this component:


Stay tuned for a more detailed post on using this as I do some experiments in my development environment to see just how this component works. I wanted to get this out there as a quick post since I already know of 2 clients that I have worked with recently that will probably benefit from using this component.