Integration Services Design for performance Measure for success
[email protected] Twitter: @donalddotfarmer
OVAL
Operations Volume Application Location
Operations
Volume
Application
Location
What logic should be applied to the data?
Operations
What logic should be applied to the data?
Volume
How much data must be processed?
Application
Location
Operations
What logic should be applied to the data?
Volume
How much data must be processed?
Application
Which app is best suited to these operations at this volume of data?
Location
Operations
What logic should be applied to the data?
Volume
How much data must be processed?
Application
Which app is best suited to these operations at this volume of data?
Location
Where should the app run? On a shared server? on a standalone machine?
An OVAL Example: Loading a Text File
Text file on Server 1
SQL Server on Server 2
Operations • Understand all operations performed
Operations • Understand all operations performed
Open a transaction on SQL Server Read data from the text file Load data into the SSIS data flow Load the data into SQL Server
Commit the transaction
Operations • Beware of hidden operations
• Data conversion?
Sharpen your Operations • File Source – Unnecessary data type conversions – ‘FastParse’ in Flat File Source
• Unnecessary operations: e.g., converting from text to datetime, then from datetime to date • Reduce database operations – Database logging – Commit size – Fast Load – Table lock
Improving text file loading
DEMO
Sharpen your Volume • Reduce where possible – Don’t push unneeded columns – Conditional split for filtering rows – Do not parse or convert columns unnecessarily • In a fixed-width format you can combine adjacent unneeded columns into one • Leave unneeded columns as strings
Volume: Sharpen • Use appropriate data types – An integer in the range 1-999 takes 2 bytes as an integer, 3 bytes as a string, but 4 bytes as a real – “Suggest Types” in the flat file connection manager UI
More improvements to text file loading
DEMO
Application • Is SSIS right for this? – Overhead of starting up an SSIS package may offset any performance gain over BCP for small data sets
• Is BCP good enough? – Is the greater manageability and control of SSIS needed?
• Bulk Import Task vs. Data Flow
Location • Consider the following configuration …
Text file on Server 1
• Where should SSIS run? – (Licensing issues aside)
SQL Server on Server 2
Location Considerations •
SSIS on Server 1 – Competes with apps for resources – Will data conversion on Server 1 reduce or increase the volume of data transferred across the network? – Can not use the fast SSIS SQL Server Destination
•
SSIS on Server 2 – Competes with SQL Server for resources – Will pulling text over conversion be expensive? •
Also consider transferring the file unparsed to Server 2 and read it locally from there
– Can use the fast SSIS SQL Server Destination
Measure, tune, measure again • OVAL does not provide prescriptive guidance – Too many variables
• Improve performance by applying OVAL and measuring – SSIS Logging
– Performance counters – SQL Server Profiler • For extract queries, lookups and loading
Measuring performance in practice Measure the full data flow
• Source, Transforms, Destination = STD
Replace the destination with a Sampling or Row Count Component
• STD – STC = Destination Load Performance
Replace the Transforms with a Sampling or Row Count Component
• STC-SC = Transformation performance • SC = Source Extract performance
Measuring performance in practice
Putting it all together
DEMO
Operations Volume Application Location
OVAL Whitepaper http://technet.microsoft.com/en-us/library/cc966530.aspx