SQL Centre of Excellence

Inspired by the DBA tools (https://dbatools.io/) library for automating all things SQL Server DBEngine I created some Powershell Library functions to automate SSIS Build and Deployment. We use this on projects to move SSIS closer to the world of DevOps and that magic “10 deploys a day”. These Powershell functions work well when added to a VSTS deployment and release pipeline.



This library function users the command line version of visual studio to build an SSIS project and create the “ispac” file that contains the packages.

Example usage below




This library function uses the Microsoft.SqlServer.Management.IntegrationServices Namespace to deploy ispac file to a folder in the SSIDB database. This similar functionality to the Integration Services deployment Wizard. Example of it in action is below




Installation and Source Code

You can get the source code and installation instructions to play around with the Powershell from my GitHub project here


Cannot Start PolyBase Services

by Bob Duffy 22. February 2017 12:45

One of my pet hates attending SQL talks is when the “expert” speaker spends most of the time blabbing on about “install” and “configuration” difficulties they had. It kinda paints a picture that they probably haven’t actually done much with the technology yet.

Saying that I thought I would add some notes on problems starting the PolyBase Services that are installed with SQL 2016 as I’ve struggled with them a bit – so you guessed it by my own logic I am certainly not an expert on PolyBase!

Three common gotchas found:

1) TCP/IP Must be enabled

As PolyBase is designed to connect to HADOOP and other cloud based data sources it obviously uses TCP/IP, but bear in mind that developer edition which we all install on our laptops defaults to no enabling TCP/IP. Luckily Greg Lowe and some other MVPs figured this out before me http://sqlblog.com/blogs/greg_low/archive/2016/07/06/sql-server-2016-polybase-services-stuck-in-change-pending-state.aspx

Hopefully Microsoft can maybe add a warning or check on the install to fix this in the future.

2) Server Name Change is not supported

In my rush to setup my new laptop I installed SQL Server BEFORE changing the computer name from its default. While DBEngine supports a name change via the sp_dropserver and sp_addserver procs, the PolyBase Service will just not start with an error message in the appliction log of “No such host is known”.

The only workaround we found was to uninstall and re-install the PolyBase Services – which actually doesn't take that long.

I added a connect item to vote for a fix on this


3) Service Name Change needs CU

If you wanted to change the service account (Scale out PolyBase needs a domain account rather than the default network service account) then in the past this was an uninstall and re-install job to get the correct rights assigned to the account.

Now this is supported, but only if you have applied a newer CU


I’ve recently been working almost exclusively on projects running SQL Server in Azure (IaaS) and the question came on the best way to handle backup and data protection. We accept that for pretty much all SQL Server based solutions we need to have data and potentially logs backups if RTO is more frequent than about 24 hours, but the choice is do we backup to local disk/file share/3rd party product, or use the new kid in down backup to azure blob store (aka backup to URL)

Well, when we say a “good” choice we probably mean three principal attributes:

  • Is it safe
  • Is it cheap
  • Is it fast

Is backing up to a Azure blob store (TO URL) Safe ?

Two types of “safe” to consider: safe from data loss and safe from some 14 year old expert hacker in china or disgruntled ex employee.

WRT data loss the Azure storage accounts come with a 99.99% SLA for “hot” storage and 99.9% SLA for the cheaper cool storage which we would recommend for backup as its dirt cheap. In addition it keeps a whopping THREE copies of your files within a data centre and if you pay the double wonga for geo replication to another data centre (Amsterdam for us here in Dublin) , you get SIX copies of the data. So I think its safe to say (you see what I did there ), that putting sql backups onto azure storage is a safe bet, probably safer than anything any of our customers have on their on premise infrastructure.

Image result for azure storage three copies

Another big plus for azure storage is its automatically “off site” and hopefully not in the same fault domain as your production services, so no eggs in one basket like you may have on premise. If you decide to NOT put your sql backups into the cloud and then have no offsite copy then you've probably increased risk of data loss, not reduced it !

WRT being safe from hackers and disgruntled ex employees that's not so clear. As azure storage account is classed as a “public” service its pretty much always available over https if you have the right account credentials, or access keys on the storage account. It relies on good people and process to secure (doesn't sound good).

One massive mitigation is the move away from single point access keys to Shared Access Signatures or SAS. These allow for a per service security token to be granted which can restrict based on time duration,  allowed IP addresses, container and access rights allowed. For example maybe the production sql server could just have write access to the azure storage account for the specified IP address for a specific container. If it was compromised, then the attacker can’t do much.

One complaint I have is that the SAS token generated by using the wizard in SSMS is pretty much “full” access rights. What happened to “secure by default” Microsoft! Hopefully later revisions of SSMS will allow for more locked down SAS tokens for backup. In the interim, no worries, you can generate your own SAS tokens from azure portal or Powershell if you want the more beefed up security.

 Is backing up to a Azure blob store Cheap?
Currently 1TB of cold azure blob storage comes in at around 8 euro per TB per month before discounts, and if your running a large 24x7 in azure I would hope you at least have an EA or some other form of reduced price scheme and don’t “pay as you go”.

Geo replication is about double that with a bit more if you want the geo site to be readable, and IMO its rude to geo replicate data and not have it readable at the remote site.

So whether you consider it “cheap” depends on how much you currently pay for backup storage. If you are a SOHO using a small NAS unit with cheapo 5TB SATA drives that cost a hundred euro each and have near zero OPEX/support, then maybe you think this is expensive. If you are paying for a ridiculously expensive SAN, along with support and OPEX costs, then you might be paying anything from 2k to 10k per TB, so 8 euro a month is definitely cheap. Especially give then “offsite” nature.

 Is it fast ?

Well, this is probably the most important consideration for a DBA thinking about maintenance window or recovery run book.

Each storage account is limited to about 20,000 IOPS (and to be honest I haven't yet figured out how that maps to MB/Sec throughout limits), but what I do know is that I ran some simple tests by backing up a 150 GB database recording the effective throughput below on different VM sizes in the “DSv2” series using a single 1TB P30 SSD for the data volume.

The 2 core VM put out a modest 79 MB/Sec and the most interesting thing to note was that this was faster than backing up to the Local SSD. Another interesting point is that backing up to cold read only geo-replicated storage was about the same performance profile.


The four core VM with identical configuration managed to get 127MB/Sec


The eight core VM achieve 253 MB/Sec


And the sixteen core VM got to a tasty 346 MB/Sec


I have no idea if anyone else got similar backup performance from VMs in Azure ? post to comments if you did/didn’t.

One major recommendation is that if you are every doing a emergency restore in Azure on a big data estate, you probably want to increase the core count to “max”, do the restore and then shrink the core count back to normal.

I did mess around with using multiple storage accounts to “stripe” backups across, but it only made a 10% difference throughput wise in two scenarios:

a) When I was backing up from on-premise with higher latency to storage account. Striping the backup increased the thread count which as we all know is mucho better for saturating bandwidth on high latency links. By the way fact of the day is that by default the backup uses six threads per URL backed up to (or so we observed).

b) On the 16 core VM we started to see some benefit from striping. I guess that on smaller VMs it just not possible for the a single VM to “stress” the storage account, so using multiple storage accounts wasn't worth the trouble for us.

With a “ragged” hierarchy we often need a way to make sure that the hierarchy is displayed in a user friendly fashion when measures can be linked to any level (including parents).

There are two established solutions

1) The Pure DAX approach

Alberto was one of the first to publish a DAX workaround for the problem of ragged hierarchies here:


This workaround starts to get very complex when you have multiple hierarchies and performance can suffer on large hierarchies as the evaluation is at run time, but it does work and is “supported”.

2) HideMemberIf

BIDS helper exposes an “unsupported” technique to add HideMemberIf into the XMLA of the model definition. https://bidshelper.codeplex.com/ 

This solution looks more appealing as a lot of the work is done during ProcessRecalc when it does Hierarchy Processing – hopefully giving benefit at query time. It also involves less calculated measures, columns and code which sounds nice.

However two words of warning:

a) There is no guarantee that this will work in later editions

b) There are stability issues with a large number of levels. On the model we are working the ProcessRecalc and visual studio will just hang when we try to set this property on a NINE level hierarchy.


In the end we did use HideMemberIf, but only set the property on specific levels which we know can have measures attached to them to avoid the stability issues of having this turned on for all levels.

I’ve been diving deep into how long Analysis Services can take to open a connection for a few months now and thanks to some tips from many people (Alberto Ferrari, Greg Galloway, Darren Gospel to name a few) here are some ideas to improve performance.

1. Make sure your network is not slow

OK, sounds kinda obvious, but this was killing me for months.

Most of me problems ended up being driver related with various incompatibilities between our NICs, Hyper-V and some of the new features.

A simple internet speed test or benchmarking tool would have shown that the VMs running SSAS were running very slow with packet loss. I blogged more details here


2. Avoid Named Instances

It can take up to double the time to open a connection to a named instance. As a test I opened 1,000 connections to a remote server with five scenarios:

  • By Name (Default Instance)
  • By IP (Default Instance)
  • By Server Name and Instance Name (Named Instance)
  • By Name and Port number (Named Instance)
  • By IP and Port number (Named Instance)


The results show that if we are using a named instance (e.g. bob-pc\tabular) then connections can be almost double the time to make. Once we manually specify the port number and bypass the “browser” service, connection time is about the same for the other test cases.

There was a very small improvement for using the IP instead of the name, but this was so small I’m not sure it would be worth the effort (2.5%)

lesson learned: Use fixed port numbers when using SSAS instances. If you do need named instances and want to simplify connections it should be possible to give each instance  the same default port number (2383) and bind a different IP address to each instance. We've been using this trick on SQL DBEngine clusters for years to simplify connections and manageability.

2. Optimising Local Connections

If the client and SSAS are running on the same server (e.g. development laptop). You can optimise the connection even further by using the IP address and/or avoiding using the alias “localhost”


Ok these are small numbers, but its a 45% gain to be made on connection time Smile

one eye opener or me is there was a measurable difference to show that using “localhost” was not the best option for me.

3. IP4 or IP6 ?

I ran a quick test comparing use of IP4 to IP6 for connecting to an SSAS Server. This did show a 30% improvement with the older IP4 (I have no idea why this would be so!)



4. Do I care about connection time ?

Many tools like Excel either keep a connection open to SSAS or re-use a single connection, so connection time might not be that important.

Other tools may open a connection per request or in some cases (reporting services) many many connections per user – In this case the impact of faster connections may be more noticeable.

Over the last few months we have been developing reporting services reports on top of a SSAS tabular model. One of the banes of my life has been the unusable performance of our reports, especially if they open a lot of connections.

This week I finally got some time to triage the issue and found the root cause, so I’ll post it here in case anyone else is hit by this.

How Can we measure Connection Time on Reports ?

Reporting services 2012 onwards has additional performance data on datasets on the ExecutionLog3 view  in the AdditionalInfo XML column. In my case it looked something like this:


As you can see it is taking 420ms to open a single connection to the Analysis Server This should be around 20ms or less. One of the annoying this about reporting services is it will open a fresh connection for every dataset, so with more complex reports this became unusable very quickly with some seemingly simple reports taking 30 seconds plus.

We used and xPath query and another report to visualise this performance data per data


which facilitates a report showing time spent between connection and actual query time



What was the cause

After trying it on various servers, laptops, desktops we managed to isolate it down to any of our SSAS Servers running inside Hyper-V on Windows 2012R2. Some further triage isolated the issue down to VMs which were using a broadcom network card with a known issue with Hyper-V and the NIC



After some reconfiguration, the connections came down to under 20ms Smile

Analysing Report Server Logs

by Bob Duffy 4. March 2015 08:54

Attached is the example Power Pivot Model I used to analyse Report Server Logs complete with 3D pie chart to shock the DataViz world ;-)

I want to get time to extend this to include a lot of the “version 3” data which includes per Dataset metrics on query and processing time.

Feel free to use this as a starting point for analysing Report Server performance or usage.


You’ll need to update the connection in the Power Picot window to point to your ReportServer database.




Speaking Schedule First Half 2015

by Bob Duffy 1. February 2015 20:01

Quite a selection of speaking events coming up and an ambitious selection of mostly new content this year.

Belfast SQL User Group - Thurs 26th Feb.

  • When Good SQL Design Goes Bad
  • Optimising Reporting Services Performance

sqlBits, Excel London Exhibition Centre – 4th to 7th March

Friday 11:00 – 12:00 When Good SQL Design Goes Bad

Saturday 11:15 – 13:15 Scaling the DVD Store from 100 to a million TPM

Saturday  16:00 – 17:00 The European Economic Crisis And the Euro – A Data Tale

SQL Saturday Exeter. 24th –25th April, 2015

Friday Pre con Optimising Reporting Services for MDX Data Sources

Saturday 16:20 When Good SQL Design Goes Bad

Note sure if I will be selected, but I have submitted for SQL Saturday Edinburgh (BI Edition)  on June 13th. Edinburgh’s a city close to my heart with my college years spent there and Sandra living out there now.

And of course don’t forget SQL Saturday Dublin on the 20th/21st June. I may make an appearance on the lightning talks maybe but not a regular session as soo much to organise and so many great speakers submitted.

BTW Myself and Marco have some mad plans for a half day or whole day FREE session in Ireland too with myself and some other speakers . Will post when we get dates firmed up. Watch the UG newsletters and blogs

If you are looking at running SQL Server in the cloud a key question you will ask is “Should I rent or Buy” SQL licenses. E.g. should you use the stock images and pay by the hour or should you buy your own licences.

Some facts that will help you decide:

  • SQL Server Enterprise Edition can vary in price from say 16,000 euro to 24,000 euro per four cores depending on your licensing scheme and bartering ability.
  • You MUST pay for software assurance at 20% per year to transfer licenses to the cloud (Azure or Amazon)
  • Azure charges about 1,164 euro per month for enterprise edition, per four cores.
  • Buying licences with SA gives you rights to use a “DR” instance without paying for its license. When renting you need to pay for DR while its turned on (not good for say a passive mirror).

So assuming full retail prices we can say the following:

  • If you are running for more than 2 years at 100% up time you should buy.
  • If you are powering down images then factor this in. So if 50% powered down,  only buy if you are sweating the licences for > 4 years.
  • If you are running “warm” DR Servers then buy licenses if run for more than a year. Effectively buying licences becomes twice as appealing.

One thing I am looking at the moment is a scale out architecture where there may be two nodes up 100% of the time and then additional nodes scaled on demand. In this hybrid case it makes sense to buy licenses on the permanent nodes and “rent” on the elastic nodes.

Obviously this doesn’t take into account that renting gives you more flexibility, so is certainly the mode of choice for temporary environments.

Hot of the press is the announcement of MS Azure support for local SSD, up to 800GB which should be enough for “most” TempDBs.

Now this doesn’t have me too excited as its volatile and core storage is still a bit challenging to get IOPS on, especially if you need “burst” capability. What has me super excited is the introduction of the “memory” intensive images, or the D series with very competitive pricing.


Annoyingly the pricing is a bit misleading as provisioning page has the price above (768.48 per month), but the pricing page has a different price




If we look at a 16 core server options with close to 128GB ram we can compare the old and new offerings against Amazon. The only way to get 16 cores before was via the A9 image and it was really pricey, now we have local SSD thrown in for 51% price drop (ok with no infiniband, but I'm good with the trade).

  Azure (Old Pricing) Azure (new) Amazon EC2
Name A9 STANDARD_D14 r3.4xlarge
Cost/Month € 2,715 1,314 1,092
Spec 16 Cores
112 GB
40 GBit Infiniband
16 Cores
112 GB
800GB Local SSD
16 Cores
122 GB
2x320GB Local SSD

So, while the price for 16 core images has come down a lot its still not as competitive as the Amazon offering.

What is it missing to be a technically superior solution to Amazon for running SQL Server. A few things IMO:

  • Provisioned IOPS
  • Wholesale Replacement of old fashioned Magnetic Media with SSD
  • Burst Capably on IOPS

Amazon still has these trump cards and while it does, it will be holding the technical edge for running SQL Server at the high end on demanding IO based workloads.

However todays announcement is a huge leap forward. If you can leverage the Local SSD and/or don’t need massive IOPS then you can pocket a higher business value.

Page List

Page List