SQL Centre of Excellence

While the Matrix Control does has some new features for automatically handling ragged hierarchies as described by Reza Rad and others  (https://radacad.com/removing-blanks-from-organizational-ragged-hierarchy-in-power-bi-matrix-visual), it falls short of what we want to achieve on most of my projects and still doesn’t come up to where MOLAP was with its automatic support via HideMemberIf

This is the approach that I use, and I was pleased to see that DAX patterns second edition also has this type of design pattern.


As an example we can take the “DimAccount” parent child hierarchy from the AdventureworksDW database and render it in PowerBI. on first cut it comes out as below.

The challenge is that we need to be able to determine when the current  scope of a node is below the maximum depth.


To eliminate these blanks in DAX, there are a few approaches, but I’ll cover off using ISINSCOPE. The approach is as below

1) I assume that there is a PATH column in order to render the hierarchy like below
AccountPath = PATH ( Account[AccountKey], Account[ParentAccountKey] )

2) We add a calculated column to determine the Depth of the Node. as below

Account Depth = PATHLENGTH(Account[AccountPath] )

3) we add a calculated measure to determine the scope of the current node that is being navigated to. In this case the hierarchy can be between 2 and 6 levels deep.

AccountScope = switch(TRUE(), ISINSCOPE( Account[Account L6]), 6, ISINSCOPE( Account[Account L5]), 5, ISINSCOPE( Account[Account L4]), 4, ISINSCOPE( Account[Account L3]), 3, ISINSCOPE(Account[Account L2]),2 )

When that is in play we can see that the current scope of selected nodes is below the maximum depth of a hierarchy (Example below)

So the final step is to create a measure that returns 1 or null depending on if the level is in scope and multiply the Report measures by this. as below

HideAccount = if(Account[AccountScope] <= min(Account[Account Depth]) ,1)


GL Amount = CALCULATE(sum('General Ledger'[BaseAmount] ) * Account[HideAccount])

You loose a bit of DAX performance using a “helper” measure rather than “inlining” the IF, but I do find this approach easier to support, so the choice is yours.

Once we put this into play, we can now get the desired result of a ragged hierarchy with no blanks!


Coming off the tails of the hugely successful SQL Saturday Dublin events, Data Ceili – Dublin 2020 is shaping up to be Irelands largest Microsoft Data Platform event.

We still have till then end of Jan 2020 for speaker submissions and we have over 180 submissions from about 60 Local and International speakers. A who’s who of MVPs, experts and product team speakers to share their knowledge.

What has changed a lot since the first event in 2013 in Dublin is the sheer breadth of the Data Platform, hence the name change. This new style of event embraces the fact that modern data professional in Ireland use a diverse toolset including Power BI, Power Automate, Powershell, AI, ML, Bots, ADF, Containers Kubernetes, Databricks, Analysis services and more Azure features than one person can possibility master.

For me, I love getting an insight into emerging technologies which I’m not using yet, but plan to in the near future. I like speakers like Simon Whiteley who combine architectural principles I like such as meta data based frameworks to bleeding edge technologies like ADF and Databricks, which people are only just getting to grips with.

I also love taking a technology that I am weaker in (cough Powershell) and seeing how more dedicated veterans like Rob Sowell use it to a different level.

Often I will go an see a seasoned speaker like Kevin Kline, not so much about the topic, but more about improving the soft skills of delivery. Kevin is a god of the speaker world – I even saw him deliver a lightning talk at 10 seconds notice about a bar of chocolate of all things, with the entire audience transfixed.

Personally, much of the take away from an event this size for me, is also the people you meet: Speakers with new ideas, bumping into customers, and mixing with fellow data platform professionals. I’ve got multi-year projects on the back of events, found new employees and built a network of goto experts I can subcontract to when things need real specialists.

Inspired by the DBA tools (https://dbatools.io/) library for automating all things SQL Server DBEngine I created some Powershell Library functions to automate SSIS Build and Deployment. We use this on projects to move SSIS closer to the world of DevOps and that magic “10 deploys a day”. These Powershell functions work well when added to a VSTS deployment and release pipeline.



This library function users the command line version of visual studio to build an SSIS project and create the “ispac” file that contains the packages.

Example usage below




This library function uses the Microsoft.SqlServer.Management.IntegrationServices Namespace to deploy ispac file to a folder in the SSIDB database. This similar functionality to the Integration Services deployment Wizard. Example of it in action is below




Installation and Source Code

You can get the source code and installation instructions to play around with the Powershell from my GitHub project here


Cannot Start PolyBase Services

by Bob Duffy 22. February 2017 12:45

One of my pet hates attending SQL talks is when the “expert” speaker spends most of the time blabbing on about “install” and “configuration” difficulties they had. It kinda paints a picture that they probably haven’t actually done much with the technology yet.

Saying that I thought I would add some notes on problems starting the PolyBase Services that are installed with SQL 2016 as I’ve struggled with them a bit – so you guessed it by my own logic I am certainly not an expert on PolyBase!

Three common gotchas found:

1) TCP/IP Must be enabled

As PolyBase is designed to connect to HADOOP and other cloud based data sources it obviously uses TCP/IP, but bear in mind that developer edition which we all install on our laptops defaults to no enabling TCP/IP. Luckily Greg Lowe and some other MVPs figured this out before me http://sqlblog.com/blogs/greg_low/archive/2016/07/06/sql-server-2016-polybase-services-stuck-in-change-pending-state.aspx

Hopefully Microsoft can maybe add a warning or check on the install to fix this in the future.

2) Server Name Change is not supported

In my rush to setup my new laptop I installed SQL Server BEFORE changing the computer name from its default. While DBEngine supports a name change via the sp_dropserver and sp_addserver procs, the PolyBase Service will just not start with an error message in the appliction log of “No such host is known”.

The only workaround we found was to uninstall and re-install the PolyBase Services – which actually doesn't take that long.

I added a connect item to vote for a fix on this


3) Service Name Change needs CU

If you wanted to change the service account (Scale out PolyBase needs a domain account rather than the default network service account) then in the past this was an uninstall and re-install job to get the correct rights assigned to the account.

Now this is supported, but only if you have applied a newer CU


I’ve recently been working almost exclusively on projects running SQL Server in Azure (IaaS) and the question came on the best way to handle backup and data protection. We accept that for pretty much all SQL Server based solutions we need to have data and potentially logs backups if RTO is more frequent than about 24 hours, but the choice is do we backup to local disk/file share/3rd party product, or use the new kid in down backup to azure blob store (aka backup to URL)

Well, when we say a “good” choice we probably mean three principal attributes:

  • Is it safe
  • Is it cheap
  • Is it fast

Is backing up to a Azure blob store (TO URL) Safe ?

Two types of “safe” to consider: safe from data loss and safe from some 14 year old expert hacker in china or disgruntled ex employee.

WRT data loss the Azure storage accounts come with a 99.99% SLA for “hot” storage and 99.9% SLA for the cheaper cool storage which we would recommend for backup as its dirt cheap. In addition it keeps a whopping THREE copies of your files within a data centre and if you pay the double wonga for geo replication to another data centre (Amsterdam for us here in Dublin) , you get SIX copies of the data. So I think its safe to say (you see what I did there ), that putting sql backups onto azure storage is a safe bet, probably safer than anything any of our customers have on their on premise infrastructure.

Image result for azure storage three copies

Another big plus for azure storage is its automatically “off site” and hopefully not in the same fault domain as your production services, so no eggs in one basket like you may have on premise. If you decide to NOT put your sql backups into the cloud and then have no offsite copy then you've probably increased risk of data loss, not reduced it !

WRT being safe from hackers and disgruntled ex employees that's not so clear. As azure storage account is classed as a “public” service its pretty much always available over https if you have the right account credentials, or access keys on the storage account. It relies on good people and process to secure (doesn't sound good).

One massive mitigation is the move away from single point access keys to Shared Access Signatures or SAS. These allow for a per service security token to be granted which can restrict based on time duration,  allowed IP addresses, container and access rights allowed. For example maybe the production sql server could just have write access to the azure storage account for the specified IP address for a specific container. If it was compromised, then the attacker can’t do much.

One complaint I have is that the SAS token generated by using the wizard in SSMS is pretty much “full” access rights. What happened to “secure by default” Microsoft! Hopefully later revisions of SSMS will allow for more locked down SAS tokens for backup. In the interim, no worries, you can generate your own SAS tokens from azure portal or Powershell if you want the more beefed up security.

 Is backing up to a Azure blob store Cheap?
Currently 1TB of cold azure blob storage comes in at around 8 euro per TB per month before discounts, and if your running a large 24x7 in azure I would hope you at least have an EA or some other form of reduced price scheme and don’t “pay as you go”.

Geo replication is about double that with a bit more if you want the geo site to be readable, and IMO its rude to geo replicate data and not have it readable at the remote site.

So whether you consider it “cheap” depends on how much you currently pay for backup storage. If you are a SOHO using a small NAS unit with cheapo 5TB SATA drives that cost a hundred euro each and have near zero OPEX/support, then maybe you think this is expensive. If you are paying for a ridiculously expensive SAN, along with support and OPEX costs, then you might be paying anything from 2k to 10k per TB, so 8 euro a month is definitely cheap. Especially give then “offsite” nature.

 Is it fast ?

Well, this is probably the most important consideration for a DBA thinking about maintenance window or recovery run book.

Each storage account is limited to about 20,000 IOPS (and to be honest I haven't yet figured out how that maps to MB/Sec throughout limits), but what I do know is that I ran some simple tests by backing up a 150 GB database recording the effective throughput below on different VM sizes in the “DSv2” series using a single 1TB P30 SSD for the data volume.

The 2 core VM put out a modest 79 MB/Sec and the most interesting thing to note was that this was faster than backing up to the Local SSD. Another interesting point is that backing up to cold read only geo-replicated storage was about the same performance profile.


The four core VM with identical configuration managed to get 127MB/Sec


The eight core VM achieve 253 MB/Sec


And the sixteen core VM got to a tasty 346 MB/Sec


I have no idea if anyone else got similar backup performance from VMs in Azure ? post to comments if you did/didn’t.

One major recommendation is that if you are every doing a emergency restore in Azure on a big data estate, you probably want to increase the core count to “max”, do the restore and then shrink the core count back to normal.

I did mess around with using multiple storage accounts to “stripe” backups across, but it only made a 10% difference throughput wise in two scenarios:

a) When I was backing up from on-premise with higher latency to storage account. Striping the backup increased the thread count which as we all know is mucho better for saturating bandwidth on high latency links. By the way fact of the day is that by default the backup uses six threads per URL backed up to (or so we observed).

b) On the 16 core VM we started to see some benefit from striping. I guess that on smaller VMs it just not possible for the a single VM to “stress” the storage account, so using multiple storage accounts wasn't worth the trouble for us.

With a “ragged” hierarchy we often need a way to make sure that the hierarchy is displayed in a user friendly fashion when measures can be linked to any level (including parents).

There are two established solutions

1) The Pure DAX approach

Alberto was one of the first to publish a DAX workaround for the problem of ragged hierarchies here:


This workaround starts to get very complex when you have multiple hierarchies and performance can suffer on large hierarchies as the evaluation is at run time, but it does work and is “supported”.

2) HideMemberIf

BIDS helper exposes an “unsupported” technique to add HideMemberIf into the XMLA of the model definition. https://bidshelper.codeplex.com/ 

This solution looks more appealing as a lot of the work is done during ProcessRecalc when it does Hierarchy Processing – hopefully giving benefit at query time. It also involves less calculated measures, columns and code which sounds nice.

However two words of warning:

a) There is no guarantee that this will work in later editions

b) There are stability issues with a large number of levels. On the model we are working the ProcessRecalc and visual studio will just hang when we try to set this property on a NINE level hierarchy.


In the end we did use HideMemberIf, but only set the property on specific levels which we know can have measures attached to them to avoid the stability issues of having this turned on for all levels.

I’ve been diving deep into how long Analysis Services can take to open a connection for a few months now and thanks to some tips from many people (Alberto Ferrari, Greg Galloway, Darren Gospel to name a few) here are some ideas to improve performance.

1. Make sure your network is not slow

OK, sounds kinda obvious, but this was killing me for months.

Most of me problems ended up being driver related with various incompatibilities between our NICs, Hyper-V and some of the new features.

A simple internet speed test or benchmarking tool would have shown that the VMs running SSAS were running very slow with packet loss. I blogged more details here


2. Avoid Named Instances

It can take up to double the time to open a connection to a named instance. As a test I opened 1,000 connections to a remote server with five scenarios:

  • By Name (Default Instance)
  • By IP (Default Instance)
  • By Server Name and Instance Name (Named Instance)
  • By Name and Port number (Named Instance)
  • By IP and Port number (Named Instance)


The results show that if we are using a named instance (e.g. bob-pc\tabular) then connections can be almost double the time to make. Once we manually specify the port number and bypass the “browser” service, connection time is about the same for the other test cases.

There was a very small improvement for using the IP instead of the name, but this was so small I’m not sure it would be worth the effort (2.5%)

lesson learned: Use fixed port numbers when using SSAS instances. If you do need named instances and want to simplify connections it should be possible to give each instance  the same default port number (2383) and bind a different IP address to each instance. We've been using this trick on SQL DBEngine clusters for years to simplify connections and manageability.

2. Optimising Local Connections

If the client and SSAS are running on the same server (e.g. development laptop). You can optimise the connection even further by using the IP address and/or avoiding using the alias “localhost”


Ok these are small numbers, but its a 45% gain to be made on connection time Smile

one eye opener or me is there was a measurable difference to show that using “localhost” was not the best option for me.

3. IP4 or IP6 ?

I ran a quick test comparing use of IP4 to IP6 for connecting to an SSAS Server. This did show a 30% improvement with the older IP4 (I have no idea why this would be so!)



4. Do I care about connection time ?

Many tools like Excel either keep a connection open to SSAS or re-use a single connection, so connection time might not be that important.

Other tools may open a connection per request or in some cases (reporting services) many many connections per user – In this case the impact of faster connections may be more noticeable.

Over the last few months we have been developing reporting services reports on top of a SSAS tabular model. One of the banes of my life has been the unusable performance of our reports, especially if they open a lot of connections.

This week I finally got some time to triage the issue and found the root cause, so I’ll post it here in case anyone else is hit by this.

How Can we measure Connection Time on Reports ?

Reporting services 2012 onwards has additional performance data on datasets on the ExecutionLog3 view  in the AdditionalInfo XML column. In my case it looked something like this:


As you can see it is taking 420ms to open a single connection to the Analysis Server This should be around 20ms or less. One of the annoying this about reporting services is it will open a fresh connection for every dataset, so with more complex reports this became unusable very quickly with some seemingly simple reports taking 30 seconds plus.

We used and xPath query and another report to visualise this performance data per data


which facilitates a report showing time spent between connection and actual query time



What was the cause

After trying it on various servers, laptops, desktops we managed to isolate it down to any of our SSAS Servers running inside Hyper-V on Windows 2012R2. Some further triage isolated the issue down to VMs which were using a broadcom network card with a known issue with Hyper-V and the NIC



After some reconfiguration, the connections came down to under 20ms Smile

Analysing Report Server Logs

by Bob Duffy 4. March 2015 08:54

Attached is the example Power Pivot Model I used to analyse Report Server Logs complete with 3D pie chart to shock the DataViz world ;-)

I want to get time to extend this to include a lot of the “version 3” data which includes per Dataset metrics on query and processing time.

Feel free to use this as a starting point for analysing Report Server performance or usage.


You’ll need to update the connection in the Power Picot window to point to your ReportServer database.




Speaking Schedule First Half 2015

by Bob Duffy 1. February 2015 20:01

Quite a selection of speaking events coming up and an ambitious selection of mostly new content this year.

Belfast SQL User Group - Thurs 26th Feb.

  • When Good SQL Design Goes Bad
  • Optimising Reporting Services Performance

sqlBits, Excel London Exhibition Centre – 4th to 7th March

Friday 11:00 – 12:00 When Good SQL Design Goes Bad

Saturday 11:15 – 13:15 Scaling the DVD Store from 100 to a million TPM

Saturday  16:00 – 17:00 The European Economic Crisis And the Euro – A Data Tale

SQL Saturday Exeter. 24th –25th April, 2015

Friday Pre con Optimising Reporting Services for MDX Data Sources

Saturday 16:20 When Good SQL Design Goes Bad

Note sure if I will be selected, but I have submitted for SQL Saturday Edinburgh (BI Edition)  on June 13th. Edinburgh’s a city close to my heart with my college years spent there and Sandra living out there now.

And of course don’t forget SQL Saturday Dublin on the 20th/21st June. I may make an appearance on the lightning talks maybe but not a regular session as soo much to organise and so many great speakers submitted.

BTW Myself and Marco have some mad plans for a half day or whole day FREE session in Ireland too with myself and some other speakers . Will post when we get dates firmed up. Watch the UG newsletters and blogs

Page List

Page List