Power BI – Musings in May 2014

Having just finished a customer POC using Power BI I wanted to share my thoughts on the toolset. I feel confident that Microsoft are moving in the right direction with Power BI and that its objectives, deliver self-service and mobile BI, is exactly what a lot of my customers want. To do that wrapped in the familiar Excel and bundled up with new licensing model options with Office 365 it is an exciting consideration.

Where to Begin?

My first issue with Power BI is that the overall messaging and marketing is confusing. I am fairly competent but trying to set up my Office 365 trial with Power BI took some getting my head around. However I am also typical IT man in that I try never to read the getting started guide. After an hour of faffing around I went to the Power BI getting started guide and it helped, a lot!

To understand it more I put together the following diagram that hopefully helps, please note this is not an overview of the whole of Power BI just the elements we covered in this POC:


Basically Power BI is an app that runs inside SharePoint online. You get access to SharePoint online if you sign up for an Office 365 subscription (can be paid monthly or annually). With Office 365 you can use Office apps online or download them to your desktop. To be able to use the Power BI Excel add-ins (Power View, Power Pivot, Power Query, Power Map) you will need Office 365 Professional Plus. Unfortunately even for users that will simply consume reports/dashboards through your Power BI site you will need the add-on to your subscription for Power BI (tenant). Again this is something that should be looked at by Microsoft.

Power BI components, utilised during this POC:

Power Pivot – probably needs no introduction but is a data modelling add-in for Excel that allows users to bring together data from multiple sources, relate it, extend it and add calculations using DAX.

Power View – a dashboarding and visualisation tool (perhaps the same thing) that can source data from Excel, Power Pivot or SSAS Tabular DBs (in SharePoint on-premise, integrated mode it runs inside SSRS and can work with SSAS Multidimensional DBs). Power View has some great charting functions and allows relating dashboard items, advanced filters, Bing maps, play axis and slicers across the whole dashboard.

Power Query – this is a self service ETL tool (to some degree) it allows you to connect out to the internet (except Twitter at this point contrary to all the pre-sales demos showing you Twitter feeds, to get this currently you will need a third party connector) and grab tables and lists of data. Once you have it you can add it to Power Pivot models and you can then analyse the data using Excel and Power View, for example.

Power Map – this is my least used tool that looks great in demos but I have yet to see a use over and above Bing Maps in Power View. It ultimately works in a similar way to bing maps in Power View in that you can plot locations and look at measures on a map. The key benefit to Power Map is that you can then record a “tour” where you can record your analyse around the map and then save this out to a video for example.

So to get started with Power BI you go get yourself a trial of Office 365, add the Power BI functionality to it, follow the getting started guide and then start building out some Power Pivot models in Excel that you can use as a source for reports and dashboards in Power View.

The final piece to the Power BI puzzle that I found really great is the Windows 8.1 Power BI app. Again this will be available for iOS and Android later this year. What this allows is the user of the app to browse to the SharePoint online site and feature reports from that site in the app.

What went right, what went wrong?

So back to the proof of concept. It took us about 1/2 a day to construct a very basic SharePoint site that had the Power BI app enabled. We downloaded and installed Office 365 Pro Plus on our desktop and then it took a lot more time to try and come up with useful content. Our major issue is that we were trying to use this to surface our SSAS 2012 Multidimensional cube. We have a large(ish) cube with (around 40gbs) with a lot of data. However the biggest flaw with Power BI, today, is that you cannot connect to on-premise SSAS databases directly. You have to either create subsets of data for each and every report you need in an embedded Power Pivot model or you have to try and create oData feeds of the data you want to use. There is a potentially useful download called the Microsoft Data Management Gateway that does allow you to set up an encrypted link between your on-premise SQL Server or Oracle databases (and oData feeds) but as yet this doesn’t allow for connection to SSAS so we were not able to make use of it.

The other massive benefit my customer, and most other customers, is the ability to have true mobile BI. Ultimately the pieces they need to access are Excel, Power View and SSRS. And unfortunately where they need them is on an iPad! This customer actually wanted to see Excel and Power View reports on the Nexus as well. Power View can be rendered in HTML5 however it warns you when you use this view that some things may not work (most notably the play axis, although I still cant find a real world use for this) we found it most cumbersome around Bing Maps in Power View. Having done some mobile app development I know how hard it is to make sure an app functions and offers a similar user experience across browsers and devices, however with Microsoft’s resources they need to get this right.

Finally the biggest drawback to the whole POC was performance and usability to actually get to the report. From a mobile BI perspective what we want is our analyst to create a report in Excel, check (once we gave them a Power Pivot model). The ability of said analyst to publish the Power View dashboard to the Power BI site, check. Finally an alert for our end users there was a new report and to go look at it, nope. Ok so can the analysts simply click a link to our Power BI site and look at a list of reports, not quite! They have to login to our SharePoint online team site (we don’t want this at all) and then launch Power BI from the left hand links. The user experience here is not just too many clicks but also conflicting look/feel. The team site can be customised and made to look almost corporate, in true SharePoint fashion, however the Power BI site cannot be and looks blue and white. Don’t get me wrong it looks ok, but this is a real world business. The point of our mobile BI piece is for senior execs to be able to launch a report from their iPad on the golf course and get a glance at how their business is going, if they have to go through a Microsoft branded page they are not going to be terribly impressed. And the speed… the lack of it. Loading the team site, slow. Loading the Power BI site, slow. Loading the reports, slower. Even, when running HTML5 as opposed to Silverlight, interacting with a Power View dashboard was slow. This needs to be looked at and fixed. Alternatively give us a link directly to a single report, we can make those look corporate and hopefully we can push them to use Surfaces and use IE and Silverlight so interaction is fast!

There were other general things that can be summarised here:
1. Inability to link to on-premise data in the form of the SSAS cube.
2. General site performance – Loading the Team site was slow (took nn), loading the Power BI page took nn, loading reports took too long.
3. Too many clicks to get to a report
4. No ability to share a report via email link (to take the user straight to the report)
5. Power BI site not able to be customized inline with SharePoint look/feel, based on corporate requirements.
6. Featured Reports option not working on Android and Apple devices.
7. No ability to remove Featured Reports.
8. Inability to connect directly to Twitter feeds from Power Query and therefore link to a Power Pivot model and visualize through Power View.
9. Bing maps viewed excellently in Windows app and through Silverlight but through HTML5 they were not responsive enough, issues with pinching to zoom, moving the map around with touch and issues when changing filter it not always refreshing the map points.
10. Interacting with drillable charts in Power View (in HTML5) on the iPad and Nexus was tricky, sometimes the drill worked other times it highlighted the slice/column.

To Power BI or Not to Power BI?

In short my customer chose not to Power BI at this time and I agree with them. Obviously depending on your customers or business needs this decision may be different. But with the issues we encountered it wasn’t a viable option for end user reporting and dashboarding. BUT… I have it on great authority that a lot of the issues we found will be fixed in coming release(s). In fact if you read Chris Webb’s blog post a few of the issues we encountered are mentioned at the recent PASS summit.

I have advised my customer to wait until said release(s) and to try again with our POC upon their potential new release (perhaps mid-July as the rumour mill suggests?). This coupled with the relative cost versus alternatives such as QlikView and Tableau AND the fact that most companies are looking for a suitable upgrade path for MS Office means that people will use this toolset, and rightly so. If Microsoft can make v2 with all the required enhancements they will have a truly amazing BI stack, imo the best in the marketplace. Lets cross our fingers and hope July is not just sunny but Power BI v2 comes out and knocks us all down!

As usual any feedback or tips very much appreciated!

P.S – It would also be so wonderful if I could extend my trial. Now I am looking at having to re-do all my POC work, from scratch, in July!

P.P.S – Check out Microsoft’s latest guide on the BI Tool Use


PDW Shallow Dive – Part 3

Welcome to part three of the PDW Shallow dive. In this section I am going to explore partitioning in PDW. We will look at how to partition a table on the appliance and how to use partition switching. Unfortunately I was hoping to look at the CTAS function and specifically the mega merge statement but I will have to create a 4th part of the series to cover that due to time constraints.

Introducing Partitioning on the PDW

Partitioning is used to help manage large tables. When partitioning a table you track logical subsets of rows with metadata. Unlike SMP SQL Server there is no partitioning function or scheme you simply set it when you create the table or use Alter Table to add partitioning to the table by selecting what column to partition by and setting your boundary values. This does not affect which distribution or compute node the data is stored on. Rather it will allow you to manage the table with these subsets of data, for example partition by month and then archive an older month by switching the partition to an archive table.

Let’s look at a simple example of a fact table that is partitioned, the code looks like this:

CREATE TABLE [dbo].[FactSales] (
[OnlineSalesKey] bigint NULL,
[DateKey] datetime NULL,
[StoreKey] int NULL,
[ProductKey] int NULL,
[PromotionKey] int NULL,
[CurrencyKey] int NULL,
[CustomerKey] int NULL,
[SalesOrderNumber] varchar(28) COLLATE Latin1_General_100_CI_AS_KS_WS NULL,
[SalesOrderLineNumber] int NULL,
[SalesQuantity] int NULL,
[SalesAmount] money NULL,
[ReturnQuantity] int NULL,
[ReturnAmount] money NULL,
[DiscountQuantity] int NULL,
[DiscountAmount] money NULL,
[TotalCost] money NULL,
[UnitCost] money NULL,
[UnitPrice] money NULL,
[ETLLoadID] int NULL,
[LoadDate] datetime NULL,
[UpdateDate] datetime NULL
PARTITION ([DateKey] RANGE LEFT FOR VALUES (‘Dec 31 2006 12:00AM’, ‘Dec 31 2007 12:00AM’, ‘Dec 31 2008 12:00AM’, ‘Dec 31 2009 12:00AM’, ‘Dec 31 2010 12:00AM’, ‘Dec 31 2011 12:00AM’)));

It doesn’t have to be dates that are used to partition tables, although this is the most used option, you can partition tables by set values such a product categories or sales channels. Understanding the data you are working with and how it will be analysed is critical to set up partitioning. The key options for setting up partitioning are:

  • The column – as mentioned before it is critical to identify the best column for partitioning your data by, in the example above we use the [DateKey] column.
  • Range – this specifies the boundary of the partition, it defaults to Left (lower values) but you can choose Right (higher values), in the example above we use the default, Left. This means that the values specified will be the last value in each partition e.g. our first partition will have all data with a date before Dec 31 2006 12:00AM.
  • The boundary value is the values in the list that you will partition the table by, this cannot be Null. These values must match or be implicitly convertible to the data type of the partitioning column.

Partitioning by example

In this section we will go step by step through a basic partitioning example. We create three tables, a main table with partitioning on it using a part_id, a second table used for archiving data from our main table and finally a table that we will use to move data from into our main table.

  1. Create out our factTableSample with a partitioning option, using a part_id column:

    create table factTableSample

    (id int not null, col1 varchar(50),part_id int not null)

    with (distribution =
    partition(part_id range

  2. Now we will insert some basic data into the table:

    into factTableSample

    select 1 id,
    ‘row1’col1, 1 part_id


    select 2 id,
    ‘row2’col1, 2 part_id


    select 3 id,
    ‘row3’col1, 3 part_id


    select 4 id,
    ‘row4’col1, 4 part_id


    select 5 id,
    ‘row5’col1, 5 part_id

    Below shows what those rows then look like:

  3. Let’s take a look at the how the data is stored in that table, how many rows are in each partition:

    –Check Paritions and the number of rows

    SELECT o.name, pnp.index_id, pnp.partition_id, pnp.rows,

    pnp.data_compression_desc, pnp.pdw_node_id

    sys.pdw_nodes_partitions AS pnp

    sys.pdw_nodes_tables AS NTables

    ON pnp.object_id
    = NTables.object_id

    AND pnp.pdw_node_id = NTables.pdw_node_id

    sys.pdw_table_mappings AS TMap

    ON NTables.name = TMap.physical_name

    AS o

    ON TMap.object_id
    = o.object_id

    WHERE o.name =

    BY o.name, pdw_node_id, pnp.index_id, pnp.partition_id

    The results show that all but the final partition has 1 row in them. However you also notice there are 24 rows returned. This is because we created a replicated table that is replicated across our 4 nodes. For more information on replicated tables review part 1 of the series.

  4. Now we will create an archive table to which we can move out older data:

    table factTableSample_Archive

    id int

    col1 varchar(50),

    part_id int

    with (distribution =

  5. Now we can switch out partition 5 to the archive table and look at the results:

    table factTableSample switch partition 5 to factTableSample_Archive;

    We can clearly see that the 5th partition has been moved from the main fact table (factTableSample) to the archive table (factTableSample_Archive).

  6. Next we create a new table which we would use to load staging data into:

    table factTableStage with (distribution =

    partition(part_id range


    select 1 id,

    varchar(50)) col1,

    5 part_id

This adds a single row into partition 5:

  1. Now let’s switch the stage data into partition 5 of the main fact table and look at the results:

table factTableStage switch partition 5 to factTableSample partition 5;

We see that partition 5 is now populated with the data from the stage table.

To summarise we can see using partitioning on the PDW is not overly complex. The alter table statements used to switch the partitions are very fast. If we had used distributed tables in the above examples it would have no effect on any of the code we used. Just like with SMP SQL Server it is important to plan your partitioning strategy in line with the analysis requirements and distribution of data in the datawarehouse. In the next part of the series I will dig into the CTAS (create table as select) statement and show an example of using this to do a merge into a dimension table. I hope you find the above useful, if you want any code examples please feel free to email at matt@coeo.com and I will get you a copy of the full script.