As discussed recently on the blog, I have been on a journey to try and attain the Microsoft Certified Solutions Associate Certification in BI Reporting. I was very fortunate to overcome the final hurdle of this task by passing Exam 70-778: Analyzing and Visualizing Data with Microsoft Power BI the other day. I enjoyed the opportunity to dive deeper into the world of Business Intelligence, particularly given the enhanced role Power BI has within the Business Applications space today. With this in mind, and in the hopes of encouraging others, today’s post is the first in a new series of revision notes for Exam 70-778. I hope that you find this, and all future posts, useful as either a revision tool or as an introduction into the world of Power BI.

The first skill area of the exam is all around how to Import from data sources, as described on the exam specification:

Connect to and import from databases, files, and folders; connect to Microsoft SQL Azure, Big Data, SQL Server Analysis Services (SSAS), and Power Query; import supported file types; import from other Excel workbooks; link to data from other sources

To begin with, I will provide a detailed overview of the topic areas covered above, before jumping into an example of how to import data into Power BI.

Supported Data Sources

The great benefit of Power BI is its huge list of supported connectors, which are integrated neatly within the application itself. The list of all possible data sources changes on a monthly basis, and it is impossible to go into detail on each one. Suffice to say, you should at least be familiar with the following data sources:

  • SQL Server (on-premises & Azure)
  • SQL Server Analysis Services
  • A wide range of vendor-specific Relational Database Management Systems (RDBMSs), such as Oracle, MySQL, PostgreSQL and SAP HANA
  • Any data source that supports Open Database Connectivity (ODBC) or Object Linking and Embedding Database (OLE DB).
  • The following flat file types:
    • Excel (.xlsx)
    • Text (.txt)
    • Comma Separated Value documents (.csv)
    • Extensible Markup Language (.xml)
    • JavaScript Object Notation (.json)
  • Web sources, such as Web pages or OData Feeds

Some RDBMS vendor solutions have a requirement to install additional software, which will enable you to interact with that particular data source. You should check the relevant documentation for each vendor to verify any specific requirements.

Power BI also supports a wide range of Microsoft and third-party applications, such as Dynamics 365 Customer Engagement, SharePoint, Google Analytics & Salesforce. If you are feeling particularly technical, then you can also use the Blank Query option to, in theory, connect to any data source of your choosing, or even go as far as building custom connectors yourself to interact with a specific application.
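
As a minimal illustration of the Blank Query route, the sketch below hand-writes a query against a JSON endpoint; the URL is entirely hypothetical, and the query assumes the feed returns a list of records:

let
    // Hypothetical endpoint; Web.Contents retrieves the raw response
    Source = Json.Document(Web.Contents("https://example.com/api/cases")),
    // Assuming the feed returns a list of records, convert it to a table
    AsTable = Table.FromRecords(Source)
in
    AsTable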

Bulk File Loading

As well as supporting connections to single flat files, it is also possible to interact with multiple files existing in the same location. This feature can be useful if, for example, there is a requirement to process hundreds of .csv files with different data, but the same overall structure. The supported bulk file locations are:

  • Windows file system folder
  • SharePoint document folder
  • Azure Blob Storage
  • Azure Data Lake Storage

When loading multiple files into Power BI, you can not only read the contents of each file but also access file-level metadata, as the sketch below indicates:
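
As a rough illustration in M (the folder path here is hypothetical), the following query reads every .csv file in a folder, filters on the Extension metadata column and combines the file contents into a single table, assuming each file shares the same structure:

let
    // One row per file, with the binary Content plus metadata columns
    // such as Name, Extension, Date modified and Folder Path
    Source = Folder.Files("C:\Demo\CsvFiles"),
    #"Filtered Rows" = Table.SelectRows(Source, each [Extension] = ".csv"),
    // Parse each binary as a CSV document and stack the results together
    Combined = Table.Combine(List.Transform(#"Filtered Rows"[Content], each Csv.Document(_, [Delimiter = ",", Encoding = 65001])))
in
    Combined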

Import vs DirectQuery

An important design decision when working with data sources concerns the data connectivity mode to be used. Your final choice will generally fall into one of two options:

  • Import: When connecting to your data source, Power BI takes a copy of all data and stores it within your report. By implication, this places additional pressure on your local machine's disk space and memory consumption. Import is the default option for most data sources and, to ensure that your data remains consistently up to date when deployed to the Power BI service, you can define your data refresh frequency – up to 8 times a day on Power BI Pro and up to 48 times a day on Power BI Premium subscriptions. Import is the most sensible option to choose when there is no requirement for regular refreshing of your data sources or if performance concerns arise when using DirectQuery.
  • DirectQuery: Instead of taking a snapshot of the data, Power BI will read the data at source and store only the schema of the data within the model. At the time of writing this post, only a select number of mostly SQL-based data sources are compatible with this feature. DirectQuery is your best choice when there is a need to keep reports continually up to date, and when your target data source is sufficiently beefed up to handle frequent requests. It’s also worth bearing in mind the following points when evaluating DirectQuery:
    • Traditionally, DirectQuery supported only a single data source connection for the entire model, with no option of defining additional sources; the release of composite models for DirectQuery has since removed this much-loathed limitation.
    • There are limitations when it comes to data transformation options, especially for non-Microsoft data sources.
    • Some query types will be unsupported.
    • For data modelling using DAX, there are some crucial limitations. For example, Measures that use the SUMX & PATH functions (or their related counterparts) are not allowed; see the sketch after this list for the kind of Measure affected.
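
As a hedged illustration of that last point, a Measure along the lines of the sketch below – an iterator that evaluates an expression row by row – is the kind of construct flagged above; the Sales table and its columns are hypothetical:

Weighted Sales = SUMX(Sales, Sales[Quantity] * Sales[UnitPrice])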

You should also be aware of a third option – Live Connection – which behaves similarly to DirectQuery but is for SQL Server Analysis Services only. This option has the following limitations:

  • Not possible to define relationships
  • No possibility to transform data from within Power BI.
  • Data modelling options, except for Measure creation, are almost non-existent.

Importing Excel Workbooks

There are some aspects of working with Excel documents in Power BI that are worth further consideration. You have two main options at your disposal for consuming Excel workbooks:

  1. Import Data: Similar to working with any other flat file source, data within each of the following Excel objects is importable into Power BI:
    • Tables
    • Sheets
    • Ranges
     You can see below how this looks for a file containing four worksheets (an M sketch of this import route also follows this list):
  2. Import workbook contents: If you have built out a complex spreadsheet that utilises the full range of features available in the Excel Data Model, then it is also possible to import these into Power BI “as-is”. The following Excel Data Model features are importable in this manner:
    • Power Query queries
    • Power Pivot Data Models
    • Power View Worksheets
    • (Most) Power View visuals; where a visual is unsupported in Power BI, an error appears on the appropriate visual.
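
For the first of these options, the underlying M call is Excel.Workbook, which surfaces every Table, Sheet and named Range in the file as a separate row, identified by a Kind column. A short sketch, with a hypothetical file path and table name, is below:

let
    // Hypothetical workbook; the second argument promotes the first row to headers
    Source = Excel.Workbook(File.Contents("C:\Demo\SalesData.xlsx"), true),
    // Pick out a single Table object by its name and Kind
    SalesTable = Source{[Item = "Sales", Kind = "Table"]}[Data]
in
    SalesTable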

Example: Importing SQL Server Database Data

What follows now is a typical data connection exercise in Power BI Desktop, which involves connecting to a SQL Server database. The experience described here is broadly similar for other data sources and, therefore, serves as a good introduction to connecting to data sources in the application:

  1. Launch Power BI Desktop and, on the splash screen, select the Get data link on the left-hand pane:
  2. On the Get Data window, choose Database on the left-hand list, select SQL Server database and then press the Connect button:
  3. You will now be prompted to provide the following details:
    • Server: This will be either the Fully Qualified Domain Name (FQDN) of the computer with a default SQL Server instance or the computer name plus the named instance name (e.g. MYSQLSERVER\MYSQLSERVERINSTANCE). In the example below, we are connecting to a default SQL Server instance on the computer JG-LAB-SQL.
    • Database: If you already know the name of the database you want to access, you can type this here; otherwise, leave blank. In this example, we are connecting to the WideWorldImporters sample database.
    • Data Connectivity mode: See the section Import vs DirectQuery above for further details. For this example, select the Import setting:
  4. There are also several additional options that are definable in the Advanced options area:
    • Command timeout in minutes: Tells Power BI how long to wait for a query to complete before throwing an error.
    • SQL statement: Specify here a pre-compiled SQL statement that will return the objects/datasets that you require. This option can be useful if you wish to reduce the complexity of your model within Power BI or if there is a requirement to return data from a stored procedure.
    • Include relationship columns: Enabling this setting will return a single column for each defined relationship which, when expanded, gives you the ability to add related column fields onto your table object.
    • Navigate using full hierarchy: Enabling this will allow you to navigate through the database hierarchy using schema object names. In most cases, this should remain disabled, unless there are specific schema names in your database (like Application, Sales, Purchasing etc. in the WideWorldImporters database).
    • Enable SQL Server Failover support: If enabled, then Power BI will take advantage of any failover capability setup on your SQL Server instance, re-routing requests to the appropriate location where necessary.
  5. Illustrated below are some example settings for all of the above; an equivalent hand-written M query is also sketched at the end of this walkthrough. For this walkthrough, leave all of these fields blank and then press OK to continue.
  6. The Navigator window will appear, which will enable you to select the Tables or Views that you wish to work with in the model. Selecting any of the objects listed will load a preview in the right-hand window, allowing you to see a “sneak peek” of the schema and the underlying data. Tick the object Sales.CustomerTransactions and then press the Select Related Tables button; all other Tables that have a relationship with Sales.CustomerTransactions are then automatically included. Press Load when you are ready to import all selected table objects into Power BI.
  7. After a few moments, the Load window will appear and update accordingly as each table object gets processed by Power BI (exact times may vary, depending on the remote server's/local machine's capabilities). Eventually, when the window closes, you will see on the right-hand pane that all table objects have been loaded into Power BI and are ready to use for building out visualizations:
  8. At this stage, you would then look at loading up your imported objects into Power Query for fine-tuning. But that’s a topic for the next post 🙂
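
For reference, here is a hedged sketch of how steps 3 and 4 surface in the underlying M if you do fill the advanced options in; the SQL statement is hypothetical, and in this walkthrough all of these options were left blank:

let
    // Server and database from the walkthrough, plus an options record
    Source = Sql.Database("JG-LAB-SQL", "WideWorldImporters",
        [
            Query = "SELECT * FROM Sales.CustomerTransactions", // pre-compiled SQL statement (hypothetical)
            CommandTimeout = #duration(0, 0, 5, 0),             // five-minute command timeout
            CreateNavigationProperties = true                   // include relationship columns
        ])
in
    Source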

Key Takeaways

  • Power BI supports a broad range of database systems, flat file, folder, application and custom data sources. While it is impossible to memorise each data source, you should at least broadly familiarise yourself with the different types at your disposal.
  • A crucial decision for many data sources relates to the choice of either Importing a data source in its entirety or in taking advantage of DirectQuery functionality instead (if available). Both routes have their own defined set of benefits and disadvantages. DirectQuery is worth consideration if there is a need to keep data regularly refreshed and you have no requirement to work with multiple data sources as part of your solution.
  • Live Connection is a specific data connectivity option available for SQL Server Analysis Services. It behaves similarly to DirectQuery.
  • It is possible to import an existing Excel BI solution into Power BI with minimal effort, alongside the ability to import standard worksheet data in the same manner as other flat file types.

Look out for my next post in this series, where I will take a look at the range of transformation options available to us in Power BI, and work through some examples applied against the tables listed above.

The life of a Dynamics CRM/Dynamics 365 for Customer Engagement (CRM/D365CE) professional is one of continual learning across many different technology areas within the core “stack” of the Business Applications platform. Microsoft has clarified this in no uncertain terms recently via the launch of the Power Platform offering, making it clear that cross-skilling across the various services associated with the Common Data Service is no longer optional, should you wish to build out a comprehensive business solution. I would not be surprised in the slightest if we find ourselves in a situation where the standard SSRS, Chart and Dashboarding options available within CRM/D365CE become deprecated soon, and Power BI becomes the preferred option for any reporting requirements involving the application. With this in mind, knowledge of Power BI becomes a critical requirement when developing and managing these applications, even more so when you consider how it is undoubtedly a core part of Microsoft’s product lineup, epitomised most clearly by the release of the Microsoft Certified Solutions Associate (MCSA) certification in BI Reporting earlier this year.

I have been doing a lot of hands-on and strategic work with Power BI this past year, a product for which I have a lot of affection and which has numerous business uses. As a consequence, I am in the process of going through the motions to attain the BI Reporting MCSA, having recently passed Exam 70-779: Analyzing and Visualizing Data with Microsoft Excel. As part of this week’s post, I wanted to share some general, non-NDA breaching advice for those who are contemplating going for the exam. I hope you find it useful 🙂

Power BI Experience is Relevant

For an exam focused purely on the Excel side of things, a lot of the areas tested have a significant amount of crossover with Power BI, such as:

  • Connecting to data sources via Power Query in Excel, an experience which is an almost carbon copy of working with Power Query within Power BI.
  • Although working with the Excel Data Model, for me at least, represented a significant learning curve when revising, it does have a lot of cross-functionality with Power BI, specifically when it comes to how DAX fits into the whole equation.
  • Power BI is even a tested component for this exam. You should, therefore, expect to know how to upload Excel workbooks/Data Models into Power BI and be thoroughly familiar with the Power BI Publisher for Excel.

Any previous knowledge of working with Power BI is going to give you a jet boost when it comes to tackling this exam, but do not take it for granted. There are some significant differences between the two products (epitomised by the fact that Excel and Power BI, in theory, address two distinctly different business requirements), and this is something that you will need to understand and potentially identify during the exam. That said, specific, detailed knowledge of some of the inner workings of Power BI is not going to be a disadvantage to you.

Learn *a lot* of DAX

DAX, or Data Analysis Expressions, is hugely important for this exam, and for 70-778 as well. While it will not necessarily be required for you to memorise every single DAX expression available to pass the exam (although you are welcome to try!), you should be in a position to recognise the structure of the more common DAX functions available. Your ideal DAX study areas before the exam may include, but are not limited to, the more commonly used aggregation, filter and time intelligence functions.

A focus, in particular, should be driven towards the syntax of these functions, to the extent that you can memorise example usage scenarios involving them.
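
As a hedged example of the sort of structure worth recognising, the Measure below (the Sales and Date table names are hypothetical) combines an aggregation inside CALCULATE with a time intelligence function:

Prior Year Sales = CALCULATE(SUM(Sales[Amount]), SAMEPERIODLASTYEAR('Date'[Date]))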

Get the exam book

As with all exams, Microsoft has released an accompanying book that is a handy revision guide and reference point for self-study. On balance, I feel this is one of the better exam reference books that I have come across, but beware of the occasional errata and, given the frequency of changes these days thanks to the regular Office 365 release cycle, be sure to supplement your learning with proper online cross-checking.

Set up a dedicated lab environment

This task can be accomplished alongside working through the exercises in the exam book referenced above but, as with any exam, hands-on experience using the product is the best way of getting a passing grade. Download a copy of SQL Server Developer edition, restore one of the sample databases made available by Microsoft, get a copy of Excel 2016 and – hey presto! – you now have a working lab environment & dataset that you can interact with to your heart’s content.

Pivot yourself toward greater Excel knowledge

Almost a quarter of the exam tests candidates on the broad range of PivotTable/PivotChart visualisations made available within Excel. With this in mind, some very detailed, specific knowledge is required in each of the following areas to stand a good chance of a passing grade:

  • PivotTables: How they are structured, how to modify the displaying of Totals/Subtotals, changing their layout configuration, filtering (Slicers, Timelines etc.), auto-refresh options, aggregations/summarising data and the difference between Implicit and Explicit Measures.
  • PivotCharts: What chart types are available in previous and newer versions of Excel (and which aren’t), understanding the ideal usage scenario for each chart type, understanding the different variants available for each chart type, understanding the structure of a chart (Legend, Axis etc.), chart filtering and the formatting options available for each chart.

Check out the relevant edX course

As a revision tool, I found the following edX course of great assistance and free of charge to work through:

Analyzing and Visualizing Data with Excel

The course syllabus maps closely to the skills measured for the exam and represents a useful visual tool for self-study or as a means of quickly filling any knowledge gaps.

Conclusions or Wot I Think

It is always essential, within the IT space, to keep one eye over the garden fence to see what is happening in related technology areas. This simple act of “keeping up with the Joneses” ensures there are no surprises down the road and that your skills stay relevant for the here and now. In case you didn’t realise already, Power BI is very much one of those things that traditional reporting analysts and CRM/D365CE professionals should be contemplating working with, now or in the immediate future. As well as being a dream to work with, it affords you the best opportunity to implement a reporting solution that will both excite and innovate end users. It has also allowed me to develop client relationships further once a solution is in place, as users increasingly ask to bring other data sources into it. Whereas typically this may have resulted in a protracted and costly development cycle to implement, Power BI takes all the hassle out of this and lets us instead focus on creating the most engaging range of visualisations possible for the data in question. I would strongly urge any CRM/D365CE professional to start learning about Power BI when they can and, as the next logical step, look to go for the BI Reporting MCSA.

The single biggest challenge when developing a reporting solution is data. Invariably, you won’t always have one database that contains all the information you need; often, you will need to bring disparate and unrelated data sources into a common model. This problem can be exacerbated if your organisation has a number of application or database systems from different vendors. Finding a tool that can overcome some of these hurdles is a real challenge. For example, whilst I am a huge fan of SQL Server Reporting Services, the out of the box data connection options are generally limited to common vendor products or ODBC/OLE DB data sources. Case in point: finding or even developing a Data Source that can support a JSON stream can be troublesome and complicated to configure. With the landscape as such, the reporting tool that offers the most streamlined method of overcoming these challenges is the tool that is going to win the day.

Following on from my deep-dive with the product last year, I have been working more and more with Power BI in recent weeks. What I like most about the tool is that a lot of the hassle is taken out of configuring your data sources. Power BI does this by leveraging the existing Power Query language and equipping itself with a large number of Data Source Connectors. The most surprising aspect of this? Microsoft products form only a subset of the options available, with connectors in place for many competitor products from the likes of SAP, Salesforce and Oracle. In my limited experience with the product to date, I have yet to find a data source that it does not support, either as part of a data connector or a manual Power Query.

A recent work example can best illustrate the above, as well as showcasing some of the built-in functionality (and learning curves!) that come with working with data via Power Query and writing Data Analysis Expressions (DAX). There was a requirement to generate an internal department dashboard for an IT service desk. The dashboard had to meet the following key requirements:

  • Display a summary of each team member's movements for the current week, including the location of each person on the current day. Each member of the team was already recording their weekly movements within their Exchange calendar as all day appointments, configuring the Subject field of each appointment accordingly – for example, “In Office”, “Working at Home” etc. No other all day appointments were booked in the calendars.
  • Query Dynamics CRM and return data relating to Active/Inactive Case records.
  • To be displayable on a TV/Screen, refresh automatically and be exportable as a .pdf document or similar.

A CRM Dashboard can achieve about 50-60% of the above, but the key requirements of querying Exchange and exporting the dashboard are much more tricky; whilst it is certainly possible to perform web service requests from within CRM to Exchange, the process would be so convoluted to implement that it is arguably not worth the effort. Likewise, CRM is not particularly friendly when it comes to printing out Dashboards, as you are often left at the mercy of the web browser in question. With all this in mind, we decided that Power BI was the best option and proceeded to bring all the data together using Power Query.

We first used the out of the box Exchange connector to query each person’s mailbox for all Calendar items, performing two transformations on the data. First, we filtered the result to return Calendar items from the current week and, second, we added a column to identify which Calendar the record derives from (as there is no field on each record to determine this). We’ll see why this is required in a few moments:

let
    // Connect to the user's Exchange mailbox and drill into the Calendar folder
    Source = Exchange.Contents("john.smith@domain.com"),
    Calendar1 = Source{[Name="Calendar"]}[Data],
    // First transformation: keep only appointments from the current week
    #"Filtered Rows" = Table.SelectRows(Calendar1, each Date.IsInCurrentWeek([Start])),
    // Second transformation: tag each row with its owner, as the record itself does not say
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Owner", each "John Smith")
in
    #"Added Custom"

Next, we combined all Calendars together into one table – again using Power Query. Table.Combine accepts a list of all the tables you want to merge together, so you can add/remove tables accordingly to suit your requirements. We also take this opportunity to remove unnecessary fields and convert our Start and End field values to their correct type:

let
    // Merge the individual calendar queries into a single table
    Source = Table.Combine({Calendar1, Calendar2, Calendar3}),
    // Strip the fields we have no need for on the dashboard
    #"Removed Columns" = Table.RemoveColumns(Source,{"Folder Path", "Location", "DisplayTo", "DisplayCc", "RequiredAttendees", "OptionalAttendees", "LegacyFreeBusyStatus", "IsReminderSet", "ReminderMinutesBeforeStart", "Importance", "Categories", "HasAttachments", "Attachments", "Preview", "Attributes", "Body"}),
    // Start/End arrive as datetimes; only the date portion matters here
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Start", type date}, {"End", type date}})
in
    #"Changed Type"

Our CRM data is returned via an adapted version of the query used previously on the blog, taking into account the benefits of using a Saved Query as opposed to FetchXML. No further work is required to manipulate the data once in Power BI, so this won’t be covered any further. Our issue is now with the Exchange Calendars. Because the appointments indicating each person's movements are set as All Day appointments spanning multiple days, we have no way of extrapolating the days in between to determine whether one of them is the current day. So, for example, if an all day Appointment starts on Monday and ends on Wednesday, we have Monday's and Wednesday's dates, but not Tuesday's. We, therefore, need to find a solution that determines whether the appointment falls on a specific day of the week – Monday, Tuesday, Wednesday, Thursday or Friday.

Our first step is to generate a date table covering the entire period we are concerned with. Using this very handy query, we can set up a Power BI function that will allow us to generate just that – in this case, for the whole of 2017:
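
The linked query is far more fully featured but, as a minimal stand-in, a date table with the essentials used later in this post (one row per day of 2017, plus a Week Number column) could be generated along these lines:

let
    // One row per day for the whole of 2017
    Dates = List.Dates(#date(2017, 1, 1), 365, #duration(1, 0, 0, 0)),
    DateTable = Table.FromList(Dates, Splitter.SplitByNothing(), {"Date"}),
    Typed = Table.TransformColumnTypes(DateTable, {{"Date", type date}}),
    // Day.Monday keeps the week numbering broadly aligned with the
    // WEEKNUM(..., 2) convention used in the DAX further down
    WithWeek = Table.AddColumn(Typed, "Week Number", each Date.WeekOfYear([Date], Day.Monday), Int64.Type)
in
    WithWeek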

Why do we need this? Because we need to determine for each date in 2017 what day it falls on. For this reason, we now take off our Power Query hat and jam on our DAX one instead 🙂 Close & Apply your queries in Power BI and then navigate to your new date table. Add a new Column, using the following DAX formula to populate it:

Day of Week (Number) = WEEKDAY('2017'[Date], 2)

The WEEKDAY function is an incredibly handy function in this regard, enabling us to determine the day of the week for any date value. Nice! We can now go back to our “unified” calendar, and perform the following modifications to it:

  • Add on a column that returns a TRUE/FALSE value for each row in our calendar, which tells us if the Start, End or any date between these values falls on a specific day. So, for our IsMondayAllDay field, our DAX formula is below. This will need to be modified for each subsequent column by changing the final LOOKUPVALUE argument – the 1 in '2017'[Day of Week (Number)], 1 – to 2 for Tuesday, 3 for Wednesday and so on:

IsMondayAllDay = IF(AND(CONTAINS(CALENDAR([Start], IF([End] = [Start], [End], [End] - 1)), [Date], DATEVALUE(LOOKUPVALUE('2017'[Date], '2017'[Week Number], FORMAT(WEEKNUM(AllCalendars[Start], 2), "General Number"), '2017'[Day of Week (Number)], 1))), AllCalendars[IsAllDayEvent] = TRUE()), "TRUE", "FALSE")

  • A calculated column that tells us whether the current row is today, by referencing each of the fields created in the previous step. Similar to above, a TRUE/FALSE is returned for this:

IsToday = IF(([IsMondayAllDay] = "TRUE" && WEEKDAY(TODAY(), 2) = 1) || ([IsTuesdayAllDay] = "TRUE" && WEEKDAY(TODAY(), 2) = 2) || ([IsWednesdayAllDay] = "TRUE" && WEEKDAY(TODAY(), 2) = 3) || ([IsThursdayAllDay] = "TRUE" && WEEKDAY(TODAY(), 2) = 4) || ([IsFridayAllDay] = "TRUE" && WEEKDAY(TODAY(), 2) = 5), "TRUE", "FALSE")

We now have everything we need to configure a Measure that can be used on our Dashboard – the Subject of the calendar appointment and a way of indicating that the appointment is today. So our final DAX formula would be as follows for John Smith:

John's Location (Today) = LOOKUPVALUE(AllCalendars[Subject], AllCalendars[IsToday], "TRUE", AllCalendars[Owner], "John Smith")

Now, it is worth noting that the above solution is not fool-proof. For example, if a person has multiple All Day Appointments configured in their Calendar, then it is likely that the Measure used above will fall over. Given that this is unlikely to happen in the above scenario, no proactive steps have been taken to mitigate this, although you can certainly implement a solution to address it (e.g. use the MAX function – sketched after the list below – or only return records which contain “Working”, “Office” or “Home” in the Subject etc.). Nevertheless, I feel the above solution provided an effective “crash course” in a number of fundamental Power BI concepts, including:

  • Power Query data retrieval and manipulation
  • Power BI functions
  • DAX Formulas and the difference between Measures and Calculated Columns
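
As a sketch of the MAX-based mitigation mentioned above, rewriting the Measure as follows would collapse multiple matching rows into a single (alphabetically largest) Subject value rather than erroring, assuming the DAX version in use supports MAX over text:

John's Location (Today) = CALCULATE(MAX(AllCalendars[Subject]), AllCalendars[IsToday] = "TRUE", AllCalendars[Owner] = "John Smith")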

As a colleague recently said to me, “I think we will be using PowerBI a lot more in the future”. This is something that I would certainly agree with based on my experience with it so far 🙂