I’ve been doing some work with Power BI Online datasets this week. It’s the first time I’ve looked at them in any great detail, as I have traditionally preferred to create and deploy any data sources needed for my reports via the Desktop client. Datasets address the needs of users who do not necessarily have the requisite Power Query/DAX knowledge and require a mechanism to quickly hook up to a pre-prepared dataset, thereby allowing them to start consuming data rapidly. They are defined within an individual workspace and, once deployed, can then be shared out to other users to connect to. If organised correctly, datasets can help to ensure that any extensive piece of data modelling work benefits as many users as possible and can reduce the need for people to create data sources themselves. To find out more about how to create a dataset, you can take a look through my recent series all around Microsoft Exam 70-778, where I cover this topic in further detail.

Datasets can remove some headaches for more substantial Power BI deployments, but you should be aware of what they can’t do. The most noteworthy limitations include:

  • You can only connect a Power BI report to a single Power BI dataset at any time. You are unable to bring in additional data sources via Power Query and, likewise, if you have already defined several data sources within your report, you cannot then bring a Power BI dataset into your model.
  • It is not possible to make any modifications to a Power BI dataset from within Power BI Desktop, either via Power Query manipulation or by adding DAX custom columns; you can, however, define new Measures using DAX.

Where this can start to be problematic is when you are attempting to surface data generated by the Application Insights Continuous Export facility via a Stream Analytics job. The great thing about Stream Analytics is that you can define multiple input/output locations for the data it processes, one of which is a Power BI dataset. This is a convenient feature and one that can potentially sidestep the need to export any information out to a SQL-based destination for further processing. However, one area where the feature handicaps itself is that you cannot output multiple tables within the same Power BI dataset. If you attempt to do this, you get the following error message:

When you take into account the limitation mentioned above concerning the strict 1:1 mapping between dataset and report, problems can arise when it comes to defining your queries on the Stream Analytics side of things. You either have to export all of the base information in a raw format or instead develop highly customised queries on the Stream Analytics side. The latter approach may successfully meet a specific business requirement, but risks putting you in a position where your BI solution contains many different reports, some of which may only include a meagre number of visualisations. I’m not sure about you, but my preference leans more towards the first option, as opposed to building out an overly complex BI solution; the limitations that Power BI datasets introduce for this particular scenario do, however, present some challenges in sticking to this mantra.

For example, assume we have the following Stream Analytics query that outputs Request information from Application Insights into a Power BI dataset:

SELECT 
    requestflat.ArrayValue.id AS RequestID,
    requestflat.ArrayValue.name AS RequestType,
    requestflat.ArrayValue.responseCode AS ResponseCode,
    requestflat.ArrayValue.success AS RequestSuccessful,
    requestflat.ArrayValue.url AS URL,
    requestflat.ArrayValue.urlData.base AS BaseURL,
    requestflat.ArrayValue.urlData.host AS URLHost,
    requestflat.ArrayValue.durationMetric.value AS Duration,
    r.internal.data.id AS ID,
    r.internal.data.documentVersion AS DocumentVersion,
    r.context.data.eventTime AS EventTime,
    r.context.data.isSynthetic AS IsSynthetic,
    r.context.data.syntheticSource AS SyntheticSource,
    r.context.data.samplingRate AS SamplingRate,
    r.context.device.type AS DeviceType,
    r.context.device.roleName AS SlotName,
    r.context.device.roleinstance AS RoleInstance,
    r.context.session.isFirst AS IsFirstSession,
    r.context.operation.id AS OperationID,
    r.context.operation.parentID AS OperationParentID,
    r.context.operation.name AS OperationName,
    GetRecordPropertyValue(GetArrayElement(r.[context].[custom].[dimensions], 0), 'Platform') AS Platform,
    GetRecordPropertyValue(GetArrayElement(r.[context].[custom].[dimensions], 1), 'Browser') AS Browser,
    GetRecordPropertyValue(GetArrayElement(r.[context].[custom].[dimensions], 3), 'UserAgent') AS UserAgent,
    GetRecordPropertyValue(GetArrayElement(r.[context].[custom].[dimensions], 4), 'Browser_Version') AS BrowserVersion,
    GetRecordPropertyValue(GetArrayElement(r.[context].[custom].[dimensions], 5), 'Referrer') AS ReferralURL,
    r.EventProcessedUtcTime AS RequestEventProcessedUtcTime,
    r.context.[user].anonId AS AnonymousID,
    r.context.location.continent AS ClientLocation,
    r.context.location.country AS ClientCountry
INTO [PowerBIRequestsOutput]
FROM [Requests] AS r
CROSS APPLY GetElements(r.[request]) AS requestflat

Having the data in this format provides us with the most flexibility when tailoring things on the Power BI side. But what if you wanted to perform some ranking on the data, based on a distinct category value – for example, ranking each of the Browser values in order of popularity, based on the number of distinct visitors to the website? While it is certainly possible to do this via a Stream Analytics query, you would end up having to group the data, thereby reducing the wider usage that the dataset could accommodate. Fortunately, because we can create DAX Measures, it is possible to overcome this, generate a ranking per category and then display the most popular browser as part of a Card visualisation. We first need to create the following Measure within Power BI Desktop:

Browser Ranking =
RANKX(
    ALLSELECTED(PowerBIRequestsOutput[browser]),
    CALCULATE(DISTINCTCOUNT(PowerBIRequestsOutput[anonymousid])),
    , , Dense
)

We can confirm this works as expected by dropping a quick table visualisation in and verifying that each Browser is being ranked correctly, based on the count of the anonymousid field:

So far, so good 🙂 This is potentially a visualisation that has a good use case solely on its own; but, with an additional DAX Measure, we can also surface just the highest-ranked value from the table above, via the following Measure:

Most Used Browser = TOPN(1, FILTER(VALUES(PowerBIRequestsOutput[browser]), [Browser Ranking] = 1))

Now, the thing to mention here is that the above does not explicitly handle ties in any sophisticated way; therefore, there is no guarantee over which value will display in the event of a draw for the top ranking. Nevertheless, it is encouraging to know that DAX provides us with the capabilities we need when we find ourselves unable to carry out the kind of manipulation we would want, either via a SQL query or via Power Query.

Software deployments and updates are always a painful event for me. This feeling hasn’t subsided over time, even though pretty much every development project I am involved with these days utilises Azure DevOps Release Pipelines to automate the whole process for our team. The key thing to stress about automation is that it does not mean all of your software deployments suddenly become successful just because you have removed the human error aspect from the equation. In most cases, all you have done is reduce the number of times that you will have to stick your nose in to figure out what’s gone wrong 🙂

Database upgrades, which are typically done via a Data-tier Application Package (DACPAC) deployment, can be the most nerve-racking of all. Careful consideration needs to be given to the types of settings you define as part of your publish profile XML, as otherwise you may find that either a) specific database changes are blocked entirely, due to dependency issues or because intended data loss will occur, or b) the types of changes you make could result in unintended data loss. This last one is a particularly salient concern and one which is best caught early by implementing staging or pre-production environments for your business systems. Despite the amount of thought that needs factoring in before you can take advantage of DACPAC deployments, they represent the natural option of choice for managing your database upgrades more formally, particularly when there is a need to deploy into Azure SQL databases. This state of play is mostly thanks to the dedicated task that handles this all for us within Azure DevOps:
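
Behind the scenes, this task hands the actual deployment over to SqlPackage.exe, which means the publish profile settings mentioned above can equally be supplied as /p: property overrides. The snippet below is a minimal local sketch only; the SqlPackage.exe path, .dacpac location, server, database and credentials are all placeholder values:

# Minimal sketch of a DACPAC publish via SqlPackage.exe; every path, server,
# database name and credential below is a placeholder. The two /p: properties
# mirror the BlockOnPossibleDataLoss and DropObjectsNotInSource elements that
# would normally sit inside the publish profile XML.
$sqlPackage = 'C:\Program Files\Microsoft SQL Server\140\DAC\bin\SqlPackage.exe'

& $sqlPackage /Action:Publish `
    /SourceFile:'C:\Build\MyDatabase.dacpac' `
    /TargetServerName:'mysqlservername.database.windows.net' `
    /TargetDatabaseName:'mydb' `
    /TargetUser:'sqladmin' `
    /TargetPassword:'MyPlaceholderP@ssw0rd' `
    /p:BlockOnPossibleDataLoss=True `
    /p:DropObjectsNotInSource=False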

What this feature doesn’t make available to us are any appropriate steps to generate a snapshot of the database before the deployment begins, a step which represents both a desirable and, in many cases, necessary business requirement for software deployments. Now, I should point out that Azure SQL includes many built-in options around recovery and point-in-time restore. These options are pretty extensive and enable you, depending on the pricing tier you have opted for, to restore your database to any point in time within a retention period of up to 30 days. The question that therefore arises from this is fairly obvious – why go to the extra bother (and cost) of creating a separate database backup? Consider the following:

  • The recovery time for a point-in-time restore can vary greatly, depending on several factors relating to your database size, current pricing tier and any transactions that may be running on the database itself. In situations where a short release window constrains you and your release must satisfy a strict success/fail condition, having to go through the restore process after a failed database upgrade could leave your application down for longer than is mandated within your organisation (a restore is sketched after this list for reference). Having a previous version of the database already available means you can very quickly update your application connection strings to return the system to operational use if required.
  • Having a replica copy of the database available directly after an upgrade can be incredibly useful if you need to reference data from the old version of the database post-upgrade. For example, a column may have been removed from one table and added to another, with the need to copy across all of this data accordingly. Although a point-in-time restore could be used to expose this information, having a backup of the old version of the database available straight after the upgrade helps to expedite this work.
  • Although Microsoft promise and provide an SLA with point-in-time restore, it is sometimes best to err on the side of caution. 🙂 By taking a snapshot of the database yourself, you have full control over its hosting location and the peace of mind of knowing that the database is instantly accessible in case of an issue further down the line.
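
On the first point, for reference, this is roughly what a point-in-time restore looks like using the AzureRM cmdlets; the resource names, timestamp, edition and service objective below are purely illustrative:

# Illustrative point-in-time restore of an Azure SQL database (AzureRM module);
# the resource names, timestamp, edition and service objective are placeholders.
$database = Get-AzureRmSqlDatabase -ResourceGroupName 'myresourcegroup' `
    -ServerName 'mysqlservername' -DatabaseName 'mydb'

# Restore to a new database alongside the original, as at the specified UTC time
Restore-AzureRmSqlDatabase -FromPointInTimeBackup `
    -PointInTime '2019-02-21T09:00:00Z' `
    -ResourceGroupName $database.ResourceGroupName `
    -ServerName $database.ServerName `
    -TargetDatabaseName 'mydb_Restored' `
    -ResourceId $database.ResourceId `
    -Edition 'Standard' `
    -ServiceObjectiveName 'S2'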

If any of the above conditions apply, then you can generate a copy of your database before any DACPAC deployment takes place via the use of an Azure PowerShell script task. The example script below shows how to achieve this and is designed to mirror a specific business process: a readily accessible database backup is generated before any upgrade takes place, as a copy within the same Azure SQL Server instance with the current date value appended to its name. When a new deployment is triggered in future, the script deletes the previously backed-up database:

#Define parameters for the Azure SQL Server name, resource group and target database

$servername = 'mysqlservername'
$rg = 'myresourcegroup'
$db = 'mydb'

#Get any previously backed up databases and remove these from the SQL Server instance

$sqldbs = Get-AzureRmSqlDatabase -ResourceGroupName $rg -ServerName $servername |
    Where-Object { $_.DatabaseName -like ($db + '_Backup*') }

foreach ($sqldb in $sqldbs)
{
	Remove-AzureRmSqlDatabase -ResourceGroupName $rg -ServerName $servername -DatabaseName $sqldb.DatabaseName
}

#Get the current date as a string in dd_MM_yyyy format (independent of the agent's culture settings)

$date = Get-Date -Format 'dd_MM_yyyy'

#Create the name of the new database

$copydbname = $db + '_Backup_' + $date

#Actually create the copy of the database

New-AzureRmSqlDatabaseCopy -CopyDatabaseName $copydbname -DatabaseName $db -ResourceGroupName $rg -ServerName $servername -CopyServerName $servername

Simply add this on as a pipeline task before any database deployment task, connect up to your Azure subscription and away you go!
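
If you want a quick way of checking what the script has left behind after a run, something along these lines (using the same placeholder names as the script above) will list any backup copies currently sitting on the server:

# List any backup copies created by the script, newest first
Get-AzureRmSqlDatabase -ResourceGroupName 'myresourcegroup' -ServerName 'mysqlservername' |
    Where-Object { $_.DatabaseName -like 'mydb_Backup*' } |
    Sort-Object CreationDate -Descending |
    Select-Object DatabaseName, CreationDate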

Backups are an ever-present aspect of any significant piece of IT delivery work and one which cloud service providers, such as Microsoft, have proactively tried to build into their Platform-as-a-Service (PaaS) product lines. Azure SQL is no different in this regard and you could argue that the point-in-time restore options listed above provide sufficient assurance in the event of a software deployment failure or a disaster-recovery scenario, meaning that no extra steps are necessary to protect yourself. Consider your particular needs carefully when looking to implement a solution like the one described in this post as, although it does afford you the ability to recover quickly from any failed software deployment, it does introduce additional complexity into your deployment procedures, overall technical architecture and – perhaps most importantly – cost.

After a few months of working with Microsoft Azure, you start to become familiar with the concepts behind the Resource Manager model and also how it is possible to move resources around, depending on business requirements. Whereas in the past you would typically need to open a support request with Microsoft to complete this action, the vast majority of these operations can now be carried out by Azure administrators instead. As such, we have the capability to do the following (the first two operations are sketched in PowerShell after the list):

  • Move a resource into a different resource group
  • Move a resource/resource group into a separate subscription, thereby altering its billing arrangements.
  • Migrate an entire subscription to a new Azure Active Directory tenant, which alters both its ownership and billing arrangements in one fell swoop.
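
As a rough illustration of the first two operations, the AzureRM cmdlets below do the job; the resource, resource group and subscription values are all hypothetical, and the destination resource group must already exist:

# Hypothetical resource, resource group and subscription values for illustration
$resource = Get-AzureRmResource -ResourceGroupName 'myoldrg' `
    -ResourceType 'Microsoft.Web/sites' -Name 'mywebapp1'

# Option 1: move the resource into a different resource group in the same subscription
Move-AzureRmResource -DestinationResourceGroupName 'mynewrg' `
    -ResourceId $resource.ResourceId

# Option 2: move it into a resource group that already exists in another subscription
Move-AzureRmResource -DestinationSubscriptionId '00000000-0000-0000-0000-000000000000' `
    -DestinationResourceGroupName 'mynewrg' `
    -ResourceId $resource.ResourceId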

There are a few properties of a resource/resource group that remain unchangeable post-creation. For example, its name and location (UK South, North Europe etc.) are static properties that cannot be altered for the entirety of a resource’s lifespan; the only way to modify them is to delete and re-create the resource from scratch – steps that have severe implications for a production workload. There are also a few scenarios where it will not be possible to move a resource to another subscription. The list of affected resources is dwindling as the months go by, but it still includes some popular services – such as Azure Data Factory, Logic Apps and Service Fabric. Partners who have a desire to move their customers away from Microsoft direct billing to Cloud Solution Provider (CSP) billing need to take particular note of these limitations when scoping any migration exercise, as an entire project could become derailed if an unsupported resource is in the mix.

Sometimes, even if you think you will be OK when migrating a resource, you will occasionally hit an issue during the validation stage of the move. I came across a good example of this when attempting to move an App Service Plan and its corresponding App Service to a new subscription, with the following JSON error message being generated (modified so that it is readable):

{
	"code": "ResourceMoveProviderValidationFailed",
	"message": "Resource move validation failed. Please see details. Diagnostic information: timestamp '20190221T124038Z', subscription id '4e846969-3196-476f-a088-6e393bb5ce98', tracking id '6cf8ddf7-8d15-48bc-8b34-f7729bd44f0a', request correlation id '68c44a05-eeba-4d96-887c-56a228ab69a3'.",
	"details": [
		{
			"target": "Microsoft.Web/serverFarms",
			"message": "{\"Code\":\"BadRequest\",\"Message\":\"Cannot move resources because some site(s) are hosted by other resource group(s) but located in resource group 'myoriginalrg'. The list of sites and corresponding hosting resource groups: 'myappserviceplan:myoriginalrg,mywebapp1:myoriginalrg'. This may be a result of prior move operations. Move the site(s) back to respective hosting resource groups and try again."\ ... }
		}
	]
}

In this particular example, the App Service Plan and App Service had been the subject of a previous move operation from their original resource group (in this example, myoriginalrg) into the current resource group. The original and current resource groups were in the same subscription when this move took place, so the transfer completed without issue. Now, because I was looking to move the resources to a new subscription, the error message above appeared. What’s worse, the myoriginalrg resource group had been deleted a long time ago, meaning it wasn’t immediately clear whether the option suggested in the error message was even possible. Fortunately, it was, and the following workaround steps (sketched in PowerShell after the list) can be used to get your App Service Plan/App Services moved to your desired location:

  1. Create a new resource group in the same subscription where the App Service Plan/App Service exists, with the same name listed in the error message (in this case, myoriginalrg).
  2. Move the resources into the myoriginalrg resource group. Verify that this completes successfully.
  3. Re-attempt the resource move operation to your preferred new subscription location. The procedure should complete successfully.
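
In PowerShell terms, the whole sequence looks something like this; the current resource group, destination details and resource names below are assumed purely for illustration:

# All resource group, subscription and resource names here are illustrative
$originalRg = 'myoriginalrg'       # the resource group named in the error message
$currentRg = 'mycurrentrg'         # where the App Service Plan/App Service live today
$targetSubscriptionId = '00000000-0000-0000-0000-000000000000'
$destinationRg = 'mydestinationrg' # must already exist in the target subscription

# 1. Recreate the resource group referenced in the error message
New-AzureRmResourceGroup -Name $originalRg -Location 'UK South'

# 2. Move the App Service Plan and App Service back into it
$resources = Get-AzureRmResource -ResourceGroupName $currentRg |
    Where-Object { $_.Name -in 'myappserviceplan', 'mywebapp1' }
Move-AzureRmResource -DestinationResourceGroupName $originalRg -ResourceId $resources.ResourceId

# 3. Re-attempt the move into the target subscription
Move-AzureRmResource -DestinationSubscriptionId $targetSubscriptionId `
    -DestinationResourceGroupName $destinationRg -ResourceId $resources.ResourceId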

My main worry when I first saw this error message was that the resources in question were tied permanently to a non-existent resource group and that the listed workaround steps were not going to work. Fortunately, this glass-half-empty outlook proved to be categorically wrong, and the workaround solved the issue entirely. I can’t really understand or explain why an App Service Plan/App Service retains such a binding to a resource group (and, to a lesser extent, a subscription) and thereby causes an issue like this to appear; fortunately, as we have seen, there is a way of getting things working as intended without having to involve Microsoft support 🙂