As discussed recently on the blog, I have been on a journey to try and attain the Microsoft Certified Solutions Associate Certification in BI Reporting. I was very fortunate to overcome the final hurdle of this task by passing Exam 70-778: Analyzing and Visualizing Data with Microsoft Power BI the other day. I enjoyed the opportunity to dive deeper into the world of Business Intelligence, particularly given the enhanced role Power BI has within the Business Applications space today. With this in mind, and in the hopes of encouraging others, today’s post is the first in a new series of revision notes for Exam 70-778. I hope that you find this, and all future posts, useful as either a revision tool or as an introduction into the world of Power BI.

The first skill area of the exam is all around how to Import from data sources, as described on the exam specification:

Connect to and import from databases, files, and folders; connect to Microsoft SQL Azure, Big Data, SQL Server Analysis Services (SSAS), and Power Query; import supported file types; import from other Excel workbooks; link to data from other sources

To begin with, I will provide a detailed overview of the topic areas covered above, before jumping into an example of how to import data into Power BI.

Supported Data Sources

The great benefit of Power BI is its huge list of supported connectors, which are integrated neatly within the application itself. The list of all possible data sources changes on a monthly basis, and it is impossible to go into detail on each one. Suffice it to say, you should at least be familiar with the following data sources:

  • SQL Server (on-premise & Azure)
  • SQL Server Analysis Services
  • A wide range of vendor-specific Relational Database Management Systems (RDBMSs), such as Oracle, MySQL, PostgreSQL and SAP HANA
  • Any data source that supports Open Database Connectivity (ODBC) or Object Linking and Embedding, Database (OLEDB).
  • The following flat file types:
    • Excel (.xlsx)
    • Text (.txt)
    • Comma Separated Value documents (.csv)
    • Extensible Markup Language (.xml)
    • JavaScript Object Notation (.json)
  • Web sources, such as Web pages or OData Feeds

Some RDBMS vendor solutions require additional software to be installed before you can interact with that particular data source. You should check the relevant documentation for each vendor to verify any specific requirements.

Power BI also supports a wide range of Microsoft proprietary and non-proprietary applications, such as Dynamics 365 Customer Engagement, SharePoint, Google Analytics & Salesforce. If you are feeling particularly technical, then you can also use the Blank Query option to, in theory, connect to any data source of your choosing, or even go as far as building custom connectors yourself to interact with a specific application.

Bulk File Loading

As well as supporting connections to single flat files, it is also possible to interact with multiple files existing in the same location. This feature can be useful if, for example, there is a requirement to process hundreds of .csv files with different data but the same overall structure. The supported bulk file locations are:

  • Windows file system folder
  • SharePoint document folder
  • Azure Blob Storage
  • Azure Data Lake Storage

When loading multiple files into Power BI, you not only can read the contents of each file but can also access file-level metadata, as indicated below:

Import vs DirectQuery

An important design decision when working with data sources concerns the data connectivity mode to be used. Your final choice will generally fall into one of two options:

  • Import: When connecting to your data source, Power BI takes a copy of all data and stores it within your report. By implication, this places additional pressure on your local machine's disk space and memory consumption. Import is the default option for most data sources and, to ensure that your data remains consistently up to date when deployed to the Power BI service, you have the opportunity of defining your data refresh frequency – 8 times a day for Power BI Pro and 48 times a day for Power BI Premium subscriptions. Import is the most sensible option to choose when there is no requirement for regular refreshing of your data sources or if performance concerns arise when using DirectQuery.
  • DirectQuery: Instead of taking a snapshot of the data, Power BI will read the data at source and store only the schema of the data within the model. At the time of writing this post, only a select number of mostly SQL based data sources are compatible with this feature. DirectQuery is your best choice when there is a need to keep reports continually up to date, and when your target data source is sufficiently beefed up to handle frequent requests. It’s also worth bearing in mind the following points when evaluating DirectQuery:
    • DirectQuery traditionally supported only a single data source connection for the entire model, with no option of defining additional sources; however, the release of composite models for DirectQuery removes this much-loathed limitation.
    • There are limitations when it comes to data transformation options, especially for non-Microsoft data sources.
    • Some query types will be unsupported.
    • For data modelling using DAX, there are some crucial limitations. For example, Measures that use the SUMX & PATH functions (or their related counterparts) are not allowed.

You should also be aware of a third option – Live Connection – which behaves similarly to DirectQuery but is for SQL Server Analysis Services only. This option has the following limitations:

  • It is not possible to define relationships.
  • Data cannot be transformed from within Power BI.
  • Data modelling options, except for Measure creation, are almost non-existent.

Importing Excel Workbooks

There are some aspects of working with Excel documents in Power BI that are worth further consideration. You mostly have two options at your disposal to consume Excel workbooks:

  1. Import Data: Similar to working with any other flat file source, data within each of the following Excel objects is importable into Power BI:
    • Tables
    • Sheets
    • Ranges
    You can see below how this looks for a file containing four worksheets:
  2. Import workbook contents: If you have built out a complex spreadsheet that utilises the full range of features available in the Excel Data Model, then it is also possible to import these into Power BI “as-is”. The following Excel Data Model features are importable in this manner:
    • Power Query queries
    • Power Pivot Data Models
    • Power View Worksheets
    • (Most) Power View visuals; where a visual is unsupported in Power BI, an error appears on the appropriate visual.

Example: Importing SQL Server Database Data

What follows now is a typical data connection exercise in Power BI Desktop, which involves connecting to a SQL Server database. The experience described here is mostly similar for other data sources and, therefore, represents an optimal example to familiarise yourself with connecting to data sources in the application:

  1. Launch Power BI Desktop and, on the splash screen, select the Get data link on the left-hand pane:
  2. On the Get Data window, choose Database on the left-hand list, select SQL Server database and then press the Connect button:
  3. You will now be prompted to provide the following details:
    • Server: This will be either the Fully Qualified Domain Name (FQDN) of the computer with a default SQL Server instance, or the computer name and named instance name (e.g. MYSQLSERVER\MYSQLSERVERINSTANCE). In the example below, we are connecting to a default SQL Server instance on the computer JG-LAB-SQL.
    • Database: If you already know the name of the database you want to access, you can type this here; otherwise, leave blank. In this example, we are connecting to the WideWorldImporters sample database.
    • Data Connectivity mode: See the section Import vs DirectQuery above for further details. For this example, select the Import setting:
  4. There are also several additional options that are definable in the Advanced options area:
    • Command timeout in minutes: Tells Power BI how long to wait for a query to complete before timing out and raising an error.
    • SQL statement: Specify a native SQL query here that returns the objects/datasets you require. This option can be useful if you wish to reduce the complexity of your model within Power BI or if there is a requirement to return data from a stored procedure (an illustrative example appears at the end of this walkthrough).
    • Include relationship columns: Enabling this setting will return a single column for each defined relationship which, when expanded, gives you the ability to add related column fields onto your table object.
    • Navigate using full hierarchy: Enabling this will allow you to navigate through the database hierarchy using schema object names. In most cases, this should remain disabled, unless there are specific schema names in your database (like Application, Sales, Purchasing etc. in the WideWorldImporters database).
    • Enable SQL Server Failover support: If enabled, then Power BI will take advantage of any failover capability setup on your SQL Server instance, re-routing requests to the appropriate location where necessary.
  5. Illustrated below are some example settings for all of the above. For this walkthrough, leave all of these fields blank and then press OK to continue.
  6. The Navigator window will appear, which will enable you to select the Tables or Views that you wish to work with in the model. Selecting any of the objects listed will load a preview in the right-hand window, allowing you to see a “sneak peek” of the schema and the underlying data. Tick the object Sales.CustomerTransactions and then press the Select Related Tables button; all other Tables that have a relationship with Sales.CustomerTransactions are then automatically selected. Press Load when you are ready to import all selected table objects into Power BI.
  7. After a few moments, the Load window will appear and update accordingly as each table object gets processed by Power BI (exact times may vary, depending on the remote server's/local machine's capabilities). Eventually, when the window closes, you will see on the right-hand pane that all table objects have been loaded into Power BI and are ready to use for building out visualizations:
  8. At this stage, you would then look at loading up your imported objects into Power Query for fine-tuning. But that’s a topic for the next post 🙂
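As promised in step 4, here is an illustrative example of the kind of query you could supply in the SQL statement box. It assumes the WideWorldImporters sample database used in this walkthrough; adjust the column and filter names to suit your data, and note that the stored procedure name in the second snippet is purely hypothetical:

--A plain query against the sample database
SELECT *
FROM Sales.CustomerTransactions
WHERE TransactionDate >= '2016-01-01';

--Alternatively, source the data from a stored procedure (dbo.MyReportingProcedure is a hypothetical name)
EXEC dbo.MyReportingProcedure;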

Key Takeaways

  • Power BI supports a broad range of database systems, flat file, folder, application and custom data sources. While it is impossible to memorise each data source, you should at least broadly familiarise yourself with the different types at your disposal.
  • A crucial decision for many data sources relates to the choice of either Importing a data source in its entirety or in taking advantage of DirectQuery functionality instead (if available). Both routes have their own defined set of benefits and disadvantages. DirectQuery is worth consideration if there is a need to keep data regularly refreshed and you have no requirement to work with multiple data sources as part of your solution.
  • Live Connection is a specific data connectivity option available for SQL Server Analysis Services. It behaves similarly to DirectQuery.
  • It is possible to import an existing Excel BI solution into Power BI with minimal effort, alongside the ability to import standard worksheet data in the same manner as other flat file types.

Look out for my next post in this series, where I will take a look at the range of transformation options available to us in Power BI, and work through some examples applied against the tables listed above.

When it comes to handling large datasets in a formal, structured and highly scalable manner, nothing beats SQL Server. Having worked with the product for almost six years, I always look forward to the opportunity to put together some SQL queries or build out a new database. I think of it as a nice little treat, a reward in the midst of other, sometimes tedious, tasks that I have to deal with on a weekly basis. I rank knowledge of SQL Server pretty highly if you are aiming to become a serious Dynamics CRM/365 for Customer Engagement professional, and I credit my experience with SQL as one of the things that helped to streamline my journey into Dynamics.

It may be, however, that others are not as keen at the prospect of working with SQL databases, particularly when they are unable to accommodate some of the alternative data storage mechanisms that are commonplace with application developers. A good example of this is JSON (JavaScript Object Notation), a format that is used widely today as a portable and “easy to read” mechanism of transferring data. For someone who is more used to working with SQL, getting your head around JSON can be a bit of a challenge at first and – traditionally – was not something that Microsoft factored into the design of SQL Server. A lot has changed with Microsoft – to the extent that services such as Microsoft Flow and Azure now use JSON extensively – and with SQL Server, as a whole host of related functions were added in SQL Server 2016 to provide straightforward conversions into JSON. The FOR JSON PATH clause is the primary gateway into this world and is a function which I have slowly, but surely, started to get my head around. What I wanted to do in this week’s blog post was provide a “crash course” on how to use this nifty piece of functionality, hopefully with the aim of giving you everything you need to start utilising it in your environment(s).

Before we begin…

To best illustrate how the clause works in practice, it is necessary to create an appropriate set of inter-related tables within SQL Server that will be used for all the examples that follow. Here’s one I (rather unimaginatively) made earlier:

The code snippets to create them can be found below:

CREATE TABLE dbo.[ParentTable]
(
	ParentTableID INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
	Field1	BIT		NOT NULL,
	Field2	CHAR(10)	NULL,
	Field3	VARCHAR(MAX) NULL
)

GO

CREATE TABLE dbo.[ChildTable1]
(
	ChildTable1ID INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
	ParentTableID INT FOREIGN KEY REFERENCES dbo.[ParentTable](ParentTableID) NULL,
	Field1	BIT		NOT NULL,
	Field2	CHAR(10)	NULL,
	Field3	VARCHAR(MAX) NULL
)

CREATE TABLE dbo.[ChildTable2]
(
	ChildTable2ID INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
	ParentTableID INT FOREIGN KEY REFERENCES dbo.[ParentTable](ParentTableID) NULL,
	Field1	BIT		NOT NULL,
	Field2	CHAR(10)	NULL,
	Field3	VARCHAR(MAX) NULL
)

GO

CREATE TABLE dbo.[GrandchildTable]
(
	GrandchildTableID INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
	ChildTable2ID INT FOREIGN KEY REFERENCES dbo.[ChildTable2](ChildTable2ID) NULL,
	Field1	BIT		NOT NULL,
	Field2	CHAR(10)	NULL,
	Field3	VARCHAR(MAX) NULL
)

GO

The table structures are incredibly basic but note, in particular, the FOREIGN KEY relationships from the two child tables to the parent, and also the additional parent/child relationship between the GrandchildTable and ChildTable2. You will also need to insert some test data into the tables to follow the rest of this post properly.
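For example, a minimal set of INSERT statements along these lines would do the job. The values are purely illustrative and assume the IDENTITY columns seed at 1, so that the foreign key values line up:

INSERT INTO dbo.ParentTable (Field1, Field2, Field3)
VALUES (1, 'Test', 'This is a test record'),
       (1, 'Test', 'This is also a test record');

INSERT INTO dbo.ChildTable1 (ParentTableID, Field1, Field2, Field3)
VALUES (1, 1, 'Test', 'This is a test record');

INSERT INTO dbo.ChildTable2 (ParentTableID, Field1, Field2, Field3)
VALUES (1, 0, 'Test', 'This is a test record');

INSERT INTO dbo.GrandchildTable (ChildTable2ID, Field1, Field2, Field3)
VALUES (1, 0, 'Test', 'This is a test record');

GO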

With our environment prepped, let’s take a look at the different ways we can convert our dataset into JSON format, with minimal effort involved.

Example 1: FOR JSON AUTO

If we were to look at doing a straightforward SELECT * query on all our tables, our query and expected output might look something like this:

SELECT *
FROM dbo.ParentTable AS PT
 INNER JOIN dbo.ChildTable1 AS CT1
  ON PT.ParentTableID = CT1.ParentTableID
 INNER JOIN dbo.ChildTable2 AS CT2
  ON PT.ParentTableID = CT2.ParentTableID
 INNER JOIN dbo.GrandchildTable AS GT
  ON CT2.ChildTable2ID = GT.ChildTable2ID

The main issue with this query is that, because of how joins work in T-SQL, the two ParentTable records are repeated for every child and grandchild record. For a client application, this can be somewhat cumbersome to handle. FOR JSON AUTO can be straightforwardly added to the end of the above query to convert the query output accordingly:

SELECT *
FROM dbo.ParentTable AS PT
 INNER JOIN dbo.ChildTable1 AS CT1
  ON PT.ParentTableID = CT1.ParentTableID
 INNER JOIN dbo.ChildTable2 AS CT2
  ON PT.ParentTableID = CT2.ParentTableID
 INNER JOIN dbo.GrandchildTable AS GT
  ON CT2.ChildTable2ID = GT.ChildTable2ID
FOR JSON AUTO

//Example output of the first 25 lines below:

[{
    "ParentTableID": 1,
    "Field1": true,
    "Field2": "Test      ",
    "Field3": "This is a test record",
    "CT1": [{
        "ChildTable1ID": 1,
        "ParentTableID": 1,
        "Field1": true,
        "Field2": "Test      ",
        "Field3": "This is a test record",
        "CT2": [{
            "ChildTable2ID": 1,
            "ParentTableID": 1,
            "Field1": false,
            "Field2": "Test      ",
            "Field3": "This is a test record",
            "GT": [{
                "GrandchildTableID": 1,
                "ChildTable2ID": 1,
                "Field1": false,
                "Field2": "Test      ",
                "Field3": "This is a test record"
            }]
       ...

This output provides a much more sensible structure, with no record duplication and proper nesting of child records.

Example 2: FOR JSON PATH, ROOT

With some modifications to the above query, it is also possible to specify a name for the nested JSON object generated for each table, as well as an overall root element name. This can be tailored depending on your specific requirements. For example, let’s say we had to provide the following object names for each of the example tables:

  • dbo.ParentTable -> Parent
  • dbo.ChildTable1 -> FirstChildTable
  • dbo.ChildTable2 -> SecondChildTable

The following query would achieve these requirements, in addition to adding a master root element name of MyTestSQLJSONObject:

SELECT PT.ParentTableID AS [Parent.ParentTableID], PT.Field1 AS [Parent.Field1], PT.Field2 AS [Parent.Field2], PT.Field3 AS [Parent.Field3],
	   CT1.ChildTable1ID AS [FirstChildTable.ChildTable1ID], CT1.Field1 AS [FirstChildTable.Field1], CT1.Field2 AS [FirstChildTable.Field2], CT1.Field3 AS [FirstChildTable.Field3],
	   CT2.ChildTable2ID AS [SecondChildTable.ChildTable2ID], CT2.Field1 AS [SecondChildTable.Field1], CT2.Field2 AS [SecondChildTable.Field2], CT2.Field3 AS [SecondChildTable.Field3],
	   GT.GrandchildTableID AS [GrandchildTable.GrandchildTableID], GT.Field1 AS [GrandchildTable.Field1], GT.Field2 AS [GrandchildTable.Field2], GT.Field3 AS [GrandchildTable.Field3]
FROM dbo.ParentTable AS PT
 INNER JOIN dbo.ChildTable1 AS CT1
  ON PT.ParentTableID = CT1.ParentTableID
 INNER JOIN dbo.ChildTable2 AS CT2
  ON PT.ParentTableID = CT2.ParentTableID
 INNER JOIN dbo.GrandchildTable AS GT
  ON CT2.ChildTable2ID = GT.ChildTable2ID
FOR JSON PATH, ROOT('MyTestSQLJSONObject')

//Example of first 25 lines below:

{
    "MyTestSQLJSONObject": [{
        "Parent": {
            "ParentTableID": 1,
            "Field1": true,
            "Field2": "Test      ",
            "Field3": "This is a test record"
        },
        "FirstChildTable": {
            "ChildTable1ID": 1,
            "Field1": true,
            "Field2": "Test      ",
            "Field3": "This is a test record"
        },
        "SecondChildTable": {
            "ChildTable1ID": 1,
            "Field1": false,
            "Field2": "Test      ",
            "Field3": "This is a test record"
        },
        "GrandchildTable": {
            "GrandchildTableID": 1,
            "Field1": false,
            "Field2": "Test      ",
            "Field3": "This is a test record"
        ...

Example 3: NULL Field Values

One thing worth bearing in mind when working with the FOR JSON clause is how NULL field values behave. Take a look at the following example query output from dbo.ParentTable:

When we attempt to query this single record using the FOR JSON AUTO clause, any NULL field is omitted from the output entirely.
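For reference, the query used here is simply the single-record SELECT from the next snippet, without the INCLUDE_NULL_VALUES option:

SELECT *
FROM dbo.ParentTable AS PT
WHERE PT.ParentTableID = 1
FOR JSON AUTO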

//Example output below. Notice that no field name/value is returned for Field2 now

[{
    "ParentTableID": 1,
    "Field1": true,
    "Field3": "This is a test record"
}]

If you have a requirement to always return a name/value pair for every NULL field, then you can use the INCLUDE_NULL_VALUES option to get around this:

SELECT *
FROM dbo.ParentTable AS PT
WHERE PT.ParentTableID = 1
FOR JSON AUTO, INCLUDE_NULL_VALUES
//Notice now that Field2 returns as expected

[{
    "ParentTableID": 1,
    "Field1": true,
    "Field2": null,
    "Field3": "This is a test record"
}]

Even with this option specified, a value of null in the output may still cause issues for the consuming application. In these scenarios, on the T-SQL side, you would generally use the ISNULL function to replace NULL values with an empty string. Further, because the field in this example is a CHAR(10) data type, there are ten characters of whitespace that need removing from the output string. The following query will fix both of these problems:

SELECT ParentTableID, Field1, LTRIM(ISNULL(Field2, '')) AS Field2, Field3
FROM dbo.ParentTable AS PT
WHERE PT.ParentTableID = 1
FOR JSON AUTO
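Running this against the same record should then return Field2 as an empty string, with the CHAR(10) whitespace padding removed; the output below is illustrative:

//Expected output (illustrative)

[{
    "ParentTableID": 1,
    "Field1": true,
    "Field2": "",
    "Field3": "This is a test record"
}]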

Example 4: Using sub-queries to return child records as JSON objects

In most cases involving parent and child records, returning all the data as JSON may not be required. Instead, it may be necessary to return the fields from the parent record only, and all child records as a single JSON object field on the parent record. Using subqueries, we can accommodate this scenario as follows:

SELECT PT.ParentTableID, PT.Field1, PT.Field2, PT.Field3,
(
	SELECT *
	FROM dbo.ChildTable1 AS CT1
	WHERE CT1.ParentTableID = PT.ParentTableID
	FOR JSON AUTO
) AS ChildTable1,
(
	SELECT *
	FROM dbo.ChildTable2 AS CT2
	 INNER JOIN dbo.GrandchildTable AS GT
	  ON CT2.ChildTable2ID = GT.ChildTable2ID
	WHERE CT2.ParentTableID = PT.ParentTableID
	FOR JSON AUTO
) AS ChildTable2
FROM dbo.ParentTable AS PT

Example 5: Storing FOR JSON Query Output in Parameters

In most scenarios, you will generally provide functions or Stored Procedures for developers to interface with when connecting to the database. It is in this situation where the ability to store the output of any query – including those that use the FOR JSON clause – within a parameter becomes very useful. The following snippet will store the output of a FOR JSON query into a parameter called @JSON, which is then retrievable at any time via a SELECT query:

DECLARE @JSON NVARCHAR(MAX) =
(
	SELECT *
	FROM dbo.ChildTable2 AS CT2
	 INNER JOIN dbo.GrandchildTable AS GT
	  ON CT2.ChildTable2ID = GT.ChildTable2ID
	FOR JSON AUTO
)

SELECT @JSON

Wrapping Up: A Few Things to Bear in Mind

  • The FOR JSON clause is not compatible with Common Table Expressions (CTEs).
  • When storing the output of a FOR JSON query in a parameter, you have no native capability to query the inner contents of the JSON object via a SELECT…WHERE query.
  • Because there is no way of specifying the name of the single column that returns as part of a FOR JSON query, you cannot create a view that uses this clause.

I hope that this post has given you all the information and guidance needed to start working with FOR JSON clauses in your queries. The ability to straightforwardly convert SQL tabular data into a format that most developers would readily scream for is a convenient feature to have at our disposal and is indicative of how far the SQL Server product has moved forward in recent years. Hopefully, this feature will save you some hassle in the future and will mean that you can make friends with your developer colleagues 🙂

This week’s blog post is sponsored by ActiveCrypt Software.

Encryption appears to be a topic of near-constant discussion at the moment, spearheaded primarily by the impending deadline of the General Data Protection Regulation (GDPR). This is, in essence, a new set of data protection rules that will apply to all organisations operating within the European Economic Area (EEA). A key aspect of the new rules concerns implementing appropriate technical controls over sensitive data categories, to mitigate any damage resulting from a data breach. Now, the key thing to highlight around this is the “proportionality” aspect; i.e. any technical controls implemented should be reasonably expected, based on the size of the organisation in question and the nature of their data processing/controlling activity. You should, therefore, be carefully evaluating your organisation to identify whether the lack of encryption could result in damage to a data subject.

I’ve had a look previously at database encryption in the context of Dynamics 365 Customer Engagement. What is nice about the application, and nearly all of Microsoft’s Software as a Service (SaaS) products at the moment, is that GDPR is very much at the centre of each individual offering. I have been genuinely impressed to see the level of effort Microsoft has been devoting to GDPR and in ensuring their SaaS product lines are compliant with the regulations – often without the need for charging customers an arm and a leg in the process. The same can perhaps not be said for any on-premise equivalent of a particular SaaS product. This is, to be fair, expected – Microsoft has been incredibly vocal about adopting a “cloud first” strategy in all things. But for organisations who do find themselves having to support on-premise applications or database systems, the journey towards implementing the required technical solutions for encryption could be rocky.

Case in point – SQL Server has long provided the capability to implement Transparent Data Encryption (TDE), which satisfies the requirement for at-rest encryption without the need to redevelop applications from the ground up. Setting up TDE can be an onerous process (more on this in a second) and requires manual scripting. The following script outlines all the steps involved:

--First, a Master Key should be created on the Server instance

USE master;  
GO  
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'mymasterkey';  
GO

--Next, a Certificate for the Server should be created.

CREATE CERTIFICATE MyCert WITH SUBJECT = 'DEK Certificate for testing purposes';  
GO

--This then allows for a Database Encryption Key to be created for encrypting a database. This needs to be created for
--EVERY database that requires encryption

USE EncryptionTest;  
GO  
CREATE DATABASE ENCRYPTION KEY  
WITH ALGORITHM = AES_256  
ENCRYPTION BY SERVER CERTIFICATE MyCert;  
GO  

--Once created, Encryption can then be enabled/disabled using the snippets below

ALTER DATABASE EncryptionTest
SET ENCRYPTION ON;  
GO

ALTER DATABASE EncryptionTest
SET ENCRYPTION OFF;
GO

--The Server Certificate should be backed up for disaster recovery scenarios or to enable databases to be restored to
--other SQL Server instances. First, backup the certificate with an encrypted private key...

USE master;
GO
BACKUP CERTIFICATE MyCert TO FILE = 'C:\MyCert.cer'  
    WITH PRIVATE KEY ( FILE = 'C:\MyCert.pvk',
    ENCRYPTION BY PASSWORD = 'mypassword');
GO  

--Once saved, execute the following code on the target instance to restore the certificate...

CREATE CERTIFICATE MyCert FROM FILE ='C:\MyCert.cer'
	WITH PRIVATE KEY(FILE='C:\MyCert.pvk', DECRYPTION BY PASSWORD='mypassword');

Whilst TDE is a neat solution, it does have some issues:

  • It’s important to keep any potential disaster recovery scenario in mind when working with TDE, by backing up the server certificate to a separate physical location. The above script provides the necessary snippet to accomplish this, so it is imperative that this is done for every certificate you plan to work with.
  • All required configuration steps have to be accomplished via scripting and the feature is not enabled by default, unlike Azure SQL Databases. Depending on your level of expertise when working with SQL Server, you may have to leverage assistance from other sources to get up and running with the feature.
  • Perhaps the biggest barrier to adopting TDE is the version restrictions. It is only made available as part of the Developer and Enterprise editions of SQL Server. As the name suggests, the Developer edition is licensed strictly for non-Production environments and the Enterprise edition has a staggering cost, licensed based on the number of cores the target server is running. To put this into better context, I was recently quoted a whopping £68,000 through Microsoft Volume Licensing! For most organisations, this can result in an incredibly high cost of ownership just to satisfy a single requirement.

Fortunately, for those who are wanting to implement database encryption via an accessible interface, there are a number of products available on the market to assist. The best one I have come across is DbDefence from ActiveCrypt, which offers a simple to use and efficient means of configuring encryption for your databases. In fact, depending on your database size, you can more than likely have your databases encrypted in less than 5 minutes 🙂 Let’s take a closer look at how straightforward the software is to use by encrypting a database from scratch:

  1. After downloading the installation package, you will need to run it on the server where your SQL Server instance resides. During the installation process, the Full installation option can be selected and you will also need to specify the SQL Server instance that you wish to utilise with the software:

  2. After the installation completes successfully, launch the application and then connect to your target SQL Server instance. Next, select the database that you want to encrypt. You should see a window similar to the below if done correctly:
  3. At this point, you could choose to accept the default Encryption and Protection options and proceed to the next step. However, I would recommend changing the options as follows:
    • Modify the AES Encryption Options value to 256-bit. Whilst the risk of a successful brute-force attack against either 128-bit or 256-bit AES is effectively zero, 256-bit uses a longer key and is, therefore, more secure.
    • In most cases, you just need to ensure data is encrypted at rest and not provide any additional access restrictions beyond this. In these situations, I would recommend setting the required level of protection to Only Encryption. Maximum Transparency. This negates the need for any additional configuration after encryption to ensure your client applications still work successfully.

  4. To encrypt the database, a password/key is required. You should always ensure you utilise a random password that contains upper/lower-case letters, numbers and symbols. I would also recommend having a separate password for each database you encrypt and ensuring that these are all stored separately (as they may be required to decrypt the databases at a later date). The length of the password to use will depend on the AES encryption mode but, if you are using 256-bit, then an 18-character password is recommended.
  5. When you are ready to start the encryption process, press the Encrypt button and confirm the warning box that appears:

Give it a few minutes and you will then be able to see in the main window that your database has been encrypted successfully:

If you ever have the requirement to decrypt the database, then you can return to the application at any time, connect up to the database, enter the password and then press Decrypt:

  6. As a final step, you can then test that your database files have been encrypted successfully by attempting to mount the encrypted database files onto a separate SQL Server instance. You should get an error message similar to the below, indicating that your database has been encrypted successfully:

Conclusions or Wot I Think

The world of encryption can be a veritable nightmare to those approaching it for the first time, and GDPR can be blamed – but also, I would argue, welcomed – in raising the profile of the topic recently. As with a lot of things concerning GDPR, there is a real opportunity for organisations to get a handle on the personal data they work with every day and to implement the required processes and systems to ensure the right thing is being done when handling sensitive data. Database encryption is one weapon in your arsenal when it comes to satisfying a number of areas within GDPR; but, as we have seen, the total cost of ownership and technical expertise required to implement such a solution could – regrettably – force many to simply look the other way when it comes to securing their databases. DbDefence assists greatly in both these regards – by significantly reducing cost and providing a simplified, easy-to-use interface to deploy database encryption within minutes. What’s great as well is that, as part of evaluating the software, I found the support team at ActiveCrypt incredibly responsive and helpful in dealing with the queries I had around the product. If you are looking for a cheaper, yet wholly effective, solution to implement database encryption for SQL Server, then I would not hesitate to recommend the DbDefence product.

Slight change of pace with this week’s blog post, which will be a fairly condensed and self-indulgent affair – due to personal circumstances, I have been waylaid somewhat when it comes to producing content for the blog and I have also been unable to make any further progress with my new YouTube video series. Hoping that normal service will resume shortly, meaning additional videos and more content-rich blog posts, so stay tuned.

I’ve been running the CRM Chap blog for just over 2 years now. Over this time, I have been humbled and proud to have received numerous visitors to the site, some of whom have been kind enough to provide feedback or to share some of their Dynamics CRM/365 predicaments with me. Having reached such a landmark, now seems to be as good a time as any to take a look back on the posts that have received the most attention and, potentially, give those who missed them the opportunity to read them. In descending order, here is the list of the most viewed posts to date on the crmchap.co.uk website:

  1. Utilising SQL Server Stored Procedures with Power BI
  2. Installing Dynamics CRM 2016 SP1 On-Premise
  3. Power BI Deep Dive: Using the Web API to Query Dynamics CRM/365 for Enterprise
  4. Utilising Pre/Post Entity Images in a Dynamics CRM Plugin
  5. Modifying System/Custom Views FetchXML Query in Dynamics CRM
  6. Grant Send on Behalf Permissions for Shared Mailbox (Exchange Online)
  7. Getting Started with Portal Theming (ADXStudio/CRM Portals)
  8. Microsoft Dynamics 365 Data Export Service Review
  9. What’s New in the Dynamics 365 Developer Toolkit
  10. Implementing Tracing in your CRM Plug-ins

I suppose it is a testament to the blog’s stated purpose that posts covering areas not exclusive to Dynamics CRM/365 rank so highly on the list; indeed, it represents how deeply this application is intertwined with other technology areas within the Microsoft “stack”.

To all new and long-standing followers of the blog, thank you for your continued support and appreciation for the content 🙂

If you are looking at automating the execution of SQL Server Integration Services .dtsx packages, then there are a few options at your disposal. The recommended and most streamlined route is to utilise the SSISDB catalog and deploy your packages to the catalog from within Visual Studio. When working with SQL Server 2016 or greater, this also gives you additional flexibility over whether to deploy single packages or multiple packages together. An alternative approach is to deploy your packages to the file system and then configure an Agent Job on SQL Server to execute the package based on a schedule and with runtime settings specified. This is as simple as selecting the appropriate dropdown option on the Agent Job Step configuration screen and setting the Package source value to File system:

Deploying out in this manner is useful if you are restricted from setting up the SSISDB catalog on your target SQL Server instance or if you are attempting to deploy packages to a separate Active Directory domain (I have encountered this problem previously, much to my chagrin). You also have the benefit of utilising other features available via the SQL Server Agent, such as e-mail notifications on fail/success of a job or conditional processing logic for job step(s). The in-built scheduling tool is also pretty powerful, enabling you to fine tune your automated package execution to any itinerary you could possibly conjure up.
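As a rough sketch (rather than a definitive recipe), a file-system package step can also be scripted directly against msdb instead of being configured through the SSMS dialogs. The job name, step name and package path below are hypothetical, and the dtexec-style switches in @command would need adjusting to suit your environment:

USE msdb;
GO

--Create a job and attach it to the local server (the job name is purely illustrative)
EXEC dbo.sp_add_job @job_name = N'Run My SSIS Package';
EXEC dbo.sp_add_jobserver @job_name = N'Run My SSIS Package';

--Add a step that executes a package from the file system via the SSIS subsystem
EXEC dbo.sp_add_jobstep
	@job_name = N'Run My SSIS Package',
	@step_name = N'Execute package from file system',
	@subsystem = N'SSIS',
	@command = N'/FILE "C:\SSISPackages\MyPackage.dtsx" /CHECKPOINTING OFF /REPORTING E';
GO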

I encountered a strange issue recently with a DTSX package configured via the SQL Agent. Quite randomly, the package suddenly started failing each time it was scheduled for execution, with the following error generated in the log files:

Failed to decrypt an encrypted XML node because the password was not specified or not correct. Package load will attempt to continue without the encrypted information.

The issue was a bit of a head-scratcher, with myself and a colleague trying the following steps in an attempt to fix the issue:

  • Forcing the package to execute manually generated the same error – this one was a bit of a longshot but worth trying anyway 🙂
  • When executing the package from within Visual Studio, no error was encountered and the package executed successfully.
  • After replacing the package on the target server with the package just executed on Visual Studio (same version) and manually executing it, once again the same error was thrown.

In the end, the issue was resolved by deleting the Agent Job and creating it again from scratch. If you are diagnosing the same issue and are looking to perform these same steps, it may be best to use the Script Job as… option within SQL Server Management Studio to save yourself a potential headache when re-building your job’s profile:

Then, for good measure, perform a test execution of the Job via the Start Job at Step… option to verify everything works.

I am still stumped at just what exactly went wrong here, but it is good to know that an adapted version of the ancient IT advice of yore can be referred back to…