The biggest headache when managing any database system is enforcing data quality and consistency across the entire dataset. This can range from ensuring that field values are entered correctly through to enforcing rules to prevent duplicate records from even touching the periphery of your database. If you are using an application like CRM Online/Dynamics 365 for Enterprise (D365E), then the built-in functionality within these systems can assist with this, via the use of Option Sets for field values and handy features such as Duplicate Detection Rules. If you have developed your own system, then all due thought should be directed towards ensuring that your system can adequality enforce data quality at the client side, wherever possible. Doing this early on can save you a lot of work down the line.
If the backend database system you are using is SQL Server, then, fortunately, the SQL standard (and, specifically, Transact-SQL or T-SQL, Microsoft’s implementation of the standard) has some tricks up its sleeve that is worth noting. CONSTRAINT’s are database objects that can be setup to enforce…well, constraints on the values that are allowed within a particular field. There are a number of different CONSTRAINT’s that are at our disposal, but the most commonly used ones are:
- PRIMARY KEY CONSTRAINT: Typically used to indicate a column that will always contain a unique value in the database, to facilitate working with a specific row of data in a table. Any field type can be setup as a PRIMARY KEY, but it is generally recommended to use an Integer (INT) or Globally Unique Identifier field (UNIQUEIDENTIFIER), more generally referred to as a GUID.
- FOREIGN KEY CONSTRAINT: FOREIGN KEY’s are used to indicate a field within a database that is a reference to another table within the database. Values entered into this field must match against a record within the related table and you can configure a wide range of options of how “parent” record behaves, based on how the related record is modified or removed in the database. Those coming from a CRM/D365E background can grasp this better by realising that these are essentially lookup fields as part of a one-to-many (1:N) relationship.
- DEFAULT CONSTRAINT: On some occasions, when a value is not entered into a column, you may need to ensure that something is put into the field. This is particularly the case when you are working with NOT NULL fields, which always require a value. A DEFAULT CONSTRAINT gets around this issue by allowing you to specify an initial value for the column, should the database operation against the column results in a value not being specified as part of an INSERT statement.
W3 Schools have a handy list that covers all possible **CONSTRAINT'**s within SQL, but be sure to cross reference this with the relevant Microsoft documentation! As an implementation of the standard, T-SQL can have a few nice - but sometimes surprising - differences that you should be aware of :) The upside of all of this is that, if you have the need to ensure that your database column values are protected against erroneous values, then a CHECK CONSTRAINT is your first port of call. What’s even better is that these are something that can be setup rather straightforwardly to ensure, for example, a field is only allowed to conform to a specific set of values.
A practical illustration is the best way to demonstrate the ease - and potential pitfall - you may hit when working with CHECK CONSTRAINT’s containing a large list of potential values. Let’s say you want to create a table with a field - TestField - that should only ever accept the values A, B or C. Your CREATE TABLE script would look something like this:
This works well within the following situations:
- You are working with a handful of values that need to checked - ideally no more than a dozen.
- You can guarantee that the list of values will not be subject to frequent changes.
If your situation fits within the opposite end of the parameters specified above, you may make the assumption that the best way to build a sustainable solution is via a dedicated lookup table within your database. The idea being with this is the list of values required for the CONSTRAINT can be managed in bulk, updated/removed via common T-SQL statements and also prevents you from managing particularly long-winded table scripts within your database. The following script will create a lookup table that records the fields you require CONSTRAINT’s for (assuming this is the case; this can be removed at your discretion) and the values that need checking:
Those who have good, but not extensive, experience with T-SQL may make the next step assumption that we can then modify our CHECK constraint to directly query the table, similar to the below:
After executing the command, your next reaction may be confusion as an error is thrown back to you:
What this means is that there is no way that we can define a query within our CONSTRAINT to essentially “lookup” the values that are allowed from our table and enforce only the list of approved values. The workaround for this is that we can look at utilising a user-defined function, or UDF. The function will perform the necessary query against our lookup table but achieves the requirements of returning a single (i.e. scalar) value, which we can then filter against in our CONSTRAINT. Functions are a topic that has been covered previously on the blog, and, when used effectively, can help to encapsulate complicated functionality within simple objects that can be referenced via your T-SQL queries. Below is a function that can be setup to achieve the requisite conditions for our scenario:
Let’s break down what this is doing in more detail:
- When called, you must supply two parameters with values - the name of the field that needs to be checked against (@FieldName) and the value to check (@Value). We’ll see how this works in practice in a few moments.
- The function will always return a single value - either TRUE or FALSE - thereby ensuring we have a scalar value to interact with.
- Because our TestField has been defined as NULL, we have to add some additional logic to handle these occurrences. SQL Server will not enforce a CHECK constraint for NULL values when entered/updated into a database, but our function does not extend this far. Therefore, unless a record is inserted into our lkp_Test table for our field with a NULL value, these value types will force the CONSTRAINT to be enforced. By adding in an IF condition to check for NULL’s and return a value of ‘TRUE’ in these instances, NULL values can be entered into the database successfully.
With this function now in place, our CONSTRAINT just needs to be modified to call the function, verifying that TRUE returns; if not, then the field value will not be allowed:
Now we can ensure that we are able to specify a subquery within our CONSTRAINT within a supported manner - excellent! :) CONSTRAINT’s are a really useful tool at the disposal of any database administrator/developer, and something that you should always have at the front of your mind as part of any design. Doing so will ensure that you are building a solution which, as much as realistically conceivable, tries to keep the data within your database as squeaky clean as possible.