Introduction
In Part 1, we provided an introduction to BREX and some terminology. In this part, we cover the basics of writing context rules and provide a primer on XPath, the expression language used to define context rules.
Writing BREX Context Rules
BREX context rules let you define precisely which structures are allowed, and which are not, in your project's data. This is achieved with XPath expressions. XPath expressions can get quite complex, but in most cases you only need a basic subset of XPath.
A key benefit of context rules is that they can be validated in an automated fashion with a BREX validation tool (aka BREX checker) like IrisCheck. Therefore, if you have business rules that apply to the XML structure of your data, it is best to express them as context rules to minimize the need for manual verification of your rules.
The <contextRules> element
In a BREX data module, context rules are specified under the <contextRules> element:
<dmodule>
  <content>
    <brex>
      <contextRules>
        ...
      </contextRules>
    </brex>
  </content>
</dmodule>
The <contextRules> element is repeatable. As the element name implies, you can specify a specific context for a set of rules. For example, if you have a set of rules that are only applicable to procedural data modules and another set of rules only applicable to fault data modules, you can have the following:
<contextRules
    rulesContext="http://www.s1000d.org/S1000D_5-0/xml_schema_flat/proced.xsd">
  ...
</contextRules>
<contextRules
    rulesContext="http://www.s1000d.org/S1000D_5-0/xml_schema_flat/fault.xsd">
  ...
</contextRules>
The rulesContext attribute value must be a schema URI. BREX checkers examine the value to determine when the set of rules should be verified against a data module.
When the rulesContext attribute is not specified, the rules apply to all S1000D CSDB object types:
<contextRules>
  <!-- ... rules for all object types ... -->
</contextRules>
Limitations of the rulesContext attribute
Unfortunately, the rulesContext attribute is very limited and can be impractical to use for the following reasons:
- A data module's schema URI must exactly match the schema URI specified in the rulesContext attribute for the rules to be applied. If you have been a good IETP author and use the standard schema URIs, this is not a problem. However, there are environments where schema URIs refer to local pathnames (e.g. "file://C:/data/...") instead of utilizing XML Catalogs.
- Only a single URI can be specified in the rulesContext attribute. This can be very limiting if you have rules that apply to more than one schema type, or to different issues of a given schema type (e.g. Issue 4.0, 4.1, 5.0, etc.). When using rulesContext, you have to replicate the rules for each context, which leads to sustainment headaches when rule updates are required.
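To make the exact-match limitation concrete, here is a minimal Python sketch of the applicability test as it is described above. It only illustrates the described behavior and is not the implementation of any particular BREX checker; the function name rules_apply and the local file URI are hypothetical.

```python
from typing import Optional

# Standard Issue 5.0 schema URI for procedural data modules (from the example above).
PROCED_5_0 = "http://www.s1000d.org/S1000D_5-0/xml_schema_flat/proced.xsd"

# The same schema referenced through a hypothetical local path.
LOCAL_COPY = "file:///C:/schemas/proced.xsd"


def rules_apply(rules_context: Optional[str], dm_schema_uri: str) -> bool:
    """Return True if a <contextRules> set applies to a data module.

    Assumption: applicability is a plain, exact string comparison.
    """
    if rules_context is None:
        # No rulesContext attribute: the rules apply to every object type.
        return True
    return rules_context == dm_schema_uri


print(rules_apply(PROCED_5_0, PROCED_5_0))  # standard URI on both sides
print(rules_apply(PROCED_5_0, LOCAL_COPY))  # local path breaks the match
print(rules_apply(None, LOCAL_COPY))        # no rulesContext at all
```

The second call is the troublesome case: the data module uses the right schema, but the rules are silently skipped because the two strings differ.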
Alternative to the rulesContext attribute
The limitations of rulesContext are pretty severe from a sustainment perspective. Fortunately, with the use of XPath, we can achieve context-based rules without the need of using the rulesContext attribute. Details about writing context rules will be covered in the next part of the series, but the following should give you a basic idea of how a rule can be constructed so it only applies to a data module of a given type:
<structureObjectRule>
  <objectPath allowedObjectFlag="0">/dmodule/content/illustratedPartsCatalog//partNumber[not(@id)]</objectPath>
  <objectUse>Part numbers in IPDs must have an authored ID.</objectUse>
</structureObjectRule>
The rule in the example is somewhat contrived, but it illustrates how a rule can be contextualized to apply only to IPDs by leveraging the XML schema structure of IPDs. Here is a brief explanation of the expression:
- The start of the expression, "/dmodule/content/illustratedPartsCatalog", restricts any matches to under the <illustratedPartsCatalog> element. Since no other schema type allows such a structure, the rule is effectively limited to IPDs.
- The next part is the expression "//partNumber", which indicates any <partNumber> elements under <illustratedPartsCatalog>, at any depth. The XPath expression "//" indicates descendants of any depth.
- The "[not(@id)]" says to only match <partNumber> elements with no id attribute.
With the designation of allowedObjectFlag="0" for the rule, any match of the above expression is not allowed: if the expression matches at least one node in an IPD data module, that DM will not pass validation.
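If you want to see what the expression selects, the following Python sketch emulates it with the standard library's xml.etree.ElementTree module. A few assumptions apply: ElementTree supports only a subset of XPath and has no not() function, so the missing-id test is done in plain Python; its paths are relative to the element they are called on, so "./content/..." stands in for "/dmodule/content/..."; and the sample document is a made-up fragment, not a schema-valid IPD.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily trimmed IPD fragment for illustration only; it is not
# schema-valid, and the element nesting and part numbers are hypothetical.
SAMPLE_IPD = """
<dmodule>
  <content>
    <illustratedPartsCatalog>
      <catalogSeqNumber>
        <itemSeqNumber>
          <partNumber id="pn-001">ABC-100</partNumber>
          <partNumber>ABC-200</partNumber>
        </itemSeqNumber>
      </catalogSeqNumber>
    </illustratedPartsCatalog>
  </content>
</dmodule>
"""


def part_numbers_without_id(root):
    """Emulate /dmodule/content/illustratedPartsCatalog//partNumber[not(@id)]."""
    if root.tag != "dmodule":
        return []
    # Relative form of the absolute path used in the rule.
    candidates = root.findall("./content/illustratedPartsCatalog//partNumber")
    # ElementTree has no not() function, so the [not(@id)] test is done here.
    return [pn for pn in candidates if "id" not in pn.attrib]


violations = part_numbers_without_id(ET.fromstring(SAMPLE_IPD))
for pn in violations:
    print("Not allowed (no id):", pn.text)
```

A non-empty result corresponds to the situation where the rule, with allowedObjectFlag="0", would cause the data module to fail validation.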
Equivalent rulesContext-based rule
For comparison, the following is an equivalent rule using the rulesContext attribute of <contextRules>:
<contextRules
    rulesContext="http://www.s1000d.org/S1000D_5-0/xml_schema_flat/ipd.xsd">
  <structureObjectRuleGroup>
    <structureObjectRule>
      <objectPath allowedObjectFlag="0">//partNumber[not(@id)]</objectPath>
      <objectUse>Part numbers in IPDs must have an authored ID.</objectUse>
    </structureObjectRule>
  </structureObjectRuleGroup>
</contextRules>
Notice how the XPath expression is briefer since the applicable context does not need to be established within the expression itself. With the rulesContext attribute setting, the rule will only be evaluated on Issue 5.0 IPD data modules. If I want the rule to apply to other Issues, the rule must be copied for each Issue.
You may be satisfied with using rulesContext if you know all your data modules will always be authored to the same Issue, they use S1000D standard schema URIs, and you have little-to-no occurrences of rules that are applicable to multiple schema types. However, it is highly recommended to author rules that are less susceptible to external changes that could make a rule stale (which is easily overlooked). For example, if you change the schema URIs of your modules, the existing rules will no longer be applied by the BREX checker unless you remember to update the rulesContext attribute.
XPath primer
A complete guide to writing XPath expressions is beyond the scope of this article, but you will need a basic understanding of XPath to get started in writing structured rules for your project. For additional guidance on using XPath, various tutorials are available on the internet.
What is XPath?
XPath provides a syntax for identifying parts (formally known as nodes) of an XML document¹. There are different types of nodes; elements and attributes are the node types of most interest when writing BREX rules.
An XML document intrinsically defines a tree structure, similar to how files on a file system are organized. For example, see how the location of the folder "IrisCheck" is identified at the top of Windows File Explorer:
The absolute location is "C:\app\IrisCheck". With XPath, we identify XML nodes in a similar manner, but use forward slashes, "/", instead of backslashes. For example, given the following XML document structure:
<dmodule>
  <content>
    <brex> <!-- We want to identify this element here -->
      ...
    </brex>
  </content>
</dmodule>
We can identify the <brex> element with the following XPath expression:
/dmodule/content/brex
NOTE: You do NOT use “<>”s around element names in XPath expressions. In XPath, the “<” and “>” characters represent the less-than and greater-than operators, respectively.
Unlike a file system, an XML document can have items of the same name at the same level. For example:
<dmodule>
  <content>
    <brex>
      <contextRules>
        <structureObjectRuleGroup>
          <structureObjectRule>...</structureObjectRule> <!-- We want this one -->
          <structureObjectRule>...</structureObjectRule> <!-- Not this one -->
          ...
        </structureObjectRuleGroup>
      </contextRules>
    </brex>
  </content>
</dmodule>
If we use the XPath expression,
/dmodule/content/brex/contextRules/structureObjectRuleGroup/structureObjectRule
we identify all <structureObjectRule> elements that are children of <structureObjectRuleGroup>. If we only want the first <structureObjectRule> element, we do the following:
/dmodule/content/brex/contextRules/structureObjectRuleGroup/structureObjectRule[1]
The “[1]” specifies a positional index relative to the parent element; in this example, it represents the 1st <structureObjectRule> child element of the <structureObjectRuleGroup> element.
NOTE: For those familiar with other programming languages, positional indexing usually starts with “0”. When working with XPath, numbering starts at “1”.
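The difference between the two expressions, and the 1-based numbering, can be tried out with a short Python sketch. It uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath and evaluates paths relative to the element they are called on, so "./content/..." stands in for "/dmodule/content/...". The id values in the sample are hypothetical labels added so the matches can be told apart.

```python
import xml.etree.ElementTree as ET

# Trimmed-down BREX structure; the id values are hypothetical labels.
SAMPLE = """
<dmodule>
  <content>
    <brex>
      <contextRules>
        <structureObjectRuleGroup>
          <structureObjectRule id="SOR-001"/>
          <structureObjectRule id="SOR-002"/>
        </structureObjectRuleGroup>
      </contextRules>
    </brex>
  </content>
</dmodule>
"""

root = ET.fromstring(SAMPLE)  # root is the <dmodule> element

# Relative form of /dmodule/content/brex/contextRules/...
BASE = "./content/brex/contextRules/structureObjectRuleGroup/structureObjectRule"

all_rules = root.findall(BASE)           # every <structureObjectRule>
first_rule = root.findall(BASE + "[1]")  # XPath position 1 only

print([r.get("id") for r in all_rules])
print([r.get("id") for r in first_rule])

# XPath position 1 is the same element as Python list index 0.
print(first_rule[0] is all_rules[0])
```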
With the updated expression, we may still identify more than one <structureObjectRule> element in a document. If given the following XML,
<dmodule>
  <content>
    <brex>
      <contextRules>
        <structureObjectRuleGroup>
          <structureObjectRule>...</structureObjectRule> <!-- MATCH -->
          ...
        </structureObjectRuleGroup>
      </contextRules>
      <contextRules>
        <structureObjectRuleGroup>
          <structureObjectRule>...</structureObjectRule> <!-- MATCH -->
          ...
        </structureObjectRuleGroup>
      </contextRules>
    </brex>
  </content>
</dmodule>
we identify two <structureObjectRule> elements, as both matches are the 1st child of a <structureObjectRuleGroup> at the same level in the XML tree. If we want to only identify the very first <structureObjectRule> element in the document, we can use the following:
/dmodule/content/brex/contextRules[1]/structureObjectRuleGroup[1]/structureObjectRule[1]
We could have included [1] for the <dmodule>, <content>, and <brex> nodes also, but we know the schema does not allow multiple occurrences of those elements in the document tree.
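Here is a small Python sketch comparing the two expressions on a document with two <contextRules> elements. As before, it relies on the limited XPath support of the standard library's xml.etree.ElementTree, with paths written relative to the <dmodule> root element; the id values are hypothetical labels.

```python
import xml.etree.ElementTree as ET

# Two <contextRules> blocks; the id values are hypothetical labels.
SAMPLE = """
<dmodule>
  <content>
    <brex>
      <contextRules>
        <structureObjectRuleGroup>
          <structureObjectRule id="A1"/>
          <structureObjectRule id="A2"/>
        </structureObjectRuleGroup>
      </contextRules>
      <contextRules>
        <structureObjectRuleGroup>
          <structureObjectRule id="B1"/>
          <structureObjectRule id="B2"/>
        </structureObjectRuleGroup>
      </contextRules>
    </brex>
  </content>
</dmodule>
"""

root = ET.fromstring(SAMPLE)  # root is the <dmodule> element

# [1] on the last step only: the first rule of every group.
per_group = root.findall(
    "./content/brex/contextRules/structureObjectRuleGroup/structureObjectRule[1]"
)

# [1] on each repeatable step: the very first rule in the document.
very_first = root.findall(
    "./content/brex/contextRules[1]/structureObjectRuleGroup[1]/structureObjectRule[1]"
)

print([r.get("id") for r in per_group])
print([r.get("id") for r in very_first])
```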
Identifying by ID
If your XML documents contain IDs, using them to identify elements is much easier than using full paths. Take the following for example:
<dmodule>
  <content>
    <brex>
      <contextRules>
        <structureObjectRuleGroup>
          <structureObjectRule id="SOR-001">...</structureObjectRule> <!-- We want this one -->
          <structureObjectRule>...</structureObjectRule>
          ...
        </structureObjectRuleGroup>
      </contextRules>
    </brex>
  </content>
</dmodule>
The element we want to identify can be expressed as follows:
//structureObjectRule[@id="SOR-001"]
The expression contains some components that need further explanation:
- //: This is a shorthand notation indicating any descendant node. Since it is at the start of the expression, it indicates any node within the document.
- [@id="SOR-001"]: The "[]" represents a conditional expression (formally called a predicate) on the node that precedes it. In this case, the node that precedes it is structureObjectRule. In order for a structureObjectRule to match the expression, the expression inside the []'s must evaluate to a true value.
In our example, the conditional expression, @id="SOR-001", is only true if the attribute named "id" has the value "SOR-001". In XPath, to distinguish an element name from an attribute name, attribute names are prefixed with the "@" character, hence the use of "@id". If we left out the "@", the name "id" would have been interpreted as the name of a child element.
NOTE: An attribute in XPath can also be identified as follows: attribute::id. This is identical to “@id”, where “@” is shorthand for “attribute::”.
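The effect of the "@" prefix can be checked with a short Python sketch. It assumes the standard library's xml.etree.ElementTree, which supports only a subset of XPath (the attribute:: long form is not part of that subset), and it uses ".//" because ElementTree paths are relative to the element they are called on. The sample document is a trimmed-down version of the one above.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the sample document above.
SAMPLE = """
<dmodule>
  <content>
    <brex>
      <contextRules>
        <structureObjectRuleGroup>
          <structureObjectRule id="SOR-001"/>
          <structureObjectRule/>
        </structureObjectRuleGroup>
      </contextRules>
    </brex>
  </content>
</dmodule>
"""

root = ET.fromstring(SAMPLE)

# With "@": test the id ATTRIBUTE of each <structureObjectRule>.
by_attribute = root.findall(".//structureObjectRule[@id='SOR-001']")

# Without "@": "id" is read as a CHILD ELEMENT named <id>, which no
# <structureObjectRule> in the sample has.
by_child_element = root.findall(".//structureObjectRule[id='SOR-001']")

print(len(by_attribute), "match(es) with @id")
print(len(by_child_element), "match(es) with id")
```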
Identifying by any attribute
You are not limited to ID attributes for identifying elements in a data module. For example, if I want to identify all elements marked as deleted, I can use the following:
//*[@changeType="delete"]
The special character "*" will match any element, but the attribute test condition limits matching to only elements whose changeType attribute equals "delete".
Testing XPath expressions
Writing XPath expressions can be challenging, so the ability to test your expressions is essential to ensure the accuracy of your context rules. What you think your expression does may not be what it actually does. It is the classic problem all software programmers face: "The computer did what I said, not what I meant" (this may apply to parenting also :-).
To test your XPath expression, you can try one of the free online XPath testers. Unfortunately, if you are working with restricted data and/or need to test many files at once, such services are either not an option or impractical. With a tool like IrisCheck (shameless plug follows), you can test your expressions on your local data from your computer’s desktop by using the integrated XPath Tester feature:
The XPath Tester is integrated into IrisCheck’s validation services, allowing you to launch it directly from BREX validation results and BREX statistics reports.
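If you just need a quick local check without any tooling, a few lines of Python can run one expression over a folder of XML files. The sketch below is a rough stand-in and not how IrisCheck works: it assumes the standard library's xml.etree.ElementTree, which supports only a subset of XPath (no functions such as not(), and paths are relative to the root element), and the function name count_matches, the file names, and the file contents are hypothetical.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def count_matches(xml_dir, expression):
    """Return {file name: number of matches} for every *.xml file in xml_dir."""
    results = {}
    for path in sorted(Path(xml_dir).glob("*.xml")):
        root = ET.parse(str(path)).getroot()
        results[path.name] = len(root.findall(expression))
    return results


# Demo with two throwaway files (hypothetical names and content).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "dm-a.xml").write_text(
        '<dmodule><para changeType="delete"/><para/></dmodule>', encoding="utf-8"
    )
    Path(tmp, "dm-b.xml").write_text(
        "<dmodule><para/></dmodule>", encoding="utf-8"
    )
    # Relative form of //*[@changeType="delete"]; note that ".//*" does not
    # test the root element itself.
    demo_counts = count_matches(tmp, ".//*[@changeType='delete']")

for name, count in demo_counts.items():
    print(f"{name}: {count} match(es)")
```

For expressions outside the ElementTree subset, a full XPath engine is needed; the sketch is only meant for the basic expressions covered in this part.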
More about XPath
Additional and more extensive tutorials on XPath can be found by searching the web.
What’s Next?
In Part 3, we will take an example business rule and translate it into our first BREX context rule.
¹ This is a simplification of what XPath is, but for what is covered here, it is good enough :)