Writing Natural Language Rule Statements — a Systematic Approach Part 13: Complex Data Types
About this series of articles
While my first series of articles on writing natural language rule statements[1] explored a wide variety of issues in a rather organic and hence random manner, this series takes a more holistic and systematic approach and draws on insights gained while writing my recently-published book on the same topic.[2] Rule statements recommended in these articles are intended to comply with the Object Management Group's Semantics of Business Vocabulary and Business Rules (SBVR) version 1.0.[3]
The story so far
In previous articles in this series (see the "Language Archives" sidebar) we have looked at standardised rule statements for various types of data rule,[4] which fall into three broad categories:
- data cardinality rules: rules that require the presence or absence of a data item and/or place a restriction on the maximum or minimum number of occurrences of a data item, e.g., RS16
RS16. | Each Travel Insurance Application must specify exactly one Departure Date. |
- data content rules: rules that place restrictions on the values contained in a data item or set of data items (rather than whether or not they must be present and how many there may or must be), e.g., RS61
RS61. | The Return Date specified in each Travel Insurance Application must be later than the Departure Date specified in that Travel Insurance Application. |
- data update rules: rules that either prohibit update of a data item or place restrictions on the new value of a data item in terms of its existing value, e.g., RS131
RS131. | The Status of a Loan Application may be updated to Approved only if the Status that is currently recorded for that Loan Application is Under Review. |
Each specific type of rule statement has a common formulation, which we have discussed both informally and by way of rule statement patterns.
For the next few articles, we will be looking at rules other than data rules, starting with process rules. But before we look at them, I would like to discuss complex data types, which provide a useful means of keeping rule statement proliferation under control.
Complex data types: an example
Many websites, user interfaces or messaging environments require various addresses to be specified. For example, an online ordering site will require a delivery address and a billing address (although these may be the same). These addresses will generally be subject to the same data rules, e.g.,
RS134. | Each Online Order must specify exactly one Street Number and Name in the Delivery Address. |
RS135. | Each Online Order must specify exactly one Locality Name in the Delivery Address. |
RS136. | Each Online Order must specify exactly one State Code in the Delivery Address. |
RS137. | Each Online Order must specify exactly one Postal Code in the Delivery Address. |
RS138. | The combination of Locality Name, State Code and Postal Code specified in the Delivery Address in each Online Order must be one of the combinations of Locality Name, State Code and Postal Code allocated by Australia Post. |
RS139. | Each Online Order must specify exactly one of the following in the Billing Address: a Street Number and Name or a Post Office Box Number but not both. |
RS140. | Each Online Order must specify exactly one Locality Name in the Billing Address. |
RS141. | Each Online Order must specify exactly one State Code in the Billing Address. |
RS142. | Each Online Order must specify exactly one Postal Code in the Billing Address. |
RS143. | The combination of Locality Name, State Code and Postal Code specified in the Billing Address in each Online Order must be one of the combinations of Locality Name, State Code and Postal Code allocated by Australia Post. |
An alternative method of expressing these rules is by:
- defining a complex data type such as Street Address and specifying just once each rule with which instances of that data type must comply
- defining the individual complex data items (Delivery Address and Billing Address) in terms of the complex data type (Street Address).
The following rule statements illustrate this alternative method:
RS144. | Each Street Address must include exactly one Street Number and Name. |
RS145. | Each Street Address must include exactly one Locality Name. |
RS146. | Each Street Address must include exactly one State Code. |
RS147. | Each Street Address must include exactly one Postal Code. |
RS148. | The combination of Locality Name, State Code and Postal Code included in each Street Address must be one of the combinations of Locality Name, State Code and Postal Code allocated by Australia Post. |
RS149. | The Delivery Address specified in each Online Order must be a valid Street Address. |
RS150. | The Billing Address specified in each Online Order must be a valid Street Address. |
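For readers who think in code, the alternative method resembles defining a type once and reusing it. The following is a minimal Python sketch under stated assumptions: the class names, field names, and the two-entry `ALLOCATED_COMBINATIONS` set (a stand-in for the combinations allocated by Australia Post) are hypothetical and are not part of SBVR or of the rule statements themselves.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical sample standing in for the combinations allocated by Australia Post.
ALLOCATED_COMBINATIONS = {("Sydney", "NSW", "2000"), ("Melbourne", "VIC", "3000")}


@dataclass
class StreetAddress:
    """Complex data type whose rules (RS144 to RS148) are stated once."""
    street_number_and_name: Optional[str] = None
    locality_name: Optional[str] = None
    state_code: Optional[str] = None
    postal_code: Optional[str] = None

    def violations(self) -> List[str]:
        """Return the identifiers of the rule statements that this address breaks."""
        found = []
        if not self.street_number_and_name:
            found.append("RS144")
        if not self.locality_name:
            found.append("RS145")
        if not self.state_code:
            found.append("RS146")
        if not self.postal_code:
            found.append("RS147")
        combination = (self.locality_name, self.state_code, self.postal_code)
        # RS148 is only checked when all three components are present.
        if all(combination) and combination not in ALLOCATED_COMBINATIONS:
            found.append("RS148")
        return found


@dataclass
class OnlineOrder:
    delivery_address: StreetAddress
    billing_address: StreetAddress

    def violations(self) -> List[str]:
        """RS149 and RS150 simply defer to the rules of the complex data type."""
        found = []
        if self.delivery_address.violations():
            found.append("RS149")
        if self.billing_address.violations():
            found.append("RS150")
        return found
```

Adding a further address data item to a transaction then costs one extra check that defers to `StreetAddress`, just as it costs one extra rule statement.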
Note that this method uses only seven rule statements rather than ten; furthermore, each of the rule statements RS144 – RS148 is simpler than the corresponding original rule statements (RS134 – RS143). A recent business rules project involved at least ten transaction types, specifying between them a total of sixteen address data items. The standard address data structure in this project was more complex, requiring nine rule statements for each address. Using the original method, 144 rule statements were required (16 x 9) whereas the alternative method required only 25 rule statements (16 + 9).
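The arithmetic behind these counts can be checked with a short Python sketch. The function name and parameter names are illustrative assumptions; the figures are those quoted above.

```python
from typing import Tuple


def rule_statement_counts(address_items: int, rules_per_address: int) -> Tuple[int, int]:
    """Return (original method count, alternative method count)."""
    # Original method: every address data item repeats every rule.
    repeated = address_items * rules_per_address
    # Alternative method: the rules are stated once for the complex data type,
    # plus one data item format rule per address data item.
    shared = address_items + rules_per_address
    return repeated, shared


print(rule_statement_counts(2, 5))   # Online Order example: (10, 7)
print(rule_statement_counts(16, 9))  # project described above: (144, 25)
```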
The rule statements governing the complex data type (RS144 – RS148) are of the same types as those they replace (RS134 – RS143), in this case 4 mandatory data item rule statements and 1 value set rule statement, differing only in the use of the verb include rather than specify. The rule statements that define complex data items in terms of complex data types (RS149 – RS150) are data item format rules.
Of course, while a delivery address must be a street address, a billing address can be any postal address: a street address or a post office box address. Post office box addresses differ from street addresses in that a post office box number is required rather than a street number and name. There are two ways to define these requirements:
- Define the complex data type Post Office Box Address in addition to Street Address, allow a Delivery Address to be only a Street Address, and allow a Billing Address to be a Street Address or a Post Office Box Address, using the following rule statements in addition to RS144 – RS148:
RS151. | Each Post Office Box Address must include exactly one Post Office Box Type and Name. |
RS152. | Each Post Office Box Address must include exactly one Locality Name. |
RS153. | Each Post Office Box Address must include exactly one State Code. |
RS154. | Each Post Office Box Address must include exactly one Postal Code. |
RS155. | The combination of Locality Name, State Code and Postal Code included in each Post Office Box Address must be one of the combinations of Locality Name, State Code and Postal Code allocated by Australia Post. |
RS156. | The Delivery Address specified in each Online Order must be a valid Street Address. |
RS157. | The Billing Address specified in each Online Order must be a valid Street Address or Post Office Box Address. |
- Define the complex data type Postal Address in addition to Street Address, allow a Delivery Address to be only a Street Address, and allow a Billing Address to be any Postal Address, using the following rule statements in addition to RS144 – RS148:
RS158. | Each Postal Address must include exactly one of the following: a Street Number and Name or a Post Office Box Type and Name. |
RS159. | Each Postal Address must include exactly one Locality Name. |
RS160. | Each Postal Address must include exactly one State Code. |
RS161. | Each Postal Address must include exactly one Postal Code. |
RS162. | The combination of Locality Name, State Code and Postal Code included in each Postal Address must be one of the combinations of Locality Name, State Code and Postal Code allocated by Australia Post. |
RS163. | The Delivery Address specified in each Online Order must be a valid Street Address. |
RS164. | The Billing Address specified in each Online Order must be a valid Postal Address. |
Note that a Postal Address is either a Street Address or a Post Office Box Address.
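The second option can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: the class, field, and function names are hypothetical, and the Australia Post combination check (RS162) is omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PostalAddress:
    """Either a street address or a post office box address (RS158 to RS161)."""
    street_number_and_name: Optional[str] = None
    post_office_box_type_and_name: Optional[str] = None
    locality_name: Optional[str] = None
    state_code: Optional[str] = None
    postal_code: Optional[str] = None

    def is_street_address(self) -> bool:
        return bool(self.street_number_and_name) and not self.post_office_box_type_and_name

    def violations(self) -> List[str]:
        found = []
        # RS158: exactly one of the two alternatives, so "neither" and "both" fail.
        if bool(self.street_number_and_name) == bool(self.post_office_box_type_and_name):
            found.append("RS158")
        if not self.locality_name:
            found.append("RS159")
        if not self.state_code:
            found.append("RS160")
        if not self.postal_code:
            found.append("RS161")
        return found


def order_violations(delivery: PostalAddress, billing: PostalAddress) -> List[str]:
    found = []
    # RS163: the Delivery Address must be a valid Street Address.
    if delivery.violations() or not delivery.is_street_address():
        found.append("RS163")
    # RS164: the Billing Address may be any valid Postal Address.
    if billing.violations():
        found.append("RS164")
    return found
```

A post office box address passes as a Billing Address but is rejected as a Delivery Address, which is the distinction that RS163 and RS164 draw.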
Other complex data types
Most websites, user interfaces or messaging environments require personal names to be specified. For example, when booking travel or taking out travel insurance, the name of each passenger must be stated. Each name must include a Title (e.g., Mr, Ms, Dr), a First Given Name, and a Family Name.[5] This can be achieved using either:
- RS165 – RS167 (which govern the individual components of a personal name to be supplied in a transaction) or
RS165. | Each Travel Insurance Application must specify exactly one Title for each Passenger. |
RS166. | Each Travel Insurance Application must specify exactly one Given Name for each Passenger. |
RS167. | Each Travel Insurance Application must specify exactly one Family Name for each Passenger. |
- RS168 – RS171 (which define Personal Name as a complex data type)
RS168. | Each Travel Insurance Application must specify exactly one Personal Name for each Passenger. |
RS169. | Each Personal Name must include exactly one Title. |
RS170. | Each Personal Name must include exactly one Given Name. |
RS171. | Each Personal Name must include exactly one Family Name. |
While the second approach uses four rule statements instead of three, it pays off if there are many different transactions in which personal names are required: each additional transaction requires one additional rule statement rather than three, so n transactions need n + 3 rule statements rather than 3n, a saving from the second transaction onward.
Other complex data types are:
- An Effective Time Period or Lifespan is made up of a start date (or effective date) and an end date (or expiry date).
- A Share Fraction, such as 1/3 or 2/7, is made up of a numerator and a denominator.[6]
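The reason for holding a Share Fraction as a numerator and denominator, rather than as a percentage, can be illustrated with Python's standard `fractions` and `decimal` modules. This is a minimal sketch; the function name `has_exact_decimal_form` is a hypothetical helper introduced only for illustration.

```python
from decimal import Decimal
from fractions import Fraction


def has_exact_decimal_form(share: Fraction) -> bool:
    """True if the share can be written exactly as a decimal fraction or percentage."""
    denominator = share.denominator
    # Strip every factor of 2 and 5; anything left over means a recurring decimal.
    for prime in (2, 5):
        while denominator % prime == 0:
            denominator //= prime
    return denominator == 1


print(has_exact_decimal_form(Fraction(1, 4)))  # True
print(has_exact_decimal_form(Fraction(1, 3)))  # False
print(has_exact_decimal_form(Fraction(2, 7)))  # False

# Three equal shares held as fractions add up to the whole asset...
print(sum([Fraction(1, 3)] * 3) == 1)  # True
# ...whereas the same shares rounded to two-decimal percentages do not reach 100.
print(sum([Decimal("33.33")] * 3))  # 99.99
```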
Impact on rule statement patterns
The rule statement patterns defined in earlier articles in this series require only minor adjustment:
- the addition of include as an alternative to specify
- the recognition that data rule statements can govern complex data types as well as transactions.
The revised patterns are:
P7. | {<mandatory data item rule statement> | <multiple data rule statement>} ::= Each {<transaction term> | <complex data type term>} {<qualifying clause> |} must {specify | include} {exactly | at least} <cardinal number> <data item term> {for {the | each} <complex data item term> {(if any) |} |} {<qualifying clause> |} {{if | unless} <conditional clause> |}. |
P8. | <mandatory option selection rule statement> ::= Each {<transaction term> | <complex data type term>} {<qualifying clause> |} must {({if | unless} <conditional clause>) |} {specify | include} {for {the | each} <complex data item term> {(if any) |} whether that <complex data item term> | whether {it | <determiner> <data item term>}} <predicate list>. |
P9. | <mandatory group rule statement> ::= Each {<transaction term> | <complex data type term>} {<qualifying clause> |} must {({if | unless} <conditional clause>) |} {specify | include} {for {the | each} <complex data item term> {(if any) |} |} {{a | an} <data item term> or {a | an} <data item term> but not both | {a | an} <data item term>, {a | an} <data item term> or both | {at least | exactly} one of the following: <item alternative list>}. |
P10. | <prohibited data rule statement> ::= {A | An} {<transaction term> | <complex data type term>} {<qualifying clause> |} must not {specify | include} a <data item term> {for any <complex data item term> {(if any) |} |} {<qualifying clause> |} {{if | unless} <conditional clause> |}. |
P11. | <maximum cardinality rule statement> ::= {A | An} {<transaction term> | <complex data type term>} {<qualifying clause> |} must not {specify | include} more than <cardinal number> <data item term> {for any one <complex data item term> {(if any) |} |} {<qualifying clause> |} {{if | unless} <conditional clause> |}. |
P12. | <dependent cardinality rule statement> ::= The number of <data item term> {specified | included} {for {the | each} <complex data item term> {(if any) |} |} in each {<transaction term> | <complex data type term>} must be {equal to | less than | more than} <data item term> <qualifying clause> {{if | unless} <conditional clause> |}. |
P13. | <range rule statement> ::= {Each | The} <data item term> {(if any) |} {specified | included} {for {the | each} <complex data item term> {(if any) |} |} {in | on} each {<transaction term> | <complex data type term>} {<qualifying clause> |} must <range predicate> {<temporal qualifying clause> |} {{if | unless} <conditional clause> |}. |
P17. | <value set rule statement> ::= {Each | The} {<data item term> {(if any) |} | combination of <item combination list>} {specified | included} {for {the | each} <complex data item term> {(if any) |} |} {in | on} each {<transaction term> | <complex data type term>} {<qualifying clause> |} must <value set predicate> {<temporal qualifying clause> |} {{if | unless} <conditional clause> |}. |
P19. | <(in)equality rule statement> ::= {Each | The} <data item term> {(if any) |} {specified | included} {for {the | each} <complex data item term> {(if any) |} |} {in | on} each {<transaction term> | <complex data type term>} {<qualifying clause> |} must <(in)equality predicate> {<temporal qualifying clause> |} {{if | unless} <conditional clause> |}. |
P22. | <uniqueness constraint statement> ::= {Each | The} {<data item term> {(if any) |} | combination of <item combination list>} {specified | included} {for {the | each} <complex data item term> {(if any) |} |} {in | on} each {<transaction term> | <complex data type term>} {<qualifying clause> |} must <uniqueness constraint predicate> {{if | unless} <conditional clause> |}. |
P24. | <data combination consistency rule statement> ::= The combination of <item combination list> {specified | included} {for {the | each} <complex data item term> {(if any) |} |} {in | on} each {<transaction term> | <complex data type term>} {<qualifying clause> |} must be such that <conditional clause> {{if | unless} <conditional clause> |}. |
P25. | <set function rule statement> ::= The <set function> of <data item term> {specified | included} {for {the | each} <complex data item term> {(if any) |} |} {in | on} each {<transaction term> | <complex data type term>} {<qualifying clause> |} must {<range predicate> | <(in)equality predicate>} {{if | unless} <conditional clause> |}. |
P26. | <set consistency rule statement> ::= {Each | The} set of <data item term> {specified | included} {for {the | each} <complex data item term> {(if any) |} |} {in | on} each {<transaction term> | <complex data type term>} {<qualifying clause> |} must {be the same as | be different to | include} the set of <term> {<qualifying clause> |} {{if | unless} <conditional clause> |}. |
P27. | <set equality rule statement> ::= {Each | The} <data item term> {specified | included} {for {the | each} <complex data item term> {(if any) |} |} {in | on} each {<transaction term> | <complex data type term>} {<qualifying clause> |} must be the same as {each other | the} <data item term> {specified | included} {for {the | each other} <complex data item term> |} {in | on} that {<transaction term> | <complex data type term>} {{if | unless} <conditional clause> |}. |
P32. | <data item format rule statement> ::= {Each | The} {<data item term> {(if any) |} | combination of <item combination list>} {specified | included} {for {the | each} <complex data item term> {(if any) |} |} {in | on} each {<transaction term> | <complex data type term>} {<qualifying clause> |} must <data item format predicate> {{if | unless} <conditional clause> |}. |
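To show how the two adjustments play out, the following Python sketch renders rule statements from a deliberately reduced form of pattern P7. It is an illustration only: the function name and parameters are hypothetical, and the optional qualifying clauses, conditional clauses, "(if any)", and the "the" alternative to "each" are all left out.

```python
from typing import Optional


def mandatory_data_item_statement(
    subject_term: str,
    data_item_term: str,
    verb: str = "specify",
    quantifier: str = "exactly",
    cardinal_number: str = "one",
    complex_data_item_term: Optional[str] = None,
) -> str:
    """Render a rule statement from a simplified subset of pattern P7."""
    if verb not in ("specify", "include"):
        raise ValueError("verb must be 'specify' or 'include'")
    if quantifier not in ("exactly", "at least"):
        raise ValueError("quantifier must be 'exactly' or 'at least'")
    # The subject may be a transaction term or a complex data type term.
    text = f"Each {subject_term} must {verb} {quantifier} {cardinal_number} {data_item_term}"
    if complex_data_item_term:
        text += f" for each {complex_data_item_term}"
    return text + "."


# A complex data type as subject, using "include" (compare RS147).
print(mandatory_data_item_statement("Street Address", "Postal Code", verb="include"))
# A transaction as subject, using "specify" (compare RS168).
print(mandatory_data_item_statement(
    "Travel Insurance Application", "Personal Name", complex_data_item_term="Passenger"
))
```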
To be continued...
The next article in this series will take a look at process rules.
References
[1] The first of which is: Graham Witt, "A Practical Method of Developing Natural Language Rule Statements (Part 1)," Business Rules Journal, Vol. 10, No. 2 (Feb. 2009), URL: http://www.BRCommunity.com/a2009/b461.html
[2] Graham Witt, Writing Effective Business Rules. Morgan Kaufmann (2012).
[3] Semantics of Business Vocabulary and Business Rules (SBVR), v1.0. Object Management Group (Jan. 2008). Available at http://www.omg.org/spec/SBVR/1.0/
The font and colour conventions used in these rule statements reflect those in the SBVR, namely underlined teal for terms, italic blue for verb phrases, orange for keywords, and double-underlined green for names and other literals. Note that, for clarity, these conventions are not used for rule statements that exhibit one or more non-recommended characteristics.
[4] A rule that constrains the data included in a transaction (a form or message) or a persistent data set (e.g., a database record).
[5] Many websites ask for First Name and Last Name rather than First Given Name and Family Name. If the only purpose is to reproduce the person's name in full (e.g., "Mr Graham Witt") this is sufficient, but "last name" is often a surrogate for family name. For example, correspondence may be addressed "Dear Mr Witt". This is OK if the customer's culture is one in which the family name comes last, as in the English-speaking world, but not if that culture is one in which the family name comes first, as in many Asian and Eastern European cultures. Of course, a customer from such a culture may realise the business's intent and reverse their family name and first given name, but there is no guarantee that every such customer will do so, and data quality is compromised. Inclusion of a Patronymic data item provides for more inclusive capture of personal names, as it covers those cultures, such as Icelandic, in which patronymics (e.g., Jónsson or Jónsdóttir) are used instead of family names.
[6] Where an asset has shared ownership, vulgar fractions must be used rather than percentages since the shares resulting from dividing something equally between 3 or 7 parties (or indeed any number of parties that is not a power of 2 multiplied by a power of 5) cannot be expressed exactly as a percentage or decimal fraction.