Writing Natural Language Rule Statements — a Systematic Approach: Part 5 — Value Set Rules
About this series of articles
While my first series of articles on writing natural language rule statements[1] explored a wide variety of issues in a rather organic and hence random manner, this series takes a more holistic and systematic approach and draws on insights gained while writing my recently published book on the same topic.[2] Rule statements recommended in these articles are intended to comply with the Object Management Group's Semantics of Business Vocabulary and Business Rules (SBVR) version 1.0.[3]
The story so far
In the previous article[4] we looked at a variety of data cardinality rule statements[5] and one type of data content rule statement,[6] namely a range rule statement.[7] In this article we will look at another type of data content rule statement, the value set rule.
A note on ambiguity
One of the range rule statements discussed in the previous article was RS63 (reproduced below).
RS63. | The Birth Date specified for each Passenger in each Travel Insurance Application must be later than 80 years before the Return Date specified in that Travel Insurance Application. |
In this rule statement the lower bound on passengers’ birth dates ("80 years before the Return Date specified in that Travel Insurance Application") involves not just a data item ("the Return Date specified in that Travel Insurance Application") but an offset ("80 years before").
Range rules of this type can also govern quantities other than dates. Digressing for a moment into the field of aircraft operation, one can envisage a rule like the one stated in RS64 (the real-world rule is rather more complicated, taking into account weather conditions, etc.).
RS64. | The Landing Distance Required of an Aircraft must be at least 60% less than the Length of each Runway at which that Aircraft lands. |
However, this rule statement is ambiguous. What if the Landing Distance Required of some aircraft is 59% less than the length of a runway at which that aircraft is to land? Since that distance is greater than "60% less than the Length of [that] Runway" it appears to comply with the "at least" constraint. However, the intention of the rule is to ensure that the Landing Distance Required is not greater than "60% less than the Length of each Runway". Even rewording the rule statement as in RS65 doesn’t remove the ambiguity.
RS65. | The Landing Distance Required of an Aircraft must be no more than 60% less than the Length of each Runway at which that Aircraft lands. |
The only way to avoid this ambiguity is to replace the negative ("less than") offset with a positive ("greater than") offset, as in RS66:
RS66. | The Length of each Runway at which an Aircraft lands must be at least 60% greater than the Landing Distance Required of that Aircraft. |
Note that this ambiguity does not arise in RS63, even though it has a negative ("before") offset, because "later than" is unambiguous. The ambiguity caused by non-temporal negative offsets (such as "less than") is that it is not clear whether the inequality operator ("be at least" in RS64, "be no more than" in RS65) applies to the offset alone ("60%") or to the entire comparand ("60% less than the Length of each Runway at which that Aircraft lands"). By contrast, a temporal negative offset is unambiguous: it is clear that the inequality operator ("be later than") in RS63 applies to the entire comparand ("80 years before the Return Date specified in that Travel Insurance Application").
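The two readings of RS64 can be made concrete with a small sketch. The function names, the 2,000 m runway, and the 820 m landing distance (59% less than the runway length) are illustrative assumptions, not figures from any real operating rule.

```python
def rs64_operator_on_whole_comparand(ldr_m: float, runway_m: float, pct: int = 60) -> bool:
    """Reading A: 'at least' applies to the entire comparand,
    so the landing distance must be >= (runway length less 60%)."""
    return ldr_m >= runway_m * (100 - pct) / 100


def rs64_operator_on_offset_only(ldr_m: float, runway_m: float, pct: int = 60) -> bool:
    """Reading B (the intended one): 'at least' applies to the offset,
    so the landing distance must be 60% or more below the runway length."""
    return ldr_m <= runway_m * (100 - pct) / 100


def rs66(ldr_m: float, runway_m: float, pct: int = 60) -> bool:
    """RS66: the runway must be at least 60% longer than the landing distance."""
    return runway_m >= ldr_m * (100 + pct) / 100


runway_m = 2000   # assumed runway length in metres
ldr_m = 820       # 59% less than 2000 m

print(rs64_operator_on_whole_comparand(ldr_m, runway_m))  # True: complies under reading A
print(rs64_operator_on_offset_only(ldr_m, runway_m))      # False: violates under reading B
print(rs66(ldr_m, runway_m))                              # True: only one reading is possible
```

The two readings of RS64 give opposite answers for every landing distance except exactly 800 m, which is what makes the statement ambiguous. RS66 admits only one reading. Note, though, that as worded it requires the runway to be at least 1.6 times the landing distance, whereas reading B of RS64 requires at least 2.5 times, so the two statements do not set the same numerical threshold.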
Value set rules
A value set rule requires that the content of a data item be (or not be) one of a particular set of values (either a fixed set or a set that may change over time), or that the content of a combination of data items match or not match a corresponding combination in a set of records.
What data items in a travel insurance application might be constrained in this way?
- Any names of regions or countries listed as destinations must be limited to recognised region or country names rather than being unconstrained "free text" (which would prevent automatic price quotation), as specified in rule statements RS67 and RS68 later in this article.
- Passengers’ salutations or titles (e.g., Mr, Ms, Dr) might be limited to a set of acceptable salutations.
- Names of passengers’ pre-existing medical conditions might be limited to a set of names of high-risk conditions.
- The locality name, state, and postal code in the postal address should be limited to a set of valid locality name/state/postal code combinations provided by the postal authority of the country in which the insurance company operates.
- The credit card issuer might be limited to a set of recognised issuers (e.g., Visa, Mastercard).
Some sets of valid values (such as the set of regions in RS67) are not only relatively small but also relatively non-volatile. However, if the insurance company required that the countries to be visited (rather than the regions) be listed, a rule statement listing all of the 240-plus countries identified in international standard ISO3166-1 would be an inappropriate duplication of information available elsewhere. By contrast, RS68 refers to that standard, so it is not only easier to read but also easier to write. It is also future-proof: should a country name change (e.g., Burma became Myanmar some time ago) or a new country emerge (such as South Sudan), the rule statement does not have to be updated. Of course, there is no reason why a rule governing regions cannot be stated in the same way (as in RS69). If the rule expressed by RS69 is to be implemented effectively, there must be a single master list of recognised regions recorded in a data resource available to each system or person that applies the rule.
RS67. | Each Region (if any) specified in each Travel Insurance Application must be Africa, Asia, Europe, North America, Oceania or South America. |
RS68. | Each Country (if any) specified in each Travel Insurance Application must be one of the Countries listed in ISO3166-1. |
RS69. | Each Region (if any) specified in each Travel Insurance Application must be one of the Regions recognised by Australian Travel Insurance P/L. |
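The difference between quoting the valid values (RS67) and referring to a source of valid values (RS68, RS69) can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the loader is a three-name stand-in rather than the ISO3166-1 list, and exact, case-sensitive matching is an assumption.

```python
from typing import Callable, Iterable, List

# Quoted list, as in RS67: the valid values are written into the rule itself.
RS67_REGIONS = frozenset(
    {"Africa", "Asia", "Europe", "North America", "Oceania", "South America"}
)


def regions_violating_rs67(regions: Iterable[str]) -> List[str]:
    """Return the regions that are not among the values quoted in RS67."""
    return [r for r in regions if r not in RS67_REGIONS]


def values_not_in_source(
    values: Iterable[str],
    load_valid_values: Callable[[], Iterable[str]],
) -> List[str]:
    """Referenced source, as in RS68 and RS69: the valid values are read from
    a master list each time the rule is applied, so the rule never changes."""
    valid = set(load_valid_values())
    return [v for v in values if v not in valid]


def load_country_names_stub() -> List[str]:
    """Hypothetical stand-in for a lookup of the current country list."""
    return ["Australia", "Myanmar", "South Sudan"]


print(regions_violating_rs67(["Asia", "Antarctica"]))                      # ['Antarctica']
print(values_not_in_source(["Myanmar", "Burma"], load_country_names_stub))  # ['Burma']
```

When a name changes in the master list, only the data behind the loader changes; the checking code, like rule statement RS68, is left untouched.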
Rule statements similar to RS69 can be written for:
- passengers’ salutations (RS70),
- names of passengers’ pre-existing medical conditions (RS71),
- credit card issuers (RS72).
RS70. | The Salutation specified for each Passenger in each Travel Insurance Application must be one of the Salutations recognised by Australian Travel Insurance P/L. |
RS71. | Each Medical Condition (if any) specified for each Passenger in each Travel Insurance Application must be one of the Medical Conditions recognised by Australian Travel Insurance P/L. |
RS72. | The Credit Card Issuer specified in each Travel Insurance Application must be one of the Credit Card Issuers recognised by Australian Travel Insurance P/L. |
Alternatively, rule statements similar to RS68 can be used if the set of allowed values is defined in a standard:
RS73. | The Salutation specified for each Passenger in each Travel Insurance Application must be one of the Name Titles listed in AS4590-2006. |
However, many companies limit salutations to relatively small sets: RS74 expresses the rule applied by at least one insurance company.
RS74. | The Salutation specified for each Passenger in each Travel Insurance Application must be Mr, Mrs, Miss, Ms, Sir or Dr. |
Value set rules governing combinations of data items
While the locality name, state, and postal code in a postal address must each individually be a member of the appropriate set, the combination of locality name, state, and postal code in each postal address must be a valid combination. For example, the postal code for Sydney, NSW (New South Wales) is 2000 whereas the code for Melbourne, Victoria is 3000. "Sydney NSW 2000" and "Melbourne Victoria 3000" are therefore valid combinations. However, "Sydney NSW 3000" and "Sydney Victoria 2000" are not (even though each data item in each of those combinations is a member of the appropriate set).
RS75. | The combination of Locality Name, State Code and Postal Code specified in the Postal Address (if any) in each Travel Insurance Application must be one of the combinations of Locality Name, State Code and Postal Code allocated by Australia Post. |
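A rule such as RS75 tests the combination as a whole rather than each data item separately. The sketch below uses only the two combinations mentioned above as a stand-in for the postal authority's list; the names `VALID_COMBINATIONS`, `each_item_is_valid`, and `combination_is_valid` are hypothetical.

```python
from typing import Set, Tuple

Combination = Tuple[str, str, str]  # (locality name, state, postal code)

# Stand-in for the list of combinations allocated by the postal authority.
VALID_COMBINATIONS: Set[Combination] = {
    ("Sydney", "NSW", "2000"),
    ("Melbourne", "Victoria", "3000"),
}


def each_item_is_valid(combo: Combination, valid: Set[Combination]) -> bool:
    """True if every item, taken on its own, appears somewhere in the list."""
    localities = {c[0] for c in valid}
    states = {c[1] for c in valid}
    postal_codes = {c[2] for c in valid}
    return combo[0] in localities and combo[1] in states and combo[2] in postal_codes


def combination_is_valid(combo: Combination, valid: Set[Combination]) -> bool:
    """True only if the three items occur together as one listed combination."""
    return combo in valid


mixed = ("Sydney", "NSW", "3000")
print(each_item_is_valid(mixed, VALID_COMBINATIONS))    # True: each item is in its set
print(combination_is_valid(mixed, VALID_COMBINATIONS))  # False: the combination is not listed
```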
Common formulation
Like other types of rule statement, value set rule statements have a common formulation:
- the subject, identifying the governed data item, consisting of:
  - a determiner: "Each" if the data item can be repeated, otherwise "The"
  - either the name of the data item (e.g., "Region") or "combination of", followed by a list of the data items making up the combination
  - "(if any)" if the data item is optional[8]
  - "specified"
  - if the governed data item is part of a complex data item, a phrase having the following form:
    - "for the" (if there can only be one of the complex data item) or "for each" (if there can be more than one of the complex data item)
    - the name of the complex data item
    - "(if any)" if the complex data item is optional
  - a qualifying clause identifying the type of transaction, in this case "in each Travel Insurance Application"
- "must", followed by a value set predicate.
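The formulation of the subject can be sketched as a small function that assembles the parts in the order listed. The function and parameter names are hypothetical, and the sketch covers only the parts described in the list above.

```python
from typing import Optional, Sequence


def join_names(names: Sequence[str]) -> str:
    """Join names as 'A', 'A and B', or 'A, B and C'."""
    if len(names) == 1:
        return names[0]
    return ", ".join(names[:-1]) + " and " + names[-1]


def build_subject(
    data_items: Sequence[str],
    repeatable: bool = False,
    optional: bool = False,
    complex_item: Optional[str] = None,
    complex_repeatable: bool = False,
    complex_optional: bool = False,
    transaction: str = "Travel Insurance Application",
) -> str:
    """Assemble the subject of a value set rule statement."""
    parts = ["Each" if repeatable else "The"]          # determiner
    if len(data_items) > 1:
        parts.append("combination of " + join_names(data_items))
    else:
        parts.append(data_items[0])
    if optional:
        parts.append("(if any)")
    parts.append("specified")
    if complex_item is not None:                       # e.g. Passenger
        parts.append("for each" if complex_repeatable else "for the")
        parts.append(complex_item)
        if complex_optional:
            parts.append("(if any)")
    parts.append("in each " + transaction)             # qualifying clause
    return " ".join(parts)


# Subject of RS67
print(build_subject(["Region"], repeatable=True, optional=True))
# Subject of RS71
print(build_subject(["Medical Condition"], repeatable=True, optional=True,
                    complex_item="Passenger", complex_repeatable=True))
```

The two calls reproduce the subjects of RS67 and RS71 respectively. RS75 words its complex-item phrase as "in the Postal Address (if any)" rather than "for the", so the sketch would not reproduce that statement verbatim.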
Value set predicates
There are two varieties of value set predicate:
- those quoting a list of valid values, such as "be Africa, Asia, Europe, North America, Oceania or South America" in RS67
- those referring to a source of valid values, such as:
  - "be one of the Countries listed in ISO3166-1" in RS68 and
  - "be one of the combinations of Locality Name, State Code and Postal Code allocated by Australia Post" in RS75.
Thus, value set predicates have a common formulation:
- "be", followed by either
  - a list of valid values, such as "Mr, Mrs, Miss, Ms, Sir or Dr", or
  - "one of the", followed by either
    - the name of the source data item, such as "Name Titles", followed by a qualifying clause to identify the source, such as "listed in AS4590-2006", or
    - "combinations of", followed by a list of the data items in the source, such as "Locality Name, State Code and Postal Code", followed by a qualifying clause to identify the source, such as "allocated by Australia Post".
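The two varieties of value set predicate can likewise be sketched as two small functions. The function names are hypothetical; the example arguments are taken from RS74, RS68, and RS75.

```python
from typing import Sequence


def join_with(names: Sequence[str], conjunction: str) -> str:
    """Join names as 'A', 'A or B', or 'A, B or C' (or with 'and')."""
    if len(names) == 1:
        return names[0]
    return ", ".join(names[:-1]) + " " + conjunction + " " + names[-1]


def list_predicate(valid_values: Sequence[str]) -> str:
    """Predicate quoting the valid values, as in RS67 and RS74."""
    return "be " + join_with(valid_values, "or")


def source_predicate(source_items: Sequence[str], qualifier: str) -> str:
    """Predicate referring to a source of valid values, as in RS68 and RS75."""
    if len(source_items) > 1:
        source = "combinations of " + join_with(source_items, "and")
    else:
        source = source_items[0]
    return "be one of the " + source + " " + qualifier


subject = "The Salutation specified for each Passenger in each Travel Insurance Application"
print(subject + " must " + list_predicate(["Mr", "Mrs", "Miss", "Ms", "Sir", "Dr"]) + ".")
print(source_predicate(["Countries"], "listed in ISO3166-1"))
print(source_predicate(["Locality Name", "State Code", "Postal Code"],
                       "allocated by Australia Post"))
```

The first call reproduces RS74 in full; the other two reproduce the predicates of RS68 and RS75.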
To be continued...
The next article in this series will discuss another type of data content rule statement, as well as explore the distinction between rules governing the real world and those governing information supplied to and recorded by an organisation.
References
[1] The first of which is: Graham Witt, "A Practical Method of Developing Natural Language Rule Statements (Part 1)," Business Rules Journal, Vol. 10, No. 2 (Feb. 2009), URL: http://www.BRCommunity.com/a2009/b461.html
[2] Graham Witt, Writing Effective Business Rules. Morgan Kaufmann (2012).
[3] Semantics of Business Vocabulary and Business Rules (SBVR), v1.0. Object Management Group (Jan. 2008). Available at http://www.omg.org/spec/SBVR/1.0/
The font and colour conventions used in these rule statements reflect those in the SBVR, namely underlined teal for terms, italic blue for verb phrases, orange for keywords, and double-underlined green for names and other literals. Note that, for clarity, these conventions are not used for rule statements that exhibit one or more non-recommended characteristics.
[4] Graham Witt, "Writing Natural Language Rule Statements — a Systematic Approach: Part 1 — Basic Principles," Business Rules Journal, Vol. 13, No. 7 (Jul. 2012), URL: http://www.BRCommunity.com/a2012/b660.html
Graham Witt, "Writing Natural Language Rule Statements — a Systematic Approach: Part 2 — Mandatory Data Rules," Business Rules Journal, Vol. 13, No. 8 (Aug. 2012), URL: http://www.BRCommunity.com/a2012/b665.html
Graham Witt, "Writing Natural Language Rule Statements — a Systematic Approach: Part 3 — Other Data Cardinality Rules," Business Rules Journal, Vol. 13, No. 9 (Sept. 2012), URL: http://www.BRCommunity.com/a2012/b669.html
Graham Witt, "Writing Natural Language Rule Statements — a Systematic Approach: Part 4 — Some Data Content Rules," Business Rules Journal, Vol. 13, No. 10 (Oct. 2012), URL: http://www.BRCommunity.com/a2012/b674.html
[5] Statements of rules that require the presence or absence of a data item and/or place a restriction on the maximum or minimum number of occurrences of a data item.
[6] A statement of a rule that places a restriction on the values contained in a data item or set of data items (rather than whether or not they must be present and how many there may or must be).
[7] A statement of a rule that requires that the content of a data item be a value within a particular range.
[8] The subject of a range rule statement may similarly start with either "Each" or "The" and may or may not include "(if any)"; however, the subject of a range rule statement cannot be a combination of data items.