Writing Natural Language Rule Statements — a Systematic Approach Part 3 — Other Data Cardinality Rules
About this series of articles
While my first series of articles on writing natural language rule statements[1] explored a wide variety of issues in a rather organic and hence random manner, this series takes a more holistic and systematic approach and draws on insights gained while writing my recently-published book on the same topic.[2] Rule statements recommended in these articles are intended to comply with the Object Management Group's Semantics of Business Vocabulary and Business Rules (SBVR) version 1.0.[3]
The story so far
In the previous article[4] we looked at rule statements for two types of rule:
- mandatory data item rule statements, i.e., statements of mandatory data item rules (rules that require that a particular data item be present) such as RS16, and
- mandatory option selection rule statements, i.e., statements of mandatory option selection rules (rules that require that one of a set of pre-defined options be specified) such as RS21.
RS16. | Each Travel Insurance Application must specify exactly one Departure Date.[5] |
RS21. | Each Travel Insurance Application must specify whether it is for international travel or for domestic travel. |
In that article I demonstrated that each of these types of rule statement has a common formulation, in that all rule statements of that type can be written using a common sentence structure.
These rules are two of the three types of mandatory data rule (rules that mandate the presence of data, i.e., require that data be entered in a transaction form or be present in a message or persistent data record). Let's now look at another type of mandatory data rule.
Alternative data items
Whereas some insurance companies ask for each passenger's birth date, others ask for the age of each passenger. However, no insurance company I have encountered allows the applicant to enter either birth dates or ages. By contrast, I have encountered one company that allows the applicant to enter either the date of return or the travel duration in days, as provided for by rule statement RS43. And while at least one company requires both a postal address and an e-mail address, some require one or the other, while others require at least one of those addresses, as provided for by rule statement RS44.
RS43. | Each Travel Insurance Application must specify a Return Date or a Travel Duration but not both. |
RS44. | Each Travel Insurance Application must specify a Postal Address or an e-Mail Address or both. |
There may, of course, be three or more alternatives, such as a postal address, an e-mail address, or a contact phone number. If that is the case, we cannot use 'both' but instead need to state how many alternatives must be specified: either the minimum number of alternatives (as in rule statement RS45) or the exact number of alternatives (as in rule statement RS46).
RS45. | Each Travel Insurance Application must specify at least one of the following: a Postal Address, an e-Mail Address or a Contact Phone Number. |
RS46. | Each Travel Insurance Application must specify exactly one of the following: a Postal Address, an e-Mail Address or a Contact Phone Number. |
Since these rules state that at least one or exactly one of a group of data items is mandatory, I refer to them as mandatory group rules. These too have a common formulation:
- the subject, identifying the type of transaction, in this case 'Each Travel Insurance Application'
- 'must specify'
- one of the following:
  - two data item names, each preceded by 'a' or 'an' as appropriate, separated by 'or' and followed by 'but not both'
  - two data item names, each preceded by 'a' or 'an' as appropriate, separated by 'or' and followed by 'or both' (or ', or both' if preferred)
  - 'at least one of the following:' or 'exactly one of the following:' followed by a list of three or more data item names, each preceded by 'a' or 'an' as appropriate, in which:
    - each data item name in the list other than the last two is followed by a comma
    - the last two are separated by 'or' (or ', or' if preferred).
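The formulation above can be read as a small template. The following Python sketch is one possible way to assemble such a statement; the function names and the vowel-based choice between 'a' and 'an' are illustrative assumptions, not part of SBVR or of the formulation itself.

```python
def article(name: str) -> str:
    """Pick 'an' before a vowel letter, otherwise 'a' (a rough heuristic only)."""
    return "an" if name[0].lower() in "aeiou" else "a"


def mandatory_group_statement(subject: str, items: list[str], exactly_one: bool) -> str:
    """Compose a mandatory group rule statement for two or more data items."""
    if len(items) < 2:
        raise ValueError("a mandatory group rule needs at least two data items")
    phrases = [f"{article(item)} {item}" for item in items]
    if len(items) == 2:
        # Two alternatives: '... or ... but not both' / '... or ... or both'
        tail = "but not both" if exactly_one else "or both"
        body = f"{phrases[0]} or {phrases[1]} {tail}"
    else:
        # Three or more: commas after all but the last two, 'or' before the last
        quantifier = "exactly one" if exactly_one else "at least one"
        listed = ", ".join(phrases[:-1]) + " or " + phrases[-1]
        body = f"{quantifier} of the following: {listed}"
    return f"{subject} must specify {body}."
```

Called with the subject 'Each Travel Insurance Application' and the data items of RS43 or RS45, this sketch reproduces the wording of those rule statements.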
Of course, one could use the 'one of the following' formulation for a rule statement governing only two alternative data items, such as RS47 (an alternative to RS43) and RS48 (an alternative to RS44).
RS47. | Each Travel Insurance Application must specify exactly one of the following: a Return Date or a Travel Duration. |
RS48. | Each Travel Insurance Application must specify at least one of the following: a Postal Address or an e-Mail Address. |
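To illustrate how mandatory group rules such as RS43 to RS48 might be checked against data, here is a minimal Python sketch. It assumes, purely for illustration, that an application is held as a dictionary keyed by data item name and that a missing key, None, or an empty string means the item was not specified.

```python
def count_specified(application: dict, items: list[str]) -> int:
    """Count how many of the named data items carry a value."""
    return sum(1 for item in items if application.get(item) not in (None, ""))


def satisfies_at_least_one(application: dict, items: list[str]) -> bool:
    """'at least one of the following' / '... or ... or both'."""
    return count_specified(application, items) >= 1


def satisfies_exactly_one(application: dict, items: list[str]) -> bool:
    """'exactly one of the following' / '... or ... but not both'."""
    return count_specified(application, items) == 1
```

Under these assumptions, an application that specifies both a Return Date and a Travel Duration fails the RS43 check, while one that specifies both a Postal Address and an e-Mail Address passes the RS44 check.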
Data cardinality rules other than mandatory data rules
The mandatory data rules we have now seen are members of a wider class of rules, data cardinality rules (rules that require the presence or absence of a data item and/or place restrictions on the maximum or minimum number of occurrences of a data item). Let's now look at other types of data cardinality rule.
Prohibited data items
You may recall RS25 from the previous article:
RS25. | Each Travel Insurance Application that is for international travel must specify at least one Region. |
Whereas an application for international travel insurance must specify at least one region, an application for travel insurance that both specifies domestic travel and specifies one or more regions (e.g., Africa, Asia, Europe, South America) is clearly inconsistent and should not be accepted. Regions are thus prohibited data items in applications for domestic travel insurance.
A prohibited data rule statement — such as RS49 — is required in this situation.
RS49. | A Travel Insurance Application that is for domestic travel must not specify a Region. |
RS50 is an example of a rule statement governing a prohibited data item that is part of a complex data item.
RS50. | A Travel Insurance Application must not specify a Medical Condition for any Passenger who does not have any pre-existing medical condition. |
Singular data items
You may recall from the previous article that some data items may appear more than once but others may only appear once (or once for each passenger or high value item). If a data item may only appear once but must appear (i.e., is mandatory), it can be governed using a mandatory data item rule statement such as RS27.
RS27. | Each Travel Insurance Application must specify exactly one Family Name for each Passenger. |
However, if a data item may only appear once but is optional, a maximum cardinality rule statement is required. Some insurance companies credit points to the passengers' frequent flier membership accounts if these are specified.
RS51. | A Travel Insurance Application must not specify more than one Frequent Flier Membership for any one Passenger. |
Note that the upper limit imposed by a maximum cardinality rule is not necessarily 'one'.
RS52. | A Travel Insurance Application must not specify more than nine Passengers. |
By the way, the rule expressed by rule statement RS52 is probably not a business rule but is instead a system rule, i.e., a constraint imposed by a system that does not support any business rule but which:
- makes system design easier,
- is agreed to by the organisation as a trade off against cheaper or earlier system delivery, or
- is accepted as a non-compliant feature of a purchased system which is otherwise fit for purpose.
A common formulation for prohibited data rule statements and maximum cardinality rule statements
These two types of rule statement share a common formulation:
- the subject, identifying the type of transaction, in this case 'A Travel Insurance Application' or 'A Travel Insurance Application that is for domestic travel'
- 'must not specify'
- 'a' (in a prohibited data rule statement), or 'more than' followed by a cardinal number (in a maximum cardinality rule statement)
- the name of the governed data item, e.g., 'Region', 'Frequent Flier Membership'
- if the governed data item is part of a complex data item, a phrase having the following form:
  - 'for any' (in a prohibited data rule statement) or 'for any one' (in a maximum cardinality rule statement)
  - the name of the complex data item, e.g., 'Credit Card', 'High Value Item'
  - '(if any)' if the complex data item is optional
- an optional qualifying clause, such as 'who does not have any pre-existing medical condition'.
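Prohibited data rules and maximum cardinality rules both place an upper limit on the number of occurrences of a data item (zero in the first case). The Python sketch below shows how RS49, RS51, and RS52 might be checked. The dictionary layout and the key names ('Travel Type', 'Regions', 'Passengers', 'Frequent Flier Memberships') are hypothetical and chosen only for this illustration.

```python
def occurrences(record: dict, item: str) -> int:
    """Number of values recorded for a data item (0 if it is absent)."""
    return len(record.get(item, []))


def complies_with_rs49(application: dict) -> bool:
    """Prohibited data: a domestic application must not specify a Region."""
    if application.get("Travel Type") == "domestic":
        return occurrences(application, "Regions") == 0
    return True


def complies_with_rs51(application: dict) -> bool:
    """Maximum cardinality: no more than one membership for any one Passenger."""
    return all(
        occurrences(passenger, "Frequent Flier Memberships") <= 1
        for passenger in application.get("Passengers", [])
    )


def complies_with_rs52(application: dict, limit: int = 9) -> bool:
    """Maximum cardinality: no more than nine Passengers."""
    return occurrences(application, "Passengers") <= limit
```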
Multiple data items
Occasionally, you will encounter a data item that is not only mandatory but must appear more than once in each instance of the transaction in question. For example, a return journey must involve two or more flights. A multiple data rule statement — such as RS53 — is required in this situation.
RS53. | Each Flight Booking Confirmation that is for a return journey or for a multi-stop journey must specify at least two Flights. |
Rule statements of this type have the same formulation as mandatory data item rule statements, with two slight differences:
- The determiner after 'must specify' is 'exactly' or 'at least' followed by a cardinal number greater than one, e.g., 'at least two' (compared with the determiner in a mandatory data item rule statement: 'exactly one' or 'at least one');
- As a result, the name of the governed data item is in the plural, e.g., 'Flights'.
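A multiple data rule such as RS53 sets a minimum number of occurrences greater than one for the transactions it covers. A minimal Python sketch of such a check follows; the keys 'Journey Type' and 'Flights' and the journey type values are assumptions made for the example.

```python
def complies_with_rs53(confirmation: dict, minimum: int = 2) -> bool:
    """Return or multi-stop journeys must specify at least two Flights."""
    if confirmation.get("Journey Type") in ("return", "multi-stop"):
        return len(confirmation.get("Flights", [])) >= minimum
    # The rule does not govern other journey types.
    return True
```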
Data items with cardinality dependent on other data items
Occasionally, the number of instances of a data item that may or must appear in a transaction is dependent on some other data item in that transaction. This can sometimes be handled by multiple rule statements such as we have already encountered, such as RS53 above and RS54:
RS54. | Each Flight Booking Confirmation that is for a one-way journey must specify at least one Flight. |
This, however, is not always practical. Consider RS55. Here the number of instances of the governed data item (Passenger Names) is required to be equal to the value of some other data item (Number of Passengers). Note that it could just as easily be required to be less than or more than the value of the other data item.
RS55. | The number of Passenger Names specified in each Flight Booking Confirmation must be equal to the Number of Passengers specified in the Flight Booking Request that gives rise to that Flight Booking Confirmation. |
This is an example of a dependent cardinality rule statement. Two aspects of the formatting of this rule statement should be noted:
- The subject (The number of Passenger Names …), the verb phrase introducing the predicate (must be equal to) and the object of that verb phrase (the Number of Passengers …) are each on a separate line.
- The words 'number of' are styled differently in 'number of Passenger Names' and 'Number of Passengers'. This is because Passenger Name is a data item that can appear more than once whereas Number of Passengers is a single data item.
Rule statements of this type also have a common formulation:
- the subject, identifying the governed data item, in this case 'The number of Passenger Names'
- 'specified'
- if the governed data item is part of a complex data item, a phrase having the following form:
  - 'for the' (if there can only be one of the complex data item) or 'for each' (if there can be more than one of the complex data item)
  - the name of the complex data item
  - '(if any)' if the complex data item is optional
- a qualifying clause identifying the type of transaction, in this case 'in each Flight Booking Confirmation'
- 'must be equal to', 'must be less than', or 'must be more than' as appropriate
- a determiner specifying the constraint on the number of instances required of the data item in question, such as 'at least two'
- the name of the governing data item preceded by 'the', in this case 'the Number of Passengers'
- a qualifying clause (always present so as to relate the governed and governing data items), in this case 'specified in the Flight Booking Request that gives rise to that Flight Booking Confirmation'.
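A dependent cardinality rule such as RS55 compares a count of occurrences with the value of another data item. The following Python sketch shows one way this might be checked, assuming the governed data items are available as a list and the governing data item as an integer. The function and parameter names are illustrative only.

```python
import operator

# The three comparisons allowed by the formulation above
COMPARISONS = {
    "equal to": operator.eq,
    "less than": operator.lt,
    "more than": operator.gt,
}


def satisfies_dependent_cardinality(
    governed_items: list, governing_value: int, comparison: str = "equal to"
) -> bool:
    """Compare the number of governed items with the governing data item."""
    return COMPARISONS[comparison](len(governed_items), governing_value)
```

For RS55, the governed items would be the Passenger Names in the Flight Booking Confirmation and the governing value would be the Number of Passengers in the Flight Booking Request that gave rise to it.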
Prepositions or verbs?
A reader of the first article in this series[6] has gently taken me to task for:
- not giving Richard Barker and Harry Ellis credit for their English Language approach to naming relationships, as described in: Richard Barker, CASE Method: Entity Relationship Modelling. Addison-Wesley Professional (1990);
- using verbs rather than prepositions to signify relationships between data items.
In answer to the first point, while I accept that Barker and Ellis predated each of the authors cited in that article in labelling relationships with names that could be used to construct natural language sentences, there is more to rules than relationships. Each of the authors cited has constructed constrained natural languages to describe all constraints.
The second point was an allusion to the feature of the Ellis-Barker notation whereby rather than naming relationships, the roles played by the entities at each end of a relationship are named, using:
- a preposition, such as 'for' or 'of'
- a passive participle followed by a preposition, such as 'described by' or 'made up of', or
- a noun followed by a preposition, such as 'description of' or 'part of'.
Any discourse about a relationship between two entities can then precede the appropriate role name with the appropriate form of the verb 'be', as in "each order is made up of order lines" or "each order line is part of an order".
This approach seems to have led some of its proponents to infer (wrongly):
- that verbs (other than 'be') only signify actions rather than relationships;
- that actions are not relevant in discourse about data or rules.
There are plenty of verbs other than 'be' that signify relationships, such as 'contain' and 'include'. To quote from Writing Effective Business Rules:
"When I learnt English grammar at school, a verb was defined as a 'doing word', i.e., one that refers to an action performed by a person or thing (e.g., 'create', 'specify', 'prevent'). Various linguists have identified that verbs are rather more versatile than that. In particular, Halliday (An Introduction to Functional Grammar, 1985) … identified the various types of process that verbs may refer to, namely:
- material processes: actions performed in the physical world, such as 'open', as in 'open the bag', 'open the door';
- mental processes, such as 'like', 'know', 'think', 'understand';
- behavioural processes: physiological or psychological behaviours, such as 'laugh', 'sneeze', 'walk', 'sleep';
- verbal processes, such as 'say', 'tell', 'inform', 'explain', 'specify', 'ask';
- relational processes, such as those expressed using 'be' or 'have';
- existential processes, also using the verb 'be' after 'there', as in 'there is a place'."
Rules certainly need to refer to actions for the simple reason that in a static universe (one in which nothing is happening) there is no need to check any rules (except those imposing time limits on inactivity). It is because things happen that we need to check whether those things comply with or violate our rules. Much of what happens is the direct result of some person, organisation, or machine doing something, i.e., acting. And for all data rules, the thing that has happened is that some person or organisation has stated or specified some information.
My correspondent suggested that, rather than a Travel Insurance Application specifying a Birth Date, a Travel Insurance Application is described by a Birth Date, but this is incorrect. The only thing that might be said to describe (or perhaps characterise) a Travel Insurance Application (or any other transaction) is the date on which it occurred (and perhaps the standard form used for the purpose). A transaction like a Travel Insurance Application is a means by which a person or organisation can state or specify some information.
While I am a great fan of other aspects of the Ellis-Barker notation (its means of depicting subtypes by boxes inside boxes, and its notations for alternative relationships and non-transferable relationships), I find its relationship naming convention unduly restrictive. Since one purpose of a database is to provide an accurate record of what has happened, a data model needs to be able to record actions and hence use verbs other than 'be'.
To be continued...
The next article in this series will discuss some data content rules.[7]
References
[1] The first of which is: Graham Witt, "A Practical Method of Developing Natural Language Rule Statements (Part 1)," Business Rules Journal, Vol. 10, No. 2 (Feb. 2009), URL: http://www.BRCommunity.com/a2009/b461.html
[2] Graham Witt, Writing Effective Business Rules. Morgan Kaufmann (2012).
[3] Semantics of Business Vocabulary and Business Rules (SBVR), v1.0. Object Management Group (Jan. 2008). Available at http://www.omg.org/spec/SBVR/1.0/
[4] Graham Witt, "Writing Natural Language Rule Statements — a Systematic Approach: Part 2 —Mandatory Data Rules," Business Rules Journal, Vol. 13, No. 8 (Aug. 2012), URL: http://www.BRCommunity.com/a2012/b665.html
[5] The font and colour conventions used in these rule statements reflect those in the SBVR, namely underlined teal for terms, italic blue for verb phrases, orange for keywords, and double-underlined green for names and other literals. Note that, for clarity, these conventions are not used for rule statements that exhibit one or more non-recommended characteristics.
[6] Graham Witt, "Writing Natural Language Rule Statements — a Systematic Approach: Part 1 — Basic Principles," Business Rules Journal, Vol. 13, No. 7 (July 2012), URL: http://www.BRCommunity.com/a2012/b660.html
[7] A rule that places a restriction on the values contained in a data item or set of data items (rather than whether or not they must be present and how many there may or must be).