A Practical Method of Developing Natural Language Rule Statements (Part 1)
What is this series of articles about?
This is the first article in a series that describes a practical method of developing rule statements that are in natural language (thus able to be verified by stakeholders) and unambiguous (thus able to be implemented consistently within the organisation and by the parties with which it deals). The author has developed this method for a large Australian government agency that has selected the Business Rules Approach and the Object Management Group's Semantics of Business Vocabulary and Business Rules (SBVR)[1] as representative of best rules practice.
This series will cover, among other things:
- What is a business rule?
- Types of rule.
- Templates for well-formed rule statements.
- Techniques for avoiding ambiguity.
- Relationships between fact types and rule statements.
The Importance of Rules
All organisations are governed by rules. Each organisation chooses to be so governed for a variety of reasons, including minimisation of exposure to risk, cost reduction, revenue protection, maintenance of market share, etc. While many of these rules will be established by the organisation itself and possibly modified over time, many others will reflect legislation or regulation, since failure to comply with these exposes the organisation to the risk of financial penalty and, possibly, loss of business through adverse publicity. Other rules again will reflect external standards (typically to obtain efficiencies in dealing with other organisations) or best practice (typically to minimise risk without "reinventing the wheel").
All organisations are of course also subject to the laws of physics, although these cannot be violated. Why then should organisations include laws of physics among the rules they manage? Consider a school timetabling system. A school using such a system would be right to expect that it would prevent any staff member being timetabled to be in more than one location at one time, which would violate a law of physics. Another law of physics that has the potential to affect most systems is the fact that (for all practical non-relativistic purposes) time is linear and unidirectional. This underpins the constraint that no arrangement, activity, or version of something can finish or expire before it starts. Whether a system is recording information about arrangements, activities, or versions before they occur (and thus perhaps causing them to occur as recorded, as is the case in a timetabling system) or 'after the event', any information suggesting that an arrangement, activity, or version will finish before it starts should be rejected.
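The two timetabling constraints just described can be sketched as simple checks. This is a minimal illustration only: the Booking structure, its field names, and the sample values are assumptions made for the example and do not come from any particular timetabling system.

```python
from datetime import datetime
from typing import NamedTuple


class Booking(NamedTuple):
    """Hypothetical timetable entry: one staff member in one location."""
    staff: str
    location: str
    start: datetime
    end: datetime


def finishes_before_start(b: Booking) -> bool:
    """True if the entry would finish before it starts (to be rejected)."""
    return b.end < b.start


def double_booked(a: Booking, b: Booking) -> bool:
    """True if one staff member is placed in two locations at overlapping times."""
    return (
        a.staff == b.staff
        and a.location != b.location
        and a.start < b.end
        and b.start < a.end
    )


maths = Booking("J. Smith", "Room 1", datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 10, 0))
art = Booking("J. Smith", "Room 7", datetime(2024, 3, 4, 9, 30), datetime(2024, 3, 4, 10, 30))
backwards = Booking("J. Smith", "Room 1", datetime(2024, 3, 4, 11, 0), datetime(2024, 3, 4, 10, 0))
```

With these sample entries, double_booked(maths, art) and finishes_before_start(backwards) both report a violation, so a system applying the checks would reject the second and third entries.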
Given all of the above, any system developed or acquired by an organisation is of more value if that system ensures that the data it manages and the processes it performs comply with all relevant rules to which the organisation is subject. Many older systems do not enforce compliance with all relevant rules. More recent systems may do so through program code. Subsequent developments have seen some rules moved into the database or into a rules engine.
Rule Documentation Techniques
None of the development methods just mentioned requires that the relevant rules be documented in such a manner as to enable stakeholders to review and approve them, change them as required, and so on. This would require in turn that all rule statements be in a standard format and use business terminology (rather than database object names). Of course, rules engines expose all rule statements in a standard format but, even if business terminology can be used in a rules engine, its standard format is unlikely to be particularly business-friendly. Rules implemented in a database may be documented in (or in association with) the logical or conceptual data model (assuming one exists). A more common situation is that any rules are only documented in functional or program specifications.
Ron Ross was the first to propose a comprehensive approach to documenting rules, a diagramming technique[2] that was perhaps too complex to be truly business-friendly. He then developed RuleSpeak[3], which was a worthy first attempt to couch rules in natural language. Terry Halpin was meanwhile working on the verbalisation of ORM (Object Role Modeling[4],[5]) models, and Edition 3 of this author's Data Modeling Essentials[6] (jointly authored with Graeme Simsion) described an approach to expressing a data model (and associated rules) in natural language 'assertions'.
In January 2008 the OMG (Object Management Group) released version 1.0 of the Semantics of Business Vocabulary and Business Rules (SBVR). While SBVR contains what this author believes to be a few flaws (more on this later), it is significant in that it represents a comprehensive analysis of the linguistic and logical concepts underlying business rules and 'rule statements', i.e., their expression in natural language. The complexity necessary for this analysis, however, presents challenges to practitioners, particularly those not well versed in the theory of language and logic.
Some Examples of Systems Governed by Rules
An example of the simplest such system is a locked door that only unlocks if the correct combination is keyed into a keypad beside the door. If any other combination is keyed, the door must remain locked. In this system there is a process (which unlocks the door if appropriate), input data (the combination), and a 'go/no-go' rule that only admits two outcomes (the door either unlocks or it doesn't). The rule is fired whenever input data is received (a combination is keyed into the keypad).
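As a rough sketch, the keypad rule reduces to a single comparison between the input data and the stored combination. The function name and the sample combination below are illustrative assumptions.

```python
def door_unlocks(keyed_combination: str, correct_combination: str) -> bool:
    """'Go/no-go' rule: the door unlocks only if the correct combination is keyed."""
    return keyed_combination == correct_combination


# The rule is fired each time a combination is keyed into the keypad.
CORRECT = "4271"  # hypothetical stored combination
first_attempt = door_unlocks("1234", CORRECT)
second_attempt = door_unlocks("4271", CORRECT)
```

Only two outcomes are possible: the first attempt leaves the door locked and the second unlocks it.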
Alternatively, access control may be managed using a 'swipe card' arrangement. Here the input data is whatever is encoded in the magnetic card: this may well include restrictions on the doors by which access is allowed, and/or times and days of the week for which access is allowed. In this situation there are typically two rules — one leading to 'no-go' if the card is not recognised, and one leading to 'go' only if the card is recognised and the card is programmed to allow access through this door and the card is programmed to allow access at this particular time on this particular day of the week. This is an example of a rule that responds not only to input data but to environmental knowledge ("what day and time is it now?").
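The swipe card rules might be sketched as follows. The Card structure, the set of recognised card identifiers, and the representation of allowed days and times are all assumptions made for the example; the current date and time are passed in as the environmental knowledge.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Card:
    """Hypothetical content encoded on a swipe card."""
    card_id: str
    doors: frozenset      # doors through which access is allowed
    weekdays: frozenset   # allowed days, 0 = Monday ... 6 = Sunday
    earliest: time        # start of the allowed period on each day
    latest: time          # end of the allowed period on each day


def access_allowed(card: Card, door: str, now: datetime, recognised_ids: set) -> bool:
    # Rule 1: 'no-go' if the card is not recognised.
    if card.card_id not in recognised_ids:
        return False
    # Rule 2: 'go' only if this door, this day, and this time are all allowed.
    return (
        door in card.doors
        and now.weekday() in card.weekdays
        and card.earliest <= now.time() <= card.latest
    )


card = Card("C-001", frozenset({"lab"}), frozenset(range(5)), time(8, 0), time(18, 0))
recognised = {"C-001"}
monday_morning = datetime(2024, 3, 4, 9, 0)
saturday_morning = datetime(2024, 3, 9, 9, 0)
```

Passing the current time as a parameter, rather than reading the clock inside the function, keeps the rule's dependence on environmental knowledge explicit and makes the rule easy to test.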
Environmental knowledge may be spatial as well as temporal. For instance, another system governed by rules is to be found at train stations in Sydney (Australia). On leaving a station most passengers are required to insert a ticket into a slot beside a normally-closed barrier. If the ticket is valid for travel to that station and has not expired and has not already been used for the maximum number of journeys allowed, the barrier opens. The barrier not only reads the ticket (the input data) but also needs to know whether or not it is in the geographical area covered by the ticket, as well as know the date and time.
This system does not only apply this 'go/no-go' rule; it also makes a decision based on whether it would be legitimate to make any further journeys on the ticket (a single ticket can only be used for one journey, and a return ticket for two, whereas a weekly ticket can be used for unlimited journeys within a week). If no more journeys are possible, the ticket is retained within the barrier but in all other cases is returned to the passenger. This decision would appear to render unnecessary the third clause in the 'go/no-go' rule (which requires a test of whether the ticket has already been used for the maximum number of journeys allowed). However, there are stations without barriers, so it cannot be assumed that (for example) a single ticket still in a passenger's possession has not yet been used.
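The barrier's 'go/no-go' rule and its retain-or-return decision can be sketched together. The Ticket structure is hypothetical, and the sketch assumes that journeys_used counts journeys completed before the current exit; the actual ticketing system may record usage differently.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Ticket:
    """Hypothetical ticket data read by the barrier."""
    valid_stations: frozenset
    expires: datetime
    journeys_used: int            # assumption: journeys completed before this exit
    max_journeys: Optional[int]   # 1 = single, 2 = return, None = unlimited (weekly)


def process_exit(ticket: Ticket, station: str, now: datetime) -> tuple[bool, bool]:
    """Return (barrier_opens, ticket_returned_to_passenger)."""
    exhausted = (
        ticket.max_journeys is not None
        and ticket.journeys_used >= ticket.max_journeys
    )
    # 'Go/no-go' rule: valid for this station, not expired, journeys remaining.
    if station not in ticket.valid_stations or now > ticket.expires or exhausted:
        return (False, True)
    ticket.journeys_used += 1
    # Decision: retain the ticket only if no more journeys are possible.
    more_possible = (
        ticket.max_journeys is None
        or ticket.journeys_used < ticket.max_journeys
    )
    return (True, more_possible)


now = datetime(2024, 3, 4, 17, 30)
stations = frozenset({"Central", "Town Hall"})
single = Ticket(stations, datetime(2024, 3, 4, 23, 59), 0, 1)
weekly = Ticket(stations, datetime(2024, 3, 10, 23, 59), 6, None)
used_single = Ticket(stations, datetime(2024, 3, 4, 23, 59), 1, 1)
```

The used_single ticket illustrates why the third clause is still needed: a single ticket that was used at a station without barriers remains in the passenger's possession, and only the test of journeys already used prevents it from opening the barrier again.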
'Mandatory Data' Rules
A very common type of process is one that receives input data from a user interface for recording as such and/or updating of stored data. A good example of this type of process is the online "Book Flights" facility provided by an airline. One of the most common types of rule governing such a process is the 'mandatory data' (or 'required field') rule. A "Book Flights" process will, for example, be governed by a rule that requires the departure date to be specified in the input data. Let us call the input data a 'flight booking request'. Our 'mandatory departure date' rule could be stated in many different ways:
RS1. Departure date is mandatory.
RS2. Departure date must be specified.
RS3. The departure date must be specified.
RS4. The departure date must be specified in a flight booking request.
RS5. It is obligatory that a flight booking request specify a departure date.[7]
RS6. A flight booking request must specify a departure date.
RS7. Each flight booking request must specify a departure date.
RS8. Each flight booking request must specify exactly one departure date.
Of these, the last is preferred for the following reasons:
- RS1 and RS2 are not complete sentences; furthermore the word 'mandatory' is not as commonly used outside the IT community as we might think.
- RS3 is a complete sentence but does not state the circumstances in which a departure date is required.
- RS4 is inappropriately phrased since it places an obligation on where to specify the departure date if there is one but does not actually place any obligation on the content of a flight booking request. This version violates a metarule that should govern rule statement formulation, namely that the focus of the rule (in this case the flight booking request, rather than the departure date) should be the subject of the rule statement. We will discuss this further in a later article in this series.
- RS6 is an improvement on RS5 in that it is less verbose without loss of meaning. While RS5 is an example of what has been proposed in the SBVR as one way, in 'structured English', to state a rule that imposes an obligation, RS6 is based on the RuleSpeak equivalent (also endorsed by the SBVR).
- RS7 is an improvement on RS6 in that it makes clear that every flight booking request must specify the departure date.
- RS8 is an improvement on RS7 in that it additionally rejects flight booking requests with more than one departure date.
Each of these rule statements is based on the fact type:
FT1. flight booking request specifies departure date
which is based in turn on the concepts denoted by the terms flight booking request and departure date.
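To show what enforcing RS8 might look like, here is a minimal sketch in which a flight booking request is represented as a dictionary. The key name departure_dates is an assumption made for the example.

```python
RS8 = "Each flight booking request must specify exactly one departure date."


def check_rs8(request: dict) -> list[str]:
    """Return the rule statements violated by the request (empty if compliant)."""
    departure_dates = request.get("departure_dates", [])
    if len(departure_dates) != 1:
        return [RS8]
    return []


compliant = {"departure_dates": ["2024-03-04"]}
missing = {}
too_many = {"departure_dates": ["2024-03-04", "2024-03-05"]}
```

The check rejects both the request with no departure date and the request with more than one, which is the difference between RS8 and RS7. Returning the rule statement itself as the violation message means the stakeholder-approved wording is what the user sees.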
Of course, our "Book Flights" process should also reject any flight booking request that does not specify a return date for a return journey. This rule can be stated as follows:
RS9. Each flight booking request for a return journey must specify exactly one return date.
This rule statement is based on the fact types:
FT2. flight booking request is for journey
FT3. return journey is a category of journey
FT4. flight booking request specifies return date
the rule statement actually being a (legitimately) shortened form of:
RS10. Each flight booking request that is for a return journey must specify exactly one return date.
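A sketch of the corresponding check shows how the qualifying clause ("that is for a return journey") becomes a condition that limits the requests to which the rule applies. The keys journey_type and return_dates are assumptions made for the example.

```python
RS10 = (
    "Each flight booking request that is for a return journey "
    "must specify exactly one return date."
)


def check_rs10(request: dict) -> list[str]:
    """Return the rule statements violated by the request (empty if compliant)."""
    # Qualifying clause: the rule only applies to requests for a return journey.
    if request.get("journey_type") != "return":
        return []
    if len(request.get("return_dates", [])) != 1:
        return [RS10]
    return []


return_ok = {"journey_type": "return", "return_dates": ["2024-03-11"]}
return_missing = {"journey_type": "return"}
one_way = {"journey_type": "one-way"}
```

A one-way request passes this check because the rule does not apply to it, not because it satisfies the rule.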
Our First Rule Statement Template
Rule statements RS8 and RS9 have the same form, which can be described as:
RT1. Each <term 1> {<qualifying clause> | } must <verb phrase> <cardinality> <term 2>.
where:
- <term 1> and <term 2> are placeholders in place of which terms may be substituted;
- <qualifying clause> is a placeholder in place of which a qualifying clause may be substituted: one form of qualifying clause is "that <verb phrase> a <term 3>";
- the symbols { | } allow for alternatives within a template: in this template either <qualifying clause> or null may appear, i.e., <qualifying clause> is optional;
- <verb phrase> is a placeholder in place of which a verb phrase may be substituted: <term 1> <verb phrase> <term 2> should be a fact type;
- <cardinality> is a placeholder in place of which a cardinality may be substituted, e.g., exactly one, at least one, at most one.
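One way to see how the template works is to treat it as a function that assembles a rule statement from its placeholders. The function and parameter names below are illustrative assumptions; the verb phrase is supplied in the form that follows "must" (e.g., "specify" rather than "specifies").

```python
def render_rt1(
    term_1: str,
    verb_phrase: str,
    cardinality: str,
    term_2: str,
    qualifying_clause: str = "",
) -> str:
    """Assemble a rule statement following template RT1."""
    parts = ["Each", term_1]
    if qualifying_clause:  # the { <qualifying clause> | } alternative
        parts.append(qualifying_clause)
    parts += ["must", verb_phrase, cardinality, term_2]
    return " ".join(parts) + "."


rs8 = render_rt1("flight booking request", "specify", "exactly one", "departure date")
rs10 = render_rt1(
    "flight booking request",
    "specify",
    "exactly one",
    "return date",
    qualifying_clause="that is for a return journey",
)
```

Here rs8 reproduces RS8 and rs10 reproduces RS10, the unshortened form of RS9.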
'Prohibited Data' Rules
Of course, our "Book Flights" process should also reject a flight booking request that specifies a return date for a one-way journey. This rule can be stated as follows:
RS11. A flight booking request for a one-way journey must not specify a return date.
This rule statement is based on the following fact type in addition to FT2, FT3, and FT4:
FT5. one-way journey is a category of journey
This requires a new template:
RT2. {A|An} <term 1> {<qualifying clause> | } must not <verb phrase> <cardinality> <term 2>.
Note that this template uses the indefinite article ("A" or "An") rather than "Each". This is because it is awkward in English to say "Each … must not …".
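A 'prohibited data' rule such as RS11 can be checked in much the same way as a 'mandatory data' rule, except that the presence of the data, rather than its absence, is the violation. In this sketch the keys journey_type and return_dates are again assumptions made for the example.

```python
RS11 = "A flight booking request for a one-way journey must not specify a return date."


def check_rs11(request: dict) -> list[str]:
    """Return the rule statements violated by the request (empty if compliant)."""
    is_one_way = request.get("journey_type") == "one-way"
    has_return_date = len(request.get("return_dates", [])) > 0
    if is_one_way and has_return_date:
        return [RS11]
    return []


one_way_clean = {"journey_type": "one-way"}
one_way_with_return = {"journey_type": "one-way", "return_dates": ["2024-03-11"]}
return_with_date = {"journey_type": "return", "return_dates": ["2024-03-11"]}
```

Only one_way_with_return violates the rule; the other two requests are outside its scope or comply with it.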
To be continued...
In the next article we will look at rules constraining the content of input data, and explore an additional template and some more techniques for rule statement generation.
References
[1] Semantics of Business Vocabulary and Business Rules (SBVR), v1.0. Object Management Group (Jan. 2008). Available at http://www.omg.org/spec/SBVR/1.0/PDF
[2] Ronald G. Ross, The Business Rule Book — Classifying, Defining, and Modeling Rules, Second Edition. Houston, TX: Business Rule Solutions, Inc. (1997).
[3] BRS RuleSpeak® Practitioner's Kit. Business Rule Solutions, LLC (2001-2004). PDF. Available at http://BRSolutions.com/p_rulespeak.php
[4] Terry Halpin, Information Modeling and Relational Databases. Morgan Kaufmann (2001).
[6] Graeme Simsion and Graham Witt, Data Modeling Essentials, Third Edition. Morgan Kaufmann (2004).
[7] The font and colour conventions used in this and other well-formed rule statements and fact types in these articles reflect those in the SBVR, namely underlined teal for terms, italic blue for verb phrases, and orange for keywords. Note that, for clarity, less than well-formed rule statements will not use these conventions.