ACM SIGSOFT Software Engineering Notes
Page 1
September 2006
Volume 31 Number 5
An Evaluation of Boolean Expression Testing Techniques

R.K. Singh (1), Pravin Chandra (2), Yogesh Singh (2)
(1) Centre for Development of Advanced Computing, email: [email protected]
(2) Guru Gobind Singh Indraprastha University, Delhi, e-mail: [email protected], [email protected]

Abstract

The increasing size and complexity of software has made software testing a challenging exercise. A number of testing techniques are available, but they differ in statement coverage, condition coverage and, particularly, fault detection capability. The size of the test suite also differs from one technique to another. Faults that propagate into the system inadvertently, especially into the branch statements, have severe effects because they affect the logic of the program. In this paper, an experimental evaluation of the popular branch-testing techniques (Elmendorf's method, Boolean Operator (BOR), Modified Condition/Decision Coverage (MCDC), and Reinforced Condition/Decision Coverage (RCDC)) is presented. These techniques are evaluated on the basis of the types of faults they identify, the size of the test suite and their effectiveness in fault detection. For the experiments, various branch statements used and referred to in the literature were selected. Test cases and mutants were prepared for these branch statements. Mutants were prepared by seeding single operator and operand faults into the statements. The results indicate that BOR is effective for a subset of fault types. A variant of MCDC and RCDC demonstrate better performance on the full class of faults and are only slightly worse than the test suite of Elmendorf's (CEG) method.
Keywords: Software Testing, Branch Testing, Fault, Mutant

1. Introduction

Branching statements play a decisive role in any program in accomplishing the given activity. Depending upon the input data, a decision is taken whether to execute a specific branch. Any incorrect decision may be catastrophic. Thus, branch statements in the program require special attention. Branch testing is mainly a control-flow test criterion [1,10,11] and concentrates on the decision statements of the program for their accuracy and test adequacy. Several branch testing techniques (discussed in Section 2) have been reported in the literature, but they differ in statement coverage, condition coverage and, particularly, fault detection capability. The size of the test suite also differs from one technique to another.

In this paper, an attempt is made to empirically evaluate the effectiveness of the various branch testing techniques. Mutation testing, which is a fault-based testing approach, has been used for comparing the fault-exposing capability of the various branch testing techniques. For our experiments, we have taken Boolean expressions used and referred to in the literature [5,7,11,18,24]. We assume that a Boolean expression is written using Boolean variables, the Boolean operators AND and OR, the NOT operator and parentheses only. Lower-case italicized letters represent Boolean variables, "+" represents the OR operator and "." represents the AND operator. Usually we omit the "." while writing Boolean expressions. Logical TRUE and FALSE are denoted by 't' and 'f' respectively.

Based on the common operator and operand type errors, several fault categories have been defined in the literature. We have created mutants for the Boolean expressions given in Table 5 under various fault categories. The objective is to assess the fault-finding capability of the various test suites. A test set is mutation adequate if it is able to kill all the mutants. The number of live mutants indicates inadequacies in the current test set: live mutants indicate that potential faults have gone unattended while writing the program. This allows us to measure the effectiveness of the fault detection/testing.

The paper is organized as follows: Section 2 presents a short review of existing techniques/strategies for branch testing. Section 3 describes the experiment design and evaluation methodology. Section 4 discusses the results obtained. The discussion and conclusion are presented in Section 5.

2. Branch Testing Techniques

This section briefly discusses the various branch testing techniques used in this paper. These techniques use control-flow test criteria to verify whether the logical decisions of the program correspond to the specification. Branch testing requires that all control transfers in the program be exercised during testing.

2.1 Elmendorf's Strategy

Elmendorf's Cause-Effect Graphing technique [01,04,12] is a method of generating test cases representing combinations of conditions. A "cause" is an input condition and an "effect" is an output condition. This technique requires the translation of a specification into a Boolean logic network. It mainly uses the logical
operators "AND", "OR", "NOT" etc. to depict the cause-effect graph. This technique helps to uncover ambiguities and incompleteness in the specifications [04].

2.2 Boolean Operator Strategy (BOR)

BOR [14,15,17,18,19,23] is a technique suitable for test generation for singular Boolean expressions. In a singular Boolean expression, each participating Boolean variable occurs only once. For example, F = ab + cd is a singular expression, as each participating variable occurs only once. The Boolean expression F = (a + b)(a + c) is a non-singular Boolean expression. The BOR testing strategy guarantees the detection of Boolean operator faults for singular expressions. This strategy can detect incorrect "AND" and "OR" operators and missing or extra "NOT" operators, under the assumption that the Boolean expression does not contain faults of any other type.

The BOR strategy [16,17] is not suitable for non-singular expressions. In this case, when two test sets are merged into one and they contain conflicting values for the same variable, the merged set may either be a reduced test set or contain null elements. For example, in the Boolean expression F = ¬(cd) efga (bc + (¬b)d), two sub-expressions can be E1 = ¬(cd) and E2 = efga (bc + (¬b)d). Let E1_t, E2_t represent the "TRUE" results and E1_f, E2_f the "FALSE" results for the expressions E1 and E2. Expression E2 itself has two terms, E21 = (efga) and E22 = (bc + (¬b)d). In E22, variable b has a coupling effect, and this leads to a reduced test set for E22_t = (t, t, t). On merging E21 and E22 into E2, E2_t has the elements {(t, t, t, t, t, t, t), (t, t, t, t, f, t, t)} for the variables (e, f, g, a, b, c, d). E1_t = {(f, t), (t, f)} for the variables (c, d). For generating the test cases as per the algorithm, E1_t and E2_t are to be merged in one of the steps. This operation cannot be completed because the values of c and d in the sub-expressions E1 and E2 conflict, and hence no true test case for F can be generated.

2.3 Modified Condition/Decision Coverage (MCDC)

MCDC [02,03,07,08,09,10,11,25] is a structural coverage criterion used to support the assessment of the adequacy of the requirements-based testing process. This criterion has been defined as follows [02,08,09,11,25]:

"Every point of entry and exit in the program has been invoked at least once, every condition in a decision in the program has taken on all possible outcomes at least once, and each condition has been shown to independently affect the decision's outcome"

An MCDC pair for a condition is a pair of test cases that changes the output when the condition is varied from "f" to "t" while the other conditions are kept fixed. At least one pair for each condition is required to form the test suite. A condition is an occurrence of a variable in the Boolean expression. For example, the Boolean expression F = ab + ac contains four conditions: a, b, c and the second occurrence of a. The first occurrence of variable a is coupled with the second occurrence of a because a change to one condition affects the other condition. If there are multiple occurrences of an input in the Boolean formula, it may not always be possible to obtain at least one pair for every condition. In such a situation, the condition is strongly coupled, meaning that a change in one condition always results in a change in the other condition. In the above example, keeping the other conditions constant, including the first occurrence of a, while changing the value of the second occurrence of a cannot be achieved in any meaningful way. The standard does not specify any alternative in such a situation [03]. To overcome this limitation and widen the applicability of MCDC, three different forms of MCDC have been proposed:

Unique Cause: Changing a single condition in the expression from "t" to "f" or vice versa changes the outcome of the expression.

Unique Cause Masking: Unique Cause Masking effectively treats the coupling of conditions in the expression. Unique Cause Masking Strong (UCMS) and Unique Cause Masking Weak (UCMW) are two approaches for handling strongly coupled conditions. With strongly coupled conditions, it may not be possible to vary one condition while keeping the others fixed. In UCMS, masking is allowed for the coupled conditions only, whereas all other (uncoupled) conditions remain fixed. In UCMW, coupled conditions are treated as a single condition. For example, in the expression ab + (¬a)c, the two occurrences of a are treated as distinct entities to be varied. The first occurrence of a has the test pair t1 = [(t, t, f), (f, t, f)] and the second occurrence has the test pair t2 = [(f, f, t), (t, f, t)]. In the case of UCMS, both t1 and t2 are selected, whereas in UCMW either t1 or t2 is selected.

Masking MCDC: Masking MCDC allows masking in all cases, coupled or uncoupled. For example, in the Boolean expression a + (bc), to show the independent effect of a, the masking definition requires that the sub-expression (bc) have no impact on the outcome of the expression. This can be achieved by making the sub-expression false, i.e. the pair [(t, f, t), (f, t, f)] satisfies the masking MCDC criterion even though the other corresponding input bits are different. This is not allowed in UCMS.

2.4 Reinforced Condition / Decision Coverage (RCDC)
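The test pairs used by MCDC (Section 2.3) and by the RCDC criterion of this section can be enumerated mechanically for small expressions. The following is a minimal sketch, not the procedure used in the experiments: it assumes the example expression is F = ab + (¬a)c, treats the variable a as a single input (both occurrences toggle together, in the spirit of UCMW), and uses the hypothetical names `decision` and `pairs_for_condition`. Pairs that change the outcome are independence ("affect") pairs; pairs that leave it unchanged are the "keep" pairs.

```python
from itertools import product


def decision(a, b, c):
    # Assumed example expression: F = ab + (not a)c
    return (a and b) or ((not a) and c)


def pairs_for_condition(func, n_inputs, index):
    """Toggle input `index` at every point and sort the resulting pairs.

    Returns (flip, keep_false, keep_true): pairs where the outcome changes,
    stays "f", or stays "t" while all other inputs are held fixed.
    """
    flip, keep_false, keep_true = [], [], []
    for point in product([False, True], repeat=n_inputs):
        if point[index]:
            continue  # build each pair once, starting from the "f" value
        toggled = point[:index] + (True,) + point[index + 1:]
        before, after = bool(func(*point)), bool(func(*toggled))
        if before != after:
            flip.append((point, toggled))
        elif before:
            keep_true.append((point, toggled))
        else:
            keep_false.append((point, toggled))
    return flip, keep_false, keep_true


flip, keep_false, keep_true = pairs_for_condition(decision, 3, 0)
print("affect pairs:", flip)
print("keep-f pairs:", keep_false)
print("keep-t pairs:", keep_true)
```

Exhaustive enumeration is only practical for a handful of inputs, since the number of points grows as 2^n; it is meant as a check on hand-derived pairs rather than as a test generator.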
The RCDC [21,22] criterion is an extension of MCDC-UCMW (therefore, we have not considered MCDC-UCMW separately). The MCDC approach does not consider the situation where a change in a condition should keep the value of a decision. RCDC suggests, in addition to MCDC:

"each condition in a decision has been shown to independently keep the decision's outcome"

A condition is shown to independently affect and keep a decision's outcome by varying just that condition while holding fixed (if possible) all other conditions. The RCDC criterion states that, in addition to UCMW, every condition should be shown to independently keep the value of the decision at "t" or "f". For example, in the expression ab + (¬a)c, one selected pair for condition a for UCMW is [(f, f, t), (t, f, t)]. In addition to this pair, RCDC will have the following test cases for condition a, where changing the value of a does not change the final decision of the expression:

RCDC0 (pair for condition a, with outcome "f"): [(f, f, f), (t, f, f)]
RCDC1 (pair for condition a, with outcome "t"): [(f, t, t), (t, t, t)]

The total pairs for RCDC for the Boolean expression ab + (¬a)c are t1 = [(f, f, t), (t, f, t)], t2 = [(f, f, f), (t, f, f)] and t3 = [(f, t, t), (t, t, t)].

2.5 Size of Test Set

Let n be the number of inputs in a Boolean formula; then, based on the definitions, Table 4 gives the size of the test suite that can be generated for the various techniques [11,18].

Size of Test Set
Criteria              Minimum    Maximum
BOR*                  n+2        n+2
Elmendorf Strategy               2^n
MCDC                  n+1        2×n
MCDC (UCMS)           n+1        2n
RCDC                  n+1        6×n

Table 4: Size of test suite. * In this case n represents the number of AND/OR operators, and the test suite may be null for non-singular expressions.

3. Experiment Design and Evaluation Methodology

In this section, we empirically evaluate these techniques on the basis of the types of faults they identify, the size of the test suite and their effectiveness in fault detection. For this purpose we have used various Boolean expressions (Table 5) taken from the literature [05,07,11,18,24].

No.    Boolean Expression
1      ab + cd
2      (a + b)(a + c)
3      ab + bc + cd
4      a(b + c)d + e
5      efga(bc + bd)
6      (ac + bd)e(f + (i(gj + hk))
7      (ac + bd)e(i + gk + j(h + k))
8      a + b + c + cdefgh + i(j + k)l

Table 5: List of Boolean expressions used in the experiment(s).

For evaluating the performance of the various testing techniques, a fault-based approach [01,06,11,12] has been used. In this approach, mutants of the original expression are generated by seeding either single or multiple faults in the operators or operands of the expression.

For the Boolean expressions in Table 5, mutants have been generated under various fault categories by seeding a single error at a time, either in an operator or in an operand. If a test case can detect a mutant with a single fault, it may also detect multiple faults [13]. The various kinds of faults [06,11,20] that can affect any expression can be classified into the following categories:

• Operator Reference Fault (ORF): In this class of fault, a binary logical operator '.' is replaced by '+' or vice versa.
• Expression Negation Fault (ENF): A sub-expression in the statement is replaced by its negation (¬).
• Variable Negation Fault (VNF): An atomic Boolean literal is replaced by its negation (¬).
• Associative Shift Fault (ASF): This fault occurs when an association among conditions is incorrectly implemented due to a misunderstanding of operator evaluation properties.
• Missing Variable Fault (MVF): A condition in the expression is missing with respect to the original expression.
• Variable Reference Fault (VRF): A condition is replaced by another input which exists in the statement.
• Clause Conjunction Fault (CCF): A condition a in the expression is replaced by a.b, where b is a variable in the expression.
• Clause Disjunction Fault (CDF): A condition a in the expression is replaced by a + b, where b is a variable in the expression.
• Stuck at 0: A condition a is replaced with 0 in the function.
• Stuck at 1: A condition a is replaced with 1 in the function.
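The fault-seeding idea above can be sketched in a few lines. This is an illustrative sketch only, not the tooling used in the experiments: it assumes the expression (a + b)(a + c) as printed, hand-codes five single-fault mutants (one each for ORF, MVF, CDF, stuck-at-0 and stuck-at-1, with the stuck-at faults applied to every occurrence of a), and the names `original`, `MUTANTS`, `killed_mutants` and `mutation_score` are hypothetical. A mutant is killed when at least one test case makes it evaluate differently from the original.

```python
def original(a, b, c):
    # Expression under test: (a + b)(a + c)
    return (a or b) and (a or c)


# One hand-seeded mutant per fault class (assumed examples for illustration).
MUTANTS = {
    "ORF": lambda a, b, c: (a and b) and (a or c),       # '+' replaced by '.'
    "MVF": lambda a, b, c: b and (a or c),               # first a is missing
    "CDF": lambda a, b, c: (a or c or b) and (a or c),   # a replaced by a + c
    "SA0": lambda a, b, c: (False or b) and (False or c),  # a stuck at 0
    "SA1": lambda a, b, c: (True or b) and (True or c),    # a stuck at 1
}


def killed_mutants(suite):
    """Return the names of mutants distinguished from the original by the suite."""
    killed = set()
    for name, mutant in MUTANTS.items():
        if any(bool(mutant(*t)) != bool(original(*t)) for t in suite):
            killed.add(name)
    return killed


def mutation_score(suite):
    """Percentage of seeded mutants killed by the suite."""
    return 100.0 * len(killed_mutants(suite)) / len(MUTANTS)


# Two hypothetical test suites over (a, b, c).
weak_suite = [(True, True, True), (False, False, False)]
strong_suite = [(True, False, False), (False, False, True), (False, True, True)]

print(sorted(killed_mutants(weak_suite)), mutation_score(weak_suite))
print(sorted(killed_mutants(strong_suite)), mutation_score(strong_suite))
```

The weak suite leaves four of the five mutants live, which is the kind of inadequacy the live-mutant count is meant to expose; the stronger three-case suite kills all five.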
Faults and their brief illustration for the Boolean expression (a + b)(a + c) are given in Table 6.

Fault Type          Example
ORF                 (ab)(a + c)
ENF                 (a + b)(a + c)
VNF                 (a + b)(a + c)
ASF                 a + (ba) + c
MVF                 b(a + c)
VRF                 (a + a)(a + c)
CCF                 (aa + b)(a + c)
CDF                 (a + c + b)(a + c)
SA0 (for a = 0)     bc
SA1 (for a = 0)     0

Table 6: Fault Class and Mutant example(s) for the Expression: (a + b)(a + c).

We performed our analysis by running the test cases of each technique separately on the mutants and recording whether a seeded fault is killed by any of the test cases created. Our analysis is based on the following:

1. Whether a test case of a specific technique has been able to kill the mutant.
2. What types of faults are captured by a specific technique.

4. Results

The size of the test suite generated for the various expressions of Table 5 is given in Table 7.

Exp No      # of Var     No. of Test Cases
(Table 5)   in exp.      BOR    Elmendorf    RCDC    MCDC
1           4            5      10           9       6
2           3            4      4            8       6
3           4            5      13           11      8
4           5            6      22           17      7
5           7            8      128          19      10
6           11           9      1620         38      16
7           10           8      540          34      13
8           12           10     1322         36      13
Total                    55     3659         172     79

Table 7: Number of Test Cases Generated.

Due to the volume of data, only summary results are reported. The results with respect to the number of mutants killed by the various test suites are summarized in Table 8.

                            %age of Fault Detected
Type of Fault               BOR            Elmendorf      RCDC           MCDC
(Seeded)                    #Test case     #Test case     #Test case     #Test case
                            (55)           (3659)         (172)          (79)
ORF(54)                     100.0          100.0          100.0          100.0
ENF(22)                     100.0          100.0          100.0          100.0
VNF(60)                     100.0          100.0          100.0          100.0
ASF(70)                     47.1           100.0          80.0           74.3
MVF(59)                     96.6           100.0          98.3           100.0
VRF(343)                    89.2           99.4           94.8           95.9
CCF(227)                    44.9           100.0          70.9           70.9
CDF(322)                    93.8           99.4           91.9           91.9
SA0(61)                     96.6           100.0          100.0          100.0
SA1(61)                     100.0          100.0          100.0          100.0
Total fault seeded (1273)   1050           1269           1148           1149
%age of Fault Detected      82.5           99.7           90.2           90.3

Table 8: Summary Data for the Experiments Conducted.

The analysis of the results allows us to infer the following:

1. The cardinality of the test suite(s) is ordered as follows:
   BOR ≤ MCDC ≤ RCDC ≤ ELMENDORF
2. The cardinalities of the test suites generated by BOR, MCDC and RCDC are comparable, but on average Elmendorf's method generates test suites that are an order of magnitude larger.
3. All four techniques demonstrate complete fault detection capability for ORF, ENF, VNF, and SA1. Thus, if it is anticipated that only these faults can occur, then BOR can be safely used (for singular expressions).
4. For ASF and CCF, only Elmendorf's method demonstrates complete fault detection. For ASF, RCDC shows the next best detection capability (80%), followed by MCDC (74.3%), with BOR showing the worst behaviour (47.1%). For CCF, RCDC and MCDC show an equal detection percentage (70.9%), while BOR's behaviour is very poor (44.9%).
5. For MVF, Elmendorf and MCDC show complete fault detection. BOR and RCDC capture more than 95% of the Missing Variable Faults.
6. For VRF, no technique could detect 100% of the faults, but Elmendorf's method has the best behaviour, followed by MCDC, then by RCDC and lastly BOR. All techniques except BOR demonstrate above 90% fault detection.
7. For CDF, no technique could detect 100% of the faults, but Elmendorf's method has the best behaviour, followed by BOR, then by MCDC and RCDC. All four techniques demonstrate above 90% fault detection.
8. For the Stuck-at-0 fault, all the techniques except BOR could detect 100% of the faults. This failure of BOR is attributed to the non-singular expression at no. 7 (Table 5).

5. Discussions and Conclusion

An empirical evaluation of the BOR, Elmendorf's Strategy, MCDC and RCDC techniques using a fault-based approach has been performed. We have used Boolean expressions from the literature ranging from 3 variables to 12 variables. The performance and effectiveness of the various testing techniques have been assessed based on mutation analysis. Mutants were generated for operator faults (ORF, ENF, VNF, ASF) and operand faults (MVF, VRF, CCF, CDF, SA0, SA1).

The performance of Elmendorf's method has been in the range of 99% to 100% for all the fault classes. One factor explaining the "good" performance of Elmendorf's method is the large size of the test suite created.

The BOR technique was originally designed for the detection of Boolean operator faults (incorrect AND/OR operators and missing or extra negation operators); therefore, it does not guarantee the detection of other faults. The other limitation of the BOR technique is that it is suitable only for singular expressions and performs poorly in cases where the expression has a coupling effect. The test suite for a non-singular expression may be null as an effect of strong coupling among the variables of the expression. Due to these limitations, the performance of BOR is only 82.5% in our experiments. The effectiveness of BOR is poor, less than 50%, for ASF and CCF in general. For singular expressions, however, BOR is very effective in the detection of faults. Thus, it is felt that an approach that extends BOR to non-singular expressions, and to the fault types not guaranteed to be detected by the BOR test set, may be worthwhile to explore.

The performance of MCDC (90.3%) and RCDC (90.2%) is much better than that of BOR over the full set of faults. The size of the test suite is also comparable to that of BOR. The performance of MCDC degenerates in cases where the coupling amongst the variables increases. In such cases, a pair for a condition cannot be developed; hence, the fault remains live. RCDC reduces this limitation of MCDC by adding test cases for insensitive neighbors, i.e. cases where a change in a condition does not affect the decision. This increases the possibility of catching more ASF errors, though there is a slight decline in the effectiveness of catching MVF and VRF.

One surprising fact that emerges is that none of these techniques guarantees the detection of all the faults seeded. In the experiment design, an attempt was made to perform exhaustive single fault seeding to remove the tester's bias in the experiments. One of the reasons for being skeptical of exhaustive fault seeding is that there is no reported result in the literature (to the knowledge of the authors) for the number of possible mutants for an expression in a fault family, and there is no methodology for the generation of the exhaustive mutant set.

References
[01] B. Beizer. Software Testing Techniques. Van Nostrand Reinhold, Inc., New York, 2nd edition, 1990.
[02] J. J. Chilenski. An Investigation of Three Forms of the Modified Condition Decision Coverage (MCDC) Criterion. Technical Report DOT/FAA/AR-01/18, U.S. Department of Transportation, Federal Aviation Administration, April 2001.
[03] J. J. Chilenski and S. Miller. Applicability of Modified Condition/Decision Coverage to Software Testing. Software Engineering Journal, 9(5):193–200, September 1994.
[04] W. R. Elmendorf. Cause-Effect Graphs in Functional Testing. TR-00.2487, IBM Systems Development Division, Poughkeepsie, NY, 1973.
[05] P. G. Frankl and E. Weyuker. A Formal Analysis of the Fault-Detecting Ability of Testing Methods. IEEE Transactions on Software Engineering, 19(3):202–213, March 1993.
[06] M. J. Harrold, A. J. Offutt, and K. Tewary. An Approach to Fault Modeling and Fault Seeding Using the Program Dependence Graph. Journal of Systems and Software, 36(3):273–296, March 1997.
[07] J. A. Jones and M. J. Harrold. Test-Suite Reduction and Prioritization for Modified Condition/Decision Coverage. In International Conference on Software Maintenance (ICSM), pages 92–101. IEEE, November 2001.
[08] J. A. Jones and M. J. Harrold. Test-Suite Reduction and Prioritization for Modified Condition/Decision Coverage. IEEE Transactions on Software Engineering, 29(3):195–209, March 2003.
[09] K. Kapoor and J. P. Bowen. Experimental Evaluation of the Tolerance for Control-Flow Test Criteria. In 2nd UK Softest Workshop, University of York, UK, September 2003.
[10] K. Kapoor and J. P. Bowen. Experimental Evaluation of the Variation in Effectiveness for DC, FPC, MC/DC Test Criteria. In International Symposium on Empirical Software Engineering (ISESE), pages 185–194, September 2003.
[11] K. Kapoor. Stability of Test Criteria and Fault Hierarchies in Software Testing. PhD Thesis, London South Bank University, August 2004.
[12] G. Myers. The Art of Software Testing. Wiley-Interscience, 1979.
[13] A. J. Offutt. Investigations of the Software Testing Coupling Effect. ACM Transactions on Software Engineering and Methodology, 1(1):5–20, January 1992.
[14] A. Paradkar and K. C. Tai. Test-Generation for Boolean Expressions. In Sixth International Symposium on Software Reliability Engineering (ISSRE), pages 106–115, 1995.
[15] A. Paradkar, K. C. Tai, and M. A. Vouk. Automatic Test-Generation for Predicates. IEEE Transactions on Reliability, 45(4):515–530, December 1996.
[16] K. C. Tai. Theory of Fault-based Predicate Testing for Computer Programs. IEEE Transactions on Software Engineering, 22(8):552–562, August 1996.
[17] K. C. Tai. Theory of Fault Based Predicate Testing for Computer Programs. IEEE Transactions on Software Engineering, 22(8):552–562, 1996.
[18] K. C. Tai, M. A. Vouk, A. Paradkar, and P. Lu. Predicate Based Testing. IBM Systems Journal, 33(3):445, 1994.
[19] K. C. Tai, M. A. Vouk, A. Paradkar, and P. Lu. Evaluation of a Predicate-based Software Testing Strategy. IBM Systems Journal, 33(3):445–457, 1994.
[20] T. Tsuchiya and T. Kikuno. On Fault Classes and Error Detection Capability of Specification-based Testing. ACM Transactions on Software Engineering and Methodology, 11(1):58–62, January 2002.
[21] S. A. Vilkomir and J. P. Bowen. Reinforced Condition/Decision Coverage (RC/DC): A New Criterion for Software Testing. In D. Bert, J. P. Bowen, M. Henson, and K. Robinson, editors, 2nd International Conference, Formal Specification and Development in Z and B, volume 2272 of Lecture Notes in Computer Science, pages 295–313. Springer-Verlag, January 2002.
[22] S. A. Vilkomir, K. Kapoor, and J. P. Bowen. Tolerance of Control-Flow Testing Criteria. In 27th International Computer Software and Applications Conference (COMPSAC), pages 182–187. IEEE Computer Society, November 2003.
[23] M. A. Vouk, K. C. Tai, and A. Paradkar. Empirical Studies of Predicate-based Software Testing. In 5th International Symposium on Software Reliability Engineering, pages 55–64. IEEE, 1994.
[24] E. Weyuker, T. Goradia, and A. Singh. Automatically Generating Test Data from a Boolean Specification. IEEE Transactions on Software Engineering, 20(5):353–363, May 1994.
[25] A. White. Comments on Modified Condition/Decision Coverage for Software Testing. In IEEE Aerospace Conference, Big Sky, Montana, USA, volume 6, pages 2821–2828. IEEE, March 2001.