DEPARTMENT OF COMPUTER
SCIENCE AND ENGINEERING (UG & PG)
Final Year Computer
Science and Engineering, 7th Semester
Question and Answer
Subject Code & Name: Software Testing
UNIT – I
1. What is the Purpose of
Testing?
The purpose of testing is to execute or evaluate programs or systems that do the following:
· Measure the results
against the requirements
· Document the difference
between the expected and actual results
· Assist in resolving
those differences by providing the proper debug aids
2. What is the purpose of testing? (Boris Beizer's point of view)
There’s an attitudinal progression characterized by the following five phases:
· PHASE 0—There’s
no difference between testing and debugging. Other than in support of
debugging, testing has no
purpose.
· PHASE 1—The
purpose of testing is to show that the software works.
· PHASE 2—The
purpose of testing is to show that the software doesn’t work.
· PHASE 3—The
purpose of testing is not to prove anything, but to reduce the perceived risk
of
not working to an
acceptable value.
· PHASE 4—Testing is not an act. It is a mental discipline that results in low-risk software without much testing effort.
3. Give some of the
Testing Purpose Examples.
· Uncovering defects and finding important problems
· Assessing quality and risk
· Certifying to standards
· Fulfilling process mandates
· Blocking premature releases
· Minimizing safety-related lawsuit risks
· Minimizing technical support costs
· Maximizing efficiency
· Verifying correctness
· Assessing conformance to specifications or regulations
4. State the Pesticide Paradox and the Complexity Barrier laws of software testing.
First Law: The Pesticide
Paradox—Every
method you use to prevent or find bugs leaves a residue of
subtler bugs against which
those methods are ineffectual.
That’s not too bad, you
say, because at least the software gets better and better. Not quite!
Second Law: The Complexity
Barrier—Software
complexity (and therefore that of bugs) grows to the limits of our ability to
manage that complexity.
5. What do you mean by Dichotomy?
Division into two mutually
exclusive, opposed, or contradictory groups: a dichotomy between thought and
action.
6. Differentiate Testing
and Debugging.
The purpose of testing is to show that a program has bugs. The purpose of debugging is to find the error or misconception that led to the program’s failure and to design and implement the program changes that correct the error. Debugging usually follows testing, but they differ as to goals, methods, and, most importantly, psychology: testing starts with known conditions, uses
predefined procedures, and
has predictable outcomes; only whether or not the program passes the test is
unpredictable. Debugging starts from possibly unknown initial conditions, and
the end cannot be predicted, except statistically. Testing can and should be
planned, designed,
and scheduled. The
procedures for, and duration of, debugging cannot be so constrained.
Testing is a demonstration
of error or apparent correctness. Debugging is a deductive process.
Testing proves a programmer’s failure; debugging is the programmer’s vindication. Testing, as executed, should strive to be predictable, dull, constrained, rigid, and inhuman. Debugging demands intuitive leaps, conjectures, experimentation, and freedom. Much of testing can be done without design knowledge; debugging is impossible without detailed design knowledge. Testing can often be done by an outsider; debugging must be done by an insider. There is a robust theory of testing that establishes theoretical limits to what testing can and can’t do, whereas debugging has only recently been attacked by theorists, and so far there are only rudimentary results. Much of test execution and design can be automated; automated debugging is still a dream.
7. Differentiate Function
versus Structure
In functional testing, the program or system is treated as a black box. It is subjected to inputs, and its outputs are verified for conformance to specified behavior. The software’s user should be concerned only with functionality and features; the program’s implementation details should not matter. Functional testing takes the user’s point of view.
Structural testing does look at the
implementation details. Such things as programming style, control method,
source language, database design, and coding details dominate structural
testing; but the boundary between function and structure is fuzzy. Good systems
are built in layers—from the outside to the inside. The user sees only the
outermost layer, the layer of pure
function. Each layer inward
is less related to the system’s functions and more constrained by its
structure: so what is
structure to one layer is function to the next.
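The distinction above can be made concrete with a small sketch. The routine is invented for illustration: the functional tests check only specified input/output behavior, while the structural tests are chosen from the implementation so that each branch is exercised at least once.

```python
def absolute(x):
    """A hypothetical routine under test."""
    if x < 0:
        return -x
    return x

# Functional (black-box) view: only specified input/output behavior matters.
assert absolute(-5) == 5 and absolute(3) == 3

# Structural (white-box) view: tests chosen from the implementation so that
# each branch is exercised at least once.
assert absolute(-1) == 1    # exercises the x < 0 branch
assert absolute(0) == 0     # exercises the fall-through branch
```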
8. What do you mean by Bugs? (Detail)
Bugs
are more insidious than ever we expect them to be. Yet it is convenient to
categorize them: initializations, call sequence, wrong variable, and so on. Our
notion of what is or isn’t a bug varies. A bad specification may lead us to
mistake good behavior for bugs, and vice versa. An unexpected test result may
lead us to change our notion of what a bug is—that is to say, our model of
bugs. If you hold any of the following beliefs, then disabuse yourself of them
because as long as you believe in such things you will be unable to test
effectively and unable to justify the dirty tests most programs need.
Benign Bug Hypothesis—The belief that bugs are
nice, tame, and logical. Only weak bugs
have a logic to them and
are amenable to exposure by strictly logical means. Subtle bugs have
no definable pattern—they
are wild cards.
Bug Locality Hypothesis—The belief that a bug
discovered within a component affects only
that component’s behavior;
that because of structure, language syntax, and data organization,
the symptoms of a bug are
localized to the component’s designed domain. Only weak bugs are
so localized. Subtle bugs
have consequences that are arbitrarily far removed from the cause in
time and/or space from the
component in which they exist.
Control Bug Dominance—The belief that errors in
the control structure of programs dominate
the bugs. While many easy
bugs, especially in components, can be traced to control-flow errors, data-flow
and data-structure errors are as common. Subtle bugs that violate data-structure boundaries and data/code separation can’t be found by looking only at control structures.
Code/Data Separation—The belief, especially in
HOL programming, that bugs respect the separation of code and data. Furthermore, in real systems the distinction between code and data can be hard to make, and it is exactly that blurred distinction that permits such bugs to exist.
9. What do you mean by Tests?
Tests
are formal procedures. Inputs must be prepared, outcomes predicted, tests
documented, commands executed, and results observed; all these steps are
subject to error. There is nothing magical about testing and test design that
immunizes testers against bugs. An unexpected test result is as often caused by a test bug as it is by a real bug. Bugs can creep into the
documentation, the inputs, and the commands and becloud our observation of
results. An unexpected test result, therefore, may lead us to revise the tests.
Because the tests are themselves in an environment, we also have a mental model
of the tests, and instead of revising the tests, we may have to revise that
mental model.
10. Different kinds of
testing.
Unit, Unit Testing—A unit is the smallest testable piece of software, by
which I mean that it can be compiled or assembled, linked, loaded, and put
under the control of a test harness or driver. A unit is
usually the work of one programmer, and it consists of several hundred or fewer lines of source code. Unit testing is the testing we do to show that the
unit does not satisfy its functional specification and/or that its implemented
structure does not match the intended design structure. When our tests reveal
such faults, we say that there is a unit bug.
Component, Component Testing—A component is an integrated
aggregate of one or more units. A unit is a component, a component with
subroutines it calls is a component, etc. By this (recursive) definition, a
component can be anything from a unit to an entire system. Component testing
is the testing we do to show that the component does not satisfy its functional
specification and/or that its implemented structure does not match the intended
design structure. When our tests reveal such problems, we say that there is a component
bug.
Integration, Integration
Testing—Integration
is a process by which components are aggregated to create larger
components. Integration testing is testing done to show that even
though the components were
individually satisfactory, as demonstrated by successful passage of component
tests, the combination of components is incorrect or inconsistent. For example, components A and B have both passed their component tests. Integration testing is aimed at showing inconsistencies between A and B. Examples of such
inconsistencies are improper call or return sequences, inconsistent data
validation criteria, and inconsistent handling of data objects. Integration
testing should not be confused with testing integrated objects, which is just higher
level component testing. Integration testing is specifically aimed at exposing
the problems that arise from the combination of components. The sequence, then,
consists of component testing for components A and B, integration testing for
the combination of A and B, and finally, component testing for the “new”
component (A,B).*
System, System Testing—A system is a big
component. System testing is aimed at revealing bugs that cannot be
attributed to components as such, to the inconsistencies between
components, or to the
planned interactions of components and other objects. System testing concerns
issues and behaviors that can only be exposed by testing the entire integrated
system
or a major part of it.
System testing includes testing for performance, security, accountability,
configuration sensitivity,
start-up, and recovery.
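A hedged illustration of a unit "under the control of a test harness or driver", as described above. The `clamp` unit and its test cases are invented for the example; the point is that inputs and expected outcomes are prepared in advance.

```python
def clamp(value, low, high):
    """A hypothetical unit: restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))

# A minimal test driver: inputs prepared and outcomes predicted in advance,
# as formal tests require.
CASES = [
    ((5, 0, 10), 5),     # value already inside the range
    ((-3, 0, 10), 0),    # below the range: raised to low
    ((42, 0, 10), 10),   # above the range: lowered to high
]

def run_unit_tests():
    """Return the list of failing cases; an empty list means the unit passed."""
    return [(args, expected, clamp(*args))
            for args, expected in CASES if clamp(*args) != expected]

print(run_unit_tests())   # []
```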
11. What approaches can be used to demonstrate that a program is correct?
Three
different approaches can be used to demonstrate that a program is correct: tests
based on structure, tests based on function, and formal proofs of correctness.
Each approach leads to the conclusion that complete testing, in the sense of a proof, is neither theoretically nor practically possible.
12. Functional Testing
Functional Testing—Every program operates on
a finite number of inputs. Whatever pragmatic
meaning those inputs might
have, they can always be interpreted as a binary bit stream. A complete functional
test would consist of subjecting the program to all possible input streams. For
each input the routine either accepts the stream and produces a correct
outcome, accepts the stream and produces an incorrect outcome, or rejects the
stream and tells us that it did so. Because the rejection message is itself an
outcome, the problem is reduced to verifying that the
correct outcome is
produced for every input. But a 10-character input string has 2^80 possible input
streams and corresponding outcomes. So complete functional testing in this
sense is impractical.
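The arithmetic behind the impracticality claim can be checked directly; the testing rate below is an illustrative assumption.

```python
# A 10-character input is 80 bits, so a complete functional test would need
# one test per possible bit stream.
BITS_PER_CHAR = 8
NUM_CHARS = 10
possible_inputs = 2 ** (BITS_PER_CHAR * NUM_CHARS)
print(possible_inputs)             # 2**80, roughly 1.2e24 input streams

# Even at an (assumed) billion tests per second, exhausting them would take
# tens of millions of years.
seconds_needed = possible_inputs / 1e9
years_needed = seconds_needed / (3600 * 24 * 365)
print(years_needed > 10_000_000)   # True
```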
13. Structural Testing
Structural Testing—One should design enough
tests to ensure that every path through the routine is exercised at least once.
Right off that’s impossible, because some loops might never terminate. Brush
that problem aside by observing that the universe—including all that’s in it—is
finite. Even so, the
number of paths through a small routine can be awesome because each loop multiplies
the path count by the number of times through the loop. A small routine can
have
millions or billions of
paths, so total path testing is usually impractical, although it can be
done
for some routines. By
doing it we solve the problems of unknown size that we ran into for purely
functional testing; however, it doesn’t solve the problem of preparing a
bug-free input, a bug-free response list, and a bug-free test observation. We
still need those things, because pure structural testing can never assure us
that the routine is doing the right thing.
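A back-of-the-envelope sketch of why path counts explode, under a simplified model (sequential two-way branches, plus one multiplier per loop as the text describes):

```python
def path_count(decisions, loop_limits):
    """Paths through a routine with `decisions` sequential two-way branches
    and one loop per entry in loop_limits, each taken up to n times."""
    count = 2 ** decisions      # each sequential branch doubles the path count
    for n in loop_limits:
        count *= n              # each loop multiplies it by its pass count
    return count

# A small routine: 10 if/else decisions and two loops of up to 100 passes.
print(path_count(10, [100, 100]))   # 10240000 paths
```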
14. Correctness Proofs
Correctness Proofs—Formal proofs of
correctness rely on a combination of functional and structural concepts.
Requirements are stated in a formal language (e.g., mathematics), and each program
statement is examined and used in a step of an inductive proof that the routine
will produce the correct outcome for all possible input sequences. The
practical issue here is that such proofs are very expensive and have been
applied only to numerical routines or to formal proofs for crucial software such
as a system’s security kernel or portions of compilers. But there are
theoretical objections to formal proofs of correctness that go beyond the
practical issues. How do we know that the specification is achievable? Its
consistency and completeness must be proved, and in general, that is a provably
unsolvable problem. Assuming that the specification has been proved correct,
then the mechanism used to prove the program, the steps in the proof, the logic
used, and so on, must be proved (GOOD75). Mathematicians and logicians have no more
immunity to bugs than programmers or testers have. This also leads to
never-ending sequences of unverifiable assumptions.
15. The theoretical
barriers to complete testing.
Manna and Waldinger
(MANN78) have clearly summarized the theoretical barriers to complete
testing:
“We can never be sure that
the specifications are correct.”
“No verification system
can verify every correct program.”
“We can never be certain
that a verification system is correct.”
16. Dependencies of a bug.
The importance of a bug
depends on frequency, correction cost, installation cost, and consequences.
Frequency— how often does that kind
of bug occur? Pay more attention to the more frequent
bug types.
Correction Cost— what does it cost to
correct the bug after it’s been found? That cost is the
sum of two factors: (1)
the cost of discovery and (2) the cost of correction. These costs go up
dramatically the later in
the development cycle the bug is discovered. Correction cost also depends on
system size. The larger the system the more it costs to correct the same bug.
Installation Cost—Installation cost depends on the number of installations: small for a single-user program, but how about a
PC operating system bug? Installation cost can dominate all other costs—fixing
one simple bug and distributing the fix could exceed the entire system’s
development cost.
Consequences— what are the
consequences of the bug? You might measure this by the mean
size of the awards made by
juries to the victims of your bug. A reasonable metric for bug importance is:
importance($) =
frequency*(correction_cost + installation_cost + consequential_cost)
Frequency tends not to
depend on application or environment, but correction, installation, and consequential
costs do. As designers, testers, and QA workers, you must be interested in bug importance,
not raw frequency. Therefore you must create your own importance model. This chapter
will help you do that.
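The metric above can be sketched in code; all of the figures below are invented, simply to show why raw frequency alone misleads.

```python
def importance(frequency, correction_cost, installation_cost, consequential_cost):
    # importance($) = frequency * (correction + installation + consequential)
    return frequency * (correction_cost + installation_cost + consequential_cost)

# Hypothetical bug types: a frequent cosmetic bug vs. a rare, costly one.
cosmetic = importance(frequency=0.30, correction_cost=50,
                      installation_cost=10, consequential_cost=0)
lost_transaction = importance(frequency=0.01, correction_cost=2_000,
                              installation_cost=500, consequential_cost=50_000)

# The rarer bug dominates: about 525 vs. about 18.
print(lost_transaction > cosmetic)   # True
```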
17. What are the consequences of a bug?
If you say only that “bit so-and-so will be set instead of reset,” you’re avoiding responsibility for the bug. Although it may be difficult to do in the scope of a subroutine,
programmers should try to measure the consequences of their bugs in human
terms. Here are some consequences on a scale of one to ten:
1. Mild—The symptoms of the bug
offend us aesthetically; a misspelled output or a misaligned printout.
2. Moderate—Outputs are misleading or
redundant. The bug impacts the system’s performance.
3. Annoying—The system’s behavior,
because of the bug, is dehumanizing. Names are truncated or arbitrarily
modified. Bills for $0.00 are sent. Operators must use unnatural command
sequences and must trick the system into a proper response for unusual
bug-related cases.
4. Disturbing—It refuses to handle
legitimate transactions. The automatic teller machine won’t give you money. My
credit card is declared invalid.
5. Serious—It loses track of
transactions: not just the transaction itself (your paycheck), but the fact
that the transaction occurred. Accountability is lost.
6. Very Serious—Instead of losing your
paycheck, the system credits it to another account or converts deposits into
withdrawals. The bug causes the system to do the wrong transaction.
7. Extreme—The problems aren’t
limited to a few users or to a few transaction types. They are frequent and
arbitrary instead of sporadic or for unusual cases.
8. Intolerable—Long-term, unrecoverable
corruption of the data base occurs and the corruption is not easily discovered.
Serious consideration is given to shutting the system down.
9. Catastrophic—The decision to shut down is taken out of our hands because the system fails.
10. Infectious—What can be worse than a failed system? One that corrupts other systems even though it does not fail in itself; that erodes the social or physical environment; that melts nuclear reactors or starts wars; whose influence, because of malfunction, is far greater than expected; a system that kills.
18. How to measure the
quality?
Quality
can be measured on some scale, say from 0 to 10. Quality can be measured as a
combination of factors, of which the number of bugs and their severity is only
one component. The detail of how this is done is the subject of another book;
but it’s enough to say that many organizations have designed and use satisfactory,
quantitative, quality metrics. Examining these metrics closer, we see that how
the parts are weighted depends on environment, application, culture, and many
other factors. A few of these factors are:
Correction Cost—The cost of correcting a bug has almost nothing to do with
symptom
severity. Catastrophic,
life-threatening bugs could be trivial to fix, whereas minor annoyances
could require major
rewrites to correct.
Context and Application Dependency—The severity of a bug,
for the same bug with the same symptoms, depends on context. For example, a
roundoff error in an orbit calculation doesn’t mean much in a spaceship video
game but it matters to real astronauts.
Creating Culture Dependency—What’s important depends on the creators of
the software and their cultural aspirations. Test tool vendors are more
sensitive about bugs in their products than, say, games software vendors.
User Culture Dependency—What’s important depends on the user culture. An R&D shop might
accept a bug for which there’s a workaround; a banker would go to jail for that
same bug; and naive users of PC software go crazy over bugs that pros ignore.
The Software Development
Phase—Severity
depends on development phase. Any bug gets more severe as it gets closer to
field use and more severe the longer it’s been around—more severe because of
the dramatic rise in correction cost with time. Also, what’s a trivial or
subtle bug to the designer means little to the maintenance programmer for whom
all bugs are equally mysterious.
19. How should you go
about quantifying the nightmare?
Here’s a workable
procedure:
1. List your worst software
nightmares. State them in terms of the symptoms they produce and
how your user will react
to those symptoms. For end users and the population at large, the categories of
Section 2.2 above are a starting point. For programmers the nightmare may be closer
to home, such as: “I might get a bad personal performance rating.”
2. Convert the consequences
of each nightmare into a cost. Usually, this is a labor cost for correcting the
nightmare, but if your scope extends to the public, it could be the cost of
lawsuits, lost business, or nuclear reactor meltdowns.
3. Order the list from the
costliest to the cheapest and then discard the low-concern nightmares with
which you can live.
4. Based on your experience,
measured data (the best source to use), intuition, and published statistics
postulate the kinds of bugs that are likely to create the symptoms expressed by
each nightmare. Don’t go too deep because most bugs are easy. This is a bug
design process. If you can “design” the bug by a one-character or one statement
change, then it’s a good target. If it takes hours of sneaky thinking to
characterize the bug, then either it’s an unlikely bug or you’re worried about
a saboteur in your organization, which could be appropriate in some cases. Most bugs are simple goofs once you find and
understand them.
5. For each nightmare, then,
you’ve developed a list of possible causative bugs. Order that list by
decreasing probability. Judge the probability based on your own bug statistics,
intuition, experience, etc. The same bug type will appear in different
nightmares. The importance of a bug type is calculated by multiplying the expected cost of each nightmare by the probability that the bug causes it and summing across all nightmares.
6. Rank the bug types in
order of decreasing importance to you.
7. Design tests (based on
your knowledge of test techniques) and design your quality assurance
inspection process by
using the methods that are most effective against the most important bugs.
8. If a test is passed, then
some nightmares or parts of them go away. If a test is failed, then a nightmare
is possible, but upon correcting the bug, it too goes away. Testing, then,
gives you information you can use to revise your estimated nightmare
probabilities. As you test, revise the probabilities and reorder the nightmare
list. Taking whatever information you get from testing and working it back
through the exercise leads you to revise your subsequent test strategy, either on
this project if it’s big enough or long enough, or on subsequent projects.
9. Stop testing when the
probability of all nightmares has been shown to be inconsequential as a result
of hard evidence produced by testing.
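Steps 4 through 6 above can be sketched as a small calculation. Every nightmare, bug type, cost, and probability below is hypothetical:

```python
# Nightmare -> estimated cost if it occurs (step 2).
nightmares = {
    "wrong paycheck issued": 100_000,
    "audit trail lost": 250_000,
}

# Bug type -> probability it produces each nightmare (steps 4-5).
bug_probabilities = {
    "off-by-one in pay period": {"wrong paycheck issued": 0.05,
                                 "audit trail lost": 0.01},
    "uninitialized record field": {"wrong paycheck issued": 0.02,
                                   "audit trail lost": 0.04},
}

def bug_importance(bug_type):
    # Step 5: sum of nightmare cost * probability across all nightmares.
    probs = bug_probabilities[bug_type]
    return sum(cost * probs.get(name, 0.0) for name, cost in nightmares.items())

# Step 6: rank bug types by decreasing importance.
ranking = sorted(bug_probabilities, key=bug_importance, reverse=True)
print(ranking[0])   # the uninitialized-field bug ranks first in this example
```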
20. List out some of the
functional testing techniques.
Most functional test techniques—that is, those techniques that are based on a behavioral description of software, such as transaction-flow testing, syntax testing, domain testing, logic testing, and state testing—are useful in testing functional bugs.
21. What do you mean by Control and Sequence Bugs?
Control
and sequence bugs include paths left out, unreachable code, improper nesting of
loops, loop-back or loop-termination criteria incorrect, missing process steps,
duplicated processing, unnecessary processing, rampaging GOTO’s, ill-conceived
switches, spaghetti code, and worst of all, pachinko code.
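Two of the bug classes named above (an incorrect loop-termination criterion and the unreachable code it creates), sketched in an invented routine:

```python
def find_index(items, target):
    """Return the index of target in items, else -1 (buggy version)."""
    i = 0
    while i <= len(items):       # BUG: loop-termination criterion off by one
        if items[i] == target:   # raises IndexError once i == len(items)
            return i
        i += 1
    return -1                    # unreachable: the loop never exits normally

def find_index_fixed(items, target):
    """Corrected loop termination makes the -1 path reachable."""
    i = 0
    while i < len(items):
        if items[i] == target:
            return i
        i += 1
    return -1
```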
22. What do you mean by Logic Bugs?
Bugs
in logic, especially those related to misunderstanding how case statements and
logic operators behave singly and in combinations, include nonexistent cases,
improper layout of cases, “impossible” cases that are not impossible, a
“don’t-care” case that matters, improper negation of a boolean expression (for
example, using “greater than” as the negation of “less than”), improper
simplification and combination of cases, overlap of exclusive cases, confusing
“exclusive OR” with
“inclusive OR.”
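The improper-negation example in the list is easy to demonstrate; the functions are invented for illustration:

```python
# "Not less than" written as "greater than" silently drops the a == b case.
def not_less_buggy(a, b):
    return a > b             # BUG: returns False when a == b

def not_less_correct(a, b):
    return a >= b            # the true negation of a < b

print(not_less_buggy(5, 5))    # False, although 5 is certainly not less than 5
print(not_less_correct(5, 5))  # True
```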
23. Processing Bugs
Processing
bugs include arithmetic bugs, algebraic, mathematical function evaluation, algorithm
selection, and general processing. Many problems in this area are related to
incorrect conversion from one data representation to another. This is
especially true in assembly language programming. Other problems include
ignoring overflow, ignoring the difference between positive and negative zero,
improper use of greater-than, greater-than-or-equal, less-than, less-than-or-equal, assumption of equality to zero in floating point, and improper comparison between different formats, as in ASCII to binary or integer to floating point.
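The floating-point item in the list, demonstrated directly:

```python
# A value that is mathematically zero need not be exactly zero in binary
# floating point.
x = 0.1 + 0.2 - 0.3
print(x == 0.0)          # False: x is about 5.55e-17

# Remedy: compare against a tolerance rather than assuming equality to zero.
EPSILON = 1e-9
print(abs(x) < EPSILON)  # True
```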
24. Initialization Bugs
Initialization
bugs are common, and experienced programmers and testers know they must look for
them. Both improper and superfluous initialization occur. The latter tends to
be less harmful but can affect performance. Typical bugs are as follows:
forgetting to initialize working space, registers, or data areas before first
use or assuming that they are initialized elsewhere; a bug in the first value
of a loop-control parameter; accepting an initial value without a validation
check; and initializing to the wrong format, data representation, or type.
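A minimal sketch of the first bug in the list (forgetting to initialize working space before first use), alongside its fix:

```python
def running_total_buggy(values):
    for v in values:
        total += v           # BUG: working storage used before initialization
    return total

def running_total_fixed(values):
    total = 0                # initialize working space before first use
    for v in values:
        total += v
    return total

print(running_total_fixed([1, 2, 3]))   # 6
```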
25. Data-Flow Bugs and
Anomalies
Most
initialization bugs are a special case of data-flow anomalies. A data-flow
anomaly occurs when there is a path along which we expect to do something
unreasonable with data, such as using an uninitialized variable, attempting to
use a variable before it exists, modifying data and then not storing or using
the result, or initializing twice without an intermediate use. Although part of
data-flow anomaly detection can be done by the compiler based on information
known at compile time, much can be detected only by execution and therefore is
a subject for testing. It is generally recognized today that data-flow
anomalies are as important as control-flow anomalies
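A sketch of two of the anomalies described above. Note that the second surfaces only on one execution path, which is why it is a subject for testing rather than for compile-time detection:

```python
def anomalies(flag):
    a = 10           # define a
    a = 20           # anomaly: redefined with no intermediate use
    if flag:
        b = a * 2    # b is defined on this path only
    return b         # anomaly: on the flag == False path, b is used before it exists

print(anomalies(True))   # 40: the anomalous path was not exercised
try:
    anomalies(False)     # the untested path fails at run time
except UnboundLocalError:
    print("use before definition detected only on this path")
```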
26. Data bugs
Data
bugs include all bugs that arise from the specification of data objects, their
formats, the number of such objects, and their initial values. Data bugs are at
least as common as bugs in code, but they are often treated as if they did not
exist at all. Underestimating the frequency of
data bugs is caused by
poor bug accounting. In some projects, bugs in data declarations are just not
counted, and for that matter, data declaration statements are not counted as
part of the code. The categories used for data bugs are different from those
used for code bugs. Each way of looking at data provides a different
perspective. These categories for data bugs overlap and are no stricter than
the categories used for bugs in code.
· Dynamic versus Static
· Information, Parameter, and Control
· Contents, Structure, and Attributes
27. Data specifications
consist of three parts:
Contents—The actual bit pattern,
character string, or number put into a data structure. Content is a pure bit
pattern and has no meaning unless it is interpreted by a hardware or software processor.
All data bugs result in the corruption or misinterpretation of content.
Structure—The size and shape and
numbers that describe the data object, that is, the memory
locations used to store
the content (e.g., 16 characters aligned on a word boundary, 122 blocks
of 83 characters each,
bits 4 through 14 of word 17). Structures can have substructures and can
be arranged into
superstructures. A hunk of memory may have several different structures defined
over it—e.g., a two-dimensional array treated elsewhere as N one-dimensional arrays.
Attributes—The specification of meaning, that is, the semantics associated with the contents of
associated with the contents of
a data object (e.g., an
integer, an alphanumeric string, a subroutine).
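The three parts can be illustrated with Python's struct module; the record layout below is invented for the example.

```python
import struct

# Structure: 16 bytes laid out as a 4-byte integer, an 8-byte float,
# and a 4-character field (little-endian, no padding).
LAYOUT = "<id4s"

# Contents: the actual bit pattern produced for particular values.
raw = struct.pack(LAYOUT, 17, 98.6, b"ACCT")
print(len(raw))   # 16 bytes of pure bit pattern, meaningless in themselves

# Attributes: the meaning we assign when interpreting the same bits.
record_id, temperature, account_code = struct.unpack(LAYOUT, raw)
```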
28. What are the remedies
for test bugs?
The
remedies for test bugs are: test debugging, test quality assurance, test
execution automation, and test design automation.
Test Debugging—The first remedy for test
bugs is testing and debugging the tests. The differences between test debugging
and program debugging are not fundamental. Test debugging is usually easier
because tests, when properly designed, are simpler than programs and do not
have to make concessions to efficiency. Also, tests tend to have a localized
impact relative to other tests, and therefore the complicated interactions that
usually plague software designers are less frequent. We have no magic
prescriptions for test debugging—no more than we have for software debugging.
Test Quality Assurance—Programmers have the
right to ask how quality in independent testing and test design is monitored.
Should we implement test testers and test-tester tests? This sequence does not
converge. Methods for test quality assurance are discussed in Software System
Testing and Quality Assurance (BEIZ84).
Test Execution Automation—The history of software
bug removal and prevention is indistinguishable from the history of programming
automation aids. Assemblers, loaders, compilers, and the like were all
developed to reduce the incidence of programmer and/or operator errors. Test
execution bugs are virtually eliminated by various test execution automation
tools, many of which are discussed throughout this book. The point is that “manual
testing” is
self-contradictory. If you want to get rid of test execution bugs, get rid of
manual execution.
Test Design Automation—Just as much of software
development has been automated (what is a compiler, after all?) much test
design can be and has been automated. For a given productivity rate, automation
reduces bug count—be it for software or be it for tests.
29. Test Case
IEEE Standard 610 (1990)
defines test case as follows:
“(1) A set of test inputs,
execution conditions, and expected results developed for a particular objective,
such as to exercise a particular program path or to verify compliance with a
specific requirement.
“(2) (IEEE Std 829-1983)
Documentation specifying inputs, predicted results, and a set of execution
conditions for a test item.” Boris Beizer (1995, p. 3) defines a test as “A
sequence of one or more subtests executed as a sequence because the outcome
and/or final state of one subtest is the input and/or initial state of the
next. The word ‘test’ is used to include subtests, tests proper, and test suites.”
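The IEEE 610 definition quoted above names three ingredients: inputs, execution conditions, and expected results, developed for a particular objective. The record structure and unit below are an illustrative convention, not part of the standard.

```python
# Hypothetical unit under test:
def withdraw(balance, amount):
    return "rejected" if amount > balance else "accepted"

# One test case: inputs, execution conditions, and expected results,
# developed for a particular objective.
test_case = {
    "objective": "verify a withdrawal exceeding the balance is rejected",
    "execution_conditions": {"account_balance": 100},
    "inputs": {"withdrawal_amount": 150},
    "expected_result": "rejected",
}

def run(case, unit):
    balance = case["execution_conditions"]["account_balance"]
    actual = unit(balance, case["inputs"]["withdrawal_amount"])
    return "pass" if actual == case["expected_result"] else "fail"

print(run(test_case, withdraw))   # pass
```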
30. Some recent major
computer system failures caused by software bugs
· Software problems in the
automated baggage sorting system of a major airport in February 2008
prevented thousands of
passengers from checking baggage for their flights. It was reported that
the breakdown occurred
during a software upgrade, despite pre-testing of the software. The system
continued to have problems in subsequent months.
· News reports in December
of 2007 indicated that significant software problems were continuing to occur
in a new ERP payroll system for a large urban school system. It was believed
that more than one third of employees had received incorrect paychecks at
various times since the new system went live the preceding January, resulting
in overpayments of $53 million, as well as underpayments. An employees' union
brought a lawsuit against the school system, the cost of the ERP system was
expected to rise by 40%, and the non-payroll part of the ERP system was delayed.
Inadequate testing reportedly contributed to the problems.
· In November of 2007 a
regional government reportedly brought a multi-million dollar lawsuit against a
software services vendor, claiming that the vendor 'minimized quality' in
delivering software for a large criminal justice information system and the
system did not meet requirements. The vendor also sued its subcontractor on the
project.
· In June of 2007 news
reports claimed that software flaws in a popular online stock-picking contest
could be used to gain an unfair advantage in pursuit of the game's large cash
prizes. Outside investigators were called in and in July the contest winner was
announced. Reportedly the winner had previously been in 6th place, indicating
that the top 5 contestants may have been disqualified.
· A software problem
contributed to a rail car fire in a major underground metro system in April
of 2007 according to
newspaper accounts. The software reportedly failed to perform as expected in
detecting and preventing excess power usage in equipment on a new passenger railcar, resulting in overheating and fire in
the rail car, and evacuation and shutdown of part of the system.
· Tens of thousands of medical devices were recalled in March of 2007 to correct a software bug. According to news reports,
the software would not reliably indicate when available power to the
device was too low.
· A September 2006 news report indicated problems with software used in a state government's primary election, resulting in periodic unexpected rebooting of voter check-in machines, which were separate from the electronic voting machines, and causing confusion and delays at voting sites. The problem was reportedly due to insufficient testing.
· In August of 2006 a U.S.
government student loan service erroneously made public the personal data of as
many as 21,000 borrowers on its web site, due to a software error. The bug was
fixed and the government department subsequently offered to arrange for free
credit monitoring services for those affected.
· A software error
reportedly resulted in overbilling of up to several thousand dollars to each of
11,000 customers of a major telecommunications company in June of 2006. It was
reported that
the software bug was fixed
within days, but that correcting the billing errors would take much longer.
· News reports in May of
2006 described a multi-million dollar lawsuit settlement paid by a healthcare
software vendor to one of its customers. It was reported that the customer
claimed there were problems with the software they had contracted for,
including poor integration of software modules, and problems that resulted in
missing or incorrect data used by medical personnel.
· In early 2006 problems
in a government's financial monitoring software resulted in incorrect election
candidate financial reports being made available to the public. The
government's election finance reporting web site had to be shut down until the
software was repaired.
· Trading on a major Asian stock exchange was brought to
a halt in November of 2005, reportedly due to an error in a system software
upgrade. The problem was rectified and trading resumed later the same day.
· A May 2005 newspaper
article reported that a major hybrid car manufacturer had to install a software
fix on 20,000 vehicles due to problems with invalid engine warning lights and occasional
stalling. In the article, an automotive software specialist indicated that the automobile
industry spends $2 billion to $3 billion per year fixing software problems.
· Media reports in January
of 2005 detailed severe problems with a $170 million high-profile U.S.
government IT systems project. Software testing was one of the five major
problem areas
according to a report of
the commission reviewing the project. In March of 2005 it was decided to scrap
the entire project.
· In July 2004 newspapers
reported that a new government welfare management system in Canada costing
several hundred million dollars was unable to handle a simple benefits rate increase
after being put into live operation. Reportedly the original contract allowed
for only 6 weeks of acceptance testing and the system was never tested for its
ability to handle a rate increase.
· Millions of bank
accounts were impacted by errors due to installation of inadequately tested software
code in the transaction processing system of a major North American bank,
according to mid-2004 news reports. Articles about the incident stated that it
took two weeks to fix all the resulting errors, that additional problems
resulted when the incident drew a large number of e-mail phishing attacks against the bank's
customers, and that the total cost of the incident could
exceed $100 million.
· A bug in site management
software utilized by companies with a significant percentage of worldwide web
traffic was reported in May of 2004. The bug resulted in performance problems for
many of the sites simultaneously and required disabling of the software until
the bug was fixed.
· According to news
reports in April of 2004, a software bug was determined to be a major contributor
to the 2003 Northeast blackout, the worst power system failure in North
American history. The failure involved loss of electrical power to 50 million
customers, forced shutdown of 100 power plants, and economic losses estimated
at $6 billion. The bug was reportedly in one utility company's vendor-supplied
power monitoring and management system, which was unable to correctly handle
and report on an unusual confluence of initially localized events. The
error was found and
corrected after examining millions of lines of code.
· In early 2004, news
reports revealed the intentional use of a software bug as a counterespionage tool.
According to the report, in the early 1980's one nation surreptitiously allowed
a hostile nation's espionage service to steal a version of sophisticated
industrial software that had intentionally-added flaws. This eventually
resulted in major industrial disruption in the country that used the stolen
flawed software.
· A major U.S. retailer
was reportedly hit with a large government fine in October of 2003 due to
web site errors that
enabled customers to view one another's online orders.
· News stories in the fall
of 2003 stated that a manufacturing company recalled all their transportation
products in order to fix a software problem causing instability in certain circumstances.
The company found and reported the bug itself and initiated the recall
procedure
in which a software
upgrade fixed the problems.
· In August of 2003 a U.S.
court ruled that a lawsuit against a large online brokerage company could
proceed; the lawsuit reportedly involved claims that the company was not fixing
system problems that sometimes resulted in failed stock trades, based on the
experiences of 4 plaintiffs
during an 8-month period.
A previous lower court's ruling that "...six miscues out of more than 400
trades does not indicate negligence." was invalidated.
· In April of 2003 it was
announced that a large student loan company in the U.S. made a software error
in calculating the monthly payments on 800,000 loans. Although borrowers were
to be notified of an
increase in their required payments, the company would still reportedly lose $8
million in interest. The error was uncovered when borrowers began reporting
inconsistencies in their bills.
· News reports in February
of 2003 revealed that the U.S. Treasury Department mailed 50,000 Social
Security checks without any beneficiary names. A spokesperson indicated that
the missing names were due to an error in a software change. Replacement checks
were subsequently mailed out with the problem corrected, and recipients were
then able to cash their
Social Security checks.
· In March of 2002 it was
reported that software bugs in Britain's national tax system resulted in
more than 100,000
erroneous tax overcharges. The problem was partly attributed to the
difficulty of testing the
integration of multiple systems.
· A newspaper columnist
reported in July 2001 that a serious flaw was found in off-the-shelf software
that had long been used in systems for tracking certain U.S. nuclear materials.
The same software had been recently donated to another country to be used in
tracking their own nuclear materials, and it was not until scientists in that
country discovered the problem, and shared the information, that U.S. officials
became aware of the problems.
· According to newspaper
stories in mid-2001, a major systems development contractor was fired and sued
over problems with a large retirement plan management system. According to the reports,
the client claimed that system deliveries were late, the software had excessive
defects, and it caused other systems to crash.
· In January of 2001
newspapers reported that a major European railroad was hit by the aftereffects
of the Y2K bug. The company found that many of their newer trains would not run
due to their inability to recognize the date '31/12/2000'; the trains were
started by altering the control system's
date settings.
· News reports in
September of 2000 told of a software vendor settling a lawsuit with a large mortgage
lender; the vendor had reportedly delivered an online mortgage processing
system that did not meet specifications, was delivered late, and didn't work.
· In early 2000, major problems were reported with a new computer system in a
large suburban U.S. public school district with 100,000+ students; problems
included 10,000 erroneous report cards and students left stranded by failed
class registration systems; the district's CIO was fired. The school district
decided to reinstate its original 25-year-old system for at least a year until
the bugs were worked out of the new system by the software vendors.
· A review board concluded
that the NASA Mars Polar Lander failed in December 1999 due to software
problems that caused premature shutdown of the Lander's descent engines
during its final approach to the Martian surface.
· In October of 1999 the
$125 million NASA Mars Climate Orbiter spacecraft was believed to be lost in
space due to a simple data conversion error. It was determined that spacecraft
software used certain data in English units that should have been in metric
units. Among other tasks, the orbiter was to serve as a communications relay
for the Mars Polar Lander mission, which failed for unknown reasons in December
1999. Several investigating panels were convened to determine the process
failures that allowed the error to go undetected.
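The unit-mismatch failure mode described above can be sketched in a few lines. This is a hypothetical illustration: the function names and numbers are invented for the example and are not the orbiter's actual flight software.

```python
LBF_S_TO_N_S = 4.44822  # one pound-force second expressed in newton-seconds

def thruster_impulse_lbf_s() -> float:
    """Ground software reports impulse in English units (lbf*s)."""
    return 100.0

def update_trajectory(impulse_n_s: float) -> float:
    """Navigation code assumes metric units (N*s); trajectory math elided."""
    return impulse_n_s

# Buggy integration: the English-unit value is passed straight through,
# so every trajectory update is off by a factor of about 4.45.
buggy = update_trajectory(thruster_impulse_lbf_s())

# Fixed integration: convert at the interface boundary.
fixed = update_trajectory(thruster_impulse_lbf_s() * LBF_S_TO_N_S)

print(buggy, fixed)  # the two results disagree by a factor of ~4.45
```

The lesson for testing is that an interface whose two sides were each "correct" in isolation still produced a system-level failure, which is exactly the kind of defect integration testing is meant to catch.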
· Bugs in software
supporting a large commercial high-speed data network affected 70,000 business
customers over a period of 8 days in August of 1999. Among those affected was
the electronic trading system of the largest U.S. futures exchange, which was
shut down for most of a week as a result of the outages.
· In April of 1999 a
software bug caused the failure of a $1.2 billion U.S. military satellite
launch, the costliest unmanned accident in the history of Cape Canaveral
launches. The failure was the latest in a string of launch failures, triggering
a complete military and industry review of U.S. space launch programs,
including software integration and testing processes. Congressional oversight
hearings were requested.
· A small town in Illinois
in the U.S. received an unusually large monthly electric bill of $7 million in
March of 1999. This was about 700 times larger than its normal bill. It turned
out to be due to bugs in new software that had been purchased by the local
power company to deal with Y2K software
issues.
· In early 1999 a major
computer game company recalled all copies of a popular new product due to
software problems. The company made a public apology for releasing a product
before it was ready.
· The computer system of a
major online U.S. stock trading service failed during trading hours
several times over a
period of days in February of 1999 according to nationwide news reports.
The problem was reportedly
due to bugs in a software upgrade intended to speed online trade
confirmations.
· In April of 1998 a major
U.S. data communications network failed for 24 hours, crippling a large part of
some U.S. credit card transaction authorization systems as well as other large
U.S. bank, retail, and government data systems. The cause was eventually traced
to a software bug.
· January 1998 news reports told of software problems at a
major U.S. telecommunications company
that resulted in no charges for long distance calls for a month for 400,000
customers.
The problem went
undetected until customers called up with questions about their bills.
· In November of 1997 the
stock of a major health industry company dropped 60% due to reports of failures
in computer billing systems, problems with a large database conversion, and inadequate
software testing. It was reported that more than $100,000,000 in receivables
had to be written off and that multi-million dollar fines were levied on the
company by government agencies.
· A retail store chain
filed suit in August of 1997 against a transaction processing system vendor
(not a credit card
company) due to the software's inability to handle credit cards with year 2000 expiration
dates.
· In August of 1997 one of
the leading consumer credit reporting companies reportedly shut down their new
public web site after less than two days of operation due to software problems.
The new site allowed web site visitors instant access, for a small fee, to
their personal credit reports. However, a number of initial users ended up
viewing each other's reports instead of their own, resulting in irate customers
and nationwide publicity. The problem was attributed to "...unexpectedly
high demand from consumers and faulty software that routed the files to the wrong
computers."
· In November of 1996,
newspapers reported that software bugs caused the 411 telephone information
system of one of the U.S. RBOC's to fail for most of a day. Most of the 2000 operators
had to search through phone books instead of using their 13,000,000-listing database. The bugs were introduced by new
software modifications and the problem software had been installed on both the
production and backup systems. A spokesman for the software vendor reportedly
stated that 'It had nothing to do with the integrity of the software. It was
human error.'
· On June 4 1996 the first
flight of the European Space Agency's new Ariane 5 rocket failed shortly after
launching, resulting in an estimated uninsured loss of a half billion dollars.
It was reportedly due to the lack of exception handling for an overflow
in a conversion from a 64-bit floating-point value to a 16-bit signed integer.
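The conversion failure described above can be sketched as follows. The actual flight code was Ada, so this Python model only illustrates the arithmetic, with an explicit range check standing in for the hardware operand-error exception.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_unchecked(x: float) -> int:
    """Unguarded conversion: raises when the value does not fit,
    analogous to the unhandled operand error that shut down the
    rocket's inertial reference system."""
    n = int(x)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{x} does not fit in a signed 16-bit integer")
    return n

def to_int16_saturating(x: float) -> int:
    """One defensive alternative: clamp to the representable range."""
    return max(INT16_MIN, min(INT16_MAX, int(x)))

print(to_int16_unchecked(1234.5))    # 1234 -- fits, no problem
print(to_int16_saturating(70000.0))  # 32767 -- clamped instead of crashing
# to_int16_unchecked(70000.0) would raise OverflowError
```

Note that the conversion was reused from Ariane 4, where the value in question could never exceed 16 bits; the fault only became a failure under Ariane 5's different flight profile, an argument for retesting reused code in its new environment.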
· Software bugs caused the
bank accounts of 823 customers of a major U.S. bank to be credited
with $924,844,208.32 each
in May of 1996, according to newspaper reports. The American Bankers
Association claimed it was the largest such error in banking history. A bank spokesman
said the programming
errors were corrected and all funds were recovered.
· On January 1 1984 all
computers produced by one of the leading minicomputer makers of the time
reportedly failed worldwide. The cause was claimed to be a leap year bug in a date
handling function utilized in deletion of temporary operating system files.
Technicians throughout the world worked for several days to clear up the
problem. It was also reported that the same bug affected many of the same
computers four years later.
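Leap-year faults of the kind reported above usually stem from an incomplete divisibility rule. The sketch below is illustrative, not the vendor's actual code.

```python
def is_leap_buggy(year: int) -> bool:
    # Incomplete rule: ignores the century exceptions.
    return year % 4 == 0

def is_leap_correct(year: int) -> bool:
    # Gregorian rule: divisible by 4, except centuries not divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# The two functions agree on most years (e.g. 1984), so such a fault
# can lie dormant for years before a failure is ever observed.
print(is_leap_buggy(1984), is_leap_correct(1984))  # True True
print(is_leap_buggy(1900), is_leap_correct(1900))  # True False
```

Date-handling bugs like this recur precisely because the failing inputs arrive only once every few years, which is why boundary dates belong in every regression test suite.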
· Software bugs in a
Soviet early-warning monitoring system nearly brought on nuclear war in 1983,
according to news reports in early 1999. The software was supposed to filter
out false missile detections caused by Soviet satellites picking up sunlight
reflections off cloud-tops, but failed to do so. Disaster was averted when a
Soviet commander, based on what he said was a '...funny feeling in my gut',
decided the apparent missile attack was a false alarm. The filtering software
code was rewritten.
31. Sample Test CASE: http://wiki.openqa.org/display/WTR/Example+Test+Case
More Questions:
1. Error – An error is a mistake made by a person. Errors tend to
propagate: a requirements error may be magnified during design and
amplified still further during coding.
2. Fault – A fault is the result of an error, i.e., the representation
of an error, where "representation" is the mode of expression, such as
narrative text, dataflow diagrams, hierarchy charts, source code, and
so on. Defect is a good synonym for fault, as is bug.
3. Failure – A failure occurs when a fault executes. Two points follow:
(1) failures occur only in an executable representation, which is
usually taken to be source code or, more precisely, loaded object code;
(2) this definition relates failures only to faults of commission.
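The fault/failure distinction can be made concrete with a small sketch: the fault is present in the code at all times, but a failure is observed only when a test input exercises it. The function and its "first maximum" specification below are invented for illustration.

```python
def max_index(values):
    """Specified behavior: return the index of the FIRST maximum element."""
    best = 0
    for i in range(1, len(values)):
        if values[i] >= values[best]:  # fault: >= keeps the LAST maximum
            best = i
    return best

print(max_index([3, 1, 2]))  # 0 -- correct; the faulty branch never fires
print(max_index([5, 9, 9]))  # 2 -- a failure: the spec requires index 1
```

The first test passes despite the fault, which is why a suite of passing tests never proves the absence of defects, only the absence of failures on the inputs tried.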
4. Incident – When a failure occurs, it may or may not be readily
apparent to the user. An incident is the symptom associated with a
failure that alerts the user to the occurrence of that failure.
5. Test – Testing is obviously concerned with errors, faults, failures,
and incidents. A test is the act of exercising software with test
cases. A test has two distinct goals: to find failures and to
demonstrate correct execution.
6. Test Case – A test case has an identity and is associated with a
program behavior; it also has a set of inputs and a list of expected outputs.
What types of error are there?
Errors of commission and errors of omission.
Which kind of fault is most difficult to detect?
Faults of omission are most difficult to detect.
What is the purpose of a test?
To verify correct behaviour, and to find a failure.
Typical test case information: an identifier, a purpose, preconditions,
inputs, expected outputs, and an execution history.
What are the difficulties in making a test case?
Setting up preconditions
Determining expected output
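The typical test case information above can be sketched as a small record; the field names are illustrative, not taken from any particular standard.

```python
from dataclasses import dataclass

@dataclass
class TestCase:
    """A minimal test case record: identity, preconditions, inputs,
    expected output, and a slot for the observed result."""
    test_id: str
    purpose: str
    preconditions: list
    inputs: dict
    expected_output: object
    actual_output: object = None

    def verdict(self) -> str:
        return "PASS" if self.actual_output == self.expected_output else "FAIL"

tc = TestCase(
    test_id="TC-001",
    purpose="Withdrawal within balance succeeds",
    preconditions=["account exists", "balance == 100"],
    inputs={"amount": 40},
    expected_output=60,
)
tc.actual_output = 100 - tc.inputs["amount"]  # exercise the (toy) system
print(tc.test_id, tc.verdict())  # TC-001 PASS
```

Recording the expected output explicitly, before execution, is what makes the pass/fail verdict mechanical rather than a judgment call at test time.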
Are test cases valuable? Why? What do we do about it?
Yes. They are difficult to construct, their correctness must be
verified, they need to be reused for regression testing, and they must
evolve with the software. So we document them, save them, and use them again.
What are the advantages of
functional testing?
Independent of implementation
Can be developed in parallel with the program text
What are the disadvantages
of functional testing?
Redundant tests
Gaps in tests
Cannot develop test cases for non-specified behavior
What are the advantages of
structural testing?
Strong theoretical basis
Nothing is as practical as a good theory!
Leads to good methods for discussing test
coverage
Can look for unspecified behaviour
What are the disadvantages
of structural testing?
Cannot find test cases outside the structure of
the program
Faults classified by severity:
1 Mild, 2 Moderate, 3 Annoying, 4 Disturbing, 5 Serious, 6 Very
serious, 7 Extreme, 8 Intolerable, 9 Catastrophic, 10 Infectious
Fault taxonomy
Input/output faults
Logic faults
Computation faults
Interface faults
Data faults
Levels of abstraction and
testing
What is the craft of
testing?
Identify errors we are likely to make
Create test cases to find the corresponding
faults
Testing limits
Dijkstra: “Program
testing can be used to show the presence of defects, but never
their absence”
It is impossible to fully test a software system in a reasonable
amount of time or at reasonable cost.
The infinite set of tests
There are enormous numbers
of possible tests. To test everything, you would have to:
Test every possible input to every variable.
Test every possible combination of inputs to
every combination of variables.
Test every possible sequence through the
program.
Test every hardware / software configuration,
including configurations
of servers not under your control.
Test every way in which any user might try to
use the program.
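A back-of-the-envelope count shows the scale involved. The sketch below assumes a toy function taking just two 32-bit integer inputs and an optimistic test-execution rate.

```python
# Even a tiny interface defeats exhaustive testing: a function taking
# two 32-bit integers has 2**32 * 2**32 = 2**64 possible input pairs.
inputs_per_variable = 2 ** 32
pairs = inputs_per_variable ** 2

tests_per_second = 10 ** 9              # an optimistic billion tests/sec
seconds_per_year = 60 * 60 * 24 * 365
years = pairs / (tests_per_second * seconds_per_year)

print(f"{pairs:,} input pairs take about {years:,.0f} years at 1e9 tests/second")
```

Running this shows exhaustive testing of even two integer inputs would take centuries, before sequences, configurations, or user behaviour are even considered, which is why test selection, not test execution, is the central problem.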
What is coverage?
Extent of testing of certain attributes or
pieces of the program, such as statement
coverage or branch
coverage or condition coverage.
Extent of testing completed, compared to a
population of possible tests.
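Statement and branch coverage can be distinguished with one small function. The coverage bookkeeping here is done by hand for illustration; a real project would use a tool such as coverage.py.

```python
def classify(x: int) -> str:
    """Return 'negative' for negative inputs, 'non-negative' otherwise."""
    result = "non-negative"
    if x < 0:
        result = "negative"
    return result

# The single test classify(-1) executes every statement (100% statement
# coverage) but only the True branch of the if: branch coverage is 50%.
# Adding classify(5) exercises the False branch as well, reaching 100%
# branch coverage with two tests.
print(classify(-1))  # negative
print(classify(5))   # non-negative
```

This is why branch coverage is a strictly stronger criterion than statement coverage: every branch-adequate suite is statement-adequate, but not the other way around.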
A MODEL FOR TESTING
Testing
is applied to anything from subroutines to systems that consist of millions of statements.
The archetypical system is one that allows the exploration of all aspects of testing
without the complications that have nothing to do with testing but affect any
very large project. It’s medium-scale programming. Testing the interfaces
between different parts of your own mind is very different from testing the
interface between you and other programmers separated from you by geography,
language, time, and disposition. Testing a one-shot routine that will be run
only a few times is very different from testing one that must run for decades
and may be modified by some unknown future programmer. Although all the
problems of the solitary routine occur for the routine that is embedded in a
system, the converse is not true: many kinds of bugs just can’t exist in solitary
routines. There is an implied context for the test methods discussed in this book—a
real-world context characterized by the following model project:
Application—The specifics of the
application are unimportant. It is a real-time system that must provide timely
responses to user requests for services. It is an online system connected to
remote terminals.
Staff—The programming staff
consists of twenty to thirty programmers—big enough to warrant formality, but
not too big to manage—big enough to use specialists for some parts of the
system’s design.
Schedule—The project will take 24
months from the start of design to formal acceptance by the customer.
Acceptance will be followed by a 6-month cutover period. Computer resources for
development and testing will be almost adequate.
Specification—The specification is
good. It is functionally detailed without constraining the design, but there
are undocumented “understandings” concerning the requirements.
Acceptance Test—The system will be accepted only after a formal acceptance test. The application
is not new, so part of the formal test already exists. At first the customer
will intend to design the acceptance test, but later it will become the
software design team’s responsibility.
Personnel—The staff is professional
and experienced in programming and in the application. Half the staff has
programmed that computer before and most know the source language. One-third,
mostly junior programmers, have no experience with the application. The typical
programmer has been employed by the programming department for 3 years. The
climate is open and frank. Management’s attitude is positive and knowledgeable
about the realities of such projects.
Standards—Programming and test
standards exist and are usually followed. They understand the role of
interfaces and the need for interface standards. Documentation is good. There
is an internal, semiformal, quality-assurance function. The database is centrally
developed and administered.
Objectives—The system is the first
of many similar systems that will be implemented in the future. No two will be
identical, but they will have 75% of the code in common. Once installed, the
system is expected to operate profitably for more than 10 years.
Source—One-third of the code is
new, one-third extracted from a previous, reliable, but poorly documented
system, and one-third is being rehosted (from another language, computer,
operating system—take your pick).
History—One programmer will quit
before his components are tested. Another programmer will be fired before
testing begins: excellent work, but poorly documented. One component will have
to be redone after unit testing: a superb piece of work that defies
integration. The customer will insist on five big changes and twenty small
ones. There will be at least one nasty problem that nobody—not the customer,
not the programmer, not the managers, nor the hardware vendor—suspected. A
facility and/or hardware delivery problem will delay testing for several weeks
and force second- and third-shift work. Several important milestones will slip
but the delivery date will be met. Our model project is a typical well-run,
successful project with a share of glory and
catastrophe—neither a
utopian project nor a slice of hell.
Figure 1.1. A Model of
Testing.
Overview
Figure 1.1 is a model of
the testing process. The process starts with a program embedded in an
environment, such as a computer, an operating system, or a calling program. We understand
human nature and its susceptibility to error. This understanding leads us to create
three models: a model of the environment, a model of the program, and a model of
the expected bugs. From these models we create a set of tests, which are then executed.
The result of each test is either expected or unexpected. If unexpected, it may
lead us to revise the test, our model or concept of how the program behaves,
our concept of what bugs are possible,
or the program itself. Only rarely would we attempt to modify the environment.
The Environment
A program’s environment
is the hardware and software required to make it run. For online systems
the environment may include communications lines, other systems, terminals, and
operators. The environment also includes all programs that interact with—and
are used to create—the program under test, such as operating system, loader, linkage
editor, compiler, utility routines. Programmers should learn early in their
careers that it’s not smart to blame the environment (that is, hardware and
firmware) for bugs. Hardware bugs are rare. So are bugs in
manufacturer-supplied software. This isn’t because logic designers and
operating system programmers are better than application programmers, but
because such hardware and software is stable, tends to be in operation for a
long time, and most bugs will have been found and fixed by the time programmers
use that hardware or software. Because hardware and firmware are
stable, we don’t have to consider all of the environment’s complexity. Instead,
we work with a simplification of it, in which only the features most important
to the program at hand are considered. Our model of the environment includes
our beliefs regarding such things as the workings of the computer’s
instruction set, operating system macros and commands, and what a higher-order language
statement will do. If testing reveals an unexpected result, we may have to change
our beliefs (our model of the environment) to find out what went wrong. But sometimes
the environment could be wrong: the bug could be in the hardware or firmware
after all.