DEPARTMENT OF COMPUTER
SCIENCE AND ENGINEERING (UG & PG)
Final Year Computer
Science and Engineering, 7th Semester
Question and Answer
Subject Code & Name: Software Testing
UNIT – I
1. What is the Purpose of
Testing?
The purpose of testing is to execute or evaluate programs or systems that do the following:
· Measure the results
against the requirements
· Document the difference
between the expected and actual results
· Assist in resolving
those differences by providing the proper debug aids
2. What is the purpose of testing? (Boris Beizer's point of view)
There’s an attitudinal progression characterized by the following five phases:
· PHASE 0—There’s
no difference between testing and debugging. Other than in support of
debugging, testing has no
purpose.
· PHASE 1—The
purpose of testing is to show that the software works.
· PHASE 2—The
purpose of testing is to show that the software doesn’t work.
· PHASE 3—The
purpose of testing is not to prove anything, but to reduce the perceived risk
of
not working to an
acceptable value.
· PHASE 4—Testing is not an act. It is a mental discipline that results in low-risk software without much testing effort.
3. Give some of the
Testing Purpose Examples.
· Uncovering defects and finding important problems
· Assessing quality and risk
· Certifying to standards
· Fulfilling process mandates
· Blocking premature releases
· Minimizing safety-related lawsuit risks
· Minimizing technical support costs
· Maximizing efficiency
· Verifying correctness
· Assessing conformance to specifications or regulations
4. State the Pesticide Paradox and the Complexity Barrier laws of software testing.
First Law: The Pesticide
Paradox—Every
method you use to prevent or find bugs leaves a residue of
subtler bugs against which
those methods are ineffectual.
That’s not too bad, you
say, because at least the software gets better and better. Not quite!
Second Law: The Complexity
Barrier—Software
complexity (and therefore that of bugs) grows to the limits of our ability to
manage that complexity.
5. What do you mean by Dichotomy?
Division into two mutually
exclusive, opposed, or contradictory groups: a dichotomy between thought and
action.
6. Differentiate Testing
and Debugging.
The purpose of testing is to show that a program has bugs. The purpose of debugging is to find the error or misconception that led to the program’s failure and to design and implement the program changes that correct the error. Debugging usually follows testing, but they differ as to goals, methods, and, most importantly, psychology: testing starts with known conditions, uses
predefined procedures, and
has predictable outcomes; only whether or not the program passes the test is
unpredictable. Debugging starts from possibly unknown initial conditions, and
the end cannot be predicted, except statistically. Testing can and should be
planned, designed,
and scheduled. The
procedures for, and duration of, debugging cannot be so constrained.
Testing is a demonstration
of error or apparent correctness. Debugging is a deductive process.
Testing proves a programmer’s failure; debugging is the programmer’s vindication. Testing, as executed, should strive to be predictable, dull, constrained, rigid, and inhuman. Debugging demands intuitive leaps, conjectures, experimentation, and freedom. Much of testing can be done without design knowledge; debugging is impossible without detailed design knowledge. Testing can often be done by an outsider; debugging must be done by an insider. There is a robust theory of testing that establishes theoretical limits to what testing can and can’t do, whereas debugging has only recently been attacked by theorists, and so far there are only rudimentary results. Much of test execution and design can be automated; automated debugging is still a dream.
7. Differentiate Function
versus Structure
In functional testing, the program or system is treated as a black box. It is subjected to inputs, and its outputs are verified for conformance to specified behavior. The software’s user should be concerned only with functionality and features; the program’s implementation details should not matter. Functional testing takes the user’s point of view.
Structural testing does look at the
implementation details. Such things as programming style, control method,
source language, database design, and coding details dominate structural
testing; but the boundary between function and structure is fuzzy. Good systems
are built in layers—from the outside to the inside. The user sees only the
outermost layer, the layer of pure
function. Each layer inward
is less related to the system’s functions and more constrained by its
structure: so what is
structure to one layer is function to the next.
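The distinction above can be made concrete with a small sketch. The routine is invented for illustration: the functional tests check only specified input/output behavior, while the structural tests are chosen from the implementation so that each branch is exercised at least once.

```python
def absolute(x):
    """A hypothetical routine under test."""
    if x < 0:
        return -x
    return x

# Functional (black-box) view: only specified input/output behavior matters.
assert absolute(-5) == 5 and absolute(3) == 3

# Structural (white-box) view: tests chosen from the implementation so that
# each branch is exercised at least once.
assert absolute(-1) == 1    # exercises the x < 0 branch
assert absolute(0) == 0     # exercises the fall-through branch
```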
8. What do you mean by Bugs? (Detail)
Bugs
are more insidious than ever we expect them to be. Yet it is convenient to
categorize them: initializations, call sequence, wrong variable, and so on. Our
notion of what is or isn’t a bug varies. A bad specification may lead us to
mistake good behavior for bugs, and vice versa. An unexpected test result may
lead us to change our notion of what a bug is—that is to say, our model of
bugs. If you hold any of the following beliefs, then disabuse yourself of them
because as long as you believe in such things you will be unable to test
effectively and unable to justify the dirty tests most programs need.
Benign Bug Hypothesis—The belief that bugs are
nice, tame, and logical. Only weak bugs
have a logic to them and
are amenable to exposure by strictly logical means. Subtle bugs have
no definable pattern—they
are wild cards.
Bug Locality Hypothesis—The belief that a bug
discovered within a component affects only
that component’s behavior;
that because of structure, language syntax, and data organization,
the symptoms of a bug are
localized to the component’s designed domain. Only weak bugs are
so localized. Subtle bugs
have consequences that are arbitrarily far removed from the cause in
time and/or space from the
component in which they exist.
Control Bug Dominance—The belief that errors in
the control structure of programs dominate
the bugs. While many easy
bugs, especially in components, can be traced to control-flow errors, data-flow
and data-structure errors are as common. Subtle bugs that violate data-structure boundaries and data/code separation can’t be found by looking only at control structures.
Code/Data Separation—The belief, especially in
HOL programming, that bugs respect the separation of code and data. Furthermore, in real systems the distinction between code and data can be hard to make, and it is exactly that blurred distinction that permits such bugs to exist.
9. What do you mean by Tests?
Tests
are formal procedures. Inputs must be prepared, outcomes predicted, tests
documented, commands executed, and results observed; all these steps are
subject to error. There is nothing magical about testing and test design that
immunizes testers against bugs. An unexpected test result is as often caused by a test bug as it is by a real bug. Bugs can creep into the
documentation, the inputs, and the commands and becloud our observation of
results. An unexpected test result, therefore, may lead us to revise the tests.
Because the tests are themselves in an environment, we also have a mental model
of the tests, and instead of revising the tests, we may have to revise that
mental model.
10. Different kinds of
testing.
Unit, Unit Testing—A unit is the smallest testable piece of software, by
which I mean that it can be compiled or assembled, linked, loaded, and put
under the control of a test harness or driver. A unit is
usually the work of one programmer, and it consists of several hundred or fewer lines of source code. Unit testing is the testing we do to show that the
unit does not satisfy its functional specification and/or that its implemented
structure does not match the intended design structure. When our tests reveal
such faults, we say that there is a unit bug.
Component, Component Testing—A component is an integrated
aggregate of one or more units. A unit is a component, a component with
subroutines it calls is a component, etc. By this (recursive) definition, a
component can be anything from a unit to an entire system. Component testing
is the testing we do to show that the component does not satisfy its functional
specification and/or that its implemented structure does not match the intended
design structure. When our tests reveal such problems, we say that there is a component
bug.
Integration, Integration
Testing—Integration
is a process by which components are aggregated to create larger
components. Integration testing is testing done to show that even
though the components were
individually satisfactory, as demonstrated by successful passage of component
tests, the combination of components is incorrect or inconsistent. For example, components A and B have both passed their component tests. Integration testing is aimed at showing inconsistencies between A and B. Examples of such
inconsistencies are improper call or return sequences, inconsistent data
validation criteria, and inconsistent handling of data objects. Integration
testing should not be confused with testing integrated objects, which is just higher
level component testing. Integration testing is specifically aimed at exposing
the problems that arise from the combination of components. The sequence, then,
consists of component testing for components A and B, integration testing for
the combination of A and B, and finally, component testing for the “new”
component (A,B).*
System, System Testing—A system is a big
component. System testing is aimed at revealing bugs that cannot be
attributed to components as such, to the inconsistencies between
components, or to the
planned interactions of components and other objects. System testing concerns
issues and behaviors that can only be exposed by testing the entire integrated
system
or a major part of it.
System testing includes testing for performance, security, accountability,
configuration sensitivity,
start-up, and recovery.
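A hedged illustration of a unit "under the control of a test harness or driver", as described above. The `clamp` unit and its test cases are invented for the example; the point is that inputs and expected outcomes are prepared in advance.

```python
def clamp(value, low, high):
    """A hypothetical unit: restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))

# A minimal test driver: inputs prepared and outcomes predicted in advance,
# as formal tests require.
CASES = [
    ((5, 0, 10), 5),     # value already inside the range
    ((-3, 0, 10), 0),    # below the range: raised to low
    ((42, 0, 10), 10),   # above the range: lowered to high
]

def run_unit_tests():
    """Return the list of failing cases; an empty list means the unit passed."""
    return [(args, expected, clamp(*args))
            for args, expected in CASES if clamp(*args) != expected]

print(run_unit_tests())   # []
```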
11. What approaches can be used to demonstrate that a program is correct?
Three
different approaches can be used to demonstrate that a program is correct: tests
based on structure, tests based on function, and formal proofs of correctness.
Each approach leads to the conclusion that complete testing, in the sense of a proof, is neither theoretically nor practically possible.
12. Functional Testing
Functional Testing—Every program operates on
a finite number of inputs. Whatever pragmatic
meaning those inputs might
have, they can always be interpreted as a binary bit stream. A complete functional
test would consist of subjecting the program to all possible input streams. For
each input the routine either accepts the stream and produces a correct
outcome, accepts the stream and produces an incorrect outcome, or rejects the
stream and tells us that it did so. Because the rejection message is itself an
outcome, the problem is reduced to verifying that the
correct outcome is
produced for every input. But a 10-character input string has 2^80 possible input
streams and corresponding outcomes. So complete functional testing in this
sense is impractical.
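The arithmetic behind the impracticality claim can be checked directly; the testing rate below is an illustrative assumption.

```python
# A 10-character input is 80 bits, so a complete functional test would need
# one test per possible bit stream.
BITS_PER_CHAR = 8
NUM_CHARS = 10
possible_inputs = 2 ** (BITS_PER_CHAR * NUM_CHARS)
print(possible_inputs)             # 2**80, roughly 1.2e24 input streams

# Even at an (assumed) billion tests per second, exhausting them would take
# tens of millions of years.
seconds_needed = possible_inputs / 1e9
years_needed = seconds_needed / (3600 * 24 * 365)
print(years_needed > 10_000_000)   # True
```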
13. Structural Testing
Structural Testing—One should design enough
tests to ensure that every path through the routine is exercised at least once.
Right off that’s impossible, because some loops might never terminate. Brush
that problem aside by observing that the universe—including all that’s in it—is
finite. Even so, the
number of paths through a small routine can be awesome because each loop multiplies
the path count by the number of times through the loop. A small routine can
have
millions or billions of
paths, so total path testing is usually impractical, although it can be
done
for some routines. By
doing it we solve the problems of unknown size that we ran into for purely
functional testing; however, it doesn’t solve the problem of preparing a
bug-free input, a bug-free response list, and a bug-free test observation. We
still need those things, because pure structural testing can never assure us
that the routine is doing the right thing.
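A back-of-the-envelope sketch of why path counts explode, under a simplified model (sequential two-way branches, plus one multiplier per loop as the text describes):

```python
def path_count(decisions, loop_limits):
    """Paths through a routine with `decisions` sequential two-way branches
    and one loop per entry in loop_limits, each taken up to n times."""
    count = 2 ** decisions      # each sequential branch doubles the path count
    for n in loop_limits:
        count *= n              # each loop multiplies it by its pass count
    return count

# A small routine: 10 if/else decisions and two loops of up to 100 passes.
print(path_count(10, [100, 100]))   # 10240000 paths
```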
14. Correctness Proofs
Correctness Proofs—Formal proofs of
correctness rely on a combination of functional and structural concepts.
Requirements are stated in a formal language (e.g., mathematics), and each program
statement is examined and used in a step of an inductive proof that the routine
will produce the correct outcome for all possible input sequences. The
practical issue here is that such proofs are very expensive and have been
applied only to numerical routines or to formal proofs for crucial software such
as a system’s security kernel or portions of compilers. But there are
theoretical objections to formal proofs of correctness that go beyond the
practical issues. How do we know that the specification is achievable? Its
consistency and completeness must be proved, and in general, that is a provably
unsolvable problem. Assuming that the specification has been proved correct,
then the mechanism used to prove the program, the steps in the proof, the logic
used, and so on, must be proved (GOOD75). Mathematicians and logicians have no more
immunity to bugs than programmers or testers have. This also leads to
never-ending sequences of unverifiable assumptions.
15. The theoretical
barriers to complete testing.
Manna and Waldinger
(MANN78) have clearly summarized the theoretical barriers to complete
testing:
“We can never be sure that
the specifications are correct.”
“No verification system
can verify every correct program.”
“We can never be certain
that a verification system is correct.”
16. Dependencies of a bug.
The importance of a bug
depends on frequency, correction cost, installation cost, and consequences.
Frequency— how often does that kind
of bug occur? Pay more attention to the more frequent
bug types.
Correction Cost— what does it cost to
correct the bug after it’s been found? That cost is the
sum of two factors: (1)
the cost of discovery and (2) the cost of correction. These costs go up
dramatically the later in
the development cycle the bug is discovered. Correction cost also depends on
system size. The larger the system the more it costs to correct the same bug.
Installation Cost—Installation cost depends on the number of installations: small for a single-user program, but how about a
PC operating system bug? Installation cost can dominate all other costs—fixing
one simple bug and distributing the fix could exceed the entire system’s
development cost.
Consequences— what are the
consequences of the bug? You might measure this by the mean
size of the awards made by
juries to the victims of your bug. A reasonable metric for bug importance is:
importance($) =
frequency*(correction_cost + installation_cost + consequential_cost)
Frequency tends not to
depend on application or environment, but correction, installation, and consequential
costs do. As designers, testers, and QA workers, you must be interested in bug importance,
not raw frequency. Therefore you must create your own importance model. This chapter
will help you do that.
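The metric above can be sketched in code; all of the figures below are invented, simply to show why raw frequency alone misleads.

```python
def importance(frequency, correction_cost, installation_cost, consequential_cost):
    # importance($) = frequency * (correction + installation + consequential)
    return frequency * (correction_cost + installation_cost + consequential_cost)

# Hypothetical bug types: a frequent cosmetic bug vs. a rare, costly one.
cosmetic = importance(frequency=0.30, correction_cost=50,
                      installation_cost=10, consequential_cost=0)
lost_transaction = importance(frequency=0.01, correction_cost=2_000,
                              installation_cost=500, consequential_cost=50_000)

# The rarer bug dominates: about 525 vs. about 18.
print(lost_transaction > cosmetic)   # True
```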
17. What are the consequences of a bug?
If you say only that “bit so-and-so will be set instead of reset,” you’re avoiding responsibility for the bug. Although it may be difficult to do in the scope of a subroutine,
programmers should try to measure the consequences of their bugs in human
terms. Here are some consequences on a scale of one to ten:
1. Mild—The symptoms of the bug
offend us aesthetically; a misspelled output or a misaligned printout.
2. Moderate—Outputs are misleading or
redundant. The bug impacts the system’s performance.
3. Annoying—The system’s behavior,
because of the bug, is dehumanizing. Names are truncated or arbitrarily
modified. Bills for $0.00 are sent. Operators must use unnatural command
sequences and must trick the system into a proper response for unusual
bug-related cases.
4. Disturbing—It refuses to handle
legitimate transactions. The automatic teller machine won’t give you money. My
credit card is declared invalid.
5. Serious—It loses track of
transactions: not just the transaction itself (your paycheck), but the fact
that the transaction occurred. Accountability is lost.
6. Very Serious—Instead of losing your
paycheck, the system credits it to another account or converts deposits into
withdrawals. The bug causes the system to do the wrong transaction.
7. Extreme—The problems aren’t
limited to a few users or to a few transaction types. They are frequent and
arbitrary instead of sporadic or for unusual cases.
8. Intolerable—Long-term, unrecoverable
corruption of the data base occurs and the corruption is not easily discovered.
Serious consideration is given to shutting the system down.
9. Catastrophic—The decision to shut down is taken out of our hands because the system fails.
10. Infectious—What can be worse than a failed system? One that corrupts other systems even though it does not fail in itself; that erodes the social or physical environment; that melts nuclear reactors or starts wars; whose influence, because of malfunction, is far greater than expected; a system that kills.
18. How to measure the
quality?
Quality
can be measured on some scale, say from 0 to 10. Quality can be measured as a
combination of factors, of which the number of bugs and their severity is only
one component. The detail of how this is done is the subject of another book;
but it’s enough to say that many organizations have designed and use satisfactory,
quantitative, quality metrics. Examining these metrics closer, we see that how
the parts are weighted depends on environment, application, culture, and many
other factors. A few of these factors are:
Correction Cost—The cost of correcting a bug has almost nothing to do with
symptom
severity. Catastrophic,
life-threatening bugs could be trivial to fix, whereas minor annoyances
could require major
rewrites to correct.
Context and Application Dependency—The severity of a bug,
for the same bug with the same symptoms, depends on context. For example, a
roundoff error in an orbit calculation doesn’t mean much in a spaceship video
game but it matters to real astronauts.
Creating Culture Dependency—What’s important depends on the creators of
the software and their cultural aspirations. Test tool vendors are more
sensitive about bugs in their products than, say, games software vendors.
User Culture Dependency—What’s important depends on the user culture. An R&D shop might
accept a bug for which there’s a workaround; a banker would go to jail for that
same bug; and naive users of PC software go crazy over bugs that pros ignore.
The Software Development
Phase—Severity
depends on development phase. Any bug gets more severe as it gets closer to
field use and more severe the longer it’s been around—more severe because of
the dramatic rise in correction cost with time. Also, what’s a trivial or
subtle bug to the designer means little to the maintenance programmer for whom
all bugs are equally mysterious.
19. How should you go
about quantifying the nightmare?
Here’s a workable
procedure:
1. List your worst software
nightmares. State them in terms of the symptoms they produce and
how your user will react
to those symptoms. For end users and the population at large, the categories of
Section 2.2 above are a starting point. For programmers the nightmare may be closer
to home, such as: “I might get a bad personal performance rating.”
2. Convert the consequences
of each nightmare into a cost. Usually, this is a labor cost for correcting the
nightmare, but if your scope extends to the public, it could be the cost of
lawsuits, lost business, or nuclear reactor meltdowns.
3. Order the list from the
costliest to the cheapest and then discard the low-concern nightmares with
which you can live.
4. Based on your experience,
measured data (the best source to use), intuition, and published statistics
postulate the kinds of bugs that are likely to create the symptoms expressed by
each nightmare. Don’t go too deep because most bugs are easy. This is a bug
design process. If you can “design” the bug by a one-character or one statement
change, then it’s a good target. If it takes hours of sneaky thinking to
characterize the bug, then either it’s an unlikely bug or you’re worried about
a saboteur in your organization, which could be appropriate in some cases. Most bugs are simple goofs once you find and
understand them.
5. For each nightmare, then,
you’ve developed a list of possible causative bugs. Order that list by
decreasing probability. Judge the probability based on your own bug statistics,
intuition, experience, etc. The same bug type will appear in different
nightmares. The importance of a bug type is calculated by multiplying the expected cost of each nightmare by the probability that the bug causes it and summing across all nightmares.
6. Rank the bug types in
order of decreasing importance to you.
7. Design tests (based on
your knowledge of test techniques) and design your quality assurance
inspection process by
using the methods that are most effective against the most important bugs.
8. If a test is passed, then
some nightmares or parts of them go away. If a test is failed, then a nightmare
is possible, but upon correcting the bug, it too goes away. Testing, then,
gives you information you can use to revise your estimated nightmare
probabilities. As you test, revise the probabilities and reorder the nightmare
list. Taking whatever information you get from testing and working it back
through the exercise leads you to revise your subsequent test strategy, either on
this project if it’s big enough or long enough, or on subsequent projects.
9. Stop testing when the
probability of all nightmares has been shown to be inconsequential as a result
of hard evidence produced by testing.
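Steps 4 through 6 above can be sketched as a small calculation. Every nightmare, bug type, cost, and probability below is hypothetical:

```python
# Nightmare -> estimated cost if it occurs (step 2).
nightmares = {
    "wrong paycheck issued": 100_000,
    "audit trail lost": 250_000,
}

# Bug type -> probability it produces each nightmare (steps 4-5).
bug_probabilities = {
    "off-by-one in pay period": {"wrong paycheck issued": 0.05,
                                 "audit trail lost": 0.01},
    "uninitialized record field": {"wrong paycheck issued": 0.02,
                                   "audit trail lost": 0.04},
}

def bug_importance(bug_type):
    # Step 5: sum of nightmare cost * probability across all nightmares.
    probs = bug_probabilities[bug_type]
    return sum(cost * probs.get(name, 0.0) for name, cost in nightmares.items())

# Step 6: rank bug types by decreasing importance.
ranking = sorted(bug_probabilities, key=bug_importance, reverse=True)
print(ranking[0])   # the uninitialized-field bug ranks first in this example
```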
20. List out some of the
functional testing techniques.
Most functional test techniques—that is, those techniques that are based on a behavioral description of software, such as transaction-flow testing, syntax testing, domain testing, logic testing, and state testing—are useful in testing functional bugs.
21. What do you mean by Control and Sequence Bugs?
Control
and sequence bugs include paths left out, unreachable code, improper nesting of
loops, loop-back or loop-termination criteria incorrect, missing process steps,
duplicated processing, unnecessary processing, rampaging GOTO’s, ill-conceived
switches, spaghetti code, and worst of all, pachinko code.
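Two of the bug classes named above (an incorrect loop-termination criterion and the unreachable code it creates), sketched in an invented routine:

```python
def find_index(items, target):
    """Return the index of target in items, else -1 (buggy version)."""
    i = 0
    while i <= len(items):       # BUG: loop-termination criterion off by one
        if items[i] == target:   # raises IndexError once i == len(items)
            return i
        i += 1
    return -1                    # unreachable: the loop never exits normally

def find_index_fixed(items, target):
    """Corrected loop termination makes the -1 path reachable."""
    i = 0
    while i < len(items):
        if items[i] == target:
            return i
        i += 1
    return -1
```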
22. What do you mean by Logic Bugs?
Bugs
in logic, especially those related to misunderstanding how case statements and
logic operators behave singly and in combinations, include nonexistent cases,
improper layout of cases, “impossible” cases that are not impossible, a
“don’t-care” case that matters, improper negation of a boolean expression (for
example, using “greater than” as the negation of “less than”), improper
simplification and combination of cases, overlap of exclusive cases, confusing
“exclusive OR” with
“inclusive OR.”
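The improper-negation example in the list is easy to demonstrate; the functions are invented for illustration:

```python
# "Not less than" written as "greater than" silently drops the a == b case.
def not_less_buggy(a, b):
    return a > b             # BUG: returns False when a == b

def not_less_correct(a, b):
    return a >= b            # the true negation of a < b

print(not_less_buggy(5, 5))    # False, although 5 is certainly not less than 5
print(not_less_correct(5, 5))  # True
```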
23. Processing Bugs
Processing
bugs include arithmetic bugs, algebraic, mathematical function evaluation, algorithm
selection, and general processing. Many problems in this area are related to
incorrect conversion from one data representation to another. This is
especially true in assembly language programming. Other problems include
ignoring overflow, ignoring the difference between positive and negative zero,
improper use of greater-than, greater-than-or-equal, less-than, less-than-or-equal, assumption of equality to zero in floating point, and improper comparison between different formats, as in ASCII to binary or integer to floating point.
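The floating-point item in the list, demonstrated directly:

```python
# A value that is mathematically zero need not be exactly zero in binary
# floating point.
x = 0.1 + 0.2 - 0.3
print(x == 0.0)          # False: x is about 5.55e-17

# Remedy: compare against a tolerance rather than assuming equality to zero.
EPSILON = 1e-9
print(abs(x) < EPSILON)  # True
```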
24. Initialization Bugs
Initialization
bugs are common, and experienced programmers and testers know they must look for
them. Both improper and superfluous initialization occur. The latter tends to
be less harmful but can affect performance. Typical bugs are as follows:
forgetting to initialize working space, registers, or data areas before first
use or assuming that they are initialized elsewhere; a bug in the first value
of a loop-control parameter; accepting an initial value without a validation
check; and initializing to the wrong format, data representation, or type.
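A minimal sketch of the first bug in the list (forgetting to initialize working space before first use), alongside its fix:

```python
def running_total_buggy(values):
    for v in values:
        total += v           # BUG: working storage used before initialization
    return total

def running_total_fixed(values):
    total = 0                # initialize working space before first use
    for v in values:
        total += v
    return total

print(running_total_fixed([1, 2, 3]))   # 6
```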
25. Data-Flow Bugs and
Anomalies
Most
initialization bugs are a special case of data-flow anomalies. A data-flow
anomaly occurs when there is a path along which we expect to do something
unreasonable with data, such as using an uninitialized variable, attempting to
use a variable before it exists, modifying data and then not storing or using
the result, or initializing twice without an intermediate use. Although part of
data-flow anomaly detection can be done by the compiler based on information
known at compile time, much can be detected only by execution and therefore is
a subject for testing. It is generally recognized today that data-flow
anomalies are as important as control-flow anomalies
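A sketch of two of the anomalies described above. Note that the second surfaces only on one execution path, which is why it is a subject for testing rather than for compile-time detection:

```python
def anomalies(flag):
    a = 10           # define a
    a = 20           # anomaly: redefined with no intermediate use
    if flag:
        b = a * 2    # b is defined on this path only
    return b         # anomaly: on the flag == False path, b is used before it exists

print(anomalies(True))   # 40: the anomalous path was not exercised
try:
    anomalies(False)     # the untested path fails at run time
except UnboundLocalError:
    print("use before definition detected only on this path")
```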
26. Data bugs
Data
bugs include all bugs that arise from the specification of data objects, their
formats, the number of such objects, and their initial values. Data bugs are at
least as common as bugs in code, but they are often treated as if they did not
exist at all. Underestimating the frequency of
data bugs is caused by
poor bug accounting. In some projects, bugs in data declarations are just not
counted, and for that matter, data declaration statements are not counted as
part of the code. The categories used for data bugs are different from those
used for code bugs. Each way of looking at data provides a different
perspective. These categories for data bugs overlap and are no stricter than
the categories used for bugs in code.
· Dynamic versus Static
· Information, Parameter, and Control
· Contents, Structure, and Attributes
27. Data specifications
consist of three parts:
Contents—The actual bit pattern,
character string, or number put into a data structure. Content is a pure bit
pattern and has no meaning unless it is interpreted by a hardware or software processor.
All data bugs result in the corruption or misinterpretation of content.
Structure—The size and shape and
numbers that describe the data object, that is, the memory
locations used to store
the content (e.g., 16 characters aligned on a word boundary, 122 blocks
of 83 characters each,
bits 4 through 14 of word 17). Structures can have substructures and can
be arranged into
superstructures. A hunk of memory may have several different structures defined
over it—e.g., a two-dimensional array treated elsewhere as N one-dimensional arrays.
Attributes—The specification of meaning, that is, the semantics associated with the contents of
associated with the contents of
a data object (e.g., an
integer, an alphanumeric string, a subroutine).
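The three parts can be illustrated with Python's struct module; the record layout below is invented for the example.

```python
import struct

# Structure: 16 bytes laid out as a 4-byte integer, an 8-byte float,
# and a 4-character field (little-endian, no padding).
LAYOUT = "<id4s"

# Contents: the actual bit pattern produced for particular values.
raw = struct.pack(LAYOUT, 17, 98.6, b"ACCT")
print(len(raw))   # 16 bytes of pure bit pattern, meaningless in themselves

# Attributes: the meaning we assign when interpreting the same bits.
record_id, temperature, account_code = struct.unpack(LAYOUT, raw)
```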
28. What are the remedies
for test bugs?
The
remedies for test bugs are: test debugging, test quality assurance, test
execution automation, and test design automation.
Test Debugging—The first remedy for test
bugs is testing and debugging the tests. The differences between test debugging
and program debugging are not fundamental. Test debugging is usually easier
because tests, when properly designed, are simpler than programs and do not
have to make concessions to efficiency. Also, tests tend to have a localized
impact relative to other tests, and therefore the complicated interactions that
usually plague software designers are less frequent. We have no magic
prescriptions for test debugging—no more than we have for software debugging.
Test Quality Assurance—Programmers have the
right to ask how quality in independent testing and test design is monitored.
Should we implement test testers and test-tester tests? This sequence does not
converge. Methods for test quality assurance are discussed in Software System
Testing and Quality Assurance (BEIZ84).
Test Execution Automation—The history of software
bug removal and prevention is indistinguishable from the history of programming
automation aids. Assemblers, loaders, compilers, and the like were all
developed to reduce the incidence of programmer and/or operator errors. Test
execution bugs are virtually eliminated by various test execution automation
tools, many of which are discussed throughout this book. The point is that “manual
testing” is
self-contradictory. If you want to get rid of test execution bugs, get rid of
manual execution.
Test Design Automation—Just as much of software
development has been automated (what is a compiler, after all?) much test
design can be and has been automated. For a given productivity rate, automation
reduces bug count—be it for software or be it for tests.
29. Test Case
IEEE Standard 610 (1990)
defines test case as follows:
“(1) A set of test inputs,
execution conditions, and expected results developed for a particular objective,
such as to exercise a particular program path or to verify compliance with a
specific requirement.
“(2) (IEEE Std 829-1983)
Documentation specifying inputs, predicted results, and a set of execution
conditions for a test item.” Boris Beizer (1995, p. 3) defines a test as “A
sequence of one or more subtests executed as a sequence because the outcome
and/or final state of one subtest is the input and/or initial state of the
next. The word ‘test’ is used to include subtests, tests proper, and test suites.”
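The IEEE 610 definition quoted above names three ingredients: inputs, execution conditions, and expected results, developed for a particular objective. The record structure and unit below are an illustrative convention, not part of the standard.

```python
# Hypothetical unit under test:
def withdraw(balance, amount):
    return "rejected" if amount > balance else "accepted"

# One test case: inputs, execution conditions, and expected results,
# developed for a particular objective.
test_case = {
    "objective": "verify a withdrawal exceeding the balance is rejected",
    "execution_conditions": {"account_balance": 100},
    "inputs": {"withdrawal_amount": 150},
    "expected_result": "rejected",
}

def run(case, unit):
    balance = case["execution_conditions"]["account_balance"]
    actual = unit(balance, case["inputs"]["withdrawal_amount"])
    return "pass" if actual == case["expected_result"] else "fail"

print(run(test_case, withdraw))   # pass
```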
30. Some recent major
computer system failures caused by software bugs
· Software problems in the
automated baggage sorting system of a major airport in February 2008
prevented thousands of
passengers from checking baggage for their flights. It was reported that
the breakdown occurred
during a software upgrade, despite pre-testing of the software. The system
continued to have problems in subsequent months.
· News reports in December
of 2007 indicated that significant software problems were continuing to occur
in a new ERP payroll system for a large urban school system. It was believed
that more than one third of employees had received incorrect paychecks at
various times since the new system went live the preceding January, resulting
in overpayments of $53 million, as well as underpayments. An employees' union
brought a lawsuit against the school system, the cost of the ERP system was
expected to rise by 40%, and the non-payroll part of the ERP system was delayed.
Inadequate testing reportedly contributed to the problems.
· In November of 2007 a
regional government reportedly brought a multi-million dollar lawsuit against a
software services vendor, claiming that the vendor 'minimized quality' in
delivering software for a large criminal justice information system and the
system did not meet requirements. The vendor also sued its subcontractor on the
project.
· In June of 2007 news
reports claimed that software flaws in a popular online stock-picking contest
could be used to gain an unfair advantage in pursuit of the game's large cash
prizes. Outside investigators were called in and in July the contest winner was
announced. Reportedly the winner had previously been in 6th place, indicating
that the top 5 contestants may have been disqualified.
· A software problem
contributed to a rail car fire in a major underground metro system in April
of 2007 according to
newspaper accounts. The software reportedly failed to perform as expected in
detecting and preventing excess power usage in equipment on a new passenger railcar, resulting in overheating and fire in
the rail car, and evacuation and shutdown of part of the system.
· Tens of thousands of medical devices were recalled in March of 2007 to correct a software bug. According to news reports,
the software would not reliably indicate when available power to the
device was too low.
· A September 2006 news report indicated problems with software used in a state government's primary election, resulting in periodic unexpected rebooting of voter check-in machines, which were separate from the electronic voting machines, and causing confusion and delays at voting sites. The problem was reportedly due to insufficient testing.
· In August of 2006 a U.S.
government student loan service erroneously made public the personal data of as
many as 21,000 borrowers on its web site, due to a software error. The bug was
fixed and the government department subsequently offered to arrange for free
credit monitoring services for those affected.
· A software error
reportedly resulted in overbilling of up to several thousand dollars to each of
11,000 customers of a major telecommunications company in June of 2006. It was
reported that
the software bug was fixed
within days, but that correcting the billing errors would take much longer.
· News reports in May of
2006 described a multi-million dollar lawsuit settlement paid by a healthcare
software vendor to one of its customers. It was reported that the customer
claimed there were problems with the software they had contracted for,
including poor integration of software modules, and problems that resulted in
missing or incorrect data used by medical personnel.
· In early 2006 problems
in a government's financial monitoring software resulted in incorrect election
candidate financial reports being made available to the public. The
government's election finance reporting web site had to be shut down until the
software was repaired.
· Trading on a major Asian stock exchange was brought to
a halt in November of 2005, reportedly due to an error in a system software
upgrade. The problem was rectified and trading resumed later the same day.
· A May 2005 newspaper
article reported that a major hybrid car manufacturer had to install a software
fix on 20,000 vehicles due to problems with invalid engine warning lights and occasional
stalling. In the article, an automotive software specialist indicated that the automobile
industry spends $2 billion to $3 billion per year fixing software problems.
· Media reports in January
of 2005 detailed severe problems with a $170 million high-profile U.S.
government IT systems project. Software testing was one of the five major
problem areas
according to a report of
the commission reviewing the project. In March of 2005 it was decided to scrap
the entire project.
· In July 2004 newspapers
reported that a new government welfare management system in Canada costing
several hundred million dollars was unable to handle a simple benefits rate increase
after being put into live operation. Reportedly the original contract allowed
for only 6 weeks of acceptance testing and the system was never tested for its
ability to handle a rate increase.
· Millions of bank
accounts were impacted by errors due to installation of inadequately tested software
code in the transaction processing system of a major North American bank,
according to mid-2004 news reports. Articles about the incident stated that it
took two weeks to fix all the resulting errors, that additional problems
resulted when the incident drew a large number of e-mail phishing attacks against the bank's
customers, and that the total cost of the incident could
exceed $100 million.
· A bug in site management
software utilized by companies with a significant percentage of worldwide web
traffic was reported in May of 2004. The bug resulted in performance problems for
many of the sites simultaneously and required disabling of the software until
the bug was fixed.
· According to news
reports in April of 2004, a software bug was determined to be a major contributor
to the 2003 Northeast blackout, the worst power system failure in North
American history. The failure involved loss of electrical power to 50 million
customers, forced shutdown of 100 power plants, and economic losses estimated
at $6 billion. The bug was reportedly in one utility company's vendor-supplied
power monitoring and management system, which was unable to correctly handle
and report on an unusual confluence of initially localized events. The
error was found and
corrected after examining millions of lines of code.
· In early 2004, news
reports revealed the intentional use of a software bug as a counterespionage tool.
According to the report, in the early 1980's one nation surreptitiously allowed
a hostile nation's espionage service to steal a version of sophisticated
industrial software that had intentionally-added flaws. This eventually
resulted in major industrial disruption in the country that used the stolen
flawed software.
· A major U.S. retailer
was reportedly hit with a large government fine in October of 2003 due to
web site errors that
enabled customers to view one another's online orders.
· News stories in the fall
of 2003 stated that a manufacturing company recalled all their transportation
products in order to fix a software problem causing instability in certain circumstances.
The company found and reported the bug itself and initiated the recall
procedure
in which a software
upgrade fixed the problems.
· In August of 2003 a U.S.
court ruled that a lawsuit against a large online brokerage company could
proceed; the lawsuit reportedly involved claims that the company was not fixing
system problems that sometimes resulted in failed stock trades, based on the
experiences of 4 plaintiffs
during an 8-month period.
A previous lower court's ruling that "...six miscues out of more than 400
trades does not indicate negligence." was invalidated.
· In April of 2003 it was
announced that a large student loan company in the U.S. made a software error
in calculating the monthly payments on 800,000 loans. Although borrowers were
to be notified of an
increase in their required payments, the company would still reportedly lose $8
million in interest. The error was uncovered when borrowers began reporting
inconsistencies in their bills.
· News reports in February
of 2003 revealed that the U.S. Treasury Department mailed 50,000 Social
Security checks without any beneficiary names. A spokesperson indicated that
the missing names were due to an error in a software change. Replacement checks
were subsequently mailed out with the problem corrected, and recipients were
then able to cash their
Social Security checks.
· In March of 2002 it was
reported that software bugs in Britain's national tax system resulted in
more than 100,000
erroneous tax overcharges. The problem was partly attributed to the
difficulty of testing the
integration of multiple systems.
· A newspaper columnist
reported in July 2001 that a serious flaw was found in off-the-shelf software
that had long been used in systems for tracking certain U.S. nuclear materials.
The same software had been recently donated to another country to be used in
tracking their own nuclear materials, and it was not until scientists in that
country discovered the problem, and shared the information, that U.S. officials
became aware of the problems.
· According to newspaper
stories in mid-2001, a major systems development contractor was fired and sued
over problems with a large retirement plan management system. According to the reports,
the client claimed that system deliveries were late, the software had excessive
defects, and it caused other systems to crash.
· In January of 2001
newspapers reported that a major European railroad was hit by the aftereffects
of the Y2K bug. The company found that many of their newer trains would not run
due to their inability to recognize the date '31/12/2000'; the trains were
started by altering the control system's
date settings.
· News reports in
September of 2000 told of a software vendor settling a lawsuit with a large mortgage
lender; the vendor had reportedly delivered an online mortgage processing
system that did not meet specifications, was delivered late, and didn't work.
· In early 2000, major problems were reported with a new computer system in a
large suburban U.S. public school district with 100,000+ students; problems
included 10,000 erroneous report cards and students left stranded by failed
class registration systems; the district's CIO was fired. The school district
decided to reinstate its original 25-year-old system for at least a year until
the bugs were worked out of the new system by the software vendors.
· A review board concluded
that the NASA Mars Polar Lander failed in December 1999 due to software
problems that caused premature shutdown of the Lander's descent engines
during its final approach to the Martian surface.
· In October of 1999 the
$125 million NASA Mars Climate Orbiter spacecraft was believed to be lost in
space due to a simple data conversion error. It was determined that spacecraft
software used certain data in English units that should have been in metric
units. Among other tasks, the orbiter was to serve as a communications relay
for the Mars Polar Lander mission, which failed for unknown reasons in December
1999. Several investigating panels were convened to determine the process
failures that allowed the error to go undetected.
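The unit-mismatch failure mode described above can be sketched in a few lines. This is a hypothetical illustration: the function names and numbers are invented for the example and are not the orbiter's actual flight software.

```python
LBF_S_TO_N_S = 4.44822  # one pound-force second expressed in newton-seconds

def thruster_impulse_lbf_s() -> float:
    """Ground software reports impulse in English units (lbf*s)."""
    return 100.0

def update_trajectory(impulse_n_s: float) -> float:
    """Navigation code assumes metric units (N*s); trajectory math elided."""
    return impulse_n_s

# Buggy integration: the English-unit value is passed straight through,
# so every trajectory update is off by a factor of about 4.45.
buggy = update_trajectory(thruster_impulse_lbf_s())

# Fixed integration: convert at the interface boundary.
fixed = update_trajectory(thruster_impulse_lbf_s() * LBF_S_TO_N_S)

print(buggy, fixed)  # the two results disagree by a factor of ~4.45
```

The lesson for testing is that an interface whose two sides were each "correct" in isolation still produced a system-level failure, which is exactly the kind of defect integration testing is meant to catch.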
· Bugs in software
supporting a large commercial high-speed data network affected 70,000 business
customers over a period of 8 days in August of 1999. Among those affected was
the electronic trading system of the largest U.S. futures exchange, which was
shut down for most of a week as a result of the outages.
· In April of 1999 a
software bug caused the failure of a $1.2 billion U.S. military satellite
launch, the costliest unmanned accident in the history of Cape Canaveral
launches. The failure was the latest in a string of launch failures, triggering
a complete military and industry review of U.S. space launch programs,
including software integration and testing processes. Congressional oversight
hearings were requested.
· A small town in Illinois
in the U.S. received an unusually large monthly electric bill of $7 million in
March of 1999. This was about 700 times larger than its normal bill. It turned
out to be due to bugs in new software that had been purchased by the local
power company to deal with Y2K software
issues.
· In early 1999 a major
computer game company recalled all copies of a popular new product due to
software problems. The company made a public apology for releasing a product
before it was ready.
· The computer system of a
major online U.S. stock trading service failed during trading hours
several times over a
period of days in February of 1999 according to nationwide news reports.
The problem was reportedly
due to bugs in a software upgrade intended to speed online trade
confirmations.
· In April of 1998 a major
U.S. data communications network failed for 24 hours, crippling a large part of
some U.S. credit card transaction authorization systems as well as other large
U.S. bank, retail, and government data systems. The cause was eventually traced
to a software bug.
· January 1998 news reports told of software problems at a
major U.S. telecommunications company
that resulted in no charges for long distance calls for a month for 400,000
customers.
The problem went
undetected until customers called up with questions about their bills.
· In November of 1997 the
stock of a major health industry company dropped 60% due to reports of failures
in computer billing systems, problems with a large database conversion, and inadequate
software testing. It was reported that more than $100,000,000 in receivables
had to be written off and that multi-million dollar fines were levied on the
company by government agencies.
· A retail store chain
filed suit in August of 1997 against a transaction processing system vendor
(not a credit card
company) due to the software's inability to handle credit cards with year 2000 expiration
dates.
· In August of 1997 one of
the leading consumer credit reporting companies reportedly shut down their new
public web site after less than two days of operation due to software problems.
The new site allowed web site visitors instant access, for a small fee, to
their personal credit reports. However, a number of initial users ended up
viewing each other's reports instead of their own, resulting in irate customers
and nationwide publicity. The problem was attributed to "...unexpectedly
high demand from consumers and faulty software that routed the files to the wrong
computers."
· In November of 1996,
newspapers reported that software bugs caused the 411 telephone information
system of one of the U.S. RBOC's to fail for most of a day. Most of the 2000 operators
had to search through phone books instead of using their 13,000,000-listing database. The bugs were introduced by new
software modifications and the problem software had been installed on both the
production and backup systems. A spokesman for the software vendor reportedly
stated that 'It had nothing to do with the integrity of the software. It was
human error.'
· On June 4 1996 the first
flight of the European Space Agency's new Ariane 5 rocket failed shortly after
launching, resulting in an estimated uninsured loss of a half billion dollars.
It was reportedly due to the lack of exception handling for an overflow
in a conversion from a 64-bit floating-point value to a 16-bit signed integer.
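The conversion failure described above can be sketched as follows. The actual flight code was Ada, so this Python model only illustrates the arithmetic, with an explicit range check standing in for the hardware operand-error exception.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_unchecked(x: float) -> int:
    """Unguarded conversion: raises when the value does not fit,
    analogous to the unhandled operand error that shut down the
    rocket's inertial reference system."""
    n = int(x)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{x} does not fit in a signed 16-bit integer")
    return n

def to_int16_saturating(x: float) -> int:
    """One defensive alternative: clamp to the representable range."""
    return max(INT16_MIN, min(INT16_MAX, int(x)))

print(to_int16_unchecked(1234.5))    # 1234 -- fits, no problem
print(to_int16_saturating(70000.0))  # 32767 -- clamped instead of crashing
# to_int16_unchecked(70000.0) would raise OverflowError
```

Note that the conversion was reused from Ariane 4, where the value in question could never exceed 16 bits; the fault only became a failure under Ariane 5's different flight profile, an argument for retesting reused code in its new environment.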
· Software bugs caused the
bank accounts of 823 customers of a major U.S. bank to be credited
with $924,844,208.32 each
in May of 1996, according to newspaper reports. The American Bankers
Association claimed it was the largest such error in banking history. A bank spokesman
said the programming
errors were corrected and all funds were recovered.
· On January 1 1984 all
computers produced by one of the leading minicomputer makers of the time
reportedly failed worldwide. The cause was claimed to be a leap year bug in a date
handling function utilized in deletion of temporary operating system files.
Technicians throughout the world worked for several days to clear up the
problem. It was also reported that the same bug affected many of the same
computers four years later.
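Leap-year faults of the kind reported above usually stem from an incomplete divisibility rule. The sketch below is illustrative, not the vendor's actual code.

```python
def is_leap_buggy(year: int) -> bool:
    # Incomplete rule: ignores the century exceptions.
    return year % 4 == 0

def is_leap_correct(year: int) -> bool:
    # Gregorian rule: divisible by 4, except centuries not divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# The two functions agree on most years (e.g. 1984), so such a fault
# can lie dormant for years before a failure is ever observed.
print(is_leap_buggy(1984), is_leap_correct(1984))  # True True
print(is_leap_buggy(1900), is_leap_correct(1900))  # True False
```

Date-handling bugs like this recur precisely because the failing inputs arrive only once every few years, which is why boundary dates belong in every regression test suite.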
· Software bugs in a
Soviet early-warning monitoring system nearly brought on nuclear war in 1983,
according to news reports in early 1999. The software was supposed to filter
out false missile detections caused by Soviet satellites picking up sunlight
reflections off cloud-tops, but failed to do so. Disaster was averted when a
Soviet commander, based on what he said was a '...funny feeling in my gut',
decided the apparent missile attack was a false alarm. The filtering software
code was rewritten.
31. Sample Test CASE: http://wiki.openqa.org/display/WTR/Example+Test+Case
More Questions:
1. Error – An error is a mistake made by a person. Errors tend to
propagate: a requirements error may be magnified during design and
amplified still further during coding.
2. Fault – A fault is the result of an error, i.e., the representation
of an error, where "representation" is the mode of expression, such as
narrative text, dataflow diagrams, hierarchy charts, source code, and
so on. Defect is a good synonym for fault, as is bug.
3. Failure – A failure occurs when a fault executes. Two points follow:
(1) failures occur only in an executable representation, which is
usually taken to be source code or, more precisely, loaded object code;
(2) this definition relates failures only to faults of commission.
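The fault/failure distinction can be made concrete with a small sketch: the fault is present in the code at all times, but a failure is observed only when a test input exercises it. The function and its "first maximum" specification below are invented for illustration.

```python
def max_index(values):
    """Specified behavior: return the index of the FIRST maximum element."""
    best = 0
    for i in range(1, len(values)):
        if values[i] >= values[best]:  # fault: >= keeps the LAST maximum
            best = i
    return best

print(max_index([3, 1, 2]))  # 0 -- correct; the faulty branch never fires
print(max_index([5, 9, 9]))  # 2 -- a failure: the spec requires index 1
```

The first test passes despite the fault, which is why a suite of passing tests never proves the absence of defects, only the absence of failures on the inputs tried.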
4. Incident – When a failure occurs, it may or may not be readily
apparent to the user. An incident is the symptom associated with a
failure that alerts the user to the occurrence of that failure.
5. Test – Testing is obviously concerned with errors, faults, failures,
and incidents. A test is the act of exercising software with test
cases. A test has two distinct goals: to find failures and to
demonstrate correct execution.
6. Test Case – A test case has an identity and is associated with a
program behavior; it also has a set of inputs and a list of expected outputs.
What types of error are there?
Errors of commission and errors of omission.
Which kind of fault is most difficult to detect?
Faults of omission are most difficult to detect.
What is the purpose of a test?
To verify correct behaviour, and to find a failure.
Typical test case information: an identifier, a purpose, preconditions,
inputs, expected outputs, and an execution history.
What are the difficulties in making a test case?
Setting up preconditions
Determining expected output
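The typical test case information above can be sketched as a small record; the field names are illustrative, not taken from any particular standard.

```python
from dataclasses import dataclass

@dataclass
class TestCase:
    """A minimal test case record: identity, preconditions, inputs,
    expected output, and a slot for the observed result."""
    test_id: str
    purpose: str
    preconditions: list
    inputs: dict
    expected_output: object
    actual_output: object = None

    def verdict(self) -> str:
        return "PASS" if self.actual_output == self.expected_output else "FAIL"

tc = TestCase(
    test_id="TC-001",
    purpose="Withdrawal within balance succeeds",
    preconditions=["account exists", "balance == 100"],
    inputs={"amount": 40},
    expected_output=60,
)
tc.actual_output = 100 - tc.inputs["amount"]  # exercise the (toy) system
print(tc.test_id, tc.verdict())  # TC-001 PASS
```

Recording the expected output explicitly, before execution, is what makes the pass/fail verdict mechanical rather than a judgment call at test time.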
Are test cases valuable? Why? What do we do about it?
Yes. They are difficult to construct, their correctness must be
verified, they need to be reused for regression testing, and they must
evolve with the software. So we document them, save them, and use them again.
What are the advantages of
functional testing?
Independent of implementation
Can be developed in parallel with the program text
What are the disadvantages
of functional testing?
Redundant tests
Gaps in tests
Cannot develop test cases for non-specified behavior
What are the advantages of
structural testing?
Strong theoretical basis
Nothing is as practical as a good theory!
Leads to good methods for discussing test
coverage
Can look for unspecified behaviour
What are the disadvantages
of structural testing?
Cannot find test cases outside the structure of
the program
Faults classified by severity:
1 Mild, 2 Moderate, 3 Annoying, 4 Disturbing, 5 Serious, 6 Very
serious, 7 Extreme, 8 Intolerable, 9 Catastrophic, 10 Infectious
Fault taxonomy
Input/output faults
Logic faults
Computation faults
Interface faults
Data faults
Levels of abstraction and
testing
What is the craft of
testing?
Identify errors we are likely to make
Create test cases to find the corresponding
faults
Testing limits
Dijkstra: “Program
testing can be used to show the presence of defects, but never
their absence”
It is impossible to fully test a software system in a reasonable
amount of time or at reasonable cost.
The infinite set of tests
There are enormous numbers
of possible tests. To test everything, you would have to:
Test every possible input to every variable.
Test every possible combination of inputs to
every combination of variables.
Test every possible sequence through the
program.
Test every hardware / software configuration,
including configurations
of servers not under your control.
Test every way in which any user might try to
use the program.
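A back-of-the-envelope count shows the scale involved. The sketch below assumes a toy function taking just two 32-bit integer inputs and an optimistic test-execution rate.

```python
# Even a tiny interface defeats exhaustive testing: a function taking
# two 32-bit integers has 2**32 * 2**32 = 2**64 possible input pairs.
inputs_per_variable = 2 ** 32
pairs = inputs_per_variable ** 2

tests_per_second = 10 ** 9              # an optimistic billion tests/sec
seconds_per_year = 60 * 60 * 24 * 365
years = pairs / (tests_per_second * seconds_per_year)

print(f"{pairs:,} input pairs take about {years:,.0f} years at 1e9 tests/second")
```

Running this shows exhaustive testing of even two integer inputs would take centuries, before sequences, configurations, or user behaviour are even considered, which is why test selection, not test execution, is the central problem.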
What is coverage?
Extent of testing of certain attributes or
pieces of the program, such as statement
coverage or branch
coverage or condition coverage.
Extent of testing completed, compared to a
population of possible tests.
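Statement and branch coverage can be distinguished with one small function. The coverage bookkeeping here is done by hand for illustration; a real project would use a tool such as coverage.py.

```python
def classify(x: int) -> str:
    """Return 'negative' for negative inputs, 'non-negative' otherwise."""
    result = "non-negative"
    if x < 0:
        result = "negative"
    return result

# The single test classify(-1) executes every statement (100% statement
# coverage) but only the True branch of the if: branch coverage is 50%.
# Adding classify(5) exercises the False branch as well, reaching 100%
# branch coverage with two tests.
print(classify(-1))  # negative
print(classify(5))   # non-negative
```

This is why branch coverage is a strictly stronger criterion than statement coverage: every branch-adequate suite is statement-adequate, but not the other way around.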
A MODEL FOR TESTING
Testing
is applied to anything from subroutines to systems that consist of millions of statements.
The archetypical system is one that allows the exploration of all aspects of testing
without the complications that have nothing to do with testing but affect any
very large project. It’s medium-scale programming. Testing the interfaces
between different parts of your own mind is very different from testing the
interface between you and other programmers separated from you by geography,
language, time, and disposition. Testing a one-shot routine that will be run
only a few times is very different from testing one that must run for decades
and may be modified by some unknown future programmer. Although all the
problems of the solitary routine occur for the routine that is embedded in a
system, the converse is not true: many kinds of bugs just can’t exist in solitary
routines. There is an implied context for the test methods discussed in this book—a
real-world context characterized by the following model project:
Application—The specifics of the
application are unimportant. It is a real-time system that must provide timely
responses to user requests for services. It is an online system connected to
remote terminals.
Staff—The programming staff
consists of twenty to thirty programmers—big enough to warrant formality, but
not too big to manage—big enough to use specialists for some parts of the
system’s design.
Schedule—The project will take 24
months from the start of design to formal acceptance by the customer.
Acceptance will be followed by a 6-month cutover period. Computer resources for
development and testing will be almost adequate.
Specification—The specification is
good. It is functionally detailed without constraining the design, but there
are undocumented “understandings” concerning the requirements.
Acceptance Test—The system will be accepted only after a formal acceptance test. The application
is not new, so part of the formal test already exists. At first the customer
will intend to design the acceptance test, but later it will become the
software design team’s responsibility.
Personnel—The staff is professional
and experienced in programming and in the application. Half the staff has
programmed that computer before and most know the source language. One-third,
mostly junior programmers, have no experience with the application. The typical
programmer has been employed by the programming department for 3 years. The
climate is open and frank. Management’s attitude is positive and knowledgeable
about the realities of such projects.
Standards—Programming and test
standards exist and are usually followed. They understand the role of
interfaces and the need for interface standards. Documentation is good. There
is an internal, semiformal, quality-assurance function. The database is centrally
developed and administered.
Objectives—The system is the first
of many similar systems that will be implemented in the future. No two will be
identical, but they will have 75% of the code in common. Once installed, the
system is expected to operate profitably for more than 10 years.
Source—One-third of the code is
new, one-third extracted from a previous, reliable, but poorly documented
system, and one-third is being rehosted (from another language, computer,
operating system—take your pick).
History—One programmer will quit
before his components are tested. Another programmer will be fired before
testing begins: excellent work, but poorly documented. One component will have
to be redone after unit testing: a superb piece of work that defies
integration. The customer will insist on five big changes and twenty small
ones. There will be at least one nasty problem that nobody—not the customer,
not the programmer, not the managers, nor the hardware vendor—suspected. A
facility and/or hardware delivery problem will delay testing for several weeks
and force second- and third-shift work. Several important milestones will slip
but the delivery date will be met. Our model project is a typical well-run,
successful project with a share of glory and
catastrophe—neither a
utopian project nor a slice of hell.
Figure 1.1. A Model of
Testing.
Overview
Figure 1.1 is a model of
the testing process. The process starts with a program embedded in an
environment, such as a computer, an operating system, or a calling program. We understand
human nature and its susceptibility to error. This understanding leads us to create
three models: a model of the environment, a model of the program, and a model of
the expected bugs. From these models we create a set of tests, which are then executed.
The result of each test is either expected or unexpected. If unexpected, it may
lead us to revise the test, our model or concept of how the program behaves,
our concept of what bugs are possible,
or the program itself. Only rarely would we attempt to modify the environment.
The Environment
A program’s environment
is the hardware and software required to make it run. For online systems
the environment may include communications lines, other systems, terminals, and
operators. The environment also includes all programs that interact with—and
are used to create—the program under test, such as operating system, loader, linkage
editor, compiler, utility routines. Programmers should learn early in their
careers that it’s not smart to blame the environment (that is, hardware and
firmware) for bugs. Hardware bugs are rare. So are bugs in
manufacturer-supplied software. This isn’t because logic designers and
operating system programmers are better than application programmers, but
because such hardware and software is stable, tends to be in operation for a
long time, and most bugs will have been found and fixed by the time programmers
use that hardware or software. Because hardware and firmware are
stable, we don’t have to consider all of the environment’s complexity. Instead,
we work with a simplification of it, in which only the features most important
to the program at hand are considered. Our model of the environment includes
our beliefs regarding such things as the workings of the computer’s
instruction set, operating system macros and commands, and what a higher-order language
statement will do. If testing reveals an unexpected result, we may have to change
our beliefs (our model of the environment) to find out what went wrong. But sometimes
the environment could be wrong: the bug could be in the hardware or firmware
after all.