Clemson University -- CPSC 215 -- Summer 2003

Software Development

General Principles

1. Aim for reliable and reusable code.
   a. reliable code
      i.   well-tested
      ii.  retains debugging helps such as integrity checks
      iii. contains error-handling routines (i.e., expect bad data)
   b. reusable code
      i.   standard module interfaces
      ii.  well-documented (even with hints on how to make changes)

2. Developing good software is more like building a skyscraper than
   hacking out a doghouse. (I recommend getting a copy of S. McConnell,
   Code Complete, Microsoft Press, 1993, 857pp.)

3. There are several phases to software development, from problem
   specification to maintenance.

4. Read coding and style guidelines and incorporate the suggestions
   into your coding.

5. Plan for and carry out testing.

6. Learn from experience - for each bug you encounter, ask yourself:
   a) Does this bug represent a pattern for me? (I.e., is this same
      mistake or a similar one somewhere else in this program?)
   b) How could I have automatically detected this bug?
   c) How could I have prevented this bug?
   [from Steve Maguire and Tom van Vleck]


Software Development

analogy of building a doghouse vs. building a skyscraper
[Steve McConnell]

  doghouse                      skyscraper
  --------                      ----------
  one person                    many people: architects, builders, ...
  few dollars                   many dollars
  one afternoon                 much time
  short list of materials       much planning
  one trip to Home Depot        scheduled deliveries (just-in-time)
  if you forget the doorway,    doorways and rest of design must be done
    then hack one out             well enough to get people to rent
  if roof leaks, you can        design and/or construction problems
    start from scratch            result in major financial problems

also student programs are usually like dixie cups - use once and
throw away

--- you need to move beyond doghouses and dixie cups ---

construction metaphor

  problem definition - what kind of building?
  requirements analysis - what do customers need/want?
  architectural design - overall appearance and functionality
  detailed blueprints - specifics
  construction - foundation, framing, roofing, etc.
  permits, inspections
  acceptance
  maintenance

  => plan, plan, plan ("do the prereqs or be prewrecked")
  => write it down - the larger the project, the more formal the
     documents
  => use standard methodologies to reduce communication difficulties,
     esp. when many people are involved, e.g., standard blueprint
     symbols
  => build in the right order, e.g., do the wiring before the sheetrock
  => don't build things you can buy already built, e.g., windows,
     lights
  => build in margins of safety in the design and in the schedule
  => consider ease of use and safety, e.g., turning on a light is a
     matter of putting two wires together, so why do we use light
     switches on the wall rather than just letting two wires hang down
     from each fixture?
  => choose the right tools, e.g., cut lumber with a saw, not an axe
  => make maintenance easy, e.g., reasonable crawl space
  => find design flaws early, e.g., moving the location of a wall gets
     progressively more costly as construction continues
  => don't waste time and money fixing problems that could have been
     avoided


Software Development Phases

1. Problem specification
   a. describe the problem, not the solution
   b. description should be in the users' language

2. Requirements analysis and specification
   a. based on interviews with users
   b. users should review the document and agree; it acts like a
      contract to formally set down expectations and thus avoid later
      arguments
   c. requirements will change as users better understand their needs,
      so set up a formal change control procedure that associates
      reqt. changes with corresponding changes in schedule and budget
   d. consider building a prototype system to allow users to explore
      their needs before starting the full-scale system
   e. the cost of fixing an error in the reqts., or of making other
      changes in the reqts., increases with time [IBM, GTE, TRW ests.
      from McConnell]:
         make change in reqts. phase:             1x
         make change in arch. design phase:       5x
         make change in coding phase:            10x
         make change in unit testing phase:      20x
         make change in acceptance test phase:   50x
         make change in maintenance phase:      100x
   f. document the required inputs and outputs (incl. report formats)
   g. document required external interfaces (hardware, software,
      communication)
   h. state the expected response times to the tasks
   i. security reqts.
   j. how easy will it be to test that an implementation satisfies all
      reqts.?

3. Architectural (high-level) design
   a. large-picture design decisions, each with documentation of the
      assumptions involved in the decision making and the rationale for
      why certain choices were made as compared to alternatives
   b. description of major objectives (e.g., ease of modification
      versus highest performance) and identification of trade-offs
   c. buy versus build decisions
   d. key objects and data structures
   e. key algorithms
   f. description of strategy for handling input and user interface
      i.   identification and description of input data and input
           format
      ii.  analysis of input data partitioning and/or special cases
      iii. specification of command structures and/or menus
   g. description of strategy for handling output
      i.   identification and description of output data and output
           format
      ii.  analysis of output data partitioning and/or special cases
      iii. specification of report formats
   h. description of strategy for handling errors
      i.   detection and recovery techniques
      ii.  response: fix the error, enter degraded mode, or stop
      iii. how to notify user, e.g., error message conventions
   i. description of strategy for handling changes
      i.   likely changes that are anticipated
      ii.  how the effect of changes can be limited
           - e.g., use a table-driven approach rather than a hard-coded
             set of if statements, also allow the table to be loaded
             from an external file
           - e.g., isolate user interface into one or a small number of
             modules
           - e.g., what if output had to be translated into another
             language?
   j. module definitions
      i.   partition of functionality, designed for loose coupling
      ii.  module interfaces
      iii. time and space budgets for each task or function
   k. evaluate system architecture in terms of ease of implementing,
      testing, changing, maintaining, and reuse

4. Detailed design
   a. breakdown of modules into routines; only one task per routine;
      use separate routines to hide decisions and details about
      algorithms, data representation, and control structures
   b. define naming, documentation, and source file layout standards
   c. procedure call structure chart
   d. interfaces (e.g., input and output parameters)
   e. how are errors detected and handled?
   f. how are error values distinguished from normal return values?
   g. description of choice of data structures
   h. high-level description of algorithm, including any decision
      tables or program logic needed for input/output data partitioning
   i. description of important variables (e.g., legend with variable
      name, data type, usage labeled as input/intermediate/output, and
      short description of usage)
   j. can special cases be eliminated by using an alternate algorithm?
      -- try to eliminate every "if" statement to reduce the complexity
      of testing
   k. avoid any undefined behavior
   l. identification and description of test cases, including normal
      cases, boundary cases, special cases, and error cases
   m. evaluate detailed design in terms of ease of implementing,
      testing, changing, maintaining, and reuse

5. Coding and debugging -- see below

6. Unit testing -- see below

7. System integration and testing
   a. incremental integration
   b. overall strategy
      i.   top-down
      ii.  bottom-up
      iii. combined - sandwiched
      iv.
           risk-oriented - higher risk first
      v.   feature-oriented - allows evolutionary delivery

8. Acceptance test by users

9. Maintenance and enhancement


Example Coding Guidelines

1. Procedures/functions
   a. a procedure name should be a strong verb and its object - the
      name should describe everything the procedure does and should
      reflect a single, clearly-defined purpose (if you have trouble
      determining such a name, then your procedure is likely doing too
      much)
   b. a function name should describe what it returns
   c. make parameters call-by-value or const whenever possible
   d. restrict the maximum number of parameters to 7
   e. range check or otherwise validate input parameters when possible
   f. restrict the nesting level of control structures to 3-6 deep
   g. size should be no larger than three pages total, including
      comments
   [See McConnell for more details]

2. Style -- good advice from Andy Glew:

   "I recommend reading as many different coding standards as
   possible, so that you can:
   - get an idea of what standards are around
   - pick and choose some of the best ideas to apply to your own code
   - ideally, find that one of the popular coding standards is
     acceptable, and use it, rather than being totally idiosyncratic
   - learn to recognize code written to the other popular standards,
     so that you can sometimes `see the patterns'."

   [see the links from the course web page]


Testing Considerations

1. Plan for testing. Set aside time in your schedule. Testing is the
   most important part of developing a program and often consumes the
   most time. However, it is usually the least enjoyable part for the
   programmers, which is often due to a lack of planning for and
   appreciation of testing.

2. Testing can increase confidence in your code, but it can never
   guarantee the absence of bugs.

3. Approach testing as a series of scientific experiments. Your main
   hypothesis is "my code does not work correctly". Look toward
   experimentally verifying lots of minor hypotheses, i.e.,
   "my code does not correctly execute this test case". Approach each
   bug found not as a failure, but as a success!
   a. identify minor hypothesis
   b. predict correct outcome
   c. build experimental test rig that provides observability
   d. run experiment
   e. observe outcome
   f. compare outcome to prediction
   g. revise theory or experiment or test rig accordingly

4. Different stages of testing
   a. unit testing of individual modules
      i.   test low-level routines using drivers
      ii.  test high-level routines using stubs
   b. regression testing after each change
   c. system testing after integration of modules
   d. quality assurance before product is released
      i.   alpha tests in-house
      ii.  external beta test sites
   e. continuing maintenance

5. Identify test cases
   a. functional testing - closed box
      i.   specifications
      ii.  special features, including error handling
      iii. boundary values and special values for input (e.g., 0)
   b. structural testing - open box
      i.   execution paths (identified by test coverage tool)
      ii.  error-handling paths, including correct error description
      iii. interfaces
      iv.  possible problems identified during walkthroughs
   c. combination
      i.   likely problems due to changes in specifications or code
      ii.  pathological cases (based on intuition and experience)
   d. stress testing
      i.   special test driver that makes a large number of random
           calls and mechanically tests the correctness of the calls
      ii.  uses second algorithm or implementation to validate results

6. Written test plans
   a. ground rules (e.g., required hardware and software)
   b. specific test cases with expected responses
   c. get approval, then freeze the plan; it is not a working document

7. Written error reports
   a. unique identification number for later tracking
   b. reference to test case in written test plan
   c. version of the software in which bug was found
   d. name of tester and date of testing
   e. description of problem (i.e., what was incorrect?)
   f. status: open, in-work, fixed-but-not-in-new-version,
      ready-for-retest, closed (i.e., resolved/fixed)
   g. priority: essential, adverse-with-no-workaround,
      adverse-with-workaround, inconvenience, other
   h. name of programmer assigned to fix bug
   i. modules changed during fix

8. Run regression tests. Save old test cases and test rigs. After a
   change, run several previous tests again. This gives confidence
   that no bugs have been introduced by the change. Do this even after
   a "trivial" change.

9. Test teams may be overwhelmed by lots of trivial errors passed to
   them by the implementation team, and this might cause them to
   overlook serious errors. Give the testers fewer bugs to handle and
   they will do a better job looking for the remaining bugs.


Debugging Hints

know the common bugs (in general and for you specifically)
  - wrong type declarations
  - did not properly initialize data structures
  - off-by-one errors, including one too many trips through a loop and
    access to just-off-the-edge array entries
  - memory leaks
  - null pointers
  - wrong parameters passed (wrong type, wrong order, too few, too
    many)
  - misuse of global variables
  - assignment operator (=) instead of equality operator (==)

if you have a long list of compiler errors
  - fix the first few, then look down the remainder for obvious errors
    that you can also fix
  - after you have fixed all the obvious errors and those at the top of
    the list, recompile; some of the non-obvious error messages may
    have been generated because the statement parsing was thrown off
    by one of the errors early in the list

test in isolation (e.g., bottom up)
  - some programmers resent time spent building scaffolding code
    (e.g., test drivers), but proper testing will save you time in bug
    detection

plan for extra time

**think** before throwing in all those extra print statements
  - try to narrow down where in the program any wrong output could
    have been created
  - briefly use a debugger to get a stack trace to see which function
    was active at the time of a crash
  - reason back from the output and/or crash site
  - did your most recent change to the program introduce this error?

try to recreate the bug in a smaller or more constrained context
  - narrow down the input to the smallest context in which the bug
    shows up
  - write special test cases / test drivers / trivial programs to
    verify your understanding (i.e., go back to the idea of testing in
    isolation)
  - write a special test case that will confirm or deny your guess
    about what went wrong

save a copy of your source before you start decorating it with prints

use binary search in printing to try to reduce the output you have to
sort through
  - start in the middle of your program and see if the error occurred
    before or after this point

your debugging output should be concise and should have fixed display
formats so that you can visually search for patterns or other clues

consider graphical approaches to output (scatter plots, histograms)

filter your debugging output through grep

flush the output buffers after each print (e.g., fflush(stdout)) so
that the program doesn't terminate with unseen debugging messages
(i.e., ones left in the output buffer and discarded before being
printed), which would throw you off the track of the bug

print pointer values with %p and try to learn which ranges of
addresses are valid (e.g., 0 is not)

display sizes and lengths of data structures to gain clues from
imbalance


Defensive programming

turn on all compiler warnings (-Wall flag for gcc)

validate predefined data structures (e.g., tables) on startup

document preconditions (what must be true before a function or code
segment) and postconditions (what must be true afterwards)

use asserts to validate input, parameters, assumptions, preconditions,
postconditions, and even "impossible" conditions; use them to perform
consistency or integrity checks (e.g., the probabilities sum to 1.0,
and each is between 0.0 and 1.0 inclusive)

--assert.c--
/* assert function, similar to that recommended by Steve
 * Maguire in "Writing Solid Code," Microsoft Press, 1993.
 *
 * use abort() instead of exit() below to cause a core dump, which you
 * can then inspect using a debugger.
 */

#include <stdio.h>
#include <stdlib.h>   /* exit(), abort() */

void _assert(char *filename, unsigned linenumber){
  fflush(stdout);     /* push out pending normal output first */
  fprintf(stderr,"\nassertion failed: %s, line %u\n",
          filename, linenumber);
  fflush(stderr);
  exit(-1);
}

--assert.h--
#ifdef DEBUG
void _assert(char *,unsigned);
/* checks are compiled in only when DEBUG is defined */
#define ASSERT(f) if(f) NULL; else _assert(__FILE__,__LINE__)
#else
#define ASSERT(f) NULL
#endif

--test_assert.c--
#include <stdio.h>
#include"assert.h"

int main(void){
  int i, j;
  i = 0;
  j = 5;
  ASSERT(i