OReilly: Practical C++ Programming - 7Chan

Dec 10, 2002 - tour of the programming process that shows you how real programs are created ...... Answer 5-3: The probl
Main Page Table of content Copyright Preface Scope of This Handbook How This Book Is Organized How to Read This Book If You Already Know C Font Conventions How to Contact Us Acknowledgments for the First Edition Acknowledgments for the Second Edition Part I: The Basics Chapter 1. What Is C++? 1.1 A Brief History of C++ 1.2 C++ Organization 1.3 How to Learn C++ Chapter 2. The Basics of Program Writing 2.1 Programs from Conception to Execution 2.2 Creating a Real Program 2.3 Getting Help in Unix 2.4 Getting Help in an IDE 2.5 Programming Exercises Chapter 3. Style 3.1 Comments 3.2 C++ Code 3.3 Naming Style 3.4 Coding Religion 3.5 Indentation and Code Format 3.6 Clarity 3.7 Simplicity 3.8 Consistency and Organization 3.9 Further Reading 3.10 Summary Chapter 4. Basic Declarations and Expressions 4.1 Basic Program Structure 4.2 Simple Expressions 4.3 The std::cout Output Object 4.4 Variables and Storage 4.5 Variable Declarations 4.6 Integers 4.7 Assignment Statements 4.8 Floating-Point Numbers 4.9 Floating-Point Divide Versus Integer Divide

4.10 Characters 4.11 Wide Characters 4.12 Boolean Type 4.13 Programming Exercises 4.14 Answers to Chapter Questions Chapter 5. Arrays, Qualifiers, and Reading Numbers 5.1 Arrays 5.2 Strings 5.3 Reading

Don't play games with the constructor; do exactly what I say!" This is done by declaring the constructor as an explicit construction:

class int_array {
    public:
        explicit int_array(unsigned int size);

Now we can initialize our variable using the constructor:

int_array example(10);      // Works with explicit

But the statement:

int_array example = 10;     // Illegal because of "ex

is now illegal. It is a good idea to limit the number of side effects and other things that can happen behind your back. For that reason, you should declare your constructors explicit whenever possible.
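The effect of explicit can be checked at compile time. The following is a minimal sketch, assuming a hypothetical int_array that only records its size (the size_ member, the size function, and make_example_size are illustrative and not taken from the book's class). It also assumes a C++11 compiler for static_assert and <type_traits>.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the int_array class discussed above.
class int_array {
    public:
        // explicit: this constructor is never used for implicit conversions
        explicit int_array(unsigned int size) : size_(size) {}

        unsigned int size() const { return size_; }
    private:
        unsigned int size_;     // Number of elements requested
};

// Direct initialization is still allowed with an explicit constructor.
inline unsigned int make_example_size() {
    int_array example(10);
    assert(example.size() == 10);
    return example.size();
}

// An unsigned int can construct an int_array directly...
static_assert(std::is_constructible<int_array, unsigned int>::value,
              "direct construction works");
// ...but it cannot be converted to one behind your back.
static_assert(!std::is_convertible<unsigned int, int_array>::value,
              "implicit conversion is blocked by explicit");
```

If the explicit keyword is removed, the second static_assert fails, which is the same conversion that int_array example = 10; relies on.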

13.6 Shortcuts

So far you have used only function prototypes in the classes you've created. It is possible to define the body of the function inside the class itself. Consider the following code:

class stack {
    public:
        // .... rest of class
        // Push an item on the stack
        void push(const int item);
};
inline void stack::push(const int item) {

The number of active stacks is " = 0); assert(ch < sizeof(type_info)/sizeof(type if (type_info[ch] == C_DIGIT) return (1); if ((ch >= 'A') && (ch = 'a') && (ch = 0); assert(ch < sizeof(type_info)/sizeof(type

            return ((type_info[ch] == C_ALPHA) ||
                    (type_info[ch] == C_DIGIT));
        default:
            assert(ch >= 0);
            assert(ch < sizeof(type_info)/sizeof(type
            return (type_info[ch] == kind);
    }
};

char_type::CHAR_TYPE char_type::type(const int ch) {
    if (ch == EOF) return (C_EOF);

    assert(ch >= 0);
    assert(ch < sizeof(type_info)/sizeof(type_info[0]
    return (type_info[ch]);
}

Example 27-5. stat/token.h

#include #include
/****************************************************
 * token -- token handling module
 *
 * Functions:
 *      next_token -- get the next token from the inp
 ****************************************************

/*
 * A list of tokens
 *      Note, how this list is used depends on defini
 *      This macro is used for defining the tokens ty
 *      as well as the string version of the tokens.
 */
#define TOKEN_LIST \
    T(T_NUMBER),    /* Simple number (floating po
    T(T_STRING),    /* String or character consta
    T(T_COMMENT),   /* Comment */
    T(T_NEWLINE),   /* Newline character */
    T(T_OPERATOR),  /* Arithmetic operator */
    T(T_L_PAREN),   /* Character "(" */
    T(T_R_PAREN),   /* Character ")" */
    T(T_L_CURLY),   /* Character "{" */
    T(T_R_CURLY),   /* Character "}" */
    T(T_ID),        /* Identifier */
    T(T_EOF)        /* End of File */

/*
 * Define the enumerated list of tokens.
 *      This makes use of a trick using the T macro
 *      and our TOKEN_LIST
 */
#define T(x) x          // Define T( ) as the name
enum TOKEN_TYPE {
    TOKEN_LIST
};
#undef T                // Remove old temporary macro

// A list of the names of the tokens
extern const char *const TOKEN_NAMES[];
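The trick in the listing above is to expand one list macro several times, redefining the helper macro each time. The following is a minimal sketch of the idea using a hypothetical three-entry COLOR_LIST instead of the book's TOKEN_LIST. The second expansion, which uses the # stringizing operator to build the name table, is an assumption about how a table like TOKEN_NAMES could be produced; token.cpp is not shown in this excerpt.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical three-entry list; every line but the last ends in a backslash.
#define COLOR_LIST \
    C(RED),        \
    C(GREEN),      \
    C(BLUE)

// First expansion: C(x) becomes the bare name, giving the enum values.
#define C(x) x
enum COLOR { COLOR_LIST };
#undef C

// Second expansion: C(x) becomes the string "x", giving the names.
#define C(x) #x
const char *const COLOR_NAMES[] = { COLOR_LIST };
#undef C

// Look up the printable name of a color.
inline const char *color_name(const COLOR color) {
    assert(color >= RED);
    assert(color <= BLUE);
    return COLOR_NAMES[color];
}
```

Because both the enum and the name table come from the same list, adding an entry to COLOR_LIST keeps the two in step without a second edit.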

/****************************************************
 * input_file -- data from the input file
 *
 * The current two characters are stored in
 *      cur_char and next_char
 *
 * The member function read_char moves everyone up
 * one character.
 *
 * The line is buffered and output every time a newlin
 * is passed.
 ****************************************************
class input_file: public std::ifstream {
    private:
        std::string line;   // Current line
    public:
        int cur_char;       // Current character (can be
        int next_char;      // Next character (can be EOF

        /*
         * Initialize the input file and read the fir
         * characters.
         */
        input_file(const char *const name) :
            std::ifstream(name),
            line("")
        {
            if (bad( ))
                return;
            cur_char = get( );
            next_char = get( );
        }
        /*

         * Write the line to the screen
         */
        void flush_line( ) {
            std::cout 1; --argc) {
        do_file(argv[1]);
        ++argv;
    }
    return (0);
}

Example 27-8. stat/makefile.unx

#
# Makefile for many Unix compilers using the
# "standard" command name CC
#
CC=CC
CCFLAGS=-g
OBJS= stat.o ch_type.o token.o

all: stat.out stat

stat.out: stat
	stat ../calc3/calc3.cpp >stat.out

stat: $(OBJS)
	$(CC) $(CCFLAGS) -o stat $(OBJS)

stat.o: stat.cpp token.h
	$(CC) $(CCFLAGS) -c stat.cpp

ch_type.o: ch_type.cpp ch_type.h
	$(CC) $(CCFLAGS) -c ch_type.cpp

token.o: token.cpp token.h ch_type.h
	$(CC) $(CCFLAGS) -c token.cpp

clean:
	rm stat stat.o ch_type.o token.o

Example 27-9. stat/makefile.gnu

#
# Makefile for the Free Software Foundation's g++ comp
#
CC=g++
CCFLAGS=-g -Wall
OBJS= stat.o ch_type.o token.o

all: stat.out stat

stat.out: stat
	stat ../calc3/calc3.cpp >stat.out

stat: $(OBJS)
	$(CC) $(CCFLAGS) -o stat $(OBJS)

stat.o: stat.cpp token.h
	$(CC) $(CCFLAGS) -c stat.cpp

ch_type.o: ch_type.cpp ch_type.h
	$(CC) $(CCFLAGS) -c ch_type.cpp

token.o: token.cpp token.h ch_type.h
	$(CC) $(CCFLAGS) -c token.cpp

clean:
	rm stat stat.o ch_type.o token.o

Example 27-10. stat/makefile.bcc

#
# Makefile for Borland's Borland-C++ compiler
#
CC=bcc32
#
# Flags
#       -N  -- Check for stack overflow
#       -v  -- Enable debugging
#       -w  -- Turn on all warnings
#       -tWC -- Console application
#
CCFLAGS=-N -v -w -tWC
OBJS= stat.obj ch_type.obj token.obj

all: stat.out stat.exe

stat.out: stat.exe
	stat ..\calc3\calc3.cpp >stat.out

stat.exe: $(OBJS)
	$(CC) $(CCFLAGS) -estat $(OBJS)

stat.obj: stat.cpp token.h
	$(CC) $(CCFLAGS) -c stat.cpp

ch_type.obj: ch_type.cpp ch_type.h
	$(CC) $(CCFLAGS) -c ch_type.cpp

token.obj: token.cpp token.h ch_type.h
	$(CC) $(CCFLAGS) -c token.cpp

clean:
	erase stat.exe stat.obj ch_type.obj token.obj

Example 27-11. stat/makefile.msc

#
# Makefile for Microsoft Visual C++
#
CC=cl
#
# Flags
#       AL -- Compile for large model
#       Zi -- Enable debugging
#       W1 -- Turn on warnings
#
CCFLAGS=/AL /Zi /W1
OBJS= stat.obj ch_type.obj token.obj

all: stat.out stat.exe

stat.out: stat.exe
	stat ..\calc3\calc3.cpp >stat.out

stat.exe: $(OBJS)
	$(CC) $(CCFLAGS) $(OBJS)

stat.obj: stat.cpp token.h
	$(CC) $(CCFLAGS) -c stat.cpp

ch_type.obj: ch_type.cpp ch_type.h
	$(CC) $(CCFLAGS) -c ch_type.cpp

token.obj: token.cpp token.h ch_type.h
	$(CC) $(CCFLAGS) -c token.cpp

clean:
	erase stat.exe stat.obj ch_type.obj token.obj

27.9 Programming Exercises

Exercise 27-1: Write a program that checks a text file for doubled words.

Exercise 27-2: Write a program that removes vulgar words from a file and replaces them with more acceptable equivalents.

Exercise 27-3: Write a mailing-list program. This program will read, write, sort and print mailing labels.

Exercise 27-4: Update the statistics program presented in this chapter to add a cross-reference capability.

Exercise 27-5: Write a program that takes a text file and splits each long line into two smaller lines. The split point should be at the end of a sentence if possible, or at the end of a word if a sentence is too long.

Chapter 28. From C to C++

No distinction so little excites envy as that which is derived from ancestors by a long descent.
François de Salignac de la Mothe Fénelon

C++ was built on the older language C, and there's a lot of C code still around. That's both a blessing and a curse. It's a curse because you'll probably have to deal with a lot of ancient code. On the other hand, there will always be work for you. This chapter describes some of the differences between C and C++, as well as how to migrate from one to the other.

28.1 K&R-Style Functions

Classic C (also called K&R C after its authors, Brian Kernighan and Dennis Ritchie) uses a function header that's different from the one used in C++. In C++ the parameter types and names are included inside the ( ) defining the function. In Classic C, only the names appear. Type information comes later. The following code shows the same function twice, first as defined in C++, followed by its K&R definition:

int do_it(char *name, int function)     // C++ function
{
    // Body of the function
}

int do_it(name, function)               // Classic C de
char *name;
int function;
{
    // Body of the function
}

When C++ came along, the ANSI C committee decided it would be a good idea if C used the new function definitions. However, because there was a lot of code out there using the old method, C accepts both types of functions. C++ does not.

28.1.1 Prototypes

Classic C does not require prototypes. In many cases, prototypes are missing from C programs. A function that does not have a prototype has an implied prototype of:

int funct( );       // Default prototype for Classic C

The ( ) in C does not denote an empty argument list. Instead it denotes a variable-length argument list with no type checking of the parameters. Also, Classic C prototypes have no parameter lists. The only "prototype" you'll see consists merely of "( )", such as:

int do_it( );       // Classic C function prototype

This tells C that do_it returns an int and takes any number of parameters. C does not type-check parameters, so the following are legal calls to do_it:

i = do_it( );
i = do_it(1, 2, 3);
i = do_it("Test", 'a');

C++ requires function prototypes, so you have to put them in. There are tools out there, such as the GNU protoize utility, that help you by reading your code and generating function prototypes. Otherwise, you will have to do it manually.
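Once a prototype is in place, C++ checks every call against it. The following is a minimal sketch; the body of do_it and the helper call_do_it are hypothetical and exist only so the example can be compiled and called.

```cpp
#include <cassert>
#include <cstring>

// Prototype: tells the compiler the parameter types before the first call.
int do_it(const char *name, int function);

// A caller is now checked against the prototype at compile time.
inline int call_do_it() {
    // do_it();                 // Error in C++: too few arguments
    // do_it(1, 2, 3);          // Error in C++: too many arguments
    return do_it("Test", 2);    // OK: matches (const char *, int)
}

// The definition must agree with the prototype.
// Hypothetical body: length of the name plus the function code.
int do_it(const char *name, int function) {
    assert(name != NULL);
    return static_cast<int>(std::strlen(name)) + function;
}
```

Uncommenting either of the two rejected calls turns a silent Classic C mistake into a compile-time error.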

28.2 struct

In C++, when you declare a struct, you can use the structure as a type name. For example:

struct sample {
    int i, j;           // Data for the sample
};
sample sample_var;      // Last sample seen

C is more strict. You must put the keyword struct before each variable declaration:

struct sample sample_var;   // Legal in C
sample sample_var;          // Illegal in C

28.3 malloc and free

In C++, you use the new operator to get memory from the heap and use delete to return the memory. C has no built-in memory-handling operations. Instead, it makes use of two library routines: malloc and free.

28.3.1 The C malloc function

The function malloc takes a single parameter (the number of bytes to allocate) and returns a pointer to them (as a char * or void *). But how do we know how big a structure is? That's where the sizeof operator comes in. It returns the number of bytes in a structure. To allocate a new variable of type struct foo in C, we use the code:

foo_ptr = (struct foo *)malloc(sizeof(struct foo));

Note that we must use a cast to turn the pointer returned by malloc into something useful. The C++ syntax for the same operation is much cleaner:

foo_ptr = new foo;

Suppose we want to allocate an array of three structures. We need to multiply our allocation size by three, resulting in the following C code:

foo_ptr = (struct foo *)malloc(sizeof(struct foo) * 3);

The much simpler C++ equivalent is:

foo_ptr = new foo[3];

The calloc Function

The function calloc is similar to malloc except that it takes two parameters: the number of elements in the array of objects and the size of a single element. Using our array of three foos example, we get:

foo_var = (struct foo *)calloc(3, sizeof(struct foo));

The other difference is that calloc initializes the structure to zero. Thus, the C++ equivalent is:

foo_var = new foo[3];
memset(foo_var, '\0', sizeof(foo) * 3);

A program can contain both C-style malloc calls and C++ new calls. The C memory allocators are messy, however, and should be converted to their C++ version whenever possible. There are a number of traps concerning C-style memory allocation. Suppose we take our structure foo and turn it into a class. We can, but shouldn't, use the C memory routines to allocate space for the class:

class foo {...};
foo_var = (struct foo *)malloc(sizeof(struct foo)); /

Because C++ treats struct as a special form of class, most compilers won't complain about this code. The problem is that our malloc statement allocates space for foo and that's all. No constructor is called, so it's quite possible that the class will not get set up correctly. The C++ new operator not only allocates the memory, but also calls the constructor so that the class is properly initialized.

28.3.2 The C free function

C uses the function free to return memory to the heap. In Classic C, the function free takes a single character pointer as a parameter (thus making a lot of casting necessary); ANSI C declares the parameter as void *, so the cast is no longer required:

free((char *)foo_var);
foo_var = NULL;

In C++ you delete a foo_var that points to a simple value this way:

delete foo_var;
foo_var = NULL;

If foo_array is a pointer to an array, you delete it with the code:

delete []foo_array;
foo_array = NULL;

Again, you must be careful when turning foo into a class. The free function just returns the memory to the heap. It does not call the destructor for foo, while the delete operator calls the destructor and then deletes the class's memory. C-style memory allocation is messy and risky. When converting code to C++ you probably should get rid of all malloc, calloc, and free calls whenever possible.

According to the C and C++ standards, memory allocated by malloc must be deallocated by free. Similarly, memory allocated by new must be deallocated by delete. However, most of the compilers I've seen implement new as a call to malloc and delete as a call to free. In other words, mixing new/free or malloc/delete calls will usually work. To avoid errors, you should follow the rules and never release memory with a routine from the other family.
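The difference between the two families can be observed directly. The following is a minimal sketch, assuming a hypothetical class foo that counts its live objects (the value and live_count members and both helper functions are illustrative, not from the book).

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical class whose constructor establishes a known state.
class foo {
    public:
        int value;
        static int live_count;  // Number of constructed, undestroyed foos

        foo() : value(42) { ++live_count; }
        ~foo() { --live_count; }
};
int foo::live_count = 0;

// new/delete: the constructor and the destructor both run.
inline int count_after_new() {
    foo *foo_ptr = new foo;
    const int seen = foo::live_count;   // Constructor has run
    delete foo_ptr;
    foo_ptr = NULL;
    return seen;
}

// malloc/free: raw bytes only; no constructor, no destructor.
inline int count_after_malloc() {
    void *raw = std::malloc(sizeof(foo));
    assert(raw != NULL);
    const int seen = foo::live_count;   // Unchanged: nothing was constructed
    std::free(raw);
    return seen;
}
```

The sketch never treats the malloc'ed bytes as a foo, because no object exists there until a constructor has run.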

28.4 Turning Structures into Classes

Frequently when examining C code you may find a number of defined struct statements that look like they should be objects defined as C++ classes. Actually, a structure is really just a data-only class with all the members public. C programmers frequently take advantage of the fact that a structure contains only data. One example of this is reading and writing a structure to a binary file. For example:

a_struct struct_var;    // A structure variable

// Perform a raw read to read in the structure
read_size = read(fd, (char *)&struct_var, sizeof(stru

// Perform a raw write to send the data to a file
write_size = write(fd, (char *)&struct_var, sizeof(st

Turning a structure like this into a class can cause problems. C++ keeps extra information, such as virtual function pointers, in a class. When you write the class to disk using a raw write, you are outputting all that information. What's worse, when you read the class in, you overwrite this bookkeeping data. For example, suppose we have the class:

class sample {
    public:
        const int sample_size;          // Number of sampl
        int cur_sample;                 // Current sample

        sample( ) : sample_size(100) {} // Set up cl
        virtual void get_sample( );     // Routine to ge
};

Internally, this class consists of three member variables: a constant, sample_size (which C++ won't allow you to change); a simple variable, cur_sample; and a pointer to the real function to be used when get_sample is called. All three of these are written to disk by the call:

sample a_sample;
// ...
write_size = write(fd, (char *)&a_sample, sizeof(a_sa

When this class is read, all three members are changed. That includes the constant (which we aren't supposed to change) and the function pointer (which now probably points to something strange). C programmers also make use of the memset function to set all the members of a structure to zero. For example:

struct a_struct { ... };
a_struct struct_var;
// ...
memset(&struct_var, '\0', sizeof(struct_var));

Be careful when turning a structure into a class. If we had used the object a_sample (of class sample) in the previous example instead of the structure struct_var, we would have zeroed the constant sample_size as well as the virtual function pointer. The result would probably be a crash if we ever tried to call get_sample.
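One way to catch the hazards described above is to let the compiler refuse byte-wise operations on unsuitable types. The following is a minimal sketch, assuming a C++11 compiler; the zero_object helper, the members of a_struct, and zeroed_sum are hypothetical. The standard std::is_trivially_copyable trait is used as the guard, which is a technique swapped in here and not something the book itself presents.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Plain data: safe to treat as a block of bytes.
struct a_struct {
    int id;
    int count;
};

// Has a const member and a virtual function, like the sample class above.
class sample {
    public:
        const int sample_size;
        int cur_sample;

        sample() : sample_size(100), cur_sample(0) {}
        virtual void get_sample() {}
        virtual ~sample() {}
};

// Zero an object, but refuse to compile for types where memset is unsafe.
template <typename T>
void zero_object(T &object) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memset and raw read/write need a trivially copyable type");
    std::memset(&object, '\0', sizeof(object));
}

inline int zeroed_sum() {
    a_struct struct_var = {7, 9};
    zero_object(struct_var);        // OK: a_struct is plain data
    // sample a_sample;
    // zero_object(a_sample);       // Would fail to compile
    return struct_var.id + struct_var.count;
}
```

The same guard could sit in front of a raw read or write, so that turning a_struct into a class with virtual functions breaks the build instead of corrupting the object at run time.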

28.5 setjmp and longjmp

C has its own way of handling exceptions through the use of setjmp and longjmp. The setjmp function marks a place in a program. The longjmp function jumps to the place marked by setjmp. Normally setjmp returns a zero. This tells the program to execute normal code. When an exception occurs, the longjmp call returns to the location of the setjmp function. The only difference the program can see between a real setjmp call and a fake setjmp call caused by a longjmp is that normally setjmp returns a zero. When setjmp is "called" by longjmp, the return value is controlled by a parameter to longjmp. The definition of the setjmp function is:

#include <setjmp.h>
int setjmp(jmp_buf env);

where env is the place where setjmp saves the current environment for later use by longjmp. The setjmp function return values are as follows:

0
    Normal call
Nonzero
    Non-zero return codes are the result of a longjmp call.

The definition of the longjmp call is:

void longjmp(jmp_buf env, int return_code);

where env is the environment initialized by a previous setjmp call, and return_code is the return code that will be returned by the setjmp call. Figure 28-1 illustrates the control flow when using setjmp and longjmp.

Figure 28-1. setjmp/longjmp control flow

There is one problem here, however. The longjmp call returns control to the corresponding setjmp. It does not call the destructors of any classes that are "destroyed" in the process. In Figure 28-1 we can see that in the subroutine we define a class named a_list. Normally we would call the destructor for a_list at the end of the function or at a return statement. However, in this case we use longjmp to exit the function. Since longjmp is a C function, it knows nothing about classes and destructors and does not call the destructor for a_list. So we now have a situation where a variable has disappeared but the destructor has not been called. The technical name for this situation is a "foul-up." When converting C to C++, change all setjmp/longjmp combinations into exceptions.
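The exception-based replacement can be sketched as follows. The a_list class here is a hypothetical stand-in that only counts destructor calls, and subroutine and run are illustrative names; the point is that leaving a function through throw runs the destructors that longjmp would skip.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a_list: it only counts how many times
// its destructor has run.
class a_list {
    public:
        static int destroyed;
        a_list() {}
        ~a_list() { ++destroyed; }
};
int a_list::destroyed = 0;

// The subroutine exits early by throwing instead of calling longjmp.
inline void subroutine() {
    a_list list;        // Destroyed automatically while the stack unwinds
    throw std::runtime_error("something went wrong");
}

// Returns 0 for the normal path and 1 when the exception was caught,
// mirroring the zero/nonzero return convention of setjmp.
inline int run() {
    try {
        subroutine();
    } catch (const std::runtime_error &) {
        return 1;
    }
    return 0;
}
```

The try block plays the role of the setjmp call and the throw plays the role of longjmp, with the destructor call added for free.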

28.6 Mixing C and C++ Code

It is possible for C++ code to call a C function. The trick is that you need to tell C++ that the function you are calling is written in C and not C++. This is accomplished by declaring the function prototypes inside an extern "C" block. For example:

extern "C" {
    extern int the_c_function(int arg);
}
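A common way to package the extern "C" declaration is a header that both a C and a C++ compiler can read. The following is a minimal sketch; the header name the_c_function.h, the function body, and call_from_cpp are hypothetical. The __cplusplus macro is defined by every C++ compiler and by no C compiler, which is what makes the guard work.

```cpp
#include <cassert>

// Contents of a hypothetical header, the_c_function.h, written so that
// both a C compiler and a C++ compiler can include it.
#ifdef __cplusplus
extern "C" {
#endif

int the_c_function(int arg);

#ifdef __cplusplus
}
#endif

// Stand-in for the C implementation, which would normally live in a .c
// file compiled by a C compiler. It is defined here only so the sketch
// links on its own.
extern "C" int the_c_function(int arg) {
    return arg * 2;
}

// C++ caller: uses the function like any other.
inline int call_from_cpp() {
    const int result = the_c_function(21);
    assert(result == 42);
    return result;
}
```

The C compiler sees only the plain prototype, while the C++ compiler sees it wrapped in extern "C" and looks for the function under its C name.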

28.7 Summary

What you must do to get C to compile with a C++ compiler:

Change K&R-style function headers into standard C++ headers.
Add prototypes.
Rename any functions or variables that are C++ keywords.
Change setjmp/longjmp calls into catch/throw operations.

Once you've done these tasks, you have a C+1/2 program. It works, but it's really a C program in C++'s clothing. To convert it to a real C++ program, you also need to do the following:

Change malloc to new.
Change free to delete or delete [] calls.
Turn printf and scanf calls into cout and cin.
When turning struct declarations into class variables, be careful of read, write, and memset functions that use the entire structure or class.

28.8 Programming Exercise

Exercise 28-1: There are a lot of C programs out there. Turn one into C++.

Chapter 29. C++'s Dustier Corners

There be of them that have left a name behind them.
Ecclesiasticus XLIV, 1

This chapter describes the few remaining features of C++ that are not described in any of the previous chapters. It is titled "C++'s Dustier Corners" because these statements are hardly ever used in real programming.

29.1 do/while

The do/while statement has the following syntax:

do {
    statement;
    statement;
} while (expression);

The program executes the statements, then tests the expression, and stops looping if the expression is false (0).

Because the test comes after the body, this construct always executes at least once.

do/while is not frequently used in C++ because most programmers prefer to use a while/break combination.
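The two styles mentioned above can be compared side by side. The following is a minimal sketch using a hypothetical digit-counting function, chosen because the body must run at least once for the input 0.

```cpp
#include <cassert>

// Count the decimal digits in a number. The body must run at least once
// so that 0 is reported as having one digit.
inline int digit_count(unsigned int number) {
    int digits = 0;
    do {
        ++digits;
        number /= 10;
    } while (number != 0);
    return digits;
}

// The same logic written with the while/break combination.
inline int digit_count_while(unsigned int number) {
    int digits = 0;
    while (true) {
        ++digits;
        number /= 10;
        if (number == 0)
            break;
    }
    assert(digits >= 1);
    return digits;
}
```

Both versions test the exit condition after the first pass through the body; the while/break form simply makes the exit point an ordinary statement.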

29.2 goto

All the sample programs in this book were coded without using a single goto. In actual practice I find I use a goto statement about once every other year. For those rare times that a goto is necessary, its syntax is:

goto label;

where label is a statement label. Statement labels follow the same naming convention as variable names. Labeling a statement is done as follows:

label: statement;

For example:

for (x = 0; x < X_LIMIT; ++x) {
    for (y = 0; y < Y_LIMIT; ++y) {
        assert((x >= 0) && (x < X_LIMIT));
        assert((y >= 0) && (y < Y_LIMIT));
        if (data[x][y] == 0)
            goto found;
    }
}
std::cout

) operator 2nd greater than or equal to (>=) operator 2nd guard digits 2nd [See also floating-point numbers] guidelines coding design modules

I l@ve RuBoard

headers comments in 2nd files help, online Unix hex I/O manipulator hexadecimal numbers 2nd 3rd 4th converting hiding member functions hierarchy, class high-level languages histogram program (hist) history of C++ of programming hyphen (-) for command-line options

I/O (input/output) binary 2nd C++ file C++ file package conversion routines 2nd with disk files manipulators operators >> (input) >) operator 2nd 3rd 4th input_file class inputting data numbers strings instructions 2nd int (integer) keyword int number type int variable type integer.h file

integers 2nd 3rd converting to floating-point numbers dividing long int type 2nd short int type 2nd signed versus unsigned types unsigned very short (char type) integrated development environment (IDE) interaction with modules interactive debugging conditional breakpoint trick interfaces procedures 2nd troubleshooting internal number formats invert (~) operator [See NOT operator, binary] ios ::app flag ::ate flag ::binary flag 2nd ::dec flag ::fixed flag ::hex flag ::in flag ::internal flag 2nd ::left flag ::nocreate flag ::noreplace flag ::oct flag ::out flag ::right flag ::scientific flag ::showbase flag ::showpoint flag ::showpos flag ::skipws flag ::trunc flag ::uppercase flag ::unitbuf flag iostream class ::fill ::precision ::setf ::unsetf iostream.h include file 2nd isalpha macro istream class ::getline ::sentry italics in comments iterators set containers STL

justification

K&R-style functions keyboards, trigraphs keywords static struct

L character, for long integers labels for goto statements languages assembly language C [See C language] C++ [See C++ language] COBOL FORTRAN high-level low-level machine code machine language PASCAL %ld conversion leaves, trees left shift (