


C++: A Dialog


CHAPTER 12 More on the Home Inventory Project
In this chapter, we will take the home inventory project to a stage where it will be a useful, if limited, application program that will allow you to keep track of your possessions. Of course, this doesn't mean that we will have completely finished this project; it's rare for a software application to be finished in the sense that nothing more can be done to improve it. In fact, the usual way to tell when you're done working on a project is that you have run out of time and have to put it into service, not that it does everything that you would like it to do. In this way, the home inventory project is quite representative of programming projects in general.1

We'll get right to our improvements as soon as we get through some definitions as well as the objectives of the chapter.

12.1. Definitions
The preprocessor is a part of the C++ compiler that modifies the source code of a program seen by the rest of the compiler; thus, the name "preprocessor". The preprocessor is needed to accomplish tasks that cannot be done by the rest of the compiler, e.g., to prevent declarations in a header file from being processed twice for the same implementation file.
A preprocessor directive is a command by which we tell the preprocessor to perform a specific task, e.g., to ignore a section of source code under certain conditions.
A preprocessor symbol is a constant value similar to a const that is known only to the preprocessor, not to the rest of the compiler. The rules for naming preprocessor symbols are the same as those for other identifiers, but it is customary to use all uppercase letters in preprocessor symbols so that they can be readily distinguished from other identifiers.
The #ifndef preprocessor directive tells the preprocessor to check whether a particular preprocessor symbol has been defined. If not, the following source code is treated normally; if it has been defined, the following source code is skipped by the rest of the compiler as though it were not present in the source file.
The #define preprocessor directive defines a preprocessor symbol.
The #endif preprocessor directive terminates a section of source code controlled by a #ifndef or other conditional preprocessor directive.
An include guard is a mechanism to prevent a class definition from being included in a source code file more than once.
A default argument is a method of specifying a value for an argument to a function when the user of the function hasn't supplied a value for that argument. The value of the default argument is specified in the declaration of the function.

12.2. Objectives of This Chapter

By the end of this chapter we will have
1. Created an extended version of the standard string class that allows case-insensitive searching and comparison;
2. Learned how to use include guards to prevent a class interface from accidentally being defined more than once;
3. Learned more about the hazards of the "magic" value 0, as well as how we can prevent one of them that can occur when using the standard string class;
4. Learned how to restrict the application of a constructor via the explicit keyword;
5. Discovered just how difficult it is to anticipate how a program will be used and how many bugs it may contain;
6. Seen how a seemingly simple request for an added feature in a program can be extremely difficult to fulfill.

12.3. Extending the Functionality of strings

The standard string class has been satisfactory for the uses we've made of it so far, but at this point we need some functionality that is not present in that class, namely, the ability to compare and search for portions of strings without regard to the case of the text involved. For example, we want to be able to consider "Steve" and "steve" equal, even though their ASCII code sequences differ. For another example, we want to be able to determine whether the character sequence "red" is present in a string, regardless of whether it is capitalized.

Susan had a question about why we need to do this:

Susan: Why isn't this already in the standard string class? We can't be the only ones who want to do this.
Steve: I'm not sure, other than the C++ philosophy of providing only those tools that can't be constructed from other tools. As you'll see, there are a number of ways to implement this new functionality using the existing facilities of the standard library and of C++. So I guess the library implementers figured that we could do it ourselves.
Now that we know what functionality we need to add, how do we implement this?

12.4. How to Implement Our New string Functionality

There are a number of ways that I could have implemented the functionality that we will need to compare and search for character data without regard for the case of the text we are looking for or comparing. Here are several possibilities:
1. Writing global functions to do case insensitive searches and comparisons.
2. Using a completely separate string class like the one we created in Chapters 7 and 8.
3. Using the standard library features that allow us to specify our own comparison and searching functions for the standard string class.
4. Creating a new class that includes a standard library string member variable and forwarding all of the member functions implemented in the standard library string class to that member variable.
5. Deriving a new class by public inheritance from the standard library string class.
I have decided to solve this problem by creating a new class based on the standard string class. I'm calling this new class xstring, for "extended string". How did I come to this decision?

First of all, I don't like global functions, as they make it more difficult for other programmers to keep control of their namespaces. That is, every global function that we add to our programs has the possibility of interfering with another global function written by someone else who wants to use our classes. So I use global functions only when there is no other alternative available, or when I know that they cannot possibly interfere with other programmers' namespaces. For example, global functions used for input and output of our HomeItem objects should not be a problem, as anyone using our HomeItem classes must already have agreed to our definitions for those classes, and therefore should not have any namespace problems caused by the normal global I/O functions associated with those classes. However, of course, most programmers are going to use the standard library string class, and will not be amused if we introduce global functions that might interfere with their existing functions that use the standard library strings.

Okay, so how about switching back to our homegrown string class, defined in Chapters 7 and 8? I don't like that idea very much either, because, as I've already explained, we should get used to the idea of using the standard library as much as possible. Hundreds or maybe thousands of programmer-years have gone into the design and implementation of that library, and even if we don't care for some of its characteristics, we should use it whenever it is feasible to do so.

But if that is true, why did I reject the third alternative of using the built-in standard library facilities that allow us to specify our own searching and comparison criteria?

Because those facilities are so complex to set up and use. This is, after all, a book on learning how to program, not on the arcane details of the standard library. The material we would have to cover for you to be able to understand how to implement the "normal" method of specifying custom comparison and searching operations for the standard library string class is just too difficult for beginning programmers.

Another possibility is to create a new class that includes a standard library string as a member variable and to pass all of the operations implemented in the standard library string class along to that member variable. This is not practical because there are so many member functions in the standard library string class. This approach would require us to write a large number of functions in our class just to forward all of those operations to our member variable.

So that leaves us with the option of extending the facilities provided by the standard string class by another method. I have chosen to create a new xstring class that is publicly derived from the standard library string class. Is this really a viable way to proceed?

Why Our New xstring class Is a Good Design Choice

I'm sure there will be objections to my decision to inherit publicly from the standard library string class. This class, like many of the classes in the standard library, was not specifically intended to be inherited from. And, in fact, if we were going to add data members to this class, our design would be flawed, because of the problem of "slicing".

That is, if we were to add new data members to our derived class, and then accidentally assigned one of our derived class objects to a standard library string, the new data members would be lost during the assignment. This is one of the problems with inheriting from classes that were not intended specifically for such inheritance.

However, in this case, we will not run into this problem, because we're not adding any data members to our new class. All we're adding are two regular member functions that will allow us to do case-insensitive comparisons and searches. Therefore, assigning one of our new objects to a standard library string would not cause any data to be lost.

Another possible problem with deriving from the standard library string class is that the destructor for that class is not virtual. Therefore, if someone were to create a pointer to a std::string, assign an xstring to it, and then delete the xstring via that pointer, the destructor for std::string would be called rather than the destructor for xstring.

According to the C++ standard, deleting a derived class object through a base class pointer, where the base class destructor is not declared virtual, is "undefined behavior" (i.e., the compiler can do anything it wants).

However, I'm not worried about that problem, for two reasons. First, since xstring has no member variables other than its base class part, I have some difficulty figuring out why a call to delete through a base class pointer would do anything other than destroy the string base class part, which is all we wanted to do anyway.

But more importantly, there's no reason for anyone ever to refer to an xstring via a pointer to a std::string because std::string doesn't have any virtual functions. This means that any function call through an xstring via a pointer to a std::string would always result in a call to the corresponding function in std::string. Therefore, no one should ever make the error of deleting an xstring through a pointer to a std::string.2

I'm also following proper object-oriented design by creating a new class whose objects can be substituted for those of the original class without undue surprises to the user of these classes.

Given all of these considerations, I can't see any valid objection to our writing our own class that extends the functionality of the standard library string class.

The Interface of the New xstring class

Now that we've cleared that up, let's take a look at Figure 12.1, which shows the interface of the new xstring class that implements the additional string-related functions we'll need to finish our home inventory project.
FIGURE 12.1. The new xstring class interface (code\xstring.h)
#ifndef XSTRING_H
#define XSTRING_H

#include <iostream>
#include <string>

class xstring : public std::string
{
public:
xstring();
xstring(const xstring& Str);
xstring(const std::string& Str);
xstring(char* p);
xstring(unsigned Length, char Ch);
explicit xstring(unsigned Length);

short find_nocase(const xstring& Str);
bool less_nocase (const xstring& Str);
};

#endif

12.5. The Include Guard

The first new feature in this header file doesn't have anything to do with adding new functionality. Instead, it is a means of preventing problems if we #include this header twice. I'm referring to the two lines at the very beginning of the file and the last line at the end of the file. The first of these lines,

#ifndef XSTRING_H

uses a preprocessor directive called #ifndef (short for "if not defined") to determine whether we've already defined a preprocessor symbol called XSTRING_H. If we have already defined this symbol, the compiler will ignore the rest of the file until it sees a #endif (which in this case is at the end of the file).

The next line,

#define XSTRING_H

defines the same preprocessor symbol, XSTRING_H, that we tested for in the previous line. Finally, the last line of the file,

#endif

ends the scope of the #ifndef directive. This whole mechanism is generally referred to as an include guard.
Susan: I don't get it. What is a preprocessor directive? For that matter, what is a preprocessor?
Steve: The preprocessor used to be a separate program that was executed before the compiler itself, to prepare the source code for processing by the compiler. Nowadays, the preprocessor is almost always part of the compiler, but it is still logically distinct. A preprocessor directive is a command to the preprocessor to manipulate the source code in some way.
Susan: Why do we need the preprocessor anyway?
Steve: We don't need it very much anymore. About the only functions it still serves are the processing of included header files (via the #include preprocessor directive) and the creation of the "include guard".
Susan: About the preprocessor symbol: Why would you want several things all equal to (for example) 123?
Steve: Because that makes the program easier to read than if you just said 123 everywhere you needed to use such a value. Giving a name to a number is now most commonly done via the const construct in C++, which replaces most of the old uses of preprocessor symbols, but we still need them to implement include guards so that we can prevent the C++ compiler itself from seeing the definition of a class more than once.
What is the point of all this? To solve a problem in writing large C++ programs: the possibility that we might #include the same header file more than once in the same source code file. This can happen because a source code file often uses #include to gain access to a number of interface definitions stored in several header files, more than one of which may use a common header file (like xstring.h). If this were to happen without precautions such as an include guard, we would get an error when we tried to compile our program. The error message would say that we had defined the same class twice, which is not allowed. Therefore, any header file that might be used in a number of places should use an include guard to prevent such errors. Susan had some questions about this notion and why it should be needed in the first place.
Susan: Why should it be illegal to define the same class twice?
Steve: If we define the same class twice, which definition should the compiler use? The first one or the second one?
Susan: I see how that might cause a problem, but what if the two definitions are exactly the same? Why would the compiler care then?
Steve: For the compiler to handle that situation, it would have to keep track of every definition it sees for a class rather than just one. Because it's almost always an error to try to define the same class more than once, there's no reason to add that extra complexity to the compiler when we can prevent the problem in the first place.
Assuming that I've convinced you of the value of include guards, how do they work? Well, the #ifndef directive checks to see if a specific preprocessor symbol, in this case XSTRING_H, has already been defined. If it has, then the rest of the #include file is essentially ignored. Let's suppose that XSTRING_H hasn't been defined yet. In that case, we define that symbol in the next line and then allow the compiler to process the rest of the file.

So far this works exactly as it would if we hadn't added the include guard. But suppose that later, during the compilation of the same source file, another header file #includes xstring.h again. In that case, the symbol XSTRING_H would already be defined (because we defined it on the first access to xstring.h). Therefore, the #ifndef would cause the compiler to skip the rest of the header file, preventing the xstring class from being redefined and causing an error.

Of course, the choice of the preprocessor symbol to be defined is more or less arbitrary, but there is a convention in C and C++ that the symbol should be derived from the name of the header file. This is intended to reduce the likelihood of two header files using the same preprocessor symbol in their include guards. If that happened and if both of these header files were #included in the same source file, the definitions in the second one to be #included would be ignored during compilation because the preprocessor symbol used by its include guard would already be defined. To prevent such problems, I'm following the (commonly used) convention of defining a preprocessor symbol whose name is a capitalized version of the header file's name, with the period changed to an underscore to make it a legal preprocessor symbol name. If everyone working on a project follows this convention (or a similar one), the likelihood of trouble will be minimized.

The Constructors for the xstring class

You'll notice that we have declared six constructors for this new class. Let's go through them and see why we need each one and how it is implemented.

First, we have the default constructor, xstring::xstring(). As always, this constructor is designed to create a new object of its class when no initial data is specified for that object. The implementation of this function, which is shown in Figure 12.2, is fairly simple: it merely initializes its base class part, string, using the default constructor for that class.

FIGURE 12.2. The default constructor for the xstring class (from code\xstring.h)
xstring::xstring()
: string()
{
}

Next, we have the copy constructor, xstring::xstring(const xstring& Str), which is shown in Figure 12.3. This one isn't much more complicated on the surface; it just initializes its base class part to be equal to the existing xstring object that we are copying from.
FIGURE 12.3. The copy constructor for the xstring class (from code\xstring.h)
xstring::xstring(const xstring& Str)
: string(Str)
{
}

But there is a subtle issue here that needs examination: how can we initialize a string from an xstring, as we are doing here?

We can do this because we have declared xstring as publicly derived from string. Normally, this feature of C++ wouldn't be desirable, as any added data members of the xstring class would be lost in the assignment; this problem, which we have discussed before, is called slicing. However, in this case that's not a problem, because we don't have any added data members in xstring, so we don't have to worry about their being lost.

Now we come to a constructor that's a little more interesting: xstring::xstring(const string& Str), shown in Figure 12.4.3

FIGURE 12.4. Another constructor for the xstring class (from code\xstring.h)
xstring::xstring(const string& Str)
: string(Str)
{
}

What does this do, and why do we need it?

Automatic Conversion from the Standard string class

This constructor will automatically convert an object of the standard string class to one of our xstring objects, in much the same way that the standard copy constructor creates a new object of the same type as an existing one. In this case, however, the new object being created is an xstring, with the same contents as the standard library string that we are copying from.

That much should be reasonably clear, but why we need it will take some more explanation. The reason is that otherwise we will not be able to use all of the normal facilities of the standard string class in a natural manner. Since the standard library doesn't know about our xstring class, standard library functions will never return a value of that type; instead, they will return standard library strings. So if we want to be able to tap into all of the functionality of the standard string class, we have to convert a returned std::string into one of our xstrings.

Let's make this a bit more concrete with an example. In this example, we will use a new feature of the standard library string class: concatenation, which is just a fancy word for "adding one string onto the end of another one". For example, if we have someone's first and last names as separate strings, it might be handy to be able to tack the last name to the end of the first name so we can store the entire name as one string. While we could use stringstreams, that is a very inefficient and clumsy way to accomplish a common operation.

Concatenation is a common enough operation that a convention has been developed to use the + sign to indicate it. This symbol is also used in languages such as Java and Basic for the same operation, so C++ isn't too unusual in this regard. But exactly how would we use the + to concatenate strings in C++?

Take a look at Figure 12.5 for an example of this operation.

FIGURE 12.5. A little test program for an early version of the xstring class (code\xstrtstc.cpp)
#include <iostream>
#include <string>
#include "xstringc.h"
using namespace std;

int main()
{
xstring x = "Steve ";
xstring y = "Heller";
xstring z = x + y;
cout << z << endl;
}

What we want this program to do is display my name on the screen by concatenating my last name on to the end of my first name (with a space in between so they don't run together). Unfortunately, it won't work, because the version of the xstring header file that it includes doesn't have the constructor that we've been discussing. That header file is shown in Figure 12.6.
FIGURE 12.6. An early version of the xstring header file (code\xstringc.h)
#ifndef XSTRING_H
#define XSTRING_H

#include <iostream>
#include <string>

class xstring : public std::string
{
public:
xstring();
xstring(const xstring& Str);
xstring(char* p);

short find_nocase(const xstring& Str);
bool less_nocase (const xstring& Str);
};
#endif
Okay, but exactly why doesn't this work? The problem is that the string concatenation operator, operator +, is supplied by the standard library for its own string class, and its return type is std::string, not xstring. Therefore, the compiler gives an error message that looks like Figure 12.7:
FIGURE 12.7. An error message from mixing strings and xstrings
xstrtstc.cpp:
Error E2034 xstrtstc.cpp 11: Cannot convert `string' to `xstring' in function main()
*** 1 errors in Compile ***

How do we fix this? By declaring and implementing the constructor that automatically creates an xstring from a standard library string, xstring(const std::string& Str);, as shown in Figure 12.4. Figure 12.8 shows the header file with that addition.
FIGURE 12.8. Another version of the xstring header file (code\xstringd.h)

#ifndef XSTRING_H
#define XSTRING_H

#include <iostream>
#include <string>

class xstring : public std::string
{
public:
xstring();
xstring(const xstring& Str);
xstring(const std::string& Str);
xstring(char* p);
xstring(unsigned Length, char Ch);
xstring(unsigned Length);

short find_nocase(const xstring& Str);
bool less_nocase (const xstring& Str);
};

#endif

Once we add that constructor, the program in Figure 12.9 will compile successfully. Running it will display my name, as desired.
FIGURE 12.9. A successful attempt to mix strings and xstrings (code\xstrtstd.cpp)
#include <iostream>
#include <string>
#include "xstringd.h"
using namespace std;

int main()
{
xstring x = "Steve ";
xstring y = "Heller";
xstring z = x + y;
cout << z << endl;
}

Now let's see why we need the other constructors in the xstring class. The next constructor, xstring(char* p);, creates an xstring from a C string literal. We need this to be able to initialize an xstring conveniently, such as in the statement xstring x = "Steve ";. The implementation of this constructor is trivial; it just initializes its base class string part with the C string literal value that we give it. Figure 12.10 shows the code for this constructor.
FIGURE 12.10. The char* constructor for the xstring class (from code\xstring.h)
xstring::xstring(char* p)
: string(p)
{
}

Now how about the next constructor, xstring(unsigned Length, char Ch);? This constructor creates an xstring with a specified length, with all the characters set to a specified character. We need this one to help us with our formatting of our screen displays, as you'll see later in this chapter. Figure 12.11 shows the code for this constructor, which is also quite simple; it merely calls the corresponding constructor from the standard library string class.
FIGURE 12.11. Another constructor for the xstring class (from code\xstring.h)
xstring::xstring(unsigned Length, char Ch)
: string(Length, Ch)
{
}

The last constructor for this type, while no more complex in its implementation, is considerably more interesting in the service that it provides for us.
FIGURE 12.12. The final constructor for the xstring class (from code\xstring.h)
xstring::xstring(unsigned Length)
: string(Length, ' ')
{
}

As you can see, this is very similar to the previous constructor. The only difference is that you don't have to specify what character you want to fill in the string; it will automatically be filled in with blanks.

But this constructor also serves another very useful purpose, besides letting us type a few fewer characters when we want to create an xstring filled with a specified number of blanks. It prevents a certain potentially serious error that can happen when we initialize standard library strings.

Preventing Accidental Initialization by 0

Remember the problem I alerted you to under the heading "One of the Oddities of the Value 0" in Chapter 11? I accidentally initialized a string member variable with the value 0 in one of the constructors for the HomeItem class. Since the value 0 is acceptable as a pointer of any type, this resulted in calling the string class constructor that takes a char* argument, passing the value 0 as the address from which to copy characters into the newly created string. The result was an incorrectly initialized string that would cause the program to blow up if it were ever used, because 0, of course, is not a valid memory address.

It would be very nice to prevent this problem by making the compiler reject the value 0 as an initializer for our xstring class. And that's what that most recently defined constructor does, as you'll see if you try to compile the program shown in Figure 12.13.

FIGURE 12.13. An illegal program (code\strzero.cpp)
#include <string>
#include "xstringd.h"
using namespace std;

int main()
{
xstring y(0);
y = "abc";
}

Figure 12.14 shows the error message that you'll get if you try to compile this program:
FIGURE 12.14. The error message from compiling that illegal program (code\strzero.err)
strzero.cpp:
Error E2015 strzero.cpp 8: Ambiguity between `xstring::xstring(char *)' and `xstring::xstring(unsigned int)' in function main()
*** 1 errors in Compile ***

This is an excellent result, but exactly why are we getting an error in the first place?

Because we are turning one of the most irritating characteristics of the value 0 to our advantage. You see, while 0 is an acceptable value for a char*, it is also an acceptable value for an unsigned int. So because we have defined two different constructors that could take a 0 argument, the compiler complains that it can't tell which one we mean, and refuses to compile the offending statement. This lets us know about the improper initialization of one of our variables.

However, we're not quite out of the woods on this issue yet. Let's try another case where we really don't want an automatic conversion from a number to an xstring. Figure 12.15 shows an example of this sort of dubious conversion.

FIGURE 12.15. A legal but dubious program (code\strone.cpp)
#include <string>
#include "xstringd.h"
using namespace std;

int main()
{
xstring z(1);

xstring y;
y = 1;
}

With the latest version of the xstring header file we've seen so far, xstringd.h, this program will compile successfully. The first statement inside the function, xstring z(1);, is fine; it initializes an xstring to hold exactly one blank, as we intended. Of course, the second statement, xstring y; just declares another xstring variable, so that's okay as well. However, what about the third statement, y = 1;? What does that mean and why does it compile successfully?

That turns out to mean that we want to set the value of y to consist of one blank. That's because, as we have seen already in the discussion of our own string class, a constructor that takes exactly one argument will automatically be used to create a new object whenever necessary. In this case, we have a constructor for the xstring class that accepts unsigned integers. Since 1 can be considered an unsigned integer, and we are trying to assign an xstring the value 1, that constructor is called automatically.

This isn't quite as serious a problem as the one with 0, because at least the new xstring has a legitimate value, namely one blank character, rather than an invalid value that will cause the program to blow up. However, it is still not very desirable, because we can now set the value of an xstring to a series of blanks accidentally, just by assigning an integer value to it, when it is very unlikely that we mean to do that.

Luckily, there is a solution to this problem also. Let's see how it works.

The explicit Keyword

This is a job for the explicit keyword, which was added to C++ to allow class designers to solve a problem with constructors that resulted from a (usually convenient) feature of the language called implicit conversion. Under the rules of implicit conversion, a constructor that takes one argument is also a conversion function that is called automatically (or implicitly) where an argument of a certain type is needed and an argument of another type is supplied.4 In many cases, an implicit constructor call is very useful; for example, it's extremely handy to be able to supply a char* argument when an xstring (or a const xstring&) is specified as the parameter type in a function declaration. The compiler allows this without complaint because we have an xstring constructor that takes a char* argument and can be called implicitly. However, sometimes we don't want the compiler to supply this automatic conversion because the results would be surprising to the user of the class. In some earlier versions of C++, it wasn't possible to prevent the compiler from supplying the automatic conversion, but in standard C++ we can use the explicit keyword to tell the compiler, in essence, that we don't want it to use a particular constructor unless we explicitly ask for it.

Figure 12.16 shows the final version of the xstring header file, with the explicit specification for the constructor that takes an integer type.

FIGURE 12.16. The final version of the xstring header file (code\xstring.h)

#ifndef XSTRING_H
#define XSTRING_H

#include <iostream>
#include <string>

class xstring : public std::string
{
public:
xstring();
xstring(const xstring& Str);
xstring(const std::string& Str);
xstring(char* p);
xstring(unsigned Length, char Ch);
explicit xstring(unsigned Length);

short find_nocase(const xstring& Str);
bool less_nocase (const xstring& Str);
};

#endif

To see how this prevents the error of trying to set an xstring to an integer value, let's analyze why trying to compile the program shown in Figure 12.17 produces an error. Note that this is the same program as the one shown in Figure 12.15, except that it uses the final version of our xstring header file.
FIGURE 12.17. An illegal program (code\strfix.cpp)
#include <string>
#include "xstring.h"
using namespace std;

int main()
{
    xstring z(1);

    xstring y;
    y = 1;
}

Figure 12.18 shows the error message that will be generated if you try to compile the above program.
FIGURE 12.18. The error message from strfix.cpp (code\strfix.err)
strfix.cpp:
Error E2285 strfix.cpp 11: Could not find a match for `xstring::operator =(int)' in function main()
*** 1 errors in Compile ***

The reason that the line xstring z(1); is legal, given the definitions in xstring.h, is that we are explicitly stating that we want to construct an xstring by calling a constructor that will accept an integer value; in this case, that constructor happens to be xstring(unsigned).

By contrast, the line y = 1; will be rejected by the compiler. For this line to be legal, the xstring class defined in xstring.h would need a constructor that could be called implicitly to create an xstring from the literal value 1. Although that interface file does indeed define an xstring constructor that can be called with one argument of an integer type, we've added the explicit keyword to its declaration to tell the compiler that this constructor doesn't accept implicit calls.

This is how we prevent a user from accidentally calling the xstring(unsigned) constructor by providing an integer argument to a function that expects an xstring. Because the user is very unlikely to want an integer value such as 1 silently converted to an xstring of one blank, as the xstring(unsigned) constructor would do in this case, making this constructor explicit will reduce unpleasant surprises.5
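
To make the distinction concrete, here is a minimal sketch that uses a hypothetical class called Blanks rather than the real xstring; the class name and the CountChars function are assumptions made only for this illustration. The constructor that takes a const char* can be called implicitly, whereas the one marked explicit cannot.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-in for xstring, for illustration only.
class Blanks : public std::string
{
public:
    Blanks() {}

    // Not explicit: the compiler may call this to convert a C string.
    Blanks(const char* p) : std::string(p) {}

    // Explicit: only used when the caller names the constructor.
    explicit Blanks(unsigned Length) : std::string(Length, ' ') {}
};

// Expects a Blanks, but a C string argument is converted implicitly.
inline std::size_t CountChars(const Blanks& Str)
{
    return Str.size();
}
```

With these definitions, CountChars("abc") compiles because the compiler may build a temporary Blanks from the C string, and Blanks b(3); compiles because the constructor is named explicitly. CountChars(3) and b = 3; are both rejected, just as y = 1; is rejected for xstring.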

12.6. Lessons of the xstring class Implementation

I believe the existing standard library string class exhibits undesirable and improper behavior when passed a constant value 0 as an initializer. Perhaps the standard library will be corrected to prevent this problem eventually, and in many languages, we would have no choice but to wait for the library correction. However, we are fortunate that in the case of C++, we can fix this problem in the process of defining an extended xstring class, without losing any of the benefits of the existing std::string class. So this is a fine example of using object-oriented design to make use of a lot of existing code (in the standard library) while adding appropriate and necessary features for our own purposes.

One More Design Decision Regarding the xstring class

I should mention one more design decision that I have had to make in the process of creating this new class: whether to change the behavior of >> when reading data into an xstring, so that it would read an entire line rather than only up to the first blank. That was very tempting, as it would have simplified the example programs quite a bit, making them easier to write and understand. So why have I not done this?

Because it would change the interface of the new class from the standard library string class interface, thus making it impossible to switch from one to the other of these classes as necessary without changing the input operations in your programs. Even though I believe that the decision of the standard library designers was incorrect in this case, the drawbacks of changing the interface are significant enough that I have reluctantly decided to stay with the original behavior of the input operator. This means that you won't have another learning curve when you have to use the standard library string class in future programs you write.
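
The difference in question is easy to see with the standard library alone. In this sketch, ReadWord and ReadLine are hypothetical helper names; operator >> and std::getline are the standard facilities they wrap.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads one word: operator >> skips leading whitespace and then
// stops at the first blank, tab, or newline.
inline std::string ReadWord(std::istream& is)
{
    std::string Word;
    is >> Word;
    return Word;
}

// Reads everything up to (but not including) the next newline.
inline std::string ReadLine(std::istream& is)
{
    std::string Line;
    std::getline(is, Line);
    return Line;
}
```

Given a stream containing the line "living room sofa", ReadWord returns "living", while ReadLine returns the whole line. Standard C++ offers std::getline for the cases where a whole line is wanted, so keeping the standard behavior of >> does not rule out line-oriented input.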

Susan had a couple of questions about defining this new class:

Susan: Why did you call this new class xstring? Why not just call it string?
Steve: Because that would be very confusing to people accustomed to using the standard string class. It's a really bad idea to define classes whose names clash with names of classes in the standard library. One exception is when we're just showing how we might implement a standard library class, as in the case of the string class earlier in the book.

12.7. Case-Insensitive Searching

Now we're ready to discuss the first regular member function in the xstring class: find_nocase. We need this function to determine whether a given xstring contains a particular sequence of characters. For example, if we have an xstring containing the value "red, blue, and green", describing the colors of a sofa, we want to be able to determine whether the letters "b", "l", "u", and "e" appear consecutively in that string. If they do, it is sometimes also useful to know where that sequence of characters starts in the xstring.

Susan wanted to know why we would need to know where a sequence of characters was found in a string:

Susan: I understand why we need to know if we can find some characters in a string, but why would we care where they were?
Steve: Well, what if we wanted to change all the occurrences of "blue" to "purple" because we were changing our decor? In that case, there are string functions we could use to take apart a string, remove "blue", replace it with "purple", and put it back together. But to do that we would need to know where we had to take the string apart; i.e., at what point in the string the characters "blue" appeared.
Because this new function finds a sequence of chars in an xstring, its name should be something like find. Because it is going to be case-insensitive (e.g., RED, Red, and red will all be considered equivalent), we'll call it find_nocase.6
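
Steve's point about needing the position can be sketched with std::string member functions. ReplaceAll below is a hypothetical helper, not part of xstring, and it uses the standard case-sensitive find to locate each occurrence before calling replace at that position.

```cpp
#include <string>

// Replaces every occurrence of From in Text with To and returns the result.
// Text is passed by value on purpose: we modify and return our own copy.
inline std::string ReplaceAll(std::string Text, const std::string& From,
                              const std::string& To)
{
    if (From.empty())
        return Text;    // nothing sensible to search for

    std::string::size_type Pos = Text.find(From);
    while (Pos != std::string::npos)
    {
        Text.replace(Pos, From.size(), To);      // splice To in at Pos
        Pos = Text.find(From, Pos + To.size());  // resume after the new text
    }

    return Text;
}
```

For example, ReplaceAll("red, blue, and green", "blue", "purple") yields "red, purple, and green". Without the position returned by find, replace would have no way to know where to take the string apart.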

What exactly do I mean by "case-insensitive"? That when this new find_nocase function looks for a sequence of chars within an xstring, it considers upper- and lower-case letters equivalent. We will employ this function in the HomeInventory class functions that will enable us to, for example, search through the home inventory for all HomeItem objects containing the word "purple" in their description fields. Before we look at how this is implemented, let's see how it can be used (Figure 12.19).

FIGURE 12.19. Using xstring::find_nocase (code\xstrtstb.cpp)
#include <iostream>
#include <string>
#include "xstring.h"
using namespace std;

int main()
{
    xstring x = "purple";
    xstring y = "A Purple couch";
    short where;

    where = y.find_nocase(x);
    cout << "The string " << x <<
        " can be found starting at position " <<
        where << " in the string " << endl;
    cout << y << "." << endl;

    where = x.find_nocase("rp");
    cout << "The string `rp' can be found starting at position " <<
        where << " in the string " << x << "." << endl;

    where = x.find_nocase("rpx");
    cout << "The string `rpx' can be found starting at position " <<
        where << " in the string " << x << "." << endl;

    return 0;
}

This program starts out by defining some xstring variables called x and y and initializing them to the values "purple" and "A Purple couch", respectively. Then it defines a short value called where that will hold the result of each search for an included sequence of chars. The next line, where = y.find_nocase(x);, calls the find_nocase member function of the xstring class to locate an occurrence of the value "purple" in the xstring y, which has the value "A Purple couch". The next three lines display the results of that search; as you can see, the return value of this function is equal to the position in xstring y where the string to be found, "purple", was indeed found, even though the word "Purple" was capitalized in the xstring we were looking through.

The other two similar sequences search the xstring x (whose value is "purple") for the literal values "rp" and "rpx", respectively, and display the results of these searches. The first of these is very similar to the previous search for the word "purple", but serves to point out that we don't have to search for a word; any sequence of characters will do.

The last sequence, however, is somewhat different because we are searching for a literal value ("rpx") that is not present in the xstring we're examining ("purple"). The question, of course, is what value the find_nocase function should return when this happens. Perhaps the most obvious possibility is 0, but that is unfortunately not appropriate because it violates the C and C++ convention that the first position of a string-like variable is considered position 0; that is, the return value 0 would signify that the string we were searching for was found at the beginning of the xstring we were searching in. Therefore, find_nocase returns the value -1 to indicate that the desired value has not been found in the xstring being examined.

Susan wanted to know how I picked -1 to indicate that we couldn't find a particular sequence of characters:

Susan: How did you come up with the number -1 to mean "not found"?
Steve: Well, what number should mean that? Zero isn't a good choice, because that would mean we had found the desired sequence of characters at the beginning of the target xstring. Remember, in C++, we start counting at 0. So I had to pick a number that couldn't possibly be a position in an xstring, and the standard convention in C++ is to use -1 to mean "not a valid position".
Now that we've seen how to use find_nocase, let's take a look at its implementation, which is shown in Figure 12.20.
FIGURE 12.20. The implementation of xstring::find_nocase (from code\xstring.cpp)
short xstring::find_nocase(const xstring& Str)
{
    short i;
    short thislength = size();
    short strlength = Str.size();
    const char* thisdata = c_str();
    const char* strdata = Str.c_str();

    for (i = 0; i < thislength-strlength+1; i ++)
    {
        if (strnicmp(thisdata+i,strdata,strlength) == 0)
            return i;
    }

    return -1;
}

To make the discussion simpler, let's call the xstring that might contain the desired value (i.e., the one pointed to by this) the target string and the argument xstring Str the search string. So this function starts out by defining some variables called thislength and strlength to hold the actual number of chars in the target and the search xstrings, respectively. Then it uses a special function of the standard library string class called c_str() to set the variables thisdata and strdata to the addresses of the char data for the target xstring and the search xstring, respectively.7

Now we get to the heart of the function: the loop that uses strnicmp to compare each possible section of the target xstring with the search xstring we're looking for. We haven't discussed the strnicmp function yet, but it's quite similar to memcmp, with three differences:8

1. strnicmp ignores case in its comparison, so that (for example) RED, Red, and red all compare as equal.
2. strnicmp is a C string function rather than a C memory manipulation function like memcmp, so it stops when it encounters a null byte.
3. strnicmp isn't part of the C++ standard library. However, it is very commonly supplied by compiler manufacturers, so that shouldn't cause you much trouble.
The first of these characteristics of strnicmp is the reason that we have to use strnicmp rather than memcmp, which is case-sensitive. The second characteristic isn't an advantage when dealing with xstrings, which can theoretically contain null bytes. However, this isn't a problem in the find_nocase function, as that function applies only to ASCII text that doesn't contain null bytes anyway.
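
Since strnicmp is not standard, a compiler that lacks it needs a substitute. The following is a minimal sketch of one, assuming plain ASCII text; the name CompareNoCase is hypothetical, and the function imitates only the behavior described above (case is ignored, and the comparison stops at a null byte or after Count characters).

```cpp
#include <cctype>
#include <cstddef>

// Compares up to Count chars of A and B, ignoring case.
// Returns a negative value, 0, or a positive value, in the manner of strnicmp.
inline int CompareNoCase(const char* A, const char* B, std::size_t Count)
{
    for (std::size_t i = 0; i < Count; i++)
    {
        // tolower requires a value representable as unsigned char
        int a = std::tolower(static_cast<unsigned char>(A[i]));
        int b = std::tolower(static_cast<unsigned char>(B[i]));

        if (a != b)
            return a - b;   // first difference decides the result

        if (a == 0)
            return 0;       // both strings ended at this null byte
    }

    return 0;               // the first Count chars all matched
}
```

Here CompareNoCase("RED", "red", 3) returns 0. Where they are available, the POSIX function strncasecmp and the Microsoft function _strnicmp provide the same kind of comparison.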

Now let's get back to the discussion of find_nocase. On the first time through the loop, the value of i is 0; therefore, the function call strnicmp(thisdata+i,strdata,strlength) compares strlength bytes from the beginning of the target xstring to the same number of bytes in the search xstring. If the two sets of bytes are equal (ignoring case), the result of the comparison is 0, in which case we have found what we were looking for, so we return i, which ends both the loop and the function.

On the other hand, if the result of the comparison is not 0, we have to keep looking. The next step is to increment the value of the loop index i. The second time through the loop, the value of i is 1, so strnicmp(thisdata+i,strdata,strlength) compares strlength bytes starting at the second byte of the target xstring with the same number of bytes starting at the beginning of the search xstring. If this comparison is successful, we stop and indicate success; if not, we continue executing the loop until we find a match or run out of data in the target xstring.

Let's look at an example in more detail. Suppose we are searching through the target xstring "A Purple couch", looking for the search xstring "purple". The first time through the loop, we compare the first 6 bytes in the target xstring to the 6 bytes in the search xstring. Since the first byte of the target xstring is 'A' and the first byte of the search xstring is 'p', strnicmp returns a non-zero value to let us know that we haven't yet found a match. Therefore, we have to re-execute the loop. The second time through, we start the comparison at the second byte of the target xstring and the first byte of the search xstring; the second byte of the target xstring is a space, which isn't the same as the 'p' from the search xstring, so strnicmp returns a non-zero value to let us know that we still haven't found the search xstring. The third time through the loop, we start the comparison at the third byte of the target xstring and (as always) the first byte of the search xstring. Both of these have the value 'p' (if we ignore case), so strnicmp continues by comparing the fourth byte of the target xstring with the second byte of the search xstring. Those also match, so strnicmp continues to compare the rest of the bytes in the two strings until it gets to the end of the search xstring. This time they all match, so strnicmp returns 0 to let us know that we have found the search xstring.

Of course, the other possibility is that the search xstring isn't present in the target xstring. In that case, strnicmp won't return 0 on any of these passes through the loop, so eventually i will exceed its limit, causing the loop to stop executing. However, there's one thing I haven't explained yet: how we calculate the maximum number of times that we have to execute the loop. If we look at the for loop, we see that the continuation expression is i < thislength - strlength + 1. Is this the right limit for i, and if so, why?

Well, if the target xstring is the same length as the search xstring, then we know that we have to execute the loop only once because there's only one possible place to start comparing two strings of the same length - at the beginning of both strings. If we start i at 0 on the first time through the loop, it will be 1 at the beginning of the second time through the loop, so thislength - strlength + 1 gives the correct limit (of 1) if thislength and strlength have the same value. This demonstrates that the expression thislength - strlength + 1 is correct for the case where the search and target strings are the same length. Now, what about the case where the target xstring is 1 byte longer than the search xstring? In that case, there is one extra position in which the search xstring could be found - namely, starting at the second character of the target xstring. Continuing with this analysis, each additional character in the target xstring beyond the length of the search xstring adds one possible position in which the search xstring might be found in the target xstring, and therefore adds 1 to the number of times we might have to go through the loop. Since adding 1 to the value of thislength will add 1 to the value of the expression thislength - strlength + 1, that expression will produce the correct limit for any value of thislength and strlength.9
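
The limit can also be checked by experiment. This sketch restates the loop as a free function on std::string, with the hypothetical name FindNoCase and with std::tolower standing in for strnicmp, so that it compiles with only the standard library.

```cpp
#include <cctype>
#include <string>

// Returns the first position at which Search occurs in Target, ignoring
// case, or -1 if it does not occur.
inline long FindNoCase(const std::string& Target, const std::string& Search)
{
    long TargetLength = static_cast<long>(Target.size());
    long SearchLength = static_cast<long>(Search.size());

    // Valid starting positions are 0 through TargetLength - SearchLength,
    // which makes TargetLength - SearchLength + 1 positions in all.
    for (long i = 0; i < TargetLength - SearchLength + 1; i++)
    {
        long j = 0;
        while (j < SearchLength &&
               std::tolower(static_cast<unsigned char>(Target[i + j])) ==
               std::tolower(static_cast<unsigned char>(Search[j])))
            j++;

        if (j == SearchLength)
            return i;   // every char of Search matched starting at i
    }

    return -1;
}
```

With the target "A Purple couch" and the search string "purple", the lengths are 14 and 6, so i may range from 0 through 8, and the function returns 2. If the search string is longer than the target, the limit is zero or negative, the loop body never runs, and the result is -1.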

There's one more function in the xstring class that we need to discuss: less_nocase. Its code is shown in Figure 12.21.

FIGURE 12.21. The less_nocase function (from code\xstring.cpp)
bool xstring::less_nocase(const xstring& Str)
{
    short Result;
    short CompareLength;

    short thislength = size();
    short strlength = Str.size();
    const char* thisdata = c_str();
    const char* strdata = Str.c_str();

    if (strlength < thislength)
        CompareLength = strlength;
    else
        CompareLength = thislength;

    Result = strnicmp(thisdata,strdata,CompareLength);

    if (Result < 0)
        return true;

    if (Result > 0)
        return false;

    if (thislength < strlength)
        return true;

    return false;
}

This function is almost exactly the same as the operator < function in the string class that we created earlier in this book, except that it uses the strnicmp function rather than the memcmp function used in that implementation of operator <.
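
The same ordering can be expressed with standard algorithms. In this sketch, CharLessNoCase, LessNoCase, and SortNoCase are hypothetical names; std::lexicographical_compare supplies the rule that a string that is a prefix of another sorts first, which is what the final length test in less_nocase does.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// True if x sorts before y when case is ignored.
inline bool CharLessNoCase(char x, char y)
{
    return std::tolower(static_cast<unsigned char>(x)) <
           std::tolower(static_cast<unsigned char>(y));
}

// True if A sorts before B when case is ignored.
inline bool LessNoCase(const std::string& A, const std::string& B)
{
    return std::lexicographical_compare(A.begin(), A.end(),
                                        B.begin(), B.end(),
                                        CharLessNoCase);
}

// Sorts a list of names without regard to case.
inline void SortNoCase(std::vector<std::string>& Names)
{
    std::sort(Names.begin(), Names.end(), LessNoCase);
}
```

Sorting the names "sofa", "Relish", and "couch" with SortNoCase gives "couch", "Relish", "sofa", whereas a case-sensitive sort would put "Relish" first because uppercase letters precede lowercase ones in ASCII.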

12.8. Searching for an Item by a Substring

Let's add some capabilities to our home inventory project. First on the list is the ability to find an item by searching for a sequence of chars in its description. Before we see how this is implemented, let's take a look at how it is used. Figure 12.22 shows the new application program that uses this feature.
FIGURE 12.22. The latest home inventory application program (code\hmtst6.cpp)
#include <iostream>
#include <fstream>
#include <string>

#include "Vec.h"
#include "xstring.h"
#include "hmit6.h"
#include "hmin6.h"
using namespace std;

int main()
{
    ifstream HomeInfo("home3.in");
    HomeInventory MyInventory;
    HomeItem TempItem;
    xstring Name;
    xstring Description;

    MyInventory.LoadInventory(HomeInfo);

    TempItem = MyInventory.FindItemByName("Relish");
    cout << endl;

    TempItem.Edit();
    cout << endl;

    TempItem.FormattedDisplay(cout);
    cout << endl;

    cout << "Please enter a search string: ";
    cin >> Description;

    TempItem = MyInventory.FindItemByDescription(Description);

    if (TempItem.IsNull())
        cout << "Sorry, I couldn't locate that item." << endl;
    else
        TempItem.FormattedDisplay(cout);
    cout << endl;

    return 0;
}

This program starts out just as the previous one did, by loading the inventory from the input file, looking up the entry whose name is "Relish", and displaying it for editing. Once the user has finished editing the entry, we get to the new part: reading a search xstring from the user and searching for that item in the inventory by the new FindItemByDescription member function of HomeInventory. Let's go through the changes needed to implement this new feature, starting with the new version of the HomeInventory interface, shown in Figure 12.23.
FIGURE 12.23. The latest version of the HomeInventory interface (hmin6.h)
//hmin6.h

class HomeInventory
{
public:
    HomeInventory();

    short LoadInventory(std::ifstream& is);
    void DumpInventory();
    HomeItem FindItemByName(const xstring& Name);
    HomeItem AddItem();
    short LocateItemByName(const xstring& Name);
    HomeItem EditItem(const xstring& Name);

    HomeItem FindItemByDescription(const xstring& Partial);
    short LocateItemByDescription(const xstring& Partial);

private:
    Vec<HomeItem> m_Home;
};

The only new functions added to this interface since the last version (see Figure 11.26) are the two "ItemByDescription" functions that parallel the "ItemByName" functions we implemented previously. However, there's another modification in this and the other new interface files: I've changed all the value arguments of user-defined types to const references, because passing such arguments by const reference is more efficient than passing them by value but just as safe, since it's impossible to accidentally change the calling function's variables through a const reference. For this reason, this is usually the best method of passing arguments of user-defined types. Variables of native types, on the other hand, are most efficiently passed by value because they do not require copy constructors or other overhead when passed in that way, as objects of user-defined types do.

Susan had a question about passing arguments.

Susan: Why does it matter whether an argument is native or user-defined?
Steve: Remember, every value argument in a function is actually a new variable whose value is copied from the argument passed by the caller. Passing a native variable by value is efficient because it's a simple "bunch of bits"; there's no need to worry about pointers and the like. On the other hand, passing a user-defined variable by value requires a lot more work because the compiler has to call the copy constructor for that type of variable. Therefore, it's best to avoid call-by-value for user-defined types when it isn't necessary.
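
One way to observe the copying Steve describes is to count copy constructor calls. Tracked is a hypothetical class invented for this sketch; its static counter goes up each time an object is copied.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class that counts how many times it has been copied.
class Tracked
{
public:
    explicit Tracked(const std::string& Name) : m_Name(Name) {}
    Tracked(const Tracked& Other) : m_Name(Other.m_Name) { ++s_Copies; }

    static int s_Copies;    // number of copy constructor calls so far
    std::string m_Name;
};

int Tracked::s_Copies = 0;

// The value argument is a new object, so the copy constructor runs.
inline std::size_t LengthByValue(Tracked Item)
{
    return Item.m_Name.size();
}

// The reference argument refers to the caller's object; nothing is copied.
inline std::size_t LengthByConstRef(const Tracked& Item)
{
    return Item.m_Name.size();
}
```

Calling LengthByConstRef with a Tracked variable leaves the counter unchanged, while calling LengthByValue with the same variable raises it by one. For a class that owns a long string or a Vec, each of those copies means allocating and copying all of that data.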
Now let's take a look at the first of the two new functions, HomeInventory::FindItemByDescription, shown in Figure 12.24.
FIGURE 12.24. HomeInventory::FindItemByDescription (from code\hmin6.cpp)
HomeItem HomeInventory::FindItemByDescription(
    const xstring& Partial)
{
    short i;
    xstring Description;
    bool Found = false;
    short ItemCount = m_Home.size();

    for (i = 0; i < ItemCount; i ++)
    {
        Description = m_Home[i].GetDescription();
        if (Description.find_nocase(Partial) >= 0)
        {
            Found = true;
            break;
        }
    }

    if (Found)
        return m_Home[i];

    return HomeItem();
}

This function is very similar to FindItemByName, so I won't go over it in excruciating detail. The main difference between the two is that FindItemByDescription uses the new find_nocase function to locate an occurrence of a search xstring named Partial in the description of each HomeItem object in the m_Home vector. To do this, it retrieves the description from a HomeItem object, using the GetDescription member function, and stores it in an xstring called (imaginatively enough) Description. Then it calls find_nocase to see if there is an occurrence of Partial in the Description xstring. If so, it leaves the loop and returns the object whose description contained the contents of the Partial argument; otherwise, it executes the loop again. This continues until it either finds a match or runs out of items to examine. In the latter case, it returns a null HomeItem to indicate that it couldn't find what the user was looking for.

Susan had a question about why we have to return a null HomeItem in the latter case:

Susan: Why do we have to return anything if we don't find what we're looking for? Why not just return nothing?
Steve: We can't return nothing, because the declaration of our function says that we will return a HomeItem, so the compiler will complain if we don't return a HomeItem. So the only alternative is to return a null HomeItem, so the calling function knows that we didn't find what we were looking for.
However, this possibility means that we need an IsNull member function in the HomeItem class so that the calling program can tell whether it has received a null HomeItem. To see this and the other (relatively minor) changes to the HomeItem interface, let's take a look at the new version of that interface, which is shown in Figure 12.25.
FIGURE 12.25. The new version of the HomeItem interface (code\hmit6.h)
// hmit6.h

class HomeItem
{
    friend std::ostream& operator << (std::ostream& os,
        const HomeItem& Item);

    friend std::istream& operator >> (std::istream& is, HomeItem& Item);

public:
    HomeItem();
    HomeItem(const HomeItem& Item);
    HomeItem& operator = (const HomeItem& Item);
    virtual ~HomeItem();

    // Basic: Art objects, furniture, jewelry, etc.
    HomeItem(const xstring& Name, double PurchasePrice,
        long PurchaseDate, const xstring& Description,
        const xstring& Category);

    // Music: CDs, LPs, cassettes, etc.
    HomeItem(const xstring& Name, double PurchasePrice,
        long PurchaseDate, const xstring& Description,
        const xstring& Category, const xstring& Artist,
        const Vec<xstring>& Track);

    virtual void Write(std::ostream& os);
    virtual short FormattedDisplay(std::ostream& os);
    virtual xstring GetName();
    virtual xstring GetDescription();
    virtual bool IsNull();
    static HomeItem NewItem();

    virtual void Read(std::istream& is);
    virtual void Edit();

protected:
    HomeItem(int);
    virtual HomeItem* CopyData();

protected:
    HomeItem* m_Worker;
    short m_Count;
};

Susan had a couple of questions about this new version of the interface.
Susan: Why didn't you put the destructor at the end of the interface? After all, it is the last function to be executed.
Steve: I always put the "concrete data type" functions together at the beginning of the public section of the interface. That makes them easier to find.
Susan: Why aren't the constructors virtual if the destructor is?
Steve: Constructors can't be virtual, because the whole point of a virtual function is to allow the program to use the actual type of an object to determine which function is called. When we call a constructor to create an object, the object doesn't exist yet, so there would be no way to determine which virtual function should be called.
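
Destructors, by contrast, run on objects that already exist, which is why they can be virtual. A small sketch with hypothetical Base and Derived classes shows the effect: deleting a Derived object through a Base pointer runs both destructors, most derived first.

```cpp
#include <string>

// Hypothetical classes; each destructor appends its name to a shared log.
class Base
{
public:
    explicit Base(std::string& Log) : m_Log(Log) {}
    virtual ~Base() { m_Log += "~Base "; }

protected:
    std::string& m_Log;
};

class Derived : public Base
{
public:
    explicit Derived(std::string& Log) : Base(Log) {}
    virtual ~Derived() { m_Log += "~Derived "; }
};

// Creates a Derived, destroys it through a Base pointer, and
// returns the record of which destructors ran.
inline std::string DestroyThroughBase()
{
    std::string Log;
    Base* p = new Derived(Log);
    delete p;   // the actual type of *p selects ~Derived, then ~Base runs
    return Log;
}
```

DestroyThroughBase() returns "~Derived ~Base ". The new expression, on the other hand, has to name Derived outright, because at that point there is no object whose type could be consulted.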
As with the HomeInventory class, I've changed all the xstring arguments to const xstring& to improve efficiency by preventing excessive copying. I've done the same with the Track argument to the HomeItemMusic normal constructor; it's now a const Vec<xstring>& rather than a Vec<xstring>, as in the previous version of the header. I've also added two new functions: GetDescription and IsNull. As usual, the HomeItem versions of these functions merely call the corresponding virtual function in the worker object and pass the results back to the calling function. As for the HomeItemBasic version of GetDescription, this function just returns the current value of the m_Description field in its object, so we don't have to bother analyzing it. IsNull is pretty simple too, but we should still take a look at its implementation, shown in Figure 12.26.
FIGURE 12.26. HomeItemBasic::IsNull (from code\hmit6.cpp)
bool HomeItemBasic::IsNull()
{
    if (m_Name == "")
        return true;

    return false;
}

The idea here is that every actual HomeItem has to have a name, so any object that doesn't have one must be a null HomeItem. Therefore, we check whether the name is null. If so, we have a null item, so we return true; otherwise, it's a real item, so we return false to indicate that it's not null.

Susan had a question about the implementation of this function:

Susan: Why isn't there an else in that function? It seems like there should be one.
Steve: Yes, that is a little bit tricky. Let's think about what happens when we execute that function. If the name is equal to "" (nothing), then we will return the value true, so we will never get to the statement that says to return the value false. On the other hand, if the name isn't equal to "", we won't execute the first return statement that returns the value true, but will execute the return statement that returns the value false. So this function will do what we want in either case.
Of course, we don't have to reimplement IsNull in HomeItemMusic because the test for a null music item is identical to the test for a null basic item. Therefore, we can use the HomeItemBasic version of this function should we need to check whether a HomeItemMusic object is "real" or null.
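
The way a null item travels from the search function back to its caller can be seen in miniature below. This is only a sketch: Item is a hypothetical, much-simplified stand-in for HomeItem, FindByDescription is a free function rather than a HomeInventory member, and the search is case-sensitive for brevity.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for HomeItem, with just two fields.
struct Item
{
    std::string Name;
    std::string Description;

    Item() {}
    Item(const std::string& N, const std::string& D)
        : Name(N), Description(D) {}

    // A real item always has a name, so an empty name marks a null item.
    bool IsNull() const { return Name == ""; }
};

// Returns the first item whose description contains Partial,
// or a default-constructed (null) Item if there is none.
inline Item FindByDescription(const std::vector<Item>& Items,
                              const std::string& Partial)
{
    for (std::size_t i = 0; i < Items.size(); i++)
    {
        if (Items[i].Description.find(Partial) != std::string::npos)
            return Items[i];
    }

    return Item();
}
```

The caller tests the result with IsNull before using it, just as the main program in Figure 12.22 does before calling FormattedDisplay.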

12.9. Putting It All Together

When first writing this part of this chapter, I thought that we had already covered everything needed to build a real application program that would allow the user to create, update, and find items in the database with reasonable ease and convenience. The main program for my first attempt at this was called hmtst7.cpp. When Susan tried it out and then read the code, it had quite an effect on her, as indicated in this letter she wrote to her sister.
Susan: I had another revelation over programming last night. After having read hundreds of pages of this book, Steve showed me the Home Inventory program that we are writing. Annie, it is the smallest little program that you can imagine. Just in DOS, and it is so simple. But I have just spent weeks tearing my hair out trying to understand how it all comes together, it is so complicated, and JUST SO HARD.
Then Steve shows me the program [running] and IT LOOKS LIKE NOTHING! I could not believe it. I just see this little menu on the screen and yet I know what is behind it. At least 1500 lines of code including 7 different header files. If you saw this program [run] you would laugh. It is just so basic. But if you saw what went into it you would die. It is nothing less than pure genius. As I told Steve yesterday it is like having a steak dinner but having not to just cook the meat, but having to go kill the cow. He corrected me, it is like having to make the gun first to kill the cow and invent fire and a grill to cook it.
You have to write everything, I mean all of it. That includes the meaning for = and all the other operators. Unbelievable. Then I looked at Windows 95 and said, then "If that little program takes 1500 lines of code, what does this take?" Steve said "About 5 million lines." Computers look like they are technological miracles. And they are. But behind them is nothing but sheer, old fashioned, genius of man. And it is all hard work. It looks like a miracle but there is no magic.
If you wish, you can try the program yourself, but I won't be reproducing the code here because it turned out that it was far from finished. You see, as stunned as Susan may have been by the complexity of this program, she wasn't too amazed to tell me what was wrong with it and how it could be improved. Here's her "wish list", along with my responses.

First Test Session: Change Requests and Problem Reports

1. Presenting the menu options in different colors.
Can't be done with standard C++ input/output functions.
2. Showing the list of names below the menu rather than above.
See below.
3. Putting the menu in the center of the screen rather than at the top.
These two changes became irrelevant after I redesigned the program to use the screen more efficiently.
4. Sorting the items by name rather than by the order in which they were originally entered.
Done, via a new HomeInventory::SortInventoryByName member function (but see later comments on problems with this function).
5. Being able to move to the next matching item if there is more than one (on a partial field match).
Wrote several HomeInventory member functions to assist in handling multiple matching entries (for more details, see "handling more entries than will fit on the screen", below).
6. Removing an item.
Done.
7. Making a list of categories and displaying it when adding a new item.
Not done, for reasons indicated below.
While watching Susan use the program, I came up with my own list of problems that needed to be fixed and some other improvements in addition to the ones she mentioned above.
8. The error message for an invalid entry is poorly formatted.
Added functions for error reporting.
9. An error in entering a numeric value isn't handled properly.
Same as above.
10. There's no indication to the user as to how the date or amount fields are supposed to be entered (YYYYMMDD and a number with a decimal point but no $ or comma, respectively).
Added a note in the input prompt indicating proper data entry format.
11. An invalid date entry (i.e., other than a valid YYYYMMDD date) should be detected and reported to the user.
Added code to check for this problem: the date must be at least 19000101 (but see later discussion of problems with this solution).
12. The user should be able to determine how many items are in the inventory.
Added a line at the top of the menu indicating how many elements are currently in the inventory.
13. Allowing the user to select an item from a list of all items that meet some criterion (e.g., name, description).
Added selection functions as noted above for this purpose.
14. Handling a list containing more entries than will fit on the screen at one time.
Wrote a selection function in a new HomeUtility namespace to allow scrolling through any number of items.
15. Continuing rather than aborting when an error is detected.
Handled most of these problems by improved error checking in the code; further problems surfaced later and were noted where they occurred.
16. Saving the data during the execution of the program.
Not handled in this version of the program.
17. Backing up the old database before writing over it.
Not done; left as an exercise.
18. If the user asks to delete an item, the program should ask, "Are you sure?"
Done.
Amazingly, the hardest one of these turned out to be the "list of categories". In fact, I didn't implement this at all because it would have required a fairly significant reconstruction of the structure of classes in the program. The problem with this seemingly simple request is that the category list would have to be generated in the HomeInventory member function (because each HomeItem has only one category value out of all those in use, not the entire list). However, the list would need to be used in the HomeItem classes because that's where the user needs to specify the category. Getting the data from the HomeInventory object to the appropriate HomeItem object would be difficult because the existing HomeItem functions don't have any facilities for doing this. While there are ways to solve such problems, they would take us too far afield to be worth the trip in this particular case, since the category listing is not absolutely essential to the usability of the program.

After finishing the above revisions to the program, I compiled it and tested it myself until I was satisfied that it worked. Then I had Susan give it another test. I fully expected that she would be happy with the new functionality and would find the program pretty much bullet-proof. Here are my notes from that second test session, along with my determination of the cause of the problem or question.

Second Test Session: Change Requests and Problem Reports

1. Adding a new Music item with a bad date such as "1997/02/15" (instead of 19970215) caused the program to abort with a fatal error.
This was caused by my forgetting to check the return value of the HomeItemBasic version of the input function when calling it from the corresponding HomeItemMusic function.
2. The sorting algorithm used to put the items into alphabetical order sorts lowercase letters after uppercase ones. It should ignore case.
This problem was caused by the use of <, which is case-sensitive, to compare strings. I fixed this by adding a less_nocase function to the xstring class and using that function in the sorting algorithm instead of <.
3. If the user typed in an item number that was not valid, the program exited with an illegal vector element error message.
I added code to check whether the item number was valid and to ignore an invalid number rather than end the program.
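To make the fix for problem 2 concrete, here is a minimal sketch of a case-insensitive comparison used as the ordering rule for a sort. It is not the xstring implementation from this chapter, which relies on strnicmp; instead it assumes C++11 or later and uses std::lexicographical_compare with std::tolower. The free function less_nocase and the helper sort_names_nocase are hypothetical stand-ins for illustration only.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for xstring::less_nocase: returns true if a sorts
// before b when upper- and lower-case letters are treated as equal.
bool less_nocase(const std::string& a, const std::string& b)
{
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](unsigned char x, unsigned char y) {
            return std::tolower(x) < std::tolower(y);
        });
}

// Hypothetical helper: sorts names alphabetically, ignoring case.
void sort_names_nocase(std::vector<std::string>& names)
{
    std::sort(names.begin(), names.end(), less_nocase);
}
```

With the ordinary < comparison, "Zebra" would sort before "apple" because uppercase letters have lower character codes; with the comparison above, "apple" comes first.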
Susan had some other comments and questions after using this version of the program.
Susan: The Music items should be listed separately from the other items.
Steve: What if you want to see all the items in your inventory? That may not seem to make sense with CDs and furniture, but what about when we add the clothing, appliance, and other types?
Susan: I don't want them jumbled together.
Steve: Well, we could add separate functions for showing things of each type, but I'm not sure how valuable the discussion of all those functions would be. I know, I'll make it an exercise!
Susan: Okay. Now, I noticed that if you type in something illegal (like a bad date) you have to start over again rather than being able to fix it right there.
Steve: Yes, that's true. It would make another good exercise to add the ability to continue entering data for the same item after an error. Thanks for the suggestion!
Susan: As long as I don't have to do it. I also have some other questions about the dates. What if I inherited something old that was purchased before 1900? Also, what if I don't know the date when I bought something? Can I type in ???????
Steve: I'll make the starting date 1800 rather than 1900. As for using question marks to mean "I don't know": that won't work because the input routine is checking for a numeric value. However, I'll change it so you can use 0 to mean "I don't know when I got it". How's that?
Susan: That's okay. However, I noticed one more thing. It would let me type 19931822 (the 22nd day of the 18th month of 1993). Shouldn't it check for legal dates?
Steve: Yes, that would be a good . . . exercise!
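For readers who want a head start on that exercise, here is one possible approach, offered as a sketch rather than as the code used in the program. It assumes the date is held as a number in YYYYMMDD form, accepts 0 to mean "I don't know", rejects anything before 18000101, and also checks that the month and day exist. The function name is_valid_date is hypothetical.

```cpp
// Hypothetical helper: returns true if date is 0 ("unknown") or a real
// calendar date in YYYYMMDD form that is no earlier than 18000101.
bool is_valid_date(long date)
{
    if (date == 0)
        return true;                      // 0 means "I don't know when I got it"
    if (date < 18000101L || date > 99991231L)
        return false;                     // too early, or not eight digits

    int year  = static_cast<int>(date / 10000);
    int month = static_cast<int>((date / 100) % 100);
    int day   = static_cast<int>(date % 100);

    if (month < 1 || month > 12)
        return false;                     // catches entries such as 19931822

    static const int days_in_month[] =
        {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int limit = days_in_month[month - 1];

    // February has 29 days in a leap year (Gregorian rule).
    bool leap = (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
    if (month == 2 && leap)
        limit = 29;

    return day >= 1 && day <= limit;
}
```

Note that this checks only the numeric value; rejecting input such as "1997/02/15" still has to happen when the text is converted to a number.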
After making all these changes, I recompiled and tested the program, then gave it to Susan to see what she could do with (or to) it. Here are my notes of that trial along with how I handled the points that came up.

Third Test Session: Change Requests and Problem Reports

1. On a screen with only 25 lines, it was pretty easy to overflow the available space when editing a long item. This resulted in a very messy display.
I changed the program to clear the screen before editing an item. This makes the entire screen available for displaying the item, rather than having the main menu occupy part of it all the time.
2. If the user typed something other than just hitting the Enter key after a message that said "Enter to continue", the other keystrokes were interpreted as menu selections.
I added code to ignore any keystrokes preceding an Enter in that situation.
3. When the user typed an invalid item number, the program just ignored it rather than giving any indication of an error.
I added code to display an "invalid item number" message in such a situation.
4. The routine that asked for an item number didn't handle backspaces correctly.
I changed the routine to fix that problem.
5. If there weren't any items matching the user's selection criterion, the selection area was blank with no other indication of what had happened.
I added code to provide a "No items found" error message.
6. The "category" field was pretty useless, as it couldn't be used for selecting items.
I added a "select by category" menu item, which displays the category as well as the name of the item.
In addition to these changes, this version of the program also implemented a "crash protection" feature, which automatically saves the latest version of the database in a separate file every time a change is made to any item or an item is added or deleted. This might be a problem in terms of performance if the database gets to be very large, but I think it's worth it overall. Without such a feature, the user could spend an hour or two typing in data and then have a power failure (or a program bug for that matter) and lose all that work.

Another item that might be useful (especially with long descriptions), but isn't essential in getting the program to work, is the ability to edit a text field without typing it all in again. That's the topic of another exercise at the end of this chapter.

After making all of these changes and testing them to make sure they seemed to work, I went back to the well one more time. Here are the results of this go-round.

Fourth Test Session: Change Requests and Problem Reports

1. If Susan typed in a category that didn't exist, the program aborted with a "virtual memory exceeded" error.
The problem here turned out to be my attempt to format the category listing header. If the header was longer than the actual category names, the code calculated a negative number of characters of padding. Then it tried to create an xstring of that number of spaces by calling the xstring constructor that takes a number of characters as its argument. When this constructor called the new operator to allocate the space for the padding, it passed the negative number of characters along to new. However, new takes the number of objects to create as an unsigned value, so the negative number was interpreted as a very large positive number. When it tried to allocate that number of characters, the underlying memory allocation routines couldn't handle the request and terminated the program after giving that error message. I fixed this by correcting the formatting logic so that the program wouldn't ask for a negative amount of padding.
2. The code to display an error if there were no items found wasn't working.
I changed the "wait for CR" code to fix this.
3. The name and category header wasn't lined up properly with the actual name and category entries for item numbers greater than 9 (i.e., more than one digit).
I changed the formatting of the selection function to fix the width of the item number at 5 digits rather than letting it vary with the size of the item number. This made it much easier to line the header up with the data.
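The negative padding bug in problem 1 is worth a small illustration. The sketch below uses std::string rather than xstring, and the function names are hypothetical; it shows how a negative count turns into an enormous unsigned one, and one way to compute padding so that it can never go below zero.

```cpp
#include <cstddef>
#include <string>

// What effectively happened in the bug: a negative int is converted to an
// unsigned size, which wraps around to a very large positive value.
std::size_t as_unsigned_count(int count)
{
    return static_cast<std::size_t>(count);
}

// Hypothetical helper: number of spaces needed to pad text out to width
// columns. If the text is already at least that wide, the answer is 0.
std::size_t padding_needed(const std::string& text, std::size_t width)
{
    return text.size() < width ? width - text.size() : 0;
}

// Hypothetical helper: returns text followed by enough spaces to fill width.
std::string pad_to_width(const std::string& text, std::size_t width)
{
    return text + std::string(padding_needed(text, width), ' ');
}
```

Comparing the two sizes before subtracting is the important part; subtracting first and checking the sign afterward doesn't work with unsigned values, because the result is never negative.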
Susan had another suggestion to make the program easier to use, as well as a question about the category listing function.
Susan: How about letting the user type the date in with the slashes, as YYYY/MM/DD?
Steve: That would make a good exercise too.
Susan: How does the category listing function know where to put the categories on the screen?
Steve: It goes through the items once to figure out how long the names are and once to do the formatting. Thus, by the time it does the formatting, it knows how long the longest item name is. We'll go over exactly how that works when we get to that function, SelectItemFromCategoryList (Figure 13.31 on page 969).
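Until we get to that function, here is a simplified sketch of the two-pass idea, assuming C++11 or later. It is not SelectItemFromCategoryList itself; the function format_name_category, its pair-based argument, and the two-space gap between columns are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: each entry holds (name, category). Returns one line
// of text per entry, with the categories lined up in a single column.
std::vector<std::string> format_name_category(
    const std::vector<std::pair<std::string, std::string>>& items)
{
    // Pass 1: find the length of the longest name.
    std::size_t longest = 0;
    for (const auto& item : items)
        longest = std::max(longest, item.first.size());

    // Pass 2: pad every name to that length, plus a two-space gap,
    // then add the category.
    std::vector<std::string> lines;
    for (const auto& item : items)
    {
        std::string line = item.first;
        line.append(longest - item.first.size() + 2, ' ');
        line += item.second;
        lines.push_back(line);
    }
    return lines;
}
```

Because the padding is computed from the longest name found in the first pass, the subtraction in the second pass can never produce a negative result.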
After making all the changes indicated above, I had Susan try it once more, with the following results.

Fifth Test Session: Change Requests and Problem Reports

1. There wasn't any way to cancel the "Add item" operation if the user decided not to do that after starting the operation.
I added code to allow Enter to cancel the "Add item" operation.
2. Susan wanted to know how she could use the "Find item by category" operation if she didn't remember which category she was looking for.
Here is one of the rare times when a program has a desired feature that the programmer didn't think to add explicitly. As it happens, all you have to do is hit Enter when you are asked for the category name, and it will include items from all categories. I changed the prompt to inform the user about this feature. Since this serendipitous feature also works when the user is asked for a description or name, I changed those prompts as well.
After all these changes, I finally had a program that seemed to work properly according to a representative user's expectations for it. We'll analyze this final version of the home inventory program in great detail in the next chapter, but first we should take a look at the development process that I've just described.

12.10. How Software Development Really Works

Most books on programming present the software development process as a linear progression from the beginning to the end of a project with no detours on the way. However, this is a very misleading picture of what is actually an iterative process: every actual project requires a lot of feedback from users. It also requires considerable time spent correcting errors that may have been overlooked in previous revisions or introduced while adding new features (or even fixing old bugs). The whole process often involves one step back (or sideways) for every few steps forward.

In fact, even the picture I just presented of the incremental implementation of the home inventory program up to this point is over-simplified. I left out a number of occasions on which designing and implementing a new feature took several attempts, including adding further infrastructure support (e.g., the less_nocase function in the xstring class).

This whole process may seem odd. After all, programming is not a "soft" subject; for any given program, it should be fairly easy to decide whether it works or it doesn't.10 The error in this analysis is that even relatively simple programs (such as the one we've spent so much time on) are complex constructs that display a wide range of behavior. Given this complexity, determining whether a given program "works" is anything but trivial; otherwise, I wouldn't have needed Susan to test the program after I finished my own testing. In fact, on virtually every occasion when I had her test a new version of the program in which I had made significant changes, she found some anomaly that necessitated further work. This is anything but unusual; in fact, some large software companies, in a process known as beta testing, send out thousands of copies of each of their major programs to be tested before they release them, in the hope of finding most of the bugs before the paying customers find them (and get upset). Of course, those programs are much more complicated than the little one we've been developing, but the principle is the same: if a program hasn't been tested, it will have bugs.11

Luckily, by using the techniques I've illustrated, it's possible to produce programs that should have fewer bugs, and bugs that are less difficult to find. Hopefully, the program that we've just finished working on doesn't have very many bugs left in it by this time, but I can't guarantee that it is bug-free.12

Now it's time for some review of what we've done in this chapter.

12.11. Review

We started this chapter by creating a new xstring class based on the standard library string class, so that we could add some new functionality to the home inventory program. Before we started to examine the functions in this class, the first new construct we encountered was the include guard, which is a means of preventing the C++ compiler from seeing the same class definition more than one time for a given source code file. Implementing the include guard required us to look at some very old parts of C++ dating back to the early days of C: the preprocessor and its preprocessor symbols. The preprocessor was originally a separate program that was executed before the compiler itself, but nowadays it is usually physically part of the compiler. Most of its features are no longer needed in C++, but we still need it to create an include guard as well as to handle included header files - its most common use these days.
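As a reminder of what an include guard looks like, here is a small sketch. The header name, the preprocessor symbol ITEM_COUNT_H, and the ItemCount type are hypothetical, chosen only for illustration. To show the effect in a single listing, the text of the header is written out twice, which is what the compiler would see if the header were included twice in the same source file.

```cpp
// --- first inclusion of a hypothetical header, item_count.h ---
#ifndef ITEM_COUNT_H      // not defined yet, so the lines below are processed
#define ITEM_COUNT_H      // from here on, the symbol is defined

struct ItemCount
{
    int value;
};

#endif

// --- second inclusion of the same header in the same source file ---
#ifndef ITEM_COUNT_H      // already defined, so everything up to #endif is skipped
#define ITEM_COUNT_H

struct ItemCount          // never seen by the rest of the compiler
{
    int value;
};

#endif
```

Without the guard, the second copy of the struct definition would be reported as an error, because a class may be defined only once in a given source file.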

After dealing with that new construct, we went over the implementation of the functions in the new xstring class, starting with the constructors, which allow us to create and copy objects of this new type, including making them interoperate with objects of the standard library string class. Then we continued by examining the first regular member function of this class: find_nocase. This function looks for a sequence of characters within a particular xstring without worrying about whether the case of the characters in the "target" xstring matches the case of the specified sequence of characters. We needed this ability to implement the functions that allow the user to find items whose descriptions or other fields contain a specified sequence of characters. During the discussion of this function, we ran across a (non-standard) C library function called strnicmp, which compares two C strings without considering the case of the characters in them. This is an important feature to provide because the user may not remember the case of the word for which he or she is looking.
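The idea behind find_nocase can be sketched without strnicmp, which isn't part of the standard library. The version below is an illustration under my own assumptions, not the member function we examined: it is a free function operating on std::string, it assumes C++11 or later, and it uses std::search with a case-insensitive character comparison.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical free-function version of find_nocase: returns the position of
// the first occurrence of wanted within target, ignoring case, or
// std::string::npos if there is no such occurrence.
std::string::size_type find_nocase(const std::string& target,
                                   const std::string& wanted)
{
    if (wanted.empty())
        return 0;                          // an empty search string matches anything

    auto it = std::search(
        target.begin(), target.end(), wanted.begin(), wanted.end(),
        [](unsigned char a, unsigned char b) {
            return std::tolower(a) == std::tolower(b);
        });

    if (it == target.end())
        return std::string::npos;
    return static_cast<std::string::size_type>(it - target.begin());
}
```

Treating an empty search string as a match for every target mirrors the behavior of the standard find function.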

The second and final regular member function in the xstring class is less_nocase. It is exactly like the normal operator < function in the string class that we created earlier in this book, except that it uses the strnicmp function to do the comparison so that upper- and lower-case characters will be compared without regard to their case. This was necessary in the implementation of the sorting routine in SortInventoryByName.

After finishing the discussion of the less_nocase function, we looked at an example program that illustrates the use of one of the newly added "find" functions, FindItemByDescription, which, as its name suggests, searches for an item whose description field contains a particular sequence of characters. After a brief description of this new example program, we started to tackle the latest version of the HomeInventory class, hmin6.h. Besides adding the new member functions needed to support searching for items by name or description, I took this opportunity to change the argument-passing conventions of all of the member functions that previously used value arguments of user-defined types to use const references instead. This is usually the best way to pass arguments of user-defined types because it avoids the necessity of making a copy of the argument. Arguments of native types, on the other hand, typically are best passed by value, as copying them is not very expensive.
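The cost that a const reference avoids can be made visible with a small experiment. The class and function names in this sketch are hypothetical and have nothing to do with the home inventory classes; the class simply counts how many times its copy constructor is called.

```cpp
// Hypothetical class that counts how many times it has been copied.
struct Tracked
{
    static int copies;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
};

int Tracked::copies = 0;

// Value argument: the caller's object is copied into the parameter.
void by_value(Tracked) {}

// Const reference argument: the function refers to the caller's object
// directly, so no copy is made, and const prevents it from being changed.
void by_const_ref(const Tracked&) {}
```

Calling by_const_ref with an existing Tracked object leaves Tracked::copies unchanged, whereas calling by_value with the same object increases it by one. For a type such as int or double, there is no copy constructor to run and the copy is trivial, which is why value arguments remain the usual choice for native types.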

Next, we discussed FindItemByDescription, which is very similar to the previously defined FindItemByName, except that it uses the new find_nocase function to search for an item that matches the user's partial description.

At this point, we were ready to look at the next version of the HomeItem interface, hmit6.h, which differs from the previous version of this interface in a few minor ways. First, I changed all the arguments of user-defined types to use const references rather than pass-by-value, just as I did with the HomeInventory class. I also added two new functions, GetDescription and IsNull. Neither of these functions is at all complicated, so we passed over them quickly.

At that point, I fondly believed that I had all the pieces to complete the home inventory project. Therefore, I wrote what I thought would be the final version of the program and submitted it to my beta tester, Susan, for her approval. I was quite surprised to discover that we still had a long way to go. After she had taken a look at the program, both in execution and in source code, we embarked on a voyage of discovery to find and fix errors and inconveniences, and to find ways to improve the functioning of the program. The next 10 pages or so of the chapter consisted of a repeated cycle of the following steps:

1. Susan's trying the program while I watched;
2. My fixing the problems she discovered and adding new features that would make the program easier to use.
Most of the changes I made fell into three main categories: fixing errors (including improved error handling), cosmetic changes (such as position of items on the screen), and improvements to the functioning of the program (such as allowing the user to select items by category). There were also a few changes that I didn't make because they would have required effort out of proportion to their importance in the functioning of the program. Instead, I left them as exercises, which you will see at the end of the next chapter, after we have gone over the code of the final version of the program.

12.12. Conclusion

In this chapter, we brought our home inventory project from humble beginnings to a working, usable program. In the process, we learned about a number of features of C++ that we hadn't needed before. In the next chapter, we'll finish analyzing the project we started so many pages ago.

1 In fact, there is an application development method called "Extreme Programming" that is primarily devoted to dealing with this problem. See Extreme Programming Explained by Kent Beck (Addison-Wesley, ISBN 0-201-61641-6) for an overview of this method.

2 I've suggested a change to the C++ standard that would add a new type of derivation called using inheritance. This new form of inheritance would allow convenient use of all of the facilities of the base class part of a derived class object without allowing a base class pointer to point to a derived class object. Maybe the next revision of the standard will provide this facility, in which case the last reasonable objection to deriving xstring from std::string will be removed.

3 If you're paying very close attention here, you'll notice that the declaration in the header file specifies std::string, not simply string, as the definition in this figure does. The reason we don't have to specify std::string in the definition is that we have included the statement using std::string; in the implementation file so that we don't have to repeat the std:: qualifier throughout this file. I didn't include that using statement in the header file, for reasons explained under the heading "Implementing operator <" on page 514 in Chapter 8, so we need the std:: qualifier there.

4 By the way, a constructor that takes more than one argument but has default arguments for all arguments after the first one is also a conversion function, because it can be called with one argument.

5 Note that the type of the argument to the constructor doesn't have to match the declaration of the constructor exactly to be acceptable. The value 1, for example, will match any unsigned or signed integer type. The argument matching rules are complicated, but luckily we don't have to worry about them further at this point.

6 I made this function case-insensitive after watching Susan's attempt to use a version of the HomeInventory example program that used a case-sensitive searching function. This illustrates why it is absolutely necessary to watch a (hopefully representative) user of a program actually try it before making the assumption that it is "ready for prime time".

7 According to the C++ standard, the address returned by the c_str function may not actually be the address of the char data for the string itself, but the address of a copy of that data. However, this implementation detail doesn't affect us.

8 We've discussed memcmp under the heading "Using a Standard Library Function to Simplify the Code" on page 529 in Chapter 8.

9 If this comparison procedure still isn't clear to you, you might want to try drawing a diagram of a search string and target string with appropriate contents, and tracing execution of find_nocase. The reason I haven't provided such diagrams is that they would really need to be animated to be of much use, and I'm afraid that book publishing technology isn't quite up to that yet!

10 Actually, this is not true in theory. It was proven many years ago that it is impossible in the general case even to determine whether a particular program will run forever or stop, given a particular set of input, by any mechanical means. This is the famous "halting problem" of Alan Turing. In practice, however, it is usually possible to determine whether a meaningful program works correctly, although this may be quite difficult to do.

11 According to a very well known principle called "Murphy's Law", even if a program has been tested, it will still have bugs.

12 By the way, if you find any bugs in this or any of the other sample programs, which is certainly possible despite my (and Susan's) testing, please let me know so I can fix them for the next printing of this book.

