TOC PREV NEXT INDEX


I've just released a deduplicating backup product for
VMWare Workstation and Server.
Do you need to reduce the storage needed to maintain multiple backups?
Or do you need multiple snapshots on VMWare Server?
If you have either of these needs, just click
here to get started for only $29 (limited time offer ends July 31st, 2011)!

C++: A Dialog


CHAPTER 4 More Basics
Now that we have seen how to write a simple program in C++, it's time to acquire some more tools. We'll extend our example program from Chapter 3 for finding the heaviest pumpkin. Eventually, we want to provide the weights of the three heaviest pumpkins, so that first, second, and third prizes can be awarded. It might seem that this would require just a minor modification of the previous program, in which we would keep track of the heaviest so far, second heaviest so far, and third heaviest so far, rather than merely the heaviest so far. However, this modification turns out to be a bit more complicated than it seems. Since this book is intended to teach you how to program using C++, rather than just how to use the C++ language, it's worth investigating why this is. First, though, here are the objectives of this chapter.
4.1. Objectives of This Chapter
By the end of this chapter, you should
1. Understand the likelihood of error in even a small change to a program.
2. Be aware that even seemingly small changes in a problem can result in large changes in the program that solves the problem.
3. Have some understanding of the type of thinking needed to solve problems with programming.
4. Understand the selection sorting algorithm for arranging values in order.
5. Understand how to use a Vec to maintain a number of values under one name.
6. Be able to use the for statement to execute program statements a (possibly varying) number of times.
7. Be familiar with the arithmetic operators ++ and +=, which are used to modify the value of variables.

4.2. Algorithmic Thinking

Let's take our program modification one step at a time, starting with just the top two weights. Figure 4.1 is one possible way to handle this version of the problem.
FIGURE 4.1. Finding the top two weights, first try (code\pump1a.cpp)

#include <iostream>

using namespace std;

int main()
{
short CurrentWeight;
short HighestWeight;
short SecondHighestWeight;

cout << "Please enter the first weight: ";
cin >> CurrentWeight;
HighestWeight = CurrentWeight;
SecondHighestWeight = 0;
cout << "Current weight " << CurrentWeight << endl;
cout << "Highest weight " << HighestWeight << endl;

while (CurrentWeight > 0)
{
cout << "Please enter the next weight: ";
cin >> CurrentWeight;
if (CurrentWeight > HighestWeight)
{
SecondHighestWeight = HighestWeight;
HighestWeight = CurrentWeight;
}
cout << "Current weight " << CurrentWeight << endl;
cout << "Highest weight " << HighestWeight << endl;
cout << "Second highest weight " << SecondHighestWeight << endl;
}

return 0;
}
The reasons behind some of the new code, shown in bold, should be fairly obvious, but we'll go over them anyway. First, of course, we need a new variable, SecondHighestWeight, to hold the current value of the second highest weight we've seen so far. Then, when the first weight is entered, the statement SecondHighestWeight = 0; sets the SecondHighestWeight to 0. After all, there isn't any second-highest weight when we've only seen one weight. The first nonobvious change is the addition of the statement SecondHighestWeight = HighestWeight;, which copies the old HighestWeight to SecondHighestWeight, whenever there's a new highest weight. On reflection, however, this should make sense; when a new high is detected, the old high must be the second highest value (so far). Also, we have to copy the old HighestWeight to SecondHighestWeight before we change HighestWeight. After we have set HighestWeight to a new value, it's too late to copy its old value into SecondHighestWeight.

First, let's see how Susan viewed this solution:

Susan: I noticed that you separate out the main program { } from the other { } by indenting. Is that how the compiler knows which set of { } goes to which statements and doesn't confuse them with the main ones that are the body of the program?
Steve: The compiler doesn't care about indentation at all; that's just for the people reading the program. All the compiler cares about is the number of { it has seen so far without matching }. There aren't any hard rules about indentation; it's a "religious" issue in C++, where different programmers can't agree on the best way.
Susan: Now on this thing with setting SecondHighestWeight to 0. Is that initializing it? See, I know what you are doing, and yet I can't see the purpose of doing this clearly, unless it is initializing, and then it makes sense.
Steve: That's correct.
Susan: How do you know how to order your statements? For example, why did you put the "SecondHighestWeight = HighestWeight;" above the other statement? What would happen if you reversed that order?
Steve: Think about it. Let's suppose that:

CurrentWeight is 40

HighestWeight is 30

SecondHighestWeight is 15

and the statements were executed in the following order:
1. HighestWeight = CurrentWeight
2. SecondHighestWeight = HighestWeight
What would happen to the values? Well, statement 1 would set HighestWeight to CurrentWeight, so the values would be like this:

CurrentWeight is 40

HighestWeight is 40

SecondHighestWeight is 15

Then statement 2 would set SecondHighestWeight to HighestWeight, leaving the situation as follows:

CurrentWeight is 40

HighestWeight is 40

SecondHighestWeight is 40

This is clearly wrong. The problem is that we need the value of HighestWeight before it is set to the value of CurrentWeight, not afterward. After that occurs, the previous value is lost.
Susan: Yes, that is apparent; I was just wondering if the computer had to read it in the order that you wrote it, being that it was grouped together in the {}. For example, you said that the compiler doesn't read the {} as we write them, so I was wondering if it read those statements as we write them. Obviously it has to. So then everything descends in a progression downward and outward, as you get more detailed in the instructions.
If you wish, you can try out this program. First, you have to compile it by following the compilation instructions on the CD. Then type pump1a to run the program under DOS. It will ask you for weights and keep track of the highest weight and second-highest weight that you've entered. Type 0 and hit Enter to end the program.

You can also run it under the debugger by following the usual instructions for that method.

Susan Finds a Bug

This program may seem to keep track of the highest and second highest weights correctly, but in fact there's a hole in the logic. To be exact, it doesn't work correctly when the user enters a new value that's less than the previous high value but more than the previous second-high value. In that case, the new value should be the second-high value, even though there's no new high value. For example, suppose that you enter the following weights: 5 2 11 3 7. If we were to update SecondHighestWeight only when we see a new high, our program would indicate that 11 was the high, and 5 the second highest. Since neither 3 nor 7 is a new high, SecondHighestWeight would remain as it was when the 11 was entered.

Here's what ensued when Susan tried out the program and discovered this problem:

Susan: Steve, the program! I have been playing with it. Hey this is fun, but look, it took me awhile. I had to go over it and over it, and then I was having trouble getting it to put current weights that were higher than second weights into the second weight slot. For example, if I had a highest weight of 40 and the second highest weight of 30 and then selected 35 for a current weight, it wouldn't accept 35 as the second-highest weight. It increased the highest weights just fine and it didn't change anything if I selected a lower number of the two for a current weight. Or did you mean to do that to make a point? I am supposed to find the problem? I bet that is what you are doing.
Steve: Yep, and I'm not sorry, either. <G>
Susan: You just had to do this to me, didn't you? OK, what you need to do is to put in a statement that says if the current weight is greater than the second-highest weight, then set the second-highest weight to the current weight (as illustrated in Figure 4.2).
FIGURE 4.2. Susan's solution to the bug in the first attempt
else
{
if (CurrentWeight > Second HighestWeight)
Second HighestWeight = CurrentWeight;
}
I hope you are satisfied.
Steve: Satisfied? Well, no, I wouldn't use that word. How about ecstatic? You have just figured out a bug in a program, and determined what the solution is. Don't tell me you don't understand how a program works.
Now I have to point out something about your code. I understood what you wrote perfectly. Unfortunately, compilers aren't very smart and therefore have to be extremely picky. So you have to make sure to spell the variable names correctly, that is, with no spaces between the words that make up a variable name. This would make your answer like the else clause shown in Figure 4.3.
Congratulations again.
As Susan figured out, we have to add an else clause to our if statement, so that the corrected version of the statement looks like Figure 4.3.
FIGURE 4.3. Using an if statement with an else clause
if (CurrentWeight > HighestWeight)
{
SecondHighestWeight = HighestWeight;
HighestWeight = CurrentWeight;
}
else
{
if (CurrentWeight > SecondHighestWeight)
SecondHighestWeight = CurrentWeight;
}
In this case, the condition in the first if is checking whether CurrentWeight is greater than the previous HighestWeight; when this is true, we have a new HighestWeight and we can update both HighestWeight and SecondHighestWeight. However, if CurrentWeight is not greater than HighestWeight, the else clause is executed. It contains another if; this one checks whether CurrentWeight is greater than the old SecondHighestWeight. If so, SecondHighestWeight is set to the value of CurrentWeight.

What happens if two (or more) pumpkins are tied for the highest weight? In that case, the first one of them to be encountered is going to set HighestWeight, as it will be the highest yet encountered. When the second pumpkin of the same weight is seen, it won't trigger a change to HighestWeight, since it's not higher than the current value of that variable. It will pass the test in the else clause, if (CurrentWeight > SecondHighestWeight), however, which will cause SecondHighestWeight to be set to the same value as HighestWeight. This is reasonable behavior, unlikely to startle the (hypothetical) user of the program, and therefore is good enough for our purposes. In a real application program we'd have to try to determine what the user of this program would want us to do.

Figure 4.4 shows the corrected program.

FIGURE 4.4. Finding the top two weights (code\pump2.cpp)
#include <iostream>
using namespace std;

int main()
{
short CurrentWeight;
short HighestWeight;
short SecondHighestWeight;

cout << "Please enter the first weight: ";
cin >> CurrentWeight;
HighestWeight = CurrentWeight;
SecondHighestWeight = 0;
cout << "Current weight " << CurrentWeight << endl;
cout << "Highest weight " << HighestWeight << endl;

while (CurrentWeight > 0)
{
cout << "Please enter the next weight: ";
cin >> CurrentWeight;
if (CurrentWeight > HighestWeight)
{
SecondHighestWeight = HighestWeight;
HighestWeight = CurrentWeight;
}
else
{
if (CurrentWeight > SecondHighestWeight)
SecondHighestWeight = CurrentWeight;
}
cout << "Current weight " << CurrentWeight << endl;
cout << "Highest weight " << HighestWeight << endl;
cout << "Second highest weight " << SecondHighestWeight << endl;
}

return 0;
}
If you wish, you can try out this program. First, you have to compile it by following the compilation instructions on the CD. Then type pump2 to run the program under DOS. It will ask you for weights and keep track of the highest weight and second-highest weight that you've entered. Type 0 and hit Enter to end the program.

You can also run it under the debugger, by following the usual instructions for that method. When you are asked for a weight, type one in and hit Enter just as when executing normally. When you enter a 0 weight, the program will stop looping and execution will take the path to the end }.

By the way, since we've been using the if statement pretty heavily, this would be a good time to list all of the conditions that it can test. We've already seen some of them, but it can't hurt to have them all in one place. Figure 4.5 lists these conditions, with translations.

FIGURE 4.5. What if?

Symbol Controlled block will be executed if:
> First item is larger than second item
< First item is smaller than second item
>= First item is larger than or equal to second item
<= First item is smaller than or equal to second item
!= First item differs from second item
== First item has the same value as the second item

You may wonder why we have to use == to test for equality rather than just =. That's because = means "assign right hand value to variable on left", rather than "compare two items for equality". This is a "feature" of C++ (and C) that allows us to accidentally write if (a = b) when we mean if (a == b). What does if (a = b) mean? It means the following:
1. Assign the value of b to a.
2. If that value is 0, then the result of the expression in parentheses is false, so the controlled block of the if is not executed.
3. Otherwise, the result of the expression in parentheses is true, so the controlled block of the if is executed.
Some people find this useful; I don't. Therefore, whenever possible I enable a compiler warning that tells you when you use = inside an if statement in a way that looks like you meant to test for equality.

Even a Simple Change to a Program Can Cause Errors

I hope this excursion has given you some appreciation of the subtleties that await in even the simplest change to a working program. Many experienced programmers still underestimate such difficulties and the amount of time that may be needed to ensure that the changes are correct.1 I don't think it's necessary to continue along the same path with a program that can award three prizes. The principle is the same, although the complexity of the code grows with the number of special cases we have to handle. Obviously, a solution that could handle any number of prizes without special cases would be a big improvement, but it will require some major changes in the organization of the program. That's what we'll take up next.

4.3. Handling Any Number of Prizes

One of the primary advantages of the method we've used so far to find the heaviest pumpkin(s) is that we didn't have to save the weights of all the pumpkins as we went along. If we don't mind saving all the weights, then we can solve the "three prize" problem in a different way. Let's assume for the purpose of simplicity that there are only five weights to be saved, in which case the solution looks like this:
1. Read in all of the weights.
2. Make a list consisting of the three highest weights in descending order.
3. Award the first, second, and third prizes, in that order, to the three entries in the list of highest weights.
Now let's break those down into substeps that can be more easily translated into C++:
1. Read in all of the weights.
a. Read first number
b. Read next number
c. If we haven't read five weights yet, go back to 1b
2. Make a list consisting of the three highest weights in descending order.
a. Find the largest number in the original list of weights
b. Copy it to the sorted list
c. If we haven't found the three highest numbers, go back to 2a
Oops. That's not going to work, since we'll get the same number each time.2 To prevent that from happening, we have to mark off each number as we select it. Here's the revised version of step 2:
2. Make a list consisting of the three highest weights in descending order.
a. Find the largest number in the original list of weights
b. Copy it to the sorted list
c. Mark it off in the original list of weights, so we don't select it again
d. If we haven't found the three highest numbers, go back to 2a
3. Award the first, second, and third prizes, in that order, to the three entries in the list of highest weights.
a. Display first number in the list
b. Display next number in the list
c. If we haven't displayed them all, go back to 3b
Unlike our previous approach, this can easily be generalized to handle any number of prizes. However, we have to address two problems before we can use this approach: First, how do we keep track of the weights? And second, how do we select out the highest three weights? Both of these problems are much easier to solve if we don't have a separate variable for each weight.

The vector Data Type

We often need to keep track of a number of closely related variables in C++ programs, so the standard library provides a type called vector, defined in the header file <vector>. A vector is a variable containing a number of "sub-variables" that can be addressed by position in the vector; each of these sub-variables is called an element. A vector has a name, just like a regular variable, but the elements do not. Instead, each element has a number, corresponding to its position in the vector. For example, we might want to create a vector of short values called Weight, with five elements. To do this, we would write this line: vector<short> Weight(5);. Then, to refer to one of the elements of the vector Weight, say element 3, we could use the expression Weight[3].

The standard library is the result of an immense amount of work by very talented programmers over the course of quite a few years. Whenever possible, in code that you intend to be used over a period of time, you should use the standard library, as you are very unlikely to be able to create better code yourself. And even if you could improve on it somewhat, you would probably spend your time more productively by writing the parts of your program that aren't already taken care of by the standard library.

The Vec Data Type

So you may be surprised to learn that we're not going to use the vector data type, but one called Vec, defined in the header file "Vec.h". I have created this type, which uses the standard library vector to do most of the actual work, following the example of Bjarne Stroustrup in The C++ Programming Language.

Why are Bjarne and I creating our own Vec types rather than using one from the standard library? Because the standard library is designed for professional programmers, not for people who are just learning the language. For this reason, it is oriented more toward efficiency than toward safety. To make the learning process easier and more enjoyable for novices, the Vec type always checks whether an element number (such as 3 in the above example) is valid before trying to use it, which the standard vector does not. We'll see how this helps us later on.

We haven't heard from Susan for awhile, but the following exchange should make up for that.

Susan: OK, why do we need another header (#include "Vec.h")?
Steve: Each header contains definitions for a specific purpose. For example, <iostream> contains definitions that allow us to get information in (I) and out (O) of the computer. On the other hand, "Vec.h" contains definitions that allow us to use Vecs.
Susan: So then using a Vec is just another way of writing this same program, only making it a little more efficient?
Steve: In this case, the new program can do more than the old program could: the new program can easily be changed to handle virtually any number of prizes, whereas the old program couldn't.
Susan: So there is more than one way to write a program that does basically the same thing?
Steve: As many ways as there are to write a book about the same topic.
Susan: I find this to be very odd. I mean, on one hand the code seems to be so unrelentingly exact; on the other, it can be done in as many ways as there are artists to paint the same flower. That must be where the creativity comes in. Then I would expect that the programs should behave in different manners, yet accomplish the same goal.
Steve: It's possible for two programs to produce similar (or even exactly the same) results from the user's perspective and yet work very differently internally. For example, the "vectorized" version of the weighing program, if we had it display only the top two weights, would produce exactly the same final results as the final "non-vectorized" version, even though the method of finding the top two weights was quite different.
Now we can refer to the individual elements of the Vec called Weight by using their numbers, enclosed in square brackets ([ ]); the number in the brackets is called the index. Here are some examples:

Weight[1] = 123;
Weight[2] = 456;
Weight[3] = Weight[1] + Weight[2];
Weight[i+1] = Weight[i] + 5;
As these examples indicate, an element of a Vec can be used anywhere a "regular" variable can be used.3 But an element of a Vec has an attribute that makes it much more valuable than a "regular" variable for our purposes here: we can vary which element we are referring to in a given statement, by varying an index. Take a look at the last sample line, in which two elements of the Vec Weight are used: the first one is element i+1 and the other is element i.4 As this indicates, we don't have to use a constant value for the element number, but can calculate it while the program is executing. In this case, if i is 0, the two elements referred to are element 1 and element 0, while if i is 5, the two elements are elements 6 and 5, respectively.

The ability to refer to an element of a Vec by number rather than by name allows us to write statements that can refer to any element in a Vec, depending on the value of the index variable in the statements. To see how this works in practice, let's look at Figure 4.6, which solves our three-prize problem.5

FIGURE 4.6. Using a Vec (code\vect1.cpp)

#include <iostream>

#include "vec.h"

using namespace std;

int main()

{

Vec<short> Weight(5);

Vec<short> SortedWeight(3);

short HighestWeight;

short HighestIndex;

short i;

short k;

cout << "I'm going to ask you to type in five weights, in pounds." << endl;

for (i = 0; i < 5; i ++)

{

cout << "Please type in weight #" << i+1 << ": ";

cin >> Weight[i];

}

for (i = 0; i < 3; i ++)

{

HighestWeight = 0;

for (k = 0; k < 5; k ++)

{

if (Weight[k] > HighestWeight)

{

HighestWeight = Weight[k];

HighestIndex = k;

}

}

SortedWeight[i] = HighestWeight;

Weight[HighestIndex] = 0;

}

cout << "The highest weight was: " << SortedWeight[0] << endl;

cout << "The second highest weight was: " << SortedWeight[1] << endl;

cout << "The third highest weight was: " << SortedWeight[2] << endl;

return 0;

}

If you wish, you can try out this program. First, you have to compile it by following the compilation instructions on the CD. Then type vect1 to run the program under DOS. It will ask you for five weights. After you've entered five weights, the program will sort them and display the top three.

You can also run it under the debugger, by following the usual instructions for that method. When you are asked for a weight, type one in and hit Enter just as when executing normally. After you've entered 5 weights, the program will start the sorting process.

This program uses several new features of C++ which need some explanation. First, of course, there is the line that defines the Vec Weight:

Vec<short> Weight(5);
As you might have guessed, this means that we want a Vec of five elements, each of which is a short. As we have already seen, this means that there are five distinct index values each of which refers to one element. However, what isn't so obvious is what those five distinct index values actually are. You might expect them to be 1, 2, 3, 4 and 5; actually, they are 0, 1, 2, 3, and 4.

This method of referring to elements in a Vec is called zero-based indexing, as contrasted with the seemingly more natural one-based indexing where we start counting at 1. Although zero-based indexing might seem odd, assembly language programmers find it perfectly natural because the calculation of the address of an element is simpler with such indexing; the formula is "address of an element = (address of first element) + (element number) * (size of element)".

This bit of history is relevant because C, the predecessor of C++, was originally intended to replace assembly language so that programs could be moved from one machine architecture to another with as little difficulty as possible. One reason for some of the eccentricities of C++ is that it has to be able to replace C as a "portable assembly language" that doesn't depend on any specific machine architecture. This explains, for example, the great concern of the inventor of C++ for run-time efficiency, as he wished to allow programmers to avoid the use of C or assembly language for efficiency.6 Since C++ was intended to replace C completely, it has to be as efficient as possible; otherwise, programmers might switch back from C++ to C whenever they were concerned about the speed and size of their programs.

A Historical Example of One-based Indexing

While we're on the topic of zero-based indexing, I'd like to mention a case where the inventors of a commonly used facility should have used zero-based indexing, but didn't. We're still suffering from the annoyances of this one.

Long ago, there was no standard calendar with year numbers progressing from one to the next when January 1st came around. Instead, years were numbered relative to the reign of the current monarch; for example, the Bible might refer to "the third year of Herod's reign". This was fine in antiquity, when most people really didn't care what year it was. There weren't very many retirement plans or 50th wedding anniversaries to celebrate anyway. However, eventually historians realized that it was a major nuisance to try to calculate the age of someone who was born in the fourth year of someone's reign and died in the tenth year of someone else's. According to Grolier's Multimedia Encyclopedia:

'About AD 525, a monk named Dionysius Exiguus suggested that years be counted from the birth of Christ, which was designated AD (anno Domini, "the year of the Lord") 1. This proposal came to be adopted throughout Christendom during the next 500 years. The year before AD 1 is designated 1 BC (before Christ).'
The encyclopedia doesn't state when the use of the term BC started, but the fact that its expansion is English rather than Latin is a suspicious sign indicating that this development was considerably later. In any event, this numbering system made matters considerably easier. Now, you could tell that someone who was born in AD 600 and died in AD 650 was approximately 50 years old at death.

Unfortunately, however, there was still a small problem. Zero hadn't yet made it to Europe from Asia when the new plan was adopted, so the new calendar numbered the years starting with 1, rather than 0; that is, the year after 1 BC was 1 AD. While this may seem reasonable, it accounts for a number of oddities of our current calendar:

1. Date ranges spanning AD and BC are hard to calculate, since you can't just treat BC as negative. For example, if someone were born in 1 BC and died in 1 AD, how old was that person? You might think that this could be calculated as 1 - (-1), or 2; however, the last day of 1 BC immediately preceded the first day of 1 AD, so the person might have been only a few days old.
2. The 20th century consists of the years 1901 to 2000; the year numbers of all but the last year of that century actually start with the digits 19 rather than 20.
3. Similarly, the third millennium started on January 1, 2001, not 2000.
The reason for the second and third of these oddities is that since the first century started in 1 AD, the second century had to start in 101 AD; if it started in 100 AD, the first century would have consisted of only 99 years (1-99), rather than 100.

If only they had known about the zero! Then the zeroth century would have started at the beginning of 0 AD and ended on the last day of 99 AD. The first century would have started at 100 AD, and so on; coming up to current time, we would be living through the first years of the 20th century, which would be defined as all of those years whose year numbers started with 20. The second millennium would have started on January 1, 2000, as everyone would expect.7

4.4. Index Variables

Now let's get back to our discussion of the revised pumpkin-weighing program. The last two lines in the variable definition phase define two variables, called i and k, which have been traditional names for index variables (i.e., variables used to hold indexes) since the invention of FORTRAN, one of the first relatively "user-friendly" computer languages, in the 1950s. The inventors of FORTRAN used a fairly simple method of determining the type of a variable: if it began with one of the letters I through N, it was an integer. Otherwise, it was a floating-point variable (i.e., one that can hold values that contain a fractional part, such as 3.876). This rule was later changed so that the user could specify the type of the variable regardless of its name, as in C++, but the default rules were the same as in the earlier versions of FORTRAN, to allow programs using the old rules to continue to compile and run correctly.

Needless to say, Susan had some questions about the names of index variables:

Susan: So whenever you see i or k you know you are dealing with a Vec?
Steve: Not necessarily. Variables named i and k are commonly used as indexes, but they are also used for other purposes sometimes.
Susan: Anyway, if i and k are sometimes used for other purposes, then the compiler doesn't care what you use as indexes? Again, no rules, just customs?
Steve: Right. It's just for the benefit of other programmers, who will see i and say "oh, this is probably an index variable".
I suspect one reason for the durability of these short names is that they're easy to type, and many programmers aren't very good typists.8 In C++, the letters i, j, k, m and n are commonly used as indexes; however, l (the letter "ell") generally isn't, because it looks too much like a 1 (the numeral one).9 The compiler doesn't get confused by this resemblance, but programmers very well might.

After the variable definitions are out of the way, we can proceed to the executable portion of our program. First we type out a note to the user, stating what to expect. Then we get to the code in Figure 4.7.

FIGURE 4.7. Using a for loop (from code\vect1.cpp)
for (i = 0; i < 5; i ++)
{
cout << "Please type in weight #" << i+1 << ": ";
cin >> Weight[i];
}
The first line here is called a ffor statement, which is used to control a for loop; this is a loop control facility similar to the while loop we encountered in Chapter 3. The difference between these two statements is that a for loop allows us to specify more than just the condition under which the controlled block will be repetitively executed.10 A for statement specifies three expressions (separated by ";") that control the execution of the for loop: a starting expression, a continuation expression, and a modification expression. In our case, these are i = 0, i < 5, and i ++, respectively. Let's look at the function and meaning of each of these components.

First, the starting expression, in this case i = 0. The starting expression is executed once before the block controlled by the for statement is executed. In this case, we use it to set our index variable, i, to 0, which will refer to the first element of our Weight Vec.

Next, the continuation expression, in this case i < 5. As long as this expression is true, the controlled block of the for will be executed; in this case, we will continue executing the controlled block as long as the value of i is less than 5. Note: the continuation expression is actually executed before every execution of the controlled block; thus, if the continuation expression is false when the loop is entered for the first time, the controlled block will not be executed at all.

The notion of the continuation expression is apparently confusing to some novices. Susan fell into that group.

Susan: In your definition of for, how come there is no ending expression? Why is it only a modification expression? Is there never a case for a conclusion?
Steve: The "continuation expression" tells the compiler when you want to continue the loop; if the continuation expression comes out false, then the loop terminates. That serves the same purpose as an "ending expression" might, but in reverse.
Finally, let's consider the modification expression, i ++.11 This is exactly equivalent to i = i + 1, which means "set i to one more than its current value", an operation technically referred to as incrementing a variable. You may wonder why we need two ways to say the same thing; actually, there are a few reasons. One is that ++ requires less typing, which as I've already mentioned isn't a strong point of many programmers. Also, the ++ (pronounced "plus plus") operator doesn't allow the possibility of mistyping the statement as, for example, i = j + 1; when you really meant to increment i. Another reason why this feature was added to the C language is that in the early days of C, compiler technology wasn't very advanced and the ++ operator allowed the production of more efficient programs. You see, many machines can add one to a memory location by a single machine language instruction, usually called something like inc, shorthand for increment memory. Even a simple compiler can generate an "increment memory" instruction as a translation of i ++, while it takes a bit more sophistication for the compiler to recognize i = i + 1 as an increment operation. Since incrementing a variable is a very common operation in C++, this was worth handling specially.12

Here's an English translation of our sample for statement:

1. Set the index variable i to 0.
2. If the value of i is less than 5, execute the following block (in this case, the block with the cout and cin statements). Otherwise, skip to the next statement after the end of the controlled block; that is, the one following the closing }.
3. Add one to the value of i and go back to step 2.
Susan didn't think these steps were very clear. Let's listen in on the conversation that ensued:
Susan: Where in the for statement does it say to skip to the next statement after the end of the controlled block when i is 5 or more?
Steve: It doesn't have to. Remember, the point of { } is to make a group of statements act like one. A for statement always controls exactly one "statement", which can be a block contained in {}. Therefore, when the continuation expression is no longer true, the next "statement" to be executed is whatever follows the } at the end of the block.
Susan: Okay, now I get it. The { } curly brackets work together with the < 5 to determine that the program should go on to the next statement.
Steve: Right.
Susan: Now, on the "controlled block" - so other statements can be considered controlled blocks too? I mean is a controlled block basically just the same thing as a block? I reviewed your definition of block, and it seems to me that they are. I guess it is just a statement that in this case is being controlled by for.
Steve: Correct. It's called a controlled block because it's under the control of another statement.
Susan: So if we used while before the { } then that would be a while controlled block?
Steve: Right.
Susan: Then where in step 3 or in i++ does it say to go back to step 2?
Steve: Again, the for statement executes one block (the controlled block) repeatedly until the continuation expression is false. Since a block is equivalent to one statement, the controlled block can also be referred to as the controlled statement. In the current example, the block that is controlled by the for loop consists of the four lines starting with the opening { on the next line after the for statement itself and ending with the closing } after the line that says cin >> Weight[i];.
Susan: Okay. But now I am a little confused about something else here. I thought that cout statements were just things that you would type in to be seen on the screen.
Steve: That's correct, except that cout is a variable used for I/O, not a statement.
Susan: So then why is << i+1 << put in at this point? I understand what it does now but I don't understand why it is where it is.
Steve: Because we want to produce an output line that varies depending on the value of i. The first time, it should say
Please enter weight #1:

The second time, it should say


Please enter weight #2:
and so on. The number of the weight we're asking for is one more than i; therefore we insert the expression << i + 1 << in the output statement so that it will stick the correct number into the output line at that point.
Susan: How does << i+1 << end up as #1 ?
Steve: The first time, i is 0; therefore, i + 1 is 1. The # comes from the end of the preceding part of the output statement.
Now let's continue with the next step in the description of our for loop, the modification expression i ++. In our example, this will be executed five times. The first time, i will be 0, then 1, 2, 3, and finally 4. When the loop is executed for the fifth time, i will be incremented to 5; therefore, step 2 will end the loop by skipping to the next statement after the controlled block.13 A bit of terminology is useful here: each time through the loop is called an iteration.

Let's hear Susan's thoughts on this matter.

Susan: When you say that "step 2 will end the loop by skipping to the next statement after the controlled block", does that mean it is now going on to the next for statement? So when i is no longer less than 5, the completion of the loop signals the next controlled block?
Steve: Yes and no. In general, after all the iterations in a loop have been performed, execution proceeds to whatever statement follows the controlled block. In this case, the next statement is indeed a for statement, so that's the next statement that is performed after the end of the current loop.
The discussion of the for statement led to some more questions about loop control facilities and the use of parentheses:
Susan: How do you know when to use () ? Is it only with if and for and while and else and stuff like that, whatever these statements are called? I mean they appear to be modifiers of some sort; is there a special name for them?
Steve: The term loop control applies to statements that control loops that can execute controlled blocks a (possibly varying) number of times; these include for and while. The if statement is somewhat different, since its controlled block is executed either once or not at all. The () are needed in that case to indicate where the controlling expression(s) end and the controlled block begins. You can also use () to control the order of evaluation of an arithmetic expression: The part of the expression inside parentheses is executed first, regardless of normal ordering rules. For example, 2*5+3 is 13, while 2*(5+3) is 16.
Susan: So then if you just wrote while CurrentWeight > 0 with no () then the compiler couldn't read it?
Steve: Correct.
Susan: Actually it is beginning to look to me as I scan over a few figures that almost everything has a caption of some sort surrounding it. Everything either has a " " or () or { } or [ ] or <> around it. Is that how it is going to be? I am still not clear on the different uses of () and { }; does it depend on the control loop?
Steve: The { } are used to mark the controlled block, while the () are used to mark the conditional expression(s) for the if, while, for, and the like. Unfortunately, () also have other meanings in C++, which we'll get to eventually. The inventor of the language considers them to have been overused for too many different meanings, and I agree.
Susan: OK, I think I have it: { } define blocks and () define expressions. How am I to know when a new block starts? I mean if I were doing the writing, it would be like a new paragraph in English, right? So are there any rules for knowing when to stop one block and start another?
Steve: It depends entirely on what you're trying to accomplish. The main purpose of a block is to make a group of statements act like one statement; therefore, for example, when you want to control a group of statements by one if or for, you group those statements into a block.
Now that we've examined the for statement in excruciating detail, what about the block it controls? The first statement in the block:

cout << "Please type in weight #" << i+1 << ": ";
doesn't contain anything much we haven't seen before; it just displays a request to enter a weight. The only difference from previous uses we've made of the cout facility, is that we're inserting a numeric expression containing a variable, i+1, into the output. This causes the expression to be translated into a human-readable form consisting of digits. All of the expressions being sent to cout in one statement are strung together to make one line of output, if we don't specify otherwise. Therefore, when this statement is executed during the first iteration of the loop, the user of this program will see:

Please type in weight #1:
Then the user will type in the first weight. The same request, with a different value for the weight number, will show up each time the user hits Enter, until five values have been accepted.

The second statement in the controlled block,

cin >> Weight[i];
is a little different. Here, we're reading the number the user has typed in at the keyboard and storing it in a variable. But the variable we're using is different each time through the loop; it's the "ith" element of the Weight Vec. So, on the first iteration, the value the user types in will go into Weight[0], the value accepted on the second iteration will go into Weight[1], and so on, until on the fifth and last iteration, the typed-in value will be stored in Weight[4].

Here's Susan's take on this.

Susan: What do you mean by the ith element? So does Weight[i] mean you are directing the number that the user types in to a certain location in memory?
Steve: Yes, to the element whose number is the current value of i.
Susan: When you say cin >> Weight[i] does that mean you are telling the computer to place that variable in the index? So this serves two functions, displaying the weight the user types in and associating it to the index?
Steve: No, that statement has the sole function of telling the computer to place the value read in from the keyboard into element i of Vec Weight.
Susan: What I am confusing is what is being seen on the screen at the time that the user types in the input. So, the user sees the number on the screen but then it isn't displayed anywhere after that number is entered? Then, the statement cin >> Weight [i] directs it to a location somewhere in memory with a group of other numbers that the user types in?
Steve: Correct. This will be illustrated under the contents of Weight heading in Figures 4.10 - 4.13.

4.5. The Selection Sort

Now that we have stored all of the weights, we want to find the three highest of the weights. We'll use a sorting algorithm called a selection sort, which can be expressed in English as follows:
1. Repeat the following steps three times, once through for each weight that we want to select.
2. Search through the list (i.e., the Weight Vec), keeping track of the highest weight seen so far in the list and the index of that highest weight.
3. When we get to the end of the list, store the highest weight we've found in another list (the "output list", which in this case is the Vec SortedWeight).
4. Finally, set the highest weight we've found in the original list to 0, so we won't select it as the highest value again on the next pass through the list.
Let's take a look at the portion of our C++ program that implements this sort, in Figure 4.8.
FIGURE 4.8. Sorting the weights (from code\vect1.cpp)
for (i = 0; i < 3; i ++)
{
HighestWeight = 0;
for (k = 0; k < 5; k ++)
{
if (Weight[k] > HighestWeight)
{
HighestWeight = Weight[k];
HighestIndex = k;
}
}
SortedWeight[i] = HighestWeight;
Weight[HighestIndex] = 0;
}
Susan had some interesting comments and questions on this algorithm. Let's take a look at our discussion of the use of the variable i:
Susan: Now I understand why you used the example of i = i + 1; in Chapter 3; before, it didn't make sense why you would do that silly thing. Anyway, now let me get this straight. To say that, in the context of this exercise, means you can keep adding 1 to the value of i? I am finding it hard to see where this works for the number 7, say, or anything above 5 for that matter. So, it just means you can have 4 +1 or + another 1, and so on? See where I am having trouble?
Steve: Remember, a short variable such as i is just a name for a 2-byte area of RAM, which can hold any value between - 32768 and +32767. Therefore, the statement i ++; means that we want to recalculate the contents of that area of RAM by adding 1 to its former contents.
Susan: No, that is not the answer to my question. Yes, I know all that <G>. What I am saying is this: I assume that i ++; is the expression that handles any value over 4, right? Then let's say that you have pumpkins that weigh 1, 2, 3, 4, and 5 pounds consecutively. No problem, but what if the next pumpkin was not 6 but say 7 pounds? If at that point, the highest value for i was only 5 and you could only add 1 to it, how does that work? It just doesn't yet have the base of 6 to add 1 to. Now do you understand what I am saying?
Steve: I think I see the problem you're having now. We're using the variable i to indicate which weight we're talking about, not the weight itself. In other words, the first weight is Weight[0], the second is Weight[1], the third is Weight[2], the fourth is Weight[3], and the fifth is Weight[4]. The actual values of the weights are whatever the user of the program types in. For example, if the user types in 3 for the first weight, 9 for the second one, 6 for the third, 12 for the fourth, and 1 for the fifth, then the Vec will look like Figure 4.9.
The value of i has to increase by only one each time because it indicates which element of the Vec Weight is to store the current value being typed in by the user. Does this clear up your confusion?
Susan: I think so. Then it can have any whole number value 0 or higher (well, up to 32767); adding the 1 means you are permitting the addition of at least 1 to any existing value, thereby allowing it to increase. Is that it?
FIGURE 4.9. Elements vs. values
Element Value
Weight[0]
3
Weight[1]
9
Weight[2]
6
Weight[3]
12
Weight[4]
1

Steve: No, I'm not permitting an addition; I'm performing it. Let's suppose i is 0. In that case, Weight[i] means Weight[0], or the first element of the Weight Vec. When I add 1 to i, i becomes 1. Therefore, Weight[i] now means Weight[1]. The next execution of i ++; sets i to 2; therefore, Weight[i] now means Weight[2]. Any time i is used in an expression, for example, Weight[i], i + j, or i + 1, you can replace the i by whatever the current value of i is. The only place where you can't replace a variable such as i by its current value is when it is being modified, as in i ++ or the i in i = j + 1. In those cases, i means the address where the value of the variable i is stored.
Susan: OK, then i is not the number of the value typed in by the user; it is the location of an element in the Weight Vec, and that is why it can increase by 1, because of the i ++?
Steve: Correct, except that I would say "that is why it does increase by 1 ". This may just be terminology.
Susan: But in this case it can increase no more than 4 because of the i < 5 thing?
Steve: Correct.
Susan: But it has to start with a 0 because of the i = 0 thing?
Steve: Correct.
Susan: So then cin >> Weight [i] means that the number the user is typing has to go into one of those locations but the only word that says what that location could be is Weight; it puts no limitations on the location in that Weight Vec other than when you defined the index variable as short i;. This means the index cannot be more than 32767.
Steve: Correct. The current value of i is what determines which element of Weight the user's input goes into.
Susan: I think I was not understanding this because I kept thinking that i was what the user typed in and we were defining its limitations. Instead we are telling it where to go.
Steve: Correct.
Having beaten that topic into the ground, let's look at the correspondence between the English description of the algorithm and the code:
1. Repeat the following steps once through for each prize:
for (i = 0; i < 3; i ++)
During this process the variable i is the index into the SortedWeight Vec where we're going to store the weight for the current prize we're working on. While we're looking for the highest weight, i is 0; for the second-highest weight, i is 1; finally, when we're getting ready to award a third prize, i will be 2.
2. Initialize the variable that we will use to keep track of the highest weight for this pass through the data:
HighestWeight = 0;
3. Step through the input list:
for (k = 0; k < 5; k ++)
4. For each element of the list Weight, we check whether that element (Weight[k]) is greater than the highest weight seen so far in the list (HighestWeight). If that is the case, then we reset HighestWeight to the value of the current element (Weight[k]) and the index of the highest weight so far (HighestIndex) to the index of the current element (k):
if (Weight[k] > HighestWeight)
{
HighestWeight = Weight[k];
HighestIndex = k;
}
5. When we get to the end of the input list, HighestWeight is the highest weight in the list, and HighestIndex is the index of that element of the list that had the highest weight. Therefore, we can copy the highest weight to the current element of another list (the "output list"). As mentioned earlier, i is the index of the current element in the output list. Its value is the number of times we have been through the outer loop before; that is, the highest weight, which we will identify first, goes in position 0 of the output list, the next highest in position 1, and so on:
SortedWeight[i] = HighestWeight;
6. Finally, set the highest weight in the input list to 0, so we won't select it as the highest value again on the next pass through the list.
Weight[HighestIndex] = 0;
This statement is the reason that we have to keep track of the "highest index"; that is, the index of the highest weight. Otherwise, we wouldn't know which element of the original Weight Vec we've used and therefore wouldn't be able to set it to 0 to prevent its being used again.
Here's Susan's rendition of this algorithm:
Susan: OK, let me repeat this back to you in English. The result of this program is that after scanning the list of user input weights the weights are put in another list, which is an ordering list, named k. The program starts by finding the highest weight in the input list. It then takes it out, puts it in k, and replaces that value it took out with a 0, so it won't be picked up again. Then it comes back to find the next highest weight and does the same thing all over again until nothing is left to order. Actually this is more than that one statement. But is this what you mean? That one statement is responsible for finding the highest weight in the user input list and placing it in k. Is this right?
Steve: It's almost exactly right. The only error is that the list that the weights are moved to is the SortedWeight Vec, rather than k. The variable k is used to keep track of which is the next entry to be put into the SortedWeight Vec.
Susan: OK. There was also something else I didn't understand when tracing through the program. I did see at one point during the execution of the program that i=5. Well, first I didn't know how that could be because i is supposed to be < 5, but then I remembered that i ++ expression in the for loop, so I wondered if that is how this happened. I forgot where I was at that point, but I think it was after I had just completed entering 5 values and i was incrementing with each value. But see, it really should not have been more than 4 because if you start at 0 then that is where it should have ended up.
Steve: The reason that i gets to be 5 after the end of the loop is that at the end of each pass through the loop, the modification expression (i ++) is executed before the continuation expression (i < 5). So, at the end of the fifth pass through the loop, i is incremented to 5 and then tested to see if it is still less than 5. Since it isn't, the loop terminates at that point.
Susan: I get that. But I still have a question about the line if (Weight[k] > HighestWeight). Well, the first time through, this will definitely be true because we've initialized HighestWeight to 0, since any weight would be greater than 0. Is that right?
Steve: Yes. Every time through the outer (i) loop, as we get to the top of the inner loop, the 0 that we've just put in HighestWeight should be replaced by the first element of Weight; that is, Weight[0], except of course if we've already replaced Weight[0] by 0 during a previous pass. It would also be possible to initialize HighestWeight to Weight[0] and then start the loop by setting k to 1 rather than 0. That would cause the inner (k) loop to be executed only four times per outer loop execution, rather than five, and therefore would be more efficient.
Susan: Then HighestIndex=k; is the statement that sets the placement of the highest number to its position in the Vec?
Steve: Right.
Susan: Then I thought about this. It seems that the highest weight is set first, then the sorting takes place so it makes four passes (actually five) to stop the loop.
Steve: The sorting is the whole process. Each pass through the outer loop locates one more element to be put into the SortedWeight Vec.
Susan: Then the statement Weight[HighestIndex] = 0; comes into play, replacing the highest number selected on that pass with 0.
Steve: Correct.
Susan: Oh, when k is going through the sorting process why does i increment though each pass? It seems that k should be incrementing.
Steve: Actually, k increments on each pass through the inner loop, or 15 times in all. It's reset to 0 on each pass through the outer loop, so that we look at all of the elements again when we're trying to find the highest remaining weight. On the other hand, i is incremented on each pass through the outer loop or three times in all, once for each "highest" weight that gets put into the SortedWeight Vec.
Susan: OK, I get the idea with i, but what is the deal with k? I mean I see it was defined as a short, but what is it supposed to represent, and how did you know in advance that you were going to need it?
Steve: It represents the position in the original list, as indicated in the description of the algorithm.
Susan: I still don't understand where k fits into this picture. What does it do?
Steve: It's the index in the "inner loop", which steps through the elements looking for the highest one that's still there. We get one "highest" value every time through the "outer loop", so we have to execute that outer loop three times. Each time through the outer loop, we execute the inner loop five times, once for each entry in the input list.
Susan: Too many terms again. Which is the "outer loop" and which is the "inner loop"?
Steve: The outer loop executes once for each "highest" weight we're locating. Each time we find one, we set it to 0 (at the end of the loop) so that it won't be found again the next time through.
Susan: OK, but now I am confused with the statement: if (Weight[k] > HighestWeight). This is what gets me: if I understand this right (and obviously I don't) how could Weight[k] ever be greater than HighestWeight, since every possible value of k represents one of the elements in the Weight Vec, and HighestWeight is the highest weight in that Vec? For this reason I am having a hard time understanding the code for step 2, but not the concept.
Steve: The value of HighestWeight at any time is equal to the highest weight that has been seen so far. At the beginning of each execution of the outer loop, HighestWeight is set to 0. Then, every time that the current weight (Weight[k]) is higher than the current value of HighestWeight, we reset HighestWeight to the value of the current weight.
Susan: I still don't understand this statement. Help.
Steve: Remember that HighestWeight is reset to 0 on each pass through the outer loop. Thus, this if statement checks whether the kth element of the Weight Vec exceeds the highest weight we've seen before in this pass. If that is true, obviously our "highest" weight isn't really the highest, so we have to reset the highest weight to the value of the kth element; if the kth element isn't the true highest weight, at least it's higher than what we had before. Since we replace the "highest" weight value with the kth value any time that the kth value is higher than the current "highest" weight, at the end of the inner loop, the number remaining in HighestWeight will be the true highest weight left in Weight. This is essentially the same algorithm as we used to find the highest weight in the original version of this program, but now we apply it several times to find successively lower "highest" weights.
Susan: OK, I understand now, i increments to show how many times it has looped through to find the highest number. You are doing a loop within a loop, really, it is not side by side is it?
Steve: Correct.
Susan: So, when you first enter your numbers they are placed in an index called i, then they are going to be cycled through again, placing them in a corresponding index named k, looking for the top three numbers. To start out through each pass, you first set the highest weight to the first weight since you have preset the highest weight to 0. But, to find the top three numbers you have to look at each place or element in the index. At the end of each loop you sort out the highest number and then set that removed element to 0 so it won't be selected again. You do this whole thing three times.
Steve: That's right, except for some terminology: where you say "an index called i", you should say "a Vec called Weight", and where you say "an index called k", you should say "a Vec called SortedWeight". The variables i and k are used to step through the Vecs, but they are not the Vecs themselves.
Susan: OK, then the index variables just are the working representation of what is going on in those Vecs. But are not the numbers "assigned" an index? Let's see; if you lined up your five numbers you could refer to each number as to its placement in a Vec. Could you then have the column of weights in the middle of the two indexes of i and k to each side?
Steve: If I understand your suggestion, it wouldn't work, because k and i vary at different speeds. During the first pass of the outer loop, i is 0, while k varies from 0 to 5; on the second pass of the outer loop, i is 1, while k varies from 0 to 5 again, and the same for the third pass of the outer loop. The value of i is used to refer to an individual element of the SortedWeight Vec, the one that will receive the next "highest" weight we locate. The value of k is used to refer to an individual element of the Weight Vec, the one we're examining to see if it's higher than the current HighestWeight.
Susan: This is what gets me, how do you know in advance that you are going to have to set HighestIndex to k? I see it in the program as it happens and I understand it then, but how would you know that the program wouldn't run without doing that? Trial and error? Experience? Rule books? <G>
Steve: Logic. Let's look at the problem again. The sorting algorithm that we're using here is called selection sort, because each time through the outer loop it selects one element out of the input Vec and moves it to the output Vec. To prevent our selecting the same weight (i.e., the highest one in the original input) every time through the outer loop, we have to clear each weight to 0 as we select it. But, to do that, we have to keep track of which one we selected; that's why we need to save HighestIndex.

The Selection Sort in More Detail

To help clear up any remaining questions Susan (and you) might have about this algorithm, let's go back and look at its steps more closely (they start on page 182). Steps 1 through 3 should be fairly self-explanatory, once you're familiar with the syntax of the for statement; they start the outer loop, initialize the highest weight value for the current loop, and start the inner loop.

Step 4 is quite similar to the process we went through to find the highest weight in our previous two programs; however, the reason for the HighestIndex variable may still need some more clarification. We need to keep track of which element of the original Vec (i.e., Weight) we have decided is the highest so far, so that this element won't be selected as the highest weight on every pass through the Weight Vec. To prevent this error, step 4 sets each "highest" weight to a value that won't be selected on a succeeding pass. Since we know there should be no 0 weights in the Weight Vec, we can set each selected element to 0 after it has been selected, to prevent its reselection. Figure 4.10 shows a picture of the situation before the first pass through the data, with ??? in SortedWeight to indicate that those locations contain unknown data, as they haven't been initialized yet.

In Figure 4.10, the highest value is 11 in Weight[2]. After we've located it and copied its value to SortedWeight[0], we set Weight[2] to 0, yielding the situation in Figure 4.11.

FIGURE 4.10. Initial situation
Index Contents of Weight Contents of SortedWeight
0 5 ???
1 2 ???
2 11 ???
3 3
4 7

FIGURE 4.11. After the first pass
Index Contents of Weight Contents of SortedWeight
0 5 11
1 2 ???
2 0 ???
3 3
4 7

Now we're ready for the second pass. This time, the highest value is the 7 in Weight[4]. After we copy the 7 to SortedWeight[1], we set Weight[4] to 0, leaving the situation in Figure 4.12.
FIGURE 4.12. After the second pass
Index Contents of Weight Contents of SortedWeight
0 5 11
1 2 7
2 0 ???
3 3
4 0

On the third and final pass, we locate the 5 in Weight[0], copy it to SortedWeight[2], and set Weight[0] to 0. As you can see in Figure 4.13, SortedWeight now has the results we were looking for: the top three weights, in descending order.
FIGURE 4.13. Final situation
Index Contents of Weight Contents of SortedWeight
0 0 11
1 2 7
2 0 5
3 3
4 0

Making Assumptions Can Be Dangerous

That accounts for all the steps in the sorting algorithm. However, our implementation of the algorithm has a weak spot that we should fix. If you want to try to find it yourself, look at the code and explanation again before going on. Ready?

The key word in the explanation is "should" in the following sentence: "Since we know there should be no 0 weights in the Weight Vec, we can set each selected element to 0 after it has been selected, to prevent its reselection." How do we know that there are no 0 weights? We don't, unless we screen for them when we accept input. In the first pumpkin-weighing program, we stopped the input when we got a 0, but in the programs in this chapter we ask for a set number of weights. If one of them is 0, the program will continue along happily.14 Before we change the program, though, let's try to figure out what would happen if the user types in a 0 for every weight.

You can try this scenario out yourself. First, you have to compile the program vect1 by the usual method, then run the program, either normally or under the debugger. When it asks for weights, enter a 0 for each of the five weights.

In case you're reading this away from your computer, Figure 4.14 shows what might happen (although the element number in the message may not be the same)15.

FIGURE 4.14. A possible error message
You have tried to use element 68 of a vector which has only 5 elements.
Why doesn't the program work in this case? Because we have an uninitialized variable; that is, one that has never been set to a valid value. In this case, it's HighestIndex. Let's look at the sorting code one more time, in Figure 4.15.16
FIGURE 4.15. Sorting the weights, again (from code\vect1.cc)
for (i = 0; i < 3; i ++)
{
HighestWeight = 0;
for (k = 0; k < 5; k ++)
{
if (Weight[k] > HighestWeight)
{
HighestWeight = Weight[k];
HighestIndex = k;
}
}
SortedWeight[i] = HighestWeight;
Weight[HighestIndex] = 0;
}
It's clear that HighestWeight is initialized (i.e., given a valid value) before it is ever used; the statement HighestWeight = 0; is the first statement in the block controlled by the outer for loop. However, the same is not true of HighestIndex. Whenever the condition in the if statement is true, both HighestWeight and HighestIndex will indeed be set to legitimate values: HighestWeight will be the highest weight seen so far on this pass, and HighestIndex will be the index of that weight in the Weight Vec. However, what happens if the condition in the if statement never becomes true? In that case, HighestIndex will have whatever random value it started out with at the beginning of the program; it's very unlikely that such a value will be correct or even refer to an actual element in the Weight Vec.

Why We're Using Vec Rather Than vector

By the way, this illustrates the reason why we are using the Vec type rather than the standard library type vector; that error message comes from my code in the implementation of Vec, not from vector. If you try to refer to a nonexistent element of a vector, the results will be unpredictable. This is not a defect in the design of the vector type, by the way; because of the emphasis on speed in C++, that particular error check was left to be added by the programmer if he or she wants to add it. If the program is designed in such a way that no illegal reference can ever be made to an element of a vector, there is no necessity to slow it down by doing the error check all the time.17

However, even as a professional programmer of many years experience, I prefer to have the assurance that if I make a mistake in using a vector, or similar data type, I will be notified of it rather than having something go wrong that I don't know about, so I usually accept the performance penalty of such checking. Of course, this is even more important when you are just getting started, as you have much less experience to draw on to try to figure out why your program isn't working.

Uninitialized Variables

As you can imagine, Susan was interested in the notion of uninitialized variables. Here's the discussion we had on this topic:
Susan: You say that HighestIndex isn't initialized properly. But what about when you set k equal to 0 and then HighestIndex is set equal to k? Is that not initialized?
Steve: The problem is that the statement HighestIndex = k; is executed only when Weight[k] is greater than HighestWeight. If that never occurs, then HighestIndex is left in some random state.
Susan: OK, then why didn't you say so in the first place? I understand that. However, I still don't understand why the program would fail if all the weights the user typed in were 0. To me it would just have a very boring outcome.
Steve: That's the case in which HighestIndex would never be initialized; therefore, it would contain random garbage and would cause the program to try to display an element at some random index value.
Susan: I traced through the program again briefly tonight and that reminds me to ask you why you put the highest weight value to 1596 and the second-highest weight value to 1614?
Steve: I didn't. Those just happened to be the values that those memory locations had in them before they were initialized.
Susan: I was totally confused right from the beginning when I saw that. But did you do that to show that those were just the first two weights, and that they have not been, how would you say this, "ordered" yet? I don't know the language for this in computerese, but I am sure you know what I am saying.
Steve: Not exactly; those variables haven't been initialized at that point, so whatever values they might contain would be garbage.
Susan: So at that point they were just the first and second weights, or did you just arbitrarily put those weights in there to get it started? Anyway, that was baffling when I saw that.
Steve: Before you set a variable to a particular value, it will have some kind of random junk in it. That's what you're seeing at the beginning of the program, before the variables have been initialized.
Susan: OK, I am glad this happened, I can see this better, but whose computer did that? Was it yours or mine? I mean did you run it first and your computer did it, or was it my computer that came up with those values?
Steve: It's your computer. The program starts out with "undefined" values for all of the uninitialized variables. What this means in practice is that their values are whatever happened to be left around in memory at those addresses. This is quite likely to be different on your machine from what it is on mine or even on yours at a different time.
Susan: So something has to be there; and if you don't tell it what it is, the old contents of memory just comes up?
Steve: Right.
Susan: If it had worked out that the higher number had been in the first place, then I would have just assumed that you put that there as a starting point. I am really glad that this happened but I was not too happy about it when I was trying to figure it out.
Steve: See, it's all for your own good.
Susan: If that were the case, I would think it nearly impossible that we have the same values at any given address. How could they ever be remotely the same?
Steve: It's very unlikely that they would, unless the address was one that was used by very basic software such as DOS or Windows, which might be the same on our computers.
Susan: Anyway, then you must have known I was going to get "garbage" in those two variables, didn't you? Why didn't you advise me at least about that? Do you know how confusing it was to see that first thing?
Steve: Yes, but it's better for you to figure it out yourself. Now you really know it, whereas if I had told you about it in advance, you would have relied on my knowledge rather than developing your own.
I hope that has cleared up the confusion about the effect of an uninitialized variable in this example. But you may still be wondering why we have to initialize variables ourselves; surely they must have some value at any given time. Let's listen in on the conversation that Susan and I had about this point:
Susan: So, each bit in RAM is capable of being turned on or off by a 1 or a 0? Which one is on and which one is off? Or does that matter? How does this work electronically? I mean how does the presence of a 0 or a 1 throw the RAM into a different electronic state?
Steve: To be more exact, each "switch" is capable of existing in either the "on" or "off" state. The assignment of states to 1s and 0s is our notion, which doesn't affect the fact that there are exactly two distinct states the switch can assume, just like a light switch (without a dimmer). We say that if the switch is off, it's storing a 0, and if it's on, it's storing a 1.
Susan: What is the "normal state" of RAM: on or off?
Steve: It's indeterminate. That's one reason why we need to explicitly set our variables to a known state before we use them.
Susan: That didn't make sense to me originally, but I woke up this morning and the first thing that came to my mind was the light switch analogy. I think I know what you meant by indeterminate.
If we consider the light switch as imposed with our parental and financial values, it is tempting to view the "normal state" of a light switch as off. Hey, does the light switch really care? It could sit there for 100 years in the on position as easily as in the off position. Who is to say what is normal? The only consequence is that the light bulb will have been long burned out. So it doesn't matter, it really doesn't have a normal state, unless people decide that there is one.
Steve: What you've said is correct. The switch doesn't care whether it's on or off. In that sense, the "normal" position doesn't really have a definition other than one we give it.
However, what I meant by indeterminate is slightly different. When power is applied to the RAM, each bit (or to be more precise, a switch that represents that bit) could just as easily start out on as off. It's actually either one or the other, but which one is pretty much random so we have to set it to something before we know its value.
Susan: Oh, you broke my heart, when I thought I had it all figured out! Well, I guess it was OK, at least as far as the light switch was concerned, but then RAM and a light switch are not created equal. So RAM is pretty easy to please, I guess...
After that bit of comic relief, let's get back to the analysis of this program. It should be fairly obvious that if the user types in even one weight greater than 0, the if statement will be true when that weight is encountered, so the program will work. However, if the user typed in all 0 weights, the program would fail, as we saw before, because the condition in the if statement would never become true. To prevent this from causing program failure, all we have to do is to add one more line, the one in bold in Figure 4.16.
FIGURE 4.16. Sorting the weights, with correct initialization (from code\vect2.cpp)
for (i = 0; i < 3; i ++)
{
HighestWeight = 0;
HighestIndex = 0;
for (k = 0; k < 5; k ++)
{
if (Weight[k] > HighestWeight)
{
HighestWeight = Weight[k];
HighestIndex = k;
}
}
SortedWeight[i] = HighestWeight;
Weight[HighestIndex] = 0;
}
Now we can be sure that HighestIndex always has a value that corresponds to some element of the Weight Vec, so we won't see the program fail as the previous one would.

You can try this scenario out yourself. First you have to compile the program vect2 by the usual method, then run the program, either normally or under the debugger. When it asks for weights, enter a 0 for each of the five weights.

After you've entered 5 weights, the program will start the sorting process. This time, entering five 0 weights will produce the expected result: the top three weights will all be 0.

By the way, it's also possible to initialize a variable at the same time as you define it. For example, the statement short i = 12; defines a short variable called i and sets it to the value 12 at the same time. This is generally a good practice to follow when possible; if you initialize the variable when you define it, you don't have to remember to write a separate statement to do the initialization.

4.6. Program Failure

We should pay some more attention to the notion of program failure, as it's very important. The first question, of course, is what it means to say that a program "fails". A valid answer is that it doesn't work correctly, but that isn't very specific.

As you can imagine, this notion was the topic of some discussion with Susan:

Susan: What do you mean by a program failing? I know it means it won't work, but what happens? Do you just get error messages, and it won't do anything? Or is it like the message that you have on page 192?
Steve: In general, a program "failing" means that it does something unexpected and erroneous. Because I have put some safety features into the implementation of Vec, you'll get an error message if you misuse a Vec by referring to a nonexistent element.
In general, a program failure may or may not produce an error message. In the specific case that we've just seen, we'll probably get an error message while trying to access a nonexistent element of the Weight Vec. However, it's entirely possible for a program to just "hang" (run endlessly), "crash" your system, produce an obviously ridiculous answer, or worst of all, provide a seemingly correct but actually erroneous result.

The causes of program failures are legion. A few of the possibilities are these:

1. Problems isolated to our code
a. The original problem could have been stated incorrectly.
b. The algorithm(s) we're using could have been inappropriate for the problem.
c. The algorithm(s) might have been implemented incorrectly.
d. An input value might be outside the expected range.

 

And so on...

2. Problems interacting with other programs
a. We might be misusing a function supplied by the standard library, like the << operator.
b. The documentation for a standard library function might be incorrect or incomplete. This is especially common in "guru"-oriented operating systems, where the users are supposed to know everything.
c. A standard library function might be unreliable. This is more common than it should be.
d. The compiler might be generating the wrong instructions. I've seen this on a few rare occasions.
e. Another program in the system might be interfering with our program. This is quite common in some popular operating environments that allow several programs to be executing concurrently. 

And so on...

With simple programs like the ones we're writing here, errors such as the ones listed under problems with our code are more likely as we have relatively little interaction with the rest of the system. As we start to use more sophisticated mechanisms in C++, we're more likely to run into instances of interaction problems.

Why We Need to Initialize Variables Explicitly

After that excursion into the sources of program failure, let's get back to our question about initializing variables. Why do we have to worry about this at all? It would seem perfectly reasonable for the compiler to make sure that our variables were always initialized to some reasonable value; in the case of numeric variables such as a short, 0 would be a good choice. Surely Bjarne Stroustrup, the designer of C++, didn't overlook this.

No, he didn't; he made a conscious decision not to provide this facility. It's not due to cruelty or unconcern with the needs of programmers. On the contrary, he stated in the Preface to the first Edition of The C++ Programming Language that "C++ is a general-purpose programming language designed to make programming more enjoyable for the serious programmer".18 To allow C++ to replace C completely, he could not add features that would penalize efficiency for programmers who do not use these features. By the same reasoning that prevented the inclusion of index error checking in the vector data type, Bjarne decided not to add initialization as a built-in function of the language because it would make programs larger and slower if the programmer had already taken care of initializing all variables as needed. This may not be obvious, but we'll see in a later section why it is so.

Here's Susan's reaction to these points about C++:

Susan: What is run-time efficiency?
Steve: How long it takes to run the program and how much memory it uses.
Susan: So are you saying that C++ is totally different from C? That one is not based on the other?
Steve: No, C++ is a descendant of C. However, C++ provides much more flexibility to programmers than C.
Susan: Now, about what Bjarne said back in 1986: Who enjoys this, and if C++ is intended for a serious programmer, why am I reading this book? What is a serious programmer? Would you not think a serious programmer should have at least taken Computer Programming 101?
Steve: This book has been used as a textbook for such a course. Anyway, if you want to learn how to program, you have to start somewhere, and it might as well be with the intention of being a serious programmer.

Preventing Improper Data Input

In the meantime, there's something else we should do if we want the program to work as it should. As the old saying "Garbage in, garbage out" suggests, by far the best solution to handling spurious input values is to prevent them from being entered in the first place. What we want to do is to check each input value and warn the user if it's invalid. Figure 4.17 illustrates a new input routine that looks like it might do the trick.
FIGURE 4.17. Garbage prevention, first attempt (from code\vect2a.cc)
for (i = 0; i < 5; i ++)
{
cout << "Please type in weight #" << i+1 << ": ";
cin >> Weight[i];
if (Weight[i] <= 0)
{
cout << "I'm sorry, " << Weight[i] << " is not a valid weight.";
cout << endl;
}
}
As usual, you can try this version out yourself. First, you have to compile the program vect2a by the usual method, then run the program, either normally or under the debugger. When you are asked for a weight, type one in and hit Enter just as when executing normally. After you've entered 5 weights, the program will start the sorting process. When finished, it will display the top three weights of the five that were entered. Most of this should be familiar; the only line that has a new construct in it is the if statement. The condition <= means "less than or equal to", which is reasonably intuitive.

Unfortunately, this program won't really solve the problem of bad input. The problem is what happens after the error message is displayed; namely, the loop continues at the top with the next weight, and we never correct the erroneous input. Susan didn't have much trouble figuring out exactly what that last statement meant:

Susan: When you say that "we never correct the erroneous input", does that mean that it is added to the list and not ignored?
Steve: Right.
To fix this problem completely, we need to use an approach similar to the one shown in the final version of this program (Figure 4.18). To try this version out yourself, you have to compile the program vect2a by the usual method, then run the program, either normally or under the debugger. When you are asked for a weight, type one in and hit Enter. After you've entered five weights, the program will start the sorting process, and will display the results when finished.
FIGURE 4.18. Finding the top three weights using Vecs (code\vect3.cc)
#include <iostream>
#include "Vec.h"
using namespace std;

int main()
{
Vec<short> Weight(5);
Vec<short> SortedWeight(3);
short HighestWeight;
short HighestIndex;
short i;
short k;

cout << "I'm going to ask you to type in five weights, in pounds." << endl;

for (i = 0; i < 5; )
{
cout << "Please type in weight #" << i+1 << ": ";
cin >> Weight[i];
if (Weight[i] <= 0)
{
cout << "I'm sorry, " << Weight[i] << " is not a valid weight.";
cout << endl;
}
else
i ++;
}

for (i = 0; i < 3; i ++)
{
HighestIndex = 0;
HighestWeight = 0;
for (k = 0; k < 5; k ++)
{
if (Weight[k] > HighestWeight)
{
HighestWeight = Weight[k];
HighestIndex = k;
}
}
SortedWeight[i] = HighestWeight;
Weight[HighestIndex] = 0;
}

cout << "The highest weight was: " << SortedWeight[0] << endl;
cout << "The second highest weight was: " << SortedWeight[1] << endl;
cout << "The third highest weight was: " << SortedWeight[2] << endl;

return 0;
}

Now let's look at the changes that we've made to the program from the last revision. The first change is that the for statement that controls the block where the input is accepted from the user has only two sections rather than three. As you may recall, the first section specifies the initial condition of the index variable; in this case, we're starting i out at 0, as is usual in C and C++. The second section indicates when we should continue executing the loop; here, we should continue as long as i is less than 5. But the third section, which usually indicates what to do to the index variable, is missing. The reason for this is that we're going to adjust the index variable manually in the loop, depending on what the user enters.

In this case, if the user enters an invalid value (i.e., less than or equal to 0), we display an error message and leave i as it was, so that the next time through the loop the value will go into the same element in the Weight Vec. When the user enters a valid value, the else clause increments i so that the next value will go into the next element in the Vec. This fixes the error in our previous version that left incorrect entries in the Vec.

After finishing with the pumpkin program, let's tune in on a discussion I had with Susan on how to create an algorithm in the first place.

Susan: Do they make instruction sheets with directions of paths to follow? How do you identify problems? I mean, don't you encounter pretty much the same types of problems frequently in programming and can they not be identified some way so that if you knew a certain problem could be categorized as a Type C problem, let's say, you would approach it with a Type C methodology to the solution? Does that make sense? Probably not.
Steve: It does make sense, and in fact that's what the standard library is designed to do for very commonly used algorithms and data structures. At another level, another book of mine, Optimizing C++ (ISBN 0-13-977430-0), is designed to provide something like you're suggesting for more specific, but still relatively common, problems at the algorithmic level. There's also a book called Design Patterns (ISBN 0-201-63361-2) that tries to provide tested solutions to common design problems, at a much more abstract level, and a new book called Modern C++ Design (ISBN 0-201-70431-5) that shows how to use advanced features of C++ to implement a number of the high-level design ideas from the Design Patterns book (among a lot of other very useful material).
Now that we have beaten the pumpkin weighing example to a pulp19, let's review the mass of information to which I've subjected you so far in this chapter.

4.7. Review

We started out by extending our pumpkin weighing program to tell us the highest two weights rather than just the highest one. During this exercise, we learned the use of the else clause of an if statement. We also saw that making even an apparently simple change to a working program can introduce an error; in this case we were copying the highest weight to the next highest weight only when a new high weight was detected. This would produce an incorrect result if a value higher than the previous second highest but lower than the current highest weight were entered.

Next we extended the program again, this time to handle any number of prizes to be given to the highest weight, second-highest weight, third-highest weights, and so on. This inspired a complete reorganization of the program; the new version used the selection sort algorithm to produce a list of as many of the highest weights as we need, in descending order. To do this, we had to use a Vec, or set of values with a common name, to store all of the weights as they were read in. When they had all been entered, we searched through them three times, once to find each of the top three elements. A Vec, just like a regular variable, has a name. However, unlike a regular variable, a Vec does not have a single value, but rather consists of a number of elements, each of which has a separate value. An element is referred to by a number, called an index, rather than by a unique name; each element has a different index. The lowest index is 0, and the highest index is 1 less than the number of elements in the Vec; for example, with a 10 element Vec, the legal indexes are 0 through 9. The ability to refer to an element by its index allows us to vary the element we are referring to in a statement by varying the index; we put this facility to good use in our implementation of the selection sort, which we'll review shortly.

We then added the for statement to our repertoire of loop control facilities. This statement provides more precise control than the while statement. Using for, we can specify a starting expression, a continuation expression, and a modification expression. The starting expression sets up the initial conditions for the loop. Before each possible execution of the controlled block, the continuation expression is checked and if it is true, the controlled block will be executed; otherwise, the for loop will terminate. Finally, the modification expression is executed after each execution of the controlled block. Most commonly, the starting expression sets the initial value of a variable, the continuation expression tests whether that variable is still in the range we are interested in, and the modification expression changes the value of the variable. For example, in the for statement

for (i = 0; i < 5; i ++)
the starting expression is i = 0, the continuation expression is i < 5, and the modification expression is i ++. Therefore, the block controlled by the for statement will be executed first with the variable i set to 0; at the end of the block, the variable i will be incremented by 1, and the loop will continue if i is still less than 5.

Then we used the for statement and a couple of Vecs to implement a selection sort. This algorithm goes through an "input list" of n elements once for each desired "result element". In our case, we want the top three elements of the sorted list, so the input list has to be scanned three times. On each time through, the algorithm picks the highest value remaining in the list and adds that to the end of a new "output list". Then it removes the found value from the input list. At the end of this process, the output list has all of the desired values from the input list, in descending order of size. When going over the program, we found a weak spot in the first version: If all the weights the user typed were less than or equal to 0, the program would fail because one of the variables in the program would never be initialized; that is, set to a known value.

This led to a discussion of why variable initialization isn't done automatically in C++. Adding this feature to programs would make them slower and larger than C programs doing the same task, and C++ was intended to replace C completely. If C++ programs were significantly less efficient than equivalent C programs, this would not be possible, so Bjarne Stroustrup omitted this feature.

While it's important to insure that our programs work correctly even when given unreasonable input, it's even better to prevent this situation from occurring in the first place. So, the next improvement we made to our pumpkin weighing program was to tell the user when an invalid value had been entered and ask for a valid value in its place. This involved a for loop without a modification expression, since we wanted to increment the index variable i to point to the next element of the Vec only when the user typed in a valid entry; if an illegal value was typed in, we requested a legal value for the same element of the Vec.

4.8. Exercises

So that you can test your understanding of this material, here are some exercises.
1. If the program in Figure 4.19 is run, what will be displayed?
FIGURE 4.19. Exercise 1 (code\morbas00.cpp)
#include <iostream>
#include "Vec.h"
using namespace std;

int main()
{
Vec<short> x(5);
short Result;
short i;

for (i = 0; i < 5; i ++)
{
x[i] = 2 * i;
}

for (i = 0; i < 5; i ++)
{
Result = Result + x[i];
}

cout << Result << endl;

return 0;
}
2. If the program in Figure 4.20 is run, what will be displayed?
FIGURE 4.20. Exercise 2 (code\morbas01.cpp)
#include <iostream>
#include "Vec.h"
using namespace std;

int main()
{
Vec<short> x(4);
short Result;
short i;

x[0] = 3;
for (i = 1; i < 4; i ++)
x[i] = x[i-1] * 2;

Result = 0;
for (i = 0; i < 4; i ++)
Result = Result + x[i];

cout << Result << endl;

return 0;
}
3. Write a program that asks the user to type in a weight, and display the weight on the screen.
4. Modify the program from exercise 3 to ask the user to type as many weights as desired, stopping as soon as a 0 is entered. Add up all of the weights entered and display the total on the screen at the end of the program.
Answers to exercises can be found at the end of the chapter.

4.9. Conclusion

We've covered a lot of material in this chapter in our quest for better pumpkin weighing, ranging from sorting data into order based on numeric value through the anatomy of Vecs. Next, we'll take up some more of the language features you will need to write any significant C++ programs.

4.10. Answers to Exercises

1. The correct answer is: "Who knows?" If you said "30", you forgot that the loop variable values are from 0 through 4, rather than from 1 through 5. On the other hand, if you said "20", you had the right total of the numbers 0, 2, 4, 6, and 8, but didn't notice that the variable Result was never initialized. Of course, adding anything to an unknown value makes the final value unpredictable. Many current compilers, unfortunately not including the one on the CD in the back of this book, are capable of warning you about such problems. If you're using a compiler that supports such warnings, I recommended that you enable them because that is the easiest way to find such errors, especially in a large program. Unfortunately, a compiler may produce such warnings even when they are not valid, so the final decision is still up to you.
To try this program out, compile morbas00.cpp in the usual manner. Running this program normally isn't likely to give you much information, so you'll probably want to run it under the debugger.
2. The correct answer is 45. In case this isn't obvious, consider the following:
a. The value of x[0] is set to 3.
b. In the first for loop, the value of i starts out at 1.
c. Therefore, the first execution of the assignment statement x[i] = x[i-1] * 2; is equivalent to x[1] = x[0] * 2;. This clearly sets x[1] to 6.
d. The next time through the loop i is 2, so that same assignment statement x[i] = x[i-1] * 2; is equivalent to x[2] = x[1] * 2;. This sets x[2] to 12.
e. Finally, on the last pass through the loop, the value of i is 3, so that assignment statement x[i] = x[i-1] * 2; is equivalent to x[3] = x[2] * 2; This sets x[3] to 24.
f. The second for loop just adds up the values of all the entries in the x Vec; this time, we remembered to initialize the total, Result, to 0, so the total is calculated and displayed correctly.
Running this program normally isn't likely to give you much information, but you might want to run it under control of the debugger.
3. Let's start with Susan's proposed solution to this problem, in Figure 4.21, and the questions that came up during the process.
FIGURE 4.21. Weight requesting program, first try (code\morbas02.cpp)
#include <iostream>
using namespace std;

int main()
{
short weight;

cout << "Please write your weight here. "\n;

cin >> weight

return 0;
}
Susan: Would this work? Right now by just doing this it brought up several things that I have not thought about before.
First, is the # standard for no matter what type of program you are doing?
Steve: The iostream header file is needed if you want to use <<, >>, cin and cout, which most programs do, but not all.
Susan: Ok, but I meant the actual pound sign (#), is that always a part of iostream?
Steve: It's not part of the filename, it's part of the #include command, which tells the compiler that you want it to include the definitions in the iostream file in your program at that point.
Susan: So then this header is declaring that all you are going to be doing is input and output?
Steve: Not exactly. It tells the compiler how to understand input and output via << and >>. Each header tells the compiler how to interpret some type of library functions; iostream is the one for input and output.
Susan: Where is the word iostream derived from? (OK, io, but what about stream?)
Steve: A stream is C++ talk for "a place to get or put characters". A given stream is usually either an istream (input stream) or an ostream (output stream). As these names suggest, you can read from an istream or write to an ostream.
Susan: Second, is the \n really necessary here, or would the program work without it?
Steve: It's optional, however, if you want to use it the \n should be inside the quotes, since it's used to control the appearance of the output. It can't do that if it's not sent to cout. Without the \n, the user would type the answer to the question on the same line as the question. With the \n, the answer would be typed on the next line as the \n would cause the active screen position to move to the next line at the end of the question.
Susan: OK, that is good, since I intended for the weight to be typed on a different line. Now I understand this much better. As far as why I didn't include the \n inside the quotes, I can't tell you other than the time of night I was writing or it was an oversight or a typo. I was following your examples and I am not a stickler for details type person.
Now that that's settled, I have another question: Is "return 0" the same thing as an Enter on the keyboard with nothing left to process?
Steve: Sort of. When you get to that statement in main, it means that you're done with whatever processing you were doing and are returning control to the operating system (the C: prompt).
Susan: How does the program handle the Enter? I don't see where it comes into the programs you have written. It just seems that, at the end of any pause, an Enter would be appropriate. So does the compiler just know by the way the code is written that an Enter will necessarily come next?
Steve: The >> input mechanism lets you type until you hit an Enter (or a space in the case of a string), then takes the result up to that point.
One more point. We never tell the user that we have received the information. I've added that to your example.
Figure 4.22 illustrates the compiler's output for that erroneous program.
FIGURE 4.22. Error messages from the erroneous weight program (code\morbas02.cpp)
Error E2206 MORBAS02.cpp 8: Illegal character '\' (0x5c) in function main()
Error E2379 MORBAS02.cpp 8: Statement missing ; in function main()
Error E2379 MORBAS02.cpp 12: Statement missing ; in function main()

And Figure 4.23 shows the corrected program.
FIGURE 4.23. The corrected weight program (code\morbas03.cpp)
#include <iostream>
using namespace std;

int main()
{
short weight;

cout << "Please write your weight here: ";

cin >> weight;

cout << "I wish I only weighed " << weight << " pounds.";

return 0;
}

4. This was an offshoot of the previous question, which occurred when Susan wondered when the program in Figure 4.23 would terminate. Let's start from that point in the conversation:
Susan: Would this only run once? If so how would you get it to repeat?
Steve: We could use a while loop. Let's suppose that we wanted to add up all the weights that were entered. Then the program might look like Figure 4.24.
FIGURE 4.24. The weight totalling program (code\morbas04.cpp)
#include <iostream>
using namespace std;

int main()
{
short weight;
short total;

cout << "Please type in your weight, typing 0 to end:";
cin >> weight;

total = weight;

while (weight > 0)
{
cout << "Please type in your weight, typing 0 to end:";
cin >> weight;
total = total + weight;
}

cout << "The total is: " << total << endl;
return 0;
}

In case you were wondering, the reason we have to duplicate the statements to read in the weight is that we need an initial value for the variable weight before we start the while loop, so that the condition in the while will be calculated correctly.
By the way, there's another way to write the statement
total = total + weight;
that uses an operator analogous to ++, the increment operator:
total += weight;
This new operator, +=, means "add what's on the right to what's on the left". The motivation for this shortcut, as you might imagine, is the same as that for ++: it requires less typing, is more likely to be correct, and is easier to compile to efficient code. Just like the "increment memory" instruction, many machines have an "add (something) to memory" instruction. It's easier to figure out that such an instruction should be used for an expression like x += y than in the case of the equivalent x = x + y. Let's see what Susan has to say about this notation:
Susan: Now I did find something that was very confusing. You say that += means to "add what's on the right to what's on the left" but your example shows that it is the other way around. Unless this is supposed to be mirror imaged or something, I don't get it.
Steve: No, the example is correct. total += weight; is the same as total = total + weight;, so we're adding the value on the right of the += (i.e., weight) to the variable on the left (i.e., total). Is that clearer now?
Susan: OK, I think I got it now, I guess if it were more like an equation, you would have to subtract total from the other side when you moved it. Why is it that the math recollection that I have instead of helping me just confuses me?
Steve: Because, unfortunately, the = is the same symbol used to mean "is equal to" in mathematics. The = in C++ means something completely different: "set the thing on the left to the value on the right".
Running this program normally isn't likely to give you much information, so you might want to run it under control of the debugger.
1 There is a book called Refactoring: Improving the Design of Existing Code (ISBN 0-201-48567-2) that is very helpful to programmers with some experience who want to improve their ability to modify programs without introducing new errors. Reading this book has changed my attitude toward maintenance from distaste to interest.

2 I realize I'm breaking a cardinal rule of textbooks: Never admit that the solution to a problem is anything but obvious, so the student feels like an idiot if it isn't actually obvious. In reality, even a simple program is difficult to get right, and indicating the sort of thought processes that go into analyzing a programming problem might help demystify this difficult task.

3 What I'm calling a "regular" variable here is technically known as a scalar variable; that is, one with only one value at any given time.

4 By the way, if you're wondering how to pronounce Weight[i], it's "weight sub i". "Sub" is short for subscript, which is an old term for "index".

5 Note that the #include statement for Vec.h in Figure 4.6 uses "" rather than <> around the file name. The use of "" tells the compiler to search for Vec.h in the current directory first, and then the "normal" places that header files supplied with the compiler are located. This is necessary because Vec.h, in fact, is in the current directory. If we had written #include <Vec.h>, the compiler would look only in the "normal" places, and therefore would not find Vec.h. We will use "" for our header files, and <> for ones supplied with the compiler.


6 Run-time efficiency means the amount of time a program takes to run, as well as how much memory it uses. These issues are very significant when writing a program to be sold to or used by others, as an inefficient program may be unacceptable to the users.

7 For much more on the history of zero, see Charles Seife's Zero: The Biography of a Dangerous Idea (ISBN 0-670-88457-X).

8 I strongly recommend learning how to type (i.e., touch type). I was a professional programmer without typing skills for over 10 years before agreeing to type someone else's book manuscript. At that point, I decided to teach myself to touch-type, so I wrote a Dvorak keyboard driver for my Radio Shack Model III computer and started typing. In about a month I could type faster than with my previous two-finger method and eventually got up to 80+ words per minute on English text. If you've never heard of the Dvorak keyboard, it's the one that has the letters laid out in an efficient manner; the "home row" keys are AOEUIDHTNS rather than the absurd set ASDFGHJKL;. This "new" (1930s) keyboard layout reduces effort and increases speed and accuracy compared to the old QWERTY keyboard, which was invented in the 1880s to prevent people from typing two keys in rapid succession and jamming the typebars together. This problem has been nonexistent since the invention of the Selectric typewriter (which uses a ball rather than typebars) in 1960, but inertia keeps the old layout in use even though it is very inefficient. In any event, since I learned to type, writing documentation has required much less effort. This applies especially to writing articles or books, which would be a painful process otherwise.

9 I also don't use variables called j in the programs in this book, because j looks too much like i and could therefore be confusing.

10 You may sometimes see the term controlled statement used in place of controlled block. Since, as we have already seen, a block can be used anywhere that a single statement can be used, controlled statement and controlled block are actually just two ways of saying nearly the same thing.

11 You don't need a space between the variable name and the ++ operator; however, I think it's easier to read this way.

12 By the way, the name C++ is sort of a pun using this notation; it's supposed to mean "the language following C". In case you're not doubled over with laughter, you're not alone. I guess you had to be there.

13 Why is the value of i at the end of this loop 5 rather than 4? Because at the end of each pass through the loop, the modification expression (i ++) is executed before the continuation expression that determines whether the next execution will take place (i < 5). Thus, at the end of the fifth pass through the loop, i is incremented to 5 and then tested to see if it is still less than 5. Since it isn't, the loop terminates at that point.

14 For that matter, what if someone types in a negative weight, such as - 5? Of course, this doesn't make any sense; but it's a good idea to try to prevent errors rather than assuming that users of a program will always act sensibly.

15 It's even possible that you won't get an error message at all and the program will run correctly. However, if this happens, it just means that you're lucky. As you'll see later, you shouldn't count on that sort of luck.

16 You may have noticed a slight oddity in this code. The block controlled by the for statement consists of exactly one statement; namely, the if that checks for a new HighestWeight value. According to the rules I've provided, that means we don't have to put curly braces ({ }) around it to make it a block. While this is true, long experience has indicated that it's a very good idea to make it a block anyway, as a preventive measure. It's very common to revisit old code to fix bugs or add new functions, and in so doing we might add another statement after the if statement at a later time, intending it to be controlled by the for. The results wouldn't be correct, since the added statement would be executed exactly one time after the loop was finished, rather than once each time through the loop. Such errors are very difficult to find, because the code looks all right when inspected casually; therefore, a little extra caution when writing the program in the first place often pays off handsomely.

17 Although it would have been possible to make the normal indexing via [ ] do the check as we do in the Vec type and add another way to say "don't check the validity of this index". That would be a better design, in my opinion.

18 The C++ Programming Language, 3rd Edition. ix.

19 Pumpkin pie, anyone?


TOC PREV NEXT INDEX