Many programs revolve around the idea of reading and writing one character at a time, and developing the skill of writing such programs is a very important aspect of programming. We can use scanf to read a single character from the standard input (the keyboard) into a char variable (ch, say) with:
scanf("%c", &ch);
The next character in the data is stored in ch. It is very important to note a big difference between reading a number and reading a character. When reading a number, scanf will skip over any amount of whitespace until it finds the number. When reading a character, the very next character (whatever it is, even if it’s a space) is stored in the variable.
While we can use scanf, reading a character is important enough that C provides a special function getchar for reading characters from the standard input. (Strictly speaking, getchar is what’s called a macro, but the distinction is not important for our purposes.) For the most part, we can think that getchar returns the next character in the data. However, it actually returns the numeric code of the next character. For this reason, it is usually assigned to an int variable, as in:
int c = getchar(); // the brackets are required
But it can also be assigned to a char variable, as in:
char ch = getchar(); // the brackets are required
To be precise, getchar returns the next byte in the data – to all intents and purposes, this is the next character. If we call getchar when there is no more data, it returns -1.
To be more precise, it returns the value designated by the symbolic constant EOF (all uppercase) defined in stdio.h. This value is usually, though not always, -1. The actual value is system dependent, but EOF will always denote the value returned on the system on which the program is run. We can, of course, always find out what value is returned by printing EOF, thus:
printf("Value of EOF is %d \n", EOF);
For an example, consider the statement:
char ch = getchar();
Suppose the data typed by the user is this:
Hello
When ch = getchar() is executed, the first character H is read and stored in ch. We can then use ch in whatever way we like. Suppose we just want to print the first character read. We could use:
printf("%c \n", ch);
This would print H
on a line by itself. We could, of course, label our output as in the following statement:
printf("The first character is %c \n", ch);
This would print
The first character is H
Finally, we don’t even need ch. If all we want to do is print the first character in the data, we could do so with:
printf("The first character is %c \n", getchar());
If we want to print the numeric code of the first character, we could do so by using the specification %d instead of %c. These ideas are incorporated in Program P6.1.
Program P6.1
//read the first character in the data, print it,
//its code and the value of EOF
#include <stdio.h>
int main() {
    printf("Type some data and press 'Enter' \n");
    char ch = getchar();
    printf("\nThe first character is %c \n", ch);
    printf("Its code is %d \n", ch);
    printf("Value of EOF is %d \n", EOF);
}
The following is a sample run:
Type some data and press 'Enter'
Hello

The first character is H
Its code is 72
Value of EOF is -1
A word of caution: we might be tempted to write the following:
printf("The first character is %c \n", getchar());
printf("Its code is %d \n", getchar()); // wrong
But if we did, and assuming that Hello is typed as input, these statements will print:
The first character is H
Its code is 101
Why? In the first printf, getchar returns H, which is printed. In the second printf, getchar returns the next character, which is e; it is e’s code (101) that is printed.
In Program P6.1, we could use an int variable (n, say) instead of ch and the program would work in an identical manner. If an int variable is printed using %c, the last (rightmost) 8 bits of the variable are interpreted as a character and this character is printed. For example, the code for H is 72 which is 01001000 in binary, using 8 bits. Assuming n is a 16-bit int, when H is read, the value assigned to n will be
00000000 01001000
If n is now printed with %c, the last 8 bits will be interpreted as a character which, of course, is H.
Similarly, if an int value n is assigned to a char variable (ch, say), the last 8 bits of n will be assigned to ch.
As mentioned, getchar returns the integer value of the character read. What does it return when the user presses “Enter” or “Return” on the keyboard? It returns the newline character \n, whose code is 10. This can be seen using Program P6.1. When the program is waiting for you to type data, if you press the “Enter” or “Return” key only, the first lines of output would be as follows (note the blank line):
The first character is

Its code is 10
Why the blank line? Since ch contains \n, the statement
printf("\nThe first character is %c \n", ch);
is effectively the same as the following (with %c replaced by the value of ch)
printf("\nThe first character is \n \n");
The \n after is ends the first line and the last \n ends the second line, effectively printing a blank line. Note, however, that the code for \n is printed correctly.
In Program P6.1, we read just the first character. If we want to read and print the first three characters, we could do this with Program P6.2.
Program P6.2
//read and print the first 3 characters in the data
#include <stdio.h>
int main() {
    printf("Type some data and press 'Enter' \n");
    for (int h = 1; h <= 3; h++) {
        char ch = getchar();
        printf("Character %d is %c \n", h, ch);
    }
}
The following is a sample run of the program:
Type some data and press 'Enter'
Hi, how are you?
Character 1 is H
Character 2 is i
Character 3 is ,
If we want to read and print the first 20 characters, all we have to do is change 3 to 20 in the for statement.
Suppose the first part of the data line contains an arbitrary number of blanks, including none. How do we find and print the first non-blank character? Since we do not know how many blanks to read, we cannot say something like “read 7 blanks, then the next character.”
More likely, we need to say something like “as long as the character read is a blank, keep reading.” We have the notion of doing something (reading a character) as long as some ‘condition’ is true; the condition here is whether the character is a blank. This can be expressed more concisely as follows:
read a character
while the character read is a blank
    read the next character
Program P6.3 shows how to read the data and print the first non-blank character. (This code will be written more concisely later in this section.)
Program P6.3
//read and print the first non-blank character in the data
#include <stdio.h>
int main() {
    printf("Type some data and press 'Enter' \n");
    char ch = getchar(); // get the first character
    while (ch == ' ') // as long as ch is a blank
        ch = getchar(); // get another character
    printf("The first non-blank is %c \n", ch);
}
The following is a sample run of the program (◊ denotes a blank):
Type some data and press 'Enter'
◊◊◊Hello
The first non-blank is H
The program will locate the first non-blank character regardless of how many blanks precede it.
As a reminder of how the while statement works, consider the following portion of code from Program P6.3 with different comments:
char ch = getchar(); //executed once; gives ch a value
                     //to be tested in the while condition
while (ch == ' ')
    ch = getchar(); //executed as long as ch is ' '
and suppose the data entered is (◊ denotes a space):
◊◊◊Hello
The code will execute as follows:
1. The first character is read and stored in ch; it is a blank.
2. The while condition is tested; it is true.
3. The while body ch = getchar(); is executed and the second character is read and stored in ch; it is a blank.
4. The while condition is tested; it is true.
5. The while body ch = getchar(); is executed and the third character is read and stored in ch; it is a blank.
6. The while condition is tested; it is true.
7. The while body ch = getchar(); is executed and the fourth character is read and stored in ch; it is H.
8. The while condition is tested; it is false.
9. Control goes to the printf, which prints:
The first non-blank is H
What if H was the very first character in the data? The code will execute as follows:
1. The first character is read and stored in ch; it is H.
2. The while condition is tested; it is false.
3. Control goes to the printf, which prints:
The first non-blank is H
It still works! If the while condition is false the first time it is tested, the body is not executed at all.
As another example, suppose we want to print all characters up to, but not including, the first blank. To do this, we could use Program P6.4.
Program P6.4
//print all characters before the first blank in the data
#include <stdio.h>
int main() {
    printf("Type some data and press 'Enter' \n");
    char ch = getchar(); // get the first character
    while (ch != ' ') { // as long as ch is NOT a blank
        printf("%c \n", ch); // print it
        ch = getchar(); // and get another character
    }
}
The following is a sample run of P6.4:
Type some data and press 'Enter'
Way to go
W
a
y
The body of the while consists of two statements. These are enclosed by { and } to satisfy C’s rule that the while body must be a single statement or a block. Here, the body is executed as long as the character read is not a blank – we write the condition using != (not equal to).
If the character is not a blank, it is printed and the next character read. If that is not a blank, it is printed and the next character read. If that is not a blank, it is printed and the next character read. And so on, until a blank character is read, making the while condition false, causing an exit from the loop.
We would be remiss if we didn’t enlighten you about some of the expressive power in C. For instance, in Program P6.3, we could have read the character and tested it in the while condition. We could have rewritten the following three lines:
ch = getchar(); // get the first character
while (ch == ' ') // as long as ch is a blank
    ch = getchar(); // get another character
as one line
while ((ch = getchar()) == ' '); // get a character and test it
ch = getchar() is an assignment expression whose value is the character assigned to ch, that is, the character read. This value is then tested to see if it is a blank. The brackets around ch = getchar() are required since == has higher precedence than =. Without them, the condition would be interpreted as ch = (getchar() == ' '). This would assign the value of a condition (which, in C, is 0 for false or 1 for true) to the variable ch; this is not what we want.
Now that we have moved the statement in the body into the condition, the body is empty (just the semicolon); this is permitted in C. The condition is now evaluated repeatedly until it becomes false.
To give another example, in Program P6.4, consider the following code:
char ch = getchar(); // get the first character
while (ch != ' ') { // as long as ch is NOT a blank
    printf("%c \n", ch); // print it
    ch = getchar(); // and get another character
}
This could be re-coded as follows (assuming ch is declared before the loop):
while ((ch = getchar()) != ' ') // get a character
printf("%c \n", ch); // print it if non-blank; repeat
Now that the body consists of just one statement, the braces are no longer required. Five lines have been reduced to two!