Strings

Strings are arrays of characters that are terminated by the null character: '\0' (a character whose value is 0). By having this designated termination character, strings have greater flexibility than simple character arrays: the length of a string can change during the execution of the program (as long as it doesn't exceed the amount of memory allocated to it). Notice that this null termination character is not the same as the character zero, which is expressed as '0'. (Recall that single quotes are used for character constants.)

A string can be declared like this:

char myString[80];

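'myString' can hold any string with length less than 80. (Remember that we need to reserve the last character for the null character.) An array cannot be assigned to with '=' after it has been declared, so we copy the characters into it with 'strcpy' (from <string.h>):

strcpy(myString, "Hello. How are you?");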

The length of this string is 19 characters, and 20 'slots' of the array are used to hold it. We can also express strings as pointers to characters, like this:

char *newString;

Now, 'newString' can be made to point to any other string. However, it has no memory of its own in which to hold characters. For example:

newString = myString;

is valid. So is

newString = "this is a string";

but with a caveat: a string literal is stored in memory that the program must not modify, so 'newString' can be used to read this string but not to change it. What we cannot do is copy characters into 'newString' before it points anywhere. The reason is simple (but easily forgotten). Memory must be allocated for a string, just like it must be allocated for any other data structure (such as a node of a tree). Just as we could have multiple pointers pointing to the same node in a linked list, we can have multiple character pointers pointing to the same string.

 

As an alternative to allocating memory for a string using an array declaration, we can also use 'malloc' (declared in <stdlib.h>). This gives us even greater flexibility: we can dynamically allocate memory for a string. We can allocate memory for our string, 'newString', like this:

newString = (char *)malloc(80);

The '(char *)' part of this expression 'casts' the pointer returned by malloc (which has type 'void *') to type 'char *'. It is not completely necessary, since C performs this conversion automatically on assignment. In this expression, we allocated 80 bytes of memory and set newString to point to it. (Recall that a character requires 1 byte of memory.) If malloc cannot provide the memory, it returns NULL, so the result should be checked before it is used. Instead of a constant value like 80, we could have used an integer variable (possibly the result of some other computation). For example, if we wanted to make a copy of a string, we could calculate its length, allocate that much memory (+1 for the null character) to a new pointer (of type 'char *'), and then copy the original string to the newly allocated memory (character by character).


One of the simplest things you can do with strings is read and write them. Actually, we've been using strings for a while now - whenever we've used 'printf().' The argument we pass to printf() is a string. Instead of passing the string as text enclosed by double quotes, we could also pass a string variable. Example:

printf("Hello. How are you?");

printf(myString);

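Those two statements will print the same thing (as we've defined 'myString'), but only because the string contains no '%' characters: printf() treats its first argument as a format, so a '%' inside 'myString' would be read as the start of a conversion. A safer way to print the same statement is:

printf("%s", myString);

'%s' is the conversion specification for strings. Here are some other useful functions for reading and writing strings: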


The <string.h> library provides many useful string functions. It's important to understand how they are implemented. Professor Sedgewick discusses the implementations in section 3.7 - in particular, see table 3.2. You should also look over appendix B of K&R for the definitions of the functions in the stdio, string, and stdlib C libraries.

Notice, also, how these functions could be implemented using either pointer or array notation. Understanding this equivalence will make programming with strings easier for you.

You won't necessarily need to use any of these string library functions in your programs. Nevertheless, they are good examples of simple functions which operate on strings. You should be able to write the code for any of them by yourself.


Arrays of strings

It is often useful to store strings in arrays. The following code creates a 20-item array of strings (actually, pointers to characters.)

char *myStrings[20];

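Notice that you haven't allocated any memory for your strings, yet. So,

myStrings[1] = "hello";

does not copy anything: it only sets 'myStrings[1]' to point to the string literal "hello", which must not be modified. To get a string that you can change, use malloc to allocate memory for it and then copy the characters in. Or, you can just set an element to point to a pre-existing string, like this: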

myStrings[0] = myString;

(where 'myString' is defined as above.) What this assignment does is set 'myStrings[0]' to point to the string 'myString.' It does not make a new copy of the string. An alternative way to initialize elements of an array of strings is like this:

char *p[] = {
   "one string",
   "another",
   "one more"
};

This code creates an array 'p' that has three strings. A reference to p[1] (for example) will return the string "another". Each element of the array is already a pointer to a character, so the & operator is not needed to reference a particular string. For example, to set our character pointer from above, 'newString,' to one of the values of the array, we would write:

newString = p[0];

Now, newString is the string "one string". (We cannot write 'myString = p[0];' because 'myString' is an array, and an array cannot be assigned to.) Review pointers if this doesn't make sense to you! Also, to make sure that you understand strings, please review program 3.15 in Sedgewick.