
Extending the work with streams


Introduction

In the introductory lesson on streams I quoted Bruce Eckel: "A stream is an object that formats and holds bytes". In that lesson we studied cout and cin, the standard output and standard input streams. These streams are, by default, attached to hardware devices: the screen is the destination or sink for cout and the keyboard is the source for cin. You also learned that cout and cin can be redirected to other sinks and sources, to files on disk for instance. This is useful because files persist between invocations of a program; files are permanent storage. This isn't true of screens and keyboards, of course. We can't "save" program output on a screen for long, nor can we read long-term "saved" input from a keyboard, and although we can redirect cout and cin it is a primitive way to deal with stored data. We need to access stored data, ie files, in a more flexible way and streams permit this. Since the stream "is an object that formats and holds bytes" it gives us this kind of flexibility. In days gone by, when we dealt with stored data, we focused our attention on the file, the actual data storage. Now, with streams, the focus is more on accessing that data; after all, it is the data we are interested in.



Access methods

A text data file: file1.txt
It little profits that an idle king
By this still hearth
Amongst these barren crags
Matched with an aged wife ...
EOF
A binary data file: file2.dat
Record 001
Record 002
Record 003
Record 004
EOF

The file is the most easily understood stream and the most commonly used. We use files when we want to store data for the long term, ie between invocations of a program.

I've shown here two distinctive kinds of file. The first is a text file, that is it contains text only and each element of the file is a line of text. Each line can be a different length. It is the kind of file that might be generated by a text-editor. The second is a binary or data file. Each element or record is a fixed size. This could be the kind of file generated by a simple database program.

We can access the contents of files either sequentially or directly. Usually files with variable-length elements or records, like text files, are accessed sequentially. Files with fixed-size records can be accessed in either way.

Sequential access simply means that a program starts at the beginning of the file and reads each element in sequence until EOF, the End Of File is reached. Direct access, also called random access, means that any chosen element can be accessed. This requires that we work out the relevant positioning information for the element we want, then move (seek) to that location in the file and read the element.

Sequential access is easier to understand and deal with so we look at that first.



File modes

Stream open modes

Flag      Meaning
in        Open the stream for input - ie read from the stream.
out       Open the stream for output - ie write to the stream.
ate       "at end" - Open the stream and seek to the end of the stream.
app       "append" - Open the stream for output but seek to the end of the stream for each write.
binary    Open the stream for binary read/write. Generally used for writing fixed length data.
trunc     "truncate" - Open the stream and truncate it, ie set to zero length.

Streams are opened in specific modes. Each of the modes is specified by one or more flags which can be combined in a bitwise fashion using the or operator (|):

Flag (mode) combinations       Meaning
out                            Write at current position in stream
out | app                      Write at end of stream
out | trunc                    Write at beginning of stream
in                             Read from current position
in | out                       Read and write at current position
in | out | trunc               Read and write, truncating the stream first
binary | out                   Write binary at current position
binary | out | app             Write binary at end
binary | out | trunc           Write binary and truncate
binary | in                    Read binary
binary | in | out              Read/Write binary
binary | in | out | trunc      Read/Write binary and truncate

The table shows the only valid open mode combinations. If anything else is tried then the stream open operation will fail.

Associated with streams and files is the notion of a stream or file position pointer. This is mainly a conceptual thing - it points to the next position in the stream at which a read or write will take place. If we open a stream in out mode then immediately after opening the stream pointer "points" to the start of the stream. Any writing will take place at that point. Likewise opening a stream in in mode means that reading will start at the beginning of the stream.

At the moment you might be a bit confused by the various modes. I will write more about the stream open modes as this lesson develops. I've included the mode material here since it has to be introduced somewhere and this is as good a place as any.

If you would like to read the gory details of streams then you should read The GNU C++ Iostream Library. Be warned: it is heavy going if you are a beginner.

There is also a great deal of information in the ANSI 1997 C++ Public Review Document - Input/output library. Be warned: like the GNU C++ Iostream Library document it is heavy going if you are a beginner.


Accessing streams sequentially - reading

When we are using file streams there are a number of steps we take:

  1. declare a stream variable,
  2. associate a file name with the variable,
  3. attempt to open the file,
  4. test the success of the open operation,
  5. work with the stream,
  6. close the stream when finished.
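//fstr_ex1.cc
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>    // EXIT_FAILURE, EXIT_SUCCESS

using namespace std;

int main(int argc, char *argv[])
{
  ifstream strIn;
  string s;

  // associate the file named on the command line with the stream and open it
  strIn.open(argv[1]);
  if (strIn.fail())
   { cout << "Unable to open file" << endl;
     return(EXIT_FAILURE);
   }
  // read a line at a time until the end of the file
  while (getline(strIn,s))
   cout << s << endl;
  strIn.close();
  return(EXIT_SUCCESS);
}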

Here is a simple example. You can see that apart from the usual include files I also include fstream which is the file stream header file. I also use command-line arguments and it is good practice for you to get comfortable with this. I haven't included any code to ensure that we get the correct number of arguments but it isn't essential in this case.

There is a new form of declaration which uses the ifstream class. I have declared a variable, or instance, of ifstream which is called strIn.

In the statement strIn.open(argv[1]); I combine the steps of associating a file name with the stream and opening the stream. The open() function is a member function of the ifstream class and it requires at least one parameter, the file name. The statement strIn.open(argv[1]); could also have been written as:

strIn.open(argv[1], ios::in);

that is, the read mode for text was specified. The default open mode is "text for reading" and there must be an existing file.

string filename, s;

filename = argv[1];
strIn.open(filename.c_str());

The file name is a C-style string, ie a null-terminated string. There is a bit of a trap here. Assume we modify the example as shown here and assign the file name from the command-line to a string class instance called filename. Since strIn.open() requires a C-style string we have to use the c_str() string class member function. A C-style string and a string class aren't the same thing. This is fine and as long as we actually have a command-line argument the program will work. BUT! if we forget to add a file name to the command-line the program will crash. The complete example is shown here. To avoid this kind of drama you should check the command-line arguments. Take a look here to see how.

After associating the file with the stream and attempting to open it we should test the success of these steps and the statement if (strIn.fail()) does that. The fail() function is another member function of the stream class. If it returns true then the file could not be opened, otherwise we are ready to access the file via the stream strIn and we do this with the while loop that follows. The getline() function is used to extract data from the stream and store it in the string s, and you know that whatever we use in the condition of the while loop must evaluate to true or false. We can use getline() in this way because the function actually returns a reference to an input stream, and a stream used as a condition is true only while it is in a usable state. At some point we will reach the end of the input file, the stream state will indicate that and the loop ends. Later in this lesson we look at stream states in more detail.

The last step is to close the stream with the close() member function: strIn.close();. In this simple example it isn't really necessary since the program exits almost immediately after reading from the stream. Nonetheless you should get into the habit of always closing streams when they are not in use. This helps to preserve system resources since each open file stream uses resources even when it is not actively being read or written. It also lessens the risk of programming errors and, most important, closing files ensures that files which are being written are correctly updated.

Here is a table which summarises the stream operations so far:

Declare an instance of an input stream
    ifstream InFile;
Open a file stream using the open() member function
    InFile.open("file1.txt");
Use the fail() member function to determine if a file stream was successfully opened.
    InFile.fail();
Use the close() member function to close a file stream and release any resources allocated to it.
    InFile.close();


The examples we have used so far have all dealt with text files and treated the input as text. There is really no difference between an input file stream and the stream cin. You know that you can read any simple type from cin. For example if you want a char from cin then extract from the stream into a char variable. The same applies to a file stream which reads from a text file.

#include <iostream>
#include <fstream>
#include <stdlib.h>

using namespace std;

int main()
{ int i, total = 0;
  ifstream nfile;
  nfile.open("numbers.txt");
  if (nfile.fail())
   { cout << "Unable to open file." << endl;
     return(EXIT_FAILURE);
   } 
  while (!nfile.eof())
   { nfile >> i;
     total += i;
     cout << i << " " << total << endl;
   }
  nfile.close();
  cout << total << endl;
  return (EXIT_SUCCESS);
}

Here is an example which shows that.

Assume there is a text file called numbers.txt which contains the lines:

11
56
78
94
45

Each line contains a single integer value. Our program has to read each line and total the integers.

The program opens the file, tests to see that the open was successful then uses a while loop to read lines from the input file.

The while loop uses a stream member function called eof() which you may not have seen before. This function returns true when the end of the stream is reached. If we invert this using the not operator, ie (!nfile.eof()) then we can keep looping while eof() is not true.

Within the loop the program extracts the value read from the stream and assigns it to i, an integer variable.

11 11
56 67
78 145
94 239
45 284
45 329
329

There is one small problem: the program miscalculates the answer. You can see the output from the program here.

The correct answer is 284 but the program gives 329 because it adds the last value, 45, twice. This is due to the way that streams handle whitespace.

Remember that whitespace separates data in the input stream. Each read of i from the stream reads past leading whitespace then reads a complete value and then the read stops at the trailing whitespace which will be "consumed" on the next read.

When the program reads the last value (45) there is trailing whitespace. This means that the program has not yet reached the end of the file so !eof() is still true, but there is nothing left to extract from the stream. The next read fails and i retains its previous value.

nfile >> i >> ws;

The solution is quite simple: we instruct the program to extract trailing whitespace by modifying the line where the read happens. The addition of the manipulator ws fixes our problem.





Tutorial 1

Exercise 1

Write a program which counts the number of lines in a text file. The text file name should be entered on the command line. The program should just output the count, it need not display the lines. There is a sample answer here if you get stuck.

Exercise 2

Write a program which reads integers, one per line, from a text file and determines which line contained the largest number. There is a sample answer here if you get stuck.


Accessing streams sequentially - writing

Dealing with an input stream was quite straightforward. We associate a file with an input stream, open it and if all goes well we start reading from the stream. Dealing with an output stream is almost as straightforward if the file already exists. If the file doesn't exist then we may need to create it.

//file_out1.cc
#include <iostream>
#include <string>
#include <fstream>

using namespace std;

int main()
{
 ofstream out("anyoldfile.txt");  
 string m("");

 while ( m != "END")
  { getline(cin, m);
    out << m << endl;
  }
 out.close();
}

Here is a simple example that takes lines of text from cin and writes them to a file.

A variable of type ofstream called out is declared and the file name "anyoldfile.txt" is passed to it, which opens the file. out is an instance of an output stream. The default mode for output file streams is output text, ie we could have written ofstream out("anyoldfile.txt", ios::out);. Since we have just opened the file the file position pointer is placed at the start of the file. If this file already exists then its current contents are discarded when it is opened and our writes replace them. If the file doesn't exist then the file will be created.

There is also a string variable m which is initialised to an empty string. The while loop reads lines from the stream cin into m. m is then written to the output stream and the loop terminates when m == "END". Note that the END line itself is also written to the file.

The last step is to close the stream using the output stream member function close(). This is important! To ensure that all data is written to the stream you should always close output streams when you are finished with them.


//file_out2.cc
#include <iostream>
#include <string>
#include <fstream>

using namespace std;

int main()
{
 ofstream out("anyoldfile.txt", ios::app);  
 string m("");

 while ( m != "END")
  { getline(cin, m);
    out << m << endl;
  }
 out.close();
}

The next example shows how we can write to the end of a file.

The only difference between this example and the previous example is the addition of the ios::app to the stream declaration. This sets the open mode to append and the contents of the file won't be overwritten by new write operations.



Accessing streams directly

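Although much stream i/o is done sequentially there are many occasions when we want to go directly to a particular element of a stream.

There are two iostream member functions or methods which are used to aid in controlling the "file position" or "stream position" pointer: seekg(), which moves the pointer, and tellg(), which reports where it currently is.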

Here is an example:

We declare the relevant include files

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

and a structure which represents the record. It consists of fields each of which is a character array - ie C style strings. Note that each array is one larger than the actual string to allow for the terminating \0.

struct custrec
 { char custnum   [ 6];
   char surname   [21];
   char fname     [21];
   char street    [26];
   char town      [26];
   char salescode [ 4];
 };

The function prototypes. Note: read_record() declares a reference to a custrec type.

void display_data(custrec);
bool read_record( long, custrec & );

custdata is local to main()

int main ()
{ custrec custdata;

Display the record size then call read_record(). We will read record number 190. If the read function returns true then we have some data to display so display it with display_data().

 cout << "Record size is: "
      << sizeof(custrec)
      << endl;
 if (read_record(190, custdata))
  { display_data(custdata);
    return 0;      // success
  }
 else return 1;    // the record could not be read
}

datafile is the stream we read from and offset will be the position we seek in the stream.

bool read_record( long rec,
                  custrec & custdata)
{ ifstream datafile;
  long offset;

Open the file for input only in binary mode.

 datafile.open( "cust2.dat",
                ios::in | ios::binary);

If the open operation failed for some reason then display a message and return a false value to indicate failure.

 if ( !datafile )
  { cout << "Unable to open file." << endl;
    return false;
  }

Calculate the position to seek - the offset, call the seekg() method and read 104 bytes (sizeof(custrec)) into the data structure referred to by &custdata. Close the file and return true to show all was well.

 else 
 { offset = (rec - 1) * sizeof (custrec);
   datafile.seekg(offset, ios::beg);
   datafile.read ((char *) &custdata, sizeof (custrec) );
   datafile.close();
   return true;
 }
}

Display the contents of the customer record.

void display_data(custrec c)
{ cout << c.custnum   << endl
       << c.surname   << endl
       << c.fname     << endl
       << c.street    << endl
       << c.town      << endl
       << c.salescode << endl;
}


 if ( !datafile )
  { cout << "Unable to open file." << endl;
    return false;
  }
 else 
 { offset = (rec - 1) * sizeof (custrec);
   datafile.seekg(offset, ios::beg);
   datafile.read ((char *) &custdata, sizeof (custrec) );
   datafile.close();
   return true;
 }

I kept the explanation of the seek and the read brief so that the example could stay as simple as possible. It needs some enlargement and modification.

First of all the error checking is minimal. We've taken care of an unsuccessful open stream operation but what do we do if the seek fails or the read fails? We really should check these things.

The second issue is, at least for you, what do all these things like seekg(offset, ios::beg) and read (&custdata, sizeof (custrec) ); mean?


We look at the second problem first. You deserve a more detailed explanation.

Remember that we are doing direct access because it is faster than sequential access and we are dealing with a file which contains elements which are all the same size. Each element of the file cust2.dat is 104 bytes in size. Add up the array sizes in the struct custrec and see for yourself. If our file contains 191 records then it should be a total of 19864 bytes in size. From the point of view of our stream the first byte, at the beginning of the file, is byte 0 and the last is byte 19863. If we seek to byte 104 this will position the stream pointer at the start of the second record in the file. Users tend to number things from 1, compilers number them from 0. The first record, for us mere mortals, is record 1; for the program it is record 0. This is why the offset is calculated as (rec - 1) * sizeof (custrec).

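The seekg() method has two arguments: an offset, the number of bytes to move, and a seek direction, the point from which the offset is measured.

There are three variations of the seek direction: ios::beg measures the offset from the beginning of the stream, ios::cur from the current position and ios::end from the end of the stream.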

The method tellg() returns the current byte position in the stream, which we can store in a long. You will see that this is a useful thing to know.

Now what does datafile.read (&custdata, sizeof (custrec) ); mean? This statement calls the read() method. The first argument, &custdata, points to the variable which will receive the data. Note that it is an address: we want to get something back. Because read() is declared to take a pointer to char, a standard compiler expects the address to be cast, ie (char *) &custdata. The second argument states how many bytes to read from the current position.

 { offset = (rec - 1) * sizeof (custrec);
   datafile.seekg(0, ios::end);
   streamlen = datafile.tellg();
   cout << "File: " << streamlen << endl;
   if (offset <= (streamlen - sizeof (custrec)))
    { datafile.seekg(offset, ios::beg);
      datafile.read ((char *) &custdata, sizeof (custrec) );
      if ( datafile.fail() )
       { datafile.close();
         return false;  
       }
      datafile.close();
      return true;
    }
   else
    { datafile.close();
      return false;
    }

Now we can look at the error checking and I have modified the relevant parts of the program to show you.

The first difference is the introduction of the long variable streamlen. We use streamlen to get the size of the file and do it by first seeking to the end of the file with datafile.seekg(0, ios::end) which states that we seek 0 bytes from the end of the stream. Next we call the tellg() method streamlen = datafile.tellg() and assign its value to streamlen. Now we test streamlen against the position we want to seek and read with (offset <= (streamlen - sizeof (custrec))). If this fails then we know we were trying to seek past the end of the file.

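The last thing to check is the success of the read operation and we do this via the fail() method. Streams have a number of state flags associated with them: goodbit (no errors), eofbit (the end of the stream was reached), failbit (an operation failed, eg a read could not be completed) and badbit (the stream itself is unusable),

and there are a number of methods for examining the state of the stream: good(), eof(), fail() and bad().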

You can see the complete program here.




Tutorial 2 - Direct access

Exercise 2.1

In the modified example program we make sure we don't seek past the end of a stream by using this condition: (offset <= (streamlen - sizeof (custrec))). But what is to stop a request to seek past the beginning of the file? Better fix it!

Exercise 2.2

Modify the example so that it takes a record number from the keyboard and returns the record if the record number is valid. If the record number is invalid the program should return an error.




Formatting streams

It's often not apparent to the beginning programmer what a fine piece of work the iostream class is. We tend to take it for granted. We use cout without a thought to the nature of the data we are inserting into the stream and cout takes care of it for us. We can do more though.

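There are many occasions when we want to control the formatting of output. The iostream class gives us some tools for doing this. We can use manipulators, stream member functions and the ios format flags.

The manipulators are inserted into the stream like data. The ones used in this lesson are endl, ws, hex, oct, setw(), setfill(), setbase() and setprecision().

You have used quite a few of the manipulators already.

The stream member functions used in this lesson are width(), precision(), flags(), setf() and unsetf().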

The ios format flags are used with the member functions setf() and unsetf(). For example:

       cout.setf(ios::hex);
       cout.unsetf(ios::dec);

Since we have three different ways of formatting streams which do you use?

Here are some example programs which show the use of manipulators, member functions and format flags:

Format with manipulators

#include <iostream>
#include <iomanip>

using namespace std;

int main()
{
  float f = 22.3;
  double g = 22.3;

// no format manipulator used
  cout << f << endl;

// with format manipulator set width
  cout << setw(10) << f << endl;

// some format manipulators only apply to the next object in the stream
  cout << setw(10) << f << endl
       << f << endl;

// effect of manipulators is cumulative
// fill leading blanks with other than space
// and set width to 20
// setfill is retained
  cout << setfill('*') << setw(20) << f << endl;
// see we still have leading blanks filled with '*'
  cout << setw(20) << f << endl;

  cout << setfill(' ');

  cout << setw(20) << f << endl;

// we can change the number base displayed
  for (int i = 0; i < 16; i++)
   cout << i << " "
   << hex << i << " "
   << oct << i << endl;

// setbase is retained
  cout << setbase(16) << 255 << " " << 255 << endl;
  cout << setbase(8)  << 255 << " " << 255 << endl;
  cout << setbase(10)  << 255 << endl;

  for (int i = 1; i < 20; i++)
   cout << "Precision is " << setw(4) << i 
        << setw(6) << "f is" 
        << setprecision(i) << setw(23) << f 
        << setw(6) << "g is" 
        << setw(23) << g << endl;
  return 0;
}

Format with member functions

#include <iostream>

using namespace std;

int main()
{
  float f = 22.3;
  double g = 22.3;

  cout << "Precision is " << cout.precision() << endl;
  cout << "Width is     " << cout.width()     << endl;
  cout.precision(20);
  cout.width(25);
  cout << cout.width() << " " << f << endl;
  cout << "Precision is " << cout.precision() << endl;
  cout << "Width is     " << cout.width()     << endl;
  cout << g << endl;
  cout << "Precision is " << cout.precision() << endl;
  cout << "Width is     " << cout.width()     << endl;
  cout << cout.precision() << " " << g << endl;

  return 0;
}

Format with flags

#include <iostream>

using namespace std;

int main()
{
  int i = 1023;

  cout << cout.flags() << endl;
  cout.setf(ios::hex);
  cout << cout.flags() << endl;
  cout.unsetf(ios::hex);
  cout << cout.flags() << endl;

  cout.setf(ios::hex);
  cout.unsetf(ios::dec);
  cout << i << endl;

  cout.setf(ios::dec);
  cout.unsetf(ios::hex);
  cout << i << endl;

  return 0;
}



Tutorial 3 - Formatting streams

Exercise 3.1

I suggest that you try the example programs first and then experiment with the different options.





Copyright © 1999 - 2001 David Beech