Linux Programming and C++

By Mario Giannini

Part I – Getting Started

In the past year Linux has grabbed 17% of operating system sales, making it worthy of attention. While some people call it the 'Windows Killer', others consider it just a server replacement, and still others consider it a not-ready-for-prime-time player because of its current complexities. But with its growing market share, attention from manufacturers such as IBM and Dell, the fact that the majority of people interested in it are programmers, and that it's free, it's time to take a programmer's look at it.

Overview

First off, this document assumes that you have already installed Linux, along with the development tools for C++. Later, we will discuss X Window programming in XFree86 using the gtk library, so you will also want to get and install the latest version of that software. I will be using C++, not C, for program development, and will assume that you are using g++ to create programs, not gcc.

This set of documents is targeted at people who already have an understanding of programming in C, and some C++. In the final document we will write C++ classes, creating socket and email classes, and eventually tie them to a simple X Window email application. So, if you're interested in some advanced C++ features, Linux programming, and/or X Window development, you should find something helpful here.

The Tools

We'll start the discussion with a look at the Linux programming tools g++, ar, make, and gdb:

 

The g++ Compiler

The g++ compiler compiles and links our source code files into executable programs. So far, the only compatibility issue I have come across with this compiler (version 2.7.3) is that it does not support the strstream class of the 1997 C++ standard. I'll avoid using this class, but there are places where it would have been helpful. All examples will use the 1997 standard. The first thing that C++ programmers will notice about this is that the '.h' is missing from header names. For example:

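#include <iostream>
using namespace std;   // standard library names such as cout live in namespace std

int main()             // standard C++ requires main to return int
{
  cout << "Hello world" << endl;
  return 0;
}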

Note that <iostream> does not have a .h, which follows the latest standard. Later on, we will make use of exceptions and the STL (Standard Template Library) of C++, so it's important to make sure now that you have the correct compiler installed.

C++ source files on Linux often use a .cxx or .cc extension, rather than the .cpp that most DOS/Windows programmers are used to, although g++ accepts .cpp as well. These documents use .cxx, so the above program would be called hello.cxx. To compile it, enter the following line at your Linux prompt:

g++ hello.cxx

g++ should start up, compile the file, link it, and create the executable file with the name a.out. In order to run the program, you might need to do the following at the Linux prompt:

./a.out

The ./ is needed because, by default, Linux shells do not search the current directory for executable files; they search only the directories listed in the PATH environment variable. The ./ prefix tells the shell that the a.out executable is in the current directory, and not in one of the system directories. I have altered my .bashrc file (a 'hidden' file in my home directory) so that the shell also searches the current directory when running a program.

Run the program, and verify that it responds with 'Hello world'. If it does, then congratulations, you're past hurdle number 1.

In order to really be able to use g++, we need to know a few 'tricks' about it:

Redirecting output

When g++ compiles a file, it sends its errors to the standard error device, which is normally the console. This means that if you want to save the errors from a g++ compile, you can't redirect them in the normal fashion:

g++ hello.cxx > hello.errs

Instead, you will need to redirect the standard error stream. In order to do this, you use a slightly different version of the redirection operator (>), prefixed with 2, the file descriptor number of standard error:

g++ hello.cxx 2> hello.errs

Now you can save the errors in a file and review them later, or fix them starting at the first error. This is always a good practice, because a C++ compiler might encounter a single error that throws it out of whack, causing it to flag other perfectly fine lines as errors until the first error is fixed.

Compiling several files into one executable

In order to compile several .cxx modules into one executable, simply add the names of the modules to the command line. For example:

g++ hello.cxx mymenufuncs.cxx myfilefuncs.c mygraphics.o myxlib.a

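The line above will create an a.out file built from all of the files listed. g++ recognizes the various file extensions and knows how to work with each of them. A quick summary of these types:

.cxx   C++ source file, which is compiled and then linked
.c     C source file (g++ compiles it as C++)
.o     object module that is already compiled, and is only linked
.a     archive (static library) of object modules, searched during linking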

Specifying a different executable output filename

OK, so a.out isn't the best name for an executable, and it's the name always used by the g++ compiler unless you specify an alternative. To specify an alternative output name, use the -o option of g++. For example:

g++ -o hello hello.cxx

Now, instead of getting a file called a.out, the compiler will create an executable called hello. Unlike DOS and Windows, Linux executables do not have a specific extension like EXE or COM. Instead, a file has its 'execute' permission set, which g++ does automatically. If you list the files now, using the command ls -l, you will see that hello has an 'x' in the permissions shown in the leftmost column (for example, -rwxr-xr-x), meaning it is an executable.

Compiling only, without linking

Sometimes, a source module is not a complete program, but rather a set of functions, or a class, that is a small part of a larger program. For such modules, it is often convenient to compile them once, and then simply re-use the object module created by the compile. During a normal compile, the object modules are created, then linked into an executable, and then deleted. To have g++ just do a compile, and not a link, use the -c option:

g++ -c mygraphics.cxx

This creates a file called mygraphics.o that we can use in later compiles, as mentioned in the Compiling several files into one executable section above. In DOS and Windows compilers, the counterpart is the .OBJ file. Note: I have not presented the code for mygraphics.cxx here; it is a 'pseudo module' for the purpose of discussion.

Compiling with debugging information

When your program has a bug, a debugger is an invaluable tool. Later on we will discuss using the gdb debugging tool to help in such a situation. But before gdb can debug a program, the executable must have debugging information included in it. In order to do this (make the program usable in the gdb debugger), we need to add the -g option:

g++ -g hello.cxx

This will again create the a.out file, but now that executable can be debugged using gdb.

 

The ar utility

The ar (archive) utility is a convenient way to create static libraries. Let's say you have written 100 source modules, containing mostly re-usable code. It's convenient to place those 100 files into a library, and treat them as one entity. The link process of g++ will search a library to find whatever other modules it needs to make an executable (the term 'link' means to combine one or more object modules into an executable program).

Because of its simplicity, we will only look at one way of using ar. An example would be:

ar -cr mglin1.0.a msomemodule.o

Here, the ar command has been called with the -cr options. The 'c' means to create the archive if it doesn't already exist. The 'r' means to add the module to the archive, or replace it if it's already there. mglin1.0.a is the name of the archive (something like Mario Giannini's LINux library version 1.0). Archives have a file extension of .a, as noted in the Compiling several files into one executable section above. msomemodule.o is the name of an object module. The DOS/Windows counterpart to a .a file is a .LIB file. Note: The msomemodule.o file is a pseudo file, and we have not presented its source code here.

 

The Make utility

Make is a utility for managing large projects that are comprised of several source modules and libraries. For example, let's say you have a project made up of 100 files. You go and change 3 source modules on Monday, then 2 more on Tuesday, then 4 more on Wednesday. Now, it's Thursday, and you need to re-compile your project, and you don't recall which modules you changed. Make can help eliminate the need for you to keep track of the changed files.

Make is a utility, but it relies on a separate text file to describe exactly how to build a project. You must set up this file yourself, and it is normally called makefile. If you just type 'make' at a Linux prompt, the utility will search for the file makefile, and build the project based on that. You can also use the -f option to specify an alternate filename.

The organization of a make file is one of 'targets', 'dependencies', and 'build rules'. Let's assume we have 3 source modules in our project, called hello.cxx, funca.cxx, and funcb.cxx. All three are needed to create the executable file hello. An example make file would look like the following:

# Make comments start with a '#'; each build rule line must begin with a TAB, not spaces
hello : hello.o funca.o funcb.o funcs.h

hello.o : hello.cxx funcs.h
	g++ -c hello.cxx

funca.o : funca.cxx funcs.h
	g++ -c funca.cxx

funcb.o : funcb.cxx funcs.h
	g++ -c funcb.cxx

hello : hello.o funca.o funcb.o
	g++ -o hello hello.o funca.o funcb.o

At the top, below the comment, is the main build line. It states that the 'target' hello is dependent upon the hello.o, funca.o, funcb.o, and funcs.h files. This line only establishes the files needed, and does not have a 'build rule' of its own (it is a single line); the build rule for hello appears at the bottom of the file.

Two lines down (a blank line separates sections of a make file) is an example of a 'target', 'dependency', and build rule. This line says that hello.o is dependent upon hello.cxx and funcs.h. Make compares the time and date of the target against those of its dependencies. If the target does not exist, or is older than any of its dependencies, make invokes the 'build rule', which is on the line immediately below and indented with a tab character. The build rule here is 'g++ -c hello.cxx', which will re-create the hello.o file.

Make continues checking all the files and dependencies, re-building only those modules needed. The end result of the above make file is to re-create the hello executable, re-compiling only the modules needed.

It's worth noting that the example above is an extremely simplistic one. The make utility is used in lots of places in Linux, and has many features that we have not, and will not, touch on in this document. It provides all sorts of macros, variable names, and default build rules.

 

The gdb debugger

A debugger can be an incredibly useful tool. Many programmers use basic intuition when debugging a program, and rely on doing things like inserting cout or printf statements to help them see what the values of variables are, and what point of a program has been reached at a certain time. The problem with this approach is that once your program is fixed, you have to go back and remove those cout and printf statements. A hidden 'gotcha' to this approach is that the very act of inserting or removing these cout and printf statements may actually introduce new bugs.

The gdb debugger, like all debuggers, has tons of features. We will concentrate on a few of the most important ones. At this point, you must remember to compile your programs with the -g option of g++, so that the executable can be debugged with gdb.

To start debugging your program, simply run gdb with the name of the executable as a command line argument. The following example assumes that our program name is 'crasher':

gdb crasher

The following is the listing for the crasher program that we will be using throughout this discussion:

#include <cstddef>    // for NULL
#include <iostream>
using namespace std;

void FuncA()
{
   cout << "In FuncA"<<endl;
}
void FuncB()
{
   cout << "In FuncB"<<endl;
}
void FuncC()
{
   cout << "In FuncC"<<endl;
}
void Crasher()
{
   char * p = NULL;
   cout << "About to crash program..." << endl;
   // Deliberate bug: writing through a null pointer
   for( int i =0; i < 10000; i++ )
      *p++ = 0;
}
int main()
{
   FuncA();
   FuncB();
   Crasher();
   FuncC();   // never reached, because Crasher() brings the program down
}

 

list

Pretty straightforward, this command lets you list the source code of your program, either by function name or by line number. For example, once the program is loaded, entering 'list main' at the gdb prompt lists the lines around the start of the main function, and 'list 10,100' lists lines 10 through 100. The two forms can be combined: 'list main,100' lists from the start of main through line 100.

break

The 'break' command lets you define a 'breakpoint'. With a breakpoint set, you can run the program, and gdb will suspend your program when it hits that breakpoint and return you to the gdb prompt. Now you can examine variables, set other breakpoints, or single-step the program (watching the program execute, one line at a time). To use the break command, enter something like 'break 21' to stop the program on line 21, or 'break Crasher' to stop it on entry to the Crasher function. In order to actually start the program, you need to use the 'run' command.

run

Use the 'run' command at the gdb prompt to start your program running. You will probably want to set breakpoints before doing a run. Once running, your program will be suspended when it hits a breakpoint.

next

Use the 'next' command at the gdb prompt once your program has hit a breakpoint. The 'next' command executes the next line of code, and then suspends your program again. If the next statement is a function call, then 'next' will not trace into the function, but will step over it, executing it as a single step.

step

The 'step' command is similar to 'next', except that it will not step over a function if it is the next statement. Instead, it will enter the function, and suspend the program at the first line of that function.

print

The 'print' command is used to examine the contents of a variable. For example, if you were in the Crasher function above, you could do a 'print p' to show the contents of the 'p' pointer in that function.

Using a Core Dump

When a Linux program causes a serious problem, like a pointer problem, the operating system performs a 'core dump'. This means it takes a snapshot of the program in memory, and saves it to a disk file called 'core'. This process is often accompanied by Linux displaying a message like 'Segmentation fault (core dumped)'. Inside this file is everything you need to identify the exact line of code that caused the core dump (identifying why it occurred is still up to you). If no core file appears, core dumps may be disabled in your shell; in bash, the command 'ulimit -c unlimited' enables them for the current session.

Using a core dump file is extremely easy. Just take the core dump file and the executable (compiled with the -g switch), and run the gdb program with the executable name and the -c corefilename option. When gdb starts up, the first thing it will do is display the line that caused the core dump. You can try this with the crasher program above. An example command line would be:

gdb -c core crasher

 

Summary

That was a summary of the main tools used in Linux/Unix software development. The next document will tackle writing some C++ code, specifically to deal with 'sockets'. Sockets are how Internet programs like FTP clients, browsers, and email programs communicate with each other. The goal of that class will be to introduce some actual Linux system programming, as well as to establish some good C++ practices and class design.