Harley Hahn's Guide to Unix and Linux

Chapter 12...

Using the Shell: Variables and Options

Interactive and Non-Interactive Shells
The Environment, Processes and Variables
Forking Till You Die
Environment Variables and Shell Variables
Displaying Environment Variables: env, printenv
Displaying Shell Variables: set
Displaying and Using the Value of a Variable: echo, print
Bourne Shell Family: Using Variables: export, unset
C-Shell Family: Using Variables: setenv, unsetenv, set, unset
Shell Options: set -o, set +o
Displaying Shell Options
Machine-readable, Human-readable
Exercises

Many people do not take the time to learn how to use the shell well. This is a mistake. To be sure, the shell — like all complex Unix programs — has many features you do not really need to understand. However, there are a number of fundamental ideas that are of great practical value. Here is my list:

• Interactive Shells
• Processes
• Environment Variables
• Shell Variables
• Shell Options
• Metacharacters
• Quoting
• External Commands
• Builtin Commands
• Search Path
• Command Substitution
• History List
• Autocompletion
• Command Line Editing
• Aliases
• Initialization Files
• Comments

If you look at this list and feel overwhelmed, I understand. You may even feel like asking, "Do I really need to learn all this stuff?"

The answer is yes, and it will take a bit of time, but don't worry. First of all, I will spread the material over three chapters, so you won't get too much at once. Second, I will make sure that we cover the topics in a way that each one leads to the next, so you won't feel confused. (In fact, we will be covering the topics in the order you see in the list.) Finally, as you come to appreciate the beauty of the shell, and it all starts to make sense, you will find yourself having a good time as we move from one idea to the next.

As you know, there are two shell families: the Bourne family (Bash, Korn shell) and the C-Shell family (C-Shell, Tcsh). When you learn how to use a shell, some of the details vary depending on which shell you are using, and that will be reflected in what I will be teaching you. Nevertheless, it is part of basic Unix to understand how each of the shell families approaches certain problems and to be able to appreciate those differences.

Thus, as you study the next three chapters, I'd like you to read all the sections and look at all the examples, regardless of which shell you use. Some people only study the shell they happen to be using at the time, but that is a mistake. My goal is for you to be comfortable with all the major shells. The way to do this is by paying attention to the basic principles, not by memorizing the esoteric details of one particular shell.

As you read this chapter, you will need to know what shell you are currently using. If you are not sure, you can display the name of the shell that started when you logged in by using the following command (which will make sense later in the chapter):

echo $SHELL

If you have temporarily changed to a different shell (see Chapter 11), you will, of course, know what shell you are using.
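If you ever want to double-check, here is a minimal sketch. It assumes your system has a POSIX-style ps command (Chapter 26); the variable names login_shell and current_shell are made up for this example. The idea is that SHELL records your login shell, while the special parameter $$ holds the process ID of the shell that is running right now.

```shell
# The login shell, as recorded in the SHELL environment variable.
login_shell=$SHELL
echo "login shell:   $login_shell"

# The shell that is running right now: $$ is its process ID,
# and ps reports the name of the command that process is running.
current_shell=$(ps -p $$ -o comm=)
echo "current shell: $current_shell"
```

If the two names differ, you have temporarily changed to a different shell.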

One last point before we start. Chapter 11 covers shells in general. If you have not already read that chapter, please take a few moments to look at it before you continue.

Interactive and Non-interactive Shells

An INTERACTIVE program is one that communicates with a person. When you run an interactive program, the input comes from your keyboard or your mouse, and the output is sent to your monitor. For example, when you use a word processor or a Web browser, you are using an interactive program.

A NON-INTERACTIVE program runs independently of a person. Typically, it will get its input from a file and write its output to another file. For example, when you compile a program (process it so it can be run), you are using a non-interactive program, the compiler.

At times, the lines between interactive and non-interactive programs can blur a bit. For instance, an interactive program might send output to a file or to a printer. Similarly, a non-interactive program might ask you to enter a bit of data at the keyboard, or might display a message on your monitor when something important happens.

Still, for practical purposes, it is usually simple to classify a program as being interactive (working with you) or non-interactive (working by itself). In general, interactive programs get their input from a person (keyboard, mouse) and send their output to a person (monitor, speakers). Non-interactive programs use input that comes from a non-human source (say, a file), and send their output to a non-human destination (another file). So, the question arises, what is the shell: interactive or non-interactive?

The answer is, it can be both. You will recall from Chapter 11 that the shell can act as both a user interface and a script interpreter. To use the CLI (command line interface) you open a terminal window or use a virtual terminal (see Chapter 6). When you see the shell prompt, you enter a command. The shell processes your command and then displays another prompt. As you work in this way, the shell is your user interface, and we say that you are using an INTERACTIVE SHELL.

Alternatively, you can create a set of commands, called a SHELL SCRIPT, which you save in a file. When you run your script, the shell reads the commands from the file and processes them one at a time without your input. When this happens, we say that you are using a NON-INTERACTIVE SHELL.

It is important to understand that, in each case, you are using the same type of shell. This is possible because shells are designed to work either interactively or non-interactively.

When you log in, a shell is started on your behalf and set to be interactive. Similarly, when you manually start a new shell from the command line (say, by typing bash or tcsh), the new shell is also set to be interactive.

On the other hand, when you run a shell script, a new shell is started automatically and given the task of interpreting your script. This new shell is set to be non-interactive. Once the job is done — that is, when the script is finished — the non-interactive shell is terminated.

So how does a shell know whether it should be interactive or non-interactive? It all depends on what options it is given when it starts. We will discuss shell options later in the chapter.
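As a small preview, here is a minimal sketch for the Bourne shell family (Bash, Korn shell). It relies on the special parameter $-, which holds the option letters currently in effect for the shell; the letter i is present when the shell is interactive. The variable name shell_mode is made up for this example.

```shell
# $- holds the option letters currently set for this shell.
# The letter i appears only when the shell is interactive.
case $- in
    *i*) shell_mode=interactive ;;
    *)   shell_mode=non-interactive ;;
esac
echo "This shell is $shell_mode"
```

If you type these lines at a shell prompt, you should see "interactive". If you save them in a file and run the file as a shell script, you should see "non-interactive".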

What's in a Name?

Shell

In everyday life, we use the word "shell" in two different ways, which can be confusing. We can talk about the idea of the shell in general, or we can refer to an instance of a shell that is running.

For example, you are a young man who is invited to a sorority party where there are a lot of cheerleaders. Someone introduces you to the prettiest girl in the room and to break the ice you ask her, "What shell do you use?" After talking to you for a few minutes she says, "Football players bore me. I like a man who understands the shell. Why don't you come over to my place and help me fine tune my kernel?" In this example, you and the girl are talking about the shell in general.

The next day, you are sitting in a lecture in your Unix class and the professor says, "...After you log in, a shell is started to act as your user interface. If you type the bash command, a new shell is started. When you run a shell script, another shell is started..." In this case, the professor is not talking about the shell as a general concept. He is talking about actual shells that are running.

To make sure that you understand the distinction, see if the following sentence makes sense to you: "Once you learn how to use the shell, you can start a new shell whenever you want."

The Environment, Processes and Variables

In Chapter 6, during the discussion of multitasking, I introduced the idea that, within a Unix system, every object is represented by either a file or a process. In simple terms, files hold data or allow access to resources; processes are programs that are executing. Thus, a shell that is running is a process. Similarly, any program that is started from within the shell is also a process.

As a process runs, it has access to what is called the ENVIRONMENT, a table of variables, each of which is used to hold information. To make sense out of this idea, we need to start with a basic question: What are variables and what can we do with them?

Let's start with a definition. A VARIABLE is an entity used to store data. Every variable has a name and a value. The NAME is an identifier we use to refer to that variable; the VALUE is the data that is stored within the variable.

Here is an example. As we discussed in Chapter 7, Unix uses a variable named TERM to store the name of the type of terminal you are using. The idea is that any program that needs to know your terminal type can simply look at the value of TERM. The most common values for TERM are xterm, linux, vt100 and ansi.

When you name your own variables, you have a lot of leeway: there are only two simple rules. First, a variable name may contain only uppercase letters (A-Z), lowercase letters (a-z), digits (0-9), and the underscore character (_). Second, a variable name must start with a letter or an underscore; it cannot begin with a digit. Thus, the variable names TERM, path and TIME_ZONE are valid; the name 2HARLEY is not.
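The two naming rules can be sketched as a small test, written here for the Bourne shell family. The function name is_valid_name is hypothetical (it is not a standard command), and the sketch assumes an ordinary English-language locale, so that the ranges A-Z and a-z mean exactly the unaccented letters.

```shell
# Return success (0) if $1 is a valid variable name, failure (1) otherwise.
is_valid_name() {
    case $1 in
        '')               return 1 ;;  # empty: not a name at all
        [0-9]*)           return 1 ;;  # rule 2: must not start with a digit
        *[!A-Za-z0-9_]*)  return 1 ;;  # rule 1: only letters, digits, underscore
        *)                return 0 ;;
    esac
}

# Try the names used in the text.
for name in TERM path TIME_ZONE 2HARLEY; do
    if is_valid_name "$name"; then
        echo "$name: valid"
    else
        echo "$name: not valid"
    fi
done
```

Only 2HARLEY should be reported as not valid, because it begins with a digit.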

When you use a Unix shell, there are two different types of variables. They are called "shell variables" and "environment variables", and we will talk about them throughout the chapter. As a general rule, there are only four different things you can do with variables. You can create them, check their value, change their value, or destroy them.

Informally, it can help to think of a variable as a small box. The name of the variable is on the box. The value of the variable is whatever is inside the box. For example, you might imagine a box named TERM that contains the word xterm. In this case, we say that the variable TERM has the value xterm.

With most programming languages, variables can contain a variety of different types of data: characters, strings, integers, floating-point numbers, arrays, sets, and so on. With the shell, however, variables almost always store only one type of data, a CHARACTER STRING, that is, a sequence of plain-text characters. For instance, in our example the TERM variable stores a string consisting of 5 characters: x, t, e, r and m.

When you create a variable, you will often give it a value at the same time, although you don't have to. If you don't, we say that the variable has a NULL value, which is the same as saying it has no value. This is like creating a box with a name, but not putting anything inside the box. If a variable has a null value, you can always give it a value later, should the need arise.
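Here is a minimal sketch of a null value, using Bourne shell family syntax (the details of creating variables are covered later in the chapter). The variable name BOX is made up to match the box analogy, and the notation ${#BOX} asks the shell for the number of characters in the value of BOX.

```shell
# Create a variable without giving it a value: the box is empty (null).
BOX=
echo "BOX holds ${#BOX} characters"

# Put something in the box later, when the need arises.
BOX=xterm
echo "BOX holds ${#BOX} characters: $BOX"
```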

Now let's see how variables, the environment and processes fit together. Consider the following scenario. You are at a shell prompt and you start the vi text editor (Chapter 22). In technical terms, we say that one process, the shell, starts another process, vi. (We'll talk about the details in the next section.)

When this happens, the first process is called the PARENT PROCESS or PARENT; the second process is the CHILD PROCESS or CHILD. In this case, the parent is the shell and the child is vi.

At the time the child process is created, it is given an environment which is a copy of the parent's environment. We say that the child INHERITS the parent's environment. This means that all the ENVIRONMENT variables that were accessible to the parent are now accessible to the child.

For instance, in our example, when vi (the child) is created, it inherits the environment of the shell (its parent). In particular, vi is now able to examine the value of the TERM variable in order to discover what type of terminal you are using. This enables vi to format its output properly for your particular terminal.

Forking Till You Die

(The following section explains some of the details involved when a parent process creates a child. If you have an interest in Unix programming, especially if you want to create your own shell scripts, the following material is important for you to understand. If programming has no attraction for you, you can skip this section. However, even if you do not want to program, the section is rather interesting, and you may want to read it just for fun.)

When a process needs to call upon the kernel to perform a service, it uses a SYSTEM CALL. Within Unix and Linux, there are many different system calls, and part of becoming a programmer is learning to use the most common ones. Three of the most important system calls are the ones used to create and use new processes. They are called fork, exec and wait.

The fork system call creates a copy of the current process. (Think of a hiking trail, where one path turns into two.) The wait system call causes the original process to pause until the new process stops. Finally, the exec system call changes the program that a process is running. What is amazing is that, by using only three basic system calls (with a few minor variations we can ignore), Unix processes are able to coordinate the elaborate interaction that takes place between you, the shell, and the programs that you choose to run.

To make it easy to talk about creating processes, we often use the words FORK, EXEC and WAIT as verbs. For example, I might tell you that each time a process forks, the existing process becomes the parent and the new process becomes the child.

When a process finally stops (for whatever reason) we say that it DIES or TERMINATES. In fact, as we will discuss in Chapter 26, whenever we choose to stop a process, we say that we KILL it. When a child dies, the parent that has been waiting for that child is woken up automatically. At the moment this happens, the dead child vanishes forever. (Unix programming is not for the faint of heart.)

In other words, in order to run a program, a process forks to create a child and then waits for that child to die. Once the child is created, it execs to run the program. When the program dies, the parent is woken up, causing the child to vanish.

As an example, here is what happens when you enter the command to start the vi text editor. As soon as you enter the command, the parent (your shell) forks to create a child, identical to itself. The parent then pauses and waits for the child to die. Meanwhile, the child execs to change from running a shell to running vi. What you notice is that, an instant after you enter the vi command, the shell prompt is replaced by the vi program.

When you finish with vi, you quit the program, which kills the child. The death of the child wakes up the parent, which causes the child to vanish. What you notice is that, an instant after you stop the vi program, it is replaced by a shell prompt.
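You can watch the same three steps from within a Bourne family shell. The following is only a sketch, not a description of what the shell does internally: it uses the shell's own & operator, exec command and wait command as stand-ins for the system calls. The variable names child_pid and child_status are made up for this example, and a tiny sh command that exits with status 7 takes the place of vi.

```shell
# fork: the & tells the shell to create a child process and carry on.
# exec: inside the child, exec replaces the child shell with another program
#       (here a second sh that simply exits with status 7).
( exec sh -c 'exit 7' ) &
child_pid=$!        # $! is the process ID of the most recent background child

# wait: the parent pauses until that child dies, then collects its exit status.
# The "||" form records a non-zero status without treating it as an error.
child_status=0
wait "$child_pid" || child_status=$?
echo "child $child_pid died with status $child_status"
```

The status reported by the parent is the value the child passed to exit, which shows that the parent really was woken up by the death of that particular child.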

You might ask, what if a parent forks and then dies unexpectedly, leaving the child all alone? In that case, the child is called an ORPHAN. But what happens when an orphan dies? Because there is no parent to wake up, the dead child cannot vanish.

When a child process dies (terminates), it is called a ZOMBIE. It stops being a zombie and vanishes when the parent process wakes up. If there is no parent to wake up, the process remains a zombie. This can happen when a program has a bug that allows it to create a child without waiting for it. If you inadvertently create a zombie, there is no way for you to get rid of it. (How can you kill something that is already dead?)

On older Unix systems, a zombie stays around forever (or until the system is rebooted, whichever comes first). On modern Unix systems, orphaned zombies are adopted by process #1, the init process (left over from the boot procedure), which then proceeds to destroy them without a trace of mercy.

Is Unix programming cool, or what?

Environment Variables and Shell Variables

If you are a programmer, you will understand the difference between global and local variables. In programming, a LOCAL VARIABLE exists only within the scope in which it was created. For example, let's say you are writing a program and you create a variable count to use only within a function named calculate. We would say that count is a local variable. More specifically, we would say that count is LOCAL to the function calculate. This means that, while calculate is running, the variable count exists. Once calculate stops running, the variable count ceases to exist.(*)

* Footnote

In this chapter, we will be talking about simple variables that store only one value at a time. If you plan to write shell scripts, you should know that both Bash and the Korn Shell allow you to also use one-dimensional arrays. (An array is a variable that contains a list of values.) For more details, see the Bash man page (look in the "Arrays" section) or the Korn shell man page (look in the "Parameters" section).

A GLOBAL VARIABLE, on the other hand, is available everywhere within a program. For example, let's say you are writing a program to perform statistical operations upon a long list of numbers. If you make the list a global variable, it is available to all parts of the program. This means, for example, that if one function or procedure makes a change to the list, other parts of the program will see that change.

The question arises: when you use a Unix shell, are there global and local variables similar to those used by programmers?

The answer is yes. All shells use global and local variables, and you need to know how they work. First, there are the environment variables we have already discussed. Since environment variables are available to all processes, they are global variables and, indeed, we often refer to them by that name(*).

* Footnote

In a strict programming sense, environment variables are not completely global, because changes made by a child process are not propagated back to the parent.

There is a good reason for this limitation. Allowing child processes to change environment variables for parent processes would be a massive source of confusion, bugs, and security holes.

Second, there are SHELL VARIABLES that are used only within a particular shell and are not part of the environment. As such, they are not passed from parent to child and, for this reason, we call them local variables.

As a general rule, local (shell) variables are used in one of two ways. First, they may hold information that is meaningful to the shell itself. For example, within the C-Shell and Tcsh, the ignoreeof shell variable is used to control whether or not the shell should ignore the eof signal when you press ^D (see Chapter 7).

Second, shell variables are used in shell scripts in the same way that local variables are used in ordinary programs: as temporary storage containers. Thus, when you write shell scripts, you create shell variables to use as temporary storage as the need arises.

So far, this is all straightforward. Shell variables are local to the shell in which they are created. Environment variables are global, because they are accessible to any process that uses the same environment.

In practice, however, there is a problem. This is because, when it comes to shells, the line between local and global variables is blurry. For that reason, I want to spend a few minutes explaining exactly how the shell handles variables. Moreover, there are significant differences between the Bourne and C-Shell families, so we'll have to talk about them separately. These concepts are so important, however, that I want you to make sure you understand how variables work with both families, regardless of which shell you happen to use right now.

Before I start, let me take a moment to explain how variables are named. There is a tradition with some programming languages that global variables are given uppercase names and local variables are given lowercase names. This tradition is used with the C-Shell family (C-Shell, Tcsh). Environment variables have uppercase names, such as HARLEY; shell variables have lowercase names, such as harley.

The Bourne shell family (Bash, Korn shell) is different: both shell variables and environment variables are traditionally given uppercase names. Why this is the case will become clear in a moment.

With most programming languages, a variable is either local or global. With the shell, there is a strange problem: some variables have meaning as both local and global variables. In other words, there are some variables that are useful to the shell itself (which means they should be shell variables), as well as to processes that are started by the shell (which means they should be environment variables).

The Bourne shell family handles this problem by mandating that every variable is either local only, or both local and global. There is no such thing as a purely global variable. For example, you might have two variables A and B, such that A is a shell variable, and B is both a shell variable and an environment variable. You cannot, however, have a variable that is only an environment variable. (Take a moment to think about this.)

So what happens when you create a variable? Within the Bourne shell family, you are only allowed to create local variables. That is, every new variable is automatically a shell variable. If you want a variable to also be an environment variable, you must use a special command called export. The export command changes a shell variable into a shell+environment variable. When you do this, we say that you EXPORT the variable to the environment.

Here is an example. (Don't worry about the details; we'll go over them later in the chapter.) To start, we will create a variable named HARLEY and give it a value of cool:

HARLEY=cool

At this point, HARLEY is only a shell variable. If we start a new shell or run a command, the new processes will not be able to access HARLEY because it is not part of the environment. Let us now export HARLEY to the environment:

export HARLEY

HARLEY is now both a shell variable and an environment variable. If we start a new shell or run a command, they will be able to access HARLEY.
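To see this for yourself, here is a minimal sketch for the Bourne shell family. It assumes that sh is available to act as the child process; the variable names in_child_before and in_child_after are made up for this example. The notation ${HARLEY:-not set} means "the value of HARLEY, or the words not set if HARLEY has no value".

```shell
# Create a shell variable. It is local: it is not part of the environment.
HARLEY=cool
in_child_before=$(sh -c 'echo "${HARLEY:-not set}"')
echo "before export, the child sees: $in_child_before"

# Export it. It is now both a shell variable and an environment variable.
export HARLEY
in_child_after=$(sh -c 'echo "${HARLEY:-not set}"')
echo "after export, the child sees:  $in_child_after"

# The child only has a copy: a change made by the child
# is not propagated back to the parent.
sh -c 'HARLEY=changed'
echo "the parent still has:          $HARLEY"
```

The last step illustrates the point made in the footnote earlier: environment variables are inherited from parent to child, never the other way around.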

So now you see why the Bourne shell family uses only uppercase letters for both shell variables and environment variables. Using uppercase makes the name stand out and, because there is no such thing as a pure environment variable, there is no easy way to distinguish between local and global. (Take another moment to think this through.)

As you can see, the way in which the Bourne shell family handles variables is bewildering, especially to beginners. In fact, the system used by these shells dates back to the first Bourne shell, developed in 1976 by Steve Bourne at Bell Labs (see Chapter 11). Two years later, in 1978, when Bill Joy was developing the C-Shell at U.C. Berkeley (also see Chapter 11), he decided to improve how variables were organized. To do so, he created a much simpler system in which there is a clear distinction between environment variables and shell variables.

In the C-Shell family, environment variables are created by the setenv command (described later) and are given uppercase names, such as TERM. Shell variables are created by the set command (also described later) and are given lowercase names, such as user.(*) For practical purposes, that's all there is to it.

* Footnote

For a long time, it has been fashionable to disparage the C-Shell, especially when comparing it to modern Bourne shells, such as Bash. I talked about this cultural belief in Chapter 11, when we discussed the essay Csh Programming Considered Harmful by Tom Christiansen.

The Bourne shells, however, inherited a number of serious design flaws that, in order to maintain backwards compatibility, cannot be changed. Consider, for example, the confusing way in which the Bourne shells handle local and global variables. The C-Shell, though it has its faults, reflects the insights of Bill Joy, a brilliant programmer who, in his youth, had an amazing flair for designing high-quality tools.

When it comes to choosing your own personal shell, don't let people influence you unduly. The modern version of the C-Shell (the Tcsh) is an excellent tool that, for interactive use, can hold its own against Bash and the Korn shell. (Perhaps it's time for someone to write a new essay called Don't Bash the C-Shell.)

However, the simplicity of the C-Shell system leaves one nagging problem. As I mentioned, there are certain variables that have meaning both within the shell and within all the child processes. The Bourne shell family avoids this problem by letting you use variables that are both local and global. The C-Shell family does not allow this.

Instead, the C-Shell family recognizes a handful of special quantities that need to be both local and global. The solution is to have a few special shell variables that are tied to environment variables. Whenever one of these variables changes, the shell automatically updates the other one.

For example, there is a shell variable named home that corresponds to the environment variable named HOME. If you change home, the shell will make the same change to HOME. If you change HOME, the shell will change home.

Of all the dual-name variables, there are only five that are important for everyday use, which I have listed in Figure 12-1. The TERM and USER variables should make sense to you now. The PATH variable will be explained later in the chapter. PWD and HOME will make sense after we have discussed the Unix file system (Chapter 23) and directories (Chapter 24).

Figure 12-1: C-Shell family: Connected shell/environment variables

With the C-Shell family, a few shell variables are considered to be the same as corresponding environment variables. When one variable of the pair is changed, the shell automatically changes the other one. For example, when home is changed, the shell automatically changes HOME, and vice versa. See text for details.

Shell Variable	Environment Variable	Meaning
cwd	PWD	your current/working directory
home	HOME	your home directory
path	PATH	directories to search for programs
term	TERM	type of terminal you are using
user	USER	current userid

What's in a Name?

cwd, PWD

Figure 12-1 shows the pairs of C-Shell variables that are connected to one another. As you can see, with one exception, every shell variable has the same name as its corresponding environment variable (disregarding lower and uppercase). The exception is cwd and PWD.

These variables contain the name of your working directory, which is sometimes called the current directory (see Chapter 24). Hence, the name cwd: current/working directory.

The PWD variable is named after the pwd command, which displays the name of the working directory. Interestingly enough, pwd is one of the original Unix commands. It stands for "print working directory", and it dates from the time that computer output was actually printed on paper (see Chapters 3 and 7).

Displaying Environment Variables: env, printenv

Although it is possible to create your own environment variables and shell variables, you won't need to do so unless you write programs. Most of the time, you will use the default variables.

But what are the default variables? To display them, you use the env command:

env

On many systems, there is another command you can use as well, printenv:

printenv

When you use env or printenv, there may be so many environment variables that they scroll off the screen. If so, use less to display the output one page at a time:

env | less
printenv | less

When you display your environment variables, you will notice they are not in alphabetical order. To sort the output, use the sort command (see Chapter 19) as follows:

env | sort | less
printenv | sort | less

This construction is called a "pipeline". We will talk about it in Chapters 15 and 16.
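When you only care about one variable, you do not need the whole list. Here is a minimal sketch, assuming your system has printenv (most do) and the grep filter, which selects lines that match a pattern. The variable name path_value is made up for this example.

```shell
# Display one environment variable by giving its name to printenv.
printenv PATH

# The same lookup using env and grep: each line of env output has the
# form NAME=value, so we select the line that starts with PATH=
env | grep '^PATH='

# Save the value so it can be used later.
path_value=$(printenv PATH)
echo "PATH is currently: $path_value"
```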

For reference, Figure 12-2 shows the most important environment variables and what they mean. The actual variables you see will vary depending on which operating system and which shell you are using. However, you should have most of the variables in the table. Don't worry if you don't understand everything: by the time you learn enough to care about using an environment variable, you will understand its purpose.

Figure 12-2: The most important environment variables

By default, Unix systems use a large number of environment variables. What you will find on your system depends on which operating system and which shell you are using.

The leftmost column shows which shells support each variable: B = Bash; K = Korn shell; C = C-Shell; T = Tcsh. A dot indicates that a shell does not support that variable.

Shells	Variable	Meaning
B K • •	CDPATH	directories searched by the cd command
B K • T	COLUMNS	width (in characters) of your screen or window
B K C T	EDITOR	default text editor
B K • •	ENV	name of environment file
B K • •	FCEDIT	history list: editor for fc command to use
B K • •	HISTFILE	history list: name of file used to store command history
B K • •	HISTSIZE	history list: maximum number of lines to store
B K C T	HOME	your home directory
• • • T	HOST	name of your computer
B • • •	HOSTNAME	name of your computer
B • • T	HOSTTYPE	type of host computer
B • • •	IGNOREEOF	number of eof signals (^D) to ignore before ending shell
B K C T	LOGNAME	current userid
B • • T	MACHTYPE	description of system
B K C T	MAIL	file to check for new mail
B K C T	MAILCHECK	how often (in seconds) the shell checks for new mail
B K • •	MAILPATH	files to check for new mail
B K • •	OLDPWD	your previous working directory
B • • T	OSTYPE	description of operating system
B K C T	PAGER	default program for displaying data (should be less)
B K C T	PATH	directories to search for programs
B K • •	PS1	your shell prompt (customize by changing this variable)
B K • •	PS2	special shell prompt for continued lines
B K C T	PWD	your working [current] directory
B K • •	RANDOM	random number between 0 and 32,767
B K • •	SECONDS	time (in seconds) since the shell was invoked
B K C T	SHELL	pathname of your login shell
B K C T	TERM	type of terminal you are using
B K • •	TMOUT	if you don't type a command, seconds until auto-logout
• K C T	TZ	time zone information
B K C T	USER	current userid
B K C T	VISUAL	default text editor (overrides EDITOR)

Displaying Shell Variables: set

To display all the shell variables along with their values, you use the set command with no options or arguments:

set

This command is simple and will work for all shells. There is, however, an important point you need to remember.

With the C-Shell family, the shell variables you see will all have lowercase names. By definition, they are local variables.

With the Bourne shell family, the shell variables all have uppercase names. However, you can't tell if a particular variable is a local or global variable just by looking at its name. If it is a shell variable only, it is local; if it is a shell variable and an environment variable, it is both local and global. (Remember, in the Bourne shell family, there are no purely global variables.) This means that when you use set to list your variables, there is no easy way to know which ones have not been exported to the environment.

— hint —

Strange but true: The only way to determine which Bourne shell variables are not exported is to compare the output of set to the output of env. If a variable is listed by set but not by env, it is a shell variable. If the variable is listed by set and by env, it is both a shell variable and an environment variable.

Obviously, this is confusing. However, it doesn't matter a lot because shell variables aren't used much with the Bourne shell family. To be sure, when you write shell scripts you will create local (shell) variables as you need them. But for day-to-day interactive work, it is the environment variables that are important, not the shell variables.
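
The comparison described in the hint above can be sketched as a short experiment for the Bourne shell family. The names LOCALVAR and GLOBALVAR are made up for this illustration; they are not standard variables. The grep command is used only to pick the two relevant lines out of the output.

```shell
# LOCALVAR and GLOBALVAR are hypothetical names used only for this demonstration.
LOCALVAR=shell_only      # a shell variable only (never exported)
GLOBALVAR=exported       # created as a shell variable...
export GLOBALVAR         # ...then exported, so it is also in the environment

# set lists both variables
set | grep -E '^(LOCALVAR|GLOBALVAR)='

# env lists only the variable that was exported
env | grep -E '^(LOCALVAR|GLOBALVAR)='
```

The first grep should show two lines and the second only one. The exact way the values are displayed (for example, with or without quotes) can vary from one shell to another.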

In the C-Shell family, things are different. There are a large number of shell variables, many of which are used to control the behavior of the shell. Earlier, I mentioned several of these variables: cwd, home, term and user. For reference, Figure 12-3 shows you these four, as well as the others I consider to be the most important. For a comprehensive list, see Appendix G. (In fact, you might want to take a moment right now to sneak a quick look at Appendix G, just to see how many shell variables the C-Shell family actually uses.)

This leaves us with one last question. If the C-Shell family uses shell variables to control the behavior of the shell, what does the Bourne shell family use? The answer is: an elaborate system called "shell options", which we will talk about later in the chapter. First, however, we need to cover a few more basic concepts related to using variables.

Figure 12-3: C-Shell family: The most important shell variables

With the C-Shell family, there are a great many shell variables that are used by the shell for special purposes. Here are the ones I consider to be the most useful. A more comprehensive list can be found in Appendix G.

The leftmost column shows which shells support each variable: C = C-Shell; T = Tcsh. A dot indicates that a shell does not support that variable.

Shells	Variable	Meaning
• T	autologout	if you don't type a command, time (in minutes) until auto-logout
C T	cdpath	directories to be searched by cd, chdir, popd
• T	color	cause ls-F command to use color
C T	cwd	your working [current] directory (compare to owd)
C T	filec	autocomplete: enable
C T	history	history list: maximum number of lines to store
C T	home	your home directory
C T	ignoreeof	do not quit shell upon eof signal (^D)
• T	implicitcd	typing directory name by itself means change to that directory
• T	listjobs	job control: list all jobs whenever a job is suspended; long = long format
• T	loginsh	set to indicate a login shell
C T	mail	list of files to check for new email
C T	noclobber	do not allow redirected output to replace a file
C T	notify	job control: notify immediately when background jobs are finished
• T	owd	your most recent [old] working directory (compare to cwd)
C T	path	directories to search for programs
C T	prompt	your shell prompt (customize by changing this variable)
• T	pushdsilent	directory stack: pushd and popd do not list directory stack
• T	pushdtohome	directory stack: pushd without arguments assumes home directory (same as cd)
• T	rmstar	force user to confirm before executing rm * (remove all files)
• T	rprompt	special prompt for right side of screen (hint: set to %~ or %/)
• T	savedirs	directory stack: before logout, save directory stack
C T	savehist	history list: before logout, save this number of lines
C T	shell	pathname of your login shell
C T	term	type of terminal you are using
C T	user	current userid
C T	verbose	debug: echo each command, after history substitution only
• T	visiblebell	use a screen flash instead of an audible sound

Displaying and Using the Value of a Variable: echo, print

If you want to display the values of all your environment variables at once, you can use the env or printenv command. If you want to display all your shell variables, you can use set. There will be many times, however, when you want to display the value of a single variable. In such cases, you use the echo command.

The job of the echo command is to display the value of anything you give it. For example, if you enter:

echo I love Unix

You will see:

I love Unix

(Which, by now, should be true.)

To display the value of a variable, you use a $ (dollar sign) character followed by the name of the variable enclosed in brace brackets (usually referred to as braces). For example, to display the value of TERM, you would enter:

echo ${TERM}

Try it on your system and see what you get.

If there is no ambiguity, you can leave out the braces.

echo $TERM

This will be the case most of the time, but I'll show you an example in a moment where you would need the braces.

When we talk about using variables in this way, we pronounce the $ character as "dollar". Thus, you might hear someone say, "If you want to display the value of the TERM variable, use the command echo-dollar-term."

The notation $NAME is important so I want you to remember it. When you type a variable name alone, it is just a name; when you type a $ followed by a name (such as $TERM), it refers to the value of the variable with that name. Thus, the echo command above means, "Display the value of the variable TERM."

Consider the following example, similar to the previous one but without the $ character. In this case, the echo command will simply display the characters TERM, not the value of the TERM variable:

echo TERM

You can use echo to display variables and text in any way you want. For example, here is a more informative message about the type of your terminal:

echo The terminal type is $TERM

If your terminal type is, say, xterm, you will see:

The terminal type is xterm

Within the shell, some punctuation characters — they are called "metacharacters" — have special meanings (we'll talk about this in Chapter 13). To keep the shell from interpreting metacharacters, you can enclose them in double quotes. This tells the shell to take the characters literally. For example, let's say you want to display the value of TERM within angled brackets. You might try:

echo The terminal type is <$TERM>.

The < and > characters, however, are metacharacters used for "redirection" (see Chapter 15), and the command won't work. (Try it.) Instead, you need to use:

echo "The terminal type is <$TERM>."

— hint —

When you use the echo command to display punctuation, use double quotes to tell the shell not to interpret the punctuation as metacharacters.
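
One detail is worth making explicit: inside double quotes, the < and > characters lose their special meaning, but the $ character does not, which is why the variable is still replaced by its value. Here is a minimal sketch, using Bourne shell family syntax (explained later in the chapter) and a made-up variable named KIND so the result does not depend on your own terminal type:

```shell
# KIND is a hypothetical variable, set here only for this example.
KIND=xterm

# Inside double quotes, < and > are ordinary characters,
# but $KIND is still replaced by the value of the variable.
message="The terminal type is <$KIND>."
echo "$message"
```

With KIND set to xterm, this displays: The terminal type is <xterm>.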

When you use the echo command, you have a lot of flexibility. For example, you can display more than one variable:

echo $HOME $TERM $PATH $SHELL

If the variable is not separated from its neighbors, you must use braces to delimit it. For example, say that the variable ACTIVITY has the value surf. The command:

echo "My favorite sport is ${ACTIVITY}ing."

will display:

My favorite sport is surfing.
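
To see why the braces matter, compare the two forms side by side. This sketch uses Bourne shell family syntax to set ACTIVITY. It assumes that no variable named ACTIVITYing exists, which is what the shell looks for when the braces are left out.

```shell
ACTIVITY=surf

# With braces: the shell substitutes the value of ACTIVITY, then appends "ing".
with_braces="My favorite sport is ${ACTIVITY}ing."

# Without braces: the shell looks for a variable named ACTIVITYing.
# Assuming no such variable exists, an empty string is substituted.
without_braces="My favorite sport is $ACTIVITYing."

echo "$with_braces"      # My favorite sport is surfing.
echo "$without_braces"   # My favorite sport is .
```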

— hint —

If you write shell scripts, you will find yourself using the echo command a lot. Take a moment to check out the man page (man echo), where you will find a variety of options and features you can use to control the format and content of the output.

— hint for Korn shell users: —

All shells let you use the echo command to display text and variables. With the Korn shell, you can also use the print command:

print "The terminal type is $TERM."

The developer of the Korn shell, David Korn (see Chapter 11), created print to replace echo. He did this because the two main versions of Unix at the time, System V and BSD (see Chapter 2), used echo commands that had slightly different features. This meant that shell scripts that used echo were not always portable from one system to another.

To solve the problem, Korn designed print to work the same on all systems. This is not as much of an issue today as it was in Korn's day. Still, if you are writing a Korn shell script that you know will be run on more than one computer, it is prudent to use print instead of echo.

Bourne Shell Family: Using Variables: export, unset

With the Bourne shell family, it is easy to create a variable. All you do is type a name, followed by an = (equal sign) character, followed by a value. The value must be a string of characters. The syntax is:

NAME=value

As I mentioned earlier, a variable name can use letters, numbers or an underscore (_). However, a variable name cannot start with a number.

Here is an example you can try (use your own name if you want). Be careful not to put spaces around the equal sign:

HARLEY=cool

When you create a variable in this way, we say that you SET it. Thus, we can say that the previous example sets the variable HARLEY and gives it a value of cool.

If you want to use a value that contains whitespace (spaces or tabs; see Chapter 10), put the value in double quotes:

WEEDLY="a cool cat"

Once a variable exists, you can use the same syntax to change its value. For example, once you have created HARLEY, you can change its value from cool to smart by using:

HARLEY=smart

Within the Bourne shell family, every new variable is automatically a shell variable. (See the discussion earlier in the chapter.) To export a variable to the environment, you use the export command. Type export followed by the name of one or more variables. The following example exports HARLEY and WEEDLY:

export HARLEY WEEDLY

Both HARLEY and WEEDLY have now been changed from shell variables to shell+environment variables.
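
One way to see what export actually changes is to ask a child process what it can see. In the following sketch, sh -c starts a new shell as a child process. The single quotes keep the current shell from substituting the variables itself, so it is the child that looks them up. To show the contrast, only WEEDLY is exported here, and the sketch assumes that HARLEY is not already in your environment.

```shell
HARLEY=cool            # shell variable only
WEEDLY="a cool cat"    # shell variable only
export WEEDLY          # WEEDLY is now in the environment as well

# The child shell inherits the environment, so it sees WEEDLY but not HARLEY.
child_view=$(sh -c 'echo "HARLEY=[$HARLEY] WEEDLY=[$WEEDLY]"')
echo "$child_view"     # HARLEY=[] WEEDLY=[a cool cat]
```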

As we discussed in Chapter 10, you can enter multiple commands on the same line by separating them with a semicolon. Because it is common to create a variable and export it immediately, it makes sense to enter the two commands together, for example:

PAGER=less; export PAGER

This is faster than entering two separate commands and you will see this pattern used a lot, especially in Unix documentation and in shell scripts. However, there is an even better way. The export command actually lets you set a variable and export it at the same time. The syntax is:

export NAME[=value]...

Here is a simple example:

export PAGER=less

If you look closely at the syntax, you can see that export allows you to specify one or more variable names, each of which may have a value. Thus, you can do a lot with one command:

export HARLEY WEEDLY LITTLENIPPER
export PAGER=less EDITOR=vi PATH="/usr/local/bin:/usr/bin:/bin"

— hint —

As a rule, the very best Unix users tend to think fast. As such, they favor commands that are as easy to type as possible. For this reason, the preferred way to set and export a variable is with a single command:

export PAGER=less

Although many people use two commands to set and export a variable, using a single command to do a double job marks you as a person of intelligence and distinction.

As I mentioned, when you create a variable, we say that you set it. When you delete a variable, we say that you UNSET it. You will rarely have to unset a variable, but if the need arises, you can do so with the unset command. The syntax is simple:

unset NAME...

Here is an example:

unset HARLEY WEEDLY
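
The full life cycle of a variable (set, export, unset) can be sketched as follows. The check uses env, so it reports only whether the variable is in the environment at each step. The helper function in_environment is defined just for this sketch, and the example assumes that HARLEY is not already in your environment when you start.

```shell
# Report whether HARLEY is currently part of the environment.
# (in_environment is a helper defined only for this sketch.)
in_environment() {
    if env | grep -q '^HARLEY='; then echo yes; else echo no; fi
}

HARLEY=cool;   after_set=$(in_environment)      # set, but not exported
export HARLEY; after_export=$(in_environment)   # now in the environment
unset HARLEY;  after_unset=$(in_environment)    # the variable no longer exists

echo "after set: $after_set, after export: $after_export, after unset: $after_unset"
```

Under those assumptions, the last line displays: after set: no, after export: yes, after unset: no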

— hint —

Interestingly enough, within the Bourne shell family, there is no easy way to remove a variable from the environment. Once a variable is exported, the only way to un-export it is to unset it.

In other words, the only way to remove a Bourne shell variable from the environment is to destroy it.(*)

* Footnote

Riddle: How is a Bourne shell variable like the spotted owl?

C-Shell Family: Using Variables: setenv, unsetenv, set, unset

As we have discussed, the C-Shell family — unlike the Bourne shell family — has a clear separation between environment variables and shell variables. In other words, the C-Shell clearly distinguishes between global and local variables. For this reason, you will find that working with variables is easier with the C-Shell family than with the Bourne shell family.

To set (create) and unset (delete) environment variables, you use setenv and unsetenv. To set and unset shell variables, you use set and unset.

The syntax of the setenv command is as follows:

setenv NAME [value]

where NAME is the name of the variable; value is the value to which you want to set the variable. Notice we do not use an = (equal sign) character.

Here are some examples in which we create environment variables. If you want to experiment, remember that, once you have created environment variables, you can display them using the env or printenv commands.

setenv PATH /usr/local/bin:/usr/bin:/bin
setenv HARLEY cool
setenv WEEDLY "a cool cat"
setenv LITTLENIPPER

The first three commands set a variable and give it a specified value. In the third example, we use double quotes to contain whitespace (the two spaces). In the last example, we specify a variable name (LITTLENIPPER) without a value. This creates the variable with a null value, that is, with no value. We do this when we care only that a variable exists, but we don't care about its value.

To unset an environment variable, you use the unsetenv command. The syntax is:

unsetenv NAME

where NAME is the name of the variable.

For example, to unset (delete) the variable HARLEY, you would use:

unsetenv HARLEY

To set a shell variable, you use a set command with the following syntax:

set name[=value]

where name is the name of a shell variable; value is the value to which you want to set the variable.

Here are several examples:

set term=vt100
set path=(/usr/bin /bin /usr/ucb)
set ignoreeof

The first example is straightforward. We set the value of the shell variable term to vt100.

The second example illustrates an important point. When you are using variables with the C-Shell family, there are times when you enclose a set of character strings in parentheses, rather than double quotes. When you do so, it defines a set of strings that can be accessed individually. In this case the value of path is set to the three character strings within the parentheses.

In the last example, we specify a shell variable without a value. This gives the variable a null value. In this case, the fact that the variable ignoreeof exists tells the shell to ignore the eof signal. This requires us to use the logout command to end the shell. (See Chapter 7.)

Once a shell variable exists, you can delete it by using the unset command. The syntax is:

unset variable

where variable is the name of a variable.

As an example, if you want to tell the shell to turn off the ignoreeof feature, you would use:

unset ignoreeof

Make sure you understand the difference between setting a variable to null and deleting the variable. Consider the following three commands:

set harley=cool
set harley
unset harley

The first command creates a shell variable named harley and gives it a value of cool. The second command sets the value of harley to null. The final command deletes the variable completely.

Shell Options: set -o, set +o

As we discussed earlier, with the C-Shell family, we can control various aspects of the shell's behavior by using shell variables. With the Bourne shell family, we use what are called SHELL OPTIONS. For instance, it is shell options that control whether a shell is interactive or non-interactive.

Shell options act like on/off switches. When you turn on an option, we say that you SET it. This tells the shell to act in a certain way. When you turn off an option, we say that you UNSET it. This tells the shell to stop acting in that way.

For example, the shell supports a facility called "job control" to let you run programs in the background. (We'll talk about this in Chapter 26.) To turn on job control, you set the monitor option. If you want to turn off job control, you unset the monitor option. By default, monitor is turned on for interactive shells.

— hint —

The words "set" and "unset" have different meanings depending on whether we are talking about shell options or variables.

Shell options are either off or on; they do not need to be created. Thus, when we set a shell option, we turn it on. When we unset an option, we turn it off.

Variables are different. When we set a variable, we actually create it. When we unset a variable, we delete it permanently.

There are two ways in which shell options can be set or unset. First, at the time a shell is started, options can be specified in the usual manner, by specifying one or more options with the command (see Chapter 10). For example, the following command starts a Korn shell with the monitor option set (turned on):

ksh -m

In addition to the standard command-line options, there is another way to turn shell options on and off, using a variation of the set command. Here is the syntax. To set an option, use:

set -o option

To unset an option, you use:

set +o option

where option is the "long name" of an option (see Figure 12-4).

For example, say the shell is running and you want to set the monitor option. Use:

set -o monitor

To unset the monitor option, use:

set +o monitor

Be careful that you type o, the lowercase letter "o", not a zero. (Just remember, o stands for "option".)
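
As a quick check that set -o and set +o do what you expect, you can change an option and then look at its state. This sketch uses noclobber rather than monitor, simply because it is harmless to toggle. To display the state, it uses set -o with no option name (described in the next section) and picks out the relevant line with grep.

```shell
set -o noclobber                       # set (turn on) the option
state_on=$(set -o | grep noclobber)    # the line that reports noclobber

set +o noclobber                       # unset (turn off) the option
state_off=$(set -o | grep noclobber)

echo "$state_on"
echo "$state_off"
```

The first line of output should end with on and the second with off. The spacing between the option name and its state varies from shell to shell.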

At first, it will seem strange to use -o to turn an option on and +o to turn it off. However, I promise you, it will make sense eventually(*).

* Footnote

Here is the short explanation. As you know from Chapter 10, the standard form of an option is a - (hyphen) character followed by a single letter. It happens that most of the time when you modify shell options, you will want to set them — that is, turn them on — not unset them. For this reason, the common syntax (-o) is used for "set", and the less common syntax (+o) is used for "unset".

Over time, as you gain experience, this type of reasoning will start to make sense to you. As that happens, something changes in your brain and using Unix becomes much easier. (Unfortunately, this same change also makes it harder to meet cheerleaders at sorority parties.)

Every time a shell starts, the various options are either set or unset by default, according to whether the shell is interactive or non-interactive. The programmers who designed the shell knew what they were doing and, in most cases, the shell options are just fine the way they are. This means that you will rarely have to change a shell option.

However, if you do, you can use Figure 12-4 for reference: it shows the shell options that are the most useful with interactive shells. As with the environment variables we discussed earlier, don't worry if you don't understand everything. This list is for reference. By the time you learn enough to care about using an option, you will understand its purpose.

Figure 12-4: Bourne shell family: Summary of options for interactive shells

This table summarizes the shell options that are useful with an interactive shell. For more information, see the man page for your particular shell.

The leftmost column shows which shells support each option: B = Bash; K = Korn shell. A dot indicates that a shell does not support that option. Notice that some options, such as history, have a long name but not a short option name.

Notes: (1) Although Bash supports the emacs and vi options, it does not use -E and -V. (2) The Korn shell uses -h, but does not support the long name hashall.

Shells	Option	Long Name	Meaning
B K	-a	allexport	export all subsequently defined variables and functions
B •	-B	braceexpand	enable brace expansion (generate patterns of characters)
B K	-E	emacs	command line editor: Emacs mode; turns off vi mode
B K	-h	hashall	hash (remember) locations of commands as they are found
B •	-H	histexpand	history list: enable !-style substitution
B •		history	history list: enable
B K	-I	ignoreeof	ignore eof signal ^D; use exit to quit shell (see Chapter 7)
• K		markdirs	when globbing, append / to directory names
B K	-m	monitor	job control: enable
B K	-C	noclobber	do not allow redirected output to replace a file
• K		nolog	history list: do not save function definitions
B K	-b	notify	job control: notify immediately when background job is finished
• K		trackall	aliases: substitute full pathnames for commands
B K	-V	vi	command line editor: vi mode; turns off Emacs mode
• K		viraw	in vi mode: process each character as it is typed

Aside from what you see in Figure 12-4, there are many other shell options, most of which are useful with non-interactive shells (that is, when you are writing shell scripts). In addition, if you use Bash, there is a special command called shopt ("shell options") that gives you access to yet more options.

I have collected the full set of shell options, plus some hints about shopt in Appendix G. Although you don't need to understand all this material right now, I'd like you to take a moment to look at Appendix G, just so you know what's available.

— hint —

For definitive information about shell options, see your shell man page:

man bash
man ksh

With Bash, search for "SHELL BUILTIN COMMANDS". With the Korn shell, search for "Built-in Commands" or "Special Commands".

Displaying Shell Options

The Bourne shell family uses shell options to control the operation of the shell. To display the current value of your shell options, use either set -o or set +o by themselves:

set -o
set +o

Using set -o displays the current state of all the options in a way that is easy to read. Using set +o displays the same information in a compact format that is suitable for using as data to a shell script or a program.

If the output is too long for your screen, send it to less, which will display it one screenful at a time:

set -o | less
set +o | less

If you would like to practice setting and unsetting options, try doing so with the ignoreeof option. As we discussed in Chapter 7, you can terminate a shell by pressing ^D (the eof key). However, if the shell happens to be your login shell, you will be logged out.

Unfortunately, it is all too easy to press ^D by accident and log yourself out unexpectedly. To guard against this you can set the ignoreeof option. This tells the shell not to terminate when you press ^D. Instead, you must enter exit or logout. To set this option, use the following command:

set -o ignoreeof

To unset the option, use:

set +o ignoreeof

Try experimenting by setting, unsetting and displaying the options. Each time you make a change, display the current state of the options, then press ^D and see what happens.

— hint —

Unless you are an advanced user, the only options you need to concern yourself with are ignoreeof, monitor and noclobber, and either emacs or vi.

The monitor option enables job control, which I discuss in Chapter 26. The noclobber option prevents you from accidentally removing a file when you redirect the standard output (see Chapter 15). The emacs and vi options are used to specify which built-in editor you want to use to recall and edit previous commands. This is explained later in the chapter.

These options are best set from within your environment file, an initialization file that is executed automatically each time a new shell is started. We will discuss this file in Chapter 14.
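
To make the noclobber option concrete, here is a minimal sketch of what it protects against. The file is a scratch file created with the mktemp command, so nothing of yours is at risk. The braces and 2>/dev/null are there only to hide the shell's own error message, whose wording differs from one shell to another.

```shell
scratch=$(mktemp)                      # a temporary scratch file
echo "original contents" > "$scratch"

set -o noclobber                       # refuse to overwrite existing files
# This redirection fails because the file already exists.
{ echo "new contents" > "$scratch"; } 2>/dev/null || echo "overwrite refused"
set +o noclobber                       # put the option back the way it was

contents=$(cat "$scratch")
echo "$contents"                       # original contents
rm -f "$scratch"                       # clean up the scratch file
```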

What's in a Name?

set

You will notice that, in just this one chapter, we have used the set command in several different ways, each with its own syntax. We have used set to display shell variables, to create shell variables, to turn shell options on and off, and to display shell options.

If you take a careful look, you can see that we are actually dealing with four different commands that happen to have the same name. Apparently, there is something about the name set that makes programmers want to use it a lot.

This is not as odd as it sounds. Did you know that, in English, the word "set" has more distinct meanings than any other word in the language? Check with a dictionary. I promise you will be amazed.

Machine-readable, Human-readable

When a program displays complex output in a way that it can be used as data for another program, we say that the output is MACHINE-READABLE. Although this term conjures up the image of a robot reading like a person, all it means is that the output is formatted in a way that is suitable for a program to process. For example, you might have a table of census data formatted as lists of numbers separated by commas, instead of organized into columns. Although this would be awkward for you or me to read, it would be suitable input for a program.

When output is designed to be particularly easy to read, we sometimes say that it is HUMAN-READABLE. This term is not used much, but you will encounter it within the man pages for the GNU utilities which, as we discussed in Chapter 2, are used with many types of Unix, including Linux. In fact, many commands have options that are designed specifically to produce human-readable output.

As an example, in the previous section, I mentioned that the set -o command displays output in a way that is easy to read, while set +o displays output suitable for using as data for a shell script. Another way to say this is that set -o produces human-readable output, while set +o produces machine-readable output.
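
Because the output of set +o consists of commands the shell itself can read, a script can save the current option settings, change them, and later put them back. The following is a minimal sketch of that idea. It relies on eval, a command not covered in this chapter, to execute the saved text as commands.

```shell
before=$(set -o | grep noclobber)   # human-readable state, kept for comparison
saved=$(set +o)                     # machine-readable: a list of set commands

set -o noclobber                    # change an option temporarily

eval "$saved"                       # run the saved commands to restore all options
after=$(set -o | grep noclobber)

[ "$before" = "$after" ] && echo "options restored"
```

Here the human-readable form is used only to confirm, by comparing a single line, that the machine-readable form did its job.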

Of course, not everyone is the same. Just because something is supposed to be human-readable, doesn't mean you personally will like it better than the machine-readable counterpart. For example, I happen to like the output of set +o better than the output of set -o(*).

* Footnote

But then, I also like putting peanut butter on avocado.

Exercises

Review Question #1:

What is the difference between an interactive shell and a non-interactive shell?

Review Question #2:

The environment is a table of variables available to the shell and to any program started by that shell. What type of variables are stored within the environment? Give three examples.

What type of variables are not part of the environment?

Review Question #3:

With the Bourne shell family (Bash, Korn shell), what command do you use to make a shell variable part of the environment?

Review Question #4:

How do you display the values of all your environment variables? Your shell variables? A single variable?

Review Question #5:

Explain the terms "machine-readable" and "human-readable".

Applying Your Knowledge #1:

The environment variable USER contains the name of the current userid. Show three different ways to display the value of this variable.

Applying Your Knowledge #2:

Create an environment variable named SPORT and give it the value "surfing". Display the value of the variable.

Start a new shell. Display the variable again to show that it was passed to the new shell as part of the environment.

Change the value of SPORT to "running", and display the new value.

Now quit the shell and return to the original shell. Display the value of SPORT. What do you see and why?

Applying Your Knowledge #3:

Within the Bourne shell family (Bash, Korn shell), the ignoreeof option tells the shell not to log you out when you press ^D (the eof key).

Start either Bash or a Korn shell. Check if ignoreeof is turned on or off. If it is off, turn it on.

Press ^D to confirm that you do not log out. Then turn ignoreeof off. Press ^D again to confirm that you do indeed log out.

For Further Thought #1:

Environment variables are not true global variables, because changes made by a child process are not propagated back to the parent. Suppose this were not the case. Show how a hacker might use this loophole to cause trouble on a multiuser system on which he has an account.

For Further Thought #2:

The C-Shell family has a clear separation between environment variables and shell variables. The Bourne shell family is not so clear, because a newly created variable is, by default, a shell variable. If it is exported, it becomes both a local and an environment variable.

What are the advantages and disadvantages of each system?

Which system do you think is better, and why?

Why do you think the Bourne shell family has such a confusing way of dealing with variables?
