Chapter 4. The shell

Table of Contents
Command line editing
Wildcards
Environment variables
The PATH variable
Programs, processes and the shell
Job control
Redirection and piping
Aliases

Command line editing

When working in the shell it is often necessary to correct typing errors at the beginning of the line, and it is sometimes convenient to reuse commands that have been given before. The shell contains powerful editing mechanisms, with arrow keys and control sequences (press the Ctrl key and another key simultaneously), that make such changes simple and efficient. Use arrow-up to access an earlier command. The cursor is moved along the line with arrow-left and arrow-right. The Insert key toggles (changes back and forth) between the default character-insert mode and character-overwrite mode. The following control sequences are also useful:

With the exception of C-h the control characters have the same meaning as in emacs, see Chapter 5.

Another mechanism that helps you work faster is completion (invoked with the TAB key), which works for both commands and file names. Try $ cat /etc/hostn<TAB> to see how it works!


Wildcards

It is often convenient to refer to a whole group of files at the same time. The shell provides a mechanism for this, called wildcards. The most important ones are

Try
$ ls /dev/tty*
$ ls /dev/tty?
$ ls /dev/tty{a,b}{3,4}
(The /dev directory contains many device files that give access, through the kernel, to the drivers for the hardware; the details do not matter here. The only reason to use that directory in this exercise is that it contains a large number of files with somewhat similar names.)
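
Since the contents of /dev differ from machine to machine, the three patterns used above can also be tried on a known set of files. The following is a minimal sketch, assuming bash (brace expansion is not available in every shell); the scratch directory and the file names are made up for the illustration.

```shell
# Work in a scratch directory so the patterns match a known set of files.
start_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch tty1 tty2 tty10 ttya3 ttya4 ttyb3 ttyb4

# '*' matches any run of characters, including an empty one.
star_matches=$(echo tty*)
# '?' matches exactly one character, so only tty1 and tty2 qualify.
question_matches=$(echo tty?)
# Braces are expanded to every combination, whether or not such files exist.
brace_matches=$(echo tty{a,b}{3,4})

cd "$start_dir"
echo "tty*           : $star_matches"
echo "tty?           : $question_matches"
echo "tty{a,b}{3,4}  : $brace_matches"
```

Note that it is the shell, not ls or echo, that expands the patterns; the command only sees the resulting list of names.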

File names in a Linux system may contain any character except "/" and the null character (which marks the end of a string). It is nevertheless wise to avoid certain characters in file names, since the shell assigns special meanings to them. These characters include "*", "$", "?", "!", " ", ";", "\", "|", "&", "~", "{", "}", "[", "]".
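
If such a character does end up in a file name, quoting keeps the shell from interpreting it. The sketch below uses a space as the example; the file name and the helper function count_args are hypothetical and exist only for this illustration.

```shell
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() { echo "$#"; }

scratch=$(mktemp -d)
name="my notes.txt"

# Unquoted, the space splits the name into two separate arguments.
unquoted_args=$(count_args $name)
# Quoted, the name stays in one piece.
quoted_args=$(count_args "$name")

# With quotes, exactly one file called "my notes.txt" is created.
touch "$scratch/$name"
ls "$scratch"
echo "unquoted: $unquoted_args argument(s), quoted: $quoted_args argument(s)"
```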


Environment variables

Some aspects of the behavior of the shell are governed by environment variables. These variables are passed on to all processes that are started from the shell. The command printenv shows a list of the environment variables. Some of these are:

USER=olsson
SHELL=/usr/local/bin/bash
OSTYPE=linux
HOME=/Home/guests/olsson
TERM=xterm
PATH=/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games
The syntax for changing environment variables differs between shells. In bash (which is the default shell at lu.umu.se) the command
$ export MYVAR=myvalue
sets up MYVAR as an environment variable. It can then be examined with printenv or the echo command:
$ echo $MYVAR
myvalue
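
What export adds can be seen by comparing an exported variable with a plain shell variable. This is a small sketch; MYLOCAL is a made-up name, and the child process is simply another shell started with sh -c.

```shell
MYLOCAL=local-only        # plain shell variable: known only to this shell
export MYVAR=myvalue      # environment variable: passed on to child processes

# The child shell prints the value it sees, or "unset" if it sees none.
child_view=$(sh -c 'echo "MYVAR=${MYVAR:-unset} MYLOCAL=${MYLOCAL:-unset}"')
echo "$child_view"
```

The child reports a value for MYVAR only; MYLOCAL never leaves the shell in which it was set.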


The PATH variable

The most direct way to execute a command is to write the path name of the command, e.g. /bin/ls. It is, however, also possible (and much more common, as in all our examples above) to write only a command name, e.g. ls. The connection between command names and the path names of the files is made through the PATH variable.

The PATH variable is a list of directories where the system looks for commands to be executed. One common setting is

PATH=/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games
which is a colon-separated list of directories. When a command such as "ls" is given to the shell, the system first looks for an executable file named ls in the "/usr/local/bin" directory. If it is not found there, the search continues with "/usr/bin" and so on.

The connection between the command name ls and the executable file (program) /bin/ls is shown by the which command:

$ which ls
/bin/ls
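
The search through the directories in PATH can be written out as a loop. The function find_in_path below is a hypothetical helper for illustration, not a standard command, and it is only a rough sketch of the lookup: a real shell also considers aliases, functions and built-in commands before it searches PATH. It assumes bash (for the local keyword).

```shell
# find_in_path NAME: print the first executable called NAME found via PATH.
find_in_path() {
    local name=$1 dir
    local IFS=:                       # split PATH at the colons
    for dir in $PATH; do
        [ -z "$dir" ] && dir=.        # an empty entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0                  # stop at the first match
        fi
    done
    return 1                          # not found in any directory
}

find_in_path ls
```

Because the loop stops at the first match, a program placed in a directory early in PATH takes precedence over one with the same name further down the list.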

It is common for users to put their own commands in a directory like "/home/olsson/bin". To make these commands directly accessible as commands in the shell, just add "/home/olsson/bin" to the path:

$ export PATH=/home/olsson/bin:$PATH
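
The effect can be tried without touching a real home directory. In this sketch a temporary directory stands in for "/home/olsson/bin", and greet-demo is a made-up command name; it assumes that files in the temporary directory are allowed to be executed.

```shell
# A temporary directory plays the role of the user's own bin directory.
mybin=$(mktemp -d)

# Create a tiny command of our own and make it executable.
cat > "$mybin/greet-demo" <<'EOF'
#!/bin/sh
echo "hello from my own command"
EOF
chmod +x "$mybin/greet-demo"

# Put the directory first in PATH; the command can now be run by name alone.
export PATH=$mybin:$PATH
greet_output=$(greet-demo)
echo "$greet_output"
```

The change to PATH only lasts for the current shell session; to make it permanent the export line is normally placed in a shell start-up file.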


Programs, processes and the shell

The concept of a process is central to UNIX systems. Note that program and process are different things:

The ps command (process status) shows (among other things) the process id, the controlling terminal (e.g. pts/2), the CPU time used and the command name. The option -a is needed to see processes attached to other terminals as well, not only the ones with the same controlling terminal as the shell. The example below illustrates the use of ps -a and of the kill command, which is one way to stop the execution of a process.
$ ps -a
  PID TTY          TIME CMD
 4679 pts/2    00:00:00 bash
 5071 pts/2    00:00:05 xterm
 6373 pts/4    00:00:00 top
 6374 pts/2    00:00:00 ps
$ kill 6373
$ ps -a
  PID TTY          TIME CMD
 4679 pts/2    00:00:00 bash
 5071 pts/2    00:00:05 xterm
 6375 pts/2    00:00:00 ps
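In cases where a plain kill is not enough (by default it sends the TERM signal, which a process may catch or ignore), try kill -9 6373, which sends the KILL signal and is a "sure kill".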
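
The same steps can be tried safely on a process of your own. The sketch below starts a harmless sleep in the background instead of using a process id taken from ps; the variable names are chosen for the illustration.

```shell
# Start a long-running process in the background; $! holds its process id.
sleep 60 &
pid=$!

# Ask the process to terminate (plain kill sends the TERM signal).
kill "$pid"

# wait collects the exit status of the background process.
status=0
wait "$pid" || status=$?
echo "process $pid ended with status $status"
```

In bash, a status above 128 means that the process was ended by a signal; the signal number is the status minus 128, so a process ended by TERM (signal 15) is reported as 143.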

The discussion and the examples above have illustrated that the actions we take in a shell result in processes being started. One remarkable thing is that the shell is itself an ordinary process with no special rights. An ordinary user can write a program that works as their own shell, with whatever syntax they want.
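
That the shell is a process like any other can be seen from the fact that it has a process id of its own, available in the special parameter $$. A minimal sketch, in which a second shell is started as a child and reports its own id:

```shell
# $$ expands to the process id of the shell that is running these lines.
parent_pid=$$

# Start another shell as a child process and let it print its own id.
# The single quotes keep $$ from being expanded by the parent.
child_pid=$(sh -c 'echo $$')

echo "this shell runs as process $parent_pid"
echo "the child shell ran as process $child_pid"
```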


Job control

We only mention two aspects of job control:


Redirection and piping

The output from commands like cat or ls is usually written to standard output. There are however cases when one instead wants the output written to a file. One way to solve this would be to ask the program to open a file and direct the output to that file.

The Unix way of doing this is instead to let the shell handle the redirection, i.e. before executing the command, the shell opens the file. The command is then executed with standard output directed to the file. To illustrate this, we use the find command (which is normally used with some of its many options; see the man page for a quick introduction).

$ cd
$ find .
This gives a list of all the files in and below your home directory.

We now do the same but also redirect the output to the file find-output:

$ find . > find-output
and examine the content of the file with
$ cat find-output
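
That it is the shell, and not the command, that opens the file can be seen with a command that writes nothing at all. In this sketch the scratch directory and the file names are made up for the illustration.

```shell
outdir=$(mktemp -d)

# 'true' produces no output, yet the file exists afterwards (and is empty):
# the shell created it before the command was started.
true > "$outdir/empty-output"

# echo just writes to its standard output; it is never told the file name.
echo "one line" > "$outdir/echo-output"

ls -l "$outdir"
cat "$outdir/echo-output"
```

A consequence worth remembering is that > empties an existing file before the command runs, even if the command then fails.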

The list produced by find is in general unsorted. To get it ordered we can use the sort command, which sorts lines of input. We now redirect the input:

$ sort < find-output

This is all very nice, but it is possible to take one more step. We can actually do without the file find-output and instead directly connect the output of find to the input of sort. That is called piping and is performed with the "|" symbol. The construct

$ find . | sort
gives exactly the same output, but now in a more efficient way that is easy to generalize to rather complex constructs.
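
The two routes can be compared directly. This sketch builds a small directory tree in a temporary location (all names are made up) and keeps the intermediate file outside the tree, so that find does not list it.

```shell
# A small tree to search, and a separate file for the intermediate result.
tree=$(mktemp -d)
listing=$(mktemp)
mkdir "$tree/b" "$tree/a"
touch "$tree/b/two" "$tree/a/one"

# Route 1: write to a file, then let sort read from that file.
find "$tree" > "$listing"
via_file=$(sort < "$listing")

# Route 2: connect find directly to sort with a pipe.
via_pipe=$(find "$tree" | sort)

if [ "$via_file" = "$via_pipe" ]; then
    echo "both routes give the same sorted list"
fi
rm "$listing"
```

With the pipe there is no intermediate file to name or clean up, and further commands can be appended with more "|" symbols.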


Aliases

A simple way to change the behavior of the shell to fit your needs is to define aliases. For example, it could be good to be asked for confirmation before the rm command actually removes the file(s). That behavior is given by rm -i, but to make it happen each time you write rm, define the alias

$ alias rm='rm -i'

The alias command with no argument lists the currently defined aliases, and unalias deletes an alias.
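
The life cycle of an alias, from definition to removal, can be sketched as follows. The alias name ll is only an example. The alias is deliberately not invoked here, because bash expands aliases only in interactive shells unless the expand_aliases option is set, so this sketch also works when run as a script.

```shell
# Define an alias; 'll' is an arbitrary example name.
alias ll='ls -l'

# 'alias NAME' shows the definition of that one alias.
definition=$(alias ll)
echo "$definition"

# 'unalias NAME' removes it again.
unalias ll

# Asking for a removed alias fails, which we use to confirm the removal.
if ! alias ll 2>/dev/null; then
    echo "ll is no longer defined"
fi
```

An alias defined at the prompt disappears when the shell exits; to keep it, the alias line is normally placed in a shell start-up file.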

Several aliases are defined by default in this Linux system. Use

$ alias
to get the current list of aliases. (The source command in your own ".tcshrc" is responsible for reading in the system file with these alias definitions.)