Computer Program Technology: Shell Programming

Shell Programming - WHY?
While it is very nice to have a shell at which you can issue commands, have you had the feeling that something is missing? Do you feel the urge to issue multiple commands by only typing one word? Do you feel the need for variables, logic conditions and loops? Do you strive for automation?
If so, then welcome to shell programming.
(If you answered no to any of the above then you are obviously in the wrong frame of mind to be reading this - please try again later :)

Shell programming allows system administrators (and users) to create small (and occasionally not-so-small) programs for various purposes including automation of system administration tasks, text processing and installation of software.

Shell Programming - WHAT?
A shell program (sometimes referred to as a shell script) is a text file containing shell and UNIX commands. Remember - a UNIX command is a physical program (like cat, cut and grep), whereas a shell command is an "interpreted" command - there isn't a physical file associated with the command; when the shell sees the command, the shell itself performs certain actions (for example, echo).
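If you want to check which kind of command you are dealing with, bash can tell you. The sketch below assumes bash (the -t option of type is a bash feature, not something every shell offers) and simply asks about the two commands mentioned above:

```shell
#!/bin/bash
# Ask bash how it would run each command.
# "builtin" means the shell performs the action itself;
# "file" means a physical program found on disk.
kind_echo=`type -t echo`
kind_cat=`type -t cat`
echo "echo is a $kind_echo"
echo "cat is a $kind_cat"
```

On a typical system this should report echo as a builtin and cat as a file.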

When a shell program is executed the shell reads the contents of the file line by line. Each line is executed as if you were typing it at the shell prompt. There isn't anything that you can place in a shell program that you can't type at the shell prompt.

Shell programs contain most things you would expect to find in a simple programming language. Programs can contain services including:
* variables
* logic constructs (IF THEN AND OR etc)
* looping constructs (WHILE FOR)
* functions
* comments (strangely the least used service)
The way in which these services are implemented is dependent on the shell that is being used (remember - there is more than one shell). While the variations are often not major, it does mean that a program written for the Bourne shell (sh/bash) will not necessarily run in the C shell (csh). All the examples in this chapter are written for the Bourne shell.

Shell Programming - HOW?
Shell programs are a little different from what you'd usually class as a program. They are plain text and they don't need to be compiled. The shell "interprets" shell programs - the shell reads the shell program line by line and executes the commands it encounters. If it encounters an error (syntax or execution), it is just as if you typed the command at the shell prompt - an error is displayed.
This is in contrast to C/C++, Pascal and Ada programs (to name but a few) which have source in plain text, but require compiling and linking to produce a final executable program.
So, what are the real differences between the two types of programs? At the most basic level, interpreted programs are typically quick to write/modify and execute (generally in that order and in a seemingly endless loop :). Compiled programs typically require writing, compiling, linking and executing, thus are generally more time consuming to develop and test.
However, when it comes to executing the finished programs, the execution speeds are often widely separated. A compiled/linked program is a binary file containing a collection of direct system calls. The interpreted program, on the other hand, must first be processed by the shell, which then converts the commands to system calls or calls other binaries - this makes shell programs slow in comparison. In other words, shell programs are not generally efficient on CPU time.
Is there a happy medium? Yes! It is called Perl. Perl is an interpreted language but is interpreted by an extremely fast, optimised interpreter. It is worth noting that a Perl program will be executed inside one process, whereas a shell program will be interpreted from a parent process but may launch many child processes in the form of UNIX commands (ie. each call to a UNIX command is executed in a new process). However, Perl is a far more difficult (but extremely powerful) tool to learn - and this chapter is called "Shell Programming"...

The Basics
A Basic Program
It is traditional at this stage to write the standard "Hello World" program. To do this in a shell program is so obscenely easy that we're going to examine something a bit more complex - a hello world program that knows who you are...
To create your shell program, you must first edit a file - name it something like "hello", "helloworld" or something equally imaginative - just don't call it "test" - we will explain why later.

In the editor, type the following:
#!/bin/bash
# This is a program that says hello
echo "Hello $LOGNAME, I hope you have a nice day!"

(You may change the text of line three to reflect your current mood if you wish)
Now, at the prompt, type the name of your program - you should see something like:
bash: ./helloworld: Permission denied
Why?
The reason is that your shell program isn't executable because it doesn't have its execution permissions set. After setting these (Hint: something involving the chmod command), you may execute the program by again typing its name at the prompt.
An alternate way of executing shell programs is to issue a command at the shell prompt to the effect of:
<shell> <shell program>
eg
bash helloworld
This simply instructs the shell to take a list of commands from a given file (your shell script). This method does not require the shell script to have execute permissions. However, in general you will execute your shell scripts via the first method.
And yet you may still find your script won't execute - why? On some UNIX systems (Red Hat Linux included) the current directory (.) is not included in the PATH environment variable. This means that the shell can't find the script that you want to execute, even when it's sitting in the current directory! To get around this either:
* Modify the PATH variable to include the "." directory:
PATH=$PATH:.
* Or, execute the program with an explicit path:
./helloworld
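Putting the steps so far together, the whole sequence can be sketched as below. This is only an illustration: the scratch directory made with mktemp -d is an assumption of the sketch (you would normally work in your own directory), and u+x is just one of several permission settings that would do the job.

```shell
#!/bin/bash
# Work in a scratch directory (an assumption of this sketch) so nothing
# in your own directories is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create the helloworld program from the text.
cat > helloworld <<'EOF'
#!/bin/bash
# This is a program that says hello
echo "Hello $LOGNAME, I hope you have a nice day!"
EOF

# Give the owner execute permission, then run the program with an
# explicit path so that PATH does not need to contain ".".
chmod u+x helloworld
./helloworld
```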

An Explanation of the Program
Line one, #!/bin/bash is used to indicate which shell the shell program is to be run in. If this program was written for the C shell, then you might have #!/bin/csh instead.
It is probably worth mentioning at this point that UNIX "executes" programs by first looking at the first two bytes of the file (this is similar to the way MS-DOS looks at the first two bytes of executable programs; all .EXE programs start with "MZ"). From these two characters, the system knows if the file is an interpreted script (#!) or some other file type (more information can be obtained about this by typing man file). If the file is an interpreted script, then the system looks for a following path indicating an interpreter. For example:
#!/bin/bash
#!/usr/bin/perl
#!/bin/sh
Are all valid interpreters.
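You can look at the first line of a file yourself to see which interpreter it names. The following is a rough sketch only; interpreter_of is a hypothetical helper name made up for this illustration, and the temporary file stands in for a real script.

```shell
#!/bin/bash
# interpreter_of (hypothetical helper): print the interpreter path named
# on the first line of a file, if that line starts with "#!".
interpreter_of() {
    read -r firstline < "$1"
    case "$firstline" in
        '#!'*) echo "${firstline#??}" ;;   # strip the two leading characters
        *)     echo "not an interpreted script" ;;
    esac
}

# A throwaway file to try the helper on.
tmpfile=/tmp/shebang.$$
printf '#!/bin/bash\necho hi\n' > "$tmpfile"
result=`interpreter_of "$tmpfile"`
echo "$result"
rm -f "$tmpfile"
```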
Line two, # This is a program that says hello , is (you guessed it) a comment. The "#" in a shell script is interpreted as "anything to the right of this is a comment, go onto the next line". Note that it is similar to line one except that line one has the "!" mark after the comment.
Comments are a very important part of any program - it is a really good idea to include some. The reasons why are standard to all languages - readability, maintenance and self congratulation. This is even more important for a system administrator, as they very rarely remain at one site for their entire working career; therefore, they must work with other people's shell scripts (as other people must work with theirs).

Always have a comment header; it should include things like:
# AUTHOR:       Who wrote it
# DATE:         Date first written
# PROGRAM:      Name of the program
# USAGE:        How to run the script; include any parameters
# PURPOSE:      Describe in more than three words what the
#               program does
#
# FILES:        Files the shell script uses
#
# NOTES:        Optional but can include a list of "features"
#               to be fixed
#
# HISTORY:      Revisions/Changes

This format isn't set in stone, but use common sense and write fairly self documenting programs.
Line three, echo "Hello $LOGNAME, I hope you have a nice day!" is actually a command. The echo command prints text to the screen. Normal shell rules for interpreting special characters apply for the echo statement, so you should generally enclose most text in "". The only tricky bit about this line is the $LOGNAME . What is this?
$LOGNAME is a shell variable; you can see it and others by typing "set" at the shell prompt. In the context of our program, the shell substitutes the $LOGNAME value with the username of the person running the program, so the output looks something like:
Hello jamiesob, I hope you have a nice day!
All variables are referenced for output by placing a "$" sign in front of them - we will examine this in the next section.
Exercises
8.1    Modify the helloworld program so its output is something similar to:
Hello <username>, welcome to <machine name>
All You Ever Wanted to Know About Variables
You have previously encountered shell variables and the way in which they are set. To quickly revise, variables may be set at the shell prompt by typing:
Shell_Prompt: variable="a string"
Since you can type this at the prompt, the same syntax applies within shell programs.
You can also set variables to the results of commands, for example:
Shell_Prompt: variable=`ls -al`
(Remember - the ` is the execute quote)
To print the contents of a variable, simply type:
Shell_Prompt: echo $variable
Note that we've added the "$" to the variable name. Variables are always accessed for output with the "$" sign, but without it for input/set operations.
Returning to the previous example, what would you expect to be the output?
You would probably expect the output from ls -al to be something like:
drwxr-xr-x   2 jamiesob users        1024 Feb 27 19:05 ./
drwxr-xr-x 45 jamiesob users        2048 Feb 25 20:32 ../
-rw-r--r--   1 jamiesob users         851 Feb 25 19:37 conX
-rw-r--r--   1 jamiesob users       12517 Feb 25 19:36 confile
-rw-r--r--   1 jamiesob users           8 Feb 26 22:50 helloworld
-rw-r--r--   1 jamiesob users       46604 Feb 25 19:34 net-acct
and therefore you would expect that printing a variable containing the output from that command would show something similar, yet you may be surprised to find that it looks something like:
drwxr-xr-x 2 jamiesob users 1024 Feb 27 19:05 ./ drwxr-xr-x 45 jamiesob users 2048 Feb 25 20:32 ../ -rw-r--r-- 1 jamiesob users 851 Feb 25 19:37 conX -rw-r--r-- 1 jamiesob users 12517 Feb 25 19:36 confile -rw-r--r-- 1 jamiesob users 8 Feb 26 22:50 helloworld -rw-r--r-- 1 jamiesob users 46604 Feb 25 19:34 net-acct
Why?
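When the output of a command is placed in a shell variable and then printed with an unquoted $variable, the end-of-line markers are lost: the shell splits the value into words on whitespace, and echo prints those words separated only by single spaces. (Strictly speaking, the variable still holds the line breaks; echo "$variable", with the quotes, would preserve them.) The use for this will become more obvious later, but for the moment, consider what the following script will do: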
#!/bin/bash
filelist=`ls`
cat $filelist
Exercise
8.2    Type in the above program and run it. Explain what is happening. Would the above program work if "ls -al" was used rather than "ls" - Why/why not?
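To see for yourself what happens to the end-of-line markers when a variable holding command output is printed, try something like the sketch below. The printf command is only a stand-in for a command that produces several lines (such as ls -al), so that the result is the same on every machine.

```shell
#!/bin/bash
# Two lines of output, standing in for a longer listing.
listing=`printf 'line one\nline two\n'`

# Unquoted: the shell splits the value into words and echo joins
# them with single spaces, so the line break disappears.
unquoted=`echo $listing`

# Quoted: the value is passed to echo as one word, line break and all.
quoted=`echo "$listing"`

echo "$unquoted"
echo "$quoted"
```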
Predefined Variables
There are many predefined shell variables, most established during your login. Examples include $LOGNAME, $HOSTNAME and $TERM - these names are not always standard from system to system (for example, $LOGNAME can also be called $USER). There are however, several standard predefined shell variables you should be familiar with. These include:
$$    (The current process ID)
$?    (The exit status of the last command)
How would these be useful?
$$
$$ is extremely useful in creating unique temporary files. You will often find the following in shell programs:
some command > /tmp/temp.$$
.
.
some commands using /tmp/temp.$$
.
.
rm /tmp/temp.$$
/tmp/temp.$$ would always be a unique file - this allows several people to run the same shell script simultaneously. Since one of the only unique things about a process is its PID (Process-Identifier), this is an ideal component in a temporary file name. It should be noted at this point that temporary files are generally located in the /tmp directory.
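A small runnable version of this pattern might look like the following. It is a sketch only: date and wc stand in for "some command" and "some commands using" the file, and the variable names are made up for the example.

```shell
#!/bin/bash
# Build a temporary file name from this process's PID.
tmpfile=/tmp/temp.$$

# "some command" writing to the temporary file
date > "$tmpfile"

# "some commands using" the temporary file
lines=`wc -l < "$tmpfile"`
echo "$tmpfile holds $lines line(s)"

# Clean up when finished.
rm "$tmpfile"
```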
$?
$? becomes important when you need to know if the last command that was executed was successful. All programs have a numeric exit status - on UNIX systems 0 indicates that the program was successful, any other number indicates a failure. We will examine how to use this value at a later point in time.
Is there a way you can show if your programs succeeded or failed? Yes! This is done via the use of the exit command. If placed as the last command in your shell program, it will enable you to indicate, to the calling program, the exit status of your script.
exit is used as follows:
exit 0        # Exit the script, $? = 0 (success)
exit 1        # Exit the script, $? = 1 (fail)
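As a quick illustration of $? and exit working together, consider the sketch below. It uses the standard true and false commands, plus a subshell in parentheses so that the exit does not end the whole script; the variable names are made up for the example.

```shell
#!/bin/bash
# true always succeeds, false always fails.
true
status_ok=$?

false
status_fail=$?

# exit inside ( ) ends only the subshell; its status appears in $?.
( exit 3 )
status_sub=$?

echo "$status_ok $status_fail $status_sub"
```

Note that $? is overwritten by every command, so it must be saved straight away if you want to use it later.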
Another category of standard shell variables are shell parameters.
Parameters - Special Shell Variables
If you thought shell programming was the best thing since COBOL, then you haven't even begun to be awed - shell programs can actually take parameters. Table 8.1 lists each variable associated with parameters in shell programs:
Variable      Purpose
$0            the name of the shell program
$1 to $9      the first to ninth parameters
$#            the number of parameters
$*            all the parameters passed, represented as a single word with individual parameters separated by spaces
$@            all the parameters passed, with each parameter as a separate word
Table 8.1: Shell Parameter Variables
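The difference between $* and $@ only shows up when they are placed inside double quotes. The sketch below assumes bash; count_args is a hypothetical helper made up for the illustration, and set -- is used to fake a parameter list so the example can be run without typing any parameters.

```shell
#!/bin/bash
# count_args (hypothetical helper): print how many words it was given.
count_args() { echo $#; }

# Pretend the script was run with two parameters,
# the first of which contains a space.
set -- "one two" three

star=`count_args "$*"`      # quoted $*: everything joined into one word
at=`count_args "$@"`        # quoted $@: each parameter stays a separate word
unquoted=`count_args $*`    # unquoted: split again on every space

echo "$star $at $unquoted"
```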
The following program demonstrates a very basic use of parameters:
#!/bin/bash
# FILE:         parm1
VAL=`expr ${1:-0} + ${2:-0} + ${3:-0}`
echo "The answer is $VAL"
Pop Quiz: Why are we using ${1:-0} instead of $1? Hint: What would happen if any of the variables were not set?
A sample testing of the program looks like:
Shell_Prompt: parm1 2 3 5
The answer is 10

Shell_Prompt: parm1 2 3
The answer is 5

Shell_Prompt: parm1
The answer is 0
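If you would like to experiment with the ${1:-0} form without creating a file each time, the same expression can be wrapped in a function. This is a sketch only: add3 is a hypothetical name, and function parameters are assumed to behave like script parameters here (which they do in bash).

```shell
#!/bin/bash
# add3 (hypothetical): the parm1 calculation as a function.
# ${1:-0} means "the value of $1, or 0 if $1 is unset or empty".
add3() {
    expr ${1:-0} + ${2:-0} + ${3:-0}
}

full=`add3 2 3 5`
partial=`add3 2 3`
none=`add3`

echo "$full $partial $none"
```

Try replacing ${3:-0} with a plain $3 and calling add3 2 3 to see why the defaults matter.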
Consider the program below:
#!/bin/bash
# FILE:         mywc

FCOUNT=`ls $* 2> /dev/null | wc -w`
echo "Performing word count on $*"
echo
wc -w $* 2> /dev/null
echo
echo "Attempted to count words on $# files, found $FCOUNT"

If the program was run in a directory containing:
conX         net-acct           notes.txt        shellprog~     t1~
confile         netnasties          notes.txt~   study.htm       ttt
helloworld    netnasties~   scanit*        study.txt               tes/
my_file          netwatch     scanit~         study_~1.htm
mywc*         netwatch~       shellprog       parm1*
Some sample testing would produce:
Shell_Prompt: mywc mywc
Performing word count on mywc

34 mywc

Attempted to count words on 1 files, found       1
Shell_Prompt: mywc mywc anotherfile
Performing word count on mywc anotherfile

34 mywc
34 total

Attempted to count words on 2 files, found       1
Exercise
8.3    Explain line by line what this program is doing. What would happen if the user didn't enter any parameters? How could you fix this?
Only Nine Parameters?
Well, that's what it looks like, doesn't it? We have $1 to $9 - what happens if we try to access $10? Try the code below:
#!/bin/bash
# FILE:        testparms
echo "$1 $2 $3 $4 $5 $6 $7 $8 $9 $10 $11 $12"
echo $*
echo $#
Run testparms as follows:
Shell_Prompt: testparms a b c d e f g h i j k l
The output will look something like:
a b c d e f g h i a0 a1 a2
a b c d e f g h i j k l
12
Why?
Only nine parameters, $1 to $9, can be referenced with the plain $N form at any one time. When the shell sees "$10" it interprets this as "$1" followed by a "0", therefore resulting in the "a0" string. Yet $* still shows all the parameters you typed!
To our rescue comes the shift command. shift works by removing the first parameter from the parameter list and shuffling the parameters along. Thus $2 becomes $1, $3 becomes $2 etc. Finally, (what was originally) the tenth parameter becomes $9. However, beware! Once you've run shift, you have lost the original value of $1 forever - it is also removed from $* and $@. shift is executed by, well, placing the word "shift" in your shell script, for example:
#!/bin/bash
echo $1 $2 $3
shift
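A slightly fuller sketch of the tenth-parameter problem and of shift follows. It assumes bash; set -- is used to fake the parameter list from the testparms run, and the variable names are made up for the example. The ${10} form with braces is not covered in the text above, but bash and other POSIX-style shells accept it as another way to reach the tenth parameter.

```shell
#!/bin/bash
# Pretend the script was run with twelve parameters.
set -- a b c d e f g h i j k l

tenth_wrong="$10"       # read as $1 followed by a literal 0
tenth_braces="${10}"    # braces let the shell see the whole number

shift                   # drop $1; every parameter moves down one place
tenth_shift="$9"        # what was the tenth parameter is now $9

echo "$tenth_wrong $tenth_braces $tenth_shift $#"
```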

Saturday, 8 September 2012
