.. _csh2:

.. highlight:: tcsh

*********************************
Shell Scripting 2
*********************************

This lab continues covering shell scripts, looking at conditionals, loops, and other useful tools.

================
Conditionals
================

You can include ``if`` statements in shell scripts to control the flow of your script. For simple one-line ``if`` statements, the syntax is ``if (<condition>) <command>``, all on a single line. This will execute ``<command>`` if ``<condition>`` is true. For a simple example: ::

    #!/bin/csh
    # simple if statement example

    set a = 1
    if ($a == 1) echo "a is one"

More complex ``if`` statements use the syntax below: ::

    #!/bin/csh
    # complex if statement

    set a = 1
    set b = 2

    if ( $a == 2 ) then
        echo "a is two"
    else if ( $b == 2 ) then
        echo "b is two"
    else
        echo "neither variable is two"
    endif

The syntax here differs from ``bash`` shell scripts, which end ``if`` statements with ``fi`` rather than ``endif``.

Besides the string comparisons used above, there are a number of file tests that you can use:

* ``-e <file>`` -- test if ``<file>`` exists

* ``-d <file>`` -- test if ``<file>`` exists and is a directory

* ``-f <file>`` -- test if ``<file>`` exists and is a regular file

* ``-r <file>`` -- test if ``<file>`` exists and has read permissions

* ``-w <file>`` -- test if ``<file>`` exists and has write permissions

* ``-x <file>`` -- test if ``<file>`` exists and has execute permissions

* ``-z <file>`` -- test if ``<file>`` is empty (has zero size)

These tests are useful for checking that necessary conditions hold before performing an action that depends on them. There is no built-in assertion in the shell, so I frequently use conditionals to check for conditions that would otherwise trip up my shell script.

Other useful tools for comparisons include ``!=`` (not equal to), ``<`` (less than), ``>`` (greater than), ``<=`` (less than or equal to), ``>=`` (greater than or equal to), ``&&`` (logical and), and ``||`` (logical or). Comparisons involving math can be done with ``bc``.
To test if the sum of variables ``x`` and ``y`` is less than some value, you can do this with ::

    if ( `echo "$x + $y < 3" | bc` ) echo "x and y sum to less than three"

To do this, we use backwards quoting to put the standard output of the command ``echo "$x + $y < 3" | bc`` into the ``if`` statement. Note that we need double quotes here in order to substitute the variable values into the string before we pipe it into the ``bc`` command. The resulting expression evaluates to ``1`` if the statement is true, and ``0`` if the statement is false, two values that are compatible with the ``if`` statement. The same test can be done by computing the sum with shell math and saving it in another variable, so ``bc`` is not strictly necessary here, but it is a useful trick.

======================
For and While Loops
======================

You can also use loops in a shell script, both of the ``for`` and ``while`` flavors. ``while`` loops have the following syntax: ::

    #!/bin/csh
    # while loop example

    set a = 1

    while ( $a != 10 )
        echo $a
        @ a++
    end

You can use ``break`` or ``continue`` within a ``while`` loop to either exit the loop or skip to the next iteration, respectively. ``bash`` uses a different syntax (``while (<condition>); do <commands>; done``).

For loops use the following syntax to iterate over all elements of an array: ::

    foreach <var> (<array>)
        <commands>
    end

So to print out the same numbers as we saw with the above ``while`` loop, we could use the following script: ::

    #!/bin/csh
    # for loop example

    set x = (1 2 3 4 5 6 7 8 9)

    foreach var ($x)
        echo $var
    end

``bash`` again uses a different syntax for this; in ``bash`` the syntax is ``for <var> in <array>; do <commands>; done``. If you put the ``for``, ``do``, and ``done`` on separate lines, the semicolons are not necessary.

Loops are probably the most important construct in the shell for automating data analysis. They allow you to perform analysis on many different files in a robust, reproducible way.
For instance, if you need to loop over all files matching a certain wildcard pattern, you can use backwards quoting and a loop: ::

    #!/bin/csh
    # perform an action on all files ending in .txt

    foreach file (`ls *.txt`)
        echo "Processing $file"
        myprogram $file > output_${file}.txt
    end

The backward quoting stores the result of the ``ls`` command in an array that is then looped over, with ``file`` taking on each result from the ``ls`` command in turn. This is a very handy way of repeating actions on many files without having to do large amounts of typing.

You can also create an array of integers using the ``seq`` shell command with backwards quotes. ``seq 0 1 9`` gives a list of integers starting with 0, ending with 9, and incrementing by 1: ::

    #!/bin/csh
    # perform an action on 10 files with numerical identifiers
    # each file is named filen.txt, where n is a number 0-9

    foreach n (`seq 0 1 9`)
        echo "Processing file $n"
        myprogram file${n}.txt > output_${n}.txt
    end

However, note that ``seq`` is not a standard UNIX program, so it may not be available on all systems. If it is not available, you can always use AWK to create the numbers, again using backwards quotes.

=================
Functions
=================

Shell scripts in ``bash`` can define internal functions, which is not possible in the C Shell. However, you can write external shell scripts that take input arguments (``$1``, ``$2``, etc.) and can serve as functions. More lightweight, single-command tasks that are repeated can be shortened using aliases (recall that the syntax for setting an alias is ``alias <name> <command>``). I find aliases in the shell particularly useful for writing GMT scripts that work for both GMT 4 and GMT 5, which we will see starting next week when we cover GMT.

===============================
Standard Input Using ``<<``
===============================

As mentioned in the AWK labs, the ``<<`` operator, which supplies standard input to a command, is a useful tool in shell scripts when a command needs only short input.
Why is this a good idea? If we need to give input to a command via standard input, we have a few options:

1. You could simply type out the entry into a separate file. This is fine in many cases, but it has two drawbacks. First, the file exists separately from the shell script, so it is not clear that the two files are related to each other, and you now have to keep track of multiple files and might inadvertently delete the input file. Second, and most importantly, what if the input changes depending on some other conditions in the shell script? There is no way to tailor the input to the specific task being done.

2. To get around this, a better approach would be to produce the input file within the shell script. That way, you can use variables to modify the input accordingly if the input needs to be adapted to the task. One way to do this is to use a print statement and output redirection: ::

       #!/bin/csh

       echo "1 2 3 4 \
       5 6 7 8" > tmp.txt

       cat tmp.txt

   This should write the desired numbers to a file and print its contents to the terminal. If you needed to change the input based on some variable value, you would replace the appropriate entry with a variable.

While the second approach is more robust, it has one downside: it still produces an extra file that is saved in the current directory. This leads to unneeded clutter unless you explicitly clean up after yourself (which is nevertheless a good idea in shell scripts!).

It turns out that we can eliminate the file altogether by using the ``<<`` operator. ``<<`` tells the shell that input text will follow, to be terminated by a user-defined keyword. This keyword can be anything that you want (provided it does not appear in the text to be input to the command). A common choice that is easily understood is ``END`` or ``EOF`` (End Of File). As an example, the following shell script will do the same thing as above, without writing the data to a file (thus avoiding the need to clean up after ourselves).

::

    #!/bin/csh

    cat << END
    1 2 3 4
    5 6 7 8
    END

Essentially, the ``<<`` operator says that the following lines are input, until you run into the word that follows (``END`` in this case). You can put shell variables into this text, giving you flexibility to use one script for multiple situations, and there are no additional files to keep track of or maintain. Obviously, this is only practical for short snippets of text, as longer input would make the shell script unreadable, but it is by far the best method for small files. We will use this method below for calling SAC from a shell script, and in the upcoming classes on GMT, so it is a good idea to become acquainted with this method of entering input to shell commands.

===================================
Calling SAC From a Shell Script
===================================

I personally use the above tricks frequently when using SAC, and do most of the complex programming, looping, and file manipulation in the shell rather than directly in SAC. I find that the variable definitions and looping abilities of the shell are more powerful than those in SAC, so I take advantage of them by writing shell scripts that call SAC for nearly all of my SAC work.

SAC can take commands from standard input. I use the ``<<`` technique above to embed short SAC macros in shell scripts, so that I only have one file to maintain and can use shell variables to handle input and things like loops.

Here is an example. Imagine that I have downloaded a series of SAC files from the internet, and want to low-pass filter them all using a user-specified corner frequency, save the files to disk, and then plot them all.
The following is how I would do this using a shell script: ::

    #!/bin/csh
    # Shell script to filter all SAC files in the current directory (corner frequency first input argument)
    # data is written to file, then plotted
    # script then cleans up afterwards

    # loop over files

    foreach file (`ls *.SAC`)
    sac << END
    read $file
    lowpass butter npoles 2 corner $1
    write $file:r_filt.SAC
    quit
    END
    end

    # now read all files in at once and plot

    set files = (`ls *_filt.SAC`)

    sac << END
    read $files
    plot1
    quit
    END

    # clean up

    rm *_filt.SAC

Here, I used ``:r``, which strips off the file suffix (so that I could add the ``_filt`` part to the filename before the ``.SAC`` suffix). One thing to be careful about here is that if you want to supply standard input with ``<<``, the input text must be flush with the left edge of the script (so that the shell can find the ``END`` delimiter correctly), so I cannot indent my loops in this case. This is the one exception I make to always indenting blocks of code. Note that the variable values are substituted into the input. Note also that I need to put a ``quit`` command in each set of SAC commands, as otherwise SAC will continue running and the script will hang.

I find this approach to be much more powerful than doing everything directly in SAC. Feel free to borrow these ideas when using SAC.

**Exercise:** Adapt this shell script to your macros for the 4th and 5th SAC labs, moving all of the loops from the SAC macros to shell scripts.

===========
Summary
===========

This class covered a number of useful programming techniques in the shell, including:

* ``if`` statements

* Comparisons in the shell

* ``while`` loops

* ``foreach`` loops

* ``<<`` to give input to a command

* Application of these to writing shell scripts that call SAC macros.