MandrakeUser.Org - Your Mandrake-Linux Knowledge Base!


 
 


Using the Shell IV

* Filename Globbing
* Output Redirection

Related Resources:

MdkRef. 3, 3
MdkRef. 3, 4
man bash

Revision / Modified: Feb. 28, 2002
Author: Tom Berger

 

* Filename Globbing

Filename globbing allows you to provide more than one filename to a command without having to write all filenames in full. You use special characters for this, called 'wildcards'.

Say you want to delete all files in a directory that end with the string '.bak' using the 'rm' command. Instead of typing each filename as an argument to 'rm', you use the '*' wildcard:

rm *.bak

'*' matches zero or more characters. In this example, you tell the shell to expand the argument to the 'rm' command to "all file names ending with or consisting of the string '.bak'". The shell does so and passes the expanded list of file names to the 'rm' command.

As you will see, it is important to note that the shell reads and interprets the command line before the command gets to see it. This has the advantage that you can use wildcards with (almost) all shell commands which take strings (file names, directory names, search strings etc) for an argument.

Let's play a bit more with the '*' wildcard. You have a directory which contains the files '124.bak', '346.bak' and '583.bak'. You want to keep '583.bak'. What you do is this:

rm *4*.bak

The shell expands '*4*.bak' to "all file names which contain the number '4' and end with the string '.bak'".
Notice that 'rm 4*.bak' would not have worked, since this would only have encompassed file names beginning with the number '4'. Since there are no such files in this directory, the shell finds nothing to expand, passes the pattern to 'rm' unchanged, and 'rm' issues an error message:

rm: cannot remove `4*.bak': No such file or directory
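
If you'd like to watch the shell do the expanding without deleting anything, here is a small sketch. It assumes bash (or another POSIX shell) with default settings and works in a scratch directory made by 'mktemp'; the helper name 'count_words' is made up for this illustration.

```shell
# Work in a throwaway directory so no real files are touched.
export LC_ALL=C                      # predictable sort order for the matches
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch 124.bak 346.bak 583.bak

# Hypothetical helper: reports how many arguments it actually received.
count_words() { echo "$#"; }

# The helper never sees '*.bak'; it gets three separate file names.
matched_count=$(count_words *.bak)
echo "*.bak arrived as $matched_count arguments"

# Every name containing a '4' and ending in '.bak'.
with_four=$(echo *4*.bak)
echo "*4*.bak expanded to: $with_four"

# Nothing starts with '4', so the pattern is handed over unchanged.
no_match=$(echo 4*.bak)
echo "4*.bak stayed as: $no_match"

cd / && rm -rf "$demo_dir"
```

Using 'echo' in place of 'rm' is a handy habit in general: it shows what a pattern will match before you let a destructive command loose on it.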

Now you want to keep the file '346.bak' but delete '124.bak' and '583.bak'. That's trickier since the files which are to be deleted have nothing in common except the ending. But lucky as you are, you can also define files by what they do not have:

rm *[!6].bak

This reads: delete all files which end with the string '.bak' except for those which end with the string '6.bak'. You have to put the negation sign '!' and the character to be negated (here '6') into brackets, because otherwise the shell interprets the exclamation mark as the beginning of a history substitution. Negation works with all globbing patterns introduced in this article.

Notice that it is very easy to shoot yourself in the foot with the '*' wildcard and negation. Guess what

rm *[!6]*.bak

does? It deletes all the files, even the one which does contain a '6' in its filename. If you put '*' wildcards before and after a negation, the negation becomes practically useless, because the pattern now reads "all file names which contain at least one character other than '6' somewhere before '.bak'". In our example, the only kind of file name that pattern would not have matched is one like '666.bak', which has nothing but '6's before the ending.
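
To see the difference between the two negation patterns side by side, you can try a sketch like the following. It assumes bash or another POSIX shell, uses 'echo' instead of 'rm', and adds a hypothetical fourth file '666.bak' to the three example files.

```shell
export LC_ALL=C                      # predictable sort order for the matches
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch 124.bak 346.bak 583.bak 666.bak

# The character right before '.bak' must not be a '6'.
strict=$(echo *[!6].bak)
echo "strict: $strict"

# Any one non-'6' character anywhere before '.bak' is enough to match.
loose=$(echo *[!6]*.bak)
echo "loose:  $loose"

cd / && rm -rf "$demo_dir"
```

The strict pattern spares both '346.bak' and '666.bak'; the loose one spares only '666.bak'.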

The second wildcard is the question mark, '?'. In a globbing pattern, a question mark represents exactly one character. To demonstrate its use, let's add two new files to the three example files, '311.bak~' and 'some.text'. Now list all files which have exactly four characters after the dot:

ls *.????

does this. The question mark wildcard is also a useful means to avoid the 'negation trap' mentioned above:

rm *[!4]?.*

This expands to "all files except for those with a '4' in the second to last position before the dot" and deletes all files except for '346.bak'.
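
Here is a harmless way to check both question mark patterns against the five example files. This is only a sketch, assuming bash or another POSIX shell and a scratch directory; 'echo' stands in for 'ls' and 'rm'.

```shell
export LC_ALL=C                      # predictable sort order for the matches
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch 124.bak 346.bak 583.bak 311.bak~ some.text

# Exactly four characters after the dot.
four_after_dot=$(echo *.????)
echo "four after the dot: $four_after_dot"

# Everything except names with a '4' two places before the dot.
would_delete=$(echo *[!4]?.*)
echo "would be deleted:   $would_delete"

cd / && rm -rf "$demo_dir"
```

'346.bak' is the only name missing from the second list, so it is the only file 'rm' would leave alone.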

Is there more? You bet. So far, you've only seen globbing patterns which match one character at a certain position. But nothing keeps you from matching more than one:

ls [13]*

lists all files which begin with either the character '1' or the character '3'; in our test case the files '124.bak', '311.bak~' and '346.bak' match. Notice that you have to enclose the pattern in brackets, otherwise the pattern would match only files which begin with the string '13'.

Now all that's left for ultimate happiness to ensue is the possibility to define ranges of matches:

ls *[3-8]?.*

lists all files whose second to last character before the dot is a number between '3' and '8'. In our example, this matches the files '346.bak' and '583.bak'.
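
The bracket patterns can be tried out the same way. The sketch below assumes bash or another POSIX shell; it sets LC_ALL=C because ranges like '[3-8]' can behave differently for letters in other locales.

```shell
export LC_ALL=C                      # byte-wise ranges and sort order
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch 124.bak 346.bak 583.bak 311.bak~ some.text

# First character is '1' or '3'.
starts_1_or_3=$(echo [13]*)
echo "[13]*     -> $starts_1_or_3"

# Without brackets the shell looks for names starting with '13' and finds none.
starts_13=$(echo 13*)
echo "13*       -> $starts_13"

# Second to last character before the dot lies between '3' and '8'.
in_range=$(echo *[3-8]?.*)
echo "*[3-8]?.* -> $in_range"

cd / && rm -rf "$demo_dir"
```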

Quoting Special Shell Characters

These powerful mechanisms have one drawback, though: the shell will always try to expand them, and it will do so before the command gets to see them. There are several cases in which this can get in your way:

  • File names with special characters. Assume you have another file in that directory with the name '!56.bak'. Try to match it with a globbing pattern:

    rm !*
    rm
    rm: too few arguments

    The shell interprets '!*' as a history substitution ('insert all arguments from previous command'), not as a globbing pattern.

  • Commands which take special characters as arguments themselves. A number of Linux command line tools, for example (e)grep, sed and awk, use their own pattern language, called 'regular expressions', while find and locate accept globbing-style patterns which they match themselves. Regular expressions may look strikingly similar to globbing patterns but are in some cases interpreted differently.
    But in order for the command to interpret such patterns in the first place, the shell must be prevented from expanding them as globbing patterns first:

    find . -name [1-9]* -print
    find: paths must precede expression

    Properly:

    find . -name '[1-9]*' -print
    ./346.bak
    ./124.bak
    ./583.bak
    ./311.bak~

You can quote special characters like !, $, ? or the empty space either with a backslash:

ls \!*
!56.bak

or with (single) quotes:

ls '!'*
!56.bak

Notice that using quotes may need some deliberation on where to put them. ls '!*' would look for a file called '!*', since the '*' wildcard is now quoted, too, and thus interpreted literally.
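
The three quoting variants can be compared in a short sketch like this one. It assumes bash or another POSIX shell. Keep in mind that history substitution is only active in interactive shells, so inside a script the '!' would be harmless even unquoted; the quoting shown here is what you need at the prompt.

```shell
export LC_ALL=C                      # predictable sort order for the matches
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch '!56.bak' 124.bak 346.bak

# Backslash: only the '!' is quoted, the '*' still works as a wildcard.
escaped=$(echo \!*)
# Single quotes around the '!' alone have the same effect.
partly_quoted=$(echo '!'*)
# Quotes around everything turn the '*' into an ordinary character.
fully_quoted=$(echo '!*')
echo "$escaped | $partly_quoted | $fully_quoted"

# A quoted pattern reaches 'find' untouched; 'find' does the matching itself.
found=$(find . -name '[1-9]*' | sort)
echo "$found"

cd / && rm -rf "$demo_dir"
```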


* Output Redirection

The Unix philosophy is to have many small programs, each excelling at a certain task. Complex tasks are not accomplished by complex programs but by tying together a bunch of programs with a handful of shell mechanisms. One of them is redirecting output.

Redirecting between two or more commands

This is done via 'pipes', denoted by the pipe symbol |. The syntax is

command1 | command2 | command3 etc

You've certainly seen them already. They are often used to direct the output of a program to a pager like 'more' or 'less'.

ls -l | less

The first command provides the directory listing and the second displays it in a scrollable manner. A more complex example:

rpm -qa | grep ^x | less

The first command puts together a list of all installed RPMs, the second filters ('grep') those that start ('^') with an 'x' and the third displays the results in a paged and scrollable list.
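
If you have no RPM database at hand, you can still play with the same kind of pipeline. In the sketch below, the function 'list_packages' and its package names are made up to stand in for 'rpm -qa', and 'sort' takes the place of the pager so that the example also runs non-interactively.

```shell
export LC_ALL=C                      # predictable sort order

# Hypothetical stand-in for 'rpm -qa': prints one package name per line.
list_packages() {
    printf '%s\n' xterm-165 bash-2.05 xchat-1.8 zlib-1.1 xpdf-1.00
}

# Stage 1 produces the list, stage 2 keeps lines starting with 'x',
# stage 3 sorts what is left. The pattern is quoted so that the shell
# passes it to 'grep' untouched.
x_packages=$(list_packages | grep '^x' | sort)
echo "$x_packages"
```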

Redirecting from or into files

Sometimes you want to save the output of a command in a file, or have a command read its input from a file. This is done via the operators '>' and '<'.

command > file

saves the output of command in file, overwriting all previous content of file:

ls > dirlist

saves the listing of the current directory to a file called 'dirlist'.

command < file

uses file as the input for command:

sort < dirlist > sdirlist

feeds the content of 'dirlist' to the 'sort' command, which sorts it and puts the sorted output into the 'sdirlist' file. Of course, if you're clever, you'd do that in one step:

ls | sort > sdirlist
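
Here is a sketch of both operators at work in a scratch directory, assuming bash or another POSIX shell. The file names 'apple', 'mango' and 'zebra' are invented, and 'sort -r' (reverse order) is used instead of plain 'sort' only because the output of 'ls' is already sorted and the effect would otherwise be invisible.

```shell
export LC_ALL=C                      # predictable sort order
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch zebra apple mango

# The shell creates 'dirlist' before 'ls' starts, so it shows up in the listing.
ls > dirlist
listing=$(paste -s -d ' ' dirlist)   # join the lines for easy display
echo "dirlist:  $listing"

# 'sort' reads from dirlist and writes to sdirlist.
sort -r < dirlist > sdirlist
reversed=$(paste -s -d ' ' sdirlist)
echo "sdirlist: $reversed"

cd / && rm -rf "$demo_dir"
```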

A special case is 'command 2> file'. This puts just the error messages of command into file. You may need that from time to time...
Another operator is '>>'. This one appends the output to an existing file:

echo "string" >> file

This would append string to the content of the file 'file'. A quick way to edit a file without opening an editor first!
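
A small sketch of '2>' and '>>', assuming bash or another POSIX shell. The names 'no-such-file', 'out.log', 'err.log' and 'notes' are invented for the illustration.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# 'no-such-file' does not exist, so 'ls' prints only an error message.
# '|| true' keeps the failing 'ls' from aborting a script run with 'set -e'.
ls no-such-file > out.log 2> err.log || true
if [ -s err.log ]; then error_captured=yes; else error_captured=no; fi
if [ -s out.log ]; then output_captured=yes; else output_captured=no; fi
echo "errors saved: $error_captured, normal output saved: $output_captured"

# '>' creates the file, each '>>' adds a line at the end.
echo "first" > notes
echo "second" >> notes
echo "third" >> notes
notes_content=$(paste -s -d ' ' notes)
echo "notes: $notes_content"

# A single '>' starts over and throws the old content away.
echo "fresh" > notes
after_overwrite=$(cat notes)
echo "notes now: $after_overwrite"

cd / && rm -rf "$demo_dir"
```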
There is an important restriction to the '<' and '>' operators, though: Something like

command < file1 >file1

will erase the content of file1, because the shell empties file1 for writing before command ever gets to read it. However

command < file1 >> file1

will work and append the processed content of file1 to the same file, at least for commands such as 'sort' which read all of their input before they write any output.
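
You can convince yourself of both behaviours with the sketch below, which assumes bash or another POSIX shell and uses 'sort' as the command. The detour via a temporary file ('file1.tmp', an invented name) is a common workaround when you want the processed content to replace the original; it is not something the operators do for you.

```shell
export LC_ALL=C                      # predictable sort order
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'b\na\n' > file1

# Appending: 'sort' reads both lines first, then adds the sorted copy.
sort < file1 >> file1
appended=$(paste -s -d ' ' file1)
echo "after '>>': $appended"

# Workaround for replacing: write to a temporary file, then rename it.
sort < file1 > file1.tmp && mv file1.tmp file1
replaced=$(paste -s -d ' ' file1)
echo "after temp file: $replaced"

# Overwriting in place: the shell empties file1 before 'sort' can read it.
sort < file1 > file1
if [ -s file1 ]; then truncated=no; else truncated=yes; fi
echo "file1 emptied by '>': $truncated"

cd / && rm -rf "$demo_dir"
```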

That's a lot, isn't it? ;-) No need to panic, you can learn everything step by step at your own pace. Practice makes perfect ...
Once you are familiar with the most common shell mechanisms, you might feel the urge to customize your environment. You will find some ideas on the next two pages. On the last page, you'll also find a short FAQ dealing with the most common shell error messages and some minor configuration settings.


* Customizing: Shell configuration files, the prompt, $PATH


 
Legal: All texts on this site are covered by the GNU Free Documentation License. Standard disclaimers of warranty apply. Copyright LSTB (Tom Berger) and Mandrakesoft 1999-2002.