                     Illustrated Bash Redirection Tutorial
                     =====================================


Table of Contents
=================

1 Introduction
2 stdin, stdout, stderr
3 Simple Redirections
  3.1 Output Redirection "n> file"
  3.2 Input Redirection "n< file"
  3.3 Pipes |
4 More On File Descriptors
  4.1 Duplicating File Descriptor 2>&1
  4.2 Order Of Redirection, ie "> file 2>&1" vs "2>&1 >file"
  4.3 Why sed 's/foo/bar/' file >file Doesn't Work
  4.4 exec
  4.5 Closing The File Descriptors
5 An Example
6 Syntax
7 Conclusion


1 Introduction
~~~~~~~~~~~~~~

This tutorial is not a complete guide to redirection. It does not
cover heredocs, here strings, named pipes, etc. I just hope it’ll help
you better understand what things like 3>&2, 2>&1 or 1>&3- do.

stdin, stdout, stderr

When Bash starts, three file descriptors are normally already open: 0, 1 and
2, also known as standard input (stdin), standard output (stdout) and standard
error (stderr).

For example, in Bash, in a terminal emulator, on Linux you’ll see:

$ lsof -a -p $$ -d0,1,2
COMMAND   PID USER   FD   TYPE DEVICE SIZE NODE NAME
bash    24507 root    0u   CHR  136,5         7 /dev/pts/5
bash    24507 root    1u   CHR  136,5         7 /dev/pts/5
bash    24507 root    2u   CHR  136,5         7 /dev/pts/5

/dev/pts/5 is a pseudo terminal (pts) used to simulate a real terminal
connected to the computer. Bash reads (stdin) from this terminal and prints via
stdout and stderr to this terminal.

                  ---       +-----------------------+
standard input   ( 0 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

When a command, a compound command, a subshell etc... is executed, it inherits
these file descriptors. For instance echo foo will send the text foo to the
file descriptor 1 inherited from the shell, which is connected to /dev/pts/5.

3 Simple Redirections
~~~~~~~~~~~~~~~~~~~~~
3.1 Output Redirection "n> file"
================================

> is probably the simplest redirection.

echo foo > file

The > file after the command alters the file descriptors of the command echo.
It changes file descriptor 1 (> file is the same as 1>file) so that it points
to the file "file". The descriptors will look like:

                  ---       +-----------------------+
standard input   ( 0 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| file                  |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

Now the characters that our command, echo, sends to the standard output,
i.e. file descriptor 1, end up in the file named "file".

In the same way, command 2> file will change the standard error and make it
point to "file". Standard error is often used by applications to print... errors.
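
To make this concrete, here is a small sketch that sends the two streams of
one command to two different files. The temporary directory and the file
names (exists, doesnotexist, out, err) are made up for the illustration.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/exists"

# ls prints the existing name on descriptor 1 and complains about the
# missing one on descriptor 2 (it exits non-zero, hence the "|| true").
ls "$tmpdir/exists" "$tmpdir/doesnotexist" > "$tmpdir/out" 2> "$tmpdir/err" || true

out_content=$(cat "$tmpdir/out")
err_content=$(cat "$tmpdir/err")
echo "stdout file: $out_content"
echo "stderr file: $err_content"
rm -rf "$tmpdir"
```

The listing ends up in one file and the error message in the other; nothing
is printed by ls on the terminal.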

What will command 3> file do? It will open a new file descriptor pointing to
file. The command will then start with:

                  ---       +-----------------------+
standard input   ( 0 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
new descriptor   ( 3 ) ---->| file                  |
                  ---       +-----------------------+

What will the command do with this descriptor? It depends, often nothing. We
will see later why we might want other file descriptors.
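
As a small taste, here is a sketch of a command that does use descriptor 3.
The file name fd3.log is hypothetical, and the >&3 used inside the group is a
duplication, which is covered in section 4.1.

```shell
tmpdir=$(mktemp -d)

# The braces group two commands; 3> opens descriptor 3 for the whole group.
# The command substitution captures whatever the group sends to stdout.
visible=$(
  {
    echo "normal output"
    echo "side channel" >&3
  } 3> "$tmpdir/fd3.log"
)
side=$(cat "$tmpdir/fd3.log")
echo "stdout got: $visible"
echo "fd 3 got:   $side"
rm -rf "$tmpdir"
```

Only the first echo reaches stdout; the second one lands in the file opened on
descriptor 3.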

3.2 Input Redirection "n< file"
===============================

I hope the following is obvious: when you run a command using
command < file, it changes file descriptor 0 so that the
descriptors look like:

                  ---       +-----------------------+
standard input   ( 0 ) ---->| file                  |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

If the command reads from stdin, it now will read from "file" and not
from the terminal.

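As with >, < can be used to open a new file descriptor for reading, as in
command 3<file (with no space between the 3 and the <, otherwise the 3 is
passed to the command as an argument). We will see how this can be useful.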
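
A minimal sketch of both forms, using Bash's read built-in as the command.
The file name and its content are made up; read -u is specific to Bash.

```shell
tmpdir=$(mktemp -d)
printf 'first line\nsecond line\n' > "$tmpdir/file"

# read takes its input from descriptor 0, which now points to the file.
read -r from_stdin < "$tmpdir/file"

# The same file opened on descriptor 3; read -u 3 reads from that descriptor.
read -r -u 3 from_fd3 3< "$tmpdir/file"

echo "via stdin: $from_stdin"
echo "via fd 3:  $from_fd3"
rm -rf "$tmpdir"
```

Each redirection opens the file afresh, so both reads return its first line.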

3.3 Pipes |
===========

What does this | do? Among other things, it connects the standard output of
the command on the left to the standard input of the command on the right.
That is, it creates a special file, a pipe, which is opened for writing by the
left command and for reading by the right command.

           echo foo               |                cat

 ---       +--------------+               ---       +--------------+
( 0 ) ---->| /dev/pts/5   |     ------>  ( 0 ) ---->|pipe (read)   |
 ---       +--------------+    /          ---       +--------------+
                              /
 ---       +--------------+  /            ---       +--------------+
( 1 ) ---->| pipe (write) | /            ( 1 ) ---->| /dev/pts     |
 ---       +--------------+               ---       +--------------+

 ---       +--------------+               ---       +--------------+
( 2 ) ---->| /dev/pts/5   |              ( 2 ) ---->| /dev/pts/    |
 ---       +--------------+               ---       +--------------+


This is possible because the redirections are set up by the shell before the
commands are executed, and the commands inherit the file descriptors.
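
The diagram can be checked with a short sketch. The file name err is an
assumption of mine; it is only there so that we can look at what ls wrote on
its standard error.

```shell
tmpdir=$(mktemp -d)

# stdout of the left command travels through the pipe...
upper=$(echo foo | tr 'a-z' 'A-Z')

# ...but stderr is not connected to it: the error message of ls is sent to
# a file here, and nothing at all reaches cat.
through_pipe=$(ls "$tmpdir/doesnotexist" 2> "$tmpdir/err" | cat)
err_content=$(cat "$tmpdir/err")

echo "through the pipe: '$upper' and '$through_pipe'"
echo "on stderr: $err_content"
rm -rf "$tmpdir"
```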

4 More On File Descriptors
~~~~~~~~~~~~~~~~~~~~~~~~~~
4.1 Duplicating File Descriptor 2>&1
====================================

We have seen how to open (or redirect) file descriptors. Let us see
how to duplicate them, starting with the classic 2>&1. 2>&1 means that
something written on the file descriptor 2 will go where file
descriptor 1 goes. 2>&1 is not very interesting when used with a
single simple command, so we will use:

ls /tmp/ doesnotexist 2>&1 | less

   ls /tmp/ doesnotexist 2>&1     |                   less

 ---       +--------------+              ---       +--------------+
( 0 ) ---->| /dev/pts/5   |     ------> ( 0 ) ---->|from the pipe |
 ---       +--------------+    /   --->  ---       +--------------+
                              /   /
 ---       +--------------+  /   /       ---       +--------------+
( 1 ) ---->| to the pipe  | /   /       ( 1 ) ---->|  /dev/pts    |
 ---       +--------------+    /         ---       +--------------+
                              /
 ---       +--------------+  /           ---       +--------------+
( 2 ) ---->|  to the pipe | /           ( 2 ) ---->| /dev/pts/    |
 ---       +--------------+              ---       +--------------+

Why is it called duplicating? Because after 2>&1, we have 2 file descriptors
pointing to the same file. Take care: this is not “File Descriptor Aliasing”.
If stdout pointed to a file A when 2>&1 was processed and we then redirect
stdout to a file B, file descriptor 2 will still be open on file A. This is
often misunderstood by people wanting to redirect both standard error and
standard output to the same file. Continue reading for more on this.

So if you have a file descriptor like:

                  ---       +-----------------------+
 a descriptor    ( n ) ---->| /some/file            |
                  ---       +-----------------------+

Using m>&n (where m is a number) you get a copy of this descriptor:

                  ---       +-----------------------+
 a descriptor    ( m ) ---->| /some/file            |
                  ---       +-----------------------+

Note that the file position is shared too. If you have already read a line
from n, then after m<&n, reading a line from m will give you the second line
of the file.
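
Here is a sketch of that behaviour with n = 3 and m = 4. The descriptor
numbers and the file content are arbitrary choices for the illustration.

```shell
tmpdir=$(mktemp -d)
printf 'line 1\nline 2\nline 3\n' > "$tmpdir/file"

exec 3< "$tmpdir/file"   # open the file on descriptor 3
read -r first <&3        # consumes "line 1" through descriptor 3
exec 4<&3                # 4 becomes a copy of 3, position included
read -r second <&4       # continues where 3 stopped
read -r third <&3        # and 3 continues where 4 stopped
exec 3<&- 4<&-           # close both descriptors

echo "$first / $second / $third"
rm -rf "$tmpdir"
```

The copy is not an independent view of the file: reading through either
descriptor advances the position seen by the other.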

4.2 Order Of Redirection, ie "> file 2>&1" vs "2>&1 >file"
==========================================================

While it doesn’t matter where the redirections appear on the command line,
their order does matter. They are set up from left to right.

  * 2>&1 >file

A common error is to use "command 2>&1 > file" to redirect both
stderr and stdout to file. Let’s see what’s going on. First we type
the command in our typical terminal; the descriptors look like this:

                  ---       +-----------------------+
standard input   ( 0 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

Then our shell, Bash, sees 2>&1, so it makes 2 a duplicate of 1, and the
file descriptors look like this:

                  ---       +-----------------------+
standard input   ( 0 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

That’s right, nothing has changed: 2 was already pointing to the same place as
1. Now Bash sees > file and thus changes stdout:

                  ---       +-----------------------+
standard input   ( 0 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| file                  |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

And that’s not what we wanted.

  * >file 2>&1

Now let’s look at the correct "command >file 2>&1". We start as in the
previous example, and Bash sees > file:

                  ---       +-----------------------+
standard input   ( 0 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| file                  |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

Then it sees our duplication 2>&1:

                  ---       +-----------------------+
standard input   ( 0 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| file                  |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| file                  |
                  ---       +-----------------------+

And voila, both 1 and 2 are redirected to file.
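
Both orders can be compared with a sketch like the one below. The function
noisy and the file names wrong and right are hypothetical, and a command
substitution plays the role of the terminal so that we can see what escapes
the redirection.

```shell
tmpdir=$(mktemp -d)
noisy() { echo "out"; echo "err" >&2; }   # writes one line on each stream

# Anything that still reaches the original stdout ends up in the variable.
wrong_leak=$(noisy 2>&1 > "$tmpdir/wrong")   # 2 copies the old 1, then 1 moves
right_leak=$(noisy > "$tmpdir/right" 2>&1)   # 1 moves, then 2 copies the new 1

wrong_file=$(cat "$tmpdir/wrong")
right_file=$(cat "$tmpdir/right")
echo "2>&1 >file : leaked '$wrong_leak', file has '$wrong_file'"
echo ">file 2>&1 : leaked '$right_leak', file has '$right_file'"
rm -rf "$tmpdir"
```

With the wrong order the error line escapes to the original stdout; with the
right order both lines are in the file.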

4.3 Why sed 's/foo/bar/' file >file Doesn't Work
================================================

This is a common error: we want to modify a file using something that reads
from a file and writes the result to stdout. To do this we redirect stdout to
the file we want to modify. The problem here is that, as we have seen, the
redirections are set up before the command is actually executed.

So BEFORE sed starts, standard output has already been redirected, with
the additional side effect that, because we use >, "file" is
truncated. When sed starts to read the file, it contains nothing.
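
The sketch below reproduces the problem and shows one common workaround,
writing to a temporary file and renaming it afterwards. The file names broken,
fixed and fixed.tmp are made up, and the workaround is only one possibility
among others.

```shell
tmpdir=$(mktemp -d)
printf 'foo\n' > "$tmpdir/broken"
printf 'foo\n' > "$tmpdir/fixed"

# Broken: the shell truncates the file before sed gets to open it.
sed 's/foo/bar/' "$tmpdir/broken" > "$tmpdir/broken"

# Workaround: write somewhere else, then replace the original.
sed 's/foo/bar/' "$tmpdir/fixed" > "$tmpdir/fixed.tmp" &&
  mv "$tmpdir/fixed.tmp" "$tmpdir/fixed"

broken_content=$(cat "$tmpdir/broken")
fixed_content=$(cat "$tmpdir/fixed")
echo "broken: '$broken_content'"
echo "fixed:  '$fixed_content'"
rm -rf "$tmpdir"
```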

4.4 exec
========

In Bash the exec built-in replaces the shell with the specified
program. So what does this have to do with redirection? exec also
allows us to manipulate the file descriptors. If you don’t specify a
program, the redirections after exec modify the file descriptors of the
current shell.

For example, all the commands after exec 2>file will have file
descriptors like:

                  ---       +-----------------------+
standard input   ( 0 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| file                  |
                  ---       +-----------------------+

All the errors sent to stderr by the commands after the exec
2>file will go to the file, just as if you had the commands in a
script and ran myscript 2> file.

exec can be used for instance if you want to log the errors that the
commands in your script produce, just add "exec 2>myscript.log" at the
beginning of your script.
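
A sketch of such a script body is shown below. It runs inside a command
substitution, i.e. in a subshell, so that the exec does not change the stderr
of the shell you try it in; the log location is an assumption for the example.

```shell
tmpdir=$(mktemp -d)

# The exec below only affects the subshell of the command substitution.
shown=$(
  exec 2> "$tmpdir/myscript.log"   # from here on, stderr goes to the log
  echo "still on stdout"
  echo "a warning" >&2             # lands in the log
  ls "$tmpdir/doesnotexist"        # so does the error message of ls
  echo "done"
)
log=$(cat "$tmpdir/myscript.log")
echo "stdout: $shown"
echo "log:    $log"
rm -rf "$tmpdir"
```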

Let’s see another use case. We want to read a file line by line; this is easy,
we just do:

 while read -r line;do echo "$line";done < file

Now, we want to pause after printing each line, waiting for the user
to press a key:

 while read -r line;do echo "$line"; read -p "Press any key" -n 1;done < file

And, surprise, this doesn’t work. Why? Because the file descriptors of the
while loop look like:

                  ---       +-----------------------+
standard input   ( 0 ) ---->| file                  |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

and so our second read command (read -p "Press any key" -n 1) inherits
them and thus reads from file and not from our terminal.

A quick look at help read tells us that we can specify a file descriptor from
which read should read. Cool. Now let’s use exec to get another descriptor:

 exec 3<file
 while read -r -u 3 line;do echo "$line"; read -p "Press any key" -n 1;done

Now the file descriptors look like:

                  ---       +-----------------------+
standard input   ( 0 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard output  ( 1 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
standard error   ( 2 ) ---->| /dev/pts/5            |
                  ---       +-----------------------+

                  ---       +-----------------------+
new descriptor   ( 3 ) ---->| file                  |
                  ---       +-----------------------+

and it works.
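
The same loop can be tried unattended with the sketch below. As an assumption
for the demo, a second file named keys stands in for the keyboard, so stdin is
redirected from it instead of coming from the terminal; the point is that the
inner read uses stdin while the outer one uses descriptor 3.

```shell
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmpdir/file"
printf 'xy' > "$tmpdir/keys"      # stands in for the keys a user would press

exec 3< "$tmpdir/file"            # the file to display goes on descriptor 3
seen=""
while read -r -u 3 line; do
  read -r -n 1 key                # reads stdin, which descriptor 3 leaves alone
  seen="$seen$line:$key "
done < "$tmpdir/keys"
exec 3<&-                         # close descriptor 3 when done

echo "$seen"
rm -rf "$tmpdir"
```

Each line of the file is paired with one "key press", which shows that the
two reads no longer compete for the same descriptor.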

4.5 Closing The File Descriptors
================================

Closing a file descriptor is easy, just make it a duplicate of -. For
instance let’s close stdin <&- and stderr 2>&-:

 bash -c '{ lsof -a -p $$ -d0,1,2 ;} <&- 2>&-'
 COMMAND   PID USER   FD   TYPE DEVICE SIZE NODE NAME
 bash    10668 pgas    1u   CHR  136,2         4 /dev/pts/2

We see that inside the {} only descriptor 1 is still open.

Though the OS will clean up the mess, it can be a good idea to close
the file descriptors you open. For instance if you open a file
descriptor with exec 3>file, all the commands afterward will inherit
it. It’s nicer to do something like:

exec 3>file    # open descriptor 3 for writing
# ...
# commands that use 3
# ...
exec 3>&-      # close it: we don't need 3 any more

I’ve seen some people using this as a way to discard, say, stderr,
using something like: command 2>&-. Though it might work, I’m sure
that some applications don't behave correctly with a closed stderr.

When in doubt, I prefer to use 2>/dev/null for this.
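
A sketch of the open, use, close sequence, followed by an attempt to write to
the descriptor once it is closed. The file name is made up, and the variable
closed_write only records what happened.

```shell
tmpdir=$(mktemp -d)

exec 3> "$tmpdir/file"      # open descriptor 3
echo "written" >&3          # use it
exec 3>&-                   # close it

# Redirecting to a closed descriptor fails; the shell's complaint about the
# bad descriptor is discarded here so that only the status matters.
if { echo "too late" >&3; } 2> /dev/null; then
  closed_write="succeeded"
else
  closed_write="failed"
fi

content=$(cat "$tmpdir/file")
echo "file has '$content', write after close $closed_write"
rm -rf "$tmpdir"
```

This failure is the kind of thing an application started with a closed stderr
runs into when it tries to report an error.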

5 An Example
~~~~~~~~~~~~
This example comes from this post (ffe4c2e382034ed9) on the comp.unix.shell
newsgroup:

{
  {
    cmd1 3>&- |
      cmd2 2>&3 3>&-
  } 2>&1 >&4 4>&- |
    cmd3 3>&- 4>&-
} 3>&2 4>&1


The redirections are processed from left to right, but as the file descriptors
are inherited we will also have to work from the outer to the inner contexts.
We will assume that we run this command in a terminal. Let’s start with the
outer { } 3>&2 4>&1.

 ---       +-------------+    ---       +-------------+
( 0 ) ---->| /dev/pts/5  |   ( 3 ) ---->| /dev/pts/5  |
 ---       +-------------+    ---       +-------------+

 ---       +-------------+    ---       +-------------+
( 1 ) ---->| /dev/pts/5  |   ( 4 ) ---->| /dev/pts/5  |
 ---       +-------------+    ---       +-------------+

 ---       +-------------+
( 2 ) ---->| /dev/pts/5  |
 ---       +-------------+

We only made 2 copies of stderr and stdout. 3>&1 4>&1 would have produced the
same result here because we ran the command in a terminal and thus 1 and 2 both
go to the terminal. As an exercise you can start with 1 pointing to file.stdout
and 2 pointing to file.stderr.

Let’s continue with the right part of the second pipe: | cmd3 3>&- 4>&-

 ---       +-------------+
( 0 ) ---->| 2nd pipe    |
 ---       +-------------+

 ---       +-------------+
( 1 ) ---->| /dev/pts/5  |
 ---       +-------------+

 ---       +-------------+
( 2 ) ---->| /dev/pts/5  |
 ---       +-------------+

It inherits the previous file descriptors, closes 3 and 4 and sets up a pipe
for reading. Now for the left part of the second pipe {...} 2>&1 >&4 4>&- |

 ---       +-------------+  ---       +-------------+
( 0 ) ---->| /dev/pts/5  | ( 3 ) ---->| /dev/pts/5  |
 ---       +-------------+  ---       +-------------+

 ---       +-------------+
( 1 ) ---->| /dev/pts/5  |
 ---       +-------------+

 ---       +-------------+
( 2 ) ---->| 2nd pipe    |
 ---       +-------------+

First, file descriptor 1 is connected to the pipe (|), then 2 is made a
copy of 1 and thus becomes an fd to the pipe (2>&1), then 1 is made a copy of 4
(>&4), then 4 is closed. These are the file descriptors of the inner {}. Let’s
go inside and have a look at the right part of the first pipe: | cmd2 2>&3 3>&-

 ---       +-------------+
( 0 ) ---->| 1st pipe    |
 ---       +-------------+

 ---       +-------------+
( 1 ) ---->| /dev/pts/5  |
 ---       +-------------+

 ---       +-------------+
( 2 ) ---->| /dev/pts/5  |
 ---       +-------------+

It inherits the previous file descriptors, connects 0 to the 1st pipe, file
descriptor 2 is made a copy of 3 and 3 is closed. Finally, for the left
part of the pipe:

 ---       +-------------+
( 0 ) ---->| /dev/pts/5  |
 ---       +-------------+

 ---       +-------------+
( 1 ) ---->| 1st pipe    |
 ---       +-------------+

 ---       +-------------+
( 2 ) ---->| 2nd pipe    |
 ---       +-------------+

It also inherits the file descriptors of the left part of the 2nd pipe; file
descriptor 1 is connected to the first pipe and 3 is closed.

The purpose of all this becomes clear if we take only the commands:

                                                   cmd2

                                           ---       +-------------+
                                       -->( 0 ) ---->| 1st pipe    |
                                      /    ---       +-------------+
                                     /
                                    /      ---       +-------------+
         cmd1                      /      ( 1 ) ---->| /dev/pts/5  |
                                  /        ---       +-------------+
                                 /
 ---       +-------------+      /          ---       +-------------+
( 0 ) ---->| /dev/pts/5  |     /          ( 2 ) ---->| /dev/pts/5  |
 ---       +-------------+    /            ---       +-------------+
                             /
 ---       +-------------+  /                       cmd3
( 1 ) ---->| 1st pipe    | /
 ---       +-------------+                 ---       +-------------+
                             ------------>( 0 ) ---->| 2nd pipe    |
 ---       +-------------+ /               ---       +-------------+
( 2 ) ---->| 2nd pipe    |/
 ---       +-------------+                 ---       +-------------+
                                          ( 1 ) ---->| /dev/pts/5  |
                                           ---       +-------------+

                                           ---       +-------------+
                                          ( 2 ) ---->| /dev/pts/5  |
                                           ---       +-------------+


As said previously, as an exercise you can start with 1 open on a file and 2
open on another file to see how the stdout from cmd2 and cmd3 goes to the
original stdout and how their stderr goes to the original stderr.
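
One way to try that exercise is sketched below. The three commands are
hypothetical stand-ins of mine: cmd1 writes one line on each stream, while
cmd2 and cmd3 prefix whatever they read. The wrapper function run and the
file names stdout and stderr are also assumptions.

```shell
tmpdir=$(mktemp -d)

cmd1() { echo "data"; echo "oops" >&2; }
cmd2() { sed 's/^/cmd2 got: /'; echo "cmd2 warning" >&2; }
cmd3() { sed 's/^/cmd3 got: /'; }

run() {
  {
    {
      cmd1 3>&- |
        cmd2 2>&3 3>&-
    } 2>&1 >&4 4>&- |
      cmd3 3>&- 4>&-
  } 3>&2 4>&1
}

# Start with 1 open on one file and 2 open on another.
run > "$tmpdir/stdout" 2> "$tmpdir/stderr"

# cmd2 and cmd3 run concurrently, so sort to get a stable order.
out_sorted=$(sort "$tmpdir/stdout")
err_content=$(cat "$tmpdir/stderr")
echo "$out_sorted"
echo "stderr: $err_content"
rm -rf "$tmpdir"
```

cmd2 receives the stdout of cmd1 and cmd3 receives its stderr, while the
output of cmd2 and cmd3 ends up in the first file and the warning of cmd2 in
the second.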

6 Syntax
~~~~~~~~

I used to have trouble choosing between 0&<3 3&>1 3>&1 ->2 -<&0 &-<0
0<&- etc... (I think probably because the syntax represents the
result, i.e. the redirection, more than what is done, i.e. opening,
closing, and duplicating file descriptors).

If that’s also your case, then maybe the following “rules” will help you. A
redirection always looks like:

 lhs op rhs

  * lhs is always a file descriptor i.e. a number:
      - Either one we want to open, duplicate, move or one we want to close. If
        lhs is omitted and the op is <, there is an implicit 0; if the op is >
        or >>, there is an implicit 1.

  * op is either <, >, >>, >|, or <>:
      - < if the file descriptor in lhs will be read, > if it will be written,
        >> if data will be appended to the file, >| to overwrite an existing
        file even when the noclobber option is set, or <> if it will be both
        read and written.

  * rhs is the thing that the file descriptor will describe:
      - It can be either the name of a file, the place where another descriptor
        goes (&1), or the special nowhere, &-, which will close the
        file descriptor.

You might not like this description, and find it a bit incomplete or inexact,
but I think it really helps to easily find that, say, &->0 is incorrect.
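
To see the rules at work, here is a sketch that goes through the operators
one by one. The file name f and the variable names are arbitrary.

```shell
tmpdir=$(mktemp -d)

echo "one" > "$tmpdir/f"      # lhs omitted (implicit 1), op >, rhs a file
echo "two" 1>> "$tmpdir/f"    # explicit lhs 1, op >> appends
appended=$(cat "$tmpdir/f")

set -o noclobber              # > now refuses to overwrite an existing file
clobber="allowed"
{ echo "blocked" > "$tmpdir/f"; } 2> /dev/null || clobber="refused"
echo "three" >| "$tmpdir/f"   # op >| overwrites anyway
set +o noclobber

exec 3<> "$tmpdir/f"          # lhs 3, op <>: open for reading and writing
read -r got <&3               # lhs omitted (implicit 0), rhs &3
echo "four" >&3               # written right after the line just read
exec 3>&-                     # rhs &-: close descriptor 3
final=$(cat "$tmpdir/f")

echo "$appended / $clobber / $got / $final"
rm -rf "$tmpdir"
```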

7 Conclusion
~~~~~~~~~~~~

I hope this tutorial worked for you.

I lied, I did not explain 1>&3-, go check the manual ;-)

Thanks to Stéphane Chazelas from whom I stole both the intro and the
example....

The intro is inspired by this introduction, you’ll find a nice exercise there
too:

  * A Detailed Introduction to I/O and I/O Redirection
    http://tldp.org/LDP/abs/html/ioredirintro.html

The last example comes from this post:

  * comp.unix.shell: piping stdout and stderr to different processes
    http://tinyurl.com/2kw7hg


Author: Pierre Gaston <pierre.gaston@gmail.com>
Date: 2007/09/20 10:34:08 AM
written for http://www.bash-hackers.org/