Monday, February 11, 2008

Bash Command-line Programming: Redirection

The bash shell is also a pretty handy programming language. One way to use it is to write scripts. Another is to write ad-hoc, one-time-use programs for very specific tasks, right on the command line. I do this a lot, and find myself using the same techniques over and over.

In this post, I'll share some useful command-line techniques for redirection.

There are many ways other than pipes for redirecting stdin and stdout:

  • cmd &>file: send both stdout and stderr of cmd to file. Equivalent to cmd >file 2>&1.
  • cmd <file: feeds the contents of file to the stdin of cmd. Similar to cat file | cmd, except that each command in a pipeline runs in a subshell with its own scope, while a redirection keeps everything in the same scope.
  • cmd <<<word: expands word and feeds the result, followed by a newline, to the stdin of cmd. word can be anything you'd type as a program argument. For example, cmd <<<$VAR feeds the value of $VAR to cmd.
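
To make the scoping difference concrete, here's a minimal sketch. The scratch file and the variable names (piped, redirected, first, rest, both) are made up for illustration, and it assumes bash with its default options (in particular, lastpipe turned off).

```shell
#!/usr/bin/env bash
# Hypothetical scratch file; its name and contents exist only for this example.
tmp=$(mktemp)
printf 'a\nb\nc\n' >"$tmp"

# Pipe: the while loop runs in a subshell, so its changes to `piped` are lost.
piped=0
cat "$tmp" | while read -r line; do piped=$((piped + 1)); done

# Redirection: the loop runs in the current shell, so `redirected` keeps its value.
redirected=0
while read -r line; do redirected=$((redirected + 1)); done <"$tmp"

# Here-string: the expanded word (plus a trailing newline) becomes stdin.
greeting="hello world"
read -r first rest <<<"$greeting"

# &> sends both stdout and stderr to the same file, so both lines land in it.
{ echo out; echo err >&2; } &>"$tmp"
both=$(grep -c '' "$tmp")   # count the lines that ended up in the file

echo "piped=$piped redirected=$redirected first=$first both=$both"
rm -f "$tmp"
```

With default settings the pipe version leaves piped at 0, while the redirected loop counts all three lines.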

Also, sometimes programs need arguments on the command line, rather than through stdin:

  • cmd $(<file): expands the contents of file, split on whitespace, as arguments to cmd. For example, if the file toRemove contains a list of files, rm $(<toRemove) removes those files (as long as none of the names contain whitespace or glob characters).
  • cmd1 <(cmd2): acts as though it creates a temporary file containing the output of cmd2, then puts the name of that file as an argument to cmd1. This is handy when cmd1 expects filename arguments. For example, to see the difference between the contents of directories dir1 and dir2, use diff <(ls dir1) <(ls dir2). This is conceptually equivalent to
    ls dir1 >/tmp/contentsDir1
    ls dir2 >/tmp/contentsDir2
    diff /tmp/contentsDir1 /tmp/contentsDir2
    rm /tmp/contentsDir1 /tmp/contentsDir2
    
    (only conceptually, though, since bash actually uses a FIFO or a /dev/fd entry rather than a temporary file). Another handy command to use with this is comm.
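
Here's a rough sketch of both techniques in one place. The scratch directory, its file names, and the variables (work, argc, common, changes) are invented for the example, and it assumes mktemp, comm, and diff are available.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory; every name under it is made up for illustration.
work=$(mktemp -d)
mkdir "$work/dir1" "$work/dir2"
touch "$work/dir1/a" "$work/dir1/b" "$work/dir2/b" "$work/dir2/c"

# $(<file) expands to the file's contents; left unquoted, the result is
# split on whitespace into separate arguments.
printf '%s\n' one two three >"$work/args"
set -- $(<"$work/args")   # unquoted on purpose, so word splitting applies
argc=$#

# <(cmd) expands to a path that the outer command can open and read.
# comm -12 prints only the lines common to both (sorted) inputs.
common=$(comm -12 <(ls "$work/dir1") <(ls "$work/dir2"))

# diff exits non-zero when its inputs differ, hence the `|| true`.
changes=$(diff <(ls "$work/dir1") <(ls "$work/dir2") || true)

echo "argc=$argc common=$common"
echo "$changes"
rm -rf "$work"
```

Running echo <(true) on its own is a quick way to see what kind of path bash substitutes on your system.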

Finally, you sometimes want to redirect to and from multiple programs at once:

  • { cmd1; cmd2; cmd3; } | cmd: pipes the output of cmd1, cmd2, and cmd3, in that order, to cmd. The spaces inside the braces and the final semicolon are required.
  • cmd | tee >(cmd1) >(cmd2) >(cmd3) >/dev/null: pipes the output of cmd to cmd1, cmd2, and cmd3 in parallel. In the same way that <(cmd) is replaced with the name of a file containing the stdout of cmd, >(cmd) is replaced with the name of a file that becomes the stdin of cmd. Since tee writes its stdin to each given file, you can combine it with >(cmd) to send the output of one command to the stdin of many. The final >/dev/null discards the stdout of tee, which we no longer need. Doesn't come up too often, but it's certainly neat.
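
As a small sketch of both patterns: the commands below (echo, paste, seq, grep, sed, sort) are stand-ins for cmd, cmd1, cmd2, and cmd3, and the variable names joined and summary are made up. The consumers inside >(...) run asynchronously, so this example pipes their combined output through sort to collect it in a stable order.

```shell
#!/usr/bin/env bash
# Grouping: all three commands share one stdout, which feeds a single pipe.
joined=$({ echo one; echo two; echo three; } | paste -sd, -)

# Fan-out: tee copies its stdin to each >(cmd). The consumers inherit the
# stdout tee had before >/dev/null was applied, so their output still
# reaches sort, which waits until every one of them has finished.
summary=$(
  seq 1 10 |
    tee >(grep -c . | sed 's/^/count=/') \
        >(sed -n '1s/^/first=/p') \
        >(sed -n '$s/^/last=/p') \
        >/dev/null |
    sort
)

echo "$joined"
echo "$summary"
```

Each consumer here reads its whole input. One that quits early, such as head -n 1, can close its end of the pipe while tee is still writing, which may stop tee before the other consumers have seen everything.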
