Pipelines
Revision as of 16:38, 15 September 2013
Philosophy of the command line: small tools that each do one thing very well, loosely connected into custom "pipelines" or workflows that do specific (or surprising) things.
The pipeline is a fundamental feature of the UNIX command line. By connecting the output of one program to the input of another, you can build chains of commands. In this way, simple commands become the building blocks of more sophisticated, personalized scripts that do powerful things.
stdin and stdout
Every program reads its input from "standard in" (stdin) and sends its output to "standard out" (stdout). By default, stdin is taken from the keyboard and stdout is displayed on the screen. These mappings can be changed, however, through redirection with the special characters '>', '<', and '|'.
Redirecting stdout with >
date
Displays the date on the screen (date does not use stdin).
date > time.txt
Redirects the output of date and "saves as" time.txt.
cat time.txt
Displays time.txt (on the screen by default).
Variation: adding to a file with >>
date >> time.txt
Adds on to the end of the file (or "appends" in CS lingo) instead of overwriting it.
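The difference between > and >> is easy to check with a throwaway file. Here is a minimal sketch, assuming a POSIX-style shell; the scratch directory and the file name log.txt are made up for illustration and are not part of the examples above:

```shell
# Work in a scratch directory so no real file gets overwritten (illustrative setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "first" > log.txt      # > creates log.txt (or empties it if it already exists)
echo "second" >> log.txt    # >> adds a line to the end of the file
after_append=$(cat log.txt)

echo "third" > log.txt      # > again: the two earlier lines are thrown away
after_overwrite=$(cat log.txt)

printf 'after >>:\n%s\nafter >:\n%s\n' "$after_append" "$after_overwrite"
```

After the >> step the file holds both lines; after the final > it holds only the last one.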
Redirecting stdin with <
wc -l
The wc ("word count") program can be used to simply count the number of lines of a text file (with the -l option). When the above command is run without a file, wc "listens to stdin", which is the console/keyboard. The program appears to do nothing and seems to "hang", but it is just waiting for input. Type in a few lines, such as:
testing one two three <CTRL-D>
Finally, pressing Ctrl-D on a blank line signals "END OF FILE" (stop reading input), and wc will snap into action and output the number of lines it read from stdin.
wc -l < mytextfile
Tells wc to use mytextfile as stdin and thus shows how many lines are in that file.
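To see what "use mytextfile as stdin" means in practice, compare it with handing wc the file name as an argument. This is a small sketch assuming a POSIX-style shell; the scratch directory and the three-line file contents are invented for the example:

```shell
# Create a three-line mytextfile in a scratch directory (illustrative contents).
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$workdir/mytextfile"

# Via stdin: wc never sees the file name, so it prints only the count.
from_stdin=$(wc -l < "$workdir/mytextfile")

# Via an argument: wc opens the file itself and prints the name after the count.
from_arg=$(cd "$workdir" && wc -l mytextfile)

echo "stdin:    $from_stdin"
echo "argument: $from_arg"
```

The bare number from the stdin form is often handier when the count is fed into another command.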
Piping (stdout=>stdin) with |
ls | wc -l
Is sort of like:
ls >< wc -l (this is invalid!)
In that the stdout of ls is "piped" to become the stdin of wc. The result is a count of the files in the current directory. NB: the ls command is smart and disables multi-column output when its output is being redirected (i.e. not going straight to the console). To see this, try:
ls | cat
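The file count can be tried out safely in a scratch directory. The sketch below assumes a POSIX-style shell; the three file names are invented, and grep (not covered above) is brought in only to show that pipes can be chained more than once:

```shell
# Scratch directory with exactly three files (illustrative names).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.txt b.txt c.txt

# ls writes one name per line into the pipe, so wc -l counts the files.
file_count=$(ls | wc -l)
echo "files: $file_count"

# Pipes chain: filter the listing with grep first, then count what is left.
b_count=$(ls | grep b | wc -l)
echo "names containing b: $b_count"
```

Note that this counts lines, not files, so a file name that itself contains a newline would throw the count off.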
The smoking cat: cat+pipe
Note that you can also get the same effect as < by using the cat program (which just copies the contents of a file to stdout):
cat somefile | wc -l
This is sometimes nicer to read, since the left-to-right flow may be easier to follow than a "<" at the end.
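As a quick check that the two forms agree, here is a sketch assuming a POSIX-style shell, with an invented two-line somefile. It also shows a third spelling: the shell accepts the redirection before the command name, which keeps the left-to-right order without starting an extra cat process:

```shell
# Two-line sample file in a scratch directory (illustrative contents).
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/somefile"

via_cat=$(cat "$workdir/somefile" | wc -l)      # cat + pipe
via_redirect=$(wc -l < "$workdir/somefile")     # redirection at the end
via_leading=$(< "$workdir/somefile" wc -l)      # redirection written first

echo "cat|wc: $via_cat  wc<: $via_redirect  <wc: $via_leading"
```

All three report the same line count; which one to use is mostly a matter of taste.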
Other commands
shuf, head, tail
Shuffle a file
shuf is a simple program that randomizes the order of the lines of a file. It can be run like:
shuf < somefile
or
cat somefile | shuf
Also, if shuf is run with the name of a file, it will use that file as its input:
shuf somefile
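The three ways of feeding shuf can be compared side by side. This sketch assumes GNU coreutils (where shuf comes from) and an invented four-line somefile; since the order is random, it checks only that no lines are gained or lost:

```shell
# Four-line sample file in a scratch directory (illustrative contents).
workdir=$(mktemp -d)
printf 'a\nb\nc\nd\n' > "$workdir/somefile"

# One shuffled copy, just to look at; the order differs from run to run.
shuf "$workdir/somefile"

# Each input style hands shuf the same four lines.
lines_redirect=$(shuf < "$workdir/somefile" | wc -l)
lines_pipe=$(cat "$workdir/somefile" | shuf | wc -l)
lines_arg=$(shuf "$workdir/somefile" | wc -l)

echo "line counts: $lines_redirect $lines_pipe $lines_arg"
```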
Heads and Tails
To see the top of a file, you can use head:
cat somefile | head
Head has an option (-n) for how many lines to show.
Similarly, tail shows the bottom of a file. This one is very useful for quickly checking a log file:
sudo tail /var/log/apache2/error.log
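Both commands show 10 lines unless told otherwise, and tail accepts the same -n option as head. A small sketch, assuming a POSIX-style shell and an invented five-line somefile (so no log file or sudo is needed to try it):

```shell
# Five-line sample file in a scratch directory (illustrative contents).
workdir=$(mktemp -d)
printf 'line1\nline2\nline3\nline4\nline5\n' > "$workdir/somefile"

top=$(head -n 2 "$workdir/somefile")       # first two lines
bottom=$(tail -n 2 "$workdir/somefile")    # last two lines

# Piped together: keep the first three lines, then the last of those, i.e. line 3.
middle=$(head -n 3 "$workdir/somefile" | tail -n 1)

printf 'top:\n%s\nbottom:\n%s\nmiddle: %s\n' "$top" "$bottom" "$middle"
```

Chaining head into tail like this is a simple way to pull out a single line by its number.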
Random Line
To pick a random line of a file, you could first shuffle it, then pick the first line:
cat somefile | shuf | head -n1
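The pipeline above can be tried on a small file. The sketch below assumes GNU coreutils and an invented three-line somefile; it also shows that GNU shuf has its own -n option, which does the same job in a single command:

```shell
# Three-line sample file in a scratch directory (illustrative contents).
workdir=$(mktemp -d)
printf 'red\ngreen\nblue\n' > "$workdir/somefile"

# The three-step pipeline: copy the file out, shuffle it, keep the first line.
pick=$(cat "$workdir/somefile" | shuf | head -n1)

# GNU shuf can stop after one line by itself with -n 1.
pick_short=$(shuf -n 1 "$workdir/somefile")

echo "pipeline picked: $pick"
echo "shuf -n 1 picked: $pick_short"
```

Either way the result is one of the file's lines, and it will usually differ between runs.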