Standard streams, Redirection and File manipulation in Linux
Learn how to use streams & redirection and manipulate files
If you are familiar with the basic Linux commands, the next things to learn are streams and how input/output redirection works. So, let's dive into these new concepts and gain some more knowledge.
Stream
What is a stream?
- A stream can be thought of as a channel connecting a processor (or logic unit) to input and output devices.
- Whatever data a program uses flows through a stream; in other words, a stream transfers data. On the Linux command line, that data is plain text.
- Text input can be taken from
- keyboard,
- file or,
- data piped or redirected from a command/program/process.
- Text output can be
- displayed in the terminal window,
- written over or appended to a file, or
- transferred to another command/program/process for further processing. This is known as piping.
- We have one input stream and two output streams.
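The sources and destinations listed above can be tried out in a few lines. This is only a minimal sketch: the scratch directory comes from mktemp and the file name notes.txt is made up for the illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# Output overwritten on a file: '>' replaces the file's contents.
echo "first line" > "$workdir/notes.txt"

# Output appended to a file: '>>' adds to the end instead.
echo "second line" >> "$workdir/notes.txt"

# Input taken from a file: '<' connects the file to standard input.
line_count=$(wc -l < "$workdir/notes.txt")

# Input piped from another command: '|' connects stdout to stdin.
word_count=$(cat "$workdir/notes.txt" | wc -w)

echo "lines: $line_count"
echo "words: $word_count"
```

Running the first echo twice with '>' would still leave a single line in the file, while '>>' keeps growing it.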
3 standard streams
- stdin (Standard Input)
- The standard input is data that is being passed to a program.
- File descriptor - 0
- stdout (Standard Output)
- displays data that is generated by a program.
- data is printed to the terminal if it is not redirected or piped.
- File descriptor - 1
- stderr (Standard Error)
- similar to the standard output except that it is used for error messages from a program.
- File descriptor - 2
Use of assigning file descriptors to standard streams
- Each standard stream has its own file descriptor so that the streams can be handled separately.
- Suppose you are testing a program and you want to keep track of errors. You can do this by sending the standard error into a file. This way you and others can keep track of all the errors in a program.
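Here is a rough sketch of that error-tracking idea. The function run_step and the file errors.log are invented for this example; the function simply stands in for "a program" that writes to both output streams.

```shell
logdir=$(mktemp -d)

# Hypothetical program: prints a result on stdout (fd 1)
# and a diagnostic on stderr (fd 2).
run_step() {
    echo "result of step $1"
    echo "warning: step $1 had a problem" >&2
}

# '2>>' appends fd 2 to the log; stdout is captured separately.
out1=$(run_step 1 2>> "$logdir/errors.log")
out2=$(run_step 2 2>> "$logdir/errors.log")

echo "$out1"
echo "$out2"
echo "--- collected errors ---"
cat "$logdir/errors.log"
```

Because the two streams have different file descriptors, the results and the warnings never get mixed up, even though both come from the same function.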
Redirection and Using file descriptors
- To redirect your standard output use '>' symbol. You may also write '1>' to specify that the output stream is standard output. But, by default '>' denotes standard output, so no need to write 1.
- To redirect your standard error use '2>' symbol. We have added '2' because it is the file descriptor of standard error.
- To refer to a particular stream in a redirection, use the file descriptor for that stream.
- 0 → standard input stream
- 1 → standard output stream
- 2 → standard error stream
- For example, in the below image you can see that we can redirect our output streams using '>' symbol and file descriptors.
- If you mention '2' which is the file descriptor of standard error then the text in the standard error stream will be redirected.
- In our example, let's look at the command
sanskriti@BhairaviKriti:~$ lg > result.txt
Since no file descriptor is mentioned here, the redirected stream is standard output by default. The command means: take the data in the standard output stream and overwrite the 'result.txt' file with it. But since lg is not a valid command, it puts no data in the standard output stream, so 'result.txt' is created (or emptied) but receives no text.
- The command does, however, put some data in the standard error stream. Therefore, we can redirect the data in the standard error stream to the 'result.txt' file by using '2>'.
- For the command
sanskriti@BhairaviKriti:~$ lg > result.txt
you didn't tell the shell to redirect the standard error to a specific place, so it prints the standard error on the screen, which is why you see the error
lg: command not found
- If you want neither to display the error nor to save it in a file, just redirect the error to /dev/null. E.g.,
sanskriti@BhairaviKriti:~$ lg 2> /dev/null
- For further understanding, watch this video: Data Streams
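The three cases discussed in this section can be replayed with a short script. This is a sketch under one assumption: the name no_such_command_for_demo stands in for the mistyped lg and is assumed not to exist on your system. The file names are also made up.

```shell
demo=$(mktemp -d)

# Stand-in for the mistyped command; assumed not to exist on this system.
bad_cmd=no_such_command_for_demo
status=0

# 1) Only stdout is redirected: the file is created but stays empty,
#    and the error message still appears on the terminal.
"$bad_cmd" > "$demo/result.txt" || status=$?

# 2) stderr is redirected: the error message lands in the file instead.
"$bad_cmd" 2> "$demo/error.txt" || status=$?

# 3) stderr is discarded: nothing is shown and nothing is saved.
"$bad_cmd" 2> /dev/null || status=$?

echo "result.txt size: $(wc -c < "$demo/result.txt") bytes"
echo "error.txt contains: $(cat "$demo/error.txt")"
echo "exit status of the failed command: $status"
```

The exact wording of the "not found" message differs between shells and distributions, so the script prints whatever your shell wrote instead of assuming a fixed text.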
What is pipe? ('|')
- It allows you to take the stdout of one command and pass it as the stdin of another command.
- With a pipe, the two commands are connected through a shared memory buffer: the first command writes into it and the second command reads from it.
- Example,
sanskriti@BhairaviKriti:~$ ls -la /etc | less
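Since less is interactive, here is a small non-interactive sketch of the same idea. The fruit names are just sample data made up for the example.

```shell
# Three commands chained with pipes: each one reads the previous
# command's stdout as its own stdin.
sorted_first=$(printf 'pear\napple\nmango\n' | sort | head -n 1)
echo "$sorted_first"

# The same listing as in the example above, but counting its lines
# instead of paging through them with less.
ls -la /etc | wc -l
```

Only stdout travels through the pipe. Anything a command writes to stderr still goes to the terminal unless you redirect it.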
Environment variables
- Environment variables store and provide useful information that the shell and some other processes use.
- They come in handy for the terminal, for the processes running in it, and for some of the commands that you execute.
echo $PATH
- Contains all the paths (directories, separated by colons) that your terminal will search whenever you try to execute a command.
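To make the search order easier to read, the PATH value can be split into one directory per line. This is a small sketch; it uses tr, which is covered further down, and the shell built-in command -v to ask where a command would be found.

```shell
# Print each directory in PATH on its own line, in search order.
echo "$PATH" | tr ':' '\n'

# 'command -v' reports what the shell would run for a given name.
ls_location=$(command -v ls)
echo "ls resolves to: $ls_location"
```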
Commands learned
env
- used to print environment variables.
env
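As a quick illustration, you can export a variable and then find it in the output of env. The variable names DEMO_GREETING and DEMO_COLOR are made up for this sketch.

```shell
# DEMO_GREETING is a made-up variable name for this example.
export DEMO_GREETING="hello"

# Exported variables show up in the list that env prints.
greeting_line=$(env | grep '^DEMO_GREETING=')
echo "$greeting_line"

# env can also run a single command with an extra variable set,
# without changing the current shell.
color_line=$(env DEMO_COLOR=blue sh -c 'echo "color is $DEMO_COLOR"')
echo "$color_line"
```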
less
- This command opens a file or the output of a command in a separate, scrollable view inside the terminal.
- Press 'q' to quit this view and get back to the terminal.
- Use it when you want to view larger files but don't want to fill up your terminal.
head
- Prints the first N lines of the given input. By default, it prints the first 10 lines of the specified files.
head -n 15 fileName
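A small sketch to try it out, using seq to generate 20 numbered lines as sample input (the temporary file is only there for the demonstration):

```shell
# Generate 20 numbered lines as sample input.
sample=$(mktemp)
seq 1 20 > "$sample"

# Default: the first 10 lines.
default_count=$(head "$sample" | wc -l)
echo "default number of lines: $default_count"

# Only the first 3 lines.
first_three=$(head -n 3 "$sample")
echo "$first_three"

rm -f "$sample"
```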
tail
- Complementary to the head command: it prints the last N lines of the given input (the last 10 by default).
- -f : This option shows the last ten lines of a file and keeps updating when new lines are added. As new lines are written to a log, the console shows them.
tail -n 5 fileName
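Here is a short sketch with the same kind of numbered sample input. It also shows how head and tail can be combined to cut out a range of lines from the middle of a file.

```shell
log=$(mktemp)
seq 1 20 > "$log"

# The last 5 lines.
last_five=$(tail -n 5 "$log")
echo "$last_five"

# Lines 11 to 15: take the first 15, then the last 5 of those.
middle=$(head -n 15 "$log" | tail -n 5)
echo "$middle"

# To follow a growing file, run this in a terminal and stop it with Ctrl+C:
#   tail -f "$log"
```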
sort
- used to sort a file, arranging the records in a particular order.
sort fileName
- arranges the records in ascending order.
sort -r fileName
- arranges the records in descending order.
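A minimal sketch of both orders, using three made-up fruit names as the records:

```shell
fruits=$(mktemp)
printf 'pear\napple\nmango\n' > "$fruits"

ascending=$(sort "$fruits")
descending=$(sort -r "$fruits")

echo "$ascending"
echo "---"
echo "$descending"
```

Note that sort does not change the file itself; it only prints the sorted records, so you need redirection if you want to keep the result.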
tr
- tr stands for translate.
- supports a range of transformations including uppercase to lowercase, squeezing repeating characters, deleting specific characters and basic find and replace.
- To transform contents of a file from uppercase to lowercase:
cat fileName | tr A-Z a-z
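Each of the transformations mentioned above can be sketched in one line. The sample strings are made up; -s (squeeze) and -d (delete) are standard tr options.

```shell
# Uppercase to lowercase.
lower=$(echo "Hello WORLD" | tr 'A-Z' 'a-z')

# Squeeze runs of repeated spaces into a single space (-s).
squeezed=$(echo "too    many   spaces" | tr -s ' ')

# Delete every digit (-d).
no_digits=$(echo "room 101" | tr -d '0-9')

# Basic find and replace, one character for another.
dashed=$(echo "2024/01/31" | tr '/' '-')

echo "$lower"
echo "$squeezed"
echo "$no_digits"
echo "$dashed"
```

tr works on single characters, not on whole words, so "find and replace" here means swapping one character for another.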
uniq
- This command is designed to work on sorted files.
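- used to remove repeated lines in a file. It only compares lines that are next to each other, which is why the input should be sorted.
- -c → counts the number of times each line is repeated.
- -u → shows only the lines which are unique.
- -d → shows only the repeated lines.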
uniq -d fileName
- If you want this command to work for unsorted files too, first sort the file and then pass the result as stdin to the uniq command.
sort fileName | uniq -c
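The options above are easier to compare side by side. This sketch uses an unsorted list of made-up names, so every call sorts first.

```shell
visits=$(mktemp)
printf 'bob\nalice\nbob\ncarol\nalice\nbob\n' > "$visits"

# uniq only collapses adjacent duplicates, so sort first.
distinct=$(sort "$visits" | uniq)       # each name once
repeated=$(sort "$visits" | uniq -d)    # names that occur more than once
only_once=$(sort "$visits" | uniq -u)   # names that occur exactly once

echo "$distinct"
echo "repeated: $repeated"
echo "only once: $only_once"

# Counts per name; the exact column padding varies between systems.
sort "$visits" | uniq -c
```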
wc
wc fileName
→ provides the number of lines, word count and byte count of a file
wc -l fileName
→ provides the number of lines
wc -w fileName
→ provides the word count
wc -c fileName
→ provides the byte count (file size)
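A small sketch with a two-line sample file. Feeding the file through '<' is a choice made here so that wc prints only the number and not the file name.

```shell
poem=$(mktemp)
printf 'roses are red\nviolets are blue\n' > "$poem"

# Reading from stdin ('<') makes wc print only the number.
lines=$(wc -l < "$poem")
words=$(wc -w < "$poem")
bytes=$(wc -c < "$poem")

echo "lines=$lines words=$words bytes=$bytes"
```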
grep
- searches a file for a particular pattern of characters, and displays all lines that contain that pattern. Example,
env | grep PATH
→ It will display all the lines containing the text 'PATH'.
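Because grep matches the pattern anywhere in a line, other variables whose names contain PATH will also show up. The sketch below uses a made-up sample file in place of real env output to show the difference an anchored pattern makes.

```shell
conf=$(mktemp)
printf 'PATH=/usr/bin\nHOME=/home/demo\nMANPATH=/usr/share/man\n' > "$conf"

# Every line containing PATH, anywhere in the line.
matches=$(grep PATH "$conf")

# Anchoring with ^ keeps only lines that start with PATH=.
exact=$(grep '^PATH=' "$conf")

echo "$matches"
echo "---"
echo "$exact"
```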
For further reading, take a look at this amazing article from where I have stolen some pictures :) - Input, Output and Error Redirection in Linux