Working on a Linux distribution shows you how malleable the computing world can be. For instance, removing white space characters from text files may sound like a tedious task, but the built-in Linux tools make it straightforward.
White space is not only horizontal, like the spacing between words in this article; it also exists as vertical spacing between lines and paragraphs. So why remove white space? The primary reason is to clean up the layout of the target text file.
Reference Text File
Consider the following sample text file.
$ sudo nano sample_file.txt
The sample file contains both vertical and horizontal white space characters. We can use the cat command with the -n option to display the file with numbered lines.
$ cat -n sample_file.txt
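To follow along without the original file, a comparable sample can be generated from the shell. This is only a sketch: the file contents below are hypothetical and chosen to include double spaces, a tab, and an empty line. The last command uses sed's `l` command, which prints a tab as `\t` and marks each line end with `$`, so the otherwise invisible characters can be inspected.

```shell
# Work in a throwaway directory so no real file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical contents: double spaces, a tab (\t), and an empty line.
printf 'Linux  is\tfun\n\nsecond   line\n' > sample_file.txt

# Numbered view, as used throughout the article.
cat -n sample_file.txt

# Make invisible characters explicit: tab prints as \t, line end as $.
sed -n 'l' sample_file.txt
```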
The white space characters typically found in a text file are spaces, tabs, and line breaks. To remove them, we will use three built-in Linux commands.
Method 1: Using the tr Command
The tr command reads the above text file on its standard input, deletes the specified white space characters, and writes the result to standard output. It does not modify the file itself, so the output must be redirected if you want to keep it.
We do, however, need to tell tr which characters to delete: horizontal white space characters, vertical white space characters, or both.
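Because tr only writes to standard output, saving the cleaned text means redirecting it somewhere. A minimal sketch, assuming a hypothetical sample file and a temporary file named sample_file.tmp (both names are illustrative):

```shell
# Work in a throwaway directory so the sketch does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample content: spaces, a tab, and an empty line.
printf 'Linux  is\tfun\n\nsecond   line\n' > sample_file.txt

# Write to a temporary file first, then replace the original.
# Redirecting straight back onto sample_file.txt would truncate it
# before tr has a chance to read it.
tr -d '[:blank:]' < sample_file.txt > sample_file.tmp &&
    mv sample_file.tmp sample_file.txt

cat -n sample_file.txt
```

The `&&` ensures the original is only replaced when tr exits successfully.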
Remove Horizontal White Space Characters in File
We will pass the "[:blank:]" character class, which matches spaces and tabs, to tr -d and pipe the result to the cat command to print the final layout of the file.
$ tr -d "[:blank:]" < sample_file.txt | cat -n
Running this command removes the horizontal white space characters (spaces and tabs) from the text while leaving the line breaks in place.
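The effect is easiest to see on a small inline input. The sketch below feeds tr a hypothetical two-line string through printf instead of a file; the variable name result is illustrative.

```shell
# Hypothetical input: two lines containing double spaces and a tab (\t).
result=$(printf 'Linux  is\tfun\nsecond   line\n' | tr -d '[:blank:]')

# Spaces and tabs are gone; the line break between the lines survives.
printf '%s\n' "$result"
```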
Remove All White Space Characters in File
The "[:space:]" character class removes both horizontal and vertical white space characters, including line breaks, so the whole file collapses into a single line.
$ tr -d "[:space:]" < sample_file.txt | cat -n
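As a quick check of what "all white space" means here, the following sketch runs the same tr invocation on a hypothetical inline input that contains spaces, a tab, and an empty line.

```shell
# Hypothetical input: three lines, the middle one empty.
result=$(printf 'Linux  is\tfun\n\nsecond   line\n' | tr -d '[:space:]')

# Everything is joined into one run of non-white-space characters.
printf '%s\n' "$result"
```

Note that the trailing newline of the last line is deleted as well, so when the tr output is written directly to a terminal or a file, it ends without a final line break.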
Method 2: Using the sed Command
Since the sed command works with regular expressions, we can use it in the following manner:
$ sed 's/[[:blank:]]//g' sample_file.txt | cat -n    [Remove Horizontal Spaces]
$ sed ':a; N; s/[[:space:]]//g; ta' sample_file.txt | cat -n    [Remove All Spaces]
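The two sed commands can be tried on a small inline input. This is a sketch that assumes GNU sed, since the one-line `:a; N; ...; ta` form with semicolons after the label is not accepted by every sed implementation; the input string and variable names are hypothetical.

```shell
# Hypothetical input: spaces, a tab (\t), and an empty line.
sample='Linux  is\tfun\n\nsecond   line\n'

# Per line: delete spaces and tabs, keep the line breaks.
horizontal=$(printf "$sample" | sed 's/[[:blank:]]//g')

# Loop: N appends the next line (with its newline) to the pattern
# space, s deletes every white space character, and ta jumps back
# to label a for as long as the substitution succeeded.
everything=$(printf "$sample" | sed ':a; N; s/[[:space:]]//g; ta')

printf '%s\n' "$horizontal"
printf '%s\n' "$everything"
```

One caveat with the loop: as far as GNU sed is concerned, N prints the pattern space and stops when there is no next line, so a file consisting of a single line would likely pass through unchanged.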
Method 3: Using the awk Command
This powerful text-processing utility uses a C-like scripting language, along with built-in functions and variables, to manipulate text flexibly.
We will utilize its gsub function in the following manner to remove horizontal white space characters.
$ awk '{gsub(/[[:blank:]]/,""); print}' sample_file.txt | cat -n
To remove all white space characters from a text file, we will make the following modification to the above command:
$ awk -v ORS="" '{gsub(/[[:space:]]/,""); print}' sample_file.txt | cat -n
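The difference between the two awk commands can be seen on a small inline input. In this sketch the input and variable names are hypothetical, and the awk in use is assumed to support POSIX character classes such as [[:blank:]] (gawk does; some older awk builds may not). Since awk splits its input into records at each newline, gsub never sees the line breaks; it is the empty ORS (output record separator) that drops them.

```shell
# Hypothetical input: spaces, a tab (\t), and an empty line.
sample='Linux  is\tfun\n\nsecond   line\n'

# Delete spaces and tabs in each record; print keeps the line breaks.
horizontal=$(printf "$sample" | awk '{gsub(/[[:blank:]]/,""); print}')

# Same substitution, but an empty ORS joins all records together.
everything=$(printf "$sample" | awk -v ORS="" '{gsub(/[[:space:]]/,""); print}')

printf '%s\n' "$horizontal"
printf '%s\n' "$everything"
```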
With the above-discussed three Linux command approaches, getting rid of unwanted white space characters on your text files in Linux should be a non-issue.