The Swiss Army Knife of Text Operations in Linux
Sorting and Filtering Data
One of the most common tasks performed on the command line is sorting data. This can be done using the sort command, which is part of the GNU Core Utilities package. Sorting is useful when you need to arrange a list of items in a specific order, such as alphabetically or numerically.
Here's an example of how to use the sort command:
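The listing below is a minimal sketch assembled from the options covered in the breakdown that follows. It assumes GNU sort (for -S and --parallel) and uses a small, made-up file.txt in a temporary directory so that it can be run safely:

```shell
#!/bin/sh
# Create a throwaway input file (the sample lines are made up for illustration).
workdir=$(mktemp -d)
printf '%s\n' '  banana' 'apple' 'Apple' 'cherry' 'apple' > "$workdir/file.txt"

# Sort with byte-wise comparison (LC_ALL=C), dropping duplicates (-u),
# ignoring leading blanks (-b) and nonprinting characters (-i), folding
# case (-f), allowing a buffer of up to 80% of memory (-S 80%), and
# running up to eight sorts in parallel.
sorted=$(LC_ALL=C sort -u -b -i -f -S 80% --parallel=8 "$workdir/file.txt")
printf '%s\n' "$sorted"
```

With this input, the three spellings of "apple" collapse into a single line, followed by the banana and cherry lines.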
Let's break down this command:
LC_ALL=C
: Sets the locale to C for this one command, so sort compares lines byte by byte instead of applying language-specific collation rules. This is usually faster and gives the same order on every system.
-u
: Outputs only the first line of each run of lines that compare equal. Without this option, duplicate lines are kept in the output.
-b
: Ignores leading blanks when comparing lines. This is useful if your data contains spaces or tabs at the beginning of some lines.
-i
: Ignores nonprinting characters when comparing. Despite the letter, this option has nothing to do with case.
-f
: Folds lowercase letters to uppercase before comparing, which makes the sort case-insensitive. Combined with -u, lines that differ only in case count as duplicates.
-S
: Sets the size of the in-memory sort buffer. Given a value of 80%, sort may use up to 80% of physical memory before it spills to temporary files on disk.
--parallel=8
: Runs up to eight sorts concurrently, which can speed up large jobs by letting multiple cores work at the same time.
The file.txt at the end of the command specifies the input file that you want to sort.
Deleting Elements from JSON Data
Another common operation performed on the command line is deleting data from files, directories, or databases. The JSON format is a popular way to store structured data, and various tools and libraries can be used to manipulate JSON documents.
Here's an example of how to delete an element from a JSON file using the jq command:
This command deletes the .invite_code field from each object in the input file. Inside a jq program, the | character pipes the output of one filter into the next, much as a shell pipe does for commands, and the select() function passes its input through only when the condition inside it holds.
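As a minimal sketch of this kind of deletion, a bare del() is enough when the field should be removed unconditionally. The users.json file and its contents are made up for illustration; only the invite_code field name comes from the text above:

```shell
#!/bin/sh
# Hypothetical sample data: one JSON object per line.
workdir=$(mktemp -d)
cat > "$workdir/users.json" <<'EOF'
{"id": 1, "name": "ann", "invite_code": "abc123"}
{"id": 2, "name": "bob"}
EOF

# del(.invite_code) removes the field where it is present and leaves
# objects without it unchanged; -c prints each result on a single line.
cleaned=$(jq -c 'del(.invite_code)' "$workdir/users.json")
printf '%s\n' "$cleaned"
```

jq applies the filter to every object in the input in turn, so the second object, which has no invite_code, passes through untouched.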
Piped Command for JSON Data
If you need to perform multiple deletions on a single line, you can use shell pipes (|) to chain together several jq commands:
LC_ALL=C jq 'del(.updatedAt | select (..))' file | jq 'del(.createdAt | select (..))' | jq 'del(.roles | select (..))' | jq --raw-output -c 'del(.preferences.fcm_token | select (..))' > outfile
This example deletes four different fields from the input JSON document and writes the result to a new file called outfile.
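Because del() accepts several comma-separated paths, the four deletions can also be sketched as a single jq invocation, which avoids parsing and re-serializing the document four times. The sample document below is made up; only the field names come from the command above:

```shell
#!/bin/sh
# Hypothetical input document containing the four fields to remove.
workdir=$(mktemp -d)
cat > "$workdir/file" <<'EOF'
{"id": 7, "createdAt": "2020-01-01", "updatedAt": "2020-02-01", "roles": ["admin"], "preferences": {"fcm_token": "t0k3n", "theme": "dark"}}
EOF

# One jq process, four paths; -c writes compact single-line output.
jq -c 'del(.updatedAt, .createdAt, .roles, .preferences.fcm_token)' \
    "$workdir/file" > "$workdir/outfile"
cat "$workdir/outfile"
```

One behavioural difference to keep in mind: the select(..) guard skips fields whose value is null or false, whereas a plain del() removes them regardless.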
Deleting Directories
Directories can be deleted using the rm command, which is also part of the GNU Core Utilities package. You can use the -f option to force deletion without prompting for confirmation, and the -r option to delete a directory recursively, together with all of its files and subdirectories.
Here's an example that deletes all directories that don't match a specific pattern:
The find command is used to locate the directories, and -type d specifies that only directories should be included in the search. The -not -name "US*" test excludes any directory whose name begins with "US". Finally, the -exec rm -f -r {} \; part of the command runs rm on each matching directory, with {} standing for the path that find matched.
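As a cautious sketch of this pattern, the version below restricts the search to direct children of the starting directory with -mindepth 1 and -maxdepth 1 (options found in GNU and BSD find, not in POSIX). That way find neither matches the starting directory itself nor removes non-matching subdirectories inside a US* directory that should be kept. The directory names are made up, and everything happens in a temporary directory:

```shell
#!/bin/sh
# Build a throwaway directory tree to experiment on.
workdir=$(mktemp -d)
mkdir -p "$workdir/USA/archive" "$workdir/US_west" "$workdir/CA" "$workdir/DE/old"

# Preview first: list what would be deleted.
find "$workdir" -mindepth 1 -maxdepth 1 -type d -not -name 'US*' -print

# Delete every top-level directory whose name does not start with "US".
# "{} +" passes many paths to a single rm call instead of one rm per path.
find "$workdir" -mindepth 1 -maxdepth 1 -type d -not -name 'US*' -exec rm -rf {} +

ls "$workdir"
```

Running the -print variant before the -exec variant is a cheap way to check the match list, since rm -rf gives no second chance.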
Decompressing Files with Password
If you need to perform a batch operation on multiple files or directories, you can use a for loop in combination with other commands. For example, here's how you could decompress all RAR archives in the current directory using a password:
This script will prompt for a password once and then use it to extract each RAR archive found.
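A minimal sketch of such a loop is shown below. It assumes the unrar command-line tool, whose x command extracts with full paths and whose -p switch takes the password directly after the letter. The function name extract_rars is made up for illustration:

```shell
#!/bin/sh
# Sketch: prompt once for a password, then extract every .rar file in the
# current directory. Assumes the unrar tool is installed.
extract_rars() {
    printf 'Password: ' >&2
    # Hide the typed characters only when reading from a terminal.
    if [ -t 0 ]; then stty -echo; fi
    IFS= read -r password
    if [ -t 0 ]; then stty echo; printf '\n' >&2; fi

    for archive in ./*.rar; do
        # Skip the literal pattern when no archive matches.
        [ -e "$archive" ] || continue
        # x = extract with full paths; -p<password> supplies the password
        # (no space between -p and the value).
        unrar x -p"$password" "$archive"
    done
}
```

Call extract_rars from the directory that holds the archives. Note that a password passed on the command line is visible in the process list while unrar runs, so this approach suits single-user machines best.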
Converting Files
Another useful command is find, which can locate files or directories based on various criteria. Here's an example that finds all DOC files in the current directory and its subdirectories, and converts them to TXT format using the catdoc utility:
The -iname option makes the file name match case-insensitive.
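One way to sketch this conversion is shown below. It assumes catdoc is installed and relies on the fact that catdoc writes the extracted text to standard output. The function name convert_docs and the choice to place each .txt file next to its source are assumptions made for illustration:

```shell
#!/bin/sh
# Sketch: convert every .doc file under a directory to a .txt file next to it.
convert_docs() {
    # $1 is the directory to search (defaults to the current directory).
    find "${1:-.}" -type f -iname '*.doc' -exec sh -c '
        for doc in "$@"; do
            # Strip the extension and write the text alongside the original.
            catdoc "$doc" > "${doc%.*}.txt"
        done
    ' sh {} +
}
```

Passing the paths as arguments to an inner sh, rather than splicing {} into the script text, keeps file names with spaces or quotes intact. Because of -iname, report.doc and REPORT.DOC are both picked up.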
Finding Directories
If you are looking for directories with very specific names, you can use a regular expression (regex) with the find command. Here's an example that finds all directories in the current directory whose names consist of exactly two uppercase letters:
This will output a list of matching directory paths.
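A minimal sketch of such a search follows. It assumes a find that supports -regex and -maxdepth (GNU and BSD find do; POSIX does not require them), and the directory names are made up for illustration:

```shell
#!/bin/sh
# Build a throwaway set of directories, plus one regular file.
workdir=$(mktemp -d)
mkdir "$workdir/US" "$workdir/DE" "$workdir/fr" "$workdir/USA" "$workdir/U1"
touch "$workdir/IT"   # a regular file, not a directory

# -regex must match the whole path as find prints it, hence the leading "./".
# LC_ALL=C keeps [A-Z] limited to the ASCII uppercase letters.
matches=$(cd "$workdir" && LC_ALL=C find . -maxdepth 1 -type d -regex '\./[A-Z][A-Z]' | sort)
printf '%s\n' "$matches"
```

Here only ./DE and ./US are printed: fr is lowercase, USA is too long, U1 contains a digit, and IT is a file rather than a directory.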