13 shell-based text processing tools

Here is a fragment of an upcoming book, Essential Tools and Practices for the Aspiring Software Developer, by Balthazar Rouberol and Etienne Brodu. The book is meant to help educate the next generation of developers. It will cover topics such as mastering the console, setting up and working efficiently in the command shell, managing code versions with git, SQL fundamentals, tools like Make, jq and regular expressions, the basics of networking, as well as best practices for software development and collaboration. The authors are currently working hard on the project and invite everyone to join the mailing list.


Shell Text Processing


One of the things that make the command shell an invaluable tool is the large number of text processing commands and the ability to easily combine them into pipelines, building complex processing workflows. These commands make many tasks trivial: analyzing text and data, converting data between formats, filtering lines, and so on.

When working with text data, the main principle is to break any complex problem into many smaller ones - and solve each of them using a specialized tool.

Make each program do one thing well. - Basics of the Unix Philosophy

The examples in this chapter may seem a little far-fetched at first glance, but this is done on purpose. Each of the tools is designed to solve one small problem. However, when combined, they become extremely powerful.

We will look at some of the most common and useful text processing commands in the shell and demonstrate real workflows that tie them together. I suggest looking at the man pages of these commands to see the full breadth of possibilities at your disposal.

An example CSV file is available online. You can download it to work through the material.

cat


The cat command is used to concatenate one or more files and display their contents on the screen.

$ cat Documents/readme
Thanks again for reading this book!
I hope you're following so far!

$ cat Documents/computers
Computers are not intelligent
They're just fast at making dumb things.

$ cat Documents/readme Documents/computers
Thanks again for reading this book!
I hope you are following so far!

Computers are not intelligent
They're just fast at making dumb things.

head


head prints the first n lines of a file. This can be very useful for peeking into a file of unknown structure and format without flooding the entire console with text.

$ head -n 2 metadata.csv
metric_name,metric_type,interval,unit_name,per_unit_name,description,orientation,integration,short_name
mysql.galera.wsrep_cluster_size,gauge,,node,,The current number of nodes in the Galera cluster.,0,mysql,galera cluster size

If -n is not specified, head prints the first ten lines of the given file or input stream.
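
For instance, we can check this default by counting the lines of the output with wc -l (covered below):

$ head metadata.csv | wc -l
10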

tail


tail is the counterpart of head: it displays the last n lines of a file.

$ tail -n 1 metadata.csv
mysql.performance.queries,gauge,,query,second,The rate of queries.,0,mysql,queries

If you want to print all the lines located after the nth line (including it), you can use the argument -n +n.

$ tail -n +42 metadata.csv
mysql.replication.slaves_connected,gauge,,,,Number of slaves connected to a replication master.,0,mysql,slaves connected
mysql.performance.queries,gauge,,query,second,The rate of queries.,0,mysql,queries

Our file has 43 lines, so tail -n +42 outputs only its 42nd and 43rd lines.

If -n is not specified, tail outputs the last ten lines of the given file or input stream.

tail -f, or tail --follow, displays the last lines of a file and then every new line as it is written to the file. This is very useful for watching activity in real time, for example what is being written to web server logs.
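
For example, to watch requests being appended to a web server log as they arrive (the log path here is only an illustration; press Ctrl-C to stop following):

$ tail -f /var/log/nginx/access.log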

wc


wc (word count) displays the number of characters (-c), words (-w) or lines (-l) in the specified file or stream.

$ wc -l metadata.csv
43  metadata.csv
$ wc -w metadata.csv
405 metadata.csv
$ wc -c metadata.csv
5094 metadata.csv

By default, wc displays all three counts.

$ wc metadata.csv
43     405    5094 metadata.csv

If the text data is piped or redirected into stdin, only the counts are displayed, without the file name.

$ cat metadata.csv | wc
43     405    5094
$ cat metadata.csv | wc -l
43
$ wc -w < metadata.csv
405

grep


grep is a Swiss-army knife for filtering lines according to a given pattern.

For example, we can find all occurrences of the word mutex in a file.

$ grep mutex metadata.csv
mysql.innodb.mutex_os_waits,gauge,,event,second,The rate of mutex OS waits.,0,mysql,mutex os waits
mysql.innodb.mutex_spin_rounds,gauge,,event,second,The rate of mutex spin rounds.,0,mysql,mutex spin rounds
mysql.innodb.mutex_spin_waits,gauge,,event,second,The rate of mutex spin waits.,0,mysql,mutex spin waits

grep can process either files given as arguments or a stream of text passed to its stdin. We can therefore chain several grep commands to progressively filter the text. In the following example, we filter the lines of our metadata.csv file to find the lines containing both mutex and OS.

$ grep mutex metadata.csv | grep OS
mysql.innodb.mutex_os_waits,gauge,,event,second,The rate of mutex OS waits.,0,mysql,mutex os waits

Let's look at some of grep's options and their behavior.

grep -v performs inverse matching: it prints the lines that do not match the pattern.

$ grep -v gauge metadata.csv
metric_name,metric_type,interval,unit_name,per_unit_name,description,orientation,integration,short_name

grep -i performs case-insensitive matching. In the following example, grep -i os finds both OS and os.

$ grep -i os metadata.csv
mysql.innodb.mutex_os_waits,gauge,,event,second,The rate of mutex OS waits.,0,mysql,mutex os waits
mysql.innodb.os_log_fsyncs,gauge,,write,second,The rate of fsync writes to the log file.,0,mysql,log fsyncs

grep -l lists the files containing a match.

$ grep -l mysql metadata.csv
metadata.csv

grep -c counts how many matching lines were found.

$ grep -c select metadata.csv
3

grep -r searches recursively through all files in the given directory (or the current working directory if none is given) and its subdirectories.

$ grep -r are ~/Documents
/home/br/Documents/computers:Computers are not intelligent
/home/br/Documents/readme:I hope you are following so far!

grep -w shows only matching whole words.

$ grep follow ~/Documents/readme
I hope you are following so far!
$ grep -w follow ~/Documents/readme
$
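
The pattern follow matches only as part of the word following, so grep -w finds nothing there. With a whole word such as are, present in the other sample file, the match succeeds:

$ grep -w are ~/Documents/computers
Computers are not intelligent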

cut


cut extracts a portion of a file (or, as usual, of the input stream). It defines the field delimiter (what separates the columns) with the -d option and the columns to extract with the -f option.

For example, the following command retrieves the first column from the last five lines of our CSV file.

$ tail -n 5 metadata.csv | cut -d , -f 1
mysql.performance.user_time
mysql.replication.seconds_behind_master
mysql.replication.slave_running
mysql.replication.slaves_connected
mysql.performance.queries

Since we are dealing with a CSV file, the columns are separated by commas, and -f 1 extracts the first column.

You can select both the first and second columns using the option -f 1,2.

$ tail -n 5 metadata.csv | cut -d , -f 1,2
mysql.performance.user_time,gauge
mysql.replication.seconds_behind_master,gauge
mysql.replication.slave_running,gauge
mysql.replication.slaves_connected,gauge
mysql.performance.queries,gauge

paste


paste merges together two different files into one multi-column file.

$ cat ingredients
eggs
milk
butter
tomatoes
$ cat prices
1$
1.99$
1.50$
2$/kg
$ paste ingredients prices
eggs    1$
milk    1.99$
butter  1.50$
tomatoes    2$/kg

By default, paste uses a tab as delimiter, but that can be changed with the -d option.

$ paste ingredients prices -d:
eggs:1$
milk:1.99$
butter:1.50$
tomatoes:2$/kg

Another common use case for paste is joining all the lines of a stream or file with a given delimiter, by combining -s and -d.

$ paste -s -d, ingredients
eggs,milk,butter,tomatoes

If - is given as the input file, stdin is read instead.

$ cat ingredients | paste -s -d, -
eggs,milk,butter,tomatoes

sort


The sort command does just that: it sorts the data (from the given file or input stream).

$ cat ingredients
eggs
milk
butter
tomatoes
salt
$ sort ingredients
butter
eggs
milk
salt
tomatoes

sort -r performs reverse sorting.

$ sort -r ingredients
tomatoes
salt
milk
eggs
butter

sort -n sorts fields by their numeric value.

$ cat numbers
0
2
1
10
3
$ sort numbers
0
1
10
2
3
$ sort -n numbers
0
1
2
3
10
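
sort often ends a pipeline. For example, combining it with the cut command seen earlier, we can sort the metric names of the last five lines of our CSV file:

$ tail -n 5 metadata.csv | cut -d , -f 1 | sort
mysql.performance.queries
mysql.performance.user_time
mysql.replication.seconds_behind_master
mysql.replication.slave_running
mysql.replication.slaves_connected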

uniq


uniq detects and filters out adjacent identical lines in the specified file or input stream.

$ cat duplicates
and one
and one
and two
and one
and two
and one, two, three
$ uniq duplicates
and one
and two
and one
and two
and one, two, three

Since uniq only filters adjacent lines, duplicates may still remain in our data. To remove all identical lines from a file, you must first sort its contents.

$ sort duplicates | uniq
and one
and one, two, three
and two

uniq -c prefixes each line with its number of occurrences.

$ sort duplicates | uniq -c
   3 and one
   1 and one, two, three
   2 and two

uniq -u displays only the unique lines.

$ sort duplicates | uniq -u
and one, two, three

Note. uniq is especially useful in combination with sort, as the pipeline | sort | uniq lets you remove all duplicate lines in a file or stream.
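
Another handy option is uniq -d, the opposite of -u: it displays only the lines that occur more than once.

$ sort duplicates | uniq -d
and one
and two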

awk


awk is a bit more than just a text processing tool: it is actually a whole programming language. One thing awk is really good at is splitting files into columns, and it shines especially when spaces and tabs are mixed within a file.

$ cat -t multi-columns
John Smith    Doctor^ITardis
Sarah-James Smith^I    Companion^ILondon
Rose Tyler   Companion^ILondon

Note. cat -t displays tabs as ^I.

As you can see, the columns are separated by spaces or tabs, and not always by the same number of spaces. cut is useless here, because it only works with a single delimiter character. awk, however, handles such a file with ease.

awk '{ print $n }' displays the nth column of the text.

$ cat multi-columns | awk '{ print $1 }'
John
Sarah-James
Rose
$ cat multi-columns | awk '{ print $3 }'
Doctor
Companion
Companion
$ cat multi-columns | awk '{ print $1,$2 }'
John Smith
Sarah-James Smith
Rose Tyler

Although awk is capable of much more, printing columns probably accounts for 99% of my personal use cases.

Note. { print $NF } displays the last column of a line.
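
Applied to the multi-columns file above, it prints the last field of each line regardless of how many columns the line has:

$ cat multi-columns | awk '{ print $NF }'
Tardis
London
London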

tr


tr stands for translate. This command replaces characters with other characters. It works either on individual characters or on character classes such as lowercase, printable, spaces, alphanumeric, and so on.

Applied to standard input, tr <char1> <char2> replaces all occurrences of <char1> with <char2>.

$ echo "Computers are fast" | tr a A
Computers Are fAst

tr can also translate character classes using the [:class:] notation. The full list of available classes is described in the tr man page, but some are demonstrated here.

[:space:] represents all whitespace characters, from simple spaces to tabs and newlines.

$ echo "computers are fast" | tr '[:space:]' ','
computers,are,fast,%

All whitespace characters have been replaced with commas. Note that the % at the end of the output indicates the absence of a trailing newline; indeed, that newline was also converted into a comma.

[:lower:] represents all lowercase characters and [:upper:] all uppercase characters. Converting between the two thus becomes trivial.

$ echo "computers are fast" | tr '[:lower:]' '[:upper:]'
COMPUTERS ARE FAST
$ echo "COMPUTERS ARE FAST" | tr '[:upper:]' '[:lower:]'
computers are fast

tr -c SET1 SET2 converts any character not in SET1 into a character from SET2. In the following example, every character other than the listed vowels is replaced with a space.

$ echo "computers are fast" | tr -c '[aeiouy]' ' '
 o  u e   a e  a

tr -d deletes the specified characters instead of replacing them. It is the equivalent of tr <char> ''.

$ echo "Computers Are Fast" | tr -d '[:lower:]'
C A F

tr can also replace ranges of characters, for example all letters between a and e or all digits between 1 and 8, using the s-e notation, where s is the start character and e the end character.

$ echo "computers are fast" | tr 'a-e' 'x'
xomputxrs xrx fxst
$ echo "5uch l337 5p34k" | tr '1-4' 'x'
5uch lxx7 5pxxk

The tr -s string1 command squeezes any run of repeated characters from string1 into a single occurrence. One of the most useful applications of tr -s is replacing multiple consecutive spaces with a single one.

$ echo "Computers         are       fast" | tr -s ' '
Computers are fast

fold


The fold command wraps each input line to fit a specified width. It can be useful, for example, to make sure text fits on small displays. fold -w n wraps lines at a width of n characters.

$ cat ~/Documents/readme | fold -w 16
Thanks again for
 reading this bo
ok!
I hope you're fo
llowing so far!

The fold -s option breaks lines only at space characters. It can be combined with -w to limit lines to a given width without splitting words; applied to the same file with a width of 16, it gives:

$ cat ~/Documents/readme | fold -w 16 -s
Thanks again
for reading
this book!
I hope you're
following so
far!

sed


sed is a non-interactive stream editor used to transform text from its input stream line by line. Its input is either a file or stdin, and its output is either a file or stdout.

An editor command can include one or more addresses, a function, and arguments. Commands thus have the following form:

[address[,address]]function[arguments]

Although sed provides many functions, we will only look at text substitution, as it is one of the most common use cases.

Text Replacement


The sed substitution command has the following form:

s/PATTERN/REPLACEMENT/[options]

Example: replacing the first occurrence of a word on each line of a file:

$ cat hello
hello hello
hello world!
hi
$ cat hello | sed 's/hello/Hey I just met you/'
Hey I just met you hello
Hey I just met you world
hi

We can see that on the first line only the first occurrence of hello is replaced. To replace all occurrences of hello on every line, we can use the g option (for global).

$ cat hello | sed 's/hello/Hey I just met you/g'
Hey I just met you Hey I just met you
Hey I just met you world
hi

sed allows you to use delimiters other than /, which especially improves readability when the command arguments themselves contain slashes.

$ cat hello | sed 's@hello@Hey I just met you@g'
Hey I just met you Hey I just met you
Hey I just met you world
hi

The address tells the editor on which line or range of lines to perform the substitution.

$ cat hello | sed '1s/hello/Hey I just met you/g'
Hey I just met you hello
hello world
hi
$ cat hello | sed '2s/hello/Hey I just met you/g'
hello hello
Hey I just met you  world
hi

The address 1 tells sed to replace hello with Hey I just met you on the first line only. We can specify a range of addresses with the <start>,<end> notation, where <end> can be either a line number or $, meaning the last line of the file.

$ cat hello | sed '1,2s/hello/Hey I just met you/g'
Hey I just met you Hey I just met you
Hey I just met you world
hi
$ cat hello | sed '2,3s/hello/Hey I just met you/g'
hello hello
Hey I just met you world
hi
$ cat hello | sed '2,$s/hello/Hey I just met you/g'
hello hello
Hey I just met you world
hi

By default, sed writes its result to its stdout, but it can also edit the original file in place with the -i option.

$ sed -i '' 's/hello/Bonjour/' sed-data
$ cat sed-data
Bonjour hello
Bonjour world
hi

Note. On Linux, -i alone is enough. On macOS, however, the behavior is slightly different, so you need to add '' right after -i.

Real examples


CSV filtering with grep and awk


$ grep -w gauge metadata.csv | awk -F, '{ if ($4 == "query") { print $1, "per", $5 } }'
mysql.performance.com_delete per second
mysql.performance.com_delete_multi per second
mysql.performance.com_insert per second
mysql.performance.com_insert_select per second
mysql.performance.com_replace_select per second
mysql.performance.com_select per second
mysql.performance.com_update per second
mysql.performance.com_update_multi per second
mysql.performance.questions per second
mysql.performance.slow_queries per second
mysql.performance.queries per second

In this example, grep first filters the lines of metadata.csv down to those containing the word gauge, then awk keeps the lines whose fourth column is query and prints the metric name (1st column) together with the corresponding per_unit_name value (5th column).
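
Note that awk could also take over the filtering done by grep here; a sketch of an equivalent single-command variant (same output, assuming the metric type is always in the second column):

$ awk -F, '$2 == "gauge" && $4 == "query" { print $1, "per", $5 }' metadata.csv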

Displaying the IPv4 address associated with a network interface


$ ifconfig en0 | grep inet | grep -v inet6 | awk '{ print $2 }'
192.168.0.38

The ifconfig <interface name> command displays information about the given network interface. For instance:

en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    ether 19:64:92:de:20:ba
    inet6 fe80::8a3:a1cb:56ae:7c7c%en0 prefixlen 64 secured scopeid 0x7
    inet 192.168.0.38 netmask 0xffffff00 broadcast 192.168.0.255
    nd6 options=201<PERFORMNUD,DAD>
    media: autoselect
    status: active

We then run grep for inet, which produces two matching lines.

$ ifconfig en0 | grep inet
    inet6 fe80::8a3:a1cb:56ae:7c7c%en0 prefixlen 64 secured scopeid 0x7
    inet 192.168.0.38 netmask 0xffffff00 broadcast 192.168.0.255

Then, using grep -v, we exclude the IPv6 line.

$ ifconfig en0 | grep inet | grep -v inet6
inet 192.168.0.38 netmask 0xffffff00 broadcast 192.168.0.255

Finally, with awk we request the second column of that line: the IPv4 address associated with our network interface en0.

$ ifconfig en0 | grep inet | grep -v inet6 | awk '{ print $2 }'
192.168.0.38

Note. It was suggested to me that the grep inet | grep -v inet6 chain could be replaced by this more robust awk command:

$ ifconfig en0 | awk ' $1 == "inet" { print $2 }'
192.168.0.38

It is shorter and specifically targets IPv4 thanks to the $1 == "inet" condition.

Retrieving a value from a configuration file


$ grep 'editor =' ~/.gitconfig  | cut -d = -f2 | sed 's/ //g'
/usr/bin/vim

In the current user's git configuration file, we look for the editor = line, split it on the = character, extract the second field and strip the surrounding spaces.

$ grep 'editor =' ~/.gitconfig
     editor = /usr/bin/vim
$ grep 'editor =' ~/.gitconfig  | cut -d'=' -f2
 /usr/bin/vim
$ grep 'editor =' ~/.gitconfig  | cut -d'=' -f2 | sed 's/ //'
/usr/bin/vim
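
As with the ifconfig example above, the whole chain could also be collapsed into a single awk command; one possible sketch, splitting on = and stripping spaces with gsub:

$ awk -F= '/editor =/ { gsub(/ /, "", $2); print $2 }' ~/.gitconfig
/usr/bin/vim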

Extract IPs from a log file


The following real-world example looks for the message Too many connections from (followed by an IP address) in a database log and displays the ten worst offenders.

$ grep 'Too many connections from' db.log | \
  awk '{ print $12 }' | \
  sed 's@/@@' | \
  sort | \
  uniq -c | \
  sort -rn | \
  head -n 10 | \
  awk '{ print $2 }'
   10.11.112.108
   10.11.111.70
   10.11.97.57
   10.11.109.72
   10.11.116.156
   10.11.100.221
   10.11.96.242
   10.11.81.68
   10.11.99.112
   10.11.107.120

Let's break down what this pipeline does. First, here is what a matching log line looks like.

$ grep "Too many connections from" db.log | head -n 1
2020-01-01 08:02:37,617 [myid:1] - WARN  [NIOServerCxn.Factory:1.2.3.4/1.2.3.4:2181:NIOServerCnxnFactory@193] - Too many connections from /10.11.112.108 - max is 60

Then awk '{ print $12 }' extracts the IP address from the line.

$ grep "Too many connections from" db.log | awk '{ print $12 }'
/10.11.112.108
...

The sed 's@/@@' command removes the leading slash.

$ grep "Too many connections from" db.log | awk '{ print $12 }' | sed 's@/@@'
10.11.112.108
...

Note. As we saw earlier, sed accepts any delimiter. Although / is usually used as the delimiter, here we are replacing that very character, which would slightly hurt the readability of the substitution expression:

sed 's/\///'

sort | uniq -c sorts the IP addresses lexicographically and then removes duplicates, prefixing each IP address with its number of occurrences.

$ grep 'Too many connections from' db.log | \
  awk '{ print $12 }' | \
  sed 's@/@@' | \
  sort | \
  uniq -c
   1379 10.11.100.221
   1213 10.11.103.168
   1138 10.11.105.177
    946 10.11.106.213
   1211 10.11.106.4
   1326 10.11.107.120
   ...

sort -rn | head -n 10 sorts the lines by their number of occurrences, numerically and in reverse order, so that the worst offenders come first, and keeps the top 10 lines. The final awk '{ print $2 }' extracts the IP addresses themselves.

$ grep 'Too many connections from' db.log | \
  awk '{ print $12 }' | \
  sed 's@/@@' | \
  sort | \
  uniq -c | \
  sort -rn | \
  head -n 10 | \
  awk '{ print $2 }'
  10.11.112.108
  10.11.111.70
  10.11.97.57
  10.11.109.72
  10.11.116.156
  10.11.100.221
  10.11.96.242
  10.11.81.68
  10.11.99.112
  10.11.107.120

Renaming a function in the source file


Imagine we are working on a project and would like to rename a poorly named function (or class, variable, etc.) in a source file. We can do this with sed -i, which performs the replacement directly in the original file.

$ cat izk/utils.py
def bool_from_str(s):
    if s.isdigit():
        return int(s) == 1
    return s.lower() in ['yes', 'true', 'y']

$ sed -i 's/def bool_from_str/def is_affirmative/' izk/utils.py
$ cat izk/utils.py
def is_affirmative(s):
    if s.isdigit():
        return int(s) == 1
    return s.lower() in ['yes', 'true', 'y']

Note. On macOS, use sed -i '' instead of sed -i.

However, we have only renamed the function in that one file. This will break the import of bool_from_str in every other file, since the function is no longer defined. We need a way to rename bool_from_str throughout our project. This can be achieved with grep and sed combined with either a for loop or xargs.

Going deeper: for loops and xargs


To replace all occurrences of bool_from_str in our project, we must first locate them recursively with grep -r.

$ grep -r bool_from_str .
./tests/test_utils.py:from izk.utils import bool_from_str
./tests/test_utils.py:def test_bool_from_str(s, expected):
./tests/test_utils.py:    assert bool_from_str(s) == expected
./izk/utils.py:def bool_from_str(s):
./izk/prompt.py:from .utils import bool_from_str
./izk/prompt.py:                    default = bool_from_str(os.environ[envvar])

Since we are only interested in the files containing matches, we also use the -l/--files-with-matches option:

-l, --files-with-matches
        Only the names of files containing selected lines are written to standard out-
        put.  grep will only search a file until a match has been found, making
        searches potentially less expensive.  Pathnames are listed once per file
        searched.  If the standard input is searched, the string ``(standard input)''
        is written.

$ grep -r --files-with-matches bool_from_str .
./tests/test_utils.py
./izk/utils.py
./izk/prompt.py

We can then use the xargs command to perform an action on each line of that output (that is, on every file containing the string bool_from_str).

$ grep -r --files-with-matches bool_from_str . | \
  xargs -n 1 sed -i 's/bool_from_str/is_affirmative/'

The -n 1 option tells xargs to execute a separate sed command for each line of the output.

Then the following commands are executed:

$ sed -i 's/bool_from_str/is_affirmative/' ./tests/test_utils.py
$ sed -i 's/bool_from_str/is_affirmative/' ./izk/utils.py
$ sed -i 's/bool_from_str/is_affirmative/' ./izk/prompt.py

If the command invoked by xargs (in our case sed) supports multiple arguments, you should drop -n 1 for better performance.

grep -r --files-with-matches bool_from_str . | xargs sed -i 's/bool_from_str/is_affirmative/'

This command will then execute

$ sed -i 's/bool_from_str/is_affirmative/' ./tests/test_utils.py ./izk/utils.py ./izk/prompt.py

Note. The sed synopsis in the man page shows that the command can take several file arguments.

SYNOPSIS
     sed [-Ealn] command [file ...]
     sed [-Ealn] [-e command] [-f command_file] [-i extension] [file ...]

Indeed, as we saw in the previous chapter, file ... means that several file-name arguments are accepted.

We can check that the replacement was applied to every occurrence of bool_from_str.

$ grep -r is_affirmative .
./tests/test_utils.py:from izk.utils import is_affirmative
./tests/test_utils.py:def test_is_affirmative(s, expected):
./tests/test_utils.py:    assert is_affirmative(s) == expected
./izk/utils.py:def is_affirmative(s):
./izk/prompt.py:from .utils import is_affirmative
./izk/prompt.py:                    default = is_affirmative(os.environ[envvar])

As often happens, there are several ways to achieve the same result. Instead of xargs, we could use a for loop to iterate over the lines of a list and act on each item. These loops have the following syntax:

for item in list; do
    command $item
done

If we wrap our grep command in $(), the shell executes it in a subshell, and the for loop then iterates over its output.

$ for file in $(grep -r --files-with-matches bool_from_str .); do
  sed -i 's/bool_from_str/is_affirmative/' $file
done

This command will execute

$ sed -i 's/bool_from_str/is_affirmative/' ./tests/test_utils.py
$ sed -i 's/bool_from_str/is_affirmative/' ./izk/utils.py
$ sed -i 's/bool_from_str/is_affirmative/' ./izk/prompt.py

The for loop syntax seems clearer to me than xargs, but the latter can run commands in parallel with the -P n option, where n is the maximum number of commands executed at the same time, which can yield a performance gain.
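
For example, here is a sketch of the same renaming run with up to four sed processes in parallel (the value 4 is arbitrary):

$ grep -r --files-with-matches bool_from_str . | \
  xargs -P 4 -n 1 sed -i 's/bool_from_str/is_affirmative/'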

Summary


All these tools open up a whole world of possibilities, as they let you extract and transform data by building entire pipelines out of commands that may never have been intended to work together. Each of them performs a relatively small function (sort sorts, cat concatenates, grep filters, sed edits, cut cuts, etc.).

Any task that includes text can be reduced to a pipeline of smaller tasks, each of which performs a simple action and transfers its output to the next task.

For example, if we want to know how many unique IP addresses appear in a log file, and those IP addresses always appear in the same column, we can run the following sequence of commands (a sketch of the full pipeline follows the list):

  • grep the lines that match the IP address pattern
  • find the column containing the IP address and extract it with awk
  • sort the list of IP addresses using sort
  • eliminate adjacent duplicates with uniq
  • count the remaining lines (i.e. the unique IP addresses) using wc -l
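
A minimal sketch of such a pipeline, reusing the db.log example from above (and assuming, as before, that the IP address is the 12th whitespace-separated field):

$ grep 'Too many connections from' db.log | \
  awk '{ print $12 }' | \
  sort | \
  uniq | \
  wc -l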

Since there are many native and third-party text processing tools, there are also many ways to solve any given problem.

The examples in this article were contrived, but I suggest you read the amazing article “Command-line tools can be 235x faster than your Hadoop cluster” to get an idea of how useful and powerful these commands really are and what real problems they can solve.

What's next


  1. Count the number of files and directories located in your home directory.


