Mastering Essential Linux Commands: Your Path to File and Directory Mastery
Introduction
This documentation aims to offer a comprehensive understanding of essential commands and techniques for file and directory management in a Linux environment. Mastering these commands is crucial for efficient navigation, manipulation, and analysis of files and directories.
We start with the basic usage of key commands such as wc, du, grep, awk, and find, looking at what each one does on its own. We then show how these commands can be combined using pipes (|), -exec {} \;, or -exec {} +.
Real-life examples then demonstrate practical applications.
Testing and adapting these commands yourself is the best way to understand their nuances and practical utility.
Let the learning begin!
1. Basic Commands Overview
wc (Word Count)
The wc command counts lines, words, and bytes in files (or characters, with -m).
- Counting lines in a file:
This command displays the number of lines in file.txt.
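The command itself did not survive in the text above; it was most likely wc -l file.txt. A minimal sketch, assuming a throwaway sample file created only for illustration:

```shell
# Create a small sample file in a temporary directory (hypothetical data).
tmp=$(mktemp -d)
printf 'one two\nthree\nfour five six\n' > "$tmp/file.txt"

# Count lines; wc prints the number followed by the file name.
wc -l "$tmp/file.txt"

# Reading from stdin prints only the number, which is handy in scripts.
lines=$(wc -l < "$tmp/file.txt" | tr -d ' ')
words=$(wc -w < "$tmp/file.txt" | tr -d ' ')
bytes=$(wc -c < "$tmp/file.txt" | tr -d ' ')
echo "lines=$lines words=$words bytes=$bytes"

rm -r "$tmp"
```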
du (Disk Usage)
The du command estimates file and directory space usage.
- Getting the size of a directory:
This command provides the disk usage of the specified directory (/path/to/directory) in a human-readable format (-h).
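The description above matches du -h /path/to/directory. The sketch below assumes a small temporary directory tree in place of a real path, and adds -s (summary) as an optional extra that the text does not mention:

```shell
# Build a tiny directory tree to measure (hypothetical content).
tmp=$(mktemp -d)
mkdir -p "$tmp/project/sub"
printf 'hello\n' > "$tmp/project/a.txt"
printf 'world\n' > "$tmp/project/sub/b.txt"

# -h prints human-readable sizes: one line per directory, the total last.
du -h "$tmp/project"

# -s collapses the report to a single line for the whole directory.
summary=$(du -sh "$tmp/project")
echo "$summary"
summary_lines=$(printf '%s\n' "$summary" | wc -l | tr -d ' ')

rm -r "$tmp"
```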
grep (Global Regular Expression Print)
The grep command searches for patterns in files.
- Searching for lines containing a pattern in a file:
This command displays lines in file.txt that contain the specified pattern.
- Searching for lines containing multiple patterns in a file:
This command displays lines in file.txt that contain either "pattern1" or "pattern2".
Additional grep options:
- -H: Print the filename for each match when searching in multiple files.
- -l: Display only the names of files that contain at least one match, instead of showing the matching lines.
- -n: Display the line numbers along with the matching lines.
- -w: Match the whole word, ensuring that only entire words are considered.
- -i: Perform case-insensitive matching, ignoring differences in case when searching for the pattern.
- -B N: Display N lines before the matching line.
- -A N: Display N lines after the matching line.
These options enhance the functionality of grep by providing more context, line numbers, and filename information when searching for patterns in files.
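The grep commands described above are not shown in the text. The following sketch illustrates them under the assumption of a four-line sample file; the alternation form grep -E "pattern1|pattern2" is one common way to match either pattern and may differ from what the author used.

```shell
# Sample input (hypothetical data).
tmp=$(mktemp -d)
cat > "$tmp/file.txt" <<'EOF'
alpha pattern1 here
beta line
Gamma PATTERN2 there
delta pattern2 end
EOF

# Lines containing a single pattern.
grep "pattern1" "$tmp/file.txt"

# Lines containing either pattern (-E enables the | alternation).
either=$(grep -E "pattern1|pattern2" "$tmp/file.txt")
echo "$either"

# -i ignores case, -n prefixes each match with its line number.
nocase=$(grep -in "pattern2" "$tmp/file.txt")
echo "$nocase"

# -w matches whole words only; -A 1 also prints one line after each match.
grep -w -A 1 "beta" "$tmp/file.txt"

# -l prints only the names of files that contain a match.
grep -l "alpha" "$tmp"/*.txt

rm -r "$tmp"
```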
awk (Aho, Weinberger, and Kernighan)
- Basic Syntax: awk 'pattern { action }' file
  - pattern: The condition that a line must meet to trigger the action.
  - action: The set of commands to be executed when the pattern is matched.
- Example: print the first field of each line in file.txt where the pattern is found.
Common Use Cases:
- Printing Specific Columns: print the second and fourth columns of each line in file.txt.
- Pattern Matching: print lines containing the word "error" from the log.txt file.
- Calculations: calculate and print the sum of the values in the first column of numbers.txt.
- Custom Field and Record Separators: specify ',' as the field separator for a CSV file.
Advanced Features:
- Variables: use the variable total to accumulate values.
- Built-in Functions: print the length of each line in text.txt.
- Conditional Statements: print lines where the value in the first column is greater than 10.
awk is versatile and can be highly customized for various text processing tasks. It's especially useful for working with structured data in files.
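The awk one-liners behind the use cases above are missing from the text. The sketch below gives one plausible form for each, assuming small sample files whose names and contents are invented for illustration:

```shell
# Sample inputs (hypothetical data).
tmp=$(mktemp -d)
printf '12 a x alpha\n7 b y beta\n30 c z error\n' > "$tmp/file.txt"
printf 'name,age\nann,31\n' > "$tmp/data.csv"

# General form: awk 'pattern { action }' file
awk '/error/ {print $1}' "$tmp/file.txt"   # first field of lines matching /error/
awk '{print $2, $4}' "$tmp/file.txt"       # second and fourth columns
awk '/error/' "$tmp/file.txt"              # whole lines containing "error"

# A variable accumulates the first column; END runs after the last line.
sum=$(awk '{total += $1} END {print total}' "$tmp/file.txt")
echo "sum=$sum"

awk -F',' '{print $1}' "$tmp/data.csv"     # comma as the field separator
awk '{print length($0)}' "$tmp/file.txt"   # length of each line

# A bare condition acts as the pattern: first column greater than 10.
big=$(awk '$1 > 10 {print $1}' "$tmp/file.txt")
echo "$big"

rm -r "$tmp"
```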
head and tail
The head command displays the beginning of a file, while tail shows the end.
- Viewing the first few lines of a file with head:
This command displays the first 10 lines of file.txt, which is the default.
- Displaying a specific number of lines at the beginning of a file with head -n:
This command displays the first 10 lines of file.txt. You can replace 10 with any number to view a different quantity of lines.
- Viewing the last few lines of a file with tail:
This command shows the last 10 lines of file.txt, which is the default.
- Displaying a specific number of lines at the end of a file with tail -n:
This command shows the last 15 lines of file.txt. Similarly, you can adjust 15 to any desired number to see a different quantity of lines.
Using -n with head or tail allows you to precisely control the number of lines displayed from the beginning or end of a file.
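A short sketch of the four invocations described above, assuming a 40-line sample file generated with seq:

```shell
# A 40-line file containing the numbers 1..40 (hypothetical data).
tmp=$(mktemp -d)
seq 1 40 > "$tmp/file.txt"

head "$tmp/file.txt"                      # first 10 lines (the default)
first10=$(head -n 10 "$tmp/file.txt")     # explicit count
tail "$tmp/file.txt"                      # last 10 lines (the default)
last15=$(tail -n 15 "$tmp/file.txt")      # last 15 lines

echo "head -n 10 ends at: $(printf '%s\n' "$first10" | tail -n 1)"
echo "tail -n 15 starts at: $(printf '%s\n' "$last15" | head -n 1)"

rm -r "$tmp"
```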
less and more
Both less and more are used to view text files in a paginated manner.
- Viewing a file with less:
less allows you to navigate through the file interactively.
- Viewing a file with more:
more displays the file content page by page, but it has more limited navigation options compared to less.
These commands provide different ways to view file contents, either scrolling through the entire file or just a section at a time.
find (Search for Files)
The find command searches for files and directories based on various criteria.
- Basic Syntax: find [path] [options] [expression]
  - path: Specifies the directory or directories to start the search from. If omitted, the current directory is used.
  - options: Additional flags or arguments that modify the behavior of the find command.
  - expression: Specifies the criteria that files must meet to be included in the search results.
- Example: search for files with a .txt extension within /path/to/directory and its subdirectories. For a case-insensitive search, use -iname instead of -name.
Common Use Cases:
- Searching by Name: search for a file named "example.txt" in the current directory and its subdirectories.
- Searching files matching a pattern: search for req*.txt in /path/to/search and its subdirectories.
- Searching by Type: find directories under /home/user. -type f matches regular files, while -type d matches directories.
- Searching by Size: find files larger than 1 megabyte under /var/log.
- Searching by Modification Time: find files modified within the last 7 days under /etc.
- Combining Multiple Criteria: find log files larger than 100 kilobytes under /tmp.
- Find and delete files: find a file named file_to_delete.txt and delete it.
Advanced Features:
- Executing Commands on Found Files: change the permissions of all files under /home to 644.
- Using Logical Operators: find files with either .log or .txt extensions under /var/log.
- Combining with Other Commands: search for the keyword "keyword" within files under /etc.
find is a powerful utility for searching and locating files based on various criteria such as name, type, size, and modification time.
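The find commands for the use cases above are missing from the text. The sketch below shows one plausible form for each; it assumes a temporary sandbox directory in place of system paths such as /etc or /var/log, so it can be run without touching real files.

```shell
# Sandbox with a few files (hypothetical names and sizes).
tmp=$(mktemp -d)
mkdir -p "$tmp/docs/sub" "$tmp/logs"
printf 'keyword here\n' > "$tmp/docs/example.txt"
printf 'x\n' > "$tmp/docs/sub/req_dev.txt"
printf 'y\n' > "$tmp/logs/app.log"
head -c 204800 /dev/zero > "$tmp/logs/big.log"   # 200 KiB file

find "$tmp" -name "*.txt"                 # by extension, recursively
find "$tmp" -iname "EXAMPLE.TXT"          # case-insensitive name match
find "$tmp" -name "req*.txt"              # by name pattern
dirs=$(find "$tmp" -mindepth 1 -type d | wc -l | tr -d ' ')   # directories only
find "$tmp" -type f -mtime -7             # modified within the last 7 days

# Tests listed side by side are ANDed: .log files larger than 100 KiB.
biglogs=$(find "$tmp" -type f -name "*.log" -size +100k)

# -o is OR; the escaped parentheses group the two name tests.
find "$tmp" \( -name "*.log" -o -name "*.txt" \)

# -exec runs a command on the matches ({} is replaced by the path).
find "$tmp" -type f -exec chmod 644 {} \;
hits=$(find "$tmp" -type f -exec grep -l "keyword" {} +)

# -delete removes what was matched; test the expression without it first.
find "$tmp" -name "req_dev.txt" -delete

echo "dirs=$dirs"; echo "$biglogs"; echo "$hits"
rm -r "$tmp"
```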
xargs (Extended Arguments)
xargs is a command in Linux/Unix operating systems that allows you to build and execute command lines from standard input. It's particularly useful when you have a list of items from some source (like the output of another command, a file, or user input), and you want to pass these items as arguments to another command.
- Basic Syntax: xargs [options] [command [initial-arguments]]
  - options: Various options that modify the behavior of xargs.
  - command: The command to be executed with the arguments passed by xargs.
  - initial-arguments: Optional initial arguments to be used with the command.
How does xargs work?
Here's how xargs typically works:
- Standard Input: xargs reads data from standard input (stdin) by default. This data can be piped into xargs from the output of another command or provided directly via keyboard input.
- Tokenization: xargs breaks the input into items. By default, it splits the input on whitespace (spaces, tabs, and newlines), but you can specify a different delimiter using the -d option (GNU xargs).
- Command Execution: xargs appends the items to the end of the command line and executes the resulting command, passing as many items per invocation as will fit.
- Argument Limit: xargs automatically splits the input into multiple command invocations if the items would exceed the maximum command-line length of the operating system.
- Handling Spaces and Special Characters: by default, whitespace separates items and quotes and backslashes are treated specially, so names containing spaces or quotes are split or altered unless you use null-terminated input (-0) or an explicit delimiter.
Here's a simple example to illustrate how xargs works: piping echo "file1 file2 file3" into xargs ls -l. In this example:
- echo "file1 file2 file3" prints the string "file1 file2 file3" to stdout.
- | (pipe) sends the output of echo as input to xargs.
- xargs ls -l takes each space-separated item from the input ("file1", "file2", "file3") and appends them to the ls -l command, resulting in ls -l file1 file2 file3.
- ls -l is executed with the provided arguments, listing the details of the specified files.
This is just a basic usage example. xargs has many options and can be used in various complex scenarios to construct and execute command lines efficiently.
Common Use Cases:
- Reading from Standard Input: remove the files file1, file2, and file3 named on standard input.
- Passing Filenames from a File: search for the specified pattern in each file listed in files.txt.
- Limiting Arguments per Command (-n): remove each file one by one.
- Parallel Execution (-P): count the lines in .txt files, executing up to four wc -l commands concurrently.
Advanced Features:
- Null-Terminated Input (-0): safely remove files with names containing spaces or special characters.
- Interactive Execution (-p): prompt for confirmation before removing each file.
- Verbose Output (-t): print the rm command being executed for each file.
- Specifying Maximum Processes (-P): limit the number of concurrent wc -l processes to four.
- Specifying Maximum Arguments per Command: remove each file one by one by executing the rm command once for each input line.
- Replacement String (-I): copy file1 and file2 to the ./backup directory, replacing {} with each input item. The same placeholder can be used to rename the copies, for example to file1.bak and file2.bak.
xargs is a flexible tool for constructing and executing commands with arguments from standard input or files. It's particularly useful for batch processing and handling large sets of data efficiently.
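The xargs invocations described in this section are not shown in the text. The sketch below covers the non-interactive ones under the assumption of a temporary directory with invented file names; the interactive -p variant is left out because it waits for keyboard input.

```shell
# Sandbox files (hypothetical names).
tmp=$(mktemp -d)
touch "$tmp/file1" "$tmp/file2" "$tmp/file3" "$tmp/my notes.txt"
printf 'a\nb\n' > "$tmp/one.txt"
printf 'c\n' > "$tmp/two.txt"

# Items read from stdin become arguments: ls -l file1 file2 file3
echo "$tmp/file1 $tmp/file2 $tmp/file3" | xargs ls -l

# -n 1 runs the command once per item; -t prints each command first.
echo "$tmp/file1 $tmp/file2" | xargs -n 1 -t rm

# File names listed in a file, one per line.
printf '%s\n' "$tmp/one.txt" "$tmp/two.txt" > "$tmp/files.txt"
matches=$(xargs grep -l "a" < "$tmp/files.txt")

# -print0 and -0 keep names with spaces intact; -P 4 runs up to four jobs at once.
find "$tmp" -name "*.txt" -print0 | xargs -0 -n 1 -P 4 wc -l

# -I {} substitutes each input line wherever {} appears in the command.
mkdir "$tmp/backup"
printf '%s\n' file3 | xargs -I {} cp "$tmp/{}" "$tmp/backup/{}.bak"
copied=$(ls "$tmp/backup")

echo "$matches"; echo "$copied"
rm -r "$tmp"
```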
sed (Stream Editor)
- Basic Syntax: sed [options] 'command' file.txt
  - options: Additional flags or arguments that modify the behavior of the sed command.
  - command: Specifies the editing operation to be performed on the input.
  - file.txt: The input file to be processed by sed.
- Example: replace the first occurrence of "old" with "new" on each line of file.txt.
Common Use Cases:
- Substitution: substitute all occurrences of "foo" with "bar" in file.txt.
- Deleting Lines: delete lines containing "pattern" from file.txt.
- Printing Lines: print lines 10 to 20 of file.txt.
- Inserting and Appending Text: insert "Inserted Text" before the first line of file.txt.
- Using Regular Expressions: swap the two numbers on either side of a dot in file.txt.
Advanced Features:
- In-Place Editing: perform in-place editing, modifying file.txt directly.
- Multiple Commands: execute multiple editing commands sequentially on file.txt.
- Using Hold and Pattern Space: collect all lines into the hold space and then print them in reverse order.
- Conditional Execution: substitute "old" with "new" only on lines containing "pattern".
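The sed commands for the cases above are missing from the text. The sketch below gives a plausible form for each, assuming a three-line sample file; the line range is shortened to 2,3 to fit the sample, and the in-place form shown is the GNU sed syntax.

```shell
# Sample input (hypothetical data).
tmp=$(mktemp -d)
printf 'old foo foo\npattern old\nv 1.2\n' > "$tmp/file.txt"

first=$(sed 's/old/new/' "$tmp/file.txt" | head -n 1)   # first match on each line
sed 's/foo/bar/g' "$tmp/file.txt"                        # every match on each line
sed '/pattern/d' "$tmp/file.txt"                         # drop matching lines
sed -n '2,3p' "$tmp/file.txt"                            # print only lines 2 to 3

# Insert a line before line 1 (portable two-line form of the i command).
inserted=$(sed '1i\
Inserted Text' "$tmp/file.txt" | head -n 1)

# Capture groups: swap the numbers on either side of the dot.
swapped=$(sed -E 's/([0-9]+)\.([0-9]+)/\2.\1/' "$tmp/file.txt" | tail -n 1)

sed -e 's/old/new/' -e 's/foo/bar/' "$tmp/file.txt"      # several commands in sequence

# Hold space: accumulate lines in reverse order and print them at the end.
reversed=$(sed -n '1!G;h;$p' "$tmp/file.txt" | head -n 1)

sed '/pattern/s/old/new/' "$tmp/file.txt"                # substitute only on matching lines

# GNU sed: -i rewrites the file itself instead of printing to stdout.
sed -i 's/old/new/g' "$tmp/file.txt"
after=$(head -n 1 "$tmp/file.txt")

rm -r "$tmp"
```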
Conclusion - basic commands
These commands offer different functionalities:
- wc counts lines, words, or characters in a file.
- du estimates disk usage for files and directories.
- grep searches for patterns in files and prints lines containing the specified pattern.
- awk is a powerful text processing tool for pattern scanning and processing.
- find searches for files and directories based on various criteria.
- xargs constructs and executes commands with arguments from standard input or files.
- sed performs various editing operations on text files, making it invaluable for automation and scripting tasks.
You can use these commands to perform various operations related to file content, size estimation, and pattern matching within files.
2. Combining find with Other Commands
In this section, we explore how to combine the previous commands using pipes (|), -exec {} \;, or -exec {} +:
Pipes (|)
Using pipes to pass the output of one command as input to another.
- Finding specific files and counting them: find .txt files and count them using wc -l.
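A minimal sketch of the find-and-count pipeline, assuming a temporary directory with three .txt files and one unrelated file:

```shell
# Sandbox (hypothetical file names).
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/sub/c.txt" "$tmp/notes.md"

# find prints one path per line; wc -l counts those lines.
count=$(find "$tmp" -type f -name "*.txt" | wc -l | tr -d ' ')
echo "$count"

rm -r "$tmp"
```

The count is off only if a file name itself contains a newline, which is rare in practice.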
-exec {} \;: Find and perform an action on each file
Executing a command for each matched file or directory.
- Finding files and displaying their sizes: display the size of each file found by find using du -h (one du command per file).
- Finding files and performing deletion: delete files matching the name file_to_delete.txt (one rm command per file).
- Finding and searching patterns: find all .txt files in the specified directory and run grep to search for a specific pattern within each of those files.
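The three commands described above are not shown in the text. A sketch of what they plausibly look like, assuming a temporary directory and invented file contents; the -H flag on grep is an addition so that the file name stays visible when grep receives a single file.

```shell
# Sandbox (hypothetical files).
tmp=$(mktemp -d)
printf 'needle\n' > "$tmp/a.txt"
printf 'hay\n' > "$tmp/b.txt"
touch "$tmp/file_to_delete.txt"

# One du process per file found.
find "$tmp" -type f -exec du -h {} \;

# One rm process per match.
find "$tmp" -type f -name "file_to_delete.txt" -exec rm {} \;

# One grep process per .txt file.
found=$(find "$tmp" -type f -name "*.txt" -exec grep -H "needle" {} \;)
echo "$found"

remaining=$(find "$tmp" -type f | wc -l | tr -d ' ')
rm -r "$tmp"
```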
-exec {} +: Find and perform an action on all files at once
Optimizing efficiency by passing multiple arguments to a command.
- Finding files and performing deletion: delete files matching the name file_to_delete.txt (all in one rm command).
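To make the difference between the two terminators visible, the sketch below counts how many arguments each invocation receives before doing the actual deletion. The sh -c 'echo "$#"' helper is only an illustration device and is not part of the original commands.

```shell
# Sandbox with two matching files (hypothetical layout).
tmp=$(mktemp -d)
mkdir "$tmp/a" "$tmp/b"
touch "$tmp/a/file_to_delete.txt" "$tmp/b/file_to_delete.txt" "$tmp/keep.txt"

# With +, both paths go to a single invocation, which reports 2 arguments.
batch=$(find "$tmp" -type f -name "file_to_delete.txt" -exec sh -c 'echo "$#"' sh {} +)

# With \;, the command runs once per file, so two lines are printed.
single=$(find "$tmp" -type f -name "file_to_delete.txt" -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')

# The deletion itself, batched into one rm call.
find "$tmp" -type f -name "file_to_delete.txt" -exec rm {} +
left=$(find "$tmp" -type f | wc -l | tr -d ' ')

echo "batch=$batch single=$single left=$left"
rm -r "$tmp"
```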
-exec sh -c: Running a shell snippet on each file
- Find .txt files and change their extension to .md using find together with sh -c and shell parameter expansion. Here's a breakdown of what's happening:
  - find /your/folder/path -type f -name "*.txt": This command finds all files (-type f) with the .txt extension in the specified folder and its subdirectories.
  - -exec sh -c 'mv "$0" "${0%.txt}.md"' {} \;: For each file found, it runs a small sh script that renames the file. The path is passed to the script as $0, and ${0%.txt}.md is a parameter expansion that strips the trailing .txt and appends .md.
Make sure to replace /your/folder/path with the actual path to your folder.
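The rename command from the breakdown above, run against a temporary directory that stands in for /your/folder/path (the file names are invented for illustration):

```shell
# Sandbox, including a name with a space to show that quoting holds up.
tmp=$(mktemp -d)
mkdir "$tmp/notes"
touch "$tmp/a.txt" "$tmp/notes/b c.txt" "$tmp/keep.md"

# Each path arrives in the snippet as $0; ${0%.txt} drops the suffix.
find "$tmp" -type f -name "*.txt" -exec sh -c 'mv "$0" "${0%.txt}.md"' {} \;

renamed=$(find "$tmp" -type f -name "*.md" | wc -l | tr -d ' ')
txt_left=$(find "$tmp" -type f -name "*.txt" | wc -l | tr -d ' ')
echo "md=$renamed txt=$txt_left"

rm -r "$tmp"
```

A common variant passes a placeholder name first, as in sh -c '...' sh {}, and reads the path from $1; both forms behave the same here.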
\( ... \): Grouping Expressions
When using find to search for files based on multiple criteria, such as file name patterns, types, or sizes, you may need to combine these criteria using logical operators like -and, -or, or -not. The \( ... \) construct allows you to group these expressions together to ensure they are evaluated as a single logical unit.
Grouping multiple expressions together for logical operations.
- Grouping Expressions in find: group the conditions for finding files with either .txt or .pdf extensions, and require a size greater than 1MB. Using \( ... \) allows for the proper grouping of expressions within a find command, ensuring that logical operations are applied correctly.
Overall, \( ... \) is a crucial construct in find commands for combining multiple search criteria and ensuring their proper evaluation. It helps create more complex search patterns while maintaining clarity and precision in the command syntax.
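A sketch of the grouped search described above, assuming a temporary directory with three 2 MiB files and one small file so that the size test has something to filter:

```shell
# Sandbox: three large files and one small one (hypothetical names).
tmp=$(mktemp -d)
head -c 2097152 /dev/zero > "$tmp/big.txt"
head -c 2097152 /dev/zero > "$tmp/big.pdf"
head -c 2097152 /dev/zero > "$tmp/big.iso"
printf 'small\n' > "$tmp/small.txt"

# (txt OR pdf) AND larger than 1 MiB.
grouped=$(find "$tmp" -type f \( -name "*.txt" -o -name "*.pdf" \) -size +1M)
echo "$grouped"
grouped_count=$(printf '%s\n' "$grouped" | wc -l | tr -d ' ')

rm -r "$tmp"
```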
Conclusion - combining commands
find is an incredibly versatile command that can be combined with various flags and options to perform advanced searches based on filenames, types, sizes, modification times, and more. It's a great tool for locating specific files or performing actions on groups of files based on specific criteria.
3. Application showcases
Counting Files in a Folder
To count the number of files in a folder, you can use the following commands:
- Using find:
  - find searches for files (-type f) in the specified folder without going into subdirectories (-maxdepth 1).
  - The output is then piped to wc -l, which counts the number of lines, effectively giving you the count of files.
- Using ls:
  - ls -l lists the contents of the folder with detailed information.
  - grep "^-" keeps only the lines that represent regular files (as opposed to directories or other types of items).
  - wc -l counts the number of lines, providing the count of files in the folder.
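Both pipelines, as described in the breakdown above, applied to a temporary folder that holds two files and one subdirectory:

```shell
# Sandbox (hypothetical layout).
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/a" "$tmp/b" "$tmp/sub/c"

# find: regular files directly inside the folder.
with_find=$(find "$tmp" -maxdepth 1 -type f | wc -l | tr -d ' ')

# ls: long-listing lines that start with "-" are regular files.
with_ls=$(ls -l "$tmp" | grep "^-" | wc -l | tr -d ' ')

echo "find=$with_find ls=$with_ls"
rm -r "$tmp"
```

The two counts can differ when the folder contains hidden files: find includes them, while ls -l skips them unless -a or -A is added.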
Counting Files/Folders in a Folder
or
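Neither of the two commands for this case survives in the text. The sketch below shows two common ways to count every entry (files and folders) directly inside a folder; both are assumptions rather than a reconstruction of the original commands.

```shell
# Sandbox with a subfolder, two files, and a hidden file (hypothetical).
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/a" "$tmp/b" "$tmp/.hidden"

# find: every entry one level down, hidden ones included.
entries=$(find "$tmp" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')

# ls: visible entries only.
visible=$(ls "$tmp" | wc -l | tr -d ' ')

echo "entries=$entries visible=$visible"
rm -r "$tmp"
```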
Determining the Number of Columns in a CSV File
This command reads the first line, splits it on the , separator, and counts the resulting fields.
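One way to write the command described above combines head and awk; the sample CSV and its column names are invented for illustration.

```shell
# Sample CSV with four columns (hypothetical data).
tmp=$(mktemp -d)
printf 'id,name,email,age\n1,Ann,ann@example.com,31\n' > "$tmp/data.csv"

# Take the header line, split on commas, print the number of fields (NF).
cols=$(head -n 1 "$tmp/data.csv" | awk -F',' '{print NF}')
echo "$cols"

rm -r "$tmp"
```

This simple split miscounts when a quoted field itself contains a comma.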
Finding requirements.txt Files Containing "openpyxl"
- find all requirements.txt files
- find all requirements.txt files that contain "openpyxl"
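A sketch of both searches, assuming two hypothetical project folders of which only one lists openpyxl:

```shell
# Sandbox with two requirements files (hypothetical projects).
tmp=$(mktemp -d)
mkdir -p "$tmp/app1" "$tmp/app2"
printf 'pandas\nopenpyxl\n' > "$tmp/app1/requirements.txt"
printf 'requests\n' > "$tmp/app2/requirements.txt"

# All requirements.txt files.
all=$(find "$tmp" -type f -name "requirements.txt" | wc -l | tr -d ' ')

# Only those that mention openpyxl (grep -l prints file names).
with_pkg=$(find "$tmp" -type f -name "requirements.txt" -exec grep -l "openpyxl" {} +)

echo "all=$all"; echo "$with_pkg"
rm -r "$tmp"
```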
Utilizing maxdepth for Search
- find all .txt files non-recursively
- find ...
- save the result
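Only the first bullet above is specific enough to illustrate; the "find ..." bullet is elided in the source and is left as it is. The sketch assumes that "save the result" means redirecting the output to a file, and the output file used here is a temporary one created for the demonstration.

```shell
# Sandbox: one .txt at the top level, one nested (hypothetical names).
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/top.txt" "$tmp/sub/nested.txt"

# -maxdepth 1 stops find from descending into subdirectories.
find "$tmp" -maxdepth 1 -type f -name "*.txt"

# Redirect the same search into a file to keep the result.
out=$(mktemp)
find "$tmp" -maxdepth 1 -type f -name "*.txt" > "$out"
saved=$(wc -l < "$out" | tr -d ' ')
echo "$saved"

rm -r "$tmp" "$out"
```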
Skipping Certain Paths in a Search
- find all py files but skip venv folders (paths containing venv)
- find all py files but skip venv folders and apply yapf on each file
- find all py files but skip folders (likely env) and apply yapf on each file
- search files where the word wrappers is mentioned and avoid some folders
find . -type f -not -path '*/node_modules/*' -not -path '*env*' -not -name '*_*' -name '*.py' -exec grep -l 'wrappers' {} +
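The first bullets above have lost their commands. The sketch below shows two ways to skip a venv folder, assuming a temporary project layout; echo stands in for the formatter so the example runs without yapf installed, and the -prune form is an addition not mentioned in the text.

```shell
# Sandbox: one project file and one file inside venv (hypothetical layout).
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/venv/lib"
printf 'import wrappers\n' > "$tmp/src/app.py"
printf 'import wrappers\n' > "$tmp/venv/lib/dep.py"

# Filter out any path that passes through a venv directory.
kept=$(find "$tmp" -type f -name "*.py" -not -path "*/venv/*")
echo "$kept"

# Run a command on each remaining file; replace "echo would-format" with
# the real formatter (for example yapf -i) once the list looks right.
find "$tmp" -type f -name "*.py" -not -path "*/venv/*" -exec echo would-format {} \;

# -prune stops find from descending into venv at all, which is faster on large trees.
pruned=$(find "$tmp" -type d -name venv -prune -o -type f -name "*.py" -print)

rm -r "$tmp"
```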
git grep for version controlled files
- search files where the word wrappers is mentioned within the version controlled files
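The git grep command is missing from the text; it was probably close to git grep -l "wrappers" or git grep -n "wrappers". A sketch, assuming git is installed and using a throwaway repository to show that untracked files are ignored:

```shell
tmp=$(mktemp -d)
if command -v git >/dev/null 2>&1; then
  # Throwaway repository with one tracked and one untracked file (hypothetical).
  git init -q "$tmp/repo"
  printf 'import wrappers\n' > "$tmp/repo/tracked.py"
  printf 'import wrappers\n' > "$tmp/repo/untracked.py"
  git -C "$tmp/repo" add tracked.py

  # git grep searches tracked files only; -l lists the matching file names.
  tracked_hits=$(git -C "$tmp/repo" grep -l "wrappers")
  echo "$tracked_hits"
else
  echo "git is not installed; skipping the example"
fi
rm -rf "$tmp"
```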
Using head and tail Commands
- display the last 50 lines of a file
- filter the output of another command
This command will display the last 50 lines of the cli.log file and filter out only the lines that contain "/api/". This combination of tail and grep will help you isolate and display the relevant lines.
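A sketch of the tail and grep combination described above, assuming a generated 60-line cli.log in which every second line is an /api/ request:

```shell
# Generate a sample log (hypothetical content).
tmp=$(mktemp -d)
seq 1 60 | awk '{ if ($1 % 2) print "GET /static/" $1; else print "GET /api/items/" $1 }' > "$tmp/cli.log"

# Last 50 lines of the log.
tail -n 50 "$tmp/cli.log"

# Same lines, keeping only those that mention /api/.
api=$(tail -n 50 "$tmp/cli.log" | grep "/api/" | wc -l | tr -d ' ')
echo "$api"

rm -r "$tmp"
```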
- install the first lines of requirements.txt using head and xargs
This command will read the first 18 lines of requirements.txt, then install each package listed there using pip3.
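The command described above is most likely head -n 18 requirements.txt | xargs -n 1 pip3 install. The sketch below is a dry run: it assumes a three-line sample file and puts echo in front of pip3 so that the commands are printed instead of executed.

```shell
# Sample requirements file (hypothetical package list).
tmp=$(mktemp -d)
printf 'requests\nnumpy\npandas\n' > "$tmp/requirements.txt"

# Dry run: show the pip3 command that would run for each of the first 2 lines.
planned=$(head -n 2 "$tmp/requirements.txt" | xargs -n 1 echo pip3 install)
echo "$planned"

# Real run (not executed here):
#   head -n 18 requirements.txt | xargs -n 1 pip3 install

rm -r "$tmp"
```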
An improved version of this command has been proposed that uses sed to strip spaces, comments, and empty lines from the requirements file first.
Search for lines containing the word "black" within .sh files
More on find and grep options
This command searches for lines containing the word "black" within .sh files. Plain grep displays the actual lines containing "black" within the files.
- grep options (grep vs grep -l): to show only the filenames without the matching lines, use grep -l instead of grep.
- -exec option in the find command: the syntax with + at the end of the -exec option gathers the file names that match the criteria (*.sh) and passes them to grep in batches, rather than invoking grep once per file. This is generally more efficient, especially when dealing with a large number of files.
  To invoke grep individually for each file that matches the criteria (*.sh), use find /path/to/search -type f -name "*.sh" -exec grep -l "black" {} \; instead. This syntax uses \; at the end of the -exec option. This method might be less efficient, especially for a large number of files, as it starts a new grep process for each file separately.
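The variants discussed above, side by side, assuming a temporary directory in place of /path/to/search and two invented scripts:

```shell
# Sandbox: one script mentions black, the other does not (hypothetical).
tmp=$(mktemp -d)
printf 'pip install black\nblack src/\n' > "$tmp/lint.sh"
printf 'echo hello\n' > "$tmp/hello.sh"

# Matching lines (grep adds the file name because it receives several files).
find "$tmp" -type f -name "*.sh" -exec grep "black" {} +

# File names only, all files handed to grep in one batch.
names=$(find "$tmp" -type f -name "*.sh" -exec grep -l "black" {} +)

# Same result, but one grep process per file.
per_file=$(find "$tmp" -type f -name "*.sh" -exec grep -l "black" {} \;)

echo "$names"
rm -r "$tmp"
```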
Search for folders
- search for a folder by name (e.g., all names containing eigen3)
- search for a specific folder like "LAStools/bin" starting from a top-level directory such as / or /home
This command searches the entire root directory / for directories (-type d) named "bin" (-name "bin") that are part of a path containing "LAStools" (-path "*LAStools*").
Using sudo might be necessary to have permission to search directories that your user account doesn't have access to by default.
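A sketch of both folder searches, run against a temporary directory tree rather than / so that it needs no sudo; the layout is invented for illustration.

```shell
# Sandbox mimicking a small filesystem layout (hypothetical).
tmp=$(mktemp -d)
mkdir -p "$tmp/usr/include/eigen3" "$tmp/opt/LAStools/bin" "$tmp/usr/bin"

# Folders whose name contains eigen3.
byname=$(find "$tmp" -type d -name "*eigen3*")

# Folders named bin whose full path contains LAStools.
lasbin=$(find "$tmp" -type d -name "bin" -path "*LAStools*")

echo "$byname"; echo "$lasbin"

# On a real system, the second search would look like:
#   sudo find / -type d -name "bin" -path "*LAStools*" 2>/dev/null

rm -r "$tmp"
```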
Searching for "AAA" in Files
- search for "AAA" in all files
- search for "AAA" in all files with file name and line number display
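The two commands are missing from the text; recursive grep is the usual tool for this. A sketch, assuming a temporary directory in place of the current directory:

```shell
# Sandbox with two files that contain AAA (hypothetical content).
tmp=$(mktemp -d)
mkdir "$tmp/sub"
printf 'x\nAAA here\n' > "$tmp/one.txt"
printf 'AAA again\n' > "$tmp/sub/two.txt"

# -r searches every file below the directory.
grep -r "AAA" "$tmp"

# -H prints the file name and -n the line number for each match.
numbered=$(grep -rHn "AAA" "$tmp")
echo "$numbered"

rm -r "$tmp"
```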
Searching for a keyword in Files from the parent directory
- search for "L337" in all files
- search for "L337" in all files with 5 lines after each match
This command will find all occurrences of "L337" in files within the parent directory and its subdirectories and display the filename, line number, and the line containing "L337", along with the five lines that follow it.
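A sketch of the search described above, most likely grep -rHn -A 5 "L337" .. in the original. It assumes a temporary parent directory with one sample file and is run from a child directory so that .. points at the sandbox.

```shell
# Sandbox: the keyword sits on line 2, followed by six more lines (hypothetical).
tmp=$(mktemp -d)
mkdir "$tmp/child"
printf 'start\nL337 marker\na\nb\nc\nd\ne\nf\n' > "$tmp/notes.txt"

# From the child directory, search the parent recursively;
# -A 5 adds the five lines that follow each match.
ctx=$(cd "$tmp/child" && grep -rHn -A 5 "L337" ..)
echo "$ctx"

rm -r "$tmp"
```

Context lines are marked with - instead of : after the file name and line number.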
Get unique lines across files
- see the unique lines across all three files without repetitions
Alternative:
- use the preceding list to filter lines from another file (people)
This command first builds the sorted, de-duplicated list of lines from file1.txt, file2.txt, and file3.txt, then passes that list to grep -f as a set of patterns and displays the lines of the people file that match any of them.
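The commands for this case are missing, and the wording is ambiguous. The sketch below reads "unique lines" as the de-duplicated union of the three files, which matches the description of a sorted and unique list; the sample contents are invented, and -F (fixed strings) is an addition so that the lines are not interpreted as regular expressions.

```shell
# Sample files (hypothetical contents).
tmp=$(mktemp -d)
printf 'Annabel\nBob\n' > "$tmp/file1.txt"
printf 'Bob\nCara\n' > "$tmp/file2.txt"
printf 'Annabel\nCara\nDan\n' > "$tmp/file3.txt"
printf 'Annabel Smith F\nEve Jones F\nDan Brown M\n' > "$tmp/people"

# Every distinct line across the three files, sorted.
uniq_lines=$(sort -u "$tmp/file1.txt" "$tmp/file2.txt" "$tmp/file3.txt")

# Equivalent alternative with cat, sort, and uniq.
alt=$(cat "$tmp/file1.txt" "$tmp/file2.txt" "$tmp/file3.txt" | sort | uniq)

# Use that list as patterns (-f) to filter the people file.
printf '%s\n' "$uniq_lines" > "$tmp/patterns.txt"
filtered=$(grep -F -f "$tmp/patterns.txt" "$tmp/people")
echo "$filtered"

rm -r "$tmp"
```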
Additional Commands
- extract names of females with the name 'Annabel' from the 'people' file
This command filters the lines of the people file that contain the word Annabel and where the person is female (3rd field == F), then uses awk to print only the first and second fields of the filtered lines. The fields in each line are separated by a space.
- search for 'Annabel' in all files and extract the names of females with the name 'Annabel'
This command applies the preceding operation to each file returned by the find command.
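A sketch of both commands, assuming a people file with space-separated first name, last name, and gender columns; the rows are invented for illustration.

```shell
# Sample people file (hypothetical rows).
tmp=$(mktemp -d)
printf 'Annabel Smith F\nAnnabel Stone M\nEve Jones F\n' > "$tmp/people"

# grep narrows to Annabel; awk keeps rows whose third field is F
# and prints the first two fields.
females=$(grep "Annabel" "$tmp/people" | awk '$3 == "F" {print $1, $2}')
echo "$females"

# Same filter applied to every file that find returns.
all=$(find "$tmp" -type f -exec awk '/Annabel/ && $3 == "F" {print $1, $2}' {} +)

rm -r "$tmp"
```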
Grouping Examples
Finding files with either ".txt" or ".pdf" extensions and with a size greater than 1MB
Suppose you want to find files with either ".txt" or ".pdf" extensions and with a size greater than 1MB. You can use \( ... \) to group the extension conditions and then add the size condition, as in find . \( -name "*.txt" -o -name "*.pdf" \) -size +1M.
In this command:
- \( -name "*.txt" -o -name "*.pdf" \) groups the conditions for finding files with ".txt" or ".pdf" extensions.
- -size +1M specifies the condition for files with a size greater than 1MB.
By grouping the extension conditions together, you ensure that the size condition is applied to both ".txt" and ".pdf" files.
Finding specific files with specific extensions
Suppose you want to find specific files with extensions such as ".sh", ".md", or "Dockerfile" and then search for a particular pattern within them. You can use the following command:
find /home/ubuntu/Documents/GitHub/ \( -name "*.sh" -o -name "*.md" -o -name "Dockerfile" \) -exec grep -FHn "apt install ./mongodb-database-tools-*.deb &" {} \;
In this command:
- \( -name "*.sh" -o -name "*.md" -o -name "Dockerfile" \) groups the conditions for finding files with the specified names and extensions.
- -exec grep -FHn "apt install ./mongodb-database-tools-*.deb &" {} \; executes the grep command to search for the specified text within each matched file. The -F option makes grep treat the text as a literal string; without it, the . and * characters are interpreted as regular-expression operators and the literal text does not match.
The \( ... \) construct is used to group the -name expressions together. This grouping is necessary because the -o operator (logical OR) has lower precedence than the implicit logical AND applied to separate find expressions. By using \( ... \), you ensure that the logical OR operation is applied correctly within the grouped expressions.
Without the grouping, the -exec action would bind only to the last -name test (here "Dockerfile"), because the implicit AND binds more tightly than -o. grep would then never run on the .sh and .md files.
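The precedence effect can be checked directly. The sketch below assumes a temporary directory with one file of each kind and uses echo in place of grep, counting how many files reach the -exec action with and without the parentheses.

```shell
# Sandbox with one file per name test (hypothetical names).
tmp=$(mktemp -d)
touch "$tmp/run.sh" "$tmp/notes.md" "$tmp/Dockerfile"

# Grouped: -exec applies to every file matched by any of the three tests.
grouped=$(find "$tmp" \( -name "*.sh" -o -name "*.md" -o -name "Dockerfile" \) -exec echo {} \; | wc -l | tr -d ' ')

# Ungrouped: parsed as  A -o B -o (C AND -exec), so only Dockerfile reaches -exec.
ungrouped=$(find "$tmp" -name "*.sh" -o -name "*.md" -o -name "Dockerfile" -exec echo {} \; | wc -l | tr -d ' ')

echo "grouped=$grouped ungrouped=$ungrouped"
rm -r "$tmp"
```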
Running a linter on .md files from a repo subfolder and on the README file in the main directory
To find both .md files in the ./docs directory and the README.md file in the current directory, you can use the -o (OR) operator along with the -exec option. Here's how you can do it:
In this command:
- .: Specifies the current directory as the starting point for the find command.
- \( ... \): Groups conditions together.
- -path "./docs/*" -name "*.md": Specifies files with the .md extension under the ./docs directory. The pattern needs the trailing /* because -path is matched against the whole path, and "./docs" alone matches only the directory itself.
- -o: Acts as the logical OR operator.
- -path "./README.md": Specifies the README.md file in the current directory.
- -exec markdownlint-cli2 --fix {} +: Executes the markdownlint-cli2 --fix command on the found files. The {} is replaced by the found file names.
This command will execute markdownlint-cli2 --fix on all .md files in the ./docs directory and README.md in the current directory.
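The find expression from the breakdown above can be tested without the linter installed. The sketch assumes a temporary repository layout and uses printf in place of markdownlint-cli2 --fix, so it only lists the files that would be linted.

```shell
# Sandbox repo layout (hypothetical files).
tmp=$(mktemp -d)
mkdir -p "$tmp/docs/guide" "$tmp/src"
touch "$tmp/README.md" "$tmp/docs/index.md" "$tmp/docs/guide/intro.md" "$tmp/src/notes.md"

# List the targets; swap printf '%s\n' for markdownlint-cli2 --fix to lint them.
targets=$(cd "$tmp" && find . \( -path "./docs/*" -name "*.md" -o -path "./README.md" \) -exec printf '%s\n' {} +)
echo "$targets"
target_count=$(printf '%s\n' "$targets" | wc -l | tr -d ' ')

rm -r "$tmp"
```

The file under src/ is not listed, and nested files such as docs/guide/intro.md are, because * in a -path pattern also matches slashes.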
Comment out some lines matching a pattern
To comment out all lines containing font-size: in a CSS file using Bash, you can use the sed command. sed is a powerful stream editor in Unix-like systems, used for performing basic text transformations on an input stream (a file or input from a pipeline).
Here’s how you can use sed to find all occurrences of font-size: and comment out those lines in a CSS file. You will use the in-place edit option -i to modify the file directly. Please ensure to back up your file before running such commands, as they make direct changes to your files.
Step-by-Step Command
Backup Your CSS File
Before running the command, it's a good idea to make a backup of your CSS file:
The following sed command will add /* and */ around any line that contains font-size:. This approach assumes that the lines do not already contain block comments (/* */), as nested comments are not supported in CSS and will lead to errors.
Explanation:
Here’s what each part of the command does:
- sed -i: The -i option edits files in-place (i.e., saves back to the original file).
- '/font-size:/: Selects lines that match the pattern font-size:.
- s|.*|/* & */|: Replaces the entire line (.* matches everything) with /* followed by the original line (& represents the matched text), followed by */.
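Putting the parts of the explanation together gives sed -i '/font-size:/s|.*|/* & */|' followed by the file name. The sketch below runs it, with the backup step first, on a temporary file; styles.css is a hypothetical name, and the -i form shown is the GNU sed syntax found on Linux.

```shell
# Sample stylesheet (hypothetical content).
tmp=$(mktemp -d)
css="$tmp/styles.css"
printf 'h1 {\n  font-size: 2em;\n  color: red;\n}\n' > "$css"

# Step 1: keep a backup next to the original.
cp "$css" "$css.bak"

# Step 2: wrap every line containing font-size: in /* ... */, in place.
sed -i '/font-size:/s|.*|/* & */|' "$css"

commented=$(grep "font-size" "$css")
backup=$(grep "font-size" "$css.bak")
echo "$commented"

rm -r "$tmp"
```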
Alternative Command
If you just want to prepend // to lines instead of wrapping them with /* */, assuming this is for a CSS-like language that supports // comments (note: standard CSS does not support // for commenting), you could do:
This command adds // at the start of any line that contains font-size:.
Testing the Command
It's a good practice to test the command on a sample file or with a copy of your file to ensure it performs as expected without modifying the original file:
This command applies the changes and redirects the output to test_output.css instead of altering the original file.
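Both the // variant and the dry run described above can be sketched as follows, again assuming a hypothetical styles.css in a temporary directory. Without -i, sed writes to standard output, so the original file stays unchanged.

```shell
# Sample stylesheet (hypothetical content).
tmp=$(mktemp -d)
printf 'p {\n  font-size: 12px;\n}\n' > "$tmp/styles.css"

# Dry run: write the result to test_output.css and leave the original alone.
sed '/font-size:/s|.*|/* & */|' "$tmp/styles.css" > "$tmp/test_output.css"
preview=$(sed -n '2p' "$tmp/test_output.css")
untouched=$(sed -n '2p' "$tmp/styles.css")

# Alternative: prepend // (only valid in CSS-like languages that support it).
slashes=$(sed '/font-size:/s|^|// |' "$tmp/styles.css" | sed -n '2p')

echo "$preview"; echo "$slashes"
rm -r "$tmp"
```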
These commands should help you automatically comment out all lines containing font-size: in your CSS file using Bash.
Conclusion
These commands, when mastered and strategically combined, offer a robust toolkit for proficiently managing and manipulating files and directories in a Linux environment. By leveraging these commands in tandem, users can perform intricate searches, conduct comprehensive analyses, and execute operations swiftly, significantly enhancing productivity and workflow efficiency.