Advanced Bash Scripting
Master advanced techniques for production-grade shell scripts: process substitution, here documents, signal handling, debugging, text processing, and automation.
Here Documents
A here document (<<) passes multi-line input to a command:
cat << EOF
This is a multi-line
document that preserves
all formatting and newlines.
EOF
Quoted delimiter ('EOF') prevents expansion:
cat << 'EOF'
Variables like $PATH are not expanded
Backticks `date` are literal
EOF
Unquoted delimiter allows expansion:
cat << EOF
User: $USER
Home: $HOME
Date: $(date)
EOF
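Combined with redirection, a here document is a convenient way to generate files from templates. A minimal sketch (the /tmp path and key names are illustrative):

```shell
# Write a small config file from a template; the unquoted EOF
# delimiter lets $USER and $HOME expand at generation time.
config="/tmp/demo_app.conf"   # illustrative path
cat > "$config" << EOF
user=$USER
home=$HOME
EOF
grep "^user=" "$config"
```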
Here Strings
A here string (<<<) passes a single string to a command:
grep "pattern" <<< "Here is the input string"
bc <<< "5 + 3" # Outputs: 8
wc -w <<< "Count these words" # Outputs: 3
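One practical difference from a pipe: the command fed by a here string runs in the current shell, so variables it sets persist. For example:

```shell
# With a pipe, 'read' runs in a subshell and $name stays empty:
name=""
echo "alice" | read name
echo "after pipe: '$name'"        # after pipe: ''

# With a here string, 'read' runs in the current shell:
read name <<< "alice"
echo "after here string: '$name'" # after here string: 'alice'
```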
Redirect to Variable
output=$(cat << 'EOF'
Line 1
Line 2
Line 3
EOF
)
echo "$output"
Process Substitution
Process substitution (<() and >()) treats command output as files:
Input substitution — Pass command output as file argument:
diff <(sort file1.txt) <(sort file2.txt)
Without it, you would need intermediate files:
sort file1.txt > /tmp/sorted1 && sort file2.txt > /tmp/sorted2
diff /tmp/sorted1 /tmp/sorted2
Output substitution — Redirect to command input:
tee >(cat > file1.txt) >(cat > file2.txt) <<< "content"
Practical Examples
Compare two snapshots of the process list:
diff <(ps aux) <(sleep 5; ps aux)
Merge sorted streams:
comm <(grep admin /etc/passwd | sort) \
<(grep admin /etc/group | sort)
Parallel processing:
paste <(seq 1 5) <(seq 6 10) # Creates two-column output
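Process substitution also solves a common pitfall: piping into a while loop runs the loop body in a subshell, so variables set inside it are lost. Reading from <( ) keeps the loop in the current shell:

```shell
# Count lines produced by a command without losing the counter.
count=0
while IFS= read -r line; do
    count=$((count + 1))
done < <(printf '%s\n' one two three)
echo "Read $count lines"   # Read 3 lines
```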
Signal Handling with trap
The trap command catches signals and system events:
trap 'echo "Caught interrupt"; exit' INT
Clean up on exit:
#!/bin/bash
cleanup() {
echo "Cleaning up..."
rm -f /tmp/temp_$$
exit 0
}
trap cleanup EXIT
# Create temp file
temp_file="/tmp/temp_$$"
touch "$temp_file"
echo "Working with $temp_file"
Handle multiple signals:
error_handler() {
echo "Error on line $1"
exit 1
}
trap 'error_handler $LINENO' ERR
Common signals:
| Signal | Meaning | Use |
|---|---|---|
| INT | Ctrl+C | Clean shutdown |
| TERM | Termination | Graceful exit |
| EXIT | Script exit | Final cleanup |
| ERR | Command error | Error handling |
| DEBUG | Each command | Tracing |
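The EXIT trap guarantees cleanup even when a script ends early. A minimal, self-contained sketch (run inside a command-substitution subshell so the trap fires when that subshell exits):

```shell
out=$(
    tmp=$(mktemp)                               # temp file to clean up
    trap 'rm -f "$tmp"; echo "cleaned up"' EXIT # fires on subshell exit
    echo "working"
)
echo "$out"
# working
# cleaned up
```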
Debugging
Debug Mode
Enable debugging to trace execution:
bash -x script.sh # Run with trace
Inside script:
set -x # Start debugging
commands...
set +x # Stop debugging
Output shows command before execution:
+ echo 'Processing file'
Processing file
+ ls -l
total 8
...
PS4 Customization
Customize debug prompt:
PS4='+ [${BASH_SOURCE}:${LINENO}] '
set -x
Output shows filename and line number.
Verbose Mode
set -v # Print commands before execution
set +v # Turn off
Check Syntax Without Running
bash -n script.sh # Check syntax
Parameter Parsing with getopts
Parse command-line options safely:
#!/bin/bash
verbose=false
output_file=""
while getopts "vo:" opt; do
case $opt in
v)
verbose=true
;;
o)
output_file="$OPTARG"
;;
*)
echo "Usage: $0 [-v] [-o FILE]"
exit 1
;;
esac
done
shift $((OPTIND-1)) # Remove processed options
remaining_args=("$@") # Array preserves argument boundaries
echo "Verbose: $verbose"
echo "Output: $output_file"
echo "Arguments: ${remaining_args[*]}"
Usage:
./script.sh -v -o output.txt file1 file2
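A leading colon in the optstring switches getopts to silent error reporting, letting you distinguish an unknown option (opt set to `?`) from a missing argument (opt set to `:`). A sketch, with `parse` as an illustrative wrapper function:

```shell
parse() {
    local opt verbose=false output="" OPTIND=1
    while getopts ":vo:" opt; do
        case $opt in
            v)  verbose=true ;;
            o)  output="$OPTARG" ;;
            \?) echo "Unknown option: -$OPTARG"; return 1 ;;
            :)  echo "Option -$OPTARG requires an argument"; return 1 ;;
        esac
    done
    echo "verbose=$verbose output=$output"
}
parse -v -o out.txt   # verbose=true output=out.txt
parse -o              # Option -o requires an argument
parse -z              # Unknown option: -z
```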
Text Processing: sed and awk
sed (Stream Editor)
Print lines:
sed -n '5p' file.txt # Print line 5
sed -n '1,5p' file.txt # Print lines 1-5
sed -n '/pattern/p' file.txt # Print matching lines
Delete lines:
sed '5d' file.txt # Delete line 5
sed '/pattern/d' file.txt # Delete matching lines
sed '1,5d' file.txt # Delete lines 1-5
Substitute (replace):
sed 's/old/new/' file.txt # Replace first on each line
sed 's/old/new/g' file.txt # Replace all on each line
sed 's/old/new/2' file.txt # Replace second on each line
sed -i 's/old/new/g' file.txt # In-place edit
Case-insensitive:
sed 's/Pattern/replaced/I' file.txt
Multi-line sed:
sed -e 's/pattern1/replace1/' \
-e 's/pattern2/replace2/' \
-e '/unwanted/d' file.txt
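With extended regex (-E), capture groups and backreferences let sed reorder matched text:

```shell
# Swap "key: value" pairs into "value = key" using \1 and \2.
echo "name: alice" | sed -E 's/^([a-z]+): (.*)$/\2 = \1/'
# alice = name
```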
awk (Pattern-Action Language)
Print columns:
awk '{print $1, $3}' file.txt # Print columns 1 and 3
awk '{print NF}' file.txt # Print number of fields
Filter by pattern:
awk '/pattern/ {print}' file.txt # Print matching lines
awk '$2 > 100 {print $1, $2}' file.txt # Print where col 2 > 100
Arithmetic:
awk '{sum += $1} END {print sum}' file.txt # Sum column 1
awk '{print $1 * 2}' file.txt # Double column 1
Field separator:
awk -F: '{print $1, $3}' /etc/passwd # Use ':' as separator
awk -F'[,:]' '{print $1}' file.txt # Multiple separators
Built-in variables:
| Variable | Meaning |
|---|---|
| NR | Current record (line) number |
| NF | Number of fields in current record |
| FS | Field separator (default: whitespace) |
| RS | Record separator (default: newline) |
| OFS | Output field separator (default: space) |
| ORS | Output record separator (default: newline) |
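For example, OFS is applied wherever print joins comma-separated fields, which makes re-delimiting input a one-liner:

```shell
# Read ':'-separated input, emit ','-separated output.
echo "root:x:0" | awk -F: 'BEGIN {OFS=","} {print $1, $3}'
# root,0
```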
Complex example:
awk -F: '
NR > 1 { # Skip header
sum += $3
count++
}
END {
if (count > 0)
print "Average:", sum/count
}
' data.txt
Text Processing Pipelines
Extract and transform:
grep "status=active" data.txt | \
awk -F',' '{print $2, $4}' | \
sort | \
uniq -c | \
sort -rn
Count occurrences:
tr ' ' '\n' < file.txt | \
sort | \
uniq -c | \
sort -rn | \
head -10
Extract IP addresses:
grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' file.txt | \
sort | \
uniq -c | \
sort -rn
Regular Expressions
Basics
^text # Start of line
text$ # End of line
. # Any character
* # Zero or more
+ # One or more (extended regex)
? # Zero or one (extended regex)
[abc] # Character class
[^abc] # Negated character class
\d # Digit (in some contexts)
\w # Word character (in some contexts)
Examples
# Match email-like pattern
grep -E '[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-z]{2,}' file.txt
# Extract version numbers
grep -oE '[0-9]+\.[0-9]+\.[0-9]+' file.txt
# Find log entries for specific date
grep '^2024-03-' log.txt
# Match lines with exactly 3 digits
grep -E '^[0-9]{3}$' file.txt
Cron Automation
Crontab Syntax
┌───────────── minute (0-59)
│ ┌───────────── hour (0-23)
│ │ ┌───────────── day of month (1-31)
│ │ │ ┌───────────── month (1-12)
│ │ │ │ ┌───────────── day of week (0-7, 0/7 = Sunday)
│ │ │ │ │
│ │ │ │ │
* * * * * command
Common Schedules
0 9 * * * # 9:00 AM daily
0 */4 * * * # Every 4 hours
30 2 * * 0 # 2:30 AM on Sunday
0 0 1 * * # Midnight on first of month
0 12 * * 1-5 # Noon, Monday-Friday
*/15 * * * * # Every 15 minutes
Create Cron Job
crontab -e # Edit current user's crontab
crontab -l # List current crontab
crontab -r # Remove crontab
crontab -u user -e # Edit another user's (as root)
Example crontab entry:
# Backup database daily at 2 AM
0 2 * * * /home/user/scripts/backup.sh >> /var/log/backup.log 2>&1
# Clean logs weekly
0 3 * * 0 find /var/log -name "*.log" -mtime +30 -delete
Best Practices
- Use full paths in cron scripts
- Redirect output to log files
- Use >/dev/null 2>&1 to suppress output if not needed
- Set an email recipient for errors: MAILTO=admin@example.com
- Use UTC for consistency across systems
Regex Matching in Bash
Conditional regex match:
if [[ $email =~ ^[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-z]{2,}$ ]]; then
echo "Valid email"
fi
Extract matched groups:
if [[ $string =~ ^([a-z]+)-([0-9]+)$ ]]; then
name="${BASH_REMATCH[1]}"
number="${BASH_REMATCH[2]}"
fi
Advanced I/O
Read with IFS
Split input by field separator:
IFS=: read -r user password uid gid rest <<< "root:x:0:0:..."
echo "User: $user, UID: $uid"
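The same IFS trick works record by record when looping over multi-line input (sample records inlined via a here document; `_` discards unwanted fields):

```shell
# Parse colon-separated records line by line.
while IFS=: read -r user _ uid _; do
    echo "$user has UID $uid"
done << 'EOF'
root:x:0:0
daemon:x:1:1
EOF
# root has UID 0
# daemon has UID 1
```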
Parallel Processing
Using xargs:
find . -name "*.txt" -print0 | xargs -0 -I {} wc -l {} # -print0/-0 handles spaces in names
xargs -P 4 -I {} process_file {} < file_list.txt # 4 parallel jobs
Using GNU Parallel:
parallel "process {} > {.}_output.txt" ::: file1.txt file2.txt file3.txt
Exercises
Exercise 1: Log Analysis Script
Create a script analyzing access logs:
#!/bin/bash
logfile="${1:-/var/log/access.log}"
if [ ! -f "$logfile" ]; then
echo "Error: $logfile not found"
exit 1
fi
echo "=== Log Analysis ==="
echo "Total requests: $(wc -l < "$logfile")"
echo "Top IPs by request count:"
awk '{print $1}' "$logfile" | sort | uniq -c | sort -rn | head -5
echo "Status codes:"
awk '{print $9}' "$logfile" | sort | uniq -c | sort -rn
Exercise 2: Backup with Cleanup
Create an automated backup script:
#!/bin/bash
set -euo pipefail
backup_dir="/backups"
source_dir="/home/data"
retention_days=7
cleanup() {
echo "Cleaning up..."
rm -f "$backup_file.tmp"
}
trap cleanup EXIT
backup_file="$backup_dir/backup_$(date +%Y%m%d_%H%M%S).tar.gz"
echo "Starting backup of $source_dir..."
tar -czf "$backup_file.tmp" "$source_dir"
mv "$backup_file.tmp" "$backup_file"
echo "Backup complete: $backup_file"
# Delete old backups
find "$backup_dir" -name "backup_*.tar.gz" -mtime "+$retention_days" -delete
Exercise 3: Config Parser
Parse key=value configuration:
#!/bin/bash
config_file="${1:-config.txt}"
while IFS='=' read -r key value; do
# Skip comments and empty lines
[[ "$key" =~ ^#.*$ ]] && continue
[ -z "$key" ] && continue
# Remove spaces (simple normalization)
key="${key// /}"
value="${value// /}"
export "$key=$value"
done < "$config_file"
echo "Loaded config:"
grep -Ev '^[[:space:]]*(#|$)' "$config_file"
Exercise 4: Monitoring Script with Cron
Create a system monitoring script:
#!/bin/bash
check_disk() {
usage=$(df / | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$usage" -gt 80 ]; then
echo "WARNING: Disk usage at $usage%"
fi
}
check_memory() {
usage=$(free | awk 'NR==2 {printf "%.0f", ($3/$2)*100}')
if [ "$usage" -gt 80 ]; then
echo "WARNING: Memory usage at $usage%"
fi
}
check_load() {
load=$(uptime | awk -F'load average:' '{print $2}' | cut -d, -f1)
threshold=$(nproc)
if (( $(echo "$load > $threshold" | bc -l) )); then
echo "WARNING: Load average $load exceeds CPU count"
fi
}
{
check_disk
check_memory
check_load
} 2>&1 | while IFS= read -r line; do
[ -n "$line" ] && echo "[$(date)] $line"
done >> /var/log/system_checks.log
Add to crontab:
*/5 * * * * /usr/local/bin/monitor.sh
Key Takeaways
- Here documents enable multi-line input; here strings pass single strings
- Process substitution treats command output as file arguments
- trap handles signals for clean shutdown and resource cleanup
- Debugging with set -x and bash -n catches errors early
- getopts provides robust command-line parsing
- sed and awk are powerful for text transformation
- Regular expressions enable pattern matching and extraction
- Cron automates periodic tasks with specific scheduling
- Combining commands in pipelines creates powerful data workflows
Next Steps
Explore the Cheatsheet for quick reference of 100+ commands and snippets, or review Interview Questions to prepare for technical assessments.