Advanced Bash Scripting

Master advanced techniques for production-grade shell scripts: process substitution, here documents, signal handling, debugging, text processing, and automation.

Here Documents

A here document (<<) passes multi-line input to a command:

cat << 'EOF'
This is a multi-line
document that preserves
all formatting and newlines.
EOF

Quoted delimiter ('EOF') prevents expansion:

cat << 'EOF'
Variables like $PATH are not expanded
Backticks `date` are literal
EOF

Unquoted delimiter allows expansion:

cat << EOF
User: $USER
Home: $HOME
Date: $(date)
EOF

Here Strings

A here string (<<<) passes a single string to a command:

grep "pattern" <<< "Here is the input string"
bc <<< "5 + 3"                    # Outputs: 8
wc -w <<< "Count these words"     # Outputs: 3

Redirect to Variable

output=$(cat << 'EOF'
Line 1
Line 2
Line 3
EOF
)
echo "$output"

Process Substitution

Process substitution (<() and >()) treats command output as files:

Input substitution — Pass command output as file argument:

diff <(sort file1.txt) <(sort file2.txt)

Equivalent to:

diff <(cat file1.txt | sort) <(cat file2.txt | sort)

Output substitution — Redirect to command input:

tee >(cat > file1.txt) >(cat > file2.txt) <<< "content"

Practical Examples

Compare two commands:

diff <(ps aux) <(ps aux)

Merge sorted streams:

comm <(grep admin /etc/passwd | sort) \
     <(grep admin /etc/group | sort)

Parallel processing:

paste <(seq 1 5) <(seq 6 10)  # Creates two-column output

Signal Handling with trap

The trap command catches signals and system events:

trap 'echo "Caught interrupt"; exit' INT

Clean up on exit:

#!/bin/bash

cleanup() {
  echo "Cleaning up..."
  rm -f /tmp/temp_$$
  exit 0
}

trap cleanup EXIT

# Create temp file
temp_file="/tmp/temp_$$"
touch "$temp_file"
echo "Working with $temp_file"

Handle multiple signals:

error_handler() {
  echo "Error on line $1"
  exit 1
}

trap 'error_handler $LINENO' ERR

Common signals:

Signal	Meaning	Use
`INT`	Ctrl+C	Clean shutdown
`TERM`	Termination	Graceful exit
`EXIT`	Script exit	Final cleanup
`ERR`	Command error	Error handling
`DEBUG`	Each command	Tracing

Debugging

Debug Mode

Enable debugging to trace execution:

bash -x script.sh              # Run with trace

Inside script:

set -x                         # Start debugging
commands...
set +x                         # Stop debugging

Output shows command before execution:

+ echo 'Processing file'
Processing file
+ ls -l
total 8
...

PS4 Customization

Customize debug prompt:

PS4='+ [${BASH_SOURCE}:${LINENO}] '
set -x

Output shows filename and line number.

Verbose Mode

set -v                         # Print commands before execution
set +v                         # Turn off

Check Syntax Without Running

bash -n script.sh              # Check syntax

Parameter Parsing with getopts

Parse command-line options safely:

#!/bin/bash

verbose=false
output_file=""

while getopts "vo:" opt; do
  case $opt in
    v)
      verbose=true
      ;;
    o)
      output_file="$OPTARG"
      ;;
    *)
      echo "Usage: $0 [-v] [-o FILE]"
      exit 1
      ;;
  esac
done

shift $((OPTIND-1))            # Remove processed options
remaining_args="$@"

echo "Verbose: $verbose"
echo "Output: $output_file"
echo "Arguments: $remaining_args"

Usage:

./script.sh -v -o output.txt file1 file2

Text Processing: sed and awk

sed (Stream Editor)

Print lines:

sed -n '5p' file.txt           # Print line 5
sed -n '1,5p' file.txt         # Print lines 1-5
sed -n '/pattern/p' file.txt   # Print matching lines

Delete lines:

sed '5d' file.txt              # Delete line 5
sed '/pattern/d' file.txt      # Delete matching lines
sed '1,5d' file.txt            # Delete lines 1-5

Substitute (replace):

sed 's/old/new/' file.txt      # Replace first on each line
sed 's/old/new/g' file.txt     # Replace all on each line
sed 's/old/new/2' file.txt     # Replace second on each line
sed -i 's/old/new/g' file.txt  # In-place edit

Case-insensitive:

sed 's/Pattern/replaced/I' file.txt

Multi-line sed:

sed -e 's/pattern1/replace1/' \
    -e 's/pattern2/replace2/' \
    -e '/unwanted/d' file.txt

awk (Pattern-Action Language)

Print columns:

awk '{print $1, $3}' file.txt  # Print columns 1 and 3
awk '{print NF}' file.txt      # Print number of fields

Filter by pattern:

awk '/pattern/ {print}' file.txt           # Print matching lines
awk '$2 > 100 {print $1, $2}' file.txt     # Print where col 2 > 100

Arithmetic:

awk '{sum += $1} END {print sum}' file.txt  # Sum column 1
awk '{print $1 * 2}' file.txt               # Double column 1

Field separator:

awk -F: '{print $1, $3}' /etc/passwd   # Use ':' as separator
awk -F'[,:]' '{print $1}' file.txt     # Multiple separators

Built-in variables:

Variable	Meaning
`NR`	Current row number
`NF`	Number of fields
`FS`	Field separator (default: space)
`RS`	Record separator (default: newline)
`OFS`	Output field separator
`ORS`	Output record separator

Complex example:

awk -F: '
  NR > 1 {                    # Skip header
    sum += $3
    count++
  }
  END {
    if (count > 0)
      print "Average:", sum/count
  }
' data.txt

Text Processing Pipelines

Extract and transform:

cat data.txt | \
  grep "status=active" | \
  awk -F',' '{print $2, $4}' | \
  sort | \
  uniq -c | \
  sort -rn

Count occurrences:

cat file.txt | \
  tr ' ' '\n' | \
  sort | \
  uniq -c | \
  sort -rn | \
  head -10

Extract IP addresses:

grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' file.txt | \
  sort | \
  uniq -c | \
  sort -rn

Regular Expressions

Basics

^text       # Start of line
text$       # End of line
.           # Any character
*           # Zero or more
+           # One or more (extended regex)
?           # Zero or one (extended regex)
[abc]       # Character class
[^abc]      # Negated character class
\d          # Digit (in some contexts)
\w          # Word character (in some contexts)

Examples

# Match email-like pattern
grep -E '[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-z]{2,}' file.txt

# Extract version numbers
grep -oE '[0-9]+\.[0-9]+\.[0-9]+' file.txt

# Find log entries for specific date
grep '^2024-03-' log.txt

# Match lines with exactly 3 digits
grep -E '^[0-9]{3}$' file.txt

Cron Automation

Crontab Syntax

┌───────────── minute (0-59)
│ ┌───────────── hour (0-23)
│ │ ┌───────────── day of month (1-31)
│ │ │ ┌───────────── month (1-12)
│ │ │ │ ┌───────────── day of week (0-7, 0/7 = Sunday)
│ │ │ │ │
│ │ │ │ │
* * * * * command

Common Schedules

9 * * *       # 9:00 AM daily
*/4 * * *     # Every 4 hours
2 * * 0      # 2:30 AM on Sunday
0 1 * *       # Midnight on first of month
12 * * 1-5    # Noon, Monday-Friday
*/15 * * * *    # Every 15 minutes

Create Cron Job

crontab -e                     # Edit current user's crontab
crontab -l                     # List current crontab
crontab -r                     # Remove crontab
crontab -u user -e             # Edit another user's (as root)

Example crontab entry:

# Backup database daily at 2 AM
0 2 * * * /home/user/scripts/backup.sh >> /var/log/backup.log 2>&1

# Clean logs weekly
0 3 * * 0 find /var/log -name "*.log" -mtime +30 -delete

Best Practices

Use full paths in cron scripts
Redirect output to log files
Use >/dev/null 2>&1 to suppress output if not needed
Set email recipient for errors: MAILTO=admin@example.com
Use UTC for consistency across systems

Regex Matching in Bash

Conditional regex match:

if [[ $email =~ ^[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-z]{2,}$ ]]; then
  echo "Valid email"
fi

Extract matched groups:

if [[ $string =~ ^([a-z]+)-([0-9]+)$ ]]; then
  name="${BASH_REMATCH[1]}"
  number="${BASH_REMATCH[2]}"
fi

Advanced I/O

Read with IFS

Split input by field separator:

IFS=: read -r user password uid gid rest <<< "root:x:0:0:..."
echo "User: $user, UID: $uid"

Parallel Processing

Using xargs:

find . -name "*.txt" | xargs -I {} wc -l {}
cat file_list.txt | xargs -P 4 -I {} process_file {}

Using GNU Parallel:

parallel "process {} > {.}_output.txt" ::: file1.txt file2.txt file3.txt

Exercises

Exercise 1: Log Analysis Script

Create a script analyzing access logs:

#!/bin/bash

logfile="${1:-/var/log/access.log}"

if [ ! -f "$logfile" ]; then
  echo "Error: $logfile not found"
  exit 1
fi

echo "=== Log Analysis ==="
echo "Total requests: $(wc -l < "$logfile")"
echo "Unique IPs:"
awk '{print $1}' "$logfile" | sort | uniq -c | sort -rn | head -5
echo "Status codes:"
awk '{print $9}' "$logfile" | sort | uniq -c | sort -rn

Exercise 2: Backup with Cleanup

Create an automated backup script:

#!/bin/bash

set -euo pipefail

backup_dir="/backups"
source_dir="/home/data"
retention_days=7

cleanup() {
  echo "Cleaning up..."
  rm -f "$backup_file.tmp"
}

trap cleanup EXIT

backup_file="$backup_dir/backup_$(date +%Y%m%d_%H%M%S).tar.gz"

echo "Starting backup of $source_dir..."
tar -czf "$backup_file.tmp" "$source_dir"
mv "$backup_file.tmp" "$backup_file"
echo "Backup complete: $backup_file"

# Delete old backups
find "$backup_dir" -name "backup_*.tar.gz" -mtime "+$retention_days" -delete

Exercise 3: Config Parser

Parse key=value configuration:

#!/bin/bash

config_file="${1:-config.txt}"

while IFS='=' read -r key value; do
  # Skip comments and empty lines
  [[ "$key" =~ ^#.*$ ]] && continue
  [ -z "$key" ] && continue

  # Trim whitespace
  key="${key// /}"
  value="${value// /}"

  export "$key=$value"
done < "$config_file"

echo "Loaded config:"
declare -p | grep "^\|declare"

Exercise 4: Monitoring Script with Cron

Create a system monitoring script:

#!/bin/bash

check_disk() {
  usage=$(df / | awk 'NR==2 {print $5}' | sed 's/%//')
  if [ "$usage" -gt 80 ]; then
    echo "WARNING: Disk usage at $usage%"
  fi
}

check_memory() {
  usage=$(free | awk 'NR==2 {printf "%.0f", ($3/$2)*100}')
  if [ "$usage" -gt 80 ]; then
    echo "WARNING: Memory usage at $usage%"
  fi
}

check_load() {
  load=$(uptime | awk -F'load average:' '{print $2}' | cut -d, -f1)
  threshold=$(nproc)
  if (( $(echo "$load > $threshold" | bc -l) )); then
    echo "WARNING: Load average $load exceeds CPU count"
  fi
}

{
  check_disk
  check_memory
  check_load
} 2>&1 | while IFS= read -r line; do
  [ -n "$line" ] && echo "[$(date)] $line"
done >> /var/log/system_checks.log

Add to crontab:

*/5 * * * * /usr/local/bin/monitor.sh

Key Takeaways

Here documents enable multi-line input; here strings pass single strings
Process substitution treats command output as file arguments
trap handles signals for clean shutdown and resource cleanup
Debugging with set -x and bash -n catches errors early
getopts provides robust command-line parsing
sed and awk are powerful for text transformation
Regular expressions enable pattern matching and extraction
Cron automates periodic tasks with specific scheduling
Combining commands in pipelines creates powerful data workflows

Next Steps

Explore the Cheatsheet for quick reference of 100+ commands and snippets, or review Interview Questions to prepare for technical assessments.

Here Documents​

Here Strings​

Redirect to Variable​

Process Substitution​

Practical Examples​

Signal Handling with trap​

Debugging​

Debug Mode​

PS4 Customization​

Verbose Mode​

Check Syntax Without Running​

Parameter Parsing with getopts​

Text Processing: sed and awk​

sed (Stream Editor)​

awk (Pattern-Action Language)​

Text Processing Pipelines​

Regular Expressions​

Basics​

Examples​

Cron Automation​

Crontab Syntax​

Common Schedules​

Create Cron Job​

Best Practices​

Regex Matching in Bash​

Advanced I/O​

Read with IFS​

Parallel Processing​

Exercises​

Exercise 1: Log Analysis Script​

Exercise 2: Backup with Cleanup​

Exercise 3: Config Parser​

Exercise 4: Monitoring Script with Cron​

Key Takeaways​

Next Steps​