AWK is a concise text-processing language, shipped as a standalone command on virtually every Unix-like system and commonly invoked from shells such as Bash. It excels at data manipulation and reporting, which makes it a useful tool for developers and system administrators who process text files and generate reports. This guide covers the if, if-else, and if-else-if statements and the ternary operator in AWK, with practical examples.
The If Statement in AWK
The if statement is a fundamental control structure in AWK, enabling the execution of specific actions based on conditional evaluations.
Basic Syntax
The syntax for the if statement in AWK is straightforward:
awk '{ if (condition) { statement } }' [input_file]
When the specified condition evaluates to true, the statement enclosed within the braces is executed.
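As a minimal sketch of this form, the following command feeds three made-up numbers to AWK on standard input and prints only those greater than 10. The input values and the output file name if_demo.txt are illustrative assumptions, not part of the practice data.

```shell
# Feed three sample numbers (illustrative values) to awk on standard input.
# The action runs for every line; print fires only when the condition holds.
printf '5\n12\n20\n' |
awk '{ if ($1 > 10) { print $1, "is greater than 10" } }' > if_demo.txt

cat if_demo.txt
```

The line containing 5 produces no output because the condition is false and there is no else branch.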
Example 1: Filtering Data Based on a Condition
Consider a file named employees.txt with the following content (the commands below rely on AWK’s default whitespace field splitting):
ID | Name | Department | Salary |
---|---|---|---|
201 | Alice | HR | 50000 |
202 | Bob | Engineering | 60000 |
203 | Charlie | Marketing | 45000 |
204 | David | Engineering | 70000 |
205 | Eva | HR | 52000 |
To print the details of employees in the Engineering department, use the following AWK command:
awk '{
    if ($3 == "Engineering") {
        print "Employee ID:", $1, "Name:", $2, "Salary:", $4;
    }
}' employees.txt
Output
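One way to reproduce this example is sketched below. It assumes employees.txt stores the rows as whitespace-separated fields without the header row, because default field splitting would otherwise treat each | character as a field of its own. The output file name engineering.txt is only illustrative.

```shell
# Assumed layout: whitespace-separated fields, no header row.
cat > employees.txt <<'EOF'
201 Alice HR 50000
202 Bob Engineering 60000
203 Charlie Marketing 45000
204 David Engineering 70000
205 Eva HR 52000
EOF

# Print selected fields for rows whose third field is exactly "Engineering".
awk '{
    if ($3 == "Engineering") {
        print "Employee ID:", $1, "Name:", $2, "Salary:", $4;
    }
}' employees.txt > engineering.txt

cat engineering.txt
```

Under that assumption, the command prints one line each for Bob (ID 202) and David (ID 204).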
Example 2: Identifying High Salary Employees
Suppose you want to identify employees earning more than $50,000. The AWK command would be:
awk '{
    if ($4 > 50000) {
        print "High Salary Employee - ID:", $1, "Name:", $2, "Salary:", $4;
    }
}' employees.txt
Output
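If the file keeps its header row, a non-numeric field such as Salary is compared with 50000 as a string, and in typical locales letters sort after digits, so the header can slip through as a "high salary" line. The sketch below shows one common guard, the condition NR > 1, which skips the first record. The file name employees_with_header.txt and its layout are assumptions made for illustration.

```shell
# Assumed layout: whitespace-separated fields with a header row on line 1.
cat > employees_with_header.txt <<'EOF'
ID Name Department Salary
201 Alice HR 50000
202 Bob Engineering 60000
203 Charlie Marketing 45000
204 David Engineering 70000
205 Eva HR 52000
EOF

# NR is the current record number; the pattern NR > 1 skips the header line.
awk 'NR > 1 {
    if ($4 > 50000) {
        print "High Salary Employee - ID:", $1, "Name:", $2, "Salary:", $4;
    }
}' employees_with_header.txt > high_salary.txt

cat high_salary.txt
```

Alice is not listed because her salary is exactly 50000 and the comparison is strict.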
Example 3: Checking for Specific Values
Let’s say you need to find the employee whose ID is 203. You can use the if statement as follows:
awk '{
    if ($1 == 203) {
        print "Employee Found - Name:", $2, "Department:", $3, "Salary:", $4;
    }
}' employees.txt
Output
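AWK also lets a condition stand in front of the action as a pattern, a common shorthand for a single if with no else branch. The sketch below is an equivalent formulation of the command above rather than a replacement for it. It builds its own copy of the sample rows (whitespace-separated, no header, an assumed layout) in the hypothetical file employees_sample.txt.

```shell
# Assumed layout: whitespace-separated fields, no header row.
cat > employees_sample.txt <<'EOF'
201 Alice HR 50000
202 Bob Engineering 60000
203 Charlie Marketing 45000
204 David Engineering 70000
205 Eva HR 52000
EOF

# The pattern $1 == 203 selects the record; the action runs only for matches.
awk '$1 == 203 {
    print "Employee Found - Name:", $2, "Department:", $3, "Salary:", $4
}' employees_sample.txt > found.txt

cat found.txt
```

The explicit if form remains the better fit once an else branch is needed, since a pattern has no direct counterpart for it.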
The If-Else Statement in AWK
The if-else statement in AWK introduces a dual-path control structure, allowing you to specify actions for both true and false evaluations of a condition. This flexibility is vital for data processing tasks where different outcomes are required based on varying conditions.
Basic Syntax
The syntax for the if-else statement in AWK is as follows:
awk '{
    if (condition) {
        statement1
    } else {
        statement2
    }
}' [input_file]
If the specified condition evaluates to true, statement1 is executed. Otherwise, statement2 is executed.
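A minimal sketch of the two-branch form, using made-up numbers on standard input rather than the practice data, might look like this. The output file name parity.txt is illustrative.

```shell
# Classify each input number (illustrative values) as even or odd.
printf '1\n2\n3\n' |
awk '{
    if ($1 % 2 == 0) {
        print $1, "is even"
    } else {
        print $1, "is odd"
    }
}' > parity.txt

cat parity.txt
```

Every input line produces exactly one output line, because one of the two branches always runs.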
Example 1: Categorizing Employees by Department
Consider a file named staff.txt with the following data:
ID | Name | Age | Department |
---|---|---|---|
301 | Emma | 28 | Sales |
302 | Liam | 34 | Engineering |
303 | Olivia | 29 | Marketing |
304 | Noah | 42 | Sales |
305 | Ava | 37 | Engineering |
To categorize employees into Sales and non-Sales departments, you can use the following AWK command:
awk '{
    if ($4 == "Sales") {
        print "Sales Department - ID:", $1, "Name:", $2;
    } else {
        print "Non-Sales Department - ID:", $1, "Name:", $2;
    }
}' staff.txt
Output
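The run can be reproduced with the sketch below, which assumes staff.txt holds the rows as whitespace-separated fields with no header row. A header line, if present, would fall into the else branch and be reported as a non-Sales employee. The output file name departments.txt is illustrative.

```shell
# Assumed layout: whitespace-separated fields, no header row.
cat > staff.txt <<'EOF'
301 Emma 28 Sales
302 Liam 34 Engineering
303 Olivia 29 Marketing
304 Noah 42 Sales
305 Ava 37 Engineering
EOF

# Rows whose fourth field is "Sales" take the first branch; all others take else.
awk '{
    if ($4 == "Sales") {
        print "Sales Department - ID:", $1, "Name:", $2;
    } else {
        print "Non-Sales Department - ID:", $1, "Name:", $2;
    }
}' staff.txt > departments.txt

cat departments.txt
```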
Example 2: Evaluating Age Groups
Suppose you want to classify employees as either under 30 or 30 and above. The AWK command would be:
awk '{
    if ($3 < 30) {
        print "Under 30 - Name:", $2, "Age:", $3;
    } else {
        print "30 and Above - Name:", $2, "Age:", $3;
    }
}' staff.txt
Output
Example 3: Distinguishing High and Low Salaries
Let’s modify our dataset to include salaries and use an AWK command to differentiate between high and low earners. Assume staff_salaries.txt has the following structure:
ID | Name | Department | Salary |
---|---|---|---|
401 | Jacob | IT | 65000 |
402 | Mia | HR | 48000 |
403 | William | IT | 73000 |
404 | Sophia | Marketing | 52000 |
405 | James | HR | 46000 |
To distinguish employees earning more than $50,000 from those earning $50,000 or less, the AWK command would be:
awk '{
    if ($4 > 50000) {
        print "High Salary - Name:", $2, "Salary:", $4;
    } else {
        print "Low Salary - Name:", $2, "Salary:", $4;
    }
}' staff_salaries.txt
Output
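The threshold does not have to be hard-coded. As a hedged variation, the sketch below passes it in with the standard -v option, so the same program can be reused with a different cutoff. The variable name limit, the output file name salary_bands.txt, and the header-free whitespace-separated layout are assumptions for illustration.

```shell
# Assumed layout: whitespace-separated fields, no header row.
cat > staff_salaries.txt <<'EOF'
401 Jacob IT 65000
402 Mia HR 48000
403 William IT 73000
404 Sophia Marketing 52000
405 James HR 46000
EOF

# -v assigns an awk variable before the program starts; limit is a hypothetical name.
awk -v limit=50000 '{
    if ($4 > limit) {
        print "High Salary - Name:", $2, "Salary:", $4;
    } else {
        print "Low Salary - Name:", $2, "Salary:", $4;
    }
}' staff_salaries.txt > salary_bands.txt

cat salary_bands.txt
```

Because both the field and the -v value look like numbers, AWK compares them numerically.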
The If-Else-If Statement in AWK
The if-else-if statement in AWK chains several conditions and evaluates them in sequence, which suits decisions with more than two possible outcomes.
Basic Syntax
The if-else-if statement in AWK follows this syntax:
awk '{
    if (condition1) {
        statement1
    } else if (condition2) {
        statement2
    } else if (condition3) {
        statement3
    } else {
        statementN
    }
}' [input_file]
AWK evaluates each condition in order. When a true condition is found, the corresponding statement is executed, and the remaining conditions are skipped.
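Because only the first true condition runs, the order of the tests matters. The sketch below uses made-up scores on standard input to illustrate this: the value 95 satisfies both conditions, but only the first branch fires. The labels and the output file name chain_demo.txt are illustrative.

```shell
# Conditions are tested top to bottom, so the most restrictive test
# ($1 >= 90) must come before the looser one ($1 >= 60).
printf '95\n70\n40\n' |
awk '{
    if ($1 >= 90) {
        print $1, "top"
    } else if ($1 >= 60) {
        print $1, "pass"
    } else {
        print $1, "fail"
    }
}' > chain_demo.txt

cat chain_demo.txt
```

If the two conditions were swapped, 95 would be labeled "pass" and the "top" branch would never be reached.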
Example 1: Classifying Employees by Age Group
Consider a file named workforce.txt containing the following data:
ID | Name | Age | Department |
---|---|---|---|
501 | Hannah | 24 | Finance |
502 | Ethan | 31 | IT |
503 | Isabella | 26 | Marketing |
504 | Mason | 40 | Sales |
505 | Lily | 22 | HR |
To classify employees as young (under 25), middle-aged (25 to 35), or senior (over 35), use the following AWK command:
awk '{
    if ($3 < 25) {
        print "Young Employee - Name:", $2, "Age:", $3;
    } else if ($3 <= 35) {
        print "Middle-Aged Employee - Name:", $2, "Age:", $3;
    } else {
        print "Senior Employee - Name:", $2, "Age:", $3;
    }
}' workforce.txt
Output
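When every branch prints the same fields and only the label differs, one possible refactoring is to let the chain choose the label and print once afterwards. The sketch below produces the same lines as the command above under the assumption that workforce.txt is whitespace-separated with no header row. The variable name group and the output file name age_groups.txt are hypothetical.

```shell
# Assumed layout: whitespace-separated fields, no header row.
cat > workforce.txt <<'EOF'
501 Hannah 24 Finance
502 Ethan 31 IT
503 Isabella 26 Marketing
504 Mason 40 Sales
505 Lily 22 HR
EOF

# The chain only picks a label; a single print statement formats the line.
awk '{
    if ($3 < 25) {
        group = "Young"
    } else if ($3 <= 35) {
        group = "Middle-Aged"
    } else {
        group = "Senior"
    }
    print group, "Employee - Name:", $2, "Age:", $3
}' workforce.txt > age_groups.txt

cat age_groups.txt
```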
Example 2: Grading Students Based on Scores
Assume you have a file named students_scores.txt with the following structure:
ID | Name | Score |
---|---|---|
601 | Ava | 85 |
602 | Ben | 92 |
603 | Chloe | 77 |
604 | Daniel | 65 |
605 | Emily | 58 |
To assign grades based on score ranges, you can use the following AWK script:
awk '{
    if ($3 >= 90) {
        print "Excellent - Name:", $2, "Score:", $3;
    } else if ($3 >= 75) {
        print "Good - Name:", $2, "Score:", $3;
    } else if ($3 >= 60) {
        print "Average - Name:", $2, "Score:", $3;
    } else {
        print "Needs Improvement - Name:", $2, "Score:", $3;
    }
}' students_scores.txt
Output
Example 3: Determining Discount Levels for Products
Consider a file named products.txt with the following data:
ID | Product | Price |
---|---|---|
701 | Laptop | 1200 |
702 | Smartphone | 800 |
703 | Tablet | 300 |
704 | Headphones | 150 |
705 | Monitor | 400 |
To determine discount levels based on price, use the following AWK command:
awk '{
    if ($3 > 1000) {
        print "High Discount - Product:", $2, "Price:", $3;
    } else if ($3 > 500) {
        print "Medium Discount - Product:", $2, "Price:", $3;
    } else if ($3 > 100) {
        print "Low Discount - Product:", $2, "Price:", $3;
    } else {
        print "No Discount - Product:", $2, "Price:", $3;
    }
}' products.txt
Output
Using the Ternary Operator in AWK
The ternary operator in AWK provides a succinct way to choose between two expressions based on a condition. It is especially useful for condensing a simple if-else statement into a single expression, which can make your code more concise and readable.
Basic Syntax
The ternary operator in AWK follows the syntax:
(condition) ? expression1 : expression2
If the condition evaluates to true, the whole expression yields the value of expression1; otherwise, it yields the value of expression2.
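Since the ternary operator produces a value, its result can be stored in a variable and used later. The sketch below illustrates this with made-up numbers on standard input. The variable name sign and the output file name sign_demo.txt are hypothetical.

```shell
# Pick a label with the ternary operator, then print it.
printf '3\n-7\n' |
awk '{
    # The ternary expression yields a value, so it can be assigned like any other.
    sign = ($1 >= 0) ? "non-negative" : "negative"
    print $1, "is", sign
}' > sign_demo.txt

cat sign_demo.txt
```

Assigning first and printing afterwards also keeps the print statement simple when the chosen value has to be combined with other text.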
Example 1: Categorizing Products by Stock Level
Consider a file named inventory.txt containing the following data:
ProductID | ProductName | Stock |
---|---|---|
101 | WidgetA | 150 |
102 | WidgetB | 50 |
103 | WidgetC | 200 |
104 | WidgetD | 30 |
105 | WidgetE | 75 |
To categorize products as “In Stock” or “Low Stock” based on a stock threshold of 100 units, use the following AWK command:
awk '{print ($3 >= 100) ? "In Stock - Product:" $2 : "Low Stock - Product:" $2}' inventory.txt
Output
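Note that the command above joins the label and $2 by plain concatenation, so the product name follows the colon with no space (for example, Product:WidgetA). A hedged variation that adds the space and separates choosing the label from printing it is sketched below. It assumes inventory.txt is whitespace-separated with no header row; the variable name status and the output file name stock_report.txt are hypothetical.

```shell
# Assumed layout: whitespace-separated fields, no header row.
cat > inventory.txt <<'EOF'
101 WidgetA 150
102 WidgetB 50
103 WidgetC 200
104 WidgetD 30
105 WidgetE 75
EOF

# Choose the label once, then concatenate it with a spaced separator and the name.
awk '{
    status = ($3 >= 100) ? "In Stock" : "Low Stock"
    print status " - Product: " $2
}' inventory.txt > stock_report.txt

cat stock_report.txt
```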
Example 2: Assigning Pass or Fail Status to Students
Suppose you have a file named grades.txt with the following structure:
StudentID | Name | Score |
---|---|---|
201 | Alice | 88 |
202 | Bob | 45 |
203 | Charlie | 76 |
204 | Dana | 59 |
205 | Eva | 91 |
To determine whether students have passed or failed based on a passing score of 60, you can use the ternary operator:
awk '{print ($3 >= 60) ? "Pass - Name:" $2 : "Fail - Name:" $2}' grades.txt
Output
Example 3: Identifying Expensive and Affordable Products
Consider a file named products_pricing.txt with the following data:
ProductID | ProductName | Price |
---|---|---|
301 | Laptop | 1200 |
302 | Mouse | 25 |
303 | Keyboard | 50 |
304 | Monitor | 300 |
305 | Desk | 150 |
To classify products as “Expensive” if their price is above $500 or “Affordable” otherwise, use the following AWK command:
awk '{print ($3 > 500) ? "Expensive - Product:" $2 : "Affordable - Product:" $2}' products_pricing.txt
Output
Conclusion
In this guide, we covered AWK’s main conditional constructs: the if statement, the if-else statement, the if-else-if chain, and the ternary operator. With these, you can filter, classify, and label records in a wide range of text-processing tasks, which makes AWK a valuable tool for any data professional or enthusiast.