A clear explanation of the Transpose File shell problem using awk to transform rows into columns.
Problem Restatement
We are given a text file named file.txt.
Each line contains several words separated by spaces.
We need to transpose the file.
That means:
- The first column becomes the first row.
- The second column becomes the second row.
- The third column becomes the third row.
- And so on.
Example input:
name age
alice 21
ryan 30Expected output:
name alice ryan
age 21 30The original rows become columns.
Input and Output
| Item | Meaning |
|---|---|
| Input | A file named file.txt |
| Output | The transposed version of the file |
| Separator | Words are separated by spaces |
| Language | Bash shell script |
Examples
Input:
name age
alice 21
ryan 30Visual form:
| Column 1 | Column 2 |
|---|---|
| name | age |
| alice | 21 |
| ryan | 30 |
After transposing:
| Row 1 | name alice ryan |
|---|---|
| Row 2 | age 21 30 |
Output:
name alice ryan
age 21 30Another example:
Input:
a b c
d e f
g h iOutput:
a d g
b e h
c f iFirst Thought
We need to rebuild the file column by column.
While reading each row, we can collect values belonging to the same column.
For example:
a b c
d e f
g h iColumn 1 becomes:
a d gColumn 2 becomes:
b e hColumn 3 becomes:
c f iThis suggests storing data indexed by column number.
Key Insight
awk automatically splits each line into fields.
For example:
a b cInside awk:
| Variable | Value |
|---|---|
$1 | a |
$2 | b |
$3 | c |
NF | 3 |
NF means “number of fields”.
So while processing each line, we can append every field to a column buffer.
Example:
| Column Buffer | Value |
|---|---|
col[1] | a d g |
col[2] | b e h |
col[3] | c f i |
After reading the whole file, we print each buffer.
Algorithm
For every line:
- Loop through all fields from
1toNF. - If the column already has content, append a space.
- Append the current field to that column.
- Track the maximum number of columns.
After processing all lines:
- Print every column buffer from
1tomax_columns.
Correctness
Each line is split into fields automatically by awk.
For every field position i, the algorithm appends the current field $i to the buffer corresponding to column i.
Therefore:
- All first-column values are stored together.
- All second-column values are stored together.
- And so on.
The order of insertion matches the original row order.
So each column buffer becomes exactly one row of the transposed matrix.
Finally, printing the buffers from left to right outputs the correct transpose of the original file.
Complexity
Suppose the file contains r rows and c columns.
| Metric | Value | Why |
|---|---|---|
| Time | O(r × c) | Every field is visited once |
| Space | O(r × c) | The transposed content is stored in memory |
Implementation
awk '
{
for (i = 1; i <= NF; i++) {
if (NR == 1) {
col[i] = $i
} else {
col[i] = col[i] " " $i
}
}
}
END {
for (i = 1; i <= NF; i++) {
print col[i]
}
}
' file.txtCode Explanation
The main loop processes each line:
{
for (i = 1; i <= NF; i++) {NF is the number of fields in the current line.
So the loop visits every column.
For the first row:
if (NR == 1)we initialize the column buffer directly.
NR means “number of records”, which is the current line number.
For later rows:
col[i] = col[i] " " $iwe append the new value with a space separator.
At the end:
END {runs after the entire file has been processed.
Then:
print col[i]prints each transposed row.
Testing
Create a sample file:
cat > file.txt << 'EOF'
name age
alice 21
ryan 30
EOFRun the solution:
awk '
{
for (i = 1; i <= NF; i++) {
if (NR == 1) {
col[i] = $i
} else {
col[i] = col[i] " " $i
}
}
}
END {
for (i = 1; i <= NF; i++) {
print col[i]
}
}
' file.txtExpected output:
name alice ryan
age 21 30Another test:
Input:
a b c
d e f
g h iExpected output:
a d g
b e h
c f iSingle-row test:
Input:
x y zExpected output:
x
y
z