While improving a backup script I came across a need to modify the information output by the dd command. FreeNAS, the system the script runs on, does not have a version of dd that has the human readable option. My quest: make the output from dd human readable.
I’ve only partially fulfilled this quest at this point, but the basic functionality is there. The first step is to capture all output of the dd command (both stdout and stderr) in a variable (this particular syntax is for bash):
DD_OUTPUT=$(dd if=/dev/zero of=test.img bs=1000000 count=10000 2>&1)
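dd writes its statistics to stderr, so the 2>&1 redirects stderr to stdout; the $( ) command substitution then captures both streams in the variable instead of letting them reach the console.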
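To see the difference the 2>&1 makes, here is a small sketch that does not need dd at all. The noisy function is a made-up stand-in that writes one line to stdout and one to stderr, the way a command like dd splits its output.

```shell
# Hypothetical stand-in for dd: one line to stdout, one line to stderr.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Without 2>&1 only stdout is captured; the stderr line leaks to the console.
WITHOUT=$(noisy)

# With 2>&1 stderr is merged into stdout, so both lines land in the variable.
WITH=$(noisy 2>&1)

echo "captured without 2>&1: $WITHOUT"
echo "captured with 2>&1: $WITH"
```

The first variable should hold only "to stdout", while the second holds both lines.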
The next step is to use sed to remove unwanted information (the two lines that report the number of records in and out):
sed -r '/.*records /d'
Next, remove any parentheses. The parentheses in the dd output would otherwise mess up the math that’s going to be done later.
tr -d '()'
-d '()' simply tells tr to delete every instance of the characters between the single quotes.
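To check what each filter leaves behind, the two steps can be run one at a time against a sample. This is only a sketch: SAMPLE is the FreeBSD-style dd output shown in this post pasted into a variable, and the STAGE1 and STAGE2 names are made up for the illustration.

```shell
# Sample dd output, copied from the run shown in this post.
SAMPLE='10000+0 records in
10000+0 records out
10000000000 bytes transferred in 103.369538 secs (96740299 bytes/sec)'

# Stage 1: sed deletes every line containing "records ".
STAGE1=$(echo "$SAMPLE" | sed -r '/.*records /d')

# Stage 2: tr deletes every "(" and ")" character.
STAGE2=$(echo "$STAGE1" | tr -d '()')

echo "after sed: $STAGE1"
echo "after tr:  $STAGE2"
```

After sed only the "bytes transferred" line is left, and after tr the same line comes out with the parentheses gone.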
And now, for awk. Awk lets you manipulate specific pieces of a line of text. Each line is known as a record, and each piece of a line, separated by whitespace, is known as a field ($1, $2 and so on). This lets us edit some parts of the text while keeping others intact. My expertise with awk is that of a n00b so I’m sure there is a more efficient way of doing this; nevertheless here is my solution:
awk '{print ($1/1024)/1024 " MB" " " $3 " " $4 " " $5/60 " minutes" " (" ($7/1024)/1024 " MB/sec)" }'
Since dd outputs its information in bytes I’m having it divide by 1024 twice so the resulting number is in megabytes. I also have it divide the seconds by 60 to return the number of minutes dd took. Additionally I’m re-inserting the parentheses removed by the tr command now that the math is done.
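If it is not obvious why the command uses $1, $5 and $7, a quick way to find out is to have awk number the fields. This sketch assumes LINE holds the dd summary line after the sed and tr steps; the LINE and FIELDS names are made up for the illustration.

```shell
# The dd summary line with the parentheses already stripped.
LINE='10000000000 bytes transferred in 103.369538 secs 96740299 bytes/sec'

# Print every whitespace-separated field next to its field number.
FIELDS=$(echo "$LINE" | awk '{ for (i = 1; i <= NF; i++) print "$" i " = " $i }')

echo "$FIELDS"
```

The listing shows the byte count in $1, the seconds in $5 and the bytes per second in $7, which are the three fields the math is applied to.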
The last step is to pipe all of this together:
DD_OUTPUT=$(dd if=/dev/zero of=test.img bs=1000000 count=10000 2>&1)
DD_HUMANIZED=$(echo "$DD_OUTPUT" | sed -r '/.*records /d' | tr -d '()' | awk '{print ($1/1024)/1024 " MB" " " $3 " " $4 " " $5/60 " minutes" " (" ($7/1024)/1024 " MB/sec)" }')
After running the above here are the results:
Original output:
echo "$DD_OUTPUT"
10000+0 records in
10000+0 records out
10000000000 bytes transferred in 103.369538 secs (96740299 bytes/sec)
Humanized output:
echo "$DD_HUMANIZED"
9536.74 MB transferred in 1.72283 minutes (92.2587 MB/sec)
The printf function of awk would allow us to round the calculations but I couldn’t quite get the syntax to work and have abandoned the effort for now.
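For what it’s worth, here is one printf form that should do the rounding. Treat it as a sketch rather than a tested replacement on FreeNAS: it assumes the same field positions as the print version, works on a LINE variable that already had the sed and tr steps applied, and rounds everything to two decimals, which is an arbitrary choice.

```shell
# The dd summary line with the parentheses already stripped.
LINE='10000000000 bytes transferred in 103.369538 secs 96740299 bytes/sec'

# %.2f rounds to two decimals; %s passes the words "transferred" and "in" through.
# Unlike print, printf adds no newline on its own, hence the \n in the format.
ROUNDED=$(echo "$LINE" | awk '{
    printf "%.2f MB %s %s %.2f minutes (%.2f MB/sec)\n",
        $1 / 1024 / 1024, $3, $4, $5 / 60, $7 / 1024 / 1024
}')

echo "$ROUNDED"
```

For the sample line this prints 9536.74 MB transferred in 1.72 minutes (92.26 MB/sec).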
Of course all this is not necessary if you have the correct dd version. The GNU version of dd prints human-readable output by default; certain BSD versions accept a msgfmt=human argument.
Update: I discovered that the awk method above will only print the one line it finds and ignore all other lines, which is less than ideal for scripting. I updated the awk syntax to do a search and replace (sub) instead so that it will print all other lines as well:
awk '{sub(/.*bytes /, $1/1024/1024" MB "); sub(/in .* secs/, "in "$5/60" mins "); sub(/mins .*/, "mins (" $7/1024/1024" MB/sec)"); print}'
My new all in one line is:
DD_HUMANIZED=$(echo "$DD_OUTPUT" | sed -r '/.*records /d' | tr -d '()' | awk '{sub(/.*bytes /, $1/1024/1024" MB "); sub(/in .* secs/, "in "$5/60" mins "); sub(/mins .*/, "mins (" $7/1024/1024" MB/sec)"); print}')
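For scripting, the same pipeline can be wrapped in a function so that it reads dd output on stdin. This is a sketch under a few assumptions: humanize_dd is a made-up name, the awk block ends with a plain print so that every line keeps its own newline, and the first line of SAMPLE is an invented extra line added only to show that non-matching lines pass through untouched.

```shell
# Hypothetical helper: reads raw dd output on stdin, writes the humanized form.
humanize_dd() {
    sed -r '/.*records /d' | tr -d '()' | awk '{
        sub(/.*bytes /, $1/1024/1024 " MB ")              # bytes -> MB
        sub(/in .* secs/, "in " $5/60 " mins ")           # seconds -> minutes
        sub(/mins .*/, "mins (" $7/1024/1024 " MB/sec)")  # bytes/sec -> MB/sec
        print                                             # emit every line, changed or not
    }'
}

# Sample dd output from this post, plus one invented line in front.
SAMPLE='example: an unrelated line
10000+0 records in
10000+0 records out
10000000000 bytes transferred in 103.369538 secs (96740299 bytes/sec)'

RESULT=$(echo "$SAMPLE" | humanize_dd)
echo "$RESULT"
```

In a backup script the call would look something like DD_HUMANIZED=$(echo "$DD_OUTPUT" | humanize_dd). For the sample above the unrelated line comes through unchanged, followed by 9536.74 MB transferred in 1.72283 mins (92.2587 MB/sec).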