Header

The first record in a data table contains the names of the fields and it deserves special attention. Below are recipes for saving, adding and removing a header, and for doing things with the rest of the table while leaving the header intact as the first line in the table.

Save a header

To save the header of the file 'table' for later use, writing it to the file 'header':

$ head -n 1 table > header
 
$ sed -n '1p' table > header
 
$ awk 'NR==1' table > header
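
On a very large table, sed -n '1p' and awk 'NR==1' keep reading to the end of the file even though only the first line is wanted. The sketch below shows variants that stop after line 1. The sample data and the output file names 'header_sed' and 'header_awk' are made up for illustration.

```shell
# Work in a scratch directory with a small sample table (hypothetical data).
cd "$(mktemp -d)"
printf 'name\tage\nBob\t30\nAlice\t25\n' > table

# sed prints line 1 (its default action) and then quits.
sed '1q' table > header_sed

# awk prints line 1 and exits before reading line 2.
awk 'NR==1 {print; exit}' table > header_awk

# Both files should be identical to the output of head.
head -n 1 table | cmp - header_sed && head -n 1 table | cmp - header_awk && echo "headers match"
```

For small tables the difference is negligible, so the shorter forms are fine there.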

Add a header

To add an existing header file to a header-less table:

$ cat header table_without_header > table_with_header

A header can also be added to a header-less table as a text string. Suppose there are 3 fields with the names 'field1', 'field2' and 'field3':

$ echo -e "field1\tfield2\tfield3" | cat - table_without_header > table_with_header
 
$ sed '1i field1\tfield2\tfield3' table_without_header > table_with_header
 
$ awk 'BEGIN {print "field1\tfield2\tfield3"} {print}' table_without_header > table_with_header
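
A note on portability: echo -e is not understood by every shell, and the one-line '1i' form with '\t' in the sed command relies on GNU sed extensions. A more portable sketch uses printf, which interprets '\t' and '\n' in its format string in any POSIX shell. The sample records below are made up for illustration.

```shell
# Work in a scratch directory with a small header-less table (hypothetical data).
cd "$(mktemp -d)"
printf 'Bob\t30\tLondon\nAlice\t25\tParis\n' > table_without_header

# printf writes the tab-separated header line; cat puts it in front of the table.
printf 'field1\tfield2\tfield3\n' | cat - table_without_header > table_with_header

# Show the result: the header followed by the two records.
cat table_with_header
```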

Remove a header

To remove a header without saving it as a separate file:

$ tail -n +2 table > table_without_header
 
$ sed '1d' table > table_without_header
 
$ awk 'NR>1' table > table_without_header
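
If you want to keep the header after all, the save and remove recipes can be combined into one pass over the table. This is a minimal sketch using AWK's output redirection. The sample data is made up, and the output file names follow the ones used above.

```shell
# Work in a scratch directory with a small sample table (hypothetical data).
cd "$(mktemp -d)"
printf 'name\tage\nBob\t30\nAlice\t25\n' > table

# Line 1 goes to the file 'header'; every other line goes to 'table_without_header'.
awk 'NR==1 {print > "header"; next} {print > "table_without_header"}' table

cat header
cat table_without_header
```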

Ignore a header

Suppose you want to sort a data table, but you don't want the header to be involved in the sorting. How do you ignore the header?

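Print the header line first, then send the rest of the table to sort. Group the two commands in parentheses so that the redirection captures the header as well as the sorted records:

$ (head -n 1 table && tail -n +2 table | sort) > sorted_table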
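
The same idea can be wrapped in a small shell function, so that any command can be made to skip the header. The function name 'body' and the sample data are hypothetical. This is a sketch, assuming the table arrives on standard input.

```shell
# Work in a scratch directory with a small sample table (hypothetical data).
cd "$(mktemp -d)"
printf 'name\tage\nBob\t30\nAlice\t25\n' > table

# body: pass the first line of standard input through untouched,
# then run the given command on the remaining lines.
body() {
    IFS= read -r header
    printf '%s\n' "$header"
    "$@"
}

# Sort the records only; the header stays on line 1.
body sort < table > sorted_table
cat sorted_table
```

Because the command is passed as arguments, the same function works with other filters, for example 'body sort -r < table' or 'body grep Alice < table'.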

Suppose you want to add a first field to 'table' containing a unique, serial ID number for each record. How can you do this without also numbering the header?

This recipe will be a little more complicated, since you also need to add a name for the new field (here I'm calling it 'ID') to the header. Once you've done that, you can use the nl command to number the remaining lines in the table.

$ echo -e "ID\t$(head -n 1 table)" && tail -n +2 table | nl
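
By default nl right-aligns each number in a six-character column, so the IDs come out padded with leading spaces, and it leaves empty lines unnumbered. The variant below is a sketch that asks nl for unpadded numbers on every line and collects the result in a file. The sample data and the output name 'numbered_table' are made up for illustration.

```shell
# Work in a scratch directory with a small sample table (hypothetical data).
cd "$(mktemp -d)"
printf 'name\tage\nBob\t30\nAlice\t25\n' > table

# -w 1: no padding in front of the number; -b a: number every line, empty or not.
# nl's default separator between the number and the line is already a tab.
{ printf 'ID\t%s\n' "$(head -n 1 table)" && tail -n +2 table | nl -w 1 -b a; } > numbered_table

cat numbered_table
```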

A more clever way to do this 'offset' numbering job is with AWK:

$ awk 'BEGIN {FS=OFS="\t"} {print (NR==1 ? "ID" OFS $0 : NR-1 OFS $0)}' table

This command uses AWK's ternary (conditional) operator, 'condition ? A : B', as a compact if-else. If AWK is reading the first record (NR==1), it prints the word 'ID', the output field separator (OFS, here the tab character '\t') and the whole first record ($0). For every other record, AWK prints one less than the record number (NR-1), a tab, and the whole record.
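
To see what the ternary expression stands for, here is the same job written with an explicit if-else statement and checked against the one-line version. The sample data and the output file names are made up for illustration.

```shell
# Work in a scratch directory with a small sample table (hypothetical data).
cd "$(mktemp -d)"
printf 'name\tage\nBob\t30\nAlice\t25\n' > table

# Ternary form, as in the recipe.
awk 'BEGIN {FS=OFS="\t"} {print (NR==1 ? "ID" OFS $0 : NR-1 OFS $0)}' table > numbered_ternary

# Equivalent if-else form; the comma in print inserts OFS between the two items.
awk 'BEGIN {FS=OFS="\t"} {if (NR==1) print "ID", $0; else print NR-1, $0}' table > numbered_ifelse

# The two outputs should be identical.
cmp numbered_ternary numbered_ifelse && cat numbered_ternary
```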