Bulk replacement - 2

Address-based replacement: sed

The first recipe uses sed and is similar to 'many to one' replacement, but the command list is built around a list of line numbers, i.e. addresses for replacement. Suppose that in this 10-line file:

$ cat file1
This line contains the name Ralph Kramden.
This line contains the name Ralph Kramden.
This line contains the name Ralph Kramden.
This line contains the name Ralph Kramden.
This line contains the name Ralph Kramden.
This line contains the name Ralph Kramden.
This line contains the name Ralph Kramden.
This line contains the name Ralph Kramden.
This line contains the name Ralph Kramden.
This line contains the name Ralph Kramden.

we want to replace Ralph Kramden with Ed Norton on lines 1, 2, 5, 7, 8. Starting with a list of those numbers:

$ cat list
1
2
5
7
8

we build a list of replacement commands for the required addresses:

$ sed 's/$/s\/Ralph Kramden\/Ed Norton\/g/' list
1s/Ralph Kramden/Ed Norton/g
2s/Ralph Kramden/Ed Norton/g
5s/Ralph Kramden/Ed Norton/g
7s/Ralph Kramden/Ed Norton/g
8s/Ralph Kramden/Ed Norton/g
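
The backslashes in that command are needed only because '/' is also the delimiter of the outer substitution. As a sketch of an alternative, sed accepts other delimiter characters for its 's' command, so using '|' for the outer command leaves the inner slashes unescaped. The scratch directory and the variable name cmds below are assumptions for illustration, not part of the recipe:

```shell
# Work in a scratch directory so no existing file is overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# The same list of line numbers as above.
printf '%s\n' 1 2 5 7 8 > list

# '|' delimits the outer substitution, so the slashes in the
# generated command need no backslashes.
cmds=$(sed 's|$|s/Ralph Kramden/Ed Norton/g|' list)
printf '%s\n' "$cmds"
```

This only works as long as the chosen delimiter does not itself appear in the search or replacement text.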

(The 'g' flag is again used for global replacement. It isn't needed in this particular case, because the name appears only once per line, but it might be needed in a recipe you build from this model.)

The last step is to pass the output of that command to sed -e as the script to run on file1:

$ sed -e "$(sed 's/$/s\/Ralph Kramden\/Ed Norton\/g/' list)" file1
This line contains the name Ed Norton.
This line contains the name Ed Norton.
This line contains the name Ralph Kramden.
This line contains the name Ralph Kramden.
This line contains the name Ed Norton.
This line contains the name Ralph Kramden.
This line contains the name Ed Norton.
This line contains the name Ed Norton.
This line contains the name Ralph Kramden.
This line contains the name Ralph Kramden.
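
If the list of addresses is very long, the generated script may be too large to pass comfortably as a single command-line argument. One possible variation, sketched here, is to save the generated commands in a file and run them with sed -f. The file name cmds.sed, the variable name result and the scratch directory are made up for this example:

```shell
# Work in a scratch directory so no existing file is overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Recreate the example inputs: ten identical lines and the address list.
for i in 1 2 3 4 5 6 7 8 9 10; do
    echo "This line contains the name Ralph Kramden."
done > file1
printf '%s\n' 1 2 5 7 8 > list

# Save the generated commands in a script file (hypothetical name).
sed 's/$/s\/Ralph Kramden\/Ed Norton\/g/' list > cmds.sed

# Run the saved script against file1.
result=$(sed -f cmds.sed file1)
printf '%s\n' "$result"
```

Keeping the script in a file also makes it easy to inspect the generated commands before running them.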



Address-based replacement: AWK

Now for something that sed can't easily do: bulk replacement at particular addresses in particular fields. The aim will be to replace R. Kramden with Ralph Kramden in the tab-separated file2, but only in field 3 and only on lines 4, 7 and 8 (counting the header as line 1):

$ cat file2
Transaction    Customer    Item
00101    R. Kramden    $2 to A. Kramden
00102    R. Kramden    $2 to E. Norton
00103    A. Kramden    $4 to R. Kramden
00104    R. Kramden    $1 to T. Norton
00105    R. Kramden    $6 to A. Kramden
00106    A. Kramden    $5 to R. Kramden
00107    E. Norton    $6 to R. Kramden
00108    R. Kramden    $3 to A. Kramden
00109    R. Kramden    $2 to T. Norton
00110    E. Norton    $3 to R. Kramden

Once again, we make a list of the line numbers for replacements:

$ cat list
4
7
8

and read the list into an AWK array, 'a', using each line number as an index. While reading file2, AWK does the replacement in field 3, but only if the current line number (FNR) is one of the indices of 'a':

$ awk 'BEGIN {FS=OFS="\t"} FNR==NR {a[$0];next} FNR in a {sub(/R\. Kramden/,"Ralph Kramden",$3)} 1' list file2
Transaction    Customer    Item
00101    R. Kramden    $2 to A. Kramden
00102    R. Kramden    $2 to E. Norton
00103    A. Kramden    $4 to Ralph Kramden
00104    R. Kramden    $1 to T. Norton
00105    R. Kramden    $6 to A. Kramden
00106    A. Kramden    $5 to Ralph Kramden
00107    E. Norton    $6 to Ralph Kramden
00108    R. Kramden    $3 to A. Kramden
00109    R. Kramden    $2 to T. Norton
00110    E. Norton    $3 to R. Kramden
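
To experiment with this recipe, the following sketch rebuilds the two input files and spreads the AWK program over several commented lines. It assumes file2 is tab-separated, as the FS setting implies. The scratch directory and the variable name result are illustrative only. The dot in the pattern is escaped so that it matches a literal full stop rather than any character:

```shell
# Work in a scratch directory so no existing file is overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Build a tab-separated file2; printf reuses the format for each triple.
printf '%s\t%s\t%s\n' \
    'Transaction' 'Customer' 'Item' \
    '00101' 'R. Kramden' '$2 to A. Kramden' \
    '00102' 'R. Kramden' '$2 to E. Norton' \
    '00103' 'A. Kramden' '$4 to R. Kramden' \
    '00104' 'R. Kramden' '$1 to T. Norton' \
    '00105' 'R. Kramden' '$6 to A. Kramden' \
    '00106' 'A. Kramden' '$5 to R. Kramden' \
    '00107' 'E. Norton' '$6 to R. Kramden' \
    '00108' 'R. Kramden' '$3 to A. Kramden' \
    '00109' 'R. Kramden' '$2 to T. Norton' \
    '00110' 'E. Norton' '$3 to R. Kramden' > file2
printf '%s\n' 4 7 8 > list

result=$(awk '
    BEGIN {FS = OFS = "\t"}
    # First file (list): FNR equals NR, so store each number as an index.
    FNR == NR {a[$0]; next}
    # Second file (file2): edit field 3 only on the listed line numbers.
    FNR in a {sub(/R\. Kramden/, "Ralph Kramden", $3)}
    # Print every line, changed or not.
    1
' list file2)
printf '%s\n' "$result"
```

Line 11 also has R. Kramden in field 3, but it is left alone because 11 is not in the list.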
