For a full list of BASHing data blog posts, see the index page.


Avoiding senior moments with command-line functions

A Data Cleaner's Cookbook has a whole page devoted to handy functions, and there are more data-related functions on the gremlins and encoding pages. You can find even more functions scattered through the posts on this blog. I do use these functions, but how do I remember their names and what they do?

The answer is, "I don't". My memory isn't good enough. Instead, I've written a plain text file called "functions" with details of my functions (which are all in my ~/.bashrc file). I page through "functions" in a terminal with the less command to find out what I need to know, then quit less with "q" and get back to work.

The "functions" reference file is in my ~/scripts folder and it's called up with a function called... (wait for it) ...functions:

functions() { cd ~/scripts; echo -e "$(cat functions)" | less -R; }

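This slightly elaborate command with echo -e, cat and less -R is actually a simple way to get less to display ANSI escape colours: echo -e turns the \033 codes stored in the file into real escape characters, and less -R passes those through to the terminal instead of showing them as text. A typical entry in the "functions" file (the fields function from the Cookbook) looks like this: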

==================
 
\033[1;31mfields\033[0m
 
fields() { head -n 1 "$1" | tr "\\t" "\\n" | nl -w1 | pr -t -2; }
 
\033[1;36mfields [FILE]\033[0m
 
Prints a 2-column, numbered list of the field names in the header line of a tab-separated text file

Note the double escapes in the tr command. If those were single escapes, echo -e would print \t as a real tab, and \n as a real newline.
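
The effect of the doubled backslashes can be checked in isolation. This is only a minimal sketch, assuming BASH's builtin echo; the variable names "doubled" and "single" are made up for the illustration and aren't part of my "functions" file.

```shell
# The tr command as stored in the reference file, with doubled backslashes
doubled=$(echo -e 'tr "\\t" "\\n"')
printf '%s\n' "$doubled"    # prints: tr "\t" "\n"

# The same text with single backslashes: echo -e expands \t and \n
single=$(echo -e 'tr "\t" "\n"')
printf '%s\n' "$single"     # prints a real tab and a real line break inside the quotes
```
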

When I enter "functions" in a terminal and scroll or page down to the "fields" entry, I see this:

[Screenshot: the "fields" entry as displayed in less]
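
The colouring can be reproduced on a small scale. This is a minimal sketch, assuming BASH; the temporary file is a stand-in for the real "functions" file and holds only the two coloured lines of the entry above, and the names "ref" and "rendered" are hypothetical.

```shell
# A stand-in reference file containing the two coloured lines of the "fields" entry
ref=$(mktemp)
printf '%s\n' '\033[1;31mfields\033[0m' '\033[1;36mfields [FILE]\033[0m' > "$ref"

# cat alone would print the \033 codes as plain text;
# echo -e converts each \033 into a real escape character
rendered=$(echo -e "$(cat "$ref")")
printf '%s\n' "$rendered"    # shows colours when printed to a terminal

# Interactive version, as in the functions function:
#   echo -e "$(cat "$ref")" | less -R
rm -f "$ref"
```

Without the -R option, less would show the escape characters as visible codes rather than passing them to the terminal as colours.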

The newest addition to "functions" helps me with grep searches. When I'm doing a lot of grepping I often don't want to see the lines containing the search pattern. I don't even want a count of such lines, which I can get with grep -c. Sometimes, all I want to know is "Is this pattern in the file, or not?"

The function isthere answers that question:

isthere() { if (($(grep -c -m 1 "$1" "$3"))); then echo "YES"; else echo "NO"; fi; }

The command is a little unusual. The "-m" option for grep stops the search after a specified number of matching lines. Here that number is 1, so grep stops looking after the first match, and the "-c" option then reports a count of "1". If grep doesn't find the pattern, the count is "0". Also unusual is the IF part of this command: it's in BASH arithmetic brackets, but there doesn't seem to be an arithmetical test of grep's output. None is needed. BASH arithmetic can also be used as a kind of "truth test": ((0)) is false, and a non-zero expression like ((1)) is true.
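
Both parts can be tried separately. This is a minimal sketch, assuming BASH and a grep that supports "-m" (GNU grep does); the sample file and the variable names "sample", "hits" and "misses" are made up for the illustration.

```shell
# (( expr )) succeeds when expr is non-zero and fails when it is zero
if ((1)); then echo "((1)) is true"; fi
if ((0)); then :; else echo "((0)) is false"; fi

# grep -c -m 1 prints 1 as soon as there is a match, 0 if there is none
sample=$(mktemp)
printf 'alpha\nbeta\nalpha\n' > "$sample"
hits=$(grep -c -m 1 "alpha" "$sample")      # 1, although "alpha" is on two lines
misses=$(grep -c -m 1 "gamma" "$sample")    # 0
echo "$hits $misses"
rm -f "$sample"
```
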

The other slightly odd thing about isthere is that it looks for a pattern as the first argument ($1) within a file as the third argument ($3). That's because I like to write the word "in" between pattern and file, so "in" is the second argument and the function ignores it. Examples:

[Screenshot: isthere examples (Immigrant_Song)]
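
For a text version to try, here is a minimal sketch of isthere in use. The two-line sample file is hypothetical and is not the file used in the examples above.

```shell
isthere() { if (($(grep -c -m 1 "$1" "$3"))); then echo "YES"; else echo "NO"; fi; }

# A hypothetical tab-separated sample file
demo=$(mktemp)
printf 'apple\tred\nbanana\tyellow\n' > "$demo"

isthere "banana" in "$demo"    # prints YES
isthere "cherry" in "$demo"    # prints NO
rm -f "$demo"
```

Because the function never looks at its second argument, any word would work in place of "in", but something has to be there or the file name ends up in $2 instead of $3.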

Last update: 2018-11-13