I’m working on the Web site for Creating Cool Web Sites and just realized that the little Unix trick I used while editing a file is actually a beautiful example of why so many people love Unix so much, so I thought I’d share it. The problem: a file where lines have 1-10 words + a last field value (in this case, a page number) that I don’t want. The challenge is to figure out how to easily remove that last field, when there are a variable number of fields on the line.
My first instinct was to turn to awk or perl, but in fact there’s a much easier way using basic Unix utilities: rev and cut:
$ rev inputfile | cut -f2- | rev > outputfile
How does this work? The rev command reverses each line of input, so the first field of each now-reversed line is the field we want to remove. That’s easily done with cut and its -f2- flag (output field two through the end of the line). Keep in mind that cut splits on tabs by default, so for words separated by spaces you also need -d ' '. Finally, we re-reverse (that is, fix) each line and save the output in a new file.
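Here is a small sketch of the reverse, cut, reverse idea on space-separated input. The sample lines and the function name strip_last_field are made up for illustration, and the -d ' ' option assumes the words are separated by single spaces rather than tabs.

```shell
# Hypothetical helper: drop the last space-separated field from each line on stdin.
strip_last_field() {
  # rev flips each line, so the unwanted last field becomes field 1;
  # cut -d ' ' -f2- keeps everything after it; the second rev restores the order.
  rev | cut -d ' ' -f2- | rev
}

# Made-up sample lines: a title followed by a page number.
printf '%s\n' 'Adding images 42' 'Forms and user input 117' | strip_last_field
# Adding images
# Forms and user input
```

A line with no space at all passes through unchanged, because cut prints lines that contain no delimiter as-is.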
Quickly and easily done.
Thanks a lot!
Thanks Nick..
awk -F/ '{gsub($NF,"");print}'
helped me a lot.. I was searching for this..
I am adding something else
awk -F/ '{gsub($NF,"");sub(".$", "");print}'
to cut the last part with delimiter as /
input
=====
/home/smilyface/aaa/bbb
output
======
/home/smilyface/aaa
This was exactly what I was looking for…thanks.
You could shorten second command to:
awk -F/ '{sub("/"$NF,"");print}'
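One thing to watch with the sub and gsub commands in this thread: awk treats $NF as a pattern and matches it anywhere on the line, not only at the end, so a path whose last part also appears earlier can lose the wrong piece. A possible alternative, offered as a sketch rather than a replacement for the commands above, anchors the match to the end of the line. The function name strip_last_component and the second sample path are made up for illustration.

```shell
# Hypothetical helper: remove the final /component from each path on stdin.
strip_last_component() {
  # "/[^/]*$" matches the last slash and everything after it, anchored at end of line.
  awk '{ sub("/[^/]*$", ""); print }'
}

printf '%s\n' '/home/smilyface/aaa/bbb' '/home/aaa/bbb/aaa' | strip_last_component
# /home/smilyface/aaa
# /home/aaa/bbb
```

Because the pattern is anchored with $, the earlier aaa in the second path is left alone.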
Be careful copy/pasting from this website…the quotes don’t match unix quotes…had to overwrite single/double quotes.
Good point about the quotes, TB. Don’t know how to tweak it to have “straight” quotes in the theme or I’d do so. 🙂
I managed to do this with awk:
cat filename | awk -F/ '{gsub($NF,"");print}'
The gsub replaces the last field with a null string. My delimiter is /
Nick
I had to strip the last field from some 2-million-line Apache logs. First I tried the reverse-cut-reverse approach, and that looked likely to take ‘forever’.
So I found a solution with sed that worked a lot better:
date; time (gzcat infile.gz | sed 's/\(.*\)\ \(.*\)/\1/' | gzip -c - > outfile.gz ); date
What the sed command does is strip everything from the last space to the end of the line, that is, the final parameter together with the space preceding it.
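To see the sed substitution on its own, here is a minimal sketch on a single made-up log line. The function name strip_last_sed and the sample line are assumptions for illustration, and the pattern is written with a plain space instead of the escaped one, which should behave the same way.

```shell
# Hypothetical helper: drop the last space-separated field using sed alone.
strip_last_sed() {
  # The first \(.*\) is greedy, so it grabs everything up to the LAST space;
  # the trailing " .*" then matches that space plus the final field, which is dropped.
  sed 's/\(.*\) .*/\1/'
}

# Made-up access-log style line; the final 512 is the field to remove.
printf '%s\n' '127.0.0.1 - - "GET / HTTP/1.1" 200 512' | strip_last_sed
# 127.0.0.1 - - "GET / HTTP/1.1" 200
```

Lines that contain no space do not match the pattern and come out unchanged.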
awk '{NF--;print}' Prints the whole name for me. What am I missing?
awk '{NF="";print}' works, but I need to include the periods.
Nice, this just solved a problem I’ve been working on. So thanks.
I figured I’d try to make it work without the rev. Reversing the argument of the -f parameter should return the same results.
I prefer:
awk '{NF--;print}'
With $NF="" you will have an additional space at the end of each line.
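The trailing space is easy to see by marking the end of each line. The sketch below also shows one way to trim it without decrementing NF, since not every awk rebuilds the line when NF changes (which may be why NF-- printed the whole line for one reader in this thread). The function names blank_last and blank_and_trim are made up for illustration.

```shell
# $NF = "" empties the last field but keeps the separator in front of it,
# so the rebuilt line ends with a space.
blank_last() { awk '{ $NF = ""; print }'; }

# Same idea, then trim trailing whitespace so nothing is left dangling.
blank_and_trim() { awk '{ $NF = ""; sub(/[ \t]+$/, ""); print }'; }

# sed appends a | to make the end of each line visible.
printf 'one two three\n' | blank_last     | sed 's/$/|/'   # one two |
printf 'one two three\n' | blank_and_trim | sed 's/$/|/'   # one two|
```

Note that assigning to a field makes awk rebuild the line with single spaces between fields, so runs of spaces or tabs in the input are collapsed.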
try this
awk '{$NF=""; print $0}'
Cheers
Amit
well done