Ask Dave Taylor

How can I keep a compressed Linux archive up to date?

August 29, 2009 / Dave Taylor / Linux Help, Linux Shell Script Programming / 2 Comments

We have a situation where we need to keep a ZIP archive of some data files available on our Ubuntu Linux server so that our satellite offices can grab the information through slower data lines. Problem is, the underlying files change 2-3 times a day. What’s a quick, efficient way to only rebuild the ZIP archive file on our Linux system if a file’s changed, but leave it as-is if everything’s stayed the same?

I really like this sort of question because there are so many different ways to solve it. You could, for example, just brute-force rebuild the ZIP archive every few hours, but that’s a pretty inelegant solution and is bound to waste a lot of computing cycles, though that might not be a big deal. The bigger deal is that it could also leave your remote offices stuck with corrupted archive files because a new build started halfway through their latest transfer, which is a worst-case scenario, I’m sure.
The cornerstone of this solution is to create a short shell script and then use “test” to ascertain if the data source files have been updated (or, in the language of the script, are newer than the ZIP archive file). If they are, then write the new ZIP file to a different filename and, when the archive and compression process is done, rename the new file to the standard archive name.
The basic logic is:

if [ files-to-archive are newer than archive ] then
  rebuild archive to temp file
  mv temp file to archive
endif

Now, to make that code, we’ll want to check the “test” man page, which informs us that:
    file1 -nt file2
      True if file1 exists and is newer than file2.
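To see “-nt” in action before wiring it into a script, here’s a minimal sketch that compares two scratch files with known timestamps. The file names and the temporary directory are made up for the demonstration. One detail worth knowing: bash’s built-in test also treats “-nt” as true when file1 exists and file2 does not, which is handy the very first time the archive gets built, though other shells may behave differently.

```shell
#!/bin/bash
# Demonstrate "-nt" using two scratch files in a temporary directory.
# (old.txt and new.txt are illustrative names only.)
workdir=$(mktemp -d)
touch -t 202001010000 "$workdir/old.txt"   # modified 1 Jan 2020
touch -t 202101010000 "$workdir/new.txt"   # modified 1 Jan 2021

if [ "$workdir/new.txt" -nt "$workdir/old.txt" ]; then
  result="new.txt is newer"
else
  result="new.txt is not newer"
fi
echo "$result"

rm -rf "$workdir"                          # clean up the scratch files
```

Run as written, this should report that new.txt is newer, since its modification time is a year later.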
I have a similar situation with an archive I’m maintaining, so the first step is to ascertain which files we want to test against. In my case, it’s 26 files, so having a chain of if-then-else statements would be crazy ugly. But how to ascertain which file is newest?
The solution is so simple it’s eerie! Just use “ls”: ls -t | head -1 gives you the most recently modified (touched) file in the directory. Since I am working with XML files it makes sense to constrain this just a little bit, so I’ll use something more akin to ls -t *.xml | head -1 instead.
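The “ls -t | head -1” trick works fine for tidy filenames, but it can trip over names that contain newlines or other odd characters. As an alternative sketch (not what I use in the script below), you can let “-nt” do the comparison in a loop. The function name newest_of is hypothetical, made up for this example.

```shell
#!/bin/bash
# newest_of FILE...
# Print the most recently modified file among the arguments.
# (Hypothetical helper name; an alternative to parsing "ls -t" output.)
newest_of() {
  local newest="" f
  for f in "$@"; do
    [ -e "$f" ] || continue          # skip unmatched patterns and missing files
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest=$f                      # first file seen, or newer than the current best
    fi
  done
  printf '%s\n' "$newest"            # prints an empty line if nothing matched
}

# Usage: newestfile=$(newest_of *.xml)
```

The trade-off is a few more lines in exchange for not depending on how “ls” formats its output.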
If I had an explicit list of files to check, it’d be easy to set a variable that contains all the names:

filenames="file1 file2 file3 file4 file5 file6 file7"

So let’s put it all together and see what we get:

target="everything"         # base name of the full ZIP archive (zip adds .zip)
searchdb="search-database"  # base name of the search db ZIP archive

newestfile="$(ls -t *xml | head -1)"

if [ "$newestfile" -nt "$target.zip" ] ; then
  # time to rebuild the archive
  zip "$target" *xml
fi

That’s basically all you need: make sure that “newestfile” accurately picks up which of your set of source files is newest. If you use a list of files, just use that in the statement instead of an explicit pattern, like “newestfile=$(ls -t $filenames | head -1)”.
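With an explicit list there’s another option worth sketching: skip finding the newest file altogether and simply ask whether any file in the list is newer than the archive. The helper name needs_rebuild and its interface are hypothetical, and file1 through file3 are placeholders.

```shell
#!/bin/bash
# needs_rebuild ARCHIVE FILE...
# Succeeds (exit 0) if ARCHIVE is missing or any FILE is newer than it.
# (Hypothetical helper; the name and interface are illustrative.)
needs_rebuild() {
  local archive=$1 f
  shift
  [ -e "$archive" ] || return 0           # no archive yet, so build it
  for f in "$@"; do
    [ "$f" -nt "$archive" ] && return 0   # found a newer source file
  done
  return 1                                # archive is up to date
}

filenames="file1 file2 file3"             # placeholder names
# $filenames is left unquoted on purpose so it splits into separate words
if needs_rebuild "everything.zip" $filenames; then
  echo "rebuild needed"
fi
```

The loop stops at the first newer file it finds, so it does no more work than necessary.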
The only issue remaining in the above code is the potential problem of having the archive be slowly built while a remote site is downloading it at the same time. Not good. To avoid that, just use this:

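interim="$target-new"   # scratch archive name (any unused name will do)

if [ "$newestfile" -nt "$target.zip" ] ; then
  # time to rebuild the archive
  rm -f "$interim.zip"  # clear leftovers from any interrupted run
  # only swap the new archive into place if zip succeeded
  zip "$interim" *xml && mv "$interim.zip" "$target.zip"
fi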
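If you want to harden the build-then-rename step a bit, here’s a sketch that builds the archive inside a scratch directory next to the target and only moves it into place when “zip” succeeds. Keeping the scratch directory on the same filesystem means the “mv” is a simple rename, so a download already in progress keeps reading the old file. The function name rebuild_atomically and the “.zipbuild” prefix are assumptions of mine, not part of the original script.

```shell
#!/bin/bash
# rebuild_atomically TARGET FILE...
# Build a new ZIP in a scratch directory beside TARGET, then rename it into place.
# (Hypothetical helper; the name and the .zipbuild prefix are illustrative.)
rebuild_atomically() {
  local target=$1 tmpdir status=0
  shift
  # scratch directory lives next to the target so mv stays on one filesystem
  tmpdir=$(mktemp -d "$(dirname "$target")/.zipbuild.XXXXXX") || return 1
  # -q keeps zip quiet; mv only runs if zip succeeded
  zip -q "$tmpdir/new.zip" "$@" && mv "$tmpdir/new.zip" "$target" || status=$?
  rm -rf "$tmpdir"                        # remove the scratch directory either way
  return "$status"
}

# Usage: rebuild_atomically "$target.zip" *xml
```

If “zip” fails partway through, the existing archive is left untouched and the function returns a nonzero status.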

What’s nice about this is that it has a very low processor footprint, so it’s going to have minimal impact if you have the script run every hour or two via a cron job, which is what I do. In fact, my script is a bit more complex because I also take advantage of the “-x” flag to “zip” that lets me exclude a specific temporary file, as in “zip archive * -x *zip”.

2 comments on “How can I keep a compressed Linux archive up to date?”

  1. Dave Taylor says:
    September 7, 2009 at 8:01 pm

    Cooper, I think it’s an old dog, new tricks, problem: I still default to “tar” even though it was originally written to stream data to mag tapes. Yeah, it’s that old. 🙂

  2. Cooper Strange says:
    September 6, 2009 at 8:27 pm

    In your Unix & Unix SysAdmin books, you covered this same kind of thing, but I cannot remember if you used TAR or CPIO. Which would be better?

© 2022 by Dave Taylor. "Ask Dave Taylor®" is a registered trademark of Intuitive Systems, LLC.