sed and awk - my two old friends

September 30th, 2007

While writing some shell scripts I needed slightly fancier variable substitution than the standard shell offers. The heavyweight solution would be to write a perl one-liner, but this is, well…, heavyweight? ;-)

Here’s a couple of patterns I used:

  • --parameter=$(sed -re 's/ /,/g' -e 's/(^|,)/\1file:/g' <<<"$INPUT") - replaces spaces with commas and prepends file: to every entry.
  • --parameter=$(awk '{split($0, a, /@/); printf "%s-?????-of-%05d", a[1], a[2]}' <<<"$INPUT") - replaces file@5 with file-?????-of-00005.
  • --parameter=$(awk '{sub(/.*:/, ""); print $0}' <<<"$INPUT") - removes everything up to and including the last colon.
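A minimal sketch of the three patterns above wrapped as functions, so they can be tried on sample input. The function names are hypothetical, and the sed call uses -E instead of -r on the assumption of a reasonably recent GNU sed.

```shell
#!/bin/sh
# Hypothetical wrapper names; each one wraps a pattern from the list above.

# "a b c" -> "file:a,file:b,file:c"
to_file_list() {
  printf '%s\n' "$1" | sed -E -e 's/ /,/g' -e 's/(^|,)/\1file:/g'
}

# "file@5" -> "file-?????-of-00005"
to_shard_pattern() {
  printf '%s\n' "$1" | awk '{split($0, a, /@/); printf "%s-?????-of-%05d\n", a[1], a[2]}'
}

# "a:b:c" -> "c" (the match is greedy, so it strips through the last colon)
strip_prefix() {
  printf '%s\n' "$1" | awk '{sub(/.*:/, ""); print $0}'
}

to_file_list "a b c"
to_shard_pattern "file@5"
strip_prefix "a:b:c"
```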

Parsing parameters in bash - a getopt template

August 16th, 2007

While writing some bash scripts that parse command lines, I came up with this handy getopt template. It is easy to apply even to the simplest scripts.

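# usage and die are assumed to be defined elsewhere in the script.
OPTION_SPEC="help,flag1,flag2_params:"
PARSED_OPTIONS=$(getopt -n "$0" -a -o h --long "$OPTION_SPEC" -- "$@")
OPTIONS_RET=$?
eval set -- "$PARSED_OPTIONS"

# Parsing error or no flags. getopt always emits a trailing "--",
# so "nothing given" means that "--" is the only argument left.
if [ $OPTIONS_RET -ne 0 ] || [ $# -le 1 ]; then usage; die; fi

while [ $# -ge 1 ]; do
  case $1 in
    --help | -h ) usage; die;;
    --flag1 ) FLAG1=1;;
    --flag2_params ) shift; FLAG2_PARAMS="$1";;
    -- ) shift; break;;   # end of options; what remains is positional
    * ) echo "ERROR: unknown flag $1"; usage; die;;
  esac
  shift
done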

No more unannotated positional parameters ($1, $2, ...) in my scripts!
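To try the template out, here is a self-contained sketch. parse_args is a hypothetical wrapper that runs in a subshell and prints what it parsed. It assumes the enhanced getopt from util-linux, since the traditional getopt does not understand --long or -a.

```shell
#!/bin/sh
# Sketch only: parse_args is a hypothetical wrapper around the getopt template.
# The ( ... ) body runs in a subshell, so exit only leaves the function.
parse_args() (
  flag1=0
  flag2_params=""
  parsed=$(getopt -n parse_args -a -o h --long "help,flag1,flag2_params:" -- "$@") || exit 1
  eval set -- "$parsed"
  while [ $# -ge 1 ]; do
    case $1 in
      --help | -h ) echo "usage: parse_args [--flag1] [--flag2_params VALUE] [ARG...]"; exit 0;;
      --flag1 ) flag1=1;;
      --flag2_params ) shift; flag2_params=$1;;
      -- ) shift; break;;
    esac
    shift
  done
  echo "flag1=$flag1 flag2_params=$flag2_params rest=$*"
)

parse_args --flag1 --flag2_params=foo bar baz
parse_args bar --flag2_params foo
```

getopt normalizes both spellings (--flag2_params=foo and --flag2_params foo) into separate words and moves positional arguments after the "--", which is why the case statement can stay this simple.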

Date of yesterday in bash?

August 16th, 2007

I recently had to hack a small shell script that would read files in a directory structure generated based on the date, something like 2007/08/16. The trick was that the script would look at yesterday’s file or files generated a few days ago.

A quick search through the info pages turned up the magic command:

FILE="...$(date -d 'yesterday' +%Y/%m/%d)"

Interestingly, you can also use things like 3 days ago, next Monday, 2 months etc. Cool!
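A small sketch of how the relative dates map onto the directory layout. It assumes GNU date, since the relative-date parsing of -d is a GNU extension. dated_dir and its optional reference-date argument are hypothetical names added for illustration.

```shell
#!/bin/sh
# Sketch assuming GNU date (coreutils).
# dated_dir DAYS_AGO [REFERENCE_DATE]; both names are hypothetical.
dated_dir() {
  ref=${2:-$(date -u +%Y-%m-%d)}          # default reference: today (UTC)
  date -u -d "$ref $1 days ago" +%Y/%m/%d
}

dated_dir 1               # yesterday
dated_dir 3               # three days ago
dated_dir 1 2007-08-16    # prints 2007/08/15
```

Passing an explicit reference date makes the function easy to check, and it helps when re-running a script for a day in the past.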

Finding top-N items in a stream

July 4th, 2007

How to (approximately) generate a top-N items list without counting the number of occurrences of all instances? Two interesting papers I found on the topic: http://citeseer.ist.psu.edu/charikar02finding.html and http://citeseer.ist.psu.edu/jin03dynamically.html. I also found somebody’s seminar powerpoint presentation explaining it.
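To give a flavour of the idea, here is a sketch of the classic Misra-Gries summary. Note that this is a simpler, different technique from the ones in the two papers above; it is shown only because it fits in a few lines of awk. approx_top is a hypothetical name. It keeps at most K counters, so memory stays bounded regardless of how many distinct items pass through.

```shell
#!/bin/sh
# approx_top K: hypothetical helper, reads one item per line from stdin.
# Misra-Gries summary: keeps at most K counters, prints "count item" pairs.
# Counts are lower bounds; an item seen more than n/(K+1) times always survives.
approx_top() {
  awk -v k="$1" '
    {
      if ($0 in c) {
        c[$0]++                       # already tracked: bump its counter
      } else if (n < k) {
        c[$0] = 1; n++                # free slot: start tracking this item
      } else {
        # no free slot: decrement every counter, drop the ones that hit zero
        for (x in c) if (--c[x] == 0) zero[x] = 1
        for (x in zero) { delete c[x]; n-- }
        split("", zero)               # portable way to clear the array
      }
    }
    END { for (x in c) print c[x], x }
  ' | sort -rn
}

# Made-up sample stream.
printf '%s\n' a a a b c a d a | approx_top 2
```

The reported counts underestimate the true ones (here "a" occurs 5 times but is reported with 4), so a second pass over the data is needed if exact counts for the surviving candidates matter.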

Listing socket/network connection owners on OSX

July 4th, 2007

While playing with OSX I was wondering how to find all the network connections a particular process owns. On Linux I’d use netstat -p for this, which does not work on OSX.

It turns out that the solution is quite simple - lsof -i does the job and works on both Linux and OSX. Two other useful commands:

lsof -ai -p PID    # all connections/sockets owned by PID
lsof -i:PORT       # lists all connections/sockets with a particular PORT.

Link: Surviving traffic storms with Wordpress

May 2nd, 2007

Interesting link on surviving traffic storms with Wordpress: not that I currently need it, but maybe in the future… ;)

In a nutshell:

  1. fine tuning of Apache (adjusting #processes, keep alives and ListenBacklog to values that match your machine’s constraints).
  2. fine tuning of MySQL query caching
  3. installing WP-Cache plugin + adaptive switching on of WP-cache plugin (only in heavy-load condition)
  4. disabling some plugins (the ones that take up a lot of resources)
  5. enabling Squid caching for static content.

Adding custom firewall rules in OSX

May 1st, 2007

Having used Linux extensively before, I found the GUI configuration of the OSX firewall somewhat lacking. In particular, I wanted to limit outgoing access to some IP addresses (but I can imagine you may want to play with other things as well).

I found that I could buy Flying Buttress, which should allow me to do this, but I really don’t need a graphical ipfw frontend, especially one I’d have to pay for ;-) All I needed was to write some ipfw rules and make them persistent.

Here’s what I did:

 mkdir /Library/StartupItems/CustomIPFWRules
 cd !$

Created a file called StartupParameters.plist containing:

{
  Description     = "Custom Tadek's IPFW Rules";
  Provides        = ("CustomIPFWRules");
  Uses            = ("Network");
}

Created a file called CustomIPFWRules (the name has to match the directory name) containing a simple shell script:

#!/bin/sh

. /etc/rc.common

case "$1" in
        start)

        ConsoleMessage "applying tadek's ipfw rules"
        ipfw add 2045 deny tcp from any to "ip_I_want_to_block" out
        ;;
esac

exit 0

Voila!

BTW: a useful link on playing with Firewall in OSX.

iPhoto - my experiences

April 22nd, 2007

Being a happy owner of a Mac I decided to give iPhoto a try to manage my photos. To give a bit of a background, we have an external Gallery2 to which we export selected photos (but locally we store more photos). I also occasionally edit my photos in GIMP (each time happy that there’s such a powerful application and at the same time swearing at the user interface) and also use Panorama Tools to stitch panoramas I took. Finally, I use my own tool for geotagging of my photos if I had taken a GPS with me on my trip. Now here’s how I manage to do all this.

Rolls vs Albums

Whenever you import photos, iPhoto creates a Roll for you, whether you like it or not. The idea is that iPhoto rolls correspond to film rolls, sadly with all the disadvantages of the latter. Living in the 21st century, I found it immensely annoying that my hiking photos are intermixed with my balcony photos just because I forgot to download them before going for a hike. Conversely, whenever I import a stitched panorama it always appears as a “Roll XX”, even if I’d want it to be a part of my hiking roll.

Fortunately, there’s a way to manage this. Unfortunately, it’s not very intuitive:

  • Merging rolls: you can drag photos (or an entire roll) from one roll to another. Note that it’s not sufficient to drag them to where the photos are; you need to drop them exactly on the roll bar. Because of this, I fold the roll I want to copy to before doing it. Once the source roll is empty, it magically vanishes.
  • Creating a new roll: if you select some photos you can create a roll with File>Create Film Roll.

Under the hood: each roll corresponds to a directory on the hard disk with the roll’s name. For example, my roll called Pfaff Hike would be under ~/Pictures/iPhoto Library/Originals/2007/Pfaff Hike. Modified photos will be under ~/Pictures/iPhoto Library/Modified/2007/Pfaff Hike.

Finally, deleting photos from a roll deletes them for good; deleting them from an album does not.

Lossy re-compression and image rotation

I noticed that whenever I import photos, iPhoto rotates them and saves them in the Modified directory. There are two problems with this:

  • duplicate disk space: iPhoto keeps both the rotated and unrotated version.
  • lossy transform: to my horror I realized that iPhoto performs a lossy rotation, as the photos shrink significantly (this is really shameful, as a lossless 90deg rotation is not something very difficult).

The solution is to transform the photos on the CF card. It’s a bit of a nuisance, as the images will be read from and written back to the CF card, but so be it. Initially, I thought of exiftran, as I used it on Linux, but I have not found it in DarwinPorts. Instead, I learned that jhead will also do the job and can be installed with port install jhead (credits to donc).

Finally, here’s a magic command that rotates all photos on my CF card:

find /Volumes/EOS_DIGITAL -name '*.JPG' -exec jhead -autorot {} \;

It is important that it’s executed after the card is inserted but before the photos are imported to iPhoto.

Second, I try to avoid any editing operations in iPhoto and set an external editor to GIMP. Now if I need to change something, I double-click on the photo and I edit it in GIMP. Also in GIMP the default quality is 85 so I have to use Save As… to set higher quality. I typically use 95-97 or so ;-)

Tadek’s workflow:

To summarize, here’s what I do with my photos on import:

  1. Get myself a cup of tea.
  2. Insert CF card, wait until the “Import photos” dialog pops up.
  3. Open shell and run the magic command: find /Volumes/EOS_DIGITAL -name '*.JPG' -exec jhead -autorot {} \;
  4. Run my geotagging script on /Volumes/EOS_DIGITAL
  5. Once the rotation has finished, import the photos.
  6. Split the photos into thematic rolls.
  7. Select photos for panoramas (use “Show File” to find the file to be imported in PTGui).
  8. Stitch panoramas.
  9. Import panoramas back to iPhoto (will show as new rolls).
  10. Merge panorama rolls.
  11. Go quickly through my rolls, deleting the photos I definitely don’t want.
  12. Edit some photos in GIMP.
  13. Create an album with all the photos from my roll.
  14. Select the best photos by reordering / deleting them.
  15. Import photos to my Gallery using the great iPhotoToGallery plugin.
  16. Change album sorting to “manual”.

As simple as one button press, huh? ;-)

Pondering about a prime lens for my camera - a scientific approach

March 27th, 2007

I recently got caught up in a discussion about getting a prime lens (== a fixed focal length lens) or another zoom lens for my camera. It seems that I will not be getting a lens anytime soon, but I wrote a cool perl script, which I want to share here ;-)

I decided I should analyze what focal lengths I used while taking my photos so that I can make a more informed decision. After some trial and error, what I came up with was:

find . -exec exiftool {} \; | perl -ne  '/^Focal Length.*equivalent:(.*)\)/ && print "$1\n";' | sort | uniq -c

which runs exiftool on all my photos, extracts the 35mm equivalent focal length (I took my photos with at least 2 different cameras), sorts the values and generates a pseudo-histogram.

I thought I was done, but I then realized that something must have been wrong with the data, as some of the focal lengths were well beyond what any of my cameras can do. I investigated a little and found out that some of the photos from my G3 have an incorrectly calculated 35mm equivalent. This means that such a simple script will not do.
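One wrinkle with the sort | uniq -c pseudo-histogram: plain sort orders the values as text, so 100.0 lands before 38.0. A small sketch of the same idea with sort -n; focal_histogram is a hypothetical name and the sample values are made up.

```shell
#!/bin/sh
# focal_histogram is a hypothetical helper: reads one focal length per line
# and prints "length count" pairs in numeric order.
focal_histogram() {
  sort -n | uniq -c | awk '{print $2, $1}'
}

# Made-up sample values, not real measurements.
printf '%s\n' 38.0 100.0 7.2 38.0 | focal_histogram
```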

Here is my second try:

find . -exec exiftool {} \; | perl -ne '
if (/^ExifTool/) { $camera = ""; $lens=0; };
if (/^Camera Model Name.*: (.*)/) { $camera = $1; };
if (/^Focal Length.*: (.*)mm/) { $lens = $1; print "\"$camera\" $lens\n"; }; ' > photos-lengths.ssv

Now I have a text file with something like:

"Canon PowerShot G3" 7.2
"Canon EOS 350D DIGITAL" 38.0

This file can be conveniently read into R so that I can plot a real histogram:

lengths <- read.table("photos-lengths.ssv")
lengths_camera <- split(lengths, lengths$V1)
num_cameras <- length(lengths_camera)
old_par <- par(mfrow = c(floor(sqrt(num_cameras)), ceiling(num_cameras / floor(sqrt(num_cameras)))))
# one chart for each camera.
for (c in names(lengths_camera)) {
   l <- lengths_camera[[c]]$V2
   # filter likely corrupted data.
   l_filt <- l [ (5 < l) & ( l < 100) ]
   hist(l_filt, xlab="focal length [mm]", ylab="#photos", main=c, breaks = 30);
}
par(old_par);

This little script produces a nice matrix of charts with histograms of focal lengths for all the cameras. So the conclusion is that I could get a 30mm prime lens, but a 10-20mm lens would not be a bad idea either ;-)
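For a quick look without R, the same per-camera filtering can be roughly approximated in awk. This is only a sketch: focal_bins is a hypothetical name, the 10mm bin width is an arbitrary choice, and the input is assumed to have the quoted-camera-name format shown above.

```shell
#!/bin/sh
# focal_bins [FILE...]: hypothetical helper, reads lines like
#   "Canon PowerShot G3" 7.2
# and prints tab-separated "camera, bin start, count" rows.
focal_bins() {
  awk -F'"' '
    {
      len = $3 + 0                     # text after the closing quote
      if (len > 5 && len < 100) {      # same plausibility filter as the R script
        bin = int(len / 10) * 10       # 10mm-wide bins (arbitrary width)
        count[$2 "\t" bin]++
      }
    }
    END { for (key in count) printf "%s\t%d\n", key, count[key] }
  ' "$@" | sort -t "$(printf '\t')" -k1,1 -k2,2n
}

# Made-up sample rows; the last one falls outside the filter.
printf '%s\n' \
  '"Canon PowerShot G3" 7.2' \
  '"Canon EOS 350D DIGITAL" 38.0' \
  '"Canon EOS 350D DIGITAL" 31.0' \
  '"Canon EOS 350D DIGITAL" 250.0' | focal_bins
```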