> Some of the organization packages do a pretty good job of this just
> by virtue of the way they organize their photo database (ie iphoto
> puts things in directories by date of photo taken, which is pretty
> helpful for detecting duplicates. Two files of the same name and size
> that were taken on the same date most likely are duplicates...)

If files are unmodified then matching is not hard. But there are lots
of derivatives or lookalikes that arise along the way that it would be
nice to associate together. I don't necessarily want to delete
duplicates or derivatives - mainly I want to know that they exist and
where they are.

Saving an image from an editor, even if notionally unmodified, will
change the file size unless you specifically save the identical file.
Even at very low loss JPG compression the file size reduces
substantially from the camera JPG file size, so this is useful when
providing DVDs etc for people to view. eg a 4 MB camera JPG may reduce
to around 1.5 MB at JPG 90. While you can possibly see the difference
in quality, it usually requires flicker comparison (alternating between
notionally identical images). I have spent a lot of time cropping
sections out of images, pasting them next to each other and poring over
them at one image pixel per display pixel or above to see what
differences occur. They are almost always extremely subtle at say
JPG 90, and when a 12 MP image is viewed on a 1.5 MP or so LCD screen
the differences are essentially or actually non-existent.

If using RAW files an editor may produce a reduced-resolution JPG from
the embedded JPG, and this will have the original file name unless care
is taken to differentiate it. eg a camera may produce
RAW [actual RAW file, 1.5 MP JPG] + 12 MP JPG.

When I edit files I have a semiformal naming convention which adds
suffixes to the file name:

  R         modified (usually just colour balance / contrast /
            saturation / brightness / gamma)
  RC        modified and cropped
  Rn, RCn   n = 1 2 3 ... for variants
  ...S      with sharpening (may just use R)
  ...usm    unsharp mask (but may just use S)
  ...X      experimental, ie expect funny results
  ...Z      probably rubbish
  etc.

THEN I may add text after that. Just extracting the basic filename +
date/time will help ignore the add-ons.

BUT, to keep life busier, I modify the file names :-(.
Sony use DSCxxxxx.yyy. The xxxxx COULD HAVE given them a 100,000 file
range, but they (stupidly) actually only implement DSC0xxxx.yyy. So I
change the 0 to an incrementing digit every time the camera rolls over
10,000 images, and DSC02345.JPG may become DSC32345.JPG. Which is fine
as long as I don't copy the DSC0... files anywhere before renaming. On
the camera cards they are still DSC0..., so comparisons 'in camera' are
complexified. What fun we have.

> Google's Picasa tools do something similar. But it ought to be
> possible to do tasks outside the fancy GUIs a lot more efficiently,
> and especially so if you add "intimate" knowledge of how the major
> photo programs organize their data.

I have so far avoided writing data to the available EXIF fields, but it
looks like the day may be coming. Relying on EXIF rather than the file
name removes the very valuable ability to use 'visual inspection'.

> The only thing I've heard of so far is some CLI utilities that will
> let you do things based on the exif and other jpeg photo info, and
> some people have written substantial scripts on top of these to
> automatically organize photos as they're taken.

That may yet be necessary, alas.

> Let us know what you find out.

Will do.
I presently have "Duplicate Cleaner" running to see how it goes. On the
onboard 1 TB + 300 GB + backup 1 TB it reports 1023875 image files, and
after 0.4% processing complete it has found 3728 "duplicate sets". If
that is representative the final count will be about 100,000
"duplicate sets" on these 3 drives alone.
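
An illustrative sketch only, in Python, of how the "basic filename +
date/time" matching described earlier might be scripted. Everything
named here is an assumption for the example: the functions base_key and
group_related are made up, the date string is assumed to have been
obtained already (from EXIF or elsewhere), the sample paths and dates
are invented, and the pattern assumes names of the form DSC + rollover
digit + 4-digit frame number + optional suffix text.

```python
import re
from collections import defaultdict
from pathlib import PurePath

# Assumed name layout: "DSC", one rollover digit (0 on the card, renamed
# later), four frame digits, then any edit suffix or free text.
_DSC = re.compile(r"^DSC(\d)(\d{4})(.*)$", re.IGNORECASE)


def base_key(filename):
    """Return the 4-digit frame number, ignoring the rollover digit and
    any suffix (R, RC2, usm, X, Z, added text). None if not a DSC name."""
    match = _DSC.match(PurePath(filename).stem)
    return match.group(2) if match else None


def group_related(files):
    """files: iterable of (path, date_taken) pairs.

    Groups paths sharing a frame number AND a date, so DSC02345 on the
    card and DSC32345RC2 on disk land together, while an unrelated
    DSC12345 taken on another date does not. Only groups with more than
    one member are returned.
    """
    groups = defaultdict(list)
    for path, date_taken in files:
        key = base_key(path)
        if key is not None:
            groups[(key, date_taken)].append(path)
    return {k: v for k, v in groups.items() if len(v) > 1}


if __name__ == "__main__":
    # Hypothetical sample data, purely for illustration.
    sample = [
        ("card/DSC02345.JPG", "2010-03-14"),
        ("edited/DSC32345RC2 beach.JPG", "2010-03-14"),
        ("archive/DSC12345.JPG", "2008-07-01"),
    ]
    for key, paths in group_related(sample).items():
        print(key, paths)
```

This only reports which files appear to belong together and where they
are; it deletes nothing. Using the date as part of the key is what
keeps frames 10,000 images apart from being confused once the rollover
digit is ignored.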
As the 1 TB external is a backup drive for some of the onboard photos,
a large number of duplicate sets is not surprising - the final figure
may be higher depending on where it has looked so far.


     Russell

-- 
http://www.piclist.com PIC/SX FAQ & list archive
View/change your membership options at
http://mailman.mit.edu/mailman/listinfo/piclist