One of my favourite things is dealing with dark data and putting it onto SharePoint. No, seriously, I love large data manipulation and trying to cram TBs of old data onto SharePoint just so we can easily search it. It’s the challenge.
There are quite a few blockers for moving large quantities of data to the cloud for archive, and one of them is always sheer size. While working on a project recently I needed to get the image files down to something sensible. Why people feel the need to shoot in practically 140MB RAW and then lob it into a 100% quality JPG I have no idea, but it results in absolutely HUGE image files. To get each one down to a decent screen-viewable image that I could take online, I decided to run through the lot and compress in place. The OS of choice, Linux. The tool of choice, ImageMagick.
My preferred method to find and shrink the files is simply this (warning, it’s in-place and overwrites the original file!!):
find . -iname "IM*.jpg" -print0|xargs -I{} -0 mogrify -verbose -resize 1920x1920\> {}
Let’s break that down…
“find” is the find tool, much loved by Linux folk for its raw search power
“.” is the directory (folder for you youngsters) that I was starting from, which is the one I’m in
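“-iname” is the option I give to find so it matches file names case-insensitively
“IM*.jpg” is the pattern to match: the images that I want to change all start with IM and end in .jpg. The * matches anything in between
“-print0” prints each file name followed by a null character instead of a newline, ready to hand over to …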
“|” pushes the find data to another command, it’s called a pipe
“xargs” is a great tool for taking arguments from one command and doing something in bulk on another
“-I{}” means the file name from the find command gets dropped in wherever the braces appear
“-0” tells xargs the incoming names are separated by that null character, so special letters or whitespace in a file name can’t break things (just use it!)
“mogrify” is the ImageMagick mogrify command
“-verbose” as I like screen output
“-resize” I’ll let you guess
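“1920x1920\>” Weird one, but let me explain… I want the images to be at most 1920 pixels on their longest side, landscape or portrait. This looks for the largest side of the image and then resizes at scale to 1920px, keeping the proportions. So if you have a widescreen image the width gets the 1920 size, but if it’s portrait the height will. The > on the end means only images bigger than that get shrunk and anything smaller is left alone, and the backslash stops the shell treating > as a redirect. I hope that makes sense.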
“{}” is the file name to mogrify
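Since the real command overwrites files, a dry run can be worth doing first. Here is a minimal sketch of one, using the same find and xargs plumbing but with echo standing in for mogrify so nothing gets touched. The temporary directory and the file names in it are made up for the demonstration.

```shell
#!/bin/sh
# Dry run: show which files the pipeline would pick up, without changing them.
# The directory and file names below are invented for this demo.
demo_dir=$(mktemp -d)
touch "$demo_dir/IMG_0001.jpg" "$demo_dir/img 0002.JPG" "$demo_dir/notes.txt"

# Same find/xargs plumbing as the real command; "echo" replaces mogrify.
matches=$(find "$demo_dir" -iname "IM*.jpg" -print0 |
    xargs -0 -I{} echo "would resize: {}")
printf '%s\n' "$matches"

# Count the matches: the two images, including the one with a space in its name.
match_count=$(printf '%s\n' "$matches" | wc -l | tr -d ' ')
echo "files matched: $match_count"

rm -rf "$demo_dir"
```

The file with a space in its name comes through as a single argument, which is the whole point of pairing -print0 with -0, and notes.txt is skipped because it doesn't match the pattern.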
After letting this chomp through around 100,000 images that were over 50GB in size I ended with a nice modest 6GB of images that were crystal clear to view on screen. Worth it – up to SharePoint you go.
Just so you know, this only changes the actual pixel dimensions of the image. I could also throw in an extra option to mogrify to adjust the pixels per inch too.
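As it’s all about pixel dimensions, here is a rough sketch of the arithmetic behind 1920x1920\> in plain shell. The fit_within function is a made-up helper for illustration, not part of ImageMagick, and ImageMagick’s own rounding may differ by a pixel on some sizes.

```shell
#!/bin/sh
# Rough model of what "-resize 1920x1920\>" does to an image's dimensions.
# fit_within is a hypothetical helper, not an ImageMagick command.
fit_within() {
    width=$1; height=$2; limit=$3
    # The ">" flag: images that already fit inside the box are left alone.
    if [ "$width" -le "$limit" ] && [ "$height" -le "$limit" ]; then
        echo "${width}x${height}"
        return
    fi
    if [ "$width" -ge "$height" ]; then
        # Landscape or square: width becomes the limit, height scales to match.
        new_height=$(( (height * limit + width / 2) / width ))
        echo "${limit}x${new_height}"
    else
        # Portrait: height becomes the limit, width scales to match.
        new_width=$(( (width * limit + height / 2) / height ))
        echo "${new_width}x${limit}"
    fi
}

landscape=$(fit_within 6000 4000 1920)
portrait=$(fit_within 4000 6000 1920)
small=$(fit_within 1024 768 1920)
echo "$landscape $portrait $small"
```

With those sample sizes, a 6000x4000 landscape shot comes out at 1920x1280, the portrait version at 1280x1920, and the 1024x768 image is left as it is.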