After slicing through YouTube using the filename IMG_4228 I thought I’d try the same with Flickr. As you might expect there are a lot more items on Flickr that haven’t been renamed from that machine-generated filename, 30,118 at the time of writing with 4 or 5 being added each day. My plan was to download as many of these as I could and then see what I’d got.
My tool for this was Bulkr, a program running on Adobe AIR designed for downloading your Flickr account to your own computer (good practice, as Yahoo aren’t likely to be around forever). By upgrading to Pro I was able to set up a search for IMG_4228 and download up to 500 results at a time. I could have figured out how to write a program to do this myself, but for now £20 wasn’t a massive hardship to save some time. If I decide to pursue this technique further I will, of course, do some learning.
Bulkr can only do what it’s permitted to do by Flickr’s servers, and eventually I noticed it seemed to be hitting walls. But by this time I had over 3,500 photos to play with. The next step was to clean the data. I removed the following:
- Duplicates. While I was downloading, people were adding new photos, meaning the last photos in one batch might also appear as the first photos in the next batch. I removed these manually, so I probably missed a couple.
- Non-photos. I was pretty liberal in what I allowed but screenshots and scans didn’t seem in keeping with what I was trying to do.
- Icky nudity. There’s a fair amount of cheesecake in there and I’d switched off the “safe search” filter but I didn’t think exposing people to a butt-plugged hairy arse and balls was the right thing to do, so I removed it.
Bulkr had also prevented me from downloading any photos where the owner had explicitly said they didn’t want them downloaded, which was about 60 in each batch of 500.
The final count was 3,537 images posted to Flickr with the title IMG_4228 between June 3rd 2011 and Jan 15th 2012.
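I removed the duplicates by hand, but the overlap between batches could also be caught automatically. This is a minimal sketch, not what I actually did: it assumes the downloads sit in a single folder and that a duplicate is a byte-for-byte identical file. The function name `find_duplicates` is made up for illustration.

```python
import hashlib
from pathlib import Path


def find_duplicates(folder):
    """Return paths whose contents exactly match an earlier file in the folder."""
    seen = {}        # content digest -> first path seen with that content
    duplicates = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append(path)   # same bytes as an earlier file
        else:
            seen[digest] = path
    return duplicates
```

This only catches exact copies. The same photo saved at a different size or re-compressed would slip through and still need a human eye.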
Finally I used the Mac’s Automator to resize and pad the images into 480px squares, while also giving them sequential filenames. This gave me a consistent shape to work with.
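Automator did this for me, but the arithmetic behind "resize and pad" is simple enough to sketch. The helper names here are hypothetical, and it assumes every image is scaled so its longest side is 480px (small images get enlarged too) and then centred on the square canvas.

```python
def fit_and_pad(width, height, side=480):
    """Scale an image to fit a side x side square.

    Returns the scaled size and the top-left offset that centres it.
    """
    scale = side / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    offset = ((side - new_w) // 2, (side - new_h) // 2)
    return (new_w, new_h), offset


def sequential_name(index, prefix="img", digits=4):
    """Build a zero-padded filename such as img_0001.jpg."""
    return f"{prefix}_{index:0{digits}d}.jpg"


print(fit_and_pad(4000, 3000))   # ((480, 360), (0, 60))
print(sequential_name(1))        # img_0001.jpg
```

A landscape 4000×3000 photo ends up 480×360 with 60px of padding above and below. A portrait one gets the padding on the left and right instead.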
In order to figure out what these images could tell me I had to try and look at them as a whole. For now I’ve taken three approaches.
The Giant Poster
Seen at the top of this post and full-size in this (low-res) 20MB JPEG, this allows you to scroll around a giant contact sheet and, by pulling out, notice any patterns when the photos are really teeny. On the whole it looks pretty random in the abstract, which is exactly what you’d expect from a neutral-criteria slice through Flickr.
The Sequential Slideshow
A simple video sequence rendered at two speeds. 2 frames per second lets you look at each photo and lasts for half an hour.
15 frames per second only lasts for four minutes and starts to compress the images together into a single object.
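As a quick check on the running times, dividing the 3,537 images by the frame rate gives the duration: about four minutes at 15 frames per second, and about half an hour at two frames per second. A small sketch of that sum, with a function name of my own choosing:

```python
TOTAL_IMAGES = 3537


def runtime_minutes(frames, fps):
    """Length in minutes of a video showing `frames` images at `fps` frames per second."""
    return frames / fps / 60


for fps in (2, 15):
    print(f"{fps} fps: {runtime_minutes(TOTAL_IMAGES, fps):.1f} minutes")
# 2 fps: 29.5 minutes
# 15 fps: 3.9 minutes
```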
Contact Sheet Slideshow
A compromise of the above. Twenty five contact sheets shown for two seconds each.
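Splitting the set into sheets is just chunking a list. The sketch below assumes a 12×12 grid, so 144 photos per sheet. That grid size is my guess for illustration, chosen because it happens to give twenty-five sheets from 3,537 images. The function names are hypothetical too.

```python
def contact_sheets(items, per_sheet):
    """Split a list into consecutive sheets of at most `per_sheet` items."""
    return [items[i:i + per_sheet] for i in range(0, len(items), per_sheet)]


def cell_position(index, columns, cell=480):
    """Top-left pixel position of the index-th image on a sheet, filling row by row."""
    row, col = divmod(index, columns)
    return col * cell, row * cell


sheets = contact_sheets(list(range(3537)), per_sheet=144)
print(len(sheets))       # 25
print(len(sheets[-1]))   # 81, the final sheet is only partly filled
```

The same `cell_position` sum would lay out the giant poster; it just needs one very large sheet instead of twenty-five small ones.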
There are plenty more ways of looking at these photos. Loading them randomly, sorting by colour, pixelating them, statistical analysis of the contents… I may do some of these, I may not. The first stage, as with the videos, is to figure out if there’s anything worth looking for or if it’s just a random selection of 3,537 photographs.