Techies, tumblrs, IT people: Have you seen our Tumblr blog? Or maybe our #imageoftheday posts on Twitter and Facebook? We’ve been sharing images from our online galleries for your viewing pleasure. And it’s done almost entirely automatically thanks to a pair of Python scripts.
We have a trove of images of great album and book covers, 45 adaptors, punk flyers and more, but they’re a bit buried in the galleries section of our site. How best to get these images out to our friends and followers? This seemed like an ideal project for the programming course I was taking last fall as part of my library science grad program at Pratt, so I set out to write some Python. I ended up with two scripts that can work solo or in concert.
The first script crawls through the ARC gallery pages, which are generated by the popular NextGEN Gallery WordPress plugin, scraping image URLs and metadata. This information is written out to a JSON file.
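For the curious, here is a minimal sketch of that scraping step, not the actual ARCbot code. It assumes each gallery thumbnail is wrapped in a link that points straight at the full-size image and carries the description in its `title` attribute; check the markup your own gallery produces. The names `GalleryLinkParser`, `scrape_gallery` and `save_records` are made up for illustration, and the sketch uses only the Python 3 standard library.

```python
import json
from html.parser import HTMLParser

# Assumption: full-size images are linked directly by file extension.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


class GalleryLinkParser(HTMLParser):
    """Collect links that point directly at image files."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href") or ""
        if href.lower().endswith(IMAGE_EXTENSIONS):
            self.images.append({
                "url": href,
                # Assumption: the link's title attribute holds the description.
                "caption": attributes.get("title") or "",
                "posted": False,
            })


def scrape_gallery(html):
    """Return one record per image link found in a page of gallery HTML."""
    parser = GalleryLinkParser()
    parser.feed(html)
    return parser.images


def save_records(records, path):
    """Write the scraped records out as a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)
```

Fetching the pages themselves is left out; any HTTP client can supply the HTML string that `scrape_gallery` expects.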
The second script uses pytumblr, a Python Tumblr API client, to build and send a user-determined number of randomly selected photo posts to Tumblr along with appropriate caption text and tags. The JSON file is then updated to indicate which images have been posted.
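A rough sketch of the posting step might look like the following. It is an illustration under assumptions, not the script from the repository: the record fields (`url`, `caption`, `posted`) and the function names are hypothetical, and `client` is expected to be a pytumblr `TumblrRestClient` (or anything else with a compatible `create_photo` method). The sketch marks a record as posted without inspecting Tumblr's response, which a real script would probably want to check.

```python
import random


def pick_unposted(records, count, rng=random):
    """Return up to `count` randomly chosen records not yet posted."""
    unposted = [r for r in records if not r.get("posted")]
    return rng.sample(unposted, min(count, len(unposted)))


def post_images(client, blog_name, records, count, tags):
    """Queue photo posts through `client` and mark each record as posted."""
    chosen = pick_unposted(records, count)
    for record in chosen:
        # state="queue" adds the post to the blog's queue instead of
        # publishing it immediately.
        client.create_photo(
            blog_name,
            state="queue",
            tags=tags,
            caption=record["caption"],
            source=record["url"],
        )
        # Assumption: the call succeeded; the response is not checked here.
        record["posted"] = True
    return chosen
```

With pytumblr installed, the client would be created with `pytumblr.TumblrRestClient(consumer_key, consumer_secret, oauth_token, oauth_secret)` using your own credentials, and the updated records would then be written back to the JSON file.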
Why Tumblr? It’s free; many themes support a photo-gallery style layout; users can queue and schedule up to 300 posts for publication; and it can serve as a social media hub — using a service like IFTTT.com, Tumblr photo posts can trigger parallel photo posts on Twitter and Facebook.
Some challenges: Getting pytumblr to install successfully was difficult. It depends on the oauth2 module, which proved tricky to install on my machine. Eventually, I was able to install and run pytumblr using Python 2.7 (as opposed to Python 3.4). After writing the scripts specifically with the ARChive in mind, I went back and moved all ARC-specific data into a separate settings file. This leaves the Tumblr-post code clean and generic; in fact, you could use this code to run your own Tumblr bot with your own set of images. Making the web scraper script equally generic was much more difficult, because this kind of image scraping is so context-dependent.
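To give a feel for the settings split, here is a small hypothetical sketch. None of these keys or values come from the real settings file; they only show how site-specific details can sit in one place while the posting logic stays generic.

```python
# Stand-in for a separate settings file. Every value here is hypothetical.
SETTINGS = {
    "blog_name": "example-blog",
    "tags": ["records", "cover art"],
    "caption_template": "{caption} (from the {gallery} gallery)",
}


def build_post(settings, record):
    """Combine the generic posting logic with site-specific settings."""
    caption = settings["caption_template"].format(
        caption=record["caption"], gallery=record["gallery"]
    )
    return {
        "blog_name": settings["blog_name"],
        "tags": list(settings["tags"]),
        "caption": caption,
        "source": record["url"],
    }
```

Swapping in a different `SETTINGS` dictionary is then all it takes to point the same code at another blog and another set of images.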
There’s still some work to do to really polish it up, but hey, it works.
#ARCbot lives in a cozy GitHub repository. Take a look and let us know what you think.