Imagine you have a large folder of images in Google Drive. I don’t have to imagine this as I do thanks to an IFTTT recipe.1 Google tends to be kind of stingy with the kinds of filtering/interactions you can have with files in their folders and we know that if you get stuff in Google Sheets then a world of other possibilities opens up. I’ve been thinking about what options there are with Google file storage because VCU is a GAFE school and we have unlimited Drive storage. That might open some media storage options with heavy load projects like our Field Botany site or our more recent work with the East End Cemetery. So . . . I wrote a quick script to take a large G Drive folder full of images and write the content to a spreadsheet while embedding an image preview. The script is below. I ended up revamping both enough that I felt it was worth reposting. For the record, the script ran through about 4,500 images but it may have timed out so keep that in mind if you’re dealing with lots of images. 1 I figure if I have enough backups to my online backups then I can pretend I’m safe.
via the magical XKCD Fun being a fairly relative term . . . but I’m amused. The Meat You can write custom functions in Google Spreadsheets and then use them like other built in functions. I didn’t realize that. The script below1 grabs all the URLs from a chunk of text. You could add it to your spreadsheet in Google by going to Tools>Script Editor and opening a blank project. Replace all of the content with this and then save it. You can now use it like other functions by putting =findULR(A1) (assuming A1 is the text chunk you want). It spits out a single cell with the URLs in it on individual lines and with a count of the URLs found at the top of that cell. I’ve commented up the script below in case you want to understand/change it to better suit your needs. Next Steps The regex works well about 90-95% of the time. So it sure beats doing it by hand but it could be improved. You can try your own stuff here. The post I’m using there is pretty messy so it’s a good thing to check against. It’s hard to guess what odd things people will do with URLs. I’ve already seen people doing stuff like “”http://blah.com””. No idea why they did double quotes. […]
Functional is a kind way to describe this . . . but it’s hard to argue with computers doing tedious work that was previously done by hand. The goal here was to automate the collection of the comments on 300 or so posts from the CDC’s Facebook page so they could be analyzed. Ebola is in the mix which leads to some comments that may very well rival YouTube comments in terms of causing me sadness. We did end up with around 34,000 comments which would have been hand copied in the past. So some karma balance was achieved. wTake all this with a grain or two of salt. It works but may very well be illegal in some states or do things in a way that would make real programmers weep. Consider my programming skills to be like a man trying to swat flies with a broomstick- lots of furious action, panting, cursing, and every so often I will kill a fly and celebrate wildly.1 The Facebook This was my first experience trying to do anything with Facebook. The hardest part here for me was just getting the stupid access token needed to get at the JSON feeds. I eventually found the secretly named Facebook Login Example. Part of my issue was that I thought I could skip the […]
This is a post about (at least temporary) failure. I should be able to do all this via the Facebook API/SDK but I’m doing something wrong. While I’m learning a decent amount in the land of programming, fundamentally I still suck. With that cathartic self-flagellation out of the way . . . I’m working with someone who wants to grab all the comments from a large number of CDC posts about health issues for some research. She is currently doing it by hand. Here are two improvements to that awful reality.1 Expand All Comments In FB land if there are many comments you only see the most recent. And if there are many, many comments you have to keep hitting “View Previous Comments” over and over. That’s super boring to do once. If you have to do it a lot it would really suck. Enter Alec’s bookmarklet. I tweaked it a tiny bit because I think FB changed the wording but it works like a charm. You’d copy the text below. Add a bookmark and then click edit. Replace the URL with this text and name it whatever you want. Grab All Comments Now to grab all these enlightening comments . . . Install the Scraper Chrome Extension. You can now right click a particular item and choose “scrape similar.” […]
flickr photo shared by Lynn Friedman under a Creative Commons ( BY-NC-ND ) license We’re still messing with manipulating data in Google Sheets. I hate to think of humans going through and manually parsing things that the machine should do so I’m putting all this out in the hopes of saving someone somewhere some time. It’s also meant to help our internal people see how it works so it might be more broadly applied. In this case, we’re just looking for a line break and using that to cut out a chunk of text. This Stack Overflow post gave me the char(10) piece so I could find the line break. You can see the details above and more on unicode characters here. It’s broken in two pieces for clarity. Piece one finds the number of characters prior to the line break. The second piece takes that number of letters from the left. I find manipulating text in spreadsheets to be a constant and practical use of algebra.
I thought I’d take a stab at using Google Script to capture Instagram data after being inspired by all the great Martin Hawksey has done with the TAGS Explorer. I’m doing something very similar and it turned out to be fairly straightforward to take the PHP I’d written previously and turn it into something that’d function in Google Sheets. I found this example highly useful in creating my own script.1 The following script can also be set with a time based trigger to fire every X minutes/hours/days which is perfect for this particular project. You can see it pulling 20 pictures worth of data every hour here if you’d like. 1 I basically copied portions of it.
flickr photo shared by JeepersMedia under a Creative Commons ( BY ) license I started to apologize for writing three posts on this and promising not to do any more but I reconsidered. This is my site. I’ll write whatever I want. Skip it if it bores you or exile me from your feed reader.1 Alan’s comment got me thinking that using spreadsheet formulas was not necessary and it felt awkward to me anyway. So I figured out how to do it all in the php. I’ll include the relevant portion of the code below. You can get the whole thing here. The key is substr_count which will find stuff in a string and count it. The other little piece is boolval which returns true if it’s greater than 0. 1 Plus no one reads blogs any more. Shouldn’t you be on Twitter or vaping?
flickr photo shared by Marco Gomes under a Creative Commons ( BY ) license Remember last night when I posted about scraping data from Instagram? I woke up this morning about 5:30 (literally with a start) astounded by how easy the solution to archiving the pagination returns was. So before I even left for work I managed to get this working so much better than my previous attempt. I stripped out all of the previous GitHub stuff as I realized I didn’t really need it. It had provided a nice crutch and let me know I sort of knew what I was doing. The explanation of what’s going on is in the comments interspersed in the code below. There’s a much cleaner way to do this where I don’t duplicate so much code. I could just call the part that builds the csv1 twice. I may do that at some point but I think having it all in one place will help people new to this sort of thing see what’s going on more clearly. This is fun stuff. I need to do more of it and more consistently. In the past, I’d do some programming for a few days and then not do any for a number of months. That makes for slow progress and frustration. I’m going to […]
flickr photo shared by ajmexico under a Creative Commons ( BY ) license I’m trying to step up my programming game a bit.1 APIs are also getting more and more accessible to jokers like myself.2 (In this case I also use php, cron, and some regex.) All of this should make Alan very proud. But I’m relatively terrible at doing things without a purpose. Luckily one wandered in on Tuesday. A faculty member who I’ve worked with a few times before came in and asked if there was any way to grab Instagram data for a project on social media and health that focused on vaping and ecigs. I’m not one to look a gift project in the mouth so I said I’d take a stab at it. Step one was to check out Instagram’s API3– in particular I wanted to see the tag endpoints. Those are URLs that give you access to JSON data. To get at these you need to register as an Instagram developer and register a client. This is a pretty straightforward process. After that I browsed around GitHub to see what might already exist. This got me to the Instagram PHP API. I always start by wandering GitHub much like I start my WordPress work by looking at plugins first. It took me a long […]
As part of the gen ed seminar I pulled the rampages.us user signup data for Kristina Anthony. It was just a straight export from the wp_users table and stripped of everything but the date. She pulled it into Excel and used a pivot table to make it manageable. Which is awesome. So I pulled it down and pushed it back up into Google Docs so that I could embed the chart in this post. It makes me feel better to look at the growth over what amounts to around a year of actual use. I tend to focus on places for improvement (and there are many) but it’s worth looking at what ALT Lab has managed to achieve in a fairly short period of time.1 The July to February jump of about 6000 users is pretty insane. I have every expectation that we’ll add another 6000 or so users next year. Things will certainly only get more interesting. This has been done without huge student training initiatives. For the most part faculty members are able to support their own students. We have some of that filter up and we deal with some troubleshooting online but there’s no dedicated person(s) to support WordPress issues or train students. That’s a testament to WordPress. 1 In the higher ed dimension a year is […]