Scraping Instagram – Take 3


flickr photo shared by JeepersMedia under a Creative Commons ( BY ) license

I started to apologize for writing three posts on this and to promise not to do any more, but I reconsidered. This is my site. I’ll write whatever I want. Skip it if it bores you or exile me from your feed reader.1

Alan’s comment got me thinking that the spreadsheet formulas weren’t necessary, and they felt awkward to me anyway. So I figured out how to do it all in PHP. I’ll include the relevant portion of the code below. You can get the whole thing here.

The key is substr_count, which counts how many times a substring appears in a string. The other little piece is boolval, which returns true for any count other than 0.

                // $caption holds all the text associated with the Instagram post
                $caption = $media->caption->text;
                $filter = $media->filter;
                // $hashcount is the number of # characters found in $caption
                $hashcount = substr_count($caption, '#');
                // $hashtrue is the string 'true' if $hashcount is not 0, otherwise 'false'
                $hashtrue = (boolval($hashcount) ? 'true' : 'false');
                // same pattern here, counting @ instead
                $atcount = substr_count($caption, '@');
                $attrue = (boolval($atcount) ? 'true' : 'false');
                // add the results to the list of rows for the CSV, using ? as the delimiter
                array_push($list, $username . '?' . $likes . '?' . $comments . '?' . $link . '?' . $caption . '?' . $filter . '?' . date(DATE_RFC2822) . '?' . $hashtrue . '?' . $hashcount . '?' . $attrue . '?' . $atcount);
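
If you want to poke at the counting logic on its own, here is a minimal, self-contained sketch. The function name captionStats and the sample caption are made up for illustration; neither is part of the script above, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not in the original script): counts # and @ in a
// caption and reports, as the strings 'true'/'false', whether each appears.
function captionStats(string $caption): array
{
    $hashcount = substr_count($caption, '#');
    $atcount   = substr_count($caption, '@');

    return [
        'hashtrue'  => $hashcount > 0 ? 'true' : 'false',
        'hashcount' => $hashcount,
        'attrue'    => $atcount > 0 ? 'true' : 'false',
        'atcount'   => $atcount,
    ];
}

// Made-up sample caption, joined with the same ? delimiter used above.
$stats = captionStats('Lunch with @alan #tacos #mexico');
echo implode('?', $stats), PHP_EOL; // prints true?2?true?1
```

Comparing the count to 0 directly does the same job as boolval here, so the sketch also runs on setups where boolval is not available.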

1 Plus no one reads blogs any more. Shouldn’t you be on Twitter or vaping?

7 thoughts on “Scraping Instagram – Take 3”

    1. Damn it. Now I’m eating into my core constituency. It’s you, Alan, and Jim. But I must maintain my integrity as a blogger. Come what may.

      1. Sorry. I gotta do what I gotta do. But you might be temporarily saved, apparently Google Reader is like down for some maintenance or something. I’ll have to try again later.

  1. I may have half submitted a comment while on a mobile in the back of a car in Mexico. Keep the code brewing.

    The boolval function is not needed; there are more direct ways to test values in an if () statement. For example, if your $hashcount finds any matches (1, 2, 234), a simple if ($hashcount) evaluates to true, because any value other than 0 counts as true. 0 is the same as false; any nonzero integer is true. When you have a string value, if ($string) evaluates to false for an empty string (or the string "0"), and true otherwise.

    So you can go shorter with

    if ($hashcount) {
    // do stuff if there are hashtags
    } else {
    // do other stuff
    }
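
A quick way to see which branch a value lands in is a small sketch like the one below. The helper name branchFor and the sample values are made up for illustration only.

```php
<?php
// Hypothetical helper: reports which branch of an if () a value would take.
function branchFor($value): string
{
    if ($value) {
        return 'if';
    }
    return 'else';
}

// Made-up sample values: a few integers and a few strings.
foreach ([0, 1, 234, '', '0', 'hello'] as $value) {
    // var_export shows the value unambiguously, e.g. '' for an empty string.
    echo var_export($value, true), ' => ', branchFor($value), PHP_EOL;
}
```

Only 0, the empty string, and the string '0' take the else branch; everything else in the list takes the if branch.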

    PS What is “Vaping”?

    1. Would I want to do that if all I want is true/false? There is no ‘else’ that I want. I just want it to write true if it’s not 0 and false if it is. It seems economical.

      Not arguing just trying to see how that’s more direct.

      Vaping is a bizarre ecigarette thing taken to the next level.

      1. 6 of 1 etc.; I’ve never even used/seen boolval(). You could do it either like this:

        $hashtrue = ($hashcount) ? 'true' : 'false';

        Or, it looks like with that function, this is the most compact:

        $hashtrue = boolval($hashcount);

        They all do much the same thing, though the last one gives you a boolean rather than the string 'true' or 'false'.
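
That difference matters once the value is joined into a line of text, as the post does with its ?-separated row. A minimal sketch, using a made-up count and a made-up 'flag=' label:

```php
<?php
$hashcount = 2; // made-up count for illustration

$asString = $hashcount ? 'true' : 'false'; // the string 'true'
$asBool   = boolval($hashcount);           // the boolean true

// Concatenation turns a boolean true into '1' (and false into '').
echo 'flag=' . $asString, PHP_EOL; // prints flag=true
echo 'flag=' . $asBool, PHP_EOL;   // prints flag=1
var_dump($asBool);                 // prints bool(true)
```

So if the goal is to see the words true and false in the output file, the ternary form is the one to keep.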

        1. Ah. Got it. I just didn’t get the structure when you weren’t really doing anything. There are some basic things that I’ve skipped in my miseducation.
