I got an email from a KQED reporter asking about this old post on importing CSVs into Google Sheets. In this case the source data was large and hosted on GitHub, and they only wanted the California data.
I thought anything we did in Google Apps Script was likely to time out, so I built a little PHP filter instead.
// I commented these out so I could more easily see the data at the URL
// header("Content-Type: text/csv");
// header("Content-Disposition: attachment; filename=CA-data.csv");

$csvFile = 'https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties-2022.csv'; // original file source
$state = 'California'; // the state we want to keep

readCSV($csvFile, $state);

function readCSV($csvFile, $state) {
    $file_handle = fopen($csvFile, 'r'); // get that file
    $i = 0;
    while (!feof($file_handle)) {
        $line = fgetcsv($file_handle, 1024, ';'); // semicolon delimiter on purpose - the data has no semicolons, so the whole row lands in $line[0]
        if ($line === false) {
            continue; // skip blank lines and the end of the file
        }
        $row = explode(',', $line[0]); // turn the csv line into an array so we can check the state more easily - it's the third column, which is index 2 since counting starts at 0
        if ($row[2] === $state || $i === 0) { // is the state California or is it the header row? if so . . .
            echo $line[0] . PHP_EOL; // spit out the data and add the PHP end of line
            $i = $i + 1;
        }
    }
    fclose($file_handle);
}
Now, instead of importing the main CSV URL from GitHub, we'll do the import on my filter URL at https://tomwoodward.us/csv-filter.php. It seems to work nicely, and you can set the trigger to re-import every day at whatever time you want.
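On the Google Sheets side, the general pattern from the old post still applies: a small Apps Script function fetches the CSV and writes it into a sheet, and a time-driven trigger reruns it daily. Here's a rough sketch of what that could look like pointed at the filter URL. The function names, the 'CA data' sheet name, and the 6 AM trigger hour are placeholders I made up for the example, so adjust them to fit your own spreadsheet.

function importCaliforniaData() {
  // fetch the filtered CSV from the PHP filter above
  var csvText = UrlFetchApp.fetch('https://tomwoodward.us/csv-filter.php').getContentText();
  // trim() so a trailing newline doesn't become an empty row
  var data = Utilities.parseCsv(csvText.trim());
  // 'CA data' is a made-up sheet name - point this at whatever sheet you're using
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('CA data');
  sheet.clearContents(); // wipe yesterday's import first
  sheet.getRange(1, 1, data.length, data[0].length).setValues(data);
}

function createDailyTrigger() {
  // run importCaliforniaData once a day - the hour is arbitrary
  ScriptApp.newTrigger('importCaliforniaData')
    .timeBased()
    .everyDays(1)
    .atHour(6)
    .create();
}

You could also just drop =IMPORTDATA("https://tomwoodward.us/csv-filter.php") into a cell and let Sheets refresh it periodically on its own, but the trigger gives you control over exactly when the re-import happens.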
KQED! that’s pretty cool. San Francisco, CA affiliate as I recall. Very auspicious.
It’s always nice to be able to help people doing good work.