The first step in considering your personal API is figuring out where your stuff is now. This has been an interesting experiment for me, as I’ve flung stuff all around the Internet with very little concern for the long term.
Where is my stuff?
I’m trying to think about all the places I’ve put work and/or media I care about. I’m also trying to group all of it in some sort of organized fashion. I thought it’d make sense to think big picture and work my way down.
- Domains/Servers
- bionicteaching.com on bluehost until I can do the reclaim migration
- mainly the blog but lots of random files as well – no real idea what’s on here
- tomwoodward.us on bluehost until I can do the reclaim migration
- rampages.us (work) – on reclaim, code stuff is mostly on github but content is in the wind
- augmenting.me (work) – on media temple, code stuff is mostly on github, maybe
- greatvcubikerace.net (work) – limited, no idea if I’ve got this on github
- teachers.henrico.k12.va.us – (old work) not sure it’s salvageable in time (lost to the monsters?)
- Google Docs
- bionicteaching – 5GB
- vcu- work – 11GB
- montessori – work
- henrico – work (lost to the monsters – I document this as a reminder of how much stuff can be lost when you change jobs – remember, changing ownership across Google domains does not work)
- Photos
- flickr – bionicteaching – 24,620 photos
- flickr – woodward98 – 29,000 photos
- flickr – vmi football – 3,700 photos – not sure I have the login anymore
- flickr – crossfit huntsville pgs 16-30 – I don’t have the login anymore and someone else does the site but I want some of the pictures I took for a regional event
- instagram – twwoodward – I think it’s all duplicated in flickr via an IFTTT recipe, use is sporadic in any case
- 500 px – duplicates of flickr photos – can let it burn
- Google Photos – bionicteaching flickr stuff backs up here but only for the last year or so and is reliant on IFTTT to do it
- Bookmarks
- Video
- Code
- Twitter – archived and updating thanks to Martin Hawksey (although saved in Google Drive)
- Stack Exchange
- Facebook – let it burn
- Reddit – it’s been a long time probably not worth it
- Feedly – get that OPML file
There is other stuff but I think that gets me almost all of the things I care about.
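Several of the services above hand you a machine-readable export — Feedly, for instance, gives you an OPML file. As a minimal sketch (the sample OPML here is made up, but follows the standard outline structure Feedly uses), pulling the actual feed URLs out of one is a few lines:

```python
import xml.etree.ElementTree as ET

def feeds_from_opml(opml_text):
    """Return the xmlUrl of every feed outline in an OPML export."""
    root = ET.fromstring(opml_text)
    # Feedly nests feeds under category outlines; any outline carrying
    # an xmlUrl attribute is an actual subscription.
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]

sample = """<opml version="1.0">
  <body>
    <outline text="edtech">
      <outline text="Bionic Teaching" type="rss"
               xmlUrl="http://bionicteaching.com/feed/"/>
    </outline>
  </body>
</opml>"""

print(feeds_from_opml(sample))
```

Once the feed list is plain data like this, it can be folded into whatever inventory structure the rest of the audit uses.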
What Do I Want in the End?
I want one place where all the stuff ends up. I want it duplicated. I want to know that as I wander between jobs, or as services die, most of this work will be fine . . . even if I never look at it again. I want structured data around things so I can find things by date, location, category/tags, or basic text search. I want to hook it into a decent portfolio that I can provision on the fly. I want to be able to sort, shift, push, and pull stuff.
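That “structured data so I can find things” wish is concrete enough to sketch. Here’s a minimal version, assuming a schema I’m inventing for illustration (service, account, status, tags, last_touched) with a couple of records lifted from the inventory above:

```python
from datetime import date

# One record per inventory item; the field names are a guess at a
# minimal useful schema, not anything standardized.
inventory = [
    {"service": "flickr", "account": "bionicteaching", "status": "reclaim",
     "tags": ["photos"], "last_touched": date(2016, 1, 1)},
    {"service": "500px", "account": "bionicteaching", "status": "let it burn",
     "tags": ["photos", "duplicate"], "last_touched": None},
    {"service": "feedly", "account": "bionicteaching", "status": "export",
     "tags": ["bookmarks", "opml"], "last_touched": None},
]

def find(items, tag=None, status=None):
    """The 'find things by category/tags' piece: filter on tag and/or status."""
    return [i for i in items
            if (tag is None or tag in i["tags"])
            and (status is None or i["status"] == status)]

print([i["service"] for i in find(inventory, tag="photos")])
```

Date, location, and full-text search would just be more fields and more filters on the same records; the point is that once the inventory is data instead of a bulleted list, the sorting/shifting/pushing/pulling becomes scriptable.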
I know that certain things will simply die if the service dies. Most of the things I value from Google Sheets will not work if the service is gone. The downloaded Excel sheets won’t do the network-aware things that I want . . . plus I’d have to learn Visual Basic or something like that.
Some of the things, like the Stack Exchange comments, don’t have as much value apart from the context but are still useful.
I’ve been considering my own API nouns and just how close to the metal (relatively speaking) I’ll need to be to be happy. I’m also trying to look at this in a more practical way and consider what people would be willing/interested in doing around this. How much stuff are they likely to have in how many places? What level of effort would they be willing to expend? How important is acting in the space? The POSSE concept is based on the idea that being in the space doesn’t matter much. I’m not sure that’s true. It seems like the entire reason people like things like Instagram is the user experience. It might be that a system that reclaims from those services ends up being more difficult to build but more likely to succeed. In other scenarios, like submitting work to a professor, POSSE makes more sense. There is no user experience to miss.
Like Jim tweeted, I’m gonna be copying you on the inventory approach. I like where you define stuff you can reclaim, stuff that you *could* reclaim but isn’t worth the bother (Google Sheets), and stuff you are just willing to let wither if it must (your “let it burn”). There’s stuff in the middle I like to call “co-claiming”.
One bit of inventory I will do is looking at all the sites I forgot I’d created accounts on, stored in my Chrome passwords and Apple Keychain – and nuking the ones I no longer need. And figuring out from others what is claimable. I might also look at noting, for old stuff out of my control, what might still be available on the Internet Archive, so I know in advance.
For me, that is all my work from my Maricopa years – 14 years of web stuff they took offline (much of it had security issues, Perl scripts writing to open text files, but not all of it). I have a full web directory archive and have restored some of it on my own domain, but nearly everything is functional in the archive.
And for some of the hosted stuff, I bet we can wrangle some API stuff to make those counts on Flickr and YouTube (you forgot SoundCloud) dynamic?
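For the Flickr side, a dynamic count could come from the real `flickr.people.getInfo` REST method, which includes a photo count in its response. A hedged sketch (the API key and user id are placeholders, and the sample response below is trimmed to just the shape that matters):

```python
import json
from urllib.parse import urlencode

FLICKR_REST = "https://api.flickr.com/services/rest/"

def flickr_count_url(api_key, user_id):
    """Build the URL for flickr.people.getInfo, which reports photo count."""
    return FLICKR_REST + "?" + urlencode({
        "method": "flickr.people.getInfo",
        "api_key": api_key,       # placeholder -- use your own key
        "user_id": user_id,
        "format": "json",
        "nojsoncallback": 1,
    })

def photo_count(getinfo_json):
    """Pull the count out of a flickr.people.getInfo JSON response."""
    return int(getinfo_json["person"]["photos"]["count"]["_content"])

# A trimmed-down response of the shape the API returns:
sample = json.loads('{"person": {"photos": {"count": {"_content": 24620}}}}')
print(photo_count(sample))
```

Fetching the URL on a schedule (or on page load, cached) would keep the inventory numbers honest instead of frozen at whatever day the audit happened.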
But dude, get off of Bluehost!
The archive.org piece is really important and one I did recently but didn’t think to include.
I do need to nuke the abandoned accounts.
The bluehost thing is an embarrassment and I am suitably ashamed.
Jon Udell’s Hosted Lifebits (circa 2007) is where I first ran into this idea. http://blog.jonudell.net/2007/05/22/hosted-lifebits/
And Gordon Bell at Microsoft Research with his MyLifeBits http://research.microsoft.com/en-us/projects/mylifebits/
I think it’s time to do the heavy lifting and make this fer rilz and this audit process IS part of that heavy lifting.
If you can’t measure it, you can’t manage it. (attributed to Andy Grove)
I’m at least following in impressive footsteps . . . even if it’s ten years later. 🙂
It would be interesting to automate a chunk of the audit process or at least guide it a bit. Given Alan’s comment, I wonder what you could grab from the browser automatically and then do some sort of guided suggestion . . .
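As one concrete starting point for that automated first pass: Chrome stores its bookmarks as a JSON file (typically `~/Library/Application Support/Google/Chrome/Default/Bookmarks` on macOS, `~/.config/google-chrome/Default/Bookmarks` on Linux), so a guided audit could walk that tree and surface which domains you’ve accumulated stuff on. A sketch, run here against a made-up sample of the same structure:

```python
from collections import Counter
from urllib.parse import urlparse

def walk(node):
    """Yield every URL in a Chrome bookmarks tree node."""
    if node.get("type") == "url":
        yield node["url"]
    for child in node.get("children", []):
        yield from walk(child)

def domain_counts(bookmarks):
    """Count bookmarks per domain -- a crude 'where is my stuff' signal."""
    urls = [u for root in bookmarks["roots"].values()
            if isinstance(root, dict) for u in walk(root)]
    return Counter(urlparse(u).netloc for u in urls)

# Made-up sample mirroring Chrome's Bookmarks file structure:
sample = {"roots": {"bookmark_bar": {"type": "folder", "children": [
    {"type": "url", "url": "http://bionicteaching.com/about/"},
    {"type": "url", "url": "http://bionicteaching.com/feed/"},
    {"type": "url", "url": "https://www.flickr.com/photos/bionicteaching/"},
]}}}

print(domain_counts(sample).most_common())
```

The “guided suggestion” half could then be as simple as matching those domains against a list of services known to be reclaimable (Flickr, YouTube, etc.) versus ones to flag for manual review.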
Sometimes identifying what that first step should be is the biggest breakthrough. Automating a first pass audit/inventory and then attempting to identify possible “channels” for input/output is a big, big step forward. I don’t want to hyperbolize, but I haven’t seen/heard that before. Unless someone would consider it “middleware” or some type of glue code, this is new to me at least.
It seems like a way to make it more accessible. I’m tempted to shoot for an omnivorous JSON ingester for WordPress – easy mapping from JSON into custom fields, etc. for major platforms, with the possibility of mapping other services more manually. I have a more detailed blog post in my head, but I’m fighting myself over whether I see everything as a nail to be hit with WP.
I say fix it for yourself first. Achieve that. Then abstract, refactor, optimize, and generalize the workflow after that. Define the exceptions, edge cases, and the null set, and see whether those fall inside/outside of (1) what WP can do and (2) what might be better done by a native API provided by that app. I’m guessing much of this is HTTP GET/PUT transactions of objects back and forth.