My initial filenaming convention for PixCede was pure rubbish.
It all started with the best of intentions. After reading about the development of Pastebin and why the database got scrapped, I was inspired to make PixCede rely on the filesystem and not the database. I did, however, need to keep the timestamp so I could sort the images by the order in which they arrived. So my initial naming convention was:
2008-07-04T07:19:03+00:00_f6cfee1f4e477963a3c24d8f9b769722.jpg
That is, PHP's date('c') followed by an MD5 of the file. My logic was, date('c') would keep the time property and, if by chance any two images hit the system in the same second, they would certainly have different contents, and the MD5 would differentiate them (unless of course, the same image hit at the same time, but.. I don't see why it would need to be there twice). Using date('c') was just a bad idea from the start; if I had spent 2 seconds more thinking about it, I would have just used time(), which returns the number of seconds since the Unix epoch. Using an MD5 hash is a dumb idea too, because it's a pretty expensive operation.
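For reference, here is a minimal sketch of what that first scheme amounts to. The function name pixcede_name_v1() and the default extension are made up for illustration; only date('c') and the MD5 of the file come from the description above.

```php
<?php
// Hypothetical helper (not the actual PixCede code): builds a
// version-one style name from the arrival time and the file contents.
function pixcede_name_v1(string $path, string $ext = 'jpg'): string
{
    // date('c') gives an ISO 8601 timestamp, 25 characters long.
    // md5_file() hashes the file contents into 32 hex characters.
    return date('c') . '_' . md5_file($path) . '.' . $ext;
}
```

That is 25 + 1 + 32 = 58 characters before the extension, compared with the 10 digits time() currently needs to carry the same ordering information.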
So for version two, I used time() concatenated to uniqid(), a function that creates a UID based on the current time in microseconds (it's what the PHP manual pages recommend to use for Session IDs). Without any parameters, uniqid() returns a 13 character string. That brings me to:
1219098558133aaffc6178602.jpg
Substantially better, but.. as the great doctor says, if something's worth doing, it's worth doing right. Ideally, I want PixCede to send a small enough unique identifier back to the user that he can type it into his browser, after receiving an SMS back. Next on the chopping block: the 13 character unique identifier.
In a dream world, I might get 1000 pictures per second with PixCede. Realistically, I think I only need to worry about two at once (and that's a stretch), but 1000 seems like a nice round number. If I use PHP's base_convert(), I can create a random number from 0 to 1,679,615 (36^4 - 1), convert it to base 36 (10 digits + 26 letters), and use at most 4 precious characters. This brings it down to:
1219098558xe21.jpg
Which is a lot better! It preserves the ordering in the first 10 characters, and uses 4 characters on the end to make it unique. But why not go balls to the wall? I might as well convert both the timestamp and the random number to base 36; a timestamp in base36 will sort just as well as one in base10. The final code I am using:
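// Shift the timestamp left by 7 decimal digits, then add a random suffix (at most 1,679,616).
$id = (time() * pow(10, 7)) + rand(0, pow(36, 4));
// Encode in base 36. The value is about 1.2e16, so this relies on 64-bit integers to stay exact.
$uid = base_convert($id, 10, 36);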
3bnd9c4wf66.jpg
11 characters total, a saving of 47 over my original, poorly encoded 58 characters! This, I feel, is an acceptable UID to have to type in by hand. If you want to improve on it, base62 is just a small function away (there's a user-contributed one on the PHP base_convert() page).
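A rough sketch of the base 36 scheme with a way back to the timestamp, assuming 64-bit PHP 7 or later. The names PIXCEDE_SHIFT, make_uid() and uid_timestamp() are hypothetical, and the timestamp and random part are passed in as arguments so the behaviour is repeatable.

```php
<?php
// Hypothetical constant: 10^7 leaves seven decimal digits for the random part.
const PIXCEDE_SHIFT = 10000000;

// Combine a Unix timestamp and a random number (below 10^7) into one
// base 36 string. Assumes 64-bit integers; the product is about 1.2e16.
function make_uid(int $timestamp, int $random): string
{
    return base_convert((string) ($timestamp * PIXCEDE_SHIFT + $random), 10, 36);
}

// Recover the timestamp by decoding and dropping the random digits.
function uid_timestamp(string $uid): int
{
    return intdiv((int) base_convert($uid, 36, 10), PIXCEDE_SHIFT);
}
```

Since base_convert() emits digits before lowercase letters, two IDs of the same length compare as strings in the same order as their timestamps, which is what keeps the directory listing sorted.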
EDIT:
I decided to go ahead and switch it over to base62 encoding. The comment on base_convert() actually didn't work for what I needed; the images were mostly sorted, but off a little bit. It turns out you need to switch the upper and lower case letter sets in the dec2any() function so the alphabet runs in ASCII order ("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"). Uppercase letters sort before lowercase ones, so a digit with a larger value has to be a character that also sorts later. Now everything works properly, with a final, 9 character encoding of:
tfebHWV5s.jpg
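This is not the dec2any() function from the manual comment, just a minimal sketch of a base62 encoder and decoder that uses the ASCII-ordered alphabet quoted above. The names BASE62_ALPHABET, base62_encode() and base62_decode() are my own, and it assumes 64-bit PHP 7 or later.

```php
<?php
// Digits, then uppercase, then lowercase: the same order as ASCII,
// so a larger digit value is always a later-sorting character.
const BASE62_ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';

function base62_encode(int $n): string
{
    if ($n < 0) {
        throw new InvalidArgumentException('Only non-negative integers are supported.');
    }
    $out = '';
    do {
        // Peel off the least significant base62 digit and prepend it.
        $out = BASE62_ALPHABET[$n % 62] . $out;
        $n = intdiv($n, 62);
    } while ($n > 0);
    return $out;
}

function base62_decode(string $s): int
{
    $n = 0;
    foreach (str_split($s) as $ch) {
        $pos = strpos(BASE62_ALPHABET, $ch);
        if ($pos === false) {
            throw new InvalidArgumentException("Invalid base62 character: $ch");
        }
        $n = $n * 62 + $pos;
    }
    return $n;
}
```

One caveat: string order only matches numeric order between names of the same length, so the sorting holds for as long as the encoded value stays at 9 characters.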
2008-08-18
PixCede gets shorter file identifiers
2008-08-03
Adam's Hang Gliding Adventure: Cooler Than What You Did Saturday
As I was going through some envelopes last week, I found two $100 checks that I wasn't counting on. The "Rules of Surprise Money" were very clear on the next step: I had to buy something I didn't need that was awesome, or do something I didn't need to do that would be awesome. I've got enough junk, so I decided to go with the latter. Browsing through Meetup.com, I found a group of people going hang gliding on August 2nd, and the total cost of ground instruction plus a tandem flight at 1000 feet was $195. The choice was clear -- hang gliding and a Chipotle burrito!
This involved waking up earlier than man was meant to on a Saturday to meet the Meetup.com participants in Santa Clara at 6:15am and then continue the ride down to Tres Pinos, outside of Hollister, CA. Once we arrived there, we split into a ground instruction group and the tandem group. The ground training consisted of learning the basics: learning how to run taking long strides (the longer the strides, the less you bounce up and down, and the lower chance you have of rupturing air tension along the wings of the glider), picking up the glider and balancing it on your shoulders, learning how to orient the glider with or against the wind, walking the glider on the ground, running with the glider, and finally, freakin' lifting off the ground with the glider!
The tandem was a little more intense. They had a winch that pulled a tow line connected to the tandem glider, so that it would pick up enough speed to lift up. Man. There are very few things that I have experienced that are as intense as that initial lift -- you go up one hundred feet before you even realize what's going on!
After climbing to cruising altitude of about 700 feet, you see those birds flying around down there near the ground. I taunted them for being wusses. After circling around for six or seven minutes, we landed in the field right next to the runway.
The instructors had all been doing this for years, and had great stories; for example, jumping off of Glacier Point in Yosemite and hang gliding from one side of the valley to the other. Or, the guy who hang glided from Mt. Tam just north of San Francisco about a hundred miles south. Or, the guy who has the longest flight record at 440 miles. Yea. 440 miles. That's roughly equivalent to the distance between Cleveland, Ohio and New York City. In a hang glider.
2008-07-08
PixCede: Thoughts on how to proceed
I haven't done any "real" work on PixCede since late last week, but I've done a lot of thinking about it -- which is probably a lot better than implementing without thinking. I've also been using it myself more than I thought I would be, which is really exciting. I've told a couple of my friends and collected some feedback, but Zach is the only one who started really playing with it.
I've thought a lot about whether or not to allow emails from non-phones through. The main concern is the potential for abuse -- who's to stop someone from writing a script to constantly spam the front page with whatever? Zach had a good suggestion of getting a short SMS number (like GOOGL) and only allowing pictures to come in through that. Then, I realized that getting a picture from one of my computers to the other meant emailing it to myself, which for some reason just seems like a huge pain in the ass. I used GMail to email PixCede a picture of myself on the Half Dome, and just retrieved it through the web site on the other computer. Yea, still email on one end, but on the receiving end, I felt incredibly less inconvenienced. Therefore, I think I'm going to implement a simple file upload feature to PixCede so that you can post a picture from one computer quickly, and retrieve it from another computer quickly -- without the hassle of logging into email on both computers.
I know what you're thinking: great idea! TinyPic had that idea YEARS ago! Short answer, yes, long answer... not quite. For one, I really am going to try to model this site after PasteBin more so than TinyPic and other similar services. I really like the dead simple usage of PasteBin, and I also like the content expiration too. I feel like that would discourage people from using PixCede as an image hosting solution -- which it is not. PixCede is a simple way to transfer pictures from one device, be that a cell phone, a workstation, or something else I don't know about, to another. So in that respect, I hope to differentiate myself from TinyPic and Flickr -- this is more of a PasteBin fork.
So, speaking of PasteBin, I realized that the owner made all of his code open source. I love not re-inventing the wheel, so I'm going to start reading his blog and poking into the source code. I've briefly scanned his blog, and I was happy to see that he switched from a database back end to just a plain file system -- a decision I arrived at with PixCede a week or so ago. For something this dead simple, a relational database is really overkill. It just seems like the common answer to "my web app needs to persist information" is to just cram it into a database without really thinking about it. A well designed file structure will work just as well in this case.
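To make that concrete, here is a small sketch of what "the file system is the database" can look like, assuming every file name starts with something that sorts by arrival time. The function name list_newest_first() is made up for illustration and is not PixCede's actual code.

```php
<?php
// Hypothetical helper: returns the regular files in $dir, newest first,
// relying only on the file names sorting in arrival order.
function list_newest_first(string $dir): array
{
    $entries = scandir($dir);
    if ($entries === false) {
        return []; // unreadable or missing directory
    }
    $names = [];
    foreach ($entries as $entry) {
        // Skip '.', '..' and any subdirectories.
        if (is_file($dir . DIRECTORY_SEPARATOR . $entry)) {
            $names[] = $entry;
        }
    }
    // Plain byte-wise string comparison, descending.
    rsort($names, SORT_STRING);
    return $names;
}
```

No schema, no queries: the "ORDER BY arrival DESC" is just a reverse sort on names.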