JungleDisk and Amazon S3
October 4th, 2008
In my move to Ubuntu Linux as my desktop I lost my existing desktop backup solution (www.backupmybusiness.com) as it didn’t run on Linux. I must say that I wasn’t super thrilled with backupmybusiness as it wasn’t as intuitive as I would like and it ran about $50/month for a *single* user/desktop.
So that led me on a hunt for something that would work well for Ubuntu (non-intrusive, inexpensive remote storage, runs on all platforms KrengelTech supports [Mac/Windows/Linux], easy to use for both geeks and non-geeks). That led me to JungleDisk, which I researched for a couple of months and have been using full-time for about 1 month. I must say that it meets all the criteria I have laid out and does it very well (or at least I have zero complaints to date).
JungleDisk uses Amazon’s S3 “data storage cloud” to sync a folder on your HD to Amazon. Note that Amazon themselves don’t really offer any real user interface to their service and instead rely entirely on third parties, like JungleDisk, to develop GUI clients that interface with their API set (which seems like a good idea to me).
The cool thing is that JungleDisk is VERY reasonable, with a one-time $20 fee for a license that can be used for a single Amazon S3 account. The way my workgroup organized ourselves is that we have a single S3 account and everyone has their own password-protected “bucket” within that account. That means it was $20 for multiple staff members to have JungleDisk. The Amazon S3 service itself is pay-as-you-go, and that is probably running around $50/month for ALL of the group’s data needs! Amazon also has a price calculator, so you can estimate very closely what you will need for storage and what it will cost you.
Let me lay out a scenario for a company of 10 to 12 people and what their needs would entail. Note that after your initial push of data from your HD to Amazon there won’t be as much data going in and out. Instead there will be more query requests to check file timestamps, to see if a file has changed on your HD and whether it should be transmitted to Amazon S3 (that’s where the 2 million request figures below come into play).
Storage: 150GB
Data Transfer-in: 30GB
Data Transfer-out: 30GB
PUT/LIST Requests: 2,000,000
Other requests: 2,000,000
Total cost for storage and bandwidth: $52.60
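The arithmetic behind that total can be sketched in a few lines of Python. Note that the per-unit rates below are assumptions on my part (they are not listed in this post); I picked them because they reproduce the $52.60 figure, so check Amazon’s price calculator for the real, current numbers.

```python
# Assumed per-unit rates in USD. These are NOT stated in the post;
# they are illustrative values that happen to reproduce the $52.60 total.
ASSUMED_RATES = {
    "storage_per_gb": 0.15,        # per GB stored per month
    "transfer_in_per_gb": 0.10,    # per GB uploaded
    "transfer_out_per_gb": 0.17,   # per GB downloaded
    "put_list_per_1000": 0.01,     # per 1,000 PUT/LIST requests
    "other_per_10000": 0.01,       # per 10,000 other requests
}


def estimate_monthly_cost(storage_gb, transfer_in_gb, transfer_out_gb,
                          put_list_requests, other_requests,
                          rates=ASSUMED_RATES):
    """Return a dict of line-item costs for one month of usage."""
    return {
        "storage": storage_gb * rates["storage_per_gb"],
        "transfer_in": transfer_in_gb * rates["transfer_in_per_gb"],
        "transfer_out": transfer_out_gb * rates["transfer_out_per_gb"],
        "put_list": put_list_requests / 1000 * rates["put_list_per_1000"],
        "other": other_requests / 10000 * rates["other_per_10000"],
    }


# The 10 to 12 person scenario from the list above.
costs = estimate_monthly_cost(150, 30, 30, 2_000_000, 2_000_000)
total = sum(costs.values())

for item, amount in costs.items():
    print(f"{item:>12}: ${amount:6.2f}")
print(f"{'total':>12}: ${total:6.2f}")
```

Under these assumed rates the PUT/LIST requests cost nearly as much as the storage itself, which is why the request counts matter in the estimate.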
My approach on my desktop is to have a folder named “sync” where I put anything I want backed up, and I just have JungleDisk back up that folder. The cool thing is that you can schedule JungleDisk to run at practically any time so it doesn’t interrupt your daily work. I have mine back up twice a day: once in the early morning and once immediately when I am done with work for the day.
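For the curious, the timestamp check described earlier can be sketched in Python against a folder like my “sync” folder. This is only an illustration of the general idea, not how JungleDisk actually works internally; the function name `changed_files` and the `~/sync` path are hypothetical.

```python
import os
import time


def changed_files(folder, last_backup_time):
    """Yield paths under `folder` modified after `last_backup_time`.

    `last_backup_time` is in seconds since the epoch, the same unit
    that os.path.getmtime() returns.
    """
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            # A newer modification time means the file needs re-uploading.
            if os.path.getmtime(path) > last_backup_time:
                yield path


if __name__ == "__main__":
    # Hypothetical usage: list files changed in the last 12 hours,
    # roughly matching a twice-a-day backup schedule.
    twelve_hours_ago = time.time() - 12 * 60 * 60
    for path in changed_files(os.path.expanduser("~/sync"), twelve_hours_ago):
        print("needs backup:", path)
```

A real backup client would also have to handle deleted files and remember the time of the last successful run, which this sketch leaves out.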
One of my friends, Mike Wills, has tried the Amazon S3 service and says there have been outages in the past for an hour at a time, so I guess your mileage may vary (though I haven’t experienced that). In the end I just want my data in a safe place, and if my internet goes down once a month (which it does because I am in a newer development with construction crews constantly “cutting the wire”) it is OK because my backup will just run an hour later.
On a final point, what do people think of cloud computing? I think it is safe enough for the SOHO (Small Office/Home Office) arena, but what about bigger businesses? Obviously huge enterprises are embracing SaaS (Software as a Service) through things like Salesforce.com, even though that isn’t on their own network where they can easily control it.



