Tea Leaves » Computers

The Digital Desert Island

psu — Mon, 28 Feb 2011 22:57:51 +0000

The other day I accidentally reset the music that I sync into my iPhone. As a result, I had to wait for a long while for iTunes to copy all the music back into the phone where I had inadvertently deleted it. As it copied track after track it occurred to me that the iPod/mp3/iTunes/digital music era has made something that used to be a mainstay of lazy music writing completely irrelevant: the desert island record list.

You know the drill. Suppose while traveling you are stranded on a desert island with no way back to civilization. But, this island somehow has an independent source of infinite electricity, and you happened to pack your LP record player and some albums for your trip. Of course, since you were on a trip you can only carry ten records. Which records do you take?

Or maybe this variation: you are about to be abducted by aliens, but they will let you take as much of your music collection as you can carry. Which ten records do you take along?

Back in the 1990s, these sorts of questions made sense. This is because several hundreds of hours of music would require an entire wall of storage to hold the disks. But in the new digital era, those same hundreds of hours of music now fit in your pocket. Furthermore, in less space than it used to require to hold those ten LP records you can carry a portable hard drive that can not only hold the content of all of your music, but probably all of your books, movies, magazine archives, and every web page you ever read since the web was invented. The infinite scalability of digital storage has rendered the entire question of what to take with you completely irrelevant. You just take everything. So don’t worry about that desert island, or those pesky aliens. You will not be bored.

This is the fundamental theorem of digital content: scarcity does not create value. Scarcity just pisses off your customer. Your customer is now expecting to be able to buy and consume your product anywhere and any time she pleases. Furthermore, if you stand in the way of this expectation, she will just leave and find something else because she is carrying the entire library of everything ever published in her pocket.

In other words, there are no digital desert islands.

It’s been my observation lately that in the modern content marketplace there are two kinds of vendors: those who understand this thought experiment and those who do not. The music industry tried to ignore this truth, and they were crushed. The movie industry is trying its hardest to ignore this truth, and it seems to me that they are probably on the way down as well. Book publishers are also being shown the truth, and reluctantly starting to play along. The only modern player that I can think of that ignores this rule and gets away with it is the NFL. But off-shore streams or armies of people with Slingboxes might yet defeat that dragon as well.

The industry that actually got me thinking about this was comic books. I read the occasional superhero comic book when I was in junior high school, but my interest in them had largely waned as I became an adult. This changed when I got my iPad. The comic book reader on the iPad is great. It is one of the most perfect marriages of hardware and software to a particular style of content that the world has yet seen. It could not be any better, unless you have some emotional attachment to actually turning the pages, or actually looking at ink on paper.

However, there is one catch: the comic book industry does not actually want you, the iPad user, as a customer. They want to continue to sell comics the old way. My main evidence for this conclusion comes from having interacted with the iPad reader, marvel.com and Comixology off and on for the last eight months or so. If you try to use these outlets, the first thing you notice is that the availability of various titles is spotty, and that at best they come out several months to a year after the same titles come out in print. What’s clear from the beginning is that the comics industry has not shaken its general prejudice against delivering its content in digital form. They see it as a second class revenue stream which is not really worth their full attention. In other words, if you want to read comics primarily on your iPad, you are not dedicated enough to be a customer that they will pay attention to.

The comics people don’t even seem to be comfortable with the middle ground of me finding and ordering *print* versions of the comics in their electronic storefronts. If you hit “buy in print” in the iPad apps you get an interface that just shows you the phone number and address of nearby comic book stores at which you can buy the book. They don’t even have the decency to send you to amazon.com.

The Marvel and Comixology web sites are not much better. Assuming you can figure out how to navigate to the book you want to buy, neither site seems to do any direct sales. Or, if they do, I could not find the interface for it, so it amounts to the same thing. In addition, the registration and account management workflows on these sites are like something out of e-commerce circa 1997. The Marvel site is so bad that I couldn’t even change my contact information in their database without sending e-mail to a customer support professional to have them do it.

Over and over again the message is clear. They want you to go to the comic book store and pick up the carefully packaged little book still in its plastic baggie. They want to require you to build some trumped up relationship with a store-owner with questionable personal hygiene habits in order to have the right to consume their content. In other words, they still think limiting the distribution of their content increases its value, when really all it does is make new customers like me give up again. Time will tell if they are right. But I think that they should try harder to understand the digital desert island theorem. So far everyone else who has ignored it has lost.

The Ultimate Goto

psu — Sat, 18 Dec 2010 02:12:45 +0000

When I was in college I was something of a programming languages hobbyist. I think all young dorks go through this phase. Programming languages are fascinating repositories of different ideas for creating abstractions for constructs that programmers find themselves building over and over again. Back in the day, one of my favorite papers was the Guy Steele title whose short form is just LAMBDA: The Ultimate GOTO. The title is fantastic because it brings together several disparate trains of thought on how programming languages work and combines them into a single statement. Lambda? Goto? What do these have to do with one another? Therein lies a story.

Computers, you will recall, work by fetching a stream of instructions from some kind of memory and executing these instructions one at a time until they come to the end of the stream or the universe ends, whichever comes first. Computers would not be very interesting if all they could do was follow a single “straight line” of instructions though. If this were the case, all they would be able to do is perform the same computation over and over again. What makes computers interesting is that they can examine their input and make decisions about what to do based on what is presented to them. In programming lingo this is called “flow control.”

You need two kinds of instructions to implement flow control. First, you need some way to evaluate boolean expressions. For example, you want to be able to ask “Hey computer, is this number I gave you bigger than 10?”. Or maybe “hey computer, did I just touch the iPad screen on top of that button?”. Next, you need a way to jump from wherever you are in the instruction stream to some other location in the instruction stream based on the result of a conditional expression. This jump is what we call a “goto” instruction. You say, “Hey computer, if that value is bigger than 10 GOTO memory location 55 and begin executing whatever instruction is sitting there instead of the one that is right after me.”

It turns out that if you combine memory (that is, a way to save state), conditions and GOTO, you can compute everything that is computable… in the sense that you can emulate any computing machine that man has dreamed up in the past, and will dream up in the future. Alan Turing figured this out back in the day, but that’s a different article.
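To make the memory, conditions, and GOTO trio a bit more concrete, here is a minimal sketch in Python of a made-up machine. The instruction names (SET, ADD, JUMP_IF_GT, GOTO, HALT) and the tuple encoding are invented for this example and do not correspond to any real chip. The only point is that a "goto" is nothing more than overwriting the program counter.

```python
def run(program, memory):
    """Execute a list of instruction tuples against a dict acting as memory."""
    pc = 0  # program counter: index of the next instruction to execute
    while pc < len(program):
        op = program[pc]
        if op[0] == "SET":            # SET name value
            memory[op[1]] = op[2]
        elif op[0] == "ADD":          # ADD name amount
            memory[op[1]] += op[2]
        elif op[0] == "JUMP_IF_GT":   # JUMP_IF_GT name limit target
            if memory[op[1]] > op[2]:
                pc = op[3]            # conditional GOTO: overwrite the counter
                continue
        elif op[0] == "GOTO":         # GOTO target
            pc = op[1]
            continue
        elif op[0] == "HALT":
            break
        pc += 1                       # otherwise fall through to the next one
    return memory


# Count upward from 0 until the value is bigger than 10, then stop.
count_past_ten = [
    ("SET", "n", 0),              # 0
    ("JUMP_IF_GT", "n", 10, 4),   # 1: "if n is bigger than 10 GOTO 4"
    ("ADD", "n", 1),              # 2
    ("GOTO", 1),                  # 3: loop back to the test
    ("HALT",),                    # 4
]

result = run(count_past_ten, {})
```

There is no loop construct in the little program itself. The loop falls out of one conditional jump forward and one unconditional jump back.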

It also turns out that only having memory, conditions, and GOTO is a tedious way to go through life. Programs built on such a simple infrastructure are hard to organize and difficult to understand. What you’d like to be able to do is organize your program into smaller bits called functions and have these functions be executable from other bits of code. That is, you’d like to be able to have the computer save your spot in whatever code was currently executing and jump somewhere else to do something, and then automatically jump back to where you were in the first place. Why would you want this? It turns out that most programs need to perform many common tasks, like reading and writing files, or making connections to the Internet. Rather than making every program implement these tasks separately, if we have this “jump then return” mechanism, we can write the code once, and then whenever we need to use it, we can just jump over into that code and then return when it is finished.

Happily, most computers implement just such an instruction. In the venerable 6502 chip, that instruction was called “JSR” which means “jump to subroutine” which is a weird way of saying “jump over there, but save your place so you can return.” The 6502 had another instruction called “RTS” which basically just jumped to the last place you saved.

Most programming languages have similar high level mechanisms for building functions or procedures that use these hardware instructions. Typically a function is defined to take a few arguments that the caller provides. These can be used to change the behavior of the function as it executes. So, the function you call to handle the fact that the user just touched a button might take the name of the button that was pushed, so you know what command to run. Once you have hardware instructions like JSR and RTS, it’s pretty easy to build up a high level notion of functions. You just need to define conventions for how to manage arguments and results, which is tedious, but not complicated.
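Here is a rough sketch of the "jump, but save your place" idea, again in Python on an invented toy machine. It borrows the names JSR and RTS from the 6502 but is not a model of that chip: the other instructions (SET, ADDI, HALT) and the convention that the argument lives in a cell called "arg" and the result in a cell called "result" are assumptions made up for this example.

```python
def run(program, memory):
    """Toy machine with a return stack; returns memory when it hits HALT."""
    pc = 0
    return_stack = []  # saved "places" to come back to
    while True:
        op = program[pc]
        name = op[0]
        if name == "HALT":
            return memory
        elif name == "SET":        # SET dest value
            memory[op[1]] = op[2]
            pc += 1
        elif name == "ADDI":       # ADDI dest src amount: dest = src + amount
            memory[op[1]] = memory[op[2]] + op[3]
            pc += 1
        elif name == "JSR":        # JSR target: save our place, then GOTO
            return_stack.append(pc + 1)
            pc = op[1]
        elif name == "RTS":        # RTS: GOTO the last place we saved
            pc = return_stack.pop()
        else:
            raise ValueError("unknown instruction: " + name)


program = [
    ("SET", "arg", 10),            # 0: argument for the first call
    ("JSR", 5),                    # 1: call the subroutine at address 5
    ("ADDI", "arg", "result", 0),  # 2: feed the result back in as the argument
    ("JSR", 5),                    # 3: call the same code a second time
    ("HALT",),                     # 4
    ("ADDI", "result", "arg", 1),  # 5: the subroutine: result = arg + 1
    ("RTS",),                      # 6
]

memory = run(program, {})
```

The subroutine at address 5 is written once and called from two different places. The calling convention is the tedious part the paragraph above mentions: both sides simply have to agree on which cells hold the argument and the result.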

Early in the history of computing it was thought that function calls (or procedure calls, as Guy Steele calls them) were relatively expensive. It turns out that they were just implemented badly. This fact is the main subject of the paper referenced above. Steele notes that when you think about procedure calls correctly, all you are really doing is saving some state and then using GOTO to jump to a new place in your program. In 1977, this was a pretty radical idea.

Reading the paper, you might now think, “well, that explains the whole ‘Debunking the Expensive Procedure Call Myth’ thing, but what about ‘The Ultimate GOTO’?” Well, that’s a longer story.

Functions are so useful that a bunch of clever language designers, including Guy Steele, got to thinking about whether you could define an entire programming language that was completely centered around the idea of function evaluation rather than the more typical “set this value in this memory location and go run that code” programming structure that we are all more used to. To this end, they began to play around with a simple abstract notation called the “lambda calculus” that expresses function evaluation in a way that seems completely different from the operational jump and return dance that I described above.

In the lambda calculus, you write a function in terms of the values that it takes as arguments and the values that it returns as results. The “lambda” in the lambda calculus is an operator that binds names to values. So, you might write a simple function like this:

lambda (x) . (x + 1)

This takes a single argument “x” and returns the value that you get by evaluating the expression “x+1”. In other words, it adds 1 to the argument. You might write something like

(lambda (x) . (x + 1)) 10

which will evaluate the function we wrote with the argument “10”. First, the value “10” is bound to the argument “x”. Then we evaluate the expression in the function itself, and we get 11.

Surprisingly, it turns out that if all you have is some rules for binding and evaluation and a few primitive functions, you can take any program at all and translate it into the lambda calculus. But that’s a subject for a course in theoretical computer science. Not so surprisingly, actually writing programs in lambda calculus gets tedious quickly. As with the primitive machine language, you need some higher level languages that let you organize programs into smaller bits that are more easily understood. One such language is called Scheme and happens to be the one that Guy Steele was interested in at the time he wrote his paper.

Scheme programs look a lot like lambda calculus. The function above might be written like this

(define add-one (lambda (x) (+ x 1)))

Then when you evaluate the expression

(add-one 10)

you’d get back the value 11. Easy. Scheme defines various rules for binding values to arguments, and you can think of the evaluation engine as just a fancy and more featureful version of the simple lambda calculus.

One of the more novel ideas implemented in Scheme was the notion that functions themselves would be manipulated as primitive values in the language. This is a natural outgrowth of the language’s basis in the lambda calculus. Consider the code above. What we are really doing there is taking the name add-one and binding it to a value which is the function defined by the lambda expression. There are some tricky mechanical issues involved in implementing a mechanism like this. The main issue is that you need a way to capture bindings for all names that appear in the body of the function, even those that are not defined as arguments to the function. I’m not going to get into the details of where such bindings come from, or exactly how you implement this capture scheme. Let’s just assume that we have a magic box that does the right thing, and let’s call that box a “closure”.
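For a feel of what the magic box does, here is a small sketch in Python, which also treats functions as values. The names make_adder and add are made up for the example. The thing to notice is that n is not an argument of the inner function, yet each function value remembers the n it was created with.

```python
def make_adder(n):
    # n is a free name inside add: it is not one of add's arguments, so its
    # binding has to be captured when the function value is created.
    def add(x):
        return x + n
    return add  # the function itself is returned as an ordinary value


add_one = make_adder(1)   # a function value with n bound to 1
add_ten = make_adder(10)  # same code, different captured binding

eleven = add_one(10)
twenty = add_ten(10)

# CPython lets you peek inside the box: the captured binding is stored
# alongside the code of the function.
captured = add_one.__closure__[0].cell_contents
```

Both add_one and add_ten share the same body. The only difference between them is the captured binding, which is the part that the closure carries around.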

In other words, an expression like (lambda (x) (…)) constructs a special object which first captures bindings for all the names in the body of the function and then transfers control of the program to the function itself. But wait. That sounds a lot like the simple procedure call mechanism that we defined on our simple memory and GOTO machine. In the context of this paper, the phrase “The Ultimate GOTO” is used to illustrate that while procedure calls and GOTOs seem very different, in fact they are not.

But there is more to it than this.

Recall how our simple abstract machine implements function calls:

1. Save values for arguments.

2. Save location to return to.

3. GOTO the code for the function

4. At the end of the function, save the return value of the function and then GOTO the location you saved in step 2.

Suppose we think about this process slightly differently:

1. Save values for arguments

2. Save a function value that represents a function to call with the result

3. GOTO the code for the function

4. At the end of the function, call the function value you saved in (2) with the result of your computation as an argument.

The new forms of steps (2) and (4) seem on the surface to be different than before. But really they are not. As we have already seen, function calls and GOTOs are really the same thing. It turns out that this is a pretty old idea, and the theorists call the function that we create in step (2) a continuation.
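The two recipes above can be sketched side by side in Python. The function names here are invented for illustration. In the first version the "location to return to" is handled invisibly by the language. In the second it is an explicit function value, called k by convention, that the callee invokes with its result.

```python
# First form: the return location is saved for us behind the scenes.
def add_one(x):
    return x + 1          # step 4: hand back the value, jump to saved place


def square(x):
    return x * x


direct = square(add_one(10))


# Second form: the caller passes along "what to do with the result".
def add_one_k(x, k):
    return k(x + 1)       # step 4: call the saved function with the result


def square_k(x, k):
    return k(x * x)


def finish(value):
    # The last continuation in the chain just hands the value back.
    return value


# "Add one to 10, then square that, then finish."
with_continuations = add_one_k(10, lambda v: square_k(v, finish))
```

Both versions compute the same thing. The second one just spells out, as an argument, the part that the first one leaves to the machinery of procedure return.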

In later papers on Scheme, Steele and others observed that you could create very efficient implementations of Scheme by structuring the runtime to transform procedures and procedure calls into what they called “continuation passing style”. All this means is that all of the functions are transformed into something like the second form above. In other words, all of the code in a Scheme program is twisted around so that all the function calls have an extra argument that is a function value that represents “where to go next.”
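As a hedged illustration of what such a transformation does to a recursive function, here is factorial written by hand in both styles in Python. This is only a sketch of the shape of the result, not of how any Scheme compiler actually performs the transformation, and Python itself gains nothing in efficiency from it.

```python
def factorial(n):
    if n == 0:
        return 1
    # The multiplication happens after the recursive call returns, so the
    # language has to remember to come back here.
    return n * factorial(n - 1)


def factorial_cps(n, k):
    if n == 0:
        return k(1)
    # Nothing is left to do here after the recursive call. The pending
    # multiplication is packed into the "where to go next" function instead.
    return factorial_cps(n - 1, lambda result: k(n * result))


def identity(value):
    return value


plain = factorial(5)
transformed = factorial_cps(5, identity)
```

In the transformed version every call is the last thing its caller does, which is what lets an implementation treat each call as a plain jump with arguments.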

But, Scheme programs are nothing but function calls, so this means that the “where to go next” function is always available to the runtime. It’s sitting right there, since we created it to implement the function call in the first place. Therefore, Scheme also defined a special construct called “call with current continuation” (or call/cc) that allowed the programmer to explicitly capture the “where to go next” function and pass it wherever you wanted. When called, this captured function would restore the state of the program to be exactly the same as it was when the function was captured. This is a fantastically powerful and psychotic mechanism. Having access to the current continuation lets you capture and manipulate the control state of your programs any way you want. Iteration, recursion, exception handling, multiple threads of control and any other control construct that you can imagine can be implemented using this mechanism. In other words, lambda really is the ultimate GOTO.
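Python has no call/cc, so the following is only a partial imitation, with loud caveats. It models the simplest use of a captured continuation, jumping out of a computation early, by abusing exceptions. The helper name call_with_escape is invented for this sketch. Unlike a real Scheme continuation, the captured function here can only be used once, and only while the call that created it is still running.

```python
class _Escape(Exception):
    """Carries a value back to the call_with_escape that owns the tag."""

    def __init__(self, tag, value):
        super().__init__()
        self.tag = tag
        self.value = value


def call_with_escape(f):
    """Call f with a function k; calling k(v) makes this call return v."""
    tag = object()  # unique marker so nested uses do not get confused

    def k(value):
        raise _Escape(tag, value)

    try:
        return f(k)
    except _Escape as escape:
        if escape.tag is tag:
            return escape.value
        raise  # belongs to some outer call_with_escape


def first_negative(numbers):
    def body(k):
        for n in numbers:
            if n < 0:
                k(n)      # "GOTO the place where we were captured"
        return None       # ran off the end without escaping
    return call_with_escape(body)


found = first_negative([3, 1, -4, 1, -5])
missing = first_negative([1, 2, 3])
```

Even this crippled version shows the flavor: k behaves like a function, but calling it does not return to the caller. It transfers control somewhere else entirely.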

Extra Notes

Scheme is not the only language that has call/cc. ML is another famous one.

Closures have made their way into more mainstream languages: Java, C#, and Objective C among others all have constructs that are similar to closures. As far as I know, there isn’t really anything like continuations outside of the functional languages, although setjmp/longjmp in C is similar, but not as “clean”. This is probably for the best, since esoteric mechanisms for creating odd flows of control tend to be used only for evil.

I had always assumed that the idea of the continuation had originated with the work on Lisp and Scheme, but I was wrong. It’s actually a much older idea, as discussed in this paper by John Reynolds, the notable programming languages researcher at CMU. The use of continuations in implementing Scheme is discussed in this paper by Sussman and Steele.

Who said weblogs aren’t educational?

Snapped

psu — Wed, 15 Dec 2010 02:34:08 +0000

Dear Internet Forums: While I love you for the wonderful capacity you have for the intelligent and thoughtful exchange of useful information with others from around the world, I am wondering if you could do me one little favor.

I’m wondering if y’all could simply keep track of which god-damned messages in a thread I have actually read rather than just the threads I’ve looked at since the last time you stamped a cookie on me? I mean, is this really that god-damned hard? After all, USENET did this all the way back in 1984.

Thanks!

Crossing the Chasm

peterb — Wed, 24 Nov 2010 01:52:26 +0000

Thomas at Mile Zero recently wrote a piece called The Console Model is a Regressive Tax on Creativity. I think Thomas is wrong. Here’s my reply.

Thomas,

Your attempt at a conceptual leap from “There exist platforms that limit the amount of hacking that can be done” to “Those limitations are a barrier to entry for minorities” rivals Evel Knievel’s storied jump over the Snake River Canyon, and ultimately it is no more successful.

First off, encoded in your article are the following assumptions:

…that coding or “hacking” is the most meaningful way to interact with technology.

…that “running homebrew code” is a valuable form of experimentation for a statistically significant number of people.

…that making a platform more amenable to end-user coding is cost-free (and here I’m using “cost” not in the monetary sense, but in the sense of “cost to the end user in terms of the usefulness of the platform”).

All three of these assumptions are not simply incorrect, but woefully so. To borrow the words of Wolfgang Pauli, “This isn’t right. This isn’t even wrong.”

Regarding the first assumption: while those of us in the technology sector like to romanticize our early experiences with our various Apple IIs, TRS-80s, and other hackable platforms, the fact is that the skills we learned from doing that work had a narrow effect: they made us marginally more likely to enter into one particular career track in the tech industry. There are thousands of other equally valid, fulfilling, and financially rewarding career tracks in the tech industry that don’t involve the specific aspects of software development you are rooting for. Put another way, the assumption that hacking is an intrinsic good in and of itself seems to me morally equivalent to bemoaning the fact that most people who buy modern cars aren’t using them to brush up on their auto mechanic skills.

Regarding your second assumption: to the extent that one decides to make the (incorrect) choice that Everyone Should Be A Computer Engineer, it’s not clear to me that bemoaning the existence of less-hackable platforms is in any way effective. Risking a second analogy, you are complaining that it’s hard to use hammers as screwdrivers, and really, those hammer manufacturers are holding us back by not putting a Phillips-head socket on the handle.

Regarding your third assumption, end-user electronic appliances have expected use cases. Presumably, both the people selling these devices and the people spending money on them think that these use cases are important. Making a platform “hackable” is not simply a matter of “don’t put DRM on it”. In many cases, what you call “hackable” I call “dangerous”, in the sense of a gun manufacturer who sells a handgun without a safety. If leaving a platform “open” or “hackable” means that developers should compromise the experience of 98% of their users for the benefit of the 2% of users who desperately want to use their hammers as screwdrivers, then I have to say that “open” sounds like perhaps one of the worst ideas I’ve heard of. If, on the other hand, you’re positing that we should refuse compromise, and perfectly meet the needs of 100% of everyone, and make an open platform that in no way compromises the user experience, then I look forward to eventually eating and/or cleaning my kitchen with your combination floor wax/dessert topping.

In summary: even if I accept the (flawed) premise that “careers in technology” are driven by, specifically, the ability to write software, you have in no way made the case that degrading people’s experiences with technology is the best way to get them to write more software. Optimizing technology for openness is almost never free, and if in making a product more hackable you end up interfering with the job the product was intended to do, you will, in the long run, do more harm than good.

Open platforms will always exist. This is a good thing. Equally important is that there will always be platforms whose primary concern is that of the experience of the user, and not the desires of the tinkerer. To elevate the tinkerer’s needs above the needs of those who need the product is, to be blunt, vaguely silly fetishism of technology as an end in itself.

Stream The World Cup

peterb — Sat, 12 Jun 2010 21:29:06 +0000

Recently I wrote about streaming movies on demand. I’m interested in this not because I care overmuch whether content is “streamed” or “downloaded.” Rather, what I desire is “deliver the content I want, when I want it, where I want it, with a minimum of effort.” Today I had another encounter with The Future™ that I think is worth mentioning.

It happened when I was driving into the city during the early World Cup game today. I sort of wanted to be able to listen to the game. My first instinct was to use the XM radio installed in my car. I haven’t kept my XM subscription up, because I generally prefer the music on my iPod or iPhone. But, I thought to myself, perhaps I can activate XM for this month, just for the World Cup, and then cancel when it’s over. When I arrived at my destination, I looked into this option.

In order to do this, I went to XM’s web site, which was a mistake. What I learned from their web site is that they can’t actually tell me how much their service costs, and if I want to buy it I have to type in my credit card and a bunch of numbers identifying my radio, and so on.

As I was looking all this stuff up on my iPhone, a question suddenly popped into my head: “Gee, I wonder if ESPN has an app that does this?” Sure enough, within 30 seconds I had downloaded ESPN’s free 2010 World Cup app. 10 seconds after that, I had made an in-app purchase – for about $8 – that gave me access to streaming audio for all 64 World Cup games. No typing in my credit card numbers, no creating an account, no recurring subscription, no email confirmations. One tap to download the app, one entry of my iTunes password, one tap on “yes” to agree to the in-app purchase, and one tap on “Yes” to confirm the charge. Within 1 minute I was listening to Argentina tussle with Nigeria.

That is how I want my digital transactions to work. If you are selling a similar service, and it’s more work for your customer than that, you are doing it wrong.

Dear Western Digital

psu — Thu, 10 Jun 2010 01:58:57 +0000

What I want to buy from you is a hard drive. I want to buy a hard drive from you because you are good at building drives. I do not want to buy software from you because you suck at building software. Here are some things not to do.

1. Do not waste 500MB on a hidden partition with shitty software on it that I will never run.

2. Do not make this 500MB hidden partition undeleteable.

3. Do not write firmware that tries to use the 500MB hidden partition to create a phantom “CD Drive” that is supposed to mount whenever I plug the drive into my computer. Especially do not do this if the fucking fake CD drive device never mounts.

4. Do not make me download special software to disable your brain dead fake CD drive device firmware bullshit that doesn’t actually work.

5. Do not make me download more software to do nothing more than reset some display text on the front of your drive because you think I want to stare at “VIDEOS 09” whenever I look at the drive.

6. Do not make said second software download also install some more brain dead backup software that runs as root on my machine. There is no fucking way I will run your backup software. You can’t even make a fake CD drive that works. I already have backup software. The only reason I bought your drive was to write data to it from my backup software.

At least you made it possible for me to uninstall your dipshit backup software.

Anyway, never buy a Western Digital external hard drive again. Go to Other World Computing instead. Their drives just have bits on them. And some software you can just delete and doesn’t otherwise get in your way.

We are Dusty Again

psu — Fri, 04 Jun 2010 11:51:17 +0000

Please bear with us. We hit the “upgrade” button on WordPress. As you know, the upgrade button on WordPress actually means “break the entire god-damned weblog until we can hand edit all the completely hateful PHP and CSS so that the site almost looks like it did before, but not quite”. If you have been paying attention you’ve noticed that the site has been looking incrementally more ragged over time. This happens every time we hit the upgrade button. This is because fundamentally WordPress sucks.

Stream The World

peterb — Tue, 25 May 2010 23:50:29 +0000

10 years after canceling my account for the first time, I have resubscribed to Netflix.

I canceled my account, all those years ago, not because of any flaw in Netflix’s service, or because it wasn’t worth the money, but because Netflix was making me crazy. I had to watch the movies. I had to watch all the movies, and I had to watch them immediately. Friends would come over: “Hey, want to go out and see Bob?” “Can’t. MUST WATCH MOVIES so that I can RETURN THEM.” The fact that I wasn’t actually enjoying all this panicked watching was sort of beside the point. Somehow it tripped a wire in my brain. I eventually realized this wasn’t healthy for me, and I dropped the subscription.

The arrival of the Netflix app on the iPad brought me back. I’m sufficiently steeped in movies at this point that I’ve come around to the psu way of thinking about Netflix: you’re not paying for the subscription to watch the movies, you’re paying for the subscription so the movie can sit on your end-table, unwatched, for 6 months while you don’t worry about it. And the ability to stream movies to the iPad, to the computer, or to the Xbox is pure love. I actually watched half of a movie on the iPad, then later started watching it on the Xbox, and Netflix remembered where I was and picked up where I left off. That gave me a little thrill.

Conceptually, streaming these things should be the ideal way to go. When it works, it is a little like living in The Future. The quality is good (or at least “good enough”), there’s no physical media management to worry about, and I don’t have to think about getting to the Post Office. In reality, the selection of movies available for streaming is a bit thin. It’s definitely enough to find something to watch if you’re bored, but the percentage of movies that can be streamed is still comparatively minuscule. Interestingly, this is true not just of Netflix, but also of various other online movie rental services, such as iTunes or Amazon MP3. This, for me, has been the frustrating part of the movies-on-computers experience: there are some movies that I want to watch, without having to touch a DVD, that I am ready and willing to pay for, but Hollywood can’t figure out how to take my money.

This doesn’t just apply to movies, but also to other forms of media. I can’t tell you the number of books I have wanted to buy electronic copies of that aren’t available. And I don’t mean “not available on the iBookstore,” or “not available on the Kindle store,” but not available anywhere. The only way someone will sell me some of these books is if I let someone take an axe, chop down a tree, soak it in chlorine bleach, and stain it with the processed juice of long-dead animals and plants. It’s like the publishing industry is composed entirely of ignorant, filthy savages. My solution to this is “I don’t buy the book.” They don’t get my money, and my local library gets more use. Everyone wins except the barbarian who can’t figure out how to take my cash.

We are so close to my ideal media world that it almost hurts. We need to get to a place where we can (legally) watch any movie (or read any book, or hear any song) ever made, wirelessly, with no physical media, on whatever screen we happen to be sitting in front of.

Lose The Disk

psu — Wed, 21 Apr 2010 12:03:34 +0000

Long time readers of this site will remember (or not) that I’ve been slowly working my way through my CD collection and adding it all to iTunes. Over about the last three years, I’ve gone from having 140 albums filed to the current figure of around 375. The total number of disks is somewhat higher, because a lot of my collection is large boxed sets that I hardly listen to. Because I’m stupid. Anyway, at the current rate, I should be finished with this project in about 2017. A week or two ago I came to an interesting epiphany about the whole endeavour: what I should do is rip the disks and then throw them away. I’m just going to lose them anyway.

The event that brought about this conclusion was one that has happened two dozen times while I have been filing my CDs. Here is how it goes:

1. I see album A already ripped in iTunes, but at a hideously low bitrate.

2. I look for album A on my shelf.

3. Album A is gone, because there is nothing in my life easier to lose than a physical disk with digital data on it.

4. Go and buy another copy of the CD and rip that.

This finally came to a head when I temporarily lost two boxed sets which contained about 10 disks each. The boxes finally did turn up (they had been misplaced during some renovation on the house), but the fact that two large repositories of digital music could just disappear like that got me to thinking about my position on the purchase of downloads in general.

In the past I have had two main objections to music (or other) downloads:

1. I tend to believe that I will lose the data and have no recourse.

2. I have a fear of the “lower bit-rate” format. The files you download are more compressed than the music you buy on disk, so I’d like to have a copy of the “full bit-rate” recording around “just in case.”

I now realize that my first objection is complete bunk and my second objection is probably meaningless.

While ripping my music collection, I think that I have discovered one sobering fact: I have lost an order of magnitude more data on physical media than I have ever lost on a sealed disk drive in my various laptop and desktop computers. This is because I am very careful about preserving data on hard drives. That said, preserving data on hard drives is easy: you just make lots of copies.

In contrast, it seems to me that keeping track of physical disks is a lot more complicated. You need shelves. You need to make sure you put the disk back in the same place you found it every time you use it. This works OK for disks you never listen to (like my boxed sets) or large disks that you would step on if you were not careful (like my LPs). But, CDs and DVDs just get lost. It is apparent to me that I don’t have the space in either my physical or mental life to actually keep track of them. Therefore, they get lost. And when they get lost, I replace them and then lose the replacement. I bet I have six copies of Kind of Blue lying around the house in various places. This is stupid.

As for bit rate, I made a discovery there too. I ripped a couple of my out-of-print CDs in Apple Lossless in a fit of misplaced paranoia. What I discovered was that Apple Lossless averages about 500Kbps. Songs from the iTunes store (and the stuff I rip myself) come in at 256Kbps (in AAC format). It is my solemn belief that doubling the number of bits will not make a material difference for me and my sound reproduction equipment. I have made an executive decision to throw half the bits on the floor because I don’t care anymore.
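To put rough numbers on "half the bits," here is a small Python sketch that turns the two average bit rates above into file sizes. The 45 minute album length is an assumption for illustration, not a figure from my collection, and `megabytes` is just a helper name made up for this example.

```python
def megabytes(kbps, minutes):
    """Approximate size in MB (10**6 bytes) of audio at an average bit rate."""
    bits = kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


ALBUM_MINUTES = 45  # assumed album length, for illustration only

lossless_mb = megabytes(500, ALBUM_MINUTES)  # Apple Lossless, about 500Kbps
aac_mb = megabytes(256, ALBUM_MINUTES)       # AAC at 256Kbps
ratio = lossless_mb / aac_mb

print(f"lossless: {lossless_mb:.1f} MB, AAC: {aac_mb:.1f} MB, ratio: {ratio:.2f}")
```

Under that assumption the lossless rip is a bit under twice the size of the AAC one, which is the "doubling" I decided not to care about.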

These two conclusions have a large impact on how I view my little CD ripping project. In the past, I had viewed it as a way to convert “my collection” into a more convenient form for listening, while preserving the “base data” somewhere else so I would not lose it. Now, I think, I have tipped over and realize that the iTunes database is the entire point and the truth is that I don’t want the disks at all (except for the liner notes, dammit!).

This means that increasingly often, my workflow in the project has become:

1. Find a disk or disks I need to rip.

2. Realize that I will spend 15 minutes per disk getting the data into iTunes, editing meta-data, and filing things.

3. Notice that the album is at the iTunes or Amazon store.

4. Just buy it and download it.

Incredibly, this flow is even possible for a large percentage of the more obscure classical titles that I own.

My final conclusion is this: forget about the disk. It just doesn’t matter. In fact, I’ll go further and say that all of this holds, only doubly so, for disk-based movie formats. The only thing allowing Blu-Ray and DVD to tread water against the inevitable is that the movie companies have done a better job of locking the rights down and making it hard to find movies for download. But these days, given the choice between buying/renting a movie on disk or just downloading the thing I will just download the thing 99.99% of the time.

I’ll only buy Blu-Rays if they come with the digital file as well. That way I have something to watch when I lose the disk.

P.S.: None of this applies to LPs, because LPs rule! I just like staring at them in all of their oversized plastic glory. So there.

Backups

psu — Tue, 30 Mar 2010 12:13:39 +0000

You need to do backups. Back when computers were primitive and useless, backups were not as much of a problem because no one was storing anything in the machines. But these days, computers have grown up and it’s possible that your entire life is stored in your laptop. So you should back it up. I came up with the following backup scheme primarily to archive my digital pictures. But I think it’s useful for most general use too. The scheme sounds complicated, but it’s actually fairly simple.

Rule number one about backups is simple: don’t think. Back it all up. The less you have to think the better.

Rule number two about backups is that the best place to put them is on an external hard disk. In the old days, optical disks were big enough for people to consider using them, but DVDs only hold at most 8GB each, which means it would take three or four to do a full backup of a circa 2001 Apple Powerbook with a 20GB drive. Disk storage has grown by one or two orders of magnitude since then, so you are talking about burning anywhere from a couple dozen to more than a hundred DVDs for a backup. I was too lazy to even do four back in the day. So avoid optical disks because they make it too hard to apply rule number one (back up everything).
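As a rough sanity check on the DVD arithmetic, here is a minimal Python sketch. It uses the 8GB per disk figure from above; the 200GB and 1TB drive sizes are assumed examples of "an order of magnitude bigger" and "a terabyte drive," and `dvds_needed` is a made-up helper name.

```python
import math


def dvds_needed(drive_gb, dvd_gb=8.0):
    """Number of DVDs needed to hold a full drive, ignoring compression."""
    return math.ceil(drive_gb / dvd_gb)


for drive_gb in (20, 200, 1000):  # 200 and 1000 are assumed example sizes
    print(f"{drive_gb}GB drive: {dvds_needed(drive_gb)} DVDs")
```

Whatever the exact count, it is far more disks than anyone is going to sit and burn.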

The best thing about external disks is that they get bigger every couple of years. So every couple of years you buy a new set and copy all the old data to the new disks. This is rule number three about backups: no matter how you store the backups, the bits will rot. The only way to make sure they do not rot is to copy them over and over again. Copying everything to a new set of disks every year or two is therefore a good idea. Don’t worry about the cost: you can buy a terabyte external drive for $100 now. Compared to losing your entire digital life, that’s free.
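If you want some reassurance that a fresh copy really matches the old one, comparing checksums is one way to do it. This is not part of my routine as described here, just a minimal sketch using Python's standard `hashlib`; the function names are made up for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copy):
    """True if both files have identical contents (by SHA-256)."""
    return sha256_of(original) == sha256_of(copy)
```

Calling `copies_match(old_file, new_file)` after moving data to a new drive tells you whether the bits survived the trip.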

The next best thing about external hard drives is that they are so cheap you can buy lots of them and use them for different backup purposes. At a minimum you should have two external drives that are about the same size as the main drive in the machine you are concerned about. My main machine is a laptop, so here is what I do:

1. Whenever I get a new laptop, I buy two bus-powered portable drives that are about the same size as the main drive in the laptop.

2. When I get the laptop set up, I take one of these drives and make a complete bootable copy of the drive in the laptop. On Macs, there are various pieces of software that allow you to do this easily. You can use Disk Utility to do it. I use something called SuperDuper! because it has some other handy features. I then label this drive as the “mirror” drive. I carry this drive on trips in case the disk in the laptop goes bad and I need an emergency boot drive.

The mirror drive is also handy for when you need to undo some disaster that you have perpetrated on your main machine. You can mirror your machine before every major OS update, for example, in case you need to back it out. If it all goes bad, you can boot from your mirror and copy it all back to the main drive on the laptop. Macs make this easy. I would assume and hope that there is some way to do this in Windows too, but I don’t really know.

3. I use the second external drive for Time Machine. Time Machine is handy for automatically keeping track of all the new files I’ve created without me needing to do anything. And it keeps multiple versions of all those files. So in a disaster, I can boot with the mirror drive and restore all my data from the Time Machine drive if the mirror drive is out of date.

My backup scheme follows this main pattern. I generally do a backup whenever I add a batch of pictures to my laptop. I do this a lot, and digital photographs make up the bulk of the data that I want to back up, so this makes sense for me. Here is how the flow works.

1. Load pictures into laptop. Do all the picture workflow stuff on them.

2. Hook up the mirror drive. Sync the new data to the pictures folder there. SuperDuper! has an incremental update feature which makes this easy.

3. Hook up the Time Machine drive, let Time Machine do its thing.

At this point, the first level backups are done. But, there is more to consider.

In addition to the two small drives, I also have two large desktop drives that have my complete photo archive on them. The laptop and portable drives really only have enough room for a year or two of pictures, and I’m up to eight years of pictures now. These two large drives are specifically dedicated to the photo archive. So whenever I add pictures, I use a script to copy the new pictures to these drives. In addition, I keep one of these drives at work, just in case my house burns down. This forms the second level of my backup archives.
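The script itself is nothing special. Here is a minimal Python sketch of the idea, not the actual script: it copies any file that is missing from each archive drive and never overwrites what is already there. The function name and the example paths in the comment are hypothetical.

```python
import shutil
from pathlib import Path


def copy_new_files(source, destinations):
    """Copy files under source that are missing from each destination.

    Returns the list of newly created target paths.
    """
    source = Path(source)
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        relative = src_file.relative_to(source)
        for dest_root in destinations:
            target = Path(dest_root) / relative
            if target.exists():
                continue  # already archived; leave it alone
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, target)  # copy2 also preserves timestamps
            copied.append(target)
    return copied


# Example call with hypothetical paths:
# copy_new_files(Path.home() / "Pictures",
#                ["/Volumes/PhotoArchive1", "/Volumes/PhotoArchive2"])
```

Running it a second time with nothing new copies nothing, which is what makes it safe to run every time I add pictures.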

As a side note, the one time I had a serious disk crash (knock wood) on my laptop, it was the work backup drive that saved me from losing about a month of photos. I generally bring my laptop to work about once a week to sync it.

The fourth rule about backups is that if you can make two backups of something, it’s just as easy to make four. I have a desktop machine at home that is used mostly for iTunes. But I also use it as a third level of backups. Whenever I add pictures to the mirror drive, I also sync them to my home directory on the iMac. Then I let Time Machine on the iMac copy them to yet another external drive.

So now each new photo is stored seven times:

1. On the laptop.

2. On pocket drives #1 and #2.

3. On desktop drives #1 and #2.

4. On the iMac.

5. On the Time Machine disk for the iMac.

When I run out of space on the laptop and the smaller external disks, I delete older photos. So for the long term, I have the entire photo archive stored four times. As before, every couple of years I upgrade all these disks.
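The counting above is easy to check. Here is a small Python tally using the locations from the list; the dictionary keys are just labels chosen for this example.

```python
# Copies of each new photo, by location (from the list above)
copies = {
    "laptop": 1,
    "pocket drives": 2,
    "desktop archive drives": 2,
    "iMac": 1,
    "iMac Time Machine disk": 1,
}

# Locations where older photos get deleted when space runs out
pruned = {"laptop", "pocket drives"}

short_term = sum(copies.values())
long_term = sum(n for place, n in copies.items() if place not in pruned)

print(f"new photos: {short_term} copies, long-term archive: {long_term} copies")
```

That gives seven copies of anything new and four copies of the full archive, not counting any online backup.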

Most recently, I also decided to sign up for one of these online backup services. I picked Backblaze. I told it to back up my iMac to some data center somewhere. So if Pittsburgh burns down and takes all of my local drives with it, I can still call them up and get my photos back. It only took a month for the iMac to copy everything over the network.

To summarize, there are four things you need to know about backups:

1. Back up everything. Even what you don’t need. Back up the important stuff more.

2. Use disk drives. Other media are too inconvenient, which means you won’t make backups.

3. You must make new copies of the data every year or two. Conveniently, you need new disk drives once in a while anyway.

4. If you are going to back everything up once, you might as well do it twice. And if you are going to do it twice, you might as well do it four times.