Wednesday, May 9, 2007

Code wants to be miscellaneous

I’ve been reading the fascinating Everything is Miscellaneous by David Weinberger, and have a few preliminary ideas about how this relates to open source. The final two sentences of his prologue: “…information doesn’t just want to be free. It wants to be miscellaneous.” I’d like draw out an implication of this statement:

Code doesn’t just want to be free. It wants to be miscellaneous.

So, what’s that supposed to mean? I’m only going to make a few suggestions here, as I best understand it at the moment, and leave the implications to more study and discussion of Weinberger’s book.

Open source code is free, as we all know, like in “free speech” not “free beer." Well, it is free in this sense modulo the open source licenses morass, but that’s another post… But it is not "miscellaneous" in the sense that Weinberger uses it. For him, an important characteristic of information being miscellaneous is the ability to recombine it in ways desired by the consumer of that information. Information is allowed be an unstructured, unordered, teaming mass with rich associations. Consumers of this information take slices configured according to their wishes, and in the process both create additional information and associations.

I wonder if open source can get to this miscellaneous state? A state in which components, source files, and even parts of source code itself can be recombined as desired by consumers, and in which rich meta-data enables this usage. An example from Eclipse: there are feature definitions which bundle plug-ins into groups based on usage patterns or functional areas. While the feature mechanism is a great tool, anyone who has provided Eclipse plug-ins to enough adopters can tell you that no set of feature definitions will ever satisfy everyone. There’s just too many ways to slice and dice plug-in sets, even in smaller projects, and by the time you get to the size of most Eclipse Foundation top-level projects, the possibilities are huge. In practice most adopters just take the best approximation of their needs based on existing features, though some apparently define their own custom feature definitions. In this system, the plug-in set is structured into features at the wrong end (following Weinberger’s theory): the providers are imposing what they feel to be the appropriate feature set on adopters, whereas it should be the adopters who are defining features as desired. And I don’t mean the adopters suggest a new feature set to the Eclipse project: I mean that the adopters dynamically create their feature sets based on their needs as the moment.

Another example: what if there were an open source project where anyone could modify the code? I don’t mean “submit a patch” or “modify their local copy,” but rather an open source project where anyone could commit changes. You say that this would quickly become chaos? The code would never run? Security bugs could be injected on purpose? How can the quality/intellectual property be managed? Yes, these are real concerns today. But, to me, this points to the need for a new set of tools and a new way of working with source code. One which lets me understand the lineage, modifications, and intent of various “layers” (versions) within the miscellaneous code base. One which lets me have the stability demanded by (especially commercial) adopters, with the free-flowing experimentation of an ideal(istic?) open source community.

I can see two objections here: this whole “miscellaneous” business is a sham, and arguments that we can/are doing so already. Maybe the first objection is correct, but I’d reserve judgment until the ideas and implications of Weinberger’s book are better understood. Personally, I doubt this is a sham: on the contrary, it is pointing to something very important. For the second, I’d agree in theory, if not practice. In other words, in is possible, but not at all practical, to do these sorts of things with existing technology. But that’s not the point: the reason why we’re not still writing everything in assembly language, for example, is that higher-level languages bring abstractions and enable thinking in different ways. The same is true here: we need tools working at the right “level” so miscellaneous information can be harnessed on a routine basis, without requiring large effort, knowledge or time.