Lotka’s Law and Open Source Software
Guest blogger Adam Messinger chimes in on some of the things we’ve found using our build, test and analysis service to observe open source projects.
The 80/20 rule is a popular aphorism in software development. It is frequently applied to code optimization (80% of the time is spent in 20% of the code) and caching (80% of the requests are for 20% of the data). What developers may not know is that the 80/20 rule applies just as well in other fields. Indeed, the rule originated in economics: Vilfredo Pareto famously observed that 80% of the land in Italy was owned by 20% of the population.
The 80/20 rule is just one example of a broader class of numerical relationships known as power laws. Much has been written about power laws and their applicability to everything from linguistics to hedge funds. Recently folks have been writing a lot about the power-law scaling of web logs and of the “long tail” of web businesses. Way back in 1926, Alfred Lotka, a statistician for MetLife, published a paper in which he observed that the productivity of scientific authors also followed a power-law relationship. Put simply, “Lotka’s Law” says that a few authors do most of the work, dragging along a long tail of less productive authors.
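To put rough numbers on that long tail: Lotka’s Law is usually stated as an inverse-square relationship, in which the number of authors with n publications is about 1/n² of the number with just one. The Python sketch below tabulates that model. The function name, the default exponent of 2, and the starting figure of 100 single-paper authors are illustrative assumptions, not data from Lotka’s paper or from Gauntlet.

```python
def lotka_author_counts(single_paper_authors, max_papers, exponent=2.0):
    """Expected number of authors with exactly n papers, for n = 1..max_papers.

    Assumes the classic inverse-power form: authors(n) = authors(1) / n**exponent.
    """
    return {
        n: single_paper_authors / n ** exponent
        for n in range(1, max_papers + 1)
    }


# Hypothetical starting point: 100 authors who each wrote a single paper.
counts = lotka_author_counts(100, 5)
for n, authors in counts.items():
    print(f"{n} paper(s): about {authors:.1f} authors")
```

Under these assumptions, for every 100 one-paper authors the model predicts about 25 with two papers and only 4 with five, which is the steep drop-off that puts most of the output in a few hands.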
Using Gauntlet Systems’ analysis tools, we can look for this phenomenon in software development as well. These tools report on the activity in a software project; you can think of them as Business Intelligence for software project managers. We’ve pulled a number of open source software (OSS) projects into the Gauntlet System environment, where it is easy to see how much each author contributes to a project. Taking two popular projects, Lucene and Hibernate, as examples, we generated the following graphs of activity over the past year.
Both of these graphs display the same type of scaling that Lotka observed in scholarly publication. Given the similarity of the types of work, this observation is perhaps unsurprising. It would be interesting to know whether this effect is limited to OSS or whether it applies to commercial projects as well. This is just one of many questions we hope to be able to answer as Gauntlet begins taking on customers. Email us if you are interested in finding the answer for your project.
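For a rough first look at your own project, the same kind of tally can be sketched in a few lines of Python. This is not how Gauntlet’s tools work; it is a minimal illustration that assumes you already have a list with one author name per commit. The names and commit counts below are made up. It reports the share of commits from the top 20% of authors and fits a straight line to log(commits) against log(rank), which is a rank-based view of the distribution rather than Lotka’s original frequency form.

```python
import math
from collections import Counter


def contribution_profile(commit_authors):
    """Return per-author commit counts (descending) and the top-20% share."""
    counts = sorted(Counter(commit_authors).values(), reverse=True)
    top_n = max(1, (len(counts) + 4) // 5)  # ceiling of 20% of the authors
    top_share = sum(counts[:top_n]) / sum(counts)
    return counts, top_share


def loglog_slope(counts):
    """Least-squares slope of log(commits) against log(rank).

    Needs at least two authors; a clearly negative slope on a roughly
    straight log-log plot is consistent with power-law scaling.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Hypothetical commit log: one entry per commit, keyed by author name.
commit_authors = (
    ["alice"] * 64 + ["bob"] * 16 + ["carol"] * 7 + ["dave"] * 4 + ["erin"] * 3
)
counts, top_share = contribution_profile(commit_authors)
slope = loglog_slope(counts)
print(f"Top 20% of authors made {top_share:.0%} of commits")
print(f"Log-log slope: {slope:.2f}")
```

A handful of authors is far too few to draw conclusions from; a real check would use the full commit history and compare the fit against other heavy-tailed distributions.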