An Open Letter to the Maven Central Repository Maintainers:
My first impressions of Maven were negative, to say the least. Where previously I had seen all manner of Open Source projects loosely following the same conventions, suddenly I found projects that all looked the same. Their websites were almost identical, their deliverables all followed the exact same convention, and it was becoming increasingly difficult to tell them apart. Obviously there are a number of benefits that come from such uniformity, but all I could see was that Maven was infringing on the creative process of these Open Source projects.
Coding is, without doubt, a creative process and Open Source is perhaps the most visible expression of that process. Open Source offers software makers an outlet for their creative expression, with the only real tangible goal being the evolution of the software itself. Unlike many other creative activities, however, software typically has a purpose beyond its creation: applications are designed to provide benefit to the end-user, and software libraries are designed to be useful to other software makers. Without these less tangible goals, many software projects would not have a reason to exist.
There is perhaps no better example of this than the wide array of Open Source Java libraries available today. These libraries are now at the heart of most Java-based projects, and I would be very surprised to learn of any Java project that did not depend on Open Source in some form. With these dependencies being such an integral part of the Java development process, we definitely need conventions that simplify their management. In the past most projects have used Ant as the basis of their build, package and distribution process, a tool that has worked reasonably well because different projects adopted similar conventions that we all understand. However, Ant does not address the issue of managing the growing list of dependencies we use in our projects, which has always been essentially a manual process. Upgrading to newer versions of dependencies can be quite a headache in Ant-based projects, depending on how sophisticated your build scripts are. We have also found that in order to provide a consistent environment for these projects, we have been forced to store these dependencies in our source control repositories, for lack of a better place.
So the rise of dependency management tools in Java projects is inevitable, due to the obvious benefits they provide to the development process. But where Maven appears to go beyond the aims of other dependency management tools is that it attempts to impose strict controls over the build, package and distribution process. At least, those were my first impressions anyway.
Since exploring Maven2 I have been pleasantly surprised by the fact that the configuration defaults are just that - defaults! Just about anything can be overridden in a Maven project descriptor (POM), allowing the developer to retain a certain level of creative freedom over their project. Obviously the package and distribution conventions cannot be altered to a large degree, and generally this is a Good Thing, as interoperability with other projects (third-party libraries, etc.) should remain of primary importance when developing Java-based projects. So to this end I have come to think of Maven as offering a benevolent dictatorship for Java dependencies, whereby strict control is only imposed for the benefit of the dependency management process.
Recently, however, I have come to a different conclusion, one which I have been dreading since I first noticed Maven. It seems to me now that this dictatorship is perhaps not as benevolent as I first thought. This new conclusion is derived from a recent discussion I had with one of the Maven Central Repository maintainers, in an attempt to load my Open Source project libraries into the master repository for distribution to the general public. It appears that the Maven repository maintainers are not happy with the chaos and disorganisation in the master repository, and have decided to impose even greater control over how you distribute your project. In the past you were welcome to contribute your project using any Group Id/Artifact Id, as long as it was unique! That uniqueness is very important for distinguishing between one project and another, and to this point it has served Maven’s goal of distribution reasonably well. Now it seems the repository maintainers are not happy with this approach, as identified in this note on their website:
IMPORTANT considerations about the groupId: it will identify your project uniquely across all projects, so we need to enforce a naming schema. For projects with artifacts already uploaded to ibiblio it can be equal to the previous used, but for new projects it has to follow the package name rules, what means that has to be at least as a domain name you control, and you can create as many subgroups as you want. There are a lot of poorly defined package names so you must provide proof that you control the domain that matches the groupId. Provide proof means that the project is hosted at that domain or it’s owned by a member, in that case you must give the link to the registrar database (whois) where the owner is listed and the page in the project web where the owner is associated with the project. eg. If you use a com.sun.xyz package name we expect that the project is hosted at http://xyz.sun.com.
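To make the rule in that note concrete, here is a minimal Java sketch of the mapping it implies between a Group Id and the domain you would be asked to prove you control. This is only my reading of the note, not anything the repository maintainers actually run; the class and method names are hypothetical, and the "every prefix of two or more segments" interpretation of "subgroups" is an assumption on my part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class GroupIdDomains {

    // Reverse the dot-separated segments of a Group Id,
    // e.g. "com.sun.xyz" becomes "xyz.sun.com".
    static String reverseSegments(String groupId) {
        List<String> parts = new ArrayList<>(Arrays.asList(groupId.split("\\.")));
        Collections.reverse(parts);
        return String.join(".", parts);
    }

    // List the domains a Group Id could be claimed under (assumed reading
    // of the note): every prefix of at least two segments, reversed.
    // Anything after the prefix would be a "subgroup".
    static List<String> candidateDomains(String groupId) {
        String[] parts = groupId.split("\\.");
        List<String> domains = new ArrayList<>();
        for (int n = 2; n <= parts.length; n++) {
            String prefix = String.join(".", Arrays.copyOfRange(parts, 0, n));
            domains.add(reverseSegments(prefix));
        }
        return domains;
    }

    public static void main(String[] args) {
        // The example from the note itself.
        System.out.println(reverseSegments("com.sun.xyz"));   // prints xyz.sun.com
        System.out.println(candidateDomains("com.sun.xyz"));  // prints [sun.com, xyz.sun.com]
    }
}
```

Note that a single-segment Group Id yields no candidate domain at all under this reading, which is exactly the kind of project the new rule shuts out.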
Now, aside from the fact that they are basing this restriction on a Java package name convention that is not itself enforced, we must ask what the perceived benefit of this draconian measure is. Will it help Java developers find the dependencies they require? No. Does it make it any easier to navigate the master repository? Not at all. Does it provide any real tangible benefit to Java developers? Definitely not! The only reason I can see for this is that the repository maintainers felt that the structure was a bit messy!
Some may argue that it helps to identify the correct Group Id when you know the domain under which the project is hosted. This is simply not true: assuming that the project is actually hosted in the Maven repository, you can be assured that instructions on the Group Id/Artifact Id will be provided somewhere on the project site (especially if they choose to use Maven’s uniformly-generated site!). In fact there is not even a guarantee that the Group Id will be the exact reverse of the host domain name anyway, as is stated in the Java Package Name convention from which this restriction is derived:
The name of a package is not meant to imply where the package is stored within the Internet; for example, a package named edu.cmu.cs.bovik.cheese is not necessarily obtainable from Internet address cmu.edu or from cs.cmu.edu or from bovik.cs.cmu.edu. The suggested convention for generating unique package names is merely a way to piggyback a package naming convention on top of an existing, widely known unique name registry instead of having to create a separate registry for package names.
Consider where Java would be today if you had to prove ownership of a domain prior to publishing any code using the reversed domain name package convention. Sure, it would be near impossible to implement such a restriction, but if Sun (or anyone else) had even attempted it, you can be assured that Java would not be as widely accepted as it is today. There are many reasons why Java has become such a popular language, but I think you’ll find that the primary reason for its success is that it is freely available and doesn’t impose too many restrictions on a developer’s creative freedoms.
Do we see such restrictions imposed elsewhere on the Internet? Take this Blog as an example. Do I need to own the domain www.thenextbigthing.com to publish it? You’ll find many blogs on the Internet with the same title, and in hindsight that probably makes it a poor choice in distinguishing my own blog, but it is my absolute right to publish this blog wherever I see fit, as long as the address is unique! Let me make this point absolutely clear: the content of this blog does not determine the URL used to locate it! So how can it possibly be suggested that Maven Group Ids must be derived from ownership of a domain?
Let’s say I moonlight as a Cyber Squatter in my spare time. I manage to get hold of the domain maven.com. Does that mean I am now free to publish my Java projects using the Group Id com.maven? Even if my project is hosted on SourceForge and neither I nor my project has anything to do with that name, other than that I got lucky at an auction?
Alternatively, how about a graduate developer who really digs the Open Source ideals and wants to start a new project? Initially she likely won’t own a domain, but she doesn’t feel comfortable using an Open Source host such as SourceForge. No problem, she has broadband and a home server where she can host her project at: http://32.254.123.91/projects/foo. The added benefit of self-hosting is that it provides experience with the intricacies of site management. She also likes the idea of Maven as it gives her an easy way to get the project started and distributed to others. How exactly would you propose she get her project into the master Maven repository?
I realise that these enforcement policies are not maliciously intended, but rather come from a lack of consideration of the wide variety of styles and opinions in the greater community. We see similar motives in the Linux community, with attempts to create the one true Linux, which always fail miserably. This can be attributed directly to the different styles and goals of the wider community, and such diversity should be celebrated rather than suppressed. Maven surely provides great benefit to the Java development community in the management of dependencies, but its maintainers need to understand that restrictions imposed simply to maximise order and minimise chaos, with no real benefit to the wider community, will fail just as miserably as any political dictatorship has, and always will.