10 Most Important Open Source Projects of 2011

Great article over on Linux.com; here are the results:

Hadoop

 


Without a doubt, Hadoop has had a fantastic year. The distributed computing platform from Apache has seen massive uptake and industry support.

 

Hadoop is being used and/or supported by almost every enterprise player. Naturally it’s big with Yahoo, the company that started the project, but it’s also being used by Amazon, IBM, Twitter, Facebook, and just about any other company that’s working with Big Data.

Hadoop isn’t new, of course, but this year it really seemed to take off as an industry standard. Kind of like Linux, when you think about it… This year EMC, Oracle, and even Microsoft announced commercial support or products that work with Hadoop, and Yahoo spun off Hortonworks to focus on Hadoop. It’s almost easier to name companies that aren’t working with Hadoop than ones that are.

Git

Speaking of ubiquity, how about that Git, huh? Linus Torvalds’ other little hobby project has not only done good for Linux, but it’s hugely popular for FOSS projects. If you’re working on a new open source project, the odds are pretty good that you’re going to be using Git over any other distributed version control system (DVCS).

Git isn’t just a popular tool, it’s the foundation of one of the most popular gathering spots around the Web for open source development: GitHub. It’s also being used and offered by Gitorious, SourceForge.net, Google Code Hosting, and pretty much every other major platform for hosting FOSS projects.

Cassandra

Was 2011 the peak of NoSQL as a buzzword, or was that 2010? It’s so hard to keep track, but Apache Cassandra deserves a slot in the top 10 this year, buzzword or no.

If you’re not familiar with Cassandra, it’s a scalable, distributed, and fault-tolerant database that takes cues from Amazon’s Dynamo and Google’s Bigtable database system.

Cassandra has been adopted by an impressive list of users including IBM, Netflix, Digg, Facebook, Rackspace, and many others.

LibreOffice

The LibreOffice team has done a great job of keeping the OpenOffice.org torch burning after the Sun acquisition. While Apache is working to continue OpenOffice.org, LibreOffice picked up the ball and ran with it. The project has delivered release after release, not only with a slew of new features but also with reliable updates for major versions that are exactly what organizations that depend on an office suite need.

For anybody that’s interested in running Linux on the desktop, LibreOffice has been a crucial project. For users who want to get away from Microsoft Office, but still have compatibility with Office file formats, LibreOffice has been there for them.

Not only has LibreOffice done well technically, it’s also moved forward with impressive speed as an organization. 2012 should be an interesting year for the open source office suite.

OpenStack

Few projects have taken off quite like OpenStack. The “cloud operating system” kicked off by Rackspace has signed up (at this count) 144 companies to work on OpenStack, including SUSE and Canonical.

OpenStack is designed to provide the components that any organization would need to deploy its own private or public cloud: compute, object storage, image service, and (newer) identity management and a GUI dashboard.

Now, you’re not going to see much OpenStack in deployment yet — but it’s definitely a project to watch for open source cloud.

An honorary mention goes to Eucalyptus, though. While OpenStack has oodles of momentum and industry support, Eucalyptus has production deployments and Amazon Web Services compatibility. This is not an area where it’s a “zero sum” game — there’s room for several players, and I suspect that Eucalyptus will be around for a very long time as well.

Nginx

Apache (more accurately, the Apache HTTP Server Project) still rules the Web with an iron fist. OK, it’s more like a velvet glove, but Apache is definitely far and away the most popular Web server. But 2011 was a huge year for Nginx, an alternative Web server that excels at HTTP and reverse proxy serving.

Nginx reached a lifetime peak of 8.85% market share this year on the Netcraft Server Survey. According to this profile on Royal Pingdom, the usage for Nginx has jumped nearly 300%.

The little server that could reached another major milestone this year as well. Specifically, Nginx went corporate and started offering commercial support.

It’s being used by some of the biggest sites in the world, including Dropbox, WordPress.com, and Facebook, and by about 25% of the world’s busiest sites.

jQuery

You can’t swing a cat these days without hitting a Web developer using jQuery. Not that you should go around swinging cats, of course. jQuery is a JavaScript library that’s massively popular. In fact, it’s considered the most widely used JavaScript library in the world.

If you’re working with JavaScript, you’ve probably touched on jQuery this year. As of late, it’s come in for criticism and some folks have tried to slim it down, but jQuery is still the go-to for many developers.

Node.js

Another JavaScript entry for the top 10. You’d almost think that Web development was important this year or something. Node.js is built on Google’s V8 JavaScript engine and is designed to be “an easy way to build scalable network programs.”

Node.js is another big win for open source industry acceptance – sponsored by Joyent, it has a healthy community of contributors and is used by everybody from LinkedIn to 37signals, Rdio, Yahoo, and GitHub.

Puppet

Another set of watch-words for 2011? DevOps, and IT automation. While there are a number of excellent open source IT automation offerings out there, this year belonged to Puppet.

Puppet is an “automated administrative engine” primarily aimed at Linux and UNIX-like systems. It can be used to perform administrative tasks across two, twenty, or two thousand computers. (Probably even more.) Puppet has been steadily growing and improving for years, but this year Puppet went after the enterprise big time with its Puppet Enterprise offering. It’s also gotten a big vote of confidence in the form of an investment from Google Ventures, Cisco, and VMware. Puppet hasn’t just been important in 2011; expect it to be big in 2012, too. (And if you’re a system administrator hunting for work, you probably want Puppet on your resume along with our next entry.)

Linux

Linux, the kernel, has had a pretty good year. What am I talking about? Linux had a great year. It turned 20, hit 3.0 (not coincidentally) and continued merrily on the path to world domination.

Sure, we kid about world domination – but have you looked around lately? Linux is everywhere. It’s powering phones and all kinds of embedded devices. It’s the bedrock of cloud services, and dominates the TOP500 supercomputer list.

Google, Netflix, Facebook, Twitter, countless government agencies, businesses, and educational institutions depend on Linux for mission-critical services. The long and short of it is, without Linux, many of the other projects we depend on simply wouldn’t have been possible. It’s the rock-solid foundation that people use to build so many important services. (And not-so-important, too.)

No Android?

While I was compiling this list, I thought hard about putting Android on. It’s hard to argue that Android is unimportant in 2011, isn’t it? Absolutely. It’s also, unfortunately, hard to make a strong case for Android as an open source project.

Sure, Google lobs some source over the wall when it gets around to it – but Android development happens mostly behind closed doors. There’s little opportunity for the millions of Android fans and potential hackers around the world to influence Android development unless they happen to work for Google or one of its partner companies.

It’s great that Google releases the code, but it’s more of a “source open” project than an open source project.

Importing Delicious tags into MarsEdit


For some time I’ve been wanting to move my blogging efforts over to a desktop app, to be able to control multiple blogs from one place, and to search through and manage the content more easily.

The two main contenders that I could find for the Mac platform were Ecto and MarsEdit. The latter seems to be considerably more actively maintained, has a slicker interface and was available for download on the Mac App Store, so I went with that.

The main feature I was missing in MarsEdit was the fairly comprehensive tag handling offered by WordPress.  The fact you can type only a few letters of the tag you need and it auto-completes is something you can’t do without once you get used to it.  The Delicious website and similar desktop tools all offer this functionality.

The first challenge was to get my data out of Delicious: not the links/bookmarks, which are the only export option offered, but the actual tags. Not surprisingly, there’s a WP plugin for that: EG-Delicious Tags.

Once your tags are copied over to your WP installation, you can access the data as a serialized array from the WP database; it’s in the options table under the key _transient_egdel_tags.

Then the data needs to be integrated with MarsEdit.  The app’s author, Daniel Jalkut, kindly explained which plist file within the app needs to be updated to enable the search auto-completion feature.

A simple PHP script was sufficient to deserialize the data and wrap it in XML tags:

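<?php
// Paste the serialized value of _transient_egdel_tags between the markers.
// A nowdoc (quoted marker) keeps PHP from interpolating "$" or backslashes,
// which would otherwise corrupt the serialized string lengths.
$data = <<<'DATA'
a:773:{s:7:"_blogit"; …
DATA;
// The tags are the keys of the unserialized array.
$struct = unserialize($data);
$keys = array_keys($struct);
$out = '';
$out .= "\n";
// Wrap each tag in a plist <string> element.
foreach ($keys as $tag) {
    $out .= "<string>$tag</string>\n";
}
print $out;
?>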

Add the XML to the relevant key and save out the DataSources.plist file and you’re done.
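
For anyone who wants to try the idea without a WordPress database to hand, here is a minimal, self-contained sketch of the same conversion. The sample tags and the function name tags_to_plist_array are made up for illustration; the real input would be the serialized value of _transient_egdel_tags. The sketch assumes, as the script above does, that the tags are the array keys, and it adds two things the original lacks: a check that unserialize() actually returned an array, and XML escaping so tags containing & or < don't break the plist.

```php
<?php
// Hypothetical sample standing in for the real _transient_egdel_tags value;
// the tag names and numbers are invented for this example.
$data = serialize(array('php' => 12, 'mac & ios' => 3, 'wordpress' => 7));

// Convert a serialized tag array into a plist <array> of <string> elements.
// Returns null if the input does not unserialize to an array.
function tags_to_plist_array($serialized)
{
    $struct = unserialize($serialized);
    if (!is_array($struct)) {
        return null;
    }
    $out = "<array>\n";
    foreach (array_keys($struct) as $tag) {
        // Escape &, <, > and quotes so the output stays well-formed XML.
        $escaped = htmlspecialchars((string) $tag, ENT_QUOTES, 'UTF-8');
        $out .= "\t<string>" . $escaped . "</string>\n";
    }
    return $out . "</array>\n";
}

print tags_to_plist_array($data);
```

The resulting block can then be pasted under whichever key the plist uses for tags; that key name is not given here, so check the file itself.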

Mashable Interview

I was fortunate enough to be interviewed on Mashable by Jolie O’Dell recently, cited as a “PHP expert” :-)

Here’s a full transcript of the original email interview; some responses make more sense in context:

Am running a bit behind but here are my answers for your interview, please let me know your feedback and if this gets published:

–What advice would you give to a developer just starting to learn PHP?

– keep on top of best practices including a healthy approach to security
– read the code of seasoned devs, there’s always a better/cleaner way to do things
– ensure your code is human readable, if you can’t understand it 6 months later, how will it be for other devs
– always try and simplify your interfaces, it’s much more difficult to write simpler code but consistent refactoring will save you a lot of time and headaches when it comes to maintenance
– don’t reinvent any wheels, you will always have more than enough to program, use reputable libraries whenever you can avoid writing the code yourself
– read up on some of the great programmers (eg: http://www.codersatwork.com/) and find out how they stayed passionate about the art of programming so many years later

–In your opinion, what’s PHP’s biggest strength? Biggest limitation?

I think it’s easier to start off with its biggest limitation first: so many people criticise PHP that you’d be tempted to think it’s a rubbish language; that couldn’t be further from the truth. The biggest limitation is _aspects_ of PHP are easier to learn than comparable aspects in other languages, so PHP attracts a lot of “developers” who don’t have a clue, write horrendous code, show their ignorance in forums and generally dangerously decrease the signal to noise ratio for the rest of us.

When I first started with PHP in 2000 I remember discovering a project by a German developer that struck me as very well designed yet according to the critics this should have been impossible:
– it was done in one of the earliest versions of PHP4 (4.0.0, released on 22 May 2000, http://php.net/releases/index.php) yet still displayed all the sophistication of someone who understood software engineering
– the language itself supposedly had all sorts of limitations and defects that meant using it for OOP was technically impossible: wrong

See for yourself, still not updated since 2000 and still probably better than most PHP that gets written today:

http://www.phpdoc.de/

The point is a simple one: if you’re a developer who has the discipline to learn about software development, PHP can be an excellent tool.

The strengths of the language are simple and obvious:
– it stays close to its C roots while removing some of the unnecessary pain points like memory management, pointers and the compile cycle
– the OOP implementation is simple, elegant and easier to read than its peers
– the Java mantra of “complexity at any cost” is nowhere to be found, concise method names are used throughout
– libraries and extensions exist for pretty much every technology on the planet
– hacking activity and community participation are most likely the highest of any programming language

There are, however, a few difficulties that result directly from the strengths listed above:
– there is too much choice when it comes to selecting a library or framework to work with, and the information available is often biased and unreliable (posted by teenagers) so a lot of time can be wasted searching for quality
– the core development team is somewhat hysterical and not professional at times which has resulted in backwards compatibility being broken often, and in unacceptable ways, and our current namespace implementation
– there currently isn’t any decent IDE for PHP, not something comparable to what’s available for Java. This became increasingly obvious when I got into Objective-C and Mac development, Xcode really sets the standard. A new candidate that seems promising and is non-free is PhpStorm, so far I’ve found it a relief to use compared to NetBeans. Eclipse, on the Mac at least, I don’t think is even in the race.

–For more intermediate or advanced PHP devs, what are some tips that have helped you along the way?

One of the key problems with PHP is the absence of any authoritative standard library, something which is literally taken for granted in Java, Python, Ruby, Perl and others. PEAR could have been it, but Zend chose to fork for political reasons, now we have the Zend Framework which is not really a framework but more like a library, and it still has some serious quality consistency issues. It seems ZF will likely become the dominant PHP library, but work still needs to be done by the community to refactor the “frameworky” libraries, ie, those that have dependencies on Zend_Config, Zend_Registry, etc. Documentation for many of the ZF libraries is flaky and incomplete, often the comments contain the clues you need to get things working.

In terms of tips, I’d make the following suggestions for devs who are keen to move out of beginner status:
– don’t be afraid of using an interactive debugger, available in decent IDEs like PhpStorm and also NetBeans and Eclipse if you have the patience, this is the best way to understand what the code is doing. If you’re using print_r($foo) you’re a beginner.
– don’t be afraid of unit tests, not only will you have an easier time maintaining your codebase, but often unit tests are the best form of documentation for a codebase, and will allow new devs to get up to speed fast
– use some of the available static analysis and IDE tools to help you refactor your code, good code is not subjective!

–What’s the best app or most clever hack you’ve seen that uses PHP? (Links, please!)

Facebook? Although from the code leaked a few years ago, the quality was primitive.