Imagine this scenario: You’re about to launch your product, but it still has several bugs. They’re not very serious, but they are still bugs. The engineering team wants more time to fix them, but you’ve already announced the product, marketing has done a bunch of prep for it, and customers are waiting for the new version. What do you do? Do you delay shipping, or do you launch the buggy product anyway?
The answer is tricky, but product managers have a few rules of thumb for making this difficult decision. The first clue is an axiom of software development: “There is no such thing as perfect software. All software has bugs.”
For example, NASA spent millions of dollars developing, verifying and testing the software for the Mars Climate Orbiter. But as the spacecraft reached Mars, a software error (a mismatch between imperial and metric units) sent it off course, and the entire $125 million mission was lost.
That doesn’t mean that your software is going to crash and burn. It just means that you can’t fully protect against bugs, so you need to figure out how to deal with them out in the wild. The most important thing with software is to launch v1 of the product and get it in front of your users and customers. See how they use the product, listen to their feedback, and use that to improve it. There is no point sitting by yourself in a cubicle trying to fix every last bug. There’s no glory in having a perfect piece of software that never gets used because it never launched.
Facebook has actually codified this as a mantra: “Done is better than perfect.” It is far more important to get your code launched and out the door than to wait around for it to be perfect. That’s something of a trick statement anyway, because there is no such thing as perfect software.
The way Google Chrome is managed is a great example. First, the team has a very aggressive release schedule: there is a new version every 6 weeks. While this is great for innovation, once in a while things go wrong. Just yesterday, there was a major bug in Chrome where search from the address bar stopped working. Luckily, Chrome has a way to push fixes to all users very fast, and the bug was fixed in a matter of hours. Not only that, but the Chrome team admits that their software is not perfect and will always have bugs. To motivate people to report them, Chrome offers bug bounties of up to $20,000 per bug!
As an entrepreneur, you may not be in a position to hand out $20,000 to every customer who reports a bug, but there are some other simple things you can do to deal with bugs in production.
1) Do percentage rollouts
Large web companies deal with this issue by doing what is called a “percentage rollout”. Usually, the app runs on several servers behind a load balancer. When a new version of the software is ready, only one of the servers is updated, and a small percentage of the total traffic is sent to it to see whether the new version is stable and bug-free. Once there is confidence in the new version, it is rolled out to 2 servers, and then gradually to the rest until all the servers are running the new software.
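To make the idea concrete, here is a minimal sketch of the routing decision behind a percentage rollout. It is a variant of the server-by-server approach described above: instead of counting servers, it hashes a user ID into one of 100 buckets and sends the lowest buckets to the new version. The function names and the "new"/"stable" labels are made up for illustration, not taken from any particular load balancer.

```python
import hashlib


def bucket_for(user_id: str, buckets: int = 100) -> int:
    """Map a user ID to a stable bucket number in [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def choose_version(user_id: str, rollout_percent: int) -> str:
    """Send roughly rollout_percent% of users to the new version.

    Hypothetical helper: "new" and "stable" stand in for whatever
    server pools your load balancer actually routes to.
    """
    if bucket_for(user_id) < rollout_percent:
        return "new"
    return "stable"


if __name__ == "__main__":
    users = [f"user-{n}" for n in range(10_000)]
    on_new = sum(choose_version(u, 5) == "new" for u in users)
    print(f"{on_new} of {len(users)} users see the new version")
```

Because the bucket depends only on the user ID, a given user keeps seeing the same version between requests, and raising the percentage only ever adds users to the new version rather than reshuffling them.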
2) Have a way to auto-update
The era of shrink-wrapped software is over. Nearly all software running today has a way to reach the internet, so all software should have a way to update itself. This is really important not just for fixing bugs, but also for pushing security updates. Without it, you leave your software very vulnerable. Even for complicated enterprise software, an administrator should be able to pull an update if they run into a bug.
This is increasingly the norm. Some software, like web browsers, has started auto-updating without even notifying the user. Mobile apps can update automatically through their respective marketplaces. Operating systems and other major pieces of software even have a way of classifying some updates as “recommended” or “critical”.
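As a rough sketch of the decision an updater has to make, the snippet below compares the installed version against a manifest describing the latest release. The manifest format (a dict with "latest" and "critical" keys) and the action names are assumptions made up for this example; a real updater would fetch the manifest from your own server and verify the download before installing it.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version like "1.10.0" into (1, 10, 0).

    Assumes purely numeric parts and the same number of parts
    on both sides of a comparison.
    """
    return tuple(int(part) for part in text.strip().split("."))


def update_action(installed: str, manifest: dict) -> str:
    """Decide what to do given a (hypothetical) update manifest."""
    latest = manifest["latest"]
    if parse_version(latest) <= parse_version(installed):
        return "up-to-date"
    if manifest.get("critical", False):
        # Critical fixes, such as security patches, are applied right away.
        return "install-now"
    # Everything else is offered to the user or administrator.
    return "offer-update"


if __name__ == "__main__":
    manifest = {"latest": "1.10.0", "critical": True}
    print(update_action("1.9.0", manifest))
```

Comparing versions as tuples of integers matters here: compared as plain strings, "1.9.0" would sort after "1.10.0" and the update would be missed.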
3) Have automated crash/error reporting and tracking
For mobile apps, web apps or even desktop apps, you should build a crash-reporting system into your software. If the app encounters an error or an exception, it should not just die on the user. It should also capture the error, stack trace or core dump and send it back to your servers for analysis, along with some diagnostic information. This kind of system gives you a real-time view into how your software is behaving out in the real world, and alerts you early to problems customers may be experiencing.
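Here is a minimal sketch of what such a hook could look like in a Python app, using the standard sys.excepthook mechanism for uncaught exceptions. The report fields, the APP_VERSION constant and the send callable are assumptions for illustration; in practice send would post the report to your own collection endpoint or to a crash-reporting service.

```python
import json
import platform
import sys
import traceback

APP_VERSION = "1.0.0"  # hypothetical version string for this example


def build_report(exc_type, exc_value, exc_tb) -> dict:
    """Collect the error, its stack trace and some basic diagnostics."""
    return {
        "app_version": APP_VERSION,
        "python": platform.python_version(),
        "os": platform.system(),
        "error": exc_type.__name__,
        "message": str(exc_value),
        "stacktrace": "".join(
            traceback.format_exception(exc_type, exc_value, exc_tb)
        ),
    }


def install_crash_reporter(send) -> None:
    """Report uncaught exceptions through send(json_text).

    send is a caller-supplied function (an assumption of this sketch).
    """
    def hook(exc_type, exc_value, exc_tb):
        try:
            send(json.dumps(build_report(exc_type, exc_value, exc_tb)))
        except Exception:
            pass  # reporting must never hide the original crash
        # Still show the normal traceback to the developer or log.
        sys.__excepthook__(exc_type, exc_value, exc_tb)

    sys.excepthook = hook
```

Calling install_crash_reporter(print) at startup is enough to try it out; swapping print for a function that uploads the JSON gives you the real-world feedback loop described above.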
So, should you launch the buggy product? Unless you have a security issue or a serious bug that wrecks the user experience, you should generally prioritize launching the product over waiting for it to be perfect.