
Release


Example

In the spirit of a Stephen King novel, a little boy – a dreamer, a book lover, and an insect collector – is being constantly humiliated by his siblings, classmates, and accidental bystanders. One day he says, “Enough is enough,” and starts to cut, shoot, strangle, and burn his abusers, and for the sake of prevention, all others who are in his way. This situation is about letting the steam out – or “release” in everyday terms.

Luckily for us, in the software industry the term “release” is used in a totally different way:

– as a verb, “to release” means to transfer a piece of software to the users. For example, we can ask a release engineer to release the code to production.

– as a noun, “release” means a certain piece of software. For example, we can say: “We are testing release 5.0.”

BTW

We’ll apply the term “production version” or “version of prod” to the release that is in production now.

We’ll apply the term “coming release” to whatever version we are going to release next.

The important thing to understand is that a release is not some kind of abstract software; it is a package of concrete files having concrete versions. Please pay attention.

Example

The release 5.0 contains 63 files. Each of these 63 files has its own version. When the testers have finished testing release 5.0, the release engineer will release precisely the same 63 files that were in the testing environment when the testers finished testing. If the testers finish the acceptance test on January 03 at 4:00 pm, and at that time the file register.py had its own version 5.34, then it must be exactly version 5.34 of register.py that is included into the release 5.0 when 5.0 goes to production.

BTW, the version number is automatically assigned to the file by CVS (or whatever version control system is used) every time a developer saves (or, more professionally, “commits”) an updated version of the file to CVS.

The purpose of the release is to transfer one or a combination of the following things, to production:

1. New features.

2. Modification/removal of existing features.

3. Bug fixes.

There are two main types of releases:

1. A major (or milestone) release happens at the “Release” stage of the Cycle, after the “Testing and bug fixes” stage is over; i.e., “Go” decision was made at “Go/No-Go” meeting.

The version of a major release is presented as an integer: 7.0

2. A minor release takes place between major releases. Minor releases can have one of three variants:

– feature release;

– patch release;

– mixed release.

A FEATURE RELEASE takes place when there is a need to

– add new features;

– modify/remove existing features.

A PATCH RELEASE takes place when the code in production has a bug (or bugs). Here we release a fixed code.

BTW

Please note that discovering a bug in production doesn’t mean that the users have already run into it.

For example, let’s assume that minor release 7.1 took place at 2:00 a.m. This release contained a special tool for bank payment processing called process_payments. The tool is scheduled to run at 1:00 a.m. every day. So, there is a 23-hour window during which we can find and fix a bug in process_payments before our users suffer from it.

BTW

In some cases, a bug is found “theoretically”: e.g., the PM or developer suddenly wakes up in the middle of the night and asks himself, “What if we have this situation: <…>? Can our code handle this?”

BTW

There are two cases when we have a bug in production:

a. The bug has non-P1 priority; e.g., the code of register.py doesn’t check if the email entered by the user has a dot (“.”) before the top-level domain. In this case, there is no emergency to release a bug fix ASAP. So:

– we can “accumulate” several non-P1 bugs and do a single patch release with bug fixes for all of them OR

– we’ll just wait until the coming major release (where all those bug fixes are present) is pushed to production.

b. The bug has P1 priority; e.g., the user cannot register at all. In this case, we have an emergency. A patch release is initiated, created, and released by the rules set in the Emergency Bug Fix procedure (EBF procedure). The patch release for an Emergency Feature Request (EFR) must also follow the EBF procedure. You’ll read more about EBF and EFR in a minute. See an example of the EBF procedure under the Downloads section on qatutor.com.

A MIXED RELEASE is a minor release that occurs when there are both feature-related changes and bug fixes.

The version of a minor release is presented as a number after a decimal point and incremented by one after each minor release: 7.1

1. The main difference between major and minor releases is that, as a rule, major releases have tons of new features/bug fixes, while minor releases usually contain only one new feature/bug fix. For example, after the release of version 5.0, we signed a contract with the credit card processor so we can accept another credit card – Discover. In that case we can do a special feature release 5.1 to add just one piece of functionality (“users can pay with Discover”) to our Web site.

2. As a rule, major releases have a recurring schedule; e.g., one major release per month or per quarter. Minor releases happen whenever they are required.

3. Major releases exist in the main Cycle, while minor releases exist outside of the main Cycle.

Brain Positioning

So far we have used the term “Cycle” as a companywide activity, the purpose of which is to create a major release. But “Cycle” can also be applied to software development activities on a much smaller scale. For example:

– At 10:00 a.m. developer Josh was writing code for 7.0 and tester Chad was testing code for 6.0.

– At 11:00 a.m. developer Andrew discovered a bug in the production version, 5.0.

– At 11:15 a.m. Josh and Chad were assigned to work on a patch release.

At 11:15 a.m. Josh and Chad started working within their own mini Cycle for patch release 5.1, which exists outside of the Cycles for major releases 6.0 and 7.0.

4. A major release cannot be considered as a maintenance release for the production version, while some minor releases, e.g., patch releases, have purely a maintenance nature.

5. A major release is always a planned event, while a minor release can be both planned and unplanned.

A planned minor release usually happens:

– if there is a certain feature that couldn’t be included in the major release due to resource/time constraints

For example, we really loved the features from the spec #1478 “Improvements for Shopping Cart,” but we didn’t have time to develop and test it for release 5.0 which had to be pushed to production on March 15th. So, we just release 5.0 and begin work on 5.1, which will contain features from 1478.

– if code of major release had known non-P1 bugs prior to that release, and we agreed to fix them in a patch release after the major release is out.

– if we’ve discovered several non-P1 production bugs and decide it’s time to do a patch release.

An unplanned minor release usually takes place if there is:

– an emergency bug fix

– an emergency feature request

BTW

An Emergency Bug Fix (EBF) is a situation where a P1 bug is found in production and we need to push a patch release ASAP.

An Emergency Feature Request (EFR) is a situation where we need to release a certain feature ASAP; e.g., in case of a court ruling or to comply with a new law. For example, our competitor won a patent case, and so we have to change some piece of our software ASAP to make it work in a different way.

Minor releases for an EBF and an EFR are treated as patch releases.

In case of an EBF, an entry with the type “Bug” and the priority P1 is entered into the bug tracking system.

In case of an EFR, an entry with the type “Feature request” and the priority P1 is entered into the bug tracking system.

BTW

In order to address possible EBFs after a major release, many companies create SWAT* teams which consist of a developer, a tester, and a release engineer. Each team has a twenty-four hour period when each member of the team must be available to come to the office at any time. Each team member whose team is on duty must have his cell phone on at all times and must refrain from doing stuff like drinking too much tequila or going skiing at Lake Tahoe. When a call about an EBF is received by a SWAT team member, he must drop whatever he is doing, come to the office, and do his job until the patch release is out, or until the next SWAT team arrives.

* In the real world, a SWAT (Special Weapons And Tactics) team is a specially equipped, quick response police unit.

1. We had 9 major releases.

2. We had 14 minor releases AFTER the 9th major release was pushed to production.

What is the version of the coming major release? 10.0.

BTW

Please note that this format

<number of major releases>.<number of minor releases>

is the most common way to track release versions, but each company can use whatever format they like.
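The numbering in the example above can be sketched in a few lines of Python. This is only an illustration of the `<number of major releases>.<number of minor releases>` format; the function names `prod_version` and `coming_major_version` are made up for this sketch.

```python
def prod_version(major_releases: int, minor_releases_since_major: int) -> str:
    """Version currently in production, written as <major>.<minor>."""
    return f"{major_releases}.{minor_releases_since_major}"


def coming_major_version(major_releases: int) -> str:
    """A new major release bumps the major number and starts minors at 0."""
    return f"{major_releases + 1}.0"


# 9 major releases, then 14 minor releases after the 9th major one.
print(prod_version(9, 14))       # 9.14
print(coming_major_version(9))   # 10.0
```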

BTW

At some companies, there is a tradition of giving major releases names instead of numbers. There are basically two reasons for this:

1. Start-up folks have creative minds, and it’s kind of cool to call a release “Sunrise” instead of “1.0.”

2. When people work on something with a meaningful name, they develop a personal attachment to the project. Naturally, it’s more inspiring to work on “Muse” than on “5.0.”

While I totally agree with these two points, my recommendation is simple: DON’T DO IT.

For example, company N. traditionally names its major releases after pop groups/singers. Below is a dialog between two friends: Anthony, who works for start-up company N., and Steve, who just got a bottle of Hennessy Paradis and a box of Padilla Habano cigars:

– Wassap, Tony? It’s Friday night. Are you coming or what?

– Nah, Stevo. I’m hanging with “Jessica Simpson.”

– WHAT?!

In my opinion, the only benefit for company N. to give the name “Jessica Simpson” to their major release is the enormous respect that company employees get from their pals outside the company. But apart from that, it’s a bad idea.

The first reason is that it’s not clear which release was first – “Paul McCartney” or “John Lennon.” But it’s crystal clear that “4.0” came after “3.0.”

The second reason is that it sounds like crap when you say things like: “We are going to release a patch to Louis Armstrong.” Come on, Louis Armstrong is a legend, and it’s almost blasphemy to associate his name with a trivial patch release.

The third reason is that your business partners will understand what “5.0” means, but they would be a little frustrated if you told them: “We are going to implement a new payment functionality in ‘Animals.’”

BTW

In the case of a very aggressive release schedule (e.g., weekly or bi-weekly), a very good approach is to call any coming release “Release 1” (or just “R1”), with the subsequent release called “Release 2” (or “R2”), and so on. The production release is referred to as “R0”.

So, if we release every Wednesday and today is Friday, R1 is the release that we are testing now and that is to be released to production on Wednesday next week.

This principle is very useful because:

1. There is often confusion when “next” is used to refer to a release, because some people understand “next” to mean “coming release,” while others think it means “the release following the coming release.”

A similar situation was perfectly illustrated in the Seinfeld episode “The Alternate Side”:

Sid: Well I’m going down to visit my sister in Virginia next Wednesday, for a week, so I can’t park it.

Jerry: This Wednesday?

Sid: No, next Wednesday, week after this Wednesday.

Jerry: But the Wednesday two days from now is the next Wednesday.

Sid: If I meant this Wednesday, I would have said this Wednesday. It’s the week after this Wednesday.

2. In the case of frequent releases, it’s natural to refer to a particular release in a related manner – e.g., R1 rather than its formal ID, 54.0.

Release infrastructure (CVS, etc.) and the actual push of released software to production is the responsibility of release engineers (REs).

Let’s imagine that our company, ShareLane Super Duper, Inc., was created to sell books via the Internet, and it has just received its first round of financing. Not much: only 5 million of rapidly depreciating U.S. dollars.

– two programmers: Billy and Willy;

– CEO Jean Batiste Emmanuel Zorg (further referred to as “Mr. Zorg” or “Evil Boss”);

– the ultra slim, cool-looking laptop of Mr. Zorg (OS doesn’t matter);

– one server (known as “the Star”) with Linux OS for the development/test environment

In fact, Billy wanted to name this server after his cat: “Borborygmus,” but after Willy and Mr. Zorg asked him a sobering question: “Are you crazy?” it was decided that they would have to be practical and give their servers beautiful, but completely understandable names, like “the Star.”

BTW

On the Star, we have five test/dev environments:

https://old.sharelane.com – here we have the same version that’s on the production machine. If prod has version 1.0, this environment has version 1.0.

https://main.sharelane.com – here we have a version of the coming major release (e.g. 2.0) if code for that release has already been frozen for testing. Otherwise, prod, Old and Main will have the same version (e.g., 1.0)*

https://dev.sharelane.com – here is where programmers do integration between their code before that code is delivered to main.sharelane.com. Any kind of version can be here.

https://billy.sharelane.com – Billy’s playground. Any kind of version can be here.

https://willy.sharelane.com – Willy’s playground. Any kind of version can be here.

* In fact, we’ve just released our version 1.0, so www.sharelane.com (prod), old.sharelane.com and main.sharelane.com will have the same version.

ShareLane -> See complete list of ShareLane environments here: Test Portal>Release Engineering>Environments.

1. We register the domain name sharelane.com.
2. We rent the server for the production environment at the hosting provider.
3. We unite all our local computers (Billy’s machine, Willy’s machine, the Star, and Zorg’s laptop) into our Intranet.
4. The programmers start working on the code.

As you already learned, classic Web project architecture has these three components:

– Web server

– Application core

– DB

BTW

Because we’ve just started, all our test/dev environments will reside on the Star. Please note that each of those 5 environments will have its own Web server, application core and DB.

1. APACHE WEB SERVER

The name “Apache” comes from “a patchy” because of the enormous number of patch releases applied to this free software. However, those patch releases did good, because Apache is a very reliable, high-quality software package. In the Apache directories we store

– HTML and JavaScript files: JavaScript code (or reference to the file with JavaScript) is incorporated into the HTML code, and it serves many purposes, from enhancing user interface to checking Web forms. On ShareLane we have JavaScript file timer.js called during user login (see Test Portal>Application>Source Code>log_in.py).

BTW

The advantage of using JavaScript to check Web forms is that this check (for example, for a valid format of email; e.g., no “@@”) happens on the user’s computer rather than on the production server. This way, we can reduce the load on the production environment.

– Images: For example, .GIF and .JPEG files; e.g., file logo.jpg: https://www.sharelane.com/images/logo.jpg.

2. APPLICATION CORE IS WRITTEN IN PYTHON

Python scripts reside in a special directory in Apache called “cgi-bin.” You can look into actual software code of the application core here: Test Portal>Application>Source Code.

3. DATABASE MYSQL

In DB we’ll store data about users, books, orders and other things. See all current data inside ShareLane DB here: Test Portal>DB>Data.

Brain Positioning

Please note that we have to distinguish DB schema from DB data. The DB itself is a set of virtual containers called tables (actually, there is MUCH more to the DB, but for now we just need the basics).

ShareLane -> Let’s look at the table “cc_transactions” (Test Portal>DB>Data>cc_transactions), where we store data about the success/failure of each credit card transaction, i.e., each attempt to use a credit card for a purchase.

The cc_transactions table has 4 columns: id, result, user_id and order_id.

After each new transaction when the user attempts to buy a book, a new row is inserted into cc_transactions. These rows are called “DB rows”, “DB records”, or simply “rows” or “records”.

The value of the column id is populated automatically with each new credit card transaction.

The value of the column result consists of two concatenated (joined) values:

<internal id of credit card, e.g. “1” for Visa>

and

<success code of transaction: “0” for success and “1” for failure>

; e.g., ’10’ means that transaction with Visa* was successful.

*you can see all card ids here: Test Portal>DB>Data>cc_types.

The values for columns user_id and order_id* are equal to the corresponding ids from tables users (Test Portal>DB>Data>users) and orders (Test Portal>DB>Data>orders); i.e., if a purchase was made by a user whose id in table users equals “1220”, the value of user_id in the corresponding record of table cc_transactions will also be equal to “1220”.

* in case of a failed credit card transaction, the order_id value is equal to “0” (zero), because no order has been made.
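To make the layout of the result column concrete, here is a minimal Python sketch that splits such a value back into its two parts. It assumes the success code is always the single last character and that everything before it is the card id; the function name `decode_result` is hypothetical and is not part of the ShareLane code.

```python
def decode_result(result: str) -> tuple[str, bool]:
    """Split a cc_transactions.result value into (card id, success flag).

    Assumption: the last character is the success code
    ("0" = success, "1" = failure); the rest is the internal card id.
    """
    if len(result) < 2 or not result.isdigit():
        raise ValueError(f"unexpected result value: {result!r}")
    card_id, code = result[:-1], result[-1]
    if code not in ("0", "1"):
        raise ValueError(f"unexpected success code: {code!r}")
    return card_id, code == "0"


print(decode_result("10"))  # ('1', True): card id 1 (Visa), success
print(decode_result("11"))  # ('1', False): card id 1 (Visa), failure
```

A helper like this is handy in a test: you compare the decoded card id with the id of the card you actually paid with.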

Now, let’s simulate a book order where a user enters his MasterCard and the payment was successful.

1. Create new user account on main.sharelane.com.

2. Buy any book using MasterCard and write down Order id.

3. Go to Test Portal>DB>Data>cc_transactions and search web page for your Order id (you should find value of your Order id under column order_id).

4. Get value of column “result” from the same DB record.

Expected result: 20

As you can see, actual result is “10”, so we got a bug!!!

What is the bug summary? How about this: “Checkout: wrong value in result column of cc_transactions when using MasterCard“.

Let’s do another cool thing now. Let’s file that bug into our bug tracking system: go to Test Portal>Bug Tracking>Training BTS>Submit New Bug. Fill in only Summary and Description (put steps to reproduce the bug) before submitting the bug.

CONGRATULATIONS, you’ve just filed your first bug!!!

Let’s proceed.

The DB schema is about containers: tables, columns, etc. The DB data is about the content of these containers.

So,

– if we add a new column time_created to the table cc_transactions or create a new table, we’ll change the DB schema.

– If we do any manipulation with the content of the table cc_transactions, e.g., create a new transaction that would insert a new record into cc_transactions, we’ll change the DB data.

The best analogy is this: The DB schema is a plastic bag; the DB data is water in this plastic bag.

The DB schema has its own versioning, usually starting with 1 and incremented by 1 with every schema modification. So, if we have a version of the DB equal to 34, then after we modify the DB schema (e.g., add a new column to the cc_transactions) we’ll have DB version 35.

The DB schema is created/modified

– manually – e.g., a developer runs SQL statement(s);

– by creating and running SQL procedures which incorporate those SQL statements.

SQL statements/procedures that modify DB schema must be checked into the CVS.

People who work professionally with databases are called DB Administrators or DBAs for short.

ShareLane -> You can see DB Schema version 34 of ShareLane.com here: Test Portal>DB>Schema.

So, Billy, Willy, and Evil Boss make a historic decision to use CVS.

1. Versions of each file are stored in CVS repository.

We can retrieve the needed version of the file from the CVS repository to view/edit it (this operation is called “checkout”).

We can place a new version inside the CVS repository after creation/editing of the file (this operation is called “checkin”).

Here is how we do this:

– addition of new file to CVS: cvs add register.py

– checkout of the latest version of register.py: cvs co register.py

– checkin of the new version of register.py: cvs ci register.py

2. When we checkin a new version of the file (the file is considered to have a new version even if we added/deleted a single character, like “#”):

– CVS automatically assigns a unique version number to that particular version of the file.

– During checkin, CVS also stores the version number, comments, the name of the person who did the checkin, and the time of checkin. So, not only can we see all the checked-in versions of the file, but we can also see all of that information for each version. How cool is that!

Step 1. Checkout from CVS all application files for a specific release. In other words, we need to get an image (reflection) of the latest CVS content for a specific release. This image is called a build.

Step 2. Transfer those files to the corresponding directories in a certain environment. E.g., if we want to have files for the coming release in our test environment, main.sharelane.com, we must use these directories:

/var/www/main/htdocs for HTML and JS files;

/var/www/main/htdocs/images for images;

/var/www/main/cgi-bin for application core (Python files).
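Step 2 boils down to a mapping from file type to target directory. Below is a rough Python sketch of that mapping. It assumes that the environment name is simply a path component under /var/www and that files are told apart by extension; the function name `target_dir` is made up, and a real build script may work quite differently.

```python
from pathlib import PurePosixPath


def target_dir(filename: str, env: str = "main") -> PurePosixPath:
    """Pick the directory a file should be copied to (assumed layout)."""
    root = PurePosixPath("/var/www") / env
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix in {".html", ".js"}:
        return root / "htdocs"             # HTML and JS files
    if suffix in {".gif", ".jpg", ".jpeg"}:
        return root / "htdocs" / "images"  # images
    if suffix == ".py":
        return root / "cgi-bin"            # application core
    raise ValueError(f"don't know where to put {filename!r}")


print(target_dir("register.py"))  # /var/www/main/cgi-bin
print(target_dir("logo.jpg"))     # /var/www/main/htdocs/images
```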

Let’s elaborate on Step 1. Here is the definition for the term “build”: Build is a sub-version of the specific release.

Example

Let’s say that our coming release is 4.0 and our application consists of only two files: index.html and register.py. In CVS we have version 4.11 of index.html and 4.23 of register.py. So, the package that consists of version 4.11 of index.html and 4.23 of register.py is a build. What if Billy changes register.py and checks it into CVS, so we have version 4.24 of register.py? In that case, the CVS image for 4.0 will be different, and thus the next build (i.e., sub-version of 4.0) will be different from the previous one.

After a code freeze, a build script is often added to cron (the task scheduler on a Linux system) to create and push builds in equal intervals of time, e.g., every three hours.

ShareLane -> See Test Portal>Release Engineering>Build Schedule.

The purpose of creating new builds over and over again is to make a modified application available for testers.

Example

Let’s say that you’ve found a bug during your testing on main.sharelane.com. After you filed that bug into the bug tracking system, the developer fixes it and checks that code into the CVS. The build script picks up that new code as part of the new build and pushes that build on main.sharelane.com, replacing the previous build. Now you can verify if the fix is good or not.

– There is no point in testing from 12:00 to 12:15, from 15:00 to 15:15, etc., because the build is being created and pushed, and if you test in the middle of the process, some files can belong to the previous build and some files can belong to the new one.

– If a programmer fixed your bug and checked in the fixed file(s) into the CVS, you’ll be able to verify the fix only after the new build is pushed to the testing environment. So, if the checkin took place at 16:00, then your fix will be available for verification only at 18:15. Thus, in many cases it makes sense to launch the build script right when you need it. But if you do this, please make sure that the other testers know about it so you won’t mess up their testing. In light of this, it’s a good thing when each tester has his own testing environment.
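The arithmetic behind “checkin at 16:00, fix available at 18:15” can be sketched like this. The schedule is an assumption taken from the example times above: builds start every three hours counting from midnight, and creating and pushing a build takes 15 minutes. The name `fix_available_at` is hypothetical.

```python
from datetime import datetime, timedelta

BUILD_INTERVAL = timedelta(hours=3)     # assumed: builds start at 00:00, 03:00, ..., 21:00
BUILD_DURATION = timedelta(minutes=15)  # assumed: create + push takes 15 minutes


def fix_available_at(checkin: datetime) -> datetime:
    """Return the time when a fix checked in at `checkin` can be verified."""
    midnight = checkin.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = checkin - midnight
    # Ceiling division: number of intervals up to the first build
    # that starts at or after the checkin.
    slots = -(-elapsed // BUILD_INTERVAL)
    return midnight + slots * BUILD_INTERVAL + BUILD_DURATION


# The calendar date here is arbitrary; only the time of day matters.
print(fix_available_at(datetime(2024, 1, 3, 16, 0)))  # 2024-01-03 18:15:00
```

Note that a checkin made one minute after a build starts waits almost the full three hours, which is exactly why launching the build script on demand is so tempting.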

BTW

It’s a good idea to ask your release engineer to create a build status page where you can see

– Build id,

– DB version,

– Success status of build script run,

– Time when build script run was complete.

ShareLane -> See ShareLane build status page here: Test Portal>Release Engineering>Build Status

In some companies, release engineers create a Web interface to enable testers to push new builds when needed without the involvement of the release engineer.

Build numbering starts with 1 for a concrete release and increments by 1 every time a new build is created. So after we created the first build for the minor release 23.1, the unique build identifier called build id will be 23.1-1. After a new build is created, the build id will be 23.1-2, and so on.

BTW

As a rule, DB schema versioning is not linked to release versioning, so we don’t start over with 1 with each new release like we do in the case of builds. We just increment each DB version by 1 every time the DB schema is changed.

At ShareLane, we specify the DB version after “/” following the build id; e.g., in 23.1-2/78, 78 is the DB version.
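If you check versions often, it can help to split such an identifier into its parts automatically. Here is a small Python sketch for the `<release>-<build number>/<DB version>` format described above; the names `AppVersion` and `parse_app_version` are made up for illustration.

```python
from typing import NamedTuple


class AppVersion(NamedTuple):
    release: str      # e.g. "23.1"
    build: int        # build number within that release
    db_version: int   # DB schema version


def parse_app_version(text: str) -> AppVersion:
    """Parse a string like '23.1-2/78' into its three parts."""
    build_id, db_version = text.split("/")
    release, build = build_id.rsplit("-", 1)
    return AppVersion(release, int(build), int(db_version))


print(parse_app_version("23.1-2/78"))
# AppVersion(release='23.1', build=2, db_version=78)
```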

Before you start testing or bug fix verification, make sure that you are testing the correct application version by checking:

– the build id and

– the DB version

As you already know, you should ask the release engineer to provide you with an interface to easily identify the application version.

Finally,

– the code is written

– the testing is finished, and the bug fixes are made and verified

– acceptance testing is finished

– at our Go/No-Go meeting we decided that we are ready for our first major release 1.0. Hooray!

Our first live application version will be 1.0-23/34

1. Configure the production machine, e.g. create needed directories: /var/www/prod/cgi-bin/, etc.

2. Upload the SQL procedure to the production machine and run that procedure against the DB to create the DB schema with version 34.

3. Configure the build script to create the build on the production machine.

BTW

The production machine is simply a remote physical computer located at our hosting provider. That machine has a unique (among all computers on the Internet) identifier – an IP address, also called an external IP address. The format of the IP address is <0-255>.<0-255>.<0-255>.<0-255>.

BTW

If you want to find out the IP address of some server, do this (instructions are for Windows OS):

1. Click button “Start”.

2. Select option “Run”.

3. Type “cmd” in the dialog box and press “Enter”.

4. When the command prompt is invoked, type this:

ping www.google.com

Note that “https://” is not needed.

The value after “Reply from” is the IP address of one of the production Google machines. If we were within the sharelane.com Intranet, we could find out the internal IP address of the Star by typing:

ping star

or

ping main.sharelane.com

or by typing any other hostname of a Web site on the Star; e.g., billy.sharelane.com after “ping”

The difference between an external IP address and an internal IP address is that

– an external IP address must be unique among all computers on the Internet. An external IP address is like the address of an apartment; it must be unique among all apartments in the world, otherwise mail cannot be sent there.

– an internal IP address must be unique only among all computers on an Intranet. An internal IP address is like the conference room within a company; it must have a unique name among all other conference rooms within the company, otherwise we’ll be confused about where the meeting is: Is it in the conference room “Infinity” located next to the espresso machine, or in the conference room “Infinity” located next to the printers?
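The difference can also be checked in code. The sketch below uses Python’s standard `ipaddress` module. It assumes that “internal” roughly corresponds to the address ranges reserved for private networks (which `is_private` reports), and the two sample addresses are just illustrations, not real ShareLane machines.

```python
import ipaddress


def describe(address: str) -> str:
    """Say whether an IP address looks internal or external."""
    ip = ipaddress.ip_address(address)  # raises ValueError if malformed
    return "internal (private range)" if ip.is_private else "external"


print(describe("192.168.1.10"))  # internal (private range)
print(describe("8.8.8.8"))       # external
```

Passing something outside the <0-255> format, such as "256.1.1.1", raises a ValueError.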

4. Run a build script to create a build on a production machine. The build script checks out files of the application version that we are going to release from CVS and copies those files to the production machine.

BTW

At ShareLane,

– Linux utility scp is used to copy files between Linux machines.

– Windows utility WinSCP is used to copy files between Windows and Linux machines.

Both utilities are free. You can find URLs to all the utilities mentioned in this Course under Downloads on qatutor.com, or you can just google their names.

As our project evolves, our production environment will turn into tens of servers, which will form our production pool, but for now:

Ladies and gentlemen, our first release of sharelane.com is LIVE!

Guess what? Users seem to like us! Our user base grows like crazy, and now we have hired two PMs, four more developers for the application core, one UI designer/developer, one DBA, one tester, and one CS (customer support) person. After another three weeks of hard work, we release version 2.0! But, once we poured the champagne to celebrate 2.0, our CS Nina bursts into the conference room and screams that she’s been getting tons of complaints from users, because 2.0 is as saturated with bugs as the U.S. Congress is with lobbyists for a military industrial complex.

– Push 1.0 to prod, i.e. revert back* to the good old version.

* another term is “rollback”

It might look like the first option is problematic, because at ShareLane nobody seems to remember the exact version of each file for the production version of 1.0. Our build script is primitive, and as we create each new build, we don’t save the association between the build id and the file versions that belong to that build. But after a couple of beers, Willy declares that the first option is doable, because we can recreate 1.0 by checking out from CVS the file versions dated right before 1.0 was released.

BTW

The association between the build id and the files with their versions can be recorded in a special log file. Every time we have a new build, the build script appends detailed information about that build to that text file.

Here is an example of file format:

<build id>;<DB version>;<success code: 0 – success (build is fine), 1 – failure (there were build errors)>;<filename/file version,filename/file version, etc.>;Unix timestamp.

Here is how that file might look if we had only 2 files, register.py and checkout.py:

1.0-22;32;0;register.py/1.12,checkout.py/1.8;1206612924

1.0-23;34;0;register.py/1.12,checkout.py/1.9;1206623816
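Reading such a log back is straightforward. The Python sketch below parses one line in the format shown above; the function name `parse_build_log_line` and the dictionary keys are hypothetical, chosen only for this example.

```python
from datetime import datetime, timezone


def parse_build_log_line(line: str) -> dict:
    """Parse '<build id>;<DB version>;<success code>;<files>;<timestamp>'."""
    build_id, db_version, code, files, timestamp = line.strip().split(";")
    return {
        "build_id": build_id,
        "db_version": int(db_version),
        "ok": code == "0",  # 0 = success, 1 = failure
        # "register.py/1.12,checkout.py/1.9" -> {"register.py": "1.12", ...}
        "files": dict(item.rsplit("/", 1) for item in files.split(",")),
        "built_at": datetime.fromtimestamp(int(timestamp), tz=timezone.utc),
    }


entry = parse_build_log_line(
    "1.0-23;34;0;register.py/1.12,checkout.py/1.9;1206623816"
)
print(entry["files"])  # {'register.py': '1.12', 'checkout.py': '1.9'}
```

With this information saved for every build, recreating an old release means looking up its build id and checking out exactly those file versions.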

ShareLane -> See the build log of ShareLane here: Test Portal>Release Engineering>build_log.txt

In the general case, it’s not easy to figure out if the second option (bug fix) is more or less attractive than the first one (rollback to old version), but this time it was easy for us. The reason is that we HAVE TO go for the second option. Why? Because we’ve just discovered that our DB procedures … have not been checked into CVS and nobody remembers the exact DB schema used for 1.0. So, even if we get the files for 1.0, there is a good chance that they will be incompatible with the DB schema for 2.0 and thus we might get lots of bugs.

Example

register.py (1.0) queries table users (DB schema for 1.0), but if the DB schema for 2.0 has this table renamed to customers, register.py (1.0) will not work.

– The programmers who fix the bugs for 2.0 don’t work on 3.0.

– The programmers who don’t fix the bugs for 2.0 cannot do a CVS checkin for 3.0, because it was decided to lock the CVS and allow only bug fixes for 2.0 to be checked in.

– The tester is spending his time verifying bug fixes for 2.0 instead of writing test cases for 3.0.

After the 2.1 patch release where all the nasty bugs have been fixed, we have a meeting in which Billy suggests that we have to take our version control to the next level by creating branches in CVS. He says:

“Okay, guys and gals. I have a four-year-old son Edward, and I have to send photographs of him once a month by email to my mother-in-law who lives in Rome. If a photo shows that Eddie is sick, then she calls and screams with anger, just as if she’s used our 2.0. So what I do is this: I save photos of Edward looking good in a special folder, and if my mother-in-law starts to complain after getting a photo where Edward doesn’t look healthy enough, I just say to her, “Wait a minute, that was the wrong photo,” and I email her a photo from my golden reserve of alive-and-kicking Eddies.

“Here is the story of our project from the release engineer’s point of view:

“One day we started to write our code, and as we proceeded further, we decided to use CVS to store versions of our files. At one point, we said ‘Stop’ and decided to call whatever we had in the CVS ‘version 1.0.’ Then we started to add new files to the CVS and check in new versions of existing files, and again, at one point, we said ‘Stop’ and decided that whatever was in the CVS must be called ‘version 2.0.’ We did everything right, except one thing: the files of 1.0 and 2.0 got mixed up because we didn’t separate them.

“Now, imagine a tree: a trunk and branches.

Here is what we should have done from the beginning:

– The files created for and up to the 1.0 release make up the trunk of a virtual tree in the CVS. The dot at the right end of the trunk is called HEAD: it marks the most recent checkin and hence the most recent version of the trunk.

[Figure trunk1: the trunk, with a dot at its right end marking HEAD]

– Once we say, “Stop” for 1.0, we create (or “cut”) a virtual CVS branch from the trunk, and that branch will contain our files for release 1.0 (the trunk will have those files, too).

[Figure trunk2: branch 1.0 cut from the trunk]

– So now we have a CVS trunk and a CVS branch 1.0.

– The programmer who writes code for 2.0 must check his files into the trunk (dotted line).

[Figure trunk3: checkins for 2.0 continue along the trunk (dotted line)]

– Once the 2.0 code is finished, we cut another branch called 2.0.

– Now we have a trunk, a branch 1.0, and a branch 2.0.

[Figure trunk4: the trunk with branch 1.0 and branch 2.0]

What shall we use to add/checkin the files for 3.0? Of course, the trunk (dotted line)…

[Figure trunk5: checkins for 3.0 continue along the trunk (dotted line)]

– and so on.

This way, the code of each finished release lives in its own branch, while the code of the coming release lives in the continuously updated trunk.

There are a lot of nuances about branching, but for now it’s important that you grasp the concept of why branching is necessary.
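To make the trunk-and-branch idea concrete, here is a toy model in Python. It is not how CVS works internally: the class ToyRepo, the file checkout.py, and all version numbers are invented for illustration. It only shows that cutting a branch freezes a snapshot while the trunk keeps moving.

```python
class ToyRepo:
    """Toy model of a trunk plus release branches (not real CVS)."""

    def __init__(self):
        self.trunk = {}      # file name -> latest version on the trunk (HEAD)
        self.branches = {}   # branch name -> snapshot of file versions

    def checkin(self, filename, version):
        """Check a new version of a file into the trunk."""
        self.trunk[filename] = version

    def cut_branch(self, name):
        """Copy the current state of the trunk (HEAD) into a new branch."""
        self.branches[name] = dict(self.trunk)

    def checkout(self, name):
        """Return the file versions stored in a branch."""
        return dict(self.branches[name])

repo = ToyRepo()
repo.checkin("register.py", "1.1")
repo.cut_branch("1.0")               # we said "Stop" for 1.0
repo.checkin("register.py", "1.2")   # work for 2.0 continues on the trunk
repo.checkin("checkout.py", "1.1")
repo.cut_branch("2.0")               # we said "Stop" for 2.0
repo.checkin("register.py", "1.3")   # work for 3.0 continues on the trunk
```

After these steps, branch 1.0 still holds register.py 1.1 even though the trunk has moved on to 1.3, so going back to an older release is just a checkout of its branch.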

What about our stuff? What’s done is done. In our messy CVS, we have:

– all code for 1.0

– all code for 2.1

– part of the code for 3.0.

Let’s call the trunk whatever we have in the CVS now. I’ll spend my time finding the exact version of every file that went into 2.1, and I’ll create a branch for 2.1, so if we release a buggy 3.0, we can easily go back to 2.1 by checking out the files from branch 2.1 and sending them to prod. And of course, from now on, we cut a separate branch for each release.

And, last but not least, I’m going to fix our build scripts to

– enable logging associations between

1. The build number;

2. The file versions in that build;

3. The time when the build is pushed to the target environment.

– enable the build script to create any past build when the build id is provided as input.

And … I’m personally going to kick in the butt anyone who modifies the DB schema and doesn’t check the SQL procedures into the CVS.”
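The two build-script changes from Billy’s list can be sketched roughly as follows. This is an assumption about what such a log could look like, not the actual ShareLane build script: the class BuildLog, the build number 17, the file versions, and the timestamp are all made up.

```python
from datetime import datetime, timezone

class BuildLog:
    """Remembers which file versions went into each build (illustrative only)."""

    def __init__(self):
        self._records = {}

    def record(self, build_id, file_versions, pushed_at):
        """Store the build number, its file versions, and the push time."""
        self._records[build_id] = {
            "files": dict(file_versions),
            "pushed_at": pushed_at,
        }

    def files_for(self, build_id):
        """Return the exact file versions needed to recreate a past build."""
        return dict(self._records[build_id]["files"])

log = BuildLog()
log.record(
    build_id=17,
    file_versions={"register.py": "1.2", "checkout.py": "1.1"},
    pushed_at=datetime(2024, 1, 3, 16, 0, tzinfo=timezone.utc),
)
to_rebuild = log.files_for(17)
```

With a record like this, recreating a past build is a lookup by build id followed by a checkout of exactly those file versions.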

First, we can easily get back to any of the previous versions.

Second, the results of their work on each of the versions will be separated in CVS by the branching mechanism.

Third, we can control the state of each branch and of the trunk. Let’s set up our branching mechanism so that a branch can be in one of three states:

OPEN: we can add/delete/checkin files (to/from/into CVS) without getting approvals, meeting certain conditions, etc. The trunk is always open.

CONDITIONALLY OPEN: we can add/delete/checkin files if we meet certain conditions depending on concrete situations. For example, in some companies approval from the dev manager is needed to add/delete/checkin files if a bug is found during acceptance testing, i.e. at the end of stage “Testing and bug fixes”.

LOCKED: this applies to all branches with past/present production versions.
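The three states boil down to a small permission rule. The sketch below assumes that LOCKED means no checkins at all and reduces "certain conditions" to a single yes/no flag; the names BranchState and can_checkin are invented for illustration.

```python
from enum import Enum

class BranchState(Enum):
    OPEN = "open"
    CONDITIONALLY_OPEN = "conditionally open"
    LOCKED = "locked"

def can_checkin(state, condition_met=False):
    """Decide whether an add/delete/checkin is allowed for a branch state."""
    if state is BranchState.OPEN:
        return True              # no approvals or conditions needed
    if state is BranchState.CONDITIONALLY_OPEN:
        return condition_met     # e.g. the dev manager approved the change
    return False                 # LOCKED (assumed here to mean no checkins)
```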

1. During “Coding,” the trunk is OPEN; the developers who are working on the coming release can mess up the trunk as much as they like.

2. During “Testing and bug fixes,” the branch is CONDITIONALLY OPEN; the developers can do add/delete/checkin operations in the coming release branch only if they provide a valid bug number during actions with the CVS.

Example

When the developer tries to commit a file into a conditionally open CVS branch, CVS opens up a special text file, and the developer must type a valid bug number on the first line of it:

8766
CVS: ----------------------------------------------------------------------
CVS: Enter Log.  Lines beginning with `CVS:' are removed automatically
CVS:
CVS: Modified Files:
CVS:     register.py
CVS: ----------------------------------------------------------------------

8766 is a bug number. When the developer saves this text file, a special program (a CVS “trigger”) queries the DB of the bug tracking system to check whether two conditions are met:

1. A bug with this exact number exists.

2. That bug is in an open state.

If it’s a double “Yes,” then CVS allows the developer to save this text file and executes the original command passed to it. If one or both of these conditions are not met, then CVS asks the developer to enter a valid bug number.
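The check performed by such a trigger could look something like the sketch below. It is a simplification under stated assumptions: a plain dictionary stands in for the DB of the bug tracking system, the state name "open" and bug 8701 are made up, and how the trigger is wired into CVS is not shown.

```python
def first_line(log_message):
    """Return the first line of the commit log message, stripped of spaces."""
    lines = log_message.splitlines()
    return lines[0].strip() if lines else ""

def checkin_allowed(log_message, bug_states):
    """Allow the checkin only if the first line names an existing, open bug.

    bug_states stands in for the bug tracking DB: bug number -> state.
    """
    try:
        bug_number = int(first_line(log_message))
    except ValueError:
        return False             # the first line is not a number at all
    # Condition 1: the bug exists. Condition 2: the bug is in an open state.
    return bug_states.get(bug_number) == "open"

# Hypothetical contents of the bug tracking system.
bugs = {8766: "open", 8701: "closed"}
```

Here checkin_allowed("8766\nCVS: Modified Files:", bugs) passes both conditions, while a closed bug (8701), an unknown number, or free text on the first line is rejected.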

3. Once the code is in production, then the corresponding release branch is LOCKED.

BTW

When a bug is found in production, the following situation can occur:

The developer

– checks in his fixed code into the branch with the patch release and

– forgets to checkin the fix into the trunk.

The consequence of that forgetfulness is this: When the next branch is cut from the trunk, that new branch will have the same bug. That’s why we can have a situation where a bug that has been fixed in a previous production version (e.g., 2.1) reappears in production after a major release (e.g., 3.0).

So the golden rule is to create a test case for each bug found in production. Add this test case to a special test suite called “Test Cases for Production Bugs.” I recommend keeping that suite dynamic: you add new test cases as production bugs are found (and fixed), and you retire a test case from it once you have executed it for the coming release and for the release after that.
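The "dynamic suite" rule can be expressed as a tiny bookkeeping class. This is only one possible reading of the rule: the class name, the test case name, and the release numbers are invented, and the sketch says nothing about where a retired test case goes afterwards.

```python
class ProductionBugSuite:
    """Toy version of the "Test Cases for Production Bugs" suite."""

    RELEASES_BEFORE_RETIREMENT = 2   # the coming release and the one after it

    def __init__(self):
        self._runs = {}   # test case name -> releases it was executed for

    def add(self, test_case):
        """Add a test case for a newly found (and fixed) production bug."""
        self._runs.setdefault(test_case, set())

    def record_execution(self, test_case, release):
        """Note that the test case was executed for the given release."""
        self._runs[test_case].add(release)

    def retire_finished(self):
        """Drop test cases already executed for two releases; return them."""
        done = sorted(
            name for name, releases in self._runs.items()
            if len(releases) >= self.RELEASES_BEFORE_RETIREMENT
        )
        for name in done:
            del self._runs[name]
        return done

    def active(self):
        """Return the names of test cases still in the suite."""
        return sorted(self._runs)

suite = ProductionBugSuite()
suite.add("bug_8766_register")
suite.record_execution("bug_8766_register", "3.0")
retired_after_one = suite.retire_finished()
suite.record_execution("bug_8766_register", "4.0")
retired_after_two = suite.retire_finished()
```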

BTW

When we encounter a bug in production, it’s a good idea to have a postmortem. This term is borrowed from the medical field where it refers to a “medical procedure that consists of a thorough examination of a corpse to determine the cause and manner of death and to evaluate any disease or injury that may be present” (Source: Wikipedia).

By way of analogy, during a bug postmortem at a software company, we:

– do a thorough examination of why that bug was missed;

– try to identify weak points in our Cycle.

Depending on the severity of the situation, a postmortem can be held as a separate meeting or just as an email thread.

Postmortems should not be witch hunts. On the contrary, they should be constructive, positive measures targeted at improvements.

Sometimes Internet companies make a beta release prior to a major release. The idea behind a beta release is this: Before we make an official major release (in other words, a major release available to ALL possible users), we make the code of that major release available to a limited group of people (beta testers) who represent our target users.

BTW

A target user is basically a person who we expect to use our Web site. Target users can be identified by different sets of criteria: age, gender, occupation, interests, country, etc.

Beta testers are not test professionals. They are just regular folks that can be useful to us; e.g., we at sharelane.com can invite our most active users to be our beta testers if we decide to do a beta release. We can just send them an email with a secret URL; e.g., https://beta.sharelane.com. You can tempt users to become beta testers by offering them free items like t-shirts with your company logo.

1. Beta testers will report bugs to us.

2. We’ll monitor our system and see how it works under real life usage. For example, if the DB crashes during beta testing, we can assume that it will also crash after a major release when many more users are going to use that code.

As beta testing goes on, we fix bugs and deal with other discovered problems (e.g., we might decide to add more servers to improve Web site performance). An example of a beta release is the email service Gmail: until Feb. 2007, new accounts could be created by invitation only.

BTW

Please note that in some cases, a company will push out a major release with a label of “Beta.” So, if you see “Beta” on some Web site that’s available to EVERYONE, you can translate the word “Beta” as “This software is freshly baked and probably buggy. So don’t blame us if something goes wrong. Just send us an email with a description of the problem.”

As a rule, companies use beta releases in two cases:

1. The very first release (1.0) of the software.
2. The release of a large, important project; e.g., Gmail by Google.

The logical question is: “If we have beta testing, then we must have done alpha testing, right?” Yes, alpha testing is the testing done BEFORE releasing the software to beta or regular users; e.g., the testing done during the stage “Testing and bug fixes” is alpha testing. Please note that alpha testing is performed by anyone inside the company who tries the new code before it’s released. For instance, the PM can ask the developer to play with the fresh code on the developer’s playground to see how the ideas from the spec are implemented in the software.

Testers in Internet companies are in a privileged position compared to testers from other industries. If we at sharelane.com accidentally release a bug to production, we can do a patch release and remove the production problem within minutes. In many cases, that patch release will have a very low cost, and users will have no idea that a bug ever existed. But what if a P1 bug is found in the braking mechanism of an automobile?

– It will cost the auto company millions of dollars to make a recall.

– It will require active user participation to drive to the dealership to fix the problem.

A release that isn’t critically urgent must be pushed to production while the majority of users are inactive; i.e., during the night. You can define a “night” for your releases using some interval of time (for example, from 00:00 to 6:00) in the time zone where most of your target users live. This can be very difficult for Web sites with a big international exposure, like www.google.com. As a rule, U.S. companies make releases between 11:00 p.m. Pacific Standard Time (2:00 a.m. Eastern Standard Time) and 3:00 a.m. PST (6:00 a.m. EST), so they have a four-hour window when the majority of people who live in the continental U.S. are asleep.
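The time zone arithmetic behind that window is easy to get wrong, so here is a small sketch of it. It assumes fixed standard-time offsets (UTC-8 for PST, UTC-5 for EST) and therefore ignores daylight saving time; the function name and the sample date are invented.

```python
from datetime import datetime, time, timedelta, timezone

PST = timezone(timedelta(hours=-8), "PST")   # fixed offset, no daylight saving
EST = timezone(timedelta(hours=-5), "EST")

WINDOW_START = time(23, 0)   # 11:00 p.m. PST
WINDOW_END = time(3, 0)      # 3:00 a.m. PST

def in_release_window(moment):
    """Return True if a timezone-aware datetime falls inside the nightly window."""
    local = moment.astimezone(PST).time()
    # The window wraps around midnight: late evening OR early morning.
    return local >= WINDOW_START or local < WINDOW_END

start_pst = datetime(2024, 3, 5, 23, 0, tzinfo=PST)
start_est = start_pst.astimezone(EST)   # the same moment on the East Coast
```

The same moment that reads 11:00 p.m. on the West Coast is 2:00 a.m. of the next day on the East Coast, which is why the window suits both coasts.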

Right before and during the time the release to production is under way, put a polite message like this on the production homepage: “Sorry for any inconvenience. This site is under maintenance. We’ll be up at 3:00 PST.” It’s not a big deal from a technical point of view, but your users will really appreciate your consideration.

– In many cases, a coming release is not pushed to all the servers in the production pool, but rather to just one or a few of them. The logic behind it is that we don’t want to expose ALL of our users to the new code until we verify that this code works in real world conditions. So, random users hit our new code on one or several production machines and we monitor the quality in production by looking at the DB and log files. This approach is especially good for architectural releases when the front end is absolutely the same, but the back end is different.
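A partial push like this can be sketched as splitting the pool and then watching the machines that got the new code. Everything here is an assumption made for illustration: the server names, the function names, and the idea of using an error rate with a 1% threshold as the quality signal read from the logs.

```python
def split_pool(servers, new_code_count):
    """Split the production pool into servers that get the new code and the rest."""
    if not 0 < new_code_count <= len(servers):
        raise ValueError("new_code_count must be between 1 and the pool size")
    return servers[:new_code_count], servers[new_code_count:]

def release_looks_healthy(errors, requests, max_error_rate=0.01):
    """Compare the error rate seen on the new code with an agreed threshold."""
    return requests > 0 and errors / requests <= max_error_rate

pool = ["web1", "web2", "web3", "web4"]   # hypothetical server names
new_code, old_code = split_pool(pool, 1)
```

If the machines in new_code look healthy after a while, the release can be pushed to the rest of the pool; if not, only a fraction of the users were exposed to the problem.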

– Depending on the specifics of the business, Internet companies usually can predict the times when users are going to be more active than usual. For companies that sell consumer goods, like Amazon, the period between December 1st and December 24th (the Christmas season) is the hottest time of year, when a great many sales are made. If we know about that period of time beforehand, we must introduce a moratorium on any release to production, except EBF and EFR releases. The reason is simple:

– We don’t want to jeopardize our major revenues.
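A release script can enforce such a moratorium with a simple date check. The sketch below hard-codes the December 1st to December 24th period from the example above; the function names and the release type label "major" are invented for illustration.

```python
from datetime import date

ALLOWED_DURING_MORATORIUM = {"EBF", "EFR"}

def in_moratorium(day):
    """True for December 1st through December 24th of any year."""
    return day.month == 12 and 1 <= day.day <= 24

def release_allowed(day, release_type):
    """Outside the moratorium anything goes; inside it, only EBF and EFR."""
    if not in_moratorium(day):
        return True
    return release_type in ALLOWED_DURING_MORATORIUM
```

For example, release_allowed(date(2024, 12, 10), "major") is refused, while the same call with "EBF" goes through.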
