Features in TFSC are fairly standard among SCC tools

Some of the features in TFSC are fairly standard among SCC tools:

  • Workspace creation
  • Workspace synchronization
  • File checkout
  • Overlapping checkout by multiple users of the same file
  • Atomic change-set check-in
  • File diffs
  • Automated merge
  • Code-line branching
  • File-set labeling
  • User management and security

What really sets TFSC apart from the competition is its powerful merging and branching features

Tagged : / / / / /

Merging Functionality in TFSC

The merging functionality in TFSC is centered on the following typical development scenarios:

  • Scenario 1: The catch-up merge— The user wants to merge all changes from a source branch that have not yet been migrated to the target branch. The source and target can be a subtree or an individual file/folder.
  • Scenario 2: The catch-up no-merge— The user wants to discard nonmerged changes in the source branch from the set of candidate changes for future merges between the specified source and target.
  • Scenario 3: The cherry-pick merge— The user wants to merge individual change sets from the source branch to the target branch. Changes introduced to those files prior to the specified change set should not be migrated.
    • The user can specify the change sets to merge with a change set number.
    • The user can specify individual file revisions to merge between the source and target.
  • Scenario 4: The cherry-pick no-merge— The user wants to discard a single change set from the list of all possible changes to merge between the source and target so that this change set never appears in the list of candidates for a cherry pick merge.
  • Scenario 5: Merge history query— The user wants to know whether the specified change set has been merged into the target branch. If it has, the user wants to know what change set the merge was committed in. The user also wants to know if part of the change set has been merged, but not all.
  • Scenario 6: Merge candidate query— The user wants to obtain a list of change sets that have been committed to a source branch but have not yet been migrated to the target branch. From this list, the user selects change sets to migrate with a cherry pick merge.

How TFSC Addresses the Scenarios

TFSC merging is designed to provide users with an extremely powerful and flexible tool for managing the contents of branches. Merges can be made into a single file or into a tree of related files. Merges can also migrate the entire change history of the specified source files or an individual change set or revision that might contain a specific fix or feature that should be migrated without moving other changes from the source in the process. Merging the entire change history prior to a given point in time is known as a catch-up merge (Scenarios 1 and 2), whereas selecting individual change sets or revisions to merge is known as a cherry-pick merge (Scenarios 3 and 4). The merge command also allows users to query for merge history and merge candidates and perform the actual merge operation.

TFSC presents merge history and candidate merges as a list of change sets that have or can be migrated between a source and a target branch. Merges can be made to a subset of files in a change set, creating a situation in which a partial change set has been merged. In this case, TFSC represents the partial state of the merge and allows the user to finish merging the change set later.

Merges are pending changes in TFSC. The user can choose to perform several merge operations within a workspace without committing changes following each merge. All these merges can be staged in the user’s workspace and committed with a single check-in as a single change set. In addition, the pending merge operation can be combined with the checkout and rename commands to interject additional changes to the files that will be committed with the merge.

Hopefully you followed this summary and are still with me. Now let’s go into how branching works in TFSC.

Reference: The Build Master: Microsoft’s Software Configuration Management Best Practices

Tagged : / / / / /

Branching in TFSC

Branching in TFSC

Branching is the SCM operation of creating an independent line of development for one or more files. In a sense, branching a file results in two identical copies of the original file that can be modified as desired. Changes in the old line are not, by default, reflected in the new line and vice versa. Explicit operations can be performed to merge changes from one branch into another.

There are many different reasons for branching and many different techniques to accomplish it. In the most common scenarios, branching is reasonably simple, but branching can become complicated. A complex system with lots of branched files can be hard to visualize. I recommend mapping this with a visual product (such as Visio) so that the picture is clear.

Following are a handful of scenarios in which branching is interesting. Any SCM team should adopt these definitions.

Release Branching

We’ve been working on a Version 1 release for a year now, and it is time to begin work on Version 2. We need to finish coding Version 1—fixing bugs, running tests, and so on—but many of the developers are finished with their Version 1 work (other than occasional interruption for bug fixes) and want to start designing and implementing features for Version 2. To enable this, we want to create a branch off the Version 1 tree for the Version 2 work. Over time, we want to migrate all the bug fixes we make in the process of releasing Version 1 into the Version 2 code base. Furthermore, we occasionally find a Version 1 bug that happens to be fixed already in Version 2. We want to migrate the fix from the Version 2 tree into the Version 1 tree.

Promotion Modeling

Promotion modeling is equivalent to release branching, where each phase is a release. It is a development methodology in which source files go through stages. Source files might start in the development phase, be promoted to the test phase, and then go through integration testing, release candidate, and release. This phasing serves a couple of purposes. It allows parallel work in different phases, and it clearly identifies the status of all the sources. Separate branches are sometimes used for each phase of the development process.

Developer Isolation

A developer (or a group) needs to work on a new feature that will be destabilizing and take a long time to implement. In the meantime, the developer needs to be able to version his changes (check in intermediate progress, and so on). To accomplish this, he branches the code that he intends to work on and does all his work independently. Periodically, he can merge changes from the main branch to make sure that his changes don’t get too far out of sync with the work of other developers. When he is done, he can merge his changes back into the main branch.

Developer isolation also applies when semi-independent teams collaborate on a product. Each team wants to work with the latest version of its own source but wants to use an approved version of source from other teams. The teams can accomplish this in two ways. In the first way, the subscribing team “pulls” the snapshot that it wants into its configuration, and in the second way, the publishing team publishes the “approved” version for all the client teams to pick up automatically.

Label Branching

We label important points in time, such as every build that we produce. A partner team picks up and uses our published builds on a periodic basis, perhaps monthly. A couple of weeks after picking up a build, the team discovers a blocking bug. It needs a fix quickly but can’t afford the time to go through the approval process of picking up an entirely new build. The team needs the build it picked up before plus one fix. To do this, we create a branch of the source tree that contains all the appropriate file versions that are labeled with the selected build number. We can fix the bug in that branch directly and migrate the changes into the “main” branch, or we can migrate the existing fix (if it had been done) from the “main” branch into the new partner build branch.

Component Branching

We have a component that performs a function (for simplicity, let’s imagine it is a single file component). We discover that we need another component that does nearly the same thing but with some level of change. We don’t want to modify the code to perform both functions; rather, we want to use the code for the old component as the basis for creating the new component. We could just copy the code into another file and check it in, but among other things, the new copy loses all the history of what brought it to this point. The solution is to branch the file. That way, both files can be modified independently, both can preserve their history, and bug fixes can be migrated between them if necessary.

Partial Branching

Partial branching is equivalent to component branching, where the “component” is the versioned product. In this case, we work on a product that has a series of releases. We shipped the Everett release and are working on the Whidbey release. As a general rule, all artifacts that make up each version should be branched for the release (source, tools, specs, and so on). However, some versioned files aren’t release specific. For example, we have an emergency contact list that has the home phone numbers for team members. When we update the list, we don’t want to be bothered with having to merge the changes into each of the product version branches, yet the developers who are enlisted in each version branch want to be able to sync the file to their enlistment.

Identifying Branches (Configurations)

When a file is branched, it is as if a new file is created. We need a way to identify that new file. Historically, this has been done by including the version number of the file as part of the name of the file. In such a mechanism, the version number consists of a branch number and a revision number. A branch number is formed by taking the version number of the file to be branched, appending an integer, and then adding a second integer as a revision number. For example, 1.2 becomes 1.2.1.1 (where 1.2.1 is the branch number and 1 is the revision number).

 

Reference: The Build Master: Microsoft’s Software Configuration Management Best Practices

Tagged : / / / / /

Offline Checkout/Check-In in TFSC

  1. A contributor syncs his workspace and takes his laptop home for the evening.

  2. At home, he continues working and chooses to check out a file.

  3. An unmodified copy of the checked-out file is placed in the contributor’s cache on his local computer.

  4. The contributor continues to work and check out additional files. Unmodified copies of all these files are placed in the cache.

  5. When the feature is complete, the user attempts to check in the change set. Because the user is offline, the check-in option is not available.

  6. Wanting to begin work on the next feature, the user shelves his modifications for retrieval and check-in when he is able to go back online.

Tagged : / / /

Hardware for The Build Lab

The build lab should include some high-end hardware for building the applications. Because the entire team depends on the results of a build, the high-end computers ensure that the build is completed as quickly as possible. Furthermore, you can use high-speed network equipment to push bits around from source control to build machines to release servers.

At a minimum, the build lab should have four machines:

  • Server that contains the Source Code Control program— This is your product. Do you really want this server residing someplace where you have little control over this box?
  • Debug build machine for the mainline builds— If you don’t separate your debug and release machines, you will accidentally ship debug binaries, which is not a good thing.
  • Release build machine for the mainline builds— This is a “golden goose” that turns out the “gold eggs” of your company or group. Treasure this machine like a princess, and guard it like all the king’s fortunes.
  • Internal release share server— This is one more piece of hardware that stores the “bread and butter” of the group or company. Don’t give up control of this hardware to anyone unless your IT department reports through your development group.

Hardware Requirements

Each machine in the preceding list should meet the following requirements:

  • Number of processors— This depends on the build tool you use. One is usually sufficient, because few build tools really take advantage of multiple processors.
  • Processor speed— The lab budget dictates this, but the faster the processor, the better it is.
  • Amount of installed RAM— Max out the machine. RAM is relatively cheap these days, especially when you consider the performance increase you get. Increasing the RAM is usually the first upgrade done when trying to improve the performance of any computer.
  • Number of hard drives— A minimum of two drives (or partitions) is preferred:
    • Drive 1 (C:) is for the operating system and installed applications.
    • Drive 2 (D:) is for building binaries, release shares, or the source database; the minimum space required is roughly ten times the space needed to build your application.
    • The split partitions are good because if you ever need to format or blow away a drive due to corruption, only part of the project will be affected. The recovery is much faster and easier.
  • Hard drive type— This is most likely SCSI, but it could be IDE.
  • Number of power supplies— If you purchase server class hardware (pizza boxes) that belong in racks, you need to consider how many power supplies to order.
  • Motherboard BIOS version— This does make a difference. Make sure you note what is being used and standardize on it.
Tagged : / / / / /

XML Is the Here, the Now, and the Future

XML Is the Here, the Now, and the Future

XML is short for Extensible Markup Language, which is a specification developed by the World Wide Web Consortium (W3C). XML is a pared-down version of SGML, designed especially for Web documents. It allows designers to create their own customized tags, enabling the definition, transmission, validation, and interpretation of data between applications and between organizations.

Following are three good reasons why you should master XML:

  • XML is seen as a universal, open, readable representation for software integration and data exchange. IBM, Microsoft, Oracle, and Sun have built XML into database authoring.
  • .NET and J2EE (Java 2 Platform, Enterprise Edition) depend heavily on XML.
    • All ASP.NET configuration files are based on XML.
    • XML provides serialization or deserialization, sending objects across a network in an understandable format.
    • XML offers SOAP Web Services communication.
    • XML offers temporary data storage.
  • MSBuild and the future project files of Visual Studio will be in XML format. ANT is also XML based.

Thus, if you want to learn one language that will cover many tools and technologies no matter what platform you are working on, that language is XML. The main difference in all these build tools is not so much the feature set but the syntax. I get tired of learning all the quirks of new languages, but I’m happy to learn XML because it’s here to stay and it’s fairly easy to learn.

Tagged : / / / / /

Some new terms of TFSC

  • Repository— The data store containing all files and folders in the TFSC database.
  • Mapping— An association of a repository path with a local working folder on the client computer.
  • Working folder— A directory on the client computer containing a local copy of some subset of the files and folders in a repository.
  • Workspace— A definition of an individual user’s copy of the files from the repository. The workspace contains a reference to the repository and a series of mappings that associate a repository path with a working folder on the user’s computer.
  • Change set— A set of modifications to one or more files/folders that is atomically applied to the repository at check-in.
  • Shelve— The operation of archiving all modifications in the current change set and replacing those files with original copies. The shelved files can be retrieved at a later time for development to be continued. This is my favorite feature.
Tagged : / / / / / / /

Important Configuration Definitions

  • Source code— Files written in high-level languages such as C# that need to be compiled (for example, foo.cs).
  • Source(s)— All the files involved in building a product (for example, C, CPP, VB, DOC, HTM, H, and CS). This term is used mostly as a catch-all phrase that is specific not only to source code files but to all the files that are stored in version tracking systems.
  • Codeline— A tree or branch of code that has a specific purpose, such as the mainline, release line, or hotfix line that grows collectively.
  • Mainline or trunk (“The Golden Tree”)— The main codeline of the product that contains the entire source code, document files, and anything else necessary to build and release the complete product.
  • Snapshot— A specific point in time in which the sources and build are captured and stored, usually on a release or build machine.
  • Milestone— A measurement of work items that includes a specified number of deliverables for a given project scheduled for a specified amount of time that are delivered, reviewed, and fixed to meet a high quality bar. The purpose of a milestone is to understand what is done, what is left to do, and how that fits with the given schedule and resources. To do this, the team must complete a portion of the project and review it to understand where the project is in the schedule and to reconcile what is not done with the rest of the schedule. A milestone is the best way to know how much time a portion of the project will take.
  • Code freeze— A period when the automatic updates and build processes are stopped to take the final check-ins at a milestone.
  • Public build— A build using the sources from the mainline or trunk.
  • Private build (also referred to as a sandbox build)— A build using a project component tree to build more specific pieces of the product. This is usually done prior to checking in the code to the mainline.
  • Branching— A superset of files off the mainline taken at a certain time (snapshot) that contains new developments for hotfixes or new versions. Each branch continues to grow independently or dependently on the mainline.
  • Forking— Cloning a source tree to allow controlled changes on one tree while allowing the other tree to grow at its own rate. The difference between forking and branching is that forking involves two trees, whereas branching involves just one. It is also important to note that forking or cloning makes a copy (snapshot) of the tree and does not share the history between the two trees, whereas branching does share the history.
  • Virtual Build Labs (VBLs)— A Virtual Build Lab is a build lab that is owned by a specific component or project team. The owner is responsible for propagating and integrating his code into the mainline or public build. Each VBL performs full builds and installable releases from the code in its source lines and the mainline. Although the term virtual is used in the name of the labs, don’t confuse it with Virtual PC or Virtual Machines because the labs are real physical rooms and computer boxes. It is not recommended that you use Virtual software for build machines except possibly for an occasional one-off or hotfix build. , “The Build Lab and Personnel.” There is usually a hierarchy of VBLs so that code “rolls up” to the mainline or trunk. For example, let’s say that you have a mainline, Project A is a branch off of the mainline, and Developer 1 has a branch off the project branch. Developer 1 has several branches off his branch, with each branch representing a different component of the product. If he wants to integrate one of his branches into main, he should first merge his changes with all the levels above the branch to make sure he gets all the changes. Alternatively, he can just roll the changes into main, which sits higher in the hierarchy. This will become clearer in the next couple of pages.
  • Reverse integration (RI)— The process of moving sources from one branch or tree to another that is higher in the VBL hierarchy.
  • Forward integration (FI)— The process of moving sources from one branch or tree to another that is lower in the VBL hierarchy.
  • Buddy build— A build performed on a machine other than the machine that the developer originally made changes on. This is done to validate the list of changed files so that there are no unintended consequences to the change in the mainline build.
Tagged : / / / / / / / /

More Important Build Definitions

I need to define some common build terms that are used throughout this article. It is also important for groups or teams to define these terms on a project-wide basis so that everyone is clear on what he is getting when a build is released.

  • Pre-build— Steps taken or tools run on code before the build is run to ensure zero build errors. Also involved are necessary steps to prepare the build and release machines for the daily build, such as checking for appropriate disk space.
  • Post-build— Includes scripts that are run to ensure that the proper build verification tests (BVTs) are run. This also includes security tests to make sure the correct code was built and nothing was fused into the build.
  • Clean build— Deleting all obj files, resource files, precompiled headers, generated import libraries, or other byproducts of the build process. I like to call this cleaning up the “build turds.” This is the first part of a clean build definition. Most of the time, build tools such as NMake.exe or DevEnv.exe handle this procedure automatically, but sometimes you have to specify the file extensions that need to be cleaned up. The second part of a clean build definition is rebuilding every component and every piece of code in a project. Basically the perfect clean build would be building on a build machine with the operating system and all build tools freshly installed.
  • Incremental build— The secret to getting out a daily build to the test team, regardless of circumstances, is to perform incremental builds instead of daily clean builds. This is also the best way that you can maintain quality and a known state of a build. An incremental build includes only the code of the source tree that has changed since the previous build. As you can guess, the build time needed for an incremental build is just a fraction of what a clean build takes.
  • Continuous integration build— This term is borrowed from the extreme programming (XP) practice. It means that software is built and tested several times per day as opposed to the more traditional daily builds. A typical setup is to perform a build every time a code check-in occurs.
  • Build break— In the simplest definition, a build break is when a compiler, linker, or other software development tool (such as a help file generator) outputs an error caused by the source code it was run against.
  • Build defect— This type of problem does not generate an error during the build process; however, something is checked into the source tree that breaks another component when the application is run. A build break is sometimes referred to or subclassed as a build defect.
  • Last known good (LKG) or internal developers workstation (IDW) builds— These terms are used as markers to indicate that the build has reached a certain quality assurance criterion and that it contains new high-priority fixes that are critical to the next baseline of the shipping code. The term LKG originated in the Visual Studio team, and IDW came from the Windows NT organization. LKG seems to be the more popular term at Microsoft.
Tagged : / / / / / / / /

Types of Builds: Developers and Project

I like to say that there are really only two types of builds: ones that work and ones that don’t. Seriously, though, when you’re shipping a product, you should consider these two different types of builds:

  • Developers’ (local machine builds)— These types of builds often happen within an editor such as Visual Studio, Emaqs, Slick, or VI. Usually, this is a fast compile/link of code that the developer is currently working on.
  • Project (central build process)— This type of build typically involves several components of an application, product, or a large project, such as Windows, or in some cases several projects included in a product, such as Microsoft Office.

The developer’s build process should be optimized for speed, but the project build process should be optimized for debugging and releases. I am talking about optimizing the process, not compiler or linker optimization switches. Although speed and debugging are important to everyone who is writing code, you must design a project build process to track build breaks and the offender(s) as quickly as possible because numerous people are waiting for a build to be released. For a developer, what seems to be most important is clicking some type of Build and Run button to make sure the code compiles without errors and then checking it in. For the build team, building without errors and having the ability to track down the person who broke the build is the most important thing.

 

Reference: The Build Master: Microsoft’s Software Configuration Management Best Practices

Tagged : / / /