Top 10 Face Recognition Tools

What are Face Recognition Tools?

Face recognition tools refer to software or systems that utilize computer vision and machine learning techniques to automatically detect, analyze, and recognize human faces from images or video data. These tools are designed to identify individuals based on unique facial features and can be used for a variety of applications, including security, access control, user authentication, personalized experiences, surveillance, and more.

Face recognition tools typically consist of algorithms and models that are trained on large datasets to learn facial patterns, features, and variations. They leverage deep learning techniques, such as convolutional neural networks (CNNs), to extract facial embeddings or representations that capture the distinctive characteristics of each face. These embeddings are then compared with existing face templates or a database of known faces to determine similarity or identity.
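As a concrete illustration of that comparison step, here is a minimal, stdlib-only Python sketch: two embeddings are treated as vectors, and a hypothetical, untuned distance threshold decides whether they belong to the same person. The vectors and the threshold are invented for illustration; real systems use 128-dimensional or larger embeddings and thresholds tuned on validation data.

```python
import math

def euclidean_distance(a, b):
    """Distance between two embedding vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_same_person(emb1, emb2, threshold=0.6):
    """Hypothetical verification rule: accept if distance is under a tuned threshold."""
    return euclidean_distance(emb1, emb2) < threshold

# Toy 4-dimensional "embeddings" (real systems use 128+ dimensions).
alice_a = [0.10, 0.20, 0.30, 0.40]
alice_b = [0.12, 0.19, 0.31, 0.38]
bob     = [0.90, 0.10, 0.70, 0.20]

print(is_same_person(alice_a, alice_b))  # close vectors -> True
print(is_same_person(alice_a, bob))      # distant vectors -> False
```

Verification (one-to-one comparison) and identification (one-to-many search) both reduce to this distance test; only the number of comparisons differs.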

Here are 10 popular face recognition tools that are widely used in various applications:

  1. OpenCV
  2. Dlib
  3. TensorFlow
  4. Microsoft Azure Face API
  5. Amazon Rekognition
  6. FaceNet
  7. Kairos
  8. Face Recognition (by ageitgey)
  9. Luxand FaceSDK
  10. FaceX

1. OpenCV:

OpenCV (Open Source Computer Vision Library) is a versatile open-source computer vision library that provides face detection and recognition functionalities. It offers robust face detection algorithms and pre-trained models for facial recognition.

Key features:

  • Image and Video Processing: OpenCV provides a comprehensive set of functions and algorithms for image and video processing. It supports reading, writing, and manipulation of images and videos in various formats. It offers operations such as resizing, cropping, rotation, filtering, and blending.
  • Image and Video Capture: OpenCV allows capturing video from cameras or reading video files. It provides an interface to interact with cameras and grab frames in real time. It supports a variety of camera interfaces and formats, making it versatile for different platforms.
  • Object Detection and Tracking: OpenCV includes algorithms for object detection and tracking in images and videos. It provides pre-trained models and functions for popular object detection techniques like Haar cascades and deep learning-based methods. These capabilities are widely used in applications like face detection, pedestrian detection, and motion tracking.
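The Haar-cascade detectors mentioned above owe their speed to the integral image (summed-area table), which makes any rectangular feature computable in constant time. Below is a small pure-Python sketch of that trick; it illustrates the underlying idea only and does not use OpenCV's actual cascade API.

```python
def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img over rows < y and cols < x.
    Sized (h+1) x (w+1) so rectangle sums need no edge cases."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of any rectangle in O(1) using four table lookups."""
    b, r = top + height, left + width
    return ii[b][r] - ii[top][r] - ii[b][left] + ii[top][left]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 2, 2))  # 1+2+4+5 = 12
print(rect_sum(ii, 1, 1, 2, 2))  # 5+6+8+9 = 28
```

A cascade evaluates thousands of such rectangle differences per candidate window, which is only practical because each one costs four lookups.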

2. Dlib:

Dlib is a powerful open-source library that includes facial landmark detection, face detection, and face recognition capabilities. It provides high-quality and accurate face recognition algorithms and models.

Key features:

  • Face Detection: Dlib includes highly accurate face detection algorithms that can identify faces in images or video frames. It utilizes a combination of Haar cascades, HOG (Histogram of Oriented Gradients), and SVM (Support Vector Machines) to detect faces with high precision.
  • Facial Landmark Detection: Dlib provides facial landmark detection algorithms that can identify specific points on a face, such as the positions of the eyes, nose, mouth, and jawline. These landmarks are essential for tasks like face alignment, emotion analysis, and face morphing.
  • Object Detection: Dlib offers object detection algorithms based on a combination of HOG features and SVM classifiers. It allows users to train their own object detectors or use pre-trained models for detecting various objects in images or video frames.
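HOG features, which dlib's detector builds on, reduce an image patch to a histogram of gradient orientations. The following toy Python sketch computes such a histogram over the interior pixels of a patch; it is a simplified illustration only (no cells, blocks, or normalization as in the full HOG pipeline).

```python
import math

def hog_histogram(patch, bins=9):
    """Histogram of gradient orientations over an image patch (interior pixels).
    Each interior pixel votes into an orientation bin, weighted by gradient magnitude."""
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]   # horizontal gradient
            gy = patch[y + 1][x] - patch[y - 1][x]   # vertical gradient
            mag = math.hypot(gx, gy)
            # Unsigned orientation in [0, 180) degrees, as in classic HOG.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / (180.0 / bins)) % bins] += mag
    return hist

# A patch with a vertical edge: gradients point horizontally (angle 0).
patch = [[0, 0, 10, 10]] * 4
hist = hog_histogram(patch)
print(hist[0] > 0 and all(v == 0 for v in hist[1:]))  # True: all votes land in bin 0
```

The edge direction shows up as a spike in one bin, which is what makes the descriptor discriminative while staying robust to small shifts.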

3. TensorFlow:

TensorFlow, an open-source machine learning framework developed by Google, offers face recognition capabilities through its deep learning models and APIs. It provides pre-trained models for face recognition tasks and allows users to develop custom face recognition models.

Key features:

  • Flexibility and Scalability: TensorFlow provides a flexible and scalable platform for developing machine learning models. It supports both high-level APIs, such as Keras, for easy model building, as well as low-level APIs that offer greater flexibility and control over model architecture and training process.
  • Deep Learning Capabilities: TensorFlow is particularly known for its robust support for deep learning models. It offers a wide range of pre-built layers and operations for building deep neural networks, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers. It also provides pre-trained models and utilities for transfer learning.
  • TensorFlow Extended (TFX): TensorFlow includes TFX, an end-to-end platform for deploying machine learning models in production. TFX provides tools for data preprocessing, model training, model serving, and monitoring. It facilitates the development of scalable and production-ready machine learning pipelines.

4. Microsoft Azure Face API:

Microsoft Azure Face API is a cloud-based face recognition service provided by Microsoft. It offers robust face detection and recognition functionalities with features like facial verification, identification, emotion detection, and age estimation.

Key features:

  • Face Detection: Azure Face API can detect human faces in images or video streams. It provides highly accurate face detection capabilities, even in complex scenarios with varying lighting conditions, occlusions, and pose variations.
  • Face Recognition: The Face API enables face recognition by identifying and verifying individuals based on their facial features. It allows you to create and manage face recognition models, enroll faces, and perform face matching and identification tasks.
  • Facial Landmark Detection: The API can detect facial landmarks or key points on faces, such as the positions of eyes, nose, mouth, and eyebrows. This information is useful for face analysis, alignment, and other facial feature-based applications.

5. Amazon Rekognition:

Amazon Rekognition is a cloud-based computer vision service offered by Amazon Web Services. It provides face detection and recognition capabilities, along with features like facial analysis, celebrity recognition, and facial similarity searching.

Key features:

  • Face Detection and Analysis: Amazon Rekognition can detect faces in images and videos with high accuracy. It can identify and analyze facial attributes such as age range, gender, emotions (like happy, sad, and angry), and facial landmarks (such as eyes, nose, and mouth).
  • Face Recognition: The service provides face recognition capabilities, allowing you to create face collections and compare faces against a collection to determine potential matches. It enables use cases like identity verification, person tracking, and indexing faces for faster searching.
  • Celebrity Recognition: Amazon Rekognition has a built-in celebrity recognition feature that can identify well-known celebrities in images and videos. This functionality can be used for media analysis, content tagging, and social media applications.

6. FaceNet:

FaceNet is a deep learning-based face recognition system developed by Google. It utilizes deep convolutional neural networks to generate highly discriminative face embeddings, enabling accurate face recognition and verification.

Key features:

  • Deep Convolutional Neural Network (CNN): FaceNet utilizes a deep CNN architecture to extract high-level features from face images. The network learns to automatically encode facial features in a way that is invariant to variations in lighting, pose, and facial expressions.
  • Triplet Loss Optimization: FaceNet employs a triplet loss function during training to learn a face embedding space where faces of the same identity are closer together and faces of different identities are farther apart. This metric learning approach improves the discriminative power of the learned embeddings.
  • End-to-End Learning: FaceNet is trained in an end-to-end manner, meaning that the entire network is trained jointly to optimize the embedding space and minimize the triplet loss. This approach allows the model to learn directly from raw face images, without the need for manual feature extraction.
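The triplet loss described above can be written in a few lines. This is a plain-Python sketch of the formula max(0, d(a, p) - d(a, n) + margin), where d is squared Euclidean distance, on toy 2-D "embeddings"; a real FaceNet implementation computes it over batches of learned 128-D embeddings inside the training loop.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss: max(0, ||a - p||^2 - ||a - n||^2 + margin).
    Zero when the negative is already at least `margin` farther than the positive."""
    return max(0.0, squared_distance(anchor, positive)
                    - squared_distance(anchor, negative) + margin)

anchor   = [0.0, 0.0]
positive = [0.1, 0.0]   # same identity, close to the anchor
negative = [1.0, 0.0]   # different identity, far from the anchor
print(triplet_loss(anchor, positive, negative))  # 0.0 (constraint already satisfied)
print(triplet_loss(anchor, negative, positive))  # positive loss: violating triplet is penalized
```

During training, gradients of this loss pull same-identity embeddings together and push different identities apart, shaping the embedding space the section describes.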

7. Kairos:

Kairos is a cloud-based face recognition platform that offers a range of face analysis and recognition services. It provides APIs for face detection, face recognition, emotion analysis, age estimation, and gender identification.

Key features:

  • Facial Recognition: Kairos offers highly accurate facial recognition capabilities. It can detect and recognize faces in images or video streams, enabling identity verification, access control, and personalized user experiences.
  • Face Matching and Identification: The platform allows for face matching and identification by comparing faces against a database of known individuals. It can determine if a face belongs to a known person or if it is an unknown face, enabling applications such as user authentication and watchlist screening.
  • Emotion Analysis: Kairos includes emotion analysis features that can detect and analyze facial expressions to determine emotional states. It can recognize emotions such as happiness, sadness, anger, surprise, and more. This functionality is useful for sentiment analysis, user experience optimization, and market research.
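Watchlist screening of the kind described above boils down to nearest-neighbor search over enrolled face encodings. The sketch below is a hypothetical, stdlib-only illustration: the names, encodings, and the 0.3 threshold are invented toy values, not Kairos API calls.

```python
import math

def identify(probe, watchlist, threshold=0.3):
    """Hypothetical identification step: return the closest enrolled identity
    if it falls within the match threshold, else None (unknown face)."""
    best_name, best_dist = None, float("inf")
    for name, encoding in watchlist.items():
        d = math.dist(probe, encoding)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist < threshold else None

watchlist = {
    "alice": [0.1, 0.2, 0.3],
    "bob":   [0.9, 0.8, 0.7],
}
print(identify([0.12, 0.18, 0.31], watchlist))  # 'alice' (close match)
print(identify([0.5, 0.5, 0.5], watchlist))     # None (no one within threshold)
```

Returning None for faces outside the threshold is what separates open-set identification (watchlists) from simple closest-match lookup.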

8. Face Recognition (by ageitgey):

This Python library by Adam Geitgey provides a simple and easy-to-use face recognition API. It utilizes the dlib library and pre-trained models to perform face recognition tasks.

Key features:

  • Face Detection: The library offers robust face detection capabilities, allowing you to locate and identify faces within images or video frames. It can detect multiple faces in a given image, even under varying lighting conditions and different orientations.
  • Face Recognition: The library includes face recognition functionality, enabling you to compare and identify faces by creating unique face encodings. It provides a convenient API for face matching and verification against a database of known faces.
  • Facial Feature Extraction: The library can extract facial features such as facial landmarks and pose information. It provides access to key points on a face, including eyes, nose, mouth, and eyebrows, allowing for further analysis and applications such as face alignment and augmented reality.

9. Luxand FaceSDK:

Luxand FaceSDK is a commercial face recognition software development kit (SDK) that provides robust face detection and recognition capabilities for desktop and mobile platforms. It supports real-time face detection and offers high accuracy in face recognition tasks.

Key features:

  • Face Detection: Luxand FaceSDK provides robust face detection capabilities, allowing you to detect and locate faces within images or video streams. It can detect multiple faces simultaneously, even in complex scenarios with variations in lighting, pose, and occlusions.
  • Face Recognition: The SDK includes powerful face recognition algorithms for identifying and verifying individuals based on their facial features. It enables you to create face recognition systems, enroll faces, and perform accurate face matching and identification tasks.
  • Facial Landmark Detection: Luxand FaceSDK can detect and track facial landmarks or key points on faces, such as the positions of eyes, nose, mouth, and eyebrows. This feature enables detailed face analysis, face alignment, and applications that require precise facial feature extraction.

10. FaceX:

FaceX is a cloud-based face recognition API that offers a comprehensive set of face recognition features, including face detection, identification, verification, and emotion analysis. It provides easy-to-use APIs for integrating face recognition into applications.

Key features:

  • Face Detection: FaceX provides accurate face detection capabilities, allowing you to locate and identify faces within images or video frames. It can detect multiple faces in a given image and handle variations in lighting, pose, and occlusions.
  • Face Recognition: The platform includes face recognition functionality, enabling you to compare and identify faces by creating unique face templates or embeddings. It allows you to perform face matching and verification against a database of known faces for various applications.
  • Facial Attribute Analysis: FaceX can analyze facial attributes such as age, gender, ethnicity, and emotions. It provides insights into demographic information and emotional states, which can be utilized for targeted marketing, sentiment analysis, and user experience optimization.

Top 10 Speech Recognition Tools

What are Speech Recognition Tools?

Speech recognition tools refer to software or systems that utilize various algorithms and techniques to convert spoken language or audio input into written text or commands. These tools leverage machine learning and signal processing techniques to analyze and interpret audio signals and transcribe them into textual form.

Here are the top 10 speech recognition tools:

  1. Google Cloud Speech-to-Text
  2. Microsoft Azure Speech Services
  3. Amazon Transcribe
  4. IBM Watson Speech to Text
  5. Nuance Dragon Professional
  6. Apple Siri
  7. Speechmatics
  8. Kaldi
  9. CMUSphinx
  10. Deepgram

1. Google Cloud Speech-to-Text:

Google Cloud’s Speech-to-Text API enables developers to convert spoken language into written text. It offers accurate and real-time transcription of audio data and supports multiple languages.

Key features:

  • Accurate Speech Recognition: Google Cloud Speech-to-Text uses advanced machine learning algorithms to provide highly accurate transcription of audio data. It can handle a variety of audio formats and supports multiple languages, including regional accents and dialects.
  • Real-Time Transcription: The API supports real-time streaming, allowing for immediate transcription as the audio is being spoken. This feature is useful for applications that require real-time speech recognition, such as live captioning or voice-controlled systems.
  • Enhanced Speech Models: Google Cloud Speech-to-Text offers enhanced models trained for specific domains, such as phone calls, videos, or voice commands. These models are optimized for better accuracy and performance in their respective domains.
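The real-time streaming feature follows a common client pattern: captured audio is chopped into small chunks, and interim results arrive as the chunks are sent. The stdlib-only Python sketch below imitates that flow with a fake recognizer; the actual Google Cloud client library (not shown) streams the chunks over gRPC instead.

```python
def audio_chunks(audio_bytes, chunk_size=3200):
    """Yield fixed-size chunks of raw audio, as a streaming client would capture
    them from a microphone (3200 bytes ~= 100 ms of 16 kHz 16-bit mono)."""
    for i in range(0, len(audio_bytes), chunk_size):
        yield audio_bytes[i:i + chunk_size]

def fake_streaming_recognize(chunks):
    """Stand-in for a streaming API: emit one interim 'result' per chunk.
    A real client would send each chunk to the service and read results back."""
    total = 0
    for n, chunk in enumerate(chunks, start=1):
        total += len(chunk)
        yield {"chunk": n, "bytes_so_far": total}

audio = bytes(8000)  # 8000 bytes of silence stands in for captured audio
results = list(fake_streaming_recognize(audio_chunks(audio)))
print(len(results))                 # 3 chunks: 3200 + 3200 + 1600 bytes
print(results[-1]["bytes_so_far"])  # 8000
```

Generators fit this pattern well because neither the full audio nor the full transcript ever needs to be in memory at once.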

2. Microsoft Azure Speech Services:

Microsoft Azure Speech Services provides speech recognition capabilities that can convert spoken language into text. It offers features like speech-to-text transcription, speaker recognition, and real-time translation.

Key features:

  • Speech-to-Text Conversion: Azure Speech Services enables accurate and real-time conversion of spoken language into written text. It supports multiple languages and dialects, allowing for global application deployment.
  • Custom Speech Models: Developers can create custom speech models using their own training data to improve recognition accuracy for domain-specific vocabulary or jargon. This feature is particularly useful for industries with specialized terminology or unique speech patterns.
  • Speaker Recognition: Azure Speech Services includes speaker recognition capabilities, allowing for speaker verification and identification. It can differentiate between multiple speakers in an audio stream and associate speech segments with specific individuals.

3. Amazon Transcribe:

Amazon Transcribe is a fully managed automatic speech recognition (ASR) service offered by Amazon Web Services. It can convert speech into accurate text and supports various audio formats and languages.

Key features:

  • Accurate Speech-to-Text Conversion: Amazon Transcribe leverages advanced machine learning algorithms to accurately transcribe audio data into written text. It supports various audio formats, including WAV, MP3, and FLAC, making it compatible with different recording sources.
  • Real-Time Transcription: The service supports real-time streaming, allowing developers to receive immediate transcription results as audio is being spoken. This feature is valuable for applications that require real-time speech recognition, such as live captioning or voice-controlled systems.
  • Automatic Language Identification: Amazon Transcribe automatically detects the language spoken in the audio, eliminating the need for manual language selection. It supports a wide range of languages and dialects, allowing for global application deployment.

4. IBM Watson Speech to Text:

IBM Watson Speech to Text is a cloud-based speech recognition service that converts spoken language into written text. It provides high accuracy and supports multiple languages and industry-specific models.

Key features:

  • Accurate Speech Recognition: IBM Watson Speech to Text utilizes deep learning techniques and advanced algorithms to provide highly accurate transcription of audio data. It can handle a wide range of audio formats and supports multiple languages, dialects, and accents.
  • Real-Time Transcription: The service supports real-time streaming, allowing for immediate transcription as the audio is being spoken. This feature is valuable for applications that require real-time speech recognition, such as live captioning or voice-controlled systems.
  • Custom Language Models: Developers can create custom language models to improve recognition accuracy for a domain-specific vocabulary or specialized terminology. This feature is particularly useful for industries with unique speech patterns or terminology.

5. Nuance Dragon Professional:

Nuance Dragon Professional is a speech recognition software designed for professionals. It allows users to dictate documents, emails, and other text, providing accurate transcription and voice commands for hands-free productivity.

Key features:

  • Accurate Speech Recognition: Nuance Dragon Professional offers high accuracy in converting spoken language into written text. It leverages deep learning technology and adaptive algorithms to continually improve accuracy and adapt to users’ voice patterns.
  • Dictation and Transcription: Users can dictate their thoughts, documents, emails, or other text-based content using their voice, allowing for faster and more efficient creation of written materials. It also supports the transcription of audio recordings, making it convenient for converting recorded meetings or interviews into text.
  • Customizable Vocabulary: Dragon Professional allows users to create custom vocabularies by adding industry-specific terms, jargon, or personal preferences. This customization enhances recognition accuracy for specialized terminology and improves overall transcription quality.

6. Apple Siri:

Apple Siri is a virtual assistant that includes speech recognition capabilities. It can understand and respond to voice commands, perform tasks, and provide information using natural language processing and AI.

Key features:

  • Voice Commands and Control: Siri allows users to interact with their Apple devices using voice commands, providing hands-free control over various functions and features. Users can make calls, send messages, set reminders, schedule appointments, play music, control smart home devices, and more, simply by speaking to Siri.
  • Natural Language Processing: Siri utilizes natural language processing (NLP) to understand and interpret user commands and queries. It can comprehend and respond to conversational language, allowing for more natural and intuitive interactions.
  • Personal Assistant Features: Siri acts as a personal assistant, helping users with everyday tasks and information retrieval. It can answer questions, provide weather updates, set alarms and timers, perform calculations, recommend nearby restaurants, offer sports scores and schedules, and deliver various other helpful information.

7. Speechmatics:

Speechmatics offers automatic speech recognition technology that can convert spoken language into written text. It supports multiple languages and offers customization options to adapt to specific use cases.

Key features:

  • Multilingual Support: Speechmatics supports a wide range of languages, including major global languages as well as regional and less widely spoken languages. This multilingual capability allows for speech recognition and transcription in various linguistic contexts.
  • Customizable Language Models: Users can create and fine-tune custom language models specific to their domain or industry. This customization enhances recognition accuracy for specialized vocabulary, technical terms, and jargon unique to particular applications.
  • Real-Time and Batch Processing: Speechmatics provides both real-time and batch processing options to cater to different use cases. Real-time processing allows for immediate transcription as audio is being spoken, while batch processing enables large-scale and offline transcription of pre-recorded audio.

8. Kaldi:

Kaldi is an open-source toolkit for speech recognition. It provides a framework for building speech recognition systems and supports various acoustic and language models for transcription and speaker identification.

Key features:

  • Modularity: Kaldi is designed with a highly modular architecture, allowing users to easily customize and extend its functionality. It provides a collection of libraries and tools that can be combined and configured in various ways to build speech recognition systems.
  • Speech Recognition: Kaldi provides state-of-the-art tools and algorithms for automatic speech recognition (ASR). It includes a wide range of techniques for acoustic modeling, language modeling, and decoding. It supports both speaker-independent and speaker-adaptive models.
  • Flexibility: Kaldi supports a variety of data formats and can handle large-scale speech recognition tasks. It can process audio data in various formats, including raw waveforms, wave files, and compressed audio formats. It also supports various transcription formats and language model formats.
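Before any acoustic modeling, toolkits like Kaldi split the waveform into short overlapping frames (commonly 25 ms windows with a 10 ms hop at 16 kHz). Here is a stdlib-only Python sketch of that framing step; the frame and hop sizes are the conventional defaults, not Kaldi-specific code.

```python
def frame_signal(samples, frame_len=400, hop=160):
    """Split a 16 kHz signal into 25 ms frames (400 samples) with a 10 ms hop
    (160 samples): the standard first step before computing acoustic features
    such as MFCCs. Trailing samples that don't fill a frame are dropped."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

samples = list(range(16000))         # one second of fake 16 kHz audio
frames = frame_signal(samples)
print(len(frames), len(frames[0]))  # 98 frames, 400 samples each
```

Each frame is then windowed and converted to a feature vector, so the recognizer sees a sequence of vectors rather than raw audio.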

9. CMUSphinx:

CMUSphinx is an open-source speech recognition system that offers accurate speech-to-text conversion. It supports multiple languages and provides flexibility for customization and integration into different applications.

Key features:

  • Modularity: Similar to Kaldi, CMUSphinx is designed with a modular architecture, allowing users to customize and extend its functionality. It provides a set of libraries and tools that can be combined to build speech recognition systems tailored to specific needs.
  • Acoustic Modeling: CMUSphinx supports various acoustic modeling techniques, including Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs). It provides tools for training and adapting acoustic models to specific speakers or conditions.
  • Language Modeling: CMUSphinx supports language modeling using n-gram models, which are commonly used for ASR. It allows users to train language models from large text corpora or integrate pre-existing language models into the recognition system.
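The n-gram language models mentioned above are simple to sketch. The toy Python example below trains an unsmoothed bigram model on a three-sentence corpus and queries maximum-likelihood probabilities; production models add smoothing and train on far larger corpora, and CMUSphinx typically loads them from ARPA-format files rather than building them in Python.

```python
from collections import defaultdict

def train_bigram_model(corpus):
    """Count bigrams over a toy corpus, with <s>/</s> sentence boundary markers."""
    bigram_counts = defaultdict(int)
    unigram_counts = defaultdict(int)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            bigram_counts[(prev, word)] += 1
            unigram_counts[prev] += 1
    return bigram_counts, unigram_counts

def bigram_prob(bigrams, unigrams, prev, word):
    """Maximum-likelihood estimate P(word | prev); 0.0 for unseen history."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]

corpus = ["open the door", "open the window", "close the door"]
bigrams, unigrams = train_bigram_model(corpus)
print(bigram_prob(bigrams, unigrams, "open", "the"))  # 1.0 ("open" is always followed by "the")
print(bigram_prob(bigrams, unigrams, "the", "door"))  # 2/3
```

During decoding, these probabilities bias the recognizer toward word sequences that are plausible in the language, not just acoustically close.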

10. Deepgram:

Deepgram is a speech recognition platform that utilizes deep learning techniques to transcribe audio data into text. It offers real-time processing and custom language models, and supports large-scale speech recognition applications.

Key features:

  • Automatic Speech Recognition (ASR): Deepgram offers powerful ASR capabilities for converting spoken language into written text. It utilizes deep learning models, such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs), to achieve high accuracy in transcribing speech.
  • Real-Time Processing: Deepgram is designed for real-time processing of streaming audio data. It can process and transcribe live audio streams with low latency, making it suitable for applications that require immediate or near real-time speech recognition, such as transcription services, voice assistants, and call center analytics.
  • Multichannel Audio Support: Deepgram supports multichannel audio, enabling the recognition of speech from various sources simultaneously. This feature is particularly useful in scenarios where multiple speakers or audio channels need to be processed and transcribed accurately, such as conference calls or meetings.
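Multichannel support starts with separating the interleaved PCM stream into per-channel signals. The stdlib-only sketch below deinterleaves a toy stereo sample list; it illustrates the preprocessing idea and is not Deepgram API code.

```python
def deinterleave(samples, channels=2):
    """Split interleaved PCM samples [L0, R0, L1, R1, ...] into per-channel
    lists, so each channel can be transcribed separately."""
    return [samples[c::channels] for c in range(channels)]

# Interleaved stereo: e.g. caller and agent on a two-channel call recording.
interleaved = [10, 90, 11, 91, 12, 92, 13, 93]
left, right = deinterleave(interleaved)
print(left)   # [10, 11, 12, 13]
print(right)  # [90, 91, 92, 93]
```

Transcribing each channel separately sidesteps speaker overlap, which is why per-channel recordings are preferred over a mixed-down track for call analytics.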

Git Troubleshooting Techniques



A few commands that are often useful when diagnosing push problems:

git push -u origin --all --verbose

git config --list

git push

git config --global http.postBuffer

The following techniques are useful for debugging long-running Git commands, or Git commands that seem to hang for some reason.

Git has built-in functionality that lets us peek at what is running behind the scenes of a command: just add GIT_TRACE=1 before any git command to get additional information, for example:

[server@user sp-server-branches]$ GIT_TRACE=1 git clone

Other flags that serve a similar purpose are GIT_CURL_VERBOSE=1 and -v (or --verbose).


How to run Remote Desktop Console by using command line?

If you want to run the Remote Desktop Console from a batch file (for example, RDC over VPN), you can use the mstsc /v:servername /console command.


Creates connections to terminal servers or other remote computers, edits an existing Remote Desktop Connection (.rdp) configuration file, and migrates legacy connection files that were created with Client Connection Manager to new .rdp connection files.


Syntax:

mstsc.exe {ConnectionFile | /v:ServerName[:Port]} [/console] [/f] [/w:Width /h:Height]
mstsc.exe /edit "ConnectionFile"
mstsc.exe /migrate

Parameters:

ConnectionFile
Specifies the name of an .rdp file for the connection.

/v:ServerName[:Port]
Specifies the remote computer and, optionally, the port number to which you want to connect.

/console
Connects to the console session of the specified Windows Server 2003 family operating system.

/f
Starts Remote Desktop Connection in full-screen mode.

/w:Width /h:Height
Specifies the dimensions of the Remote Desktop screen.

/edit "ConnectionFile"
Opens the specified .rdp file for editing.

/migrate
Migrates legacy connection files that were created with Client Connection Manager to new .rdp connection files.

* You must be an administrator on the server to which you are connecting to create a remote console connection.
* default.rdp is stored for each user as a hidden file in My Documents. User created .rdp files are stored by default in My Documents but can be moved anywhere.

To connect to the console session of a server, type:
mstsc /v:servername /console

To open a file called filename.rdp for editing, type:
mstsc /edit filename.rdp


Best Practices in Software Configuration Management – SCM Best Practices Guide


Best Practices in Software Configuration Management

When deploying new SCM (software configuration management) tools,
implementers sometimes focus on perfecting fine-grained activities, while
unwittingly carrying forward poor, large-scale practices from their previous jobs or
previous tools. The result is a well-executed blunder. This paper promotes some
high-level best practices that reflect the authors’ experiences in deploying SCM.

1. Introduction
“A tool is only as good as you use it,” the saying goes. As providers of software configuration management (SCM) tools and consultants to software companies, we
are often asked for sound advice on SCM best practices – that is, how to deploy SCM software to the maximum advantage. In answering these requests we have a bounty of direct and indirect SCM experience from which to draw. The direct experience comes from having been developers and codeline managers ourselves; the indirect experience comes from customer reports of successes and failures with our product (Perforce) and other SCM tools.
The table below lists six general areas of SCM deployment, and some coarse-grained best practices within each of those areas. The following chapters explain each item.

Workspaces, where developers build, test, and debug:

· Don’t share workspaces.
· Don’t work outside of managed workspaces.
· Don’t use jello views.
· Stay in sync with the codeline.
· Check in often.

Codelines, the canonical sets of source files:

· Give each codeline a policy.
· Give each codeline an owner.
· Have a mainline.

Branches, variants of the codeline:

· Branch only when necessary.
· Don’t copy when you mean to branch.
· Branch on incompatible policy.
· Branch late.
· Branch, instead of freeze.

Change propagation, getting changes from one codeline to another:

· Make original changes in the branch that has evolved the least since branching.
· Propagate early and often.
· Get the right person to do the merge.

Builds, turning source files into products:

· Source + tools = product.
· Check in all original source.
· Segregate built objects from original source.
· Use common build tools.
· Build often.
· Keep build logs and build output.

Process, the rules for all of the above:

· Track change packages.
· Track change package propagations.
· Distinguish change requests from change packages.
· Give everything an owner.
· Use living documents.

2. The Workspace
The workspace is where engineers edit source files, build the software components they’re working on, and test and debug what they’ve built. Most SCM systems have some notion of a workspace; sometimes they are called “sandboxes”, as in Source Integrity, or “views”, as in ClearCase and Perforce. Changes to managed SCM repository files begin as changes to files in a workspace. The best practices for workspaces include:

· Don’t share workspaces. A workspace should have a single purpose, such as an edit/build/test area for a single developer, or a build/test/release area for a product release. Sharing workspaces confuses people, just as sharing a desk does. Furthermore, sharing workspaces compromises the SCM system’s ability to track activity by user or task. Workspaces and the disk space they occupy are cheap; don’t waste time trying to conserve them.

· Don’t work outside of managed workspaces. Your SCM system can only track work in progress when it takes place within managed workspaces. Users working outside of workspaces are beached; there’s a river of information flowing past and they’re not part of it. For instance, SCM systems generally use workspaces to facilitate some of the communication among developers working on related tasks. You can see what is happening in others’ workspaces, and they can see what’s going on in yours. If you need to take an emergency vacation, your properly managed workspace may be all you can leave behind. Use proper workspaces.
· Don’t use jello views. A file in your workspace should not change unless you explicitly cause the change. A “jello view” is a workspace where file changes are caused by external events beyond your control. A typical example of a jello view is a workspace built upon a tree of symbolic links to files in another workspace – when the underlying files are updated, your workspace files change. Jello views are a source of chaos in software development. Debug symbols in executables don’t match the source files, mysterious recompilations occur in supposedly trivial rebuilds, and debugging cycles never converge – these are just some of the problems. Keep your workspaces firm and stable by setting them up so that users have control over when their files change.

· Stay in sync with the codeline. As a developer, the quality of your work depends on how well it meshes with other peoples’ work. In other words, as changes are checked into the codeline, you should update your workspace and integrate those changes with yours. As an SCM engineer, it behooves you to make sure this workspace update operation is straightforward and unencumbered with tricky or time-consuming procedures. If developers find it fairly painless to update their workspaces, they’ll do it more frequently and integration problems won’t pile up at project deadlines.

· Check in often. Integrating your development work with other peoples’ work also requires you to check in your changes as soon as they are ready. Once you’ve finished a development task, check in your changed files so that your work is available to others. Again, as the SCM engineer, you should set up procedures that encourage frequent check-ins. Don’t implement unduly arduous validation procedures, and don’t freeze codelines (see Branching, below). Short freezes are bearable, but long freezes compromise productivity. Much productivity can be wasted waiting for the right day (or week, or month) to submit changes.

3. The Codeline
In this context, the codeline is the canonical set of source files required to produce your software. Typically codelines are branched, and the branches evolve into variant
codelines embodying different releases. The best practices with regard to codelines are:
· Give each codeline a policy. A codeline policy specifies the fair use and permissible check-ins for the codeline, and is the essential user’s manual for codeline SCM. For example, the policy of a development codeline should state that it isn’t for release; likewise, the policy of a release codeline should limit changes to approved bug fixes. The policy can also describe how to document changes being checked in, what review is needed, what testing is required, and the expectations of codeline stability after check-ins. A policy is a critical component for a documented, enforceable software development process, and a codeline without a policy, from an SCM point of view, is out of control.
· Give each codeline an owner. Having defined a policy for a codeline, you’ll soon encounter special cases where the policy is inapplicable or ambiguous. Developers facing these ambiguities will turn to the person in charge of the codeline for workarounds. When no one is in charge, developers tend to enact their own workarounds without documenting them. Or they simply procrastinate because they don’t have enough information about the codeline to come up with a reasonable workaround. You can avoid this morass by appointing someone to own the codeline, and to shepherd it through its useful life. With this broader objective, the codeline owner can smooth the ride over rough spots in software development by advising developers on policy exceptions and documenting them.

· Have a mainline. A “mainline,” or “trunk,” is the branch of a codeline that evolves forever. A mainline provides an ultimate destination for almost all
changes – both maintenance fixes and new features – and represents the primary, linear evolution of a software product. Release codelines and development
codelines are branched from the mainline, and work that occurs in branches is propagated back to the mainline.

[Figure 1: a mainline with release and development codelines branched from it]

Figure 1 shows a mainline (called “main”), from which several release lines (“ver1”, “ver2” and “ver3”) and feature development lines (“projA”, “projB”, and “projC”) have been branched. Developers work in the mainline or in a feature development line. The release lines are reserved for testing and critical fixes, and are insulated from the hubbub of development. Eventually all changes submitted to the release lines and the feature development lines get merged into the mainline.
The adverse approach is to “promote” codelines; for example, to promote a development codeline to a release codeline, and branch off a new development codeline. Figure 2 shows a development codeline promoted to a release codeline (“ver1”) and branched into another development codeline (“projA”). Each release codeline starts out as a development codeline, and development moves from codeline to codeline.


The promotion scheme suffers from two crippling drawbacks: (1) it requires the policy of a codeline to change, which is never easy to communicate to everyone;
(2) it requires developers to relocate their work from one codeline to another, which is error-prone and time-consuming. 90% of SCM “process” is enforcing
codeline promotion to compensate for the lack of a mainline. Process is streamlined and simplified when you use a mainline model. With a
mainline, contributors’ workspaces and environments are stable for the duration of their tasks at hand, and no additional administrative overhead is incurred as
software products move forward to maturity.
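The structural difference between the two models can be made concrete with a small sketch. Assuming nothing beyond the text above, the mainline model is just a tree in which every branch records its parent and keeps one policy for its whole life (the class and names here are illustrative, not a real SCM system’s data model):

```python
class Codeline:
    """Minimal sketch of the mainline model: each branch knows its parent,
    and its policy is fixed for its entire life (no promotion)."""
    def __init__(self, name, parent=None, policy="development"):
        self.name, self.parent, self.policy = name, parent, policy

    def merge_target(self):
        # Work done in a branch is propagated back toward the trunk;
        # the trunk itself has no merge destination.
        return self.parent.name if self.parent else None

main = Codeline("main", policy="mainline")   # the trunk evolves forever
ver1 = Codeline("ver1", parent=main, policy="release")
projA = Codeline("projA", parent=main, policy="development")
```

Because policies never change and every branch’s merge destination is simply its parent, contributors’ workspaces stay stable for the duration of a task, which is exactly the administrative simplicity the mainline model buys.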

4. Branching
Branching, the creation of variant codelines from other codelines, is the most problematic area of SCM. Different SCM tools support branching in markedly different ways, and different policies require that branching be used in still more different ways. We found the following guidelines helpful when branching (and sometimes when avoiding branching):

· Branch only when necessary. Every branch is more work – more builds, more changes to be propagated among codelines, more source file merges. If you keep this in mind every time you consider making a branch you may avoid sprouting unnecessary branches.
· Don’t copy when you mean to branch. An alternative to using your SCM tool’s branching mechanism is to copy a set of source files from one codeline and
check them in to another as new files. Don’t think that you can avoid the costs of branching by simply copying. Copying incurs all the headaches of branching –
additional entities and increased complexity – but without the benefit of your SCM system’s branching support. Don’t be fooled: even “read-only” copies
shipped off to another development group “for reference only” often return with changes made. Use your SCM system to make branches when you spin off parts
or all of a codeline.
· Branch on incompatible policy. There is one simple rule to determine if a codeline should be branched: it should be branched when its users need different
check-in policies. For example, a product release group may need a check-in policy that enforces rigorous testing, whereas a development team may need a
policy that allows frequent check-ins of partially tested changes. This policy divergence calls for a codeline branch. When one development group doesn’t
wish to see another development group’s changes, that is also a form of incompatible policy: each group should have its own branch.
· Branch late. To minimize the number of changes that need to be propagated from one branch to another, put off creating a branch as long as possible. For
example, if the mainline branch contains all the new features ready for a release, do as much testing and bug fixing in it as you can before creating a release
branch. Every bug fixed in the mainline before the release branch is created is one less change needing propagation between branches.
· Branch instead of freeze. On the other hand, if testing requires freezing a codeline, developers who have pending changes will have to sit on their changes
until the testing is complete. If this is the case, branch the codeline early enough so that developers can check in and get on with their work.
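The guidelines above reduce to a short decision rule, sketched here as an illustrative helper (the inputs and return strings are our own shorthand, not from the paper’s tooling): branch when check-in policies diverge or when testing would otherwise freeze a codeline; otherwise avoid the extra work.

```python
def branching_decision(policies_diverge: bool, freeze_needed: bool) -> str:
    """Branch on incompatible policy, and branch instead of freeze;
    otherwise stay on the same codeline, since every branch means
    more builds, more merges, and more change propagation."""
    if policies_diverge or freeze_needed:
        return "branch"
    return "no branch"
```

For example, a release group needing rigorous pre-check-in testing while developers need frequent check-ins of partially tested work is a policy divergence, so the rule says branch.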

5. Change Propagation
Once you have branched codelines, you face the chore of propagating file changes across branches. This is rarely a trivial task, but there are some things you can do to
keep it manageable.
· Make original changes in the branch that has evolved the least since branching. It is much easier to merge a change from a file that is close to the common
ancestor than it is to merge a change from a file that has diverged considerably. This is because the change in the file that has diverged may be built upon
changes that are not being propagated, and those unwanted changes can confound the merge process. You can minimize the merge complexity by making
original changes in the branch that is the most stable. For example, if a release codeline is branched from a mainline, make a bug fix first in the release line and
then merge it into the mainline. If you make the bug fix in the mainline first, subsequently merging it into a release codeline may require you to back out
other, incompatible changes that aren’t meant to go into the release codeline.
· Propagate early and often. When it’s feasible to propagate a change from one branch to another (that is, if the change wouldn’t violate the target branch’s
policy), do it sooner rather than later. Postponed and batched change propagations can result in stunningly complex file merges.
· Get the right person to do the merge. The burden of change propagation can be lightened by assigning the responsibility to the engineer best prepared to resolve file conflicts. Changes can be propagated by (a) the owner of the target files, (b) the person who made the original changes, or (c) someone else. Either (a) or (b) will do a better job than (c).
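The “make original changes in the least-diverged branch” rule can be sketched as a one-line selection over divergence counts (the codeline names and counts below are purely illustrative):

```python
def best_donor(changes_since_branch: dict) -> str:
    """Pick the codeline in which to make an original change: the one
    that has diverged least from the common ancestor, so the later
    merge into the other branches is the simplest."""
    return min(changes_since_branch, key=changes_since_branch.get)
```

With `{"main": 120, "ver1": 4}`, the fix belongs in “ver1” first and is then merged into the mainline, rather than the other way around, which might require backing out incompatible mainline changes.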

6. Builds
A build is the business of constructing usable software from original source files. Builds are more manageable and less prone to problems when a few key practices are followed:
· Source + tools = product. The only ingredients in a build should be source files and the tools to which they are input. Memorized procedures and yellow stickies
have no place in this equation. Given the same source files and build tools, the resulting product should always be the same. If you have rote setup procedures,
automate them in scripts. If you have manual setup steps, document them in build instructions. And document all tool specifications, including OS, compilers, include files, link libraries, make programs, and executable paths.
· Check in all original source. When software can’t be reliably reproduced from the same ingredients, chances are the ingredient list is incomplete. Frequently
overlooked ingredients are makefiles, setup scripts, build scripts, build instructions, and tool specifications. All of these are the source you build with.
Remember: source + tools = product.
· Segregate built objects from original source. Organize your builds so that the directories containing original source files are not polluted by built objects.
Original source files are those you create “from an original thought process” with a text editor, an application generator, or any other interactive tool. Built objects
are all the files that get created during your build process, including generated source files. Built objects should not go into your source code directories.
Instead, build them into a directory tree of their own. This segregation allows you to limit the scope of SCM-managed directories to those that contain only
source. It also corrals the files that tend to be large and/or expendable into one location, and simplifies disk space management for builds.
· Use common build tools. Developers, test engineers, and release engineers should all use the same build tools. Much time is wasted when a developer
cannot reproduce a problem found in testing, or when the released product varies from what was tested. Remember: source + tools = product.
· Build often. Frequent, end-to-end builds with regression testing (“sanity” builds) have two benefits: (1) they reveal integration problems introduced by check-ins,
and (2) they produce link libraries and other built objects that can be used by developers. In an ideal world, sanity builds would occur after every check-in, but
in an active codeline it’s more practical to do them at intervals, typically nightly. Every codeline branch should be subject to regular, frequent, and complete builds
and regression testing, even when product release is in the distant future.
· Keep build logs and build outputs. For any built object you produce, you should be able to look up the exact operations (e.g., complete compiler flag and link
command text) that produced the last known good version of it. Archive build outputs and logs, including source file versions (e.g., a label), tool and OS
version info, compiler outputs, intermediate files, built objects, and test results, for future reference. As large software projects evolve, components are handed
off from one group to another, and the receiving group may not be in a position to begin builds of new components immediately. When they do begin to build
new components, they will need access to previous build logs in order to diagnose the integration problems they encounter.
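The “source + tools = product” equation and the advice to keep build logs both come down to recording every ingredient. Here is a minimal sketch of a build manifest, assuming sources are supplied as name-to-contents pairs; a real build would also record compiler, linker, and library versions, and the function name is our own invention:

```python
import hashlib
import json
import platform
import sys

def build_manifest(sources: dict) -> str:
    """Capture the full ingredient list of a build -- source contents plus
    tool and OS specifications -- so the product can be reproduced and
    the 'source + tools = product' equation audited later."""
    return json.dumps({
        "os": platform.system(),
        "tool": sys.version.split()[0],  # stand-in for a compiler version
        "sources": {name: hashlib.sha256(text.encode()).hexdigest()
                    for name, text in sorted(sources.items())},
    }, sort_keys=True)
```

Given the same sources on the same machine, the manifest is identical, which is the property frequent sanity builds and archived build logs rely on when diagnosing integration problems long after a hand-off.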

7. Process
It would take an entire paper, or several papers, to explore the full scope of SCM process design and implementation, and many such papers have already been written. Furthermore, your shop has specific objectives and requirements that will be reflected in the process you implement, and we do not presume to know what those are. In our experience, however, some process concepts are key to any SCM implementation:
· Track change packages. Even though each file in a codeline has its revision history, each revision in its history is only useful in the context of a set of related
files. The question “What other source files were changed along with this particular change to foo.c?” can’t be answered unless you track change
packages, or sets of files related by a logical change. Change packages, not individual file changes, are the visible manifestation of software development.
Some SCM systems track change packages for you; if yours doesn’t, write an interface that does.
· Track change package propagations. One clear benefit of tracking change packages is that it becomes very easy to propagate logical changes (e.g., bug fixes) from one codeline branch to another. However, it’s not enough to simply propagate change packages across branches; you must keep track of which
change packages have been propagated, which propagations are pending, and which codeline branches are likely donors or recipients of propagations.
Otherwise you’ll never be able to answer the question “Is the fix for bug X in the release Y codeline?” Again, some SCM systems track change package
propagations for you, whereas with others you’ll have to write your own interface to do it. Ultimately, you should never have to resort to “diffing” files to
figure out if a change package has been propagated between codelines.

· Distinguish change requests from change packages. “What to do” and “what was done” are different data entities. For example, a bug report is a “what to do”
entity and a bug fix is a “what was done” entity. Your SCM process should distinguish between the two, because in fact there can be a one-to-many
relationship between change requests and change packages.
· Give everything an owner. Every process, policy, document, product, component, codeline, branch, and task in your SCM system should have an
owner. Owners give life to these entities by representing them; an entity with an owner can grow and mature. Ownerless entities are like obstacles in an ant trail
– the ants simply march around them as if they weren’t there.
· Use living documents. The policies and procedures you implement should be described in living documents; that is, your process documentation should be as
readily available and as subject to update as your managed source code. Documents that aren’t accessible are useless; documents that aren’t updateable
are nearly so. Process documents should be accessible from all of your development environments: at your own workstation, at someone else’s
workstation, and from your machine at home. And process documents should be easily updateable, and updates should be immediately available.
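The tracking practices above fit in a few lines of bookkeeping. The sketch below is hypothetical (the package, bug, and codeline names are illustrative, not any particular SCM system’s records): change packages are mapped to the codelines that contain them, change requests are mapped to the packages that resolve them (a one-to-many relationship), and the question “Is the fix for bug X in the release Y codeline?” is answered without diffing files.

```python
# Which codelines contain each change package (illustrative data).
propagations = {
    "fix-bug-X": {"main", "ver2"},
    "feature-Y": {"main"},
}

# "What to do" vs "what was done": one request, possibly many packages.
change_requests = {
    "BUG-42": ["fix-bug-X"],
}

def request_in_codeline(request: str, codeline: str) -> bool:
    """True when every change package resolving the request has been
    propagated to the given codeline."""
    packages = change_requests.get(request, [])
    return bool(packages) and all(
        codeline in propagations.get(pkg, set()) for pkg in packages)
```

If the SCM system keeps these records itself, so much the better; if not, a thin interface like this over its data is enough to keep “diffing to find out” off the table.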

8. Conclusion
Best practices in SCM, like best practices anywhere, always seem obvious once you’ve used them. The practices discussed in this paper have worked well for us, but
we recognize that no single, short document can contain them all. So we have presented the practices that offer the greatest return and yet seem to be violated more
often than not. We welcome the opportunity to improve this document, and solicit both challenges to the above practices as well as the additions of new ones.


9. Authors

Laura Wingerd
Perforce Software, Inc.
Christopher Seiwald
Perforce Software, Inc.
