Source Repository Rules

Revision control, also known as version control or source control (and a part of software configuration management), is the management of changes to documents and source code. As release engineers, we work with a source repository every day. So, what should NOT be under source control?

Today I read a post at https://blogs.oracle.com/kto/entry/source_repository_rules. That blog is about the JDK build. The post simply lists a few commandments for dealing with software repositories and the build results of those repositories. Its commandments are as follows:

  1. There shalt not be binary files in the repository.
    Binary files (executables, native libraries, zip files, jar files, etc.) are NOT source and should not be managed in a source repository.
  2. Keep thy path names simple.
    Directory names and filenames in the repositories should never contain blanks or non-printing characters. Certain characters such as ‘$’ should also be avoided.
  3. There shall be one newline convention.
    The contents of all source files should follow the standard Unix conventions on newlines (no ^M’s).
  4. Generated source files shall not be added as managed files.
    Source files generated during the build process should not be managed files in a repository.
  5. All output from the build shall be kept separated from the source.
    All files generated during the build should land in a well defined output only directory such as build/ or dist/. The src/ directory should never get written to during the build process.
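The first three commandments are mechanical enough to check with a script. Below is a minimal sketch in Python, not something from the original post: the function names, the list of binary suffixes, and the set of discouraged characters are my own assumptions, so adjust them to your project.

```python
from pathlib import PurePosixPath

# Hypothetical lists for illustration; extend them for your own project.
BINARY_SUFFIXES = {".exe", ".dll", ".so", ".zip", ".jar"}
DISCOURAGED_CHARS = {"$"}


def path_violations(path: str) -> list[str]:
    """Return the problems (commandments 1 and 2) found in a repository path."""
    problems = []
    # Commandment 1: no binary files, judged here by file suffix only.
    if PurePosixPath(path).suffix.lower() in BINARY_SUFFIXES:
        problems.append("binary file")
    # Commandment 2: no blanks or non-printing characters in names.
    if any(ch.isspace() or not ch.isprintable() for ch in path):
        problems.append("blank or non-printing character")
    # Commandment 2: characters such as '$' should also be avoided.
    if any(ch in DISCOURAGED_CHARS for ch in path):
        problems.append("discouraged character")
    return problems


def has_dos_newlines(content: bytes) -> bool:
    """Commandment 3: True if the content contains a carriage return (^M)."""
    return b"\r" in content
```

For example, path_violations("docs/My Notes.txt") reports the blank in the file name, while path_violations("src/Main.java") returns an empty list. Judging binaries by suffix is only a rough heuristic, of course.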

Basically I agree with them 100%, but I want to add some examples drawn from material I found on the Internet. (SCM is a big topic, and the commandments above are too simple to be enough on their own. I know people want to put everything under source control now that storage is cheaper. 🙂)

A general rule of source control: all source files created manually by people should be under source control. Besides source code, this includes build description files, configuration files, etc. Conversely, anything generated automatically by the build system, such as target files, executables, binaries, bytecode, and code or documents generated from XML, should not be under source control. However, 3rd party libraries for which you don’t have the source, or which you don’t build yourself, should be under source control.

You should only put under source control those files that:

  • ( need revision history OR are created outside of your build but are part of the build, install, or media ) AND
  • can’t be generated by the build process you control AND
  • are common to all users that build the product (no user config)
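The three conditions above combine into a single boolean test. Here is a small sketch of that test in Python; the FileFacts record and its field names are hypothetical, made up just to give each condition a name.

```python
from dataclasses import dataclass


@dataclass
class FileFacts:
    """Hypothetical record describing one candidate file."""
    needs_history: bool         # revision history is wanted
    external_build_input: bool  # created outside the build, but part of build, install, or media
    generated_by_build: bool    # the build process you control can regenerate it
    user_specific: bool         # per-user configuration, not common to all users


def belongs_in_repo(f: FileFacts) -> bool:
    """Apply the three criteria: (history OR external input) AND not generated AND not user config."""
    return (
        (f.needs_history or f.external_build_input)
        and not f.generated_by_build
        and not f.user_specific
    )
```

A hand-written source file (needs history, not generated, not user specific) passes; a compiled object file fails the second condition; a per-user IDE settings file fails the third.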

The list includes things like:

  • source files
  • make, project, and solution files — build files & project configuration files
  • other build tool configuration files (not user related)
  • 3rd party libraries
  • design documentation
  • description files like WSDL, XSL

For example, in the world of Java, the rule of thumb is to leave out anything that’s generated from the items you check into source control.

Things that should be checked in:

  1. Source files (.java, and other languages)
  2. 3rd party JARs
  3. Configuration XML or .properties
  4. HTML, CSS, JSPs for web apps
  5. SQL scripts
  6. Design (UML) and documentation (Word or HTML)
  7. Unit test classes and any test data

Things that should not be checked in:

  1. Compiled .class files
  2. Generated JAR or WAR files (3rd party JARs are the exception)
  3. Generated Javadoc
  4. JUnit report HTML and results
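The two Java lists can be turned into a rough path classifier. The sketch below is only an illustration under assumptions of my own: that 3rd party JARs live under a directory named lib, and that Javadoc and JUnit reports land in directories named javadoc and junit-reports. Your layout may well differ.

```python
from pathlib import PurePosixPath

# Assumed project layout, for illustration only.
THIRD_PARTY_DIR = "lib"
GENERATED_DIRS = {"javadoc", "junit-reports"}
GENERATED_SUFFIXES = {".class", ".war"}


def should_check_in(path: str) -> bool:
    """Classify a Java project path according to the two lists above."""
    p = PurePosixPath(path)
    parent_dirs = p.parts[:-1]
    # Javadoc and JUnit reports are generated output.
    if GENERATED_DIRS & set(parent_dirs):
        return False
    # JARs are checked in only when they are 3rd party libraries.
    if p.suffix == ".jar":
        return THIRD_PARTY_DIR in parent_dirs
    # Compiled classes and generated WARs stay out; everything else goes in.
    return p.suffix not in GENERATED_SUFFIXES
```

With this layout, should_check_in("lib/junit.jar") is True while should_check_in("build/app.jar") is False, which mirrors the 3rd party exception in the list.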

To avoid checking those unnecessary files into your repository, you can set up ignore rules in your version control client. Taking Git (as used on GitHub) as an example, you can define ignored files in .gitignore as below. (It is also a very good example of which types of files you should ignore for different languages.)

#################
## Eclipse
#################

*.pydevproject
.project
.metadata
bin/
tmp/
*.tmp
*.bak
*.swp
*~.nib
local.properties
.classpath
.settings/
.loadpath
**/*/*.class

# External tool builders
.externalToolBuilders/

# Locally stored “Eclipse launch configurations”
*.launch

# CDT-specific
.cproject

# PDT-specific
.buildpath

#################
## Visual Studio
#################

## Ignore Visual Studio temporary files, build results, and
## files generated by popular Visual Studio add-ons.

# User-specific files
*.suo
*.user
*.sln.docstates

# Build results

[Dd]ebug/
[Rr]elease/
x64/
build/
[Bb]in/
[Oo]bj/

# MSTest test Results
[Tt]est[Rr]esult*/
[Bb]uild[Ll]og.*

*_i.c
*_p.c
*.ilk
*.meta
*.obj
*.pch
*.pdb
*.pgc
*.pgd
*.rsp
*.sbr
*.tlb
*.tli
*.tlh
*.tmp
*.tmp_proj
*.log
*.vspscc
*.vssscc
.builds
*.pidb
*.log
*.scc

# Visual C++ cache files
ipch/
*.aps
*.ncb
*.opensdf
*.sdf
*.cachefile

# Visual Studio profiler
*.psess
*.vsp
*.vspx

# Guidance Automation Toolkit
*.gpState

# ReSharper is a .NET coding add-in
_ReSharper*/
*.[Rr]e[Ss]harper

# TeamCity is a build add-in
_TeamCity*

# DotCover is a Code Coverage Tool
*.dotCover

# NCrunch
*.ncrunch*
.*crunch*.local.xml

# Installshield output folder
[Ee]xpress/

# DocProject is a documentation generator add-in
DocProject/buildhelp/
DocProject/Help/*.HxT
DocProject/Help/*.HxC
DocProject/Help/*.hhc
DocProject/Help/*.hhk
DocProject/Help/*.hhp
DocProject/Help/Html2
DocProject/Help/html

# Click-Once directory
publish/

# Publish Web Output
*.Publish.xml
*.pubxml

# NuGet Packages Directory
## TODO: If you have NuGet Package Restore enabled, uncomment the next line
#packages/

# Windows Azure Build Output
csx
*.build.csdef

# Windows Store app package directory
AppPackages/

# Others
sql/
*.Cache
ClientBin/
[Ss]tyle[Cc]op.*
~$*
*~
*.dbmdl
*.[Pp]ublish.xml
*.pfx
*.publishsettings

# RIA/Silverlight projects
Generated_Code/

# Backup & report files from converting an old project file to a newer
# Visual Studio version. Backup files are not needed, because we have git 😉
_UpgradeReport_Files/
Backup*/
UpgradeLog*.XML
UpgradeLog*.htm

# SQL Server files
App_Data/*.mdf
App_Data/*.ldf

#############
## Windows detritus
#############

# Windows image file caches
Thumbs.db
ehthumbs.db

# Folder config file
Desktop.ini

# Recycle Bin used on file shares
$RECYCLE.BIN/

# Mac crap
.DS_Store

#############
## Python
#############

*.py[co]

# Packages
*.egg
*.egg-info
dist/
build/
eggs/
parts/
var/
sdist/
develop-eggs/
.installed.cfg

# Installer logs
pip-log.txt

# Unit test / coverage reports
.coverage
.tox

#Translations
*.mo

#Mr Developer
.mr.developer.cfg
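To get a feel for how the simpler patterns in this listing behave, here is a rough approximation in Python. It is not Git's matching algorithm: it only handles two pattern shapes, a directory name ending in a slash and a slash-free glob, and it skips negation, anchored patterns, and `**`. The function name is_ignored is my own.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Approximate check of a file path against simple ignore patterns.

    Simplification: only 'name/' and slash-free globs are handled;
    every other pattern shape is silently skipped.
    """
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # 'name/' matches a directory at any depth (never the file itself).
            if any(fnmatchcase(d, pattern[:-1]) for d in parts[:-1]):
                return True
        elif "/" not in pattern:
            # A slash-free glob matches a file or directory name at any depth.
            if any(fnmatchcase(part, pattern) for part in parts):
                return True
    return False
```

For instance, with the patterns `*.py[co]` and `[Dd]ebug/`, the path pkg/mod.pyc is ignored and pkg/mod.py is not. In a real repository, running `git check-ignore -v <path>` is the reliable way to see which rule, if any, matches a path.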