SRCS – Shell Revision Control System

I wrote a script to simulate a version control system. It simulates the core RCS functions: checking out a file at a specified version, checking in a file, showing all of a file's log metadata, diffing two versions, and so on.
Its logic:
1. For a brand-new file, first run the 'init' command to add a placeholder in the repository. Without 'init', the user cannot perform any other SRCS activity.
2. After 'init', the user can run 'ci' to check a new version of the file into the repository. Version 0 is the empty placeholder; checked-in versions start at 1 and count up (1, 2, 3, ..., 10, ...).
3. The user can use 'co' to check out the latest version of the file from the repository, or pass a version number to check out a specific version.
4. The user can use the 'diff' command to diff two versions in the repository, or to diff one version against the local copy.
5. The user can use the 'log' command to print all of the metadata for a file, including its check-in comments and version information.

How it works:
1. It depends on the Linux 'diff' and 'patch' commands.
2. For every file in the repository, there is a revision control file RCS/filename,v. For example, for the file admin/sql/test.sql, SRCS maintains its metadata in admin/sql/RCS/test.sql,v.
3. The root directory of the repository is defined in the parameter _PROD.
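The path mapping in point 2 can be sketched directly in shell (the paths below are just the example from above; `rel` is a variable name invented for this sketch):

```shell
_PROD="/u01/camel/12.0"                  # repository root (point 3)
rel="admin/sql/test.sql"                 # file as the user names it
fname_srcs="$_PROD/$rel"                 # copy kept inside the repository
fname=$(basename "$fname_srcs")          # test.sql
fdir=$(dirname "$fname_srcs")            # /u01/camel/12.0/admin/sql
fver="$fdir/RCS/$fname,v"                # metadata file (point 2)
echo "$fver"                             # /u01/camel/12.0/admin/sql/RCS/test.sql,v
```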

In SRCS – Check in file – implements initializing and checking in a file.
In SRCS – check out file – implements checking out a file at a specified version.
In SRCS – log – implements a 'log' command that lets the user print all the metadata for a file.
In SRCS – diff – implements a 'diff' command that shows the diff between two versions.

The script: https://github.com/luohuahuang/SRCS/blob/master/srcs.sh


_PROD="/u01/camel/12.0"
_DEBUG="on"
_LOG="_srcs.log"

fname_srcs="$_PROD/$2"
fname=$(basename "${fname_srcs}")
fdir=$(dirname "${fname_srcs}")
fver="$fdir/RCS/${fname},v"

function DEBUG()
{
	[ "$_DEBUG" == "on" ] && "$@" || :
}

function checkout()
{
	version=`tail -3 $fver | egrep "^version#" | sed 's/^version#//'`
	cp $fname_srcs ${fname}.tmp.$1
	for (( v=$version; v>$1; v-- ))
	do
		let delta=$v-1
		_l1=`nl "$fver" | egrep "version#$delta" | sed "s/version#$delta//" | sed 's/^\s*|\s*$//g'`
		_l2=`nl "$fver" | egrep "version#$v" | sed "s/version#$v//" | sed 's/^\s*|\s*$//g'`
		sed -n "${_l1},${_l2}p" "$fver" | patch ${fname}.tmp.$1 -i - >> /dev/null
	done
}

if [ "$1" == "diff" ]; then
	##
	# Diff two versions of a file.
	# This diffs v5 and v6 of admin/sql/test.sql: srcs.sh diff admin/sql/test.sql 5 6
	# This diffs v5 against the local copy of admin/sql/test.sql: srcs.sh diff admin/sql/test.sql 5
	##
	echo "Diff $2 $3 $4 from SRCS..." | tee -a $_LOG
	if [ "$#" -lt 3 ]; then
		echo "Please provide at least one revision for diff...<FAILED>" | tee -a $_LOG
		exit 1
	fi
	if [ ! -f "$fver" ]; then
		echo "$fname doesn't exist in SRCS...<FAILED>" | tee -a $_LOG
		exit 1
	else
		if egrep -v "^\s*$" $fver | wc -l | egrep "^0" > /dev/null ; then
			echo "$fname is just initialled in SRCS...<FAILED>" | tee -a $_LOG
			exit 1
		else
			if ! cat $fver | egrep "^version#$3" > /dev/null ; then
				echo "Version $3 doesn't exist...<FAILED>" | tee -a $_LOG
				exit 1
			else
				checkout $3
			fi
		fi
		if [ "$#" -lt 4 ]; then
			echo "Diff $2 $3 from SRCS with the local copy..." | tee -a $_LOG
			if [ ! -f "$fname" ]; then
				echo "$fname doesn't exist in local directory...<FAILED>" | tee -a $_LOG
				exit 1
			fi
			diff ${fname}.tmp.$3 $fname | tee -a $_LOG
			echo "Diff $2 $3 from SRCS with the local copy...<SUCCESSED>" | tee -a $_LOG
		else
			if ! cat $fver | egrep "^version#$4" > /dev/null ; then
				echo "Version $4 doesn't exist...<FAILED>" | tee -a $_LOG
				exit 1
			fi
			checkout $4
			diff ${fname}.tmp.$3 ${fname}.tmp.$4 | tee -a $_LOG
			echo "Diff $2 $3 $4 from SRCS...<SUCCESSED>" | tee -a $_LOG
		fi
	fi
elif [ "$1" == "init" ]; then
	##
	# Initialize a file in the revision repository.
	# This creates a placeholder for admin/sql/test.sql in the repository: srcs.sh init admin/sql/test.sql
	##
	date | tee -a $_LOG
	echo "Initial $2 in SRCS..." | tee -a $_LOG
	mkdir -p "$fdir/RCS"
	if [ -f "$fver" ]; then
		echo "$2 exists in SRCS already...<FAILED>" | tee -a $_LOG
		echo "Initial $2 in SRCS...<FAILED>" | tee -a $_LOG
		exit 1
	fi
	touch "$fver"
	if [ $? -gt 0 ];then
		echo "Initial $2 in SRCS...<FAILED>" | tee -a $_LOG
		exit 1
	fi
	echo "Initial $2 in SRCS...<SUCCESSED>" | tee -a $_LOG
	exit 0
elif [ "$1" == "ci" ]; then
	##
	# Check a file into the revision repository.
	# This checks a new version of admin/sql/test.sql into the repository: srcs.sh ci admin/sql/test.sql
	##
	echo "Checkin $2 in SRCS..." | tee -a $_LOG
	if [ ! -f "$fname" ]; then
		echo "$fname doesn't exist in local directory...<FAILED>" | tee -a $_LOG
		exit 1
	elif [ ! -f "$fver" ]; then
		echo "$fname doesn't exist in SRCS. Run init command to init it firstly...<FAILED>" | tee -a $_LOG
		exit 1
	else
		if egrep -v "^\s*$" $fver | wc -l | egrep "^0" > /dev/null ; then
			version=1
		else
			version=`tail -3 $fver | egrep "^version#" | sed 's/^version#//'`
			let version=1+$version
		fi
			echo "Input comment for this checkin: "
			read comment
			touch "${fname_srcs}"
			diff $fname $fname_srcs >> $fver
			echo "comment#$comment" >> $fver
			echo "version#$version" >> $fver
			echo "" >> $fver
			cp $fname $fname_srcs
			echo "Checkin $2 in SRCS...<SUCCESSED>" | tee -a $_LOG
	fi
elif [ "$1" == "co" ]; then
	##
	# Check out a file at a specified version from the revision repository.
	# This checks out the latest version of admin/sql/test.sql: srcs.sh co admin/sql/test.sql
	# This checks out v5 of admin/sql/test.sql: srcs.sh co admin/sql/test.sql 5
	##
	echo "Checkout $2 $3 from SRCS..." | tee -a $_LOG
	if [ ! -f "$fver" ]; then
		echo "$fname doesn't exist in SRCS. Run init and ci command to version control it firstly...<FAILED>" | tee -a $_LOG
		exit 1
	else
		if egrep -v "^\s*$" $fver | wc -l | egrep "^0" > /dev/null ; then
			echo "$fname is just initialled in SRCS. Run ci command to version control it firstly...<FAILED>" | tee -a $_LOG
		elif ! cat $fver | egrep "^version#$3" > /dev/null ; then
			echo "Version $3 doesn't exist...<FAILED>" | tee -a $_LOG
			exit 1
		elif [ "$#" -lt 3 ]; then
			cp $fname_srcs $fname
			echo "Checkout $2 $3 from SRCS...<SUCCESSED>" | tee -a $_LOG
		else
			version=`tail -3 $fver | egrep "^version#" | sed 's/^version#//'`
			cp $fname_srcs $fname
			for (( v=$version; v>$3; v-- ))
			do
				let delta=$v-1
				_l1=`nl "$fver" | egrep "version#$delta" | sed "s/version#$delta//" | sed 's/^\s*|\s*$//g'`
				_l2=`nl "$fver" | egrep "version#$v" | sed "s/version#$v//" | sed 's/^\s*|\s*$//g'`
				sed -n "${_l1},${_l2}p" "$fver" | patch $fname -i - >> /dev/null
			done
			echo "Checkout $2 $3 from SRCS...<SUCCESSED>" | tee -a $_LOG
		fi
	fi
elif [ "$1" == "log" ]; then
	##
	# Print all the metadata for a file, including the complete version history and the check-in comments.
	# This prints the metadata for admin/sql/test.sql: srcs.sh log admin/sql/test.sql
	##
	echo "Log for $2 from SRCS..." | tee -a $_LOG
	if [ ! -f "$fver" ]; then
		echo "$fname doesn't exist in SRCS...<FAILED>" | tee -a $_LOG
		exit 1
	else
		if egrep -v "^\s*$" $fver | wc -l | egrep "^0" > /dev/null ; then
			echo "$fname is just initialled in SRCS and has no revision...<FAILED>" | tee -a $_LOG
		else
			logs=`cat $fver | egrep "^version#|comment"`
			echo "$logs" | tee -a $_LOG
		fi
	fi
else
	##
	# when all options fail
	##
	echo "$1 is not a supported parameter in SRCS..." | tee -a $_LOG
	exit 1
fi

SRCS – diff

I am writing a script to simulate a Version Control System.
In SRCS – Check in file – implements initializing and checking in a file.
In SRCS – check out file – implements checking out a file at a specified version.
In SRCS – log – implements a 'log' command that lets the user print all the metadata for a file.

In this post I will implement a 'diff' command that lists file changes.

For example,
The following command lists the differences between revisions 5 and 6 of admin/sql/test1.sql:
srcs.sh diff admin/sql/test1.sql 5 6
The following command lists the differences between revision 5 and the local copy:
srcs.sh diff admin/sql/test1.sql 5


_PROD="/u01/camel/12.0"
_DEBUG="on"
_LOG="_srcs.log"

fname_srcs="$_PROD/$2"
fname=$(basename "${fname_srcs}")
fdir=$(dirname "${fname_srcs}")
fver="$fdir/RCS/${fname},v"
	
function DEBUG()
{
	[ "$_DEBUG" == "on" ] && "$@" || :
}

function checkout()
{
	version=`tail -3 $fver | egrep "^version#" | sed 's/^version#//'`
	cp $fname_srcs ${fname}.tmp.$1
	for (( v=$version; v>$1; v-- ))
	do
		let delta=$v-1
		_l1=`nl "$fver" | egrep "version#$delta" | sed "s/version#$delta//" | sed 's/^\s*|\s*$//g'`
		_l2=`nl "$fver" | egrep "version#$v" | sed "s/version#$v//" | sed 's/^\s*|\s*$//g'`
		sed -n "${_l1},${_l2}p" "$fver" | patch ${fname}.tmp.$1 -i - >> /dev/null
	done
}

if [ "$1" == "diff" ]; then
	echo "Diff $2 $3 $4 from SRCS..." | tee -a $_LOG
	if [ "$#" -lt 3 ]; then
		echo "Please provide at least one revision for diff...<FAILED>" | tee -a $_LOG
		exit 1
	fi
	if [ ! -f "$fver" ]; then
		echo "$fname doesn't exist in SRCS...<FAILED>" | tee -a $_LOG
		exit 1
	else
		if egrep -v "^\s*$" $fver | wc -l | egrep "^0" > /dev/null ; then 
			echo "$fname is just initialled in SRCS...<FAILED>" | tee -a $_LOG
			exit 1
		else
			if ! cat $fver | egrep "^version#$3" > /dev/null ; then
				echo "Version $3 doesn't exist...<FAILED>" | tee -a $_LOG
				exit 1
			else
				checkout $3
			fi
		fi
		if [ "$#" -lt 4 ]; then 	
			echo "Diff $2 $3 from SRCS with the local copy..." | tee -a $_LOG
			if [ ! -f "$fname" ]; then
				echo "$fname doesn't exist in local directory...<FAILED>" | tee -a $_LOG
				exit 1
			fi
			diff ${fname}.tmp.$3 $fname | tee -a $_LOG
			echo "Diff $2 $3 from SRCS with the local copy...<SUCCESSED>" | tee -a $_LOG
		else 
			if ! cat $fver | egrep "^version#$4" > /dev/null ; then
				echo "Version $4 doesn't exist...<FAILED>" | tee -a $_LOG
				exit 1
			fi
			checkout $4
			diff ${fname}.tmp.$3 ${fname}.tmp.$4 | tee -a $_LOG
			echo "Diff $2 $3 $4 from SRCS...<SUCCESSED>" | tee -a $_LOG
		fi
	fi
fi

Test Result:

-bash-3.2$ sh srcs.sh diff admin/sql/test2.sql
Diff admin/sql/test2.sql   from SRCS...
Please provide at least one revision for diff...<FAILED>
-bash-3.2$ sh srcs.sh diff admin/sql/test1.sql
Diff admin/sql/test1.sql   from SRCS...
Please provide at least one revision for diff...<FAILED>
-bash-3.2$ sh srcs.sh diff admin/sql/test2.sql
Diff admin/sql/test2.sql   from SRCS...
Please provide at least one revision for diff...<FAILED>
-bash-3.2$ sh srcs.sh diff admin/sql/test2.sql 3
Diff admin/sql/test2.sql 3  from SRCS...
test2.sql is just initialled in SRCS...<FAILED>
-bash-3.2$ sh srcs.sh diff admin/sql/test2.sql 3 5
Diff admin/sql/test2.sql 3 5 from SRCS...
test2.sql is just initialled in SRCS...<FAILED>
-bash-3.2$ sh srcs.sh diff admin/sql/test1.sql 6
Diff admin/sql/test1.sql 6  from SRCS...
Diff admin/sql/test1.sql 6 from SRCS with the local copy...
2,7c2,3
<  I am Version 6
< ahoghgoh
< gegoruaon
<  hello version6
< haxialg xn
< this line is new in version 6
---
> ### I am Version 1
> ### This line should be the same
Diff admin/sql/test1.sql 6 from SRCS with the local copy...<SUCCESSED>
-bash-3.2$ sh srcs.sh diff admin/sql/test1.sql 5
Diff admin/sql/test1.sql 5  from SRCS...
Diff admin/sql/test1.sql 5 from SRCS with the local copy...
2,5c2,3
<  I am Version 5
<  hello version5
< haxialg xn
< this line is new in version 5
---
> ### I am Version 1
> ### This line should be the same
Diff admin/sql/test1.sql 5 from SRCS with the local copy...<SUCCESSED>
-bash-3.2$ sh srcs.sh diff admin/sql/test1.sql 53
Diff admin/sql/test1.sql 53  from SRCS...
Version 53 doesn't exist...<FAILED>
-bash-3.2$ sh srcs.sh diff admin/sql/test1.sql 3
Diff admin/sql/test1.sql 3  from SRCS...
Diff admin/sql/test1.sql 3 from SRCS with the local copy...
2,4c2,3
<  I am Version 3
< ### This line should be the different
< this line is new in version 3
---
> ### I am Version 1
> ### This line should be the same
Diff admin/sql/test1.sql 3 from SRCS with the local copy...<SUCCESSED>
-bash-3.2$ sh srcs.sh diff admin/sql/test1.sql 2
Diff admin/sql/test1.sql 2  from SRCS...
Diff admin/sql/test1.sql 2 from SRCS with the local copy...
2c2
< ### I am Version 2
---
> ### I am Version 1
4d3
< this line is new in version 2
Diff admin/sql/test1.sql 2 from SRCS with the local copy...<SUCCESSED>
-bash-3.2$ sh srcs.sh diff admin/sql/test1.sql 1
Diff admin/sql/test1.sql 1  from SRCS...
Diff admin/sql/test1.sql 1 from SRCS with the local copy...
Diff admin/sql/test1.sql 1 from SRCS with the local copy...<SUCCESSED>
-bash-3.2$ sh srcs.sh diff admin/sql/test1.sql 6 4
Diff admin/sql/test1.sql 6 4 from SRCS...
2,7c2,4
<  I am Version 6
< ahoghgoh
< gegoruaon
<  hello version6
< haxialg xn
< this line is new in version 6
---
>  I am Version  4
>  hello version4
> this line is new in version 4
Diff admin/sql/test1.sql 6 4 from SRCS...<SUCCESSED>
-bash-3.2$

SRCS – log

I am writing a script to simulate a Version Control System.
In SRCS – Check in file – It implements the initial and check in a file.
In SRCS – check out file – It implements check out a file with a specified version.

In this post I will implement a ‘log’ command to allow user to print all the metadata for a file, including a complete version history and the checkin comments.


_PROD="/u01/camel/12.0"
_DEBUG="on"
_LOG="_srcs.log"

fname_srcs="$_PROD/$2"
fname=$(basename "${fname_srcs}")
fdir=$(dirname "${fname_srcs}")
fver="$fdir/RCS/${fname},v"

function DEBUG()
{
	[ "$_DEBUG" == "on" ] && "$@" || :
}

if [ "$1" == "log" ]; then
	echo "Log for $2 from SRCS..." | tee -a $_LOG
	if [ ! -f "$fver" ]; then
		echo "$fname doesn't exist in SRCS...<FAILED>" | tee -a $_LOG
		exit 1
	else
		if egrep -v "^\s*$" $fver | wc -l | egrep "^0" > /dev/null ; then
			echo "$fname is just initialled in SRCS and has no revision...<FAILED>" | tee -a $_LOG
		else
			logs=`cat $fver | egrep "^version#|comment"`
			echo "$logs" | tee -a $_LOG
		fi
	fi
fi

Test result:

-bash-3.2$ sh srcs.sh log admin/sql/test2.sql
Log for admin/sql/test2.sql from SRCS...
test2.sql is just initialled in SRCS and has no revision...<FAILED>
-bash-3.2$ sh srcs.sh log admin/sql/test3.sql
Log for admin/sql/test3.sql from SRCS...
test3.sql doesn't exist in SRCS...<FAILED>
-bash-3.2$ sh srcs.sh log admin/sql/test1.sql
Log for admin/sql/test1.sql from SRCS...
comment#initial checkin for test1.sql
version#1
comment#hello version 2
version#2
comment#okie, version 3
version#3
comment#i am version 4.
version#4
comment#hello hello version 5
version#5
comment#version 6 here...
version#6
-bash-3.2$

MISC – May 19th, 2014

1# nl – number lines of files. Write each FILE to standard output, with line numbers added.  With no FILE, or when FILE is -, read standard input.

-bash-3.2$ cat file1
This is file 1
I am a file 1
-bash-3.2$ nl file1
1 This is file 1
2 I am a file 1
-bash-3.2$

2# sed -n "$start,${end}p" filename
Output certain lines from a file

-bash-3.2$ nl file2
     1  This is file 2
     2  I am a file 2
     3  line 3 - 123
     4  line 4 - 123
     5  line 5 - 123
-bash-3.2$ sed -n '2,4p' file2
I am a file 2
line 3 - 123
line 4 - 123
-bash-3.2$

3# In shell, if you want to pass variables into a pattern, the best practice is to use double quotes instead of single quotes, so the $variables can be expanded.

-bash-3.2$ export start=2
-bash-3.2$ export end=4
-bash-3.2$ sed -n '$start,${end}p' file2
sed: -e expression #1, char 14: unterminated `s' command
-bash-3.2$ sed -n "$start,${end}p" file2
I am a file 2
line 3 - 123
line 4 - 123
-bash-3.2$

4# Three typical ways to write a for loop in shell:

for (( v=10; v>1; v-- ))
do

done

for i in {1..5}
do
   echo "Welcome $i times"
done

for f in $FILES
do

done

5# Output all of the lines except the last N lines.

head -n -N file
-bash-3.2$ nl file2
1 This is file 2
2 I am a file 2
3 line 3 - 123
4 line 4 - 123
5 line 5 - 123
-bash-3.2$ head -n -2 file2
This is file 2
I am a file 2
line 3 - 123
-bash-3.2$
6# Output all of the lines except the first N lines.
tail -n +(N+1) file
-bash-3.2$ nl file2
1 This is file 2
2 I am a file 2
3 line 3 - 123
4 line 4 - 123
5 line 5 - 123
-bash-3.2$ tail -n +3 file2
line 3 - 123
line 4 - 123
line 5 - 123
-bash-3.2$

SRCS – check out file

I just started writing a script to simulate a basic source version control tool. In http://luohuahuang.com/2014/05/17/srcs-startup/ I prepared an experimental script for the init and check-in activities. Now in this post I will try the check-out function.
Basically the check-out function will support:
1# If the version is the latest, just copy the latest version from the repository to local.
2# Otherwise, check out the specified version.

I will use the Linux patch command to apply the diff content to the latest file to obtain the specified version.
The logic:
1# Copy the latest file to local.
2# Run the patch command in a loop, applying diffs until it reaches the specified version. For example,

-bash-3.2$ cat 2.sql
hello world
no wrong v4
new line in v4
select * from table2;
-bash-3.2$ vi diff.txt
-bash-3.2$ cat diff.txt | patch 2.sql -i -
patching file 2.sql
-bash-3.2$ cat 2.sql
hello world
no wrong
new line in v3
select * from table2;

Note:
-i patchfile or --input=patchfile
Read the patch from patchfile. If patchfile is -, read from standard input, the default.

So in the loop, I will extract the diff content from filename,v and feed it to the patch command on standard input.
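As a sketch of that loop body, the snippet below rebuilds version 1 from a toy two-version history (every file name here is invented for the demo; `work.txt,v` only mimics the marker layout the check-in step writes):

```shell
printf 'line A\nline B v1\n' > v1.txt
printf 'line A\nline B v2\n' > v2.txt
cp v2.txt work.txt                         # start from the latest (v2) copy

# Build a toy ",v" history: the hunk stored between the two markers
# rewrites v2 back into v1 (check-in stores diff(new, old)).
echo 'version#1' >  work.txt,v
diff v2.txt v1.txt >> work.txt,v || true   # diff exits 1 when files differ
echo 'version#2' >> work.txt,v

# nl numbers every line; grep finds the marker rows; awk keeps the number.
l1=$(nl -ba work.txt,v | grep 'version#1' | awk '{print $1}')
l2=$(nl -ba work.txt,v | grep 'version#2' | awk '{print $1}')

# Slice that range and feed it to patch on stdin (-i - reads stdin);
# patch treats the marker lines as harmless garbage around the hunk.
sed -n "${l1},${l2}p" work.txt,v | patch work.txt -i - > /dev/null
```

After the loop step, work.txt holds the version 1 content again.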

The logic:

1. If the file being checked out has no history in the repository, fail.

2. If no version is specified, or the latest version is specified, check out the latest version.

3. If a version is specified but doesn't exist, fail; otherwise check out the specified version.


_PROD="/u01/camel/12.0"
_DEBUG="on"
_LOG="_srcs.log"

fname_srcs="$_PROD/$2"
fname=$(basename "${fname_srcs}")
fdir=$(dirname "${fname_srcs}")
fver="$fdir/RCS/${fname},v"

function DEBUG()
{
	[ "$_DEBUG" == "on" ] && "$@" || :
}

if [ "$1" == "co" ]; then
	echo "Checkout $2 $3 from SRCS..." | tee -a $_LOG
	if [ ! -f "$fver" ]; then
		echo "$fname doesn't exist in SRCS. Run init and ci command to version control it firstly...<FAILED>" | tee -a $_LOG
		exit 1
	else
		if egrep -v "^\s*$" $fver | wc -l | egrep "^0" > /dev/null ; then
			echo "$fname is just initialled in SRCS. Run ci command to version control it firstly...<FAILED>" | tee -a $_LOG
		elif ! cat $fver | egrep "^version#$3" > /dev/null ; then
			echo "Version $3 doesn't exist...<FAILED>" | tee -a $_LOG
			exit 1
		elif [ "$#" -lt 3 ]; then
			cp $fname_srcs $fname
			echo "Checkout $2 $3 from SRCS...<SUCCESSED>" | tee -a $_LOG
		else
			version=`tail -3 $fver | egrep "^version#" | sed 's/^version#//'`
			cp $fname_srcs $fname
			for (( v=$version; v>$3; v-- ))
			do
				let delta=$v-1
				_l1=`nl "$fver" | egrep "version#$delta" | sed "s/version#$delta//" | sed 's/^\s*|\s*$//g'`
				_l2=`nl "$fver" | egrep "version#$v" | sed "s/version#$v//" | sed 's/^\s*|\s*$//g'`
				sed -n "${_l1},${_l2}p" "$fver" | patch $fname -i - >> /dev/null
			done
			echo "Checkout $2 $3 from SRCS...<SUCCESSED>" | tee -a $_LOG
		fi
	fi
fi

Test data:

The data below was recorded in the filename,v file by the check-in script posted at http://luohuahuang.com/2014/05/17/srcs-startup/. It basically records all of the diff changes.
####################################################################

### This is test1.sql file
### I am Version 1
### This line should be the same
###

comment: initial checkin for test1.sql

####################################################################

### This is test1.sql file
### I am Version 2
### This line should be the same
this line is new in version 2
###

comment: hello version 2

####################################################################

### This is test1.sql file
I am Version 3
### This line should be the different
this line is new in version 3
###

comment: okie, version 3

####################################################################

### This is test1.sql file
I am Version 4
hello version4
this line is new in version 4
###

comment: i am version 4.

####################################################################

### This is test1.sql file
I am Version 5
hello version5
haxialg xn
this line is new in version 5
###

comment: hello hello version 5

####################################################################

### This is test1.sql file
I am Version 6
ahoghgoh
gegoruaon
hello version6
haxialg xn
this line is new in version 6
###

version 6 here…

Test Result:

-bash-3.2$ sh srcs.sh co admin/sql/test1.sql 4
Checkout admin/sql/test1.sql 4 from SRCS...
Checkout admin/sql/test1.sql 4 from SRCS...<SUCCESSED>
-bash-3.2$ cat test1.sql
### This is test1.sql file
I am Version 4
hello version4
this line is new in version 4
###
-bash-3.2$ rm test1.sql
-bash-3.2$ sh srcs.sh co admin/sql/test1.sql 2
Checkout admin/sql/test1.sql 2 from SRCS...
Checkout admin/sql/test1.sql 2 from SRCS...<SUCCESSED>
-bash-3.2$ cat test1.sql
### This is test1.sql file
### I am Version 2
### This line should be the same
this line is new in version 2
###
-bash-3.2$ sh srcs.sh co admin/sql/test1.sql 1
Checkout admin/sql/test1.sql 1 from SRCS...
Checkout admin/sql/test1.sql 1 from SRCS...<SUCCESSED>
-bash-3.2$ cat test1.sql
### This is test1.sql file
### I am Version 1
### This line should be the same
###
-bash-3.2$ sh srcs.sh co admin/sql/test1.sql 0
Checkout admin/sql/test1.sql 0 from SRCS...
Version 0 doesn't exist...<FAILED>

From the test result, the check-out script is working!

So far, the scripts can simulate the following version control functions:
1. Initialize a file in the version repository.
2. Check a file into the repository; the repository records the changes.
3. Check out a file at a specified version.
Note that it only works for text-based files.

Todo list:

1. Lock and unlock
2. Access control
3. Branching
4. etc.

I will complete these in later posts.

SRCS – Check in file

Today I got some spare time, so I tried to simulate a basic source version control tool using the Linux patch command. Basically my target is that the tool should be able to handle version control in a standalone environment for a single user: check in a file, check out a file, check out a file at a dedicated version, lock and unlock, review logs, etc. Of course, multi-user and client-server modes will be supported in the future.

This afternoon I tried to figure out how the patch command works. Let's see an example of the patch command:

-bash-3.2$ cat file1
This is file 1
I am a file 1
-bash-3.2$ cat file2
This is file 1
This is file 2
I am a file  2
-bash-3.2$ diff file1 file2 > diff.txt
-bash-3.2$ cat diff.txt
2c2,3
< I am a file 1
---
> This is file 2
> I am a file  2
-bash-3.2$ rm file2
-bash-3.2$ cp file1 file2
-bash-3.2$ patch file2 < diff.txt
patching file file2
-bash-3.2$ cat file2
This is file 1
This is file 2
I am a file  2
-bash-3.2$

Currently I have composed an experimental script. The script can do the following:
1. For a new file, run 'srcs init file' to create a placeholder in the repository.
2. Check in a new version of the file, recording the diff in the RCS/file,v file.

For example, for the new file admin/sql/test/1.sql, running 'srcs init admin/sql/test/1.sql' creates a control file at $_PROD/admin/sql/test/RCS/1.sql,v.
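Here is a minimal sketch of what a single check-in appends to the ,v file (the file names are invented for the demo; the real 'ci' also prompts for the comment interactively):

```shell
printf 'hello world\nselect 1;\n' > 1.sql    # new local content
: > 1.sql,v                                  # empty placeholder left by 'init'
: > repo_1.sql                               # empty repository copy (version 0)

diff 1.sql repo_1.sql >> 1.sql,v || true     # hunk converting new -> old
echo 'comment#first checkin'   >> 1.sql,v    # the check-in comment marker
echo 'version#1'               >> 1.sql,v    # the version marker
cp 1.sql repo_1.sql                          # repository now holds the latest
cat 1.sql,v
```

The ,v file now contains the reverse hunk (1,2d0 plus the removed lines) followed by the comment# and version# markers, matching the layout shown in the output below.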


_PROD="/u01/camel/12.0"
_DEBUG="on"
_LOG="_srcs.log"

fname_srcs="$_PROD/$2"
fname=$(basename "${fname_srcs}")
fdir=$(dirname "${fname_srcs}")
fver="$fdir/RCS/${fname},v"

function DEBUG()
{
	[ "$_DEBUG" == "on" ] && "$@" || :
}

if [ "$1" == "init" ]; then
	date | tee -a $_LOG
	echo "Initial $2 in SRCS..." | tee -a $_LOG
	mkdir -p "$fdir/RCS"
	if [ -f "$fver" ]; then
		echo "$2 exists in SRCS already...<FAILED>" | tee -a $_LOG
		echo "Initial $2 in SRCS...<FAILED>" | tee -a $_LOG
		exit 1
	fi
	touch "$fver"
	if [ $? -gt 0 ];then
		echo "Initial $2 in SRCS...<FAILED>" | tee -a $_LOG
		exit 1
	fi
	echo "Initial $2 in SRCS...<SUCCESSED>" | tee -a $_LOG
	exit 0
fi

if [ "$1" == "ci" ]; then
	echo "Checkin $2 in SRCS..." | tee -a $_LOG
	if [ ! -f "$fname" ]; then
		echo "$fname doesn't exist in local directory...<FAILED>" | tee -a $_LOG
		exit 1
	elif [ ! -f "$fver" ]; then
		echo "$fname doesn't exist in SRCS. Run init command to init it firstly...<FAILED>" | tee -a $_LOG
		exit 1
	else
		if egrep -v "^\s*$" $fver | wc -l | egrep "^0" > /dev/null ; then
			version=1
		else
			version=`tail -3 $fver | egrep "^version#" | sed 's/^version#//'`
			let version=1+$version
		fi
			echo "Input comment for this checkin: "
			read comment
			touch "${fname_srcs}"
			diff $fname $fname_srcs >> $fver
			echo "comment#$comment" >> $fver
			echo "version#$version" >> $fver
			cp $fname $fname_srcs
			echo "Checkin $2 in SRCS...<SUCCESSED>" | tee -a $_LOG
	fi
fi

Output:

-bash-3.2$ srcs.sh ci admin/sql/test1/2.sql
Chekcin admin/sql/test1/2.sql in SRCS...
2.sql doesn't exist in SRCS. Run init command to init it firstly...<FAILED>
-bash-3.2$ srcs.sh init admin/sql/test1/2.sql
Sat May 17 01:17:40 PDT 2014
Initial admin/sql/test1/2.sql in SRCS...
Initial admin/sql/test1/2.sql in SRCS...<SUCCESSED>
-bash-3.2$ srcs.sh ci admin/sql/test1/2.sql
Chekcin admin/sql/test1/2.sql in SRCS...
Input comment for this checkin:
this is the 1st checin for 2.sql. Good luck
Chekcin admin/sql/test1/2.sql in SRCS...<SUCCESSED>
-bash-3.2$ cat /u01/camel/12.0/admin/sql/test1/2.sql
hello world
yes correct
select * from table2;
-bash-3.2$ cat /u01/camel/12.0/admin/sql/test1/RCS/2.sql,v
1,3d0
< hello world
< yes correct
< select * from table2;
comment#this is the 1st checin for 2.sql. Good luck
version#1
-bash-3.2$ vi 2.sql
-bash-3.2$ srcs.sh ci admin/sql/test1/2.sql
Chekcin admin/sql/test1/2.sql in SRCS...
Input comment for this checkin:
okie v2 is coming.
Chekcin admin/sql/test1/2.sql in SRCS...<SUCCESSED>
-bash-3.2$ cat /u01/camel/12.0/admin/sql/test1/2.sql
hello world
no wrong
new line in v2
select * from table2;
-bash-3.2$ cat /u01/camel/12.0/admin/sql/test1/RCS/2.sql,v
1,3d0
< hello world
< yes correct
< select * from table2;
comment#this is the 1st checin for 2.sql. Good luck
version#1
2,3c2
< no wrong
< new line in v2
---
> yes correct
comment#okie v2 is coming.
version#2

In the coming days, I will implement checking out files at dedicated versions and add access control.

How Make builds software

This weekend I went through the GNU Make manual http://www.gnu.org/software/make/manual/make.html and I now basically understand the syntax of Make, its implicit/explicit rules, makefile automatic/defined variables, data structures and functions, and its execution phases (parsing makefiles to construct a directed acyclic graph, then executing the rules).

Since I now understand the basics of Make, I couldn't wait to look into OpenJDK's make system! 🙂 I found a directory diagram of the OpenJDK source code (they call it the OpenJDK Mercurial Forest): https://blogs.oracle.com/kto/entry/openjdk_mercurial_forest. It was posted in Oct. 2007 and is a bit outdated, but overall it still reflects the current OpenJDK 8.

Before I describe my understanding of how OpenJDK builds, I want to introduce how Make builds source code spread across multiple directories with multiple levels of nesting. For a simple make example (all of the source code in one directory), refer to my post Hello World in MAKE. It is not easy to construct a build system with Make for software spanning many directories. Basically, in Make, we have two methods.

Method 1 – Recursive Make:

Take the structure below as an example: it has one Makefile under the root directory and one Makefile under each sub-directory. This hierarchy of modules can be nested arbitrarily deep. The method is that each makefile uses $(MAKE) to invoke its child makefiles recursively.

The top-level Makefile often looks a lot like a shell script:

MODULES = ant bee
all:
	for dir in $(MODULES); do \
	  (cd $$dir; ${MAKE} all); \
	done

The ant/Makefile looks like this:

all: main.o
main.o: main.c ../bee/parse.h
	$(CC) -I../bee -c main.c

Its directed acyclic graph generated by make in memory should be like below,

The logic here is straightforward: to build prog, make recursively builds main.o and parse.o. For more details, you might refer to http://aegis.sourceforge.net/auug97.pdf (Recursive Make Considered Harmful), which lists the drawbacks of recursive make. My understanding is that with only 2 sub-directories, as in the example above, everything is fine; but with hundreds of directories and complicated dependencies between them, we would have to tweak their build sequence by hand, which is painful. Another pitfall: suppose main.o needs a library generated from another makefile; there is a risk that main.o will not be built properly because the library might be outdated, since the library is not covered by main.o's makefile.

Method 2 – Inclusive Make:

Let's use the example below to explain what inclusive make is:

You might have noticed that each subdirectory has a .mk file. Inclusive means that the main entry-point makefile includes those .mk files. This method ensures there is always only one GNU make process (in the recursive method above, there may be more than one). Best of all, it manages all of the dependency relationships together, which means we will not miss any dependencies and there is less risk of building outdated or improper software.

The advantages:

1. There is only one GNU make process running, so its startup is faster; the recursive method might invoke hundreds of processes.

2. Each directory still has one small makefile describing the rules for the files under it; maintaining a single giant makefile would be a nightmare.

3. It maintains all of the dependency relationships together, reducing the risk of generating improper build artifacts.

4. As noted in point 3, make maintains the dependency relationships itself, so we don't need to hand-maintain the $(MAKE) ordering.

Well, let's talk about its disadvantages 😦

1. It is more difficult and complicated to compose the makefiles: in make, an 'include' directive inserts the text literally, so we have to handle variable declaration, value assignment, and goal definition more carefully!
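As a sketch of the inclusive layout, the snippet below builds a tiny two-module tree with a single make process (module names reuse the earlier ant/bee example; everything else is invented, and it assumes GNU make 3.82+ so .RECIPEPREFIX can stand in for literal tab recipes):

```shell
mkdir -p incdemo/ant incdemo/bee && cd incdemo

# Each module contributes its objects and rules via a small .mk file.
cat > ant/module.mk <<'EOF'
OBJS += ant/main.o
ant/main.o:
> @echo building ant/main.o
> @touch ant/main.o
EOF

cat > bee/module.mk <<'EOF'
OBJS += bee/parse.o
bee/parse.o:
> @echo building bee/parse.o
> @touch bee/parse.o
EOF

# The top-level Makefile includes every module's rules, so one make
# process sees the whole dependency graph.
cat > Makefile <<'EOF'
.RECIPEPREFIX := >
MODULES := ant bee
OBJS    :=
include $(patsubst %,%/module.mk,$(MODULES))
all: $(OBJS)
> @echo linked: $(OBJS)
EOF

make all          # a single make process builds both modules
```

Unlike the recursive variant, no $(MAKE) re-invocation happens here; the includes assemble one DAG before any rule runs.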

How OpenJDK maintains makefiles

We have looked at two methods for building source code spread across many directories and directory levels with make. Now let's look into OpenJDK and see how it maintains its build system structure.

As shown in my post Let's build openjdk, OpenJDK can be built with just one word: make. Below are its targets (http://hg.openjdk.java.net/jdk8/jdk8/raw-file/tip/README-builds.html):

Make Target   Description
(empty)       build everything but no images
all           build everything including images
all-conf      build all configurations
images        create complete j2sdk and j2re images
install       install the generated images locally, typically in /usr/local
clean         remove all files generated by make, but not those generated by configure
dist-clean    remove all files generated by both make and configure (basically killing the configuration)
help          give some help on using make, including some interesting make targets

Let’s see how it works.

1. Under common/makefiles/, there is one file, Makefile, with only one line:

include ../../NewMakefile.gmk

2. In NewMakefile.gmk, there are the following include snippets:

# ... and then we can include our helper functions
include $(root_dir)/common/makefiles/MakeHelpers.gmk
...
    ifeq ($(words $(SPEC)),1)
        # We are building a single configuration. This is the normal case. Execute the Main.gmk file.
        include $(root_dir)/common/makefiles/Main.gmk
    else
...
include $(root_dir)/common/makefiles/Jprt.gmk
...
help:
	$(info )
	$(info OpenJDK Makefile help)
...
	$(info )

.PHONY: help

3. Let’s look into $(root_dir)/common/makefiles/Main.gmk

In Main.gmk, it includes MakeBase.gmk and others, and it defines all of its targets, matching the table I listed above.


Conclusion:

OpenJDK uses inclusive make to build its source code: it has one .gmk file under each source directory.

However, to support building components individually, OpenJDK also provides a Makefile for each component. OpenJDK consists of the components below.

Repository  Contains
. (root)    common configure and makefile logic
hotspot     source code and make files for building the OpenJDK Hotspot Virtual Machine
langtools   source code for the OpenJDK javac and language tools
jdk         source code and make files for building the OpenJDK runtime libraries and misc files
jaxp        source code for the OpenJDK JAXP functionality
jaxws       source code for the OpenJDK JAX-WS functionality
corba       source code for the OpenJDK Corba functionality
nashorn     source code for the OpenJDK JavaScript implementation

For example, the corba component has a Makefile under the corba/ directory:

...
#
# Makefile for building the corba workspace.
#

BUILDDIR=.
include $(BUILDDIR)/common/Defs.gmk
include $(BUILDDIR)/common/CancelImplicits.gmk

#----- commands

CHMOD = chmod
CP = cp
ECHO = echo # FIXME
...
# Default target
default: all

#----- classes.jar

CLASSES_JAR = $(LIB_DIR)/classes.jar
$(CLASSES_JAR):
	$(MKDIR) -p $(@D)
	$(BOOT_JAR_CMD) -cf $@ -C $(CLASSES_DIR) .

#----- src.zip

SRC_ZIP_FILES = $(shell $(FIND) $(SRC_CLASSES_DIR) \( -name \*-template \) -prune -o -type f -print )
...
jprt_build_product jprt_build_debug jprt_build_fastdebug: all
	( $(CD) $(OUTPUTDIR) && \
	  $(ZIP) -q -r $(JPRT_ARCHIVE_BUNDLE) build dist )

#-------------------------------------------------------------------
...
#
# Phonies to avoid accidents.
#
.PHONY: all build clean clobber debug jprt_build_product jprt_build_debug jprt_build_fastdebug

Basically, I am clear about how Make works! Awesome!!! 😉


Hello World in MAKE

I had no experience building software with MAKE. However, as I mentioned in my post Let’s build openjdk, I am very interested in how java.net builds OpenJDK (studying the open community is a very good means of learning to compose elegant build scripts). But I know that before I can enjoy OpenJDK’s build scripts, I need to dig into MAKE first.

Source codes I will use in this Hello World tutorial:

add.c

#include <stdio.h>
#include "calc.h"

void add(const char *string)
{
	printf("I am adder %s\n", string);
}

sub.c

#include <stdio.h>
#include "calc.h"

void sub(const char *string)
{
	printf("I am subber %s\n", string);
}

calc.c

#include "calc.h"

int main(int argc, char *argv[])
{
	add("1");
	sub("1");
	return 0;
}

calc.h

extern void add(const char *string);
extern void sub(const char *string);

Compile the above source code manually:

luhuang@luhuang:~/workspace/calculator$ pwd
/home/luhuang/workspace/calculator
luhuang@luhuang:~/workspace/calculator$ ls
add.c  calc.c  calc.h  makefile  sub.c
luhuang@luhuang:~/workspace/calculator$ gcc -g -c add.c
luhuang@luhuang:~/workspace/calculator$ gcc -g -c sub.c
luhuang@luhuang:~/workspace/calculator$ gcc -g -c calc.c
luhuang@luhuang:~/workspace/calculator$ gcc -g -o calculator calc.o add.o sub.o
luhuang@luhuang:~/workspace/calculator$ ./calculator
I am adder 1
I am subber 1
luhuang@luhuang:~/workspace/calculator$ ls
add.c  add.o  calc.c  calc.h  calc.o  calculator  makefile  sub.c  sub.o
luhuang@luhuang:~/workspace/calculator$

You can refer to my post How Compiler build Software for more details.

Let’s start using MAKE!

http://www.gnu.org/software/make/manual/make.html

Preparing and Running Make,

You can simply run the command ‘make’ from shell:

luhuang@luhuang:~$ make
make: *** No targets specified and no makefile found.  Stop.
luhuang@luhuang:~$

make needs a file called the makefile that describes the relationships among files in your program and provides commands for updating each file.

The command above fails with a ‘no makefile found’ error because that directory has no default file called ‘makefile’. So first, let’s compose a makefile.

# I am a comment
# Hello world in MAKE
calculator: calc.o add.o sub.o
	gcc -g -o calculator calc.o add.o sub.o

add.o: add.c calc.h
	gcc -g -c add.c

sub.o: sub.c calc.h
	gcc -g -c sub.c

calc.o: calc.c calc.h
	gcc -g -c calc.c

The makefile above defines the relationships among the files calc.c, sub.c, add.c, and calc.h. A line starting with # is a comment and will be ignored.

-g enables debug information, -c compiles a source file into an object file without linking, and -o names the output when linking the .o files into a program. You need to put a tab character at the beginning of every recipe line! This is an obscurity that catches the unwary. If you prefer to prefix your recipes with a character other than tab, you can set the .RECIPEPREFIX variable to an alternate character. The first two lines of the makefile form one rule.
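A tiny throwaway demo of .RECIPEPREFIX (this assumes GNU make 3.82 or newer; the file and target names are hypothetical):

```shell
# The recipe below starts with '>' instead of a tab, thanks to .RECIPEPREFIX.
set -e
dir=$(mktemp -d)
cat > "$dir/Makefile" <<'EOF'
.RECIPEPREFIX = >
hello:
>@echo recipe ran without a tab
EOF
make -s -C "$dir" hello
```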

A simple makefile consists of “rules” with the following shape:

     target ... : prerequisites ...
             recipe
             ...
             ...

Run it!

luhuang@luhuang:~/workspace/calculator$ make
gcc -g -c calc.c
gcc -g -c add.c
gcc -g -c sub.c
gcc -g -o calculator calc.o add.o sub.o
luhuang@luhuang:~/workspace/calculator$
luhuang@luhuang:~/workspace/calculator$ make
make: `calculator' is up to date.
luhuang@luhuang:~/workspace/calculator$ rm *.o
luhuang@luhuang:~/workspace/calculator$ rm calculator
luhuang@luhuang:~/workspace/calculator$ make -f makefile
gcc -g -c calc.c
gcc -g -c add.c
gcc -g -c sub.c
gcc -g -o calculator calc.o add.o sub.o
luhuang@luhuang:~/workspace/calculator$

You can see that the output sequence of the first make command matches our manual run, and that the makefile can be specified with the -f option. Make is also smart: if there are no changes in the source code, it will not recompile the program, printing ‘make: `calculator’ is up to date.’ instead. To force a rerun of make, I first have to run ‘rm *.o’ and ‘rm calculator’. Let’s improve this by introducing a new target, ‘clean’:

clean:
	rm *.o
	rm calculator

and see,

luhuang@luhuang:~/workspace/calculator$ make -f makefile
gcc -g -c calc.c
gcc -g -c add.c
gcc -g -c sub.c
gcc -g -o calculator calc.o add.o sub.o
luhuang@luhuang:~/workspace/calculator$ ls
add.c  add.o  calc.c  calc.h  calc.o  calculator  makefile  makefile~  sub.c  sub.o
luhuang@luhuang:~/workspace/calculator$ make -f makefile clean
rm *.o
rm calculator
luhuang@luhuang:~/workspace/calculator$ ls
add.c  calc.c  calc.h  makefile  makefile~  sub.c
luhuang@luhuang:~/workspace/calculator$

Let’s continue improving it by removing the repetition:

# I am a comment
# Hello world in MAKE
SRCS = add.c sub.c calc.c
OBJS = $(SRCS:.c=.o)
HED  = calc.h
PROG = calculator
CC = gcc
CFLAGS = -g

$(PROG): $(OBJS)
	$(CC) $(CFLAGS) -o $@ $^

$(OBJS): $(HED)

clean:
	rm *.o
	rm $(PROG)

The improved script uses key = value to define variables and $(key) to reference them. The variable CFLAGS exists so that you can specify flags for C compilation through the implicit rules. Another implicit convention is at work here: the object file for fileA.c is named fileA.o, and OBJS = $(SRCS:.c=.o) relies on this naming convention. The automatic variable $@ expands to the target (the left side of the rule) and $^ to all of the prerequisites (the right side). So in this case, $@ = ‘calculator’ and $^ = ‘add.o sub.o calc.o’.
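You can watch $@ and $^ expand with a throwaway sketch (hypothetical file names; assumes GNU make):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
touch add.o sub.o calc.o
# '$$' escapes a literal '$' inside a recipe; the one-line ';' rule avoids tabs.
cat > Makefile <<'EOF'
calculator: calc.o add.o sub.o ; @echo '$$@ = $@' && echo '$$^ = $^'
EOF
make -s calculator
# prints:
# $@ = calculator
# $^ = calc.o add.o sub.o
```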

Ok, let me summarize what I learnt so far.

1. What a rule looks like:

     target ... : prerequisites ...
             recipe
             ...
             ...

2. A simple makefile and how to issue the make command with -f

3. How to define variables and use them

4. How to define additional targets

In the coming weeks, I will continue to refresh my make knowledge. Like other languages, make has conditionals, functions, and include directives. After I equip myself with enough make knowledge, I will digest the make scripts of OpenJDK.

!!! Stay tuned !!! 😉


Infrastructure as code

As a release engineer, I design and develop automation to serve core responsibilities, and create or enhance internal automation tools on demand for development, QA, and IT teams: automated tools to log bugs, parse logs, fix build and machine issues, build scripts, tools to maintain source code, report generation, and so on. I love this job as it helps improve the productivity of the organization. I found a very interesting post, ‘How Infrastructure Engineers are like Drug Pushers‘. In the post, the author says infrastructure engineers are just like drug pushers. 🙂

Here is their ‘evidence’.

Drug Pusher | Infrastructure Engineer
The product can be addictive | The tools can be addictive
Users rarely admit we exist | Developers rarely admit we exist
Users don’t talk about us openly | Developers don’t talk about us openly
The product makes most users feel good | The tools make most developers feel productive
Users go nuts without the product | Developers go nuts without the tools
Users would like these supplies to be free | Developers expect these tools to be free
When the product stops working for them, they expect new products that work exactly like the old ones, but better | When the tools stop working, new tools are expected, but they must work exactly like the old ones, but better
When the users are under the influence and happy, nothing else matters | When the developers are happy and everything is working fine, nothing else matters

The tools can be addictive — In my current company, the release team provides tools to manage the dual check-in of source code. Yes, developers love our tools — they raised more than 30,000 requests in the past year! 🙂 Imagine if they, or the release team, had to do these tedious jobs manually.

Developers rarely admit we exist — Bloody true. Release engineers are not coding for customers (we are coding for developers). OK, so many people might think, “Release engineers are not generating profit.” — Too bad. 😦

The tools make most developers feel productive — As soon as developers check code into the repository, a check-in build is triggered and provides quick feedback. That makes our developers more confident in their code and improves their productivity. Our tools also help them focus purely on programming.

Developers go nuts without the tools — Hahaha, can anyone imagine developers staying focused on their programming without the support of release engineers?

Seriously, let’s come back to the topic ‘Infrastructure as code‘.

The post Concise Introduction to Infrastructure as Code has a good definition of what Infrastructure as Code is:

The end goal of infrastructure as code is to perform as many infrastructure tasks as possible programmatically. Key word is “automation.”

Yes: automation, automation, and automation. Automate everything whose steps are reproducible: server configuration, package installation, dependency management, automated testing, reports, build artifacts, deployments, bug management, and so on. All of them should be automated, removing the manual steps that are prone to errors. Computers won’t do things wrong unless engineers give them wrong directives.

I believe that every build team has lots of internal automated tools that work separately. I think we can go further: integrate them into a basic infrastructure platform and unify the data inputs/outputs (generally the data is change sets/lists, bugs, and build artifacts). We should see infrastructure as a single target and not treat its parts separately as IT, DevOps, QA, Build, Release, and Support. Ideally, the data flow should be:

a new bug is logged automatically -> notify QA/DEV -> verified by QA/DEV -> provide an environment for QA/DEV to debug -> check in code -> build the new code -> check code quality -> dependency and change reports -> deploy the new code -> notify QA/DEV of the result -> passed -> include in the build cycle.

These steps should be automated as much as possible.

Yes, it is easier said than done! But since we already know the direction, once we decide to go, we can achieve the goal of ‘automate everything’ one day through hard work!

Let’s build openjdk

As a builder, I am interested in how java.net builds OpenJDK. The post https://blogs.oracle.com/kto/entry/jdk_build_musings provides some insights into the world of JDK build and test. I summarize the generic insights below, along with my comments. They are some interesting but very basic elements a build system should have:

  • Continuous Build/Integration & Automated Test
    Every component, every integration area, every junction point or merge point should be constantly built and smoke tested. Regarding continuous build/integration, I strongly recommend the book Continuous Integration: Improving Software Quality and Reducing Risk. Basically, a build system should be well designed to support continuous builds. There are lots of applications that support continuous build/integration, like Hudson, Jenkins, Cruise, etc. These applications have rich extensions to support your personal and special requirements. As for smoke testing, CI requires that in your build flows, as soon as compilation of the source code completes, a set of testing steps is executed as well.
  • Build and Test Machines/Multiple Platforms
    The hardware/machine resources for a build and test system are cheap, and a bargain if they keep all developers shielded from bad changes, finding issues as early in the development process as possible. But it is also true that hardware/machine resources do not manage themselves, so there is an expense in managing the systems; some of it can be automated, but not everything. Virtual machines can provide benefits here, but they also introduce complications. At my current company, we make use of Oracle VM machines to provide virtual environments to support build/release activities. By adopting VMs, we shorten the time spent preparing and configuring environments, and it helps keep our configuration consistent. OpenJDK provides a script called configure to help verify whether your environment is ready to build the JDK or not. I will introduce it later.
  • Partial Builds/Build Flavors
    With the JDK we have a history of doing what we call partial builds. The hotspot team rarely builds the entire jdk, but instead just builds hotspot (because that is the only thing they changed) and then places their hotspot in a vetted jdk image that was built by the Release Engineering team at the last build promotion. Ditto for the jdk teams that don’t work on hotspot: they rarely build hotspot. This was and still is considered a developer optimization, but is really only possible because of the way the JVM interfaces to the rest of the jdk; it rarely changes. To some degree, successful partial builds can indicate that the changes have not created an interface issue and can be considered somewhat ‘compatible’.
    These partial builds create issues when there are changes in both hotspot and the rest of the jdk, where both changes need to be integrated at the same time, or more likely, in a particular order, e.g. hotspot integrates a new extern interface, later the jdk team integrates a change that uses or requires that interface, ideally after the hotspot changes have been integrated into a promoted build so everyone’s partial builds have a chance of working.
    The partial builds came about mostly because of build time, but also because of the time and space needed to hold all the sources of parts of the product you never really needed. I also think there is a comfort effect by a developer not having to even see the sources to everything he or she doesn’t care about. I’m not convinced that the space and time of getting the sources is that significant anymore, although I’m sure I would get arguments on that. The build speed could also become less of an issue as the new build infrastructure speeds up building and makes incremental builds work properly. But stay tuned on this subject, partial builds are not going away, but it’s clear that life would be less complicated without them.
  • Mercurial
    Probably applies to Git or any distributed Source Code Management system too.
    OpenJDK simply uses Mercurial as its SCM tool. But here I am open-minded: we can choose based on our own needs. I have experience with Perforce, VSS, SVN, and RCS! (Yes, the RCS from Linux! 🙂 )
  • Nested Repositories
    Not many projects have cut up the sources like the OpenJDK. There were multiple reasons for it, but it often creates issues for tools that either don’t understand the concept of nested repositories, or just cannot handle them.
  • Managing Build and Test Dependencies
    Some build and test dependencies are just packages or products installed on a system; I’ve often called those “system dependencies”. But many are just tarballs or zip bundles that need to be placed somewhere and referred to. In my opinion, this is a mess, and we need better organization here. Yeah yeah, I know someone will suggest Maven or Ivy, but it may not be that easy. Maven is an awesome tool for managing dependencies. You will definitely fall in love with it as soon as you give it a hug!
  • Resolved Bugs and Changesets
    Having a quick connection between a resolved bug and the actual changes that fixed it is so extremely helpful that you cannot be without this. The connection needs to be both ways too. It may be possible to do this completely in the DSCM (Mercurial hooks), but in any case it is really critical to have that easy path between changes and bug reports. And if the build and test system has any kind of archival capability, also to that job data.
  • Distributed Builds
    Working in a distributed way is not so easy. I once built software in a distributed way using Cruise, which provides distributed support by running builds on different agents on different machines. For testing, for example unit testing and smoke testing, we can define two independent flows, triggered as soon as the source code compilation succeeds.
  • Killing Builds and Tests
    At some point, you need to be able to kill off a build or test, probably many builds and many tests on many different systems. This can be easy on some systems and hard on others. Using virtual machines or ghosting of disk images provides a chance of simple system shutdowns and restarts with a pristine state, but that’s not simple logic to get right for all systems. I think the concern here is how we can have better control of our build flows. To control the flows better, we can add pauses, split the flows into more independent flows, and define their dependencies so that one build triggers another only when all of its prerequisites are satisfied.

To support and enhance the build of OpenJDK, the OpenJDK team launched a project (https://blogs.oracle.com/kto/entry/build_infrastructure_project) to make improvements in the points below:

  • Different build flavors, same build flow
  • Ability to use ‘make -j N‘ on large multi-CPU machines is critical, as is being able to quickly and reliably get incremental builds done, this means:
    • target dependencies must be complete and accurate
    • nested makes should be avoided
    • ant scripts should be avoided for multiple reasons (it is a form of nested make), but we need to allow for IDE builds at the same time
    • rules that generate targets will need to avoid timestamp changes when the result has not changed
    • Java package compilations need to be made parallel and we also need to consider some kind of javac server setup (something that had been talked about a long time ago)
  • Continued use of different compilers: gcc/g++ (various versions), Sun Studio (various versions), and Windows Visual Studio (various versions)
  • Allow for clean cross compilation, this means making sure we just build it and not run it as part of the build
  • Nested repositories need to work well, so we need a way to share common make logic between repositories
  • The build dependencies should be managed as part of the makefiles

(I am going to study how to build software with MAKE. The build scripts of OpenJDK will be good material. Hooray! 🙂 ) More details about the OpenJDK build infrastructure group: http://openjdk.java.net/groups/build/

Let’s build OpenJDK

(Here I am building based on the OpenJDK 8 instructions at http://hg.openjdk.java.net/jdk8/jdk8/raw-file/tip/README-builds.html and https://blogs.oracle.com/kto/entry/jdk8_new_build_infrastructure)

Getting the source,

As I said above, OpenJDK uses http://mercurial.selenic.com/ for its source control. So install Mercurial first if you have not already done so.

My Environment is

Linux luhuang-VirtualBox 3.0.0-32-generic-pae #51-Ubuntu SMP Thu Mar 21 16:09:48 UTC 2013 i686 i686 i386 GNU/Linux

Run below commands as ‘root’ user or sudo,

Install and update your aptitude, purge openjdk-6* if installed, and install necessary packages,

1. apt-get install aptitude

root@luhuang-VirtualBox:/home/luhuang# apt-get install aptitude
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  libboost-iostreams1.46.1 libclass-accessor-perl libcwidget3 libept1
  libio-string-perl libparse-debianchangelog-perl libsub-name-perl
Suggested packages:
  aptitude-doc-en aptitude-doc tasksel debtags libcwidget-dev
  libhtml-parser-perl libhtml-template-perl libxml-simple-perl
The following NEW packages will be installed:
  aptitude libboost-iostreams1.46.1 libclass-accessor-perl libcwidget3 libept1
  libio-string-perl libparse-debianchangelog-perl libsub-name-perl
0 upgraded, 8 newly installed, 0 to remove and 449 not upgraded.
Need to get 2,985 kB of archives.
After this operation, 9,236 kB of additional disk space will be used.
Do you want to continue [Y/n]? Y
...

2. aptitude update

3. apt-get purge openjdk-6*

Why do we have to purge openjdk-6* before continuing?

“Install a bootstrap JDK. All OpenJDK builds require access to a previously released JDK called the bootstrap JDK or boot JDK. The general rule is that the bootstrap JDK must be an instance of the previous major release of the JDK. In addition, there may be a requirement to use a release at or beyond a particular update level”

4. aptitude install mercurial openjdk-7-jdk rpm ssh expect tcsh csh ksh gawk g++ ccache build-essential lesstif2-dev

root@luhuang-VirtualBox:/home/luhuang# aptitude install mercurial openjdk-7-jdk rpm ssh expect tcsh csh ksh gawk g++ ccache build-essential lesstif2-dev
The following NEW packages will be installed:
  ca-certificates-java{a} ccache csh expect gawk icedtea-7-jre-jamvm{a}
  java-common{a} ksh lesstif2{a} lesstif2-dev libbonobo2-0{a}
  libbonobo2-common{a} libexpat1-dev{a} libfontconfig1-dev{a}
  libfreetype6-dev{a} libgnome2-0{a} libice-dev{a} libnss3-1d{a}
  libpthread-stubs0{a} libpthread-stubs0-dev{a} librpm2{a} librpmbuild2{a}
  librpmio2{a} librpmsign0{a} libsigsegv2{a} libsm-dev{a} libx11-dev{a}
  libxau-dev{a} libxcb1-dev{a} libxdmcp-dev{a} libxext-dev{a} libxft-dev{a}
  libxp-dev{a} libxrender-dev{a} libxt-dev{a} mercurial mercurial-common{a}
  openjdk-7-jdk openjdk-7-jre{a} openjdk-7-jre-headless{a}
  openjdk-7-jre-lib{a} openssh-server{a} rpm rpm-common{a} rpm2cpio{a} ssh
  ssh-import-id{a} tcl8.5{a} tcsh ttf-dejavu-extra{a} tzdata-java{a}
  x11proto-core-dev{a} x11proto-input-dev{a} x11proto-kb-dev{a}
  x11proto-print-dev{a} x11proto-render-dev{a} x11proto-xext-dev{a}
  xorg-sgml-doctools{a} xtrans-dev{a} zlib1g-dev{a}
The following packages will be upgraded:
  libexpat1 libfreetype6 libnss3 tzdata
4 packages upgraded, 60 newly installed, 0 to remove and 445 not upgraded.
Need to get 80.5 MB/81.4 MB of archives. After unpacking 148 MB will be used.
Do you want to continue? [Y/n/?]
...

(It seems the dependencies have changed since the pages were last updated; I still had to install the packages below):

5. apt-get install libX11-dev libxext-dev libxrender-dev libxtst-dev

6. apt-get install libcups2-dev

7. apt-get install libasound2-dev

Run the commands below as your working user to get the jdk8/build sources. For convenience, I am also using the root user here:

8. hg clone http://hg.openjdk.java.net/jdk8/build jdk8-build

9. cd jdk8-build

10. sh ./get_source.sh

Example:

root@luhuang-VirtualBox:/media/sf_shared# hg clone http://hg.openjdk.java.net/jdk8/build jdk8-build
requesting all changes
adding changesets
adding manifests
adding file changes
added 774 changesets with 1018 changes to 118 files
updating to branch default
101 files updated, 0 files merged, 0 files removed, 0 files unresolved
...
root@luhuang-VirtualBox:/media/sf_shared/jdk8-build# sh ./get_source.sh
# Repositories:  corba jaxp jaxws langtools jdk hotspot nashorn

                corba:   /usr/bin/python -u /usr/bin/hg clone http://hg.openjdk.java.net/jdk8/build/corba corba
                 jaxp:   /usr/bin/python -u /usr/bin/hg clone http://hg.openjdk.java.net/jdk8/build/jaxp jaxp
Waiting 5 secs before spawning next background command.
                 jaxp:   requesting all changes
                corba:   requesting all changes
                corba:   adding changesets
                 jaxp:   adding changesets
                jaxws:   /usr/bin/python -u /usr/bin/hg clone http://hg.openjdk.java.net/jdk8/build/jaxws jaxws
            langtools:   /usr/bin/python -u /usr/bin/hg clone http://hg.openjdk.java.net/jdk8/build/langtools langtools
...

Then do your build:

11. chmod a+x common/bin/*

12. cd common/makefiles

13. bash ../autoconf/configure

Configure will try to figure out what system you are running on and where all necessary build components are. If you have all prerequisites for building installed, it should find everything. If it fails to detect any component automatically, it will exit and inform you about the problem. I think the philosophy of configure is awesome. In my daily builds I have some scripts and docs for sanity-checking a build server, but I have never designed a tool as elegant as this, which automates everything, detects missing components, and gives smart suggestions!
Example (a failed configure check with suggestion):

configure: error: Could not find all X11 headers (shape.h Xrender.h XTest.h). You might be able to fix this by running 'sudo apt-get install libX11-dev libxext-dev libxrender-dev libxtst-dev'.
configure exiting with result code 1
configure: error: Could not find cups! You might be able to fix this by running 'sudo apt-get install libcups2-dev'.
configure exiting with result code 1

Example (a successful configure check):

...
A new configuration has been successfully created in
/media/sf_shared/jdk8-build/build/linux-x86-normal-server-release
using default settings.

Configuration summary:
* Debug level:    release
* JDK variant:    normal
* JVM variants:   server
* OpenJDK target: OS: linux, CPU architecture: x86, address length: 32

Tools summary:
* Boot JDK:       java version "1.7.0_21" OpenJDK Runtime Environment (IcedTea 2.3.9) (7u21-2.3.9-0ubuntu0.11.10.1) OpenJDK Client VM (build 23.7-b01, mixed mode, sharing)  (at /usr/lib/jvm/java-7-openjdk)
* C Compiler:     gcc-4.6 (Ubuntu/Linaro 4.6.1-9ubuntu3) version 4.6.1 (at /usr/bin/gcc-4.6)
* C++ Compiler:   g++-4.6 (Ubuntu/Linaro 4.6.1-9ubuntu3) version 4.6.1 (at /usr/bin/g++-4.6)

Build performance summary:
* Cores to use:   1
* Memory limit:   4031 MB
* ccache status:  installed and in use
...

Wow, so elegant! A good philosophy to do sanity testing for a build server!

Building JDK 8 requires use of a version of JDK 7 that is at Update 7 or newer. JDK 8 developers should not use JDK 8 as the boot JDK, to ensure that JDK 8 dependencies are not introduced into the parts of the system that are built with JDK 7

Note that some Linux systems have a habit of pre-populating your environment variables for you, for example JAVA_HOME might get pre-defined for you to refer to the JDK installed on your Linux system. You will need to unset JAVA_HOME. It’s a good idea to run env and verify the environment variables you are getting from the default system settings make sense for building the OpenJDK.
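A small hedged sketch of that environment check before running configure (the message text is my own):

```shell
# Make sure a system JDK does not leak into the OpenJDK build via JAVA_HOME;
# configure should locate the boot JDK itself.
unset JAVA_HOME
if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is unset - ok to run configure"
else
    echo "JAVA_HOME is still set: $JAVA_HOME"
fi
```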

14. make
Ready? Fasten your seatbelt. Go!

...
Compiling /media/sf_shared/jdk8-build/hotspot/src/share/vm/utilities/yieldingWorkgroup.cpp
Compiling /media/sf_shared/jdk8-build/hotspot/src/share/vm/runtime/vm_version.cpp
Linking vm...
ln: creating symbolic link `libjvm.so.1': Protocol error
Making signal interposition lib...
Making SA debugger back-end...
**NOTICE** Dtrace support disabled: /usr/include/sys/sdt.h not found
All done.
INFO: ENABLE_FULL_DEBUG_SYMBOLS=1
INFO: ALT_OBJCOPY=/usr/bin/objcopy
INFO: /usr/bin/objcopy cmd found so will create .debuginfo files.
INFO: STRIP_POLICY=min_strip
INFO: ZIP_DEBUGINFO_FILES=1
warning: [options] bootstrap class path not set in conjunction with -source 1.6
1 warning
Generating linux_i486_docs/jvmti.html
INFO: ENABLE_FULL_DEBUG_SYMBOLS=1
INFO: ALT_OBJCOPY=/usr/bin/objcopy
INFO: /usr/bin/objcopy cmd found so will create .debuginfo files.
INFO: STRIP_POLICY=min_strip
INFO: ZIP_DEBUGINFO_FILES=1
## Finished hotspot (build time 00:07:09)

## Starting corba
Compiling 6 files for BUILD_LOGUTIL
Creating corba/btjars/logutil.jar
Compiling 141 files for BUILD_IDLJ
...
## Finished jdk (build time 00:11:38)

----- Build times -------
Start 2013-08-28 18:54:01
End   2013-08-28 19:40:17
00:00:28 corba
00:31:07 hotspot
00:00:32 jaxp
00:01:57 jaxws
00:11:38 jdk
00:00:32 langtools
00:46:16 TOTAL
-------------------------
Finished building OpenJDK for target 'default'

Look, it has my signature!

luhuang@luhuang:~/build/jdk8-build/build/linux-x86-normal-server-release/jdk/bin$ date
Wed Aug 28 20:12:00 CST 2013
luhuang@luhuang:~/build/jdk8-build/build/linux-x86-normal-server-release/jdk/bin$ ./java -version
openjdk version "1.8.0-internal"
OpenJDK Runtime Environment (build 1.8.0-internal-luhuang_2013_08_28_18_53-b00)
OpenJDK Server VM (build 25.0-b47, mixed mode)
luhuang@luhuang:~/build/jdk8-build/build/linux-x86-normal-server-release/jdk/bin$

So, we are done with the build of OpenJDK 8. So easy! Thanks to OpenJDK’s elegant build infrastructure, we can build it in just a few commands!