In the past, under Waterfall-style development and before CI practices were introduced, development teams used to spend time on a dedicated integration phase. They would collect code from various groups and integrate it into one working code base. This was a difficult and error-prone task because it involved integrating weeks or months' worth of potentially conflicting changes. It was hard to get right and almost always led to delays in project completion, because it introduced bugs and uncovered unforeseen issues too late in the cycle. Continuous Integration came into being to address these issues.
In its simplest form, CI is the process of continuously building the software with every change committed to the version control system and providing feedback to the appropriate stakeholders.
Example CI process flow
The diagram above (not my creation; taken from DZone, and I am no longer able to find the link) explains a simple CI process. It starts with developers committing changes to a Version Control System (VCS), Subversion in this example. The project needs a build script that can be used to build the software. A CI server either keeps polling the VCS for changes, or the VCS notifies the CI server whenever a change is committed. On detecting a change, the CI server starts a build using the build script. A feedback mechanism (email, instant message, etc.) should be put in place so that a notification is sent to the appropriate stakeholders when the build completes.
The simplistic CI process described above can be extended to automate many more of the activities usually performed in software development, and so achieve bigger benefits. We discuss here one such extension that we have set up and used in multiple projects with good results. It adds a couple more tools to the CI setup discussed above; we call the result collectively a CI stack.
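The poll, build and notify cycle described here can be sketched in a few lines of Python. This is only an illustration of the control flow, not how any particular CI server is implemented: `CiServer`, `BuildResult` and the three callables are hypothetical names, and the VCS query, build script and notification channel are passed in as plain functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class BuildResult:
    revision: int
    success: bool


@dataclass
class CiServer:
    """Toy CI server: poll the VCS, build new revisions, notify stakeholders."""

    get_latest_revision: Callable[[], int]  # stands in for a VCS query
    run_build: Callable[[int], bool]        # stands in for the build script
    notify: Callable[[BuildResult], None]   # stands in for email or chat
    last_built: Optional[int] = None
    history: List[BuildResult] = field(default_factory=list)

    def poll_once(self) -> Optional[BuildResult]:
        """Build only if the VCS has a revision that was not built yet."""
        revision = self.get_latest_revision()
        if revision == self.last_built:
            return None  # nothing new was committed
        result = BuildResult(revision, self.run_build(revision))
        self.last_built = revision
        self.history.append(result)
        self.notify(result)  # feedback goes out for success and failure alike
        return result
```

A real server would call something like `poll_once` on a timer, or run the same steps from a commit hook when the VCS pushes the notification instead.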
Sample CI stack deployment
The diagram above showcases a simple CI stack that we have set up and used. At its core is a CI server (Jenkins in this case) with a job that fetches the latest code from the version control system (SVN here) and runs a Maven-based build. Maven is used to script the build and manage project dependencies.
Maven's dependency management needs an artifact repository from which to resolve dependency versions. Nexus is used here to maintain and manage a centralized Maven repository, and the build process above uses Nexus to resolve the project's dependency libraries. Nexus acts as a proxy to the Maven Central repository. If an artifact/version needed by a project is not available locally in Nexus, it connects to Maven Central over the internet, downloads the artifact/version and stores it in the local file system. All further requests for the same artifact/version are resolved locally by Nexus.
The end of the build process invokes an automated static code analysis using Sonar. Sonar is a static code analysis tool that integrates popular code analysis tools such as Checkstyle, PMD and FindBugs into one tool. It provides detailed metric reports based on the results of the analysis performed by these tools, and it captures code coverage statistics gathered using tools such as Cobertura or Emma. The reports are stored in a database so they can be consulted at any time through a web-based UI.
The integrated stack augments the continuous build and test capabilities of a regular CI server with Continuous Inspection capabilities via Sonar, and with centralized dependency management and release management capabilities via Nexus. The whole stack sits behind an HTTP server that acts as a reverse proxy and shields the individual servers from direct access. It also simplifies the access URLs of the constituent tools by exposing them under a single IP:port address.
This stack can be extended with further capabilities. One such extension is web-based code review using Trac. This can also be integrated into individual developers' IDEs, while the web-based reports help QA personnel extract the metrics they need.
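The proxy-and-cache behaviour described above can be modelled as follows. This is a simplified sketch and not Nexus's actual implementation: `ProxyRepository`, `fetch_remote` and the in-memory dictionary (standing in for the local file system) are assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

Coordinates = Tuple[str, str, str]  # (groupId, artifactId, version)


class ProxyRepository:
    """Toy model of a proxying repository manager with a local cache."""

    def __init__(self, fetch_remote: Callable[[Coordinates], bytes]) -> None:
        self._fetch_remote = fetch_remote  # stands in for Maven Central
        self._storage: Dict[Coordinates, bytes] = {}  # stands in for local disk
        self.remote_requests = 0

    def resolve(self, coords: Coordinates) -> bytes:
        if coords not in self._storage:
            # Cache miss: go upstream once and keep a local copy.
            self.remote_requests += 1
            self._storage[coords] = self._fetch_remote(coords)
        # Cache hit, or freshly stored copy: served without internet access.
        return self._storage[coords]
```

Resolving the same coordinates twice triggers only one upstream request, which is where the bandwidth savings and the protection against internet outages come from.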
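To make the reverse proxy idea concrete, here is a minimal sketch of the routing decision such a proxy makes. The path prefixes and backend addresses are hypothetical and not taken from our setup; a real deployment would express the same mapping in the HTTP server's own configuration rather than in Python.

```python
from typing import Dict, Optional

# Hypothetical path prefixes and backend addresses, for illustration only.
ROUTES: Dict[str, str] = {
    "/jenkins": "http://127.0.0.1:8080",
    "/nexus": "http://127.0.0.1:8081",
    "/sonar": "http://127.0.0.1:9000",
}


def route(path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the backend URL for a public path, or None if no tool matches."""
    for prefix, backend in routes.items():
        # Match the prefix itself or anything below it, but not "/sonarqube".
        if path == prefix or path.startswith(prefix + "/"):
            return backend + path
    return None
```

Users only ever see one host and port with a different path per tool, while the individual servers stay unreachable from outside. The sketch keeps the prefix in the forwarded path, which assumes each tool is configured to serve under that context path.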
The stack as a whole, and each tool in it, brings the following benefits.
- Detect integration problems early and reduce risk of failure.
- Comprehensive tests help detect bugs early and make it easier to find the root cause.
- Early warning of broken code, incompatible changes.
- Availability of ‘current’ build for demo, testing, release.
- Generated metrics and frequent check-ins push for better code.
- Speed up builds and save developer time.
- Save company bandwidth and protect against internet outage.
- Control and audit dependencies.
- Centrally deploy third-party publicly unavailable artifacts.
- Maintain internal repositories and help inter-team (project) collaboration.
- Continuous automated builds including tests and continuous inspection of code.
- Historical trends help analyze project progression direction.
- Trigger builds on code check-in or on specific schedule.
- Automated code quality checks, including checks for coding standard violations, best practices, test coverage, technical debt, code complexity, etc.
- The reported metrics can be used as exit criteria for releasing code beyond the development environment.
- Web-based reporting makes the results accessible anywhere and on any device.
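Using reported metrics as exit criteria boils down to comparing each metric against a limit. The sketch below shows one way to do that; the metric names and threshold values are hypothetical, and this is not Sonar's API, only the decision logic a build could apply to numbers exported from it.

```python
from typing import Dict, List, Tuple

# Hypothetical thresholds; the metric names and limits are illustrative.
THRESHOLDS: Dict[str, Tuple[str, float]] = {
    "coverage_percent": ("min", 80.0),
    "blocker_violations": ("max", 0),
    "max_method_complexity": ("max", 10),
}


def gate_failures(
    metrics: Dict[str, float],
    thresholds: Dict[str, Tuple[str, float]] = THRESHOLDS,
) -> List[str]:
    """Return one message per breached threshold; an empty list means pass."""
    failures = []
    for name, (kind, limit) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            # A missing metric is treated as a failure, not silently ignored.
            failures.append(f"{name}: metric missing")
        elif kind == "min" and value < limit:
            failures.append(f"{name}: {value} is below minimum {limit}")
        elif kind == "max" and value > limit:
            failures.append(f"{name}: {value} is above maximum {limit}")
    return failures
```

A build would be allowed to leave the development environment only when `gate_failures` returns an empty list; otherwise the messages can go straight into the notification sent to stakeholders.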
The list below gives a few widely known and commonly used best practices for CI-based development.
- Manage all source code in one centralized VCS repository.
- Make frequent and small commits. Organize the work into small groups of tasks and commit each group at the task level.
- Automate builds using build scripts that avoid the need for manual configuration. The scripts should be capable of running on multiple platforms and environments.
- Centralize dependency management and keep dependencies on pre-installed tools to a minimum.
- Develop code in private workspaces and do private builds before committing code to VCS.
- Send automated feedback from CI server to development teams and other stakeholders.
- Fix build errors as a priority, as soon as they occur.
- Use dedicated machines for build and deployment.
- Create exhaustive test cases to cover all paths in the code, group them by type and automate their execution to add self-testing capabilities.
- Execute automated code analysis for standards, best practices and potential bugs. Define a build threshold and notify stakeholders on threshold breach.
- Externalize application configuration into property files and separate configuration from installation.
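As an illustration of the last practice, the sketch below keeps defaults in one property file and lets an environment-specific file override them, so the same installation can run anywhere. The function names and property keys are hypothetical, and the parser handles only plain `key=value` lines and comment lines, not the full Java properties format.

```python
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines; lines starting with '#' or '!' are comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without a '=' separator
            props[key.strip()] = value.strip()
    return props


def load_config(base_text: str, environment_text: str) -> Dict[str, str]:
    """Start from the defaults and let the environment file override them."""
    config = parse_properties(base_text)
    config.update(parse_properties(environment_text))
    return config
```

The application artifact stays identical across environments; only the small environment file changes between development, test and production.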
The table below summarizes the software used to set up this CI stack, along with the purpose of each tool.
Nexus||Manage and maintain Maven repositories
Jenkins||Continuous Integration server
Sonar||Static code analysis tool
Maven||Build automation tool
Java||The Java runtime environment
||HTTP server to provide reverse proxy capabilities
Subversion (SVN)||Version control system
Trac||Web-based code review over SVN server