Thursday, 10 May 2012

Introduction To Maven Concepts (Crash Course)

This is the post I wish someone had written to connect the dots in 2007, when Maven documentation was scarce. This Maven tutorial is a starting point for those who have no experience with Maven, or who want to improve their overall understanding of Maven concepts. It explains what Maven is and what you can do with it, but not how to do it. I will also give pointers to additional documentation.

What is Maven?

Maven is a software project implementation tool, but not a project management tool. In a factory, it would be the complete assembly line, taking in raw materials and producing finished products ready for use. It is mostly used by Java programmers, but it can be configured for other programming languages too.

If you ever have to work on a large Java project, you will most probably have to learn about either Maven or Ant (another project implementation tool). Otherwise, your project will quickly become unmanageable as it grows in size. Ant requires you to configure all your requirements from top to bottom: you have to build the whole assembly line for a given project. Maven, on the other hand, comes with default behaviors. If there is something you don't like in the assembly line, you can reconfigure it or extend parts of it to meet your specific needs.

There have been some religious wars between Ant users and Maven users in the past. But today, they coexist in peace. In fact, it is pretty common for a Maven project to call Ant through a plugin to perform tasks it cannot perform itself.

Maven and Ant are, for most purposes, functionally interchangeable. If an existing project works fine with Ant, there is little reason to convert it to Maven. However, if you start a project from scratch, you may want to consider Maven, because of its very large support community and the large number of mature plugins available for free under an open source license.

What Is The Form Factor?

Maven is a package you download and install on your PC. You just need to unzip it in a directory and make sure it is accessible from the command line (i.e., it has to be 'in the path'). Maven is a command line product with no graphical user interface (GUI). However, it is well integrated in free software development applications such as NetBeans or Eclipse, so most of the time you will not need to run Maven from the command line yourself. If you plan to run Ant on its own alongside Maven, you will need to install it on your PC too; Ant tasks called through Maven's AntRun plugin do not need a separate installation.

How Does It Work?

When you start using Maven, it will first create a local repository on your PC (by default, in a .m2 directory under your home folder). As you build your software projects, the results are installed in this local repository. The produced items are called artifacts. You can configure Maven to deploy those artifacts to other remote repositories or locations too, if necessary.
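
As a rough sketch of how the location of the local repository can be changed: Maven reads a user-level settings.xml file, and its localRepository element overrides the default location. The path below is a made-up placeholder, not a recommendation.

<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
  <!-- Hypothetical path: replace it with a directory of your choice -->
  <localRepository>/path/to/my/local/repository</localRepository>
</settings>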

In addition to this local repository, there is what is called the central repository. It is a huge online repository containing tons of free artifacts contributed under an open source license over many years by thousands of developers. When Maven needs one of these artifacts to build a project and can't find it in the local repository, it tries to fetch it from the central repository, provided the local PC is connected to the Internet. Maven can be configured to search for those artifacts in other public or private repositories too.
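
Additional repositories are typically declared in the project's pom.xml (the file described further down). Here is a minimal sketch, assuming a company-internal repository; the id, name, and URL are invented for illustration.

<repositories>
  <repository>
    <!-- Hypothetical internal repository, searched in addition to central -->
    <id>mycompany-releases</id>
    <name>My Company Release Repository</name>
    <url>https://repo.mycompany.com/maven/releases</url>
  </repository>
</repositories>

This fragment goes inside the project element of the pom.xml.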

As a consequence, the first time Maven builds a project, it will download many artifacts from the central repository. This is a one-time operation: once downloaded, an artifact is served from the local repository. You are not allowed to post something directly to the central repository. If you want to contribute your own artifacts, there is a specific submission procedure to follow.

There are three main types of repositories: the central repository, your local repository, and public or private proprietary repositories.

What Is A Maven Project?

Contrary to many other software implementation tools, Maven projects follow a standard directory structure where different items (code files, test files, etc.) are expected to be found in well-known directories. For example, Java source code goes under src/main/java and test code under src/test/java. It is part of the delivered assembly line.

It is not a good idea to try to reconfigure this directory structure to fit your karma. If another software engineer gets to work on your project, they will be confused. On the other hand, if you get to work on another Maven project, you will be happy to find what you are looking for where it is supposed to be. Learn about Maven directory structures. Don't be a baby, open your mouth and do take the pill (lol)!

Another key project item is the Project Object Model XML file, often called 'pom' or 'pom.xml'. Each project has its own directory structure with a pom.xml located at its root. Here is one of the simplest possible pom.xml files:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>


  <groupId>com.mycompany</groupId>
  <artifactId>SimpleMavenProject</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>

</project>

It contains the items making up the coordinates of the project: the group id, the artifact id, and the version (for a definition of SNAPSHOT, see (*) below), plus the packaging. Basically, the coordinates give the location of the project in repositories: a concatenation of the group id, the artifact id, and the project version, joined with directory separators (here, com/mycompany/SimpleMavenProject/1.0-SNAPSHOT). The packaging determines how and which default artifacts will be bundled into the repository. You can organize and identify your projects as you wish using the coordinates you want. The packaging depends on the type of project you want to create.

The POM is where you will configure the different parts of the assembly line to meet your needs if the default configuration is not enough.
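
As an illustration, here is a minimal sketch of such a reconfiguration for a Java project (jar packaging). It tells the compiler plugin, which is part of the default assembly line, which Java language level to use. The plugin version is left out for brevity, and the value 1.6 is only an example.

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <!-- Example values: accept Java 6 source code, generate Java 6 bytecode -->
        <source>1.6</source>
        <target>1.6</target>
      </configuration>
    </plugin>
  </plugins>
</build>

This fragment goes inside the project element of the pom.xml.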

Typically, under the hood, Maven projects are created using an archetype. It is a kind of template used to create the default content of POMs and the default directory structure of Maven projects. You can create your own archetypes if you need to, but existing ones will cover most of your needs.

How Does The Compile Process Work?

The compile process is called the build process in Maven. It can be configured to perform much more than a simple compilation. Therefore, you should avoid mentioning a 'compile process' when talking about Maven, because it is only one of the phases of one of Maven's build processes (one step in the assembly line).

When a build process is executed, it goes through a life cycle using the information contained in the POM. Each life cycle is made of phases. Plugins are attached to phases, and a phase can have multiple plugins attached to it. A plugin is basically a set of operations of a given type which may be executed during a build cycle. Each possible plugin operation is called a goal. When a plugin is attached to a phase, the goal to execute is specified; many goals also declare a default phase, which is used when the POM does not name one.
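
To make the phase, plugin, and goal vocabulary concrete, here is a sketch of how one goal of a plugin can be attached to a phase in the POM. It assumes you want a source archive built along with the main artifact, using the jar goal of the Maven Source Plugin. The execution id is an arbitrary name, and the plugin version is omitted for brevity.

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-source-plugin</artifactId>
      <executions>
        <execution>
          <!-- Arbitrary label for this execution -->
          <id>attach-sources</id>
          <!-- The life cycle phase the goal is attached to -->
          <phase>package</phase>
          <goals>
            <!-- The plugin operation (goal) to run during that phase -->
            <goal>jar</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>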

The life cycle goes through each phase, sequentially, and executes each default (or configured in the POM) plugin goal, sequentially, until the end or until one of the plugin goals fails to execute. When you start a build, you name the life cycle phase you want to reach (for example, package or install). Maven runs every phase up to and including that one, then stops there: later phases are not run, even if everything executed successfully.

By default, some plugin goals are already attached to phases, depending on the packaging of the project. They will be invoked unless the POM configures things differently. This is the default assembly line. You can also attach additional plugins to the phases of your build cycles. These will be downloaded from the central repository if necessary. You can also write your own plugins if you need to and use them in your build cycles.
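
As a sketch of an additional plugin attached to a phase, the fragment below uses the Maven AntRun Plugin to run a small piece of Ant from within the build. The phase, the execution id, and the echoed message are arbitrary choices, the plugin version is omitted for brevity, and the target element assumes a reasonably recent version of the plugin.

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-antrun-plugin</artifactId>
      <executions>
        <execution>
          <!-- Arbitrary label for this execution -->
          <id>say-hello</id>
          <phase>generate-resources</phase>
          <goals>
            <goal>run</goal>
          </goals>
          <configuration>
            <!-- Ant tasks to execute when the phase is reached -->
            <target>
              <echo message="Hello from Ant, called by Maven"/>
            </target>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>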

Think about all this in terms of Lego bricks attached to a structural frame.

What About Maven Dependencies?

Large projects often use existing pieces of code (libraries) by making an explicit reference to them. In Maven projects, such dependencies on existing artifacts are specified in the POM, using their coordinates. Sometimes, these dependencies are only required for the build process (for the testing phase, for example). Sometimes, they need to be shipped as part of the delivered artifacts. The type of use of the dependency is called a scope. It is specified in the POM too.
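
Here is a short sketch of what this looks like in a POM. The two libraries are only examples picked for illustration: the first one has no scope element, so it gets the default compile scope, while the second one is limited to the test scope.

<dependencies>
  <dependency>
    <!-- No scope given: the default 'compile' scope applies -->
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.1</version>
  </dependency>
  <dependency>
    <!-- Only needed to compile and run the tests -->
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.10</version>
    <scope>test</scope>
  </dependency>
</dependencies>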

What About Profiles?

Some situations are more complex, for example when you need to create different types of artifacts for different target platforms. It would be tedious to maintain multiple POMs for the same project. Maven's solution is called build profiles.

This is a means to set additional configuration if the build process is to be executed for a specific platform (building for Windows or for Linux, for example). In this case, the corresponding plugins are only executed if they have been defined in the corresponding platform profile in the POM. A build can be run with or without specifying a build profile. This determines the set of plugins which will be executed on top of the default configuration.
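
As a minimal sketch, the fragment below declares two profiles in a POM. The profile ids and the target.platform property are hypothetical names chosen for this example. To keep it short, each profile only sets a property; a profile can also carry its own build section with plugins.

<profiles>
  <profile>
    <id>windows</id>
    <properties>
      <!-- Hypothetical property, usable elsewhere in the POM as ${target.platform} -->
      <target.platform>windows</target.platform>
    </properties>
  </profile>
  <profile>
    <id>linux</id>
    <properties>
      <target.platform>linux</target.platform>
    </properties>
  </profile>
</profiles>

A profile is then selected when starting the build, for example with mvn package -P windows. Without the -P option, neither of these two profiles is applied.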

Conclusion

There are many other features available in Maven. We have only mentioned the basics. If you want to learn more about Maven, including how to use it, the best resource available on the net is the 'Maven: The Complete Reference' online guide.

After reading this guide, you should explore existing Maven plugins. Learn what they can do for you. If you need help, type your question in Google followed by the word 'StackOverflow'. It most probably has already been answered. If not, go to StackOverflow.com and ask your question there.

-----

(*) SNAPSHOT versions can be confusing at the beginning. Let's assume you are working on a project and have released version 1.0.0. You are now working on version 1.1.0, but it has not been released and won't be released until you are done. Version 1.1.0 is a work in progress. Yet, with Maven's build process, you are creating temporary work-in-progress artifacts. In order to differentiate these from production-ready artifacts, you can add -SNAPSHOT to the work-in-progress version of your project (i.e., <version>1.1.0-SNAPSHOT</version>).

When you create Maven projects with dependencies on other artifacts/libraries, by default, Maven ignores SNAPSHOT versions of those dependencies. Yet, if you want to access a specific SNAPSHOT version of a dependency, you can explicitly specify it in your pom.xml. This is often necessary during the development phase of a project.
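
For instance, a dependency on a work-in-progress library could be declared as sketched below. The coordinates are hypothetical (an imaginary SimpleLibrary developed by the same company), and the fragment belongs inside the dependencies element of the POM.

<dependency>
  <groupId>com.mycompany</groupId>
  <artifactId>SimpleLibrary</artifactId>
  <!-- Explicit request for the work-in-progress version -->
  <version>1.1.0-SNAPSHOT</version>
</dependency>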
