First, let's outline the constraints that come from the build tool. The main tasks of any build system using Jam are:
We will implement a variation of Rene Rivera's ideas for allowing Boost.Build to work "out-of-the-box", with no environment variable settings. The idea involves searching upward from the invocation directory for "boost-build.jam" and "project-root.jam", which can then be loaded to get the location of the build system installation and project-specific settings, respectively. I note that this combined functionality would obviate the need for "subproject" and "project-root" rules in Jamfiles, except where the user wants to declare a project-id. There are two core Jam extensions I expect to rely on for this functionality:
- The PWD builtin rule, which returns the absolute path name of the directory from which Jam was invoked. PWD can be stolen from Matt Armstrong's guest branch at Perforce.
- The ARGV builtin rule, which returns the arguments with which Jam was invoked. $(ARGV[1]) can be used to find the name that was used to invoke Jam, which can be used as a key to whether to implement stock Perforce Jam behavior or Boost.Build behavior.
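The upward search and the invocation-name check can be sketched in a few lines. This is a minimal Python sketch rather than Jam, and not the actual Boost.Build implementation: the helper names `find_upward` and `is_boost_jam_invocation` are hypothetical, and the start directory and program name stand in for what PWD and $(ARGV[1]) would supply.

```python
from pathlib import Path
from typing import Optional


def find_upward(start: Path, filename: str) -> Optional[Path]:
    """Return the nearest `filename` in `start` or one of its ancestors, else None."""
    start = start.resolve()
    # Path.parents yields the ancestors from nearest to farthest.
    for directory in (start, *start.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


def is_boost_jam_invocation(program_name: str) -> bool:
    """Assumed rule: the program name, minus directory and extension, is 'bjam'."""
    return Path(program_name).stem.lower() == "bjam"
```

Under these assumptions, a driver would call `find_upward(Path.cwd(), "boost-build.jam")` and `find_upward(Path.cwd(), "project-root.jam")` only when `is_boost_jam_invocation` returns true.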
We check the name used to invoke Jam, and if the name is not the recognized Boost.Jam invocation ("bjam") we continue with the execution of the builtin Jambase.
Otherwise, when we recognize a Boost.Jam invocation, we:
** the current Jambase gives an error about FTJam toolset definitions etc., if BOOST_BUILD_PATH is not set and the toolset definition is not set either. Probably that message should be extended to say something about setting BOOST_BUILD_PATH so that people are not confused. The FTJam toolset definitions aren't needed unless you're building Jam itself. Shall we dump FTJam functionality that we don't absolutely need?
    # ---- Sample site-config.jam file ----

    # loads the msvc module, which registers it as an available
    # toolset. No special toolset location/configuration information
    # is given, so it is assumed that the toolset is already set up
    # (e.g. VCVARS32.BAT has been called), or that it's installed in
    # its standard location.
    using msvc ;

    # As above, but tells the system that we have two versions of the
    # gcc toolset installed, in the specified locations, with 2.95.3
    # being the default. The 'using' rule loads the given module and
    # calls its "configure" method with the rest of its
    # arguments. How a module treats configuration information, of
    # course, is up to the module.
    using gcc : 2.95.3 /usr/local/gcc-2.95.3
                3.0.2 /usr/local/gcc-3.0.2 ;

    # Same idea as above.
    using stlport : 4.5 ~/stlport-4.5
                    4.6b2 ~/stlport-4.6b2 ;

    # does what ALL_LOCATE_TARGET currently does
    locate-built-targets bigdrive:/dave/builds ;
The system traverses the set of top-level targets and generates the dependency graph based on the expanded build description (the algorithm for expanding build descriptions is given at the bottom of this document). OK, so I've tossed off most of the work of the build system in one sentence. The rest of this document deals with that in more detail.
The basic algorithm is as follows:
project.project ( project-id : requirements * : default-build * )
Declares a project or subproject. A subproject's id is a path, starting with the project id of which it is a subproject. The requirements and default build apply to any targets described in the Jamfile which do not explicitly declare others. A project rule invocation is mandatory in any Jamfile in a project which includes subprojects or uses other projects.
project.jamfile-location ( root-to-jamfile )
Declares the location of this Jamfile with respect to the project root, in case the path given in the project rule does not describe the location of the Jamfile.
project.source-location ( root-to-source )
Declares that relative paths in this Jamfile are all specified relative to the specified directory. Thus, a project with this structure:

    root
    +- build
    |  `- Jamfile
    `- src
       +- foo.c
       `- bar.c

might have the following Jamfile:

    project.project foobar ;
    project.jamfile-location build ;
    project.source-location src ;
    exe foobar : foo.c bar.c ;
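To make the effect of the two location rules concrete, here is a small Python sketch of how source names in the example Jamfile could be resolved. It assumes that both rule arguments are paths relative to the project root, as the parameter names root-to-jamfile and root-to-source suggest; the helper names `project_root` and `resolve_sources` and the directory /work/root are made up for illustration.

```python
from pathlib import PurePosixPath
from typing import List


def project_root(jamfile_dir: PurePosixPath, root_to_jamfile: str) -> PurePosixPath:
    """Recover the project root by stripping root_to_jamfile from the Jamfile's directory."""
    tail = PurePosixPath(root_to_jamfile).parts
    if tail and jamfile_dir.parts[-len(tail):] != tail:
        raise ValueError(f"{jamfile_dir} does not end with {root_to_jamfile}")
    keep = len(jamfile_dir.parts) - len(tail)
    return PurePosixPath(*jamfile_dir.parts[:keep])


def resolve_sources(root: PurePosixPath, root_to_source: str,
                    sources: List[str]) -> List[PurePosixPath]:
    """Interpret each relative source name relative to the declared source location."""
    return [root / root_to_source / name for name in sources]


# The example project: the Jamfile lives in build/, the sources live in src/.
root = project_root(PurePosixPath("/work/root/build"), "build")
sources = resolve_sources(root, "src", ["foo.c", "bar.c"])
```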
A generator doesn't match the build request unless all of its required-properties are contained in the build request.
The matching process for a generator looks like this:
    local match ;
    if $(required-properties) in $(build-properties)
    {
        match = $(required-properties)
            [ set.intersection $(optional-properties)
                : $(build-properties)
                  $(build-properties:G) # valueless properties match any value
            ] ;

        for local r in $(rules)
        {
            match = [ $(r) $(match) ] ; # maybe some other arguments, too
        }
    }
    return match ;
The specificity of a match is given by the length of its match list. Basically, generators that match more properties will be more likely to be chosen. In each category where a matching generator is found, we select the generators with the longest match for the generator set.
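The matching and selection steps can be illustrated with a short Python sketch. It is only an approximation of the Jam pseudocode above: properties are plain strings of the form "<feature>value", the per-generator `rules` loop is left out, and the dictionary keys `name`, `category`, `required` and `optional` are hypothetical names chosen for the example.

```python
from typing import Dict, Iterable, List


def grist(prop: str) -> str:
    """Return the feature part of a property: '<optimization>speed' -> '<optimization>'."""
    return prop[: prop.index(">") + 1]


def match(required: List[str], optional: List[str], build: List[str]) -> List[str]:
    """Return the match list, or [] when a required property is missing."""
    if not set(required) <= set(build):
        return []
    # A valueless optional property such as '<source-type>' matches any value,
    # so the bare features of the build request are accepted as well.
    accepted = set(build) | {grist(p) for p in build}
    return list(required) + [p for p in optional if p in accepted]


def select(generators: Iterable[dict], build: List[str]) -> Dict[str, List[str]]:
    """For each category, keep the names of the generators with the longest match."""
    best: Dict[str, int] = {}
    chosen: Dict[str, List[str]] = {}
    for gen in generators:
        length = len(match(gen["required"], gen["optional"], build))
        if length == 0:
            continue  # an empty match list counts as "no match"
        cat = gen["category"]
        if length > best.get(cat, 0):
            best[cat], chosen[cat] = length, [gen["name"]]
        elif length == best[cat]:
            chosen[cat].append(gen["name"])
    return chosen
```

Ties are kept on purpose: the text speaks of selecting "the generators" with the longest match, so several generators in one category may survive this step.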
Notes: These criteria handle things like target-type specificity. Properties in an inheritance/refinement hierarchy can be composite properties which expand to add properties for all of their bases. So for example,

    <target-type>PYD

might expand to

    <target-type>PYD <target-type-base>PYD <target-type-base>DLL <target-type-base>executable

(see the bottom of this document if you need to refer to the target-type refinement hierarchy). A generator which wanted to match all executables might specify

    <target-type-base>executable

as a required property. More-specific generators would still match if available. More-specific generators match better because they list /all/ of the base properties to which they apply:

    pyd-generator requires: <target-type-base>executable <target-type-base>DLL <target-type-base>PYD
    dll-generator requires: <target-type-base>executable <target-type-base>DLL
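As a rough illustration of composite expansion, the Python sketch below expands a target type into itself plus one base property per level of the refinement hierarchy. The `BASE` table encodes only the PYD, DLL, EXE and executable relationships from the hierarchy at the bottom of this document, and the function name `expand_target_type` is hypothetical.

```python
from typing import Dict, List, Optional

# Child -> parent links, taken from the target-type refinement hierarchy.
BASE: Dict[str, str] = {"PYD": "DLL", "DLL": "executable", "EXE": "executable"}


def expand_target_type(target_type: str) -> List[str]:
    """Expand a target type into its own property plus a base property per ancestor."""
    props = [f"<target-type>{target_type}"]
    current: Optional[str] = target_type
    while current is not None:
        props.append(f"<target-type-base>{current}")
        current = BASE.get(current)  # None once the top of the hierarchy is reached
    return props


pyd_request = expand_target_type("PYD")
dll_requires = ["<target-type-base>executable", "<target-type-base>DLL"]
pyd_requires = dll_requires + ["<target-type-base>PYD"]
```

Both requirement lists are contained in the expanded PYD request, but the PYD generator's list is longer, which is what makes it the better match.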
For each generator in the generator set, call its "expand" rule to alter the properties in the build request. Most generators that can actually build targets will not want to implement expand; usually expand will only be used by generators that need to modify the build somehow, e.g. by adding #include paths. Note that the build property set is still one big wad available to all competing/interacting generators, so this would be an inappropriate place for a toolset generator to remove irrelevant properties.
For each generator in the generator set, call its "execute" rule. The "execute" rule should return a list of the virtual targets generated in its dependency graph. Generators that don't succeed in producing targets will return the empty list.
At this point, the generator may collect a set of properties relevant to its target construction method into a subvariant identifier. A database of already-generated subvariant identifiers and their related targets can be queried to see if the subvariant already exists. If it does, the generator may use the cached data to return to its caller immediately.
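One possible shape for the subvariant database is sketched below in Python. The text does not specify how identifiers are formed, so this sketch assumes that an identifier is the sorted tuple of the properties whose features the generator declares relevant; the names `subvariant_id` and `SubvariantCache` are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

SubvariantId = Tuple[str, ...]


def subvariant_id(properties: Iterable[str], relevant_features: Iterable[str]) -> SubvariantId:
    """Keep only properties of the form '<feature>value' whose feature is relevant."""
    relevant = set(relevant_features)
    return tuple(sorted(p for p in properties if p[: p.index(">") + 1] in relevant))


class SubvariantCache:
    """Maps (generator name, subvariant identifier) to the targets already generated."""

    def __init__(self) -> None:
        self._targets: Dict[Tuple[str, SubvariantId], List[str]] = {}

    def lookup(self, generator: str, key: SubvariantId) -> Optional[List[str]]:
        return self._targets.get((generator, key))

    def store(self, generator: str, key: SubvariantId, targets: Iterable[str]) -> None:
        self._targets[(generator, key)] = list(targets)
```

Two build requests that differ only in properties the generator ignores map to the same identifier, so the second request can return the cached targets immediately.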
To produce the dependency graph, a generator may well invoke the matching and target calculation process again on some or all of the source files, with the build property set changed to reflect the generator's input target types. For example, a generator for executables comes across a CPP file in a list of sources. It then replaces target types in the build request with its list of input target types (OBJ, LIB,...). The matching process finds a generator which matches <target-type>OBJ with optional <source-type>CPP. This generator, if eventually selected, is the one that invokes the C++ compiler.
Why is <source-type>CPP an optional property for the C++ compiler? Consider what happens when a YPP source file appears in the list of sources: the C++ generator should still be matched - it will want to invoke the matching process itself once again, hopefully finding a generator which matches <target-type>CPP/<source-type>YPP.
Why do we need <source-type>CPP at all? We don't: it's an optimization to prevent less-specific generators from being invoked for build-property expansion or virtual target generation.
Generators are typically matched based on a single desired target-type, but some generators produce more than one target. When returning its list of targets, a generator distinguishes the intermediate from final targets by dividing the list of targets into two sections, separated by a special symbol (say "@", since we seem to be using it for everything). When an intermediate generator produces multiple targets, its parent transforms the targets it can cope with, and passes back any others to its parent as final targets. A multi-source generator like EXE<-{OBJ,LIB} will add any final targets that it can't cope with to its sources at that point in its list of sources.
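The "@" separator convention for multi-target generators might look like the following Python sketch. The text does not say which section comes first, so this sketch assumes intermediate targets precede "@" and final targets follow it; the function names and the `can_cope` predicate are hypothetical.

```python
from typing import Callable, List, Tuple

SEPARATOR = "@"  # assumed layout: intermediate targets, "@", final targets


def split_targets(targets: List[str]) -> Tuple[List[str], List[str]]:
    """Split a returned target list into (intermediate, final) sections."""
    if SEPARATOR in targets:
        i = targets.index(SEPARATOR)
        return targets[:i], targets[i + 1:]
    return list(targets), []


def join_targets(intermediate: List[str], final: List[str]) -> List[str]:
    """Build the list a generator returns; the separator is omitted without finals."""
    return intermediate + [SEPARATOR] + final if final else list(intermediate)


def pass_up(child_result: List[str],
            can_cope: Callable[[str], bool]) -> Tuple[List[str], List[str]]:
    """Keep the targets the parent can transform; hand the rest up as final targets."""
    intermediate, final = split_targets(child_result)
    usable = [t for t in intermediate if can_cope(t)]
    rejected = [t for t in intermediate if not can_cope(t)]
    return usable, final + rejected


# A generator that only understands WHL receives both outputs of one build action.
usable, leftover = pass_up(["a.whl", "a.dlp"], lambda t: t.endswith(".whl"))
returned = join_targets(["a1.lex"], leftover)
```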
Vladimir's favorite example has a dependency graph like this one:

    targets        foo
       ^          /   \
       |       a1.o   a2.o
       |         |      |
       |      a1.cpp  a2.cpp
       |         |      |
       |       a.whl  a.dlp   <--- one build action generates both WHL and DLP
       |          \    /
    sources       asm.wd

To make the problem more interesting, let's reformulate the example:

    targets        foo
                  /   \
               a1.o   a2.o
                 |      |
              a1.cpp    |
                 |    a2.cpp
              a1.lex    |
                 |      |
               a.whl  a.dlp   <--- one build action generates both WHL and DLP
                  \    /
                  asm.wd

In the case of

    EXE<-OBJ*, OBJ<-CPP, CPP<-LEX, LEX<-WHL, {WHL,DLP}<-WD,
    --------------------------------------^

The arrow indicates a place where, as we're unwinding, we find that the generator can't cope with DLP, so with final targets enclosed in {}:

    EXE<-OBJ*, OBJ<-CPP, CPP<-LEX, {LEX,DLP}<-WHL
    EXE<-OBJ*, OBJ<-CPP, {CPP,DLP}<-LEX,
    EXE<-OBJ*, {OBJ,DLP}<-CPP

Now find that OBJ* doesn't match DLP, so we search for DLP just like a source.
Before returning from its execute rule, each generator may collect "synthesized" build properties from the sources and/or intermediate targets generated by the matching process it invoked.
Note: A generator does not /have/ to build a new target. It may just modify properties of targets at the next level and pass the targets up to their caller. For example, a Windows PYD generator might replace the target-type with DLL and reinvoke the process, then go back and add "_d" to the name of the generated file in the case of a Python debug build.
              +--------+                  +--------+ Cmd +-----+ Link +------------+
              | source |       'Archive'..+ object +....>| RSP +.....>| executable |
              +----+---+                : +---+----+     +-----+  :   +------+-----+
                   |                    :     |                   :          |
           +-------+----+          +----------+-+-------------+   :     +----+----+
           |            |          |    :       |             |   V     |         |
      +----------+   +--+--+Asm +--+--+ :    +--+--+     +----+---+  +--+--+   +--+--+
      | compiled |   | ASM |...>| OBJ | :...>| LIB |     | IMPLIB |  | DLL |   | EXE |
      +----+-----+   +-----+    +-----+      +-----+     +--------+  +--+--+   +-----+
           |         ^  ^       ^  ^                                    |
       +---+--+      :  :       :  :                                 +--+--+
       |      |      :  :       :  :                                 | PYD |
    +--+--+ +-+-+    :  :       :  :                                 +-----+
    | CPP | | C +....:..........:  :
    +--+--+ +---+ 'C'   :          :
       :                :          :
       :................:..........: 'C++'
© Copyright David Abrahams 2002. Permission to copy, use, modify, sell and distribute this document is granted provided this copyright notice appears in all copies. This document is provided "as is" without express or implied warranty, and with no claim as to its suitability for any purpose.
Revised 21 January, 2002