------------------------------
My poor Portage deconstruction
------------------------------

Here's what I have so far in my "walk" through the Portage code (specifically,
portage HEAD sometime around Feb. 2005).  Right now I've enumerated ebuild_,
the effects of `importing portage`_, emerge_, the Portage
`portage.doebuild()`_ function, and the `ebuild.sh`_ script.

ebuild
------

The ebuild script is a simple interface to portage.py.  It's a rather simple
python script:

    * Import needed modules
    * Add ``/usr/lib/portage/pym`` to the python path so that portage can be
      found.
    * Define the ``getroot()`` function to ensure that ``ROOT`` is set correctly.
    * Set the ``PORTAGE_CALLER`` environment variable to ``ebuild``.
    * Parse options using getopt:
        * Look for ``--debug``, set ``debug=1`` if found
        * Look for ``merge``, add ``-noauto`` to ``FEATURES`` if found.
    * Import portage (and doing all of the setup that entails).


::

    Example: ebuild --debug /path/to/foo.ebuild fetch unpack


Script loops over arguments after the ebuild name:
    -  store (clone) portage.config to be passed to doebuild
    -  Set ``cleanup=1`` if running either ``clean`` or ``config``
    -  execute ``a=portage.doebuild(ebuild name, argument, ROOT, config, debug=debug,cleanup=cleanup)``
    -  fail (exit) if ``a`` exists (or user interrupt or I/O error)

importing portage
-----------------

Quite a few things happen when portage is imported.  Most of portage.py
goes about defining a boatload of classes and functions, but there's
also a fair amount of information that is assembled and made available
by the import process.  Here's a (probably incomplete) list of those
items:

    * Define constants: portage version number, a list of variables that 
      are "incremental" (USE, FEATURES, etc.), and a list of variables that 
      are "sticky" (KEYWORDS_ACCEPT, USE, ...).
    * Import a lot of python modules that will be used by portage.  Core
      python modules are imported first, followed by various portage modules.
      Along the way, ``catpkgsplit``, ``pkgsplit``, and ``pkgcmp`` from
      ``portage_versions`` are added to the portage namespace.  
    * Establish a signal handler to catch SIGCHLD, SIG_DFL, SIGINT, and
      SIGTERM.
    * Define a *lot* of functions and classes.

    * Define the "VDB" path (traditionally ``/var/db/pkg``)
    * Identify ostype from uname, set some os-related defaults, 
      export USERLAND to the environment
    * Check to make sure user is in the portage group (or root)
    * List of "incrementals" and "stickies" keys
    * List possible ebuild version suffixes ("pre", "alpha", ...)
    * List ebuild keys ("DEPEND", "LICENSE", ...)
    * Check environment for ROOT, use "/" if not found
    * Create tmp and var/tmp directories (below ROOT)
    * Set 022 umask
    * Check to see if the current profile is deprecated
    * Set some global settings (PID, bash env, cache dir)
    * Read use.defaults and set variables appropriately
    * Get OVERLAY
    * Lock global settings
    * Check for selinux
    * Set up some cachedir stuff
    * mtime db stuff???
    * Get FEATURES
    * Set up porttree and bintree dbs
    * Get thirdpartymirrors
    * Perform PORTAGE_CACHEDIR and PORTAGE_TMPDIR sanity checks
    * Get category list from file
    * Get package.mask, /etc/portage/package.*
    * Get packages file
    * Get unmask list, mask list, "revmask"?
    * Get ACCEPT_KEYWORDS
    * Check that /etc/make.profile is a symlink
    * Check for prelink

emerge
------

This script is much more complicated than the ebuild script (portage.py
is just a bit more than 2x as many lines as this script).  

    * Set environment variable ``PORTAGE_CALLER="emerge"``
    * Import portage <-- portage.settings and other vars now exposed
    * Check if output should be colored (if not, call ``nocolor()`` ).
    * Check portage.settings for ``PORTAGE_NICENESS``; if so, nice portage
    * Freeze portdb api (don't know what this does just yet)
    * Turn off "noauto" if set in ``FEATURES``
    * Set # of merged ebuilds = 0
    * Set up lists of valid emerge parameters (params),
      actions, and options (all hard-coded lists); also
      map short-form of options to long-form versions.
    * Process command line (ugly!)
        - process short-form to convert to long-form
        - Process long-form arguments
        - Create list of options (myopts), actions (myactions), & files (myfiles)
        - minor fixes ( ``-U`` implies ``-u``, etcetera)
        - handle ``--noconfmem`` and ``--debug`` flags
    * Define log, exit, countdown, signal handling, help, and "emerge info" 
      helper functions.
    * Check permissions; exit if invalid.
    * Start logging (unless ``--pretend`` was set)
    * Decide what type of dependencies to use
      (self, selective, recurse, deep, or empty)
    * Define spinner func
    * Define the search class
    * Build package digraph
        - Define getlist function for world or system
        - Define genericdict function
        - Define depgraph class (huge)
    * Define unmerge function
    * Define post_emerge function (finish logging, handle info stuff,
      config protection)
    * If --debug set, edebug=1 (not sure why two different debug flags set)
    * Handle ``emerge sync`` (big if, elif, else construction) 
    * Handle ``emerge regen``
    * Handle ``emerge config``
    * Handle ``emerge info``
    * Handle ``emerge search``
    * Handle ``emerge inject``
    * Handle ``emerge unmerge``, ``emerge prune``, and ``emerge clean``
        - Call ``unmerge(myaction, myfiles)``
    * Handle ``emerge depclean``
    * Else: Handle update, system, or just process files
        - Do some setup stuff if ``--pretend`` set
        - If ``--resume`` and portage.mtimedb has a "resume" key, then update opts, create a "resume" depgraph
        - Else, create depgraph, error out if something is wrong
        - If ``--pretend``, show depgraph
        - Else
            + If ``--buildpkgonly``, check that deps are satisfied
            + If --resume, do merge using ``mydepgraph.merge(portage.mtimedb["resume"]["mergelist"])``
            + Else, handle ``--digest`` or perform merge, the latter with ``mydepgraph.merge(mydepgraph.altlist())`` 
        - Run autoclean
        - Run ``post_emerge()``

portage.doebuild()
------------------

Much of what happens during the actual merge of a package happens
in portage's doebuild function.  The interface is::

    doebuild(myebuild, mydo, myroot, debug=0, listonly=0, fetchonly=0)

where ``myebuild`` is the path to the ebuild to be merged,
``mydo`` is the action to be performed (one of the commands that 
follows ``ebuild foo.ebuild ...``), and ``debug``, ``listonly``,
and ``fetchonly`` are flags that can be true (1) or false (0).

Here's what happens:

    *  Check to make sure the input is valid 
    *  Set a boatload of keys in the ``settings`` instance of the config class
        - PORTAGE_DEBUG, ROOT, STARTDIR, EBUILD, O, CATEGORY, FILESDIR,
          PF, ECLASSDIR, SANDBOX_LOG, P, PN, PV, PR, PVR, SLOT, BUILD_PREFIX,
          PKG_TMPDIR, PORTAGE_BUILDDIR, KV, KVERS
    *  If the action is ``depend``, then return dependency info (eclasses?).
       If ``--debug`` was also set, then the deps are printed on the cmd line.
    *  If necessary (meaning not ``fetch``, ``digest``, or ``manifest``), 
       create the build directory (``/var/tmp/portage/whatever``) and the 
       temp directory (and add T to ``settings``).  Also create a ccache
       directory, if necessary, along with WORKDIR and D.
    *  If ``unmerge``, call ``unmerge()`` and return.
    *  Setup logging if requested
    *  If ``help``, ``clean``, ``setup``, ``prerm``, ``postrm``, ``preinst``,
       ``postinst``, or ``config``, then pass the action to ``ebuild.sh`` 
       (using "``ebuild.sh mydo,debug,free=1``") and return.
    *  Generate A from SRC_URI
    *  Ensure that DISTDIR exists; change permissions for ${DISTDIR}/cvs-src
       if necessary.
    *  Run fetch(); return error on failure.
    *  Generate digest if ``digest`` in FEATURES or called explicitly (return
       in the latter case).
    *  If ``manifest`` then generate manifest and return.
    *  Check manifest and die on failure if ``strict`` in FEATURES
    *  If ``fetch``, return (since already done above, and it must have worked
       if we're here)
    *  Set up an "``actionmap``" of dependencies (i.e., ``unpack`` requires
       ``dep`` and ``setup``).
    *  If ``package``, ensure that PKGDIR is created, call 
       ``spawnebuild`` with the current action and the actionmap, and return.
    *  If ``qmerge``, call merge w/o checking for dependencies, and return.
    *  If ``merge``, call ``spawnebuild`` with "``install``" and the 
       ``actionmap``, and then call ``merge`` and return.
    *  Else, die because the action didn't match any of the above.  Oops!

ebuild.sh
---------

The ``ebuild.sh`` bash script predominantly gets called by the
``portage.doebuild`` function to handle the bash functions in an
ebuild.  Following the bouncing ball....

The first block of code is initialization stuff:

    *  If not ``depend`` or ``clean``, then
        - clean the "successful" file from ${T} if it exists
        - Set up logging if desired (including punching a hole in
          sandbox for the log file)
        - If ${T}/environment exists, source it
    *  unalias everything
    *  Set "expand_aliases" and set up ``die`` and ``assert`` aliaes
    *  Source ``/etc/profile`` on GNU userland systems
    *  Export proper CC and CXX variables.
    *  Set up the a hardcoded PATH, tack ${ROOTPATH} to the end of it,
       and add ${PREROOTPATH} to the beginning of the PATH if it exists.
    *  Source helper functions files (``functions.sh`` and 
       ``extra_functions.sh``).

Then a great number of functions are defined:

    *  Define a do-nothing ``esyslog()`` override function.
    *  Define a variety of USE flag functions:
       ``use``, ``has``, ``has_version``, ``best_version``,
       ``use_with``, and ``use_enable`` functions.
    *  Define ``diefunc`` function.
    *  Set umask and a number of installation defaults.
    *  Define ``check_KV`` function.
    *  Define the ``keepdir`` function to put ".keep" files in directories
       that portage should not automatically remove if empty.
    *  Define a series of sandbox helper functions.
    *  Define the ``unpack`` function.
    *  Define the ``econf`` convenience function.  Notice that ``econf``
       automatically dies on error(s).
    *  Define the ``einstall`` convenience function.
    *  Define ``pkg_setup`` as a placebo function to be overridden.
    *  Define ``pkg_nofetch``.
    *  Define minimal, but functional, ``src_unpack`` and ``src_compile``
       functions.
    *  Define ``src_install``, ``pkg_preinst``, ``pkg_postinst``, 
       ``pkg_prerm``, and ``pkg_postrm`` skeleton functions that, when
       needed, should be overridden.
    *  Define the deprecated ``try`` function.
    *  Define the ``gen_wrapper`` function to generate ``lib/cpp`` and
       ``/usr/bin/cc`` wrappers.
    *  Define some of the ``dyn-*`` functions: ``dyn_setup``, ``dyn_unpack``, 
       ``dyn_clean``.
    *  Define some ``install`` helper functions: ``into``, ``insinto``,
       ``exeinto``, ``docinto``, ``insopts``, ``diropts``, ``exeopts``,
       and ``libopts``.  Note that a number of additional helper 
       functions, such as the ``do*`` functions, are actual executable
       scripts in ``/usr/lib/portage/bin``.
    *  Define a number of abort handler functions: ``abort_handler``,
       ``abort_compile``, ``abort_unpack``, ``abort_package``, and
       ``abort_install``.
    *  Define the rest of the ``dyn_*`` functions: ``dyn_compile``,
       ``dyn_package``, ``dyn_install``, ``dyn_spec``, ``dyn_rpm``,
       and ``dyn_help``.
    *  Define some eclass debugging functions: ``debug-print``, 
       ``debug-print-function``, and ``debug-print-section``.
    *  Define the eclass ``inherit`` function.
    *  Define some eclass helper functions:  ``EXPORT_FUNCTIONS``,
       ``newdepend``, ``newrdepend``, ``newcdepend``, ``newpdepend``,
       and ``do_newdepend``.

Finally, the main part of the script is reached:

    *  If not ``depend`` or ``clean``, then
        - cd into the ``/var/tmp/portage/whatever`` directory
        - export USER=portage if the effective user ID is "portage"
        - Set up distcc and ccache PATHs, environment variables, and
          sandbox holes
    *  Turn on sandbox flag
    *  Set the S default and unset a handful of variables
    *  Source the ebuild
    *  Set the S default if S not defined in the ebuild.  (redundant?)
    *  Set TMP and TMPDIR to be ${T} (otherwise sandbox might error out)
    *  Set RDEPEND is not set, set it to be DEPEND.
    *  Add eclass deps to {R,C,P,}DEPEND.
    *  Loop over arguments passed to ``ebuild.sh``, using a big case statement
         - ``nofetch`` -- call ``pkg_nofetch``
         - ``prerm``, ``postrm``, ``preinst``, ``postinst``, ``config`` -- 
           turn off sandbox, handle debugging, and call the eponymous
           ``pkg_*`` function.
         - ``unpack``, ``compile``, ``clean``, ``install`` -- handle sandbox
           and debugging, then call the eponymous ``dyn_*`` function.
         - ``help``, ``clean``, ``setup`` -- Turn off sandbox, handle
           debugging, and call the appropriate ``dyn_*`` function.
         - ``package``, ``rpm`` -- Turn off sandbox, handle debugging,
           and call the eponymous ``dyn_*`` function.
         - ``depend`` -- Turn off sandbox and write dependency info
           to ``/var/edb/cache/dep/CATEGORY/PF``.
         - Else, error that the command wasn't valid.
    *  If ``clean``, clean that package's temp directory
    *  Finish by touching "successful" in the package's temp directory.