Internals

How does it work?

Scheme

[Figure: architecture scheme, _images/autodep_arch48.png]

Format of network messages

  1. Format of messages sent to the file access registrar:

    <time of event: sec since 1970>
    <event type: open, read, write>
    <name of file>
    <building stage: stagename or unknown>
    <result: OK, ERR/errno, ASKING, DENIED>

  2. Format of the registrar's answer to an ASKING packet:

    <ALLOW | DENY>

Notes:

  • All sockets are of type SOCK_SEQPACKET
  • All fields are delimited by a NUL character (code 0)
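The message layout above can be illustrated with a minimal Python sketch. The names AccessEvent, encode_event and decode_event are hypothetical and not part of the project. The sketch also assumes UTF-8 text and no trailing NUL after the last field, neither of which is stated above.

```python
from typing import NamedTuple


class AccessEvent(NamedTuple):
    timestamp: int   # seconds since 1970
    event_type: str  # "open", "read" or "write"
    filename: str
    stage: str       # build stage name, or "unknown"
    result: str      # "OK", "ERR/<errno>", "ASKING" or "DENIED"


def encode_event(event: AccessEvent) -> bytes:
    """Join the five fields with NUL bytes into one packet payload."""
    fields = [str(event.timestamp), event.event_type, event.filename,
              event.stage, event.result]
    # Assumption: UTF-8 text, no trailing delimiter after the last field.
    return b"\0".join(f.encode("utf-8") for f in fields)


def decode_event(payload: bytes) -> AccessEvent:
    """Split a payload on NUL bytes and rebuild the event."""
    parts = payload.split(b"\0")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    ts, etype, name, stage, result = (p.decode("utf-8") for p in parts)
    return AccessEvent(int(ts), etype, name, stage, result)
```

Because SOCK_SEQPACKET preserves message boundaries, one payload like this can travel as one packet, so no extra length prefix is needed in the sketch.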

How does the Hooklib approach work?

The main idea of the Hooklib approach is to load a hooking dynamic library before any other library (including the C runtime, libc.so). Functions such as open, read and write are then called from this library instead of libc.so.

The Hooklib module modifies the behavior of the Linux dynamic linker by changing the LD_PRELOAD environment variable (see man 8 ld-linux for details). The module also protects the LD_PRELOAD variable from further changes by the program.

When the hooklib module loads, it connects to the file access registrar over a Unix domain socket. If the program forks or creates a new thread, another copy of the library is loaded.

When the program calls open(...), read(...) or write(...), the library sends information about the call to the registrar. The registrar can block or allow the event. If the registrar allows the event, the original function is called. Otherwise a “file not found” error is returned.
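The ask-then-call flow can be simulated in Python, although the real hook is a preloaded native library. This is only a sketch: registrar_loop, hooked_open and the denied_prefix policy are hypothetical names invented for illustration. It assumes a Linux host, where SOCK_SEQPACKET is available for Unix domain sockets, and an ASKING packet built from the five NUL-delimited fields of the message format.

```python
import errno
import socket
import threading
import time


def registrar_loop(conn: socket.socket, denied_prefix: str) -> None:
    """Toy registrar: answer every ASKING packet with ALLOW or DENY."""
    while True:
        packet = conn.recv(4096)
        if not packet:  # peer closed the socket
            break
        filename = packet.split(b"\0")[2].decode()
        # Hypothetical policy: deny everything under one path prefix.
        answer = b"DENY" if filename.startswith(denied_prefix) else b"ALLOW"
        conn.send(answer)


def hooked_open(conn: socket.socket, filename: str, stage: str = "unknown"):
    """Ask the registrar first; call the real open() only if it allows."""
    fields = [str(int(time.time())), "open", filename, stage, "ASKING"]
    conn.send("\0".join(fields).encode())
    if conn.recv(4096) != b"ALLOW":
        # Mirror the behavior described above: report "file not found".
        raise FileNotFoundError(errno.ENOENT, "No such file or directory",
                                filename)
    return open(filename, "rb")
```

To try it, create a connected pair with socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET), run registrar_loop on one end in a threading.Thread, and call hooked_open with the other end.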

How does the Fusefs approach work?

The main idea of the Fusefs approach is to create a logging filesystem in userspace and chroot the program into it.

Before the program is launched, the registrar prepares the mounts. It usually does the following:

  1. mount -o bind / /mnt/rootfs/
  2. mount /dev/, /dev/pts, /dev/shm, /proc/ and /sys/ the same way
  3. mount /lib64/, /lib32/ and /var/tmp/portage/ the same way, to increase performance at the cost of accuracy
  4. launch fuse over /mnt/rootfs/
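The bind mounts from steps 1 to 3 can be written out as a plan of commands. The Python sketch below only builds the argument lists and executes nothing. The helper names are hypothetical, and the target paths under /mnt/rootfs are an assumption, since the list only says "same way". Step 4 is left out because the command that launches fuse is not given here.

```python
import os
from typing import List

ROOT = "/mnt/rootfs"
SYSTEM_DIRS = ["/dev", "/dev/pts", "/dev/shm", "/proc", "/sys"]  # step 2
FAST_DIRS = ["/lib64", "/lib32", "/var/tmp/portage"]             # step 3


def bind_mount_cmd(source: str, root: str = ROOT) -> List[str]:
    """Return the argv that bind-mounts `source` at the same path under `root`."""
    # Assumption: "same way" means the directory keeps its path below root.
    target = os.path.join(root, source.lstrip("/"))
    return ["mount", "-o", "bind", source, target]


def plan_mounts(root: str = ROOT) -> List[List[str]]:
    """List the bind mounts of steps 1-3 in order; nothing is executed."""
    plan = [["mount", "-o", "bind", "/", root]]  # step 1
    plan += [bind_mount_cmd(d, root) for d in SYSTEM_DIRS + FAST_DIRS]
    return plan


if __name__ == "__main__":
    for argv in plan_mounts():
        print(" ".join(argv))
```

Keeping the plan as data makes it easy to inspect or log before anything runs with root privileges.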

The fuse module blocks all external access to /mnt/rootfs while the program runs.

The fuse module also asks the registrar whether each event is allowed.

Notes:

  • Checking whether an event is allowed takes a lot of time

Further analysis of file access events

After the file access analyser receives the list of events, it maps them onto a list of packages.

Then the analyser builds a list of dependencies for the installed packages and compares it with the list it got from the registrar. The analyser assumes that packages from the system profile are implicit dependencies of any package in the system.
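The comparison can be pictured as plain set arithmetic. The Python sketch below is not the analyser's actual code: the function names are hypothetical, and the file_owner mapping stands in for whatever package database lookup the analyser really performs.

```python
from typing import Dict, Iterable, Set


def packages_for_events(accessed_files: Iterable[str],
                        file_owner: Dict[str, str]) -> Set[str]:
    """Map accessed files to their owning packages; unowned files are skipped."""
    return {file_owner[f] for f in accessed_files if f in file_owner}


def unexpected_dependencies(accessed_packages: Set[str],
                            declared_deps: Set[str],
                            system_packages: Set[str]) -> Set[str]:
    """Packages that were accessed but are neither declared nor in the system profile."""
    # System profile packages count as implicit dependencies of every package.
    return accessed_packages - declared_deps - system_packages
```

Whatever remains in the result is a candidate for an undeclared dependency.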

If a dependency reported by the registrar is unexpected, simple heuristics are used to filter out packages that are not useful.

Heuristic rules

  1. A package is not useful if all of its accessed files are .desktop, .xml or .m4 files. The aclocal utility tries to read all .m4 files in the /usr/share/aclocal directory, and files ending in .desktop and .xml are often read during the postrm phase.
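Rule 1 is small enough to sketch as a predicate in Python. The function name and the constant are hypothetical. Treating a package with no accessed files as still useful is an assumption, because the rule does not cover that case.

```python
from typing import Iterable

# Suffixes named in rule 1.
IGNORED_SUFFIXES = (".desktop", ".xml", ".m4")


def is_unuseful_package(accessed_files: Iterable[str]) -> bool:
    """True if every accessed file of the package has an ignored suffix."""
    files = list(accessed_files)
    # Assumption: an empty file list does not make a package unuseful.
    return bool(files) and all(f.endswith(IGNORED_SUFFIXES) for f in files)
```

For example, a package touched only through /usr/share/aclocal/libtool.m4 would be filtered out, while one that also had a header file read would be kept.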
