It is possible to have many different worker types in Taskcluster; indeed, we
have multiple worker implementations. This is extremely powerful, as we can't
possibly support all future platforms and features with a single code base.
However, even though multiple worker implementations are possible, it is still
desirable to reuse code across platforms and to aim for consistent concepts and
task.payload formats.
taskcluster-worker is an attempt at writing a cross-platform worker that
supports multiple execution environments and configurable feature sets.
This does not mean that every conceivable worker can be implemented with
taskcluster-worker. The architecture of taskcluster-worker aims to abstract
the execution environment and features for any worker where the task specifies
some form of sub-process to be executed. For example, the task could specify a
command to be executed in a virtual machine, a docker container, or
just an unconfined sub-process. The architecture of taskcluster-worker does
not aim to facilitate more declarative tasks, such as signing a binary, where
the task.payload specifies the binary to sign rather than a command.
To support multiple platforms and configurable feature sets, taskcluster-worker
has two important abstractions:
- engines.Engine, an abstraction of a sandboxed execution environment, and
- plugins.Plugin, a plugin which can implement a complex feature.
This allows features to be implemented independently of the execution environment that is used. It also means that we can add additional execution environments without rewriting the worker from scratch.
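To illustrate how the two abstractions keep features independent of the execution environment, here is a minimal sketch. The `Sandbox`, `Engine`, and `Plugin` types below are hypothetical, heavily simplified stand-ins invented for this example; they are not the actual `engines.Engine` or `plugins.Plugin` interfaces, whose methods are not described in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// Sandbox is a hypothetical stand-in for what an engine prepares before a
// task runs. It is not a real taskcluster-worker type.
type Sandbox struct {
	Env     map[string]string
	Command []string
}

// Engine is a hypothetical, simplified view of an execution environment.
type Engine interface {
	Name() string
	Run(s *Sandbox) (string, error)
}

// Plugin is a hypothetical, simplified view of a feature.
type Plugin interface {
	Prepare(s *Sandbox) error
}

// echoEngine is a toy engine that "executes" a command by describing it.
type echoEngine struct{}

func (echoEngine) Name() string { return "echo" }

func (echoEngine) Run(s *Sandbox) (string, error) {
	if len(s.Command) == 0 {
		return "", fmt.Errorf("no command specified")
	}
	cmd := strings.Join(s.Command, " ")
	return fmt.Sprintf("ran %q with %d env var(s)", cmd, len(s.Env)), nil
}

// envPlugin injects fixed environment variables, whatever the engine is.
type envPlugin struct{ vars map[string]string }

func (p envPlugin) Prepare(s *Sandbox) error {
	for k, v := range p.vars {
		s.Env[k] = v
	}
	return nil
}

// runTask applies every plugin to the sandbox, then hands it to the engine.
func runTask(e Engine, plugins []Plugin, command []string) (string, error) {
	s := &Sandbox{Env: map[string]string{}, Command: command}
	for _, p := range plugins {
		if err := p.Prepare(s); err != nil {
			return "", err
		}
	}
	return e.Run(s)
}

func main() {
	plugins := []Plugin{envPlugin{vars: map[string]string{"TASK_ID": "abc"}}}
	out, err := runTask(echoEngine{}, plugins, []string{"make", "test"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

In this sketch, swapping `echoEngine` for another `Engine` implementation requires no change to `envPlugin`, which is the property the two abstractions are meant to provide.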
In simple terms, the worker loads a config file and sets up some abstracted resources, such as:
- Temporary file storage,
- A garbage collected cache registry,
- Logic for exposing public web-hooks (live logs and interactive shells),
- Log recording, and
- Life-cycle controls.
Then it instantiates an engine and enters a task processing loop, which in broad strokes looks somewhat like this:
- Claim task
- Process task
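The claim-and-process loop above can be sketched roughly as follows. Everything here is an assumption made for illustration: `Task`, `memoryQueue`, and `processLoop` are hypothetical names, the queue is an in-memory slice rather than the real task queue, and the loop exits when the queue is empty, whereas a real worker would presumably keep polling.

```go
package main

import (
	"errors"
	"fmt"
)

// Task is a hypothetical, minimal task: an ID plus the command to run.
type Task struct {
	ID      string
	Command string
}

// errNoTasks signals that the (hypothetical) queue is empty.
var errNoTasks = errors.New("no tasks available")

// memoryQueue is an in-memory stand-in for the real task queue.
type memoryQueue struct{ pending []Task }

// claim removes and returns the next pending task, if any.
func (q *memoryQueue) claim() (Task, error) {
	if len(q.pending) == 0 {
		return Task{}, errNoTasks
	}
	t := q.pending[0]
	q.pending = q.pending[1:]
	return t, nil
}

// processLoop claims tasks until the queue is empty and processes each one.
// It returns the IDs of the tasks that were processed, in order.
func processLoop(q *memoryQueue, process func(Task) error) ([]string, error) {
	var done []string
	for {
		t, err := q.claim()
		if errors.Is(err, errNoTasks) {
			return done, nil
		}
		if err != nil {
			return done, err
		}
		if err := process(t); err != nil {
			return done, fmt.Errorf("task %s: %w", t.ID, err)
		}
		done = append(done, t.ID)
	}
}

func main() {
	q := &memoryQueue{pending: []Task{
		{ID: "task-1", Command: "make test"},
		{ID: "task-2", Command: "make lint"},
	}}
	done, err := processLoop(q, func(t Task) error {
		fmt.Printf("processing %s: %s\n", t.ID, t.Command)
		return nil
	})
	fmt.Println(done, err)
}
```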
At each step (and many sub-steps) the plugins are called, giving them the
opportunity to modify the task execution or
to shut down the worker. Hence, plugins are responsible for implementing all the
interesting features, such as:
- Injection of environment variables,
- Exposing interactive shells,
- Extraction of artifacts,
- Setting up proxy services,
- Controlling the worker life-cycle.
By enabling or disabling plugins we can swap out the logic controlling the worker life-cycle, depending on whether we're running in a data-center or on an EC2 spot-node where we have to watch for spot-shutdown events. We can also disable certain features, like interactive shells and live logs, which might be undesirable in security-sensitive environments.
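As a rough illustration of enabling and disabling plugins per deployment, the sketch below builds a plugin set from a list of enabled names. The registry, the plugin names (`env`, `artifacts`, `interactive`, `livelog`, `spot-shutdown`), and the `loadPlugins` function are all hypothetical; the actual configuration format and plugin names used by taskcluster-worker are not given in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// Plugin is reduced to a name here; a real plugin would carry behaviour.
type Plugin interface{ Name() string }

type namedPlugin string

func (n namedPlugin) Name() string { return string(n) }

// registry maps hypothetical plugin names to their constructors.
var registry = map[string]func() Plugin{
	"env":           func() Plugin { return namedPlugin("env") },
	"artifacts":     func() Plugin { return namedPlugin("artifacts") },
	"interactive":   func() Plugin { return namedPlugin("interactive") },
	"livelog":       func() Plugin { return namedPlugin("livelog") },
	"spot-shutdown": func() Plugin { return namedPlugin("spot-shutdown") },
}

// loadPlugins instantiates the enabled plugins in the order given and
// rejects names that are not in the registry.
func loadPlugins(enabled []string) ([]Plugin, error) {
	plugins := make([]Plugin, 0, len(enabled))
	for _, name := range enabled {
		build, ok := registry[name]
		if !ok {
			return nil, fmt.Errorf("unknown plugin %q", name)
		}
		plugins = append(plugins, build())
	}
	return plugins, nil
}

// names joins plugin names for display.
func names(plugins []Plugin) string {
	parts := make([]string, len(plugins))
	for i, p := range plugins {
		parts[i] = p.Name()
	}
	return strings.Join(parts, ",")
}

func main() {
	configs := map[string][]string{
		// An EC2 spot node watches for spot-shutdown events.
		"spot-node": {"env", "artifacts", "livelog", "spot-shutdown"},
		// A security-sensitive worker omits interactive shells and live logs.
		"locked-down": {"env", "artifacts"},
	}
	for _, deployment := range []string{"spot-node", "locked-down"} {
		plugins, err := loadPlugins(configs[deployment])
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s: %s\n", deployment, names(plugins))
	}
}
```

Because each deployment only lists plugin names, changing life-cycle behaviour or dropping a feature is a configuration change in this sketch, not a code change.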
Finally, this architecture serves to decouple the code as much as possible, making it possible to modify the artifact extraction plugin without touching the plugin responsible for injecting environment variables.