Architecture

The design of Kanister follows the operator pattern. This means Kanister defines its own resources and interacts with those resources through a controller. This blog post describes the pattern in detail.

In particular, Kanister is composed of three main components: the Controller and two Custom Resources - ActionSets and Blueprints. The diagram below illustrates their relationship and how they fit together:

[Figure: Kanister Workflow]

As seen in the above diagram and described in detail below, all Kanister operations are declarative and require the user to create an ActionSet. Once the Kanister controller detects the ActionSet, it examines the environment for the Blueprint referenced in the ActionSet (along with other required configuration). If all requirements are satisfied, the controller uses the discovered Blueprint to complete the action (e.g., backup) specified in the ActionSet. Finally, the controller updates the original ActionSet with status and other metadata generated by the action execution.

Custom Resources

Users interact with Kanister through Kubernetes resources known as CustomResources (CRs). When the controller starts, it creates the CR definitions, called CustomResourceDefinitions (CRDs). CRDs were introduced in Kubernetes 1.7 and replaced ThirdPartyResources (TPRs). The lifecycle of these objects can be managed entirely through kubectl. Kanister uses Kubernetes' code generation tools to create Go client libraries for its CRs.

The schemas of the Kanister CRDs can be found in types.go.

Blueprints

Blueprint CRs are a set of instructions that tell the controller how to perform actions on a specific application.

A Blueprint contains a field called Actions which is a mapping of Action Name to BlueprintAction.

The definition of a BlueprintAction is:

// BlueprintAction describes the set of phases that constitute an action.
type BlueprintAction struct {
    Name               string              `json:"name"`
    Kind               string              `json:"kind"`
    ConfigMapNames     []string            `json:"configMapNames"`
    SecretNames        []string            `json:"secretNames"`
    InputArtifactNames []string            `json:"inputArtifactNames"`
    OutputArtifacts    map[string]Artifact `json:"outputArtifacts"`
    Phases             []BlueprintPhase    `json:"phases"`
    DeferPhase         *BlueprintPhase     `json:"deferPhase,omitempty"`
}
  • Kind represents the type of Kubernetes object this BlueprintAction is written for. Specifying it is optional. Going forward, if it is specified, Kanister will enforce that it matches the Object kind specified in an ActionSet referencing this BlueprintAction.

  • ConfigMapNames, SecretNames, InputArtifactNames are optional but, if specified, they list named parameters that must be included by the ActionSet.

  • OutputArtifacts is an optional map of rendered parameters made available to the BlueprintAction.

  • Phases is a required list of BlueprintPhases. These phases are invoked in order when executing this Action.

  • DeferPhase is an optional BlueprintPhase invoked after the execution of Phases defined above. A DeferPhase, when specified, is executed regardless of the statuses of the Phases. A DeferPhase can be used for cleanup operations at the end of an Action.

// BlueprintPhase is an individual unit of execution.
type BlueprintPhase struct {
    Func       string                     `json:"func"`
    Name       string                     `json:"name"`
    ObjectRefs map[string]ObjectReference `json:"objects"`
    Args       map[string]interface{}     `json:"args"`
}
  • Func is required as the name of a registered Kanister function. See Functions for the list of functions supported by the controller.

  • Name is mostly cosmetic. It is useful in quickly identifying which phases the controller has finished executing.

  • ObjectRefs is a map of references to the Kubernetes objects on which the action will be performed.

  • Args is a map of named arguments that the controller will pass to the Kanister function. String argument values can be templates that the controller will render using the template parameters. Each argument is rendered individually.

As a reference, below is an example of a BlueprintAction.

actions:
  example-action:
    phases:
    - func: KubeExec
      name: examplePhase
      args:
        namespace: "{{ .Deployment.Namespace }}"
        pod: "{{ index .Deployment.Pods 0 }}"
        container: kanister-sidecar
        command:
          - bash
          - -c
          - |
            echo "Example Action"
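
The templated args in the example above can be rendered as sketched below. The sketch assumes the templates follow Go's text/template syntax, which the `index` call suggests. `DeploymentParams`, `TemplateParams`, and `renderArg` are hypothetical names for illustration and are not Kanister's own types or functions.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// DeploymentParams and TemplateParams are hypothetical stand-ins for the
// template parameters the controller resolves; they are not Kanister types.
type DeploymentParams struct {
	Namespace string
	Pods      []string
}

type TemplateParams struct {
	Deployment DeploymentParams
}

// renderArg renders one string argument against the template parameters.
// Each argument is parsed and executed on its own, mirroring the statement
// that every argument is rendered individually.
func renderArg(arg string, tp TemplateParams) (string, error) {
	t, err := template.New("arg").Parse(arg)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, tp); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	tp := TemplateParams{Deployment: DeploymentParams{
		Namespace: "example-namespace",
		Pods:      []string{"example-deployment-0"},
	}}
	args := []string{
		"{{ .Deployment.Namespace }}",
		"{{ index .Deployment.Pods 0 }}",
	}
	for _, arg := range args {
		out, err := renderArg(arg, tp)
		if err != nil {
			fmt.Println("render error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", arg, out)
	}
}
```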

ActionSets

Creating an ActionSet instructs the controller to run an action now. The user specifies the runtime parameters inside the spec of the ActionSet. Based on the parameters, the Controller populates the Status of the object, executes the actions, and updates the ActionSet's status.

An ActionSetSpec contains a list of ActionSpecs. An ActionSpec is defined as follows:

// ActionSpec is the specification for a single Action.
type ActionSpec struct {
    Name        string                     `json:"name"`
    Object      ObjectReference            `json:"object"`
    Blueprint   string                     `json:"blueprint,omitempty"`
    Artifacts   map[string]Artifact        `json:"artifacts,omitempty"`
    ConfigMaps  map[string]ObjectReference `json:"configMaps"`
    Secrets     map[string]ObjectReference `json:"secrets"`
    Options     map[string]string          `json:"options"`
    Profile     *ObjectReference           `json:"profile"`
    PodOverride map[string]interface{}     `json:"podOverride,omitempty"`
}
  • Name is required and specifies the action in the Blueprint.

  • Object is a required reference to the Kubernetes object on which the action will be performed.

  • Blueprint is a required name of the Blueprint that contains the action to run.

  • Artifacts are input Artifacts passed to the Blueprint. This must contain an Artifact for each name listed in the BlueprintAction's InputArtifactNames.

  • ConfigMaps and Secrets, similar to Artifacts, are mappings from the names specified in the Blueprint to references to the Kubernetes objects to be used.

  • Profile is a reference to a Profile Kubernetes CustomResource that will be made available to the Blueprint.

  • Options is used to specify additional values to be used in the Blueprint.

  • PodOverride is used to specify pod specs that will override default specs of the Pod created while executing functions like KubeTask, PrepareData, etc.
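
The requirement that an ActionSpec supplies every name listed by the BlueprintAction can be checked with a small helper. The following is a sketch under assumptions: `missingNames` is a hypothetical function, not part of Kanister, and the artifact names are made up for the example. It needs Go 1.18 or later because it uses generics.

```go
package main

import (
	"fmt"
	"sort"
)

// missingNames returns, in sorted order, every required name that has no
// entry in provided. It works for Artifacts, ConfigMaps, and Secrets alike
// because only the map keys matter.
func missingNames[V any](required []string, provided map[string]V) []string {
	var missing []string
	for _, name := range required {
		if _, ok := provided[name]; !ok {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Names the BlueprintAction lists in InputArtifactNames (illustrative).
	required := []string{"backupLocation", "manifest"}
	// Artifacts supplied by the ActionSpec (illustrative).
	provided := map[string]string{"backupLocation": "s3://example-bucket/backup"}

	fmt.Println(missingNames(required, provided)) // prints [manifest]
}
```

A non-empty result corresponds to the case where the ActionSet is missing a required reference or artifact.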

As a reference, below is an example of an ActionSpec.

spec:
  actions:
  - name: example-action
    blueprint: example-blueprint
    object:
      kind: Deployment
      name: example-deployment
      namespace: example-namespace
    profile:
      apiVersion: v1alpha1
      kind: profile
      name: example-profile
      namespace: example-namespace

In addition to the Spec, an ActionSet also contains an ActionSetStatus which mirrors the Spec, but contains the phases of execution, their state, and the overall execution progress.

// ActionStatus is updated as we execute phases.
type ActionStatus struct {
    Name      string              `json:"name"`
    Object    ObjectReference     `json:"object"`
    Blueprint string              `json:"blueprint"`
    Phases    []Phase             `json:"phases"`
    Artifacts map[string]Artifact `json:"artifacts"`
}

Unlike in the ActionSpec, the Artifacts in the ActionStatus are the rendered output artifacts from the Blueprint. These are rendered and populated once the action is complete.

Each phase in the ActionStatus phases list contains the phase name of the Blueprint phase along with its state of execution and output.

// Phase is a subcomponent of an action.
type Phase struct {
    Name   string                 `json:"name"`
    State  State                  `json:"state"`
    Output map[string]interface{} `json:"output"`
}

When an ActionSet is deleted, the controller stops executing the actions in that ActionSet.

$ kubectl --namespace kanister delete actionset s3backup-j4z6f
  actionset.cr.kanister.io "s3backup-j4z6f" deleted

Note

Since ActionSets are Custom Resources, Kubernetes allows users to delete them like any other API objects. Currently, deleting an ActionSet to stop execution is an alpha feature.

Profiles

Profile CRs capture information about a location for data operation artifacts and corresponding credentials that will be made available to a Blueprint.

The definition of a Profile is:

// Profile
type Profile struct {
  Location          Location   `json:"location"`
  Credential        Credential `json:"credential"`
  SkipSSLVerify     bool       `json:"skipSSLVerify"`
}
  • SkipSSLVerify is a boolean that specifies whether skipping SSL verification is allowed when operating with the Location. If omitted from a CR definition, it defaults to false.

  • Location is required and used to specify the location that the Blueprint can use. Currently, only S3-compliant locations are supported. If any of the sub-components are omitted, they will be treated as "".

    The definition of Location is as follows:

// LocationType
type LocationType string

const (
  LocationTypeGCS         LocationType = "gcs"
  LocationTypeS3Compliant LocationType = "s3Compliant"
  LocationTypeAzure       LocationType = "azure"
)

// Location
type Location struct {
  Type     LocationType `json:"type"`
  Bucket   string       `json:"bucket"`
  Endpoint string       `json:"endpoint"`
  Prefix   string       `json:"prefix"`
  Region   string       `json:"region"`
}
  • Credential is required and used to specify the credentials associated with the Location. Currently, only key pair credentials for S3, GCS, and Azure locations are supported.

    The definition of Credential is as follows:

// CredentialType
type CredentialType string

const (
  CredentialTypeKeyPair CredentialType = "keyPair"
)

// Credential
type Credential struct {
  Type    CredentialType `json:"type"`
  KeyPair *KeyPair       `json:"keyPair"`
}

// KeyPair
type KeyPair struct {
  IDField     string          `json:"idField"`
  SecretField string          `json:"secretField"`
  Secret      ObjectReference `json:"secret"`
}
  • IDField and SecretField are required and specify the corresponding keys in the secret under which the KeyPair credentials are stored.

  • Secret is a required reference to a Kubernetes Secret object storing the KeyPair credentials.
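
The defaults described for omitted Profile fields line up with Go's zero values. The sketch below assumes the CR is decoded with encoding/json, as the struct tags suggest. It copies only the Location and SkipSSLVerify fields from the definitions above, leaves Credential out for brevity, and simplifies the Location type to a plain string. `decodeProfile` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Location and Profile repeat a subset of the fields defined above.
// Credential is omitted and LocationType is simplified to string.
type Location struct {
	Type     string `json:"type"`
	Bucket   string `json:"bucket"`
	Endpoint string `json:"endpoint"`
	Prefix   string `json:"prefix"`
	Region   string `json:"region"`
}

type Profile struct {
	Location      Location `json:"location"`
	SkipSSLVerify bool     `json:"skipSSLVerify"`
}

// decodeProfile parses a JSON-encoded profile. Fields missing from the
// input keep Go's zero values: "" for strings and false for booleans.
func decodeProfile(data []byte) (Profile, error) {
	var p Profile
	err := json.Unmarshal(data, &p)
	return p, err
}

func main() {
	input := []byte(`{"location": {"type": "s3Compliant", "bucket": "example-bucket"}}`)
	p, err := decodeProfile(input)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("region=%q prefix=%q skipSSLVerify=%v\n",
		p.Location.Region, p.Location.Prefix, p.SkipSSLVerify)
	// prints: region="" prefix="" skipSSLVerify=false
}
```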

As a reference, below is an example of a Profile and the corresponding secret.

apiVersion: cr.kanister.io/v1alpha1
kind: Profile
metadata:
  name: example-profile
  namespace: example-namespace
location:
  type: s3Compliant
  bucket: example-bucket
  endpoint: <endpoint URL>:<port>
  prefix: ""
  region: ""
credential:
  type: keyPair
  keyPair:
    idField: example_key_id
    secretField: example_secret_access_key
    secret:
      apiVersion: v1
      kind: Secret
      name: example-secret
      namespace: example-namespace
skipSSLVerify: true
---
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: example-secret
  namespace: example-namespace
data:
  example_key_id: <access key>
  example_secret_access_key: <access secret>
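
Kubernetes expects the values under a Secret's `data` field to be base64-encoded, so the `<access key>` and `<access secret>` placeholders above stand for encoded strings. A minimal sketch of producing such a value in Go follows. `encodeSecretValue` is a hypothetical helper and the credential shown is made up.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeSecretValue base64-encodes a raw credential so it can be placed
// under the data field of a Kubernetes Secret manifest.
func encodeSecretValue(raw string) string {
	return base64.StdEncoding.EncodeToString([]byte(raw))
}

func main() {
	// The key below is a placeholder, not a real credential.
	fmt.Println(encodeSecretValue("example-access-key"))
}
```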

Controller

The Kanister controller is a Kubernetes Deployment and is installed easily using kubectl. See Installation for more information on deploying the controller.

Execution Walkthrough

The controller watches for new/updated ActionSets in the same namespace in which it is deployed. When it sees an ActionSet with a nil status field, it immediately initializes the ActionSet's status to the Pending State. The status is also prepopulated with the pending phases.

Execution begins by resolving all the Template Parameters. If any required object references or artifacts are missing from the ActionSet, the ActionSet status is marked as failed. Otherwise, the template params are used to render the output Artifacts, and then the args in the Blueprint.

For each action, all phases are executed in order. The rendered args are passed to the Kanister function that corresponds to each phase. When a phase completes, the status of the phase is updated. If any single phase fails, the entire ActionSet is marked as failed. Upon failure, the controller ceases execution of the ActionSet.

Within an ActionSet, individual Actions are run in parallel.

Currently the user is responsible for cleaning up ActionSets once they complete.

During execution, the Kanister controller emits events to the respective ActionSets. In the above example, the execution transitions of ActionSet s3backup-j4z6f can be seen by using the following command:

$ kubectl --namespace kanister describe actionset s3backup-j4z6f
Events:
  Type    Reason           Age   From                 Message
  ----    ------           ----  ----                 -------
  Normal  Started Action   23s   Kanister Controller  Executing action backup
  Normal  Started Phase    23s   Kanister Controller  Executing phase backupToS3
  Normal  Update Complete  19s   Kanister Controller  Updated ActionSet 's3backup-j4z6f' Status->complete
  Normal  Ended Phase      19s   Kanister Controller  Completed phase backupToS3