This GAP introduces a proposal for a language-agnostic layer of abstraction over the Golem VPN and its high-level APIs (yapapi, yajsapi).
The main element of this proposal is the introduction of an application descriptor file which can be used to describe a deployment of a set of services to be run within the Golem Network.
This application descriptor is intended to be used together with high-level APIs to enable deploying and managing the nodes described within the descriptor.
The main motivation for this proposal is providing support for development of distributed applications (dApps) deployed within the Golem Network.
An application hosted on Golem can be described by a set of data structures which specify and represent all components of the application that are relevant from Golem’s point of view. These data structures form the Object Model of the application, which has the following attributes:
- it consists of objects of distinct types (eg. a payload, a service, a network), and each object type has a set of fields/attributes (eg. a payload includes a runtime, capabilities and runtime-specific params),
- the objects are related to each other (eg. each of the services is specified by a runtime, and may be part of a network).
The engine which is a host for an Object Model is responsible for:
Golem Application Object Model (diagram):
An application descriptor specifies the initial Golem Application Object Model (GAOM). It must include all the data needed to provision the Golem resources required by the application. The proposed format is YAML.
Here’s an example of such a descriptor:
    meta:
      name: "Sample-application"
      description: "A sample descriptor for a Golem application"
      author: "GolemFactory"
      version: "0.1.0"
      schema: "https://github.com/golemfactory/gaom/blob/v0.0.1/gaom.schema.json"
    payloads:
      web-server:
        runtime: vm
        constraints:
          - "golem.inf.cpu.cores=2"
          - "golem.inf.mem.gib>=4"
        capabilities:
          - "vpn"
        params:
          image_hash: "c37c1364f637c199fe710ca62241ff486db92c875b786814c6030aa1"
      db-server:
        runtime: vm
        constraints:
          - "golem.inf.storage.gib>10"
        capabilities:
          - "vpn"
        params:
          image_hash: "85021afecf51687ecae8bdc21e10f3b11b82d2e3b169ba44e177340c"
    networks:
      default:
        ip: "192.168.0.1/24"
    services:
      db-service:
        payload: db-server
        network: default
        init:
          - run:
              args: ["/bin/run_rqlite.sh"]
      web-server-service:
        payload: web-server
        network: default
        init:
          - run:
              args: ["/bin/bash", "-c", "cd /webapp && python app.py --db-address ${services.db-service.network_node.ip} --db-port 4001 initdb"]
          - run:
              args: ["/bin/bash", "-c", "cd /webapp && python app.py --db-address ${services.db-service.network_node.ip} --db-port 4001 run > /webapp/out 2> /webapp/err &"]
The GAOM schema definition shall be maintained in a dedicated repo: https://github.com/golemfactory/gaom.git
Current schema definition: link / documentation
Notes:
The Golem application descriptors may be published in two different formats. An application designer may choose the format which better suits their application.
A package may consist of a single descriptor file in YAML format. Such a single file is most likely sufficient for applications of low complexity. This package type is called a single-YAML descriptor package.
Complex application descriptors may benefit from splitting the YAML content into multiple files, grouped by eg. areas of concern. Such multi-file descriptors can be published as ZIP packages containing all the relevant YAML files in a flat directory structure (only files in the archive root shall be processed by the engine when processing the descriptor). This package type is called a multi-YAML descriptor package.
Running the Golem Deploy engine for a specific descriptor shall provision the respective resources/services on Golem Network (as a one-off action).
The engine (and corresponding CLI) shall support provisioning a Golem application based on single-YAML descriptors.
Multiple descriptor files may be used within the scope of a single deployment. In such a case, the files are merged based on their ordering. The merging is performed using a deep merge strategy. Here’s an example of how this merging strategy is applied:
Base file:
    meta:
      name: "Sample-application"
      description: "A sample descriptor for a Golem application"
      author: "GolemFactory"
      version: "0.1.0"
    payloads:
      nginx:
        runtime: "vm"
        params:
          image: "image-hash"
        constraints:
          "golem.inf.cpu.cores": 2
        capabilities:
          - "vpn"
Override file:
    payloads:
      nginx:
        params:
          repo: "repo-url"
        capabilities:
          - "gpu"
Resulting file:
    meta:
      name: "Sample-application"
      description: "A sample descriptor for a Golem application"
      author: "GolemFactory"
      version: "0.1.0"
    payloads:
      nginx:
        runtime: "vm"
        params:
          image: "image-hash"
          repo: "repo-url"
        constraints:
          "golem.inf.cpu.cores": 2
        capabilities:
          - "vpn"
          - "gpu"
A depth-first approach is used to determine the values which need to be added or updated to existing collections (lists and key-value maps). For maps, keys from overriding files have precedence over the base ones. In the case of lists, when merging lists from two files, the override values are simply concatenated to the base list. If required, this behaviour can be made configurable (e.g. to enable overriding the entire list instead).
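The merge rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the descriptors are taken to be already parsed into plain dicts and lists, and the function name deep_merge is hypothetical rather than part of any Golem API.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new structure with `override` merged into `base`.

    Maps are merged key by key (override wins on conflicts), lists are
    concatenated (base first), and any other value is replaced.
    """
    if isinstance(base, dict) and isinstance(override, dict):
        merged = {key: deepcopy(value) for key, value in base.items()}
        for key, value in override.items():
            if key in merged:
                # Depth-first: descend into values present on both sides.
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = deepcopy(value)
        return merged
    if isinstance(base, list) and isinstance(override, list):
        return deepcopy(base) + deepcopy(override)
    # Scalars and mismatched types: the overriding value takes precedence.
    return deepcopy(override)


# The base and override files from the example, as parsed data.
base = {
    "payloads": {
        "nginx": {
            "runtime": "vm",
            "params": {"image": "image-hash"},
            "constraints": {"golem.inf.cpu.cores": 2},
            "capabilities": ["vpn"],
        }
    }
}
override = {
    "payloads": {
        "nginx": {
            "params": {"repo": "repo-url"},
            "capabilities": ["gpu"],
        }
    }
}
merged = deep_merge(base, override)
```

Making the list behaviour configurable, as suggested above, would amount to swapping the list branch for one that returns only the override list.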
The entities and resources in a Golem application follow a certain lifecycle - they get provisioned, they remain active, they get removed/terminated. The application elements represented by the object graph shall have their state represented in the engine. The state represents the lifecycle stage of an application element at a given moment in time. The following states are considered:
As the descriptor is processed by the engine, the Golem resources are provisioned, and their state in GAOM is updated by the engine. Some resources depend on other resources (eg. a service may need to be provisioned in the context of a network), which implies the sequence of resource provisioning. The engine shall derive the dependency graph from the descriptor and, based on this, determine the sequence of provisioning actions.
It is possible to specify an explicit dependency between services. If service A should only be provisioned after service B becomes active, the specification of service A shall include a depends_on attribute, pointing at the label of service B. Based on this information, the engine shall build an appropriate dependency graph.
Note: a service may depend on a number of other services, therefore the depends_on attribute must allow for multiple dependency labels.
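One way to derive a provisioning sequence from these dependencies is a topological sort. The sketch below is an assumption-laden illustration, not the engine's actual algorithm: it treats a service's network attribute as an implicit dependency and depends_on as an explicit one (accepting a single label or a list), and the function name provisioning_order and the "networks."/"services." node prefixes are hypothetical. It uses graphlib from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter


def provisioning_order(descriptor):
    """Return resource labels in an order that respects dependencies."""
    graph = {}
    for name in descriptor.get("networks", {}):
        graph[f"networks.{name}"] = set()

    for name, spec in descriptor.get("services", {}).items():
        deps = set()
        # Implicit dependency: a service is provisioned within its network.
        if "network" in spec:
            deps.add(f"networks.{spec['network']}")
        # Explicit dependencies: one label or a list of labels.
        depends_on = spec.get("depends_on", [])
        if isinstance(depends_on, str):
            depends_on = [depends_on]
        deps.update(f"services.{label}" for label in depends_on)
        graph[f"services.{name}"] = deps

    missing = {dep for deps in graph.values() for dep in deps} - set(graph)
    if missing:
        raise ValueError(f"unknown dependencies: {sorted(missing)}")

    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


descriptor = {
    "networks": {"default": {"ip": "192.168.0.1/24"}},
    "services": {
        "web-server-service": {"network": "default", "depends_on": ["db-service"]},
        "db-service": {"network": "default"},
    },
}
order = provisioning_order(descriptor)
```

With this input the network comes first, then db-service, then web-server-service, because each node has exactly one resource it must wait for.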
The engine persists the state of GAOM after the state of the model changes.
The GAOM state can be persisted to a local or remote file.
The GAOM state can be persisted to a distributed storage system (choice of storage options to be selected for implementation).
The engine is able to load a persisted state of GAOM and reconcile its content versus actual state of Activities on Golem network. This reconciliation is required eg. when the engine is disconnected from the network (goes offline) and then is reconnected to resume control over the Golem application.
The synchronization may determine gaps between the persisted state (desired) and the current state (actual) on Golem network. The engine is capable of resolving the gaps, by provisioning or terminating Activities as required.
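For the local-file option, persistence could look roughly like the sketch below. It assumes the GAOM state is a JSON-serializable mapping; the JSON format and the helper names save_state and load_state are illustrative choices, not something this proposal prescribes. Writing to a temporary file and renaming it keeps a previously saved state intact if the engine is interrupted mid-write.

```python
import json
import os
import tempfile


def save_state(state, path):
    """Persist the state so that readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2, sort_keys=True)
        # os.replace swaps the file in a single step on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path):
    """Load a persisted state; a missing file means nothing was provisioned."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}
```

An engine would call save_state after every change to the model and load_state on start-up, before reconciling against the network.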
A GAOM model object type includes all relevant attributes of a given resource, including:
- Name, by which the resource can be identified in GAOM model (mandatory in the GAOM descriptor),
- State, which indicates the current state of the resource, as recognized by the engine (so not specified in the GAOM descriptor, but maintained by the engine),
- attributes which identify the actual resource instance on Golem network, eg. AgreementId and ActivityId.
The engine implementation shall follow generic logic which operates on resource abstractions, performing generic algorithms, as indicated below.
A general rule is that every tangible resource on Golem network (Service, Network) can be manipulated by an abstraction called ResourceAdapter, which implements a set of generic operations related to provisioning and maintaining a resource for an application. An engine implements a set of generic operations involving resources, their desired state (as per the GAOM descriptor), perceived state (as per GAOM model persisted by the engine) and actual state (as per the current state of resources on Golem network).
The engine operates on the following GAOM model states:
- desired state - as described by the application descriptor, this is the composition of resources required by an application at a given moment in time. Note that the desired state may change over time - and it is the responsibility of the engine to reflect those changes in the actual set of resources provisioned on Golem network.
- persisted state - the GAOM model snapshot which is persisted after the engine performs any operation on actual Golem resources. Note that the persisted state should also include the ‘physical’ resource attributes required to identify the actual resource instance on Golem network (eg. AgreementId/ActivityId).
- actual state - the GAOM model snapshot which represents the actual current state of resources as they operate on Golem network. The actual state should be obtained for a given moment in time by verifying the actual state of resources as recorded in the persisted state.
The fundamental operations of the engine are:
- Plan - based on the desired state and persisted state, derive the actual state, and then determine the sequence of actions which need to be performed on Golem network to provision/update/destroy resources in order to achieve the desired state. Note that Plan may be called in Apply or Destroy mode.
- Apply - run Plan(Apply) and then execute the actions determined by the Plan operation. As soon as each action completes, record the changes in persisted state.
- Destroy - run Plan(Destroy) and then execute the actions determined by the Plan operation. As soon as each action completes, record the changes in persisted state.
NOTE: the Apply operation can be executed repeatedly, and each execution should bring the Golem application to the state indicated by the application descriptor (the desired state). Even when the descriptor remains unchanged, repeated calls of the Apply operation ensure the application is stable.
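The "record the changes as soon as each action completes" rule can be illustrated with a small loop. This is a sketch under assumptions: actions are (kind, resource name) pairs, execute stands in for whatever performs the action on the network and returns the resulting resource model, and save stands in for the persistence mechanism. All of these names are hypothetical.

```python
def run_actions(actions, execute, persisted, save):
    """Execute planned actions one by one, persisting after each of them.

    Persisting inside the loop (rather than once at the end) means that an
    interruption halfway through still leaves an accurate record of the
    actions that did complete.
    """
    for kind, name in actions:
        resulting_model = execute(kind, name)
        persisted[name] = resulting_model
        save(dict(persisted))  # hand over a snapshot, not the live dict
    return persisted


# Usage with stand-ins: every action "succeeds" and snapshots are collected.
snapshots = []


def fake_execute(kind, name):
    state = "Terminated" if kind == "Destroy" else "Active"
    return {"state": state}


final = run_actions(
    [("Create", "db-service"), ("Destroy", "old-service")],
    fake_execute,
    {"old-service": {"state": "Active"}},
    snapshots.append,
)
```

Because the plan is recomputed from the persisted and actual state on every run, calling Apply again after an interruption would only schedule the actions that are still outstanding.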
A ResourceAdapter module is an abstraction over a specific type of resource, and implements low-level CRUD operations for the resources. Eg. a ServiceResourceAdapter implements logic specific to Create, Read, Update and Destroy operations on a service on Golem network. A generic implementation of each of those operations takes <operation>Request as parameter and returns <operation>Response, where both ...Request and ...Response contain the model (in GAOM sense) of the resource.
Read operation
ReadResourceRequest { Resource model }
ReadResourceResponse { Resource model }
Logic:
The purpose of this operation is to fetch the actual state of the resource as it exists on Golem network. In order to read the current state, the resource model must contain attributes which allow the actual resource instance to be identified (if those attributes are missing from the request, the resource is assumed not to exist).
- If ReadResourceRequest.model does not contain “identifying” attributes (eg. AgreementId for services) - return ReadResourceResponse.model = null
- If ReadResourceRequest.model does include “identifying” attributes, fetch all available information re. the resource:
  - if the resource is found, populate ReadResourceResponse.model with returned attributes and state
  - if the resource is not found, populate ReadResourceResponse.model, but set state to Terminated
Create operation
CreateResourceRequest { Resource model }
CreateResourceResponse { Resource model }
Logic:
The purpose of this operation is to provision the specified resource on Golem network. The resource gets provisioned using the specification received in the request, then the resulting attributes (including the “identifying” attributes, like AgreementId/ActivityId) are populated into the model returned in the response.
Update operation
UpdateResourceRequest { Resource model }
UpdateResourceResponse { Resource model }
Logic:
The purpose of this operation is to perform an update (change of attribute values) on a resource existing in Golem network without rebuilding it.
NOTE that an update without recreating the resource may not be possible in every case. When updated attributes cannot be applied to a resource at runtime, the Update operation should flag this as an error, so that the engine may decide to execute a rebuild (Destroy and Create) instead.
Destroy operation
DestroyResourceRequest { Resource model }
DestroyResourceResponse { Resource model }
Logic:
The purpose of this operation is to destroy/remove a resource from Golem network. The resource model in question must contain attributes which allow the actual resource instance to be identified.
- If DestroyResourceRequest.model does not contain “identifying” attributes (eg. AgreementId for services) - return DestroyResourceResponse.model = null
- If DestroyResourceRequest.model does include “identifying” attributes, fetch all available information re. the resource:
  - if the resource is found, destroy it and populate DestroyResourceResponse.model with known attributes, setting the state to Terminated
  - if the resource is not found, populate DestroyResourceResponse.model, but set state to Terminated
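The four operations can be pictured as an abstract interface plus one concrete adapter. The sketch below is illustrative only: it drops the Request/Response wrappers and passes the resource model directly, uses a single resource_id field as a stand-in for identifying attributes such as AgreementId/ActivityId, and backs the adapter with an in-memory dict instead of Golem network. The class names, the "Pending" and "Active" state labels and the UpdateNotPossible error are assumptions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass
class ResourceModel:
    name: str
    state: str = "Pending"
    resource_id: Optional[str] = None  # stand-in for AgreementId/ActivityId
    attributes: dict = field(default_factory=dict)


class UpdateNotPossible(Exception):
    """Raised by update() so the engine can fall back to a rebuild."""


class ResourceAdapter(ABC):
    @abstractmethod
    def create(self, model: ResourceModel) -> ResourceModel: ...

    @abstractmethod
    def read(self, model: ResourceModel) -> Optional[ResourceModel]: ...

    @abstractmethod
    def update(self, model: ResourceModel) -> ResourceModel: ...

    @abstractmethod
    def destroy(self, model: ResourceModel) -> Optional[ResourceModel]: ...


class InMemoryAdapter(ResourceAdapter):
    """Toy adapter: a dict plays the role of the resources on the network."""

    def __init__(self):
        self._live = {}
        self._counter = 0

    def create(self, model):
        self._counter += 1
        resource_id = f"res-{self._counter}"
        self._live[resource_id] = dict(model.attributes)
        return replace(model, resource_id=resource_id, state="Active")

    def read(self, model):
        if model.resource_id is None:
            return None  # no identifying attributes: assumed not to exist
        if model.resource_id in self._live:
            found = dict(self._live[model.resource_id])
            return replace(model, attributes=found, state="Active")
        return replace(model, state="Terminated")

    def update(self, model):
        if model.resource_id not in self._live:
            raise UpdateNotPossible(model.name)
        self._live[model.resource_id] = dict(model.attributes)
        return replace(model, state="Active")

    def destroy(self, model):
        if model.resource_id is None:
            return None
        self._live.pop(model.resource_id, None)
        return replace(model, state="Terminated")
```

Returning a fresh model from every operation, rather than mutating the request, keeps the persisted snapshot and the freshly read one easy to compare.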
The Plan operation is expected to reconcile the differences between:
- Desired state
- Persisted state
- Actual state as actually observed on Golem network
in order to derive a sequence of actions which must be performed on resources to bring the application (the resources required by the application) from its current state to the desired state. Possible actions are:
- Create
- Update
- Rebuild
- Destroy
A general flow of the Plan operation is as follows:
1. Reconcile vs actual state
In this step, the persisted state (which indicates the current expected state of the application) needs to be reconciled against the actual state of the resources. This requires iterating over all recorded resources and verifying their current actual state (by calling the respective ResourceAdapter.Read() operations). The resulting actual state becomes input for the second stage.
NOTE that the initial (ie. before the application is provisioned for the first time) persisted state is empty.
2. Reconcile vs desired state
Once the actual state is up to date, it can be reconciled against the desired state (as expressed by the application GAOM descriptor).
Pseudo-logic of the reconciliation in Plan(Apply) mode is indicated below:
- resource in desired but not in actual - add Create to the action list
- resource in desired and in actual and has changed - check if an update can be performed or a rebuild is required
  - if an update is possible - add Update to the action list
  - otherwise - add Rebuild to the action list
- resource in actual but not in desired - add Destroy to the action list
Pseudo-logic of the reconciliation in Plan(Destroy) mode is indicated below:
- every resource in actual - add Destroy to the action list
3. Determine knock-on impact
As the previous step may have yielded updates/rebuilds to existing resources, and there may be other resources dependent on the changes, each Update/Rebuild action should also trigger follow-on Update/Rebuild actions on resources which are downstream in the GAOM dependency graph.
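The reconciliation against the desired state and the knock-on step can be sketched as two small functions. The assumptions are stated plainly: both states are dicts mapping a resource name to its specification, actual contains only resources that currently exist, the removal action is named Destroy to match the list of possible actions, the can_update callback stands in for the adapter-specific decision between Update and Rebuild, and a follow-on action simply inherits the kind of the action that triggered it. All names are hypothetical.

```python
def plan(desired, actual, mode="apply", can_update=lambda old, new: True):
    """Return a list of (action, resource name) pairs."""
    if mode == "destroy":
        return [("Destroy", name) for name in actual]

    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("Create", name))
        elif actual[name] != spec:
            kind = "Update" if can_update(actual[name], spec) else "Rebuild"
            actions.append((kind, name))
    for name in actual:
        if name not in desired:
            actions.append(("Destroy", name))
    return actions


def with_knock_on(actions, dependents):
    """Add follow-on actions for resources downstream of changed ones.

    `dependents` maps a resource name to the names that depend on it.
    """
    result = list(actions)
    planned = {name for _, name in result}
    queue = [(kind, name) for kind, name in actions if kind in ("Update", "Rebuild")]
    while queue:
        kind, current = queue.pop(0)
        for downstream in dependents.get(current, []):
            if downstream not in planned:
                planned.add(downstream)
                result.append((kind, downstream))
                queue.append((kind, downstream))
    return result


desired = {"db": {"image": "a"}, "web": {"image": "b2"}, "cache": {"image": "c"}}
actual = {"db": {"image": "a"}, "web": {"image": "b1"}, "old": {"image": "z"}}
apply_actions = plan(desired, actual)
destroy_actions = plan(desired, actual, mode="destroy")
```

A real engine would additionally order the resulting actions along the dependency graph before executing them.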
The purpose of the Apply and Destroy operations is to derive the action sequence and then to apply it via ResourceAdapters. As the actions are being applied, the resulting resource changes are recorded in the persisted state.
The high-level flow is as follows:
The engine (and corresponding CLI) shall support provisioning Golem application based on multi-YAML descriptors (ZIP-archived). Note: for the purposes of YAML file merging - the order of processing files within the ZIP-archive shall be undefined (ie. there is no guarantee which YAML file shall override the content of any other YAML file in the same archive).
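Selecting the files to process from a multi-YAML package might look like the sketch below, which uses zipfile from the Python standard library. The function name read_root_descriptors, the accepted .yaml/.yml extensions and the UTF-8 decoding are assumptions made for the illustration; the proposal itself only states that files outside the archive root are ignored.

```python
import io
import zipfile

YAML_SUFFIXES = (".yaml", ".yml")  # assumed extensions


def read_root_descriptors(archive):
    """Return {file name: text} for YAML files stored in the archive root.

    `archive` may be a path or a binary file-like object.
    """
    descriptors = {}
    with zipfile.ZipFile(archive) as bundle:
        for info in bundle.infolist():
            # ZIP member names always use "/" as the separator, so any
            # slash means the entry lives in a subdirectory.
            if info.is_dir() or "/" in info.filename:
                continue
            if info.filename.lower().endswith(YAML_SUFFIXES):
                descriptors[info.filename] = bundle.read(info).decode("utf-8")
    return descriptors


# Usage: build a small archive in memory and read it back.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as bundle:
    bundle.writestr("services.yaml", "services: {}\n")
    bundle.writestr("nested/ignored.yaml", "payloads: {}\n")
    bundle.writestr("README.md", "not a descriptor\n")
buffer.seek(0)
found = read_root_descriptors(buffer)
```

Since the processing order within the archive is undefined, descriptor authors would be wise to keep the files free of conflicting keys so that the merge result does not depend on it.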
The attribute values in the descriptor may include references to the current state of the Object Model (to specify that service provisioning requires parameters which depend on another service’s state, eg. a web application service must be launched with connection details of a database service specified in the same descriptor).
Note that the reference syntax also indicates an implicit resource dependency, ie. if the launch of service B depends on attributes of service A which are only known after service A is launched, the engine must first provision service A, obtain its updated Object Model state, populate the references to service A state in service B, and then provision service B.
The proposed syntax is as follows:
A reference to the Object Model must be enclosed in ${}, for example:
    init:
      - run:
          args: ["/bin/bash", "-c", "cd /webapp && python app.py --db-address ${services.db-service.network_node.ip} initdb"]
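Resolving such references can be sketched with a regular expression and a dotted-path lookup. This is an illustration under assumptions: the Object Model state is a nested dict, path segments are separated by dots (so keys may not contain dots themselves), and the helper names as well as the IP address in the sample state are made up for the example.

```python
import re

# Matches ${...} and captures the dotted path inside the braces.
REFERENCE = re.compile(r"\$\{([^{}]+)\}")


def find_references(text):
    """Return the dotted paths referenced in `text`."""
    return REFERENCE.findall(text)


def referenced_services(text):
    """Return labels of services that `text` implicitly depends on."""
    labels = set()
    for path in find_references(text):
        parts = path.split(".")
        if len(parts) > 1 and parts[0] == "services":
            labels.add(parts[1])
    return labels


def lookup(model, path):
    """Walk the nested model along a dotted path; KeyError if unresolved."""
    node = model
    for key in path.split("."):
        node = node[key]
    return node


def resolve(text, model):
    """Replace every ${...} reference with its current value in the model."""
    return REFERENCE.sub(lambda match: str(lookup(model, match.group(1))), text)


# Hypothetical state after db-service has been provisioned.
state = {"services": {"db-service": {"network_node": {"ip": "192.168.0.2"}}}}
command = (
    "cd /webapp && python app.py "
    "--db-address ${services.db-service.network_node.ip} initdb"
)
resolved = resolve(command, state)
```

The referenced_services helper shows how the same syntax yields the implicit dependency: any service named in a reference has to be provisioned before the referring one.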
The file format of choice is YAML. YAML is used in both Docker Compose and Kubernetes, both of which are widely adopted deployment solutions. As such, it has become a de facto standard for deployment descriptors.
A considered alternative is the HashiCorp HCL format, which is native to the Terraform ecosystem. YAML, however, is deemed more widely adopted.
In its current form, the deployment descriptor does not include support for specifying the market strategy which should be used by the requestor. There are two primary reasons for this:
Not applicable: the Golem Deploy concept is designed as an “upper layer” on top of the HL API libraries, so there are no backwards compatibility concerns.
The application descriptor examples which illustrate various Golem application use cases are attached below:
An early reference implementation of the “golem deploy” concept has been started in following set of projects:
The “golem deploy” concept is an abstraction layer defined above Golem’s standard Requestor/Provider model. It is assumed that a “golem deploy” engine acts as one or more Requestors, and therefore all of Golem’s security/permissions models apply. No dedicated enhancements to yagna security features are required.
The “golem deploy” operation requires the engine to act as a Requestor Agent application, connecting to Golem daemons using Golem identities. Therefore an implementation of an engine must consider proper, industry-standard credentials management.
Copyright and related rights waived via CC0.