
Data Directory Module

Table Of Contents

Name

DataDirectory

Dependencies

Design

Motivation

The Data Directory manages what data exists on the network. As such, it contains a complete list of all accessible content: each file uploaded to any storage node requires an entry in the data directory, in the form of a DataObject. Each DataObject is identified by a unique ContentId.

It also links to the Data Object Type Registry, permitting all participating apps (consuming apps or storage nodes) to negotiate whether content may be uploaded.

Uploading

The simplified workflow for uploading content is as follows:

  1. Choose a ContentId. The ContentId is a cryptographically secure random byte string, usually represented in Base58-encoded form.
  2. Select an appropriate DataObjectType from the Data Object Type Registry.
  3. Verify that the file to be uploaded passes the constraints on the DataObjectType.
  4. Create a DataObject linking to the selected DataObjectType. The DataObject is not initially active.
  5. The runtime will select a Liaison, a storage node that must handle the upload.
  6. Contact the Liaison for upload. The Liaison will accept or reject the upload depending on whether the uploaded data fulfils the constraints placed on the DataObjectType. The Liaison will update the DataObject according to its decision.
  7. If the upload was accepted, the Liaison will also create an entry in the Data Object Storage Registry, indicating that the data is stored.
  8. Other storage nodes may replicate the data object, and create their own entries in the Data Object Storage Registry.
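
Step 1 of the workflow above can be sketched as follows. This is a minimal illustration, not the module's implementation: the 32-byte length, the Bitcoin-style Base58 alphabet, and the function names `new_content_id`, `base58_encode` and `base58_decode` are all assumptions, since this document fixes neither the length nor the alphabet.

```python
import secrets

# Assumption: the Bitcoin-style Base58 alphabet (no 0, O, I or l).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def new_content_id(num_bytes: int = 32) -> bytes:
    """Return a secure random byte string (the length is an assumption)."""
    return secrets.token_bytes(num_bytes)


def base58_encode(data: bytes) -> str:
    """Encode bytes as Base58, mapping each leading zero byte to '1'."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def base58_decode(text: str) -> bytes:
    """Inverse of base58_encode."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * pad + body


content_id = new_content_id()
printable = base58_encode(content_id)  # form shown to users
```

Decoding the printable form with `base58_decode` yields the original bytes again, so either representation can be used as the lookup key as long as callers agree on one.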

Downloading

Any storage provider listed in the Data Object Storage Registry as holding a given file can be contacted for downloads.

Uploading/Downloading Protocol

While the above describes the abstract protocol for uploads and downloads, the wire protocol consists of HTTP requests conforming to the most current OpenAPI specification of the storage node.

The specification is generated from code documentation, so it cannot easily be linked here. The API is not yet stable, so a v0 namespace is used. The most current API specification lives on Joystream's first storage node.

Storage Backend Metadata

Note that a DataObject contains a storage_metadata field and a corresponding storage_metadata_version field. The version determines how to interpret the metadata. Both are maintained by the Liaison (and potentially other storage providers) to manage metadata specific to the storage backend.

In version 1 of this metadata, it is expected that:

  1. It is serialized JSON with an Object at the top level.
  2. It contains an ipfs_content_id field specifying a content ID to map to when using IPFS as the backend.
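
A reader of this field might validate version 1 metadata along the following lines. This is a sketch under assumptions: the function name `parse_storage_metadata` is hypothetical, and treating ipfs_content_id as a required, non-empty string is an interpretation of the two expectations above rather than something this document states.

```python
import json


def parse_storage_metadata(version: int, raw: str) -> dict:
    """Interpret storage_metadata according to storage_metadata_version."""
    if version != 1:
        # Only version 1 is described in this document.
        raise ValueError(f"unsupported storage_metadata_version: {version}")
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("storage_metadata is not valid JSON") from exc
    if not isinstance(doc, dict):
        raise ValueError("top-level JSON value must be an Object")
    cid = doc.get("ipfs_content_id")
    # Assumption: the field is required and is a non-empty string.
    if not isinstance(cid, str) or not cid:
        raise ValueError("missing ipfs_content_id")
    return doc


# "QmExample" is a placeholder, not a real IPFS content ID.
meta = parse_storage_metadata(1, '{"ipfs_content_id": "QmExample"}')
```

Dispatching on the version first keeps later metadata versions free to use a different serialization.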

Liaison Selection

The runtime is responsible for selecting a Liaison for a DataObject. The Liaison is selected from the pool of currently staked storage providers in any of the tranches configured in the Staking part of this module for the matching DataObjectType.

The implication is that not too much time should pass between creating a DataObject and uploading data to the Liaison. If the Liaison has reason to un-stake before re-distributing the content, the DataObject cannot be uploaded, and a new DataObject needs to be created.
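
The selection rule can be modelled roughly as below. This is a sketch only: the `Tranche` shape and the `select_liaison` name are hypothetical, and the uniform random choice is an assumption, because this document does not say how the runtime picks among eligible providers.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Tranche:
    """Hypothetical model of a staking tranche."""
    type_id: int                                # DataObjectTypeId served
    staked: list = field(default_factory=list)  # staked provider AccountIds


def select_liaison(tranches, type_id, rng=random):
    """Pick one staked provider from any tranche matching type_id.

    Returns None when no provider is currently staked for that type.
    Assumption: uniform random choice; the real strategy is not specified.
    """
    pool = sorted({acct
                   for t in tranches if t.type_id == type_id
                   for acct in t.staked})
    if not pool:
        return None
    return rng.choice(pool)


tranches = [
    Tranche(type_id=1, staked=["provider-a", "provider-b"]),
    Tranche(type_id=1, staked=["provider-c"]),
    Tranche(type_id=2, staked=["provider-d"]),
]
liaison = select_liaison(tranches, type_id=1, rng=random.Random(0))
```

Because the pool reflects who is staked at selection time, a provider that un-stakes afterwards remains recorded as the Liaison, which is why the upload should follow promptly.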

State

  • DataObjects - a map of ContentId to DataObject.

Events

  • DataObjectAdded - a new DataObject has been created, with its liaison judgement set to Pending. See above for a high-level description of how a Liaison will judge whether a DataObject is acceptable.

The event payload consists of the ContentId and the uploader's AccountId.

  • DataObjectAccepted - the chosen Liaison accepted the data object.

The event payload consists of the ContentId and the Liaison AccountId.

  • DataObjectRejected - the chosen Liaison rejected the data object.

The event payload consists of the ContentId and the Liaison AccountId.

Dispatchable Methods

add_data_object

Payload

  • The uploader origin.
  • content_id: The ContentId chosen by the uploader.
  • type_id: The DataObjectTypeId chosen by the uploader.
  • size: The file size, in bytes.

Description

As one of the first steps of the upload process, the uploader will call this function to register their intent to upload data. This also registers the ContentId, making it unavailable for further upload attempts.

As part of this function, the runtime will choose an appropriate Liaison, and record it in the DataObject.

Errors

  • The uploader is not an active member.
  • The chosen ContentId is already in use.
  • The chosen DataObjectType is inactive.

Side effect(s)

  • A DataObject with the given size and type is created, and registered under the ContentId in DataObjects.

Event(s)

  • DataObjectAdded
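
The behaviour of add_data_object described above can be modelled in a few lines. This is an illustrative in-memory model, not the runtime code: the `DataDirectory` class, `DispatchError`, the `pick_liaison` callback, the `owner` field and the tuple-shaped events are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from enum import Enum


class LiaisonJudgement(Enum):
    PENDING = "Pending"
    ACCEPTED = "Accepted"
    REJECTED = "Rejected"


class DispatchError(Exception):
    """Hypothetical error type for a failed dispatchable call."""


@dataclass
class DataObject:
    owner: str        # uploader AccountId (field name is an assumption)
    type_id: int
    size: int         # in bytes
    liaison: str      # AccountId of the selected Liaison
    liaison_judgement: LiaisonJudgement = LiaisonJudgement.PENDING


class DataDirectory:
    def __init__(self, active_members, active_types, pick_liaison):
        self.active_members = set(active_members)
        self.active_types = set(active_types)
        self.pick_liaison = pick_liaison  # stand-in for Liaison selection
        self.data_objects = {}            # ContentId -> DataObject
        self.events = []

    def add_data_object(self, origin, content_id, type_id, size):
        if origin not in self.active_members:
            raise DispatchError("uploader is not an active member")
        if content_id in self.data_objects:
            raise DispatchError("ContentId is already in use")
        if type_id not in self.active_types:
            raise DispatchError("DataObjectType is inactive")
        liaison = self.pick_liaison(type_id)
        self.data_objects[content_id] = DataObject(origin, type_id, size, liaison)
        self.events.append(("DataObjectAdded", content_id, origin))


directory = DataDirectory({"alice"}, {1}, pick_liaison=lambda type_id: "liaison-1")
directory.add_data_object("alice", b"example-content-id", type_id=1, size=1024)
```

All checks run before any state is written, so a failed call leaves DataObjects and the event log untouched.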

accept_data_object

Payload

  • The Liaison origin.
  • content_id: The ContentId of the DataObject to accept.

Description

The Liaison calls this function when its checks indicate that the DataObject should be accepted.

Errors

  • The provided ContentId was not found.
  • The signing caller is not the designated Liaison for the DataObject.

Side effect(s)

  • The DataObject identified by ContentId has its liaison_judgement field set to Accepted.

Event(s)

  • DataObjectAccepted

reject_data_object

Payload

  • The Liaison origin.
  • content_id: The ContentId of the DataObject to reject.

Description

The Liaison calls this function when its checks indicate that the DataObject should be rejected.

Errors

  • The provided ContentId was not found.
  • The signing caller is not the designated Liaison for the DataObject.

Side effect(s)

  • The DataObject identified by ContentId has its liaison_judgement field set to Rejected.

Event(s)

  • DataObjectRejected
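
Since accept_data_object and reject_data_object share their checks and differ only in the judgement and event they record, both can be sketched over one helper. As before, this is a simplified model under assumptions: the plain dictionary standing in for DataObjects, the `JudgementError` type, the `_judge` helper and the tuple-shaped events are hypothetical.

```python
from dataclasses import dataclass


class JudgementError(Exception):
    """Hypothetical error type for a failed judgement call."""


@dataclass
class DataObject:
    liaison: str                        # designated Liaison AccountId
    liaison_judgement: str = "Pending"


def _judge(data_objects, events, origin, content_id, judgement, event_name):
    obj = data_objects.get(content_id)
    if obj is None:
        raise JudgementError("ContentId not found")
    if origin != obj.liaison:
        raise JudgementError("caller is not the designated Liaison")
    obj.liaison_judgement = judgement
    # Payload: the ContentId and the Liaison AccountId.
    events.append((event_name, content_id, origin))


def accept_data_object(data_objects, events, origin, content_id):
    _judge(data_objects, events, origin, content_id,
           "Accepted", "DataObjectAccepted")


def reject_data_object(data_objects, events, origin, content_id):
    _judge(data_objects, events, origin, content_id,
           "Rejected", "DataObjectRejected")


data_objects = {b"cid-1": DataObject("liaison-1"),
                b"cid-2": DataObject("liaison-1")}
events = []
accept_data_object(data_objects, events, "liaison-1", b"cid-1")
reject_data_object(data_objects, events, "liaison-1", b"cid-2")
```

The sketch does not guard against judging an object twice; this document lists no such error, so that behaviour is left open here.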