Spaces

The space functions as the top-level governance structure of the fabric, orchestrating how providers cooperate to serve tenant data. It is responsible for

  • Creating providers
  • Creating tenants
  • Defining rules of the space which tenants/providers agree to abide by
  • Reserving slashable bonds for both providers and tenants
  • Governance, including
    • Admitting new tenants/providers
    • Removing misbehaving tenants/providers
    • Slashing tenants/providers
    • Changing rules

Space rules

  • Provider Bond: An amount, PROVIDER_BOND, of currency each provider must lock up in order to participate within the space. Funds can be slashed from here if a provider misbehaves.
  • Tenant Bond: An amount, TENANT_BOND, of currency each tenant must lock up in order to participate within the space. Funds can be slashed from here if a tenant misbehaves.
  • SLAs: Specifications for the availability requirements provider nodes must meet.
  • Partition Number: The partitioning constant for part storage (TODO: @Serban expand)
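
As a rough illustration, these rules could be grouped into a single per-space value on chain. This is only a sketch: the struct and field names (SpaceRules, Sla, min_uptime_bps) are assumptions, not part of the spec, and the balance type would normally come from the runtime's currency configuration.

  /// Availability requirements provider nodes must meet (placeholder field).
  pub struct Sla {
      /// Minimum uptime a node must maintain, in basis points (e.g. 9990 = 99.90%).
      pub min_uptime_bps: u32,
  }

  /// The rules every provider and tenant in a space agrees to abide by.
  pub struct SpaceRules {
      /// PROVIDER_BOND: currency each provider locks up; slashable on misbehaviour.
      pub provider_bond: u128,
      /// TENANT_BOND: currency each tenant locks up; slashable on misbehaviour.
      pub tenant_bond: u128,
      /// Availability requirements for provider nodes.
      pub sla: Sla,
      /// The partitioning constant for part storage.
      pub partition_number: u32,
  }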

Providers

A provider is an entity which owns nodes within a space. It is responsible for ensuring its nodes abide by the space's rules, risking its bond if it misbehaves. Providers also have a permission structure which associates levels of privilege with cryptographic keys.

Provider Permission Levels

Provider keys have the following permission levels, from most to least privileged

  • Root level
    • add/remove admins (effectively allows for admin key rotation)
  • Admin level
    • add/remove nodes
    • bill tenants
  • Node level
    • Co-author versions with tenants
    • Mark itself as no longer pending
    • Participate in part networking

Keys with a higher level may set the permission level of any key strictly below their own. For example, a key with root permission may grant other keys admin level, but an admin key may neither grant other keys admin level nor change the permission level of an existing admin key.
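
If the levels are modelled as an ordered enum, the "strictly below" rule reduces to a single comparison. This is a minimal sketch; only the ordering is taken from the list above, and the names are illustrative.

  /// Provider permission levels, ordered from least to most privileged so that
  /// the derived `Ord` matches the privilege ordering.
  #[derive(PartialEq, Eq, PartialOrd, Ord, Clone, Copy, Debug)]
  pub enum ProviderPermission {
      Node,
      Admin,
      Root,
  }

  /// A key may set another key's level only to a level strictly below its own.
  pub fn can_set_permission(actor: ProviderPermission, target: ProviderPermission) -> bool {
      target < actor
  }

  fn main() {
      // Root may grant Admin; Admin may only grant Node.
      assert!(can_set_permission(ProviderPermission::Root, ProviderPermission::Admin));
      assert!(!can_set_permission(ProviderPermission::Admin, ProviderPermission::Admin));
      assert!(can_set_permission(ProviderPermission::Admin, ProviderPermission::Node));
  }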

Provider Blockchain Storage

(ProviderId) -> {
  space: SpaceId,
  root: AccountId,
}
(ProviderId, NodeId) -> {
  pending: true,
  locator: BoundedString,
}
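
For concreteness, here is a sketch of these two maps as plain Rust types, assuming the ID widths from the identifier tables later in this document. An actual pallet would declare them as a Substrate StorageMap and StorageDoubleMap; the record and field names mirror the layout above but are otherwise illustrative.

  use std::collections::BTreeMap;

  // Illustrative ID aliases; on chain these are fixed-size byte arrays.
  pub type SpaceId = [u8; 10];
  pub type ProviderId = [u8; 10];
  pub type NodeId = [u8; 10];
  pub type AccountId = [u8; 32];

  /// Value stored at (ProviderId).
  pub struct ProviderRecord {
      pub space: SpaceId,
      pub root: AccountId,
  }

  /// Value stored at (ProviderId, NodeId).
  pub struct NodeRecord {
      /// Newly created nodes stay pending until they have synced with their peers.
      pub pending: bool,
      /// Network locator; bounded on chain (e.g. a BoundedVec<u8>).
      pub locator: String,
  }

  /// In-memory stand-ins for the two on-chain maps.
  pub struct ProviderStorage {
      pub providers: BTreeMap<ProviderId, ProviderRecord>,
      pub nodes: BTreeMap<(ProviderId, NodeId), NodeRecord>,
  }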

Provider Blockchain Calls

In addition to setting permissions on keys, we have the following calls

  • CreateProvider(origin: Origin, space: SpaceId, prv: ProviderId)

    • Checks space governance to see whether origin can create a provider
    • Creates a provider at prv
      {
        space: space,
        root: origin,
      }
      
    • Bonds some currency from root to space under prv
  • CreateNode(origin: Origin, prv: ProviderId, node: NodeId, locator: BoundedString)

    • Checks that origin has at least ADMIN level in prv
    • Creates a node (note that it's marked as pending while it syncs up with other nodes)
      {
        pending: true,
        locator: locator,
      }
      
    • Registers origin to prv with NODE permission level1
  • ConfirmNode(origin: Origin, prv: ProviderId, node: NodeId) marks a node as no longer pending (see the sketch after this list)

    • Checks that origin has NODE permission or above in prv
    • Sets pending=false at (prv, node)
  • RemoveNode(origin: Origin, prv: ProviderId, node: NodeId) removes a node

    • Checks that origin has at least ADMIN permission in prv
    • Deletes the node at (prv, node)
  • BillTenant TODO

1 Should this error if the key already exists within the permissions scheme?
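
Below is a sketch of the CreateNode and ConfirmNode transitions described above, reusing the ProviderPermission and ProviderStorage types from the earlier sketches. The error variants are assumptions, and the bonding and key-registration bookkeeping is omitted.

  #[derive(Debug)]
  pub enum Error {
      NotPermitted,
      NodeMissing,
  }

  impl ProviderStorage {
      /// CreateNode: origin must hold at least ADMIN in `prv`; the node starts
      /// out pending until it has synced with the other nodes.
      pub fn create_node(
          &mut self,
          origin_level: ProviderPermission,
          prv: ProviderId,
          node: NodeId,
          locator: String,
      ) -> Result<(), Error> {
          if origin_level < ProviderPermission::Admin {
              return Err(Error::NotPermitted);
          }
          self.nodes.insert((prv, node), NodeRecord { pending: true, locator });
          Ok(())
      }

      /// ConfirmNode: origin must hold NODE or above; clears the pending flag.
      pub fn confirm_node(
          &mut self,
          origin_level: ProviderPermission,
          prv: ProviderId,
          node: NodeId,
      ) -> Result<(), Error> {
          if origin_level < ProviderPermission::Node {
              return Err(Error::NotPermitted);
          }
          let rec = self.nodes.get_mut(&(prv, node)).ok_or(Error::NodeMissing)?;
          rec.pending = false;
          Ok(())
      }
  }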

Tenants

A tenant is an owner and creator of content. It is responsible for providing a service, reachable by providers' nodes, which manages keys and encrypts/decrypts content.

Tenant Permissions

Tenant keys have the following permission levels, from most to least privileged

  • Root level
    • add/remove admins
    • participate in space governance
  • Admin level
    • add/remove KMSes
    • add funds for billing
  • KMS level
    • create/remove content keys
  • Content level
    • co-author content object versions with nodes

Tenant Blockchain Storage

(TenantId) -> {
  space: SpaceId,
  root: AccountId,
}
(TenantId, KMSId) -> {
  locator: BoundedString,
}

Tenant Blockchain Calls

  • CreateTenant(origin: Origin, space: SpaceId, tenant: TenantId)
    • Checks space governance to see whether origin can create a tenant
    • Creates tenant
    {
      space: space,
      root: origin,
    }
    
    • Registers origin under tenant with ADMIN level
    • Bonds some currency from origin to the space under tenant
  • AddKMS(origin: Origin, tenant: TenantId, kms: KMSId, locator: BoundedString)
    • Checks that origin has ADMIN permission for tenant
    • Creates a KMS at (tenant, kms)
      {
        locator: locator,
      }
    
  • RemoveKMS(origin: Origin, tenant: TenantId, kms: KMSId) removes a KMS
    • Checks that origin has at least ADMIN permissions in tenant
    • Removes (tenant, kms)
  • TODO: Remove Tenant, Top up billing balance

Libraries

TODO: lukas/serban

Content Objects

Content objects are the main way tenants store and retrieve data, globally referenced by (TenantId, ContentObjectId). They are created by storing data in a node, which calls CommitVersion with a digest of the data. Once the version is committed, other nodes in the space can retrieve the content object. Once a sufficient number of nodes have retrieved copies of the content object1, the original authoring node submits a FinalizeVersion, which marks the commit as finalized. Each content object also has a 'head' version, which is the version retrieved when the content object is referred to by its ID.

In order to prevent nodes from creating arbitrary versions without permission of tenants, a tenant-signed version commit message (VersionCommitMessage) must be provided in the CommitVersion call.

  VersionCommitMessage {
    originator: ProviderId,
    tenant_id: TenantId,
    content_object_id: ContentObjectId,
    version_id: VersionId,
    tlp_size: #[compact] u64,
    ts: u64,
    set_head_on_finalize: bool,
    kms_id: KMSId,
  }
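
Client-side, the tenant's CONTENT-level key signs the SCALE encoding of this message and hands the message plus signature to the authoring node, which submits both in CommitVersion. The sketch below illustrates that flow; sign_with_content_key is a hypothetical placeholder, the ID aliases assume the widths from the identifier tables, and #[codec(compact)] is the parity-scale-codec spelling of #[compact] above.

  use parity_scale_codec::Encode;

  pub type ProviderId = [u8; 10];
  pub type TenantId = [u8; 10];
  pub type ContentObjectId = [u8; 10];
  pub type VersionId = [u8; 32];
  pub type KMSId = [u8; 10];

  #[derive(Encode)]
  pub struct VersionCommitMessage {
      pub originator: ProviderId,
      pub tenant_id: TenantId,
      pub content_object_id: ContentObjectId,
      pub version_id: VersionId,
      #[codec(compact)]
      pub tlp_size: u64,
      pub ts: u64,
      pub set_head_on_finalize: bool,
      pub kms_id: KMSId,
  }

  /// Hypothetical stand-in for signing with a tenant key holding CONTENT permission.
  pub fn sign_with_content_key(_payload: &[u8]) -> Vec<u8> {
      unimplemented!("sign the payload with the tenant's CONTENT-level key")
  }

  /// Produce the (encoded message, signature) pair a node submits via CommitVersion;
  /// the chain verifies the signature against the same SCALE encoding.
  pub fn authorize_commit(vcm: &VersionCommitMessage) -> (Vec<u8>, Vec<u8>) {
      let payload = vcm.encode();
      let signature = sign_with_content_key(&payload);
      (payload, signature)
  }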

Content Object Lifecycle

flowchart TB 
  dc(Draft Created)
  md(Modify Draft)
  df(Draft Finalized)
  cv(Version Committed)
  vp(Version Publishing)
  vf(Version Finalized)

  dc -->|tenant user creates draft on node| md
  md --->|tenant user modifies draft| md
  md -->|tenant user finalizes draft| df
  df -->|node calls CommitVersion| cv
  cv -->|node starts publishing parts to other fabric nodes| vp
  vp -->|parts received by other nodes, node calls FinalizeVersion| vf

Content Types

TODO: Discuss

Content Object Blockchain Storage

(TenantId, ContentObjectId) -> {
  head_version: Option<VersionId>,
  version_count: u32,
}
(TenantId, ContentObjectId, VersionId) -> {
  originator: ProviderId,
  tlp_size: #[compact] u64,
  ts_committed: u64,
  ts_finalized: Option<u64>,
  set_head_on_finalize: bool,
  kms_id: KMSId
}

Content Object Blockchain Calls

  • CreateContentObject(origin: Origin, ten: TenantId, cobj: ContentObjectId)

    • Checks that origin has at least CONTENT permissions in ten
    • Stores the content object at (ten, cobj)
        {
          head_version: None,
          version_count: 0,
        }
      
  • CommitVersion(origin: Origin, tenant_signer: AccountId, vcm: VersionCommitMessage, vcm_sig: Signature)

    • Checks that origin has NODE level permission within vcm.originator
    • Checks that tenant_signer has CONTENT level permission within vcm.tenant_id
    • Checks that vcm_sig is a valid signature of the scale encoded vcm by tenant_signer
    • Increment version_count at (vcm.tenant_id, vcm.content_object_id)
    • Stores the version at (vcm.tenant_id, vcm.content_object_id, vcm.version_id)
        {
          originator: vcm.originator,
          tlp_size: vcm.tlp_size,
          ts_committed: vcm.ts,
          ts_finalized: None,
          set_head_on_finalize: vcm.set_head_on_finalize,
          kms_id: vcm.kms_id,
        }
      
  • FinalizeVersion(origin: Origin, provider: ProviderId, ten: TenantId, cobj: ContentObjectId, ver: VersionId, ts: u64)

    • Checks that origin has at least NODE level within provider (see the sketch after this list).
    • Retrieve the version metadata, ver_meta, stored at (ten, cobj, ver)
    • Checks that provider matches the ver_meta.originator
    • Checks that ts is a recent timestamp
    • Sets ver_meta.ts_finalized = Some(ts)
    • If ver_meta.set_head_on_finalize, sets the head_version of the content object at (ten, cobj) to ver
  • SetHeadVersion(origin: Origin, ten: TenantId, cobj: ContentObjectId, ver: Option<VersionId>)

    • Checks that origin has CONTENT level permissions within ten
    • Checks that (ten, cobj, ver) exists
    • Sets the head_version at (ten, cobj) to ver
  • DeleteVersion(origin: Origin, ten: TenantId, cobj: ContentObjectId, ver: VersionId)

    • Checks that origin has CONTENT level permissions within ten
    • Checks that (ten, cobj, ver) exists
    • Checks that head_version at (ten, cobj) is not ver
    • Deletes the version stored at (ten, cobj, ver)
    • Decrements version_count at (ten, cobj)
  • DeleteContentObject(origin: Origin, ten: TenantId, cobj: ContentObjectId)

    • Checks that origin has CONTENT level permissions within ten
    • Checks that version_count is 0
    • Deletes the content object at (ten, cobj)

1 TODO: Should also talk about partitioning and how we assert data is replicated
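
For concreteness, here is a sketch of the FinalizeVersion transition against in-memory stand-ins for the two storage maps above. The caller's NODE-level permission check is omitted, the error names and the recency window are assumptions, and fields not needed for the transition are trimmed.

  use std::collections::BTreeMap;

  pub type ProviderId = [u8; 10];
  pub type TenantId = [u8; 10];
  pub type ContentObjectId = [u8; 10];
  pub type VersionId = [u8; 32];

  pub struct ContentObjectRecord {
      pub head_version: Option<VersionId>,
      pub version_count: u32,
  }

  pub struct VersionRecord {
      pub originator: ProviderId,
      pub ts_committed: u64,
      pub ts_finalized: Option<u64>,
      pub set_head_on_finalize: bool,
  }

  #[derive(Debug)]
  pub enum Error {
      VersionMissing,
      WrongOriginator,
      StaleTimestamp,
  }

  pub struct ContentStorage {
      pub objects: BTreeMap<(TenantId, ContentObjectId), ContentObjectRecord>,
      pub versions: BTreeMap<(TenantId, ContentObjectId, VersionId), VersionRecord>,
  }

  impl ContentStorage {
      /// FinalizeVersion: only the originating provider may finalize, and doing so
      /// optionally moves the content object's head to the finalized version.
      pub fn finalize_version(
          &mut self,
          provider: ProviderId,
          ten: TenantId,
          cobj: ContentObjectId,
          ver: VersionId,
          ts: u64,
          now: u64, // current chain time, used for the recency check
      ) -> Result<(), Error> {
          let ver_meta = self
              .versions
              .get_mut(&(ten, cobj, ver))
              .ok_or(Error::VersionMissing)?;
          if ver_meta.originator != provider {
              return Err(Error::WrongOriginator);
          }
          // The spec only says `ts` must be recent; a 60-unit window is assumed here.
          if ts > now || now - ts > 60 {
              return Err(Error::StaleTimestamp);
          }
          ver_meta.ts_finalized = Some(ts);
          if ver_meta.set_head_on_finalize {
              if let Some(obj) = self.objects.get_mut(&(ten, cobj)) {
                  obj.head_version = Some(ver);
              }
          }
          Ok(())
      }
  }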

KMS

TODO

Data Hierarchy and Key Prefixing

Many of the entities in the system are scoped, meaning they only exist under another identity. Individually, they have the following IDs:

Entity                  | Identifier    | Substrate Type
Provider                | ProviderId    | 10-byte array
Tenant                  | TenantId      | 10-byte array
Node                    | NodeId        | 10-byte array
Space                   | SpaceId       | 10-byte array
Content Object          | CobjId        | 10-byte array
Content Object Version  | CobjVersionId | 32-byte array
KMS                     | KMSId         | 10-byte array

But they are hierarchical.

flowchart TD
  space(Space)
  provider(Provider)
  node(Node)
  tenant(Tenant)
  kms(KMS)
  cobj(Content Object)
  cobjv(Content Object Version)

  space --> provider
  space --> tenant

  provider --> node

  tenant --> kms
  tenant --> cobj
  cobj --> cobjv

Providers and tenants exist within spaces, KMSs exist within a tenancy, etc.

Substrate & Hierarchical keys

In Substrate, we have two different ways of expressing these sorts of has-many relationships.

Flat IDs with metadata

One option is to store the parent in a metadata key-value map associated with the child, like

TenantId --> TenantMetadata { space_id: SpaceId, ... }

In this case:

  • TenantIds are global: there's no way for the same TenantId to exist within multiple spaces.
  • In order to list all the tenants within a space, an indexer is required.
  • All you need to refer to a specific tenant is the TenantId, since the SpaceId is implied.
  • Deleting a space becomes complicated:
    • The total number of tenants must be stored on the space: SpaceId --> SpaceMetadata { tenant_count: u32 }
    • Each tenant must be manually deleted, and the tenant_count must be decremented on each delete
    • Once the tenant_count reaches 0, all other space-related storage can be deleted

Nested IDs

Alternatively, we can choose to prefix the TenantId by the SpaceId anywhere that tenant data is stored, so we would get

(SpaceId, TenantId) --> TenantMetadata { ... }

In this case:

  • To refer to a tenant, both the SpaceId and TenantId must be given, since the TenantId is not unique across spaces
  • Listing the tenants within a space is easy, since they all share the (SpaceId, ...) key prefix.
  • Deleting the parent entity becomes much easier:
    • In order to delete a space, all the keys under (SpaceId, TenantId) must be deleted as well
    • On chain, we can do a key-prefix delete which clears out all keys of the form (SpaceId, ...) (in RocksDB this sort of write is fast); see the sketch below
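
Here is a sketch of the two layouts over plain BTreeMaps; in a pallet, the nested layout corresponds to a double map keyed by (SpaceId, TenantId), and deleting a space corresponds to clearing that key prefix. Type and method names are illustrative.

  use std::collections::BTreeMap;

  pub type SpaceId = [u8; 10];
  pub type TenantId = [u8; 10];

  pub struct TenantMetadata {
      // ...
  }

  /// Flat IDs with metadata: the parent is a field of the value, so deleting a
  /// space means finding and removing every tenant that points at it.
  pub struct FlatLayout {
      pub tenants: BTreeMap<TenantId, (SpaceId, TenantMetadata)>,
  }

  /// Nested IDs: the parent is part of the key, so deleting a space is a single
  /// prefix delete over (space, *).
  pub struct NestedLayout {
      pub tenants: BTreeMap<(SpaceId, TenantId), TenantMetadata>,
  }

  impl NestedLayout {
      /// Equivalent of an on-chain key-prefix delete: drop every key whose first
      /// component is `space`.
      pub fn delete_space(&mut self, space: SpaceId) {
          self.tenants.retain(|(s, _), _| *s != space);
      }

      /// Listing tenants is a range scan over the key prefix.
      pub fn list_tenants(&self, space: SpaceId) -> Vec<TenantId> {
          self.tenants
              .range((space, [0u8; 10])..=(space, [0xffu8; 10]))
              .map(|((_, tenant), _)| *tenant)
              .collect()
      }
  }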

Current architecture

It is cumbersome to refer to certain entities by both their own ID and their parents' IDs, so in cases where deletion is expected to be rare (such as deleting a space), flat IDs are used.

Here is a complete list of how the IDs are stored:

Entity                  | Complete Id
Provider                | ProviderId
Tenant                  | TenantId
Node                    | (ProviderId, NodeId)
Space                   | SpaceId
Content Object          | (TenantId, ContentObjectId)
Content Object Version  | (TenantId, ContentObjectId, ContentObjectVersion)
KMS                     | (TenantId, KMSId)

All provider and tenant children use nested IDs. Tenants and providers themselves use flat IDs, since deleting a whole space should be very rare. Tenants and providers may, however, often be deleted for governance reasons, so a single blockchain call needs to be able to delete an entire provider or an entire tenancy. To do this, key-prefix deletes are needed to clean up their data.

Potential alternative

An alternative is this: instead of doing some sort of hierarchical delete, governance could store a flag, governance_delete, in the metadata of a provider or tenant. Specific governance_delete_* methods could then exist on all children, which look up their parent's metadata, check that governance_delete == true, and delete the requested child. This would allow for using a flat ID with metadata while still letting governance delete a child.
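
A hypothetical sketch of that pattern for the tenant/KMS pair: the parent only carries the flag, and a child-specific delete call checks it before removing data. All names here are illustrative.

  use std::collections::BTreeMap;

  pub type TenantId = [u8; 10];
  pub type KMSId = [u8; 10];

  pub struct TenantMetadata {
      pub governance_delete: bool,
  }

  #[derive(Debug)]
  pub enum Error {
      TenantMissing,
      NotMarkedForDeletion,
  }

  pub struct Storage {
      pub tenants: BTreeMap<TenantId, TenantMetadata>,
      pub kmses: BTreeMap<(TenantId, KMSId), String>, // KMS locators
  }

  impl Storage {
      /// governance_delete_kms: children may be cleaned up once governance has
      /// flagged the parent tenant for deletion.
      pub fn governance_delete_kms(&mut self, tenant: TenantId, kms: KMSId) -> Result<(), Error> {
          let meta = self.tenants.get(&tenant).ok_or(Error::TenantMissing)?;
          if !meta.governance_delete {
              return Err(Error::NotMarkedForDeletion);
          }
          self.kmses.remove(&(tenant, kms));
          Ok(())
      }
  }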

Part Networking

TODO: serban

Definitions

Here are some definitions of entities within the system

Name                    | Description
Node                    | A server which stores and serves parts.
Provider                | An individual or organization which owns, secures, and operates nodes.
Tenant                  | An individual or organization which owns content.
Content                 | A versioned set of data which is owned by a tenant.
Space                   | A group of providers and tenants, where providers agree to run nodes that serve content owned by a tenant according to a common set of rules.
Part                    | A part is a sequence of bytes stored in the space, referenced by its hash.
Content Object Version  | A collection of parts created by a tenant, referenced by its hash.
Content Object          | A collection of versions.
KMS                     | A tenant-owned server which holds keys for encrypting/decrypting content which the tenant stores in the space.

The following entities are identified as follows:

Entity                  | Identifier      | Substrate Type
Provider                | ProviderId      | 10-byte array
Node                    | NodeId          | 10-byte array1
Space                   | SpaceId         | 10-byte array
Content Object          | ContentObjectId | 10-byte array
Content Object Version  | CObjVersionId   | 32-byte array
KMS                     | KMSId           | 10-byte array

1 Could node id just be an unsigned 32-bit integer and do some round robin or mod-magic for assigning partitions?