Creator pillar · v0.15.1

video Video

A single video on a creator platform.

$id · https://corpospec.com/schemas/v0.15.1/video.schema.json

Fields

Field Type Required Description
broadcast_state BroadcastState yes Live-broadcast lifecycle state at the time of sync. Input to the `kind` heuristic.
caption_available boolean yes Caption / subtitle track availability flag.
channel PathRef yes Channel this video belongs to.
contains_synthetic_media boolean yes AI / synthetic-media disclosure flag where the host platform surfaces one (e.g. YouTube `containsSyntheticMedia`). Defaults to `false` for platforms that do not expose the flag.
definition VideoDefinition yes Resolution band for the primary streamable rendition. `hd` covers 720p and above; `sd` covers everything below.
description string yes
duration_seconds integer yes Duration in seconds. Used as input to the `kind` heuristic.
embeddable boolean yes
has_paid_product_placement boolean yes Paid-product-placement disclosure flag where the host platform surfaces one (e.g. YouTube `paidProductPlacementDetails.hasPaidProductPlacement`). The validator cross-checks this against creator-authored `Sponsorship` records — a video flagged paid-placement without a corresponding sponsorship surfaces as a domain-consistency advisory.
id PathRef yes CorpoSpec PathRef (e.g. `creator/videos/abc123def45`).
kind VideoKind yes Video subtype derived from the short-form / long-form / livestream classification heuristic: - `duration_seconds <= 60 && broadcast_state == "none"` → `short` - `livestream_started_at != null && broadcast_state == "none"` → `livestream_vod` - `broadcast_state == "live"` → `live` - `broadcast_state == "upcoming"` → `upcoming` - otherwise → `regular_vod` Where the host platform exposes an authoritative content-type signal (e.g. YouTube Analytics' `creatorContentType`), the connector is free to reconcile the heuristic against it on a periodic cadence.
license VideoLicense yes Content-licence selection. `platform_default` covers any platform's own all-rights-reserved licence; `creative_commons` covers any CC licence the platform exposes. Connectors map native licence values onto the corresponding member.
made_for_kids boolean yes
platform_video_id string yes Platform-side stable identifier for this video. Format is platform-specific (YouTube emits 11-char video IDs; Twitch emits numeric VOD IDs; TikTok emits numeric `aweme_id`s). Treated as an opaque token; persisted indefinitely as a structural identifier.
privacy_status VideoPrivacyStatus yes Video-level privacy status. Public, unlisted, and private are the three states common across most video platforms; values map onto each platform's native enum at the connector boundary.
provenance SyncProvenance yes Provenance metadata stamped on every synced atomic record across the creator pillar. Per-record granularity; the `retention_tier` reflects the strictest tier of any field on the record so the cache-purge job can decide conservatively. Field-level retention distinctions are documented in each entity type's struct doc-comments — the schema documents the static taxonomy, the data carries only the per-fetch state.
published_at string yes Source-of-truth publish timestamp from the host platform.
statistics VideoStatistics yes Aggregate counts attached to a single video. Snapshot semantics: the counts are valid as of `provenance.synced_at`. Comment text and commenter identities are intentionally not stored.
title string yes
upload_status VideoUploadStatus yes Upload-processing status surfaced by the host platform during and after upload. The five states cover the union surveyed across video platforms; connectors map their native states onto the closest member.
category_id string? Platform-side category identifier. Format is platform-specific (e.g. YouTube's numeric category ID); optional because not every platform exposes a category taxonomy.
content_series PathRef? Optional content-series this video belongs to.
default_audio_language string? BCP-47 default audio language code.
default_language string? BCP-47 default language code.
livestream_ended_at string? Livestream actual end timestamp; marks the livestream-VOD boundary.
livestream_scheduled_for string? Livestream scheduled start timestamp.
livestream_started_at string? Livestream actual start timestamp; present only on broadcasts that have started. Input to the `kind` heuristic.
tags string[] Creator-authored tags. High-signal but volatile (creators iterate rapidly).
thumbnail_default_url string? Default thumbnail URL.
thumbnail_high_url string? High thumbnail URL.
thumbnail_maxres_url string? Max-resolution thumbnail URL (not always present).
topic_categories string[] Topic descriptors as URL references (e.g. Wikipedia URLs from YouTube `topicDetails.topicCategories`).

Definitions

Shared types referenced within this schema.

BroadcastState
Live-broadcast lifecycle state. Universal across livestream-capable platforms (YouTube, Twitch, TikTok Live, Instagram Live, …). `none` = not a livestream / livestream finished and was archived as VOD; `upcoming` = scheduled but not yet started; `live` = currently broadcasting.
enum: "none", "live", "upcoming"
DataClass
Classification of a synced field's data sensitivity. Drives subject- rights workflows and cache-purge retention decisions.
PathRef
Path-based cross-reference relative to .corpospec/ root. Pattern: `^[a-z0-9_-]+(/[a-z0-9_.-]+)+$`
pattern: ^[a-z0-9_-]+(/[a-z0-9_.-]+)+$
RetentionTier
Retention policy applied to a synced record. Tiers are named after the discipline they impose; the connector layer maps each upstream surface onto the appropriate tier. - `thirty_day_strict` — atomic records that auto-purge or refresh at the thirty-day boundary. The dominant tier on platforms whose Terms of Service impose a thirty-day cache discipline on derived data (e.g. YouTube Developer Policies §III.E.4 for Data API outputs). - `thirty_day_aggregate_with_lineage` — atomic record purges at thirty days; the aggregate computed from it may be retained when accompanied by explicit `derived_from` + `derivation_method` lineage. - `auth_lifetime` — atomic record retains against active OAuth authorisation; tears down on token revocation or tenant deletion. Used for analytics surfaces whose ToS explicitly exempt them from the thirty-day rule (e.g. YouTube Analytics API per §III.E.4). - `creator_authored` — data the tenant originates rather than fetches; no upstream-platform retention constraint applies. - `metadata_indefinite` — structural identifiers and immutable timestamps the upstream ToS explicitly permits to retain.
SourceAttribution
Source attribution for a synced field. Records the kind of upstream surface that produced the value; combined with the parent record's `platform`, this fully attributes any value back to its origin. Open-ended by design: new connector kinds (e.g. monetary or content- owner APIs) extend without breaking changes. The platform discriminator (`Channel.platform`) carries the brand identity; this enum carries only the kind of API surface the value came from.
SyncProvenance
Provenance metadata stamped on every synced atomic record across the creator pillar. Per-record granularity; the `retention_tier` reflects the strictest tier of any field on the record so the cache-purge job can decide conservatively. Field-level retention distinctions are documented in each entity type's struct doc-comments — the schema documents the static taxonomy, the data carries only the per-fetch state.
type: object
VideoDefinition
Resolution band for the primary streamable rendition. `hd` covers 720p and above; `sd` covers everything below.
enum: "hd", "sd"
VideoKind
Video subtype derived from the short-form / long-form / livestream classification heuristic: - `duration_seconds <= 60 && broadcast_state == "none"` → `short` - `livestream_started_at != null && broadcast_state == "none"` → `livestream_vod` - `broadcast_state == "live"` → `live` - `broadcast_state == "upcoming"` → `upcoming` - otherwise → `regular_vod` Where the host platform exposes an authoritative content-type signal (e.g. YouTube Analytics' `creatorContentType`), the connector is free to reconcile the heuristic against it on a periodic cadence.
enum: "regular_vod", "short", "livestream_vod", "live", "upcoming"
VideoLicense
Content-licence selection. `platform_default` covers any platform's own all-rights-reserved licence; `creative_commons` covers any CC licence the platform exposes. Connectors map native licence values onto the corresponding member.
VideoPrivacyStatus
Video-level privacy status. Public, unlisted, and private are the three states common across most video platforms; values map onto each platform's native enum at the connector boundary.
enum: "public", "private", "unlisted"
VideoStatistics
Aggregate counts attached to a single video. Snapshot semantics: the counts are valid as of `provenance.synced_at`. Comment text and commenter identities are intentionally not stored.
type: object
VideoUploadStatus
Upload-processing status surfaced by the host platform during and after upload. The five states cover the union surveyed across video platforms; connectors map their native states onto the closest member.
enum: "uploaded", "processed", "failed", "rejected", "deleted"

Reference in your YAML

# yaml-language-server: $schema=https://corpospec.com/schemas/v0.15.1/video.schema.json