Creator pillar · v0.16.0
Video
A single video on a creator platform.
$id · https://corpospec.com/schemas/v0.16.0/video.schema.json
Fields
| Field | Type | Required | Description |
|---|---|---|---|
| broadcast_state | BroadcastState | yes | Live-broadcast lifecycle state at the time of sync. Input to the `kind` heuristic. |
| caption_available | boolean | yes | Caption / subtitle track availability flag. |
| channel | PathRef | yes | Channel this video belongs to. |
| contains_synthetic_media | boolean | yes | AI / synthetic-media disclosure flag where the host platform surfaces one (e.g. YouTube `containsSyntheticMedia`). Defaults to `false` for platforms that do not expose the flag. |
| definition | VideoDefinition | yes | Resolution band for the primary streamable rendition. `hd` covers 720p and above; `sd` covers everything below. |
| description | string | yes | |
| duration_seconds | integer | yes | Duration in seconds. Used as input to the `kind` heuristic. |
| embeddable | boolean | yes | |
| has_paid_product_placement | boolean | yes | Paid-product-placement disclosure flag where the host platform surfaces one (e.g. YouTube `paidProductPlacementDetails.hasPaidProductPlacement`). The validator cross-checks this against creator-authored `Sponsorship` records — a video flagged paid-placement without a corresponding sponsorship surfaces as a domain-consistency advisory. |
| id | PathRef | yes | CorpoSpec PathRef (e.g. `creator/videos/abc123def45`). |
| kind | VideoKind | yes | Video subtype derived from the short-form / long-form / livestream classification heuristic: `duration_seconds <= 60 && broadcast_state == "none"` → `short`; `livestream_started_at != null && broadcast_state == "none"` → `livestream_vod`; `broadcast_state == "live"` → `live`; `broadcast_state == "upcoming"` → `upcoming`; otherwise → `regular_vod`. Where the host platform exposes an authoritative content-type signal (e.g. YouTube Analytics' `creatorContentType`), the connector is free to reconcile the heuristic against it on a periodic cadence. |
| license | VideoLicense | yes | Content-licence selection. `platform_default` covers any platform's own all-rights-reserved licence; `creative_commons` covers any CC licence the platform exposes. Connectors map native licence values onto the corresponding member. |
| made_for_kids | boolean | yes | |
| platform_video_id | string | yes | Platform-side stable identifier for this video. Format is platform-specific (YouTube emits 11-char video IDs; Twitch emits numeric VOD IDs; TikTok emits numeric `aweme_id`s). Treated as an opaque token; persisted indefinitely as a structural identifier. |
| privacy_status | VideoPrivacyStatus | yes | Video-level privacy status. Public, unlisted, and private are the three states common across most video platforms; values map onto each platform's native enum at the connector boundary. |
| provenance | SyncProvenance | yes | Provenance metadata stamped on every synced atomic record across the creator pillar. Per-record granularity; the `retention_tier` reflects the strictest tier of any field on the record so the cache-purge job can decide conservatively. Field-level retention distinctions are documented in each entity type's struct doc-comments — the schema documents the static taxonomy, the data carries only the per-fetch state. |
| published_at | string | yes | Source-of-truth publish timestamp from the host platform. |
| statistics | VideoStatistics | yes | Aggregate counts attached to a single video. Snapshot semantics: the counts are valid as of `provenance.synced_at`. Comment text and commenter identities are intentionally not stored. |
| title | string | yes | |
| upload_status | VideoUploadStatus | yes | Upload-processing status surfaced by the host platform during and after upload. The five states cover the union surveyed across video platforms; connectors map their native states onto the closest member. |
| category_id | string? | — | Platform-side category identifier. Format is platform-specific (e.g. YouTube's numeric category ID); optional because not every platform exposes a category taxonomy. |
| content_series | PathRef? | — | Optional content-series this video belongs to. |
| default_audio_language | string? | — | BCP-47 default audio language code. |
| default_language | string? | — | BCP-47 default language code. |
| livestream_ended_at | string? | — | Livestream actual end timestamp; marks the livestream-VOD boundary. |
| livestream_scheduled_for | string? | — | Livestream scheduled start timestamp. |
| livestream_started_at | string? | — | Livestream actual start timestamp; present only on broadcasts that have started. Input to the `kind` heuristic. |
| tags | string[] | — | Creator-authored tags. High-signal but volatile (creators iterate rapidly). |
| thumbnail_default_url | string? | — | Default thumbnail URL. |
| thumbnail_high_url | string? | — | High thumbnail URL. |
| thumbnail_maxres_url | string? | — | Max-resolution thumbnail URL (not always present). |
| topic_categories | string[] | — | Topic descriptors as URL references (e.g. Wikipedia URLs from YouTube `topicDetails.topicCategories`). |
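The cross-check described for `has_paid_product_placement` can be sketched roughly as follows. This is an illustration only, not the validator's actual implementation: the shape of a `Sponsorship` record is not shown in this schema, so the `videos` key used to link a sponsorship to videos is a hypothetical assumption, as are the function name and the sample PathRefs.

```python
from typing import Iterable


def placement_advisories(videos: Iterable[dict], sponsorships: Iterable[dict]) -> list[str]:
    """Return the ids of videos flagged paid-placement that no sponsorship covers."""
    # ASSUMPTION: each sponsorship lists the video PathRefs it covers under "videos".
    sponsored = {ref for s in sponsorships for ref in s.get("videos", [])}
    return [
        v["id"]
        for v in videos
        if v["has_paid_product_placement"] and v["id"] not in sponsored
    ]


# Hypothetical sample records, reduced to the two fields the check reads.
videos = [
    {"id": "creator/videos/abc123def45", "has_paid_product_placement": True},
    {"id": "creator/videos/zzz999yyy88", "has_paid_product_placement": True},
    {"id": "creator/videos/qqq111www22", "has_paid_product_placement": False},
]
sponsorships = [
    {"id": "creator/sponsorships/acme-q1", "videos": ["creator/videos/abc123def45"]},
]

print(placement_advisories(videos, sponsorships))
```

Each returned id would correspond to one domain-consistency advisory: the video is flagged but nothing creator-authored accounts for the flag.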
Definitions
Shared types referenced within this schema.
BroadcastState
Live-broadcast lifecycle state. Universal across livestream-capable
platforms (YouTube, Twitch, TikTok Live, Instagram Live, …).
`none` = not a livestream / livestream finished and was archived as
VOD; `upcoming` = scheduled but not yet started; `live` = currently
broadcasting.
enum: "none", "live", "upcoming"
DataClass
Classification of a synced field's data sensitivity. Drives subject-
rights workflows and cache-purge retention decisions.
PathRef
Path-based cross-reference relative to .corpospec/ root.
pattern: ^[a-z0-9_-]+(/[a-z0-9_.-]+)+$
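As a quick way to see what the pattern accepts, here is a minimal sketch using Python's standard `re` module. The helper name `is_path_ref` is an illustrative choice, not part of CorpoSpec.

```python
import re

# The PathRef pattern exactly as published in the schema.
PATH_REF = re.compile(r"^[a-z0-9_-]+(/[a-z0-9_.-]+)+$")


def is_path_ref(value: str) -> bool:
    """True when the whole string is a valid PathRef."""
    return PATH_REF.fullmatch(value) is not None


for candidate in (
    "creator/videos/abc123def45",  # root segment plus two path segments
    "creator",                     # no slash, so no path segment
    "Creator/videos/abc",          # uppercase is outside the character class
    "creator/videos/",             # trailing slash leaves an empty segment
):
    print(candidate, is_path_ref(candidate))
```

Note that the pattern requires at least one `/`-separated segment after the root segment, and that dots are allowed only after the first segment.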
RetentionTier
Retention policy applied to a synced record. Tiers are named after the
discipline they impose; the connector layer maps each upstream surface
onto the appropriate tier.
- `thirty_day_strict` — atomic records that auto-purge or refresh at
the thirty-day boundary. The dominant tier on platforms whose
Terms of Service impose a thirty-day cache discipline on derived
data (e.g. YouTube Developer Policies §III.E.4 for Data API
outputs).
- `thirty_day_aggregate_with_lineage` — atomic record purges at
thirty days; the aggregate computed from it may be retained when
accompanied by explicit `derived_from` + `derivation_method`
lineage.
- `auth_lifetime` — atomic record retains against active OAuth
authorisation; tears down on token revocation or tenant deletion.
Used for analytics surfaces whose ToS explicitly exempt them from
the thirty-day rule (e.g. YouTube Analytics API per §III.E.4).
- `creator_authored` — data the tenant originates rather than fetches;
no upstream-platform retention constraint applies.
- `metadata_indefinite` — structural identifiers and immutable
timestamps the upstream ToS explicitly permits to retain.
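To make the tier semantics concrete, the sketch below shows how a cache-purge job might decide whether an atomic record has reached the thirty-day boundary. It is a sketch under assumptions: the function name is hypothetical, the clock is assumed to start at the record's sync time, and the boundary is assumed to be inclusive. Only the tier names come from the schema.

```python
from datetime import datetime, timedelta, timezone

THIRTY_DAYS = timedelta(days=30)

# Tiers whose atomic record purges or refreshes at the thirty-day boundary.
TIME_BOXED_TIERS = {"thirty_day_strict", "thirty_day_aggregate_with_lineage"}


def atomic_purge_due(retention_tier: str, synced_at: datetime, now: datetime) -> bool:
    """True when the atomic record has reached the thirty-day boundary.

    ASSUMPTIONS: age is measured from `synced_at`, and a record that is
    exactly thirty days old counts as due.
    """
    if retention_tier not in TIME_BOXED_TIERS:
        # auth_lifetime, creator_authored and metadata_indefinite are not
        # governed by elapsed time in this sketch.
        return False
    return now - synced_at >= THIRTY_DAYS


synced = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(atomic_purge_due("thirty_day_strict", synced, synced + timedelta(days=31)))
print(atomic_purge_due("auth_lifetime", synced, synced + timedelta(days=400)))
```

For `thirty_day_aggregate_with_lineage` the function only covers the atomic record; whether the derived aggregate may be kept depends on its lineage fields, which this sketch does not inspect.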
SourceAttribution
Source attribution for a synced field. Records the kind of upstream
surface that produced the value; combined with the parent record's
`platform`, this fully attributes any value back to its origin.
Open-ended by design: new connector kinds (e.g. monetary or content-
owner APIs) extend without breaking changes. The platform discriminator
(`Channel.platform`) carries the brand identity; this enum carries
only the kind of API surface the value came from.
SyncProvenance
Provenance metadata stamped on every synced atomic record across the
creator pillar. Per-record granularity; the `retention_tier` reflects
the strictest tier of any field on the record so the cache-purge job
can decide conservatively. Field-level retention distinctions are
documented in each entity type's struct doc-comments — the schema
documents the static taxonomy, the data carries only the per-fetch
state.
type: object
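The "strictest tier of any field" roll-up could look something like the sketch below. The schema does not publish a strictness ranking, so the order used here is an assumption made for illustration, as is the function name.

```python
from typing import Iterable

# ASSUMED ranking, strictest first. The schema names the tiers but does not
# state how they compare, so treat this order as a placeholder.
STRICTNESS_ORDER = [
    "thirty_day_strict",
    "thirty_day_aggregate_with_lineage",
    "auth_lifetime",
    "creator_authored",
    "metadata_indefinite",
]


def record_retention_tier(field_tiers: Iterable[str]) -> str:
    """Pick the strictest tier among the tiers of a record's fields."""
    # list.index gives each tier its rank; the lowest rank is the strictest.
    return min(field_tiers, key=STRICTNESS_ORDER.index)


# A record mixing an indefinitely retainable identifier with a thirty-day field.
print(record_retention_tier(["metadata_indefinite", "thirty_day_strict"]))
```

Under this roll-up a single thirty-day field is enough to put the whole record on the thirty-day tier, which is what lets the purge job decide conservatively.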
VideoDefinition
Resolution band for the primary streamable rendition. `hd` covers
720p and above; `sd` covers everything below.
enum: "hd", "sd"
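A connector that knows the pixel height of the primary rendition could derive the band as sketched here. The function and parameter names are hypothetical; only the 720p threshold comes from the description above.

```python
def definition_for_height(height_px: int) -> str:
    """Map the primary rendition's height to a VideoDefinition member."""
    # `hd` covers 720p and above; everything below is `sd`.
    return "hd" if height_px >= 720 else "sd"


for height in (480, 720, 1080):
    print(height, definition_for_height(height))
```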
VideoKind
Video subtype derived from the short-form / long-form / livestream
classification heuristic:
- `duration_seconds <= 60 && broadcast_state == "none"` → `short`
- `livestream_started_at != null && broadcast_state == "none"`
→ `livestream_vod`
- `broadcast_state == "live"` → `live`
- `broadcast_state == "upcoming"` → `upcoming`
- otherwise → `regular_vod`
Where the host platform exposes an authoritative content-type signal
(e.g. YouTube Analytics' `creatorContentType`), the connector is free
to reconcile the heuristic against it on a periodic cadence.
enum: "regular_vod", "short", "livestream_vod", "live", "upcoming"
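The heuristic above can be sketched as a small function. One assumption is made explicit here: the rules are evaluated top to bottom and the first match wins, which the schema text implies by its ordering but does not state. The function name is illustrative.

```python
from typing import Optional


def classify_kind(
    duration_seconds: int,
    broadcast_state: str,
    livestream_started_at: Optional[str],
) -> str:
    """Derive a VideoKind member from the three heuristic inputs."""
    # ASSUMPTION: rules are checked in the listed order, first match wins.
    if duration_seconds <= 60 and broadcast_state == "none":
        return "short"
    if livestream_started_at is not None and broadcast_state == "none":
        return "livestream_vod"
    if broadcast_state == "live":
        return "live"
    if broadcast_state == "upcoming":
        return "upcoming"
    return "regular_vod"


print(classify_kind(45, "none", None))
print(classify_kind(7200, "none", "2025-01-01T18:00:00Z"))
print(classify_kind(0, "live", "2025-01-01T18:00:00Z"))
print(classify_kind(600, "none", None))
```

Under this ordering an archived livestream of 60 seconds or less is classified as `short` rather than `livestream_vod`, which is one case where reconciling against an authoritative platform signal may matter.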
VideoLicense
Content-licence selection. `platform_default` covers any platform's
own all-rights-reserved licence; `creative_commons` covers any CC
licence the platform exposes. Connectors map native licence values
onto the corresponding member.
VideoPrivacyStatus
Video-level privacy status. Public, unlisted, and private are the
three states common across most video platforms; values map onto
each platform's native enum at the connector boundary.
enum: "public", "private", "unlisted"
VideoStatistics
Aggregate counts attached to a single video. Snapshot semantics: the
counts are valid as of `provenance.synced_at`. Comment text and
commenter identities are intentionally not stored.
type: object
VideoUploadStatus
Upload-processing status surfaced by the host platform during and
after upload. The five states cover the union surveyed across video
platforms; connectors map their native states onto the closest
member.
enum: "uploaded", "processed", "failed", "rejected", "deleted"
Reference in your YAML
# yaml-language-server: $schema=https://corpospec.com/schemas/v0.16.0/video.schema.json