feat(persistence): add .alfred sidecar serializer (DTO ↔ dict)

Step 2 of the specs/dot_alfred.md plan. Pure-dict in/out (serialize(sidecar) -> dict, deserialize(data) -> ShowSidecar); YAML I/O lives in the repository layer (step 3) and is kept out for trivial testability.

DTOs mirror the YAML schema field-for-field:

- ShowSidecar (root: imdb_id, tmdb_id, schema_version, seasons)
- SeasonSidecar (number, path, optional audio/subtitles, optional episodes)
- EpisodeSidecar (number, path, optional audio/subtitles)
- SubtitleEntry (language, source, type)

The sidecar acts as a scan cache: it stores only what is genuinely costly to recompute, namely folder/file paths (skipping the FS walk) and probed track metadata (skipping ffprobe). Release identifiers (group, source, quality, codec) live in folder/file names and are derived on demand by the parser; they are deliberately absent from the schema and rejected as unknown keys on deserialize.

The serializer is strict on schema: unknown keys at any level raise SidecarSchemaError, missing required fields raise clearly, and bool cannot sneak in as a season/episode number. Optional fields (tmdb_id, empty audio/subtitles/episodes) are omitted from the output rather than emitted as null / [].

Tests cover round-trip equivalence (DTO → dict → DTO and DTO → YAML text → DTO), the Foundation S01 PACK case (real-world fixture with mixed sub types, superset captured at season scope), and a Breaking Bad S05 EPISODIC case. An on-disk tmp_path fixture recreates the Foundation folder structure with placeholder files, ready to be reused by the upcoming repository walk tests in step 3.
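
The strictness and omission rules described above can be sketched in isolation. This is a minimal illustration only, not the code from this commit: Season, SchemaError, season_to_dict and season_from_dict are hypothetical stand-ins that use a plain int for the season number and skip subtitles and per-entry audio validation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


class SchemaError(ValueError):
    """Hypothetical stand-in for SidecarSchemaError."""


@dataclass(frozen=True)
class Season:
    """Simplified stand-in for SeasonSidecar: plain int number, audio only."""

    number: int
    path: str
    audio_languages: tuple[str, ...] = ()


_ALLOWED_SEASON = {"number", "path", "audio"}


def season_to_dict(season: Season) -> dict[str, Any]:
    out: dict[str, Any] = {"number": season.number, "path": season.path}
    if season.audio_languages:  # an empty tuple means the key is omitted
        out["audio"] = [{"language": lang} for lang in season.audio_languages]
    return out


def season_from_dict(data: dict[str, Any]) -> Season:
    # Any key outside the allow-list (e.g. "group") is a schema violation.
    extra = set(data) - _ALLOWED_SEASON
    if extra:
        raise SchemaError(f"season has unknown keys: {sorted(extra)}")
    number = data.get("number")
    # bool is a subclass of int, so it has to be rejected explicitly.
    if not isinstance(number, int) or isinstance(number, bool):
        raise SchemaError(
            f"season.number must be an int, got {type(number).__name__}"
        )
    path = data.get("path")
    if not isinstance(path, str):
        raise SchemaError(
            f"season.path must be a string, got {type(path).__name__}"
        )
    # Simplification: audio entries are not validated in this sketch.
    audio = tuple(entry["language"] for entry in data.get("audio", []))
    return Season(number=number, path=path, audio_languages=audio)


pack = Season(1, "Foundation.2021.S01.1080p.WEBRip.x265-RARBG", ("eng",))
assert season_from_dict(season_to_dict(pack)) == pack  # round trip
assert "audio" not in season_to_dict(Season(1, "X.S01"))  # omission rule
```

Passing {"number": 1, "path": "X", "group": "RARBG"} or {"number": True, "path": "X"} to season_from_dict raises SchemaError in this sketch, which is the same pattern the real deserializer applies at every level.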
2026-05-22 16:56:56 +02:00
parent 6c12c18a27
commit b0e275bd11
7 changed files with 862 additions and 0 deletions
@@ -17,6 +17,31 @@ callers).

 ### Added

+- **`.alfred` sidecar serializer
+  (`alfred/infrastructure/persistence/dot_alfred/`).** Implements step 2
+  of the `specs/dot_alfred.md` plan. Pure-dict in/out
+  (`serialize(sidecar) -> dict`, `deserialize(data) -> ShowSidecar`) —
+  YAML I/O lives in the repository layer (step 3) and is kept out for
+  trivial testability. Ships the DTOs that mirror the YAML schema
+  field-for-field (`ShowSidecar`, `SeasonSidecar`, `EpisodeSidecar`,
+  `SubtitleEntry`). The sidecar acts as a **scan cache**: it stores
+  only what is genuinely costly to recompute — folder/file paths
+  (skipping the FS walk) and probed track metadata (skipping ffprobe).
+  Release identifiers (group, source, quality, codec) live in folder
+  and file names and are derived on demand by the parser — they are
+  deliberately absent from the schema and rejected on deserialize. The
+  serializer is **strict on schema**: unknown keys at any level raise
+  `SidecarSchemaError`, missing required fields raise clearly, and
+  `bool` cannot sneak in as a season/episode number. Optional fields
+  (`tmdb_id`, empty `audio`/`subtitles`/`episodes`) are omitted from
+  the output rather than emitted as `null` / `[]`. Tests cover
+  round-trip equivalence (DTO → dict → DTO and DTO → YAML text → DTO),
+  the Foundation S01 PACK case (real-world fixture with mixed sub
+  types — superset captured at season scope), and a Breaking Bad S05
+  EPISODIC case. An on-disk `tmp_path` fixture recreates the Foundation
+  folder structure with placeholder files, ready to be reused by the
+  upcoming repository walk tests in step 3.
+
 - **`TVShowBuilder` / `SeasonBuilder` — sole construction surface for the
  TVShow aggregate** (`alfred/domain/tv_shows/builders.py`). The aggregate
  is now fully frozen; building goes through a mutable scratchpad that
@@ -0,0 +1,31 @@
+"""`.alfred` sidecar persistence layer.
+
+Implements the per-show YAML sidecar described in
+``specs/dot_alfred.md``. The sidecar is a single file named ``.alfred``
+placed at the root of a show's directory, containing the full aggregate
+in a factual-only schema.
+
+Public surface:
+
+* :mod:`.sidecar` — DTOs (``ShowSidecar``, ``SeasonSidecar``,
+  ``EpisodeSidecar``, ``SubtitleEntry``) that mirror the YAML schema.
+* :mod:`.serializer` — ``serialize`` / ``deserialize`` functions
+  converting between DTOs and plain dicts (YAML-ready).
+"""
+
+from .serializer import deserialize, serialize
+from .sidecar import (
+    EpisodeSidecar,
+    SeasonSidecar,
+    ShowSidecar,
+    SubtitleEntry,
+)
+
+__all__ = [
+    "deserialize",
+    "serialize",
+    "EpisodeSidecar",
+    "SeasonSidecar",
+    "ShowSidecar",
+    "SubtitleEntry",
+]
@@ -0,0 +1,294 @@
+"""Serialize / deserialize ``.alfred`` sidecar DTOs to plain dicts.
+
+The functions here operate strictly on Python dicts — no YAML I/O. The
+repository layer is responsible for ``yaml.safe_dump`` / ``yaml.safe_load``
+and atomic file writes. Keeping I/O out of the serializer makes it
+trivially testable without touching the filesystem.
+
+The output dict layout matches the schema in ``specs/dot_alfred.md``:
+
+* Top level: ``schema_version``, ``imdb_id``, ``tmdb_id``, ``seasons``.
+* Each season carries ``number``, ``path``, and either pack-mode probed
+  metadata (``audio`` / ``subtitles``) **or** an ``episodes`` list
+  (episodic mode, each episode carrying its own probed metadata).
+* Subtitles are written as inline-style dicts (handled by the YAML
+  writer, not here) — at the DTO level they are just regular keys.
+
+Conventions:
+
+* Fields that are ``None`` or empty tuples are **omitted** from the
+  output dict (no ``null`` / ``[]`` noise); ``seasons`` is always kept.
+* ``imdb_id`` is required, ``tmdb_id`` is optional; empty ``seasons``
+  is allowed (a show with no season is legitimate during initial
+  population).
+* Deserialization is **strict on unknown keys** — a stray field is a
+  bug, not a feature; raising early prevents silent drift.
+* Release identifiers (group/source/quality/codec) are intentionally
+  absent: they are derived from folder/file names by the parser.
+"""
+
+from __future__ import annotations
+
+from typing import Any
+
+from ....domain.shared.value_objects import ImdbId
+from ....domain.tv_shows.value_objects import EpisodeNumber, SeasonNumber
+from .sidecar import (
+    SCHEMA_VERSION,
+    EpisodeSidecar,
+    SeasonSidecar,
+    ShowSidecar,
+    SubtitleEntry,
+)
+
+
+class SidecarSchemaError(ValueError):
+    """Raised when a sidecar dict does not match the expected schema."""
+
+
+# ════════════════════════════════════════════════════════════════════════════
+# Serialize — DTO → dict
+# ════════════════════════════════════════════════════════════════════════════
+
+
+def serialize(sidecar: ShowSidecar) -> dict[str, Any]:
+    """Render a :class:`ShowSidecar` to a plain dict ready for YAML dump."""
+    out: dict[str, Any] = {
+        "schema_version": sidecar.schema_version,
+        "imdb_id": str(sidecar.imdb_id),
+    }
+    if sidecar.tmdb_id is not None:
+        out["tmdb_id"] = sidecar.tmdb_id
+    out["seasons"] = [_serialize_season(s) for s in sidecar.seasons]
+    return out
+
+
+def _serialize_season(season: SeasonSidecar) -> dict[str, Any]:
+    out: dict[str, Any] = {
+        "number": season.number.value,
+        "path": season.path,
+    }
+    _put_tracks(out, season.audio_languages, season.subtitles)
+    if season.episodes:
+        out["episodes"] = [_serialize_episode(ep) for ep in season.episodes]
+    return out
+
+
+def _serialize_episode(episode: EpisodeSidecar) -> dict[str, Any]:
+    out: dict[str, Any] = {
+        "number": episode.number.value,
+        "path": episode.path,
+    }
+    _put_tracks(out, episode.audio_languages, episode.subtitles)
+    return out
+
+
+def _put_tracks(
+    out: dict[str, Any],
+    audio_languages: tuple[str, ...],
+    subtitles: tuple[SubtitleEntry, ...],
+) -> None:
+    """Append the optional probed-track fields to ``out`` if set."""
+    if audio_languages:
+        out["audio"] = [{"language": lang} for lang in audio_languages]
+    if subtitles:
+        out["subtitles"] = [_serialize_subtitle(sub) for sub in subtitles]
+
+
+def _serialize_subtitle(sub: SubtitleEntry) -> dict[str, Any]:
+    return {"language": sub.language, "source": sub.source, "type": sub.type}
+
+
+# ════════════════════════════════════════════════════════════════════════════
+# Deserialize — dict → DTO
+# ════════════════════════════════════════════════════════════════════════════
+
+_ALLOWED_ROOT = {"schema_version", "imdb_id", "tmdb_id", "seasons"}
+_ALLOWED_SEASON = {"number", "path", "audio", "subtitles", "episodes"}
+_ALLOWED_EPISODE = {"number", "path", "audio", "subtitles"}
+_ALLOWED_SUBTITLE = {"language", "source", "type"}
+_ALLOWED_AUDIO = {"language"}
+
+
+def deserialize(data: dict[str, Any]) -> ShowSidecar:
+    """Parse a sidecar dict into a :class:`ShowSidecar`.
+
+    Raises :class:`SidecarSchemaError` on schema violations (unknown
+    keys, missing required fields, type mismatch, unsupported
+    ``schema_version``).
+    """
+    _require_dict(data, "root")
+    _reject_unknown(data, _ALLOWED_ROOT, "root")
+
+    version = data.get("schema_version")
+    if version != SCHEMA_VERSION:
+        raise SidecarSchemaError(
+            f"Unsupported schema_version: {version!r} (expected {SCHEMA_VERSION})"
+        )
+
+    imdb_id_raw = data.get("imdb_id")
+    if not isinstance(imdb_id_raw, str):
+        raise SidecarSchemaError(
+            f"imdb_id must be a string, got {type(imdb_id_raw).__name__}"
+        )
+
+    tmdb_id_raw = data.get("tmdb_id")
+    if tmdb_id_raw is not None and not isinstance(tmdb_id_raw, int):
+        raise SidecarSchemaError(
+            f"tmdb_id must be an int or absent, got {type(tmdb_id_raw).__name__}"
+        )
+
+    seasons_raw = data.get("seasons", [])
+    if not isinstance(seasons_raw, list):
+        raise SidecarSchemaError(
+            f"seasons must be a list, got {type(seasons_raw).__name__}"
+        )
+
+    seasons = tuple(_deserialize_season(s) for s in seasons_raw)
+
+    return ShowSidecar(
+        imdb_id=ImdbId(imdb_id_raw),
+        tmdb_id=tmdb_id_raw,
+        seasons=seasons,
+        schema_version=version,
+    )
+
+
+def _deserialize_season(data: Any) -> SeasonSidecar:
+    _require_dict(data, "season")
+    _reject_unknown(data, _ALLOWED_SEASON, "season")
+
+    number = _require_int(data, "number", "season")
+    path = _require_str(data, "path", "season")
+    episodes_raw = data.get("episodes")
+
+    tracks = _read_tracks(data, "season")
+    if episodes_raw is not None and not isinstance(episodes_raw, list):
+        raise SidecarSchemaError(
+            f"season.episodes must be a list, got {type(episodes_raw).__name__}"
+        )
+    episodes = (
+        tuple(_deserialize_episode(e) for e in episodes_raw)
+        if episodes_raw
+        else ()
+    )
+
+    return SeasonSidecar(
+        number=SeasonNumber(number),
+        path=path,
+        episodes=episodes,
+        **tracks,
+    )
+
+
+def _deserialize_episode(data: Any) -> EpisodeSidecar:
+    _require_dict(data, "episode")
+    _reject_unknown(data, _ALLOWED_EPISODE, "episode")
+
+    number = _require_int(data, "number", "episode")
+    path = _require_str(data, "path", "episode")
+    tracks = _read_tracks(data, "episode")
+
+    return EpisodeSidecar(
+        number=EpisodeNumber(number),
+        path=path,
+        **tracks,
+    )
+
+
+def _read_tracks(data: dict[str, Any], where: str) -> dict[str, Any]:
+    """Extract the optional probed-track fields shared between season and episode."""
+    result: dict[str, Any] = {}
+
+    audio_raw = data.get("audio")
+    if audio_raw is not None:
+        if not isinstance(audio_raw, list):
+            raise SidecarSchemaError(
+                f"{where}.audio must be a list, got {type(audio_raw).__name__}"
+            )
+        result["audio_languages"] = tuple(
+            _deserialize_audio(entry, where) for entry in audio_raw
+        )
+
+    subtitles_raw = data.get("subtitles")
+    if subtitles_raw is not None:
+        if not isinstance(subtitles_raw, list):
+            raise SidecarSchemaError(
+                f"{where}.subtitles must be a list, got {type(subtitles_raw).__name__}"
+            )
+        result["subtitles"] = tuple(
+            _deserialize_subtitle(entry) for entry in subtitles_raw
+        )
+
+    return result
+
+
+def _deserialize_audio(entry: Any, where: str) -> str:
+    _require_dict(entry, f"{where}.audio[]")
+    _reject_unknown(entry, _ALLOWED_AUDIO, f"{where}.audio[]")
+    language = entry.get("language")
+    if not isinstance(language, str):
+        raise SidecarSchemaError(
+            f"{where}.audio[].language must be a string, "
+            f"got {type(language).__name__}"
+        )
+    return language
+
+
+def _deserialize_subtitle(entry: Any) -> SubtitleEntry:
+    _require_dict(entry, "subtitle")
+    _reject_unknown(entry, _ALLOWED_SUBTITLE, "subtitle")
+    language = entry.get("language")
+    source = entry.get("source")
+    type_ = entry.get("type")
+    if not isinstance(language, str):
+        raise SidecarSchemaError(
+            f"subtitle.language must be a string, got {type(language).__name__}"
+        )
+    if not isinstance(source, str):
+        raise SidecarSchemaError(
+            f"subtitle.source must be a string, got {type(source).__name__}"
+        )
+    if not isinstance(type_, str):
+        raise SidecarSchemaError(
+            f"subtitle.type must be a string, got {type(type_).__name__}"
+        )
+    return SubtitleEntry(language=language, source=source, type=type_)
+
+
+# ════════════════════════════════════════════════════════════════════════════
+# Schema-checking helpers
+# ════════════════════════════════════════════════════════════════════════════
+
+
+def _require_dict(value: Any, where: str) -> None:
+    if not isinstance(value, dict):
+        raise SidecarSchemaError(
+            f"{where} must be a mapping, got {type(value).__name__}"
+        )
+
+
+def _reject_unknown(data: dict[str, Any], allowed: set[str], where: str) -> None:
+    extra = set(data) - allowed
+    if extra:
+        raise SidecarSchemaError(
+            f"{where} has unknown keys: {sorted(extra)}"
+        )
+
+
+def _require_str(data: dict[str, Any], key: str, where: str) -> str:
+    value = data.get(key)
+    if not isinstance(value, str):
+        raise SidecarSchemaError(
+            f"{where}.{key} must be a string, got {type(value).__name__}"
+        )
+    return value
+
+
+def _require_int(data: dict[str, Any], key: str, where: str) -> int:
+    value = data.get(key)
+    if not isinstance(value, int) or isinstance(value, bool):
+        raise SidecarSchemaError(
+            f"{where}.{key} must be an int, got {type(value).__name__}"
+        )
+    return value
@@ -0,0 +1,87 @@
+"""DTOs mirroring the `.alfred` YAML schema.
+
+These dataclasses are the **in-memory representation** of a single
+``.alfred`` file. They mirror the YAML schema described in
+``specs/dot_alfred.md`` field-for-field.
+
+Philosophy: the sidecar exists to avoid two costly operations on every
+read — re-walking the show directory and re-probing the media tracks.
+Parser-derivable fields (release group, source, quality, codec) are
+**not stored**: they live in folder and file names and the parser
+reconstructs them on demand. The sidecar only caches what is not
+otherwise free — folder/file paths (to skip the walk) and probed track
+metadata (audio languages, subtitles — to skip ffprobe).
+
+Schema version: 1.
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+
+from ....domain.shared.value_objects import ImdbId
+from ....domain.tv_shows.value_objects import EpisodeNumber, SeasonNumber
+
+SCHEMA_VERSION = 1
+
+
+@dataclass(frozen=True)
+class SubtitleEntry:
+    """One subtitle row, as it appears under ``subtitles:`` in YAML."""
+
+    language: str
+    source: str  # "embedded" | "adjacent"
+    type: str  # "standard" | "sdh" | "forced"
+
+
+@dataclass(frozen=True)
+class EpisodeSidecar:
+    """One episode entry under ``episodes:`` in episodic mode.
+
+    Carries only probed track metadata — release identifiers
+    (group/source/quality/codec) are derived from the filename by the
+    parser, not duplicated here.
+    """
+
+    number: EpisodeNumber
+    path: str
+    audio_languages: tuple[str, ...] = ()
+    subtitles: tuple[SubtitleEntry, ...] = ()
+
+
+@dataclass(frozen=True)
+class SeasonSidecar:
+    """One season block in the sidecar.
+
+    Two storage modes are encoded structurally:
+
+    * **PACK** — ``episodes`` is empty; ``audio_languages`` /
+      ``subtitles`` describe the season as a whole (VO-only policy means
+      all episodes share the same audio set).
+    * **EPISODIC** — ``episodes`` is populated; per-episode track data
+      lives on each :class:`EpisodeSidecar`.
+
+    Release identifiers (group/source/quality/codec) come from parsing
+    the season folder name and are not stored.
+    """
+
+    number: SeasonNumber
+    path: str
+    audio_languages: tuple[str, ...] = ()
+    subtitles: tuple[SubtitleEntry, ...] = ()
+    episodes: tuple[EpisodeSidecar, ...] = ()
+
+
+@dataclass(frozen=True)
+class ShowSidecar:
+    """Root DTO — one ``.alfred`` file maps to one ``ShowSidecar``.
+
+    Identity-only at the root (``imdb_id`` / ``tmdb_id``). The show's
+    display title is the parent directory name on disk, not stored
+    here.
+    """
+
+    imdb_id: ImdbId
+    tmdb_id: int | None = None
+    seasons: tuple[SeasonSidecar, ...] = field(default_factory=tuple)
+    schema_version: int = SCHEMA_VERSION
@@ -0,0 +1,425 @@
+"""Tests for the ``.alfred`` sidecar serializer.
+
+Covers:
+
+* Round-trip equivalence (``serialize`` → ``deserialize`` → equal DTO).
+* Field omission rules (``None`` / empty optional tuples are omitted).
+* Strict schema (unknown keys rejected, missing keys raise clearly).
+* The Foundation fixture (real-world PACK season with mixed subtitles)
+  to exercise the full surface on a realistic case.
+
+The serializer is pure-dict in/out; YAML I/O lives in the repository
+layer and is tested separately.
+
+Note: release identifiers (group/source/quality/codec) live in folder
+and file names — the parser derives them on demand. They are
+deliberately absent from the sidecar schema.
+"""
+
+from __future__ import annotations
+
+import pytest
+import yaml
+
+from alfred.domain.shared.value_objects import ImdbId
+from alfred.domain.tv_shows.value_objects import EpisodeNumber, SeasonNumber
+from alfred.infrastructure.persistence.dot_alfred import (
+    EpisodeSidecar,
+    SeasonSidecar,
+    ShowSidecar,
+    SubtitleEntry,
+    deserialize,
+    serialize,
+)
+from alfred.infrastructure.persistence.dot_alfred.serializer import (
+    SidecarSchemaError,
+)
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+def _foundation_sidecar() -> ShowSidecar:
+    """The Foundation S01 PACK season — real-world fixture data.
+
+    Mirrors the layout seen in
+    ``/mnt/testipool/tv_shows/Foundation.2021.1080p.WEBRip.x265-RARBG/`` —
+    superset audio/subs at season level (some episodes have a forced
+    English sub, captured at season scope).
+    """
+    return ShowSidecar(
+        imdb_id=ImdbId("tt0804484"),
+        tmdb_id=84958,
+        seasons=(
+            SeasonSidecar(
+                number=SeasonNumber(1),
+                path="Foundation.2021.S01.1080p.WEBRip.x265-RARBG",
+                audio_languages=("eng",),
+                subtitles=(
+                    SubtitleEntry(language="eng", source="adjacent", type="standard"),
+                    SubtitleEntry(language="eng", source="adjacent", type="sdh"),
+                    SubtitleEntry(language="eng", source="adjacent", type="forced"),
+                    SubtitleEntry(language="fra", source="adjacent", type="standard"),
+                    SubtitleEntry(language="fra", source="adjacent", type="sdh"),
+                ),
+            ),
+        ),
+    )
+
+
+def _minimal_sidecar() -> ShowSidecar:
+    """Identity-only sidecar — no seasons, no track data."""
+    return ShowSidecar(imdb_id=ImdbId("tt0903747"))
+
+
+def _episodic_sidecar() -> ShowSidecar:
+    """A season in EPISODIC mode (per-episode track metadata)."""
+    return ShowSidecar(
+        imdb_id=ImdbId("tt0903747"),
+        tmdb_id=1396,
+        seasons=(
+            SeasonSidecar(
+                number=SeasonNumber(5),
+                path="Breaking.Bad.S05",
+                episodes=(
+                    EpisodeSidecar(
+                        number=EpisodeNumber(1),
+                        path="Breaking.Bad.S05E01.Live.Free.or.Die-MeGusta/Breaking.Bad.S05E01.mkv",
+                        audio_languages=("eng",),
+                        subtitles=(
+                            SubtitleEntry(
+                                language="eng", source="embedded", type="standard"
+                            ),
+                        ),
+                    ),
+                    EpisodeSidecar(
+                        number=EpisodeNumber(2),
+                        path="Breaking.Bad.S05E02.Madrigal-CtrlHD/Breaking.Bad.S05E02.mkv",
+                        audio_languages=("eng",),
+                    ),
+                ),
+            ),
+        ),
+    )
+
+
+# ---------------------------------------------------------------------------
+# Round-trip
+# ---------------------------------------------------------------------------
+
+
+class TestRoundTrip:
+    def test_minimal(self):
+        original = _minimal_sidecar()
+        assert deserialize(serialize(original)) == original
+
+    def test_foundation_pack_season(self):
+        original = _foundation_sidecar()
+        assert deserialize(serialize(original)) == original
+
+    def test_episodic_breaking_bad(self):
+        original = _episodic_sidecar()
+        assert deserialize(serialize(original)) == original
+
+    def test_round_trip_through_yaml(self):
+        """Full pipeline: DTO → dict → YAML text → dict → DTO."""
+        original = _foundation_sidecar()
+        text = yaml.safe_dump(serialize(original), sort_keys=False)
+        recovered = deserialize(yaml.safe_load(text))
+        assert recovered == original
+
+
+# ---------------------------------------------------------------------------
+# Serialize — field omission
+# ---------------------------------------------------------------------------
+
+
+class TestSerializeOmission:
+    def test_tmdb_id_omitted_when_none(self):
+        out = serialize(_minimal_sidecar())
+        assert "tmdb_id" not in out
+
+    def test_empty_seasons_is_empty_list_not_omitted(self):
+        # We always emit `seasons:` even if empty — the key documents the
+        # show "has no season recorded yet" vs being entirely missing.
+        out = serialize(_minimal_sidecar())
+        assert out["seasons"] == []
+
+    def test_no_audio_when_empty(self):
+        sidecar = ShowSidecar(
+            imdb_id=ImdbId("tt0903747"),
+            seasons=(SeasonSidecar(number=SeasonNumber(1), path="X.S01"),),
+        )
+        out = serialize(sidecar)
+        assert "audio" not in out["seasons"][0]
+
+    def test_no_subtitles_when_empty(self):
+        sidecar = ShowSidecar(
+            imdb_id=ImdbId("tt0903747"),
+            seasons=(SeasonSidecar(number=SeasonNumber(1), path="X.S01"),),
+        )
+        out = serialize(sidecar)
+        assert "subtitles" not in out["seasons"][0]
+
+    def test_no_episodes_when_pack(self):
+        sidecar = ShowSidecar(
+            imdb_id=ImdbId("tt0903747"),
+            seasons=(SeasonSidecar(number=SeasonNumber(1), path="X.S01"),),
+        )
+        out = serialize(sidecar)
+        assert "episodes" not in out["seasons"][0]
+
+    def test_parser_derivable_fields_never_emitted(self):
+        """group/source/quality/codec must never appear in the YAML."""
+        out = serialize(_foundation_sidecar())
+        season = out["seasons"][0]
+        for forbidden in ("group", "source", "quality", "codec"):
+            assert forbidden not in season
+
+
+# ---------------------------------------------------------------------------
+# Serialize — shape
+# ---------------------------------------------------------------------------
+
+
+class TestSerializeShape:
+    def test_root_keys(self):
+        out = serialize(_foundation_sidecar())
+        assert out["schema_version"] == 1
+        assert out["imdb_id"] == "tt0804484"
+        assert out["tmdb_id"] == 84958
+        assert isinstance(out["seasons"], list)
+
+    def test_season_number_is_int(self):
+        out = serialize(_foundation_sidecar())
+        assert out["seasons"][0]["number"] == 1
+        assert isinstance(out["seasons"][0]["number"], int)
+
+    def test_audio_as_list_of_dicts(self):
+        out = serialize(_foundation_sidecar())
+        assert out["seasons"][0]["audio"] == [{"language": "eng"}]
+
+    def test_subtitle_structure(self):
+        out = serialize(_foundation_sidecar())
+        subs = out["seasons"][0]["subtitles"]
+        assert subs[0] == {
+            "language": "eng",
+            "source": "adjacent",
+            "type": "standard",
+        }
+
+
+# ---------------------------------------------------------------------------
+# Deserialize — strict schema
+# ---------------------------------------------------------------------------
+
+
+class TestDeserializeStrict:
+    def _valid_minimal(self) -> dict:
+        return {
+            "schema_version": 1,
+            "imdb_id": "tt0903747",
+            "seasons": [],
+        }
+
+    def test_unknown_root_key_raises(self):
+        data = self._valid_minimal()
+        data["bogus"] = "x"
+        with pytest.raises(SidecarSchemaError, match="root has unknown keys"):
+            deserialize(data)
+
+    def test_unknown_season_key_raises(self):
+        data = self._valid_minimal()
+        data["seasons"] = [{"number": 1, "path": "X", "weird": True}]
+        with pytest.raises(SidecarSchemaError, match="season has unknown keys"):
+            deserialize(data)
+
+    def test_parser_derivable_season_key_raises(self):
+        """A stray group/source/quality/codec key must be rejected."""
+        data = self._valid_minimal()
+        data["seasons"] = [{"number": 1, "path": "X", "group": "RARBG"}]
+        with pytest.raises(SidecarSchemaError, match="season has unknown keys"):
+            deserialize(data)
+
+    def test_unknown_episode_key_raises(self):
+        data = self._valid_minimal()
+        data["seasons"] = [
+            {
+                "number": 1,
+                "path": "X",
+                "episodes": [{"number": 1, "path": "p", "huh": 1}],
+            }
+        ]
+        with pytest.raises(SidecarSchemaError, match="episode has unknown keys"):
+            deserialize(data)
+
+    def test_unknown_subtitle_key_raises(self):
+        data = self._valid_minimal()
+        data["seasons"] = [
+            {
+                "number": 1,
+                "path": "X",
+                "subtitles": [
+                    {"language": "eng", "source": "adjacent", "type": "sdh", "x": 1}
+                ],
+            }
+        ]
+        with pytest.raises(SidecarSchemaError, match="subtitle has unknown keys"):
+            deserialize(data)
+
+    def test_unknown_audio_key_raises(self):
+        data = self._valid_minimal()
+        data["seasons"] = [
+            {
+                "number": 1,
+                "path": "X",
+                "audio": [{"language": "eng", "channels": 6}],
+            }
+        ]
+        with pytest.raises(SidecarSchemaError, match=r"audio\[\] has unknown keys"):
+            deserialize(data)
+
+    def test_wrong_schema_version_raises(self):
+        data = self._valid_minimal()
+        data["schema_version"] = 2
+        with pytest.raises(SidecarSchemaError, match="schema_version"):
+            deserialize(data)
+
+    def test_missing_schema_version_raises(self):
+        data = self._valid_minimal()
+        del data["schema_version"]
+        with pytest.raises(SidecarSchemaError, match="schema_version"):
+            deserialize(data)
+
+    def test_imdb_id_must_be_string(self):
+        data = self._valid_minimal()
+        data["imdb_id"] = 12345
+        with pytest.raises(SidecarSchemaError, match="imdb_id must be a string"):
+            deserialize(data)
+
+    def test_tmdb_id_must_be_int_when_present(self):
+        data = self._valid_minimal()
+        data["tmdb_id"] = "1396"
+        with pytest.raises(SidecarSchemaError, match="tmdb_id"):
+            deserialize(data)
+
+    def test_seasons_must_be_list(self):
+        data = self._valid_minimal()
+        data["seasons"] = {"1": {}}
+        with pytest.raises(SidecarSchemaError, match="seasons must be a list"):
+            deserialize(data)
+
+    def test_season_number_must_be_int(self):
+        data = self._valid_minimal()
+        data["seasons"] = [{"number": "1", "path": "X"}]
+        with pytest.raises(SidecarSchemaError, match="season.number must be an int"):
+            deserialize(data)
+
+    def test_season_number_bool_rejected(self):
+        # bool is a subclass of int but should not pass — guards against
+        # YAML quirks where `True` could sneak in as a season number.
+        data = self._valid_minimal()
+        data["seasons"] = [{"number": True, "path": "X"}]
+        with pytest.raises(SidecarSchemaError, match="season.number must be an int"):
+            deserialize(data)
+
+    def test_season_path_must_be_string(self):
+        data = self._valid_minimal()
+        data["seasons"] = [{"number": 1, "path": 1}]
+        with pytest.raises(SidecarSchemaError, match="season.path"):
+            deserialize(data)
+
+    def test_subtitle_missing_field_raises(self):
+        data = self._valid_minimal()
+        data["seasons"] = [
+            {
+                "number": 1,
+                "path": "X",
+                "subtitles": [{"language": "eng", "source": "adjacent"}],
+            }
+        ]
+        with pytest.raises(SidecarSchemaError, match="subtitle.type"):
+            deserialize(data)
+
+
+# ---------------------------------------------------------------------------
+# Foundation fixture — golden YAML
+# ---------------------------------------------------------------------------
+
+
+class TestFoundationGolden:
+    """Use the Foundation case to validate the produced YAML reads well."""
+
+    def test_yaml_dump_shape(self):
+        text = yaml.safe_dump(serialize(_foundation_sidecar()), sort_keys=False)
+        # Sanity-check that the human-readable layout matches the spec.
+        assert "schema_version: 1" in text
+        assert "imdb_id: tt0804484" in text
+        assert "tmdb_id: 84958" in text
+        assert "- number: 1" in text
+        assert "path: Foundation.2021.S01.1080p.WEBRip.x265-RARBG" in text
+        # No episodes block (PACK mode).
+        assert "episodes:" not in text
+        # No release identifiers at season scope — those live in folder
+        # names. (We can't check ``source:`` here because the subtitle
+        # entries legitimately carry their own ``source`` key.)
+        for forbidden in ("group:", "quality:", "codec:"):
+            assert forbidden not in text
+
+
+# ---------------------------------------------------------------------------
+# Foundation on-disk fixture (real folder structure, no real media)
+# ---------------------------------------------------------------------------
+
+
+@pytest.fixture
+def foundation_tree(tmp_path):
+    """Recreate the Foundation S01 layout in a tmp directory.
+
+    Mirrors the on-disk structure of
+    ``/mnt/testipool/tv_shows/Foundation.2021.1080p.WEBRip.x265-RARBG/``
+    using empty placeholder files — sufficient for tests that need a
+    realistic show folder without dragging in real media.
+    """
+    show = tmp_path / "Foundation.2021.1080p.WEBRip.x265-RARBG"
+    season = show / "Foundation.2021.S01.1080p.WEBRip.x265-RARBG"
+    season.mkdir(parents=True)
+    base = "Foundation.2021.S01E{n:02d}.1080p.WEBRip.x265-RARBG"
+    for ep in range(1, 11):
+        stem = base.format(n=ep)
+        (season / f"{stem}.mp4").touch()
+        (season / f"{stem}.eng.srt").touch()
+        (season / f"{stem}.eng.sdh.srt").touch()
+        (season / f"{stem}.fra.srt").touch()
+        (season / f"{stem}.fra.sdh.srt").touch()
+        if 4 <= ep <= 9:
+            (season / f"{stem}.eng.forced.srt").touch()
+    return show
+
+
+class TestFoundationOnDisk:
+    """The on-disk fixture is mostly for future tests (repository walk).
+
+    For now we exercise the basic shape — a placeholder for richer
+    walk-and-build tests landing in step 3 (repository).
+    """
+
+    def test_fixture_has_expected_episode_count(self, foundation_tree):
+        season = foundation_tree / "Foundation.2021.S01.1080p.WEBRip.x265-RARBG"
+        videos = sorted(season.glob("*.mp4"))
+        assert len(videos) == 10
+
+    def test_fixture_has_forced_subs_only_on_some_episodes(self, foundation_tree):
+        season = foundation_tree / "Foundation.2021.S01.1080p.WEBRip.x265-RARBG"
+        forced = sorted(season.glob("*.eng.forced.srt"))
+        assert len(forced) == 6  # E04 through E09
+
+    def test_serialize_yaml_can_be_written_alongside(self, foundation_tree):
+        """Write the sidecar at the root of the show folder and read it back."""
+        sidecar_path = foundation_tree / ".alfred"
+        sidecar_path.write_text(
+            yaml.safe_dump(serialize(_foundation_sidecar()), sort_keys=False)
+        )
+        recovered = deserialize(yaml.safe_load(sidecar_path.read_text()))
+        assert recovered == _foundation_sidecar()