# Changelog

All notable changes to Alfred are documented here. The format is loosely based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

Alfred is not yet on SemVer: entries are grouped by **dated work blocks** instead of release numbers. Granularity targets behavioral or API-visible changes; refer to `git log` for commit-level detail.

Sections used per block: **Added** / **Changed** / **Deprecated** / **Removed** / **Fixed** / **Internal** (for tech-debt and refactor noise that doesn't affect callers).

---

## [Unreleased]

### Added

- **`.alfred` v2, Phase 4: v2-shaped `rescan_show` + new `rescan_movie` + index anchor-warning + `tmdb_cache_ttl_days` setting.** Fourth and final structural phase of `specs/dot_alfred_v2.md` on branch `refactor/dot-alfred-v2`. The TV + movie rescan orchestrators now write v2 release aggregates (`SeriesRelease` / `MovieRelease`) via the concrete v2 repositories; the library index keeps auto-healing from the new sidecars on its next read (no TMDB call from rescan; that stays Phase 5).
  - **`rescan_show`** moves from `alfred/application/library/` to `alfred/application/tv_shows/` (symmetry with the new `alfred/application/movies/`). New signature: `(show_root, *, tmdb_id: TmdbId, imdb_id: ImdbId | None = None, series_repo, scanner, prober, kb) -> SeriesRelease`.
  - **`rescan_movie`** (new, in `alfred/application/movies/rescan.py`) locates the main video via `find_video_file`, runs `inspect_release` once, and writes the per-movie `.alfred` sidecar. `added_at = datetime.now(UTC)` on every rescan (the sidecar records reconciliation time, not filesystem mtime). Raises `MovieRescanFailed` when no video is found in the folder.
  - **PACK semantics in `rescan_show`**: a single-video + no-episode season becomes `SeasonRelease(mode=PACK, folder=…, episodes=())`. The slot map stays empty until the Phase 5 TMDB sync supplies `episode_count`, so no fabricated `EpisodeRange` lands in the sidecar.
*(Superseded by Phase 4b — see Fixed.)* - **`Settings.tmdb_cache_ttl_days: int = 14`** — placeholder for the Phase 5 TTL policy on library-index entries (`fetched_at + TTL` drives refresh decisions). - **Library-index anchor-mismatch warning** — both `DotAlfredTVShowLibraryIndex` and `DotAlfredMovieLibraryIndex` now cross-check each entry's `metadata.path` against the on-disk folder layout right after a successful parse. Drift is logged as a `WARNING` (one per missing folder, with `tmdb_id`); the heal path stays silent by construction (it always synthesizes from real folder names). - **`.alfred` v2 — Phase 5: TMDB sync orchestrators.** Fifth phase of `specs/dot_alfred_v2.md` on branch `refactor/dot-alfred-v2`. Two new orchestrators refresh the library-root index's TMDB-cached fields from on-disk truth + a single TMDB call: - **`sync_show`** (`alfred/application/tv_shows/sync.py`) calls `TMDBClient.get_tv_show_info`, loads the release via `DotAlfredSeriesReleaseRepository.load_by_tmdb_id`, and upserts the result into `DotAlfredTVShowLibraryIndex`. Honors `Settings.tmdb_cache_ttl_days`; placeholder entries (auto-healed, `status == "unknown"`) always refresh; `force=True` overrides both gates. Raises `ShowNotFoundInLibrary` when neither index nor sidecar carry `tmdb_id`. Indexed shows with a missing per-show sidecar still get a fresh TMDB pass — slot map clears until rescan repopulates it. - **`sync_movie`** (`alfred/application/movies/sync.py`) is the movie-side parallel. Placeholder signature is `name == metadata.path` (auto-heal copies the folder name into `name`; the sidecar schema requires `name` non-empty so we can't use `name == ""`). When the per-movie sidecar is gone but the index entry remains, sync warns and returns the existing entry unchanged (no upsert possible without a release). - **`TmdbMovieInfo` DTO + `TMDBClient.get_movie_info`** — symmetric to the existing `TmdbShowInfo` / `get_tv_show_info` pair. 
Carries `tmdb_id`, `imdb_id`, `title`, and `release_year` (parsed from TMDB's `release_date`).
- **`load_by_tmdb_id` on the v2 release repositories.** The series repo returns `(SeriesRelease, show_folder_name)` so the sync orchestrator can feed `DotAlfredTVShowLibraryIndex.upsert(..., path=...)`; the movie repo returns `MovieRelease` alone (folder is on `release.folder` already) and is provided as a semantic alias of `find_by_tmdb_id` for symmetry.
- **`alfred/application/exceptions.py`**: new module for the two shared `*NotFoundInLibrary` exceptions raised by the sync orchestrators (`ShowNotFoundInLibrary`, `MovieNotFoundInLibrary`).

### Fixed

- **PACK vs EPISODIC classification (Phase 4b).** The Phase 4 walker + `rescan_show` logic classified seasons by parser output (does the filename carry `Exx`?), but PACK vs EPISODIC is a *structural* distinction:
  - **PACK** = season folder with N flat `SxxEyy` videos.
  - **EPISODIC** = season folder with N subfolders, each holding one video.

  The walker now descends two levels under `show_root` and classifies per season folder. Mixed (flat + subfolders) is malformed: warn and skip. `rescan_show` trusts the walker's mode and stops conflating "single un-numbered video" with PACK (that case is now skipped as malformed too). Tests rewritten against the real model. Supersedes the PACK-semantics bullet above in Added.

### Removed

- **v1 dot_alfred stack and its abstract domain ports.** Deleted `alfred/infrastructure/persistence/dot_alfred/{bridge,repository,serializer,sidecar}.py`, plus the `alfred/domain/{tv_shows,movies}/repositories.py` ABCs (`TVShowRepository` / `MovieRepository`), which had zero callers after Phase 4. `dot_alfred/__init__.py` is rewritten as a v2-only re-export (four concrete repositories + `ShowFolderUnknown`).
- **`alfred/application/library/` package** (rescan + walker moved to `alfred/application/tv_shows/`).
- The two Phase 3 module-level test skips (`test_repository.py`, `test_serializer.py`) are lifted by deleting the quarantined files.
- **`MediaWithTracks` mixin + `track_lang_matches` helper** in `alfred.domain.shared.media`. Parked in Phase 4 pending a Phase 5 decision; zero callers across `alfred/` and `tests/` after the v2 aggregates landed, so both go.

### Internal

- **Suite**: 1233 → 1277 passing; 10 → 8 skips (only LLM-not-running skips remain; the Phase 3 quarantines are gone with their files).
- Phase 5 cleanup sweep returns zero hits for `MediaWithTracks`, v1 dot_alfred symbols, v1 sidecar names, and `alfred.application.library`: the v2 surface is the only one left.

### Changed

- **`.alfred` v2, Phase 3: `TVShow` / `Movie` aggregates become TMDB-only.** Third phase of `specs/dot_alfred_v2.md` on branch `refactor/dot-alfred-v2`. Filesystem-side concerns (file paths, tracks, quality, mode, `added_at`) move to the `releases/` domain added in Phase 1; the TMDB aggregates now carry only identity + TMDB catalog facts.
  - **`TVShow`**: `tmdb_id: TmdbId` is now the **required primary key**; `imdb_id: ImdbId | None` is the optional secondary anchor. Added `status: str = "unknown"` (raw TMDB string, default matches the v2 library-index auto-heal placeholder). `episode_count` aggregates the TMDB-cached counts on each `Season` (was: sum of materialized `Episode` objects).
  - **`Season`**: added `episode_count: int = 0` (TMDB-cached, authoritative). **Removed**: `audio_tracks`, `subtitle_tracks`, and the `mode` property (release mode now lives only on `SeasonRelease.mode`, the single source of truth).
  - **`Episode`**: slimmed to identity + title. **Removed**: `file_path`, `file_size`, `audio_tracks`, `subtitle_tracks`. The `MediaWithTracks` mixin is no longer in `Episode`'s MRO; on-disk facts live on the matching `EpisodeRelease` keyed by `(season_number, episode_number)`.
  - **`Movie`**: `tmdb_id: TmdbId` required, `imdb_id` optional.
**Removed**: `file_path`, `file_size`, `quality`, `added_at`, `audio_tracks`, `subtitle_tracks`. `get_filename()` now returns `"Title.Year"` (quality lives on `MovieRelease` and is appended by a release-aware caller; Phase 4 wires this through `MediaOrganizer`).
  - **`TVShowBuilder` / `SeasonBuilder`**: constructor requires `tmdb_id: TmdbId`; `imdb_id` and `status` are optional. `SeasonBuilder.set_episode_count(int)` replaces the old `set_audio_tracks` / `set_subtitle_tracks` (tracks no longer persisted on `Season`).
- **`MovieRelease` carries `added_at: datetime`** (required). Bumped `dot_alfred/v2` `SCHEMA_VERSION` from `1` → `2` to add `added_at: datetime` to `MovieReleaseSidecar`. Round-trip via Pydantic `mode="json"` (datetime ↔ ISO 8601 string). No migration code shipped, since no v2.1 sidecars exist in the wild yet.
- **No-coercion `TmdbId` contract.** `TVShow(tmdb_id=1396)` now raises; callers pass `TmdbId(1396)`. Same for `imdb_id: ImdbId | None` on `TVShow`/`Movie`. Honest type contract, no ergonomic shim.

### Removed

- `Season.mode` property (derive from `SeasonRelease.mode` instead).
- `Episode.file_path` / `file_size` / `audio_tracks` / `subtitle_tracks`.
- `Movie.file_path` / `file_size` / `quality` / `added_at` / `audio_tracks` / `subtitle_tracks`.

### Internal

- v1 dot_alfred package (`bridge.py`, `repository.py`, `serializer.py`, `sidecar.py`), the abstract `TVShowRepository` / `MovieRepository` ports typed against the pre-Phase-3 aggregates, and `alfred/application/library/rescan.py` are **intentionally left in tree as a known-red island**. Their tests (`tests/infrastructure/persistence/dot_alfred/test_repository.py`, `test_serializer.py`, `tests/application/library/test_rescan.py`) are module-level skipped with a Phase 4 reference. Phase 4 rewrites `rescan_show` / introduces `rescan_movie` on top of the v2 release repositories + library index, then deletes the v1 stack + the abstract ports + the quarantined tests in one swing.
- Test suite: 1216 passed, 11 skipped (8 pre-existing + 3 Phase-3 quarantines), 4 xfailed. v2 round-trip tests now reference `SCHEMA_VERSION` instead of hard-coded `1` for future-proofing.

### Added

- **`.alfred` v2, Phase 2: new persistence package + TMDB client extensions.** Second phase of `specs/dot_alfred_v2.md` on branch `refactor/dot-alfred-v2`. The new `alfred/infrastructure/persistence/dot_alfred/v2/` package ships the full v2 sidecar stack while leaving v1 (and the existing `TVShow` aggregate) untouched; Phase 3 is the cutover.
  - **Pydantic DTOs**: `SeriesReleaseSidecar` / `MovieReleaseSidecar` (per-item), `TVShowLibraryIndexSidecar` / `MovieLibraryIndexSidecar` (library-root index). All built on a common `_Strict` base (`extra="forbid"`, `frozen=True`) with a `@model_validator` enforcing `schema_version == 1`.
  - **Track entries**: `AudioTrackEntry` / `SubtitleEntry` (sidecar cache shape, slimmed from the domain track types). `SubtitleEntry` carries `is_forced` + `is_sdh` as explicit booleans (v1's `type: "sdh"` overload is gone).
  - **Serializer**: `read_yaml` / `atomic_write_yaml` helpers centralize YAML I/O and atomic writes (`.tmp + os.replace`). `SidecarSchemaError` wraps both YAML parse errors and Pydantic validation errors for uniform catch-and-skip semantics.
  - **Bridge**: lossless `domain ↔ sidecar` conversion for `SeriesRelease` / `MovieRelease` (round-trippable, including multi-episode ranges and `is_sdh` subtitles); one-way projection for library-index entries (`show_index_entry_from`, `movie_index_entry_from`) that flattens multi-episode files into per-TMDB-slot maps in `seasons[*].episodes`.
- **Repositories** — `DotAlfredSeriesReleaseRepository` / `DotAlfredMovieReleaseRepository` walk `library_root/*/` with log+skip on corruption; **`DotAlfredTVShowLibraryIndex`** / **`DotAlfredMovieLibraryIndex`** auto-heal silently on missing or corrupt index files by rebuilding from the per-item sidecars (healed entries keep TMDB-cached fields as placeholders until the next sync repopulates them). Writes are atomic and never auto-heal (read paths handle that). - **TMDB client extensions** — `TmdbSeasonInfo` / `TmdbShowInfo` DTOs + `TMDBClient.get_tv_show_info(tmdb_id)` aggregating `/tv/{id}` + `/tv/{id}/external_ids`. The parsing logic is a pure function (`parse_tv_show_info`) testable without HTTP, with an injectable reference date for deterministic `aired` flag tests. - **`is_sdh` flag on `SubtitleTrack`.** Added to `alfred/domain/shared/media.py::SubtitleTrack` to mirror ffprobe's `hearing_impaired` disposition. Wired through the ffprobe layer (`ffprobe_prober.py`) and the v2 sidecar bridge so SDH information round-trips end-to-end. Defaults to `False` — backwards-compatible for every existing caller. - **37 v2 integration tests** on `tmp_path` covering round-trips (domain ↔ sidecar ↔ YAML ↔ domain), atomic writes (no `.tmp` leftovers), per-item log+skip on corruption / schema mismatch, movie anchor-mismatch warning, full upsert / find / delete on both library indexes, and the auto-heal path on missing / corrupt / schema-mismatched index files. **16 TMDB DTO tests** for the new `parse_tv_show_info` pure function. - **`.alfred` v2 — Phase 1: new `releases/` domain.** First step of `specs/dot_alfred_v2.md` on branch `refactor/dot-alfred-v2`. The new `alfred/domain/releases/` package introduces a filesystem-only bounded context separated from TMDB identity (the existing `tv_shows` / `movies` domains). 
It hosts: - **`EpisodeRange` VO** — covers single-episode files (`EpisodeRange(E02, E02)`) and multi-episode files (`EpisodeRange(E02, E04)` for `SxxE02E03E04.mkv`), with `count()` / `numbers()` / `is_single()` helpers. - **`ReleaseMode` enum** — `PACK` (N video files directly in the season folder) vs `EPISODIC` (N sub-folders, one episode each); classified by the walker, never re-derived. - **Aggregates** — `TrackProfile`, `EpisodeRelease`, `SeasonRelease` (with `episode_count()` summing each file's range), `SeriesRelease`, `MovieRelease`. All frozen dataclasses; mutation via `SeasonReleaseBuilder` / `SeriesReleaseBuilder` (mirror the v1 `TVShowBuilder` pattern, including `from_existing()` round-trip). - **Abstract ports** — `SeriesReleaseRepository`, `MovieReleaseRepository` (concrete `DotAlfred*` arrive in Phase 2). - **`TmdbId` VO** added to `alfred/domain/shared/value_objects.py` (positive int, rejects bool/str/float — symmetry with `ImdbId`). - 73 unit tests covering VO validation, entity invariants, builder sort + overlap detection, and `from_existing()` round-trips. v1 code paths untouched at this stage; new domain coexists. - **`rescan_show` orchestrator (`alfred/application/library/rescan.py`).** Step 4 of the `specs/dot_alfred.md` plan. Walks an Alfred-managed show folder, runs the existing `inspect_release` pipeline on every video file it finds, and assembles a frozen `TVShow` aggregate persisted via the injected `TVShowRepository`. Reuses the release parser + ffprobe path verbatim — no duplicated parse/probe logic at the library layer. PACK vs EPISODIC inferred per season folder from the on-disk file count + parser output: a single video whose name carries no `Exx` token becomes a PACK season (tracks lifted to the season-level `audio_tracks` / `subtitle_tracks`), anything else becomes EPISODIC (one `Episode` per file). Episode paths are stored relative to the show root for portability. 
Files that fail to parse a season/episode number, or seasons with mixed numbers, are logged and skipped — the orchestrator never raises. Embedded subtitle tracks are captured from `ffprobe`; adjacent `.srt` files, multi-episode entries (`S01E01E02`), and TMDB-driven PACK detection are tracked as tech debt for a dedicated subtitles / ShowTracker session. 7 integration tests on `tmp_path` with the Foundation layout (S01 EPISODIC + S02 PACK) cover the round-trip through the real `.alfred` repository. - **Show tree walker (`alfred/application/library/walker.py`).** Step 4a foundation. `walk_show(show_root, scanner, kb)` returns a `ShowTree(show_root, season_folders=tuple[SeasonFolder, ...])` — pure structural snapshot, no parsing, no probing. Season folders are detected by a `\bS\d{1,2}\b` token anywhere in the directory name (release-style naming, no Plex `Season 01` / `Specials` conventions). Video files are filtered against `kb.video_extensions`; no recursion into sub-sub-folders. 11 unit tests on `tmp_path` cover detection (case-insensitive, in-word rejection), filtering (subs, NFO, sample files), and edge cases (empty / missing show root). - **Season-level audio/subtitle tracks (`alfred/domain/tv_shows/entities.py`, `alfred/domain/tv_shows/builders.py`).** `Season` now inherits from `MediaWithTracks` and carries `audio_tracks` / `subtitle_tracks` tuples (empty by default). Populated only in PACK mode (the single release covering the whole season); empty in EPISODIC mode where tracks live per-episode. `SeasonBuilder` gains `set_audio_tracks()` / `set_subtitle_tracks()` and forwards them through `from_existing()`. The bridge writes / reads them in the PACK branch via shared `_synth_audio_tracks` / `_synth_subtitle_tracks` helpers used for episodes too. - **`DotAlfredTVShowRepository` — filesystem-backed implementation of the `TVShowRepository` port (`alfred/infrastructure/persistence/dot_alfred/repository.py`).** Step 3 of the `specs/dot_alfred.md` plan. 
Reads and writes one `.alfred` YAML file per show under a configurable `library_root`. `save(show)` writes atomically (`.alfred.tmp` + `os.replace`) into a folder that **must already exist** — the repository never invents a folder name (the upstream `MediaOrganizer` is in charge of placing files; the repo writes the sidecar next to them). `find_by_imdb_id` / `find_all` walk `library_root/*/`, loading each readable sidecar; folders without a sidecar return `None` / are skipped (no implicit cold scan — that is the job of the upcoming `rescan_show` tool). Corrupted YAML and schema violations are logged and skipped, never raised, so a single bad folder does not break the rest of the library. The repo keeps a tiny in-memory `imdb_id → folder_name` index populated on every successful read/save, so subsequent saves find the right destination without re-walking — useful when the show folder name diverges from `show.get_folder_name()` (custom 1080p / 4K variants). 20 integration tests on `tmp_path` cover the round-trip, cold folder / unknown id returns, multi-show `find_all`, corrupted / wrong-schema skipping, atomic write (no `.alfred.tmp` left behind), overwrite, and folder-name fallbacks. - **Sidecar ↔ TVShow bridge (`alfred/infrastructure/persistence/dot_alfred/bridge.py`).** `to_sidecar(show, folder_paths=...)` summarizes the rich domain `AudioTrack` / `SubtitleTrack` to the sidecar's compact form (unique audio languages in track order; subtitle entries derived from `is_forced` and assumed `source="embedded"`). `from_sidecar(sidecar, title=...)` reconstructs the domain `TVShow` with synthesized tracks — one `AudioTrack` per language, one `SubtitleTrack` per entry, with ffprobe-only fields (`codec`, `channels`, `channel_layout`) left as `None`. The bridge is intentionally lossy on probe minutiae the sidecar does not store; this is the documented trade-off from the factual-only spec. 
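The atomic-write step described for `save(show)` (`.alfred.tmp` + `os.replace`) could look roughly like the sketch below. It is illustrative only: `atomic_write_text` is a hypothetical helper name, it writes plain text rather than YAML, and the `fsync` call and the cleanup on failure are assumptions, not behavior this changelog states.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(target: Path, text: str) -> None:
    """Write `text` to `target` via a sibling temp file plus os.replace.

    Hypothetical helper: the parent folder must already exist, mirroring the
    rule that the repository never invents a folder.
    """
    tmp = target.with_name(target.name + ".tmp")  # ".alfred" -> ".alfred.tmp"
    try:
        with open(tmp, "w", encoding="utf-8") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())  # assumption: flush to disk before the swap
        os.replace(tmp, target)  # swaps the new file in over any existing one
    except Exception:
        tmp.unlink(missing_ok=True)  # never leave a stray temp file behind
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        show_dir = Path(root) / "Foundation"
        show_dir.mkdir()
        atomic_write_text(show_dir / ".alfred", "imdb_id: tt0000000\n")
        print(sorted(p.name for p in show_dir.iterdir()))
```

Readers that open the sidecar concurrently see either the old file or the new one, never a partial write, because the rename is the only step that touches the final path.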
- **`.alfred` sidecar serializer (`alfred/infrastructure/persistence/dot_alfred/`).** Implements step 2 of the `specs/dot_alfred.md` plan. Pure-dict in/out (`serialize(sidecar) -> dict`, `deserialize(data) -> ShowSidecar`) — YAML I/O lives in the repository layer (step 3) and is kept out for trivial testability. Ships the DTOs that mirror the YAML schema field-for-field (`ShowSidecar`, `SeasonSidecar`, `EpisodeSidecar`, `SubtitleEntry`). The sidecar acts as a **scan cache**: it stores only what is genuinely costly to recompute — folder/file paths (skipping the FS walk) and probed track metadata (skipping ffprobe). Release identifiers (group, source, quality, codec) live in folder and file names and are derived on demand by the parser — they are deliberately absent from the schema and rejected on deserialize. The serializer is **strict on schema**: unknown keys at any level raise `SidecarSchemaError`, missing required fields raise clearly, and `bool` cannot sneak in as a season/episode number. Optional fields (`tmdb_id`, empty `audio`/`subtitles`/`episodes`) are omitted from the output rather than emitted as `null` / `[]`. Tests cover round-trip equivalence (DTO → dict → DTO and DTO → YAML text → DTO), the Foundation S01 PACK case (real-world fixture with mixed sub types — superset captured at season scope), and a Breaking Bad S05 EPISODIC case. An on-disk `tmp_path` fixture recreates the Foundation folder structure with placeholder files, ready to be reused by the upcoming repository walk tests in step 3. - **`TVShowBuilder` / `SeasonBuilder` — sole construction surface for the TVShow aggregate** (`alfred/domain/tv_shows/builders.py`). The aggregate is now fully frozen; building goes through a mutable scratchpad that emits an immutable `TVShow` via `build()`. Both builders offer a `from_existing()` classmethod to seed from a current frozen aggregate and apply modifications. 
Episodes are emitted sorted by number within a season, seasons sorted by number within the show.
- **`SeasonMode` enum** (`PACK` / `EPISODIC`) in `alfred/domain/tv_shows/value_objects.py`. Computed at read time from the season's structural shape (`Season.mode` property): a season with no explicit episodes is `PACK` (a single release covering the whole season), a season with episodes is `EPISODIC` (currently airing, one release per episode). Never stored: the YAML sidecar encodes the mode via the presence/absence of the `episodes:` block.

### Changed

- **TVShow aggregate is now frozen all the way down.** `TVShow`, `Season` and `Episode` are all `@dataclass(frozen=True)`. Children are stored as ordered tuples (`tuple[Season, ...]`, `tuple[Episode, ...]`) sorted by their respective numbers, replacing the previous mutable dicts. Lookup helpers `TVShow.get_season(n)` and `Season.get_episode(n)` traverse the tuple lazily via `next()`. The former `add_episode` / `add_season` mutation methods are gone; all construction goes through `TVShowBuilder` / `SeasonBuilder`.

### Removed

- **ShowTracker-territory fields stripped from the TVShow aggregate.** The aggregate now models only what the `.alfred` sidecar stores (filesystem-observable facts + immutable identity). Dropped from the domain:
  - `TVShow.status` (`ShowStatus`) and the `ShowStatus` enum entirely, along with its TMDB string mapping (`from_string`).
  - `TVShow.expected_seasons`, `Season.expected_episodes`, `Season.aired_episodes`, `Season.name`.
  - `TVShow.collection_status()`, `is_complete_series()`, `missing_episodes()`, `is_ongoing()`, `is_ended()` and the `CollectionStatus` enum.
  - `Season.is_complete()`, `is_fully_aired()`, `missing_episodes()` and the `aired ≤ expected` validation.
  - `TVShow.add_episode()` / `TVShow.add_season()` / `Season.add_episode()` (replaced by the builder API).
These concerns will reappear in a dedicated `ShowTracker` layer (to be designed) that combines the `.alfred` sidecar with live TMDB data to answer questions like "is this show complete?" or "are new episodes out?". Keeping volatile/derived state out of the aggregate matches the factual-only philosophy locked in `specs/dot_alfred.md`.

### Internal

- **Test suite rewritten for the new aggregate shape.** `tests/domain/test_tv_shows.py` now covers frozen invariants, builder ordering, last-write-wins on duplicates, `from_existing` round-trip, and `SeasonMode` derivation. `tests/infrastructure/test_filesystem_extras.py` helper simplified (no more `ShowStatus.ENDED` / `expected_seasons` on test shows). 1078 tests still green.
- **Design doc for `.alfred/` sidecar persistence (`specs/dot_alfred.md`).** First entry in the new `specs/` directory. Specifies a per-show `.alfred/` directory holding a `show.yaml` and one `season_NN.yaml` per season, used by the upcoming concrete `TVShowRepository` to cache parse/probe results and avoid full rescans on every library read. Covers schema, naming conventions, cache invalidation strategy (size + mtime), self-healing on drift, atomicity (`os.replace`), edge cases (legacy folders, corrupted sidecars, manual file removal), and a phased implementation plan. No code yet, spec only.

### Internal

- **`specs/` is now tracked.** The repo-level `.gitignore` had a blanket `*.md` rule with only `CHANGELOG.md` allow-listed. Added explicit exceptions for `/README.md` (root only, to avoid unintentionally exposing fixture READMEs) and `specs/**/*.md` so the new design-doc directory ships with the project. Also added an explicit `/.claude/` ignore line for the private dev-docs sub-repo that sits inside the working tree but is versioned separately.

### Fixed

- **Multi-episode chain (e.g. `S14E09E10E11`) now collapses to a full range.** The parser previously captured `episode=9, episode_end=10` and dropped E11+.
It now returns `episode=first, episode_end=last`, with intermediate values implied. Fixture `shitty/archer_multi_episode/` updated from anti-regression-of-bug to anti-regression-of-fix.
- **Apostrophes in titles no longer push the release through the AI fallback.** `Honey.Don't.2025.2160p.WEBRip.DSNP.DV.HDR.x265-Amen` previously parsed with `parse_path="ai"` and everything UNKNOWN because `'` is in the forbidden-chars list. Apostrophes are now pre-stripped before the well-formed check, so the parse completes normally (`title=Honey.Dont, year=2025, quality=2160p, ...`); only the title text loses its apostrophe. `parse_path` becomes `sanitized` to surface the cleanup. Side win: PoP fixture `the_prodigy_full_chaos/` also moves from total failure to a partially-correct parse (year, source, codec extracted).
- **Season-range markers (`Sxx-yy`) are now recognized as `tv_complete`.** `Der.Tatortreiniger.S01-06.GERMAN...` previously parsed as `media_type=movie` with `S01-06` glued onto the title. The parser now recognizes the range, sets `season=first`, `media_type=tv_complete`, and removes the marker from the title. `is_season_pack` flips to `true`.
- **Pure-punctuation TITLE tokens are dropped at assembly.** Releases with surrounding ` - ` separators (`Vinyl - 1x01 - FHD`) previously produced `title="Vinyl.-"`. Such tokens (a stray dash, a wide pipe `｜`, …) carry no title content and are now filtered out. Side effect: PoP fixture `khruangbin_yt_wide_pipe/` also benefits: the YouTube wide pipe no longer leaks into the title.

### Added

- **Fullwidth vertical bar `｜` (U+FF5C) is now a recognized release-name token separator.** Added to `alfred/knowledge/release/separators.yaml` so CJK release names (and the occasional decorative YouTube-style use) tokenize cleanly instead of leaving the wide pipe glued onto an adjacent token.
The tokenizer in `alfred/domain/release/parser/pipeline.py` already iterates the separator list as plain strings (no regex), so a multi-byte UTF-8 separator works without any code change.
- **`InspectedResult.recommended_action` property**: a derived hint that collapses the orchestrator's go / wait / skip decision into a single value (``"process"`` / ``"ask_user"`` / ``"skip"``). Centralizes the exclusion logic that was previously dispersed across road / media_type / main_video checks at each call site. Ordering is part of the contract: ``skip`` (no main video, or media_type == ``"other"``) wins over ``ask_user`` (media_type == ``"unknown"`` or road == ``"path_of_pain"``) which wins over ``process``. Surfaced through the ``analyze_release`` tool so the LLM can route on it directly. 6 new tests in ``tests/application/test_inspect.py`` cover the four branches and the precedence rules.
- **`LanguageRepository` port** in `alfred.domain.shared.ports`. Structural Protocol covering `from_iso`, `from_any`, `all`, `__contains__`, `__len__`, i.e. the surface previously coupled to the concrete `LanguageRegistry`. Mirrors the `MediaProber` / `FilesystemScanner` pattern: domain code depends on the Protocol, infrastructure provides the YAML-backed adapter. Tests in `tests/infrastructure/test_language_registry.py`.

### Changed

- **`Movie` and `Episode` are now frozen dataclasses.** Both entities hold their track collections as `tuple[AudioTrack, ...]` and `tuple[SubtitleTrack, ...]` instead of mutable lists, and are `@dataclass(frozen=True, eq=False)` (identity-based equality preserved via `__eq__`/`__hash__`). `__post_init__` coercion uses `object.__setattr__` for the `imdb_id` / `title` / `season_number` / `episode_number` normalizations. To project enrichment results (probe output, file metadata) callers now rebuild via `dataclasses.replace(...)`. Pattern aligned with the recent `ParsedRelease` freeze. `MediaWithTracks` mixin contract updated to `tuple` accordingly.
`Season` and `TVShow` remain mutable for now — freezing the aggregate root would cascade a full reconstruction on every `add_episode`, deferred. - **`SubtitleCandidate` renamed to `SubtitleScanResult`.** The old name conflated "this might become a placed subtitle" with "this is what a scan pass produced". The class is the output of a scan/identify pass — language/format may still be `None`, confidence reflects how sure the classifier is, and `raw_tokens` holds the filename fragments under analysis. `SubtitleScanResult` says that directly. Pure rename with a refreshed docstring in `alfred/domain/subtitles/entities.py`; no behavior change. Touches the domain entity + `__init__` export, the matcher / identifier / utils services, the manage_subtitles use case, the placer, the metadata store, the shared-media cross-ref comment, and the seven test modules that imported the type. - **`ParsedRelease` is now frozen; enrichment passes return new instances.** The VO was mutable so `detect_media_type` and `enrich_from_probe` could patch fields in place — a code smell in a value object whose identity *is* its content. `ParsedRelease` is now `@dataclass(frozen=True)`; `languages` is a `tuple[str, ...]` instead of a `list[str]`. `enrich_from_probe` returns a new `ParsedRelease` via `dataclasses.replace` (only allocates when at least one field actually changed). `inspect_release` rebinds `parsed` after both `detect_media_type` (wrapped in `MediaTypeToken` to satisfy the strict isinstance check that now also runs on replace) and `enrich_from_probe`. Parser pipeline now packs `languages` as a tuple in the assemble dict. Callers updated: `inspect_release`, `testing/recognize_folders_in_downloads.py`, and the enrichment tests (22 call sites + language assertions switched to tuple literals). 
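As a rough illustration of the frozen `ParsedRelease` pattern, the sketch below uses a simplified stand-in class. `ParsedReleaseSketch`, `enrich_from_probe_sketch`, the field set, and the "fill only empty fields" rule are all assumptions made for the example; only the use of `@dataclass(frozen=True)`, tuple-typed `languages`, and `dataclasses.replace` with no allocation when nothing changes comes from the entry above.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ParsedReleaseSketch:
    """Simplified, hypothetical stand-in for the frozen value object."""

    title: str
    quality: Optional[str] = None
    codec: Optional[str] = None
    languages: Tuple[str, ...] = ()


def enrich_from_probe_sketch(
    parsed: ParsedReleaseSketch, probe: Dict[str, str]
) -> ParsedReleaseSketch:
    """Return a new instance with empty fields filled from probe data.

    Assumption: parsed values win over probed ones. When nothing changes,
    the original object is returned and no new instance is allocated.
    """
    updates = {
        name: value
        for name, value in probe.items()
        if hasattr(parsed, name) and getattr(parsed, name) is None
    }
    return replace(parsed, **updates) if updates else parsed


if __name__ == "__main__":
    parsed = ParsedReleaseSketch(title="Foundation", quality="1080p")
    # Callers rebind the name instead of relying on in-place mutation.
    parsed = enrich_from_probe_sketch(parsed, {"codec": "x265"})
    print(parsed)
    try:
        parsed.title = "Other"  # type: ignore[misc]
    except FrozenInstanceError:
        print("in-place mutation is rejected")
```

The rebinding in the usage section is the part callers have to adopt: a call whose return value is ignored no longer has any effect.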
- **`resolve_destination` use cases take `kb` / `prober` as required params; module-level singletons gone.** The four `resolve_{season,episode,movie,series}_destination` use cases now accept `kb: ReleaseKnowledge` and `prober: MediaProber` as required arguments, matching the shape of `inspect_release`. The module-level `_KB = YamlReleaseKnowledge()` and `_PROBER = FfprobeMediaProber()` singletons that previously lived in `alfred/application/filesystem/resolve_destination.py` are removed — the application layer no longer reaches into infrastructure. The singletons now live at the agent-tools frontier (`alfred/agent/tools/filesystem.py`), where the LLM-facing wrappers instantiate them once and thread them through. `analyze_release` no longer needs the dirty `from ... import _KB` indirection. Tests inject their own stubs by keyword (`prober=_StubProber(...)`) instead of monkeypatching a module attribute. - **`ParsePath` enum renamed to `TokenizationRoute`.** The old name collided with `pathlib.Path` in code-reading mental models, and was one letter from `parse_path` (the field that holds the value) — making it harder than it needed to be to spot the type vs the attribute. ``TokenizationRoute`` says what it actually captures (DIRECT / SANITIZED / AI = how the name reached the tokenizer), and the class docstring now spells out the orthogonality with ``Road`` (EASY / SHITTY / PATH_OF_PAIN, which captures parser confidence on ``ParseReport``). The ``parse_path`` field name stays unchanged — string values too — so YAML fixtures, the ``analyze_release`` tool spec, and any external consumer are untouched. 
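To show why renaming the enum leaves stored values untouched, here is a minimal sketch of a `(str, Enum)` class. The member names DIRECT / SANITIZED / AI come from the entry above, and the strings `"sanitized"` and `"ai"` appear elsewhere in this changelog; the `"direct"` string and the `route_from_fixture` helper are assumptions for illustration.

```python
from enum import Enum


class TokenizationRoute(str, Enum):
    """How a release name reached the tokenizer (sketch, values partly assumed)."""

    DIRECT = "direct"        # assumed string value
    SANITIZED = "sanitized"
    AI = "ai"


def route_from_fixture(raw: str) -> TokenizationRoute:
    """Hypothetical helper: turn a stored `parse_path` string into the enum.

    Unknown strings raise ValueError, so typos in fixtures fail loudly.
    """
    return TokenizationRoute(raw)


if __name__ == "__main__":
    route = route_from_fixture("sanitized")
    # The class name changed, the wire value did not.
    print(route.name, route.value)
    print(route == "sanitized")
```

Because the members are also `str` instances, equality against the raw string still holds, which is what keeps YAML fixtures and external consumers working after a class rename.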
- **`enrich_from_probe` codec mappings moved to YAML.** The three hard-coded module dicts (`_VIDEO_CODEC_MAP`, `_AUDIO_CODEC_MAP`, `_CHANNEL_MAP`) translating ffprobe output to scene tokens (`hevc → x265`, `eac3 → EAC3`, `8 → "7.1"`, …) now live in `alfred/knowledge/release/probe_mappings.yaml` and are loaded into `ReleaseKnowledge.probe_mappings` (new port field, populated by `YamlReleaseKnowledge`). `enrich_from_probe` gains a third `kb` parameter and reads the maps from there. Aligns with the CLAUDE.md rule that lookup tables of domain knowledge belong in YAML, not in Python — and opens the door to a future "learn new codec" pass. Callers updated: `inspect_release`, `testing/recognize_folders_in_downloads.py`, and all 22 sites in `tests/application/test_enrich_from_probe.py`. - **`ParsedRelease.tech_string` is now a derived `@property`** (`alfred/domain/release/value_objects.py`). It computes `quality.source.codec` joined by dots on every access, so it stays in sync with the underlying fields by construction. The stored field is gone from the dataclass, the dict returned by `assemble()` no longer carries the key, `parse_release`'s malformed-name fallback drops the `tech_string=""` kwarg, and `enrich_from_probe` no longer re-derives it after filling `quality`/`source`/`codec`. Closes the parser/enrichment double-source-of-truth that `e79ca46` had to fix reactively. The fixtures runner now injects `tech_string` alongside `is_season_pack` since `asdict()` skips properties. - **`RuleScope.level` is now an enum (`RuleScopeLevel`).** The set of valid levels (global, release_group, movie, show, season, episode) was documented only in a docstring comment and validated nowhere. `RuleScopeLevel(str, Enum)` keeps wire compatibility (YAML serialization, `.value` access) while making the closed set explicit to type-checkers and IDEs. `to_dict()` emits `.value` strings so YAML output is unchanged. 
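A minimal sketch of why a `(str, Enum)` level keeps the serialized form stable. The six level strings come from the `RuleScope.level` entry above; the member names, the `identifier` key, and the `scope_to_dict` helper are assumptions made for illustration, not Alfred's actual `to_dict()`.

```python
from __future__ import annotations

from enum import Enum


class RuleScopeLevel(str, Enum):
    """Closed set of levels; member names are assumed, values match the entry."""

    GLOBAL = "global"
    RELEASE_GROUP = "release_group"
    MOVIE = "movie"
    SHOW = "show"
    SEASON = "season"
    EPISODE = "episode"


def scope_to_dict(level: RuleScopeLevel, identifier: str | None = None) -> dict:
    """Hypothetical serializer: emit the plain `.value` string for YAML output."""
    return {"level": level.value, "identifier": identifier}


level = RuleScopeLevel("season")       # parse a wire string into the enum
payload = scope_to_dict(level, "S01")  # plain strings only, safe to dump as YAML

try:
    RuleScopeLevel("bogus")
    rejected = False
except ValueError:
    rejected = True  # values outside the closed set are refused at construction
```

Because the enum mixes in `str`, existing comparisons against raw strings such as `level == "season"` keep working while type-checkers see the closed set.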
- **`FilePath` VO uses `__post_init__` instead of a hand-rolled `__init__`.** Same public API (accepts `str | Path`), same behavior, but the dataclass-generated `__init__` is no longer bypassed. One less smell in the shared VOs. - **`Language` VO is strict by default; `Language.from_raw()` factory for normalization.** The previous `__post_init__` mutated `iso` and `aliases` via `object.__setattr__` on a frozen dataclass — a code smell hiding behind the dataclass facade. Split: the direct constructor now rejects un-normalized input (uppercase iso, whitespace in aliases, etc.), and `Language.from_raw()` handles arbitrary YAML/user input. Only one caller (LanguageRegistry loading the ISO YAML) needed migration. - **`ParsedRelease.normalised` renamed to `clean`.** The field name promised "dots instead of spaces" but in practice held `raw - site_tag - apostrophes` — only used by `season_folder_name()`. Renamed and docstring corrected. - **`ParsedRelease.media_type` / `parse_path` are strict enums.** The fields were already typed as `MediaTypeToken` / `ParsePath`, but a tolerant `__post_init__` coerced raw strings. With both classes being `(str, Enum)`, the coercion served no purpose. Strict constructor; `.value` no longer passed at call sites; dropped the unused `_VALID_MEDIA_TYPES` / `_VALID_PARSE_PATHS` lookup tables. ### Removed - **`settings.min_movie_size_bytes`** — orphan Pydantic field + validator. Its only consumer (`MovieService.validate_movie_file`) had been removed during an earlier refactor. The "real movie vs sample" rule now lives in extension-based exclusion (`application/release/supported_media.py`) and PoP. If a size threshold is ever needed, it'll go in a knowledge YAML, not in `settings`. ### Internal - **Flattened `alfred.domain.shared.media/` package into a single `media.py` module.** The 6-file package (audio, video, subtitle, info, matching, tracks_mixin + `__init__`) collapsed into one ~250 LoC module. 
All 12 import sites continue to resolve unchanged (`from alfred.domain.shared.media import AudioTrack, MediaInfo, …`) since Python treats `media.py` and `media/__init__.py` interchangeably for import paths. Easier to scan when the whole bounded-context fits on one screen. - **`SubtitleKnowledgeBase` types `language_registry` against the `LanguageRepository` port** instead of the concrete `LanguageRegistry` class. The default constructor still instantiates the concrete adapter when no repository is injected — behavior is unchanged for existing callers. Opens the door to in-memory fakes in future tests without loading the full ISO 639 YAML. - **Moved `detect_media_type` and `enrich_from_probe` from `alfred.application.filesystem` to `alfred.application.release`**. They are inspection-pipeline helpers — their natural home is next to `inspect_release`, not next to the filesystem use cases. The move also eliminates a circular-import workaround in `resolve_destination.py`: `inspect_release` can now be imported at module top instead of lazily inside `_resolve_parsed`. Public surface is unchanged for callers that imported the helpers from their full module paths (the only call sites — `inspect.py`, two tests, one testing script — were updated in this commit). ### Added - **`resolve_*_destination` use cases now consume `inspect_release`**. `resolve_episode_destination` and `resolve_movie_destination` reuse their existing `source_file` parameter as the inspection target; `resolve_season_destination` and `resolve_series_destination` gain a new **optional** `source_path` parameter (also threaded through the tool wrappers and YAML specs). When the path exists, ffprobe data fills tokens missing from the release name (e.g. quality) and refreshes `tech_string`, so the destination folder / file names end up more accurate. When the path is omitted (back-compat callers) or does not exist on disk, the use cases fall back to parse-only — same behavior as before.
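The optional-path fallback described in the entry above can be pictured with a small sketch. Everything here is hypothetical (`resolve_quality`, `fake_probe`, the single `quality` token); it only illustrates the rule "probe data fills what the name is missing, and no usable path means parse-only", not the real `resolve_*_destination` code.

```python
from __future__ import annotations

import tempfile
from pathlib import Path
from typing import Callable, Optional


def resolve_quality(
    parsed_quality: Optional[str],
    source_path: Optional[Path],
    probe_quality: Callable[[Path], Optional[str]],
) -> Optional[str]:
    """Fill a missing quality token from the probe, else stay parse-only."""
    if parsed_quality is not None:
        return parsed_quality          # the release name already carried it
    if source_path is None or not source_path.exists():
        return None                    # parse-only fallback (path omitted or gone)
    return probe_quality(source_path)  # the probe only fills the gap


def fake_probe(_: Path) -> str:
    return "1080p"  # stand-in for what an ffprobe-backed adapter might report


with tempfile.TemporaryDirectory() as tmp:
    video = Path(tmp) / "episode.mkv"
    video.touch()
    from_probe = resolve_quality(None, video, fake_probe)
    name_wins = resolve_quality("720p", video, fake_probe)

parse_only = resolve_quality(None, None, fake_probe)   # back-compat caller
stale_path = resolve_quality(None, video, fake_probe)  # temp dir is gone by now
```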
### Fixed - **`enrich_from_probe` now refreshes `tech_string`** after filling `quality` / `source` / `codec`. Previously the field stayed at its parser-time value, so filename builders saw stale tech tokens even after a successful probe. New `TestTechString` class in `tests/application/test_enrich_from_probe.py` locks the behavior. ### Added - **`inspect_release` orchestrator + `InspectedResult` VO** (`alfred/application/release/inspect.py`). Single composition of the four inspection layers: `parse_release` → `detect_media_type` (patches `parsed.media_type`) → `find_main_video` (top-level scan) → `prober.probe` + `enrich_from_probe` when a video exists and the refined media type isn't in `{"unknown", "other"}`. Returns a frozen `InspectedResult(parsed, report, source_path, main_video, media_info, probe_used)` that downstream callers consume directly instead of rebuilding the same chain. `kb` and `prober` are injected — no module-level singletons. Never raises. ### Changed - **`analyze_release` tool now delegates to `inspect_release`** — same output shape, plus two new fields: `confidence` (0–100) and `road` (`"easy"` / `"shitty"` / `"path_of_pain"`) surfaced from the parser's `ParseReport`. The tool spec (`specs/analyze_release.yaml`) documents both fields so the LLM can route releases by confidence. - **`MediaProber` port now covers full media probing**: added `probe(video) -> MediaInfo | None` alongside the existing `list_subtitle_streams`. `FfprobeMediaProber` (in `alfred/infrastructure/probe/`) implements both methods and is now the single adapter shelling out to `ffprobe`. The standalone `alfred/infrastructure/filesystem/ffprobe.py` module was removed — all callers (tools, testing scripts) instantiate `FfprobeMediaProber` instead. Unblocks the upcoming `inspect_release` orchestrator, which depends on the port. ### Removed - `alfred/infrastructure/filesystem/ffprobe.py` (folded into the `FfprobeMediaProber` adapter). 
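For readers unfamiliar with the port style used by the `MediaProber` change in this block, here is a hedged sketch built on `typing.Protocol`. Only `probe(video) -> MediaInfo | None` and the method name `list_subtitle_streams` come from the entries above; the trimmed `MediaInfo`, the `list[str]` return type, `StubProber`, and `describe` are assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Protocol


@dataclass(frozen=True)
class MediaInfo:
    """Hypothetical, heavily trimmed stand-in for the real MediaInfo."""

    video_codec: str | None = None


class MediaProber(Protocol):
    """Port shape; the subtitle-stream return type is an assumption."""

    def probe(self, video: Path) -> MediaInfo | None: ...

    def list_subtitle_streams(self, video: Path) -> list[str]: ...


class StubProber:
    """Test double: satisfies the port structurally, no ffprobe involved."""

    def __init__(self, info: MediaInfo | None) -> None:
        self._info = info

    def probe(self, video: Path) -> MediaInfo | None:
        return self._info

    def list_subtitle_streams(self, video: Path) -> list[str]:
        return []


def describe(video: Path, *, prober: MediaProber) -> str:
    """Hypothetical caller: depends on the port, never on a concrete adapter."""
    info = prober.probe(video)
    if info is None:
        return "unprobed"
    return info.video_codec or "unknown"


probed = describe(Path("movie.mkv"), prober=StubProber(MediaInfo("hevc")))
unprobed = describe(Path("movie.mkv"), prober=StubProber(None))
```

Injecting the prober by keyword is what lets tests swap in a stub without monkeypatching, and a real adapter such as `FfprobeMediaProber` would slot into the same parameter.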
--- ## [2026-05-20] — Release parser confidence scoring + exclusion ### Added - **Pre-pipeline exclusion helpers** (`alfred/application/release/supported_media.py`): `is_supported_video(path, kb)` (extension-only check against `kb.video_extensions`) and `find_main_video(folder, kb)` (top-level scan, lexicographically-first eligible file, returns `None` when no video qualifies; accepts a bare file as folder for single-file releases). No size threshold, no filename heuristics — PATH_OF_PAIN handles the exotic cases. Foundation for the future `inspect_release` orchestrator. - **Release parser — parse-confidence scoring** (`alfred/domain/release/parser/scoring.py`, `alfred/knowledge/release/scoring.yaml`). `parse_release` now returns `(ParsedRelease, ParseReport)`. The new `ParseReport` frozen VO carries a 0–100 `confidence`, a `road` (`"easy"` / `"shitty"` / `"path_of_pain"`), the residual UNKNOWN tokens, and the missing critical fields. EASY is decided structurally (a group schema matched); SHITTY vs PATH_OF_PAIN is decided by score against a YAML-configurable cutoff (default 60). Weights and penalties also live in `scoring.yaml` — title 30, media_type 20, year 15, season 10, episode 5, tech 5 each; penalty 5 per UNKNOWN token capped at -30. `Road` is a new enum, distinct from `ParsePath` (which records the tokenization route, not the confidence tier). `ReleaseKnowledge` port gains a `scoring: dict` field. ### Changed - **`parse_release` signature** is now `(name, kb) → tuple[ParsedRelease, ParseReport]` instead of returning a bare `ParsedRelease`. Call sites updated in `application/filesystem/resolve_destination.py` and `agent/tools/filesystem.py`. Tests updated accordingly. --- ## [2026-05-20] — Release parser v2 (EASY + SHITTY) ### Added - **Release parser v2 — EASY path live** (`alfred/domain/release/parser/`): new annotate-based pipeline (tokenize → annotate → assemble) drives releases from known groups. 
Exposes `Token` (frozen VO with `index` + `role` + `extra`), `TokenRole` enum (structural/technical/meta families), and `GroupSchema` / `SchemaChunk` value objects. - `pipeline.tokenize`: string-ops separator split (no regex), strips a `[site.tag]` prefix/suffix first. - `pipeline.annotate`: detects the trailing group right-to-left (priority to `codec-GROUP` shape, fallback to any non-source dashed token), looks up its `GroupSchema`, then walks tokens and schema chunks in lockstep — optional chunks that don't match are skipped, mandatory mismatches abort EASY and return `None` so the caller can fall back to SHITTY. - `pipeline.assemble`: folds annotated tokens into a `ParsedRelease`-compatible dict. - `parse_release` (in `release.services`) tries the v2 EASY path first and falls through to the legacy SHITTY heuristic on `None`. Legacy SHITTY/PATH OF PAIN behavior is unchanged. - Knowledge: `alfred/knowledge/release/release_groups/{kontrast,elite,rarbg}.yaml` declare the canonical chunk order per group, loaded via new `ReleaseKnowledge.group_schema(name)` port method. - Tests in `tests/domain/release/test_parser_v2_{scaffolding,easy}.py` cover token VOs, site-tag stripping, group detection, schema-driven annotation (movie, TV episode, season pack with optional source), and field assembly. - **Release parser v2 — enricher pass** completes the EASY pipeline. The structural schema walk now tolerates non-positional tokens between chunks (instead of aborting on leftover tokens), and a second pass tags them with audio / video-meta / edition / language roles. Multi-token sequences from `audio.yaml`, `video.yaml`, `editions.yaml` (e.g. `DTS.HD.MA`, `DV.HDR10`, `TrueHD.Atmos`, `DIRECTORS.CUT`) are matched before single tokens. Channel layouts like `5.1` and `7.1` (split into two tokens by the `.` separator) are detected as consecutive pairs. Sequence members carry an `extra["sequence_member"]` marker so `assemble` extracts the canonical value only from the primary token.
KONTRAST releases with audio / HDR / edition / language metadata now produce a fully populated `ParsedRelease`. - **Streaming distributor as a separate dimension** from encoding source. New `alfred/knowledge/release/distributors.yaml` (NF, AMZN, DSNP, HMAX, ATVP, HULU, PCOK, PMTP, CR) feeds a new `ReleaseKnowledge.distributors` port field, a `TokenRole.DISTRIBUTOR` annotation, and a `ParsedRelease.distributor` field. `WEB-DL` stays the source; the platform that produced the release is now recorded distinctly. The five entries (NF, AMZN, DSNP, HMAX, ATVP) were correspondingly removed from `sources.yaml`. - **Real-world release fixtures** under `tests/fixtures/releases/{easy,shitty,path_of_pain}/`, each documenting an expected `ParsedRelease` plus the future `routing` (library / torrents / seed_hardlinks) for the upcoming `organize_media` refactor. EASY bucket seeded with 5 cases (movie, single-episode, season pack, movie + noise, YTS bracket-heavy). SHITTY bucket seeded with 15 anti-regression cases covering: 3-level INTEGRALE hierarchy (Angel), French custom titles (Buffy, La Nuit au Musée, Chérie j'ai agrandi), multi-episode chain `S14E09E10E11` (Archer, captures E11 loss), lowercase `s01e01` (Notre Planète), `NxNN` with ` - ` separators (Vinyl, captures dash artifact), title-with-year-suffix (Deutschland.83), season-range `S01-06` (Tatortreiniger, captures movie misclassification), bare folder name (Jurassic Park, media_type=unknown), apostrophe-in-name (Honey Don't, captures full AI-path degeneration), SUBS-tag movie (Hook), space separators (Predator Badlands, captures group=UNKNOWN), subs-only release (Westworld S04). 
PATH OF PAIN bucket seeded with 10 worst-case fixtures covering: UTF-8 wide pipe yt-dlp slug (Khruangbin), 3-show franchise box-set with double season range and parens-wrapped tech (Deutschland 83-86-89, captures `group=S03` misdetection), accented chars in title (Chérie BéBé with VFF), 8-word stand-up comedy title (Jimmy Carr), site-tag prefix + XviD (OxTorrent), episode title + air-date silently lost (Prodiges), full-chaos apostrophe + spaces + Blu-ray dash + 1080i + multi-word audio codec (The Prodigy, full AI-path degeneration), yt-dlp YouTube ID glued to year (Sleaford Mods), bilingual `[FR-EN]` tag mistaken for group (Super Mario Bros), COMPLETE + S01-S07 range + REPACK + HEVC (Gilmore Girls, the well-behaved exception). Parametrized over `tests/domain/test_release_fixtures.py` for anti-regression. - **`NxNN` alt season/episode form supported** by `parse_release`. Releases like `Show.1x05.720p.HDTV.x264-GRP` and `Show.2x07x08.1080p.WEB.x265-GRP` (multi-ep alt form) now parse as TV shows. - **`alfred/knowledge/release/separators.yaml`** declares the token separators used by the release-name tokenizer (`.`, ` `, `[`, `]`, `(`, `)`, `_`). New conventions can be added without code changes. The canonical `.` is always present even if missing from YAML. ### Changed - **Release parser v2 — SHITTY simplified to dict-driven tagging**. The legacy ~480-line heuristic block in `release/services.py` is gone; `pipeline._annotate_shitty` does a single pass that looks each token up in the kb buckets (resolutions / sources / codecs / distributors / year / `SxxExx`) with first-match-wins semantics, and the leftmost contiguous UNKNOWN run becomes the title. `annotate()` no longer returns `None` — SHITTY is the always-on fallback when no group schema matches. `services.py` shrunk from ~525 to ~85 lines. 
Four fixtures (`deutschland_franchise_box`, `sleaford_yt_slug`, `super_mario_bilingual`, `predator_space_separators` — the last one moved from `shitty/` → `path_of_pain/`) are now marked `pytest.mark.xfail(strict=False)` documenting PoP-grade pathologies that SHITTY intentionally won't handle. `ReleaseFixture` grows an `xfail_reason` field; the parametrized suite wires the xfail mark automatically. - **`parse_release` tokenizer is now data-driven**: it splits on any character listed in `separators.yaml` (regex character class) instead of `name.split(".")`. This makes YTS-style releases (`The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]`), space-separated names (`Inception 2010 1080p BluRay x264-GROUP`), and underscore-separated names parse correctly via the direct path — no more fallback through sanitization. - **`parse_release` flow simplified**: site-tag extraction always runs first (so `parse_path == "sanitized"` now reliably indicates a stripped `[tag]`), then well-formedness is checked only against truly forbidden chars (anything not in the configured separator set). - **ISO 639-2/B is now the canonical language code project-wide** (was a mix of 639-1 and 639-2/T): - `SubtitlePreferences.languages` default is now `["fre", "eng"]` (was `["fr", "en"]`). Old LTM files are not auto-migrated — delete `data/memory/ltm.json` to regenerate with the new defaults. - Subtitle output filenames are now `{iso639_2b}.srt` (e.g. `fre.srt`, `fre.sdh.srt`). Existing `fr.srt` files are still **read** correctly (recognized as French via alias) but new files are written canonically. - `Language` value object docstring corrected: it has always stored 639-2/B (matching what ffprobe emits), not 639-2/T as previously documented. - **`MovieService.validate_movie_file` minimum size is now configurable** via `settings.min_movie_size_bytes` (default unchanged: 100 MB). Constructor accepts an optional `min_movie_size_bytes` override for tests. 
- **`SubtitleKnowledgeBase` delegates language lookup to `LanguageRegistry`** rather than duplicating tokens. `subtitles.yaml` now only declares subtitle-specific tokens (e.g. `vostfr`, `vf`, `vff`) under a new `language_tokens` section. ### Removed - **`alfred/domain/tv_shows/services.py`** and **`alfred/domain/movies/services.py`** deleted entirely. They held fossil parsers (`parse_episode_filename`, `extract_movie_metadata`, …) with zero production callers — superseded by `parse_release` as the single source of truth for release-name parsing. Associated tests (`tests/domain/test_movies.py`, `tests/domain/test_tv_shows_service.py`) removed as well. - `_sanitize` and `_normalize` helpers in `alfred/domain/release/services.py` — the new tokenizer makes them redundant. - `_LANG_KEYWORDS`, `_SDH_TOKENS`, `_FORCED_TOKENS`, `SUBTITLE_EXTENSIONS` hardcoded dicts in `alfred/domain/subtitles/scanner.py` — all knowledge now lives in YAML (CLAUDE.md compliance). - `_MIN_MOVIE_SIZE_BYTES` module-level constant in `alfred/domain/movies/services.py` — replaced by the new setting. - Top-level `languages:` block in `subtitles.yaml` — superseded by `language_tokens:` (subtitle-specific only) since iso_languages.yaml is the canonical source. ### Fixed - **`hi` token no longer marks a subtitle as SDH** (it conflicted with the ISO 639-1 alias for Hindi). SDH is now detected only via `sdh`, `cc`, and `hearing` tokens. - `SubtitleKnowledgeBase` default rules used `"fra"` while `iso_languages.yaml` exposes French as `"fre"` — preferred languages defaults now match the canonical form. ### Internal - **Domain I/O extraction** (`refactor/domain-io-extraction`): the domain layer no longer performs subprocess calls, filesystem scans, or YAML loading. 
Achieved in a series of focused commits: - **Knowledge YAML loaders moved to infrastructure**: `alfred/domain/release/knowledge.py`, `alfred/domain/shared/knowledge/language_registry.py`, and `alfred/domain/subtitles/knowledge/{base,loader}.py` relocated to `alfred/infrastructure/knowledge/`. Re-exports were dropped — callers import directly from the new location. - **`MediaProber` and `FilesystemScanner` Protocol ports** introduced at `alfred/domain/shared/ports/` with frozen-dataclass DTOs (`SubtitleStreamInfo`, `FileEntry`). `SubtitleIdentifier` and `PatternDetector` are now constructor-injected with concrete adapters (`FfprobeMediaProber` wrapping `subprocess.run(ffprobe)` and `PathlibFilesystemScanner` wrapping `pathlib`). No more direct `subprocess`/`pathlib` usage from the subtitle domain services. - **Live filesystem methods removed from VOs and entities**: `FilePath.exists()` / `.is_file()` / `.is_dir()` deleted — `FilePath` is now a pure address VO. `Movie.has_file()` and `Episode.is_downloaded()` dropped. Callers either rely on a prior detection step or use try/except over pre-checks (eliminates TOCTOU races). - **`SubtitlePlacer` moved to the application layer** at `alfred/application/subtitles/placer.py` — it performs `os.link` I/O, which doesn't belong in the domain. Pre-checks replaced with try/except for `FileNotFoundError`/`FileExistsError`. - **`SubtitleRuleSet.resolve()` no longer reaches into the knowledge base**: the implicit `DEFAULT_RULES()` helper is gone, replaced by an explicit `default_rules: SubtitleMatchingRules` parameter. The `ManageSubtitles` use case loads defaults from the KB once and passes them in. - **`SubtitleKnowledge` Protocol port** at `alfred/domain/subtitles/ports/knowledge.py` declares the read-only query surface domain services consume (7 methods: `known_extensions`, `format_for_extension`, `language_for_token`, `is_known_lang_token`, `type_for_token`, `is_known_type_token`, `patterns`). 
`SubtitleIdentifier` and `PatternDetector` depend on this Protocol instead of the concrete `SubtitleKnowledgeBase` from infrastructure — `domain/subtitles/` now has zero imports from `infrastructure/`. The remaining domain → infra leak (`domain/release/` loading separator YAML at import-time) is documented in tech-debt and scheduled for its own branch. - **`to_dot_folder_name(title)` helper** in `alfred/domain/shared/value_objects.py` — extracts the `re.sub(r"[^\w\s\.\-]", "", title).replace(" ", ".")` pattern that was duplicated between `MovieTitle.normalized()` and `TVShow.get_folder_name()`. - **`ParsedRelease.languages` uses `field(default_factory=list)`** instead of a manual `__post_init__` that assigned `[]` via `object.__setattr__`. - **`file_extensions.yaml` splits subtitle sidecars (`.srt`, `.sub`, `.idx`, `.ass`, `.ssa`) into a dedicated `subtitle:` category** instead of lumping them under `metadata:`. The `_METADATA_EXTENSIONS` set used by `detect_media_type` remains the union of both (same behavior — subtitles are still ignored when deciding the media type of a folder), but a new `load_subtitle_extensions()` loader is now available for the subtitles domain. Semantic clarity, no functional change. - **`tv_shows/entities.py` module docstring** now shows the aggregate ownership as an ASCII tree before the rule text — quicker visual scan of the DDD structure. - Removed backward-compat shims `_sanitise_for_fs` / `_strip_episode_from_normalised` from `domain/release/value_objects.py` (zero callers).
- Cleaned ruff warnings across the codebase: `subprocess.run` calls now pass explicit `check=False` (PLW1510); lazy imports promoted to module top where there was no cycle (PLC0415 in `manage_subtitles.py`, `placer.py`, `qbittorrent/client.py`, `file_manager.py`); fixed module-level import ordering (E402) in `language_registry.py` and `subtitles/knowledge/loader.py`; removed unused locals (F841 / B007); replaced unnecessary set comprehension with `set()` in `release/knowledge.py` (C416). - Ruff config: ignore `PLR0911` / `PLR0912` (too-many-returns / too-many-branches) globally — noisy on parser mappers and orchestrator use-cases where early-return validation is essential complexity. Ignore `PLW0603` for the documented memory singleton (`infrastructure/persistence/context.py`). - **Release-knowledge DDD purification** (`refactor/domain-release-knowledge`): the last domain → infrastructure leak (`domain/release/value_objects.py` loading YAML at import-time) is gone. Achieved via: - **`ReleaseKnowledge` Protocol port** at `alfred/domain/release/ports/knowledge.py` declares the read-only query surface release parsing needs (token sets for resolutions, sources, codecs, languages, hdr extras; structured dicts for audio, video_meta, editions, media_type_tokens; separators list; file-extension sets used by application/infra callers; `sanitize_for_fs(text)` method). - **`YamlReleaseKnowledge` adapter** at `alfred/infrastructure/knowledge/release_kb.py` loads every YAML constant once at construction. Builds an immutable `str.maketrans` translation table for filesystem sanitization. - **`parse_release(name, kb)`** takes the knowledge as an explicit parameter — no more module-level YAML loading inside the domain. Every internal helper (`_tokenize`, `_extract_tech`, `_extract_languages`, `_extract_audio`, `_extract_video_meta`, `_extract_edition`, `_extract_title`, `_infer_media_type`, `_is_well_formed`) takes `kb`. 
- **`ParsedRelease` Option B**: sanitization happens once at parse time and is stored on a new `title_sanitized: str` field. Builder methods (`show_folder_name`, `season_folder_name`, `episode_filename`, `movie_folder_name`, `movie_filename`) are now pure — they accept already-sanitized `tmdb_title_safe` / `tmdb_episode_title_safe` arguments. Callers at the use-case boundary sanitize TMDB strings via `kb.sanitize_for_fs(...)` before passing them in. - **All domain-knowledge constants removed from `value_objects.py`**: `_RESOLUTIONS`, `_SOURCES`, `_CODECS`, `_AUDIO`, `_VIDEO_META`, `_EDITIONS`, `_HDR_EXTRA`, `_MEDIA_TYPE_TOKENS`, `_LANGUAGE_TOKENS`, `_FORBIDDEN_CHARS`, `_VIDEO_EXTENSIONS`, `_NON_VIDEO_EXTENSIONS`, `_SUBTITLE_EXTENSIONS`, `_METADATA_EXTENSIONS`, `_WIN_FORBIDDEN_TABLE`, and the `_sanitize_for_fs` helper. The domain module is now pure. - **Application-layer KB singleton**: `resolve_destination.py` instantiates a module-level `_KB: ReleaseKnowledge = YamlReleaseKnowledge()` and threads it through every `parse_release(...)` call. The local `_sanitize` helper and `_WIN_FORBIDDEN` regex were dropped in favor of `_KB.sanitize_for_fs(...)`. - **`detect_media_type(parsed, source_path, kb)` and `find_video_file(path, kb)`** now take the knowledge explicitly instead of importing `_*_EXTENSIONS` constants from the domain. `agent/tools/filesystem.py::analyze_release` imports the application KB singleton and passes it through. --- ## [2026-05-17] — TVShow & Movie aggregate refactor Multi-phase overhaul of the TV show domain into a real DDD aggregate, with matching parity work on `Movie`, a language knowledge system, and the `shared/media` restructure that supports both. ### Added - **Language knowledge system** (`alfred/knowledge/iso_languages.yaml` + 42 languages including `und` for undetermined). - `Language` value object (frozen dataclass) with `iso`, `english_name`, `native_name`, `aliases`, and a `matches(raw)` cross-format helper.
- `LanguageRegistry` loader (`alfred/domain/shared/knowledge/`) merging builtin + learned YAML. Not a singleton — the application layer instantiates it. - ISO 639-2/B is the canonical key; aliases cover 639-1, 639-2/T, English name, native name, and common spellings. - **`VideoTrack`** dataclass (`alfred/domain/shared/media/video.py`) with a `resolution` property using width-priority bucket detection (handles cinema/scope crops like 1920×960 → 1080p). - **`shared/media/matching.py`** — `track_lang_matches` helper shared by `Episode` and `Movie`. Implements the **"C+" contract** for language helpers: - `Language` query → cross-format match via `Language.matches()` - `str` query → case-insensitive direct comparison (no normalization) - **TVShow aggregate composition**: - `TVShow.seasons: dict[SeasonNumber, Season]` - `Season.episodes: dict[EpisodeNumber, Episode]` - `Season.expected_episodes` / `Season.aired_episodes` (split so collection state can compare "owned vs aired today" without confusing in-flight seasons with future ones) - **Aggregate methods on `TVShow`**: - `add_episode(ep)` — sole sanctioned mutation entry point (creates the season if missing) - `add_season(season)` — replaces a season wholesale - `collection_status()` → `CollectionStatus.{EMPTY, PARTIAL, COMPLETE}` - `is_complete_series()` — true iff `ENDED + COMPLETE` - `missing_episodes()` — flat list of all aired-but-not-owned `(season, episode)` pairs - **`CollectionStatus`** enum (orthogonal to `ShowStatus`). - **Episode track helpers** (`has_audio_in`, `has_subtitles_in`, `has_forced_subs`, `audio_languages`, `subtitle_languages`), driven by `Episode.audio_tracks` / `Episode.subtitle_tracks`. - **Movie aggregate parity** — `Movie` now carries `audio_tracks` / `subtitle_tracks` and exposes the same helpers as `Episode` (same C+ contract). - **`CHANGELOG.md`** (this file). 
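To make the aggregate entries above concrete, here is a simplified sketch of the `TVShow` → `Season` → `Episode` composition. The method names and the EMPTY / PARTIAL / COMPLETE states come from this block; plain `int` keys (instead of `SeasonNumber` / `EpisodeNumber`), the trimmed entities, and the exact rule used for `COMPLETE` (nothing aired is missing) are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class CollectionStatus(Enum):
    EMPTY = "empty"
    PARTIAL = "partial"
    COMPLETE = "complete"


@dataclass(frozen=True)
class Episode:
    season: int
    number: int


@dataclass
class Season:
    number: int
    aired_episodes: int = 0  # how many episodes have aired so far
    episodes: dict[int, Episode] = field(default_factory=dict)


@dataclass
class TVShow:
    seasons: dict[int, Season] = field(default_factory=dict)

    def add_season(self, season: Season) -> None:
        """Replace a season wholesale."""
        self.seasons[season.number] = season

    def add_episode(self, ep: Episode) -> None:
        """Mutation entry point: creates the season if it is missing."""
        season = self.seasons.setdefault(ep.season, Season(number=ep.season))
        season.episodes[ep.number] = ep

    def missing_episodes(self) -> list[tuple[int, int]]:
        """Flat list of aired-but-not-owned (season, episode) pairs."""
        return [
            (s.number, n)
            for s in sorted(self.seasons.values(), key=lambda s: s.number)
            for n in range(1, s.aired_episodes + 1)
            if n not in s.episodes
        ]

    def collection_status(self) -> CollectionStatus:
        owned = sum(len(s.episodes) for s in self.seasons.values())
        if owned == 0:
            return CollectionStatus.EMPTY
        # Assumed rule: COMPLETE once nothing that has aired is missing.
        if self.missing_episodes():
            return CollectionStatus.PARTIAL
        return CollectionStatus.COMPLETE


show = TVShow()
show.add_season(Season(number=1, aired_episodes=3))
show.add_episode(Episode(season=1, number=1))
show.add_episode(Episode(season=1, number=3))
show.add_episode(Episode(season=2, number=1))  # season 2 is created on demand
```

With this state, `show.missing_episodes()` yields `[(1, 2)]` and the status is `PARTIAL`; adding episode 2 of season 1 flips it to `COMPLETE` under the assumed rule.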
### Changed - **`shared/media_info.py` exploded into `shared/media/{audio,video,subtitle,info,matching}.py`.** `MediaInfo` is now symmetric: every stream type is a `list[Track]`. Flat accessors (`width`, `height`, `video_codec`, `resolution`) remain as properties that read the first video track. - **`MediaInfo.duration_seconds` / `bitrate_kbps`** moved from `VideoTrack` to `MediaInfo` (file-level — they come from the ffprobe `format` block, not a stream). Files without a video stream now correctly expose duration. - **`ShowStatus.from_string`** extended to map TMDB strings (`Returning Series`, `In Production`, `Pilot`, `Planned`, `Canceled`, `Cancelled`). Comparison is whitespace-trimmed and case-insensitive. - **`Season` / `Episode`** dropped their `show_imdb_id` back-references. They are owned by `TVShow` and reached only through it. - **`TVShow.seasons_count` and `episode_count`** are now `@property` (computed from the dict) instead of stored ints. - **`TVShowService.parse_episode_from_filename`** rewritten in string operations (no regex). Supports `S01E05` / `s1e5` and `1x05` / `01x5` forms. - **`TVShowService.find_next_episode`** now drives off `show.missing_episodes()` instead of the hardcoded "max 50 episodes per season" heuristic. - **`TVShowService` constructor** no longer takes `season_repository` / `episode_repository` — the aggregate persists in one block via `TVShowRepository` only. - **`SubtitleTrack` in `alfred.domain.subtitles.entities` renamed to `SubtitleCandidate`.** Coexists with the `shared.media.SubtitleTrack` ffprobe-view dataclass (different bounded contexts, kept separate intentionally). - **`tv_shows/services.py` `_VIDEO_EXTENSIONS`** now loaded from `knowledge/release/file_extensions.yaml` via `load_video_extensions()` (single source of truth). 
- **`CLAUDE.md`** updated with three new policy sections: - "Tests" — small updates OK during normal work, no mass-update sprees - "Backwards-compatibility shims" — prefer clean migration over shims - "Regex" — not forbidden, use judgment when string ops would be fragile ### Removed - **Legacy `Season N Episode N` filename form** in `TVShowService.parse_episode_from_filename`. It never appears in the release names Alfred handles, and supporting it forced a regex. - **`SeasonRepository` and `EpisodeRepository`** — only the aggregate root has a repository (DDD rule: one repo per aggregate). - **`shared/media_info.py`** compatibility shim — callers updated. - **`SubtitleTrack` compatibility alias** in `subtitles.entities` — callers updated to `SubtitleCandidate`. ### Fixed - **`MediaInfo.duration_seconds` returns `None` on audio-only files** instead of crashing through `primary_video.duration_seconds` (see the duration/bitrate move under **Changed**). - **`MediaOrganizer`** (`infrastructure/filesystem/organizer.py`) no longer passes the removed `show_imdb_id` / `episode_count` kwargs when constructing a `Season` for folder-name generation. ### Internal - Test suite rewritten where the aggregate redesign broke fixtures: `tests/domain/test_tv_shows.py` (69 tests), `tests/domain/test_media_info.py` (rewritten for `VideoTrack`), `tests/application/test_enrich_from_probe.py` (helper added), `tests/infrastructure/test_filesystem_extras.py` (fixtures), `tests/domain/test_tv_shows_service.py` (find_next_episode driven by real aggregate state). - Subtitle services internal migration: `matcher.py`, `utils.py`, `placer.py`, `identifier.py` updated to import `SubtitleCandidate`. - Suite status at end of block: **1066 passed, 8 skipped, 0 failed**.