test: add real-world release fixtures (EASY bucket)

Captures 5 canonical releases from /mnt/testipool/downloads as parametrized fixtures under tests/fixtures/releases/easy/.

Each fixture declares the release name, expected ParsedRelease fields, original tree, and the future routing (library / torrents / seed_hardlinks) for the upcoming organize_media refactor. Today only the 'parsed' section is asserted; tree is materialized into a tmp_path to catch typos. Routing is captured ahead of the planner work; it becomes verifiable once organize_media lands.

Cases:
- back_in_action (movie)
- slow_horses_single_ep (TV single)
- foundation_season_pack (S02 + .nfo noise)
- long_walk_with_noise (movie + KONTRAST.TOP.txt)
- sinners_yts (YTS bracket-heavy + Subs/ dir)

Also tracks CHANGELOG.md under [Unreleased] / Added.
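
As a rough illustration of the fixture contract described in this message, the sketch below shows one way a fixture could be represented and checked. The key names (`release_name`, `parsed`, `tree`, `routing`), the sample release, and both helper functions are hypothetical; the actual fixture files and test code in this commit may look different.

```python
from pathlib import Path

# Hypothetical fixture shape: key names and values are illustrative only.
FIXTURE = {
    "release_name": "Example.Movie.2020.1080p.WEB.x264-GRP",
    "parsed": {"title": "Example Movie", "year": 2020},
    "tree": [
        "Example.Movie.2020.1080p.WEB.x264-GRP/Example.Movie.2020.1080p.WEB.x264-GRP.mkv",
        "Example.Movie.2020.1080p.WEB.x264-GRP/GRP.txt",
    ],
    # Recorded for the future planner work, not asserted today.
    "routing": {"library": [], "torrents": [], "seed_hardlinks": []},
}


def materialize_tree(root: Path, entries: list[str]) -> list[Path]:
    """Create each listed file as an empty placeholder under root."""
    created = []
    for rel in entries:
        rel_path = Path(rel)
        # Reject entries that would escape the temporary root.
        if rel_path.is_absolute() or ".." in rel_path.parts:
            raise ValueError(f"suspicious tree entry: {rel!r}")
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
        created.append(target)
    return created


def mismatched_fields(expected: dict, actual: dict) -> dict:
    """Return {field: (expected, actual)} for every expected field that differs."""
    return {
        key: (value, actual.get(key))
        for key, value in expected.items()
        if actual.get(key) != value
    }
```

In a pytest test, `root` would typically be the built-in `tmp_path` fixture, and `actual` would be built from whatever the parser returns for `release_name`. Comparing only the fields listed under `parsed` keeps a fixture valid when the parser later grows new fields.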
2026-05-18 15:36:19 +02:00
parent f17abdbaec
commit 7bc50fd5b8
8 changed files with 568 additions and 0 deletions
@@ -0,0 +1,224 @@
+# Changelog
+
+All notable changes to Alfred are documented here.
+
+The format is loosely based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
+Alfred is not yet on SemVer — entries are grouped by **dated work blocks** instead
+of release numbers. Granularity targets behavioral or API-visible changes; refer
+to `git log` for commit-level detail.
+
+Sections used per block: **Added** / **Changed** / **Deprecated** / **Removed** /
+**Fixed** / **Internal** (for tech-debt and refactor noise that doesn't affect
+callers).
+
+---
+
+## [Unreleased]
+
+### Added
+
+- **Real-world release fixtures** under `tests/fixtures/releases/{easy,shitty,path_of_pain}/`,
+  each documenting an expected `ParsedRelease` plus the future `routing`
+  (library / torrents / seed_hardlinks) for the upcoming `organize_media`
+  refactor. EASY bucket seeded with 5 cases (movie, single-episode, season
+  pack, movie + noise, YTS bracket-heavy). Parametrized in
+  `tests/domain/test_release_fixtures.py` to guard against regressions.
+- **`NxNN` alt season/episode form supported** by `parse_release`. Releases like
+  `Show.1x05.720p.HDTV.x264-GRP` and `Show.2x07x08.1080p.WEB.x265-GRP` (multi-ep
+  alt form) now parse as TV shows.
+- **`alfred/knowledge/release/separators.yaml`** declares the token separators
+  used by the release-name tokenizer (`.`, ` `, `[`, `]`, `(`, `)`, `_`). New
+  conventions can be added without code changes. The canonical `.` is always
+  present even if missing from YAML.
+
+### Changed
+
+- **`parse_release` tokenizer is now data-driven**: it splits on any character
+  listed in `separators.yaml` (regex character class) instead of `name.split(".")`.
+  This makes YTS-style releases (`The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]`),
+  space-separated names (`Inception 2010 1080p BluRay x264-GROUP`), and
+  underscore-separated names parse correctly via the direct path — no more
+  fallback through sanitization.
+- **`parse_release` flow simplified**: site-tag extraction always runs first
+  (so `parse_path == "sanitized"` now reliably indicates a stripped `[tag]`),
+  then well-formedness is checked only against truly forbidden chars
+  (anything not in the configured separator set).
+- **ISO 639-2/B is now the canonical language code project-wide** (was a mix of
+  639-1 and 639-2/T):
+  - `SubtitlePreferences.languages` default is now `["fre", "eng"]` (was
+    `["fr", "en"]`). Old LTM files are not auto-migrated — delete
+    `data/memory/ltm.json` to regenerate with the new defaults.
+  - Subtitle output filenames are now `{iso639_2b}.srt` (e.g. `fre.srt`,
+    `fre.sdh.srt`). Existing `fr.srt` files are still **read** correctly
+    (recognized as French via alias) but new files are written canonically.
+  - `Language` value object docstring corrected: it has always stored 639-2/B
+    (matching what ffprobe emits), not 639-2/T as previously documented.
+- **`MovieService.validate_movie_file` minimum size is now configurable** via
+  `settings.min_movie_size_bytes` (default unchanged: 100 MB). Constructor
+  accepts an optional `min_movie_size_bytes` override for tests.
+- **`SubtitleKnowledgeBase` delegates language lookup to `LanguageRegistry`**
+  rather than duplicating tokens. `subtitles.yaml` now only declares
+  subtitle-specific tokens (e.g. `vostfr`, `vf`, `vff`) under a new
+  `language_tokens` section.
+
+### Removed
+
+- **`alfred/domain/tv_shows/services.py`** and **`alfred/domain/movies/services.py`**
+  deleted entirely. They held fossil parsers (`parse_episode_filename`,
+  `extract_movie_metadata`, …) with zero production callers — superseded by
+  `parse_release` as the single source of truth for release-name parsing.
+  Associated tests (`tests/domain/test_movies.py`, `tests/domain/test_tv_shows_service.py`)
+  removed as well.
+- `_sanitize` and `_normalize` helpers in `alfred/domain/release/services.py` —
+  the new tokenizer makes them redundant.
+- `_LANG_KEYWORDS`, `_SDH_TOKENS`, `_FORCED_TOKENS`, `SUBTITLE_EXTENSIONS`
+  hardcoded dicts in `alfred/domain/subtitles/scanner.py` — all knowledge now
+  lives in YAML (CLAUDE.md compliance).
+- `_MIN_MOVIE_SIZE_BYTES` module-level constant in
+  `alfred/domain/movies/services.py` — replaced by the new setting.
+- Top-level `languages:` block in `subtitles.yaml` — superseded by
+  `language_tokens:` (subtitle-specific only) since `iso_languages.yaml` is the
+  canonical source.
+
+### Fixed
+
+- **`hi` token no longer marks a subtitle as SDH** (it conflicted with the
+  ISO 639-1 alias for Hindi). SDH is now detected only via `sdh`, `cc`, and
+  `hearing` tokens.
+- `SubtitleKnowledgeBase` default rules used `"fra"` while
+  `iso_languages.yaml` exposes French as `"fre"` — preferred languages
+  defaults now match the canonical form.
+
+### Internal
+
+- Removed backward-compat shims `_sanitise_for_fs` /
+  `_strip_episode_from_normalised` from `domain/release/value_objects.py`
+  (zero callers).
+- Cleaned ruff warnings across the codebase: `subprocess.run` calls now pass
+  explicit `check=False` (PLW1510); lazy imports promoted to module top where
+  there was no cycle (PLC0415 in `manage_subtitles.py`, `placer.py`,
+  `qbittorrent/client.py`, `file_manager.py`); fixed module-level import
+  ordering (E402) in `language_registry.py` and `subtitles/knowledge/loader.py`;
+  removed unused locals (F841 / B007); replaced unnecessary set comprehension
+  with `set()` in `release/knowledge.py` (C416).
+- Ruff config: ignore `PLR0911` / `PLR0912` (too-many-returns / too-many-branches)
+  globally — noisy on parser mappers and orchestrator use-cases where early-return
+  validation is essential complexity. Ignore `PLW0603` for the documented memory
+  singleton (`infrastructure/persistence/context.py`).
+
+---
+
+## [2026-05-17] — TVShow & Movie aggregate refactor
+
+Multi-phase overhaul of the TV show domain into a real DDD aggregate, with
+matching parity work on `Movie`, a language knowledge system, and the
+`shared/media` restructure that supports both.
+
+### Added
+
+- **Language knowledge system** (`alfred/knowledge/iso_languages.yaml`, covering
+  42 languages including `und` for undetermined).
+  - `Language` value object (frozen dataclass) with `iso`, `english_name`,
+    `native_name`, `aliases`, and a `matches(raw)` cross-format helper.
+  - `LanguageRegistry` loader (`alfred/domain/shared/knowledge/`) merging
+    builtin + learned YAML. Not a singleton — the application layer
+    instantiates it.
+  - ISO 639-2/B is the canonical key; aliases cover 639-1, 639-2/T, English
+    name, native name, and common spellings.
+- **`VideoTrack`** dataclass (`alfred/domain/shared/media/video.py`) with a
+  `resolution` property using width-priority bucket detection (handles
+  cinema/scope crops like 1920×960 → 1080p).
+- **`shared/media/matching.py`** — `track_lang_matches` helper shared by
+  `Episode` and `Movie`. Implements the **"C+" contract** for language helpers:
+  - `Language` query → cross-format match via `Language.matches()`
+  - `str` query → case-insensitive direct comparison (no normalization)
+- **TVShow aggregate composition**:
+  - `TVShow.seasons: dict[SeasonNumber, Season]`
+  - `Season.episodes: dict[EpisodeNumber, Episode]`
+  - `Season.expected_episodes` / `Season.aired_episodes` (split so collection
+    state can compare "owned vs aired today" without confusing in-flight
+    seasons with future ones)
+- **Aggregate methods on `TVShow`**:
+  - `add_episode(ep)` — sole sanctioned mutation entry point (creates the
+    season if missing)
+  - `add_season(season)` — replaces a season wholesale
+  - `collection_status()` → `CollectionStatus.{EMPTY, PARTIAL, COMPLETE}`
+  - `is_complete_series()` — true iff `ENDED + COMPLETE`
+  - `missing_episodes()` — flat list of all aired-but-not-owned
+    `(season, episode)` pairs
+- **`CollectionStatus`** enum (orthogonal to `ShowStatus`).
+- **Episode track helpers** (`has_audio_in`, `has_subtitles_in`,
+  `has_forced_subs`, `audio_languages`, `subtitle_languages`), driven by
+  `Episode.audio_tracks` / `Episode.subtitle_tracks`.
+- **Movie aggregate parity** — `Movie` now carries `audio_tracks` /
+  `subtitle_tracks` and exposes the same helpers as `Episode` (same C+
+  contract).
+- **`CHANGELOG.md`** (this file).
+
+### Changed
+
+- **`shared/media_info.py` split into `shared/media/{audio,video,subtitle,info,matching}.py`.**
+  `MediaInfo` is now symmetric: every stream type is a `list[Track]`. Flat
+  accessors (`width`, `height`, `video_codec`, `resolution`) remain as
+  properties that read the first video track.
+- **`duration_seconds` / `bitrate_kbps`** moved from `VideoTrack` to
+  `MediaInfo`: they are file-level values that come from the ffprobe `format`
+  block, not from a stream. Files without a video stream now correctly expose
+  duration.
+- **`ShowStatus.from_string`** extended to map TMDB strings (`Returning
+  Series`, `In Production`, `Pilot`, `Planned`, `Canceled`, `Cancelled`).
+  Comparison is whitespace-trimmed and case-insensitive.
+- **`Season` / `Episode`** dropped their `show_imdb_id` back-references. They
+  are owned by `TVShow` and reached only through it.
+- **`TVShow.seasons_count` and `episode_count`** are now `@property` (computed
+  from the dict) instead of stored ints.
+- **`TVShowService.parse_episode_from_filename`** rewritten with plain string
+  operations (no regex). Supports `S01E05` / `s1e5` and `1x05` / `01x5` forms.
+- **`TVShowService.find_next_episode`** now drives off
+  `show.missing_episodes()` instead of the hardcoded "max 50 episodes per
+  season" heuristic.
+- **`TVShowService` constructor** no longer takes `season_repository` /
+  `episode_repository` — the aggregate persists in one block via
+  `TVShowRepository` only.
+- **`SubtitleTrack` in `alfred.domain.subtitles.entities` renamed to
+  `SubtitleCandidate`.** Coexists with the `shared.media.SubtitleTrack`
+  ffprobe-view dataclass (different bounded contexts, kept separate
+  intentionally).
+- **`tv_shows/services.py` `_VIDEO_EXTENSIONS`** now loaded from
+  `knowledge/release/file_extensions.yaml` via `load_video_extensions()`
+  (single source of truth).
+- **`CLAUDE.md`** updated with three new policy sections:
+  - "Tests" — small updates OK during normal work, no mass-update sprees
+  - "Backwards-compatibility shims" — prefer clean migration over shims
+  - "Regex" — not forbidden, use judgment when string ops would be fragile
+
+### Removed
+
+- **Legacy `Season N Episode N` filename form** in
+  `TVShowService.parse_episode_from_filename`. It never appears in the release
+  names Alfred handles, and supporting it forced a regex.
+- **`SeasonRepository` and `EpisodeRepository`** — only the aggregate root has
+  a repository (DDD rule: one repo per aggregate).
+- **`shared/media_info.py`** compatibility shim — callers updated.
+- **`SubtitleTrack` compatibility alias** in `subtitles.entities` — callers
+  updated to `SubtitleCandidate`.
+
+### Fixed
+
+- **`MediaInfo.duration_seconds` returns `None` on audio-only files** instead
+  of crashing through `primary_video.duration_seconds` (see the duration/bitrate
+  move under **Changed**).
+- **`MediaOrganizer`** (`infrastructure/filesystem/organizer.py`) no longer
+  passes the removed `show_imdb_id` / `episode_count` kwargs when constructing
+  a `Season` for folder-name generation.
+
+### Internal
+
+- Test suite rewritten where the aggregate redesign broke fixtures:
+  `tests/domain/test_tv_shows.py` (69 tests), `tests/domain/test_media_info.py`
+  (rewritten for `VideoTrack`), `tests/application/test_enrich_from_probe.py`
+  (helper added), `tests/infrastructure/test_filesystem_extras.py` (fixtures),
+  `tests/domain/test_tv_shows_service.py` (find_next_episode driven by real
+  aggregate state).
+- Subtitle services internal migration: `matcher.py`, `utils.py`, `placer.py`,
+  `identifier.py` updated to import `SubtitleCandidate`.
+- Suite status at end of block: **1066 passed, 8 skipped, 0 failed**.
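
The data-driven tokenizer entry under [Unreleased] / Changed describes splitting on a regex character class built from the configured separators. The sketch below illustrates that idea only. The function names are hypothetical, the separator list is copied from the changelog entry rather than loaded from `separators.yaml`, and the real `parse_release` tokenizer may differ.

```python
import re

# Separator set as listed in the changelog entry; the real list lives in YAML.
SEPARATORS = [".", " ", "[", "]", "(", ")", "_"]


def build_splitter(separators: list[str]) -> re.Pattern[str]:
    """Compile a character class matching runs of any configured separator."""
    # The canonical "." is always included, even if the config omits it.
    chars = set(separators) | {"."}
    escaped = "".join(re.escape(c) for c in sorted(chars))
    return re.compile(f"[{escaped}]+")


def tokenize(name: str, splitter: re.Pattern[str]) -> list[str]:
    """Split a release name into tokens, dropping empty fragments."""
    return [token for token in splitter.split(name) if token]


splitter = build_splitter(SEPARATORS)
print(tokenize("Show.1x05.720p.HDTV.x264-GRP", splitter))
# ['Show', '1x05', '720p', 'HDTV', 'x264-GRP']
```

A split this naive also breaks tokens such as `5.1` into two pieces, so a real tokenizer presumably needs extra handling for those cases. This sketch does not attempt that.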