# Changelog

All notable changes to Alfred are documented here.

The format is loosely based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
Alfred is not yet on SemVer — entries are grouped by **dated work blocks** instead
of release numbers. Granularity targets behavioral or API-visible changes; refer
to `git log` for commit-level detail.

Sections used per block: **Added** / **Changed** / **Deprecated** / **Removed** /
**Fixed** / **Internal** (for tech-debt and refactor noise that doesn't affect
callers).

---

## [Unreleased]

### Added

- **Real-world release fixtures** under `tests/fixtures/releases/{easy,shitty,path_of_pain}/`,
  each documenting an expected `ParsedRelease` plus the future `routing`
  (library / torrents / seed_hardlinks) for the upcoming `organize_media`
  refactor. EASY bucket seeded with 5 cases (movie, single-episode, season
  pack, movie + noise, YTS bracket-heavy). SHITTY bucket seeded with 15
  anti-regression cases covering: 3-level INTEGRALE hierarchy (Angel),
  French custom titles (Buffy, La Nuit au Musée, Chérie j'ai agrandi),
  multi-episode chain `S14E09E10E11` (Archer, captures E11 loss),
  lowercase `s01e01` (Notre Planète), `NxNN` with ` - ` separators
  (Vinyl, captures dash artifact), title-with-year-suffix (Deutschland.83),
  season-range `S01-06` (Tatortreiniger, captures movie misclassification),
  bare folder name (Jurassic Park, media_type=unknown), apostrophe-in-name
  (Honey Don't, captures full AI-path degeneration), SUBS-tag movie (Hook),
  space separators (Predator Badlands, captures group=UNKNOWN), subs-only
  release (Westworld S04).
  `tests/domain/test_release_fixtures.py` is parametrized over these fixtures
  for anti-regression.
- **`NxNN` alt season/episode form supported** by `parse_release`. Releases like
  `Show.1x05.720p.HDTV.x264-GRP` and `Show.2x07x08.1080p.WEB.x265-GRP` (multi-ep
  alt form) now parse as TV shows.
- **`alfred/knowledge/release/separators.yaml`** declares the token separators
  used by the release-name tokenizer (`.`, ` `, `[`, `]`, `(`, `)`, `_`). New
  conventions can be added without code changes. The canonical `.` is always
  present even if missing from YAML.
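
A minimal sketch of how a separator list like the one in `separators.yaml` could drive tokenization. The names `CONFIGURED_SEPARATORS`, `build_splitter`, and `tokenize` are illustrative assumptions, not Alfred's actual API, and the list is inlined rather than loaded from YAML to keep the example self-contained.

```python
import re

# Hypothetical stand-in for the values declared in separators.yaml.
CONFIGURED_SEPARATORS = (" ", "[", "]", "(", ")", "_")


def build_splitter(separators):
    """Compile a regex character class from the separators.

    The canonical "." is always added, even if the list omits it.
    """
    chars = set(separators) | {"."}
    escaped = "".join(re.escape(c) for c in sorted(chars))
    # "+" collapses runs such as ") [" into a single split point.
    return re.compile("[" + escaped + "]+")


def tokenize(name, separators=CONFIGURED_SEPARATORS):
    """Split a release name into non-empty tokens."""
    return [tok for tok in build_splitter(separators).split(name) if tok]
```

Under these assumptions, `tokenize("Inception 2010 1080p BluRay x264-GROUP")` returns `["Inception", "2010", "1080p", "BluRay", "x264-GROUP"]`, and an empty separator list still splits on `.`.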

### Changed

- **`parse_release` tokenizer is now data-driven**: it splits on any character
  listed in `separators.yaml` (regex character class) instead of `name.split(".")`.
  This makes YTS-style releases (`The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]`),
  space-separated names (`Inception 2010 1080p BluRay x264-GROUP`), and
  underscore-separated names parse correctly via the direct path — no more
  fallback through sanitization.
- **`parse_release` flow simplified**: site-tag extraction always runs first
  (so `parse_path == "sanitized"` now reliably indicates a stripped `[tag]`),
  then well-formedness is checked only against truly forbidden chars
  (anything not in the configured separator set).
- **ISO 639-2/B is now the canonical language code project-wide** (was a mix of
  639-1 and 639-2/T):
  - `SubtitlePreferences.languages` default is now `["fre", "eng"]` (was
    `["fr", "en"]`). Old LTM files are not auto-migrated — delete
    `data/memory/ltm.json` to regenerate with the new defaults.
  - Subtitle output filenames are now `{iso639_2b}.srt` (e.g. `fre.srt`,
    `fre.sdh.srt`). Existing `fr.srt` files are still **read** correctly
    (recognized as French via alias) but new files are written canonically.
  - `Language` value object docstring corrected: it has always stored 639-2/B
    (matching what ffprobe emits), not 639-2/T as previously documented.
- **`MovieService.validate_movie_file` minimum size is now configurable** via
  `settings.min_movie_size_bytes` (default unchanged: 100 MB). Constructor
  accepts an optional `min_movie_size_bytes` override for tests.
- **`SubtitleKnowledgeBase` delegates language lookup to `LanguageRegistry`**
  rather than duplicating tokens. `subtitles.yaml` now only declares
  subtitle-specific tokens (e.g. `vostfr`, `vf`, `vff`) under a new
  `language_tokens` section.
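
The canonical-code behavior above can be sketched as follows. This is an illustration only: the `ALIASES` table, `canonical_code`, and `subtitle_filename` are hypothetical names, and the real alias data lives in `iso_languages.yaml`.

```python
# Hypothetical alias table keyed by ISO 639-2/B code.
ALIASES = {
    "fre": {"fr", "fra", "french", "français"},
    "eng": {"en", "english"},
}


def canonical_code(raw):
    """Map a 639-1 / 639-2/T / name alias to its 639-2/B code, or None."""
    token = raw.strip().lower()
    for iso, aliases in ALIASES.items():
        if token == iso or token in aliases:
            return iso
    return None


def subtitle_filename(raw_language, sdh=False):
    """Build the canonical output name, e.g. 'fre.srt' or 'fre.sdh.srt'."""
    iso = canonical_code(raw_language)
    if iso is None:
        raise ValueError(f"unknown language: {raw_language!r}")
    parts = [iso] + (["sdh"] if sdh else []) + ["srt"]
    return ".".join(parts)
```

A legacy `fr` tag still resolves to French on read (`canonical_code("fr")` gives `"fre"`), while anything written goes out under the canonical code.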

### Removed

- **`alfred/domain/tv_shows/services.py`** and **`alfred/domain/movies/services.py`**
  deleted entirely. They held fossil parsers (`parse_episode_filename`,
  `extract_movie_metadata`, …) with zero production callers — superseded by
  `parse_release` as the single source of truth for release-name parsing.
  Associated tests (`tests/domain/test_movies.py`, `tests/domain/test_tv_shows_service.py`)
  removed as well.
- `_sanitize` and `_normalize` helpers in `alfred/domain/release/services.py` —
  the new tokenizer makes them redundant.
- `_LANG_KEYWORDS`, `_SDH_TOKENS`, `_FORCED_TOKENS`, `SUBTITLE_EXTENSIONS`
  hardcoded dicts in `alfred/domain/subtitles/scanner.py` — all knowledge now
  lives in YAML (CLAUDE.md compliance).
- `_MIN_MOVIE_SIZE_BYTES` module-level constant in
  `alfred/domain/movies/services.py` — replaced by the new setting.
- Top-level `languages:` block in `subtitles.yaml` — superseded by
  `language_tokens:` (subtitle-specific only) since `iso_languages.yaml` is the
  canonical source.

### Fixed

- **`hi` token no longer marks a subtitle as SDH** (it conflicted with the
  ISO 639-1 alias for Hindi). SDH is now detected only via `sdh`, `cc`, and
  `hearing` tokens.
- `SubtitleKnowledgeBase` default rules used `"fra"` while
  `iso_languages.yaml` exposes French as `"fre"`; the preferred-language
  defaults now match the canonical form.
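
As a rough illustration of the SDH fix, the check below treats only `sdh`, `cc`, and `hearing` as SDH markers. `SDH_TOKENS` and `is_sdh` are hypothetical names, and splitting on `.` alone is a simplification.

```python
# Tokens that mark a subtitle as SDH; "hi" is deliberately absent
# because it is also the ISO 639-1 code for Hindi.
SDH_TOKENS = frozenset({"sdh", "cc", "hearing"})


def is_sdh(filename):
    """Return True if any dot-separated token is an SDH marker."""
    return any(token in SDH_TOKENS for token in filename.lower().split("."))
```

With this rule, `Movie.hi.srt` is no longer flagged as SDH, while `Movie.eng.sdh.srt` still is.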

### Internal

- Removed backward-compat shims `_sanitise_for_fs` /
  `_strip_episode_from_normalised` from `domain/release/value_objects.py`
  (zero callers).
- Cleaned ruff warnings across the codebase: `subprocess.run` calls now pass
  explicit `check=False` (PLW1510); lazy imports promoted to module top where
  there was no cycle (PLC0415 in `manage_subtitles.py`, `placer.py`,
  `qbittorrent/client.py`, `file_manager.py`); fixed module-level import
  ordering (E402) in `language_registry.py` and `subtitles/knowledge/loader.py`;
  removed unused locals (F841 / B007); replaced unnecessary set comprehension
  with `set()` in `release/knowledge.py` (C416).
- Ruff config: ignore `PLR0911` / `PLR0912` (too-many-returns / too-many-branches)
  globally — noisy on parser mappers and orchestrator use-cases where early-return
  validation is essential complexity. Ignore `PLW0603` for the documented memory
  singleton (`infrastructure/persistence/context.py`).

---

## [2026-05-17] — TVShow & Movie aggregate refactor

Multi-phase overhaul of the TV show domain into a real DDD aggregate, with
matching parity work on `Movie`, a language knowledge system, and the
`shared/media` restructure that supports both.

### Added

- **Language knowledge system** (`alfred/knowledge/iso_languages.yaml`, covering
  42 languages including `und` for undetermined).
  - `Language` value object (frozen dataclass) with `iso`, `english_name`,
    `native_name`, `aliases`, and a `matches(raw)` cross-format helper.
  - `LanguageRegistry` loader (`alfred/domain/shared/knowledge/`) merging
    builtin + learned YAML. Not a singleton — the application layer
    instantiates it.
  - ISO 639-2/B is the canonical key; aliases cover 639-1, 639-2/T, English
    name, native name, and common spellings.
- **`VideoTrack`** dataclass (`alfred/domain/shared/media/video.py`) with a
  `resolution` property using width-priority bucket detection (handles
  cinema/scope crops like 1920×960 → 1080p).
- **`shared/media/matching.py`** — `track_lang_matches` helper shared by
  `Episode` and `Movie`. Implements the **"C+" contract** for language helpers:
  - `Language` query → cross-format match via `Language.matches()`
  - `str` query → case-insensitive direct comparison (no normalization)
- **TVShow aggregate composition**:
  - `TVShow.seasons: dict[SeasonNumber, Season]`
  - `Season.episodes: dict[EpisodeNumber, Episode]`
  - `Season.expected_episodes` / `Season.aired_episodes` (split so collection
    state can compare "owned vs aired today" without confusing in-flight
    seasons with future ones)
- **Aggregate methods on `TVShow`**:
  - `add_episode(ep)` — sole sanctioned mutation entry point (creates the
    season if missing)
  - `add_season(season)` — replaces a season wholesale
  - `collection_status()` → `CollectionStatus.{EMPTY, PARTIAL, COMPLETE}`
  - `is_complete_series()` — true iff `ENDED + COMPLETE`
  - `missing_episodes()` — flat list of all aired-but-not-owned
    `(season, episode)` pairs
- **`CollectionStatus`** enum (orthogonal to `ShowStatus`).
- **Episode track helpers** (`has_audio_in`, `has_subtitles_in`,
  `has_forced_subs`, `audio_languages`, `subtitle_languages`), driven by
  `Episode.audio_tracks` / `Episode.subtitle_tracks`.
- **Movie aggregate parity** — `Movie` now carries `audio_tracks` /
  `subtitle_tracks` and exposes the same helpers as `Episode` (same C+
  contract).
- **`CHANGELOG.md`** (this file).
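
The aggregate methods listed above could look roughly like this. It is a simplified sketch, not the actual domain code: plain `int` keys replace `SeasonNumber` / `EpisodeNumber`, `Episode` carries only its coordinates, and "complete" is assumed to mean "every aired episode is owned".

```python
from dataclasses import dataclass, field
from enum import Enum


class CollectionStatus(Enum):
    EMPTY = "empty"
    PARTIAL = "partial"
    COMPLETE = "complete"


@dataclass(frozen=True)
class Episode:
    season: int
    number: int


@dataclass
class Season:
    number: int
    aired_episodes: int = 0  # episodes aired as of today
    episodes: dict[int, Episode] = field(default_factory=dict)


@dataclass
class TVShow:
    title: str
    seasons: dict[int, Season] = field(default_factory=dict)

    def add_season(self, season):
        """Replace a season wholesale."""
        self.seasons[season.number] = season

    def add_episode(self, ep):
        """Mutation entry point; creates the season if it is missing."""
        season = self.seasons.setdefault(ep.season, Season(number=ep.season))
        season.episodes[ep.number] = ep

    def missing_episodes(self):
        """Flat list of aired-but-not-owned (season, episode) pairs."""
        return [
            (season.number, number)
            for season in sorted(self.seasons.values(), key=lambda s: s.number)
            for number in range(1, season.aired_episodes + 1)
            if number not in season.episodes
        ]

    def collection_status(self):
        owned = sum(len(s.episodes) for s in self.seasons.values())
        if owned == 0:
            return CollectionStatus.EMPTY
        if self.missing_episodes():
            return CollectionStatus.PARTIAL
        return CollectionStatus.COMPLETE
```

Because the status is derived from the season dict on each call, there is no stored counter that could drift out of sync with the episodes actually held.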

### Changed

- **`shared/media_info.py` exploded into `shared/media/{audio,video,subtitle,info,matching}.py`.**
  `MediaInfo` is now symmetric: every stream type is a `list[Track]`. Flat
  accessors (`width`, `height`, `video_codec`, `resolution`) remain as
  properties that read the first video track.
- **`MediaInfo.duration_seconds` / `bitrate_kbps`** moved from `VideoTrack` to
  `MediaInfo` (file-level — they come from the ffprobe `format` block, not a
  stream). Files without a video stream now correctly expose duration.
- **`ShowStatus.from_string`** extended to map TMDB strings (`Returning
  Series`, `In Production`, `Pilot`, `Planned`, `Canceled`, `Cancelled`).
  Comparison is whitespace-trimmed and case-insensitive.
- **`Season` / `Episode`** dropped their `show_imdb_id` back-references. They
  are owned by `TVShow` and reached only through it.
- **`TVShow.seasons_count` and `episode_count`** are now `@property` (computed
  from the dict) instead of stored ints.
- **`TVShowService.parse_episode_from_filename`** rewritten in string
  operations (no regex). Supports `S01E05` / `s1e5` and `1x05` / `01x5` forms.
- **`TVShowService.find_next_episode`** now drives off
  `show.missing_episodes()` instead of the hardcoded "max 50 episodes per
  season" heuristic.
- **`TVShowService` constructor** no longer takes `season_repository` /
  `episode_repository` — the aggregate persists in one block via
  `TVShowRepository` only.
- **`SubtitleTrack` in `alfred.domain.subtitles.entities` renamed to
  `SubtitleCandidate`.** Coexists with the `shared.media.SubtitleTrack`
  ffprobe-view dataclass (different bounded contexts, kept separate
  intentionally).
- **`tv_shows/services.py` `_VIDEO_EXTENSIONS`** now loaded from
  `knowledge/release/file_extensions.yaml` via `load_video_extensions()`
  (single source of truth).
- **`CLAUDE.md`** updated with three new policy sections:
  - "Tests" — small updates OK during normal work, no mass-update sprees
  - "Backwards-compatibility shims" — prefer clean migration over shims
  - "Regex" — not forbidden, use judgment when string ops would be fragile
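
For illustration, a regex-free parser for the `S01E05` / `s1e5` and `1x05` / `01x5` forms might look like the sketch below. The helper names and the digit limits (two for the season, three for the episode) are assumptions made here so that resolution-like tokens such as `1920x1080` are rejected. The sketch does not handle multi-episode chains.

```python
def _read_number(text, start, max_digits):
    """Read a run of ASCII digits at text[start:].

    Returns (value, next_index), or None if there is no digit or the
    run is longer than max_digits.
    """
    end = start
    while end < len(text) and "0" <= text[end] <= "9":
        end += 1
    if end == start or end - start > max_digits:
        return None
    return int(text[start:end]), end


def parse_episode_token(token):
    """Return (season, episode) for a single token, else None."""
    text = token.lower()
    # The 'sNNeNN' form uses an 'e' marker; the alt 'NxNN' form uses 'x'.
    start, marker = (1, "e") if text.startswith("s") else (0, "x")
    season = _read_number(text, start, max_digits=2)
    if season is None:
        return None
    season_number, pos = season
    if pos >= len(text) or text[pos] != marker:
        return None
    episode = _read_number(text, pos + 1, max_digits=3)
    if episode is None or episode[1] != len(text):
        return None  # trailing characters after the episode number
    return season_number, episode[0]


def parse_episode_from_filename(filename):
    """Scan dot/space/underscore-separated tokens for the first match."""
    normalized = filename.replace(" ", ".").replace("_", ".")
    for token in normalized.split("."):
        parsed = parse_episode_token(token)
        if parsed is not None:
            return parsed
    return None
```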

### Removed

- **Legacy `Season N Episode N` filename form** in
  `TVShowService.parse_episode_from_filename`. It never appears in the release
  names Alfred handles, and supporting it forced a regex.
- **`SeasonRepository` and `EpisodeRepository`** — only the aggregate root has
  a repository (DDD rule: one repo per aggregate).
- **`shared/media_info.py`** compatibility shim — callers updated.
- **`SubtitleTrack` compatibility alias** in `subtitles.entities` — callers
  updated to `SubtitleCandidate`.

### Fixed

- **`MediaInfo.duration_seconds` returns `None` on audio-only files** instead
  of crashing through `primary_video.duration_seconds` (see the duration/bitrate
  move under **Changed**).
- **`MediaOrganizer`** (`infrastructure/filesystem/organizer.py`) no longer
  passes the removed `show_imdb_id` / `episode_count` kwargs when constructing
  a `Season` for folder-name generation.
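
The sketch below shows why file-level duration avoids the audio-only crash, together with a width-first resolution bucket. Field names such as `video_tracks`, the width thresholds, and the height fallback are assumptions for illustration, not the actual `shared/media` definitions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class VideoTrack:
    width: int
    height: int

    @property
    def resolution(self) -> str:
        # Width is checked first so cropped frames (e.g. 1920x960)
        # keep their nominal bucket. Thresholds are assumed values.
        for min_width, label in ((3840, "2160p"), (1920, "1080p"), (1280, "720p")):
            if self.width >= min_width:
                return label
        return f"{self.height}p"


@dataclass
class MediaInfo:
    video_tracks: list = field(default_factory=list)
    # File-level value from the ffprobe "format" block, not from a stream.
    duration_seconds: Optional[float] = None

    @property
    def primary_video(self) -> Optional[VideoTrack]:
        return self.video_tracks[0] if self.video_tracks else None

    @property
    def resolution(self) -> Optional[str]:
        # Flat accessor: reads the first video track, if there is one.
        video = self.primary_video
        return video.resolution if video else None
```

An audio-only file then reports its duration normally and returns `None` for `resolution` instead of dereferencing a missing video track.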

### Internal

- Test suite rewritten where the aggregate redesign broke fixtures:
  `tests/domain/test_tv_shows.py` (69 tests), `tests/domain/test_media_info.py`
  (rewritten for `VideoTrack`), `tests/application/test_enrich_from_probe.py`
  (helper added), `tests/infrastructure/test_filesystem_extras.py` (fixtures),
  `tests/domain/test_tv_shows_service.py` (find_next_episode driven by real
  aggregate state).
- Subtitle services internal migration: `matcher.py`, `utils.py`, `placer.py`,
  `identifier.py` updated to import `SubtitleCandidate`.
- Suite status at end of block: **1066 passed, 8 skipped, 0 failed**.