francwa 42fa6139ed refactor(tools): wire filesystem tools to new use cases, drop broken ones

Updates alfred/agent/tools/filesystem.py to use the five free-function
use cases introduced in the previous commit:

  - list_folder            -> list_dir_use_case(Path(path), roots)
  - create_directory (new) -> create_dir_use_case(Path(path), roots)
  - move_media             -> move_file_use_case(src, dst, roots)
  - move_to_destination    -> create_dir_use_case(dst.parent) + move_file

A module-level _load_directory_roots() helper reads memory once per
call and builds the DirectoryRoots VO; missing roots produce an
explicit 'roots_not_configured' error.

Tools whose backing code was moved to *_OLD files are removed entirely
rather than left broken: manage_subtitles, set_path_for_folder,
create_seed_links, and the four resolve_*_destination tools. They will
come back when the matching application/domain code is rebuilt later
on this branch.

alfred/agent/tools/__init__.py shrinks accordingly. find_media_imdb_id
(already broken before this branch — name not exported by tools.api)
is dropped from the package re-exports so the package imports cleanly
again.

2026-05-26 19:46:49 +02:00

Changelog

All notable changes to Alfred are documented here.

The format is loosely based on Keep a Changelog. Alfred is not yet on SemVer — entries are grouped by dated work blocks instead of release numbers. Granularity targets behavioral or API-visible changes; refer to git log for commit-level detail.

Sections used per block: Added / Changed / Deprecated / Removed / Fixed / Internal (for tech-debt and refactor noise that doesn't affect callers).

[Unreleased]

Changed

filesystem infra + application rewritten as 5 atomic free functions. On branch unfuck. Replaces the monolithic FileManager class + scattered helpers with five small, pure ops in alfred/infrastructure/filesystem/: list_dir, create_dir, link_file, move_file, move_dir. Each takes pathlib.Path arguments and raises typed exceptions from a dedicated hierarchy (FilesystemError → SourceNotFound / DestinationExists / NotADirectory / NotAFile / PermissionDenied / CrossDevice / FilesystemOSError) — no more {"status": "ok" | "error"} dicts at the infra boundary, no more get_memory() reads.
filesystem application: 5 use cases as free functions. A matching <op>_use_case(path, …, roots: DirectoryRoots) wraps each infra op, guards inputs against escaping a new DirectoryRoots VO (downloads / torrents / movies / tv_shows), catches infra exceptions, and returns a frozen <Op>Response DTO. Roots are now injected, not pulled from the global memory singleton.
Agent tool wrappers partially re-wired to the new use cases. list_folder now delegates to list_dir_use_case; move_media to move_file_use_case; move_to_destination chains create_dir_use_case + move_file_use_case; a new create_directory tool wraps create_dir_use_case. Roots are loaded once per tool call via a module-level _load_directory_roots() helper that reads the persisted memory and builds the DirectoryRoots VO; the use cases themselves no longer read the memory singleton. Missing roots produce an explicit roots_not_configured error.
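The three entries above describe one layering: infra ops raise typed exceptions, use cases guard paths against injected roots and translate failures into a frozen response. A minimal sketch of that shape follows. It is not the project's code: the DirectoryRoots field names, the contains() helper, the ListDirResponse fields, and the "outside_roots" error string are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


class FilesystemError(Exception):
    """Base of the infra exception hierarchy named in the entry above."""


class SourceNotFound(FilesystemError):
    pass


class NotADirectory(FilesystemError):
    pass


@dataclass(frozen=True)
class DirectoryRoots:
    # Field names assumed from the "downloads / torrents / movies / tv_shows" list.
    downloads: Path
    torrents: Path
    movies: Path
    tv_shows: Path

    def contains(self, path: Path) -> bool:  # hypothetical helper
        resolved = path.resolve()
        roots = (self.downloads, self.torrents, self.movies, self.tv_shows)
        return any(resolved.is_relative_to(r.resolve()) for r in roots)


@dataclass(frozen=True)
class ListDirResponse:  # hypothetical DTO shape
    ok: bool
    entries: tuple[str, ...] = ()
    error: str | None = None


def list_dir(path: Path) -> tuple[str, ...]:
    """Infra op: raises typed exceptions instead of returning status dicts."""
    if not path.exists():
        raise SourceNotFound(str(path))
    if not path.is_dir():
        raise NotADirectory(str(path))
    return tuple(sorted(p.name for p in path.iterdir()))


def list_dir_use_case(path: Path, roots: DirectoryRoots) -> ListDirResponse:
    """Use case: refuses paths outside the roots, maps exceptions to a DTO."""
    if not roots.contains(path):
        return ListDirResponse(ok=False, error="outside_roots")
    try:
        return ListDirResponse(ok=True, entries=list_dir(path))
    except FilesystemError as exc:
        return ListDirResponse(ok=False, error=type(exc).__name__)
```

Because the roots arrive as an argument, a test can pass a DirectoryRoots built on a temporary directory without touching any global state.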

Removed

FileManager / MediaOrganizer / create_folder / move from the public API of alfred.infrastructure.filesystem. Their files remain on disk renamed with an _OLD suffix (e.g. file_manager_OLD.py) so the migration can finish on a follow-up commit without losing reference material. They are no longer re-exported from __init__.
CreateSeedLinksUseCase / ListFolderUseCase / MoveMediaUseCase / ManageSubtitlesUseCase / resolve_destination from the public API of alfred.application.filesystem. Same _OLD rename treatment. This intentionally breaks current tool wrappers and tests downstream — re-wiring is the next chunk of work on this branch.
Agent tools dropped during the refactor (to be reintroduced when the matching domain/application code lands): manage_subtitles, set_path_for_folder, create_seed_links, resolve_season_destination, resolve_episode_destination, resolve_movie_destination, resolve_series_destination. Their wrappers are removed from alfred.agent.tools.filesystem; alfred.agent.tools.__init__ now re-exports only what still imports cleanly. find_media_imdb_id (already broken before this branch — name no longer exported by tools.api) was also dropped from the package re-exports.

Added

.alfred v2 — Phase 4: v2-shaped rescan_show + new rescan_movie + index anchor-warning + tmdb_cache_ttl_days setting. Fourth and final structural phase of specs/dot_alfred_v2.md on branch refactor/dot-alfred-v2. The TV + movie rescan orchestrators now write v2 release aggregates (SeriesRelease / MovieRelease) via the concrete v2 repositories; the library index keeps auto-healing from the new sidecars on its next read (no TMDB call from rescan — that stays Phase 5).
- rescan_show moves from alfred/application/library/ to alfred/application/tv_shows/ (symmetry with the new alfred/application/movies/). New signature: (show_root, *, tmdb_id: TmdbId, imdb_id: ImdbId | None = None, series_repo, scanner, prober, kb) -> SeriesRelease.
- rescan_movie (new — alfred/application/movies/rescan.py) locates the main video via find_video_file, runs inspect_release once, and writes the per-movie .alfred sidecar. added_at = datetime.now(UTC) on every rescan (the sidecar records reconciliation time, not filesystem mtime). Raises MovieRescanFailed when no video is found in the folder.
- PACK semantics in rescan_show: a single-video + no-episode season becomes SeasonRelease(mode=PACK, folder=…, episodes=()). The slot map stays empty until the Phase 5 TMDB sync supplies episode_count — no fabricated EpisodeRange lands in the sidecar. (Superseded by Phase 4b — see Fixed.)
Settings.tmdb_cache_ttl_days: int = 14 — placeholder for the Phase 5 TTL policy on library-index entries (fetched_at + TTL drives refresh decisions).
Library-index anchor-mismatch warning — both DotAlfredTVShowLibraryIndex and DotAlfredMovieLibraryIndex now cross-check each entry's metadata.path against the on-disk folder layout right after a successful parse. Drift is logged as a WARNING (one per missing folder, with tmdb_id); the heal path stays silent by construction (it always synthesizes from real folder names).
.alfred v2 — Phase 5: TMDB sync orchestrators. Fifth phase of specs/dot_alfred_v2.md on branch refactor/dot-alfred-v2. Two new orchestrators refresh the library-root index's TMDB-cached fields from on-disk truth + a single TMDB call:
- sync_show (alfred/application/tv_shows/sync.py) calls TMDBClient.get_tv_show_info, loads the release via DotAlfredSeriesReleaseRepository.load_by_tmdb_id, and upserts the result into DotAlfredTVShowLibraryIndex. Honors Settings.tmdb_cache_ttl_days; placeholder entries (auto-healed, status == "unknown") always refresh; force=True overrides both gates. Raises ShowNotFoundInLibrary when neither index nor sidecar carry tmdb_id. Indexed shows with a missing per-show sidecar still get a fresh TMDB pass — slot map clears until rescan repopulates it.
- sync_movie (alfred/application/movies/sync.py) is the movie-side parallel. Placeholder signature is name == metadata.path (auto-heal copies the folder name into name; the sidecar schema requires name non-empty so we can't use name == ""). When the per-movie sidecar is gone but the index entry remains, sync warns and returns the existing entry unchanged (no upsert possible without a release).
TmdbMovieInfo DTO + TMDBClient.get_movie_info — symmetric to the existing TmdbShowInfo / get_tv_show_info pair. Carries tmdb_id, imdb_id, title, and release_year (parsed from TMDB's release_date).
load_by_tmdb_id on the v2 release repositories. The series repo returns (SeriesRelease, show_folder_name) so the sync orchestrator can feed DotAlfredTVShowLibraryIndex.upsert(..., path=...); the movie repo returns MovieRelease alone (folder is on release.folder already) and is provided as a semantic alias of find_by_tmdb_id for symmetry.
alfred/application/exceptions.py — new module for the two shared *NotFoundInLibrary exceptions raised by the sync orchestrators (ShowNotFoundInLibrary, MovieNotFoundInLibrary).
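The refresh gates that sync_show honors (TTL, placeholder entries, force) can be summarized as a small pure function. This is a sketch under assumptions: the needs_refresh name, its parameters, treating a missing fetched_at as stale, and the inclusive TTL boundary are all hypothetical. It covers the show side only, since sync_movie detects placeholders differently (name == metadata.path).

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def needs_refresh(
    *,
    status: str,
    fetched_at: datetime | None,
    ttl_days: int = 14,
    force: bool = False,
    now: datetime | None = None,
) -> bool:
    """Return True when a TMDB refresh should happen for a show index entry."""
    if force:
        return True  # force=True overrides both gates
    if status == "unknown" or fetched_at is None:
        return True  # auto-healed placeholder: always refresh
    now = now or datetime.now(timezone.utc)
    # Assumption: an entry exactly ttl_days old counts as expired.
    return now - fetched_at >= timedelta(days=ttl_days)
```

Passing now explicitly keeps the decision deterministic in tests, in the same spirit as the injectable reference date mentioned for parse_tv_show_info.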

Fixed

PACK vs EPISODIC classification (Phase 4b). The Phase 4 walker + rescan_show logic classified seasons by parser output (does the filename carry Exx?), but PACK vs EPISODIC is a structural distinction:
- PACK = season folder with N flat SxxEyy videos.
- EPISODIC = season folder with N subfolders, each holding one video.
The walker now descends two levels under show_root and classifies per season folder. Mixed (flat + subfolders) is malformed — warn and skip. rescan_show trusts the walker's mode and stops conflating "single un-numbered video" with PACK (that case is now skipped as malformed too). Tests rewritten against the real model. Supersedes the PACK-semantics bullet above in Added.
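The structural rule above can be sketched as a classifier that looks only at folder layout. This is an illustration, not the walker itself: classify_season_folder, the hard-coded extension set, and the enum values are assumptions, and the filename check that rejects a single un-numbered video is deliberately left out.

```python
from enum import Enum
from pathlib import Path
from typing import Optional

# Assumption: the project reads extensions from its knowledge base instead.
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".avi"}


class ReleaseMode(Enum):
    PACK = "pack"
    EPISODIC = "episodic"


def _is_video(p: Path) -> bool:
    return p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS


def classify_season_folder(season_dir: Path) -> Optional[ReleaseMode]:
    """Classify by structure only; None means malformed (caller warns and skips)."""
    flat_videos = [p for p in season_dir.iterdir() if _is_video(p)]
    subfolders = [p for p in season_dir.iterdir() if p.is_dir()]
    if flat_videos and subfolders:
        return None  # mixed layout: flat videos next to subfolders
    if flat_videos:
        return ReleaseMode.PACK
    # EPISODIC requires every subfolder to hold exactly one video.
    if subfolders and all(
        sum(_is_video(f) for f in sub.iterdir()) == 1 for sub in subfolders
    ):
        return ReleaseMode.EPISODIC
    return None
```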

Removed

v1 dot_alfred stack and its abstract domain ports. Deleted alfred/infrastructure/persistence/dot_alfred/{bridge,repository,serializer,sidecar}.py, plus the alfred/domain/{tv_shows,movies}/repositories.py ABCs (TVShowRepository / MovieRepository) — zero callers after Phase 4. dot_alfred/__init__.py is rewritten as a v2-only re-export (four concrete repositories + ShowFolderUnknown).
alfred/application/library/ package (rescan + walker moved to alfred/application/tv_shows/).
The two Phase 3 module-level test skips (test_repository.py, test_serializer.py) are lifted by deleting the quarantined files.
MediaWithTracks mixin + track_lang_matches helper in alfred.domain.shared.media. Parked in Phase 4 pending a Phase 5 decision; zero callers across alfred/ and tests/ after the v2 aggregates landed, so both go.

Internal

Suite: 1233 → 1277 passing; 10 → 8 skips (only LLM-not-running skips remain — the Phase 3 quarantines are gone with their files).
Phase 5 cleanup sweep returns zero hits for MediaWithTracks, v1 dot_alfred symbols, v1 sidecar names, and alfred.application.library — the v2 surface is the only one left.

Changed

.alfred v2 — Phase 3: TVShow / Movie aggregates become TMDB-only. Third phase of specs/dot_alfred_v2.md on branch refactor/dot-alfred-v2. Filesystem-side concerns (file paths, tracks, quality, mode, added_at) move to the releases/ domain added in Phase 1; the TMDB aggregates now carry only identity + TMDB catalog facts.
- TVShow — tmdb_id: TmdbId is now the required primary key; imdb_id: ImdbId | None is the optional secondary anchor. Added status: str = "unknown" (raw TMDB string, default matches the v2 library-index auto-heal placeholder). episode_count aggregates the TMDB-cached counts on each Season (was: sum of materialized Episode objects).
- Season — added episode_count: int = 0 (TMDB-cached, authoritative). Removed: audio_tracks, subtitle_tracks, and the mode property (release mode now lives only on SeasonRelease.mode — single source of truth).
- Episode — slimmed to identity + title. Removed: file_path, file_size, audio_tracks, subtitle_tracks. The MediaWithTracks mixin is no longer in Episode's MRO; on-disk facts live on the matching EpisodeRelease keyed by (season_number, episode_number).
- Movie — tmdb_id: TmdbId required, imdb_id optional. Removed: file_path, file_size, quality, added_at, audio_tracks, subtitle_tracks. get_filename() now returns "Title.Year" (quality lives on MovieRelease and is appended by a release-aware caller — Phase 4 wires this through MediaOrganizer).
- TVShowBuilder / SeasonBuilder — constructor requires tmdb_id: TmdbId; imdb_id and status are optional. SeasonBuilder.set_episode_count(int) replaces the old set_audio_tracks / set_subtitle_tracks (tracks no longer persisted on Season).
MovieRelease carries added_at: datetime (required). Bumped dot_alfred/v2 SCHEMA_VERSION from 1 → 2 to add added_at: datetime to MovieReleaseSidecar. Round-trip via Pydantic mode="json" (datetime ↔ ISO 8601 string). No migration code shipped — no v2.1 sidecars exist in the wild yet.
No-coercion TmdbId contract. TVShow(tmdb_id=1396) now raises — callers pass TmdbId(1396). Same for imdb_id: ImdbId | None on TVShow/Movie. Honest type contract, no ergonomic shim.
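A no-coercion contract of this kind can be sketched with two frozen dataclasses. The value field name, the exception types, and the reduced TVShow are assumptions; the changelog only states that a bare int now raises and that TmdbId rejects bool, str, and float.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TmdbId:
    value: int  # field name is an assumption

    def __post_init__(self) -> None:
        # bool is a subclass of int, so it has to be rejected explicitly.
        if isinstance(self.value, bool) or not isinstance(self.value, int):
            raise TypeError(f"TmdbId expects int, got {type(self.value).__name__}")
        if self.value <= 0:
            raise ValueError("TmdbId must be positive")


@dataclass(frozen=True)
class TVShow:  # reduced to two fields for the sketch
    tmdb_id: TmdbId
    title: str

    def __post_init__(self) -> None:
        # No coercion: a bare int is a contract violation, not a convenience.
        if not isinstance(self.tmdb_id, TmdbId):
            raise TypeError("tmdb_id must be a TmdbId instance")
```

With this shape, TVShow(tmdb_id=TmdbId(1396), title="Example") constructs, while TVShow(tmdb_id=1396, title="Example") raises.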

Removed

Season.mode property (derive from SeasonRelease.mode instead).
Episode.file_path / file_size / audio_tracks / subtitle_tracks.
Movie.file_path / file_size / quality / added_at / audio_tracks / subtitle_tracks.

Internal

v1 dot_alfred package (bridge.py, repository.py, serializer.py, sidecar.py), the abstract TVShowRepository / MovieRepository ports typed against the pre-Phase-3 aggregates, and alfred/application/library/rescan.py are intentionally left in tree as a known-red island. Their tests (tests/infrastructure/persistence/dot_alfred/test_repository.py, test_serializer.py, tests/application/library/test_rescan.py) are module-level skipped with a Phase 4 reference. Phase 4 rewrites rescan_show / introduces rescan_movie on top of the v2 release repositories + library index, then deletes the v1 stack + the abstract ports + the quarantined tests in one swing.
Test suite: 1216 passed, 11 skipped (8 pre-existing + 3 Phase-3 quarantines), 4 xfailed. v2 round-trip tests now reference SCHEMA_VERSION instead of hard-coded 1 for future-proofing.

Added

.alfred v2 — Phase 2: new persistence package + TMDB client extensions. Second phase of specs/dot_alfred_v2.md on branch refactor/dot-alfred-v2. The new alfred/infrastructure/persistence/dot_alfred/v2/ package ships the full v2 sidecar stack while leaving v1 (and the existing TVShow aggregate) untouched — Phase 3 is the cutover.
- Pydantic DTOs — SeriesReleaseSidecar / MovieReleaseSidecar (per-item), TVShowLibraryIndexSidecar / MovieLibraryIndexSidecar (library-root index). All built on a common _Strict base (extra="forbid", frozen=True) with a @model_validator enforcing schema_version == 1.
- Track entries — AudioTrackEntry / SubtitleEntry (sidecar cache shape, slimmed from the domain track types). SubtitleEntry carries is_forced + is_sdh as explicit booleans (v1's type: "sdh" overload is gone).
- Serializer — read_yaml / atomic_write_yaml helpers centralize YAML I/O and atomic writes (.tmp + os.replace). SidecarSchemaError wraps both YAML parse errors and Pydantic validation errors for uniform catch-and-skip semantics.
- Bridge — lossless domain ↔ sidecar conversion for SeriesRelease / MovieRelease (round-trippable, including multi-episode ranges and is_sdh subtitles); one-way projection for library-index entries (show_index_entry_from, movie_index_entry_from) that flattens multi-episode files into per-TMDB-slot maps in seasons[*].episodes.
- Repositories — DotAlfredSeriesReleaseRepository / DotAlfredMovieReleaseRepository walk library_root/*/ with log+skip on corruption; DotAlfredTVShowLibraryIndex / DotAlfredMovieLibraryIndex auto-heal silently on missing or corrupt index files by rebuilding from the per-item sidecars (healed entries keep TMDB-cached fields as placeholders until the next sync repopulates them). Writes are atomic and never auto-heal (read paths handle that).
- TMDB client extensions — TmdbSeasonInfo / TmdbShowInfo DTOs + TMDBClient.get_tv_show_info(tmdb_id) aggregating /tv/{id} + /tv/{id}/external_ids. The parsing logic is a pure function (parse_tv_show_info) testable without HTTP, with an injectable reference date for deterministic aired flag tests.
is_sdh flag on SubtitleTrack. Added to alfred/domain/shared/media.py::SubtitleTrack to mirror ffprobe's hearing_impaired disposition. Wired through the ffprobe layer (ffprobe_prober.py) and the v2 sidecar bridge so SDH information round-trips end-to-end. Defaults to False — backwards-compatible for every existing caller.
37 v2 integration tests on tmp_path covering round-trips (domain ↔ sidecar ↔ YAML ↔ domain), atomic writes (no .tmp leftovers), per-item log+skip on corruption / schema mismatch, movie anchor-mismatch warning, full upsert / find / delete on both library indexes, and the auto-heal path on missing / corrupt / schema-mismatched index files. 16 TMDB DTO tests for the new parse_tv_show_info pure function.
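The ".tmp + os.replace" pattern mentioned for atomic_write_yaml can be illustrated on plain text. This sketch leaves YAML serialization out; atomic_write_text is a hypothetical name, and the fsync call and the cleanup in finally are assumptions rather than documented behavior of the project's helper.

```python
import os
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write to a sibling .tmp file, then swap it into place with os.replace."""
    tmp = path.with_name(path.name + ".tmp")
    try:
        with open(tmp, "w", encoding="utf-8") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())  # assumption: flush to disk before the rename
        os.replace(tmp, path)  # atomic when tmp and path share a filesystem
    finally:
        if tmp.exists():
            tmp.unlink()  # only reached with a leftover if the write failed
```

Readers either see the previous file or the complete new one, which is what lets the tests assert that no .tmp leftovers remain.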
.alfred v2 — Phase 1: new releases/ domain. First step of specs/dot_alfred_v2.md on branch refactor/dot-alfred-v2. The new alfred/domain/releases/ package introduces a filesystem-only bounded context separated from TMDB identity (the existing tv_shows / movies domains). It hosts:
- EpisodeRange VO — covers single-episode files (EpisodeRange(E02, E02)) and multi-episode files (EpisodeRange(E02, E04) for SxxE02E03E04.mkv), with count() / numbers() / is_single() helpers.
- ReleaseMode enum — PACK (N video files directly in the season folder) vs EPISODIC (N sub-folders, one episode each); classified by the walker, never re-derived.
- Aggregates — TrackProfile, EpisodeRelease, SeasonRelease (with episode_count() summing each file's range), SeriesRelease, MovieRelease. All frozen dataclasses; mutation via SeasonReleaseBuilder / SeriesReleaseBuilder (mirror the v1 TVShowBuilder pattern, including from_existing() round-trip).
- Abstract ports — SeriesReleaseRepository, MovieReleaseRepository (concrete DotAlfred* arrive in Phase 2).
- TmdbId VO added to alfred/domain/shared/value_objects.py (positive int, rejects bool/str/float — symmetry with ImdbId).
73 unit tests covering VO validation, entity invariants, builder sort + overlap detection, and from_existing() round-trips. v1 code paths untouched at this stage; new domain coexists.
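As an illustration of the EpisodeRange VO and its three helpers, a frozen dataclass along these lines would behave as described. The start / end field names and the ordering check are assumptions; the changelog names only count(), numbers(), and is_single().

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EpisodeRange:
    start: int  # field names are assumed
    end: int

    def __post_init__(self) -> None:
        # Assumption: a range that runs backwards is rejected.
        if self.end < self.start:
            raise ValueError(f"invalid episode range: {self.start}-{self.end}")

    def count(self) -> int:
        return self.end - self.start + 1

    def numbers(self) -> tuple[int, ...]:
        return tuple(range(self.start, self.end + 1))

    def is_single(self) -> bool:
        return self.start == self.end
```

A file named SxxE02E03E04.mkv maps to EpisodeRange(2, 4), which counts three episodes; a single-episode file maps to EpisodeRange(2, 2).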
rescan_show orchestrator (alfred/application/library/rescan.py). Step 4 of the specs/dot_alfred.md plan. Walks an Alfred-managed show folder, runs the existing inspect_release pipeline on every video file it finds, and assembles a frozen TVShow aggregate persisted via the injected TVShowRepository. Reuses the release parser + ffprobe path verbatim — no duplicated parse/probe logic at the library layer. PACK vs EPISODIC inferred per season folder from the on-disk file count + parser output: a single video whose name carries no Exx token becomes a PACK season (tracks lifted to the season-level audio_tracks / subtitle_tracks), anything else becomes EPISODIC (one Episode per file). Episode paths are stored relative to the show root for portability. Files that fail to parse a season/episode number, or seasons with mixed numbers, are logged and skipped — the orchestrator never raises. Embedded subtitle tracks are captured from ffprobe; adjacent .srt files, multi-episode entries (S01E01E02), and TMDB-driven PACK detection are tracked as tech debt for a dedicated subtitles / ShowTracker session. 7 integration tests on tmp_path with the Foundation layout (S01 EPISODIC + S02 PACK) cover the round-trip through the real .alfred repository.
Show tree walker (alfred/application/library/walker.py). Step 4a foundation. walk_show(show_root, scanner, kb) returns a ShowTree(show_root, season_folders=tuple[SeasonFolder, ...]) — pure structural snapshot, no parsing, no probing. Season folders are detected by a \bS\d{1,2}\b token anywhere in the directory name (release-style naming, no Plex Season 01 / Specials conventions). Video files are filtered against kb.video_extensions; no recursion into sub-sub-folders. 11 unit tests on tmp_path cover detection (case-insensitive, in-word rejection), filtering (subs, NFO, sample files), and edge cases (empty / missing show root).
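The season-folder detection rule quoted above is easy to check in isolation. The pattern is taken from the entry; the is_season_folder name is hypothetical, and the IGNORECASE flag is inferred from the "case-insensitive" note on the tests.

```python
import re

# Pattern quoted from the walker entry; the flag is an inference.
SEASON_TOKEN = re.compile(r"\bS\d{1,2}\b", re.IGNORECASE)


def is_season_folder(name: str) -> bool:
    """True when the directory name carries a standalone Sxx token."""
    return SEASON_TOKEN.search(name) is not None
```

Release-style names such as Foundation.S01.1080p match, while "Season 01", "Specials", and an episode token like S01E02 do not, because the word boundary after the digits fails when a letter follows.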
Season-level audio/subtitle tracks (alfred/domain/tv_shows/entities.py, alfred/domain/tv_shows/builders.py). Season now inherits from MediaWithTracks and carries audio_tracks / subtitle_tracks tuples (empty by default). Populated only in PACK mode (the single release covering the whole season); empty in EPISODIC mode where tracks live per-episode. SeasonBuilder gains set_audio_tracks() / set_subtitle_tracks() and forwards them through from_existing(). The bridge writes / reads them in the PACK branch via shared _synth_audio_tracks / _synth_subtitle_tracks helpers used for episodes too.
DotAlfredTVShowRepository — filesystem-backed implementation of the TVShowRepository port (alfred/infrastructure/persistence/dot_alfred/repository.py). Step 3 of the specs/dot_alfred.md plan. Reads and writes one .alfred YAML file per show under a configurable library_root. save(show) writes atomically (.alfred.tmp + os.replace) into a folder that must already exist — the repository never invents a folder name (the upstream MediaOrganizer is in charge of placing files; the repo writes the sidecar next to them). find_by_imdb_id / find_all walk library_root/*/, loading each readable sidecar; folders without a sidecar return None / are skipped (no implicit cold scan — that is the job of the upcoming rescan_show tool). Corrupted YAML and schema violations are logged and skipped, never raised, so a single bad folder does not break the rest of the library. The repo keeps a tiny in-memory imdb_id → folder_name index populated on every successful read/save, so subsequent saves find the right destination without re-walking — useful when the show folder name diverges from show.get_folder_name() (custom 1080p / 4K variants). 20 integration tests on tmp_path cover the round-trip, cold folder / unknown id returns, multi-show find_all, corrupted / wrong-schema skipping, atomic write (no .alfred.tmp left behind), overwrite, and folder-name fallbacks.
Sidecar ↔ TVShow bridge (alfred/infrastructure/persistence/dot_alfred/bridge.py). to_sidecar(show, folder_paths=...) summarizes the rich domain AudioTrack / SubtitleTrack to the sidecar's compact form (unique audio languages in track order; subtitle entries derived from is_forced and assumed source="embedded"). from_sidecar(sidecar, title=...) reconstructs the domain TVShow with synthesized tracks — one AudioTrack per language, one SubtitleTrack per entry, with ffprobe-only fields (codec, channels, channel_layout) left as None. The bridge is intentionally lossy on probe minutiae the sidecar does not store; this is the documented trade-off from the factual-only spec.
.alfred sidecar serializer (alfred/infrastructure/persistence/dot_alfred/). Implements step 2 of the specs/dot_alfred.md plan. Pure-dict in/out (serialize(sidecar) -> dict, deserialize(data) -> ShowSidecar) — YAML I/O lives in the repository layer (step 3) and is kept out for trivial testability. Ships the DTOs that mirror the YAML schema field-for-field (ShowSidecar, SeasonSidecar, EpisodeSidecar, SubtitleEntry). The sidecar acts as a scan cache: it stores only what is genuinely costly to recompute — folder/file paths (skipping the FS walk) and probed track metadata (skipping ffprobe). Release identifiers (group, source, quality, codec) live in folder and file names and are derived on demand by the parser — they are deliberately absent from the schema and rejected on deserialize. The serializer is strict on schema: unknown keys at any level raise SidecarSchemaError, missing required fields raise clearly, and bool cannot sneak in as a season/episode number. Optional fields (tmdb_id, empty audio/subtitles/episodes) are omitted from the output rather than emitted as null / []. Tests cover round-trip equivalence (DTO → dict → DTO and DTO → YAML text → DTO), the Foundation S01 PACK case (real-world fixture with mixed sub types — superset captured at season scope), and a Breaking Bad S05 EPISODIC case. An on-disk tmp_path fixture recreates the Foundation folder structure with placeholder files, ready to be reused by the upcoming repository walk tests in step 3.
TVShowBuilder / SeasonBuilder — sole construction surface for the TVShow aggregate (alfred/domain/tv_shows/builders.py). The aggregate is now fully frozen; building goes through a mutable scratchpad that emits an immutable TVShow via build(). Both builders offer a from_existing() classmethod to seed from a current frozen aggregate and apply modifications. Episodes are emitted sorted by number within a season, seasons sorted by number within the show.
SeasonMode enum (PACK / EPISODIC) in alfred/domain/tv_shows/value_objects.py. Computed at read time from the season's structural shape (Season.mode property): a season with no explicit episodes is PACK (a single release covering the whole season), a season with episodes is EPISODIC (currently airing, one release per episode). Never stored — the YAML sidecar encodes the mode via the presence/absence of the episodes: block.

Changed

TVShow aggregate is now frozen all the way down. TVShow, Season and Episode are all @dataclass(frozen=True). Children are stored as ordered tuples (tuple[Season, ...], tuple[Episode, ...]) sorted by their respective numbers, replacing the previous mutable dicts. Lookup helpers TVShow.get_season(n) and Season.get_episode(n) traverse the tuple lazily via next(). The former add_episode / add_season mutation methods are gone — all construction goes through TVShowBuilder / SeasonBuilder.
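The combination of frozen entities, sorted tuples, next()-based lookup, and a mutable builder might look like this reduced sketch. It is not the project's code: entity fields are trimmed to the minimum, and the builder's add_episode method name is an assumption. Last-write-wins on duplicates follows the test description in the Internal section below.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    number: int
    title: str = ""


@dataclass(frozen=True)
class Season:
    number: int
    episodes: tuple[Episode, ...] = ()

    def get_episode(self, n: int) -> Episode | None:
        # Lazy scan of the tuple; stops at the first match.
        return next((e for e in self.episodes if e.number == n), None)


class SeasonBuilder:
    """Mutable scratchpad; build() emits the frozen Season."""

    def __init__(self, number: int) -> None:
        self._number = number
        self._episodes: dict[int, Episode] = {}

    def add_episode(self, episode: Episode) -> "SeasonBuilder":
        self._episodes[episode.number] = episode  # last write wins on duplicates
        return self

    def build(self) -> Season:
        ordered = tuple(self._episodes[k] for k in sorted(self._episodes))
        return Season(number=self._number, episodes=ordered)
```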

Removed

ShowTracker-territory fields stripped from the TVShow aggregate. The aggregate now models only what the .alfred sidecar stores (filesystem-observable facts + immutable identity). Dropped from the domain:
- TVShow.status (ShowStatus) and the ShowStatus enum entirely, along with its TMDB string mapping (from_string).
- TVShow.expected_seasons, Season.expected_episodes, Season.aired_episodes, Season.name.
- TVShow.collection_status(), is_complete_series(), missing_episodes(), is_ongoing(), is_ended() and the CollectionStatus enum.
- Season.is_complete(), is_fully_aired(), missing_episodes() and the aired ≤ expected validation.
- TVShow.add_episode() / TVShow.add_season() / Season.add_episode() — replaced by the builder API.
These concerns will reappear in a dedicated ShowTracker layer (to be designed) that combines the .alfred sidecar with live TMDB data to answer questions like "is this show complete?" or "are new episodes out?". Keeping volatile/derived state out of the aggregate matches the factual-only philosophy locked in specs/dot_alfred.md.

Internal

Test suite rewritten for the new aggregate shape. tests/domain/test_tv_shows.py now covers frozen invariants, builder ordering, last-write-wins on duplicates, from_existing round-trip, and SeasonMode derivation. tests/infrastructure/test_filesystem_extras.py helper simplified (no more ShowStatus.ENDED / expected_seasons on test shows). 1078 tests still green.
Design doc for .alfred/ sidecar persistence (specs/dot_alfred.md). First entry in the new specs/ directory. Specifies a per-show .alfred/ directory holding a show.yaml and one season_NN.yaml per season, used by the upcoming concrete TVShowRepository to cache parse/probe results and avoid full rescans on every library read. Covers schema, naming conventions, cache invalidation strategy (size + mtime), self-healing on drift, atomicity (os.replace), edge cases (legacy folders, corrupted sidecars, manual file removal), and a phased implementation plan. No code yet — spec only.

Internal

specs/ is now tracked. The repo-level .gitignore had a blanket *.md rule with only CHANGELOG.md allow-listed. Added explicit exceptions for /README.md (root only — avoids unintentionally exposing fixture READMEs) and specs/**/*.md so the new design-doc directory ships with the project. Also added an explicit /.claude/ ignore line for the private dev-docs sub-repo that sits inside the working tree but is versioned separately.

Fixed

Multi-episode chain (e.g. S14E09E10E11) now collapses to a full range. The parser previously captured episode=9, episode_end=10 and dropped E11+. It now returns episode=first, episode_end=last, with intermediate values implied. Fixture shitty/archer_multi_episode/ updated from anti-regression-of-bug to anti-regression-of-fix.
Apostrophes in titles no longer push the release through the AI fallback. Honey.Don't.2025.2160p.WEBRip.DSNP.DV.HDR.x265-Amen previously parsed with parse_path="ai" and everything UNKNOWN because ' is in the forbidden-chars list. Apostrophes are now pre-stripped before the well-formed check, so the parse completes normally (title=Honey.Dont, year=2025, quality=2160p, ...); only the title text loses its apostrophe. parse_path becomes sanitized to surface the cleanup. Side win: PoP fixture the_prodigy_full_chaos/ also moves from total failure to a partially-correct parse (year, source, codec extracted).
Season-range markers (Sxx-yy) are now recognized as tv_complete. Der.Tatortreiniger.S01-06.GERMAN... previously parsed as media_type=movie with S01-06 glued onto the title. The parser now recognizes the range, sets season=first, media_type=tv_complete, and removes the marker from the title. is_season_pack flips to true.
Pure-punctuation TITLE tokens are dropped at assembly. Releases with surrounding - separators (Vinyl - 1x01 - FHD) previously produced title="Vinyl.-". Such tokens (a stray dash, a wide pipe ｜, …) carry no title content and are now filtered out. Side effect: PoP fixture khruangbin_yt_wide_pipe/ also benefits — the YouTube wide-pipe no longer leaks into the title.
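For the multi-episode chain fix at the top of this Fixed block, the intended result (first and last episode, intermediates implied) can be shown with a standalone regex. The real parser is token-based and is not reproduced here; the pattern, the parse_episode_chain name, and returning None as episode_end for a single episode are assumptions.

```python
from __future__ import annotations

import re

# Hypothetical pattern, for illustration only.
_CHAIN = re.compile(r"S(?P<season>\d{1,2})(?P<eps>(?:E\d{1,3})+)", re.IGNORECASE)


def parse_episode_chain(token: str) -> tuple[int, int, int | None] | None:
    """Return (season, episode, episode_end), or None if the token is no chain."""
    m = _CHAIN.fullmatch(token)
    if m is None:
        return None
    numbers = [int(n) for n in re.findall(r"\d+", m.group("eps"))]
    first, last = numbers[0], numbers[-1]
    return int(m.group("season")), first, (last if len(numbers) > 1 else None)
```

For S14E09E10E11 this yields season 14, episode 9, and episode_end 11, instead of stopping at 10.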

Added

Fullwidth vertical bar ｜ (U+FF5C) is now a recognized release-name token separator. Added to alfred/knowledge/release/separators.yaml so CJK release names (and the occasional decorative YouTube-style use) tokenize cleanly instead of leaving the wide pipe glued onto an adjacent token. The tokenizer in alfred/domain/release/parser/pipeline.py already iterates the separator list as plain strings (no regex), so a multi-byte UTF-8 separator works without any code change.
InspectedResult.recommended_action property — derived hint that collapses the orchestrator's go / wait / skip decision into a single value ("process" / "ask_user" / "skip"). Centralizes the exclusion logic that was previously dispersed across road / media_type / main_video checks at each call site. Ordering is part of the contract: skip (no main video, or media_type == "other") wins over ask_user (media_type == "unknown" or road == "path_of_pain") which wins over process. Surfaced through the analyze_release tool so the LLM can route on it directly. 6 new tests in tests/application/test_inspect.py cover the four branches and the precedence rules.
LanguageRepository port in alfred.domain.shared.ports. Structural Protocol covering from_iso, from_any, all, __contains__, __len__ — the surface previously coupled to the concrete LanguageRegistry. Mirrors the MediaProber / FilesystemScanner pattern: domain code depends on the Protocol, infrastructure provides the YAML-backed adapter. Tests in tests/infrastructure/test_language_registry.py.
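The precedence contract stated for InspectedResult.recommended_action (skip over ask_user over process) reduces to two ordered checks. The sketch below uses a free function with keyword arguments rather than a property, and the has_main_video parameter name is an assumption; the string values are the ones quoted in the entry.

```python
def recommended_action(*, has_main_video: bool, media_type: str, road: str) -> str:
    """Collapse the checks into "process" / "ask_user" / "skip", in contract order."""
    if not has_main_video or media_type == "other":
        return "skip"  # wins over everything else
    if media_type == "unknown" or road == "path_of_pain":
        return "ask_user"  # wins over process
    return "process"
```

Keeping the ordering in one place means a release with no main video and an unknown media type is skipped, not escalated to the user.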

Changed

Movie and Episode are now frozen dataclasses. Both entities hold their track collections as tuple[AudioTrack, ...] and tuple[SubtitleTrack, ...] instead of mutable lists, and are @dataclass(frozen=True, eq=False) (identity-based equality preserved via __eq__/__hash__). __post_init__ coercion uses object.__setattr__ for the imdb_id / title / season_number / episode_number normalizations. To project enrichment results (probe output, file metadata) callers now rebuild via dataclasses.replace(...). Pattern aligned with the recent ParsedRelease freeze. MediaWithTracks mixin contract updated to tuple accordingly. Season and TVShow remain mutable for now — freezing the aggregate root would cascade a full reconstruction on every add_episode, deferred.
SubtitleCandidate renamed to SubtitleScanResult. The old name conflated "this might become a placed subtitle" with "this is what a scan pass produced". The class is the output of a scan/identify pass — language/format may still be None, confidence reflects how sure the classifier is, and raw_tokens holds the filename fragments under analysis. SubtitleScanResult says that directly. Pure rename with a refreshed docstring in alfred/domain/subtitles/entities.py; no behavior change. Touches the domain entity + __init__ export, the matcher / identifier / utils services, the manage_subtitles use case, the placer, the metadata store, the shared-media cross-ref comment, and the seven test modules that imported the type.
ParsedRelease is now frozen; enrichment passes return new instances. The VO was mutable so detect_media_type and enrich_from_probe could patch fields in place — a code smell in a value object whose identity is its content. ParsedRelease is now @dataclass(frozen=True); languages is a tuple[str, ...] instead of a list[str]. enrich_from_probe returns a new ParsedRelease via dataclasses.replace (only allocates when at least one field actually changed). inspect_release rebinds parsed after both detect_media_type (wrapped in MediaTypeToken to satisfy the strict isinstance check that now also runs on replace) and enrich_from_probe. Parser pipeline now packs languages as a tuple in the assemble dict. Callers updated: inspect_release, testing/recognize_folders_in_downloads.py, and the enrichment tests (22 call sites + language assertions switched to tuple literals).
resolve_destination use cases take kb / prober as required params; module-level singletons gone. The four resolve_{season,episode,movie,series}_destination use cases now accept kb: ReleaseKnowledge and prober: MediaProber as required arguments, matching the shape of inspect_release. The module-level _KB = YamlReleaseKnowledge() and _PROBER = FfprobeMediaProber() singletons that previously lived in alfred/application/filesystem/resolve_destination.py are removed — the application layer no longer reaches into infrastructure. The singletons now live at the agent-tools frontier (alfred/agent/tools/filesystem.py), where the LLM-facing wrappers instantiate them once and thread them through. analyze_release no longer needs the dirty from ... import _KB indirection. Tests inject their own stubs by keyword (prober=_StubProber(...)) instead of monkeypatching a module attribute.
ParsePath enum renamed to TokenizationRoute. The old name collided with pathlib.Path in code-reading mental models, and was one letter from parse_path (the field that holds the value) — making it harder than it needed to be to spot the type vs the attribute. TokenizationRoute says what it actually captures (DIRECT / SANITIZED / AI = how the name reached the tokenizer), and the class docstring now spells out the orthogonality with Road (EASY / SHITTY / PATH_OF_PAIN, which captures parser confidence on ParseReport). The parse_path field name stays unchanged — string values too — so YAML fixtures, the analyze_release tool spec, and any external consumer are untouched.
enrich_from_probe codec mappings moved to YAML. The three hard-coded module dicts (_VIDEO_CODEC_MAP, _AUDIO_CODEC_MAP, _CHANNEL_MAP) translating ffprobe output to scene tokens (hevc → x265, eac3 → EAC3, 8 → "7.1", …) now live in alfred/knowledge/release/probe_mappings.yaml and are loaded into ReleaseKnowledge.probe_mappings (new port field, populated by YamlReleaseKnowledge). enrich_from_probe gains a third kb parameter and reads the maps from there. Aligns with the CLAUDE.md rule that lookup tables of domain knowledge belong in YAML, not in Python — and opens the door to a future "learn new codec" pass. Callers updated: inspect_release, testing/recognize_folders_in_downloads.py, and all 22 sites in tests/application/test_enrich_from_probe.py.
ParsedRelease.tech_string is now a derived @property (alfred/domain/release/value_objects.py). It computes quality.source.codec joined by dots on every access, so it stays in sync with the underlying fields by construction. The stored field is gone from the dataclass, the dict returned by assemble() no longer carries the key, parse_release's malformed-name fallback drops the tech_string="" kwarg, and enrich_from_probe no longer re-derives it after filling quality/source/codec. Closes the parser/enrichment double-source-of-truth that e79ca46 had to fix reactively. The fixtures runner now injects tech_string alongside is_season_pack since asdict() skips properties.
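A rough illustration of the derived-property approach, using a reduced stand-in for ParsedRelease. Skipping empty parts when joining is an assumption; the changelog only states that the three fields are joined by dots.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ParsedRelease:
    quality: str | None = None
    source: str | None = None
    codec: str | None = None

    @property
    def tech_string(self) -> str:
        # Recomputed on every access, so it cannot drift from the fields.
        parts = (self.quality, self.source, self.codec)
        return ".".join(p for p in parts if p)


release = ParsedRelease(quality="1080p", source="WEB-DL", codec="x265")
as_dict = asdict(release)  # properties are not dataclass fields
```

Because `asdict()` only walks declared fields, `tech_string` is absent from `as_dict`, which is why the fixtures runner has to inject it separately.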
RuleScope.level is now an enum (RuleScopeLevel). The set of valid levels (global, release_group, movie, show, season, episode) was documented only in a docstring comment and validated nowhere. RuleScopeLevel(str, Enum) keeps wire compatibility (YAML serialization, .value access) while making the closed set explicit to type-checkers and IDEs. to_dict() emits .value strings so YAML output is unchanged.
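As a sketch of why the `(str, Enum)` mix-in keeps wire compatibility: members compare equal to their plain string values, and constructing from an unknown string fails loudly. The member names below are assumptions; only the six values come from the entry above.

```python
from enum import Enum


class RuleScopeLevel(str, Enum):
    GLOBAL = "global"
    RELEASE_GROUP = "release_group"
    MOVIE = "movie"
    SHOW = "show"
    SEASON = "season"
    EPISODE = "episode"


level = RuleScopeLevel("season")  # lookup by value, e.g. when reading YAML
serialized = level.value          # plain "season" string for YAML output
```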
FilePath VO uses __post_init__ instead of a hand-rolled __init__. Same public API (accepts str | Path), same behavior, but the dataclass-generated __init__ is no longer bypassed. One less smell in the shared VOs.
Language VO is strict by default; Language.from_raw() factory for normalization. The previous __post_init__ mutated iso and aliases via object.__setattr__ on a frozen dataclass — a code smell hiding behind the dataclass facade. Split: the direct constructor now rejects un-normalized input (uppercase iso, whitespace in aliases, etc.), and Language.from_raw() handles arbitrary YAML/user input. Only one caller (LanguageRegistry loading the ISO YAML) needed migration.
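The strict-constructor plus factory split could look roughly like this. The exact validation rules (lowercase, no surrounding whitespace) are assumptions based on the examples in the entry, and the real Language VO has more fields.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Language:
    iso: str
    aliases: tuple[str, ...] = ()

    def __post_init__(self) -> None:
        # Validate only: the constructor never rewrites its input.
        for value in (self.iso, *self.aliases):
            if value != value.strip().lower():
                raise ValueError(f"un-normalized language token: {value!r}")

    @classmethod
    def from_raw(cls, iso: str, aliases: Iterable[str] = ()) -> Language:
        # Normalization lives here, for YAML or user-provided input.
        return cls(
            iso=iso.strip().lower(),
            aliases=tuple(a.strip().lower() for a in aliases),
        )


french = Language.from_raw(" FRE ", ["Fr ", "French"])
```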
ParsedRelease.normalised renamed to clean. The field name promised "dots instead of spaces" but in practice held raw - site_tag - apostrophes — only used by season_folder_name(). Renamed and docstring corrected.
ParsedRelease.media_type / parse_path are strict enums. The fields were already typed as MediaTypeToken / ParsePath, but a tolerant __post_init__ coerced raw strings. With both classes being (str, Enum), the coercion served no purpose. Strict constructor; .value no longer passed at call sites; dropped the unused _VALID_MEDIA_TYPES / _VALID_PARSE_PATHS lookup tables.

Removed

settings.min_movie_size_bytes — orphan Pydantic field + validator. Its only consumer (MovieService.validate_movie_file) had been removed during an earlier refactor. The "real movie vs sample" rule now lives in extension-based exclusion (application/release/supported_media.py) and PoP. If a size threshold is ever needed, it'll go in a knowledge YAML, not in settings.

Internal

Flattened alfred.domain.shared.media/ package into a single media.py module. The 6-file package (audio, video, subtitle, info, matching, tracks_mixin + __init__) collapsed into one ~250 LoC module. All 12 import sites continue to resolve unchanged (from alfred.domain.shared.media import AudioTrack, MediaInfo, …) since Python treats media.py and media/__init__.py interchangeably for import paths. Easier to scan when the whole bounded-context fits on one screen.
SubtitleKnowledgeBase types language_registry against the LanguageRepository port instead of the concrete LanguageRegistry class. The default constructor still instantiates the concrete adapter when no repository is injected — behaviour is unchanged for existing callers. Opens the door to in-memory fakes in future tests without loading the full ISO 639 YAML.
Moved detect_media_type and enrich_from_probe from alfred.application.filesystem to alfred.application.release. They are inspection-pipeline helpers — their natural home is next to inspect_release, not next to the filesystem use cases. The move also eliminates a circular-import workaround in resolve_destination.py: inspect_release can now be imported at module top instead of lazily inside _resolve_parsed. Public surface is unchanged for callers that imported the helpers from their full module paths (the only call sites — inspect.py, two tests, one testing script — were updated in this commit).

Added

resolve_*_destination use cases now consume inspect_release. resolve_episode_destination and resolve_movie_destination reuse their existing source_file parameter as the inspection target; resolve_season_destination and resolve_series_destination gain a new optional source_path parameter (also threaded through the tool wrappers and YAML specs). When the path exists, ffprobe data fills tokens missing from the release name (e.g. quality) and refreshes tech_string, so the destination folder / file names end up more accurate. When the path is omitted (back-compat callers) or does not exist on disk, the use cases fall back to parse-only, which is the same behavior as before.

Fixed

enrich_from_probe now refreshes tech_string after filling quality / source / codec. Previously the field stayed at its parser-time value, so filename builders saw stale tech tokens even after a successful probe. New TestTechString class in tests/application/test_enrich_from_probe.py locks the behavior.

Added

inspect_release orchestrator + InspectedResult VO (alfred/application/release/inspect.py). Single composition of the four inspection layers: parse_release → detect_media_type (patches parsed.media_type) → find_main_video (top-level scan) → prober.probe + enrich_from_probe when a video exists and the refined media type isn't in {"unknown", "other"}. Returns a frozen InspectedResult(parsed, report, source_path, main_video, media_info, probe_used) that downstream callers consume directly instead of rebuilding the same chain. kb and prober are injected — no module-level singletons. Never raises.

Changed

analyze_release tool now delegates to inspect_release — same output shape, plus two new fields: confidence (0–100) and road ("easy" / "shitty" / "path_of_pain") surfaced from the parser's ParseReport. The tool spec (specs/analyze_release.yaml) documents both fields so the LLM can route releases by confidence.
MediaProber port now covers full media probing: added probe(video) -> MediaInfo | None alongside the existing list_subtitle_streams. FfprobeMediaProber (in alfred/infrastructure/probe/) implements both methods and is now the single adapter shelling out to ffprobe. The standalone alfred/infrastructure/filesystem/ffprobe.py module was removed — all callers (tools, testing scripts) instantiate FfprobeMediaProber instead. Unblocks the upcoming inspect_release orchestrator, which depends on the port.

Removed

alfred/infrastructure/filesystem/ffprobe.py (folded into the FfprobeMediaProber adapter).

[2026-05-20] — Release parser confidence scoring + exclusion

Added

Pre-pipeline exclusion helpers (alfred/application/release/supported_media.py): is_supported_video(path, kb) (extension-only check against kb.video_extensions) and find_main_video(folder, kb) (top-level scan, lexicographically-first eligible file, returns None when no video qualifies; accepts a bare file as folder for single-file releases). No size threshold, no filename heuristics — PATH_OF_PAIN handles the exotic cases. Foundation for the future inspect_release orchestrator.
Release parser — parse-confidence scoring (alfred/domain/release/parser/scoring.py, alfred/knowledge/release/scoring.yaml). parse_release now returns (ParsedRelease, ParseReport). The new ParseReport frozen VO carries a 0–100 confidence, a road ("easy" / "shitty" / "path_of_pain"), the residual UNKNOWN tokens, and the missing critical fields. EASY is decided structurally (a group schema matched); SHITTY vs PATH_OF_PAIN is decided by score against a YAML-configurable cutoff (default 60). Weights and penalties also live in scoring.yaml — title 30, media_type 20, year 15, season 10, episode 5, tech 5 each; penalty 5 per UNKNOWN token capped at -30. Road is a new enum, distinct from ParsePath (which records the tokenization route, not the confidence tier). ReleaseKnowledge port gains a scoring: dict field.
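To make the arithmetic concrete, here is a small sketch using the default weights quoted above. The function names, the way tech fields are counted, clamping to 0–100, and treating the cutoff as inclusive are all assumptions; the real values are read from scoring.yaml.

```python
WEIGHTS = {"title": 30, "media_type": 20, "year": 15, "season": 10, "episode": 5}
TECH_WEIGHT = 5        # awarded once per tech field that was found
UNKNOWN_PENALTY = 5    # per residual UNKNOWN token
PENALTY_CAP = 30
CUTOFF = 60            # SHITTY vs PATH_OF_PAIN threshold


def confidence(found: set[str], tech_found: int, unknown_tokens: int) -> int:
    base = sum(w for name, w in WEIGHTS.items() if name in found)
    base += TECH_WEIGHT * tech_found
    penalty = min(UNKNOWN_PENALTY * unknown_tokens, PENALTY_CAP)
    return max(0, min(100, base - penalty))


def road(schema_matched: bool, score: int) -> str:
    # EASY is structural: a group schema matched, regardless of score.
    if schema_matched:
        return "easy"
    return "shitty" if score >= CUTOFF else "path_of_pain"


# Title + media type + year, three tech fields, two UNKNOWN tokens:
# 30 + 20 + 15 + 3 * 5 - 2 * 5 = 70
score = confidence({"title", "media_type", "year"}, tech_found=3, unknown_tokens=2)
```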

Changed

parse_release signature is now (name, kb) → tuple[ParsedRelease, ParseReport] instead of returning a bare ParsedRelease. Call sites updated in application/filesystem/resolve_destination.py and agent/tools/filesystem.py. Tests updated accordingly.

[2026-05-20] — Release parser v2 (EASY + SHITTY)

Added

Release parser v2 — EASY path live (alfred/domain/release/parser/): new annotate-based pipeline (tokenize → annotate → assemble) drives releases from known groups. Exposes Token (frozen VO with index + role + extra), TokenRole enum (structural/technical/meta families), and GroupSchema / SchemaChunk value objects.
- pipeline.tokenize: string-ops separator split (no regex), strips a [site.tag] prefix/suffix first.
- pipeline.annotate: detects the trailing group right-to-left (priority to codec-GROUP shape, fallback to any non-source dashed token), looks up its GroupSchema, then walks tokens and schema chunks in lockstep — optional chunks that don't match are skipped, mandatory mismatches abort EASY and return None so the caller can fall back to SHITTY.
- pipeline.assemble: folds annotated tokens into a ParsedRelease-compatible dict.
- parse_release (in release.services) tries the v2 EASY path first and falls through to the legacy SHITTY heuristic on None. Legacy SHITTY/PATH OF PAIN behavior is unchanged.
- Knowledge: alfred/knowledge/release/release_groups/{kontrast,elite, rarbg}.yaml declare the canonical chunk order per group, loaded via new ReleaseKnowledge.group_schema(name) port method.
- Tests in tests/domain/release/test_parser_v2_{scaffolding,easy}.py cover token VOs, site-tag stripping, group detection, schema-driven annotation (movie, TV episode, season pack with optional source), and field assembly.
Release parser v2 — enricher pass completes the EASY pipeline. The structural schema walk now tolerates non-positional tokens between chunks (instead of aborting on leftover tokens), and a second pass tags them with audio / video-meta / edition / language roles. Multi-token sequences from audio.yaml, video.yaml, editions.yaml (e.g. DTS.HD.MA, DV.HDR10, TrueHD.Atmos, DIRECTORS.CUT) are matched before single tokens. Channel layouts like 5.1 and 7.1 (split into two tokens by the . separator) are detected as consecutive pairs. Sequence members carry an extra["sequence_member"] marker so assemble extracts the canonical value only from the primary token. KONTRAST releases with audio / HDR / edition / language metadata now produce a fully populated ParsedRelease.
Streaming distributor as a separate dimension from encoding source. New alfred/knowledge/release/distributors.yaml (NF, AMZN, DSNP, HMAX, ATVP, HULU, PCOK, PMTP, CR) feeds a new ReleaseKnowledge.distributors port field, a TokenRole.DISTRIBUTOR annotation, and a ParsedRelease.distributor field. WEB-DL stays the source; the platform that produced the release is now recorded distinctly. The five entries (NF, AMZN, DSNP, HMAX, ATVP) were correspondingly removed from sources.yaml.
Real-world release fixtures under tests/fixtures/releases/{easy,shitty,path_of_pain}/, each documenting an expected ParsedRelease plus the future routing (library / torrents / seed_hardlinks) for the upcoming organize_media refactor. Parametrized in tests/domain/test_release_fixtures.py for anti-regression.
- EASY bucket seeded with 5 cases (movie, single-episode, season pack, movie + noise, YTS bracket-heavy).
- SHITTY bucket seeded with 15 anti-regression cases covering: 3-level INTEGRALE hierarchy (Angel), French custom titles (Buffy, La Nuit au Musée, Chérie j'ai agrandi), multi-episode chain S14E09E10E11 (Archer, captures E11 loss), lowercase s01e01 (Notre Planète), NxNN with - separators (Vinyl, captures dash artifact), title-with-year-suffix (Deutschland.83), season-range S01-06 (Tatortreiniger, captures movie misclassification), bare folder name (Jurassic Park, media_type=unknown), apostrophe-in-name (Honey Don't, captures full AI-path degeneration), SUBS-tag movie (Hook), space separators (Predator Badlands, captures group=UNKNOWN), subs-only release (Westworld S04).
- PATH OF PAIN bucket seeded with 10 worst-case fixtures covering: UTF-8 wide pipe yt-dlp slug (Khruangbin), 3-show franchise box-set with double season range and parens-wrapped tech (Deutschland 83-86-89, captures group=S03 misdetection), accented chars in title (Chérie BéBé with VFF), 8-word stand-up comedy title (Jimmy Carr), site-tag prefix + XviD (OxTorrent), episode title + air-date silently lost (Prodiges), full-chaos apostrophe + spaces + Blu-ray dash + 1080i + multi-word audio codec (The Prodigy, full AI-path degeneration), yt-dlp YouTube ID glued to year (Sleaford Mods), bilingual [FR-EN] tag mistaken for group (Super Mario Bros), COMPLETE + S01-S07 range + REPACK + HEVC (Gilmore Girls, the well-behaved exception).
NxNN alt season/episode form supported by parse_release. Releases like Show.1x05.720p.HDTV.x264-GRP and Show.2x07x08.1080p.WEB.x265-GRP (multi-ep alt form) now parse as TV shows.
alfred/knowledge/release/separators.yaml declares the token separators used by the release-name tokenizer (., space, [, ], (, ), _). New conventions can be added without code changes. The canonical . is always present even if missing from YAML.

Changed

Release parser v2 — SHITTY simplified to dict-driven tagging. The legacy ~480-line heuristic block in release/services.py is gone; pipeline._annotate_shitty does a single pass that looks each token up in the kb buckets (resolutions / sources / codecs / distributors / year / SxxExx) with first-match-wins semantics, and the leftmost contiguous UNKNOWN run becomes the title. annotate() no longer returns None — SHITTY is the always-on fallback when no group schema matches. services.py shrunk from ~525 to ~85 lines. Four fixtures (deutschland_franchise_box, sleaford_yt_slug, super_mario_bilingual, predator_space_separators — the last one moved from shitty/ → path_of_pain/) are now marked pytest.mark.xfail(strict=False) documenting PoP-grade pathologies that SHITTY intentionally won't handle. ReleaseFixture grows an xfail_reason field; the parametrized suite wires the xfail mark automatically.
parse_release tokenizer is now data-driven: it splits on any character listed in separators.yaml (regex character class) instead of name.split("."). This makes YTS-style releases (The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]), space-separated names (Inception 2010 1080p BluRay x264-GROUP), and underscore-separated names parse correctly via the direct path — no more fallback through sanitization.
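A simplified sketch of a separator-driven split built as a regex character class. The separator list is hard-coded here instead of loaded from separators.yaml, the function name is illustrative, and site-tag extraction (which runs first in the real flow) is left out.

```python
import re

# Stand-in for the contents of separators.yaml.
SEPARATORS = [".", " ", "[", "]", "(", ")", "_"]


def tokenize(name: str, separators: list[str] = SEPARATORS) -> list[str]:
    # The canonical "." is always included, even if the config omits it.
    chars = set(separators) | {"."}
    # Escape each character so "[" and "]" are safe inside the class.
    pattern = "[" + "".join(re.escape(c) for c in sorted(chars)) + "]+"
    # Runs of separators collapse into one split; empty tokens are dropped.
    return [token for token in re.split(pattern, name) if token]


yts = tokenize("The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]")
spaced = tokenize("Inception 2010 1080p BluRay x264-GROUP")
```

Note that `5.1` comes out as two tokens (`5`, `1`), which matches the consecutive-pair detection of channel layouts described in the enricher entry.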
parse_release flow simplified: site-tag extraction always runs first (so parse_path == "sanitized" now reliably indicates a stripped [tag]), then well-formedness is checked only against truly forbidden chars (anything not in the configured separator set).
ISO 639-2/B is now the canonical language code project-wide (was a mix of 639-1 and 639-2/T):
- SubtitlePreferences.languages default is now ["fre", "eng"] (was ["fr", "en"]). Old LTM files are not auto-migrated — delete data/memory/ltm.json to regenerate with the new defaults.
- Subtitle output filenames are now {iso639_2b}.srt (e.g. fre.srt, fre.sdh.srt). Existing fr.srt files are still read correctly (recognized as French via alias) but new files are written canonically.
- Language value object docstring corrected: it has always stored 639-2/B (matching what ffprobe emits), not 639-2/T as previously documented.
MovieService.validate_movie_file minimum size is now configurable via settings.min_movie_size_bytes (default unchanged: 100 MB). Constructor accepts an optional min_movie_size_bytes override for tests.
SubtitleKnowledgeBase delegates language lookup to LanguageRegistry rather than duplicating tokens. subtitles.yaml now only declares subtitle-specific tokens (e.g. vostfr, vf, vff) under a new language_tokens section.

Removed

alfred/domain/tv_shows/services.py and alfred/domain/movies/services.py deleted entirely. They held fossil parsers (parse_episode_filename, extract_movie_metadata, …) with zero production callers — superseded by parse_release as the single source of truth for release-name parsing. Associated tests (tests/domain/test_movies.py, tests/domain/test_tv_shows_service.py) removed as well.
_sanitize and _normalize helpers in alfred/domain/release/services.py — the new tokenizer makes them redundant.
_LANG_KEYWORDS, _SDH_TOKENS, _FORCED_TOKENS, SUBTITLE_EXTENSIONS hardcoded dicts in alfred/domain/subtitles/scanner.py — all knowledge now lives in YAML (CLAUDE.md compliance).
_MIN_MOVIE_SIZE_BYTES module-level constant in alfred/domain/movies/services.py — replaced by the new setting.
Top-level languages: block in subtitles.yaml — superseded by language_tokens: (subtitle-specific only) since iso_languages.yaml is the canonical source.

Fixed

hi token no longer marks a subtitle as SDH (it conflicted with the ISO 639-1 alias for Hindi). SDH is now detected only via sdh, cc, and hearing tokens.
SubtitleKnowledgeBase default rules used "fra" while iso_languages.yaml exposes French as "fre" — preferred languages defaults now match the canonical form.

Internal

Domain I/O extraction (refactor/domain-io-extraction): the domain layer no longer performs subprocess calls, filesystem scans, or YAML loading. Achieved in a series of focused commits:
- Knowledge YAML loaders moved to infrastructure: alfred/domain/release/knowledge.py, alfred/domain/shared/knowledge/language_registry.py, and alfred/domain/subtitles/knowledge/{base,loader}.py relocated to alfred/infrastructure/knowledge/. Re-exports were dropped — callers import directly from the new location.
- MediaProber and FilesystemScanner Protocol ports introduced at alfred/domain/shared/ports/ with frozen-dataclass DTOs (SubtitleStreamInfo, FileEntry). SubtitleIdentifier and PatternDetector are now constructor-injected with concrete adapters (FfprobeMediaProber wrapping subprocess.run(ffprobe) and PathlibFilesystemScanner wrapping pathlib). No more direct subprocess/pathlib usage from the subtitle domain services.
- Live filesystem methods removed from VOs and entities: FilePath.exists() / .is_file() / .is_dir() deleted — FilePath is now a pure address VO. Movie.has_file() and Episode.is_downloaded() dropped. Callers either rely on a prior detection step or use try/except over pre-checks (eliminates TOCTOU races).
- SubtitlePlacer moved to the application layer at alfred/application/subtitles/placer.py — it performs os.link I/O, which doesn't belong in the domain. Pre-checks replaced with try/except for FileNotFoundError/FileExistsError.
- SubtitleRuleSet.resolve() no longer reaches into the knowledge base: the implicit DEFAULT_RULES() helper is gone, replaced by an explicit default_rules: SubtitleMatchingRules parameter. The ManageSubtitles use case loads defaults from the KB once and passes them in.
- SubtitleKnowledge Protocol port at alfred/domain/subtitles/ports/knowledge.py declares the read-only query surface domain services consume (7 methods: known_extensions, format_for_extension, language_for_token, is_known_lang_token, type_for_token, is_known_type_token, patterns). SubtitleIdentifier and PatternDetector depend on this Protocol instead of the concrete SubtitleKnowledgeBase from infrastructure — domain/subtitles/ now has zero imports from infrastructure/. The remaining domain → infra leak (domain/release/ loading separator YAML at import-time) is documented in tech-debt and scheduled for its own branch.
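The try/except-over-pre-checks approach mentioned for SubtitlePlacer above can be sketched as follows. The function name and the string results are hypothetical; the point is only that the hard link is attempted directly and the outcome is read from the exception, so there is no window between a check and the action.

```python
import os
from pathlib import Path


def place(source: Path, destination: Path) -> str:
    try:
        # No exists() pre-check: attempt the link and react to the failure.
        os.link(source, destination)
    except FileNotFoundError:
        # Source (or the destination's parent directory) does not exist.
        return "not_found"
    except FileExistsError:
        # Something is already at the destination path.
        return "exists"
    return "placed"
```

Calling `place` twice with the same arguments returns `"placed"` and then `"exists"`, without the file state being inspected beforehand.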
to_dot_folder_name(title) helper in alfred/domain/shared/value_objects.py — extracts the re.sub(r"[^\w\s\.\-]", "", title).replace(" ", ".") pattern that was duplicated between MovieTitle.normalized() and TVShow.get_folder_name().
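For reference, the extracted pattern behaves like this. The body is the expression quoted in the entry; the sample titles are illustrative.

```python
import re


def to_dot_folder_name(title: str) -> str:
    # Drop everything except word characters, whitespace, dots and dashes,
    # then turn spaces into dots.
    return re.sub(r"[^\w\s\.\-]", "", title).replace(" ", ".")


simple = to_dot_folder_name("Spider-Man: No Way Home")
accented = to_dot_folder_name("Chérie, j'ai agrandi le bébé")
```

Since `\w` is Unicode-aware for `str` patterns, accented letters are kept while punctuation such as `:`, `,` and `'` is removed.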
ParsedRelease.languages uses field(default_factory=list) instead of a manual __post_init__ that assigned [] via object.__setattr__.
file_extensions.yaml splits subtitle sidecars (.srt, .sub, .idx, .ass, .ssa) into a dedicated subtitle: category instead of lumping them under metadata:. The _METADATA_EXTENSIONS set used by detect_media_type remains the union of both (same behavior: subtitles are still ignored when deciding the media type of a folder), but a new load_subtitle_extensions() loader is now available for the subtitles domain. Semantic clarity, no functional change.
tv_shows/entities.py module docstring now shows the aggregate ownership as an ASCII tree before the rule text — quicker visual scan of the DDD structure.
Removed backward-compat shims _sanitise_for_fs / _strip_episode_from_normalised from domain/release/value_objects.py (zero callers).
Cleaned ruff warnings across the codebase: subprocess.run calls now pass explicit check=False (PLW1510); lazy imports promoted to module top where there was no cycle (PLC0415 in manage_subtitles.py, placer.py, qbittorrent/client.py, file_manager.py); fixed module-level import ordering (E402) in language_registry.py and subtitles/knowledge/loader.py; removed unused locals (F841 / B007); replaced unnecessary set comprehension with set() in release/knowledge.py (C416).
Ruff config: ignore PLR0911 / PLR0912 (too-many-returns / too-many-branches) globally — noisy on parser mappers and orchestrator use-cases where early-return validation is essential complexity. Ignore PLW0603 for the documented memory singleton (infrastructure/persistence/context.py).
Release-knowledge DDD purification (refactor/domain-release-knowledge): the last domain → infrastructure leak (domain/release/value_objects.py loading YAML at import-time) is gone. Achieved via:
- ReleaseKnowledge Protocol port at alfred/domain/release/ports/knowledge.py declares the read-only query surface release parsing needs (token sets for resolutions, sources, codecs, languages, hdr extras; structured dicts for audio, video_meta, editions, media_type_tokens; separators list; file-extension sets used by application/infra callers; sanitize_for_fs(text) method).
- YamlReleaseKnowledge adapter at alfred/infrastructure/knowledge/release_kb.py loads every YAML constant once at construction. Builds an immutable str.maketrans translation table for filesystem sanitization.
- parse_release(name, kb) takes the knowledge as an explicit parameter — no more module-level YAML loading inside the domain. Every internal helper (_tokenize, _extract_tech, _extract_languages, _extract_audio, _extract_video_meta, _extract_edition, _extract_title, _infer_media_type, _is_well_formed) takes kb.
- ParsedRelease Option B: sanitization happens once at parse time and is stored on a new title_sanitized: str field. Builder methods (show_folder_name, season_folder_name, episode_filename, movie_folder_name, movie_filename) are now pure — they accept already-sanitized tmdb_title_safe / tmdb_episode_title_safe arguments. Callers at the use-case boundary sanitize TMDB strings via kb.sanitize_for_fs(...) before passing them in.
- All domain-knowledge constants removed from value_objects.py: _RESOLUTIONS, _SOURCES, _CODECS, _AUDIO, _VIDEO_META, _EDITIONS, _HDR_EXTRA, _MEDIA_TYPE_TOKENS, _LANGUAGE_TOKENS, _FORBIDDEN_CHARS, _VIDEO_EXTENSIONS, _NON_VIDEO_EXTENSIONS, _SUBTITLE_EXTENSIONS, _METADATA_EXTENSIONS, _WIN_FORBIDDEN_TABLE, and the _sanitize_for_fs helper. The domain module is now pure.
- Application-layer KB singleton: resolve_destination.py instantiates a module-level _KB: ReleaseKnowledge = YamlReleaseKnowledge() and threads it through every parse_release(...) call. The local _sanitize helper and _WIN_FORBIDDEN regex were dropped in favor of _KB.sanitize_for_fs(...).
- detect_media_type(parsed, source_path, kb) and find_video_file(path, kb) now take the knowledge explicitly instead of importing _*_EXTENSIONS constants from the domain. agent/tools/filesystem.py::analyze_release imports the application KB singleton and passes it through.
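The `str.maketrans` table mentioned for the YamlReleaseKnowledge adapter can be illustrated with a small stand-alone sketch. The forbidden character set below is an assumption (the usual Windows-reserved characters); the real adapter builds its table from YAML.

```python
# Assumed set of characters that are not allowed in file names.
FORBIDDEN_CHARS = '<>:"/\\|?*'

# Three-argument form: every character in the third string maps to None,
# i.e. it is deleted by str.translate. The table is built once and reused.
_TABLE = str.maketrans("", "", FORBIDDEN_CHARS)


def sanitize_for_fs(text: str) -> str:
    return text.translate(_TABLE)


safe_title = sanitize_for_fs("Face/Off: 1997")
```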

[2026-05-17] — TVShow & Movie aggregate refactor

Multi-phase overhaul of the TV show domain into a real DDD aggregate, with matching parity work on Movie, a language knowledge system, and the shared/media restructure that supports both.

Added

Language knowledge system (alfred/knowledge/iso_languages.yaml + 42 languages including und for undetermined).
- Language value object (frozen dataclass) with iso, english_name, native_name, aliases, and a matches(raw) cross-format helper.
- LanguageRegistry loader (alfred/domain/shared/knowledge/) merging builtin + learned YAML. Not a singleton — the application layer instantiates it.
- ISO 639-2/B is the canonical key; aliases cover 639-1, 639-2/T, English name, native name, and common spellings.
VideoTrack dataclass (alfred/domain/shared/media/video.py) with a resolution property using width-priority bucket detection (handles cinema/scope crops like 1920×960 → 1080p).
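One way to read "width-priority bucket detection" is sketched below: the width decides the bucket whenever it is known, and the height is only consulted as a fallback. The bucket thresholds and the fallback rule are assumptions, not the values used in Alfred.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical buckets, highest first: (minimum width, minimum height, label).
_BUCKETS = (
    (3840, 2160, "2160p"),
    (1920, 1080, "1080p"),
    (1280, 720, "720p"),
    (720, 480, "480p"),
)


@dataclass(frozen=True)
class VideoTrack:
    width: int | None = None
    height: int | None = None

    @property
    def resolution(self) -> str | None:
        if self.width is not None:
            # Width wins: a scope crop keeps the full width but loses height.
            for min_width, _, label in _BUCKETS:
                if self.width >= min_width:
                    return label
            return None
        if self.height is not None:
            for _, min_height, label in _BUCKETS:
                if self.height >= min_height:
                    return label
        return None


cropped = VideoTrack(width=1920, height=960)
```

With these assumed buckets, the 1920×960 crop from the entry above lands in `1080p` even though its height is below 1080.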
shared/media/matching.py — track_lang_matches helper shared by Episode and Movie. Implements the "C+" contract for language helpers:
- Language query → cross-format match via Language.matches()
- str query → case-insensitive direct comparison (no normalization)
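The two branches of the "C+" contract can be sketched as below. The Language stand-in is reduced to `iso` and `aliases`, and the helper's signature is an assumption; only the Language-versus-str behavior comes from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    iso: str
    aliases: tuple[str, ...] = ()

    def matches(self, raw: str) -> bool:
        needle = raw.strip().lower()
        return needle == self.iso or needle in self.aliases


def track_lang_matches(track_language: str | None, query: Language | str) -> bool:
    if track_language is None:
        return False
    if isinstance(query, Language):
        # Cross-format: "fr", "fra" and "fre" all resolve to the same language.
        return query.matches(track_language)
    # Plain string: case-insensitive comparison only, no normalization.
    return track_language.lower() == query.lower()


french = Language(iso="fre", aliases=("fr", "fra", "french"))
```

So `track_lang_matches("fra", french)` is true, while `track_lang_matches("fra", "fre")` is false because a plain string query is compared literally.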
TVShow aggregate composition:
- TVShow.seasons: dict[SeasonNumber, Season]
- Season.episodes: dict[EpisodeNumber, Episode]
- Season.expected_episodes / Season.aired_episodes (split so collection state can compare "owned vs aired today" without confusing in-flight seasons with future ones)
Aggregate methods on TVShow:
- add_episode(ep) — sole sanctioned mutation entry point (creates the season if missing)
- add_season(season) — replaces a season wholesale
- collection_status() → CollectionStatus.{EMPTY, PARTIAL, COMPLETE}
- is_complete_series() — true iff ENDED + COMPLETE
- missing_episodes() — flat list of all aired-but-not-owned (season, episode) pairs
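A reduced sketch of how these aggregate methods could fit together. Plain ints stand in for SeasonNumber / EpisodeNumber, Episode carries only its coordinates, and the exact rules for EMPTY / PARTIAL / COMPLETE are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class CollectionStatus(Enum):
    EMPTY = "empty"
    PARTIAL = "partial"
    COMPLETE = "complete"


@dataclass(frozen=True)
class Episode:
    season: int
    number: int


@dataclass
class Season:
    number: int
    aired_episodes: int = 0
    episodes: dict[int, Episode] = field(default_factory=dict)


@dataclass
class TVShow:
    seasons: dict[int, Season] = field(default_factory=dict)

    def add_season(self, season: Season) -> None:
        self.seasons[season.number] = season  # replaces wholesale

    def add_episode(self, ep: Episode) -> None:
        # Creates the season on the fly if it does not exist yet.
        season = self.seasons.setdefault(ep.season, Season(ep.season))
        season.episodes[ep.number] = ep

    def missing_episodes(self) -> list[tuple[int, int]]:
        return [
            (season.number, n)
            for season in sorted(self.seasons.values(), key=lambda s: s.number)
            for n in range(1, season.aired_episodes + 1)
            if n not in season.episodes
        ]

    def collection_status(self) -> CollectionStatus:
        owned = sum(len(s.episodes) for s in self.seasons.values())
        if owned == 0:
            return CollectionStatus.EMPTY
        if self.missing_episodes():
            return CollectionStatus.PARTIAL
        return CollectionStatus.COMPLETE


show = TVShow()
show.add_season(Season(number=1, aired_episodes=3))
show.add_episode(Episode(season=1, number=1))
show.add_episode(Episode(season=1, number=3))
```

At this point `show.missing_episodes()` is `[(1, 2)]` and the status is PARTIAL; adding episode 2 flips it to COMPLETE.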
CollectionStatus enum (orthogonal to ShowStatus).
Episode track helpers (has_audio_in, has_subtitles_in, has_forced_subs, audio_languages, subtitle_languages), driven by Episode.audio_tracks / Episode.subtitle_tracks.
Movie aggregate parity — Movie now carries audio_tracks / subtitle_tracks and exposes the same helpers as Episode (same C+ contract).
CHANGELOG.md (this file).

Changed

shared/media_info.py exploded into shared/media/{audio,video,subtitle,info,matching}.py. MediaInfo is now symmetric: every stream type is a list[Track]. Flat accessors (width, height, video_codec, resolution) remain as properties that read the first video track.
MediaInfo.duration_seconds / bitrate_kbps moved from VideoTrack to MediaInfo (file-level — they come from the ffprobe format block, not a stream). Files without a video stream now correctly expose duration.
ShowStatus.from_string extended to map TMDB strings (Returning Series, In Production, Pilot, Planned, Canceled, Cancelled). Comparison is whitespace-trimmed and case-insensitive.
Season / Episode dropped their show_imdb_id back-references. They are owned by TVShow and reached only through it.
TVShow.seasons_count and episode_count are now @property (computed from the dict) instead of stored ints.
TVShowService.parse_episode_from_filename rewritten in string operations (no regex). Supports S01E05 / s1e5 and 1x05 / 01x5 forms.
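A string-operations take on the two supported forms might look like the sketch below. It works on a single token rather than a whole filename, and the function name is hypothetical.

```python
from __future__ import annotations


def parse_episode_token(token: str) -> tuple[int, int] | None:
    """Return (season, episode) for S01E05 / s1e5 / 1x05 / 01x5, else None."""
    lowered = token.lower()
    if lowered.startswith("s") and "e" in lowered:
        # "s01e05" -> season "01", episode "05"
        season, _, episode = lowered[1:].partition("e")
    elif "x" in lowered:
        # "1x05" -> season "1", episode "05"
        season, _, episode = lowered.partition("x")
    else:
        return None
    # Both halves must be plain ASCII digits; this rejects "x264" or "sense8".
    for part in (season, episode):
        if not (part.isascii() and part.isdigit()):
            return None
    return int(season), int(episode)
```

Tokens such as `x264` or `1080p` fall through to `None` because one side of the split is not purely numeric.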
TVShowService.find_next_episode now drives off show.missing_episodes() instead of the hardcoded "max 50 episodes per season" heuristic.
TVShowService constructor no longer takes season_repository / episode_repository — the aggregate persists in one block via TVShowRepository only.
SubtitleTrack in alfred.domain.subtitles.entities renamed to SubtitleCandidate. Coexists with the shared.media.SubtitleTrack ffprobe-view dataclass (different bounded contexts, kept separate intentionally).
tv_shows/services.py _VIDEO_EXTENSIONS now loaded from knowledge/release/file_extensions.yaml via load_video_extensions() (single source of truth).
CLAUDE.md updated with three new policy sections:
- "Tests" — small updates OK during normal work, no mass-update sprees
- "Backwards-compatibility shims" — prefer clean migration over shims
- "Regex" — not forbidden, use judgment when string ops would be fragile

Removed

Legacy Season N Episode N filename form in TVShowService.parse_episode_from_filename. It never appears in the release names Alfred handles, and supporting it forced a regex.
SeasonRepository and EpisodeRepository — only the aggregate root has a repository (DDD rule: one repo per aggregate).
shared/media_info.py compatibility shim — callers updated.
SubtitleTrack compatibility alias in subtitles.entities — callers updated to SubtitleCandidate.

Fixed

MediaInfo.duration_seconds returns None on audio-only files instead of crashing through primary_video.duration_seconds (see the duration/bitrate move under Changed).
MediaOrganizer (infrastructure/filesystem/organizer.py) no longer passes the removed show_imdb_id / episode_count kwargs when constructing a Season for folder-name generation.

Internal

Test suite rewritten where the aggregate redesign broke fixtures: tests/domain/test_tv_shows.py (69 tests), tests/domain/test_media_info.py (rewritten for VideoTrack), tests/application/test_enrich_from_probe.py (helper added), tests/infrastructure/test_filesystem_extras.py (fixtures), tests/domain/test_tv_shows_service.py (find_next_episode driven by real aggregate state).
Subtitle services internal migration: matcher.py, utils.py, placer.py, identifier.py updated to import SubtitleCandidate.
Suite status at end of block: 1066 passed, 8 skipped, 0 failed.
