Changelog

All notable changes to Alfred are documented here.

The format is loosely based on Keep a Changelog. Alfred is not yet on SemVer — entries are grouped by dated work blocks instead of release numbers. Granularity targets behavioral or API-visible changes; refer to git log for commit-level detail.

Sections used per block: Added / Changed / Deprecated / Removed / Fixed / Internal (for tech-debt and refactor noise that doesn't affect callers).

[Unreleased]

Added

Release parser v2 — EASY path live (alfred/domain/release/parser/): new annotate-based pipeline (tokenize → annotate → assemble) drives releases from known groups. Exposes Token (frozen VO with index + role + extra), TokenRole enum (structural/technical/meta families), and GroupSchema / SchemaChunk value objects.
- pipeline.tokenize: string-ops separator split (no regex), strips a [site.tag] prefix/suffix first.
- pipeline.annotate: detects the trailing group right-to-left (priority to codec-GROUP shape, fallback to any non-source dashed token), looks up its GroupSchema, then walks tokens and schema chunks in lockstep — optional chunks that don't match are skipped, mandatory mismatches abort EASY and return None so the caller can fall back to SHITTY.
- pipeline.assemble: folds annotated tokens into a ParsedRelease-compatible dict.
- parse_release (in release.services) tries the v2 EASY path first and falls through to the legacy SHITTY heuristic on None. Legacy SHITTY/PATH OF PAIN behavior is unchanged.
- Knowledge: alfred/knowledge/release/release_groups/{kontrast,elite,rarbg}.yaml declare the canonical chunk order per group, loaded via the new ReleaseKnowledge.group_schema(name) port method.
- Tests in tests/domain/release/test_parser_v2_{scaffolding,easy}.py cover token VOs, site-tag stripping, group detection, schema-driven annotation (movie, TV episode, season pack with optional source), and field assembly.
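The lockstep walk described for pipeline.annotate can be sketched roughly as follows. This is only an illustration under stated assumptions: Chunk, walk_schema, SCHEMA, and the matcher lambdas are hypothetical names, title and group handling are left out, and the real GroupSchema / SchemaChunk shapes may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for a schema chunk: a role plus a matcher."""
    role: str
    matches: Callable[[str], bool]
    optional: bool = False


def walk_schema(tokens: list[str], chunks: list[Chunk]) -> Optional[list[tuple[str, str]]]:
    """Pair tokens with chunks in order; return None when EASY must give up."""
    annotated: list[tuple[str, str]] = []
    i = 0
    for chunk in chunks:
        if i < len(tokens) and chunk.matches(tokens[i]):
            annotated.append((tokens[i], chunk.role))
            i += 1
        elif not chunk.optional:
            return None  # mandatory mismatch: the caller falls back to SHITTY
        # An optional chunk that does not match is skipped; the token stays put.
    if i != len(tokens):
        return None  # leftover tokens also abort in this simplified sketch
    return annotated


# Assumed chunk order for an imaginary group: resolution, [source], codec.
SCHEMA = [
    Chunk("resolution", lambda t: t.lower() in {"720p", "1080p", "2160p"}),
    Chunk("source", lambda t: t.lower() in {"web", "bluray"}, optional=True),
    Chunk("codec", lambda t: t.lower() in {"x264", "x265"}),
]
```

With this schema, a token list missing the optional source still annotates, while one missing the mandatory resolution returns None.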
Release parser v2 — enricher pass completes the EASY pipeline. The structural schema walk now tolerates non-positional tokens between chunks (instead of aborting on leftover tokens), and a second pass tags them with audio / video-meta / edition / language roles. Multi-token sequences from audio.yaml, video.yaml, editions.yaml (e.g. DTS.HD.MA, DV.HDR10, TrueHD.Atmos, DIRECTORS.CUT) are matched before single tokens. Channel layouts like 5.1 and 7.1 (split into two tokens by the . separator) are detected as consecutive pairs. Sequence members carry an extra["sequence_member"] marker so assemble extracts the canonical value only from the primary token. KONTRAST releases with audio / HDR / edition / language metadata now produce a fully populated ParsedRelease.
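A minimal sketch of the "sequences before single tokens" rule from the enricher entry above. The SEQUENCES table, role names, and joined values are assumptions for illustration; the real data comes from audio.yaml, video.yaml, and editions.yaml, and canonical values may not be a plain dot-join.

```python
from typing import Optional


def tag_sequences(tokens: list[str], sequences: dict[tuple[str, ...], str]) -> list[Optional[dict]]:
    """Tag known multi-token runs, trying the longest candidates first."""
    tags: list[Optional[dict]] = [None] * len(tokens)
    ordered = sorted(sequences, key=len, reverse=True)  # longest match wins
    i = 0
    while i < len(tokens):
        for seq in ordered:
            window = tuple(t.upper() for t in tokens[i:i + len(seq)])
            if window == seq:
                role = sequences[seq]
                # The primary token carries the value for the whole run...
                tags[i] = {"role": role, "value": ".".join(tokens[i:i + len(seq)])}
                # ...the others are only flagged as members of the sequence.
                for j in range(i + 1, i + len(seq)):
                    tags[j] = {"role": role, "sequence_member": True}
                i += len(seq)
                break
        else:
            i += 1  # no sequence starts here; leave the token untagged
    return tags


# Hypothetical table; channel layouts are modelled as two-token sequences.
SEQUENCES = {
    ("DTS", "HD", "MA"): "audio_codec",
    ("DTS",): "audio_codec",
    ("5", "1"): "audio_channels",
    ("7", "1"): "audio_channels",
}
```

Because only the primary token holds a value, an assemble step can ignore every entry marked sequence_member and never double-count a run.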
Streaming distributor as a separate dimension from encoding source. New alfred/knowledge/release/distributors.yaml (NF, AMZN, DSNP, HMAX, ATVP, HULU, PCOK, PMTP, CR) feeds a new ReleaseKnowledge.distributors port field, a TokenRole.DISTRIBUTOR annotation, and a ParsedRelease.distributor field. WEB-DL stays the source; the platform that produced the release is now recorded distinctly. The five entries (NF, AMZN, DSNP, HMAX, ATVP) were correspondingly removed from sources.yaml.
Real-world release fixtures under tests/fixtures/releases/{easy,shitty,path_of_pain}/, each documenting an expected ParsedRelease plus the future routing (library / torrents / seed_hardlinks) for the upcoming organize_media refactor. tests/domain/test_release_fixtures.py is parametrized over these fixtures for anti-regression.
- EASY bucket seeded with 5 cases (movie, single-episode, season pack, movie + noise, YTS bracket-heavy).
- SHITTY bucket seeded with 15 anti-regression cases covering: 3-level INTEGRALE hierarchy (Angel), French custom titles (Buffy, La Nuit au Musée, Chérie j'ai agrandi), multi-episode chain S14E09E10E11 (Archer, captures E11 loss), lowercase s01e01 (Notre Planète), NxNN with - separators (Vinyl, captures dash artifact), title-with-year-suffix (Deutschland.83), season-range S01-06 (Tatortreiniger, captures movie misclassification), bare folder name (Jurassic Park, media_type=unknown), apostrophe-in-name (Honey Don't, captures full AI-path degeneration), SUBS-tag movie (Hook), space separators (Predator Badlands, captures group=UNKNOWN), subs-only release (Westworld S04).
- PATH OF PAIN bucket seeded with 10 worst-case fixtures covering: UTF-8 wide pipe yt-dlp slug (Khruangbin), 3-show franchise box-set with double season range and parens-wrapped tech (Deutschland 83-86-89, captures group=S03 misdetection), accented chars in title (Chérie BéBé with VFF), 8-word stand-up comedy title (Jimmy Carr), site-tag prefix + XviD (OxTorrent), episode title + air-date silently lost (Prodiges), full-chaos apostrophe + spaces + Blu-ray dash + 1080i + multi-word audio codec (The Prodigy, full AI-path degeneration), yt-dlp YouTube ID glued to year (Sleaford Mods), bilingual [FR-EN] tag mistaken for group (Super Mario Bros), COMPLETE + S01-S07 range + REPACK + HEVC (Gilmore Girls, the well-behaved exception).
NxNN alt season/episode form supported by parse_release. Releases like Show.1x05.720p.HDTV.x264-GRP and Show.2x07x08.1080p.WEB.x265-GRP (multi-ep alt form) now parse as TV shows.
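For illustration, the NxNN form can be recognised with plain string operations along these lines. parse_nxnn is a hypothetical helper, and the digit-length caps are an assumption added here to keep resolution-like tokens out; Alfred's actual rules may differ.

```python
from typing import Optional


def parse_nxnn(token: str) -> Optional[tuple[int, list[int]]]:
    """Parse '1x05' or '2x07x08' into (season, [episodes]); None otherwise."""
    parts = token.lower().split("x")
    if len(parts) < 2:
        return None
    # Every piece must be ASCII digits: this rejects 'x264' (empty first
    # piece) and '1080p' (no 'x' at all, so a single piece).
    if not all(p.isascii() and p.isdigit() and len(p) <= 3 for p in parts):
        return None
    if len(parts[0]) > 2:
        return None  # assumed cap: seasons have at most two digits
    season, *episodes = (int(p) for p in parts)
    return season, episodes
```

The two release names quoted above would yield (1, [5]) and (2, [7, 8]) for their NxNN tokens.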
alfred/knowledge/release/separators.yaml declares the token separators used by the release-name tokenizer (., space, [, ], (, ), _). New conventions can be added without code changes. The canonical . is always present even if missing from YAML.

Changed

Release parser v2 — SHITTY simplified to dict-driven tagging. The legacy ~480-line heuristic block in release/services.py is gone; pipeline._annotate_shitty does a single pass that looks each token up in the kb buckets (resolutions / sources / codecs / distributors / year / SxxExx) with first-match-wins semantics, and the leftmost contiguous UNKNOWN run becomes the title. annotate() no longer returns None — SHITTY is the always-on fallback when no group schema matches. services.py shrunk from ~525 to ~85 lines. Four fixtures (deutschland_franchise_box, sleaford_yt_slug, super_mario_bilingual, predator_space_separators — the last one moved from shitty/ → path_of_pain/) are now marked pytest.mark.xfail(strict=False) documenting PoP-grade pathologies that SHITTY intentionally won't handle. ReleaseFixture grows an xfail_reason field; the parametrized suite wires the xfail mark automatically.
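The single-pass tagging with first-match-wins semantics might look roughly like this. annotate_fallback, MATCHERS, and the bucket contents are hypothetical; the real pipeline._annotate_shitty reads its buckets from the knowledge base and also handles SxxExx tokens.

```python
from typing import Callable

Matcher = Callable[[str], bool]


def annotate_fallback(tokens: list[str], matchers: list[tuple[str, Matcher]]) -> list[tuple[str, str]]:
    """Tag each token with the first matching role, then carve out the title."""
    roles = []
    for tok in tokens:
        # First-match-wins: the order of `matchers` is the priority order.
        role = next((name for name, match in matchers if match(tok)), "unknown")
        roles.append(role)

    # The leftmost contiguous run of unknown tokens becomes the title.
    start = next((i for i, r in enumerate(roles) if r == "unknown"), len(roles))
    end = start
    while end < len(roles) and roles[end] == "unknown":
        roles[end] = "title"
        end += 1
    return list(zip(tokens, roles))


def is_year(tok: str) -> bool:
    return len(tok) == 4 and tok.isdigit() and 1900 <= int(tok) <= 2100


# Assumed buckets, in priority order.
MATCHERS: list[tuple[str, Matcher]] = [
    ("resolution", lambda t: t.lower() in {"720p", "1080p", "2160p"}),
    ("source", lambda t: t.lower() in {"bluray", "web-dl", "webrip"}),
    ("codec", lambda t: t.lower() in {"x264", "x265", "hevc"}),
    ("year", is_year),
]
```

Unknown tokens that appear after the first known token stay unknown, which is why only the leading run is promoted to the title.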
parse_release tokenizer is now data-driven: it splits on any character listed in separators.yaml (regex character class) instead of name.split("."). This makes YTS-style releases (The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]), space-separated names (Inception 2010 1080p BluRay x264-GROUP), and underscore-separated names parse correctly via the direct path — no more fallback through sanitization.
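A small sketch of a separator-driven split using a regex character class. build_splitter and tokenize are hypothetical names, and the separator list is passed in directly here rather than loaded from separators.yaml; single-character separators are assumed.

```python
import re


def build_splitter(separators: list[str]) -> "re.Pattern[str]":
    """Compile a character class from the configured separators."""
    chars = set(separators) | {"."}  # the canonical '.' is always present
    char_class = "".join(re.escape(c) for c in sorted(chars))
    return re.compile(f"[{char_class}]+")


def tokenize(name: str, splitter: "re.Pattern[str]") -> list[str]:
    """Split on any run of separators and drop empty pieces."""
    return [tok for tok in splitter.split(name) if tok]


SPLITTER = build_splitter([" ", "[", "]", "(", ")", "_"])
```

Escaping each character with re.escape keeps brackets and parentheses literal inside the class, so adding a separator to the list never requires touching the pattern by hand.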
parse_release flow simplified: site-tag extraction always runs first (so parse_path == "sanitized" now reliably indicates a stripped [tag]), then well-formedness is checked only against truly forbidden chars (anything not in the configured separator set).
ISO 639-2/B is now the canonical language code project-wide (was a mix of 639-1 and 639-2/T):
- SubtitlePreferences.languages default is now ["fre", "eng"] (was ["fr", "en"]). Old LTM files are not auto-migrated — delete data/memory/ltm.json to regenerate with the new defaults.
- Subtitle output filenames are now {iso639_2b}.srt (e.g. fre.srt, fre.sdh.srt). Existing fr.srt files are still read correctly (recognized as French via alias) but new files are written canonically.
- Language value object docstring corrected: it has always stored 639-2/B (matching what ffprobe emits), not 639-2/T as previously documented.
MovieService.validate_movie_file minimum size is now configurable via settings.min_movie_size_bytes (default unchanged: 100 MB). Constructor accepts an optional min_movie_size_bytes override for tests.
SubtitleKnowledgeBase delegates language lookup to LanguageRegistry rather than duplicating tokens. subtitles.yaml now only declares subtitle-specific tokens (e.g. vostfr, vf, vff) under a new language_tokens section.

Removed

alfred/domain/tv_shows/services.py and alfred/domain/movies/services.py deleted entirely. They held fossil parsers (parse_episode_filename, extract_movie_metadata, …) with zero production callers — superseded by parse_release as the single source of truth for release-name parsing. Associated tests (tests/domain/test_movies.py, tests/domain/test_tv_shows_service.py) removed as well.
_sanitize and _normalize helpers in alfred/domain/release/services.py — the new tokenizer makes them redundant.
_LANG_KEYWORDS, _SDH_TOKENS, _FORCED_TOKENS, SUBTITLE_EXTENSIONS hardcoded dicts in alfred/domain/subtitles/scanner.py — all knowledge now lives in YAML (CLAUDE.md compliance).
_MIN_MOVIE_SIZE_BYTES module-level constant in alfred/domain/movies/services.py — replaced by the new setting.
Top-level languages: block in subtitles.yaml — superseded by language_tokens: (subtitle-specific only) since iso_languages.yaml is the canonical source.

Fixed

hi token no longer marks a subtitle as SDH (it conflicted with the ISO 639-1 alias for Hindi). SDH is now detected only via sdh, cc, and hearing tokens.
SubtitleKnowledgeBase default rules used "fra" while iso_languages.yaml exposes French as "fre" — preferred languages defaults now match the canonical form.

Internal

Domain I/O extraction (refactor/domain-io-extraction): the domain layer no longer performs subprocess calls, filesystem scans, or YAML loading. Achieved in a series of focused commits:
- Knowledge YAML loaders moved to infrastructure: alfred/domain/release/knowledge.py, alfred/domain/shared/knowledge/language_registry.py, and alfred/domain/subtitles/knowledge/{base,loader}.py relocated to alfred/infrastructure/knowledge/. Re-exports were dropped — callers import directly from the new location.
- MediaProber and FilesystemScanner Protocol ports introduced at alfred/domain/shared/ports/ with frozen-dataclass DTOs (SubtitleStreamInfo, FileEntry). SubtitleIdentifier and PatternDetector are now constructor-injected with concrete adapters (FfprobeMediaProber wrapping subprocess.run(ffprobe) and PathlibFilesystemScanner wrapping pathlib). No more direct subprocess/pathlib usage from the subtitle domain services.
- Live filesystem methods removed from VOs and entities: FilePath.exists() / .is_file() / .is_dir() deleted — FilePath is now a pure address VO. Movie.has_file() and Episode.is_downloaded() dropped. Callers either rely on a prior detection step or use try/except over pre-checks (eliminates TOCTOU races).
- SubtitlePlacer moved to the application layer at alfred/application/subtitles/placer.py — it performs os.link I/O, which doesn't belong in the domain. Pre-checks replaced with try/except for FileNotFoundError/FileExistsError.
- SubtitleRuleSet.resolve() no longer reaches into the knowledge base: the implicit DEFAULT_RULES() helper is gone, replaced by an explicit default_rules: SubtitleMatchingRules parameter. The ManageSubtitles use case loads defaults from the KB once and passes them in.
- SubtitleKnowledge Protocol port at alfred/domain/subtitles/ports/knowledge.py declares the read-only query surface domain services consume (7 methods: known_extensions, format_for_extension, language_for_token, is_known_lang_token, type_for_token, is_known_type_token, patterns). SubtitleIdentifier and PatternDetector depend on this Protocol instead of the concrete SubtitleKnowledgeBase from infrastructure — domain/subtitles/ now has zero imports from infrastructure/. The remaining domain → infra leak (domain/release/ loading separator YAML at import-time) is documented in tech-debt and scheduled for its own branch.
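The "try/except over pre-checks" pattern mentioned for SubtitlePlacer can be illustrated with a hard link. place_hardlink and its return strings are hypothetical; only os.link and the two exception types come from the entries above.

```python
import os
from pathlib import Path


def place_hardlink(source: Path, destination: Path) -> str:
    """Hard-link source to destination and report what happened."""
    try:
        # No exists() check beforehand: the state could change between the
        # check and the call, so the call itself is the check.
        os.link(source, destination)
    except FileExistsError:
        return "already_exists"
    except FileNotFoundError:
        # Raised when the source is gone (or the destination folder is missing).
        return "not_found"
    return "linked"
```

Letting the operating system report the failure keeps the check and the action atomic from the caller's point of view, which is what removes the TOCTOU window.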
to_dot_folder_name(title) helper in alfred/domain/shared/value_objects.py — extracts the re.sub(r"[^\w\s\.\-]", "", title).replace(" ", ".") pattern that was duplicated between MovieTitle.normalized() and TVShow.get_folder_name().
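As a quick illustration, the extracted pattern behaves like this when wrapped in a function. The body is the expression quoted above; the docstring and sample titles are added here.

```python
import re


def to_dot_folder_name(title: str) -> str:
    """Drop characters outside word/space/dot/dash, then turn spaces into dots."""
    return re.sub(r"[^\w\s\.\-]", "", title).replace(" ", ".")
```

Since \w is Unicode-aware for str patterns, accented letters survive: "La Nuit au Musée" keeps its é, while the apostrophe in "Honey Don't" is dropped.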
ParsedRelease.languages uses field(default_factory=list) instead of a manual __post_init__ that assigned [] via object.__setattr__.
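A reduced sketch of the default_factory form. ParsedReleaseSketch is a hypothetical two-field stand-in, and it assumes the real dataclass is frozen (which would explain the earlier object.__setattr__ workaround).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ParsedReleaseSketch:
    title: str
    # Each instance gets its own fresh list; no __post_init__ needed.
    languages: list[str] = field(default_factory=list)
```

A bare `languages: list[str] = []` would be rejected by dataclasses as a mutable default, so default_factory is the idiomatic replacement for the manual assignment.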
file_extensions.yaml splits subtitle sidecars (.srt, .sub, .idx, .ass, .ssa) into a dedicated subtitle: category instead of lumping them under metadata:. The _METADATA_EXTENSIONS set used by detect_media_type remains the union of both (same behavior: subtitles are still ignored when deciding the media type of a folder), but a new load_subtitle_extensions() loader is now available for the subtitles domain. Semantic clarity, no functional change.
tv_shows/entities.py module docstring now shows the aggregate ownership as an ASCII tree before the rule text — quicker visual scan of the DDD structure.
Removed backward-compat shims _sanitise_for_fs / _strip_episode_from_normalised from domain/release/value_objects.py (zero callers).
Cleaned ruff warnings across the codebase: subprocess.run calls now pass explicit check=False (PLW1510); lazy imports promoted to module top where there was no cycle (PLC0415 in manage_subtitles.py, placer.py, qbittorrent/client.py, file_manager.py); fixed module-level import ordering (E402) in language_registry.py and subtitles/knowledge/loader.py; removed unused locals (F841 / B007); replaced unnecessary set comprehension with set() in release/knowledge.py (C416).
Ruff config: ignore PLR0911 / PLR0912 (too-many-returns / too-many-branches) globally — noisy on parser mappers and orchestrator use-cases where early-return validation is essential complexity. Ignore PLW0603 for the documented memory singleton (infrastructure/persistence/context.py).
Release-knowledge DDD purification (refactor/domain-release-knowledge): the last domain → infrastructure leak (domain/release/value_objects.py loading YAML at import-time) is gone. Achieved via:
- ReleaseKnowledge Protocol port at alfred/domain/release/ports/knowledge.py declares the read-only query surface release parsing needs (token sets for resolutions, sources, codecs, languages, hdr extras; structured dicts for audio, video_meta, editions, media_type_tokens; separators list; file-extension sets used by application/infra callers; sanitize_for_fs(text) method).
- YamlReleaseKnowledge adapter at alfred/infrastructure/knowledge/release_kb.py loads every YAML constant once at construction. Builds an immutable str.maketrans translation table for filesystem sanitization.
- parse_release(name, kb) takes the knowledge as an explicit parameter — no more module-level YAML loading inside the domain. Every internal helper (_tokenize, _extract_tech, _extract_languages, _extract_audio, _extract_video_meta, _extract_edition, _extract_title, _infer_media_type, _is_well_formed) takes kb.
- ParsedRelease Option B: sanitization happens once at parse time and is stored on a new title_sanitized: str field. Builder methods (show_folder_name, season_folder_name, episode_filename, movie_folder_name, movie_filename) are now pure — they accept already-sanitized tmdb_title_safe / tmdb_episode_title_safe arguments. Callers at the use-case boundary sanitize TMDB strings via kb.sanitize_for_fs(...) before passing them in.
- All domain-knowledge constants removed from value_objects.py: _RESOLUTIONS, _SOURCES, _CODECS, _AUDIO, _VIDEO_META, _EDITIONS, _HDR_EXTRA, _MEDIA_TYPE_TOKENS, _LANGUAGE_TOKENS, _FORBIDDEN_CHARS, _VIDEO_EXTENSIONS, _NON_VIDEO_EXTENSIONS, _SUBTITLE_EXTENSIONS, _METADATA_EXTENSIONS, _WIN_FORBIDDEN_TABLE, and the _sanitize_for_fs helper. The domain module is now pure.
- Application-layer KB singleton: resolve_destination.py instantiates a module-level _KB: ReleaseKnowledge = YamlReleaseKnowledge() and threads it through every parse_release(...) call. The local _sanitize helper and _WIN_FORBIDDEN regex were dropped in favor of _KB.sanitize_for_fs(...).
- detect_media_type(parsed, source_path, kb) and find_video_file(path, kb) now take the knowledge explicitly instead of importing _*_EXTENSIONS constants from the domain. agent/tools/filesystem.py::analyze_release imports the application KB singleton and passes it through.
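The port-and-adapter shape described in this block can be sketched as below. ReleaseKnowledgePort, InMemoryReleaseKnowledge, and find_resolution are hypothetical and cover only a small slice of the surface; the forbidden-character set and the choice to delete (rather than replace) characters are assumptions.

```python
from typing import Optional, Protocol


class ReleaseKnowledgePort(Protocol):
    """Hypothetical slice of the read-only surface the domain depends on."""
    resolutions: frozenset[str]

    def sanitize_for_fs(self, text: str) -> str: ...


class InMemoryReleaseKnowledge:
    """Stand-in adapter; a YAML-backed one would load these values instead."""

    def __init__(self, resolutions: set[str], forbidden: str = '<>:"/\\|?*') -> None:
        self.resolutions = frozenset(resolutions)
        # Built once at construction; maps each forbidden char to "delete".
        self._table = str.maketrans("", "", forbidden)

    def sanitize_for_fs(self, text: str) -> str:
        return text.translate(self._table)


def find_resolution(tokens: list[str], kb: ReleaseKnowledgePort) -> Optional[str]:
    """Domain-style helper: knowledge arrives as a parameter, never an import."""
    return next((t for t in tokens if t.lower() in kb.resolutions), None)
```

Because the domain function only sees the Protocol, tests can pass a tiny in-memory object while production code passes the YAML-backed adapter.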

[2026-05-17] — TVShow & Movie aggregate refactor

Multi-phase overhaul of the TV show domain into a real DDD aggregate, with matching parity work on Movie, a language knowledge system, and the shared/media restructure that supports both.

Added

Language knowledge system (alfred/knowledge/iso_languages.yaml, 42 languages including und for undetermined).
- Language value object (frozen dataclass) with iso, english_name, native_name, aliases, and a matches(raw) cross-format helper.
- LanguageRegistry loader (alfred/domain/shared/knowledge/) merging builtin + learned YAML. Not a singleton — the application layer instantiates it.
- ISO 639-2/B is the canonical key; aliases cover 639-1, 639-2/T, English name, native name, and common spellings.
VideoTrack dataclass (alfred/domain/shared/media/video.py) with a resolution property using width-priority bucket detection (handles cinema/scope crops like 1920×960 → 1080p).
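Width-priority bucketing could work roughly as follows. The bucket floors and the resolution_bucket name are assumptions made for this sketch; the actual thresholds in VideoTrack.resolution are not stated here.

```python
from typing import Optional

# Assumed bucket floors, largest first; the real thresholds may differ.
WIDTH_BUCKETS = [(3840, "2160p"), (1920, "1080p"), (1280, "720p"), (720, "480p")]
HEIGHT_BUCKETS = [(2160, "2160p"), (1080, "1080p"), (720, "720p"), (480, "480p")]


def resolution_bucket(width: Optional[int], height: Optional[int]) -> Optional[str]:
    """Classify by width when it is known; use height only as a fallback."""
    if width:
        buckets, value = WIDTH_BUCKETS, width
    elif height:
        buckets, value = HEIGHT_BUCKETS, height
    else:
        return None
    for floor, label in buckets:
        if value >= floor:
            return label
    return None  # smaller than the lowest bucket
```

A scope crop such as 1920×960 lands in 1080p by width, whereas classifying the same frame by its 960-pixel height alone would put it in 720p.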
shared/media/matching.py — track_lang_matches helper shared by Episode and Movie. Implements the "C+" contract for language helpers:
- Language query → cross-format match via Language.matches()
- str query → case-insensitive direct comparison (no normalization)
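The two branches of the contract can be sketched like this. Lang is a reduced, hypothetical version of the Language value object (only iso and aliases), and the signature of track_lang_matches is assumed.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Lang:
    """Reduced, hypothetical version of the Language value object."""
    iso: str                      # ISO 639-2/B, e.g. "fre"
    aliases: tuple[str, ...] = ()

    def matches(self, raw: str) -> bool:
        needle = raw.strip().lower()
        return needle == self.iso or needle in self.aliases


def track_lang_matches(track_lang: Optional[str], query: Union[Lang, str]) -> bool:
    if track_lang is None:
        return False
    if isinstance(query, Lang):
        return query.matches(track_lang)          # cross-format match
    return track_lang.lower() == query.lower()    # direct comparison only


FRENCH = Lang("fre", aliases=("fr", "fra", "french", "français"))
```

The asymmetry is deliberate in this reading of the contract: a caller who passes a plain string gets exactly what they typed, so "fre" does not match a track tagged "fra" unless the query is a Language.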
TVShow aggregate composition:
- TVShow.seasons: dict[SeasonNumber, Season]
- Season.episodes: dict[EpisodeNumber, Episode]
- Season.expected_episodes / Season.aired_episodes (split so collection state can compare "owned vs aired today" without confusing in-flight seasons with future ones)
Aggregate methods on TVShow:
- add_episode(ep) — sole sanctioned mutation entry point (creates the season if missing)
- add_season(season) — replaces a season wholesale
- collection_status() → CollectionStatus.{EMPTY, PARTIAL, COMPLETE}
- is_complete_series() — true iff ENDED + COMPLETE
- missing_episodes() — flat list of all aired-but-not-owned (season, episode) pairs
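How collection_status() and missing_episodes() relate can be shown with plain dicts. This sketch replaces the aggregate with two hypothetical inputs (aired episode counts per season and owned episode numbers per season); the enum values are also assumptions.

```python
from enum import Enum


class CollectionStatus(Enum):
    EMPTY = "empty"
    PARTIAL = "partial"
    COMPLETE = "complete"


def missing_episodes(aired: dict[int, int], owned: dict[int, set[int]]) -> list[tuple[int, int]]:
    """Flat (season, episode) pairs that have aired but are not owned."""
    return [
        (season, ep)
        for season in sorted(aired)
        for ep in range(1, aired[season] + 1)
        if ep not in owned.get(season, set())
    ]


def collection_status(aired: dict[int, int], owned: dict[int, set[int]]) -> CollectionStatus:
    if not any(owned.values()):
        return CollectionStatus.EMPTY      # nothing owned at all
    if missing_episodes(aired, owned):
        return CollectionStatus.PARTIAL    # something aired is still missing
    return CollectionStatus.COMPLETE
```

Comparing against aired rather than expected episodes means an in-flight season does not count as incomplete merely because future episodes have not been released yet.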
CollectionStatus enum (orthogonal to ShowStatus).
Episode track helpers (has_audio_in, has_subtitles_in, has_forced_subs, audio_languages, subtitle_languages), driven by Episode.audio_tracks / Episode.subtitle_tracks.
Movie aggregate parity — Movie now carries audio_tracks / subtitle_tracks and exposes the same helpers as Episode (same C+ contract).
CHANGELOG.md (this file).

Changed

shared/media_info.py exploded into shared/media/{audio,video,subtitle,info,matching}.py. MediaInfo is now symmetric: every stream type is a list[Track]. Flat accessors (width, height, video_codec, resolution) remain as properties that read the first video track.
MediaInfo.duration_seconds / bitrate_kbps moved from VideoTrack to MediaInfo (file-level — they come from the ffprobe format block, not a stream). Files without a video stream now correctly expose duration.
ShowStatus.from_string extended to map TMDB strings (Returning Series, In Production, Pilot, Planned, Canceled, Cancelled). Comparison is whitespace-trimmed and case-insensitive.
Season / Episode dropped their show_imdb_id back-references. They are owned by TVShow and reached only through it.
TVShow.seasons_count and episode_count are now @property (computed from the dict) instead of stored ints.
TVShowService.parse_episode_from_filename rewritten in string operations (no regex). Supports S01E05 / s1e5 and 1x05 / 01x5 forms.
TVShowService.find_next_episode now drives off show.missing_episodes() instead of the hardcoded "max 50 episodes per season" heuristic.
TVShowService constructor no longer takes season_repository / episode_repository — the aggregate persists in one block via TVShowRepository only.
SubtitleTrack in alfred.domain.subtitles.entities renamed to SubtitleCandidate. Coexists with the shared.media.SubtitleTrack ffprobe-view dataclass (different bounded contexts, kept separate intentionally).
tv_shows/services.py _VIDEO_EXTENSIONS now loaded from knowledge/release/file_extensions.yaml via load_video_extensions() (single source of truth).
CLAUDE.md updated with three new policy sections:
- "Tests" — small updates OK during normal work, no mass-update sprees
- "Backwards-compatibility shims" — prefer clean migration over shims
- "Regex" — not forbidden, use judgment when string ops would be fragile

Removed

Legacy Season N Episode N filename form in TVShowService.parse_episode_from_filename. It never appears in the release names Alfred handles, and supporting it forced a regex.
SeasonRepository and EpisodeRepository — only the aggregate root has a repository (DDD rule: one repo per aggregate).
shared/media_info.py compatibility shim — callers updated.
SubtitleTrack compatibility alias in subtitles.entities — callers updated to SubtitleCandidate.

Fixed

MediaInfo.duration_seconds returns None on audio-only files instead of crashing through primary_video.duration_seconds (see the duration/bitrate move under Changed).
MediaOrganizer (infrastructure/filesystem/organizer.py) no longer passes the removed show_imdb_id / episode_count kwargs when constructing a Season for folder-name generation.

Internal

Test suite rewritten where the aggregate redesign broke fixtures: tests/domain/test_tv_shows.py (69 tests), tests/domain/test_media_info.py (rewritten for VideoTrack), tests/application/test_enrich_from_probe.py (helper added), tests/infrastructure/test_filesystem_extras.py (fixtures), tests/domain/test_tv_shows_service.py (find_next_episode driven by real aggregate state).
Subtitle services internal migration: matcher.py, utils.py, placer.py, identifier.py updated to import SubtitleCandidate.
Suite status at end of block: 1066 passed, 8 skipped, 0 failed.
