df798f55cc
Domain services (SubtitleIdentifier, PatternDetector) used to import the concrete SubtitleKnowledgeBase class directly from infrastructure for their type hint. With this commit they depend on a structural Protocol in alfred/domain/subtitles/ports/knowledge.py declaring just the 7 read-only query methods the domain actually consumes. The concrete YAML-backed SubtitleKnowledgeBase in infrastructure remains the sole adapter — no rename, no shim. With this change alfred/domain/subtitles/ has zero imports from alfred/infrastructure/. Also extend the changelog entry covering the full domain-io-extraction branch.
17 KiB
17 KiB
Changelog
All notable changes to Alfred are documented here.
The format is loosely based on Keep a Changelog.
Alfred is not yet on SemVer — entries are grouped by dated work blocks instead
of release numbers. Granularity targets behavioral or API-visible changes; refer
to git log for commit-level detail.
Sections used per block: Added / Changed / Deprecated / Removed / Fixed / Internal (for tech-debt and refactor noise that doesn't affect callers).
[Unreleased]
Added
- Real-world release fixtures under
tests/fixtures/releases/{easy,shitty,path_of_pain}/, each documenting an expectedParsedReleaseplus the futurerouting(library / torrents / seed_hardlinks) for the upcomingorganize_mediarefactor. EASY bucket seeded with 5 cases (movie, single-episode, season pack, movie + noise, YTS bracket-heavy). SHITTY bucket seeded with 15 anti-regression cases covering: 3-level INTEGRALE hierarchy (Angel), French custom titles (Buffy, La Nuit au Musée, Chérie j'ai agrandi), multi-episode chainS14E09E10E11(Archer, captures E11 loss), lowercases01e01(Notre Planète),NxNNwith-separators (Vinyl, captures dash artifact), title-with-year-suffix (Deutschland.83), season-rangeS01-06(Tatortreiniger, captures movie misclassification), bare folder name (Jurassic Park, media_type=unknown), apostrophe-in-name (Honey Don't, captures full AI-path degeneration), SUBS-tag movie (Hook), space separators (Predator Badlands, captures group=UNKNOWN), subs-only release (Westworld S04). PATH OF PAIN bucket seeded with 10 worst-case fixtures covering: UTF-8 wide pipe yt-dlp slug (Khruangbin), 3-show franchise box-set with double season range and parens-wrapped tech (Deutschland 83-86-89, capturesgroup=S03misdetection), accented chars in title (Chérie BéBé with VFF), 8-word stand-up comedy title (Jimmy Carr), site-tag prefix + XviD (OxTorrent), episode title + air-date silently lost (Prodiges), full-chaos apostrophe + spaces + Blu-ray dash + 1080i + multi-word audio codec (The Prodigy, full AI-path degeneration), yt-dlp YouTube ID glued to year (Sleaford Mods), bilingual[FR-EN]tag mistaken for group (Super Mario Bros), COMPLETE + S01-S07 range + REPACK + HEVC (Gilmore Girls, the well-behaved exception). Parametrized overtests/domain/test_release_fixtures.pyfor anti-regression. NxNNalt season/episode form supported byparse_release. Releases likeShow.1x05.720p.HDTV.x264-GRPandShow.2x07x08.1080p.WEB.x265-GRP(multi-ep alt form) now parse as TV shows.alfred/knowledge/release/separators.yamldeclares the token separators used by the release-name tokenizer (.,,[,],(,),_). New conventions can be added without code changes. The canonical.is always present even if missing from YAML.
Changed
parse_releasetokenizer is now data-driven: it splits on any character listed inseparators.yaml(regex character class) instead ofname.split("."). This makes YTS-style releases (The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]), space-separated names (Inception 2010 1080p BluRay x264-GROUP), and underscore-separated names parse correctly via the direct path — no more fallback through sanitization.parse_releaseflow simplified: site-tag extraction always runs first (soparse_path == "sanitized"now reliably indicates a stripped[tag]), then well-formedness is checked only against truly forbidden chars (anything not in the configured separator set).- ISO 639-2/B is now the canonical language code project-wide (was a mix of
639-1 and 639-2/T):
SubtitlePreferences.languagesdefault is now["fre", "eng"](was["fr", "en"]). Old LTM files are not auto-migrated — deletedata/memory/ltm.jsonto regenerate with the new defaults.- Subtitle output filenames are now
{iso639_2b}.srt(e.g.fre.srt,fre.sdh.srt). Existingfr.srtfiles are still read correctly (recognized as French via alias) but new files are written canonically. Languagevalue object docstring corrected: it has always stored 639-2/B (matching what ffprobe emits), not 639-2/T as previously documented.
MovieService.validate_movie_fileminimum size is now configurable viasettings.min_movie_size_bytes(default unchanged: 100 MB). Constructor accepts an optionalmin_movie_size_bytesoverride for tests.SubtitleKnowledgeBasedelegates language lookup toLanguageRegistryrather than duplicating tokens.subtitles.yamlnow only declares subtitle-specific tokens (e.g.vostfr,vf,vff) under a newlanguage_tokenssection.
Removed
alfred/domain/tv_shows/services.pyandalfred/domain/movies/services.pydeleted entirely. They held fossil parsers (parse_episode_filename,extract_movie_metadata, …) with zero production callers — superseded byparse_releaseas the single source of truth for release-name parsing. Associated tests (tests/domain/test_movies.py,tests/domain/test_tv_shows_service.py) removed as well._sanitizeand_normalizehelpers inalfred/domain/release/services.py— the new tokenizer makes them redundant._LANG_KEYWORDS,_SDH_TOKENS,_FORCED_TOKENS,SUBTITLE_EXTENSIONShardcoded dicts inalfred/domain/subtitles/scanner.py— all knowledge now lives in YAML (CLAUDE.md compliance)._MIN_MOVIE_SIZE_BYTESmodule-level constant inalfred/domain/movies/services.py— replaced by the new setting.- Top-level
languages:block insubtitles.yaml— superseded bylanguage_tokens:(subtitle-specific only) since iso_languages.yaml is the canonical source.
Fixed
hitoken no longer marks a subtitle as SDH (it conflicted with the ISO 639-1 alias for Hindi). SDH is now detected only viasdh,cc, andhearingtokens.SubtitleKnowledgeBasedefault rules used"fra"whileiso_languages.yamlexposes French as"fre"— preferred languages defaults now match the canonical form.
Internal
- Domain I/O extraction (
refactor/domain-io-extraction): the domain layer no longer performs subprocess calls, filesystem scans, or YAML loading. Achieved in a series of focused commits:- Knowledge YAML loaders moved to infrastructure:
alfred/domain/release/knowledge.py,alfred/domain/shared/knowledge/language_registry.py, andalfred/domain/subtitles/knowledge/{base,loader}.pyrelocated toalfred/infrastructure/knowledge/. Re-exports were dropped — callers import directly from the new location. MediaProberandFilesystemScannerProtocol ports introduced atalfred/domain/shared/ports/with frozen-dataclass DTOs (SubtitleStreamInfo,FileEntry).SubtitleIdentifierandPatternDetectorare now constructor-injected with concrete adapters (FfprobeMediaProberwrappingsubprocess.run(ffprobe)andPathlibFilesystemScannerwrappingpathlib). No more directsubprocess/pathlibusage from the subtitle domain services.- Live filesystem methods removed from VOs and entities:
FilePath.exists()/.is_file()/.is_dir()deleted —FilePathis now a pure address VO.Movie.has_file()andEpisode.is_downloaded()dropped. Callers either rely on a prior detection step or use try/except over pre-checks (eliminates TOCTOU races). SubtitlePlacermoved to the application layer atalfred/application/subtitles/placer.py— it performsos.linkI/O, which doesn't belong in the domain. Pre-checks replaced with try/except forFileNotFoundError/FileExistsError.SubtitleRuleSet.resolve()no longer reaches into the knowledge base: the implicitDEFAULT_RULES()helper is gone, replaced by an explicitdefault_rules: SubtitleMatchingRulesparameter. TheManageSubtitlesuse case loads defaults from the KB once and passes them in.SubtitleKnowledgeProtocol port atalfred/domain/subtitles/ports/knowledge.pydeclares the read-only query surface domain services consume (7 methods:known_extensions,format_for_extension,language_for_token,is_known_lang_token,type_for_token,is_known_type_token,patterns).SubtitleIdentifierandPatternDetectordepend on this Protocol instead of the concreteSubtitleKnowledgeBasefrom infrastructure —domain/subtitles/now has zero imports frominfrastructure/. The remaining domain → infra leak (domain/release/loading separator YAML at import-time) is documented in tech-debt and scheduled for its own branch.
- Knowledge YAML loaders moved to infrastructure:
to_dot_folder_name(title)helper inalfred/domain/shared/value_objects.py— extracts there.sub(r"[^\w\s\.\-]", "", title).replace(" ", ".")pattern that was duplicated betweenMovieTitle.normalized()andTVShow.get_folder_name().ParsedRelease.languagesusesfield(default_factory=list)instead of a manual__post_init__that assigned[]viaobject.__setattr__.file_extensions.yamlsplits subtitle sidecars (.srt,.sub,.idx,.ass,.ssa) into a dedicatedsubtitle:category instead of lumping them undermetadata:. The_METADATA_EXTENSIONSset used bydetect_media_typeremains the union of both (same behavior — subtitles are still ignored when deciding the media type of a folder), but a newload_subtitle_extensions()loader is now available for the subtitles domain. Sematic clarity, no functional change.tv_shows/entities.pymodule docstring now shows the aggregate ownership as an ASCII tree before the rule text — quicker visual scan of the DDD structure.- Removed backward-compat shims
_sanitise_for_fs/_strip_episode_from_normalisedfromdomain/release/value_objects.py(zero callers). - Cleaned ruff warnings across the codebase:
subprocess.runcalls now pass explicitcheck=False(PLW1510); lazy imports promoted to module top where there was no cycle (PLC0415 inmanage_subtitles.py,placer.py,qbittorrent/client.py,file_manager.py); fixed module-level import ordering (E402) inlanguage_registry.pyandsubtitles/knowledge/loader.py; removed unused locals (F841 / B007); replaced unnecessary set comprehension withset()inrelease/knowledge.py(C416). - Ruff config: ignore
PLR0911/PLR0912(too-many-returns / too-many-branches) globally — noisy on parser mappers and orchestrator use-cases where early-return validation is essential complexity. IgnorePLW0603for the documented memory singleton (infrastructure/persistence/context.py).
[2026-05-17] — TVShow & Movie aggregate refactor
Multi-phase refonte of the TV show domain into a real DDD aggregate, with
matching parity work on Movie, a language knowledge system, and the
shared/media restructure that supports both.
Added
- Language knowledge system (
alfred/knowledge/iso_languages.yaml+ 42 languages includingundfor undetermined).Languagevalue object (frozen dataclass) withiso,english_name,native_name,aliases, and amatches(raw)cross-format helper.LanguageRegistryloader (alfred/domain/shared/knowledge/) merging builtin + learned YAML. Not a singleton — the application layer instantiates it.- ISO 639-2/B is the canonical key; aliases cover 639-1, 639-2/T, English name, native name, and common spellings.
VideoTrackdataclass (alfred/domain/shared/media/video.py) with aresolutionproperty using width-priority bucket detection (handles cinema/scope crops like 1920×960 → 1080p).shared/media/matching.py—track_lang_matcheshelper shared byEpisodeandMovie. Implements the "C+" contract for language helpers:Languagequery → cross-format match viaLanguage.matches()strquery → case-insensitive direct comparison (no normalization)
- TVShow aggregate composition:
TVShow.seasons: dict[SeasonNumber, Season]Season.episodes: dict[EpisodeNumber, Episode]Season.expected_episodes/Season.aired_episodes(split so collection state can compare "owned vs aired today" without confusing in-flight seasons with future ones)
- Aggregate methods on
TVShow:add_episode(ep)— sole sanctioned mutation entry point (creates the season if missing)add_season(season)— replaces a season wholesalecollection_status()→CollectionStatus.{EMPTY, PARTIAL, COMPLETE}is_complete_series()— true iffENDED + COMPLETEmissing_episodes()— flat list of all aired-but-not-owned(season, episode)pairs
CollectionStatusenum (orthogonal toShowStatus).- Episode track helpers (
has_audio_in,has_subtitles_in,has_forced_subs,audio_languages,subtitle_languages), driven byEpisode.audio_tracks/Episode.subtitle_tracks. - Movie aggregate parity —
Movienow carriesaudio_tracks/subtitle_tracksand exposes the same helpers asEpisode(same C+ contract). CHANGELOG.md(this file).
Changed
shared/media_info.pyexploded intoshared/media/{audio,video,subtitle,info,matching}.py.MediaInfois now symmetric: every stream type is alist[Track]. Flat accessors (width,height,video_codec,resolution) remain as properties that read the first video track.MediaInfo.duration_seconds/bitrate_kbpsmoved fromVideoTracktoMediaInfo(file-level — they come from the ffprobeformatblock, not a stream). Files without a video stream now correctly expose duration.ShowStatus.from_stringextended to map TMDB strings (Returning Series,In Production,Pilot,Planned,Canceled,Cancelled). Comparison is whitespace-trimmed and case-insensitive.Season/Episodedropped theirshow_imdb_idback-references. They are owned byTVShowand reached only through it.TVShow.seasons_countandepisode_countare now@property(computed from the dict) instead of stored ints.TVShowService.parse_episode_from_filenamerewritten in string operations (no regex). SupportsS01E05/s1e5and1x05/01x5forms.TVShowService.find_next_episodenow drives offshow.missing_episodes()instead of the hardcoded "max 50 episodes per season" heuristic.TVShowServiceconstructor no longer takesseason_repository/episode_repository— the aggregate persists in one block viaTVShowRepositoryonly.SubtitleTrackinalfred.domain.subtitles.entitiesrenamed toSubtitleCandidate. Coexists with theshared.media.SubtitleTrackffprobe-view dataclass (different bounded contexts, kept separate intentionally).tv_shows/services.py_VIDEO_EXTENSIONSnow loaded fromknowledge/release/file_extensions.yamlviaload_video_extensions()(single source of truth).CLAUDE.mdupdated with three new policy sections:- "Tests" — small updates OK during normal work, no mass-update sprees
- "Backwards-compatibility shims" — prefer clean migration over shims
- "Regex" — not forbidden, use judgment when string ops would be fragile
Removed
- Legacy
Season N Episode Nfilename form inTVShowService.parse_episode_from_filename. It never appears in the release names Alfred handles, and supporting it forced a regex. SeasonRepositoryandEpisodeRepository— only the aggregate root has a repository (DDD rule: one repo per aggregate).shared/media_info.pycompatibility shim — callers updated.SubtitleTrackcompatibility alias insubtitles.entities— callers updated toSubtitleCandidate.
Fixed
MediaInfo.duration_secondsreturnsNoneon audio-only files instead of crashing throughprimary_video.duration_seconds(see the duration/bitrate move under Changed).MediaOrganizer(infrastructure/filesystem/organizer.py) no longer passes the removedshow_imdb_id/episode_countkwargs when constructing aSeasonfor folder-name generation.
Internal
- Test suite rewritten where the aggregate redesign broke fixtures:
tests/domain/test_tv_shows.py(69 tests),tests/domain/test_media_info.py(rewritten forVideoTrack),tests/application/test_enrich_from_probe.py(helper added),tests/infrastructure/test_filesystem_extras.py(fixtures),tests/domain/test_tv_shows_service.py(find_next_episode driven by real aggregate state). - Subtitle services internal migration:
matcher.py,utils.py,placer.py,identifier.pyupdated to importSubtitleCandidate. - Suite status at end of block: 1066 passed, 8 skipped, 0 failed.