Remove the module-level _KB / _PROBER singletons from
alfred/application/filesystem/resolve_destination.py. The four
resolve_{season,episode,movie,series}_destination use cases now take
kb: ReleaseKnowledge and prober: MediaProber as required arguments,
matching the shape of inspect_release.
The singletons now live at the agent-tools frontier
(alfred/agent/tools/filesystem.py), where the LLM-facing wrappers
instantiate YamlReleaseKnowledge / FfprobeMediaProber once and thread
them through. The wrappers' Python signatures are unchanged — the
inspect-based JSON-schema generator in agent/registry.py still sees the
same LLM-passable params.
analyze_release drops the dirty 'from ... import _KB' indirection.
Tests inject their own stubs by keyword (prober=_StubProber(...)) via
thin convenience wrappers, replacing the prior
monkeypatch.setattr(rd, '_PROBER', ...) pattern.
testing/debug_release.py: instantiate YamlReleaseKnowledge() /
FfprobeMediaProber() inline at the two call sites.
Suite: 1077 passed.
Changelog
All notable changes to Alfred are documented here.
The format is loosely based on Keep a Changelog.
Alfred is not yet on SemVer — entries are grouped by dated work blocks instead
of release numbers. Granularity targets behavioral or API-visible changes; refer
to git log for commit-level detail.
Sections used per block: Added / Changed / Deprecated / Removed / Fixed / Internal (for tech-debt and refactor noise that doesn't affect callers).
[Unreleased]
Fixed
- Multi-episode chain (e.g. S14E09E10E11) now collapses to a full range. The parser previously captured episode=9, episode_end=10 and dropped E11+. It now returns episode=first, episode_end=last, with intermediate values implied. Fixture shitty/archer_multi_episode/ updated from anti-regression-of-bug to anti-regression-of-fix.
- Apostrophes in titles no longer push the release through the AI fallback. Honey.Don't.2025.2160p.WEBRip.DSNP.DV.HDR.x265-Amen previously parsed with parse_path="ai" and everything UNKNOWN because ' is in the forbidden-chars list. Apostrophes are now pre-stripped before the well-formed check, so the parse completes normally (title=Honey.Dont, year=2025, quality=2160p, ...); only the title text loses its apostrophe. parse_path becomes sanitized to surface the cleanup. Side win: PoP fixture the_prodigy_full_chaos/ also moves from total failure to a partially-correct parse (year, source, codec extracted).
- Season-range markers (Sxx-yy) are now recognized as tv_complete. Der.Tatortreiniger.S01-06.GERMAN... previously parsed as media_type=movie with S01-06 glued onto the title. The parser now recognizes the range, sets season=first, media_type=tv_complete, and removes the marker from the title. is_season_pack flips to true.
- Pure-punctuation TITLE tokens are dropped at assembly. Releases with surrounding - separators (Vinyl - 1x01 - FHD) previously produced title="Vinyl.-". Such tokens (a stray dash, a wide pipe |, …) carry no title content and are now filtered out. Side effect: PoP fixture khruangbin_yt_wide_pipe/ also benefits, since the YouTube wide-pipe no longer leaks into the title.
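The multi-episode fix above can be illustrated with a minimal sketch. This is not Alfred's parser: `collapse_episode_chain` is a hypothetical helper, it uses a regex for brevity, and returning `None` as the range end for a single-episode token is an assumption.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a season marker followed by one or more Exx markers.
_CHAIN = re.compile(r"^S(\d{1,2})((?:E\d{1,3})+)$", re.IGNORECASE)


def collapse_episode_chain(token: str) -> Optional[Tuple[int, int, Optional[int]]]:
    """Return (season, first_episode, last_episode) for SxxEyy[Ezz...] tokens.

    Intermediate episodes are implied by the range. Returns None when the
    token is not a season/episode marker at all.
    """
    match = _CHAIN.match(token)
    if match is None:
        return None
    season = int(match.group(1))
    episodes = [
        int(number)
        for number in re.findall(r"E(\d{1,3})", match.group(2), re.IGNORECASE)
    ]
    # Assumption: a lone episode carries no range end.
    episode_end = episodes[-1] if len(episodes) > 1 else None
    return season, episodes[0], episode_end


print(collapse_episode_chain("S14E09E10E11"))
```

The point of the sketch is only the first/last collapse: the whole chain is read before deciding the range end, so E11 is no longer dropped.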
Added
- LanguageRepository port in alfred.domain.shared.ports. Structural Protocol covering from_iso, from_any, all, __contains__, __len__ (the surface previously coupled to the concrete LanguageRegistry). Mirrors the MediaProber / FilesystemScanner pattern: domain code depends on the Protocol, infrastructure provides the YAML-backed adapter. Tests in tests/infrastructure/test_language_registry.py.
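A structural port of this shape might look roughly like the sketch below. Only the five method names come from the entry above; the signatures, the trimmed-down `Language` stand-in, the `InMemoryLanguages` fake, and `display_name` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Protocol, Tuple


@dataclass(frozen=True)
class Language:
    # Simplified stand-in for the real value object.
    iso: str
    english_name: str
    aliases: Tuple[str, ...] = ()


class LanguageRepository(Protocol):
    # Assumed signatures; the changelog only names the methods.
    def from_iso(self, iso: str) -> Optional[Language]: ...
    def from_any(self, raw: str) -> Optional[Language]: ...
    def all(self) -> Iterable[Language]: ...
    def __contains__(self, raw: str) -> bool: ...
    def __len__(self) -> int: ...


class InMemoryLanguages:
    """Hypothetical fake: satisfies the Protocol without loading any YAML."""

    def __init__(self, languages: Iterable[Language]) -> None:
        self._by_iso = {lang.iso: lang for lang in languages}

    def from_iso(self, iso: str) -> Optional[Language]:
        return self._by_iso.get(iso)

    def from_any(self, raw: str) -> Optional[Language]:
        key = raw.strip().lower()
        for lang in self._by_iso.values():
            if key == lang.iso or key in lang.aliases:
                return lang
        return None

    def all(self) -> Iterable[Language]:
        return list(self._by_iso.values())

    def __contains__(self, raw: str) -> bool:
        return self.from_any(raw) is not None

    def __len__(self) -> int:
        return len(self._by_iso)


def display_name(repo: LanguageRepository, raw: str) -> str:
    """Domain-side code depends on the Protocol only, never on an adapter."""
    lang = repo.from_any(raw)
    return lang.english_name if lang else raw
```

Because the Protocol is structural, the fake needs no inheritance; any object with matching methods can be passed where a `LanguageRepository` is expected.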
Changed
- resolve_destination use cases take kb / prober as required params; module-level singletons gone. The four resolve_{season,episode,movie,series}_destination use cases now accept kb: ReleaseKnowledge and prober: MediaProber as required arguments, matching the shape of inspect_release. The module-level _KB = YamlReleaseKnowledge() and _PROBER = FfprobeMediaProber() singletons that previously lived in alfred/application/filesystem/resolve_destination.py are removed, so the application layer no longer reaches into infrastructure. The singletons now live at the agent-tools frontier (alfred/agent/tools/filesystem.py), where the LLM-facing wrappers instantiate them once and thread them through. analyze_release no longer needs the dirty from ... import _KB indirection. Tests inject their own stubs by keyword (prober=_StubProber(...)) instead of monkeypatching a module attribute.
- ParsePath enum renamed to TokenizationRoute. The old name collided with pathlib.Path in code-reading mental models, and was one letter from parse_path (the field that holds the value), making it harder than it needed to be to tell the type from the attribute. TokenizationRoute says what it actually captures (DIRECT / SANITIZED / AI = how the name reached the tokenizer), and the class docstring now spells out the orthogonality with Road (EASY / SHITTY / PATH_OF_PAIN, which captures parser confidence on ParseReport). The parse_path field name stays unchanged, and so do its string values, so YAML fixtures, the analyze_release tool spec, and any external consumer are untouched.
- enrich_from_probe codec mappings moved to YAML. The three hard-coded module dicts (_VIDEO_CODEC_MAP, _AUDIO_CODEC_MAP, _CHANNEL_MAP) translating ffprobe output to scene tokens (hevc → x265, eac3 → EAC3, 8 → "7.1", …) now live in alfred/knowledge/release/probe_mappings.yaml and are loaded into ReleaseKnowledge.probe_mappings (new port field, populated by YamlReleaseKnowledge). enrich_from_probe gains a third kb parameter and reads the maps from there. Aligns with the CLAUDE.md rule that lookup tables of domain knowledge belong in YAML, not in Python, and opens the door to a future "learn new codec" pass. Callers updated: inspect_release, testing/recognize_folders_in_downloads.py, and all 22 sites in tests/application/test_enrich_from_probe.py.
- ParsedRelease.tech_string is now a derived @property (alfred/domain/release/value_objects.py). It computes quality.source.codec joined by dots on every access, so it stays in sync with the underlying fields by construction. The stored field is gone from the dataclass, the dict returned by assemble() no longer carries the key, parse_release's malformed-name fallback drops the tech_string="" kwarg, and enrich_from_probe no longer re-derives it after filling quality / source / codec. Closes the parser/enrichment double-source-of-truth that e79ca46 had to fix reactively. The fixtures runner now injects tech_string alongside is_season_pack since asdict() skips properties.
- RuleScope.level is now an enum (RuleScopeLevel). The set of valid levels (global, release_group, movie, show, season, episode) was documented only in a docstring comment and validated nowhere. RuleScopeLevel(str, Enum) keeps wire compatibility (YAML serialization, .value access) while making the closed set explicit to type-checkers and IDEs. to_dict() emits .value strings so YAML output is unchanged.
- FilePath VO uses __post_init__ instead of a hand-rolled __init__. Same public API (accepts str | Path), same behavior, but the dataclass-generated __init__ is no longer bypassed. One less smell in the shared VOs.
- Language VO is strict by default; Language.from_raw() factory for normalization. The previous __post_init__ mutated iso and aliases via object.__setattr__ on a frozen dataclass, a code smell hiding behind the dataclass facade. Split: the direct constructor now rejects un-normalized input (uppercase iso, whitespace in aliases, etc.), and Language.from_raw() handles arbitrary YAML/user input. Only one caller (LanguageRegistry loading the ISO YAML) needed migration.
- ParsedRelease.normalised renamed to clean. The field name promised "dots instead of spaces" but in practice held raw - site_tag - apostrophes, and was only used by season_folder_name(). Renamed and docstring corrected.
- ParsedRelease.media_type / parse_path are strict enums. The fields were already typed as MediaTypeToken / ParsePath, but a tolerant __post_init__ coerced raw strings. With both classes being (str, Enum), the coercion served no purpose. Strict constructor; .value no longer passed at call sites; dropped the unused _VALID_MEDIA_TYPES / _VALID_PARSE_PATHS lookup tables.
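The derived tech_string entry above is easier to see in a small sketch. `ParsedReleaseSketch` is a hypothetical, cut-down stand-in for the real dataclass; it is mutable only to keep the demo short, and skipping empty parts in the join is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ParsedReleaseSketch:
    # Hypothetical subset of the real fields.
    title: str
    quality: Optional[str] = None
    source: Optional[str] = None
    codec: Optional[str] = None

    @property
    def tech_string(self) -> str:
        # Recomputed on every access, so it cannot drift from the fields.
        parts = (self.quality, self.source, self.codec)
        return ".".join(part for part in parts if part)


release = ParsedReleaseSketch(title="Inception", source="BluRay", codec="x264")
before = release.tech_string

# Simulates an enrichment step filling a field the name did not carry.
release.quality = "1080p"
after = release.tech_string

# asdict() only walks dataclass fields, so the property is absent and has to
# be injected explicitly when a flat dict is needed (e.g. fixture comparison).
snapshot = asdict(release)
snapshot_with_derived = {**snapshot, "tech_string": release.tech_string}
```

With a stored field, the enrichment step would have needed to remember to refresh it; with the property there is nothing to refresh.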
Removed
- settings.min_movie_size_bytes: orphan Pydantic field + validator. Its only consumer (MovieService.validate_movie_file) had been removed during an earlier refactor. The "real movie vs sample" rule now lives in extension-based exclusion (application/release/supported_media.py) and PoP. If a size threshold is ever needed, it'll go in a knowledge YAML, not in settings.
Internal
- Flattened alfred.domain.shared.media/ package into a single media.py module. The 6-file package (audio, video, subtitle, info, matching, tracks_mixin + __init__) collapsed into one ~250 LoC module. All 12 import sites continue to resolve unchanged (from alfred.domain.shared.media import AudioTrack, MediaInfo, …) since Python treats media.py and media/__init__.py interchangeably for import paths. Easier to scan when the whole bounded context fits on one screen.
- SubtitleKnowledgeBase types language_registry against the LanguageRepository port instead of the concrete LanguageRegistry class. The default constructor still instantiates the concrete adapter when no repository is injected, so behaviour is unchanged for existing callers. Opens the door to in-memory fakes in future tests without loading the full ISO 639 YAML.
- Moved detect_media_type and enrich_from_probe from alfred.application.filesystem to alfred.application.release. They are inspection-pipeline helpers: their natural home is next to inspect_release, not next to the filesystem use cases. The move also eliminates a circular-import workaround in resolve_destination.py: inspect_release can now be imported at module top instead of lazily inside _resolve_parsed. Public surface is unchanged for callers that imported the helpers from their full module paths (the only call sites, namely inspect.py, two tests, and one testing script, were updated in this commit).
Added
- resolve_*_destination use cases now consume inspect_release. resolve_episode_destination and resolve_movie_destination reuse their existing source_file parameter as the inspection target; resolve_season_destination and resolve_series_destination gain a new optional source_path parameter (also threaded through the tool wrappers and YAML specs). When the path exists, ffprobe data fills tokens missing from the release name (e.g. quality) and refreshes tech_string, so the destination folder / file names end up more accurate. When the path is omitted or does not exist (back-compat callers), the use cases fall back to parse-only, the same behavior as before.
Fixed
- enrich_from_probe now refreshes tech_string after filling quality / source / codec. Previously the field stayed at its parser-time value, so filename builders saw stale tech tokens even after a successful probe. New TestTechString class in tests/application/test_enrich_from_probe.py locks the behavior.
Added
- inspect_release orchestrator + InspectedResult VO (alfred/application/release/inspect.py). Single composition of the four inspection layers: parse_release → detect_media_type (patches parsed.media_type) → find_main_video (top-level scan) → prober.probe + enrich_from_probe when a video exists and the refined media type isn't in {"unknown", "other"}. Returns a frozen InspectedResult(parsed, report, source_path, main_video, media_info, probe_used) that downstream callers consume directly instead of rebuilding the same chain. kb and prober are injected; there are no module-level singletons. Never raises.
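The injection style described above (kb and prober passed in, no module-level singletons) can be sketched as follows. Everything here is a simplified assumption: `inspect_sketch`, `Inspected`, and the stubs are hypothetical, media info is a plain dict, the parse and media-type steps are left out, and candidate paths are passed in directly instead of scanning a folder.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import FrozenSet, Iterable, List, Optional, Protocol


class MediaProber(Protocol):
    def probe(self, video: Path) -> Optional[dict]: ...


class ReleaseKnowledge(Protocol):
    video_extensions: FrozenSet[str]


@dataclass(frozen=True)
class Inspected:
    name: str
    main_video: Optional[Path]
    media_info: Optional[dict]
    probe_used: bool


def inspect_sketch(
    name: str,
    candidates: Iterable[Path],
    *,
    kb: ReleaseKnowledge,
    prober: MediaProber,
) -> Inspected:
    # Extension-only eligibility, lexicographically-first file wins.
    videos = sorted(p for p in candidates if p.suffix.lower() in kb.video_extensions)
    main_video = videos[0] if videos else None
    # Probe only when there is something to probe.
    media_info = prober.probe(main_video) if main_video is not None else None
    return Inspected(name, main_video, media_info, media_info is not None)


@dataclass(frozen=True)
class StubKnowledge:
    video_extensions: FrozenSet[str] = frozenset({".mkv", ".mp4"})


class StubProber:
    """Test double: records calls instead of shelling out to ffprobe."""

    def __init__(self) -> None:
        self.calls: List[Path] = []

    def probe(self, video: Path) -> Optional[dict]:
        self.calls.append(video)
        return {"video_codec": "hevc"}
```

Since both collaborators are keyword arguments, a test passes stubs directly and nothing needs to be monkeypatched at module level.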
Changed
- analyze_release tool now delegates to inspect_release: same output shape, plus two new fields, confidence (0–100) and road ("easy" / "shitty" / "path_of_pain"), surfaced from the parser's ParseReport. The tool spec (specs/analyze_release.yaml) documents both fields so the LLM can route releases by confidence.
- MediaProber port now covers full media probing: added probe(video) -> MediaInfo | None alongside the existing list_subtitle_streams. FfprobeMediaProber (in alfred/infrastructure/probe/) implements both methods and is now the single adapter shelling out to ffprobe. The standalone alfred/infrastructure/filesystem/ffprobe.py module was removed; all callers (tools, testing scripts) instantiate FfprobeMediaProber instead. Unblocks the upcoming inspect_release orchestrator, which depends on the port.
Removed
- alfred/infrastructure/filesystem/ffprobe.py (folded into the FfprobeMediaProber adapter).
[2026-05-20] — Release parser confidence scoring + exclusion
Added
- Pre-pipeline exclusion helpers (alfred/application/release/supported_media.py): is_supported_video(path, kb) (extension-only check against kb.video_extensions) and find_main_video(folder, kb) (top-level scan, lexicographically-first eligible file, returns None when no video qualifies; accepts a bare file as folder for single-file releases). No size threshold, no filename heuristics: PATH_OF_PAIN handles the exotic cases. Foundation for the future inspect_release orchestrator.
- Release parser: parse-confidence scoring (alfred/domain/release/parser/scoring.py, alfred/knowledge/release/scoring.yaml). parse_release now returns (ParsedRelease, ParseReport). The new ParseReport frozen VO carries a 0–100 confidence, a road ("easy" / "shitty" / "path_of_pain"), the residual UNKNOWN tokens, and the missing critical fields. EASY is decided structurally (a group schema matched); SHITTY vs PATH_OF_PAIN is decided by score against a YAML-configurable cutoff (default 60). Weights and penalties also live in scoring.yaml: title 30, media_type 20, year 15, season 10, episode 5, tech 5 each; penalty 5 per UNKNOWN token capped at -30. Road is a new enum, distinct from ParsePath (which records the tokenization route, not the confidence tier). ReleaseKnowledge port gains a scoring: dict field.
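As a worked example of the scoring arithmetic above, here is a minimal sketch using the quoted default weights. Several details are assumptions and not taken from scoring.yaml: which fields count as "tech" (quality, source, codec, group is a guess), whether the cutoff is inclusive, and the clamping to 0–100.

```python
from typing import Mapping, Optional

WEIGHTS = {"title": 30, "media_type": 20, "year": 15, "season": 10, "episode": 5}
TECH_WEIGHT = 5
TECH_FIELDS = ("quality", "source", "codec", "group")  # assumption
UNKNOWN_PENALTY = 5
MAX_PENALTY = 30
SHITTY_CUTOFF = 60


def confidence(fields: Mapping[str, Optional[object]], unknown_tokens: int) -> int:
    """Sum the weights of populated fields, then subtract the capped penalty."""
    score = sum(w for name, w in WEIGHTS.items() if fields.get(name) is not None)
    score += sum(TECH_WEIGHT for name in TECH_FIELDS if fields.get(name) is not None)
    score -= min(unknown_tokens * UNKNOWN_PENALTY, MAX_PENALTY)
    return max(0, min(100, score))


def road(score: int, schema_matched: bool) -> str:
    # EASY is structural: a matched group schema wins regardless of score.
    if schema_matched:
        return "easy"
    # Assumption: a score exactly at the cutoff still counts as SHITTY.
    return "shitty" if score >= SHITTY_CUTOFF else "path_of_pain"


movie = {
    "title": "Inception", "media_type": "movie", "year": 2010,
    "quality": "1080p", "source": "BluRay", "codec": "x264",
}
# 30 + 20 + 15 + 3 * 5 = 80, minus one UNKNOWN token (5) = 75
movie_score = confidence(movie, unknown_tokens=1)
```

Under these assumptions a title-only parse with eight leftover tokens scores 30 - 30 = 0 and lands in path_of_pain, while the movie above stays in shitty at 75.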
Changed
- parse_release signature is now (name, kb) → tuple[ParsedRelease, ParseReport] instead of returning a bare ParsedRelease. Call sites updated in application/filesystem/resolve_destination.py and agent/tools/filesystem.py. Tests updated accordingly.
[2026-05-20] — Release parser v2 (EASY + SHITTY)
Added
- Release parser v2: EASY path live (alfred/domain/release/parser/): new annotate-based pipeline (tokenize → annotate → assemble) drives releases from known groups.
  - Exposes Token (frozen VO with index + role + extra), TokenRole enum (structural / technical / meta families), and GroupSchema / SchemaChunk value objects.
  - pipeline.tokenize: string-ops separator split (no regex), strips a [site.tag] prefix/suffix first.
  - pipeline.annotate: detects the trailing group right-to-left (priority to codec-GROUP shape, fallback to any non-source dashed token), looks up its GroupSchema, then walks tokens and schema chunks in lockstep. Optional chunks that don't match are skipped; mandatory mismatches abort EASY and return None so the caller can fall back to SHITTY.
  - pipeline.assemble: folds annotated tokens into a ParsedRelease-compatible dict.
  - parse_release (in release.services) tries the v2 EASY path first and falls through to the legacy SHITTY heuristic on None. Legacy SHITTY / PATH OF PAIN behavior is unchanged.
  - Knowledge: alfred/knowledge/release/release_groups/{kontrast,elite,rarbg}.yaml declare the canonical chunk order per group, loaded via new ReleaseKnowledge.group_schema(name) port method.
  - Tests in tests/domain/release/test_parser_v2_{scaffolding,easy}.py cover token VOs, site-tag stripping, group detection, schema-driven annotation (movie, TV episode, season pack with optional source), and field assembly.
- Release parser v2: enricher pass completes the EASY pipeline. The structural schema walk now tolerates non-positional tokens between chunks (instead of aborting on leftover tokens), and a second pass tags them with audio / video-meta / edition / language roles. Multi-token sequences from audio.yaml, video.yaml, editions.yaml (e.g. DTS.HD.MA, DV.HDR10, TrueHD.Atmos, DIRECTORS.CUT) are matched before single tokens. Channel layouts like 5.1 and 7.1 (split into two tokens by the . separator) are detected as consecutive pairs. Sequence members carry an extra["sequence_member"] marker so assemble extracts the canonical value only from the primary token. KONTRAST releases with audio / HDR / edition / language metadata now produce a fully populated ParsedRelease.
- Streaming distributor as a separate dimension from encoding source. New alfred/knowledge/release/distributors.yaml (NF, AMZN, DSNP, HMAX, ATVP, HULU, PCOK, PMTP, CR) feeds a new ReleaseKnowledge.distributors port field, a TokenRole.DISTRIBUTOR annotation, and a ParsedRelease.distributor field. WEB-DL stays the source; the platform that produced the release is now recorded distinctly. The five entries (NF, AMZN, DSNP, HMAX, ATVP) were correspondingly removed from sources.yaml.
- Real-world release fixtures under tests/fixtures/releases/{easy,shitty,path_of_pain}/, each documenting an expected ParsedRelease plus the future routing (library / torrents / seed_hardlinks) for the upcoming organize_media refactor. Parametrized in tests/domain/test_release_fixtures.py for anti-regression.
  - EASY bucket seeded with 5 cases (movie, single-episode, season pack, movie + noise, YTS bracket-heavy).
  - SHITTY bucket seeded with 15 anti-regression cases covering: 3-level INTEGRALE hierarchy (Angel), French custom titles (Buffy, La Nuit au Musée, Chérie j'ai agrandi), multi-episode chain S14E09E10E11 (Archer, captures E11 loss), lowercase s01e01 (Notre Planète), NxNN with - separators (Vinyl, captures dash artifact), title-with-year-suffix (Deutschland.83), season-range S01-06 (Tatortreiniger, captures movie misclassification), bare folder name (Jurassic Park, media_type=unknown), apostrophe-in-name (Honey Don't, captures full AI-path degeneration), SUBS-tag movie (Hook), space separators (Predator Badlands, captures group=UNKNOWN), subs-only release (Westworld S04).
  - PATH OF PAIN bucket seeded with 10 worst-case fixtures covering: UTF-8 wide pipe yt-dlp slug (Khruangbin), 3-show franchise box-set with double season range and parens-wrapped tech (Deutschland 83-86-89, captures group=S03 misdetection), accented chars in title (Chérie BéBé with VFF), 8-word stand-up comedy title (Jimmy Carr), site-tag prefix + XviD (OxTorrent), episode title + air-date silently lost (Prodiges), full-chaos apostrophe + spaces + Blu-ray dash + 1080i + multi-word audio codec (The Prodigy, full AI-path degeneration), yt-dlp YouTube ID glued to year (Sleaford Mods), bilingual [FR-EN] tag mistaken for group (Super Mario Bros), COMPLETE + S01-S07 range + REPACK + HEVC (Gilmore Girls, the well-behaved exception).
- NxNN alt season/episode form supported by parse_release. Releases like Show.1x05.720p.HDTV.x264-GRP and Show.2x07x08.1080p.WEB.x265-GRP (multi-ep alt form) now parse as TV shows.
- alfred/knowledge/release/separators.yaml declares the token separators used by the release-name tokenizer (., space, [, ], (, ), _). New conventions can be added without code changes. The canonical . is always present even if missing from YAML.
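A tokenizer driven by a configurable separator list could look roughly like this. It is a sketch under assumptions: `build_splitter` and `tokenize` are hypothetical names, the separator list is passed in as a plain Python list instead of being read from YAML, and a regex character class is used for the split.

```python
import re
from typing import Iterable, List, Pattern


def build_splitter(separators: Iterable[str]) -> Pattern[str]:
    """Compile a character class from the configured separators.

    The canonical "." is always included, even when the configuration
    omits it.
    """
    chars = set(separators) | {"."}
    escaped = "".join(re.escape(char) for char in sorted(chars))
    return re.compile("[" + escaped + "]+")


def tokenize(name: str, splitter: Pattern[str]) -> List[str]:
    # Runs of separators collapse into one split; empty tokens are dropped.
    return [token for token in splitter.split(name) if token]


splitter = build_splitter([".", " ", "[", "]", "(", ")", "_"])
tokens = tokenize("The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]", splitter)
```

Note that a channel layout such as 5.1 comes out as two tokens here, which is why a later pass has to recognize consecutive pairs.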
Changed
- Release parser v2: SHITTY simplified to dict-driven tagging. The legacy ~480-line heuristic block in release/services.py is gone; pipeline._annotate_shitty does a single pass that looks each token up in the kb buckets (resolutions / sources / codecs / distributors / year / SxxExx) with first-match-wins semantics, and the leftmost contiguous UNKNOWN run becomes the title. annotate() no longer returns None: SHITTY is the always-on fallback when no group schema matches. services.py shrunk from ~525 to ~85 lines. Four fixtures (deutschland_franchise_box, sleaford_yt_slug, super_mario_bilingual, predator_space_separators; the last one moved from shitty/ to path_of_pain/) are now marked pytest.mark.xfail(strict=False), documenting PoP-grade pathologies that SHITTY intentionally won't handle. ReleaseFixture grows an xfail_reason field; the parametrized suite wires the xfail mark automatically.
- parse_release tokenizer is now data-driven: it splits on any character listed in separators.yaml (regex character class) instead of name.split("."). This makes YTS-style releases (The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]), space-separated names (Inception 2010 1080p BluRay x264-GROUP), and underscore-separated names parse correctly via the direct path, with no more fallback through sanitization.
- parse_release flow simplified: site-tag extraction always runs first (so parse_path == "sanitized" now reliably indicates a stripped [tag]), then well-formedness is checked only against truly forbidden chars (anything not in the configured separator set).
- ISO 639-2/B is now the canonical language code project-wide (was a mix of 639-1 and 639-2/T):
  - SubtitlePreferences.languages default is now ["fre", "eng"] (was ["fr", "en"]). Old LTM files are not auto-migrated; delete data/memory/ltm.json to regenerate with the new defaults.
  - Subtitle output filenames are now {iso639_2b}.srt (e.g. fre.srt, fre.sdh.srt). Existing fr.srt files are still read correctly (recognized as French via alias) but new files are written canonically.
  - Language value object docstring corrected: it has always stored 639-2/B (matching what ffprobe emits), not 639-2/T as previously documented.
- MovieService.validate_movie_file minimum size is now configurable via settings.min_movie_size_bytes (default unchanged: 100 MB). Constructor accepts an optional min_movie_size_bytes override for tests.
- SubtitleKnowledgeBase delegates language lookup to LanguageRegistry rather than duplicating tokens. subtitles.yaml now only declares subtitle-specific tokens (e.g. vostfr, vf, vff) under a new language_tokens section.
Removed
- alfred/domain/tv_shows/services.py and alfred/domain/movies/services.py deleted entirely. They held fossil parsers (parse_episode_filename, extract_movie_metadata, …) with zero production callers, superseded by parse_release as the single source of truth for release-name parsing. Associated tests (tests/domain/test_movies.py, tests/domain/test_tv_shows_service.py) removed as well.
- _sanitize and _normalize helpers in alfred/domain/release/services.py: the new tokenizer makes them redundant.
- _LANG_KEYWORDS, _SDH_TOKENS, _FORCED_TOKENS, SUBTITLE_EXTENSIONS hardcoded dicts in alfred/domain/subtitles/scanner.py: all knowledge now lives in YAML (CLAUDE.md compliance).
- _MIN_MOVIE_SIZE_BYTES module-level constant in alfred/domain/movies/services.py, replaced by the new setting.
- Top-level languages: block in subtitles.yaml, superseded by language_tokens: (subtitle-specific only) since iso_languages.yaml is the canonical source.
Fixed
- hi token no longer marks a subtitle as SDH (it conflicted with the ISO 639-1 alias for Hindi). SDH is now detected only via sdh, cc, and hearing tokens.
- SubtitleKnowledgeBase default rules used "fra" while iso_languages.yaml exposes French as "fre"; preferred-language defaults now match the canonical form.
Internal
- Domain I/O extraction (refactor/domain-io-extraction): the domain layer no longer performs subprocess calls, filesystem scans, or YAML loading. Achieved in a series of focused commits:
  - Knowledge YAML loaders moved to infrastructure: alfred/domain/release/knowledge.py, alfred/domain/shared/knowledge/language_registry.py, and alfred/domain/subtitles/knowledge/{base,loader}.py relocated to alfred/infrastructure/knowledge/. Re-exports were dropped; callers import directly from the new location.
  - MediaProber and FilesystemScanner Protocol ports introduced at alfred/domain/shared/ports/ with frozen-dataclass DTOs (SubtitleStreamInfo, FileEntry). SubtitleIdentifier and PatternDetector are now constructor-injected with concrete adapters (FfprobeMediaProber wrapping subprocess.run(ffprobe) and PathlibFilesystemScanner wrapping pathlib). No more direct subprocess / pathlib usage from the subtitle domain services.
  - Live filesystem methods removed from VOs and entities: FilePath.exists() / .is_file() / .is_dir() deleted; FilePath is now a pure address VO. Movie.has_file() and Episode.is_downloaded() dropped. Callers either rely on a prior detection step or use try/except over pre-checks (eliminates TOCTOU races).
  - SubtitlePlacer moved to the application layer at alfred/application/subtitles/placer.py: it performs os.link I/O, which doesn't belong in the domain. Pre-checks replaced with try/except for FileNotFoundError / FileExistsError.
  - SubtitleRuleSet.resolve() no longer reaches into the knowledge base: the implicit DEFAULT_RULES() helper is gone, replaced by an explicit default_rules: SubtitleMatchingRules parameter. The ManageSubtitles use case loads defaults from the KB once and passes them in.
  - SubtitleKnowledge Protocol port at alfred/domain/subtitles/ports/knowledge.py declares the read-only query surface domain services consume (7 methods: known_extensions, format_for_extension, language_for_token, is_known_lang_token, type_for_token, is_known_type_token, patterns). SubtitleIdentifier and PatternDetector depend on this Protocol instead of the concrete SubtitleKnowledgeBase from infrastructure, so domain/subtitles/ now has zero imports from infrastructure/. The remaining domain → infra leak (domain/release/ loading separator YAML at import-time) is documented in tech-debt and scheduled for its own branch.
- to_dot_folder_name(title) helper in alfred/domain/shared/value_objects.py: extracts the re.sub(r"[^\w\s\.\-]", "", title).replace(" ", ".") pattern that was duplicated between MovieTitle.normalized() and TVShow.get_folder_name().
- ParsedRelease.languages uses field(default_factory=list) instead of a manual __post_init__ that assigned [] via object.__setattr__.
- file_extensions.yaml splits subtitle sidecars (.srt, .sub, .idx, .ass, .ssa) into a dedicated subtitle: category instead of lumping them under metadata:. The _METADATA_EXTENSIONS set used by detect_media_type remains the union of both (same behavior: subtitles are still ignored when deciding the media type of a folder), but a new load_subtitle_extensions() loader is now available for the subtitles domain. Semantic clarity, no functional change.
- tv_shows/entities.py module docstring now shows the aggregate ownership as an ASCII tree before the rule text, for a quicker visual scan of the DDD structure.
- Removed backward-compat shims _sanitise_for_fs / _strip_episode_from_normalised from domain/release/value_objects.py (zero callers).
- Cleaned ruff warnings across the codebase: subprocess.run calls now pass explicit check=False (PLW1510); lazy imports promoted to module top where there was no cycle (PLC0415 in manage_subtitles.py, placer.py, qbittorrent/client.py, file_manager.py); fixed module-level import ordering (E402) in language_registry.py and subtitles/knowledge/loader.py; removed unused locals (F841 / B007); replaced unnecessary set comprehension with set() in release/knowledge.py (C416).
- Ruff config: ignore PLR0911 / PLR0912 (too-many-returns / too-many-branches) globally, since they are noisy on parser mappers and orchestrator use-cases where early-return validation is essential complexity. Ignore PLW0603 for the documented memory singleton (infrastructure/persistence/context.py).
- Release-knowledge DDD purification (refactor/domain-release-knowledge): the last domain → infrastructure leak (domain/release/value_objects.py loading YAML at import-time) is gone. Achieved via:
  - ReleaseKnowledge Protocol port at alfred/domain/release/ports/knowledge.py declares the read-only query surface release parsing needs (token sets for resolutions, sources, codecs, languages, hdr extras; structured dicts for audio, video_meta, editions, media_type_tokens; separators list; file-extension sets used by application/infra callers; sanitize_for_fs(text) method).
  - YamlReleaseKnowledge adapter at alfred/infrastructure/knowledge/release_kb.py loads every YAML constant once at construction. Builds an immutable str.maketrans translation table for filesystem sanitization.
  - parse_release(name, kb) takes the knowledge as an explicit parameter, with no more module-level YAML loading inside the domain. Every internal helper (_tokenize, _extract_tech, _extract_languages, _extract_audio, _extract_video_meta, _extract_edition, _extract_title, _infer_media_type, _is_well_formed) takes kb.
  - ParsedRelease Option B: sanitization happens once at parse time and is stored on a new title_sanitized: str field. Builder methods (show_folder_name, season_folder_name, episode_filename, movie_folder_name, movie_filename) are now pure: they accept already-sanitized tmdb_title_safe / tmdb_episode_title_safe arguments. Callers at the use-case boundary sanitize TMDB strings via kb.sanitize_for_fs(...) before passing them in.
  - All domain-knowledge constants removed from value_objects.py: _RESOLUTIONS, _SOURCES, _CODECS, _AUDIO, _VIDEO_META, _EDITIONS, _HDR_EXTRA, _MEDIA_TYPE_TOKENS, _LANGUAGE_TOKENS, _FORBIDDEN_CHARS, _VIDEO_EXTENSIONS, _NON_VIDEO_EXTENSIONS, _SUBTITLE_EXTENSIONS, _METADATA_EXTENSIONS, _WIN_FORBIDDEN_TABLE, and the _sanitize_for_fs helper. The domain module is now pure.
  - Application-layer KB singleton: resolve_destination.py instantiates a module-level _KB: ReleaseKnowledge = YamlReleaseKnowledge() and threads it through every parse_release(...) call. The local _sanitize helper and _WIN_FORBIDDEN regex were dropped in favor of _KB.sanitize_for_fs(...).
  - detect_media_type(parsed, source_path, kb) and find_video_file(path, kb) now take the knowledge explicitly instead of importing _*_EXTENSIONS constants from the domain.
  - agent/tools/filesystem.py::analyze_release imports the application KB singleton and passes it through.
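The translation-table approach to filesystem sanitization mentioned above can be sketched as follows. The forbidden character set is an assumption (the characters Windows rejects in file names), `SanitizerSketch` is a hypothetical class, and wrapping the table in `MappingProxyType` is just one way to make it read-only.

```python
from types import MappingProxyType

# Assumption: the characters Windows forbids in file names.
_FORBIDDEN = '<>:"/\\|?*'


class SanitizerSketch:
    """Builds the deletion table once, then reuses it for every call."""

    def __init__(self, forbidden: str = _FORBIDDEN) -> None:
        # str.maketrans("", "", chars) maps each listed character to None,
        # which str.translate interprets as "delete".
        self._table = MappingProxyType(str.maketrans("", "", forbidden))

    def sanitize_for_fs(self, text: str) -> str:
        return text.translate(self._table)


kb = SanitizerSketch()
# Sanitizing at the use-case boundary keeps downstream builders pure.
safe_title = kb.sanitize_for_fs('Who? What: "Now"')
```

A single translate pass avoids compiling or running a regex per call, and the builder methods downstream can assume their inputs are already safe.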
[2026-05-17] — TVShow & Movie aggregate refactor
Multi-phase overhaul of the TV show domain into a real DDD aggregate, with
matching parity work on Movie, a language knowledge system, and the
shared/media restructure that supports both.
Added
- Language knowledge system (alfred/knowledge/iso_languages.yaml + 42 languages including und for undetermined).
  - Language value object (frozen dataclass) with iso, english_name, native_name, aliases, and a matches(raw) cross-format helper.
  - LanguageRegistry loader (alfred/domain/shared/knowledge/) merging builtin + learned YAML. Not a singleton: the application layer instantiates it.
  - ISO 639-2/B is the canonical key; aliases cover 639-1, 639-2/T, English name, native name, and common spellings.
- VideoTrack dataclass (alfred/domain/shared/media/video.py) with a resolution property using width-priority bucket detection (handles cinema/scope crops like 1920×960 → 1080p).
- shared/media/matching.py: track_lang_matches helper shared by Episode and Movie. Implements the "C+" contract for language helpers:
  - Language query → cross-format match via Language.matches()
  - str query → case-insensitive direct comparison (no normalization)
- TVShow aggregate composition:
  - TVShow.seasons: dict[SeasonNumber, Season]
  - Season.episodes: dict[EpisodeNumber, Episode]
  - Season.expected_episodes / Season.aired_episodes (split so collection state can compare "owned vs aired today" without confusing in-flight seasons with future ones)
- Aggregate methods on TVShow:
  - add_episode(ep): sole sanctioned mutation entry point (creates the season if missing)
  - add_season(season): replaces a season wholesale
  - collection_status() → CollectionStatus.{EMPTY, PARTIAL, COMPLETE}
  - is_complete_series(): true iff ENDED + COMPLETE
  - missing_episodes(): flat list of all aired-but-not-owned (season, episode) pairs
- CollectionStatus enum (orthogonal to ShowStatus).
- Episode track helpers (has_audio_in, has_subtitles_in, has_forced_subs, audio_languages, subtitle_languages), driven by Episode.audio_tracks / Episode.subtitle_tracks.
- Movie aggregate parity: Movie now carries audio_tracks / subtitle_tracks and exposes the same helpers as Episode (same C+ contract).
- CHANGELOG.md (this file).
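Width-priority bucket detection, as mentioned for VideoTrack.resolution above, might work along these lines. The bucket thresholds and the height fallback are assumptions chosen for illustration, and `resolution` is a free function here, not the real property.

```python
from typing import Optional

# Assumed buckets: (minimum value, label), widest first.
_WIDTH_BUCKETS = ((3840, "2160p"), (1920, "1080p"), (1280, "720p"), (720, "480p"))
_HEIGHT_BUCKETS = ((2160, "2160p"), (1080, "1080p"), (720, "720p"), (480, "480p"))


def resolution(width: Optional[int], height: Optional[int]) -> Optional[str]:
    """Pick the resolution label from the width first, then from the height.

    Width wins because scope or cinema crops shrink the height while the
    width stays at the nominal value for the format.
    """
    if width:
        for minimum, label in _WIDTH_BUCKETS:
            if width >= minimum:
                return label
    if height:
        for minimum, label in _HEIGHT_BUCKETS:
            if height >= minimum:
                return label
    return None


print(resolution(1920, 960))
```

A height-only classifier would place a 1920×960 crop below 1080p; checking the width first keeps it in the 1080p bucket.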
Changed
- shared/media_info.py exploded into shared/media/{audio,video,subtitle,info,matching}.py.
- MediaInfo is now symmetric: every stream type is a list[Track]. Flat accessors (width, height, video_codec, resolution) remain as properties that read the first video track.
- MediaInfo.duration_seconds / bitrate_kbps moved from VideoTrack to MediaInfo (file-level: they come from the ffprobe format block, not a stream). Files without a video stream now correctly expose duration.
- ShowStatus.from_string extended to map TMDB strings (Returning Series, In Production, Pilot, Planned, Canceled, Cancelled). Comparison is whitespace-trimmed and case-insensitive.
- Season / Episode dropped their show_imdb_id back-references. They are owned by TVShow and reached only through it.
- TVShow.seasons_count and episode_count are now @property (computed from the dict) instead of stored ints.
- TVShowService.parse_episode_from_filename rewritten in string operations (no regex). Supports S01E05 / s1e5 and 1x05 / 01x5 forms.
- TVShowService.find_next_episode now drives off show.missing_episodes() instead of the hardcoded "max 50 episodes per season" heuristic.
- TVShowService constructor no longer takes season_repository / episode_repository: the aggregate persists in one block via TVShowRepository only.
- SubtitleTrack in alfred.domain.subtitles.entities renamed to SubtitleCandidate. Coexists with the shared.media.SubtitleTrack ffprobe-view dataclass (different bounded contexts, kept separate intentionally).
- tv_shows/services.py _VIDEO_EXTENSIONS now loaded from knowledge/release/file_extensions.yaml via load_video_extensions() (single source of truth).
- CLAUDE.md updated with three new policy sections:
  - "Tests": small updates OK during normal work, no mass-update sprees
  - "Backwards-compatibility shims": prefer clean migration over shims
  - "Regex": not forbidden, use judgment when string ops would be fragile
Removed
- Legacy Season N Episode N filename form in TVShowService.parse_episode_from_filename. It never appears in the release names Alfred handles, and supporting it forced a regex.
- SeasonRepository and EpisodeRepository: only the aggregate root has a repository (DDD rule: one repo per aggregate).
- shared/media_info.py compatibility shim (callers updated).
- SubtitleTrack compatibility alias in subtitles.entities (callers updated to SubtitleCandidate).
Fixed
- MediaInfo.duration_seconds returns None on audio-only files instead of crashing through primary_video.duration_seconds (see the duration/bitrate move under Changed).
- MediaOrganizer (infrastructure/filesystem/organizer.py) no longer passes the removed show_imdb_id / episode_count kwargs when constructing a Season for folder-name generation.
Internal
- Test suite rewritten where the aggregate redesign broke fixtures: tests/domain/test_tv_shows.py (69 tests), tests/domain/test_media_info.py (rewritten for VideoTrack), tests/application/test_enrich_from_probe.py (helper added), tests/infrastructure/test_filesystem_extras.py (fixtures), tests/domain/test_tv_shows_service.py (find_next_episode driven by real aggregate state).
- Subtitle services internal migration: matcher.py, utils.py, placer.py, identifier.py updated to import SubtitleCandidate.
- Suite status at end of block: 1066 passed, 8 skipped, 0 failed.