The annotate-based v2 pipeline now handles releases ending in -KONTRAST, -ELiTE, or -RARBG. Unknown groups still fall through to the legacy SHITTY heuristic in services.py; nothing changes for them.

Pipeline (alfred/domain/release/parser/pipeline.py):
- tokenize(): string-ops separator split, strips [site.tag] first.
- annotate(): right-to-left group detection (priority to codec-GROUP shape, fallback to any non-source dashed token), GroupSchema lookup via the kb port, then lockstep walk of tokens against schema chunks. Optional chunks skip on mismatch, mandatory mismatches return None so the caller falls back gracefully. CODEC pre-consumed by a codec-GROUP trailing token correctly skips the CODEC chunk in the body walk.
- assemble(): folds annotated tokens into a ParsedRelease-compatible dict (title joined by '.', group from the codec-GROUP token's extras).

Schema (alfred/domain/release/parser/schema.py):
- GroupSchema + SchemaChunk frozen value objects.
- TokenRole.GROUP added.

Port + adapter:
- ReleaseKnowledge.group_schema(name) lookup added (case-insensitive).
- YamlReleaseKnowledge loads alfred/knowledge/release/release_groups/*.yaml at construction time; learned overrides in data/knowledge/release/release_groups/ also picked up.

Knowledge:
- release_groups/kontrast.yaml, elite.yaml, rarbg.yaml declare the canonical chunk_order. ELiTE marks source as optional (Foundation.S02 has no WEBRip token).

Services:
- parse_release tries the v2 path first; on None falls through to the legacy implementation untouched.

Tests:
- tests/domain/release/test_parser_v2_easy.py (10 cases) cover group detection (codec-GROUP, dashed-source skip, no-dash → unknown), schema-driven annotation (movie, TV episode, season pack with optional source, unknown group returns None), and field assembly.
- Existing tests/domain/test_release_fixtures.py (30 cases) stay green: 5 EASY fixtures now produced by v2, 25 SHITTY/PATH OF PAIN fixtures still produced by the legacy path. Verified via spy on v2.assemble.

Suite: 1007 passed, 8 skipped.

Refs: project_release_parser_v2_specs (memory)
Changelog
All notable changes to Alfred are documented here.
The format is loosely based on Keep a Changelog.
Alfred is not yet on SemVer — entries are grouped by dated work blocks instead
of release numbers. Granularity targets behavioral or API-visible changes; refer
to git log for commit-level detail.
Sections used per block: Added / Changed / Deprecated / Removed / Fixed / Internal (for tech-debt and refactor noise that doesn't affect callers).
[Unreleased]
Added
- Release parser v2, EASY path live (`alfred/domain/release/parser/`): new annotate-based pipeline (tokenize → annotate → assemble) drives releases from known groups.
  - Exposes `Token` (frozen VO with `index` + `role` + `extra`), `TokenRole` enum (structural/technical/meta families), and `GroupSchema` / `SchemaChunk` value objects.
  - `pipeline.tokenize`: string-ops separator split (no regex), strips a `[site.tag]` prefix/suffix first.
  - `pipeline.annotate`: detects the trailing group right-to-left (priority to `codec-GROUP` shape, fallback to any non-source dashed token), looks up its `GroupSchema`, then walks tokens and schema chunks in lockstep. Optional chunks that don't match are skipped; mandatory mismatches abort EASY and return `None` so the caller can fall back to SHITTY.
  - `pipeline.assemble`: folds annotated tokens into a `ParsedRelease`-compatible dict.
  - `parse_release` (in `release.services`) tries the v2 EASY path first and falls through to the legacy SHITTY heuristic on `None`. Legacy SHITTY/PATH OF PAIN behavior is unchanged.
  - Knowledge: `alfred/knowledge/release/release_groups/{kontrast,elite,rarbg}.yaml` declare the canonical chunk order per group, loaded via the new `ReleaseKnowledge.group_schema(name)` port method.
  - Tests in `tests/domain/release/test_parser_v2_{scaffolding,easy}.py` cover token VOs, site-tag stripping, group detection, schema-driven annotation (movie, TV episode, season pack with optional source), and field assembly.
- Real-world release fixtures under `tests/fixtures/releases/{easy,shitty,path_of_pain}/`, each documenting an expected `ParsedRelease` plus the future `routing` (library / torrents / seed_hardlinks) for the upcoming `organize_media` refactor. Parametrized in `tests/domain/test_release_fixtures.py` for anti-regression.
  - EASY bucket seeded with 5 cases (movie, single-episode, season pack, movie + noise, YTS bracket-heavy).
  - SHITTY bucket seeded with 15 anti-regression cases covering: 3-level INTEGRALE hierarchy (Angel), French custom titles (Buffy, La Nuit au Musée, Chérie j'ai agrandi), multi-episode chain `S14E09E10E11` (Archer, captures E11 loss), lowercase `s01e01` (Notre Planète), `NxNN` with `-` separators (Vinyl, captures dash artifact), title-with-year-suffix (Deutschland.83), season-range `S01-06` (Tatortreiniger, captures movie misclassification), bare folder name (Jurassic Park, media_type=unknown), apostrophe-in-name (Honey Don't, captures full AI-path degeneration), SUBS-tag movie (Hook), space separators (Predator Badlands, captures group=UNKNOWN), subs-only release (Westworld S04).
  - PATH OF PAIN bucket seeded with 10 worst-case fixtures covering: UTF-8 wide pipe yt-dlp slug (Khruangbin), 3-show franchise box-set with double season range and parens-wrapped tech (Deutschland 83-86-89, captures `group=S03` misdetection), accented chars in title (Chérie BéBé with VFF), 8-word stand-up comedy title (Jimmy Carr), site-tag prefix + XviD (OxTorrent), episode title + air-date silently lost (Prodiges), full-chaos apostrophe + spaces + Blu-ray dash + 1080i + multi-word audio codec (The Prodigy, full AI-path degeneration), yt-dlp YouTube ID glued to year (Sleaford Mods), bilingual `[FR-EN]` tag mistaken for group (Super Mario Bros), COMPLETE + S01-S07 range + REPACK + HEVC (Gilmore Girls, the well-behaved exception).
- `NxNN` alt season/episode form supported by `parse_release`. Releases like `Show.1x05.720p.HDTV.x264-GRP` and `Show.2x07x08.1080p.WEB.x265-GRP` (multi-ep alt form) now parse as TV shows.
- `alfred/knowledge/release/separators.yaml` declares the token separators used by the release-name tokenizer (`.`, ` `, `[`, `]`, `(`, `)`, `_`). New conventions can be added without code changes. The canonical `.` is always present even if missing from YAML.
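The lockstep walk described in the parser v2 entry above can be illustrated with a minimal sketch. It is not the actual `pipeline.annotate` code: the `KNOWN` vocabulary, the role names, and the `walk` function are hypothetical stand-ins, title and group handling are left out, and returning `None` on leftover tokens is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SchemaChunk:
    """One slot of a group's chunk order (hypothetical shape)."""
    role: str
    optional: bool = False


# Hypothetical vocabulary; the real token sets live in the knowledge YAML.
KNOWN = {
    "resolution": {"720p", "1080p", "2160p"},
    "source": {"WEBRip", "BluRay", "WEB"},
    "codec": {"x264", "x265"},
}


def walk(tokens: list[str], chunks: list[SchemaChunk]) -> Optional[list[tuple[str, str]]]:
    """Walk tokens and schema chunks in lockstep.

    Optional chunks that do not match are skipped; a mandatory mismatch
    returns None so a caller could fall back to another parser.
    """
    annotated: list[tuple[str, str]] = []
    i = 0
    for chunk in chunks:
        if i < len(tokens) and tokens[i] in KNOWN.get(chunk.role, set()):
            annotated.append((tokens[i], chunk.role))
            i += 1
        elif chunk.optional:
            continue  # optional chunk absent: move on to the next chunk
        else:
            return None  # mandatory chunk missing: abort
    # Assumption: leftover tokens also mean the schema did not fit.
    return annotated if i == len(tokens) else None


SCHEMA = [
    SchemaChunk("resolution"),
    SchemaChunk("source", optional=True),
    SchemaChunk("codec"),
]
```

With this schema, `walk(["1080p", "x265"], SCHEMA)` skips the optional source chunk, while `walk(["WEBRip", "x265"], SCHEMA)` returns `None` because the mandatory resolution chunk does not match.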
Changed
- `parse_release` tokenizer is now data-driven: it splits on any character listed in `separators.yaml` (regex character class) instead of `name.split(".")`. This makes YTS-style releases (`The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]`), space-separated names (`Inception 2010 1080p BluRay x264-GROUP`), and underscore-separated names parse correctly via the direct path, with no more fallback through sanitization.
- `parse_release` flow simplified: site-tag extraction always runs first (so `parse_path == "sanitized"` now reliably indicates a stripped `[tag]`), then well-formedness is checked only against truly forbidden chars (anything not in the configured separator set).
- ISO 639-2/B is now the canonical language code project-wide (was a mix of 639-1 and 639-2/T):
  - `SubtitlePreferences.languages` default is now `["fre", "eng"]` (was `["fr", "en"]`). Old LTM files are not auto-migrated; delete `data/memory/ltm.json` to regenerate with the new defaults.
  - Subtitle output filenames are now `{iso639_2b}.srt` (e.g. `fre.srt`, `fre.sdh.srt`). Existing `fr.srt` files are still read correctly (recognized as French via alias) but new files are written canonically.
  - `Language` value object docstring corrected: it has always stored 639-2/B (matching what ffprobe emits), not 639-2/T as previously documented.
- `MovieService.validate_movie_file` minimum size is now configurable via `settings.min_movie_size_bytes` (default unchanged: 100 MB). Constructor accepts an optional `min_movie_size_bytes` override for tests.
- `SubtitleKnowledgeBase` delegates language lookup to `LanguageRegistry` rather than duplicating tokens. `subtitles.yaml` now only declares subtitle-specific tokens (e.g. `vostfr`, `vf`, `vff`) under a new `language_tokens` section.
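A data-driven tokenizer of the kind described in the first entry above could look roughly like the sketch below. The function names `build_splitter` and `tokenize` are hypothetical, and the separator list is passed in directly rather than loaded from `separators.yaml`; site-tag extraction is not shown.

```python
import re


def build_splitter(separators: list[str]) -> re.Pattern:
    """Compile a character class from the configured separators.

    The canonical "." is always added, mirroring the rule that it is
    present even if missing from the YAML.
    """
    chars = set(separators) | {"."}
    char_class = "".join(re.escape(c) for c in sorted(chars))
    return re.compile(f"[{char_class}]+")


def tokenize(name: str, splitter: re.Pattern) -> list[str]:
    """Split a release name on any run of separators, dropping empty tokens."""
    return [tok for tok in splitter.split(name) if tok]


SPLITTER = build_splitter([".", " ", "[", "]", "(", ")", "_"])
```

For instance, `tokenize("Inception 2010 1080p BluRay x264-GROUP", SPLITTER)` yields five tokens, with `x264-GROUP` kept whole because `-` is not a separator in this list.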
Removed
- `alfred/domain/tv_shows/services.py` and `alfred/domain/movies/services.py` deleted entirely. They held fossil parsers (`parse_episode_filename`, `extract_movie_metadata`, …) with zero production callers, superseded by `parse_release` as the single source of truth for release-name parsing. Associated tests (`tests/domain/test_movies.py`, `tests/domain/test_tv_shows_service.py`) removed as well.
- `_sanitize` and `_normalize` helpers in `alfred/domain/release/services.py`: the new tokenizer makes them redundant.
- `_LANG_KEYWORDS`, `_SDH_TOKENS`, `_FORCED_TOKENS`, `SUBTITLE_EXTENSIONS` hardcoded dicts in `alfred/domain/subtitles/scanner.py`: all knowledge now lives in YAML (CLAUDE.md compliance).
- `_MIN_MOVIE_SIZE_BYTES` module-level constant in `alfred/domain/movies/services.py`: replaced by the new setting.
- Top-level `languages:` block in `subtitles.yaml`: superseded by `language_tokens:` (subtitle-specific only) since `iso_languages.yaml` is the canonical source.
Fixed
- `hi` token no longer marks a subtitle as SDH (it conflicted with the ISO 639-1 alias for Hindi). SDH is now detected only via `sdh`, `cc`, and `hearing` tokens.
- `SubtitleKnowledgeBase` default rules used `"fra"` while `iso_languages.yaml` exposes French as `"fre"`; preferred languages defaults now match the canonical form.
Internal
- Domain I/O extraction (`refactor/domain-io-extraction`): the domain layer no longer performs subprocess calls, filesystem scans, or YAML loading. Achieved in a series of focused commits:
  - Knowledge YAML loaders moved to infrastructure: `alfred/domain/release/knowledge.py`, `alfred/domain/shared/knowledge/language_registry.py`, and `alfred/domain/subtitles/knowledge/{base,loader}.py` relocated to `alfred/infrastructure/knowledge/`. Re-exports were dropped; callers import directly from the new location.
  - `MediaProber` and `FilesystemScanner` Protocol ports introduced at `alfred/domain/shared/ports/` with frozen-dataclass DTOs (`SubtitleStreamInfo`, `FileEntry`). `SubtitleIdentifier` and `PatternDetector` are now constructor-injected with concrete adapters (`FfprobeMediaProber` wrapping `subprocess.run(ffprobe)` and `PathlibFilesystemScanner` wrapping `pathlib`). No more direct `subprocess` / `pathlib` usage from the subtitle domain services.
  - Live filesystem methods removed from VOs and entities: `FilePath.exists()` / `.is_file()` / `.is_dir()` deleted, so `FilePath` is now a pure address VO. `Movie.has_file()` and `Episode.is_downloaded()` dropped. Callers either rely on a prior detection step or use try/except over pre-checks (eliminates TOCTOU races).
  - `SubtitlePlacer` moved to the application layer at `alfred/application/subtitles/placer.py`: it performs `os.link` I/O, which doesn't belong in the domain. Pre-checks replaced with try/except for `FileNotFoundError` / `FileExistsError`.
  - `SubtitleRuleSet.resolve()` no longer reaches into the knowledge base: the implicit `DEFAULT_RULES()` helper is gone, replaced by an explicit `default_rules: SubtitleMatchingRules` parameter. The `ManageSubtitles` use case loads defaults from the KB once and passes them in.
  - `SubtitleKnowledge` Protocol port at `alfred/domain/subtitles/ports/knowledge.py` declares the read-only query surface domain services consume (7 methods: `known_extensions`, `format_for_extension`, `language_for_token`, `is_known_lang_token`, `type_for_token`, `is_known_type_token`, `patterns`). `SubtitleIdentifier` and `PatternDetector` depend on this Protocol instead of the concrete `SubtitleKnowledgeBase` from infrastructure, so `domain/subtitles/` now has zero imports from `infrastructure/`. The remaining domain → infra leak (`domain/release/` loading separator YAML at import-time) is documented in tech-debt and scheduled for its own branch.
- `to_dot_folder_name(title)` helper in `alfred/domain/shared/value_objects.py`: extracts the `re.sub(r"[^\w\s\.\-]", "", title).replace(" ", ".")` pattern that was duplicated between `MovieTitle.normalized()` and `TVShow.get_folder_name()`.
- `ParsedRelease.languages` uses `field(default_factory=list)` instead of a manual `__post_init__` that assigned `[]` via `object.__setattr__`.
- `file_extensions.yaml` splits subtitle sidecars (`.srt`, `.sub`, `.idx`, `.ass`, `.ssa`) into a dedicated `subtitle:` category instead of lumping them under `metadata:`. The `_METADATA_EXTENSIONS` set used by `detect_media_type` remains the union of both (same behavior: subtitles are still ignored when deciding the media type of a folder), but a new `load_subtitle_extensions()` loader is now available for the subtitles domain. Semantic clarity, no functional change.
- `tv_shows/entities.py` module docstring now shows the aggregate ownership as an ASCII tree before the rule text, for a quicker visual scan of the DDD structure.
- Removed backward-compat shims `_sanitise_for_fs` / `_strip_episode_from_normalised` from `domain/release/value_objects.py` (zero callers).
- Cleaned ruff warnings across the codebase: `subprocess.run` calls now pass explicit `check=False` (PLW1510); lazy imports promoted to module top where there was no cycle (PLC0415 in `manage_subtitles.py`, `placer.py`, `qbittorrent/client.py`, `file_manager.py`); fixed module-level import ordering (E402) in `language_registry.py` and `subtitles/knowledge/loader.py`; removed unused locals (F841 / B007); replaced unnecessary set comprehension with `set()` in `release/knowledge.py` (C416).
- Ruff config: ignore `PLR0911` / `PLR0912` (too-many-returns / too-many-branches) globally, since they are noisy on parser mappers and orchestrator use-cases where early-return validation is essential complexity. Ignore `PLW0603` for the documented memory singleton (`infrastructure/persistence/context.py`).
- Release-knowledge DDD purification (`refactor/domain-release-knowledge`): the last domain → infrastructure leak (`domain/release/value_objects.py` loading YAML at import-time) is gone. Achieved via:
  - `ReleaseKnowledge` Protocol port at `alfred/domain/release/ports/knowledge.py` declares the read-only query surface release parsing needs (token sets for resolutions, sources, codecs, languages, hdr extras; structured dicts for audio, video_meta, editions, media_type_tokens; separators list; file-extension sets used by application/infra callers; `sanitize_for_fs(text)` method).
  - `YamlReleaseKnowledge` adapter at `alfred/infrastructure/knowledge/release_kb.py` loads every YAML constant once at construction. Builds an immutable `str.maketrans` translation table for filesystem sanitization.
  - `parse_release(name, kb)` takes the knowledge as an explicit parameter, so there is no more module-level YAML loading inside the domain. Every internal helper (`_tokenize`, `_extract_tech`, `_extract_languages`, `_extract_audio`, `_extract_video_meta`, `_extract_edition`, `_extract_title`, `_infer_media_type`, `_is_well_formed`) takes `kb`.
  - `ParsedRelease` Option B: sanitization happens once at parse time and is stored on a new `title_sanitized: str` field. Builder methods (`show_folder_name`, `season_folder_name`, `episode_filename`, `movie_folder_name`, `movie_filename`) are now pure: they accept already-sanitized `tmdb_title_safe` / `tmdb_episode_title_safe` arguments. Callers at the use-case boundary sanitize TMDB strings via `kb.sanitize_for_fs(...)` before passing them in.
  - All domain-knowledge constants removed from `value_objects.py`: `_RESOLUTIONS`, `_SOURCES`, `_CODECS`, `_AUDIO`, `_VIDEO_META`, `_EDITIONS`, `_HDR_EXTRA`, `_MEDIA_TYPE_TOKENS`, `_LANGUAGE_TOKENS`, `_FORBIDDEN_CHARS`, `_VIDEO_EXTENSIONS`, `_NON_VIDEO_EXTENSIONS`, `_SUBTITLE_EXTENSIONS`, `_METADATA_EXTENSIONS`, `_WIN_FORBIDDEN_TABLE`, and the `_sanitize_for_fs` helper. The domain module is now pure.
  - Application-layer KB singleton: `resolve_destination.py` instantiates a module-level `_KB: ReleaseKnowledge = YamlReleaseKnowledge()` and threads it through every `parse_release(...)` call. The local `_sanitize` helper and `_WIN_FORBIDDEN` regex were dropped in favor of `_KB.sanitize_for_fs(...)`.
  - `detect_media_type(parsed, source_path, kb)` and `find_video_file(path, kb)` now take the knowledge explicitly instead of importing `_*_EXTENSIONS` constants from the domain.
  - `agent/tools/filesystem.py::analyze_release` imports the application KB singleton and passes it through.
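The port/adapter split described above can be sketched as follows. This is a reduced illustration, not the project's code: `SanitizerPort`, `InMemoryKnowledge`, `build_movie_folder`, the forbidden-character set, and the `Title (Year)` folder format are all hypothetical; only the ideas of a Protocol port, a translation table built once with `str.maketrans`, and sanitizing at the use-case boundary come from the entries above.

```python
from typing import Protocol


class SanitizerPort(Protocol):
    """Read-only surface the domain depends on (hypothetical, reduced to one method)."""

    def sanitize_for_fs(self, text: str) -> str: ...


class InMemoryKnowledge:
    """Adapter stand-in: builds the translation table once at construction."""

    def __init__(self, forbidden: str = '<>:"/\\|?*') -> None:
        # Assumption: forbidden characters are simply deleted.
        self._table = str.maketrans("", "", forbidden)

    def sanitize_for_fs(self, text: str) -> str:
        return text.translate(self._table)


def movie_folder_name(title_safe: str, year: int) -> str:
    """Pure builder: expects an already-sanitized title (hypothetical format)."""
    return f"{title_safe} ({year})"


def build_movie_folder(tmdb_title: str, year: int, kb: SanitizerPort) -> str:
    """Use-case boundary: sanitize once, then call the pure builder."""
    return movie_folder_name(kb.sanitize_for_fs(tmdb_title), year)
```

Because `build_movie_folder` only depends on the Protocol, a test can pass any object with a `sanitize_for_fs` method without touching YAML files.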
[2026-05-17] — TVShow & Movie aggregate refactor
Multi-phase overhaul of the TV show domain into a real DDD aggregate, with
matching parity work on Movie, a language knowledge system, and the
shared/media restructure that supports both.
Added
- Language knowledge system (`alfred/knowledge/iso_languages.yaml` + 42 languages including `und` for undetermined).
  - `Language` value object (frozen dataclass) with `iso`, `english_name`, `native_name`, `aliases`, and a `matches(raw)` cross-format helper.
  - `LanguageRegistry` loader (`alfred/domain/shared/knowledge/`) merging builtin + learned YAML. Not a singleton; the application layer instantiates it.
  - ISO 639-2/B is the canonical key; aliases cover 639-1, 639-2/T, English name, native name, and common spellings.
- `VideoTrack` dataclass (`alfred/domain/shared/media/video.py`) with a `resolution` property using width-priority bucket detection (handles cinema/scope crops like 1920×960 → 1080p).
- `shared/media/matching.py`: `track_lang_matches` helper shared by `Episode` and `Movie`. Implements the "C+" contract for language helpers:
  - `Language` query → cross-format match via `Language.matches()`
  - `str` query → case-insensitive direct comparison (no normalization)
- TVShow aggregate composition:
  - `TVShow.seasons: dict[SeasonNumber, Season]`
  - `Season.episodes: dict[EpisodeNumber, Episode]`
  - `Season.expected_episodes` / `Season.aired_episodes` (split so collection state can compare "owned vs aired today" without confusing in-flight seasons with future ones)
- Aggregate methods on `TVShow`:
  - `add_episode(ep)`: sole sanctioned mutation entry point (creates the season if missing)
  - `add_season(season)`: replaces a season wholesale
  - `collection_status()` → `CollectionStatus.{EMPTY, PARTIAL, COMPLETE}`
  - `is_complete_series()`: true iff `ENDED + COMPLETE`
  - `missing_episodes()`: flat list of all aired-but-not-owned `(season, episode)` pairs
- `CollectionStatus` enum (orthogonal to `ShowStatus`).
- Episode track helpers (`has_audio_in`, `has_subtitles_in`, `has_forced_subs`, `audio_languages`, `subtitle_languages`), driven by `Episode.audio_tracks` / `Episode.subtitle_tracks`.
- Movie aggregate parity: `Movie` now carries `audio_tracks` / `subtitle_tracks` and exposes the same helpers as `Episode` (same C+ contract).
- `CHANGELOG.md` (this file).
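The "C+" contract listed above can be illustrated with a small sketch. The field names follow the `Language` entry, but the `matches` body, the `track_lang_matches` signature, and the handling of a missing track language are assumptions rather than the project's implementation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Language:
    iso: str  # ISO 639-2/B canonical key, e.g. "fre"
    english_name: str
    native_name: str
    aliases: tuple[str, ...] = ()

    def matches(self, raw: str) -> bool:
        """Cross-format match: canonical key, names, or any alias (assumed logic)."""
        candidates = {self.iso, self.english_name, self.native_name, *self.aliases}
        return raw.lower() in {c.lower() for c in candidates}


def track_lang_matches(track_language: Optional[str], query: Union[Language, str]) -> bool:
    """Language query: cross-format match. str query: case-insensitive direct comparison."""
    if track_language is None:
        return False  # assumption: an untagged track matches nothing
    if isinstance(query, Language):
        return query.matches(track_language)
    return track_language.lower() == query.lower()


FRENCH = Language("fre", "French", "Français", aliases=("fr", "fra"))
```

Under these assumptions a track tagged `fr` matches the `FRENCH` value object through its alias, but does not match the plain string `"fre"`, since string queries are compared directly without normalization.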
Changed
- `shared/media_info.py` exploded into `shared/media/{audio,video,subtitle,info,matching}.py`.
- `MediaInfo` is now symmetric: every stream type is a `list[Track]`. Flat accessors (`width`, `height`, `video_codec`, `resolution`) remain as properties that read the first video track.
- `MediaInfo.duration_seconds` / `bitrate_kbps` moved from `VideoTrack` to `MediaInfo` (file-level: they come from the ffprobe `format` block, not a stream). Files without a video stream now correctly expose duration.
- `ShowStatus.from_string` extended to map TMDB strings (`Returning Series`, `In Production`, `Pilot`, `Planned`, `Canceled`, `Cancelled`). Comparison is whitespace-trimmed and case-insensitive.
- `Season` / `Episode` dropped their `show_imdb_id` back-references. They are owned by `TVShow` and reached only through it.
- `TVShow.seasons_count` and `episode_count` are now `@property` (computed from the dict) instead of stored ints.
- `TVShowService.parse_episode_from_filename` rewritten in string operations (no regex). Supports `S01E05` / `s1e5` and `1x05` / `01x5` forms.
- `TVShowService.find_next_episode` now drives off `show.missing_episodes()` instead of the hardcoded "max 50 episodes per season" heuristic.
- `TVShowService` constructor no longer takes `season_repository` / `episode_repository`; the aggregate persists in one block via `TVShowRepository` only.
- `SubtitleTrack` in `alfred.domain.subtitles.entities` renamed to `SubtitleCandidate`. Coexists with the `shared.media.SubtitleTrack` ffprobe-view dataclass (different bounded contexts, kept separate intentionally).
- `tv_shows/services.py` `_VIDEO_EXTENSIONS` now loaded from `knowledge/release/file_extensions.yaml` via `load_video_extensions()` (single source of truth).
- `CLAUDE.md` updated with three new policy sections:
  - "Tests": small updates OK during normal work, no mass-update sprees
  - "Backwards-compatibility shims": prefer clean migration over shims
  - "Regex": not forbidden, use judgment when string ops would be fragile
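A regex-free parse of the two season/episode forms mentioned above (`S01E05` / `s1e5` and `1x05` / `01x5`) might look like the sketch below. It works on a single token rather than a whole filename, and `parse_episode_token` is a hypothetical name, not the signature of `TVShowService.parse_episode_from_filename`.

```python
from __future__ import annotations

from typing import Optional


def _is_number(text: str) -> bool:
    """True for a non-empty run of ASCII digits."""
    return text.isascii() and text.isdigit()


def parse_episode_token(token: str) -> Optional[tuple[int, int]]:
    """Return (season, episode) for SxxEyy or NxNN tokens, else None."""
    t = token.lower()
    # SxxEyy form: strip the leading "s", then split on the first "e".
    if t.startswith("s") and "e" in t:
        season, _, episode = t[1:].partition("e")
        if _is_number(season) and _is_number(episode):
            return int(season), int(episode)
        return None
    # NxNN form: split on the first "x"; both sides must be numeric.
    season, sep, episode = t.partition("x")
    if sep and _is_number(season) and _is_number(episode):
        return int(season), int(episode)
    return None
```

Tokens such as `x264` or `1080p` fall through to `None` because one side of the split is empty or non-numeric, which is the kind of false positive a looser pattern could let through.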
Removed
- Legacy `Season N Episode N` filename form in `TVShowService.parse_episode_from_filename`. It never appears in the release names Alfred handles, and supporting it forced a regex.
- `SeasonRepository` and `EpisodeRepository`: only the aggregate root has a repository (DDD rule: one repo per aggregate).
- `shared/media_info.py` compatibility shim; callers updated.
- `SubtitleTrack` compatibility alias in `subtitles.entities`; callers updated to `SubtitleCandidate`.
Fixed
- `MediaInfo.duration_seconds` returns `None` on audio-only files instead of crashing through `primary_video.duration_seconds` (see the duration/bitrate move under Changed).
- `MediaOrganizer` (`infrastructure/filesystem/organizer.py`) no longer passes the removed `show_imdb_id` / `episode_count` kwargs when constructing a `Season` for folder-name generation.
Internal
- Test suite rewritten where the aggregate redesign broke fixtures: `tests/domain/test_tv_shows.py` (69 tests), `tests/domain/test_media_info.py` (rewritten for `VideoTrack`), `tests/application/test_enrich_from_probe.py` (helper added), `tests/infrastructure/test_filesystem_extras.py` (fixtures), `tests/domain/test_tv_shows_service.py` (`find_next_episode` driven by real aggregate state).
- Subtitle services internal migration: `matcher.py`, `utils.py`, `placer.py`, `identifier.py` updated to import `SubtitleCandidate`.
- Suite status at end of block: 1066 passed, 8 skipped, 0 failed.