37 Commits

Author SHA1 Message Date
francwa 745dec39f5 FINAL COMMIT BEFORE REWRITE 2026-05-26 21:45:11 +02:00
francwa 42fa6139ed refactor(tools): wire filesystem tools to new use cases, drop broken ones
Updates alfred/agent/tools/filesystem.py to use the five free-function
use cases introduced in the previous commit:

  - list_folder            -> list_dir_use_case(Path(path), roots)
  - create_directory (new) -> create_dir_use_case(Path(path), roots)
  - move_media             -> move_file_use_case(src, dst, roots)
  - move_to_destination    -> create_dir_use_case(dst.parent) + move_file

A module-level _load_directory_roots() helper reads memory once per
call and builds the DirectoryRoots VO; missing roots produce an
explicit 'roots_not_configured' error.

Tools whose backing code was moved to *_OLD files are removed entirely
rather than left broken: manage_subtitles, set_path_for_folder,
create_seed_links, and the four resolve_*_destination tools. They will
come back when the matching application/domain code is rebuilt later
on this branch.

alfred/agent/tools/__init__.py shrinks accordingly. find_media_imdb_id
(already broken before this branch — name not exported by tools.api)
is dropped from the package re-exports so the package imports cleanly
again.
2026-05-26 19:46:49 +02:00
francwa 2df7843d8b refactor(filesystem): split into 5 atomic free-function ops + use cases
Replaces the monolithic FileManager class + scattered helpers in
alfred/infrastructure/filesystem with five free functions, each
single-responsibility and pathlib-native:

  list_dir / create_dir / link_file / move_file / move_dir

The infra layer now raises typed exceptions (FilesystemError base
+ SourceNotFound / DestinationExists / NotADirectory / NotAFile /
PermissionDenied / CrossDevice / FilesystemOSError) instead of
returning {status: ok|error} dicts. No more get_memory() reads
from infra.

Application layer mirrors the same split: five free use cases
(<op>_use_case) wrap each infra op, guard inputs against escaping
the new DirectoryRoots VO (downloads / torrents / movies /
tv_shows), catch infra exceptions, and return frozen DTOs. Roots
are injected — no global state.

Legacy files kept on disk with _OLD suffix for reference during
the follow-up rewiring (FileManager, MediaOrganizer,
create_folder/move helpers; CreateSeedLinks/ListFolder/MoveMedia/
ManageSubtitles use cases, resolve_destination). They are no
longer exported from __init__, which intentionally breaks current
agent tool wrappers and downstream tests — re-wiring is the next
chunk of work on the unfuck branch.
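
The infra/application split described above can be sketched roughly as
follows. This is a minimal illustration, not the code from this commit:
the CreateDirResult DTO, the contains() helper and the "outside_roots"
error code are assumptions; only FilesystemError, DestinationExists,
DirectoryRoots and the create_dir / create_dir_use_case names come from
the message.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


class FilesystemError(Exception):
    """Base class for infra-level filesystem failures."""


class DestinationExists(FilesystemError):
    """Raised when the target path is already present."""


@dataclass(frozen=True)
class DirectoryRoots:
    downloads: Path
    torrents: Path
    movies: Path
    tv_shows: Path

    def contains(self, path: Path) -> bool:
        # Hypothetical guard: resolve first so ".." cannot escape a root.
        resolved = path.resolve()
        roots = (self.downloads, self.torrents, self.movies, self.tv_shows)
        return any(resolved.is_relative_to(r.resolve()) for r in roots)


@dataclass(frozen=True)
class CreateDirResult:  # hypothetical DTO shape
    status: str
    path: str | None = None
    error: str | None = None


def create_dir(path: Path) -> None:
    """Infra op: raises a typed exception instead of returning a dict."""
    try:
        path.mkdir(parents=True)
    except FileExistsError as exc:
        raise DestinationExists(str(path)) from exc


def create_dir_use_case(path: Path, roots: DirectoryRoots) -> CreateDirResult:
    """Use case: guard the roots, catch infra errors, return a frozen DTO."""
    if not roots.contains(path):
        return CreateDirResult(status="error", error="outside_roots")
    try:
        create_dir(path)
    except FilesystemError as exc:
        return CreateDirResult(status="error", error=type(exc).__name__)
    return CreateDirResult(status="ok", path=str(path))
```

Roots are passed in explicitly, so a test can point them at a temporary
directory without touching any global state.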
2026-05-26 19:22:09 +02:00
francwa 28304bb162 fix(releases): repair singular 'release' imports in parser
The CHOP CHOP CHOP pass left parser/{pipeline,scoring,services}.py
importing from alfred.domain.release.value_objects (singular), which
does not exist. parse_release was unimportable; all release tests
errored at collection.

Point the 3 imports at value_objects_old_question_mark.py, which still
holds ParsedRelease/ParseReport/MediaTypeToken/TokenizationRoute. The
file name is misleading (it is not 'old' — it is the active parser VO);
naming will be resolved when ParsedRelease itself is replaced. Tracked
in .claude/specs/unfuck_technical_debt.md #4.
2026-05-26 06:55:30 +02:00
francwa c62ae81275 refactor(tmdb): ACL pass — push VOs into DTOs, split search per media type
Anti-corruption boundary tightened on the TMDB adapter:

* TmdbMovieInfo / TmdbShowInfo now carry domain VOs (TmdbId, ImdbId,
  MovieTitle, ReleaseYear, ShowStatus) instead of raw scalars —
  validation happens at the boundary, not three layers later.
* ShowStatus enum added (domain/tv_shows/value_objects) with a
  from_tmdb() mapper that falls back to UNKNOWN + logs a warning on
  unrecognized values. TVShow.status is now ShowStatus, not str.
* MovieTitle cap raised from 100 to 150 chars.
* MediaResult / ExternalIds dropped. Replaced by per-media search
  DTOs: TmdbMovieSearchResult and TmdbShowSearchResult. Neither
  carries imdb_id — search no longer enriches with external_ids
  (callers needing imdb_id follow up with get_movie_info /
  get_tv_show_info on the chosen tmdb_id).
* TMDBClient: search_multi / search_media / _parse_result removed.
  search_movies (/search/movie) and search_shows (/search/tv) added,
  each parsing hits into VO-typed DTOs.
* SearchMovieUseCase returns a list of MovieHit (flattened to
  primitives for the agent). New symmetric SearchShowUseCase +
  ShowHit / SearchShowResponse DTOs.
* agent/tools/api.py: find_media_imdb_id → search_movies +
  search_shows wrappers.
* FileEntry moved from domain/shared/ports/filesystem_scanner.py to
  domain/shared/file_entry.py (it's a DTO, not a Protocol); size_kb
  (float) → size (int bytes). Scanner and SubtitleIdentifier
  updated.

Tests: 79/79 pass on tests/infrastructure/api/ +
tests/application/test_search_movie.py +
tests/application/test_search_show.py.
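
A sketch of the from_tmdb() fallback described above. The enum members
and the mapping table are assumptions for illustration (the message only
names ShowStatus, from_tmdb() and UNKNOWN); the raw strings are status
values TMDB is known to return for TV shows.

```python
from __future__ import annotations

import logging
from enum import Enum

logger = logging.getLogger(__name__)


class ShowStatus(Enum):
    # Member names other than UNKNOWN are illustrative assumptions.
    RETURNING = "returning"
    ENDED = "ended"
    CANCELED = "canceled"
    UNKNOWN = "unknown"

    @classmethod
    def from_tmdb(cls, raw: str | None) -> ShowStatus:
        """Map a raw TMDB status string; unknown values degrade to UNKNOWN."""
        mapping = {
            "Returning Series": cls.RETURNING,
            "Ended": cls.ENDED,
            "Canceled": cls.CANCELED,
        }
        status = mapping.get(raw or "")
        if status is None:
            logger.warning("Unrecognized TMDB show status: %r", raw)
            return cls.UNKNOWN
        return status
```

Falling back instead of raising keeps a new upstream status value from
breaking the adapter; the warning leaves a trace to act on.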
2026-05-26 05:54:58 +02:00
francwa cffafa2e60 CHOP CHOP CHOP 2026-05-26 05:45:07 +02:00
francwa b3abad4da4 chore(dot_alfred): Phase 5 cleanup + changelog
Delete the orphan MediaWithTracks mixin (and its only consumer,
track_lang_matches) from alfred.domain.shared.media — zero callers
since the v2 aggregates landed in Phase 3, parked for the Phase 5
decision in CHANGELOG.

Cleanup sweep across alfred/ and tests/ returns zero hits for:
* MediaWithTracks
* the v1 dot_alfred symbols
* the v1 sidecar names
* the alfred.application.library package
The v2 surface is the only one left.

CHANGELOG updated with:
* the Phase 5 sync orchestrators (sync_show / sync_movie),
* the Phase 4b PACK vs EPISODIC fix (Fixed section),
* the MediaWithTracks deletion in Removed,
* refreshed suite count (1277 passing).
2026-05-26 00:55:17 +02:00
francwa 7ff2e6bc4e feat(movies): sync_movie populates library index from TMDB
Parallel to sync_show. Calls TMDBClient.get_movie_info,
combines the TmdbMovieInfo with the on-disk MovieRelease loaded
via DotAlfredMovieReleaseRepository.load_by_tmdb_id, and upserts
into DotAlfredMovieLibraryIndex.

Policy mirrors sync_show with two adaptations specific to movies:
* placeholder signature is name == metadata.path (auto-heal writes
  them equal — the schema requires name to be non-empty so we can't
  use name == "" as the spec originally suggested),
* when the per-movie sidecar is gone but the index entry remains,
  sync warns and returns the existing entry unchanged (no upsert
  possible without a release: index.upsert requires folder/imdb_id
  from the MovieRelease itself).

Raises MovieNotFoundInLibrary when neither index nor sidecar
carry tmdb_id.
2026-05-26 00:51:43 +02:00
francwa 8f31f880aa feat(tv_shows): sync_show populates library index from TMDB
New orchestrator alfred.application.tv_shows.sync.sync_show calls
TMDBClient.get_tv_show_info, combines the response with the on-disk
release loaded via DotAlfredSeriesReleaseRepository.load_by_tmdb_id,
and upserts the result into DotAlfredTVShowLibraryIndex.

Policy:
* placeholders (auto-healed entries, status=="unknown") always
  refresh regardless of TTL,
* fresh entries within Settings.tmdb_cache_ttl_days are no-ops,
* stale entries past TTL refresh,
* force=True overrides both gates,
* indexed shows whose per-show sidecar is gone still get a fresh
  TMDB pass — slot map clears until rescan repopulates it,
* truly absent shows raise ShowNotFoundInLibrary from the new
  alfred.application.exceptions module.
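
The refresh gates in the policy above could be expressed as a single
predicate. This is a sketch under assumptions: needs_refresh and the
synced_at timestamp are hypothetical names, and the default of 14 days
mirrors Settings.tmdb_cache_ttl_days.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def needs_refresh(
    *,
    status: str,
    synced_at: datetime | None,
    now: datetime,
    ttl_days: int = 14,
    force: bool = False,
) -> bool:
    """Return True when a TMDB pass should run for an indexed entry."""
    if force:
        return True  # force=True overrides both gates
    if status == "unknown" or synced_at is None:
        return True  # placeholders always refresh, regardless of TTL
    # Fresh entries inside the TTL are no-ops; stale ones refresh.
    return now - synced_at > timedelta(days=ttl_days)
```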
2026-05-26 00:49:00 +02:00
francwa 1efe9a82c1 feat(dot_alfred): load_by_tmdb_id on release repos
Series repo returns (release, folder) so the upcoming sync
orchestrator can feed the library index's upsert(..., path=...).
Movie repo returns the release alone (folder is on release.folder
by the one-folder-one-file convention) — kept as a semantic alias
of find_by_tmdb_id for symmetry with the series side.
2026-05-26 00:45:14 +02:00
francwa 0dc053881a feat(tmdb): add TmdbMovieInfo DTO and get_movie_info
Symmetric to TmdbShowInfo / get_tv_show_info — gives the upcoming
sync_movie orchestrator a typed cache snapshot for the v2 movie
library index.

* TmdbMovieInfo(tmdb_id, imdb_id, title, release_year)
* parse_movie_info(details, external_ids) — pure builder, parses
  release_year from the first 4 chars of release_date (None on
  missing/empty/non-numeric)
* TMDBClient.get_movie_info(tmdb_id) — aggregates
  /movie/{id} + /movie/{id}/external_ids and feeds the parser

Tests cover happy path, missing/null/empty imdb_id, every
release_year edge (none/empty/short/non-numeric/missing key),
and the two required-field errors (id, title).
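
The release_year rule (first 4 chars of release_date, None on
missing/empty/short/non-numeric) can be sketched as below. The helper
name parse_release_year is hypothetical; in the commit this logic lives
inside parse_movie_info.

```python
from __future__ import annotations


def parse_release_year(release_date: str | None) -> int | None:
    """Extract the year from a 'YYYY-MM-DD' string, or None if unusable."""
    if not release_date:
        return None  # missing key, null, or empty string
    head = release_date[:4]
    # isascii() guards against digit-like characters that int() rejects.
    if len(head) != 4 or not (head.isascii() and head.isdigit()):
        return None
    return int(head)
```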
2026-05-26 00:35:42 +02:00
francwa 97dc799a26 fix(tv_shows): correct PACK vs EPISODIC classification model
The Phase 4 walker + rescan logic classified seasons by parser
output (does the filename carry Exx?), but PACK vs EPISODIC is a
structural distinction:

* PACK = season folder with N flat SxxEyy videos directly inside
* EPISODIC = season folder with N subfolders, each holding one video

Changes:
* walker.py: descends two levels under show_root and classifies
  each season folder by FS structure. SeasonFolder now carries
  mode: ReleaseMode | None. Mixed layouts (flat + subfolders) and
  EPISODIC subfolders with >1 video log a warning and report
  mode=None.
* rescan.py: trusts walker.mode; drops the bogus 'single un-
  numbered video → PACK with empty episodes' branch. A season
  with no parseable episodes is now skipped with a warning.
* Tests rewritten against the real model: PACK with flat numbered
  files, EPISODIC with one-video-per-subfolder, malformed mixed
  layout skipped, single-un-numbered-file skipped.

Suite: 1237 → 1245 passing.
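
A rough sketch of structure-based classification as described above. It
is not the walker from this commit: VIDEO_EXTENSIONS stands in for
kb.video_extensions, the function names are hypothetical, and ignoring
subfolders without videos is an assumption.

```python
from __future__ import annotations

from enum import Enum
from pathlib import Path

VIDEO_EXTENSIONS = {".mkv", ".mp4", ".avi"}  # assumption, see lead-in


class ReleaseMode(Enum):
    PACK = "pack"
    EPISODIC = "episodic"


def _videos(folder: Path) -> list[Path]:
    """Video files directly inside folder (no recursion)."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )


def classify_season_folder(season_dir: Path) -> ReleaseMode | None:
    """Classify by filesystem layout only; None means malformed or empty."""
    flat = _videos(season_dir)
    # Subfolders holding no video at all are ignored (assumption).
    video_subfolders = [
        d for d in season_dir.iterdir() if d.is_dir() and _videos(d)
    ]
    if flat and video_subfolders:
        return None  # mixed layout: flat files plus episode subfolders
    if flat:
        return ReleaseMode.PACK
    if video_subfolders:
        if any(len(_videos(d)) > 1 for d in video_subfolders):
            return None  # an EPISODIC subfolder must hold exactly one video
        return ReleaseMode.EPISODIC
    return None
```

The real walker also logs a warning on the malformed cases; that is left
out here to keep the sketch short.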
2026-05-25 21:37:34 +02:00
francwa fe9857aaed docs(changelog): Phase 4 Step 5 — record dot_alfred v2 Phase 4 work
Append Phase 4 entry under [Unreleased]:
* Added: rescan_show v2 signature + new rescan_movie + PACK empty-
  episodes semantics + Settings.tmdb_cache_ttl_days + library-index
  anchor-mismatch warning
* Removed: v1 dot_alfred stack (bridge/repository/serializer/sidecar),
  abstract domain ports (TVShowRepository / MovieRepository),
  application/library/ package, two Phase-3 quarantine test files
* Internal: 1233 → 1237 passing, 10 → 8 skips; MediaWithTracks
  mixin parked for Phase 5

Phase 3 entries left intact (historically accurate at commit time).
2026-05-25 21:17:23 +02:00
francwa cc334a7951 feat(dot_alfred/v2): Phase 4 Step 4 — settings + anchor warning
Two small additions that close out Phase 4's loose ends.

Settings — tmdb_cache_ttl_days

    class Settings(BaseSettings):
        # --- DOT_ALFRED ---
        tmdb_cache_ttl_days: int = 14

Default 14 days, matching the dot_alfred_v2 master spec. Will drive
the Phase 5 TTL policy on TVShowLibraryIndexSidecar /
MovieLibraryIndexSidecar (decide when a TMDB-cached entry is stale
and triggers a refresh sync).

Anchor-mismatch warning

DotAlfredTVShowLibraryIndex._load_or_heal and DotAlfredMovieLibraryIndex
._load_or_heal now cross-check each indexed entry's metadata.path
against the on-disk folder layout right after a successful parse.
Drift (sidecar says folder X, X no longer exists under library_root)
is surfaced as a WARNING log — one per missing folder, with the
tmdb_id for cross-reference. No auto-heal on drift; the caller
decides (the heal path remains opt-in via index.heal()).

The warning fires only on the parsed-index path. The heal path
always synthesizes entries from real folder names, so it can never
drift — silent by construction.

Tests

* TestTVShowLibraryIndexAnchorWarning — 3 scenarios:
  warn-on-drift / no-warn-on-match / no-warn-on-heal.
* TestMovieLibraryIndexAnchorWarning — symmetric coverage.

Full suite: 1237 passed / 8 skipped / 4 xfailed.
2026-05-25 21:14:18 +02:00
francwa 86222d95d1 refactor(persistence): Phase 4 Step 3 — delete v1 dot_alfred + ports
Now that rescan_show + rescan_movie run on the v2 release repositories
(Phase 4 Steps 1-2), the v1 dot_alfred stack and its abstract domain
ports have zero callers. Delete them and lift the Phase 3 quarantines.

Deleted

* alfred/infrastructure/persistence/dot_alfred/bridge.py
* alfred/infrastructure/persistence/dot_alfred/repository.py     (v1)
* alfred/infrastructure/persistence/dot_alfred/serializer.py     (v1)
* alfred/infrastructure/persistence/dot_alfred/sidecar.py        (v1)
* alfred/domain/tv_shows/repositories.py     (TVShowRepository ABC)
* alfred/domain/movies/repositories.py       (MovieRepository ABC)
* tests/infrastructure/persistence/dot_alfred/test_repository.py
* tests/infrastructure/persistence/dot_alfred/test_serializer.py

Rewrite

alfred/infrastructure/persistence/dot_alfred/__init__.py now re-
exports only the v2 surface: the four concrete repositories
(DotAlfredSeriesReleaseRepository, DotAlfredMovieReleaseRepository,
DotAlfredTVShowLibraryIndex, DotAlfredMovieLibraryIndex) plus
ShowFolderUnknown. DTO-level imports go through
alfred.infrastructure.persistence.dot_alfred.v2 directly.

No backwards-compat shims (per CLAUDE.md): the v1 names are gone,
not aliased. Test suite drops from 10 → 8 skips (the two Phase 3
module-level skips disappear with the quarantined files).

Full suite: 1233 passed / 8 skipped / 4 xfailed.

The MediaWithTracks mixin in alfred.domain.shared.media is now
orphaned (Episode lost its tracks in Phase 3, MovieRelease doesn't
inherit it). Parked for Phase 5, which will either mount it on
MovieRelease / SeasonRelease or delete it for good.
2026-05-25 21:10:32 +02:00
francwa 9e48c70b8a feat(rescan): Phase 4 Step 2 — add rescan_movie orchestrator
Mirror rescan_show for the movies library. Locates the main video via
find_video_file, runs inspect_release once (movies are one-folder-one-
main-file by convention), and writes a v2 MovieRelease sidecar via
DotAlfredMovieReleaseRepository.

Signature

    rescan_movie(
        movie_dir,
        *,
        tmdb_id: TmdbId,
        imdb_id: ImdbId | None = None,
        movie_repo: DotAlfredMovieReleaseRepository,
        prober,
        kb,
    ) -> MovieRelease

Behavior

* added_at = datetime.now(UTC) — the v2 sidecar records when the
  release was last reconciled with disk, not filesystem mtime (which
  drifts across moves and hard-links). Phase 3 made this field
  required on MovieRelease.
* No TMDB call. Index auto-heals from the new sidecar on next read.
* MovieRescanFailed raised when no video is found inside movie_dir
  (only explicit failure mode; all other adapter errors degrade
  gracefully into empty / partial fields).
* file_path is recorded relative to movie_dir so the sidecar stays
  portable across library moves.

Tests

tests/application/movies/test_rescan.py: 8 scenarios on the real v2
movie repo + real KB + stubbed prober. Covers track flattening,
sidecar round-trip, prober returning None, video in subfolder,
explicit no-video failure, imdb_id optional.

Full suite: 1233 passed / 10 skipped / 4 xfailed.
2026-05-25 21:09:02 +02:00
francwa 7da0f887e7 refactor(rescan): Phase 4 Step 1 — rescan_show on v2 release repo
Rewrite rescan_show to build a SeriesRelease (Phase 1 v2 aggregate)
and persist it via DotAlfredSeriesReleaseRepository. The orchestrator
keeps reusing inspect_release as the single source of parse/probe
truth — only the assembly target changes (SeriesRelease/SeasonRelease/
EpisodeRelease instead of TVShow/Season/Episode).

New signature

    rescan_show(
        show_root,
        *,
        tmdb_id: TmdbId,
        imdb_id: ImdbId | None = None,
        series_repo: DotAlfredSeriesReleaseRepository,
        scanner,
        prober,
        kb,
    ) -> SeriesRelease

Identity is TMDB-anchored (tmdb_id required, no coercion); imdb_id is
optional. No TMDB call from rescan — the library index auto-heals
from the new sidecar on its next read.

PACK vs EPISODIC

* Single-video + season-parsed + no-episode → SeasonRelease(
  mode=PACK, folder=<season folder>, episodes=()). The slot map stays
  empty until the Phase 5 TMDB sync supplies episode_count. We do
  not fabricate an EpisodeRange we cannot prove on disk.
* Otherwise → EPISODIC: every file with (season, episode) becomes an
  EpisodeRelease with EpisodeRange(start, end) = (E, E). Multi-episode
  files (S01E01E02) still record only the first slot — Parser does
  not yet expose episode_end (existing tech debt, unchanged).

Package move

The orchestrator moves from alfred/application/library/ to
alfred/application/tv_shows/ for symmetry with alfred/application/
movies/ (Step 2). walker.py + its tests move with it. The empty
library/ package is deleted.

Tests

tests/application/tv_shows/test_rescan.py rewritten end-to-end on
the real v2 repository, real KB, real scanner, stubbed prober.
9 happy-path + edge-case scenarios cover EPISODIC track flattening,
PACK empty-episodes semantics, sidecar round-trip, imdb_id optional,
empty show root, season folder with no videos, prober returning None.
test_walker.py moved verbatim (import path updated).

Full suite: 1214 passed / 10 skipped / 4 xfailed. The three v1
dot_alfred quarantines from Phase 3 stay in place until Step 3.
2026-05-25 21:07:25 +02:00
francwa c22b2b78eb refactor(domain): Phase 3 — TVShow/Movie aggregates become TMDB-only
Filesystem-side concerns (file paths, tracks, quality, mode, added_at)
move to the releases/ domain added in Phase 1; the TMDB aggregates now
carry only identity + TMDB catalog facts.

Domain entities:
- TVShow: tmdb_id: TmdbId required (primary key), imdb_id: ImdbId | None
  optional, status: str = "unknown" added.
- Season: episode_count: int = 0 added (TMDB-cached); audio_tracks,
  subtitle_tracks, mode property removed.
- Episode: slimmed to identity + title. file_path/file_size/tracks
  removed. No longer inherits MediaWithTracks.
- Movie: tmdb_id required, imdb_id optional. file_path/file_size/quality/
  added_at/audio_tracks/subtitle_tracks removed. get_filename() now
  returns "Title.Year" — quality moves to MovieRelease.

Builders:
- TVShowBuilder requires tmdb_id: TmdbId; imdb_id/status optional.
- SeasonBuilder.set_episode_count(int) replaces set_audio_tracks /
  set_subtitle_tracks.

No-coercion contract: TVShow(tmdb_id=1396) raises — callers pass
TmdbId(1396). No ergonomic shim per the no-shims rule.
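
A minimal sketch of what the no-coercion contract could look like. The
exception types and the title field are assumptions; the message only
states that a bare int raises and that TmdbId is a positive int that
rejects bool/str/float.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TmdbId:
    value: int

    def __post_init__(self) -> None:
        # bool is a subclass of int, so it has to be rejected explicitly.
        if isinstance(self.value, bool) or not isinstance(self.value, int):
            raise TypeError(f"TmdbId expects int, got {type(self.value).__name__}")
        if self.value <= 0:
            raise ValueError("TmdbId must be positive")


@dataclass(frozen=True)
class TVShow:
    tmdb_id: TmdbId
    title: str  # simplified; the real aggregate carries more fields

    def __post_init__(self) -> None:
        # No coercion: a raw int is an error, not something to wrap silently.
        if not isinstance(self.tmdb_id, TmdbId):
            raise TypeError("tmdb_id must be a TmdbId")
```

With this shape, TVShow(tmdb_id=TmdbId(1396), title="...") constructs
normally while TVShow(tmdb_id=1396, title="...") raises.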

Cascade fixes:
- MediaOrganizer test fixtures updated to new Movie/TVShow shapes.
- Movie.get_filename() re-added (without Quality) so MediaOrganizer
  keeps working until Phase 4 rewires it through MovieRelease.

Quarantined (deleted in Phase 4 alongside v1 dot_alfred):
- tests/application/library/test_rescan.py — module-level skip.
- tests/infrastructure/persistence/dot_alfred/test_repository.py —
  module-level skip.
- tests/infrastructure/persistence/dot_alfred/test_serializer.py —
  module-level skip.

Suite: 1216 passed, 11 skipped (8 pre-existing + 3 Phase 3
quarantines), 4 xfailed. CHANGELOG updated under [Unreleased].
2026-05-25 19:54:35 +02:00
francwa 2f160644da feat(dot_alfred/v2): bump SCHEMA_VERSION to 2 — added_at on MovieRelease
Phase 3 prep: Movie aggregate is about to become TMDB-only (no
filesystem fields). added_at is a release-time observation, not a
TMDB-aggregate concern, so it moves to MovieRelease +
MovieReleaseSidecar.

- Add added_at: datetime (required) to MovieRelease with a
  type-check in __post_init__.
- Add added_at: datetime (required) to MovieReleaseSidecar.
- Bump SCHEMA_VERSION 1 → 2 with a version-history note.
- Bridge round-trips added_at via Pydantic mode="json" (datetime
  → ISO 8601 string).
- Tests: update MovieRelease fixtures, add a validator test, add
  an added_at round-trip test, switch hard-coded `1` assertions
  to SCHEMA_VERSION for future-proofing.

No v1 sidecars in the wild yet — no migration code needed.
2026-05-25 19:47:25 +02:00
francwa e65c1df229 feat(.alfred v2 — Phase 2): Pydantic sidecars, atomic repos, auto-heal index
Spec: specs/dot_alfred_v2.md (Phase 2).

New package alfred/infrastructure/persistence/dot_alfred/v2/:
  * sidecar_release.py / sidecar_root.py — Pydantic DTOs
    (extra="forbid", frozen=True) for per-item sidecars and the
    library-root index. schema_version enforced via model_validator.
  * serializer.py — read_yaml / atomic_write_yaml (.tmp + os.replace).
    SidecarSchemaError wraps YAML + Pydantic errors uniformly.
  * bridge.py — lossless domain <-> sidecar for SeriesRelease /
    MovieRelease; projection-only show_index_entry_from /
    movie_index_entry_from with multi-episode-file flattening.
  * repository.py — DotAlfredSeriesReleaseRepository /
    DotAlfredMovieReleaseRepository (log+skip on corruption),
    DotAlfredTVShowLibraryIndex / DotAlfredMovieLibraryIndex with
    silent auto-heal on missing/corrupt index reads. Writes never
    auto-heal (read paths handle that).

TMDB client extensions:
  * TmdbSeasonInfo / TmdbShowInfo DTOs + pure parse_tv_show_info.
  * TMDBClient.get_tv_show_info aggregates /tv/{id} +
    /tv/{id}/external_ids.

Domain change:
  * SubtitleTrack gains is_sdh: bool = False, populated from
    ffprobe's hearing_impaired disposition. Required for v2 sidecar
    parity (spec replaces v1's type: "sdh" with explicit flag).
    Default keeps every existing caller unchanged.

Tests: 37 new v2 integration tests on tmp_path (round-trips, atomic
writes, schema mismatch handling, anchor warnings, auto-heal paths)
plus 16 TMDB DTO tests. Full suite: 1240 -> 1277 passed.

Implementation notes filed in .claude/specs/dot_alfred_v2_notes.md
(strict=True trade-off, upsert signature deviation from spec, etc.).

Phases 3-5 (TVShow/Movie refactor to TMDB-only, rescan_show rewrite,
v1 deletion + wiring) are next.
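
The .tmp + os.replace pattern used by the serializer can be sketched
with the standard library alone. This writes plain text rather than
YAML, and atomic_write_text is a hypothetical name, not the
atomic_write_yaml from this commit.

```python
import os
from pathlib import Path


def atomic_write_text(target: Path, text: str) -> None:
    """Write text so readers see either the old file or the new one."""
    tmp = target.with_name(target.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        fh.write(text)
        fh.flush()
        os.fsync(fh.fileno())  # push the bytes to disk before the swap
    # os.replace overwrites the destination in one step when both paths
    # are on the same filesystem.
    os.replace(tmp, target)
```

Keeping the temp file next to the target is what makes the final rename
cheap and same-device.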
2026-05-25 16:01:39 +02:00
francwa c0f6d01048 feat(releases): Phase 1 — new filesystem release domain + TmdbId VO
First step of specs/dot_alfred_v2.md. Introduces a separate bounded
context (alfred/domain/releases/) for the filesystem-side aggregates,
disjoint from TMDB identity which stays in tv_shows/ and movies/.
The link between the two worlds is TmdbId, used as the natural key
in the persistence layer (no domain-level reference).

New package alfred/domain/releases/:
- value_objects: EpisodeRange (covers SxxE01E02E03 multi-episode
  files via start/end inclusive range, with count/numbers/is_single
  helpers), ReleaseMode enum (PACK = N video files direct in the
  season folder, EPISODIC = N sub-folders).
- entities: TrackProfile, EpisodeRelease, SeasonRelease (with
  episode_count() summing each EpisodeRange.count()), SeriesRelease
  (tmdb_id primary anchor, optional imdb_id secondary), MovieRelease.
  All frozen dataclasses.
- builders: SeasonReleaseBuilder + SeriesReleaseBuilder mirroring
  the v1 TVShowBuilder pattern. Builders sort episodes by range
  start on emit and reject overlapping ranges (two files claiming
  the same TMDB slot). from_existing() seeds a builder from an
  existing frozen aggregate for round-trip edits.
- repositories: abstract ports (SeriesReleaseRepository,
  MovieReleaseRepository); concrete .alfred sidecar impls arrive
  in Phase 2.

New shared VO alfred/domain/shared/value_objects.py::TmdbId — positive
int, rejects bool/str/float, symmetric with the existing ImdbId VO.

73 unit tests cover VO validation, entity invariants, builder sort
+ overlap detection, and from_existing() round-trips.

v1 code paths are untouched at this stage; the new domain coexists
with the old TVShow aggregate until Phase 3 refactors it.
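
A sketch of the EpisodeRange VO and the builder's sort + overlap check.
count() as a method follows the message; numbers() and is_single() being
methods, the validation rules, and the sort_and_check helper are
assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EpisodeRange:
    start: int
    end: int  # inclusive

    def __post_init__(self) -> None:
        if self.start < 1 or self.end < self.start:
            raise ValueError(f"invalid range {self.start}-{self.end}")

    def count(self) -> int:
        return self.end - self.start + 1

    def numbers(self) -> tuple[int, ...]:
        return tuple(range(self.start, self.end + 1))

    def is_single(self) -> bool:
        return self.start == self.end


def sort_and_check(ranges: Iterable[EpisodeRange]) -> tuple[EpisodeRange, ...]:
    """Sort by range start and reject two files claiming the same slot."""
    ordered = tuple(sorted(ranges, key=lambda r: r.start))
    # After sorting, any overlap shows up between neighbours.
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start <= prev.end:
            raise ValueError(f"overlapping ranges: {prev} and {cur}")
    return ordered
```

An S01E01E02E03 file maps to EpisodeRange(1, 3), which counts as three
slots when a season sums its ranges.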
2026-05-25 15:19:23 +02:00
francwa de7030fa9c feat(library): add rescan_show orchestrator + walker (Step 4)
Step 4 of specs/dot_alfred.md — rebuild a TVShow aggregate from disk
by reusing the existing release pipeline (inspect_release) on every
video file in a show folder, then persist via the .alfred repository.

- alfred/application/library/walker.py — pure structural walk
  (season folders detected via \bS\d{1,2}\b regex, video files
  filtered against kb.video_extensions, no recursion).
- alfred/application/library/rescan.py — orchestrator that ingests
  each season folder, infers PACK vs EPISODIC from on-disk file
  count + parser output, and assembles via TVShowBuilder. Episode
  paths stored relative to show_root. Logs + skips corrupt input
  (no season parsed, mixed season numbers, unparseable episodes).
- Season now inherits MediaWithTracks: PACK seasons carry
  season-level audio_tracks / subtitle_tracks; EPISODIC seasons
  leave them empty (tracks live per-episode). SeasonBuilder gains
  set_audio_tracks / set_subtitle_tracks; bridge writes/reads them
  in the PACK branch via shared _synth_* helpers.

Out of scope, tracked as tech debt: adjacent .srt capture, multi-
episode (episode_end), TMDB-driven PACK detection (the current
heuristic '1 file == PACK' is a placeholder until ShowTracker lands).

18 new tests (11 walker + 7 rescan integration) on tmp_path with
the Foundation layout. Full suite: 1149 passed.
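
The structural walk can be illustrated with the regex quoted above. The
return shape, the walk_show name and the VIDEO_EXTENSIONS set (standing
in for kb.video_extensions) are assumptions; matching is case-sensitive
here because the message quotes the pattern without flags.

```python
import re
from pathlib import Path

SEASON_RE = re.compile(r"\bS\d{1,2}\b")
VIDEO_EXTENSIONS = {".mkv", ".mp4"}  # assumption, see lead-in


def walk_show(show_root: Path) -> dict[str, list[str]]:
    """Map each season folder name to the video files directly inside it."""
    seasons: dict[str, list[str]] = {}
    for child in sorted(show_root.iterdir()):
        if not child.is_dir() or not SEASON_RE.search(child.name):
            continue
        # No recursion: only files sitting directly in the season folder.
        seasons[child.name] = sorted(
            p.name for p in child.iterdir()
            if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
        )
    return seasons
```

Note that the trailing \b keeps the pattern from matching an episode
token such as S01E02, since E is a word character.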
2026-05-24 15:22:18 +02:00
francwa 3622c95154 chore(lint): Lint the shit out of it 2026-05-24 15:21:58 +02:00
francwa c7c11180d9 feat(persistence): add DotAlfredTVShowRepository (filesystem-backed)
Step 3 of specs/dot_alfred.md. Concrete TVShowRepository
implementation reading and writing per-show .alfred YAML files under
a configurable library_root. Writes are atomic (.alfred.tmp +
os.replace), reads tolerate corrupted/wrong-schema sidecars (log +
skip), and the repo never invents a folder name — save(show)
requires the target folder to exist beforehand (raises
ShowFolderUnknown otherwise), matching the spec's
MediaOrganizer-then-sidecar split.

Cold folders without a sidecar are skipped by find_all and yield
None from find_by_imdb_id — the upcoming rescan_show tool (step 4)
will own the opt-in rebuild path.

A small bridge module translates between the rich domain TVShow
(AudioTrack/SubtitleTrack with full ffprobe minutiae) and the
compact sidecar shape (language-only audio, embedded-only subs with
type derived from is_forced). The bridge is intentionally lossy on
probe details the sidecar does not store, per the spec's
factual-only philosophy.

20 integration tests on tmp_path: round-trip save/find,
cold-folder/unknown-id returns, find_all skipping
(corrupted/schema-violating sidecars), delete/exists, atomic write
(no .alfred.tmp leftover), overwrite, and folder-name fallbacks
(get_folder_name guess + full-scan rescue when renamed).
2026-05-22 17:16:41 +02:00
francwa b0e275bd11 feat(persistence): add .alfred sidecar serializer (DTO ↔ dict)
Step 2 of the specs/dot_alfred.md plan. Pure-dict in/out
(serialize(sidecar) -> dict, deserialize(data) -> ShowSidecar);
YAML I/O lives in the repository layer (step 3) and is kept out
for trivial testability.

DTOs mirror the YAML schema field-for-field:
- ShowSidecar (root: imdb_id, tmdb_id, schema_version, seasons)
- SeasonSidecar (number, path, optional audio/subtitles, optional episodes)
- EpisodeSidecar (number, path, optional audio/subtitles)
- SubtitleEntry (language, source, type)

The sidecar acts as a scan cache: it stores only what is genuinely
costly to recompute — folder/file paths (skipping the FS walk) and
probed track metadata (skipping ffprobe). Release identifiers
(group, source, quality, codec) live in folder/file names and are
derived on demand by the parser; they are deliberately absent from
the schema and rejected as unknown keys on deserialize.

The serializer is strict on schema: unknown keys at any level raise
SidecarSchemaError, missing required fields raise clearly, and bool
cannot sneak in as a season/episode number. Optional fields
(tmdb_id, empty audio/subtitles/episodes) are omitted from the
output rather than emitted as null / [].

Tests cover round-trip equivalence (DTO → dict → DTO and DTO → YAML
text → DTO), the Foundation S01 PACK case (real-world fixture with
mixed sub types — superset captured at season scope), and a
Breaking Bad S05 EPISODIC case. An on-disk tmp_path fixture
recreates the Foundation folder structure with placeholder files,
ready to be reused by the upcoming repository walk tests in step 3.
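
The strictness rules above (unknown keys rejected, bool refused as a
number, empty optionals omitted) can be sketched on a cut-down episode
DTO. The field set and function names here are simplified assumptions,
not the schema from specs/dot_alfred.md.

```python
from dataclasses import dataclass
from typing import Any


class SidecarSchemaError(Exception):
    """Raised on any schema violation while deserializing."""


@dataclass(frozen=True)
class EpisodeSidecar:
    number: int
    path: str
    audio: tuple[str, ...] = ()


_ALLOWED = {"number", "path", "audio"}
_REQUIRED = {"number", "path"}


def deserialize_episode(data: dict[str, Any]) -> EpisodeSidecar:
    unknown = set(data) - _ALLOWED
    if unknown:
        raise SidecarSchemaError(f"unknown keys: {sorted(unknown)}")
    missing = _REQUIRED - set(data)
    if missing:
        raise SidecarSchemaError(f"missing keys: {sorted(missing)}")
    number = data["number"]
    # True/False are ints in Python, so they need an explicit rejection.
    if isinstance(number, bool) or not isinstance(number, int):
        raise SidecarSchemaError("number must be an int")
    return EpisodeSidecar(number, data["path"], tuple(data.get("audio", ())))


def serialize_episode(ep: EpisodeSidecar) -> dict[str, Any]:
    out: dict[str, Any] = {"number": ep.number, "path": ep.path}
    if ep.audio:  # empty optionals are omitted, not emitted as []
        out["audio"] = list(ep.audio)
    return out
```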
2026-05-22 16:56:56 +02:00
francwa 6c12c18a27 refactor(tv_shows): freeze aggregate, builder-only construction, drop ShowTracker fields
The TVShow aggregate is now fully immutable. TVShow, Season and Episode
are @dataclass(frozen=True), children stored as ordered tuples sorted
by number. All construction goes through TVShowBuilder / SeasonBuilder
(new module), which expose from_existing() to seed from a current
frozen aggregate and apply modifications.

ShowTracker-territory fields are stripped from the domain: ShowStatus,
CollectionStatus, expected_seasons/episodes, aired_episodes,
collection_status(), is_complete_series(), missing_episodes(),
is_ongoing(), is_ended(), Season.name, the aired<=expected validation,
and the TMDB status string mapping. These will reappear in a dedicated
ShowTracker layer (to be designed) combining the .alfred sidecar with
live TMDB data.

New SeasonMode enum (PACK / EPISODIC) computed at read time from the
season's structural shape — never stored, the YAML sidecar encodes the
mode via presence/absence of the episodes: block.

Test suite for the domain entirely rewritten to cover frozen invariants,
builder ordering, last-write-wins, from_existing round-trip, and
SeasonMode derivation. Full suite still green (1078 passed).
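
A condensed sketch of the frozen-aggregate-plus-builder pattern
described above. Only TVShowBuilder and from_existing() are named in the
message; add_season(), build() and the reduced field sets are
assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Season:
    number: int
    label: str = ""  # hypothetical payload, to show last-write-wins


@dataclass(frozen=True)
class TVShow:
    title: str
    seasons: tuple[Season, ...] = ()


class TVShowBuilder:
    def __init__(self, title: str) -> None:
        self._title = title
        self._seasons: dict[int, Season] = {}

    @classmethod
    def from_existing(cls, show: TVShow) -> TVShowBuilder:
        """Seed a builder from a frozen aggregate for follow-up edits."""
        builder = cls(show.title)
        for season in show.seasons:
            builder.add_season(season)
        return builder

    def add_season(self, season: Season) -> TVShowBuilder:
        self._seasons[season.number] = season  # last write wins
        return self

    def build(self) -> TVShow:
        ordered = sorted(self._seasons.values(), key=lambda s: s.number)
        return TVShow(self._title, tuple(ordered))
```

Edits never mutate the original: from_existing(show) followed by
build() yields a new TVShow and leaves show untouched.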
2026-05-22 16:09:37 +02:00
francwa 1427c8a54b docs(specs): add dot_alfred sidecar design doc
First entry in the new specs/ directory. Specifies the layout and
semantics of the per-show .alfred/ sidecar that will back the future
concrete TVShowRepository:

- One .alfred/ directory per show, containing show.yaml + one
  season_NN.yaml per season (zero-padded, season_00 for Specials).
- Per-episode entries store file size + mtime so cache lookups skip
  a full ffprobe rescan when nothing changed.
- Self-healing on drift (file missing/modified/new) without raising.
- Atomic writes via temp file + os.replace().
- Phased implementation plan (builder + freeze first, then
  serializer, then cache validator, then repo, then wiring).

No code yet — spec only, awaiting review before the implementation
phases. Companion entry in CHANGELOG (Added).
2026-05-21 18:05:55 +02:00
francwa 8491edac22 infra(gitignore): track specs/ + carve out private .claude/
The repo-level .gitignore had a blanket *.md rule with only
CHANGELOG.md exempted. Three adjustments:

- Allow specs/ to be tracked (design docs / RFCs live here, public).
- Restrict the README.md exception to the root (/README.md) so that
  per-directory README files (e.g. tests/fixtures/releases/README.md)
  stay ignored as before — no unintended scope creep.
- Explicitly ignore /.claude/, the private dev-docs sub-repo that
  lives inside the working tree but is versioned and pushed
  separately.

CHANGELOG: Internal entry.
2026-05-21 18:05:33 +02:00
francwa 02e478a157 refactor(domain): freeze Movie and Episode, switch track collections to tuple
Movie and Episode become @dataclass(frozen=True, eq=False), with
audio_tracks/subtitle_tracks held as tuple[...] instead of list[...].
Identity-based equality is preserved via the existing __eq__/__hash__.
__post_init__ coercion (imdb_id, title, season_number, episode_number)
uses object.__setattr__ to stay compatible with frozen.

The MediaWithTracks mixin contract is updated to tuple accordingly.

Callers projecting enrichment results (probe output, file metadata) now
rebuild via dataclasses.replace(...) — same pattern recently adopted for
ParsedRelease.

Season and TVShow stay mutable for now: freezing the aggregate root
would cascade a full reconstruction on every add_episode, deferred.
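
The frozen + __post_init__ coercion + dataclasses.replace combination
can be sketched as below. The fields and the specific coercions are
simplified assumptions; the real entities coerce imdb_id, title,
season_number and episode_number.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True, eq=False)
class Episode:
    season_number: int
    episode_number: int
    title: str
    audio_tracks: tuple[str, ...] = ()

    def __post_init__(self) -> None:
        # Plain assignment is blocked on a frozen dataclass, so coercion
        # has to go through object.__setattr__.
        object.__setattr__(self, "title", self.title.strip())
        object.__setattr__(self, "audio_tracks", tuple(self.audio_tracks))

    # eq=False keeps these hand-written identity rules in place.
    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Episode):
            return NotImplemented
        return (self.season_number, self.episode_number) == (
            other.season_number,
            other.episode_number,
        )

    def __hash__(self) -> int:
        return hash((self.season_number, self.episode_number))


def with_probe_tracks(episode: Episode, languages: list[str]) -> Episode:
    """Project probe output by rebuilding instead of mutating in place."""
    return replace(episode, audio_tracks=tuple(languages))
```

Because replace() goes through __init__, the __post_init__ coercion runs
again on the rebuilt instance.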
2026-05-21 13:40:22 +02:00
francwa 3dc73a5214 feat(release): add fullwidth vertical bar ｜ (U+FF5C) to separators
CJK release names sometimes use the fullwidth vertical bar as a token
separator, as do occasional decorative YouTube-style uploads. Adding
the codepoint to separators.yaml lets the tokenizer split on it
instead of leaving the wide pipe glued onto an adjacent token.

The tokenizer in alfred/domain/release/parser/pipeline.py iterates
the separator list as plain strings (no regex), so a multi-byte
UTF-8 separator works without any code change.
2026-05-21 08:05:56 +02:00
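To illustrate why a plain-string separator list needs no code change for U+FF5C, here is a rough sketch of such a tokenizer. The `tokenize` function and the `SEPARATORS` list are assumptions for illustration, not the actual contents of `pipeline.py` or `separators.yaml`:

```python
def tokenize(name: str, separators: list) -> list:
    """Split `name` on every separator, treating each as a literal string."""
    tokens = [name]
    for separator in separators:
        # str.split works on code points, so a multi-byte UTF-8
        # separator behaves exactly like an ASCII one.
        tokens = [part for token in tokens for part in token.split(separator)]
    return [token for token in tokens if token]


# Hypothetical separator list; "\uff5c" is the fullwidth vertical bar.
SEPARATORS = [".", " ", "_", "\uff5c"]

tokens = tokenize("Show\uff5cS01.1080p", SEPARATORS)
```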
francwa 88f156b7a4 refactor(subtitles): rename SubtitleCandidate → SubtitleScanResult
The old name conflated 'might become a placed subtitle' with 'what a
scan pass produced'. The class is the output of a scan/identify pass —
language/format may still be None while classification is in progress,
confidence reflects classifier certainty, raw_tokens holds filename
fragments under analysis. SubtitleScanResult says that directly.

Pure rename + refreshed docstring; no behavior change. Touches the
domain entity, the matcher/identifier/utils services, the
manage_subtitles use case, the placer, the metadata store, the
shared-media cross-ref comment, and 7 test modules.
2026-05-21 08:05:46 +02:00
francwa 5107cb32c0 feat(release): InspectedResult.recommended_action centralizes exclusion decision
Add a derived 'recommended_action' property on InspectedResult that
collapses the orchestrator's go / wait / skip decision into one value:

- 'skip'      → no main_video, or media_type == 'other'
- 'ask_user'  → media_type == 'unknown', or road == 'path_of_pain'
- 'process'   → confident parse with a main video on disk

The ordering is part of the contract (skip > ask_user > process) —
documented in the property docstring.

Until now every consumer (workflows, the agent, the orchestrator
sketch) had to re-derive this from the road / media_type / main_video
triple, with subtle drift between sites. One place, one rule.

Exposed through the analyze_release tool so the LLM can route on it.
Spec YAML updated to describe the new field.

Suite: 1083 passed (+6 new tests in tests/application/test_inspect.py
covering the four branches and the precedence rules).
2026-05-21 07:54:17 +02:00
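The precedence rule from the commit above (skip > ask_user > process) can be sketched as follows. The dataclass is a hypothetical reduction of `InspectedResult` to the three fields the rule reads; field types are assumptions:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InspectedResult:
    """Hypothetical reduction: only the fields the decision depends on."""

    media_type: str
    road: str
    main_video: Optional[str]

    @property
    def recommended_action(self) -> str:
        # Order matters: skip wins over ask_user, which wins over process.
        if self.main_video is None or self.media_type == "other":
            return "skip"
        if self.media_type == "unknown" or self.road == "path_of_pain":
            return "ask_user"
        return "process"


# No main video AND an uncertain parse: skip takes precedence.
ambiguous = InspectedResult("unknown", "path_of_pain", main_video=None)
```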
francwa b7979c0f8b refactor(release): freeze ParsedRelease + enrich_from_probe returns new instance
ParsedRelease is now @dataclass(frozen=True). The enrichment passes that
used to patch fields in place now produce new instances:

- enrich_from_probe(parsed, info, kb) returns a new ParsedRelease via
  dataclasses.replace (no allocation when no field changed).
- inspect_release rebinds 'parsed' after detect_media_type (wrapped in
  MediaTypeToken — the strict isinstance check now also runs on
  replace) and after enrich_from_probe.

languages becomes a tuple[str, ...] so the VO is properly immutable.
Parser pipeline packs languages as a tuple in the assemble dict.

Callers updated: inspect_release, testing/recognize_folders_in_downloads.py.
Tests updated: 22 enrich_from_probe call sites rebound, language
assertions switched to tuple literals, test_release_fixtures normalizes
result['languages'] back to list for YAML-fixture comparison.

Suite: 1077 passed.
2026-05-21 07:51:49 +02:00
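A small sketch of the "return a new instance, but allocate nothing when no field changed" behaviour described above. The `enrich` function and its parameters are simplified assumptions; the real `enrich_from_probe(parsed, info, kb)` takes a probe result and a knowledge base:

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ParsedRelease:
    """Hypothetical subset of the real value object."""

    title: str
    codec: Optional[str] = None
    languages: Tuple[str, ...] = ()


def enrich(parsed: ParsedRelease, probed_codec: Optional[str],
           probed_languages: Tuple[str, ...]) -> ParsedRelease:
    changes = {}
    if parsed.codec is None and probed_codec:
        changes["codec"] = probed_codec
    if not parsed.languages and probed_languages:
        changes["languages"] = tuple(probed_languages)
    # Return the same object when nothing changed; otherwise build a copy.
    return replace(parsed, **changes) if changes else parsed


original = ParsedRelease(title="Foundation")
enriched = enrich(original, "x265", ("eng",))
untouched = enrich(enriched, "x264", ("fra",))
```

Because the instance is frozen, callers have to rebind the result (`parsed = enrich(parsed, ...)`), which is the change the 22 test call sites went through.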
francwa 9f1ce94690 refactor(application): inject kb/prober into resolve_destination use cases
Remove the module-level _KB / _PROBER singletons from
alfred/application/filesystem/resolve_destination.py. The four
resolve_{season,episode,movie,series}_destination use cases now take
kb: ReleaseKnowledge and prober: MediaProber as required arguments,
matching the shape of inspect_release.

The singletons now live at the agent-tools frontier
(alfred/agent/tools/filesystem.py), where the LLM-facing wrappers
instantiate YamlReleaseKnowledge / FfprobeMediaProber once and thread
them through. The wrappers' Python signatures are unchanged — the
inspect-based JSON-schema generator in agent/registry.py still sees the
same LLM-passable params.

analyze_release drops the dirty 'from ... import _KB' indirection.

Tests inject their own stubs by keyword (prober=_StubProber(...)) via
thin convenience wrappers, replacing the prior
monkeypatch.setattr(rd, '_PROBER', ...) pattern.

testing/debug_release.py: instantiate YamlReleaseKnowledge() /
FfprobeMediaProber() inline at the two call sites.

Suite: 1077 passed.
2026-05-21 07:46:13 +02:00
francwa 5e0ed11672 refactor(release): rename ParsePath enum to TokenizationRoute
ParsePath collided with pathlib.Path in mental models, and was one
letter from the parse_path attribute that stores its value — confusion
on confusion. Road (EASY/SHITTY/PATH_OF_PAIN) is the parser-confidence
axis; TokenizationRoute (DIRECT/SANITIZED/AI) is the tokenization-method
axis. They're orthogonal and the new name makes that obvious.

Field name parse_path stays — it's the right name for the attribute
that *holds* the route. String values ("direct", "sanitized", "ai")
stay too, so YAML fixtures and the analyze_release tool spec are
unchanged. Only the type symbol changes:

- value_objects.py: class rename + docstring spelling out orthogonality
  with Road.
- services.py: 3 call sites.
- scoring.py: docstring cross-reference updated.
- tests/domain/release/test_parser_v2_scoring.py: import + 3 call sites.
2026-05-21 07:39:42 +02:00
francwa 0246f85ef8 refactor(release): move codec mappings from code to YAML knowledge
The three module-level dicts in enrich_from_probe (ffprobe codec name
to scene token, channel count to layout) were exactly the kind of
domain lookup table CLAUDE.md says belongs in YAML, not in Python.
Move them to alfred/knowledge/release/probe_mappings.yaml, load
through a new ReleaseKnowledge.probe_mappings port field, and add a
kb parameter to enrich_from_probe so the consumer reads the maps via
the same injection pattern as everything else.

- New knowledge file: alfred/knowledge/release/probe_mappings.yaml
- New loader: load_probe_mappings() in infrastructure/knowledge/release.py
  (normalizes channel-count keys back to int).
- Port: ReleaseKnowledge gains probe_mappings: dict.
- Adapter: YamlReleaseKnowledge populates it at __init__.
- Consumer: enrich_from_probe(parsed, info, kb) reads the three sub-maps
  from kb.probe_mappings; unknown codecs still fall back to uppercase
  raw value, same behaviour as before.
- Call sites updated: inspect_release passes kb through; the testing
  script gets its kb wiring (it was already broken since the
  ReleaseKnowledge refactor); all 22 enrich_from_probe call sites in
  tests/application/test_enrich_from_probe.py pass _KB.
2026-05-21 07:37:42 +02:00
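As a rough illustration of the loader and fallback described above: the key names (`video_codec`, `channels`) and the sample data are assumptions, not the real layout of `probe_mappings.yaml`. The sketch starts from an already-parsed mapping so it runs without a YAML library:

```python
def load_probe_mappings(raw: dict) -> dict:
    """Normalize a parsed YAML mapping; channel-count keys become int."""
    return {
        "video_codec": dict(raw.get("video_codec", {})),
        "channels": {int(count): layout
                     for count, layout in raw.get("channels", {}).items()},
    }


def codec_token(mappings: dict, ffprobe_name: str) -> str:
    # Unknown codecs fall back to the uppercased raw value.
    return mappings["video_codec"].get(ffprobe_name, ffprobe_name.upper())


# Hypothetical data, shaped as it might come out of the YAML parser.
RAW = {
    "video_codec": {"hevc": "x265", "h264": "x264"},
    "channels": {"6": "5.1", "2": "2.0"},
}

mappings = load_probe_mappings(RAW)
```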
francwa e62dc90bd1 refactor(release): make tech_string a derived property
ParsedRelease.tech_string was a stored str field re-computed in two
places (assemble() at parse time, enrich_from_probe() after the probe).
The second site was a reactive fix (e79ca46) for filename builders that
saw a stale value. Turn it into an @property so it stays in sync with
quality/source/codec by construction.

- Drop the field from the dataclass + the key from assemble()'s dict.
- Drop tech_string="" from parse_release's malformed-name fallback.
- Drop the manual recomputation at the end of enrich_from_probe.
- Inject the property into asdict() result in the fixtures runner
  (same treatment as is_season_pack).
- Update tests that passed tech_string= to the constructor; rewrite the
  TestTechString case that mutated p.tech_string manually.
2026-05-21 07:33:53 +02:00
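A sketch of the derived-property approach, assuming a dot-joined format for `tech_string` (the actual format is not stated in the commit) and a hypothetical `to_fixture_dict` helper standing in for the fixtures runner:

```python
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ParsedRelease:
    """Hypothetical subset of the real value object."""

    quality: Optional[str] = None
    source: Optional[str] = None
    codec: Optional[str] = None

    @property
    def tech_string(self) -> str:
        # Computed on access, so it cannot go stale after enrichment.
        parts = (self.quality, self.source, self.codec)
        return ".".join(part for part in parts if part)


def to_fixture_dict(parsed: ParsedRelease) -> dict:
    # asdict() only sees fields, so the property is injected by hand.
    data = asdict(parsed)
    data["tech_string"] = parsed.tech_string
    return data


parsed = ParsedRelease(quality="1080p", source="WEB-DL")
enriched = replace(parsed, codec="x265")
```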
313 changed files with 10679 additions and 4983 deletions
+6
@@ -74,5 +74,11 @@ docs/
# .md files (project-level Markdown is brol-y; allow-list the ones we track)
*.md
!CHANGELOG.md
!/README.md
!specs/
!specs/**/*.md
# Private dev docs (separate git repo inside; see .claude/CLAUDE.md)
/.claude/
#
+609
@@ -15,6 +15,506 @@ callers).
## [Unreleased]
### Changed
- **`filesystem` infra + application rewritten as 5 atomic free
functions.** On branch `unfuck`. Replaces the monolithic
`FileManager` class + scattered helpers with five small, pure ops in
`alfred/infrastructure/filesystem/`: `list_dir`, `create_dir`,
`link_file`, `move_file`, `move_dir`. Each takes `pathlib.Path`
arguments and raises typed exceptions from a dedicated hierarchy
(`FilesystemError` base + `SourceNotFound` / `DestinationExists` /
`NotADirectory` / `NotAFile` / `PermissionDenied` / `CrossDevice` /
`FilesystemOSError`) — no more `{"status": "ok" | "error"}` dicts at
the infra boundary, no more `get_memory()` reads.
- **`filesystem` application: 5 use cases as free functions.** A
matching `<op>_use_case(path, …, roots: DirectoryRoots)` wraps each
infra op, guards inputs against escaping a new `DirectoryRoots` VO
(downloads / torrents / movies / tv_shows), catches infra exceptions,
and returns a frozen `<Op>Response` DTO. Roots are now injected, not
pulled from the global memory singleton.
- **Agent tool wrappers partially re-wired** to the new use cases.
`list_folder` now delegates to `list_dir_use_case`; `move_media`
to `move_file_use_case`; `move_to_destination` chains
`create_dir_use_case` + `move_file_use_case`; a new
`create_directory` tool wraps `create_dir_use_case`. Roots are
loaded once via a module-level `_load_directory_roots()` helper
that reads the persisted memory (no more per-call singleton
reads inside the use cases themselves).
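The split described in the entries above (infra op raises, use case guards roots and returns a frozen DTO) could be sketched like this. The `contains` method, the DTO fields, and the error strings are assumptions for illustration; only the names `DirectoryRoots`, `FilesystemError`, `SourceNotFound`, `list_dir`, and `list_dir_use_case` come from the changelog:

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple


class FilesystemError(Exception):
    """Base of the infra exception hierarchy."""


class SourceNotFound(FilesystemError):
    pass


@dataclass(frozen=True)
class DirectoryRoots:
    downloads: Path
    torrents: Path
    movies: Path
    tv_shows: Path

    def contains(self, path: Path) -> bool:
        # resolve() collapses ".." so escapes are caught (Python 3.9+).
        resolved = path.resolve()
        roots = (self.downloads, self.torrents, self.movies, self.tv_shows)
        return any(resolved.is_relative_to(root.resolve()) for root in roots)


@dataclass(frozen=True)
class ListDirResponse:
    status: str
    entries: Tuple[str, ...] = ()
    error: Optional[str] = None


def list_dir(path: Path) -> Tuple[str, ...]:
    """Infra op: raises a typed exception instead of returning a dict."""
    if not path.exists():
        raise SourceNotFound(str(path))
    return tuple(sorted(child.name for child in path.iterdir()))


def list_dir_use_case(path: Path, roots: DirectoryRoots) -> ListDirResponse:
    if not roots.contains(path):
        return ListDirResponse(status="error", error="path_outside_roots")
    try:
        return ListDirResponse(status="ok", entries=list_dir(path))
    except FilesystemError as exc:
        return ListDirResponse(status="error", error=type(exc).__name__)
```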
### Removed
- `FileManager` / `MediaOrganizer` / `create_folder` / `move` from the
public API of `alfred.infrastructure.filesystem`. Their files remain
on disk renamed with an `_OLD` suffix (e.g. `file_manager_OLD.py`) so
the migration can finish on a follow-up commit without losing
reference material. They are no longer re-exported from `__init__`.
- `CreateSeedLinksUseCase` / `ListFolderUseCase` / `MoveMediaUseCase` /
`ManageSubtitlesUseCase` / `resolve_destination` from the public API
of `alfred.application.filesystem`. Same `_OLD` rename treatment.
This intentionally breaks current tool wrappers and tests downstream
— re-wiring is the next chunk of work on this branch.
- **Agent tools dropped during the refactor** (to be reintroduced
when the matching domain/application code lands):
`manage_subtitles`, `set_path_for_folder`, `create_seed_links`,
`resolve_season_destination`, `resolve_episode_destination`,
`resolve_movie_destination`, `resolve_series_destination`.
Their wrappers are removed from `alfred.agent.tools.filesystem`;
`alfred.agent.tools.__init__` now re-exports only what still
imports cleanly. `find_media_imdb_id` (already broken before this
branch — name no longer exported by `tools.api`) was also dropped
from the package re-exports.
### Added
- **`.alfred` v2 — Phase 4: v2-shaped `rescan_show` + new
`rescan_movie` + index anchor-warning + `tmdb_cache_ttl_days`
setting.** Fourth and final structural phase of
`specs/dot_alfred_v2.md` on branch `refactor/dot-alfred-v2`. The TV
+ movie rescan orchestrators now write v2 release aggregates
(`SeriesRelease` / `MovieRelease`) via the concrete v2
repositories; the library index keeps auto-healing from the new
sidecars on its next read (no TMDB call from rescan — that stays
Phase 5).
- **`rescan_show`** moves from `alfred/application/library/` to
`alfred/application/tv_shows/` (symmetry with the new
`alfred/application/movies/`). New signature:
`(show_root, *, tmdb_id: TmdbId, imdb_id: ImdbId | None = None,
series_repo, scanner, prober, kb) -> SeriesRelease`.
- **`rescan_movie`** (new — `alfred/application/movies/rescan.py`)
locates the main video via `find_video_file`, runs
`inspect_release` once, and writes the per-movie `.alfred`
sidecar. `added_at = datetime.now(UTC)` on every rescan (the
sidecar records reconciliation time, not filesystem mtime).
Raises `MovieRescanFailed` when no video is found in the folder.
- **PACK semantics in `rescan_show`**: a single-video + no-episode
season becomes `SeasonRelease(mode=PACK, folder=…, episodes=())`.
The slot map stays empty until the Phase 5 TMDB sync supplies
`episode_count` — no fabricated `EpisodeRange` lands in the
sidecar. *(Superseded by Phase 4b — see Fixed.)*
- **`Settings.tmdb_cache_ttl_days: int = 14`** — placeholder for the
Phase 5 TTL policy on library-index entries (`fetched_at + TTL`
drives refresh decisions).
- **Library-index anchor-mismatch warning** — both
`DotAlfredTVShowLibraryIndex` and `DotAlfredMovieLibraryIndex` now
cross-check each entry's `metadata.path` against the on-disk
folder layout right after a successful parse. Drift is logged as a
`WARNING` (one per missing folder, with `tmdb_id`); the heal path
stays silent by construction (it always synthesizes from real
folder names).
- **`.alfred` v2 — Phase 5: TMDB sync orchestrators.** Fifth phase
of `specs/dot_alfred_v2.md` on branch `refactor/dot-alfred-v2`.
Two new orchestrators refresh the library-root index's
TMDB-cached fields from on-disk truth + a single TMDB call:
- **`sync_show`** (`alfred/application/tv_shows/sync.py`) calls
`TMDBClient.get_tv_show_info`, loads the release via
`DotAlfredSeriesReleaseRepository.load_by_tmdb_id`, and upserts
the result into `DotAlfredTVShowLibraryIndex`. Honors
`Settings.tmdb_cache_ttl_days`; placeholder entries (auto-healed,
`status == "unknown"`) always refresh; `force=True` overrides
both gates. Raises `ShowNotFoundInLibrary` when neither index nor
sidecar carry `tmdb_id`. Indexed shows with a missing per-show
sidecar still get a fresh TMDB pass — slot map clears until
rescan repopulates it.
- **`sync_movie`** (`alfred/application/movies/sync.py`) is the
movie-side parallel. Placeholder signature is `name ==
metadata.path` (auto-heal copies the folder name into `name`;
the sidecar schema requires `name` non-empty so we can't use
`name == ""`). When the per-movie sidecar is gone but the
index entry remains, sync warns and returns the existing entry
unchanged (no upsert possible without a release).
- **`TmdbMovieInfo` DTO + `TMDBClient.get_movie_info`** — symmetric
to the existing `TmdbShowInfo` / `get_tv_show_info` pair. Carries
`tmdb_id`, `imdb_id`, `title`, and `release_year` (parsed from
TMDB's `release_date`).
- **`load_by_tmdb_id` on the v2 release repositories.** The series
repo returns `(SeriesRelease, show_folder_name)` so the sync
orchestrator can feed `DotAlfredTVShowLibraryIndex.upsert(...,
path=...)`; the movie repo returns `MovieRelease` alone (folder is
on `release.folder` already) and is provided as a semantic alias
of `find_by_tmdb_id` for symmetry.
- **`alfred/application/exceptions.py`** — new module for the two
shared `*NotFoundInLibrary` exceptions raised by the sync
orchestrators (`ShowNotFoundInLibrary`, `MovieNotFoundInLibrary`).
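The refresh gates that `sync_show` is described as honoring (TTL, placeholder status, `force=True`) can be approximated by a single predicate. The function name, its parameters, and the inclusive boundary are assumptions for illustration:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def needs_refresh(fetched_at: datetime, status: str, *, ttl_days: int = 14,
                  force: bool = False,
                  now: Optional[datetime] = None) -> bool:
    """Decide whether a library-index entry should hit TMDB again."""
    # force overrides both gates; auto-healed placeholders always refresh.
    if force or status == "unknown":
        return True
    current = now or datetime.now(timezone.utc)
    return current >= fetched_at + timedelta(days=ttl_days)


fetched = datetime(2026, 5, 1, tzinfo=timezone.utc)
fresh_check = datetime(2026, 5, 10, tzinfo=timezone.utc)
stale_check = datetime(2026, 5, 20, tzinfo=timezone.utc)
```

Passing `now` explicitly keeps the predicate deterministic in tests, in the same spirit as the injectable reference date mentioned for `parse_tv_show_info`.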
### Fixed
- **PACK vs EPISODIC classification (Phase 4b).** The Phase 4
walker + `rescan_show` logic classified seasons by parser output
(does the filename carry `Exx`?), but PACK vs EPISODIC is a
*structural* distinction:
- **PACK** = season folder with N flat `SxxEyy` videos.
- **EPISODIC** = season folder with N subfolders, each holding
one video.
The walker now descends two levels under `show_root` and
classifies per season folder. Mixed (flat + subfolders) is
malformed — warn and skip. `rescan_show` trusts the walker's
mode and stops conflating "single un-numbered video" with PACK
(that case is now skipped as malformed too). Tests rewritten
against the real model. Supersedes the PACK-semantics bullet
above in Added.
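A simplified sketch of the structural classification described in this fix. `classify_season` and `VIDEO_EXTENSIONS` are hypothetical; the real walker filters against `kb.video_extensions`, logs warnings, and also rejects the single un-numbered video case, none of which is modelled here:

```python
import tempfile
from enum import Enum
from pathlib import Path
from typing import Optional


class ReleaseMode(Enum):
    PACK = "pack"
    EPISODIC = "episodic"


VIDEO_EXTENSIONS = {".mkv", ".mp4"}  # assumption for the sketch


def classify_season(folder: Path) -> Optional[ReleaseMode]:
    """Return the mode from folder shape alone; None means malformed/empty."""
    children = list(folder.iterdir())
    flat_videos = [c for c in children
                   if c.is_file() and c.suffix.lower() in VIDEO_EXTENSIONS]
    subfolders = [c for c in children if c.is_dir()]
    if flat_videos and subfolders:
        return None  # mixed layout: skipped as malformed
    if flat_videos:
        return ReleaseMode.PACK
    if subfolders:
        return ReleaseMode.EPISODIC
    return None


with tempfile.TemporaryDirectory() as tmp:
    pack = Path(tmp) / "Show.S01"
    pack.mkdir()
    (pack / "Show.S01E01.mkv").touch()
    (pack / "Show.S01E02.mkv").touch()
    pack_mode = classify_season(pack)

    episodic = Path(tmp) / "Show.S02"
    (episodic / "Show.S02E01").mkdir(parents=True)
    (episodic / "Show.S02E01" / "Show.S02E01.mkv").touch()
    episodic_mode = classify_season(episodic)

    # Adding a flat video next to the subfolder makes the season mixed.
    (episodic / "stray.mkv").touch()
    mixed_mode = classify_season(episodic)
```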
### Removed
- **v1 dot_alfred stack and its abstract domain ports.** Deleted
`alfred/infrastructure/persistence/dot_alfred/{bridge,repository,
serializer,sidecar}.py`, plus the
`alfred/domain/{tv_shows,movies}/repositories.py` ABCs
(`TVShowRepository` / `MovieRepository`) — zero callers after
Phase 4. `dot_alfred/__init__.py` is rewritten as a v2-only
re-export (four concrete repositories + `ShowFolderUnknown`).
- **`alfred/application/library/` package** (rescan + walker moved
to `alfred/application/tv_shows/`).
- The two Phase 3 module-level test skips
(`test_repository.py`, `test_serializer.py`) are lifted by
deleting the quarantined files.
- **`MediaWithTracks` mixin + `track_lang_matches` helper** in
`alfred.domain.shared.media`. Parked in Phase 4 pending a
Phase 5 decision; zero callers across `alfred/` and `tests/`
after the v2 aggregates landed, so both go.
### Internal
- **Suite**: 1233 → 1277 passing; 10 → 8 skips (only LLM-not-running
skips remain — the Phase 3 quarantines are gone with their files).
- Phase 5 cleanup sweep returns zero hits for `MediaWithTracks`,
v1 dot_alfred symbols, v1 sidecar names, and `alfred.application.
library` — the v2 surface is the only one left.
### Changed
- **`.alfred` v2 — Phase 3: `TVShow` / `Movie` aggregates become
TMDB-only.** Third phase of `specs/dot_alfred_v2.md` on branch
`refactor/dot-alfred-v2`. Filesystem-side concerns (file paths,
tracks, quality, mode, `added_at`) move to the `releases/` domain
added in Phase 1; the TMDB aggregates now carry only identity +
TMDB catalog facts.
- **`TVShow`** — `tmdb_id: TmdbId` is now the **required primary
key**; `imdb_id: ImdbId | None` is the optional secondary anchor.
Added `status: str = "unknown"` (raw TMDB string, default matches
the v2 library-index auto-heal placeholder). `episode_count`
aggregates the TMDB-cached counts on each `Season` (was: sum of
materialized `Episode` objects).
- **`Season`** — added `episode_count: int = 0` (TMDB-cached,
authoritative). **Removed**: `audio_tracks`, `subtitle_tracks`,
and the `mode` property (release mode now lives only on
`SeasonRelease.mode` — single source of truth).
- **`Episode`** — slimmed to identity + title. **Removed**:
`file_path`, `file_size`, `audio_tracks`, `subtitle_tracks`. The
`MediaWithTracks` mixin is no longer in `Episode`'s MRO; on-disk
facts live on the matching `EpisodeRelease` keyed by
`(season_number, episode_number)`.
- **`Movie`** — `tmdb_id: TmdbId` required, `imdb_id` optional.
**Removed**: `file_path`, `file_size`, `quality`, `added_at`,
`audio_tracks`, `subtitle_tracks`. `get_filename()` now returns
`"Title.Year"` (quality lives on `MovieRelease` and is appended
by a release-aware caller — Phase 4 wires this through
`MediaOrganizer`).
- **`TVShowBuilder` / `SeasonBuilder`** — constructor requires
`tmdb_id: TmdbId`; `imdb_id` and `status` are optional.
`SeasonBuilder.set_episode_count(int)` replaces the old
`set_audio_tracks` / `set_subtitle_tracks` (tracks no longer
persisted on `Season`).
- **`MovieRelease` carries `added_at: datetime`** (required).
Bumped `dot_alfred/v2` `SCHEMA_VERSION` from `1` to `2` to add
`added_at: datetime` to `MovieReleaseSidecar`. Round-trip via
Pydantic `mode="json"` (datetime ↔ ISO 8601 string). No migration
code shipped — no v2.1 sidecars exist in the wild yet.
- **No-coercion `TmdbId` contract.** `TVShow(tmdb_id=1396)` now raises
— callers pass `TmdbId(1396)`. Same for `imdb_id: ImdbId | None`
on `TVShow`/`Movie`. Honest type contract, no ergonomic shim.
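The no-coercion contract might be enforced along these lines. The exception types and the `value` attribute are assumptions; the changelog only states that a bare int is rejected and that `TmdbId` is a positive int rejecting bool/str/float:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TmdbId:
    value: int

    def __post_init__(self) -> None:
        # bool is a subclass of int, so it has to be excluded explicitly.
        if isinstance(self.value, bool) or not isinstance(self.value, int):
            raise TypeError(f"TmdbId expects int, got {type(self.value).__name__}")
        if self.value <= 0:
            raise ValueError("TmdbId must be positive")


@dataclass(frozen=True)
class TVShow:
    """Hypothetical reduction of the aggregate to two fields."""

    tmdb_id: TmdbId
    title: str

    def __post_init__(self) -> None:
        # No ergonomic shim: a bare int is rejected rather than wrapped.
        if not isinstance(self.tmdb_id, TmdbId):
            raise TypeError("tmdb_id must be a TmdbId, e.g. TmdbId(1396)")


show = TVShow(tmdb_id=TmdbId(1396), title="Breaking Bad")
```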
### Removed
- `Season.mode` property (derive from `SeasonRelease.mode` instead).
- `Episode.file_path` / `file_size` / `audio_tracks` /
`subtitle_tracks`.
- `Movie.file_path` / `file_size` / `quality` / `added_at` /
`audio_tracks` / `subtitle_tracks`.
### Internal
- v1 dot_alfred package (`bridge.py`, `repository.py`,
`serializer.py`, `sidecar.py`), the abstract `TVShowRepository` /
`MovieRepository` ports typed against the pre-Phase-3 aggregates,
and `alfred/application/library/rescan.py` are **intentionally
left in tree as a known-red island**. Their tests
(`tests/infrastructure/persistence/dot_alfred/test_repository.py`,
`test_serializer.py`, `tests/application/library/test_rescan.py`)
are module-level skipped with a Phase 4 reference. Phase 4 rewrites
`rescan_show` / introduces `rescan_movie` on top of the v2
release repositories + library index, then deletes the v1 stack +
the abstract ports + the quarantined tests in one swing.
- Test suite: 1216 passed, 11 skipped (8 pre-existing + 3 Phase-3
quarantines), 4 xfailed. v2 round-trip tests now reference
`SCHEMA_VERSION` instead of hard-coded `1` for future-proofing.
### Added
- **`.alfred` v2 — Phase 2: new persistence package + TMDB client
extensions.** Second phase of `specs/dot_alfred_v2.md` on branch
`refactor/dot-alfred-v2`. The new
`alfred/infrastructure/persistence/dot_alfred/v2/` package ships
the full v2 sidecar stack while leaving v1 (and the existing
`TVShow` aggregate) untouched — Phase 3 is the cutover.
- **Pydantic DTOs** — `SeriesReleaseSidecar` /
`MovieReleaseSidecar` (per-item), `TVShowLibraryIndexSidecar` /
`MovieLibraryIndexSidecar` (library-root index). All built on a
common `_Strict` base (`extra="forbid"`, `frozen=True`) with a
`@model_validator` enforcing `schema_version == 1`.
- **Track entries** — `AudioTrackEntry` / `SubtitleEntry` (sidecar
cache shape, slimmed from the domain track types). `SubtitleEntry`
carries `is_forced` + `is_sdh` as explicit booleans (v1's
`type: "sdh"` overload is gone).
- **Serializer** — `read_yaml` / `atomic_write_yaml` helpers
centralize YAML I/O and atomic writes (`.tmp + os.replace`).
`SidecarSchemaError` wraps both YAML parse errors and Pydantic
validation errors for uniform catch-and-skip semantics.
- **Bridge** — lossless `domain ↔ sidecar` conversion for
`SeriesRelease` / `MovieRelease` (round-trippable, including
multi-episode ranges and `is_sdh` subtitles); one-way projection
for library-index entries (`show_index_entry_from`,
`movie_index_entry_from`) that flattens multi-episode files into
per-TMDB-slot maps in `seasons[*].episodes`.
- **Repositories** —
`DotAlfredSeriesReleaseRepository` /
`DotAlfredMovieReleaseRepository` walk `library_root/*/` with
log+skip on corruption; **`DotAlfredTVShowLibraryIndex`** /
**`DotAlfredMovieLibraryIndex`** auto-heal silently on missing or
corrupt index files by rebuilding from the per-item sidecars
(healed entries keep TMDB-cached fields as placeholders until the
next sync repopulates them). Writes are atomic and never auto-heal
(read paths handle that).
- **TMDB client extensions** — `TmdbSeasonInfo` / `TmdbShowInfo`
DTOs + `TMDBClient.get_tv_show_info(tmdb_id)` aggregating
`/tv/{id}` + `/tv/{id}/external_ids`. The parsing logic is a pure
function (`parse_tv_show_info`) testable without HTTP, with an
injectable reference date for deterministic `aired` flag tests.
- **`is_sdh` flag on `SubtitleTrack`.** Added to
`alfred/domain/shared/media.py::SubtitleTrack` to mirror ffprobe's
`hearing_impaired` disposition. Wired through the ffprobe layer
(`ffprobe_prober.py`) and the v2 sidecar bridge so SDH information
round-trips end-to-end. Defaults to `False` — backwards-compatible
for every existing caller.
- **37 v2 integration tests** on `tmp_path` covering round-trips
(domain ↔ sidecar ↔ YAML ↔ domain), atomic writes (no `.tmp`
leftovers), per-item log+skip on corruption / schema mismatch,
movie anchor-mismatch warning, full upsert / find / delete on both
library indexes, and the auto-heal path on missing / corrupt /
schema-mismatched index files. **16 TMDB DTO tests** for the new
`parse_tv_show_info` pure function.
- **`.alfred` v2 — Phase 1: new `releases/` domain.** First step of
`specs/dot_alfred_v2.md` on branch `refactor/dot-alfred-v2`. The
new `alfred/domain/releases/` package introduces a filesystem-only
bounded context separated from TMDB identity (the existing
`tv_shows` / `movies` domains). It hosts:
- **`EpisodeRange` VO** — covers single-episode files
(`EpisodeRange(E02, E02)`) and multi-episode files
(`EpisodeRange(E02, E04)` for `SxxE02E03E04.mkv`), with
`count()` / `numbers()` / `is_single()` helpers.
- **`ReleaseMode` enum** — `PACK` (N video files directly in the
season folder) vs `EPISODIC` (N sub-folders, one episode each);
classified by the walker, never re-derived.
- **Aggregates** — `TrackProfile`, `EpisodeRelease`,
`SeasonRelease` (with `episode_count()` summing each file's
range), `SeriesRelease`, `MovieRelease`. All frozen
dataclasses; mutation via `SeasonReleaseBuilder` /
`SeriesReleaseBuilder` (mirror the v1 `TVShowBuilder` pattern,
including `from_existing()` round-trip).
- **Abstract ports** — `SeriesReleaseRepository`,
`MovieReleaseRepository` (concrete `DotAlfred*` arrive in
Phase 2).
- **`TmdbId` VO** added to `alfred/domain/shared/value_objects.py`
(positive int, rejects bool/str/float — symmetry with `ImdbId`).
- 73 unit tests covering VO validation, entity invariants, builder
sort + overlap detection, and `from_existing()` round-trips. v1
code paths untouched at this stage; new domain coexists.
- **`rescan_show` orchestrator
(`alfred/application/library/rescan.py`).** Step 4 of the
`specs/dot_alfred.md` plan. Walks an Alfred-managed show folder,
runs the existing `inspect_release` pipeline on every video file it
finds, and assembles a frozen `TVShow` aggregate persisted via the
injected `TVShowRepository`. Reuses the release parser + ffprobe
path verbatim — no duplicated parse/probe logic at the library
layer. PACK vs EPISODIC inferred per season folder from the
on-disk file count + parser output: a single video whose name
carries no `Exx` token becomes a PACK season (tracks lifted to the
season-level `audio_tracks` / `subtitle_tracks`), anything else
becomes EPISODIC (one `Episode` per file). Episode paths are
stored relative to the show root for portability. Files that fail
to parse a season/episode number, or seasons with mixed numbers,
are logged and skipped — the orchestrator never raises. Embedded
subtitle tracks are captured from `ffprobe`; adjacent `.srt`
files, multi-episode entries (`S01E01E02`), and TMDB-driven PACK
detection are tracked as tech debt for a dedicated subtitles /
ShowTracker session. 7 integration tests on `tmp_path` with the
Foundation layout (S01 EPISODIC + S02 PACK) cover the round-trip
through the real `.alfred` repository.
- **Show tree walker (`alfred/application/library/walker.py`).**
Step 4a foundation. `walk_show(show_root, scanner, kb)` returns a
`ShowTree(show_root, season_folders=tuple[SeasonFolder, ...])`:
a pure structural snapshot with no parsing and no probing. Season folders
are detected by a `\bS\d{1,2}\b` token anywhere in the directory
name (release-style naming, no Plex `Season 01` / `Specials`
conventions). Video files are filtered against
`kb.video_extensions`; no recursion into sub-sub-folders. 11 unit
tests on `tmp_path` cover detection (case-insensitive, in-word
rejection), filtering (subs, NFO, sample files), and edge cases
(empty / missing show root).
- **Season-level audio/subtitle tracks
(`alfred/domain/tv_shows/entities.py`,
`alfred/domain/tv_shows/builders.py`).** `Season` now inherits
from `MediaWithTracks` and carries `audio_tracks` /
`subtitle_tracks` tuples (empty by default). Populated only in
PACK mode (the single release covering the whole season); empty in
EPISODIC mode where tracks live per-episode. `SeasonBuilder`
gains `set_audio_tracks()` / `set_subtitle_tracks()` and forwards
them through `from_existing()`. The bridge writes / reads them in
the PACK branch via shared `_synth_audio_tracks` /
`_synth_subtitle_tracks` helpers used for episodes too.
- **`DotAlfredTVShowRepository` — filesystem-backed implementation of
the `TVShowRepository` port
(`alfred/infrastructure/persistence/dot_alfred/repository.py`).**
Step 3 of the `specs/dot_alfred.md` plan. Reads and writes one
`.alfred` YAML file per show under a configurable `library_root`.
`save(show)` writes atomically (`.alfred.tmp` + `os.replace`) into a
folder that **must already exist** — the repository never invents a
folder name (the upstream `MediaOrganizer` is in charge of placing
files; the repo writes the sidecar next to them). `find_by_imdb_id` /
`find_all` walk `library_root/*/`, loading each readable sidecar;
folders without a sidecar return `None` / are skipped (no implicit
cold scan — that is the job of the upcoming `rescan_show` tool).
Corrupted YAML and schema violations are logged and skipped, never
raised, so a single bad folder does not break the rest of the
library. The repo keeps a tiny in-memory `imdb_id → folder_name`
index populated on every successful read/save, so subsequent saves
find the right destination without re-walking — useful when the show
folder name diverges from `show.get_folder_name()` (custom 1080p / 4K
variants). 20 integration tests on `tmp_path` cover the round-trip,
cold folder / unknown id returns, multi-show `find_all`, corrupted /
wrong-schema skipping, atomic write (no `.alfred.tmp` left behind),
overwrite, and folder-name fallbacks.
- **Sidecar ↔ TVShow bridge
(`alfred/infrastructure/persistence/dot_alfred/bridge.py`).**
`to_sidecar(show, folder_paths=...)` summarizes the rich domain
`AudioTrack` / `SubtitleTrack` to the sidecar's compact form (unique
audio languages in track order; subtitle entries derived from
`is_forced` and assumed `source="embedded"`). `from_sidecar(sidecar,
title=...)` reconstructs the domain `TVShow` with synthesized tracks
— one `AudioTrack` per language, one `SubtitleTrack` per entry, with
ffprobe-only fields (`codec`, `channels`, `channel_layout`) left as
`None`. The bridge is intentionally lossy on probe minutiae the
sidecar does not store; this is the documented trade-off from the
factual-only spec.
- **`.alfred` sidecar serializer
(`alfred/infrastructure/persistence/dot_alfred/`).** Implements step 2
of the `specs/dot_alfred.md` plan. Pure-dict in/out
(`serialize(sidecar) -> dict`, `deserialize(data) -> ShowSidecar`) —
YAML I/O lives in the repository layer (step 3) and is kept out for
trivial testability. Ships the DTOs that mirror the YAML schema
field-for-field (`ShowSidecar`, `SeasonSidecar`, `EpisodeSidecar`,
`SubtitleEntry`). The sidecar acts as a **scan cache**: it stores
only what is genuinely costly to recompute — folder/file paths
(skipping the FS walk) and probed track metadata (skipping ffprobe).
Release identifiers (group, source, quality, codec) live in folder
and file names and are derived on demand by the parser — they are
deliberately absent from the schema and rejected on deserialize. The
serializer is **strict on schema**: unknown keys at any level raise
`SidecarSchemaError`, missing required fields raise clearly, and
`bool` cannot sneak in as a season/episode number. Optional fields
(`tmdb_id`, empty `audio`/`subtitles`/`episodes`) are omitted from
the output rather than emitted as `null` / `[]`. Tests cover
round-trip equivalence (DTO → dict → DTO and DTO → YAML text → DTO),
the Foundation S01 PACK case (real-world fixture with mixed sub
types — superset captured at season scope), and a Breaking Bad S05
EPISODIC case. An on-disk `tmp_path` fixture recreates the Foundation
folder structure with placeholder files, ready to be reused by the
upcoming repository walk tests in step 3.
- **`TVShowBuilder` / `SeasonBuilder` — sole construction surface for the
TVShow aggregate** (`alfred/domain/tv_shows/builders.py`). The aggregate
is now fully frozen; building goes through a mutable scratchpad that
emits an immutable `TVShow` via `build()`. Both builders offer a
`from_existing()` classmethod to seed from a current frozen aggregate
and apply modifications. Episodes are emitted sorted by number within a
season, seasons sorted by number within the show.
- **`SeasonMode` enum** (`PACK` / `EPISODIC`) in
`alfred/domain/tv_shows/value_objects.py`. Computed at read time from
the season's structural shape (`Season.mode` property): a season with
no explicit episodes is `PACK` (a single release covering the whole
season), a season with episodes is `EPISODIC` (currently airing, one
release per episode). Never stored — the YAML sidecar encodes the
mode via the presence/absence of the `episodes:` block.
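Several entries above rely on atomic sidecar writes (`.tmp` + `os.replace`). A minimal sketch of that technique, with a hypothetical `atomic_write_text` helper and made-up sidecar content rather than the project's actual `atomic_write_yaml`:

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(target: Path, text: str) -> None:
    """Write to a sibling temp file, then swap it into place."""
    tmp_path = target.with_name(target.name + ".tmp")
    tmp_path.write_text(text, encoding="utf-8")
    # os.replace is atomic when source and destination share a filesystem,
    # so readers see either the old file or the new one, never a partial.
    os.replace(tmp_path, target)


with tempfile.TemporaryDirectory() as tmp:
    sidecar = Path(tmp) / ".alfred"
    atomic_write_text(sidecar, "schema_version: 1\n")
    atomic_write_text(sidecar, "schema_version: 2\n")  # overwrite
    written = sidecar.read_text(encoding="utf-8")
    leftovers = [p.name for p in Path(tmp).iterdir() if p.name.endswith(".tmp")]
```

Keeping the temp file in the same directory as the target is what makes the final rename a same-filesystem operation.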
### Changed
- **TVShow aggregate is now frozen all the way down.** `TVShow`,
`Season` and `Episode` are all `@dataclass(frozen=True)`. Children
are stored as ordered tuples (`tuple[Season, ...]`,
`tuple[Episode, ...]`) sorted by their respective numbers, replacing
the previous mutable dicts. Lookup helpers `TVShow.get_season(n)` and
`Season.get_episode(n)` traverse the tuple lazily via `next()`. The
former `add_episode` / `add_season` mutation methods are gone — all
construction goes through `TVShowBuilder` / `SeasonBuilder`.
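A compact sketch of the frozen-aggregate-plus-builder shape described above, reduced to `Season` and `Episode` with hypothetical fields. The `add_episode` method on the builder is an assumption about its API; the sorted output, the `next()`-based lookup, and last-write-wins on duplicates follow the changelog:

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Episode:
    number: int
    title: str = ""


@dataclass(frozen=True)
class Season:
    number: int
    episodes: Tuple[Episode, ...] = ()

    def get_episode(self, number: int) -> Optional[Episode]:
        # Lazy scan of the tuple; stops at the first match.
        return next((e for e in self.episodes if e.number == number), None)


class SeasonBuilder:
    """Mutable scratchpad that emits an immutable Season."""

    def __init__(self, number: int) -> None:
        self._number = number
        self._episodes: Dict[int, Episode] = {}

    def add_episode(self, episode: Episode) -> "SeasonBuilder":
        self._episodes[episode.number] = episode  # last write wins
        return self

    def build(self) -> Season:
        ordered = tuple(self._episodes[n] for n in sorted(self._episodes))
        return Season(self._number, ordered)


season = (
    SeasonBuilder(1)
    .add_episode(Episode(2, "Second"))
    .add_episode(Episode(1, "First"))
    .add_episode(Episode(2, "Second (repack)"))
    .build()
)
```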
### Removed
- **ShowTracker-territory fields stripped from the TVShow aggregate.**
The aggregate now models only what the `.alfred` sidecar stores
(filesystem-observable facts + immutable identity). Dropped from the
domain:
- `TVShow.status` (`ShowStatus`) and the `ShowStatus` enum entirely,
along with its TMDB string mapping (`from_string`).
- `TVShow.expected_seasons`, `Season.expected_episodes`,
`Season.aired_episodes`, `Season.name`.
- `TVShow.collection_status()`, `is_complete_series()`,
`missing_episodes()`, `is_ongoing()`, `is_ended()` and the
`CollectionStatus` enum.
- `Season.is_complete()`, `is_fully_aired()`, `missing_episodes()`
and the `aired ≤ expected` validation.
- `TVShow.add_episode()` / `TVShow.add_season()` /
`Season.add_episode()` — replaced by the builder API.
These concerns will reappear in a dedicated `ShowTracker` layer (to
be designed) that combines the `.alfred` sidecar with live TMDB data
to answer questions like "is this show complete?" or "are new
episodes out?". Keeping volatile/derived state out of the aggregate
matches the factual-only philosophy locked in `specs/dot_alfred.md`.
### Internal
- **Test suite rewritten for the new aggregate shape.**
`tests/domain/test_tv_shows.py` now covers frozen invariants, builder
ordering, last-write-wins on duplicates, `from_existing` round-trip,
and `SeasonMode` derivation. `tests/infrastructure/test_filesystem_extras.py`
helper simplified (no more `ShowStatus.ENDED` / `expected_seasons` on
test shows). 1078 tests still green.
- **Design doc for `.alfred/` sidecar persistence
(`specs/dot_alfred.md`).** First entry in the new `specs/` directory.
Specifies a per-show `.alfred/` directory holding a `show.yaml` and
one `season_NN.yaml` per season, used by the upcoming concrete
`TVShowRepository` to cache parse/probe results and avoid full
rescans on every library read. Covers schema, naming conventions,
cache invalidation strategy (size + mtime), self-healing on
drift, atomicity (`os.replace`), edge cases (legacy folders,
corrupted sidecars, manual file removal), and a phased
implementation plan. No code yet — spec only.
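
Since the spec ships without code, here is a hedged sketch of the two mechanisms it names: a size + mtime fingerprint for cache invalidation and an `os.replace`-based atomic write. The helper names (`fingerprint`, `is_stale`, `write_atomic`) are hypothetical, and using `st_mtime_ns` rather than a coarser mtime is an assumption.

```python
import os
import tempfile
from pathlib import Path


def fingerprint(path: Path) -> tuple[int, int]:
    """Return (size, mtime_ns), the pair compared to detect a changed file."""
    st = path.stat()
    return st.st_size, st.st_mtime_ns


def is_stale(path: Path, cached: tuple[int, int]) -> bool:
    """True when the file no longer matches the fingerprint stored in the sidecar."""
    return fingerprint(path) != cached


def write_atomic(target: Path, text: str) -> None:
    """Write to a temp file in the same directory, then swap it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        # os.replace overwrites the target in one step on the same filesystem.
        os.replace(tmp_name, target)
    except BaseException:
        # Leave no stray temp file behind if anything failed before the swap.
        Path(tmp_name).unlink(missing_ok=True)
        raise
```

Creating the temp file next to the target keeps both on the same filesystem, which is what lets `os.replace` behave as a single rename instead of a copy.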
### Internal
- **`specs/` is now tracked.** The repo-level `.gitignore` had a
blanket `*.md` rule with only `CHANGELOG.md` allow-listed. Added
explicit exceptions for `/README.md` (root only — avoids
unintentionally exposing fixture READMEs) and `specs/**/*.md` so the
new design-doc directory ships with the project. Also added an
explicit `/.claude/` ignore line for the private dev-docs sub-repo
that sits inside the working tree but is versioned separately.
### Fixed
- **Multi-episode chain (e.g. `S14E09E10E11`) now collapses to a full
@@ -48,6 +548,26 @@ callers).
### Added
- **Fullwidth vertical bar `｜` (U+FF5C) is now a recognized release-name
token separator.** Added to `alfred/knowledge/release/separators.yaml`
so CJK release names (and the occasional decorative YouTube-style use)
tokenize cleanly instead of leaving the wide pipe glued onto an
adjacent token. The tokenizer in
`alfred/domain/release/parser/pipeline.py` already iterates the
separator list as plain strings (no regex), so a multi-byte UTF-8
separator works without any code change.
- **`InspectedResult.recommended_action` property** — derived hint that
collapses the orchestrator's go / wait / skip decision into a single
value (``"process"`` / ``"ask_user"`` / ``"skip"``). Centralizes the
exclusion logic that was previously dispersed across road /
media_type / main_video checks at each call site. Ordering is part of
the contract: ``skip`` (no main video, or media_type == ``"other"``)
wins over ``ask_user`` (media_type == ``"unknown"`` or road ==
``"path_of_pain"``) which wins over ``process``. Surfaced through the
``analyze_release`` tool so the LLM can route on it directly.
6 new tests in ``tests/application/test_inspect.py`` cover the four
branches and the precedence rules.
- **`LanguageRepository` port** in `alfred.domain.shared.ports`. Structural
Protocol covering `from_iso`, `from_any`, `all`, `__contains__`, `__len__`
— the surface previously coupled to the concrete `LanguageRegistry`.
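
The precedence contract stated in the `InspectedResult.recommended_action` entry can be sketched as below. The class is a simplified stand-in: the three fields (`media_type`, `road`, `has_main_video`) are assumed names chosen for illustration, and only the ordering of the checks is taken from the entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InspectedResult:
    # Simplified stand-in; field names are assumptions for this sketch.
    media_type: str
    road: str
    has_main_video: bool

    @property
    def recommended_action(self) -> str:
        # Order is the contract: skip beats ask_user, which beats process.
        if not self.has_main_video or self.media_type == "other":
            return "skip"
        if self.media_type == "unknown" or self.road == "path_of_pain":
            return "ask_user"
        return "process"


no_video = InspectedResult(media_type="unknown", road="easy", has_main_video=False)
uncertain = InspectedResult(media_type="movie", road="path_of_pain", has_main_video=True)
clean = InspectedResult(media_type="movie", road="easy", has_main_video=True)
```

Note that `no_video` would also qualify for `ask_user` (its media type is unknown), but the missing main video is checked first, so the result is `skip`.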
@@ -57,6 +577,95 @@ callers).
### Changed
- **`Movie` and `Episode` are now frozen dataclasses.** Both entities
hold their track collections as `tuple[AudioTrack, ...]` and
`tuple[SubtitleTrack, ...]` instead of mutable lists, and are
`@dataclass(frozen=True, eq=False)` (identity-based equality
preserved via `__eq__`/`__hash__`). `__post_init__` coercion uses
`object.__setattr__` for the `imdb_id` / `title` /
`season_number` / `episode_number` normalizations. To project
enrichment results (probe output, file metadata) callers now rebuild
via `dataclasses.replace(...)`. Pattern aligned with the recent
`ParsedRelease` freeze. `MediaWithTracks` mixin contract updated to
`tuple` accordingly. `Season` and `TVShow` remain mutable for now —
freezing the aggregate root would cascade a full reconstruction on
every `add_episode`, deferred.
- **`SubtitleCandidate` renamed to `SubtitleScanResult`.** The old name
conflated "this might become a placed subtitle" with "this is what a
scan pass produced". The class is the output of a scan/identify pass
— language/format may still be `None`, confidence reflects how sure
the classifier is, and `raw_tokens` holds the filename fragments
under analysis. `SubtitleScanResult` says that directly. Pure rename
with a refreshed docstring in `alfred/domain/subtitles/entities.py`;
no behavior change. Touches the domain entity + `__init__` export,
the matcher / identifier / utils services, the manage_subtitles use
case, the placer, the metadata store, the shared-media cross-ref
comment, and the seven test modules that imported the type.
- **`ParsedRelease` is now frozen; enrichment passes return new
instances.** The VO was mutable so `detect_media_type` and
`enrich_from_probe` could patch fields in place — a code smell in a
value object whose identity *is* its content. `ParsedRelease` is now
`@dataclass(frozen=True)`; `languages` is a `tuple[str, ...]`
instead of a `list[str]`. `enrich_from_probe` returns a new
`ParsedRelease` via `dataclasses.replace` (only allocates when at
least one field actually changed). `inspect_release` rebinds
`parsed` after both `detect_media_type` (wrapped in `MediaTypeToken`
to satisfy the strict isinstance check that now also runs on
replace) and `enrich_from_probe`. Parser pipeline now packs
`languages` as a tuple in the assemble dict. Callers updated:
`inspect_release`, `testing/recognize_folders_in_downloads.py`, and
the enrichment tests (22 call sites + language assertions switched
to tuple literals).
- **`resolve_destination` use cases take `kb` / `prober` as required
params; module-level singletons gone.** The four
`resolve_{season,episode,movie,series}_destination` use cases now
accept `kb: ReleaseKnowledge` and `prober: MediaProber` as required
arguments, matching the shape of `inspect_release`. The module-level
`_KB = YamlReleaseKnowledge()` and `_PROBER = FfprobeMediaProber()`
singletons that previously lived in
`alfred/application/filesystem/resolve_destination.py` are removed —
the application layer no longer reaches into infrastructure. The
singletons now live at the agent-tools frontier
(`alfred/agent/tools/filesystem.py`), where the LLM-facing wrappers
instantiate them once and thread them through. `analyze_release` no
longer needs the dirty `from ... import _KB` indirection. Tests
inject their own stubs by keyword (`prober=_StubProber(...)`) instead
of monkeypatching a module attribute.
- **`ParsePath` enum renamed to `TokenizationRoute`.** The old name
collided with `pathlib.Path` in code-reading mental models, and was
one letter from `parse_path` (the field that holds the value) — making
it harder than it needed to be to spot the type vs the attribute.
``TokenizationRoute`` says what it actually captures (DIRECT /
SANITIZED / AI = how the name reached the tokenizer), and the class
docstring now spells out the orthogonality with ``Road`` (EASY /
SHITTY / PATH_OF_PAIN, which captures parser confidence on
``ParseReport``). The ``parse_path`` field name stays unchanged —
string values too — so YAML fixtures, the ``analyze_release`` tool
spec, and any external consumer are untouched.
- **`enrich_from_probe` codec mappings moved to YAML.** The three
hard-coded module dicts (`_VIDEO_CODEC_MAP`, `_AUDIO_CODEC_MAP`,
`_CHANNEL_MAP`) translating ffprobe output to scene tokens
(`hevc → x265`, `eac3 → EAC3`, `8 → "7.1"`, …) now live in
`alfred/knowledge/release/probe_mappings.yaml` and are loaded into
`ReleaseKnowledge.probe_mappings` (new port field, populated by
`YamlReleaseKnowledge`). `enrich_from_probe` gains a third `kb`
parameter and reads the maps from there. Aligns with the CLAUDE.md
rule that lookup tables of domain knowledge belong in YAML, not in
Python — and opens the door to a future "learn new codec" pass.
Callers updated: `inspect_release`, `testing/recognize_folders_in_downloads.py`,
and all 22 sites in `tests/application/test_enrich_from_probe.py`.
- **`ParsedRelease.tech_string` is now a derived `@property`**
(`alfred/domain/release/value_objects.py`). It computes
`quality.source.codec` joined by dots on every access, so it stays in
sync with the underlying fields by construction. The stored field is
gone from the dataclass, the dict returned by `assemble()` no longer
carries the key, `parse_release`'s malformed-name fallback drops the
`tech_string=""` kwarg, and `enrich_from_probe` no longer re-derives
it after filling `quality`/`source`/`codec`. Closes the
parser/enrichment double-source-of-truth that `e79ca46` had to fix
reactively. The fixtures runner now injects `tech_string` alongside
`is_season_pack` since `asdict()` skips properties.
- **`RuleScope.level` is now an enum (`RuleScopeLevel`).** The set of
valid levels (global, release_group, movie, show, season, episode)
was documented only in a docstring comment and validated nowhere.
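
To make the `ParsedRelease` entries above concrete, here is a rough sketch combining three of the changes: the frozen value object, an enrichment pass that returns a new instance via `dataclasses.replace` only when something changed, and `tech_string` as a derived property. Everything here is simplified and partly assumed: `enrich_codec` is a hypothetical one-field stand-in for `enrich_from_probe`, the mapping dict stands in for the YAML-backed `probe_mappings`, and skipping empty parts in `tech_string` is a guess at the real behavior.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ParsedRelease:
    # Simplified stand-in with only the fields this illustration needs.
    title: str
    quality: str | None = None
    source: str | None = None
    codec: str | None = None
    languages: tuple[str, ...] = ()

    @property
    def tech_string(self) -> str:
        # Recomputed on every access, so it cannot drift from its parts.
        return ".".join(p for p in (self.quality, self.source, self.codec) if p)


# Hypothetical excerpt of the ffprobe-to-scene-token mapping.
VIDEO_CODEC_MAP = {"hevc": "x265"}


def enrich_codec(parsed: ParsedRelease, probed_codec: str | None) -> ParsedRelease:
    """Fill a missing codec from probe output; return the input when nothing changes."""
    if parsed.codec is not None or probed_codec is None:
        return parsed
    codec = VIDEO_CODEC_MAP.get(probed_codec)
    if codec is None:
        return parsed
    return replace(parsed, codec=codec)


parsed = ParsedRelease("Foundation", quality="1080p", source="WEB-DL")
enriched = enrich_codec(parsed, "hevc")
```

Callers have to rebind the result (`parsed = enrich_codec(parsed, ...)`), which is the same adjustment the entry describes for `inspect_release`.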
+3 -3
@@ -6,13 +6,13 @@ from collections.abc import AsyncGenerator
from pathlib import Path from pathlib import Path
from typing import Any from typing import Any
from alfred.infrastructure.metadata import MetadataStore from alfred.infrastructure.metadata_TO_CHECK import MetadataStore
from alfred.infrastructure.persistence import get_memory from alfred.infrastructure.persistence_TO_CHECK import get_memory
from alfred.settings import settings from alfred.settings import settings
from .prompt import PromptBuilder from .prompt import PromptBuilder
from .registry import Tool, make_tools from .registry import Tool, make_tools
from .workflows import WorkflowLoader from .workflows_TO_CHECK import WorkflowLoader
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
+3 -3
@@ -3,12 +3,12 @@
import json import json
from typing import Any from typing import Any
from alfred.infrastructure.persistence import get_memory from alfred.infrastructure.persistence_TO_CHECK import get_memory
from alfred.infrastructure.persistence.memory import MemoryRegistry from alfred.infrastructure.persistence_TO_CHECK.memory import MemoryRegistry
from .expressions import build_expressions_context from .expressions import build_expressions_context
from .registry import Tool from .registry import Tool
from .workflows import WorkflowLoader from .workflows_TO_CHECK import WorkflowLoader
# Tools that are always available, regardless of workflow scope. # Tools that are always available, regardless of workflow scope.
# Kept small on purpose — the core is what the agent uses to either # Kept small on purpose — the core is what the agent uses to either
+6 -6
@@ -6,8 +6,8 @@ from collections.abc import Callable
from dataclasses import dataclass from dataclasses import dataclass
from typing import Any from typing import Any
from .tools.spec import ToolSpec, ToolSpecError from .tools_TO_CHECK.spec import ToolSpec, ToolSpecError
from .tools.spec_loader import load_tool_specs from .tools_TO_CHECK.spec_loader import load_tool_specs
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -130,10 +130,10 @@ def make_tools(settings) -> dict[str, Tool]:
Returns: Returns:
Dictionary mapping tool names to Tool objects. Dictionary mapping tool names to Tool objects.
""" """
from .tools import api as api_tools # noqa: PLC0415 from .tools_TO_CHECK import api as api_tools # noqa: PLC0415
from .tools import filesystem as fs_tools # noqa: PLC0415 from .tools_TO_CHECK import filesystem as fs_tools # noqa: PLC0415
from .tools import language as lang_tools # noqa: PLC0415 from .tools_TO_CHECK import language as lang_tools # noqa: PLC0415
from .tools import workflow as wf_tools # noqa: PLC0415 from .tools_TO_CHECK import workflow as wf_tools # noqa: PLC0415
tool_functions = [ tool_functions = [
fs_tools.set_path_for_folder, fs_tools.set_path_for_folder,
-22
@@ -1,22 +0,0 @@
"""Tools module - filesystem and API tools for the agent."""
from .api import (
add_torrent_by_index,
add_torrent_to_qbittorrent,
find_media_imdb_id,
find_torrent,
get_torrent_by_index,
)
from .filesystem import list_folder, set_path_for_folder
from .language import set_language
__all__ = [
"set_path_for_folder",
"list_folder",
"find_media_imdb_id",
"find_torrent",
"get_torrent_by_index",
"add_torrent_to_qbittorrent",
"add_torrent_by_index",
"set_language",
]
+23
@@ -0,0 +1,23 @@
"""Tools module — agent-exposed wrappers.
Re-exports are intentionally minimal during the ``unfuck`` refactor.
Tool wiring (registry / specs / LLM-facing surface) is the last
chunk of work on this branch; until then, importers should reach
into the submodules directly (``alfred.agent.tools.filesystem``, …).
"""
from .api import (
add_torrent_by_index,
add_torrent_to_qbittorrent,
find_torrent,
get_torrent_by_index,
)
from .language import set_language
__all__ = [
"find_torrent",
"get_torrent_by_index",
"add_torrent_to_qbittorrent",
"add_torrent_by_index",
"set_language",
]
@@ -3,35 +3,47 @@
import logging import logging
from typing import Any from typing import Any
from alfred.application.movies import SearchMovieUseCase from alfred.application.movies_TO_CHECK import SearchMovieUseCase
from alfred.application.torrents import AddTorrentUseCase, SearchTorrentsUseCase from alfred.application.torrents_TO_CHECK import AddTorrentUseCase, SearchTorrentsUseCase
from alfred.infrastructure.api.knaben import knaben_client from alfred.application.tv_shows_TO_CHECK import SearchShowUseCase
from alfred.infrastructure.api.qbittorrent import qbittorrent_client from alfred.infrastructure.api_TO_CHECK.knaben import knaben_client
from alfred.infrastructure.api.tmdb import tmdb_client from alfred.infrastructure.api_TO_CHECK.qbittorrent import qbittorrent_client
from alfred.infrastructure.persistence import get_memory from alfred.infrastructure.api_TO_CHECK.tmdb import tmdb_client
from alfred.infrastructure.persistence_TO_CHECK import get_memory
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
def find_media_imdb_id(media_title: str) -> dict[str, Any]: def search_movies(media_title: str) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/find_media_imdb_id.yaml.""" """Thin tool wrapper — semantics live in alfred/agent/tools/specs/search_movies.yaml."""
use_case = SearchMovieUseCase(tmdb_client) use_case = SearchMovieUseCase(tmdb_client)
response = use_case.execute(media_title) response = use_case.execute(media_title)
result = response.to_dict() result = response.to_dict()
if result.get("status") == "ok": if result.get("status") == "ok":
memory = get_memory() memory = get_memory()
memory.stm.set_entity( memory.stm.set_entity("last_movie_search", {"hits": result.get("hits", [])})
"last_media_search", memory.stm.set_topic("searching_movie")
{ logger.debug(
"title": result.get("title"), f"Stored movie search result in STM: {len(result.get('hits', []))} hits"
"imdb_id": result.get("imdb_id"), )
"media_type": result.get("media_type"),
"tmdb_id": result.get("tmdb_id"), return result
},
def search_shows(show_title: str) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/search_shows.yaml."""
use_case = SearchShowUseCase(tmdb_client)
response = use_case.execute(show_title)
result = response.to_dict()
if result.get("status") == "ok":
memory = get_memory()
memory.stm.set_entity("last_show_search", {"hits": result.get("hits", [])})
memory.stm.set_topic("searching_show")
logger.debug(
f"Stored show search result in STM: {len(result.get('hits', []))} hits"
) )
memory.stm.set_topic("searching_media")
logger.debug(f"Stored media search result in STM: {result.get('title')}")
return result return result
@@ -1,4 +1,20 @@
"""Filesystem tools for folder management.""" """Filesystem tools for folder management.
Thin wrappers around the 5 atomic filesystem use cases
(``alfred.application.filesystem``) plus a few self-contained tools
(``analyze_release``, ``probe_media``, ``learn``, …).
Tools removed during the ``unfuck`` filesystem refactor to be
rewired in a later step:
- ``manage_subtitles`` (depends on the rewritten subtitle services)
- ``set_path_for_folder`` (no replacement use case yet)
- ``create_seed_links`` (flow has changed: hard-link straight to
library, no copy back; will be re-introduced per-file when the
organize-release workflow lands)
- ``resolve_season_destination`` / ``resolve_episode_destination``
/ ``resolve_movie_destination`` / ``resolve_series_destination``
(their use cases moved to ``_OLD`` files pending a rewrite)
"""
from pathlib import Path from pathlib import Path
from typing import Any from typing import Any
@@ -7,121 +23,136 @@ import yaml
import alfred as _alfred_pkg import alfred as _alfred_pkg
from alfred.application.filesystem import ( from alfred.application.filesystem import (
CreateSeedLinksUseCase, DirectoryRoots,
ListFolderUseCase, create_dir_use_case,
ManageSubtitlesUseCase, list_dir_use_case,
MoveMediaUseCase, move_file_use_case,
SetFolderPathUseCase,
) )
from alfred.application.filesystem.resolve_destination import ( from alfred.infrastructure.knowledge_TO_CHECK.release_kb import YamlReleaseKnowledge
resolve_episode_destination as _resolve_episode_destination, from alfred.infrastructure.metadata_TO_CHECK import MetadataStore
) from alfred.infrastructure.persistence_TO_CHECK import get_memory
from alfred.application.filesystem.resolve_destination import ( from alfred.infrastructure.probe_TO_CHECK import FfprobeMediaProber
resolve_movie_destination as _resolve_movie_destination,
)
from alfred.application.filesystem.resolve_destination import (
resolve_season_destination as _resolve_season_destination,
)
from alfred.application.filesystem.resolve_destination import (
resolve_series_destination as _resolve_series_destination,
)
from alfred.infrastructure.filesystem import FileManager, create_folder, move
from alfred.infrastructure.metadata import MetadataStore
from alfred.infrastructure.persistence import get_memory
from alfred.infrastructure.probe import FfprobeMediaProber
# Agent-tools frontier: this is the legitimate home for the singletons that
# back every LLM-exposed wrapper. The use cases below take ``kb`` / ``prober``
# as required params; tests inject their own stubs.
_KB = YamlReleaseKnowledge()
_PROBER = FfprobeMediaProber() _PROBER = FfprobeMediaProber()
_LEARNED_ROOT = Path(_alfred_pkg.__file__).parent.parent / "data" / "knowledge" _LEARNED_ROOT = Path(_alfred_pkg.__file__).parent.parent / "data" / "knowledge"
class _RootsNotConfigured(Exception):
"""Raised when one of the 4 expected roots is missing from memory."""
def __init__(self, missing: list[str]):
super().__init__(f"Roots not configured: {missing}")
self.missing = missing
def _load_directory_roots() -> DirectoryRoots:
"""Build :class:`DirectoryRoots` from the persisted memory.
Reads:
- ``ltm.workspace.download`` -> ``downloads``
- ``ltm.workspace.torrent`` -> ``torrents``
- ``ltm.library_paths['movies']`` -> ``movies``
- ``ltm.library_paths['tv_shows']`` -> ``tv_shows``
Raises:
_RootsNotConfigured: if any of the four paths is unset.
"""
memory = get_memory()
downloads = memory.ltm.workspace.download
torrents = memory.ltm.workspace.torrent
movies = memory.ltm.library_paths.get("movies")
tv_shows = memory.ltm.library_paths.get("tv_shows")
missing: list[str] = []
if not downloads:
missing.append("downloads")
if not torrents:
missing.append("torrents")
if not movies:
missing.append("movies")
if not tv_shows:
missing.append("tv_shows")
if missing:
raise _RootsNotConfigured(missing)
return DirectoryRoots(
downloads=Path(downloads),
torrents=Path(torrents),
movies=Path(movies),
tv_shows=Path(tv_shows),
)
def _roots_error(exc: _RootsNotConfigured) -> dict[str, Any]:
return {
"status": "error",
"error": "roots_not_configured",
"message": (
f"Missing roots: {exc.missing}. "
"Configure them via /set_path before using filesystem tools."
),
}
# ---------------------------------------------------------------------------
# 5 atomic filesystem tools — thin wrappers over the use cases.
# ---------------------------------------------------------------------------
def list_folder(path: str) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/list_folder.yaml."""
try:
roots = _load_directory_roots()
except _RootsNotConfigured as e:
return _roots_error(e)
return list_dir_use_case(Path(path), roots).to_dict()
def create_directory(path: str) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/create_directory.yaml."""
try:
roots = _load_directory_roots()
except _RootsNotConfigured as e:
return _roots_error(e)
return create_dir_use_case(Path(path), roots).to_dict()
def move_media(source: str, destination: str) -> dict[str, Any]: def move_media(source: str, destination: str) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/move_media.yaml.""" """Thin tool wrapper — semantics live in alfred/agent/tools/specs/move_media.yaml."""
file_manager = FileManager() try:
use_case = MoveMediaUseCase(file_manager) roots = _load_directory_roots()
return use_case.execute(source, destination).to_dict() except _RootsNotConfigured as e:
return _roots_error(e)
return move_file_use_case(Path(source), Path(destination), roots).to_dict()
def move_to_destination(source: str, destination: str) -> dict[str, Any]: def move_to_destination(source: str, destination: str) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/move_to_destination.yaml.""" """Thin tool wrapper — semantics live in alfred/agent/tools/specs/move_to_destination.yaml.
parent = str(Path(destination).parent)
result = create_folder(parent) Convenience tool that creates the destination's parent directory
if result["status"] != "ok": if missing, then moves the file. Saves the LLM from having to
return result chain ``create_directory`` + ``move_media`` explicitly.
return move(source, destination) """
try:
roots = _load_directory_roots()
except _RootsNotConfigured as e:
return _roots_error(e)
dst = Path(destination)
mkdir_resp = create_dir_use_case(dst.parent, roots)
if mkdir_resp.status != "ok":
return mkdir_resp.to_dict()
return move_file_use_case(Path(source), dst, roots).to_dict()
def resolve_season_destination( # ---------------------------------------------------------------------------
release_name: str, # Self-contained tools — not impacted by the filesystem refactor.
tmdb_title: str, # ---------------------------------------------------------------------------
tmdb_year: int,
confirmed_folder: str | None = None,
source_path: str | None = None,
) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/resolve_season_destination.yaml."""
return _resolve_season_destination(
release_name, tmdb_title, tmdb_year, confirmed_folder, source_path
).to_dict()
def resolve_episode_destination(
release_name: str,
source_file: str,
tmdb_title: str,
tmdb_year: int,
tmdb_episode_title: str | None = None,
confirmed_folder: str | None = None,
) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/resolve_episode_destination.yaml."""
return _resolve_episode_destination(
release_name,
source_file,
tmdb_title,
tmdb_year,
tmdb_episode_title,
confirmed_folder,
).to_dict()
def resolve_movie_destination(
release_name: str,
source_file: str,
tmdb_title: str,
tmdb_year: int,
) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/resolve_movie_destination.yaml."""
return _resolve_movie_destination(
release_name, source_file, tmdb_title, tmdb_year
).to_dict()
def resolve_series_destination(
release_name: str,
tmdb_title: str,
tmdb_year: int,
confirmed_folder: str | None = None,
source_path: str | None = None,
) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/resolve_series_destination.yaml."""
return _resolve_series_destination(
release_name, tmdb_title, tmdb_year, confirmed_folder, source_path
).to_dict()
def create_seed_links(
library_file: str, original_download_folder: str
) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/create_seed_links.yaml."""
file_manager = FileManager()
use_case = CreateSeedLinksUseCase(file_manager)
return use_case.execute(library_file, original_download_folder).to_dict()
def manage_subtitles(source_video: str, destination_video: str) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/manage_subtitles.yaml."""
file_manager = FileManager()
use_case = ManageSubtitlesUseCase(file_manager)
return use_case.execute(source_video, destination_video).to_dict()
def learn(pack: str, category: str, key: str, values: list[str]) -> dict[str, Any]: def learn(pack: str, category: str, key: str, values: list[str]) -> dict[str, Any]:
@@ -181,18 +212,9 @@ def learn(pack: str, category: str, key: str, values: list[str]) -> dict[str, An
} }
def set_path_for_folder(folder_name: str, path_value: str) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/set_path_for_folder.yaml."""
file_manager = FileManager()
use_case = SetFolderPathUseCase(file_manager)
response = use_case.execute(folder_name, path_value)
return response.to_dict()
def analyze_release(release_name: str, source_path: str) -> dict[str, Any]: def analyze_release(release_name: str, source_path: str) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/analyze_release.yaml.""" """Thin tool wrapper — semantics live in alfred/agent/tools/specs/analyze_release.yaml."""
from alfred.application.filesystem.resolve_destination import _KB # noqa: PLC0415 from alfred.application.release_TO_CHECK import inspect_release # noqa: PLC0415
from alfred.application.release import inspect_release # noqa: PLC0415
result = inspect_release(release_name, Path(source_path), _KB, _PROBER) result = inspect_release(release_name, Path(source_path), _KB, _PROBER)
parsed = result.parsed parsed = result.parsed
@@ -220,6 +242,7 @@ def analyze_release(release_name: str, source_path: str) -> dict[str, Any]:
"probe_used": result.probe_used, "probe_used": result.probe_used,
"confidence": result.report.confidence, "confidence": result.report.confidence,
"road": result.report.road, "road": result.report.road,
"recommended_action": result.recommended_action,
} }
@@ -277,14 +300,6 @@ def probe_media(source_path: str) -> dict[str, Any]:
} }
def list_folder(folder_type: str, path: str = ".") -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/list_folder.yaml."""
file_manager = FileManager()
use_case = ListFolderUseCase(file_manager)
response = use_case.execute(folder_type, path)
return response.to_dict()
def read_release_metadata(release_path: str) -> dict[str, Any]: def read_release_metadata(release_path: str) -> dict[str, Any]:
"""Thin tool wrapper — semantics live in alfred/agent/tools/specs/read_release_metadata.yaml.""" """Thin tool wrapper — semantics live in alfred/agent/tools/specs/read_release_metadata.yaml."""
path = Path(release_path) path = Path(release_path)
@@ -3,7 +3,7 @@
import logging import logging
from typing import Any from typing import Any
from alfred.infrastructure.persistence import get_memory from alfred.infrastructure.persistence_TO_CHECK import get_memory
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -82,3 +82,4 @@ returns:
probe_used: True when ffprobe successfully enriched the result. probe_used: True when ffprobe successfully enriched the result.
confidence: Parser confidence score, 0–100 (higher = more reliable). confidence: Parser confidence score, 0–100 (higher = more reliable).
road: "Parser road: 'easy' (group schema matched), 'shitty' (heuristic but acceptable), or 'path_of_pain' (low confidence — ask the user before auto-routing)." road: "Parser road: 'easy' (group schema matched), 'shitty' (heuristic but acceptable), or 'path_of_pain' (low confidence — ask the user before auto-routing)."
recommended_action: "Orchestrator hint: 'process' (go straight to resolve_*_destination), 'ask_user' (media_type unknown or road=path_of_pain — confirm with the user first), or 'skip' (no main video, or media_type=other — nothing to organize)."
@@ -9,9 +9,9 @@ to reason over the full set.
import logging import logging
from typing import Any from typing import Any
from alfred.infrastructure.persistence import get_memory from alfred.infrastructure.persistence_TO_CHECK import get_memory
from ..workflows import WorkflowLoader from ..workflows_TO_CHECK import WorkflowLoader
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
+1 -1
@@ -15,7 +15,7 @@ from alfred.agent.agent import Agent
from alfred.agent.llm.deepseek import DeepSeekClient from alfred.agent.llm.deepseek import DeepSeekClient
from alfred.agent.llm.exceptions import LLMAPIError, LLMConfigurationError from alfred.agent.llm.exceptions import LLMAPIError, LLMConfigurationError
from alfred.agent.llm.ollama import OllamaClient from alfred.agent.llm.ollama import OllamaClient
from alfred.infrastructure.persistence import get_memory, init_memory from alfred.infrastructure.persistence_TO_CHECK import get_memory, init_memory
from alfred.settings import settings from alfred.settings import settings
logging.basicConfig( logging.basicConfig(
+26
@@ -0,0 +1,26 @@
"""Application-layer exceptions shared across orchestrators.
Kept in a dedicated module (rather than inside each orchestrator's
file) because the sync flows for TV shows and movies raise structurally
identical "not found in library" errors — pulling them out makes the
shared semantics explicit and avoids cross-imports between the
``tv_shows`` and ``movies`` packages.
"""
from __future__ import annotations
class ShowNotFoundInLibrary(LookupError):
"""Raised when no on-disk TV show carries the requested ``tmdb_id``.
The sync orchestrator raises this when both the library index and
the per-show release repository return ``None`` for a lookup —
there is nothing on disk to refresh TMDB facts against.
"""
class MovieNotFoundInLibrary(LookupError):
"""Raised when no on-disk movie carries the requested ``tmdb_id``.
Symmetric to :class:`ShowNotFoundInLibrary` for the movies library.
"""
+36 -41
@@ -1,47 +1,42 @@
-"""Filesystem use cases."""
+"""Filesystem application layer: 5 atomic use cases as free functions.
+
+Each use case:
+- accepts :class:`pathlib.Path` inputs plus a :class:`DirectoryRoots` VO,
+- guards inputs against escaping configured roots,
+- calls the matching infra op,
+- catches :class:`~alfred.infrastructure.filesystem.FilesystemError` and
+  returns a frozen DTO with a normalized error code.
+
+No global state, no ``get_memory()``. Roots are injected.
+"""

-from .create_seed_links import CreateSeedLinksUseCase
+from .create_dir import create_dir_use_case
+from .directory_roots import DirectoryRoots
 from .dto import (
-    CreateSeedLinksResponse,
-    ListFolderResponse,
-    ManageSubtitlesResponse,
-    MoveMediaResponse,
-    PlacedSubtitle,
-    SetFolderPathResponse,
+    CreateDirResponse,
+    LinkFileResponse,
+    ListDirResponse,
+    MoveDirResponse,
+    MoveFileResponse,
 )
-from .list_folder import ListFolderUseCase
-from .manage_subtitles import ManageSubtitlesUseCase
-from .move_media import MoveMediaUseCase
-from .resolve_destination import (
-    ResolvedEpisodeDestination,
-    ResolvedMovieDestination,
-    ResolvedSeasonDestination,
-    ResolvedSeriesDestination,
-    resolve_episode_destination,
-    resolve_movie_destination,
-    resolve_season_destination,
-    resolve_series_destination,
-)
-from .set_folder_path import SetFolderPathUseCase
+from .link_file import link_file_use_case
+from .list_dir import list_dir_use_case
+from .move_dir import move_dir_use_case
+from .move_file import move_file_use_case

 __all__ = [
-    "SetFolderPathUseCase",
-    "ListFolderUseCase",
-    "CreateSeedLinksUseCase",
-    "MoveMediaUseCase",
-    "ManageSubtitlesUseCase",
-    "ResolvedSeasonDestination",
-    "ResolvedEpisodeDestination",
-    "ResolvedMovieDestination",
-    "ResolvedSeriesDestination",
-    "resolve_season_destination",
-    "resolve_episode_destination",
-    "resolve_movie_destination",
-    "resolve_series_destination",
-    "SetFolderPathResponse",
-    "ListFolderResponse",
-    "CreateSeedLinksResponse",
-    "MoveMediaResponse",
-    "ManageSubtitlesResponse",
-    "PlacedSubtitle",
+    # use cases
+    "list_dir_use_case",
+    "create_dir_use_case",
+    "link_file_use_case",
+    "move_file_use_case",
+    "move_dir_use_case",
+    # VO
+    "DirectoryRoots",
+    # DTOs
+    "ListDirResponse",
+    "CreateDirResponse",
+    "LinkFileResponse",
+    "MoveFileResponse",
+    "MoveDirResponse",
 ]
+41
@@ -0,0 +1,41 @@
"""Internal helpers: mapping infra exceptions → error codes.

Kept private (``_errors``): only the 5 use cases in this package use
it. Centralizes the exception → code translation so every use case
returns consistent error payloads.
"""

from __future__ import annotations

from alfred.infrastructure.filesystem import (
    CrossDevice,
    DestinationExists,
    FilesystemError,
    FilesystemOSError,
    NotADirectory,
    NotAFile,
    PermissionDenied,
    SourceNotFound,
)

# Application-layer error codes (guard violations, not infra).
PATH_NOT_ALLOWED = "path_not_allowed"


def code_for(exc: FilesystemError) -> str:
    """Return the snake-case error code for an infra exception."""
    if isinstance(exc, SourceNotFound):
        return "source_not_found"
    if isinstance(exc, DestinationExists):
        return "destination_exists"
    if isinstance(exc, NotADirectory):
        return "not_a_directory"
    if isinstance(exc, NotAFile):
        return "not_a_file"
    if isinstance(exc, PermissionDenied):
        return "permission_denied"
    if isinstance(exc, CrossDevice):
        return "cross_device"
    if isinstance(exc, FilesystemOSError):
        return "filesystem_os_error"
    return "filesystem_error"
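Review note: the `isinstance` chain in `code_for` could also be written as an ordered lookup table. The sketch below is only an illustration of that alternative, not the code in this commit. The exception classes are local stand-ins (the real ones live in `alfred.infrastructure.filesystem`, which this hunk does not show), and it assumes every typed exception derives directly from `FilesystemError`.

```python
from __future__ import annotations


# Stand-in hierarchy (assumption): mirrors the names imported by _errors.py.
class FilesystemError(Exception): ...
class SourceNotFound(FilesystemError): ...
class DestinationExists(FilesystemError): ...
class NotADirectory(FilesystemError): ...
class NotAFile(FilesystemError): ...
class PermissionDenied(FilesystemError): ...
class CrossDevice(FilesystemError): ...
class FilesystemOSError(FilesystemError): ...


# Ordered pairs: the first matching class wins, like the if-chain.
_CODES: tuple[tuple[type[FilesystemError], str], ...] = (
    (SourceNotFound, "source_not_found"),
    (DestinationExists, "destination_exists"),
    (NotADirectory, "not_a_directory"),
    (NotAFile, "not_a_file"),
    (PermissionDenied, "permission_denied"),
    (CrossDevice, "cross_device"),
    (FilesystemOSError, "filesystem_os_error"),
)


def code_for(exc: FilesystemError) -> str:
    """Return the snake-case code for ``exc``, or the generic fallback."""
    for exc_type, code in _CODES:
        if isinstance(exc, exc_type):
            return code
    return "filesystem_error"
```

Keeping `isinstance` (rather than an exact `type(exc)` lookup) means a future subclass of, say, `PermissionDenied` would still map to `"permission_denied"`.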
@@ -0,0 +1,33 @@
"""create_dir use case: create a directory under one of the configured roots."""

from __future__ import annotations

from pathlib import Path

from alfred.infrastructure.filesystem import FilesystemError, create_dir

from ._errors import PATH_NOT_ALLOWED, code_for
from .directory_roots import DirectoryRoots
from .dto import CreateDirResponse


def create_dir_use_case(path: Path, roots: DirectoryRoots) -> CreateDirResponse:
    """Create directory ``path`` (and any missing parents) provided it
    lives under one of the configured roots.

    Idempotent on the infra side: re-running on an existing directory
    returns ``status="ok"``.
    """
    if not roots.contains(path):
        return CreateDirResponse(
            status="error",
            error=PATH_NOT_ALLOWED,
            message=f"Path is outside configured roots: {path}",
        )
    try:
        create_dir(path)
    except FilesystemError as e:
        return CreateDirResponse(status="error", error=code_for(e), message=str(e))
    return CreateDirResponse(status="ok", path=path)
@@ -3,7 +3,7 @@
 import logging
 from alfred.infrastructure.filesystem import FileManager
-from alfred.infrastructure.persistence import get_memory
+from alfred.infrastructure.persistence_TO_CHECK import get_memory
 from .dto import CreateSeedLinksResponse
@@ -0,0 +1,56 @@
"""DirectoryRoots: VO carrying the configured filesystem roots.

Replaces the ad-hoc ``get_memory().ltm.workspace.<x>`` lookups that were
sprinkled across the filesystem use cases. By making roots an explicit
input, use cases become pure (no global state read) and easy to test.

The roots are read once at the tool wrapper boundary (where the agent
config lives) and threaded through the use cases.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DirectoryRoots:
    """Configured roots of Alfred's filesystem.

    All paths must be absolute and existing directories; validation is
    expected at the boundary that builds this VO.

    Attributes:
        downloads: where qBittorrent drops finished torrents.
        torrents: where seeding hard-links live (mirrors downloads/).
        movies: library root for movies.
        tv_shows: library root for TV shows.
    """

    downloads: Path
    torrents: Path
    movies: Path
    tv_shows: Path

    def all(self) -> tuple[Path, ...]:
        """Return every configured root, in declaration order."""
        return (self.downloads, self.torrents, self.movies, self.tv_shows)

    def contains(self, path: Path) -> bool:
        """Return True if ``path`` is inside one of the configured roots.

        Uses ``Path.resolve()`` to handle symlinks and ``..`` segments,
        then ``relative_to`` for an exact within-root check.
        """
        try:
            resolved = path.resolve()
        except OSError:
            return False
        for root in self.all():
            try:
                resolved.relative_to(root.resolve())
                return True
            except (ValueError, OSError):
                continue
        return False
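Review note: the behaviour of `contains` on a few edge cases can be checked with a throwaway script. The sketch below re-declares a trimmed copy of `DirectoryRoots` so it runs on its own. The `demo` helper, the temporary directory layout and the file names are hypothetical and exist only for illustration.

```python
from __future__ import annotations

import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DirectoryRoots:
    """Trimmed copy of the VO above, so the sketch is self-contained."""

    downloads: Path
    torrents: Path
    movies: Path
    tv_shows: Path

    def all(self) -> tuple[Path, ...]:
        return (self.downloads, self.torrents, self.movies, self.tv_shows)

    def contains(self, path: Path) -> bool:
        try:
            resolved = path.resolve()
        except OSError:
            return False
        for root in self.all():
            try:
                resolved.relative_to(root.resolve())
                return True
            except (ValueError, OSError):
                continue
        return False


def demo() -> dict[str, bool]:
    """Build four roots in a temp dir and probe a few paths (hypothetical)."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        names = ("downloads", "torrents", "movies", "tv_shows")
        roots = DirectoryRoots(*(base / name for name in names))
        for root in roots.all():
            root.mkdir()
        return {
            # A not-yet-existing file below a root is still "inside".
            "inside": roots.contains(roots.movies / "Film (2020)" / "film.mkv"),
            # relative_to(root) succeeds for the root itself.
            "root_itself": roots.contains(roots.movies),
            # ".." segments are collapsed by resolve() before the check.
            "dotdot_escape": roots.contains(roots.movies / ".." / ".." / "escape"),
            # relative_to compares whole path components, not string prefixes.
            "prefix_lookalike": roots.contains(base / "movies_extra" / "x.mkv"),
        }
```

Because `relative_to` works on path components, a sibling directory whose name merely starts with a root's name is rejected, which a naive `str.startswith` check would get wrong.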
+62 -164
@@ -1,19 +1,28 @@
"""Filesystem application DTOs.""" """DTOs for the 5 atomic filesystem use cases.
Each use case returns a small frozen dataclass tagged with a ``status``
field. On error, ``error`` (machine-readable code) and ``message``
(human-readable) are populated; on success, the relevant payload
fields are.
Error codes mirror the infrastructure exception types (lowercased,
snake-cased) — e.g. ``SourceNotFound`` → ``"source_not_found"`` — plus
the application-layer ``"path_not_allowed"`` for guard violations.
"""
from __future__ import annotations from __future__ import annotations
from dataclasses import dataclass from dataclasses import dataclass, field
from pathlib import Path
@dataclass @dataclass(frozen=True)
class CopyMediaResponse: class ListDirResponse:
"""Response from copying a media file.""" """Response from ``list_dir_use_case``."""
status: str status: str # "ok" | "error"
source: str | None = None path: Path | None = None
destination: str | None = None entries: tuple[Path, ...] = ()
filename: str | None = None
size: int | None = None
error: str | None = None error: str | None = None
message: str | None = None message: str | None = None
@@ -22,22 +31,33 @@ class CopyMediaResponse:
return {"status": self.status, "error": self.error, "message": self.message} return {"status": self.status, "error": self.error, "message": self.message}
return { return {
"status": self.status, "status": self.status,
"source": self.source, "path": str(self.path) if self.path else None,
"destination": self.destination, "entries": [str(p) for p in self.entries],
"filename": self.filename,
"size": self.size,
} }
@dataclass @dataclass(frozen=True)
class MoveMediaResponse: class CreateDirResponse:
"""Response from moving a media file.""" """Response from ``create_dir_use_case``."""
status: str status: str
source: str | None = None path: Path | None = None
destination: str | None = None error: str | None = None
filename: str | None = None message: str | None = None
size: int | None = None
def to_dict(self) -> dict:
if self.error:
return {"status": self.status, "error": self.error, "message": self.message}
return {"status": self.status, "path": str(self.path) if self.path else None}
@dataclass(frozen=True)
class LinkFileResponse:
"""Response from ``link_file_use_case``."""
status: str
source: Path | None = None
destination: Path | None = None
error: str | None = None error: str | None = None
message: str | None = None message: str | None = None
@@ -46,125 +66,18 @@ class MoveMediaResponse:
return {"status": self.status, "error": self.error, "message": self.message} return {"status": self.status, "error": self.error, "message": self.message}
return { return {
"status": self.status, "status": self.status,
"source": self.source, "source": str(self.source) if self.source else None,
"destination": self.destination, "destination": str(self.destination) if self.destination else None,
"filename": self.filename,
"size": self.size,
} }
@dataclass @dataclass(frozen=True)
class SetFolderPathResponse: class MoveFileResponse:
"""Response from setting a folder path.""" """Response from ``move_file_use_case``."""
status: str status: str
folder_name: str | None = None source: Path | None = None
path: str | None = None destination: Path | None = None
error: str | None = None
message: str | None = None
def to_dict(self):
"""Convert to dict for agent compatibility."""
result = {"status": self.status}
if self.error:
result["error"] = self.error
result["message"] = self.message
else:
if self.folder_name:
result["folder_name"] = self.folder_name
if self.path:
result["path"] = self.path
return result
@dataclass
class PlacedSubtitle:
"""One subtitle file successfully placed."""
source: str
destination: str
filename: str
def to_dict(self) -> dict:
return {
"source": self.source,
"destination": self.destination,
"filename": self.filename,
}
@dataclass
class UnresolvedTrack:
"""A subtitle track that needs agent clarification before placement."""
raw_tokens: list[str]
file_path: str | None = None
file_size_kb: float | None = None
reason: str = "" # "unknown_language" | "low_confidence"
def to_dict(self) -> dict:
return {
"raw_tokens": self.raw_tokens,
"file_path": self.file_path,
"file_size_kb": self.file_size_kb,
"reason": self.reason,
}
@dataclass
class AvailableSubtitle:
"""One subtitle track available on an embedded media item."""
language: str # ISO 639-2 code
subtitle_type: str # "standard" | "sdh" | "forced" | "unknown"
def to_dict(self) -> dict:
return {"language": self.language, "type": self.subtitle_type}
@dataclass
class ManageSubtitlesResponse:
"""Response from the manage_subtitles use case."""
status: str # "ok" | "needs_clarification" | "error"
video_path: str | None = None
placed: list[PlacedSubtitle] | None = None
skipped_count: int = 0
unresolved: list[UnresolvedTrack] | None = None
available: list[AvailableSubtitle] | None = None # embedded tracks summary
error: str | None = None
message: str | None = None
def to_dict(self) -> dict:
if self.error:
return {"status": self.status, "error": self.error, "message": self.message}
result = {
"status": self.status,
"video_path": self.video_path,
"placed": [p.to_dict() for p in (self.placed or [])],
"placed_count": len(self.placed or []),
"skipped_count": self.skipped_count,
}
if self.unresolved:
result["unresolved"] = [u.to_dict() for u in self.unresolved]
result["unresolved_count"] = len(self.unresolved)
if self.available:
result["available"] = [a.to_dict() for a in self.available]
return result
@dataclass
class CreateSeedLinksResponse:
"""Response from creating seed links for a torrent."""
status: str
torrent_subfolder: str | None = None
linked_file: str | None = None
copied_files: list[str] | None = None
copied_count: int = 0
skipped: list[str] | None = None
error: str | None = None error: str | None = None
message: str | None = None message: str | None = None
@@ -173,41 +86,26 @@ class CreateSeedLinksResponse:
return {"status": self.status, "error": self.error, "message": self.message} return {"status": self.status, "error": self.error, "message": self.message}
return { return {
"status": self.status, "status": self.status,
"torrent_subfolder": self.torrent_subfolder, "source": str(self.source) if self.source else None,
"linked_file": self.linked_file, "destination": str(self.destination) if self.destination else None,
"copied_files": self.copied_files or [],
"copied_count": self.copied_count,
"skipped": self.skipped or [],
} }
@dataclass @dataclass(frozen=True)
class ListFolderResponse: class MoveDirResponse:
"""Response from listing a folder.""" """Response from ``move_dir_use_case``."""
status: str status: str
folder_type: str | None = None source: Path | None = None
path: str | None = None destination: Path | None = None
entries: list[str] | None = None
count: int | None = None
error: str | None = None error: str | None = None
message: str | None = None message: str | None = None
def to_dict(self): def to_dict(self) -> dict:
"""Convert to dict for agent compatibility."""
result = {"status": self.status}
if self.error: if self.error:
result["error"] = self.error return {"status": self.status, "error": self.error, "message": self.message}
result["message"] = self.message return {
else: "status": self.status,
if self.folder_type: "source": str(self.source) if self.source else None,
result["folder_type"] = self.folder_type "destination": str(self.destination) if self.destination else None,
if self.path: }
result["path"] = self.path
if self.entries is not None:
result["entries"] = self.entries
if self.count is not None:
result["count"] = self.count
return result
+188
@@ -0,0 +1,188 @@
"""Filesystem application DTOs."""
from __future__ import annotations
from dataclasses import dataclass
@dataclass
class CopyMediaResponse:
"""Response from copying a media file."""
status: str
source: str | None = None
destination: str | None = None
filename: str | None = None
size: int | None = None
error: str | None = None
message: str | None = None
def to_dict(self) -> dict:
if self.error:
return {"status": self.status, "error": self.error, "message": self.message}
return {
"status": self.status,
"source": self.source,
"destination": self.destination,
"filename": self.filename,
"size": self.size,
}
@dataclass
class MoveMediaResponse:
"""Response from moving a media file."""
status: str
source: str | None = None
destination: str | None = None
filename: str | None = None
size: int | None = None
error: str | None = None
message: str | None = None
def to_dict(self) -> dict:
if self.error:
return {"status": self.status, "error": self.error, "message": self.message}
return {
"status": self.status,
"source": self.source,
"destination": self.destination,
"filename": self.filename,
"size": self.size,
}
@dataclass
class PlacedSubtitle:
"""One subtitle file successfully placed."""
source: str
destination: str
filename: str
def to_dict(self) -> dict:
return {
"source": self.source,
"destination": self.destination,
"filename": self.filename,
}
@dataclass
class UnresolvedTrack:
"""A subtitle track that needs agent clarification before placement."""
raw_tokens: list[str]
file_path: str | None = None
file_size_kb: float | None = None
reason: str = "" # "unknown_language" | "low_confidence"
def to_dict(self) -> dict:
return {
"raw_tokens": self.raw_tokens,
"file_path": self.file_path,
"file_size_kb": self.file_size_kb,
"reason": self.reason,
}
@dataclass
class AvailableSubtitle:
"""One subtitle track available on an embedded media item."""
language: str # ISO 639-2 code
subtitle_type: str # "standard" | "sdh" | "forced" | "unknown"
def to_dict(self) -> dict:
return {"language": self.language, "type": self.subtitle_type}
@dataclass
class ManageSubtitlesResponse:
"""Response from the manage_subtitles use case."""
status: str # "ok" | "needs_clarification" | "error"
video_path: str | None = None
placed: list[PlacedSubtitle] | None = None
skipped_count: int = 0
unresolved: list[UnresolvedTrack] | None = None
available: list[AvailableSubtitle] | None = None # embedded tracks summary
error: str | None = None
message: str | None = None
def to_dict(self) -> dict:
if self.error:
return {"status": self.status, "error": self.error, "message": self.message}
result = {
"status": self.status,
"video_path": self.video_path,
"placed": [p.to_dict() for p in (self.placed or [])],
"placed_count": len(self.placed or []),
"skipped_count": self.skipped_count,
}
if self.unresolved:
result["unresolved"] = [u.to_dict() for u in self.unresolved]
result["unresolved_count"] = len(self.unresolved)
if self.available:
result["available"] = [a.to_dict() for a in self.available]
return result
@dataclass
class CreateSeedLinksResponse:
"""Response from creating seed links for a torrent."""
status: str
torrent_subfolder: str | None = None
linked_file: str | None = None
copied_files: list[str] | None = None
copied_count: int = 0
skipped: list[str] | None = None
error: str | None = None
message: str | None = None
def to_dict(self) -> dict:
if self.error:
return {"status": self.status, "error": self.error, "message": self.message}
return {
"status": self.status,
"torrent_subfolder": self.torrent_subfolder,
"linked_file": self.linked_file,
"copied_files": self.copied_files or [],
"copied_count": self.copied_count,
"skipped": self.skipped or [],
}
@dataclass
class ListFolderResponse:
"""Response from listing a folder."""
status: str
folder_type: str | None = None # SHOULD BE A PROPERTY
path: str | None = None # NOT NONE - Should be path
entries: list[str] | None = None # NOT NONE - Empty list of path
count: int | None = None # USELESS
error: str | None = None
message: str | None = None
def to_dict(self):
"""Convert to dict for agent compatibility."""
result = {"status": self.status}
if self.error:
result["error"] = self.error
result["message"] = self.message
else:
if self.folder_type:
result["folder_type"] = self.folder_type
if self.path:
result["path"] = self.path
if self.entries is not None:
result["entries"] = self.entries
if self.count is not None:
result["count"] = self.count
return result
@@ -0,0 +1,40 @@
"""link_file use case: hard-link a file from one root to another."""

from __future__ import annotations

from pathlib import Path

from alfred.infrastructure.filesystem import FilesystemError, link_file

from ._errors import PATH_NOT_ALLOWED, code_for
from .directory_roots import DirectoryRoots
from .dto import LinkFileResponse


def link_file_use_case(
    src: Path, dst: Path, roots: DirectoryRoots
) -> LinkFileResponse:
    """Hard-link ``src`` to ``dst``. Both must be under configured roots.

    The destination parent must already exist; the caller is expected
    to have created it via ``create_dir_use_case`` if needed.
    """
    if not roots.contains(src):
        return LinkFileResponse(
            status="error",
            error=PATH_NOT_ALLOWED,
            message=f"Source is outside configured roots: {src}",
        )
    if not roots.contains(dst):
        return LinkFileResponse(
            status="error",
            error=PATH_NOT_ALLOWED,
            message=f"Destination is outside configured roots: {dst}",
        )
    try:
        link_file(src, dst)
    except FilesystemError as e:
        return LinkFileResponse(status="error", error=code_for(e), message=str(e))
    return LinkFileResponse(status="ok", source=src, destination=dst)
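Review note: the docstring's requirement that the destination parent already exists matches how hard links behave at the OS level. The sketch below uses the standard library's `os.link` directly. Whether the infra `link_file` wraps `os.link` is an assumption, since its body is not part of this diff, and the directory and file names are made up. It also assumes the temp filesystem supports hard links.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path


def hard_link_demo() -> dict[str, object]:
    """Create a hard link in a temp dir and report what the OS says about it."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "downloads" / "film.mkv"
        dst = Path(tmp) / "torrents" / "film.mkv"
        src.parent.mkdir()
        dst.parent.mkdir()  # the parent must exist before linking
        src.write_bytes(b"payload")

        os.link(src, dst)
        same_file = src.samefile(dst)    # both names point at one inode
        link_count = src.stat().st_nlink

        # Linking into a directory that does not exist fails with ENOENT.
        try:
            os.link(src, Path(tmp) / "missing" / "film.mkv")
            parent_required = False
        except FileNotFoundError:
            parent_required = True

        # Removing one name leaves the data reachable through the other.
        src.unlink()
        survives = dst.read_bytes() == b"payload"

        return {
            "same_file": same_file,
            "link_count": link_count,
            "parent_required": parent_required,
            "survives": survives,
        }
```

This is why seeding links under `torrents` can outlive a later move or delete of the `downloads` copy, as long as both roots are on the same filesystem.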
+34
@@ -0,0 +1,34 @@
"""list_dir use case: list a directory after guarding it within roots."""

from __future__ import annotations

from pathlib import Path

from alfred.infrastructure.filesystem import FilesystemError, list_dir

from ._errors import PATH_NOT_ALLOWED, code_for
from .directory_roots import DirectoryRoots
from .dto import ListDirResponse


def list_dir_use_case(path: Path, roots: DirectoryRoots) -> ListDirResponse:
    """List the immediate children of ``path`` if it lives under one of
    the configured roots.

    Returns a :class:`ListDirResponse`. On guard failure, status is
    ``"error"`` with ``error="path_not_allowed"``. On infra failure,
    status is ``"error"`` with a code mapped from the raised exception.
    """
    if not roots.contains(path):
        return ListDirResponse(
            status="error",
            error=PATH_NOT_ALLOWED,
            message=f"Path is outside configured roots: {path}",
        )
    try:
        entries = list_dir(path)
    except FilesystemError as e:
        return ListDirResponse(status="error", error=code_for(e), message=str(e))
    return ListDirResponse(status="ok", path=path, entries=tuple(entries))
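Review note: the sketch below shows how a caller at the tool boundary might consume the returned DTO. `ListDirResponse` is re-declared from the `dto.py` hunk so the snippet runs on its own. The sample paths and the JSON step are illustrative assumptions, not something this commit does.

```python
from __future__ import annotations

import json
from dataclasses import FrozenInstanceError, dataclass
from pathlib import Path


@dataclass(frozen=True)
class ListDirResponse:
    """Local copy of the DTO from dto.py, for a self-contained example."""

    status: str  # "ok" | "error"
    path: Path | None = None
    entries: tuple[Path, ...] = ()
    error: str | None = None
    message: str | None = None

    def to_dict(self) -> dict:
        if self.error:
            return {"status": self.status, "error": self.error, "message": self.message}
        return {
            "status": self.status,
            "path": str(self.path) if self.path else None,
            "entries": [str(p) for p in self.entries],
        }


# Success payload: Path objects become strings, so the dict is JSON-safe.
ok = ListDirResponse(
    status="ok",
    path=Path("/media/movies"),
    entries=(Path("/media/movies/a.mkv"),),
)
ok_json = json.dumps(ok.to_dict())

# Error payload: only status, error and message are emitted.
err = ListDirResponse(status="error", error="path_not_allowed", message="outside roots")
err_keys = sorted(err.to_dict())

# frozen=True: attribute assignment raises instead of mutating the DTO.
try:
    ok.status = "error"  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Using a tuple for `entries` keeps the default immutable, which avoids the usual mutable-default pitfall and fits the frozen dataclass.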
@@ -3,25 +3,25 @@
import logging import logging
from pathlib import Path from pathlib import Path
from alfred.domain.shared.value_objects import ImdbId from alfred.application.subtitles_TO_CHECK.placer import (
from alfred.domain.subtitles.entities import SubtitleCandidate
from alfred.domain.subtitles.services.identifier import SubtitleIdentifier
from alfred.domain.subtitles.services.matcher import SubtitleMatcher
from alfred.domain.subtitles.services.pattern_detector import PatternDetector
from alfred.application.subtitles.placer import (
PlacedTrack, PlacedTrack,
SubtitlePlacer, SubtitlePlacer,
_build_dest_name, _build_dest_name,
) )
from alfred.domain.subtitles.services.utils import available_subtitles from alfred.domain.shared_TO_CHECK.value_objects import ImdbId
from alfred.domain.subtitles.value_objects import ScanStrategy from alfred.domain.subtitles_TO_CHECK.entities import SubtitleScanResult
from alfred.domain.subtitles_TO_CHECK.services.identifier import SubtitleIdentifier
from alfred.domain.subtitles_TO_CHECK.services.matcher import SubtitleMatcher
from alfred.domain.subtitles_TO_CHECK.services.pattern_detector import PatternDetector
from alfred.domain.subtitles_TO_CHECK.services.utils import available_subtitles
from alfred.domain.subtitles_TO_CHECK.value_objects import ScanStrategy
from alfred.infrastructure.filesystem.scanner import PathlibFilesystemScanner from alfred.infrastructure.filesystem.scanner import PathlibFilesystemScanner
from alfred.infrastructure.knowledge.subtitles.base import SubtitleKnowledgeBase from alfred.infrastructure.knowledge_TO_CHECK.subtitles.base import SubtitleKnowledgeBase
from alfred.infrastructure.knowledge.subtitles.loader import KnowledgeLoader from alfred.infrastructure.knowledge_TO_CHECK.subtitles.loader import KnowledgeLoader
from alfred.infrastructure.persistence.context import get_memory from alfred.infrastructure.persistence_TO_CHECK.context import get_memory
from alfred.infrastructure.probe.ffprobe_prober import FfprobeMediaProber from alfred.infrastructure.probe_TO_CHECK.ffprobe_prober import FfprobeMediaProber
from alfred.infrastructure.subtitle.metadata_store import SubtitleMetadataStore from alfred.infrastructure.subtitle_TO_CHECK.metadata_store import SubtitleMetadataStore
from alfred.infrastructure.subtitle.rule_repository import RuleSetRepository from alfred.infrastructure.subtitle_TO_CHECK.rule_repository import RuleSetRepository
from .dto import ( from .dto import (
AvailableSubtitle, AvailableSubtitle,
@@ -278,7 +278,7 @@ class ManageSubtitlesUseCase:
def _to_unresolved_dto( def _to_unresolved_dto(
track: SubtitleCandidate, min_confidence: float = 0.7 track: SubtitleScanResult, min_confidence: float = 0.7
) -> UnresolvedTrack: ) -> UnresolvedTrack:
reason = "unknown_language" if track.language is None else "low_confidence" reason = "unknown_language" if track.language is None else "low_confidence"
return UnresolvedTrack( return UnresolvedTrack(
@@ -291,10 +291,10 @@ def _to_unresolved_dto(
def _pair_placed_with_tracks( def _pair_placed_with_tracks(
placed: list[PlacedTrack], placed: list[PlacedTrack],
tracks: list[SubtitleCandidate], tracks: list[SubtitleScanResult],
) -> list[tuple[PlacedTrack, SubtitleCandidate]]: ) -> list[tuple[PlacedTrack, SubtitleScanResult]]:
""" """
Pair each PlacedTrack with its originating SubtitleCandidate by source path. Pair each PlacedTrack with its originating SubtitleScanResult by source path.
Falls back to positional matching if paths don't align. Falls back to positional matching if paths don't align.
""" """
track_by_path = {t.file_path: t for t in tracks if t.file_path} track_by_path = {t.file_path: t for t in tracks if t.file_path}
+36
@@ -0,0 +1,36 @@
"""move_dir use case: move a directory tree between configured roots."""

from __future__ import annotations

from pathlib import Path

from alfred.infrastructure.filesystem import FilesystemError, move_dir

from ._errors import PATH_NOT_ALLOWED, code_for
from .directory_roots import DirectoryRoots
from .dto import MoveDirResponse


def move_dir_use_case(
    src: Path, dst: Path, roots: DirectoryRoots
) -> MoveDirResponse:
    """Move directory ``src`` to ``dst``. Both must be under configured roots."""
    if not roots.contains(src):
        return MoveDirResponse(
            status="error",
            error=PATH_NOT_ALLOWED,
            message=f"Source is outside configured roots: {src}",
        )
    if not roots.contains(dst):
        return MoveDirResponse(
            status="error",
            error=PATH_NOT_ALLOWED,
            message=f"Destination is outside configured roots: {dst}",
        )
    try:
        move_dir(src, dst)
    except FilesystemError as e:
        return MoveDirResponse(status="error", error=code_for(e), message=str(e))
    return MoveDirResponse(status="ok", source=src, destination=dst)
@@ -0,0 +1,36 @@
"""move_file use case: move a file between configured roots."""

from __future__ import annotations

from pathlib import Path

from alfred.infrastructure.filesystem import FilesystemError, move_file

from ._errors import PATH_NOT_ALLOWED, code_for
from .directory_roots import DirectoryRoots
from .dto import MoveFileResponse


def move_file_use_case(
    src: Path, dst: Path, roots: DirectoryRoots
) -> MoveFileResponse:
    """Move file ``src`` to ``dst``. Both must be under configured roots."""
    if not roots.contains(src):
        return MoveFileResponse(
            status="error",
            error=PATH_NOT_ALLOWED,
            message=f"Source is outside configured roots: {src}",
        )
    if not roots.contains(dst):
        return MoveFileResponse(
            status="error",
            error=PATH_NOT_ALLOWED,
            message=f"Destination is outside configured roots: {dst}",
        )
    try:
        move_file(src, dst)
    except FilesystemError as e:
        return MoveFileResponse(status="error", error=code_for(e), message=str(e))
    return MoveFileResponse(status="ok", source=src, destination=dst)
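Review note: the use cases only ever see typed `FilesystemError` subclasses, so somewhere in the infra layer an `OSError` has to be translated. That code is not in this diff. The sketch below is one plausible shape for it, keyed on `errno`. The `translate_os_error` helper, the errno mapping and the stand-in exception classes are all assumptions made for illustration.

```python
from __future__ import annotations

import errno


# Stand-ins (assumption) for the typed exceptions named in the commit message.
class FilesystemError(Exception): ...
class SourceNotFound(FilesystemError): ...
class DestinationExists(FilesystemError): ...
class NotADirectory(FilesystemError): ...
class PermissionDenied(FilesystemError): ...
class CrossDevice(FilesystemError): ...
class FilesystemOSError(FilesystemError): ...


# Hypothetical mapping from errno values to typed exceptions.
_ERRNO_TO_EXC: dict[int, type[FilesystemError]] = {
    errno.ENOENT: SourceNotFound,
    errno.EEXIST: DestinationExists,
    errno.ENOTDIR: NotADirectory,
    errno.EACCES: PermissionDenied,
    errno.EPERM: PermissionDenied,
    errno.EXDEV: CrossDevice,  # rename or link across filesystems
}


def translate_os_error(exc: OSError) -> FilesystemError:
    """Wrap an OSError in the matching typed exception.

    Anything without a dedicated class falls back to FilesystemOSError.
    """
    exc_type = _ERRNO_TO_EXC.get(exc.errno, FilesystemOSError)
    return exc_type(str(exc))
```

`errno.EXDEV` is what the OS reports when a rename or hard link crosses filesystems, which is presumably the case the `cross_device` code exists to surface.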
@@ -22,38 +22,34 @@ import logging
 from dataclasses import dataclass
 from pathlib import Path
-from alfred.application.release import inspect_release
+from alfred.application.release_TO_CHECK import inspect_release
 from alfred.domain.release import parse_release
-from alfred.domain.release.ports import ReleaseKnowledge
+from alfred.domain.releases_TO_CHECK.ports import ReleaseKnowledge
 from alfred.domain.release.value_objects import ParsedRelease
-from alfred.infrastructure.knowledge.release_kb import YamlReleaseKnowledge
-from alfred.infrastructure.persistence import get_memory
-from alfred.infrastructure.probe import FfprobeMediaProber
+from alfred.domain.shared_TO_CHECK.ports import MediaProber
+from alfred.infrastructure.persistence_TO_CHECK import get_memory
 logger = logging.getLogger(__name__)
-# Single module-level knowledge instance. YAML is loaded once at first import.
-# Tests that need a custom KB can monkeypatch this attribute.
-_KB: ReleaseKnowledge = YamlReleaseKnowledge()
-# Module-level prober; same singleton style as _KB. Tests that need a custom
-# adapter can monkeypatch this attribute.
-_PROBER = FfprobeMediaProber()
-def _resolve_parsed(release_name: str, source_path: str | None) -> ParsedRelease:
+def _resolve_parsed(
+    release_name: str,
+    source_path: str | None,
+    kb: ReleaseKnowledge,
+    prober: MediaProber,
+) -> ParsedRelease:
     """Pick the right entry point depending on whether we have a path.
     When ``source_path`` is provided and points to something that exists,
-    we run the full inspection pipeline so probe data can refresh
-    ``tech_string`` (which feeds every filename builder). Otherwise we
-    fall back to a parse-only path (same behavior as before).
+    we run the full inspection pipeline so probe data can refresh tech
+    fields (which feed every filename builder). Otherwise we fall back
+    to a parse-only path (same behavior as before).
     """
     if source_path:
         path = Path(source_path)
         if path.exists():
-            return inspect_release(release_name, path, _KB, _PROBER).parsed
+            return inspect_release(release_name, path, kb, prober).parsed
-    parsed, _ = parse_release(release_name, _KB)
+    parsed, _ = parse_release(release_name, kb)
     return parsed
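Review note: passing `kb` and `prober` as parameters, instead of reading module-level singletons, lets tests supply fakes without monkeypatching. The sketch below illustrates the pattern with a minimal `Protocol`. Only the `sanitize_for_fs` method name comes from the diff. The `FakeKnowledge` class, its replacement rules and the `build_folder_name` helper are hypothetical.

```python
from __future__ import annotations

from typing import Protocol


class ReleaseKnowledge(Protocol):
    """Only the one method this sketch needs; the real port is not shown here."""

    def sanitize_for_fs(self, text: str) -> str: ...


class FakeKnowledge:
    """Test double (hypothetical): satisfies the protocol structurally."""

    def sanitize_for_fs(self, text: str) -> str:
        # Illustrative rules only: swap path separators, drop colons.
        return text.replace("/", "-").replace(":", "")


def build_folder_name(title: str, year: int, kb: ReleaseKnowledge) -> str:
    """Hypothetical helper: the dependency arrives as an argument."""
    return f"{kb.sanitize_for_fs(title)} ({year})"


folder = build_folder_name("Face/Off: Redux", 1997, FakeKnowledge())
```

The fake needs no inheritance from the port: `Protocol` matching is structural, so any object with a compatible `sanitize_for_fs` is accepted by a type checker.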
@@ -259,6 +255,8 @@ def resolve_season_destination(
release_name: str, release_name: str,
tmdb_title: str, tmdb_title: str,
tmdb_year: int, tmdb_year: int,
kb: ReleaseKnowledge,
prober: MediaProber,
confirmed_folder: str | None = None, confirmed_folder: str | None = None,
source_path: str | None = None, source_path: str | None = None,
) -> ResolvedSeasonDestination: ) -> ResolvedSeasonDestination:
@@ -280,8 +278,8 @@ def resolve_season_destination(
message="TV show library path is not configured.", message="TV show library path is not configured.",
) )
parsed = _resolve_parsed(release_name, source_path) parsed = _resolve_parsed(release_name, source_path, kb, prober)
tmdb_title_safe = _KB.sanitize_for_fs(tmdb_title) tmdb_title_safe = kb.sanitize_for_fs(tmdb_title)
computed_name = parsed.show_folder_name(tmdb_title_safe, tmdb_year) computed_name = parsed.show_folder_name(tmdb_title_safe, tmdb_year)
resolved = _resolve_series_folder( resolved = _resolve_series_folder(
@@ -314,6 +312,8 @@ def resolve_episode_destination(
source_file: str, source_file: str,
tmdb_title: str, tmdb_title: str,
tmdb_year: int, tmdb_year: int,
kb: ReleaseKnowledge,
prober: MediaProber,
tmdb_episode_title: str | None = None, tmdb_episode_title: str | None = None,
confirmed_folder: str | None = None, confirmed_folder: str | None = None,
) -> ResolvedEpisodeDestination: ) -> ResolvedEpisodeDestination:
@@ -332,11 +332,11 @@ def resolve_episode_destination(
message="TV show library path is not configured.", message="TV show library path is not configured.",
) )
parsed = _resolve_parsed(release_name, source_file) parsed = _resolve_parsed(release_name, source_file, kb, prober)
ext = Path(source_file).suffix ext = Path(source_file).suffix
tmdb_title_safe = _KB.sanitize_for_fs(tmdb_title) tmdb_title_safe = kb.sanitize_for_fs(tmdb_title)
tmdb_episode_title_safe = ( tmdb_episode_title_safe = (
_KB.sanitize_for_fs(tmdb_episode_title) if tmdb_episode_title else None kb.sanitize_for_fs(tmdb_episode_title) if tmdb_episode_title else None
) )
computed_name = parsed.show_folder_name(tmdb_title_safe, tmdb_year) computed_name = parsed.show_folder_name(tmdb_title_safe, tmdb_year)
@@ -375,6 +375,8 @@ def resolve_movie_destination(
source_file: str, source_file: str,
tmdb_title: str, tmdb_title: str,
tmdb_year: int, tmdb_year: int,
kb: ReleaseKnowledge,
prober: MediaProber,
) -> ResolvedMovieDestination: ) -> ResolvedMovieDestination:
""" """
Compute destination paths for a movie file. Compute destination paths for a movie file.
@@ -392,9 +394,9 @@ def resolve_movie_destination(
message="Movie library path is not configured.", message="Movie library path is not configured.",
) )
parsed = _resolve_parsed(release_name, source_file) parsed = _resolve_parsed(release_name, source_file, kb, prober)
ext = Path(source_file).suffix ext = Path(source_file).suffix
tmdb_title_safe = _KB.sanitize_for_fs(tmdb_title) tmdb_title_safe = kb.sanitize_for_fs(tmdb_title)
folder_name = parsed.movie_folder_name(tmdb_title_safe, tmdb_year) folder_name = parsed.movie_folder_name(tmdb_title_safe, tmdb_year)
filename = parsed.movie_filename(tmdb_title_safe, tmdb_year, ext) filename = parsed.movie_filename(tmdb_title_safe, tmdb_year, ext)
@@ -416,6 +418,8 @@ def resolve_series_destination(
release_name: str, release_name: str,
tmdb_title: str, tmdb_title: str,
tmdb_year: int, tmdb_year: int,
kb: ReleaseKnowledge,
prober: MediaProber,
confirmed_folder: str | None = None, confirmed_folder: str | None = None,
source_path: str | None = None, source_path: str | None = None,
) -> ResolvedSeriesDestination: ) -> ResolvedSeriesDestination:
@@ -435,8 +439,8 @@ def resolve_series_destination(
message="TV show library path is not configured.", message="TV show library path is not configured.",
) )
parsed = _resolve_parsed(release_name, source_path) parsed = _resolve_parsed(release_name, source_path, kb, prober)
tmdb_title_safe = _KB.sanitize_for_fs(tmdb_title) tmdb_title_safe = kb.sanitize_for_fs(tmdb_title)
computed_name = parsed.show_folder_name(tmdb_title_safe, tmdb_year) computed_name = parsed.show_folder_name(tmdb_title_safe, tmdb_year)
resolved = _resolve_series_folder( resolved = _resolve_series_folder(
@@ -1,50 +0,0 @@
"""Set folder path use case."""
import logging
from alfred.infrastructure.filesystem import FileManager
from .dto import SetFolderPathResponse
logger = logging.getLogger(__name__)
class SetFolderPathUseCase:
"""
Use case for setting a folder path in configuration.
This orchestrates the FileManager to set folder paths.
"""
def __init__(self, file_manager: FileManager):
"""
Initialize use case.
Args:
file_manager: FileManager instance
"""
self.file_manager = file_manager
def execute(self, folder_name: str, path_value: str) -> SetFolderPathResponse:
"""
Set a folder path in configuration.
Args:
folder_name: Name of folder to set (download, tvshow, movie, torrent)
path_value: Absolute path to the folder
Returns:
SetFolderPathResponse with success or error information
"""
result = self.file_manager.set_folder_path(folder_name, path_value)
if result.get("status") == "ok":
return SetFolderPathResponse(
status="ok",
folder_name=result.get("folder_name"),
path=result.get("path"),
)
else:
return SetFolderPathResponse(
status="error", error=result.get("error"), message=result.get("message")
)
@@ -1,44 +0,0 @@
"""Movie application DTOs."""
from dataclasses import dataclass
@dataclass
class SearchMovieResponse:
"""Response from searching for a movie."""
status: str
imdb_id: str | None = None
title: str | None = None
media_type: str | None = None
tmdb_id: int | None = None
overview: str | None = None
release_date: str | None = None
vote_average: float | None = None
error: str | None = None
message: str | None = None
def to_dict(self):
"""Convert to dict for agent compatibility."""
result = {"status": self.status}
if self.error:
result["error"] = self.error
result["message"] = self.message
else:
if self.imdb_id:
result["imdb_id"] = self.imdb_id
if self.title:
result["title"] = self.title
if self.media_type:
result["media_type"] = self.media_type
if self.tmdb_id:
result["tmdb_id"] = self.tmdb_id
if self.overview:
result["overview"] = self.overview
if self.release_date:
result["release_date"] = self.release_date
if self.vote_average:
result["vote_average"] = self.vote_average
return result
@@ -1,93 +0,0 @@
"""Search movie use case."""
import logging
from alfred.infrastructure.api.tmdb import (
TMDBAPIError,
TMDBClient,
TMDBConfigurationError,
TMDBNotFoundError,
)
from .dto import SearchMovieResponse
logger = logging.getLogger(__name__)
class SearchMovieUseCase:
"""
Use case for searching a movie and retrieving its IMDb ID.
This orchestrates the TMDB API client to find movie information.
"""
def __init__(self, tmdb_client: TMDBClient):
"""
Initialize use case.
Args:
tmdb_client: TMDB API client
"""
self.tmdb_client = tmdb_client
def execute(self, media_title: str) -> SearchMovieResponse:
"""
Search for a movie by title.
Args:
media_title: Title of the movie to search for
Returns:
SearchMovieResponse with movie information or error
"""
try:
# Use the TMDB client to search for media
result = self.tmdb_client.search_media(media_title)
# Check if IMDb ID was found
if result.imdb_id:
logger.info(f"IMDb ID found for '{media_title}': {result.imdb_id}")
return SearchMovieResponse(
status="ok",
imdb_id=result.imdb_id,
title=result.title,
media_type=result.media_type,
tmdb_id=result.tmdb_id,
overview=result.overview,
release_date=result.release_date,
vote_average=result.vote_average,
)
else:
logger.warning(f"No IMDb ID available for '{media_title}'")
return SearchMovieResponse(
status="ok",
title=result.title,
media_type=result.media_type,
tmdb_id=result.tmdb_id,
error="no_imdb_id",
message=f"No IMDb ID available for '{result.title}'",
)
except TMDBNotFoundError as e:
logger.info(f"Media not found: {e}")
return SearchMovieResponse(
status="error", error="not_found", message=str(e)
)
except TMDBConfigurationError as e:
logger.error(f"TMDB configuration error: {e}")
return SearchMovieResponse(
status="error", error="configuration_error", message=str(e)
)
except TMDBAPIError as e:
logger.error(f"TMDB API error: {e}")
return SearchMovieResponse(
status="error", error="api_error", message=str(e)
)
except ValueError as e:
logger.error(f"Validation error: {e}")
return SearchMovieResponse(
status="error", error="validation_failed", message=str(e)
)
@@ -1,9 +1,10 @@
"""Movie use cases.""" """Movie use cases."""
from .dto import SearchMovieResponse from .dto import MovieHit, SearchMovieResponse
from .search_movie import SearchMovieUseCase from .search_movie import SearchMovieUseCase
__all__ = [ __all__ = [
"SearchMovieUseCase", "MovieHit",
"SearchMovieResponse", "SearchMovieResponse",
"SearchMovieUseCase",
] ]
@@ -0,0 +1,40 @@
"""Movie application DTOs."""
from dataclasses import dataclass, field
@dataclass(frozen=True)
class MovieHit:
"""One movie hit, flattened for transport to the agent."""
tmdb_id: int
title: str
release_year: int | None = None
def to_dict(self) -> dict:
out: dict = {"tmdb_id": self.tmdb_id, "title": self.title}
if self.release_year is not None:
out["release_year"] = self.release_year
return out
@dataclass
class SearchMovieResponse:
"""Response from searching for a movie."""
status: str
hits: list[MovieHit] = field(default_factory=list)
error: str | None = None
message: str | None = None
def to_dict(self):
"""Convert to dict for agent compatibility."""
result: dict = {"status": self.status}
if self.error:
result["error"] = self.error
result["message"] = self.message
else:
result["hits"] = [h.to_dict() for h in self.hits]
return result
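To illustrate the new response shape, here is a minimal, self-contained sketch that re-declares the two DTOs from this hunk and serializes one success and one error response. The sample ids and titles are made up for illustration; they are not taken from TMDB.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class MovieHit:
    tmdb_id: int
    title: str
    release_year: int | None = None

    def to_dict(self) -> dict:
        out: dict = {"tmdb_id": self.tmdb_id, "title": self.title}
        if self.release_year is not None:
            out["release_year"] = self.release_year
        return out


@dataclass
class SearchMovieResponse:
    status: str
    hits: list[MovieHit] = field(default_factory=list)
    error: str | None = None
    message: str | None = None

    def to_dict(self) -> dict:
        result: dict = {"status": self.status}
        if self.error:
            # Error responses carry no "hits" key at all.
            result["error"] = self.error
            result["message"] = self.message
        else:
            result["hits"] = [h.to_dict() for h in self.hits]
        return result


# Illustrative values only (hypothetical ids).
ok = SearchMovieResponse(
    status="ok",
    hits=[MovieHit(101, "Inception", 2010), MovieHit(102, "Untitled")],
)
err = SearchMovieResponse(status="error", error="api_error", message="timeout")

ok_payload = ok.to_dict()
err_payload = err.to_dict()
```

Compared with the removed DTO, a successful response now always carries a `hits` list (possibly empty), and `release_year` is omitted from a hit when it is `None`.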
@@ -0,0 +1,60 @@
"""Search movie use case."""
import logging
from alfred.infrastructure.api_TO_CHECK.tmdb import (
TMDBAPIError,
TMDBClient,
TMDBConfigurationError,
)
from .dto import MovieHit, SearchMovieResponse
logger = logging.getLogger(__name__)
class SearchMovieUseCase:
"""List movies matching a free-text query via TMDB ``/search/movie``.
The use case is a thin orchestrator: it asks the client for hits,
flattens domain VOs into agent-friendly primitives, and wraps
errors. It deliberately does **not** look up ``imdb_id`` —
enrichment is the caller's job (via :meth:`TMDBClient.get_movie_info`
on a chosen ``tmdb_id``).
"""
def __init__(self, tmdb_client: TMDBClient):
self.tmdb_client = tmdb_client
def execute(self, media_title: str) -> SearchMovieResponse:
try:
results = self.tmdb_client.search_movies(media_title)
hits = [
MovieHit(
tmdb_id=r.tmdb_id.value,
title=str(r.title),
release_year=r.release_year.value if r.release_year else None,
)
for r in results
]
logger.info(f"search_movies({media_title!r}) → {len(hits)} hits")
return SearchMovieResponse(status="ok", hits=hits)
except TMDBConfigurationError as e:
logger.error(f"TMDB configuration error: {e}")
return SearchMovieResponse(
status="error", error="configuration_error", message=str(e)
)
except TMDBAPIError as e:
logger.error(f"TMDB API error: {e}")
return SearchMovieResponse(
status="error", error="api_error", message=str(e)
)
except ValueError as e:
logger.error(f"Validation error: {e}")
return SearchMovieResponse(
status="error", error="validation_failed", message=str(e)
)
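The error-wrapping pattern of the rewritten use case can be exercised without network access. The sketch below is an assumption-laden stand-in: `FakeClient` and `FakeAPIError` are hypothetical names replacing `TMDBClient` and `TMDBAPIError` (whose real code is not part of this diff), and plain dicts replace the DTOs.

```python
from __future__ import annotations


class FakeAPIError(Exception):
    """Hypothetical stand-in for TMDBAPIError."""


class FakeClient:
    """Hypothetical stub; the real TMDBClient is not shown in this diff."""

    def __init__(self, results: list[dict] | None = None,
                 error: Exception | None = None):
        self._results = results or []
        self._error = error

    def search_movies(self, query: str) -> list[dict]:
        if self._error is not None:
            raise self._error
        return self._results


def search(client: FakeClient, query: str) -> dict:
    """Thin orchestrator: fetch hits, flatten them, wrap known errors."""
    try:
        hits = [
            {"tmdb_id": r["tmdb_id"], "title": r["title"]}
            for r in client.search_movies(query)
        ]
        return {"status": "ok", "hits": hits}
    except FakeAPIError as e:
        return {"status": "error", "error": "api_error", "message": str(e)}
    except ValueError as e:
        return {"status": "error", "error": "validation_failed", "message": str(e)}


found = search(FakeClient(results=[{"tmdb_id": 101, "title": "Example"}]), "example")
empty = search(FakeClient(), "nothing matches")
failed = search(FakeClient(error=FakeAPIError("HTTP 500")), "example")
```

Note that the new use case no longer catches `TMDBNotFoundError`. Assuming `search_movies` returns an empty list when nothing matches, an empty result then surfaces as `status="ok"` with no hits rather than as a `not_found` error, which is what `empty` shows here.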
@@ -1,89 +0,0 @@
"""enrich_from_probe — fill missing ParsedRelease fields from MediaInfo."""
from __future__ import annotations
from alfred.domain.release.value_objects import ParsedRelease
from alfred.domain.shared.media import MediaInfo
# Map ffprobe codec names to scene-style codec tokens
_VIDEO_CODEC_MAP = {
"hevc": "x265",
"h264": "x264",
"h265": "x265",
"av1": "AV1",
"vp9": "VP9",
"mpeg4": "XviD",
}
# Map ffprobe audio codec names to scene-style tokens
_AUDIO_CODEC_MAP = {
"eac3": "EAC3",
"ac3": "AC3",
"dts": "DTS",
"truehd": "TrueHD",
"aac": "AAC",
"flac": "FLAC",
"opus": "OPUS",
"mp3": "MP3",
"pcm_s16l": "PCM",
"pcm_s24l": "PCM",
}
# Map channel count to standard layout string
_CHANNEL_MAP = {
8: "7.1",
6: "5.1",
2: "2.0",
1: "1.0",
}
def enrich_from_probe(parsed: ParsedRelease, info: MediaInfo) -> None:
"""
Fill None fields in parsed using data from ffprobe MediaInfo.
Only overwrites fields that are currently None — token-level values
from the release name always take priority.
Mutates parsed in place.
"""
if parsed.quality is None and info.resolution:
parsed.quality = info.resolution
if parsed.codec is None and info.video_codec:
parsed.codec = _VIDEO_CODEC_MAP.get(
info.video_codec.lower(), info.video_codec.upper()
)
if parsed.bit_depth is None and info.video_codec:
# ffprobe exposes bit depth via pix_fmt — not in MediaInfo yet, skip for now
pass
# Audio — use the default track, fallback to first
default_track = next((t for t in info.audio_tracks if t.is_default), None)
track = default_track or (info.audio_tracks[0] if info.audio_tracks else None)
if track:
if parsed.audio_codec is None and track.codec:
parsed.audio_codec = _AUDIO_CODEC_MAP.get(
track.codec.lower(), track.codec.upper()
)
if parsed.audio_channels is None and track.channels:
parsed.audio_channels = _CHANNEL_MAP.get(
track.channels, f"{track.channels}ch"
)
# Languages — merge ffprobe languages with token-level ones
# "und" = undetermined, not useful
if info.audio_languages:
existing = set(parsed.languages)
for lang in info.audio_languages:
if lang.lower() != "und" and lang.upper() not in existing:
parsed.languages.append(lang)
# Re-derive tech_string so filename builders see the enriched
# quality/source/codec. Built the same way as in the parser pipeline:
# the non-None parts joined by dots, in order.
parsed.tech_string = ".".join(
p for p in (parsed.quality, parsed.source, parsed.codec) if p
)
@@ -19,7 +19,7 @@ from __future__ import annotations
from pathlib import Path from pathlib import Path
from alfred.domain.release.ports import ReleaseKnowledge from alfred.domain.releases_TO_CHECK.ports import ReleaseKnowledge
from alfred.domain.release.value_objects import ParsedRelease from alfred.domain.release.value_objects import ParsedRelease
@@ -0,0 +1,74 @@
"""enrich_from_probe — fill missing ParsedRelease fields from MediaInfo."""
from __future__ import annotations
from dataclasses import replace
from alfred.domain.releases_TO_CHECK.ports import ReleaseKnowledge
from alfred.domain.release.value_objects import ParsedRelease
from alfred.domain.shared_TO_CHECK.media import MediaInfo
def enrich_from_probe(
parsed: ParsedRelease, info: MediaInfo, kb: ReleaseKnowledge
) -> ParsedRelease:
"""
Return a new ParsedRelease with None fields filled from ffprobe MediaInfo.
Only overwrites fields that are currently None — token-level values
from the release name always take priority. ``ParsedRelease`` is
frozen; this returns a new instance via :func:`dataclasses.replace`.
Translation tables (ffprobe codec name → scene token, channel count
→ layout) live in ``kb.probe_mappings`` (loaded from
``alfred/knowledge/release/probe_mappings.yaml``). When ffprobe
reports a value with no mapping entry, the fallback is the uppercase
raw value so unknown codecs still surface in a predictable form.
"""
mappings = kb.probe_mappings
video_codec_map: dict[str, str] = mappings.get("video_codec", {})
audio_codec_map: dict[str, str] = mappings.get("audio_codec", {})
channel_map: dict[int, str] = mappings.get("audio_channels", {})
updates: dict[str, object] = {}
if parsed.quality is None and info.resolution:
updates["quality"] = info.resolution
if parsed.codec is None and info.video_codec:
updates["codec"] = video_codec_map.get(
info.video_codec.lower(), info.video_codec.upper()
)
# bit_depth: ffprobe exposes it via pix_fmt — not in MediaInfo yet, skip.
# Audio — use the default track, fallback to first
default_track = next((t for t in info.audio_tracks if t.is_default), None)
track = default_track or (info.audio_tracks[0] if info.audio_tracks else None)
if track:
if parsed.audio_codec is None and track.codec:
updates["audio_codec"] = audio_codec_map.get(
track.codec.lower(), track.codec.upper()
)
if parsed.audio_channels is None and track.channels:
updates["audio_channels"] = channel_map.get(
track.channels, f"{track.channels}ch"
)
# Languages — merge ffprobe languages with token-level ones
# "und" = undetermined, not useful
if info.audio_languages:
existing_upper = {lang.upper() for lang in parsed.languages}
new_languages = list(parsed.languages)
for lang in info.audio_languages:
if lang.lower() != "und" and lang.upper() not in existing_upper:
new_languages.append(lang)
existing_upper.add(lang.upper())
if len(new_languages) != len(parsed.languages):
updates["languages"] = tuple(new_languages)
if not updates:
return parsed
return replace(parsed, **updates)
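The core of the new `enrich_from_probe` is "collect updates, then `dataclasses.replace`". A reduced sketch of that pattern follows; `Parsed` is a hypothetical three-field stand-in for `ParsedRelease`, and the probe values are passed as plain arguments instead of a `MediaInfo` object.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Parsed:
    """Hypothetical stand-in for ParsedRelease (only three fields)."""

    quality: str | None = None
    codec: str | None = None
    languages: tuple[str, ...] = ()


def enrich(parsed: Parsed, *, resolution: str | None, video_codec: str | None,
           audio_languages: list[str], video_codec_map: dict[str, str]) -> Parsed:
    updates: dict[str, object] = {}
    # Only fill fields that are still None: parsed tokens take priority.
    if parsed.quality is None and resolution:
        updates["quality"] = resolution
    if parsed.codec is None and video_codec:
        # Unmapped codecs fall back to the uppercase raw value.
        updates["codec"] = video_codec_map.get(video_codec.lower(), video_codec.upper())
    # Case-insensitive merge of probe languages, skipping "und".
    seen = {lang.upper() for lang in parsed.languages}
    merged = list(parsed.languages)
    for lang in audio_languages:
        if lang.lower() != "und" and lang.upper() not in seen:
            merged.append(lang)
            seen.add(lang.upper())
    if len(merged) != len(parsed.languages):
        updates["languages"] = tuple(merged)
    # Return the same instance when nothing changed.
    return replace(parsed, **updates) if updates else parsed


codec_map = {"hevc": "x265"}
before = Parsed(quality="1080p", languages=("FRENCH",))
after = enrich(before, resolution="2160p", video_codec="hevc",
               audio_languages=["und", "french", "eng"], video_codec_map=codec_map)
again = enrich(after, resolution="2160p", video_codec="hevc",
               audio_languages=["eng"], video_codec_map=codec_map)
unmapped = enrich(Parsed(), resolution=None, video_codec="prores",
                  audio_languages=[], video_codec_map=codec_map)
```

Because the dataclass is frozen, callers must rebind the return value (`parsed = enrich_from_probe(...)`); calling the function for its side effect, as the removed in-place version allowed, silently does nothing.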
@@ -45,17 +45,35 @@ Design notes:
from __future__ import annotations from __future__ import annotations
from dataclasses import dataclass from dataclasses import dataclass, replace
from pathlib import Path from pathlib import Path
from alfred.application.release.detect_media_type import detect_media_type from alfred.application.release_TO_CHECK.detect_media_type import detect_media_type
from alfred.application.release.enrich_from_probe import enrich_from_probe from alfred.application.release_TO_CHECK.enrich_from_probe import enrich_from_probe
from alfred.application.release.supported_media import find_main_video from alfred.application.release_TO_CHECK.supported_media import find_main_video
from alfred.domain.release.ports import ReleaseKnowledge from alfred.domain.releases_TO_CHECK.ports import ReleaseKnowledge
from alfred.domain.release.services import parse_release from alfred.domain.releases_TO_CHECK.parser.services import parse_release
from alfred.domain.release.value_objects import ParsedRelease, ParseReport from alfred.domain.release.value_objects import (
from alfred.domain.shared.media import MediaInfo MediaTypeToken,
from alfred.domain.shared.ports import MediaProber ParsedRelease,
ParseReport,
)
from alfred.domain.shared_TO_CHECK.media import MediaInfo
from alfred.domain.shared_TO_CHECK.ports import MediaProber
# Media types for which a probe carries no useful information.
_NON_PROBABLE_MEDIA_TYPES = frozenset({"unknown", "other"})
# Media types for which there's nothing for the organizer to do.
# ``other`` covers things like games / ISOs / archives sitting on the
# downloads folder. ``unknown`` does NOT belong here — those need a
# user decision, not a skip.
_SKIPPABLE_MEDIA_TYPES = frozenset({"other"})
# Roads that signal the parser couldn't reach a confident answer on its
# own. ``Road`` values are kept as strings on the report to avoid a
# cross-package import here.
_ASK_USER_ROADS = frozenset({"path_of_pain"})
@dataclass(frozen=True) @dataclass(frozen=True)
@@ -81,6 +99,10 @@ class InspectedResult:
- ``probe_used`` ``True`` iff ``media_info`` is non-``None`` and - ``probe_used`` ``True`` iff ``media_info`` is non-``None`` and
``enrich_from_probe`` actually ran. Explicit flag so callers ``enrich_from_probe`` actually ran. Explicit flag so callers
don't have to re-derive the condition. don't have to re-derive the condition.
- ``recommended_action`` derived hint for the orchestrator (see
property docstring). Encodes the exclusion / clarification /
go-ahead decision in one place so downstream callers don't
re-implement the same checks.
""" """
parsed: ParsedRelease parsed: ParsedRelease
@@ -90,9 +112,36 @@ class InspectedResult:
media_info: MediaInfo | None media_info: MediaInfo | None
probe_used: bool probe_used: bool
@property
def recommended_action(self) -> str:
"""Return one of ``"skip"`` / ``"ask_user"`` / ``"process"``.
# Media types for which a probe carries no useful information. - ``"skip"`` nothing to organize:
_NON_PROBABLE_MEDIA_TYPES = frozenset({"unknown", "other"}) * the source has no main video file, **or**
* ``media_type`` is ``"other"`` (games / ISOs / archives).
- ``"ask_user"`` a decision is required before any action:
* ``media_type`` is ``"unknown"`` (parser couldn't classify), **or**
* the parse landed on ``Road.PATH_OF_PAIN``
(low-confidence, malformed name, etc.).
- ``"process"`` everything else: a confident parse with a
usable media type and a main video on disk. The orchestrator
can move straight to the planning step.
The check ordering matters: ``"skip"`` wins over ``"ask_user"``
because if there's no video to organize, no question to the
user can change that. ``"ask_user"`` then wins over
``"process"`` because a confident parse alone isn't enough if
the type or road still flag uncertainty.
"""
if self.main_video is None:
return "skip"
if self.parsed.media_type.value in _SKIPPABLE_MEDIA_TYPES:
return "skip"
if self.parsed.media_type.value == "unknown":
return "ask_user"
if self.report.road in _ASK_USER_ROADS:
return "ask_user"
return "process"
def inspect_release( def inspect_release(
@@ -115,8 +164,11 @@ def inspect_release(
# Step 2: refine media_type from the on-disk extension mix. # Step 2: refine media_type from the on-disk extension mix.
# detect_media_type tolerates non-existent paths (returns parsed.media_type # detect_media_type tolerates non-existent paths (returns parsed.media_type
# untouched), so no need to guard here. # untouched), so no need to guard here. ParsedRelease is frozen — use
parsed.media_type = detect_media_type(parsed, source_path, kb) # dataclasses.replace to rebind with the refined value.
refined_media_type = MediaTypeToken(detect_media_type(parsed, source_path, kb))
if refined_media_type != parsed.media_type:
parsed = replace(parsed, media_type=refined_media_type)
# Step 3: pick the canonical main video (top-level scan only). # Step 3: pick the canonical main video (top-level scan only).
main_video = find_main_video(source_path, kb) main_video = find_main_video(source_path, kb)
@@ -127,7 +179,7 @@ def inspect_release(
if main_video is not None and parsed.media_type not in _NON_PROBABLE_MEDIA_TYPES: if main_video is not None and parsed.media_type not in _NON_PROBABLE_MEDIA_TYPES:
media_info = prober.probe(main_video) media_info = prober.probe(main_video)
if media_info is not None: if media_info is not None:
enrich_from_probe(parsed, media_info) parsed = enrich_from_probe(parsed, media_info, kb)
probe_used = True probe_used = True
return InspectedResult( return InspectedResult(
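The check ordering described in the `recommended_action` docstring can be sketched as a free function. This is a simplified illustration, not the property itself: it takes plain strings where the real code reads `parsed.media_type.value` and `report.road`, and the `"movie"` media type and `"some_other_road"` road value used below are assumptions made for the example.

```python
from __future__ import annotations

from pathlib import Path

_SKIPPABLE_MEDIA_TYPES = frozenset({"other"})
_ASK_USER_ROADS = frozenset({"path_of_pain"})


def recommended_action(main_video: Path | None, media_type: str, road: str) -> str:
    # "skip" wins first: with nothing to organize, no user answer helps.
    if main_video is None:
        return "skip"
    if media_type in _SKIPPABLE_MEDIA_TYPES:
        return "skip"
    # "ask_user" wins over "process": type or road still flags uncertainty.
    if media_type == "unknown":
        return "ask_user"
    if road in _ASK_USER_ROADS:
        return "ask_user"
    return "process"


video = Path("/downloads/example.mkv")  # hypothetical path, never opened
decisions = {
    "no_video_unknown": recommended_action(None, "unknown", "path_of_pain"),
    "other_type": recommended_action(video, "other", "some_other_road"),
    "unknown_type": recommended_action(video, "unknown", "some_other_road"),
    "painful_road": recommended_action(video, "movie", "path_of_pain"),
    "confident": recommended_action(video, "movie", "some_other_road"),
}
```

The first entry shows the precedence rule: an unknown media type on the path of pain would normally be an `"ask_user"` case, but the missing main video turns it into `"skip"`.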
@@ -32,7 +32,7 @@ from __future__ import annotations
from pathlib import Path from pathlib import Path
from alfred.domain.release.ports.knowledge import ReleaseKnowledge from alfred.domain.releases_TO_CHECK.ports.knowledge import ReleaseKnowledge
def is_supported_video(path: Path, kb: ReleaseKnowledge) -> bool: def is_supported_video(path: Path, kb: ReleaseKnowledge) -> bool:
@@ -5,13 +5,13 @@ import os
from dataclasses import dataclass from dataclasses import dataclass
from pathlib import Path from pathlib import Path
from alfred.domain.subtitles.entities import SubtitleCandidate from alfred.domain.subtitles_TO_CHECK.entities import SubtitleScanResult
from alfred.domain.subtitles.value_objects import SubtitleType from alfred.domain.subtitles_TO_CHECK.value_objects import SubtitleType
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
def _build_dest_name(track: SubtitleCandidate, video_stem: str) -> str: def _build_dest_name(track: SubtitleScanResult, video_stem: str) -> str:
""" """
Build the destination filename for a subtitle track. Build the destination filename for a subtitle track.
@@ -41,7 +41,7 @@ class PlacedTrack:
@dataclass @dataclass
class PlaceResult: class PlaceResult:
placed: list[PlacedTrack] placed: list[PlacedTrack]
skipped: list[tuple[SubtitleCandidate, str]] # (track, reason) skipped: list[tuple[SubtitleScanResult, str]] # (track, reason)
@property @property
def placed_count(self) -> int: def placed_count(self) -> int:
@@ -54,7 +54,7 @@ class PlaceResult:
class SubtitlePlacer: class SubtitlePlacer:
""" """
Hard-links matched SubtitleCandidate files next to a destination video. Hard-links matched SubtitleScanResult files next to a destination video.
Uses the same hard-link strategy as FileManager.copy_file: Uses the same hard-link strategy as FileManager.copy_file:
instant, no data duplication, qBittorrent keeps seeding. instant, no data duplication, qBittorrent keeps seeding.
@@ -64,11 +64,11 @@ class SubtitlePlacer:
def place( def place(
self, self,
tracks: list[SubtitleCandidate], tracks: list[SubtitleScanResult],
destination_video: Path, destination_video: Path,
) -> PlaceResult: ) -> PlaceResult:
placed: list[PlacedTrack] = [] placed: list[PlacedTrack] = []
skipped: list[tuple[SubtitleCandidate, str]] = [] skipped: list[tuple[SubtitleScanResult, str]] = []
dest_dir = destination_video.parent dest_dir = destination_video.parent
@@ -2,7 +2,7 @@
import logging import logging
from alfred.infrastructure.api.qbittorrent import ( from alfred.infrastructure.api_TO_CHECK.qbittorrent import (
QBittorrentAPIError, QBittorrentAPIError,
QBittorrentAuthError, QBittorrentAuthError,
QBittorrentClient, QBittorrentClient,
@@ -2,7 +2,7 @@
import logging import logging
from alfred.infrastructure.api.knaben import ( from alfred.infrastructure.api_TO_CHECK.knaben import (
KnabenAPIError, KnabenAPIError,
KnabenClient, KnabenClient,
KnabenNotFoundError, KnabenNotFoundError,
@@ -0,0 +1,21 @@
"""TV-show orchestrators — operate on the Alfred-managed TV library tree.
The TV library is a directory of show folders (one per TV show), each
holding season folders containing video files. Modules here walk this
tree and reconstruct on-disk :class:`SeriesRelease` aggregates by
reusing the existing release pipeline (``inspect_release``) rather
than duplicating its parse/probe logic.
"""
from .dto import SearchShowResponse, ShowHit
from .search_show import SearchShowUseCase
from .walker import SeasonFolder, ShowTree, walk_show
__all__ = [
"SearchShowResponse",
"SearchShowUseCase",
"SeasonFolder",
"ShowHit",
"ShowTree",
"walk_show",
]
@@ -0,0 +1,39 @@
"""TV show application DTOs."""
from dataclasses import dataclass, field
@dataclass(frozen=True)
class ShowHit:
"""One TV-show hit, flattened for transport to the agent."""
tmdb_id: int
name: str
first_air_year: int | None = None
def to_dict(self) -> dict:
out: dict = {"tmdb_id": self.tmdb_id, "name": self.name}
if self.first_air_year is not None:
out["first_air_year"] = self.first_air_year
return out
@dataclass
class SearchShowResponse:
"""Response from searching for a TV show."""
status: str
hits: list[ShowHit] = field(default_factory=list)
error: str | None = None
message: str | None = None
def to_dict(self):
result: dict = {"status": self.status}
if self.error:
result["error"] = self.error
result["message"] = self.message
else:
result["hits"] = [h.to_dict() for h in self.hits]
return result
@@ -0,0 +1,59 @@
"""Search TV show use case."""
import logging
from alfred.infrastructure.api_TO_CHECK.tmdb import (
TMDBAPIError,
TMDBClient,
TMDBConfigurationError,
)
from .dto import SearchShowResponse, ShowHit
logger = logging.getLogger(__name__)
class SearchShowUseCase:
"""List TV shows matching a free-text query via TMDB ``/search/tv``.
Symmetric to :class:`alfred.application.movies.SearchMovieUseCase`:
thin orchestrator, flattens domain VOs into agent-friendly
primitives, no ``imdb_id`` enrichment (caller follows up with
:meth:`TMDBClient.get_tv_show_info` on a chosen ``tmdb_id``).
"""
def __init__(self, tmdb_client: TMDBClient):
self.tmdb_client = tmdb_client
def execute(self, show_title: str) -> SearchShowResponse:
try:
results = self.tmdb_client.search_shows(show_title)
hits = [
ShowHit(
tmdb_id=r.tmdb_id.value,
name=r.name,
first_air_year=r.first_air_year,
)
for r in results
]
logger.info(f"search_shows({show_title!r}) → {len(hits)} hits")
return SearchShowResponse(status="ok", hits=hits)
except TMDBConfigurationError as e:
logger.error(f"TMDB configuration error: {e}")
return SearchShowResponse(
status="error", error="configuration_error", message=str(e)
)
except TMDBAPIError as e:
logger.error(f"TMDB API error: {e}")
return SearchShowResponse(
status="error", error="api_error", message=str(e)
)
except ValueError as e:
logger.error(f"Validation error: {e}")
return SearchShowResponse(
status="error", error="validation_failed", message=str(e)
)
@@ -0,0 +1,208 @@
"""Show tree walker — minimal filesystem traversal of a TV show folder.
The walker is intentionally dumb: it lists season folders, classifies
each one as PACK or EPISODIC by **inspecting its filesystem
structure**, and hands the orchestrator a flat list of video files
per season. It does not parse release names, run ffprobe, or
classify subtitle files. All of that intelligence lives in the
existing release pipeline (``inspect_release`` + downstream
services); the walker just hands the orchestrator the paths to feed
into that pipeline.
Folder convention
-----------------
Inside an Alfred-managed library, a show root looks like::
Foundation/
Foundation.S01.1080p.WEB-DL.x265-GROUP/ ← PACK season
Foundation.S01E01.1080p.WEB-DL.x265.mkv ← flat video
Foundation.S01E02.1080p.WEB-DL.x265.mkv
...
Foundation.S02/ ← EPISODIC season
Foundation.S02E01.1080p.WEB-DL.x265-GROUP/ ← episode subfolder
Foundation.S02E01.1080p.WEB-DL.x265-GROUP.mkv
Foundation.S02E02.1080p.WEB-DL.x265-OTHER/
Foundation.S02E02.1080p.WEB-DL.x265-OTHER.mkv
The walker recognizes a season folder by a ``Sxx`` token anywhere in
its name (case-insensitive). It does **not** care about Plex-style
names (``Season 01``, ``Specials``) — the Alfred library uses
release-style folder names only.
PACK vs EPISODIC is a **structural distinction**, not a naming one:
* **PACK** — season folder contains N flat video files. No
subfolders.
* **EPISODIC** — season folder contains N subfolders, each holding
exactly one video.
A season folder that mixes the two layouts (some flat videos AND
some subfolders) is malformed: the walker reports
``mode=None`` and an empty ``video_files`` tuple so the
orchestrator can warn and skip it.
"""
from __future__ import annotations
import logging
import re
from dataclasses import dataclass
from pathlib import Path
from alfred.domain.releases_TO_CHECK.ports import ReleaseKnowledge
from alfred.domain.releases_TO_CHECK.value_objects import ReleaseMode
from alfred.domain.shared_TO_CHECK.ports import FilesystemScanner
_LOG = logging.getLogger(__name__)
# Matches any ``Sxx`` token (1-2 digits) bounded by non-alphanumerics.
# Examples that match: ``Foundation.S01.1080p`` , ``S2.Pack`` , ``BBC.s10.bluray``.
# Examples that don't: ``Sample`` , ``Soundtrack`` , ``2024.S0E1`` (no S+digits boundary).
_SEASON_TOKEN_RE = re.compile(r"(?<![A-Za-z0-9])s(\d{1,2})(?![A-Za-z0-9])", re.IGNORECASE)
@dataclass(frozen=True)
class SeasonFolder:
"""One season folder discovered inside a show root.
``mode`` is set by the walker from the FS structure:
* :attr:`ReleaseMode.PACK` — ``video_files`` lists the season
folder's flat videos.
* :attr:`ReleaseMode.EPISODIC` — ``video_files`` lists each
episode subfolder's single video.
* ``None`` — the folder is empty, malformed (mixed layout), or
otherwise unclassifiable. ``video_files`` is empty. The
orchestrator decides whether to warn/skip.
"""
season_dir: Path
mode: ReleaseMode | None
video_files: tuple[Path, ...]
@dataclass(frozen=True)
class ShowTree:
"""The full structural snapshot of a show on disk."""
show_root: Path
season_folders: tuple[SeasonFolder, ...]
def walk_show(
show_root: Path,
*,
scanner: FilesystemScanner,
kb: ReleaseKnowledge,
) -> ShowTree:
"""Walk ``show_root`` and return its structural tree.
The walker:
* lists direct children of ``show_root``,
* keeps the directories whose name contains a ``Sxx`` token,
* classifies each season folder as PACK / EPISODIC / unknown by
inspecting its direct children (videos vs subfolders),
* for EPISODIC, descends one extra level into each episode
subfolder to collect its single video,
* sorts season folders by name and video files by name within
each folder.
The walker never raises — empty / unreadable / malformed
directories surface as a ``SeasonFolder`` with ``mode=None`` and
an empty ``video_files`` tuple.
"""
video_exts = {ext.lower() for ext in kb.video_extensions}
season_folders: list[SeasonFolder] = []
for entry in scanner.scan_dir(show_root):
if not entry.is_dir or not _SEASON_TOKEN_RE.search(entry.name):
continue
season_folders.append(
_classify_season(entry.path, scanner=scanner, video_exts=video_exts)
)
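# Sort by folder name, as the docstring promises. This is a no-op if
# scanner.scan_dir already returns entries in sorted order.
return ShowTree(
show_root=show_root,
season_folders=tuple(sorted(season_folders, key=lambda s: s.season_dir.name)),
)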
# --------------------------------------------------------------------------- #
# Season-folder classification #
# --------------------------------------------------------------------------- #
def _classify_season(
season_dir: Path,
*,
scanner: FilesystemScanner,
video_exts: set[str],
) -> SeasonFolder:
"""Inspect one season folder and decide PACK / EPISODIC / unknown.
Looks only at direct children. For EPISODIC, descends one extra
level into each subfolder to collect its single video. Mixed
layouts (flat videos + subfolders) are reported as ``mode=None``
so the orchestrator can skip them with a warning.
"""
flat_videos: list[Path] = []
subdirs: list[Path] = []
for child in scanner.scan_dir(season_dir):
if child.is_file and child.suffix.lower() in video_exts:
flat_videos.append(child.path)
elif child.is_dir:
subdirs.append(child.path)
# Anything else (non-video files like .nfo, .srt at the season
# root) is ignored — it doesn't affect classification.
has_flat = bool(flat_videos)
has_subdirs = bool(subdirs)
if has_flat and has_subdirs:
_LOG.warning(
"walker: season folder %s mixes flat videos and subfolders — "
"malformed layout, skipping",
season_dir,
)
return SeasonFolder(season_dir=season_dir, mode=None, video_files=())
if has_flat:
return SeasonFolder(
season_dir=season_dir,
mode=ReleaseMode.PACK,
video_files=tuple(sorted(flat_videos)),
)
if has_subdirs:
episode_videos: list[Path] = []
for sub in sorted(subdirs):
videos_in_sub = [
child.path
for child in scanner.scan_dir(sub)
if child.is_file and child.suffix.lower() in video_exts
]
if len(videos_in_sub) == 0:
_LOG.warning(
"walker: episode subfolder %s contains no video — skipping",
sub,
)
continue
if len(videos_in_sub) > 1:
_LOG.warning(
"walker: episode subfolder %s contains %d videos — "
"malformed, skipping season %s",
sub,
len(videos_in_sub),
season_dir,
)
return SeasonFolder(
season_dir=season_dir, mode=None, video_files=()
)
episode_videos.append(videos_in_sub[0])
return SeasonFolder(
season_dir=season_dir,
mode=ReleaseMode.EPISODIC,
video_files=tuple(episode_videos),
)
# No flat videos, no subdirs → empty season folder.
return SeasonFolder(season_dir=season_dir, mode=None, video_files=())
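The two decisions the walker makes (is this a season folder, and is it PACK or EPISODIC) can be tried in isolation. The sketch below reuses the regex from this file verbatim; everything else is a simplification under stated assumptions: children are `(name, is_dir)` pairs rather than scanner entries, the strings `"PACK"` and `"EPISODIC"` stand in for `ReleaseMode` members, extensions are assumed to be dot-prefixed, and the one-level descent into episode subfolders is left out.

```python
from __future__ import annotations

import re
from pathlib import PurePath

# Same pattern as in the walker module above.
_SEASON_TOKEN_RE = re.compile(
    r"(?<![A-Za-z0-9])s(\d{1,2})(?![A-Za-z0-9])", re.IGNORECASE
)


def season_number(folder_name: str) -> int | None:
    """Return the season number of an ``Sxx`` token, or None if absent."""
    match = _SEASON_TOKEN_RE.search(folder_name)
    return int(match.group(1)) if match else None


def classify(children: list[tuple[str, bool]], video_exts: set[str]) -> str | None:
    """Classify a season folder from its direct children only."""
    has_flat = any(
        not is_dir and PurePath(name).suffix.lower() in video_exts
        for name, is_dir in children
    )
    has_subdirs = any(is_dir for _, is_dir in children)
    if has_flat and has_subdirs:
        return None  # mixed layout: malformed
    if has_flat:
        return "PACK"
    if has_subdirs:
        return "EPISODIC"
    return None  # empty folder, or only non-video files


VIDEO_EXTS = {".mkv", ".mp4"}  # assumption: dot-prefixed, lowercase
pack = classify(
    [("Foundation.S01E01.1080p.WEB-DL.x265.mkv", False), ("info.nfo", False)],
    VIDEO_EXTS,
)
episodic = classify([("Foundation.S02E01.1080p.WEB-DL.x265-GROUP", True)], VIDEO_EXTS)
mixed = classify([("a.mkv", False), ("sub", True)], VIDEO_EXTS)
only_nfo = classify([("info.nfo", False)], VIDEO_EXTS)
```

One consequence of the lookahead is that episode names such as `Foundation.S02E01...` do not count as season tokens, because the digits are followed by `E`. An episode subfolder placed directly under the show root would therefore be ignored by `walk_show` rather than mistaken for a season.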
@@ -1,104 +0,0 @@
"""Movie domain entities."""

from dataclasses import dataclass, field
from datetime import datetime

from ..shared.media import AudioTrack, MediaWithTracks, SubtitleTrack
from ..shared.value_objects import FilePath, FileSize, ImdbId
from .value_objects import MovieTitle, Quality, ReleaseYear


@dataclass(eq=False)
class Movie(MediaWithTracks):
    """
    Movie aggregate root for the movies domain.

    Carries file metadata (path, size) and the tracks discovered by the
    ffprobe + subtitle scan pipeline. The track lists may be empty when the
    movie is known but not yet scanned, or when no file is downloaded.

    Track helpers follow the same "C+" contract as ``Episode``: pass a
    ``Language`` for cross-format matching, or a ``str`` for case-insensitive
    direct comparison.

    Equality is identity-based: two ``Movie`` instances are equal iff they
    share the same ``imdb_id``, regardless of file/track contents. This is
    the DDD aggregate invariant — the aggregate is identified by its root id.
    """

    imdb_id: ImdbId
    title: MovieTitle
    release_year: ReleaseYear | None = None
    quality: Quality = Quality.UNKNOWN
    file_path: FilePath | None = None
    file_size: FileSize | None = None
    tmdb_id: int | None = None
    added_at: datetime = field(default_factory=datetime.now)
    audio_tracks: list[AudioTrack] = field(default_factory=list)
    subtitle_tracks: list[SubtitleTrack] = field(default_factory=list)

    def __post_init__(self):
        """Validate movie entity."""
        # Ensure ImdbId is actually an ImdbId instance
        if not isinstance(self.imdb_id, ImdbId):
            if isinstance(self.imdb_id, str):
                self.imdb_id = ImdbId(self.imdb_id)
            else:
                raise ValueError(
                    f"imdb_id must be ImdbId or str, got {type(self.imdb_id)}"
                )

        # Ensure MovieTitle is actually a MovieTitle instance
        if not isinstance(self.title, MovieTitle):
            if isinstance(self.title, str):
                self.title = MovieTitle(self.title)
            else:
                raise ValueError(
                    f"title must be MovieTitle or str, got {type(self.title)}"
                )

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Movie):
            return NotImplemented
        return self.imdb_id == other.imdb_id

    def __hash__(self) -> int:
        return hash(self.imdb_id)

    # Track helpers (has_audio_in / audio_languages / has_subtitles_in /
    # has_forced_subs / subtitle_languages) come from MediaWithTracks.

    def get_folder_name(self) -> str:
        """
        Get the folder name for this movie.

        Format: "Title (Year)"
        Example: "Inception (2010)"
        """
        if self.release_year:
            return f"{self.title.value} ({self.release_year.value})"
        return self.title.value

    def get_filename(self) -> str:
        """
        Get the suggested filename for this movie.

        Format: "Title.Year.Quality.ext"
        Example: "Inception.2010.1080p.mkv"
        """
        parts = [self.title.normalized()]

        if self.release_year:
            parts.append(str(self.release_year.value))

        if self.quality != Quality.UNKNOWN:
            parts.append(self.quality.value)

        # Extension will be added based on actual file
        return ".".join(parts)

    def __str__(self) -> str:
        return f"{self.title.value} ({self.release_year.value if self.release_year else 'Unknown'})"

    def __repr__(self) -> str:
        return f"Movie(imdb_id={self.imdb_id}, title='{self.title.value}')"
-73
@@ -1,73 +0,0 @@
"""Movie repository interfaces (abstract)."""

from abc import ABC, abstractmethod

from ..shared.value_objects import ImdbId
from .entities import Movie


class MovieRepository(ABC):
    """
    Abstract repository for movie persistence.

    This defines the interface that infrastructure implementations must follow.
    """

    @abstractmethod
    def save(self, movie: Movie) -> None:
        """
        Save a movie to the repository.

        Args:
            movie: Movie entity to save
        """
        pass

    @abstractmethod
    def find_by_imdb_id(self, imdb_id: ImdbId) -> Movie | None:
        """
        Find a movie by its IMDb ID.

        Args:
            imdb_id: IMDb ID to search for

        Returns:
            Movie if found, None otherwise
        """
        pass

    @abstractmethod
    def find_all(self) -> list[Movie]:
        """
        Get all movies in the repository.

        Returns:
            List of all movies
        """
        pass

    @abstractmethod
    def delete(self, imdb_id: ImdbId) -> bool:
        """
        Delete a movie from the repository.

        Args:
            imdb_id: IMDb ID of the movie to delete

        Returns:
            True if deleted, False if not found
        """
        pass

    @abstractmethod
    def exists(self, imdb_id: ImdbId) -> bool:
        """
        Check if a movie exists in the repository.

        Args:
            imdb_id: IMDb ID to check

        Returns:
            True if exists, False otherwise
        """
        pass
+91
@@ -0,0 +1,91 @@
"""Movie domain entities."""

from dataclasses import dataclass

from ..shared_TO_CHECK.value_objects import ImdbId, TmdbId
from .value_objects import MovieTitle, ReleaseYear


@dataclass(frozen=True, eq=False)
class Movie:
    """
    Movie aggregate root for the movies domain.

    TMDB-only aggregate: carries identity (``tmdb_id`` + optional
    ``imdb_id``) plus the catalog facts that come from TMDB (``title``,
    ``release_year``). Filesystem-side concerns (file path, quality,
    tracks, ``added_at``) live on :class:`alfred.domain.releases.entities.
    MovieRelease`, the per-movie release aggregate persisted alongside.

    Frozen: rebuild via ``dataclasses.replace`` to project metadata
    updates (e.g. a TMDB refresh) onto a new instance.

    Equality is identity-based on ``tmdb_id``: two ``Movie`` instances
    are equal iff they share the same primary key. ``imdb_id`` is a
    secondary anchor and not part of the identity.
    """

    tmdb_id: TmdbId
    title: MovieTitle
    imdb_id: ImdbId | None = None
    release_year: ReleaseYear | None = None

    def __post_init__(self) -> None:
        if not isinstance(self.tmdb_id, TmdbId):
            raise ValueError(
                f"tmdb_id must be TmdbId, got {type(self.tmdb_id)}"
            )
        if not isinstance(self.title, MovieTitle):
            if isinstance(self.title, str):
                object.__setattr__(self, "title", MovieTitle(self.title))
            else:
                raise ValueError(
                    f"title must be MovieTitle or str, got {type(self.title)}"
                )
        if self.imdb_id is not None and not isinstance(self.imdb_id, ImdbId):
            raise ValueError(
                f"imdb_id must be ImdbId or None, got {type(self.imdb_id)}"
            )

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Movie):
            return NotImplemented
        return self.tmdb_id == other.tmdb_id

    def __hash__(self) -> int:
        return hash(self.tmdb_id)

    # WRONG
    def get_folder_name(self) -> str:
        """
        Get the folder name for this movie.

        Format: "Title (Year)"
        Example: "Inception (2010)"
        """
        if self.release_year:
            return f"{self.title.value} ({self.release_year.value})"
        return self.title.value

    # WRONG
    def get_filename(self) -> str:
        """
        Get the suggested base filename (without extension) for this movie.

        Format: ``Title.Year`` (quality lives on
        :class:`alfred.domain.releases.entities.MovieRelease` now and is
        appended by the release-aware caller — typically the rescan /
        organize flow, after Phase 4).

        Example: ``Inception.2010``.
        """
        parts = [self.title.normalized()]
        if self.release_year:
            parts.append(str(self.release_year.value))
        return ".".join(parts)

    def __str__(self) -> str:
        return f"{self.title.value} ({self.release_year.value if self.release_year else 'Unknown'})"

    def __repr__(self) -> str:
        return f"Movie(tmdb_id={self.tmdb_id}, title='{self.title.value}')"
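The frozen, identity-based design described in the `Movie` docstring can be tried out with a reduced sketch. The types below are hypothetical stand-ins (a plain `str` title, an `int` year, and a one-field `TmdbId`), not the project's value objects, and the id used is only an illustrative value.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class TmdbId:
    """Hypothetical stand-in for the real TmdbId value object."""

    value: int


@dataclass(frozen=True, eq=False)
class Movie:
    """Reduced Movie: identity is tmdb_id only, other fields are catalog facts."""

    tmdb_id: TmdbId
    title: str
    release_year: int | None = None

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Movie):
            return NotImplemented
        return self.tmdb_id == other.tmdb_id

    def __hash__(self) -> int:
        return hash(self.tmdb_id)


original = Movie(TmdbId(27205), "Inception")

# A metadata refresh is projected onto a new instance instead of mutating.
refreshed = replace(original, release_year=2010)

try:
    original.release_year = 2010  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because equality and hashing ignore everything except `tmdb_id`, `original` and `refreshed` compare equal and collapse to one entry in a set or dict, which is usually what a repository keyed by the primary id wants.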
@@ -1,6 +1,6 @@
 """Movie domain exceptions."""
-from ..shared.exceptions import DomainException, NotFoundError
+from ..shared_TO_CHECK.exceptions import DomainException, NotFoundError
 class MovieNotFound(NotFoundError):
@@ -3,8 +3,7 @@
 from dataclasses import dataclass
 from enum import Enum
-from ..shared.exceptions import ValidationError
-from ..shared.value_objects import to_dot_folder_name
+from ..shared_TO_CHECK.exceptions import ValidationError
 class Quality(Enum):
@@ -56,18 +55,11 @@ class MovieTitle:
                 f"Movie title must be a string, got {type(self.value)}"
             )
-        if len(self.value) > 500:
+        if len(self.value) > 150:
             raise ValidationError(
-                f"Movie title too long: {len(self.value)} characters (max 500)"
+                f"Movie title too long: {len(self.value)} characters (max 150)"
             )
-    def normalized(self) -> str:
-        """
-        Return normalized title for file system usage.
-        Removes special characters and replaces spaces with dots.
-        """
-        return to_dot_folder_name(self.value)
     def __str__(self) -> str:
         return self.value
@@ -93,10 +85,6 @@ class ReleaseYear:
                 f"Release year must be an integer, got {type(self.value)}"
             )
-        # Movies started around 1888, and we shouldn't have movies from the future
-        if self.value < 1888 or self.value > 2100:
-            raise ValidationError(f"Invalid release year: {self.value}")
     def __str__(self) -> str:
         return str(self.value)
-6
@@ -1,6 +0,0 @@
"""Release domain — release name parsing and naming conventions."""
from .services import parse_release
from .value_objects import ParsedRelease, ParseReport
__all__ = ["ParsedRelease", "ParseReport", "parse_release"]
@@ -0,0 +1,38 @@
"""Filesystem release aggregates — what the user owns on disk.

This bounded context is intentionally separated from
``alfred.domain.tv_shows`` / ``alfred.domain.movies`` (TMDB identity).
A :class:`SeriesRelease` describes the physical files on disk for one
show; a :class:`TVShow` describes the work as catalogued by TMDB. The
two are linked by :class:`~alfred.domain.shared.value_objects.TmdbId`
in the persistence layer, never by direct reference.

Not to be confused with ``alfred.domain.release`` (singular) which
parses release **names** (strings → tokens). The two packages may be
merged later; for now they coexist as separate concerns.
"""

from .builders import SeasonReleaseBuilder, SeriesReleaseBuilder
from .entities import (
    EpisodeRelease,
    MovieRelease,
    SeasonRelease,
    SeriesRelease,
    TrackProfile,
)
from .repositories import MovieReleaseRepository, SeriesReleaseRepository
from .value_objects import EpisodeRange, ReleaseMode

__all__ = [
    "EpisodeRange",
    "EpisodeRelease",
    "MovieRelease",
    "MovieReleaseRepository",
    "ReleaseMode",
    "SeasonRelease",
    "SeasonReleaseBuilder",
    "SeriesRelease",
    "SeriesReleaseBuilder",
    "SeriesReleaseRepository",
    "TrackProfile",
]
+243
@@ -0,0 +1,243 @@
"""Builders for the filesystem release aggregates.

The aggregates are frozen — :class:`SeriesRelease`, :class:`SeasonRelease`,
and :class:`EpisodeRelease` are ``@dataclass(frozen=True)`` and offer no
mutation methods. All construction goes through these builders, which
assemble the aggregate piece by piece and emit a frozen instance via
``build()``.

Typical usage during a filesystem walk::

    builder = SeriesReleaseBuilder(tmdb_id=TmdbId(84958), imdb_id=ImdbId("tt0804484"))
    sb = builder.season_builder(SeasonNumber(1), folder="Show.S01", mode=ReleaseMode.PACK)
    sb.add_episode(EpisodeRelease(
        episodes=EpisodeRange(EpisodeNumber(1), EpisodeNumber(1)),
        file_path=FilePath("Show.S01/Show.S01E01.mkv"),
        tracks=TrackProfile(),
    ))
    release = builder.build()

Builders are **single-use scratchpads**: they hold mutable state during
construction, then produce an immutable aggregate.

Invariants enforced at ``build()`` time:

* Seasons are emitted sorted by ``season_number``.
* Episodes within each season are emitted sorted by their
  ``EpisodeRange.start`` (so a season with ``E01-E03`` + ``E04`` is
  emitted in that order).
* No two ``EpisodeRelease`` within a season may overlap (same TMDB
  episode covered by two distinct files) — raises ``ValidationError``.
"""

from __future__ import annotations

from ..shared_TO_CHECK.exceptions import ValidationError
from ..shared_TO_CHECK.value_objects import ImdbId, TmdbId
from ..tv_shows.value_objects import SeasonNumber
from .entities import (
    EpisodeRelease,
    SeasonRelease,
    SeriesRelease,
)
from .value_objects import ReleaseMode

# ════════════════════════════════════════════════════════════════════════════
# MovieReleaseBuilder
# ════════════════════════════════════════════════════════════════════════════
# ...
# ════════════════════════════════════════════════════════════════════════════
# SeasonReleaseBuilder
# ════════════════════════════════════════════════════════════════════════════
class SeasonReleaseBuilder:
    """
    Mutable scratchpad for a :class:`SeasonRelease`.

    Episodes are appended in arbitrary order; ``build()`` sorts them by
    their range start before emitting the frozen aggregate and verifies
    there are no overlapping ranges.
    """

    def __init__(
        self,
        season_number: SeasonNumber | int,
        *,
        folder: str,
        mode: ReleaseMode,
    ) -> None:
        if isinstance(season_number, int):
            season_number = SeasonNumber(season_number)
        self._season_number: SeasonNumber = season_number
        self._folder: str = folder
        self._mode: ReleaseMode = mode
        self._episodes: list[EpisodeRelease] = []

    @classmethod
    def from_existing(cls, season: SeasonRelease) -> SeasonReleaseBuilder:
        """Seed a builder from an existing frozen :class:`SeasonRelease`."""
        builder = cls(
            season.season_number,
            folder=season.folder,
            mode=season.mode,
        )
        builder._episodes = list(season.episodes)
        return builder

    @property
    def season_number(self) -> SeasonNumber:
        return self._season_number

    @property
    def mode(self) -> ReleaseMode:
        return self._mode

    def set_folder(self, folder: str) -> SeasonReleaseBuilder:
        self._folder = folder
        return self

    def set_mode(self, mode: ReleaseMode) -> SeasonReleaseBuilder:
        self._mode = mode
        return self

    def add_episode(self, episode: EpisodeRelease) -> SeasonReleaseBuilder:
        """Append a physical-file :class:`EpisodeRelease` to this season."""
        self._episodes.append(episode)
        return self

    def build(self) -> SeasonRelease:
        """Emit a frozen :class:`SeasonRelease` with episodes sorted.

        Raises :class:`ValidationError` if any two episode ranges overlap
        (same TMDB slot claimed by two distinct files).
        """
        ordered = tuple(
            sorted(self._episodes, key=lambda ep: ep.episodes.start.value)
        )
        # Overlap check — ranges are inclusive on both ends, sorted by start.
        for prev, curr in zip(ordered, ordered[1:], strict=False):
            if curr.episodes.start.value <= prev.episodes.end.value:
                raise ValidationError(
                    f"SeasonRelease season {self._season_number}: overlapping "
                    f"episode ranges {prev.episodes} and {curr.episodes}"
                )
        return SeasonRelease(
            season_number=self._season_number,
            folder=self._folder,
            mode=self._mode,
            episodes=ordered,
        )

# ════════════════════════════════════════════════════════════════════════════
# SeriesReleaseBuilder
# ════════════════════════════════════════════════════════════════════════════
class SeriesReleaseBuilder:
    """
    Mutable scratchpad for the :class:`SeriesRelease` aggregate root.

    Seasons are tracked via internal :class:`SeasonReleaseBuilder`
    instances keyed by :class:`SeasonNumber`.
    """

    def __init__(
        self,
        *,
        tmdb_id: TmdbId | int,
        imdb_id: ImdbId | str | None = None,
    ) -> None:
        if isinstance(tmdb_id, int):
            tmdb_id = TmdbId(tmdb_id)
        if isinstance(imdb_id, str):
            imdb_id = ImdbId(imdb_id)
        self._tmdb_id: TmdbId = tmdb_id
        self._imdb_id: ImdbId | None = imdb_id
        self._season_builders: dict[SeasonNumber, SeasonReleaseBuilder] = {}

    @classmethod
    def from_existing(cls, release: SeriesRelease) -> SeriesReleaseBuilder:
        """Seed a builder from an existing frozen :class:`SeriesRelease`."""
        builder = cls(
            tmdb_id=release.tmdb_id,
            imdb_id=release.imdb_id,
        )
        for season in release.seasons:
            builder._season_builders[season.season_number] = (
                SeasonReleaseBuilder.from_existing(season)
            )
        return builder

    # ── Top-level mutators ─────────────────────────────────────────────────

    def set_imdb_id(self, imdb_id: ImdbId | str | None) -> SeriesReleaseBuilder:
        if isinstance(imdb_id, str):
            imdb_id = ImdbId(imdb_id)
        self._imdb_id = imdb_id
        return self

    # ── Content ────────────────────────────────────────────────────────────

    def season_builder(
        self,
        season_number: SeasonNumber | int,
        *,
        folder: str | None = None,
        mode: ReleaseMode | None = None,
    ) -> SeasonReleaseBuilder:
        """
        Return (creating if needed) the :class:`SeasonReleaseBuilder` for a
        season.

        ``folder`` and ``mode`` are required when the builder does not yet
        exist for this season; subsequent calls may pass them to override.
        """
        if isinstance(season_number, int):
            season_number = SeasonNumber(season_number)
        sb = self._season_builders.get(season_number)
        if sb is None:
            if folder is None or mode is None:
                raise ValidationError(
                    f"season_builder({season_number}): folder and mode "
                    f"are required to create a new season builder"
                )
            sb = SeasonReleaseBuilder(season_number, folder=folder, mode=mode)
            self._season_builders[season_number] = sb
        else:
            if folder is not None:
                sb.set_folder(folder)
            if mode is not None:
                sb.set_mode(mode)
        return sb

    def add_season(self, season: SeasonRelease) -> SeriesReleaseBuilder:
        """
        Attach (or replace) a fully-built :class:`SeasonRelease`.

        Replaces any existing season with the same number.
        """
        self._season_builders[season.season_number] = (
            SeasonReleaseBuilder.from_existing(season)
        )
        return self

    # ── Emit ───────────────────────────────────────────────────────────────

    def build(self) -> SeriesRelease:
        """Emit a frozen :class:`SeriesRelease` with seasons sorted by number."""
        ordered_seasons = tuple(
            self._season_builders[n].build()
            for n in sorted(self._season_builders, key=lambda x: x.value)
        )
        return SeriesRelease(
            tmdb_id=self._tmdb_id,
            imdb_id=self._imdb_id,
            seasons=ordered_seasons,
        )
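The overlap invariant enforced in `SeasonReleaseBuilder.build()` relies on a property worth spelling out: once inclusive ranges are sorted by start, comparing each range only with its immediate predecessor is enough to detect any overlap. A minimal sketch, using a hypothetical `Range` type with plain integers in place of `EpisodeRange` / `EpisodeNumber`, and a hypothetical `find_overlap` helper that returns the offending pair instead of raising:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Hypothetical inclusive episode range (stand-in for EpisodeRange)."""

    start: int
    end: int


def find_overlap(ranges: list[Range]) -> tuple[Range, Range] | None:
    """Return the first overlapping adjacent pair after sorting, or None."""
    ordered = sorted(ranges, key=lambda r: r.start)
    # Both ends are inclusive, so "next start <= previous end" is an overlap.
    for prev, curr in zip(ordered, ordered[1:]):
        if curr.start <= prev.end:
            return prev, curr
    return None


clean = find_overlap([Range(4, 4), Range(1, 3)])      # E01-E03 + E04
clash = find_overlap([Range(1, 3), Range(3, 4)])      # E03 claimed twice
```

The adjacent-pair check is sufficient because, if no neighbouring ranges overlap, every later start is strictly greater than every earlier end, so no distant pair can overlap either.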
+217
@@ -0,0 +1,217 @@
"""Filesystem release aggregates.

The release domain models what the user owns on disk — one
:class:`SeriesRelease` per show, one :class:`MovieRelease` per movie.
TMDB identity (title, status, episode_count, …) lives in the
``tv_shows`` / ``movies`` domains and is linked via the
:class:`~alfred.domain.shared.value_objects.TmdbId` natural key.

All entities are frozen. Mutation goes through the builders in
:mod:`alfred.domain.releases.builders`.
"""

from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime

from ..shared_TO_CHECK.exceptions import ValidationError
from ..shared_TO_CHECK.media import AudioTrack, SubtitleTrack
from ..shared_TO_CHECK.value_objects import FilePath, ImdbId, TmdbId
from ..tv_shows.value_objects import SeasonNumber
from .value_objects import EpisodeRange, ReleaseMode

__all__ = [
    "EpisodeRelease",
    "MovieRelease",
    "SeasonRelease",
    "SeriesRelease",
    "TrackProfile",
]


@dataclass(frozen=True)
class TrackProfile:
    """
    Audio + subtitle tracks of one physical file.

    Tracks live per-file (not per-season): every ``EpisodeRelease`` and
    ``MovieRelease`` carries its own ``TrackProfile``. Season-level
    aggregation is computed by the caller when needed.
    """

    audio_tracks: tuple[AudioTrack, ...] = ()
    subtitle_tracks: tuple[SubtitleTrack, ...] = ()


@dataclass(frozen=True)
class EpisodeRelease:
    """
    One physical episode file (or multi-episode file) on disk.

    :attr:`episodes` is an :class:`EpisodeRange` — a single ``.mkv``
    that covers ``S01E02E03`` carries ``EpisodeRange(start=E02, end=E03)``
    and is recorded once. The library index lists it under each covered
    slot (``E02``, ``E03``) for symmetric lookups.

    :attr:`file_path` is **relative to the show root** (e.g.
    ``"Show.S01/Show.S01E02.mkv"`` for PACK,
    ``"Show.S01/Show.S01E02-RG/Show.S01E02-RG.mkv"`` for EPISODIC).
    The caller (repository) prepends the absolute show root when
    needed.
    """

    episodes: EpisodeRange
    file_path: FilePath
    tracks: TrackProfile = TrackProfile()


@dataclass(frozen=True)
class SeasonRelease:
    """
    All physical files on disk for one season of a show.

    The :attr:`mode` flag records the filesystem layout:

    * :attr:`ReleaseMode.PACK` — the season folder contains N video
      files directly. ``episodes`` lists each ``.mkv`` in the folder.
    * :attr:`ReleaseMode.EPISODIC` — the season folder contains N
      sub-folders, each with one episode. ``episodes`` lists each
      ``(subfolder, file)`` pair.

    :attr:`folder` is the season folder name, relative to the show root.

    Invariant: every ``EpisodeRelease.episodes`` range stays within
    sane bounds (validated at construction). Cross-episode duplicate
    detection (two files claiming the same TMDB slot) is the
    builder's job, not the entity's.
    """

    season_number: SeasonNumber
    folder: str
    mode: ReleaseMode
    episodes: tuple[EpisodeRelease, ...] = ()

    def __post_init__(self) -> None:
        if not isinstance(self.season_number, SeasonNumber):
            raise ValidationError(
                f"SeasonRelease.season_number must be SeasonNumber, "
                f"got {type(self.season_number)}"
            )
        if not isinstance(self.mode, ReleaseMode):
            raise ValidationError(
                f"SeasonRelease.mode must be ReleaseMode, got {type(self.mode)}"
            )
        if not isinstance(self.folder, str) or not self.folder:
            raise ValidationError(
                f"SeasonRelease.folder must be a non-empty string, "
                f"got {self.folder!r}"
            )

    def episode_count(self) -> int:
        """
        Total number of TMDB episode slots covered by all physical files.

        Sums each :meth:`EpisodeRange.count` — a season with two files
        ``E01`` + ``E02-E03`` returns ``3`` (one slot from the first
        file, two from the second).

        Compared by the caller against the library index's TMDB
        ``episode_count`` to detect incomplete seasons.
        """
        return sum(ep.episodes.count() for ep in self.episodes)


@dataclass(frozen=True)
class SeriesRelease:
    """
    All physical seasons on disk for one show.

    Anchored to TMDB by :attr:`tmdb_id` (primary key). :attr:`imdb_id`
    is optional and stored as a secondary anchor — useful for the
    occasional show without TMDB coverage, and for cross-checking
    when both ids are known.

    Seasons are exposed sorted by ``season_number`` (the builder
    enforces this on emit). No duplicate ``season_number`` is
    permitted across :attr:`seasons`.
    """

    tmdb_id: TmdbId
    imdb_id: ImdbId | None
    seasons: tuple[SeasonRelease, ...] = ()

    def __post_init__(self) -> None:
        if not isinstance(self.tmdb_id, TmdbId):
            raise ValidationError(
                f"SeriesRelease.tmdb_id must be TmdbId, got {type(self.tmdb_id)}"
            )
        if self.imdb_id is not None and not isinstance(self.imdb_id, ImdbId):
            raise ValidationError(
                f"SeriesRelease.imdb_id must be ImdbId or None, "
                f"got {type(self.imdb_id)}"
            )
        seen: set[int] = set()
        for s in self.seasons:
            if s.season_number.value in seen:
                raise ValidationError(
                    f"SeriesRelease has duplicate season "
                    f"{s.season_number}"
                )
            seen.add(s.season_number.value)

    def get_season(self, season_number: SeasonNumber) -> SeasonRelease | None:
        """Return the :class:`SeasonRelease` for ``season_number`` or ``None``."""
        for s in self.seasons:
            if s.season_number == season_number:
                return s
        return None


@dataclass(frozen=True)
class MovieRelease:
    """
    A single physical movie file on disk.

    Anchored to TMDB by :attr:`tmdb_id`; :attr:`imdb_id` optional
    secondary anchor.

    :attr:`folder` is the movie folder name relative to the
    ``movies/`` library root. :attr:`file_path` is the video file
    name relative to the folder (movies are one folder, one file in
    Alfred's layout — no sub-folders).

    :attr:`added_at` is the UTC timestamp at which the release was
    first observed in the library — set by the caller (organizer /
    rescan) when the aggregate is built. Persisted by the v2 movie
    sidecar; not derived from the filesystem (mtime drifts across
    moves and hard-links).
    """

    tmdb_id: TmdbId
    imdb_id: ImdbId | None
    folder: str
    file_path: FilePath
    added_at: datetime
    tracks: TrackProfile = TrackProfile()

    def __post_init__(self) -> None:
        if not isinstance(self.tmdb_id, TmdbId):
            raise ValidationError(
                f"MovieRelease.tmdb_id must be TmdbId, got {type(self.tmdb_id)}"
            )
        if self.imdb_id is not None and not isinstance(self.imdb_id, ImdbId):
            raise ValidationError(
                f"MovieRelease.imdb_id must be ImdbId or None, "
                f"got {type(self.imdb_id)}"
            )
        if not isinstance(self.folder, str) or not self.folder:
            raise ValidationError(
                f"MovieRelease.folder must be a non-empty string, "
                f"got {self.folder!r}"
            )
        if not isinstance(self.added_at, datetime):
            raise ValidationError(
                f"MovieRelease.added_at must be datetime, "
                f"got {type(self.added_at)}"
            )
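The slot arithmetic behind `SeasonRelease.episode_count()` can be sketched without the domain types. `EpisodeRange.count` is not shown in this diff, so the sketch assumes it returns `end - start + 1` for an inclusive range, which matches the `E01` + `E02-E03` example in the docstring. `slots_covered` and `is_incomplete` are hypothetical helper names, and ranges are plain `(start, end)` tuples.

```python
from __future__ import annotations


def slots_covered(ranges: list[tuple[int, int]]) -> int:
    """Sum the episode slots covered by inclusive (start, end) ranges."""
    # Assumption: an inclusive range covers end - start + 1 slots.
    return sum(end - start + 1 for start, end in ranges)


def is_incomplete(ranges: list[tuple[int, int]], tmdb_episode_count: int) -> bool:
    """Hypothetical caller-side check against the TMDB episode count."""
    return slots_covered(ranges) < tmdb_episode_count


on_disk = [(1, 1), (2, 3)]  # two files: E01 and E02-E03
covered = slots_covered(on_disk)
```

Counting slots rather than files is what lets a multi-episode file such as `S01E02E03` count as two episodes when the season is compared against TMDB.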
@@ -17,10 +17,6 @@ The pipeline has three internal paths driven by the detected release group:
   knowledge sets, with a 0-100 confidence score.
 - **PATH OF PAIN**: score below threshold OR critical chunks missing
   signaled to the caller, who decides whether to involve the LLM/user.
-Today the package exposes scaffolding only (token VOs and a thin pipeline
-stub). The legacy ``parse_release`` in ``release.services`` keeps serving
-production until each piece of the v2 pipeline is wired in.
 """
 from __future__ import annotations
@@ -29,11 +29,10 @@ arrives through ``kb: ReleaseKnowledge``.
 from __future__ import annotations
 from ..ports.knowledge import ReleaseKnowledge
-from ..value_objects import MediaTypeToken
+from alfred.domain.releases_TO_CHECK.value_objects_old_question_mark import MediaTypeToken
 from .schema import GroupSchema
 from .tokens import Token, TokenRole
 # ---------------------------------------------------------------------------
 # Stage 1 — tokenize
 # ---------------------------------------------------------------------------
@@ -713,9 +712,6 @@ def assemble(
             if distributor is None:
                 distributor = tok.text.upper()
-    tech_parts = [p for p in (quality, source, codec) if p]
-    tech_string = ".".join(tech_parts)
     # Media type heuristic. Doc/concert/integrale tokens win over the
     # generic tech-based fallback. We look across all tokens (not just
     # annotated ones) because these markers may be tagged UNKNOWN by the
@@ -754,10 +750,9 @@ def assemble(
         "source": source,
         "codec": codec,
         "group": group,
-        "tech_string": tech_string,
         "media_type": media_type,
         "site_tag": site_tag,
-        "languages": languages,
+        "languages": tuple(languages),
         "audio_codec": audio_codec,
         "audio_channels": audio_channels,
         "bit_depth": bit_depth,
@@ -27,14 +27,14 @@ from __future__ import annotations
 from enum import Enum
 from ..ports.knowledge import ReleaseKnowledge
-from ..value_objects import ParsedRelease
+from alfred.domain.releases_TO_CHECK.value_objects_old_question_mark import ParsedRelease
 from .tokens import Token, TokenRole
 class Road(str, Enum):
     """How the parser handled a given release name.
-    Distinct from :class:`~alfred.domain.release.value_objects.ParsePath`,
+    Distinct from :class:`~alfred.domain.release.value_objects.TokenizationRoute`,
     which records the tokenization route (DIRECT / SANITIZED / AI). Road
     is about confidence in the *result*, not the *method*.
     """
@@ -18,10 +18,9 @@ score, the road, and diagnostic info for downstream callers.
 from __future__ import annotations
-from .parser import pipeline as _v2
-from .parser import scoring as _scoring
-from .ports import ReleaseKnowledge
-from .value_objects import MediaTypeToken, ParsedRelease, ParsePath, ParseReport
+from alfred.domain.releases_TO_CHECK.parser import scoring as _scoring, pipeline as _v2
+from alfred.domain.releases_TO_CHECK.ports import ReleaseKnowledge
+from alfred.domain.releases_TO_CHECK.value_objects_old_question_mark import MediaTypeToken, ParsedRelease, ParseReport, TokenizationRoute
 def parse_release(
@@ -44,7 +43,7 @@ def parse_release(
     3. Otherwise run the v2 pipeline: tokenize annotate (EASY when a
        group schema is known, SHITTY otherwise) assemble score.
     """
-    parse_path = ParsePath.DIRECT
+    parse_path = TokenizationRoute.DIRECT
     # Apostrophes inside titles ("Don't", "L'avare") are common and should
     # not push the release through the AI fallback. Strip them up front so
@@ -53,11 +52,11 @@ def parse_release(
     working_name = name
     if "'" in working_name:
         working_name = working_name.replace("'", "")
-        parse_path = ParsePath.SANITIZED
+        parse_path = TokenizationRoute.SANITIZED
     clean, site_tag = _v2.strip_site_tag(working_name)
     if site_tag is not None:
-        parse_path = ParsePath.SANITIZED
+        parse_path = TokenizationRoute.SANITIZED
     if not _is_well_formed(clean, kb):
         parsed = ParsedRelease(
@@ -73,10 +72,9 @@ def parse_release(
             source=None,
             codec=None,
             group="UNKNOWN",
-            tech_string="",
             media_type=MediaTypeToken.UNKNOWN,
             site_tag=site_tag,
-            parse_path=ParsePath.AI,
+            parse_path=TokenizationRoute.AI,
         )
         report = ParseReport(
             confidence=0,

Some files were not shown because too many files have changed in this diff