refactor(rescan): Phase 4 Step 1 — rescan_show on v2 release repo

Rewrite rescan_show to build a SeriesRelease (Phase 1 v2 aggregate)
and persist it via DotAlfredSeriesReleaseRepository. The orchestrator
keeps reusing inspect_release as the single source of parse/probe
truth — only the assembly target changes (SeriesRelease/SeasonRelease/
EpisodeRelease instead of TVShow/Season/Episode).

New signature

    rescan_show(
        show_root,
        *,
        tmdb_id: TmdbId,
        imdb_id: ImdbId | None = None,
        series_repo: DotAlfredSeriesReleaseRepository,
        scanner,
        prober,
        kb,
    ) -> SeriesRelease

Identity is TMDB-anchored (tmdb_id required, no coercion); imdb_id is
optional. No TMDB call from rescan — the library index auto-heals
from the new sidecar on its next read.

PACK vs EPISODIC

* Single-video + season-parsed + no-episode → SeasonRelease(
  mode=PACK, folder=<season folder>, episodes=()). The slot map stays
  empty until the Phase 5 TMDB sync supplies episode_count. We do
  not fabricate an EpisodeRange we cannot prove on disk.
* Otherwise → EPISODIC: every file with (season, episode) becomes an
  EpisodeRelease with EpisodeRange(start, end) = (E, E). Multi-episode
  files (S01E01E02) still record only the first slot — Parser does
  not yet expose episode_end (existing tech debt, unchanged).
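
The rule above can be sketched in isolation. This is only an illustration under stated assumptions: `Parsed`, `Mode` and `classify` are hypothetical stand-ins, not the real `ParsedRelease`, `ReleaseMode` or orchestrator code, and ranges are shown as plain `(start, end)` tuples.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Mode(Enum):
    PACK = "pack"
    EPISODIC = "episodic"


class Parsed(NamedTuple):
    """Hypothetical stand-in for the parser output of one video file."""

    season: Optional[int]
    episode: Optional[int]


def classify(files: list[Parsed]) -> Optional[tuple[Mode, list[tuple[int, int]]]]:
    """Return (mode, episode ranges) for one season folder, or None to skip it."""
    seasons = {f.season for f in files if f.season is not None}
    if len(seasons) != 1:
        # No season parsed, or mixed season numbers: the folder is skipped.
        return None
    if len(files) == 1 and files[0].episode is None:
        # Single video, season parsed, no episode: PACK with no episodes.
        return Mode.PACK, []
    # EPISODIC: one (E, E) range per file; files without an episode are dropped.
    ranges = [(f.episode, f.episode) for f in files if f.episode is not None]
    return Mode.EPISODIC, ranges
```

For example, `classify([Parsed(1, None)])` yields the PACK case with an empty range list, while a folder holding `Parsed(1, 1)` and `Parsed(1, 2)` yields EPISODIC with one single-slot range per file.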

Package move

The orchestrator moves from alfred/application/library/ to
alfred/application/tv_shows/ for symmetry with alfred/application/
movies/ (Step 2). walker.py + its tests move with it. The empty
library/ package is deleted.

Tests

tests/application/tv_shows/test_rescan.py rewritten end-to-end on
the real v2 repository, real KB, real scanner, stubbed prober.
9 happy-path + edge-case scenarios cover EPISODIC track flattening,
PACK empty-episodes semantics, sidecar round-trip, imdb_id optional,
empty show root, season folder with no videos, prober returning None.
test_walker.py moved verbatim (import path updated).
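
The "prober returning None" scenario comes down to falling back to empty track tuples when no media info is available. A minimal sketch of that fallback, assuming hypothetical names throughout (`StubProber`, its `probe` method and `FakeMediaInfo` are illustrations, not the real `MediaProber` port or `MediaInfo` type):

```python
from typing import NamedTuple, Optional


class FakeMediaInfo(NamedTuple):
    """Hypothetical stand-in for the probe result of one file."""

    audio_tracks: tuple
    subtitle_tracks: tuple


class StubProber:
    """Hypothetical stub that always reports a failed probe."""

    def probe(self, path: str) -> Optional[FakeMediaInfo]:
        return None


def track_profile(media_info: Optional[FakeMediaInfo]) -> tuple[tuple, tuple]:
    """Return (audio, subtitles), using empty tuples when probing failed."""
    audio = media_info.audio_tracks if media_info else ()
    subtitles = media_info.subtitle_tracks if media_info else ()
    return audio, subtitles
```

With the stub, `track_profile(StubProber().probe("S01E01.mkv"))` gives two empty tuples, so an episode can still be recorded without track data.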

Full suite: 1214 passed / 10 skipped / 4 xfailed. The three v1
dot_alfred quarantines from Phase 3 stay in place until Step 3.
2026-05-25 21:07:25 +02:00
parent c22b2b78eb
commit 7da0f887e7
8 changed files with 335 additions and 283 deletions
@@ -0,0 +1,217 @@
"""``rescan_show`` — rebuild a SeriesRelease from disk and persist it.

The orchestrator walks the show folder, runs the existing release
pipeline (``inspect_release``) on every video file, and assembles the
result into a frozen :class:`SeriesRelease` written to the per-show
v2 ``.alfred`` sidecar.

Why reuse ``inspect_release``?
-------------------------------

The "fresh download" flow already parses release names, picks a main
video, runs ffprobe and refines media type. We want exactly the same
intelligence applied to library content — running it again here keeps
a single source of truth for parsing / probing rules. The orchestrator
just translates per-file :class:`InspectedResult` into release
aggregate construction.

PACK vs EPISODIC detection
---------------------------

Detection lives in this layer (until the TMDB-driven season-tracker
arrives). The current rule:

* A season folder containing exactly one video whose parser yields
  ``season is not None`` and ``episode is None`` → **PACK** with an
  empty ``episodes`` tuple. We record the season's mode + folder, but
  we cannot fill the episode slot map without TMDB's
  ``episode_count`` — that's Phase 5's job. The file is still on
  disk; the next TMDB sync repairs the slot map.
* Otherwise → **EPISODIC**: every file with a valid ``(season,
  episode)`` becomes an :class:`EpisodeRelease`. Multi-episode files
  (``S01E01E02``) are recorded once, under their first episode only,
  as ``EpisodeRange(E, E)``: the parser does not yet expose
  ``episode_end`` on ``ParsedRelease``, so a wider range cannot be
  built in Phase 4.

Season folders that match neither rule (no season parsed, mixed season
numbers) are logged and skipped, as are individual files without an
episode number in an EPISODIC folder. The walker doesn't raise on
corrupt input, and neither does the orchestrator.

TMDB
----

``rescan_show`` does **not** call TMDB. It writes the release
sidecar; the library index is updated transparently by its auto-heal
path on the next read. A subsequent TMDB sync (Phase 5) layers
identity / season cache facts on top of the on-disk truth.

Out of scope (tracked as tech debt):

* Adjacent ``.srt`` files — only embedded subtitle tracks are
  captured.
* Multi-episode files — ``ParsedRelease`` has no ``episode_end``
  field yet.
"""

from __future__ import annotations

import logging
from pathlib import Path

from alfred.application.release.inspect import inspect_release
from alfred.application.tv_shows.walker import SeasonFolder, walk_show
from alfred.domain.release.ports import ReleaseKnowledge
from alfred.domain.releases.entities import (
    EpisodeRelease,
    SeasonRelease,
    SeriesRelease,
    TrackProfile,
)
from alfred.domain.releases.value_objects import EpisodeRange, ReleaseMode
from alfred.domain.shared.media import MediaInfo
from alfred.domain.shared.ports import FilesystemScanner, MediaProber
from alfred.domain.shared.value_objects import FilePath, ImdbId, TmdbId
from alfred.domain.tv_shows.value_objects import EpisodeNumber, SeasonNumber
from alfred.infrastructure.persistence.dot_alfred.v2.repository import (
    DotAlfredSeriesReleaseRepository,
)

_LOG = logging.getLogger(__name__)


def rescan_show(
    show_root: Path,
    *,
    tmdb_id: TmdbId,
    imdb_id: ImdbId | None = None,
    series_repo: DotAlfredSeriesReleaseRepository,
    scanner: FilesystemScanner,
    prober: MediaProber,
    kb: ReleaseKnowledge,
) -> SeriesRelease:
    """Rebuild and persist the :class:`SeriesRelease` for ``show_root``.

    The show's folder name (``show_root.name``) is used as the sidecar
    location relative to the library root. TMDB identity comes from the
    caller — the orchestrator does not call TMDB.

    Returns the rebuilt frozen aggregate (also written to disk by
    ``series_repo.save``).
    """
    tree = walk_show(show_root, scanner=scanner, kb=kb)

    seasons: list[SeasonRelease] = []
    for season_folder in tree.season_folders:
        season = _ingest_season(season_folder, show_root, kb, prober)
        if season is not None:
            seasons.append(season)

    release = SeriesRelease(
        tmdb_id=tmdb_id,
        imdb_id=imdb_id,
        seasons=tuple(seasons),
    )
    series_repo.save(release, show_folder=show_root.name)
    return release


# --------------------------------------------------------------------------- #
# Per-season ingestion #
# --------------------------------------------------------------------------- #


def _ingest_season(
    season_folder: SeasonFolder,
    show_root: Path,
    kb: ReleaseKnowledge,
    prober: MediaProber,
) -> SeasonRelease | None:
    if not season_folder.video_files:
        _LOG.warning(
            "rescan_show: season folder %s contains no video file — skipping",
            season_folder.season_dir,
        )
        return None

    # Inspect every video first; we need the whole batch to decide
    # PACK vs EPISODIC before assembling the SeasonRelease.
    inspected = []
    for video_path in season_folder.video_files:
        result = inspect_release(video_path.name, video_path, kb, prober)
        inspected.append((video_path, result))

    season_numbers = {
        r.parsed.season for _, r in inspected if r.parsed.season is not None
    }
    if not season_numbers:
        _LOG.warning(
            "rescan_show: no season number parsed in %s — skipping",
            season_folder.season_dir,
        )
        return None
    if len(season_numbers) > 1:
        _LOG.warning(
            "rescan_show: mixed season numbers %s in %s — skipping",
            sorted(season_numbers),
            season_folder.season_dir,
        )
        return None

    season_number = SeasonNumber(season_numbers.pop())
    folder_name = season_folder.season_dir.name

    # Single video, no episode → PACK with empty episodes. We can't
    # synthesize an EpisodeRange without TMDB's episode_count; the
    # Phase 5 sync repairs the slot map. Track info from the PACK
    # file is intentionally not persisted here — re-derivable on the
    # next rescan after the sync fills the range.
    if len(inspected) == 1 and inspected[0][1].parsed.episode is None:
        return SeasonRelease(
            season_number=season_number,
            folder=folder_name,
            mode=ReleaseMode.PACK,
            episodes=(),
        )

    # EPISODIC: every file with a parseable episode number becomes an
    # EpisodeRelease. Files without an episode number are logged and
    # dropped (a mixed PACK/EPISODIC folder is malformed).
    episodes: list[EpisodeRelease] = []
    for video_path, result in inspected:
        if result.parsed.episode is None:
            _LOG.warning(
                "rescan_show: no episode number parsed for %s — skipping",
                video_path,
            )
            continue
        episodes.append(
            _make_episode_release(
                episode_number=EpisodeNumber(result.parsed.episode),
                video_path=video_path,
                show_root=show_root,
                media_info=result.media_info,
            )
        )

    return SeasonRelease(
        season_number=season_number,
        folder=folder_name,
        mode=ReleaseMode.EPISODIC,
        episodes=tuple(episodes),
    )


def _make_episode_release(
    *,
    episode_number: EpisodeNumber,
    video_path: Path,
    show_root: Path,
    media_info: MediaInfo | None,
) -> EpisodeRelease:
    # Paths are stored relative to the show root.
    rel_path = video_path.relative_to(show_root)
    # A failed probe (media_info is None) yields empty track tuples.
    audio_tracks = media_info.audio_tracks if media_info else ()
    subtitle_tracks = media_info.subtitle_tracks if media_info else ()
    return EpisodeRelease(
        episodes=EpisodeRange(start=episode_number, end=episode_number),
        file_path=FilePath(str(rel_path)),
        tracks=TrackProfile(
            audio_tracks=audio_tracks,
            subtitle_tracks=subtitle_tracks,
        ),
    )