Files
alfred/alfred/domain/subtitles/services/placer.py
T
francwa e07c9ec77b chore: sprint cleanup — language unification, parser unification, fossils removal
Several weeks of work accumulated without being committed. Grouped here for
clarity; see CHANGELOG.md [Unreleased] for the user-facing summary.

Highlights
----------

P1 #2 — ISO 639-2/B canonical migration
- New Language VO + LanguageRegistry (alfred/domain/shared/knowledge/).
- iso_languages.yaml as single source of truth for language codes.
- SubtitleKnowledgeBase now delegates lookup to LanguageRegistry; subtitles.yaml
  only declares subtitle-specific tokens (vostfr, vf, vff, …).
- SubtitlePreferences default → ["fre", "eng"]; subtitle filenames written as
  {iso639_2b}.srt (legacy fr.srt still read via alias).
- Scanner: dropped _LANG_KEYWORDS / _SDH_TOKENS / _FORCED_TOKENS /
  SUBTITLE_EXTENSIONS hardcoded dicts.
- Fixed: 'hi' token no longer marks SDH (conflicted with Hindi alias).
- Added settings.min_movie_size_bytes (was a module constant).

P1 #3 — Release parser unification + data-driven tokenizer
- parse_release() is now the single source of truth for release-name parsing.
- alfred/knowledge/release/separators.yaml declares the token separators used
  by the tokenizer (., space, [, ], (, ), _). New conventions can be added
  without code changes.
- Tokenizer now splits on any configured separator instead of name.split('.').
  Releases like 'The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]' parse via
  the direct path without sanitization fallback.
- Site-tag extraction always runs first; well-formedness only rejects truly
  forbidden chars.
- _parse_season_episode() extended with NxNN / NxNNxNN alt forms.
- Removed dead helpers: _sanitize, _normalize.

Domain cleanup
- Deleted fossil services with zero production callers:
    alfred/domain/movies/services.py
    alfred/domain/tv_shows/services.py
    alfred/domain/subtitles/services.py (replaced by subtitles/services/ package)
    alfred/domain/subtitles/repositories.py
- Split monolithic subtitle services into a package (identifier, matcher,
  placer, pattern_detector, utils) + dedicated knowledge/ package.
- MediaInfo split into dedicated package (alfred/domain/shared/media/:
  audio, video, subtitle, info, matching).

Persistence cleanup
- Removed dead JSON repositories (movie/subtitle/tvshow_repository.py).

Tests
- Major expansion of the test suite organized to mirror the source tree.
- Removed obsolete *_edge_cases test files superseded by structured tests.
- Suite: 990 passed, 8 skipped.

Misc
- .gitignore: exclude env_backup/ and *.bak.
- Adjustments across agent/llm, app.py, application/filesystem, and
  infrastructure/filesystem to align with the new domain layout.
2026-05-17 23:38:00 +02:00

118 lines
3.5 KiB
Python

"""SubtitlePlacer — hard-links matched subtitle tracks next to the destination video."""
import logging
import os
from dataclasses import dataclass
from pathlib import Path
from ..entities import SubtitleCandidate
logger = logging.getLogger(__name__)
def _build_dest_name(track: SubtitleCandidate, video_stem: str) -> str:
"""
Build the destination filename for a subtitle track.
Format: {video_stem}.{lang}.{ext}
{video_stem}.{lang}.sdh.{ext}
{video_stem}.{lang}.forced.{ext}
"""
from ..value_objects import SubtitleType
if not track.language or not track.format:
raise ValueError("Cannot compute destination name: language or format missing")
ext = track.format.extensions[0].lstrip(".")
parts = [video_stem, track.language.code]
if track.subtitle_type == SubtitleType.SDH:
parts.append("sdh")
elif track.subtitle_type == SubtitleType.FORCED:
parts.append("forced")
return ".".join(parts) + "." + ext
@dataclass
class PlacedTrack:
source: Path
destination: Path
filename: str
@dataclass
class PlaceResult:
placed: list[PlacedTrack]
skipped: list[tuple[SubtitleCandidate, str]] # (track, reason)
@property
def placed_count(self) -> int:
return len(self.placed)
@property
def skipped_count(self) -> int:
return len(self.skipped)
class SubtitlePlacer:
"""
Hard-links matched SubtitleCandidate files next to a destination video.
Uses the same hard-link strategy as FileManager.copy_file:
instant, no data duplication, qBittorrent keeps seeding.
Embedded tracks are skipped — nothing to place on disk.
"""
def place(
self,
tracks: list[SubtitleCandidate],
destination_video: Path,
) -> PlaceResult:
placed: list[PlacedTrack] = []
skipped: list[tuple[SubtitleCandidate, str]] = []
dest_dir = destination_video.parent
for track in tracks:
if track.is_embedded:
logger.debug(f"SubtitlePlacer: skip embedded track ({track.language})")
skipped.append((track, "embedded — no file to place"))
continue
if not track.file_path or not track.file_path.exists():
skipped.append((track, "source file not found"))
continue
try:
dest_name = _build_dest_name(track, destination_video.stem)
except ValueError as e:
skipped.append((track, str(e)))
continue
dest_path = dest_dir / dest_name
if dest_path.exists():
logger.debug(f"SubtitlePlacer: skip {dest_name} — already exists")
skipped.append((track, "destination already exists"))
continue
try:
os.link(track.file_path, dest_path)
placed.append(
PlacedTrack(
source=track.file_path,
destination=dest_path,
filename=dest_name,
)
)
logger.info(f"SubtitlePlacer: placed {dest_name}")
except OSError as e:
logger.warning(f"SubtitlePlacer: failed to place {dest_name}: {e}")
skipped.append((track, str(e)))
logger.info(
f"SubtitlePlacer: {len(placed)} placed, {len(skipped)} skipped "
f"for {destination_video.name}"
)
return PlaceResult(placed=placed, skipped=skipped)