refactor(language): LanguageRepository port + SubtitleKnowledgeBase wired to it

Mirror the MediaProber / FilesystemScanner pattern for language lookup:

- New Protocol `LanguageRepository` in alfred.domain.shared.ports
  covering from_iso, from_any, all, __contains__, __len__ — the
  surface previously coupled to the concrete LanguageRegistry.
- SubtitleKnowledgeBase types its `language_registry` parameter
  against the Protocol; the concrete LanguageRegistry stays in
  infrastructure as the YAML-backed adapter and remains the default
  when no repository is injected.
- New unit tests in tests/infrastructure/test_language_registry.py
  cover the adapter surface (from_iso, from_any, membership,
  case-insensitivity, non-string inputs).

Behaviour is unchanged for existing callers. The split opens the
door to in-memory fakes in future tests without loading the full
ISO 639 YAML.
This commit is contained in:
2026-05-20 23:18:25 +02:00
parent 19fe8a519a
commit 18267d0165
5 changed files with 142 additions and 3 deletions
+2
View File
@@ -7,11 +7,13 @@ Protocol without going through real I/O.
"""
from .filesystem_scanner import FileEntry, FilesystemScanner
from .language_repository import LanguageRepository
from .media_prober import MediaProber, SubtitleStreamInfo
__all__ = [
"FileEntry",
"FilesystemScanner",
"LanguageRepository",
"MediaProber",
"SubtitleStreamInfo",
]
@@ -0,0 +1,36 @@
"""LanguageRepository port — abstracts canonical language lookup.
The adapter (typically loading from ISO 639 YAML knowledge) maps a wide
range of raw forms (codes, English/native names, aliases) onto the
canonical :class:`Language` value object. Domain code accepts the port
via constructor injection; tests can pass a small in-memory fake.
"""
from __future__ import annotations
from typing import Protocol
from alfred.domain.shared.value_objects import Language
class LanguageRepository(Protocol):
"""Canonical language lookup."""
def from_iso(self, code: str) -> Language | None:
"""Look up by canonical ISO 639-2/B code (case-insensitive)."""
...
def from_any(self, raw: str) -> Language | None:
"""Look up by any known representation: ISO code, name, alias.
Case-insensitive. Returns ``None`` when the raw form is unknown.
"""
...
def all(self) -> list[Language]:
"""Return all known languages, in a stable order."""
...
def __contains__(self, raw: str) -> bool: ...
def __len__(self) -> int: ...