chore: sprint cleanup — language unification, parser unification, fossils removal

Several weeks of work accumulated without being committed. Grouped here for
clarity; see CHANGELOG.md [Unreleased] for the user-facing summary.

Highlights
----------

P1 #2 — ISO 639-2/B canonical migration
- New Language VO + LanguageRegistry (alfred/domain/shared/knowledge/).
- iso_languages.yaml as single source of truth for language codes.
- SubtitleKnowledgeBase now delegates lookup to LanguageRegistry; subtitles.yaml
  only declares subtitle-specific tokens (vostfr, vf, vff, …).
- SubtitlePreferences default → ["fre", "eng"]; subtitle filenames written as
  {iso639_2b}.srt (legacy fr.srt still read via alias).
- Scanner: dropped _LANG_KEYWORDS / _SDH_TOKENS / _FORCED_TOKENS /
  SUBTITLE_EXTENSIONS hardcoded dicts.
- Fixed: 'hi' token no longer marks SDH (conflicted with Hindi alias).
- Added settings.min_movie_size_bytes (was a module constant).
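
The alias lookup described in this section could be sketched roughly as follows. This is a minimal illustration, not the actual Language / LanguageRegistry code: the class shapes, the `resolve` method, and the `subtitle_filename` helper are assumptions, and the three entries are hardcoded here rather than loaded from iso_languages.yaml.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Language:
    """Hypothetical value object keyed by its ISO 639-2/B code."""

    iso639_2b: str
    aliases: Tuple[str, ...] = ()


class LanguageRegistry:
    """Hypothetical registry resolving any known token to a Language."""

    def __init__(self, languages: List[Language]) -> None:
        # Index the canonical code and every alias, case-insensitively.
        self._index: Dict[str, Language] = {}
        for lang in languages:
            for token in (lang.iso639_2b, *lang.aliases):
                self._index[token.lower()] = lang

    def resolve(self, token: str) -> Optional[Language]:
        return self._index.get(token.lower())


# Hardcoded stand-in for iso_languages.yaml (assumed content).
REGISTRY = LanguageRegistry(
    [
        Language("fre", ("fr", "fra", "french")),
        Language("eng", ("en", "english")),
        Language("hin", ("hi", "hindi")),
    ]
)


def subtitle_filename(token: str) -> str:
    """Build the canonical '{iso639_2b}.srt' name from any known token."""
    lang = REGISTRY.resolve(token)
    if lang is None:
        raise ValueError(f"unknown language token: {token!r}")
    return f"{lang.iso639_2b}.srt"
```

With an index like this, the legacy `fr` token resolves to `fre`, and `hi` resolves to Hindi, which is consistent with the fix that stops treating `hi` as an SDH marker.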

P1 #3 — Release parser unification + data-driven tokenizer
- parse_release() is now the single source of truth for release-name parsing.
- alfred/knowledge/release/separators.yaml declares the token separators used
  by the tokenizer (., space, [, ], (, ), _). New conventions can be added
  without code changes.
- Tokenizer now splits on any configured separator instead of name.split('.').
  Releases like 'The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]' parse via
  the direct path without sanitization fallback.
- Site-tag extraction always runs first; well-formedness only rejects truly
  forbidden chars.
- _parse_season_episode() extended with NxNN / NxNNxNN alt forms.
- Removed dead helpers: _sanitize, _normalize.
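
The separator-driven split can be illustrated with a short sketch. It is not the project's tokenizer: the `SEPARATORS` list is hardcoded from the characters named in this section instead of being read from separators.yaml, and `tokenize` is a hypothetical name. It also leaves out the site-tag extraction step, which the real parser runs first.

```python
import re
from typing import List

# Separator characters listed in the commit message; the real list is
# declared in separators.yaml (hardcoded here as an assumption).
SEPARATORS = [".", " ", "[", "]", "(", ")", "_"]

# Character class built from the configured separators; re.escape keeps
# characters such as '[' and '.' literal inside the class.
_SPLIT_RE = re.compile("[" + "".join(re.escape(s) for s in SEPARATORS) + "]+")


def tokenize(name: str) -> List[str]:
    """Split a release name on any run of configured separators."""
    return [tok for tok in _SPLIT_RE.split(name) if tok]
```

Note that a naive split like this also breaks `5.1` and `YTS.MX` into two tokens each; how the project's parser handles those cases is not shown here.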

Domain cleanup
- Deleted fossil services with zero production callers:
    alfred/domain/movies/services.py
    alfred/domain/tv_shows/services.py
    alfred/domain/subtitles/services.py (replaced by subtitles/services/ package)
    alfred/domain/subtitles/repositories.py
- Split monolithic subtitle services into a package (identifier, matcher,
  placer, pattern_detector, utils) + dedicated knowledge/ package.
- MediaInfo split into dedicated package (alfred/domain/shared/media/:
  audio, video, subtitle, info, matching).

Persistence cleanup
- Removed dead JSON repositories (movie/subtitle/tvshow_repository.py).

Tests
- Major expansion of the test suite organized to mirror the source tree.
- Removed obsolete *_edge_cases test files superseded by structured tests.
- Suite: 990 passed, 8 skipped.

Misc
- .gitignore: exclude env_backup/ and *.bak.
- Adjustments across agent/llm, app.py, application/filesystem, and
  infrastructure/filesystem to align with the new domain layout.
2026-05-17 23:38:00 +02:00
parent ba6f016d49
commit e07c9ec77b
99 changed files with 8833 additions and 6533 deletions
@@ -0,0 +1,298 @@
"""Tests for ``alfred.agent.llm.ollama.OllamaClient``.

Thin wrapper around Ollama's local ``/api/chat`` endpoint. The client
validates message shape, POSTs JSON without auth, and translates
``requests`` exceptions into ``LLMAPIError``.

Coverage:

- ``TestInit``: explicit args win; missing base_url / model raise
  ``LLMConfigurationError``; temperature defaults from settings.
- ``TestCompleteValidation``: same shape checks as DeepSeek (empty, bad
  element, missing role, invalid role, missing content; tool role is
  exempt).
- ``TestCompleteHappyPath``: POSTs to ``/api/chat`` with proper payload
  (no auth header), returns ``data.message`` verbatim, threads tools.
- ``TestCompleteErrors``: Timeout, HTTPError (with/without JSON body),
  RequestException, missing ``message`` field all wrapped as ``LLMAPIError``.
- ``TestListModels``: happy path returns model names; an empty response or
  a failure returns ``[]``.
- ``TestIsAvailable``: 200 → True; non-200 or exception → False.
"""

from __future__ import annotations

from unittest.mock import MagicMock, patch

import pytest
from requests.exceptions import HTTPError, RequestException, Timeout

from alfred.agent.llm.exceptions import LLMAPIError, LLMConfigurationError
from alfred.agent.llm.ollama import OllamaClient
from alfred.settings import Settings


def _settings(**overrides) -> Settings:
    base = {
        "ollama_base_url": "http://ollama.test:11434",
        "ollama_model": "llama3.3:latest",
        "request_timeout": 30,
        "llm_temperature": 0.3,
    }
    base.update(overrides)
    return Settings(**base)


# --------------------------------------------------------------------------- #
# Init #
# --------------------------------------------------------------------------- #


class TestInit:
    def test_defaults_from_settings(self):
        c = OllamaClient(settings=_settings())
        assert c.base_url == "http://ollama.test:11434"
        assert c.model == "llama3.3:latest"
        assert c.timeout == 30
        assert c.temperature == 0.3

    def test_explicit_args_override(self):
        c = OllamaClient(
            base_url="http://other:9999",
            model="mistral",
            timeout=120,
            temperature=0.0,
            settings=_settings(),
        )
        assert c.base_url == "http://other:9999"
        assert c.model == "mistral"
        assert c.timeout == 120
        assert c.temperature == 0.0

    def test_zero_temperature_explicit_respected(self):
        # 0.0 is falsy; the implementation guards against this with a
        # ``is not None`` check.
        c = OllamaClient(temperature=0.0, settings=_settings())
        assert c.temperature == 0.0

    def test_missing_base_url_raises(self):
        with pytest.raises(LLMConfigurationError, match="base URL"):
            OllamaClient(settings=_settings(ollama_base_url=""))

    def test_missing_model_raises(self):
        with pytest.raises(LLMConfigurationError, match="model"):
            OllamaClient(settings=_settings(ollama_model=""))


# --------------------------------------------------------------------------- #
# complete — message validation #
# --------------------------------------------------------------------------- #


@pytest.fixture
def client():
    return OllamaClient(settings=_settings())


class TestCompleteValidation:
    def test_empty_messages_raises(self, client):
        with pytest.raises(ValueError, match="empty"):
            client.complete([])

    def test_non_dict_element_raises(self, client):
        with pytest.raises(ValueError, match="must be a dict"):
            client.complete(["nope"])  # type: ignore[list-item]

    def test_missing_role_raises(self, client):
        with pytest.raises(ValueError, match="'role' key"):
            client.complete([{"content": "hi"}])

    def test_invalid_role_raises(self, client):
        with pytest.raises(ValueError, match="Invalid role"):
            client.complete([{"role": "bogus", "content": "x"}])

    def test_missing_content_for_non_tool_role_raises(self, client):
        with pytest.raises(ValueError, match="'content' key"):
            client.complete([{"role": "assistant"}])

    def test_tool_role_allowed_without_content(self, client):
        with patch("alfred.agent.llm.ollama.requests.post") as mock_post:
            mock_post.return_value = MagicMock(
                raise_for_status=MagicMock(),
                json=MagicMock(
                    return_value={"message": {"role": "assistant", "content": "ok"}}
                ),
            )
            out = client.complete([{"role": "tool", "tool_call_id": "a"}])
        assert out["content"] == "ok"


# --------------------------------------------------------------------------- #
# complete — happy path #
# --------------------------------------------------------------------------- #


class TestCompleteHappyPath:
    def test_posts_to_api_chat_with_payload(self, client):
        with patch("alfred.agent.llm.ollama.requests.post") as mock_post:
            mock_post.return_value = MagicMock(
                raise_for_status=MagicMock(),
                json=MagicMock(
                    return_value={"message": {"role": "assistant", "content": "hi"}}
                ),
            )
            client.complete([{"role": "user", "content": "hello"}])
        args, kwargs = mock_post.call_args
        assert args[0] == "http://ollama.test:11434/api/chat"
        assert kwargs["timeout"] == 30
        payload = kwargs["json"]
        assert payload["model"] == "llama3.3:latest"
        assert payload["stream"] is False
        assert payload["options"] == {"temperature": 0.3}
        assert payload["messages"] == [{"role": "user", "content": "hello"}]
        assert "tools" not in payload
        # No Authorization header: Ollama is unauthenticated locally.
        assert "headers" not in kwargs or "Authorization" not in (
            kwargs.get("headers") or {}
        )

    def test_returns_message_verbatim(self, client):
        message = {"role": "assistant", "content": "answer"}
        with patch("alfred.agent.llm.ollama.requests.post") as mock_post:
            mock_post.return_value = MagicMock(
                raise_for_status=MagicMock(),
                json=MagicMock(return_value={"message": message}),
            )
            out = client.complete([{"role": "user", "content": "q"}])
        assert out == message

    def test_tools_threaded_into_payload(self, client):
        tools = [{"type": "function", "function": {"name": "x"}}]
        with patch("alfred.agent.llm.ollama.requests.post") as mock_post:
            mock_post.return_value = MagicMock(
                raise_for_status=MagicMock(),
                json=MagicMock(
                    return_value={"message": {"role": "assistant", "content": ""}}
                ),
            )
            client.complete([{"role": "user", "content": "q"}], tools=tools)
        assert mock_post.call_args.kwargs["json"]["tools"] == tools


# --------------------------------------------------------------------------- #
# complete — errors #
# --------------------------------------------------------------------------- #


class TestCompleteErrors:
    def test_timeout_wrapped(self, client):
        with patch(
            "alfred.agent.llm.ollama.requests.post", side_effect=Timeout("t")
        ):
            with pytest.raises(LLMAPIError, match="timeout"):
                client.complete([{"role": "user", "content": "q"}])

    def test_http_error_with_json_body(self, client):
        resp = MagicMock()
        resp.json.return_value = {"error": "model not found"}
        err = HTTPError("404")
        err.response = resp
        post_resp = MagicMock(raise_for_status=MagicMock(side_effect=err))
        with patch("alfred.agent.llm.ollama.requests.post", return_value=post_resp):
            with pytest.raises(LLMAPIError, match="model not found"):
                client.complete([{"role": "user", "content": "q"}])

    def test_http_error_with_non_json_body(self, client):
        resp = MagicMock()
        resp.json.side_effect = ValueError("not json")
        err = HTTPError("boom")
        err.response = resp
        post_resp = MagicMock(raise_for_status=MagicMock(side_effect=err))
        with patch("alfred.agent.llm.ollama.requests.post", return_value=post_resp):
            with pytest.raises(LLMAPIError, match="Ollama API error"):
                client.complete([{"role": "user", "content": "q"}])

    def test_http_error_without_response(self, client):
        err = HTTPError("boom")
        err.response = None
        post_resp = MagicMock(raise_for_status=MagicMock(side_effect=err))
        with patch("alfred.agent.llm.ollama.requests.post", return_value=post_resp):
            with pytest.raises(LLMAPIError, match="HTTP error"):
                client.complete([{"role": "user", "content": "q"}])

    def test_request_exception_wrapped(self, client):
        with patch(
            "alfred.agent.llm.ollama.requests.post",
            side_effect=RequestException("conn refused"),
        ):
            with pytest.raises(LLMAPIError, match="Failed to connect"):
                client.complete([{"role": "user", "content": "q"}])

    def test_missing_message_field_raises(self, client):
        with patch("alfred.agent.llm.ollama.requests.post") as mock_post:
            mock_post.return_value = MagicMock(
                raise_for_status=MagicMock(),
                json=MagicMock(return_value={}),
            )
            with pytest.raises(LLMAPIError, match="missing 'message'"):
                client.complete([{"role": "user", "content": "q"}])


# --------------------------------------------------------------------------- #
# list_models #
# --------------------------------------------------------------------------- #


class TestListModels:
    def test_returns_model_names(self, client):
        with patch("alfred.agent.llm.ollama.requests.get") as mock_get:
            mock_get.return_value = MagicMock(
                raise_for_status=MagicMock(),
                json=MagicMock(
                    return_value={
                        "models": [
                            {"name": "llama3.3:latest"},
                            {"name": "mistral:7b"},
                        ]
                    }
                ),
            )
            assert client.list_models() == ["llama3.3:latest", "mistral:7b"]

    def test_no_models_returns_empty(self, client):
        with patch("alfred.agent.llm.ollama.requests.get") as mock_get:
            mock_get.return_value = MagicMock(
                raise_for_status=MagicMock(),
                json=MagicMock(return_value={}),
            )
            assert client.list_models() == []

    def test_failure_returns_empty(self, client):
        with patch(
            "alfred.agent.llm.ollama.requests.get",
            side_effect=RequestException("offline"),
        ):
            assert client.list_models() == []


# --------------------------------------------------------------------------- #
# is_available #
# --------------------------------------------------------------------------- #


class TestIsAvailable:
    def test_returns_true_on_200(self, client):
        with patch("alfred.agent.llm.ollama.requests.get") as mock_get:
            mock_get.return_value = MagicMock(status_code=200)
            assert client.is_available() is True

    def test_returns_false_on_non_200(self, client):
        with patch("alfred.agent.llm.ollama.requests.get") as mock_get:
            mock_get.return_value = MagicMock(status_code=503)
            assert client.is_available() is False

    def test_returns_false_on_exception(self, client):
        with patch(
            "alfred.agent.llm.ollama.requests.get",
            side_effect=RequestException("down"),
        ):
            assert client.is_available() is False