The four resolve_*_destination use cases now route through a private
_resolve_parsed helper that picks the right entry point:
- source path provided AND it exists -> inspect_release(name, path)
runs the full pipeline (parse + media-type refinement + probe
+ enrich), so missing tech tokens (quality, codec, ...) get
filled by ffprobe and the refreshed tech_string lands in the
destination folder / file names.
- source path missing or absent -> parse_release(name) only,
same behavior as before. Back-compat: tests using fake /dl/*.mkv
paths still pass unchanged.
resolve_episode_destination / resolve_movie_destination reuse their
existing source_file parameter as the inspection target. The two
folder-move use cases (season / series) gain a new OPTIONAL
source_path parameter — threaded through the agent tool wrappers
and documented in the YAML specs.
The lazy import inside _resolve_parsed avoids a circular import:
inspect_release imports detect_media_type / enrich_from_probe from
the same application.filesystem package whose __init__ re-exports
resolve_destination.
Three new tests in TestProbeEnrichmentWiring with a stub MediaProber
prove the wiring: movie picks up probe quality, season picks it up
via source_path, and a missing path correctly skips probe (back-compat
guard).
New application-layer entry point that composes the four inspection
layers in one call:
1. parse_release(name, kb) -> (ParsedRelease, ParseReport)
2. detect_media_type(parsed, path, kb) -> patch parsed.media_type
3. find_main_video(path, kb) -> Path | None (top-level scan)
4. prober.probe(video) + enrich -> when video exists and
media_type not in
{unknown, other}
Returns a frozen InspectedResult(parsed, report, source_path,
main_video, media_info, probe_used). kb and prober are injected — no
module-level singletons in inspect.py.
analyze_release tool now delegates to inspect_release; its output
gains two fields, confidence (0-100) and road (easy/shitty/path_of_pain),
surfaced from ParseReport so the LLM can route by confidence. Spec
updated to document them.
12 new tests covering happy paths, probe gating (no video, media_type
'other', probe failure), mutation contract (detect refining
parsed.media_type, enrich filling None fields), resilience
(nonexistent path), and frozen contract. Suite: 1058 passing.
Add probe(video) -> MediaInfo | None to the MediaProber Protocol and
implement it on FfprobeMediaProber. The standalone
alfred/infrastructure/filesystem/ffprobe.py module is removed; all
callers (analyze_release / probe_media tools, testing scripts) now go
through the adapter.
Tests for the probe path moved to tests/infrastructure/test_ffprobe_prober.py
(patching subprocess.run at the adapter module level).
Unblocks the upcoming inspect_release orchestrator, which needs the
port — not a free function — to compose parse + main-video selection
+ probe in one shot.
Wire the scoring foundations into the parser entry point. parse_release
now returns a tuple — the structural ParsedRelease and a diagnostic
ParseReport carrying confidence (0-100), road
(EASY / SHITTY / PATH_OF_PAIN), the residual UNKNOWN tokens, and the
list of critical fields that couldn't be filled.
EASY is decided structurally (a group schema matched), independently
of the score. SHITTY vs PATH_OF_PAIN is decided by score against the
60 cutoff from scoring.yaml. Malformed names (forbidden chars) emit a
zero-confidence PoP report and short-circuit to parse_path=AI as
before.
ParsePath stays as-is (DIRECT / SANITIZED / AI) — it records *how* we
tokenized, not how confident we are. The two dimensions are now
properly separated.
Call sites propagated:
- alfred/application/filesystem/resolve_destination.py (4 occurrences)
- alfred/agent/tools/filesystem.py
- tests/domain/test_release.py
- tests/domain/test_release_fixtures.py
- tests/application/test_detect_media_type.py
New tests/domain/release/test_parser_v2_scoring.py (22 cases) locks
ParseReport validation, compute_score arithmetic, decide_road
thresholding, the collector helpers, and the end-to-end tuple contract.
Wires the new explicit-kb signatures into every caller of the release
parser and the filesystem-extension helpers.
- application/filesystem/resolve_destination.py: module-level singleton
_KB: ReleaseKnowledge = YamlReleaseKnowledge(); each use case now calls
parse_release(release_name, _KB) and sanitizes TMDB strings via
_KB.sanitize_for_fs(...) before passing them to the pure ParsedRelease
builders. Local _sanitize helper + _WIN_FORBIDDEN regex dropped.
- application/filesystem/detect_media_type.py: signature is now
detect_media_type(parsed, source_path, kb); uses kb.metadata_extensions,
kb.video_extensions, kb.non_video_extensions.
- infrastructure/filesystem/find_video.py: find_video_file(path, kb) uses
kb.video_extensions instead of an imported constant.
- agent/tools/filesystem.py::analyze_release imports the application _KB
singleton and passes it through to parse_release / detect_media_type /
find_video_file.
Several weeks of work accumulated without being committed. Grouped here for
clarity; see CHANGELOG.md [Unreleased] for the user-facing summary.
Highlights
----------
P1 #2 — ISO 639-2/B canonical migration
- New Language VO + LanguageRegistry (alfred/domain/shared/knowledge/).
- iso_languages.yaml as single source of truth for language codes.
- SubtitleKnowledgeBase now delegates lookup to LanguageRegistry; subtitles.yaml
only declares subtitle-specific tokens (vostfr, vf, vff, …).
- SubtitlePreferences default → ["fre", "eng"]; subtitle filenames written as
{iso639_2b}.srt (legacy fr.srt still read via alias).
- Scanner: dropped _LANG_KEYWORDS / _SDH_TOKENS / _FORCED_TOKENS /
SUBTITLE_EXTENSIONS hardcoded dicts.
- Fixed: 'hi' token no longer marks SDH (conflicted with Hindi alias).
- Added settings.min_movie_size_bytes (was a module constant).
P1 #3 — Release parser unification + data-driven tokenizer
- parse_release() is now the single source of truth for release-name parsing.
- alfred/knowledge/release/separators.yaml declares the token separators used
by the tokenizer (., space, [, ], (, ), _). New conventions can be added
without code changes.
- Tokenizer now splits on any configured separator instead of name.split('.').
Releases like 'The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]' parse via
the direct path without sanitization fallback.
- Site-tag extraction always runs first; well-formedness only rejects truly
forbidden chars.
- _parse_season_episode() extended with NxNN / NxNNxNN alt forms.
- Removed dead helpers: _sanitize, _normalize.
Domain cleanup
- Deleted fossil services with zero production callers:
alfred/domain/movies/services.py
alfred/domain/tv_shows/services.py
alfred/domain/subtitles/services.py (replaced by subtitles/services/ package)
alfred/domain/subtitles/repositories.py
- Split monolithic subtitle services into a package (identifier, matcher,
placer, pattern_detector, utils) + dedicated knowledge/ package.
- MediaInfo split into dedicated package (alfred/domain/shared/media/:
audio, video, subtitle, info, matching).
Persistence cleanup
- Removed dead JSON repositories (movie/subtitle/tvshow_repository.py).
Tests
- Major expansion of the test suite organized to mirror the source tree.
- Removed obsolete *_edge_cases test files superseded by structured tests.
- Suite: 990 passed, 8 skipped.
Misc
- .gitignore: exclude env_backup/ and *.bak.
- Adjustments across agent/llm, app.py, application/filesystem, and
infrastructure/filesystem to align with the new domain layout.
- Extract MetadataStore from SubtitleMetadataStore (alfred/infrastructure/metadata/).
Generic load/save + typed update helpers (update_parse, update_probe, update_tmdb)
for the per-release .alfred/metadata.yaml.
- SubtitleMetadataStore becomes a thin facade — owns subtitle_history shape,
delegates I/O to MetadataStore.
- Agent._execute_tool_call auto-persists successful analyze_release / probe_media /
find_media_imdb_id results to the release's .alfred file. find_media_imdb_id
follows release_focus when it has no path argument.
- New tools:
· read_release_metadata(release_path) — cacheable, key=release_path.
Returns the .alfred content or has_metadata=false.
· query_library(name) — substring scan across configured library roots.
- Both new tools added to CORE_TOOLS (always visible).
Adds two STM components and a transparent cache hook in the agent loop so
read-only tools don't re-do work the agent already did in this session.
New STM components:
- ToolResultsCache — {tool_name: {key: result}}, session-scoped.
to_dict() exposes only the key inventory (not payloads) to keep the
prompt cheap.
- ReleaseFocus — current_release_path + working_set list, updated
automatically when a path-keyed inspector runs.
YAML spec layer:
- New optional 'cache: { key: <param_name> }' block in ToolSpec.
- Validated at load time: cache.key must be a declared parameter.
- Surfaced on Tool dataclass as cache_key: str | None.
Agent._execute_tool_call:
- Pre-exec cache lookup; hit short-circuits and adds _from_cache=true.
- Post-exec: stores successful results, updates release_focus for
path-keyed tools, refreshes episodic.last_search_results when
find_torrent's hit served the response (so get_torrent_by_index
keeps pointing at the right list).
Cacheable tools (5): analyze_release, probe_media, list_folder,
find_media_imdb_id, find_torrent.
Adds YAML specs for the 14 tools that were still description-from-docstring:
filesystem:
- set_path_for_folder, list_folder, analyze_release, probe_media,
move_media, manage_subtitles, create_seed_links, learn
api:
- find_media_imdb_id, find_torrent, get_torrent_by_index,
add_torrent_to_qbittorrent, add_torrent_by_index
language:
- set_language
Each spec follows the established shape (summary / description /
when_to_use / when_not_to_use / next_steps / parameters with
why_needed + example / returns) and the Python function docstring is
slimmed to a one-line pointer.
Registry now reports: 21 tools, 21 with YAML spec, 0 doc-only.
The Workflow STM component stored an active workflow as
{type, target, stage, started_at}. Now that start_workflow takes a
workflow_name and a params dict, those keys match what they actually
hold:
type -> name (the YAML workflow name, e.g. media.organize_media)
target -> params (the dict passed to start_workflow)
ShortTermMemory.start_workflow parameters renamed accordingly. All
consumers (prompt builder workflow scope + STM context, start/end
workflow tools) updated.
Introduce a scope-aware agent so the LLM never sees the full 21-tool
catalog at once. The system prompt now describes either:
- idle mode: core noyau (5 tools: set_language, set_path_for_folder,
list_folder, start_workflow, end_workflow) + a list of available
workflows with their goals;
- active mode: the noyau plus the tools declared by the active
workflow's YAML, with the step plan inlined into the prompt.
Pieces:
- alfred/agent/tools/workflow.py: start_workflow / end_workflow tools
(with YAML specs under tools/specs/) that drive memory.stm.workflow.
- alfred/agent/prompt.py: CORE_TOOLS constant, visible_tool_names(),
filtered build_tools_spec() / _format_tools_description(), and a new
_format_workflow_scope() section in the system prompt.
- alfred/agent/agent.py: WorkflowLoader wired into Agent, defensive
out-of-scope check in _execute_tool_call.
- alfred/agent/registry.py: registers the two new meta-tools (21 total,
7 with YAML spec).
- workflows/media.organize_media.yaml: tools/steps list refreshed to
match the current resolver split (analyze_release, probe_media,
resolve_*_destination, move_to_destination).
Rename workflow files and their 'name' field with a 'media.' domain
prefix to anticipate future multi-domain expansion (mail.*, calendar.*, ...).
- organize_media -> media.organize_media
- manage_subtitles -> media.manage_subtitles
WorkflowLoader picks them up unchanged (uses data['name']).
The ParameterSchema / REQUIRED_PARAMETERS / get_missing_required_parameters
machinery in alfred/agent/parameters.py was used in early prototypes for
the prompt-required-params check but has been unwired from production for
several refactors. The new YAML tool-spec layer (alfred/agent/tools/specs/)
covers the same need (rich, LLM-facing parameter descriptions) without
the parallel registration plumbing.
Tests in tests/test_config_edge_cases.py still reference the deleted
module — left untouched per the project policy of treating test sync
as a dedicated end-of-week task.
Introduce a first-class semantic layer for tool descriptions, separated
from Python signatures (which stay the source of truth for types and
required-ness).
New
- alfred/agent/tools/spec.py — ToolSpec / ParameterSpec / ReturnsSpec
dataclasses with strict YAML validation (ToolSpecError on malformed
or inconsistent specs). compile_description() builds the rich text
passed to the LLM as Tool.description, with sections for summary,
description, when_to_use, when_not_to_use, next_steps, and returns.
compile_parameter_description() injects the 'why_needed' field next
to each parameter so the LLM sees the *intent* of each argument.
- alfred/agent/tools/spec_loader.py — discovers tools/specs/*.yaml,
enforces filename ↔ spec.name match, rejects duplicates.
- alfred/agent/tools/specs/ — one YAML per tool:
* resolve_season_destination.yaml
* resolve_episode_destination.yaml
* resolve_movie_destination.yaml
* resolve_series_destination.yaml
* move_to_destination.yaml
Refactor
- alfred/agent/registry.py
* _create_tool_from_function now takes an optional ToolSpec.
When provided, the long description + per-parameter descriptions
come from the spec; types and required-ness still come from the
Python signature.
* Cross-validates spec.parameters against the function signature —
crashes on missing or extra entries.
* make_tools() loads all specs at startup and hands the right one
to each tool. Tools without a spec fall back to the old
docstring-only behaviour, so the 14 not-yet-migrated tools keep
working unchanged.
* Adds 'array' and 'object' to the Python→JSON type mapping and
handles Optional[X] / X | None annotations.
- alfred/agent/tools/filesystem.py
* Drops the '_tool' suffix on the 4 resolve_* wrappers (option 1:
alias the use-case imports as _resolve_*). Tool names exposed to
the LLM now match the underlying use case verbatim.
* Wrapper docstrings shrink to a one-liner pointing to the YAML
spec — no more duplicated when_to_use/Args/Returns in Python.
Verified
- make_tools() loads 19 tools (5 with YAML spec, 14 doc-only).
- Compiled descriptions render cleanly with all sections.
Destination resolution
- Replace the single ResolveDestinationUseCase with four dedicated
functions, one per release type:
resolve_season_destination (pack season, folder move)
resolve_episode_destination (single episode, file move)
resolve_movie_destination (movie, file move)
resolve_series_destination (multi-season pack, folder move)
- Each returns a dedicated DTO carrying only the fields relevant to
that release type — no more polymorphic ResolvedDestination with
half the fields unused depending on the case.
- Looser series folder matching: exact computed-name match is reused
silently; any deviation (different group, multiple candidates) now
prompts the user with all options including the computed name.
Agent tools
- Four new tools wrapping the use cases above; old resolve_destination
removed from the registry.
- New move_to_destination tool: create_folder + move, chained — used
after a resolve_* call to perform the actual relocation.
- Low-level filesystem_operations module (create_folder, move via mv)
for instant same-FS renames (ZFS).
Prompt & persona
- New PromptBuilder (alfred/agent/prompt.py) replacing prompts.py:
identity + personality block, situational expressions, memory
schema, episodic/STM/config context, tool catalogue.
- Per-user expression system: knowledge/users/common.yaml +
{username}.yaml are merged at runtime; one phrase per situation
(greeting/success/error/...) is sampled into the system prompt.
qBittorrent integration
- Credentials now come from settings (qbittorrent_url/username/password)
instead of hardcoded defaults.
- New client methods: find_by_name, set_location, recheck — the trio
needed to update a torrent's save path and re-verify after a move.
- Host→container path translation settings (qbittorrent_host_path /
qbittorrent_container_path) for docker-mounted setups.
Subtitles
- Identifier: strip parenthesized qualifiers (simplified, brazil…) at
tokenization; new _tokenize_suffix used for the episode_subfolder
pattern so episode-stem tokens no longer pollute language detection.
- Placer: extract _build_dest_name so it can be reused by the new
dry_run path in ManageSubtitlesUseCase.
- Knowledge: add yue, ell, ind, msa, rus, vie, heb, tam, tel, tha,
hin, ukr; add 'fre' to fra; add 'simplified'/'traditional' to zho.
Misc
- LTM workspace: add 'trash' folder slot.
- Default LLM provider switched to deepseek.
- testing/debug_release.py: CLI to parse a release, hit TMDB, and
dry-run the destination resolution end-to-end.