alfred

Author	SHA1	Message	Date
francwa	ced72547f7	refactor(knowledge): extract YAML loaders from domain to infrastructure The domain layer no longer reads YAML files. All knowledge loaders move from `alfred/domain//knowledge/` to `alfred/infrastructure/knowledge/`: domain/release/knowledge.py → infrastructure/knowledge/release.py domain/shared/knowledge/language_registry.py → infrastructure/knowledge/language_registry.py domain/subtitles/knowledge/{loader,base}.py → infrastructure/knowledge/subtitles/{loader,base}.py Callers in domain/release/{services,value_objects}.py, domain/subtitles/{aggregates,services/}.py, and application/filesystem/manage_subtitles.py updated to absolute imports. Re-exports of KnowledgeLoader/SubtitleKnowledgeBase from domain/subtitles/__init__.py dropped (no shim per project convention). Tests follow the moved targets.	2026-05-19 14:35:18 +02:00
francwa	eb8995cfc3	refactor(subtitles): drop dead scanner module SubtitleScanner was an earlier iteration superseded by SubtitleIdentifier and never imported in production code (only by its own tests). Removing both keeps the bounded context clean and shrinks the surface.	2026-05-19 14:17:15 +02:00
francwa	273510dff8	test(fixtures): seed PATH OF PAIN bucket with 10 worst-case fixtures 10 pathological release names mined from the real downloads folder. Each fixture locks in the current parse_release output (including its silent losses and false positives) so future parser improvements are intentional, not silent drift. Cases: - Khruangbin yt-dlp slug (UTF-8 wide pipe '｜', YT ID as group) - Deutschland 83-86-89 franchise box (group=S03 misdetection) - Chérie Le BéBé (accented chars preserved, VFF language) - Jimmy Carr 8-word stand-up special title - [ OxTorrent.vc ] prefix + XviD codec (site_tag prefix) - Prodiges S12E01 with episode title + air-date silently lost - The Prodigy: apostrophe + Blu-ray dash + 1080i + multi-word audio = full AI-path degeneration (everything UNKNOWN) - Sleaford Mods yt-dlp slug (YT ID glued to year) - Super Mario Bros [FR-EN] (bilingual tag mistaken for group) - Gilmore Girls Complete S01-S07 (the well-behaved exception: COMPLETE token correctly drives tv_complete + REPACK + 10bit) Also adds shitty + path_of_pain to the per-bucket sanity assertion. Suite: 1020 passed, 8 skipped.	2026-05-18 15:57:56 +02:00
francwa	c1831e3f46	test(fixtures): drop derry_duplicate_naming (was a copy-paste artifact) The release name mixed two distinct releases — not a real-world case worth anti-regression. SHITTY bucket now holds 14 fixtures (down from 15).	2026-05-18 15:51:11 +02:00
francwa	aa182458b8	test(fixtures): seed SHITTY release bucket with 15 anti-regression cases Add 15 expected.yaml fixtures under tests/fixtures/releases/shitty/ covering the awkward but real-world release names from the downloads folder. Each fixture locks in the current parse_release behavior so future parser changes are intentional, not silent drift. Cases captured: - Angel INTEGRALE 3-level hierarchy (tv_complete media_type) - Buffy custom French title with dots preserved - Archer S14E09E10E11 multi-episode (E11 lost — tech debt) - Notre Planète lowercase s01e01 - Vinyl ' - 1x01 - FHD' (stray dash artifact — tech debt) - Deutschland.83 (year-suffix as part of title) - Tatortreiniger S01-06 range (falls to movie — tech debt) - Derry Girls duplicated title - Jurassic Park bare folder (media_type=unknown) - La Nuit au Musée bilingual MULTI - Chérie j'ai agrandi (ASCII-stripped apostrophe, parses fine) - Honey Don't (unescaped apostrophe — full AI-path degeneration) - Hook MULTi.SUBS movie with Subs/ folder - Predator Badlands space separators (group=UNKNOWN — tech debt) - Westworld S04 Subs.Only (no video file) Each fixture also captures the future 3-flow routing (library / torrents / seed_hardlinks) ahead of the organize_media refactor. Suite: 1011 passed, 8 skipped.	2026-05-18 15:48:41 +02:00
francwa	7bc50fd5b8	test: add real-world release fixtures (EASY bucket) Captures 5 canonical releases from /mnt/testipool/downloads as parametrized fixtures under tests/fixtures/releases/easy/. Each fixture declares the release name, expected ParsedRelease fields, original tree, and the future routing (library / torrents / seed_hardlinks) for the upcoming organize_media refactor. Today only the 'parsed' section is asserted; tree is materialized into a tmp_path to catch typos. Routing is captured ahead of the planner work — it becomes verifiable once organize_media lands. Cases: back_in_action (movie), slow_horses_single_ep (TV single), foundation_season_pack (S02 + .nfo noise), long_walk_with_noise (movie + KONTRAST.TOP.txt), sinners_yts (YTS bracket-heavy + Subs/ dir). Also tracks CHANGELOG.md under [Unreleased] / Added.	2026-05-18 15:36:19 +02:00
francwa	891ba502a2	chore: apply pre-commit auto-fixes (trim trailing whitespace, EOF)	2026-05-17 23:41:54 +02:00
francwa	e07c9ec77b	chore: sprint cleanup — language unification, parser unification, fossils removal Several weeks of work accumulated without being committed. Grouped here for clarity; see CHANGELOG.md [Unreleased] for the user-facing summary. Highlights ---------- P1 #2 — ISO 639-2/B canonical migration - New Language VO + LanguageRegistry (alfred/domain/shared/knowledge/). - iso_languages.yaml as single source of truth for language codes. - SubtitleKnowledgeBase now delegates lookup to LanguageRegistry; subtitles.yaml only declares subtitle-specific tokens (vostfr, vf, vff, …). - SubtitlePreferences default → ["fre", "eng"]; subtitle filenames written as {iso639_2b}.srt (legacy fr.srt still read via alias). - Scanner: dropped _LANG_KEYWORDS / _SDH_TOKENS / _FORCED_TOKENS / SUBTITLE_EXTENSIONS hardcoded dicts. - Fixed: 'hi' token no longer marks SDH (conflicted with Hindi alias). - Added settings.min_movie_size_bytes (was a module constant). P1 #3 — Release parser unification + data-driven tokenizer - parse_release() is now the single source of truth for release-name parsing. - alfred/knowledge/release/separators.yaml declares the token separators used by the tokenizer (., space, [, ], (, ), _). New conventions can be added without code changes. - Tokenizer now splits on any configured separator instead of name.split('.'). Releases like 'The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]' parse via the direct path without sanitization fallback. - Site-tag extraction always runs first; well-formedness only rejects truly forbidden chars. - _parse_season_episode() extended with NxNN / NxNNxNN alt forms. - Removed dead helpers: _sanitize, _normalize. Domain cleanup - Deleted fossil services with zero production callers: alfred/domain/movies/services.py alfred/domain/tv_shows/services.py alfred/domain/subtitles/services.py (replaced by subtitles/services/ package) alfred/domain/subtitles/repositories.py - Split monolithic subtitle services into a package (identifier, matcher, placer, pattern_detector, utils) + dedicated knowledge/ package. - MediaInfo split into dedicated package (alfred/domain/shared/media/: audio, video, subtitle, info, matching). Persistence cleanup - Removed dead JSON repositories (movie/subtitle/tvshow_repository.py). Tests - Major expansion of the test suite organized to mirror the source tree. - Removed obsolete _edge_cases test files superseded by structured tests. - Suite: 990 passed, 8 skipped. Misc - .gitignore: exclude env_backup/ and .bak. - Adjustments across agent/llm, app.py, application/filesystem, and infrastructure/filesystem to align with the new domain layout.	2026-05-17 23:38:00 +02:00
francwa	e45465d52d	feat: split resolve_destination, persona-driven prompts, qBittorrent relocation Destination resolution - Replace the single ResolveDestinationUseCase with four dedicated functions, one per release type: resolve_season_destination (pack season, folder move) resolve_episode_destination (single episode, file move) resolve_movie_destination (movie, file move) resolve_series_destination (multi-season pack, folder move) - Each returns a dedicated DTO carrying only the fields relevant to that release type — no more polymorphic ResolvedDestination with half the fields unused depending on the case. - Looser series folder matching: exact computed-name match is reused silently; any deviation (different group, multiple candidates) now prompts the user with all options including the computed name. Agent tools - Four new tools wrapping the use cases above; old resolve_destination removed from the registry. - New move_to_destination tool: create_folder + move, chained — used after a resolve_* call to perform the actual relocation. - Low-level filesystem_operations module (create_folder, move via mv) for instant same-FS renames (ZFS). Prompt & persona - New PromptBuilder (alfred/agent/prompt.py) replacing prompts.py: identity + personality block, situational expressions, memory schema, episodic/STM/config context, tool catalogue. - Per-user expression system: knowledge/users/common.yaml + {username}.yaml are merged at runtime; one phrase per situation (greeting/success/error/...) is sampled into the system prompt. qBittorrent integration - Credentials now come from settings (qbittorrent_url/username/password) instead of hardcoded defaults. - New client methods: find_by_name, set_location, recheck — the trio needed to update a torrent's save path and re-verify after a move. - Host→container path translation settings (qbittorrent_host_path / qbittorrent_container_path) for docker-mounted setups. Subtitles - Identifier: strip parenthesized qualifiers (simplified, brazil…) at tokenization; new _tokenize_suffix used for the episode_subfolder pattern so episode-stem tokens no longer pollute language detection. - Placer: extract _build_dest_name so it can be reused by the new dry_run path in ManageSubtitlesUseCase. - Knowledge: add yue, ell, ind, msa, rus, vie, heb, tam, tel, tha, hin, ukr; add 'fre' to fra; add 'simplified'/'traditional' to zho. Misc - LTM workspace: add 'trash' folder slot. - Default LLM provider switched to deepseek. - testing/debug_release.py: CLI to parse a release, hit TMDB, and dry-run the destination resolution end-to-end.	2026-05-14 05:01:59 +02:00
francwa	1723b9fa53	feat: release parser, media type detection, ffprobe integration Replace the old domain/media release parser with a full rewrite under domain/release/: - ParsedRelease with media_type ("movie" | "tv_show" | "tv_complete" | "documentary" | "concert" | "other" | "unknown"), site_tag, parse_path, languages, audio_codec, audio_channels, bit_depth, hdr_format, edition - Well-formedness check + sanitize pipeline (_is_well_formed, _sanitize, _strip_site_tag) before token-level parsing - Multi-token sequence matching for audio (DTS-HD.MA, TrueHD.Atmos…), HDR (DV.HDR10…) and editions (DIRECTORS.CUT…) - Knowledge YAML: file_extensions, release_format, languages, audio, video, editions, sites/c411 New infrastructure: - ffprobe.py — single-pass probe returning MediaInfo (video, audio tracks, subtitle tracks) - find_video.py — locate first video file in a release folder New application helpers: - detect_media_type — filesystem-based type refinement - enrich_from_probe — fill missing ParsedRelease fields from MediaInfo New agent tools: - analyze_release — parse + detect type + ffprobe in one call - probe_media — standalone ffprobe for a specific file New domain value object: - MediaInfo + AudioTrack + SubtitleTrack (domain/shared/media_info.py) Testing CLIs: - recognize_folders_in_downloads.py — full pipeline with colored output - probe_video.py — display MediaInfo for a video file	2026-05-12 16:14:20 +02:00
francwa	249c5de76a	feat: major architectural refactor - Refactor memory system (episodic/STM/LTM with components) - Implement complete subtitle domain (scanner, matcher, placer) - Add YAML workflow infrastructure - Externalize knowledge base (patterns, release groups) - Add comprehensive testing suite - Create manual testing CLIs	2026-05-11 21:55:06 +02:00
francwa	62b5d0b998	Settings + fix startup	2026-04-30 12:41:42 +02:00
francwa	ab1df3dd0f	fix: forgot to lint/format	2026-01-01 04:48:32 +01:00
francwa	c50091f6bf	feat: added proper settings handling	2026-01-01 04:48:32 +01:00
francwa	52f025ae32	chore: ran linter & formatter again	2025-12-27 20:07:48 +01:00
francwa	261a1f3918	fix: fixed real data directory being used in tests	2025-12-27 19:48:13 +01:00
francwa	3880a4ec49	chore: ran linter and formatter	2025-12-27 19:41:22 +01:00
francwa	6195abbaa5	chore: fixed imports and tests configuration	2025-12-27 19:39:36 +01:00
francwa	d10c9160f3	infra: renamed broken references to alfred	2025-12-24 08:09:26 +01:00
francwa	1f88e99e8b	infra: reorganized repo	2025-12-24 07:50:09 +01:00
francwa	ec7d2d623f	Updated folder structure (for Docker)	2025-12-09 05:35:59 +01:00
francwa	6940c76e58	Updated README and did a little bit of cleanup	2025-12-09 04:24:16 +01:00
francwa	0c48640412	Fixed all ruff issues	2025-12-07 05:59:53 +01:00
francwa	a21121d025	Fix more ruff issues	2025-12-07 05:42:29 +01:00
francwa	10704896f9	Fix some ruff issues in code	2025-12-07 05:33:39 +01:00
francwa	4eae1d6d58	Formatting	2025-12-07 03:33:51 +01:00
francwa	a923a760ef	Unfucked gemini's mess	2025-12-07 03:27:45 +01:00
francwa	5b71233fb0	Recovered tests	2025-12-06 23:55:21 +01:00
francwa	9ca31e45e0	feat!: migrate to OpenAI native tool calls and fix circular deps (#fuck-gemini) - Fix circular dependencies in agent/tools - Migrate from custom JSON to OpenAI tool calls format - Add async streaming (step_stream, complete_stream) - Simplify prompt system and remove token counting - Add 5 new API endpoints (/health, /v1/models, /api/memory/*) - Add 3 new tools (get_torrent_by_index, add_torrent_by_index, set_language) - Fix all 500 tests and add coverage config (80% threshold) - Add comprehensive docs (README, pytest guide) BREAKING: LLM interface changed, memory injection via get_memory()	2025-12-06 19:11:05 +01:00

29 Commits
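
Commit e07c9ec77b describes a tokenizer that splits release names on any separator declared in separators.yaml, instead of calling name.split('.'). The snippet below is only a rough sketch of that idea, not the project's code: the function name `tokenize` and the `SEPARATORS` list are hypothetical, and the separators are hardcoded because the commit message does not show the YAML schema.

```python
import re

# Separators as listed in the commit message. The project reportedly loads
# them from alfred/knowledge/release/separators.yaml; that file's layout is
# not shown in the log, so this hardcoded list is an assumption.
SEPARATORS = [".", " ", "[", "]", "(", ")", "_"]


def tokenize(name: str, separators: list[str] = SEPARATORS) -> list[str]:
    """Split a release name on any configured separator, dropping empty tokens."""
    # Build an alternation of escaped separators so that multi-character
    # separators would also work; "+" collapses runs such as "] [".
    pattern = "(?:" + "|".join(re.escape(sep) for sep in separators) + ")+"
    return [token for token in re.split(pattern, name) if token]


if __name__ == "__main__":
    print(tokenize("The Father (2020) [1080p] [WEBRip] [5.1] [YTS.MX]"))
```

With this naive split, a value such as 5.1 comes out as two tokens (5 and 1). The log mentions multi-token sequence matching for audio, HDR and editions; a later pass of that kind would presumably be what reassembles such values, but how the project actually handles this case is not stated here.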