- [P20260219-062531] 2026-02-19 06:25:31 +0800
- summary: CONTEXT TRANSFER: We are continuing a conversation that had gotten too long. Here is a summary: --- ## TASK 1: Analyze a…
- prompt:
CONTEXT TRANSFER: We are continuing a conversation that had gotten too long. Here is a summary:
---
## TASK 1: Analyze and update outdated documentation in `apps/etl/connectors/feiqiu/docs/`
**STATUS**: in-progress
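**USER QUERIES**: 1 ("Based on apps\etl\connectors\feiqiu\docs\README.md, analyze the documents in apps\etl\connectors\feiqiu\docs one by one. Which ones are outdated? Help me update the docs to match the current state of the project.")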
**DETAILS**:
- Phase 1 (DONE): Used context-gatherer to map codebase structure
- Phase 2 (DONE): Read ALL source code truth files and ALL documentation files
- Phase 3 (IN PROGRESS): Analysis complete, specific outdated items identified, but NO documentation files have been modified yet
**Key findings from source code analysis:**
Schema names: Code uses `ods.*` (e.g., `ods.member_profiles`), NOT `billiards_ods.*`. Documents still reference `billiards_ods`, `billiards_dwd`, `billiards_dws` — these are outdated.
Task counts from `task_registry.py`:
- ODS: 23 tasks (from `ODS_TASK_CLASSES` + `ODS_SETTLEMENT_TICKET` override, all in `ENABLED_ODS_CODES`)
- DWD: 2 tasks (`DWD_LOAD_FROM_ODS`, `DWD_QUALITY_CHECK`)
- DWS business: 12 tasks (5 assistant + 2 member + 4 finance + `DWS_BUILD_ORDER_SUMMARY`)
- DWS maintenance: 1 task (`DWS_MAINTENANCE` — replaced old `MV_REFRESH` + `RETENTION_CLEANUP`)
- INDEX: 4 tasks (`DWS_WINBACK_INDEX`, `DWS_NEWCONV_INDEX`, `DWS_RELATION_INDEX`, `DWS_ML_MANUAL_IMPORT`)
- Utility: 7 tasks (`MANUAL_INGEST`, `INIT_ODS_SCHEMA`, `INIT_DWD_SCHEMA`, `INIT_DWS_SCHEMA`, `ODS_JSON_ARCHIVE`, `CHECK_CUTOFF`, `SEED_DWS_CONFIG`)
- Verification: 1 task (`DATA_INTEGRITY_CHECK`)
- Total: 50 registered tasks (23 + 2 + 12 + 1 + 4 + 7 + 1)
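The per-category counts above can be tallied with a short script, which also makes it easy to compare them against the statistics tables in the docs. This is a minimal sketch: the numbers are copied from this summary, not read from `task_registry.py`, and the names `TASK_COUNTS` and `total_tasks` are hypothetical.

```python
# Counts copied from the summary above; they are NOT loaded from task_registry.py.
TASK_COUNTS = {
    "ODS": 23,
    "DWD": 2,
    "DWS business": 12,
    "DWS maintenance": 1,
    "INDEX": 4,
    "Utility": 7,
    "Verification": 1,
}


def total_tasks(counts):
    """Sum the per-category task counts."""
    return sum(counts.values())


def dws_total(counts):
    """DWS total as a docs table would report it: business plus maintenance."""
    return counts["DWS business"] + counts["DWS maintenance"]


if __name__ == "__main__":
    print("total:", total_tasks(TASK_COUNTS))
    print("DWS:", dws_total(TASK_COUNTS))
```

With these numbers the total is 50 and the combined DWS figure is 13, which is the value the statistics table in `base_task_mechanism.md` is expected to show.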
CLI from `cli/main.py`: 3 execution modes (--tasks, --flow, --layers), 3 processing modes (increment_only, verify_only, increment_verify), plus --force-full, --window-split, --window-split-unit, --window-split-days, --window-compensation-hours, --force-window-override, --verify-tables, --fetch-before-verify, --data-source
7 Flow types from `flow_runner.py`: api_ods, api_ods_dwd, api_full, ods_dwd, dwd_dws, dwd_dws_index, dwd_index
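To make the CLI surface above concrete, here is a hypothetical `argparse` sketch covering a subset of the listed options. It is not the actual `cli/main.py`: the flag names, flow names, and processing-mode names come from this summary, while the mutual exclusivity of the three execution modes, the string-valued `--tasks` and `--layers`, and the boolean `--force-full` are assumptions made for illustration.

```python
import argparse

# Flow and processing-mode names as listed in the summary above.
FLOW_TYPES = [
    "api_ods", "api_ods_dwd", "api_full", "ods_dwd",
    "dwd_dws", "dwd_dws_index", "dwd_index",
]
PROCESSING_MODES = ["increment_only", "verify_only", "increment_verify"]


def build_parser():
    """Build a hypothetical parser; the real cli/main.py may differ."""
    parser = argparse.ArgumentParser(description="ETL CLI sketch (illustrative only)")

    # Assumption: the three execution modes cannot be combined.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--tasks", help="task codes (value format is an assumption)")
    mode.add_argument("--flow", choices=FLOW_TYPES)
    mode.add_argument("--layers", help="layer names (value format is an assumption)")

    parser.add_argument("--processing-mode", choices=PROCESSING_MODES)
    # Assumption: --force-full is a boolean switch.
    parser.add_argument("--force-full", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--flow", "api_full", "--force-full"])
    print(args.flow, args.force_full)
```

When documenting the real parameters, the argument types, defaults, and any exclusivity rules should be taken from `cli/main.py` itself rather than from this sketch.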
**Specific outdated items per document (NOT yet fixed):**
1. `architecture/system_overview.md`:
- Tech stack table lists `PySide6` (GUI removed, replaced by admin-web) and `Flask` (replaced by FastAPI)
- Module diagram references `gui/main.py` — should be removed
- Schema names: `billiards_ods | billiards_dwd | billiards_dws | etl_admin` → should be `ods | dwd | dws | meta` (per steering tech.md: six-layer schema `meta`, `ods`, `dwd`, `core`, `dws`, `app`)
- ODS tasks count says "16 个业务实体" ("16 business entities") → should be 23
- References `ETLScheduler` which is deprecated
- Missing: `--force-full` parameter description
- Data source section references old `--pipeline-flow` as primary
2. `architecture/data_flow.md`:
- Schema names: `billiards_ods`, `billiards_dwd`, `billiards_dws` → `ods`, `dwd`, `dws`
- Says "23 张 ODS 表,对应 23 个 API 端点" ("23 ODS tables, corresponding to 23 API endpoints"): the count is correct, but the schema prefix is wrong
- Missing INDEX layer in the data flow diagram
- `etl_admin` → should be `meta`
3. `etl_tasks/base_task_mechanism.md`:
- Section 5.4 task statistics: "DWD | 5" → should be 2; "DWS | 13" is correct (12 business + 1 maintenance); "工具类 | 7" (utility: 7) is correct; "校验类 | 1" (verification: 1) is correct
- Registration example uses `"ORDERS"` which is a deleted legacy task → update to current task name
4. `operations/environment_setup.md`:
- Section 2.2: `pip install -r requirements.txt` → should be `uv sync` (project uses uv workspace)
- Dependencies table lists `PySide6` and `flask` — both removed
- Section 4 Schema names: `billiards_ods`, `billiards_dwd`, `billiards_dws`, `etl_admin` → `ods`, `dwd`, `dws`, `meta`
- Section 4 DDL paths: `database/schema_ods.sql` etc. → actual paths in `db/etl_feiqiu/schemas/`
- Section 6 run entries: GUI entry (`python -m gui.main`) and batch files (`run_etl.bat`/`run_gui.bat`) should be removed
- Missing: `--flow`, `--layers`, `--processing-mode` and other new CLI params in verification section
5. `operations/scheduling.md`:
- Mostly up-to-date (was recently rewritten), but missing `--force-full` parameter
- Missing `--window-split-unit`, `--window-split-days`, `--window-compensation-hours` parameters
6. `etl_tasks/ods_tasks.md`:
- Content appears mostly accurate and recently updated
- Schema references say `billiards_ods` in some places → should be `ods`
7. `etl_tasks/dws_tasks.md`:
- Schema references: `billiards_dws`, `billiards_dwd` → `dws`, `dwd`
- Otherwise content appears accurate
8. `etl_tasks/index_tasks.md`:
- Schema references: `billiards_dws`, `billiards_dwd` → `dws`, `dwd`
- Otherwise content appears accurate and comprehensive
9. `etl_tasks/utility_tasks.md`:
- Schema references: `billiards_ods`, `billiards_dwd`, `billiards_dws` → `ods`, `dwd`, `dws`
- DDL file paths reference `database/schema_ODS_doc.sql` etc. → verify against actual paths in `db/etl_feiqiu/schemas/`
10. `architecture/ods_taskspec_refactor_proposal.md`:
- Already marked as "方案 1-4 已完成 ✅ · 方案 5 中长期待办" ("Options 1-4 completed ✅ · Option 5 is a mid-to-long-term to-do"); this is fine, no update needed
11. `docs/README.md`:
- Appears up-to-date, no changes needed
12. `CHANGELOG.md`:
- Will need a new entry after all doc updates are complete
**NEXT STEPS** (priority order):
1. Update `architecture/system_overview.md` — remove GUI/Flask/PySide6 references, fix schema names (`ods`/`dwd`/`dws`/`meta`), fix task count, remove ETLScheduler reference, add --force-full
2. Update `architecture/data_flow.md` — fix schema names, add INDEX layer to diagram
3. Update `etl_tasks/base_task_mechanism.md` — fix DWD count from 5→2, fix example task name
4. Update `operations/environment_setup.md` — fix install method (uv sync), remove GUI/Flask/PySide6, fix schema names, fix DDL paths, fix run entries
5. Update `operations/scheduling.md` — add missing CLI params (--force-full, --window-split-unit, --window-split-days, --window-compensation-hours)
6. Bulk fix schema name references across all docs: `billiards_ods`→`ods`, `billiards_dwd`→`dwd`, `billiards_dws`→`dws`, `etl_admin`→`meta`
7. Add CHANGELOG entry for documentation refresh
**IMPORTANT NOTE on schema names**: The steering file `tech.md` says "六层 Schema 架构:`meta`(调度元数据)、`ods`(原始数据)、`dwd`(明细数据)、`core`(跨门店标准化)、`dws`(汇总数据)、`app`(RLS 视图层)" (six-layer schema architecture: `meta` for scheduling metadata, `ods` for raw data, `dwd` for detail data, `core` for cross-store standardization, `dws` for summary data, `app` for the RLS view layer). The code uses the `ods.*` prefix (e.g., `ods.member_profiles`). However, the DWS task docs reference `billiards_dws.*` extensively in business logic descriptions. Before doing any bulk replacement, the new agent should verify that the DWS/DWD code also uses the short schema names; check `tasks/dws/base_dws_task.py` for the actual schema prefix used.
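Once the short schema names are confirmed, the bulk rename from step 6 could be previewed with a sketch like the one below. It is a hypothetical helper, not part of the repository: the mapping comes from this summary, the function names `rename_schemas` and `preview` are invented for illustration, and limiting the scan to `*.md` files is an assumption. Word boundaries keep it from touching longer identifiers that merely contain an old name.

```python
import re
from pathlib import Path

# Old -> new schema names, as listed in the NEXT STEPS above.
SCHEMA_RENAMES = {
    "billiards_ods": "ods",
    "billiards_dwd": "dwd",
    "billiards_dws": "dws",
    "etl_admin": "meta",
}

# \b prevents matches inside longer identifiers such as "billiards_ods_backup".
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, SCHEMA_RENAMES)) + r")\b")


def rename_schemas(text):
    """Return (new_text, replacement_count) for one document's text."""
    return _PATTERN.subn(lambda m: SCHEMA_RENAMES[m.group(1)], text)


def preview(docs_dir):
    """Dry run: report how many replacements each Markdown file would get."""
    report = {}
    for path in sorted(Path(docs_dir).rglob("*.md")):
        _, count = rename_schemas(path.read_text(encoding="utf-8"))
        if count:
            report[str(path)] = count
    return report


if __name__ == "__main__":
    for name, count in preview("apps/etl/connectors/feiqiu/docs").items():
        print(f"{count:4d}  {name}")
```

The dry run writes nothing, so it can be used to review the affected files first. Historical entries such as those in `CHANGELOG.md` may need to keep the old names and should be checked by hand.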
**FILEPATHS**:
- `apps/etl/connectors/feiqiu/docs/README.md`
- `apps/etl/connectors/feiqiu/docs/architecture/system_overview.md` (needs update)
- `apps/etl/connectors/feiqiu/docs/architecture/data_flow.md` (needs update)
- `apps/etl/connectors/feiqiu/docs/architecture/ods_taskspec_refactor_proposal.md` (OK)
- `apps/etl/connectors/feiqiu/docs/etl_tasks/ods_tasks.md` (minor schema fix)
- `apps/etl/connectors/feiqiu/docs/etl_tasks/dws_tasks.md` (minor schema fix)
- `apps/etl/connectors/feiqiu/docs/etl_tasks/index_tasks.md` (minor schema fix)
- `apps/etl/connectors/feiqiu/docs/etl_tasks/base_task_mechanism.md` (needs update)
- `apps/etl/connectors/feiqiu/docs/etl_tasks/utility_tasks.md` (minor schema fix)
- `apps/etl/connectors/feiqiu/docs/operations/environment_setup.md` (needs major update)
- `apps/etl/connectors/feiqiu/docs/operations/scheduling.md` (needs minor update)
- `apps/etl/connectors/feiqiu/docs/CHANGELOG.md` (needs new entry at end)
**USER CORRECTIONS AND INSTRUCTIONS**:
- User wants docs updated according to the current project state ("按照项目现状更新文档", i.e. "update the docs to match the current state of the project")
- Language: Chinese (user communicates in Chinese, docs are mixed Chinese/English)
- All explanatory text must be in simplified Chinese per `language-zh.md` steering
- This is a docs-only change — no code files should be modified
- Governance note: docs-only changes are low risk and don't require `/audit`
**Files to read** (prioritized for the new agent to continue updating):
- `apps/etl/connectors/feiqiu/docs/architecture/system_overview.md` (first to update)
- `apps/etl/connectors/feiqiu/docs/architecture/data_flow.md`
- `apps/etl/connectors/feiqiu/docs/etl_tasks/base_task_mechanism.md`
- `apps/etl/connectors/feiqiu/docs/operations/environment_setup.md`
- `apps/etl/connectors/feiqiu/docs/operations/scheduling.md`
- `apps/etl/connectors/feiqiu/tasks/dws/base_dws_task.py` (verify actual DWS schema prefix before bulk rename)
- `apps/etl/connectors/feiqiu/orchestration/task_registry.py` (source of truth for all task registrations — already read, key facts summarized above)
- `apps/etl/connectors/feiqiu/cli/main.py` (source of truth for CLI params — already read, key facts summarized above)
USER QUERIES(most recent first):
1. Output:
Command executed successfully with no output.
Exit Code: 0
2. Output:
Command executed successfully with no output.
Exit Code: 0
---
METADATA:
The previous conversation had 2 messages.
INSTRUCTIONS:
Continue working until the user query has been fully addressed. Do not ask for clarification - proceed with the work based on the context provided.
IMPORTANT: Read the files listed in the "Files to read" section before continuing.