Commit before frontend-backend joint debugging, 20260223

This commit is contained in:
Neo
2026-02-23 23:02:20 +08:00
parent 254ccb1e77
commit fafc95e64c
1142 changed files with 10366960 additions and 36957 deletions


@@ -0,0 +1,156 @@
- [P20260221-153812] 2026-02-21 15:38:12 +0800
- summary: CONTEXT TRANSFER: We are continuing a conversation that had gotten too long. Here is a summary: --- ## TASK 1: Frontend-backend integration — a…
- prompt:
```text
CONTEXT TRANSFER: We are continuing a conversation that had gotten too long. Here is a summary:
---
## TASK 1: Frontend-backend integration — admin-web task configuration, submission, execution, verification
**STATUS**: in-progress
**USER QUERIES**: All queries in this conversation are continuations of the same task from a previous context transfer.
**SHORT DESCRIPTION**: Frontend-backend integration: configure the ETL task through the admin-web browser UI (api_full, full_window, 2025-11-01~2026-02-20, 30-day window splitting, force-full), submit and execute the 40 common tasks for regression testing and data validation, run the ETL Data Consistency Check hook, and produce a detailed timing document.
**DETAILS**:
### Completed steps:
1. **Port cleanup and service startup** — frontend `pnpm dev` on 5173; backend uvicorn on 8000, PID 3644
2. **Login** — completed manually by the user
3. **Browser UI configuration** — all settings configured and submitted for execution via Playwright
4. **Task submitted** — start time 2026/2/21 15:29:20, run_uuid=4ba9d2d365ee4a858f1c4104b1942dc2
### Critical BUG discovered:
The `DWS_ASSISTANT_DAILY` task failed during ETL execution. Root cause: the SQL in the `_extract_trash_records` method of `assistant_daily_task.py` references fields that do not exist in the `dwd.dwd_assistant_trash_event` table:
- `assistant_service_id` → the table's actual PK is `assistant_trash_event_id`
- `trash_seconds` → the actual field is `charge_minutes_raw` (minutes; multiply by 60 for seconds)
- `trash_time` → the actual field is `create_time`
This raises a `psycopg2.errors.UndefinedColumn` error, after which the transaction enters the `InFailedSqlTransaction` state, cascading into failures of all of the following tasks:
- DWS_ASSISTANT_DAILY (root cause)
- DWS_ASSISTANT_MONTHLY
- DWS_ASSISTANT_CUSTOMER
- DWS_ASSISTANT_SALARY
- DWS_ASSISTANT_FINANCE
- ODS_SETTLEMENT_RECORDS
- ODS_PAYMENT
- ODS_REFUND
- DWS_BUILD_ORDER_SUMMARY
Tasks from ODS_TABLE_USE onward resumed execution (task_executor has a connection-recovery mechanism); the ETL run is still executing in the background.
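The cascade described above is the standard psycopg2 failure mode: once one statement raises, every later statement on the same connection fails with `InFailedSqlTransaction` until a rollback is issued. A minimal sketch of the recovery pattern attributed to task_executor (the class and method names here are illustrative assumptions, not the actual task_executor API):

```python
class FailedTxGuard:
    """Illustrative sketch of rollback-on-failure connection recovery.
    Names are assumptions; only the pattern itself is the point."""

    def __init__(self, conn):
        self.conn = conn  # a DB-API connection, e.g. from psycopg2

    def run(self, task_fn):
        """Run one ETL task; roll back on any error so the connection
        does not stay stuck in an InFailedSqlTransaction state."""
        try:
            result = task_fn(self.conn)
            self.conn.commit()
            return ("ok", result)
        except Exception as exc:  # e.g. psycopg2.errors.UndefinedColumn
            self.conn.rollback()  # clears the failed-transaction state
            return ("failed", repr(exc))
```

Without the `rollback()` call, the later ODS/DWS tasks sharing the connection would keep failing exactly as listed above.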
### Fix attempt (unsuccessful):
Tried to patch the `_extract_trash_records` method with `editCode`'s `replace_node`, but write verification failed. The fix must be retried.
### Fix plan (confirmed correct):
In `_extract_trash_records`:
```sql
-- Old SQL (wrong)
SELECT assistant_service_id, trash_seconds, trash_reason, trash_time
FROM dwd.dwd_assistant_trash_event
WHERE site_id = %s AND DATE(trash_time) >= %s AND DATE(trash_time) <= %s
-- New SQL (correct)
SELECT assistant_trash_event_id, charge_minutes_raw * 60 AS trash_seconds,
       trash_reason, create_time AS trash_time, table_id, assistant_name
FROM dwd.dwd_assistant_trash_event
WHERE site_id = %s AND DATE(create_time) >= %s AND DATE(create_time) <= %s
```
In `_build_trash_index`:
```python
# Old: service_id = record.get('assistant_service_id')
# New: service_id = record.get('assistant_trash_event_id')
```
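The confirmed key change can be sketched as a standalone function (this reproduces only the corrected key lookup, not the full method body from `assistant_daily_task.py`; the dict-per-row record shape is an assumption):

```python
def build_trash_index(records):
    """Sketch of the corrected _build_trash_index direction: key the
    index on assistant_trash_event_id (the table's real PK) instead of
    the nonexistent assistant_service_id."""
    index = {}
    for record in records:
        event_id = record.get("assistant_trash_event_id")
        if event_id is not None:
            index[event_id] = record
    return index
```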
**Important business-logic issue**: the trash table `dwd_assistant_trash_event` has no `assistant_service_id` foreign key, so it cannot be joined 1:1 with `dwd_assistant_service_log`. The `service_id in trash_index` check in `_aggregate_by_assistant_date`, used to decide whether a service was voided, is flawed by design. However, service records already carry an `is_delete` field (the current SQL already filters with `AND asl.is_delete = 0`), so trash exclusion may need to switch to matching by `table_id` plus time, or simply rely on the `is_delete` flag. This is a business decision that must be confirmed with the user.
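If the decision goes toward `table_id` plus time matching, one possible shape is sketched below. This is purely illustrative, not a confirmed design: the 5-minute tolerance is an arbitrary assumption, and the field names follow the DDL quoted in this summary (`site_table_id`/`start_use_time` from the service log, `table_id`/`create_time` from the trash event).

```python
from datetime import timedelta

def find_trash_event(service, trash_events, tolerance=timedelta(minutes=5)):
    """Hypothetical table_id + time matching for trash exclusion.
    Matches a trash event to a service record when both refer to the
    same table and their timestamps fall within `tolerance`."""
    for event in trash_events:
        if event["table_id"] != service["site_table_id"]:
            continue
        if abs(event["create_time"] - service["start_use_time"]) <= tolerance:
            return event
    return None
```

Whether proximity in time is a sound join key (versus relying on `is_delete` alone) is exactly the business question flagged above.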
**NEXT STEPS**:
1. **Fix the `_extract_trash_records` SQL** — replace the wrong field names (file: `apps/etl/connectors/feiqiu/tasks/dws/assistant_daily_task.py`, method `_extract_trash_records`, around line 163)
2. **Fix `_build_trash_index`** — change `record.get('assistant_service_id')` to `record.get('assistant_trash_event_id')` (same file, around line 201)
3. **Discuss the trash-matching logic with the user** — trash_index is currently keyed by `assistant_trash_event_id`, while `_aggregate_by_assistant_date` matches with `service_id in trash_index` (service_id comes from `dwd_assistant_service_log.assistant_service_id`); the two IDs have different origins and will never match. Confirm whether to switch to matching by `table_id` plus time, or to rely on the `is_delete` flag directly
4. **Wait for the current ETL run to finish** — the task is still running in the background; refresh the browser to check its final status
5. **Regression testing and data validation** — verify data correctness once execution completes
6. **Run the ETL Data Consistency Check hook** — execute `scripts/ops/etl_consistency_check.py`
7. **Create the timing document** — parse the ETL logs to extract per-step durations and write the result to `SYSTEM_LOG_ROOT`
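For step 7, a minimal log-parsing sketch. The assumed log-line format here (timestamp, task name, start/done marker) is an illustration only; the real ETL log layout must be checked before this is used:

```python
import re
from datetime import datetime

# Assumed line shape: "2026-02-21 15:29:20 ... TASK_NAME start|done"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
    r".*?(?P<task>[A-Z_]+) (?P<event>start|done)"
)

def step_durations(lines):
    """Pair each task's start/done lines and return seconds per task."""
    starts, durations = {}, {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not match the assumed format
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
        if m["event"] == "start":
            starts[m["task"]] = ts
        elif m["task"] in starts:
            durations[m["task"]] = (ts - starts.pop(m["task"])).total_seconds()
    return durations
```

The resulting dict can then be rendered into the timing document under `SYSTEM_LOG_ROOT`.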
**FILEPATHS**:
- `apps/etl/connectors/feiqiu/tasks/dws/assistant_daily_task.py` — file to fix (`_extract_trash_records` + `_build_trash_index`)
- `db/etl_feiqiu/schemas/dwd.sql` — `dwd_assistant_trash_event` table DDL (lines 1580-1593)
- `db/etl_feiqiu/schemas/ods.sql` — ODS `assistant_cancellation_records` table (lines 575-620)
- `apps/etl/connectors/feiqiu/tasks/dwd/dwd_load_task.py` — DWD field mapping (lines 362-374)
- `apps/backend/app/routers/execution.py` — execution router
- `apps/backend/app/config.py` — backend configuration
- `scripts/ops/_env_paths.py` — shared path utilities
- `.env.template` — environment variable template
**RUNNING PROCESSES**:
- Process 6: `pnpm dev` in `apps/admin-web` (frontend dev server, port 5173)
- Backend uvicorn, PID 3644 (started via Start-Process, port 8000, not Kiro-managed)
- ETL subprocess currently executing (launched by the backend execution router)
**BROWSER STATE**:
- Playwright browser at http://localhost:5173/task-manager (task manager page, History tab, execution-detail dialog open, showing live logs)
- User logged in; JWT token valid
## USER CORRECTIONS AND INSTRUCTIONS:
- All output files must be written under the `export/` directory tree, with paths controlled via `.env` environment variables (export-paths.md mandatory rule)
- No hard-coded absolute/relative paths; paths exist only in the `.env` file
- Tests must load the full `.env`, with the same cwd as production runs (testing-env.md mandatory rule)
- Databases: use the test instances `test_etl_feiqiu` / `test_zqyy_app`
- All explanatory text in Simplified Chinese (language-zh.md mandatory rule)
- Prefer writing Python scripts for script-level tasks; avoid complex PowerShell logic
- One-off ops scripts go in `scripts/ops/`
- High-risk path changes require an audit (modifications under the `tasks/` directory must be audited)
- `scripts/ops/` scripts read output paths via `_env_paths.get_output_path("VAR_NAME")`
- The MCP connections `mcp_pg_etl_test` / `mcp_pg_app_test` fail ("Not connected"); query the database via Python scripts or the backend API instead
- The user said "use the browser for debugging, I'll help you log in" — i.e. the user is willing to assist manually with browser actions
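The `_env_paths.get_output_path` contract described in the rules above could look roughly like this. This is a sketch only; the actual implementation in `scripts/ops/_env_paths.py` has not been reproduced here and may differ (e.g. it may not create the directory):

```python
import os
from pathlib import Path

def get_output_path(var_name: str) -> Path:
    """Hedged sketch: resolve an output directory strictly from a
    .env-provided environment variable, never from a hard-coded path,
    and ensure the directory exists."""
    value = os.environ.get(var_name)
    if not value:
        raise KeyError(f"{var_name} is not set; define it in .env")
    path = Path(value)
    path.mkdir(parents=True, exist_ok=True)
    return path
```

This keeps every ops script compliant with the export-paths.md rule: the only place a path string lives is the `.env` file.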
## DWD schema quick reference (needed for the fix):
Actual fields of `dwd.dwd_assistant_trash_event`:
- `assistant_trash_event_id` BIGINT (PK)
- `site_id` BIGINT
- `table_id` BIGINT
- `table_area_id` BIGINT
- `assistant_no` VARCHAR(32)
- `assistant_name` VARCHAR(64)
- `charge_minutes_raw` INTEGER (void duration, in minutes)
- `abolish_amount` NUMERIC(18,2)
- `trash_reason` VARCHAR(255)
- `create_time` TIMESTAMPTZ (void time)
- `tenant_id` BIGINT
Key fields of `dwd.dwd_assistant_service_log`:
- `assistant_service_id` BIGINT (PK)
- `site_table_id` BIGINT
- `is_delete` INTEGER (0 = normal, non-zero = deleted)
- `income_seconds` INTEGER
- `start_use_time` TIMESTAMPTZ
## Files to read
- `apps/etl/connectors/feiqiu/tasks/dws/assistant_daily_task.py`
- `db/etl_feiqiu/schemas/dwd.sql` (lines 1438-1475 for service_log, lines 1580-1620 for trash_event)
- `scripts/ops/_env_paths.py`
- `.env.template`
USER QUERIES (most recent first):
1. Output:
Command executed successfully with no output.
Exit Code: 0
2. Output:
Command executed successfully with no output.
Exit Code: 0
---
METADATA:
The previous conversation had 2 messages.
INSTRUCTIONS:
Continue working until the user query has been fully addressed. Do not ask for clarification - proceed with the work based on the context provided.
IMPORTANT: you need to read the files listed in the "Files to read" section
```