Commit 20260223: submitted before frontend-backend joint debugging
This commit is contained in:

docs/audit/prompt_logs/prompt_log_20260221_122109.md (new file, 99 lines)

@@ -0,0 +1,99 @@
- [P20260221-122109] 2026-02-21 12:21:09 +0800
- summary: CONTEXT TRANSFER: We are continuing a conversation that had gotten too long. Here is a summary: --- ## TASK 1: Execute E…
- prompt:

```text
CONTEXT TRANSFER: We are continuing a conversation that had gotten too long. Here is a summary:

---

## TASK 1: Execute ETL Full-Chain Data Consistency Black-Box Check

**STATUS**: in-progress

**USER QUERIES**: User's single request - Run `python scripts/ops/etl_consistency_check.py` and summarize findings

**DETAILS**:
- User requested execution of the ETL consistency check script which performs:
  - API JSON vs ODS field completeness and value sampling
  - ODS vs DWD mapping correctness and value transformation verification
  - DWD vs DWS/INDEX aggregation logic verification
- The script at `scripts/ops/etl_consistency_check.py` has been fully read and verified complete (1011 lines)
- The script connects to the database via `PG_DSN` in read-only mode
- Output goes to `ETL_REPORT_ROOT` (from `.env`)
- No code changes needed - this is a pure execution task
- The script was NOT yet executed - context limit was hit before running it

**NEXT STEPS**:

1. Run the script: `uv run python scripts/ops/etl_consistency_check.py` from the workspace root directory
2. Check the output report in `ETL_REPORT_ROOT` (path from `.env`)
3. Read the generated report and summarize key findings to the user
4. Report structure expected: ETL execution overview, API↔ODS consistency, ODS↔DWD consistency, DWD↔DWS consistency, anomaly summary

**FILEPATHS**:

- `scripts/ops/etl_consistency_check.py` - the main script to execute
- `.env` - contains PG_DSN, ETL_REPORT_ROOT, FETCH_ROOT, LOG_ROOT

**KEY CONTEXT FROM PREVIOUS REPORTS** (for interpreting results):

- Known bug: `DWS_RELATION_INDEX` fails because `d.is_delete` should be `s.is_delete` in `relation_index_task.py` line 226
- 14 DWS tasks are registered but not yet run (will show as SKIP)
- ODS layer: 22/22 tasks typically succeed
- DWD layer: 20/20 tables have data
- DWS layer: only 9/32 tables have data
- Known mapping issues: `dwd.dim_member.update_time` missing ODS source, `dwd.dwd_member_balance_change.principal_change_amount` missing ODS source
- 2 DWD tables don't exist yet: `dwd.dwd_goods_stock_movement`, `dwd.dwd_goods_stock_summary`

**USER CORRECTIONS AND INSTRUCTIONS**:

- All responses must be in simplified Chinese (language-zh.md steering)
- Use `PG_DSN` (production database) in read-only mode; do NOT modify any data
- Testing environment rules apply: must load `.env` properly, never skip config
- Output paths must come from `.env` environment variables (export-paths.md steering)
- Script execution convention: run Python scripts via `uv run python` or `python`
- The workspace root is `C:\NeoZQYY` on Windows with cmd shell
- Four database connections available via MCP: `mcp_pg_etl` (production), `mcp_pg_etl_test` (test), `mcp_pg_app`, `mcp_pg_app_test`
- store_id: `2790685415443269`

**Files to read**:

- `scripts/ops/etl_consistency_check.py`
- `.env`

USER QUERIES (most recent first):

1. <source-event>
The user manually invoked this action
The user is focused on the following file: No file focused
The user has the following paths open:
</source-event>

Execute the ETL full-chain data consistency black-box check, completing the following steps:

1. Run `python scripts/ops/etl_consistency_check.py`
2. The script will automatically:
   a. Find the most recent successful ETL log in LOG_ROOT and parse the task list for that run
   b. Read the API JSON files written to disk by that ETL run from FETCH_ROOT
   c. Connect to the database (PG_DSN) and compare, field by field, every table touched by the run's tasks:
      - API JSON vs ODS: field completeness, sampled value comparison (key fields of 5 random records)
      - ODS vs DWD: field mapping correctness, value transformation verification (sampled comparison)
      - DWD vs DWS/INDEX: aggregation logic verification (row counts, spot checks of key metrics)
   d. Write a Markdown report to ETL_REPORT_ROOT
3. Check the report output and summarize the key findings

Report structure:

- 1. ETL execution overview (task list; success/failure/skip statistics)
- 2. API↔ODS data consistency (table-by-table, field-by-field value comparison)
- 3. ODS↔DWD data consistency (mapping verification + value sampling)
- 4. DWD↔DWS data consistency (aggregation logic verification)
- 5. Anomaly summary and recommendations

Note: connect with the production database PG_DSN (read-only mode); do not modify any data.

2. <implicit-rules>## Implicit Rules
Focus on creating a new spec file or identifying an existing spec to update.
If starting a new spec, create a requirements.md file in the .kiro/specs directory with clear user stories and acceptance criteria.
If working with an existing spec, review the current requirements and suggest improvements if needed.
Do not make direct code changes yet. First establish or review the spec file that will guide our implementation.</implicit-rules>

---

METADATA:
The previous conversation had 2 messages.

INSTRUCTIONS:
Continue working until the user query has been fully addressed. Do not ask for clarification - proceed with the work based on the context provided.
IMPORTANT: you need to read the files listed in the "Files to read" section
```
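The API JSON vs ODS step the log describes (field completeness plus sampled value comparison) amounts to diffing dict-shaped records. A minimal sketch in Python; the record shapes and field names below are illustrative assumptions, not the real schema used by `etl_consistency_check.py`:

```python
# Sketch of a per-record API-JSON vs ODS comparison. Field names here
# ("id", "name", ...) are hypothetical examples, not the project's schema.

def compare_record(api_rec: dict, ods_rec: dict) -> dict:
    """Diff one API JSON record against its ODS row.

    Reports fields present in the API payload but missing from ODS
    (completeness), and value mismatches on the shared fields (sampling).
    """
    missing = [k for k in api_rec if k not in ods_rec]
    mismatched = {
        k: (api_rec[k], ods_rec[k])
        for k in api_rec
        if k in ods_rec and api_rec[k] != ods_rec[k]
    }
    return {"missing": missing, "mismatched": mismatched}
```

Running this over a handful of randomly sampled records per table, as the log suggests, keeps the check cheap while still catching dropped columns and value-level drift.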
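The DWD↔DWS aggregation check (row counts and key-metric spot checks) can be sketched the same way: recompute the aggregate from detail rows and diff it against the stored summary. The group key `store` and metric `amt` below are hypothetical illustrations, not the project's actual columns:

```python
# Sketch of a DWD vs DWS aggregation spot check: re-derive a summed metric
# from DWD detail rows and compare against the DWS summary rows.
from collections import defaultdict

def aggregate_check(dwd_rows, dws_rows, key, metric):
    """Return {group: (expected, actual)} for every group that disagrees."""
    expected = defaultdict(float)
    for row in dwd_rows:
        expected[row[key]] += row[metric]
    actual = {row[key]: row[metric] for row in dws_rows}
    return {
        k: (expected[k], actual.get(k))
        for k in expected
        if actual.get(k) != expected[k]
    }
```

An empty result means the aggregation layer is consistent for the sampled metric; any entry pinpoints the group and the expected-vs-stored values, which maps directly onto the "anomaly summary" section of the report.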