This commit is contained in:
Neo
2026-03-15 10:15:02 +08:00
parent 2dd217522c
commit 72bb11b34f
916 changed files with 65306 additions and 16102803 deletions


@@ -0,0 +1,203 @@
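# BD Manual: assistant_trash_event cleanup + ODS_STORE_GOODS_SALES fix + dim_staff_ex fix + DWS precision extension + ODS inventory siteid injection
> Date: 2026-03-01
> Prompt summary: clean up leftover assistant_trash_event code and DDL docs; fix the ODS_STORE_GOODS_SALES window configuration to restore product sales data fetching; fix the column-name mapping errors in the dim_staff_ex FACT_MAPPINGS; extend the numeric precision of 7 ratio/margin fields in the DWS layer (P1); add a siteid column to ODS goods_stock_summary and complete the DWD mapping (P2)
> Direct cause: P1/P2 issues found during integration testing, plus a batch fix of three independent issues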
---
## 1. Change description
### 1.1 assistant_trash_event leftover cleanup
`dwd.dwd_assistant_trash_event` and `dwd.dwd_assistant_trash_event_ex` were dropped on 2026-02-22,
and the upstream ODS table `ods.assistant_cancellation_records` was dropped at the same time. This change cleans up the remaining code references and DDL documentation.
**Items removed from the DDL documentation:**
| File | Removed content |
|------|---------|
| `etl_feiqiu__ods.sql` | PK constraint `assistant_cancellation_records_pkey`; indexes `idx_assistant_cancellation_records_fetched_at_*` (2) |
| `etl_feiqiu__dwd.sql` | PK constraints `dwd_assistant_trash_event_pkey` and `dwd_assistant_trash_event_ex_pkey`; indexes `idx_dwd_assistant_trash_event_*` (2) |
| `dwd-amount-duration-calibration.md` | Section 2.11 (assistant cancellation event main table), questionable field #15, data freshness row |
**Code removals (already completed in an earlier step):**
- `utils/json_store.py`: API path mapping
- `tasks/utility/manual_ingest_task.py`: FILE_MAPPING / TABLE_SPECS
- `quality/consistency_checker.py`: ODS_TABLE_TO_JSON_FILE / ODS_TABLE_TO_TASK_CODE
- `scripts/refresh_json_and_audit.py`: ACTUAL_LIST_KEY
- `scripts/run_compare_v3.py` / `run_compare_v3_fixed.py`: TABLES list
### 1.2 ODS_STORE_GOODS_SALES window configuration fix
**Problem:** In `ods_tasks.py`, `ODS_STORE_GOODS_SALES` had `requires_window=False`, so the API `/TenantGoods/GetGoodsSalesList` was called without the `startTime`/`endTime` parameters and always returned 0 records.
**Fix:**
- `requires_window` changed to `True`
- Added `time_fields=("startTime", "endTime")`
**Data recovery:** Backfilled 2025-07-07 ~ 2026-03-01 in 30-day windows. ODS gained 26,759 new records; `dwd_store_goods_sale` in the DWD layer grew from 17,563 to 26,759 rows, and its time range now extends to 2026-02-25.
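The 30-day window slicing used for the backfill can be sketched as follows. This is a minimal illustration, not the project's scheduler: the helper name `split_windows` and the half-open `[start, end)` convention are assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def split_windows(start: date, end: date, days: int = 30) -> Iterator[Tuple[date, date]]:
    """Yield consecutive half-open (window_start, window_end) pairs covering [start, end).

    Hypothetical helper: the real task may treat the end bound differently.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    step = timedelta(days=days)
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)  # the last window may be shorter
        yield cursor, window_end
        cursor = window_end


# Backfill range taken from the note above.
windows = list(split_windows(date(2025, 7, 7), date(2026, 3, 1)))
for window_start, window_end in windows:
    # Each pair would be passed to the API as startTime / endTime.
    print(window_start.isoformat(), window_end.isoformat())
```

Each pair corresponds to one API call with `startTime`/`endTime` set, which is the behavior `requires_window=True` is meant to restore.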
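### 1.3 dim_staff_ex FACT_MAPPINGS column-name fix
**Problem:** In `dwd_load_task.py`, the FACT_MAPPINGS for `dim_staff_ex` used column names without underscores, derived from the camelCase API fields (`cashierpointid`, `groupid`, etc.), but the actual columns of the ODS table `staff_info_master` are snake_case (`cashier_point_id`, `group_id`, etc.). The SCD2 merge SQL therefore failed and the whole table was skipped.
**Mapping fixes:**
| DWD column | Before (wrong) | After (correct) |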
|--------|---------------|---------------|
| `cashier_point_id` | `cashierpointid` | `cashier_point_id` |
| `cashier_point_name` | `cashierpointname` | `cashier_point_name` |
| `group_id` | `groupid` | `group_id` |
| `group_name` | `groupname` | `group_name` |
| `system_user_id` | `systemuserid` | `system_user_id` |
| `tenant_org_id` | `tenantorgid` | `tenant_org_id` |
| `user_roles` | `userroles` | `user_roles` |
**Data recovery:** After the DWD load, `dim_staff_ex` went from 0 rows to 15 rows.
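A mismatch like this can be caught before any SQL runs by comparing the mapped source columns against the ODS table's actual column set. The sketch below is illustrative only: `find_missing_source_columns` is a hypothetical helper, the column set is a hand-written sample, and the `None` cast for `group_id` is an assumption.

```python
from typing import Iterable, List, Optional, Set, Tuple

# (dwd_column, ods_source_column, type_cast), the tuple shape used in FACT_MAPPINGS
Mapping = Tuple[str, str, Optional[str]]


def find_missing_source_columns(mappings: Iterable[Mapping], ods_columns: Set[str]) -> List[str]:
    """Return mapped ODS source columns that do not exist in the ODS table."""
    missing = []
    for _dwd_col, ods_source, _cast in mappings:
        name = ods_source.strip('"')  # tolerate quoted identifiers such as '"siteid"'
        if name not in ods_columns:
            missing.append(name)
    return missing


# Sample column set; in practice this would come from information_schema.columns.
ods_columns = {"id", "cashier_point_id", "group_id", "user_roles"}

before = [("cashier_point_id", "cashierpointid", "bigint"), ("group_id", "groupid", None)]
after = [("cashier_point_id", "cashier_point_id", "bigint"), ("group_id", "group_id", None)]

print(find_missing_source_columns(before, ods_columns))
print(find_missing_source_columns(after, ods_columns))
```

Running such a check at task start-up would turn a skipped table into an explicit configuration error.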
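### 1.4 [P1] DWS layer numeric precision extension (sweep for similar cases)
**Problem:** `dws.dws_assistant_finance_analysis.gross_margin` was defined as `numeric(5,4)`, which leaves one digit before the decimal point and can hold at most ±9.9999. When `cost_daily > revenue_total` (a loss scenario), `gross_margin = gross_profit / revenue_total` turns negative and, when revenue is small relative to the loss, can fall outside that range, so the INSERT fails with an overflow error. A sweep for the same pattern found 7 fields with the same risk.
**Fix:**
| Table | Column | Before | After |
|----|------|--------|--------|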
| `dws.cfg_performance_tier` | `bonus_deduction_ratio` | `numeric(5,4)` | `numeric(7,4)` |
| `dws.dws_assistant_finance_analysis` | `gross_margin` | `numeric(5,4)` | `numeric(7,4)` |
| `dws.dws_assistant_recharge_commission` | `commission_ratio` | `numeric(5,4)` | `numeric(7,4)` |
| `dws.dws_assistant_salary_calc` | `bonus_deduction_ratio` | `numeric(5,4)` | `numeric(7,4)` |
| `dws.dws_finance_discount_detail` | `discount_ratio` | `numeric(5,4)` | `numeric(7,4)` |
| `dws.dws_finance_income_structure` | `income_ratio` | `numeric(5,4)` | `numeric(7,4)` |
| `dws.dws_member_assistant_intimacy` | `burst_multiplier` | `numeric(6,4)` | `numeric(7,4)` |
**Dependent views:** The 7 `app.v_dws_*` RLS views are dropped first and then recreated.
**Python safeguard:** The `gross_margin` calculation in `assistant_finance_task.py` now clamps the value to ±999.9999.
**Migration script:** `db/etl_feiqiu/migrations/20260301_dws_numeric_precision_fix.sql`
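The clamp can be sketched with `decimal.Decimal`. This is a hedged illustration rather than the code in `assistant_finance_task.py`: the function names, the rounding mode, and returning `None` for zero revenue are all assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def numeric_limit(precision: int, scale: int) -> Decimal:
    """Largest absolute value that fits a PostgreSQL numeric(precision, scale) column."""
    return (Decimal(10) ** precision - 1) / (Decimal(10) ** scale)


def safe_gross_margin(gross_profit: Decimal, revenue_total: Decimal,
                      precision: int = 7, scale: int = 4) -> Optional[Decimal]:
    """Compute gross_profit / revenue_total, clamped to the column range.

    Returning None for zero revenue is an assumption of this sketch.
    """
    if revenue_total == 0:
        return None
    limit = numeric_limit(precision, scale)
    raw = gross_profit / revenue_total
    clamped = max(-limit, min(limit, raw))  # clamp first so quantize stays in range
    return clamped.quantize(Decimal(10) ** -scale, rounding=ROUND_HALF_UP)


print(numeric_limit(5, 4))  # range of the old column definition
print(numeric_limit(7, 4))  # range of the new column definition
print(safe_gross_margin(Decimal("-150"), Decimal("100")))
print(safe_gross_margin(Decimal("-5000000"), Decimal("1")))
```

Deriving the limit from `(precision, scale)` keeps the Python guard aligned with the DDL if the column is widened again.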
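### 1.5 [P2] Add siteid to ODS goods_stock_summary + complete the DWD mapping
**Problem:** The `dwd.dwd_goods_stock_summary` DDL defines `site_id bigint` and `tenant_id bigint`, but FACT_MAPPINGS had no mapping for them, so site_id in the DWD layer was always NULL.
**Root cause analysis:**
- Records returned by the API `GetGoodsStockReport` do not contain `siteId`/`tenantId` (confirmed from the JSON cache)
- The ODS table `ods.goods_stock_summary` has no `siteid` column either
- The request parameters include `siteId`, but it was not injected into the records at ODS ingestion
**Fix (coordinated across three layers, plus a backfill):**
1. Add the ODS column: `ALTER TABLE ods.goods_stock_summary ADD COLUMN siteid bigint`
2. Generic injection at ODS ingestion: in `_insert_records_schema_aware`, when the ODS table has a `siteid` column but the record lacks it, inject the value from the `app.store_id` configuration
3. Add the DWD FACT_MAPPINGS entry: `("site_id", '"siteid"', "bigint")`
4. Backfill existing data: infer siteid from `ods.goods_stock_movements` and backfill 3216 rows
**Migration script:** `db/etl_feiqiu/migrations/20260301_ods_goods_stock_summary_add_siteid.sql`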
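The generic injection in step 2 can be sketched as below. It assumes records are plain dictionaries and that a missing key and a `None` value both count as "record lacks siteid"; `inject_site_id` is a hypothetical name, not the body of `_insert_records_schema_aware`.

```python
from typing import Any, Dict, Iterable, List, Set


def inject_site_id(records: Iterable[Dict[str, Any]], table_columns: Set[str],
                   store_id: int, column: str = "siteid") -> List[Dict[str, Any]]:
    """Fill `column` from configuration when the table has it but a record does not."""
    if column not in table_columns:
        # The target table has no such column: pass records through unchanged.
        return [dict(record) for record in records]
    enriched = []
    for record in records:
        row = dict(record)  # copy so the caller's data is not mutated
        if row.get(column) is None:
            row[column] = store_id
        enriched.append(row)
    return enriched


# Sample values for illustration only.
rows = inject_site_id(
    [{"goods_id": 1}, {"goods_id": 2, "siteid": 42}],
    table_columns={"goods_id", "siteid"},
    store_id=7,
)
print(rows)
```

Records that already carry a site identifier are left untouched, so the injection only affects endpoints like `GetGoodsStockReport` that omit it.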
---
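## 2. Compatibility impact
| Subsystem | Impact |
|--------|------|
| ETL | Tasks related to `assistant_trash_event` no longer have code references, so there is no impact; `ODS_STORE_GOODS_SALES` resumes normal windowed fetching; `dim_staff_ex` resumes normal SCD2 loading; DWS ratio fields no longer overflow after the precision extension; siteid is injected automatically when the ODS inventory summary is ingested |
| Backend API | The 7 `app.v_dws_*` RLS views have been recreated and the column type changes from numeric(5,4) (numeric(6,4) for `burst_multiplier`) to numeric(7,4); the precision of API return values is unchanged (still 4 decimal places), so there is no breaking impact |
| Mini program | No impact |
| Admin console | Product sales reports will have complete data again; inventory summaries will carry site_id |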
---
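## 3. Rollback strategy
### 3.1 assistant_trash_event
Documentation-only cleanup; no rollback is needed. To restore the content, recover the corresponding lines from git history.
### 3.2 ODS_STORE_GOODS_SALES
```python
# Roll back ods_tasks.py
# requires_window=True  ->  requires_window=False
# Delete the time_fields=("startTime", "endTime") line
```
### 3.3 dim_staff_ex
```python
# Roll back FACT_MAPPINGS in dwd_load_task.py
# ("cashier_point_id", "cashier_point_id", "bigint")  ->  ("cashier_point_id", "cashierpointid", "bigint")
# Restore the previous spelling without underscores for the other 6 fields in the same way
```
To clear the data already loaded: `TRUNCATE dwd.dim_staff_ex;`
### 3.4 [P1] DWS numeric precision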
```sql
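-- Rollback script: db/etl_feiqiu/migrations/20260301_dws_numeric_precision_fix_rollback.sql
-- Note: the rollback fails if existing data exceeds the original range (numeric(5,4) holds at most ±9.9999)
-- Clean up out-of-range data first:
UPDATE dws.dws_assistant_finance_analysis SET gross_margin = LEAST(9.9999, GREATEST(-9.9999, gross_margin));
-- Then run the rollback (the views again have to be dropped first and recreated afterwards)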
```
Python rollback: remove the clamp line in `assistant_finance_task.py`.
### 3.5 [P2] ODS siteid column
```sql
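-- Dropping the ODS column is irreversible (existing data depends on it), so it is kept; it can be set to NULL instead
UPDATE ods.goods_stock_summary SET siteid = NULL;
-- DWD FACT_MAPPINGS rollback: delete the ("site_id", '"siteid"', "bigint") line
-- ODS ingestion injection rollback: delete the "generic siteid injection" code block in ods_tasks.py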
```
---
## 4. Verification SQL
```sql
-- 1. Confirm the assistant_trash_event tables no longer exist
SELECT count(*) FROM information_schema.tables
WHERE table_schema = 'dwd' AND table_name LIKE '%assistant_trash_event%';
-- Expected: 0
-- 2. Confirm the ODS product sales data has been backfilled
SELECT count(*) as cnt, min(create_time) as min_time, max(create_time) as max_time
FROM ods.store_goods_sales_records WHERE fetched_at IS NOT NULL;
-- Expected: cnt > 40000, max_time >= 2026-02-25
-- 3. Confirm the DWD product sales data has been updated
SELECT count(*) as cnt, max(create_time) as max_time FROM dwd.dwd_store_goods_sale;
-- Expected: cnt > 25000, max_time >= 2026-02-25
-- 4. Confirm dim_staff_ex has data
SELECT count(*) as cnt FROM dwd.dim_staff_ex WHERE scd2_is_current = 1;
-- Expected: cnt = 15 (matches dim_staff)
-- 5. Confirm key dim_staff_ex columns are not all NULL
SELECT count(*) as has_system_user FROM dwd.dim_staff_ex
WHERE scd2_is_current = 1 AND system_user_id IS NOT NULL;
-- Expected: > 0
-- 6. [P1] Confirm the DWS ratio column precision has been extended
SELECT table_name, column_name, numeric_precision, numeric_scale
FROM information_schema.columns
WHERE table_schema = 'dws'
AND column_name IN ('gross_margin', 'bonus_deduction_ratio', 'commission_ratio',
'discount_ratio', 'income_ratio', 'burst_multiplier')
ORDER BY table_name;
-- Expected: every row has numeric_precision=7, numeric_scale=4
-- 7. [P1] Confirm the RLS views have been recreated
SELECT table_schema || '.' || table_name FROM information_schema.views
WHERE table_schema = 'app'
AND table_name IN ('v_cfg_performance_tier', 'v_dws_assistant_finance_analysis',
'v_dws_assistant_recharge_commission', 'v_dws_assistant_salary_calc',
'v_dws_finance_discount_detail', 'v_dws_finance_income_structure',
'v_dws_member_assistant_intimacy');
-- Expected: 7 rows
-- 8. [P2] Confirm ODS goods_stock_summary has a siteid column and has been backfilled
SELECT count(*) as total, count(siteid) as filled, count(DISTINCT siteid) as distinct_sites
FROM ods.goods_stock_summary;
-- Expected: filled = total, distinct_sites = 1
-- 9. [P2] Confirm the DWD FACT_MAPPINGS entry takes effect (verify after the next DWD_LOAD)
SELECT count(*) as has_site_id FROM dwd.dwd_goods_stock_summary WHERE site_id IS NOT NULL;
-- Expected: > 0 after the next ETL run (may still be 0 now; DWD_LOAD must be rerun)
```


@@ -0,0 +1,186 @@
# BD_Manual: dws.biz_date() function and business-day rebuild of materialized views
> Change type: new function + rebuilt materialized views
> Schema: `dws`
> Migration script 1: `db/etl_feiqiu/migrations/2026-02-27__add_biz_date_function.sql`
> Migration script 2: `db/etl_feiqiu/migrations/2026-02-27__rebuild_mv_with_biz_date.sql`
> DDL baseline: `docs/database/ddl/etl_feiqiu__dws.sql`
> Related requirements: Requirements 9.1, 9.2, 9.3, 9.4
---
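## 1. Change description
### 1.1 New function: `dws.biz_date(timestamptz, int)`
| Attribute | Value |
|------|-----|
| Function signature | `dws.biz_date(ts timestamptz, cutoff_hour int DEFAULT 8) RETURNS date` |
| Language | SQL |
| Properties | `IMMUTABLE PARALLEL SAFE` |
| Logic | `(ts - make_interval(hours => cutoff_hour))::date` |
| Python equivalent | `neozqyy_shared.datetime_utils.business_date()` |
The function subtracts `cutoff_hour` hours from the timestamp and takes the date of the result, which assigns the timestamp to a business day. The default is `cutoff_hour=8`, so timestamps before 08:00 belong to the previous day.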
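The same rule can be written in a few lines of Python. This is a stand-alone sketch, not the source of `neozqyy_shared.datetime_utils.business_date()`: the explicit `tz` parameter and its UTC+8 default are assumptions chosen to match the `+08` timestamps used in the verification SQL.

```python
from datetime import date, datetime, timedelta, timezone

# Assumed local time zone for the illustration (UTC+8).
LOCAL_TZ = timezone(timedelta(hours=8))


def business_date(ts: datetime, cutoff_hour: int = 8, tz: timezone = LOCAL_TZ) -> date:
    """Shift the local timestamp back by cutoff_hour hours and take its date."""
    if ts.tzinfo is None:
        raise ValueError("ts must be timezone-aware")
    local = ts.astimezone(tz)
    return (local - timedelta(hours=cutoff_hour)).date()


print(business_date(datetime(2026, 1, 15, 7, 59, 59, tzinfo=LOCAL_TZ)))  # before the cutoff
print(business_date(datetime(2026, 1, 15, 8, 0, 0, tzinfo=LOCAL_TZ)))    # at the cutoff
print(business_date(datetime(2026, 2, 1, 7, 0, 0, tzinfo=LOCAL_TZ)))     # month boundary
```

Making the time zone explicit matters because the date of a shifted timestamp depends on the zone in which it is read.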
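### 1.2 Rebuilt materialized views (8)
`CURRENT_DATE` is replaced with `dws.biz_date(NOW())` so that the data range of the materialized views matches the business-day definition used by the DWS tasks.
| Materialized view | Old condition | New condition |
|---------|--------|--------|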
| `mv_dws_assistant_daily_detail_l1` | `stat_date >= (CURRENT_DATE - '1 day')` | `stat_date >= (dws.biz_date(NOW()) - '1 day')` |
| `mv_dws_assistant_daily_detail_l2` | `stat_date >= (CURRENT_DATE - '30 days')` | `stat_date >= (dws.biz_date(NOW()) - '30 days')` |
| `mv_dws_assistant_daily_detail_l3` | `stat_date >= (CURRENT_DATE - '90 days')` | `stat_date >= (dws.biz_date(NOW()) - '90 days')` |
| `mv_dws_assistant_daily_detail_l4` | `date_trunc('month', CURRENT_DATE) ± 6 mons` | `date_trunc('month', dws.biz_date(NOW())) ± 6 mons` |
| `mv_dws_finance_daily_summary_l1` | `stat_date >= (CURRENT_DATE - '1 day')` | `stat_date >= (dws.biz_date(NOW()) - '1 day')` |
| `mv_dws_finance_daily_summary_l2` | `stat_date >= (CURRENT_DATE - '30 days')` | `stat_date >= (dws.biz_date(NOW()) - '30 days')` |
| `mv_dws_finance_daily_summary_l3` | `stat_date >= (CURRENT_DATE - '90 days')` | `stat_date >= (dws.biz_date(NOW()) - '90 days')` |
| `mv_dws_finance_daily_summary_l4` | `date_trunc('month', CURRENT_DATE) ± 6 mons` | `date_trunc('month', dws.biz_date(NOW())) ± 6 mons` |
Indexes are recreated after the rebuild; their structure is unchanged.
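The `l4` condition is the least obvious one, so here is a small sketch of the window it selects. It assumes the rebuilt views keep the same shape as the original definitions (lower bound at the month start minus 6 months, upper bound exclusive at the month start) and only swap the reference date; `l4_window` is a hypothetical helper.

```python
from datetime import date
from typing import Tuple


def l4_window(reference_day: date, months: int = 6) -> Tuple[date, date]:
    """Return (lower_inclusive, upper_exclusive) for the l4 views.

    Mirrors: stat_date >= date_trunc('month', d) - '6 mons'
         AND stat_date <  date_trunc('month', d)
    """
    month_start = reference_day.replace(day=1)
    # Month arithmetic on a zero-based month index.
    index = month_start.year * 12 + (month_start.month - 1) - months
    lower = date(index // 12, index % 12 + 1, 1)
    return lower, month_start


# At 2026-03-01 07:00 local time the calendar date is 2026-03-01,
# but with an 08:00 cutoff the business date is still 2026-02-28.
print(l4_window(date(2026, 3, 1)))   # calendar-day reference
print(l4_window(date(2026, 2, 28)))  # business-day reference
```

The two calls differ by a full month, which is the kind of discrepancy the switch to `dws.biz_date(NOW())` removes during the early-morning hours of the first day of a month.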
---
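## 2. Compatibility notes
| Scope | Notes |
|---------|------|
| ETL tasks | `MvRefreshTask` (`DWS_MV_REFRESH_*`) runs `REFRESH MATERIALIZED VIEW` and is unaffected; the view definition change is transparent to the refresh logic |
| Backend API | No direct impact. The backend queries the DWS tables; the materialized views are only used to speed up queries |
| Admin console | No impact. The frontend does not query the materialized views directly |
| Mini program | No impact |
| Field mapping | The column structure of the materialized views is unchanged (`SELECT *`); only the WHERE condition changes |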
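| `biz_date()` function | Marked `IMMUTABLE`, so it can be used in index expressions and materialized view definitions. The `timestamptz` to `date` cast follows the session `TimeZone` setting, so results are only stable while that setting stays fixed |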
---
## 3. Rollback strategy
### 3.1 Roll back the function
```sql
DROP FUNCTION IF EXISTS dws.biz_date(timestamptz, int);
```
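> Note: roll back the materialized views first (3.2); otherwise the view definitions that depend on this function block the drop.
### 3.2 Roll back the materialized views (restore calendar-day semantics)
Rebuild them using the original definitions from `scripts/migrate/migrate_finalize.py`: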
```sql
BEGIN;
DROP MATERIALIZED VIEW IF EXISTS dws.mv_dws_assistant_daily_detail_l1;
DROP MATERIALIZED VIEW IF EXISTS dws.mv_dws_assistant_daily_detail_l2;
DROP MATERIALIZED VIEW IF EXISTS dws.mv_dws_assistant_daily_detail_l3;
DROP MATERIALIZED VIEW IF EXISTS dws.mv_dws_assistant_daily_detail_l4;
DROP MATERIALIZED VIEW IF EXISTS dws.mv_dws_finance_daily_summary_l1;
DROP MATERIALIZED VIEW IF EXISTS dws.mv_dws_finance_daily_summary_l2;
DROP MATERIALIZED VIEW IF EXISTS dws.mv_dws_finance_daily_summary_l3;
DROP MATERIALIZED VIEW IF EXISTS dws.mv_dws_finance_daily_summary_l4;
-- Rebuild with the original definitions (CURRENT_DATE version)
CREATE MATERIALIZED VIEW dws.mv_dws_assistant_daily_detail_l1 AS
SELECT * FROM dws.dws_assistant_daily_detail
WHERE stat_date >= (CURRENT_DATE - '1 day'::interval) WITH DATA;
CREATE MATERIALIZED VIEW dws.mv_dws_assistant_daily_detail_l2 AS
SELECT * FROM dws.dws_assistant_daily_detail
WHERE stat_date >= (CURRENT_DATE - '30 days'::interval) WITH DATA;
CREATE MATERIALIZED VIEW dws.mv_dws_assistant_daily_detail_l3 AS
SELECT * FROM dws.dws_assistant_daily_detail
WHERE stat_date >= (CURRENT_DATE - '90 days'::interval) WITH DATA;
CREATE MATERIALIZED VIEW dws.mv_dws_assistant_daily_detail_l4 AS
SELECT * FROM dws.dws_assistant_daily_detail
WHERE stat_date >= (date_trunc('month', CURRENT_DATE::timestamptz) - '6 mons'::interval)
AND stat_date < date_trunc('month', CURRENT_DATE::timestamptz) WITH DATA;
CREATE MATERIALIZED VIEW dws.mv_dws_finance_daily_summary_l1 AS
SELECT * FROM dws.dws_finance_daily_summary
WHERE stat_date >= (CURRENT_DATE - '1 day'::interval) WITH DATA;
CREATE MATERIALIZED VIEW dws.mv_dws_finance_daily_summary_l2 AS
SELECT * FROM dws.dws_finance_daily_summary
WHERE stat_date >= (CURRENT_DATE - '30 days'::interval) WITH DATA;
CREATE MATERIALIZED VIEW dws.mv_dws_finance_daily_summary_l3 AS
SELECT * FROM dws.dws_finance_daily_summary
WHERE stat_date >= (CURRENT_DATE - '90 days'::interval) WITH DATA;
CREATE MATERIALIZED VIEW dws.mv_dws_finance_daily_summary_l4 AS
SELECT * FROM dws.dws_finance_daily_summary
WHERE stat_date >= (date_trunc('month', CURRENT_DATE::timestamptz) - '6 mons'::interval)
AND stat_date < date_trunc('month', CURRENT_DATE::timestamptz) WITH DATA;
-- Recreate indexes
CREATE INDEX idx_mv_assistant_daily_l1 ON dws.mv_dws_assistant_daily_detail_l1 USING btree (site_id, stat_date, assistant_id);
CREATE INDEX idx_mv_assistant_daily_l2 ON dws.mv_dws_assistant_daily_detail_l2 USING btree (site_id, stat_date, assistant_id);
CREATE INDEX idx_mv_assistant_daily_l3 ON dws.mv_dws_assistant_daily_detail_l3 USING btree (site_id, stat_date, assistant_id);
CREATE INDEX idx_mv_assistant_daily_l4 ON dws.mv_dws_assistant_daily_detail_l4 USING btree (site_id, stat_date, assistant_id);
CREATE INDEX idx_mv_finance_daily_l1 ON dws.mv_dws_finance_daily_summary_l1 USING btree (site_id, stat_date);
CREATE INDEX idx_mv_finance_daily_l2 ON dws.mv_dws_finance_daily_summary_l2 USING btree (site_id, stat_date);
CREATE INDEX idx_mv_finance_daily_l3 ON dws.mv_dws_finance_daily_summary_l3 USING btree (site_id, stat_date);
CREATE INDEX idx_mv_finance_daily_l4 ON dws.mv_dws_finance_daily_summary_l4 USING btree (site_id, stat_date);
COMMIT;
```
---
## 4. Verification SQL
### 4.1 Confirm the `biz_date()` function exists and behaves correctly
```sql
-- Before 08:00: belongs to the previous day
SELECT dws.biz_date('2026-01-15 07:59:59+08'::timestamptz) AS should_be_0114;
-- Expected: 2026-01-14
-- From 08:00 on: belongs to the same day
SELECT dws.biz_date('2026-01-15 08:00:00+08'::timestamptz) AS should_be_0115;
-- Expected: 2026-01-15
-- Month boundary
SELECT dws.biz_date('2026-02-01 07:00:00+08'::timestamptz, 8) AS should_be_0131;
-- Expected: 2026-01-31
```
### 4.2 Confirm the 8 materialized views have been rebuilt and their definitions contain `biz_date`
```sql
SELECT matviewname, definition LIKE '%biz_date%' AS uses_biz_date
FROM pg_matviews
WHERE schemaname = 'dws'
AND matviewname LIKE 'mv_dws_%'
ORDER BY matviewname;
-- Expected: 8 rows, uses_biz_date is true for all of them
```
### 4.3 Confirm the materialized view indexes are complete
```sql
SELECT indexname, tablename
FROM pg_indexes
WHERE schemaname = 'dws'
AND tablename LIKE 'mv_dws_%'
ORDER BY indexname;
-- Expected: 8 indexes (assistant_daily l1-l4 + finance_daily l1-l4)
```
### 4.4 Confirm the materialized views contain data (after refresh)
```sql
SELECT 'assistant_l1' AS mv, COUNT(*) FROM dws.mv_dws_assistant_daily_detail_l1
UNION ALL
SELECT 'assistant_l2', COUNT(*) FROM dws.mv_dws_assistant_daily_detail_l2
UNION ALL
SELECT 'assistant_l3', COUNT(*) FROM dws.mv_dws_assistant_daily_detail_l3
UNION ALL
SELECT 'assistant_l4', COUNT(*) FROM dws.mv_dws_assistant_daily_detail_l4
UNION ALL
SELECT 'finance_l1', COUNT(*) FROM dws.mv_dws_finance_daily_summary_l1
UNION ALL
SELECT 'finance_l2', COUNT(*) FROM dws.mv_dws_finance_daily_summary_l2
UNION ALL
SELECT 'finance_l3', COUNT(*) FROM dws.mv_dws_finance_daily_summary_l3
UNION ALL
SELECT 'finance_l4', COUNT(*) FROM dws.mv_dws_finance_daily_summary_l4;
```


@@ -0,0 +1,94 @@
# BD_Manual: fix dim_staff_ex column mapping rankname → rank_name
> Affected table: `dwd.dim_staff_ex`
> ODS source table: `ods.staff_info_master`
> Fix date: 2026-02-26
> Code location: `apps/etl/connectors/feiqiu/tasks/dwd/dwd_load_task.py`
> Trigger scenario: loading dim_staff_ex failed in the `DWD_LOAD_FROM_ODS` stage when running `FLOW_API_FULL`
---
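## 1. Change description
In the DWD load task `DwdLoadTask`, the column mapping for `dwd.dim_staff_ex` was defined incorrectly:
```python
# Before (wrong)
("rank_name", "rankname", None)
# After (correct)
("rank_name", "rank_name", None)
```
Meaning of the mapping tuple: `(dwd_column, ods_source_column, type_cast)`
The actual column name in the ODS table `ods.staff_info_master` is `rank_name` (with an underscore), not `rankname`. This error made PostgreSQL raise `UndefinedColumn: column "rankname" does not exist`, and the SCD2 merge for dim_staff_ex failed in every window segment (4 times in total).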
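To see why a wrong source name only surfaces at execution time, it helps to picture how such a tuple might be turned into a SELECT expression. The renderer below is purely hypothetical (the real `DwdLoadTask` SQL generation is not shown in this note); it only illustrates that the second tuple element ends up verbatim in the SQL text.

```python
from typing import Iterable, Optional, Tuple

# (dwd_column, ods_source_column, type_cast)
Mapping = Tuple[str, str, Optional[str]]


def render_select_list(mappings: Iterable[Mapping]) -> str:
    """Render mapping tuples as a SELECT list (hypothetical sketch)."""
    parts = []
    for dwd_col, ods_source, cast in mappings:
        # Quote bare names; keep already quoted identifiers as they are.
        expr = ods_source if ods_source.startswith('"') else f'"{ods_source}"'
        if cast:
            expr = f"CAST({expr} AS {cast})"
        parts.append(f'{expr} AS "{dwd_col}"')
    return ", ".join(parts)


print(render_select_list([("rank_name", "rankname", None)]))   # references a missing column
print(render_select_list([("rank_name", "rank_name", None)]))  # references the real column
```

Nothing in Python validates the source name, so the database is the first component that can reject it.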
---
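## 2. Compatibility impact
| Component | Impact |
|------|------|
| ETL DWD layer | dim_staff_ex loads normally again; the rank_name field is mapped correctly from ODS |
| DWS layer | No direct impact (no DWS task currently depends on dim_staff_ex.rank_name) |
| Backend API | No impact (the backend reads through FDW; the table structure is unchanged) |
| Mini program | No impact |
| DDL | No change; the table structure is unchanged |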
---
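## 3. Rollback strategy
This fix only touches a column-mapping string in Python code; there is no DDL change.
Rollback steps:
1. In `dwd_load_task.py`, restore the dim_staff_ex mapping to `("rank_name", "rankname", None)`
2. Note: after the rollback, dim_staff_ex will again be unable to load the rank_name field
Data already loaded needs no rollback, because before the fix this field was never written successfully.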
---
## 4. Verification SQL
```sql
-- Check 1: confirm the ODS source column is named rank_name (not rankname)
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'ods'
AND table_name = 'staff_info_master'
AND column_name IN ('rank_name', 'rankname');
-- Expected: only rank_name is returned
-- Check 2: confirm the DWD target table has a rank_name column
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'dwd'
AND table_name = 'dim_staff_ex'
AND column_name = 'rank_name';
-- Expected: 1 row
-- Check 3: after the fix, rerun the ETL and check whether dim_staff_ex has non-null rank_name data
SELECT COUNT(*) AS total,
COUNT(rank_name) AS has_rank_name
FROM dwd.dim_staff_ex
WHERE scd2_is_current = 1;
-- Expected: has_rank_name > 0 (depends on whether the upstream data has values)
-- Check 4: compare rank_name between ODS and DWD
SELECT s.id AS staff_id, s.rank_name AS ods_rank_name, d.rank_name AS dwd_rank_name
FROM ods.staff_info_master s
JOIN dwd.dim_staff_ex d ON s.id = d.staff_id AND d.scd2_is_current = 1
WHERE s.rank_name IS DISTINCT FROM d.rank_name
LIMIT 10;
-- Expected: 0 rows after the fix and a rerun
```
---
## 5. Mapping correction log
| Date | Field | Correction |
|------|------|---------|
| 2026-02-26 | `rank_name` | ODS source column name corrected from `rankname` to `rank_name`, consistent with the `ods.staff_info_master` DDL |


@@ -0,0 +1,132 @@
# BD_Manual: fix missing table_area_name column in DWS_ASSISTANT_DAILY
> Affected task: `DWS_ASSISTANT_DAILY`
> Tables involved: `dwd.dwd_assistant_service_log` (read), `dwd.dim_table` (new JOIN), `dws.dws_assistant_daily` (write)
> Fix date: 2026-02-26
> Code location: `apps/etl/connectors/feiqiu/tasks/dws/assistant_daily_task.py` → `_extract_service_records()`
> Trigger scenario: the first task of the DWS stage failed when running `FLOW_API_FULL`, which cascaded into `InFailedSqlTransaction` for all 13 subsequent DWS tasks
---
## 1. Change description
The SQL in `_extract_service_records()` originally read `asl.table_area_name` directly from `dwd.dwd_assistant_service_log`, but that table has no such column (the DDL confirms this).
Fix: obtain the table area name through a LEFT JOIN to `dwd.dim_table`.
```sql
-- Before (wrong)
SELECT ...
asl.table_area_name,
...
FROM dwd.dwd_assistant_service_log asl
LEFT JOIN dwd.dwd_assistant_service_log_ex ex ...
-- After (correct)
SELECT ...
COALESCE(dt.site_table_area_name, '') AS table_area_name,
...
FROM dwd.dwd_assistant_service_log asl
LEFT JOIN dwd.dwd_assistant_service_log_ex ex ...
LEFT JOIN dwd.dim_table dt
ON asl.site_table_id = dt.table_id
AND dt.scd2_is_current = 1
```
Notes on the JOIN condition:
- `asl.site_table_id = dt.table_id`: joins the dimension table by table ID
- `dt.scd2_is_current = 1`: takes only the currently valid SCD2 version
- `COALESCE(..., '')`: falls back to an empty string when dim_table has no match, which avoids NULL propagation
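The effect of the three conditions can be reproduced on a toy dataset. The sketch below uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, with simplified table names and made-up rows; it only demonstrates the join semantics, not the production query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE service_log (id INTEGER, site_table_id INTEGER);
    CREATE TABLE dim_table (table_id INTEGER, site_table_area_name TEXT, scd2_is_current INTEGER);
    -- Row 1 points at a table with two SCD2 versions; row 2 has no dimension match.
    INSERT INTO service_log VALUES (1, 100), (2, 200);
    INSERT INTO dim_table VALUES (100, 'Old hall', 0), (100, 'Hall A', 1);
    """
)

rows = conn.execute(
    """
    SELECT asl.id,
           COALESCE(dt.site_table_area_name, '') AS table_area_name
    FROM service_log asl
    LEFT JOIN dim_table dt
           ON asl.site_table_id = dt.table_id
          AND dt.scd2_is_current = 1
    ORDER BY asl.id
    """
).fetchall()
conn.close()

print(rows)
```

Keeping `scd2_is_current = 1` in the `ON` clause rather than in `WHERE` is what preserves unmatched service-log rows: a `WHERE` filter on the dimension column would discard them after the join.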
---
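## 2. Cascading failure
This bug did not only make `DWS_ASSISTANT_DAILY` itself fail. Because the DWS stage shares a single database connection and has no per-task rollback, psycopg2 entered the `InFailedSqlTransaction` state and all 13 subsequent DWS/INDEX tasks failed: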
- DWS_ASSISTANT_CUSTOMER, DWS_ASSISTANT_SALARY, DWS_ASSISTANT_FINANCE, DWS_ASSISTANT_MONTHLY
- DWS_MEMBER_CONSUMPTION, DWS_MEMBER_VISIT
- DWS_FINANCE_DAILY, DWS_FINANCE_RECHARGE, DWS_FINANCE_INCOME_STRUCTURE, DWS_FINANCE_DISCOUNT_DETAIL
- DWS_WINBACK_INDEX, DWS_NEWCONV_INDEX, DWS_RELATION_INDEX
Once this root cause is fixed, all of the tasks above run normally again.
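The cascade mechanism can be illustrated without a database. The sketch below uses a hypothetical `FakeConnection` that mimics an aborted transaction; it is not the project's task runner, and per-task rollback is shown only as an illustration of the mechanism, not as something this change introduced.

```python
from typing import Callable, Dict, Iterable, Tuple


class FakeConnection:
    """Hypothetical stand-in that mimics an aborted-transaction state."""

    def __init__(self) -> None:
        self.aborted = False

    def execute(self, ok: bool) -> None:
        if self.aborted:
            # Every statement fails until the transaction is rolled back.
            raise RuntimeError("InFailedSqlTransaction")
        if not ok:
            self.aborted = True
            raise RuntimeError("UndefinedColumn")

    def commit(self) -> None:
        pass

    def rollback(self) -> None:
        self.aborted = False


Task = Callable[[FakeConnection], None]


def run_tasks(conn: FakeConnection, tasks: Iterable[Tuple[str, Task]],
              rollback_on_error: bool) -> Dict[str, str]:
    """Run tasks on one shared connection and record each outcome."""
    results: Dict[str, str] = {}
    for name, task in tasks:
        try:
            task(conn)
            conn.commit()
            results[name] = "ok"
        except RuntimeError as exc:
            if rollback_on_error:
                conn.rollback()  # clears the aborted state for the next task
            results[name] = str(exc)
    return results


tasks = [
    ("DWS_ASSISTANT_DAILY", lambda c: c.execute(ok=False)),
    ("DWS_FINANCE_DAILY", lambda c: c.execute(ok=True)),
]
without_rollback = run_tasks(FakeConnection(), tasks, rollback_on_error=False)
with_rollback = run_tasks(FakeConnection(), tasks, rollback_on_error=True)
print(without_rollback)
print(with_rollback)
```

Without the rollback, the healthy second task inherits the failure of the first, which matches the 13 follow-on failures listed above.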
---
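## 3. Compatibility impact
| Component | Impact |
|------|------|
| ETL DWS layer | `dws_assistant_daily` is written normally again; the source of `table_area_name` changes from a nonexistent column to the dim_table dimension table |
| Subsequent DWS tasks | The cascading failure is eliminated; all DWS tasks run normally |
| Backend API | No impact (the DWS aggregate table structure is unchanged) |
| Admin console | The assistant daily report will show the table area name correctly |
| DDL | No change; no new tables or columns |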
---
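## 4. Rollback strategy
This fix only touches a SQL query in Python code; there is no DDL change.
Rollback steps:
1. In `assistant_daily_task.py`, restore the SQL in `_extract_service_records()` to `asl.table_area_name` and remove the `LEFT JOIN dwd.dim_table`
2. Note: after the rollback, DWS_ASSISTANT_DAILY will fail again
To roll back DWS data that has already been written: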
```sql
-- Delete the dws_assistant_daily data written after the fix (run as needed)
DELETE FROM dws.dws_assistant_daily
WHERE stat_date >= '2025-11-01';
```
---
## 5. Verification SQL
```sql
-- Check 1: confirm dwd_assistant_service_log really has no table_area_name column
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'dwd'
AND table_name = 'dwd_assistant_service_log'
AND column_name = 'table_area_name';
-- Expected: 0 rows
-- Check 2: confirm dim_table has a site_table_area_name column
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'dwd'
AND table_name = 'dim_table'
AND column_name = 'site_table_area_name';
-- Expected: 1 row
-- Check 3: check JOIN coverage (share of service_log site_table_id values matched in dim_table)
SELECT
COUNT(*) AS total_records,
COUNT(dt.table_id) AS matched_dim_table,
ROUND(COUNT(dt.table_id)::numeric / NULLIF(COUNT(*), 0) * 100, 1) AS match_pct
FROM dwd.dwd_assistant_service_log asl
LEFT JOIN dwd.dim_table dt
ON asl.site_table_id = dt.table_id
AND dt.scd2_is_current = 1
WHERE asl.is_delete = 0;
-- Expected: match_pct close to 100%
-- Check 4: after the fix, rerun the ETL and check whether dws_assistant_daily has data
SELECT stat_date, COUNT(*) AS rows
FROM dws.dws_assistant_daily
WHERE stat_date >= '2025-11-01'
GROUP BY stat_date
ORDER BY stat_date
LIMIT 10;
-- Expected: data rows are returned
```
---
## 6. Code references
- Fixed file: `apps/etl/connectors/feiqiu/tasks/dws/assistant_daily_task.py`, `_extract_service_records()`
- Related BD Manual: `BD_Manual_assistant_service_records.md` (field mapping document for dwd_assistant_service_log)
- dim_table DDL: `docs/database/ddl/etl_feiqiu__dwd.sql` (contains the `site_table_area_name` column definition)


@@ -0,0 +1,128 @@
# BD_Manual: tenant_id INTEGER → BIGINT migration
> Date: 2026-03-03
> Databases involved: `etl_feiqiu` / `test_etl_feiqiu`, `zqyy_app` / `test_zqyy_app`
> Migration scripts:
> - `db/etl_feiqiu/migrations/2026-03-03__alter_tenant_id_int_to_bigint.sql`
> - `db/zqyy_app/migrations/2026-03-03__alter_tenant_id_int_to_bigint.sql`
> Direct cause: Feiqiu tenant_id values (e.g. 2790683160709957) far exceed the int4 upper limit (2,147,483,647), which causes an overflow on write
> Prompt summary: fix the tenant_id int4 overflow by migrating the column to bigint
---
## 1. Change description
### Before
| Database | Schema | Table | Column | Type |
|----|--------|----|----|------|
| etl_feiqiu | dws | dws_assistant_order_contribution | tenant_id | INTEGER (int4) NOT NULL |
| zqyy_app | auth | site_code_mapping | tenant_id | INTEGER (int4) NULL |
### After
| Database | Schema | Table | Column | Type |
|----|--------|----|----|------|
| etl_feiqiu | dws | dws_assistant_order_contribution | tenant_id | BIGINT (int8) NOT NULL |
| zqyy_app | auth | site_code_mapping | tenant_id | BIGINT (int8) NULL |
### Cascading impact
| Object | Type | Handling |
|------|------|---------|
| `app.v_dws_assistant_order_contribution` (ETL database) | RLS view | DROP → ALTER → recreate (SELECT *) |
| `fdw_etl.v_dws_assistant_order_contribution` (App database) | FDW foreign table | DROP → re-import with IMPORT FOREIGN SCHEMA |
| `app.v_dws_assistant_order_contribution` (App database) | RLS view | Inherits the FDW foreign table type automatically |
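The overflow named as the direct cause is easy to confirm with plain integer arithmetic. The helper names below are made up for the illustration and are not the removed `_safe_tenant_id()` logic.

```python
# Signed 32-bit (int4) and 64-bit (int8) ranges used by PostgreSQL integer types.
INT4_MIN, INT4_MAX = -(2 ** 31), 2 ** 31 - 1
INT8_MIN, INT8_MAX = -(2 ** 63), 2 ** 63 - 1


def fits_int4(value: int) -> bool:
    """True if value can be stored in an INTEGER column."""
    return INT4_MIN <= value <= INT4_MAX


def fits_int8(value: int) -> bool:
    """True if value can be stored in a BIGINT column."""
    return INT8_MIN <= value <= INT8_MAX


tenant_id = 2790683160709957  # example value quoted in the header of this note

print(INT4_MAX)
print(fits_int4(tenant_id), fits_int8(tenant_id))
```

The example value has 16 decimal digits, so it is far outside the int4 range while leaving ample headroom in int8.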
---
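## 2. Compatibility impact
| Component | Impact | Notes |
|------|------|------|
| ETL tasks | No impact | `assistant_order_contribution_task.py` reads tenant_id from the DWD layer (already bigint); writes to DWS now match in type |
| Backend API | No impact | Reads through FDW views; the type follows the source table automatically |
| Mini program | No impact | Does not use tenant_id directly |
| `init_test_user.py` | Updated | Removed the int4 range-check fallback logic in `_safe_tenant_id()` |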
---
## 3. Rollback strategy
### ETL database rollback
```sql
BEGIN;
DROP VIEW IF EXISTS app.v_dws_assistant_order_contribution CASCADE;
ALTER TABLE dws.dws_assistant_order_contribution ALTER COLUMN tenant_id TYPE integer;
CREATE OR REPLACE VIEW app.v_dws_assistant_order_contribution AS
SELECT * FROM dws.dws_assistant_order_contribution
WHERE site_id = current_setting('app.current_site_id')::bigint;
GRANT SELECT ON app.v_dws_assistant_order_contribution TO app_reader;
COMMIT;
```
### App database rollback
```sql
BEGIN;
ALTER TABLE auth.site_code_mapping ALTER COLUMN tenant_id TYPE integer;
COMMIT;
```
### Code rollback
Restore the int4 range-check logic of `_safe_tenant_id()` in `scripts/ops/init_test_user.py`.
---
## 4. Verification SQL
### ETL database (test_etl_feiqiu)
```sql
-- 1. Confirm the tenant_id type of the dws table
SELECT column_name, data_type, udt_name
FROM information_schema.columns
WHERE table_schema = 'dws'
AND table_name = 'dws_assistant_order_contribution'
AND column_name = 'tenant_id';
-- Expected: data_type = 'bigint', udt_name = 'int8'
-- 2. Confirm the RLS view exists and has the correct type
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'app'
AND table_name = 'v_dws_assistant_order_contribution'
AND column_name = 'tenant_id';
-- Expected: data_type = 'bigint'
-- 3. Confirm no int4 tenant_id remains anywhere in the database
SELECT table_schema, table_name
FROM information_schema.columns
WHERE column_name = 'tenant_id' AND udt_name = 'int4';
-- Expected: 0 rows
```
### App database (test_zqyy_app)
```sql
-- 1. Confirm the tenant_id type of the auth table
SELECT column_name, data_type, udt_name
FROM information_schema.columns
WHERE table_schema = 'auth'
AND table_name = 'site_code_mapping'
AND column_name = 'tenant_id';
-- Expected: data_type = 'bigint', udt_name = 'int8'
-- 2. Confirm the FDW foreign table has the correct type
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'fdw_etl'
AND table_name = 'v_dws_assistant_order_contribution'
AND column_name = 'tenant_id';
-- Expected: data_type = 'bigint'
-- 3. Confirm no int4 tenant_id remains anywhere in the database
SELECT table_schema, table_name
FROM information_schema.columns
WHERE column_name = 'tenant_id' AND udt_name = 'int4';
-- Expected: 0 rows
```