Neo-ZQYY/docs/specs/board-finance-dws-area-refactor/design.md
# Design Document: Finance Board DWS Area-Dimension Refactor
## Overview
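This refactor fixes the core bug in the finance board where, when `area≠all`, discount data is read from the global DWS table, which badly distorts the area-level discount ratio (for example, a discount ratio of 417.9% for B区, area B).

It uses a two-layer architecture:

1. **Atomic layer** `dws_finance_area_daily`: stores precomputed revenue, discount, cash-flow, and related data for the 9 areas at daily granularity, keyed by `(site_id, stat_date, area_code)`
2. **Cache layer** `dws_finance_board_cache`: caches aggregated results for completed periods to avoid repeating the SUM computation

Backend queries become: for a completed period, check the cache first → on a miss, SUM from the daily table → for a current period, SUM directly from the daily table. The API signature and response structure stay exactly the same, so the frontend needs no changes.

Key design decisions:

- Discounts are aggregated directly by the table area of each settlement, with no apportioning (each settlement corresponds to exactly one table)
- The area mapping is extracted into a shared package, so the ETL and the backend use the same configuration
- `discount_gift_card` uses the gift-card consumption amount (consistent with the existing ETL)
- Cash flow, recharge, and card consumption have values only when `area_code='all'`; they cannot be split by area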
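
The distortion described above can be reproduced with simple arithmetic. The sketch below uses made-up figures (they are not taken from production data, and `discount_ratio` is a hypothetical helper) to show how dividing a site-wide discount by one area's gross revenue pushes the ratio past 100%, while an area-level numerator keeps it plausible.

```python
from decimal import Decimal


def discount_ratio(discount: Decimal, gross: Decimal) -> Decimal:
    """Return discount / gross as a percentage, or 0 when gross is 0."""
    if gross == 0:
        return Decimal("0")
    return (discount / gross * 100).quantize(Decimal("0.1"))


# Hypothetical figures for illustration only.
site_discount = Decimal("5000.00")  # discount summed over the whole site
area_gross = Decimal("2000.00")     # gross revenue of one area
area_discount = Decimal("300.00")   # discount on settlements seated in that area

buggy = discount_ratio(site_discount, area_gross)  # global numerator, area denominator
fixed = discount_ratio(area_discount, area_gross)  # both figures are area-level
```

Here `buggy` comes out far above 100% while `fixed` stays at a realistic level, which is the mismatch the atomic layer is meant to remove.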
## Architecture
### System architecture diagram
```mermaid
graph TB
    subgraph "Data sources"
        DWD[dwd_settlement_head<br/>settlement details]
        DIM[dim_table<br/>table dimension SCD2]
        DWS_OLD[dws_finance_daily_summary<br/>existing global daily summary]
    end
    subgraph "Shared package packages/shared/"
        AM[area_mapping.py<br/>AREA_LABEL_MAP + resolve_area_code]
    end
    subgraph "ETL layer apps/etl/"
        ETL1[DWS_FINANCE_AREA_DAILY<br/>hourly · delete-before-insert]
        ETL2[DWS_FINANCE_BOARD_CACHE<br/>daily · fingerprint comparison]
    end
    subgraph "New DWS tables"
        T1[dws_finance_area_daily<br/>atomic layer · 9 rows/day/site]
        T2[dws_finance_board_cache<br/>cache layer · completed periods]
    end
    subgraph "Backend apps/backend/"
        SVC[board_service.py<br/>cache-first query logic]
        FDW[fdw_queries.py<br/>get_finance_overview_area<br/>get_finance_revenue_area]
    end
    subgraph "Frontend (unchanged)"
        MP[Mini-program finance board page]
    end
    DWD --> ETL1
    DIM --> ETL1
    DWS_OLD --> ETL1
    AM --> ETL1
    AM --> FDW
    ETL1 --> T1
    T1 --> ETL2
    ETL2 --> T2
    T1 --> FDW
    T2 --> FDW
    FDW --> SVC
    SVC --> MP
```
### Data flow
```mermaid
sequenceDiagram
    participant Client as Mini-program
    participant API as FastAPI
    participant SVC as board_service
    participant Cache as dws_finance_board_cache
    participant Daily as dws_finance_area_daily
    Client->>API: GET /api/xcx/board/finance?time=X&area=Y&compare=Z
    API->>SVC: get_finance_board(time, area, compare, site_id)
    alt Completed period (lastMonth/lastWeek/...)
        SVC->>Cache: look up cache
        alt Cache hit
            Cache-->>SVC: return cached data
        else Cache miss
            SVC->>Daily: SUM(area_code=Y, date_range)
            Daily-->>SVC: aggregated result
            SVC->>Cache: write cache
        end
    else Current period (month/week/quarter)
        SVC->>Daily: SUM(area_code=Y, date_range)
        Daily-->>SVC: aggregated result
    end
    alt compare=1
        SVC->>SVC: run the same logic for the previous period
        SVC->>SVC: calc_compare(current, previous)
    end
    SVC-->>API: FinanceBoardResponse
    API-->>Client: JSON (camelCase)
```
### ETL task dependencies
```mermaid
graph LR
    A[DWD_LOAD_FROM_ODS] --> B[DWS_FINANCE_AREA_DAILY<br/>hourly]
    B --> C[DWS_FINANCE_BOARD_CACHE<br/>once a day]
    A --> D[DWS_FINANCE_DAILY<br/>existing task · unchanged]
```
## Components and Interfaces
### 1. Shared area mapping: `packages/shared/src/neozqyy_shared/area_mapping.py`
```python
# Area code → list of physical area names
AREA_LABEL_MAP: dict[str, list[str]] = {
    "hallA": ["A区"],
    "hallB": ["B区"],
    "hallC": ["C区", "TV台", "美洲豹赛台"],
    "vip": ["VIP包厢"],
    "snooker": ["斯诺克区"],
    "mahjong": ["麻将房", "M7", "M8", "666", "发财"],
    "ktv": ["K包", "k包活动区", "幸会158"],
}

# All specific area codes (excluding all/hall)
SPECIFIC_AREA_CODES: list[str]  # ["hallA", "hallB", ..., "ktv"]

# All 9 area codes
ALL_AREA_CODES: list[str]  # ["all", "hall", "hallA", ..., "ktv"]

# Reverse mapping: physical area name → area code
_REVERSE_MAP: dict[str, str]  # {"A区": "hallA", "B区": "hallB", ...}


def resolve_area_code(area_name: str | None) -> str | None:
    """Take a site_table_area_name and return the matching area_code; return None if nothing matches."""


def get_area_labels(area_code: str) -> list[str] | None:
    """Take an area_code and return the matching list of physical area names; all/hall return None."""
```
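Design decisions:

- `hall` = the sum of all specific areas (excluding all); semantically equivalent to all (kept for historical compatibility)
- `all` = the sum of all areas
- An unmatched `area_name` returns `None`; the ETL decides whether to log a warning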
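
The interface above leaves the function bodies and derived constants open. The following is a minimal sketch of one way to fill them in, assuming exact, case-sensitive matching on the area name and assuming each physical area name belongs to exactly one area code; neither assumption is confirmed by the design.

```python
from typing import Optional

AREA_LABEL_MAP: dict[str, list[str]] = {
    "hallA": ["A区"],
    "hallB": ["B区"],
    "hallC": ["C区", "TV台", "美洲豹赛台"],
    "vip": ["VIP包厢"],
    "snooker": ["斯诺克区"],
    "mahjong": ["麻将房", "M7", "M8", "666", "发财"],
    "ktv": ["K包", "k包活动区", "幸会158"],
}

# Derived constants; dict insertion order keeps the codes in declaration order.
SPECIFIC_AREA_CODES: list[str] = list(AREA_LABEL_MAP)
ALL_AREA_CODES: list[str] = ["all", "hall", *SPECIFIC_AREA_CODES]

# Assumption: every physical area name maps to exactly one area code.
_REVERSE_MAP: dict[str, str] = {
    label: code for code, labels in AREA_LABEL_MAP.items() for label in labels
}


def resolve_area_code(area_name: Optional[str]) -> Optional[str]:
    """Exact, case-sensitive lookup (an assumption); None or unknown names give None."""
    if area_name is None:
        return None
    return _REVERSE_MAP.get(area_name)


def get_area_labels(area_code: str) -> Optional[list[str]]:
    """Labels for a specific area code; all/hall (and unknown codes) give None."""
    return AREA_LABEL_MAP.get(area_code)
```

Deriving `_REVERSE_MAP` from `AREA_LABEL_MAP` keeps a single source of truth, so adding a physical area name in one place updates both directions of the lookup.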
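### 2. ETL task: `DWS_FINANCE_AREA_DAILY`

Location: `apps/etl/connectors/feiqiu/tasks/dws/finance_area_daily.py`

Inherits from `FinanceBaseTask` (reusing its settlement extraction methods) and overrides `extract` / `transform` / `load`:

- **extract**: reads the day's settlements from `dwd_settlement_head` + `dim_table` (split at the business-day cutoff), and also reads the global cash-flow, recharge, and card-consumption fields from `dws_finance_daily_summary`
- **transform**: uses `resolve_area_code` to map each settlement to an area, aggregates the revenue and discount fields by area, and builds 9 rows (7 specific areas + hall + all)
- **load**: delete-before-insert (delete by `site_id + stat_date`, then insert the 9 rows)

Key interface: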
```python
class FinanceAreaDailyTask(FinanceBaseTask):
    def get_task_code(self) -> str: return "DWS_FINANCE_AREA_DAILY"
    def get_target_table(self) -> str: return "dws.dws_finance_area_daily"
    def get_primary_keys(self) -> list[str]: return ["site_id", "stat_date", "area_code"]
    def extract(self, context: TaskContext) -> dict: ...
    def transform(self, extracted: dict, context: TaskContext) -> list[dict]: ...
```
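
The transform step described above can be sketched as a plain aggregation function. This is an illustration rather than the task's real code: `build_area_rows` is a hypothetical helper, the settlements are assumed to already carry a resolved `area_code`, only two amount fields are aggregated, and the `settle_type IN (1, 3)` filter and the unknown-area rule are taken from Property 7 and the Error Handling table in this document.

```python
from decimal import Decimal

SPECIFIC_AREA_CODES = ["hallA", "hallB", "hallC", "vip", "snooker", "mahjong", "ktv"]
# Trimmed to two amount fields for brevity; the real table has many more.
AMOUNT_FIELDS = ["gross_amount", "discount_total"]


def build_area_rows(settlements: list[dict]) -> list[dict]:
    """Aggregate settlements into 9 rows: all, hall, and the 7 specific areas."""
    totals = {
        code: {field: Decimal("0") for field in AMOUNT_FIELDS}
        for code in ["all", "hall", *SPECIFIC_AREA_CODES]
    }
    for s in settlements:
        if s["settle_type"] not in (1, 3):  # filter taken from Property 7
            continue
        targets = ["all"]
        if s["area_code"] in SPECIFIC_AREA_CODES:
            targets += ["hall", s["area_code"]]
        # An unknown area (area_code is None) only reaches the "all" row.
        for target in targets:
            for field in AMOUNT_FIELDS:
                totals[target][field] += s[field]
    return [{"area_code": code, **amounts} for code, amounts in totals.items()]


settlements = [
    {"area_code": "hallA", "settle_type": 1,
     "gross_amount": Decimal("100.00"), "discount_total": Decimal("10.00")},
    {"area_code": "vip", "settle_type": 3,
     "gross_amount": Decimal("50.00"), "discount_total": Decimal("5.00")},
    {"area_code": None, "settle_type": 1,
     "gross_amount": Decimal("20.00"), "discount_total": Decimal("2.00")},
    {"area_code": "ktv", "settle_type": 2,
     "gross_amount": Decimal("999.00"), "discount_total": Decimal("0.00")},
]
rows = {row["area_code"]: row for row in build_area_rows(settlements)}
```

Under the unknown-area rule, the `all` row exceeds the `hall` row by exactly the unmapped amount, so the two rows match only when every settlement resolves to a known area.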
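### 3. ETL task: `DWS_FINANCE_BOARD_CACHE`

Location: `apps/etl/connectors/feiqiu/tasks/dws/finance_board_cache.py`

Inherits from `BaseDwsTask`:

- **extract**: iterates over 5 completed periods × 9 areas = 45 combinations, reading the daily rows from `dws_finance_area_daily` for each combination
- **transform**: computes a data fingerprint (MD5), compares it with the cache table, and marks the combinations that need recomputation
- **load**: for each combination that needs recomputation, SUMs from the daily table and upserts the result into the cache table

Fingerprint computation: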
```python
import hashlib
import json

def compute_fingerprint(rows: list[dict]) -> str:
    """MD5 over (stat_date, gross_amount, discount_total) tuples, sorted by stat_date."""
    sorted_rows = sorted(rows, key=lambda r: str(r['stat_date']))
    payload = json.dumps([(str(r['stat_date']), str(r['gross_amount']), str(r['discount_total'])) for r in sorted_rows])
    return hashlib.md5(payload.encode()).hexdigest()
```
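
To show how the fingerprint might drive the recompute decision, here is a small sketch. `compute_fingerprint` repeats the snippet above so the block runs on its own; `needs_recompute` is a hypothetical helper, and treating a missing cached fingerprint as stale is an assumption.

```python
import hashlib
import json
from typing import Optional


def compute_fingerprint(rows: list[dict]) -> str:
    """Same computation as the document's snippet: MD5 over date-sorted triples."""
    sorted_rows = sorted(rows, key=lambda r: str(r["stat_date"]))
    payload = json.dumps(
        [(str(r["stat_date"]), str(r["gross_amount"]), str(r["discount_total"]))
         for r in sorted_rows]
    )
    return hashlib.md5(payload.encode()).hexdigest()


def needs_recompute(daily_rows: list[dict], cached_fingerprint: Optional[str]) -> bool:
    """Hypothetical helper: True when the cache entry is missing or stale."""
    if cached_fingerprint is None:
        return True
    return compute_fingerprint(daily_rows) != cached_fingerprint


daily = [
    {"stat_date": "2026-03-02", "gross_amount": "80.00", "discount_total": "8.00"},
    {"stat_date": "2026-03-01", "gross_amount": "100.00", "discount_total": "10.00"},
]
stored = compute_fingerprint(daily)

# A backfill changes one day's gross_amount, so the fingerprint no longer matches.
backfilled = [dict(daily[0], gross_amount="95.00"), daily[1]]
stale = needs_recompute(backfilled, stored)
```

Because the fingerprint covers only `stat_date`, `gross_amount`, and `discount_total`, a backfill that changes only other columns (for example a cash-flow field) would leave it unchanged; whether that matters depends on which cached metrics can move independently of those two amounts.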
### 4. Backend query changes: `fdw_queries.py`

New and modified functions:
```python
def get_finance_overview_area(
    conn, site_id: int, start_date: str, end_date: str, area_code: str = "all"
) -> dict:
    """Aggregate the 8 overview metrics from v_dws_finance_area_daily by area_code."""

def get_finance_revenue_area(
    conn, site_id: int, start_date: str, end_date: str, area_code: str = "all"
) -> dict:
    """Aggregate the revenue section data from v_dws_finance_area_daily by area_code."""

def get_finance_board_cache(
    conn, site_id: int, time_range: str, area_code: str
) -> dict | None:
    """Look up the cache in v_dws_finance_board_cache."""

def set_finance_board_cache(
    conn, site_id: int, time_range: str, area_code: str, data: dict
) -> None:
    """Write or update the cache."""
```
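
As an illustration of the aggregation behind `get_finance_overview_area`, the sketch below builds a parameterized SUM query. It is not the real implementation: `build_overview_area_query` is a hypothetical helper, the column list is an assumed subset of the daily table, the view name is used unqualified, and the `%s` placeholder style assumes a psycopg-like driver.

```python
# Assumed column list; the real overview may read other columns.
OVERVIEW_SUM_COLUMNS = [
    "gross_amount",
    "discount_total",
    "confirmed_income",
    "cash_inflow_total",
    "cash_outflow_total",
    "cash_balance_change",
]


def build_overview_area_query(
    site_id: int, start_date: str, end_date: str, area_code: str = "all"
) -> tuple[str, tuple]:
    """Return (sql, params) for a parameterized SUM over one area and date range."""
    # Column names come from the fixed list above; only values are parameters.
    select_list = ", ".join(
        f"COALESCE(SUM({col}), 0) AS {col}" for col in OVERVIEW_SUM_COLUMNS
    )
    sql = (
        f"SELECT {select_list} "
        "FROM v_dws_finance_area_daily "
        "WHERE site_id = %s AND area_code = %s AND stat_date BETWEEN %s AND %s"
    )
    return sql, (site_id, area_code, start_date, end_date)


sql, params = build_overview_area_query(1, "2026-03-01", "2026-03-31", "hallB")
```

Wrapping each SUM in `COALESCE(..., 0)` makes an empty date range return zeros instead of NULLs, which lines up with the all-zero fallback described in the Error Handling section.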
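### 5. Backend service changes: `board_service.py`

Changes to the `get_finance_board` function:

- Adds cache lookup logic: completed periods check the cache first
- `_build_overview` now calls `get_finance_overview_area` (passing area_code)
- `_build_revenue` now calls `get_finance_revenue_area` (passing area_code)
- `_build_cashflow` is unchanged (always uses global data)
- The overview override logic for `area≠all` is kept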
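
The cache-first routing can be sketched independently of the database. In the example below, `load_period` and the three `fake_*` callables are hypothetical stand-ins for the service logic, the cache table, and the daily-table SUM; the set of completed ranges comes from the cache policy in this document.

```python
from typing import Callable, Optional

COMPLETED_RANGES = {"lastMonth", "lastWeek", "lastQuarter", "quarter3", "half6"}


def load_period(
    time_range: str,
    area_code: str,
    read_cache: Callable[[str, str], Optional[dict]],
    sum_daily: Callable[[str, str], dict],
    write_cache: Callable[[str, str, dict], None],
) -> dict:
    """Cache-first for completed periods; current periods always SUM the daily table."""
    if time_range not in COMPLETED_RANGES:
        return sum_daily(time_range, area_code)
    cached = read_cache(time_range, area_code)
    if cached is not None:
        return cached
    data = sum_daily(time_range, area_code)
    write_cache(time_range, area_code, data)
    return data


# In-memory stand-ins for the cache table and the daily-table SUM.
store: dict = {}
sum_calls: list = []


def fake_sum(time_range: str, area_code: str) -> dict:
    sum_calls.append((time_range, area_code))
    return {"occurrence": 100}


def fake_read(time_range: str, area_code: str) -> Optional[dict]:
    return store.get((time_range, area_code))


def fake_write(time_range: str, area_code: str, data: dict) -> None:
    store[(time_range, area_code)] = data


load_period("lastMonth", "hallB", fake_read, fake_sum, fake_write)  # miss: SUM, then write
load_period("lastMonth", "hallB", fake_read, fake_sum, fake_write)  # hit: no SUM
load_period("month", "hallB", fake_read, fake_sum, fake_write)      # current: SUM, no write
```

Passing the three operations in as callables keeps the routing testable without a database, which is also how a mock-based property test for the routing rule could be driven.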
## Data Models
### 1. `dws_finance_area_daily`: atomic layer
```sql
CREATE TABLE dws.dws_finance_area_daily (
    id BIGSERIAL PRIMARY KEY,
    site_id BIGINT NOT NULL,
    tenant_id BIGINT NOT NULL,
    stat_date DATE NOT NULL,
    area_code VARCHAR(20) NOT NULL,
    -- Revenue structure (4 items + gross_amount)
    table_fee_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    goods_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    assistant_pd_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    assistant_cx_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    gross_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Discount breakdown (6 items + discount_total)
    discount_groupbuy NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_vip NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_manual NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_gift_card NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_rounding NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_other NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Confirmed income
    confirmed_income NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Cash flow (only for area_code='all')
    cash_pay_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_paper_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    scan_pay_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    groupbuy_pay_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    recharge_cash_inflow NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_inflow_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_outflow_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_balance_change NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Card consumption (only for area_code='all')
    card_consume_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    recharge_card_consume NUMERIC(14,2) NOT NULL DEFAULT 0,
    gift_card_consume NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Recharge (only for area_code='all')
    recharge_cash NUMERIC(14,2) NOT NULL DEFAULT 0,
    first_recharge_cash NUMERIC(14,2) NOT NULL DEFAULT 0,
    renewal_cash NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Order statistics
    order_count INTEGER NOT NULL DEFAULT 0,
    -- Metadata
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (site_id, stat_date, area_code)
);
```
Constraints and identities:

- `gross_amount = table_fee_amount + goods_amount + assistant_pd_amount + assistant_cx_amount`
- `discount_total = discount_groupbuy + discount_vip + discount_manual + discount_gift_card + discount_rounding + discount_other`
- `confirmed_income = gross_amount - discount_total`
- `area_code ∈ {all, hall, hallA, hallB, hallC, vip, snooker, mahjong, ktv}`
- When `area_code ≠ 'all'`, the cash-flow, card-consumption, and recharge fields = 0
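
The identities above lend themselves to a row-level checker that a unit or property test could call. The sketch below is one possible shape: `check_row` and `make_row` are hypothetical helpers, and the field lists are copied from the table definition.

```python
from decimal import Decimal

REVENUE_FIELDS = [
    "table_fee_amount", "goods_amount", "assistant_pd_amount", "assistant_cx_amount",
]
DISCOUNT_FIELDS = [
    "discount_groupbuy", "discount_vip", "discount_manual",
    "discount_gift_card", "discount_rounding", "discount_other",
]
ALL_ONLY_FIELDS = [
    "cash_pay_amount", "cash_paper_amount", "scan_pay_amount", "groupbuy_pay_amount",
    "recharge_cash_inflow", "cash_inflow_total", "cash_outflow_total",
    "cash_balance_change", "card_consume_total", "recharge_card_consume",
    "gift_card_consume", "recharge_cash", "first_recharge_cash", "renewal_cash",
]


def check_row(row: dict) -> list[str]:
    """Return the names of the violated constraints (empty list when the row is valid)."""
    errors = []
    if row["gross_amount"] != sum(row[f] for f in REVENUE_FIELDS):
        errors.append("gross_amount")
    if row["discount_total"] != sum(row[f] for f in DISCOUNT_FIELDS):
        errors.append("discount_total")
    if row["confirmed_income"] != row["gross_amount"] - row["discount_total"]:
        errors.append("confirmed_income")
    if row["area_code"] != "all" and any(row[f] != 0 for f in ALL_ONLY_FIELDS):
        errors.append("all_only_fields")
    return errors


def make_row(area_code: str, **overrides) -> dict:
    """Build an all-zero row, then apply overrides (helper for the example)."""
    totals = ["gross_amount", "discount_total", "confirmed_income"]
    row = {f: Decimal("0") for f in REVENUE_FIELDS + DISCOUNT_FIELDS + ALL_ONLY_FIELDS + totals}
    row["area_code"] = area_code
    row.update(overrides)
    return row


good = make_row(
    "hallB",
    table_fee_amount=Decimal("90"), goods_amount=Decimal("10"),
    discount_vip=Decimal("15"), gross_amount=Decimal("100"),
    discount_total=Decimal("15"), confirmed_income=Decimal("85"),
)
bad = make_row("hallB", cash_pay_amount=Decimal("5"))  # non-all row with cash flow
```

Returning the list of violated constraints, rather than a single boolean, makes a failing test report which identity broke.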
### 2. `dws_finance_board_cache`: cache layer
```sql
CREATE TABLE dws.dws_finance_board_cache (
    id BIGSERIAL PRIMARY KEY,
    site_id BIGINT NOT NULL,
    time_range VARCHAR(20) NOT NULL,
    area_code VARCHAR(20) NOT NULL,
    start_date DATE NOT NULL,
    end_date DATE NOT NULL,
    prev_start_date DATE,
    prev_end_date DATE,
    -- 8 overview items
    occurrence NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_rate NUMERIC(8,4) NOT NULL DEFAULT 0,
    confirmed_revenue NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_in NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_out NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_balance NUMERIC(14,2) NOT NULL DEFAULT 0,
    balance_rate NUMERIC(8,4) NOT NULL DEFAULT 0,
    -- Fingerprint
    data_fingerprint VARCHAR(64),
    computed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    -- Metadata
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (site_id, time_range, area_code)
);
```
Cache policy:

- `time_range ∈ {lastMonth, lastWeek, lastQuarter, quarter3, half6}` → cached
- `time_range ∈ {month, week, quarter}` → not cached
- Invalidation condition: `data_fingerprint` changes (caused by backfilled data)
### 3. Area mapping data model
```python
from typing import Literal

# area_code enum values
AREA_CODES = Literal[
    "all", "hall", "hallA", "hallB", "hallC",
    "vip", "snooker", "mahjong", "ktv"
]

# AREA_LABEL_MAP structure
AREA_LABEL_MAP: dict[str, list[str]] = {
    "hallA": ["A区"],
    "hallB": ["B区"],
    "hallC": ["C区", "TV台", "美洲豹赛台"],
    "vip": ["VIP包厢"],
    "snooker": ["斯诺克区"],
    "mahjong": ["麻将房", "M7", "M8", "666", "发财"],
    "ktv": ["K包", "k包活动区", "幸会158"],
}
```
## Correctness Properties
*A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.*
### Property 1: Area mapping round-trip

*For any* `area_name` that appears in one of the value lists of `AREA_LABEL_MAP`, `resolve_area_code(area_name)` should return the corresponding `area_code`, and `area_name in get_area_labels(resolve_area_code(area_name))` should be True.

**Validates: Requirements 1.1, 1.5**

### Property 2: Unknown area names return None

*For any* string `area_name` that is not in any value list of `AREA_LABEL_MAP`, `resolve_area_code(area_name)` should return `None`.

**Validates: Requirements 1.4**

### Property 3: Arithmetic identities of daily rows

*For any* row of `dws_finance_area_daily`, the following three identities must hold simultaneously:

1. `gross_amount = table_fee_amount + goods_amount + assistant_pd_amount + assistant_cx_amount`
2. `discount_total = discount_groupbuy + discount_vip + discount_manual + discount_gift_card + discount_rounding + discount_other`
3. `confirmed_income = gross_amount - discount_total`

**Validates: Requirements 2.1, 2.2, 2.3, 8.3**

### Property 4: Cash flow, card consumption, and recharge are zero for non-all areas

*For any* row of `dws_finance_area_daily` with `area_code ≠ 'all'`, all cash-flow fields (cash_pay_amount, cash_paper_amount, scan_pay_amount, groupbuy_pay_amount, recharge_cash_inflow, cash_inflow_total, cash_outflow_total, cash_balance_change), card-consumption fields (card_consume_total, recharge_card_consume, gift_card_consume), and recharge fields (recharge_cash, first_recharge_cash, renewal_cash) should be 0.

**Validates: Requirements 2.5**

### Property 5: ETL output completeness and aggregation correctness

*For any* set of input settlements, the ETL transform should output exactly 9 rows (with area_code covering all/hall/hallA/hallB/hallC/vip/snooker/mahjong/ktv); the revenue and discount fields of the `all` row = the sum of the corresponding fields across the hallA~ktv rows, and the revenue and discount fields of the `hall` row = the sum of the corresponding fields across the hallA~ktv rows.

**Validates: Requirements 2.7, 2.8, 8.4**

### Property 6: ETL idempotency (delete-before-insert)

*For any* set of input settlements, running the ETL transform twice for the same `(site_id, stat_date)` should produce identical output both times.

**Validates: Requirements 3.4**
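### Property 7: settle_type filtering

*For any* set of settlements with different `settle_type` values, the ETL processes only records with `settle_type IN (1, 3)`; settlements with any other settle_type should not affect the output amounts.

**Validates: Requirements 3.6**

### Property 8: Fingerprint determinism and cache invalidation

*For any* set of daily rows, `compute_fingerprint` is deterministic (the same input produces the same output). And *for any* modification of the source data (changing the gross_amount or discount_total of any row), the new fingerprint should differ from the original.

**Validates: Requirements 5.2, 5.3, 5.4**

### Property 9: Current periods are never written to the cache

*For any* `time_range ∈ {month, week, quarter}`, the ETL cache task should not write a cache record for that time_range.

**Validates: Requirements 5.7**

### Property 10: Query routing correctness

*For any* query request: when `time_range` is a completed period and a cache entry exists, the cached data should be returned directly; when no cache entry exists, the result should be computed by SUM over the daily table and written to the cache; when `time_range` is a current period, the result should be computed by SUM over the daily table directly, without consulting the cache.

**Validates: Requirements 6.1, 6.2, 6.3, 9.4**

### Property 11: Area filtering behavior

*For any* query with `area_code ≠ 'all'`, the `recharge` section should return `null`, and the data of the `cashflow`/`expense`/`coach_analysis` sections should match the data returned when `area_code='all'`.

**Validates: Requirements 6.7, 6.8**

### Property 12: Fixed number of revenue items

*For any* revenue section returned by a query, `discount_items` should contain exactly 5 items (group buy / member discount / manual adjustment / gift card / other), and `channel_items` should contain exactly 3 items (stored-value card settlement offset / cash and online payment / group-buy redemption).

**Validates: Requirements 7.3, 7.4**

### Property 13: overview override logic when area≠all

*For any* query with `area_code ≠ 'all'`, `overview.occurrence` should equal `revenue.total_occurrence`, `overview.discount` should equal `revenue.discount_total`, and `overview.confirmed_revenue` should equal `revenue.confirmed_total`.

**Validates: Requirements 7.6**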
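
Property 13 can be expressed as a small pure function. The sketch below is illustrative only: `apply_area_override` is a hypothetical helper, the sample values are made up, and the key names follow the property statement. Whether derived fields such as `discount_rate` should be recomputed after the override is not specified here, so the sketch leaves them untouched.

```python
def apply_area_override(overview: dict, revenue: dict, area_code: str) -> dict:
    """Return a copy of overview whose three headline figures come from revenue."""
    if area_code == "all":
        return dict(overview)
    return {
        **overview,
        "occurrence": revenue["total_occurrence"],
        "discount": revenue["discount_total"],
        "confirmed_revenue": revenue["confirmed_total"],
    }


# Made-up sample values for illustration.
overview = {"occurrence": 9000, "discount": 1200, "confirmed_revenue": 7800, "cash_in": 6500}
revenue = {"total_occurrence": 2000, "discount_total": 300, "confirmed_total": 1700}
area_view = apply_area_override(overview, revenue, "hallB")
```

Fields that are not overridden, such as `cash_in`, pass through unchanged.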
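### Property 14: Regression consistency for area=all

*For any* date range queried with `area_code='all'`, the 8 overview metrics from the new logic (querying `dws_finance_area_daily`) should exactly match the results of the old logic (querying `dws_finance_daily_summary`).

**Validates: Requirements 9.1**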
## Error Handling
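### ETL layer

| Scenario | Handling strategy |
|------|---------|
| `resolve_area_code` returns None (unknown area) | Log a WARNING; the settlement is not counted in any specific area row, but is still counted in the all row |
| `dws_finance_daily_summary` has no data for the day | Fill the cash-flow, recharge, and card-consumption fields of the all row with 0 and log a WARNING |
| No match for `table_id` in `dim_table` (`scd2_is_current=1`) | Treat the settlement's area_code as None and handle it as above |
| delete-before-insert transaction fails | Roll back the whole transaction, mark the task as failed, and retry on the next scheduled run |
| The daily table has no data when the fingerprint is computed | The fingerprint is the MD5 of an empty string, and the cache entry is marked as "no data" |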
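### Backend query layer

| Scenario | Handling strategy |
|------|---------|
| `dws_finance_area_daily` has no data (new site / new date) | Return an all-zero overview/revenue, consistent with the existing fallback logic |
| Cache write fails | The query result is still returned; log an ERROR and retry the write on the next request |
| Connection to the cache table fails | Fall back to SUM directly from the daily table without interrupting the request |
| Invalid area_code parameter | Rejected by FastAPI's AreaFilterEnum validation, which returns 422 |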
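
The two cache-related rows of the table above can be sketched as a wrapper in which cache failures never fail the request. This is a sketch under assumptions: `cached_or_sum` and `broken` are hypothetical names, the broad `except Exception` stands in for whatever error type the database driver raises, and the WARNING level for a failed read is a guess (the table only fixes ERROR for failed writes).

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("board_service")


def cached_or_sum(
    key: tuple,
    read_cache: Callable[[tuple], Optional[dict]],
    sum_daily: Callable[[tuple], dict],
    write_cache: Callable[[tuple, dict], None],
) -> dict:
    """Serve from cache when possible; cache failures never fail the request."""
    try:
        cached = read_cache(key)
    except Exception:  # a real implementation would catch the driver's error type
        logger.warning("cache read failed for %s, falling back to daily SUM", key)
        return sum_daily(key)
    if cached is not None:
        return cached
    data = sum_daily(key)
    try:
        write_cache(key, data)
    except Exception:
        logger.error("cache write failed for %s, will retry on next request", key)
    return data


def broken(*args):
    """Stand-in for a cache operation whose connection is down."""
    raise ConnectionError("cache table unreachable")


key = (1, "lastMonth", "hallB")
from_read_failure = cached_or_sum(key, broken, lambda k: {"occurrence": 100}, broken)
from_write_failure = cached_or_sum(key, lambda k: None, lambda k: {"occurrence": 100}, broken)
```

The write is skipped after a failed read on the assumption that the same connection problem would make it fail too; the next request gets another chance to populate the cache.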
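### Data consistency safeguards

- The ETL delete-before-insert runs inside a single transaction, which guarantees atomicity
- Cache writes use `ON CONFLICT ... DO UPDATE`, which guarantees idempotency
- Backend queries use `SET LOCAL app.current_site_id` to guarantee RLS isolation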
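
The first safeguard can be demonstrated with an in-memory database. The sketch below uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, with a simplified hypothetical table `area_daily` and helper `replace_day`; it shows that rerunning the load leaves the same rows and that a failed insert rolls the delete back.

```python
import sqlite3


def replace_day(conn: sqlite3.Connection, site_id: int, stat_date: str, rows: list[dict]) -> None:
    """Delete the day's rows and insert the new ones in a single transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "DELETE FROM area_daily WHERE site_id = ? AND stat_date = ?",
            (site_id, stat_date),
        )
        conn.executemany(
            "INSERT INTO area_daily (site_id, stat_date, area_code, gross_amount) "
            "VALUES (?, ?, ?, ?)",
            [(site_id, stat_date, r["area_code"], r["gross_amount"]) for r in rows],
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE area_daily ("
    "site_id INTEGER, stat_date TEXT, area_code TEXT, gross_amount TEXT, "
    "UNIQUE (site_id, stat_date, area_code))"
)

rows = [
    {"area_code": "all", "gross_amount": "150.00"},
    {"area_code": "hallA", "gross_amount": "150.00"},
]
replace_day(conn, 1, "2026-04-01", rows)
replace_day(conn, 1, "2026-04-01", rows)  # rerun: same rows, no duplicates
count = conn.execute("SELECT COUNT(*) FROM area_daily").fetchone()[0]
```

Scoping the delete to `(site_id, stat_date)` means a rerun touches only that day's rows for that site.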
## Testing Strategy
### Property-Based Testing

Use **hypothesis** (Python), with at least 100 iterations per property test.

| Property | Test file | Generator |
|----------|---------|--------|
| Property 1-2 | `tests/test_area_mapping_props.py` | `st.text()` generates random area_name values |
| Property 3-7 | `tests/test_finance_area_daily_props.py` | Generates random settlement lists (amounts via `st.decimals`; area_name drawn from a mix of known and unknown names) |
| Property 8-9 | `tests/test_finance_board_cache_props.py` | Generates random lists of daily rows |
| Property 10-14 | `tests/test_board_service_props.py` | Generates random query parameters + mocked database results |

Every test function must include a comment tag:
```python
# Feature: board-finance-dws-area-refactor, Property 1: Area mapping round-trip
```
### Unit tests

| Test scope | Test file | Focus |
|---------|---------|--------|
| area_mapping | `tests/test_area_mapping_unit.py` | Edge cases: empty string, None, letter case, special characters |
| ETL transform | `apps/etl/connectors/feiqiu/tests/unit/test_finance_area_daily.py` | Verifying the discount_gift_card definition, business-day cutoff boundaries |
| ETL cache | `apps/etl/connectors/feiqiu/tests/unit/test_finance_board_cache.py` | Fingerprint change detection, empty-data handling |
| Backend queries | `apps/backend/tests/unit/test_fdw_queries_area.py` | SQL correctness, area_code filtering, cache hit/miss |
| Backend service | `apps/backend/tests/unit/test_board_service_area.py` | Override logic, period-over-period comparison, fallback behavior |
| Regression validation | `scripts/ops/validate_board_finance.py` | Full comparison across all 144 combinations |
### Test configuration
```python
# conftest.py / hypothesis settings
from hypothesis import settings
settings.register_profile("ci", max_examples=100)
settings.register_profile("dev", max_examples=30)
```
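### Integration validation

- Full validation script for all 144 combinations: `scripts/ops/validate_board_finance.py` (8 time_range × 9 area_code × 2 compare)
- area=all regression comparison: diff the outputs of the new and old logic
- Cache hit-rate check: a second request for a completed period does not trigger a SUM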
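
The 144 combinations can be enumerated with a Cartesian product. The sketch below is not the content of `validate_board_finance.py`; it only shows the enumeration, taking the `time_range` and `area_code` values from this document and assuming `compare` takes the values 0 and 1.

```python
from itertools import product

TIME_RANGES = [
    "month", "week", "quarter",                                  # current periods
    "lastMonth", "lastWeek", "lastQuarter", "quarter3", "half6",  # completed periods
]
AREA_CODES = ["all", "hall", "hallA", "hallB", "hallC", "vip", "snooker", "mahjong", "ktv"]
COMPARE_FLAGS = [0, 1]  # assumed encoding of the compare query parameter


def iter_combinations() -> list[tuple[str, str, int]]:
    """Every (time_range, area_code, compare) triple the validation should cover."""
    return list(product(TIME_RANGES, AREA_CODES, COMPARE_FLAGS))


combinations = iter_combinations()
```

Each triple maps onto one request of the form `GET /api/xcx/board/finance?time=X&area=Y&compare=Z`, so a validation script can loop over `combinations` and compare the new and old outputs.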