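chore: tidy up docs and IDE configuration

- .kiro/specs/ → docs/specs/ (migrated 41 historical requirement specs, removed .config.kiro)
- Split CLAUDE.md into three layers: slimmed-down root file + apps/backend/CLAUDE.md + .claude/commands/
- Added two workflow commands: /spec-close and /pre-change
- Refreshed the DDL baseline (re-exported 11 files from the test database; dws 35→38 tables, biz 18→21 tables)
- Unified naming BD_Manual → BD_manual (48 files)
- Fixed 3 inconsistencies between the docs and the database (auth.users.status default value, scheduled_tasks fields, RLS view count)
- Added BD_manual_public_rbac_tables.md (8 RBAC/workflow tables in the public schema)
- Merged the biz.trigger_jobs docs (10→12 fields; archived the standalone document)
- Updated the docs/database/README.md index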

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Neo
2026-04-06 00:02:37 +08:00
parent 8228b3fa37
commit 70324d8542
185 changed files with 13595 additions and 1219 deletions


@@ -0,0 +1,502 @@
# Design Document: Finance Board DWS Area-Dimension Refactor
## Overview
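This refactor fixes a core bug in the finance board: when `area≠all`, discount data was read from the global DWS table, which severely distorted area-level discount ratios (for example, a 417.9% discount ratio for the `B区` area).
It uses a two-layer architecture:
1. **Atomic layer** `dws_finance_area_daily`: stores precomputed revenue, discount, cash-flow and related data for 9 areas at daily granularity, keyed by `(site_id, stat_date, area_code)`
2. **Cache layer** `dws_finance_board_cache`: caches aggregated results for completed periods to avoid repeated SUM computation
Backend queries now work as follows: completed periods check the cache first → on a miss, SUM from the daily table → current periods SUM directly from the daily table. The API signature and response structure are unchanged, so the frontend needs no changes.
Key design decisions:
- Discounts are aggregated directly by the table area of each settlement, with no apportioning (each settlement corresponds to exactly one table)
- The area mapping is extracted into a shared package; ETL and the backend use the same configuration
- `discount_gift_card` uses the gift-card consumption amount (consistent with the existing ETL)
- Cash flow, recharge, and card consumption have values only when `area_code='all'`; they cannot be broken down by area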
## Architecture
### System architecture diagram
```mermaid
graph TB
    subgraph "Data sources"
        DWD[dwd_settlement_head<br/>Settlement details]
        DIM[dim_table<br/>Table dimension, SCD2]
        DWS_OLD[dws_finance_daily_summary<br/>Existing global daily summary]
    end
    subgraph "Shared package packages/shared/"
        AM[area_mapping.py<br/>AREA_LABEL_MAP + resolve_area_code]
    end
    subgraph "ETL layer apps/etl/"
        ETL1[DWS_FINANCE_AREA_DAILY<br/>Hourly · delete-before-insert]
        ETL2[DWS_FINANCE_BOARD_CACHE<br/>Daily · fingerprint comparison]
    end
    subgraph "New DWS tables"
        T1[dws_finance_area_daily<br/>Atomic layer · 9 rows/day/site]
        T2[dws_finance_board_cache<br/>Cache layer · completed periods]
    end
    subgraph "Backend apps/backend/"
        SVC[board_service.py<br/>Cache-first query logic]
        FDW[fdw_queries.py<br/>get_finance_overview_area<br/>get_finance_revenue_area]
    end
    subgraph "Frontend (unchanged)"
        MP[Mini-program finance board page]
    end
    DWD --> ETL1
    DIM --> ETL1
    DWS_OLD --> ETL1
    AM --> ETL1
    AM --> FDW
    ETL1 --> T1
    T1 --> ETL2
    ETL2 --> T2
    T1 --> FDW
    T2 --> FDW
    FDW --> SVC
    SVC --> MP
```
### Data flow
```mermaid
sequenceDiagram
    participant Client as Mini-program
    participant API as FastAPI
    participant SVC as board_service
    participant Cache as dws_finance_board_cache
    participant Daily as dws_finance_area_daily
    Client->>API: GET /api/xcx/board/finance?time=X&area=Y&compare=Z
    API->>SVC: get_finance_board(time, area, compare, site_id)
    alt Completed period (lastMonth/lastWeek/...)
        SVC->>Cache: Query cache
        alt Cache hit
            Cache-->>SVC: Return cached data
        else Cache miss
            SVC->>Daily: SUM(area_code=Y, date_range)
            Daily-->>SVC: Aggregated result
            SVC->>Cache: Write cache
        end
    else Current period (month/week/quarter)
        SVC->>Daily: SUM(area_code=Y, date_range)
        Daily-->>SVC: Aggregated result
    end
    alt compare=1
        SVC->>SVC: Apply the same logic to the previous period
        SVC->>SVC: calc_compare(current period, previous period)
    end
    SVC-->>API: FinanceBoardResponse
    API-->>Client: JSON (camelCase)
```
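The routing in the sequence diagram can be sketched in a few lines of Python. This is a minimal illustration, not the actual `board_service` code: `fetch_board_data`, `cache_get`, `cache_set`, `sum_daily`, and `COMPLETED_RANGES` are hypothetical names, and the database calls are passed in as plain callables so the control flow can be read on its own.

```python
from typing import Callable, Optional

# Completed periods, as listed in the caching strategy of this document.
COMPLETED_RANGES = {"lastMonth", "lastWeek", "lastQuarter", "quarter3", "half6"}


def fetch_board_data(
    time_range: str,
    area_code: str,
    cache_get: Callable[[str, str], Optional[dict]],   # hypothetical cache reader
    cache_set: Callable[[str, str, dict], None],       # hypothetical cache writer
    sum_daily: Callable[[str, str], dict],             # hypothetical SUM over the daily table
) -> dict:
    """Cache-first for completed periods, direct SUM for current periods."""
    if time_range not in COMPLETED_RANGES:
        # Current period: data is still changing, so the cache is never consulted.
        return sum_daily(time_range, area_code)
    cached = cache_get(time_range, area_code)
    if cached is not None:
        return cached
    data = sum_daily(time_range, area_code)
    try:
        cache_set(time_range, area_code, data)
    except Exception:
        # A failed cache write must not fail the request; a real service would log it.
        pass
    return data
```

Passing the storage functions as parameters also makes the routing easy to test with an in-memory dict standing in for the cache table.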
### ETL task dependencies
```mermaid
graph LR
    A[DWD_LOAD_FROM_ODS] --> B[DWS_FINANCE_AREA_DAILY<br/>Hourly]
    B --> C[DWS_FINANCE_BOARD_CACHE<br/>Once a day]
    A --> D[DWS_FINANCE_DAILY<br/>Existing task · unchanged]
```
## Components and Interfaces
### 1. Shared area mapping: `packages/shared/src/neozqyy_shared/area_mapping.py`
```python
# Area code → list of physical area names
AREA_LABEL_MAP: dict[str, list[str]] = {
    "hallA": ["A区"],
    "hallB": ["B区"],
    "hallC": ["C区", "TV台", "美洲豹赛台"],
    "vip": ["VIP包厢"],
    "snooker": ["斯诺克区"],
    "mahjong": ["麻将房", "M7", "M8", "666", "发财"],
    "ktv": ["K包", "k包活动区", "幸会158"],
}
# All specific area codes (excluding all/hall)
SPECIFIC_AREA_CODES: list[str]  # ["hallA", "hallB", ..., "ktv"]
# All 9 area codes
ALL_AREA_CODES: list[str]  # ["all", "hall", "hallA", ..., "ktv"]
# Reverse mapping: physical area name → area code
_REVERSE_MAP: dict[str, str]  # {"A区": "hallA", "B区": "hallB", ...}


def resolve_area_code(area_name: str | None) -> str | None:
    """Given a site_table_area_name, return the matching area_code, or None if unmatched."""


def get_area_labels(area_code: str) -> list[str] | None:
    """Given an area_code, return its list of physical area names; all/hall return None."""
```
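Design decisions:
- `hall` = sum of all specific areas (excluding `all`); semantically equivalent to `all`, kept for historical compatibility
- `all` = sum of all areas
- An unmatched `area_name` returns `None`; the ETL decides whether to log a warning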
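One way the two lookup functions could be implemented on top of `AREA_LABEL_MAP` is sketched below. The derived constants follow the comments in the interface block; exact string matching (no trimming or case folding) and returning `None` for any code that is not a key of the map are assumptions, not confirmed behavior.

```python
from typing import Optional

AREA_LABEL_MAP: dict[str, list[str]] = {
    "hallA": ["A区"],
    "hallB": ["B区"],
    "hallC": ["C区", "TV台", "美洲豹赛台"],
    "vip": ["VIP包厢"],
    "snooker": ["斯诺克区"],
    "mahjong": ["麻将房", "M7", "M8", "666", "发财"],
    "ktv": ["K包", "k包活动区", "幸会158"],
}

# Derived constants: specific codes come from the map, all/hall are prepended.
SPECIFIC_AREA_CODES: list[str] = list(AREA_LABEL_MAP)
ALL_AREA_CODES: list[str] = ["all", "hall", *SPECIFIC_AREA_CODES]

# Reverse mapping built once at import time: physical area name → area code.
_REVERSE_MAP: dict[str, str] = {
    label: code for code, labels in AREA_LABEL_MAP.items() for label in labels
}


def resolve_area_code(area_name: Optional[str]) -> Optional[str]:
    """Exact-match lookup (assumption); unknown names and None give None."""
    if area_name is None:
        return None
    return _REVERSE_MAP.get(area_name)


def get_area_labels(area_code: str) -> Optional[list[str]]:
    """all/hall are not keys of the map, so they naturally return None."""
    return AREA_LABEL_MAP.get(area_code)
```

With this shape, `resolve_area_code("TV台")` gives `"hallC"`, while `resolve_area_code("tv台")` gives `None` because matching is case-sensitive in this sketch.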
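### 2. ETL task: `DWS_FINANCE_AREA_DAILY`
Location: `apps/etl/connectors/feiqiu/tasks/dws/finance_area_daily.py`
Inherits from `FinanceBaseTask` (reusing its settlement extraction methods) and overrides `extract` / `transform` / `load`:
- **extract**: pulls the day's settlements (by business-day cutoff) from `dwd_settlement_head` + `dim_table`, and also pulls the global cash-flow/recharge/card-consumption fields from `dws_finance_daily_summary`
- **transform**: uses `resolve_area_code` to map each settlement to an area, aggregates revenue and discount fields by area, and builds 9 rows (7 specific areas + hall + all)
- **load**: delete-before-insert (delete by `site_id + stat_date`, then insert 9 rows)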
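The transform step can be illustrated with a reduced sketch that tracks only two amount fields. `build_area_rows` and the input dict shape are hypothetical; the `settle_type IN (1, 3)` filter comes from Property 7. Settlements whose area could not be resolved count toward `all` only, following the error-handling table; whether `hall` should include them is not specified here, so the sketch leaves them out.

```python
from decimal import Decimal

SPECIFIC = ["hallA", "hallB", "hallC", "vip", "snooker", "mahjong", "ktv"]
FIELDS = ["gross_amount", "discount_total"]  # reduced field set for illustration


def build_area_rows(settlements: list[dict]) -> list[dict]:
    """Aggregate settlements into exactly 9 rows: all, hall, and 7 specific areas.

    Each settlement is assumed to look like:
    {"area_code": str | None, "settle_type": int,
     "gross_amount": Decimal, "discount_total": Decimal}
    """
    totals = {
        code: {f: Decimal("0") for f in FIELDS}
        for code in ["all", "hall", *SPECIFIC]
    }
    for s in settlements:
        if s["settle_type"] not in (1, 3):
            continue  # other settle types must not affect the output
        targets = ["all"]
        if s["area_code"] in SPECIFIC:
            targets += ["hall", s["area_code"]]
        for code in targets:
            for f in FIELDS:
                totals[code][f] += s[f]
    return [
        {
            "area_code": code,
            **vals,
            "confirmed_income": vals["gross_amount"] - vals["discount_total"],
        }
        for code, vals in totals.items()
    ]
```

Because the function is pure (same input, same 9 rows), running it twice for the same day yields identical output, which is what makes the delete-before-insert load idempotent.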
Key interface:
```python
class FinanceAreaDailyTask(FinanceBaseTask):
    def get_task_code(self) -> str: return "DWS_FINANCE_AREA_DAILY"
    def get_target_table(self) -> str: return "dws.dws_finance_area_daily"
    def get_primary_keys(self) -> list[str]: return ["site_id", "stat_date", "area_code"]
    def extract(self, context: TaskContext) -> dict: ...
    def transform(self, extracted: dict, context: TaskContext) -> list[dict]: ...
```
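### 3. ETL task: `DWS_FINANCE_BOARD_CACHE`
Location: `apps/etl/connectors/feiqiu/tasks/dws/finance_board_cache.py`
Inherits from `BaseDwsTask`:
- **extract**: iterates over 5 completed periods × 9 areas = 45 combinations, reading daily rows from `dws_finance_area_daily` for each
- **transform**: computes a data fingerprint (MD5), compares it with the cache table, and marks the combinations that need recomputation
- **load**: for combinations that need recomputation, SUMs from the daily table and upserts into the cache table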
Fingerprint computation:
```python
import hashlib
import json

def compute_fingerprint(rows: list[dict]) -> str:
    """MD5 over (stat_date, gross_amount, discount_total) tuples, sorted by stat_date."""
    sorted_rows = sorted(rows, key=lambda r: str(r["stat_date"]))
    payload = json.dumps([(str(r["stat_date"]), str(r["gross_amount"]), str(r["discount_total"])) for r in sorted_rows])
    return hashlib.md5(payload.encode()).hexdigest()
```
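To show how the transform step might use the fingerprint, the sketch below repeats `compute_fingerprint` so that it runs on its own and adds a hypothetical `needs_recompute` helper. The helper name and the convention that a missing cache entry is represented by `None` are assumptions.

```python
import hashlib
import json
from typing import Optional


def compute_fingerprint(rows: list[dict]) -> str:
    """Same logic as the fingerprint function above, repeated for self-containment."""
    sorted_rows = sorted(rows, key=lambda r: str(r["stat_date"]))
    payload = json.dumps(
        [(str(r["stat_date"]), str(r["gross_amount"]), str(r["discount_total"]))
         for r in sorted_rows]
    )
    return hashlib.md5(payload.encode()).hexdigest()


def needs_recompute(rows: list[dict], cached_fingerprint: Optional[str]) -> bool:
    """True when no cache entry exists or the daily rows changed since it was written."""
    return cached_fingerprint != compute_fingerprint(rows)


# Example: a backfill that changes one day's gross_amount invalidates the entry.
original = [
    {"stat_date": "2026-03-01", "gross_amount": "100.00", "discount_total": "10.00"},
    {"stat_date": "2026-03-02", "gross_amount": "80.00", "discount_total": "0.00"},
]
backfilled = [dict(original[0], gross_amount="120.00"), original[1]]
stored = compute_fingerprint(original)
print(needs_recompute(original, stored))    # unchanged rows
print(needs_recompute(backfilled, stored))  # changed rows
```

Since rows are sorted by `stat_date` before hashing, the order in which the database returns them does not affect the fingerprint, provided each date appears at most once per combination.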
### 4. Backend query changes: `fdw_queries.py`
New/changed functions:
```python
def get_finance_overview_area(
    conn, site_id: int, start_date: str, end_date: str, area_code: str = "all"
) -> dict:
    """Aggregate the 8 overview metrics from v_dws_finance_area_daily by area_code."""


def get_finance_revenue_area(
    conn, site_id: int, start_date: str, end_date: str, area_code: str = "all"
) -> dict:
    """Aggregate the revenue section data from v_dws_finance_area_daily by area_code."""


def get_finance_board_cache(
    conn, site_id: int, time_range: str, area_code: str
) -> dict | None:
    """Look up the v_dws_finance_board_cache cache."""


def set_finance_board_cache(
    conn, site_id: int, time_range: str, area_code: str, data: dict
) -> None:
    """Insert or update a cache entry."""
```
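### 5. Backend service changes: `board_service.py`
Changes to the `get_finance_board` function:
- Adds cache lookup logic: completed periods check the cache first
- `_build_overview` now calls `get_finance_overview_area` (passing area_code)
- `_build_revenue` now calls `get_finance_revenue_area` (passing area_code)
- `_build_cashflow` is unchanged (always uses global data)
- The overview override logic for `area≠all` is retained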
## Data Models
### 1. `dws_finance_area_daily`: atomic layer
```sql
CREATE TABLE dws.dws_finance_area_daily (
    id BIGSERIAL PRIMARY KEY,
    site_id BIGINT NOT NULL,
    tenant_id BIGINT NOT NULL,
    stat_date DATE NOT NULL,
    area_code VARCHAR(20) NOT NULL,
    -- Revenue structure (4 items + gross_amount)
    table_fee_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    goods_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    assistant_pd_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    assistant_cx_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    gross_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Discount breakdown (6 items + discount_total)
    discount_groupbuy NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_vip NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_manual NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_gift_card NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_rounding NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_other NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Confirmed income
    confirmed_income NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Cash flow (only for area_code='all')
    cash_pay_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_paper_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    scan_pay_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    groupbuy_pay_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    recharge_cash_inflow NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_inflow_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_outflow_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_balance_change NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Card consumption (only for area_code='all')
    card_consume_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    recharge_card_consume NUMERIC(14,2) NOT NULL DEFAULT 0,
    gift_card_consume NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Recharge (only for area_code='all')
    recharge_cash NUMERIC(14,2) NOT NULL DEFAULT 0,
    first_recharge_cash NUMERIC(14,2) NOT NULL DEFAULT 0,
    renewal_cash NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Order statistics
    order_count INTEGER NOT NULL DEFAULT 0,
    -- Metadata
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (site_id, stat_date, area_code)
);
```
Constraints and identities:
- `gross_amount = table_fee_amount + goods_amount + assistant_pd_amount + assistant_cx_amount`
- `discount_total = discount_groupbuy + discount_vip + discount_manual + discount_gift_card + discount_rounding + discount_other`
- `confirmed_income = gross_amount - discount_total`
- `area_code ∈ {all, hall, hallA, hallB, hallC, vip, snooker, mahjong, ktv}`
- When `area_code ≠ 'all'`, the cash-flow/card-consumption/recharge fields = 0
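These identities are straightforward to check row by row. The sketch below is a hypothetical validator (`check_daily_row` is not part of the design) that takes a row as a dict of `Decimal` values and returns the names of the rules it violates; the field names are taken from the table definition.

```python
from decimal import Decimal

REVENUE_PARTS = ("table_fee_amount", "goods_amount",
                 "assistant_pd_amount", "assistant_cx_amount")
DISCOUNT_PARTS = ("discount_groupbuy", "discount_vip", "discount_manual",
                  "discount_gift_card", "discount_rounding", "discount_other")
# Fields that may be non-zero only on the area_code='all' row.
ALL_ONLY_FIELDS = (
    "cash_pay_amount", "cash_paper_amount", "scan_pay_amount",
    "groupbuy_pay_amount", "recharge_cash_inflow", "cash_inflow_total",
    "cash_outflow_total", "cash_balance_change",
    "card_consume_total", "recharge_card_consume", "gift_card_consume",
    "recharge_cash", "first_recharge_cash", "renewal_cash",
)
ZERO = Decimal("0")


def check_daily_row(row: dict) -> list[str]:
    """Return the violated rules; an empty list means the row is consistent."""
    errors = []
    if row["gross_amount"] != sum((row[f] for f in REVENUE_PARTS), ZERO):
        errors.append("gross_amount")
    if row["discount_total"] != sum((row[f] for f in DISCOUNT_PARTS), ZERO):
        errors.append("discount_total")
    if row["confirmed_income"] != row["gross_amount"] - row["discount_total"]:
        errors.append("confirmed_income")
    # Missing all-only fields are treated as zero in this sketch.
    if row["area_code"] != "all" and any(row.get(f, ZERO) != 0 for f in ALL_ONLY_FIELDS):
        errors.append("all_only_fields")
    return errors
```

Using `Decimal` rather than `float` keeps the equality checks exact, which matches the `NUMERIC(14,2)` column type.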
### 2. `dws_finance_board_cache`: cache layer
```sql
CREATE TABLE dws.dws_finance_board_cache (
    id BIGSERIAL PRIMARY KEY,
    site_id BIGINT NOT NULL,
    time_range VARCHAR(20) NOT NULL,
    area_code VARCHAR(20) NOT NULL,
    start_date DATE NOT NULL,
    end_date DATE NOT NULL,
    prev_start_date DATE,
    prev_end_date DATE,
    -- 8 overview metrics
    occurrence NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_rate NUMERIC(8,4) NOT NULL DEFAULT 0,
    confirmed_revenue NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_in NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_out NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_balance NUMERIC(14,2) NOT NULL DEFAULT 0,
    balance_rate NUMERIC(8,4) NOT NULL DEFAULT 0,
    -- Fingerprint
    data_fingerprint VARCHAR(64),
    computed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    -- Metadata
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (site_id, time_range, area_code)
);
```
Caching strategy:
- `time_range ∈ {lastMonth, lastWeek, lastQuarter, quarter3, half6}` → cached
- `time_range ∈ {month, week, quarter}` → not cached
- Invalidation condition: `data_fingerprint` changes (caused by backfilled data)
### 3. Area mapping data model
```python
from typing import Literal

# Allowed area_code values
AREA_CODES = Literal[
    "all", "hall", "hallA", "hallB", "hallC",
    "vip", "snooker", "mahjong", "ktv"
]
# Structure of AREA_LABEL_MAP
AREA_LABEL_MAP: dict[str, list[str]] = {
    "hallA": ["A区"],
    "hallB": ["B区"],
    "hallC": ["C区", "TV台", "美洲豹赛台"],
    "vip": ["VIP包厢"],
    "snooker": ["斯诺克区"],
    "mahjong": ["麻将房", "M7", "M8", "666", "发财"],
    "ktv": ["K包", "k包活动区", "幸会158"],
}
```
## Correctness Properties
*A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.*
### Property 1: Area mapping round-trip
*For any* `area_name` that appears in one of the value lists of `AREA_LABEL_MAP`, `resolve_area_code(area_name)` should return the corresponding `area_code`, and `area_name in get_area_labels(resolve_area_code(area_name))` should be True.
**Validates: Requirements 1.1, 1.5**
### Property 2: Unknown area names return None
*For any* string `area_name` that is not in any value list of `AREA_LABEL_MAP`, `resolve_area_code(area_name)` should return `None`.
**Validates: Requirements 1.4**
### Property 3: Arithmetic identities of daily rows
*For any* row of `dws_finance_area_daily`, the following three identities must all hold:
1. `gross_amount = table_fee_amount + goods_amount + assistant_pd_amount + assistant_cx_amount`
2. `discount_total = discount_groupbuy + discount_vip + discount_manual + discount_gift_card + discount_rounding + discount_other`
3. `confirmed_income = gross_amount - discount_total`
**Validates: Requirements 2.1, 2.2, 2.3, 8.3**
### Property 4: Cash flow, card consumption, and recharge are zero for non-all areas
*For any* row of `dws_finance_area_daily`, when `area_code ≠ 'all'`, all cash-flow fields (cash_pay_amount, cash_paper_amount, scan_pay_amount, groupbuy_pay_amount, recharge_cash_inflow, cash_inflow_total, cash_outflow_total, cash_balance_change), card-consumption fields (card_consume_total, recharge_card_consume, gift_card_consume), and recharge fields (recharge_cash, first_recharge_cash, renewal_cash) should be 0.
**Validates: Requirements 2.5**
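### Property 5: ETL output completeness and aggregation correctness
*For any* set of settlement input data, the ETL transform should output exactly 9 rows (with area_code covering all/hall/hallA/hallB/hallC/vip/snooker/mahjong/ktv); the revenue and discount fields of the `all` row = the sum of the corresponding fields of the hallA~ktv rows, and the revenue and discount fields of the `hall` row = the sum of the corresponding fields of the hallA~ktv rows.
**Validates: Requirements 2.7, 2.8, 8.4**
### Property 6: ETL idempotency (delete-before-insert)
*For any* set of settlement input data, running the ETL transform twice for the same `(site_id, stat_date)` should produce identical output.
**Validates: Requirements 3.4**
### Property 7: settle_type filtering
*For any* set of settlements with varying `settle_type` values, the ETL processes only records with `settle_type IN (1, 3)`; settlements with other settle_type values should not affect the output amounts.
**Validates: Requirements 3.6**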
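### Property 8: Fingerprint determinism and cache invalidation
*For any* set of daily rows, `compute_fingerprint` is deterministic (identical input yields identical output). And *for any* modification to the source data (changing the gross_amount or discount_total of any row), the new fingerprint should differ from the original.
**Validates: Requirements 5.2, 5.3, 5.4**
### Property 9: Current periods are not written to the cache
*For any* `time_range ∈ {month, week, quarter}`, the ETL cache task should not write a cache record for that time_range.
**Validates: Requirements 5.7**
### Property 10: Query routing correctness
*For any* query request: when `time_range` is a completed period and a cache entry exists, the cached data should be returned directly; when no cache entry exists, the result should be computed by SUM over the daily table and written to the cache; when `time_range` is a current period, the result should be computed directly by SUM over the daily table without consulting the cache.
**Validates: Requirements 6.1, 6.2, 6.3, 9.4**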
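### Property 11: Area filtering behavior
*For any* query with `area_code ≠ 'all'`, the `recharge` section should return `null`, and the data of the `cashflow`/`expense`/`coach_analysis` sections should match the data returned for `area_code='all'`.
**Validates: Requirements 6.7, 6.8**
### Property 12: Fixed number of revenue items
*For any* revenue section returned by a query, `discount_items` should contain exactly 5 items (group buy / member discount / manual adjustment / gift card / other), and `channel_items` should contain exactly 3 items (stored-value card settlement offset / cash and online payment / group-buy redemption).
**Validates: Requirements 7.3, 7.4**
### Property 13: Overview override logic when area≠all
*For any* query with `area_code ≠ 'all'`, `overview.occurrence` should equal `revenue.total_occurrence`, `overview.discount` should equal `revenue.discount_total`, and `overview.confirmed_revenue` should equal `revenue.confirmed_total`.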
**Validates: Requirements 7.6**
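Property 13 amounts to a small dictionary override. The following sketch assumes the overview and revenue sections are plain dicts whose keys mirror the field names used in the property; `apply_area_override` is a hypothetical helper, not a function named in this design.

```python
def apply_area_override(overview: dict, revenue: dict, area_code: str) -> dict:
    """For area-level queries, take the three headline figures from the revenue section.

    Other overview keys (cash flow and so on) are left as they are.
    """
    if area_code == "all":
        return overview
    return {
        **overview,
        "occurrence": revenue["total_occurrence"],
        "discount": revenue["discount_total"],
        "confirmed_revenue": revenue["confirmed_total"],
    }
```

Returning a new dict instead of mutating `overview` keeps the global figures intact if the same object is reused for another area in the same request.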
### Property 14: Regression consistency for area=all
*For any* date range queried with `area_code='all'`, the 8 overview metrics from the new logic (querying `dws_finance_area_daily`) should exactly match those from the old logic (querying `dws_finance_daily_summary`).
**Validates: Requirements 9.1**
## Error Handling
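### ETL layer
| Scenario | Handling strategy |
|------|---------|
| `resolve_area_code` returns None (unknown area) | Log a WARNING; the settlement is not counted in any specific area row but is still counted in the all row |
| No data for the day in `dws_finance_daily_summary` | Fill the cash-flow/recharge/card-consumption fields of the all row with 0 and log a WARNING |
| No match for `table_id` in `dim_table` (`scd2_is_current=1`) | Treat the settlement's area_code as None and handle as above |
| delete-before-insert transaction fails | Roll back the whole transaction, mark the task as failed, and retry on the next scheduled run |
| Daily table has no data when computing the fingerprint | Fingerprint is the MD5 of an empty string; the cache entry is marked "no data" |
### Backend query layer
| Scenario | Handling strategy |
|------|---------|
| No data in `dws_finance_area_daily` (new site/new date) | Return all-zero overview/revenue, consistent with the existing fallback logic |
| Cache write fails | Does not affect the returned query result; log an ERROR and retry the write on the next request |
| Cache table connection fails | Fall back to SUM directly from the daily table without interrupting the request |
| Invalid area_code parameter | Rejected by FastAPI's AreaFilterEnum validation with a 422 response |
### Data consistency safeguards
- The ETL delete-before-insert runs inside a single transaction, guaranteeing atomicity
- Cache writes use `ON CONFLICT ... DO UPDATE`, guaranteeing idempotency
- Backend queries use `SET LOCAL app.current_site_id` to guarantee RLS isolation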
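The idempotency of an `ON CONFLICT ... DO UPDATE` write can be demonstrated with a small runnable sketch. It uses an in-memory SQLite database from the Python standard library as a stand-in for PostgreSQL (SQLite 3.24+ accepts the same upsert clause), and a reduced hypothetical table `board_cache` with only a few of the cache columns; it is not the production statement.

```python
import sqlite3

UPSERT_SQL = """
INSERT INTO board_cache (site_id, time_range, area_code, occurrence, data_fingerprint)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (site_id, time_range, area_code) DO UPDATE SET
    occurrence = excluded.occurrence,
    data_fingerprint = excluded.data_fingerprint
"""


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with the same unique key as the cache table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE board_cache (
            site_id INTEGER NOT NULL,
            time_range TEXT NOT NULL,
            area_code TEXT NOT NULL,
            occurrence TEXT NOT NULL,
            data_fingerprint TEXT,
            UNIQUE (site_id, time_range, area_code)
        )
        """
    )
    return conn


def upsert(conn: sqlite3.Connection, row: tuple) -> None:
    # `with conn` wraps the statement in a transaction: commit on success,
    # rollback if an exception is raised.
    with conn:
        conn.execute(UPSERT_SQL, row)
```

Calling `upsert` twice with the same row leaves exactly one record, and calling it again with a new fingerprint overwrites that record rather than adding a second one.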
## Testing Strategy
### Property-based testing
Uses **hypothesis** (Python); each property test runs at least 100 iterations.
| Property | Test file | Generators |
|----------|---------|--------|
| Property 1-2 | `tests/test_area_mapping_props.py` | `st.text()` generates random area_name values |
| Property 3-7 | `tests/test_finance_area_daily_props.py` | Random settlement lists (amounts via `st.decimals`; area_name drawn from a mix of known and unknown names) |
| Property 8-9 | `tests/test_finance_board_cache_props.py` | Random lists of daily rows |
| Property 10-14 | `tests/test_board_service_props.py` | Random query parameters + mocked database responses |
Each test function must carry a comment tag:
```python
# Feature: board-finance-dws-area-refactor, Property 1: Area mapping round-trip
```
### Unit tests
| Scope | Test file | Focus |
|---------|---------|--------|
| area_mapping | `tests/test_area_mapping_unit.py` | Edge cases: empty string, None, letter case, special characters |
| ETL transform | `apps/etl/connectors/feiqiu/tests/unit/test_finance_area_daily.py` | discount_gift_card definition check, business-day cutoff boundaries |
| ETL cache | `apps/etl/connectors/feiqiu/tests/unit/test_finance_board_cache.py` | Fingerprint change detection, empty-data handling |
| Backend queries | `apps/backend/tests/unit/test_fdw_queries_area.py` | SQL correctness, area_code filtering, cache hit/miss |
| Backend service | `apps/backend/tests/unit/test_board_service_area.py` | Override logic, period-over-period calculation, fallback behavior |
| Regression validation | `scripts/ops/validate_board_finance.py` | Full comparison of all 144 combinations |
### Test configuration
```python
# conftest.py / hypothesis settings
from hypothesis import settings
settings.register_profile("ci", max_examples=100)
settings.register_profile("dev", max_examples=30)
```
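### Integration validation
- Full 144-combination validation script: `scripts/ops/validate_board_finance.py` (8 time_range × 9 area_code × 2 compare)
- area=all regression comparison: diff of old vs. new logic output
- Cache hit-rate check: a second request for a completed period does not trigger a SUM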
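The 144 combinations can be enumerated with `itertools.product`. This sketch only shows how the matrix is built; it is not the contents of `validate_board_finance.py`. The time ranges and area codes are the ones listed in this document, while representing `compare` as `0`/`1` is an assumption (only `compare=1` appears above).

```python
from itertools import product

TIME_RANGES = [
    "month", "week", "quarter",                      # current periods
    "lastMonth", "lastWeek", "lastQuarter",          # completed periods
    "quarter3", "half6",
]
AREA_CODES = [
    "all", "hall", "hallA", "hallB", "hallC",
    "vip", "snooker", "mahjong", "ktv",
]
COMPARE_FLAGS = [0, 1]  # assumed encoding of the compare parameter


def all_combinations() -> list[tuple[str, str, int]]:
    """Every (time_range, area_code, compare) triple: 8 × 9 × 2 = 144."""
    return list(product(TIME_RANGES, AREA_CODES, COMPARE_FLAGS))


if __name__ == "__main__":
    for time_range, area_code, compare in all_combinations():
        # A validation script would issue the request here and diff old vs. new output.
        print(f"time={time_range}&area={area_code}&compare={compare}")
```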