Design Document: Finance Board DWS Area-Dimension Refactor
Overview
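This refactor fixes the core bug in the finance board: when area≠all, discount data was read from the global DWS table, which badly distorted area-level discount ratios (for example, the B区 area showed a discount ratio of 417.9%).
It uses a two-layer architecture:
- Atomic layer dws_finance_area_daily: stores precomputed revenue, discount, cash-flow, and related figures for the 9 areas at daily granularity, keyed by (site_id, stat_date, area_code)
- Cache layer dws_finance_board_cache: caches aggregated results for completed periods to avoid repeated SUM computation
Backend queries become: for a completed period, check the cache first → on a miss, SUM from the daily table → for a current period, SUM directly from the daily table. The API signature and response structure are unchanged, so the frontend needs no changes.
Core design decisions:
- Discounts are aggregated directly by the table area of each settlement, with no apportioning (each settlement corresponds to exactly one table)
- The area mapping is extracted into a shared package so that ETL and the backend use the same configuration
- discount_gift_card uses the gift-card consumption amount (consistent with the existing ETL)
- Cash flow, recharge, and card consumption have values only when area_code='all'; they cannot be split by area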
Architecture
System architecture diagram
graph TB
subgraph "Data sources"
DWD[dwd_settlement_head<br/>Settlement details]
DIM[dim_table<br/>Table dimension, SCD2]
DWS_OLD[dws_finance_daily_summary<br/>Existing global daily summary]
end
subgraph "Shared package packages/shared/"
AM[area_mapping.py<br/>AREA_LABEL_MAP + resolve_area_code]
end
subgraph "ETL layer apps/etl/"
ETL1[DWS_FINANCE_AREA_DAILY<br/>Hourly · delete-before-insert]
ETL2[DWS_FINANCE_BOARD_CACHE<br/>Daily · fingerprint comparison]
end
subgraph "New DWS tables"
T1[dws_finance_area_daily<br/>Atomic layer · 9 rows/day/site]
T2[dws_finance_board_cache<br/>Cache layer · completed periods]
end
subgraph "Backend apps/backend/"
SVC[board_service.py<br/>Cache-first query logic]
FDW[fdw_queries.py<br/>get_finance_overview_area<br/>get_finance_revenue_area]
end
subgraph "Frontend (unchanged)"
MP[Mini Program finance board page]
end
DWD --> ETL1
DIM --> ETL1
DWS_OLD --> ETL1
AM --> ETL1
AM --> FDW
ETL1 --> T1
T1 --> ETL2
ETL2 --> T2
T1 --> FDW
T2 --> FDW
FDW --> SVC
SVC --> MP
Data flow
sequenceDiagram
participant Client as Mini Program
participant API as FastAPI
participant SVC as board_service
participant Cache as dws_finance_board_cache
participant Daily as dws_finance_area_daily
Client->>API: GET /api/xcx/board/finance?time=X&area=Y&compare=Z
API->>SVC: get_finance_board(time, area, compare, site_id)
alt Completed period (lastMonth/lastWeek/...)
SVC->>Cache: Query cache
alt Cache hit
Cache-->>SVC: Return cached data
else Cache miss
SVC->>Daily: SUM(area_code=Y, date_range)
Daily-->>SVC: Aggregated result
SVC->>Cache: Write cache
end
else Current period (month/week/quarter)
SVC->>Daily: SUM(area_code=Y, date_range)
Daily-->>SVC: Aggregated result
end
alt compare=1
SVC->>SVC: Run the same logic for the previous period
SVC->>SVC: calc_compare(current period, previous period)
end
SVC-->>API: FinanceBoardResponse
API-->>Client: JSON (camelCase)
ETL task dependencies
graph LR
A[DWD_LOAD_FROM_ODS] --> B[DWS_FINANCE_AREA_DAILY<br/>Hourly]
B --> C[DWS_FINANCE_BOARD_CACHE<br/>Once a day]
A --> D[DWS_FINANCE_DAILY<br/>Existing task · unchanged]
Components and Interfaces
1. Shared area mapping: packages/shared/src/neozqyy_shared/area_mapping.py
# Area code → list of physical area names
AREA_LABEL_MAP: dict[str, list[str]] = {
    "hallA": ["A区"],
    "hallB": ["B区"],
    "hallC": ["C区", "TV台", "美洲豹赛台"],
    "vip": ["VIP包厢"],
    "snooker": ["斯诺克区"],
    "mahjong": ["麻将房", "M7", "M8", "666", "发财"],
    "ktv": ["K包", "k包活动区", "幸会158"],
}
# All specific area codes (excluding all/hall)
SPECIFIC_AREA_CODES: list[str]  # ["hallA", "hallB", ..., "ktv"]
# All 9 area codes
ALL_AREA_CODES: list[str]  # ["all", "hall", "hallA", ..., "ktv"]
# Reverse mapping: physical area name → area code
_REVERSE_MAP: dict[str, str]  # {"A区": "hallA", "B区": "hallB", ...}
def resolve_area_code(area_name: str | None) -> str | None:
    """Given a site_table_area_name, return the matching area_code, or None if there is no match."""
def get_area_labels(area_code: str) -> list[str] | None:
    """Given an area_code, return its list of physical area names; all/hall return None."""
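Design decisions:
- hall = the sum of all specific areas (excluding all); semantically equivalent to all (kept for historical compatibility)
- all = the sum of all areas
- An unmatched area_name returns None; the ETL decides whether to log a warning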
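The interface above leaves the derived constants and function bodies unspecified. A minimal sketch of one way to fill them in follows, assuming exact, case-sensitive matching with no trimming (the document does not state the matching rule, so that part is an assumption):

```python
from __future__ import annotations

AREA_LABEL_MAP: dict[str, list[str]] = {
    "hallA": ["A区"],
    "hallB": ["B区"],
    "hallC": ["C区", "TV台", "美洲豹赛台"],
    "vip": ["VIP包厢"],
    "snooker": ["斯诺克区"],
    "mahjong": ["麻将房", "M7", "M8", "666", "发财"],
    "ktv": ["K包", "k包活动区", "幸会158"],
}

# Derived from the map so the three structures cannot drift apart.
SPECIFIC_AREA_CODES: list[str] = list(AREA_LABEL_MAP)
ALL_AREA_CODES: list[str] = ["all", "hall", *SPECIFIC_AREA_CODES]

# Reverse lookup built once at import time: physical area name -> area code.
_REVERSE_MAP: dict[str, str] = {
    label: code for code, labels in AREA_LABEL_MAP.items() for label in labels
}


def resolve_area_code(area_name: str | None) -> str | None:
    """Return the area_code for a physical area name, or None if unknown."""
    # Assumption: exact match only; no case folding or whitespace trimming.
    if area_name is None:
        return None
    return _REVERSE_MAP.get(area_name)


def get_area_labels(area_code: str) -> list[str] | None:
    """Return the physical area names for a code; all/hall yield None."""
    labels = AREA_LABEL_MAP.get(area_code)
    return list(labels) if labels is not None else None
```

Deriving SPECIFIC_AREA_CODES and _REVERSE_MAP from AREA_LABEL_MAP keeps a single source of truth, which fits the stated goal of ETL and backend sharing one configuration.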
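2. ETL task: DWS_FINANCE_AREA_DAILY
Location: apps/etl/connectors/feiqiu/tasks/dws/finance_area_daily.py
Inherits from FinanceBaseTask (reusing its settlement extraction methods) and overrides extract / transform / load:
- extract: pulls the day's settlements from dwd_settlement_head + dim_table (using the business-day cutoff), and also pulls the global cash-flow/recharge/card-consumption fields from dws_finance_daily_summary
- transform: uses resolve_area_code to map each settlement to an area, aggregates the revenue and discount fields by area, and builds 9 rows (7 specific areas + hall + all)
- load: delete-before-insert (delete by site_id + stat_date, then insert the 9 rows)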
Key interface:
class FinanceAreaDailyTask(FinanceBaseTask):
    def get_task_code(self) -> str: return "DWS_FINANCE_AREA_DAILY"
    def get_target_table(self) -> str: return "dws.dws_finance_area_daily"
    def get_primary_keys(self) -> list[str]: return ["site_id", "stat_date", "area_code"]
    def extract(self, context: TaskContext) -> dict: ...
    def transform(self, extracted: dict, context: TaskContext) -> list[dict]: ...
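The aggregation step inside transform can be sketched as a pure function. This is an illustration only: the settlement dict keys (area_name, gross_amount, discount_total) and the build_area_rows helper are hypothetical, only two amount fields are aggregated to keep it short, and settlements with an unmatched area are counted in the all row only, following the error-handling rule stated later in this document.

```python
from __future__ import annotations

from decimal import Decimal
from typing import Callable, Iterable, Optional

SPECIFIC_AREA_CODES = ["hallA", "hallB", "hallC", "vip", "snooker", "mahjong", "ktv"]
ALL_AREA_CODES = ["all", "hall", *SPECIFIC_AREA_CODES]
# Only two amount fields are aggregated here to keep the sketch short.
AMOUNT_FIELDS = ["gross_amount", "discount_total"]


def build_area_rows(
    settlements: Iterable[dict],
    resolve: Callable[[Optional[str]], Optional[str]],
) -> list[dict]:
    """Aggregate settlements into exactly 9 rows, one per area_code."""
    totals = {
        code: {field: Decimal("0") for field in AMOUNT_FIELDS}
        for code in ALL_AREA_CODES
    }
    for settlement in settlements:
        code = resolve(settlement.get("area_name"))
        # Every settlement counts toward "all"; only mapped settlements
        # also reach "hall" and their specific area.
        targets = ["all"]
        if code in SPECIFIC_AREA_CODES:
            targets += ["hall", code]
        for target in targets:
            for field in AMOUNT_FIELDS:
                totals[target][field] += settlement[field]
    return [
        {
            "area_code": code,
            **values,
            "confirmed_income": values["gross_amount"] - values["discount_total"],
        }
        for code, values in totals.items()
    ]
```

Because every area code is pre-seeded with zero totals, the function returns 9 rows even when a given area had no settlements that day.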
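3. ETL task: DWS_FINANCE_BOARD_CACHE
Location: apps/etl/connectors/feiqiu/tasks/dws/finance_board_cache.py
Inherits from BaseDwsTask:
- extract: iterates over 5 completed periods × 9 areas = 45 combinations, reading the daily rows from dws_finance_area_daily for each combination
- transform: computes a data fingerprint (MD5), compares it with the cache table, and flags the combinations that need recomputation
- load: for the flagged combinations, SUMs from the daily table and upserts the result into the cache table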
Fingerprint computation:
import hashlib
import json

def compute_fingerprint(rows: list[dict]) -> str:
    """Sort rows by stat_date, then MD5 the (stat_date, gross_amount, discount_total) triples."""
    sorted_rows = sorted(rows, key=lambda r: str(r['stat_date']))
    payload = json.dumps([(str(r['stat_date']), str(r['gross_amount']), str(r['discount_total'])) for r in sorted_rows])
    return hashlib.md5(payload.encode()).hexdigest()
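How the fingerprint might drive the "needs recomputation" decision in transform is sketched below. The needs_recompute helper is hypothetical, and compute_fingerprint is repeated only so the snippet runs on its own.

```python
from __future__ import annotations

import hashlib
import json
from typing import Optional


def compute_fingerprint(rows: list[dict]) -> str:
    """MD5 over (stat_date, gross_amount, discount_total), ordered by stat_date."""
    sorted_rows = sorted(rows, key=lambda r: str(r["stat_date"]))
    payload = json.dumps(
        [
            (str(r["stat_date"]), str(r["gross_amount"]), str(r["discount_total"]))
            for r in sorted_rows
        ]
    )
    return hashlib.md5(payload.encode()).hexdigest()


def needs_recompute(cached_fingerprint: Optional[str], rows: list[dict]) -> bool:
    """True when no cache entry exists yet or the daily rows have changed."""
    return cached_fingerprint != compute_fingerprint(rows)
```

Since the rows are sorted by stat_date before hashing, the order in which the database returns them does not affect the result. A change to any field outside the three hashed columns would go undetected, which is worth keeping in mind if the cached metrics ever depend on other columns.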
4. Backend query changes: fdw_queries.py
New and modified functions:
def get_finance_overview_area(
    conn, site_id: int, start_date: str, end_date: str, area_code: str = "all"
) -> dict:
    """Aggregate the 8 overview metrics by area_code from v_dws_finance_area_daily."""
def get_finance_revenue_area(
    conn, site_id: int, start_date: str, end_date: str, area_code: str = "all"
) -> dict:
    """Aggregate the revenue section data by area_code from v_dws_finance_area_daily."""
def get_finance_board_cache(
    conn, site_id: int, time_range: str, area_code: str
) -> dict | None:
    """Look up the cache in v_dws_finance_board_cache."""
def set_finance_board_cache(
    conn, site_id: int, time_range: str, area_code: str, data: dict
) -> None:
    """Insert or update the cache entry."""
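5. Backend service changes: board_service.py
The get_finance_board function changes as follows:
- Adds cache lookup logic: completed periods check the cache first
- _build_overview now calls get_finance_overview_area (passing area_code)
- _build_revenue now calls get_finance_revenue_area (passing area_code)
- _build_cashflow is unchanged (it always uses global data)
- The overview override logic for area≠all is retained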
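The cache-first routing can be sketched independently of the database by injecting the three data-access operations as callables. The load_period function and its parameter names are hypothetical; the degradation behavior on cache errors follows the error-handling table later in this document.

```python
from __future__ import annotations

from typing import Callable, Optional

# Completed periods are cacheable; month/week/quarter are always recomputed.
COMPLETED_RANGES = {"lastMonth", "lastWeek", "lastQuarter", "quarter3", "half6"}


def load_period(
    time_range: str,
    area_code: str,
    read_cache: Callable[[str, str], Optional[dict]],
    write_cache: Callable[[str, str, dict], None],
    sum_daily: Callable[[str, str], dict],
) -> dict:
    """Return aggregated data for one period, consulting the cache when allowed."""
    if time_range not in COMPLETED_RANGES:
        # Current period: never touch the cache.
        return sum_daily(time_range, area_code)
    try:
        cached = read_cache(time_range, area_code)
    except Exception:
        # Cache table unreachable: degrade to a direct SUM.
        return sum_daily(time_range, area_code)
    if cached is not None:
        return cached
    data = sum_daily(time_range, area_code)
    try:
        write_cache(time_range, area_code, data)
    except Exception:
        # A failed cache write must not fail the request; real code would log it.
        pass
    return data
```

Passing the accessors in as arguments also makes the routing easy to test with in-memory fakes, without mocking a database connection.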
Data Models
1. dws_finance_area_daily: atomic layer
CREATE TABLE dws.dws_finance_area_daily (
    id BIGSERIAL PRIMARY KEY,
    site_id BIGINT NOT NULL,
    tenant_id BIGINT NOT NULL,
    stat_date DATE NOT NULL,
    area_code VARCHAR(20) NOT NULL,
    -- Revenue structure (4 items + gross_amount)
    table_fee_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    goods_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    assistant_pd_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    assistant_cx_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    gross_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Discount breakdown (6 items + discount_total)
    discount_groupbuy NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_vip NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_manual NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_gift_card NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_rounding NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_other NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Confirmed income
    confirmed_income NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Cash flow (area_code='all' only)
    cash_pay_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_paper_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    scan_pay_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    groupbuy_pay_amount NUMERIC(14,2) NOT NULL DEFAULT 0,
    recharge_cash_inflow NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_inflow_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_outflow_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_balance_change NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Card consumption (area_code='all' only)
    card_consume_total NUMERIC(14,2) NOT NULL DEFAULT 0,
    recharge_card_consume NUMERIC(14,2) NOT NULL DEFAULT 0,
    gift_card_consume NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Recharge (area_code='all' only)
    recharge_cash NUMERIC(14,2) NOT NULL DEFAULT 0,
    first_recharge_cash NUMERIC(14,2) NOT NULL DEFAULT 0,
    renewal_cash NUMERIC(14,2) NOT NULL DEFAULT 0,
    -- Order statistics
    order_count INTEGER NOT NULL DEFAULT 0,
    -- Metadata
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (site_id, stat_date, area_code)
);
Constraints and identities:
- gross_amount = table_fee_amount + goods_amount + assistant_pd_amount + assistant_cx_amount
- discount_total = discount_groupbuy + discount_vip + discount_manual + discount_gift_card + discount_rounding + discount_other
- confirmed_income = gross_amount - discount_total
- area_code ∈ {all, hall, hallA, hallB, hallC, vip, snooker, mahjong, ktv}
- When area_code ≠ 'all', the cash-flow, card-consumption, and recharge fields = 0
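These identities are easy to check mechanically on a row dict. The sketch below is one possible validator; the check_daily_row helper and its error labels are hypothetical, while the field names are taken from the table definition above.

```python
from __future__ import annotations

from decimal import Decimal

REVENUE_FIELDS = [
    "table_fee_amount", "goods_amount", "assistant_pd_amount", "assistant_cx_amount",
]
DISCOUNT_FIELDS = [
    "discount_groupbuy", "discount_vip", "discount_manual",
    "discount_gift_card", "discount_rounding", "discount_other",
]
# Fields that may be non-zero only on the area_code='all' row.
ALL_ONLY_FIELDS = [
    "cash_pay_amount", "cash_paper_amount", "scan_pay_amount", "groupbuy_pay_amount",
    "recharge_cash_inflow", "cash_inflow_total", "cash_outflow_total",
    "cash_balance_change", "card_consume_total", "recharge_card_consume",
    "gift_card_consume", "recharge_cash", "first_recharge_cash", "renewal_cash",
]
VALID_AREA_CODES = {
    "all", "hall", "hallA", "hallB", "hallC", "vip", "snooker", "mahjong", "ktv",
}


def check_daily_row(row: dict) -> list[str]:
    """Return the labels of every violated constraint (empty list if valid)."""
    zero = Decimal("0")
    errors: list[str] = []
    if row["gross_amount"] != sum((row[f] for f in REVENUE_FIELDS), zero):
        errors.append("gross_amount")
    if row["discount_total"] != sum((row[f] for f in DISCOUNT_FIELDS), zero):
        errors.append("discount_total")
    if row["confirmed_income"] != row["gross_amount"] - row["discount_total"]:
        errors.append("confirmed_income")
    if row["area_code"] not in VALID_AREA_CODES:
        errors.append("area_code")
    elif row["area_code"] != "all" and any(row[f] != zero for f in ALL_ONLY_FIELDS):
        errors.append("all_only_fields")
    return errors
```

Using Decimal rather than float mirrors the NUMERIC(14,2) columns and avoids false identity failures caused by binary rounding.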
2. dws_finance_board_cache: cache layer
CREATE TABLE dws.dws_finance_board_cache (
    id BIGSERIAL PRIMARY KEY,
    site_id BIGINT NOT NULL,
    time_range VARCHAR(20) NOT NULL,
    area_code VARCHAR(20) NOT NULL,
    start_date DATE NOT NULL,
    end_date DATE NOT NULL,
    prev_start_date DATE,
    prev_end_date DATE,
    -- 8 overview metrics
    occurrence NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount NUMERIC(14,2) NOT NULL DEFAULT 0,
    discount_rate NUMERIC(8,4) NOT NULL DEFAULT 0,
    confirmed_revenue NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_in NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_out NUMERIC(14,2) NOT NULL DEFAULT 0,
    cash_balance NUMERIC(14,2) NOT NULL DEFAULT 0,
    balance_rate NUMERIC(8,4) NOT NULL DEFAULT 0,
    -- Fingerprint
    data_fingerprint VARCHAR(64),
    computed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    -- Metadata
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (site_id, time_range, area_code)
);
Cache strategy:
- time_range ∈ {lastMonth, lastWeek, lastQuarter, quarter3, half6} → cached
- time_range ∈ {month, week, quarter} → not cached
- Invalidation condition: data_fingerprint changes (caused by backfilled data)
3. Area mapping data model
from typing import Literal

# area_code enum values
AREA_CODES = Literal[
    "all", "hall", "hallA", "hallB", "hallC",
    "vip", "snooker", "mahjong", "ktv"
]
# AREA_LABEL_MAP structure
AREA_LABEL_MAP: dict[str, list[str]] = {
    "hallA": ["A区"],
    "hallB": ["B区"],
    "hallC": ["C区", "TV台", "美洲豹赛台"],
    "vip": ["VIP包厢"],
    "snooker": ["斯诺克区"],
    "mahjong": ["麻将房", "M7", "M8", "666", "发财"],
    "ktv": ["K包", "k包活动区", "幸会158"],
}
Correctness Properties
A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.
Property 1: Area mapping round-trip
For any area_name that appears in one of the value lists of AREA_LABEL_MAP, resolve_area_code(area_name) should return the corresponding area_code, and area_name in get_area_labels(resolve_area_code(area_name)) should be True.
Validates: Requirements 1.1, 1.5
Property 2: Unknown area names return None
For any string area_name that does not appear in any value list of AREA_LABEL_MAP, resolve_area_code(area_name) should return None.
Validates: Requirements 1.4
Property 3: Arithmetic identities on daily rows
For any row of dws_finance_area_daily, the following three identities must all hold:
- gross_amount = table_fee_amount + goods_amount + assistant_pd_amount + assistant_cx_amount
- discount_total = discount_groupbuy + discount_vip + discount_manual + discount_gift_card + discount_rounding + discount_other
- confirmed_income = gross_amount - discount_total
Validates: Requirements 2.1, 2.2, 2.3, 8.3
Property 4: Cash flow, card consumption, and recharge are zero for non-all areas
For any row of dws_finance_area_daily with area_code ≠ 'all', all cash-flow fields (cash_pay_amount, cash_paper_amount, scan_pay_amount, groupbuy_pay_amount, recharge_cash_inflow, cash_inflow_total, cash_outflow_total, cash_balance_change), card-consumption fields (card_consume_total, recharge_card_consume, gift_card_consume), and recharge fields (recharge_cash, first_recharge_cash, renewal_cash) should be 0.
Validates: Requirements 2.5
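Property 5: ETL output completeness and aggregation correctness
For any set of input settlements, the ETL transform should output exactly 9 rows (area_code covering all/hall/hallA/hallB/hallC/vip/snooker/mahjong/ktv). The revenue and discount fields of the all row should equal the sum of the corresponding fields over the hallA through ktv rows, and the revenue and discount fields of the hall row should equal the sum of the corresponding fields over the hallA through ktv rows.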
Validates: Requirements 2.7, 2.8, 8.4
Property 6: ETL idempotency (delete-before-insert)
For any set of input settlements, running the ETL transform twice for the same (site_id, stat_date) should produce identical output both times.
Validates: Requirements 3.4
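Property 7: settle_type filtering
For any set of settlements with varying settle_type values, the ETL processes only records with settle_type IN (1, 3); settlements with any other settle_type should not affect the output amounts.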
Validates: Requirements 3.6
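Property 8: Fingerprint determinism and cache invalidation
For any set of daily rows, compute_fingerprint is deterministic (the same input yields the same output). In addition, for any modification to the source data (changing the gross_amount or discount_total of any row), the new fingerprint should differ from the original.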
Validates: Requirements 5.2, 5.3, 5.4
Property 9: Current periods are never written to the cache
For any time_range ∈ {month, week, quarter}, the ETL cache task should not write a cache record for that time_range.
Validates: Requirements 5.7
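Property 10: Query routing correctness
For any query request: when time_range is a completed period and a cache entry exists, the cached data should be returned directly; when no cache entry exists, the result should be computed by SUM over the daily table and written to the cache; when time_range is a current period, the result should be computed by SUM over the daily table without consulting the cache.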
Validates: Requirements 6.1, 6.2, 6.3, 9.4
Property 11: Area filtering behavior
For any query with area_code ≠ 'all', the recharge section should return null, and the data in the cashflow/expense/coach_analysis sections should match what is returned for area_code='all'.
Validates: Requirements 6.7, 6.8
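Property 12: Fixed item counts in revenue
For any revenue section returned by a query, discount_items should contain exactly 5 items (group buy / member discount / manual adjustment / gift card / other), and channel_items should contain exactly 3 items (stored-value card settlement write-off / cash and online payment / group-buy redemption).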
Validates: Requirements 7.3, 7.4
Property 13: Overview override logic when area≠all
For any query with area_code ≠ 'all', overview.occurrence should equal revenue.total_occurrence, overview.discount should equal revenue.discount_total, and overview.confirmed_revenue should equal revenue.confirmed_total.
Validates: Requirements 7.6
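The override rule in Property 13 amounts to copying three values from the revenue section into the overview. A minimal sketch, assuming both sections are plain dicts with snake_case keys (the apply_area_override helper and the dict shapes are hypothetical; other overview fields such as the rates are left untouched because the property does not cover them):

```python
from __future__ import annotations


def apply_area_override(overview: dict, revenue: dict, area_code: str) -> dict:
    """Return a copy of overview with area-level totals taken from revenue."""
    merged = dict(overview)
    if area_code == "all":
        # Global view: overview is already computed from global data.
        return merged
    merged["occurrence"] = revenue["total_occurrence"]
    merged["discount"] = revenue["discount_total"]
    merged["confirmed_revenue"] = revenue["confirmed_total"]
    return merged
```

Returning a copy rather than mutating the input keeps the function safe to call on data that might also be written to the cache.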
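Property 14: Regression consistency for area=all
For any date range queried with area_code='all', the 8 overview metrics from the new logic (reading dws_finance_area_daily) should be exactly equal to those from the old logic (reading dws_finance_daily_summary).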
Validates: Requirements 9.1
Error Handling
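ETL layer
| Scenario | Handling |
|---|---|
| resolve_area_code returns None (unknown area) | Log a WARNING; the settlement is not counted in any specific area row but is still counted in the all row |
| dws_finance_daily_summary has no data for the day | Fill the cash-flow/recharge/card-consumption fields of the all row with 0 and log a WARNING |
| table_id has no match in dim_table (scd2_is_current=1) | Treat the settlement's area_code as None and handle as above |
| delete-before-insert transaction fails | Roll back the whole transaction, mark the task as failed, and retry on the next scheduled run |
| Daily table has no data when computing the fingerprint | The fingerprint is the MD5 of an empty string, and the cache entry is marked as "no data" |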
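Backend query layer
| Scenario | Handling |
|---|---|
| dws_finance_area_daily has no data (new site / new date) | Return an all-zero overview/revenue, consistent with the existing fallback logic |
| Cache write fails | Does not affect the query response; log an ERROR and retry the write on the next request |
| Cache table connection fails | Fall back to a direct SUM over the daily table without interrupting the request |
| Invalid area_code parameter | Rejected by FastAPI's AreaFilterEnum validation, returning 422 |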
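Data consistency safeguards
- The ETL delete-before-insert runs inside a single transaction, guaranteeing atomicity
- Cache writes use ON CONFLICT ... DO UPDATE, guaranteeing idempotency
- Backend queries use SET LOCAL app.current_site_id to guarantee RLS isolation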
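The first two safeguards can be demonstrated end to end with Python's built-in sqlite3 module. This is a stand-in only: the target system uses PostgreSQL, the table definitions are trimmed to a few columns, and the replace_daily_rows and upsert_cache helpers are hypothetical. SQLite's upsert syntax (available since SQLite 3.24) is close enough to PostgreSQL's ON CONFLICT ... DO UPDATE to illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE area_daily ("
    " site_id INTEGER NOT NULL, stat_date TEXT NOT NULL,"
    " area_code TEXT NOT NULL, gross_amount TEXT NOT NULL,"
    " UNIQUE (site_id, stat_date, area_code))"
)
conn.execute(
    "CREATE TABLE board_cache ("
    " site_id INTEGER NOT NULL, time_range TEXT NOT NULL,"
    " area_code TEXT NOT NULL, occurrence TEXT NOT NULL,"
    " data_fingerprint TEXT,"
    " UNIQUE (site_id, time_range, area_code))"
)

UPSERT_SQL = (
    "INSERT INTO board_cache"
    " (site_id, time_range, area_code, occurrence, data_fingerprint)"
    " VALUES (?, ?, ?, ?, ?)"
    " ON CONFLICT (site_id, time_range, area_code)"
    " DO UPDATE SET occurrence = excluded.occurrence,"
    " data_fingerprint = excluded.data_fingerprint"
)


def replace_daily_rows(conn, site_id, stat_date, rows):
    """Delete-before-insert for one (site_id, stat_date) in one transaction."""
    # "with conn" commits on success and rolls back if any statement raises,
    # so a failed insert also undoes the preceding delete.
    with conn:
        conn.execute(
            "DELETE FROM area_daily WHERE site_id = ? AND stat_date = ?",
            (site_id, stat_date),
        )
        conn.executemany(
            "INSERT INTO area_daily (site_id, stat_date, area_code, gross_amount)"
            " VALUES (?, ?, ?, ?)",
            [(site_id, stat_date, r["area_code"], r["gross_amount"]) for r in rows],
        )


def upsert_cache(conn, site_id, time_range, area_code, occurrence, fingerprint):
    """Idempotent cache write: repeated calls leave exactly one row per key."""
    with conn:
        conn.execute(
            UPSERT_SQL, (site_id, time_range, area_code, occurrence, fingerprint)
        )
```

Running either helper twice with the same arguments leaves the tables in the same state as running it once, which is the property the hourly and daily ETL schedules rely on.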
Testing Strategy
Property-based testing
Uses the hypothesis library (Python), with at least 100 iterations per property test.
| Property | Test file | Generator |
|---|---|---|
| Property 1-2 | tests/test_area_mapping_props.py | st.text() generates random area_name values |
| Property 3-7 | tests/test_finance_area_daily_props.py | Random settlement lists (amounts via st.decimals, area_name drawn from a mix of known and unknown names) |
| Property 8-9 | tests/test_finance_board_cache_props.py | Random lists of daily rows |
| Property 10-14 | tests/test_board_service_props.py | Random query parameters + mocked database responses |
Every test function must carry a comment tag:
# Feature: board-finance-dws-area-refactor, Property 1: Area mapping round-trip
Unit tests
| Scope | Test file | Focus |
|---|---|---|
| area_mapping | tests/test_area_mapping_unit.py | Edge cases: empty string, None, letter case, special characters |
| ETL transform | apps/etl/connectors/feiqiu/tests/unit/test_finance_area_daily.py | discount_gift_card definition, business-day cutoff boundaries |
| ETL cache | apps/etl/connectors/feiqiu/tests/unit/test_finance_board_cache.py | Fingerprint change detection, empty-data handling |
| Backend queries | apps/backend/tests/unit/test_fdw_queries_area.py | SQL correctness, area_code filtering, cache hit/miss |
| Backend service | apps/backend/tests/unit/test_board_service_area.py | Override logic, period-over-period comparison, fallback behavior |
| Regression validation | scripts/ops/validate_board_finance.py | Full comparison across all 144 combinations |
Test configuration
# conftest.py / hypothesis settings
from hypothesis import settings
settings.register_profile("ci", max_examples=100)
settings.register_profile("dev", max_examples=30)
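Integration validation
- Full validation script over 144 combinations: scripts/ops/validate_board_finance.py (8 time_range × 9 area_code × 2 compare)
- area=all regression comparison: diff the output of the new and old logic
- Cache hit verification: a second request for a completed period does not trigger a SUM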
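The 144 combinations can be enumerated with itertools.product. The sketch below only generates the request parameters; the actual contents of validate_board_finance.py are not shown in this document, the iter_combinations helper is hypothetical, and compare values of 0 and 1 are an assumption based on the compare=1 branch in the data flow diagram.

```python
from itertools import product

# 3 current periods + 5 completed periods, as listed in the cache strategy.
TIME_RANGES = [
    "month", "week", "quarter",
    "lastMonth", "lastWeek", "lastQuarter", "quarter3", "half6",
]
AREA_CODES = [
    "all", "hall", "hallA", "hallB", "hallC", "vip", "snooker", "mahjong", "ktv",
]
COMPARE_FLAGS = [0, 1]  # assumption: 0 = no comparison, 1 = compare with previous period


def iter_combinations():
    """Yield query parameters for every time/area/compare combination."""
    for time_range, area_code, compare in product(TIME_RANGES, AREA_CODES, COMPARE_FLAGS):
        yield {"time": time_range, "area": area_code, "compare": compare}
```

The dict keys mirror the query-string parameters of GET /api/xcx/board/finance, so each item can be passed directly as request parameters by a validation script.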