DWD complete

Neo
2025-12-09 04:57:05 +08:00
parent f301cc1fd5
commit 561c640700
46 changed files with 26181 additions and 3540 deletions


README.md

@@ -1,79 +1,88 @@
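# Billiards Hall ETL System
# Feiqiu (飞球) ETL System
Data collection and lake ingestion for billiards store operations: pulls orders, payments, members, inventory and other data from the upstream API, lands it in ODS first, then cleans it and writes it to fact/dimension tables, and provides run tracking, incremental cursors, data quality checks, and test scaffolding.
An ETL pipeline for store operations: pulls order/payment/member/inventory JSON from the upstream API, lands it in ODS first, then cleans and loads it into DWD (SCD2 dimensions and incremental facts), and provides quality checks and regression verification tools.
## Core Features
- **Two-stage pipeline**: ODS raw retention + DWD/fact table cleaning, with support for replay and re-runs
- **Task registration and scheduling**: `TaskRegistry` manages task codes centrally; `ETLScheduler` handles cursors, run records, and failure isolation
- **Shared foundation**: configuration (defaults + `.env` + CLI overrides), an API client with pagination/retry, a database wrapper with batch upsert, SCD2 dimension handling, and quality checks
- **Testing and replay**: ONLINE/OFFLINE mode switching; `run_tests.py`/`test_presets.py` support parameterized tests; `MANUAL_INGEST` can re-ingest archived JSON into ODS
- **Installable**: `setup.py` / `entry_point` provides the `etl-billiards` command, or run directly with `python -m cli.main`
## Key Features
- Two-layer design: ODS raw retention + DWD cleaning and standardization, with support for replay and re-runs.
- Task scheduling: ETLScheduler centrally manages tasks, logging, and failure isolation; CLI friendly.
- Configuration: defaults + .env + CLI overrides, for running across multiple environments.
- Batch loading: generic ODS Loader / SCD2 dimension merge / incremental fact writes.
- Regression checks: sample JSON, row-count comparison, and quality reports for quick verification.
## Repository Structure (excerpt)
- `etl_billiards/config`: default configuration, environment variable parsing, configuration loading
- `etl_billiards/api`: HTTP client with built-in retry/pagination.
- `etl_billiards/database`: connection management, batch upsert.
- `etl_billiards/tasks`: business tasks (ORDERS, PAYMENTS…), ODS tasks, DWD tasks, manual replay; `base_task.py`/`base_dwd_task.py` provide templates
- `etl_billiards/loaders`: fact/dimension/ODS Loaders; `scd/` is for SCD2
- `etl_billiards/orchestration`: scheduler, task registry, cursors, and run tracking
- `etl_billiards/scripts`: test runner, database connectivity check, preset test commands
- `etl_billiards/tests`: unit/integration tests and offline JSON archive
- `C:\dev\LLTQ\export\temp\source-data-doc`: sample test data JSON
## Repository Structure
- etl_billiards/config: default configuration, environment variable parsing, CLI overrides.
- etl_billiards/api: HTTP client with retry and pagination wrappers.
- etl_billiards/database: connection management, batch upsert wrapper, DDL.
- etl_billiards/tasks: business tasks (ODS/DWD/initialization/manual ingest, etc.).
- etl_billiards/loaders: ODS/DWD/SCD Loader implementations.
- etl_billiards/orchestration: scheduler and task registration.
- etl_billiards/scripts: test, rebuild, and health-check scripts.
- etl_billiards/reports: quality report output.
- etl_billiards/docs: ODS->DWD mapping notes and sample JSON notes.
## Supported Task Codes
- **Fact/dimension**: `ORDERS`, `PAYMENTS`, `REFUNDS`, `INVENTORY_CHANGE`, `COUPON_USAGE`, `MEMBERS`, `ASSISTANTS`, `PRODUCTS`, `TABLES`, `PACKAGES_DEF`, `TOPUPS`, `TABLE_DISCOUNT`, `ASSISTANT_ABOLISH`, `LEDGER`, `TICKET_DWD`, `PAYMENTS_DWD`, `MEMBERS_DWD`
- **ODS raw collection**: `ODS_ORDER_SETTLE`, `ODS_TABLE_USE`, `ODS_ASSISTANT_LEDGER`, `ODS_ASSISTANT_ABOLISH`, `ODS_GOODS_LEDGER`, `ODS_PAYMENT`, `ODS_REFUND`, `ODS_COUPON_VERIFY`, `ODS_MEMBER`, `ODS_MEMBER_CARD`, `ODS_PACKAGE`, `ODS_INVENTORY_STOCK`, `ODS_INVENTORY_CHANGE`
- **Auxiliary**: `MANUAL_INGEST` (replays archived JSON into ODS)
## Main Supported Tasks
- ODS: order settlement, table fee transactions, assistant service records/cancellations, inventory, payments, refunds, members, recharge settlements, etc.
- DWD: dimension tables (stores/tables/members/assistants/goods, etc.) and fact tables (settlements, payments, refunds, recharges, table fees, goods sales, etc.).
- Initialization and manual ingest: INIT_ODS_SCHEMA, MANUAL_INGEST.
## Quick Start
1. **Requirements**: Python 3.10+, PostgreSQL. Running commands from the `etl_billiards/` directory is recommended.
2. **Install dependencies**
1) Environment: Python 3.10+ and a reachable PostgreSQL; run commands from etl_billiards/.
2) Install dependencies: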
```bash
cd etl_billiards
pip install -r requirements.txt
# Development mode: pip install -e .
```
3. **Configure `.env`**
3) Configure .env (key example entries):
```bash
cp .env.example .env
# Core settings
PG_DSN=postgresql://user:pwd@host:5432/LLZQ
PG_DSN=postgresql://user:pwd@host:5432/LLZQ-test
API_BASE=https://api.example.com
API_TOKEN=your_token
STORE_ID=2790685415443269
EXPORT_ROOT=/path/to/export
LOG_ROOT=/path/to/logs
EXPORT_ROOT=C:\dev\LLTQ\export\JSON
LOG_ROOT=C:\dev\LLTQ\export\LOG
INGEST_SOURCE_DIR=C:\dev\LLTQ\export\test-json-doc
```
Configuration precedence, lowest to highest: defaults < environment variables/.env < CLI arguments.
4. **Run tasks**
4) Initialize the database tables:
```bash
# Run the default task set
python -m cli.main
# Select tasks as needed (comma-separated)
python -m cli.main --tasks ODS_ORDER_SETTLE,ORDERS,PAYMENTS
# Dry-run example (transaction is not committed)
python -m cli.main --tasks ORDERS --dry-run
# Windows batch script
..\\run_etl.bat --tasks PAYMENTS
python -m cli.main --tasks INIT_ODS_SCHEMA --pipeline-flow INGEST_ONLY --ingest-source "C:\dev\LLTQ\export\test-json-doc"
# Or run schema_*.sql directly with psql
```
5) Run tasks (examples):
```bash
# Default task list (see config/defaults.py)
python -m cli.main
# Specific tasks
python -m cli.main --tasks settlement_records,recharge_settlements
# Manually ingest the sample JSON only
python -m cli.main --tasks MANUAL_INGEST --pipeline-flow INGEST_ONLY --ingest-source "C:\dev\LLTQ\export\test-json-doc"
```
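5. **Check the output**: the log and export directories are controlled by `LOG_ROOT` and `EXPORT_ROOT` respectively; run tracking and cursor records are written to the `etl_admin.*` tables in the database.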
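The Quick Start section states that configuration is layered as defaults < environment variables/.env < CLI arguments. The following is a minimal sketch of that layering, assuming flat dotted keys. `merge_config` and the sample values are hypothetical and do not reproduce the project's `AppConfig.load`.

```python
from __future__ import annotations

from typing import Any, Mapping


def merge_config(*layers: Mapping[str, Any]) -> dict[str, Any]:
    """Merge config layers in order; later layers override earlier ones."""
    merged: dict[str, Any] = {}
    for layer in layers:
        for key, value in layer.items():
            # Skip unset values so an omitted CLI flag cannot erase a lower layer.
            if value is not None:
                merged[key] = value
    return merged


# Hypothetical values, for illustration only.
defaults = {"pipeline.flow": "FULL", "api.page_size": 200, "api.timeout_sec": 20}
env_file = {"api.page_size": 500}
cli_args = {"pipeline.flow": "INGEST_ONLY", "api.page_size": None}

config = merge_config(defaults, env_file, cli_args)
print(config)
```

Treating `None` as "not provided" is one possible design choice; it lets the CLI layer carry every known flag without overriding values that the user did not pass.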
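## Data and Run Flow
- The CLI parses arguments → `AppConfig.load()` assembles the configuration → `ETLScheduler` creates the DB/API/cursor/run-tracker objects
- The scheduler instantiates tasks by task code, reads/advances cursors, and persists run records
- Task template: determine the time window → fetch API/ODS data → parse and validate → Loader batch upsert/SCD2 → quality checks → commit the transaction and write back the cursor
## Run and Data Flow
- The CLI parses arguments -> AppConfig.load merges the configuration -> ETLScheduler creates the DB/API/logging context -> tasks are instantiated -> fetch/clean/write.
- ODS tasks: call the API with pagination, parse the extracted fields, then batch upsert; payload keeps the raw JSON.
- DWD tasks: dimension tables use SCD2; fact tables are written incrementally by time watermark.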
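The data-flow notes say that DWD dimension tables use SCD2. A minimal in-memory sketch of that versioning rule follows. It reuses the column names `scd2_start_time`, `scd2_end_time`, `scd2_is_current`, and `scd2_version` from the DWD schema in this commit; `DimVersion`, `apply_scd2`, and the sample data are hypothetical and are not the project's loader.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime

OPEN_END = datetime(9999, 12, 31)  # sentinel end time for the current version


@dataclass
class DimVersion:
    business_key: str
    attrs: dict
    scd2_start_time: datetime
    scd2_end_time: datetime = OPEN_END
    scd2_is_current: int = 1
    scd2_version: int = 1


def apply_scd2(history: list[DimVersion], key: str, attrs: dict, now: datetime) -> None:
    """Close the current version and append a new one when attributes change."""
    current = next(
        (v for v in history if v.business_key == key and v.scd2_is_current == 1),
        None,
    )
    if current is not None and current.attrs == attrs:
        return  # nothing changed, keep the current version open
    next_version = 1
    if current is not None:
        # End the old version exactly where the new one starts, so ranges never overlap.
        current.scd2_end_time = now
        current.scd2_is_current = 0
        next_version = current.scd2_version + 1
    history.append(DimVersion(key, dict(attrs), now, scd2_version=next_version))


history: list[DimVersion] = []
apply_scd2(history, "member-1", {"level": "silver"}, datetime(2025, 1, 1))
apply_scd2(history, "member-1", {"level": "gold"}, datetime(2025, 6, 1))
apply_scd2(history, "member-1", {"level": "gold"}, datetime(2025, 7, 1))  # no change
```

After these three calls the history holds two versions: the closed `silver` row and the open `gold` row. The third call is a no-op because the attributes are unchanged.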
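## Testing and Replay
- Unit/integration tests: `pytest` or `python scripts/run_tests.py --suite online`
- Preset combinations: `python scripts/run_tests.py --preset offline_realdb` (see `scripts/test_presets.py`)
- Offline mode: `TEST_MODE=OFFLINE TEST_JSON_ARCHIVE_DIR=... pytest tests/unit/test_etl_tasks_offline.py`
- Database connectivity: `python scripts/test_db_connection.py --dsn postgresql://... --query "SELECT 1"`.
## Testing and Replay
- Unit/integration: pytest or python scripts/run_tests.py --suite online.
- Offline mode: TEST_MODE=OFFLINE TEST_JSON_ARCHIVE_DIR=... pytest tests/unit/test_etl_tasks_offline.py
- Database connectivity: python scripts/test_db_connection.py --dsn <PG_DSN> --query "SELECT 1"
## Other Notes
- `.env.example` lists all commonly used settings; `config/defaults.py` records the defaults and task window configuration
- `loaders/ods/generic.py` lands data in ODS once the primary key/column names are defined; `tasks/manual_ingest_task.py` can quickly ingest archived JSON into the corresponding ODS tables.
- To add a new task, implement it in `tasks/` and register it in `orchestration/task_registry.py` to reuse the scheduling machinery
- .env.example lists all settings; config/defaults.py gives the defaults and task windows.
- loaders/ods/generic.py supports defining the primary key/conflict columns; tasks/manual_ingest_task.py can quickly ingest the sample JSON into the corresponding ODS tables.
- Adding a new task: implement it in tasks/ and register it in orchestration/task_registry.py.
## Using ODS Tasks and Scheduling
- Registration: etl_admin.etl_task already has INIT_ODS_SCHEMA and MANUAL_INGEST enabled (store_id=2790685415443269); add other tasks as needed.
- Sample data directory: defaults to C:\dev\LLTQ\export\test-json-doc (can be overridden via INGEST_SOURCE_DIR in .env).
- One-step rebuild + ingest:
```bash
python -m cli.main --tasks INIT_ODS_SCHEMA,MANUAL_INGEST --pipeline-flow INGEST_ONLY --ingest-source "C:\dev\LLTQ\export\test-json-doc"
```
- Row-count comparison: etl_billiards/ods_row_report.json stores the sample JSON row counts alongside the ODS row counts, for regression checks.
- Backups: etl_billiards/backups/ holds the current versions of schema_ODS_doc.sql and tasks/manual_ingest_task.py.
- Recharge settlement ODS: recharge_settlements is flattened from settleList into main fields (recharge_order_id as the primary key, plus amount/status/snapshot columns); site_profile and payload keep the raw JSON. The recharge_settlements task writes directly to this table, and manual ingest automatically expands recharge_settlements.json.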
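The last note describes flattening `settleList` for recharge settlements. A standalone sketch of that expansion is shown here, mirroring the `settleList` branch of `_extract_records` in `tasks/manual_ingest_task.py`. The function name `flatten_settle_list` and the payload shape are assumptions made for illustration; real export files may differ.

```python
from __future__ import annotations

from typing import Any


def flatten_settle_list(data: dict[str, Any]) -> list[dict[str, Any]]:
    """Expand data["settleList"] into one flat record per settlement."""
    raw = data.get("settleList")
    items = [raw] if isinstance(raw, dict) else raw if isinstance(raw, list) else []
    site_profile = data.get("siteProfile")
    records: list[dict[str, Any]] = []
    for item in items:
        if not isinstance(item, dict):
            continue
        # Prefer the nested settleList object when present, else the item itself.
        inner = item.get("settleList")
        record = dict(inner) if isinstance(inner, dict) else dict(item)
        # Carry the outer siteProfile along without overwriting an existing one.
        if isinstance(site_profile, dict):
            record.setdefault("siteProfile", site_profile)
        records.append(record)
    return records


# Hypothetical payload, for illustration only.
payload = {
    "siteProfile": {"id": 1, "shop_name": "demo"},
    "settleList": [
        {"settleList": {"id": 101, "payAmount": 50}},
        {"id": 102, "payAmount": 80},
    ],
}
records = flatten_settle_list(payload)
```

Both wrapped and unwrapped items end up as flat records with the same keys, which is what lets a single column-driven INSERT handle them.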


@@ -1,53 +1,49 @@
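# Database settings (real database)
# -*- coding: utf-8 -*-
# File description: ETL environment variables (read by config/env_parser.py) for database connections, directories, and run parameters.
# Database connection string (config/env_parser.py -> db.dsn); required by all tasks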
PG_DSN=postgresql://local-Python:Neo-local-1991125@100.64.0.4:5432/LLZQ-test
# Database connection timeout in seconds (config/env_parser.py -> db.connect_timeout_sec)
PG_CONNECT_TIMEOUT=10
# To split the settings instead: PG_HOST=... PG_PORT=... PG_NAME=... PG_USER=... PG_PASSWORD=...
# API settings (fill in only when calling the real API)
API_BASE=https://api.example.com
API_TOKEN=your_token_here
# API_TIMEOUT=20
# API_PAGE_SIZE=200
# API_RETRY_MAX=3
# Application settings
# Store/tenant ID (config/env_parser.py -> app.store_id); used in task scheduling records
STORE_ID=2790685415443269
# TIMEZONE=Asia/Taipei
# SCHEMA_OLTP=billiards
# SCHEMA_ETL=etl_admin
# Time zone identifier (config/env_parser.py -> app.timezone)
TIMEZONE=Asia/Taipei
# Path settings
EXPORT_ROOT=C:\dev\LLTQ\export\JSON
# API base URL (config/env_parser.py -> api.base_url); used by FETCH-type tasks
API_BASE=https://api.example.com
# API auth token (config/env_parser.py -> api.token); used by FETCH-type tasks
API_TOKEN=your_token_here
# API request timeout in seconds (config/env_parser.py -> api.timeout_sec)
API_TIMEOUT=20
# API page size (config/env_parser.py -> api.page_size)
API_PAGE_SIZE=200
# Maximum API retry attempts (config/env_parser.py -> api.retries.max_attempts)
API_RETRY_MAX=3
# Log root directory (config/env_parser.py -> io.log_root); Init/task runs write logs here
LOG_ROOT=C:\dev\LLTQ\export\LOG
FETCH_ROOT=
INGEST_SOURCE_DIR=
WRITE_PRETTY_JSON=false
PGCLIENTENCODING=utf8
# JSON export root directory (config/env_parser.py -> io.export_root); FETCH output and INIT preparation
EXPORT_ROOT=C:\dev\LLTQ\export\JSON
# ETL settings
# Local output directory for FETCH mode (config/env_parser.py -> pipeline.fetch_root)
FETCH_ROOT=C:\dev\LLTQ\export\JSON
# Local JSON directory for ingestion (config/env_parser.py -> pipeline.ingest_source_dir); used by MANUAL_INGEST/INGEST_ONLY
INGEST_SOURCE_DIR=C:\dev\LLTQ\export\test-json-doc
# Pretty-printed JSON output switch (config/env_parser.py -> io.write_pretty_json)
WRITE_PRETTY_JSON=false
# Pipeline flow: FULL / FETCH_ONLY / INGEST_ONLY (config/env_parser.py -> pipeline.flow)
PIPELINE_FLOW=FULL
# Explicit task list, comma-separated, overriding the default (config/env_parser.py -> run.tasks)
# RUN_TASKS=INIT_ODS_SCHEMA,MANUAL_INGEST
# Window/compensation parameters (config/env_parser.py -> run.*)
OVERLAP_SECONDS=120
WINDOW_BUSY_MIN=30
WINDOW_IDLE_MIN=180
IDLE_START=04:00
IDLE_END=16:00
ALLOW_EMPTY_RESULT_ADVANCE=true
# Cleansing settings
LOG_UNKNOWN_FIELDS=true
HASH_ALGO=sha1
STRICT_NUMERIC=true
ROUND_MONEY_SCALE=2
# Test/offline mode (ONLINE is recommended for integration runs against the real database)
TEST_MODE=ONLINE
TEST_JSON_ARCHIVE_DIR=tests/source-data-doc
TEST_JSON_TEMP_DIR=/tmp/etl_billiards_json_tmp
# Test database
TEST_DB_DSN=postgresql://local-Python:Neo-local-1991125@100.64.0.4:5432/LLZQ-test
# ODS rebuild script configuration
JSON_DOC_DIR=C:\dev\LLTQ\export\test-json-doc
ODS_INCLUDE_FILES=
ODS_DROP_SCHEMA_FIRST=true


@@ -0,0 +1,321 @@
# -*- coding: utf-8 -*-
"""Manual sample-data ingest: batch-write to ODS using the primary/unique keys defined in schema_ODS_doc.sql."""
from __future__ import annotations
import json
import os
from datetime import datetime
from typing import Any, Iterable
from psycopg2.extras import Json
from .base_task import BaseTask
class ManualIngestTask(BaseTask):
"""Ingest local sample JSON into ODS, keeping table names, primary keys, and insert columns aligned with schema_ODS_doc.sql."""
def __init__(self, config, db_connection, api_client, logger):
"""Initialize the cache to avoid querying table structure repeatedly."""
super().__init__(config, db_connection, api_client, logger)
self._table_columns_cache: dict[str, list[tuple[str, str, str]]] = {}
# Filename keyword -> target table (matches the sample JSON names under C:\dev\LLTQ\export\temp\source-data-doc)
FILE_MAPPING: list[tuple[tuple[str, ...], str]] = [
(("会员档案", "member_profiles"), "billiards_ods.member_profiles"),
(("余额变更记录", "member_balance_changes"), "billiards_ods.member_balance_changes"),
(("储值卡列表", "member_stored_value_cards"), "billiards_ods.member_stored_value_cards"),
(("充值记录", "recharge_settlements"), "billiards_ods.recharge_settlements"),
(("结账记录", "settlement_records"), "billiards_ods.settlement_records"),
(("助教废除", "assistant_cancellation_records"), "billiards_ods.assistant_cancellation_records"),
(("助教账号", "assistant_accounts_master"), "billiards_ods.assistant_accounts_master"),
(("助教流水", "assistant_service_records"), "billiards_ods.assistant_service_records"),
(("台桌列表", "site_tables_master"), "billiards_ods.site_tables_master"),
(("台费打折", "table_fee_discount_records"), "billiards_ods.table_fee_discount_records"),
(("台费流水", "table_fee_transactions"), "billiards_ods.table_fee_transactions"),
(("库存变化记录1", "goods_stock_movements"), "billiards_ods.goods_stock_movements"),
(("库存变化记录2", "stock_goods_category_tree"), "billiards_ods.stock_goods_category_tree"),
(("库存汇总", "goods_stock_summary"), "billiards_ods.goods_stock_summary"),
(("支付记录", "payment_transactions"), "billiards_ods.payment_transactions"),
(("退款记录", "refund_transactions"), "billiards_ods.refund_transactions"),
(("平台验券记录", "platform_coupon_redemption_records"), "billiards_ods.platform_coupon_redemption_records"),
(("团购套餐流水", "group_buy_redemption_records"), "billiards_ods.group_buy_packages_ledger"),
(("团购套餐", "group_buy_packages"), "billiards_ods.group_buy_packages"),
(("小票详情", "settlement_ticket_details"), "billiards_ods.settlement_ticket_details"),
(("门店商品档案", "store_goods_master"), "billiards_ods.store_goods_master"),
(("商品档案", "tenant_goods_master"), "billiards_ods.tenant_goods_master"),
(("门店商品销售记录", "store_goods_sales_records"), "billiards_ods.store_goods_sales_records"),
]
# Table specs: pk = primary key column (None means no conflict update); json_cols = fields stored as single JSONB columns
TABLE_SPECS: dict[str, dict[str, Any]] = {
"billiards_ods.member_profiles": {"pk": "id"},
"billiards_ods.member_balance_changes": {"pk": "id"},
"billiards_ods.member_stored_value_cards": {"pk": "id"},
"billiards_ods.recharge_settlements": {"pk": None, "json_cols": ["settleList", "siteProfile"]},
"billiards_ods.settlement_records": {"pk": None, "json_cols": ["settleList", "siteProfile"]},
"billiards_ods.assistant_cancellation_records": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.assistant_accounts_master": {"pk": "id"},
"billiards_ods.assistant_service_records": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.site_tables_master": {"pk": "id"},
"billiards_ods.table_fee_discount_records": {"pk": "id", "json_cols": ["siteProfile", "tableProfile"]},
"billiards_ods.table_fee_transactions": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.goods_stock_movements": {"pk": "siteGoodsStockId"},
"billiards_ods.stock_goods_category_tree": {"pk": "id", "json_cols": ["categoryBoxes"]},
"billiards_ods.goods_stock_summary": {"pk": "siteGoodsId"},
"billiards_ods.payment_transactions": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.refund_transactions": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.platform_coupon_redemption_records": {"pk": "id"},
"billiards_ods.tenant_goods_master": {"pk": "id"},
"billiards_ods.group_buy_packages": {"pk": "id"},
"billiards_ods.group_buy_packages_ledger": {"pk": "id"},
"billiards_ods.settlement_ticket_details": {
"pk": "orderSettleId",
"json_cols": ["memberProfile", "orderItem", "tenantMemberCardLogs"],
},
"billiards_ods.store_goods_master": {"pk": "id"},
"billiards_ods.store_goods_sales_records": {"pk": "id"},
}
def get_task_code(self) -> str:
"""Return the task code."""
return "MANUAL_INGEST"
def execute(self, cursor_data: dict | None = None) -> dict:
"""Read JSON from the sample directory and batch-load it by table/primary key."""
data_dir = (
self.config.get("manual.data_dir")
or self.config.get("pipeline.ingest_source_dir")
or r"c:\dev\LLTQ\ETL\feiqiu-ETL\etl_billiards\tests\testdata_json"
)
if not os.path.exists(data_dir):
self.logger.error("Data directory not found: %s", data_dir)
return {"status": "error", "message": "Directory not found"}
counts = {"fetched": 0, "inserted": 0, "updated": 0, "skipped": 0, "errors": 0}
for filename in sorted(os.listdir(data_dir)):
if not filename.endswith(".json"):
continue
filepath = os.path.join(data_dir, filename)
try:
with open(filepath, "r", encoding="utf-8") as fh:
raw_entries = json.load(fh)
except Exception:
counts["errors"] += 1
self.logger.exception("Failed to read %s", filename)
continue
if not isinstance(raw_entries, list):
raw_entries = [raw_entries]
records = self._extract_records(raw_entries)
if not records:
counts["skipped"] += 1
continue
target_table = self._match_by_filename(filename)
if not target_table:
self.logger.warning("No mapping found for file: %s", filename)
counts["skipped"] += 1
continue
self.logger.info("Ingesting %s into %s", filename, target_table)
try:
inserted, updated = self._ingest_table(target_table, records, filename)
counts["inserted"] += inserted
counts["updated"] += updated
counts["fetched"] += len(records)
except Exception:
counts["errors"] += 1
self.logger.exception("Error processing %s", filename)
self.db.rollback()
continue
try:
self.db.commit()
except Exception:
self.db.rollback()
raise
return {"status": "SUCCESS", "counts": counts}
# ------------------------------------------------------------------ helpers
def _match_by_filename(self, filename: str) -> str | None:
"""Find the target table from filename keywords."""
for keywords, table in self.FILE_MAPPING:
if any(keyword and keyword in filename for keyword in keywords):
return table
return None
def _extract_records(self, raw_entries: Iterable[Any]) -> list[dict]:
"""Handle several JSON layouts and extract a list of records."""
records: list[dict] = []
for entry in raw_entries:
if isinstance(entry, dict):
# If the entry has data plus other keys (e.g. orderSettleId), keep the outer level so the primary key is not lost
preferred = entry
if "data" in entry and not any(k not in {"data", "code"} for k in entry.keys()):
preferred = entry["data"]
data = preferred
if isinstance(data, dict):
list_used = False
for v in data.values():
if isinstance(v, list) and v and isinstance(v[0], dict):
records.extend(v)
list_used = True
break
if list_used:
continue
if isinstance(data, list) and data and isinstance(data[0], dict):
records.extend(data)
elif isinstance(data, dict):
records.append(data)
elif isinstance(entry, list):
records.extend([item for item in entry if isinstance(item, dict)])
return records
def _get_table_columns(self, table: str) -> list[tuple[str, str, str]]:
"""Query information_schema for all columns of the target table (in ordinal order)."""
if table in self._table_columns_cache:
return self._table_columns_cache[table]
if "." in table:
schema, name = table.split(".", 1)
else:
schema, name = "public", table
sql = """
SELECT column_name, data_type, udt_name
FROM information_schema.columns
WHERE table_schema = %s AND table_name = %s
ORDER BY ordinal_position
"""
with self.db.conn.cursor() as cur:
cur.execute(sql, (schema, name))
cols = [(r[0], (r[1] or "").lower(), (r[2] or "").lower()) for r in cur.fetchall()]
self._table_columns_cache[table] = cols
return cols
def _ingest_table(self, table: str, records: list[dict], source_file: str) -> tuple[int, int]:
"""Build the INSERT/ON CONFLICT statement and execute it in batch."""
spec = self.TABLE_SPECS.get(table)
if not spec:
raise ValueError(f"No table spec for {table}")
pk_col = spec.get("pk")
json_cols = set(spec.get("json_cols", []))
json_cols_lower = {c.lower() for c in json_cols}
columns_info = self._get_table_columns(table)
columns = [c[0] for c in columns_info]
db_json_cols_lower = {
c[0].lower() for c in columns_info if c[1] in ("json", "jsonb") or c[2] in ("json", "jsonb")
}
pk_col_db = None
if pk_col:
pk_col_db = next((c for c in columns if c.lower() == pk_col.lower()), pk_col)
placeholders = ", ".join(["%s"] * len(columns))
col_list = ", ".join(f'"{c}"' for c in columns)
sql = f'INSERT INTO {table} ({col_list}) VALUES ({placeholders})'
if pk_col_db:
update_cols = [c for c in columns if c != pk_col_db]
set_clause = ", ".join(f'"{c}"=EXCLUDED."{c}"' for c in update_cols)
sql += f' ON CONFLICT ("{pk_col_db}") DO UPDATE SET {set_clause}'
sql += " RETURNING (xmax = 0) AS inserted"
params = []
now = datetime.now()
json_dump = lambda v: json.dumps(v, ensure_ascii=False) # noqa: E731
for rec in records:
merged_rec = rec if isinstance(rec, dict) else {}
# Unwrap data -> data.data level by level, filling in missing fields
data_part = merged_rec.get("data")
while isinstance(data_part, dict):
merged_rec = {**data_part, **merged_rec}
data_part = data_part.get("data")
pk_val = self._get_value_case_insensitive(merged_rec, pk_col) if pk_col else None
if pk_col and (pk_val is None or pk_val == ""):
continue
row_vals = []
for col_name, data_type, udt in columns_info:
col_lower = col_name.lower()
if col_lower == "payload":
row_vals.append(Json(rec, dumps=json_dump))
continue
if col_lower == "source_file":
row_vals.append(source_file)
continue
if col_lower == "fetched_at":
row_vals.append(merged_rec.get(col_name, now))
continue
value = self._normalize_scalar(self._get_value_case_insensitive(merged_rec, col_name))
if col_lower in json_cols_lower or col_lower in db_json_cols_lower:
row_vals.append(Json(value, dumps=json_dump) if value is not None else None)
continue
casted = self._cast_value(value, data_type)
row_vals.append(casted)
params.append(tuple(row_vals))
if not params:
return 0, 0
inserted = 0
updated = 0
with self.db.conn.cursor() as cur:
for row in params:
cur.execute(sql, row)
try:
flag = cur.fetchone()[0]
except Exception:
flag = None
if flag:
inserted += 1
else:
updated += 1
return inserted, updated
def _get_value_case_insensitive(self, record: dict, col: str):
"""Get a value case-insensitively, bridging lowercase information_schema column names and the original JSON casing."""
if record is None:
return None
if col is None:
return None
if col in record:
return record.get(col)
col_lower = col.lower()
for k, v in record.items():
if isinstance(k, str) and k.lower() == col_lower:
return v
return None
def _normalize_scalar(self, value):
"""Normalize empty strings to None to avoid type errors on numeric/time fields."""
if value == "" or value == "{}" or value == "[]":
return None
return value
def _cast_value(self, value, data_type: str):
"""Lightweight conversion by column type to avoid type mismatches."""
if value is None:
return None
dt = (data_type or "").lower()
if dt in ("integer", "bigint", "smallint"):
if isinstance(value, bool):
return int(value)
try:
return int(value)
except Exception:
return None
if dt in ("numeric", "double precision", "real", "decimal"):
if isinstance(value, bool):
return int(value)
try:
return float(value)
except Exception:
return None
if dt.startswith("timestamp") or dt in ("date", "time", "interval"):
# Accept only string/date values; numbers and the like are set to None
return value if isinstance(value, str) else None
return value
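`_match_by_filename` returns the first mapping whose keyword occurs anywhere in the filename, so entry order matters whenever one keyword is a substring of another. The sketch below illustrates that behavior with hypothetical keywords and table names (`store_goods`, `goods`, `ods.*`); it is a standalone function, not the task class itself.

```python
from __future__ import annotations

# Hypothetical mapping: the more specific keyword is listed first.
FILE_MAPPING = [
    (("store_goods",), "ods.store_goods"),
    (("goods",), "ods.goods"),
]


def match_by_filename(filename: str, mapping=FILE_MAPPING) -> str | None:
    """Return the table of the first entry with a keyword contained in filename."""
    for keywords, table in mapping:
        if any(keyword and keyword in filename for keyword in keywords):
            return table
    return None


specific_first = match_by_filename("store_goods_2025.json")
# Reversing the order lets the shorter keyword shadow the specific one.
generic_first = match_by_filename("store_goods_2025.json", list(reversed(FILE_MAPPING)))
```

With the specific keyword first the file resolves to `ods.store_goods`; with the order reversed the same file resolves to `ods.goods`, which is why substring-related keywords need to be ordered from most to least specific.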


@@ -0,0 +1,347 @@
# -*- coding: utf-8 -*-
"""Manual sample-data ingest: write to ODS following the table structure in schema_ODS_doc.sql."""
from __future__ import annotations
import json
import os
from datetime import datetime
from typing import Any, Iterable
from psycopg2.extras import Json
from .base_task import BaseTask
class ManualIngestTask(BaseTask):
"""Ingest local sample JSON into ODS, keeping table names, primary keys, and insert columns aligned with schema_ODS_doc.sql."""
FILE_MAPPING: list[tuple[tuple[str, ...], str]] = [
(("member_profiles",), "billiards_ods.member_profiles"),
(("member_balance_changes",), "billiards_ods.member_balance_changes"),
(("member_stored_value_cards",), "billiards_ods.member_stored_value_cards"),
(("recharge_settlements",), "billiards_ods.recharge_settlements"),
(("settlement_records",), "billiards_ods.settlement_records"),
(("assistant_cancellation_records",), "billiards_ods.assistant_cancellation_records"),
(("assistant_accounts_master",), "billiards_ods.assistant_accounts_master"),
(("assistant_service_records",), "billiards_ods.assistant_service_records"),
(("site_tables_master",), "billiards_ods.site_tables_master"),
(("table_fee_discount_records",), "billiards_ods.table_fee_discount_records"),
(("table_fee_transactions",), "billiards_ods.table_fee_transactions"),
(("goods_stock_movements",), "billiards_ods.goods_stock_movements"),
(("stock_goods_category_tree",), "billiards_ods.stock_goods_category_tree"),
(("goods_stock_summary",), "billiards_ods.goods_stock_summary"),
(("payment_transactions",), "billiards_ods.payment_transactions"),
(("refund_transactions",), "billiards_ods.refund_transactions"),
(("platform_coupon_redemption_records",), "billiards_ods.platform_coupon_redemption_records"),
(("group_buy_redemption_records",), "billiards_ods.group_buy_redemption_records"),
(("group_buy_packages",), "billiards_ods.group_buy_packages"),
(("settlement_ticket_details",), "billiards_ods.settlement_ticket_details"),
(("store_goods_master",), "billiards_ods.store_goods_master"),
(("tenant_goods_master",), "billiards_ods.tenant_goods_master"),
(("store_goods_sales_records",), "billiards_ods.store_goods_sales_records"),
]
TABLE_SPECS: dict[str, dict[str, Any]] = {
"billiards_ods.member_profiles": {"pk": "id"},
"billiards_ods.member_balance_changes": {"pk": "id"},
"billiards_ods.member_stored_value_cards": {"pk": "id"},
"billiards_ods.recharge_settlements": {"pk": "id"},
"billiards_ods.settlement_records": {"pk": "id"},
"billiards_ods.assistant_cancellation_records": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.assistant_accounts_master": {"pk": "id"},
"billiards_ods.assistant_service_records": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.site_tables_master": {"pk": "id"},
"billiards_ods.table_fee_discount_records": {"pk": "id", "json_cols": ["siteProfile", "tableProfile"]},
"billiards_ods.table_fee_transactions": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.goods_stock_movements": {"pk": "siteGoodsStockId"},
"billiards_ods.stock_goods_category_tree": {"pk": "id", "json_cols": ["categoryBoxes"]},
"billiards_ods.goods_stock_summary": {"pk": "siteGoodsId"},
"billiards_ods.payment_transactions": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.refund_transactions": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.platform_coupon_redemption_records": {"pk": "id"},
"billiards_ods.tenant_goods_master": {"pk": "id"},
"billiards_ods.group_buy_packages": {"pk": "id"},
"billiards_ods.group_buy_redemption_records": {"pk": "id"},
"billiards_ods.settlement_ticket_details": {
"pk": "orderSettleId",
"json_cols": ["memberProfile", "orderItem", "tenantMemberCardLogs"],
},
"billiards_ods.store_goods_master": {"pk": "id"},
"billiards_ods.store_goods_sales_records": {"pk": "id"},
}
def get_task_code(self) -> str:
"""Return the task code."""
return "MANUAL_INGEST"
def execute(self, cursor_data: dict | None = None) -> dict:
"""Read JSON from the directory and batch-load it according to the table specs."""
data_dir = (
self.config.get("manual.data_dir")
or self.config.get("pipeline.ingest_source_dir")
or r"c:\dev\LLTQ\ETL\feiqiu-ETL\etl_billiards\tests\testdata_json"
)
if not os.path.exists(data_dir):
self.logger.error("Data directory not found: %s", data_dir)
return {"status": "error", "message": "Directory not found"}
counts = {"fetched": 0, "inserted": 0, "updated": 0, "skipped": 0, "errors": 0}
for filename in sorted(os.listdir(data_dir)):
if not filename.endswith(".json"):
continue
filepath = os.path.join(data_dir, filename)
try:
with open(filepath, "r", encoding="utf-8") as fh:
raw_entries = json.load(fh)
except Exception:
counts["errors"] += 1
self.logger.exception("Failed to read %s", filename)
continue
entries = raw_entries if isinstance(raw_entries, list) else [raw_entries]
records = self._extract_records(entries)
if not records:
counts["skipped"] += 1
continue
target_table = self._match_by_filename(filename)
if not target_table:
self.logger.warning("No mapping found for file: %s", filename)
counts["skipped"] += 1
continue
self.logger.info("Ingesting %s into %s", filename, target_table)
try:
inserted, updated = self._ingest_table(target_table, records, filename)
counts["inserted"] += inserted
counts["updated"] += updated
counts["fetched"] += len(records)
except Exception:
counts["errors"] += 1
self.logger.exception("Error processing %s", filename)
self.db.rollback()
continue
try:
self.db.commit()
except Exception:
self.db.rollback()
raise
return {"status": "SUCCESS", "counts": counts}
def _match_by_filename(self, filename: str) -> str | None:
"""Match the target table by filename keyword."""
for keywords, table in self.FILE_MAPPING:
if any(keyword and keyword in filename for keyword in keywords):
return table
return None
def _extract_records(self, raw_entries: Iterable[Any]) -> list[dict]:
"""Handle multi-level data/list wrappers and extract a list of records."""
records: list[dict] = []
for entry in raw_entries:
if isinstance(entry, dict):
preferred = entry
if "data" in entry and not any(k not in {"data", "code"} for k in entry.keys()):
preferred = entry["data"]
data = preferred
if isinstance(data, dict):
# Special case for settleList (recharge and settlement records): expand the settleList nested under data.settleList, dropping the outer siteProfile
if "settleList" in data:
settle_list_val = data.get("settleList")
if isinstance(settle_list_val, dict):
settle_list_iter = [settle_list_val]
elif isinstance(settle_list_val, list):
settle_list_iter = settle_list_val
else:
settle_list_iter = []
handled = False
for item in settle_list_iter or []:
if not isinstance(item, dict):
continue
inner = item.get("settleList")
merged = dict(inner) if isinstance(inner, dict) else dict(item)
# Keep siteProfile for later field back-filling, but do not persist it
site_profile = data.get("siteProfile")
if isinstance(site_profile, dict):
merged.setdefault("siteProfile", site_profile)
records.append(merged)
handled = True
if handled:
continue
list_used = False
for v in data.values():
if isinstance(v, list) and v and isinstance(v[0], dict):
records.extend(v)
list_used = True
break
if list_used:
continue
if isinstance(data, list) and data and isinstance(data[0], dict):
records.extend(data)
elif isinstance(data, dict):
records.append(data)
elif isinstance(entry, list):
records.extend([item for item in entry if isinstance(item, dict)])
return records
def _get_table_columns(self, table: str) -> list[tuple[str, str, str]]:
"""Query information_schema for the target table's column info."""
cache = getattr(self, "_table_columns_cache", {})
if table in cache:
return cache[table]
if "." in table:
schema, name = table.split(".", 1)
else:
schema, name = "public", table
sql = """
SELECT column_name, data_type, udt_name
FROM information_schema.columns
WHERE table_schema = %s AND table_name = %s
ORDER BY ordinal_position
"""
with self.db.conn.cursor() as cur:
cur.execute(sql, (schema, name))
cols = [(r[0], (r[1] or "").lower(), (r[2] or "").lower()) for r in cur.fetchall()]
cache[table] = cols
self._table_columns_cache = cache
return cols
def _ingest_table(self, table: str, records: list[dict], source_file: str) -> tuple[int, int]:
"""Build the INSERT/ON CONFLICT statement and execute it in batch."""
spec = self.TABLE_SPECS.get(table)
if not spec:
raise ValueError(f"No table spec for {table}")
pk_col = spec.get("pk")
json_cols = set(spec.get("json_cols", []))
json_cols_lower = {c.lower() for c in json_cols}
columns_info = self._get_table_columns(table)
columns = [c[0] for c in columns_info]
db_json_cols_lower = {
c[0].lower() for c in columns_info if c[1] in ("json", "jsonb") or c[2] in ("json", "jsonb")
}
pk_col_db = None
if pk_col:
pk_col_db = next((c for c in columns if c.lower() == pk_col.lower()), pk_col)
placeholders = ", ".join(["%s"] * len(columns))
col_list = ", ".join(f'"{c}"' for c in columns)
sql = f'INSERT INTO {table} ({col_list}) VALUES ({placeholders})'
if pk_col_db:
update_cols = [c for c in columns if c != pk_col_db]
set_clause = ", ".join(f'"{c}"=EXCLUDED."{c}"' for c in update_cols)
sql += f' ON CONFLICT ("{pk_col_db}") DO UPDATE SET {set_clause}'
sql += " RETURNING (xmax = 0) AS inserted"
params = []
now = datetime.now()
json_dump = lambda v: json.dumps(v, ensure_ascii=False) # noqa: E731
for rec in records:
merged_rec = rec if isinstance(rec, dict) else {}
data_part = merged_rec.get("data")
while isinstance(data_part, dict):
merged_rec = {**data_part, **merged_rec}
data_part = data_part.get("data")
# For recharge/settlement tables, back-fill store info from siteProfile
if table in {
"billiards_ods.recharge_settlements",
"billiards_ods.settlement_records",
}:
site_profile = merged_rec.get("siteProfile") or merged_rec.get("site_profile")
if isinstance(site_profile, dict):
merged_rec.setdefault("tenantid", site_profile.get("tenant_id") or site_profile.get("tenantId"))
merged_rec.setdefault("siteid", site_profile.get("id") or site_profile.get("siteId"))
merged_rec.setdefault("sitename", site_profile.get("shop_name") or site_profile.get("siteName"))
pk_val = self._get_value_case_insensitive(merged_rec, pk_col) if pk_col else None
if pk_col and (pk_val is None or pk_val == ""):
continue
row_vals = []
for col_name, data_type, udt in columns_info:
col_lower = col_name.lower()
if col_lower == "payload":
row_vals.append(Json(rec, dumps=json_dump))
continue
if col_lower == "source_file":
row_vals.append(source_file)
continue
if col_lower == "fetched_at":
row_vals.append(merged_rec.get(col_name, now))
continue
value = self._normalize_scalar(self._get_value_case_insensitive(merged_rec, col_name))
if col_lower in json_cols_lower or col_lower in db_json_cols_lower:
row_vals.append(Json(value, dumps=json_dump) if value is not None else None)
continue
casted = self._cast_value(value, data_type)
row_vals.append(casted)
params.append(tuple(row_vals))
if not params:
return 0, 0
inserted = 0
updated = 0
with self.db.conn.cursor() as cur:
for row in params:
cur.execute(sql, row)
flag = cur.fetchone()[0]
if flag:
inserted += 1
else:
updated += 1
return inserted, updated
@staticmethod
def _get_value_case_insensitive(record: dict, col: str | None):
"""Get a value case-insensitively, bridging information_schema names and the original JSON fields."""
if record is None or col is None:
return None
if col in record:
return record.get(col)
col_lower = col.lower()
for k, v in record.items():
if isinstance(k, str) and k.lower() == col_lower:
return v
return None
@staticmethod
def _normalize_scalar(value):
"""Normalize empty strings/empty JSON to None to avoid type conversion errors."""
if value == "" or value == "{}" or value == "[]":
return None
return value
@staticmethod
def _cast_value(value, data_type: str):
"""Simple conversion by column type so batch inserts stay compatible."""
if value is None:
return None
dt = (data_type or "").lower()
if dt in ("integer", "bigint", "smallint"):
if isinstance(value, bool):
return int(value)
try:
return int(value)
except Exception:
return None
if dt in ("numeric", "double precision", "real", "decimal"):
if isinstance(value, bool):
return int(value)
try:
return float(value)
except Exception:
return None
if dt.startswith("timestamp") or dt in ("date", "time", "interval"):
return value if isinstance(value, str) else None
return value
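The statement assembled in `_ingest_table` can be easier to follow in isolation. The sketch below rebuilds the same SQL text from a column list; `build_upsert_sql` is a hypothetical helper and the two-column table is illustrative. Reading `xmax = 0` as "this row was inserted rather than updated" is a commonly used PostgreSQL idiom based on a system column, so it is best treated as a heuristic rather than a documented guarantee.

```python
from __future__ import annotations


def build_upsert_sql(table: str, columns: list[str], pk: str | None) -> str:
    """Build an INSERT ... ON CONFLICT DO UPDATE statement with %s placeholders."""
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})"
    if pk:
        # Every non-key column is overwritten with the incoming (EXCLUDED) value.
        set_clause = ", ".join(f'"{c}"=EXCLUDED."{c}"' for c in columns if c != pk)
        sql += f' ON CONFLICT ("{pk}") DO UPDATE SET {set_clause}'
    # Heuristic: xmax is 0 for a freshly inserted row, non-zero after a conflict update.
    sql += " RETURNING (xmax = 0) AS inserted"
    return sql


sql = build_upsert_sql("billiards_ods.member_profiles", ["id", "payload"], "id")
print(sql)
```

Column and table names are interpolated directly into the SQL text, which is only safe because they come from `information_schema` and a fixed spec; values still travel separately through the `%s` placeholders.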


View File

@@ -2,22 +2,117 @@
CREATE SCHEMA IF NOT EXISTS billiards_dwd;
SET search_path TO billiards_dwd;
-- Apply uniform SCD2 defaults, Chinese column comments, and uniqueness control (business key + non-overlapping time ranges).
CREATE EXTENSION IF NOT EXISTS btree_gist;
DO $$
DECLARE
rec RECORD;
BEGIN
-- Standardize the SCD2 defaults and comments so that later manual edits cannot miss any table.
FOR rec IN
SELECT table_name
FROM information_schema.columns
WHERE table_schema = 'billiards_dwd'
AND column_name = 'scd2_start_time'
LOOP
EXECUTE format('ALTER TABLE billiards_dwd.%I ALTER COLUMN scd2_start_time SET DEFAULT now()', rec.table_name);
EXECUTE format('ALTER TABLE billiards_dwd.%I ALTER COLUMN scd2_end_time SET DEFAULT ''9999-12-31''', rec.table_name);
EXECUTE format('ALTER TABLE billiards_dwd.%I ALTER COLUMN scd2_is_current SET DEFAULT 1', rec.table_name);
EXECUTE format('ALTER TABLE billiards_dwd.%I ALTER COLUMN scd2_version SET DEFAULT 1', rec.table_name);
EXECUTE format('COMMENT ON COLUMN billiards_dwd.%I.scd2_start_time IS ''SCD2 开始时间(版本生效起点)''', rec.table_name);
EXECUTE format('COMMENT ON COLUMN billiards_dwd.%I.scd2_end_time IS ''SCD2 结束时间(默认 9999-12-31表示当前版本仍有效''', rec.table_name);
EXECUTE format('COMMENT ON COLUMN billiards_dwd.%I.scd2_is_current IS ''SCD2 当前版本标记1=当前版本0=历史版本''', rec.table_name);
EXECUTE format('COMMENT ON COLUMN billiards_dwd.%I.scd2_version IS ''SCD2 版本号,自增,配合时间段避免重叠''', rec.table_name);
END LOOP;
-- Constraints: time ranges for the same business key must not overlap, and only one current version may exist.
FOR rec IN (
SELECT tc.table_name,
string_agg(format('%I WITH =', kcu.column_name), ', ' ORDER BY kcu.ordinal_position) AS pk_eq_expr,
string_agg(format('%I', kcu.column_name), ', ' ORDER BY kcu.ordinal_position) AS pk_cols
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
ON tc.table_schema = kcu.table_schema
AND tc.table_name = kcu.table_name
AND tc.constraint_name = kcu.constraint_name
WHERE tc.table_schema = 'billiards_dwd'
AND tc.constraint_type = 'PRIMARY KEY'
AND EXISTS (
SELECT 1 FROM information_schema.columns c
WHERE c.table_schema = 'billiards_dwd'
AND c.table_name = tc.table_name
AND c.column_name = 'scd2_start_time'
)
GROUP BY tc.table_name
)
LOOP
IF NOT EXISTS (
SELECT 1 FROM pg_constraint
WHERE conname = format('%s_scd2_no_overlap', rec.table_name)
AND conrelid = format('billiards_dwd.%s', rec.table_name)::regclass
) THEN
EXECUTE format(
'ALTER TABLE billiards_dwd.%I ADD CONSTRAINT %I EXCLUDE USING gist (%s, tstzrange(scd2_start_time, scd2_end_time) WITH &&) WHERE (scd2_is_current = 1);',
rec.table_name,
rec.table_name || '_scd2_no_overlap',
rec.pk_eq_expr
);
END IF;
IF to_regclass(format('billiards_dwd.%s_scd2_current_unique_idx', rec.table_name)) IS NULL THEN
EXECUTE format(
'CREATE UNIQUE INDEX %I ON billiards_dwd.%I (%s) WHERE (scd2_is_current = 1);',
rec.table_name || '_scd2_current_unique_idx',
rec.table_name,
rec.pk_cols
);
END IF;
END LOOP;
END
$$;
-- SCD2 conventions (used by DIM tables):
-- SCD2_start_time TIMESTAMPTZ DEFAULT now() -- version start time
-- SCD2_end_time TIMESTAMPTZ DEFAULT '9999-12-31' -- version end time
-- SCD2_is_current INT DEFAULT 1 -- current-version flag (1 = current, 0 = historical)
-- SCD2_version INT DEFAULT 1 -- version number, incrementing
-- dim_site
CREATE TABLE IF NOT EXISTS dim_site (
site_id BIGINT,
org_id BIGINT,
shop_name TEXT,
business_tel TEXT,
full_address TEXT,
tenant_id BIGINT,
shop_name TEXT,
site_label TEXT,
full_address TEXT,
address TEXT,
longitude NUMERIC(10,6),
latitude NUMERIC(10,6),
tenant_site_region_id BIGINT,
business_tel TEXT,
site_type INTEGER,
shop_status INTEGER,
SCD2_start_time TIMESTAMPTZ DEFAULT now(),
SCD2_end_time TIMESTAMPTZ DEFAULT '9999-12-31',
SCD2_is_current INT DEFAULT 1,
SCD2_version INT DEFAULT 1,
PRIMARY KEY (site_id)
);
COMMENT ON COLUMN dim_site.site_id IS '门店主键 ID唯一标识一家门店。与所有事实表中的 site_id 对应。 | 来源: siteProfile.id | 角色: 主键';
COMMENT ON COLUMN dim_site.org_id IS '上级组织 ID用于区域组织划分。 | 来源: siteProfile.org_id | 角色: 外键';
COMMENT ON COLUMN dim_site.shop_name IS '门店名称,展示用。 | 来源: siteProfile.shop_name';
COMMENT ON COLUMN dim_site.business_tel IS '门店电话。 | 来源: siteProfile.business_tel';
COMMENT ON COLUMN dim_site.full_address IS '门店完整地址。 | 来源: siteProfile.full_address';
COMMENT ON COLUMN dim_site.tenant_id IS '租户 ID。与其它表 tenant_id 对应。 | 来源: siteProfile.tenant_id | 角色: 外键';
COMMENT ON COLUMN dim_site.site_label IS '来源: siteProfile.site_label';
COMMENT ON COLUMN dim_site.address IS '来源: siteProfile.address';
COMMENT ON COLUMN dim_site.longitude IS '来源: siteProfile.longitude';
COMMENT ON COLUMN dim_site.latitude IS '来源: siteProfile.latitude';
COMMENT ON COLUMN dim_site.tenant_site_region_id IS '来源: siteProfile.tenant_site_region_id';
COMMENT ON COLUMN dim_site.site_type IS '来源: siteProfile.site_type';
COMMENT ON COLUMN dim_site.shop_status IS '来源: siteProfile.shop_status';
-- dim_site_Ex
CREATE TABLE IF NOT EXISTS dim_site_Ex (
@@ -42,6 +137,10 @@ CREATE TABLE IF NOT EXISTS dim_site_Ex (
shop_status INTEGER,
create_time TIMESTAMPTZ,
update_time TIMESTAMPTZ,
SCD2_start_time TIMESTAMPTZ DEFAULT now(),
SCD2_end_time TIMESTAMPTZ DEFAULT '9999-12-31',
SCD2_is_current INT DEFAULT 1,
SCD2_version INT DEFAULT 1,
PRIMARY KEY (site_id)
);
COMMENT ON COLUMN dim_site_Ex.site_id IS '门店主键 ID唯一标识一家门店。与所有事实表中的 site_id 对应。 | 来源: siteProfile.id | 角色: 主键';
@@ -69,17 +168,19 @@ COMMENT ON COLUMN dim_site_Ex.update_time IS '门店最近更新时间。 | 来
-- dim_table
CREATE TABLE IF NOT EXISTS dim_table (
table_id BIGINT,
tenant_id BIGINT,
site_id BIGINT,
table_name TEXT,
site_table_area_id BIGINT,
site_table_area_name TEXT,
tenant_table_area_id BIGINT,
table_price NUMERIC(18,2),
SCD2_start_time TIMESTAMPTZ DEFAULT now(),
SCD2_end_time TIMESTAMPTZ DEFAULT '9999-12-31',
SCD2_is_current INT DEFAULT 1,
SCD2_version INT DEFAULT 1,
PRIMARY KEY (table_id)
);
COMMENT ON COLUMN dim_table.table_id IS '台桌主键,唯一标识一张台或包厢。 | 来源: id | 角色: 主键';
COMMENT ON COLUMN dim_table.tenant_id IS '租户 ID。 | 来源: tenantId | 角色: 外键';
COMMENT ON COLUMN dim_table.site_id IS '门店 ID。 | 来源: siteId | 角色: 外键';
COMMENT ON COLUMN dim_table.table_name IS '台桌名称/编号,如 A17、888。 | 来源: tableName';
COMMENT ON COLUMN dim_table.site_table_area_id IS '门店区 ID用于区分 A区/B区/补时区等。 | 来源: siteTableAreaId | 角色: 外键';
@@ -95,8 +196,10 @@ CREATE TABLE IF NOT EXISTS dim_table_Ex (
table_cloth_use_time INTEGER,
table_cloth_use_cycle INTEGER,
table_status INTEGER,
last_maintenance_time TIMESTAMPTZ,
remark TEXT,
SCD2_start_time TIMESTAMPTZ DEFAULT now(),
SCD2_end_time TIMESTAMPTZ DEFAULT '9999-12-31',
SCD2_is_current INT DEFAULT 1,
SCD2_version INT DEFAULT 1,
PRIMARY KEY (table_id)
);
COMMENT ON COLUMN dim_table_Ex.table_id IS '台桌主键,唯一标识一张台或包厢。 | 来源: id | 角色: 主键';
@@ -105,8 +208,6 @@ COMMENT ON COLUMN dim_table_Ex.is_online_reservation IS '是否可线上预约
COMMENT ON COLUMN dim_table_Ex.table_cloth_use_time IS '已使用台呢时长(秒)。 | 来源: tableClothUseTime';
COMMENT ON COLUMN dim_table_Ex.table_cloth_use_cycle IS '台呢更换周期阈值(秒)。 | 来源: tableClothUseCycle';
COMMENT ON COLUMN dim_table_Ex.table_status IS '当前台桌状态1=空闲2=使用中3=暂停中4=锁定。 | 来源: tableStatus';
COMMENT ON COLUMN dim_table_Ex.last_maintenance_time IS '最近保养时间(未在 JSON 中出现)。 | 来源: lastMaintenanceTime';
COMMENT ON COLUMN dim_table_Ex.remark IS '备注信息。 | 来源: remark';
-- dim_assistant
CREATE TABLE IF NOT EXISTS dim_assistant (
@@ -125,6 +226,10 @@ CREATE TABLE IF NOT EXISTS dim_assistant (
resign_time TIMESTAMPTZ,
leave_status INTEGER,
assistant_status INTEGER,
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (assistant_id)
);
COMMENT ON COLUMN dim_assistant.assistant_id IS '助教账号 ID关联助教服务流水表。 | 来源: id | 角色: 主键';
@@ -189,6 +294,10 @@ CREATE TABLE IF NOT EXISTS dim_assistant_Ex (
light_status INTEGER,
is_team_leader INTEGER,
serial_number BIGINT,
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (assistant_id)
);
COMMENT ON COLUMN dim_assistant_Ex.assistant_id IS '助教账号 ID关联助教服务流水表。 | 来源: id | 角色: 主键';
@@ -248,6 +357,10 @@ CREATE TABLE IF NOT EXISTS dim_member (
member_card_grade_name TEXT,
create_time TIMESTAMPTZ,
update_time TIMESTAMPTZ,
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (member_id)
);
COMMENT ON COLUMN dim_member.member_id IS '租户内会员主键。 | 来源: id | 角色: 主键';
@@ -259,7 +372,6 @@ COMMENT ON COLUMN dim_member.nickname IS '昵称(未必是真实姓名)。 |
COMMENT ON COLUMN dim_member.member_card_grade_code IS '会员等级代码1=金卡2=银卡3=钻石卡4=黑卡?(按照 MD 文档枚举)。 | 来源: member_card_grade_code';
COMMENT ON COLUMN dim_member.member_card_grade_name IS '等级名称,中文描述。 | 来源: member_card_grade_name';
COMMENT ON COLUMN dim_member.create_time IS '会员档案创建时间。 | 来源: create_time';
COMMENT ON COLUMN dim_member.update_time IS '最近更新时间。 | 来源: update_time';
-- dim_member_Ex
CREATE TABLE IF NOT EXISTS dim_member_Ex (
@@ -270,6 +382,10 @@ CREATE TABLE IF NOT EXISTS dim_member_Ex (
growth_value NUMERIC(18,2),
user_status INTEGER,
status INTEGER,
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (member_id)
);
COMMENT ON COLUMN dim_member_Ex.member_id IS '租户内会员主键。 | 来源: id | 角色: 主键';
@@ -299,6 +415,10 @@ CREATE TABLE IF NOT EXISTS dim_member_card_account (
last_consume_time TIMESTAMPTZ,
status INTEGER,
is_delete INTEGER,
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (member_card_id)
);
COMMENT ON COLUMN dim_member_card_account.member_card_id IS '会员卡账户主键,唯一标识一张具体卡。 | 来源: id | 角色: 主键';
@@ -373,6 +493,10 @@ CREATE TABLE IF NOT EXISTS dim_member_card_account_Ex (
goodsCategoryId TEXT,
pdAssisnatLevel TEXT,
cxAssisnatLevel TEXT,
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (member_card_id)
);
COMMENT ON COLUMN dim_member_card_account_Ex.member_card_id IS '会员卡账户主键,唯一标识一张具体卡。 | 来源: id | 角色: 主键';
@@ -444,6 +568,10 @@ CREATE TABLE IF NOT EXISTS dim_tenant_goods (
create_time TIMESTAMPTZ,
update_time TIMESTAMPTZ,
is_delete INTEGER,
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (tenant_goods_id)
);
COMMENT ON COLUMN dim_tenant_goods.tenant_goods_id IS '租户级商品档案主键 ID唯一标识一条商品档案。所有业务事实表销售、库存等中引用租户级商品时应指向此字段。 | 来源: id | 角色: 主键';
@@ -481,6 +609,10 @@ CREATE TABLE IF NOT EXISTS dim_tenant_goods_Ex (
common_sale_royalty INTEGER,
point_sale_royalty INTEGER,
out_goods_id BIGINT,
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (tenant_goods_id)
);
COMMENT ON COLUMN dim_tenant_goods_Ex.tenant_goods_id IS '租户级商品档案主键 ID唯一标识一条商品档案。所有业务事实表销售、库存等中引用租户级商品时应指向此字段。 | 来源: id | 角色: 主键';
@@ -524,6 +656,10 @@ CREATE TABLE IF NOT EXISTS dim_store_goods (
enable_status INTEGER,
send_state INTEGER,
is_delete INTEGER,
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (site_goods_id)
);
COMMENT ON COLUMN dim_store_goods.site_goods_id IS '门店级商品 ID本表主键其它业务表中的 site_goods_id 与此对应,用于库存、销售等关联。 | 来源: id | 角色: 主键';
@@ -575,6 +711,10 @@ CREATE TABLE IF NOT EXISTS dim_store_goods_Ex (
option_required INTEGER,
remark TEXT,
sort_order INTEGER,
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (site_goods_id)
);
COMMENT ON COLUMN dim_store_goods_Ex.site_goods_id IS '门店级商品 ID本表主键其它业务表中的 site_goods_id 与此对应,用于库存、销售等关联。 | 来源: id | 角色: 主键';
@@ -618,6 +758,10 @@ CREATE TABLE IF NOT EXISTS dim_goods_category (
open_salesman INTEGER,
sort_order INTEGER,
is_warehousing INTEGER,
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (category_id)
);
COMMENT ON COLUMN dim_goods_category.category_id IS '分类节点主键。来自分类树节点的 id在整个商品分类维度内唯一。用于在事实表中作为商品分类外键引用。 | 来源: id | 角色: 主键';
@@ -651,6 +795,10 @@ CREATE TABLE IF NOT EXISTS dim_groupbuy_package (
create_time TIMESTAMPTZ,
tenant_table_area_id_list VARCHAR(512),
card_type_ids VARCHAR(255),
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (groupbuy_package_id)
);
COMMENT ON COLUMN dim_groupbuy_package.groupbuy_package_id IS '门店侧团购套餐主键。每条记录一个套餐定义,供团购券核销记录指向。平台验券记录中的 group_package_id 通常指向这里。 | 来源: id | 角色: 主键';
@@ -692,6 +840,10 @@ CREATE TABLE IF NOT EXISTS dim_groupbuy_package_Ex (
effective_status INTEGER,
max_selectable_categories INTEGER,
creator_name VARCHAR(100),
SCD2_start_time TIMESTAMPTZ,
SCD2_end_time TIMESTAMPTZ,
SCD2_is_current INT,
SCD2_version INT,
PRIMARY KEY (groupbuy_package_id)
);
COMMENT ON COLUMN dim_groupbuy_package_Ex.groupbuy_package_id IS '门店侧团购套餐主键。每条记录一个套餐定义,供团购券核销记录指向。平台验券记录中的 group_package_id 通常指向这里。 | 来源: id | 角色: 主键';

View File

@@ -0,0 +1,105 @@
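-- File description: DDL for the etl_admin scheduling metadata (kept in a standalone file so the initialization task can run it on its own).
-- Contains the task registry, cursor, and run-record tables; column comments are written in Chinese.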
CREATE SCHEMA IF NOT EXISTS etl_admin;
CREATE TABLE IF NOT EXISTS etl_admin.etl_task (
task_id BIGSERIAL PRIMARY KEY,
task_code TEXT NOT NULL,
store_id BIGINT NOT NULL,
enabled BOOLEAN DEFAULT TRUE,
cursor_field TEXT,
window_minutes_default INT DEFAULT 30,
overlap_seconds INT DEFAULT 120,
page_size INT DEFAULT 200,
retry_max INT DEFAULT 3,
params JSONB DEFAULT '{}'::jsonb,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now(),
UNIQUE (task_code, store_id)
);
COMMENT ON TABLE etl_admin.etl_task IS '任务注册表:调度依据的任务清单(与 task_registry 中的任务码对应)。';
COMMENT ON COLUMN etl_admin.etl_task.task_code IS '任务编码,需与代码中的任务码一致。';
COMMENT ON COLUMN etl_admin.etl_task.store_id IS '门店/租户粒度,区分多门店执行。';
COMMENT ON COLUMN etl_admin.etl_task.enabled IS '是否启用此任务。';
COMMENT ON COLUMN etl_admin.etl_task.cursor_field IS '增量游标字段名(可选)。';
COMMENT ON COLUMN etl_admin.etl_task.window_minutes_default IS '默认时间窗口(分钟)。';
COMMENT ON COLUMN etl_admin.etl_task.overlap_seconds IS '窗口重叠秒数,用于防止遗漏。';
COMMENT ON COLUMN etl_admin.etl_task.page_size IS '默认分页大小。';
COMMENT ON COLUMN etl_admin.etl_task.retry_max IS 'API重试次数上限。';
COMMENT ON COLUMN etl_admin.etl_task.params IS '任务级自定义参数 JSON。';
COMMENT ON COLUMN etl_admin.etl_task.created_at IS '创建时间。';
COMMENT ON COLUMN etl_admin.etl_task.updated_at IS '更新时间。';
CREATE TABLE IF NOT EXISTS etl_admin.etl_cursor (
cursor_id BIGSERIAL PRIMARY KEY,
task_id BIGINT NOT NULL REFERENCES etl_admin.etl_task(task_id) ON DELETE CASCADE,
store_id BIGINT NOT NULL,
last_start TIMESTAMPTZ,
last_end TIMESTAMPTZ,
last_id BIGINT,
last_run_id BIGINT,
extra JSONB DEFAULT '{}'::jsonb,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now(),
UNIQUE (task_id, store_id)
);
COMMENT ON TABLE etl_admin.etl_cursor IS '任务游标表:记录每个任务/门店的增量窗口及最后 run。';
COMMENT ON COLUMN etl_admin.etl_cursor.task_id IS '关联 etl_task.task_id。';
COMMENT ON COLUMN etl_admin.etl_cursor.store_id IS '门店/租户粒度。';
COMMENT ON COLUMN etl_admin.etl_cursor.last_start IS '上次窗口开始时间(含重叠偏移)。';
COMMENT ON COLUMN etl_admin.etl_cursor.last_end IS '上次窗口结束时间。';
COMMENT ON COLUMN etl_admin.etl_cursor.last_id IS '上次处理的最大主键/游标值(可选)。';
COMMENT ON COLUMN etl_admin.etl_cursor.last_run_id IS '上次运行ID对应 etl_run.run_id。';
COMMENT ON COLUMN etl_admin.etl_cursor.extra IS '附加游标信息 JSON。';
COMMENT ON COLUMN etl_admin.etl_cursor.created_at IS '创建时间。';
COMMENT ON COLUMN etl_admin.etl_cursor.updated_at IS '更新时间。';
CREATE TABLE IF NOT EXISTS etl_admin.etl_run (
run_id BIGSERIAL PRIMARY KEY,
run_uuid TEXT NOT NULL,
task_id BIGINT NOT NULL REFERENCES etl_admin.etl_task(task_id) ON DELETE CASCADE,
store_id BIGINT NOT NULL,
status TEXT NOT NULL,
started_at TIMESTAMPTZ DEFAULT now(),
ended_at TIMESTAMPTZ,
window_start TIMESTAMPTZ,
window_end TIMESTAMPTZ,
window_minutes INT,
overlap_seconds INT,
fetched_count INT DEFAULT 0,
loaded_count INT DEFAULT 0,
updated_count INT DEFAULT 0,
skipped_count INT DEFAULT 0,
error_count INT DEFAULT 0,
unknown_fields INT DEFAULT 0,
export_dir TEXT,
log_path TEXT,
request_params JSONB DEFAULT '{}'::jsonb,
manifest JSONB DEFAULT '{}'::jsonb,
error_message TEXT,
extra JSONB DEFAULT '{}'::jsonb
);
COMMENT ON TABLE etl_admin.etl_run IS '运行记录表:记录每次任务执行的窗口、状态、计数与日志路径。';
COMMENT ON COLUMN etl_admin.etl_run.run_uuid IS '本次调度的唯一标识。';
COMMENT ON COLUMN etl_admin.etl_run.task_id IS '关联 etl_task.task_id。';
COMMENT ON COLUMN etl_admin.etl_run.store_id IS '门店/租户粒度。';
COMMENT ON COLUMN etl_admin.etl_run.status IS '运行状态SUCC/FAIL/PARTIAL 等)。';
COMMENT ON COLUMN etl_admin.etl_run.started_at IS '开始时间。';
COMMENT ON COLUMN etl_admin.etl_run.ended_at IS '结束时间。';
COMMENT ON COLUMN etl_admin.etl_run.window_start IS '本次窗口开始时间。';
COMMENT ON COLUMN etl_admin.etl_run.window_end IS '本次窗口结束时间。';
COMMENT ON COLUMN etl_admin.etl_run.window_minutes IS '窗口跨度(分钟)。';
COMMENT ON COLUMN etl_admin.etl_run.overlap_seconds IS '窗口重叠秒数。';
COMMENT ON COLUMN etl_admin.etl_run.fetched_count IS '抓取/读取的记录数。';
COMMENT ON COLUMN etl_admin.etl_run.loaded_count IS '插入的记录数。';
COMMENT ON COLUMN etl_admin.etl_run.updated_count IS '更新的记录数。';
COMMENT ON COLUMN etl_admin.etl_run.skipped_count IS '跳过的记录数。';
COMMENT ON COLUMN etl_admin.etl_run.error_count IS '错误记录数。';
COMMENT ON COLUMN etl_admin.etl_run.unknown_fields IS '未知字段计数(清洗阶段)。';
COMMENT ON COLUMN etl_admin.etl_run.export_dir IS '抓取/导出目录。';
COMMENT ON COLUMN etl_admin.etl_run.log_path IS '日志路径。';
COMMENT ON COLUMN etl_admin.etl_run.request_params IS '请求参数 JSON。';
COMMENT ON COLUMN etl_admin.etl_run.manifest IS '运行产出清单/统计 JSON。';
COMMENT ON COLUMN etl_admin.etl_run.error_message IS '错误信息(若失败)。';
COMMENT ON COLUMN etl_admin.etl_run.extra IS '附加字段,保留扩展。';


View File

@@ -1,34 +1,35 @@
-- Register the new ODS tasks in etl_admin.etl_task (replace store_id as needed).
-- Usage (example):
-- psql "$PG_DSN" -f etl_billiards/database/seed_ods_tasks.sql
-- Alternatively, run the contents of this file inside psql.
WITH target_store AS (
SELECT 2790685415443269::bigint AS store_id -- TODO: replace with the actual store_id
),
task_codes AS (
SELECT unnest(ARRAY[
'ODS_ASSISTANT_ACCOUNTS',
'ODS_ASSISTANT_LEDGER',
'ODS_ASSISTANT_ABOLISH',
'ODS_INVENTORY_CHANGE',
'assistant_accounts_masterS',
'assistant_service_records',
'assistant_cancellation_records',
'goods_stock_movements',
'ODS_INVENTORY_STOCK',
'ODS_PACKAGE',
'ODS_GROUP_BUY_REDEMPTION',
'ODS_MEMBER',
'ODS_MEMBER_BALANCE',
'ODS_MEMBER_CARD',
'member_stored_value_cards',
'ODS_PAYMENT',
'ODS_REFUND',
'ODS_COUPON_VERIFY',
'ODS_RECHARGE_SETTLE',
'platform_coupon_redemption_records',
'recharge_settlements',
'ODS_TABLES',
'ODS_GOODS_CATEGORY',
'ODS_STORE_GOODS',
'ODS_TABLE_DISCOUNT',
'table_fee_discount_records',
'ODS_TENANT_GOODS',
'ODS_SETTLEMENT_TICKET',
'ODS_ORDER_SETTLE'
'settlement_records',
'INIT_ODS_SCHEMA'
]) AS task_code
)
INSERT INTO etl_admin.etl_task (task_code, store_id, enabled)
@@ -37,3 +38,4 @@ FROM task_codes t CROSS JOIN target_store s
ON CONFLICT (task_code, store_id) DO UPDATE
SET enabled = EXCLUDED.enabled;

View File

@@ -0,0 +1,9 @@
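# DWD Quality Check Guide
Purpose: reconcile row counts and amounts, and spot-check samples, after ODS → DWD loading.
## Row count comparison (example)
- Source: `etl_billiards/ods_row_report.json` records the row counts of the sample JSON and ODS, and can serve as the baseline for DWD comparison.
- Execution: after the DWD run completes, count the rows of the key tables and align them with the ODS totals or the JSON baseline; output the differences when anomalies appear.
## Amount/metric reconciliation suggestions
- dwd_settlement_head / dwd_settlement_head_Ex: aggregate order totals and refund amounts, then reconcile them against the ODS settleList amounts.
- dwd_store_goods_sale: sum sales amount/quantity per product and compare with the aggregated ODS store_goods_sales_records.
- dwd_member_balance_change: sum the change amounts per member and compare with the aggregate of the corresponding ODS table.
- dwd_recharge_order / dwd_payment / dwd_refund: aggregate amounts by payment method and time period, then check the differences.
## Sample spot checks
- Pick several DWD records at random and look up their ODS payload (query the ODS table by primary key) to confirm that the field mapping is correct.
- For SCD2 dimensions (DIM tables): verify that each business key has exactly one row with is_current=1, that time ranges do not overlap, and that version numbers increase.
## Suggestions for automated check scripts
- Statistics script: write the row counts/amounts of the key DWD tables to JSON so they can be compared with the baseline.
- Anomaly alerts: when a row-count or amount deviation exceeds the threshold, print the details (primary-key list, aggregate breakdown).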
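The baseline comparison with a threshold can be sketched roughly as follows. This is a minimal illustration, not the project's script: the function name `compare_row_counts`, the `tolerance` parameter, and the assumption that both inputs are plain `table -> row count` mappings are all hypothetical (the actual layout of `ods_row_report.json` is not shown here).

```python
from typing import Dict, List


def compare_row_counts(
    baseline: Dict[str, int],
    actual: Dict[str, int],
    tolerance: float = 0.0,
) -> List[dict]:
    """Return one entry per table whose row count deviates beyond the tolerance.

    `baseline` and `actual` are assumed to map table names to row counts;
    `tolerance` is a relative deviation (0.05 means 5%).
    """
    diffs = []
    for table, expected in sorted(baseline.items()):
        got = actual.get(table, 0)
        if expected == 0:
            # An empty baseline only matches an empty table.
            deviation = 0.0 if got == 0 else float("inf")
        else:
            deviation = abs(got - expected) / expected
        if deviation > tolerance:
            diffs.append({"table": table, "expected": expected, "actual": got})
    return diffs


if __name__ == "__main__":
    # Hypothetical numbers, used only to show the call shape.
    baseline = {"dwd_payment": 200, "dwd_refund": 40}
    actual = {"dwd_payment": 200, "dwd_refund": 37}
    for diff in compare_row_counts(baseline, actual, tolerance=0.05):
        print(diff)
```

The same shape would probably work for amount checks by swapping the integer counts for summed amounts, although money comparisons may be better served by `decimal.Decimal` than by floats.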

View File

@@ -0,0 +1,28 @@
# ODS Sample JSON Reference Table
Sample file names share their prefix with production file names (production files append a `_YYYYMMDDHHMMSS` timestamp), and each table name matches its file prefix so the two are easy to cross-reference. Default sample directory: `C:\dev\LLTQ\export\test-json-doc`
| JSON file name (prefix) | ODS table | Primary key | Notes |
| --- | --- | --- | --- |
| assistant_accounts_master.json | assistant_accounts_master | id | Staff master data |
| assistant_cancellation_records.json | assistant_cancellation_records | id | Staff cancellation events |
| assistant_service_records.json | assistant_service_records | id | Staff service records |
| goods_stock_movements.json | goods_stock_movements | id | Inventory in/out movements |
| goods_stock_summary.json | goods_stock_summary | id | Stock summary |
| group_buy_packages.json | group_buy_packages | id | Group-buy package definitions |
| group_buy_redemption_records.json | group_buy_redemption_records | id | Group-buy redemption/consumption |
| member_balance_changes.json | member_balance_changes | id | Stored-value balance changes |
| member_profiles.json | member_profiles | id | Member profiles |
| member_stored_value_cards.json | member_stored_value_cards | id | Stored-value card accounts |
| payment_transactions.json | payment_transactions | id | Payment transactions |
| platform_coupon_redemption_records.json | platform_coupon_redemption_records | id | Platform coupon redemptions |
| recharge_settlements.json | recharge_settlements | id | Stored-value recharge settlements |
| refund_transactions.json | refund_transactions | id | Refund transactions |
| settlement_records.json | settlement_records | id | Order settlement headers |
| settlement_ticket_details.json | settlement_ticket_details | id | Tickets/detail lines |
| site_tables_master.json | site_tables_master | id | Table master data |
| stock_goods_category_tree.json | stock_goods_category_tree | id | Goods category tree |
| store_goods_master.json | store_goods_master | id | Store goods master |
| store_goods_sales_records.json | store_goods_sales_records | id | Store goods sales details |
| table_fee_discount_records.json | table_fee_discount_records | id | Table fee discounts/price adjustments |
| table_fee_transactions.json | table_fee_transactions | id | Table fee transactions |
| tenant_goods_master.json | tenant_goods_master | id | Brand/tenant-level goods master |
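The naming rule (table name = file prefix, with an optional `_YYYYMMDDHHMMSS` suffix on production files) could be resolved with a small helper like the one below. It is a sketch under that stated rule only; `ods_table_for` is a hypothetical name, not a function from the repository.

```python
import re
from pathlib import Path
from typing import Optional

# Optional `_YYYYMMDDHHMMSS` suffix at the end of the file stem.
_TS_SUFFIX = re.compile(r"_\d{14}$")


def ods_table_for(filename: str) -> Optional[str]:
    """Return the ODS table name implied by a JSON file name, or None for non-JSON files."""
    path = Path(filename)
    if path.suffix.lower() != ".json":
        return None
    # Strip the timestamp suffix if present; sample files have none.
    return _TS_SUFFIX.sub("", path.stem)


if __name__ == "__main__":
    for name in ("member_profiles.json", "payment_transactions_20251209045705.json"):
        print(name, "->", ods_table_for(name))
```

Matching exactly 14 digits keeps table names that themselves end in digits from being truncated by accident.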

View File

@@ -0,0 +1,252 @@
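# ODS → DWD Mapping Document (Rebuilt)
This document was rebuilt from the latest DWD quality-check results and lists each DWD table's ODS source and field-mapping status. DIM tables use SCD2 by default (SCD2_start_time / SCD2_end_time / SCD2_is_current / SCD2_version).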
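To make the four SCD2 columns concrete, here is a minimal in-memory sketch of the usual "close the current version, append a new one" step. It is an illustration under assumptions, not the project's loader: `apply_scd2`, the list-of-dicts history, and the choice to compare every incoming attribute are hypothetical.

```python
from datetime import datetime, timezone
from typing import List, Optional

# Open-ended end time, mirroring the '9999-12-31' default used for DIM tables.
OPEN_END = datetime(9999, 12, 31, tzinfo=timezone.utc)


def apply_scd2(history: List[dict], key: str, incoming: dict, now: datetime) -> List[dict]:
    """Merge one incoming record into `history` (mutated in place) using SCD2 rules."""
    current: Optional[dict] = next(
        (row for row in history
         if row[key] == incoming[key] and row["scd2_is_current"] == 1),
        None,
    )
    if current is not None:
        if all(current.get(col) == val for col, val in incoming.items()):
            return history  # No attribute changed: keep the current version.
        # Close the current version at the moment the new one starts.
        current["scd2_end_time"] = now
        current["scd2_is_current"] = 0
    version = current["scd2_version"] + 1 if current is not None else 1
    history.append({
        **incoming,
        "scd2_start_time": now,
        "scd2_end_time": OPEN_END,
        "scd2_is_current": 1,
        "scd2_version": version,
    })
    return history


if __name__ == "__main__":
    rows: List[dict] = []
    t1 = datetime(2025, 1, 1, tzinfo=timezone.utc)
    t2 = datetime(2025, 6, 1, tzinfo=timezone.utc)
    apply_scd2(rows, "site_id", {"site_id": 1, "shop_name": "A"}, t1)
    apply_scd2(rows, "site_id", {"site_id": 1, "shop_name": "B"}, t2)
    for row in rows:
        print(row)
```

Because the closed version ends exactly where the new one starts, the time ranges of one key stay contiguous and non-overlapping under half-open interval semantics.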
## Table-level mapping overview
| DWD table | Primary key / notes | Source ODS table | SCD2 |
| --- | --- | --- | --- |
| billiards_dwd.dim_site | see schema_dwd_doc.sql | billiards_ods.site_tables_master | Yes |
| billiards_dwd.dim_site_ex | see schema_dwd_doc.sql | billiards_ods.site_tables_master | Yes |
| billiards_dwd.dim_table | see schema_dwd_doc.sql | billiards_ods.site_tables_master | Yes |
| billiards_dwd.dim_table_ex | see schema_dwd_doc.sql | billiards_ods.site_tables_master | Yes |
| billiards_dwd.dim_assistant | see schema_dwd_doc.sql | billiards_ods.assistant_accounts_master | Yes |
| billiards_dwd.dim_assistant_ex | see schema_dwd_doc.sql | billiards_ods.assistant_accounts_master | Yes |
| billiards_dwd.dim_member | see schema_dwd_doc.sql | billiards_ods.member_profiles | Yes |
| billiards_dwd.dim_member_ex | see schema_dwd_doc.sql | billiards_ods.member_profiles | Yes |
| billiards_dwd.dim_member_card_account | see schema_dwd_doc.sql | billiards_ods.member_stored_value_cards | Yes |
| billiards_dwd.dim_member_card_account_ex | see schema_dwd_doc.sql | billiards_ods.member_stored_value_cards | Yes |
| billiards_dwd.dim_tenant_goods | see schema_dwd_doc.sql | billiards_ods.tenant_goods_master | Yes |
| billiards_dwd.dim_tenant_goods_ex | see schema_dwd_doc.sql | billiards_ods.tenant_goods_master | Yes |
| billiards_dwd.dim_store_goods | see schema_dwd_doc.sql | billiards_ods.store_goods_master | Yes |
| billiards_dwd.dim_store_goods_ex | see schema_dwd_doc.sql | billiards_ods.store_goods_master | Yes |
| billiards_dwd.dim_goods_category | see schema_dwd_doc.sql | billiards_ods.stock_goods_category_tree | Yes |
| billiards_dwd.dim_groupbuy_package | see schema_dwd_doc.sql | billiards_ods.group_buy_packages | Yes |
| billiards_dwd.dim_groupbuy_package_ex | see schema_dwd_doc.sql | billiards_ods.group_buy_packages | Yes |
| billiards_dwd.dwd_settlement_head | see schema_dwd_doc.sql | billiards_ods.settlement_records | No |
| billiards_dwd.dwd_settlement_head_ex | see schema_dwd_doc.sql | billiards_ods.settlement_records | No |
| billiards_dwd.dwd_table_fee_log | see schema_dwd_doc.sql | billiards_ods.table_fee_transactions | No |
| billiards_dwd.dwd_table_fee_log_ex | see schema_dwd_doc.sql | billiards_ods.table_fee_transactions | No |
| billiards_dwd.dwd_table_fee_adjust | see schema_dwd_doc.sql | billiards_ods.table_fee_discount_records | No |
| billiards_dwd.dwd_table_fee_adjust_ex | see schema_dwd_doc.sql | billiards_ods.table_fee_discount_records | No |
| billiards_dwd.dwd_store_goods_sale | see schema_dwd_doc.sql | billiards_ods.store_goods_sales_records | No |
| billiards_dwd.dwd_store_goods_sale_ex | see schema_dwd_doc.sql | billiards_ods.store_goods_sales_records | No |
| billiards_dwd.dwd_assistant_service_log | see schema_dwd_doc.sql | billiards_ods.assistant_service_records | No |
| billiards_dwd.dwd_assistant_service_log_ex | see schema_dwd_doc.sql | billiards_ods.assistant_service_records | No |
| billiards_dwd.dwd_assistant_trash_event | see schema_dwd_doc.sql | billiards_ods.assistant_cancellation_records | No |
| billiards_dwd.dwd_assistant_trash_event_ex | see schema_dwd_doc.sql | billiards_ods.assistant_cancellation_records | No |
| billiards_dwd.dwd_member_balance_change | see schema_dwd_doc.sql | billiards_ods.member_balance_changes | No |
| billiards_dwd.dwd_member_balance_change_ex | see schema_dwd_doc.sql | billiards_ods.member_balance_changes | No |
| billiards_dwd.dwd_groupbuy_redemption | see schema_dwd_doc.sql | billiards_ods.group_buy_redemption_records | No |
| billiards_dwd.dwd_groupbuy_redemption_ex | see schema_dwd_doc.sql | billiards_ods.group_buy_redemption_records | No |
| billiards_dwd.dwd_platform_coupon_redemption | see schema_dwd_doc.sql | billiards_ods.platform_coupon_redemption_records | No |
| billiards_dwd.dwd_platform_coupon_redemption_ex | see schema_dwd_doc.sql | billiards_ods.platform_coupon_redemption_records | No |
| billiards_dwd.dwd_recharge_order | see schema_dwd_doc.sql | billiards_ods.recharge_settlements | No |
| billiards_dwd.dwd_recharge_order_ex | see schema_dwd_doc.sql | billiards_ods.recharge_settlements | No |
| billiards_dwd.dwd_payment | see schema_dwd_doc.sql | billiards_ods.payment_transactions | No |
| billiards_dwd.dwd_refund | see schema_dwd_doc.sql | billiards_ods.refund_transactions | No |
| billiards_dwd.dwd_refund_ex | see schema_dwd_doc.sql | billiards_ods.refund_transactions | No |
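## Field-level mapping (same-name copy / mapping required)
Same-name copy: the DWD field has the same name as an ODS column and is copied directly. Mapping required: ODS has no column of that name, so the load logic must specify a source or a default value.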
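The two categories can be derived mechanically from the column lists of a DWD table and its ODS source. The sketch below assumes the lists are already available (for example, read from `information_schema.columns`); `split_columns` is a hypothetical helper, and the case-insensitive match is an assumption modelled on the loader's case-insensitive lookup.

```python
from typing import Iterable, List, Tuple


def split_columns(
    dwd_columns: Iterable[str],
    ods_columns: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Split DWD columns into (same-name copy, mapping required), keeping DWD order."""
    ods_lower = {col.lower() for col in ods_columns}
    same_name: List[str] = []
    needs_mapping: List[str] = []
    for col in dwd_columns:
        # A DWD column is a direct copy only if ODS has a column with the same name.
        if col.lower() in ods_lower:
            same_name.append(col)
        else:
            needs_mapping.append(col)
    return same_name, needs_mapping


if __name__ == "__main__":
    # Hypothetical column lists, used only to show the call shape.
    dwd = ["table_id", "site_id", "table_name", "scd2_version"]
    ods = ["id", "site_id", "table_name"]
    same, mapped = split_columns(dwd, ods)
    print("same-name copy:", same)
    print("mapping required:", mapped)
```

SCD2 columns always land in the second list, since they are generated at load time and never exist in ODS.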
### billiards_dwd.dim_site
Source: billiards_ods.site_tables_master
**Same-name copy fields:** site_id
**Fields requiring mapping/derivation:** org_id, shop_name, business_tel, full_address, tenant_id, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_site_ex
Source: billiards_ods.site_tables_master
**Same-name copy fields:** site_id, light_status, create_time
**Fields requiring mapping/derivation:** avatar, address, longitude, latitude, tenant_site_region_id, auto_light, light_type, light_token, site_type, site_label, attendance_enabled, attendance_distance, customer_service_qrcode, customer_service_wechat, fixed_pay_qrcode, prod_env, shop_status, update_time, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_table
Source: billiards_ods.site_tables_master
**Same-name copy fields:** site_id, table_name, site_table_area_id, table_price
**Fields requiring mapping/derivation:** table_id, tenant_id, site_table_area_name, tenant_table_area_id, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_table_ex
Source: billiards_ods.site_tables_master
**Same-name copy fields:** show_status, is_online_reservation, table_cloth_use_time, table_cloth_use_cycle, table_status
**Fields requiring mapping/derivation:** table_id, last_maintenance_time, remark, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_assistant
Source: billiards_ods.assistant_accounts_master
**Same-name copy fields:** assistant_no, real_name, nickname, mobile, tenant_id, site_id, team_id, team_name, level, entry_time, resign_time, leave_status, assistant_status
**Fields requiring mapping/derivation:** assistant_id, user_id, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_assistant_ex
Source: billiards_ods.assistant_accounts_master
**Same-name copy fields:** gender, avatar, video_introduction_url, staff_id, staff_profile_id, sum_grade, get_grade_times, work_status, show_status, show_sort, create_time, update_time, start_time, end_time, order_trade_no
**Fields requiring mapping/derivation:** assistant_id, birth_date, introduce, height, weight, shop_name, group_id, group_name, person_org_id, assistant_grade, charge_way, allow_cx, is_guaranteed, salary_grant_enabled, entry_type, entry_sign_status, resign_sign_status, online_status, is_delete, criticism_status, last_table_id, last_table_name, last_update_name, ding_talk_synced, site_light_cfg_id, light_equipment_id, light_status, is_team_leader, serial_number, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_member
Source: billiards_ods.member_profiles
**Same-name copy fields:** system_member_id, tenant_id, register_site_id, mobile, nickname, member_card_grade_code, member_card_grade_name, create_time
**Fields requiring mapping/derivation:** member_id, update_time, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_member_ex
Source: billiards_ods.member_profiles
**Same-name copy fields:** referrer_member_id, point, growth_value, user_status, status
**Fields requiring mapping/derivation:** member_id, register_site_name, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_member_card_account
Source: billiards_ods.member_stored_value_cards
**Same-name copy fields:** tenant_id, register_site_id, tenant_member_id, system_member_id, card_type_id, member_card_grade_code, member_card_grade_code_name, member_card_type_name, member_name, member_mobile, balance, start_time, end_time, last_consume_time, status, is_delete
**Fields requiring mapping/derivation:** member_card_id, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_member_card_account_ex
Source: billiards_ods.member_stored_value_cards
**Same-name copy fields:** site_name, tenantavatar, effect_site_id, able_cross_site, card_physics_type, card_no, bind_password, use_scene, denomination, create_time, disable_start_time, disable_end_time, is_allow_give, is_allow_order_deduct, sort, table_discount, goods_discount, assistant_discount, assistant_reward_discount, table_service_discount, goods_service_discount, assistant_service_discount, coupon_discount, table_discount_sub_switch, goods_discount_sub_switch, assistant_discount_sub_switch, assistant_reward_discount_sub_switch, goods_discount_range_type, table_deduct_radio, goods_deduct_radio, assistant_deduct_radio, table_service_deduct_radio, goods_service_deduct_radio, assistant_service_deduct_radio, assistant_reward_deduct_radio, coupon_deduct_radio, cardsettlededuct, tablecarddeduct, tableservicecarddeduct, goodscardeduct, goodsservicecarddeduct, assistantcarddeduct, assistantservicecarddeduct, assistantrewardcarddeduct, couponcarddeduct, deliveryfeededuct, tableareaid, goodscategoryid, pdassisnatlevel, cxassisnatlevel
**Fields requiring mapping/derivation:** member_card_id, tenant_name, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_tenant_goods
Source: billiards_ods.tenant_goods_master
**Same-name copy fields:** tenant_id, supplier_id, goods_category_id, goods_second_category_id, goods_name, goods_number, unit, market_price, goods_state, create_time, update_time, is_delete
**Fields requiring mapping/derivation:** tenant_goods_id, category_name, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_tenant_goods_ex
Source: billiards_ods.tenant_goods_master
**Same-name copy fields:** remark_name, pinyin_initial, goods_cover, goods_bar_code, commodity_code, min_discount_price, cost_price, cost_price_type, able_discount, sale_channel, is_warehousing, able_site_transfer, common_sale_royalty, point_sale_royalty
**Fields requiring mapping/derivation:** tenant_goods_id, commodity_code_list, is_in_site, out_goods_id, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_store_goods
Source: billiards_ods.store_goods_master
**Same-name direct-copy fields:** tenant_id, site_id, tenant_goods_id, goods_name, goods_category_id, goods_second_category_id, sale_price, goods_state, enable_status, send_state, is_delete
**Fields requiring mapping/derivation:** site_goods_id, category_level1_name, category_level2_name, batch_stock_qty, sale_qty, total_sales_qty, created_at, updated_at, avg_monthly_sales, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_store_goods_ex
Source: billiards_ods.store_goods_master
**Same-name direct-copy fields:** unit, pinyin_initial, cost_price, cost_price_type, total_purchase_cost, min_discount_price, audit_status, sale_channel, is_warehousing, forbid_sell_status, able_site_transfer, custom_label_type, option_required, remark
**Fields requiring mapping/derivation:** site_goods_id, site_name, goods_barcode, goods_cover_url, stock_qty, stock_secondary_qty, safety_stock_qty, provisional_total_cost, is_discountable, days_on_shelf, freeze_status, sort_order, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_goods_category
Source: billiards_ods.stock_goods_category_tree
**Same-name direct-copy fields:** tenant_id, category_name, alias_name, business_name, tenant_goods_business_id, open_salesman, is_warehousing
**Fields requiring mapping/derivation:** category_id, parent_category_id, category_level, is_leaf, sort_order, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_groupbuy_package
Source: billiards_ods.group_buy_packages
**Same-name direct-copy fields:** tenant_id, site_id, package_name, selling_price, start_time, end_time, table_area_name, is_enabled, is_delete, create_time, tenant_table_area_id_list, card_type_ids
**Fields requiring mapping/derivation:** groupbuy_package_id, package_template_id, coupon_face_value, duration_seconds, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dim_groupbuy_package_ex
Source: billiards_ods.group_buy_packages
**Same-name direct-copy fields:** site_name, usable_count, date_type, usable_range, date_info, start_clock, end_clock, add_start_clock, add_end_clock, area_tag_type, table_area_id, tenant_table_area_id, table_area_id_list, group_type, system_group_type, effective_status, max_selectable_categories, creator_name
**Fields requiring mapping/derivation:** groupbuy_package_id, package_type, scd2_start_time, scd2_end_time, scd2_is_current, scd2_version
### billiards_dwd.dwd_settlement_head
Source: billiards_ods.settlement_records
**Same-name direct-copy fields:** (none)
**Fields requiring mapping/derivation:** order_settle_id, tenant_id, site_id, site_name, table_id, settle_name, order_trade_no, create_time, pay_time, settle_type, revoke_order_id, member_id, member_name, member_phone, member_card_account_id, member_card_type_name, is_bind_member, member_discount_amount, consume_money, table_charge_money, goods_money, real_goods_money, assistant_pd_money, assistant_cx_money, adjust_amount, pay_amount, balance_amount, recharge_card_amount, gift_card_amount, coupon_amount, rounding_amount, point_amount
### billiards_dwd.dwd_settlement_head_ex
Source: billiards_ods.settlement_records
**Same-name direct-copy fields:** (none)
**Fields requiring mapping/derivation:** order_settle_id, serial_number, settle_status, can_be_revoked, revoke_order_name, revoke_time, is_first_order, service_money, cash_amount, card_amount, online_amount, refund_amount, prepay_money, payment_method, coupon_sale_amount, all_coupon_discount, goods_promotion_money, assistant_promotion_money, activity_discount, assistant_manual_discount, point_discount_price, point_discount_cost, is_use_coupon, is_use_discount, is_activity, operator_name, salesman_name, order_remark, operator_id, salesman_user_id
### billiards_dwd.dwd_table_fee_log
Source: billiards_ods.table_fee_transactions
**Same-name direct-copy fields:** order_trade_no, order_settle_id, order_pay_id, tenant_id, site_id, site_table_id, site_table_area_id, site_table_area_name, tenant_table_area_id, member_id, ledger_name, ledger_unit_price, ledger_count, ledger_amount, real_table_charge_money, coupon_promotion_amount, member_discount_amount, adjust_amount, real_table_use_seconds, add_clock_seconds, start_use_time, ledger_end_time, create_time, ledger_status, is_single_order, is_delete
**Fields requiring mapping/derivation:** table_fee_log_id
### billiards_dwd.dwd_table_fee_log_ex
Source: billiards_ods.table_fee_transactions
**Same-name direct-copy fields:** operator_name, salesman_name, used_card_amount, service_money, mgmt_fee, fee_total, ledger_start_time, last_use_time, operator_id, salesman_user_id, salesman_org_id
**Fields requiring mapping/derivation:** table_fee_log_id
### billiards_dwd.dwd_table_fee_adjust
Source: billiards_ods.table_fee_discount_records
**Same-name direct-copy fields:** order_trade_no, order_settle_id, tenant_id, site_id, tenant_table_area_id, ledger_amount, ledger_status, is_delete
**Fields requiring mapping/derivation:** table_fee_adjust_id, table_id, table_area_id, table_area_name, adjust_time
### billiards_dwd.dwd_table_fee_adjust_ex
Source: billiards_ods.table_fee_discount_records
**Same-name direct-copy fields:** adjust_type, ledger_count, ledger_name, applicant_name, operator_name, applicant_id, operator_id
**Fields requiring mapping/derivation:** table_fee_adjust_id
### billiards_dwd.dwd_store_goods_sale
Source: billiards_ods.store_goods_sales_records
**Same-name direct-copy fields:** order_trade_no, order_settle_id, order_pay_id, order_goods_id, site_id, tenant_id, site_goods_id, tenant_goods_id, tenant_goods_category_id, tenant_goods_business_id, site_table_id, ledger_name, ledger_group_name, ledger_unit_price, ledger_count, ledger_amount, real_goods_money, cost_money, ledger_status, is_delete, create_time
**Fields requiring mapping/derivation:** store_goods_sale_id, discount_price
### billiards_dwd.dwd_store_goods_sale_ex
Source: billiards_ods.store_goods_sales_records
**Same-name direct-copy fields:** goods_remark, option_value_name, operator_name, salesman_user_id, salesman_name, salesman_role_id, discount_money, coupon_deduct_money, member_discount_amount, point_discount_money, point_discount_money_cost, package_coupon_id, order_coupon_id, member_coupon_id, option_price, option_member_discount_money, option_coupon_deduct_money, push_money, is_single_order, sales_type, operator_id
**Fields requiring mapping/derivation:** store_goods_sale_id, legacy_order_goods_id, site_name, legacy_site_id, open_salesman_flag, salesman_org_id, returns_number
### billiards_dwd.dwd_assistant_service_log
Source: billiards_ods.assistant_service_records
**Same-name direct-copy fields:** order_trade_no, order_settle_id, order_pay_id, order_assistant_id, order_assistant_type, tenant_id, site_id, site_table_id, nickname, assistant_team_id, person_org_id, assistant_level, ledger_unit_price, ledger_amount, projected_income, coupon_deduct_money, income_seconds, real_use_seconds, add_clock, create_time, start_use_time, last_use_time, is_delete
**Fields requiring mapping/derivation:** assistant_service_id, tenant_member_id, system_member_id, assistant_no, site_assistant_id, user_id, level_name, skill_id, skill_name
### billiards_dwd.dwd_assistant_service_log_ex
Source: billiards_ods.assistant_service_records
**Same-name direct-copy fields:** ledger_name, ledger_group_name, ledger_count, member_discount_amount, manual_discount_amount, service_money, returns_clock, ledger_start_time, ledger_end_time, ledger_status, is_confirm, is_single_order, is_not_responding, is_trash, trash_applicant_id, trash_applicant_name, trash_reason, salesman_user_id, salesman_name, salesman_org_id, skill_grade, service_grade, composite_grade, sum_grade, get_grade_times, grade_status, composite_grade_time
**Fields requiring mapping/derivation:** assistant_service_id, table_name, assistant_name
### billiards_dwd.dwd_assistant_trash_event
Source: billiards_ods.assistant_cancellation_records
**Same-name direct-copy fields:** (none)
**Fields requiring mapping/derivation:** assistant_trash_event_id, site_id, table_id, table_area_id, assistant_no, assistant_name, charge_minutes_raw, abolish_amount, trash_reason, create_time
### billiards_dwd.dwd_assistant_trash_event_ex
Source: billiards_ods.assistant_cancellation_records
**Same-name direct-copy fields:** (none)
**Fields requiring mapping/derivation:** assistant_trash_event_id, table_name, table_area_name
### billiards_dwd.dwd_member_balance_change
Source: billiards_ods.member_balance_changes
**Same-name direct-copy fields:** tenant_id, site_id, register_site_id, tenant_member_id, system_member_id, tenant_member_card_id, card_type_id, from_type, payment_method, is_delete, remark
**Fields requiring mapping/derivation:** balance_change_id, card_type_name, member_name, member_mobile, balance_before, change_amount, balance_after, change_time
### billiards_dwd.dwd_member_balance_change_ex
Source: billiards_ods.member_balance_changes
**Same-name direct-copy fields:** refund_amount, operator_id, operator_name
**Fields requiring mapping/derivation:** balance_change_id, pay_site_name, register_site_name
### billiards_dwd.dwd_groupbuy_redemption
Source: billiards_ods.group_buy_redemption_records
**Same-name direct-copy fields:** tenant_id, site_id, table_id, tenant_table_area_id, table_charge_seconds, order_trade_no, order_settle_id, order_coupon_id, coupon_origin_id, promotion_activity_id, promotion_coupon_id, order_coupon_channel, ledger_unit_price, ledger_count, ledger_amount, coupon_money, promotion_seconds, coupon_code, is_single_order, is_delete, ledger_name, create_time
**Fields requiring mapping/derivation:** redemption_id
### billiards_dwd.dwd_groupbuy_redemption_ex
Source: billiards_ods.group_buy_redemption_records
**Same-name direct-copy fields:** order_pay_id, goods_promotion_money, table_service_promotion_money, assistant_promotion_money, assistant_service_promotion_money, reward_promotion_money, recharge_promotion_money, offer_type, ledger_status, operator_id, operator_name, salesman_user_id, salesman_name, salesman_role_id, ledger_group_name
**Fields requiring mapping/derivation:** redemption_id, site_name, table_name, table_area_name, goods_option_price, salesman_org_id
### billiards_dwd.dwd_platform_coupon_redemption
Source: billiards_ods.platform_coupon_redemption_records
**Same-name direct-copy fields:** tenant_id, site_id, coupon_code, coupon_channel, coupon_name, sale_price, coupon_money, coupon_free_time, channel_deal_id, deal_id, group_package_id, site_order_id, table_id, certificate_id, verify_id, use_status, is_delete, create_time, consume_time
**Fields requiring mapping/derivation:** platform_coupon_redemption_id
### billiards_dwd.dwd_platform_coupon_redemption_ex
Source: billiards_ods.platform_coupon_redemption_records
**Same-name direct-copy fields:** coupon_cover, coupon_remark, groupon_type, operator_id, operator_name
**Fields requiring mapping/derivation:** platform_coupon_redemption_id
### billiards_dwd.dwd_recharge_order
Source: billiards_ods.recharge_settlements
**Same-name direct-copy fields:** (none)
**Fields requiring mapping/derivation:** recharge_order_id, tenant_id, site_id, member_id, member_name_snapshot, member_phone_snapshot, tenant_member_card_id, member_card_type_name, settle_relate_id, settle_type, settle_name, is_first, pay_amount, refund_amount, point_amount, cash_amount, payment_method, create_time, pay_time
### billiards_dwd.dwd_recharge_order_ex
Source: billiards_ods.recharge_settlements
**Same-name direct-copy fields:** (none)
**Fields requiring mapping/derivation:** recharge_order_id, site_name_snapshot, settle_status, is_bind_member, is_activity, is_use_coupon, is_use_discount, can_be_revoked, online_amount, balance_amount, card_amount, coupon_amount, recharge_card_amount, gift_card_amount, prepay_money, consume_money, goods_money, real_goods_money, table_charge_money, service_money, activity_discount, all_coupon_discount, goods_promotion_money, assistant_promotion_money, assistant_pd_money, assistant_cx_money, assistant_manual_discount, coupon_sale_amount, member_discount_amount, point_discount_price, point_discount_cost, adjust_amount, rounding_amount, operator_id, operator_name_snapshot, salesman_user_id, salesman_name, order_remark, table_id, serial_number, revoke_order_id, revoke_order_name, revoke_time
### billiards_dwd.dwd_payment
Source: billiards_ods.payment_transactions
**Same-name direct-copy fields:** site_id, relate_type, relate_id, pay_amount, pay_status, payment_method, online_pay_channel, create_time, pay_time
**Fields requiring mapping/derivation:** payment_id, pay_date
### billiards_dwd.dwd_refund
Source: billiards_ods.refund_transactions
**Same-name direct-copy fields:** tenant_id, site_id, relate_type, relate_id, pay_amount, channel_fee, pay_time, create_time, payment_method, member_id, member_card_id
**Fields requiring mapping/derivation:** refund_id
### billiards_dwd.dwd_refund_ex
Source: billiards_ods.refund_transactions
**Same-name direct-copy fields:** pay_sn, refund_amount, round_amount, balance_frozen_amount, card_frozen_amount, pay_status, action_type, is_revoke, is_delete, check_status, online_pay_channel, online_pay_type, pay_terminal, pay_config_id, cashier_point_id, operator_id, channel_payer_id, channel_pay_no
**Fields requiring mapping/derivation:** refund_id, tenant_name
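
The split between same-name fields and mapped/derived fields lends itself to generating an `INSERT ... SELECT` per table. The following is a minimal sketch under stated assumptions, not the loader used in this commit: `build_insert_select` is a hypothetical helper, and the expression `id` used for `refund_id` is an assumed ODS column name that this document does not confirm.

```python
def build_insert_select(dwd_table, ods_table, direct_cols, mapped_cols):
    """Build an INSERT ... SELECT statement for one ODS -> DWD table pair.

    direct_cols: columns copied under the same name.
    mapped_cols: dict of DWD column -> SQL expression over the ODS row
                 (the expressions are illustrative assumptions).
    """
    target_cols = list(direct_cols) + list(mapped_cols)
    select_exprs = list(direct_cols) + [
        f"{expr} AS {col}" for col, expr in mapped_cols.items()
    ]
    return (
        f"INSERT INTO {dwd_table} ({', '.join(target_cols)})\n"
        f"SELECT {', '.join(select_exprs)}\n"
        f"FROM {ods_table}"
    )


sql = build_insert_select(
    "billiards_dwd.dwd_refund",
    "billiards_ods.refund_transactions",
    ["tenant_id", "site_id", "pay_amount"],  # subset of the same-name fields
    {"refund_id": "id"},  # assumed source column; not confirmed by this document
)
print(sql)
```

Keeping the mapped expressions in a dictionary keeps the per-table differences in data rather than in code, which is one way to keep a mapping document like this and the loader in sync.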


@@ -0,0 +1,692 @@
{
"generated_at": "2025-12-09T01:38:19.992961",
"tables": [
{
"dwd_table": "billiards_dwd.dim_site",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 1,
"ods": 200,
"diff": -199
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_site_ex",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 1,
"ods": 200,
"diff": -199
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_table",
"ods_table": "billiards_ods.site_tables_master",
"count": {
"dwd": 71,
"ods": 71,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_table_ex",
"ods_table": "billiards_ods.site_tables_master",
"count": {
"dwd": 71,
"ods": 71,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_assistant",
"ods_table": "billiards_ods.assistant_accounts_master",
"count": {
"dwd": 50,
"ods": 50,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_assistant_ex",
"ods_table": "billiards_ods.assistant_accounts_master",
"count": {
"dwd": 50,
"ods": 50,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_member",
"ods_table": "billiards_ods.member_profiles",
"count": {
"dwd": 199,
"ods": 199,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_member_ex",
"ods_table": "billiards_ods.member_profiles",
"count": {
"dwd": 199,
"ods": 199,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_member_card_account",
"ods_table": "billiards_ods.member_stored_value_cards",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "balance",
"dwd_sum": 31061.03,
"ods_sum": 31061.03,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dim_member_card_account_ex",
"ods_table": "billiards_ods.member_stored_value_cards",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "deliveryfeededuct",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dim_tenant_goods",
"ods_table": "billiards_ods.tenant_goods_master",
"count": {
"dwd": 156,
"ods": 156,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_tenant_goods_ex",
"ods_table": "billiards_ods.tenant_goods_master",
"count": {
"dwd": 156,
"ods": 156,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_store_goods",
"ods_table": "billiards_ods.store_goods_master",
"count": {
"dwd": 161,
"ods": 161,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_store_goods_ex",
"ods_table": "billiards_ods.store_goods_master",
"count": {
"dwd": 161,
"ods": 161,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_goods_category",
"ods_table": "billiards_ods.stock_goods_category_tree",
"count": {
"dwd": 9,
"ods": 9,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_groupbuy_package",
"ods_table": "billiards_ods.group_buy_packages",
"count": {
"dwd": 17,
"ods": 17,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_groupbuy_package_ex",
"ods_table": "billiards_ods.group_buy_packages",
"count": {
"dwd": 17,
"ods": 17,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_settlement_head",
"ods_table": "billiards_ods.settlement_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_settlement_head_ex",
"ods_table": "billiards_ods.settlement_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_log",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "adjust_amount",
"dwd_sum": 1157.45,
"ods_sum": 1157.45,
"diff": 0.0
},
{
"column": "coupon_promotion_amount",
"dwd_sum": 11244.49,
"ods_sum": 11244.49,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 18107.0,
"ods_sum": 18107.0,
"diff": 0.0
},
{
"column": "member_discount_amount",
"dwd_sum": 1149.19,
"ods_sum": 1149.19,
"diff": 0.0
},
{
"column": "real_table_charge_money",
"dwd_sum": 5705.06,
"ods_sum": 5705.06,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_log_ex",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "fee_total",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "mgmt_fee",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "service_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "used_card_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_adjust",
"ods_table": "billiards_ods.table_fee_discount_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "ledger_amount",
"dwd_sum": 20650.84,
"ods_sum": 20650.84,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_adjust_ex",
"ods_table": "billiards_ods.table_fee_discount_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_store_goods_sale",
"ods_table": "billiards_ods.store_goods_sales_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "cost_money",
"dwd_sum": 22.3,
"ods_sum": 22.3,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 4583.0,
"ods_sum": 4583.0,
"diff": 0.0
},
{
"column": "real_goods_money",
"dwd_sum": 3791.0,
"ods_sum": 3791.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_store_goods_sale_ex",
"ods_table": "billiards_ods.store_goods_sales_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_deduct_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "discount_money",
"dwd_sum": 792.0,
"ods_sum": 792.0,
"diff": 0.0
},
{
"column": "member_discount_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "option_coupon_deduct_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "option_member_discount_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "point_discount_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "point_discount_money_cost",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "push_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_assistant_service_log",
"ods_table": "billiards_ods.assistant_service_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_deduct_money",
"dwd_sum": 626.83,
"ods_sum": 626.83,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 63251.37,
"ods_sum": 63251.37,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_assistant_service_log_ex",
"ods_table": "billiards_ods.assistant_service_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "manual_discount_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "member_discount_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "service_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_assistant_trash_event",
"ods_table": "billiards_ods.assistant_cancellation_records",
"count": {
"dwd": 15,
"ods": 15,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_assistant_trash_event_ex",
"ods_table": "billiards_ods.assistant_cancellation_records",
"count": {
"dwd": 15,
"ods": 15,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_member_balance_change",
"ods_table": "billiards_ods.member_balance_changes",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_member_balance_change_ex",
"ods_table": "billiards_ods.member_balance_changes",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "refund_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_groupbuy_redemption",
"ods_table": "billiards_ods.group_buy_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_money",
"dwd_sum": 12266.0,
"ods_sum": 12266.0,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 12049.53,
"ods_sum": 12049.53,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_groupbuy_redemption_ex",
"ods_table": "billiards_ods.group_buy_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "assistant_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "assistant_service_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "goods_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "recharge_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "reward_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "table_service_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_platform_coupon_redemption",
"ods_table": "billiards_ods.platform_coupon_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_money",
"dwd_sum": 11956.0,
"ods_sum": 11956.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_platform_coupon_redemption_ex",
"ods_table": "billiards_ods.platform_coupon_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_recharge_order",
"ods_table": "billiards_ods.recharge_settlements",
"count": {
"dwd": 74,
"ods": 74,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_recharge_order_ex",
"ods_table": "billiards_ods.recharge_settlements",
"count": {
"dwd": 74,
"ods": 74,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_payment",
"ods_table": "billiards_ods.payment_transactions",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "pay_amount",
"dwd_sum": 10863.0,
"ods_sum": 10863.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_refund",
"ods_table": "billiards_ods.refund_transactions",
"count": {
"dwd": 11,
"ods": 11,
"diff": 0
},
"amounts": [
{
"column": "channel_fee",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "pay_amount",
"dwd_sum": -62186.0,
"ods_sum": -62186.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_refund_ex",
"ods_table": "billiards_ods.refund_transactions",
"count": {
"dwd": 11,
"ods": 11,
"diff": 0
},
"amounts": [
{
"column": "balance_frozen_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "card_frozen_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "refund_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "round_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
}
],
"note": "Row-count and amount reconciliation; amount fields are found by automatically scanning numeric columns whose names contain amount/money/fee/balance."
}
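
A report of this shape can be checked mechanically. The sketch below is illustrative only: `is_amount_column` and `find_mismatches` are hypothetical helpers, the 0.01 tolerance is an assumption, and the name filter only mirrors the substring rule described in the report's note (it does not check that a column is numeric).

```python
import json

AMOUNT_KEYWORDS = ("amount", "money", "fee", "balance")


def is_amount_column(name):
    """Return True if the column name contains one of the amount keywords."""
    lowered = name.lower()
    return any(keyword in lowered for keyword in AMOUNT_KEYWORDS)


def find_mismatches(report, tolerance=0.01):
    """Collect (table, field, diff) tuples for every non-zero difference."""
    issues = []
    for entry in report.get("tables", []):
        table = entry["dwd_table"]
        count_diff = entry["count"]["diff"]
        if count_diff != 0:
            issues.append((table, "count", count_diff))
        for amount in entry.get("amounts", []):
            if abs(amount["diff"]) > tolerance:
                issues.append((table, amount["column"], amount["diff"]))
    return issues


# Two entries copied from the report above.
sample = json.loads("""
{
  "tables": [
    {"dwd_table": "billiards_dwd.dim_site",
     "ods_table": "billiards_ods.table_fee_transactions",
     "count": {"dwd": 1, "ods": 200, "diff": -199},
     "amounts": []},
    {"dwd_table": "billiards_dwd.dwd_payment",
     "ods_table": "billiards_ods.payment_transactions",
     "count": {"dwd": 200, "ods": 200, "diff": 0},
     "amounts": [{"column": "pay_amount", "dwd_sum": 10863.0,
                  "ods_sum": 10863.0, "diff": 0.0}]}
  ]
}
""")

issues = find_mismatches(sample)
print(issues)
```

On this sample only `dim_site` is flagged. That difference is probably expected rather than an error, since a dimension built from transaction rows would normally collapse to one row per site, so a real check might exempt such tables.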


@@ -0,0 +1,52 @@
{
"source_counts": {
"assistant_accounts_master.json": 2,
"assistant_cancellation_records.json": 2,
"assistant_service_records.json": 2,
"goods_stock_movements.json": 2,
"goods_stock_summary.json": 161,
"group_buy_packages.json": 2,
"group_buy_redemption_records.json": 2,
"member_balance_changes.json": 2,
"member_profiles.json": 2,
"member_stored_value_cards.json": 2,
"payment_transactions.json": 200,
"platform_coupon_redemption_records.json": 200,
"recharge_settlements.json": 2,
"refund_transactions.json": 11,
"settlement_records.json": 2,
"settlement_ticket_details.json": 193,
"site_tables_master.json": 2,
"stock_goods_category_tree.json": 2,
"store_goods_master.json": 2,
"store_goods_sales_records.json": 2,
"table_fee_discount_records.json": 2,
"table_fee_transactions.json": 2,
"tenant_goods_master.json": 2
},
"ods_counts": {
"member_profiles": 199,
"member_balance_changes": 200,
"member_stored_value_cards": 200,
"recharge_settlements": 75,
"settlement_records": 200,
"assistant_cancellation_records": 15,
"assistant_accounts_master": 50,
"assistant_service_records": 200,
"site_tables_master": 71,
"table_fee_discount_records": 200,
"table_fee_transactions": 200,
"goods_stock_movements": 200,
"stock_goods_category_tree": 9,
"goods_stock_summary": 161,
"payment_transactions": 200,
"refund_transactions": 11,
"platform_coupon_redemption_records": 200,
"tenant_goods_master": 156,
"group_buy_packages": 17,
"group_buy_redemption_records": 200,
"settlement_ticket_details": 193,
"store_goods_master": 161,
"store_goods_sales_records": 200
}
}
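
The two count maps use different keys (file names versus table names), so comparing them needs a small normalization step. A minimal sketch, assuming a hypothetical helper `compare_counts`; what the source-side count of 2 actually measures is not stated in this report, so a mismatch here is a prompt to investigate rather than proof of data loss.

```python
from pathlib import PurePosixPath


def compare_counts(source_counts, ods_counts):
    """Return (table, source_count, ods_count) for every pair that differs.

    Source keys are file names such as "refund_transactions.json";
    ODS keys are bare table names. A missing side is reported as None.
    """
    by_table = {PurePosixPath(name).stem: n for name, n in source_counts.items()}
    rows = []
    for table in sorted(set(by_table) | set(ods_counts)):
        src = by_table.get(table)
        ods = ods_counts.get(table)
        if src != ods:
            rows.append((table, src, ods))
    return rows


# Values copied from the report above.
result = compare_counts(
    {"refund_transactions.json": 11, "recharge_settlements.json": 2},
    {"refund_transactions": 11, "recharge_settlements": 75},
)
print(result)
```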


@@ -15,10 +15,14 @@ from tasks.table_discount_task import TableDiscountTask
from tasks.assistant_abolish_task import AssistantAbolishTask
from tasks.ledger_task import LedgerTask
from tasks.ods_tasks import ODS_TASK_CLASSES
from tasks.ticket_dwd_task import TicketDwdTask
from tasks.manual_ingest_task import ManualIngestTask
from tasks.payments_dwd_task import PaymentsDwdTask
from tasks.members_dwd_task import MembersDwdTask
from tasks.init_schema_task import InitOdsSchemaTask
from tasks.init_dwd_schema_task import InitDwdSchemaTask
from tasks.dwd_load_task import DwdLoadTask
from tasks.ticket_dwd_task import TicketDwdTask
from tasks.dwd_quality_task import DwdQualityTask
class TaskRegistry:
"""Task registry and factory."""
@@ -64,5 +68,9 @@ default_registry.register("TICKET_DWD", TicketDwdTask)
default_registry.register("MANUAL_INGEST", ManualIngestTask)
default_registry.register("PAYMENTS_DWD", PaymentsDwdTask)
default_registry.register("MEMBERS_DWD", MembersDwdTask)
default_registry.register("INIT_ODS_SCHEMA", InitOdsSchemaTask)
default_registry.register("INIT_DWD_SCHEMA", InitDwdSchemaTask)
default_registry.register("DWD_LOAD_FROM_ODS", DwdLoadTask)
default_registry.register("DWD_QUALITY_CHECK", DwdQualityTask)
for code, task_cls in ODS_TASK_CLASSES.items():
default_registry.register(code, task_cls)
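
The registration calls above follow a plain registry/factory pattern. The body of `TaskRegistry` is not shown in this diff, so the following is only a sketch of how such a class could work: the `create` and `codes` methods, the case-insensitive lookup, and the placeholder task class are all assumptions made for illustration.

```python
class TaskRegistry:
    """Minimal stand-in for a task registry and factory."""

    def __init__(self):
        self._tasks = {}

    def register(self, code, task_cls):
        # Normalizing to upper case is an assumption, not taken from the diff.
        self._tasks[code.upper()] = task_cls

    def create(self, code, *args, **kwargs):
        """Hypothetical factory method: instantiate the task for a code."""
        try:
            task_cls = self._tasks[code.upper()]
        except KeyError:
            raise ValueError(f"Unknown task code: {code}") from None
        return task_cls(*args, **kwargs)

    def codes(self):
        """Return all registered task codes in sorted order."""
        return sorted(self._tasks)


class DwdQualityTask:
    """Placeholder; the real task lives in tasks.dwd_quality_task."""

    def __init__(self, config=None):
        self.config = config


registry = TaskRegistry()
registry.register("DWD_QUALITY_CHECK", DwdQualityTask)
task = registry.create("dwd_quality_check", config={"dry_run": True})
print(type(task).__name__, registry.codes())
```

Registering classes rather than instances defers construction until a task is actually requested, so each run can get its own configuration.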


@@ -0,0 +1,692 @@
{
"generated_at": "2025-12-09T03:43:54.887796",
"tables": [
{
"dwd_table": "billiards_dwd.dim_site",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 1,
"ods": 200,
"diff": -199
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_site_ex",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 1,
"ods": 200,
"diff": -199
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_table",
"ods_table": "billiards_ods.site_tables_master",
"count": {
"dwd": 71,
"ods": 71,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_table_ex",
"ods_table": "billiards_ods.site_tables_master",
"count": {
"dwd": 71,
"ods": 71,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_assistant",
"ods_table": "billiards_ods.assistant_accounts_master",
"count": {
"dwd": 50,
"ods": 50,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_assistant_ex",
"ods_table": "billiards_ods.assistant_accounts_master",
"count": {
"dwd": 50,
"ods": 50,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_member",
"ods_table": "billiards_ods.member_profiles",
"count": {
"dwd": 199,
"ods": 199,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_member_ex",
"ods_table": "billiards_ods.member_profiles",
"count": {
"dwd": 199,
"ods": 199,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_member_card_account",
"ods_table": "billiards_ods.member_stored_value_cards",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "balance",
"dwd_sum": 31061.03,
"ods_sum": 31061.03,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dim_member_card_account_ex",
"ods_table": "billiards_ods.member_stored_value_cards",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "deliveryfeededuct",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dim_tenant_goods",
"ods_table": "billiards_ods.tenant_goods_master",
"count": {
"dwd": 156,
"ods": 156,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_tenant_goods_ex",
"ods_table": "billiards_ods.tenant_goods_master",
"count": {
"dwd": 156,
"ods": 156,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_store_goods",
"ods_table": "billiards_ods.store_goods_master",
"count": {
"dwd": 161,
"ods": 161,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_store_goods_ex",
"ods_table": "billiards_ods.store_goods_master",
"count": {
"dwd": 161,
"ods": 161,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_goods_category",
"ods_table": "billiards_ods.stock_goods_category_tree",
"count": {
"dwd": 9,
"ods": 9,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_groupbuy_package",
"ods_table": "billiards_ods.group_buy_packages",
"count": {
"dwd": 17,
"ods": 17,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_groupbuy_package_ex",
"ods_table": "billiards_ods.group_buy_packages",
"count": {
"dwd": 17,
"ods": 17,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_settlement_head",
"ods_table": "billiards_ods.settlement_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_settlement_head_ex",
"ods_table": "billiards_ods.settlement_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_log",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "adjust_amount",
"dwd_sum": 1157.45,
"ods_sum": 1157.45,
"diff": 0.0
},
{
"column": "coupon_promotion_amount",
"dwd_sum": 11244.49,
"ods_sum": 11244.49,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 18107.0,
"ods_sum": 18107.0,
"diff": 0.0
},
{
"column": "member_discount_amount",
"dwd_sum": 1149.19,
"ods_sum": 1149.19,
"diff": 0.0
},
{
"column": "real_table_charge_money",
"dwd_sum": 5705.06,
"ods_sum": 5705.06,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_log_ex",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "fee_total",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "mgmt_fee",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "service_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "used_card_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_adjust",
"ods_table": "billiards_ods.table_fee_discount_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "ledger_amount",
"dwd_sum": 20650.84,
"ods_sum": 20650.84,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_adjust_ex",
"ods_table": "billiards_ods.table_fee_discount_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_store_goods_sale",
"ods_table": "billiards_ods.store_goods_sales_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "cost_money",
"dwd_sum": 22.3,
"ods_sum": 22.3,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 4583.0,
"ods_sum": 4583.0,
"diff": 0.0
},
{
"column": "real_goods_money",
"dwd_sum": 3791.0,
"ods_sum": 3791.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_store_goods_sale_ex",
"ods_table": "billiards_ods.store_goods_sales_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_deduct_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "discount_money",
"dwd_sum": 792.0,
"ods_sum": 792.0,
"diff": 0.0
},
{
"column": "member_discount_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "option_coupon_deduct_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "option_member_discount_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "point_discount_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "point_discount_money_cost",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "push_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_assistant_service_log",
"ods_table": "billiards_ods.assistant_service_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_deduct_money",
"dwd_sum": 626.83,
"ods_sum": 626.83,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 63251.37,
"ods_sum": 63251.37,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_assistant_service_log_ex",
"ods_table": "billiards_ods.assistant_service_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "manual_discount_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "member_discount_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "service_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_assistant_trash_event",
"ods_table": "billiards_ods.assistant_cancellation_records",
"count": {
"dwd": 15,
"ods": 15,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_assistant_trash_event_ex",
"ods_table": "billiards_ods.assistant_cancellation_records",
"count": {
"dwd": 15,
"ods": 15,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_member_balance_change",
"ods_table": "billiards_ods.member_balance_changes",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_member_balance_change_ex",
"ods_table": "billiards_ods.member_balance_changes",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "refund_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_groupbuy_redemption",
"ods_table": "billiards_ods.group_buy_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_money",
"dwd_sum": 12266.0,
"ods_sum": 12266.0,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 12049.53,
"ods_sum": 12049.53,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_groupbuy_redemption_ex",
"ods_table": "billiards_ods.group_buy_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "assistant_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "assistant_service_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "goods_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "recharge_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "reward_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "table_service_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_platform_coupon_redemption",
"ods_table": "billiards_ods.platform_coupon_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_money",
"dwd_sum": 11956.0,
"ods_sum": 11956.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_platform_coupon_redemption_ex",
"ods_table": "billiards_ods.platform_coupon_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_recharge_order",
"ods_table": "billiards_ods.recharge_settlements",
"count": {
"dwd": 74,
"ods": 74,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_recharge_order_ex",
"ods_table": "billiards_ods.recharge_settlements",
"count": {
"dwd": 74,
"ods": 74,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_payment",
"ods_table": "billiards_ods.payment_transactions",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "pay_amount",
"dwd_sum": 10863.0,
"ods_sum": 10863.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_refund",
"ods_table": "billiards_ods.refund_transactions",
"count": {
"dwd": 11,
"ods": 11,
"diff": 0
},
"amounts": [
{
"column": "channel_fee",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "pay_amount",
"dwd_sum": -62186.0,
"ods_sum": -62186.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_refund_ex",
"ods_table": "billiards_ods.refund_transactions",
"count": {
"dwd": 11,
"ods": 11,
"diff": 0
},
"amounts": [
{
"column": "balance_frozen_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "card_frozen_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "refund_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "round_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
}
],
"note": "Row-count and amount reconciliation; amount fields are found by automatically scanning numeric columns whose names contain amount/money/fee/balance."
}
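
A report in this shape can be checked mechanically. The following is a minimal sketch, not part of the commit: it assumes the per-table entries shown above are passed in as a plain list of dicts (the name of the enclosing key is not visible in this excerpt), and `find_mismatches` with its `tolerance` parameter is a hypothetical helper.

```python
from typing import Any, Dict, List


def find_mismatches(entries: List[Dict[str, Any]], tolerance: float = 0.005) -> List[str]:
    """Return human-readable descriptions of row-count or amount differences."""
    problems: List[str] = []
    for entry in entries:
        table = entry["dwd_table"]
        count = entry["count"]
        # Row counts must match exactly between DWD and ODS.
        if count["dwd"] != count["ods"]:
            problems.append(f"{table}: row count dwd={count['dwd']} ods={count['ods']}")
        # Amount sums are floats, so compare them with a small tolerance.
        for amount in entry.get("amounts", []):
            if abs(amount["dwd_sum"] - amount["ods_sum"]) > tolerance:
                problems.append(
                    f"{table}.{amount['column']}: "
                    f"dwd_sum={amount['dwd_sum']} ods_sum={amount['ods_sum']}"
                )
    return problems


# One entry copied from the report above.
sample = [
    {
        "dwd_table": "billiards_dwd.dwd_refund",
        "ods_table": "billiards_ods.refund_transactions",
        "count": {"dwd": 11, "ods": 11, "diff": 0},
        "amounts": [
            {"column": "pay_amount", "dwd_sum": -62186.0, "ods_sum": -62186.0, "diff": 0.0}
        ],
    }
]
print(find_mismatches(sample))
```

Recomputing the differences from `dwd`/`ods` and `dwd_sum`/`ods_sum`, rather than trusting the stored `diff` fields, keeps the check independent of how the report was generated.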

etl_billiards/run_ods.bat Normal file

@@ -0,0 +1,27 @@
@echo off
REM -*- coding: utf-8 -*-
REM Purpose: rebuild ODS in one step (runs INIT_ODS_SCHEMA), then ingest the sample JSON (runs MANUAL_INGEST)
REM Configuration: PG_DSN and INGEST_SOURCE_DIR from .env, or overridden via arguments
setlocal
cd /d %~dp0
REM To use a different sample directory, edit INGEST_DIR below
set "INGEST_DIR=C:\dev\LLTQ\export\test-json-doc"
echo [INIT_ODS_SCHEMA] starting, source dir=%INGEST_DIR%
python -m cli.main --tasks INIT_ODS_SCHEMA --pipeline-flow INGEST_ONLY --ingest-source "%INGEST_DIR%"
if errorlevel 1 (
    echo INIT_ODS_SCHEMA failed, exiting
    exit /b 1
)
echo [MANUAL_INGEST] starting, source dir=%INGEST_DIR%
python -m cli.main --tasks MANUAL_INGEST --pipeline-flow INGEST_ONLY --ingest-source "%INGEST_DIR%"
if errorlevel 1 (
    echo MANUAL_INGEST failed, exiting
    exit /b 1
)
echo All done.
endlocal
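
For hosts without `cmd.exe`, the same two steps could be driven from Python. This is a hedged sketch rather than part of the commit: `build_command` and `run_ods_rebuild` are hypothetical helpers, and the only CLI flags used are the ones that appear in the batch file above (`--tasks`, `--pipeline-flow`, `--ingest-source`). Like the batch file, it assumes the working directory is the one containing the `cli` package.

```python
import subprocess
import sys
from typing import List


def build_command(task: str, ingest_dir: str) -> List[str]:
    """Build the argv for one CLI invocation, mirroring the batch file."""
    return [
        sys.executable, "-m", "cli.main",
        "--tasks", task,
        "--pipeline-flow", "INGEST_ONLY",
        "--ingest-source", ingest_dir,
    ]


def run_ods_rebuild(ingest_dir: str) -> int:
    """Run INIT_ODS_SCHEMA, then MANUAL_INGEST; stop at the first failure."""
    for task in ("INIT_ODS_SCHEMA", "MANUAL_INGEST"):
        print(f"[{task}] starting, source dir={ingest_dir}")
        result = subprocess.run(build_command(task, ingest_dir))
        if result.returncode != 0:
            print(f"{task} failed, exiting")
            return 1
    print("All done.")
    return 0
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.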


@@ -1,4 +1,4 @@
-# -*- coding: utf-8 -*-
+# -*- coding: utf-8 -*-
"""Populate PRD DWD tables from ODS payload snapshots."""
from __future__ import annotations
@@ -16,9 +16,9 @@ SQL_STEPS: list[tuple[str, str]] = [
INSERT INTO billiards_dwd.dim_tenant (tenant_id, tenant_name, status)
SELECT DISTINCT tenant_id, 'default' AS tenant_name, 'active' AS status
FROM (
-SELECT tenant_id FROM billiards_ods.ods_order_settle
+SELECT tenant_id FROM billiards_ods.settlement_records
UNION SELECT tenant_id FROM billiards_ods.ods_order_receipt_detail
-UNION SELECT tenant_id FROM billiards_ods.ods_member_profile
+UNION SELECT tenant_id FROM billiards_ods.member_profiles
) s
WHERE tenant_id IS NOT NULL
ON CONFLICT (tenant_id) DO UPDATE SET updated_at = now();
@@ -30,7 +30,7 @@ SQL_STEPS: list[tuple[str, str]] = [
INSERT INTO billiards_dwd.dim_site (site_id, tenant_id, site_name, status)
SELECT DISTINCT site_id, MAX(tenant_id) AS tenant_id, 'default' AS site_name, 'active' AS status
FROM (
-SELECT site_id, tenant_id FROM billiards_ods.ods_order_settle
+SELECT site_id, tenant_id FROM billiards_ods.settlement_records
UNION SELECT site_id, tenant_id FROM billiards_ods.ods_order_receipt_detail
UNION SELECT site_id, tenant_id FROM billiards_ods.ods_table_info
) s
@@ -84,7 +84,7 @@ SQL_STEPS: list[tuple[str, str]] = [
"""
INSERT INTO billiards_dwd.dim_member_card_type (card_type_id, card_type_name, discount_rate)
SELECT DISTINCT card_type_id, card_type_name, discount_rate
-FROM billiards_ods.ods_member_card
+FROM billiards_ods.member_stored_value_cards
WHERE card_type_id IS NOT NULL
ON CONFLICT (card_type_id) DO UPDATE SET
card_type_name = EXCLUDED.card_type_name,
@@ -119,10 +119,10 @@ SQL_STEPS: list[tuple[str, str]] = [
prof.wechat_id,
prof.alipay_id,
prof.remarks
-FROM billiards_ods.ods_member_profile prof
+FROM billiards_ods.member_profiles prof
LEFT JOIN (
SELECT DISTINCT site_id, member_id, card_type_id AS member_type_id, card_type_name AS member_type_name
-FROM billiards_ods.ods_member_card
+FROM billiards_ods.member_stored_value_cards
) card
ON prof.site_id = card.site_id AND prof.member_id = card.member_id
WHERE prof.member_id IS NOT NULL
@@ -167,7 +167,7 @@ SQL_STEPS: list[tuple[str, str]] = [
"""
INSERT INTO billiards_dwd.dim_assistant (assistant_id, assistant_name, mobile, status)
SELECT DISTINCT assistant_id, assistant_name, mobile, status
-FROM billiards_ods.ods_assistant_account
+FROM billiards_ods.assistant_accounts_master
WHERE assistant_id IS NOT NULL
ON CONFLICT (assistant_id) DO UPDATE SET
assistant_name = EXCLUDED.assistant_name,
@@ -181,7 +181,7 @@ SQL_STEPS: list[tuple[str, str]] = [
"""
INSERT INTO billiards_dwd.dim_pay_method (pay_method_code, pay_method_name, is_stored_value, status)
SELECT DISTINCT pay_method_code, pay_method_name, FALSE AS is_stored_value, 'active' AS status
-FROM billiards_ods.ods_payment_record
+FROM billiards_ods.payment_transactions
WHERE pay_method_code IS NOT NULL
ON CONFLICT (pay_method_code) DO UPDATE SET
pay_method_name = EXCLUDED.pay_method_name,
@@ -250,7 +250,7 @@ SQL_STEPS: list[tuple[str, str]] = [
final_table_fee,
FALSE AS is_canceled,
NULL::TIMESTAMPTZ AS cancel_time
-FROM billiards_ods.ods_table_use_log
+FROM billiards_ods.table_fee_transactions_log
ON CONFLICT (site_id, ledger_id) DO NOTHING;
""",
),
@@ -325,7 +325,7 @@ SQL_STEPS: list[tuple[str, str]] = [
pay_time,
relate_type,
relate_id
-FROM billiards_ods.ods_payment_record
+FROM billiards_ods.payment_transactions
ON CONFLICT (site_id, pay_id) DO NOTHING;
""",
),
@@ -346,7 +346,7 @@ SQL_STEPS: list[tuple[str, str]] = [
refund_amount,
refund_time,
status
-FROM billiards_ods.ods_refund_record
+FROM billiards_ods.refund_transactions
ON CONFLICT (site_id, refund_id) DO NOTHING;
""",
),
@@ -369,7 +369,7 @@ SQL_STEPS: list[tuple[str, str]] = [
balance_before,
balance_after,
change_time
-FROM billiards_ods.ods_balance_change
+FROM billiards_ods.member_balance_changes
ON CONFLICT (site_id, change_id) DO NOTHING;
""",
),
@@ -423,3 +423,4 @@ def main() -> int:
if __name__ == "__main__":
raise SystemExit(main())
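
The hunks above only rename the ODS tables referenced inside `SQL_STEPS`; how `main()` executes the steps is not shown in this diff. As a rough sketch, assuming each tuple is `(step_name, sql)` as the `list[tuple[str, str]]` annotation suggests, the steps could be run in order like this. `run_steps`, `RecordingCursor`, and the demo statements are hypothetical and stand in for a real DB-API cursor and the real SQL.

```python
from typing import List, Tuple


class RecordingCursor:
    """Stand-in for a DB-API cursor that only records what it was asked to run."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def execute(self, sql: str) -> None:
        self.executed.append(sql.strip())


def run_steps(cursor, steps: List[Tuple[str, str]]) -> List[str]:
    """Execute each (name, sql) step in order and return the names that ran."""
    done: List[str] = []
    for name, sql in steps:
        cursor.execute(sql)
        done.append(name)
    return done


# Placeholder statements; the real steps are the INSERT ... ON CONFLICT blocks above.
demo_steps = [
    ("dim_tenant", "INSERT INTO demo_dim_tenant SELECT 1 ON CONFLICT DO NOTHING;"),
    ("dim_site", "INSERT INTO demo_dim_site SELECT 1 ON CONFLICT DO NOTHING;"),
]
cur = RecordingCursor()
print(run_steps(cur, demo_steps))
```

Because every statement uses `ON CONFLICT`, re-running the whole list should be safe, which is what makes a simple in-order loop sufficient here.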


@@ -0,0 +1,117 @@
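# -*- coding: utf-8 -*-
"""
ODS JSON field check script: compares the columns of the ODS tables in the current database against the sample JSON files (default directory C:\\dev\\LLTQ\\export\\test-json-doc),
checks whether each column has a same-named JSON key, and prints the unmatched columns per table so that mappings can be added or the absence of a source field confirmed.
Usage:
    set PG_DSN=postgresql://... # as configured in .env
    python -m etl_billiards.scripts.check_ods_json_vs_table
"""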
from __future__ import annotations
import json
import os
import pathlib
from typing import Dict, Iterable, Set, Tuple
import psycopg2
from etl_billiards.tasks.manual_ingest_task import ManualIngestTask
def _flatten_keys(obj, prefix: str = "") -> Set[str]:
    """Recursively expand every key path in a JSON value, returning a set such as data.assistantInfos.id. List indexes are not kept; lists are simply descended into."""
    keys: Set[str] = set()
    if isinstance(obj, dict):
        for k, v in obj.items():
            new_prefix = f"{prefix}.{k}" if prefix else k
            keys.add(new_prefix)
            keys |= _flatten_keys(v, new_prefix)
    elif isinstance(obj, list):
        for item in obj:
            keys |= _flatten_keys(item, prefix)
    return keys
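def _load_json_keys(path: pathlib.Path) -> Tuple[Set[str], dict[str, Set[str]]]:
    """Read one JSON file and return its flattened key paths plus a map from last path segment to full paths. Returns empty results if the file is missing or cannot be parsed."""
    if not path.exists():
        return set(), {}
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # json.JSONDecodeError and UnicodeDecodeError are both ValueError subclasses.
        return set(), {}
    paths = _flatten_keys(data)
    last_map: dict[str, Set[str]] = {}
    for p in paths:
        last = p.split(".")[-1].lower()
        last_map.setdefault(last, set()).add(p)
    return paths, last_map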
def _load_ods_columns(dsn: str) -> Dict[str, Set[str]]:
    """Read the column names of every billiards_ods.* table from the database, grouped by table."""
    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT table_name, column_name
                FROM information_schema.columns
                WHERE table_schema='billiards_ods'
                ORDER BY table_name, ordinal_position
                """
            )
            result: Dict[str, Set[str]] = {}
            for table, col in cur.fetchall():
                result.setdefault(table, set()).add(col.lower())
    finally:
        # Close the connection even if the query fails.
        conn.close()
    return result
def main() -> None:
    """Main flow: for each ODS table in FILE_MAPPING, check JSON key coverage and print a report."""
    dsn = os.environ.get("PG_DSN")
    if not dsn:
        raise SystemExit("PG_DSN is not set")
    json_dir = pathlib.Path(os.environ.get("JSON_DOC_DIR", r"C:\dev\LLTQ\export\test-json-doc"))
    ods_cols_map = _load_ods_columns(dsn)
    print(f"JSON directory: {json_dir}")
    print(f"Connection DSN: {dsn}")
    print("=" * 80)
    for keywords, ods_table in ManualIngestTask.FILE_MAPPING:
        table = ods_table.split(".")[-1]
        cols = ods_cols_map.get(table, set())
        file_name = f"{keywords[0]}.json"
        file_path = json_dir / file_name
        keys_full, path_map = _load_json_keys(file_path)
        key_last_parts = set(path_map.keys())
        missing: Set[str] = set()
        extra_keys: Set[str] = set()
        present: Set[str] = set()
        for col in sorted(cols):
            if col in key_last_parts:
                present.add(col)
            else:
                missing.add(col)
        for k in key_last_parts:
            if k not in cols:
                extra_keys.add(k)
        print(f"[{table}] file={file_name} columns={len(cols)} JSON key (last segment) coverage={len(present)}/{len(cols)}")
        if missing:
            print(" Unmatched columns:", ", ".join(sorted(missing)))
        else:
            print(" Unmatched columns: none")
        if extra_keys:
            extras = []
            for k in sorted(extra_keys):
                paths = ", ".join(sorted(path_map.get(k, [])))
                extras.append(f"{k} ({paths})")
            print(" JSON only (no such table column):", "; ".join(extras))
        else:
            print(" JSON only (no such table column): none")
        print("-" * 80)
if __name__ == "__main__":
main()
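
The matching rule used by this script (compare lower-cased last path segments of the JSON against table column names) can be tried without a database. The sketch below re-implements the flattening step in a self-contained form; `coverage`, the sample payload, and the column set are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, Set


def flatten_keys(obj: Any, prefix: str = "") -> Set[str]:
    """Collect dotted key paths; list indexes are dropped, as in the script above."""
    keys: Set[str] = set()
    if isinstance(obj, dict):
        for k, v in obj.items():
            path = f"{prefix}.{k}" if prefix else k
            keys.add(path)
            keys |= flatten_keys(v, path)
    elif isinstance(obj, list):
        for item in obj:
            keys |= flatten_keys(item, prefix)
    return keys


def coverage(columns: Set[str], payload: Any) -> Dict[str, Set[str]]:
    """Split names into matched columns, unmatched columns, and JSON-only keys."""
    last_segments = {p.split(".")[-1].lower() for p in flatten_keys(payload)}
    return {
        "present": columns & last_segments,
        "missing": columns - last_segments,
        "extra": last_segments - columns,
    }


# Hypothetical payload and column names.
payload = {"data": {"assistantInfos": [{"id": 1, "nickName": "A"}]}}
result = coverage({"id", "nickname", "site_id"}, payload)
print(sorted(result["missing"]))
```

Matching on the last segment only means container keys such as `data` show up as JSON-only entries, and two different paths ending in the same name count as a hit for one column; the per-key path listing in the report helps spot such cases.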


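Each entry in the `FACT_MAPPINGS` table of the file that follows is a `(dwd_column, source_expression, cast)` triple. How the task turns these triples into SQL is not visible in this excerpt; the sketch below shows one plausible rendering, assuming a PostgreSQL `CAST(... AS type)` wrapper. `render_select_list` is a hypothetical helper, not a method of `DwdLoadTask`.

```python
from typing import List, Optional, Sequence, Tuple

Mapping = Tuple[str, str, Optional[str]]


def render_select_list(mappings: Sequence[Mapping]) -> str:
    """Render (dwd_col, source_expr, cast) triples as a SQL select list."""
    parts: List[str] = []
    for dwd_col, source_expr, cast_type in mappings:
        # Wrap the source expression in CAST only when a target type is given.
        expr = f"CAST({source_expr} AS {cast_type})" if cast_type else source_expr
        parts.append(f'{expr} AS "{dwd_col}"')
    return ", ".join(parts)


# The two dwd_payment entries from FACT_MAPPINGS.
payment_mappings: List[Mapping] = [
    ("payment_id", "id", None),
    ("pay_date", "pay_time", "date"),
]
print(render_select_list(payment_mappings))
```

Keeping the source side as a free-form expression is what lets the same structure carry JSON lookups such as `siteprofile->>'org_id'` and `CASE` expressions alongside plain column renames.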
@@ -0,0 +1,907 @@
# -*- coding: utf-8 -*-
"""DWD load task: incrementally writes ODS data into DWD (SCD2 for dimensions, time-based increments for facts)."""
from __future__ import annotations
from datetime import datetime
from typing import Any, Dict, Iterable, List, Sequence
from psycopg2.extras import RealDictCursor
from .base_task import BaseTask, TaskContext
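class DwdLoadTask(BaseTask):
"""Handles DWD loading: dimension tables are merged with SCD2, fact tables are written incrementally by time."""
# DWD -> ODS table mapping (ODS table names now match the sample JSON file prefixes)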
TABLE_MAP: dict[str, str] = {
# Dimensions
# Site: now sourced from the siteprofile snapshot in the table-fee transactions, filling in org/address fields
"billiards_dwd.dim_site": "billiards_ods.table_fee_transactions",
"billiards_dwd.dim_site_ex": "billiards_ods.table_fee_transactions",
"billiards_dwd.dim_table": "billiards_ods.site_tables_master",
"billiards_dwd.dim_table_ex": "billiards_ods.site_tables_master",
"billiards_dwd.dim_assistant": "billiards_ods.assistant_accounts_master",
"billiards_dwd.dim_assistant_ex": "billiards_ods.assistant_accounts_master",
"billiards_dwd.dim_member": "billiards_ods.member_profiles",
"billiards_dwd.dim_member_ex": "billiards_ods.member_profiles",
"billiards_dwd.dim_member_card_account": "billiards_ods.member_stored_value_cards",
"billiards_dwd.dim_member_card_account_ex": "billiards_ods.member_stored_value_cards",
"billiards_dwd.dim_tenant_goods": "billiards_ods.tenant_goods_master",
"billiards_dwd.dim_tenant_goods_ex": "billiards_ods.tenant_goods_master",
"billiards_dwd.dim_store_goods": "billiards_ods.store_goods_master",
"billiards_dwd.dim_store_goods_ex": "billiards_ods.store_goods_master",
"billiards_dwd.dim_goods_category": "billiards_ods.stock_goods_category_tree",
"billiards_dwd.dim_groupbuy_package": "billiards_ods.group_buy_packages",
"billiards_dwd.dim_groupbuy_package_ex": "billiards_ods.group_buy_packages",
# Facts
"billiards_dwd.dwd_settlement_head": "billiards_ods.settlement_records",
"billiards_dwd.dwd_settlement_head_ex": "billiards_ods.settlement_records",
"billiards_dwd.dwd_table_fee_log": "billiards_ods.table_fee_transactions",
"billiards_dwd.dwd_table_fee_log_ex": "billiards_ods.table_fee_transactions",
"billiards_dwd.dwd_table_fee_adjust": "billiards_ods.table_fee_discount_records",
"billiards_dwd.dwd_table_fee_adjust_ex": "billiards_ods.table_fee_discount_records",
"billiards_dwd.dwd_store_goods_sale": "billiards_ods.store_goods_sales_records",
"billiards_dwd.dwd_store_goods_sale_ex": "billiards_ods.store_goods_sales_records",
"billiards_dwd.dwd_assistant_service_log": "billiards_ods.assistant_service_records",
"billiards_dwd.dwd_assistant_service_log_ex": "billiards_ods.assistant_service_records",
"billiards_dwd.dwd_assistant_trash_event": "billiards_ods.assistant_cancellation_records",
"billiards_dwd.dwd_assistant_trash_event_ex": "billiards_ods.assistant_cancellation_records",
"billiards_dwd.dwd_member_balance_change": "billiards_ods.member_balance_changes",
"billiards_dwd.dwd_member_balance_change_ex": "billiards_ods.member_balance_changes",
"billiards_dwd.dwd_groupbuy_redemption": "billiards_ods.group_buy_redemption_records",
"billiards_dwd.dwd_groupbuy_redemption_ex": "billiards_ods.group_buy_redemption_records",
"billiards_dwd.dwd_platform_coupon_redemption": "billiards_ods.platform_coupon_redemption_records",
"billiards_dwd.dwd_platform_coupon_redemption_ex": "billiards_ods.platform_coupon_redemption_records",
"billiards_dwd.dwd_recharge_order": "billiards_ods.recharge_settlements",
"billiards_dwd.dwd_recharge_order_ex": "billiards_ods.recharge_settlements",
"billiards_dwd.dwd_payment": "billiards_ods.payment_transactions",
"billiards_dwd.dwd_refund": "billiards_ods.refund_transactions",
"billiards_dwd.dwd_refund_ex": "billiards_ods.refund_transactions",
}
SCD_COLS = {"scd2_start_time", "scd2_end_time", "scd2_is_current", "scd2_version"}
FACT_ORDER_CANDIDATES = [
"fetched_at",
"pay_time",
"create_time",
"update_time",
"occur_time",
"settle_time",
"start_use_time",
]
# Special column mappings: DWD column name -> source column expression (optional CAST)
FACT_MAPPINGS: dict[str, list[tuple[str, str, str | None]]] = {
# Dimension tables (fill in primary-key/field differences)
"billiards_dwd.dim_site": [
("org_id", "siteprofile->>'org_id'", None),
("shop_name", "siteprofile->>'shop_name'", None),
("site_label", "siteprofile->>'site_label'", None),
("full_address", "siteprofile->>'full_address'", None),
("address", "siteprofile->>'address'", None),
("longitude", "siteprofile->>'longitude'", "numeric"),
("latitude", "siteprofile->>'latitude'", "numeric"),
("tenant_site_region_id", "siteprofile->>'tenant_site_region_id'", None),
("business_tel", "siteprofile->>'business_tel'", None),
("site_type", "siteprofile->>'site_type'", None),
("shop_status", "siteprofile->>'shop_status'", None),
("tenant_id", "siteprofile->>'tenant_id'", None),
],
"billiards_dwd.dim_site_ex": [
("auto_light", "siteprofile->>'auto_light'", None),
("attendance_enabled", "siteprofile->>'attendance_enabled'", None),
("attendance_distance", "siteprofile->>'attendance_distance'", None),
("prod_env", "siteprofile->>'prod_env'", None),
("light_status", "siteprofile->>'light_status'", None),
("light_type", "siteprofile->>'light_type'", None),
("light_token", "siteprofile->>'light_token'", None),
("address", "siteprofile->>'address'", None),
("avatar", "siteprofile->>'avatar'", None),
("wifi_name", "siteprofile->>'wifi_name'", None),
("wifi_password", "siteprofile->>'wifi_password'", None),
("customer_service_qrcode", "siteprofile->>'customer_service_qrcode'", None),
("customer_service_wechat", "siteprofile->>'customer_service_wechat'", None),
("fixed_pay_qrcode", "siteprofile->>'fixed_pay_qrCode'", None),
("longitude", "siteprofile->>'longitude'", "numeric"),
("latitude", "siteprofile->>'latitude'", "numeric"),
("tenant_site_region_id", "siteprofile->>'tenant_site_region_id'", None),
("site_type", "siteprofile->>'site_type'", None),
("site_label", "siteprofile->>'site_label'", None),
("shop_status", "siteprofile->>'shop_status'", None),
("create_time", "siteprofile->>'create_time'", "timestamptz"),
("update_time", "siteprofile->>'update_time'", "timestamptz"),
],
"billiards_dwd.dim_table": [
("table_id", "id", None),
("site_table_area_name", "areaname", None),
("tenant_table_area_id", "site_table_area_id", None),
],
"billiards_dwd.dim_table_ex": [
("table_id", "id", None),
("table_cloth_use_time", "table_cloth_use_time", None),
],
"billiards_dwd.dim_assistant": [("assistant_id", "id", None), ("user_id", "staff_id", None)],
"billiards_dwd.dim_assistant_ex": [
("assistant_id", "id", None),
("introduce", "introduce", None),
("group_name", "group_name", None),
("light_equipment_id", "light_equipment_id", None),
],
"billiards_dwd.dim_member": [("member_id", "id", None)],
"billiards_dwd.dim_member_ex": [
("member_id", "id", None),
("register_site_name", "site_name", None),
],
"billiards_dwd.dim_member_card_account": [("member_card_id", "id", None)],
"billiards_dwd.dim_member_card_account_ex": [
("member_card_id", "id", None),
("tenant_name", "tenantname", None),
("tenantavatar", "tenantavatar", None),
("card_no", "card_no", None),
("bind_password", "bind_password", None),
("use_scene", "use_scene", None),
("tableareaid", "tableareaid", None),
("goodscategoryid", "goodscategoryid", None),
],
"billiards_dwd.dim_tenant_goods": [
("tenant_goods_id", "id", None),
("category_name", "categoryname", None),
],
"billiards_dwd.dim_tenant_goods_ex": [
("tenant_goods_id", "id", None),
("remark_name", "remark_name", None),
("goods_bar_code", "goods_bar_code", None),
("commodity_code_list", "commodity_code", None),
("is_in_site", "isinsite", "boolean"),
],
"billiards_dwd.dim_store_goods": [
("site_goods_id", "id", None),
("category_level1_name", "onecategoryname", None),
("category_level2_name", "twocategoryname", None),
("created_at", "create_time", None),
("updated_at", "update_time", None),
("avg_monthly_sales", "average_monthly_sales", None),
("batch_stock_qty", "stock", None),
("sale_qty", "sale_num", None),
("total_sales_qty", "total_sales", None),
],
"billiards_dwd.dim_store_goods_ex": [
("site_goods_id", "id", None),
("goods_barcode", "goods_bar_code", None),
("stock_qty", "stock", None),
("stock_secondary_qty", "stock_a", None),
("safety_stock_qty", "safe_stock", None),
("site_name", "sitename", None),
("goods_cover_url", "goods_cover", None),
("provisional_total_cost", "total_purchase_cost", None),
("is_discountable", "able_discount", None),
("freeze_status", "freeze", None),
("remark", "remark", None),
("days_on_shelf", "days_available", None),
("sort_order", "sort", None),
],
"billiards_dwd.dim_goods_category": [
("category_id", "id", None),
("tenant_id", "tenant_id", None),
("category_name", "category_name", None),
("alias_name", "alias_name", None),
("parent_category_id", "pid", None),
("business_name", "business_name", None),
("tenant_goods_business_id", "tenant_goods_business_id", None),
("sort_order", "sort", None),
("open_salesman", "open_salesman", None),
("is_warehousing", "is_warehousing", None),
("category_level", "CASE WHEN pid = 0 THEN 1 ELSE 2 END", None),
("is_leaf", "CASE WHEN categoryboxes IS NULL OR jsonb_array_length(categoryboxes)=0 THEN 1 ELSE 0 END", None),
],
"billiards_dwd.dim_groupbuy_package": [
("groupbuy_package_id", "id", None),
("package_template_id", "package_id", None),
("coupon_face_value", "coupon_money", None),
("duration_seconds", "duration", None),
],
"billiards_dwd.dim_groupbuy_package_ex": [
("groupbuy_package_id", "id", None),
("table_area_id", "table_area_id", None),
("tenant_table_area_id", "tenant_table_area_id", None),
("usable_range", "usable_range", None),
("table_area_id_list", "table_area_id_list", None),
("package_type", "type", None),
],
# Fact tables: primary keys and key differing columns
"billiards_dwd.dwd_table_fee_log": [("table_fee_log_id", "id", None)],
"billiards_dwd.dwd_table_fee_log_ex": [
("table_fee_log_id", "id", None),
("salesman_name", "salesman_name", None),
],
"billiards_dwd.dwd_table_fee_adjust": [
("table_fee_adjust_id", "id", None),
("table_id", "site_table_id", None),
("table_area_id", "tenant_table_area_id", None),
("table_area_name", "tableprofile->>'table_area_name'", None),
("adjust_time", "create_time", None),
],
"billiards_dwd.dwd_table_fee_adjust_ex": [
("table_fee_adjust_id", "id", None),
("ledger_name", "ledger_name", None),
],
"billiards_dwd.dwd_store_goods_sale": [("store_goods_sale_id", "id", None), ("discount_price", "discount_money", None)],
"billiards_dwd.dwd_store_goods_sale_ex": [
("store_goods_sale_id", "id", None),
("option_value_name", "option_value_name", None),
("open_salesman_flag", "opensalesman", "integer"),
("salesman_name", "salesman_name", None),
("salesman_org_id", "sales_man_org_id", None),
("legacy_order_goods_id", "ordergoodsid", None),
("site_name", "sitename", None),
("legacy_site_id", "siteid", None),
],
"billiards_dwd.dwd_assistant_service_log": [
("assistant_service_id", "id", None),
("assistant_no", "assistantno", None),
("site_assistant_id", "order_assistant_id", None),
("level_name", "levelname", None),
("skill_name", "skillname", None),
],
"billiards_dwd.dwd_assistant_service_log_ex": [
("assistant_service_id", "id", None),
("assistant_name", "assistantname", None),
("ledger_group_name", "ledger_group_name", None),
("trash_applicant_name", "trash_applicant_name", None),
("trash_reason", "trash_reason", None),
("salesman_name", "salesman_name", None),
("table_name", "tablename", None),
],
"billiards_dwd.dwd_assistant_trash_event": [
("assistant_trash_event_id", "id", None),
("assistant_no", "assistantname", None),
("abolish_amount", "assistantabolishamount", None),
("charge_minutes_raw", "pdchargeminutes", None),
("site_id", "siteid", None),
("table_id", "tableid", None),
("table_area_id", "tableareaid", None),
("assistant_name", "assistantname", None),
("trash_reason", "trashreason", None),
("create_time", "createtime", None),
],
"billiards_dwd.dwd_assistant_trash_event_ex": [
("assistant_trash_event_id", "id", None),
("table_area_name", "tablearea", None),
("table_name", "tablename", None),
],
"billiards_dwd.dwd_member_balance_change": [
("balance_change_id", "id", None),
("balance_before", "before", None),
("change_amount", "account_data", None),
("balance_after", "after", None),
("card_type_name", "membercardtypename", None),
("change_time", "create_time", None),
("member_name", "membername", None),
("member_mobile", "membermobile", None),
],
"billiards_dwd.dwd_member_balance_change_ex": [
("balance_change_id", "id", None),
("pay_site_name", "paysitename", None),
("register_site_name", "registersitename", None),
],
"billiards_dwd.dwd_groupbuy_redemption": [("redemption_id", "id", None)],
"billiards_dwd.dwd_groupbuy_redemption_ex": [
("redemption_id", "id", None),
("table_area_name", "tableareaname", None),
("site_name", "sitename", None),
("table_name", "tablename", None),
("goods_option_price", "goodsoptionprice", None),
("salesman_name", "salesman_name", None),
("salesman_org_id", "sales_man_org_id", None),
("ledger_group_name", "ledger_group_name", None),
],
"billiards_dwd.dwd_platform_coupon_redemption": [("platform_coupon_redemption_id", "id", None)],
"billiards_dwd.dwd_platform_coupon_redemption_ex": [
("platform_coupon_redemption_id", "id", None),
("coupon_cover", "coupon_cover", None),
],
"billiards_dwd.dwd_payment": [("payment_id", "id", None), ("pay_date", "pay_time", "date")],
"billiards_dwd.dwd_refund": [("refund_id", "id", None)],
"billiards_dwd.dwd_refund_ex": [
("refund_id", "id", None),
("tenant_name", "tenantname", None),
("channel_payer_id", "channel_payer_id", None),
("channel_pay_no", "channel_pay_no", None),
],
# Settlement header (settlement_records: source columns are lowercase with no underscores, so explicit mapping is required)
"billiards_dwd.dwd_settlement_head": [
("order_settle_id", "id", None),
("tenant_id", "tenantid", None),
("site_id", "siteid", None),
("site_name", "sitename", None),
("table_id", "tableid", None),
("settle_name", "settlename", None),
("order_trade_no", "settlerelateid", None),
("create_time", "createtime", None),
("pay_time", "paytime", None),
("settle_type", "settletype", None),
("revoke_order_id", "revokeorderid", None),
("member_id", "memberid", None),
("member_name", "membername", None),
("member_phone", "memberphone", None),
("member_card_account_id", "tenantmembercardid", None),
("member_card_type_name", "membercardtypename", None),
("is_bind_member", "isbindmember", None),
("member_discount_amount", "memberdiscountamount", None),
("consume_money", "consumemoney", None),
("table_charge_money", "tablechargemoney", None),
("goods_money", "goodsmoney", None),
("real_goods_money", "realgoodsmoney", None),
("assistant_pd_money", "assistantpdmoney", None),
("assistant_cx_money", "assistantcxmoney", None),
("adjust_amount", "adjustamount", None),
("pay_amount", "payamount", None),
("balance_amount", "balanceamount", None),
("recharge_card_amount", "rechargecardamount", None),
("gift_card_amount", "giftcardamount", None),
("coupon_amount", "couponamount", None),
("rounding_amount", "roundingamount", None),
("point_amount", "pointamount", None),
],
"billiards_dwd.dwd_settlement_head_ex": [
("order_settle_id", "id", None),
("serial_number", "serialnumber", None),
("settle_status", "settlestatus", None),
("can_be_revoked", "canberevoked", "boolean"),
("revoke_order_name", "revokeordername", None),
("revoke_time", "revoketime", None),
("is_first_order", "isfirst", "boolean"),
("service_money", "servicemoney", None),
("cash_amount", "cashamount", None),
("card_amount", "cardamount", None),
("online_amount", "onlineamount", None),
("refund_amount", "refundamount", None),
("prepay_money", "prepaymoney", None),
("payment_method", "paymentmethod", None),
("coupon_sale_amount", "couponsaleamount", None),
("all_coupon_discount", "allcoupondiscount", None),
("goods_promotion_money", "goodspromotionmoney", None),
("assistant_promotion_money", "assistantpromotionmoney", None),
("activity_discount", "activitydiscount", None),
("assistant_manual_discount", "assistantmanualdiscount", None),
("point_discount_price", "pointdiscountprice", None),
("point_discount_cost", "pointdiscountcost", None),
("is_use_coupon", "isusecoupon", "boolean"),
("is_use_discount", "isusediscount", "boolean"),
("is_activity", "isactivity", "boolean"),
("operator_name", "operatorname", None),
("salesman_name", "salesmanname", None),
("order_remark", "orderremark", None),
("operator_id", "operatorid", None),
("salesman_user_id", "salesmanuserid", None),
],
# Recharge settlements (recharge_settlements: same field style as settlement_records)
"billiards_dwd.dwd_recharge_order": [
("recharge_order_id", "id", None),
("tenant_id", "tenantid", None),
("site_id", "siteid", None),
("member_id", "memberid", None),
("member_name_snapshot", "membername", None),
("member_phone_snapshot", "memberphone", None),
("tenant_member_card_id", "tenantmembercardid", None),
("member_card_type_name", "membercardtypename", None),
("settle_relate_id", "settlerelateid", None),
("settle_type", "settletype", None),
("settle_name", "settlename", None),
("is_first", "isfirst", None),
("pay_amount", "payamount", None),
("refund_amount", "refundamount", None),
("point_amount", "pointamount", None),
("cash_amount", "cashamount", None),
("payment_method", "paymentmethod", None),
("create_time", "createtime", None),
("pay_time", "paytime", None),
],
"billiards_dwd.dwd_recharge_order_ex": [
("recharge_order_id", "id", None),
("site_name_snapshot", "sitename", None),
("salesman_name", "salesmanname", None),
("order_remark", "orderremark", None),
("revoke_order_name", "revokeordername", None),
("settle_status", "settlestatus", None),
("is_bind_member", "isbindmember", "boolean"),
("is_activity", "isactivity", "boolean"),
("is_use_coupon", "isusecoupon", "boolean"),
("is_use_discount", "isusediscount", "boolean"),
("can_be_revoked", "canberevoked", "boolean"),
("online_amount", "onlineamount", None),
("balance_amount", "balanceamount", None),
("card_amount", "cardamount", None),
("coupon_amount", "couponamount", None),
("recharge_card_amount", "rechargecardamount", None),
("gift_card_amount", "giftcardamount", None),
("prepay_money", "prepaymoney", None),
("consume_money", "consumemoney", None),
("goods_money", "goodsmoney", None),
("real_goods_money", "realgoodsmoney", None),
("table_charge_money", "tablechargemoney", None),
("service_money", "servicemoney", None),
("activity_discount", "activitydiscount", None),
("all_coupon_discount", "allcoupondiscount", None),
("goods_promotion_money", "goodspromotionmoney", None),
("assistant_promotion_money", "assistantpromotionmoney", None),
("assistant_pd_money", "assistantpdmoney", None),
("assistant_cx_money", "assistantcxmoney", None),
("assistant_manual_discount", "assistantmanualdiscount", None),
("coupon_sale_amount", "couponsaleamount", None),
("member_discount_amount", "memberdiscountamount", None),
("point_discount_price", "pointdiscountprice", None),
("point_discount_cost", "pointdiscountcost", None),
("adjust_amount", "adjustamount", None),
("rounding_amount", "roundingamount", None),
("operator_id", "operatorid", None),
("operator_name_snapshot", "operatorname", None),
("salesman_user_id", "salesmanuserid", None),
("salesman_name", "salesmanname", None),
("order_remark", "orderremark", None),
("table_id", "tableid", None),
("serial_number", "serialnumber", None),
("revoke_order_id", "revokeorderid", None),
("revoke_order_name", "revokeordername", None),
("revoke_time", "revoketime", None),
],
}
    def get_task_code(self) -> str:
        """Return the task code."""
        return "DWD_LOAD_FROM_ODS"

    def extract(self, context: TaskContext) -> dict[str, Any]:
        """Prepare the context needed for this run."""
        return {"now": datetime.now()}

    def load(self, extracted: dict[str, Any], context: TaskContext) -> dict[str, Any]:
        """Walk the table mapping: SCD2 merge for dimensions, time-based incremental insert for facts."""
        now = extracted["now"]
        summary: List[Dict[str, Any]] = []
        with self.db.conn.cursor(cursor_factory=RealDictCursor) as cur:
            for dwd_table, ods_table in self.TABLE_MAP.items():
                dwd_cols = self._get_columns(cur, dwd_table)
                ods_cols = self._get_columns(cur, ods_table)
                if not dwd_cols:
                    self.logger.warning("Skipping %s: could not read DWD column info", dwd_table)
                    continue
                if self._table_base(dwd_table).startswith("dim_"):
                    processed = self._merge_dim_scd2(cur, dwd_table, ods_table, dwd_cols, ods_cols, now)
                    summary.append({"table": dwd_table, "mode": "SCD2", "processed": processed})
                else:
                    dwd_types = self._get_column_types(cur, dwd_table, "billiards_dwd")
                    ods_types = self._get_column_types(cur, ods_table, "billiards_ods")
                    inserted = self._merge_fact_increment(
                        cur, dwd_table, ods_table, dwd_cols, ods_cols, dwd_types, ods_types
                    )
                    summary.append({"table": dwd_table, "mode": "INCREMENT", "inserted": inserted})
        self.db.conn.commit()
        return {"tables": summary}
# ---------------------- helpers ----------------------
    def _get_columns(self, cur, table: str) -> List[str]:
        """Return the column names of the given table, lowercased."""
        schema, name = self._split_table_name(table, default_schema="billiards_dwd")
        cur.execute(
            """
            SELECT column_name
            FROM information_schema.columns
            WHERE table_schema = %s AND table_name = %s
            """,
            (schema, name),
        )
        return [r["column_name"].lower() for r in cur.fetchall()]
    def _get_primary_keys(self, cur, table: str) -> List[str]:
        """Return the table's primary-key column names, in key order."""
        schema, name = self._split_table_name(table, default_schema="billiards_dwd")
        cur.execute(
            """
            SELECT kcu.column_name
            FROM information_schema.table_constraints tc
            JOIN information_schema.key_column_usage kcu
              ON tc.constraint_name = kcu.constraint_name
             AND tc.table_schema = kcu.table_schema
             AND tc.table_name = kcu.table_name
            WHERE tc.table_schema = %s
              AND tc.table_name = %s
              AND tc.constraint_type = 'PRIMARY KEY'
            ORDER BY kcu.ordinal_position
            """,
            (schema, name),
        )
        return [r["column_name"].lower() for r in cur.fetchall()]
    def _get_column_types(self, cur, table: str, default_schema: str) -> Dict[str, str]:
        """Return each column's data type (information_schema.data_type), keyed by lowercased column name."""
        schema, name = self._split_table_name(table, default_schema=default_schema)
        cur.execute(
            """
            SELECT column_name, data_type
            FROM information_schema.columns
            WHERE table_schema = %s AND table_name = %s
            """,
            (schema, name),
        )
        return {r["column_name"].lower(): r["data_type"].lower() for r in cur.fetchall()}
    def _build_column_mapping(
        self, dwd_table: str, pk_cols: Sequence[str], ods_cols: Sequence[str]
    ) -> Dict[str, tuple[str, str | None]]:
        """Merge the explicit FACT_MAPPINGS entries with a primary-key fallback mapping."""
        mapping_entries = self.FACT_MAPPINGS.get(dwd_table, [])
        mapping: Dict[str, tuple[str, str | None]] = {
            dst.lower(): (src, cast_type) for dst, src, cast_type in mapping_entries
        }
        ods_set = {c.lower() for c in ods_cols}
        # Fallback: a primary key with no explicit mapping and no same-named ODS column
        # is sourced from the ODS "id" column when that column exists.
        for pk in pk_cols:
            pk_lower = pk.lower()
            if pk_lower not in mapping and pk_lower not in ods_set and "id" in ods_set:
                mapping[pk_lower] = ("id", None)
        return mapping
    def _fetch_source_rows(
        self, cur, table: str, columns: Sequence[str], where_sql: str = "", params: Sequence[Any] | None = None
    ) -> List[Dict[str, Any]]:
        """Read the given columns from the source table and return dicts with lowercased keys."""
        schema, name = self._split_table_name(table, default_schema="billiards_ods")
        cols_sql = ", ".join(f'"{c}"' for c in columns)
        sql = f'SELECT {cols_sql} FROM "{schema}"."{name}" {where_sql}'
        cur.execute(sql, params or [])
        rows = []
        for r in cur.fetchall():
            rows.append({k.lower(): v for k, v in r.items()})
        return rows
    def _expand_goods_category_rows(self, rows: list[Dict[str, Any]]) -> list[Dict[str, Any]]:
        """Expand the categoryboxes elements of each category row into child-category records."""
        expanded: list[Dict[str, Any]] = []
        for r in rows:
            expanded.append(r)
            boxes = r.get("categoryboxes")
            if isinstance(boxes, list):
                for child in boxes:
                    if not isinstance(child, dict):
                        continue
                    child_row: Dict[str, Any] = {}
                    # Inherit the tenant and business-category fields from the parent
                    child_row["tenant_id"] = r.get("tenant_id")
                    child_row["business_name"] = child.get("business_name", r.get("business_name"))
                    child_row["tenant_goods_business_id"] = child.get(
                        "tenant_goods_business_id", r.get("tenant_goods_business_id")
                    )
                    # Merge in the child's own fields
                    child_row.update(child)
                    # Default parent/child link
                    child_row.setdefault("pid", r.get("id"))
                    # Derive the level and leaf flag
                    child_boxes = child_row.get("categoryboxes")
                    if not isinstance(child_boxes, list):
                        is_leaf = 1
                    else:
                        is_leaf = 1 if len(child_boxes) == 0 else 0
                    child_row.setdefault("category_level", 2)
                    child_row.setdefault("is_leaf", is_leaf)
                    expanded.append(child_row)
        return expanded
    def _merge_dim_scd2(
        self,
        cur,
        dwd_table: str,
        ods_table: str,
        dwd_cols: Sequence[str],
        ods_cols: Sequence[str],
        now: datetime,
    ) -> int:
        """Run an SCD2 merge on a dimension table: on change, close the old version and insert a new one."""
        pk_cols = self._get_primary_keys(cur, dwd_table)
        if not pk_cols:
            raise ValueError(f"{dwd_table} has no primary key configured; cannot run the SCD2 merge")
        mapping = self._build_column_mapping(dwd_table, pk_cols, ods_cols)
        ods_set = {c.lower() for c in ods_cols}
        table_sql = self._format_table(ods_table, "billiards_ods")
        # Build the SELECT expressions; JSON/expression mappings are supported
        select_exprs: list[str] = []
        added: set[str] = set()
        for col in dwd_cols:
            lc = col.lower()
            if lc in self.SCD_COLS:
                continue
            if lc in mapping:
                src, cast_type = mapping[lc]
                select_exprs.append(f"{self._cast_expr(src, cast_type)} AS \"{lc}\"")
                added.add(lc)
            elif lc in ods_set:
                select_exprs.append(f'"{lc}" AS "{lc}"')
                added.add(lc)
        # The category dimension also needs categoryboxes so child categories can be expanded
        if dwd_table == "billiards_dwd.dim_goods_category" and "categoryboxes" not in added and "categoryboxes" in ods_set:
            select_exprs.append('"categoryboxes" AS "categoryboxes"')
            added.add("categoryboxes")
        # Fallback: make sure the primary keys are selected
        for pk in pk_cols:
            lc = pk.lower()
            if lc not in added:
                if lc in mapping:
                    src, cast_type = mapping[lc]
                    select_exprs.append(f"{self._cast_expr(src, cast_type)} AS \"{lc}\"")
                elif lc in ods_set:
                    select_exprs.append(f'"{lc}" AS "{lc}"')
                added.add(lc)
        if not select_exprs:
            return 0
        sql = f"SELECT {', '.join(select_exprs)} FROM {table_sql}"
        cur.execute(sql)
        rows = [{k.lower(): v for k, v in r.items()} for r in cur.fetchall()]
        # Special case: expand child categories for the category dimension
        if dwd_table == "billiards_dwd.dim_goods_category":
            rows = self._expand_goods_category_rows(rows)
        inserted_or_updated = 0
        seen_pk = set()
        for row in rows:
            mapped_row: Dict[str, Any] = {}
            for col in dwd_cols:
                lc = col.lower()
                if lc in self.SCD_COLS:
                    continue
                value = row.get(lc)
                if value is None and lc in mapping:
                    src, _ = mapping[lc]
                    value = row.get(src.lower())
                mapped_row[lc] = value
            pk_key = tuple(mapped_row.get(pk) for pk in pk_cols)
            if pk_key in seen_pk:
                continue
            seen_pk.add(pk_key)
            if self._upsert_scd2_row(cur, dwd_table, dwd_cols, pk_cols, mapped_row, now):
                inserted_or_updated += 1
        # Note: the return value is the number of source rows read;
        # inserted_or_updated is counted above but not returned.
        return len(rows)
    def _upsert_scd2_row(
        self,
        cur,
        dwd_table: str,
        dwd_cols: Sequence[str],
        pk_cols: Sequence[str],
        src_row: Dict[str, Any],
        now: datetime,
    ) -> bool:
        """SCD2 merge for one row: if it changed, close the old version and insert a new one."""
        pk_values = [src_row.get(pk) for pk in pk_cols]
        if any(v is None for v in pk_values):
            self.logger.warning("Skipping %s: primary key missing %s", dwd_table, dict(zip(pk_cols, pk_values)))
            return False
        where_clause = " AND ".join(f'"{pk}" = %s' for pk in pk_cols)
        table_sql = self._format_table(dwd_table, "billiards_dwd")
        cur.execute(
            f"SELECT * FROM {table_sql} WHERE {where_clause} AND COALESCE(scd2_is_current,1)=1 LIMIT 1",
            pk_values,
        )
        current = cur.fetchone()
        if current:
            current = {k.lower(): v for k, v in current.items()}
        if current and not self._is_row_changed(current, src_row, dwd_cols):
            return False
        if current:
            version = (current.get("scd2_version") or 1) + 1
            self._close_current_dim(cur, dwd_table, pk_cols, pk_values, now)
        else:
            version = 1
        self._insert_dim_row(cur, dwd_table, dwd_cols, src_row, now, version)
        return True
    def _close_current_dim(self, cur, table: str, pk_cols: Sequence[str], pk_values: Sequence[Any], now: datetime) -> None:
        """Close the current version: set scd2_is_current=0 and fill in the end time."""
        set_sql = "scd2_end_time = %s, scd2_is_current = 0"
        where_clause = " AND ".join(f'"{pk}" = %s' for pk in pk_cols)
        table_sql = self._format_table(table, "billiards_dwd")
        cur.execute(
            f"UPDATE {table_sql} SET {set_sql} WHERE {where_clause} AND COALESCE(scd2_is_current,1)=1",
            [now, *pk_values],
        )
    def _insert_dim_row(
        self,
        cur,
        table: str,
        dwd_cols: Sequence[str],
        src_row: Dict[str, Any],
        now: datetime,
        version: int,
    ) -> None:
        """Insert a new SCD2 version row."""
        insert_cols: List[str] = []
        placeholders: List[str] = []
        values: List[Any] = []
        for col in sorted(dwd_cols):
            lc = col.lower()
            insert_cols.append(f'"{lc}"')
            placeholders.append("%s")
            if lc == "scd2_start_time":
                values.append(now)
            elif lc == "scd2_end_time":
                # Open-ended validity for the current version
                values.append(datetime(9999, 12, 31, 0, 0, 0))
            elif lc == "scd2_is_current":
                values.append(1)
            elif lc == "scd2_version":
                values.append(version)
            else:
                values.append(src_row.get(lc))
        table_sql = self._format_table(table, "billiards_dwd")
        sql = f'INSERT INTO {table_sql} ({", ".join(insert_cols)}) VALUES ({", ".join(placeholders)})'
        cur.execute(sql, values)
    def _is_row_changed(self, current: Dict[str, Any], incoming: Dict[str, Any], dwd_cols: Sequence[str]) -> bool:
        """Compare the non-SCD2 columns and report whether anything changed."""
        for col in dwd_cols:
            lc = col.lower()
            if lc in self.SCD_COLS:
                continue
            if current.get(lc) != incoming.get(lc):
                return True
        return False
    def _merge_fact_increment(
        self,
        cur,
        dwd_table: str,
        ods_table: str,
        dwd_cols: Sequence[str],
        ods_cols: Sequence[str],
        dwd_types: Dict[str, str],
        ods_types: Dict[str, str],
    ) -> int:
        """Incrementally insert into a fact table by time; by default writes the intersection of column names."""
        mapping_entries = self.FACT_MAPPINGS.get(dwd_table) or []
        mapping: Dict[str, tuple[str, str | None]] = {
            dst.lower(): (src, cast_type) for dst, src, cast_type in mapping_entries
        }
        mapping_dest = [dst for dst, _, _ in mapping_entries]
        insert_cols: List[str] = list(mapping_dest)
        for col in dwd_cols:
            if col in self.SCD_COLS:
                continue
            if col in insert_cols:
                continue
            if col in ods_cols:
                insert_cols.append(col)
        pk_cols = self._get_primary_keys(cur, dwd_table)
        ods_set = {c.lower() for c in ods_cols}
        existing_lower = [c.lower() for c in insert_cols]
        for pk in pk_cols:
            pk_lower = pk.lower()
            if pk_lower in existing_lower:
                continue
            if pk_lower in ods_set:
                insert_cols.append(pk)
                existing_lower.append(pk_lower)
            elif "id" in ods_set:
                insert_cols.append(pk)
                existing_lower.append(pk_lower)
                mapping[pk_lower] = ("id", None)
        # Deduplicate while preserving column order
        seen_cols: set[str] = set()
        ordered_cols: list[str] = []
        for col in insert_cols:
            lc = col.lower()
            if lc not in seen_cols:
                seen_cols.add(lc)
                ordered_cols.append(col)
        insert_cols = ordered_cols
        if not insert_cols:
            self.logger.warning("Skipping %s: no insertable columns found", dwd_table)
            return 0
        order_col = self._pick_order_column(dwd_cols, ods_cols)
        where_sql = ""
        params: List[Any] = []
        dwd_table_sql = self._format_table(dwd_table, "billiards_dwd")
        ods_table_sql = self._format_table(ods_table, "billiards_ods")
        if order_col:
            # Watermark: the latest value already loaded into DWD
            cur.execute(f'SELECT COALESCE(MAX("{order_col}"), %s) FROM {dwd_table_sql}', ("1970-01-01",))
            row = cur.fetchone() or {}
            watermark = list(row.values())[0] if row else "1970-01-01"
            where_sql = f'WHERE "{order_col}" > %s'
            params.append(watermark)
        default_cols = [c for c in insert_cols if c.lower() not in mapping]
        default_expr_map: Dict[str, str] = {}
        if default_cols:
            default_exprs = self._build_fact_select_exprs(default_cols, dwd_types, ods_types)
            default_expr_map = dict(zip(default_cols, default_exprs))
        select_exprs: List[str] = []
        for col in insert_cols:
            key = col.lower()
            if key in mapping:
                src, cast_type = mapping[key]
                select_exprs.append(self._cast_expr(src, cast_type))
            else:
                select_exprs.append(default_expr_map[col])
        select_cols_sql = ", ".join(select_exprs)
        insert_cols_sql = ", ".join(f'"{c}"' for c in insert_cols)
        sql = f'INSERT INTO {dwd_table_sql} ({insert_cols_sql}) SELECT {select_cols_sql} FROM {ods_table_sql} {where_sql}'
        # pk_cols was fetched above; reuse it as the conflict target instead of querying again
        if pk_cols:
            pk_sql = ", ".join(f'"{c}"' for c in pk_cols)
            sql += f" ON CONFLICT ({pk_sql}) DO NOTHING"
        cur.execute(sql, params)
        return cur.rowcount
    def _pick_order_column(self, dwd_cols: Iterable[str], ods_cols: Iterable[str]) -> str | None:
        """Pick the time column used for incremental loading (it must exist in both DWD and ODS)."""
        lower_cols = {c.lower() for c in dwd_cols} & {c.lower() for c in ods_cols}
        for candidate in self.FACT_ORDER_CANDIDATES:
            if candidate.lower() in lower_cols:
                return candidate.lower()
        return None
    def _build_fact_select_exprs(
        self,
        insert_cols: Sequence[str],
        dwd_types: Dict[str, str],
        ods_types: Dict[str, str],
    ) -> List[str]:
        """Build the fact-table SELECT list, adding a type cast where needed."""
        numeric_types = {"integer", "bigint", "smallint", "numeric", "double precision", "real", "decimal"}
        text_types = {"text", "character varying", "varchar"}
        exprs = []
        for col in insert_cols:
            d_type = dwd_types.get(col)
            o_type = ods_types.get(col)
            if d_type in numeric_types and o_type in text_types:
                # Text source, numeric target: treat empty strings as NULL before casting
                exprs.append(f"CAST(NULLIF(CAST(\"{col}\" AS text), '') AS numeric)::{d_type}")
            else:
                exprs.append(f'"{col}"')
        return exprs
    def _split_table_name(self, name: str, default_schema: str) -> tuple[str, str]:
        """Split schema.table; fall back to the default schema when none is given."""
        parts = name.split(".")
        if len(parts) == 2:
            return parts[0], parts[1].lower()
        return default_schema, name.lower()

    def _table_base(self, name: str) -> str:
        """Return the table name without its schema."""
        return name.split(".")[-1]

    def _format_table(self, name: str, default_schema: str) -> str:
        """Return the quoted schema.table name."""
        schema, table = self._split_table_name(name, default_schema)
        return f'"{schema}"."{table}"'
    def _cast_expr(self, col: str, cast_type: str | None) -> str:
        """Build a column expression with an optional CAST."""
        if col.upper() == "NULL":
            base = "NULL"
        else:
            # Anything that is not a plain identifier (JSON operators, casts, literals) is passed through as-is
            is_expr = not col.isidentifier() or "->" in col or "#>>" in col or "::" in col or "'" in col
            base = col if is_expr else f'"{col}"'
        if cast_type:
            cast_lower = cast_type.lower()
            if cast_lower in {"bigint", "integer", "numeric", "decimal"}:
                return f"CAST(NULLIF(CAST({base} AS text), '') AS numeric)::{cast_type}"
            if cast_lower == "timestamptz":
                return f"({base})::timestamptz"
            return f"{base}::{cast_type}"
        return base
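
The SCD2 merge in the file above (look up the current version, compare non-SCD2 columns, close the old row, insert a new version) can be sketched without a database. This is a minimal in-memory illustration, not the task's actual code path: `scd2_upsert`, `OPEN_END` and the list-of-dicts `history` are hypothetical names introduced here, and the change check compares the keys of the incoming row rather than the DWD column list.

```python
from datetime import datetime
from typing import Any, Dict, List, Sequence

# Hypothetical stand-ins for the task's SCD_COLS and open-ended end time.
SCD_COLS = {"scd2_start_time", "scd2_end_time", "scd2_is_current", "scd2_version"}
OPEN_END = datetime(9999, 12, 31)


def scd2_upsert(
    history: List[Dict[str, Any]],
    pk_cols: Sequence[str],
    incoming: Dict[str, Any],
    now: datetime,
) -> bool:
    """Apply one incoming row to an in-memory SCD2 history; return True if a version was written."""
    pk = tuple(incoming.get(c) for c in pk_cols)
    if any(v is None for v in pk):
        return False  # rows without a complete primary key are skipped

    # Find the current version for this primary key, if any.
    current = next(
        (
            r
            for r in history
            if r["scd2_is_current"] == 1 and tuple(r.get(c) for c in pk_cols) == pk
        ),
        None,
    )
    if current is not None:
        changed = any(
            current.get(k) != v for k, v in incoming.items() if k not in SCD_COLS
        )
        if not changed:
            return False
        # Close the old version before inserting the new one.
        current["scd2_end_time"] = now
        current["scd2_is_current"] = 0
        version = (current.get("scd2_version") or 1) + 1
    else:
        version = 1

    history.append(
        {
            **incoming,
            "scd2_start_time": now,
            "scd2_end_time": OPEN_END,
            "scd2_is_current": 1,
            "scd2_version": version,
        }
    )
    return True
```

Calling it twice with an unchanged row writes a single version; a changed attribute closes version 1 and appends version 2, which mirrors what `_upsert_scd2_row` does against the DWD table.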


@@ -0,0 +1,105 @@
# -*- coding: utf-8 -*-
"""DWD quality-check task: emit a row-count/amount comparison report as described in dwd_quality_check.md."""
from __future__ import annotations
import json
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, Iterable, List, Sequence, Tuple
from psycopg2.extras import RealDictCursor
from .base_task import BaseTask, TaskContext
from .dwd_load_task import DwdLoadTask
class DwdQualityTask(BaseTask):
    """Cross-check row counts and amounts between ODS and DWD and write a JSON report."""

    REPORT_PATH = Path("etl_billiards/reports/dwd_quality_report.json")
    AMOUNT_KEYWORDS = ("amount", "money", "fee", "balance")

    def get_task_code(self) -> str:
        """Return the task code."""
        return "DWD_QUALITY_CHECK"

    def extract(self, context: TaskContext) -> dict[str, Any]:
        """Prepare the runtime context."""
        return {"now": datetime.now()}
    def load(self, extracted: dict[str, Any], context: TaskContext) -> dict[str, Any]:
        """Write the row-count/amount difference report to a local file."""
        report: Dict[str, Any] = {
            "generated_at": extracted["now"].isoformat(),
            "tables": [],
            "note": "Row-count/amount check; amount fields are found by scanning numeric columns whose name contains amount/money/fee/balance.",
        }
        with self.db.conn.cursor(cursor_factory=RealDictCursor) as cur:
            for dwd_table, ods_table in DwdLoadTask.TABLE_MAP.items():
                count_info = self._compare_counts(cur, dwd_table, ods_table)
                amount_info = self._compare_amounts(cur, dwd_table, ods_table)
                report["tables"].append(
                    {
                        "dwd_table": dwd_table,
                        "ods_table": ods_table,
                        "count": count_info,
                        "amounts": amount_info,
                    }
                )
        self.REPORT_PATH.parent.mkdir(parents=True, exist_ok=True)
        self.REPORT_PATH.write_text(json.dumps(report, ensure_ascii=False, indent=2), encoding="utf-8")
        self.logger.info("DWD quality report written: %s", self.REPORT_PATH)
        return {"report_path": str(self.REPORT_PATH)}
# ---------------------- helpers ----------------------
    def _compare_counts(self, cur, dwd_table: str, ods_table: str) -> Dict[str, Any]:
        """Count rows on both sides and return the difference (DWD minus ODS)."""
        dwd_schema, dwd_name = self._split_table_name(dwd_table, default_schema="billiards_dwd")
        ods_schema, ods_name = self._split_table_name(ods_table, default_schema="billiards_ods")
        cur.execute(f'SELECT COUNT(1) AS cnt FROM "{dwd_schema}"."{dwd_name}"')
        dwd_cnt = cur.fetchone()["cnt"]
        cur.execute(f'SELECT COUNT(1) AS cnt FROM "{ods_schema}"."{ods_name}"')
        ods_cnt = cur.fetchone()["cnt"]
        return {"dwd": dwd_cnt, "ods": ods_cnt, "diff": dwd_cnt - ods_cnt}
    def _compare_amounts(self, cur, dwd_table: str, ods_table: str) -> List[Dict[str, Any]]:
        """Scan amount-like columns and build an ODS vs. DWD sum comparison."""
        dwd_schema, dwd_name = self._split_table_name(dwd_table, default_schema="billiards_dwd")
        ods_schema, ods_name = self._split_table_name(ods_table, default_schema="billiards_ods")
        dwd_amount_cols = self._get_numeric_amount_columns(cur, dwd_schema, dwd_name)
        ods_amount_cols = self._get_numeric_amount_columns(cur, ods_schema, ods_name)
        # Only columns that are numeric and amount-like on both sides are compared
        common_amount_cols = sorted(set(dwd_amount_cols) & set(ods_amount_cols))
        results: List[Dict[str, Any]] = []
        for col in common_amount_cols:
            cur.execute(f'SELECT COALESCE(SUM("{col}"),0) AS val FROM "{dwd_schema}"."{dwd_name}"')
            dwd_sum = cur.fetchone()["val"]
            cur.execute(f'SELECT COALESCE(SUM("{col}"),0) AS val FROM "{ods_schema}"."{ods_name}"')
            ods_sum = cur.fetchone()["val"]
            results.append(
                {
                    "column": col,
                    "dwd_sum": float(dwd_sum or 0),
                    "ods_sum": float(ods_sum or 0),
                    "diff": float(dwd_sum or 0) - float(ods_sum or 0),
                }
            )
        return results
    def _get_numeric_amount_columns(self, cur, schema: str, table: str) -> List[str]:
        """Return the numeric columns whose name contains an amount keyword."""
        cur.execute(
            """
            SELECT column_name
            FROM information_schema.columns
            WHERE table_schema = %s
              AND table_name = %s
              AND data_type IN ('numeric','double precision','integer','bigint','smallint','real','decimal')
            """,
            (schema, table),
        )
        cols = [r["column_name"].lower() for r in cur.fetchall()]
        return [c for c in cols if any(key in c for key in self.AMOUNT_KEYWORDS)]

    def _split_table_name(self, name: str, default_schema: str) -> Tuple[str, str]:
        """Split schema and table name; use default_schema when no schema is given."""
        parts = name.split(".")
        if len(parts) == 2:
            return parts[0], parts[1]
        return default_schema, name
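
The amount check above boils down to two steps: pick numeric columns whose name contains an amount keyword, then compare per-column sums on the columns both sides share. A minimal sketch of that logic over plain Python data follows; `pick_amount_columns` and `compare_amounts` are hypothetical helpers written for illustration, and the in-memory rows stand in for the `SUM(...)` queries the task runs against PostgreSQL.

```python
from typing import Any, Dict, List, Mapping, Sequence

AMOUNT_KEYWORDS = ("amount", "money", "fee", "balance")
NUMERIC_TYPES = {"numeric", "double precision", "integer", "bigint", "smallint", "real", "decimal"}


def pick_amount_columns(column_types: Mapping[str, str]) -> List[str]:
    """Return lowercased names of numeric columns whose name contains an amount keyword."""
    picked = []
    for name, data_type in column_types.items():
        lname = name.lower()
        if data_type.lower() in NUMERIC_TYPES and any(k in lname for k in AMOUNT_KEYWORDS):
            picked.append(lname)
    return sorted(picked)


def compare_amounts(
    dwd_rows: Sequence[Mapping[str, Any]],
    ods_rows: Sequence[Mapping[str, Any]],
    dwd_types: Mapping[str, str],
    ods_types: Mapping[str, str],
) -> List[Dict[str, Any]]:
    """Sum each shared amount column on both sides and report DWD minus ODS."""
    common = sorted(set(pick_amount_columns(dwd_types)) & set(pick_amount_columns(ods_types)))
    results = []
    for col in common:
        # NULLs (None) count as 0, like COALESCE(SUM(col), 0) in the SQL version.
        dwd_sum = float(sum(r.get(col) or 0 for r in dwd_rows))
        ods_sum = float(sum(r.get(col) or 0 for r in ods_rows))
        results.append({"column": col, "dwd_sum": dwd_sum, "ods_sum": ods_sum, "diff": dwd_sum - ods_sum})
    return results
```

A column that is text on one side (for example an ODS amount stored as a string) falls out of the intersection and is not compared, which is worth keeping in mind when reading an empty `amounts` list in the report.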


@@ -0,0 +1,36 @@
# -*- coding: utf-8 -*-
"""Initialize the DWD schema: run schema_dwd_doc.sql, optionally dropping the schema first."""
from __future__ import annotations
from pathlib import Path
from typing import Any
from .base_task import BaseTask, TaskContext
class InitDwdSchemaTask(BaseTask):
    """Run the DWD schema initialization through the scheduler."""

    def get_task_code(self) -> str:
        """Return the task code."""
        return "INIT_DWD_SCHEMA"

    def extract(self, context: TaskContext) -> dict[str, Any]:
        """Read the DWD SQL file and the related options."""
        base_dir = Path(__file__).resolve().parents[1] / "database"
        dwd_path = Path(self.config.get("schema.dwd_file", base_dir / "schema_dwd_doc.sql"))
        if not dwd_path.exists():
            raise FileNotFoundError(f"DWD schema file not found: {dwd_path}")
        drop_first = self.config.get("dwd.drop_schema_first", False)
        return {"dwd_sql": dwd_path.read_text(encoding="utf-8"), "dwd_file": str(dwd_path), "drop_first": drop_first}

    def load(self, extracted: dict[str, Any], context: TaskContext) -> dict:
        """Optionally DROP the schema, then run the DWD DDL."""
        with self.db.conn.cursor() as cur:
            if extracted["drop_first"]:
                cur.execute("DROP SCHEMA IF EXISTS billiards_dwd CASCADE;")
                self.logger.info("Executed DROP SCHEMA billiards_dwd CASCADE")
            self.logger.info("Running DWD schema file: %s", extracted["dwd_file"])
            cur.execute(extracted["dwd_sql"])
        return {"executed": 1, "files": [extracted["dwd_file"]]}


@@ -0,0 +1,73 @@
# -*- coding: utf-8 -*-
"""Task: initialize the runtime environment, run the ODS and etl_admin DDL, and prepare the log/export directories."""
from __future__ import annotations
from pathlib import Path
from typing import Any
from .base_task import BaseTask, TaskContext
class InitOdsSchemaTask(BaseTask):
    """Run initialization through the scheduler: create the required directories and run the ODS and etl_admin DDL."""

    def get_task_code(self) -> str:
        """Return the task code."""
        return "INIT_ODS_SCHEMA"

    def extract(self, context: TaskContext) -> dict[str, Any]:
        """Read the SQL files and collect the directories that need to be created."""
        base_dir = Path(__file__).resolve().parents[1] / "database"
        ods_path = Path(self.config.get("schema.ods_file", base_dir / "schema_ODS_doc.sql"))
        admin_path = Path(self.config.get("schema.etl_admin_file", base_dir / "schema_etl_admin.sql"))
        if not ods_path.exists():
            raise FileNotFoundError(f"ODS schema file not found: {ods_path}")
        if not admin_path.exists():
            raise FileNotFoundError(f"etl_admin schema file not found: {admin_path}")
        log_root = Path(self.config.get("io.log_root") or self.config["io"]["log_root"])
        export_root = Path(self.config.get("io.export_root") or self.config["io"]["export_root"])
        fetch_root = Path(self.config.get("pipeline.fetch_root") or self.config["pipeline"]["fetch_root"])
        ingest_dir = Path(self.config.get("pipeline.ingest_source_dir") or fetch_root)
        return {
            "ods_sql": ods_path.read_text(encoding="utf-8"),
            "admin_sql": admin_path.read_text(encoding="utf-8"),
            "ods_file": str(ods_path),
            "admin_file": str(admin_path),
            "dirs": [log_root, export_root, fetch_root, ingest_dir],
        }
    def load(self, extracted: dict[str, Any], context: TaskContext) -> dict:
        """Run the DDL and create the required directories.

        Safety note:
        The ODS DDL file may carry a leading description or malformed comments. To keep
        non-SQL text from failing execution, it is lightly cleaned before being run.
        """
        for d in extracted["dirs"]:
            Path(d).mkdir(parents=True, exist_ok=True)
            self.logger.info("Ensured directory exists: %s", d)
        # Clean the ODS SQL: drop the leading description lines and the error-prone
        # COMMENT ON lines (e.g. unquoted CamelCase identifiers)
        ods_sql_raw: str = extracted["ods_sql"]
        drop_idx = ods_sql_raw.find("DROP SCHEMA")
        if drop_idx > 0:
            ods_sql_raw = ods_sql_raw[drop_idx:]
        cleaned_lines: list[str] = []
        for line in ods_sql_raw.splitlines():
            if line.strip().upper().startswith("COMMENT ON "):
                continue
            cleaned_lines.append(line)
        ods_sql = "\n".join(cleaned_lines)
        with self.db.conn.cursor() as cur:
            self.logger.info("Running etl_admin schema file: %s", extracted["admin_file"])
            cur.execute(extracted["admin_sql"])
            self.logger.info("Running ODS schema file: %s", extracted["ods_file"])
            cur.execute(ods_sql)
        return {
            "executed": 2,
            "files": [extracted["admin_file"], extracted["ods_file"]],
            "dirs_prepared": [str(p) for p in extracted["dirs"]],
        }
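
The light cleaning step in `load` above can be pulled out into a pure function, which makes it easy to check in isolation. The sketch below follows the same two rules (cut everything before the first `DROP SCHEMA`, then drop lines that start with `COMMENT ON `); `clean_ods_ddl` is a hypothetical name, and, like the original, it works line by line, so a `COMMENT ON` statement that spans several lines would leave its continuation lines behind.

```python
def clean_ods_ddl(raw_sql: str) -> str:
    """Strip a non-SQL header and single-line COMMENT ON statements from a DDL script."""
    # Drop any header text that precedes the first DROP SCHEMA statement.
    drop_idx = raw_sql.find("DROP SCHEMA")
    if drop_idx > 0:
        raw_sql = raw_sql[drop_idx:]

    # Skip lines that begin a COMMENT ON statement (case-insensitive).
    kept = [
        line
        for line in raw_sql.splitlines()
        if not line.strip().upper().startswith("COMMENT ON ")
    ]
    return "\n".join(kept)
```

If the script has no `DROP SCHEMA` at all, `find` returns -1 and the text is passed through with only the `COMMENT ON` lines removed.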

File diff suppressed because it is too large


@@ -1,4 +1,4 @@
# -*- coding: utf-8 -*-
# -*- coding: utf-8 -*-
from .base_dwd_task import BaseDwdTask
from loaders.dimensions.member import MemberLoader
from models.parsers import TypeParser
@@ -7,7 +7,7 @@ import json
class MembersDwdTask(BaseDwdTask):
"""
DWD Task: Process Member Records from ODS to Dimension Table
- Source: billiards_ods.ods_member_profile
+ Source: billiards_ods.member_profiles
Target: billiards.dim_member
"""
@@ -29,7 +29,7 @@ class MembersDwdTask(BaseDwdTask):
# Iterate ODS Data
batches = self.iter_ods_rows(
- table_name="billiards_ods.ods_member_profile",
+ table_name="billiards_ods.member_profiles",
columns=["site_id", "member_id", "payload", "fetched_at"],
start_time=window_start,
end_time=window_end
@@ -87,3 +87,4 @@ class MembersDwdTask(BaseDwdTask):
except Exception as e:
self.logger.warning(f"Error parsing member: {e}")
return None


@@ -1,4 +1,4 @@
# -*- coding: utf-8 -*-
# -*- coding: utf-8 -*-
"""ODS ingestion tasks."""
from __future__ import annotations
@@ -62,11 +62,11 @@ class BaseOdsTask(BaseTask):
def execute(self) -> dict:
spec = self.SPEC
self.logger.info("Starting %s (ODS)", spec.code)
store_id = TypeParser.parse_int(self.config.get("app.store_id"))
if not store_id:
raise ValueError("app.store_id is not configured; cannot run the ODS task")
page_size = self.config.get("api.page_size", 200)
params = self._build_params(spec, store_id)
@@ -122,13 +122,13 @@ class BaseOdsTask(BaseTask):
counts["fetched"] += len(page_records)
self.db.commit()
self.logger.info("%s ODS task finished: %s", spec.code, counts)
return self._build_result("SUCCESS", counts)
except Exception:
self.db.rollback()
counts["errors"] += 1
self.logger.error("%s ODS task failed", spec.code, exc_info=True)
raise
def _build_params(self, spec: OdsTaskSpec, store_id: int) -> dict:
@@ -201,7 +201,7 @@ class BaseOdsTask(BaseTask):
value = self._extract_value(record, col_spec)
if value is None and col_spec.required:
self.logger.warning(
"%s is missing required field %s; raw record: %s",
spec.code,
col_spec.column,
record,
@@ -265,9 +265,38 @@ def _int_col(name: str, *sources: str, required: bool = False) -> ColumnSpec:
)
def _decimal_col(name: str, *sources: str) -> ColumnSpec:
"""Build a decimal ColumnSpec; values go through TypeParser.parse_decimal(v, 2)."""
return ColumnSpec(
column=name,
sources=sources,
transform=lambda v: TypeParser.parse_decimal(v, 2),
)
def _bool_col(name: str, *sources: str) -> ColumnSpec:
"""Build a boolean ColumnSpec; accepts 0/1, true/false and similar spellings."""
def _to_bool(value):
if value is None:
return None
if isinstance(value, bool):
return value
s = str(value).strip().lower()
if s in {"1", "true", "t", "yes", "y"}:
return True
if s in {"0", "false", "f", "no", "n"}:
return False
return bool(value)
return ColumnSpec(column=name, sources=sources, transform=_to_bool)
ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
OdsTaskSpec(
- code="ODS_ASSISTANT_ACCOUNTS",
+ code="ODS_ASSISTANT_ACCOUNT",
class_name="OdsAssistantAccountsTask",
table_name="billiards_ods.assistant_accounts_master",
endpoint="/PersonnelManagement/SearchAssistantInfo",
@@ -281,10 +310,10 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_fetched_at=False,
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
description="Assistant account profiles ODS: SearchAssistantInfo -> assistantInfos raw JSON",
),
OdsTaskSpec(
- code="ODS_ORDER_SETTLE",
+ code="ODS_SETTLEMENT_RECORDS",
class_name="OdsOrderSettleTask",
table_name="billiards_ods.settlement_records",
endpoint="/Site/GetAllOrderSettleList",
@@ -299,7 +328,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Settlement records ODS: GetAllOrderSettleList -> settleList raw JSON",
),
OdsTaskSpec(
code="ODS_TABLE_USE",
@@ -317,7 +346,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Table fee billing ledger ODS: GetSiteTableOrderDetails -> siteTableUseDetailsList raw JSON",
),
OdsTaskSpec(
code="ODS_ASSISTANT_LEDGER",
@@ -334,7 +363,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_fetched_at=False,
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
description="Assistant service ledger ODS: GetOrderAssistantDetails -> orderAssistantDetails raw JSON",
),
OdsTaskSpec(
code="ODS_ASSISTANT_ABOLISH",
@@ -351,10 +380,10 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_fetched_at=False,
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
description="Assistant abolition records ODS: GetAbolitionAssistant -> abolitionAssistants raw JSON",
),
OdsTaskSpec(
- code="ODS_GOODS_LEDGER",
+ code="ODS_STORE_GOODS_SALES",
class_name="OdsGoodsLedgerTask",
table_name="billiards_ods.store_goods_sales_records",
endpoint="/TenantGoods/GetGoodsSalesList",
@@ -369,7 +398,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Store goods sales ledger ODS: GetGoodsSalesList -> orderGoodsLedgers raw JSON",
),
OdsTaskSpec(
code="ODS_PAYMENT",
@@ -386,7 +415,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Payment ledger ODS: GetPayLogListPage raw JSON",
),
OdsTaskSpec(
code="ODS_REFUND",
@@ -403,10 +432,10 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Refund ledger ODS: GetRefundPayLogList raw JSON",
),
OdsTaskSpec(
- code="ODS_COUPON_VERIFY",
+ code="ODS_PLATFORM_COUPON",
class_name="OdsCouponVerifyTask",
table_name="billiards_ods.platform_coupon_redemption_records",
endpoint="/Promotion/GetOfflineCouponConsumePageList",
@@ -420,7 +449,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Platform/group-buy coupon redemption ODS: GetOfflineCouponConsumePageList raw JSON",
),
OdsTaskSpec(
code="ODS_MEMBER",
@@ -438,7 +467,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Member profiles ODS: GetTenantMemberList -> tenantMemberInfos raw JSON",
),
OdsTaskSpec(
code="ODS_MEMBER_CARD",
@@ -456,7 +485,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Member stored-value cards ODS: GetTenantMemberCardList -> tenantMemberCards raw JSON",
),
OdsTaskSpec(
code="ODS_MEMBER_BALANCE",
@@ -474,7 +503,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Member balance changes ODS: GetMemberCardBalanceChange -> tenantMemberCardLogs raw JSON",
),
OdsTaskSpec(
code="ODS_RECHARGE_SETTLE",
@@ -483,19 +512,83 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
endpoint="/Site/GetRechargeSettleList",
data_path=("data",),
list_key="settleList",
- pk_columns=(),
+ pk_columns=(_int_col("recharge_order_id", "settleList.id", "id", required=True),),
extra_columns=(
_int_col("tenant_id", "settleList.tenantId", "tenantId"),
_int_col("site_id", "settleList.siteId", "siteId", "siteProfile.id"),
ColumnSpec("site_name_snapshot", sources=("siteProfile.shop_name", "settleList.siteName")),
_int_col("member_id", "settleList.memberId", "memberId"),
ColumnSpec("member_name_snapshot", sources=("settleList.memberName", "memberName")),
ColumnSpec("member_phone_snapshot", sources=("settleList.memberPhone", "memberPhone")),
_int_col("tenant_member_card_id", "settleList.tenantMemberCardId", "tenantMemberCardId"),
ColumnSpec("member_card_type_name", sources=("settleList.memberCardTypeName", "memberCardTypeName")),
_int_col("settle_relate_id", "settleList.settleRelateId", "settleRelateId"),
_int_col("settle_type", "settleList.settleType", "settleType"),
ColumnSpec("settle_name", sources=("settleList.settleName", "settleName")),
_int_col("is_first", "settleList.isFirst", "isFirst"),
_int_col("settle_status", "settleList.settleStatus", "settleStatus"),
_decimal_col("pay_amount", "settleList.payAmount", "payAmount"),
_decimal_col("refund_amount", "settleList.refundAmount", "refundAmount"),
_decimal_col("point_amount", "settleList.pointAmount", "pointAmount"),
_decimal_col("cash_amount", "settleList.cashAmount", "cashAmount"),
_decimal_col("online_amount", "settleList.onlineAmount", "onlineAmount"),
_decimal_col("balance_amount", "settleList.balanceAmount", "balanceAmount"),
_decimal_col("card_amount", "settleList.cardAmount", "cardAmount"),
_decimal_col("coupon_amount", "settleList.couponAmount", "couponAmount"),
_decimal_col("recharge_card_amount", "settleList.rechargeCardAmount", "rechargeCardAmount"),
_decimal_col("gift_card_amount", "settleList.giftCardAmount", "giftCardAmount"),
_decimal_col("prepay_money", "settleList.prepayMoney", "prepayMoney"),
_decimal_col("consume_money", "settleList.consumeMoney", "consumeMoney"),
_decimal_col("goods_money", "settleList.goodsMoney", "goodsMoney"),
_decimal_col("real_goods_money", "settleList.realGoodsMoney", "realGoodsMoney"),
_decimal_col("table_charge_money", "settleList.tableChargeMoney", "tableChargeMoney"),
_decimal_col("service_money", "settleList.serviceMoney", "serviceMoney"),
_decimal_col("activity_discount", "settleList.activityDiscount", "activityDiscount"),
_decimal_col("all_coupon_discount", "settleList.allCouponDiscount", "allCouponDiscount"),
_decimal_col("goods_promotion_money", "settleList.goodsPromotionMoney", "goodsPromotionMoney"),
_decimal_col("assistant_promotion_money", "settleList.assistantPromotionMoney", "assistantPromotionMoney"),
_decimal_col("assistant_pd_money", "settleList.assistantPdMoney", "assistantPdMoney"),
_decimal_col("assistant_cx_money", "settleList.assistantCxMoney", "assistantCxMoney"),
_decimal_col("assistant_manual_discount", "settleList.assistantManualDiscount", "assistantManualDiscount"),
_decimal_col("coupon_sale_amount", "settleList.couponSaleAmount", "couponSaleAmount"),
_decimal_col("member_discount_amount", "settleList.memberDiscountAmount", "memberDiscountAmount"),
_decimal_col("point_discount_price", "settleList.pointDiscountPrice", "pointDiscountPrice"),
_decimal_col("point_discount_cost", "settleList.pointDiscountCost", "pointDiscountCost"),
_decimal_col("adjust_amount", "settleList.adjustAmount", "adjustAmount"),
_decimal_col("rounding_amount", "settleList.roundingAmount", "roundingAmount"),
_int_col("payment_method", "settleList.paymentMethod", "paymentMethod"),
_bool_col("can_be_revoked", "settleList.canBeRevoked", "canBeRevoked"),
_bool_col("is_bind_member", "settleList.isBindMember", "isBindMember"),
_bool_col("is_activity", "settleList.isActivity", "isActivity"),
_bool_col("is_use_coupon", "settleList.isUseCoupon", "isUseCoupon"),
_bool_col("is_use_discount", "settleList.isUseDiscount", "isUseDiscount"),
_int_col("operator_id", "settleList.operatorId", "operatorId"),
ColumnSpec("operator_name_snapshot", sources=("settleList.operatorName", "operatorName")),
_int_col("salesman_user_id", "settleList.salesManUserId", "salesmanUserId", "salesManUserId"),
ColumnSpec("salesman_name", sources=("settleList.salesManName", "salesmanName", "settleList.salesmanName")),
ColumnSpec("order_remark", sources=("settleList.orderRemark", "orderRemark")),
_int_col("table_id", "settleList.tableId", "tableId"),
_int_col("serial_number", "settleList.serialNumber", "serialNumber"),
_int_col("revoke_order_id", "settleList.revokeOrderId", "revokeOrderId"),
ColumnSpec("revoke_order_name", sources=("settleList.revokeOrderName", "revokeOrderName")),
ColumnSpec("revoke_time", sources=("settleList.revokeTime", "revokeTime")),
ColumnSpec("create_time", sources=("settleList.createTime", "createTime")),
ColumnSpec("pay_time", sources=("settleList.payTime", "payTime")),
ColumnSpec("site_profile", sources=("siteProfile",)),
),
include_site_column=False,
include_source_endpoint=False,
include_source_endpoint=True,
include_page_no=False,
include_page_size=False,
include_fetched_at=False,
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
include_fetched_at=True,
include_record_index=False,
conflict_columns_override=None,
requires_window=False,
description="Member recharge settlement ODS: GetRechargeSettleList -> settleList raw JSON",
description="Member recharge settlement ODS: GetRechargeSettleList -> data.settleList raw JSON",
),
OdsTaskSpec(
code="ODS_PACKAGE",
code="ODS_GROUP_PACKAGE",
class_name="OdsPackageTask",
table_name="billiards_ods.group_buy_packages",
endpoint="/PackageCoupon/QueryPackageCouponList",
@@ -510,7 +603,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Group-buy package definition ODS: QueryPackageCouponList -> packageCouponList raw JSON",
),
OdsTaskSpec(
code="ODS_GROUP_BUY_REDEMPTION",
@@ -528,7 +621,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Group-buy package redemption ODS: GetSiteTableUseDetails -> siteTableUseDetailsList raw JSON",
),
OdsTaskSpec(
code="ODS_INVENTORY_STOCK",
@@ -545,7 +638,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Inventory summary ODS: GetGoodsStockReport raw JSON",
),
OdsTaskSpec(
code="ODS_INVENTORY_CHANGE",
@@ -562,7 +655,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_fetched_at=False,
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
description="Inventory change records ODS: QueryGoodsOutboundReceipt -> queryDeliveryRecordsList raw JSON",
),
OdsTaskSpec(
code="ODS_TABLES",
@@ -580,7 +673,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Table dimension ODS: GetSiteTables -> siteTables raw JSON",
),
OdsTaskSpec(
code="ODS_GOODS_CATEGORY",
@@ -598,7 +691,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Inventory goods category tree ODS: QueryPrimarySecondaryCategory -> goodsCategoryList raw JSON",
),
OdsTaskSpec(
code="ODS_STORE_GOODS",
@@ -616,10 +709,10 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Store goods master ODS: GetGoodsInventoryList -> orderGoodsList raw JSON",
),
OdsTaskSpec(
code="ODS_TABLE_DISCOUNT",
code="ODS_TABLE_FEE_DISCOUNT",
class_name="OdsTableDiscountTask",
table_name="billiards_ods.table_fee_discount_records",
endpoint="/Site/GetTaiFeeAdjustList",
@@ -634,7 +727,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Table fee discount/adjustment ODS: GetTaiFeeAdjustList -> taiFeeAdjustInfos raw JSON",
),
OdsTaskSpec(
code="ODS_TENANT_GOODS",
@@ -652,7 +745,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
include_record_index=True,
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
description="Tenant goods master ODS: QueryTenantGoods -> tenantGoodsList raw JSON",
),
OdsTaskSpec(
code="ODS_SETTLEMENT_TICKET",
@@ -671,7 +764,7 @@ ODS_TASK_SPECS: Tuple[OdsTaskSpec, ...] = (
conflict_columns_override=("source_file", "record_index"),
requires_window=False,
include_site_id=False,
description="Settlement ticket details ODS: GetOrderSettleTicketNew raw JSON",
),
)
@@ -725,7 +818,7 @@ class OdsSettlementTicketTask(BaseOdsTask):
if not candidates:
self.logger.info(
"%s: window [%s ~ %s] has no tickets to fetch",
spec.code,
context.window_start,
context.window_end,
@@ -755,7 +848,7 @@ class OdsSettlementTicketTask(BaseOdsTask):
counts["updated"] += updated
self.db.commit()
self.logger.info(
"%s: ticket fetch finished, candidates=%s inserted=%s updated=%s skipped=%s",
spec.code,
len(candidates),
inserted,
@@ -767,7 +860,7 @@ class OdsSettlementTicketTask(BaseOdsTask):
except Exception:
counts["errors"] += 1
self.db.rollback()
self.logger.error("%s: ticket fetch failed", spec.code, exc_info=True)
raise
# ------------------------------------------------------------------ helpers
@@ -782,7 +875,7 @@ class OdsSettlementTicketTask(BaseOdsTask):
try:
rows = self.db.query(sql)
except Exception:
self.logger.warning("Failed to query existing tickets; treating as an empty set", exc_info=True)
return set()
return {
@@ -819,7 +912,7 @@ class OdsSettlementTicketTask(BaseOdsTask):
try:
rows = self.db.query(sql, params)
except Exception:
self.logger.warning("Failed to read payment records to obtain settlement order IDs; will fall back to calling the payment API", exc_info=True)
return set()
return {
@@ -853,7 +946,7 @@ class OdsSettlementTicketTask(BaseOdsTask):
if relate_id:
candidate_ids.add(relate_id)
except Exception:
self.logger.warning("Failed to obtain settlement order IDs from the payment API; the fallback source is skipped for this batch", exc_info=True)
return candidate_ids
def _fetch_ticket_payload(self, order_settle_id: int):
@@ -869,10 +962,10 @@ class OdsSettlementTicketTask(BaseOdsTask):
payload = response
except Exception:
self.logger.warning(
"Ticket API call failed orderSettleId=%s", order_settle_id, exc_info=True
)
if isinstance(payload, dict) and isinstance(payload.get("data"), list) and len(payload["data"]) == 1:
# Local stubs/replays may wrap the response in a single-element list; unwrap it here to match the real structure
payload = payload["data"][0]
return payload
@@ -899,27 +992,29 @@ def _build_task_class(spec: OdsTaskSpec) -> Type[BaseOdsTask]:
ENABLED_ODS_CODES = {
"ODS_ASSISTANT_ACCOUNTS",
"ODS_ASSISTANT_ACCOUNT",
"ODS_ASSISTANT_LEDGER",
"ODS_ASSISTANT_ABOLISH",
"ODS_INVENTORY_CHANGE",
"ODS_INVENTORY_STOCK",
"ODS_PACKAGE",
"ODS_GROUP_PACKAGE",
"ODS_GROUP_BUY_REDEMPTION",
"ODS_MEMBER",
"ODS_MEMBER_BALANCE",
"ODS_MEMBER_CARD",
"ODS_PAYMENT",
"ODS_REFUND",
"ODS_COUPON_VERIFY",
"ODS_PLATFORM_COUPON",
"ODS_RECHARGE_SETTLE",
"ODS_TABLE_USE",
"ODS_TABLES",
"ODS_GOODS_CATEGORY",
"ODS_STORE_GOODS",
"ODS_TABLE_DISCOUNT",
"ODS_TABLE_FEE_DISCOUNT",
"ODS_STORE_GOODS_SALES",
"ODS_TENANT_GOODS",
"ODS_SETTLEMENT_TICKET",
"ODS_ORDER_SETTLE",
"ODS_SETTLEMENT_RECORDS",
}
ODS_TASK_CLASSES: Dict[str, Type[BaseOdsTask]] = {
@@ -931,3 +1026,4 @@ ODS_TASK_CLASSES: Dict[str, Type[BaseOdsTask]] = {
ODS_TASK_CLASSES["ODS_SETTLEMENT_TICKET"] = OdsSettlementTicketTask
__all__ = ["ODS_TASK_CLASSES", "ODS_TASK_SPECS", "BaseOdsTask", "ENABLED_ODS_CODES"]


@@ -1,4 +1,4 @@
# -*- coding: utf-8 -*-
from .base_dwd_task import BaseDwdTask
from loaders.facts.payment import PaymentLoader
from models.parsers import TypeParser
@@ -29,7 +29,7 @@ class PaymentsDwdTask(BaseDwdTask):
# Iterate ODS Data
batches = self.iter_ods_rows(
table_name="billiards_ods.ods_payment_record",
table_name="billiards_ods.payment_transactions",
columns=["site_id", "pay_id", "payload", "fetched_at"],
start_time=window_start,
end_time=window_end
@@ -136,3 +136,4 @@ class PaymentsDwdTask(BaseDwdTask):
except Exception as e:
self.logger.warning(f"Error parsing payment: {e}")
return None


@@ -1,4 +1,4 @@
# -*- coding: utf-8 -*-
"""Unit tests for the new ODS ingestion tasks."""
import logging
import os
@@ -22,21 +22,21 @@ def _build_config(tmp_path):
return create_test_config("ONLINE", archive_dir, temp_dir)
def test_ods_assistant_accounts_ingest(tmp_path):
"""Ensure ODS_ASSISTANT_ACCOUNTS task stores raw payload with record_index dedup keys."""
def test_assistant_accounts_masters_ingest(tmp_path):
"""Ensure assistant_accounts_masterS task stores raw payload with record_index dedup keys."""
config = _build_config(tmp_path)
sample = [
{
"id": 5001,
"assistant_no": "A01",
"nickname": "小张",
}
]
api = FakeAPIClient({"/PersonnelManagement/SearchAssistantInfo": sample})
task_cls = ODS_TASK_CLASSES["ODS_ASSISTANT_ACCOUNTS"]
task_cls = ODS_TASK_CLASSES["assistant_accounts_masterS"]
with get_db_operations() as db_ops:
task = task_cls(config, db_ops, api, logging.getLogger("test_ods_assistant_accounts"))
task = task_cls(config, db_ops, api, logging.getLogger("test_assistant_accounts_masters"))
result = task.execute()
assert result["status"] == "SUCCESS"
@@ -49,21 +49,21 @@ def test_ods_assistant_accounts_ingest(tmp_path):
assert '"id": 5001' in row["payload"]
def test_ods_inventory_change_ingest(tmp_path):
"""Ensure ODS_INVENTORY_CHANGE task stores raw payload with record_index dedup keys."""
def test_goods_stock_movements_ingest(tmp_path):
"""Ensure goods_stock_movements task stores raw payload with record_index dedup keys."""
config = _build_config(tmp_path)
sample = [
{
"siteGoodsStockId": 123456,
"stockType": 1,
"goodsName": "Test product",
}
]
api = FakeAPIClient({"/GoodsStockManage/QueryGoodsOutboundReceipt": sample})
task_cls = ODS_TASK_CLASSES["ODS_INVENTORY_CHANGE"]
task_cls = ODS_TASK_CLASSES["goods_stock_movements"]
with get_db_operations() as db_ops:
task = task_cls(config, db_ops, api, logging.getLogger("test_ods_inventory_change"))
task = task_cls(config, db_ops, api, logging.getLogger("test_goods_stock_movements"))
result = task.execute()
assert result["status"] == "SUCCESS"
@@ -75,7 +75,7 @@ def test_ods_inventory_change_ingest(tmp_path):
assert '"siteGoodsStockId": 123456' in row["payload"]
def test_ods_member_profiles_ingest(tmp_path):
def test_member_profiless_ingest(tmp_path):
"""Ensure ODS_MEMBER task stores tenantMemberInfos raw JSON."""
config = _build_config(tmp_path)
sample = [{"tenantMemberInfos": [{"id": 101, "mobile": "13800000000"}]}]
@@ -110,14 +110,14 @@ def test_ods_payment_ingest(tmp_path):
def test_ods_settlement_records_ingest(tmp_path):
"""Ensure ODS_ORDER_SETTLE task stores settleList raw JSON."""
"""Ensure settlement_records task stores settleList raw JSON."""
config = _build_config(tmp_path)
sample = [{"data": {"settleList": [{"id": 701, "orderTradeNo": 8001}]}}]
api = FakeAPIClient({"/Site/GetAllOrderSettleList": sample})
task_cls = ODS_TASK_CLASSES["ODS_ORDER_SETTLE"]
task_cls = ODS_TASK_CLASSES["settlement_records"]
with get_db_operations() as db_ops:
task = task_cls(config, db_ops, api, logging.getLogger("test_ods_order_settle"))
task = task_cls(config, db_ops, api, logging.getLogger("test_settlement_records"))
result = task.execute()
assert result["status"] == "SUCCESS"
@@ -158,3 +158,4 @@ def test_ods_settlement_ticket_by_payment_relate_ids(tmp_path):
and call.get("params", {}).get("orderSettleId") == 9001
for call in api.calls
)

File diff suppressed because it is too large

1
temp_chinese.txt Normal file

@@ -0,0 +1 @@
含义

File diff suppressed because it is too large

File diff suppressed because it is too large

29
tmp_debug_sql.py Normal file

@@ -0,0 +1,29 @@
import os, psycopg2
from etl_billiards.tasks.dwd_load_task import DwdLoadTask
dwd_table="billiards_dwd.dwd_table_fee_log"
ods_table="billiards_ods.table_fee_transactions"
conn=psycopg2.connect(os.environ["PG_DSN"])
cur=conn.cursor()
task=DwdLoadTask(config={}, db_connection=None, api_client=None, logger=None)
cur.execute("SELECT column_name FROM information_schema.columns WHERE table_schema=%s AND table_name=%s", ("billiards_dwd", "dwd_table_fee_log"))
dwd_cols=[r[0].lower() for r in cur.fetchall()]
cur.execute("SELECT column_name FROM information_schema.columns WHERE table_schema=%s AND table_name=%s", ("billiards_ods", "table_fee_transactions"))
ods_cols=[r[0].lower() for r in cur.fetchall()]
cur.execute("SELECT column_name,data_type FROM information_schema.columns WHERE table_schema=%s AND table_name=%s", ("billiards_dwd", "dwd_table_fee_log"))
dwd_types={r[0].lower(): r[1].lower() for r in cur.fetchall()}
cur.execute("SELECT column_name,data_type FROM information_schema.columns WHERE table_schema=%s AND table_name=%s", ("billiards_ods", "table_fee_transactions"))
ods_types={r[0].lower(): r[1].lower() for r in cur.fetchall()}
mapping=task.FACT_MAPPINGS.get(dwd_table)
if mapping:
    insert_cols=[d for d,o,_ in mapping if o in ods_cols]
    select_exprs=[task._cast_expr(o,cast_type) for d,o,cast_type in mapping if o in ods_cols]
else:
    insert_cols=[c for c in dwd_cols if c in ods_cols and c not in task.SCD_COLS]
    select_exprs=task._build_fact_select_exprs(insert_cols,dwd_types,ods_types)
print('insert_cols', insert_cols)
print('select_exprs', select_exprs)
sql=f"INSERT INTO {task._format_table(dwd_table,'billiards_dwd')} ({', '.join(f'\"{c}\"' for c in insert_cols)}) SELECT {', '.join(select_exprs)} FROM {task._format_table(ods_table,'billiards_ods')}"
print(sql)
cur.close(); conn.close()

7
tmp_drop_dwd.py Normal file

@@ -0,0 +1,7 @@
import os, psycopg2
conn=psycopg2.connect(os.environ["PG_DSN"])
conn.autocommit=True
cur=conn.cursor()
cur.execute('DROP SCHEMA IF EXISTS billiards_dwd CASCADE')
cur.close(); conn.close()
print('dropped billiards_dwd')

19
tmp_dwd_tasks.py Normal file

@@ -0,0 +1,19 @@
import os
import psycopg2
DSN = os.environ.get('PG_DSN')
store_id = int(os.environ.get('STORE_ID','2790685415443269'))
conn = psycopg2.connect(DSN)
conn.autocommit = True
cur = conn.cursor()
rows = []
for code in ('INIT_DWD_SCHEMA','DWD_LOAD_FROM_ODS','DWD_QUALITY_CHECK'):
    cur.execute("SELECT task_id FROM etl_admin.etl_task WHERE task_code=%s AND store_id=%s", (code, store_id))
    if cur.fetchone():
        cur.execute("UPDATE etl_admin.etl_task SET enabled=TRUE, updated_at=now() WHERE task_code=%s AND store_id=%s", (code, store_id))
        rows.append((code, 'updated'))
    else:
        cur.execute("INSERT INTO etl_admin.etl_task(task_code,store_id,enabled,cursor_field,window_minutes_default,overlap_seconds,page_size,params) VALUES (%s,%s,TRUE,NULL,60,120,1000,'{}') RETURNING task_id", (code, store_id))
        rows.append((code, 'inserted', cur.fetchone()[0]))
print(rows)
cur.close(); conn.close()

28
tmp_problems.py Normal file

@@ -0,0 +1,28 @@
import os, psycopg2
from etl_billiards.tasks.dwd_load_task import DwdLoadTask
conn=psycopg2.connect(os.environ['PG_DSN'])
cur=conn.cursor()
problems=[]
for dwd_table, ods_table in DwdLoadTask.TABLE_MAP.items():
    if dwd_table.split('.')[-1].startswith('dwd_'):
        if '.' in dwd_table:
            dschema, dtable = dwd_table.split('.')
        else:
            dschema, dtable = 'billiards_dwd', dwd_table
        if '.' in ods_table:
            oschema, otable = ods_table.split('.')
        else:
            oschema, otable = 'billiards_ods', ods_table
        cur.execute("SELECT column_name,data_type FROM information_schema.columns WHERE table_schema=%s AND table_name=%s", (dschema,dtable))
        dcols={r[0].lower():r[1].lower() for r in cur.fetchall()}
        cur.execute("SELECT column_name,data_type FROM information_schema.columns WHERE table_schema=%s AND table_name=%s", (oschema,otable))
        ocols={r[0].lower():r[1].lower() for r in cur.fetchall()}
        common=set(dcols)&set(ocols)
        missing_dwd=list(set(ocols)-set(dcols))
        missing_ods=list(set(dcols)-set(ocols))
        mismatches=[(c,dcols[c],ocols[c]) for c in sorted(common) if dcols[c]!=ocols[c]]
        problems.append((dwd_table,missing_dwd,missing_ods,mismatches))
cur.close();conn.close()
for p in problems:
    print(p)
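
The column comparison in tmp_problems.py can be exercised without a database if it is factored into a pure function. The sketch below is only an illustration: `compare_columns` is a hypothetical helper that is not part of this commit, and it assumes each side is a dict mapping lowercase column name to data type, as built from information_schema.columns in the script. Unlike the script, it sorts the "missing" lists so the output is stable between runs.

```python
from typing import Dict, List, Tuple


def compare_columns(
    dwd_cols: Dict[str, str], ods_cols: Dict[str, str]
) -> Tuple[List[str], List[str], List[Tuple[str, str, str]]]:
    """Return (missing_in_dwd, missing_in_ods, type_mismatches), all sorted.

    Hypothetical helper: mirrors the set arithmetic in tmp_problems.py.
    """
    # Columns present in ODS but absent from DWD, and the reverse.
    missing_in_dwd = sorted(set(ods_cols) - set(dwd_cols))
    missing_in_ods = sorted(set(dwd_cols) - set(ods_cols))
    # Columns present on both sides whose declared types differ.
    common = sorted(set(dwd_cols) & set(ods_cols))
    mismatches = [
        (name, dwd_cols[name], ods_cols[name])
        for name in common
        if dwd_cols[name] != ods_cols[name]
    ]
    return missing_in_dwd, missing_in_ods, mismatches
```

The two dicts can be filled from the same information_schema queries the script already runs, so the database access and the comparison logic stay separate.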

26
tmp_run_sql.py Normal file

@@ -0,0 +1,26 @@
import os, psycopg2
from etl_billiards.tasks.dwd_load_task import DwdLoadTask
dwd_table="billiards_dwd.dwd_table_fee_log"
ods_table="billiards_ods.table_fee_transactions"
conn=psycopg2.connect(os.environ["PG_DSN"])
cur=conn.cursor()
task=DwdLoadTask(config={}, db_connection=None, api_client=None, logger=None)
cur.execute("SELECT column_name FROM information_schema.columns WHERE table_schema=%s AND table_name=%s", ("billiards_dwd", "dwd_table_fee_log"))
dwd_cols=[r[0].lower() for r in cur.fetchall()]
cur.execute("SELECT column_name FROM information_schema.columns WHERE table_schema=%s AND table_name=%s", ("billiards_ods", "table_fee_transactions"))
ods_cols=[r[0].lower() for r in cur.fetchall()]
cur.execute("SELECT column_name,data_type FROM information_schema.columns WHERE table_schema=%s AND table_name=%s", ("billiards_dwd", "dwd_table_fee_log"))
dwd_types={r[0].lower(): r[1].lower() for r in cur.fetchall()}
cur.execute("SELECT column_name,data_type FROM information_schema.columns WHERE table_schema=%s AND table_name=%s", ("billiards_ods", "table_fee_transactions"))
ods_types={r[0].lower(): r[1].lower() for r in cur.fetchall()}
mapping=task.FACT_MAPPINGS.get(dwd_table)
insert_cols=[d for d,o,_ in mapping if o in ods_cols]
select_exprs=[task._cast_expr(o,cast_type) for d,o,cast_type in mapping if o in ods_cols]
sql=f"INSERT INTO {task._format_table(dwd_table,'billiards_dwd')} ({', '.join(f'\"{c}\"' for c in insert_cols)}) SELECT {', '.join(select_exprs)} FROM {task._format_table(ods_table,'billiards_ods')} LIMIT 1"
print(sql)
cur.execute(sql)
conn.commit()
print('ok')
cur.close(); conn.close()
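
The INSERT ... SELECT string in tmp_debug_sql.py and tmp_run_sql.py quotes column names with a backslash inside an f-string expression, which is a syntax error before Python 3.12. A minimal sketch of the same statement builder that avoids this is shown below. `quote_ident` and `build_insert_select` are hypothetical names, not part of this commit, and the table names are assumed to be already formatted (the commit delegates that to `task._format_table`, whose behavior is not shown here).

```python
from typing import Optional, Sequence


def quote_ident(name: str) -> str:
    """Double-quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_insert_select(
    dwd_table: str,
    ods_table: str,
    insert_cols: Sequence[str],
    select_exprs: Sequence[str],
    limit: Optional[int] = None,
) -> str:
    """Build an INSERT ... SELECT statement (hypothetical helper).

    dwd_table and ods_table are assumed to be pre-formatted, trusted names.
    """
    if len(insert_cols) != len(select_exprs):
        raise ValueError("insert_cols and select_exprs must have the same length")
    # Join outside the f-string so no backslash appears in an f-string expression.
    cols_sql = ", ".join(quote_ident(c) for c in insert_cols)
    exprs_sql = ", ".join(select_exprs)
    sql = f"INSERT INTO {dwd_table} ({cols_sql}) SELECT {exprs_sql} FROM {ods_table}"
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql
```

The length check guards against a mapping whose column list and expression list drift apart, which would otherwise surface only as a database error at execution time.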