Compare commits

...

12 Commits

Author SHA1 Message Date
Neo 0ab040b9fb Tidy up project 2025-12-09 05:43:04 +08:00
Neo 0c29bd41f8 Tidy up project 2025-12-09 05:42:57 +08:00
Neo 561c640700 DWD complete 2025-12-09 04:57:05 +08:00
Neo f301cc1fd5 Update some files 2025-12-06 00:17:42 +08:00
Neo 6f1d163a99 DWD docs confirmed 2025-12-05 18:57:20 +08:00
Neo a6ad343092 ODS complete 2025-11-30 07:19:05 +08:00
Neo b9b050bb5d ODS complete 2025-11-30 07:18:55 +08:00
Neo cbd16a39ba Interim update 2025-11-20 01:27:33 +08:00
Neo 92f219b575 Interim update 2025-11-20 01:27:04 +08:00
Neo b1f64c4bac Version change 2025-11-19 05:35:22 +08:00
Neo ed47754b46 Version change 2025-11-19 05:35:10 +08:00
Neo fbee8a751e Sync 2025-11-19 05:32:03 +08:00
123 changed files with 35176 additions and 173399 deletions


@@ -1,5 +1,57 @@
# Billiards Hall ETL System (Modular Version) — Merged Documentation
# 飞球 (Feiqiu) ETL System (ODS → DWD)
This document merges the former set of documents (`INDEX.md`, `QUICK_START.md`, `ARCHITECTURE.md`, `MIGRATION_GUIDE.md`, `PROJECT_STRUCTURE.md`, `README.md`, etc.) and keeps only content relevant to the **current project itself**: project description, directory layout, architecture design, data and control flow, and migration/extension guides. Change history and refactoring narratives are excluded.
A store-facing ETL: pull (or offline-ingest) upstream JSON, land it in ODS, then clean and load it into DWD (SCD2 dimensions, incremental facts), with data-quality reports.
## See `etl_billiards`
## Quick Run (Offline Sample JSON)
1) Environment: Python 3.10+, PostgreSQL; key `.env` entries: `PG_DSN=postgresql://local-Python:Neo-local-1991125@100.64.0.4:5432/LLZQ-test`, `INGEST_SOURCE_DIR=C:\dev\LLTQ\export\test-json-doc`
2) Install dependencies:
```bash
cd etl_billiards
pip install -r requirements.txt
```
3) One-shot ODS → DWD → quality check:
```bash
python -m etl_billiards.cli.main --tasks INIT_ODS_SCHEMA,INIT_DWD_SCHEMA --pipeline-flow INGEST_ONLY
python -m etl_billiards.cli.main --tasks MANUAL_INGEST --pipeline-flow INGEST_ONLY --ingest-source "C:\dev\LLTQ\export\test-json-doc"
python -m etl_billiards.cli.main --tasks DWD_LOAD_FROM_ODS --pipeline-flow INGEST_ONLY
python -m etl_billiards.cli.main --tasks DWD_QUALITY_CHECK --pipeline-flow INGEST_ONLY
# Report: etl_billiards/reports/dwd_quality_report.json
```
## Directory and File Roles
- Root: `etl_billiards/` main code; `requirements.txt` dependencies; `run_etl.sh/.bat` launch scripts; `.env/.env.example` configuration; `tmp/` drafts/debugging/backups.
- `etl_billiards/` main directories:
  - `config/`: `defaults.py` default values, `env_parser.py` parses .env, `settings.py` unified config loading.
  - `api/`: `client.py` HTTP requests, retry, and pagination.
  - `database/`: `connection.py` connection wrapper, `operations.py` batch upsert; DDL: `schema_ODS_doc.sql`, `schema_dwd_doc.sql`.
  - `tasks/`: business tasks
    - `init_schema_task.py`: INIT_ODS_SCHEMA / INIT_DWD_SCHEMA.
    - `manual_ingest_task.py`: sample JSON → ODS.
    - `dwd_load_task.py`: ODS → DWD (mappings, SCD2 / incremental facts).
    - Other tasks as needed.
  - `loaders/`: ODS/DWD/SCD2 loader implementations.
  - `scd/`: `scd2_handler.py` manages SCD2 dimension history.
  - `quality/`: quality checkers (row-count / amount comparison).
  - `orchestration/`: `scheduler.py` scheduling; `task_registry.py` task registration; `run_tracker.py` run records.
  - `scripts/`: rebuild/test/health-check utilities.
  - `docs/`: `ods_to_dwd_mapping.md` mapping notes, `ods_sample_json.md` sample-JSON notes, `dwd_quality_check.md` quality-check notes.
  - `reports/`: quality-check output (e.g. `dwd_quality_report.json`).
  - `tests/`: unit/integration tests; `utils/`: shared utilities.
  - `backups/` (if present): backups of key files.
## Business Flow and File Relationships
1) Scheduling entry: `cli/main.py` parses the CLI → `orchestration/scheduler.py` creates tasks per `task_registry.py` → initializes the DB/API/Config context.
2) ODS: `init_schema_task.py` runs `schema_ODS_doc.sql` to create tables; `manual_ingest_task.py` reads JSON from `INGEST_SOURCE_DIR` and batch-upserts into ODS.
3) DWD: `init_schema_task.py` runs `schema_dwd_doc.sql` to create tables; `dwd_load_task.py` cleans and writes from ODS into DWD per `TABLE_MAP/FACT_MAPPINGS` (dimensions via SCD2, `scd/scd2_handler.py`; facts incrementally by time/watermark).
4) Quality check: the quality task reads ODS/DWD, compares row counts/amounts, and writes `reports/dwd_quality_report.json`.
5) Configuration: `config/defaults.py` + `.env` + CLI arguments are layered; HTTP (if online mode is enabled) goes through `api/client.py`; DB access goes through `database/connection.py`.
6) Docs: `docs/ods_to_dwd_mapping.md` records field mappings; `docs/ods_sample_json.md` describes the sample data structure for debugging reference.
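The three-layer configuration overlay in step 5 can be sketched as a plain dict merge. This is a minimal illustration under assumed names: the real loader lives in `config/settings.py`, and `build_config` here is hypothetical.

```python
# Minimal sketch of the defaults < .env < CLI overlay (illustrative names).
def build_config(defaults: dict, env: dict, cli: dict) -> dict:
    cfg = dict(defaults)
    # Later layers win: environment variables override defaults,
    # CLI arguments override both; None means "not provided".
    for layer in (env, cli):
        cfg.update({k: v for k, v in layer.items() if v is not None})
    return cfg

cfg = build_config(
    defaults={"page_size": 200, "timeout": 20},
    env={"timeout": 30},
    cli={"page_size": 500, "timeout": None},
)
# page_size comes from the CLI, timeout from the environment
```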
## Current Status (2025-12-09)
- Sample JSON fully ingested; DWD row counts match ODS.
- The category dimension is flattened (primary + secondary): `dim_goods_category` has 26 rows (category_level/leaf populated).
- Remaining nulls mostly reflect empty source data; confirm upstream availability before backfilling.
## Candidates for Cleanup/Archive
- Drafts, old backups, and debug scripts under `tmp/` and `tmp/etl_billiards_misc/` are reference-only; the run does not depend on them.
- The root keeps only the essentials (README, requirements, run_etl.*, .env/.env.example); other temporary files have been moved to tmp.

README_FULL.md Normal file

@@ -0,0 +1,216 @@
# 飞球 (Feiqiu) ETL System (ODS → DWD) — Detailed Edition
> This is the detailed project guide, kept in sync with the current code; it covers ODS tasks, DWD loading, quality checks, and development/extension notes.
---
## 1. Project Overview
A store-facing ETL: collect orders, payments, members, inventory, and other data from the upstream API or offline JSON, land it in **ODS**, then clean and load it into **DWD** (SCD2 dimensions, incremental facts), and emit quality reports. The project uses a modular, layered architecture (config, API, database, loader/SCD, quality, orchestration, CLI, tests), all driven through the CLI.
---
## 2. Quick Start (Offline Sample JSON)
**Requirements**: Python 3.10+, PostgreSQL; key `.env` entries:
- `PG_DSN=postgresql://local-Python:Neo-local-1991125@100.64.0.4:5432/LLZQ-test`
- `INGEST_SOURCE_DIR=C:\dev\LLTQ\export\test-json-doc`
**Install dependencies**
```bash
cd etl_billiards
pip install -r requirements.txt
```
**One-shot ODS → DWD → quality check (offline replay)**
```bash
# Initialize ODS + DWD
python -m etl_billiards.cli.main --tasks INIT_ODS_SCHEMA,INIT_DWD_SCHEMA --pipeline-flow INGEST_ONLY
# Ingest sample JSON into ODS (INGEST_SOURCE_DIR from .env can be overridden here)
python -m etl_billiards.cli.main --tasks MANUAL_INGEST --pipeline-flow INGEST_ONLY --ingest-source "C:\dev\LLTQ\export\test-json-doc"
# Load DWD from ODS
python -m etl_billiards.cli.main --tasks DWD_LOAD_FROM_ODS --pipeline-flow INGEST_ONLY
# Quality-check report
python -m etl_billiards.cli.main --tasks DWD_QUALITY_CHECK --pipeline-flow INGEST_ONLY
# Report output: etl_billiards/reports/dwd_quality_report.json
```
> Steps can also run individually:
> - Schema only: `python -m etl_billiards.cli.main --tasks INIT_ODS_SCHEMA`
> - ODS ingest only: `python -m etl_billiards.cli.main --tasks MANUAL_INGEST`
> - DWD load only: `python -m etl_billiards.cli.main --tasks INIT_DWD_SCHEMA,DWD_LOAD_FROM_ODS`
---
## 3. Configuration and Paths
- Sample data directory: `C:\dev\LLTQ\export\test-json-doc` (overridable via `INGEST_SOURCE_DIR` in `.env`).
- Log/export directories: `LOG_ROOT`, `EXPORT_ROOT` (in `.env`).
- Report: `etl_billiards/reports/dwd_quality_report.json`.
- DDL: `etl_billiards/database/schema_ODS_doc.sql`, `etl_billiards/database/schema_dwd_doc.sql`.
- Task registration: `etl_billiards/orchestration/task_registry.py` (enabled by default: INIT_ODS_SCHEMA, MANUAL_INGEST, INIT_DWD_SCHEMA, DWD_LOAD_FROM_ODS, DWD_QUALITY_CHECK).
**Security note**: keep database credentials in `.env` or a managed secret store, and use a least-privilege account in production.
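As a minimal sketch of that security note, credentials can be read from the environment rather than hard-coded. This assumes `PG_DSN` has been exported or loaded from `.env`; `get_dsn` is an illustrative helper, not part of the project's API.

```python
# Sketch: fail fast if the credential is missing instead of embedding it in code.
import os

def get_dsn() -> str:
    dsn = os.environ.get("PG_DSN")
    if not dsn:
        raise RuntimeError("PG_DSN is not set; configure it in .env or a secret store")
    return dsn
```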
---
## 4. Directory Structure and Key Files
- Root: `etl_billiards/` main code; `requirements.txt` dependencies; `run_etl.sh/.bat` launch scripts; `.env/.env.example` configuration; `tmp/` draft/debug archive.
- `config/`: `defaults.py` default values, `env_parser.py` parses .env, `settings.py` unified AppConfig loading.
- `api/`: `client.py` HTTP requests, retry, pagination.
- `database/`: `connection.py` connection wrapper; `operations.py` batch upsert; DDL SQL (ODS/DWD).
- `tasks/`:
  - `init_schema_task.py` (INIT_ODS_SCHEMA/INIT_DWD_SCHEMA);
  - `manual_ingest_task.py` (sample JSON → ODS);
  - `dwd_load_task.py` (ODS → DWD mappings, SCD2 / incremental facts);
  - other tasks as needed.
- `loaders/`: ODS/DWD/SCD2 loader implementations.
- `scd/`: `scd2_handler.py` manages SCD2 dimension history.
- `quality/`: quality checkers (row-count / amount comparison).
- `orchestration/`: `scheduler.py` scheduling; `task_registry.py` registration; `run_tracker.py` run records; `cursor_manager.py` watermark management.
- `scripts/`: rebuild/test/health-check utilities.
- `docs/`: `ods_to_dwd_mapping.md` mapping notes; `ods_sample_json.md` sample-JSON notes; `dwd_quality_check.md` quality-check notes.
- `reports/`: quality-check output (e.g. `dwd_quality_report.json`).
- `tests/`: unit/integration tests; `utils/`: shared utilities; `backups/`: backups (if present).
---
## 5. Architecture and Flow
Execution chain (control flow):
1) The CLI (`cli/main.py`) parses arguments → builds AppConfig → initializes logging and the DB connection;
2) The orchestration layer (`scheduler.py`) instantiates tasks from the registry in `task_registry.py`, setting run_uuid, cursor (watermark), and context;
3) The task base-class template:
   - obtain the time window/watermark (cursor_manager);
   - fetch data: online mode calls `api/client.py` (pagination, retry); offline mode reads JSON files directly;
   - parse and validate: type conversion and required-field checks (task-internal parse/validate);
   - load: loaders (`loaders/`) perform batch upsert/SCD2/incremental writes (backed by `database/operations.py`);
   - quality check (when needed): the quality module compares row counts/amounts;
   - update the watermark and run record (`run_tracker.py`), then commit or roll back the transaction.
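The time-window/watermark step can be sketched as follows. This is illustrative only: the real logic lives in `orchestration/cursor_manager.py`, and the rollback corresponds to `OVERLAP_SECONDS` in `.env`; `compute_window` is a hypothetical name.

```python
# Sketch of the incremental window with an overlap rollback, so records
# written right at the previous watermark are not missed on the next run.
from datetime import datetime, timedelta

def compute_window(last_watermark: datetime, now: datetime,
                   overlap_seconds: int = 120) -> tuple[datetime, datetime]:
    """Return (start, end): start is the watermark rolled back by the overlap."""
    start = last_watermark - timedelta(seconds=overlap_seconds)
    return start, now
```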
Data flow and dependencies:
- Configuration: `config/defaults.py` + `.env` + CLI arguments layer into AppConfig.
- API access: `api/client.py` handles pagination/retry; offline ingest reads files directly.
- DB access: `database/connection.py` provides the connection context; `operations.py` handles batch upsert/paged writes.
- ODS: `manual_ingest_task.py` reads JSON → ODS tables (retaining payload/source/timestamps).
- DWD: `dwd_load_task.py` selects fields from ODS per `TABLE_MAP/FACT_MAPPINGS`; dimensions go through SCD2 (`scd/scd2_handler.py`); facts load incrementally (field expressions such as JSON->> and CAST are supported).
- Quality: the `quality` module (or related tasks) compares ODS/DWD row counts and amounts, writing to `reports/`.
---
## 6. ODS → DWD Strategy
1. ODS keeps raw copies: retain source primary keys, payload, and time/source info.
2. DWD cleans: dimensions via SCD2; facts incrementally by time/watermark; standardize field types, units, and enums while keeping traceability fields.
3. Unified business keys: consistent naming for site_id, member_id, table_id, order_settle_id, order_trade_no, etc.
4. No over-aggregation: DWD stays at detail/lightly cleaned grain; aggregation is left to DWS/reporting.
5. De-nesting: arrays expand into child tables/rows; repeated profiles are extracted into dimensions.
6. Long-term evolution: prefer adding columns/tables over frequently changing existing table structures.
---
## 7. Common CLI Commands
```bash
# Run all registered tasks
python -m etl_billiards.cli.main
# Run specific tasks
python -m etl_billiards.cli.main --tasks INIT_ODS_SCHEMA,MANUAL_INGEST
# Override the DSN
python -m etl_billiards.cli.main --pg-dsn "postgresql://user:pwd@host:5432/db"
# Override the API
python -m etl_billiards.cli.main --api-base "https://api.example.com" --api-token "..."
# Dry run (no DB writes)
python -m etl_billiards.cli.main --dry-run --tasks DWD_LOAD_FROM_ODS
```
---
## 8. Testing (ONLINE / OFFLINE)
- `TEST_MODE=ONLINE`: call the real API; full E/T/L.
- `TEST_MODE=OFFLINE`: read offline JSON from `TEST_JSON_ARCHIVE_DIR`; Transform + Load only.
- `TEST_DB_DSN`: if set, integration tests connect to a real database; otherwise an in-memory/temporary one is used.
Example:
```bash
TEST_MODE=ONLINE pytest tests/unit/test_etl_tasks_online.py
TEST_MODE=OFFLINE TEST_JSON_ARCHIVE_DIR=tests/source-data-doc pytest tests/unit/test_etl_tasks_offline.py
python scripts/test_db_connection.py --dsn postgresql://user:pwd@host:5432/db --query "SELECT 1"
```
---
## 9. Development and Extension
- New task: subclass BaseTask in `tasks/`, implement `get_task_code/execute`, and register it in `orchestration/task_registry.py`.
- New loader/checker: follow `loaders/` and `quality/`, reusing the batch-upsert and quality-check interfaces.
- Configuration: `config/defaults.py` + `.env` + CLI layering; new settings must be declared in both defaults and env_parser.
---
## 10. ODS Task Go-Live Guide
- Task seeding script: `etl_billiards/database/seed_ods_tasks.sql` (replace store_id, then run `psql "$PG_DSN" -f ...`).
- Confirm the required ODS tasks are enabled in `etl_admin.etl_task`.
- Offline replay: `scripts/rebuild_ods_from_json` (if present) can rebuild ODS from local JSON.
- Unit tests: `pytest etl_billiards/tests/unit/test_ods_tasks.py`.
---
## 11. ODS Table Overview (Data Paths)
| ODS table | API path | Data list path |
| ------------------------------------ | ------------------------------------------------- | ----------------------------- |
| assistant_accounts_master | /PersonnelManagement/SearchAssistantInfo | data.assistantInfos |
| assistant_service_records | /AssistantPerformance/GetOrderAssistantDetails | data.orderAssistantDetails |
| assistant_cancellation_records | /AssistantPerformance/GetAbolitionAssistant | data.abolitionAssistants |
| goods_stock_movements | /GoodsStockManage/QueryGoodsOutboundReceipt | data.queryDeliveryRecordsList |
| goods_stock_summary | /TenantGoods/GetGoodsStockReport | data |
| group_buy_packages | /PackageCoupon/QueryPackageCouponList | data.packageCouponList |
| group_buy_redemption_records | /Site/GetSiteTableUseDetails | data.siteTableUseDetailsList |
| member_profiles | /MemberProfile/GetTenantMemberList | data.tenantMemberInfos |
| member_balance_changes | /MemberProfile/GetMemberCardBalanceChange | data.tenantMemberCardLogs |
| member_stored_value_cards | /MemberProfile/GetTenantMemberCardList | data.tenantMemberCards |
| payment_transactions | /PayLog/GetPayLogListPage | data |
| platform_coupon_redemption_records | /Promotion/GetOfflineCouponConsumePageList | data |
| recharge_settlements | /Site/GetRechargeSettleList | data.settleList |
| refund_transactions | /Order/GetRefundPayLogList | data |
| settlement_records | /Site/GetAllOrderSettleList | data.settleList |
| settlement_ticket_details | /Order/GetOrderSettleTicketNew | full JSON |
| site_tables_master | /Table/GetSiteTables | data.siteTables |
| stock_goods_category_tree | /TenantGoodsCategory/QueryPrimarySecondaryCategory| data.goodsCategoryList |
| store_goods_master | /TenantGoods/GetGoodsInventoryList | data.orderGoodsList |
| store_goods_sales_records | /TenantGoods/GetGoodsSalesList | data.orderGoodsLedgers |
| table_fee_discount_records | /Site/GetTaiFeeAdjustList | data.taiFeeAdjustInfos |
| table_fee_transactions | /Site/GetSiteTableOrderDetails | data.siteTableUseDetailsList |
| tenant_goods_master | /TenantGoods/QueryTenantGoods | data.tenantGoodsList |
> Full field-level mappings: see `docs/` and the ODS/DWD DDL.
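The "data list path" column denotes a dotted path into the JSON response. A small sketch of resolving such a path (`extract_list` is an illustrative helper, not the client's actual one):

```python
# Resolve a dotted path like "data.assistantInfos" inside an API response,
# returning [] whenever the path is missing or does not end in a list.
def extract_list(payload: dict, path: str) -> list:
    cur = payload
    for key in path.split("."):
        if not isinstance(cur, dict):
            return []
        cur = cur.get(key)
    return cur if isinstance(cur, list) else []

resp = {"code": 0, "data": {"assistantInfos": [{"id": 1}]}}
extract_list(resp, "data.assistantInfos")  # -> [{"id": 1}]
```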
---
## 12. DWD Dimensions and Modeling Notes
1. One grain, one business key: a DWD table carries a single business event/grain; never mix grains.
2. Understand the business chain first, then model; do not mechanically create one table per JSON list.
3. Unified business keys: site_id, member_id, table_id, order_settle_id, order_trade_no, etc. must be named consistently.
4. Keep detail and do not over-aggregate; leave aggregation to DWS/reporting.
5. Clean and standardize while keeping traceability fields (source keys, timestamps, amounts, payload).
6. De-nest and decouple: expand arrays into child rows; extract repeated profiles into dimensions.
7. Evolve by adding columns/tables first, minimizing breakage of existing structures.
---
## 13. Current Status (2025-12-09)
- Sample JSON fully ingested; DWD row counts match ODS.
- The category dimension is flattened (primary + secondary): `dim_goods_category` has 26 rows (category_level/leaf populated).
- Some empty fields are empty in the source; confirm upstream before backfilling.
---
## 14. Candidates for Cleanup/Archive
- Drafts, old backups, and debug scripts under `tmp/` and `tmp/etl_billiards_misc/` are reference-only and do not affect the run.
- The root keeps only the essentials (README, requirements, run_etl.*, .env/.env.example); other temporary files have been moved to tmp.
---
## 15. FAQ
- Empty fields: if a mapping exists and the source column is non-empty yet the target is still empty, recheck the upstream JSON; SCD2 dimensions merge on full loads.
- DSN/paths: verify that `PG_DSN` and `INGEST_SOURCE_DIR` in `.env` match your environment.
- Adding tasks: implement in `tasks/` and register in `task_registry.py`; update DDL and mappings as needed.
- Permissions/running: check network and account permissions; scripts need execute permission (e.g. `chmod +x run_etl.sh`).


@@ -1,46 +1,49 @@
# -*- coding: utf-8 -*-
# File notes: ETL environment variables (read by config/env_parser.py) for database connection, paths, and run parameters.
# Database DSN (config/env_parser.py -> db.dsn); required by all tasks
PG_DSN=postgresql://local-Python:Neo-local-1991125@100.64.0.4:5432/LLZQ-test
# Database connect timeout in seconds (config/env_parser.py -> db.connect_timeout_sec)
PG_CONNECT_TIMEOUT=10
# Application settings
# Store/tenant ID (config/env_parser.py -> app.store_id); used by task-scheduling records
STORE_ID=2790685415443269
# Timezone identifier (config/env_parser.py -> app.timezone)
TIMEZONE=Asia/Taipei
# API base URL (config/env_parser.py -> api.base_url); used by FETCH-type tasks
API_BASE=https://api.example.com
# API auth token (config/env_parser.py -> api.token); used by FETCH-type tasks
API_TOKEN=your_token_here
# API request timeout in seconds (config/env_parser.py -> api.timeout_sec)
API_TIMEOUT=20
# API page size (config/env_parser.py -> api.page_size)
API_PAGE_SIZE=200
# API max retry attempts (config/env_parser.py -> api.retries.max_attempts)
API_RETRY_MAX=3
# Log root directory (config/env_parser.py -> io.log_root); used by Init/task runs
LOG_ROOT=C:\dev\LLTQ\export\LOG
# JSON export root (config/env_parser.py -> io.export_root); FETCH output and INIT preparation
EXPORT_ROOT=C:\dev\LLTQ\export\JSON
# Local output directory for FETCH mode (config/env_parser.py -> pipeline.fetch_root)
FETCH_ROOT=C:\dev\LLTQ\export\JSON
# Local ingest JSON directory (config/env_parser.py -> pipeline.ingest_source_dir); used by MANUAL_INGEST/INGEST_ONLY
INGEST_SOURCE_DIR=C:\dev\LLTQ\export\test-json-doc
# Pretty-printed JSON output switch (config/env_parser.py -> io.write_pretty_json)
WRITE_PRETTY_JSON=false
# Pipeline flow: FULL / FETCH_ONLY / INGEST_ONLY (config/env_parser.py -> pipeline.flow)
PIPELINE_FLOW=FULL
# Task list (comma-separated, overrides the default set) (config/env_parser.py -> run.tasks)
# RUN_TASKS=INIT_ODS_SCHEMA,MANUAL_INGEST
# Window/backfill parameters (config/env_parser.py -> run.*)
# OVERLAP_SECONDS: extra seconds rolled back to avoid missing boundary records
OVERLAP_SECONDS=120
WINDOW_BUSY_MIN=30
WINDOW_IDLE_MIN=180
IDLE_START=04:00
IDLE_END=16:00
ALLOW_EMPTY_ADVANCE=true
# Cleaning settings
LOG_UNKNOWN_FIELDS=true
HASH_ALGO=sha1
STRICT_NUMERIC=true
ROUND_MONEY_SCALE=2
# Test/offline mode
TEST_MODE=OFFLINE
# Offline-mode JSON archive directory: replay/test tasks read each task's pre-saved API responses from here and do Transform + Load only, without hitting the real API.
TEST_JSON_ARCHIVE_DIR=tests/testdata_json
# Directory for temporary JSON generated/copied during test runs; keeps test output out of the real export directories and lets CI isolate per-run temp data.
TEST_JSON_TEMP_DIR=/tmp/etl_billiards_json_tmp
# Test database (leave empty to use a fake DB in unit tests)
TEST_DB_DSN=postgresql://local-Python:Neo-local-1991125@100.64.0.4:5432/LLZQ-test
ALLOW_EMPTY_RESULT_ADVANCE=true


@@ -1,7 +0,0 @@
# -*- coding: UTF-8 -*-
# Filename : helloworld.py
# author by : www.runoob.com
# This example prints Hello World!
print('Hello World!')


@@ -1,651 +0,0 @@
# Billiards Hall ETL System (Modular Version) — Merged Documentation
This document merges the former set of documents (`INDEX.md`, `QUICK_START.md`, `ARCHITECTURE.md`, `MIGRATION_GUIDE.md`, `PROJECT_STRUCTURE.md`, `README.md`, etc.) and keeps only content relevant to the **current project itself**: project description, directory layout, architecture design, data and control flow, and migration/extension guides. Change history and refactoring narratives are excluded.
---
## 1. Project Overview
The billiards-hall ETL system is a store-facing ETL project that pulls orders, payments, members, and other data from external business APIs, then parses, validates, applies SCD2 handling and quality checks, writes to PostgreSQL, and supports incremental sync and task-run tracking.
The system uses a modular, layered architecture. Core features:
- Modular directory layout (clear layers for config, database, API, models, loaders, SCD2, quality checks, orchestration, tasks, CLI, utilities, tests).
- Full configuration management: defaults + environment variables + CLI arguments with layered overrides.
- Reusable database access layer (connection management, batch-upsert wrappers).
- API client with retry and pagination support.
- Type-safe data parsing and validation modules.
- SCD2 dimension history management.
- Data-quality checks (e.g. balance consistency).
- An orchestration layer for unified scheduling, cursor management, and run tracking.
- A CLI entry point for task execution (task filtering, dry-run, etc.).
---
## 2. Quick Start
### 2.1 Environment
- Python: 3.10+ recommended
- Database: PostgreSQL
- OS: Windows / Linux / macOS
```bash
# After cloning/downloading, enter the project directory
cd etl_billiards/
ls -la
```
You will see the top level of the directory structure below (details in Chapter 4):
- `config/` - configuration management
- `database/` - database access
- `api/` - API client
- `tasks/` - ETL task implementations
- `cli/` - CLI entry point
- `docs/` - technical docs
### 2.2 Install Dependencies
```bash
pip install -r requirements.txt
```
Main dependencies (see the actual `requirements.txt`):
- `psycopg2-binary`: PostgreSQL driver
- `requests`: HTTP client
- `python-dateutil`: date/time handling
- `tzdata`: timezone data
### 2.3 Configure Environment Variables
Copy and edit the template:
```bash
cp .env.example .env
# Edit .env with your preferred editor
```
`.env` example (minimal configuration):
```bash
# Database
PG_DSN=postgresql://user:password@localhost:5432/LLZQ
# API
API_BASE=https://api.example.com
API_TOKEN=your_token_here
# Store/app
STORE_ID=2790685415443269
TIMEZONE=Asia/Taipei
# Directories
EXPORT_ROOT=/path/to/export
LOG_ROOT=/path/to/logs
```
> Defaults for every setting live in `config/defaults.py`; the effective configuration is the three-layer overlay of defaults + environment variables + CLI arguments.
### 2.4 Run Your First Task
Via the CLI entry point:
```bash
# Run all tasks
python -m cli.main
# Orders task only
python -m cli.main --tasks ORDERS
# Orders + payments
python -m cli.main --tasks ORDERS,PAYMENTS
# Windows script
run_etl.bat --tasks ORDERS
# Linux / macOS script
./run_etl.sh --tasks ORDERS
```
### 2.5 Check the Results
- Log directory: set via `LOG_ROOT`, e.g.
```bash
ls -la D:\LLZQ\DB\logs/
```
- Export directory: set via `EXPORT_ROOT`, e.g.
```bash
ls -la D:\LLZQ\DB\export/
```
---
## 3. Common Commands and Dev Tools
### 3.1 Common CLI Commands
```bash
# Run all tasks
python -m cli.main
# Run specific tasks
python -m cli.main --tasks ORDERS,PAYMENTS,MEMBERS
# Use a custom database
python -m cli.main --pg-dsn "postgresql://user:password@host:5432/db"
# Use a custom API endpoint
python -m cli.main --api-base "https://api.example.com" --api-token "..."
# Dry run (no DB writes)
python -m cli.main --dry-run --tasks ORDERS
```
### 3.2 IDE / Code-Quality Tooling Example (VSCode)
`.vscode/settings.json` example:
```json
{
"python.linting.enabled": true,
"python.linting.pylintEnabled": true,
"python.formatting.provider": "black",
"python.testing.pytestEnabled": true
}
```
Code formatting and linting:
```bash
pip install black isort pylint
black .
isort .
pylint etl_billiards/
```
### 3.3 Testing
```bash
# Install test dependencies (as needed)
pip install pytest pytest-cov
# Run all tests
pytest
# Unit tests only
pytest tests/unit/
# Coverage report
pytest --cov=. --cov-report=html
```
Test examples (see the actual project):
- `tests/unit/test_config.py` configuration unit tests
- `tests/unit/test_parsers.py` parser unit tests
- `tests/integration/test_database.py` database integration tests
#### 3.3.1 Test Modes (ONLINE / OFFLINE)
- With `TEST_MODE=ONLINE` (the default), tests exercise the live API and run the full E/T/L.
- With `TEST_MODE=OFFLINE`, tests read archived JSON from `TEST_JSON_ARCHIVE_DIR` and do Transform + Load only — useful for verifying that local archives can still be replayed.
- `TEST_JSON_ARCHIVE_DIR`: offline JSON archive directory (e.g. `tests/testdata_json` or CI-produced snapshots).
- `TEST_JSON_TEMP_DIR`: temporary JSON output directory for tests, isolating each run's data.
- `TEST_DB_DSN`: optional; if set, unit tests connect to this PostgreSQL DSN and actually write to the database; if empty, tests use an in-memory fake DB with no database dependency.
Example commands:
```bash
# Online mode, all tasks
TEST_MODE=ONLINE pytest tests/unit/test_etl_tasks_online.py
# Offline mode with archived JSON, all tasks
TEST_MODE=OFFLINE TEST_JSON_ARCHIVE_DIR=tests/testdata_json pytest tests/unit/test_etl_tasks_offline.py
# Combine options via the helper script (example: online + orders cases only)
python scripts/run_tests.py --suite online --mode ONLINE --keyword ORDERS
# Use the helper script against a real test DB, replaying offline mode
python scripts/run_tests.py --suite offline --mode OFFLINE --db-dsn postgresql://user:pwd@localhost:5432/testdb
# Use preset commands from the "command repository"
python scripts/run_tests.py --preset offline_realdb
python scripts/run_tests.py --list-presets  # view or customize scripts/test_presets.py
```
---
## 4. Project Structure and File Notes
### 4.1 Overall Directory Tree
```text
etl_billiards/
├── README.md               # Project overview and usage
├── MIGRATION_GUIDE.md      # Migration guide from the old version
├── requirements.txt        # Python dependency list
├── setup.py                # Package/install configuration
├── .env.example            # Environment-variable template
├── .gitignore              # Git ignore rules
├── run_etl.sh              # Linux/Mac run script
├── run_etl.bat             # Windows run script
├── config/                 # Configuration management
│   ├── __init__.py
│   ├── defaults.py         # Default configuration values
│   ├── env_parser.py       # Environment-variable parser
│   └── settings.py         # Main configuration class
├── database/               # Database access layer
│   ├── __init__.py
│   ├── connection.py       # Connection management
│   └── operations.py       # Batch-operation wrappers
├── api/                    # HTTP API client
│   ├── __init__.py
│   └── client.py           # API client (retry + pagination)
├── models/                 # Data model layer
│   ├── __init__.py
│   ├── parsers.py          # Type parsers
│   └── validators.py       # Data validators
├── loaders/                # Data loader layer
│   ├── __init__.py
│   ├── base_loader.py      # Loader base class
│   ├── dimensions/         # Dimension-table loaders
│   │   ├── __init__.py
│   │   └── member.py       # Member dimension loader
│   └── facts/              # Fact-table loaders
│       ├── __init__.py
│       ├── order.py        # Order fact loader
│       └── payment.py      # Payment record loader
├── scd/                    # SCD2 handling layer
│   ├── __init__.py
│   └── scd2_handler.py     # SCD2 history handler
├── quality/                # Data-quality layer
│   ├── __init__.py
│   ├── base_checker.py     # Checker base class
│   └── balance_checker.py  # Balance-consistency checker
├── orchestration/          # ETL orchestration layer
│   ├── __init__.py
│   ├── scheduler.py        # ETL scheduler
│   ├── task_registry.py    # Task registry (factory pattern)
│   ├── cursor_manager.py   # Cursor manager
│   └── run_tracker.py      # Run tracker
├── tasks/                  # ETL task layer
│   ├── __init__.py
│   ├── base_task.py        # Task base class (template method)
│   ├── orders_task.py      # Orders ETL task
│   ├── payments_task.py    # Payments ETL task
│   └── members_task.py     # Members ETL task
├── cli/                    # CLI layer
│   ├── __init__.py
│   └── main.py             # CLI entry point
├── utils/                  # Utility functions
│   ├── __init__.py
│   └── helpers.py          # Shared helpers
├── tests/                  # Test code
│   ├── __init__.py
│   ├── unit/               # Unit tests
│   │   ├── __init__.py
│   │   ├── test_config.py
│   │   └── test_parsers.py
│   ├── testdata_json/      # Test JSON files for cleaning/ingest
│   │   └── XX.json
│   └── integration/        # Integration tests
│       ├── __init__.py
│       └── test_database.py
└── docs/                   # Documentation
    └── ARCHITECTURE.md     # Architecture design doc
```
### 4.2 Module Responsibilities
- **config/**
  - Unified configuration entry point with three layers of overrides: defaults, environment variables, CLI arguments.
- **database/**
  - Wraps PostgreSQL connections and batch operations (insert, update, upsert, etc.).
- **api/**
  - Unified HTTP wrapper for the upstream business API with retry, pagination, and timeout control.
- **models/**
  - Type parsers (timestamps, amounts, integers, etc.) and business-level data validators.
- **loaders/**
  - Loading logic for fact and dimension tables (batch upsert, write statistics, etc.).
- **scd/**
  - SCD2 history management for dimension data (validity ranges, version flags, etc.).
- **quality/**
  - Quality-check strategies, e.g. balance consistency and record-count alignment.
- **orchestration/**
  - Task scheduling, task registration, cursor management (incremental windows), run tracking.
- **tasks/**
  - Concrete business tasks (orders, payments, members, etc.), encapsulating the full fetch → process → write → record flow.
- **cli/**
  - CLI entry point: parses arguments and starts the scheduling flow.
- **utils/**
  - Miscellaneous helpers.
- **tests/**
  - Unit and integration tests.
---
## 5. Architecture Design and Flow
### 5.1 Layered Architecture Diagram
```text
┌─────────────────────────────────────┐
│        CLI (command line)           │ <- cli/main.py
└─────────────┬───────────────────────┘
┌─────────────▼───────────────────────┐
│        Orchestration layer          │ <- orchestration/
│   (Scheduler, TaskRegistry, ...)    │
└─────────────┬───────────────────────┘
┌─────────────▼───────────────────────┐
│           Tasks layer               │ <- tasks/
│  (OrdersTask, PaymentsTask, ...)    │
└───┬─────────┬─────────┬─────────────┘
    │         │         │
    ▼         ▼         ▼
┌────────┐ ┌─────┐ ┌──────────┐
│Loaders │ │ SCD │ │ Quality  │ <- loaders/, scd/, quality/
└────────┘ └─────┘ └──────────┘
        ┌───────▼────────┐
        │  Models layer  │ <- models/
        └───────┬────────┘
        ┌───────▼────────┐
        │   API client   │ <- api/
        └───────┬────────┘
        ┌───────▼────────┐
        │ Database access│ <- database/
        └───────┬────────┘
        ┌───────▼────────┐
        │     Config     │ <- config/
        └────────────────┘
```
### 5.2 Layer Responsibilities (Current Design)
- **CLI layer (`cli/`)**
  - Parses command-line arguments (task list, dry-run, config overrides, etc.).
  - Initializes configuration and logging, then hands off to the orchestration layer.
- **Orchestration layer (`orchestration/`)**
  - `scheduler.py`: selects tasks to run from config and CLI arguments; controls ordering and parallelism.
  - `task_registry.py`: task registry that creates task instances by task code (factory pattern).
  - `cursor_manager.py`: manages incremental cursors (time windows / ID cursors).
  - `run_tracker.py`: records each run's status, statistics, and errors.
- **Task layer (`tasks/`)**
  - `base_task.py`: defines the task execution template (template-method pattern): get the window, call upstream, parse/validate, write, update the cursor.
  - `orders_task.py` / `payments_task.py` / `members_task.py`: concrete task logic (orders, payments, members).
- **Loader / SCD / quality layers**
  - `loaders/`: upsert/insert/update logic per target table.
  - `scd/scd2_handler.py`: SCD2 history management for dimension tables.
  - `quality/`: data-quality checks such as balance reconciliation.
- **Model layer (`models/`)**
  - `parsers.py`: data-type conversion (string → timestamp, Decimal, int, etc.).
  - `validators.py`: field-level and record-level validation.
- **API layer (`api/client.py`)**
  - Wraps HTTP calls; handles retry, timeout, and pagination.
- **Database layer (`database/`)**
  - Manages database connections and contexts.
  - Provides batch insert/update/upsert interfaces.
- **Config layer (`config/`)**
  - Defines configuration defaults.
  - Parses environment variables with type conversion.
  - Exposes a unified configuration object.
### 5.3 Design Patterns (Currently Used)
- Factory: task registration/creation (`TaskRegistry`).
- Template method: task execution flow (`BaseTask`).
- Strategy: different loaders/checkers implement different strategies.
- Dependency injection: `db`, `api`, `config`, etc. are passed into tasks via the constructor.
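The factory pattern can be sketched in a few lines. `TaskRegistry` here is a minimal stand-in for `orchestration/task_registry.py`, not its actual implementation:

```python
# Minimal task registry: map a task code to a class, create instances on demand.
class TaskRegistry:
    def __init__(self):
        self._tasks: dict[str, type] = {}

    def register(self, code: str, task_cls: type) -> None:
        self._tasks[code] = task_cls

    def create(self, code: str, **deps):
        # Dependencies (db, api, config, ...) are injected via the constructor.
        if code not in self._tasks:
            raise ValueError(f"unknown task code: {code}")
        return self._tasks[code](**deps)
```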
### 5.4 Data and Control Flow
Overall flow:
1. The CLI parses arguments and loads configuration.
2. The scheduler builds dependencies: database connection, API client, etc.
3. The scheduler walks the task configuration, fetching task classes from the `TaskRegistry` and instantiating them.
4. Each task follows the same template:
   - Read the cursor / time window.
   - Call the API to fetch data (paginated).
   - Parse and validate the data.
   - Write via loaders (fact tables / dimension tables / SCD2).
   - Run quality checks.
   - Update the cursor and the run record.
5. After all tasks finish, release connections and exit.
### 5.5 Error-Handling Strategy
- One task's failure does not stop the other tasks.
- Database errors automatically roll back the current transaction.
- Failed API requests retry per configuration; once retries are exhausted, the error is recorded and that task stops.
- All errors go to the logs and the run-tracking table for later diagnosis.
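The isolation policy above can be sketched as follows. The `db` and `tracker` objects are illustrative stand-ins; the real flow uses `run_tracker.py` and the database layer.

```python
# A failing task rolls back its own transaction and is recorded,
# while the remaining tasks still run.
def run_all(tasks, db, tracker):
    results = []
    for task in tasks:
        try:
            task()
            db.commit()
            results.append("ok")
        except Exception as exc:  # record and continue with the next task
            db.rollback()
            tracker.append(repr(exc))
            results.append("failed")
    return results
```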
---
## 6. Migration Guide (Old Scripts → Current Project)
This section explains how to migrate from the old single-file script (e.g. `task_merged.py`) to the current modular project; it is usage guidance for the current project, not a historical comparison.
### 6.1 Core Function Mapping
| Old function / class | New location | Notes |
|---------------------------|--------------------------------------------------------|----------------|
| `DEFAULTS` dict | `config/defaults.py` | Config defaults |
| `build_config()` | `config/settings.py::AppConfig.load()` | Config loading |
| `Pg` class | `database/connection.py::DatabaseConnection` | DB connection |
| `http_get_json()` | `api/client.py::APIClient.get()` | API request |
| `paged_get()` | `api/client.py::APIClient.get_paginated()` | Paginated request |
| `parse_ts()` | `models/parsers.py::TypeParser.parse_timestamp()` | Timestamp parsing |
| `upsert_fact_order()` | `loaders/facts/order.py::OrderLoader.upsert_orders()` | Order loading |
| `scd2_upsert()` | `scd/scd2_handler.py::SCD2Handler.upsert()` | SCD2 handling |
| `run_task_orders()` | `tasks/orders_task.py::OrdersTask.execute()` | Orders task |
| `main()` | `cli/main.py::main()` | Main entry |
### 6.2 Typical Migration Steps
1. **Migrate configuration**
   - Move settings hard-coded in `DEFAULTS` or the script into `.env` and `config/defaults.py`.
   - Load configuration via `AppConfig.load()`.
2. **Validate by running in parallel**
   ```bash
   # Old script
   python task_merged.py --tasks ORDERS
   # New project
   python -m cli.main --tasks ORDERS
   ```
   Compare exported tables and logs between the old and new versions to confirm consistency.
3. **Migrate custom logic**
   - Custom cleaning logic from the old script → into the relevant `loaders/` or task classes.
   - Custom tasks → implement in `tasks/` and register in `task_registry`.
   - Custom API calls → extend `api/client.py` or wrap in a separate service class.
4. **Cut over gradually**
   - First run both in a test environment.
   - Then switch production tasks to the new version step by step.
## 7. Development and Extension Guide (Current Project)
### 7.1 Adding a New Task
1. Create the task class in `tasks/`:
```python
from .base_task import BaseTask

class MyTask(BaseTask):
    def get_task_code(self) -> str:
        return "MY_TASK"

    def execute(self) -> dict:
        # 1. Get the time window
        window_start, window_end, _ = self._get_time_window()
        # 2. Fetch data from the API
        records, _ = self.api.get_paginated(...)
        # 3. Parse / validate
        parsed = [self._parse(r) for r in records]
        # 4. Load the data
        loader = MyLoader(self.db)
        inserted, updated, _ = loader.upsert(parsed)
        # 5. Commit and return the result
        self.db.commit()
        return self._build_result("SUCCESS", {
            "inserted": inserted,
            "updated": updated,
        })
```
2. Register it in `orchestration/task_registry.py`:
```python
from tasks.my_task import MyTask
default_registry.register("MY_TASK", MyTask)
```
3. Enable it in the task configuration table (example):
```sql
INSERT INTO etl_admin.etl_task (task_code, store_id, enabled)
VALUES ('MY_TASK', 123456, TRUE);
```
### 7.2 Adding a New Loader
```python
from loaders.base_loader import BaseLoader

class MyLoader(BaseLoader):
    def upsert(self, records: list) -> tuple:
        # (xmax = 0) is true only for freshly inserted rows, which lets the
        # caller distinguish inserts from updates in an ON CONFLICT upsert.
        sql = (
            "INSERT INTO table_name (...) VALUES (...) "
            "ON CONFLICT (...) DO UPDATE SET ... "
            "RETURNING (xmax = 0) AS inserted"
        )
        inserted, updated = self.db.batch_upsert_with_returning(
            sql, records, page_size=self._batch_size()
        )
        return (inserted, updated, 0)
```
### 7.3 Adding a New Quality Checker
1. Implement a checker in `quality/`, inheriting from `base_checker.py`.
2. Call the checker from a task or the scheduling flow to validate after writes.
### 7.4 Extending Type Parsing and Validation
- Add new type-parsing methods in `models/parsers.py`.
- Add new rules in `models/validators.py` (enum checks, cross-field checks, etc.).
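For example, a money parser honoring the configured rounding scale (`ROUND_MONEY_SCALE` in `.env`) might look like this. This is a hedged sketch, not the existing `TypeParser` API:

```python
# Strict amount parsing: convert to Decimal and round to `scale` places.
from decimal import ROUND_HALF_UP, Decimal, InvalidOperation

def parse_money(value, scale: int = 2):
    """Return a Decimal rounded half-up, None for empty input."""
    if value in (None, ""):
        return None
    try:
        return Decimal(str(value)).quantize(
            Decimal(10) ** -scale, rounding=ROUND_HALF_UP
        )
    except InvalidOperation:
        raise ValueError(f"not a numeric amount: {value!r}")
```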
---
## 8. Troubleshooting
### 8.1 Database Connection Failure
```text
Error: could not connect to server
```
Checklist:
- Verify `PG_DSN` and related database settings.
- Confirm the database service is up and the network is reachable.
### 8.2 API Request Timeout
```text
Error: requests.exceptions.Timeout
```
Checklist:
- Check the `API_BASE` address and network connectivity.
- Raise the timeout and retry counts in configuration as appropriate.
### 8.3 Module Import Errors
```text
Error: ModuleNotFoundError
```
Checklist:
- Run from the project root (the directory containing the `etl_billiards/` package).
- Or install in editable mode with `pip install -e .`.
### 8.4 Permission Issues
```text
Error: Permission denied
```
Checklist:
- Script not executable: `chmod +x run_etl.sh`.
- On Windows, run as administrator or adjust permissions on the log/export directories.
---
## 9. Pre-Run Checklist
Before a production run, confirm:
- [ ] Python 3.10+ installed.
- [ ] `pip install -r requirements.txt` executed.
- [ ] `.env` configured correctly (database, API, store ID, paths, etc.).
- [ ] PostgreSQL reachable.
- [ ] API service reachable with valid credentials.
- [ ] `LOG_ROOT` and `EXPORT_ROOT` exist and are writable.
---
## 10. Reference Notes
- This document merges the former quick-start, project-structure, architecture, and migration guides into one unified document for the current project.
- To split it back into multiple documents, cut along chapters such as "Quick Start", "Architecture", "Migration Guide", and "Development & Extension".


@@ -1,95 +1,256 @@
# -*- coding: utf-8 -*-
"""API客户端"""
"""API客户端:统一封装 POST/重试/分页与列表提取逻辑。"""
from __future__ import annotations
from typing import Iterable, Sequence, Tuple
import requests
from urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
DEFAULT_BROWSER_HEADERS = {
"Accept": "application/json, text/plain, */*",
"Content-Type": "application/json",
"Origin": "https://pc.ficoo.vip",
"Referer": "https://pc.ficoo.vip/",
"User-Agent": (
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
"(KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36"
),
"Accept-Language": "zh-CN,zh;q=0.9",
"sec-ch-ua": '"Google Chrome";v="141", "Not?A_Brand";v="8", "Chromium";v="141"',
"sec-ch-ua-platform": '"Windows"',
"sec-ch-ua-mobile": "?0",
"sec-fetch-site": "same-origin",
"sec-fetch-mode": "cors",
"sec-fetch-dest": "empty",
"priority": "u=1, i",
"X-Requested-With": "XMLHttpRequest",
"DNT": "1",
}
DEFAULT_LIST_KEYS: Tuple[str, ...] = (
"list",
"rows",
"records",
"items",
"dataList",
"data_list",
"tenantMemberInfos",
"tenantMemberCardLogs",
"tenantMemberCards",
"settleList",
"orderAssistantDetails",
"assistantInfos",
"siteTables",
"taiFeeAdjustInfos",
"siteTableUseDetailsList",
"tenantGoodsList",
"packageCouponList",
"queryDeliveryRecordsList",
"goodsCategoryList",
"orderGoodsList",
"orderGoodsLedgers",
)
class APIClient:
"""HTTP API客户端"""
def __init__(self, base_url: str, token: str = None, timeout: int = 20,
retry_max: int = 3, headers_extra: dict = None):
self.base_url = base_url.rstrip("/")
self.token = token
"""HTTP API 客户端(默认使用 POST + JSON 请求体)"""
def __init__(
self,
base_url: str,
token: str | None = None,
timeout: int = 20,
retry_max: int = 3,
headers_extra: dict | None = None,
):
self.base_url = (base_url or "").rstrip("/")
self.token = self._normalize_token(token)
self.timeout = timeout
self.retry_max = retry_max
self.headers_extra = headers_extra or {}
self._session = None
def _get_session(self):
"""获取或创建会话"""
self._session: requests.Session | None = None
# ------------------------------------------------------------------ HTTP 基础
def _get_session(self) -> requests.Session:
"""获取或创建带重试的 Session。"""
if self._session is None:
self._session = requests.Session()
retries = max(0, int(self.retry_max) - 1)
retry = Retry(
total=None,
connect=retries,
read=retries,
status=retries,
allowed_methods=frozenset(["GET"]),
allowed_methods=frozenset(["GET", "POST"]),
status_forcelist=(429, 500, 502, 503, 504),
backoff_factor=1.0,
backoff_factor=0.5,
respect_retry_after_header=True,
raise_on_status=False,
)
adapter = HTTPAdapter(max_retries=retry)
self._session.mount("http://", adapter)
self._session.mount("https://", adapter)
if self.headers_extra:
self._session.headers.update(self.headers_extra)
self._session.headers.update(self._build_headers())
return self._session
def get(self, endpoint: str, params: dict = None) -> dict:
"""执行GET请求"""
def get(self, endpoint: str, params: dict | None = None) -> dict:
"""
兼容旧名的请求入口(实际以 POST JSON 方式请求)。
"""
return self._post_json(endpoint, params)
def _post_json(self, endpoint: str, payload: dict | None = None) -> dict:
if not self.base_url:
raise ValueError("API base_url 未配置")
url = f"{self.base_url}/{endpoint.lstrip('/')}"
headers = {"Authorization": self.token} if self.token else {}
headers.update(self.headers_extra)
sess = self._get_session()
resp = sess.get(url, headers=headers, params=params, timeout=self.timeout)
resp = sess.post(url, json=payload or {}, timeout=self.timeout)
resp.raise_for_status()
return resp.json()
def get_paginated(self, endpoint: str, params: dict, page_size: int = 200,
page_field: str = "pageIndex", size_field: str = "pageSize",
data_path: tuple = ("data",), list_key: str = None) -> tuple:
"""分页获取数据"""
records, pages_meta = [], []
page = 1
data = resp.json()
self._ensure_success(data)
return data
def _build_headers(self) -> dict:
headers = dict(DEFAULT_BROWSER_HEADERS)
headers.update(self.headers_extra)
if self.token:
headers["Authorization"] = self.token
return headers
@staticmethod
def _normalize_token(token: str | None) -> str | None:
if not token:
return None
t = str(token).strip()
if not t.lower().startswith("bearer "):
t = f"Bearer {t}"
return t
@staticmethod
def _ensure_success(payload: dict):
"""API 返回 code 非 0 时主动抛错,便于上层重试/记录。"""
if isinstance(payload, dict) and "code" in payload:
code = payload.get("code")
if code not in (0, "0", None):
msg = payload.get("msg") or payload.get("message") or ""
raise ValueError(f"API 返回错误 code={code} msg={msg}")
# ------------------------------------------------------------------ 分页
def iter_paginated(
self,
endpoint: str,
params: dict | None,
page_size: int | None = 200,
page_field: str = "page",
size_field: str = "limit",
data_path: tuple = ("data",),
list_key: str | Sequence[str] | None = None,
page_start: int = 1,
page_end: int | None = None,
) -> Iterable[tuple[int, list, dict, dict]]:
"""
分页迭代器:逐页拉取数据并产出 (page_no, records, request_params, raw_response)。
page_size=None 时不附带分页参数,仅拉取一次。
"""
base_params = dict(params or {})
page = page_start
while True:
    page_params = dict(base_params)
    if page_size is not None:
        page_params[page_field] = page
        page_params[size_field] = page_size
    payload = self._post_json(endpoint, page_params)
    records = self._extract_list(payload, data_path, list_key)
    yield page, records, page_params, payload
    if page_size is None:
        break
    if page_end is not None and page >= page_end:
        break
    if len(records) < (page_size or 0):
        break
    if len(records) == 0:
        break
    page += 1
def get_paginated(
self,
endpoint: str,
params: dict,
page_size: int | None = 200,
page_field: str = "page",
size_field: str = "limit",
data_path: tuple = ("data",),
list_key: str | Sequence[str] | None = None,
page_start: int = 1,
page_end: int | None = None,
) -> tuple[list, list]:
"""分页获取数据并将所有记录汇总在一个列表中。"""
records, pages_meta = [], []
for page_no, page_records, request_params, response in self.iter_paginated(
endpoint=endpoint,
params=params,
page_size=page_size,
page_field=page_field,
size_field=size_field,
data_path=data_path,
list_key=list_key,
page_start=page_start,
page_end=page_end,
):
records.extend(page_records)
pages_meta.append(
{"page": page_no, "request": request_params, "response": response}
)
return records, pages_meta
# ------------------------------------------------------------------ 响应解析
@classmethod
def _extract_list(
cls, payload: dict | list, data_path: tuple, list_key: str | Sequence[str] | None
) -> list:
"""根据 data_path/list_key 提取列表结构,兼容常见字段名。"""
cur: object = payload
if isinstance(cur, list):
return cur
for key in data_path:
if isinstance(cur, dict):
cur = cur.get(key)
else:
cur = None
if cur is None:
break
if isinstance(cur, list):
return cur
if isinstance(cur, dict):
if list_key:
keys = (list_key,) if isinstance(list_key, str) else tuple(list_key)
for k in keys:
if isinstance(cur.get(k), list):
return cur[k]
for k in DEFAULT_LIST_KEYS:
if isinstance(cur.get(k), list):
return cur[k]
for v in cur.values():
if isinstance(v, list):
return v
return []
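The pagination contract above (page/size request fields, `data`-path walking, stop on a short page) can be exercised without the real API. The sketch below is a minimal standalone copy of the idea; `fake_fetch` and its in-memory dataset are illustrative stand-ins, not the project's API:

```python
def extract_list(payload, data_path=("data",), list_key=None):
    """Simplified _extract_list: walk data_path, then find the first list value."""
    cur = payload
    for key in data_path:
        cur = cur.get(key) if isinstance(cur, dict) else None
    if isinstance(cur, list):
        return cur
    if isinstance(cur, dict):
        if list_key and isinstance(cur.get(list_key), list):
            return cur[list_key]
        for v in cur.values():
            if isinstance(v, list):
                return v
    return []

def iter_paginated(fetch, params, page_size=2, page_field="page", size_field="limit"):
    """Yield (page_no, records) until a short (or empty) page ends the stream."""
    page = 1
    while True:
        p = dict(params)
        p[page_field] = page
        p[size_field] = page_size
        records = extract_list(fetch(p))
        yield page, records
        if len(records) < page_size:
            break
        page += 1

DATA = [{"id": i} for i in range(5)]

def fake_fetch(p):
    # Slice the dataset the way a paged endpoint would.
    start = (p["page"] - 1) * p["limit"]
    return {"code": 0, "data": {"list": DATA[start:start + p["limit"]]}}

pages = list(iter_paginated(fake_fetch, {}, page_size=2))
total = [r for _, recs in pages for r in recs]
print(len(pages), len(total))  # 3 pages, 5 records
```

The short-page stop condition is why the last page (1 record against `page_size=2`) terminates the loop without an extra empty request.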

View File

@@ -0,0 +1,74 @@
# -*- coding: utf-8 -*-
"""本地 JSON 客户端,模拟 APIClient 的分页接口,从落盘的 JSON 回放数据。"""
from __future__ import annotations
import json
from pathlib import Path
from typing import Iterable, Tuple
from api.client import APIClient
from utils.json_store import endpoint_to_filename
class LocalJsonClient:
"""
读取 RecordingAPIClient 生成的 JSON提供 iter_paginated/get_paginated 接口。
"""
def __init__(self, base_dir: str | Path):
self.base_dir = Path(base_dir)
if not self.base_dir.exists():
raise FileNotFoundError(f"JSON 目录不存在: {self.base_dir}")
def iter_paginated(
self,
endpoint: str,
params: dict | None,
page_size: int = 200,
page_field: str = "page",
size_field: str = "limit",
data_path: tuple = ("data",),
list_key: str | None = None,
) -> Iterable[Tuple[int, list, dict, dict]]:
file_path = self.base_dir / endpoint_to_filename(endpoint)
if not file_path.exists():
raise FileNotFoundError(f"未找到匹配的 JSON 文件: {file_path}")
with file_path.open("r", encoding="utf-8") as fp:
payload = json.load(fp)
pages = payload.get("pages")
if not isinstance(pages, list) or not pages:
pages = [{"page": 1, "request": params or {}, "response": payload}]
for idx, page in enumerate(pages, start=1):
response = page.get("response", {})
request_params = page.get("request") or {}
page_no = page.get("page") or idx
records = APIClient._extract_list(response, data_path, list_key) # type: ignore[attr-defined]
yield page_no, records, request_params, response
def get_paginated(
self,
endpoint: str,
params: dict,
page_size: int = 200,
page_field: str = "page",
size_field: str = "limit",
data_path: tuple = ("data",),
list_key: str | None = None,
) -> tuple[list, list]:
records: list = []
pages_meta: list = []
for page_no, page_records, request_params, response in self.iter_paginated(
endpoint=endpoint,
params=params,
page_size=page_size,
page_field=page_field,
size_field=size_field,
data_path=data_path,
list_key=list_key,
):
records.extend(page_records)
pages_meta.append({"page": page_no, "request": request_params, "response": response})
return records, pages_meta
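The replay contract depends only on the dump file shape (`{"pages": [{"page": …, "request": …, "response": …}]}`). A minimal standalone replayer over that shape, with a hypothetical `member_list.json` fixture, looks like this:

```python
import json
import tempfile
from pathlib import Path

def replay_pages(path, data_path=("data",), list_key="list"):
    """Replay a recorded dump: yield (page_no, records) for each stored page."""
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    pages = payload.get("pages") or [{"page": 1, "request": {}, "response": payload}]
    for idx, page in enumerate(pages, start=1):
        response = page.get("response", {})
        cur = response
        for k in data_path:
            cur = cur.get(k, {}) if isinstance(cur, dict) else {}
        records = cur.get(list_key, []) if isinstance(cur, dict) else cur
        yield page.get("page") or idx, records

dump = {"endpoint": "member/list", "pages": [
    {"page": 1, "request": {"page": 1}, "response": {"data": {"list": [{"id": 1}]}}},
    {"page": 2, "request": {"page": 2}, "response": {"data": {"list": []}}},
]}
with tempfile.TemporaryDirectory() as d:
    f = Path(d) / "member_list.json"
    f.write_text(json.dumps(dump), encoding="utf-8")
    result = list(replay_pages(f))
print(result)  # [(1, [{'id': 1}]), (2, [])]
```

Because the fallback wraps a bare payload as a single page, the same replayer also copes with dumps that were written without a `pages` envelope.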

View File

@@ -0,0 +1,118 @@
# -*- coding: utf-8 -*-
"""包装 APIClient将分页响应落盘便于后续本地清洗。"""
from __future__ import annotations
from datetime import datetime
from pathlib import Path
from typing import Any, Iterable, Tuple
from api.client import APIClient
from utils.json_store import dump_json, endpoint_to_filename
class RecordingAPIClient:
"""
代理 APIClient在调用 iter_paginated/get_paginated 时同时把响应写入 JSON 文件。
文件名根据 endpoint 生成,写入到指定 output_dir。
"""
def __init__(
self,
base_client: APIClient,
output_dir: Path | str,
task_code: str,
run_id: int,
write_pretty: bool = False,
):
self.base = base_client
self.output_dir = Path(output_dir)
self.output_dir.mkdir(parents=True, exist_ok=True)
self.task_code = task_code
self.run_id = run_id
self.write_pretty = write_pretty
self.last_dump: dict[str, Any] | None = None
# ------------------------------------------------------------------ public API
def iter_paginated(
self,
endpoint: str,
params: dict | None,
page_size: int = 200,
page_field: str = "page",
size_field: str = "limit",
data_path: tuple = ("data",),
list_key: str | None = None,
) -> Iterable[Tuple[int, list, dict, dict]]:
pages: list[dict[str, Any]] = []
total_records = 0
for page_no, records, request_params, response in self.base.iter_paginated(
endpoint=endpoint,
params=params,
page_size=page_size,
page_field=page_field,
size_field=size_field,
data_path=data_path,
list_key=list_key,
):
pages.append({"page": page_no, "request": request_params, "response": response})
total_records += len(records)
yield page_no, records, request_params, response
self._dump(endpoint, params, page_size, pages, total_records)
def get_paginated(
self,
endpoint: str,
params: dict,
page_size: int = 200,
page_field: str = "page",
size_field: str = "limit",
data_path: tuple = ("data",),
list_key: str | None = None,
) -> tuple[list, list]:
records: list = []
pages_meta: list = []
for page_no, page_records, request_params, response in self.iter_paginated(
endpoint=endpoint,
params=params,
page_size=page_size,
page_field=page_field,
size_field=size_field,
data_path=data_path,
list_key=list_key,
):
records.extend(page_records)
pages_meta.append({"page": page_no, "request": request_params, "response": response})
return records, pages_meta
# ------------------------------------------------------------------ internal
def _dump(
self,
endpoint: str,
params: dict | None,
page_size: int,
pages: list[dict[str, Any]],
total_records: int,
):
filename = endpoint_to_filename(endpoint)
path = self.output_dir / filename
payload = {
"task_code": self.task_code,
"run_id": self.run_id,
"endpoint": endpoint,
"params": params or {},
"page_size": page_size,
"pages": pages,
"total_records": total_records,
"dumped_at": datetime.utcnow().isoformat() + "Z",
}
dump_json(path, payload, pretty=self.write_pretty)
self.last_dump = {
"file": str(path),
"endpoint": endpoint,
"pages": len(pages),
"records": total_records,
}
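The recording pattern above has one subtlety: in a generator, the `_dump` call after the `for` loop only runs once the consumer drains all pages. A reduced standalone sketch of the drain-then-dump idea (the `record_pages` helper and fixture data are illustrative, not the project's API):

```python
import json
import tempfile
from pathlib import Path

def record_pages(page_iter, out_path):
    """Drain a (page_no, records, request, response) iterator, then write one dump file."""
    pages, total = [], 0
    for page_no, records, request, response in page_iter:
        pages.append({"page": page_no, "request": request, "response": response})
        total += len(records)
    Path(out_path).write_text(
        json.dumps({"pages": pages, "total_records": total}, ensure_ascii=False),
        encoding="utf-8",
    )
    return total

fake_pages = [
    (1, [{"id": 1}, {"id": 2}], {"page": 1}, {"data": {"list": [{"id": 1}, {"id": 2}]}}),
    (2, [{"id": 3}], {"page": 2}, {"data": {"list": [{"id": 3}]}}),
]
with tempfile.TemporaryDirectory() as d:
    out = Path(d) / "orders.json"
    total = record_pages(iter(fake_pages), out)
    dumped = json.loads(out.read_text(encoding="utf-8"))
print(total, len(dumped["pages"]))  # 3 2
```

Storing request params alongside each response is what later lets LocalJsonClient replay the exact page sequence offline.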

View File

@@ -36,13 +36,25 @@ def parse_args():
# API参数
parser.add_argument("--api-base", help="API基础URL")
parser.add_argument("--api-token", "--token", dest="api_token", help="API令牌Bearer Token")
parser.add_argument("--api-timeout", type=int, help="API超时(秒)")
parser.add_argument("--api-page-size", type=int, help="分页大小")
parser.add_argument("--api-retry-max", type=int, help="API重试最大次数")
# 目录参数
parser.add_argument("--export-root", help="导出根目录")
parser.add_argument("--log-root", help="日志根目录")
# 抓取/清洗管线
parser.add_argument("--pipeline-flow", choices=["FULL", "FETCH_ONLY", "INGEST_ONLY"], help="流水线模式")
parser.add_argument("--fetch-root", help="抓取JSON输出根目录")
parser.add_argument("--ingest-source", help="本地清洗入库源目录")
parser.add_argument("--write-pretty-json", action="store_true", help="抓取JSON美化输出")
# 运行窗口
parser.add_argument("--idle-start", help="闲时窗口开始(HH:MM)")
parser.add_argument("--idle-end", help="闲时窗口结束(HH:MM)")
parser.add_argument("--allow-empty-advance", action="store_true", help="允许空结果推进窗口")
return parser.parse_args()
@@ -77,12 +89,32 @@ def build_cli_overrides(args) -> dict:
overrides.setdefault("api", {})["timeout_sec"] = args.api_timeout
if args.api_page_size:
overrides.setdefault("api", {})["page_size"] = args.api_page_size
if args.api_retry_max:
overrides.setdefault("api", {}).setdefault("retries", {})["max_attempts"] = args.api_retry_max
# 目录
if args.export_root:
overrides.setdefault("io", {})["export_root"] = args.export_root
if args.log_root:
overrides.setdefault("io", {})["log_root"] = args.log_root
# 抓取/清洗管线
if args.pipeline_flow:
overrides.setdefault("pipeline", {})["flow"] = args.pipeline_flow.upper()
if args.fetch_root:
overrides.setdefault("pipeline", {})["fetch_root"] = args.fetch_root
if args.ingest_source:
overrides.setdefault("pipeline", {})["ingest_source_dir"] = args.ingest_source
if args.write_pretty_json:
overrides.setdefault("io", {})["write_pretty_json"] = True
# 运行窗口
if args.idle_start:
overrides.setdefault("run", {}).setdefault("idle_window", {})["start"] = args.idle_start
if args.idle_end:
overrides.setdefault("run", {}).setdefault("idle_window", {})["end"] = args.idle_end
if args.allow_empty_advance:
overrides.setdefault("run", {})["allow_empty_result_advance"] = True
# 任务
if args.tasks:

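The `setdefault` chaining used by `build_cli_overrides` lets independent flags write into the same nested config section without clobbering each other. A reduced sketch, using a plain dict as a stand-in for the argparse namespace:

```python
def build_overrides(args):
    """Build a nested override dict by chaining setdefault, as build_cli_overrides does."""
    overrides = {}
    if args.get("api_timeout"):
        overrides.setdefault("api", {})["timeout_sec"] = args["api_timeout"]
    if args.get("api_retry_max"):
        # Two setdefault hops reach api.retries without overwriting api.timeout_sec.
        overrides.setdefault("api", {}).setdefault("retries", {})["max_attempts"] = args["api_retry_max"]
    if args.get("pipeline_flow"):
        overrides.setdefault("pipeline", {})["flow"] = args["pipeline_flow"].upper()
    return overrides

ov = build_overrides({"api_timeout": 30, "api_retry_max": 5, "pipeline_flow": "ingest_only"})
print(ov)
# {'api': {'timeout_sec': 30, 'retries': {'max_attempts': 5}}, 'pipeline': {'flow': 'INGEST_ONLY'}}
```

The resulting dict only contains keys the user actually passed, which is what keeps CLI values at the top of the precedence chain over `.env` and defaults.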
View File

@@ -1,12 +1,12 @@
# -*- coding: utf-8 -*-
"""配置默认值定义"""
DEFAULTS = {
"app": {
"timezone": "Asia/Taipei",
"store_id": "",
"schema_oltp": "billiards",
"schema_etl": "etl_admin",
},
"db": {
"dsn": "",
@@ -15,20 +15,21 @@ DEFAULTS = {
"name": "",
"user": "",
"password": "",
"connect_timeout_sec": 20,
"batch_size": 1000,
"session": {
"timezone": "Asia/Taipei",
"statement_timeout_ms": 30000,
"lock_timeout_ms": 5000,
"idle_in_tx_timeout_ms": 600000,
},
},
"api": {
"base_url": "https://pc.ficoo.vip/apiprod/admin/v1",
"token": None,
"timeout_sec": 20,
"page_size": 200,
"params": {},
"retries": {
"max_attempts": 3,
"backoff_sec": [1, 2, 4],
@@ -37,9 +38,19 @@ DEFAULTS = {
},
"run": {
"tasks": [
"PRODUCTS",
"TABLES",
"MEMBERS",
"ASSISTANTS",
"PACKAGES_DEF",
"ORDERS",
"PAYMENTS",
"REFUNDS",
"COUPON_USAGE",
"INVENTORY_CHANGE",
"TOPUPS",
"TABLE_DISCOUNT",
"ASSISTANT_ABOLISH",
"LEDGER",
],
"window_minutes": {
@@ -49,18 +60,26 @@ DEFAULTS = {
"overlap_seconds": 120,
"idle_window": {
"start": "04:00",
"end": "16:00",
},
"allow_empty_result_advance": True,
},
"io": {
"export_root": r"C:\dev\LLTQ\export\JSON",
"log_root": r"C:\dev\LLTQ\export\LOG",
"manifest_name": "manifest.json",
"ingest_report_name": "ingest_report.json",
"write_pretty_json": True,
"max_file_bytes": 50 * 1024 * 1024,
},
"pipeline": {
# 运行流程FETCH_ONLY仅在线抓取落盘、INGEST_ONLY本地清洗入库、FULL抓取 + 清洗入库)
"flow": "FULL",
# 在线抓取 JSON 输出根目录按任务、run_id 与时间自动创建子目录)
"fetch_root": r"C:\dev\LLTQ\export\JSON",
# 本地清洗入库时的 JSON 输入目录(为空则默认使用本次抓取目录)
"ingest_source_dir": "",
},
"clean": {
"log_unknown_fields": True,
"unknown_fields_limit": 50,
@@ -76,31 +95,26 @@ DEFAULTS = {
"redact_keys": ["token", "password", "Authorization"],
"echo_token_in_logs": False,
},
"ods": {
# ODS 离线重建/回放相关(仅开发/运维使用)
"json_doc_dir": r"C:\dev\LLTQ\export\test-json-doc",
"include_files": "",
"drop_schema_first": True,
},
}
# 任务代码常量
TASK_ORDERS = "ORDERS"
TASK_PAYMENTS = "PAYMENTS"
TASK_REFUNDS = "REFUNDS"
TASK_INVENTORY_CHANGE = "INVENTORY_CHANGE"
TASK_COUPON_USAGE = "COUPON_USAGE"
TASK_MEMBERS = "MEMBERS"
TASK_ASSISTANTS = "ASSISTANTS"
TASK_PRODUCTS = "PRODUCTS"
TASK_TABLES = "TABLES"
TASK_PACKAGES_DEF = "PACKAGES_DEF"
TASK_TOPUPS = "TOPUPS"
TASK_TABLE_DISCOUNT = "TABLE_DISCOUNT"
TASK_ASSISTANT_ABOLISH = "ASSISTANT_ABOLISH"
TASK_LEDGER = "LEDGER"

View File

@@ -2,42 +2,59 @@
"""环境变量解析"""
import os
import json
from pathlib import Path
from copy import deepcopy
ENV_MAP = {
"TIMEZONE": ("app.timezone",),
"STORE_ID": ("app.store_id",),
"SCHEMA_OLTP": ("app.schema_oltp",),
"SCHEMA_ETL": ("app.schema_etl",),
"PG_DSN": ("db.dsn",),
"PG_HOST": ("db.host",),
"PG_PORT": ("db.port",),
"PG_NAME": ("db.name",),
"PG_USER": ("db.user",),
"PG_PASSWORD": ("db.password",),
"PG_CONNECT_TIMEOUT": ("db.connect_timeout_sec",),
"API_BASE": ("api.base_url",),
"API_TOKEN": ("api.token",),
"FICOO_TOKEN": ("api.token",),
"API_TIMEOUT": ("api.timeout_sec",),
"API_PAGE_SIZE": ("api.page_size",),
"API_RETRY_MAX": ("api.retries.max_attempts",),
"API_RETRY_BACKOFF": ("api.retries.backoff_sec",),
"API_PARAMS": ("api.params",),
"EXPORT_ROOT": ("io.export_root",),
"LOG_ROOT": ("io.log_root",),
"MANIFEST_NAME": ("io.manifest_name",),
"INGEST_REPORT_NAME": ("io.ingest_report_name",),
"WRITE_PRETTY_JSON": ("io.write_pretty_json",),
"RUN_TASKS": ("run.tasks",),
"OVERLAP_SECONDS": ("run.overlap_seconds",),
"WINDOW_BUSY_MIN": ("run.window_minutes.default_busy",),
"WINDOW_IDLE_MIN": ("run.window_minutes.default_idle",),
"IDLE_START": ("run.idle_window.start",),
"IDLE_END": ("run.idle_window.end",),
"IDLE_WINDOW_START": ("run.idle_window.start",),
"IDLE_WINDOW_END": ("run.idle_window.end",),
"ALLOW_EMPTY_RESULT_ADVANCE": ("run.allow_empty_result_advance",),
"ALLOW_EMPTY_ADVANCE": ("run.allow_empty_result_advance",),
"PIPELINE_FLOW": ("pipeline.flow",),
"JSON_FETCH_ROOT": ("pipeline.fetch_root",),
"JSON_SOURCE_DIR": ("pipeline.ingest_source_dir",),
"FETCH_ROOT": ("pipeline.fetch_root",),
"INGEST_SOURCE_DIR": ("pipeline.ingest_source_dir",),
}
def _deep_set(d, dotted_keys, value):
cur = d
for k in dotted_keys[:-1]:
cur = cur.setdefault(k, {})
cur[dotted_keys[-1]] = value
def _coerce_env(v: str):
if v is None:
return None
@@ -56,13 +73,103 @@ def _coerce_env(v: str):
return s
return s
def _strip_inline_comment(value: str) -> str:
"""去掉未被引号包裹的内联注释"""
result = []
in_quote = False
quote_char = ""
escape = False
for ch in value:
if escape:
result.append(ch)
escape = False
continue
if ch == "\\":
escape = True
result.append(ch)
continue
if ch in ("'", '"'):
if not in_quote:
in_quote = True
quote_char = ch
elif quote_char == ch:
in_quote = False
quote_char = ""
result.append(ch)
continue
if ch == "#" and not in_quote:
break
result.append(ch)
return "".join(result).rstrip()
def _unquote_value(value: str) -> str:
"""处理引号/原始字符串以及尾随逗号"""
trimmed = value.strip()
trimmed = _strip_inline_comment(trimmed)
trimmed = trimmed.rstrip(",").rstrip()
if not trimmed:
return trimmed
if len(trimmed) >= 2 and trimmed[0] in ("'", '"') and trimmed[-1] == trimmed[0]:
return trimmed[1:-1]
if (
len(trimmed) >= 3
and trimmed[0] in ("r", "R")
and trimmed[1] in ("'", '"')
and trimmed[-1] == trimmed[1]
):
return trimmed[2:-1]
return trimmed
def _parse_dotenv_line(line: str) -> tuple[str, str] | None:
"""解析 .env 文件中的单行"""
stripped = line.strip()
if not stripped or stripped.startswith("#"):
return None
if stripped.startswith("export "):
stripped = stripped[len("export ") :].strip()
if "=" not in stripped:
return None
key, value = stripped.split("=", 1)
key = key.strip()
value = _unquote_value(value)
return key, value
def _load_dotenv_values() -> dict:
"""从项目根目录读取 .env 文件键值"""
if os.environ.get("ETL_SKIP_DOTENV") in ("1", "true", "TRUE", "True"):
return {}
root = Path(__file__).resolve().parents[1]
dotenv_path = root / ".env"
if not dotenv_path.exists():
return {}
values: dict[str, str] = {}
for line in dotenv_path.read_text(encoding="utf-8", errors="ignore").splitlines():
parsed = _parse_dotenv_line(line)
if parsed:
key, value = parsed
values[key] = value
return values
def _apply_env_values(cfg: dict, source: dict):
for env_key, dotted in ENV_MAP.items():
val = source.get(env_key)
if val is None:
continue
v2 = _coerce_env(val)
for path in dotted:
if path == "run.tasks" and isinstance(v2, str):
v2 = [item.strip() for item in v2.split(",") if item.strip()]
_deep_set(cfg, path.split("."), v2)
def load_env_overrides(defaults: dict) -> dict:
cfg = deepcopy(defaults)
# 先读取 .env再读取真实环境变量确保 CLI 仍然最高优先级
_apply_env_values(cfg, _load_dotenv_values())
_apply_env_values(cfg, os.environ)
return cfg
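The dotenv handling above can be illustrated with a reduced standalone parser. This sketch covers only the common cases (an `export ` prefix, full-line and unquoted inline comments, single or double quotes); the real implementation additionally handles escapes, `r"…"` raw-string prefixes, and trailing commas:

```python
def parse_dotenv_line(line):
    """Simplified _parse_dotenv_line: skip comments, strip export, unquote values."""
    s = line.strip()
    if not s or s.startswith("#"):
        return None
    if s.startswith("export "):
        s = s[len("export "):].strip()
    if "=" not in s:
        return None
    key, value = s.split("=", 1)
    value = value.strip()
    # Drop an unquoted inline comment; quoted values keep their '#' characters.
    if "#" in value and not (value.startswith("'") or value.startswith('"')):
        value = value.split("#", 1)[0].rstrip()
    if len(value) >= 2 and value[0] in ("'", '"') and value[-1] == value[0]:
        value = value[1:-1]
    return key.strip(), value

lines = [
    "# comment",
    "export PG_DSN=postgresql://user:pw@host/db",
    "API_TIMEOUT=20  # seconds",
    'INGEST_SOURCE_DIR="C:\\dev\\LLTQ\\export\\test-json-doc"',
]
env = dict(p for p in map(parse_dotenv_line, lines) if p)
print(env["API_TIMEOUT"], env["PG_DSN"])
```

Applying `.env` values before real environment variables, as `load_env_overrides` does, is what keeps the precedence order defaults < .env < environment < CLI.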

View File

@@ -49,6 +49,13 @@ class AppConfig:
f"@{cfg['db']['host']}:{cfg['db']['port']}/{cfg['db']['name']}"
)
# connect_timeout 限定 1-20 秒
try:
timeout_sec = int(cfg["db"].get("connect_timeout_sec") or 5)
except Exception:
raise SystemExit("db.connect_timeout_sec 必须为整数")
cfg["db"]["connect_timeout_sec"] = max(1, min(timeout_sec, 20))
# 会话参数
cfg["db"].setdefault("session", {})
sess = cfg["db"]["session"]

View File

@@ -1,49 +1,62 @@
# -*- coding: utf-8 -*-
"""Database connection manager with capped connect_timeout."""
import psycopg2
import psycopg2.extras
class DatabaseConnection:
"""Wrap psycopg2 connection with session parameters and timeout guard."""
def __init__(self, dsn: str, session: dict = None, connect_timeout: int = None):
timeout_val = connect_timeout if connect_timeout is not None else 5
# PRD: database connect_timeout must not exceed 20 seconds.
timeout_val = max(1, min(int(timeout_val), 20))
self.conn = psycopg2.connect(dsn, connect_timeout=timeout_val)
self.conn.autocommit = False
# Session parameters (timezone, statement timeout, etc.)
if session:
with self.conn.cursor() as c:
if session.get("timezone"):
c.execute("SET TIME ZONE %s", (session["timezone"],))
if session.get("statement_timeout_ms") is not None:
c.execute(
"SET statement_timeout = %s",
(int(session["statement_timeout_ms"]),),
)
if session.get("lock_timeout_ms") is not None:
c.execute(
"SET lock_timeout = %s", (int(session["lock_timeout_ms"]),)
)
if session.get("idle_in_tx_timeout_ms") is not None:
c.execute(
"SET idle_in_transaction_session_timeout = %s",
(int(session["idle_in_tx_timeout_ms"]),),
)
def query(self, sql: str, args=None):
"""Execute a query and fetch all rows."""
with self.conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as c:
c.execute(sql, args)
return c.fetchall()
def execute(self, sql: str, args=None):
"""Execute a SQL statement without returning rows."""
with self.conn.cursor() as c:
c.execute(sql, args)
def commit(self):
"""Commit current transaction."""
self.conn.commit()
def rollback(self):
"""Rollback current transaction."""
self.conn.rollback()
def close(self):
"""Safely close the connection."""
try:
self.conn.close()
except Exception:

View File

@@ -7,6 +7,7 @@ class DatabaseOperations:
"""数据库批量操作封装"""
def __init__(self, connection):
self._connection = connection
self.conn = connection.conn
def batch_execute(self, sql: str, rows: list, page_size: int = 1000):
@@ -75,3 +76,24 @@ class DatabaseOperations:
if isinstance(rec, dict):
return bool(rec.get("inserted"))
return False
# --- pass-through helpers -------------------------------------------------
def commit(self):
"""提交事务(委托给底层连接)"""
self._connection.commit()
def rollback(self):
"""回滚事务(委托给底层连接)"""
self._connection.rollback()
def query(self, sql: str, args=None):
"""执行查询并返回结果"""
return self._connection.query(sql, args)
def execute(self, sql: str, args=None):
"""执行任意 SQL"""
self._connection.execute(sql, args)
def cursor(self):
"""暴露原生 cursor供特殊操作使用"""
return self.conn.cursor()
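The pass-through helpers exist so task code can depend on `db_ops` alone, which in turn makes the facade easy to test against a stub connection. A minimal standalone sketch of that delegation pattern (`StubConnection` and `Ops` are illustrative, not the project's classes):

```python
class StubConnection:
    """In-memory stand-in for DatabaseConnection, recording every call."""
    def __init__(self):
        self.calls = []
    def query(self, sql, args=None):
        self.calls.append(("query", sql))
        return [{"ok": 1}]
    def execute(self, sql, args=None):
        self.calls.append(("execute", sql))
    def commit(self):
        self.calls.append(("commit", None))
    def rollback(self):
        self.calls.append(("rollback", None))

class Ops:
    """Facade mirroring DatabaseOperations' delegation helpers."""
    def __init__(self, connection):
        self._connection = connection
    def query(self, sql, args=None):
        return self._connection.query(sql, args)
    def execute(self, sql, args=None):
        self._connection.execute(sql, args)
    def commit(self):
        self._connection.commit()

conn = StubConnection()
ops = Ops(conn)
rows = ops.query("SELECT 1")
ops.execute("UPDATE t SET x = 1")
ops.commit()
print(rows, [c[0] for c in conn.calls])  # [{'ok': 1}] ['query', 'execute', 'commit']
```

Swapping `StubConnection` for the real psycopg2-backed connection requires no change to the calling code, which is the point of the facade.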

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,105 @@
-- 文件说明etl_admin 调度元数据 DDL独立文件便于初始化任务单独执行
-- 包含任务注册表、游标表、运行记录表;字段注释使用中文。
CREATE SCHEMA IF NOT EXISTS etl_admin;
CREATE TABLE IF NOT EXISTS etl_admin.etl_task (
task_id BIGSERIAL PRIMARY KEY,
task_code TEXT NOT NULL,
store_id BIGINT NOT NULL,
enabled BOOLEAN DEFAULT TRUE,
cursor_field TEXT,
window_minutes_default INT DEFAULT 30,
overlap_seconds INT DEFAULT 120,
page_size INT DEFAULT 200,
retry_max INT DEFAULT 3,
params JSONB DEFAULT '{}'::jsonb,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now(),
UNIQUE (task_code, store_id)
);
COMMENT ON TABLE etl_admin.etl_task IS '任务注册表:调度依据的任务清单(与 task_registry 中的任务码对应)。';
COMMENT ON COLUMN etl_admin.etl_task.task_code IS '任务编码,需与代码中的任务码一致。';
COMMENT ON COLUMN etl_admin.etl_task.store_id IS '门店/租户粒度,区分多门店执行。';
COMMENT ON COLUMN etl_admin.etl_task.enabled IS '是否启用此任务。';
COMMENT ON COLUMN etl_admin.etl_task.cursor_field IS '增量游标字段名(可选)。';
COMMENT ON COLUMN etl_admin.etl_task.window_minutes_default IS '默认时间窗口(分钟)。';
COMMENT ON COLUMN etl_admin.etl_task.overlap_seconds IS '窗口重叠秒数,用于防止遗漏。';
COMMENT ON COLUMN etl_admin.etl_task.page_size IS '默认分页大小。';
COMMENT ON COLUMN etl_admin.etl_task.retry_max IS 'API重试次数上限。';
COMMENT ON COLUMN etl_admin.etl_task.params IS '任务级自定义参数 JSON。';
COMMENT ON COLUMN etl_admin.etl_task.created_at IS '创建时间。';
COMMENT ON COLUMN etl_admin.etl_task.updated_at IS '更新时间。';
CREATE TABLE IF NOT EXISTS etl_admin.etl_cursor (
cursor_id BIGSERIAL PRIMARY KEY,
task_id BIGINT NOT NULL REFERENCES etl_admin.etl_task(task_id) ON DELETE CASCADE,
store_id BIGINT NOT NULL,
last_start TIMESTAMPTZ,
last_end TIMESTAMPTZ,
last_id BIGINT,
last_run_id BIGINT,
extra JSONB DEFAULT '{}'::jsonb,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now(),
UNIQUE (task_id, store_id)
);
COMMENT ON TABLE etl_admin.etl_cursor IS '任务游标表:记录每个任务/门店的增量窗口及最后 run。';
COMMENT ON COLUMN etl_admin.etl_cursor.task_id IS '关联 etl_task.task_id。';
COMMENT ON COLUMN etl_admin.etl_cursor.store_id IS '门店/租户粒度。';
COMMENT ON COLUMN etl_admin.etl_cursor.last_start IS '上次窗口开始时间(含重叠偏移)。';
COMMENT ON COLUMN etl_admin.etl_cursor.last_end IS '上次窗口结束时间。';
COMMENT ON COLUMN etl_admin.etl_cursor.last_id IS '上次处理的最大主键/游标值(可选)。';
COMMENT ON COLUMN etl_admin.etl_cursor.last_run_id IS '上次运行ID对应 etl_run.run_id。';
COMMENT ON COLUMN etl_admin.etl_cursor.extra IS '附加游标信息 JSON。';
COMMENT ON COLUMN etl_admin.etl_cursor.created_at IS '创建时间。';
COMMENT ON COLUMN etl_admin.etl_cursor.updated_at IS '更新时间。';
CREATE TABLE IF NOT EXISTS etl_admin.etl_run (
run_id BIGSERIAL PRIMARY KEY,
run_uuid TEXT NOT NULL,
task_id BIGINT NOT NULL REFERENCES etl_admin.etl_task(task_id) ON DELETE CASCADE,
store_id BIGINT NOT NULL,
status TEXT NOT NULL,
started_at TIMESTAMPTZ DEFAULT now(),
ended_at TIMESTAMPTZ,
window_start TIMESTAMPTZ,
window_end TIMESTAMPTZ,
window_minutes INT,
overlap_seconds INT,
fetched_count INT DEFAULT 0,
loaded_count INT DEFAULT 0,
updated_count INT DEFAULT 0,
skipped_count INT DEFAULT 0,
error_count INT DEFAULT 0,
unknown_fields INT DEFAULT 0,
export_dir TEXT,
log_path TEXT,
request_params JSONB DEFAULT '{}'::jsonb,
manifest JSONB DEFAULT '{}'::jsonb,
error_message TEXT,
extra JSONB DEFAULT '{}'::jsonb
);
COMMENT ON TABLE etl_admin.etl_run IS '运行记录表:记录每次任务执行的窗口、状态、计数与日志路径。';
COMMENT ON COLUMN etl_admin.etl_run.run_uuid IS '本次调度的唯一标识。';
COMMENT ON COLUMN etl_admin.etl_run.task_id IS '关联 etl_task.task_id。';
COMMENT ON COLUMN etl_admin.etl_run.store_id IS '门店/租户粒度。';
COMMENT ON COLUMN etl_admin.etl_run.status IS '运行状态SUCC/FAIL/PARTIAL 等)。';
COMMENT ON COLUMN etl_admin.etl_run.started_at IS '开始时间。';
COMMENT ON COLUMN etl_admin.etl_run.ended_at IS '结束时间。';
COMMENT ON COLUMN etl_admin.etl_run.window_start IS '本次窗口开始时间。';
COMMENT ON COLUMN etl_admin.etl_run.window_end IS '本次窗口结束时间。';
COMMENT ON COLUMN etl_admin.etl_run.window_minutes IS '窗口跨度(分钟)。';
COMMENT ON COLUMN etl_admin.etl_run.overlap_seconds IS '窗口重叠秒数。';
COMMENT ON COLUMN etl_admin.etl_run.fetched_count IS '抓取/读取的记录数。';
COMMENT ON COLUMN etl_admin.etl_run.loaded_count IS '插入的记录数。';
COMMENT ON COLUMN etl_admin.etl_run.updated_count IS '更新的记录数。';
COMMENT ON COLUMN etl_admin.etl_run.skipped_count IS '跳过的记录数。';
COMMENT ON COLUMN etl_admin.etl_run.error_count IS '错误记录数。';
COMMENT ON COLUMN etl_admin.etl_run.unknown_fields IS '未知字段计数(清洗阶段)。';
COMMENT ON COLUMN etl_admin.etl_run.export_dir IS '抓取/导出目录。';
COMMENT ON COLUMN etl_admin.etl_run.log_path IS '日志路径。';
COMMENT ON COLUMN etl_admin.etl_run.request_params IS '请求参数 JSON。';
COMMENT ON COLUMN etl_admin.etl_run.manifest IS '运行产出清单/统计 JSON。';
COMMENT ON COLUMN etl_admin.etl_run.error_message IS '错误信息(若失败)。';
COMMENT ON COLUMN etl_admin.etl_run.extra IS '附加字段,保留扩展。';
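The cursor table stores `last_end` per task, and the task/run tables carry `window_minutes` and `overlap_seconds`. One plausible way a scheduler could combine them into the next incremental window is sketched below; the exact rule here (rewind by the overlap, cap at "now") is an assumption for illustration, not necessarily the project's scheduling logic:

```python
from datetime import datetime, timedelta, timezone

def next_window(last_end, window_minutes=30, overlap_seconds=120, now=None):
    """Derive the next fetch window: restart overlap_seconds before last_end,
    advance by window_minutes, and never run past the current time."""
    now = now or datetime.now(timezone.utc)
    start = last_end - timedelta(seconds=overlap_seconds)
    end = min(start + timedelta(minutes=window_minutes), now)
    return start, end

now = datetime(2025, 12, 9, 5, 0, tzinfo=timezone.utc)
last_end = datetime(2025, 12, 9, 4, 0, tzinfo=timezone.utc)
start, end = next_window(last_end, window_minutes=30, overlap_seconds=120, now=now)
print(start.isoformat(), end.isoformat())  # 2025-12-09T03:58:00+00:00 2025-12-09T04:28:00+00:00
```

The overlap rewind is what makes adjacent windows intentionally overlap, so records committed late near a window boundary are fetched again rather than lost (upserts keep the re-fetch idempotent).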

View File

@@ -0,0 +1,41 @@
-- Register the new ODS tasks into etl_admin.etl_task (replace store_id as needed).
-- Usage (example):
-- psql "$PG_DSN" -f etl_billiards/database/seed_ods_tasks.sql
-- or run the contents of this file inside psql.
WITH target_store AS (
    SELECT 2790685415443269::bigint AS store_id -- TODO: replace with the actual store_id
),
task_codes AS (
SELECT unnest(ARRAY[
'assistant_accounts_masterS',
'assistant_service_records',
'assistant_cancellation_records',
'goods_stock_movements',
'ODS_INVENTORY_STOCK',
'ODS_PACKAGE',
'ODS_GROUP_BUY_REDEMPTION',
'ODS_MEMBER',
'ODS_MEMBER_BALANCE',
'member_stored_value_cards',
'ODS_PAYMENT',
'ODS_REFUND',
'platform_coupon_redemption_records',
'recharge_settlements',
'ODS_TABLES',
'ODS_GOODS_CATEGORY',
'ODS_STORE_GOODS',
'table_fee_discount_records',
'ODS_TENANT_GOODS',
'ODS_SETTLEMENT_TICKET',
'settlement_records',
'INIT_ODS_SCHEMA'
]) AS task_code
)
INSERT INTO etl_admin.etl_task (task_code, store_id, enabled)
SELECT t.task_code, s.store_id, TRUE
FROM task_codes t CROSS JOIN target_store s
ON CONFLICT (task_code, store_id) DO UPDATE
SET enabled = EXCLUDED.enabled;

View File

@@ -1,19 +1,23 @@
# -*- coding: utf-8 -*-
"""数据加载器基类"""
import logging
class BaseLoader:
"""数据加载器基类"""
def __init__(self, db_ops):
def __init__(self, db_ops, logger=None):
self.db = db_ops
self.logger = logger or logging.getLogger(self.__class__.__name__)
def upsert(self, records: list) -> tuple:
"""
执行UPSERT操作
执行 UPSERT 操作
返回: (inserted_count, updated_count, skipped_count)
"""
raise NotImplementedError("子类需实现 upsert 方法")
def _batch_size(self) -> int:
"""批次大小"""
return 1000

View File

@@ -1,29 +1,55 @@
# -*- coding: utf-8 -*-
"""支付事实表加载器"""
from ..base_loader import BaseLoader
class PaymentLoader(BaseLoader):
"""支付数据加载器"""
def upsert_payments(self, records: list, store_id: int) -> tuple:
"""加载支付数据"""
if not records:
return (0, 0, 0)
sql = """
INSERT INTO billiards.fact_payment (
store_id, pay_id, order_id,
site_id, tenant_id,
order_settle_id, order_trade_no,
relate_type, relate_id,
create_time, pay_time,
pay_amount, fee_amount, discount_amount,
payment_method, pay_type,
online_pay_channel, pay_terminal,
pay_status, remark, raw_data
)
VALUES (
%(store_id)s, %(pay_id)s, %(order_id)s,
%(site_id)s, %(tenant_id)s,
%(order_settle_id)s, %(order_trade_no)s,
%(relate_type)s, %(relate_id)s,
%(create_time)s, %(pay_time)s,
%(pay_amount)s, %(fee_amount)s, %(discount_amount)s,
%(payment_method)s, %(pay_type)s,
%(online_pay_channel)s, %(pay_terminal)s,
%(pay_status)s, %(remark)s, %(raw_data)s
)
ON CONFLICT (store_id, pay_id) DO UPDATE SET
order_settle_id = EXCLUDED.order_settle_id,
order_trade_no = EXCLUDED.order_trade_no,
relate_type = EXCLUDED.relate_type,
relate_id = EXCLUDED.relate_id,
order_id = EXCLUDED.order_id,
site_id = EXCLUDED.site_id,
tenant_id = EXCLUDED.tenant_id,
create_time = EXCLUDED.create_time,
pay_time = EXCLUDED.pay_time,
pay_amount = EXCLUDED.pay_amount,
fee_amount = EXCLUDED.fee_amount,
discount_amount = EXCLUDED.discount_amount,
payment_method = EXCLUDED.payment_method,
pay_type = EXCLUDED.pay_type,
online_pay_channel = EXCLUDED.online_pay_channel,
pay_terminal = EXCLUDED.pay_terminal,
pay_status = EXCLUDED.pay_status,
remark = EXCLUDED.remark,
raw_data = EXCLUDED.raw_data,

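Because the UPSERT uses named `%(field)s` placeholders, every record dict must contain every column key, or the batch execute raises `KeyError`. One way to guarantee that is to project each API record onto the column list with `None` defaults; the sketch below illustrates the shape (the input field names are assumed for illustration, the real API payload may use different keys that a transformer maps first):

```python
import json

PAYMENT_FIELDS = [
    "pay_id", "order_id", "site_id", "tenant_id", "order_settle_id",
    "order_trade_no", "relate_type", "relate_id", "create_time", "pay_time",
    "pay_amount", "fee_amount", "discount_amount", "payment_method", "pay_type",
    "online_pay_channel", "pay_terminal", "pay_status", "remark",
]

def to_payment_row(api_record, store_id):
    """Project an API record onto the UPSERT's named placeholders,
    defaulting missing keys to None so the batch insert never raises KeyError."""
    row = {field: api_record.get(field) for field in PAYMENT_FIELDS}
    row["store_id"] = store_id
    # Keep the untouched source record for auditing/reprocessing.
    row["raw_data"] = json.dumps(api_record, ensure_ascii=False)
    return row

row = to_payment_row({"pay_id": 9, "pay_amount": "58.00", "pay_status": 1},
                     store_id=2790685415443269)
print(row["pay_id"], row["fee_amount"])  # 9 None
```

Carrying `raw_data` through the fact table means a later schema change can be backfilled from the stored JSON without re-fetching from the API.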
View File

@@ -0,0 +1,188 @@
# -*- coding: utf-8 -*-
"""小票详情加载器"""
from ..base_loader import BaseLoader
import json
class TicketLoader(BaseLoader):
"""
Loader for parsing Ticket Detail JSON and populating DWD fact tables.
Handles:
- fact_order (Header)
- fact_order_goods (Items)
- fact_table_usage (Items)
- fact_assistant_service (Items)
"""
def process_tickets(self, tickets: list, store_id: int) -> tuple:
"""
Process a batch of ticket JSONs.
Returns (inserted_count, error_count)
"""
inserted_count = 0
error_count = 0
# Prepare batch lists
orders = []
goods_list = []
table_usages = []
assistant_services = []
for ticket in tickets:
try:
# 1. Parse Header (fact_order)
root_data = ticket.get("data", {}).get("data", {})
if not root_data:
continue
order_settle_id = root_data.get("orderSettleId")
if not order_settle_id:
continue
orders.append({
"store_id": store_id,
"order_settle_id": order_settle_id,
"order_trade_no": 0,
"order_no": str(root_data.get("orderSettleNumber", "")),
"member_id": 0,
"pay_time": root_data.get("payTime"),
"total_amount": root_data.get("consumeMoney", 0),
"pay_amount": root_data.get("actualPayment", 0),
"discount_amount": root_data.get("memberOfferAmount", 0),
"coupon_amount": root_data.get("couponAmount", 0),
"status": "PAID",
"cashier_name": root_data.get("cashierName", ""),
"remark": root_data.get("orderRemark", ""),
"raw_data": json.dumps(ticket, ensure_ascii=False)
})
# 2. Parse Items (orderItem list)
order_items = root_data.get("orderItem", [])
for item in order_items:
order_trade_no = item.get("siteOrderId")
# 2.1 Table Ledger
table_ledger = item.get("tableLedger")
if table_ledger:
table_usages.append({
"store_id": store_id,
"order_ledger_id": table_ledger.get("orderTableLedgerId"),
"order_settle_id": order_settle_id,
"table_id": table_ledger.get("siteTableId"),
"table_name": table_ledger.get("tableName"),
"start_time": table_ledger.get("chargeStartTime"),
"end_time": table_ledger.get("chargeEndTime"),
"duration_minutes": table_ledger.get("useDuration", 0),
"total_amount": table_ledger.get("consumptionAmount", 0),
"pay_amount": table_ledger.get("consumptionAmount", 0) - table_ledger.get("memberDiscountAmount", 0)
})
# 2.2 Goods Ledgers
goods_ledgers = item.get("goodsLedgers", [])
for g in goods_ledgers:
goods_list.append({
"store_id": store_id,
"order_goods_id": g.get("orderGoodsLedgerId"),
"order_settle_id": order_settle_id,
"order_trade_no": order_trade_no,
"goods_id": g.get("siteGoodsId"),
"goods_name": g.get("goodsName"),
"quantity": g.get("goodsCount", 0),
"unit_price": g.get("goodsPrice", 0),
"total_amount": g.get("ledgerAmount", 0),
"pay_amount": g.get("realGoodsMoney", 0)
})
# 2.3 Assistant Services
assistant_ledgers = item.get("assistantPlayWith", [])
for a in assistant_ledgers:
assistant_services.append({
"store_id": store_id,
"ledger_id": a.get("orderAssistantLedgerId"),
"order_settle_id": order_settle_id,
"assistant_id": a.get("assistantId"),
"assistant_name": a.get("ledgerName"),
"service_type": a.get("skillName", "Play"),
"start_time": a.get("ledgerStartTime"),
"end_time": a.get("ledgerEndTime"),
"duration_minutes": int(a.get("ledgerCount", 0) / 60) if a.get("ledgerCount") else 0,
"total_amount": a.get("ledgerAmount", 0),
"pay_amount": a.get("ledgerAmount", 0)
})
inserted_count += 1
except Exception as e:
self.logger.error(f"Error parsing ticket: {e}", exc_info=True)
error_count += 1
# 3. Batch Insert/Upsert
if orders:
self._upsert_orders(orders)
if goods_list:
self._upsert_goods(goods_list)
if table_usages:
self._upsert_table_usages(table_usages)
if assistant_services:
self._upsert_assistant_services(assistant_services)
return inserted_count, error_count
def _upsert_orders(self, rows):
sql = """
INSERT INTO billiards.fact_order (
store_id, order_settle_id, order_trade_no, order_no, member_id,
pay_time, total_amount, pay_amount, discount_amount, coupon_amount,
status, cashier_name, remark, raw_data
) VALUES (
%(store_id)s, %(order_settle_id)s, %(order_trade_no)s, %(order_no)s, %(member_id)s,
%(pay_time)s, %(total_amount)s, %(pay_amount)s, %(discount_amount)s, %(coupon_amount)s,
%(status)s, %(cashier_name)s, %(remark)s, %(raw_data)s
)
ON CONFLICT (store_id, order_settle_id) DO UPDATE SET
pay_time = EXCLUDED.pay_time,
pay_amount = EXCLUDED.pay_amount,
updated_at = now()
"""
self.db.batch_execute(sql, rows)
def _upsert_goods(self, rows):
sql = """
INSERT INTO billiards.fact_order_goods (
store_id, order_goods_id, order_settle_id, order_trade_no,
goods_id, goods_name, quantity, unit_price, total_amount, pay_amount
) VALUES (
%(store_id)s, %(order_goods_id)s, %(order_settle_id)s, %(order_trade_no)s,
%(goods_id)s, %(goods_name)s, %(quantity)s, %(unit_price)s, %(total_amount)s, %(pay_amount)s
)
ON CONFLICT (store_id, order_goods_id) DO UPDATE SET
pay_amount = EXCLUDED.pay_amount
"""
self.db.batch_execute(sql, rows)
def _upsert_table_usages(self, rows):
sql = """
INSERT INTO billiards.fact_table_usage (
store_id, order_ledger_id, order_settle_id, table_id, table_name,
start_time, end_time, duration_minutes, total_amount, pay_amount
) VALUES (
%(store_id)s, %(order_ledger_id)s, %(order_settle_id)s, %(table_id)s, %(table_name)s,
%(start_time)s, %(end_time)s, %(duration_minutes)s, %(total_amount)s, %(pay_amount)s
)
ON CONFLICT (store_id, order_ledger_id) DO UPDATE SET
pay_amount = EXCLUDED.pay_amount
"""
self.db.batch_execute(sql, rows)
def _upsert_assistant_services(self, rows):
sql = """
INSERT INTO billiards.fact_assistant_service (
store_id, ledger_id, order_settle_id, assistant_id, assistant_name,
service_type, start_time, end_time, duration_minutes, total_amount, pay_amount
) VALUES (
%(store_id)s, %(ledger_id)s, %(order_settle_id)s, %(assistant_id)s, %(assistant_name)s,
%(service_type)s, %(start_time)s, %(end_time)s, %(duration_minutes)s, %(total_amount)s, %(pay_amount)s
)
ON CONFLICT (store_id, ledger_id) DO UPDATE SET
pay_amount = EXCLUDED.pay_amount
"""
self.db.batch_execute(sql, rows)


@@ -0,0 +1,6 @@
# -*- coding: utf-8 -*-
"""ODS loader helpers."""
from .generic import GenericODSLoader
__all__ = ["GenericODSLoader"]


@@ -0,0 +1,67 @@
# -*- coding: utf-8 -*-
"""Generic ODS loader that keeps raw payload + primary keys."""
from __future__ import annotations
import json
from datetime import datetime, timezone
from typing import Iterable, Sequence
from ..base_loader import BaseLoader
class GenericODSLoader(BaseLoader):
"""Insert/update helper for ODS tables that share the same pattern."""
def __init__(
self,
db_ops,
table_name: str,
columns: Sequence[str],
conflict_columns: Sequence[str],
):
super().__init__(db_ops)
if not conflict_columns:
raise ValueError("conflict_columns must not be empty for ODS loader")
self.table_name = table_name
self.columns = list(columns)
self.conflict_columns = list(conflict_columns)
self._sql = self._build_sql()
def upsert_rows(self, rows: Iterable[dict]) -> tuple[int, int, int]:
"""Insert/update the provided iterable of dictionaries."""
rows = list(rows)
if not rows:
return (0, 0, 0)
normalized = [self._normalize_row(row) for row in rows]
inserted, updated = self.db.batch_upsert_with_returning(
self._sql, normalized, page_size=self._batch_size()
)
return inserted, updated, 0
def _build_sql(self) -> str:
col_list = ", ".join(self.columns)
placeholders = ", ".join(f"%({col})s" for col in self.columns)
conflict_clause = ", ".join(self.conflict_columns)
update_columns = [c for c in self.columns if c not in self.conflict_columns]
set_clause = ", ".join(f"{col} = EXCLUDED.{col}" for col in update_columns)
return (
f"INSERT INTO {self.table_name} ({col_list}) "
f"VALUES ({placeholders}) "
f"ON CONFLICT ({conflict_clause}) DO UPDATE SET {set_clause} "
f"RETURNING (xmax = 0) AS inserted"
)
def _normalize_row(self, row: dict) -> dict:
normalized = {}
for col in self.columns:
value = row.get(col)
if col == "payload" and value is not None and not isinstance(value, str):
normalized[col] = json.dumps(value, ensure_ascii=False)
else:
normalized[col] = value
if "fetched_at" in normalized and normalized["fetched_at"] is None:
normalized["fetched_at"] = datetime.now(timezone.utc)
return normalized


@@ -0,0 +1,52 @@
{
"source_counts": {
"assistant_accounts_master.json": 2,
"assistant_cancellation_records.json": 2,
"assistant_service_records.json": 2,
"goods_stock_movements.json": 2,
"goods_stock_summary.json": 161,
"group_buy_packages.json": 2,
"group_buy_redemption_records.json": 2,
"member_balance_changes.json": 2,
"member_profiles.json": 2,
"member_stored_value_cards.json": 2,
"payment_transactions.json": 200,
"platform_coupon_redemption_records.json": 200,
"recharge_settlements.json": 2,
"refund_transactions.json": 11,
"settlement_records.json": 2,
"settlement_ticket_details.json": 193,
"site_tables_master.json": 2,
"stock_goods_category_tree.json": 2,
"store_goods_master.json": 2,
"store_goods_sales_records.json": 2,
"table_fee_discount_records.json": 2,
"table_fee_transactions.json": 2,
"tenant_goods_master.json": 2
},
"ods_counts": {
"member_profiles": 199,
"member_balance_changes": 200,
"member_stored_value_cards": 200,
"recharge_settlements": 75,
"settlement_records": 200,
"assistant_cancellation_records": 15,
"assistant_accounts_master": 50,
"assistant_service_records": 200,
"site_tables_master": 71,
"table_fee_discount_records": 200,
"table_fee_transactions": 200,
"goods_stock_movements": 200,
"stock_goods_category_tree": 9,
"goods_stock_summary": 161,
"payment_transactions": 200,
"refund_transactions": 11,
"platform_coupon_redemption_records": 200,
"tenant_goods_master": 156,
"group_buy_packages": 17,
"group_buy_redemption_records": 200,
"settlement_ticket_details": 193,
"store_goods_master": 161,
"store_goods_sales_records": 200
}
}


@@ -1,131 +1,234 @@
# -*- coding: utf-8 -*-
"""ETL调度"""
"""ETL 调度:支持在线抓取、离线清洗入库、全流程三种模式。"""
from __future__ import annotations
import uuid
from datetime import datetime
from pathlib import Path
from zoneinfo import ZoneInfo
from api.client import APIClient
from api.local_json_client import LocalJsonClient
from api.recording_client import RecordingAPIClient
from database.connection import DatabaseConnection
from database.operations import DatabaseOperations
from orchestration.cursor_manager import CursorManager
from orchestration.run_tracker import RunTracker
from orchestration.task_registry import default_registry
class ETLScheduler:
"""ETL任务调度器"""
"""调度多个任务,按 pipeline.flow 执行抓取/清洗入库。"""
def __init__(self, config, logger):
self.config = config
self.logger = logger
self.tz = ZoneInfo(config.get("app.timezone", "Asia/Taipei"))
self.pipeline_flow = str(config.get("pipeline.flow", "FULL") or "FULL").upper()
self.fetch_root = Path(config.get("pipeline.fetch_root") or config["io"]["export_root"])
self.ingest_source_dir = config.get("pipeline.ingest_source_dir") or ""
self.write_pretty_json = bool(config.get("io.write_pretty_json", False))
# 组件
self.db_conn = DatabaseConnection(
dsn=config["db"]["dsn"],
session=config["db"].get("session"),
connect_timeout=config["db"].get("connect_timeout_sec")
connect_timeout=config["db"].get("connect_timeout_sec"),
)
self.db_ops = DatabaseOperations(self.db_conn)
self.api_client = APIClient(
base_url=config["api"]["base_url"],
token=config["api"]["token"],
timeout=config["api"]["timeout_sec"],
retry_max=config["api"]["retries"]["max_attempts"],
headers_extra=config["api"].get("headers_extra")
headers_extra=config["api"].get("headers_extra"),
)
self.cursor_mgr = CursorManager(self.db_conn)
self.run_tracker = RunTracker(self.db_conn)
self.task_registry = default_registry
# ------------------------------------------------------------------ public
def run_tasks(self, task_codes: list | None = None):
"""按配置或传入列表执行任务。"""
run_uuid = uuid.uuid4().hex
store_id = self.config.get("app.store_id")
if not task_codes:
task_codes = self.config.get("run.tasks", [])
self.logger.info(f"开始运行任务: {task_codes}, run_uuid={run_uuid}")
self.logger.info("开始运行任务: %s, run_uuid=%s", task_codes, run_uuid)
for task_code in task_codes:
try:
self._run_single_task(task_code, run_uuid, store_id)
except Exception as exc: # noqa: BLE001
self.logger.error("任务 %s 失败: %s", task_code, exc, exc_info=True)
continue
self.logger.info("所有任务执行完成")
# ------------------------------------------------------------------ internals
def _run_single_task(self, task_code: str, run_uuid: str, store_id: int):
"""运行单个任务"""
# 创建任务实例
task = self.task_registry.create_task(
task_code, self.config, self.db_ops, self.api_client, self.logger
)
# 获取任务配置(从数据库)
"""单个任务的抓取/清洗编排。"""
task_cfg = self._load_task_config(task_code, store_id)
if not task_cfg:
self.logger.warning(f"任务 {task_code} 未启用或不存在")
self.logger.warning("任务 %s 未启用或不存在", task_code)
return
task_id = task_cfg["task_id"]
cursor_data = self.cursor_mgr.get_or_create(task_id, store_id)
# run 记录
export_dir = Path(self.config["io"]["export_root"]) / datetime.now(self.tz).strftime("%Y%m%d")
log_path = str(Path(self.config["io"]["log_root"]) / f"{run_uuid}.log")
run_id = self.run_tracker.create_run(
task_id=task_id,
store_id=store_id,
run_uuid=run_uuid,
export_dir=str(export_dir),
log_path=log_path,
status="RUNNING"
status=self._map_run_status("RUNNING"),
)
# 为抓取阶段准备目录
fetch_dir = self._build_fetch_dir(task_code, run_id)
fetch_stats = None
try:
if self._flow_includes_fetch():
fetch_stats = self._execute_fetch(task_code, cursor_data, fetch_dir, run_id)
if self.pipeline_flow == "FETCH_ONLY":
counts = self._counts_from_fetch(fetch_stats)
self.run_tracker.update_run(
run_id=run_id,
counts=counts,
status=self._map_run_status("SUCCESS"),
ended_at=datetime.now(self.tz),
)
return
if self._flow_includes_ingest():
source_dir = self._resolve_ingest_source(fetch_dir, fetch_stats)
result = self._execute_ingest(task_code, cursor_data, source_dir)
self.run_tracker.update_run(
run_id=run_id,
counts=result["counts"],
status=self._map_run_status(result["status"]),
ended_at=datetime.now(self.tz),
)
if (result.get("status") or "").upper() == "SUCCESS":
window = result.get("window")
if window:
self.cursor_mgr.advance(
task_id=task_id,
store_id=store_id,
window_start=window.get("start"),
window_end=window.get("end"),
run_id=run_id,
)
except Exception as exc: # noqa: BLE001
self.run_tracker.update_run(
run_id=run_id,
counts={},
status="FAIL",
status=self._map_run_status("FAIL"),
ended_at=datetime.now(self.tz),
error_message=str(exc),
)
raise
def _execute_fetch(self, task_code: str, cursor_data: dict | None, fetch_dir: Path, run_id: int):
"""在线抓取阶段:用 RecordingAPIClient 拉取并落盘,不做 Transform/Load。"""
recording_client = RecordingAPIClient(
base_client=self.api_client,
output_dir=fetch_dir,
task_code=task_code,
run_id=run_id,
write_pretty=self.write_pretty_json,
)
task = self.task_registry.create_task(task_code, self.config, self.db_ops, recording_client, self.logger)
context = task._build_context(cursor_data) # type: ignore[attr-defined]
self.logger.info("%s: 抓取阶段开始,目录=%s", task_code, fetch_dir)
extracted = task.extract(context)
# 抓取结束,不执行 transform/load
stats = recording_client.last_dump or {}
fetched_count = stats.get("records") or len(extracted.get("records", [])) if isinstance(extracted, dict) else 0
self.logger.info(
"%s: 抓取完成,文件=%s,记录数=%s",
task_code,
stats.get("file"),
fetched_count,
)
return {"file": stats.get("file"), "records": fetched_count, "pages": stats.get("pages")}
def _execute_ingest(self, task_code: str, cursor_data: dict | None, source_dir: Path):
"""本地清洗入库:使用 LocalJsonClient 回放 JSON走原有任务 ETL。"""
local_client = LocalJsonClient(source_dir)
task = self.task_registry.create_task(task_code, self.config, self.db_ops, local_client, self.logger)
self.logger.info("%s: 本地清洗入库开始,源目录=%s", task_code, source_dir)
return task.execute(cursor_data)
def _build_fetch_dir(self, task_code: str, run_id: int) -> Path:
ts = datetime.now(self.tz).strftime("%Y%m%d-%H%M%S")
return Path(self.fetch_root) / f"{task_code.upper()}-{run_id}-{ts}"
def _resolve_ingest_source(self, fetch_dir: Path, fetch_stats: dict | None) -> Path:
if fetch_stats and fetch_dir.exists():
return fetch_dir
if self.ingest_source_dir:
return Path(self.ingest_source_dir)
raise FileNotFoundError("未提供本地清洗入库所需的 JSON 目录")
def _counts_from_fetch(self, stats: dict | None) -> dict:
fetched = (stats or {}).get("records") or 0
return {
"fetched": fetched,
"inserted": 0,
"updated": 0,
"skipped": 0,
"errors": 0,
}
def _flow_includes_fetch(self) -> bool:
return self.pipeline_flow in {"FETCH_ONLY", "FULL"}
def _flow_includes_ingest(self) -> bool:
return self.pipeline_flow in {"INGEST_ONLY", "FULL"}
def _load_task_config(self, task_code: str, store_id: int) -> dict | None:
"""从数据库加载任务配置。"""
sql = """
SELECT task_id, task_code, store_id, enabled, cursor_field,
window_minutes_default, overlap_seconds, page_size, retry_max, params
FROM etl_admin.etl_task
WHERE store_id = %s AND task_code = %s AND enabled = TRUE
"""
rows = self.db_conn.query(sql, (store_id, task_code))
return rows[0] if rows else None
def close(self):
"""关闭连接"""
"""关闭连接"""
self.db_conn.close()
@staticmethod
def _map_run_status(status: str) -> str:
"""
将任务返回的状态转换为 etl_admin.run_status_enum
(SUCC / FAIL / PARTIAL)
"""
normalized = (status or "").upper()
if normalized in {"SUCCESS", "SUCC"}:
return "SUCC"
if normalized in {"FAIL", "FAILED", "ERROR"}:
return "FAIL"
if normalized in {"RUNNING", "PARTIAL", "PENDING", "IN_PROGRESS"}:
return "PARTIAL"
# 未知状态默认标记为 FAIL便于排查
return "FAIL"


@@ -3,8 +3,6 @@
from tasks.orders_task import OrdersTask
from tasks.payments_task import PaymentsTask
from tasks.members_task import MembersTask
from tasks.products_task import ProductsTask
from tasks.tables_task import TablesTask
from tasks.assistants_task import AssistantsTask
@@ -16,7 +14,15 @@ from tasks.topups_task import TopupsTask
from tasks.table_discount_task import TableDiscountTask
from tasks.assistant_abolish_task import AssistantAbolishTask
from tasks.ledger_task import LedgerTask
from tasks.ods_tasks import ODS_TASK_CLASSES
from tasks.manual_ingest_task import ManualIngestTask
from tasks.payments_dwd_task import PaymentsDwdTask
from tasks.members_dwd_task import MembersDwdTask
from tasks.init_schema_task import InitOdsSchemaTask
from tasks.init_dwd_schema_task import InitDwdSchemaTask
from tasks.dwd_load_task import DwdLoadTask
from tasks.ticket_dwd_task import TicketDwdTask
from tasks.dwd_quality_task import DwdQualityTask
class TaskRegistry:
"""任务注册和工厂"""
@@ -44,12 +50,6 @@ class TaskRegistry:
# 默认注册表
default_registry = TaskRegistry()
default_registry.register("ORDERS", OrdersTask)
default_registry.register("PAYMENTS", PaymentsTask)
default_registry.register("PRODUCTS", ProductsTask)
default_registry.register("TABLES", TablesTask)
default_registry.register("MEMBERS", MembersTask)
@@ -64,4 +64,13 @@ default_registry.register("TOPUPS", TopupsTask)
default_registry.register("TABLE_DISCOUNT", TableDiscountTask)
default_registry.register("ASSISTANT_ABOLISH", AssistantAbolishTask)
default_registry.register("LEDGER", LedgerTask)
default_registry.register("TICKET_DWD", TicketDwdTask)
default_registry.register("MANUAL_INGEST", ManualIngestTask)
default_registry.register("PAYMENTS_DWD", PaymentsDwdTask)
default_registry.register("MEMBERS_DWD", MembersDwdTask)
default_registry.register("INIT_ODS_SCHEMA", InitOdsSchemaTask)
default_registry.register("INIT_DWD_SCHEMA", InitDwdSchemaTask)
default_registry.register("DWD_LOAD_FROM_ODS", DwdLoadTask)
default_registry.register("DWD_QUALITY_CHECK", DwdQualityTask)
for code, task_cls in ODS_TASK_CLASSES.items():
default_registry.register(code, task_cls)
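The registry body is elided by the hunks above, so as an orientation aid here is a minimal sketch of the register/create pattern it implements. `MiniRegistry` and `DemoTask` are invented for illustration; the real `TaskRegistry.create_task` additionally wires config, db, api client, and logger into each task.

```python
class MiniRegistry:
    """Sketch of the register/create-by-code pattern used by TaskRegistry."""

    def __init__(self):
        self._tasks = {}

    def register(self, code, task_cls):
        # Codes are normalized to upper case, matching the CLI's --tasks values
        self._tasks[code.upper()] = task_cls

    def create_task(self, code, *deps):
        try:
            task_cls = self._tasks[code.upper()]
        except KeyError:
            raise ValueError(f"unknown task code: {code}") from None
        return task_cls(*deps)


class DemoTask:
    def __init__(self, config):
        self.config = config


reg = MiniRegistry()
reg.register("DEMO", DemoTask)
task = reg.create_task("demo", {"store_id": 1})
print(type(task).__name__)
```

New task types plug in with a single `register` call, which is how the ODS task classes are attached in the loop above.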


@@ -0,0 +1,692 @@
{
"generated_at": "2025-12-09T05:21:24.745244",
"tables": [
{
"dwd_table": "billiards_dwd.dim_site",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 1,
"ods": 200,
"diff": -199
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_site_ex",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 1,
"ods": 200,
"diff": -199
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_table",
"ods_table": "billiards_ods.site_tables_master",
"count": {
"dwd": 71,
"ods": 71,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_table_ex",
"ods_table": "billiards_ods.site_tables_master",
"count": {
"dwd": 71,
"ods": 71,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_assistant",
"ods_table": "billiards_ods.assistant_accounts_master",
"count": {
"dwd": 50,
"ods": 50,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_assistant_ex",
"ods_table": "billiards_ods.assistant_accounts_master",
"count": {
"dwd": 50,
"ods": 50,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_member",
"ods_table": "billiards_ods.member_profiles",
"count": {
"dwd": 199,
"ods": 199,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_member_ex",
"ods_table": "billiards_ods.member_profiles",
"count": {
"dwd": 199,
"ods": 199,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_member_card_account",
"ods_table": "billiards_ods.member_stored_value_cards",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "balance",
"dwd_sum": 31061.03,
"ods_sum": 31061.03,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dim_member_card_account_ex",
"ods_table": "billiards_ods.member_stored_value_cards",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "deliveryfeededuct",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dim_tenant_goods",
"ods_table": "billiards_ods.tenant_goods_master",
"count": {
"dwd": 156,
"ods": 156,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_tenant_goods_ex",
"ods_table": "billiards_ods.tenant_goods_master",
"count": {
"dwd": 156,
"ods": 156,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_store_goods",
"ods_table": "billiards_ods.store_goods_master",
"count": {
"dwd": 161,
"ods": 161,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_store_goods_ex",
"ods_table": "billiards_ods.store_goods_master",
"count": {
"dwd": 161,
"ods": 161,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_goods_category",
"ods_table": "billiards_ods.stock_goods_category_tree",
"count": {
"dwd": 26,
"ods": 9,
"diff": 17
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_groupbuy_package",
"ods_table": "billiards_ods.group_buy_packages",
"count": {
"dwd": 17,
"ods": 17,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dim_groupbuy_package_ex",
"ods_table": "billiards_ods.group_buy_packages",
"count": {
"dwd": 17,
"ods": 17,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_settlement_head",
"ods_table": "billiards_ods.settlement_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_settlement_head_ex",
"ods_table": "billiards_ods.settlement_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_log",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "adjust_amount",
"dwd_sum": 1157.45,
"ods_sum": 1157.45,
"diff": 0.0
},
{
"column": "coupon_promotion_amount",
"dwd_sum": 11244.49,
"ods_sum": 11244.49,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 18107.0,
"ods_sum": 18107.0,
"diff": 0.0
},
{
"column": "member_discount_amount",
"dwd_sum": 1149.19,
"ods_sum": 1149.19,
"diff": 0.0
},
{
"column": "real_table_charge_money",
"dwd_sum": 5705.06,
"ods_sum": 5705.06,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_log_ex",
"ods_table": "billiards_ods.table_fee_transactions",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "fee_total",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "mgmt_fee",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "service_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "used_card_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_adjust",
"ods_table": "billiards_ods.table_fee_discount_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "ledger_amount",
"dwd_sum": 20650.84,
"ods_sum": 20650.84,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_table_fee_adjust_ex",
"ods_table": "billiards_ods.table_fee_discount_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_store_goods_sale",
"ods_table": "billiards_ods.store_goods_sales_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "cost_money",
"dwd_sum": 22.3,
"ods_sum": 22.3,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 4583.0,
"ods_sum": 4583.0,
"diff": 0.0
},
{
"column": "real_goods_money",
"dwd_sum": 3791.0,
"ods_sum": 3791.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_store_goods_sale_ex",
"ods_table": "billiards_ods.store_goods_sales_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_deduct_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "discount_money",
"dwd_sum": 792.0,
"ods_sum": 792.0,
"diff": 0.0
},
{
"column": "member_discount_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "option_coupon_deduct_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "option_member_discount_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "point_discount_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "point_discount_money_cost",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "push_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_assistant_service_log",
"ods_table": "billiards_ods.assistant_service_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_deduct_money",
"dwd_sum": 626.83,
"ods_sum": 626.83,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 63251.37,
"ods_sum": 63251.37,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_assistant_service_log_ex",
"ods_table": "billiards_ods.assistant_service_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "manual_discount_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "member_discount_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "service_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_assistant_trash_event",
"ods_table": "billiards_ods.assistant_cancellation_records",
"count": {
"dwd": 15,
"ods": 15,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_assistant_trash_event_ex",
"ods_table": "billiards_ods.assistant_cancellation_records",
"count": {
"dwd": 15,
"ods": 15,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_member_balance_change",
"ods_table": "billiards_ods.member_balance_changes",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_member_balance_change_ex",
"ods_table": "billiards_ods.member_balance_changes",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "refund_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_groupbuy_redemption",
"ods_table": "billiards_ods.group_buy_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_money",
"dwd_sum": 12266.0,
"ods_sum": 12266.0,
"diff": 0.0
},
{
"column": "ledger_amount",
"dwd_sum": 12049.53,
"ods_sum": 12049.53,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_groupbuy_redemption_ex",
"ods_table": "billiards_ods.group_buy_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "assistant_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "assistant_service_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "goods_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "recharge_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "reward_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "table_service_promotion_money",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_platform_coupon_redemption",
"ods_table": "billiards_ods.platform_coupon_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "coupon_money",
"dwd_sum": 11956.0,
"ods_sum": 11956.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_platform_coupon_redemption_ex",
"ods_table": "billiards_ods.platform_coupon_redemption_records",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_recharge_order",
"ods_table": "billiards_ods.recharge_settlements",
"count": {
"dwd": 74,
"ods": 74,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_recharge_order_ex",
"ods_table": "billiards_ods.recharge_settlements",
"count": {
"dwd": 74,
"ods": 74,
"diff": 0
},
"amounts": []
},
{
"dwd_table": "billiards_dwd.dwd_payment",
"ods_table": "billiards_ods.payment_transactions",
"count": {
"dwd": 200,
"ods": 200,
"diff": 0
},
"amounts": [
{
"column": "pay_amount",
"dwd_sum": 10863.0,
"ods_sum": 10863.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_refund",
"ods_table": "billiards_ods.refund_transactions",
"count": {
"dwd": 11,
"ods": 11,
"diff": 0
},
"amounts": [
{
"column": "channel_fee",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "pay_amount",
"dwd_sum": -62186.0,
"ods_sum": -62186.0,
"diff": 0.0
}
]
},
{
"dwd_table": "billiards_dwd.dwd_refund_ex",
"ods_table": "billiards_ods.refund_transactions",
"count": {
"dwd": 11,
"ods": 11,
"diff": 0
},
"amounts": [
{
"column": "balance_frozen_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "card_frozen_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "refund_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
},
{
"column": "round_amount",
"dwd_sum": 0.0,
"ods_sum": 0.0,
"diff": 0.0
}
]
}
],
"note": "行数/金额核对,金额字段基于列名包含 amount/money/fee/balance 的数值列自动扫描。"
}

etl_billiards/run_ods.bat

@@ -0,0 +1,27 @@
@echo off
REM -*- coding: utf-8 -*-
REM 说明:一键重建 ODS执行 INIT_ODS_SCHEMA并灌入示例 JSON执行 MANUAL_INGEST
REM 使用配置:.env 中 PG_DSN、INGEST_SOURCE_DIR或通过参数覆盖
setlocal
cd /d %~dp0
REM 如果需要覆盖示例目录,可修改下面的 INGEST_DIR
set "INGEST_DIR=C:\dev\LLTQ\export\test-json-doc"
echo [INIT_ODS_SCHEMA] 准备执行,源目录=%INGEST_DIR%
python -m cli.main --tasks INIT_ODS_SCHEMA --pipeline-flow INGEST_ONLY --ingest-source "%INGEST_DIR%"
if errorlevel 1 (
echo INIT_ODS_SCHEMA 失败,退出
exit /b 1
)
echo [MANUAL_INGEST] 准备执行,源目录=%INGEST_DIR%
python -m cli.main --tasks MANUAL_INGEST --pipeline-flow INGEST_ONLY --ingest-source "%INGEST_DIR%"
if errorlevel 1 (
echo MANUAL_INGEST 失败,退出
exit /b 1
)
echo 全部完成。
endlocal


@@ -1,92 +1,89 @@
# -*- coding: utf-8 -*-
"""SCD2 (Slowly Changing Dimension Type 2) 处理"""
"""SCD2 (Slowly Changing Dimension Type 2) 处理逻辑"""
from datetime import datetime
def _row_to_dict(cursor, row):
if row is None:
return None
columns = [desc[0] for desc in cursor.description]
return {col: row[idx] for idx, col in enumerate(columns)}
class SCD2Handler:
"""SCD2历史记录处理"""
"""SCD2历史记录处理"""
def __init__(self, db_ops):
self.db = db_ops
def upsert(
self,
table_name: str,
natural_key: list,
tracked_fields: list,
record: dict,
effective_date: datetime = None,
) -> str:
"""
处理SCD2更新
Args:
table_name: 表名
natural_key: 自然键字段列表
tracked_fields: 需要跟踪变化的字段列表
record: 记录数据
effective_date: 生效日期
Returns:
操作类型: 'INSERT', 'UPDATE', 'UNCHANGED'
"""
effective_date = effective_date or datetime.now()
# 查找当前有效记录
where_clause = " AND ".join([f"{k} = %({k})s" for k in natural_key])
sql_select = f"""
SELECT * FROM {table_name}
WHERE {where_clause}
AND valid_to IS NULL
"""
with self.db.conn.cursor() as current:
current.execute(sql_select, record)
existing = _row_to_dict(current, current.fetchone())
if not existing:
record["valid_from"] = effective_date
record["valid_to"] = None
record["is_current"] = True
fields = list(record.keys())
placeholders = ", ".join([f"%({f})s" for f in fields])
sql_insert = f"""
INSERT INTO {table_name} ({', '.join(fields)})
VALUES ({placeholders})
"""
current.execute(sql_insert, record)
return "INSERT"
has_changes = any(existing.get(field) != record.get(field) for field in tracked_fields)
if not has_changes:
return "UNCHANGED"
update_where = " AND ".join([f"{k} = %({k})s" for k in natural_key])
sql_close = f"""
UPDATE {table_name}
SET valid_to = %(effective_date)s,
is_current = FALSE
WHERE {update_where}
AND valid_to IS NULL
"""
record["effective_date"] = effective_date
current.execute(sql_close, record)
record["valid_from"] = effective_date
record["valid_to"] = None
record["is_current"] = True
fields = list(record.keys())
if "effective_date" in fields:
fields.remove("effective_date")
placeholders = ", ".join([f"%({f})s" for f in fields])
sql_insert = f"""
INSERT INTO {table_name} ({', '.join(fields)})
VALUES ({placeholders})
"""
current.execute(sql_insert, record)
return 'INSERT'
# 检查是否有变化
has_changes = any(
existing.get(field) != record.get(field)
for field in tracked_fields
)
if not has_changes:
return 'UNCHANGED'
# 有变化:关闭旧记录,插入新记录
update_where = " AND ".join([f"{k} = %({k})s" for k in natural_key])
sql_close = f"""
UPDATE {table_name}
SET valid_to = %(effective_date)s,
is_current = FALSE
WHERE {update_where}
AND valid_to IS NULL
"""
record["effective_date"] = effective_date
current.execute(sql_close, record)
# 插入新记录
record["valid_from"] = effective_date
record["valid_to"] = None
record["is_current"] = True
fields = list(record.keys())
if "effective_date" in fields:
fields.remove("effective_date")
placeholders = ", ".join([f"%({f})s" for f in fields])
sql_insert = f"""
INSERT INTO {table_name} ({', '.join(fields)})
VALUES ({placeholders})
"""
current.execute(sql_insert, record)
return 'UPDATE'
return "UPDATE"

View File


@@ -0,0 +1,76 @@
# -*- coding: utf-8 -*-
"""Apply the PRD-aligned warehouse schema (ODS/DWD/DWS) to PostgreSQL."""
from __future__ import annotations
import argparse
import os
import sys
from pathlib import Path
PROJECT_ROOT = Path(__file__).resolve().parents[1]
if str(PROJECT_ROOT) not in sys.path:
sys.path.insert(0, str(PROJECT_ROOT))
from database.connection import DatabaseConnection # noqa: E402
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="Create/upgrade warehouse schemas using schema_v2.sql"
)
parser.add_argument(
"--dsn",
help="PostgreSQL DSN (fallback to PG_DSN env)",
default=os.environ.get("PG_DSN"),
)
parser.add_argument(
"--file",
help="Path to schema SQL",
default=str(PROJECT_ROOT / "database" / "schema_v2.sql"),
)
parser.add_argument(
"--timeout",
type=int,
default=int(os.environ.get("PG_CONNECT_TIMEOUT", 10) or 10),
help="connect_timeout seconds (capped at 20, default 10)",
)
return parser.parse_args()
def apply_schema(dsn: str, sql_path: Path, timeout: int) -> None:
if not sql_path.exists():
raise FileNotFoundError(f"Schema file not found: {sql_path}")
sql_text = sql_path.read_text(encoding="utf-8")
timeout_val = max(1, min(timeout, 20))
conn = DatabaseConnection(dsn, connect_timeout=timeout_val)
try:
with conn.conn.cursor() as cur:
cur.execute(sql_text)
conn.commit()
except Exception:
conn.rollback()
raise
finally:
conn.close()
def main() -> int:
args = parse_args()
if not args.dsn:
print("Missing DSN. Set PG_DSN or pass --dsn.", file=sys.stderr)
return 2
try:
apply_schema(args.dsn, Path(args.file), args.timeout)
except Exception as exc: # pragma: no cover - utility script
print(f"Schema apply failed: {exc}", file=sys.stderr)
return 1
print("Schema applied successfully.")
return 0
if __name__ == "__main__":
raise SystemExit(main())
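This script, like the sibling build scripts in this change, clamps `connect_timeout` into a 1..20 second window before opening the connection. The clamp in isolation:

```python
# The connect_timeout clamp shared by these operational scripts:
# out-of-range values are pulled back into the 1..20 second window.
def clamp_timeout(timeout: int) -> int:
    return max(1, min(timeout, 20))

assert clamp_timeout(10) == 10  # default passes through unchanged
assert clamp_timeout(0) == 1    # never below one second
assert clamp_timeout(99) == 20  # capped at twenty seconds
```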

View File

@@ -0,0 +1,426 @@
# -*- coding: utf-8 -*-
"""Populate PRD DWD tables from ODS payload snapshots."""
from __future__ import annotations
import argparse
import os
import sys
import psycopg2
SQL_STEPS: list[tuple[str, str]] = [
(
"dim_tenant",
"""
INSERT INTO billiards_dwd.dim_tenant (tenant_id, tenant_name, status)
SELECT DISTINCT tenant_id, 'default' AS tenant_name, 'active' AS status
FROM (
SELECT tenant_id FROM billiards_ods.settlement_records
UNION SELECT tenant_id FROM billiards_ods.ods_order_receipt_detail
UNION SELECT tenant_id FROM billiards_ods.member_profiles
) s
WHERE tenant_id IS NOT NULL
ON CONFLICT (tenant_id) DO UPDATE SET updated_at = now();
""",
),
(
"dim_site",
"""
INSERT INTO billiards_dwd.dim_site (site_id, tenant_id, site_name, status)
SELECT DISTINCT site_id, MAX(tenant_id) AS tenant_id, 'default' AS site_name, 'active' AS status
FROM (
SELECT site_id, tenant_id FROM billiards_ods.settlement_records
UNION SELECT site_id, tenant_id FROM billiards_ods.ods_order_receipt_detail
UNION SELECT site_id, tenant_id FROM billiards_ods.ods_table_info
) s
WHERE site_id IS NOT NULL
GROUP BY site_id
ON CONFLICT (site_id) DO UPDATE SET updated_at = now();
""",
),
(
"dim_product_category",
"""
INSERT INTO billiards_dwd.dim_product_category (category_id, category_name, parent_id, level_no, status)
SELECT DISTINCT category_id, category_name, parent_id, level_no, status
FROM billiards_ods.ods_goods_category
WHERE category_id IS NOT NULL
ON CONFLICT (category_id) DO UPDATE SET
category_name = EXCLUDED.category_name,
parent_id = EXCLUDED.parent_id,
level_no = EXCLUDED.level_no,
status = EXCLUDED.status;
""",
),
(
"dim_product",
"""
INSERT INTO billiards_dwd.dim_product (goods_id, goods_name, goods_code, category_id, category_name, unit, default_price, status)
SELECT DISTINCT goods_id, goods_name, NULL::TEXT AS goods_code, category_id, category_name, NULL::TEXT AS unit, sale_price AS default_price, status
FROM billiards_ods.ods_store_product
WHERE goods_id IS NOT NULL
ON CONFLICT (goods_id) DO UPDATE SET
goods_name = EXCLUDED.goods_name,
category_id = EXCLUDED.category_id,
category_name = EXCLUDED.category_name,
default_price = EXCLUDED.default_price,
status = EXCLUDED.status,
updated_at = now();
""",
),
(
"dim_product_from_sales",
"""
INSERT INTO billiards_dwd.dim_product (goods_id, goods_name)
SELECT DISTINCT goods_id, goods_name
FROM billiards_ods.ods_store_sale_item
WHERE goods_id IS NOT NULL
ON CONFLICT (goods_id) DO NOTHING;
""",
),
(
"dim_member_card_type",
"""
INSERT INTO billiards_dwd.dim_member_card_type (card_type_id, card_type_name, discount_rate)
SELECT DISTINCT card_type_id, card_type_name, discount_rate
FROM billiards_ods.member_stored_value_cards
WHERE card_type_id IS NOT NULL
ON CONFLICT (card_type_id) DO UPDATE SET
card_type_name = EXCLUDED.card_type_name,
discount_rate = EXCLUDED.discount_rate;
""",
),
(
"dim_member",
"""
INSERT INTO billiards_dwd.dim_member (
site_id, member_id, tenant_id, member_name, nickname, gender, birthday, mobile,
member_type_id, member_type_name, status, register_time, last_visit_time,
balance, total_recharge_amount, total_consumed_amount, wechat_id, alipay_id, remark
)
SELECT DISTINCT
prof.site_id,
prof.member_id,
prof.tenant_id,
prof.member_name,
prof.nickname,
prof.gender,
prof.birthday,
prof.mobile,
card.member_type_id,
card.member_type_name,
prof.status,
prof.register_time,
prof.last_visit_time,
prof.balance,
NULL::NUMERIC AS total_recharge_amount,
NULL::NUMERIC AS total_consumed_amount,
prof.wechat_id,
prof.alipay_id,
prof.remarks
FROM billiards_ods.member_profiles prof
LEFT JOIN (
SELECT DISTINCT site_id, member_id, card_type_id AS member_type_id, card_type_name AS member_type_name
FROM billiards_ods.member_stored_value_cards
) card
ON prof.site_id = card.site_id AND prof.member_id = card.member_id
WHERE prof.member_id IS NOT NULL
ON CONFLICT (site_id, member_id) DO UPDATE SET
member_name = EXCLUDED.member_name,
nickname = EXCLUDED.nickname,
gender = EXCLUDED.gender,
birthday = EXCLUDED.birthday,
mobile = EXCLUDED.mobile,
member_type_id = EXCLUDED.member_type_id,
member_type_name = EXCLUDED.member_type_name,
status = EXCLUDED.status,
register_time = EXCLUDED.register_time,
last_visit_time = EXCLUDED.last_visit_time,
balance = EXCLUDED.balance,
wechat_id = EXCLUDED.wechat_id,
alipay_id = EXCLUDED.alipay_id,
remark = EXCLUDED.remark,
updated_at = now();
""",
),
(
"dim_table",
"""
INSERT INTO billiards_dwd.dim_table (table_id, site_id, table_code, table_name, table_type, area_name, status, created_time, updated_time)
SELECT DISTINCT table_id, site_id, table_code, table_name, table_type, area_name, status, created_time, updated_time
FROM billiards_ods.ods_table_info
WHERE table_id IS NOT NULL
ON CONFLICT (table_id) DO UPDATE SET
site_id = EXCLUDED.site_id,
table_code = EXCLUDED.table_code,
table_name = EXCLUDED.table_name,
table_type = EXCLUDED.table_type,
area_name = EXCLUDED.area_name,
status = EXCLUDED.status,
created_time = EXCLUDED.created_time,
updated_time = EXCLUDED.updated_time;
""",
),
(
"dim_assistant",
"""
INSERT INTO billiards_dwd.dim_assistant (assistant_id, assistant_name, mobile, status)
SELECT DISTINCT assistant_id, assistant_name, mobile, status
FROM billiards_ods.assistant_accounts_master
WHERE assistant_id IS NOT NULL
ON CONFLICT (assistant_id) DO UPDATE SET
assistant_name = EXCLUDED.assistant_name,
mobile = EXCLUDED.mobile,
status = EXCLUDED.status,
updated_at = now();
""",
),
(
"dim_pay_method",
"""
INSERT INTO billiards_dwd.dim_pay_method (pay_method_code, pay_method_name, is_stored_value, status)
SELECT DISTINCT pay_method_code, pay_method_name, FALSE AS is_stored_value, 'active' AS status
FROM billiards_ods.payment_transactions
WHERE pay_method_code IS NOT NULL
ON CONFLICT (pay_method_code) DO UPDATE SET
pay_method_name = EXCLUDED.pay_method_name,
status = EXCLUDED.status,
updated_at = now();
""",
),
(
"dim_coupon_platform",
"""
INSERT INTO billiards_dwd.dim_coupon_platform (platform_code, platform_name)
SELECT DISTINCT platform_code, platform_code AS platform_name
FROM billiards_ods.ods_platform_coupon_log
WHERE platform_code IS NOT NULL
ON CONFLICT (platform_code) DO NOTHING;
""",
),
(
"fact_sale_item",
"""
INSERT INTO billiards_dwd.fact_sale_item (
site_id, sale_item_id, order_trade_no, order_settle_id, member_id,
goods_id, category_id, quantity, original_amount, discount_amount,
final_amount, is_gift, sale_time
)
SELECT
site_id,
sale_item_id,
order_trade_no,
order_settle_id,
NULL::BIGINT AS member_id,
goods_id,
category_id,
quantity,
original_amount,
discount_amount,
final_amount,
COALESCE(is_gift, FALSE),
sale_time
FROM billiards_ods.ods_store_sale_item
ON CONFLICT (site_id, sale_item_id) DO NOTHING;
""",
),
(
"fact_table_usage",
"""
INSERT INTO billiards_dwd.fact_table_usage (
site_id, ledger_id, order_trade_no, order_settle_id, table_id,
member_id, start_time, end_time, duration_minutes,
original_table_fee, member_discount_amount, manual_discount_amount,
final_table_fee, is_canceled, cancel_time
)
SELECT
site_id,
ledger_id,
order_trade_no,
order_settle_id,
table_id,
member_id,
start_time,
end_time,
duration_minutes,
original_table_fee,
0::NUMERIC AS member_discount_amount,
discount_amount AS manual_discount_amount,
final_table_fee,
FALSE AS is_canceled,
NULL::TIMESTAMPTZ AS cancel_time
FROM billiards_ods.table_fee_transactions_log
ON CONFLICT (site_id, ledger_id) DO NOTHING;
""",
),
(
"fact_assistant_service",
"""
INSERT INTO billiards_dwd.fact_assistant_service (
site_id, ledger_id, order_trade_no, order_settle_id, assistant_id,
assist_type_code, member_id, start_time, end_time, duration_minutes,
original_fee, member_discount_amount, manual_discount_amount,
final_fee, is_canceled, cancel_time
)
SELECT
site_id,
ledger_id,
order_trade_no,
order_settle_id,
assistant_id,
NULL::TEXT AS assist_type_code,
member_id,
start_time,
end_time,
duration_minutes,
original_fee,
0::NUMERIC AS member_discount_amount,
discount_amount AS manual_discount_amount,
final_fee,
FALSE AS is_canceled,
NULL::TIMESTAMPTZ AS cancel_time
FROM billiards_ods.ods_assistant_service_log
ON CONFLICT (site_id, ledger_id) DO NOTHING;
""",
),
(
"fact_coupon_usage",
"""
INSERT INTO billiards_dwd.fact_coupon_usage (
site_id, coupon_id, package_id, order_trade_no, order_settle_id,
member_id, platform_code, status, deduct_amount, settle_price, used_time
)
SELECT
site_id,
coupon_id,
NULL::BIGINT AS package_id,
order_trade_no,
order_settle_id,
member_id,
platform_code,
status,
deduct_amount,
settle_price,
used_time
FROM billiards_ods.ods_platform_coupon_log
ON CONFLICT (site_id, coupon_id) DO NOTHING;
""",
),
(
"fact_payment",
"""
INSERT INTO billiards_dwd.fact_payment (
site_id, pay_id, order_trade_no, order_settle_id, member_id,
pay_method_code, pay_amount, pay_time, relate_type, relate_id
)
SELECT
site_id,
pay_id,
order_trade_no,
order_settle_id,
member_id,
pay_method_code,
pay_amount,
pay_time,
relate_type,
relate_id
FROM billiards_ods.payment_transactions
ON CONFLICT (site_id, pay_id) DO NOTHING;
""",
),
(
"fact_refund",
"""
INSERT INTO billiards_dwd.fact_refund (
site_id, refund_id, order_trade_no, order_settle_id, member_id,
pay_method_code, refund_amount, refund_time, status
)
SELECT
site_id,
refund_id,
order_trade_no,
order_settle_id,
member_id,
pay_method_code,
refund_amount,
refund_time,
status
FROM billiards_ods.refund_transactions
ON CONFLICT (site_id, refund_id) DO NOTHING;
""",
),
(
"fact_balance_change",
"""
INSERT INTO billiards_dwd.fact_balance_change (
site_id, change_id, member_id, change_type, relate_type, relate_id,
pay_method_code, change_amount, balance_before, balance_after, change_time
)
SELECT
site_id,
change_id,
member_id,
change_type,
NULL::TEXT AS relate_type,
relate_id,
NULL::TEXT AS pay_method_code,
change_amount,
balance_before,
balance_after,
change_time
FROM billiards_ods.member_balance_changes
ON CONFLICT (site_id, change_id) DO NOTHING;
""",
),
]
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description="Build DWD tables from ODS payloads (PRD schema).")
parser.add_argument(
"--dsn",
default=os.environ.get("PG_DSN"),
help="PostgreSQL DSN (fallback PG_DSN env)",
)
parser.add_argument(
"--timeout",
type=int,
default=int(os.environ.get("PG_CONNECT_TIMEOUT", 10) or 10),
help="connect_timeout seconds (capped at 20, default 10)",
)
return parser.parse_args()
def main() -> int:
args = parse_args()
if not args.dsn:
print("Missing DSN. Use --dsn or PG_DSN.", file=sys.stderr)
return 2
timeout_val = max(1, min(args.timeout, 20))
conn = psycopg2.connect(args.dsn, connect_timeout=timeout_val)
conn.autocommit = False
try:
with conn.cursor() as cur:
for name, sql in SQL_STEPS:
cur.execute(sql)
print(f"[OK] {name}")
conn.commit()
except Exception as exc: # pragma: no cover - operational script
conn.rollback()
print(f"[FAIL] {exc}", file=sys.stderr)
return 1
finally:
try:
conn.close()
except Exception:
pass
print("DWD build complete.")
return 0
if __name__ == "__main__":
raise SystemExit(main())
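Every step above is written as an idempotent upsert: `ON CONFLICT ... DO UPDATE` refreshes dimension attributes on rerun, while `ON CONFLICT ... DO NOTHING` keeps fact loads append-only. A pure-Python analogue of the two policies (illustration only; the real work happens inside PostgreSQL):

```python
# Pure-Python analogue of the two ON CONFLICT policies used above.
def upsert_do_update(table: dict, key, row: dict) -> None:
    # ON CONFLICT (key) DO UPDATE: merge the latest attributes in place.
    table[key] = {**table.get(key, {}), **row}

def upsert_do_nothing(table: dict, key, row: dict) -> None:
    # ON CONFLICT (key) DO NOTHING: first write wins, reruns are no-ops.
    table.setdefault(key, row)

dim = {}
upsert_do_update(dim, 1, {"name": "a"})
upsert_do_update(dim, 1, {"name": "b"})        # rerun updates the dimension
fact = {}
upsert_do_nothing(fact, (1, 100), {"amt": 5})
upsert_do_nothing(fact, (1, 100), {"amt": 9})  # rerun is ignored
assert dim[1]["name"] == "b" and fact[(1, 100)]["amt"] == 5
```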

View File

@@ -0,0 +1,322 @@
# -*- coding: utf-8 -*-
"""Recompute billiards_dws.dws_order_summary from DWD fact tables."""
from __future__ import annotations
import argparse
import os
import sys
from pathlib import Path
PROJECT_ROOT = Path(__file__).resolve().parents[1]
if str(PROJECT_ROOT) not in sys.path:
sys.path.insert(0, str(PROJECT_ROOT))
from database.connection import DatabaseConnection # noqa: E402
SQL_BUILD_SUMMARY = r"""
WITH table_fee AS (
SELECT
site_id,
order_settle_id,
order_trade_no,
MIN(member_id) AS member_id,
SUM(COALESCE(final_table_fee, 0)) AS table_fee_amount,
SUM(COALESCE(member_discount_amount, 0)) AS member_discount_amount,
SUM(COALESCE(manual_discount_amount, 0)) AS manual_discount_amount,
SUM(COALESCE(original_table_fee, 0)) AS original_table_fee,
MIN(start_time) AS first_time
FROM billiards_dwd.fact_table_usage
WHERE (%(site_id)s IS NULL OR site_id = %(site_id)s)
AND (%(start_date)s IS NULL OR start_time::date >= %(start_date)s)
AND (%(end_date)s IS NULL OR start_time::date <= %(end_date)s)
AND COALESCE(is_canceled, FALSE) = FALSE
GROUP BY site_id, order_settle_id, order_trade_no
),
assistant_fee AS (
SELECT
site_id,
order_settle_id,
order_trade_no,
MIN(member_id) AS member_id,
SUM(COALESCE(final_fee, 0)) AS assistant_service_amount,
SUM(COALESCE(member_discount_amount, 0)) AS member_discount_amount,
SUM(COALESCE(manual_discount_amount, 0)) AS manual_discount_amount,
SUM(COALESCE(original_fee, 0)) AS original_fee,
MIN(start_time) AS first_time
FROM billiards_dwd.fact_assistant_service
WHERE (%(site_id)s IS NULL OR site_id = %(site_id)s)
AND (%(start_date)s IS NULL OR start_time::date >= %(start_date)s)
AND (%(end_date)s IS NULL OR start_time::date <= %(end_date)s)
AND COALESCE(is_canceled, FALSE) = FALSE
GROUP BY site_id, order_settle_id, order_trade_no
),
goods_fee AS (
SELECT
site_id,
order_settle_id,
order_trade_no,
MIN(member_id) AS member_id,
SUM(COALESCE(final_amount, 0)) FILTER (WHERE COALESCE(is_gift, FALSE) = FALSE) AS goods_amount,
SUM(COALESCE(discount_amount, 0)) FILTER (WHERE COALESCE(is_gift, FALSE) = FALSE) AS goods_discount_amount,
SUM(COALESCE(original_amount, 0)) FILTER (WHERE COALESCE(is_gift, FALSE) = FALSE) AS goods_original_amount,
COUNT(*) FILTER (WHERE COALESCE(is_gift, FALSE) = FALSE) AS item_count,
SUM(COALESCE(quantity, 0)) FILTER (WHERE COALESCE(is_gift, FALSE) = FALSE) AS total_item_quantity,
MIN(sale_time) AS first_time
FROM billiards_dwd.fact_sale_item
WHERE (%(site_id)s IS NULL OR site_id = %(site_id)s)
AND (%(start_date)s IS NULL OR sale_time::date >= %(start_date)s)
AND (%(end_date)s IS NULL OR sale_time::date <= %(end_date)s)
GROUP BY site_id, order_settle_id, order_trade_no
),
coupon_usage AS (
SELECT
site_id,
order_settle_id,
order_trade_no,
MIN(member_id) AS member_id,
SUM(COALESCE(deduct_amount, 0)) AS coupon_deduction,
SUM(COALESCE(settle_price, 0)) AS settle_price,
MIN(used_time) AS first_time
FROM billiards_dwd.fact_coupon_usage
WHERE (%(site_id)s IS NULL OR site_id = %(site_id)s)
AND (%(start_date)s IS NULL OR used_time::date >= %(start_date)s)
AND (%(end_date)s IS NULL OR used_time::date <= %(end_date)s)
GROUP BY site_id, order_settle_id, order_trade_no
),
payments AS (
SELECT
fp.site_id,
fp.order_settle_id,
fp.order_trade_no,
MIN(fp.member_id) AS member_id,
SUM(COALESCE(fp.pay_amount, 0)) AS total_paid_amount,
SUM(COALESCE(fp.pay_amount, 0)) FILTER (WHERE COALESCE(pm.is_stored_value, FALSE)) AS stored_card_deduct,
SUM(COALESCE(fp.pay_amount, 0)) FILTER (WHERE NOT COALESCE(pm.is_stored_value, FALSE)) AS external_paid_amount,
MIN(fp.pay_time) AS first_time
FROM billiards_dwd.fact_payment fp
LEFT JOIN billiards_dwd.dim_pay_method pm ON fp.pay_method_code = pm.pay_method_code
WHERE (%(site_id)s IS NULL OR fp.site_id = %(site_id)s)
AND (%(start_date)s IS NULL OR fp.pay_time::date >= %(start_date)s)
AND (%(end_date)s IS NULL OR fp.pay_time::date <= %(end_date)s)
GROUP BY fp.site_id, fp.order_settle_id, fp.order_trade_no
),
refunds AS (
SELECT
site_id,
order_settle_id,
order_trade_no,
SUM(COALESCE(refund_amount, 0)) AS refund_amount
FROM billiards_dwd.fact_refund
WHERE (%(site_id)s IS NULL OR site_id = %(site_id)s)
AND (%(start_date)s IS NULL OR refund_time::date >= %(start_date)s)
AND (%(end_date)s IS NULL OR refund_time::date <= %(end_date)s)
GROUP BY site_id, order_settle_id, order_trade_no
),
combined_ids AS (
SELECT site_id, order_settle_id, order_trade_no FROM table_fee
UNION
SELECT site_id, order_settle_id, order_trade_no FROM assistant_fee
UNION
SELECT site_id, order_settle_id, order_trade_no FROM goods_fee
UNION
SELECT site_id, order_settle_id, order_trade_no FROM coupon_usage
UNION
SELECT site_id, order_settle_id, order_trade_no FROM payments
UNION
SELECT site_id, order_settle_id, order_trade_no FROM refunds
),
site_dim AS (
SELECT site_id, tenant_id FROM billiards_dwd.dim_site
)
INSERT INTO billiards_dws.dws_order_summary (
site_id,
order_settle_id,
order_trade_no,
order_date,
tenant_id,
member_id,
member_flag,
recharge_order_flag,
item_count,
total_item_quantity,
table_fee_amount,
assistant_service_amount,
goods_amount,
group_amount,
total_coupon_deduction,
member_discount_amount,
manual_discount_amount,
order_original_amount,
order_final_amount,
stored_card_deduct,
external_paid_amount,
total_paid_amount,
book_table_flow,
book_assistant_flow,
book_goods_flow,
book_group_flow,
book_order_flow,
order_effective_consume_cash,
order_effective_recharge_cash,
order_effective_flow,
refund_amount,
net_income,
created_at,
updated_at
)
SELECT
c.site_id,
c.order_settle_id,
c.order_trade_no,
COALESCE(tf.first_time, af.first_time, gf.first_time, pay.first_time, cu.first_time)::date AS order_date,
sd.tenant_id,
COALESCE(tf.member_id, af.member_id, gf.member_id, cu.member_id, pay.member_id) AS member_id,
COALESCE(tf.member_id, af.member_id, gf.member_id, cu.member_id, pay.member_id) IS NOT NULL AS member_flag,
-- recharge flag: no consumption side but has payments
(COALESCE(tf.table_fee_amount, 0) + COALESCE(af.assistant_service_amount, 0) + COALESCE(gf.goods_amount, 0) + COALESCE(cu.settle_price, 0) = 0)
AND COALESCE(pay.total_paid_amount, 0) > 0 AS recharge_order_flag,
COALESCE(gf.item_count, 0) AS item_count,
COALESCE(gf.total_item_quantity, 0) AS total_item_quantity,
COALESCE(tf.table_fee_amount, 0) AS table_fee_amount,
COALESCE(af.assistant_service_amount, 0) AS assistant_service_amount,
COALESCE(gf.goods_amount, 0) AS goods_amount,
COALESCE(cu.settle_price, 0) AS group_amount,
COALESCE(cu.coupon_deduction, 0) AS total_coupon_deduction,
COALESCE(tf.member_discount_amount, 0) + COALESCE(af.member_discount_amount, 0) + COALESCE(gf.goods_discount_amount, 0) AS member_discount_amount,
COALESCE(tf.manual_discount_amount, 0) + COALESCE(af.manual_discount_amount, 0) AS manual_discount_amount,
COALESCE(tf.original_table_fee, 0) + COALESCE(af.original_fee, 0) + COALESCE(gf.goods_original_amount, 0) AS order_original_amount,
COALESCE(tf.table_fee_amount, 0) + COALESCE(af.assistant_service_amount, 0) + COALESCE(gf.goods_amount, 0) + COALESCE(cu.settle_price, 0) - COALESCE(cu.coupon_deduction, 0) AS order_final_amount,
COALESCE(pay.stored_card_deduct, 0) AS stored_card_deduct,
COALESCE(pay.external_paid_amount, 0) AS external_paid_amount,
COALESCE(pay.total_paid_amount, 0) AS total_paid_amount,
COALESCE(tf.table_fee_amount, 0) AS book_table_flow,
COALESCE(af.assistant_service_amount, 0) AS book_assistant_flow,
COALESCE(gf.goods_amount, 0) AS book_goods_flow,
COALESCE(cu.settle_price, 0) AS book_group_flow,
COALESCE(tf.table_fee_amount, 0) + COALESCE(af.assistant_service_amount, 0) + COALESCE(gf.goods_amount, 0) + COALESCE(cu.settle_price, 0) AS book_order_flow,
CASE
WHEN (COALESCE(tf.table_fee_amount, 0) + COALESCE(af.assistant_service_amount, 0) + COALESCE(gf.goods_amount, 0) + COALESCE(cu.settle_price, 0) = 0)
THEN 0
ELSE COALESCE(pay.external_paid_amount, 0)
END AS order_effective_consume_cash,
CASE
WHEN (COALESCE(tf.table_fee_amount, 0) + COALESCE(af.assistant_service_amount, 0) + COALESCE(gf.goods_amount, 0) + COALESCE(cu.settle_price, 0) = 0)
THEN COALESCE(pay.external_paid_amount, 0)
ELSE 0
END AS order_effective_recharge_cash,
COALESCE(pay.external_paid_amount, 0) + COALESCE(cu.settle_price, 0) AS order_effective_flow,
COALESCE(rf.refund_amount, 0) AS refund_amount,
(COALESCE(pay.external_paid_amount, 0) + COALESCE(cu.settle_price, 0)) - COALESCE(rf.refund_amount, 0) AS net_income,
now() AS created_at,
now() AS updated_at
FROM combined_ids c
LEFT JOIN table_fee tf ON c.site_id = tf.site_id AND c.order_settle_id = tf.order_settle_id
LEFT JOIN assistant_fee af ON c.site_id = af.site_id AND c.order_settle_id = af.order_settle_id
LEFT JOIN goods_fee gf ON c.site_id = gf.site_id AND c.order_settle_id = gf.order_settle_id
LEFT JOIN coupon_usage cu ON c.site_id = cu.site_id AND c.order_settle_id = cu.order_settle_id
LEFT JOIN payments pay ON c.site_id = pay.site_id AND c.order_settle_id = pay.order_settle_id
LEFT JOIN refunds rf ON c.site_id = rf.site_id AND c.order_settle_id = rf.order_settle_id
LEFT JOIN site_dim sd ON c.site_id = sd.site_id
ON CONFLICT (site_id, order_settle_id) DO UPDATE SET
order_trade_no = EXCLUDED.order_trade_no,
order_date = EXCLUDED.order_date,
tenant_id = EXCLUDED.tenant_id,
member_id = EXCLUDED.member_id,
member_flag = EXCLUDED.member_flag,
recharge_order_flag = EXCLUDED.recharge_order_flag,
item_count = EXCLUDED.item_count,
total_item_quantity = EXCLUDED.total_item_quantity,
table_fee_amount = EXCLUDED.table_fee_amount,
assistant_service_amount = EXCLUDED.assistant_service_amount,
goods_amount = EXCLUDED.goods_amount,
group_amount = EXCLUDED.group_amount,
total_coupon_deduction = EXCLUDED.total_coupon_deduction,
member_discount_amount = EXCLUDED.member_discount_amount,
manual_discount_amount = EXCLUDED.manual_discount_amount,
order_original_amount = EXCLUDED.order_original_amount,
order_final_amount = EXCLUDED.order_final_amount,
stored_card_deduct = EXCLUDED.stored_card_deduct,
external_paid_amount = EXCLUDED.external_paid_amount,
total_paid_amount = EXCLUDED.total_paid_amount,
book_table_flow = EXCLUDED.book_table_flow,
book_assistant_flow = EXCLUDED.book_assistant_flow,
book_goods_flow = EXCLUDED.book_goods_flow,
book_group_flow = EXCLUDED.book_group_flow,
book_order_flow = EXCLUDED.book_order_flow,
order_effective_consume_cash = EXCLUDED.order_effective_consume_cash,
order_effective_recharge_cash = EXCLUDED.order_effective_recharge_cash,
order_effective_flow = EXCLUDED.order_effective_flow,
refund_amount = EXCLUDED.refund_amount,
net_income = EXCLUDED.net_income,
updated_at = now();
"""
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="Build/update dws_order_summary from DWD fact tables."
)
parser.add_argument(
"--dsn",
default=os.environ.get("PG_DSN"),
help="PostgreSQL DSN (fallback: PG_DSN env)",
)
parser.add_argument(
"--site-id",
type=int,
default=None,
help="Filter by site_id (optional, default all sites)",
)
parser.add_argument(
"--start-date",
dest="start_date",
default=None,
help="Filter facts from this date (YYYY-MM-DD, optional)",
)
parser.add_argument(
"--end-date",
dest="end_date",
default=None,
help="Filter facts until this date (YYYY-MM-DD, optional)",
)
parser.add_argument(
"--timeout",
type=int,
default=int(os.environ.get("PG_CONNECT_TIMEOUT", 10) or 10),
help="connect_timeout seconds (capped at 20, default 10)",
)
return parser.parse_args()
def main() -> int:
args = parse_args()
if not args.dsn:
print("Missing DSN. Set PG_DSN or pass --dsn.", file=sys.stderr)
return 2
params = {
"site_id": args.site_id,
"start_date": args.start_date,
"end_date": args.end_date,
}
timeout_val = max(1, min(args.timeout, 20))
conn = DatabaseConnection(args.dsn, connect_timeout=timeout_val)
try:
with conn.conn.cursor() as cur:
cur.execute(SQL_BUILD_SUMMARY, params)
conn.commit()
except Exception as exc: # pragma: no cover - operational script
conn.rollback()
print(f"DWS build failed: {exc}", file=sys.stderr)
return 1
finally:
conn.close()
print("dws_order_summary refreshed.")
return 0
if __name__ == "__main__":
raise SystemExit(main())
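The summary treats an order with payments but no consumption side as a recharge, and routes external cash into either the consume bucket or the recharge bucket. Simplified to external cash only (the SQL flag also looks at total paid amount), the rule reads:

```python
# Python restatement of the recharge split in SQL_BUILD_SUMMARY, where
# consumption = table fee + assistant fee + goods + group settle price.
def classify_order(consumption: float, external_paid: float):
    recharge = consumption == 0 and external_paid > 0
    consume_cash = 0 if consumption == 0 else external_paid
    recharge_cash = external_paid if consumption == 0 else 0
    return recharge, consume_cash, recharge_cash

assert classify_order(0, 200.0) == (True, 0, 200.0)       # pure top-up order
assert classify_order(150.0, 150.0) == (False, 150.0, 0)  # normal sale
```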

View File

@@ -0,0 +1,117 @@
# -*- coding: utf-8 -*-
"""
ODS JSON field cross-check script: compares the columns of the current ODS tables in the
database against the sample JSON files (default directory C:\\dev\\LLTQ\\export\\test-json-doc),
checks whether same-named keys exist, and prints the unmatched columns per table so the
mapping can be extended or the absence of a source field confirmed.
Usage:
set PG_DSN=postgresql://...   # as configured in .env
python -m etl_billiards.scripts.check_ods_json_vs_table
"""
from __future__ import annotations
import json
import os
import pathlib
from typing import Dict, Set, Tuple
import psycopg2
from etl_billiards.tasks.manual_ingest_task import ManualIngestTask
def _flatten_keys(obj, prefix: str = "") -> Set[str]:
"""递归展开 JSON 所有键路径,返回形如 data.assistantInfos.id 的集合。列表不保留索引,仅继续向下展开。"""
keys: Set[str] = set()
if isinstance(obj, dict):
for k, v in obj.items():
new_prefix = f"{prefix}.{k}" if prefix else k
keys.add(new_prefix)
keys |= _flatten_keys(v, new_prefix)
elif isinstance(obj, list):
for item in obj:
keys |= _flatten_keys(item, prefix)
return keys
def _load_json_keys(path: pathlib.Path) -> Tuple[Set[str], dict[str, Set[str]]]:
"""读取单个 JSON 文件并返回展开后的键集合以及末段->路径列表映射,若文件不存在或无法解析则返回空集合。"""
if not path.exists():
return set(), {}
data = json.loads(path.read_text(encoding="utf-8"))
paths = _flatten_keys(data)
last_map: dict[str, Set[str]] = {}
for p in paths:
last = p.split(".")[-1].lower()
last_map.setdefault(last, set()).add(p)
return paths, last_map
def _load_ods_columns(dsn: str) -> Dict[str, Set[str]]:
"""从数据库读取 billiards_ods.* 的列名集合,按表返回。"""
conn = psycopg2.connect(dsn)
cur = conn.cursor()
cur.execute(
"""
SELECT table_name, column_name
FROM information_schema.columns
WHERE table_schema='billiards_ods'
ORDER BY table_name, ordinal_position
"""
)
result: Dict[str, Set[str]] = {}
for table, col in cur.fetchall():
result.setdefault(table, set()).add(col.lower())
cur.close()
conn.close()
return result
def main() -> None:
"""主流程:遍历 FILE_MAPPING 中的 ODS 表,检查 JSON 键覆盖情况并打印报告。"""
dsn = os.environ.get("PG_DSN")
json_dir = pathlib.Path(os.environ.get("JSON_DOC_DIR", r"C:\dev\LLTQ\export\test-json-doc"))
ods_cols_map = _load_ods_columns(dsn)
print(f"使用 JSON 目录: {json_dir}")
print(f"连接 DSN: {dsn}")
print("=" * 80)
for keywords, ods_table in ManualIngestTask.FILE_MAPPING:
table = ods_table.split(".")[-1]
cols = ods_cols_map.get(table, set())
file_name = f"{keywords[0]}.json"
file_path = json_dir / file_name
keys_full, path_map = _load_json_keys(file_path)
key_last_parts = set(path_map.keys())
missing: Set[str] = set()
extra_keys: Set[str] = set()
present: Set[str] = set()
for col in sorted(cols):
if col in key_last_parts:
present.add(col)
else:
missing.add(col)
for k in key_last_parts:
if k not in cols:
extra_keys.add(k)
print(f"[{table}] 文件={file_name} 列数={len(cols)} JSON键(末段)覆盖={len(present)}/{len(cols)}")
if missing:
print(" 未命中列:", ", ".join(sorted(missing)))
else:
print(" 未命中列: 无")
if extra_keys:
extras = []
for k in sorted(extra_keys):
paths = ", ".join(sorted(path_map.get(k, [])))
extras.append(f"{k} ({paths})")
print(" JSON 仅有(表无此列):", "; ".join(extras))
else:
print(" JSON 仅有(表无此列): 无")
print("-" * 80)
if __name__ == "__main__":
main()
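The checker's key flattening drops list indices and keeps dotted paths; a standalone copy of `_flatten_keys` shows the behaviour:

```python
# Standalone copy of the checker's _flatten_keys, for illustration.
def flatten_keys(obj, prefix=""):
    keys = set()
    if isinstance(obj, dict):
        for k, v in obj.items():
            new_prefix = f"{prefix}.{k}" if prefix else k
            keys.add(new_prefix)
            keys |= flatten_keys(v, new_prefix)
    elif isinstance(obj, list):
        # Lists contribute no index segment; descend into each element.
        for item in obj:
            keys |= flatten_keys(item, prefix)
    return keys

sample = {"data": {"assistantInfos": [{"id": 1, "name": "n"}]}}
assert flatten_keys(sample) == {
    "data", "data.assistantInfos",
    "data.assistantInfos.id", "data.assistantInfos.name",
}
```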

View File

@@ -0,0 +1,258 @@
# -*- coding: utf-8 -*-
"""
Rebuild the billiards_ods.* tables from a local sample-JSON directory and load the sample data.
Usage:
PYTHONPATH=. python -m etl_billiards.scripts.rebuild_ods_from_json [--dsn ...] [--json-dir ...] [--include ...] [--drop-schema-first]
Environment variables:
PG_DSN PostgreSQL connection string (required)
PG_CONNECT_TIMEOUT optional, seconds, default 10
JSON_DOC_DIR optional, JSON directory, default C:\\dev\\LLTQ\\export\\test-json-doc
ODS_INCLUDE_FILES optional, comma-separated file names (without .json)
ODS_DROP_SCHEMA_FIRST optional, true/false, default true
"""
from __future__ import annotations
import argparse
import os
import re
import sys
import json
from pathlib import Path
from typing import Iterable, List, Tuple
import psycopg2
from psycopg2 import sql
from psycopg2.extras import Json, execute_values
DEFAULT_JSON_DIR = r"C:\dev\LLTQ\export\test-json-doc"
SPECIAL_LIST_PATHS: dict[str, tuple[str, ...]] = {
"assistant_accounts_master": ("data", "assistantInfos"),
"assistant_cancellation_records": ("data", "abolitionAssistants"),
"assistant_service_records": ("data", "orderAssistantDetails"),
"goods_stock_movements": ("data", "queryDeliveryRecordsList"),
"goods_stock_summary": ("data",),
"group_buy_packages": ("data", "packageCouponList"),
"group_buy_redemption_records": ("data", "siteTableUseDetailsList"),
"member_balance_changes": ("data", "tenantMemberCardLogs"),
"member_profiles": ("data", "tenantMemberInfos"),
"member_stored_value_cards": ("data", "tenantMemberCards"),
"recharge_settlements": ("data", "settleList"),
"settlement_records": ("data", "settleList"),
"site_tables_master": ("data", "siteTables"),
"stock_goods_category_tree": ("data", "goodsCategoryList"),
"store_goods_master": ("data", "orderGoodsList"),
"store_goods_sales_records": ("data", "orderGoodsLedgers"),
"table_fee_discount_records": ("data", "taiFeeAdjustInfos"),
"table_fee_transactions": ("data", "siteTableUseDetailsList"),
"tenant_goods_master": ("data", "tenantGoodsList"),
}
def sanitize_identifier(name: str) -> str:
"""将任意字符串转为可用的 SQL identifier小写、非字母数字转下划线"""
cleaned = re.sub(r"[^0-9a-zA-Z_]", "_", name.strip())
if not cleaned:
cleaned = "col"
if cleaned[0].isdigit():
cleaned = f"_{cleaned}"
return cleaned.lower()
def _extract_list_via_path(node, path: tuple[str, ...]):
cur = node
for key in path:
if isinstance(cur, dict):
cur = cur.get(key)
else:
return []
return cur if isinstance(cur, list) else []
def load_records(payload, list_path: tuple[str, ...] | None = None) -> list:
"""
尝试从 JSON 结构中提取记录列表:
- 直接是 list -> 返回
- dict 中 data 是 list -> 返回
- dict 中 data 是 dict取第一个 list 字段
- dict 中任意值是 list -> 返回
- 其余情况,包装为单条记录
"""
if list_path:
if isinstance(payload, list):
merged: list = []
for item in payload:
merged.extend(_extract_list_via_path(item, list_path))
if merged:
return merged
elif isinstance(payload, dict):
lst = _extract_list_via_path(payload, list_path)
if lst:
return lst
if isinstance(payload, list):
return payload
if isinstance(payload, dict):
data_node = payload.get("data")
if isinstance(data_node, list):
return data_node
if isinstance(data_node, dict):
for v in data_node.values():
if isinstance(v, list):
return v
for v in payload.values():
if isinstance(v, list):
return v
return [payload]
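The fallback chain in `load_records` is easiest to verify with concrete payloads; a condensed standalone mirror (the `list_path` branch is omitted for brevity):

```python
def load_records(payload):
    # Condensed mirror of the fallback chain above:
    # list -> as-is; dict with list "data" -> that list; dict "data" -> its
    # first list value; any top-level list value; otherwise wrap as one record.
    if isinstance(payload, list):
        return payload
    if isinstance(payload, dict):
        data = payload.get("data")
        if isinstance(data, list):
            return data
        if isinstance(data, dict):
            for v in data.values():
                if isinstance(v, list):
                    return v
        for v in payload.values():
            if isinstance(v, list):
                return v
    return [payload]

print(load_records([{"id": 1}]))                      # direct list
print(load_records({"data": [{"id": 2}]}))            # "data" is a list
print(load_records({"data": {"rows": [{"id": 3}]}}))  # first list inside "data"
print(load_records({"msg": "ok"}))                    # wrapped as single record
```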
def collect_columns(records: Iterable[dict]) -> List[str]:
"""汇总所有顶层键,作为表字段;仅处理 dict 记录。"""
cols: set[str] = set()
for rec in records:
if isinstance(rec, dict):
cols.update(rec.keys())
return sorted(cols)
def create_table(cur, schema: str, table: str, columns: List[Tuple[str, str]]):
"""
    创建表:字段全部 jsonb外加 source_file、record_index、payload、ingested_at。
columns: [(col_name, original_key)]
"""
fields = [sql.SQL("{} jsonb").format(sql.Identifier(col)) for col, _ in columns]
constraint_name = f"uq_{table}_source_record"
ddl = sql.SQL(
"CREATE TABLE IF NOT EXISTS {schema}.{table} ("
"source_file text,"
"record_index integer,"
"{cols},"
"payload jsonb,"
"ingested_at timestamptz default now(),"
"CONSTRAINT {constraint} UNIQUE (source_file, record_index)"
");"
).format(
schema=sql.Identifier(schema),
table=sql.Identifier(table),
cols=sql.SQL(",").join(fields),
constraint=sql.Identifier(constraint_name),
)
cur.execute(ddl)
def insert_records(cur, schema: str, table: str, columns: List[Tuple[str, str]], records: list, source_file: str):
"""批量插入记录。"""
col_idents = [sql.Identifier(col) for col, _ in columns]
col_names = [col for col, _ in columns]
orig_keys = [orig for _, orig in columns]
all_cols = [sql.Identifier("source_file"), sql.Identifier("record_index")] + col_idents + [
sql.Identifier("payload")
]
rows = []
for idx, rec in enumerate(records):
if not isinstance(rec, dict):
rec = {"value": rec}
row_values = [source_file, idx]
for key in orig_keys:
row_values.append(Json(rec.get(key)))
row_values.append(Json(rec))
rows.append(row_values)
insert_sql = sql.SQL("INSERT INTO {}.{} ({}) VALUES %s ON CONFLICT DO NOTHING").format(
sql.Identifier(schema),
sql.Identifier(table),
sql.SQL(",").join(all_cols),
)
execute_values(cur, insert_sql, rows, page_size=500)
def rebuild(schema: str = "billiards_ods", data_dir: str | Path = DEFAULT_JSON_DIR):
parser = argparse.ArgumentParser(description="重建 billiards_ods.* 表并导入 JSON 样例")
parser.add_argument("--dsn", dest="dsn", help="PostgreSQL DSN默认读取环境变量 PG_DSN")
parser.add_argument("--json-dir", dest="json_dir", help=f"JSON 目录,默认 {DEFAULT_JSON_DIR}")
parser.add_argument(
"--include",
dest="include_files",
help="限定导入的文件名(逗号分隔,不含 .json默认全部",
)
parser.add_argument(
"--drop-schema-first",
dest="drop_schema_first",
action="store_true",
help="先删除并重建 schema默认 true",
)
parser.add_argument(
"--no-drop-schema-first",
dest="drop_schema_first",
action="store_false",
help="保留现有 schema仅按冲突去重导入",
)
parser.set_defaults(drop_schema_first=None)
args = parser.parse_args()
dsn = args.dsn or os.environ.get("PG_DSN")
if not dsn:
print("缺少参数/环境变量 PG_DSN无法连接数据库。")
sys.exit(1)
timeout = max(1, min(int(os.environ.get("PG_CONNECT_TIMEOUT", 10)), 60))
env_drop = os.environ.get("ODS_DROP_SCHEMA_FIRST") or os.environ.get("DROP_SCHEMA_FIRST")
drop_schema_first = (
args.drop_schema_first
if args.drop_schema_first is not None
else str(env_drop or "true").lower() in ("1", "true", "yes")
)
include_files_env = args.include_files or os.environ.get("ODS_INCLUDE_FILES") or os.environ.get("INCLUDE_FILES")
include_files = set()
if include_files_env:
include_files = {p.strip().lower() for p in include_files_env.split(",") if p.strip()}
base_dir = Path(args.json_dir or data_dir or DEFAULT_JSON_DIR)
if not base_dir.exists():
print(f"JSON 目录不存在: {base_dir}")
sys.exit(1)
conn = psycopg2.connect(dsn, connect_timeout=timeout)
conn.autocommit = False
cur = conn.cursor()
if drop_schema_first:
print(f"Dropping schema {schema} ...")
cur.execute(sql.SQL("DROP SCHEMA IF EXISTS {} CASCADE;").format(sql.Identifier(schema)))
cur.execute(sql.SQL("CREATE SCHEMA {};").format(sql.Identifier(schema)))
else:
cur.execute(
sql.SQL("SELECT schema_name FROM information_schema.schemata WHERE schema_name=%s"),
(schema,),
)
if not cur.fetchone():
cur.execute(sql.SQL("CREATE SCHEMA {};").format(sql.Identifier(schema)))
json_files = sorted(base_dir.glob("*.json"))
for path in json_files:
stem_lower = path.stem.lower()
if include_files and stem_lower not in include_files:
continue
print(f"Processing {path.name} ...")
payload = json.loads(path.read_text(encoding="utf-8"))
list_path = SPECIAL_LIST_PATHS.get(stem_lower)
records = load_records(payload, list_path=list_path)
columns_raw = collect_columns(records)
columns = [(sanitize_identifier(c), c) for c in columns_raw]
table_name = sanitize_identifier(path.stem)
create_table(cur, schema, table_name, columns)
if records:
insert_records(cur, schema, table_name, columns, records, path.name)
print(f" -> rows: {len(records)}, columns: {len(columns)}")
conn.commit()
cur.close()
conn.close()
print("Rebuild done.")
if __name__ == "__main__":
rebuild()

View File

@@ -4,9 +4,9 @@
直接运行本文件即可触发 pytest。
示例:
python scripts/run_tests.py --suite online --mode ONLINE --keyword ORDERS
python scripts/run_tests.py --preset offline_realdb
python scripts/run_tests.py --suite online offline --db-dsn ... --json-archive tmp/archives
python scripts/run_tests.py --suite online --flow FULL --keyword ORDERS
python scripts/run_tests.py --preset fetch_only
python scripts/run_tests.py --suite online --json-source tmp/archives
"""
from __future__ import annotations
@@ -21,9 +21,12 @@ import pytest
PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
# 确保项目根目录在 sys.path便于 tests 内部 import config / tasks 等模块
if PROJECT_ROOT not in sys.path:
sys.path.insert(0, PROJECT_ROOT)
SUITE_MAP: Dict[str, str] = {
"online": "tests/unit/test_etl_tasks_online.py",
"offline": "tests/unit/test_etl_tasks_offline.py",
"integration": "tests/integration/test_database.py",
}
@@ -60,13 +63,12 @@ def parse_args() -> argparse.Namespace:
help="自定义测试路径(可与 --suite 混用),例如 tests/unit/test_config.py",
)
parser.add_argument(
"--mode",
choices=["ONLINE", "OFFLINE"],
help="覆盖 TEST_MODE默认沿用 .env / 环境变量",
"--flow",
choices=["FETCH_ONLY", "INGEST_ONLY", "FULL"],
help="覆盖 PIPELINE_FLOW在线抓取/本地清洗/全流程",
)
parser.add_argument("--db-dsn", help="设置 TEST_DB_DSN连接真实数据库进行测试")
parser.add_argument("--json-archive", help="设置 TEST_JSON_ARCHIVE_DIR离线档案目录)")
parser.add_argument("--json-temp", help="设置 TEST_JSON_TEMP_DIR临时 JSON 路径)")
parser.add_argument("--json-source", help="设置 JSON_SOURCE_DIR本地清洗入库使用的 JSON 目录)")
parser.add_argument("--json-fetch-root", help="设置 JSON_FETCH_ROOT在线抓取输出根目录)")
parser.add_argument(
"--keyword",
"-k",
@@ -119,14 +121,12 @@ def apply_presets_to_args(args: argparse.Namespace):
def apply_env(args: argparse.Namespace) -> Dict[str, str]:
env_updates = {}
if args.mode:
env_updates["TEST_MODE"] = args.mode
if args.db_dsn:
env_updates["TEST_DB_DSN"] = args.db_dsn
if args.json_archive:
env_updates["TEST_JSON_ARCHIVE_DIR"] = args.json_archive
if args.json_temp:
env_updates["TEST_JSON_TEMP_DIR"] = args.json_temp
if args.flow:
env_updates["PIPELINE_FLOW"] = args.flow
if args.json_source:
env_updates["JSON_SOURCE_DIR"] = args.json_source
if args.json_fetch_root:
env_updates["JSON_FETCH_ROOT"] = args.json_fetch_root
if args.env:
for item in args.env:
if "=" not in item:
@@ -147,8 +147,7 @@ def build_pytest_args(args: argparse.Namespace) -> List[str]:
if args.tests:
targets.extend(args.tests)
if not targets:
# 默认跑 online + offline 套件
targets = [SUITE_MAP["online"], SUITE_MAP["offline"]]
targets = list(SUITE_MAP.values())
pytest_args: List[str] = targets
if args.keyword:

View File

@@ -0,0 +1,64 @@
# -*- coding: utf-8 -*-
"""Quick utility for validating PostgreSQL connectivity (ASCII-only output)."""
from __future__ import annotations
import argparse
import os
import sys
PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
if PROJECT_ROOT not in sys.path:
sys.path.insert(0, PROJECT_ROOT)
from database.connection import DatabaseConnection
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description="PostgreSQL connectivity smoke test")
parser.add_argument("--dsn", help="Override TEST_DB_DSN / env value")
parser.add_argument(
"--query",
default="SELECT 1 AS ok",
help="Custom SQL to run after connection (default: SELECT 1 AS ok)",
)
parser.add_argument(
"--timeout",
type=int,
default=10,
help="connect_timeout seconds passed to psycopg2 (capped at 20, default: 10)",
)
return parser.parse_args()
def main() -> int:
args = parse_args()
dsn = args.dsn or os.environ.get("TEST_DB_DSN")
if not dsn:
print("Missing DSN. Use --dsn or TEST_DB_DSN.", file=sys.stderr)
return 2
print(f"Trying connection: {dsn}")
try:
timeout = max(1, min(args.timeout, 20))
conn = DatabaseConnection(dsn, connect_timeout=timeout)
except Exception as exc: # pragma: no cover - diagnostic output
print("Connection failed:", exc, file=sys.stderr)
return 1
try:
result = conn.query(args.query)
print("Connection OK, query result:")
for row in result:
print(row)
conn.close()
return 0
except Exception as exc: # pragma: no cover - diagnostic output
print("Connection succeeded but query failed:", exc, file=sys.stderr)
try:
conn.close()
finally:
return 3
if __name__ == "__main__":
raise SystemExit(main())

View File

@@ -1,55 +1,5 @@
# -*- coding: utf-8 -*-
"""测试命令仓库”,集中维护 run_tests.py 的预置组合。
支持的参数键(可在 PRESETS 中自由组合):
1. suite
- 类型:列表
- 作用:引用 run_tests 中的预置套件,值可为 online / offline / integration 等。
- 用法:["online"] 仅跑在线模式;["online","offline"] 同时跑两套;["integration"] 跑数据库集成测试。
2. tests
- 类型:列表
- 作用:传入任意 pytest 目标路径,适合补充临时/自定义测试文件。
- 用法:["tests/unit/test_config.py","tests/unit/test_parsers.py"]。
3. mode
- 类型:字符串
   - 取值ONLINE 或 OFFLINE。
   - 作用:覆盖 TEST_MODEONLINE 走 API 全流程OFFLINE 读取 JSON 归档执行 T+L。
4. db_dsn
   - 类型:字符串
   - 作用:设置 TEST_DB_DSN指定真实 PostgreSQL 连接;缺省时测试引擎使用伪 DB仅记录写入不触库。
   - 示例postgresql://user:pwd@localhost:5432/testdb。
5. json_archive / json_temp
- 类型:字符串
- 作用:离线模式的 JSON 归档目录 / 临时输出目录。
- 说明:不设置时沿用 .env 或默认值;仅在 OFFLINE 模式需要关注。
6. keyword
- 类型:字符串
- 作用:等价 pytest -k用于筛选测试名/节点。
- 示例:"ORDERS" 可只运行包含该关键字的测试函数。
7. pytest_args
- 类型:字符串
- 作用:附加 pytest 命令行参数。
- 示例:"-vv --maxfail=1 --disable-warnings"
8. env
- 类型:列表
- 作用:追加环境变量,形如 ["STORE_ID=123","API_TOKEN=xxx"],会在 run_tests 内透传给 os.environ。
9. preset_meta
- 类型:字符串
- 作用:纯注释信息,便于描述该预置组合的用途,不会传递给 run_tests。
运行方式:建议直接 F5或 `python scripts/test_presets.py`),脚本将读取 AUTO_RUN_PRESETS 中的配置依次执行。
如需临时指定其它预置,可传入 `--preset xxx``--list` 用于查看所有参数说明和预置详情。
"""
"""测试命令仓库集中维护 run_tests.py 的常用组合,支持一键执行。"""
from __future__ import annotations
import argparse
@@ -60,60 +10,52 @@ from typing import List
RUN_TESTS_SCRIPT = os.path.join(os.path.dirname(__file__), "run_tests.py")
AUTO_RUN_PRESETS = ["online_orders"]
# PRESETS = {
# "online_orders": {
# "suite": ["online"],
# "mode": "ONLINE",
# "keyword": "ORDERS",
# "pytest_args": "-vv",
# "preset_meta": "在线模式,仅跑订单任务,输出更详细日志",
# },
# "offline_realdb": {
# "suite": ["offline"],
# "mode": "OFFLINE",
# "db_dsn": "postgresql://user:pwd@localhost:5432/testdb",
# "json_archive": "tests/testdata_json",
# "preset_meta": "离线模式 + 真实测试库,用预置 JSON 回放全量任务",
# },
# "integration_db": {
# "suite": ["integration"],
# "db_dsn": "postgresql://user:pwd@localhost:5432/testdb",
# "preset_meta": "仅跑数据库连接/操作相关的集成测试",
# },
# }
# 默认自动运行的预置(可根据需要修改顺序/条目)
AUTO_RUN_PRESETS = ["fetch_only"]
PRESETS = {
"offline_realdb": {
"suite": ["offline"],
"mode": "OFFLINE",
"db_dsn": "postgresql://local-Python:Neo-local-1991125@100.64.0.4:5432/LLZQ-test",
"json_archive": "tests/testdata_json",
"preset_meta": "离线模式 + 真实测试库,用预置 JSON 回放全量任务",
"fetch_only": {
"suite": ["online"],
"flow": "FETCH_ONLY",
"json_fetch_root": "tmp/json_fetch",
"keyword": "ORDERS",
"pytest_args": "-vv",
"preset_meta": "仅在线抓取阶段,输出到本地目录",
},
"ingest_local": {
"suite": ["online"],
"flow": "INGEST_ONLY",
"json_source": "tests/source-data-doc",
"keyword": "ORDERS",
"preset_meta": "从指定 JSON 目录做本地清洗入库",
},
"full_pipeline": {
"suite": ["online"],
"flow": "FULL",
"json_fetch_root": "tmp/json_fetch",
"keyword": "ORDERS",
"preset_meta": "先抓取再清洗入库的全流程",
},
}
def print_parameter_help():
print("可用参数键说明:")
print(" suite -> 预置测试套件列表,如 ['online','offline']")
print(" tests -> 自定义测试文件路径列表")
print(" mode -> TEST_MODEONLINE / OFFLINE")
print(" db_dsn -> TEST_DB_DSN连接真实 PostgreSQL")
print(" json_archive -> TEST_JSON_ARCHIVE_DIR离线 JSON 目录")
print(" json_temp -> TEST_JSON_TEMP_DIR离线临时目录")
print(" keyword -> pytest -k 过滤关键字")
print(" pytest_args -> 额外 pytest 参数(单个字符串)")
print(" env -> 附加环境变量,形如 ['KEY=VALUE']")
print(" preset_meta -> 注释说明,不会传给 run_tests")
def print_parameter_help() -> None:
print("=== 参数键说明 ===")
print("suite : 预置套件列表,如 ['online','integration']")
print("tests : 自定义 pytest 路径列表")
print("flow : PIPELINE_FLOWFETCH_ONLY / INGEST_ONLY / FULL")
print("json_source : JSON_SOURCE_DIR本地清洗入库使用的 JSON 目录")
print("json_fetch_root : JSON_FETCH_ROOT在线抓取输出根目录")
print("keyword : pytest -k 过滤关键字")
print("pytest_args : 额外 pytest 参数(字符串)")
print("env : 附加环境变量,例如 ['KEY=VALUE']")
print("preset_meta : 仅用于注释说明")
print()
def print_presets():
def print_presets() -> None:
if not PRESETS:
print("当前没有定义任何预置命令,可自行在 PRESETS 中添加。")
print("当前定义任何预置,请在 PRESETS 中添加。")
return
for idx, (name, payload) in enumerate(PRESETS.items(), start=1):
comment = payload.get("preset_meta", "")
@@ -127,13 +69,31 @@ def print_presets():
print()
def run_presets(preset_names: List[str], dry_run: bool):
cmds = []
def resolve_targets(requested: List[str] | None) -> List[str]:
if not PRESETS:
raise SystemExit("预置为空,请先在 PRESETS 中定义测试组合。")
def valid(names: List[str]) -> List[str]:
return [name for name in names if name in PRESETS]
if requested:
candidates = valid(requested)
missing = [name for name in requested if name not in PRESETS]
if missing:
print(f"警告:忽略未定义的预置 {missing}")
if candidates:
return candidates
auto = valid(AUTO_RUN_PRESETS)
if auto:
return auto
return list(PRESETS.keys())
def run_presets(preset_names: List[str], dry_run: bool) -> None:
for name in preset_names:
cmd = [sys.executable, RUN_TESTS_SCRIPT, "--preset", name]
cmds.append(cmd)
for cmd in cmds:
printable = " ".join(cmd)
if dry_run:
print(f"[Dry-Run] {printable}")
@@ -142,11 +102,11 @@ def run_presets(preset_names: List[str], dry_run: bool):
subprocess.run(cmd, check=False)
def main():
parser = argparse.ArgumentParser(description="测试预置仓库(在此集中配置并运行测试组合")
parser.add_argument("--preset", choices=sorted(PRESETS.keys()), nargs="+", help="直接指定要运行的预置命令")
parser.add_argument("--list", action="store_true", help="仅列出参数键和所有预置命令")
parser.add_argument("--dry-run", action="store_true", help="仅打印将要执行的命令,而不真正运行")
def main() -> None:
parser = argparse.ArgumentParser(description="测试预置仓库(集中配置即可批量触发 run_tests")
parser.add_argument("--preset", choices=sorted(PRESETS.keys()), nargs="+", help="指定要运行的预置命令")
parser.add_argument("--list", action="store_true", help="仅列出参数说明与所有预置")
parser.add_argument("--dry-run", action="store_true", help="仅打印命令,不执行 pytest")
args = parser.parse_args()
if args.list:
@@ -154,12 +114,8 @@ def main():
print_presets()
return
if args.preset:
target = args.preset
else:
target = AUTO_RUN_PRESETS or list(PRESETS.keys())
run_presets(target, dry_run=args.dry_run)
targets = resolve_targets(args.preset)
run_presets(targets, dry_run=args.dry_run)
if __name__ == "__main__":

View File

@@ -3,7 +3,7 @@
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.facts.assistant_abolish import AssistantAbolishLoader
from models.parsers import TypeParser
@@ -14,54 +14,54 @@ class AssistantAbolishTask(BaseTask):
def get_task_code(self) -> str:
return "ASSISTANT_ABOLISH"
def execute(self) -> dict:
self.logger.info("开始执行 ASSISTANT_ABOLISH 任务")
window_start, window_end, _ = self._get_time_window()
params = {
"storeId": self.config.get("app.store_id"),
"startTime": TypeParser.format_timestamp(window_start, self.tz),
"endTime": TypeParser.format_timestamp(window_end, self.tz),
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params(
{
"siteId": context.store_id,
"startTime": TypeParser.format_timestamp(context.window_start, self.tz),
"endTime": TypeParser.format_timestamp(context.window_end, self.tz),
}
)
records, _ = self.api.get_paginated(
endpoint="/AssistantPerformance/GetAbolitionAssistant",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
list_key="abolitionAssistants",
)
return {"records": records}
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
mapped = self._parse_record(raw, context.store_id)
if mapped:
parsed.append(mapped)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
try:
records, _ = self.api.get_paginated(
endpoint="/Assistant/AbolishList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data", "abolitionAssistants"),
)
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = AssistantAbolishLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_records(transformed["records"])
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
parsed = []
for raw in records:
mapped = self._parse_record(raw)
if mapped:
parsed.append(mapped)
loader = AssistantAbolishLoader(self.db)
inserted, updated, skipped = loader.upsert_records(parsed)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0,
}
self.logger.info(f"ASSISTANT_ABOLISH 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception:
self.db.rollback()
self.logger.error("ASSISTANT_ABOLISH 失败", exc_info=True)
raise
def _parse_record(self, raw: dict) -> dict | None:
def _parse_record(self, raw: dict, store_id: int) -> dict | None:
abolish_id = TypeParser.parse_int(raw.get("id"))
if not abolish_id:
self.logger.warning("跳过缺少 id 的助教作废记录: %s", raw)
self.logger.warning("跳过缺少作废ID的记录: %s", raw)
return None
store_id = self.config.get("app.store_id")
return {
"store_id": store_id,
"abolish_id": abolish_id,
@@ -72,9 +72,7 @@ class AssistantAbolishTask(BaseTask):
"assistant_no": raw.get("assistantOn"),
"assistant_name": raw.get("assistantName"),
"charge_minutes": TypeParser.parse_int(raw.get("pdChargeMinutes")),
"abolish_amount": TypeParser.parse_decimal(
raw.get("assistantAbolishAmount")
),
"abolish_amount": TypeParser.parse_decimal(raw.get("assistantAbolishAmount")),
"create_time": TypeParser.parse_timestamp(
raw.get("createTime") or raw.get("create_time"), self.tz
),

View File

@@ -3,7 +3,7 @@
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.dimensions.assistant import AssistantLoader
from models.parsers import TypeParser
@@ -14,49 +14,48 @@ class AssistantsTask(BaseTask):
def get_task_code(self) -> str:
return "ASSISTANTS"
def execute(self) -> dict:
self.logger.info("开始执行 ASSISTANTS 任务")
params = {"storeId": self.config.get("app.store_id")}
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params({"siteId": context.store_id})
records, _ = self.api.get_paginated(
endpoint="/PersonnelManagement/SearchAssistantInfo",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
list_key="assistantInfos",
)
return {"records": records}
try:
records, _ = self.api.get_paginated(
endpoint="/Assistant/List",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data", "assistantInfos"),
)
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
mapped = self._parse_assistant(raw, context.store_id)
if mapped:
parsed.append(mapped)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
parsed = []
for raw in records:
mapped = self._parse_assistant(raw)
if mapped:
parsed.append(mapped)
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = AssistantLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_assistants(transformed["records"])
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
loader = AssistantLoader(self.db)
inserted, updated, skipped = loader.upsert_assistants(parsed)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0,
}
self.logger.info(f"ASSISTANTS 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception:
self.db.rollback()
self.logger.error("ASSISTANTS 失败", exc_info=True)
raise
def _parse_assistant(self, raw: dict) -> dict | None:
def _parse_assistant(self, raw: dict, store_id: int) -> dict | None:
assistant_id = TypeParser.parse_int(raw.get("id"))
if not assistant_id:
self.logger.warning("跳过缺少 id 的助教数据: %s", raw)
self.logger.warning("跳过缺少助教ID的数据: %s", raw)
return None
store_id = self.config.get("app.store_id")
return {
"store_id": store_id,
"assistant_id": assistant_id,

View File

@@ -0,0 +1,79 @@
# -*- coding: utf-8 -*-
"""DWD任务基类"""
import json
from typing import Any, Dict, Iterator, List, Optional, Tuple
from datetime import datetime
from .base_task import BaseTask
from models.parsers import TypeParser
class BaseDwdTask(BaseTask):
"""
DWD 层任务基类
负责从 ODS 表读取数据,供子类清洗和写入事实/维度表
"""
    def _get_ods_cursor(self, task_code: str) -> datetime | None:
"""
获取上次处理的 ODS 数据的时间点 (fetched_at)
这里简化处理,实际应该从 etl_cursor 表读取
目前先依赖 BaseTask 的时间窗口逻辑,或者子类自己管理
"""
# TODO: 对接真正的 CursorManager
# 暂时返回一个较早的时间,或者由子类通过 _get_time_window 获取
return None
def iter_ods_rows(
self,
table_name: str,
columns: List[str],
start_time: datetime,
end_time: datetime,
time_col: str = "fetched_at",
batch_size: int = 1000
) -> Iterator[List[Dict[str, Any]]]:
"""
分批迭代读取 ODS 表数据
Args:
table_name: ODS 表名
columns: 需要查询的字段列表 (必须包含 payload)
start_time: 开始时间 (包含)
end_time: 结束时间 (包含)
time_col: 时间过滤字段,默认 fetched_at
batch_size: 批次大小
"""
offset = 0
cols_str = ", ".join(columns)
while True:
sql = f"""
SELECT {cols_str}
FROM {table_name}
WHERE {time_col} >= %s AND {time_col} <= %s
ORDER BY {time_col} ASC
LIMIT %s OFFSET %s
"""
rows = self.db.query(sql, (start_time, end_time, batch_size, offset))
if not rows:
break
yield rows
if len(rows) < batch_size:
break
offset += batch_size
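The LIMIT/OFFSET loop above terminates on an empty or short batch; a database-free sketch with an injectable query function shows the batching behavior:

```python
def iter_batches(query, batch_size=2):
    # Same LIMIT/OFFSET loop as iter_ods_rows, with the database call
    # injected so the sketch runs standalone.
    offset = 0
    while True:
        rows = query(batch_size, offset)
        if not rows:
            break
        yield rows
        if len(rows) < batch_size:
            break  # short batch: nothing left after this one
        offset += batch_size

data = [{"id": i} for i in range(5)]
fake_query = lambda limit, offset: data[offset:offset + limit]
print([len(batch) for batch in iter_batches(fake_query)])  # [2, 2, 1]
```

Note that OFFSET pagination assumes the window is stable while iterating; keyset pagination on `(time_col, primary key)` would avoid drift if rows arrive mid-scan, but that is a design alternative, not what the code above does.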
def parse_payload(self, row: Dict[str, Any]) -> Dict[str, Any]:
"""
解析 ODS 行中的 payload JSON
"""
payload = row.get("payload")
if isinstance(payload, str):
return json.loads(payload)
elif isinstance(payload, dict):
return payload
return {}

View File

@@ -1,62 +1,141 @@
# -*- coding: utf-8 -*-
"""ETL任务基类"""
"""ETL任务基类(引入 Extract/Transform/Load 模板方法)"""
from __future__ import annotations
from dataclasses import dataclass
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo
@dataclass(frozen=True)
class TaskContext:
"""统一透传给 Extract/Transform/Load 的运行期信息。"""
store_id: int
window_start: datetime
window_end: datetime
window_minutes: int
cursor: dict | None = None
class BaseTask:
"""ETL任务基类"""
"""提供 E/T/L 模板的任务基类"""
def __init__(self, config, db_connection, api_client, logger):
self.config = config
self.db = db_connection
self.api = api_client
self.logger = logger
self.tz = ZoneInfo(config.get("app.timezone", "Asia/Taipei"))
# ------------------------------------------------------------------ 基本信息
def get_task_code(self) -> str:
"""获取任务代码"""
raise NotImplementedError("子类需实现 get_task_code 方法")
def execute(self) -> dict:
"""执行任务"""
raise NotImplementedError("子类需实现 execute 方法")
# ------------------------------------------------------------------ E/T/L 钩子
def extract(self, context: TaskContext):
"""提取数据"""
raise NotImplementedError("子类需实现 extract 方法")
def transform(self, extracted, context: TaskContext):
"""转换数据"""
return extracted
def load(self, transformed, context: TaskContext) -> dict:
"""加载数据并返回统计信息"""
raise NotImplementedError("子类需实现 load 方法")
# ------------------------------------------------------------------ 主流程
def execute(self, cursor_data: dict | None = None) -> dict:
"""统一 orchestrate Extract → Transform → Load"""
context = self._build_context(cursor_data)
task_code = self.get_task_code()
self.logger.info(
"%s: 开始执行,窗口[%s ~ %s]",
task_code,
context.window_start,
context.window_end,
)
try:
extracted = self.extract(context)
transformed = self.transform(extracted, context)
counts = self.load(transformed, context) or {}
self.db.commit()
except Exception:
self.db.rollback()
self.logger.error("%s: 执行失败", task_code, exc_info=True)
raise
result = self._build_result("SUCCESS", counts)
result["window"] = {
"start": context.window_start,
"end": context.window_end,
"minutes": context.window_minutes,
}
self.logger.info("%s: 完成,统计=%s", task_code, result["counts"])
return result
# ------------------------------------------------------------------ 辅助方法
def _build_context(self, cursor_data: dict | None) -> TaskContext:
window_start, window_end, window_minutes = self._get_time_window(cursor_data)
return TaskContext(
store_id=self.config.get("app.store_id"),
window_start=window_start,
window_end=window_end,
window_minutes=window_minutes,
cursor=cursor_data,
)
def _get_time_window(self, cursor_data: dict = None) -> tuple:
"""计算时间窗口"""
now = datetime.now(self.tz)
# 判断是否在闲时窗口
idle_start = self.config.get("run.idle_window.start", "04:00")
idle_end = self.config.get("run.idle_window.end", "16:00")
is_idle = self._is_in_idle_window(now, idle_start, idle_end)
# 获取窗口大小
if is_idle:
window_minutes = self.config.get("run.window_minutes.default_idle", 180)
else:
window_minutes = self.config.get("run.window_minutes.default_busy", 30)
# 计算窗口
overlap_seconds = self.config.get("run.overlap_seconds", 120)
if cursor_data and cursor_data.get("last_end"):
window_start = cursor_data["last_end"] - timedelta(seconds=overlap_seconds)
else:
window_start = now - timedelta(minutes=window_minutes)
window_end = now
return window_start, window_end, window_minutes
    def _is_in_idle_window(self, dt: datetime, start_time: str, end_time: str) -> bool:
        """判断是否在闲时窗口(兼容跨午夜窗口,如 22:00~04:00"""
        current_time = dt.strftime("%H:%M")
        if start_time <= end_time:
            return start_time <= current_time <= end_time
        return current_time >= start_time or current_time <= end_time
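Fixed-width `HH:MM` strings compare in the same order as the times they encode, which is what the idle-window check relies on. A standalone sketch (`in_window` is an illustrative name, and the midnight-crossing branch is an assumption beyond the 04:00–16:00 defaults used here):

```python
def in_window(hhmm: str, start: str, end: str) -> bool:
    # "HH:MM" strings order lexicographically the same way as times
    # because the format is fixed-width and zero-padded.
    if start <= end:
        return start <= hhmm <= end
    # start > end: treat the window as crossing midnight, e.g. 22:00~04:00.
    return hhmm >= start or hhmm <= end

print(in_window("05:30", "04:00", "16:00"))  # True  (inside daytime window)
print(in_window("23:10", "22:00", "04:00"))  # True  (crosses midnight)
print(in_window("12:00", "22:00", "04:00"))  # False
```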
def _merge_common_params(self, base: dict) -> dict:
"""
        合并全局/任务级参数池,便于在配置中统一覆盖/追加过滤条件。
        支持:
        - api.params 下的通用键值
        - api.params.<task_code_lower> 下的任务级键值
"""
merged: dict = {}
common = self.config.get("api.params", {}) or {}
if isinstance(common, dict):
merged.update(common)
task_key = f"api.params.{self.get_task_code().lower()}"
scoped = self.config.get(task_key, {}) or {}
if isinstance(scoped, dict):
merged.update(scoped)
merged.update(base)
return merged
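The three-layer precedence above (common pool, then task-scoped pool, then call-site params) reduces to ordered `dict.update` calls; a minimal standalone sketch (`merge_params` is an illustrative name):

```python
def merge_params(common, scoped, base):
    # Precedence, low -> high: common pool, task-scoped pool, call-site params.
    merged = {}
    for layer in (common, scoped, base):
        if isinstance(layer, dict):
            merged.update(layer)
    return merged

print(merge_params({"pageSize": 200, "channel": "app"},
                   {"channel": "pos"},        # task-scoped override wins
                   {"siteId": 7}))            # call-site param appended
# {'pageSize': 200, 'channel': 'pos', 'siteId': 7}
```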
def _build_result(self, status: str, counts: dict) -> dict:
"""构建结果字典"""
return {
"status": status,
"counts": counts
}
return {"status": status, "counts": counts}

View File

@@ -3,65 +3,66 @@
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.facts.coupon_usage import CouponUsageLoader
from models.parsers import TypeParser
class CouponUsageTask(BaseTask):
"""同步平台券验/核销记录"""
"""同步平台券验/核销记录"""
def get_task_code(self) -> str:
return "COUPON_USAGE"
def execute(self) -> dict:
self.logger.info("开始执行 COUPON_USAGE 任务")
window_start, window_end, _ = self._get_time_window()
params = {
"storeId": self.config.get("app.store_id"),
"startTime": TypeParser.format_timestamp(window_start, self.tz),
"endTime": TypeParser.format_timestamp(window_end, self.tz),
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params(
{
"siteId": context.store_id,
"startTime": TypeParser.format_timestamp(context.window_start, self.tz),
"endTime": TypeParser.format_timestamp(context.window_end, self.tz),
}
)
records, _ = self.api.get_paginated(
endpoint="/Promotion/GetOfflineCouponConsumePageList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
)
return {"records": records}
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
mapped = self._parse_usage(raw, context.store_id)
if mapped:
parsed.append(mapped)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
try:
records, _ = self.api.get_paginated(
endpoint="/Coupon/UsageList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=(),
)
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = CouponUsageLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_coupon_usage(
transformed["records"]
)
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
parsed = []
for raw in records:
mapped = self._parse_usage(raw)
if mapped:
parsed.append(mapped)
loader = CouponUsageLoader(self.db)
inserted, updated, skipped = loader.upsert_coupon_usage(parsed)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0,
}
self.logger.info(f"COUPON_USAGE 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception:
self.db.rollback()
self.logger.error("COUPON_USAGE 失败", exc_info=True)
raise
def _parse_usage(self, raw: dict) -> dict | None:
def _parse_usage(self, raw: dict, store_id: int) -> dict | None:
usage_id = TypeParser.parse_int(raw.get("id"))
if not usage_id:
self.logger.warning("跳过缺少 id 的券核销记录: %s", raw)
self.logger.warning("跳过缺少券核销ID的记录: %s", raw)
return None
store_id = self.config.get("app.store_id")
return {
"store_id": store_id,
"usage_id": usage_id,

View File

@@ -0,0 +1,907 @@
# -*- coding: utf-8 -*-
"""DWD 装载任务:从 ODS 增量写入 DWD维度 SCD2事实按时间增量"""
from __future__ import annotations
from datetime import datetime
from typing import Any, Dict, Iterable, List, Sequence
from psycopg2.extras import RealDictCursor
from .base_task import BaseTask, TaskContext
class DwdLoadTask(BaseTask):
"""负责 DWD 装载:维度表做 SCD2 合并,事实表按时间增量写入。"""
# DWD -> ODS 表映射ODS 表名已与示例 JSON 前缀统一)
TABLE_MAP: dict[str, str] = {
# 维度
# 门店:改用台费流水中的 siteprofile 快照,补齐 org/地址等字段
"billiards_dwd.dim_site": "billiards_ods.table_fee_transactions",
"billiards_dwd.dim_site_ex": "billiards_ods.table_fee_transactions",
"billiards_dwd.dim_table": "billiards_ods.site_tables_master",
"billiards_dwd.dim_table_ex": "billiards_ods.site_tables_master",
"billiards_dwd.dim_assistant": "billiards_ods.assistant_accounts_master",
"billiards_dwd.dim_assistant_ex": "billiards_ods.assistant_accounts_master",
"billiards_dwd.dim_member": "billiards_ods.member_profiles",
"billiards_dwd.dim_member_ex": "billiards_ods.member_profiles",
"billiards_dwd.dim_member_card_account": "billiards_ods.member_stored_value_cards",
"billiards_dwd.dim_member_card_account_ex": "billiards_ods.member_stored_value_cards",
"billiards_dwd.dim_tenant_goods": "billiards_ods.tenant_goods_master",
"billiards_dwd.dim_tenant_goods_ex": "billiards_ods.tenant_goods_master",
"billiards_dwd.dim_store_goods": "billiards_ods.store_goods_master",
"billiards_dwd.dim_store_goods_ex": "billiards_ods.store_goods_master",
"billiards_dwd.dim_goods_category": "billiards_ods.stock_goods_category_tree",
"billiards_dwd.dim_groupbuy_package": "billiards_ods.group_buy_packages",
"billiards_dwd.dim_groupbuy_package_ex": "billiards_ods.group_buy_packages",
# 事实
"billiards_dwd.dwd_settlement_head": "billiards_ods.settlement_records",
"billiards_dwd.dwd_settlement_head_ex": "billiards_ods.settlement_records",
"billiards_dwd.dwd_table_fee_log": "billiards_ods.table_fee_transactions",
"billiards_dwd.dwd_table_fee_log_ex": "billiards_ods.table_fee_transactions",
"billiards_dwd.dwd_table_fee_adjust": "billiards_ods.table_fee_discount_records",
"billiards_dwd.dwd_table_fee_adjust_ex": "billiards_ods.table_fee_discount_records",
"billiards_dwd.dwd_store_goods_sale": "billiards_ods.store_goods_sales_records",
"billiards_dwd.dwd_store_goods_sale_ex": "billiards_ods.store_goods_sales_records",
"billiards_dwd.dwd_assistant_service_log": "billiards_ods.assistant_service_records",
"billiards_dwd.dwd_assistant_service_log_ex": "billiards_ods.assistant_service_records",
"billiards_dwd.dwd_assistant_trash_event": "billiards_ods.assistant_cancellation_records",
"billiards_dwd.dwd_assistant_trash_event_ex": "billiards_ods.assistant_cancellation_records",
"billiards_dwd.dwd_member_balance_change": "billiards_ods.member_balance_changes",
"billiards_dwd.dwd_member_balance_change_ex": "billiards_ods.member_balance_changes",
"billiards_dwd.dwd_groupbuy_redemption": "billiards_ods.group_buy_redemption_records",
"billiards_dwd.dwd_groupbuy_redemption_ex": "billiards_ods.group_buy_redemption_records",
"billiards_dwd.dwd_platform_coupon_redemption": "billiards_ods.platform_coupon_redemption_records",
"billiards_dwd.dwd_platform_coupon_redemption_ex": "billiards_ods.platform_coupon_redemption_records",
"billiards_dwd.dwd_recharge_order": "billiards_ods.recharge_settlements",
"billiards_dwd.dwd_recharge_order_ex": "billiards_ods.recharge_settlements",
"billiards_dwd.dwd_payment": "billiards_ods.payment_transactions",
"billiards_dwd.dwd_refund": "billiards_ods.refund_transactions",
"billiards_dwd.dwd_refund_ex": "billiards_ods.refund_transactions",
}
SCD_COLS = {"scd2_start_time", "scd2_end_time", "scd2_is_current", "scd2_version"}
FACT_ORDER_CANDIDATES = [
"fetched_at",
"pay_time",
"create_time",
"update_time",
"occur_time",
"settle_time",
"start_use_time",
]
    # Special column mappings: DWD column name -> source column expression (optional CAST)
FACT_MAPPINGS: dict[str, list[tuple[str, str, str | None]]] = {
        # Dimension tables (fill in primary-key/column-name differences)
"billiards_dwd.dim_site": [
("org_id", "siteprofile->>'org_id'", None),
("shop_name", "siteprofile->>'shop_name'", None),
("site_label", "siteprofile->>'site_label'", None),
("full_address", "siteprofile->>'full_address'", None),
("address", "siteprofile->>'address'", None),
("longitude", "siteprofile->>'longitude'", "numeric"),
("latitude", "siteprofile->>'latitude'", "numeric"),
("tenant_site_region_id", "siteprofile->>'tenant_site_region_id'", None),
("business_tel", "siteprofile->>'business_tel'", None),
("site_type", "siteprofile->>'site_type'", None),
("shop_status", "siteprofile->>'shop_status'", None),
("tenant_id", "siteprofile->>'tenant_id'", None),
],
"billiards_dwd.dim_site_ex": [
("auto_light", "siteprofile->>'auto_light'", None),
("attendance_enabled", "siteprofile->>'attendance_enabled'", None),
("attendance_distance", "siteprofile->>'attendance_distance'", None),
("prod_env", "siteprofile->>'prod_env'", None),
("light_status", "siteprofile->>'light_status'", None),
("light_type", "siteprofile->>'light_type'", None),
("light_token", "siteprofile->>'light_token'", None),
("address", "siteprofile->>'address'", None),
("avatar", "siteprofile->>'avatar'", None),
("wifi_name", "siteprofile->>'wifi_name'", None),
("wifi_password", "siteprofile->>'wifi_password'", None),
("customer_service_qrcode", "siteprofile->>'customer_service_qrcode'", None),
("customer_service_wechat", "siteprofile->>'customer_service_wechat'", None),
("fixed_pay_qrcode", "siteprofile->>'fixed_pay_qrCode'", None),
("longitude", "siteprofile->>'longitude'", "numeric"),
("latitude", "siteprofile->>'latitude'", "numeric"),
("tenant_site_region_id", "siteprofile->>'tenant_site_region_id'", None),
("site_type", "siteprofile->>'site_type'", None),
("site_label", "siteprofile->>'site_label'", None),
("shop_status", "siteprofile->>'shop_status'", None),
("create_time", "siteprofile->>'create_time'", "timestamptz"),
("update_time", "siteprofile->>'update_time'", "timestamptz"),
],
"billiards_dwd.dim_table": [
("table_id", "id", None),
("site_table_area_name", "areaname", None),
("tenant_table_area_id", "site_table_area_id", None),
],
"billiards_dwd.dim_table_ex": [
("table_id", "id", None),
("table_cloth_use_time", "table_cloth_use_time", None),
],
"billiards_dwd.dim_assistant": [("assistant_id", "id", None), ("user_id", "staff_id", None)],
"billiards_dwd.dim_assistant_ex": [
("assistant_id", "id", None),
("introduce", "introduce", None),
("group_name", "group_name", None),
("light_equipment_id", "light_equipment_id", None),
],
"billiards_dwd.dim_member": [("member_id", "id", None)],
"billiards_dwd.dim_member_ex": [
("member_id", "id", None),
("register_site_name", "site_name", None),
],
"billiards_dwd.dim_member_card_account": [("member_card_id", "id", None)],
"billiards_dwd.dim_member_card_account_ex": [
("member_card_id", "id", None),
("tenant_name", "tenantname", None),
("tenantavatar", "tenantavatar", None),
("card_no", "card_no", None),
("bind_password", "bind_password", None),
("use_scene", "use_scene", None),
("tableareaid", "tableareaid", None),
("goodscategoryid", "goodscategoryid", None),
],
"billiards_dwd.dim_tenant_goods": [
("tenant_goods_id", "id", None),
("category_name", "categoryname", None),
],
"billiards_dwd.dim_tenant_goods_ex": [
("tenant_goods_id", "id", None),
("remark_name", "remark_name", None),
("goods_bar_code", "goods_bar_code", None),
("commodity_code_list", "commodity_code", None),
("is_in_site", "isinsite", "boolean"),
],
"billiards_dwd.dim_store_goods": [
("site_goods_id", "id", None),
("category_level1_name", "onecategoryname", None),
("category_level2_name", "twocategoryname", None),
("created_at", "create_time", None),
("updated_at", "update_time", None),
("avg_monthly_sales", "average_monthly_sales", None),
("batch_stock_qty", "stock", None),
("sale_qty", "sale_num", None),
("total_sales_qty", "total_sales", None),
],
"billiards_dwd.dim_store_goods_ex": [
("site_goods_id", "id", None),
("goods_barcode", "goods_bar_code", None),
("stock_qty", "stock", None),
("stock_secondary_qty", "stock_a", None),
("safety_stock_qty", "safe_stock", None),
("site_name", "sitename", None),
("goods_cover_url", "goods_cover", None),
("provisional_total_cost", "total_purchase_cost", None),
("is_discountable", "able_discount", None),
("freeze_status", "freeze", None),
("remark", "remark", None),
("days_on_shelf", "days_available", None),
("sort_order", "sort", None),
],
"billiards_dwd.dim_goods_category": [
("category_id", "id", None),
("tenant_id", "tenant_id", None),
("category_name", "category_name", None),
("alias_name", "alias_name", None),
("parent_category_id", "pid", None),
("business_name", "business_name", None),
("tenant_goods_business_id", "tenant_goods_business_id", None),
("sort_order", "sort", None),
("open_salesman", "open_salesman", None),
("is_warehousing", "is_warehousing", None),
("category_level", "CASE WHEN pid = 0 THEN 1 ELSE 2 END", None),
("is_leaf", "CASE WHEN categoryboxes IS NULL OR jsonb_array_length(categoryboxes)=0 THEN 1 ELSE 0 END", None),
],
"billiards_dwd.dim_groupbuy_package": [
("groupbuy_package_id", "id", None),
("package_template_id", "package_id", None),
("coupon_face_value", "coupon_money", None),
("duration_seconds", "duration", None),
],
"billiards_dwd.dim_groupbuy_package_ex": [
("groupbuy_package_id", "id", None),
("table_area_id", "table_area_id", None),
("tenant_table_area_id", "tenant_table_area_id", None),
("usable_range", "usable_range", None),
("table_area_id_list", "table_area_id_list", None),
("package_type", "type", None),
],
        # Fact tables: primary keys and key column differences
"billiards_dwd.dwd_table_fee_log": [("table_fee_log_id", "id", None)],
"billiards_dwd.dwd_table_fee_log_ex": [
("table_fee_log_id", "id", None),
("salesman_name", "salesman_name", None),
],
"billiards_dwd.dwd_table_fee_adjust": [
("table_fee_adjust_id", "id", None),
("table_id", "site_table_id", None),
("table_area_id", "tenant_table_area_id", None),
("table_area_name", "tableprofile->>'table_area_name'", None),
("adjust_time", "create_time", None),
],
"billiards_dwd.dwd_table_fee_adjust_ex": [
("table_fee_adjust_id", "id", None),
("ledger_name", "ledger_name", None),
],
"billiards_dwd.dwd_store_goods_sale": [("store_goods_sale_id", "id", None), ("discount_price", "discount_money", None)],
"billiards_dwd.dwd_store_goods_sale_ex": [
("store_goods_sale_id", "id", None),
("option_value_name", "option_value_name", None),
("open_salesman_flag", "opensalesman", "integer"),
("salesman_name", "salesman_name", None),
("salesman_org_id", "sales_man_org_id", None),
("legacy_order_goods_id", "ordergoodsid", None),
("site_name", "sitename", None),
("legacy_site_id", "siteid", None),
],
"billiards_dwd.dwd_assistant_service_log": [
("assistant_service_id", "id", None),
("assistant_no", "assistantno", None),
("site_assistant_id", "order_assistant_id", None),
("level_name", "levelname", None),
("skill_name", "skillname", None),
],
"billiards_dwd.dwd_assistant_service_log_ex": [
("assistant_service_id", "id", None),
("assistant_name", "assistantname", None),
("ledger_group_name", "ledger_group_name", None),
("trash_applicant_name", "trash_applicant_name", None),
("trash_reason", "trash_reason", None),
("salesman_name", "salesman_name", None),
("table_name", "tablename", None),
],
"billiards_dwd.dwd_assistant_trash_event": [
("assistant_trash_event_id", "id", None),
("assistant_no", "assistantname", None),
("abolish_amount", "assistantabolishamount", None),
("charge_minutes_raw", "pdchargeminutes", None),
("site_id", "siteid", None),
("table_id", "tableid", None),
("table_area_id", "tableareaid", None),
("assistant_name", "assistantname", None),
("trash_reason", "trashreason", None),
("create_time", "createtime", None),
],
"billiards_dwd.dwd_assistant_trash_event_ex": [
("assistant_trash_event_id", "id", None),
("table_area_name", "tablearea", None),
("table_name", "tablename", None),
],
"billiards_dwd.dwd_member_balance_change": [
("balance_change_id", "id", None),
("balance_before", "before", None),
("change_amount", "account_data", None),
("balance_after", "after", None),
("card_type_name", "membercardtypename", None),
("change_time", "create_time", None),
("member_name", "membername", None),
("member_mobile", "membermobile", None),
],
"billiards_dwd.dwd_member_balance_change_ex": [
("balance_change_id", "id", None),
("pay_site_name", "paysitename", None),
("register_site_name", "registersitename", None),
],
"billiards_dwd.dwd_groupbuy_redemption": [("redemption_id", "id", None)],
"billiards_dwd.dwd_groupbuy_redemption_ex": [
("redemption_id", "id", None),
("table_area_name", "tableareaname", None),
("site_name", "sitename", None),
("table_name", "tablename", None),
("goods_option_price", "goodsoptionprice", None),
("salesman_name", "salesman_name", None),
("salesman_org_id", "sales_man_org_id", None),
("ledger_group_name", "ledger_group_name", None),
],
"billiards_dwd.dwd_platform_coupon_redemption": [("platform_coupon_redemption_id", "id", None)],
"billiards_dwd.dwd_platform_coupon_redemption_ex": [
("platform_coupon_redemption_id", "id", None),
("coupon_cover", "coupon_cover", None),
],
"billiards_dwd.dwd_payment": [("payment_id", "id", None), ("pay_date", "pay_time", "date")],
"billiards_dwd.dwd_refund": [("refund_id", "id", None)],
"billiards_dwd.dwd_refund_ex": [
("refund_id", "id", None),
("tenant_name", "tenantname", None),
("channel_payer_id", "channel_payer_id", None),
("channel_pay_no", "channel_pay_no", None),
],
        # Settlement head (settlement_records: source columns use lowercased camelCase without underscores, so explicit mapping is required)
"billiards_dwd.dwd_settlement_head": [
("order_settle_id", "id", None),
("tenant_id", "tenantid", None),
("site_id", "siteid", None),
("site_name", "sitename", None),
("table_id", "tableid", None),
("settle_name", "settlename", None),
("order_trade_no", "settlerelateid", None),
("create_time", "createtime", None),
("pay_time", "paytime", None),
("settle_type", "settletype", None),
("revoke_order_id", "revokeorderid", None),
("member_id", "memberid", None),
("member_name", "membername", None),
("member_phone", "memberphone", None),
("member_card_account_id", "tenantmembercardid", None),
("member_card_type_name", "membercardtypename", None),
("is_bind_member", "isbindmember", None),
("member_discount_amount", "memberdiscountamount", None),
("consume_money", "consumemoney", None),
("table_charge_money", "tablechargemoney", None),
("goods_money", "goodsmoney", None),
("real_goods_money", "realgoodsmoney", None),
("assistant_pd_money", "assistantpdmoney", None),
("assistant_cx_money", "assistantcxmoney", None),
("adjust_amount", "adjustamount", None),
("pay_amount", "payamount", None),
("balance_amount", "balanceamount", None),
("recharge_card_amount", "rechargecardamount", None),
("gift_card_amount", "giftcardamount", None),
("coupon_amount", "couponamount", None),
("rounding_amount", "roundingamount", None),
("point_amount", "pointamount", None),
],
"billiards_dwd.dwd_settlement_head_ex": [
("order_settle_id", "id", None),
("serial_number", "serialnumber", None),
("settle_status", "settlestatus", None),
("can_be_revoked", "canberevoked", "boolean"),
("revoke_order_name", "revokeordername", None),
("revoke_time", "revoketime", None),
("is_first_order", "isfirst", "boolean"),
("service_money", "servicemoney", None),
("cash_amount", "cashamount", None),
("card_amount", "cardamount", None),
("online_amount", "onlineamount", None),
("refund_amount", "refundamount", None),
("prepay_money", "prepaymoney", None),
("payment_method", "paymentmethod", None),
("coupon_sale_amount", "couponsaleamount", None),
("all_coupon_discount", "allcoupondiscount", None),
("goods_promotion_money", "goodspromotionmoney", None),
("assistant_promotion_money", "assistantpromotionmoney", None),
("activity_discount", "activitydiscount", None),
("assistant_manual_discount", "assistantmanualdiscount", None),
("point_discount_price", "pointdiscountprice", None),
("point_discount_cost", "pointdiscountcost", None),
("is_use_coupon", "isusecoupon", "boolean"),
("is_use_discount", "isusediscount", "boolean"),
("is_activity", "isactivity", "boolean"),
("operator_name", "operatorname", None),
("salesman_name", "salesmanname", None),
("order_remark", "orderremark", None),
("operator_id", "operatorid", None),
("salesman_user_id", "salesmanuserid", None),
],
        # Recharge settlements (recharge_settlements: same column-name style as settlement_records)
"billiards_dwd.dwd_recharge_order": [
("recharge_order_id", "id", None),
("tenant_id", "tenantid", None),
("site_id", "siteid", None),
("member_id", "memberid", None),
("member_name_snapshot", "membername", None),
("member_phone_snapshot", "memberphone", None),
("tenant_member_card_id", "tenantmembercardid", None),
("member_card_type_name", "membercardtypename", None),
("settle_relate_id", "settlerelateid", None),
("settle_type", "settletype", None),
("settle_name", "settlename", None),
("is_first", "isfirst", None),
("pay_amount", "payamount", None),
("refund_amount", "refundamount", None),
("point_amount", "pointamount", None),
("cash_amount", "cashamount", None),
("payment_method", "paymentmethod", None),
("create_time", "createtime", None),
("pay_time", "paytime", None),
],
"billiards_dwd.dwd_recharge_order_ex": [
("recharge_order_id", "id", None),
("site_name_snapshot", "sitename", None),
("salesman_name", "salesmanname", None),
("order_remark", "orderremark", None),
("revoke_order_name", "revokeordername", None),
("settle_status", "settlestatus", None),
("is_bind_member", "isbindmember", "boolean"),
("is_activity", "isactivity", "boolean"),
("is_use_coupon", "isusecoupon", "boolean"),
("is_use_discount", "isusediscount", "boolean"),
("can_be_revoked", "canberevoked", "boolean"),
("online_amount", "onlineamount", None),
("balance_amount", "balanceamount", None),
("card_amount", "cardamount", None),
("coupon_amount", "couponamount", None),
("recharge_card_amount", "rechargecardamount", None),
("gift_card_amount", "giftcardamount", None),
("prepay_money", "prepaymoney", None),
("consume_money", "consumemoney", None),
("goods_money", "goodsmoney", None),
("real_goods_money", "realgoodsmoney", None),
("table_charge_money", "tablechargemoney", None),
("service_money", "servicemoney", None),
("activity_discount", "activitydiscount", None),
("all_coupon_discount", "allcoupondiscount", None),
("goods_promotion_money", "goodspromotionmoney", None),
("assistant_promotion_money", "assistantpromotionmoney", None),
("assistant_pd_money", "assistantpdmoney", None),
("assistant_cx_money", "assistantcxmoney", None),
("assistant_manual_discount", "assistantmanualdiscount", None),
("coupon_sale_amount", "couponsaleamount", None),
("member_discount_amount", "memberdiscountamount", None),
("point_discount_price", "pointdiscountprice", None),
("point_discount_cost", "pointdiscountcost", None),
("adjust_amount", "adjustamount", None),
("rounding_amount", "roundingamount", None),
("operator_id", "operatorid", None),
("operator_name_snapshot", "operatorname", None),
            ("salesman_user_id", "salesmanuserid", None),
            ("table_id", "tableid", None),
            ("serial_number", "serialnumber", None),
            ("revoke_order_id", "revokeorderid", None),
            ("revoke_time", "revoketime", None),
],
}
def get_task_code(self) -> str:
        """Return the task code."""
return "DWD_LOAD_FROM_ODS"
def extract(self, context: TaskContext) -> dict[str, Any]:
        """Prepare the context needed for this run."""
return {"now": datetime.now()}
def load(self, extracted: dict[str, Any], context: TaskContext) -> dict[str, Any]:
        """Walk the table map: dimensions get an SCD2 merge, fact tables an incremental time-based insert."""
now = extracted["now"]
summary: List[Dict[str, Any]] = []
with self.db.conn.cursor(cursor_factory=RealDictCursor) as cur:
for dwd_table, ods_table in self.TABLE_MAP.items():
dwd_cols = self._get_columns(cur, dwd_table)
ods_cols = self._get_columns(cur, ods_table)
if not dwd_cols:
                    self.logger.warning("Skipping %s: failed to fetch DWD column metadata", dwd_table)
continue
if self._table_base(dwd_table).startswith("dim_"):
processed = self._merge_dim_scd2(cur, dwd_table, ods_table, dwd_cols, ods_cols, now)
summary.append({"table": dwd_table, "mode": "SCD2", "processed": processed})
else:
dwd_types = self._get_column_types(cur, dwd_table, "billiards_dwd")
ods_types = self._get_column_types(cur, ods_table, "billiards_ods")
inserted = self._merge_fact_increment(
cur, dwd_table, ods_table, dwd_cols, ods_cols, dwd_types, ods_types
)
summary.append({"table": dwd_table, "mode": "INCREMENT", "inserted": inserted})
self.db.conn.commit()
return {"tables": summary}
# ---------------------- helpers ----------------------
def _get_columns(self, cur, table: str) -> List[str]:
        """Return the table's column names (lowercased)."""
schema, name = self._split_table_name(table, default_schema="billiards_dwd")
cur.execute(
"""
SELECT column_name
FROM information_schema.columns
WHERE table_schema = %s AND table_name = %s
""",
(schema, name),
)
return [r["column_name"].lower() for r in cur.fetchall()]
def _get_primary_keys(self, cur, table: str) -> List[str]:
        """Return the table's primary-key column names."""
schema, name = self._split_table_name(table, default_schema="billiards_dwd")
cur.execute(
"""
SELECT kcu.column_name
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
ON tc.constraint_name = kcu.constraint_name
AND tc.table_schema = kcu.table_schema
AND tc.table_name = kcu.table_name
WHERE tc.table_schema = %s
AND tc.table_name = %s
AND tc.constraint_type = 'PRIMARY KEY'
ORDER BY kcu.ordinal_position
""",
(schema, name),
)
return [r["column_name"].lower() for r in cur.fetchall()]
def _get_column_types(self, cur, table: str, default_schema: str) -> Dict[str, str]:
        """Return column data types (information_schema.data_type)."""
schema, name = self._split_table_name(table, default_schema=default_schema)
cur.execute(
"""
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = %s AND table_name = %s
""",
(schema, name),
)
return {r["column_name"].lower(): r["data_type"].lower() for r in cur.fetchall()}
def _build_column_mapping(
self, dwd_table: str, pk_cols: Sequence[str], ods_cols: Sequence[str]
) -> Dict[str, tuple[str, str | None]]:
        """Merge explicit FACT_MAPPINGS with fallback primary-key mappings."""
mapping_entries = self.FACT_MAPPINGS.get(dwd_table, [])
mapping: Dict[str, tuple[str, str | None]] = {
dst.lower(): (src, cast_type) for dst, src, cast_type in mapping_entries
}
ods_set = {c.lower() for c in ods_cols}
for pk in pk_cols:
pk_lower = pk.lower()
if pk_lower not in mapping and pk_lower not in ods_set and "id" in ods_set:
mapping[pk_lower] = ("id", None)
return mapping
def _fetch_source_rows(
        self, cur, table: str, columns: Sequence[str], where_sql: str = "", params: Sequence[Any] | None = None
) -> List[Dict[str, Any]]:
        """Read the given columns from the source table; return dicts with lowercased keys."""
schema, name = self._split_table_name(table, default_schema="billiards_ods")
cols_sql = ", ".join(f'"{c}"' for c in columns)
sql = f'SELECT {cols_sql} FROM "{schema}"."{name}" {where_sql}'
cur.execute(sql, params or [])
rows = []
for r in cur.fetchall():
rows.append({k.lower(): v for k, v in r.items()})
return rows
def _expand_goods_category_rows(self, rows: list[Dict[str, Any]]) -> list[Dict[str, Any]]:
        """Expand categoryboxes elements of the category table into child-category rows."""
expanded: list[Dict[str, Any]] = []
for r in rows:
expanded.append(r)
boxes = r.get("categoryboxes")
if isinstance(boxes, list):
for child in boxes:
if not isinstance(child, dict):
continue
child_row: Dict[str, Any] = {}
                    # Inherit tenant and business-category info from the parent
child_row["tenant_id"] = r.get("tenant_id")
child_row["business_name"] = child.get("business_name", r.get("business_name"))
child_row["tenant_goods_business_id"] = child.get(
"tenant_goods_business_id", r.get("tenant_goods_business_id")
)
                    # Merge child-category fields
child_row.update(child)
                    # Default parent-child link
child_row.setdefault("pid", r.get("id"))
                    # Derive level / leaf flags
child_boxes = child_row.get("categoryboxes")
if not isinstance(child_boxes, list):
is_leaf = 1
else:
is_leaf = 1 if len(child_boxes) == 0 else 0
child_row.setdefault("category_level", 2)
child_row.setdefault("is_leaf", is_leaf)
expanded.append(child_row)
return expanded
def _merge_dim_scd2(
self,
cur,
dwd_table: str,
ods_table: str,
dwd_cols: Sequence[str],
ods_cols: Sequence[str],
now: datetime,
) -> int:
        """SCD2-merge a dimension table: on change, close the old version and insert a new one."""
pk_cols = self._get_primary_keys(cur, dwd_table)
if not pk_cols:
            raise ValueError(f"{dwd_table} has no primary key; cannot run SCD2 merge")
mapping = self._build_column_mapping(dwd_table, pk_cols, ods_cols)
ods_set = {c.lower() for c in ods_cols}
table_sql = self._format_table(ods_table, "billiards_ods")
        # Build SELECT expressions; JSON/expression mappings are supported
select_exprs: list[str] = []
added: set[str] = set()
for col in dwd_cols:
lc = col.lower()
if lc in self.SCD_COLS:
continue
if lc in mapping:
src, cast_type = mapping[lc]
select_exprs.append(f"{self._cast_expr(src, cast_type)} AS \"{lc}\"")
added.add(lc)
elif lc in ods_set:
select_exprs.append(f'"{lc}" AS "{lc}"')
added.add(lc)
        # The category dimension also needs categoryboxes to expand child categories
if dwd_table == "billiards_dwd.dim_goods_category" and "categoryboxes" not in added and "categoryboxes" in ods_set:
select_exprs.append('"categoryboxes" AS "categoryboxes"')
added.add("categoryboxes")
        # Fallback: make sure primary-key columns are selected
for pk in pk_cols:
lc = pk.lower()
if lc not in added:
if lc in mapping:
src, cast_type = mapping[lc]
select_exprs.append(f"{self._cast_expr(src, cast_type)} AS \"{lc}\"")
elif lc in ods_set:
select_exprs.append(f'"{lc}" AS "{lc}"')
added.add(lc)
if not select_exprs:
return 0
sql = f"SELECT {', '.join(select_exprs)} FROM {table_sql}"
cur.execute(sql)
rows = [{k.lower(): v for k, v in r.items()} for r in cur.fetchall()]
        # Special case: expand child categories for the category dimension
if dwd_table == "billiards_dwd.dim_goods_category":
rows = self._expand_goods_category_rows(rows)
inserted_or_updated = 0
seen_pk = set()
for row in rows:
mapped_row: Dict[str, Any] = {}
for col in dwd_cols:
lc = col.lower()
if lc in self.SCD_COLS:
continue
value = row.get(lc)
if value is None and lc in mapping:
src, _ = mapping[lc]
value = row.get(src.lower())
mapped_row[lc] = value
pk_key = tuple(mapped_row.get(pk) for pk in pk_cols)
if pk_key in seen_pk:
continue
seen_pk.add(pk_key)
            if self._upsert_scd2_row(cur, dwd_table, dwd_cols, pk_cols, mapped_row, now):
                inserted_or_updated += 1
        self.logger.info("%s: scanned %d source rows, %d versions inserted/updated", dwd_table, len(rows), inserted_or_updated)
        return len(rows)
def _upsert_scd2_row(
self,
cur,
dwd_table: str,
dwd_cols: Sequence[str],
pk_cols: Sequence[str],
src_row: Dict[str, Any],
now: datetime,
) -> bool:
        """SCD2 upsert: if the row changed, close the current version and insert a new one."""
pk_values = [src_row.get(pk) for pk in pk_cols]
if any(v is None for v in pk_values):
            self.logger.warning("Skipping %s: missing primary key %s", dwd_table, dict(zip(pk_cols, pk_values)))
return False
where_clause = " AND ".join(f'"{pk}" = %s' for pk in pk_cols)
table_sql = self._format_table(dwd_table, "billiards_dwd")
cur.execute(
f"SELECT * FROM {table_sql} WHERE {where_clause} AND COALESCE(scd2_is_current,1)=1 LIMIT 1",
pk_values,
)
current = cur.fetchone()
if current:
current = {k.lower(): v for k, v in current.items()}
if current and not self._is_row_changed(current, src_row, dwd_cols):
return False
if current:
version = (current.get("scd2_version") or 1) + 1
self._close_current_dim(cur, dwd_table, pk_cols, pk_values, now)
else:
version = 1
self._insert_dim_row(cur, dwd_table, dwd_cols, src_row, now, version)
return True
def _close_current_dim(self, cur, table: str, pk_cols: Sequence[str], pk_values: Sequence[Any], now: datetime) -> None:
        """Close the current version: set scd2_is_current=0 and fill the end time."""
set_sql = "scd2_end_time = %s, scd2_is_current = 0"
where_clause = " AND ".join(f'"{pk}" = %s' for pk in pk_cols)
table_sql = self._format_table(table, "billiards_dwd")
cur.execute(f"UPDATE {table_sql} SET {set_sql} WHERE {where_clause} AND COALESCE(scd2_is_current,1)=1", [now, *pk_values])
def _insert_dim_row(
self,
cur,
table: str,
dwd_cols: Sequence[str],
src_row: Dict[str, Any],
now: datetime,
version: int,
) -> None:
        """Insert a new SCD2 version row."""
insert_cols: List[str] = []
placeholders: List[str] = []
values: List[Any] = []
for col in sorted(dwd_cols):
lc = col.lower()
insert_cols.append(f'"{lc}"')
placeholders.append("%s")
if lc == "scd2_start_time":
values.append(now)
elif lc == "scd2_end_time":
values.append(datetime(9999, 12, 31, 0, 0, 0))
elif lc == "scd2_is_current":
values.append(1)
elif lc == "scd2_version":
values.append(version)
else:
values.append(src_row.get(lc))
table_sql = self._format_table(table, "billiards_dwd")
sql = f'INSERT INTO {table_sql} ({", ".join(insert_cols)}) VALUES ({", ".join(placeholders)})'
cur.execute(sql, values)
def _is_row_changed(self, current: Dict[str, Any], incoming: Dict[str, Any], dwd_cols: Sequence[str]) -> bool:
        """Compare non-SCD2 columns to decide whether the row changed."""
for col in dwd_cols:
lc = col.lower()
if lc in self.SCD_COLS:
continue
if current.get(lc) != incoming.get(lc):
return True
return False
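The change check above can be exercised in isolation. A minimal, dependency-free sketch (plain dicts; the column names are illustrative, not tied to the live schema):

```python
# Sketch of the SCD2 change check: a new version is emitted only when
# some non-bookkeeping (non-SCD2) attribute differs between versions.
SCD_COLS = {"scd2_start_time", "scd2_end_time", "scd2_is_current", "scd2_version"}

def is_row_changed(current: dict, incoming: dict, cols: list[str]) -> bool:
    """Return True when any non-SCD2 column differs."""
    return any(
        current.get(c) != incoming.get(c)
        for c in cols
        if c not in SCD_COLS
    )

cols = ["site_id", "shop_name", "scd2_version"]
old = {"site_id": 1, "shop_name": "Store A", "scd2_version": 3}
print(is_row_changed(old, {"site_id": 1, "shop_name": "Store B"}, cols))  # True: shop_name differs
print(is_row_changed(old, dict(old), cols))  # False: data identical, SCD2 bookkeeping ignored
```

When the check returns True, the loader closes the current version (scd2_is_current=0, end time filled) and inserts a new row with scd2_version incremented.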
def _merge_fact_increment(
self,
cur,
dwd_table: str,
ods_table: str,
dwd_cols: Sequence[str],
ods_cols: Sequence[str],
dwd_types: Dict[str, str],
ods_types: Dict[str, str],
) -> int:
        """Incrementally insert fact rows by time; by default write the intersection of column names."""
mapping_entries = self.FACT_MAPPINGS.get(dwd_table) or []
mapping: Dict[str, tuple[str, str | None]] = {
dst.lower(): (src, cast_type) for dst, src, cast_type in mapping_entries
}
mapping_dest = [dst for dst, _, _ in mapping_entries]
insert_cols: List[str] = list(mapping_dest)
for col in dwd_cols:
if col in self.SCD_COLS:
continue
if col in insert_cols:
continue
if col in ods_cols:
insert_cols.append(col)
pk_cols = self._get_primary_keys(cur, dwd_table)
ods_set = {c.lower() for c in ods_cols}
existing_lower = [c.lower() for c in insert_cols]
for pk in pk_cols:
pk_lower = pk.lower()
if pk_lower in existing_lower:
continue
if pk_lower in ods_set:
insert_cols.append(pk)
existing_lower.append(pk_lower)
elif "id" in ods_set:
insert_cols.append(pk)
existing_lower.append(pk_lower)
mapping[pk_lower] = ("id", None)
        # De-duplicate while preserving column order
seen_cols: set[str] = set()
ordered_cols: list[str] = []
for col in insert_cols:
lc = col.lower()
if lc not in seen_cols:
seen_cols.add(lc)
ordered_cols.append(col)
insert_cols = ordered_cols
if not insert_cols:
            self.logger.warning("Skipping %s: no insertable columns found", dwd_table)
return 0
order_col = self._pick_order_column(dwd_cols, ods_cols)
where_sql = ""
params: List[Any] = []
dwd_table_sql = self._format_table(dwd_table, "billiards_dwd")
ods_table_sql = self._format_table(ods_table, "billiards_ods")
if order_col:
cur.execute(f'SELECT COALESCE(MAX("{order_col}"), %s) FROM {dwd_table_sql}', ("1970-01-01",))
row = cur.fetchone() or {}
watermark = list(row.values())[0] if row else "1970-01-01"
where_sql = f'WHERE "{order_col}" > %s'
params.append(watermark)
default_cols = [c for c in insert_cols if c.lower() not in mapping]
default_expr_map: Dict[str, str] = {}
if default_cols:
default_exprs = self._build_fact_select_exprs(default_cols, dwd_types, ods_types)
default_expr_map = dict(zip(default_cols, default_exprs))
select_exprs: List[str] = []
for col in insert_cols:
key = col.lower()
if key in mapping:
src, cast_type = mapping[key]
select_exprs.append(self._cast_expr(src, cast_type))
else:
select_exprs.append(default_expr_map[col])
select_cols_sql = ", ".join(select_exprs)
insert_cols_sql = ", ".join(f'"{c}"' for c in insert_cols)
sql = f'INSERT INTO {dwd_table_sql} ({insert_cols_sql}) SELECT {select_cols_sql} FROM {ods_table_sql} {where_sql}'
        if pk_cols:
            pk_sql = ", ".join(f'"{c}"' for c in pk_cols)
            sql += f" ON CONFLICT ({pk_sql}) DO NOTHING"
cur.execute(sql, params)
return cur.rowcount
def _pick_order_column(self, dwd_cols: Iterable[str], ods_cols: Iterable[str]) -> str | None:
        """Pick the timestamp column for incremental loads (must exist in both DWD and ODS)."""
lower_cols = {c.lower() for c in dwd_cols} & {c.lower() for c in ods_cols}
for candidate in self.FACT_ORDER_CANDIDATES:
if candidate.lower() in lower_cols:
return candidate.lower()
return None
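Watermark-column selection can be sketched standalone. The intersection logic below mirrors `_pick_order_column`; the candidate list is abbreviated for illustration:

```python
# Pick the first candidate timestamp column present in both DWD and ODS;
# this column then drives the MAX()-watermark incremental WHERE filter.
FACT_ORDER_CANDIDATES = ["fetched_at", "pay_time", "create_time", "update_time"]

def pick_order_column(dwd_cols, ods_cols, candidates=FACT_ORDER_CANDIDATES):
    common = {c.lower() for c in dwd_cols} & {c.lower() for c in ods_cols}
    for cand in candidates:
        if cand.lower() in common:
            return cand.lower()
    return None  # no shared time column: full insert, de-duplicated via ON CONFLICT

print(pick_order_column(["id", "pay_time", "create_time"], ["id", "create_time"]))  # create_time
print(pick_order_column(["id"], ["id"]))  # None
```

Candidate order matters: `fetched_at` wins when present, so re-fetched but unchanged business rows are still picked up.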
def _build_fact_select_exprs(
self,
insert_cols: Sequence[str],
dwd_types: Dict[str, str],
ods_types: Dict[str, str],
) -> List[str]:
        """Build the fact-table SELECT list, casting types where needed."""
numeric_types = {"integer", "bigint", "smallint", "numeric", "double precision", "real", "decimal"}
text_types = {"text", "character varying", "varchar"}
exprs = []
for col in insert_cols:
d_type = dwd_types.get(col)
o_type = ods_types.get(col)
if d_type in numeric_types and o_type in text_types:
                exprs.append(f"CAST(NULLIF(CAST(\"{col}\" AS text), '') AS numeric)::{d_type}")
else:
exprs.append(f'"{col}"')
return exprs
def _split_table_name(self, name: str, default_schema: str) -> tuple[str, str]:
        """Split schema.table; fall back to the default schema when none is given."""
parts = name.split(".")
if len(parts) == 2:
return parts[0], parts[1].lower()
return default_schema, name.lower()
def _table_base(self, name: str) -> str:
        """Return the table name without its schema."""
return name.split(".")[-1]
def _format_table(self, name: str, default_schema: str) -> str:
        """Return the quoted schema.table name."""
schema, table = self._split_table_name(name, default_schema)
return f'"{schema}"."{table}"'
def _cast_expr(self, col: str, cast_type: str | None) -> str:
        """Build a column expression with an optional CAST."""
if col.upper() == "NULL":
base = "NULL"
else:
is_expr = not col.isidentifier() or "->" in col or "#>>" in col or "::" in col or "'" in col
base = col if is_expr else f'"{col}"'
if cast_type:
cast_lower = cast_type.lower()
if cast_lower in {"bigint", "integer", "numeric", "decimal"}:
                return f"CAST(NULLIF(CAST({base} AS text), '') AS numeric)::{cast_type}"
if cast_lower == "timestamptz":
return f"({base})::timestamptz"
return f"{base}::{cast_type}"
return base


@@ -0,0 +1,105 @@
# -*- coding: utf-8 -*-
"""DWD quality-check task: emit the row-count/amount comparison report described in dwd_quality_check.md."""
from __future__ import annotations
import json
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, Iterable, List, Sequence, Tuple
from psycopg2.extras import RealDictCursor
from .base_task import BaseTask, TaskContext
from .dwd_load_task import DwdLoadTask
class DwdQualityTask(BaseTask):
    """Cross-check row counts and amounts between ODS and DWD; write a JSON report."""
REPORT_PATH = Path("etl_billiards/reports/dwd_quality_report.json")
AMOUNT_KEYWORDS = ("amount", "money", "fee", "balance")
def get_task_code(self) -> str:
        """Return the task code."""
return "DWD_QUALITY_CHECK"
def extract(self, context: TaskContext) -> dict[str, Any]:
        """Prepare the runtime context."""
return {"now": datetime.now()}
def load(self, extracted: dict[str, Any], context: TaskContext) -> dict[str, Any]:
        """Write the row-count/amount discrepancy report to a local file."""
report: Dict[str, Any] = {
"generated_at": extracted["now"].isoformat(),
"tables": [],
            "note": "Row-count/amount cross-check; amount columns are numeric columns auto-scanned by name containing amount/money/fee/balance.",
}
with self.db.conn.cursor(cursor_factory=RealDictCursor) as cur:
for dwd_table, ods_table in DwdLoadTask.TABLE_MAP.items():
count_info = self._compare_counts(cur, dwd_table, ods_table)
amount_info = self._compare_amounts(cur, dwd_table, ods_table)
report["tables"].append(
{
"dwd_table": dwd_table,
"ods_table": ods_table,
"count": count_info,
"amounts": amount_info,
}
)
self.REPORT_PATH.parent.mkdir(parents=True, exist_ok=True)
self.REPORT_PATH.write_text(json.dumps(report, ensure_ascii=False, indent=2), encoding="utf-8")
self.logger.info("DWD 质检报表已生成:%s", self.REPORT_PATH)
return {"report_path": str(self.REPORT_PATH)}
# ---------------------- helpers ----------------------
def _compare_counts(self, cur, dwd_table: str, ods_table: str) -> Dict[str, Any]:
"""统计两端行数并返回差异。"""
dwd_schema, dwd_name = self._split_table_name(dwd_table, default_schema="billiards_dwd")
ods_schema, ods_name = self._split_table_name(ods_table, default_schema="billiards_ods")
cur.execute(f'SELECT COUNT(1) AS cnt FROM "{dwd_schema}"."{dwd_name}"')
dwd_cnt = cur.fetchone()["cnt"]
cur.execute(f'SELECT COUNT(1) AS cnt FROM "{ods_schema}"."{ods_name}"')
ods_cnt = cur.fetchone()["cnt"]
return {"dwd": dwd_cnt, "ods": ods_cnt, "diff": dwd_cnt - ods_cnt}
def _compare_amounts(self, cur, dwd_table: str, ods_table: str) -> List[Dict[str, Any]]:
"""扫描金额相关列,生成 ODS 与 DWD 的汇总对照。"""
dwd_schema, dwd_name = self._split_table_name(dwd_table, default_schema="billiards_dwd")
ods_schema, ods_name = self._split_table_name(ods_table, default_schema="billiards_ods")
dwd_amount_cols = self._get_numeric_amount_columns(cur, dwd_schema, dwd_name)
ods_amount_cols = self._get_numeric_amount_columns(cur, ods_schema, ods_name)
common_amount_cols = sorted(set(dwd_amount_cols) & set(ods_amount_cols))
results: List[Dict[str, Any]] = []
for col in common_amount_cols:
cur.execute(f'SELECT COALESCE(SUM("{col}"),0) AS val FROM "{dwd_schema}"."{dwd_name}"')
dwd_sum = cur.fetchone()["val"]
cur.execute(f'SELECT COALESCE(SUM("{col}"),0) AS val FROM "{ods_schema}"."{ods_name}"')
ods_sum = cur.fetchone()["val"]
results.append({"column": col, "dwd_sum": float(dwd_sum or 0), "ods_sum": float(ods_sum or 0), "diff": float(dwd_sum or 0) - float(ods_sum or 0)})
return results
def _get_numeric_amount_columns(self, cur, schema: str, table: str) -> List[str]:
"""获取列名包含金额关键词的数值型字段。"""
cur.execute(
"""
SELECT column_name
FROM information_schema.columns
WHERE table_schema = %s
AND table_name = %s
AND data_type IN ('numeric','double precision','integer','bigint','smallint','real','decimal')
""",
(schema, table),
)
cols = [r["column_name"].lower() for r in cur.fetchall()]
return [c for c in cols if any(key in c for key in self.AMOUNT_KEYWORDS)]
def _split_table_name(self, name: str, default_schema: str) -> Tuple[str, str]:
"""拆分 schema 与表名,缺省使用 default_schema。"""
parts = name.split(".")
if len(parts) == 2:
return parts[0], parts[1]
return default_schema, name
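The amount-column selection used by `_compare_amounts` boils down to a keyword scan over lowercased column names; a minimal sketch (hypothetical `amount_columns` helper mirroring the filter above) behaves like this:

```python
AMOUNT_KEYWORDS = ("amount", "money", "fee", "balance")

def amount_columns(columns: list[str]) -> list[str]:
    """Lowercase every column name, then keep those containing an
    amount keyword, as _get_numeric_amount_columns does."""
    cols = [c.lower() for c in columns]
    return [c for c in cols if any(key in c for key in AMOUNT_KEYWORDS)]

print(amount_columns(["PayAmount", "create_time", "refund_fee", "memberBalance"]))
# → ['payamount', 'refund_fee', 'memberbalance']
```

Only columns present (and numeric) on both the ODS and DWD side are summed, so renamed amount fields silently drop out of the report.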

View File

@@ -0,0 +1,36 @@
# -*- coding: utf-8 -*-
"""初始化 DWD Schema执行 schema_dwd_doc.sql可选先 DROP SCHEMA。"""
from __future__ import annotations
from pathlib import Path
from typing import Any
from .base_task import BaseTask, TaskContext
class InitDwdSchemaTask(BaseTask):
"""通过调度执行 DWD schema 初始化。"""
def get_task_code(self) -> str:
"""返回任务编码。"""
return "INIT_DWD_SCHEMA"
def extract(self, context: TaskContext) -> dict[str, Any]:
"""读取 DWD SQL 文件与参数。"""
base_dir = Path(__file__).resolve().parents[1] / "database"
dwd_path = Path(self.config.get("schema.dwd_file", base_dir / "schema_dwd_doc.sql"))
if not dwd_path.exists():
raise FileNotFoundError(f"未找到 DWD schema 文件: {dwd_path}")
drop_first = self.config.get("dwd.drop_schema_first", False)
return {"dwd_sql": dwd_path.read_text(encoding="utf-8"), "dwd_file": str(dwd_path), "drop_first": drop_first}
def load(self, extracted: dict[str, Any], context: TaskContext) -> dict:
"""可选 DROP schema再执行 DWD DDL。"""
with self.db.conn.cursor() as cur:
if extracted["drop_first"]:
cur.execute("DROP SCHEMA IF EXISTS billiards_dwd CASCADE;")
self.logger.info("已执行 DROP SCHEMA billiards_dwd CASCADE")
self.logger.info("执行 DWD schema 文件: %s", extracted["dwd_file"])
cur.execute(extracted["dwd_sql"])
return {"executed": 1, "files": [extracted["dwd_file"]]}

View File

@@ -0,0 +1,73 @@
# -*- coding: utf-8 -*-
"""任务:初始化运行环境,执行 ODS 与 etl_admin 的 DDL并准备日志/导出目录。"""
from __future__ import annotations
from pathlib import Path
from typing import Any
from .base_task import BaseTask, TaskContext
class InitOdsSchemaTask(BaseTask):
"""通过调度执行初始化:创建必要目录,执行 ODS 与 etl_admin 的 DDL。"""
def get_task_code(self) -> str:
"""返回任务编码。"""
return "INIT_ODS_SCHEMA"
def extract(self, context: TaskContext) -> dict[str, Any]:
"""读取 SQL 文件路径,收集需创建的目录。"""
base_dir = Path(__file__).resolve().parents[1] / "database"
ods_path = Path(self.config.get("schema.ods_file", base_dir / "schema_ODS_doc.sql"))
admin_path = Path(self.config.get("schema.etl_admin_file", base_dir / "schema_etl_admin.sql"))
if not ods_path.exists():
raise FileNotFoundError(f"ODS schema file not found: {ods_path}")
if not admin_path.exists():
raise FileNotFoundError(f"etl_admin schema file not found: {admin_path}")
log_root = Path(self.config.get("io.log_root") or self.config["io"]["log_root"])
export_root = Path(self.config.get("io.export_root") or self.config["io"]["export_root"])
fetch_root = Path(self.config.get("pipeline.fetch_root") or self.config["pipeline"]["fetch_root"])
ingest_dir = Path(self.config.get("pipeline.ingest_source_dir") or fetch_root)
return {
"ods_sql": ods_path.read_text(encoding="utf-8"),
"admin_sql": admin_path.read_text(encoding="utf-8"),
"ods_file": str(ods_path),
"admin_file": str(admin_path),
"dirs": [log_root, export_root, fetch_root, ingest_dir],
}
def load(self, extracted: dict[str, Any], context: TaskContext) -> dict:
"""执行 DDL 并创建必要目录。
安全提示:
ODS DDL 文件可能携带头部说明或异常注释,为避免因非 SQL 文本导致执行失败,这里会做一次轻量清洗后再执行。
"""
for d in extracted["dirs"]:
Path(d).mkdir(parents=True, exist_ok=True)
self.logger.info("已确保目录存在: %s", d)
# Clean the ODS SQL: drop the header prose plus error-prone COMMENT ON lines (e.g. unquoted CamelCase identifiers)
ods_sql_raw: str = extracted["ods_sql"]
drop_idx = ods_sql_raw.find("DROP SCHEMA")
if drop_idx > 0:
ods_sql_raw = ods_sql_raw[drop_idx:]
cleaned_lines: list[str] = []
for line in ods_sql_raw.splitlines():
if line.strip().upper().startswith("COMMENT ON "):
continue
cleaned_lines.append(line)
ods_sql = "\n".join(cleaned_lines)
with self.db.conn.cursor() as cur:
self.logger.info("执行 etl_admin schema 文件: %s", extracted["admin_file"])
cur.execute(extracted["admin_sql"])
self.logger.info("执行 ODS schema 文件: %s", extracted["ods_file"])
cur.execute(ods_sql)
return {
"executed": 2,
"files": [extracted["admin_file"], extracted["ods_file"]],
"dirs_prepared": [str(p) for p in extracted["dirs"]],
}
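The header/COMMENT cleanup in `load()` can be sketched as a standalone helper (hypothetical `clean_ods_sql`, mirroring the line-filtering logic above):

```python
def clean_ods_sql(raw_sql: str) -> str:
    """Drop any prose before the first DROP SCHEMA, then strip
    COMMENT ON lines, as InitOdsSchemaTask.load() does."""
    idx = raw_sql.find("DROP SCHEMA")
    if idx > 0:
        raw_sql = raw_sql[idx:]
    return "\n".join(
        line for line in raw_sql.splitlines()
        if not line.strip().upper().startswith("COMMENT ON ")
    )

sample = "-- header notes\nDROP SCHEMA IF EXISTS billiards_ods CASCADE;\nCOMMENT ON TABLE t IS 'x';\nCREATE TABLE t (id int);"
print(clean_ods_sql(sample))
```

Note this is line-based and intentionally lightweight: a multi-line COMMENT ON statement would only lose its first line, so the DDL file is expected to keep comments on single lines.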

View File

@@ -3,7 +3,7 @@
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.facts.inventory_change import InventoryChangeLoader
from models.parsers import TypeParser
@@ -14,56 +14,56 @@ class InventoryChangeTask(BaseTask):
def get_task_code(self) -> str:
return "INVENTORY_CHANGE"
def execute(self) -> dict:
self.logger.info("开始执行 INVENTORY_CHANGE 任务")
window_start, window_end, _ = self._get_time_window()
params = {
"storeId": self.config.get("app.store_id"),
"startTime": TypeParser.format_timestamp(window_start, self.tz),
"endTime": TypeParser.format_timestamp(window_end, self.tz),
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params(
{
"siteId": context.store_id,
"startTime": TypeParser.format_timestamp(context.window_start, self.tz),
"endTime": TypeParser.format_timestamp(context.window_end, self.tz),
}
)
records, _ = self.api.get_paginated(
endpoint="/GoodsStockManage/QueryGoodsOutboundReceipt",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
list_key="queryDeliveryRecordsList",
)
return {"records": records}
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
mapped = self._parse_change(raw, context.store_id)
if mapped:
parsed.append(mapped)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
try:
records, _ = self.api.get_paginated(
endpoint="/Inventory/ChangeList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data", "queryDeliveryRecordsList"),
)
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = InventoryChangeLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_changes(transformed["records"])
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
parsed = []
for raw in records:
mapped = self._parse_change(raw)
if mapped:
parsed.append(mapped)
loader = InventoryChangeLoader(self.db)
inserted, updated, skipped = loader.upsert_changes(parsed)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0,
}
self.logger.info(f"INVENTORY_CHANGE 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception:
self.db.rollback()
self.logger.error("INVENTORY_CHANGE 失败", exc_info=True)
raise
def _parse_change(self, raw: dict) -> dict | None:
def _parse_change(self, raw: dict, store_id: int) -> dict | None:
change_id = TypeParser.parse_int(
raw.get("siteGoodsStockId") or raw.get("site_goods_stock_id")
)
if not change_id:
self.logger.warning("跳过缺少变动 id 的库存记录: %s", raw)
self.logger.warning("跳过缺少库存变动ID的记录: %s", raw)
return None
store_id = self.config.get("app.store_id")
return {
"store_id": store_id,
"change_id": change_id,

View File

@@ -3,7 +3,7 @@
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.facts.assistant_ledger import AssistantLedgerLoader
from models.parsers import TypeParser
@@ -14,54 +14,54 @@ class LedgerTask(BaseTask):
def get_task_code(self) -> str:
return "LEDGER"
def execute(self) -> dict:
self.logger.info("开始执行 LEDGER 任务")
window_start, window_end, _ = self._get_time_window()
params = {
"storeId": self.config.get("app.store_id"),
"startTime": TypeParser.format_timestamp(window_start, self.tz),
"endTime": TypeParser.format_timestamp(window_end, self.tz),
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params(
{
"siteId": context.store_id,
"startTime": TypeParser.format_timestamp(context.window_start, self.tz),
"endTime": TypeParser.format_timestamp(context.window_end, self.tz),
}
)
records, _ = self.api.get_paginated(
endpoint="/AssistantPerformance/GetOrderAssistantDetails",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
list_key="orderAssistantDetails",
)
return {"records": records}
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
mapped = self._parse_ledger(raw, context.store_id)
if mapped:
parsed.append(mapped)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
try:
records, _ = self.api.get_paginated(
endpoint="/Assistant/LedgerList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data", "orderAssistantDetails"),
)
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = AssistantLedgerLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_ledgers(transformed["records"])
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
parsed = []
for raw in records:
mapped = self._parse_ledger(raw)
if mapped:
parsed.append(mapped)
loader = AssistantLedgerLoader(self.db)
inserted, updated, skipped = loader.upsert_ledgers(parsed)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0,
}
self.logger.info(f"LEDGER 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception:
self.db.rollback()
self.logger.error("LEDGER 失败", exc_info=True)
raise
def _parse_ledger(self, raw: dict) -> dict | None:
def _parse_ledger(self, raw: dict, store_id: int) -> dict | None:
ledger_id = TypeParser.parse_int(raw.get("id"))
if not ledger_id:
self.logger.warning("跳过缺少 id 的助教流水: %s", raw)
self.logger.warning("跳过缺少助教流水ID的记录: %s", raw)
return None
store_id = self.config.get("app.store_id")
return {
"store_id": store_id,
"ledger_id": ledger_id,
@@ -100,12 +100,8 @@ class LedgerTask(BaseTask):
"ledger_end_time": TypeParser.parse_timestamp(
raw.get("ledger_end_time"), self.tz
),
"start_use_time": TypeParser.parse_timestamp(
raw.get("start_use_time"), self.tz
),
"last_use_time": TypeParser.parse_timestamp(
raw.get("last_use_time"), self.tz
),
"start_use_time": TypeParser.parse_timestamp(raw.get("start_use_time"), self.tz),
"last_use_time": TypeParser.parse_timestamp(raw.get("last_use_time"), self.tz),
"income_seconds": TypeParser.parse_int(raw.get("income_seconds")),
"real_use_seconds": TypeParser.parse_int(raw.get("real_use_seconds")),
"is_trash": raw.get("is_trash"),

View File

@@ -0,0 +1,347 @@
# -*- coding: utf-8 -*-
"""手工示例数据灌入:按 schema_ODS_doc.sql 的表结构写入 ODS。"""
from __future__ import annotations
import json
import os
from datetime import datetime
from typing import Any, Iterable
from psycopg2.extras import Json
from .base_task import BaseTask
class ManualIngestTask(BaseTask):
"""本地示例 JSON 灌入 ODS确保表名/主键/插入列与 schema_ODS_doc.sql 对齐。"""
FILE_MAPPING: list[tuple[tuple[str, ...], str]] = [
(("member_profiles",), "billiards_ods.member_profiles"),
(("member_balance_changes",), "billiards_ods.member_balance_changes"),
(("member_stored_value_cards",), "billiards_ods.member_stored_value_cards"),
(("recharge_settlements",), "billiards_ods.recharge_settlements"),
(("settlement_records",), "billiards_ods.settlement_records"),
(("assistant_cancellation_records",), "billiards_ods.assistant_cancellation_records"),
(("assistant_accounts_master",), "billiards_ods.assistant_accounts_master"),
(("assistant_service_records",), "billiards_ods.assistant_service_records"),
(("site_tables_master",), "billiards_ods.site_tables_master"),
(("table_fee_discount_records",), "billiards_ods.table_fee_discount_records"),
(("table_fee_transactions",), "billiards_ods.table_fee_transactions"),
(("goods_stock_movements",), "billiards_ods.goods_stock_movements"),
(("stock_goods_category_tree",), "billiards_ods.stock_goods_category_tree"),
(("goods_stock_summary",), "billiards_ods.goods_stock_summary"),
(("payment_transactions",), "billiards_ods.payment_transactions"),
(("refund_transactions",), "billiards_ods.refund_transactions"),
(("platform_coupon_redemption_records",), "billiards_ods.platform_coupon_redemption_records"),
(("group_buy_redemption_records",), "billiards_ods.group_buy_redemption_records"),
(("group_buy_packages",), "billiards_ods.group_buy_packages"),
(("settlement_ticket_details",), "billiards_ods.settlement_ticket_details"),
(("store_goods_master",), "billiards_ods.store_goods_master"),
(("tenant_goods_master",), "billiards_ods.tenant_goods_master"),
(("store_goods_sales_records",), "billiards_ods.store_goods_sales_records"),
]
TABLE_SPECS: dict[str, dict[str, Any]] = {
"billiards_ods.member_profiles": {"pk": "id"},
"billiards_ods.member_balance_changes": {"pk": "id"},
"billiards_ods.member_stored_value_cards": {"pk": "id"},
"billiards_ods.recharge_settlements": {"pk": "id"},
"billiards_ods.settlement_records": {"pk": "id"},
"billiards_ods.assistant_cancellation_records": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.assistant_accounts_master": {"pk": "id"},
"billiards_ods.assistant_service_records": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.site_tables_master": {"pk": "id"},
"billiards_ods.table_fee_discount_records": {"pk": "id", "json_cols": ["siteProfile", "tableProfile"]},
"billiards_ods.table_fee_transactions": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.goods_stock_movements": {"pk": "siteGoodsStockId"},
"billiards_ods.stock_goods_category_tree": {"pk": "id", "json_cols": ["categoryBoxes"]},
"billiards_ods.goods_stock_summary": {"pk": "siteGoodsId"},
"billiards_ods.payment_transactions": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.refund_transactions": {"pk": "id", "json_cols": ["siteProfile"]},
"billiards_ods.platform_coupon_redemption_records": {"pk": "id"},
"billiards_ods.tenant_goods_master": {"pk": "id"},
"billiards_ods.group_buy_packages": {"pk": "id"},
"billiards_ods.group_buy_redemption_records": {"pk": "id"},
"billiards_ods.settlement_ticket_details": {
"pk": "orderSettleId",
"json_cols": ["memberProfile", "orderItem", "tenantMemberCardLogs"],
},
"billiards_ods.store_goods_master": {"pk": "id"},
"billiards_ods.store_goods_sales_records": {"pk": "id"},
}
def get_task_code(self) -> str:
"""返回任务编码。"""
return "MANUAL_INGEST"
def execute(self, cursor_data: dict | None = None) -> dict:
"""从目录读取 JSON按表定义批量入库。"""
data_dir = (
self.config.get("manual.data_dir")
or self.config.get("pipeline.ingest_source_dir")
or r"c:\dev\LLTQ\ETL\feiqiu-ETL\etl_billiards\tests\testdata_json"
)
if not os.path.exists(data_dir):
self.logger.error("Data directory not found: %s", data_dir)
return {"status": "error", "message": "Directory not found"}
counts = {"fetched": 0, "inserted": 0, "updated": 0, "skipped": 0, "errors": 0}
for filename in sorted(os.listdir(data_dir)):
if not filename.endswith(".json"):
continue
filepath = os.path.join(data_dir, filename)
try:
with open(filepath, "r", encoding="utf-8") as fh:
raw_entries = json.load(fh)
except Exception:
counts["errors"] += 1
self.logger.exception("Failed to read %s", filename)
continue
entries = raw_entries if isinstance(raw_entries, list) else [raw_entries]
records = self._extract_records(entries)
if not records:
counts["skipped"] += 1
continue
target_table = self._match_by_filename(filename)
if not target_table:
self.logger.warning("No mapping found for file: %s", filename)
counts["skipped"] += 1
continue
self.logger.info("Ingesting %s into %s", filename, target_table)
try:
inserted, updated = self._ingest_table(target_table, records, filename)
counts["inserted"] += inserted
counts["updated"] += updated
counts["fetched"] += len(records)
except Exception:
counts["errors"] += 1
self.logger.exception("Error processing %s", filename)
self.db.rollback()
continue
try:
self.db.commit()
except Exception:
self.db.rollback()
raise
return {"status": "SUCCESS", "counts": counts}
def _match_by_filename(self, filename: str) -> str | None:
"""根据文件名关键字匹配目标表。"""
for keywords, table in self.FILE_MAPPING:
if any(keyword and keyword in filename for keyword in keywords):
return table
return None
def _extract_records(self, raw_entries: Iterable[Any]) -> list[dict]:
"""兼容多层 data/list 包装,抽取记录列表。"""
records: list[dict] = []
for entry in raw_entries:
if isinstance(entry, dict):
preferred = entry
if "data" in entry and not any(k not in {"data", "code"} for k in entry.keys()):
preferred = entry["data"]
data = preferred
if isinstance(data, dict):
# Special-case settleList (recharge/settlement records): expand the settleList under data.settleList and discard the outer siteProfile
if "settleList" in data:
settle_list_val = data.get("settleList")
if isinstance(settle_list_val, dict):
settle_list_iter = [settle_list_val]
elif isinstance(settle_list_val, list):
settle_list_iter = settle_list_val
else:
settle_list_iter = []
handled = False
for item in settle_list_iter or []:
if not isinstance(item, dict):
continue
inner = item.get("settleList")
merged = dict(inner) if isinstance(inner, dict) else dict(item)
# Keep siteProfile for later field enrichment, but do not persist it
site_profile = data.get("siteProfile")
if isinstance(site_profile, dict):
merged.setdefault("siteProfile", site_profile)
records.append(merged)
handled = True
if handled:
continue
list_used = False
for v in data.values():
if isinstance(v, list) and v and isinstance(v[0], dict):
records.extend(v)
list_used = True
break
if list_used:
continue
if isinstance(data, list) and data and isinstance(data[0], dict):
records.extend(data)
elif isinstance(data, dict):
records.append(data)
elif isinstance(entry, list):
records.extend([item for item in entry if isinstance(item, dict)])
return records
def _get_table_columns(self, table: str) -> list[tuple[str, str, str]]:
"""查询 information_schema获取目标表列信息。"""
cache = getattr(self, "_table_columns_cache", {})
if table in cache:
return cache[table]
if "." in table:
schema, name = table.split(".", 1)
else:
schema, name = "public", table
sql = """
SELECT column_name, data_type, udt_name
FROM information_schema.columns
WHERE table_schema = %s AND table_name = %s
ORDER BY ordinal_position
"""
with self.db.conn.cursor() as cur:
cur.execute(sql, (schema, name))
cols = [(r[0], (r[1] or "").lower(), (r[2] or "").lower()) for r in cur.fetchall()]
cache[table] = cols
self._table_columns_cache = cache
return cols
def _ingest_table(self, table: str, records: list[dict], source_file: str) -> tuple[int, int]:
"""构建 INSERT/ON CONFLICT 语句并批量执行。"""
spec = self.TABLE_SPECS.get(table)
if not spec:
raise ValueError(f"No table spec for {table}")
pk_col = spec.get("pk")
json_cols = set(spec.get("json_cols", []))
json_cols_lower = {c.lower() for c in json_cols}
columns_info = self._get_table_columns(table)
columns = [c[0] for c in columns_info]
db_json_cols_lower = {
c[0].lower() for c in columns_info if c[1] in ("json", "jsonb") or c[2] in ("json", "jsonb")
}
pk_col_db = None
if pk_col:
pk_col_db = next((c for c in columns if c.lower() == pk_col.lower()), pk_col)
placeholders = ", ".join(["%s"] * len(columns))
col_list = ", ".join(f'"{c}"' for c in columns)
sql = f'INSERT INTO {table} ({col_list}) VALUES ({placeholders})'
if pk_col_db:
update_cols = [c for c in columns if c != pk_col_db]
set_clause = ", ".join(f'"{c}"=EXCLUDED."{c}"' for c in update_cols)
sql += f' ON CONFLICT ("{pk_col_db}") DO UPDATE SET {set_clause}'
sql += " RETURNING (xmax = 0) AS inserted"
params = []
now = datetime.now()
json_dump = lambda v: json.dumps(v, ensure_ascii=False) # noqa: E731
for rec in records:
merged_rec = rec if isinstance(rec, dict) else {}
data_part = merged_rec.get("data")
while isinstance(data_part, dict):
merged_rec = {**data_part, **merged_rec}
data_part = data_part.get("data")
# For recharge/settlement records, backfill store fields from siteProfile
if table in {
"billiards_ods.recharge_settlements",
"billiards_ods.settlement_records",
}:
site_profile = merged_rec.get("siteProfile") or merged_rec.get("site_profile")
if isinstance(site_profile, dict):
merged_rec.setdefault("tenantid", site_profile.get("tenant_id") or site_profile.get("tenantId"))
merged_rec.setdefault("siteid", site_profile.get("id") or site_profile.get("siteId"))
merged_rec.setdefault("sitename", site_profile.get("shop_name") or site_profile.get("siteName"))
pk_val = self._get_value_case_insensitive(merged_rec, pk_col) if pk_col else None
if pk_col and (pk_val is None or pk_val == ""):
continue
row_vals = []
for col_name, data_type, udt in columns_info:
col_lower = col_name.lower()
if col_lower == "payload":
row_vals.append(Json(rec, dumps=json_dump))
continue
if col_lower == "source_file":
row_vals.append(source_file)
continue
if col_lower == "fetched_at":
row_vals.append(merged_rec.get(col_name, now))
continue
value = self._normalize_scalar(self._get_value_case_insensitive(merged_rec, col_name))
if col_lower in json_cols_lower or col_lower in db_json_cols_lower:
row_vals.append(Json(value, dumps=json_dump) if value is not None else None)
continue
casted = self._cast_value(value, data_type)
row_vals.append(casted)
params.append(tuple(row_vals))
if not params:
return 0, 0
inserted = 0
updated = 0
with self.db.conn.cursor() as cur:
for row in params:
cur.execute(sql, row)
flag = cur.fetchone()[0]
if flag:
inserted += 1
else:
updated += 1
return inserted, updated
@staticmethod
def _get_value_case_insensitive(record: dict, col: str | None):
"""忽略大小写获取值,兼容 information_schema 与 JSON 原始字段。"""
if record is None or col is None:
return None
if col in record:
return record.get(col)
col_lower = col.lower()
for k, v in record.items():
if isinstance(k, str) and k.lower() == col_lower:
return v
return None
@staticmethod
def _normalize_scalar(value):
"""将空字符串/空 JSON 规范为 None避免类型转换错误。"""
if value == "" or value == "{}" or value == "[]":
return None
return value
@staticmethod
def _cast_value(value, data_type: str):
"""根据列类型做简单转换,保证批量插入兼容。"""
if value is None:
return None
dt = (data_type or "").lower()
if dt in ("integer", "bigint", "smallint"):
if isinstance(value, bool):
return int(value)
try:
return int(value)
except Exception:
return None
if dt in ("numeric", "double precision", "real", "decimal"):
if isinstance(value, bool):
return int(value)
try:
return float(value)
except Exception:
return None
if dt.startswith("timestamp") or dt in ("date", "time", "interval"):
return value if isinstance(value, str) else None
return value
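The upsert built in `_ingest_table` relies on PostgreSQL's `xmax` system column to tell inserts from updates; a minimal sketch of the SQL builder (hypothetical `build_upsert_sql`, mirroring the string assembly above) illustrates the shape:

```python
def build_upsert_sql(table: str, columns: list[str], pk: str) -> str:
    """Assemble INSERT ... ON CONFLICT ... RETURNING (xmax = 0), as
    _ingest_table does. (xmax = 0) is true only for freshly inserted
    rows, so the caller can count inserts vs updates per row."""
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f'INSERT INTO {table} ({col_list}) VALUES ({placeholders})'
    set_clause = ", ".join(f'"{c}"=EXCLUDED."{c}"' for c in columns if c != pk)
    sql += f' ON CONFLICT ("{pk}") DO UPDATE SET {set_clause}'
    return sql + " RETURNING (xmax = 0) AS inserted"

print(build_upsert_sql("billiards_ods.member_profiles", ["id", "name", "phone"], "id"))
```

One caveat: `xmax` is an implementation detail rather than documented API, and the `(xmax = 0)` test only distinguishes insert from update reliably when each row is upserted in its own statement, which is how `_ingest_table` executes it.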

View File

@@ -0,0 +1,90 @@
# -*- coding: utf-8 -*-
from .base_dwd_task import BaseDwdTask
from loaders.dimensions.member import MemberLoader
from models.parsers import TypeParser
import json
class MembersDwdTask(BaseDwdTask):
"""
DWD Task: Process Member Records from ODS to Dimension Table
Source: billiards_ods.member_profiles
Target: billiards.dim_member
"""
def get_task_code(self) -> str:
return "MEMBERS_DWD"
def execute(self) -> dict:
self.logger.info(f"Starting {self.get_task_code()} task")
window_start, window_end, _ = self._get_time_window()
self.logger.info(f"Processing window: {window_start} to {window_end}")
loader = MemberLoader(self.db)
store_id = self.config.get("app.store_id")
total_inserted = 0
total_updated = 0
total_errors = 0
# Iterate ODS Data
batches = self.iter_ods_rows(
table_name="billiards_ods.member_profiles",
columns=["site_id", "member_id", "payload", "fetched_at"],
start_time=window_start,
end_time=window_end
)
for batch in batches:
if not batch:
continue
parsed_rows = []
for row in batch:
payload = self.parse_payload(row)
if not payload:
continue
parsed = self._parse_member(payload, store_id)
if parsed:
parsed_rows.append(parsed)
if parsed_rows:
inserted, updated, skipped = loader.upsert_members(parsed_rows, store_id)
total_inserted += inserted
total_updated += updated
self.db.commit()
self.logger.info(f"Task {self.get_task_code()} completed. Inserted: {total_inserted}, Updated: {total_updated}")
return {
"status": "success",
"inserted": total_inserted,
"updated": total_updated,
"window_start": window_start.isoformat(),
"window_end": window_end.isoformat()
}
def _parse_member(self, raw: dict, store_id: int) -> dict | None:
"""Parse ODS payload into Dim structure"""
try:
# Handle both API structure (camelCase) and manual structure
member_id = raw.get("id") or raw.get("memberId")
if not member_id:
return None
return {
"store_id": store_id,
"member_id": member_id,
"member_name": raw.get("name") or raw.get("memberName"),
"phone": raw.get("phone") or raw.get("mobile"),
"balance": raw.get("balance", 0),
"status": str(raw.get("status", "NORMAL")),
"register_time": raw.get("createTime") or raw.get("registerTime"),
"raw_data": json.dumps(raw, ensure_ascii=False)
}
except Exception as e:
self.logger.warning(f"Error parsing member: {e}")
return None

View File

@@ -1,73 +1,72 @@
# -*- coding: utf-8 -*-
"""会员ETL任务"""
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.dimensions.member import MemberLoader
from models.parsers import TypeParser
class MembersTask(BaseTask):
"""会员ETL任务"""
def get_task_code(self) -> str:
return "MEMBERS"
def execute(self) -> dict:
"""执行会员ETL"""
self.logger.info(f"开始执行 {self.get_task_code()} 任务")
params = {
"storeId": self.config.get("app.store_id"),
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params({"siteId": context.store_id})
records, _ = self.api.get_paginated(
endpoint="/MemberProfile/GetTenantMemberList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
list_key="tenantMemberInfos",
)
return {"records": records}
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
parsed_row = self._parse_member(raw, context.store_id)
if parsed_row:
parsed.append(parsed_row)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
try:
records, pages_meta = self.api.get_paginated(
endpoint="/MemberProfile/GetTenantMemberList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",)
)
parsed_records = []
for rec in records:
parsed = self._parse_member(rec)
if parsed:
parsed_records.append(parsed)
loader = MemberLoader(self.db)
store_id = self.config.get("app.store_id")
inserted, updated, skipped = loader.upsert_members(parsed_records, store_id)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0
}
self.logger.info(f"{self.get_task_code()} 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception as e:
self.db.rollback()
self.logger.error(f"{self.get_task_code()} 失败", exc_info=True)
raise
def _parse_member(self, raw: dict) -> dict:
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = MemberLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_members(
transformed["records"], context.store_id
)
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
def _parse_member(self, raw: dict, store_id: int) -> dict | None:
"""解析会员记录"""
try:
member_id = TypeParser.parse_int(raw.get("memberId"))
if not member_id:
return None
return {
"store_id": self.config.get("app.store_id"),
"member_id": TypeParser.parse_int(raw.get("memberId")),
"store_id": store_id,
"member_id": member_id,
"member_name": raw.get("memberName"),
"phone": raw.get("phone"),
"balance": TypeParser.parse_decimal(raw.get("balance")),
"status": raw.get("status"),
"register_time": TypeParser.parse_timestamp(raw.get("registerTime"), self.tz),
"raw_data": json.dumps(raw, ensure_ascii=False)
"raw_data": json.dumps(raw, ensure_ascii=False),
}
except Exception as e:
self.logger.warning(f"解析会员记录失败: {e}, 原始数据: {raw}")
except Exception as exc:
self.logger.warning("解析会员记录失败: %s, 原始数据: %s", exc, raw)
return None

File diff suppressed because it is too large

View File

@@ -1,80 +1,77 @@
# -*- coding: utf-8 -*-
"""订单ETL任务"""
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.facts.order import OrderLoader
from models.parsers import TypeParser
class OrdersTask(BaseTask):
"""订单数据ETL任务"""
def get_task_code(self) -> str:
return "ORDERS"
def execute(self) -> dict:
"""执行订单数据ETL"""
self.logger.info(f"开始执行 {self.get_task_code()} 任务")
# 1. 获取时间窗口
window_start, window_end, window_minutes = self._get_time_window()
# 2. 调用API获取数据
params = {
"storeId": self.config.get("app.store_id"),
"startTime": TypeParser.format_timestamp(window_start, self.tz),
"endTime": TypeParser.format_timestamp(window_end, self.tz),
}
try:
records, pages_meta = self.api.get_paginated(
endpoint="/order/list",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",)
)
# 3. Parse and clean the data
parsed_records = []
for rec in records:
parsed = self._parse_order(rec)
if parsed:
parsed_records.append(parsed)
# 4. Load the data
loader = OrderLoader(self.db)
store_id = self.config.get("app.store_id")
inserted, updated, skipped = loader.upsert_orders(
parsed_records,
store_id
)
# 5. Commit the transaction
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0
# ------------------------------------------------------------------ E/T/L hooks
def extract(self, context: TaskContext) -> dict:
"""调用 API 拉取订单记录"""
params = self._merge_common_params(
{
"siteId": context.store_id,
"rangeStartTime": TypeParser.format_timestamp(context.window_start, self.tz),
"rangeEndTime": TypeParser.format_timestamp(context.window_end, self.tz),
}
self.logger.info(
f"{self.get_task_code()} 完成: {counts}"
)
return self._build_result("SUCCESS", counts)
except Exception as e:
self.db.rollback()
self.logger.error(f"{self.get_task_code()} 失败", exc_info=True)
raise
def _parse_order(self, raw: dict) -> dict:
)
records, pages_meta = self.api.get_paginated(
endpoint="/Site/GetAllOrderSettleList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
list_key="settleList",
)
return {"records": records, "meta": pages_meta}
def transform(self, extracted: dict, context: TaskContext) -> dict:
"""解析原始订单 JSON"""
parsed_records = []
skipped = 0
for rec in extracted.get("records", []):
parsed = self._parse_order(rec, context.store_id)
if parsed:
parsed_records.append(parsed)
else:
skipped += 1
return {
"records": parsed_records,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
def load(self, transformed: dict, context: TaskContext) -> dict:
"""写入 fact_order"""
loader = OrderLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_orders(
transformed["records"], context.store_id
)
counts = {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
return counts
# ------------------------------------------------------------------ helpers
def _parse_order(self, raw: dict, store_id: int) -> dict | None:
"""解析单条订单记录"""
try:
return {
"store_id": self.config.get("app.store_id"),
"store_id": store_id,
"order_id": TypeParser.parse_int(raw.get("orderId")),
"order_no": raw.get("orderNo"),
"member_id": TypeParser.parse_int(raw.get("memberId")),
@@ -87,8 +84,8 @@ class OrdersTask(BaseTask):
"pay_status": raw.get("payStatus"),
"order_status": raw.get("orderStatus"),
"remark": raw.get("remark"),
"raw_data": json.dumps(raw, ensure_ascii=False)
"raw_data": json.dumps(raw, ensure_ascii=False),
}
except Exception as e:
self.logger.warning(f"解析订单失败: {e}, 原始数据: {raw}")
except Exception as exc:
self.logger.warning("解析订单失败: %s, 原始数据: %s", exc, raw)
return None
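The orders task above splits the old monolithic `execute` into `extract`/`transform`/`load` hooks driven by a `TaskContext`. A minimal, self-contained sketch of that orchestration — `MiniTask`, `run`, and the stub data are illustrative stand-ins, not the project's actual `BaseTask`:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TaskContext:
    store_id: int
    window_start: datetime
    window_end: datetime


class MiniTask:
    """Sketch of the extract -> transform -> load split used by the tasks above."""

    def extract(self, ctx: TaskContext) -> dict:
        # Stand-in for a paginated API call; one record is missing its key.
        return {"records": [{"orderId": "1"}, {"orderId": None}]}

    def transform(self, extracted: dict, ctx: TaskContext) -> dict:
        records = extracted["records"]
        parsed = [r for r in records if r.get("orderId")]  # drop unparseable rows
        return {
            "records": parsed,
            "fetched": len(records),
            "skipped": len(records) - len(parsed),
        }

    def load(self, transformed: dict, ctx: TaskContext) -> dict:
        # Stand-in for an upsert loader; counts mirror the tasks' result dicts.
        return {
            "fetched": transformed["fetched"],
            "inserted": len(transformed["records"]),
            "skipped": transformed["skipped"],
        }

    def run(self, ctx: TaskContext) -> dict:
        return self.load(self.transform(self.extract(ctx), ctx), ctx)


ctx = TaskContext(1, datetime(2025, 1, 1), datetime(2025, 1, 2))
print(MiniTask().run(ctx))  # -> {'fetched': 2, 'inserted': 1, 'skipped': 1}
```

In the real tasks the base class also owns commit/rollback, so each hook stays free of transaction handling.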

View File

@@ -3,7 +3,7 @@
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.dimensions.package import PackageDefinitionLoader
from models.parsers import TypeParser
@@ -14,49 +14,48 @@ class PackagesDefTask(BaseTask):
def get_task_code(self) -> str:
return "PACKAGES_DEF"
def execute(self) -> dict:
self.logger.info("开始执行 PACKAGES_DEF 任务")
params = {"storeId": self.config.get("app.store_id")}
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params({"siteId": context.store_id})
records, _ = self.api.get_paginated(
endpoint="/PackageCoupon/QueryPackageCouponList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
list_key="packageCouponList",
)
return {"records": records}
try:
records, _ = self.api.get_paginated(
endpoint="/Package/List",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data", "packageCouponList"),
)
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
mapped = self._parse_package(raw, context.store_id)
if mapped:
parsed.append(mapped)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
parsed = []
for raw in records:
mapped = self._parse_package(raw)
if mapped:
parsed.append(mapped)
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = PackageDefinitionLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_packages(transformed["records"])
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
loader = PackageDefinitionLoader(self.db)
inserted, updated, skipped = loader.upsert_packages(parsed)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0,
}
self.logger.info(f"PACKAGES_DEF 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception:
self.db.rollback()
self.logger.error("PACKAGES_DEF 失败", exc_info=True)
raise
def _parse_package(self, raw: dict) -> dict | None:
def _parse_package(self, raw: dict, store_id: int) -> dict | None:
package_id = TypeParser.parse_int(raw.get("id"))
if not package_id:
self.logger.warning("跳过缺少 id 的套餐数据: %s", raw)
self.logger.warning("跳过缺少 package id 的套餐记录: %s", raw)
return None
store_id = self.config.get("app.store_id")
return {
"store_id": store_id,
"package_id": package_id,

View File

@@ -0,0 +1,139 @@
# -*- coding: utf-8 -*-
from .base_dwd_task import BaseDwdTask
from loaders.facts.payment import PaymentLoader
from models.parsers import TypeParser
import json
class PaymentsDwdTask(BaseDwdTask):
"""
DWD Task: Process Payment Records from ODS to Fact Table
Source: billiards_ods.ods_payment
Target: billiards.fact_payment
"""
def get_task_code(self) -> str:
return "PAYMENTS_DWD"
def execute(self) -> dict:
self.logger.info(f"Starting {self.get_task_code()} task")
window_start, window_end, _ = self._get_time_window()
self.logger.info(f"Processing window: {window_start} to {window_end}")
loader = PaymentLoader(self.db, logger=self.logger)
store_id = self.config.get("app.store_id")
total_inserted = 0
total_updated = 0
total_skipped = 0
# Iterate ODS Data
batches = self.iter_ods_rows(
table_name="billiards_ods.payment_transactions",
columns=["site_id", "pay_id", "payload", "fetched_at"],
start_time=window_start,
end_time=window_end
)
for batch in batches:
if not batch:
continue
parsed_rows = []
for row in batch:
payload = self.parse_payload(row)
if not payload:
continue
parsed = self._parse_payment(payload, store_id)
if parsed:
parsed_rows.append(parsed)
if parsed_rows:
inserted, updated, skipped = loader.upsert_payments(parsed_rows, store_id)
total_inserted += inserted
total_updated += updated
total_skipped += skipped
self.db.commit()
self.logger.info(
"Task %s completed. inserted=%s updated=%s skipped=%s",
self.get_task_code(),
total_inserted,
total_updated,
total_skipped,
)
return {
"status": "SUCCESS",
"counts": {
"inserted": total_inserted,
"updated": total_updated,
"skipped": total_skipped,
},
"window_start": window_start,
"window_end": window_end,
}
def _parse_payment(self, raw: dict, store_id: int) -> dict | None:
"""Parse ODS payload into Fact structure"""
try:
pay_id = TypeParser.parse_int(raw.get("payId") or raw.get("id"))
if not pay_id:
return None
relate_type = str(raw.get("relateType") or raw.get("relate_type") or "")
relate_id = TypeParser.parse_int(raw.get("relateId") or raw.get("relate_id"))
# Attempt to populate settlement / trade identifiers
order_settle_id = TypeParser.parse_int(
raw.get("orderSettleId") or raw.get("order_settle_id")
)
order_trade_no = TypeParser.parse_int(
raw.get("orderTradeNo") or raw.get("order_trade_no")
)
if relate_type in {"1", "SETTLE", "ORDER"}:
order_settle_id = order_settle_id or relate_id
return {
"store_id": store_id,
"pay_id": pay_id,
"order_id": TypeParser.parse_int(raw.get("orderId") or raw.get("order_id")),
"order_settle_id": order_settle_id,
"order_trade_no": order_trade_no,
"relate_type": relate_type,
"relate_id": relate_id,
"site_id": TypeParser.parse_int(
raw.get("siteId") or raw.get("site_id") or store_id
),
"tenant_id": TypeParser.parse_int(raw.get("tenantId") or raw.get("tenant_id")),
"create_time": TypeParser.parse_timestamp(
raw.get("createTime") or raw.get("create_time"), self.tz
),
"pay_time": TypeParser.parse_timestamp(raw.get("payTime"), self.tz),
"pay_amount": TypeParser.parse_decimal(raw.get("payAmount")),
"fee_amount": TypeParser.parse_decimal(
raw.get("feeAmount")
or raw.get("serviceFee")
or raw.get("channelFee")
or raw.get("fee_amount")
),
"discount_amount": TypeParser.parse_decimal(
raw.get("discountAmount")
or raw.get("couponAmount")
or raw.get("discount_amount")
),
"payment_method": str(raw.get("paymentMethod") or raw.get("payment_method") or ""),
"pay_type": raw.get("payType") or raw.get("pay_type"),
"online_pay_channel": raw.get("onlinePayChannel") or raw.get("online_pay_channel"),
"pay_terminal": raw.get("payTerminal") or raw.get("pay_terminal"),
"pay_status": str(raw.get("payStatus") or raw.get("pay_status") or ""),
"remark": raw.get("remark"),
"raw_data": json.dumps(raw, ensure_ascii=False)
}
except Exception as exc:
self.logger.warning("Error parsing payment: %s, raw data: %s", exc, raw)
return None
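The DWD parser above repeatedly coalesces camelCase and snake_case key spellings before type conversion. A small stand-alone sketch of that pattern — `pick` and `parse_int` are simplified stand-ins for the chained `raw.get(...) or raw.get(...)` calls and `TypeParser`:

```python
def pick(raw: dict, *keys):
    """Return the first non-None value among candidate key spellings."""
    for key in keys:
        if raw.get(key) is not None:
            return raw[key]
    return None


def parse_int(value):
    """Lenient int conversion: None in, None out; garbage in, None out."""
    try:
        return int(value) if value is not None else None
    except (TypeError, ValueError):
        return None


# Upstream payloads mix spellings; explicit None must not shadow a fallback key.
raw = {"payId": None, "id": "42", "orderSettleId": "7"}
pay_id = parse_int(pick(raw, "payId", "id"))
settle = parse_int(pick(raw, "orderSettleId", "order_settle_id"))
print(pay_id, settle)  # -> 42 7
```

Note that `pick` treats `None` as absent, whereas `raw.get("a") or raw.get("b")` would also skip legitimate falsy values such as `0` — worth keeping in mind when amounts can be zero.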

View File

@@ -1,78 +1,111 @@
# -*- coding: utf-8 -*-
"""支付记录ETL任务"""
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.facts.payment import PaymentLoader
from models.parsers import TypeParser
class PaymentsTask(BaseTask):
"""支付记录ETL任务"""
"""支付记录 E/T/L 任务"""
def get_task_code(self) -> str:
return "PAYMENTS"
def execute(self) -> dict:
"""执行支付记录ETL"""
self.logger.info(f"开始执行 {self.get_task_code()} 任务")
window_start, window_end, window_minutes = self._get_time_window()
params = {
"storeId": self.config.get("app.store_id"),
"startTime": TypeParser.format_timestamp(window_start, self.tz),
"endTime": TypeParser.format_timestamp(window_end, self.tz),
}
try:
records, pages_meta = self.api.get_paginated(
endpoint="/pay/records",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",)
)
parsed_records = []
for rec in records:
parsed = self._parse_payment(rec)
if parsed:
parsed_records.append(parsed)
loader = PaymentLoader(self.db)
store_id = self.config.get("app.store_id")
inserted, updated, skipped = loader.upsert_payments(parsed_records, store_id)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0
# ------------------------------------------------------------------ E/T/L hooks
def extract(self, context: TaskContext) -> dict:
"""调用 API 抓取支付记录"""
params = self._merge_common_params(
{
"siteId": context.store_id,
"StartPayTime": TypeParser.format_timestamp(context.window_start, self.tz),
"EndPayTime": TypeParser.format_timestamp(context.window_end, self.tz),
}
self.logger.info(f"{self.get_task_code()} 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception as e:
self.db.rollback()
self.logger.error(f"{self.get_task_code()} 失败", exc_info=True)
raise
def _parse_payment(self, raw: dict) -> dict:
)
records, pages_meta = self.api.get_paginated(
endpoint="/PayLog/GetPayLogListPage",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
)
return {"records": records, "meta": pages_meta}
def transform(self, extracted: dict, context: TaskContext) -> dict:
"""解析支付 JSON"""
parsed, skipped = [], 0
for rec in extracted.get("records", []):
cleaned = self._parse_payment(rec, context.store_id)
if cleaned:
parsed.append(cleaned)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
def load(self, transformed: dict, context: TaskContext) -> dict:
"""写入 fact_payment"""
loader = PaymentLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_payments(
transformed["records"], context.store_id
)
counts = {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
return counts
# ------------------------------------------------------------------ helpers
def _parse_payment(self, raw: dict, store_id: int) -> dict | None:
"""解析支付记录"""
try:
return {
"store_id": self.config.get("app.store_id"),
"pay_id": TypeParser.parse_int(raw.get("payId")),
"store_id": store_id,
"pay_id": TypeParser.parse_int(raw.get("payId") or raw.get("id")),
"order_id": TypeParser.parse_int(raw.get("orderId")),
"order_settle_id": TypeParser.parse_int(
raw.get("orderSettleId") or raw.get("order_settle_id")
),
"order_trade_no": TypeParser.parse_int(
raw.get("orderTradeNo") or raw.get("order_trade_no")
),
"relate_type": raw.get("relateType") or raw.get("relate_type"),
"relate_id": TypeParser.parse_int(raw.get("relateId") or raw.get("relate_id")),
"site_id": TypeParser.parse_int(
raw.get("siteId") or raw.get("site_id") or store_id
),
"tenant_id": TypeParser.parse_int(raw.get("tenantId") or raw.get("tenant_id")),
"pay_time": TypeParser.parse_timestamp(raw.get("payTime"), self.tz),
"create_time": TypeParser.parse_timestamp(
raw.get("createTime") or raw.get("create_time"), self.tz
),
"pay_amount": TypeParser.parse_decimal(raw.get("payAmount")),
"fee_amount": TypeParser.parse_decimal(
raw.get("feeAmount")
or raw.get("serviceFee")
or raw.get("channelFee")
or raw.get("fee_amount")
),
"discount_amount": TypeParser.parse_decimal(
raw.get("discountAmount")
or raw.get("couponAmount")
or raw.get("discount_amount")
),
"pay_type": raw.get("payType"),
"payment_method": raw.get("paymentMethod") or raw.get("payment_method"),
"online_pay_channel": raw.get("onlinePayChannel")
or raw.get("online_pay_channel"),
"pay_status": raw.get("payStatus"),
"pay_terminal": raw.get("payTerminal") or raw.get("pay_terminal"),
"remark": raw.get("remark"),
"raw_data": json.dumps(raw, ensure_ascii=False)
"raw_data": json.dumps(raw, ensure_ascii=False),
}
except Exception as e:
self.logger.warning(f"解析支付记录失败: {e}, 原始数据: {raw}")
except Exception as exc:
self.logger.warning("解析支付记录失败: %s, 原始数据: %s", exc, raw)
return None

View File

@@ -3,7 +3,7 @@
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.dimensions.product import ProductLoader
from models.parsers import TypeParser
@@ -12,95 +12,56 @@ class ProductsTask(BaseTask):
"""商品维度 ETL 任务"""
def get_task_code(self) -> str:
"""任务代码,应与 etl_admin.etl_task.task_code 一致"""
return "PRODUCTS"
def execute(self) -> dict:
"""
执行商品档案 ETL
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params({"siteId": context.store_id})
records, _ = self.api.get_paginated(
endpoint="/TenantGoods/QueryTenantGoods",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
list_key="tenantGoodsList",
)
return {"records": records}
流程:
1. 调用上游 /TenantGoods/QueryTenantGoods 分页拉取商品列表
2. 解析/清洗字段
3. 通过 ProductLoader 写入 dim_product 和 dim_product_price_scd
"""
self.logger.info(f"开始执行 {self.get_task_code()} 任务")
params = {
"storeId": self.config.get("app.store_id"),
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
parsed_row = self._parse_product(raw, context.store_id)
if parsed_row:
parsed.append(parsed_row)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = ProductLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_products(
transformed["records"], context.store_id
)
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
def _parse_product(self, raw: dict, store_id: int) -> dict | None:
try:
# 1. 分页拉取数据
records, pages_meta = self.api.get_paginated(
endpoint="/TenantGoods/QueryTenantGoods",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
)
# 2. 解析/清洗
parsed_records = []
for raw in records:
parsed = self._parse_product(raw)
if parsed:
parsed_records.append(parsed)
# 3. 加载入库(维度主表 + 价格SCD2
loader = ProductLoader(self.db)
store_id = self.config.get("app.store_id")
inserted, updated, skipped = loader.upsert_products(
parsed_records, store_id
)
# 4. 提交事务
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0,
}
self.logger.info(f"{self.get_task_code()} 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception:
# 明确回滚,避免部分成功
self.db.rollback()
self.logger.error(f"{self.get_task_code()} 失败", exc_info=True)
raise
def _parse_product(self, raw: dict) -> dict | None:
"""
解析单条商品记录,字段映射参考旧版 upsert_dim_product_and_price_scd
上游字段示例:
- siteGoodsId / tenantGoodsId / productId
- goodsName / productName
- tenantGoodsCategoryId / goodsCategoryId / categoryName / goodsCategorySecondId
- goodsUnit
- costPrice / goodsPrice / salePrice
- goodsState / status
- supplierId / barcode / isCombo
- createTime / updateTime
"""
try:
product_id = (
TypeParser.parse_int(
raw.get("siteGoodsId")
or raw.get("tenantGoodsId")
or raw.get("productId")
)
product_id = TypeParser.parse_int(
raw.get("siteGoodsId") or raw.get("tenantGoodsId") or raw.get("productId")
)
if not product_id:
# 主键缺失,直接跳过
return None
return {
"store_id": self.config.get("app.store_id"),
"store_id": store_id,
"product_id": product_id,
"site_product_id": TypeParser.parse_int(raw.get("siteGoodsId")),
"product_name": raw.get("goodsName") or raw.get("productName"),
@@ -108,15 +69,12 @@ class ProductsTask(BaseTask):
raw.get("tenantGoodsCategoryId") or raw.get("goodsCategoryId")
),
"category_name": raw.get("categoryName"),
"second_category_id": TypeParser.parse_int(
raw.get("goodsCategorySecondId")
),
"second_category_id": TypeParser.parse_int(raw.get("goodsCategorySecondId")),
"unit": raw.get("goodsUnit"),
"cost_price": TypeParser.parse_decimal(raw.get("costPrice")),
"sale_price": TypeParser.parse_decimal(
raw.get("goodsPrice") or raw.get("salePrice")
),
# 旧版这里就是 None如后面有明确字段可以再补
"allow_discount": None,
"status": raw.get("goodsState") or raw.get("status"),
"supplier_id": TypeParser.parse_int(raw.get("supplierId"))
@@ -126,14 +84,10 @@ class ProductsTask(BaseTask):
"is_combo": bool(raw.get("isCombo"))
if raw.get("isCombo") is not None
else None,
"created_time": TypeParser.parse_timestamp(
raw.get("createTime"), self.tz
),
"updated_time": TypeParser.parse_timestamp(
raw.get("updateTime"), self.tz
),
"created_time": TypeParser.parse_timestamp(raw.get("createTime"), self.tz),
"updated_time": TypeParser.parse_timestamp(raw.get("updateTime"), self.tz),
"raw_data": json.dumps(raw, ensure_ascii=False),
}
except Exception as e:
self.logger.warning(f"解析商品记录失败: {e}, 原始数据: {raw}")
return None
except Exception as exc:
self.logger.warning("解析商品记录失败: %s, 原始数据: %s", exc, raw)
return None
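`ProductLoader` writes both `dim_product` and the `dim_product_price_scd` history table. The real loader targets PostgreSQL; this in-memory sketch only illustrates the SCD2 close-and-open versioning rule, on the assumption that a price change is what triggers a new version:

```python
from datetime import datetime


def scd2_upsert(history: list, product_id: int, new_price: float, now: datetime) -> str:
    """Close the current price row and open a new one when the price changes (SCD2)."""
    current = next(
        (r for r in history if r["product_id"] == product_id and r["valid_to"] is None),
        None,
    )
    if current and current["price"] == new_price:
        return "skipped"            # no change: keep the open version
    if current:
        current["valid_to"] = now   # close the old version
    history.append(
        {"product_id": product_id, "price": new_price, "valid_from": now, "valid_to": None}
    )
    return "updated" if current else "inserted"


hist = []
t1, t2 = datetime(2025, 1, 1), datetime(2025, 2, 1)
print(scd2_upsert(hist, 10, 9.9, t1))   # first version
print(scd2_upsert(hist, 10, 9.9, t2))   # unchanged price
print(scd2_upsert(hist, 10, 12.0, t2))  # price change opens a new version
```

In SQL this is typically one `UPDATE ... SET valid_to = now()` on the open row plus one `INSERT`, inside the same transaction the base task commits.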

View File

@@ -3,7 +3,7 @@
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.facts.refund import RefundLoader
from models.parsers import TypeParser
@@ -14,54 +14,53 @@ class RefundsTask(BaseTask):
def get_task_code(self) -> str:
return "REFUNDS"
def execute(self) -> dict:
self.logger.info("开始执行 REFUNDS 任务")
window_start, window_end, _ = self._get_time_window()
params = {
"storeId": self.config.get("app.store_id"),
"startTime": TypeParser.format_timestamp(window_start, self.tz),
"endTime": TypeParser.format_timestamp(window_end, self.tz),
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params(
{
"siteId": context.store_id,
"startTime": TypeParser.format_timestamp(context.window_start, self.tz),
"endTime": TypeParser.format_timestamp(context.window_end, self.tz),
}
)
records, _ = self.api.get_paginated(
endpoint="/Order/GetRefundPayLogList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
)
return {"records": records}
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
mapped = self._parse_refund(raw, context.store_id)
if mapped:
parsed.append(mapped)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
try:
records, _ = self.api.get_paginated(
endpoint="/Pay/RefundList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=(),
)
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = RefundLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_refunds(transformed["records"])
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
parsed = []
for raw in records:
mapped = self._parse_refund(raw)
if mapped:
parsed.append(mapped)
loader = RefundLoader(self.db)
inserted, updated, skipped = loader.upsert_refunds(parsed)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0,
}
self.logger.info(f"REFUNDS 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception:
self.db.rollback()
self.logger.error("REFUNDS 失败", exc_info=True)
raise
def _parse_refund(self, raw: dict) -> dict | None:
def _parse_refund(self, raw: dict, store_id: int) -> dict | None:
refund_id = TypeParser.parse_int(raw.get("id"))
if not refund_id:
self.logger.warning("跳过缺少 id 的退款记录: %s", raw)
self.logger.warning("跳过缺少退款ID的数据: %s", raw)
return None
store_id = self.config.get("app.store_id")
return {
"store_id": store_id,
"refund_id": refund_id,

View File

@@ -3,7 +3,7 @@
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.facts.table_discount import TableDiscountLoader
from models.parsers import TypeParser
@@ -14,55 +14,55 @@ class TableDiscountTask(BaseTask):
def get_task_code(self) -> str:
return "TABLE_DISCOUNT"
def execute(self) -> dict:
self.logger.info("开始执行 TABLE_DISCOUNT 任务")
window_start, window_end, _ = self._get_time_window()
params = {
"storeId": self.config.get("app.store_id"),
"startTime": TypeParser.format_timestamp(window_start, self.tz),
"endTime": TypeParser.format_timestamp(window_end, self.tz),
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params(
{
"siteId": context.store_id,
"startTime": TypeParser.format_timestamp(context.window_start, self.tz),
"endTime": TypeParser.format_timestamp(context.window_end, self.tz),
}
)
records, _ = self.api.get_paginated(
endpoint="/Site/GetTaiFeeAdjustList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
list_key="taiFeeAdjustInfos",
)
return {"records": records}
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
mapped = self._parse_discount(raw, context.store_id)
if mapped:
parsed.append(mapped)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
try:
records, _ = self.api.get_paginated(
endpoint="/Table/AdjustList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data", "taiFeeAdjustInfos"),
)
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = TableDiscountLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_discounts(transformed["records"])
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
parsed = []
for raw in records:
mapped = self._parse_discount(raw)
if mapped:
parsed.append(mapped)
loader = TableDiscountLoader(self.db)
inserted, updated, skipped = loader.upsert_discounts(parsed)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0,
}
self.logger.info(f"TABLE_DISCOUNT 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception:
self.db.rollback()
self.logger.error("TABLE_DISCOUNT 失败", exc_info=True)
raise
def _parse_discount(self, raw: dict) -> dict | None:
def _parse_discount(self, raw: dict, store_id: int) -> dict | None:
discount_id = TypeParser.parse_int(raw.get("id"))
if not discount_id:
self.logger.warning("跳过缺少 id 的台费折扣记录: %s", raw)
self.logger.warning("跳过缺少折扣ID的记录: %s", raw)
return None
table_profile = raw.get("tableProfile") or {}
store_id = self.config.get("app.store_id")
return {
"store_id": store_id,
"discount_id": discount_id,

View File

@@ -1,15 +1,9 @@
<<<<<<< HEAD
class TablesTask(BaseTask):
def get_task_code(self) -> str: # 返回 "TABLES"
def execute(self) -> dict: # 拉取 /Table/GetSiteTables
def _parse_table(self, raw: dict) -> dict | None:
=======
# -*- coding: utf-8 -*-
"""台桌档案任务"""
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.dimensions.table import TableLoader
from models.parsers import TypeParser
@@ -20,49 +14,48 @@ class TablesTask(BaseTask):
def get_task_code(self) -> str:
return "TABLES"
def execute(self) -> dict:
self.logger.info("开始执行 TABLES 任务")
params = {"storeId": self.config.get("app.store_id")}
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params({"siteId": context.store_id})
records, _ = self.api.get_paginated(
endpoint="/Table/GetSiteTables",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
list_key="siteTables",
)
return {"records": records}
try:
records, _ = self.api.get_paginated(
endpoint="/Table/GetSiteTables",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data", "siteTables"),
)
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
mapped = self._parse_table(raw, context.store_id)
if mapped:
parsed.append(mapped)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
parsed = []
for raw in records:
mapped = self._parse_table(raw)
if mapped:
parsed.append(mapped)
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = TableLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_tables(transformed["records"])
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
loader = TableLoader(self.db)
inserted, updated, skipped = loader.upsert_tables(parsed)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0,
}
self.logger.info(f"TABLES 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception:
self.db.rollback()
self.logger.error("TABLES 失败", exc_info=True)
raise
def _parse_table(self, raw: dict) -> dict | None:
def _parse_table(self, raw: dict, store_id: int) -> dict | None:
table_id = TypeParser.parse_int(raw.get("id"))
if not table_id:
self.logger.warning("跳过缺少 table_id 的台桌记录: %s", raw)
return None
store_id = self.config.get("app.store_id")
return {
"store_id": store_id,
"table_id": table_id,
@@ -89,4 +82,3 @@ class TablesTask(BaseTask):
),
"raw_data": json.dumps(raw, ensure_ascii=False),
}
>>>>>>> main

View File

@@ -0,0 +1,69 @@
# -*- coding: utf-8 -*-
from .base_dwd_task import BaseDwdTask
from loaders.facts.ticket import TicketLoader
class TicketDwdTask(BaseDwdTask):
"""
DWD Task: Process Ticket Details from ODS to Fact Tables
Source: billiards_ods.ods_ticket_detail
Targets:
- billiards.fact_order
- billiards.fact_order_goods
- billiards.fact_table_usage
- billiards.fact_assistant_service
"""
def get_task_code(self) -> str:
return "TICKET_DWD"
def execute(self) -> dict:
self.logger.info(f"Starting {self.get_task_code()} task")
# 1. Get Time Window (Incremental Load)
window_start, window_end, _ = self._get_time_window()
self.logger.info(f"Processing window: {window_start} to {window_end}")
# 2. Initialize Loader
loader = TicketLoader(self.db, logger=self.logger)
store_id = self.config.get("app.store_id")
total_inserted = 0
total_errors = 0
# 3. Iterate ODS Data
# We query ods_ticket_detail based on fetched_at
batches = self.iter_ods_rows(
table_name="billiards_ods.settlement_ticket_details",
columns=["payload", "fetched_at", "source_file", "record_index"],
start_time=window_start,
end_time=window_end
)
for batch in batches:
if not batch:
continue
# Extract payloads
tickets = []
for row in batch:
payload = self.parse_payload(row)
if payload:
tickets.append(payload)
# Process Batch
inserted, errors = loader.process_tickets(tickets, store_id)
total_inserted += inserted
total_errors += errors
# 4. Commit
self.db.commit()
self.logger.info(f"Task {self.get_task_code()} completed. Inserted: {total_inserted}, Errors: {total_errors}")
return {
"status": "success",
"inserted": total_inserted,
"errors": total_errors,
"window_start": window_start.isoformat(),
"window_end": window_end.isoformat()
}
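`iter_ods_rows` streams ODS rows for the incremental window in batches keyed on `fetched_at`. An in-memory stand-in showing the window filter plus fixed-size batching — the real implementation would use a server-side database cursor rather than a Python list:

```python
def iter_ods_rows(rows: list, start, end, batch_size: int = 2):
    """Yield window-filtered ODS rows in fixed-size batches.

    In-memory sketch: `start` is inclusive, `end` exclusive, mirroring a
    `fetched_at >= start AND fetched_at < end` predicate.
    """
    window = [r for r in rows if start <= r["fetched_at"] < end]
    for i in range(0, len(window), batch_size):
        yield window[i : i + batch_size]


rows = [{"fetched_at": h, "payload": "{}"} for h in range(10)]
batches = list(iter_ods_rows(rows, 3, 8))
print([len(b) for b in batches])  # -> [2, 2, 1]
```

Committing once after the loop (as the DWD tasks above do) keeps the whole window atomic; committing per batch would instead trade atomicity for bounded transaction size.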

View File

@@ -3,7 +3,7 @@
import json
from .base_task import BaseTask
from .base_task import BaseTask, TaskContext
from loaders.facts.topup import TopupLoader
from models.parsers import TypeParser
@@ -14,55 +14,55 @@ class TopupsTask(BaseTask):
def get_task_code(self) -> str:
return "TOPUPS"
def execute(self) -> dict:
self.logger.info("开始执行 TOPUPS 任务")
window_start, window_end, _ = self._get_time_window()
params = {
"storeId": self.config.get("app.store_id"),
"startTime": TypeParser.format_timestamp(window_start, self.tz),
"endTime": TypeParser.format_timestamp(window_end, self.tz),
def extract(self, context: TaskContext) -> dict:
params = self._merge_common_params(
{
"siteId": context.store_id,
"rangeStartTime": TypeParser.format_timestamp(context.window_start, self.tz),
"rangeEndTime": TypeParser.format_timestamp(context.window_end, self.tz),
}
)
records, _ = self.api.get_paginated(
endpoint="/Site/GetRechargeSettleList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data",),
list_key="settleList",
)
return {"records": records}
def transform(self, extracted: dict, context: TaskContext) -> dict:
parsed, skipped = [], 0
for raw in extracted.get("records", []):
mapped = self._parse_topup(raw, context.store_id)
if mapped:
parsed.append(mapped)
else:
skipped += 1
return {
"records": parsed,
"fetched": len(extracted.get("records", [])),
"skipped": skipped,
}
try:
records, _ = self.api.get_paginated(
endpoint="/Topup/SettleList",
params=params,
page_size=self.config.get("api.page_size", 200),
data_path=("data", "settleList"),
)
def load(self, transformed: dict, context: TaskContext) -> dict:
loader = TopupLoader(self.db)
inserted, updated, loader_skipped = loader.upsert_topups(transformed["records"])
return {
"fetched": transformed["fetched"],
"inserted": inserted,
"updated": updated,
"skipped": transformed["skipped"] + loader_skipped,
"errors": 0,
}
parsed = []
for raw in records:
mapped = self._parse_topup(raw)
if mapped:
parsed.append(mapped)
loader = TopupLoader(self.db)
inserted, updated, skipped = loader.upsert_topups(parsed)
self.db.commit()
counts = {
"fetched": len(records),
"inserted": inserted,
"updated": updated,
"skipped": skipped,
"errors": 0,
}
self.logger.info(f"TOPUPS 完成: {counts}")
return self._build_result("SUCCESS", counts)
except Exception:
self.db.rollback()
self.logger.error("TOPUPS 失败", exc_info=True)
raise
def _parse_topup(self, raw: dict) -> dict | None:
def _parse_topup(self, raw: dict, store_id: int) -> dict | None:
node = raw.get("settleList") if isinstance(raw.get("settleList"), dict) else raw
topup_id = TypeParser.parse_int(node.get("id"))
if not topup_id:
self.logger.warning("跳过缺少 id 的充值结算: %s", raw)
self.logger.warning("跳过缺少充值ID的记录: %s", raw)
return None
store_id = self.config.get("app.store_id")
return {
"store_id": store_id,
"topup_id": topup_id,

View File

@@ -1,647 +0,0 @@
[
{
"data": {
"total": 15,
"abolitionAssistants": [
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-11-09 19:23:29",
"id": 2957675849518789,
"siteId": 2790685415443269,
"tableAreaId": 2791963816579205,
"tableId": 2793016660660357,
"tableArea": "C区",
"tableName": "C1",
"assistantOn": "27",
"assistantName": "泡芙",
"pdChargeMinutes": 214,
"assistantAbolishAmount": 5.83,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-11-06 17:42:09",
"id": 2953329501898373,
"siteId": 2790685415443269,
"tableAreaId": 2802006170324037,
"tableId": 2851642357976581,
"tableArea": "补时长",
"tableName": "补时长5",
"assistantOn": "23",
"assistantName": "婉婉",
"pdChargeMinutes": 10800,
"assistantAbolishAmount": 570.0,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-11-06 17:42:09",
"id": 2953329502357125,
"siteId": 2790685415443269,
"tableAreaId": 2802006170324037,
"tableId": 2851642357976581,
"tableArea": "补时长",
"tableName": "补时长5",
"assistantOn": "52",
"assistantName": "小柔",
"pdChargeMinutes": 10800,
"assistantAbolishAmount": 570.0,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-30 01:17:22",
"id": 2942452375932869,
"siteId": 2790685415443269,
"tableAreaId": 2791963825803397,
"tableId": 2793018776604805,
"tableArea": "VIP包厢",
"tableName": "VIP1",
"assistantOn": "2",
"assistantName": "佳怡",
"pdChargeMinutes": 0,
"assistantAbolishAmount": 0.0,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-29 06:57:22",
"id": 2941371032964997,
"siteId": 2790685415443269,
"tableAreaId": 2791963848527941,
"tableId": 2793021451292741,
"tableArea": "666",
"tableName": "董事办",
"assistantOn": "4",
"assistantName": "璇子",
"pdChargeMinutes": 0,
"assistantAbolishAmount": 0.0,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-28 02:13:18",
"id": 2939676194180229,
"siteId": 2790685415443269,
"tableAreaId": 2791963887030341,
"tableId": 2793023960551493,
"tableArea": "麻将房",
"tableName": "1",
"assistantOn": "2",
"assistantName": "佳怡",
"pdChargeMinutes": 3602,
"assistantAbolishAmount": 108.06,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-26 21:06:37",
"id": 2937959143262725,
"siteId": 2790685415443269,
"tableAreaId": 2791963855982661,
"tableId": 2793022145302597,
"tableArea": "K包",
"tableName": "888",
"assistantOn": "16",
"assistantName": "周周",
"pdChargeMinutes": 0,
"assistantAbolishAmount": 0.0,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-25 21:36:22",
"id": 2936572806285765,
"siteId": 2790685415443269,
"tableAreaId": 2791963816579205,
"tableId": 2793017278451845,
"tableArea": "C区",
"tableName": "C2",
"assistantOn": "4",
"assistantName": "璇子",
"pdChargeMinutes": 0,
"assistantAbolishAmount": 0.0,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-23 19:05:48",
"id": 2933593641256581,
"siteId": 2790685415443269,
"tableAreaId": 2791963807682693,
"tableId": 2793012902318213,
"tableArea": "B区",
"tableName": "B9",
"assistantOn": "16",
"assistantName": "周周",
"pdChargeMinutes": 3600,
"assistantAbolishAmount": 190.0,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-18 20:25:50",
"id": 2926594431305093,
"siteId": 2790685415443269,
"tableAreaId": 2791963794329671,
"tableId": 2793001904918661,
"tableArea": "A区",
"tableName": "A4",
"assistantOn": "15",
"assistantName": "七七",
"pdChargeMinutes": 2379,
"assistantAbolishAmount": 71.37,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-14 14:20:32",
"id": 2920573007709573,
"siteId": 2790685415443269,
"tableAreaId": 2791963855982661,
"tableId": 2793022145302597,
"tableArea": "K包",
"tableName": "888",
"assistantOn": "9",
"assistantName": "球球",
"pdChargeMinutes": 14400,
"assistantAbolishAmount": 392.0,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-03 01:21:59",
"id": 2904236313234373,
"siteId": 2790685415443269,
"tableAreaId": 2791963848527941,
"tableId": 2793020955840645,
"tableArea": "666",
"tableName": "666",
"assistantOn": "9",
"assistantName": "球球",
"pdChargeMinutes": 0,
"assistantAbolishAmount": 0.0,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-01 00:27:29",
"id": 2901351579143365,
"siteId": 2790685415443269,
"tableAreaId": 2791963855982661,
"tableId": 2793022145302597,
"tableArea": "K包",
"tableName": "888",
"assistantOn": "99",
"assistantName": "Amy",
"pdChargeMinutes": 10605,
"assistantAbolishAmount": 465.44,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-01 00:27:29",
"id": 2901351578864837,
"siteId": 2790685415443269,
"tableAreaId": 2791963855982661,
"tableId": 2793022145302597,
"tableArea": "K包",
"tableName": "888",
"assistantOn": "4",
"assistantName": "璇子",
"pdChargeMinutes": 10608,
"assistantAbolishAmount": 318.24,
"trashReason": ""
},
{
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 1,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"createTime": "2025-10-01 00:27:29",
"id": 2901351578602693,
"siteId": 2790685415443269,
"tableAreaId": 2791963855982661,
"tableId": 2793022145302597,
"tableArea": "K包",
"tableName": "888",
"assistantOn": "2",
"assistantName": "佳怡",
"pdChargeMinutes": 10611,
"assistantAbolishAmount": 318.33,
"trashReason": ""
}
]
},
"code": 0
},
{
"data": {
"total": 15,
"abolitionAssistants": []
},
"code": 0
}
]
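Export files shaped like the sample above are a JSON array of API responses, each carrying one page of rows under `data.abolitionAssistants` plus a `code` status. A minimal reader sketch (the path handling and function name are illustrative, not the project's actual ingest code):

```python
import json

def iter_abolition_assistants(path):
    """Yield assistant write-off rows from one exported response file.

    Only responses with code == 0 are taken; an empty abolitionAssistants
    list (the trailing page in the sample) simply yields nothing.
    """
    with open(path, encoding="utf-8") as fh:
        responses = json.load(fh)
    for resp in responses:
        if resp.get("code") != 0:
            continue
        for row in resp.get("data", {}).get("abolitionAssistants", []):
            yield row
```

Iterating pages this way keeps the reader agnostic to `total`, which in the sample repeats the overall count on every page rather than the page size.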


@@ -1,673 +0,0 @@
[
{
"tenantName": "朗朗桌球",
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 2,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"id": 2955202296416389,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"pay_sn": 0,
"pay_amount": -5000.0,
"pay_status": 2,
"pay_time": "2025-11-08 01:27:16",
"create_time": "2025-11-08 01:27:16",
"relate_type": 5,
"relate_id": 2955078219057349,
"is_revoke": 0,
"is_delete": 0,
"online_pay_channel": 0,
"payment_method": 4,
"balance_frozen_amount": 0.0,
"card_frozen_amount": 0.0,
"member_id": 0,
"member_card_id": 0,
"round_amount": 0.0,
"online_pay_type": 0,
"action_type": 2,
"refund_amount": 0.0,
"cashier_point_id": 0,
"operator_id": 0,
"pay_terminal": 1,
"pay_config_id": 0,
"channel_payer_id": "",
"channel_pay_no": "",
"check_status": 1,
"channel_fee": 0.0
},
{
"tenantName": "朗朗桌球",
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 2,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"id": 2955171790194821,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"pay_sn": 0,
"pay_amount": -10000.0,
"pay_status": 2,
"pay_time": "2025-11-08 00:56:14",
"create_time": "2025-11-08 00:56:14",
"relate_type": 5,
"relate_id": 2955153380001861,
"is_revoke": 0,
"is_delete": 0,
"online_pay_channel": 0,
"payment_method": 4,
"balance_frozen_amount": 0.0,
"card_frozen_amount": 0.0,
"member_id": 0,
"member_card_id": 0,
"round_amount": 0.0,
"online_pay_type": 0,
"action_type": 2,
"refund_amount": 0.0,
"cashier_point_id": 0,
"operator_id": 0,
"pay_terminal": 1,
"pay_config_id": 0,
"channel_payer_id": "",
"channel_pay_no": "",
"check_status": 1,
"channel_fee": 0.0
},
{
"tenantName": "朗朗桌球",
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 2,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"id": 2951883030513413,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"pay_sn": 0,
"pay_amount": -12.0,
"pay_status": 2,
"pay_time": "2025-11-05 17:10:44",
"create_time": "2025-11-05 17:10:44",
"relate_type": 2,
"relate_id": 2951881496577861,
"is_revoke": 0,
"is_delete": 0,
"online_pay_channel": 0,
"payment_method": 4,
"balance_frozen_amount": 0.0,
"card_frozen_amount": 0.0,
"member_id": 0,
"member_card_id": 0,
"round_amount": 0.0,
"online_pay_type": 0,
"action_type": 2,
"refund_amount": 0.0,
"cashier_point_id": 0,
"operator_id": 0,
"pay_terminal": 1,
"pay_config_id": 0,
"channel_payer_id": "",
"channel_pay_no": "",
"check_status": 1,
"channel_fee": 0.0
},
{
"tenantName": "朗朗桌球",
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 2,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"id": 2948959062542597,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"pay_sn": 0,
"pay_amount": -65.0,
"pay_status": 2,
"pay_time": "2025-11-03 15:36:19",
"create_time": "2025-11-03 15:36:19",
"relate_type": 2,
"relate_id": 2948934289967557,
"is_revoke": 0,
"is_delete": 0,
"online_pay_channel": 0,
"payment_method": 4,
"balance_frozen_amount": 0.0,
"card_frozen_amount": 0.0,
"member_id": 0,
"member_card_id": 0,
"round_amount": 0.0,
"online_pay_type": 0,
"action_type": 2,
"refund_amount": 0.0,
"cashier_point_id": 0,
"operator_id": 0,
"pay_terminal": 1,
"pay_config_id": 0,
"channel_payer_id": "",
"channel_pay_no": "",
"check_status": 1,
"channel_fee": 0.0
},
{
"tenantName": "朗朗桌球",
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 2,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"id": 2948630468005509,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"pay_sn": 0,
"pay_amount": -88.33,
"pay_status": 2,
"pay_time": "2025-11-03 10:02:03",
"create_time": "2025-11-03 10:02:03",
"relate_type": 2,
"relate_id": 2948246513454661,
"is_revoke": 0,
"is_delete": 0,
"online_pay_channel": 0,
"payment_method": 4,
"balance_frozen_amount": 0.0,
"card_frozen_amount": 0.0,
"member_id": 0,
"member_card_id": 0,
"round_amount": 0.0,
"online_pay_type": 0,
"action_type": 2,
"refund_amount": 0.0,
"cashier_point_id": 0,
"operator_id": 0,
"pay_terminal": 1,
"pay_config_id": 0,
"channel_payer_id": "",
"channel_pay_no": "",
"check_status": 1,
"channel_fee": 0.0
},
{
"tenantName": "朗朗桌球",
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 2,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"id": 2948269239095045,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"pay_sn": 0,
"pay_amount": -0.67,
"pay_status": 2,
"pay_time": "2025-11-03 03:54:36",
"create_time": "2025-11-03 03:54:36",
"relate_type": 2,
"relate_id": 2948246513454661,
"is_revoke": 0,
"is_delete": 0,
"online_pay_channel": 0,
"payment_method": 4,
"balance_frozen_amount": 0.0,
"card_frozen_amount": 0.0,
"member_id": 0,
"member_card_id": 0,
"round_amount": 0.0,
"online_pay_type": 0,
"action_type": 2,
"refund_amount": 0.0,
"cashier_point_id": 0,
"operator_id": 0,
"pay_terminal": 1,
"pay_config_id": 0,
"channel_payer_id": "",
"channel_pay_no": "",
"check_status": 1,
"channel_fee": 0.0
},
{
"tenantName": "朗朗桌球",
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 2,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"id": 2944743812581445,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"pay_sn": 0,
"pay_amount": -44000.0,
"pay_status": 2,
"pay_time": "2025-10-31 16:08:21",
"create_time": "2025-10-31 16:08:21",
"relate_type": 5,
"relate_id": 2944743413958789,
"is_revoke": 0,
"is_delete": 0,
"online_pay_channel": 0,
"payment_method": 4,
"balance_frozen_amount": 0.0,
"card_frozen_amount": 0.0,
"member_id": 0,
"member_card_id": 0,
"round_amount": 0.0,
"online_pay_type": 0,
"action_type": 2,
"refund_amount": 0.0,
"cashier_point_id": 0,
"operator_id": 0,
"pay_terminal": 1,
"pay_config_id": 0,
"channel_payer_id": "",
"channel_pay_no": "",
"check_status": 1,
"channel_fee": 0.0
},
{
"tenantName": "朗朗桌球",
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 2,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"id": 2931109065131653,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"pay_sn": 0,
"pay_amount": -10.0,
"pay_status": 2,
"pay_time": "2025-10-22 00:58:22",
"create_time": "2025-10-22 00:58:22",
"relate_type": 2,
"relate_id": 2931108522378885,
"is_revoke": 0,
"is_delete": 0,
"online_pay_channel": 0,
"payment_method": 4,
"balance_frozen_amount": 0.0,
"card_frozen_amount": 0.0,
"member_id": 0,
"member_card_id": 0,
"round_amount": 0.0,
"online_pay_type": 0,
"action_type": 2,
"refund_amount": 0.0,
"cashier_point_id": 0,
"operator_id": 0,
"pay_terminal": 1,
"pay_config_id": 0,
"channel_payer_id": "",
"channel_pay_no": "",
"check_status": 1,
"channel_fee": 0.0
},
{
"tenantName": "朗朗桌球",
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 2,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"id": 2921195994465669,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"pay_sn": 0,
"pay_amount": -2.0,
"pay_status": 2,
"pay_time": "2025-10-15 00:54:16",
"create_time": "2025-10-15 00:54:16",
"relate_type": 2,
"relate_id": 2920440691344901,
"is_revoke": 0,
"is_delete": 0,
"online_pay_channel": 0,
"payment_method": 4,
"balance_frozen_amount": 0.0,
"card_frozen_amount": 0.0,
"member_id": 0,
"member_card_id": 0,
"round_amount": 0.0,
"online_pay_type": 0,
"action_type": 2,
"refund_amount": 0.0,
"cashier_point_id": 0,
"operator_id": 0,
"pay_terminal": 1,
"pay_config_id": 0,
"channel_payer_id": "",
"channel_pay_no": "",
"check_status": 1,
"channel_fee": 0.0
},
{
"tenantName": "朗朗桌球",
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 2,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"id": 2919690732146181,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"pay_sn": 0,
"pay_amount": -3000.0,
"pay_status": 2,
"pay_time": "2025-10-13 23:23:02",
"create_time": "2025-10-13 23:23:02",
"relate_type": 5,
"relate_id": 2919519811440261,
"is_revoke": 0,
"is_delete": 0,
"online_pay_channel": 0,
"payment_method": 4,
"balance_frozen_amount": 0.0,
"card_frozen_amount": 0.0,
"member_id": 0,
"member_card_id": 0,
"round_amount": 0.0,
"online_pay_type": 0,
"action_type": 2,
"refund_amount": 0.0,
"cashier_point_id": 0,
"operator_id": 0,
"pay_terminal": 1,
"pay_config_id": 0,
"channel_payer_id": "",
"channel_pay_no": "",
"check_status": 1,
"channel_fee": 0.0
},
{
"tenantName": "朗朗桌球",
"siteProfile": {
"id": 2790685415443269,
"org_id": 2790684179467077,
"shop_name": "朗朗桌球",
"avatar": "https://oss.ficoo.vip/admin/hXcE4E_1752495052016.jpg",
"business_tel": "13316068642",
"full_address": "广东省广州市天河区丽阳街12号",
"address": "广东省广州市天河区天园街道朗朗桌球",
"longitude": 113.360321,
"latitude": 23.133629,
"tenant_site_region_id": 156440100,
"tenant_id": 2790683160709957,
"auto_light": 1,
"attendance_distance": 0,
"wifi_name": "",
"wifi_password": "",
"customer_service_qrcode": "",
"customer_service_wechat": "",
"fixed_pay_qrCode": "",
"prod_env": 1,
"light_status": 2,
"light_type": 0,
"site_type": 1,
"light_token": "",
"site_label": "A",
"attendance_enabled": 1,
"shop_status": 1
},
"id": 2914039374956165,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"pay_sn": 0,
"pay_amount": -8.0,
"pay_status": 2,
"pay_time": "2025-10-09 23:34:11",
"create_time": "2025-10-09 23:34:11",
"relate_type": 2,
"relate_id": 2914030720124357,
"is_revoke": 0,
"is_delete": 0,
"online_pay_channel": 0,
"payment_method": 4,
"balance_frozen_amount": 0.0,
"card_frozen_amount": 0.0,
"member_id": 0,
"member_card_id": 0,
"round_amount": 0.0,
"online_pay_type": 0,
"action_type": 2,
"refund_amount": 0.0,
"cashier_point_id": 0,
"operator_id": 0,
"pay_terminal": 1,
"pay_config_id": 0,
"channel_payer_id": "",
"channel_pay_no": "",
"check_status": 1,
"channel_fee": 0.0
}
]
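In the payment sample above every row carries a negative `pay_amount` with `action_type` 2, which reads as a reversal/refund, grouped by `relate_type` (2 and 5 in the data). A small aggregation sketch; the interpretation of those codes is an assumption from the sample, not a confirmed enum:

```python
from collections import defaultdict

def refund_totals_by_relate_type(records):
    """Sum pay_amount per relate_type; reversals stay negative."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["relate_type"]] += rec["pay_amount"]
    return dict(totals)

sample = [
    {"relate_type": 5, "pay_amount": -5000.0},
    {"relate_type": 2, "pay_amount": -12.0},
    {"relate_type": 2, "pay_amount": -65.0},
]
print(refund_totals_by_relate_type(sample))  # {5: -5000.0, 2: -77.0}
```

The same grouping is what a DWD quality check would compare against the fact table's per-type refund totals.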


@@ -1,712 +0,0 @@
[
{
"data": {
"total": 0,
"goodsCategoryList": [
{
"id": 2790683528350533,
"tenant_id": 2790683160709957,
"category_name": "槟榔",
"alias_name": "",
"pid": 0,
"business_name": "槟榔",
"tenant_goods_business_id": 2790683528317766,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2790683528350534,
"tenant_id": 2790683160709957,
"category_name": "槟榔",
"alias_name": "",
"pid": 2790683528350533,
"business_name": "槟榔",
"tenant_goods_business_id": 2790683528317766,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 1,
"is_warehousing": 1
},
{
"id": 2790683528350535,
"tenant_id": 2790683160709957,
"category_name": "器材",
"alias_name": "",
"pid": 0,
"business_name": "器材",
"tenant_goods_business_id": 2790683528317767,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2790683528350536,
"tenant_id": 2790683160709957,
"category_name": "皮头",
"alias_name": "",
"pid": 2790683528350535,
"business_name": "器材",
"tenant_goods_business_id": 2790683528317767,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350537,
"tenant_id": 2790683160709957,
"category_name": "球杆",
"alias_name": "",
"pid": 2790683528350535,
"business_name": "器材",
"tenant_goods_business_id": 2790683528317767,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350538,
"tenant_id": 2790683160709957,
"category_name": "其他",
"alias_name": "",
"pid": 2790683528350535,
"business_name": "器材",
"tenant_goods_business_id": 2790683528317767,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350539,
"tenant_id": 2790683160709957,
"category_name": "酒水",
"alias_name": "",
"pid": 0,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2790683528350540,
"tenant_id": 2790683160709957,
"category_name": "饮料",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350541,
"tenant_id": 2790683160709957,
"category_name": "酒水",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350542,
"tenant_id": 2790683160709957,
"category_name": "茶水",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350543,
"tenant_id": 2790683160709957,
"category_name": "咖啡",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350544,
"tenant_id": 2790683160709957,
"category_name": "加料",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2793221553489733,
"tenant_id": 2790683160709957,
"category_name": "洋酒",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350545,
"tenant_id": 2790683160709957,
"category_name": "果盘",
"alias_name": "",
"pid": 0,
"business_name": "水果",
"tenant_goods_business_id": 2790683528317769,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2792050275864453,
"tenant_id": 2790683160709957,
"category_name": "果盘",
"alias_name": "",
"pid": 2790683528350545,
"business_name": "水果",
"tenant_goods_business_id": 2790683528317769,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2791941988405125,
"tenant_id": 2790683160709957,
"category_name": "零食",
"alias_name": "",
"pid": 0,
"business_name": "零食",
"tenant_goods_business_id": 2791932037238661,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2791948300259205,
"tenant_id": 2790683160709957,
"category_name": "零食",
"alias_name": "",
"pid": 2791941988405125,
"business_name": "零食",
"tenant_goods_business_id": 2791932037238661,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2793236829620037,
"tenant_id": 2790683160709957,
"category_name": "面",
"alias_name": "",
"pid": 2791941988405125,
"business_name": "零食",
"tenant_goods_business_id": 2791932037238661,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2791942087561093,
"tenant_id": 2790683160709957,
"category_name": "雪糕",
"alias_name": "",
"pid": 0,
"business_name": "雪糕",
"tenant_goods_business_id": 2791931866402693,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2792035069284229,
"tenant_id": 2790683160709957,
"category_name": "雪糕",
"alias_name": "",
"pid": 2791942087561093,
"business_name": "雪糕",
"tenant_goods_business_id": 2791931866402693,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2792062778003333,
"tenant_id": 2790683160709957,
"category_name": "香烟",
"alias_name": "",
"pid": 0,
"business_name": "香烟",
"tenant_goods_business_id": 2790683528317765,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2792063209623429,
"tenant_id": 2790683160709957,
"category_name": "香烟",
"alias_name": "",
"pid": 2792062778003333,
"business_name": "香烟",
"tenant_goods_business_id": 2790683528317765,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 1,
"is_warehousing": 1
}
],
"sort": 1,
"is_warehousing": 1
},
{
"id": 2793217944864581,
"tenant_id": 2790683160709957,
"category_name": "其他",
"alias_name": "",
"pid": 0,
"business_name": "其他",
"tenant_goods_business_id": 2793217599407941,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2793218343257925,
"tenant_id": 2790683160709957,
"category_name": "其他2",
"alias_name": "",
"pid": 2793217944864581,
"business_name": "其他",
"tenant_goods_business_id": 2793217599407941,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2793220945250117,
"tenant_id": 2790683160709957,
"category_name": "小吃",
"alias_name": "",
"pid": 0,
"business_name": "小吃",
"tenant_goods_business_id": 2793220268902213,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2793221283104581,
"tenant_id": 2790683160709957,
"category_name": "小吃",
"alias_name": "",
"pid": 2793220945250117,
"business_name": "小吃",
"tenant_goods_business_id": 2793220268902213,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
}
]
},
"code": 0
},
{
"data": {
"total": 0,
"goodsCategoryList": [
{
"id": 2790683528350533,
"tenant_id": 2790683160709957,
"category_name": "槟榔",
"alias_name": "",
"pid": 0,
"business_name": "槟榔",
"tenant_goods_business_id": 2790683528317766,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2790683528350534,
"tenant_id": 2790683160709957,
"category_name": "槟榔",
"alias_name": "",
"pid": 2790683528350533,
"business_name": "槟榔",
"tenant_goods_business_id": 2790683528317766,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 1,
"is_warehousing": 1
},
{
"id": 2790683528350535,
"tenant_id": 2790683160709957,
"category_name": "器材",
"alias_name": "",
"pid": 0,
"business_name": "器材",
"tenant_goods_business_id": 2790683528317767,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2790683528350536,
"tenant_id": 2790683160709957,
"category_name": "皮头",
"alias_name": "",
"pid": 2790683528350535,
"business_name": "器材",
"tenant_goods_business_id": 2790683528317767,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350537,
"tenant_id": 2790683160709957,
"category_name": "球杆",
"alias_name": "",
"pid": 2790683528350535,
"business_name": "器材",
"tenant_goods_business_id": 2790683528317767,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350538,
"tenant_id": 2790683160709957,
"category_name": "其他",
"alias_name": "",
"pid": 2790683528350535,
"business_name": "器材",
"tenant_goods_business_id": 2790683528317767,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350539,
"tenant_id": 2790683160709957,
"category_name": "酒水",
"alias_name": "",
"pid": 0,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2790683528350540,
"tenant_id": 2790683160709957,
"category_name": "饮料",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350541,
"tenant_id": 2790683160709957,
"category_name": "酒水",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350542,
"tenant_id": 2790683160709957,
"category_name": "茶水",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350543,
"tenant_id": 2790683160709957,
"category_name": "咖啡",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350544,
"tenant_id": 2790683160709957,
"category_name": "加料",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2793221553489733,
"tenant_id": 2790683160709957,
"category_name": "洋酒",
"alias_name": "",
"pid": 2790683528350539,
"business_name": "酒水",
"tenant_goods_business_id": 2790683528317768,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2790683528350545,
"tenant_id": 2790683160709957,
"category_name": "果盘",
"alias_name": "",
"pid": 0,
"business_name": "水果",
"tenant_goods_business_id": 2790683528317769,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2792050275864453,
"tenant_id": 2790683160709957,
"category_name": "果盘",
"alias_name": "",
"pid": 2790683528350545,
"business_name": "水果",
"tenant_goods_business_id": 2790683528317769,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2791941988405125,
"tenant_id": 2790683160709957,
"category_name": "零食",
"alias_name": "",
"pid": 0,
"business_name": "零食",
"tenant_goods_business_id": 2791932037238661,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2791948300259205,
"tenant_id": 2790683160709957,
"category_name": "零食",
"alias_name": "",
"pid": 2791941988405125,
"business_name": "零食",
"tenant_goods_business_id": 2791932037238661,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2793236829620037,
"tenant_id": 2790683160709957,
"category_name": "面",
"alias_name": "",
"pid": 2791941988405125,
"business_name": "零食",
"tenant_goods_business_id": 2791932037238661,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2791942087561093,
"tenant_id": 2790683160709957,
"category_name": "雪糕",
"alias_name": "",
"pid": 0,
"business_name": "雪糕",
"tenant_goods_business_id": 2791931866402693,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2792035069284229,
"tenant_id": 2790683160709957,
"category_name": "雪糕",
"alias_name": "",
"pid": 2791942087561093,
"business_name": "雪糕",
"tenant_goods_business_id": 2791931866402693,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2792062778003333,
"tenant_id": 2790683160709957,
"category_name": "香烟",
"alias_name": "",
"pid": 0,
"business_name": "香烟",
"tenant_goods_business_id": 2790683528317765,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2792063209623429,
"tenant_id": 2790683160709957,
"category_name": "香烟",
"alias_name": "",
"pid": 2792062778003333,
"business_name": "香烟",
"tenant_goods_business_id": 2790683528317765,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 1,
"is_warehousing": 1
}
],
"sort": 1,
"is_warehousing": 1
},
{
"id": 2793217944864581,
"tenant_id": 2790683160709957,
"category_name": "其他",
"alias_name": "",
"pid": 0,
"business_name": "其他",
"tenant_goods_business_id": 2793217599407941,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2793218343257925,
"tenant_id": 2790683160709957,
"category_name": "其他2",
"alias_name": "",
"pid": 2793217944864581,
"business_name": "其他",
"tenant_goods_business_id": 2793217599407941,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
},
{
"id": 2793220945250117,
"tenant_id": 2790683160709957,
"category_name": "小吃",
"alias_name": "",
"pid": 0,
"business_name": "小吃",
"tenant_goods_business_id": 2793220268902213,
"open_salesman": 2,
"categoryBoxes": [
{
"id": 2793221283104581,
"tenant_id": 2790683160709957,
"category_name": "小吃",
"alias_name": "",
"pid": 2793220945250117,
"business_name": "小吃",
"tenant_goods_business_id": 2793220268902213,
"open_salesman": 2,
"categoryBoxes": [],
"sort": 0,
"is_warehousing": 1
}
],
"sort": 0,
"is_warehousing": 1
}
]
},
"code": 0
}
]


@@ -1,646 +0,0 @@
[
{
"data": {
"total": 17,
"packageCouponList": [
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2939215004469573,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "早场特惠一小时",
"table_area_id": "0",
"table_area_name": "A区",
"selling_price": 0.0,
"duration": 3600,
"start_time": "2025-10-27 00:00:00",
"end_time": "2026-10-28 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 2,
"package_id": 1814707240811572,
"usable_count": 9999999,
"create_time": "2025-10-27 18:24:09",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791960001957765",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2861343275830405,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "B区桌球一小时",
"table_area_id": "0",
"table_area_name": "B区",
"selling_price": 0.0,
"duration": 3600,
"start_time": "2025-09-02 00:00:00",
"end_time": "2026-09-03 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 1,
"package_id": 1370841337,
"usable_count": 9999999,
"create_time": "2025-09-02 18:08:56",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791960521691013",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 3,
"id": 2836713896429317,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "午夜一小时",
"table_area_id": "0",
"table_area_name": "A区",
"selling_price": 0.0,
"duration": 3600,
"start_time": "2025-08-16 00:00:00",
"end_time": "2026-08-17 00:00:00",
"is_enabled": 2,
"is_delete": 0,
"type": 1,
"package_id": 1370841337,
"usable_count": 9999999,
"create_time": "2025-08-16 08:34:38",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791960001957765",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2801876691340293,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "中八、斯诺克包厢两小时",
"table_area_id": "0",
"table_area_name": "VIP包厢",
"selling_price": 0.0,
"duration": 7200,
"start_time": "2025-07-22 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 1,
"package_id": 1126976372,
"usable_count": 9999999,
"create_time": "2025-07-22 17:56:24",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791961060364165",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 3,
"id": 2801875268668357,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "中八、斯诺克包厢两小时",
"table_area_id": "0",
"table_area_name": "VIP包厢",
"selling_price": 0.0,
"duration": 7200,
"start_time": "2025-07-22 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 2,
"is_delete": 0,
"type": 1,
"package_id": 1126976372,
"usable_count": 9999999,
"create_time": "2025-07-22 17:54:57",
"creator_name": "管理员:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791961060364165",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "0",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 3,
"id": 2800772613934149,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "午夜场一小时A区",
"table_area_id": "0",
"table_area_name": "A区",
"selling_price": 0.0,
"duration": 3600,
"start_time": "2025-07-21 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 2,
"is_delete": 0,
"type": 1,
"package_id": 1370841337,
"usable_count": 9999999,
"create_time": "2025-07-21 23:13:16",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791960001957765",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2798905767676933,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "中八、斯诺克包厢两小时",
"table_area_id": "0",
"table_area_name": "VIP包厢",
"selling_price": 0.0,
"duration": 7200,
"start_time": "2025-07-21 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 2,
"package_id": 1812429097416714,
"usable_count": 9999999,
"create_time": "2025-07-20 15:34:13",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791961060364165",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 3,
"id": 2798901295615045,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "新人特惠A区中八一小时",
"table_area_id": "0",
"table_area_name": "A区",
"selling_price": 0.0,
"duration": 3600,
"start_time": "2025-07-21 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 2,
"is_delete": 0,
"type": 2,
"package_id": 1814707240811572,
"usable_count": 9999999,
"create_time": "2025-07-20 15:29:40",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791960001957765",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2798898826300485,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "斯诺克两小时",
"table_area_id": "0",
"table_area_name": "斯诺克区",
"selling_price": 0.0,
"duration": 7200,
"start_time": "2025-07-21 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 2,
"package_id": 1814983609169019,
"usable_count": 9999999,
"create_time": "2025-07-20 15:27:09",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791961347968901",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2798734170983493,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "助理教练竞技教学两小时",
"table_area_id": "0",
"table_area_name": "A区",
"selling_price": 0.0,
"duration": 7200,
"start_time": "2025-07-21 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 1,
"package_id": 1173128804,
"usable_count": 9999999,
"create_time": "2025-07-20 12:39:39",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791960001957765",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2798732571167749,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "全天斯诺克一小时",
"table_area_id": "0",
"table_area_name": "斯诺克区",
"selling_price": 0.0,
"duration": 3600,
"start_time": "2025-07-21 00:00:00",
"end_time": "2025-12-30 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 1,
"package_id": 1147633733,
"usable_count": 9999999,
"create_time": "2025-07-20 12:38:02",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791961347968901",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2798731703045189,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "KTV欢唱四小时",
"table_area_id": "0",
"table_area_name": "888",
"selling_price": 0.0,
"duration": 14400,
"start_time": "2025-07-21 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 1,
"package_id": 1137882866,
"usable_count": 9999999,
"create_time": "2025-07-20 12:37:09",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791961709907845",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "10:00:00",
"add_end_clock": "18:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2798729978514501,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "全天A区中八两小时",
"table_area_id": "0",
"table_area_name": "A区",
"selling_price": 0.0,
"duration": 7200,
"start_time": "2025-07-21 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 1,
"package_id": 1130465371,
"usable_count": 9999999,
"create_time": "2025-07-20 12:35:24",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791960001957765",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2798728823213061,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "全天B区中八两小时",
"table_area_id": "0",
"table_area_name": "B区",
"selling_price": 0.0,
"duration": 7200,
"start_time": "2025-07-21 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 1,
"package_id": 1137872168,
"usable_count": 9999999,
"create_time": "2025-07-20 12:34:13",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791960521691013",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2798727423528005,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "全天A区中八一小时",
"table_area_id": "0",
"table_area_name": "A区",
"selling_price": 0.0,
"duration": 3600,
"start_time": "2025-07-21 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 1,
"package_id": 1128411555,
"usable_count": 9999999,
"create_time": "2025-07-20 12:32:48",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791960001957765",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2798723640069125,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "中八A区新人特惠一小时",
"table_area_id": "0",
"table_area_name": "A区",
"selling_price": 0.0,
"duration": 3600,
"start_time": "2025-07-20 00:00:00",
"end_time": "2025-12-31 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 1,
"package_id": 1203035334,
"usable_count": 9999999,
"create_time": "2025-07-20 12:28:57",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791960001957765",
"start_clock": "00:00:00",
"end_clock": "1.00:00:00",
"add_start_clock": "00:00:00",
"add_end_clock": "1.00:00:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
},
{
"site_name": "朗朗桌球",
"effective_status": 1,
"id": 2798713926290437,
"site_id": 2790685415443269,
"tenant_id": 2790683160709957,
"package_name": "麻将 、掼蛋包厢四小时",
"table_area_id": "0",
"table_area_name": "麻将房",
"selling_price": 0.0,
"duration": 14400,
"start_time": "2025-07-21 00:00:00",
"end_time": "2025-12-30 00:00:00",
"is_enabled": 1,
"is_delete": 0,
"type": 1,
"package_id": 1134269810,
"usable_count": 9999999,
"create_time": "2025-07-20 12:19:04",
"creator_name": "店长:郑丽珊",
"tenant_table_area_id": "0",
"table_area_id_list": "",
"tenant_table_area_id_list": "2791962314215301",
"start_clock": "10:00:00",
"end_clock": "1.02:00:00",
"add_start_clock": "10:00:00",
"add_end_clock": "23:59:00",
"date_info": "",
"date_type": 1,
"group_type": 1,
"usable_range": "",
"coupon_money": 0.0,
"area_tag_type": 1,
"system_group_type": 1,
"max_selectable_categories": 0,
"card_type_ids": "0"
}
]
},
"code": 0
},
{
"data": {
"total": 17,
"packageCouponList": []
},
"code": 0
}
]


@@ -1,20 +0,0 @@
[
{
"data": {
"goodsStockA": 0,
"goodsStockB": 6252,
"goodsSaleNum": 210.29,
"stockSumMoney": 1461.28
},
"code": 0
},
{
"data": {
"goodsStockA": 0,
"goodsStockB": 6252,
"goodsSaleNum": 210.29,
"stockSumMoney": 1461.28
},
"code": 0
}
]


@@ -26,6 +26,7 @@ from tasks.refunds_task import RefundsTask
from tasks.table_discount_task import TableDiscountTask
from tasks.tables_task import TablesTask
from tasks.topups_task import TopupsTask
from utils.json_store import endpoint_to_filename
DEFAULT_STORE_ID = 2790685415443269
BASE_TS = "2025-01-01 10:00:00"
@@ -47,12 +48,6 @@ class TaskSpec:
return endpoint_to_filename(self.endpoint)
def endpoint_to_filename(endpoint: str) -> str:
"""根据 API endpoint 生成稳定可复用的文件名,便于离线模式在目录中直接定位归档 JSON。"""
normalized = endpoint.strip("/").replace("/", "__").replace(" ", "_").lower()
return f"{normalized or 'root'}.json"
def wrap_records(records: List[Dict], data_path: Sequence[str]):
"""按照 data_path 逐层包裹记录列表,使其结构与真实 API 返回体一致,方便离线回放。"""
payload = records
@@ -68,6 +63,7 @@ def create_test_config(mode: str, archive_dir: Path, temp_dir: Path) -> AppConfi
archive_dir.mkdir(parents=True, exist_ok=True)
temp_dir.mkdir(parents=True, exist_ok=True)
flow = "FULL" if str(mode or "").upper() == "ONLINE" else "INGEST_ONLY"
overrides = {
"app": {"store_id": DEFAULT_STORE_ID, "timezone": "Asia/Taipei"},
"db": {"dsn": "postgresql://user:pass@localhost:5432/etl_billiards_test"},
@@ -77,10 +73,10 @@ def create_test_config(mode: str, archive_dir: Path, temp_dir: Path) -> AppConfi
"timeout_sec": 3,
"page_size": 50,
},
"testing": {
"mode": mode,
"json_archive_dir": str(archive_dir),
"temp_json_dir": str(temp_dir),
"pipeline": {
"flow": flow,
"fetch_root": str(temp_dir / "json_fetch"),
"ingest_source_dir": str(archive_dir),
},
"io": {
"export_root": str(temp_dir / "export"),
@@ -135,16 +131,45 @@ class FakeDBOperations:
def __init__(self):
self.upserts: List[Dict] = []
self.executes: List[Dict] = []
self.commits = 0
self.rollbacks = 0
self.conn = FakeConnection()
# Pre-seeded query results (FIFO) to let tests control DB-returned rows
self.query_results: List[List[Dict]] = []
def batch_upsert_with_returning(self, sql: str, rows: List[Dict], page_size: int = 1000):
self.upserts.append({"sql": sql.strip(), "count": len(rows), "page_size": page_size})
self.upserts.append(
{
"sql": sql.strip(),
"count": len(rows),
"page_size": page_size,
"rows": [dict(row) for row in rows],
}
)
return len(rows), 0
def batch_execute(self, sql: str, rows: List[Dict], page_size: int = 1000):
self.upserts.append({"sql": sql.strip(), "count": len(rows), "page_size": page_size})
self.executes.append(
{
"sql": sql.strip(),
"count": len(rows),
"page_size": page_size,
"rows": [dict(row) for row in rows],
}
)
def execute(self, sql: str, params=None):
self.executes.append({"sql": sql.strip(), "params": params})
def query(self, sql: str, params=None):
self.executes.append({"sql": sql.strip(), "params": params, "type": "query"})
if self.query_results:
return self.query_results.pop(0)
return []
def cursor(self):
return self.conn.cursor()
def commit(self):
self.commits += 1
@@ -161,22 +186,53 @@ class FakeAPIClient:
self.calls: List[Dict] = []
# pylint: disable=unused-argument
def get_paginated(self, endpoint: str, params=None, **kwargs):
def iter_paginated(
self,
endpoint: str,
params=None,
page_size: int = 200,
page_field: str = "page",
size_field: str = "limit",
data_path: Tuple[str, ...] = (),
list_key: str | None = None,
):
self.calls.append({"endpoint": endpoint, "params": params})
if endpoint not in self.data_map:
raise AssertionError(f"Missing fixture for endpoint {endpoint}")
return list(self.data_map[endpoint]), [{"page": 1, "size": len(self.data_map[endpoint])}]
records = list(self.data_map[endpoint])
yield 1, records, dict(params or {}), {"data": records}
def get_paginated(self, endpoint: str, params=None, **kwargs):
records = []
pages = []
for page_no, page_records, req, resp in self.iter_paginated(endpoint, params, **kwargs):
records.extend(page_records)
pages.append({"page": page_no, "request": req, "response": resp})
return records, pages
def get_source_hint(self, endpoint: str) -> str | None:
return None
class OfflineAPIClient:
"""离线模式专用 API Client根据 endpoint 读取归档 JSON、套 data_path 并回放列表数据。"""
def __init__(self, file_map: Dict[str, Path]):
self.file_map = {k: Path(v) for k, v in file_map.items()}
self.calls: List[Dict] = []
# pylint: disable=unused-argument
def get_paginated(self, endpoint: str, params=None, page_size: int = 200, data_path: Tuple[str, ...] = (), **kwargs):
def iter_paginated(
self,
endpoint: str,
params=None,
page_size: int = 200,
page_field: str = "page",
size_field: str = "limit",
data_path: Tuple[str, ...] = (),
list_key: str | None = None,
):
self.calls.append({"endpoint": endpoint, "params": params})
if endpoint not in self.file_map:
raise AssertionError(f"Missing archive for endpoint {endpoint}")
@@ -188,17 +244,42 @@ class OfflineAPIClient:
for key in data_path:
if isinstance(data, dict):
data = data.get(key, [])
else:
data = []
break
if list_key and isinstance(data, dict):
data = data.get(list_key, [])
if not isinstance(data, list):
data = []
return data, [{"page": 1, "mode": "offline"}]
total = len(data)
start = 0
page = 1
while start < total or (start == 0 and total == 0):
chunk = data[start : start + page_size]
if not chunk and total != 0:
break
yield page, list(chunk), dict(params or {}), payload
if len(chunk) < page_size:
break
start += page_size
page += 1
def get_paginated(self, endpoint: str, params=None, **kwargs):
records = []
pages = []
for page_no, page_records, req, resp in self.iter_paginated(endpoint, params, **kwargs):
records.extend(page_records)
pages.append({"page": page_no, "request": req, "response": resp})
return records, pages
def get_source_hint(self, endpoint: str) -> str | None:
if endpoint not in self.file_map:
return None
return str(self.file_map[endpoint])
class RealDBOperationsAdapter:
"""连接真实 PostgreSQL 的适配器,为任务提供 batch_upsert + 事务能力。"""
def __init__(self, dsn: str):
@@ -247,7 +328,7 @@ TASK_SPECS: List[TaskSpec] = [
code="PRODUCTS",
task_cls=ProductsTask,
endpoint="/TenantGoods/QueryTenantGoods",
data_path=("data",),
data_path=("data", "tenantGoodsList"),
sample_records=[
{
"siteGoodsId": 101,
@@ -298,7 +379,7 @@ TASK_SPECS: List[TaskSpec] = [
code="MEMBERS",
task_cls=MembersTask,
endpoint="/MemberProfile/GetTenantMemberList",
data_path=("data",),
data_path=("data", "tenantMemberInfos"),
sample_records=[
{
"memberId": 401,
@@ -313,7 +394,7 @@ TASK_SPECS: List[TaskSpec] = [
TaskSpec(
code="ASSISTANTS",
task_cls=AssistantsTask,
endpoint="/Assistant/List",
endpoint="/PersonnelManagement/SearchAssistantInfo",
data_path=("data", "assistantInfos"),
sample_records=[
{
@@ -351,7 +432,7 @@ TASK_SPECS: List[TaskSpec] = [
TaskSpec(
code="PACKAGES_DEF",
task_cls=PackagesDefTask,
endpoint="/Package/List",
endpoint="/PackageCoupon/QueryPackageCouponList",
data_path=("data", "packageCouponList"),
sample_records=[
{
@@ -381,8 +462,8 @@ TASK_SPECS: List[TaskSpec] = [
TaskSpec(
code="ORDERS",
task_cls=OrdersTask,
endpoint="/order/list",
data_path=("data",),
endpoint="/Site/GetAllOrderSettleList",
data_path=("data", "settleList"),
sample_records=[
{
"orderId": 701,
@@ -403,7 +484,7 @@ TASK_SPECS: List[TaskSpec] = [
TaskSpec(
code="PAYMENTS",
task_cls=PaymentsTask,
endpoint="/pay/records",
endpoint="/PayLog/GetPayLogListPage",
data_path=("data",),
sample_records=[
{
@@ -420,8 +501,8 @@ TASK_SPECS: List[TaskSpec] = [
TaskSpec(
code="REFUNDS",
task_cls=RefundsTask,
endpoint="/Pay/RefundList",
data_path=(),
endpoint="/Order/GetRefundPayLogList",
data_path=("data",),
sample_records=[
{
"id": 901,
@@ -449,8 +530,8 @@ TASK_SPECS: List[TaskSpec] = [
TaskSpec(
code="COUPON_USAGE",
task_cls=CouponUsageTask,
endpoint="/Coupon/UsageList",
data_path=(),
endpoint="/Promotion/GetOfflineCouponConsumePageList",
data_path=("data",),
sample_records=[
{
"id": 1001,
@@ -479,7 +560,7 @@ TASK_SPECS: List[TaskSpec] = [
TaskSpec(
code="INVENTORY_CHANGE",
task_cls=InventoryChangeTask,
endpoint="/Inventory/ChangeList",
endpoint="/GoodsStockManage/QueryGoodsOutboundReceipt",
data_path=("data", "queryDeliveryRecordsList"),
sample_records=[
{
@@ -503,7 +584,7 @@ TASK_SPECS: List[TaskSpec] = [
TaskSpec(
code="TOPUPS",
task_cls=TopupsTask,
endpoint="/Topup/SettleList",
endpoint="/Site/GetRechargeSettleList",
data_path=("data", "settleList"),
sample_records=[
{
@@ -542,7 +623,7 @@ TASK_SPECS: List[TaskSpec] = [
TaskSpec(
code="TABLE_DISCOUNT",
task_cls=TableDiscountTask,
endpoint="/Table/AdjustList",
endpoint="/Site/GetTaiFeeAdjustList",
data_path=("data", "taiFeeAdjustInfos"),
sample_records=[
{
@@ -572,7 +653,7 @@ TASK_SPECS: List[TaskSpec] = [
TaskSpec(
code="ASSISTANT_ABOLISH",
task_cls=AssistantAbolishTask,
endpoint="/Assistant/AbolishList",
endpoint="/AssistantPerformance/GetAbolitionAssistant",
data_path=("data", "abolitionAssistants"),
sample_records=[
{
@@ -593,7 +674,7 @@ TASK_SPECS: List[TaskSpec] = [
TaskSpec(
code="LEDGER",
task_cls=LedgerTask,
endpoint="/Assistant/LedgerList",
endpoint="/AssistantPerformance/GetOrderAssistantDetails",
data_path=("data", "orderAssistantDetails"),
sample_records=[
{

View File

@@ -0,0 +1,59 @@
# -*- coding: utf-8 -*-
"""验证 14 个任务的 E/T/L 分阶段调用FakeDB/FakeAPI不访问真实接口或数据库"""
import logging
import sys
from pathlib import Path
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo
import pytest
PROJECT_ROOT = Path(__file__).resolve().parents[2]
if str(PROJECT_ROOT) not in sys.path:
sys.path.insert(0, str(PROJECT_ROOT))
from tasks.base_task import TaskContext
from tests.unit.task_test_utils import (
TASK_SPECS,
create_test_config,
get_db_operations,
FakeAPIClient,
)
def _build_context(store_id: int) -> TaskContext:
now = datetime.now(ZoneInfo("Asia/Taipei"))
return TaskContext(
store_id=store_id,
window_start=now - timedelta(minutes=30),
window_end=now,
window_minutes=30,
cursor=None,
)
@pytest.mark.parametrize("spec", TASK_SPECS)
def test_etl_stage_flow(spec, tmp_path):
"""对每个任务,单独调用 transform/load验证 counts 结构与 FakeDB 写入。"""
config = create_test_config("ONLINE", tmp_path / "archive", tmp_path / "temp")
api = FakeAPIClient({spec.endpoint: spec.sample_records})
logger = logging.getLogger(f"test_{spec.code.lower()}")
task_cls = spec.task_cls
with get_db_operations() as db_ops:
task = task_cls(config, db_ops, api, logger)
ctx = _build_context(config.get("app.store_id"))
# 跳过 extract直接验证 transform + load
extracted = {"records": spec.sample_records}
transformed = task.transform(extracted, ctx)
counts = task.load(transformed, ctx)
assert set(counts.keys()) == {"fetched", "inserted", "updated", "skipped", "errors"}
assert counts["fetched"] == len(spec.sample_records)
assert counts["errors"] == 0
# FakeDB 记录upserts/executes至少有一条
upserts = getattr(db_ops, "upserts", [])
executes = getattr(db_ops, "executes", [])
assert upserts or executes, "expected db operations to be recorded"

View File

@@ -0,0 +1,161 @@
# -*- coding: utf-8 -*-
"""Unit tests for the new ODS ingestion tasks."""
import logging
import os
import sys
from pathlib import Path
# Ensure project root is resolvable when running tests in isolation
PROJECT_ROOT = Path(__file__).resolve().parents[2]
if str(PROJECT_ROOT) not in sys.path:
sys.path.insert(0, str(PROJECT_ROOT))
os.environ.setdefault("ETL_SKIP_DOTENV", "1")
from tasks.ods_tasks import ODS_TASK_CLASSES
from .task_test_utils import create_test_config, get_db_operations, FakeAPIClient
def _build_config(tmp_path):
archive_dir = tmp_path / "archive"
temp_dir = tmp_path / "temp"
return create_test_config("ONLINE", archive_dir, temp_dir)
def test_assistant_accounts_masters_ingest(tmp_path):
"""Ensure assistant_accounts_masterS task stores raw payload with record_index dedup keys."""
config = _build_config(tmp_path)
sample = [
{
"id": 5001,
"assistant_no": "A01",
"nickname": "灏忓紶",
}
]
api = FakeAPIClient({"/PersonnelManagement/SearchAssistantInfo": sample})
task_cls = ODS_TASK_CLASSES["assistant_accounts_masterS"]
with get_db_operations() as db_ops:
task = task_cls(config, db_ops, api, logging.getLogger("test_assistant_accounts_masters"))
result = task.execute()
assert result["status"] == "SUCCESS"
assert result["counts"]["fetched"] == 1
assert db_ops.commits == 1
row = db_ops.upserts[0]["rows"][0]
assert row["id"] == 5001
assert row["record_index"] == 0
assert row["source_file"] is None or row["source_file"] == ""
assert '"id": 5001' in row["payload"]
def test_goods_stock_movements_ingest(tmp_path):
"""Ensure goods_stock_movements task stores raw payload with record_index dedup keys."""
config = _build_config(tmp_path)
sample = [
{
"siteGoodsStockId": 123456,
"stockType": 1,
"goodsName": "娴嬭瘯鍟嗗搧",
}
]
api = FakeAPIClient({"/GoodsStockManage/QueryGoodsOutboundReceipt": sample})
task_cls = ODS_TASK_CLASSES["goods_stock_movements"]
with get_db_operations() as db_ops:
task = task_cls(config, db_ops, api, logging.getLogger("test_goods_stock_movements"))
result = task.execute()
assert result["status"] == "SUCCESS"
assert result["counts"]["fetched"] == 1
assert db_ops.commits == 1
row = db_ops.upserts[0]["rows"][0]
assert row["sitegoodsstockid"] == 123456
assert row["record_index"] == 0
assert '"siteGoodsStockId": 123456' in row["payload"]
def test_member_profiless_ingest(tmp_path):
"""Ensure ODS_MEMBER task stores tenantMemberInfos raw JSON."""
config = _build_config(tmp_path)
sample = [{"tenantMemberInfos": [{"id": 101, "mobile": "13800000000"}]}]
api = FakeAPIClient({"/MemberProfile/GetTenantMemberList": sample})
task_cls = ODS_TASK_CLASSES["ODS_MEMBER"]
with get_db_operations() as db_ops:
task = task_cls(config, db_ops, api, logging.getLogger("test_ods_member"))
result = task.execute()
assert result["status"] == "SUCCESS"
row = db_ops.upserts[0]["rows"][0]
assert row["record_index"] == 0
assert '"id": 101' in row["payload"]
def test_ods_payment_ingest(tmp_path):
"""Ensure ODS_PAYMENT task stores payment_transactions raw JSON."""
config = _build_config(tmp_path)
sample = [{"payId": 901, "payAmount": "100.00"}]
api = FakeAPIClient({"/PayLog/GetPayLogListPage": sample})
task_cls = ODS_TASK_CLASSES["ODS_PAYMENT"]
with get_db_operations() as db_ops:
task = task_cls(config, db_ops, api, logging.getLogger("test_ods_payment"))
result = task.execute()
assert result["status"] == "SUCCESS"
row = db_ops.upserts[0]["rows"][0]
assert row["record_index"] == 0
assert '"payId": 901' in row["payload"]
def test_ods_settlement_records_ingest(tmp_path):
"""Ensure settlement_records task stores settleList raw JSON."""
config = _build_config(tmp_path)
sample = [{"data": {"settleList": [{"id": 701, "orderTradeNo": 8001}]}}]
api = FakeAPIClient({"/Site/GetAllOrderSettleList": sample})
task_cls = ODS_TASK_CLASSES["settlement_records"]
with get_db_operations() as db_ops:
task = task_cls(config, db_ops, api, logging.getLogger("test_settlement_records"))
result = task.execute()
assert result["status"] == "SUCCESS"
row = db_ops.upserts[0]["rows"][0]
assert row["record_index"] == 0
assert '"orderTradeNo": 8001' in row["payload"]
def test_ods_settlement_ticket_by_payment_relate_ids(tmp_path):
"""Ensure settlement tickets are fetched per payment relate_id and skip existing ones."""
config = _build_config(tmp_path)
ticket_payload = {"data": {"data": {"orderSettleId": 9001, "orderSettleNumber": "T001"}}}
api = FakeAPIClient({"/Order/GetOrderSettleTicketNew": [ticket_payload]})
task_cls = ODS_TASK_CLASSES["ODS_SETTLEMENT_TICKET"]
with get_db_operations() as db_ops:
# First query: existing ticket ids; Second query: payment relate_ids
db_ops.query_results = [
[{"order_settle_id": 9002}],
[
{"order_settle_id": 9001},
{"order_settle_id": 9002},
{"order_settle_id": None},
],
]
task = task_cls(config, db_ops, api, logging.getLogger("test_ods_settlement_ticket"))
result = task.execute()
assert result["status"] == "SUCCESS"
counts = result["counts"]
assert counts["fetched"] == 1
assert counts["inserted"] == 1
assert counts["updated"] == 0
assert counts["skipped"] == 0
assert '"orderSettleId": 9001' in db_ops.upserts[0]["rows"][0]["payload"]
assert any(
call["endpoint"] == "/Order/GetOrderSettleTicketNew"
and call.get("params", {}).get("orderSettleId") == 9001
for call in api.calls
)

View File

@@ -0,0 +1,22 @@
# -*- coding: utf-8 -*-
"""汇总与报告工具的单测。"""
from utils.reporting import summarize_counts, format_report
def test_summarize_counts_and_format():
task_results = [
{"task_code": "ORDERS", "counts": {"fetched": 2, "inserted": 2, "updated": 0, "skipped": 0, "errors": 0}},
{"task_code": "PAYMENTS", "counts": {"fetched": 3, "inserted": 2, "updated": 1, "skipped": 0, "errors": 0}},
]
summary = summarize_counts(task_results)
assert summary["total"]["fetched"] == 5
assert summary["total"]["inserted"] == 4
assert summary["total"]["updated"] == 1
assert summary["total"]["errors"] == 0
assert len(summary["details"]) == 2
report = format_report(summary)
assert "TOTAL fetched=5" in report
assert "ORDERS:" in report
assert "PAYMENTS:" in report

View File

@@ -0,0 +1,78 @@
# -*- coding: utf-8 -*-
"""JSON 归档/读取的通用工具。"""
from __future__ import annotations
import json
from pathlib import Path
from typing import Any
from urllib.parse import urlparse
ENDPOINT_FILENAME_MAP: dict[str, str] = {
"/memberprofile/gettenantmemberlist": "member_profiles.json",
"/memberprofile/getmembercardbalancechange": "member_balance_changes.json",
"/memberprofile/gettenantmembercardlist": "member_stored_value_cards.json",
"/site/getrechargesettlelist": "recharge_settlements.json",
"/assistantperformance/getabolitionassistant": "assistant_cancellation_records.json",
"/assistantperformance/getorderassistantdetails": "assistant_service_records.json",
"/personnelmanagement/searchassistantinfo": "assistant_accounts_master.json",
"/table/getsitetables": "site_tables_master.json",
"/site/gettaifeeadjustlist": "table_fee_discount_records.json",
"/site/getsitetableorderdetails": "table_fee_transactions.json",
"/tenantgoods/querytenantgoods": "tenant_goods_master.json",
"/packagecoupon/querypackagecouponlist": "group_buy_packages.json",
"/site/getsitetableusedetails": "group_buy_redemption_records.json",
"/order/getordersettleticketnew": "settlement_ticket_details.json",
"/promotion/getofflinecouponconsumepagelist": "platform_coupon_redemption_records.json",
"/goodsstockmanage/querygoodsoutboundreceipt": "goods_stock_movements.json",
"/tenantgoodscategory/queryprimarysecondarycategory": "stock_goods_category_tree.json",
"/tenantgoods/getgoodsstockreport": "goods_stock_summary.json",
"/paylog/getpayloglistpage": "payment_transactions.json",
"/site/getallordersettlelist": "settlement_records.json",
"/order/getrefundpayloglist": "refund_transactions.json",
"/tenantgoods/getgoodsinventorylist": "store_goods_master.json",
"/tenantgoods/getgoodssaleslist": "store_goods_sales_records.json",
}
def endpoint_to_filename(endpoint: str) -> str:
"""
将 API endpoint 转换为规范化的文件名,优先使用 非球接口API.md 中约定的名称。
未覆盖的路径会回退到“去掉开头斜杠 -> 用双下划线替换斜杠 -> 小写”的规则。
"""
normalized = _normalize_endpoint(endpoint)
if normalized in ENDPOINT_FILENAME_MAP:
return ENDPOINT_FILENAME_MAP[normalized]
fallback = normalized.strip("/").replace("/", "__").replace(" ", "_")
return f"{fallback or 'root'}.json"
def dump_json(path: Path, payload: Any, pretty: bool = False):
"""将 JSON 对象写入文件,默认紧凑,可选美化。"""
path.parent.mkdir(parents=True, exist_ok=True)
with path.open("w", encoding="utf-8") as fp:
json.dump(payload, fp, ensure_ascii=False, indent=2 if pretty else None)
def _normalize_endpoint(endpoint: str) -> str:
"""标准化 endpoint提取路径部分并统一小写、去除 base 前缀。"""
raw = str(endpoint or "").strip()
if not raw:
return ""
parsed = urlparse(raw)
path = parsed.path or raw
if not path.startswith("/"):
path = f"/{path}"
path = path.rstrip("/") or "/"
lowered = path.lower()
for prefix in ("/apiprod/admin/v1", "apiprod/admin/v1"):
if lowered.startswith(prefix):
path = path[len(prefix) :]
if not path.startswith("/"):
path = f"/{path}"
path = path.rstrip("/") or "/"
lowered = path.lower()
break
return lowered
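归档文件名的归一化与回退规则,可以用下面的独立草图验证(内联了上文 `endpoint_to_filename` / `_normalize_endpoint` 的简化版,映射表只取两条作示例,属示意性质,非正式实现):

```python
from urllib.parse import urlparse

# 简化映射表(仅示例两条,完整表见上文 ENDPOINT_FILENAME_MAP
FILENAME_MAP = {
    "/paylog/getpayloglistpage": "payment_transactions.json",
    "/site/getallordersettlelist": "settlement_records.json",
}

def endpoint_to_filename(endpoint: str) -> str:
    # 取 URL 路径部分 -> 去掉 /apiprod/admin/v1 前缀 -> 统一小写
    path = urlparse(str(endpoint or "").strip()).path or str(endpoint)
    if not path.startswith("/"):
        path = "/" + path
    lowered = path.rstrip("/").lower() or "/"
    prefix = "/apiprod/admin/v1"
    if lowered.startswith(prefix):
        lowered = lowered[len(prefix):] or "/"
    if lowered in FILENAME_MAP:
        return FILENAME_MAP[lowered]
    # 未命中映射时回退:去掉首尾斜杠 -> 斜杠换双下划线
    return f"{lowered.strip('/').replace('/', '__') or 'root'}.json"

print(endpoint_to_filename("https://x/apiprod/admin/v1/PayLog/GetPayLogListPage"))  # payment_transactions.json
print(endpoint_to_filename("/Foo/Bar"))  # foo__bar.json
```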

View File

@@ -0,0 +1,53 @@
# -*- coding: utf-8 -*-
"""简单的任务结果汇总与格式化工具。"""
from __future__ import annotations
from typing import Iterable
def summarize_counts(task_results: Iterable[dict]) -> dict:
"""
汇总多个任务的 counts返回总计与逐任务明细。
task_results: 形如 {"task_code": str, "counts": {...}} 的字典序列。
"""
totals = {"fetched": 0, "inserted": 0, "updated": 0, "skipped": 0, "errors": 0}
details = []
for res in task_results:
code = res.get("task_code") or res.get("code") or "UNKNOWN"
counts = res.get("counts") or {}
row = {"task_code": code}
for key in totals.keys():
val = int(counts.get(key, 0) or 0)
row[key] = val
totals[key] += val
details.append(row)
return {"total": totals, "details": details}
def format_report(summary: dict) -> str:
"""将 summarize_counts 的输出格式化为可读文案。"""
lines = []
totals = summary.get("total", {})
lines.append(
"TOTAL fetched={fetched} inserted={inserted} updated={updated} skipped={skipped} errors={errors}".format(
fetched=totals.get("fetched", 0),
inserted=totals.get("inserted", 0),
updated=totals.get("updated", 0),
skipped=totals.get("skipped", 0),
errors=totals.get("errors", 0),
)
)
for row in summary.get("details", []):
lines.append(
"{task_code}: fetched={fetched} inserted={inserted} updated={updated} skipped={skipped} errors={errors}".format(
task_code=row.get("task_code", "UNKNOWN"),
fetched=row.get("fetched", 0),
inserted=row.get("inserted", 0),
updated=row.get("updated", 0),
skipped=row.get("skipped", 0),
errors=row.get("errors", 0),
)
)
return "\n".join(lines)

View File

@@ -1,2 +1,4 @@
requests
psycopg2-binary
python-dateutil
tzdata

tmp/20251121-task.txt Normal file

File diff suppressed because it is too large

tmp/doc_extracted.txt Normal file

File diff suppressed because one or more lines are too long

tmp/doc_lines.txt Normal file
View File

@@ -0,0 +1,286 @@
台球厅数仓 DWD 层数据库说明书
本说明书详细列出了台球厅经营系统的 DWD 层表结构。
每张表都包含字段名称、数据类型、来源、含义、是否属于主键/外键、业务重要性、未知作用标记,以及枚举值解释。说明书依据《*-Analysis.md》中提供的字段说明整理完成未做省略确保字段信息完整可追溯。
因业务需求,每个表拆分为主数据表和扩展数据表(以 Ex 为后缀。例如门店维度分为主表 dim_site 和扩展表 dim_site_Ex两表主键相同作为唯一关联标识业务代码在读写时统一处理将两表视为一张逻辑表。注意极少数表没有扩展表。
注意:考虑到后期分布式部署,以及测试的便利性。所有的“外键”的处理,使用业务处理,不在数据库中强制约束。
维度表DIM
dim_site
门店维度表,提取自各 ODS 中的 siteProfile 对象如table_fee_transactions。记录门店的基本信息和配置是其他事实表的外键。
dim_site_Ex
dim_table
台桌维度表,来自 site_tables_master。每行代表一张球台或包厢包含区域和业务角色信息。
dim_table_Ex
dim_assistant
助教档案维表,对应 assistant_accounts_master。每行代表一位助教账号及其人事/账号状态。
dim_assistant_Ex
dim_member
会员档案维表,对应 member_profiles。每行记录租户内某会员的主档信息包括等级、状态、注册信息等。
dim_member_Ex
dim_member_card_account
已开通的会员卡账户视图,来自 member_stored_value_cards。每行代表一张会员卡账户的快照记录卡种、持卡人、余额、有效期及各种折扣/扣款配置。
重要说明:本视图不仅包含储值卡,还囊括活动抵用券、台费卡、酒水卡、月卡等多种卡种。
大多数折扣/扣款字段在当前数据中保持默认值(如 10.0 表示不打折、100.0 表示全额抵扣、0 表示不启用),业务上暂未使用,但为系统预留能力。
dim_member_card_account_Ex
dim_tenant_goods
租户级商品档案,来自 tenant_goods_master。每行代表一款商品标准定义。
dim_tenant_goods_Ex
dim_store_goods
门店级商品档案,来自 store_goods_master.json。每行代表门店自定义的商品 SKU包括售价和折扣。关联到 dim_tenant_goods 和分类维度。
dim_store_goods_Ex
dim_goods_category
商品分类索引树,来自 stock_goods_category_tree.json。每行是一个分类节点。
categoryBoxes 是“某个分类节点下面的子分类列表”,整个文件里只有两层:根节点 + 子节点两级,不存在孙节点。
每个 categoryBoxes 里的元素结构与根节点完全一致(同样的 11 个字段),只是 pid 指向父节点的 idcategoryBoxes 为空。
同一个分类树在 JSON 里分页返回了两次goodsCategoryList 和每个 categoryBoxes 在两个 page 中完全重复,真正的不同分类节点一共只有 26 个。
从数仓角度,树结构的"真实关系"完全由 id 和 pid 就可以表达categoryBoxes 更像是前端为了直接画树而准备的冗余展开结果,在 DWD 里不需要原样保存这套嵌套结构,只需要把它"打散"成一行一个节点。
下面按业务和数据的视角,把完整的 categoryBoxes 结构展开说明。
一、整体结构categoryBoxes 是子分类数组,深度只有两层
stock_goods_category_tree.json 顶层是一个分页数组 pages每个元素形如
{
"code": 0,
"data": {
"total": 9,
"goodsCategoryList": [ 根分类1, 根分类2, ... 共9个 ]
}
}
每个“根分类”对象都有这些字段:
id
tenant_id
category_name
alias_name
pid
business_name
tenant_goods_business_id
open_salesman
categoryBoxes
sort
is_warehousing
其中:
pid = 0 表示根分类。
categoryBoxes 是一个数组,里面放的是子分类节点对象。
子分类对象和根分类字段完全一样,只是:
pid = 父节点 id
categoryBoxes = []
两个 page 的 goodsCategoryList 完全相同,所以你看到的 18 个“根”其实是相同的 9 个重复了两次categoryBoxes 里的子节点也重复了两次。按照去重后的真实结构:
根节点 9 个。
子节点 17 个。
总共 26 个不同的 id。
所以“完整的 categoryBoxes”其实就是9 个根节点各自带着若干子节点。
二、逐个根节点把 categoryBoxes 展开给你看
下面按“根分类 → categoryBoxes 子分类”的树形方式列一遍,便于你直观看到完整结构。
1. 根:槟榔
根节点:
id = 2790683528350533
category_name = 槟榔
business_name = 槟榔
pid = 0
categoryBoxes 只有一个子分类:
子分类 1
id = 2790683528350534
category_name = 槟榔
business_name = 槟榔
pid = 2790683528350533
categoryBoxes = []
其它字段tenant_id、tenant_goods_business_id、is_warehousing、open_salesman、sort 与父节点一致。
可以理解为:业务线“槟榔”,下面只有一个细分类“槟榔”。
2. 根:器材
根节点:
id = 2790683528350535
category_name = 器材
business_name = 器材
pid = 0
categoryBoxes 子分类 3 个:
子分类 1
id = 2790683528350536
category_name = 皮头
pid = 2790683528350535
子分类 2
id = 2790683528350537
category_name = 球杆
pid = 2790683528350535
子分类 3
id = 2790683528350538
category_name = 其他
pid = 2790683528350535
这条业务线代表所有“器材相关商品”,细分为皮头、球杆、器材其他。
3. 根:酒水
根节点:
id = 2790683528350539
category_name = 酒水
business_name = 酒水
pid = 0
categoryBoxes 子分类 6 个:
子分类 1饮料
id = 2790683528350540
category_name = 饮料
pid = 2790683528350539
子分类 2酒水
id = 2790683528350541
category_name = 酒水
pid = 2790683528350539
子分类 3茶水
id = 2790683528350542
category_name = 茶水
pid = 2790683528350539
子分类 4咖啡
id = 2790683528350543
category_name = 咖啡
pid = 2790683528350539
子分类 5加料
id = 2790683528350544
category_name = 加料
pid = 2790683528350539
子分类 6洋酒
id = 2793221553489733
category_name = 洋酒
pid = 2790683528350539
这里是最典型的一棵分类树:业务线“酒水”,细分成饮料、普通酒水、茶水、咖啡、加料、洋酒。
4. 根:果盘(业务线:水果)
根节点:
id = 2790683528350545
category_name = 果盘
business_name = 水果
pid = 0
categoryBoxes 子分类 1 个:
子分类:
id = 2792050275864453
category_name = 果盘
business_name = 水果
pid = 2790683528350545
这里有个有意思的点:
分类名称用的是“果盘”,而业务大类 business_name 是“水果”,说明业务线从“水果”角度管理,这个店真正卖的具体品类是“果盘”。
5. 根:零食
根节点:
id = 2791941988405125
category_name = 零食
business_name = 零食
pid = 0
categoryBoxes 子分类 2 个:
子分类 1
id = 2791948300259205
category_name = 零食
pid = 2791941988405125
子分类 2
id = 2793236829620037
category_name = 面
pid = 2791941988405125
这说明“面”类商品也被算在零食这条业务线里(这完全是你们的门店本地习惯)。
6. 根:雪糕
根节点:
id = 2791942087561093
category_name = 雪糕
business_name = 雪糕
pid = 0
categoryBoxes 子分类 1 个:
子分类:
id = 2792035069284229
category_name = 雪糕
pid = 2791942087561093
7. 根:香烟
根节点:
id = 2792062778003333
category_name = 香烟
business_name = 香烟
pid = 0
categoryBoxes 子分类 1 个:
子分类:
id = 2792063209623429
category_name = 香烟
pid = 2792062778003333
8. 根:其他
根节点:
id = 2793217944864581
category_name = 其他
business_name = 其他
pid = 0
categoryBoxes 子分类 1 个:
子分类:
id = 2793218343257925
category_name = 其他2
pid = 2793217944864581
可以理解为“杂项类商品”的一级和二级拆分。
9. 根:小吃
根节点:
id = 2793220945250117
category_name = 小吃
business_name = 小吃
pid = 0
categoryBoxes 子分类 1 个:
子分类:
id = 2793221283104581
category_name = 小吃
pid = 2793220945250117
三、categoryBoxes 元素的字段与含义
无论是在 goodsCategoryList 还是 categoryBoxes 里,每个分类节点的字段集合完全一致:
id分类节点主键唯一。
tenant_id租户 ID本文件所有节点相同。
category_name分类名见上面的各种名称。
alias_name分类别名当前全部为空字符串。
pid父级分类 ID根节点为 0子节点为父节点 id。
business_name业务大类名用于业务线聚合。
tenant_goods_business_id业务大类 ID对应 business_name根节点与子节点在同一业务线中取值相同。
open_salesman营业员开关当前所有值为 2表示没启用分类级差异。
categoryBoxes子分类数组只有根节点非空子节点都是空数组。
sort排序小部分为 1大部分为 0目前排序未精细化使用。
is_warehousing是否走库存本文件全部为 1。
这一点很重要categoryBoxes 里的元素不是“别的结构”,就是一套完整分类节点,只是挂在父节点下面而已。
四、从 DWD 建模角度怎么看 categoryBoxes
结合上面的完整展开,可以得出几个明确结论和建议:
真实树关系靠的是 id 和 pid
子节点的 pid 永远等于父节点的 id。
即使没有 categoryBoxes你也完全可以自下而上拼出整棵树。
categoryBoxes 是前端友好的结构,不是建模必要字段。
当前树只有两层
根节点的 categoryBoxes 非空,子节点的 categoryBoxes 全为空。
对应 DWD 可以直接用一个 category_level 字段区分 1、2 层,再配一个 is_leaf 字段。
JSON 有分页重复
同一套分类树出现在两个 page 中goodsCategoryList 与 categoryBoxes 内容重复。
ETL 时必须按 id 去重,否则维表会重复插入相同分类。
在 DWD 的 dim_goods_category 里categoryBoxes 本身不需要落列
只保留每个节点一行category_id、category_name、parent_category_id、category_level、is_leaf、tenant_goods_business_id、business_name 等即可。
如果你特别想保留源结构,可以另开一个 raw_json 字段存原始节点 JSON日后排错用但不建议在分析建模中依赖它。
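按照上面的结论,"按 id 去重 + 打平成一行一个节点"可以写成如下草图(仅示意;节点字段取自上文列出字段的子集,输入假设为顶层分页数组):

```python
def flatten_category_pages(pages):
    # 收集所有节点,按 id 去重(分页重复与 categoryBoxes 冗余自动合并)
    nodes = {}

    def visit(node):
        nodes[node["id"]] = node
        for child in node.get("categoryBoxes") or []:
            visit(child)

    for page in pages:
        for root in page.get("data", {}).get("goodsCategoryList", []):
            visit(root)

    parent_ids = {n["pid"] for n in nodes.values() if n["pid"] != 0}
    return [
        {
            "category_id": n["id"],
            "category_name": n["category_name"],
            "parent_category_id": None if n["pid"] == 0 else n["pid"],
            "category_level": 1 if n["pid"] == 0 else 2,  # 树只有两层
            "is_leaf": n["id"] not in parent_ids,
            "business_name": n.get("business_name"),
        }
        for n in nodes.values()
    ]

# 两个 page 内容完全相同,模拟源文件的分页重复
demo_root = {
    "id": 1, "pid": 0, "category_name": "酒水", "business_name": "酒水",
    "categoryBoxes": [{"id": 2, "pid": 1, "category_name": "饮料",
                       "business_name": "酒水", "categoryBoxes": []}],
}
pages = [{"code": 0, "data": {"goodsCategoryList": [demo_root]}}] * 2
rows = flatten_category_pages(pages)  # 去重后只有 2 个节点
```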
dim_goods_category_Ex
dim_groupbuy_package
团购套餐定义,来自 group_buy_packages。每行代表一种团购套餐及其使用规则。
dim_groupbuy_package_Ex
事实表DWD
以下事实表均以“业务事件”为粒度,不做聚合。字段来源包括原始 JSON(或ODS) 中的明细数组以及对象属性。时间单位均统一为秒,并保留原始字段以备检查。金额按照源系统保持符号规则,不做符号转换。
dwd_settlement_head结账记录
来自 settlement_records的内层 settleList 对象,每行代表一次结账。该表在业务上是其他明细事实表的汇总头,用于串联台费、商品、助教、券等明细。
dwd_settlement_head_Ex结账记录扩展
dwd_table_fee_log台费流水
来自 table_fee_transactions 的 siteTableUseDetailsList忽略 siteProfile已在 dim_site 实现)。粒度为一次台费使用记录(包括包厢)。该表关联订单结账头、桌台、会员等维度。
dwd_table_fee_log_Ex台费流水扩展
dwd_table_fee_adjust台费折扣/调整)
来自 table_fee_discount_records 的 data.taiFeeAdjustInfos 路径下的字段。每行代表一次台费打折或减免操作,结构相对简单。
dwd_table_fee_adjust_Ex台费折扣/调整扩展)
dwd_store_goods_sale商品销售明细
来自 store_goods_sales_records的 orderGoodsLedgers。每行代表订单中的一条商品销售明细。字段较多以下列出关键字段及其作用。
dwd_store_goods_sale_Ex商品销售明细扩展
dwd_assistant_service_log助教服务流水
来自 assistant_service_records 的 data.orderAssistantDetails。每行表示一次助教提供服务的记录包括服务时长、金额、助教与会员关联等。
dwd_assistant_service_log_Ex助教服务流水扩展
dwd_assistant_trash_event助教废除事件
来自 assistant_cancellation_records 的 abolitionAssistants。每行代表一次助教服务被废除的事件无法直接与结算记录或助教流水关联只能通过门店+台桌+助教+时间窗口软关联。
dwd_assistant_trash_event_Ex助教废除事件扩展
dwd_member_balance_change会员余额变动
来自 member_balance_changes.json粒度为一次储值卡账户余额变动。此表是分析会员资金往来的核心事实表。
dwd_member_balance_change_Ex会员余额变动扩展
dwd_groupbuy_redemption团购券核销
来自 group_buy_redemption_records.json 中各条记录。每行代表一次团购券使用/核销事件。
dwd_groupbuy_redemption_Ex团购券核销扩展
dwd_platform_coupon_redemption第三方平台券核销
来自 platform_coupon_redemption_records.json。每条记录代表一次第三方团购券的核销用于追踪渠道引流和兑换。
dwd_platform_coupon_redemption_Ex第三方平台券核销扩展
dwd_recharge_order充值结算
来自 recharge_settlements.json 的 settleList.settleList 路径。每行是一条充值订单,记录充值金额、赠送金额及是否首充。
dwd_recharge_order_Ex充值结算扩展
dwd_payment支付流水
来自 payment_transactions.json。每行代表一笔支付或收款流水与结算单、充值单等关联。只有 pay_status=2 的支付成功记录被导出。
dwd_refund退款流水
来自 refund_transactions.json。每行代表一笔退款对应原支付流水。退款金额以负数存储在 pay_amount 字段;字段 refund_amount 全部为 0实际退款金额需取 pay_amount 的绝对值。
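退款金额口径可以用一行示意假设性样例数据源数据中退款金额以负数存于pay_amountrefund_amount 恒为 0

```python
# 假设性样例:退款流水一行(数值仅为举例)
refund_row = {"pay_id": 901, "pay_amount": -58.0, "refund_amount": 0}
actual_refund = abs(refund_row["pay_amount"])  # 实际退款金额取 pay_amount 的绝对值
```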
总结
本说明书列出了经营数据仓库 DWD 层的主要维度表和事实表的字段结构及说明,尽可能在每个字段上标注其来源、含义及业务重要性。对于未在 MD 文档中解释的字段标记为 作用未知,在建模时建议保留字段但谨慎使用;对业务逻辑影响较小的展示类字段标记为 不重要。枚举字段均列出了观测到的取值和推断的含义,便于后续 ETL 做值域映射和数据清洗。随着业务扩展和数据补充,可继续完善枚举信息、用途说明和字段分类。

tmp/dwd_tables.json Normal file

File diff suppressed because it is too large

tmp/dwd_tables_full.json Normal file

File diff suppressed because it is too large

View File

@@ -0,0 +1,40 @@
"""Simple PostgreSQL connectivity smoke-checker."""
import os
import sys
import psycopg2
from psycopg2 import OperationalError
DEFAULT_DSN = os.environ.get(
"PG_DSN", "postgresql://local-Python:Neo-local-1991125@100.64.0.4:5432/LLZQ-test"
)
DEFAULT_TIMEOUT = max(1, min(int(os.environ.get("PG_CONNECT_TIMEOUT", 10)), 20))
def check_postgres_connection(dsn: str, timeout: int = DEFAULT_TIMEOUT) -> bool:
"""Return True if connection succeeds; print diagnostics otherwise."""
try:
conn = psycopg2.connect(dsn, connect_timeout=timeout)
with conn:
with conn.cursor() as cur:
cur.execute("SELECT 1;")
_ = cur.fetchone()
print(f"PostgreSQL 连接成功 (timeout={timeout}s)")
return True
except OperationalError as exc:
print("PostgreSQL 连接失败OperationalError", exc)
except Exception as exc: # pragma: no cover - defensive
print("PostgreSQL 连接失败(其他异常):", exc)
return False
if __name__ == "__main__":
dsn = sys.argv[1] if len(sys.argv) > 1 else DEFAULT_DSN
if not dsn:
print("缺少 DSN请传入参数或设置 PG_DSN 环境变量。")
sys.exit(2)
ok = check_postgres_connection(dsn)
if not ok:
sys.exit(1)

Some files were not shown because too many files have changed in this diff.