Files
Neo-ZQYY/.kiro/specs/etl-staff-dimension/design.md

355 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# 设计文档ETL 员工维度表staff_info
## 概述
为飞球 ETL 连接器新增员工维度表,从 `SearchSystemStaffInfo` API 抓取球房全体员工数据(店长、主管、教练、收银员、助教管理员等),经 ODS 落地后清洗装载至 DWD 层。员工表与现有助教表(`assistant_accounts_master`)是完全独立的实体。
## API 响应结构
```json
{
"data": {
"total": 15,
"staffProfiles": [
{
"id": 3020236636900101,
"cashierPointId": 2790685415443270,
"cashierPointName": "默认",
"job_num": "",
"staff_name": "葛芃",
"mobile": "13811638071",
"auth_code": "",
"avatar": "",
"create_time": "2025-12-24 00:03:37",
"entry_time": "2025-12-23 08:00:00",
"is_delete": 0,
"leave_status": 0,
"resign_time": "2225-12-24 00:03:37",
"site_id": 2790685415443269,
"staff_identity": 2,
"status": 1,
"system_role_id": 4,
"system_user_id": 3020236636293893,
"tenant_id": 2790683160709957,
"tenant_org_id": 2790685415443269,
"job": "店长",
"shop_name": "朗朗桌球",
"account_status": 1,
"is_reserve": 1,
"groupName": "",
"groupId": 0,
"alias_name": "葛芃",
"staff_profile_id": 0,
"site_label": "",
"rank_id": -1,
"ding_talk_synced": 1,
"new_rank_id": 0,
"new_staff_identity": 0,
"salary_grant_enabled": 2,
"rankName": "无职级",
"entry_type": 1,
"userRoles": [],
"entry_sign_status": 0,
"resign_sign_status": 0,
"criticism_status": 1,
"gender": 3
}
]
},
"code": 0
}
```
## 1. ODS 层设计
### 1.1 ODS 任务规格
```python
OdsTaskSpec(
code="ODS_STAFF_INFO",
class_name="OdsStaffInfoTask",
table_name="ods.staff_info_master",
endpoint="/PersonnelManagement/SearchSystemStaffInfo",
data_path=("data",),
list_key="staffProfiles",
pk_columns=(_int_col("id", "id", required=True),),
extra_params={
"workStatusEnum": 0,
"dingTalkSynced": 0,
"staffIdentity": 0,
"rankId": 0,
"criticismStatus": 0,
"signStatus": -1,
},
include_source_endpoint=False,
include_fetched_at=False,
include_record_index=True,
requires_window=False,
time_fields=None,
snapshot_mode=SnapshotMode.FULL_TABLE,
description="员工档案 ODSSearchSystemStaffInfo -> staffProfiles 原始 JSON",
)
```
### 1.2 ODS 表 DDL`ods.staff_info_master`
```sql
CREATE TABLE ods.staff_info_master (
id BIGINT NOT NULL,
tenant_id BIGINT,
site_id BIGINT,
tenant_org_id BIGINT,
system_user_id BIGINT,
staff_name TEXT,
alias_name TEXT,
mobile TEXT,
avatar TEXT,
gender INTEGER,
job TEXT,
job_num TEXT,
staff_identity INTEGER,
status INTEGER,
account_status INTEGER,
system_role_id INTEGER,
rank_id INTEGER,
rank_name TEXT,
new_rank_id INTEGER,
new_staff_identity INTEGER,
leave_status INTEGER,
entry_time TIMESTAMP WITHOUT TIME ZONE,
resign_time TIMESTAMP WITHOUT TIME ZONE,
create_time TIMESTAMP WITHOUT TIME ZONE,
is_delete INTEGER,
is_reserve INTEGER,
shop_name TEXT,
site_label TEXT,
cashier_point_id BIGINT,
cashier_point_name TEXT,
group_id BIGINT,
group_name TEXT,
staff_profile_id BIGINT,
auth_code TEXT,
auth_code_create TIMESTAMP WITHOUT TIME ZONE,
ding_talk_synced INTEGER,
salary_grant_enabled INTEGER,
entry_type INTEGER,
entry_sign_status INTEGER,
resign_sign_status INTEGER,
criticism_status INTEGER,
user_roles JSONB,
-- ETL 元数据
content_hash TEXT NOT NULL,
source_file TEXT,
fetched_at TIMESTAMP WITH TIME ZONE DEFAULT now(),
payload JSONB NOT NULL
);
COMMENT ON TABLE ods.staff_info_master IS '员工档案主数据来源SearchSystemStaffInfo API';
```
### 1.3 ODS 列名映射说明
API 返回的驼峰字段在 ODS 层统一转为蛇形命名(由 BaseOdsTask 自动处理):
- `cashierPointId``cashier_point_id`
- `cashierPointName``cashier_point_name`
- `staffName` / `staff_name``staff_name`API 已是蛇形)
- `systemUserId` / `system_user_id``system_user_id`
- `tenantOrgId` / `tenant_org_id``tenant_org_id`
- `groupName``group_name`注意API 返回驼峰 `groupName`
- `groupId``group_id`API 返回驼峰 `groupId`
- `rankName``rank_name`API 返回驼峰 `rankName`
- `userRoles``user_roles`(数组,存为 JSONB
- `authCodeCreate` / `auth_code_create``auth_code_create`
## 2. DWD 层设计
### 2.1 主表 DDL`dwd.dim_staff`
核心业务字段,高频查询使用。
```sql
CREATE TABLE dwd.dim_staff (
staff_id BIGINT NOT NULL,
staff_name TEXT,
alias_name TEXT,
mobile TEXT,
gender INTEGER,
job TEXT,
tenant_id BIGINT,
site_id BIGINT,
system_role_id INTEGER,
staff_identity INTEGER,
status INTEGER,
leave_status INTEGER,
entry_time TIMESTAMP WITH TIME ZONE,
resign_time TIMESTAMP WITH TIME ZONE,
is_delete INTEGER,
-- SCD2
scd2_start_time TIMESTAMP WITH TIME ZONE NOT NULL,
scd2_end_time TIMESTAMP WITH TIME ZONE,
scd2_is_current INTEGER,
scd2_version INTEGER,
PRIMARY KEY (staff_id, scd2_start_time)
);
COMMENT ON TABLE dwd.dim_staff IS '员工档案维度主表SCD2';
```
### 2.2 扩展表 DDL`dwd.dim_staff_ex`
次要/低频变更字段。
```sql
CREATE TABLE dwd.dim_staff_ex (
staff_id BIGINT NOT NULL,
avatar TEXT,
job_num TEXT,
account_status INTEGER,
rank_id INTEGER,
rank_name TEXT,
new_rank_id INTEGER,
new_staff_identity INTEGER,
is_reserve INTEGER,
shop_name TEXT,
site_label TEXT,
tenant_org_id BIGINT,
system_user_id BIGINT,
cashier_point_id BIGINT,
cashier_point_name TEXT,
group_id BIGINT,
group_name TEXT,
staff_profile_id BIGINT,
auth_code TEXT,
auth_code_create TIMESTAMP WITH TIME ZONE,
ding_talk_synced INTEGER,
salary_grant_enabled INTEGER,
entry_type INTEGER,
entry_sign_status INTEGER,
resign_sign_status INTEGER,
criticism_status INTEGER,
create_time TIMESTAMP WITH TIME ZONE,
user_roles JSONB,
-- SCD2
scd2_start_time TIMESTAMP WITH TIME ZONE NOT NULL,
scd2_end_time TIMESTAMP WITH TIME ZONE,
scd2_is_current INTEGER,
scd2_version INTEGER,
PRIMARY KEY (staff_id, scd2_start_time)
);
COMMENT ON TABLE dwd.dim_staff_ex IS '员工档案维度扩展表SCD2';
```
### 2.3 TABLE_MAP 映射
```python
# 在 DwdLoadTask.TABLE_MAP 中新增:
"dwd.dim_staff": "ods.staff_info_master",
"dwd.dim_staff_ex": "ods.staff_info_master",
```
### 2.4 FACT_MAPPINGS 字段映射
```python
# dim_staff 主表映射
"dwd.dim_staff": [
("staff_id", "id", None),
("entry_time", "entry_time", "timestamptz"),
("resign_time", "resign_time", "timestamptz"),
],
# dim_staff_ex 扩展表映射
"dwd.dim_staff_ex": [
("staff_id", "id", None),
("rank_name", "rankname", None),
("cashier_point_id", "cashierpointid", "bigint"),
("cashier_point_name", "cashierpointname", None),
("group_id", "groupid", "bigint"),
("group_name", "groupname", None),
("system_user_id", "systemuserid", "bigint"),
("tenant_org_id", "tenantorgid", "bigint"),
("auth_code_create", "auth_code_create", "timestamptz"),
("create_time", "create_time", "timestamptz"),
("user_roles", "userroles", "jsonb"),
],
```
说明:
- ODS 层的列名由 BaseOdsTask 自动从 API 驼峰转为蛇形(如 `cashierPointId``cashierpointid`,注意 PG 列名全小写无下划线)
- DWD 主表中 `staff_name``alias_name``mobile` 等与 ODS 同名列自动映射,无需显式配置
- `staff_id` 映射自 ODS 的 `id`
## 3. 数据流概览
```
API: SearchSystemStaffInfo
↓ (POST, 分页, extra_params 筛选)
ODS: ods.staff_info_master
↓ (SCD2 合并, FULL_TABLE 快照)
DWD: dwd.dim_staff + dwd.dim_staff_ex
```
## 4. 测试框架
- 测试框架:`pytest` + `hypothesis`
- 单元测试使用 `FakeDB` / `FakeAPI``tests/unit/task_test_utils.py`
## 5. 正确性属性
### P1ODS 任务规格完整性
对于 `ODS_STAFF_INFO` 任务规格,以下属性必须成立:
- `code == "ODS_STAFF_INFO"`
- `table_name == "ods.staff_info_master"`
- `endpoint == "/PersonnelManagement/SearchSystemStaffInfo"`
- `list_key == "staffProfiles"`
- `snapshot_mode == SnapshotMode.FULL_TABLE`
- `requires_window == False`
- `time_fields is None`
- `"staffProfiles"` 存在于 `DEFAULT_LIST_KEYS`
- `"ODS_STAFF_INFO"` 存在于 `ENABLED_ODS_CODES`
验证方式:单元测试直接断言
### P2DWD 映射完整性
对于 DWD 装载配置,以下属性必须成立:
- `TABLE_MAP["dwd.dim_staff"] == "ods.staff_info_master"`
- `TABLE_MAP["dwd.dim_staff_ex"] == "ods.staff_info_master"`
- `FACT_MAPPINGS["dwd.dim_staff"]` 包含 `staff_id``id` 的映射
- `FACT_MAPPINGS["dwd.dim_staff_ex"]` 包含 `staff_id``id` 的映射
验证方式:单元测试直接断言
### P3ODS 列名提取一致性(属性测试)
对于任意 API 返回的员工记录(含驼峰和蛇形混合字段名),经 BaseOdsTask 处理后:
- 所有字段名转为小写蛇形
- `id` 字段不为空且为正整数
- `payload` 字段包含完整原始 JSON
验证方式hypothesis 属性测试,生成随机员工记录验证转换一致性
## 6. 文件变更清单
### 代码变更
| 文件 | 变更类型 | 说明 |
|------|----------|------|
| `apps/etl/connectors/feiqiu/api/client.py` | 修改 | `DEFAULT_LIST_KEYS` 添加 `"staffProfiles"` |
| `apps/etl/connectors/feiqiu/tasks/ods/ods_tasks.py` | 修改 | 新增 `ODS_STAFF_INFO` 任务规格 + 注册到 `ENABLED_ODS_CODES` |
| `apps/etl/connectors/feiqiu/tasks/dwd/dwd_load_task.py` | 修改 | `TABLE_MAP``FACT_MAPPINGS` 新增 dim_staff/dim_staff_ex 映射 |
### DDL / 迁移
| 文件 | 变更类型 | 说明 |
|------|----------|------|
| `db/etl_feiqiu/migrations/2026-02-22__add_staff_info_tables.sql` | 新增 | ODS + DWD 建表迁移脚本 |
| `docs/database/ddl/etl_feiqiu__ods.sql` | 修改 | 追加 `ods.staff_info_master` DDL |
| `docs/database/ddl/etl_feiqiu__dwd.sql` | 修改 | 追加 `dwd.dim_staff` + `dwd.dim_staff_ex` DDL |
### 文档
| 文件 | 变更类型 | 说明 |
|------|----------|------|
| `apps/etl/connectors/feiqiu/docs/database/ODS/mappings/mapping_SearchSystemStaffInfo_staff_info_master.md` | 新增 | API→ODS 字段映射文档 |
| `apps/etl/connectors/feiqiu/docs/database/ODS/main/BD_manual_staff_info_master.md` | 新增 | ODS 表 BD 手册 |
| `apps/etl/connectors/feiqiu/docs/database/DWD/main/BD_manual_dim_staff.md` | 新增 | DWD 主表 BD 手册 |
| `apps/etl/connectors/feiqiu/docs/database/DWD/main/BD_manual_dim_staff_ex.md` | 新增 | DWD 扩展表 BD 手册 |
| `apps/etl/connectors/feiqiu/docs/database/README.md` | 修改 | 增加员工表条目 |
| `apps/etl/connectors/feiqiu/docs/etl_tasks/ods_tasks.md` | 修改 | 增加 ODS_STAFF_INFO 任务说明 |
| `docs/database/README.md` | 修改 | 增加员工相关表条目 |