- [P20260220-111837] 2026-02-20 11:18:37 +0800 - summary: CONTEXT TRANSFER: We are continuing a conversation that had gotten too long. Here is a summary: --- ## TASK 1: Full Prod… - prompt:

```text
CONTEXT TRANSFER: We are continuing a conversation that had gotten too long. Here is a summary:

---

## TASK 1: Full Production Integration Test (api_full, verify_only, custom window)

**STATUS**: done

**DETAILS**: ETL run `896b9647` completed successfully. 22 ODS tasks: 21 SUCC, 1 FAIL (`ODS_ASSISTANT_LEDGER` — known JSONB bug, fix already in code). `verify_only` mode executes only the ODS fetch + verification and skips DWD/DWS/INDEX.

**FILEPATHS**:
- `export/ETL-Connectors/feiqiu/REPORTS/etl_timing_20260220_091414.md`
- `export/ETL-Connectors/feiqiu/REPORTS/consistency_report_20260220_091414.md`

---

## TASK 2: Fix consistency report issues (missing DWD fields + whitelist)

**STATUS**: done

**DETAILS**:
- `principal_change_amount` fix: added a FACT_MAPPINGS expression
- `update_time` fix: confirmed the upstream API has NO update_time field. Added a `KNOWN_NO_SOURCE` whitelist in `consistency_checker.py`
- All 735 unit tests passing at time of completion

**FILEPATHS**:
- `apps/etl/connectors/feiqiu/tasks/dwd/dwd_load_task.py`
- `apps/etl/connectors/feiqiu/quality/consistency_checker.py`

---

## TASK 3: ODS_ASSISTANT_LEDGER JSONB fix

**STATUS**: done (code fix applied, awaiting next ETL run to verify)

**DETAILS**: `_mark_missing_as_deleted` in `ods_tasks.py` now detects ALL JSONB columns via `cols_info` udt_name and wraps dict/list values with `Json()`.

**FILEPATHS**: `apps/etl/connectors/feiqiu/tasks/ods/ods_tasks.py`

---

## TASK 4: Explain increment_only vs increment_verify vs verify_only modes

**STATUS**: done

---

## TASK 5: Explain full increment_verify data pipeline (API→ODS→DWD→DWS→INDEX)

**STATUS**: done

---

## TASK 6: Remove `pipeline` parameter, rename to `flow` everywhere

**STATUS**: in-progress

**DETAILS**: User wants the `pipeline` parameter/field name completely removed across the entire codebase and replaced with `flow`. No backward compatibility needed — clean break.

**What's been done (actually implemented):**

1. `apps/etl/connectors/feiqiu/orchestration/flow_runner.py` — DONE:
   - `run()` parameter `pipeline` → `flow`
   - Return dict key `"pipeline"` → `"flow"`
   - Docstring comment updated
2. `apps/etl/connectors/feiqiu/orchestration/scheduler.py` — DONE:
   - `PIPELINE_LAYERS` → `FLOW_LAYERS`
3. `apps/etl/connectors/feiqiu/cli/main.py` — DONE:
   - Removed the `--pipeline` argument definition entirely
   - Removed the `if args.pipeline_deprecated:` block
   - Changed `runner.run(pipeline=args.flow, ...)` → `runner.run(flow=args.flow, ...)`
   - Changed `runner.run(pipeline=None, layers=layers, ...)` → `runner.run(flow=None, layers=layers, ...)`
   - Removed `--pipeline` from the module docstring
   - NOTE: `--pipeline-flow` (deprecated data_source param) is intentionally KEPT — it is a separate concept
4. `apps/backend/app/schemas/tasks.py` — DONE:
   - `pipeline: str = "api_ods_dwd"` → `flow: str = "api_ods_dwd"`
   - Docstring updated
5. `apps/backend/app/services/cli_builder.py` — DONE:
   - `config.pipeline` → `config.flow`
6. `apps/backend/app/routers/tasks.py` — DONE:
   - `config.pipeline` → `config.flow`
7. Frontend files — ALL DONE:
   - `apps/admin-web/src/types/index.ts`: `pipeline: string` → `flow: string`, `PipelineDefinition` → `FlowDefinition`
   - `apps/admin-web/src/pages/TaskConfig.tsx`: `pipeline: flow` → `flow: flow`
   - `apps/admin-web/src/App.tsx`: `runningTask.config.pipeline` → `runningTask.config.flow`
   - `apps/admin-web/src/pages/TaskManager.tsx`: `config.pipeline` → `config.flow`
   - `apps/admin-web/src/components/ScheduleTab.tsx`: `pipeline: 'api_full'` → `flow: 'api_full'`
8. Test files — BULK REPLACED via Python scripts (3 rounds):
   - Round 1 (`_rename_pipeline_to_flow.py`): regex-based replacement of `pipeline=` → `flow=`, `"pipeline"` → `"flow"`, `.pipeline` → `.flow`, `--pipeline` → `--flow` across 15 test files
   - Round 2 (`_rename_pipeline_vars.py`): cleaned up local variable names (`pipeline` → `flow_name`/`flow_id`), docstrings, and comments across 7 files
   - Round 3 (`_rename_pipeline_feature_tags.py`): `etl-pipeline-debug` → `etl-flow-debug` in 3 files
9. `apps/etl/connectors/feiqiu/tests/unit/test_layers_cli.py` — PARTIALLY DONE:
   - `TestFlowParameter` class rewritten to test only `--flow` (removed all `--pipeline` deprecated tests)
   - `TestLayersPipelineMutualExclusion` → `TestLayersFlowMutualExclusion`
   - `TestLayersArgParsing.test_pipeline_still_works` removed
   - Section comment `# 3. --layers 与 --pipeline 互斥` ("--layers and --pipeline are mutually exclusive") still present (cosmetic)
   - Section comment `# 5. --flow / --pipeline 弃用别名测试` ("--flow / --pipeline deprecated-alias tests") still present (cosmetic)

**What still needs to be done (10 test failures remain):**

The last test run showed 10 failures, 674 passed. The failures are:

A. `test_cli_args.py::TestPipelineAndTasks::test_pipeline_and_tasks_both_parsed` — this entire class needs to be DELETED. It tests `--pipeline` + `--tasks`, a combination that no longer exists. The class is at the bottom of the file (lines ~126-141). Also need to remove `pipeline_deprecated=None` from the `_make_args` defaults in `TestBuildCliOverrides` (line ~84).

B. `test_e2e_flow.py` — 2 failures. Needs:
   - `runner.run(pipeline="api_ods", ...)` → `runner.run(flow="api_ods", ...)` (2 occurrences)
   - `result["pipeline"]` → `result["flow"]` (1 occurrence)
   - `scheduler.run_flow_with_verification(pipeline="api_ods")` → check whether this method exists and fix
   - NOTE: `"pipeline.fetch_root"` and `"pipeline.ingest_source_dir"` are AppConfig keys, NOT the pipeline parameter — leave them alone

C. `test_layers_cli.py` — 7 failures listed, but the file has already been cleaned. These failures are likely from the pytest cache (`.pytest_cache/` or `__pycache__`). Need to clear the cache and re-run: `pytest tests/unit --cache-clear`, or delete the `__pycache__` dirs.

**After fixing tests, need to run:**
1. ETL unit tests: `cd apps/etl/connectors/feiqiu && pytest tests/unit -q --tb=short --ignore=tests/unit/test_audit_doc_alignment.py`
2. Backend tests: `cd apps/backend && pytest tests/ -q --tb=short`
3. Monorepo property tests: `pytest tests/ -v` (from root)

**FILEPATHS** (files that were modified):
- `apps/etl/connectors/feiqiu/orchestration/flow_runner.py`
- `apps/etl/connectors/feiqiu/orchestration/scheduler.py`
- `apps/etl/connectors/feiqiu/cli/main.py`
- `apps/backend/app/schemas/tasks.py`
- `apps/backend/app/services/cli_builder.py`
- `apps/backend/app/routers/tasks.py`
- `apps/admin-web/src/types/index.ts`
- `apps/admin-web/src/pages/TaskConfig.tsx`
- `apps/admin-web/src/pages/TaskManager.tsx`
- `apps/admin-web/src/App.tsx`
- `apps/admin-web/src/components/ScheduleTab.tsx`
- `apps/etl/connectors/feiqiu/tests/unit/test_pipeline_runner_properties.py`
- `apps/etl/connectors/feiqiu/tests/unit/test_debug_orchestration_properties.py`
- `apps/etl/connectors/feiqiu/tests/unit/test_layers_cli.py`
- `apps/etl/connectors/feiqiu/scripts/debug/debug_orchestration.py`
- `tests/test_property_5_etl_param_parsing.py`
- All 11 backend test files listed in the context transfer

**FILEPATHS** (files that still need changes):
- `apps/etl/connectors/feiqiu/tests/unit/test_cli_args.py` — delete the `TestPipelineAndTasks` class, remove `pipeline_deprecated=None` from `_make_args`
- `apps/etl/connectors/feiqiu/tests/unit/test_e2e_flow.py` — fix `pipeline=` → `flow=` and `result["pipeline"]` → `result["flow"]`

**One-time scripts created (can be deleted after task completion):**
- `scripts/ops/_rename_pipeline_to_flow.py`
- `scripts/ops/_rename_pipeline_vars.py`
- `scripts/ops/_rename_pipeline_feature_tags.py`

---

## TASK 7: New `full_window` processing mode

**STATUS**: not-started

**DETAILS**:
- User wants a new `processing_mode="full_window"` distinct from `increment_verify`
- ODS tasks should NOT use the cursor to calculate the time window; instead, use the actual time bounds from the returned JSON data
- No `_run_verification` needed (the API data is the source of truth, so there is no cursor-drift risk)
- ODS entry still uses `content_hash` dedup + idempotent upsert (unchanged)
- DWD/DWS processing unchanged

**FILEPATHS**:
- `apps/etl/connectors/feiqiu/orchestration/flow_runner.py`
- `apps/etl/connectors/feiqiu/tasks/ods/ods_tasks.py`

---

## KEY ARCHITECTURE:
- Frontend: React + Vite + Ant Design at `apps/admin-web/`
- Backend: FastAPI at `apps/backend/`
- ETL: `apps/etl/connectors/feiqiu/`
- ODS PK = `(id, content_hash)` — snapshot mode; a content change = a new row
- DWD has no cursor; dim tables use SCD2, fact tables use a `fetched_at` window upsert
- DWS has no cursor; delete-before-insert by date range
- Four DB connections: `mcp_pg_etl`, `mcp_pg_etl_test`, `mcp_pg_app`, `mcp_pg_app_test`
- `pipeline_flow` / `--pipeline-flow` is a SEPARATE concept (deprecated data_source param) — NOT part of the pipeline→flow rename
- `"pipeline.fetch_root"` and `"pipeline.ingest_source_dir"` are AppConfig configuration keys — NOT part of the pipeline→flow rename

## USER CORRECTIONS AND INSTRUCTIONS:
- All output text must be in simplified Chinese
- `fetched_at` is the ETL ingestion timestamp, NOT the business update time
- ODS content_hash change = NEW ROW (snapshot), not an in-place update
- `pipeline` is NOT a legacy/deprecated alias for `flow` — the user wants it completely removed, not kept for compatibility
- `--pipeline-flow` is a separate deprecated param (for data_source) and should NOT be renamed
- Audit reminder pending for high-risk path changes
- Python scripts preferred over PowerShell for complex operations
- `test_audit_doc_alignment.py` has a pre-existing flaky deadline test — ignore it (not related to this task)

## Files to read
- `apps/etl/connectors/feiqiu/tests/unit/test_cli_args.py` (CRITICAL — needs the `TestPipelineAndTasks` class deleted and `pipeline_deprecated` removed from `_make_args`)
- `apps/etl/connectors/feiqiu/tests/unit/test_e2e_flow.py` (CRITICAL — needs `pipeline=` → `flow=` and `result["pipeline"]` → `result["flow"]`)
- `apps/etl/connectors/feiqiu/tests/unit/test_layers_cli.py` (verify clean — may just need a pytest cache clear)

USER QUERIES (most recent first):
1. Output: Command executed successfully with no output. Exit Code: 0
2. Output: Command executed successfully with no output. Exit Code: 0

---

METADATA: The previous conversation had 2 messages.

INSTRUCTIONS: Continue working until the user query has been fully addressed. Do not ask for clarification — proceed with the work based on the context provided.

IMPORTANT: you need to read the files listed in the "Files to read" section.
```
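The `KNOWN_NO_SOURCE` whitelist from TASK 2 can be illustrated with a minimal sketch. Assumptions: the whitelist shape (a set of `(table, field)` pairs), the `missing_fields` helper, and the table name `dwd_some_fact` are all hypothetical; the real `consistency_checker.py` may structure this differently. The idea is only that fields the upstream API genuinely never provides (like `update_time`) are excluded from "missing field" findings.

```python
# Hypothetical whitelist: (table, field) pairs the upstream API never supplies.
KNOWN_NO_SOURCE = {
    ("dwd_some_fact", "update_time"),  # confirmed: upstream API has no update_time
}


def missing_fields(table: str, expected: set, found: set) -> set:
    """Fields expected in DWD but absent upstream, minus whitelisted ones."""
    return {
        field for field in expected - found
        if (table, field) not in KNOWN_NO_SOURCE
    }


# Whitelisted gap is suppressed; a genuinely missing field is still reported.
issues = missing_fields(
    "dwd_some_fact",
    expected={"id", "amount", "update_time"},
    found={"id", "amount"},
)
other = missing_fields("other_table", expected={"x"}, found=set())
```

With this shape, adding a new known gap is a one-line whitelist entry rather than a checker code change.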
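The TASK 3 fix (detect JSONB columns via `cols_info` udt_name and wrap dict/list values with `Json()`) can be sketched as follows. This is a stand-alone illustration, not the real `_mark_missing_as_deleted`: the `Json` class here stands in for `psycopg2.extras.Json`, and the `adapt_row` helper and `cols_info` shape are assumptions.

```python
class Json:
    """Stand-in for psycopg2.extras.Json: marks a value for JSON adaptation."""
    def __init__(self, value):
        self.value = value


def adapt_row(row: dict, cols_info: dict) -> dict:
    """Wrap values of json/jsonb columns (per udt_name) in Json()."""
    jsonb_cols = {
        col for col, info in cols_info.items()
        if info.get("udt_name") in ("json", "jsonb")
    }
    adapted = {}
    for col, val in row.items():
        if col in jsonb_cols and isinstance(val, (dict, list)):
            adapted[col] = Json(val)   # driver will serialize this as JSON
        else:
            adapted[col] = val         # scalars pass through unchanged
    return adapted


cols_info = {
    "id": {"udt_name": "int8"},
    "payload": {"udt_name": "jsonb"},
}
row = adapt_row({"id": 1, "payload": {"k": [1, 2]}}, cols_info)
```

Checking *all* JSONB columns from the metadata, rather than hard-coding column names, is what makes the fix cover `ODS_ASSISTANT_LEDGER` and any future JSONB columns alike.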
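The Round 1 bulk rename in TASK 6 can be sketched as an ordered list of regex substitutions. The exact patterns in `_rename_pipeline_to_flow.py` may differ; this version additionally shows why the two exclusions matter: `--pipeline-flow` and `"pipeline."` AppConfig keys must be protected before the broad rules run, because `\b` treats the `-` in `--pipeline-flow` as a word boundary.

```python
import re

# Sentinels to protect text that must NOT be renamed (illustrative technique).
PFLOW = "\x00PFLOW\x00"
PKEY = "\x00PKEY\x00"

RULES = [
    (r"--pipeline-flow", PFLOW),   # protect the separate data_source param
    (r'"pipeline\.', PKEY),        # protect AppConfig keys like "pipeline.fetch_root"
    (r"pipeline=", "flow="),
    (r'"pipeline"', '"flow"'),
    (r"\.pipeline\b", ".flow"),
    (r"--pipeline\b", "--flow"),
    (PFLOW, "--pipeline-flow"),    # restore protected text
    (PKEY, '"pipeline.'),
]


def rename(text: str) -> str:
    """Apply the ordered substitution rules to one file's contents."""
    for pattern, repl in RULES:
        text = re.sub(pattern, repl, text)
    return text
```

Without the `PFLOW` protection, the `--pipeline\b` rule would mangle `--pipeline-flow` into `--flow-flow`, which is exactly the kind of collateral damage the NOTE items in the task warn about.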
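The core of the proposed `full_window` mode in TASK 7 — derive the ODS time window from the timestamps actually present in the returned JSON rather than from the stored cursor — might look like the sketch below. The function name and the `ts_field` default are hypothetical; which timestamp field actually bounds the window depends on the real ODS payloads.

```python
from datetime import datetime


def window_from_records(records, ts_field="created_at"):
    """Return (min_ts, max_ts) over ts_field in the payload, or None if empty.

    Unlike cursor-based windowing, the bounds come from the data itself,
    so there is no cursor-drift risk to verify afterwards.
    """
    stamps = [
        datetime.fromisoformat(rec[ts_field])
        for rec in records
        if rec.get(ts_field)
    ]
    if not stamps:
        return None
    return min(stamps), max(stamps)


records = [
    {"id": 1, "created_at": "2026-02-20T09:00:00"},
    {"id": 2, "created_at": "2026-02-20T11:30:00"},
    {"id": 3, "created_at": "2026-02-20T10:15:00"},
]
win = window_from_records(records)
```

Downstream, the resulting `(min_ts, max_ts)` pair could drive the same windowed DWD/DWS processing as today, which is why those layers need no change.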
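The ODS snapshot model under KEY ARCHITECTURE — PK `(id, content_hash)`, content change = new row, identical re-ingest = no-op — can be shown with a small in-memory analogue of an `ON CONFLICT (id, content_hash) DO NOTHING` upsert. The hashing scheme (sorted-key JSON through SHA-256) is an assumption; the real `ods_tasks.py` may compute `content_hash` differently.

```python
import hashlib
import json


def content_hash(payload: dict) -> str:
    """Stable hash of a record payload via canonical (sorted-key) JSON."""
    canonical = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode()).hexdigest()


def upsert_snapshot(table: dict, rec_id, payload: dict) -> bool:
    """Insert a snapshot row keyed by (id, content_hash); True if inserted.

    Mirrors ON CONFLICT DO NOTHING: an identical snapshot is skipped,
    a changed payload becomes a NEW row rather than an in-place update.
    """
    key = (rec_id, content_hash(payload))
    if key in table:
        return False
    table[key] = payload
    return True


table = {}
first = upsert_snapshot(table, 1, {"amount": 100})
dup = upsert_snapshot(table, 1, {"amount": 100})      # identical: no-op
changed = upsert_snapshot(table, 1, {"amount": 120})  # new snapshot row
```

This is also why re-running a window is idempotent and why TASK 7 can keep the ODS entry path unchanged: dedup falls out of the primary key.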