# Implementation Plan: Read-Only Repository Governance Audit

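## Overview

This plan breaks the audit scripts described in the design document into incremental coding tasks. Each task builds on the previous one, and together they produce a runnable audit toolset. All scripts live under `scripts/audit/`; reports are written to `docs/audit/repo/`.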

## Tasks

- [x] 1. Set up the audit script skeleton and data models
  - [x] 1.1 Create `scripts/audit/__init__.py` and the data model definitions
    - Define the `FileEntry` dataclass (`rel_path`, `is_dir`, `size_bytes`, `extension`, `is_empty_dir`)
    - Define the `Category` and `Disposition` enums
    - Define the `InventoryItem` dataclass
    - Define the `FlowNode` dataclass
    - Define the `DocMapping` and `AlignmentIssue` dataclasses
    - _Requirements: 1.2, 1.3, 1.4, 2.7, 3.2, 3.3_

  - [x] 1.2 Write the property test for classify completeness
    - **Property 1: classify completeness**
    - **Validates: Requirements 1.2, 1.3**

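- [x] 2. Implement the repository scanner
  - [x] 2.1 Create `scripts/audit/scanner.py`
    - Implement the `EXCLUDED_PATTERNS` constant and the exclusion matching logic
    - Implement `scan_repo(root, exclude)`: walk the file system recursively and return `list[FileEntry]`
    - Detect empty directories (`is_empty_dir`)
    - Handle file read permission errors (skip the file and log it)
    - _Requirements: 1.1, 5.1, 5.3_

  - [x] 2.2 Write the property test for the scanner exclusion rules
    - **Property 7: scanner exclusion rules**
    - **Validates: Requirements 1.1**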

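- [x] 3. Implement the file inventory analyzer
  - [x] 3.1 Create `scripts/audit/inventory_analyzer.py`
    - Implement `classify(entry: FileEntry) -> InventoryItem`, including the complete classification rule table
    - Implement `build_inventory(entries) -> list[InventoryItem]` for batch classification
    - Implement `render_inventory_report(items, repo_root) -> str` for Markdown rendering
    - Include summary statistics generation (counts per category and per tag)
    - Note: Requirement 1.8 covers only the `logs/` and `export/` directories (not `reports/`)
    - _Requirements: 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 1.10, 4.2, 4.5_

  - [x] 3.2 Write the property tests for the classify rules
    - **Property 3: empty directories are marked as deletion candidates**
    - **Property 4: .lnk/.rar files are marked as deletion candidates**
    - **Property 5: disposition range for files under tmp/**
    - **Property 6: runtime output directories are marked as archive candidates** (`logs/` and `export/` only)
    - **Validates: Requirements 1.5, 1.6, 1.7, 1.8**

  - [x] 3.3 Write the property tests for inventory rendering
    - **Property 2: inventory rendering completeness**
    - **Property 8: inventory is grouped by category**
    - **Validates: Requirements 1.4, 1.10**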

- [x] 4. Checkpoint: confirm the file inventory module tests pass
  - Make sure all tests pass; check with the user if anything is unclear.

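- [x] 5. Implement the flow tree analyzer
  - [x] 5.1 Create `scripts/audit/flow_analyzer.py`
    - Implement `parse_imports(filepath)`: parse the import statements of a Python file with the `ast` module
    - Implement `build_flow_tree(repo_root, entry_file)`: trace the import chain recursively from an entry point
    - Implement `find_orphan_modules(repo_root, all_entries, reachable)`
    - Implement `render_flow_report(trees, orphans, repo_root)`: generate a Mermaid diagram and indented text
    - Include entry point identification logic (CLI, GUI, batch jobs, operations scripts)
    - Include logic that distinguishes task types from loader types
    - Include summary statistics generation
    - _Requirements: 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 4.6_

  - [x] 5.2 Write the property tests for the flow tree
    - **Property 9: validity of flow tree node source_file**
    - **Property 10: correctness of orphan module detection**
    - **Validates: Requirements 2.7, 2.8**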

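- [x] 6. Implement the documentation alignment analyzer
  - [x] 6.1 Create `scripts/audit/doc_alignment_analyzer.py`
    - Implement `scan_docs(repo_root)`: scan all documentation sources
    - Implement `extract_code_references(doc_path)`: extract code references from a document
    - Implement `check_reference_validity(ref, repo_root)`
    - Implement `find_undocumented_modules(repo_root, documented)`
    - Implement `check_ddl_vs_dictionary(repo_root)`: compare the DDL against the data dictionary
    - Implement `check_api_samples_vs_parsers(repo_root)`: compare the API samples against the parsers
    - Implement `render_alignment_report(mappings, issues, repo_root)`
    - Include summary statistics generation
    - _Requirements: 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 4.7_

  - [x] 6.2 Write the property tests for documentation alignment
    - **Property 11: stale reference detection**
    - **Property 12: missing documentation detection**
    - **Property 16: section completeness of the documentation alignment report**
    - **Validates: Requirements 3.3, 3.5, 3.8**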

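- [x] 7. Checkpoint: confirm the flow tree and documentation alignment module tests pass
  - Make sure all tests pass; check with the user if anything is unclear.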

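- [x] 8. Implement the audit entry point and report output
  - [x] 8.1 Create `scripts/audit/run_audit.py`
    - Implement the `run_audit(repo_root)` main function: call the scanner and the three analyzers in sequence
    - Implement the check-and-create logic for the `docs/audit/repo/` directory
    - Inject the report header metadata (timestamp, repository path)
    - Write the three report files
    - Add the `if __name__ == "__main__"` entry point
    - _Requirements: 4.1, 4.2, 4.3, 4.4, 5.2, 5.4_

  - [x] 8.2 Write the property tests for report output
    - **Property 13: summary statistics consistency**
    - **Property 14: report header metadata**
    - **Property 15: write operations are confined to docs/audit/**
    - **Validates: Requirements 4.2, 4.5, 4.6, 4.7, 5.2**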

- [x] 9. Final checkpoint: confirm all tests pass
  - Make sure all tests pass; check with the user if anything is unclear.
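
As a rough illustration of tasks 1.1 and 2.1, the sketch below pairs a `FileEntry` dataclass with a `scan_repo(root, exclude)` walker. The field names and the function signature come from the plan; everything else is an assumption made for the example: the values in `EXCLUDED_PATTERNS`, the rule that a pattern matches a whole path component, the choice of `os.walk`, and the use of `logging` to record skipped entries. It is one possible shape for the scanner, not the design document's implementation.

```python
from __future__ import annotations

import logging
import os
from dataclasses import dataclass
from pathlib import Path

logger = logging.getLogger(__name__)

# Assumed example values; the plan does not list the real patterns.
EXCLUDED_PATTERNS = frozenset({".git", "__pycache__", ".venv"})


@dataclass(frozen=True)
class FileEntry:
    rel_path: str       # POSIX-style path relative to the repository root
    is_dir: bool
    size_bytes: int     # 0 for directories in this sketch
    extension: str      # lower-cased suffix such as ".py", "" if none
    is_empty_dir: bool


def scan_repo(root, exclude=EXCLUDED_PATTERNS) -> list[FileEntry]:
    """Walk `root` read-only and return one FileEntry per file or directory."""
    root = Path(root)
    entries: list[FileEntry] = []

    def on_error(err: OSError) -> None:
        # Unreadable directories are skipped and logged, never raised.
        logger.warning("skipping unreadable directory: %s", err.filename)

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_error):
        current = Path(dirpath)
        # Decide emptiness before pruning, so a directory that only holds
        # excluded children is not reported as empty.
        is_empty = not dirnames and not filenames
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = sorted(d for d in dirnames if d not in exclude)

        if current != root:
            rel = current.relative_to(root).as_posix()
            entries.append(FileEntry(rel, True, 0, "", is_empty))

        for name in sorted(filenames):
            if name in exclude:
                continue
            path = current / name
            try:
                size = path.stat().st_size
            except OSError as err:
                logger.warning("skipping unreadable file: %s (%s)", path, err)
                continue
            rel = path.relative_to(root).as_posix()
            entries.append(FileEntry(rel, False, size, path.suffix.lower(), False))

    return entries
```

Calling `scan_repo(".")` from the repository root would return the entries for everything outside the excluded directories. The function only reads metadata, which keeps it in line with the read-only goal of the audit.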

## Notes

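- Subtasks marked with `*` are optional and can be skipped to speed up MVP delivery
- Each task cites specific requirement numbers for traceability
- Property tests use the `hypothesis` library, with at least 100 iterations per test
- Unit tests verify concrete examples and edge cases; property tests verify general correctness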
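
A minimal sketch of what such a property test could look like, loosely modelled on Property 7 (scanner exclusion rules). The helper `is_excluded`, the `EXCLUDED` values, and the component-based matching rule are hypothetical stand-ins rather than the project's real code; only the use of `hypothesis` and the 100-iteration floor come from the notes above. `settings(max_examples=100)` is how Hypothesis sets the number of generated cases per test.

```python
from hypothesis import given, settings, strategies as st

# Hypothetical exclusion set, used only for this example.
EXCLUDED = frozenset({".git", "__pycache__", ".venv"})


def is_excluded(rel_path: str, excluded=EXCLUDED) -> bool:
    """Assumed rule: a path is excluded if any component matches a pattern."""
    return any(part in excluded for part in rel_path.split("/"))


# Short path segments that can never equal an excluded name:
# no dots, and at most 8 characters.
segment = st.text(alphabet="abcdefghijklmnopqrstuvwxyz_", min_size=1, max_size=8)


@settings(max_examples=100)
@given(
    prefix=st.lists(segment, max_size=3),
    bad=st.sampled_from(sorted(EXCLUDED)),
    suffix=st.lists(segment, max_size=3),
)
def test_paths_through_excluded_dirs_are_skipped(prefix, bad, suffix):
    # Any path containing an excluded component must be rejected.
    rel_path = "/".join([*prefix, bad, *suffix])
    assert is_excluded(rel_path)


@settings(max_examples=100)
@given(parts=st.lists(segment, min_size=1, max_size=5))
def test_clean_paths_are_kept(parts):
    # Paths built only from non-excluded segments must be accepted.
    assert not is_excluded("/".join(parts))
```

Under pytest both functions are collected as ordinary tests. A unit test would pin down individual cases such as `is_excluded(".git/config")`, while the property tests cover the general rule across generated paths.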