|
| 1 | +# Plan: python-bcb Improvements (QUESTIONS.md) |
| 2 | + |
| 3 | +## Context |
| 4 | +50 architectural/design questions were raised in a code review of python-bcb (Brazilian Central Bank API client). The user reviewed and answered all questions. This plan organizes the actionable decisions into 8 independent implementation phases, ordered by dependency. |
| 5 | + |
| 6 | +--- |
| 7 | + |
| 8 | +## Architecture Decisions (Finalized) |
| 9 | + |
| 10 | +| # | Decision | |
| 11 | +|---|----------| |
| 12 | +| A1 | **Shared global `httpx.Client`** in `bcb/http.py`, imported by all modules | |
| 13 | +| A2 | **Fail-fast everywhere**: `_fetch_symbol_response` raises immediately; `get(["USD", "BOGUS"])` raises `CurrencyNotFoundError` on first invalid symbol | |
| 14 | +| A3 | **True async via `httpx.AsyncClient`**: add `async_get()` functions in each module | |
| 15 | +| A4 | **Format detection**: assert expected response structure, raise descriptive `BCBAPIError` if BCB API format changes | |
| 16 | + |
| 17 | +--- |
| 18 | + |
| 19 | +## Phase 1: Shared HTTP Infrastructure |
| 20 | +**Dependencies:** None — must be done first (all other phases depend on it) |
| 21 | + |
| 22 | +**Files to create/modify:** |
| 23 | +- `bcb/http.py` (NEW) — shared client module |
| 24 | +- `bcb/currency.py` |
| 25 | +- `bcb/sgs/__init__.py` |
| 26 | +- `bcb/odata/framework.py` |
| 27 | +- `pyproject.toml` — add `tenacity` dependency |
| 28 | + |
| 29 | +**Changes:** |
| 30 | +- Create `bcb/http.py` with: |
| 31 | + - `_CLIENT = httpx.Client(timeout=30.0, follow_redirects=True)` (sync) |
| 32 | + - `_ASYNC_CLIENT = httpx.AsyncClient(timeout=30.0, follow_redirects=True)` (async) |
| 33 | + - `DEFAULT_TIMEOUT = 30.0` constant |
| 34 | +- Replace all bare `httpx.get()` calls across modules with `_CLIENT.get()` |
| 35 | +- Standardize timeout: all calls use `DEFAULT_TIMEOUT`, allow per-call override via `timeout=` kwarg |
| 36 | +- Add `tenacity`-based `@retry` decorator for transient network errors (max 3 retries, fixed wait) |
| 37 | +- Fix `_get_valid_currency_list` recursive date-rollback: add `max_rollback=30` depth guard to prevent `RecursionError` |
| 38 | +- Fix `http://` URL in `_get_valid_currency_list` → `https://` |
| 39 | + |
| 40 | +**No async API yet** — `_ASYNC_CLIENT` just defined, used in Phase 6. |
| 41 | + |
| 42 | +--- |
| 43 | + |
| 44 | +## Phase 2: Exception Hierarchy & Error Handling |
| 45 | +**Dependencies:** Phase 1 (HTTP client standardized) |
| 46 | + |
| 47 | +**Files:** |
| 48 | +- `bcb/exceptions.py` |
| 49 | +- `bcb/currency.py` |
| 50 | +- `bcb/sgs/__init__.py` |
| 51 | + |
| 52 | +**Changes:** |
| 53 | +- In `bcb/exceptions.py`: |
| 54 | + - Make `BCBAPIError.status_code` required (remove `Optional`) |
| 55 | + - Add `BCBAPINotFoundError(BCBAPIError)` — for 404 responses |
| 56 | + - Add `BCBRateLimitError(BCBAPIError)` — for 429 responses |
| 57 | +- In `bcb/currency.py`: |
| 58 | + - Remove `warnings.warn()` at L141 → raise `BCBAPIError` with status code |
| 59 | + - Change `_fetch_symbol_response` to raise (never return `None`) — per A2 decision |
| 60 | + - Remove `Optional` return type from `_fetch_symbol_response`, `_get_symbol`, `_get_symbol_text` |
| 61 | + - Detect 429 → raise `BCBRateLimitError` |
| 62 | + - Detect 404 → raise `BCBAPINotFoundError` |
| 63 | +- In `bcb/sgs/__init__.py`: |
| 64 | + - Replace bare `except Exception` (L253) with `except json.JSONDecodeError` |
| 65 | + - Detect 429 → raise `BCBRateLimitError` |
| 66 | + |
| 67 | +--- |
| 68 | + |
| 69 | +## Phase 3: Cache Refactor |
| 70 | +**Dependencies:** None (independent) |
| 71 | + |
| 72 | +**Files:** |
| 73 | +- `bcb/currency.py` |
| 74 | +- `bcb/odata/framework.py` |
| 75 | + |
| 76 | +**Changes:** |
| 77 | +- In `bcb/currency.py`: |
| 78 | + - Replace `_CACHE: dict[str, pd.DataFrame]` with a thread-safe wrapper class: |
| 79 | + ```python |
| 80 | + class _Cache: |
| 81 | + def __init__(self) -> None: |
| 82 | + self._lock = threading.Lock() |
| 83 | + self._data: dict[CacheKey, pd.DataFrame] = {} |
| 84 | + def get(self, key): ... |
| 85 | + def set(self, key, value): ... |
| 86 | + def clear(self): ... |
| 87 | + ``` |
| 88 | + - Replace string keys `"TEMP_CURRENCY_ID_LIST"` / `"TEMP_FILE_CURRENCY_LIST"` with a `CacheKey` namedtuple (or `@dataclass(frozen=True)`) |
| 89 | + - Make cache injectable: `get(..., cache: _Cache = _DEFAULT_CACHE)` — module-level default, but testable |
| 90 | + - Update `clear_cache()` to delegate to `_DEFAULT_CACHE.clear()` |
| 91 | +- In `bcb/odata/framework.py`: |
| 92 | + - Add module-level `_METADATA_CACHE: dict[str, ODataMetadata] = {}` (keyed by URL) |
| 93 | + - In `ODataService.__init__`: check cache before fetching `$metadata` |
| 94 | + - Cache is in-memory, no TTL, lives with process |
| 95 | + |
| 96 | +--- |
| 97 | + |
| 98 | +## Phase 4: Data Validation & Type Safety |
| 99 | +**Dependencies:** Phases 1 and 2 (exception classes must exist) |
| 100 | + |
| 101 | +**Files:** |
| 102 | +- `bcb/currency.py` |
| 103 | +- `bcb/sgs/__init__.py` |
| 104 | +- `bcb/odata/api.py` |
| 105 | +- `bcb/odata/framework.py` |
| 106 | +- `bcb/utils.py` |
| 107 | + |
| 108 | +**Changes:** |
| 109 | +- **CSV validation** (`bcb/currency.py`): |
| 110 | + - Assert CSV has exactly 8 columns after parsing; raise `BCBAPIError` if not |
| 111 | + - Add explicit `errors='raise'` to `.astype()` calls for type conversion failures |
| 112 | +- **Date validation** (`bcb/currency.py`, `bcb/sgs/__init__.py`): |
| 113 | + - Wrap `pd.to_datetime(..., format=...)` in try/except `ValueError` → raise `BCBAPIError` with descriptive message |
| 114 | +- **Type conversion validation** (`bcb/currency.py`): |
| 115 | + - Wrap `.astype("int32")` and `.astype(np.float64)` conversions in try/except → raise `BCBAPIError` |
| 116 | +- **SGSCode → dataclass** (`bcb/sgs/__init__.py`): |
| 117 | + - Convert `SGSCode` to `@dataclass`: |
| 118 | + ```python |
| 119 | + @dataclass(frozen=True) |
| 120 | + class SGSCode: |
| 121 | + value: int |
| 122 | + name: str |
| 123 | + |
| 124 | + @classmethod |
| 125 | + def from_code(cls, code: int | str) -> SGSCode: ... |
| 126 | + |
| 127 | + @classmethod |
| 128 | + def from_named(cls, code: int | str, name: str) -> SGSCode: ... |
| 129 | + |
| 130 | + def __repr__(self) -> str: ... |
| 131 | + ``` |
| 132 | + - Remove the always-true nested `if isinstance(code, int | str)` check |
| 133 | +- **SGS input validation** (`bcb/sgs/__init__.py`): |
| 134 | + - In `_codes()`: validate that int codes are positive; raise `ValueError` if not |
| 135 | +- **TypedDict for dict returns** (`bcb/currency.py`, `bcb/sgs/__init__.py`): |
| 136 | + - Define `CurrencyTextResult = TypedDict("CurrencyTextResult", {"symbol": str, ...})` |
| 137 | + - Update overload declarations to use `TypedDict` |
| 138 | +- **Absolute imports + annotations** (all `bcb/` files): |
| 139 | + - Add `from __future__ import annotations` to every file |
| 140 | + - Convert all relative imports to absolute imports (e.g., `from .exceptions` → `from bcb.exceptions`) |
| 141 | +- **Format detection** (`bcb/currency.py`, `bcb/sgs/__init__.py`, `bcb/odata/framework.py`): |
| 142 | + - Add `_validate_*_response()` helpers that assert expected response structure |
| 143 | + - Called immediately after HTTP response parsing |
| 144 | + |
| 145 | +--- |
| 146 | + |
| 147 | +## Phase 5: API Consistency |
| 148 | +**Dependencies:** Phases 1–4 |
| 149 | + |
| 150 | +**Files:** |
| 151 | +- `bcb/currency.py` |
| 152 | +- `bcb/sgs/__init__.py` |
| 153 | +- `bcb/odata/api.py` |
| 154 | +- `bcb/odata/framework.py` |
| 155 | + |
| 156 | +**Changes:** |
| 157 | +- **Standardize `start`/`end` parameter names** across all modules (already consistent; verify and document) |
| 158 | +- **URL construction** (`bcb/currency.py`, `bcb/sgs/__init__.py`): |
| 159 | + - Replace raw f-string URL building with `urllib.parse.urlencode` / `urllib.parse.urljoin` where user input is involved |
| 160 | + - Note: OData framework already uses `urllib.parse.quote` — leave as-is |
| 161 | +- **OData `.get()` keyword args** (`bcb/odata/api.py`): |
| 162 | + - Add `filter=`, `orderby=`, `select=` as explicit keyword args to `Endpoint.get()` in addition to existing positional args |
| 163 | + |
| 164 | +--- |
| 165 | + |
| 166 | +## Phase 6: Async API |
| 167 | +**Dependencies:** Phase 1 (`_ASYNC_CLIENT` defined), Phases 2–5 (stable sync API) |
| 168 | + |
| 169 | +**Files:** |
| 170 | +- `bcb/sgs/__init__.py` |
| 171 | +- `bcb/currency.py` |
| 172 | +- `bcb/odata/framework.py` |
| 173 | +- `bcb/odata/api.py` |
| 174 | +- `bcb/http.py` |
| 175 | + |
| 176 | +**Changes:** |
| 177 | +- Add `async_get()` in `bcb/sgs/__init__.py`: |
| 178 | + - Same signature as `get()` but `async def async_get(...)` |
| 179 | + - Uses `_ASYNC_CLIENT.get()` from `bcb/http.py` |
| 180 | + - Internal `_async_get_json()` helper |
| 181 | +- Add `async_get()` in `bcb/currency.py`: |
| 182 | + - Async version of the currency fetch flow |
| 183 | +- Add `async_text()` / `async_collect()` to `ODataQuery` in `bcb/odata/framework.py` |
| 184 | +- Add `async_get()` / `async_query()` to `Endpoint` in `bcb/odata/api.py` |
| 185 | +- Lifecycle note: `_ASYNC_CLIENT` is a module-level object; document that it should be closed in long-running apps via `bcb.http.close_async_client()` |
| 186 | + |
| 187 | +--- |
| 188 | + |
| 189 | +## Phase 7: Testing Overhaul |
| 190 | +**Dependencies:** Phases 1–5 complete (stable API) |
| 191 | + |
| 192 | +**Files:** |
| 193 | +- `tests/conftest.py` |
| 194 | +- `tests/test_currency.py` |
| 195 | +- `tests/test_sgs.py` |
| 196 | +- `tests/test_odata.py` (new) |
| 197 | +- `tests/integration/` (existing) |
| 198 | + |
| 199 | +**Changes:** |
| 200 | +- Replace hardcoded mock data constants in `conftest.py` with factory functions |
| 201 | + - e.g., `make_currency_csv(symbols=["USD"], dates=1)` generates minimal valid CSV |
| 202 | + - e.g., `make_sgs_response(code=1, n=1)` generates minimal valid JSON |
| 203 | +- Add comprehensive negative test cases: |
| 204 | + - Malformed input: bad dates, invalid symbols, non-integer SGS codes, empty symbol list |
| 205 | + - API failures: 404, 429, 500 responses |
| 206 | + - Network failures: connection timeout, malformed JSON, malformed CSV, wrong column count |
| 207 | +- Use `pytest-httpx` to mock all HTTP calls in unit tests (eliminate all flakiness) |
| 208 | +- Move all tests using live HTTP to `tests/integration/` |
| 209 | +- Remove `@mark.flaky` from any test where HTTP is now properly mocked |
| 210 | +- Add tests for thread-safety of cache (concurrent reads/writes) |
| 211 | +- Add tests for async API (use `pytest-anyio` or `anyio`) |
| 212 | + |
| 213 | +--- |
| 214 | + |
| 215 | +## Phase 8: Logging & Documentation |
| 216 | +**Dependencies:** All previous phases (API surface is stable) |
| 217 | + |
| 218 | +**Files:** |
| 219 | +- All `bcb/` modules |
| 220 | +- `README.md` |
| 221 | +- `docs/` (existing Sphinx docs) |
| 222 | +- `examples/` (NEW directory) |
| 223 | + |
| 224 | +**Changes:** |
| 225 | +- **Logging** (`bcb/currency.py`, `bcb/sgs/__init__.py`, `bcb/odata/framework.py`): |
| 226 | + - Add `logger = logging.getLogger(__name__)` to each module |
| 227 | + - `logger.debug()` before each HTTP request: URL and params (no secrets) |
| 228 | + - `logger.debug()` after each response: status code and content length |
| 229 | + - `logger.warning()` for retry attempts |
| 230 | +- **Docstrings** (all public functions): |
| 231 | + - Full docstrings: summary, Parameters, Returns, Raises sections |
| 232 | + - Minimal one-line docstrings for private helpers |
| 233 | + - Add `pydocstyle` to ruff config (or `ruff` rule `D` subset) |
| 234 | +- **Examples** (`examples/`): |
| 235 | + - `examples/sgs_time_series.py` — basic SGS usage |
| 236 | + - `examples/currency_exchange.py` — currency rate fetching |
| 237 | + - `examples/odata_query.py` — OData filter/orderby usage |
| 238 | + - `examples/async_usage.py` — async API example |
| 239 | +- **README** (`README.md`): |
| 240 | + - Add "Which module to use?" decision table |
| 241 | + - Add FAQ section for common use cases |
| 242 | + |
| 243 | +--- |
| 244 | + |
| 245 | +## Summary |
| 246 | + |
| 247 | +| Phase | Area | Key Files | Scope | |
| 248 | +|-------|------|-----------|-------| |
| 249 | +| 1 | Shared HTTP client | `bcb/http.py` (new), 3 existing | Foundation | |
| 250 | +| 2 | Exception hierarchy | `bcb/exceptions.py`, `bcb/currency.py`, `bcb/sgs/__init__.py` | Errors | |
| 251 | +| 3 | Cache refactor | `bcb/currency.py`, `bcb/odata/framework.py` | State mgmt | |
| 252 | +| 4 | Data validation & types | 5 files | Type safety | |
| 253 | +| 5 | API consistency | 4 files | API polish | |
| 254 | +| 6 | Async API | 5 files | New feature | |
| 255 | +| 7 | Testing overhaul | `tests/` | Quality | |
| 256 | +| 8 | Logging & docs | All + new `examples/` | DX | |
| 257 | + |
| 258 | +## Verification (Run After Each Phase) |
| 259 | +```bash |
| 260 | +uv run pytest -m "not integration" |
| 261 | +uv run ruff check bcb/ tests/ |
| 262 | +uv run ruff format --check bcb/ tests/ |
| 263 | +uv run mypy bcb/ |
| 264 | +``` |
0 commit comments