
Add API request IDs, structured errors, pagination and filters #73

Merged
Jagadeeshck merged 1 commit into main from codex/add-production-api-features-and-pagination
May 21, 2026

Conversation

@Jagadeeshck

Owner

Motivation

  • Bring API behaviour to production readiness by adding request correlation, consistent structured errors, and safe list handling for large datasets.
  • Provide a single, predictable error contract and request IDs to help clients and server-side observability/debugging.
  • Support basic offset pagination and useful query filters on commonly used list endpoints while preserving existing top-level response keys.

Description

  • Added request ID middleware that accepts an incoming X-Request-ID or generates a UUID, stores it on request.state, and echoes it on every response; request IDs are included in all structured error payloads (src/api/app.py).
  • Introduced a consistent structured error schema ErrorResponse (error.code, error.message, error.details, request_id) and wired handlers for general HTTPException, StarletteHTTPException, RequestValidationError (400/422) and unhandled exceptions (500) (src/api/app.py).
  • Implemented pagination models (PaginationMeta) and applied limit/offset query parameters (defaults and caps via Query) to GET /rules, GET /quality/results, GET /lineage/nodes, and GET /lineage/edges, preserving the existing top-level keys (rules, results, nodes, edges) and adding pagination metadata (src/api/app.py).
  • Added filter query parameters to the rules, quality results, and lineage nodes/edges endpoints, declaring them with Query so that they appear in the OpenAPI schema (src/api/app.py).
  • Extended the store contract to accept optional limit/offset and filter parameters, and updated InMemoryStore and ElasticsearchStore to perform filtering/pagination (pushed into ES where practical) while keeping backwards-compatible defaults (src/api/store.py, src/api/es_store.py).
  • Updated and added tests covering request ID propagation, generated request IDs, structured error payloads with request_id, pagination and filter behaviour, enforcement of the maximum limit, and rejection of negative offsets; added documentation at docs/api/rest-api.md with examples and curl snippets.
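
For readers who want a concrete picture of the request ID and error contract described above, here is a minimal, framework-free sketch. It is not the code in src/api/app.py: the helper names `resolve_request_id` and `build_error_payload` are hypothetical, and nesting code, message and details under an `error` key is inferred from the field names error.code, error.message and error.details.

```python
import uuid
from typing import Any, Dict, Mapping, Optional


def resolve_request_id(headers: Mapping[str, str]) -> str:
    """Return the caller-supplied X-Request-ID, or a fresh UUID if absent."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    incoming = normalised.get("x-request-id", "").strip()
    return incoming or str(uuid.uuid4())


def build_error_payload(
    code: str,
    message: str,
    request_id: str,
    details: Optional[Any] = None,
) -> Dict[str, Any]:
    """Assemble the envelope: error.code, error.message, error.details, request_id."""
    # Assumed layout: the three error fields are nested under "error" and the
    # request ID sits at the top level next to it.
    return {
        "error": {"code": code, "message": message, "details": details},
        "request_id": request_id,
    }


if __name__ == "__main__":
    rid = resolve_request_id({"X-Request-ID": "abc-123"})
    # "not_found" is an illustrative code value, not one taken from the PR.
    print(build_error_payload("not_found", "Rule not found", rid))
```

In the real service the same two steps would presumably run inside the middleware and the exception handlers, with the resolved ID stored on request.state and echoed in the X-Request-ID response header.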

Testing

  • Byte-compiled the sources with python -m compileall -q src tests; there were no compile failures.
  • Ran the endpoint and unit test files individually (pytest tests/test_api_endpoints.py -q, pytest tests/test_api_es_store.py -q, pytest tests/test_settings.py -q, pytest tests/test_poc_pipeline.py -q); all of them passed.
  • Ran the full suite with pytest -q; all tests passed, with some integration tests skipped by default.

Codex Task

chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 22fff0accf

Comment thread src/api/app.py
Comment on lines +185 to 189
        return
    token = credentials.credentials if credentials and credentials.scheme.lower() == "bearer" else None
    # Preserve the legacy parser's exact "Bearer " prefix behavior for unusual clients.
    if token is None and authorization and authorization.startswith("Bearer "):
        token = authorization[len("Bearer "):].strip()
    if token != settings.api_token:

P1: Reject unauthenticated requests when strict auth is enabled

When api_token is unset and allow_unauthenticated_dev is False, this logic still allows requests without an Authorization header: token stays None, and token != settings.api_token is False when both are None. That means protected endpoints become publicly accessible under a strict-auth configuration with a missing token, which is a security regression from the previous behavior.
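
One way to close this gap is to fail closed whenever no token is configured. The sketch below is an assumption about how that could look, not the fix that was applied: `is_authorized` is a hypothetical helper, and it assumes allow_unauthenticated_dev is only meant to open access when no api_token is set.

```python
import hmac
from typing import Optional


def is_authorized(
    presented_token: Optional[str],
    configured_token: Optional[str],
    allow_unauthenticated_dev: bool,
) -> bool:
    """Decide whether a request may proceed (fail-closed sketch)."""
    if not configured_token:
        # No token configured: only the explicit dev flag opens access, so a
        # missing header can never "match" a missing setting (None == None).
        return allow_unauthenticated_dev
    if not presented_token:
        # A token is required but the client sent none.
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented_token.encode(), configured_token.encode())
```

With this shape, the case called out above (api_token unset, allow_unauthenticated_dev False, no Authorization header) returns False instead of falling through the inequality check.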

Owner Author

@copilot fix issues based on recommendation

Comment thread src/api/app.py
Comment on lines 220 to +223
      rules = stores.store.get_all_rules()
-     return {"rules": rules, "count": len(rules)}
+     filtered = [r for r in rules if (dataset is None or r.get("dataset") == dataset) and (enabled is None or r.get("enabled") == enabled) and (severity is None or r.get("severity") == severity) and (check_type is None or r.get("check_type", r.get("type")) == check_type)]
+     page = _paginate(filtered, limit, offset)
+     return {"rules": page["items"], "count": len(page["items"]), "pagination": page["pagination"]}

P1: Propagate /rules limit and offset into the store query

This endpoint now calls get_all_rules() without pagination arguments, then paginates the returned subset in memory. Because store defaults are limit=100, offset=0, requests like ?limit=500 or ?offset=120 can never access records beyond the first 100, and pagination.total is computed from that truncated slice rather than the real result set.
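
The truncation is easy to reproduce with a stand-in store. The sketch below is illustrative only: `get_all_rules` here is a toy function that mimics the limit=100, offset=0 defaults, while `count_rules` and the limit/offset fields inside the pagination metadata are assumptions rather than the project's actual store contract or PaginationMeta definition.

```python
from typing import Any, Dict, List

# 250 fake rules, so anything past index 99 is invisible to a default call.
ALL_RULES: List[Dict[str, Any]] = [{"id": i, "dataset": "sales"} for i in range(250)]


def get_all_rules(limit: int = 100, offset: int = 0) -> List[Dict[str, Any]]:
    """Stand-in store call whose defaults return only the first 100 rows."""
    return ALL_RULES[offset : offset + limit]


def count_rules() -> int:
    """Hypothetical store call returning the size of the full result set."""
    return len(ALL_RULES)


def paginate(items: List[Dict[str, Any]], limit: int, offset: int) -> Dict[str, Any]:
    """Slice a list and report its length before slicing as the total."""
    return {
        "items": items[offset : offset + limit],
        "pagination": {"limit": limit, "offset": offset, "total": len(items)},
    }


def list_rules_truncated(limit: int, offset: int) -> Dict[str, Any]:
    # Problem case: the store is called with its defaults, so the in-memory
    # paginator only ever sees rows 0..99 and reports total == 100.
    return paginate(get_all_rules(), limit, offset)


def list_rules_propagated(limit: int, offset: int) -> Dict[str, Any]:
    # Suggested shape: the store does the slicing and the total comes from a
    # separate count over the full result set.
    rows = get_all_rules(limit=limit, offset=offset)
    return {
        "items": rows,
        "pagination": {"limit": limit, "offset": offset, "total": count_rules()},
    }
```

Calling `list_rules_truncated(50, 120)` yields an empty page, whereas `list_rules_propagated(50, 120)` returns rules 120 to 169. If the filters stay in the endpoint, they would need to move into the store call as well, otherwise the store would slice before filtering.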

Owner Author

@copilot fix issues based on recommendation

Comment thread src/api/store.py
Comment on lines +62 to 66
    def list_quality_results(self, limit: int = 100, offset: int = 0, table: Optional[str] = None, status: Optional[str] = None, dataset: Optional[str] = None, check_type: Optional[str] = None, severity: Optional[str] = None, run_id: Optional[str] = None) -> List[Dict[str, Any]]:
        results = list(self._quality.values())
        if table:
            results = [r for r in results if r.get("table") == table]
        if status:

P2: Honor check_type, severity, and run_id in memory filtering

The in-memory implementation advertises check_type, severity, and run_id filters in its method signature, but it only applies table, status, and dataset. In memory-backed deployments/tests, those documented query params are silently ignored, producing incorrect results and backend-inconsistent behavior.
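
A generic filter loop keeps the in-memory path from drifting out of sync with the method signature. The following is a rough sketch rather than the InMemoryStore code: `filter_quality_results` is a hypothetical free function, and it assumes each filter name (table, status, dataset, check_type, severity, run_id) matches a key of the same name in the stored result.

```python
from typing import Any, Dict, List, Optional


def filter_quality_results(
    results: List[Dict[str, Any]],
    limit: int = 100,
    offset: int = 0,
    **filters: Optional[str],
) -> List[Dict[str, Any]]:
    """Apply every supplied filter (those that are not None), then slice."""
    # Drop filters the caller left unset so they do not constrain the match.
    active = {field: value for field, value in filters.items() if value is not None}
    matched = [
        row
        for row in results
        if all(row.get(field) == value for field, value in active.items())
    ]
    # Paginate only after filtering so offset counts matching rows.
    return matched[offset : offset + limit]
```

Because the filters are handled uniformly, adding a new keyword to the signature cannot be silently ignored the way check_type, severity, and run_id are here. Note that this sketch tests `is not None` rather than truthiness, so an empty string would be treated as a real filter value.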

Owner Author

@copilot fix issues based on recommendation

@Jagadeeshck Jagadeeshck merged commit c9e9e96 into main May 21, 2026
15 of 21 checks passed
@Jagadeeshck Jagadeeshck deleted the codex/add-production-api-features-and-pagination branch June 6, 2026 16:45