Skip to content

[WIP] Update CrawlService to use OCP HTTP client instead of Guzzle#2432

Closed
Copilot wants to merge 1 commit into
masterfrom
copilot/update-crawl-service-http-client
Closed

[WIP] Update CrawlService to use OCP HTTP client instead of Guzzle#2432
Copilot wants to merge 1 commit into
masterfrom
copilot/update-crawl-service-http-client

Conversation

Copy link
Copy Markdown

Copilot AI commented Jun 1, 2026

Thanks for asking me to work on this. I will get started on it and keep this PR's description up to date as I form a plan and make progress.

Original prompt

Update lib/Service/CrawlService.php in nextcloud/bookmarks to use Nextcloud's built-in OCP HTTP client instead of directly instantiating raw Guzzle classes, while preserving the current behavior.

Context:

  • The current implementation imports and uses GuzzleHttp\Client and GuzzleHttp\Psr7\Response directly.
  • The file currently does a GET request in crawl(Bookmark $bookmark): void with these request options:
    • headers['User-Agent'] = self::UA_FIREFOX
    • connect_timeout = self::CONNECT_TIMEOUT
    • timeout = self::TIMEOUT
    • read_timeout = self::READ_TIMEOUT
    • http_errors = false
  • It treats availability as statusCode !== 404, logs exceptions as warnings, and then, when available:
    • sets the user in UserSettingsService
    • optionally archives the file/content using the HTTP response
    • generates bookmark preview and favicon
    • marks preview created, sets availability, and updates the bookmark
  • The helper methods archiveContent() and archiveFile() currently accept the Guzzle response object and use:
    • getHeader('Content-Type')
    • getHeader('Content-Length')
    • getBody()
    • getStatusCode()

Requirements:

  1. Replace direct Guzzle client usage with the HTTP client provided by Nextcloud OCP.
  2. Preserve the current behavior as closely as possible, including:
    • sending the same User-Agent
    • respecting timeout behavior as closely as OCP allows
    • not throwing on non-404/non-2xx responses if the OCP client has an equivalent option
    • considering the bookmark available whenever the request succeeds and the status code is not 404
    • preserving existing archive and preview behavior
  3. Refactor the code as needed to work with the OCP client/response interfaces, but keep the change focused to CrawlService unless a small supporting change is necessary.
  4. Update constructor dependencies if needed to inject the appropriate OCP HTTP service/factory.
  5. Ensure imports are updated accordingly and direct raw Guzzle imports are removed if no longer needed.
  6. If tests exist and need updates due to constructor signature or behavior, update them as part of the PR.

Please create a pull request with the implementation.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: Can you make a PR that changes CrawlService to use the HTTP client built into Nextcloud OCP, instead of raw Guzzle, while keeping the current functionality?

Assistant: I’m waiting for your confirmation on the pull request action—once you approve it, I can continue.

This pull request was created from Copilot chat.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants