[Copilot] BuildLookasideURL performs raw placeholder substitution and can generate incorrect or invalid URLs #73

@dmcilvaney

Issue Description

BuildLookasideURL constructs lookaside URLs by performing raw string substitution for $pkg, $filename, $hashtype, and $hash without URL-encoding substituted values.

If packageName or fileName contains reserved URL characters such as /, ?, #, or malformed % escapes, the generated URL can change meaning or become invalid.

This is a shared helper issue. The newer download-sources flow increases its reachability because it accepts filenames from Fedora-style sources files and feeds them into this path.

Impact

  • Generated lookaside URLs can point at the wrong resource.
  • Malformed URLs may fail only later when an HTTP request is created.
  • Logged URLs may look plausible while the parsed request semantics differ.

This is primarily a correctness and robustness issue with some security relevance. It does not appear to be straightforward arbitrary-host SSRF because the host still comes from distro configuration, and hash verification limits silent content substitution after download.

Root cause

BuildLookasideURL treats placeholder values as plain strings instead of URL path components.

That means inserted values can change URL structure:

  • / creates extra path segments
  • ? starts a query string
  • # starts a fragment
  • malformed % sequences can make request creation fail

Reachability

The issue is reachable through at least two paths:

  1. normal lookaside downloads via the source manager
  2. dist-git URL construction that also performs raw $pkg substitution

Example

Given a template such as:

https://example.com/lookaside/$pkg/$filename/$hashtype/$hash/$filename

These inputs are problematic:

filename = "foo/bar"
filename = "file?x=1"
filename = "file#frag"
filename = "file%zz"
packageName = "foo/bar"
packageName = "foo#bar"

Today those values are inserted directly into the template instead of being escaped as URL path segments, so each one alters the structure of the resulting URL or makes it unparseable.

Recommended fix

  1. URL-escape placeholder values intended for path positions, for example with url.PathEscape().
  2. Parse and validate the final URL before returning it so malformed escapes and other structural issues fail early.
  3. Apply the same fix to dist-git URL construction, which currently has the same raw-substitution problem for $pkg.
  4. If any placeholders are not always used in path positions, explicitly validate and reject unsafe values rather than relying on raw replacement.
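Steps 1 and 2 could look roughly like the following. This is a minimal sketch, assuming every placeholder sits in a path position as in the example template; buildEscapedURL and its signature are hypothetical and do not reflect the project's real function.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildEscapedURL is a hypothetical replacement for the raw substitution.
// It escapes each value as a single path segment, then parses the result so
// structural problems surface before any HTTP request is built.
func buildEscapedURL(template, pkg, filename, hashType, hash string) (string, error) {
	replaced := strings.NewReplacer(
		"$pkg", url.PathEscape(pkg),
		"$filename", url.PathEscape(filename),
		"$hashtype", url.PathEscape(hashType), // listed before "$hash" so it wins
		"$hash", url.PathEscape(hash),
	).Replace(template)

	parsed, err := url.Parse(replaced)
	if err != nil {
		return "", fmt.Errorf("invalid lookaside URL %q: %w", replaced, err)
	}
	if parsed.Scheme == "" || parsed.Host == "" {
		return "", fmt.Errorf("lookaside URL %q is missing a scheme or host", replaced)
	}
	return replaced, nil
}

func main() {
	const tmpl = "https://example.com/lookaside/$pkg/$filename/$hashtype/$hash/$filename"

	got, err := buildEscapedURL(tmpl, "pkg", "file?x=1", "sha512", "abc123")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
	// prints "https://example.com/lookaside/pkg/file%3Fx=1/sha512/abc123/file%3Fx=1"
}
```

url.PathEscape turns /, ?, #, and % into %2F, %3F, %23, and %25, so each value stays inside one path segment. It does not touch "." or "..", so values like those would still need the explicit validation mentioned in step 4.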

Suggested tests

Add coverage for BuildLookasideURL and equivalent dist-git URL builders for:

  1. filename = "foo/bar"
  2. filename = "file?x=1"
  3. filename = "file#frag"
  4. filename = "file%zz"
  5. equivalent cases for packageName

Each test should assert one of two outcomes: a correctly escaped URL, or an explicit validation error.

Affected areas

  • internal/providers/sourceproviders/fedorasource/fedorasource.go
  • internal/providers/sourceproviders/sourcemanager.go
  • internal/providers/sourceproviders/fedorasourceprovider.go
  • internal/app/azldev/cmds/downloadsources/downloadsources.go

Expected Changes

Review the findings above and implement a fix.
