Skip to content

Add Chebyshev approximations#18

Open
sigilante wants to merge 8 commits into
mainfrom
claude/add-numeric-types-A3HVY
Open

Add Chebyshev approximations#18
sigilante wants to merge 8 commits into
mainfrom
claude/add-numeric-types-A3HVY

Conversation

@sigilante
Copy link
Copy Markdown
Collaborator

Prototype implementation demonstrating jet-compatible approach for transcendental functions (exp, sin, cos, tan) using:

  • Fixed polynomial coefficients from musl libc
  • Deterministic argument reduction
  • Horner's method evaluation

Key principle: every FP operation happens in same order with same intermediate values in both Hoon and C jet.

Prototype implementation demonstrating jet-compatible approach for
transcendental functions (exp, sin, cos, tan) using:
- Fixed polynomial coefficients from musl libc
- Deterministic argument reduction
- Horner's method evaluation

Key principle: every FP operation happens in same order with same
intermediate values in both Hoon and C jet.
Complete fixed-polynomial math library with:
- Half precision (@rh): degree 5 sin, degree 4 cos, degree 3 exp
- Single precision (@rs): degree 9 sin, degree 10 cos, degree 4 exp
- Double precision (@rd): degree 13 sin, degree 14 cos, degree 5 exp

All coefficients from musl libc expressed as hex literals for
bit-exact jet compatibility. Each precision level includes:
- Polynomial coefficients and reduction constants
- Chebyshev polynomial generator (+cheb)
- Kernel functions (sindf, cosdf)
- Public API (sin, cos, tan, exp)
- Helper functions (scalbn, floor-int, rem-pio2)
Complete binary128 (128-bit) implementation with:
- Degree 17 sin polynomial (S1-S8)
- Degree 16 cos polynomial (C1-C8)
- Degree 7 exp polynomial (P1-P7)
- Chebyshev polynomial generator (+cheb)
- Helper functions (scalbn, floor-int, rem-pio2)

Coefficients derived from FreeBSD ld128/k_sinl.c, k_cosl.c.
Uses .~~~N syntax for quad precision literals.
Extended math-chebyshev.hoon with:
- log: Fixed polynomial (Lg1-Lg4 for @rs, Lg1-Lg7 for @rd)
- sqrt: Newton-Raphson with fixed iterations (4 for @rs, 6 for @rd)
- atan: Fixed polynomial with range reduction
- asin, acos, atan2: Derived from atan
- pow-n: Binary exponentiation (fixed 32/64 iterations)
- pow: exp(n*log(x)) for non-integer exponents
- log-2, log-10: Derived from log

Added test suite with:
- Individual function tests for @rs and @rd
- Identity tests (sin^2+cos^2=1, exp(log(x))=x, etc.)
- Chebyshev polynomial verification
Half precision (@rh):
- log: Simplified polynomial (single Lg coefficient)
- sqrt: Newton-Raphson with 3 iterations
- atan: Reduced polynomial with large-argument reduction
- asin, acos, atan2, pow-n, pow

Quad precision (@rq):
- log: 8-coefficient polynomial (Lg1-Lg8) for 113-bit accuracy
- sqrt: Newton-Raphson with 8 iterations
- atan: 6-coefficient polynomial with range reduction
- asin, acos, atan2, pow-n, pow

All functions use hex-exact coefficients for jet compatibility.
Adds comprehensive tests for half precision (@rh) and quad precision (@rq):

- Helper functions: expect-near-rh, expect-near-rq
- Function tests: sin, cos, exp, log, sqrt, atan, pow-n, cheb
- Identity tests: sin²+cos²=1, exp(log(x))=x, sqrt(x)²=x

Half precision tests use hex literals (0x3c00 = 1.0, etc.)
Quad precision tests use .~~~X literal format
Hyperbolic functions added to all precision levels (@rh, @rs, @rd, @rq):
- sinh: Hyperbolic sine via (exp(x) - exp(-x)) / 2
- cosh: Hyperbolic cosine via (exp(x) + exp(-x)) / 2
- tanh: Hyperbolic tangent via (e^2x - 1) / (e^2x + 1)
- asinh: Inverse hyperbolic sine via ln(x + sqrt(x^2 + 1))
- acosh: Inverse hyperbolic cosine via ln(x + sqrt(x^2 - 1))
- atanh: Inverse hyperbolic tangent via 0.5 * ln((1+x)/(1-x))

Special function:
- erf: Error function using Abramowitz & Stegun approximation 7.1.26

Tests include:
- Individual function tests for sinh, cosh, tanh, asinh, acosh, atanh
- Hyperbolic identity: cosh^2(x) - sinh^2(x) = 1
- Inverse identities: asinh(sinh(x))=x, acosh(cosh(x))=x, atanh(tanh(x))=x
- erf function tests and symmetry: erf(-x) = -erf(x)
Gamma function:
- Uses Lanczos approximation (g=7, n=9 coefficients)
- lgamma: Log-gamma function via log(gamma(x))
- Handles special cases: Γ(1)=1, Γ(2)=1

Bessel functions of first kind:
- j0: J₀(x) using polynomial for |x|<8, asymptotic otherwise
- j1: J₁(x) with proper sign handling

Bessel functions of second kind:
- y0: Y₀(x) using log-based formula for small x
- y1: Y₁(x) with asymptotic expansion for large x

Tests include:
- Gamma at integer values (factorial property)
- Γ(1.5) = √π/2 ~ 1.7724538
- Γ(n+1) = n*Γ(n) identity
- J0, J1 at standard test points
- Numerical derivative identity: J1(x) ≈ -J0'(x)
- Y0, Y1 at standard test points
@sigilante
Copy link
Copy Markdown
Collaborator Author

Follow-up analysis: libmath direction (from the Saloon eig work)

Captured here so it isn't lost — context is the Saloon eigendecomposition effort (PRs #47/#48), which surfaced concrete evidence for why this PR's approach is the right one. Pending for now; this is a roadmap, not a request to merge.

The problem this PR solves (with evidence)

The current math.hoon implements transcendentals as naive Taylor series with rtol-driven convergence loops (exp/sin/cos/log/sqt/… all loop until (lth (abs (sub po p)) rtol)). Three structural problems, worst last:

  1. Unbounded → can hang the ship. While developing Hermitian eig (Saloon: Hermitian (%cplx) eigendecomposition (eig Phase A2) #48), a sub-ULP rtol made Newton sqt oscillate between two adjacent floats forever → loom exhaustion → recover: dig: meme + segfault (had to reboot with a bigger loom). This hazard exists in every rtol-loop transcendental, not just sqt.
  2. Inaccurate. Argument reduction is minimal (sin does a single lossy (mod x tau); exp has none past overflow guards). Taylor converges slowly with cancellation — the doc comments show (cos pi) = -1.0000000000013558 (~1.4e-12 off, not f64-accurate). That's an accuracy ceiling for the whole Lagoon/Saloon stack.
  3. Not bit-exactly jettable — the decisive one. Iteration count depends on value and rtol, and term order is series-specific, so a C jet can't reproduce the exact SoftFloat op sequence to yield an identical noun. Effectively unjettable as written.

Why the fixed-polynomial approach here is correct

This PR's stated principle — "every FP op happens in the same order with the same intermediate values in both Hoon and the C jet" — fixes all three at once:

  • Fixed-degree minimax/Chebyshev polys with hex-exact frozen coefficients (musl is a fine reference) → bounded constant work → no hang.
  • Deterministic argument reduction (Cody-Waite / Payne-Hanek) + Horner (or Clenshaw for a Chebyshev series) → near-correctly-rounded at low degree → accurate.
  • Fixed straight-line op sequence → a C jet calling SoftFloat in the same order is bit-identical. Since vere's jets and Hoon @rd ops share the SoftFloat engine, bit-exactness is attainable — but only once the op order is fixed (which Taylor-with-variable-iterations is not).

Consequences / decisions for the migration

  • rtol largely goes away. Fixed-degree functions take no tolerance; the [rnd rtol] door sample becomes vestigial for most arms. This also removes the exact foot-gun that crashed the ship (mismatched-width rtol).
  • sqt is the natural first target and a clean template: Newton-for-sqrt with a fixed iteration count (5–6 steps after a good seed) is already deterministic and jettable — quadratic convergence makes it bit-stable. Saloon Saloon: Hermitian (%cplx) eigendecomposition (eig Phase A2) #48 ships an interim iteration-capped fsqt; the principled replacement is a jettable libmath sqt that Saloon then defers to.
  • Clenshaw (Chebyshev series) vs Horner (minimax monomials): both are fixed-op-order and jettable. musl-style Horner with frozen coefficients matches the coefficient tables directly and is simplest for the jet path; the +cheb T_n recurrence in this PR is best viewed as derivation tooling, somewhat orthogonal to runtime evaluation. Recommend Horner for the jetted path, keep a Chebyshev generator as tooling.
  • Payoff is automatic for Saloon/Lagoon: their element-wise exp/sin/… and the eig sqt route through libmath via trans-scalar, so converting libmath gives the array stack accurate + fast + jettable transcendentals for free.
  • vs PR Add Chebyshev function support for more efficient calculations. #9 (sigilante/cheb): older and sprawls across lib3d/lagoon/math; likely superseded by this focused prototype. Salvage any useful coefficients, otherwise let it lapse.

Suggested first step (when picked up)

Convert sqt (all widths) to a fixed-iteration, rtol-free, bit-exact-jet-ready form as the migration template, then retire Saloon's interim fsqt. Small, self-contained, closes the durability hole, and establishes the pattern for exp/sin/cos/log.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants