Floating point arithmetic #
The floating point number ⟨n,s⟩ represents n * 2^(s - 2^63), where n : Int64, s : UInt64.
Validity of a Floating as a single type
0has a single, standardized representation- nan_same : n = Int64.minValue → s = UInt64.max
nanhas a single, standardized representation If we're not
0,nan, or denormalized, the high bit ofnis set
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Computational version of Floating.Valid
Equations
- Floating.valid n s = bif n == 0 then s == 0 else bif n == Int64.minValue then s == UInt64.max else s == 0 || decide (1 <<< 62 ≤ n.uabs)
Instances For
Floating.valid decides Floating.Valid
Valid is decidable
Equations
- Floating.instDecidableValid n s = if v : Floating.valid n s = true then isTrue ⋯ else isFalse ⋯
Standard floating point nan
Equations
- Floating.instNan = { nan := { n := Int64.minValue, s := UInt64.max, v := Floating.instNan._proof_1 } }
0 : Floating
Equations
- Floating.instZero = { zero := { n := 0, s := 0, v := Floating.instZero._proof_1 } }
If we're not nan, x.n ≠ .min
Simplification lemmas used elsewhere #
This should really be cleaned up
The smallest normalized value #
@[irreducible]
The smallest normalized floating point value
Instances For
Print raw representation of a Floating
Equations
- Floating.instReprRaw = { reprPrec := fun (x : Raw Floating) (x_1 : ℕ) => Std.format "⟨⟨" ++ Std.format x.val.n.toUInt64 ++ Std.format "⟩, ⟨" ++ Std.format x.val.s ++ Std.format "⟩, _⟩" }