Floating-point considerations

Investigating a discrepancy

In the Converting a single-scale simulation to multi-scale page, a single-scale simulation was converted to an equivalent multiscale simulation, and outputs were compared. One detail that was glossed over, but important to bear in mind as a PlantSimEngine user is related to floating-point approximations.

Single-scale simulation

meteo_day = CSV.read(joinpath(pkgdir(PlantSimEngine), "examples/meteo_day.csv"), DataFrame, header=18)

models_singlescale = ModelList(
    ToyLAIModel(),
    Beer(0.5),
    ToyRUEGrowthModel(0.2),
    status=(TT_cu=cumsum(meteo_day.TT),),
)

outputs_singlescale = run!(models_singlescale, meteo_day)
TimeStepTable{Status{(:TT_cu, :LAI, :aPPFD,...}(365 x 5):
╭─────┬──────────┬────────────┬───────────┬────────────┬───────────────────╮
 Row     TT_cu         LAI      aPPFD     biomass  biomass_increment 
       Float64     Float64    Float64     Float64            Float64 
├─────┼──────────┼────────────┼───────────┼────────────┼───────────────────┤
   1       0.0  0.00554988  0.0396961  0.00793922         0.00793922 
   2       0.0  0.00554988    0.02173   0.0122852           0.004346 
   3       0.0  0.00554988  0.0314899   0.0185832         0.00629798 
   4       0.0  0.00554988  0.0390834   0.0263999         0.00781668 
   5       0.0  0.00554988  0.0454514   0.0354902         0.00909028 
   6       0.0  0.00554988  0.0472677   0.0449437         0.00945354 
   7       0.0  0.00554988    0.04346   0.0536357         0.00869201 
   8       0.0  0.00554988  0.0469832   0.0630324         0.00939665 
   9       0.0  0.00554988  0.0291703   0.0688664         0.00583406 
  10       0.0  0.00554988  0.0140052   0.0716675         0.00280105 
  11       0.0  0.00554988  0.0505283   0.0817731          0.0101057 
  12       0.0  0.00554988  0.0405277   0.0898787         0.00810554 
  13    0.5625  0.00557831  0.0297814    0.095835         0.00595629 
  14  0.945833  0.00559777  0.0433269      0.1045         0.00866538 
  15  0.979167  0.00559946  0.0470271    0.113906         0.00940542 

╰─────┴──────────┴────────────┴───────────┴────────────┴───────────────────╯
                                                            350 rows omitted

### Multi-scale equivalent

PlantSimEngine.@process "tt_cu" verbose = false

struct ToyTt_CuModel <: AbstractTt_CuModel end

function PlantSimEngine.run!(::ToyTt_CuModel, models, status, meteo, constants, extra=nothing)
    status.TT_cu +=
        meteo.TT
end

function PlantSimEngine.inputs_(::ToyTt_CuModel)
    NamedTuple() # No input variables
end

function PlantSimEngine.outputs_(::ToyTt_CuModel)
    (TT_cu=0.0,)
end

mapping_multiscale = Dict(
    "Scene" => ToyTt_CuModel(),
    "Plant" => (
        MultiScaleModel(
            model=ToyLAIModel(),
            mapped_variables=[
                :TT_cu => "Scene",
            ],
        ),
        Beer(0.5),
        ToyRUEGrowthModel(0.2),
    ),
)

mtg_multiscale = MultiScaleTreeGraph.Node(MultiScaleTreeGraph.NodeMTG("/", "Plant", 0, 0),)
    plant = MultiScaleTreeGraph.Node(mtg_multiscale, MultiScaleTreeGraph.NodeMTG("+", "Plant", 1, 1))

outputs_multiscale = run!(mtg_multiscale, mapping_multiscale, meteo_day)
Dict{String, Dict{Symbol, Vector}} with 2 entries:
  "Scene" => Dict(:TT_cu=>[Float64[], Float64[], Float64[], Float64[], Float64[…
  "Plant" => Dict(:TT_cu=>[[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0…

### Output comparison

computed_TT_cu_multiscale = collect(Base.Iterators.flatten(outputs_multiscale["Scene"][:TT_cu]))

is_approx_equal = length(unique(computed_TT_cu_multiscale .≈ outputs_singlescale.TT_cu)) == 1
true

Why was the comparison only approximate ? Why instead of ==?

Let's try it out. What if write instead:

computed_TT_cu_multiscale = collect(Base.Iterators.flatten(outputs_multiscale["Scene"][:TT_cu]))

is_perfectly_equal = length(unique(computed_TT_cu_multiscale .== outputs_singlescale.TT_cu)) == 1
false

Why is this false? Let's look at the data.

Looking more closely at the output, we can notice that values are identical up to timestep #105 :

(computed_TT_cu_multiscale .== outputs_singlescale.TT_cu)[104]
true
(computed_TT_cu_multiscale .== outputs_singlescale.TT_cu)[105]
false

We have the values 132.33333333333331 (multi-scale) and 132.33333333333334 (single-scale). The final output values are : 2193.8166666666643 (multi-scale) and 2193.816666666666 (single-scale).

The divergence isn't huge, but in other situations or over more timesteps it could start becoming a problem.

Floating-point summation

The reason values aren't identical, is due to the fact that many numbers do not have an exact floating point representation. A classical example is the fact that 0.1 + 0.2 != 0.3 :

println(0.1 + 0.2 - 0.3)
5.551115123125783e-17

When summing many numbers, depnding on the order in which they are summed, floating-point approximation errors may aggregate more or less quickly.

The default summation per-timestep in our example Toy_Tt_CuModel was a naive summation. The cumsum function used in the single-scale simulation to directly compute the TT_cu uses a pairwise summation method that provides approximation error on fewer digits compared to naive summation. Errors aggregate more slowly.

In our simple example, using Float64 values, the difference wasn't significant enough to matter, but if you are writing a simulation over many timesteps or aggregating a value over many nodes, you may need to alter models to avoid numerical errors blowing up due to floating-point accuracy.

Depending on what value is being computed and the mathematical operations used, changes may range from applying a simple scale to a range of values, to significant refactoring.

Note that many of the examples in these blogposts discuss Float32 accuracy. Float64 values have several extra precision bits to work.

A series of blog posts on floating-point accuracy: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/ Floating-Point Visually Explained : https://fabiensanglard.net/floatingpointvisually_explained/ Examples of floating point problems: https://jvns.ca/blog/2023/01/13/examples-of-floating-point-problems/

Relating specifically to floating-point sums:

Pairwise summation: https://en.wikipedia.org/wiki/Pairwise_summation Kahan summation: https://en.wikipedia.org/wiki/Kahansummationalgorithm Taming Floating-Point Sums: https://orlp.net/blog/taming-float-sums/