Slow performance of auto dimensions #1785

Open
opened 2025-02-06 17:28:43 -08:00 by Hexlord · 3 comments
Hexlord commented 2025-02-06 17:28:43 -08:00 (Migrated from github.com)

We found out that defining at least one of the dimensions to be 100%, instead of keeping them both auto, improves performance a lot. We were able to achieve 3x frame time improvement (12ms -> 4ms) by going through every style we have and setting width to 100% when it does not break the UI (it mostly never does for auto-auto cases).

It would be nice, though, if the algorithm could detect these scenarios (where 100% width gives the same result as auto) and avoid the expensive auto dimension resolution, whether that is recursion or whatever else in the process costs so much.

For now we changed the default values for a style's dimensions to 100% width and 100% height, so the user has to set auto explicitly. The goal is to preserve as much performance as possible.

Speaking without knowledge, perhaps the algorithm could try to select a dimension according to current flex direction, usually width, and see that if there is only one relative child with flexGrow > 0, it could treat its {auto, auto} dimensions as {100%, auto} instead, for less algorithmic complexity but hopefully guaranteed same result?
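To make the idea concrete, the check could look roughly like the sketch below. Everything in it is hypothetical: `Child`, `main_axis_dimension`, and `can_assume_full_main_size` are illustrative names, not Yoga APIs. It also assumes that "only one relative child" means exactly one relatively positioned child, and that child has flexGrow > 0. It only expresses the condition; it does not show that the substitution always yields the same layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Child:
    # Hypothetical, simplified style record; not a Yoga type.
    position: str = "relative"  # "relative" or "absolute"
    flex_grow: float = 0.0
    width: str = "auto"         # "auto" or a percentage such as "100%"
    height: str = "auto"


def main_axis_dimension(flex_direction: str) -> str:
    """Dimension that runs along the main axis for a flex direction."""
    return "width" if flex_direction.startswith("row") else "height"


def can_assume_full_main_size(
    flex_direction: str, children: List[Child]
) -> Optional[str]:
    """Return the dimension that might be treated as 100%, or None.

    Assumed condition: exactly one relatively positioned child, it has
    flex_grow > 0, and both of its dimensions are auto.
    """
    relative = [c for c in children if c.position == "relative"]
    if len(relative) != 1:
        return None
    only = relative[0]
    if only.flex_grow > 0 and only.width == "auto" and only.height == "auto":
        return main_axis_dimension(flex_direction)
    return None
```

For example, `can_assume_full_main_size("row", [Child(flex_grow=1)])` returns `"width"`, while adding a second relative child makes it return `None`.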

NickGerleman commented 2025-02-07 11:32:38 -08:00 (Migrated from github.com)

My guess is that this is because an explicit percentage counts as a definite dimension, which means we can potentially avoid measuring children when measuring the size of the node (we would still need to traverse everything during the layout phase, though).
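A toy model of that guess, purely for illustration: the sketch below is not Yoga's algorithm and has no caching. It assumes a layout pass that measures each child before laying it out, and a measure step that can return immediately for a definite node but has to visit the children of an auto node. `Node`, `measure`, `layout`, and `full_tree` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    # True models a definite size (e.g. 100% of a known parent size),
    # False models auto. This is a simplification, not a Yoga type.
    definite: bool
    children: List["Node"] = field(default_factory=list)


def measure(node: Node, stats: Dict[str, int]) -> None:
    stats["measure"] += 1
    if node.definite:
        return  # size is known without looking at the children
    for child in node.children:
        measure(child, stats)  # auto size depends on the content


def layout(node: Node, stats: Dict[str, int]) -> None:
    stats["layout"] += 1
    for child in node.children:
        measure(child, stats)  # size the child first
        layout(child, stats)   # then lay out its subtree


def full_tree(depth: int, definite: bool) -> Node:
    """Full binary tree with `depth` levels, all nodes sized the same way."""
    node = Node(definite)
    if depth > 1:
        node.children = [full_tree(depth - 1, definite) for _ in range(2)]
    return node


def count_calls(depth: int, definite: bool) -> Dict[str, int]:
    stats = {"measure": 0, "layout": 0}
    layout(full_tree(depth, definite), stats)
    return stats
```

In this model the layout count is the same for both variants (every node is visited once), but the measure count grows with subtree size for auto nodes and stays at one call per non-root node for definite ones.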

> Speaking without knowledge, perhaps the algorithm could try to select a dimension according to current flex direction, usually width, and see that if there is only one relative child with flexGrow > 0, it could treat its {auto, auto} dimensions as {100%, auto} instead, for less algorithmic complexity but hopefully guaranteed same result?

I could think of a couple edge cases here off the top of my head. E.g. if this child has a max dimension constraint, or any time the parent is being measured under a constraint like max-content instead of fit-content (in stretch-fit, we already know the size).

nicoburns commented 2025-02-23 14:29:05 -08:00 (Migrated from github.com)

Percentage sizes are always going to be faster than auto.

Having said that, we do have one Taffy benchmark that involves "auto" sizes where Yoga is much (10x) slower than Taffy. The original bug report, which led us to fix a similar issue in Taffy, was https://github.com/DioxusLabs/taffy/issues/502.

Our current benchmark for this is:

  • A tree with 2 children per node
  • The following styles for each node: flex-grow: 1; margin: 10px
  • We run this at various sizes (4k, 10k, 100k nodes) to show algorithmic time complexity
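The setup in the list above can be approximated with a small generator like the one below. This is a sketch, not Taffy's actual benchmark code: it assumes the tree is filled breadth-first up to a target node count, and `Node`, `build_tree`, and `tree_depth` are hypothetical names. Styles are kept as plain dictionaries and no layout is run.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List

# Style applied to every node, as described in the list above.
NODE_STYLE = {"flex-grow": "1", "margin": "10px"}


@dataclass
class Node:
    style: Dict[str, str]
    children: List["Node"] = field(default_factory=list)


def build_tree(node_count: int, fanout: int = 2) -> Node:
    """Build a tree with `fanout` children per node, filled level by level.

    Assumption: breadth-first filling until `node_count` nodes exist.
    """
    root = Node(dict(NODE_STYLE))
    queue = deque([root])
    created = 1
    while created < node_count:
        parent = queue.popleft()
        for _ in range(fanout):
            if created == node_count:
                break
            child = Node(dict(NODE_STYLE))
            parent.children.append(child)
            queue.append(child)
            created += 1
    return root


def tree_depth(node: Node) -> int:
    """Number of levels in the tree, counting the root as level 1."""
    return 1 + max((tree_depth(c) for c in node.children), default=0)
```

Under that filling assumption, trees of 4000, 10000, and 100000 nodes come out 12, 14, and 17 levels deep.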

The results I'm getting which are potentially relevant to this issue are:

| Benchmark | Tree Depth | Node Count | Taffy | Yoga |
| --- | --- | --- | --- | --- |
| Deep tree (auto size) | 12 | 4000 | 2.77 ms | 33.62 ms |
| Deep tree (auto size) | 14 | 10000 | 6.61 ms | 104.76 ms |
| Deep tree (auto size) | 17 | 100000 | 110.84 ms | 2424.00 ms |

Yoga and Taffy are yielding numbers of the same order of magnitude (and usually within about 10% of each other) in all our other benchmarks, so this seems to be something specific to this "auto size" setup.
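For a quick read of the "Deep tree (auto size)" numbers, the Yoga/Taffy ratio per row can be computed as below. The timings are copied from the results in this comment; the helper name `slowdown_ratios` is just for illustration, and three data points are not enough to fit a complexity class.

```python
from typing import List, Tuple

# (node count, Taffy ms, Yoga ms) for the "Deep tree (auto size)" rows.
RESULTS: List[Tuple[int, float, float]] = [
    (4_000, 2.77, 33.62),
    (10_000, 6.61, 104.76),
    (100_000, 110.84, 2424.00),
]


def slowdown_ratios(rows: List[Tuple[int, float, float]]) -> List[float]:
    """Yoga time divided by Taffy time for each row."""
    return [yoga_ms / taffy_ms for _, taffy_ms, yoga_ms in rows]


if __name__ == "__main__":
    for (nodes, _, _), ratio in zip(RESULTS, slowdown_ratios(RESULTS)):
        print(f"{nodes:>7} nodes: Yoga is {ratio:.1f}x slower")
```

The ratio comes out at roughly 12x, 16x, and 22x, so the gap widens as the tree grows rather than staying constant.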

You can run Taffy's benchmarks with Yoga comparison enabled by cloning the Taffy repo and running `cargo bench --package taffy_benchmarks --features yoga` (or `cargo bench --package taffy_benchmarks --features yoga,large` if you want to run the benchmark variants with 100k nodes).

Hexlord commented 2025-02-28 12:43:55 -08:00 (Migrated from github.com)

I looked into the "Fix caching for flexbox columns" commit that fixed the perf in Taffy, but Yoga does not have the concept of cache slots. I tried increasing Yoga's "MaxCachedMeasurements", but the results were mixed, so fixing auto node performance in Yoga is probably not trivial and will require someone from the dev team to take a look.

Reference: DaddyFrosty/yoga#1785