Zig Structs of Arrays (2024)

(andreashohmann.com)

125 points | by Tomte 5 days ago

9 comments

  • The GPU loves arrays of structures (AoS), since all vertex data fits in its triangle assembly cache. Once given to the GPU, the software side doesn't really care about all the vertex parameters, so this optimisation is pointless. It's only relevant when you have instance rendering (leaves, grass), but then you only need an array of vec3’s, not the other parameters, so it's back to normal arrays.

    Meanwhile, game engines need operator overloading for adding/multiplying vectors (spatial transforms, lighting, physics) and core zig design philosophy prevents operator overloading.

    Blind leading the blind. Disclaimer - I do professional rendering engines.

    • > Meanwhile, game engines need operator overloading for adding/multiplying vectors (spatial transforms, lighting, physics) and core zig design philosophy prevents operator overloading.

      This is a frustrating decision. My use cases for low level languages overlap closely with my use cases for vectors (etc) with operator overloading. It was one of the first things which put a bad taste in my mouth about Zig.

      • flohofwoe 10 hours ago

        Zig has a builtin vector type and will get a builtin matrix type. The only useful thing that's missing from shading language vector-math syntax is swizzling, but you don't get that via operator overloading either (and for dot- and cross-product you'll still need a function, but that's also in line with shading languages).

      • flohofwoe 10 hours ago

        Splitting fat vertex component data into multiple streams also often makes sense in rendering engines (e.g. not all vertex shaders might need all vertex components). Strict SoA or strict AoS hardly ever makes sense, but an 'inbetween' approach often does (maybe call it SoAoS) - and this should be possible just fine with Zig's comptime approach, e.g. only apply the SoA transform to the toplevel items of a struct.

        As for CPU-side vector math:

        Zig already has a @Vector type (which will probably be renamed to @Simd) and it will get a builtin matrix type. With those two things, the main reason for operator overloading in game/rendering engines is pretty much handled via builtin types.
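
        A minimal sketch of the in-between layout described above, written in Rust since the thread already touches on it. `Shading`, `VertexStreams`, and `max_z` are hypothetical names invented for illustration, not taken from any engine or from Zig's standard library. Positions get their own stream, while attributes that are usually read together stay interleaved:

```rust
/// Attributes that are usually read together, kept interleaved (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Shading {
    pub normal: [f32; 3],
    pub uv: [f32; 2],
}

/// Hybrid layout: SoA at the top level, small structs inside one stream.
#[derive(Default)]
pub struct VertexStreams {
    pub positions: Vec<[f32; 3]>,
    pub shading: Vec<Shading>,
}

impl VertexStreams {
    /// Appends one vertex, keeping both streams the same length.
    pub fn push(&mut self, position: [f32; 3], shading: Shading) {
        self.positions.push(position);
        self.shading.push(shading);
    }
}

/// A position-only pass reads the position stream and nothing else.
pub fn max_z(v: &VertexStreams) -> f32 {
    v.positions
        .iter()
        .map(|p| p[2])
        .fold(f32::NEG_INFINITY, f32::max)
}
```

        A position-only pass (culling, a depth pre-pass) then walks `positions` without pulling normals or UVs through the cache.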

        • geysersam 1 day ago

          Genuine question: why do you think game engines need operator overloading? I mean, what's wrong with generic functions like add, multiply, dot etc.

          • aaaashley 1 day ago

            Not GP, but I've written game engines and rendering engines. Vector operations are just common enough that having to write `.mul` every time is a huge pain, especially when you put many of them together for a large formula. Compare:

            (physics_data.velocity + omega * change) * frame_delta_time

            to

            physics_data.velocity.add(omega.mul(change)).mul(frame_delta_time)

            We learn to read and think about math a certain way, which is incompatible with Zig. Also, Zig's design philosophy of "reading code over writing code" is incompatible with the kind of small modification-test-cycles required when doing games, and creative programming in general. So Zig is sort of DOA anyway for that kind of thing.

            But I've been using Zig for non-game projects and it's been fantastic, so definitely not "Blind leading the blind" for the overall language design, imo.
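
            For comparison, a small Rust sketch of the two spellings above. `Vec3` and both function names are hypothetical, and the sketch assumes `omega` is a vector while `change` and the frame delta are scalars, which the original expression does not specify. Rust is used only because it allows `+` and `*` to be overloaded through `std::ops`:

```rust
use std::ops::{Add, Mul};

/// Hypothetical 3-component vector, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Vec3 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

impl Add for Vec3 {
    type Output = Vec3;
    /// Component-wise addition.
    fn add(self, rhs: Vec3) -> Vec3 {
        Vec3 { x: self.x + rhs.x, y: self.y + rhs.y, z: self.z + rhs.z }
    }
}

impl Mul<f32> for Vec3 {
    type Output = Vec3;
    /// Scales every component by `s`.
    fn mul(self, s: f32) -> Vec3 {
        Vec3 { x: self.x * s, y: self.y * s, z: self.z * s }
    }
}

/// Operator spelling.
pub fn step_with_operators(velocity: Vec3, omega: Vec3, change: f32, dt: f32) -> Vec3 {
    (velocity + omega * change) * dt
}

/// Method spelling: the same two trait functions, called by name.
pub fn step_with_methods(velocity: Vec3, omega: Vec3, change: f32, dt: f32) -> Vec3 {
    velocity.add(omega.mul(change)).mul(dt)
}
```

            Both functions call exactly the same code; the only difference is how the expression reads.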

            • smj-edison 1 day ago

              I've been thinking about a way around this, and I'd be interested to see if comptime with a DSL wouldn't be too unwieldy. Something like

                math("(v + Ω * c) * Δt", .{ .v = physics_data.velocity, .@"Ω" = omega, .c = change, .@"Δt" = frame_delta_time})
              
              I know this is already possible with comptime, though I haven't implemented it yet since I haven't needed vector math in what I'm working on currently. Can't decide whether using math names is better or worse than using the full variable names though.
              • dnautics 1 day ago

                I have a sibling comment -- having thought about this for a very very long time, zig should really implement binary pseudo-operator syntactic sugar. I don't think this violates zig's spirit of 'no hidden function calls' in that I don't think it takes much of a mental lift to "get" that (_ <+> _) means "heyo this is a function call, not a true operator".

                • smj-edison 1 day ago

                  At first I was going to say that I disagreed since you couldn't choose what implementation of addition you wanted, but now that I've read your comment where you import the type of addition used, it's growing on me. Would you have operator precedence, or would it be more like Smalltalk's binary operators?

                  • dnautics 23 hours ago

                    forced use of parens, or else syntax error.

                • stouset 1 day ago

                  All this just to prevent people from using + - * / and ^. Why?

                  • smj-edison 1 day ago

                    Andrew has talked about this: overloading introduces hidden control flow where you're expecting simple operators. In Zig, anything that deals with control flow is a keyword (including short-circuiting and, which is `and` instead of `&&`).

                    I'd argue though that the real disadvantage to having overloadable arithmetic is that you're limited to one implementation. This is actually my biggest beef with Rust, namely traits/type classes. It locks you into a single implementation when you may want to do something different based on the context. Zig pushes the dispatch decision to the callsite, not a trait subsystem (see how Zig implements hash maps for example). So I'd personally prefer to use a DSL, since it lets me specify what type of dispatch to use.

                    • kibwen 23 hours ago

                      Overloadable operators are not an instance of hidden control flow. Overloadable operators represent a user-defined function call, and thus can't influence control flow any more than a regular function. And if regular functions can't do anything weird to control flow (e.g. if your language already lacks exceptions (or even weirder things like Ruby-style procs)), then overloadable operators can't either.

                      > It locks you into a single implementation when you may want to do something different based on the context.

                      If you want differing behavior in a certain context, and if you don't want to use a different method to make the differing behavior explicit (e.g. the `wrapping_add` methods that Rust provides on numeric types), then you can use a different type for that context, e.g. the `std::num::Wrapping` type that Rust provides.
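
                      A short sketch of the two options named above, using `u8::wrapping_add` and `std::num::Wrapping` from the Rust standard library. The function names `bump_explicit` and `bump_wrapped` are made up for this example:

```rust
use std::num::Wrapping;

/// Explicit method: the wrapping behavior is named at the call site.
pub fn bump_explicit(x: u8, step: u8) -> u8 {
    x.wrapping_add(step)
}

/// Wrapper type: `+` on `Wrapping<u8>` wraps around on overflow.
pub fn bump_wrapped(x: u8, step: u8) -> u8 {
    (Wrapping(x) + Wrapping(step)).0
}
```

                      Both return 4 for `x = 250, step = 10`, whereas plain `+` on `u8` panics on that input when overflow checks are enabled (the default in debug builds).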

                      • smj-edison 22 hours ago

                        > Overloadable operators are not an instance of hidden control flow.

                        In general perhaps not, but in Zig they definitely are. Zig considers calling a function to change control flow, because it's no longer just an operator but something that can cause side effects, including mutating in place. Perhaps control flow isn't the right term, maybe non-trivial would be better?

                        With regard to wrappers, I personally find them ugly since 1. They bring in indirection, and I have a personal vendetta against unnecessary indirection, 2. Wrapping doesn't compose well and is a pain to shepherd between representations, 3. It's harder to make a function generic across different representations, and 4. Wrappers often don't re-export everything available to their underlying value.

                        • charlieflowers 3 hours ago

                          I’m not advocating this, but it is worth observing that it is yet another problem one could attempt to address with dependency injection, similar to io and allocators.

                    • AnduCrandu 20 hours ago

                      It's appealing to people who want to understand and control everything they're doing. When I'm using pandas or SQLAlchemy, I have no idea what the code is actually doing. Most people don't care about such implementation details, but some people do.

                    • aaaashley 1 day ago

                      yes! i had this exact idea. i also thought about integrating geometric/clifford algebra using zig's type system so that you could have one mathematical multivector object instead of complex / quaternion types, etc.

                      • smj-edison 1 day ago

                        That's the other great thing about using comptime, is you can specify which DSL you want to use for which scenario. You're not locked into one implementation.

                  • hmry 1 day ago

                    Why have operators at all?

                      x = x.add(step.mul(2)).mod(width)

                    Or in C

                      x = imod(iadd(x, imul(step, 2)), width)

                    vs

                      x = (x + 2*step) % width
                    
                    For me the answer is very simple: Operators make it easier to read the code which makes it easier to spot bugs. It also makes it easier to turn formulas from textbooks into code.

                    If 50% of the code you're working with is using vectors and matrices, not having operators for those parts is quite annoying.

                    Note that you can have vector operators without overloading, e.g. Odin has built in vector and matrix types.

                    But personally I think it's better to give the user more power instead of only letting the compiler author pick which types to allow operators on. Like how Java overloads + but only on the String class. Why do they get to do it, but not me?

                    • Woah there, "=" is an operator! I'm afraid you're going to have to go to jail for using an operator in a no-operator zone.

                      • dnautics 1 day ago

                        you actually don't want "operator overloading", you want syntactic sugar. I once proposed just a special operator syntax at the parser level, but it got rejected, but if you REALLY wanted it, you could probably do this in about 100-120 lines as a fork of the zig compiler, just hacking (a <_> b) as a special form to be transformed into @"<_>"(a, b). Requiring parentheses elides questions about operator precedence.

                            const @"<+>" = @import("operator_module").plus;
                        
                            ...
                        
                            const x = (a <+> b);
                        • fluffybucktsnek 20 hours ago

                          I think both operator overloading and most operators themselves are syntactic sugars. Operator overloading happens to point towards specific functions, whereas arithmetic integer operators point to compiler intrinsics.

                          • dnautics 19 hours ago

                            no, in general overloading is not syntactic sugar, it's a feature of the language (being able to (re-)define a function in place X and have it change the function in unrelated place Y).

                            • fluffybucktsnek 18 hours ago

                              I don't see how it is unrelated. If I have a custom type `A` with an overload on `+`, it will only affect places where I used custom type `A`. If there wasn't operator overloading, I would just have to use a different notation to call the same function, but with possibly worse ergonomics (which is also why I think your solution doesn't really satisfy that, it doesn't read like algebra which is kind of the point). Given that type `A` is presumed to be custom, I don't see how place Y would be unrelated since it deliberately uses type `A`.

                              If we include operator overloading for any types, then sure. i32 + i32 might suddenly start meaning something else. But I think that's beyond the scope of what is normally asked by operator overloading.

                              • dnautics 16 hours ago

                                one is implementable entirely in the parser. overloading (operator or otherwise) in general is a deeper compiler feature

                        • benj111 7 hours ago

                          > x = (x + 2*step) % width

                          Hmm. now. Is operator precedence not an instance of hidden flow control?

                          You need to know that 2*step is done before adding x.

                          x = (x + (2*step)) % width Or x = (2*step + x) % width

                          Should be preferred?

                          Personally I try to bracket all things like this, so that it isn't hidden.

                          • Decabytes 1 day ago

                            > Why have operators at all?

                            I mean as an avid Lisp fan, I feel like Lisp basically answers the question of how much syntax you need in a language. I must admit though, not having to deal with operator precedence is really nice

                              (mod (+ x (* 2 step)) width)
                            • adrian_b 22 hours ago

                              Regarding operators, there are 3 distinct problems.

                              One is to allow the use of simple mathematical symbols as names for functions, instead of allowing only alphanumeric identifiers.

                              Most programming languages allow only a small fixed set of symbols to be used as "operators", i.e. as function names.

                              The better solution is to allow any Unicode character from certain categories, e.g. "Sm" and "Po" ("Symbol, math" and "Punctuation, other"), which does not have an already assigned role in the language syntax, to be used as a function name.

                              Most LISP variants allow the use of various kinds of character symbols as function names.

                              The second problem is overloading. Overloading must be treated uniformly for any kind of functions, regardless if their names are identifiers or operator symbols, i.e. not like in Java, where forbidding operator overloading was a mistake (that was an overreaction to C++, which allows the overloading of a few "operators" that are not normal functions and whose overloading should not have been allowed, e.g. the comma operator).

                              The overloading of operators, especially for user-defined data types is something absolutely essential for scientific and technical computing.

                              The majority of programmers have not been exposed to programs that contain a great amount of computation, so they are accustomed only to simple expressions that contain a few variables.

                              In scientific and technical computing it is very frequent to have very big expressions, which may contain a large number of operations and variables, where the variables may have various types, like complex numbers, vectors, matrices, complex vectors, complex matrices, or there may be a type system with distinct types for various physical quantities, like voltages, electric currents, capacitances and so on.

                              Anyone who has frequently had to write such big expressions will definitely prefer, both for writing and for reading, overloaded operator symbols to long function names, which would fill most of the visual space with superfluous characters, obscuring the structure of the big expression.

                              The third problem is the syntax of function invocation. Most programming languages allow functions whose names are identifiers to use only prefix invocation but for some symbolic operators they allow infix invocation.

                              Here I also prefer the languages that do not differentiate between functions with alphanumeric names and functions with symbolic names (i.e. operators). There are languages where for any function it may be specified that it must be invoked as an infix operator, if this is desired.

                              Which of the 3 classic solutions for expression syntax is best is debatable: traditional expressions with infix operators and multi-level precedence rules (like in FORTRAN and ALGOL), expressions with infix operators and a unique precedence rule for all operators (like in APL), or expressions without infix operators (like in LISP).

                              Each of the 3 solutions has advantages and disadvantages, so the choice between them is a matter of personal preferences.

                        • fasterik 23 hours ago

                          On the other hand, SIMD loves SoA, and so does the CPU cache. It all depends on what you're doing with your data.

                          Zig professes to be a C replacement, not a C++ replacement, so leaving out operator overloading is consistent with that design goal. But I agree, I would prefer to program in a language that expresses mathematical relationships more naturally.
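
                          To make the layout difference concrete, a hand-written Rust sketch of the two layouts. `Particle`, `Particles`, and the two summing functions are hypothetical names for illustration. Zig's `std.MultiArrayList` builds the per-field arrays from the struct type at comptime; here they are written out by hand:

```rust
/// Array-of-structs element (hypothetical particle).
#[derive(Clone, Copy)]
pub struct Particle {
    pub x: f32,
    pub y: f32,
    pub mass: f32,
    pub id: u32,
}

/// Struct-of-arrays counterpart: one contiguous Vec per field.
#[derive(Default)]
pub struct Particles {
    pub x: Vec<f32>,
    pub y: Vec<f32>,
    pub mass: Vec<f32>,
    pub id: Vec<u32>,
}

impl Particles {
    /// Splits one particle across the per-field arrays.
    pub fn push(&mut self, p: Particle) {
        self.x.push(p.x);
        self.y.push(p.y);
        self.mass.push(p.mass);
        self.id.push(p.id);
    }

    /// Number of particles stored.
    pub fn len(&self) -> usize {
        self.mass.len()
    }
}

/// AoS: the loop strides over whole `Particle` values to read one field.
pub fn total_mass_aos(ps: &[Particle]) -> f32 {
    ps.iter().map(|p| p.mass).sum()
}

/// SoA: the loop reads only the tightly packed `mass` array.
pub fn total_mass_soa(ps: &Particles) -> f32 {
    ps.mass.iter().sum()
}
```

                          Whether the SoA loop is actually faster depends on the access pattern: it helps when a pass touches one or two fields of many elements, and can hurt when every field of a single element is needed at once.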

                          • awesan 1 day ago

                            Zig is adding native vectors including operator support, there are some recent issues/prs about this topic.

                            The general technique of SoA is pretty useful both in games and other applications, but of course I cannot speak to the specific use-case you are describing.

                            • nvme0n1p1 1 day ago

                              Zig vectors force data into SIMD registers even if that would make the code slower. They're a specialty type. You should only reach for vectors if you would have used SIMD intrinsics in C for example.

                              • e4m2 1 day ago

                                Zig vectors do not necessarily force data into SIMD registers; a scalar implementation would work equally well. This is not just a theoretical argument, because Zig code that uses `@Vector` also has to compile for architectures that do not have SIMD instructions.

                                That being said, the parent commenter is actually referring to other recent proposals as opposed to existing `@Vector` functionality:

                                https://codeberg.org/ziglang/zig/issues/32032

                                https://codeberg.org/ziglang/zig/issues/35376

                                • nvme0n1p1 1 day ago

                                  Interesting, so zig might have both "vectors" and "vecs"? I guess naming is another thing to fix before 1.0 <g>

                            • Ciantic 1 day ago

                              Rust should (eventually) support structures of arrays via compile-time reflection: https://fnordig.de/2026/03/25/rust-reflection-and-a-multi-ar...

                              • smj-edison 1 day ago

                                I didn't realize compile time reflection was back on track, that's really exciting!

                              • So is the argument that any SoA is pointless? Or just for GPU stuff? Because this isn't really talking about all that one way or another.

                                Also, does one really need operator overloading? That feels a little strong. I've gotten by with functions just fine. Does that make the GPU not like me, Mr. wise engineer?

                              • Sweepi 1 day ago

                                OT: I just spent a few minutes searching for the source of the "Not all CPU operations are created equal" slide of the linked presentation (Andrew Kelley - Practical DOD), it's here:

                                https://6it.dev/blog/infographics-operation-costs-in-cpu-clo...

                              • hiccuphippo 1 day ago
                                • trymas 1 day ago

                                  Slightly related recent HN post: https://news.ycombinator.com/item?id=48382382

                                  • blt 22 hours ago

                                    After that build-up, I was hoping to see a toy implementation of a method or two for `MultiArrayList`.

                                    • binaryturtle 1 day ago

                                      I'm just seeing a "410 Gone" error on the linked site (same happens to the parent URL too).

                                      • ArneCode 1 day ago

                                        Works for me

                                        • binaryturtle 1 day ago

                                          Still the same. I guess it's some sort of wild anti-bot stuff based on the user agent?

                                          /edit

                                          Yes, as confirmed with cURL, using my browser's "User Agent": 410 blocked. Using some other "User Agent" and it passes along the data. Pretty silly, IMHO.

                                      • Thaxll 1 day ago

                                        This is what games do with ECS.

                                        • nejam 1 day ago

                                          Jfpe?

                                          • nejam 1 day ago

                                            Jdoemhoe