Comments for Hemisphere Games: Games for both sides of your brain. Fri, 26 May 2017 19:06:17 +0000 Comment on Osmos, Updates, and Floating-Point Determinism by eddybox Fri, 26 May 2017 19:06:17 +0000 Very interesting suggestion, David, thanks! In practice that’d be hard to implement, since by the time a device got the 7k dump from the remote device, the simulation would have moved on (between 3 and 15 time-steps, depending on the current ping). So we’d have to compare the dump to a historical snapshot, and if they differed, we’d have to rewind and re-simulate forward from that point to the present. If all went well / if the differences were very minor, the player wouldn’t notice a thing, which would be great. But the physics simulation in Osmos is expensive (a lot of moving bodies, forces, and collisions per frame), so the catch-up calculations might take a while; if there were frequent micro-divergences, we’d have to do that often, which might seriously hurt framerate. We could try to amortize all those calculations across frames, but yeah… all that to say, it’d be far from trivial to implement a solution like that for Osmos, and could lead down its own rabbit-hole. A cool suggestion though – and if it weren’t for the lag between the dump and the present (or if rewinding weren’t so costly) it’d be a great and relatively easy way to recover!
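For readers following along, the rewind-and-resimulate idea being discussed could be sketched roughly like this. Everything here is an invented stand-in (a two-field `State`, a trivial `step`, a snapshot ring buffer), not Osmos code; it just shows the shape of the reconciliation loop and why frequent divergences would be costly:

```c
#include <string.h>

#define HISTORY 64                       /* snapshots kept for rewinding */

typedef struct { float x, vx; } State;   /* toy stand-in for the real sim state */

static State history[HISTORY];           /* history[t % HISTORY] = state at tick t */
static State current;
static int   tick;                       /* the present tick */

static void step(State *s, float dt) {   /* one fixed-dt physics step (toy) */
    s->x += s->vx * dt;
}

/* On receiving a peer's dump for an older tick: compare it to our snapshot
   for that tick; on divergence, adopt the dump and fast-forward to now.
   Each catch-up step re-runs the (expensive) simulation. */
static int reconcile(const State *remote, int remote_tick, float dt) {
    if (memcmp(&history[remote_tick % HISTORY], remote, sizeof(State)) == 0)
        return 0;                        /* in sync, nothing to do */
    current = *remote;                   /* rewind to the authoritative state */
    history[remote_tick % HISTORY] = current;
    for (int t = remote_tick; t < tick; ++t) {
        step(&current, dt);              /* catch-up resimulation */
        history[(t + 1) % HISTORY] = current;
    }
    return 1;                            /* we had to resimulate */
}
```

With a high ping (many ticks between `remote_tick` and `tick`) or frequent micro-divergences, that catch-up loop runs often, which is exactly the framerate concern raised above.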

Comment on Osmos, Updates, and Floating-Point Determinism by David Cantrell Wed, 24 May 2017 23:36:58 +0000 As an alternative to requiring simulation to work perfectly for ever, did you consider a scheme where you’d send player input across the network most of the time, with a full 7K dump of the universe every so often? A bit like how MPEG video has occasional “key frames” where the entire image gets transmitted with just inter-frame diffs the rest of the time. That way you would stand a good chance of recovering from simulation errors without anyone noticing.

Comment on Osmos, Updates, and Floating-Point Determinism by Daniel Weiner Fri, 19 May 2017 02:43:03 +0000 This was really interesting! Please do more posts like this.

Comment on Osmos, Updates, and Floating-Point Determinism by Dave Knott Thu, 11 May 2017 22:07:53 +0000 I think that NEON and SSE have gotten a lot better about respecting the IEEE standard in recent years, so maybe it is less of an issue these days?
With regard to transcendental SIMD functions like vrsqrts_f32: they are typically not very accurate on their own. The usual recommendation is to start with that estimate, then refine it with one or two iterations of Newton-Raphson.
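A scalar sketch of that estimate-then-refine pattern (on NEON the coarse estimate would come from vrsqrte_f32 and each refinement from a vrsqrts_f32 step; here the caller supplies a rough estimate so the sketch stays self-contained):

```c
/* Refine a coarse reciprocal-square-root estimate with Newton-Raphson.
   For f(y) = 1/y^2 - x, the update is y' = y * (3 - x*y*y) / 2.
   Convergence is quadratic, so two iterations from a ~5%-accurate
   estimate get close to full single precision. */
static float rsqrt_refine(float x, float estimate) {
    float y = estimate;
    y = y * (3.0f - x * y * y) * 0.5f;   /* first Newton-Raphson step */
    y = y * (3.0f - x * y * y) * 0.5f;   /* second step */
    return y;
}
```

Note that even the refined result can still differ across CPUs, since the initial estimate instruction is itself implementation-defined — which matters if determinism, not just accuracy, is the goal.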

You will usually see a performance benefit by using SIMD, even for 2D simulations.
The main drawback is that your data will need to be restructured to take advantage of it, and then you need to refactor your simulation math to use the new data structures. When moving to SIMD parallelization, the big up-front decision is whether to arrange your data as structure-of-arrays (SOA) or array-of-structures (AOS).
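The AOS-vs-SOA choice, sketched with a hypothetical mote type (invented names, not Osmos code):

```c
/* Array-of-structures: natural to write, but x/y/vx/vy interleave in
   memory, so loading four consecutive x values into one SIMD register
   requires a gather across structs. */
typedef struct { float x, y, vx, vy; } MoteAOS;

/* Structure-of-arrays: each field is contiguous, so four consecutive
   x (and vx) values map directly onto one 128-bit SIMD load. */
typedef struct {
    float x[1024], y[1024], vx[1024], vy[1024];
    int   count;
} MotesSOA;

static void integrate_soa(MotesSOA *m, float dt) {
    /* A plain loop over contiguous arrays like this is also the form
       compilers auto-vectorize most readily. */
    for (int i = 0; i < m->count; ++i) {
        m->x[i] += m->vx[i] * dt;
        m->y[i] += m->vy[i] * dt;
    }
}
```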
Think about it for Osmos 2 ;-)

By the way, you are right about floating point rounding being a pain in the ass. I spent a *lot* of time debugging problems in our math library unit tests that could ultimately be traced back to subtle issues related to rounding.

Comment on Osmos, Updates, and Floating-Point Determinism by eddybox Thu, 11 May 2017 04:15:18 +0000 Thanks for the thoughts and feedback, guys!

@Ed: We thought about it, but
1. We were adding to an existing codebase where everything was already floating point. It would have required a lot of changes back in the day to make the switch, and these floating-point issues (different results on different devices) didn’t occur back then once we’d set things up correctly.
2. Osmos has an extreme range of scales; some things can be teeny-tiny, and some can be huge. That includes very slow, drifty velocities, tiny & huge masses, and even time increments due to the ability to warp its flow smoothly. (Though we do disallow time-warping in multiplayer.) That’s a big range to cover with fixed point! Hard to say for sure without trying it of course, but I suspected it wouldn’t feel very good.

@Dave: Good point about the multiply-and-add. I had heard such things were possible (I vaguely & inaccurately alluded to that in my “higher-level registers” mention), though I don’t actually know if that’s what was happening here. As for fastmath, we had optimizations set to -Os as opposed to -Ofast (which enables fast math under the hood) and we had “Relax IEEE compliance” set to No. I’m not sure what else Xcode/clang might do under the hood though.
Memory Barrier: I had never heard of that before, thanks! I just read a bit about them, and yeah, I think that would have worked as well.
And yes, I avoided SSE/Neon. Osmos physics is 2D so it didn’t seem worth it. Good to know about it being fast and loose with IEEE, thanks. That said, as I was writing this I came across the RSQRTSS / vrsqrts_f32 function, which may be a workaround for sqrtf() differences…? Not sure.

@Phil: Haha, you’d have to ask Jon to be sure, as I don’t want to put words in his mouth. That said, he didn’t actually suggest doing it differently. Maybe internally he agreed this was the way to go. (And yeah, what else could we do given my fixed-point thoughts earlier in this comment?) I imagine he just understood how delicate / precarious a solution like this could be, and didn’t envy the task. ;-)

Comment on Osmos, Updates, and Floating-Point Determinism by Phil Wed, 10 May 2017 21:54:52 +0000 Insightful writeup! Thanks!
What’s the “ugh” justification for using lock-step? Is there some alternative (short of remote sim with regular full-state updates) that would have avoided this (or other) problem(s) had you not been using lock-step?

Comment on Osmos, Updates, and Floating-Point Determinism by Dave Knott Wed, 10 May 2017 16:42:24 +0000 Interesting post, Eddy!

A few comments…

The “daisy-chaining” problems that you describe may not be solely due to the compiler’s choice of registers. When the compiler sees code like this: “mote.x += mote.vx * dt;”, it may decide to fuse the multiply and add into a single instruction (assuming that your CPU architecture has such an instruction). The nasty thing about fused multiply-and-add instructions is that, because they are not part of the IEEE floating-point standard, they are not required to produce identical results across different CPUs. You can usually tell the compiler to avoid using fused multiply-and-add by turning off “fastmath” (which I assume you did ;-)
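To make the hazard concrete, here is a hedged scalar sketch with contrived values (nothing from Osmos). The separate version rounds the product to float before adding; the fused version models the single rounding an FMA instruction performs (emulated here via double so the sketch runs anywhere, since for these operands double arithmetic is exact):

```c
/* Separate multiply then add: the product v*dt is rounded to float
   before the addition, as IEEE single-precision ops require. */
static float step_separate(float x, float v, float dt) {
    float p = v * dt;                /* product rounded to float here */
    return x + p;
}

/* Fused multiply-add: one rounding at the end, as an FMA instruction
   (or fmaf) would do — modeled via exact double arithmetic. */
static float step_fused(float x, float v, float dt) {
    return (float)((double)v * (double)dt + (double)x);
}
```

With v = dt = 4097 and x = -16785408, the exact product 4097*4097 = 16785409 is not representable in float: the separate version rounds it to 16785408 and returns 0, while the fused version keeps the full product and returns 1 — the same computation, two different answers.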

Also, I wonder if instead of using “volatile”, you could insert a memory barrier between the multiply and add in the expanded code, thus forcing the compiler to use specific registers in order to enforce synchronization. Just a thought…
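One way that suggestion could look in GCC/Clang extended-asm syntax — a compiler-level barrier rather than a hardware fence, and purely a sketch of the idea, not what Osmos shipped:

```c
/* Pin the intermediate product to a 32-bit memory slot: the "+m"
   constraint tells the compiler that p lives in memory and is
   read/written by the (empty) asm, forcing a spill and reload instead
   of letting the value stay in a wider register. */
static float integrate(float x, float vx, float dt) {
    float p = vx * dt;
    __asm__ __volatile__("" : "+m"(p));
    return x + p;
}
```

A bare clobber like `__asm__ __volatile__("" ::: "memory")` also stops reordering across the barrier, but it does not by itself force a register-held local into memory, which is why the sketch uses the "+m" constraint on the specific value.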

Finally, I assume that floating point determinism is also the reason why you’re not using SSE or Neon?
We’ve found that SIMD instructions can sometimes play fast-and-loose with the IEEE floating point standard, which causes a lot of the same kind of problems that you describe. That’s why we maintain two versions of our linear algebra libraries. One that is SSE optimized, and the other not. Multiplayer code that requires deterministic results (e.g. NPC pathfinding) is always written with the non-SSE-optimized version.

Comment on Osmos, Updates, and Floating-Point Determinism by Ed Powley Wed, 10 May 2017 08:39:33 +0000 Interesting article! Just wondering if you ever considered switching to fixed-point calculations in the simulation? Integer math is deterministic so it would sidestep the issue.
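For the curious, a minimal 16.16 fixed-point sketch of that suggestion (hypothetical helpers, not from Osmos). Because everything is integer arithmetic, results are bit-identical on every CPU; the trade-off is a fixed range and precision, which is the concern eddybox raises in his reply about Osmos’s extreme range of scales:

```c
#include <stdint.h>

typedef int32_t fix16;                 /* 16 integer bits, 16 fractional bits */

#define FIX_ONE (1 << 16)              /* 1.0 in 16.16 */

static fix16 fix_from_int(int v) {
    return (fix16)(v << 16);
}

/* Multiply in 64-bit to keep the full product, then shift back down. */
static fix16 fix_mul(fix16 a, fix16 b) {
    return (fix16)(((int64_t)a * (int64_t)b) >> 16);
}

/* x += vx * dt, deterministically across all hardware */
static fix16 integrate_fixed(fix16 x, fix16 vx, fix16 dt) {
    return x + fix_mul(vx, dt);
}
```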

Comment on Osmos, Updates, and Floating-Point Determinism by Tikitu Tue, 09 May 2017 13:50:05 +0000 Hey, I’d love to hear more about the development process! I can’t honestly say it will be useful (I’m an iOS dev but not in games and not likely to do anything real-time-ish or graphics-heavy in the near future) but it’s definitely interesting!

Comment on Karmaka’s Art Lived Multiple Lives: Part 2, The Box by Meal Worms Fri, 26 Feb 2016 01:18:30 +0000 Thanks very much Ben – music to our ears! Glad to know our efforts are appreciated.

While you’re waiting for ship, Ben, feel free to check out the Print-and-Play (available for download at