
Re: Freezing in Ubuntu Lucid

Posted: Sun Jun 27, 2010 2:30 pm
by vcato
I'm having this problem too on Fedora 12. Is there any progress on this? I've tried turning vsync on and using different sound systems, but the freezes keep happening at about the same rate. strace shows gettimeofday being called repeatedly, so it isn't hung inside gettimeofday.

Re: Freezing in Ubuntu Lucid

Posted: Sat Jul 17, 2010 8:23 pm
by Meal Worms
Hi vcato,

I had a couple of potential workarounds for this issue, but nothing panned out, as the underlying kernel issue is the same regardless of the codepath I choose.

Dave

Re: Freezing in Ubuntu Lucid

Posted: Wed Oct 20, 2010 7:40 pm
by Dijit
Hi, I've seen a "hang" a few times on my laptop (Dell XPS 1530) running Ubuntu 10.04 with Osmos 1.6.0 1314.

I broke in with gdb to see where it was stuck and here's the backtrace (not very intelligible without symbols, obviously):

(gdb) where
#0 0xb77bd422 in __kernel_vsyscall ()
#1 0xb727b6d6 in gettimeofday () from /lib/tls/i686/cmov/libc.so.6
#2 0x0809a2d1 in ?? ()
#3 0x0809a307 in ?? ()
#4 0x0808196f in ?? ()
#5 0xb7208bd6 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#6 0x0804c211 in ?? ()
(gdb) ^CQuit

I was also able to break in with strace and confirm that gettimeofday is not hanging, just getting called a lot and the results seem reasonable (monotonically increasing at the rate I'd expect). It seems like a plain ol' infinite loop in the application side code, but I'm not sure if this is the same hang referenced in the Osmos Linux FAQ.

gettimeofday({1287629328, 560331}, NULL) = 0
gettimeofday({1287629328, 560355}, NULL) = 0
gettimeofday({1287629328, 560379}, NULL) = 0
gettimeofday({1287629328, 560402}, NULL) = 0
...
gettimeofday({1287629328, 560641}, NULL) = 0
gettimeofday({1287629328, 560664}, NULL) = 0
gettimeofday(^C{1287629328, 560688}, NULL) = 0
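
For anyone who wants to run the same sanity check on their own trace, here is a minimal sketch in Python that pulls the timestamps out of strace lines like the ones above and reports the gap between consecutive calls. The function names and the regular expression are made up for this example, and the sample is just the first four lines of the trace above.

```python
import re

# Matches the {seconds, microseconds} pair in an strace gettimeofday line.
# The lazy ".*?" tolerates stray characters such as "^C" before the brace.
TIMEVAL = re.compile(r"gettimeofday\(.*?\{(\d+), (\d+)\}")

def parse_usec(lines):
    """Return each timestamp as a single integer count of microseconds."""
    stamps = []
    for line in lines:
        match = TIMEVAL.search(line)
        if match:
            seconds, micros = int(match.group(1)), int(match.group(2))
            stamps.append(seconds * 1_000_000 + micros)
    return stamps

def deltas(stamps):
    """Return the difference between each pair of consecutive timestamps."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]

sample = [
    "gettimeofday({1287629328, 560331}, NULL) = 0",
    "gettimeofday({1287629328, 560355}, NULL) = 0",
    "gettimeofday({1287629328, 560379}, NULL) = 0",
    "gettimeofday({1287629328, 560402}, NULL) = 0",
]

gaps = deltas(parse_usec(sample))
print(gaps)                       # [24, 24, 23]
print(all(g > 0 for g in gaps))   # True: strictly increasing in this sample
```

A zero or negative gap in a longer trace would be the thing to look for; the sample here shows none.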

I have some ideas to play with regarding reproducing the hang, based on some of what I've found while Googling around for gettimeofday TSC issues. A few places indicated that strange behavior can result from gettimeofday calls that occur precisely when processor frequencies are changing and/or throttled differently (i.e., maybe programmatically throttling them more often might elicit the bug more quickly). That may indicate that a power-conserving device like a laptop is more likely to see the bug than a desktop machine. It may also mean that a debug build's timing differences might mask it entirely.

Anyway, Osmos is still a great game and I'm still happy that there's a Linux version!

Cheers,
Dan Tull

Re: Freezing in Ubuntu Lucid

Posted: Thu Oct 21, 2010 6:58 am
by Meal Worms
Thanks for your message Dan, interesting stuff. Yeah, my investigation had resulted in largely the same thing, that there's some odd kernel-level mojo regarding frequent TSC calls that's tough for a realtime application to work around reliably (we need to check the TSC frequently to keep things perceptually nice'n'smooth, and alternative API-calls/ways of doing timing invariably call down to TSC anyway) so I'm at a bit of a loss as to a line of attack that can decisively solve the issue.

Dave

Re: Freezing in Ubuntu Lucid

Posted: Thu Oct 21, 2010 10:07 am
by rawler
How do others do it? Must be a problem for virtually every game under Linux in that case?

Just a common silly thing I myself fail to consider all the time: it's not something simple like the time delta occasionally flipping to the other side of 0, and being stored in a variable that can't handle it, turning into a very large number?
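
To make that concrete, here is a minimal sketch in Python that imitates a 32-bit unsigned microsecond delta. The function names are hypothetical and nothing here is taken from the game's code; the mask only simulates what an unsigned 32-bit variable would do in C or C++.

```python
UINT32_MASK = 0xFFFFFFFF  # simulates storing the result in a 32-bit unsigned int

def delta_unsigned32(prev_us, now_us):
    """Delta as it would look after being stored in an unsigned 32-bit variable."""
    return (now_us - prev_us) & UINT32_MASK

def delta_signed(prev_us, now_us):
    """Delta kept in a signed type, so a backward step shows up as negative."""
    return now_us - prev_us

# Normal case: time moved forward by 24 microseconds.
print(delta_unsigned32(560331, 560355))  # 24

# Time appears to step back by 24 microseconds.
print(delta_signed(560355, 560331))      # -24
print(delta_unsigned32(560355, 560331))  # 4294967272, about 71.6 minutes
```

A delta that large fed into a physics step or a wait loop could plausibly look like a freeze, which is why it is worth ruling out.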

Re: Freezing in Ubuntu Lucid

Posted: Sun Oct 24, 2010 9:03 pm
by Dijit
Meal Worms wrote: Thanks for your message Dan, interesting stuff.

I did some experiments setting my CPUs to not throttle and run at either minimum or maximum speed, but still got the hang (3 times tonight) which kills my dynamic frequency scaling hypothesis.

I also fiddled with adjusting the time once Osmos was spinning in its apparent infinite loop in the hopes that it might shake it loose (guessing the bug might just be due to a faulty assumption of monotonically increasing results and another discontinuity might get it back on track), but that didn't work either.

Do you have the symbols for the referenced version so that you know the actual call stack? Can you tell from that what conditions would be required to trigger an infinite loop in that location? Do you have a sense for precisely how this bug manifests at the API level? (e.g., a result from gettimeofday in the distant future or past?)

Apologies for all the questions. I work in software development/whitebox testing (Adobe Lightroom) and can't resist a good bug hunt... Well, that and I'd love to see this particular bug fixed, and the best way to do that is to understand it well enough to corner it and hand it to you on a silver platter. :)

DT

Re: Freezing in Ubuntu Lucid

Posted: Mon Oct 25, 2010 4:53 pm
by Dijit
At the risk of belaboring the (probably) obvious:

Easy repro case which, on the surface, has the same symptoms:
While Osmos is running, set the clock back in time a week or so. Let it run for a moment and then set it forward again.
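
A minimal sketch in Python of why a backward clock step could produce exactly this symptom (a tight loop of clock reads that each look reasonable). It assumes the application spins on a wall-clock deadline, which is a guess and not something confirmed in this thread. The fake clock, the function names, and the poll budget are all invented for the example.

```python
WEEK_US = 7 * 24 * 60 * 60 * 1_000_000  # one week in microseconds

def make_clock(start_us, step_us, jump_at, jump_us):
    """Fake clock: advances step_us per call, plus a one-time jump on call number jump_at."""
    state = {"now": start_us, "calls": 0}
    def clock():
        state["calls"] += 1
        state["now"] += step_us
        if state["calls"] == jump_at:
            state["now"] += jump_us
        return state["now"]
    return clock

def naive_wait(clock, wait_us, max_polls):
    """Spin until an absolute deadline passes. Returns polls used, or None if the budget runs out."""
    deadline = clock() + wait_us
    for polls in range(1, max_polls + 1):
        if clock() >= deadline:
            return polls
    return None

def guarded_wait(clock, wait_us, max_polls):
    """Spin on accumulated elapsed time, ignoring any step where the clock goes backwards."""
    last = clock()
    remaining = wait_us
    for polls in range(1, max_polls + 1):
        now = clock()
        if now >= last:
            remaining -= now - last
        last = now  # re-base on the new reading either way
        if remaining <= 0:
            return polls
    return None

start = 1287629328 * 1_000_000
# Clock steps back a week on the fifth read, as in the repro above.
print(naive_wait(make_clock(start, 24, 5, -WEEK_US), 1000, 10_000))              # None
print(guarded_wait(make_clock(start, 24, 5, -WEEK_US), 1000, 10_000) is not None)  # True
```

With the naive loop, the deadline sits a week in the future after the step, so every poll sees a sensible, increasing time and still never exits.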

Of course, as you (I think) refer to in your prior post, even CLOCK_MONOTONIC mode is reported to have some bugs where it is not monotonic as promised.
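
One defensive pattern, sketched in Python under the assumption that the per-frame delta is what matters: read a monotonic clock and clamp the delta anyway, so that a misbehaving clock costs one odd frame instead of a hang. `time.monotonic` is a real standard-library call; the `FrameTimer` class and its `max_delta` default are hypothetical.

```python
import time

class FrameTimer:
    """Per-frame delta in seconds, clamped to the range [0, max_delta]."""

    def __init__(self, clock=time.monotonic, max_delta=0.25):
        self._clock = clock          # injectable so it can be tested with a fake clock
        self._max_delta = max_delta  # upper bound on a single frame's delta
        self._last = clock()

    def tick(self):
        now = self._clock()
        delta = now - self._last
        self._last = now  # always re-base, even after a bad reading
        # A backward step becomes 0; a huge forward step becomes max_delta.
        return min(max(delta, 0.0), self._max_delta)

# Usage with the real clock: call tick() once per frame.
timer = FrameTimer()
dt = timer.tick()
print(0.0 <= dt <= 0.25)  # True
```

The clamp does not fix a broken clock source; it only bounds how much damage one bad reading can do.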

DT

Re: Freezing in Ubuntu Lucid

Posted: Wed Dec 15, 2010 3:52 pm
by bouncing
I unfortunately found this to be a problem on my Thinkpad T500 too. You mentioned it's a known bug. Is there a bugzilla.kernel.org entry?

Re: Freezing in Ubuntu Lucid

Posted: Wed Dec 15, 2010 4:07 pm
by bouncing
Also, fwiw, I find that in general it freezes within a minute of use or not at all, until you alt+tab to another application.

Re: Freezing in Ubuntu Lucid

Posted: Wed Dec 15, 2010 5:28 pm
by eddybox
Howdy folks,

Sounds like there's some renewed interest in this problem since the Humble Bundle launch, so we're going to get back on it. Dave (the man behind the Linux port) *just* got back from some travels abroad. Give him a few days to get over jetlag and settle back in and we'll look into getting a build to you with debug symbols. We haven't been able to reproduce the problem ourselves, but hopefully that'll give all you helpful folk the info you need to nail down the bug -- be it the details needed to submit a proper kernel bug report, or if there's some mojo we're not handling quite right.

Stay tuned...

Thanks,
Eddy