Linux's own BSOD. We need to get it out into the open!


Postby davecs » Fri May 13, 2005 11:46 am

My only real frustration when using Linux is a fault that seems only to happen on some boxes and not others, and only when using GLX/3D acceleration, especially on nVidia but also occasionally on ATi.

What happens is that everything freezes except the mouse pointer, which moves around the screen while its cursor stays fixed. Apparently, if the computer is on a network, you can SSH into it (whatever that means) and find that X has taken over all your memory; you can then close X and the computer is still running. Otherwise you are forced to press ALT-SYSRQ-S to sync (flush unsaved data to disk), ALT-SYSRQ-U to remount all filesystems read-only, and ALT-SYSRQ-B to reboot. There are other combinations that restart the keyboard, etc., but ultimately you never get up and running properly unless you reboot.
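
As a rough illustration of that emergency sequence, here is a minimal Python sketch. It assumes a kernel that exposes /proc/sysrq-trigger and that you can still get a root shell on the box (for example over SSH). The names SYSRQ_ACTIONS, emergency_sequence and trigger are made up for this example. Writing a letter to that file asks the kernel to perform the same action as ALT-SYSRQ plus that letter.

```python
# Sketch only: the SysRq letters mentioned in this thread and what they do.
SYSRQ_ACTIONS = {
    "r": "take the keyboard out of raw mode (away from X)",
    "s": "sync: flush unwritten data to all mounted filesystems",
    "u": "remount all filesystems read-only",
    "b": "reboot immediately, without syncing or unmounting",
}


def emergency_sequence():
    """Return the sync, remount, reboot order described in the post."""
    return ["s", "u", "b"]


def trigger(key, path="/proc/sysrq-trigger"):
    """Write one SysRq letter to the kernel trigger file (needs root).

    WARNING: with the default path, "b" reboots the machine at once.
    """
    if key not in SYSRQ_ACTIONS:
        raise ValueError(f"unknown SysRq key: {key!r}")
    with open(path, "w") as f:
        f.write(key)


if __name__ == "__main__":
    # Only print the plan; nothing is sent to the kernel here.
    for key in emergency_sequence():
        print(f"ALT-SYSRQ-{key.upper()}: {SYSRQ_ACTIONS[key]}")
```

The order matters: sync and remount first, so that the reboot at the end does not leave the filesystems needing an fsck.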

There are several theories about the origin of this fault. One is the nVidia drivers. I'm not sure about this, because I have read that it can happen in any 3D-enabled X setup. Another is the kernel. Another is somewhere in X.

Everyone blames everyone else, and you only read about it on forums. It never seems to be discussed in magazines or acknowledged by any developers. At the nvidia help site it is one of the biggest topics, but there are no "official" answers at all.

It also seems to strike randomly. I didn't have this problem for ages, and then it got so bad that I had to switch to the nv (2D) driver, which is slow even in 2D.

It can strike in any distro, and there is no rhyme or reason to it. Two people with identical computers and identical distros can find that one person never sees this bug and the other sees it all the time. A little tweak of xorg.conf can make it go away for a while, only for it to come back with a vengeance later.

If developers can't work out what's causing it, can they at least add a means of escaping from it without having to reboot? Has anyone else had this problem?
davecs
LXF regular
 
Posts: 530
Joined: Sat Apr 09, 2005 11:13 pm
Location: Dagenham, Essex

RE: Linux

Postby towy71 » Fri May 13, 2005 1:01 pm

It happened to me yesterday, but I did not know the keyboard sequence to reboot, so I had to resort to the reset button, which then set off a whole sequence of fsck stuff. The system did eventually get going again!
So thanks, davecs, for the tip. I will bear it in mind if it happens again.
Dick
still looking for that door into summer
towy71
Moderator
 
Posts: 4276
Joined: Wed Apr 06, 2005 2:11 pm
Location: wild West Wales

RE: Linux

Postby Rhakios » Fri May 13, 2005 6:11 pm

One thing to be aware of with the alt+sysrq sequence: some distros disable this function by default for security reasons. SuSE, for example, does so, and one of my first tasks after a fresh installation of a SuSE distribution is to re-enable it.

There is a full explanation of all the magic sysrq key sequences in the kernel docs, such as
/usr/src/linux/Documentation/sysrq.txt
on SuSE. IIRC, it has also been dealt with briefly in the magazine.
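
To check whether a distro has left the magic SysRq key switched on, you can read /proc/sys/kernel/sysrq. A minimal Python sketch follows. The bit meanings are taken from current kernel documentation (older kernels may treat the value as a plain on/off switch), and the names SYSRQ_BITS, describe_sysrq and read_sysrq are invented for illustration.

```python
# Meaning of each bit in /proc/sys/kernel/sysrq when the value is > 1.
SYSRQ_BITS = {
    2: "console log level control",
    4: "keyboard control (SAK, unraw)",
    8: "process debugging dumps",
    16: "sync",
    32: "remount read-only",
    64: "signalling processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of real-time tasks",
}


def describe_sysrq(value):
    """Turn the numeric sysrq setting into a list of enabled functions."""
    if value == 0:
        return ["disabled"]
    if value == 1:
        return ["all functions enabled"]
    return [name for bit, name in sorted(SYSRQ_BITS.items()) if value & bit]


def read_sysrq(path="/proc/sys/kernel/sysrq"):
    """Read the current setting from the kernel (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

Calling describe_sysrq(read_sysrq()) on the affected machine shows what is allowed. Re-enabling everything is a matter of writing 1 to the same file as root; making that survive a reboot is distro-specific.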
Bye, Rhakios
Rhakios
Moderator
 
Posts: 7634
Joined: Wed Apr 06, 2005 11:18 pm
Location: Midlands, UK

RE: Linux

Postby davecs » Fri May 13, 2005 8:07 pm

If you don't actually need 3D/GLX, the "nv" solution is OK; it just slows the machine down a lot. Oh, and if you have an analogue connection to your monitor, it shifts the picture over by about 6-7%, so you have to adjust it, which is annoying if you need to change back and forth.

But still, is there a way to force this issue right out into the open so that the developers will get their heads together and deal with it? Maybe we need a "bribery" fund for them.
Asus Asus M2N32 WS Pro+Athlon AM2/4200+ — GeForce 7600GT — 2Gb Cosair VS RAM — 500Gb WD5000AAKS SATA Drive — PCLinuxOS

RE: Linux

Postby towy71 » Fri May 13, 2005 8:29 pm

Bribery? What are you implying?

That's my tuppence! ;-)
still looking for that door into summer

RE: Linux

Postby fingers99 » Fri May 13, 2005 8:54 pm

So,

Ctrl + Alt + Backspace won't do it?

That would seem to point to something more fundamental than X.
fingers99
LXF regular
 
Posts: 143
Joined: Thu Apr 07, 2005 6:15 pm

RE: Linux

Postby linuxgirlie » Fri May 13, 2005 9:16 pm

I must admit I always resort to Ctrl + Alt + Backspace if I have a problem, unless it's a WineX problem, in which case I use kill -9 on wine. But I'm lazy ;)

Jo
My knowledge comes with no warranty...........

Server operating system designed for schools: http://www.linuxschools.com
linuxgirlie
LXF regular
 
Posts: 787
Joined: Sat Apr 09, 2005 6:34 pm
Location: Kent...UK

Re: RE: Linux

Postby davecs » Fri May 13, 2005 9:18 pm

fingers99 wrote:So,

Ctrl + Alt + Backspace won't do it?

That would seem to point to something more fundamental than X.


The problem is that whilst X is running, it controls the keyboard. It is X that has locked up, not the kernel. The only keys that work at this point are the ALT-SYSRQ-key combinations, because they are directly controlled by the kernel!

So the problem is definitely X. But X is made up of a lot of parts, so it's a question of which part is at fault and who is responsible. Or is a clash with part of the kernel to blame?

No-one is taking responsibility...

RE: Re: RE: Linux

Postby nelz » Fri May 13, 2005 10:21 pm

It doesn't sound like a kernel issue. If, as you said earlier, you can still SSH into the computer from another, the OS is running fine. The problem is that X has locked up and, as you say, X is grabbing almost all input so you cannot get out of it. Using SSH bypasses all of that since it sets up a new console login that has nothing to do with X. From there you can kill X.

A kernel level lockup would also kill networking, so SSH wouldn't work.
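
As a rough sketch of what you might do once logged in over SSH, the Python below picks the X server out of the process list along with its memory use. It assumes the server process is called X, Xorg or XFree86 and that ps accepts the POSIX -o format options; the function names are hypothetical.

```python
import subprocess

# Process names an X server commonly runs under (an assumption; check yours).
X_NAMES = {"X", "Xorg", "XFree86"}


def find_x_servers(ps_output):
    """Parse 'pid rss comm' lines and return (pid, rss_kb) for X servers."""
    found = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, rss, comm = parts
        if comm.strip() in X_NAMES and pid.isdigit() and rss.isdigit():
            found.append((int(pid), int(rss)))
    return found


def running_x_servers():
    """Ask ps for every process, with headers suppressed, and filter for X."""
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "rss=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_x_servers(out)
```

A very large resident size in the second field would match the "X has taken over all your memory" symptom. With the PID in hand, os.kill(pid, signal.SIGTERM) asks X to exit, and SIGKILL is the last resort if it ignores that.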
Last edited by nelz on Sat May 14, 2005 2:15 pm, edited 1 time in total.
nelz
Site admin
 
Posts: 8579
Joined: Mon Apr 04, 2005 11:52 am
Location: Warrington, UK

RE: Re: RE: Linux

Postby fingers99 » Sat May 14, 2005 11:03 am

OK, so what effect does

Alt + SysRq + R have?

Is it true for all the nVidia drivers or only the latest one(s)?
fingers99
LXF regular
 
Posts: 143
Joined: Thu Apr 07, 2005 6:15 pm

RE: Re: RE: Linux

Postby towy71 » Sat May 14, 2005 11:43 am

I don't like you fingers99 :-P

I had to try it and then had to Alt+SysRq+B grrrr
still looking for that door into summer

RE: Re: RE: Linux

Postby Rhakios » Sat May 14, 2005 11:53 am

Why? You should have been able to switch to another terminal and shut down gracefully. That is the purpose of alt+sysrq+r: to free the keyboard from a locked X session and allow you to switch to a terminal where you can do things in a more controlled manner.
Bye, Rhakios

RE: Re: RE: Linux

Postby systemx » Sat May 14, 2005 3:10 pm

The current state of nvidia is pretty unstable and it is my experience that anything over 1.0-6629 can and will cause lockups on most systems...

As for ATI, I have no idea because I don't use it :)

If you're already using that nvidia version, maybe you should apply their patch set: http://www.nvnews.net/vbulletin/showpos ... ostcount=1

It fixes most, if not all the known bugs in that release...
systemx
 
Posts: 2
Joined: Sat May 14, 2005 3:02 pm
Location: Earth

RE: Re: RE: Linux

Postby davecs » Sat May 14, 2005 10:59 pm

Unfortunately, Nvidia 6629 won't install on my PCLinuxOS system, only 7174 will. There is a fix in the full version of the above post, but that doesn't work for me either!

I've put back "nvidia" in my xorg.conf and am running in 3D at present. Just to see how long it lasts!
Asus Asus M2N32 WS Pro+Athlon AM2/4200+ — GeForce 7600GT — 2Gb Cosair VS RAM — 500Gb WD5000AAKS SATA Drive — PCLinuxOS

Re: RE: Linux

Postby doctorflange » Tue May 17, 2005 10:34 am

davecs wrote:Oh and if you have an analogue connection to your monitor, it shifts the picture over, about 6-7%, so you have to adjust it, which is annoying if you need to change back and forth.

Ah, I thought that was just me. Ye olde nvidia squint. Plus, it's nice to be able to run the flurry screensaver to annoy my mac-using flatmate. He gets more annoyed when I run systempreferences :twisted:
doctorflange
 
Posts: 40
Joined: Sat Apr 09, 2005 10:46 pm
Location: Anniesland, Glasgow
