Frequent, random crashes


  • 0x6A7232 US Join Date: 2016-10-06 Member: 222906 Members
    The I/O Delta Writes was at 1,400 for which application, Subnautica?
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    0x6A7232 wrote: »
    The I/O Delta Writes was at 1,400 for which application, Subnautica?
    Yes. That was after the crash. So the game was still running, but without video.

  • 0x6A7232 US Join Date: 2016-10-06 Member: 222906 Members
    Can you run Osiris New Dawn without crashing? Osiris is built on Unity as well, and is still in development. I'm gonna say if you can run Osiris, it's gotta be a weird config issue with Subnautica or something (that crash trace mentioned your drivers). @nesrak1 Do you see anything in the error.txt or output_log.txt?


  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    As the title says, I experience very frequent crashes at very random intervals. There's no discernible pattern at all. Sometimes it crashes during the load screen, sometimes within seconds of loading into the game, sometimes after a few minutes of play, and occasionally after a few hours of play. It doesn't matter what we're doing in the game, what mode we're playing, or where we're located. It even crashes while we have the game paused. I've run out of ideas for fixing it myself and could really use some expert guidance. I don't think it's a problem with the game itself since others don't seem to be having as much trouble. And I don't think it's a hardware issue. I'm not sure what options that leaves.

    Whatever we're doing, the game freezes, the screen goes blue, then we get either a black screen and have to open the task manager manually, a white screen and the task manager telling us the program has stopped working, or we end up on the desktop. We can almost always still hear the game, and the controls still work; we can actually hear the guy walking/swimming while we're hitting the keys on the desktop.

    Per the sticky, I've uploaded the relevant information to Pastebin:
    Output Log
    DxDiag
    error.txt

    I know the error file is not specifically requested. I just figured more information might be helpful.

    As for what I've done to try and resolve the issue myself:
    Used driver-remover software to completely eliminate old Radeon info.
    Installed the latest drivers for the RX 460.
    Replaced the stock cooler with a Hyper 212 EVO to see if thermals were an issue.
    Overclocked the GPU.
    Underclocked the GPU.
    Overclocked the CPU.
    Ran the game offline to see if it was a network issue.
    Ran the game with "Above Normal" priority.
    Upgraded the PSU to see if that fixed any instability.
    Repeatedly defragged and "optimized" the drive.

    Why I don't think it's a hardware issue:
    I can run the Unigine Valley benchmark with no trouble.
    I run the OCCT stress test with no problems.
    The PassMark BurnIn Test also runs fine.
    I also spend hours working with GIMP and Unreal Engine without issue.
    The histogram from the Radeon software shows no obvious problems prior to any crash.

    I realize that running other applications doesn't mean my rig is flawless. But surely that at least indicates it's capable of more than browsing eBay and sending mail.

    What's most perplexing is that it hasn't always done this. The first few weeks we had the game there were no crashes at all. And even now it's not unheard of to go hours without a crash. But those stable periods are less frequent and shorter than they had been. I'm certain it's something that can be fixed either in the system settings or in the command line. It's just that diagnosing the problem is beyond my skill level. Whatever help y'all can offer would be greatly appreciated.

    *EDIT*
    Forgot to add, in case it makes a difference, our rig is a living-room computer. It's hooked to a 42" 1080p TV via HDMI with the sound sent to a surround sound receiver via S/PDIF. Is that maybe more than it can handle?

  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    I haven't tried Osiris yet. But I do now get crashes with Poly Bridge, Unreal Engine, and occasionally even YouTube. Either something is going on behind the scenes or my rig is becoming increasingly unstable.
  • nesrak1 Places Join Date: 2016-12-04 Member: 224536 Members
    Gpu maybe?
  • 0x6A7232 US Join Date: 2016-10-06 Member: 222906 Members
    nesrak1 wrote: »
    Gpu maybe?

    I thought so too except he didn't have any problems with other games or anything. Until now, so, yeah, maybe. Or memory? RAM and PSU can make everything else look like the culprit, and he's already tried a new PSU. @SouthernGorilla try using Windows Memory Diagnostic (click start, search memory, it'll be a desktop app that's already installed). Normally I'd recommend downloading MemTest86 / 86+ and booting that from a flash drive or CD, but you've already got WMD built-in.
  • nesrak1 Places Join Date: 2016-12-04 Member: 224536 Members
    edited September 2017
    According to the stack trace, the game crashes in a shader (assuming the top entry is the most recent).
    Since it is (maybe) shader related, there's nothing in Mono that you can debug.

    Also, it may be worth downgrading driver versions to see what happens. There aren't many fixes for problems like this online, but you said previously you never had this problem.
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    nesrak1 wrote: »
    According to the stack trace, the game crashes in a shader (assuming the top entry is the most recent).
    Since it is shader related (maybe) there's nothing in mono that you can debug... You could try disabling some shaders and see what happens.

    How do I disable a shader? Just got another update from Radeon but updates haven't made a difference so far. I know an RX 460 isn't great, but surely it should be adequate. It's brand new and doesn't show signs of trouble in any of the tests I've run.
  • nesrak1 Places Join Date: 2016-12-04 Member: 224536 Members
    edited September 2017
    That would be something in Unity/Subnautica, but never mind that...
    See edited msg:
    nesrak1 wrote: »
    Also, it may be worth downgrading driver versions to see what happens. There aren't many fixes for problems like this online, but you said previously you never had this problem.
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    Opened a ticket with AMD to see what they say. I hadn't thought about rolling back the driver updates but that makes a lot of sense. We had no problems at all the first few weeks after installing the card. I don't remember, but it's possible the problems started after an update. Went to AMD's site to look for drivers but they don't offer a selection to choose from, only the latest for your card. Just did another update last night and still have crashes.
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    AMD had me try DDU to remove the driver so I could reinstall. It didn't work any better this time than it did the first two times I tried DDU. What kills me is that the tech guy asks for my DxDiag info and says "I see you are running an RX 460 with Windows 10"... duh! I told him that when he first responded to my ticket. I would have expected him to give me some way of getting info straight from the Radeon software, not just ask for a printout of basic system info which I had already given him.
  • nesrak1 Places Join Date: 2016-12-04 Member: 224536 Members
    edited September 2017
    The only reasonable fix I've found is to disable anti-aliasing. Everyone else just gets a reply of "roll back your drivers" or some other unhelpful comment. You should also try the -dx9 flag and see if anything else happens.
  • 0x6A7232 US Join Date: 2016-10-06 Member: 222906 Members
    Most (90+%?) of tech support is just reading from scripts. I begin to suspect if they didn't, they'd be fired. It's just stupid.
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    nesrak1 wrote: »
    The only reasonable fix I've found is to disable anti-aliasing. Everyone else just gets a reply of "roll back your drivers" or some other unhelpful comment. You should also try the -dx9 flag and see if anything else happens.
    Anti-aliasing is a great idea. That's one of the tougher bits for a GPU. And it's something I can disable globally in the Radeon settings, which may make all the difference. The -dx9 flag... would I add that to the Steam launcher options?

    0x6A7232 wrote: »
    Most (90+%?) of tech support is just reading from scripts. I begin to suspect if they didn't, they'd be fired. It's just stupid.
    Yup. At least he didn't tell me to turn it off and back on.

  • nesrak1 Places Join Date: 2016-12-04 Member: 224536 Members
    SouthernGorilla wrote: »
    The -dx9 flag... would I add that to the Steam launcher options?

    I guess it's -force-d3d9 (I've never actually used it, just heard of it), but yes, in the game launch options.
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    I disabled anti-aliasing via the options menu. Radeon settings doesn't offer a disable option. But it does have a "use application settings" mode. So who knows if it will actually disable anti-aliasing or go with its own 2x minimum. It's past my bedtime, so I can't test it out right now. But I'll play tonight and see if it lasts longer.

    I'm also wondering if it might not be a bottleneck from running the card on a PCI-E x16 v1.1 slot. Could it just be that the bandwidth or data rate isn't there to feed the GPU? I'm planning to rebuild the rig as soon as finances allow, so it's no huge deal if I have to wait til then to get stable gameplay.
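
For a rough sense of scale on the PCI-E question, here is a minimal Python sketch of the theoretical peak bandwidth per direction for a few PCI Express generations. The transfer rates and encodings are the published figures for each generation; the lane counts (x8 and x16) are assumptions, since the thread doesn't say how many lanes the card actually negotiates, and real throughput is lower once protocol overhead is counted. It only quantifies the gap between slot generations; it doesn't show that the slot is what causes the crashes.

```python
# Theoretical peak PCI Express bandwidth per direction.
# Each entry: (transfer rate in GT/s per lane, line-encoding efficiency).
PCIE_GENERATIONS = {
    "1.x": (2.5, 8 / 10),      # 8b/10b encoding
    "2.0": (5.0, 8 / 10),      # 8b/10b encoding
    "3.0": (8.0, 128 / 130),   # 128b/130b encoding
}


def bandwidth_gb_per_s(generation, lanes):
    """Return the theoretical one-way bandwidth in GB/s for a link."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    return gt_per_s * efficiency * lanes / 8


if __name__ == "__main__":
    for gen in PCIE_GENERATIONS:
        for lanes in (8, 16):  # assumed lane counts, not taken from the thread
            rate = bandwidth_gb_per_s(gen, lanes)
            print(f"PCIe {gen} x{lanes}: {rate:.2f} GB/s")
```

On these numbers a v1.x x16 slot tops out at about a quarter of what a 3.0 x16 slot can carry, which would normally show up as lower frame rates or slower loading.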
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    Tech guy gave me a link to download an older driver. It crashed too. Disabling anti-aliasing didn't help either. I'm just convinced it's something to do with the age of the board... it just can't run at the speed it needs to.
  • nesrak1 Places Join Date: 2016-12-04 Member: 224536 Members
    edited September 2017
    Have you tried the force d3d9 yet?
    I was thinking about continuing execution with a debugger but that wouldn't work because I just took a look at the memory in the crash and it's being destroyed for whatever reason. Looks like we're out of options :(
  • 0x6A7232 US Join Date: 2016-10-06 Member: 222906 Members
    SouthernGorilla wrote: »
    Tech guy gave me a link to download an older driver. It crashed too. Disabling anti-aliasing didn't help either. I'm just convinced it's something to do with the age of the board... it just can't run at the speed it needs to.

    Could be thermal paste has deteriorated and it's overheating?

    I mean, you could try re-applying the thermal paste: pull the heatsink, scrape and clean off the old thermal paste with 97% isopropyl alcohol and Q-Tips, then re-apply a SMALL bead of thermal paste to the GPU. You can search for videos for your specific card. Just DON'T put too much paste on! I'd recommend Arctic Silver 5 for paste (Amazon or eBay).
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    nesrak1 wrote: »
    Have you tried the force d3d9 yet?
    I was thinking about continuing execution with a debugger but that wouldn't work because I just took a look at the memory in the crash and it's being destroyed for whatever reason. Looks like we're out of options :(
    Thanks for reminding me about the DX9. Just tried it, got another crash. But that may have been because I still had three browser windows open along with the Steam interface. So I'll try again with nothing running. I'm also going to reinstall the latest driver since the rollback didn't help. The new driver minus anti-aliasing did last significantly longer than the old driver. So *maybe* the new driver minus anti-aliasing plus DX9 will be the magic bullet.
    0x6A7232 wrote: »
    Tech guy gave me a link to download an older driver. It crashed too. Disabling anti-aliasing didn't help either. I'm just convinced it's something to do with the age of the board... it just can't run at the speed it needs to.
    Could be thermal paste has deteriorated and it's overheating?
    I doubt the paste is an issue. I just upgraded from the stock cooler to a Hyper 212 EVO maybe a month ago in order to deal with any potential heat issues. The temps never get over 70-72 when I'm testing. Unless there's a problem with the GPU fan and the GPU is thermal throttling or crashing.

    The frustrating part of this is that it ran fine for the first few weeks. So I know the system was capable of running it. I have 335 hours in the game and I didn't do that fifteen minutes at a time. I would sit here and literally play for 8, 10, 12 hours at a time. I don't know what changed. But this last crash happened in the time it took me to walk across my base, get in the Seamoth, sail to the Aurora lab entrance, and get out of the boat. I didn't even make it out of the water. Is that even five minutes?
  • 0x6A7232 US Join Date: 2016-10-06 Member: 222906 Members
    SouthernGorilla wrote: »
    nesrak1 wrote: »
    Have you tried the force d3d9 yet?
    I was thinking about continuing execution with a debugger but that wouldn't work because I just took a look at the memory in the crash and it's being destroyed for whatever reason. Looks like we're out of options :(
    Thanks for reminding me about the DX9. Just tried it, got another crash. But that may have been because I still had three browser windows open along with the Steam interface. So I'll try again with nothing running. I'm also going to reinstall the latest driver since the rollback didn't help. The new driver minus anti-aliasing did last significantly longer than the old driver. So *maybe* the new driver minus anti-aliasing plus DX9 will be the magic bullet.
    0x6A7232 wrote: »
    Tech guy gave me a link to download an older driver. It crashed too. Disabling anti-aliasing didn't help either. I'm just convinced it's something to do with the age of the board... it just can't run at the speed it needs to.
    Could be thermal paste has deteriorated and it's overheating?
    I doubt the paste is an issue. I just upgraded from the stock cooler to a Hyper 212 EVO maybe a month ago in order to deal with any potential heat issues. The temps never get over 70-72 when I'm testing. Unless there's a problem with the GPU fan and the GPU is thermal throttling or crashing.

    The frustrating part of this is that it ran fine for the first few weeks. So I know the system was capable of running it. I have 335 hours in the game and I didn't do that fifteen minutes at a time. I would sit here and literally play for 8, 10, 12 hours at a time. I don't know what changed. But this last crash happened in the time it took me to walk across my base, get in the Seamoth, sail to the Aurora lab entrance, and get out of the boat. I didn't even make it out of the water. Is that even five minutes?

    Yeah, I meant the GPU. Could just be a new version of Unity that isn't playing well with your driver version or something. :/
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    I'm done trying to diagnose it. I just can't play until I rebuild the rig. The only thing I haven't done is a malware scan to see if anything is stealing resources. Maybe next paycheck I'll upgrade to an SSD to see if that helps. That would be part of the rebuild anyway.
  • 0x6A7232 US Join Date: 2016-10-06 Member: 222906 Members
    SouthernGorilla wrote: »
    I'm done trying to diagnose it. I just can't play until I rebuild the rig. The only thing I haven't done is a malware scan to see if anything is stealing resources. Maybe next paycheck I'll upgrade to an SSD to see if that helps. That would be part of the rebuild anyway.

    Yeah, you can keep some components, so upgrading them first, then transferring them to your new rig, makes sense. (GPU, SSD / HDD, RAM if it's DDR4, PSU)
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    The RAM is possibly the problem. I have 8 GB, but it's DDR2. Could be too slow to keep up.
  • nesrak1 Places Join Date: 2016-12-04 Member: 224536 Members
    During the crash, RAM looked fine. I don't know about VRAM, though. It's also weird that it can crash in the main menu.
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    The whole thing is weird. How can it go from perfectly fine to a complete mess with no sign of trouble in the system? Why do programs specifically designed to expose problem areas run fine, but games crash? I wish I could get logs from every program that crashes to see if the trigger is the same every time. But it doesn't crash much outside of Subnautica and I don't know if other programs log such things.
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    Hah! So the latest request from the AMD guy is for me to put the GPU in a "known good" computer and to try another GPU in this computer. Like I have spare computers in every room and can just swap components whenever I feel like it. The GPU I pulled out of this rig is so old and puny it's probably worse than the onboard graphics.

    Of course, I do have two laptops, a Linux desktop, and a quad-socket server motherboard lying around. But none of that will help me diagnose the problem with this thing. What am I supposed to do, go visit my mother-in-law and tear her computer down, throw in this GPU, install Subnautica on it, and see if it crashes?
  • 0x6A7232 US Join Date: 2016-10-06 Member: 222906 Members
    SouthernGorilla wrote: »
    The whole thing is weird. How can it go from perfectly fine to a complete mess with no sign of trouble in the system? Why do programs specifically designed to expose problem areas run fine, but games crash? I wish I could get logs from every program that crashes to see if the trigger is the same every time. But it doesn't crash much outside of Subnautica and I don't know if other programs log such things.

    Heh. Schooling time:
    1) Click Start
    2) Type "Event Viewer" (no quotes) and press Enter
    3) In the pane on the left, choose your category. An example would be "Event Viewer (Local) > Applications and Services Logs > Microsoft > Windows > {list of MS Windows services}"
    You're probably going to want "Event Viewer (Local) > Windows Logs > Application" (Although System also has logs that might affect this).
    4) After clicking an entry in the left pane, it gives a list of logs for the selected type (say, Application Logs) in the upper half of the center pane. Click on a specific log to view its contents in the lower half of the center pane. You have a list of actions you can perform in the right hand pane, such as finding a specific term (say, Subnautica.exe? Or a driver name, or a crash error code?).
    You can also create a custom view or filter the current log, etc.
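
The same search can be roughly scripted. This is a minimal Python sketch, not a definitive tool: it shells out to the built-in `wevtutil` command to pull the newest Application-log events with event ID 1000 (the ID Windows uses for "Application Error" crash entries) and keeps the ones that mention a given program. The function names, the `Subnautica.exe` search term, and the assumption that each event in the text output starts with an `Event[n]` line are illustrative choices, so check them against what your machine actually prints.

```python
import re
import subprocess
import sys


def build_query(log="Application", event_id=1000, count=20):
    """Build a wevtutil command that lists the newest events with one event ID."""
    return [
        "wevtutil", "qe", log,
        f"/q:*[System[(EventID={event_id})]]",  # XPath filter on the event ID
        f"/c:{count}",                          # maximum number of events
        "/rd:true",                             # newest events first
        "/f:text",                              # plain text instead of XML
    ]


def matching_events(text, needle):
    """Split text output into per-event blocks and keep those mentioning needle."""
    # Assumes each event begins with a line such as "Event[0]:".
    blocks = re.split(r"(?m)^Event\[\d+\]:?[ \t]*$", text)
    return [b.strip() for b in blocks if needle.lower() in b.lower()]


if __name__ == "__main__" and sys.platform == "win32":
    result = subprocess.run(build_query(), capture_output=True, text=True)
    for event in matching_events(result.stdout, "Subnautica.exe"):
        print(event)
        print("-" * 60)
```

Comparing the faulting module named in each matching entry is one way to see whether the trigger is the same every time.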

    Using the Event Viewer in combination with such tools as Sysinternals Suite (Process Explorer, Process Monitor, etc etc) gives you a pretty good view of what's going on in your system (if you can be bothered to wade through it, looking for common themes connecting the dots to form the picture).

    ALSO, you might have a bad PSU: it works fine under light load and all tests come back OK, but under heavy (gaming) load it can't deliver the amount of power required, the voltages drop below minimum, and the computer freaks out and crashes. You'd have to measure the voltages coming into the motherboard with a multimeter while the system is under load (very carefully). Or replace it with a known good PSU.
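
To make "below minimum" concrete, here is a small Python sketch that checks multimeter readings against the usual ATX tolerance of plus or minus 5% on the +12V, +5V and +3.3V rails. The readings in the example are made up for illustration, not measurements from this system, and the function names are hypothetical.

```python
# Nominal voltages for the main positive ATX rails.
NOMINAL_RAILS = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}
TOLERANCE = 0.05  # +/-5 % on these rails


def rail_limits(nominal, tolerance=TOLERANCE):
    """Return the (minimum, maximum) acceptable voltage for a rail."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def out_of_spec(readings):
    """Return the subset of readings that fall outside the tolerance band."""
    bad = {}
    for rail, volts in readings.items():
        low, high = rail_limits(NOMINAL_RAILS[rail])
        if not (low <= volts <= high):
            bad[rail] = volts
    return bad


if __name__ == "__main__":
    # Hypothetical readings taken while the system is under gaming load.
    under_load = {"+12V": 11.2, "+5V": 4.9, "+3.3V": 3.3}
    print(out_of_spec(under_load))
```

A +12V rail that sits in range at idle but sags out of the band under load would point at the PSU rather than the GPU.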

    Also, components can have the same issues: GPU works fine under light load, but under heavy load, the voltage regulators have broken down to the point where they can't handle the normal required power to perform, and screw everything up.

    It's a real fun guessing game, that is best solved by replacing components one at a time with known good spares, and testing (so, when you build that new rig, you could start one piece at a time, with, say, a power supply, and replace your old computer's parts until you find the culprit, then you'll have the old system as a known good spare parts or extra computer for your use or donation, having isolated the faulty component).

    I'd replace:
    1) PSU
    2) GPU
    3) HDD
    The rest aren't cross-compatible, unfortunately. Then just transfer the replaced pieces to the new rig when it's ready. (Note: replace that PSU first, it can affect the rest, and you also want to have a powerful enough PSU to handle the new GPU)
  • SouthernGorilla United States Join Date: 2017-07-26 Member: 232057 Members
    The PSU is new, just put it in a couple months ago. But I do wonder if I should have stepped up to a 600W instead of 500. The drive is definitely next on the list. After that, the rig gets rebuilt. If I still get crashes with a fresh Ryzen build I'll know for sure it's the GPU.

    I'll try that event viewer my next day off. I have used the histogram on the Radeon software, it never showed anything odd after a crash. But it's not exactly high-resolution diagnostics.

    Would the OCCT stress test be considered light duty? It sure looks like it's putting the system under some pressure.

    If nothing else, I'm definitely learning more about diagnosing problems.
  • 0x6A7232 US Join Date: 2016-10-06 Member: 222906 Members
    edited September 2017
    SouthernGorilla wrote: »
    The PSU is new, just put it in a couple months ago. But I do wonder if I should have stepped up to a 600W instead of 500. The drive is definitely next on the list. After that, the rig gets rebuilt. If I still get crashes with a fresh Ryzen build I'll know for sure it's the GPU.

    I'll try that event viewer my next day off. I have used the histogram on the Radeon software, it never showed anything odd after a crash. But it's not exactly high-resolution diagnostics.

    Would the OCCT stress test be considered light duty? It sure looks like it's putting the system under some pressure.

    If nothing else, I'm definitely learning more about diagnosing problems.

    You'll want something that stresses both the CPU and GPU, I'd think, because that's what a game would do. (Especially one that's not finished being optimized).

    Oh. You could also try this (but if you already reset your PC, probably no need. But it can't hurt):

    Open an admin prompt with Win+X, then A. (This opens PowerShell, which is almost the same thing as cmd; if you really need the old cmd prompt, type cmd and press Enter in PowerShell and it'll open the old prompt. For these tasks, however, PowerShell will work fine.) Then run:
    dism /online /cleanup-image /restorehealth
    

    and then
    sfc /scannow
    

    EDIT: curious, what brand & model PSU did you get?