Curiosity has another computer crash


Since March 6 all activity from Curiosity seemed to stop, with no images and no science team updates. The reason? The rover had experienced another computer crash and reboot:

Curiosity experienced a computer reset on its Side-A computer on Wednesday, March 6, 2019 (Sol 2,339), that triggered the rover’s safe mode. This was the second computer reset in three weeks; both resets were related to the computer’s memory.

The mission team decided to switch from the Side-A computer back to the rover’s Side-B computer, which it operated on for most of the mission until November of 2018. Side-B recently experienced its own memory issue; the team has since further diagnosed the matter, reformatting the Side-B computer to isolate areas of “bad” memory. As of today, Curiosity is out of safe mode, and the team is configuring the rover for new science operations in the clay unit. Curiosity is expected to return to science operations as early as Wednesday.

This news is worrisome. The track record for spacecraft with increasing computer problems is that they never get better. Instead, the problem steadily worsens until operations become limited or even impossible. In the meantime engineers work wonders to extend the mission, but in the end this is a battle they appear to always lose.

We are beginning to see this pattern with Curiosity. Both of its computers have now experienced problems. The team appears to have a better handle on the problems with the back-up computer (Side-B), which is why they have switched back to it. Should its memory issues continue to worsen, however, the rover will be in serious trouble, as the Side-A computer has proven to be very unreliable.

7 comments

  • Col Beausabre

    Bob, do we know why these memory problems continue across different spacecraft? Is it exposure to cosmic radiation? Just random quantum mechanical “flips”? I don’t think there is anything mechanical involved in memory, so there’s nothing to wear out…

  • TL

    Conventional computer memory does wear out and go bad over time. There is a finite number of times it can be accessed before it becomes unreliable. Mil-spec chips are much more robust than consumer grade parts, but eventually they too are going to go bad. What it sounds like the engineers are doing is to restrict the OS from using the specific memory registers which have been identified as bad. Cool trick to extend the life, but at that point the system is on life support.
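The trick TL describes, keeping the system from using memory already identified as bad, can be illustrated with a toy sketch. This is only an illustration of the general idea, not Curiosity's actual flight software: the `BlockAllocator` class, its method names, and the block numbering are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BlockAllocator:
    """Toy allocator (hypothetical) that hands out memory blocks and skips bad ones."""

    total_blocks: int
    bad_blocks: set = field(default_factory=set)
    in_use: set = field(default_factory=set)

    def mark_bad(self, block: int) -> None:
        """Record a block as unreliable so it is never handed out again."""
        if not 0 <= block < self.total_blocks:
            raise ValueError(f"block {block} is out of range")
        self.bad_blocks.add(block)
        self.in_use.discard(block)

    def allocate(self) -> int:
        """Return the lowest-numbered block that is healthy and not in use."""
        for block in range(self.total_blocks):
            if block not in self.bad_blocks and block not in self.in_use:
                self.in_use.add(block)
                return block
        raise MemoryError("no healthy blocks left")

    def usable_capacity(self) -> int:
        """Number of blocks still considered healthy."""
        return self.total_blocks - len(self.bad_blocks)


if __name__ == "__main__":
    allocator = BlockAllocator(total_blocks=8)
    allocator.mark_bad(0)
    allocator.mark_bad(3)
    print([allocator.allocate() for _ in range(3)])
    print(allocator.usable_capacity())
```

Every block marked bad permanently shrinks the usable capacity, which is why this approach extends a system's life without reversing the underlying decline.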

  • Col Beausabre

    TL – Thank you. At what point (in years) do you think a home computer or laptop might start facing problems?

  • Edward_2

    Voyager 1 and 2 have been traveling through space since 1977, using “ancient” computer technology.

    How is it that Curiosity and Opportunity are having computer problems when they were only launched in 2011 and 2003?

  • Edward

    Edward_2 asked: “How is it that Curiosity and Opportunity are having computer problems when they were only launched in 2011 and 2003?”

    When NASA stopped using core memory in favor of the much lighter-weight memories and computers that are used today, they understood that they were trading some amount of reliability and lifespan for more capable probes, including much more data collection.

    Limitations of the Deep Space Network for downloading data from our probes influence decisions on mission lifespans, but NASA will gladly extend a mission whenever and however it can. The Voyagers now collect a limited set of data and are only occasionally called for data download.

    The other day, Col Beausabre started a brief discussion of magnetic core memory:
    https://behindtheblack.com/behind-the-black/points-of-information/nasa-considering-replacing-sls-with-commercial-rockets-for-first-orion-test-mission/#comment-1065090

    It is a mystery to me why Curiosity is having so much trouble with its computers when other probes are not. New Horizons uses modern computer technology, yet it is one of many probes that have been flying for much longer than Curiosity’s mission. Cassini flew for two decades with few problems.

  • Gealon

    There is quite a wide range of factors as to why New Horizons has had so little trouble with its computer while the Rovers have had considerably more. I would speculate, though, that the primary factors are: the manner in which the memory is used, the environment the spacecraft is subjected to, and the manufacturer of the memory.

    New Horizons spends much of its time in sleep or cruise mode, doing very little other than checking up on its systems’ state of health. This means that very little is being written into flash memory over the life of the spacecraft. Yes, it collects a boatload of data on a flyby, but considering the time between flybys, the memory sees very little use. The Rovers, on the other hand, spend almost all of their time awake, collecting data and sending it back. They undergo many more read/write cycles than New Horizons over the course of their lives, which would greatly increase the risk of a memory cluster eventually failing from time to time.

    Second is environment. New Horizons is moving away from the sun, and if my memory of the math is correct, there is an inverse relationship between the amount of energy, or in this case, particle emissions, the spacecraft is subjected to and its distance from the sun. I believe it is the square of the distance or some such. The point being, as it moves away, it will be subjected to steadily fewer particle hits, reducing the risk of any of them affecting the computers. Conversely, the Rovers are going nowhere as far as the sun is concerned, so their risk remains the same.
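The inverse-square relationship Gealon is recalling can be put into rough numbers. The sketch below assumes an idealized case in which solar particle flux spreads out evenly and falls off as one over the distance squared. The function name `relative_flux` and the two distances (about 1.52 AU for Mars, about 43 AU for New Horizons in early 2019) are approximations chosen for illustration, and the model ignores cosmic rays entirely.

```python
def relative_flux(distance_au: float, reference_au: float = 1.0) -> float:
    """Flux at distance_au relative to flux at reference_au, assuming a pure 1/r^2 falloff."""
    if distance_au <= 0 or reference_au <= 0:
        raise ValueError("distances must be positive")
    return (reference_au / distance_au) ** 2


# Approximate heliocentric distances in astronomical units (illustrative assumptions).
MARS_AU = 1.52
NEW_HORIZONS_AU = 43.0

ratio = relative_flux(NEW_HORIZONS_AU, reference_au=MARS_AU)

if __name__ == "__main__":
    print(f"Solar particle flux at New Horizons relative to Mars: {ratio:.4%}")
```

Under these assumptions New Horizons would see on the order of a tenth of one percent of the solar particle flux that reaches Mars.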

    And lastly, the manufacturer does play a large role in the quality of the memory used on the spacecraft. Without looking the information up myself, I would wager that these craft do not use the same chips or even the same chip manufacturers. This could also contribute to the issues. However, because the Rovers share similar computer issues and are of different ages, I am more of the mind that it is not so much a manufacturing issue as one related to the first two factors: mode of usage and the environment the computers are operating in.

  • Edward

    Gealon wrote: “Second is environment. New Horizons is moving away from the sun, and if my memory of the math is correct, there is an inverse relationship between the amount of energy, or in this case, particle emissions, the spacecraft is subjected to and its distance from the sun. I believe it is the square of the distance or some such. The point being, as it moves away, it will be subjected to steadily fewer particle hits, reducing the risk of any of them affecting the computers.”

    Excellent point. In fact, particle radiation may be further reduced, as not all the particles leave the sun at escape velocity, so those particles probably don’t reach New Horizons’ current position. Further, the speeds of the particles relative to the spacecraft may also be reduced at this distance, meaning that the damaging effects would be reduced.

    On the other hand, this far out the cosmic rays may be more dense, as the sun’s magnetic field tends to divert many of them from getting close to the sun (e.g. near Earth or Mars).

    Chip design and manufacture would make quite a difference in reliability. Modern chip manufacturing can make amazingly small transistors. It has been a couple of decades since I worked on NASA space instruments, but I don’t think it is going out on a limb to suggest that NASA insists on larger transistors in order to improve reliability. NASA has lists of already-approved vendors for many parts, because the manufacturer and design do play a large role in the quality of many things used on a spacecraft.
