
Serial EEPROM Memory, um.... Forgets

Status
Not open for further replies.

lownoise

Electrical
Sep 17, 2003
US
We have been using 93C46 serial EEPROMs (usually from Microchip, but also TI/Fairchild, Atmel, others) in our products for a number of years to maintain instrument-specific data such as serial numbers.

In the last few years we have been seeing an increasing trickle of field failures of these parts. When the units return to us, the EEPROM contents have been corrupted, usually in only a few cells. The most common occurrence is $FFFF at the top address; we also commonly see the bottom address corrupted, or random cells in the middle. The most common corrupted value is $FFFF, but we have seen others.

The failures appear to be quite rare, but happen often enough in the field that they are a concern. However, we have not been able to recreate them in captivity, so we're not sure how to fix them. The failures span several products and situations with very little in common in terms of surrounding logic. We're following pretty solid design practices in terms of making sure we're not attempting to talk to the parts during power cycles, keeping supplies clean, avoiding glitches, etc.

Elsewhere on this forum I noticed some references to some unreliability from Microchip parts. Is there any substance to this claim?

Any idea what might be getting us in trouble?

Thanks
lownoise
 

Perhaps the cells are exceeding their expected life of erase/write cycles? Most manufacturers give an estimated life expectancy in terms of erase/write cycles for serial EEPROMs... If a cell is erased or written several times a day, it is conceivable that the limit will be reached within a few years... which could lead to a failure.
 
Usually, EEPROMs can tolerate a minimum of 10^4 cycles, and most of the ones built in the last 5 yrs should have 10^5 cycle tolerance. Over a period of 5 yrs, that would still be an average of roughly 5 to 55 cycles per day.
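The arithmetic behind that estimate can be checked with a quick sketch. The function name and the 365-day year are assumptions made for illustration; the cycle ratings and the 5-year span are the figures quoted in the post.

```python
def cycles_per_day(rated_cycles: float, years: float,
                   days_per_year: float = 365.0) -> float:
    """Average erase/write cycles per day that would use up
    rated_cycles over the given number of years."""
    return rated_cycles / (years * days_per_year)


# 10^4 and 10^5 are the endurance figures quoted in the post.
for rated in (1e4, 1e5):
    per_day = cycles_per_day(rated, years=5)
    print(f"{rated:.0e} cycles over 5 years: {per_day:.1f} per day")
```

This works out to about 5.5 and 55 cycles per day respectively.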

Although I don't doubt that you've done due diligence with the design, you've left two issues unanswered:

1. Are the corrupted cells permanently damaged? It's possible to do margin testing on memory devices to determine if there is any degradation of the threshold characteristics of the memory.

2. The specificity of the locations is interesting. Are there specific reasons why those cells might be cycled more than the rest?

If there is no obvious issue with cell margins, then the pattern failure would suggest that there is some operational/initialization problem heretofore unknown that causes a hiccup in the program.

TTFN
 
Are all unused inputs tied high / low, or are they left floating? I encountered a very similar problem with E^2 corruption, and it was easily fixed by tying the unused pins (even though they were marked NC on the datasheet) to ground via a 10K resistor. However, it is always a good idea to run that by the device manufacturer. No-connect device pins might still carry a signal used elsewhere on the die.
 
Thanks for your thoughts...

The problem is not one of maximum cycles; the data is written very rarely - nominally, only when the product is produced, and perhaps another time or two if the product is ever serviced.

The data is read more often; perhaps a few times a day, maybe as many as a few hundred.

We have not been able to find any evidence of a problem in the program. Being mostly a hardware guy, I suspected this at first too, but the code is pretty simple and the situation straightforward. We can't find a weakness there, though I suppose since we can't find a weakness anywhere, it's as likely there as anywhere else...
 
As far as pullups/downs, the pins are generally driven by discrete logic which has no floating state.

(In a few cases they are connected to configurable logic devices which take some time to 'wake up' and drive their pins. We have already identified the need to put pull resistors on these guys, but it doesn't account for all the failures).
 
The fact that the most common address is the top address is very suspicious.

You're left with either some sort of problem in the chip itself or some problem with the system that it's installed in.

TTFN
 
Calculate the failure rate (MTTF) based on the total field population and the duration of time they've been there. Talk to the suppliers about what the resulting failure rate is. You say it's "rare"...hopefully you can quantify that.
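As a rough sketch of that calculation, assuming a constant failure rate: the function name and every number in the example call are hypothetical placeholders, not data from this thread. FIT here means failures per 10^9 device-hours.

```python
def field_failure_stats(units_in_field: int,
                        avg_hours_per_unit: float,
                        failures: int):
    """Return (failures per device-hour, MTTF in hours, FIT).

    Assumes a constant failure rate over the observation period.
    """
    device_hours = units_in_field * avg_hours_per_unit
    rate = failures / device_hours
    mttf = device_hours / failures if failures else float("inf")
    fit = rate * 1e9  # failures per 10^9 device-hours
    return rate, mttf, fit


# Hypothetical field data, for illustration only.
rate, mttf, fit = field_failure_stats(
    units_in_field=10_000, avg_hours_per_unit=20_000, failures=20)
print(f"MTTF ~ {mttf:.3g} device-hours, ~ {fit:.0f} FIT")
```

A number in this form is something a supplier can compare directly against their own reliability data.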

Check the power and signal quality, too, especially in situ if possible. Noise and (especially) negative-going spikes in power and signals can affect the charge on the floating gate. Just using "pretty solid design practices" doesn't say that you've actually checked the signal quality. Also, does power quality vary between installations? Are you decoupling the parts properly? Have your suppliers' FAEs review your design.

If the failures occur across suppliers it points to a design (or SW) problem. If it's mostly with Microchip, but your volume is mostly Microchip, then you should determine failure rates vs. supplier, too.
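A minimal sketch of that normalization follows. The supplier counts are invented for illustration (they are not the original poster's data), and the function name is hypothetical.

```python
def failure_rate_by_supplier(shipped: dict, failed: dict) -> dict:
    """Fraction of shipped units returned with corrupted contents,
    per supplier. Suppliers with no shipments are skipped."""
    return {s: failed.get(s, 0) / n for s, n in shipped.items() if n > 0}


# Hypothetical counts: most volume (and most failures) is one supplier.
shipped = {"Microchip": 8000, "Atmel": 1500, "Other": 500}
failed = {"Microchip": 16, "Atmel": 3, "Other": 1}

rates = failure_rate_by_supplier(shipped, failed)
for supplier, r in sorted(rates.items()):
    print(f"{supplier}: {failed[supplier]} failures, {r:.2%} of shipped")
```

With these made-up numbers the raw failure counts differ widely but the rates are identical, which is the situation where the supplier is probably not the cause.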

--
Mike Kirschner
Design Chain Associates, LLC
 
I designed boards with EEPROMs about 10 years ago myself. As far as I remember, the manufacturers specified a data retention time of 10 years at that time. Maybe that time has just elapsed.
 
Actually, 10 years was simply the most rational number to select. Oxide-isolated floating gate projected retention was on the order of 100 yrs, but no one would be silly enough to actually guarantee that.

TTFN
 
The 93C46 from Microchip takes a while to initialise at power-up. It is not in the datasheet, but you must wait a minimum of 10 msec before addressing the Microchip 93C46. It is usually not a big problem to wait 30 msec or more at start-up to be sure that all components have initialised correctly. In my experience, the 93C46 from Atmel initialises a lot faster at power-up.
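The start-up wait could be enforced in one place instead of relying on every caller to remember it. The sketch below only illustrates the logic: real firmware would more likely do this in C against a hardware timer, the class and method names are hypothetical, and the 30 msec default is the poster's rule of thumb, not a datasheet figure.

```python
import time


class PowerUpGuard:
    """Delays the first device access until a settling time has
    elapsed since power-up (taken here as construction time)."""

    def __init__(self, settle_s=0.030, clock=time.monotonic,
                 sleep=time.sleep):
        self._settle_s = settle_s
        self._clock = clock
        self._sleep = sleep
        self._t0 = clock()  # assumption: object is created at power-up

    def wait_until_ready(self) -> float:
        """Sleep for whatever is left of the settling time.
        Returns the number of seconds actually waited."""
        remaining = self._settle_s - (self._clock() - self._t0)
        if remaining > 0:
            self._sleep(remaining)
            return remaining
        return 0.0


guard = PowerUpGuard()      # created as early as possible after reset
guard.wait_until_ready()    # call before the first EEPROM command
```

Passing the clock and sleep functions in makes the guard easy to exercise without real delays.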
 
Hi

Some years ago I had the same problem, and the solution I found came from a Microchip app note on keeping the memory from being corrupted. Basically it is a 10 uF capacitor between the memory's Vcc and GND, with the system Vcc feeding the memory through a diode, which keeps the capacitor from discharging back into the rest of the circuit when the power turns off. With this scheme my memories seem to work properly. However, I also changed the memory brand: I now use Atmel parts exclusively.
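One way to read this circuit is that the diode reserves the capacitor's charge for the memory alone, so the memory's supply decays more slowly than the rest of the board. A back-of-the-envelope estimate of that hold-up time, assuming a constant load current, is t = C * (V_start - V_min) / I. Only the 10 uF value comes from the post; the voltages and current below are placeholders to be replaced with figures from the actual design and datasheet.

```python
def hold_up_time_s(capacitance_f: float, v_start: float,
                   v_min: float, load_current_a: float) -> float:
    """Time for a capacitor to droop from v_start to v_min while
    supplying a constant load current (ideal capacitor, no leakage)."""
    if load_current_a <= 0:
        raise ValueError("load current must be positive")
    return capacitance_f * (v_start - v_min) / load_current_a


# Assumed values: 5 V rail minus ~0.6 V diode drop, a 2.5 V minimum
# operating voltage, and a 2 mA load. All three are hypothetical.
t = hold_up_time_s(10e-6, v_start=4.4, v_min=2.5, load_current_a=2e-3)
print(f"hold-up time ~ {t * 1e3:.1f} ms")
```

With these placeholder numbers the capacitor holds the memory supply up for roughly 9.5 ms.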
 