EEPROM corruption - even though not writing to it

MartinM57

Moderator
I have a 08M2 low power battery-powered app that regularly naps with the normal "setfreq k31/nap 5/setfreq m32" combination.

It has "disablebod" at the start, so it runs with no brown out detection at all times.

Sometimes the batteries go very flat and need changing.

I have an
Code:
eeprom 0,(0,0,0,0,0,0,0,0,0,0,50,50,50...
declaration in the code that almost fills the entire EEPROM - the code never writes to the EEPROM.

I have some evidence that the EEPROM is being corrupted - unfortunately the code isn't instrumented to be able to output the EEPROM values, so I can't completely prove it - but re-programming the chip sorts the problem.

Is it likely that the lack of brown out detection can lead to EEPROM corruption?
 

nick12ab

Senior Member
If the microcontroller is being allowed to run outside of its rated operating range by disabling brown-out detection then anything could happen. As the PICAXE is an interpreter, the code for writing EEPROM is always there even if there is nothing in the user code that writes to EEPROM, so a simple corruption of the program counter is all that's needed for a write to be performed erroneously.

The 08M2 has the EEPROM shared with the program memory which prevents the use of '#no_data'. This is unfortunate as it means you cannot write a program to read out the EEPROM. If there is space you could add code that does that to your existing program though and hope the issue happens again.
 

srnet

Senior Member
Modifying the code might or might not provide more information as to the problem.

However you will still need to prevent the PICAXE from being allowed to run at marginal, out of specification, voltage levels.
 

MartinM57

Moderator
So if I'm getting corruption in the "EEPROM memory" i.e the part populated via the "eeprom" statement, and as it seems that in an 08M2 the program is held in the EEPROM also (?), then I could be liable to corruptions in the program as well (root cause being a disablebod and no subsequent enablebod)?

...semi-interestingly, the normal recommended min current sleep regime is the "disablebod/setfreq k31/nap 5/setfreq m32" combination. Not many other threads I've come across talk about an "enablebod" to get an 08Mx processor back to being maximally protected from brown out problems.

For info, I've removed the disablebod from the code completely now (and no explicit enablebod - it's the default I believe). Will keep monitoring closely.
 

tmfkam

Senior Member
I wrote a program for a battery-powered clock. This had an LCD with some user-defined graphics, stored in EEPROM to save memory, that displayed the state of charge remaining in the battery, along with a bell for the alarm and so on. I had 'disablebod' active in an attempt to reduce the power consumption as much as possible.

For a while, I had the supply generated from a little booster module that stepped up the nominal battery of 4.5V to 5V. This module would work with a supply of just a few volts (2.5V comes to mind, but don't quote me) generating a stable 5V until the batteries were totally exhausted, at which point it stopped. Dead.

This wasn't a problem: I'd replace the batteries and all would appear to be well. On closer inspection though, the graphics stored in EEPROM would often become corrupted, requiring me to re-programme the 08M2 in order to restore the correct display of the graphic elements. I had a second device using the same program but with 'traditional' alkaline cells, which allowed the low battery warnings to display and let me know it was time to change the batteries before they collapsed. This version had no problems whatsoever with corruption of the graphics. I had initially assumed it was the battery booster causing the problems, though I was later advised that 'disablebod' was the likely cause of the corruption. Still, as the battery booster offered no extension in running time on the batteries, I ended up using NiMH cells with no booster. No 'disablebod' either. I've had no corruption since.
 

MartinM57

Moderator
Interesting info tmf - thanks for taking the time to respond. Seems to be some correlation with my scenario.

Also interestingly, I have a 470uF reservoir cap (as well as 0.1uf, of course) across the LDO 3.3v regulator feeding the 08M2 - with disablebod I was getting problems with the processor not (re-)starting properly when the battery power was removed briefly (1 second or so). The cure was to put a 3k3 resistor across the regulated supply to force the cap to discharge quickly - this fixed the symptom but had the unwelcome side effect of always drawing 1mA for what was meant to be a low power app.

With no disablebod now, I've removed the 3k3 and the 08M2 (re-)starts cleanly when the batteries are interrupted for just a few hundred milliseconds...the (default) enabled bod looks to be doing a great job.

Lesson seems to be to treat disablebod with some respect...
 

rq3

Senior Member
The default "enablebod" command does exactly what it's supposed to do: retain data at all levels when the power supply gets to a low level. The chip then gracefully shuts down, saving everything. Turning off this function by using the "disablebod" command negates all of these benefits, and leaves you (almost) completely in the dark as to the state of the silicon the next time you power it up.

The voltage at which this takes effect is in the data sheet for each device. It is NOT zero volts! If you expect your Picaxe to gracefully recover from a zero volt power supply because you have enabled "disablebod", you are mistaken, and using "disablebod" in a situation like an automotive application, where the power supply may vary wildly, may lead to some really bizarre effects.

In short, use "disablebod" with EXTREME caution!
 

MartinM57

Moderator
It's certainly not on my list of useful commands now :)

On another processor family I use, and forum I frequent, not setting bod (it's off by default) is a cardinal sin, most frowned upon, for good reasons. Not sure why I thought disablebod on a PICAXE would be any better for processor health.
 

MartinM57

Moderator
The 08M2 has the EEPROM shared with the program memory which prevents the use of '#no_data'. This is unfortunate as it means you cannot write a program to read out the EEPROM. If there is space you could add code that does that to your existing program though and hope the issue happens again.
So I thought, as a last resort, I'd just try it on my allegedly "corrupted" 08M2...
Code:
#picaxe 08m2
#no_data

for b1 = 0 to 255
	read b1, b2
	sertxd (#b1, " = ", #b2, cr, lf)
next
...works fine, obviously trashes the current program, but it printed out the previously programmed EEPROM showing the (one byte) corruption that had occurred exactly where I thought it had since the original programming :)
 

hippy

Technical Support
Staff member
it printed out the previously programmed EEPROM showing the (one byte) corruption that had occurred exactly where I thought it had since the original programming :)
That is unfortunate, and the first time I have ever seen evidence of such corruption. I would guess this is a consequence of operating 'out of spec' when voltage drops too low and DISABLEBOD actively allows that to occur.

I would presume there are four tiers of operating -

Above BOD voltage : Absolutely fine
Below BOD with ENABLEBOD : Absolutely fine; held in reset
Below BOD but above some lower level with DISABLEBOD : Absolutely fine
Below that lower level : All bets are off

I haven't looked at the datasheets so don't know what they say when operating below that lower level. It may be, as srnet suggests, that they are expecting some external means to be used to keep things okay when below that level and not allow it to run 'out of spec'.

PICAXE firmware obviously does contain code to write to EEPROM but there should be limited chance of runaway code randomly hitting that, or otherwise randomly enabling the hardware sequence which allows a write to EEPROM. But in unpredictable situations things are entirely unpredictable.

An 08M2 has separate program and EEPROM space, but does have a trick whereby a longer program can extend into EEPROM. It is not like the original 08's and friends, where downloading a program always overwrote EEPROM because they were completely shared; #NO_DATA can be used, as you have found.

And, yes, I would expect that if EEPROM can be corrupted then such corruption could affect a larger program which has extended into EEPROM. I recall that would only affect the 08M2 but would have to check; others have genuinely separate 2048 byte code storage plus 256 EEPROM.

Just out of interest; at what address was the byte corrupted, and what value was it corrupted from and to ?
 

MartinM57

Moderator
All understood Hippy, bar the subtlety of the 3rd/4th "tiers".

The presence of a large smoothing cap after the regulator would suggest that the supply voltage to the PICAXE will decay fairly slowly whenever power is removed - so the PICAXE is regularly passing through the "lower levels" of supply voltage every time it is switched off, and with DISABLEBOD in place, the bets are indeed off.

The EEPROM byte corrupted and the corruption were nothing special - byte 49, programmed value 68, corrupted value 2. I know another microprocessor family where it is recommended not to use EEPROM location 0, as the perceived wisdom is that location 0 is the most likely to get corrupted.
 

hippy

Technical Support
Staff member
The EEPROM byte corrupted and the corruption were nothing special - byte 49, programmed value 68, corrupted value 2.
A 'random address' and bits both cleared and set. That is useful to know for anyone trying to replicate the corruption, and avoids wasting time looking for a change which may not get detected by the test code.

It also helps to see how things match with the 'perceived wisdom' or predictions of what would likely happen ...

I know another microprocessor family where it is recommended not to use EEPROM location 0, as the perceived wisdom is that location 0 is the most likely to get corrupted.
It is always hard to tell if such perceived wisdom comes from experience, analysis or is simply 'best guess'. I would expect a runaway write to EEPROM to use the value in EEADR SFR, which would often be $00 at power on, but probably not after an actual read or write. Once things go wrong, it's hard to tell what else might have gone wrong, where bit corruptions may be occurring.

That also assumes it is a write operation being inadvertently activated. It could well be that it's not a result of a runaway program at all; it could potentially be invalid signal levels within the chip causing the corruption.

Neither runaway program nor low-voltage should cause any EEPROM corruption, or be likely to do something which could cause that corruption, and if there were such issues it would likely have been more evidenced and reported than it appears to be. It could well be that it's something else causing the issue. As to what it's not easy to say.
 

MartinM57

Moderator
...EEADR SFR, which would often be $00 at power on, but probably not after an actual read or write.
My understanding also.

The advice from "the other place" is actually based on many reported EEPROM corruptions and the best advice for mitigation is, in order:
- set BOD at a high level - e.g. 4.3v for a 5v supply (to avoid operation at low voltages)
- don't use location 0 (the most likely to be affected at power on)
- "park" the equivalent of the EEPROM SFR at an unused high EEPROM address after use e.g. after a READ of a wanted EEPROM address, READ the unused high EEPROM address (so that if EEPROM corruption does take place it might tend to be at this unused address)
...some or all could be applicable to PICAXE if circumstances dictate
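
As a rough sketch of the third point in PICAXE BASIC: follow each wanted READ with a dummy READ of a spare location. The address 255 and the variable names here are assumptions for illustration, and whether the PICAXE firmware really leaves the underlying address register pointing at the last location read is not confirmed anywhere in this thread.

```basic
symbol PARK_ADDR = 255   ; assumed: an EEPROM location the program never uses
symbol wanted    = b0    ; hypothetical variable for the value actually needed
symbol dummy     = b1    ; hypothetical throwaway variable

read 49, wanted          ; the read the program actually needs
read PARK_ADDR, dummy    ; dummy read so the last address touched is the spare one
```

If a stray write then does occur, the hope is that it lands on the spare location rather than on wanted data.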
 

rossko57

Senior Member
So I thought, as a last resort, I'd just try it on my allegedly "corrupted" 08M2...
Code:
#picaxe 08m2
#no_data
Are you all sure that really did what you expected, and did not instead actually write some ghost of the data held in a PE buffer from previous load of your 'real' program? Just a thought, seeing as it wasn't expected to work properly.
 

techElder

Well-known member
Please explore the real root cause of this particular problem from the programmer's perspective: disablebod.

Using disablebod to reduce power consumption in battery powered equipment has been recommended for extreme power saving in many instances on the forum.

Is there something that can be changed in disablebod to allow its use but still save power?
 

PhilHornby

Senior Member
08m2 #no_data

My experience of #no_data and the 08M2 has been that it functions as you would expect (though maybe not as documented) until the program size reaches 1792 bytes.
 

hippy

Technical Support
Staff member
Please explore the real root cause of this particular problem from the programmer's perspective: disablebod.
It's important though to note that no one knows exactly what the problem is or why, whether it's an issue related to DISABLEBOD, consequential to its use or something else.

Using disablebod to reduce power consumption in battery powered equipment has been recommended for extreme power saving in many instances on the forum.
That seems to have been the impression given but I am not sure where the assertion it reduces power consumption comes from. It will extend the run time of battery powered circuits but I am not sure it does actually reduce power consumption, other than it may disable some silicon.

I personally haven't paid much attention to BOD other than in testing to check it runs at a lower voltage when DISABLEBOD is used so cannot supply definitive answers.
 

Technical

Technical Support
Staff member
There is some confusion here so let's go back to basics:

1) Running any microcontroller below its minimum operating voltage (as defined in the Microchip datasheets) means 'anything may happen'. You are running a microcontroller part out of the manufacturer's specifications and hence incorrectly.

2) The purpose of a brown out detect is to prevent 1) above. It stops the chip running below the recommended minimum voltage - there is a reason the brownout circuit is built into the silicon to start with...

3) If you turn off the on-board internal brown out detect you should still have a circuit to prevent running at too low a voltage. Normally this is an external brown out circuit / brown out part instead.

4) If you do run at too low a voltage the program counter in the silicon can 'skip' to an incorrect location. Hence any random code may execute. Or it may just carry on working as expected.

5) It's almost impossible to 'accidentally' write to EEPROM memory within normal assembler during normal operation. The silicon has a very specific sequence of assembler commands and 'checksum' bytes that need to be output in sequence to activate a write. Hence it's not something that can occur accidentally. However, naturally the PICAXE 'write' command code does contain this sequence to allow the write command to work.


But the big 'but' is what happens inside the silicon if running at a voltage below the minimum operating specifications? Anything. The program counter may skip and hence run any bit of code anywhere in the chip that it finds. And that code may not work fully before the PC skips to another random bit of code somewhere else. And then, eventually, the code will stop running completely as the voltage goes very low. Or it may work perfectly normally and not show any problem at all.

The solution is quite simple and common sense - never run the chip below the manufacturer's minimum operating voltage. The internal brownout detect is the best way to ensure this occurs, hence the reason it is enabled by default.
 

lbenson

Senior Member
Using disablebod to reduce power consumption in battery powered equipment has been recommended for extreme power saving in many instances on the forum.
In my "Low-Power Battery Backup Reference Design" thread ( http://www.picaxeforum.co.uk/showthread.php?8353-Low-Power-Battery-Backup-Reference-Design&highlight=disablebod ), still blinking just 7 months shy of 10 years on the same 3 AA batteries, in the second post, Mycroft said 'For a further reduction in current, consider the DISABLEBOD command. According to the spec sheet it "significantly reduces current during sleep"'--so I added it.

That spec sheet (not otherwise referenced) would have been for the 08M pic chip if actually applicable. I have not tested to determine if its use actually did reduce current usage as opposed to just allowing the 08M to run at a lower-than-otherwise voltage (if, as Technical says, with unpredictable results).

In a post on BOD, Mycroft submitted this chart suggesting significant power savings with DISABLEBOD. http://www.picaxeforum.co.uk/showthread.php?4826-Low-Power-PICAXE-08M-BOD

Code:
         BOD Enabled  BOD Disabled
 Clock   Pause  Nap   Pause  Nap
 8 MHz    1.28  0.09   1.16  0.01
 4 MHz    0.74  0.09   0.66  0.01 
 2 MHz    0.52  0.09   0.45  0.01
 1 MHz    0.40  0.09   0.32  0.01
 500 kHz  0.34  0.09   0.26  0.01
 250 kHz  0.32  0.09   0.23  0.01
 125 kHz  0.30  0.09   0.22  0.01 
 31 kHz   0.12  0.09   0.03  0.01
The issue of unreliability is not addressed in that thread.

So there seem to be two factors at play with DISABLEBOD.

1) significant saving in current usage can be achieved in low power designs by using DISABLEBOD (at least with the 08M).
2) if the voltage goes below the point where shutdown would occur if BOD is enabled, unpredictable behavior may result.

Perhaps the solution when using DISABLEBOD is to occasionally test internally for voltage below spec, and shutting down when detected.
 

AllyCat

Senior Member
Hi,

Yes the Microchip data sheets for the base PICs show that the BOD consumes almost ten times more current than the rest of the chip in Sleep/Nap, so DISABLEBOD is a little hard to resist. ;)

I suspect that the write-error might have occurred because the supply voltage was falling very slowly. Thus a single "bad" address might accidentally jump into the PICaxe code whose purpose is to write into the EEPROM, but that code is then able to continue running and correctly "jump through all the hoops" that are necessary to write a byte into the EEPROM.

Obviously the "sensible" solution is to keep the BOD enabled, but if saving the last ten uA is really important, then I propose the following: Determine the voltage that the BOD uses (it should be in the data sheet) which for PICs, I believe is only a few hundred mV above the voltage that it will stop working correctly. Disable the BOD and when the program comes out of sleep/nap, measure the supply rail using CALIBADC10 (or I have devised some code snippets that can give higher resolution). Maybe even estimate the battery's internal resistance, which can be a better indicator of imminent failure. When the supply rail is getting close to the BOD level then Enable the BOD and let it shut down the processor as normal.

How often you would need to use CALIBADC will depend on how quickly the supply rail might decay; a (wired in) battery probably only needs to be checked occasionally, particularly if it's just running the PIC core, mainly in sleep/nap. However, if the supply might drop quickly then you could add a large capacitor or Supercap. But the CALIBADC code, if executed frequently, might then consume more power than saved by disabling the BOD. Also, before entering sleep/nap, ensure that any other hardware is disabled (such as the FVR used for calibadc) or you may drain even more power from the battery than the BOD module!
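
A minimal sketch of that idea, assuming an M2 part where CALIBADC10 measures the internal 1.024 V reference against the supply, so the reading is roughly 1047.552 / Vsupply and rises as the supply falls. The 2.2 V threshold (a reading of about 476) is only an illustrative guess and would need checking against the BOD level in the data sheet.

```basic
#picaxe 08m2

symbol reading    = w0     ; hypothetical variable for the raw CALIBADC10 result
symbol LOW_SUPPLY = 476    ; assumed threshold: about 2.2 V (1047.552 / 2.2)

main:
    calibadc10 reading           ; higher reading means lower supply voltage
    if reading >= LOW_SUPPLY then
        enablebod                ; close to the limit: let the BOD take over
    else
        disablebod               ; plenty of headroom: save the BOD current
    endif
    nap 5
    goto main
```

Comparing the raw reading avoids the integer division needed to convert to volts, at the cost of a less readable threshold.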

Cheers, Alan.
 

hippy

Technical Support
Staff member
Perhaps the solution when using DISABLEBOD is to occasionally test internally for voltage below spec, and shutting down when detected.
The best way to do that would perhaps be to have ENABLEBOD when coming out of a NAP and only disable it immediately before napping.
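
Something along these lines, as a sketch only; doWork is a hypothetical placeholder for whatever the program does while awake.

```basic
main:
    enablebod        ; BOD on whenever code is actually running
    gosub doWork
    disablebod       ; BOD off only for the duration of the nap
    nap 5
    goto main

doWork:
    ; hypothetical placeholder for the application's real work
    return
```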

The risk there is that the power rail could collapse to an out of spec voltage during the NAP and before BOR is re-enabled.

The only full solution would appear to be to keep BOR enabled or use external hardware to keep power rails from going 'out of spec' and that hardware might draw more current than simply leaving BOR enabled.
 

matchbox

Senior Member
I suspect that the write-error might have occurred because the supply voltage was falling very slowly. Thus a single "bad" address might accidentally jump into the PICaxe code whose purpose is to write into the EEPROM, but that code is then able to continue running and correctly "jump through all the hoops" that are necessary to write a byte into the EEPROM.
This thread was a good read guys.

I have had this same issue from the time I wrote a specific program for a 14m2.
My Eeprom storage is for a power failure backup. And I have brown out detection enabled. I have never used Disablebod on this project.
Its supply comes from a switch mode plug pack.
I have found too, that it is the slow drop in voltage that causes the Eeprom to get corrupted.
The only reason I use Eeprom storage is in case a power blackout occurs. But unfortunately the corruption only occurs if the plug pack loses power :rolleyes:

It will get corrupted if I turn off the mains power point switch to the plug pack. But if I just unplug the DC input, it does not happen.

I have an Eeprom reset on its operations switch, so if I hold it down and re-power the unit, it first clears the Eeprom memories to zero.
The only reason I even noticed this, is because the transmitted data on its OLED display appeared corrupted after a power outage.
 

matchbox

Senior Member
Rev Ed? Do you have any ideas about a firmware fix for this?
It seems pointless having a BOD if it doesn't work detecting slow voltage drop brown out conditions.

I can work around it.... But lately I seem to be using more program memory writing work arounds for syntax functions that don't work as stated, under all conditions.
 

AllyCat

Senior Member
Hi,

Are you saying that you are experiencing EEPROM corruption with the BOD ENABLED ? As far as I can see that is NOT the experience of any other posters in this thread. You may need to look for another cause.

The BOD is part of the basic Microchip hardware, it is not something created by Rev Ed. The only "mistake" that Rev Ed might have made is including the DISABLEBOD command to deactivate it ! For programmers who REALLY want to save the last few microamps (or use alternative hardware) and understand all the other requirements and risks, there is always the corresponding POKESFR command.

Cheers, Alan.
 

hippy

Technical Support
Staff member
Rev Ed? Do you have any ideas about a firmware fix for this?
Not really; we would need to investigate the issue, determine exactly what is going on and check if it is how it is reported to be, to be able to make any constructive comment upon it.

As AllyCat notes; if it is a silicon issue then it is almost certainly outside of our control and there may be little we could do in firmware to resolve the issue. The best one could perhaps suggest, if a particular situation causes corruption, would be to avoid such a situation occurring.

If the problem were the slow rate of decay of a collapsing power rail the usual solution to that would be to speed the collapse of that power rail to get it quickly through the circumstances which causes a problem to arise.
 

BESQUEUT

Senior Member
I had the same problem with an Atmel processor... An EEPROM checksum was written to avoid dramatic consequences...
 

matchbox

Senior Member
Thanks for the feedback guys.
That's correct. I do have the BOD enabled. Even with it enabled, the slow rate of voltage decay was causing a corruption. It takes the filter capacitor in the switch mode plug pack, at least 6seconds to discharge.

The way I overcame it was to use an ADC voltage read to detect the voltage falling below the supply level. Then it would write to the Eeprom and implement the reset command.
At the beginning of the program, I had another ADC voltage read that uses a do:loop until the voltage would rise above supply. Then it would read the Eeprom and start the program running again.
This fixed it for me.
 

hippy

Technical Support
Staff member
I would guess that what is happening is the power rail is falling, the program writes to Eeprom, this places a load on the supply, which causes it to drop further.

It could be the voltage is then below the level required to reliably write to Eeprom or goes below the BOD level, which resets the chip and prematurely terminates the write. The end result is that what should have been written was not written, or is wrong when subsequently read.

Checking the voltage before the write will help avoid that happening, will reduce the likelihood of it being too low until after the write completes.

A checksum as suggested by BESQUEUT will help identify if correct data has been written or not. Having two sets of Eeprom data can help identify what the latest valid data was when a program restarts -
Code:
SaveData:
  Gosub WriteBlock1
  Gosub ReadBlock1
  If checksumOkay = 1 Then
    Gosub WriteBlock2
  End If
  End

WriteBlock1:
  sequenceNumber = sequenceNumber + 1
  checksum = sequenceNumber + dataValue ^ $A5
  Write 10, sequenceNumber
  Write 11, dataValue
  Write 12, checksum
  Return

WriteBlock2:
  sequenceNumber = sequenceNumber + 1
  checksum = sequenceNumber + dataValue ^ $A5
  Write 20, sequenceNumber
  Write 21, dataValue
  Write 22, checksum
  Return

ReadBlock1:
  Read 10, sequenceNumber1
  Read 11, data1
  Read 12, checksum1
  ; Maths is done in 16 bits, so mask to a byte before comparing with the stored byte
  checksumOkay = sequenceNumber1 + data1 ^ $A5 & $FF - checksum1 Max 1 ^ 1
  Return

ReadBlock2:
  Read 20, sequenceNumber2
  Read 21, data2
  Read 22, checksum2
  ; Maths is done in 16 bits, so mask to a byte before comparing with the stored byte
  checksumOkay = sequenceNumber2 + data2 ^ $A5 & $FF - checksum2 Max 1 ^ 1
  Return
When the program then restarts at least one of those sets of data should be correct, and becomes the latest valid data written.
Code:
ReadData:
  Gosub ReadBlock1
  If checksumOkay = 1 Then
    Gosub ReadBlock2
    If checksumOkay = 1 Then
      ; Both have valid data
    Else
      ; Only block 1 is valid
    End If
  Else
    Gosub ReadBlock2
    If checksumOkay = 1 Then
      ; Only block 2 is valid
    Else
      ; Neither are valid
    End If
  End If
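
For the case where both blocks are valid, one possible way to choose between them is sketched below. It assumes the sequence numbers are byte variables and relies on block 2 always being written one step after block 1, as in the save routine above; the register assignments and the names useBlock and diff are hypothetical.

```basic
symbol sequenceNumber1 = b4   ; assumed register assignments
symbol sequenceNumber2 = b5
symbol useBlock        = b6   ; hypothetical result: 1 or 2
symbol diff            = b7

PickNewest:
    ; Storing into a byte wraps the difference modulo 256, so this
    ; still works when the sequence number rolls over from 255 to 0.
    diff = sequenceNumber1 - sequenceNumber2
    if diff = 1 then
        useBlock = 1          ; block 1 was rewritten but block 2 was not
    else
        useBlock = 2          ; a complete save: block 2 is the latest
    endif
    return
```

After a complete save block 2 is one ahead of block 1, so the difference is 255; a difference of 1 means the save was interrupted after block 1 was rewritten.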
 

matchbox

Senior Member
Thanks Hippy. I will go over it and see how it works.

The supply that I am using on this project is a 12.5volt - 1.5amp plug pack.
It also drives an RF module and a motor controller.

I'll give you some more insight into what appears to be taking place when power loss occurs.
The motor controller has a power LED on its board. When the plug pack is turned off at the mains switch, the LED will slowly go dull, then brighten for a split second, then continue to completely fade over the next 3 seconds.
It most certainly has plenty of time to write to the Eeprom... But my concern was that it would WRITE to it, and then it would try to restart the program and READ the Eeprom in the split second when the load appears to drop off the discharging filter cap.
That is what led me to make it WRITE if the voltage drops below 10.5v; then RESET, but not READ the Eeprom or restart the program again until the voltage recovered above 11.8v.
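
In outline, that scheme might look something like the sketch below. The ADC pin, the EEPROM address and the two thresholds are assumptions: the counts assume a resistor divider that maps 20 V to full scale on a 10-bit reading, giving roughly 537 for 10.5 V and 604 for 11.8 V.

```basic
#picaxe 14m2

symbol SUPPLY_PIN = B.1     ; assumed ADC input fed from a divider on the 12.5 V rail
symbol supply     = w0
symbol saved      = b2      ; hypothetical value to preserve across power loss
symbol LOW_COUNTS = 537     ; assumed: about 10.5 V
symbol OK_COUNTS  = 604     ; assumed: about 11.8 V

start:
    do
        readadc10 SUPPLY_PIN, supply
    loop until supply > OK_COUNTS    ; hold here until the rail has recovered
    read 0, saved                    ; only now trust the Eeprom

main:
    readadc10 SUPPLY_PIN, supply
    if supply < LOW_COUNTS then
        write 0, saved               ; save while there is still headroom
        reset                        ; restart; the wait loop above then blocks
    endif
    ; normal work goes here
    goto main
```

The gap between the two thresholds gives some hysteresis, so a brief recovery of the rail as the load drops off should not be enough to restart the program.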
 