hi2c "blows up" (i2c comms fails at a point)

johndk

Senior Member
I have a rather large program (3631 bytes) which has a master 28x2 communicating with several i2c slaves, one of which is another 28x2. I pull data from an external eeprom, send text to an OLED, and get timing from an RTC. All works well until I get to a certain part of the code. Then the master loses the ability to communicate with the slave 28x2 (it can still pull data from the eeprom and send it to the OLED).

One would think that was a simple coding problem and I spent a good week trying to find it. No such luck. I'm using the same subs that communicate successfully earlier in the program and introduce no new variables. I've even isolated the variables (push and pop with sp) to be sure. I've checked that I'm not overwriting parts of the sp (checked on both master and slave) to corrupt my data. And I've checked the slave side, which is behaving as it should. It gets all the proper transmissions until the master blows up. Then it gets nothing.

I've checked and rewritten portions of the code and tried various workarounds, all to no avail. As I said, the code is large, taking almost all of slot 0, and I have some portions running out of the other 3 slots. But where it blows up, it has already returned from slot 2 and communicated with the slave successfully several times. When it blows up, the slave receives nothing, but the master "reads" the slave sp locations I have set up for communications and reports that it sent a 255 and that the slave has answered with a 117.

My communication codes are all in the range of 0 - 40. I know that a 255 will come up if the master can't read the slave, but notice that I don't get two 255s. Also keep in mind that the slave end actually gets none of this. It seems to me that the master is suddenly lost and reading another address or another slave. But I call hi2csetup with the correct parameters just prior to the problem, so that shouldn't be. Once the master blows up, it stays in that mode. It doesn't eventually straighten out even though I have it re-trying. (Keep in mind that the master continues to communicate with the external eeprom and OLED during this time.)

These are the slave addresses I'm using:
;i2c addresses
symbol slave_addressA = %10110000 ;slave address for 28x2 = 176
symbol RTC_addr = %11010000 ;address of slave RTC = 208
symbol lc256_addr = %10100000 ;$A0 eeprom address = 160
symbol OLED_ADDR = %01111000 ;$78 OLED I2C address = 120

One possibility is that the master briefly relinquishes the i2c bus so that the slave can become master, read the external eeprom, and then go back into slave mode. It's just a quick read, and the slave does it successfully. I have the timing on the master side set up so there should be no chance of collision (I've even tried increasing the hi2csetup off time on the master side to no avail).

My only guess at this point is that there is some kind of overflow taking place that affects whatever memory space hi2c is using. Is that possible? Has anyone run into anything like this?

Hardly any hair left !!!

John
 


inglewoodpete

Senior Member
Just had a quick look at your code. I don't have the time to bury myself in it but one thing stands out - The hi2cSetup commands for some devices have i2cfast_8 and others have i2cslow_8. Since all slaves share the i2c bus, the bus must run at the speed of the slowest device. This is because every slave must interpret the data on the bus in order to respond to its own address.

As well, as far as I'm aware, a PIC running as an i2c slave can only run reliably at i2cSlow. Also, how long is the i2c bus? Anything longer than about a metre at i2cSlow will see a rise in errors.
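To illustrate the single-speed point, here is a minimal sketch in which the master selects each device at the same slow bus speed. The address symbols are the ones from the first post; the subroutine names are hypothetical, and the i2cbyte/i2cword choices are assumptions that would need checking against each device's datasheet.

```
; Sketch only: every device is addressed at the same (slow) bus speed.
; Address symbols are from the first post. The i2cbyte / i2cword
; choices below are assumptions, not taken from the original program.
symbol slave_addressA = %10110000   ; 28x2 slave
symbol RTC_addr       = %11010000   ; RTC
symbol lc256_addr     = %10100000   ; external eeprom
symbol OLED_ADDR      = %01111000   ; OLED

select_slave:                       ; hypothetical subroutine names
    hi2csetup i2cmaster, slave_addressA, i2cslow_8, i2cbyte
    return

select_rtc:
    hi2csetup i2cmaster, RTC_addr, i2cslow_8, i2cbyte
    return

select_eeprom:
    ; assuming a 24LC256-style part, which uses word addressing
    hi2csetup i2cmaster, lc256_addr, i2cslow_8, i2cword
    return

select_oled:
    hi2csetup i2cmaster, OLED_ADDR, i2cslow_8, i2cbyte
    return
```

Keeping the speed constant in one place like this also makes it easy to rule the bus speed in or out while chasing a fault.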
 

johndk

Senior Member
I've had them all running at i2cslow with the same result. I just recently changed some to fast just to see if that would make a difference. As to the bus requiring just a single speed, I wonder if that's true. My speed change-up seems to work just fine. But again, that is not the source of the problem. Running everything on slow yields the same result.

As to the code, I only sent slot 0 as I don't believe the other slots have any impact. But I'd be happy to post them if anyone thinks it might be useful.

The problem occurs consistently at line 507 (or somewhere close to that - I might have made a few changes since I posted). It is at the Slave_Get_Status after the send_params message has been sent. That's where I have the master release the i2c to slave temporarily. The slave receives the send_params, successfully downloads the parameters and then waits to hear from master. It sees nothing after that point.

I first had the master download the params directly to the slave sp via i2c (no need for the slave to become master). But that blew up also. In fact, having the parameters delivered via external eeprom was my attempt at a work-around.
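For readers following along, a direct push to the slave scratchpad might look roughly like the sketch below. It is not the original code: the scratchpad location, the variables b4 to b9, and the pause length are all hypothetical, and i2cbyte addressing for the slave is an assumption.

```
; Sketch only: master writes a block of parameters into the slave's
; scratchpad, then reads the first byte back as a sanity check.
symbol slave_addressA = %10110000   ; from the first post
symbol PARAM_BASE     = 10          ; hypothetical scratchpad location

push_params:
    hi2csetup i2cmaster, slave_addressA, i2cslow_8, i2cbyte
    b9 = 0                                  ; hypothetical error flag, 0 = ok
    hi2cout PARAM_BASE, (b4, b5, b6, b7)    ; b4..b7 hold the parameters
    pause 10                                ; hypothetical settle time
    hi2cin PARAM_BASE, (b8)                 ; read the first byte back
    if b8 <> b4 then
        b9 = 1                              ; write did not land, caller can retry
    endif
    return
```

On the slave side, an X2 set up with hi2csetup i2cslave receives these bytes straight into its scratchpad; it can watch hi2cflag to see that a write has arrived and use hi2clast to find the last location written.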
 

srnet

Senior Member
johndk said: "As to the bus requiring just a single speed, I wonder if that's true. My speed change-up seems to work just fine."

A slower device might mis-read the faster data on the bus (or vice versa) and react to bus signals that it should not.

Maybe you appear to get away with it, but one day .........
 

Technical

Technical Support
Staff member
It's not that common to turn the i2c master off and let something else become the master. You may get unexpected results if the other master has not completely released the bus before you restart the first master.
It could also possibly be that you are getting unexpected residual bytes into the master's i2c buffer whilst the other devices are using the bus (the i2c buffer is separate silicon in the chip and so a 'dedicated i2c' buffer).

It may be worth flushing the buffer before reconfiguring the first device as an i2cmaster. However you really need to isolate the issue with a much simpler program first, narrow the issue down in a simple test case and make it repeatable and then we should be able to find a solution.
 

johndk

Senior Member
As to srnet, point taken. I assumed a fast bus speed would be ignored (or unread) by slower devices. But I suppose we might get some unexpected interactions. But again, that's not the problem. I put everything at slow and get the same problem.

Again a reminder that I started this adventure just pushing the parameters to the slave sp. That worked fine until I made some program revisions (primarily adding the external eeprom and display routines). After that, I could no longer push the parameters at this part of the program. Hence, I came up with the "slave master the bus" work-around.

Technical: Which i2c buffer are you referring to? How do I get to it to flush it? I suppose I could try simplifying the program by removing the ex eeprom and display portions. But they're somewhat interdependent so removing them wholesale wouldn't tell me much. If I can access that i2c buffer, I would like to give that a try.

By the way, answer to an earlier question. My i2c bus is entirely on a PCB. Max length less than 10 cm.
 

srnet

Senior Member
Incidentally, I presume your I2C setup did not actually explode ?

It would be very unusual for sure.
 

johndk

Senior Member
No, the explosion is purely virtual. I get mixed bit spatters all over my i2c comm. But they are magically and thankfully all sorted out and put back in place when I reboot.

I'm hoping someone can supply a virtual mop so I can clean up without a restart. How about that i2c "buffer flusher", Technical ?
 

Technical

Technical Support
Staff member
Before switching on the hi2c module again (via hi2csetup) you could read SSP1BUF (peeksfr $C9,b1), which also clears it, and possibly clear any bus collision flags in SSP1CON1 (pokesfr $C6,0). You will need to check the PIC datasheet to get the correct SFR addresses for the chip you are using. For the 28X2 they are 0xC9 and 0xC6, see page 83 of http://ww1.microchip.com/downloads/en/DeviceDoc/41412F.pdf

Also test that both the SDA and SCL pins are actually high (idle state = pulled high by the i2c bus resistors, with no other device currently pulling them low) before switching the hi2c module back on.
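Put together, the suggestion above could be sketched as a single subroutine. This is only an illustration: the subroutine name, the use of b1 to b3, and the 200 ms timeout are hypothetical; the pin names assume the 28X2's hi2c pins (SCL on C.3, SDA on C.4); and it assumes those pins read as ordinary inputs once hi2c is switched off.

```
; Sketch only: make sure the bus is idle, flush the i2c buffer and
; clear the flags, then become master again.
symbol slave_addressA = %10110000   ; from the first post
symbol SSP1BUF_SFR    = $C9         ; 28X2 SFR address given above
symbol SSP1CON1_SFR   = $C6         ; 28X2 SFR address given above
symbol SCL_IN         = pinC.3      ; assumed 28X2 hi2c SCL pin
symbol SDA_IN         = pinC.4      ; assumed 28X2 hi2c SDA pin

reclaim_bus:                        ; hypothetical subroutine name
    hi2csetup off                   ; leave the bus while the other master uses it
    b2 = 0                          ; timeout counter
    b3 = 0                          ; result: 0 = failed, 1 = master again
wait_idle:
    if SCL_IN = 1 and SDA_IN = 1 then goto bus_idle
    pause 1
    inc b2
    if b2 < 200 then goto wait_idle
    return                          ; bus never went idle, b3 stays 0
bus_idle:
    peeksfr SSP1BUF_SFR, b1         ; read, and so clear, the i2c buffer
    pokesfr SSP1CON1_SFR, 0         ; clear any bus collision flags
    hi2csetup i2cmaster, slave_addressA, i2cslow_8, i2cbyte
    b3 = 1
    return
```

The caller would gosub reclaim_bus and check b3 before attempting any hi2cin or hi2cout.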
 

johndk

Senior Member
Before I got Technical's "buffer flusher" info, I bit the bullet and started progressively shutting down portions of the program.

I started with the display functions, added (subtracted, actually) the openlog functions, the one-wire functions, and finally the calibration function. The problem persisted with only the leanest skeleton of the program left. Aaargh!

Next move, recheck all the memory locations being used (again). I actually found that I was over-writing one byte in slave sp. But that one byte could not have caused the problem I was having. It would simply have caused some run-time anomalies (which it did - I just hadn't found it yet). At that point I thought, what the heck, let's go back to pushing the parameters directly to slave sp and avoid that potential pitfall while I'm this deep into it. So I dumped the "slave master the bus" and went back to direct write of slave sp.

The final result. .... drum roll ..... IT WORKED ! I have no idea what actually made the difference, but I'm guessing it was the "slave master the bus" because that one overwrite could not have caused the problem I was having.

Long and short: I don't know why the problem presented itself when I was pushing direct to slave sp, but apparently my brilliant solution caused its own (more interesting?) problem.

Lesson: Bit spatters are no fun. Cleanup can be very work intensive.

Thanks everyone for helpful suggestions.
 