UART write leads to device reset
# help
b
In what cases would writing to UART lead to a machine reset?
[jaguar] INFO: running Jaguar device 'wiro' (id: '402d9458-9cf0-42c7-9cee-09714a97d6bf') on 'http://192.168.86.200:9000'
[jaguar] INFO: program 7e4b355c-9066-ca1b-a357-14f20db16477 started
Hello test!
ESP-ROM:esp32c3-api1-20210207
Build:Feb  7 2021
rst:0x3 (RTC_SW_SYS_RST),boot:0xc (SPI_FAST_FLASH_BOOT)
Saved PC:0x40386f10
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fcd6100,len:0x48
load:0x403ce000,len:0x64c
load:0x403d0000,len:0x2318
entry 0x403ce000
[toit] INFO: starting <v2.0.0-alpha.64>
[wifi] DEBUG: connecting
[wifi] DEBUG: connected
[wifi] INFO: network address dynamically assigned through dhcp {ip: 192.168.86.200}
[wifi] INFO: dns server address dynamically assigned through dhcp {ip: [192.168.86.1]}
[jaguar] INFO: running Jaguar device 'wiro' (id: '402d9458-9cf0-42c7-9cee-09714a97d6bf') on 'http://192.168.86.200:9000'
My code:
import uart
import gpio
import reader show BufferedReader
import writer show Writer

main:
    print "Hello test!"

    port := uart.Port
      --rx=gpio.Pin 20
      --tx=gpio.Pin 21
      --baud_rate=9600

    writer := Writer port
    10.repeat:
      print "Writing"
      writer.write "Test \n\n\n"
      sleep --ms=1000
Also - thanks for the quick responses. Really appreciate y'all and excited about this project
f
The UART should never reset the board... I wonder if that was just coincidental. I wonder if a watchdog timer or something similar is triggering a reboot. So far we haven't tested that model for a longer time. I will flash one at the office and stress test it a bit today. See if I can reproduce it.
I can reproduce a crash when using the UART. My output is significantly more verbose, though:
Hello test!
Guru Meditation Error: Core  0 panic'ed (Memory protection fault). 
  memory type: IRAM0_SRAM
  faulting address: 0x40380284
  world: PMS_WORLD_0
  operation type: WRITE

Core  0 register dump:
MEPC    : 0x4200594c  RA      : 0x40382ac4  SP      : 0x3fca5e80  GP      : 0x3fc94200  
TP      : 0x3fc8cc00  T0      : 0x4005890e  T1      : 0x40391da6  T2      : 0x00009999  
S0/FP   : 0x3fcbb49c  S1      : 0x3c1b2101  A0      : 0x3fcbb839  A1      : 0x3fcb4fe4  
A2      : 0x3fca6da0  A3      : 0x3fc9c098  A4      : 0x3fca7b20  A5      : 0x00000035  
A6      : 0x3fcb4fd8  A7      : 0x3fcbb6b5  S2      : 0x3fcb4fe0  S3      : 0x3fc9b000  
S4      : 0x3fc96560  S5      : 0xfffe42a6  S6      : 0x00017dde  S7      : 0x3fc96830  
S8      : 0x3fca5f30  S9      : 0x00000004  S10     : 0x3c1b39fc  S11     : 0x3c1b0000  
T3      : 0x00000000  T4      : 0x3c1b7081  T5      : 0x00000580  T6      : 0x3bc3a72c  
MSTATUS : 0x00001881  MTVEC   : 0x40380001  MCAUSE  : 0x0000001a  MTVAL   : 0x00004942  
MHARTID : 0x00000000  

Stack memory:
3fca5e80: 0x3fcbab68 0x00000000 0x3fcba9a8 0x3c0f2e40 0x3fca0a38 0x3c1b2101 0x3fcbb49c 0x40382ac4
3fca5ea0: 0x3fcaa658 0x00000000 0x420058d0 0x4038e4e4 0x00000000 0x3c1b704d 0x3fcbb745 0x420e7b5e
3fca5ec0: 0x00000000 0x00000002 0x00000004 0x00000000 0x3fca08f0 0x3fca5f88 0x00000000 0x00000000
3fca5ee0: 0x00000000 0x00000000 0x00000000 0x3fca0a38 0x3fca0a20 0x00000000 0x00000000 0x3fca0958
3fca5f00: 0x3fca08f0 0x3fca5f88 0x3fcba9a8 0x4202ba16 0x00000000 0x00000000 0x00000000 0xffffffff
3fca5f20: 0x00000042 0x003f08eb 0x00000b40 0x420e858c 0x00000001 0x00000000 0x00000000 0x00000000
3fca5f40: 0x3fca5f88 0x00000001 0x3fc9dfe8 0x420e7c7e 0x3fca0a20 0xffffffff 0x3fc9dfe8 0x00000000
@mikkel.damsgaard any ideas?
m
I am OOO today. Will look tomorrow. Running with idf.py monitor could give a nice stack trace.
f
That helped reduce the test-case. We are already failing when trying to initialize the gpio. Failing example:
import gpio

main:
  pin := gpio.Pin 4
Has error:
Guru Meditation Error: Core  0 panic'ed (Memory protection fault). 
  memory type: IRAM0_SRAM
  faulting address: 0x40380284
0x40380284: toit::GpioResourceGroup::isr_handler(void*) at ??:?

  world: PMS_WORLD_0
  operation type: WRITE

Stack dump detected
Core  0 register dump:
MEPC    : 0x42005c6e  RA      : 0x40382d88  SP      : 0x3fca5e80  GP      : 0x3fc93e00  
0x42005c6e: toit::primitive_init(toit::Process*, toit::Object**) at ./build/esp32c3/./src/resources/gpio_esp32.cc:230

0x40382d88: toit::Object::is_marked() at ./build/esp32c3/./src/objects.h:73
 (inlined by) toit::Primitive::is_error(toit::Object*) at ./build/esp32c3/./src/primitive.h:1184
 (inlined by) toit::Interpreter::run() at ./build/esp32c3/./src/interpreter_run.cc:1052

TP      : 0x3fc8c8b8  T0      : 0x3c142db3  T1      : 0x40391884  T2      : 0x3c140ef5  
0x40391884: _calloc_r at /home/flo/code/opentoit/third_party/esp-idf/components/newlib/heap.c:70
Found it and fixed it. It was a misplaced IRAM_ATTR on a variable. The C3 has memory protection enabled that disallows writing into IRAM once the program has been loaded. It's a security feature. However, by adding an IRAM_ATTR to one of our global variables, we crashed the system when writing into that field. Unfortunately this happened when setting up the GPIO pins. The fix should be included in Toit v2.0.0-alpha.66.
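For readers unfamiliar with the attribute, here is a minimal sketch of the pattern described above. It is not the actual Toit source: the names isr_counter, hypothetical_isr_handler and simulate_interrupts are made up for illustration, and the empty fallback macro is only there so the snippet also compiles off-device. The idea is that IRAM_ATTR goes on code that has to run from IRAM (such as an interrupt handler), while data the handler writes to stays in ordinary writable RAM.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef ESP_PLATFORM
#include "esp_attr.h"  /* ESP-IDF header that defines IRAM_ATTR */
#else
#define IRAM_ATTR      /* assumption: empty fallback for host builds */
#endif

/* Problematic pattern, kept as a comment: the attribute places the
 * variable itself in IRAM, which per the explanation above is
 * write-protected on the C3 once the program is loaded, so the first
 * write to it would fault.
 *
 *   static IRAM_ATTR uint32_t isr_counter = 0;
 */

/* Safer pattern: a plain global lives in regular data RAM and stays writable. */
static volatile uint32_t isr_counter = 0;

/* IRAM_ATTR on the handler: the code is placed in IRAM, the data is not. */
static void IRAM_ATTR hypothetical_isr_handler(void *arg) {
  (void)arg;
  isr_counter++;
}

/* Hypothetical helper: call the handler n times, as if n interrupts fired,
 * and return the resulting counter value. */
uint32_t simulate_interrupts(uint32_t n) {
  isr_counter = 0;
  for (uint32_t i = 0; i < n; i++) {
    hypothetical_isr_handler(NULL);
  }
  return isr_counter;
}
```

On a host build the macro expands to nothing, so this only demonstrates where the attribute is placed; the memory protection fault itself would only show up on the device.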
m
Awesome find.
f
Wouldn't have found it without your tip of using the esp-idf. And, tbh, I think I added the IRAM_ATTR ...
b
Ah, cool find! Thanks 🙂
f
We should have caught this a bit earlier, but we haven't really set up a testing rig for physical devices yet. I'm running my esp32-c3 through a small stress-test now, and also tested the ADC. Hopefully things should go a bit more smoothly from now on.
alpha.66 has been released, but we haven't updated Jaguar yet.
k
Jaguar v1.9.11 is out with the new SDK.
b
Ah no problem, confirmed it works! Is there a model/devkit that y'all like and frequently test? I'd assume the standard esp32 devkit v1?
f
Mostly the wroom modules and standard devkits.
That said: it's a big goal to add more models and more peripherals. Just not yet on the top of our Todo lists.
b
Do you find the devkit v1 reliable? I was running into WiFi connection issues with C, Micropython, and Toit, but wasn't able to fully diagnose whether it was the board or my colleague's network. I also saw a lot of flakiness with Micropython but wasn't sure if it was the board, language, or both 😅 That is why I landed on the c3, which has been much better with WiFi connections, at least.
f
I don't think we have seen major differences. TLS connections are really memory hungry and are hard to get right, but otherwise we haven't really had issues. Can't really say anything about the c3.
Also note that different versions of the esp-idf might have different characteristics.
And different routers definitely deal differently with the chips.
Our Google wifi, for example, has a tendency to put you in timeout if you connect too often.
b
I see. Is it in (eventual) scope for artemis/jaguar to respond to these events outside of the application layer to support as many WiFi/router environments as possible?
Thanks for the help btw!
I'm new to this world
f
Not sure we really have control at that level. We do sometimes go a bit deeper when we see issues (or big benefits), though. For example, a TLS connection isn't just using mbedtls anymore. Parts of the handshake are done in Toit to improve the memory situation.
b
I see. It's very possible what I was running into was a fluke. Will keep testing things on my end and let you know what I find.
f
That would be great.