Monday, March 7, 2011

ARM exceptions and Debugging techniques


1. Reset
Common reasons for a reset:
  • Function call through a NULL pointer
  • Asserts / system aborts
  • Hardware error

A NULL function pointer is the most common cause of a reset. Calling through it sends the PC to address 0, which is the reset vector when the vector table sits at address 0. Check R14 (LR): if the jump was a normal call, R14 holds the return address, which points just past the instruction that jumped to PC 0.

When a condition check fails, a program can choose to ASSERT and bring the system down. The RTOS may also abort the system when it detects an invalid scenario. Putting breakpoints at the common abort points, such as the ASSERT handler, is the first step in debugging these problems.

Hardware errors are a less common reason for a reset. Usually the ABORT signal from a peripheral is routed through the Interrupt Controller. When the system resets, the pending interrupt is never handled, so the Interrupt Controller registers are preserved. Check those registers to find the hardware source that caused the reset.
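Breaking at "the common abort point" is easiest when every assertion failure funnels through one function. The following is only a sketch of that idea: `SYS_ASSERT`, `on_assert_failed` and `g_last_assert` are hypothetical names, not part of any particular RTOS, and a real handler would halt or reset the system instead of returning.

```c
#include <stddef.h>

/* Hypothetical record of the most recent assertion failure. Keeping it in a
 * global makes it easy to inspect from a debugger or a RAM dump. */
struct assert_record {
    const char *file;
    int line;
    unsigned count;
};

volatile struct assert_record g_last_assert = { NULL, 0, 0u };

/* Single choke point for all assertion failures: one breakpoint here catches
 * every failing SYS_ASSERT. A real target would halt or reset at the end of
 * this function; this sketch returns so that it can also run on a host. */
void on_assert_failed(const char *file, int line)
{
    g_last_assert.file = file;
    g_last_assert.line = line;
    g_last_assert.count++;
}

#define SYS_ASSERT(cond) \
    do { if (!(cond)) on_assert_failed(__FILE__, __LINE__); } while (0)

/* Example caller: flags a NULL buffer before it can be dereferenced. */
int checked_first_byte(const unsigned char *buf)
{
    SYS_ASSERT(buf != NULL);
    return (buf != NULL) ? buf[0] : -1;
}
```

With this in place, a single breakpoint on `on_assert_failed` stops the system at every failed check, and `g_last_assert` shows which file and line tripped it.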



2. Data Abort
Common reasons for a Data Abort:
  • Accessing uninitialized pointers
  • Stack overflow or array over-indexing (usually of a local array)
  • Invalid operation on a memory
  • Memory failure

The memory subsystem signals an abort to the core in any of the above cases.

Accessing an uninitialized pointer makes the ARM core fetch data from areas of memory that are not defined. An invalid operation on a particular memory device, e.g. a write to ROM, can also cause a Data Abort. In both cases the R14 value identifies the instruction that caused the abort: on classic ARM cores in ARM state, R14 in abort mode holds the address of the aborting instruction plus 8.

Stack overflow or array over-indexing corrupts data pointers, and a later access through a corrupted pointer can cause a Data Abort. (See the last section, which deals with stack and array overflows.)

In case of a memory failure, not much can be done. Check whether the Interrupt Controller has registers that show which memory system aborted.


3. Prefetch Abort
A Prefetch Abort is similar to a Data Abort. It is raised when an instruction is fetched from an invalid address.
The usual reasons are:
  • Calling a function through an uninitialized function pointer
  • Stack overflow or array over-indexing (usually of a local array), which can corrupt a saved return address or a function pointer
The same approaches as for a Data Abort apply.


4. Undefined Instruction
This exception occurs when the instruction fetched from memory cannot be decoded. It can be caused by:
  • Memory corruption from stack overflow or array over-indexing
  • Uninitialized function pointer

If memory that was supposed to contain code gets overwritten by some memory corruption, the instructions there become invalid. This error is less common in systems where code executes from FLASH. In such systems (called XIP, execute in place) only a small set of code is in RAM.

Calling a function through an uninitialized function pointer can also cause this exception, because the core then tries to execute whatever data the pointer happens to point at. Check R14 to find the offending instruction.
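For the three exceptions above, "check R14" comes down to a fixed subtraction. The sketch below assumes a classic ARM core (ARMv4 to ARMv7-A/R) executing in ARM state. The enum and the function name `faulting_address` are made up for illustration, and Thumb state or Cortex-M parts follow different rules.

```c
#include <stdint.h>

/* Exception kinds covered by this sketch. */
enum arm_exception {
    EXC_UNDEFINED,
    EXC_PREFETCH_ABORT,
    EXC_DATA_ABORT
};

/* Returns the address of the instruction that raised the exception, given
 * the banked link register (R14) captured on entry to the handler.
 * Assumed offsets (classic ARM core, ARM state only):
 *   undefined instruction: R14 = faulting address + 4
 *   prefetch abort:        R14 = faulting address + 4
 *   data abort:            R14 = faulting address + 8 */
uint32_t faulting_address(enum arm_exception exc, uint32_t banked_lr)
{
    switch (exc) {
    case EXC_DATA_ABORT:
        return banked_lr - 8u;
    case EXC_UNDEFINED:
    case EXC_PREFETCH_ABORT:
    default:
        return banked_lr - 4u;
    }
}
```

Looking up the returned address in the map file or disassembly then gives the function and instruction to start from.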


What are CRASHES?
Crashes are unhandled exceptions. If you are looking at a crash, you are basically looking at an exception. Find out which exception it is and start digging for clues depending on its type.


How does memory get corrupted?
Memory gets corrupted by uninitialized pointers, stack overflow, local array over-indexing and global array over-indexing.

When a system crashes because of an uninitialized pointer, the LR value directly gives the place where the pointer was accessed. This is usually one of the easiest memory corruptions to track down, and it is usually very consistent in nature.

Stack overflow usually happens under heavy system load. If you are running regression tests and get some aborts, it can be a stack overflow. One easy way to find out is to double the stack size and see whether the crash occurs again. If the crash goes away, you have more confidence to debug in this direction. [This is not a litmus test; the result is a hint, not proof.]

Some OSes write a FENCE to the lowest address of a stack. A FENCE is a special sequence of bytes, like CDCDCDCDCD or DEADDEADDEADDEAD, used to mark boundaries or to show unused memory areas. If the FENCE has been breached, there is a stack overflow.

Local array over-indexing can corrupt local pointer variables. If you see corrupted local variables, it can be a local array over-indexing issue.

If you see corruption in or around a global array, it can be global array over-indexing.
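The FENCE idea can be sketched in a few lines. This is only an illustration under assumptions: the pattern value, the fence length and all function names are hypothetical, and the stack is painted before it is put to use (for example when a task is created). Painting the whole stack rather than just the fence also shows how much of it was never touched, which is a more direct check than doubling the stack size.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

#define STACK_FILL   0xDEADDEADu  /* hypothetical fence / fill pattern */
#define FENCE_WORDS  4u           /* lowest words treated as the fence */

/* Fills an unused stack region with the pattern. Must run before the stack
 * is in use, otherwise live frames are overwritten. */
void stack_paint(uint32_t *stack_lowest, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        stack_lowest[i] = STACK_FILL;
    }
}

/* Counts the words, starting at the lowest address, that still hold the
 * pattern. A descending stack consumes memory from the top, so this is the
 * amount of stack that has never been used. */
size_t stack_unused_words(const uint32_t *stack_lowest, size_t words)
{
    size_t n = 0;
    while (n < words && stack_lowest[n] == STACK_FILL) {
        n++;
    }
    return n;
}

/* The fence counts as breached once any of the lowest FENCE_WORDS words has
 * been overwritten. */
bool stack_fence_breached(const uint32_t *stack_lowest, size_t words)
{
    return stack_unused_words(stack_lowest, words) < FENCE_WORDS;
}
```

A check like this fits an idle task or a context switch. It can miss an overflow that jumps over the fence without writing to it, so an intact fence is not proof that the stack is large enough.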


Array Over-Indexing vs Stack Overflow:
There is a basic difference between array over-indexing and stack overflow. The stack grows from higher memory addresses to lower ones, so an overflowing stack corrupts the memory below it. Array over-indexing corrupts data from lower addresses to higher ones, so it damages the memory above the array. In the memory dump, check which area is getting corrupted and in which direction, and you will know who the actual culprit is.
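The difference in direction can be reproduced on a small simulated memory region. This is a toy model, not a debugging tool: the fill pattern, the layout (an array below a painted gap, a stack above it) and all function names are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define FILL 0xCDCDCDCDu  /* hypothetical pattern for unused memory */

/* Full-descending push: the stack pointer moves to a LOWER address before
 * each store. No limit check, on purpose. */
uint32_t *push_word(uint32_t *sp, uint32_t value)
{
    sp--;
    *sp = value;
    return sp;
}

/* Array store: element i lives at base + i, so a growing index walks toward
 * HIGHER addresses. No bounds check, on purpose. */
void store_elem(uint32_t *base, size_t index, uint32_t value)
{
    base[index] = value;
}

/* Looks at a painted gap and reports which end was dirtied.
 * Returns +1 if the lowest word changed (something below the gap grew
 * upward into it, e.g. an over-indexed array), -1 if only the highest word
 * changed (something above the gap grew downward into it, e.g. a stack),
 * and 0 if both ends are intact. */
int dirty_end(const uint32_t *gap, size_t words)
{
    if (words == 0) {
        return 0;
    }
    if (gap[0] != FILL) {
        return +1;
    }
    if (gap[words - 1] != FILL) {
        return -1;
    }
    return 0;
}
```

In a real dump the same question applies: if the damage starts at the stack limit and extends downward, suspect the stack; if it starts just past the end of an array and extends upward, suspect the index.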
