Sunday, January 24, 2016

Linux Kernel Debugging using Ftrace - Basic Post

Ftrace is a tracing utility built directly into the Linux kernel. Many distributions already have various configurations of Ftrace enabled in their most recent releases. 

Note that this post doesn't cover all tracers in details.


Important Kernel Configs for Tracing and debugging :-

CONFIG_FUNCTION_TRACER
CONFIG_FUNCTION_DURATION_TRACER
CONFIG_FUNCTION_GRAPH_TRACER
CONFIG_STACK_TRACER
CONFIG_DYNAMIC_FTRACE

To Find what tracers are available :-

If the DEBUGFS is not mounted 
mount debugfs -t debugfs /debug

Go to the tracer directory as below 

[~]# cd /sys/kernel/debug/tracing 
[tracing]#   
[tracing]# cat available_tracers 
function_graph function sched_switch nop

Note that the Kernel has huge list of tracers, covering all that is not be possible for just a single blog post, however below are few links which describes all of them in detail.


http://lwn.net/Articles/365835/


https://lwn.net/Articles/366796/


I will surely come back with a blog post on debugging critical moments of Linux kernel using FTRACE. I am working toward that right now.


BTW, There is another important tip as below :-


Use Ftrace's  trace_printk() instead of printks :-

printk() is the king of all debuggers, but it has a problem. If you are debugging a high volume area such as the timer interrupt, the scheduler, or the network, printk() can lead to bogging down the system or can even create a live lock. It is also quite common to see a bug "disappear" when adding a few printk()s. This is due to the sheer overhead that printk() introduces.

Ftrace introduces a new form of printk() called trace_printk(). It can be used just like printk(), and can also be used in any context (interrupt code, NMI code, and scheduler code). What is nice about trace_printk() is that it does not output to the console. Instead it writes to the Ftrace ring buffer and can be read via the trace file.


Writing into the ring buffer with trace_printk() only takes around a tenth of a microsecond or so. But using printk(), especially when writing to the serial console, may take several milliseconds per write. The performance advantage of trace_printk() lets you record the most sensitive areas of the kernel with very little impact.


For example you can add something like this to the kernel or module:


    trace_printk("read foo %d out of bar %p\n", bar->foo, bar);



tracing_on() and tracing_off() for Kernel Debugging :-

The driver gets stuck in a sleep state and never wakes up. You will not be able to disable the tracer from user space when a kernel event occurs is difficult and which can in a trace buffer overflow and loss of the relevant information before the user can stop the trace.

There are two functions that work well inside the kernel: tracing_on() and tracing_off(). 
These two act just like echoing "1" or "0" respectively into the tracing_on file. 

If there is some condition that can be checked for inside the kernel, then the tracer may be stopped by adding something like the following:

                               if (test_for_error())
                                           tracing_off();

This gives flexibility to turn on and off tracing in the particular driver, where you can enable tracing on some error and use trace_printk() till you disable it later.

After examining the trace, or saving it off in another file with:
cat trace > ~/trace.sav



No comments:

Post a Comment