User Guide
Quick Start
drgn debugs the running kernel by default; run sudo drgn. To debug a running
program, run sudo drgn -p $PID. To debug a core dump (either a kernel vmcore or
a userspace core dump), run drgn -c $PATH. Make sure to install debugging
symbols for whatever you are debugging.
Then, you can access variables in the program with prog['name'] and access
structure members with the dot (.) operator:
$ sudo drgn
>>> prog['init_task'].comm
(char [16])"swapper/0"
You can use various predefined helpers:
>>> len(list(bpf_prog_for_each()))
11
>>> task = find_task(115)
>>> cmdline(task)
[b'findmnt', b'-p']
You can get stack traces with stack_trace() and access parameters or local
variables with trace['name']:
>>> trace = stack_trace(task)
>>> trace[5]
#5 at 0xffffffff8a5a32d0 (do_sys_poll+0x400/0x578) in do_poll at ./fs/select.c:961:8 (inlined)
>>> poll_list = trace[5]['list']
>>> file = fget(task, poll_list.entries[0].fd)
>>> d_path(file.f_path.address_of_())
b'/proc/115/mountinfo'
Core Concepts
The most important interfaces in drgn are programs, objects, and helpers.
Programs
A program being debugged is represented by an instance of the drgn.Program
class. The drgn CLI is initialized with a Program named prog; unless you are
using the drgn library directly, this is usually the only Program you will
need.
A Program is used to look up type definitions, access variables, and read
arbitrary memory:
>>> prog.type('unsigned long')
prog.int_type(name='unsigned long', size=8, is_signed=False)
>>> prog['jiffies']
Object(prog, 'volatile unsigned long', address=0xffffffffbe405000)
>>> prog.read(0xffffffffbe411e10, 16)
b'swapper/0\x00\x00\x00\x00\x00\x00\x00'
The drgn.Program.type(), drgn.Program.variable(), drgn.Program.constant(), and
drgn.Program.function() methods look up those various things in a program.
drgn.Program.read() reads memory from the program’s address space. The []
operator looks up a variable, constant, or function:
>>> prog['jiffies'] == prog.variable('jiffies')
True
It is usually more convenient to use the [] operator rather than the
variable(), constant(), or function() methods unless the program has multiple
objects with the same name, in which case the methods provide more control.
Objects
Variables, constants, functions, and computed values are all called objects
in drgn. Objects are represented by the drgn.Object class. An object may exist
in the memory of the program (a reference):
>>> Object(prog, 'int', address=0xffffffffc09031a0)
Or, an object may be a constant or temporary computed value (a value):
>>> Object(prog, 'int', value=4)
What makes drgn scripts expressive is that objects can be used almost exactly
like they would be in the program’s own source code. For example, structure
members can be accessed with the dot (.) operator, arrays can be subscripted
with [], arithmetic can be performed, and objects can be compared:
>>> print(prog['init_task'].comm[0])
(char)115
>>> print(repr(prog['init_task'].nsproxy.mnt_ns.mounts + 1))
Object(prog, 'unsigned int', value=34)
>>> prog['init_task'].nsproxy.mnt_ns.pending_mounts > 0
False
Python doesn’t have all of the operators that C or C++ do, so some substitutions are necessary:
Instead of *ptr, dereference a pointer with ptr[0].
Instead of ptr->member, access a member through a pointer with ptr.member.
Instead of &var, get the address of a variable with var.address_of_().
A common use case is converting a drgn.Object to a Python value so it can
be used by a standard Python library. There are a few ways to do this:
The drgn.Object.value_() method gets the value of the object with the directly corresponding Python type (i.e., integers and pointers become int, floating-point types become float, booleans become bool, arrays become list, structures and unions become dict).
The drgn.Object.string_() method gets a null-terminated string as bytes from an array or pointer.
The int(), float(), and bool() functions do an explicit conversion to that Python type.
Objects have several attributes; the most important are drgn.Object.prog_
and drgn.Object.type_. The former is the drgn.Program that the object is from,
and the latter is the drgn.Type of the object.
Note that all attributes and methods of the Object class end with an
underscore (_) in order to avoid conflicting with structure or union members.
The Object attributes and methods always take precedence; use
drgn.Object.member_() if there is a conflict.
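The precedence rule can be pictured with a small pure-Python toy. This is only a sketch of the naming idea and not how drgn is implemented: ToyObject and its member dictionary are hypothetical, and only the names type_ and member_ are borrowed from the text above.

```python
class ToyObject:
    """Toy stand-in for a struct-like object; NOT drgn's implementation."""

    def __init__(self, type_name, members):
        # Attributes owned by the wrapper end with an underscore.
        self.type_ = type_name
        self._members = dict(members)

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # so wrapper attributes such as type_ always win over members.
        try:
            return self._members[name]
        except KeyError:
            raise AttributeError(name) from None

    def member_(self, name):
        # Explicit member lookup that bypasses the wrapper's own attributes.
        return self._members[name]


# A hypothetical structure that happens to have a member called "type_".
obj = ToyObject("struct example", {"count": 3, "type_": 7})
print(obj.count)             # ordinary member access
print(obj.type_)             # the wrapper's own attribute takes precedence
print(obj.member_("type_"))  # the structure member with the clashing name
```

In this toy, obj.type_ yields the wrapper's attribute, while member_("type_") reaches the member that the attribute shadows.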
References vs. Values
The main difference between reference objects and value objects is how they are
evaluated. References are read from the program’s memory every time they are
evaluated; values simply return the stored value (drgn.Object.read_() reads a
reference object and returns it as a value object):
>>> import time
>>> jiffies = prog['jiffies']
>>> jiffies.value_()
4391639989
>>> time.sleep(1)
>>> jiffies.value_()
4391640290
>>> jiffies2 = jiffies.read_()
>>> jiffies2.value_()
4391640291
>>> time.sleep(1)
>>> jiffies2.value_()
4391640291
>>> jiffies.value_()
4391640593
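The same behavior can be reproduced without a live target using a toy model. The sketch below is an illustration only: ToyMemory, ToyReference, and ToyValue are hypothetical classes, and just the method names value_() and read_() and the attribute name address_ mirror the ones described here.

```python
class ToyMemory:
    """Hypothetical stand-in for a program's memory: address -> integer."""

    def __init__(self):
        self.cells = {}


class ToyValue:
    """Holds a stored value; evaluating it never touches memory."""

    def __init__(self, value):
        self._value = value

    def value_(self):
        return self._value


class ToyReference:
    """Refers to an address; every evaluation re-reads memory."""

    def __init__(self, memory, address):
        self.memory = memory
        self.address_ = address

    def value_(self):
        return self.memory.cells[self.address_]

    def read_(self):
        # Read once and freeze the result as a value.
        return ToyValue(self.value_())


memory = ToyMemory()
memory.cells[0x1000] = 100
ref = ToyReference(memory, 0x1000)
snapshot = ref.read_()

memory.cells[0x1000] = 200  # the "program" keeps running and changes memory

print(ref.value_())       # the reference sees the new contents
print(snapshot.value_())  # the value still holds what was read earlier
```

The practical consequence is that a value is the safer choice when several computations must see one consistent reading, while a reference is the natural choice for polling something that changes.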
References have a drgn.Object.address_ attribute, which is the object’s
address as a Python int. This is slightly different from the
drgn.Object.address_of_() method, which returns the address as a drgn.Object.
Of course, both references and values can have a pointer type; address_ refers
to the address of the pointer object itself, and drgn.Object.value_() refers to
the value of the pointer (i.e., the address it points to):
>>> address = prog['jiffies'].address_
>>> type(address)
<class 'int'>
>>> print(hex(address))
0xffffffffbe405000
>>> jiffiesp = prog['jiffies'].address_of_()
>>> jiffiesp
Object(prog, 'volatile unsigned long *', value=0xffffffffbe405000)
>>> print(hex(jiffiesp.value_()))
0xffffffffbe405000
Absent Objects
In addition to reference objects and value objects, objects may also be absent.
>>> Object(prog, "int").value_()
Traceback (most recent call last):
File "<console>", line 1, in <module>
_drgn.ObjectAbsentError: object absent
This represents an object whose value or address is not known. For example, this can happen if the object was optimized out of the program by the compiler.
Any attempt to operate on an absent object results in a drgn.ObjectAbsentError
exception, although basic information including its type may still be accessed.
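In scripts, one way to cope with absent objects is to catch the exception and fall back to a placeholder. The following is a minimal sketch: value_or_default and the two fake classes are hypothetical names invented for this illustration, and a stand-in exception class is defined only so that the snippet also runs where drgn is not installed.

```python
try:
    from drgn import ObjectAbsentError
except ImportError:
    class ObjectAbsentError(Exception):
        """Stand-in used only when drgn is not installed."""


def value_or_default(obj, default=None):
    """Return obj.value_(), or default if the object is absent."""
    try:
        return obj.value_()
    except ObjectAbsentError:
        return default


class FakeAbsent:
    """Hypothetical object that behaves like an absent object."""

    def value_(self):
        raise ObjectAbsentError("object absent")


class FakePresent:
    """Hypothetical object that has a value."""

    def value_(self):
        return 4


print(value_or_default(FakeAbsent(), default="<absent>"))
print(value_or_default(FakePresent()))
```

With a real target, the same helper could be passed any drgn.Object, for example a local variable looked up in a stack frame that the compiler may have optimized out.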
Helpers
Some programs have common data structures that you may want to examine. For example, consider linked lists in the Linux kernel:
struct list_head {
    struct list_head *next, *prev;
};

#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)
When working with these lists, you’d probably want to define a function:
def list_for_each(head):
    pos = head.next
    while pos != head:
        yield pos
        pos = pos.next
Then, you could use it like so for any list you need to look at:
>>> for pos in list_for_each(head):
...     do_something_with(pos)
Of course, it would be a waste of time and effort for everyone to have to define these helpers for themselves, so drgn includes a collection of helpers for many use cases. See Helpers.
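To see the traversal logic in isolation, the generator can be exercised on plain Python objects instead of a kernel list. This is a sketch under stated assumptions: Node and list_add_tail are hypothetical stand-ins written for this example (they imitate a circular doubly linked list), and only list_for_each is taken from the text above.

```python
class Node:
    """Hypothetical stand-in for struct list_head, plus a name for printing."""

    def __init__(self, name):
        self.name = name
        # An empty circular list points at itself in both directions.
        self.next = self
        self.prev = self


def list_add_tail(new, head):
    """Insert new just before head, i.e. at the tail of the list."""
    last = head.prev
    last.next = new
    new.prev = last
    new.next = head
    head.prev = new


def list_for_each(head):
    """Yield every node after head until the walk returns to head."""
    pos = head.next
    while pos != head:
        yield pos
        pos = pos.next


head = Node("head")
for name in ("a", "b", "c"):
    list_add_tail(Node(name), head)

print([pos.name for pos in list_for_each(head)])
```

The head node itself is never yielded, which matches the C macro: the loop starts at head->next and stops as soon as it reaches head again.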
Validators
Validators are a special category of helpers that check the consistency of a data structure. In general, helpers assume that the data structures that they examine are valid. Validators do not make this assumption and do additional (potentially expensive) checks to detect broken invariants, corruption, etc.
Validators raise drgn.helpers.ValidationError if the data structure is not
valid or drgn.FaultError if the data structure is invalid in a way that causes
a bad memory access. They have names prefixed with validate_.
For example, drgn.helpers.linux.list.validate_list() checks the consistency of
a linked list in the Linux kernel (in particular, the consistency of the next
and prev pointers):
>>> validate_list(prog["my_list"].address_of_())
drgn.helpers.ValidationError: (struct list_head *)0xffffffffc029e460 next 0xffffffffc029e000 has prev 0xffffffffc029e450
drgn.helpers.linux.list.validate_list_for_each_entry() does the same checks
while also returning the entries in the list for further validation:
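from drgn.helpers import ValidationError
from drgn.helpers.linux.list import validate_list_for_each_entry

def validate_my_list(prog):
    # Walk the list with consistency checks, then check each entry's payload.
    for entry in validate_list_for_each_entry(
        "struct my_entry",
        prog["my_list"].address_of_(),
        "list",
    ):
        if entry.value < 0:
            raise ValidationError("list contains negative entry")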
Other Concepts
In addition to the core concepts above, drgn provides a few additional abstractions.
Threads
The drgn.Thread class represents a thread. drgn.Program.threads(),
drgn.Program.thread(), drgn.Program.main_thread(), and
drgn.Program.crashed_thread() can be used to find threads:
>>> for thread in prog.threads():
...     print(thread.tid)
...
39143
39144
>>> print(prog.main_thread().tid)
39143
>>> print(prog.crashed_thread().tid)
39144
Stack Traces
drgn represents stack traces with the drgn.StackTrace and drgn.StackFrame
classes. drgn.stack_trace(), drgn.Program.stack_trace(), and
drgn.Thread.stack_trace() return the call stack for a thread. The [] operator
looks up an object in the scope of a StackFrame:
>>> trace = stack_trace(115)
>>> trace
#0 context_switch (./kernel/sched/core.c:4683:2)
#1 __schedule (./kernel/sched/core.c:5940:8)
#2 schedule (./kernel/sched/core.c:6019:3)
#3 schedule_hrtimeout_range_clock (./kernel/time/hrtimer.c:2148:3)
#4 poll_schedule_timeout (./fs/select.c:243:8)
#5 do_poll (./fs/select.c:961:8)
#6 do_sys_poll (./fs/select.c:1011:12)
#7 __do_sys_poll (./fs/select.c:1076:8)
#8 __se_sys_poll (./fs/select.c:1064:1)
#9 __x64_sys_poll (./fs/select.c:1064:1)
#10 do_syscall_x64 (./arch/x86/entry/common.c:50:14)
#11 do_syscall_64 (./arch/x86/entry/common.c:80:7)
#12 entry_SYSCALL_64+0x7c/0x15b (./arch/x86/entry/entry_64.S:113)
#13 0x7f3344072af7
>>> trace[5]
#5 at 0xffffffff8a5a32d0 (do_sys_poll+0x400/0x578) in do_poll at ./fs/select.c:961:8 (inlined)
>>> prog['do_poll']
(int (struct poll_list *list, struct poll_wqueues *wait, struct timespec64 *end_time))<absent>
>>> trace[5]['list']
*(struct poll_list *)0xffffacca402e3b50 = {
        .next = (struct poll_list *)0x0,
        .len = (int)1,
        .entries = (struct pollfd []){},
}
Symbols
The symbol table of a program is a list of identifiers along with their address
and size. drgn represents symbols with the drgn.Symbol class, which is returned
by drgn.Program.symbol().
Types
drgn automatically obtains type definitions from the program. Types are
represented by the drgn.Type class and created by various factory functions
like drgn.Program.int_type():
>>> prog.type('int')
prog.int_type(name='int', size=4, is_signed=True)
You won’t usually need to work with types directly, but see Types if you do.
Platforms
Certain operations and objects in a program are platform-dependent; drgn allows
accessing the platform that a program runs on through the drgn.Platform class.
Command Line Interface
The drgn CLI is basically a wrapper around the drgn library that automatically
creates a drgn.Program. The CLI can be run in interactive mode or script mode.
Script Mode
Script mode is useful for reusable scripts. Simply pass the path to the script along with any arguments:
$ cat script.py
import sys
from drgn.helpers.linux import find_task
pid = int(sys.argv[1])
uid = find_task(pid).cred.uid.val.value_()
print(f'PID {pid} is being run by UID {uid}')
$ sudo drgn script.py 601
PID 601 is being run by UID 1000
It’s even possible to run drgn scripts directly with the proper shebang:
$ cat script2.py
#!/usr/bin/env drgn
mounts = prog['init_task'].nsproxy.mnt_ns.mounts.value_()
print(f'You have {mounts} filesystems mounted')
$ sudo ./script2.py
You have 36 filesystems mounted
Interactive Mode
Interactive mode uses the Python interpreter’s interactive mode and adds a few nice features, including:
History
Tab completion
Automatic import of relevant helpers
Pretty printing of objects and types
The default behavior of the Python REPL is to print the output of repr(). For
drgn.Object and drgn.Type, this is a raw representation:
>>> print(repr(prog['jiffies']))
Object(prog, 'volatile unsigned long', address=0xffffffffbe405000)
>>> print(repr(prog.type('atomic_t')))
prog.typedef_type(name='atomic_t', type=prog.struct_type(tag=None, size=4, members=(TypeMember(prog.type('int'), name='counter', bit_offset=0),)))
The standard print() function uses the output of str(). For drgn objects and
types, this is a representation in programming language syntax:
>>> print(prog['jiffies'])
(volatile unsigned long)4395387628
>>> print(prog.type('atomic_t'))
typedef struct {
        int counter;
} atomic_t
In interactive mode, the drgn CLI automatically uses str() instead of repr()
for objects and types, so you don’t need to call print() explicitly:
$ sudo drgn
>>> prog['jiffies']
(volatile unsigned long)4395387628
>>> prog.type('atomic_t')
typedef struct {
        int counter;
} atomic_t
Next Steps
Refer to the API Reference. Look through the Helpers. Read some Case Studies. Browse through the tools. Check out the community contributions.