Package bap

Python interface to BAP.


Porcelain Interface
===================

The high-level interface lets you run ``bap`` and get back the information
that it was able to infer from the file. It consists of a single function,
``bap.run``, that drives ``bap`` for you. It is quite versatile, so read its
documentation for further information.


Example
-------

>>> import bap
>>> proj = bap.run('/bin/true', ['--symbolizer=ida'])
>>> text = proj.sections['.text']
>>> main = proj.program.subs.find('main')
>>> entry = main.blks[0]
>>> next_blk = main.blks.find(entry.jmps[0].target.arg)

It is recommended to explore the interface using ipython or a similar
interactive toplevel.

We use ADT syntax to communicate with Python. It is a syntactic
subset of the Python grammar, so ``bap`` in fact just returns a valid
Python program, which is then evaluated. ADT stands for Algebraic Data
Type and is described in the ``adt`` module. For non-trivial tasks,
consider using the ``adt.Visitor`` class.
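
The idea can be illustrated without ``bap`` installed. The sketch below
is not the real ``adt`` module: the classes ``ADT``, ``Var``, ``Int``,
``Move``, ``Visitor`` and ``VarCollector`` are hypothetical stand-ins,
written only to show how text in ADT syntax becomes a tree of objects
when evaluated, and how a visitor can walk that tree.

```python
class ADT(object):
    """Stand-in base class: remembers the constructor name and arguments."""
    def __init__(self, *args):
        self.constr = type(self).__name__
        self.arg = args

    def __repr__(self):
        return "{0}({1})".format(self.constr, ", ".join(map(repr, self.arg)))


# Hypothetical node types, chosen only for this illustration.
class Var(ADT):
    pass


class Int(ADT):
    pass


class Move(ADT):
    pass


class Visitor(object):
    """Dispatch on the constructor name, then descend into the arguments."""
    def run(self, node):
        if isinstance(node, ADT):
            handler = getattr(self, "visit_" + node.constr, None)
            if handler is not None:
                handler(node)
            for child in node.arg:
                self.run(child)


class VarCollector(Visitor):
    """Collect the name of every Var node in the tree."""
    def __init__(self):
        self.names = []

    def visit_Var(self, node):
        self.names.append(node.arg[0])


# Text in ADT syntax is an ordinary Python expression ...
text = 'Move(Var("x"), Int(1, 32))'
# ... so evaluating it with the constructors in scope rebuilds the tree.
scope = {"Var": Var, "Int": Int, "Move": Move, "__builtins__": {}}
tree = eval(text, scope)

collector = VarCollector()
collector.run(tree)
```

Here ``tree`` is a ``Move`` object whose arguments are a ``Var`` and an
``Int``, and ``collector.names`` holds the variable names found in it.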



Plumbing interface [rpc]
========================

The low-level interface provides access to internal services. It
uses ``bap-server`` and talks to bap over an RPC protocol. It is an
optional extra and must be installed explicitly with the ``[rpc]`` tag.

In a few keystrokes:

    >>> import bap
    >>> print('\n'.join(insn.asm for insn in bap.disasm("\x48\x83\xec\x08")))
        decl    %eax
        subl    $0x8, %esp

A more complex example:

    >>> img = bap.image('coreutils_O0_ls')
    >>> sym = img.get_symbol('main')
    >>> print('\n'.join(insn.asm for insn in bap.disasm(sym)))
        push    {r11, lr}
        add     r11, sp, #0x4
        sub     sp, sp, #0xc8
        ... <snip> ...

The ``bap`` package exposes two functions:

#. ``disasm`` returns a disassembly of the given object
#. ``image`` loads the given file

Disassembling things
--------------------

``disasm`` is a Swiss Army knife for disassembling things. It takes either a
string object or something returned by the ``image`` function, e.g.,
an image, a segment, or a symbol.

The ``disasm`` function returns a generator yielding instances of class
``Insn``, defined in module :mod:`asm`. Each instance has the following attributes:

* name - instruction name, as the underlying backend names it
* addr - address of the first byte of instruction
* size - overall size of the instruction
* operands - list of instances of class ``Op``
* asm - assembler string, in native assembler
* kinds - instruction meta properties, see :mod:`asm`
* target - instruction lifter to a target platform, e.g., see :mod:`arm`
* bil - a list of BIL statements, describing instruction semantics.
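
As a rough sketch of how ``addr``, ``size`` and ``asm`` fit together,
the snippet below uses a hypothetical ``Insn`` stand-in (a plain
``namedtuple`` with integer addresses, not the class from :mod:`asm`)
and two helper functions, ``listing`` and ``is_contiguous``, that are
not part of ``bap``. The sample values mirror the two-instruction
disassembly of ``"\x48\x83\xec\x08"`` shown earlier, assuming it starts
at address 0.

```python
from collections import namedtuple

# Stand-in for asm.Insn, carrying only three of the documented attributes.
Insn = namedtuple("Insn", ["addr", "size", "asm"])


def listing(insns):
    """Format each instruction as '<address>: <assembler string>'."""
    return ["{0:#06x}: {1}".format(insn.addr, insn.asm) for insn in insns]


def is_contiguous(insns):
    """True if every instruction starts where the previous one ends."""
    insns = list(insns)
    return all(prev.addr + prev.size == cur.addr
               for prev, cur in zip(insns, insns[1:]))


# 0x48 decodes as a one-byte instruction, 0x83 0xec 0x08 as a three-byte one.
sample = [
    Insn(addr=0, size=1, asm="decl %eax"),
    Insn(addr=1, size=3, asm="subl $0x8, %esp"),
]

lines = listing(sample)
contiguous = is_contiguous(sample)
```

With real ``Insn`` objects the ``addr`` attribute may not be a plain
integer, so the formatting would need to be adapted.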

The ``disasm`` function also accepts a number of keyword arguments, to name a few:

* server - either a URL to a bap server or a dictionary containing a port
  and/or an executable name
* arch
* endian  (instance of ``bil.Endian``)
* addr    (should be an instance of type ``bil.Int``)
* backend
* stop_conditions

The arguments are, I hope, self-describing. ``stop_conditions`` is a list of
``Kind`` instances defined in :mod:`asm`. If the disassembler meets an instruction
that is an instance of one of these kinds, it stops.
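
The stopping rule can be sketched in plain Python. Everything below is
a stand-in: ``Kind``, ``Call``, ``Return`` and ``Insn`` are hypothetical
classes rather than those from :mod:`asm`, and ``take_until`` is an
illustrative helper, not a ``bap`` function. The sketch also assumes
that the instruction that triggers the stop is itself still yielded,
which the text above does not specify.

```python
from collections import namedtuple


# Hypothetical kinds, standing in for the Kind classes of the asm module.
class Kind(object):
    pass


class Call(Kind):
    pass


class Return(Kind):
    pass


# Stand-in instruction: an assembler string plus a list of Kind instances.
Insn = namedtuple("Insn", ["asm", "kinds"])


def take_until(insns, stop_conditions):
    """Yield instructions up to and including the first one whose kinds
    match the type of any of the given stop conditions."""
    stop_types = tuple(type(cond) for cond in stop_conditions)
    for insn in insns:
        yield insn
        if any(isinstance(kind, stop_types) for kind in insn.kinds):
            return


stream = [
    Insn("push %rbp", []),
    Insn("callq foo", [Call()]),
    Insn("retq", [Return()]),
    Insn("nop", []),
]

until_return = [insn.asm for insn in take_until(stream, [Return()])]
until_call = [insn.asm for insn in take_until(stream, [Call(), Return()])]
```

With an empty list of stop conditions the helper simply yields every
instruction in the stream.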

Reading files
-------------

To read and analyze a file, load it with the ``image``
function. This function returns an instance of class ``Image`` that
lets you discover information about the file and perform different
queries. It has a ``get_symbol`` function to look up a symbol in the
file by name, and the following set of (self-describing) attributes:

* arch
* entry_point
* addr_size
* endian
* file (file name)
* segments

``segments`` is a list of instances of the ``Segment`` class, which also has a
``get_symbol`` function and the following attributes:

* name
* perm (a list of ['r', 'w', 'x'])
* addr
* size
* memory
* symbols

``symbols`` is a list of, you guessed it, ``Symbol`` instances, each having the
following attributes:

* name
* is_function
* is_debug
* addr
* chunks

Here ``chunks`` is a list of instances of the ``Memory`` class, each having the
following attributes:

* addr
* size
* data

Here ``data`` is the actual string of bytes.
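
To make the nesting concrete, here is a minimal sketch that walks an
image down to its chunks. It does not call ``bap``: ``Image``,
``Segment``, ``Symbol`` and ``Memory`` are ``namedtuple`` stand-ins that
only mirror the attribute names listed above, ``function_sizes`` is a
hypothetical helper, and all sample values (addresses, sizes, names) are
made up for the illustration.

```python
from collections import namedtuple

# Stand-ins mirroring the documented attribute names.
Memory = namedtuple("Memory", ["addr", "size", "data"])
Symbol = namedtuple("Symbol",
                    ["name", "is_function", "is_debug", "addr", "chunks"])
Segment = namedtuple("Segment",
                     ["name", "perm", "addr", "size", "memory", "symbols"])
Image = namedtuple("Image",
                   ["arch", "entry_point", "addr_size", "endian", "file",
                    "segments"])


def function_sizes(img):
    """Map each function symbol in an executable segment to the total
    size of its chunks."""
    sizes = {}
    for seg in img.segments:
        if "x" not in seg.perm:
            continue  # skip segments that are not executable
        for sym in seg.symbols:
            if sym.is_function:
                sizes[sym.name] = sum(chunk.size for chunk in sym.chunks)
    return sizes


# Made-up sample: one function split into two chunks, one small function,
# and a data symbol living in a non-executable segment.
main = Symbol("main", True, False, 0x1000,
              [Memory(0x1000, 0x10, b"\x90" * 0x10),
               Memory(0x1020, 0x08, b"\x90" * 0x08)])
init = Symbol("init", True, False, 0x1010,
              [Memory(0x1010, 0x04, b"\x90" * 0x04)])
table = Symbol("table", False, False, 0x2000,
               [Memory(0x2000, 0x20, b"\x00" * 0x20)])

text = Segment(".text", ["r", "x"], 0x1000, 0x30, None, [main, init])
data = Segment(".data", ["r", "w"], 0x2000, 0x20, None, [table])
img = Image("x86_64", 0x1000, 64, "little", "example.bin", [text, data])

sizes = function_sizes(img)
```

Because the helper relies only on attribute names, the same loop should
carry over to real ``Image`` objects, provided their ``size`` values
behave like numbers.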
