Source Code for Package bap

r"""Python interface to BAP.


Porcelain Interface
===================

The high-level interface runs ``bap`` and returns the information that
it was able to infer from the file. It consists of a single function,
``bap.run``, that drives ``bap`` for you. It is quite versatile, so read
its documentation for further information.


Example
-------

>>> import bap
>>> proj = bap.run('/bin/true', ['--symbolizer=ida'])
>>> text = proj.sections['.text']
>>> main = proj.program.subs.find('main')
>>> entry = main.blks[0]
>>> next = main.blks.find(entry.jmps[0].target.arg)

It is recommended to explore the interface using ipython or a similar
interactive toplevel.

We use ADT syntax to communicate with Python. It is a syntactic
subset of the Python grammar, so in fact ``bap`` just returns a valid
Python program that is then evaluated. ADT stands for Algebraic Data
Type and is described in the ``adt`` module. For non-trivial tasks, one
should consider using the ``adt.Visitor`` class.
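
The idea that ADT text is itself evaluable Python can be illustrated with
a small, self-contained sketch. The ``ADT`` base class, the ``Int``, ``Var``
and ``Add`` constructors, and the ``loads`` helper below are hypothetical
stand-ins written for this illustration; they are not the definitions found
in the ``adt`` module, and the restricted namespace is a convenience, not a
security sandbox.

```python
class ADT(object):
    # Hypothetical base class: keeps the constructor arguments in ``arg``.
    def __init__(self, *args):
        self.arg = args[0] if len(args) == 1 else args

    def __repr__(self):
        args = self.arg if isinstance(self.arg, tuple) else (self.arg,)
        return "{}({})".format(type(self).__name__,
                               ", ".join(repr(a) for a in args))

    def __eq__(self, other):
        return type(self) is type(other) and self.arg == other.arg


# Hypothetical constructors, one class per ADT constructor name.
class Int(ADT):
    pass


class Var(ADT):
    pass


class Add(ADT):
    pass


def loads(text):
    # Evaluate ADT text in a namespace that holds only the constructors.
    # Builtins are removed, so the text can refer to nothing else by name.
    env = {"__builtins__": {}, "Int": Int, "Var": Var, "Add": Add}
    return eval(text, env)


tree = loads('Add(Var("x"), Int(8))')
print(tree)
```

Because every constructor name in the text resolves to a class, evaluating
the text builds the tree directly, and no separate parser is needed.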



Plumbing interface [rpc]
========================

The low-level interface provides access to internal services. It
uses ``bap-server`` and talks to ``bap`` over an RPC protocol. It is in
the extras section and must be installed explicitly with the ``[rpc]`` tag.

In a few keystrokes:

    >>> import bap
    >>> print('\n'.join(insn.asm for insn in bap.disasm("\x48\x83\xec\x08")))
        decl    %eax
        subl    $0x8, %esp

A more complex example:

    >>> img = bap.image('coreutils_O0_ls')
    >>> sym = img.get_symbol('main')
    >>> print('\n'.join(insn.asm for insn in bap.disasm(sym)))
        push    {r11, lr}
        add     r11, sp, #0x4
        sub     sp, sp, #0xc8
        ... <snip> ...

The plumbing interface exposes two functions:

#. ``disasm`` returns a disassembly of the given object
#. ``image``  loads the given file

Disassembling things
--------------------

``disasm`` is a Swiss Army knife for disassembling things. It takes either
a string object or something returned by the ``image`` function, e.g.,
images, segments, and symbols.

The ``disasm`` function returns a generator yielding instances of class
``Insn``, defined in module :mod:`asm`. Each instance has the following
attributes:

* name - instruction name, as the underlying backend names it
* addr - address of the first byte of the instruction
* size - overall size of the instruction
* operands - list of instances of class ``Op``
* asm - assembler string, in native assembler syntax
* kinds - instruction meta properties, see :mod:`asm`
* target - instruction lifter to a target platform, e.g., see :mod:`arm`
* bil - a list of BIL statements describing the instruction semantics

The ``disasm`` function also accepts a number of keyword arguments, to name
a few:

* server - either a URL of a bap server or a dictionary containing a port
  and/or an executable name
* arch
* endian  (instance of ``bil.Endian``)
* addr    (should be an instance of type ``bil.Int``)
* backend
* stop_conditions

Most of these arguments are self-describing. ``stop_conditions`` is a list
of ``Kind`` instances defined in :mod:`asm`. If the disassembler meets an
instruction that is an instance of one of these kinds, it will stop.
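
The stopping rule can be sketched without a running ``bap-server``. In the
sketch below, ``Kind``, ``Call``, ``Return``, ``Insn`` and ``take_until_stop``
are hypothetical, simplified stand-ins rather than the definitions from
:mod:`asm`, the instruction strings are made up, and it is an assumption
that the matching instruction is still yielded before the stream ends.

```python
from collections import namedtuple


class Kind(object):
    # Hypothetical root of the instruction-kind hierarchy.
    pass


class Call(Kind):
    pass


class Return(Kind):
    pass


# Hypothetical, reduced instruction record: only ``asm`` and ``kinds``.
Insn = namedtuple("Insn", ["asm", "kinds"])


def take_until_stop(insns, stop_conditions):
    # Yield instructions lazily and stop after the first one that has a
    # kind matching any of the stop conditions (the match is included).
    stop_types = tuple(type(k) for k in stop_conditions)
    for insn in insns:
        yield insn
        if any(isinstance(k, stop_types) for k in insn.kinds):
            return


stream = [
    Insn("push {r11, lr}", []),
    Insn("bl 0x1000", [Call()]),
    Insn("pop {r11, pc}", [Return()]),
    Insn("nop", []),
]
result = [insn.asm for insn in take_until_stop(stream, [Return()])]
print(result)
```

With an empty ``stop_conditions`` list nothing matches, so the sketch
consumes the whole stream.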

Reading files
-------------

To read and analyze a file, load it with the ``image`` function. This
function returns an instance of class ``Image`` that allows one to
discover information about the file and perform various queries. It has
a ``get_symbol`` function to look up a symbol in the file by name, and
the following set of (self-describing) attributes:

* arch
* entry_point
* addr_size
* endian
* file (file name)
* segments

``segments`` is a list of instances of the ``Segment`` class, which also
has a ``get_symbol`` function and the following attributes:

* name
* perm (a list of ['r', 'w', 'x'])
* addr
* size
* memory
* symbols

``symbols`` is a list of instances of the ``Symbol`` class, each having the
following attributes:

* name
* is_function
* is_debug
* addr
* chunks

``chunks`` is a list of instances of the ``Memory`` class, each having the
following attributes:

* addr
* size
* data

``data`` is the actual string of bytes.
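
The containment hierarchy described above (image, segments, symbols, memory
chunks) can be modelled with plain records. This is only an illustrative
sketch: the namedtuples and the free function ``get_symbol`` below are
hypothetical and carry a subset of the listed attributes, whereas the real
objects come from ``image`` and expose ``get_symbol`` themselves. The
architecture name, file name and addresses are made up, and returning
``None`` for a missing symbol is an assumption.

```python
from collections import namedtuple

# Hypothetical records mirroring a subset of the documented attributes.
Memory = namedtuple("Memory", ["addr", "size", "data"])
Symbol = namedtuple("Symbol", ["name", "is_function", "addr", "chunks"])
Segment = namedtuple("Segment", ["name", "perm", "addr", "size", "symbols"])
Image = namedtuple("Image", ["arch", "file", "segments"])


def get_symbol(image, name):
    # Search every segment's symbol list for the first matching name.
    for segment in image.segments:
        for symbol in segment.symbols:
            if symbol.name == name:
                return symbol
    return None  # assumption: a missing symbol yields None


code = bytes(bytearray([0x48, 0x83, 0xEC, 0x08]))
main = Symbol("main", True, 0x400000, [Memory(0x400000, len(code), code)])
text = Segment(".text", ["r", "x"], 0x400000, len(code), [main])
img = Image("x86_64", "example.bin", [text])

sym = get_symbol(img, "main")
total = sum(chunk.size for chunk in sym.chunks)
print(sym.name, hex(sym.addr), total)
```

A symbol may span several chunks, so its size is computed here by summing
the chunk sizes rather than read from a single field.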
"""

from .bap import run

# The RPC interface is an optional extra; skip it when it is not installed.
try:
    from .rpc import disasm, image
except ImportError:
    pass