r"""Python interface to BAP.


Porcelain Interface
===================

The high-level interface lets you run ``bap`` and get back the information
that it was able to infer from the file. It consists of a single function,
``bap.run``, that drives ``bap`` for you. It is quite versatile, so read its
documentation for further information.


Example
-------

>>> import bap
>>> proj = bap.run('/bin/true', ['--symbolizer=ida'])
>>> text = proj.sections['.text']
>>> main = proj.program.subs.find('main')
>>> entry = main.blks[0]
>>> next_blk = main.blks.find(entry.jmps[0].target.arg)

It is recommended to explore the interface using IPython or a similar
interactive toplevel.

We use ADT syntax to communicate with Python. It is a syntactic
subset of the Python grammar, so bap in fact returns a valid Python
program that is then evaluated. ADT stands for Algebraic Data
Type and is described in the ``adt`` module. For non-trivial tasks,
consider using the ``adt.Visitor`` class.
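The idea of evaluating ADT text as Python can be sketched without bap
installed. The classes below are hypothetical stand-ins, not the real ones
from the ``adt`` module, and ``parse`` is an illustrative helper name. The
sketch only assumes what is stated above: the text is a valid Python
expression built from constructor calls.

```python
# Hypothetical stand-ins; the real constructors live in the ``adt`` module.
class ADT(object):
    # A constructor name plus its positional arguments.
    def __init__(self, *args):
        self.args = args

    def __repr__(self):
        return '{}({})'.format(type(self).__name__,
                               ', '.join(repr(a) for a in self.args))

    def __eq__(self, other):
        return type(self) is type(other) and self.args == other.args

    def __hash__(self):
        return hash((type(self).__name__, self.args))


class Var(ADT):
    pass


class Int(ADT):
    pass


class Plus(ADT):
    pass


def parse(text):
    # The text is a Python expression, so evaluating it with the
    # constructors in scope rebuilds the tree. Builtins are hidden so
    # that only the listed constructors can be called.
    env = {'Var': Var, 'Int': Int, 'Plus': Plus, '__builtins__': {}}
    return eval(text, env)


tree = parse("Plus(Var('x'), Int(1))")
print(repr(tree))
```

Because every node prints as a constructor call, the printed form can be
parsed again, which is what makes the textual exchange format convenient.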



Plumbing interface [rpc]
========================

The low-level interface provides access to internal services. It
uses ``bap-server`` and talks to bap over an RPC protocol. It lives in the
extras section and must be installed explicitly with the ``[rpc]`` tag.

In a few keystrokes:

>>> import bap
>>> print('\n'.join(insn.asm for insn in bap.disasm("\x48\x83\xec\x08")))
decl %eax
subl $0x8, %esp

A more complex example:

>>> img = bap.image('coreutils_O0_ls')
>>> sym = img.get_symbol('main')
>>> print('\n'.join(insn.asm for insn in bap.disasm(sym)))
push {r11, lr}
add r11, sp, #0x4
sub sp, sp, #0xc8
... <snip> ...

For the plumbing interface, the ``bap`` package exposes two functions:

#. ``disasm`` returns a disassembly of the given object
#. ``image`` loads the given file

Disassembling things
--------------------

``disasm`` is a Swiss Army knife for disassembling things. It takes either a
string object or something returned by the ``image`` function, e.g.,
images, segments, and symbols.

The ``disasm`` function returns a generator yielding instances of class
``Insn``, defined in module :mod:`asm`. Each instance has the following
attributes:

* name - instruction name, as the underlying backend names it
* addr - address of the first byte of the instruction
* size - overall size of the instruction
* operands - list of instances of class ``Op``
* asm - assembler string, in native assembler
* kinds - instruction meta properties, see :mod:`asm`
* target - instruction lifter to a target platform, e.g., see :mod:`arm`
* bil - a list of BIL statements describing the instruction semantics

The ``disasm`` function also accepts a number of keyword arguments, to name
a few:

* server - either a URL of a bap server or a dictionary containing a port
  and/or an executable name
* arch
* endian (instance of ``bil.Endian``)
* addr (should be an instance of type ``bil.Int``)
* backend
* stop_conditions

Most of these arguments should be self-describing. ``stop_conditions`` is a
list of ``Kind`` instances defined in :mod:`asm`. If the disassembler meets
an instruction that is an instance of one of these kinds, it stops.
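The stopping rule can be illustrated with a small self-contained sketch.
``Kind``, ``Call``, ``Return``, ``Insn`` and ``take_until`` below are
hypothetical stand-ins written for this example, not the classes from
:mod:`asm`, and the instruction strings are placeholders. The sketch also
assumes that the instruction that triggers the stop is itself yielded, which
the description above does not specify.

```python
from collections import namedtuple


# Hypothetical stand-ins for the ``Kind`` hierarchy and ``Insn`` class.
class Kind(object):
    pass


class Call(Kind):
    pass


class Return(Kind):
    pass


Insn = namedtuple('Insn', ['asm', 'kinds'])


def take_until(insns, stop_conditions):
    # Match on the classes of the given Kind instances.
    stop_types = tuple(type(cond) for cond in stop_conditions)
    for insn in insns:
        yield insn
        # Assumption: the matching instruction is yielded before stopping.
        if any(isinstance(kind, stop_types) for kind in insn.kinds):
            return


stream = [
    Insn('first instruction', []),
    Insn('call instruction', [Call()]),
    Insn('return instruction', [Return()]),
]
seen = [insn.asm for insn in take_until(stream, [Call()])]
print(seen)
```

With an empty ``stop_conditions`` list nothing matches, so the whole stream
is consumed.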

Reading files
-------------

To read and analyze a file, load it with the ``image`` function. This
function returns an instance of class ``Image`` that lets you discover
information about the file and perform different queries. It has a
``get_symbol`` function to look up a symbol in the file by name, and the
following self-describing attributes:

* arch
* entry_point
* addr_size
* endian
* file (file name)
* segments

``segments`` is a list of instances of the ``Segment`` class, which also has
a ``get_symbol`` function and the following attributes:

* name
* perm (a list of ['r', 'w', 'x'])
* addr
* size
* memory
* symbols

``symbols`` is a list of instances of the ``Symbol`` class, each having the
following attributes:

* name
* is_function
* is_debug
* addr
* chunks

``chunks`` is a list of instances of the ``Memory`` class, each having the
following attributes:

* addr
* size
* data

``data`` is the actual string of bytes.
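The nesting of segments, symbols and chunks described above can be modelled
with plain named tuples. This is a minimal sketch with hypothetical
stand-ins, not the classes returned by ``image``. The field names follow the
attribute lists above, while ``get_symbol`` and ``symbol_bytes`` are
illustrative helpers; returning ``None`` for a missing symbol is an
assumption.

```python
from collections import namedtuple

# Hypothetical stand-ins mirroring the attribute lists above.
Memory = namedtuple('Memory', ['addr', 'size', 'data'])
Symbol = namedtuple('Symbol',
                    ['name', 'is_function', 'is_debug', 'addr', 'chunks'])
Segment = namedtuple('Segment',
                     ['name', 'perm', 'addr', 'size', 'memory', 'symbols'])


def get_symbol(segments, name):
    # Linear search over every symbol of every segment.
    for segment in segments:
        for symbol in segment.symbols:
            if symbol.name == name:
                return symbol
    return None  # assumption: a missing symbol yields None


def symbol_bytes(symbol):
    # Join the chunk data in ascending address order.
    ordered = sorted(symbol.chunks, key=lambda chunk: chunk.addr)
    return b''.join(chunk.data for chunk in ordered)


chunk = Memory(addr=0x1000, size=4, data=b'\x48\x83\xec\x08')
main = Symbol('main', True, False, 0x1000, [chunk])
text = Segment('.text', ['r', 'x'], 0x1000, 4, chunk, [main])

found = get_symbol([text], 'main')
print(found.name, len(symbol_bytes(found)))
```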
"""

from .bap import run

try:
    from .rpc import disasm, image
except ImportError:
    # The optional [rpc] extra is not installed; only ``run`` is exported.
    pass
