============================== Computing the dependency graph ============================== Specifying the working set ========================== When a graph is instantiated without any working set given, it will use the global working set of active distributions defined by pkg_resources: >>> from tl.eggdeps.graph import Graph >>> graph = Graph() >>> sort_specs(graph.working_set) [...setuptools ... tl.eggdeps ... zope.testing ...] For testing the graph builder, we will use custom working sets and distributions. Using the convenience distribution factory defined by our test setup, we pass a working set of some mock distributions to the graph builder: >>> anton_1 = make_dist("anton-1.egg", depends="berta") >>> berta_2 = make_dist("berta-2.egg", depends="""charlie>1.5 ... [extra] ... dora[test]""") >>> ws = make_working_set(anton_1, berta_2) >>> graph = Graph(working_set=ws) >>> sort_specs(graph.working_set) [anton 1 (.../anton-1.egg), berta 2 (.../berta-2.egg)] Helper methods ============== Extracting project names from specifications -------------------------------------------- Graphs have a method that extracts project names from an iterable of distributions or requirements and returns them as a set: >>> graph.names(ws) set([...]) >>> sprint(graph.names(ws)) set(['anton', 'berta']) A Graph instance has a filter function that determines by project name which distributions to include in the graph. This filter applies to the project names returned by the names method. As it allows any distribution by default, we have to specify something interesting to see an effect: >>> graph = Graph(ws, show=lambda name: name < "b") >>> graph.names(ws) set(['anton']) Another method actually named ``filter`` yields pairs of matching specifications and associated graph nodes for the distribution to be included by said rules. (For more on nodes, see below.) >>> graph = Graph(working_set=ws) >>> list(graph.filter(ws)) [(anton 1 (.../anton-1.egg), {}), (berta 2 (.../berta-2.egg), {})] >>> graph = Graph(ws, show=lambda name: name < "b") >>> list(graph.filter(ws)) [(anton 1 (.../anton-1.egg), {})] Filtering distributions ----------------------- The filter for distributions to be shown is stored on the graph instance: >>> graph.show("anton") True >>> graph.show("berta") False Finding distributions --------------------- Working sets have a find method that returns a distribution matching a requirement if one can be found. It is wrapped by a convenience method of the Graph class that handles a special case. If we ask for distributions active at a compatible version or not active at all, both find methods behave the same: >>> import pkg_resources >>> req = pkg_resources.Requirement.parse("anton") >>> ws.find(req) anton 1 (.../anton-1.egg) >>> graph.find(req) anton 1 (.../anton-1.egg) >>> req = pkg_resources.Requirement.parse("charlie") >>> ws.find(req) >>> graph.find(req) Unfortunately, the working set's find method raises an exception if a distribution for the same project is found, but at an incompatible version. As we treat distributions active at the wrong version the same as distributions not active at all, a convenience method handles the exception for us: >>> req = pkg_resources.Requirement.parse("anton>5") >>> ws.find(req) Traceback (most recent call last): ... VersionConflict: (anton 1 (.../anton-1.egg), Requirement.parse('anton>5')) >>> graph.find(req) Nodes ===== The graph contains nodes which are instances of the Node class. They get bound to a graph upon instantiation and represent a distribution by its project name. The distribution is specified by an instance of either a Distribution or a Requirement: >>> from tl.eggdeps.graph import Node >>> node = Node(graph, anton_1) >>> node.name 'anton' The node has a ``require`` method that tries to find a distribution matching a specification in the graph's working set. It returns a boolean indicating success or failure. If it succeeds, it stores the distribution in an attribute of the node. Another attribute keeps record of whether the distribution has been compatible to all specifications required so far. The ``require`` method has already been called once when the node was instantiated: >>> node.dist anton 1 (.../anton-1.egg) >>> node.compatible True When an attempt to match a specification to the node fails, the distribution remains associated with the node, but the node's compatibility flag is unset: >>> req_anton_5 = pkg_resources.Requirement.parse("anton>5") >>> node.require(req_anton_5) False >>> node.dist anton 1 (.../anton-1.egg) >>> node.compatible False On the other hand, if a distribution is not found upon instantiation, it may well be found by a later attempt at matching some specification: >>> node = Node(graph, req_anton_5) >>> print node.dist None >>> node.compatible False >>> node.require(anton_1) True >>> node.dist anton 1 (.../anton-1.egg) >>> node.compatible False If a node receives a requirement for a distribution of another project than its own, it will complain: >>> node.require(berta_2) Traceback (most recent call last): ... ValueError: A 'anton' node cannot satisfy a 'berta' requirement. Nodes have an API for having them store and list dependencies on other nodes, either with or without extras involved: >>> node.depend('berta') >>> list(node.iter_deps()) [('berta', set([]))] >>> node.extra_depend('cool-extra', 'charlie') >>> list(node.iter_deps()) [('charlie', set(['cool-extra'])), ('berta', set([]))] Dependencies can target packages with one or more extras activated. In that case, an iterable of extras of the dependency can be specified. We will not see anything about this the way we have looked at the stored dependencies so far, but there is another method that reads the storage and returns more detailed information: >>> node.depend('berta', ('hot-extra',)) >>> node.extra_depend('cool-feature', 'charlie', ('more-coolness',)) >>> list(node.iter_deps()) [('charlie', set(['cool-feature', 'cool-extra'])), ('berta', set([]))] >>> list(node.iter_deps_with_extras()) [('charlie', None, set(['cool-extra'])), ('charlie', 'more-coolness', set(['cool-feature'])), ('berta', None, set([])), ('berta', 'hot-extra', set([]))] If the same package is a dependency both via an extra and without one, the information on the extra is discarded: >>> node.depend('charlie') >>> list(node.iter_deps()) [('charlie', set([])), ('berta', set([]))] >>> list(node.iter_deps_with_extras()) [('charlie', None, set([])), ('charlie', 'more-coolness', set(['cool-feature'])), ('berta', None, set([])), ('berta', 'hot-extra', set([]))] A node remembers which of its extras have been used over time: >>> sorted(node.extras_used) ['cool-extra', 'cool-feature'] Analysing the working set ========================= A dependency graph may be built from the complete working set by finding all possible dependencies between any distributions. The graph will be a mapping from project names to node objects which describe each node's dependencies. Node objects in turn are mappings from project names of each dependency to a set of dependency descriptions. The empty set signals a mandatory dependency, a set of names means that the dependency is by way of any of the named extras. Dependencies which are not active will be ignored. Operating on the full working set --------------------------------- By default, all dependencies between any distributions in the working set will be reported, including mandatory as well as extra dependencies: >>> dora_0_5 = make_dist("dora-0.5.egg") >>> ws = make_working_set(anton_1, berta_2, dora_0_5) >>> graph = Graph(ws) >>> graph.from_working_set() >>> sprint(graph) {'anton': {'berta': {None: set([])}}, 'berta': {'dora': {'test': set(['extra'])}}, 'dora': {}} The graph has a set of roots, which are the names of those distributions that are not a dependency of any other node: >>> graph.roots set(['anton']) If a distribution depends on another one both mandatorily and by some extras (which is possible though not very useful), the dependency is considered a plain mandatory dependency: >>> emil_1 = make_dist("emil-1.egg", """anton ... [pointless-extra] ... anton""") >>> ws = make_working_set(anton_1, emil_1) >>> graph = Graph(ws) >>> graph.from_working_set() >>> sprint(graph) {'anton': {}, 'emil': {'anton': {None: set([])}}} >>> graph.roots set(['emil']) If the graph contains a cycle none of whose constituents is a dependency of a distribution outside the cycle, one of those constituents will be considered a graph root: >>> emil_1 = make_dist("emil-1.egg", "fritz") >>> fritz_5 = make_dist("fritz-5.egg", depends="emil") >>> ws = make_working_set(anton_1, emil_1, fritz_5) >>> graph = Graph(ws) >>> graph.from_working_set() >>> sprint(graph) {'anton': {}, 'emil': {'fritz': {None: set([])}}, 'fritz': {'emil': {None: set([])}}} >>> graph.roots set(['anton', 'emil']) Dependencies from a working set analysis take versions into account. The following does not report a dependency of berta on charlie as berta requires at least charlie 1.5: >>> charlie_1_4 = make_dist("charlie-1.4.egg") >>> ws = make_working_set(berta_2, charlie_1_4) >>> graph = Graph(ws) >>> graph.from_working_set() >>> sprint(graph) {'berta': {}, 'charlie': {}} Reducing the graph ------------------ Extra dependencies may be ignored completely to simplify a complex graph: >>> ws = make_working_set(anton_1, berta_2, dora_0_5) >>> graph = Graph(ws, extras=False) >>> graph.from_working_set() >>> sprint(graph) {'anton': {'berta': {None: set([])}}, 'berta': {}, 'dora': {}} >>> sorted(graph.roots) ['anton', 'dora'] Alternatively, specific distributions may be ignored. The graph will then not contain any node for those distributions, nor any edges for dependencies on them or their own dependencies. This is achieved by specifying a filter function that determines which distributions ought to be shown: >>> graph = Graph(ws, show=lambda name: name != "berta") >>> graph.from_working_set() >>> sprint(graph) {'anton': {}, 'dora': {}} >>> sorted(graph.roots) ['anton', 'dora'] If certain distributions should themselves be included in the graph but their dependencies not be followed, they can be made "dead ends" by passing a filter function that determines which distributions to follow the dependencies of: >>> graph = Graph(ws, follow=lambda name: name != "berta") >>> graph.from_working_set() >>> sprint(graph) {'anton': {'berta': {None: set([])}}, 'berta': {}, 'dora': {}} >>> sorted(graph.roots) ['anton', 'dora'] >>> graph["anton"].follow True >>> graph["berta"].follow False >>> graph["dora"].follow True Analysing specific distributions' dependencies ============================================== The second way of building a dependency graph is by inspecting the dependencies of one or more specified distributions. In this scenario, unrelated active distributions are ignored. Operating on the full working set --------------------------------- In this example, anton does not depend on berta and berta's dependencies: >>> charlie_1_6 = make_dist("charlie-1.6.egg") >>> ws = make_working_set(anton_1, berta_2, charlie_1_6) >>> graph = Graph(ws) >>> graph.from_specifications("berta") >>> sprint(graph) {'berta': {'charlie': {None: set([])}}, 'charlie': {}} The roots of the graph are the specified distributions now: >>> graph.roots set(['berta']) On the other hand, required distributions which are not in the working set are included now. In the example, this applies to dora: >>> graph = Graph(ws) >>> graph.from_specifications("berta [extra]") >>> sprint(graph) {'berta': {'charlie': {None: set([])}, 'dora': {None: set(['extra'])}}, 'charlie': {}, 'dora': {}} >>> graph.roots set(['berta']) Node objects store their associated distribution on an attribute. Since dora is inactive it doesn't have one, in contrast to berta and charlie: >>> graph["berta"].dist berta 2 (.../berta-2.egg) >>> graph["charlie"].dist charlie 1.6 (.../charlie-1.6.egg) >>> graph["dora"].dist If a version of charlie incompatible with the requirement by berta is active, charlie is treated as if it wasn't active at all: >>> ws = make_working_set(berta_2, charlie_1_4) >>> graph = Graph(ws) >>> graph.from_specifications("berta") >>> sprint(graph) {'berta': {'charlie': {None: set([])}}, 'charlie': {}} >>> graph["charlie"].dist Reducing the graph ------------------ In contrast to analysing the whole working set, turning off extra dependencies will remove those packages from the graph which are dependencies of the root nodes only by way of extras. In our example, this applies to anton: >>> ws = make_working_set(anton_1, berta_2, charlie_1_6) >>> graph = Graph(ws, extras=False) >>> graph.from_specifications("berta [extra]") >>> sprint(graph) {'berta': {'charlie': {None: set([])}}, 'charlie': {}} >>> graph.roots set(['berta']) Ignoring specific distributions has different effects than in whole working set analysis as well. Whichever other distributions are connected to the roots only through a distribution which is to be ignored (charlie as a dependency of berta in this case), will be left out of the graph themselves: >>> graph = Graph(ws, show=lambda name: name != "berta") >>> graph.from_specifications("anton") >>> sprint(graph) {'anton': {}} >>> graph.roots set(['anton']) Similarly, distributions depended on by dead ends only (charlie again) will be missing from the graph: >>> graph = Graph(ws, follow=lambda name: name != "berta") >>> graph.from_specifications("anton") >>> sprint(graph) {'anton': {'berta': {None: set([])}}, 'berta': {}} >>> graph.roots set(['anton']) >>> graph["anton"].follow True >>> graph["berta"].follow False But of course, distributions depended upon by ignored distributions and dead ends are not ignored, they may just be missed because of dependencies not being followed. If there are other paths from the roots to them, those distributions will be included in the graph, but with some connections missing: >>> fritz_5 = make_dist("fritz-5.egg", depends="""berta ... charlie""") >>> ws = make_working_set(berta_2, charlie_1_6, fritz_5) >>> graph = Graph(ws, show=lambda name: name != "berta") >>> graph.from_specifications("fritz") >>> sprint(graph) {'charlie': {}, 'fritz': {'charlie': {None: set([])}}} >>> graph = Graph(ws, follow=lambda name: name != "berta") >>> graph.from_specifications("fritz") >>> sprint(graph) {'berta': {}, 'charlie': {}, 'fritz': {'berta': {None: set([])}, 'charlie': {None: set([])}}} >>> graph["berta"].follow False >>> graph["charlie"].follow True >>> graph["fritz"].follow True .. Local Variables: .. mode: rst .. End: