Skip to content
This repository was archived by the owner on Apr 30, 2020. It is now read-only.
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
79 changes: 79 additions & 0 deletions infra/graph_generators.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sun Mar 17 13:00:33 2019

@author: Zargham

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the function of the author block? Does it represent "ownership" of the file in an abstract sense? What information does it provide outside of the information readily available from git blame? Are users who make modifications to this file expected to add themselves as authors?

Similarly, why do we have a created timestamp? In the few cases where the creation date is relevant information, it can be found by checking the timestamp on the associated commit. I don't see why humans should be manually maintaining such metadata.

If we do standardize on having explicit authorship tags, it should be the users' github handle, and not their last name.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this was autogenerated by spyder and I just sort of ignored it. If you have a specific conversion for headers, I'll happily follow it.

Copy link

@teamdandelion teamdandelion Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just have a standard python docstring describing the purpose of the module.

"""
import networkx as nx

def lineGraphGen(N, bidir=False,nodeTypeName='vanilla',edgeTypeName='vanilla'):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, python style for method names is underscore naming, as in line_graph_gen

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missing commas between spaces. run black on this code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ran black

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, arguments should have underscore naming, as in edge_type_name

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is edge_type_name and node_type_name needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The the python versions of the pagerank algorithm use the types to look up the edge weights. In these simple generators there is only one type of edge but i was expecting to use them for constructing more sophisticated graphs by combining these, the resulting graphs would would have multiple types.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i updated these methods so the user can assign them with a dictionary, thus making heterogenous types, like in the sourcecred graphs with real data.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

num_nodes is a clearer parameter name than N.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, all of these public methods should have python docstrings:
https://www.python.org/dev/peps/pep-0257/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added doc string; not sure they are totally up to convention but i read the link and added simple statements about each function.


line = nx.path_graph(N, create_using=nx.MultiDiGraph)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this algorithm is a little funky. Given that we create the path_graph using nx.MultiDiGraph in either case, couldnt we do something like:

graph = nx.path_graph(N, create_using=nx.MultiDiGraph)
if bidir:
  for e in edges:
    graph.add_edge(e[1], e[0])

# ... stuff ....
return graph

if not(bidir):
G = line

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: G is a bad variable name, especially as Capitalization in python implies a class name.

else:
edges = line.edges
G = nx.MultiDiGraph()
for e in edges:
G.add_edge(e[0],e[1])
G.add_edge(e[1],e[0])

nx.set_node_attributes(G,nodeTypeName, 'type')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we do keep the nodeTypeName and edgeTypeName logic and arguments, we shouldn't keep repeating this logic all over the place. instead, we should have a GraphOptions type which includes a node type name and edge type name, and then a method like _apply_graph_options(graph, options) which does these mutations through a single re-used code path. then we can test it more thoroughly, easily change the logic, etc.

nx.set_edge_attributes(G,edgeTypeName, 'type')

return G

def starGraphGen(N, kind='sink',nodeTypeName='vanilla',edgeTypeName='vanilla'):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: methods should have underscore names., N is a bad argument name


star = nx.star_graph(N)
G = nx.MultiDiGraph()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should have error handling that checks that kind is an accepted argument:

if kind is not "source" and kind is not "sink" and kind is not "bidir":
  raise ValueError("Unrecognized kind: {}".format(kind)")

This allows much greater confidence that code using this API is calling it correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pardon the drive-by, but:

  1. Don’t use is (reference equality) when you want value equality:

    >>> x = "a" * 1000
    >>> y = "a" * 1000
    >>> x == y
    True
    >>> x is y
    False
  2. Just use in:

    if kind not in ("source", "sink", "bidir"):
        ...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the correction. I think I tried the following test and concluded that is was ok:

>>> a = "foo"
>>> b = "foo"
>>> a is b
True

I guess Python deduplicates literal strings but not constructed ones?

for e in star.edges:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we only ever want that star graph to iterate over its edges, i think it would be cleaner to inline it:

for e in nx.star_graph(N).edges:
  # do the stuff

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for even more readability, maybe destructure it?

for (src, dst) in nx.star_graph(N).edges:
  # do something

if (kind == 'source') or (kind == 'bidir'):
G.add_edge(e[0],e[1])
if (kind == 'sink') or (kind == 'bidir'):
G.add_edge(e[1],e[0])

nx.set_node_attributes(G,nodeTypeName, 'type')
nx.set_edge_attributes(G,edgeTypeName, 'type')

return G

def circleGraphGen(N, bidir=False,nodeTypeName='vanilla',edgeTypeName='vanilla' ):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same nits as before


circle = nx.cycle_graph(N, create_using=nx.MultiDiGraph)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as before, we should not need to initialize two graphs in the bidir case, we should be able to make the graph bidirectional in-place.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could think about factoring out a make_bidirectional method if we find ourselves doing this a lot.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can look into it; this was a work around in the first place. Some of the built in methods in networkx don't do quick what we want when initiated. Also, weird stuff happens when you iterate on the same graph you are editing.

if not(bidir):
G = circle
else:
edges = circle.edges
G = nx.MultiDiGraph()
for e in edges:
G.add_edge(e[0],e[1])
G.add_edge(e[1],e[0])

nx.set_node_attributes(G,nodeTypeName, 'type')
nx.set_edge_attributes(G,edgeTypeName, 'type')

return G

def treeGraphGen(r,h, kind='sink',nodeTypeName='vanilla',edgeTypeName='vanilla'):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same nits as on other methods


tree = nx.balanced_tree(r,h, create_using=nx.MultiDiGraph)

if kind=='source':
G = tree
elif kind =='sink':
G = nx.MultiDiGraph()
for e in tree.edges:
G.add_edge(e[1],e[0])
elif kind == 'bidir':
G = nx.MultiDiGraph()
for e in tree.edges:
G.add_edge(e[1],e[0])
G.add_edge(e[0],e[1])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should add an else block to catch the case where kind is unrecognized:

else:
  raise ValueError("unrecognized kind: {}".format(kind)


nx.set_node_attributes(G,nodeTypeName, 'type')
nx.set_edge_attributes(G,edgeTypeName, 'type')

return G

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: trailing whitespace--hopefully black will take care of this.

68 changes: 68 additions & 0 deletions infra/graph_generators_test.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Wed Mar 20 17:17:54 2019

@author: Zargham
"""
import networkx as nx
import unittest
from graph_generators import lineGraphGen, starGraphGen, treeGraphGen

class GraphGenerators_LineGraph(unittest.TestCase):
def test_lineGraphGen_sanity(self):
g = lineGraphGen(4)
self.assertEqual(list(g.edges()), [(0, 1), (1, 2), (2, 3)])

def test_lineGraphGen_bidir(self):
g = lineGraphGen(4, bidir=True)
# (node, neighbor)
self.assertEqual(list(g.edges()), [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should test the edge case too. what if we pass 0 to the graph generator?


class GraphGenerators_StarGraph(unittest.TestCase):

# how to validate that 'kind' is either 'source', 'sink', or 'bidir' ??

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should have the fn raise a ValueError if the kind isnt an acceptable value, then test that behavior here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added error handling (first try)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added error handling across the module.


def test_starGraph_sink(self):
g = starGraphGen(4)
self.assertEqual(list(g.edges()), [(1, 0), (2, 0), (3, 0), (4, 0)])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BrianLitwin when we were chatting i mentioned i didn't like these test cases, but now that i am reading and paying closer attention, i think they're fine. its pretty easy to read and reason about what this graph is, and whether it's the right graph for the arguments.


def test_starGraph_source(self):
g = starGraphGen(4, kind='source')
self.assertEqual(list(g.edges()), [(0, 1), (0, 2), (0, 3), (0, 4)])

def test_starGraph_bidir(self):
g = starGraphGen(4, kind='bidir')
self.assertEqual(list(g.edges()), [(0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (2, 0), (3, 0), (4, 0)])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is kind of a doozy to keep all in my head--maybe check for n=2 or n=1 instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept them at mostly 3 and 4 because for 2 node graphs the there would be no way to distinguish some of the graph patterns; for example the star and the line would be the same.


class GraphGenerators_TreeGraph(unittest.TestCase):
#note: our default is diff from networkx's default in this case (sink vs source)
def test_treeGraph_source_sink(self):
g = treeGraphGen(2,2)
self.assertEqual(list(g.edges()), [(1, 0), (2, 0), (3, 1), (4, 1), (5, 2), (6, 2)])

def test_treeGraph_source(self):
g = treeGraphGen(2,2, kind='source')
self.assertEqual(list(g.edges()), [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)])

def test_treeGraph_bidir(self):
g = treeGraphGen(2,2, kind='bidir')
self.assertEqual(list(g.edges()), [
(1, 0), (1, 3), (1, 4), (0, 1), (0, 2), (2, 0),
(2, 5), (2, 6), (3, 1), (4, 1), (5, 2), (6, 2)
])

# this is the way I might test that the types assigned in Z's functions are correct
# I think it would be a good idea to abstract out that logic into a helper function

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, agreed! see my comment above

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

factored it out. Thanks.

# so that the code isn't duplicated in every generator

class GraphGenerators_SetType(unittest.TestCase):
def test_lineGraphGen_setType(self):
g = lineGraphGen(4, nodeTypeName='foo',edgeTypeName='bar')

# each node and edge should have the correct 'type' value
for k, v in nx.get_node_attributes(g, 'type').items():
self.assertEqual(v, 'foo')

for k, v in nx.get_edge_attributes(g, 'type').items():
self.assertEqual(v, 'bar')