Data structure for arrays which share some elements — Python

Question

I have a collection of arrays which "overlap" at certain elements. Here's a picture of an example involving 3 arrays of characters:

  array0↓
       'A'      ↓array2
array1→'B' 'D' 'E'
       'C'     'F'

The important thing is that changes to the arrays should respect this structure. So for example, if I change the 'B' in array0 to 'X', the 'B' in array1 should also change to 'X'.

My question is what is a good, efficient way of implementing this in Python?

There are two things I have thought of so far:

One, I can make a bespoke class, instances of which contain a completely distinct list, along with information about any overlaps that it has, and implement update methods appropriately so that any change to the list is always duplicated for the other lists at the overlaps. This seems a little overwrought though, and involves duplicating the data.

Two, I could do it by using singleton lists like this:

data = [['A'], ['B'], ['C'], ['D'], ['E'], ['F']]
array0 = [data[0], data[1], data[2]]
array1 = [data[1], data[3], data[4]]
array2 = [data[4], data[5]]

for array in array0, array1, array2:
     print(array)

>>> [['A'], ['B'], ['C']]
>>> [['B'], ['D'], ['E']]
>>> [['E'], ['F']]

array0[1][0] = 'X'

for array in array0, array1, array2:
     print(array)

>>> [['A'], ['X'], ['C']]
>>> [['X'], ['D'], ['E']]
>>> [['E'], ['F']]

But I feel this may be hacky and not the best way. Thanks for any suggestions.

Do the overlaps have to match up with an actual crossword-style grid layout? — user2357112, 2 hours ago
Good question, I considered addressing that. For the problem I'm actually working on that would be fine. And so you could use a 2 dimensional array to implement this -- although it could potentially have a lot of empty space. And I am curious to see solutions for the more general situation when this assumption isn't met (e.g. if we also paired 'C' and 'F' in the example). — Denziloe, 2 hours ago

Daniel Mesejo · Answer 1 · 2018-08-24 16:34:47Z

Mine suggestion is variation of the one proposed by @a_guest. You could have a wrapper class that marks the elements as shared and a data structure for handling such elements:

class SharedElement:
    def __init__(self, val):
        self.val = val

    def update(self, val):
        self.val = val

    def __repr__(self):
        return "SharedElement({0})".format(self.val)

    def __str__(self):
        return str(self.val)


class SharedList:
    def __init__(self, lst):
        self._lst = lst

    def __getitem__(self, item):
        if isinstance(self._lst[item], SharedElement):
            return self._lst[item].val
        return self._lst[item]

    def __setitem__(self, key, value):
        if isinstance(self._lst[key], SharedElement):
            self._lst[key].update(value)


B = SharedElement('B')
E = SharedElement('E')

a = SharedList(['A', B, 'C'])
b = SharedList([B, 'D', E])
c = SharedList([E, 'F'])

b[0] = 'X'

print([val for val in a])
print([val for val in b])
print([val for val in c])

Output

['A', 'X', 'C']
['X', 'D', 'E']
['E', 'F']

I'd personally avoid mixing strings and SharedElements. If you ever want to do some operation on those arrays you'll have to check if each element is a str or a ShredElement and handle the two cases separately. I'd simply rename SharedElement to Element, and assign all values as Elements but refer to the same instances if they have to be shared. In this way the handling will be uniform. — Bakuriu, 2 hours ago
Using Element for all values could also allow to share the state without having to reassign the values. For example if x and y are instances that appear somewhere in the a, b, c lists you can make them equal and shared by doing something like x.__dict__ = y.__dict__ without having to do something like c[c.index(x)] = y. This is surely hackish but in some circumstance it might be the easiest solution. — Bakuriu, 2 hours ago

Ajax1234 · Answer 2 · 2018-08-24 14:11:05Z

You can create a wrapper class that can handle the updating of all elements of the same value:

arr = [[['A'], ['B'], ['C']], [['B'], ['D'], ['E']], [['E'], ['F']]]
class WrapArray:
  def __init__(self, _data):
    self.d = _data
  def __getitem__(self, _x):
    self.x = _x
    class _wrapper:
      def __init__(self, _inst):
         self.ref = _inst
      def __setitem__(self, _y, _val):
         _place = self.ref.d[self.ref.x][_y][0]
         self.ref.d[self.ref.x][_y][0] = _val
         for i in range(len(self.ref.d)):
           for b in range(len(self.ref.d[i])):
             if self.ref.d[i][b][0] == _place:
               self.ref.d[i][b] = [_val]
    return _wrapper(self)
  def __repr__(self):
    return str(self.d)

array = WrapArray(arr)
array[1][0] = 'X'

Output:

[[['A'], ['X'], ['C']], [['X'], ['D'], ['E']], [['E'], ['F']]]

a_guest · Answer 3 · 2018-08-24 15:47:40Z

You could subclass list and use a dedicated wrapper class to proxy shared content. This involves no data duplication as it only stores the proxy for shared data which dispatches to the original data. It's a bit similar to your nested list approach but it maintains the normal list interface. Here is an example implementation:

class Intersection:
    def __init__(self, other, index):
        self.other = other
        self.index = index

    def __repr__(self):
        return repr(self.other[self.index])

    @property
    def value(self):
        return self.other[self.index]

    @value.setter
    def value(self, v):
        self.other[self.index] = v


class List(list):
    def __getitem__(self, index):
        item = super().__getitem__(index)
        return item.value if isinstance(item, Intersection) else item

    def __setitem__(self, index, value):
        item = super().__getitem__(index)
        if isinstance(item, Intersection):
            item.value = value
        else:
            super().__setitem__(index, value)

    def share(self, index):
        return Intersection(self, index)

Now you can share the data among your lists as required:

a = List(['A', 'B', 'C'])
b = List([a.share(1), 'D', 'E'])
c = List([b.share(2), 'F'])

a[1] = 'X'
b[2] = 'Y'
print(a)
print(b)
print(c)

Which gives as output:

['A', 'X', 'C']
['X', 'D', 'Y']
['Y', 'F']

a_guest · Answer 4 · 2018-08-24 15:48:20Z

You could use a dedicated class that updates other intersecting instances appropriately as you indicated with your first idea. I wouldn't consider data duplication a problem as for mutable data you anyway store the references and in case your using large immutable data you can employ a dedicated wrapper class (e.g. Python 3.7 introduced the @dataclass decorator).

Here is an example implementation:

from collections import defaultdict

class List(list):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._intersections = defaultdict(list)

    def __setitem__(self, index, value):
        super().__setitem__(index, value)
        for i, other in self._intersections[index]:
            other[i] = value

    def intersect(self, other, at):
        self._intersections[at[0]].append((at[1], other))

With that you can intersect the lists as in your example:

a = List(['A', 'B', 'C'])
b = List(['B', 'D', 'E'])
c = List(['E', 'F'])

a.intersect(b, (1, 0))
b.intersect(c, (2, 0))

a[1] = 'X'
b[2] = 'Y'
print(a)
print(b)
print(c)

Which gives as output:

['A', 'X', 'C']
['X', 'D', 'Y']
['Y', 'F']

There doesn't seem to be much benefit to wrapping 2 of the 3 intersect args in an extra container, but other than that, it's a formidable solution — Felix Dombek, 2 hours ago

Lorenzo · Answer 5 · 2018-08-24 14:18:58Z

As you pointed out in your question, the relevant information is that

array0ptr = [0, 1, 2]
array1ptr = [1, 3, 4]
array2ptr = [4, 5]

(I am adding the suffix ptr because practically those elements are pointers). Here the list element are the pointer to the objects that shall be maintained in a separate list

ol = ['A', 'B', 'C', 'D', 'E']

The real arrays can be obtained at run time by member functions like

array0 = []
for i in range(len(array0ptr)):
    array0.append(ol[array0ptr[i]])

Now your point is : suppose that the object list becomes

ol = ['A', 'B', 'intruder', 'C', 'D', 'E']

How do I automagically keep track of this in my arrays?? Those arrays should become:

array0ptr = [0, 1, 3]
array1ptr = [1, 4, 5]
array2ptr = [5, 6]

I think that the most simple answer is : keep the list fixed!, and do not allow inserting or changing the order of items. Simply mantain a different hash with the object position. In the case above, you'll have

sl = ['A', 'B', 'C', 'D', 'E', 'intruder']
slorder = [0, 1, 3, 4, 5, 2]

it is then possible to write member functions that dump the updated list of objects, the array would not change. What can be tricky is if you want to delete objects, but this is tricky in any case I fear.

asked	today
viewed	239 times
active	today

current community

your communities

more stack exchange communities

Data structure for arrays which share some elements — Python

5 Answers 5

Your Answer

Not the answer you're looking for? Browse other questions tagged python data-structures or ask your own question.

Hot Network Questions

5 Answers 5

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged python data-structures or ask your own question.

Related