I create a list as follows:

['v0' if x%4==0 else 'v1' if x%4==1 else 'v2' if x%4==2 else 'v3' for x in list_1]

How can I generalize the creation of such a list, so that it can easily be extended with a larger number of values and corresponding conditions?


String formatting

Why not use a modulo operation here, and do string formatting, like:

['v{}'.format(x%4) for x in list_1]

Here we calculate x%4 and insert the result after 'v' in the string. The nice thing is that we can easily change 4 to another number.
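As a minimal sketch of that last point, the modulus can be pulled out into a parameter. The helper name label_by_remainder, its prefix argument, and the sample data are made up for illustration:

```python
def label_by_remainder(values, n, prefix='v'):
    """Label each value with prefix + (value mod n)."""
    return ['{}{}'.format(prefix, x % n) for x in values]

list_1 = [1, 2, 4, 5, 10, 4, 3]  # sample data, assumed for illustration

print(label_by_remainder(list_1, 4))  # ['v1', 'v2', 'v0', 'v1', 'v2', 'v0', 'v3']
print(label_by_remainder(list_1, 6))  # ['v1', 'v2', 'v4', 'v5', 'v4', 'v4', 'v3']
```

Switching from 4 to 6 cases only changes the argument; no extra branches are needed.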

Tuple or list indexing

In case the output strings do not follow such a structure, we can construct a list or tuple to hold the values, like:

# in case the values do not follow a certain structure
vals = ('v0', 'v1', 'v2', 'v3')
[vals[x%4] for x in list_1]

By indexing this way, it is clear which value maps to which index. This works well provided the result of the operation (here x%4) maps onto Zn, that is, the integers 0 to n-1, with a reasonably small n.

(Default)dictionary

In case the operation does not map onto Zn, but still onto a finite number of hashable items, we can use a dictionary. For instance:

d = {0: 'v0', 1: 'v1', 2: 'v2', 3: 'v3'}

or, in case we want a "fallback" value that is used when the lookup fails:

from collections import defaultdict

d = defaultdict(lambda: 1234, {0: 'v0', 1: 'v1', 2: 'v2', 3: 'v3'})

where 1234 is used as the fallback value. We can then use:

[d[x%4] for x in list_1]

Using d[x%4] rather than d.get(x%4) with a plain dictionary is useful if we do not want failed lookups to pass unnoticed: a missing key raises a KeyError. Although errors are typically not a good sign, raising one when the lookup fails can be better than silently substituting a default value, since a failed lookup may be a symptom that something is not working correctly.
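To make the difference concrete, here is a small sketch in which one key is deliberately left out of the dictionary. The sample data and variable names are assumptions chosen for the demonstration:

```python
d = {0: 'v0', 1: 'v1', 2: 'v2'}  # the key 3 is deliberately missing

list_1 = [1, 2, 4, 7]  # sample data; 7 % 4 == 3 has no entry

# .get() hides the gap: the missing key silently becomes None
with_get = [d.get(x % 4) for x in list_1]
print(with_get)  # ['v1', 'v2', 'v0', None]

# Indexing surfaces the gap immediately as a KeyError
missing_key = None
try:
    with_index = [d[x % 4] for x in list_1]
except KeyError as exc:
    missing_key = exc.args[0]
    print('no entry for remainder', missing_key)  # no entry for remainder 3
```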

  • 3
    This answer is hyper specific to the modulo operation. What about a general case when OP has some arbitrary operation and arbitrary value to return? – coldspeed Jan 7 at 18:28
  • 1
    @cᴏʟᴅsᴘᴇᴇᴅ: well it works for everything that maps onto Zn. I think the advantage of using this is that we are certain that we do not forget a value. If we use .get(..) then a potential problem is that an error passes silently, which is against the philosophy of Python. – Willem Van Onsem Jan 7 at 18:30
  • I agree, but the .get idiom would map best to the "else" clause in OP's code, which is my rationale behind using it. – coldspeed Jan 7 at 18:31
  • @cᴏʟᴅsᴘᴇᴇᴅ: well in that case it is perhaps worth using a defaultdict. Since in that case, it is explicitly picked as a "default" value. – Willem Van Onsem Jan 7 at 18:32

Here are my attempts at a generic solution. First, the setup -

list_1 = [1, 2, 4, 5, 10, 4, 3]

The first two options are pure-python based, while the last two use numerical libraries (numpy and pandas).


Option 1

dict.get

Generate a mapping of keys to values. In the list comprehension, query dict.get -

mapping = {0 : 'v0', 1 : 'v1', 2 : 'v2'}
r = [mapping.get(x % 4, 'v3') for x in list_1]

r
['v1', 'v2', 'v0', 'v1', 'v2', 'v0', 'v3']

Here, 'v3' is the default value that is returned when the result of x % 4 does not exist as a key in mapping.

This would work for any arbitrary set of conditions and values, not just the condition outlined in the question (modulo arithmetic).
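For instance, the key does not have to come from modulo arithmetic at all. The sketch below keys on the first letter of a word; the mapping, the word list, and the labels are hypothetical:

```python
# Any hashable key works, here the first character of each word
mapping = {'a': 'starts with a', 'b': 'starts with b'}
words = ['avocado', 'blueberry', 'cherry']

r = [mapping.get(w[0], 'something else') for w in words]
print(r)  # ['starts with a', 'starts with b', 'something else']
```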


Option 2

collections.defaultdict

A similar solution would be possible using a defaultdict -

from collections import defaultdict

mapping = defaultdict(lambda: 'v3', {0: 'v0', 1: 'v1', 2: 'v2', 3: 'v3'})
r = [mapping[x % 4] for x in list_1]

r
['v1', 'v2', 'v0', 'v1', 'v2', 'v0', 'v3']

This works similarly to Option 1.
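One behavioural difference between the two options is worth knowing: a defaultdict stores the default under a missing key when it is looked up, whereas dict.get leaves the dictionary untouched. A short sketch, with variable names chosen for illustration:

```python
from collections import defaultdict

fallback = defaultdict(lambda: 'v3', {0: 'v0', 1: 'v1', 2: 'v2'})
plain = {0: 'v0', 1: 'v1', 2: 'v2'}

# defaultdict: looking up a missing key stores the default under that key
value_dd = fallback[3]
keys_dd = sorted(fallback)

# dict.get: the default is returned but nothing is stored
value_get = plain.get(3, 'v3')
keys_get = sorted(plain)

print(value_dd, keys_dd)    # v3 [0, 1, 2, 3]
print(value_get, keys_get)  # v3 [0, 1, 2]
```

For a small, bounded set of keys such as remainders this rarely matters, but with many distinct missing keys the defaultdict keeps growing.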


Option 3

numpy.char.add

If you use numpy, then you might be interested in a vectorised solution involving modulo arithmetic and broadcasted addition -

import numpy as np

r = np.char.add('v', (np.array(list_1) % 4).astype('<U8'))

r
array(['v1', 'v2', 'v0', 'v1', 'v2', 'v0', 'v3'],
      dtype='<U9')

If you require a list as the final result, you can call r.tolist(). Note that this solution is optimised for your particular use case. A more generic approach would be achieved with numpy using np.where/np.select.
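The np.select route mentioned above might look like the following sketch. The condition and choice lists are illustrative assumptions; np.select evaluates the conditions in order and uses its default argument where none of them holds:

```python
import numpy as np

list_1 = [1, 2, 4, 5, 10, 4, 3]  # same sample data as in the setup
arr = np.array(list_1)

# One boolean mask per condition, checked in order
conditions = [arr % 4 == 0, arr % 4 == 1, arr % 4 == 2]
choices = ['v0', 'v1', 'v2']

# Elements matching none of the conditions receive the default
r = np.select(conditions, choices, default='v3')
print(r.tolist())  # ['v1', 'v2', 'v0', 'v1', 'v2', 'v0', 'v3']
```

Because each condition is an arbitrary boolean array, this is not tied to modulo arithmetic.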


Option 4

pd.Series.mod + pd.Series.radd

A similar solution would also work with pandas mod + radd -

import pandas as pd

r = pd.Series(list_1).mod(4).astype(str).radd('v')
r

0    v1
1    v2
2    v0
3    v1
4    v2
5    v0
6    v3
dtype: object

r.tolist()
['v1', 'v2', 'v0', 'v1', 'v2', 'v0', 'v3']
  • 2
    @WillemVanOnsem Thank you. My answer aimed to address a general situation that would work for any set of conditions and values. I suppose I was not clear with that, and the downvoter did not pick up on it. – coldspeed Jan 7 at 18:29
  • 2
    Nitpick: The dict-based approaches will assign the same instance to each matching item in the list, whereas OP's original code would create a new instance each time. Most likely this is not a problem at all (certainly not for strings or numbers) but in some cases it might be. Just wanted to point this out. – tobias_k Jan 8 at 10:42
  • @jezrael Wow... that's horrible... how did you find out?? – coldspeed Apr 12 at 11:27
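The aliasing point raised in the comments above only shows up with mutable values. The following sketch uses lists as values to make it visible; the data and the factories dictionary are assumptions made for the demonstration:

```python
mapping = {0: [], 1: []}   # mutable values, unlike the strings used so far
list_1 = [4, 5, 8]         # sample data: remainders 0, 1, 0 modulo 2

# Every element with the same remainder refers to the same list object
shared = [mapping[x % 2] for x in list_1]
shared[0].append('oops')   # also visible through shared[2]
print(shared)              # [['oops'], [], ['oops']]

# Storing factories instead of values yields a fresh object per element
factories = {0: list, 1: list}
fresh = [factories[x % 2]() for x in list_1]
fresh[0].append('ok')
print(fresh)               # [['ok'], [], []]
```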

In the given example, it's clear that we can "compress" the conditions, which leads to the specific solutions that were given here. In the general case though, we can't assume that there is some "trick" to quickly write out all possible conditions in a single line.

I'd write out all the conditions in a function:

def conditions(x):
    if x == <option a>:
        return <result a>
    elif x == <option b>:
        return <result b>
    .
    .
    .
    else:
        return <default option>

If you're only using equality comparisons, you can use a collections.defaultdict, as shown in the other answers. If the conditions are more complex, then you'd probably have to write out the whole function as shown.

Now for your list comprehension, you can just do:

values = [conditions(x) for x in my_list_of_values]
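As a concrete instance of the template, here is a sketch with conditions that a dictionary lookup could not express directly. The rules, labels, and sample values are hypothetical:

```python
def conditions(x):
    """Hypothetical rules mixing a type check, a range check and arithmetic."""
    if isinstance(x, str):
        return 'text'
    elif x < 0:
        return 'negative'
    elif x % 2 == 0 and x > 100:
        return 'large even'
    else:
        return 'other'

my_list_of_values = [-3, 'abc', 250, 7]  # sample data
values = [conditions(x) for x in my_list_of_values]
print(values)  # ['negative', 'text', 'large even', 'other']
```

The order of the branches matters here: the string check comes first so that the numeric comparisons are never applied to a string.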

# Pair a rule with an output function; the check returns None when the rule fails
def condition(rule, out):
  return lambda x: out(x) if rule(x) else None

def rule1(x): return x%4 == 0
def out1(x): return 'v0'

def rule2(x): return x%4 == 1
def out2(x): return 'v1'

def rule3(x): return x%4 == 2
def out3(x): return 'v2'

# Catch-all rule, must be checked last
lastrule = lambda x: True
lastout = lambda x: 'v3'

check1 = condition(rule1, out1)
check2 = condition(rule2, out2)
check3 = condition(rule3, out3)
check_last = condition(lastrule, lastout)

def transform(*check_list):
  def trans_value(x):
    # Return the first non-None result, evaluating each check only once
    for trans in check_list:
      result = trans(x)
      if result is not None:
        return result
  return trans_value

list_1=[4,5,6,7,8]
# Build the transformer once instead of once per element
trans = transform(check1, check2, check3, check_last)
print([trans(x) for x in list_1])

For long chains of checks, it might be easier to build a list of conditions first. Assume that the condition and the output are both functions of x alone, with no other input parameters. The approach below saves some typing while staying additive for long chains, so that new checks can simply be appended.

To make the method even more generic (more complex conditions, multiple parameters), some compound procedures might be helpful, something like both(greater, either(smaller, identity)). The whole program would then need to be restructured again, which means that its additivity is not yet ideal, since it is not generic enough yet.

outconstants = ['v0', 'v1', 'v2', 'v3']
# for this specific example. In general, only outf is needed (see below)
n = len(outconstants)

outf = lambda out: lambda x: out
outs = [outf(out) for out in outconstants]
# define your own outf formula, if not output constant
# define multiple formulas and put into list, if different type of outputs are needed

rights = map(lambda constant: lambda x: constant, range(n-1)) 
lefts = [lambda x: x%4 for _ in range(n-1)]
# right and left formulas can be also defined separately and then put into list

def identity(a, b): return lambda x: a(x) == b(x) 
# define other rules if needed and form them into rules list with proper orders
# e.g., def greater(a, b): return lambda x: a(x) > b(x), ...


lastrule=lambda x: True

rules = list(map(identity, lefts, rights))
rules.append(lastrule)
# in complex case, each unique rule needs to be defined separately and put into list
# if new rule is needed, define it and append here before lastrule (additive)

def transform(rules, outs):
    def trans_value(x):
        for rule, out in zip(rules, outs):
            if rule(x):
                return out(x)
    return trans_value

list_1=[4,5,6,7,8]
# Build the transformer once instead of once per element
trans = transform(rules, outs)
print([trans(x) for x in list_1])
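If the separate rule and output lists feel heavy, the same first-match idea can be sketched as a single table of (rule, output) pairs. This is an alternative layout and not the code above; the names table and first_match are made up for illustration:

```python
# Each entry pairs a predicate with an output function; order matters
table = [
    (lambda x: x % 4 == 0, lambda x: 'v0'),
    (lambda x: x % 4 == 1, lambda x: 'v1'),
    (lambda x: x % 4 == 2, lambda x: 'v2'),
    (lambda x: True,       lambda x: 'v3'),  # catch-all, must stay last
]

def first_match(x, table=table):
    # Return the output of the first rule that accepts x
    return next(out(x) for rule, out in table if rule(x))

list_1 = [4, 5, 6, 7, 8]
result = [first_match(x) for x in list_1]
print(result)  # ['v0', 'v1', 'v2', 'v3', 'v0']
```

Adding a new case means inserting one tuple before the catch-all entry.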
