Tuesday, October 06, 2020

Python 3.9 dict union op: keys at odds with values

In short: when keys are duplicated, a dict union keeps the first key but the last value.

Given two dicts with keys that compare equal, under union the first key is kept.

Given two dicts with keys that compare equal, under union the second value is kept.

What counts as equal here

We have the following:

Python 3.9.0 (tags/v3.9.0:9cf6752, Oct  5 2020, 15:34:40) [MSC v.1927 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>> 1.0 == 1 == 1.0 + 0.0j
True
>>> 

The number one, whether expressed as an int, a float or a complex number, compares equal across all three types, which is fine.
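As a small sketch of why this matters for dicts: equal numbers of different types also hash alike, so a single dict treats them as one key. The variable names below are mine, purely for illustration.

```python
# Equal numbers of different types hash alike, so a dict sees a single key.
one_int, one_float, one_complex = 1, 1.0, 1.0 + 0.0j
assert one_int == one_float == one_complex
assert hash(one_int) == hash(one_float) == hash(one_complex)

d = {one_int: "from int"}
d[one_float] = "from float"      # existing key object stays, value is replaced
d[one_complex] = "from complex"  # same again

print(d)                             # {1: 'from complex'}
print(type(next(iter(d))).__name__)  # int
```

So even without any union, plain item assignment already keeps the first key and the last value.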

 Values of duplicated keys  

When creating the union of dicts that have the "same" keys, PEP 584 states which value should be kept: the value in the dict to the right of the "|" operator. Unfortunately this is contrary to what happens to the keys.

dict keys are now ordered

With the recently introduced insertion ordering for dict keys, the first occurrence of a key is naturally the one that is kept, and this is indeed the case when using keys that compare equal but are different:

>>> m, n, o = {1:0, 2:0, 3:0, 4:0}, {2.0: 0, 3.0: 0}, {3.0+0.0j: 0}
>>> m | n | o
{1: 0, 2: 0, 3: 0, 4: 0}
>>> o | n | m
{(3+0j): 0, 2.0: 0, 1: 0, 4: 0}
>>> 

The keys above show it: in each result, every surviving key comes from the leftmost dict in which it appears.
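One way to make the surviving keys explicit is to list their types. This is only a sketch; `key_types` is a throwaway helper of my own, not anything from the PEP.

```python
m, n, o = {1: 0, 2: 0, 3: 0, 4: 0}, {2.0: 0, 3.0: 0}, {3.0 + 0.0j: 0}

def key_types(d):
    """Return the type name of each key, in the dict's insertion order."""
    return [type(k).__name__ for k in d]

print(key_types(m | n | o))  # ['int', 'int', 'int', 'int']
print(key_types(o | n | m))  # ['complex', 'float', 'int', 'int']
```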

In unions, which keys and values are kept differ!

You have the strange case that a given key-value pair in a union may not have appeared in any of the input dicts (if types are also compared).


>>> p, q, r = {1:1, 2:2, 3:3, 4:4}, {2.0: 20, 3.0: 30}, {3.0+0.0j: 300}
>>> p | q | r
{1: 1, 2: 20, 3: 300, 4: 4}
>>> r | q | p
{(3+0j): 3, 2.0: 2, 1: 1, 4: 4}
>>> 

Note, for example, in the last output above, the key-value pair of float 2.0 and int 2 does not appear in any of the input dicts p, q, or r.
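PEP 584 describes the union as behaving like a copy of the left operand followed by an update from the right operand. A rough sketch of that, using the same p, q and r as above (the name `merged` is mine), reproduces the mixed pairs:

```python
p, q, r = {1: 1, 2: 2, 3: 3, 4: 4}, {2.0: 20, 3.0: 30}, {3.0 + 0.0j: 300}

# Copy the leftmost dict, then update from each dict to its right in turn.
merged = r.copy()
for other in (q, p):
    merged.update(other)  # equal key already present: key stays, value replaced

# dict == ignores key types, so compare the reprs as well.
assert merged == r | q | p
assert repr(merged) == repr(r | q | p)
print(list(merged.items()))  # [((3+0j), 3), (2.0, 2), (1, 1), (4, 4)]
```

Each update leaves an existing equal key in place and only swaps its value, which is where pairs such as float 2.0 with int 2 come from.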

Can we do better?

Not sure. The PEP gives valid reasons for the last value being kept. Insertion ordering is a good reason for keeping the keys first seen, as is the fact that the choice of keys then matches that of sets under the same operator:

>>> s, t, u = {1, 2, 3, 4}, {2.0, 3.0}, {3.0+0.0j}
>>> s | t | u
{1, 2, 3, 4}
>>> u | t | s
{1, 2.0, (3+0j), 4}
>>> 
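If you ever need every pair in the result to be one that really appeared in an input, one option is to keep the first value along with the first key. The function below is a hypothetical helper of my own, not part of Python or the PEP, and it deliberately differs from the "|" semantics: the first value wins instead of the last.

```python
def union_keep_first_pairs(*dicts):
    """Merge dicts left to right, keeping the first key AND the first value.

    Hypothetical helper: unlike the | operator, later values never
    overwrite earlier ones.
    """
    result = {}
    for d in dicts:
        for key, value in d.items():
            result.setdefault(key, value)  # no-op if an equal key exists
    return result

p, q, r = {1: 1, 2: 2, 3: 3, 4: 4}, {2.0: 20, 3.0: 30}, {3.0 + 0.0j: 300}
print(union_keep_first_pairs(p, q, r))  # {1: 1, 2: 2, 3: 3, 4: 4}
print(union_keep_first_pairs(r, q, p))  # {(3+0j): 300, 2.0: 20, 1: 1, 4: 4}
```

Whether that trade is worth it depends on whether you care more about the latest value or about the pairs being genuine.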

It is only a mild niggle for me, but I do need to know of this quirk.

 
