# Successive Approximation

Given a named variable and the range in which its value must exist, generates successive binary digits which can be used to converge on the value successive binary digits approximating the value to ever greater precision.

## Binary search for a value.

Starting with the original lower and upper range boundaries for the named variable, the value is compared to the range midpoint: if its lower than the midpoint then the range maximum is updated to become the midpoint as the value is in this half; otherwise the range minimum is updated with the midpoint to narrow therange containing the value by a half on each round. If the lower half of the range is used then generate a zero otherwise generate a one for this "round" in the binary search.

``

`def bisect1(value, rmin, rmax):`

` rmid = (rmin + rmax) / 2`

` if value <= rmid:`

` if TUTOR:`

` print(" is <=", rmid)`

` bit = 0 # push 0`

` rmax = rmid # range lower half`

` else:`

` if TUTOR:`

` print(" is >", rmid)`

` bit = 1 # push 1`

` rmin = rmid # range upper half`

` `

` return rmin, rmax, bit`

``

`def encode1(value, rangemin, rangemax, bitcount):`

` if TUTOR:`

` print(f"Successively encoding {value} between [{rangemin}, {rangemax}]:")`

` assert rangemin <= value <= rangemax, "Out of range!"`

` rmin, rmax = rangemin, rangemax # start at extreme`

` result = []`

` for i in range(bitcount):`

` rmin, rmax, bit = bisect1(value, rmin, rmax)`

` result.append(bit)`

` if TUTOR:`

` print(f" Resulting bit stream: {result}.")`

` `

` return result, rmin, rmax`

There is a `TUTOR`

mode to turn on and before giving the encoder a whirl.

`TUTOR = True`

``

`encode1(2/3, -2, 4, 9)`

```
Successively encoding 0.6666666666666666 between [-2, 4]:
is <= 1.0
is > -0.5
is > 0.25
is > 0.625
is <= 0.8125
is <= 0.71875
is <= 0.671875
is > 0.6484375
is > 0.66015625
Resulting bit stream: [0, 1, 1, 1, 0, 0, 0, 1, 1].
```

`([0, 1, 1, 1, 0, 0, 0, 1, 1], 0.66015625, 0.671875)`

You should be able to see how the encoder successively approaches the value.

Lets try some corner cases:

`encode1(0.5, 0, 1, 8)`

```
Successively encoding 0.5 between [0, 1]:
is <= 0.5
is > 0.25
is > 0.375
is > 0.4375
is > 0.46875
is > 0.484375
is > 0.4921875
is > 0.49609375
Resulting bit stream: [0, 1, 1, 1, 1, 1, 1, 1].
```

`([0, 1, 1, 1, 1, 1, 1, 1], 0.49609375, 0.5)`

`encode1(0, 0, 1, 8)`

```
Successively encoding 0 between [0, 1]:
is <= 0.5
is <= 0.25
is <= 0.125
is <= 0.0625
is <= 0.03125
is <= 0.015625
is <= 0.0078125
is <= 0.00390625
Resulting bit stream: [0, 0, 0, 0, 0, 0, 0, 0].
```

`([0, 0, 0, 0, 0, 0, 0, 0], 0, 0.00390625)`

`encode1(1, 0, 1, 8)`

```
Successively encoding 1 between [0, 1]:
is > 0.5
is > 0.75
is > 0.875
is > 0.9375
is > 0.96875
is > 0.984375
is > 0.9921875
is > 0.99609375
Resulting bit stream: [1, 1, 1, 1, 1, 1, 1, 1].
```

`([1, 1, 1, 1, 1, 1, 1, 1], 0.99609375, 1)`

## Bitstream decoding

With the encoders bitstream and theinitial range, the initial value can be recovered to ever greater accuracy as more bits are processed. The decoder takes the same decisions as the encoder when successively bisecting the initial range.

`def decode1(bits, rangemin, rangemax):`

` if TUTOR:`

` print(f"Successively decoding {bits} between [{rangemin}, {rangemax}]:")`

` rmin, rmax = rangemin, rangemax # start at extremes`

` for bit in bits:`

` rmid = (rmin + rmax) / 2`

` if bit == 0:`

` rmax = rmid # range lower half`

` else:`

` rmin = rmid # range upper half`

` if TUTOR:`

` print(f" in range [{rmin}, {rmax}]")`

` `

` value = (rmin + rmax) / 2 # Midrange`

` delta = abs(value - rmin)`

` if TUTOR:`

` print(f" Resulting value: {value} ±{delta}.")`

` `

` return value, delta`

Lets encode a value then decode the whole of its bitstream

`bits, _, _ = encode1(2/3, -2, 4, 15)`

`print()`

`value = decode1(bits, -2, 4) # Note use of the same initial range!`

``

```
Successively encoding 0.6666666666666666 between [-2, 4]:
is <= 1.0
is > -0.5
is > 0.25
is > 0.625
is <= 0.8125
is <= 0.71875
is <= 0.671875
is > 0.6484375
is > 0.66015625
is > 0.666015625
is <= 0.6689453125
is <= 0.66748046875
is <= 0.666748046875
is > 0.6663818359375
is > 0.66656494140625
Resulting bit stream: [0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1].
Successively decoding [0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1] between [-2, 4]:
in range [-2, 1.0]
in range [-0.5, 1.0]
in range [0.25, 1.0]
in range [0.625, 1.0]
in range [0.625, 0.8125]
in range [0.625, 0.71875]
in range [0.625, 0.671875]
in range [0.6484375, 0.671875]
in range [0.66015625, 0.671875]
in range [0.666015625, 0.671875]
in range [0.666015625, 0.6689453125]
in range [0.666015625, 0.66748046875]
in range [0.666015625, 0.666748046875]
in range [0.6663818359375, 0.666748046875]
in range [0.66656494140625, 0.666748046875]
Resulting value: 0.666656494140625 ±9.1552734375e-05.
```

## Truncating binary digits

The full 15 bits of the encoding above give a result that is correct when rounded to 5 *decimal* digits. If a smaller number of bits (from the left), are decoded then the value should be reproduced to a lesser level of precision. Lets try:

`print("# Decode the first 10 bits.")`

`value = decode1(bits[:10], -2, 4)`

`print("\n# Decode only the first 5 bits.")`

`value = decode1(bits[:5], -2, 4)`

``

```
# Decode the first 10 bits.
Successively decoding [0, 1, 1, 1, 0, 0, 0, 1, 1, 1] between [-2, 4]:
in range [-2, 1.0]
in range [-0.5, 1.0]
in range [0.25, 1.0]
in range [0.625, 1.0]
in range [0.625, 0.8125]
in range [0.625, 0.71875]
in range [0.625, 0.671875]
in range [0.6484375, 0.671875]
in range [0.66015625, 0.671875]
in range [0.666015625, 0.671875]
Resulting value: 0.6689453125 ±0.0029296875.
# Decode only the first 5 bits.
Successively decoding [0, 1, 1, 1, 0] between [-2, 4]:
in range [-2, 1.0]
in range [-0.5, 1.0]
in range [0.25, 1.0]
in range [0.625, 1.0]
in range [0.625, 0.8125]
Resulting value: 0.71875 ±0.09375.
```

# Higher dimensions

What if we have more than one named variables values to encode in a similar fashion - preserving the ability to truncate on the right, (or generate more bits), to affect precision. Well lets give each variable, which we can think of as a "dimension" a name, a range, and all the dimensions a constant *ordering*.

If we assume the same precision for each dimension and interleave the bitstream for the dimensions in encoding, then the combined interleaved bitstream could be truncated as before ; de-interleaved, then individually decoded back to generate values for each dimension.

`from collections import namedtuple`

``

`Dimension = Dim = namedtuple('Dim', 'name, rangemin, rangemax')`

``

`dimensions = (Dim('s', -4, 4), # In order s then t`

` Dim('t', -5, 5))`

`values = ( -1.5, 2.7) # In order s then t`

``

`def bisect(value, rmin, rmax):`

` rmid = (rmin + rmax) / 2`

` if value <= rmid:`

` bit = 0 # push 0`

` rmax = rmid # range lower half`

` else:`

` bit = 1 # push 1`

` rmin = rmid # range upper half `

` return rmin, rmax, bit`

``

`def encode2(values, dimensions, bitcount_combined):`

` for value, (name, rangemin, rangemax) in zip(values, dimensions):`

` assert rangemin <= value <= rangemax, f"{name} value is out of range!"`

` search = [[value, rmin, rmax] # start at extreme`

` for value, (_, rmin, rmax) `

` in zip(values, dimensions)]`

` dcount = len(dimensions)`

` result = [] # One result for all dimensions`

` for i in range(bitcount_combined):`

` d = search[i % dcount] # dimension to interleave`

` d[1], d[2], bit = bisect(*d)`

` result.append(bit)`

` return result`

``

`def decode2(bits, dimensions):`

` search = [[rmin, rmax] # start at extreme`

` for _, rmin, rmax in dimensions]`

` dcount = len(dimensions)`

` for i, bit in enumerate(bits):`

` rmin, rmax = search[i % dcount] # dimension to decode`

` rmid = (rmin + rmax) / 2`

` if bit == 0:`

` rmax = rmid # range lower half`

` else:`

` rmin = rmid # range upper half`

` search[i % dcount] = [rmin, rmax]`

` values = [((rmin + rmax) / 2, abs((rmin - rmax) / 2))`

` for rmin, rmax in search]`

` return values # [(value, delta), ...]`

``

`print("Encode then decode",values)`

`for val, delta in decode2(encode2(values, dimensions, 30), dimensions):`

` print(f" {val} ±{delta:g}")`

` `

```
Encode then decode (-1.5, 2.7)
-1.5001220703125 ±0.00012207
2.700042724609375 ±0.000152588
```

Again, we can truncate and take the leftmost 14 bits of the encoding to decode to less precision in *both* values

`for val, delta in decode2(encode2(values, dimensions, 30)[:14], dimensions):`

` print(f" {val} ±{delta:g})")`

```
-1.53125 ±0.03125)
2.6953125 ±0.0390625)
```

# A closer look at interleaving.

Create separate bitstreams for two dimensions and compare to that of the combined.

`# Recap`

`print("Dimensions:", dimensions)`

`print("Values:", values)`

```
Dimensions: (Dim(name='s', rangemin=-4, rangemax=4), Dim(name='t', rangemin=-5, rangemax=5))
Values: (-1.5, 2.7)
```

`s_bits, *_ = encode1(-1.5, -4, 4, 7)`

```
Successively encoding -1.5 between [-4, 4]:
is <= 0.0
is > -2.0
is <= -1.0
is <= -1.5
is > -1.75
is > -1.625
is > -1.5625
Resulting bit stream: [0, 1, 0, 0, 1, 1, 1].
```

`t_bits, *_ = encode1(2.7, -5, 5, 7)`

```
Successively encoding 2.7 between [-5, 5]:
is > 0.0
is > 2.5
is <= 3.75
is <= 3.125
is <= 2.8125
is > 2.65625
is <= 2.734375
Resulting bit stream: [1, 1, 0, 0, 0, 1, 0].
```

`def alternate(list1, list2):`

` "zip up the members of the two lists, list1[0] first"`

` return sum((list(m1m2) for m1m2 in zip(list1, list2)), [])`

`st_bits = alternate(s_bits, t_bits)`

`print(st_bits)`

```
[0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0]
```

`st_encode2 = encode2(values, dimensions, 14)`

`st_encode2`

`[0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0]`

`## OR`

`bits_per_dim = bpd = 7`

`assert encode2(values, dimensions, bpd * 2) == alternate(encode1(-1.5, -4, 4, bpd)[0], `

` encode1(2.7, -5, 5, bpd)[0])`

`print("OK")`

```
Successively encoding -1.5 between [-4, 4]:
is <= 0.0
is > -2.0
is <= -1.0
is <= -1.5
is > -1.75
is > -1.625
is > -1.5625
Resulting bit stream: [0, 1, 0, 0, 1, 1, 1].
Successively encoding 2.7 between [-5, 5]:
is > 0.0
is > 2.5
is <= 3.75
is <= 3.125
is <= 2.8125
is > 2.65625
is <= 2.734375
Resulting bit stream: [1, 1, 0, 0, 0, 1, 0].
OK
```

# On precision.

The code works to the precision of the Python datatypes passed into it. The ranges so far have used Pythons 64 bit floating point datatype for range and value. If decimals or fractions are used then much more precision could be achieved with the same code due to duck typing.

# Human readability

`st_bits`

`[0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0]`

These strings of binary bits are hard for us to accurately remember and transcribe. Traditionally we have used octal, base-8; and hexadecimal, base 16 number bases to get shorter, more memorable, representations for binary numbers.

I am going to use a`base 32`

numbering scheme to encode the bits. Every five successive bits will become one of 32 different characters. The characters used are the digits zero to nine then 22 lowercase ASCII letters missing out certain letters, (a, i, l, and o), that could be mistaken for numbers.

Users will more naturally express precision in terms of the number of characters in the representation and so the underlying bit precision will be in multiples of 5-bits.

`ch32 = "0123456789bcdefghjkmnpqrstuvwxyz"`

``

`def n2bitlist(n, _width=5):`

` return tuple((1 if x == '1' else 0) for x in f"{n:0{_width}b}")`

``

`# Map binary tuple to code character`

`bool2ch = {n2bitlist(i): ch for i, ch in enumerate(ch32)} `

`ch2bool = {ch: list(b) for b, ch in bool2ch.items()}`

``

`def encode(values, dimensions, code_len):`

` bits = encode2(values, dimensions, code_len * 5)`

` # To base 32`

` b = tuple(bits)`

` #print('Bits:', b)`

` code = (bool2ch[b[i * 5: (i + 1) * 5]] for i in range(code_len)) `

` return ''.join(code)`

``

`def decode(code, dimensions):`

` bits = sum((ch2bool[ch] for ch in code), [])`

` values = decode2(bits, dimensions)`

` return values`

`# Recap`

`print("Dimensions:", dimensions)`

`print("Values:", values)`

```
Dimensions: (Dim(name='s', rangemin=-4, rangemax=4), Dim(name='t', rangemin=-5, rangemax=5))
Values: (-1.5, 2.7)
```

`st_code = encode(values, dimensions, 3)`

`print('Values encode to %r' % st_code)`

```
Values encode to 'f2x'
```

`print("Then decode to:")`

`for val, delta in decode(st_code, dimensions):`

` print(f" {val} ±{delta:g}")`

```
Then decode to:
-1.515625 ±0.015625
2.6953125 ±0.0390625
```

# Geohashes

Geohashes are a convenient way of encoding geographical coordinates as a character string.

I have used the same textual mapping so that the following dimensional data will reproduce Geohash en/decoding.

`geo = (Dim('latitude', -90, 90), # In order latitude then longitude.`

` Dim('longitude', -180, 180))`

``

Unfortunately Geohashes, whilst specifying coordinates in the order latitude then longitude; alternate bits from longitude and then latidude - i.e. in reverse, so adjustments are needed.

`def encode3(values, dimensions, bitcount_combined, alternate_from_last=True):`

` for value, (name, rangemin, rangemax) in zip(values, dimensions):`

` assert rangemin <= value <= rangemax, f"{name} value is out of range!"`

` search = [[value, rmin, rmax] # start at extreme`

` for value, (_, rmin, rmax) `

` in zip(values, dimensions)]`

` dcount = len(dimensions)`

` result = [] # One result for all dimensions`

` for i in range(bitcount_combined):`

` dim = (dcount - 1 -(i % dcount)) if alternate_from_last else i % dcount`

` d = search[dim] # dimension to interleave`

` d[1], d[2], bit = bisect(*d)`

` result.append(bit)`

` return result`

``

`def decode3(bits, dimensions, alternate_from_last=True):`

` search = [[rmin, rmax] # start at extreme`

` for _, rmin, rmax in dimensions]`

` dcount = len(dimensions)`

` for i, bit in enumerate(bits):`

` dim = (dcount - 1 - (i % dcount)) if alternate_from_last else i % dcount`

` rmin, rmax = search[dim] # dimension to decode`

` rmid = (rmin + rmax) / 2`

` if bit == 0:`

` rmax = rmid # range lower half`

` else:`

` rmin = rmid # range upper half`

` search[dim] = [rmin, rmax]`

` values = [((rmin + rmax) / 2, abs((rmin - rmax) / 2))`

` for rmin, rmax in search]`

` return values # [(value, delta), ...]`

``

`def geo_encode(values, code_len):`

` bits = encode3(values, geo, code_len * 5)`

` # To base 32`

` b = tuple(bits)`

` #print('Bits:', b)`

` code = (bool2ch[b[i * 5: (i + 1) * 5]] for i in range(code_len)) `

` return ''.join(code)`

``

`def geo_decode(code):`

` bits = sum((ch2bool[ch] for ch in code), [])`

` values = decode3(bits, geo)`

` return values`

`coord = (51.433718, -0.214126)`

`geo_hash = geo_encode(coord, 6)`

`geo_hash`

`'gcpue5'`

`latlong = geo_decode(geo_hash)`

`print(f"Geohash '{geo_hash}' decodes to:")`

`for (val, delta), (name, *_) in zip(geo_decode(geo_hash), geo):`

` print(f" {name:10}: {val} ±{delta:g}")`

```
Geohash 'gcpue5' decodes to:
latitude : 51.43524169921875 ±0.00274658
longitude : -0.2142333984375 ±0.00549316
```

## External sites for testing Geohashes

You can enter the coordinates or the geohash above into this site and find that it points to the "Wimpledon All England Lawn Tennis and Croquet Club".

# End Notes.

This blog post was written because the WIkipedia entry and a web search did nt turn up a good description of Geohashes for me to finish the Python Rosetta Code entry initially. After more reading I finally completed the Python entry.

I thought about generalising the method to more than two dimensions and also wanted to highlight the swithing of order in geohashing: lat/long normal coord order vs long/lat used when alternating the bits from the binary search.

Another generalisation might be to use n-ary search for a value producing a base-n "bit"; but I could not think of a need, and this post is already long.

# END.

Not saved yet

python3 | launching

## No comments:

## Post a Comment