trite.io - posts

NumPy and static type checking via Pyright

Created: February 5, 2022

Spending some more time on Advent of Code 2015 day 6 part 1. As mentioned in the last post, the goal is to make an animation of each command being applied in sequence.

I have already solved the base problem using fairly primitive types. NumPy and Pillow work together nicely, and numpy has excellent support for manipulating 2-D arrays.

Type of “[something]” is partially unknown

One of the first things I wanted to do was experiment with the conversion between numpy’s ndarray and pillow’s Image types. This is also where type constraints first become a problem for the type checker (Pyright in this case):

blah

Here’s that full error message in all its hideous glory (formatted somewhat):

Type of "asarray" is partially unknown
    Type of "asarray" is "
        Overload [
            (
                a: _SupportsArray[dtype[_SCT@asarray]] | Sequence[_SupportsArray[dtype[_SCT@asarray]]] | Sequence[Sequence[_SupportsArray[dtype[_SCT@asarray]]]] | Sequence[Sequence[Sequence[_SupportsArray[dtype[_SCT@asarray]]]]] | Sequence[Sequence[Sequence[Sequence[_SupportsArray[dtype[_SCT@asarray]]]]]]
                , dtype: None = ...
                , order: Literal['K', 'A', 'C', 'F'] | None = ...
                , *
                , like: _SupportsArray[dtype[Unknown]] | _NestedSequence[_SupportsArray[dtype[Unknown]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] = ...
            )
            -> 
            ndarray[Any, dtype[_SCT@asarray]]

            , (
                a: object
                , dtype: None = ...
                , order: Literal['K', 'A', 'C', 'F'] | None = ...
                , *
                , like: _SupportsArray[dtype[Unknown]] | _NestedSequence[_SupportsArray[dtype[Unknown]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] = ...
            )
            ->
            ndarray[Any, dtype[Any]]

            , (
                a: Any
                , dtype: dtype[_SCT@asarray] | Type[_SCT@asarray] | _SupportsDType[dtype[_SCT@asarray]]
                , order: Literal['K', 'A', 'C', 'F'] | None = ...
                , *
                , like: _SupportsArray[dtype[Unknown]] | _NestedSequence[_SupportsArray[dtype[Unknown]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] = ...
            )
            ->
            ndarray[Any, dtype[_SCT@asarray]]

            , (
                a: Any
                , dtype: dtype[Any] | type | _SupportsDType[dtype[Any]] | str | Tuple[Any, int] | Tuple[Any, SupportsIndex | Sequence[SupportsIndex]] | List[Any] | _DTypeDict | Tuple[Any, Any] | None
                , order: Literal['K', 'A', 'C', 'F'] | None = ...
                , *
                , like: _SupportsArray[dtype[Unknown]] | _NestedSequence[_SupportsArray[dtype[Unknown]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] = ...
            )
            ->
            ndarray[Any, dtype[Any]]
        ]
    " Pylance(reportUnknownMemberType)

So what’s going on here? Basically we’re working with an overloaded function/method. Following the function definition for numpy.asarray (F12 by default in VSCode) reveals the following function signatures:

overloaded function signatures

Don’t worry if both of these just seem like walls of garbage. There’s a lot to unpack, but hopefully this helps a bit:

overloaded function signatures comparison

The left side of the image above shows the function definitions; the right side shows the error message with formatting applied. What this effectively means is:

The type checker needs more information

To guarantee the correctness of the code, the type system needs some information. Combining strict typing with function overloading means the type information must be narrower. The type checker no longer has just 1 function signature to compare against, but 4, and there is very likely a lot of overlap between them.
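To make the overload idea concrete, here is a minimal sketch that has nothing to do with numpy. The function `double` is a made-up example, not part of any library. It declares two signatures with `typing.overload` and backs them with a single implementation, which is the same pattern `asarray` uses with its four signatures.

```python
from typing import Union, overload


# Each @overload stub declares one signature the type checker may pick.
@overload
def double(value: int) -> int: ...
@overload
def double(value: str) -> str: ...


def double(value: Union[int, str]) -> Union[int, str]:
    # Single runtime implementation behind both declared signatures.
    if isinstance(value, int):
        return value * 2
    return value + value


print(double(21))    # 42
print(double("ab"))  # abab
```

With only two non-overlapping signatures the checker has an easy job. The more signatures there are, and the more they overlap, the more information it needs at the call site.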

Solutions

A few solutions that come to mind:

Ignoring one line can work pretty well. It comes with drawbacks, though, and the complexity compounds the more it is used.

Potholes

Time to analogize briefly. Picture a street (or 3):

which would you rather travel

The first section of road is fully type checked. Traveling it requires less mental effort than the street with a few potholes, which in turn requires less than the third street. Ignoring even a single line like this moves us from street 1 to street 2 in the image:

from PIL import Image
import numpy as np

im = Image.new('RGB', (1000, 1000), (0, 0, 0))
arr = np.asarray(im) # type: ignore
print(arr)

Now there’s a pothole to watch out for. But it’s just 1, and it’s pretty easy to watch out for. Just don’t do something like this and you’re all set!

import numpy as np

# im = Image.new('RGB', (1000, 1000), (0, 0, 0))
im = 'blah'
arr = np.asarray(im) # type: ignore
print(arr)

Of course each thing that gets ignored must now be tracked by you, the developer. That can add up much faster than most folks realize.
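One way to keep the pothole small is to ignore only the rule that is actually firing instead of everything on the line. This is a sketch that assumes a Pyright version with rule-scoped `# pyright: ignore[...]` comments. The rule name is taken from the error message above, and the `pixels` list is a stand-in for the image so that the example does not need Pillow.

```python
import numpy as np

# Stand-in data (an assumption for this sketch, not the real image).
pixels = [[0, 0, 0], [255, 255, 255]]

# Only the named rule is suppressed here; other diagnostics on this
# line are still reported by the type checker.
arr = np.asarray(pixels, dtype=np.uint8)  # pyright: ignore[reportUnknownMemberType]
print(arr.shape)  # (2, 3)
```

The comment has no effect at runtime. It only changes what the type checker reports for that line.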

What if we could take advantage of being lazy and ignoring this complex type signature, but then add our own guarantees? Doable, though it is a bit like using ramen to fill in the pothole: technically a solution, certainly not the best one.

Carefully ignoring types

A simple wrapper function can be the ramen in this tortured analogy; we just have to mold it into the right shape:

from PIL import Image
import numpy as np
import numpy.typing as npt

def asarray(img: Image.Image) -> npt.NDArray[np.uint8]:
    return np.asarray(img) # type: ignore

im = Image.new('RGB', (1000, 1000), (0, 0, 0))
arr = asarray(im)
print(arr)

This local asarray function now forces correct type usage in pyright! VSCode will yell at us when passing the wrong type to our function:

asarray wrapper error

And works as intended with the proper type:

asarray wrapper working
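The wrapper only promises `np.uint8` to the type checker; nothing verifies it when the code runs. As a possible extension, here is a sketch of a variant that backs the annotation with a runtime check. The name `as_uint8_array` and the choice to raise `TypeError` are my own assumptions for illustration, and it accepts any object so the example runs without Pillow.

```python
from typing import cast

import numpy as np
import numpy.typing as npt


def as_uint8_array(obj: object) -> npt.NDArray[np.uint8]:
    # Hypothetical helper: still ignores the noisy signature, but checks
    # the dtype at runtime so the return annotation is actually true.
    arr = np.asarray(obj)  # type: ignore
    if arr.dtype != np.uint8:
        raise TypeError(f"expected uint8 data, got {arr.dtype}")
    return cast("npt.NDArray[np.uint8]", arr)


pixels = as_uint8_array(np.zeros((2, 2, 3), dtype=np.uint8))
print(pixels.shape)  # (2, 2, 3)

try:
    as_uint8_array("blah")  # a string array is not uint8 data
except TypeError as exc:
    print(exc)
```

The trade-off is a small cost on every call in exchange for a guarantee that does not depend on the ignored line being right.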