Questions: Function composition and type hints #1

Closed
datnamer opened this issue Feb 23, 2018 · 7 comments

Comments

@datnamer

datnamer commented Feb 23, 2018

Can the map function be overloaded with a kernel from a language like numba? It seems that function composition is a more general form of broadcasting and map.

Also, can kernels be selected based on other host type systems besides ndtypes? @sklam talked about translation, but it seems like PEP 484, protocols, etc. have more granularity than ndtypes, so you'd be losing data.

Otherwise we have yet another array type system in Python, when there is some talk of standardizing on mypy stuff: python/typing#516

@datnamer datnamer changed the title Questions Function composition and map Feb 23, 2018
@datnamer datnamer changed the title Function composition and map Questions: Function composition and type hints Feb 23, 2018
@skrah
Member

skrah commented Feb 23, 2018

Indeed, new kernels can be dynamically inserted into the lookup table. One of the goals is to add kernel jit compilation + insertion to Numba.
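
As a rough illustration of kernels being inserted into a lookup table at runtime, here is a minimal pure-Python sketch. The names `KERNELS`, `register_kernel` and `call` are hypothetical and are not the gumath API; type names are plain strings standing in for ndtypes signatures.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: (function name, argument type names) -> kernel.
Signature = Tuple[str, Tuple[str, ...]]
KERNELS: Dict[Signature, Callable] = {}


def register_kernel(name: str, arg_types: Tuple[str, ...], kernel: Callable) -> None:
    """Insert (or replace) the kernel for one concrete signature at runtime."""
    KERNELS[(name, arg_types)] = kernel


def call(name: str, arg_types: Tuple[str, ...], *args):
    """Look up the kernel for the exact signature and apply it."""
    try:
        kernel = KERNELS[(name, arg_types)]
    except KeyError:
        raise TypeError(f"no kernel for {name}{arg_types}") from None
    return kernel(*args)


# The callable registered here could just as well come from a JIT compiler.
register_kernel("add", ("int64", "int64"), lambda a, b: a + b)
result = call("add", ("int64", "int64"), 2, 3)
```

The table only knows exact signatures, so a call with an unregistered signature fails until someone inserts a kernel for it.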

Function composition is a bit of an orthogonal issue: for JIT-compiled functions it seems less useful (just write a new function); for precompiled C kernels it could be added, but making it fast without temporary xnd containers is not trivial.
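
To make the "temporary containers" point concrete, here is a small sketch that uses plain Python lists as a stand-in for xnd containers. All function names are hypothetical. It contrasts composing two mapped kernels through an intermediate container with applying a fused function in a single pass.

```python
from typing import Callable, Iterable, List


def map_kernel(f: Callable, xs: Iterable) -> List:
    """Apply one elementwise kernel, materializing a full output container."""
    return [f(x) for x in xs]


def compose_with_temporary(f: Callable, g: Callable, xs: Iterable) -> List:
    """Two passes: g writes a temporary container, then f reads it."""
    tmp = map_kernel(g, xs)
    return map_kernel(f, tmp)


def compose_fused(f: Callable, g: Callable, xs: Iterable) -> List:
    """One pass: each element goes through g and f with no temporary."""
    return map_kernel(lambda x: f(g(x)), xs)


xs = [1, 2, 3]
inc = lambda x: x + 1
dbl = lambda x: x * 2
unfused = compose_with_temporary(dbl, inc, xs)
fused = compose_fused(dbl, inc, xs)
```

In interpreted Python the fused form is easy to write. Doing the same efficiently with precompiled C kernels is the hard part, and this sketch does not attempt it.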

A couple of points about PEP 484:

  1. Mypy is more of a powerful linter than a real type checker. We need 100% accuracy, which mypy does not provide.

  2. Datashape is far more suitable for low-level C types than PEP 484, which does not mention them at all.

  3. Datashape types include array sizes, which makes them dependent types. Static array bounds checking in general is undecidable, so mypy cannot help, even if PEP 484 did include C types.

@skrah
Member

skrah commented Feb 23, 2018

More points about PEP 484:

  4. Datashape types (ndt_t) contain full memory access information, including alignment. They are used to traverse memory much in the way a compiler generates code based on the types of data structures. Thus, the types need to be a) in C to be fast and b) not bloated so the code is readable.
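
The idea of a type that drives memory traversal can be sketched in a few lines of Python. `ArrayType` below is a hypothetical, heavily simplified stand-in for a datashape type: it carries only an element format, a shape and byte strides, and does not model alignment or the real ndt_t layout.

```python
import struct
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ArrayType:
    """Hypothetical, simplified array type: element format, shape, byte strides."""
    fmt: str                   # struct format of one element, e.g. "<q" for int64
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]   # byte step per dimension


def read(buf: bytes, t: ArrayType, offset: int = 0, dim: int = 0):
    """Walk the buffer using only the information stored in the type."""
    if dim == len(t.shape):
        return struct.unpack_from(t.fmt, buf, offset)[0]
    return [
        read(buf, t, offset + i * t.strides[dim], dim + 1)
        for i in range(t.shape[dim])
    ]


# Six little-endian int64 values in one flat buffer.
data = struct.pack("<6q", 0, 1, 2, 3, 4, 5)
row_major = ArrayType("<q", (2, 3), (24, 8))
transposed = ArrayType("<q", (3, 2), (8, 24))

rows = read(data, row_major)    # [[0, 1, 2], [3, 4, 5]]
cols = read(data, transposed)   # [[0, 3], [1, 4], [2, 5]]
```

The same bytes yield two different views purely because the type differs, which is why the type has to carry the complete access information.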

@datnamer
Author

datnamer commented Feb 23, 2018

Thanks for your replies.

Sure. The other problem, aside from standards proliferation, is that many real programs have abstraction and type hierarchy, rather than just a collection of functions.

For example, the distribution hierarchy in PyMC3.

I don't see how one can do fast type hierarchy programming (using dispatch to organize code on a type lattice) with ndtypes and gumath. It seems like somehow interfacing with PEP 484 (and later typing PEPs that include variable annotations, not mypy) is one way to do this.

Regarding the obstacles you mentioned:

  • the type hints don't have to use mypy; the annotations are available for any program to use.
  • PEP 544 protocols can map to type kinds, and lower-level types can be written to represent concrete ndt types (i.e. write an int64 class). Then the int64 implements the int abstract protocol, the right kernel is called and numba can generate code at function call time for the concrete argument. Then someone else can call it with int32.

Same with a bayesian package with functions over "abstractparametricdistribution" which I will call with "myconcretedistribution".

  • there is talk of array sizes being added to typing (static checking with mypy is different).
  • alignment etc. could be a keyword

I understand you aren't re-implementing an object-oriented type system; I'm just giving feedback on how this would work with my use cases, where I'd want to use it in the host language.

@skrah
Member

skrah commented Feb 23, 2018

Gufuncs are multimethods. If f() should take {int16, int32, ...}, one has to add kernels for all signatures.

Datashape also had the Signed kind for all signed integers, so if a function has that signature, Numba could possibly generate kernels on the fly.
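
A minimal sketch of that idea: a multimethod with exact-signature kernels plus a kind-based template that produces missing kernels on first use. The `Gufunc` class, the `KINDS` table and the membership of `Signed` are assumptions made for illustration, and `make_double` stands in for a JIT compiler.

```python
from typing import Callable, Dict, List, Tuple

Sig = Tuple[str, ...]

# Assumed membership of the "Signed" kind for this sketch.
KINDS = {"Signed": {"int8", "int16", "int32", "int64"}}


def matches(pattern: str, concrete: str) -> bool:
    """A pattern matches itself or any member of the kind it names."""
    return pattern == concrete or concrete in KINDS.get(pattern, ())


class Gufunc:
    """Hypothetical multimethod: one kernel per concrete signature."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.kernels: Dict[Sig, Callable] = {}
        self.templates: List[Tuple[Sig, Callable[[Sig], Callable]]] = []

    def add_kernel(self, sig: Sig, kernel: Callable) -> None:
        self.kernels[sig] = kernel

    def add_template(self, kind_sig: Sig, factory: Callable[[Sig], Callable]) -> None:
        self.templates.append((kind_sig, factory))

    def select(self, sig: Sig) -> Callable:
        if sig in self.kernels:                 # exact match wins
            return self.kernels[sig]
        for kind_sig, factory in self.templates:
            if len(kind_sig) == len(sig) and all(
                matches(k, t) for k, t in zip(kind_sig, sig)
            ):
                kernel = factory(sig)           # generate on the fly
                self.kernels[sig] = kernel      # cache for later calls
                return kernel
        raise TypeError(f"{self.name}: no kernel for {sig}")


generated: List[Sig] = []


def make_double(sig: Sig) -> Callable:
    """Stand-in for a JIT compiler: records which signatures it was asked for."""
    generated.append(sig)
    return lambda x: 2 * x


double = Gufunc("double")
handwritten = lambda x: x + x
double.add_kernel(("int16",), handwritten)
double.add_template(("Signed",), make_double)
```

Without the template, every signature in {int16, int32, ...} would need its own `add_kernel` call. With it, `double.select(("int32",))` builds and caches a kernel the first time it is requested.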

I'm not sure what you mean by "fast type hierarchy programming". Selecting the multimethod is done purely by switch statements on the C level; using __annotations__ on the Python level should be much slower than that.

If people insist on shoehorning datashape into the (IMO bulky) PEP 484 syntax, this is a start:

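from ndtypes import ndt

# Abstract kind: stands for any signed integer type.
class NdtInt(object):
    def __init__(self):
        self.t = ndt("Signed")

# Concrete 64-bit signed integer type.
class NdtInt64(object):
    def __init__(self):
        self.t = ndt("int64")

# Tuple type assembled from the datashape strings of its fields.
class NdtTuple(object):
    def __init__(self, *args):
        s = '(' + ', '.join(str(x.t) for x in args) + ')'
        self.t = ndt(s)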

But I don't think that such a syntax is appropriate for huge nested types like the Lahman database:

http://matthewrocklin.com/blog/work/2014/11/19/Blaze-Datasets

@datnamer
Author

datnamer commented Feb 23, 2018

It's not, but I'm not talking about a data use case, which is already perfectly defined. How would the user write a custom type? How would you deal with the distribution example I provided, or with defining gufuncs on something like the PyMC type hierarchy: https://github.com/pymc-devs/pymc3/blob/master/pymc3/distributions/continuous.py

Maybe this sort of thing is not in scope of these packages, in which case this issue can be closed.

@skrah
Member

skrah commented Mar 30, 2018

@datnamer Here's a concrete example for defining new functions and types:

#4

@teoliphant
Member

teoliphant commented Mar 30, 2018 via email


3 participants