proposal: spec: generic parameterization of array sizes #44253
Thanks. I'm not quite sure about the constant expression bit; it's important to not constrain the compiler implementation. In any case I think we have our hands full with the current proposal. I'm going to put this on hold for now for consideration as a future language extension.
Thanks for giving it a look.
Look forward to working out the implications here for the compiler when the time comes. Will give it a rest for 6-9 months and see how the rest of the type parameter implementation shakes out.
Change https://golang.org/cl/301290 mentions this issue:
Detailed proposal for golang/go#44253. Change-Id: I9e37363a4d5c41de100def1222d50ff6ad09d078 GitHub-Last-Rev: 8e3bc88 GitHub-Pull-Request: #33 Reviewed-on: https://go-review.googlesource.com/c/proposal/+/301290 Reviewed-by: Ian Lance Taylor <iant@golang.org>
A few thoughts:
With regard to the last point, I don't see a huge difference between:
and:
They're both forming a new type with a size dependent on the sizes of two type parameters. The main thing that concerns me about this proposal is that there is considerable potential for code bloat if the feature is used a lot, particularly if the stenciling or GC stenciling implementation approach is used. But there's also potential for the same kind of issue even without generic arrays so maybe that's not a good argument. It's just a bit easier with arrays. For example:
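One way such bloat could arise (a purely hypothetical sketch using the proposed [...]T constraint; the doubling recursion is invented here for illustration):

```go
// Each recursive call names a new, larger array type, so an implementation
// that compiles every instantiation separately would have to emit code for
// F[T, [1]T], F[T, [2]T], F[T, [4]T], ... without bound.
func F[T any, A [...]T]() {
	F[T, [2 * len(A{})]T]()
}
```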
That particular kind of abuse could be avoided by prohibiting recursive generic functions where the type parameters of the recursive call aren't identical. I haven't yet seen a good use case for allowing that, and prohibiting it avoids a bunch of potential problems. In general, I'm +1 on this proposal as long as we can be happy that it's not open to serious abuse (e.g. computation in the type system).
I agree, but it should be noted that this is not the only kind of code you can write. In particular, you can use constant expressions over type parameters to make the type checker itself do computation. I think it's generally important to keep the type-checking of the body of a generic function independent of the type-argument. And AFAIK that's the plan, with sentences like "a generic function can use an operation if it's valid for all possible type-arguments" (Note: this is what's in the current design). However, in general, that's not necessarily feasible. A simple example of the kinds of problems we might encounter is:

    func F[A interface{ [...]T }, T any]() {
        const _ uint8 = uint8(len(A{}))
    }

This does not type-check if the array type substituted for A has a length greater than 255.
My example would thus be encoded as (using the type-list syntax for now):

    type (
        Var   interface{ type [...]struct{} }
        True  [256]struct{}
        False [0]struct{}
    )

    // (A∨¬B∨¬C) ∧ (¬A∨C∨D)
    func F[A, B, C, D Var]() {
        const (
            // for readability
            a, b, c, d       = len(A), len(B), len(C), len(D)
            notA, notB, notC = 256 ^ a, 256 ^ b, 256 ^ c
        )
        const _ = uint8((a | notB | notC) & (notA | c | d))
    }

This fails type-checking if and only if the formula is satisfiable - as long as we limit type-checking to the generic function, using the constraints on the type-parameters only. Of course, if we are willing to delay type-checking of constant expressions until instantiation, this problem goes away.
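To make the encoding concrete, consider the instantiation F[True, False, False, True] (a worked example of the arithmetic above, assuming the hypothetical constant len):

```go
// a = 256, b = 0, c = 0, d = 256; notA = 0, notB = 256, notC = 256
// a    | notB | notC = 256   // clause (A ∨ ¬B ∨ ¬C) holds
// notA | c    | d    = 256   // clause (¬A ∨ C ∨ D) holds
// 256 & 256          = 256   // uint8(256) overflows, so this instantiation fails
```

Because the body must type-check for every permitted instantiation, the existence of any such satisfying assignment forces the compiler to reject F, i.e. type-checking the generic body decides satisfiability.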
I think this is excluded by the initialization order section of the spec, which effectively limits the kinds of recursion you can do. But I'm not 100% sure. I believe this at least requires some thinking, because I do think we have some recursion at least. FTR, this is not limited to the overflow example above. So, I actually think we likely can't make these expressions constants.
@ianlancetaylor highlighting you to be sure - see my last comment, I think this requires some refinement of the generics proposal as a whole, in terms of what it means for constant expressions.
I'm not sure I understand how instantiation of generic types relates to initialization order. Isn't initialization order a runtime property? Are you suggesting that the generic type checking should have to work on the infinite set of types implied by constraints at compile time? I don't see why that has to be the case. Just to make sure I understand, is this any different without the open-ended [...] form, i.e. with:

    Var interface { type [0]struct{}, [256]struct{} }
Maybe nothing. I was thinking about stuff like this and other recursive type definitions. I'm not quite sure where in the spec this is dealt with right now. But ultimately, the computation question is off-topic right now :) I haven't found a specific problem (yet). So feel free to disregard that part :)
Yes. The problem is specific to the open-ended [...] constraint. Unfortunately, I think that form is also the entire point of this proposal.
I think the big learning of the day is that it's particularly hard to understand whether something is going to (1) overflow or (2) produce a value that is larger than the address space. As @Merovius has pointed out, we're in a real pickle if we want to determine whether an expression could overflow. Indeed, we've now discussed two different classes of errors which might occur with these generic constant expressions:

1. Overflow of the constant expression, e.g. converting a length to a type too small to hold it.
2. Values too large for the target platform's address space.

(1) is squarely a problem of the type system and is something we must deal with to make this proposal coherent. (2), however, is not a concern of the type checker but rather a concern of compilation; it is a platform-specific concern. As I see it, the type system, in a sense, does not care about the bounds of the memory space (and, relatedly, the word size used for int). So, below find my thought process and a proposal that attempts to address these problems with rules that are small, simple, and coherent with the concepts of the language.

Rejected Straw-Man: Resolve overflows upon instantiation

One straw-man proposal might be to just ignore the possibility of overflow problems until later in compilation/linking, when all instantiations are known. For a given instantiation, it's trivial to determine whether overflow occurs. This approach, however, violates a key principle of the Type Parameters Proposal as laid out in its comparison to C++: if we did this checking later and some constant expression deep in the bowels of a library changed, it could break code which had previously worked. Such changes would be hard to detect but, in a sense, would have to be considered part of the API.

Proposal: Reject typed constant expressions referencing type parameters

The succinct rule to ward off problem (1) and @Merovius's concerns about detecting overflow is to prevent a constant expression which references a type parameter from ever becoming typed. This, with perhaps minor tweaks to the spec, should work well for the motivating use case. While the language spec requires that an array length must be a constant expression "representable by int", the type checker, as far as I understand, doesn't need to care about the maximum size of an int [1]. This solution does not address memory capacity overflow problems, but, then again, we are plagued by those today [2]. As far as the type system is concerned, we may have infinite memory. This proposal would allow generic data structures but would not allow these constant expressions to be directly assigned to concrete numeric types.

The Go spec has some language describing when a constant expression becomes typed. The extension to the spec would say something to the effect of: all constant expressions which reference type parameters cannot be typed, cast to a type, used in an assignment to a typed constant, or assigned to a variable. The other additional extension would be to note that these untyped constants may still be used to size arrays. Let's explore the implications of this proposal:

    type Var interface { type [0]struct{}, [256]struct{} }

    func F[A Var]() {
        const _ = uint8(len(A{})) // illegal, cannot convert untyped constant referencing type params to uint8
        var _ uint8 = len(A{})    // illegal, cannot convert untyped constant referencing type params to uint8
        var _ = uint(len(A{}))    // legal cast but not guaranteed to be evaluated at compile time (though certainly can be)
    }

This proposal also allows a generic concatenation function (assuming we have addition over these untyped constants to size the result array).

[1] I would like to believe that we'd say that the below program type-checks in the Go programming language even though it doesn't compile on 32-bit platforms.

    func main() {
        fmt.Println(len((*[math.MaxInt32 + 1]byte)(nil)))
    }

[2] Consider the following silly program:

    type Arr[T any] interface {
        type [1]T, [math.MaxUint32]T
    }

    type Big2DArr[T any, A Arr[T]] [math.MaxUint32 >> 14]A

I did not realize when I wrote this that
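As a concrete illustration of the concatenation mentioned above, a sketch under the proposed rules (the [...]T constraint and constant len are the hypothetical extensions; Concat itself is invented here):

```go
// The untyped constant len(A{})+len(B{}) sizes the result array but is never
// converted to a typed constant, so the rule proposed above is respected.
func Concat[T any, A, B [...]T](a A, b B) [len(A{}) + len(B{})]T {
	var out [len(A{}) + len(B{})]T
	n := copy(out[:], a[:])
	copy(out[n:], b[:])
	return out
}
```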
To repeat what I just said on slack: I only used constant overflow because it's an easy-to-trigger compilation failure. But any compilation failure that depends on an integer constant can be used for this. For example, indexing an array would be suitable, and it doesn't require a typed constant. Similarly, you could declare an array type and assign that to a statically known array type:

    type (
        Var   interface{ type True, False }
        True  [256]struct{}
        False [0]struct{}
    )

    func F[A, B, C, D Var]() {
        const (
            // for readability
            a, b, c, d       = len(A), len(B), len(C), len(D)
            notA, notB, notC = 256 ^ a, 256 ^ b, 256 ^ c
        )
        type X [(a | notB | notC) & (notA | c | d)]int
        type Y [0]int
        var _ Y = Y(X{}) // valid if and only if the formula always evaluates to false
    }

This one is especially interesting, because "use a constant expression derived from one array type's length to define another array type" is the entire point of making these lengths usable as constants. I think the core problem is that it's not entirely clear what the rule "an operation is valid if it is valid for every type fulfilling the constraints" means in many circumstances. For example, for this last example we could just reject the code no matter whether the formula is satisfiable or not - but that would violate this rule. That might be okay, but we need to make sure it's understandable in what circumstances we violate it, and for which classes of operations: the spec must clearly state that the compiler rejects this program and why. I'm also not entirely sure how we would distinguish "cases like this" from more reasonable cases, i.e. I'm not sure what differentiates them.
Indexing into an array by a constant seems like a non-issue. If the type is constrained to be the set of all array lengths, then no constant index is valid for all of those lengths. Indexing into an array by a variable is a runtime check. The casting case, however, is more interesting. Hopefully there's a nice way out there too. I'll think on it. Thanks!
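A minimal sketch of why constant indexing is a non-issue under the "valid for all type arguments" rule (hypothetical [...]T syntax; First is invented here):

```go
// [0]T satisfies the constraint, so even index 0 must be rejected when the
// generic body is checked against every possible instantiation.
func First[T any, A [...]T](a A) T {
	return a[0] // compile error under the for-all rule
}
```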
To clarify the same way I did on slack, this is the real idea:

    func F[A, B, C, D Var]() {
        const (
            // for readability
            a, b, c, d       = len(A), len(B), len(C), len(D)
            notA, notB, notC = 256 ^ a, 256 ^ b, 256 ^ c
        )
        type X [10]int
        var x X
        _ = x[(a|notB|notC)&(notA|c|d)] // valid if and only if the formula always evaluates to false
    }

The array bounds are fixed - the index is varied.
Fortunately the spec and the above constraint prevent this one too. The spec says: "the index x must be of integer type or an untyped constant". As noted above, anything that's a constant expression involving a type parameter must be an untyped constant. The spec also says: "a constant index that is untyped is given type int". We've already noted that typing an untyped expression such as this is disallowed.
I'm fairly sure it doesn't unless the return type is a slice rather than an array (which is fine - array concat is not necessarily a useful operation). Allowing array lengths as const expressions (and hence arithmetic) does seem to open the gateway to a world of potential pain, regardless of this proposal. Without that, I don't think this proposal is so useful, as you can't form any array types other than the original (for example an array of bool to mark validity of elements). In the end, this is just a performance issue and perhaps the consequent overhead of using slices rather than arrays is worth it for the avoidance of all those language issues.
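For contrast, a slice-returning version needs no arithmetic over array lengths (a sketch assuming only the [...]T constraint; ConcatSlice is invented here):

```go
// Without forming a new array type there is no way to name the result's
// length, so the result must be a slice sized at run time.
func ConcatSlice[T any, A, B [...]T](a A, b B) []T {
	out := make([]T, 0, len(a)+len(b))
	out = append(out, a[:]...)
	return append(out, b[:]...)
}
```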
@rogpeppe I think the proposal would permit the following:
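For instance, something along these lines (an illustrative sketch only, using the untyped-constant len discussed above; the type and field names are invented):

```go
// A validity mask sized to match the array type parameter, as in the
// "array of bool to mark validity of elements" example.
type Checked[T any, A [...]T] struct {
	values A
	valid  [len(A{})]bool
}
```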
Personally I tend to think that len applied to a value whose type is a type parameter should not be a constant, even when the constraint implies the type is always an array. The same applies to cap and unsafe.Sizeof. If we did this, it would make it harder to write certain kinds of code, like defining a new type that was an array whose length was the same as that of some type parameter. But I think that we can live with that.
My short term preference would be to disallow these expressions from being constants when type parameters are involved.
@ajwerner, is it not backward-compatible to change a non-constant expression to a constant of the same type? (It isn't backward-compatible to change a variable to a constant, because the variable can be reassigned or have its address taken; but a call-expression for one of these builtins can be neither.)
@bcmills that's a good point. In fact, all 3 of these functions already return typed values in constant expressions. In effect, I guess just making them non-constant expressions when involving type parameters preserves flexibility.
Yet. 👀 on #45624.
A real-world example of such a need is the rtreego library.
Hi, I'm currently trying to change several of the slices in our codebase into arrays, because we always know their size and it does not change. One of the "issues" I'm facing is that encoding/json does not behave the same with [N]byte as it does with []byte. This proposal would seem to answer my problem, but I see no way of instantiating such a generic... I see how it is usable in functions like those presented in this discussion, but say I want to have a generic array with custom marshalling/unmarshalling routines, I'd probably go with something like the following:

    type GenericArrayCustomMarshal[T interface{ [...]byte }] T

    func (arr *GenericArrayCustomMarshal[T]) UnmarshalJSON(p []byte) error {
        var b []byte
        err := json.Unmarshal(p, &b)
        if err != nil {
            return err
        }
        copy((*arr)[:], b)
        return nil
    }

except how do I instantiate a struct containing a GenericArrayCustomMarshal field? I like this proposal, but I'd say we need a way to specify the constant expression, as for types:

    // const generic only
    type GenericByteArray[const n] interface {
        ~[n]byte
    }

    // const generic + type parameter
    type GenericArray[const n, T any] interface {
        ~[n]T
    }

This would be allowed to be instantiated with any constant expression in place of n.
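A possible use of the const parameter sketched above (hypothetical syntax throughout; Checksum is invented for illustration):

```go
// Hypothetical: n is supplied as a constant expression at the call site,
// e.g. Checksum[32, [32]byte](digest), and is a compile-time constant inside
// the function, so it could also size other arrays.
func Checksum[const n, T GenericByteArray[n]](arr T) byte {
	var sum byte
	for _, b := range arr {
		sum ^= b
	}
	return sum
}
```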
First of all, you don't need the const n parameter for this. Second of all, while it's a little odd, you could instantiate it without having to pass a constant into the types by doing this:

    type CustomMarshalArray[T ~[...]E, E any] struct{ V T }

    type StructWithCustomMarshalArray struct {
        Array CustomMarshalArray[[3]byte, byte]
    }

This would be simplified if the element type E could be inferred from T.
Another use case for this is creating a higher-level language that compiles down to Go. If that language has arrays, then a very large amount of Go code has to be generated for each array type; the ability to generalize over array size could reduce the amount of generated code. BTW, the same thing goes for structs (a different topic, though): the ability to express types like "any struct" or "a struct with at least these fields" could help when Go is a compile target. But the problem there is Go's subtyping - it's structural only for interfaces, not for structs.
I don't want to side-track on this, but I would expect that for this case it would be appropriate to convert the higher-level arrays to Go slices. Most Go programs do not use arrays at all. They use slices for everything. (This doesn't mean that this issue is useless. There are reasons to use arrays rather than slices. But it is uncommon.)
@ianlancetaylor yes, and maps/slices could also be used instead of structs. But that adds performance/memory overhead.
I have a use case for this: a vector database that allows vectors of any fixed dimension.

    package main

    import (
        "fmt"
        "sort"
    )

    type vectorDB[T [...]float64] struct {
        vectors map[T]string
    }

    func NewVectorDB[T [...]float64]() *vectorDB[T] {
        return &vectorDB[T]{map[T]string{}}
    }

    func (db *vectorDB[T]) Add(vec T, text string) {
        db.vectors[vec] = text
    }

    func (db *vectorDB[T]) Lookup(vec T, ktop int) []string {
        type vecDist struct {
            v T
            d float64
        }
        d := make([]vecDist, 0, len(db.vectors))
        for v := range db.vectors {
            d = append(d, vecDist{v, calcDist(v, vec)})
        }
        sort.Slice(d, func(i, j int) bool { return d[i].d < d[j].d })
        if len(db.vectors) < ktop {
            ktop = len(db.vectors)
        }
        results := make([]string, 0, ktop)
        for _, i := range d[:ktop] {
            results = append(results, db.vectors[i.v])
        }
        return results
    }

    func calcDist[T [...]float64](a, b T) float64 {
        var d float64
        for i, av := range a {
            d += (av - b[i]) * (av - b[i])
        }
        return d
    }

    func main() {
        db := NewVectorDB[[5]float64]()
        db.Add([5]float64{1, 2, 3, 4, 5}, "apple")
        db.Add([5]float64{1, 9, 3, 4, 5}, "orange")
        fmt.Printf("Result: %s\n", db.Lookup([5]float64{9, 1, 2, 3, 4}, 1)[0])
    }
Heya there! Went a different way and created a duplicate issue with a different syntax:

    func NewRingBuffer[N [N]byte]() RingBuffer[N] {
        return RingBuffer[N]{}
    }

    const buflen = 500
    r1 := NewRingBuffer[500]()
    r2 := NewRingBuffer[buflen]()

    func getArrayPtr[N [N]byte](buf []byte, off int) *[N]byte {
        return (*[N]byte)(buf[off : off+N])
    }

    dataptr := getArrayPtr[20](r1.buf, 10)

I work on embedded systems with small memory footprints (tens of kilobytes of RAM) where memory constraints are important. I regularly need to use arrays to lower memory consumption. Being able to write reusable algorithms parameterized by array length would be great. C already does this better than Go!
See the detailed design here. With the recent acceptance of #43651, we can now begin discussing filling in the gaps and omissions of that proposal. This issue provides a proposal to fulfill one of the stated omissions in the type parameter proposal: the lack of any way to parameterize on non-type values, most obviously array lengths.
The structure of the proposal boils down to two extensions:

1. A constraint syntax meaning "an array of T of any length".
2. Treating the length of such a parameterized array type as a constant expression.
Today one can specify a type constraint that a type is an array of some specific, constant size. For example:
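For example (using the type-list constraint syntax from the design draft; byte and the length 4 are arbitrary here):

```go
type FourBytes interface {
	type [4]byte
}
```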
Similarly one can enumerate a set of array sizes:
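For instance (again with arbitrary element type and sizes):

```go
type SmallByteArray interface {
	type [4]byte, [8]byte, [16]byte
}
```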
But there's no way to capture the idea of all arrays of a given type. This proposal suggests the following syntax to capture that idea:
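Roughly (illustrative; the element type is arbitrary):

```go
type AnyByteArray interface {
	type [...]byte
}
```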
This syntax is familiar as it is used for length inference in array literals.
The second extension is to treat the length of such a parameterized array type as a constant expression:

    len([...]T)

I'm not wedded in any way to the function with this behavior actually being len. It could be some new thing like ArrayLen([...]T) in some package somewhere. Today Go provides both constant expressions which take types (see unsafe.Sizeof) and, more generally, builtin functions which take types. Secondly, the Go compiler already statically computes array lengths at compile time when it can. Constant expressions can already be used to size arrays. Taken together, one can imagine using this constant expression over some parameterized array type to size other array types.
A detailed discussion with examples, background and motivation can be found in this proposal.