Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Representation for mapping atoms that can handle intersection and difference #991

Open
jclark opened this issue Apr 6, 2022 · 6 comments
Assignees
Labels

Comments

@jclark
Copy link
Contributor

jclark commented Apr 6, 2022

Section 9.1.4 (page 177) if the Frisch thesis has a way of representing records that can represent intersection and difference. There is a mentioned in footnote 29 of the April 1 version of the Castagna paper.

Here's the relevant section.

FrischRecord

@jclark
Copy link
Contributor Author

jclark commented Apr 6, 2022

Putting this into English and our terminology, this is saying that the atomic mapping representation has an additional component, E, which is a finite set of types. Each type s in E adds a constraint that there is a field f such that the name is not one of the individual field names and value of f is in s.

More precisely what this representation does is to allow the representation of mapping type to be turned into a union of these atomic types.

I don't yet grok this.

@jclark jclark added the semtype Relates to types module label Jun 9, 2022
@heshanpadmasiri heshanpadmasiri added the in progress Currently being implemented label Jul 6, 2022
heshanpadmasiri added a commit to heshanpadmasiri/nballerina that referenced this issue Jul 17, 2022
This assumes E in Frisch thesis (ref ballerina-platform#991) is empty and no distinction between readonly and non-readonly cases
@heshanpadmasiri
Copy link
Member

PDF incase of rendering issues in github

Design for negative atoms

I shall start the explanation from the de# CastagnaPaper and then extend that to the more general case in FrishThesis. In page 61 of the CastagnaPaper he provide 2 representations for negative atoms as,

  1. For open records
    Screenshot 2022-08-09 at 11 24 05

    (i.e union of records where $l_1$ is either undefined or $not(T_1)$ and $l_2$ is either undefined or $not(T_2)$).

  2. For closed records
    Screenshot 2022-08-09 at 11 24 58

    (I believe there is a typo in the last term in the paper)
    This is union of all records in which $l_1$ is of type $not(T_1)$ or undefined, records in which $l_2$ is of type $not(T_2)$ or undefined and records in which $l_1$ is $T_1$ and $l_2$ is $T_2$ and there is at least another field defined.

I believe 1 is sufficient for nBallerina as well (though we may choose to represent every thing using 2) and we can already represent this using our representation of mappings. So I'll limit the rest to extending 2 to match ballerina's requirements. Since we can already represent the first and second components of the union we shall focus only on the representation of the final component.

In the foot note 29 (in page 47) Castagna calls these records "Quasi-open records" defined as,
Screenshot 2022-08-09 at 11 26 40

My intuition here is that in order to get the set of records where there must be "not-undefined" fields other than $l_1$ to $l_n$ what we are doing is, take the set of records where where fields other than $l_1$ to $l_n$ can be either undefined or any and subtract the set of records where fields other than $l_1$ to $l_n$ is undefined.

Now let's extend this to our idea of open records where the rest must be of a given type $T_{rest}$. Let,

Screenshot 2022-08-09 at 11 27 50

where $R^*$ is the set of records where $l_1$ is $T_1$, $l_2$ is $T_2$ and there is at least another field whose type is not $T{rest}$__

Based on the foot note 29 I think this $R^*$ is (a less general version of) the $R$ in [[FrishThesis]] page 177, given as.

Screenshot 2022-08-09 at 11 28 52

If my intuition for the "Quasi-open records" is correct what we need to do to get $R^*$ is take the set of records where $l_1$ is $T_1$ , $l_2$ is $T_2$ and rest is any and subtract from it set of all records where rest is $T_{rest}$. That is in terms of $R$ we will have $t_0$ equal to any and $E =$ { $T_{rest}$ }.

Example

As an example lets consider not {| byte x; byte y; ...byte|}. This is equivalent to,

{| not(byte) x|} | {|not(byte) y|} | ({| byte x; byte y; ...any |} - { any x; any y; ...byte});

That is the union of records where x is not byte or undefined (I assume not to include undefined in ballerina) , y is not byte or undefined and records where x is byte and y is byte and there is at least one field who is not byte. (That is $R(t; any; byte)$)

Implementation details

In the previous section I limited the usage of $R$ only to places where we can not represent what we need without $R$. But based on theorem 9.5 in [[FrishThesis]] we should be able to represent all records as a union of $R$. (How to do this is not entirely clear to me at the moment). Therefore in the actual implementation I think it makes more sense to unify the representation of all records under $R$ (similar to how we have unified the representation of open and closed records under open records).

@heshanpadmasiri
Copy link
Member

heshanpadmasiri commented Aug 10, 2022

Record difference worked examples

I will be using the notation { T_1 l_1; ...; T_n l_n; [T_0] + } to represent sets where there must be at least one field (that is not undefined) other than l_1 ... l_n of type T_0. If there is no T_0 that means any.

Examples verified by Cduce

type R1 {| 1|2 x; 1|2 y |};
type R2 {| 1 x; 1 y |};

not(R1) = { not(1|2) x? } | { not(1|2) y? } | { 1|2 x; 1|2 y; + };
not(R2) = { not(1) x? } | { not(1) y? } | { 1 x; 1 y; + };

R1 & not(R2) = {| 2 x; 1|2 y |} | {| 1|2 x; 2 y |};
R2 & not(R1) = NEVER;
type R1 { 1|2 x; 1|2 y };
type R2 { 1 x; 1 y };

not(R1) = { not(1|2) x? } | { not(1|2) y? }
not(R2) = { not(1) x? } | { not(1) y? }

R1 & not(R2) = { 2 x; 1|2 y } | { 1|2 x; 2 y };
R2 & not(R1) = NEVER;
type R1 {| 1|2 x; 1|2 y |};
type R1_open { 1|2 x; 1|2 y };

not(R1) = { not(1|2) x? } | { not(1|2) y? } | { 1|2 x; 1|2 y; + };
not(R1_open) = { not(1|2) x? } | { not(1|2) y? }

R1 & not(R_open) = NEVER
R1_open & not(R1) = { 1|2 x; 1|2 y; + };
type R1 {| 1|2 x; 1 y |};
type R2 {| 1 x; 1|2 y |};

not(R1) { not(1|2) x? } | { not(1) y? } | { 1|2 x; 1 y; + };
not(R2) { not(1) x? } | { not(1|2) y? } | { 1 x; 1|2 y; + };

R1 & not(R2) = {| 2 x; 1 y |}
R2 & not(R1) = {| 1 x; 2 y |}
type R1 { 1|2 x; 1 y };
type R2 { 1 x; 1|2 y };

not(R1) { not(1|2) x? } | { not(1) y? };
not(R2) { not(1) x? } | { not(1|2) y? };

R1 & not(R2) = { 2 x; 1 y }
R2 & not(R1) = { 1 x; 2 y }

Examples worked by hand

type R1 {| 1|2 x; 1|2 y; 1|2... |};
type R2 {| 1 x; 1 y; 1... |};

not(R1) = { not(1|2) x? } | { not(1|2) y? } | { 1|2 x; 1|2 y; (not(1|2))+ };
not(R2) = { not(1) x? } | { not(1) y? } | { 1 x; 1 y; (not(1))+ };

R1 & not(R2) = {| 2 x; 1|2 y; ...1|2 |} | {| 1|2 x; 2y; ...1|2 |} | { 1 x; 1 y; 2+ };

R2 & not(R1) = NEVER;
type R1 {| 1|2 x; 1 y; 3|4... |};
type R2 {| 1 x; 1|2 y; 3|5... |};

not(R1) { not(1|2) x? } | { not(1) y? } | { 1|2 x; 1 y; (not(3|4))+ };
not(R2) { not(1) x? } | { not(1|2) y? } | { 1 x; 1|2 y; (not(3))+ };

R1 & not(R2) = {| 2 x; 1 y; 3|4... |} | { 1 x; 1 y; 4+ }
R2 & not(R1) = {| 1 x; 2 y; 3|4... |} | { 1 x; 1 y; 5+ }
type R1 {| 1|4 x; 2|4 y; 1|2|3... |};
type R2 {| 1|5 x; 2|6 y; 5|6|7... |};

not(R1) { not(1|4) x? } | { not(2|4) y? } | { 1|4 x; 2|4 y; (not(1|2|3))+ };
not(R2) { not(1|5) x? } | { not(2|6) y? } | { 1|5 x; 2|6 y; (not(1|5|6))+ };

R1 & not(R2) = {| 4 x; 2|4 y; 1|2|3... |} | {| 1|4 x; 4 y; 1|2|3...|} | { 1 x; 2 y; 2|3 +}
R2 & not(R1) = {| 5 x; 2|6 y; 1|5|6 ...|} | | {| 1|5 x; 6 y; 1|5|6 ...|} | {1 x; 2 y; 5|6 + } 

@heshanpadmasiri
Copy link
Member

heshanpadmasiri commented Aug 15, 2022

Step by step worked examples

Open records (from 2nd example above)

  • For a given open record { t1 l1; ...; tn ln } we shall get the negative by iterating over each required field, picking the record type where that field is either missing or not of the expected type (i.e { not(t_i) l_i? }) and taking the union.
not ({ t1 l1; ...; tn ln }) = { not(t1) l1? } | ... | { not(tn) ln? }`
  • Note that this union covers all records where required fields (of the original non negative record) taking different types or missing.
  • Example
    type R1 { 1|2 x; 1|2 y};
    not(R1) = { not(1|2) x? } | { not(1|2) y? };
    
    type R2 { 1 x; 1 y};
    not(R2) = { not(1) x? } | { not(1) y? };
    
  • Then to get R1 \ R2 we shall use the relation A \ B = A & not(B)
R1 \ R2 = R1 & not(R2);

R1 & not(R2) = { 1|2 x; 1|2 y } & ({ not(1) x? } | { not(1) y? });
             = ({ 1|2 x; 1|2 y } & { not(1) x? }) | ({ 1|2 x; 1|2 y; } & ({ not(1)? y })) // using intersection distributive over union
             = { 2 x; 1|2 y } | { 1|2 x; 2 y }
  • Similarly for R2 \ R1
R2 & not(R1) = { 1 x; 1 y } & ({ not(1|2) x? } | { not(1|2) y? });
             = ({ 1 x; 1 y } & { not(1|2) x? }) | ({ 1 x; 1 y } & { not(1|2) y? })
             = NEVER | NEVER // if intersectio of one named field is NEVER then resulting record is NEVER
             = NEVER

Closed records (from example 1 above)

  • For a given closed record {| t1 l1; ...; tn ln |}, all we have to do is add to the union we get from the negative of the corresponding open record type ({ t1 l1; ...; tn ln }), record type where we have at least one more field outside of l1, ..., ln; (Note that we have to add only one extra type since all the other possible types where required fields taking a different type or missing (including ons with extra fields) are already covered in the open record case)
  • Example
type R1 {| 1|2 x; 1|2 y |};
not(R1) = { not(1|2) x? } | { not(1|2) y? } | { 1|2 x; 1|2 y; + };
type R2 {| 1 x; 1 y |};
not(R2) = { not(1) x? } | { not(1) y? } | { 1 x; 1 y; + };
  • Rest is same as the open records case
R1 & not(R2) = {| 1|2 x; 1|2 y |} & ({ not(1) x? } | { not(1) y? } | { 1 x; 1 y; + })
             = ({| 1|2 x; 1|2 y |} & { not(1) x? }) | ({| 1|2 x; 1|2 y |} & { not(1) y? }) |  ({| 1|2 x; 1|2 y |} & { 1 x; 1 y; + })
             = {| 2 x; 1|2 y |} | {| 1|2 x; 2 y |} | NEVER // last intersection require at least one more field than R1
             = {| 2 x; 1|2 y |} | {| 1|2 x; 2 y |} 

R2 & not(R1) = {| 1 x; 1 y |} & ({ not(1|2) x? } | { not(1|2) y? } | { 1|2 x; 1|2 y; + })
             = ({| 1 x; 1 y |} & { not(1|2) x? }) | ({| 1 x; 1 y |} & { not(1|2) y? }) | ({| 1 x; 1 y |} & { 1|2 x; 1|2 y; + })
             = NEVER | NEVER | NEVER
             = NEVER

Controlled open records (last of the examples worked by hand)

  • This is same as the closed record case with the only difference being for the record we are adding to the union (from the open record case) extra field must take a type other than the allowed rest type
  • We need this additional restriction because otherwise (in closed records) additional field can take the rest type.
  • Example
type R1 {| 1|4 x; 2|4 y; 1|2|3 ... |};
not(R1) = { not(1|4) x? } | { not(2|4) y? } | { 1|4 x; 2|4 y; not(1|2|3) + }

type R2 {| 1|5 x; 2|6 y; 1|5|6 ... |};
not(R2) = { not(1|5) x? } | { not(2|6) y? } | {| 1|5 x; 2|6 y; not(1|5|6) + |}

R1 & not(R2) = {| 1|4 x; 2|4 y; 1|2|3... |} & ({ not(1|5) x? } | { not(2|6) y? } | {| 1|5 x; 2|6 y; not(1|5|6) + |})
             = ({| 1|4 x; 2|4 y; 1|2|3... |} & { not(1|5) x? }) | ({| 1|4 x; 2|4 y; 1|2|3... |} & { not(2|6) y? }) | ({| 1|4 x; 2|4 y; 1|2|3... |} & {| 1|5 x; 2|6 y; not(1|5|6) + |})
             = {| 4 x; 2|4 y; 1|2|3... |} | {| 1|4 x; 4 y; 1|2|3...|} | { 1 x; 2 y; 2|3 +}
R2 & not(R1) = {| 1|5 x; 2|6 y; 1|5|6 ... |} & ({ not(1|4) x? } | { not(2|4) y? } | { 1|4 x; 2|4 y; not(1|2|3) + })
             = ({| 1|5 x; 2|6 y; 1|5|6 ... |} & { not(1|4) x? }) | ({| 1|5 x; 2|6 y; 1|5|6 ... |} & { not(2|4) y? }) | ({| 1|5 x; 2|6 y; 1|5|6 ... |} & { 1|4 x; 2|4 y; not(1|2|3) + })
             = {| 5 x; 2|6 y; 1|5|6 ...|} | | {| 1|5 x; 6 y; 1|5|6 ...|} | {1 x; 2 y; 5|6 + } 

Uniqueness of the negative atom representation

Note that this representation for the not (negative) of a record is not unique since each element in the union overlaps each other.
For example taking not { int l1; string l2; float l3} = { not(int) l1? } | { not(string) l2? } | { not(float) l3? }. { string l1; string l3} is a subtype of each each element of the union. It should be possible to define this union without such overlapping. However how to do this a more general way is not clear to me at the moment.

@jclark
Copy link
Contributor Author

jclark commented Aug 17, 2022

Thanks for the explanation. It's not clear whether this representation will be helpful for us.

@jclark
Copy link
Contributor Author

jclark commented Aug 17, 2022

I think we should put this on pause for now.

@jclark jclark removed the in progress Currently being implemented label Aug 17, 2022
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants