uuid and random need return different value in different row #10247

Closed
liukun4515 opened this issue Apr 26, 2024 · 3 comments · Fixed by #10248 or #10193
Labels
good first issue Good for newcomers physical-expr Changes to the physical-expr crates

Comments

@liukun4515
Contributor

I think the expectation here is that `rand` is invoked *once per row* rather than *once per batch*, and the only way it knew how many rows to produce was to be passed a null array 🤔

For example, when I run datafusion-cli from this PR to call random() the same value is returned for each row:

```sql
> create table foo as values (1), (2), (3), (4), (5);
0 row(s) fetched.
Elapsed 0.018 seconds.

> select column1, random() from foo;
+---------+--------------------+
| column1 | random()           |
+---------+--------------------+
| 1       | 0.9594375709000513 |
| 2       | 0.9594375709000513 |
| 3       | 0.9594375709000513 |
| 4       | 0.9594375709000513 |
| 5       | 0.9594375709000513 |
+---------+--------------------+
5 row(s) fetched.
Elapsed 0.012 seconds.
```

But I expect each row to have a different value for random().

However, since none of the tests failed, clearly we have a gap in test coverage 🤔

Originally posted by @alamb in #10193 (comment)
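To make the "once per row" versus "once per batch" distinction concrete, here is a minimal std-only Rust sketch. It is not DataFusion's implementation: `XorShift64`, `random_once_per_batch`, and `random_once_per_row` are hypothetical names invented for illustration, and the small generator only stands in for a real random source so the example needs no external crates.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
/// (Illustration only; this is an assumption, not what DataFusion uses.)
struct XorShift64(u64);

impl XorShift64 {
    fn next_f64(&mut self) -> f64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        // Keep the top 53 bits and scale them into [0, 1).
        (x >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Buggy shape: draw once per batch, then repeat that value for every row.
fn random_once_per_batch(rng: &mut XorShift64, num_rows: usize) -> Vec<f64> {
    let value = rng.next_f64();
    vec![value; num_rows]
}

/// Expected shape: draw a fresh value for each row in the batch.
fn random_once_per_row(rng: &mut XorShift64, num_rows: usize) -> Vec<f64> {
    (0..num_rows).map(|_| rng.next_f64()).collect()
}

fn main() {
    let mut rng = XorShift64(0x2545_F491_4F6C_DD1D);
    println!("per batch: {:?}", random_once_per_batch(&mut rng, 5));
    println!("per row:   {:?}", random_once_per_row(&mut rng, 5));
}
```

Both functions need the row count, which matches the point in the comment above: a function with no column arguments still has to learn how many rows the batch contains.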

@jayzhan211
Contributor

@liukun4515 This should be solved in #10193

@alamb
Contributor

alamb commented Apr 26, 2024

To be clear, I think the correct thing already happens on main.

The example from #10193 (comment) was produced with intermediate changes when I reran on #10193.

Here is what happens on main (the correct thing):

```sql
DataFusion CLI v37.1.0
> create table t as values (1), (2);
0 row(s) fetched.
Elapsed 0.032 seconds.

> select random() from t;
+---------------------+
| random()            |
+---------------------+
| 0.02024777131575939 |
| 0.9330727106990677  |
+---------------------+
2 row(s) fetched.
Elapsed 0.012 seconds.

> select uuid() from t;
+--------------------------------------+
| uuid()                               |
+--------------------------------------+
| 630d1d50-1ed2-4d3c-bb04-89d338e3e59f |
| 594e03fb-b038-4a48-a6e6-e2f8f12746c1 |
+--------------------------------------+
2 row(s) fetched.
Elapsed 0.003 seconds.
```

@alamb
Contributor

alamb commented Apr 26, 2024

Here is a PR that adds a test for this case: #10248
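For readers wondering what property such a test needs to pin down, here is a rough std-only Rust sketch. It does not reproduce the test in #10248; `all_rows_distinct` is a hypothetical helper, and the inputs are simply the values copied from the two CLI sessions in this thread.

```rust
use std::collections::HashSet;

/// Returns true when every row in the column holds a different value.
/// Hypothetical helper for illustration; not part of DataFusion.
fn all_rows_distinct(column: &[&str]) -> bool {
    let unique: HashSet<&str> = column.iter().copied().collect();
    unique.len() == column.len()
}

fn main() {
    // Output of the intermediate build: the same value repeated on all five rows.
    let buggy_random = ["0.9594375709000513"; 5];

    // Output on main: a different value on each row.
    let main_random = ["0.02024777131575939", "0.9330727106990677"];
    let main_uuid = [
        "630d1d50-1ed2-4d3c-bb04-89d338e3e59f",
        "594e03fb-b038-4a48-a6e6-e2f8f12746c1",
    ];

    println!("intermediate random() distinct: {}", all_rows_distinct(&buggy_random)); // false
    println!("main random() distinct: {}", all_rows_distinct(&main_random)); // true
    println!("main uuid() distinct: {}", all_rows_distinct(&main_uuid)); // true
}
```

Because the values themselves are random, a check like this can only assert a property of the output (distinct rows), not exact values.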
