Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

groupedWithin emits empty groups #1404

Closed
Sebruck opened this issue Jan 28, 2019 · 4 comments
Closed

groupedWithin emits empty groups #1404

Sebruck opened this issue Jan 28, 2019 · 4 comments
Labels

Comments

@Sebruck
Copy link
Contributor

Sebruck commented Jan 28, 2019

Hi

I use fs2 version 1.0.2 and the groupedWithin function seems not to behave as documented.

/**
    * Divide this streams into groups of elements received within a time window,
    * or limited by the number of the elements, whichever happens first.
    * Empty groups, which can occur if no elements can be pulled from upstream
    * in a given time window, **will not be emitted.**
    *
    * Note: a time window starts each time downstream pulls.
    */
  def groupWithin[F2[x] >: F[x]](n: Int, d: FiniteDuration)(

The documentation says that empty groups will not be emitted. But when numberOfElements % n == 0 then the last group is still empty.

Reproducer:

import cats.effect.{ContextShift, IO, Timer}
import fs2.Stream

import scala.concurrent.duration._

object Fs2Test extends App {

  implicit val cs: ContextShift[IO] = IO.contextShift(concurrent.ExecutionContext.global)
  implicit val timer: Timer[IO]     = IO.timer(concurrent.ExecutionContext.global)

  val numberStrings = (1 to 30).map(_.toString).toList
  Stream
    .emits(numberStrings)
    .covary[IO]
    .groupWithin(30, 1.milli)
    .map(chunk => println(chunk.size))
    .compile
    .drain
    .unsafeRunSync()

  // Output is:
  // 30
  // 0
}

Is this a bug or is the documentation miss-leading?

@SystemFw SystemFw added the bug label Jan 28, 2019
@Sebruck
Copy link
Contributor Author

Sebruck commented Jan 28, 2019

I am currently looking into this.

@SystemFw
Copy link
Collaborator

Nice, this specific issue should be in the algorithmic (non concurrent) part of the code so hopefully not too bad

@yannick-cw
Copy link

One thing I was wondering, when the groupedWithin always emits nonempty batches, why not make it something like a NonEmptyList?

@SystemFw
Copy link
Collaborator

Mostly because Chunk is optimised and very often you want to keep working with it. NonEmptyList forces a very specific access pattern. So, it's a tradeoff :) (we don't have NonEmptyChunk anymore)

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants