[BUG] Host column vector leaks when running test_cast_timestamp_to_date #10169

Closed
andygrove opened this issue Jan 9, 2024 · 1 comment · Fixed by NVIDIA/spark-rapids-jni#1689
Labels
bug Something isn't working

Comments

@andygrove
Contributor

Describe the bug


[gw0] [100%] PASSED ../../src/main/python/cast_test.py::test_cast_timestamp_to_date[DATAGEN_SEED_OVERRIDE=1704816489, INJECT_OOM] 


andy@ripper:~/git/nvidia/spark-rapids$ 24/01/09 12:48:49 ERROR HostColumnVector: A HOST COLUMN VECTOR WAS LEAKED (ID: 87)
24/01/09 12:48:49 ERROR HostColumnVector: A HOST COLUMN VECTOR WAS LEAKED (ID: 85)
24/01/09 12:48:49 ERROR HostColumnVector: A HOST COLUMN VECTOR WAS LEAKED (ID: 84)
24/01/09 12:48:49 ERROR HostColumnVector: A HOST COLUMN VECTOR WAS LEAKED (ID: 83)

Steps/Code to reproduce bug

./integration_tests/run_pyspark_from_build.sh -k test_cast_timestamp_to_date

Expected behavior
No host column vector leaks are reported.

Environment details (please complete the following information)
Local workstation.

Additional context

@andygrove andygrove added bug Something isn't working ? - Needs Triage Need team to review and classify labels Jan 9, 2024
@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Jan 9, 2024
@res-life
Collaborator

Caused by GpuTimeZoneDB; this is a duplicate of NVIDIA/spark-rapids-jni#1571.

24/01/10 03:12:43 ERROR HostColumnVector: A HOST COLUMN VECTOR WAS LEAKED (ID: 103)
24/01/10 03:12:43 ERROR MemoryCleaner: Leaked vector (ID: 103): 2024-01-10 03:12:41.0209 UTC: INC
java.lang.Thread.getStackTrace(Thread.java:1564)
ai.rapids.cudf.MemoryCleaner$RefCountDebugItem.<init>(MemoryCleaner.java:336)
ai.rapids.cudf.MemoryCleaner$Cleaner.addRef(MemoryCleaner.java:90)
ai.rapids.cudf.HostColumnVector.incRefCountInternal(HostColumnVector.java:190)
ai.rapids.cudf.HostColumnVector.<init>(HostColumnVector.java:97)
ai.rapids.cudf.HostColumnVector$ColumnBuilder.build(HostColumnVector.java:987)
ai.rapids.cudf.HostColumnVector.fromLists(HostColumnVector.java:331)
com.nvidia.spark.rapids.jni.GpuTimeZoneDB.doLoadData(GpuTimeZoneDB.java:242)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:750)

24/01/10 03:12:43 ERROR HostColumnVector: A HOST COLUMN VECTOR WAS LEAKED (ID: 101)
24/01/10 03:12:43 ERROR MemoryCleaner: Leaked vector (ID: 101): 
24/01/10 03:12:43 ERROR HostColumnVector: A HOST COLUMN VECTOR WAS LEAKED (ID: 100)
24/01/10 03:12:43 ERROR MemoryCleaner: Leaked vector (ID: 100): 
24/01/10 03:12:43 ERROR HostColumnVector: A HOST COLUMN VECTOR WAS LEAKED (ID: 99)
24/01/10 03:12:43 ERROR MemoryCleaner: Leaked vector (ID: 99): 
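
The trace above shows a vector whose reference count was incremented at construction (inside `HostColumnVector.fromLists`, called from `GpuTimeZoneDB.doLoadData`) and apparently never released. As a rough illustration of that pattern, here is a minimal, language-neutral sketch in Python. It is not the cudf or spark-rapids-jni code: `LeakTracker`, `HostBuffer`, `load_data_leaky`, and `load_data_closed` are hypothetical names invented for this example, and the printed message only imitates the shape of the real log line.

```python
import itertools


class LeakTracker:
    """Hypothetical registry that remembers every still-open resource by ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._open = {}

    def register(self, resource):
        resource_id = next(self._ids)
        self._open[resource_id] = resource
        return resource_id

    def release(self, resource_id):
        self._open.pop(resource_id, None)

    def leaked_ids(self):
        # Anything still registered at shutdown was never fully closed.
        return sorted(self._open)


class HostBuffer:
    """Hypothetical reference-counted host buffer (a stand-in, not the cudf class)."""

    def __init__(self, tracker, data):
        self._tracker = tracker
        self.data = list(data)
        self.ref_count = 1  # construction counts as the first reference
        self.id = tracker.register(self)

    def close(self):
        if self.ref_count <= 0:
            raise RuntimeError("close called too many times")
        self.ref_count -= 1
        if self.ref_count == 0:
            self._tracker.release(self.id)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


def load_data_leaky(tracker):
    # Builds two buffers and returns only a derived value; neither is closed.
    a = HostBuffer(tracker, [1, 2])
    b = HostBuffer(tracker, [3])
    return len(a.data) + len(b.data)


def load_data_closed(tracker):
    # Same work, but the context managers close both buffers on exit.
    with HostBuffer(tracker, [1, 2]) as a, HostBuffer(tracker, [3]) as b:
        return len(a.data) + len(b.data)


leaky_tracker = LeakTracker()
load_data_leaky(leaky_tracker)

clean_tracker = LeakTracker()
load_data_closed(clean_tracker)

for leaked_id in leaky_tracker.leaked_ids():
    print(f"A HOST BUFFER WAS LEAKED (ID: {leaked_id})")
```

In this model the leaky loader leaves two IDs registered while the closed variant leaves none. In Java the analogous fix would typically be a try-with-resources block or an explicit `close()` on any intermediate vectors; whether that matches the actual fix in NVIDIA/spark-rapids-jni#1689 is not stated in this thread.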


3 participants