ERROR: KeyError: key Union{} not found #129

Open
ohadle opened this issue Dec 30, 2020 · 2 comments

ohadle commented Dec 30, 2020

I'm trying out writing a parquet file:

using CSV, DataFrames, Parquet
using PyCall  # provides pyimport
pd = pyimport("pandas")
download("https://nyc-tlc.s3.amazonaws.com/trip+data/green_tripdata_2019-12.csv",
    "test_data.csv")
df = CSV.File("test_data.csv") |> DataFrame

Now, following the writer example, I do:

write_parquet("test_data.parquet", df)

And get:

ERROR: KeyError: key Union{} not found
Stacktrace:
 [1] getindex at ./dict.jl:467 [inlined]
 [2] write_col(::IOStream, ::SentinelArrays.MissingVector, ::String, ::Int32, ::Int32; nchunks::Int64) at /Users/ohad/.julia/packages/Parquet/h8mm5/src/writer.jl:369
 [3] _write_parquet(::Tables.Columns{DataFrame}, ::Array{Symbol,1}, ::String, ::Int64; ncols::Int64, encoding::Dict{String,Int32}, codec::Dict{String,Int32}) at /Users/ohad/.julia/packages/Parquet/h8mm5/src/writer.jl:546
 [4] write_parquet(::String, ::DataFrame; compression_codec::String) at /Users/ohad/.julia/packages/Parquet/h8mm5/src/writer.jl:503
 [5] write_parquet(::String, ::DataFrame) at /Users/ohad/.julia/packages/Parquet/h8mm5/src/writer.jl:460
 [6] top-level scope at REPL[34]:1

What am I doing wrong?

Member

tanmaykm commented Jan 5, 2021

Looks like an unsupported/unexpected column type? @xiaodaigh ?
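
One way to check for an unexpected column type is to list each column's element type. The sketch below uses a small stand-in DataFrame (the `fare` column and its values are made up for illustration; only `ehail_fee` comes from this thread). It also shows a likely origin of the `Union{}` key, assuming the writer strips `Missing` from the element type before its dictionary lookup, which the stack trace suggests but does not confirm:

```julia
using DataFrames

# Stand-in data: `fare` is hypothetical, `ehail_fee` mimics the all-missing column
df = DataFrame(fare = [1.5, missing, 3.0],
               ehail_fee = [missing, missing, missing])

# Element type of each column, keyed by column name
coltypes = Dict(name => eltype(df[!, name]) for name in names(df))

# An all-missing column has element type Missing
all_missing_type = coltypes["ehail_fee"]

# Removing Missing from that type leaves the empty type Union{}
stripped = nonmissingtype(all_missing_type)
```

If the element type of a column is exactly `Missing`, nothing is left once `Missing` is removed, so a lookup keyed on the remaining type has no entry to find.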

Contributor

xiaodaigh commented Jan 5, 2021

The issue is the column :ehail_fee, in which every value is missing. Writing an all-missing column isn't currently supported. Support should not be too hard to add, but I can't guarantee I'll find time soon due to family commitments. In the meantime, dropping the column before writing works around the error:

select!(df, Not(:ehail_fee))
write_parquet("test_data.parquet", df)
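
When it isn't known in advance which columns are entirely missing, the same workaround can be generalized. This is a minimal sketch, not part of Parquet.jl: `all_missing_columns` is a hypothetical helper name, and the sample DataFrame is made up for illustration.

```julia
using DataFrames

# Hypothetical helper: names of the columns in which every value is missing
all_missing_columns(df) = [name for name in names(df) if all(ismissing, df[!, name])]

# Made-up sample data; only the column name `ehail_fee` comes from this thread
df = DataFrame(a = [1, 2, 3],
               ehail_fee = [missing, missing, missing],
               b = ["x", missing, "z"])

dropped = all_missing_columns(df)

# Drop those columns in place; partially missing columns such as `b` are kept
select!(df, Not(dropped))

# The DataFrame can then be written as in the workaround above:
# write_parquet("test_data.parquet", df)
```

Keeping the list of dropped names makes it possible to log them or add the columns back after reading the file.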
