Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

High-level I/O Interface #3

Merged
merged 27 commits into from
Apr 23, 2024
Merged

High-level I/O Interface #3

merged 27 commits into from
Apr 23, 2024

Conversation

BenCurran98
Copy link
Collaborator

@BenCurran98 BenCurran98 commented Apr 17, 2024

Adding in methods for reading and writing in a user-friendly way.
The main wrapper around a complete LAS dataset is the LasDataset struct, which contains the LAS point data (as a table), an extra mutable table of user fields kept alongside this, as well as the header, VLR, EVLR and user-defined bytes
You can load your entire LAS dataset using load_las (which returns a LasDataset) or you can just get the tabular point cloud using load_pointcloud.
To save, you can call out to save_las and enter either a LasDataset or TypedTable

There are a few methods to manipulate your LasDataset as well, such as adding VLRs and user-defined fields to your points

For performance reasons, I've split up the idea of a LAS point record into 4 structs implementing an abstract type LasRecord. Each of these have a different combination of a point, user-defined fields and undocumented extra bytes and their read/write methods take these into account. This turns out to be way faster than if you had one struct wrapping all 3 fields and doing a bunch of if-statements to check if certain fields are populated. Note also that LasRecords aren't designed to be dealt with directly by the user- the user should interact with the data either through a LasContent or TypedTable

Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
all(indexin(evlrsA, evlrsB) .!= nothing),
all(indexin(evlrsB, evlrsA) .!= nothing)
])
user_bytes_equal = (get_user_defined_bytes(contA) == get_user_defined_bytes(contB))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_user_data doesn't need to be the same?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's implicitly checked when we check the contents of the original pointcloud input. I've restricted the inputs so users can't actually supply their own _user_data and it gets extracted from the point cloud itself

src/dataset.jl Outdated
"""
function add_column!(las::LasDataset, name::Symbol, values::AbstractVector{T}) where T
if ismissing(las._user_data)
las._user_data = FlexTable(NamedTuple{ (name,) }( (values,) ))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to check the length matches the las data? also we are using the ordering to keep in sync (not ids)?

Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
src/dataset.jl Outdated
"""
$(TYPEDSIGNATURES)

Add a column with a `column` and set of `values` to a `las` dataset
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reads strangely -- "Add a column with a column ..."? Maybe "Add a column with name column and ..." is more clear.

Signed-off-by: BenCurran98 <b.curran@fugro.com>
@BenCurran98 BenCurran98 marked this pull request as ready for review April 19, 2024 00:13
Signed-off-by: BenCurran98 <b.curran@fugro.com>
src/util.jl Outdated
@@ -44,8 +61,59 @@ function open_las(func::Function, file::String, rw::String = "r")
end
end

function open_laz(func, file::String, rw::String="r")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do here typing of func to func::Function?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also spacing in this function open_laz is inconsistent with other functions, e.g. line 69
if (rw=="r")
when usually we have
if rw == "r"

src/util.jl Outdated
left_brack_idx[1] < right_brack_idx[1],
all(isdigit, str_name[(left_brack_idx[1] + 1):(right_brack_idx[1] - 1)])
]) "Invalid column name $(field)"
# if it's split, get the bit before the brackets, i.e for "col [1]" you get "col"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Referring to 'bit' here might cause a little confusion as you are not getting a binary bit, you're referring to the part before the bracketing.

Also as an aside, left and right brack is very funny sounding.

Signed-off-by: BenCurran98 <b.curran@fugro.com>
src/records.jl Outdated

function byte_size(record::Type{TRecord}) where TRecord <: LasRecord{TPoint,N} where {N, TPoint <: LasPoint{F}} where F
return byte_size(TPoint) + N
Get the appropriate LAS record format for a LAS `header` and a (possibly empty) est of `extra_bytes` that document any optional user fields to include
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling mistake:
"Get the appropriate LAS record format for a LAS header and a (possibly empty) set ..."

src/records.jl Outdated
record_length = point_record_length(header)
num_point_bytes = byte_size(point_type)
record_diff = record_length - num_point_bytes
@assert record_diff ≥ 0 "Record length in header $(record_length)B smaller than point size $(num_point_bytes)B"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert message could be clearer.
"Record length in header [...] is smaller than [...]"
Or perhaps more clearly
"Record length in header [...] cannot be smaller than point size [...]"

src/records.jl Outdated
user_field_types = data_type.(extra_bytes)
num_user_bytes = sum(sizeof.(user_field_types))
num_undoc_bytes = record_length - num_point_bytes - num_user_bytes
@assert num_undoc_bytes ≥ 0 "Record length in header $(record_length)B inconsistent with size of point $(num_point_bytes)B and size of user fields $(num_user_bytes)B"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Record length of [...] in header is inconsistent with both the size of point [...] and of user fields [...]"

test/dataset.jl Outdated
@test other_las == las

# now try incorporating some user fields
spicey_pc = Table(pc, thing = rand(num_points), other_thing = rand(Int16, num_points))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spicy is not spelled with an 'e'

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It totally is

test/file_io.jl Outdated
end

@testset "Extra Bytes I/O" begin
# simple example where we have 4 extra undocumeented bytes per point
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"undocumented"

Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
Signed-off-by: BenCurran98 <b.curran@fugro.com>
@BenCurran98 BenCurran98 merged commit 9affce2 into main Apr 23, 2024
1 check passed
@BenCurran98 BenCurran98 deleted the IO branch April 23, 2024 02:07
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants