Using apoc.export.csv.graph with bulkImport:true produces wrong result for "id" property #1335

p0macs · 2019-11-04T20:03:42Z

When a node has a property named "id" and is exported with the bulkImport:true option, the value of the "id" property is replaced with the node's internal ID.
The bug is probably in the CsvFormat.java file, in the writeNodesBulkImport method.

conker84 · 2019-11-25T13:57:20Z

Hi @p0macs can you try with the following jar?
apoc-3.5.0.6-all.jar.zip

I look forward to your feedback!


p0macs · 2019-11-26T09:36:19Z

Hi @conker84 ,
Thank you for the new version. I have tested it, but I think you misunderstood the issue.
The bulkImport mode now creates an export file whose first column is "id:ID(NodeLabel)", and the value in that column is taken from my "id" property.
Please note that the "id" property need not be a unique identifier of the node. It may be pure coincidence that the property is called "id", and I can have many nodes with the same "id" value.
The export file now treats this "id" as if it were the real ID of the node, but it is not.
The proper way in my opinion would be:

having one "id:ID(NodeLabel)" column for the internal ID
having one additional "id" column for my "id" property

When you export without the bulkImport option, the CSV file is created with the following header:
"_id","_labels","id","_start","_end","_type"
Here "_id" is the internal ID of the node, and "id" is my own "id" property.
The proper header for the bulkImport option would be (when no unique constraint is defined on the node):
_id:ID(NodeLabel),id,:LABEL

I think the bulk import is designed to work with any ":ID" column that is unique. But please note: the mere fact that the property is named "id" does not mean that it is the unique identifier for the node.
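
To make the proposed layout concrete, here is a minimal Java sketch of it. This is not the APOC code: the class and the `header` and `row` helpers are hypothetical names used only for illustration, and CSV quoting of values is left out. It only shows the internal ID and the user property "id" ending up in two separate columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in APOC.
class BulkHeaderSketch {

    // Header: internal ID column first, then the user properties, then the labels.
    static String header(String idSpace, List<String> propertyKeys) {
        List<String> columns = new ArrayList<>();
        columns.add("_id:ID(" + idSpace + ")");
        columns.addAll(propertyKeys);
        columns.add(":LABEL");
        return String.join(",", columns);
    }

    // Row: the internal ID never overwrites a property, even one named "id".
    // Values are written as-is; quoting and escaping are omitted in this sketch.
    static String row(long internalId, Map<String, Object> properties,
                      List<String> propertyKeys, List<String> labels) {
        List<String> cells = new ArrayList<>();
        cells.add(Long.toString(internalId));
        for (String key : propertyKeys) {
            Object value = properties.get(key);
            cells.add(value == null ? "" : value.toString());
        }
        cells.add(String.join(";", labels));
        return String.join(",", cells);
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("id");
        Map<String, Object> properties = new HashMap<>();
        properties.put("id", "abc"); // user property, not necessarily unique

        System.out.println(header("NodeLabel", keys));
        // prints: _id:ID(NodeLabel),id,:LABEL
        System.out.println(row(42L, properties, keys, Arrays.asList("NodeLabel")));
        // prints: 42,abc,NodeLabel
    }
}
```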

conker84 · 2020-01-09T11:55:20Z

Sorry @p0macs, I missed your comment.
Can you check the test that I added in the PR:

neo4j-apoc-procedures/src/test/java/apoc/export/csv/ExportCsvNeo4jAdminTest.java

Lines 207 to 212 in d45a9e2

    String expectedNodesLarus = String.format(":ID,name,id:long,:LABEL%n"
            + "%s,Andrea,1,User;Larus%n", map.get("sourceId"));
    String expectedNodesNeo4j = String.format(":ID,name,id:long,:LABEL%n"
            + "%s,Michael,2,User;Neo4j%n", map.get("targetId"));
    String expectedRelsNeo4j = String.format(":START_ID,:END_ID,:TYPE,id:long%n"
            + "%s,%s,KNOWS,10%n", map.get("sourceId"), map.get("targetId"));

The new header should be:

:ID,name,id:long,:LABEL
<internalID>,<your_prop_name>,<your_prop_id>,<labels>

Before my PR the id property was always overwritten by the internal ID; now it is exported as a separate field.
I think we should always rely on the internal ID, because your domain may also have composite unique keys, and relying on the internal ID simplifies the whole export process. Do you think that could be a problem for your use case?

p0macs · 2020-01-09T16:08:59Z

Hi @conker84 ,
Exactly, that is what we need: using the internal ID for the export (as before), but also exporting the user property "id" when it exists. In our case it is not of the long datatype (it is a string), but I think that will be handled correctly anyway.
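
For the datatype point, a rough Java sketch of how a header field could be derived from a property value. This is an assumption for illustration, not the logic APOC actually uses; `HeaderTypeSketch` and `headerField` are hypothetical names. It relies on the import header convention that a field without a type suffix is read as a string, so a string "id" would simply appear as `id` instead of `id:long`.

```java
// Hypothetical illustration only; not taken from the APOC sources.
class HeaderTypeSketch {

    // Returns "key:type" for a few common value types, or the bare key for strings
    // and anything else, since an untyped header field is treated as a string.
    static String headerField(String key, Object value) {
        if (value instanceof Long) {
            return key + ":long";
        }
        if (value instanceof Integer) {
            return key + ":int";
        }
        if (value instanceof Double) {
            return key + ":double";
        }
        if (value instanceof Boolean) {
            return key + ":boolean";
        }
        return key;
    }

    public static void main(String[] args) {
        System.out.println(headerField("id", 1L));    // prints: id:long
        System.out.println(headerField("id", "abc")); // prints: id
    }
}
```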

conker84 · 2020-01-10T10:41:03Z

@p0macs if I provide a build with the fix can you test it?

p0macs · 2020-01-10T12:49:22Z

Hi @conker84 , yes I could test it.

conker84 added a commit to conker84/neo4j-apoc-procedures that referenced this issue Nov 25, 2019

fixes neo4j-contrib#1335: Using apoc.export.csv.graph with bulkImport:true produces wrong result for id property

07e8d2e

conker84 added a commit to conker84/neo4j-apoc-procedures that referenced this issue Nov 25, 2019

fixes neo4j-contrib#1335: Using apoc.export.csv.graph with bulkImport:true produces wrong result for id property

70e5f15

conker84 added a commit to conker84/neo4j-apoc-procedures that referenced this issue Nov 25, 2019

fixes neo4j-contrib#1335: Using apoc.export.csv.graph with bulkImport:true produces wrong result for id property

d45a9e2

conker84 mentioned this issue Nov 25, 2019

fixes #1335: Using apoc.export.csv.graph with bulkImport:true produces wrong result for id property #1359

Merged

JMHReif closed this as completed in #1359 Jan 10, 2020

JMHReif added a commit that referenced this issue Jan 10, 2020

Merge pull request #1359 from conker84/issue_1335

f463a3f

fixes #1335: Using apoc.export.csv.graph with bulkImport:true produces wrong result for id property

sarmbruster pushed a commit that referenced this issue Jan 11, 2020

fixes #1335: Using apoc.export.csv.graph with bulkImport:true produces wrong result for id property

cea3139

sarmbruster added a commit that referenced this issue Jan 13, 2020

fixes #1335: Using apoc.export.csv.graph with bulkImport:true produces wrong result for id property

fb20a8f
