Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

CSV file upload breaks if whitespace in filename #427

Closed
quadrophobiac opened this issue May 11, 2017 · 13 comments · Fixed by #692
Closed

CSV file upload breaks if whitespace in filename #427

quadrophobiac opened this issue May 11, 2017 · 13 comments · Fixed by #692

Comments

@quadrophobiac
Copy link
Collaborator

Steps to reproduce

Upload any CSV contain WORD \s WORD.csv to Octopub publication. Upload should hang

Logs should show:

Completed 500 Internal Server Error in 45ms (ActiveRecord: 5.1ms) (etc.)
URI::InvalidURIError (bad URI(is not URI?): https://octopub-dev.s3-eu-west-1.amazonaws.com/uploads/eac6e230-05db-47ed-9454-c2c143b83c98/WORD WORD.csv):

@quadrophobiac
Copy link
Collaborator Author

Unsure how to even test for this
when a user uploads a file the relevant instance variable shows this

 @files
=> [<ActionController::Parameters {"title"=>"sleep data set 4 months", "description"=>"", "file"=>"https://octopub-dev.s3-eu-west-1.amazonaws.com/uploads/78922d62-64c9-42bf-8027-cba931fc92a7/Duplicate Jamie.csv", "schema_name"=>"", "schema_description"=>""} permitted: false>]

but the spec exercising this part of the code returns this

[<ActionController::Parameters {"title"=>"Fri1437-file-name", "description"=>"Fri1437-file-description", "file"=>#<ActionDispatch::Http::UploadedFile:0x007fde5f777e38 @tempfile=#<Tempfile:/var/folders/k7/7c014xzx12j4bvwxgp22djgr0000gp/T/RackMultipart20170516-5694-17nqy4s.csv>, @original_filename="white space.csv", @content_type="text/csv", @headers="Content-Disposition: form-data; name=\"files[][file]\"; filename=\"white+space.csv\"\r\nContent-Type: text/csv\r\nContent-Length: 222\r\n">, "schema_name"=>"", "schema_description"=>""} permitted: false>]

so the spec isn't working with the same problematic field - namely "file"

file_upload.js generates the URL name like so

var key   = $(data.jqXHR.responseXML).find("Key").text();
var url   = 'https://' + form.data('host') + '/' + key;

in the formdata this looks like so

"formData": {
        "key": "uploads/fbaff82b-b35e-40d0-bbb0-0df47aa15a1f/${filename}",
}

a simple regex swapping whitespace for dashes will not catch cases where a user uploads a file with trailing whitespace before the .csv extension

@quadrophobiac
Copy link
Collaborator Author

quadrophobiac commented May 17, 2017

This is unbelievably exasperating. No function analagous to parametricize exists in jQuery. It seems to me like catching this error within the JS file file_upload.js makes the most sense because then any substitutions to the name with underscores or dashes can be indicated to the user. It's also something that the application should do by default and something that seems a longstanding feature of integrating with Amazon s3 http://stackoverflow.com/questions/4419497/amazon-s3-only-accepting-files-with-no-spaces-no-numbers-in-the-title
Quick fix of regexing the relevant string in JS throws errors later down the line.
@Floppy if you have time to offer insight I'd appreciate it

@quadrophobiac
Copy link
Collaborator Author

including some prior tickets that may be pertinent to solving this #398 #304

@quadrophobiac
Copy link
Collaborator Author

It's possible that the error encountered has little to do with amazon s3 or otherwise and may entirely relate to this issue within the CSVlint gem Data-Liberation-Front/csvlint.rb#182

@Floppy
Copy link
Contributor

Floppy commented Aug 11, 2017

I've started investigating this from another direction, driving out with specs. Got some of it working by encoding/decoding URLs properly, but the S3 interface isn't fully tested by the specs, I think, so it'll need a combination of our two approaches, I think.

@quadrophobiac
Copy link
Collaborator Author

By combination of approaches do you mean the one you're working on PLUS the bug within the csvlint gem?

@Floppy
Copy link
Contributor

Floppy commented Aug 15, 2017

And the branch you were working on as well. [EDIT] oh that's this one, so yeah. :)

@Floppy
Copy link
Contributor

Floppy commented Aug 15, 2017

So. The CSVlint issue only affects local files in the command-line client, but the reason is similar. The CSVlint validator expects valid URLs to be passed in, whether file URLs, or http. The problem we're having here, I think, is that the uploader is giving us back an invalid (i.e. not correctly encoded) URL when we upload a file with a space in it. We need to detect that and encode, somewhere.

@Floppy
Copy link
Contributor

Floppy commented Aug 15, 2017

So as soon as the file is uploaded, the URL in the file uploader has a space in it. This feels wrong.

@Floppy
Copy link
Contributor

Floppy commented Aug 15, 2017

I'm now taking another approach to this. There is no reason to turn the S3 storage key into a URL and back again, so I'm going to change it to just pass along as a key instead. Let's see if that helps.

@Floppy
Copy link
Contributor

Floppy commented Aug 15, 2017

Context there: the storage keys work fine with spaces in. It's only because we're turning them into URLs for some of the time that the code fails.

@Floppy
Copy link
Contributor

Floppy commented Aug 15, 2017

Success acheived in #567.

@Floppy
Copy link
Contributor

Floppy commented Sep 13, 2017

#567 has been reverted pending the completion of #602, so this is, for now, still open. But #602 will fix it.

# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants