Decode unicode filenames from URL #2131

fedorkk
Contributor

@fedorkk fedorkk commented Mar 3, 2017

No description provided.

@fedorkk
Contributor Author

fedorkk commented Mar 3, 2017

This fixes a problem with downloading a file that has Unicode characters in its URL through remote_image_url. The file downloads correctly, but a file name like 'юникод.jpg' is sanitized to '_D1_8E_D0_BD_D0_B8_D0_BA_D0_BE_D0_B4.jpg'. This in turn causes NAME_TO_LONG (file name too long) errors, like the one in #539.
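
To illustrate the effect described above, here is a minimal sketch outside CarrierWave. The `sanitize` helper and `SANITIZE_REGEXP` are hypothetical stand-ins for the library's filename sanitizer, not its actual code; the assumption is only that every character other than a word character, dot, hyphen, or plus sign becomes an underscore, which is enough to reproduce the name quoted above.

```ruby
require 'uri'

# Hypothetical stand-in for the filename sanitizer (an assumption, not
# CarrierWave's code): anything that is not a word character, dot,
# hyphen, or plus sign is replaced with "_".
SANITIZE_REGEXP = /[^[:word:].\-+]/

def sanitize(name)
  name.gsub(SANITIZE_REGEXP, '_')
end

# The last path segment of the URL, still percent-encoded.
encoded = '%D1%8E%D0%BD%D0%B8%D0%BA%D0%BE%D0%B4.jpg'

# Decoding turns the percent escapes back into UTF-8 characters.
decoded = URI::DEFAULT_PARSER.unescape(encoded)

# Without decoding, every "%" becomes "_" and the name triples in length.
puts sanitize(encoded) # => _D1_8E_D0_BD_D0_B8_D0_BA_D0_BE_D0_B4.jpg

# After decoding, the Cyrillic letters are word characters and survive.
puts sanitize(decoded)
```

Each non-ASCII letter takes six characters in the sanitized, undecoded form ('_D1_8E') instead of one, which is how ordinary-looking names can end up exceeding the filesystem limit.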

@joemsak

joemsak commented May 2, 2017

I thumbs-upped this PR, but I want to comment too for emphasis. I am now using this branch in my production app because of this issue, and it did fix it. I really need this merged so I can take advantage of upcoming updates for Rails 5.1 compatibility.

@fedorkk
Contributor Author

fedorkk commented May 2, 2017

As a workaround, in production I am using a modified version of this monkey patch with the carrierwave master branch.

# frozen_string_literal: true
# Monkey patch for long filenames
# @see https://github.com/carrierwaveuploader/carrierwave/pull/539/files
module CWRemoteFix
  # 255 characters is the max size of a filename in modern filesystems
  # and 100 characters are allocated for versions
  MAX_FILENAME_LENGTH = 255 - 100

  def original_filename
    filename = filename_from_header || filename_from_uri
    mime_type = MIME::Types[file.content_type].first
    unless File.extname(filename).present? || mime_type.blank?
      filename = "#{filename}.#{mime_type.extensions.first}"
    end

    if filename.size > MAX_FILENAME_LENGTH
      extension = (filename =~ /\./) ? filename.split(/\./).last : false
      # 32 for MD5 and 2 for the __ separator
      split_position = MAX_FILENAME_LENGTH - 32 - 2
      # +1 for the . in the extension
      split_position -= (extension.size + 1) if extension
      # Generate a hash from the part of the original filename that gets cut off
      hex = Digest::MD5.hexdigest(filename[split_position, filename.size])
      # Create a new name within given limits
      filename = filename[0, split_position] + '__' + hex
      filename << '.' + extension if extension
    end
    # Return original or patched filename
    filename
  end

  def filename_from_uri
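    # URI.decode was removed in Ruby 3.0; URI::DEFAULT_PARSER.unescape is the call it wrapped
    URI::DEFAULT_PARSER.unescape(File.basename(file.base_uri.path))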
  end
end

# Monkeypatch downloader class using prepend
CarrierWave::Uploader::Download::RemoteFile.prepend CWRemoteFix
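
The truncation branch of the patch can be tried on its own. The following is a sketch only: `shorten_filename` is a hypothetical helper name, and it mirrors the arithmetic above in plain Ruby without CarrierWave or ActiveSupport.

```ruby
require 'digest'

# Same budget as in the patch: 255 minus 100 reserved for version prefixes.
MAX_FILENAME_LENGTH = 255 - 100

# Hypothetical standalone helper mirroring the truncation branch above.
def shorten_filename(filename, max_length = MAX_FILENAME_LENGTH)
  return filename if filename.size <= max_length

  extension = filename.include?('.') ? filename.split('.').last : nil
  # Reserve 32 characters for the MD5 hex digest and 2 for the "__" separator.
  split_position = max_length - 32 - 2
  # Reserve room for the dot and the extension.
  split_position -= (extension.size + 1) if extension
  # Hash the part that is cut off so different long names stay distinct.
  hex = Digest::MD5.hexdigest(filename[split_position, filename.size])
  shortened = filename[0, split_position] + '__' + hex
  shortened += ".#{extension}" if extension
  shortened
end

long_name = ('a' * 300) + '.jpg'
short_name = shorten_filename(long_name)

puts short_name.size           # => 155
puts File.extname(short_name)  # => .jpg
```

Note that `size` counts characters, not bytes. A decoded Cyrillic name uses two bytes per letter in UTF-8, so if the filesystem limit is measured in bytes, comparing `bytesize` instead may be the safer check.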

@thiagofm
Member

thiagofm commented Jul 7, 2017

👍 thanks! Problem is well-described and tested.

@thiagofm thiagofm merged commit 61a961a into carrierwaveuploader:master Jul 7, 2017