Converting a Folder Tree Using iconv with Ruby


I’ve just experienced one of the strangest problems ever. A PHP project was probably copied from a Swedish Windows machine to a Linux server that was somehow set up to handle Swedish. From the server it was then downloaded to a Mac, probably also running in Swedish, and from the Mac to my strictly English Windows installation running Wamp and XAMPP.

Some PHP files refused to run; instead I got a prompt asking me to download them. After a while I was able to narrow it down to the folders. When I copied a PHP file from the “tainted” folder tree to a newly created folder, I was able to run it, so I assumed something had to be wrong with the folders, or possibly the file names. In the end nothing I tried fixed it, so something else was the problem.

I ended up trying the code below anyway. It might be useful for someone else with character-encoding problems in folder or file names, even though it didn’t solve the particular problem I had. For instance, I’ve seen an English Vista blue-screen while trying to work with files containing Swedish characters.

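require 'find'
require 'fileutils'
require 'iconv' # removed from the standard library in Ruby 2.0; use the iconv gem there

# Convert an ISO-8859-1 name to ASCII. Plain 'ascii' raises
# Iconv::IllegalSequence on characters such as å, ä and ö; //TRANSLIT asks
# iconv to approximate them (the result depends on the iconv implementation).
def convert_enc(str)
  Iconv.conv('ascii//TRANSLIT', 'iso-8859-1', str)
end

# Map a path under orig_dir to the converted path under new_dir.
def target_path(path, orig_dir, new_dir)
  convert_enc(new_dir + path[orig_dir.length..-1])
end

def copy_tree(orig_dir, new_dir)
  # First pass: recreate the folder structure with converted names.
  Find.find(orig_dir) do |path|
    FileUtils.mkdir_p(target_path(path, orig_dir, new_dir)) unless FileTest.file?(path)
  end

  # Second pass: copy each file byte for byte under its converted name.
  # Binary mode keeps Windows from translating line endings, and the
  # block form closes both files when done.
  Find.find(orig_dir) do |path|
    next unless FileTest.file?(path)
    File.open(path, 'rb') do |from|
      File.open(target_path(path, orig_dir, new_dir), 'wb') { |to| to.write(from.read) }
    end
  end
end

copy_tree('old_dir', 'new_dir')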

Even though folder and file names are treated exactly the same here, handling them in two separate passes leaves room for treating them differently. As you can also see, we don’t do anything with the file contents, but you could easily do so by passing from.read through some encoding/decoding logic.
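
As a rough idea of what such content logic might look like, here is a minimal sketch. It assumes the files contain ISO-8859-1 text that should end up as UTF-8, and it uses Ruby's built-in String#encode (Ruby 2.0 or later for String#b) instead of Iconv. The helper name convert_contents is made up for this example and is not part of the script above.

```ruby
# Hypothetical helper: treat the raw bytes as ISO-8859-1 and re-encode
# them as UTF-8. Works on a copy so the caller's string is left untouched.
def convert_contents(bytes)
  bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# "smörgås" as it would be stored in an ISO-8859-1 file.
latin1 = "sm\xF6rg\xE5s".b
utf8 = convert_contents(latin1)
```

In the copy loop you would then write convert_contents(from.read) instead of from.read. Only do this for files you know are text in that encoding; running binary files through it would corrupt them.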
