Hi All,
I have searched for this all around and I apologize if this is right under my nose.
I am upgrading my gallery (around 9000 pictures) from G1 to G2.
The problem I am getting is that when text is in Hebrew, it is imported as various numbers preceded by a % sign.
When inserting Hebrew text directly into G2, it works fine.
I have tried various codepages in the import stage, to no avail.
Forgot to mention: it's a Debian Lenny machine with the Debian packages.
Thanks for your help and for this project.
Itamar.
Posts: 16
5 weeks and no word...
Is this a known issue with Gallery2 imports of Hebrew?
I can see that the encoding of the Hebrew text in Gallery 1 is ISO-8859-1 (and there is no import option for ISO-8859-1 Hebrew).
In the first import window (where you choose the albums to import) I see gibberish instead of Hebrew characters. Surprisingly, when I click next I see the Hebrew characters displayed correctly in the list of to-be-imported albums.
Regretfully after the import I see that all Hebrew was imported as gibberish.
Help! Please!
Posts: 32509
i thought it's just a matter of selecting the right character encoding on import.
are you sure it's ISO-8859-1 and not ISO-8859-8?
--------------
Documentation: Support / Troubleshooting | Installation, Upgrade, Configuration and Usage
Posts: 16
Thank you for your kind reply!
No, sorry..
ISO-8859-8 gives the same results.
the outcome is this:
[img]http://www.itamar.org/albums/General/import.jpg[/img]
Posts: 32509
1. what mysql version are you using?
2. looks like all your Hebrew characters are HTML entities.
i guess a small modification to the import (migrate) module needs to be done to convert the HTML entities properly to UTF-8 characters.
i have the feeling that this has been discussed in the forums already. but i'll update this thread here as soon as i have some time.
Posts: 32509
you'll have to edit a file:
file modules/core/classes/helpers/GalleryCharsetHelper_simple.class
find:
    function convertToUtf8($string, $sourceEncoding=null) {
        global $gallery;
        $phpVm = $gallery->getPhpVm();
        if (empty($sourceEncoding)) {
            $sourceEncoding = GalleryCharsetHelper_simple::detectSystemCharset();
        }
        if (empty($sourceEncoding) || !strcmp($sourceEncoding, 'UTF-8')) {
            return $string;
        }
        /* Iconv can return false, so try it first. If it fails, continue */
        if ($phpVm->function_exists('iconv')) {
            if (($result = $phpVm->iconv($sourceEncoding, 'UTF-8', $string)) !== false) {
                return $result;
            }
        }
        if ($phpVm->function_exists('mb_convert_encoding')) {
            return $phpVm->mb_convert_encoding($string, 'UTF-8', $sourceEncoding);
        } else if ($phpVm->function_exists('recode_string')) {
            return $phpVm->recode_string($sourceEncoding . '..UTF-8', $string);
        } else {
            GalleryCoreApi::requireOnce(
                'modules/core/classes/helpers/GalleryCharsetHelper_medium.class');
            $charset =& GalleryCharsetHelper_medium::getCharsetTable($sourceEncoding);
            if (isset($charset)) {
                return preg_replace('/([\x80-\xFF])/se', '$charset[ord(\'$1\')]', $string);
            }
        }
        return $string;
    }

replace with:

    function old_convertToUtf8($string, $sourceEncoding=null) {
        global $gallery;
        $phpVm = $gallery->getPhpVm();
        if (empty($sourceEncoding)) {
            $sourceEncoding = GalleryCharsetHelper_simple::detectSystemCharset();
        }
        if (empty($sourceEncoding) || !strcmp($sourceEncoding, 'UTF-8')) {
            return $string;
        }
        /* Iconv can return false, so try it first. If it fails, continue */
        if ($phpVm->function_exists('iconv')) {
            if (($result = $phpVm->iconv($sourceEncoding, 'UTF-8', $string)) !== false) {
                return $result;
            }
        }
        if ($phpVm->function_exists('mb_convert_encoding')) {
            return $phpVm->mb_convert_encoding($string, 'UTF-8', $sourceEncoding);
        } else if ($phpVm->function_exists('recode_string')) {
            return $phpVm->recode_string($sourceEncoding . '..UTF-8', $string);
        } else {
            GalleryCoreApi::requireOnce(
                'modules/core/classes/helpers/GalleryCharsetHelper_medium.class');
            $charset =& GalleryCharsetHelper_medium::getCharsetTable($sourceEncoding);
            if (isset($charset)) {
                return preg_replace('/([\x80-\xFF])/se', '$charset[ord(\'$1\')]', $string);
            }
        }
        return $string;
    }

    function convertToUtf8($string, $sourceEncoding=null) {
        $string = GalleryCharsetHelper_simple::old_convertToUtf8($string, $sourceEncoding);
        return GalleryUtilities::unicodeEntitiesToUtf8($string);
    }

then try the import again.
once you're done with the import, revert your change to the above mentioned file (just restore the original file).
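for anyone curious what the extra step is for: the idea is to turn numeric HTML entities into real UTF-8 characters after the normal charset conversion. below is a rough standalone sketch of that idea in plain PHP. it assumes the G1 titles hold entities like &#1513; and it uses PHP's built-in html_entity_decode rather than Gallery's own unicodeEntitiesToUtf8, so it may not behave identically. the function name and the sample title are made up for illustration.

```php
<?php
// Hypothetical helper, NOT part of Gallery: decode HTML character
// references (e.g. "&#1513;") into UTF-8 text.
function decodeNumericEntities($text) {
    // html_entity_decode handles decimal (&#1513;) and hex (&#x5E9;)
    // references. It also decodes named entities such as &amp;.
    return html_entity_decode($text, ENT_QUOTES, 'UTF-8');
}

// Made-up sample title: four Hebrew letters stored as decimal entities.
$title = '&#1513;&#1500;&#1493;&#1501;';

// Prints the decoded UTF-8 string: שלום
echo decodeNumericEntities($title), "\n";
```

note that html_entity_decode also turns things like &amp; back into &, which might not be wanted if a title contains literal markup.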
Posts: 16
VALIANT!!!!!!!
YOU THE MAN!!
This works immaculately!
Thanks very much (just donated $10 via PayPal!)
Will this be fixed in future releases?
It doesn't really matter to me, since I won't need it once I've migrated, but what about future upgraders who try to migrate a Hebrew G1?
Thanks again Valiant!
Itamar.
Posts: 32509
I've filed a bug about it:
http://sourceforge.net/tracker/index.php?func=detail&aid=1756821&group_id=7130&atid=107130