[Tag module hack] IPTC and Russian Keywords
fosco-maestro
Joined: 2010-08-11
Posts: 2
Posted: Wed, 2010-08-11 13:02
I think other people may hit the same problem with Russian keywords, which is why I wrote this up. I spent a day trying to get Gallery 3 to work with Russian keywords in photos. G3 always output something like "Êóçüìåíêî".
As I discovered later, the problem was in PHP, in the function "mb_detect_encoding()". That function does not work well with Cyrillic (it has problems telling CP1252 from CP1251, so you cannot get the real character encoding you need to convert the word to UTF-8). I ran a lot of tests that prove it, but I don't think they are important for this post. If you are interested in what happened, you can write me an email.
Thank God, "gcog" wrote a module for IPTC (http://codex.gallery2.org/Gallery3:Modules:iptc), from which I took the code for parsing an image's IPTC/XMP data (I had never worked with image metadata before). I changed the function "item_created()" in "modules/tag/helpers/tag_event.php" and copied the "lib" folder from the IPTC module into "modules/tag/". Everything is in the zip file I attached.
Pavel Voznenko
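For anyone wondering where output like "Êóçüìåíêî" comes from, here is a minimal sketch that reproduces it. It assumes the keyword was stored as single-byte Windows-1251 and then converted as if it were ISO-8859-1, which is effectively what utf8_encode() does. The sample word, the use of iconv, and the assumption that this file is saved as UTF-8 are illustrative and not taken from the Gallery 3 code.

```php
<?php
// Assumes the iconv extension is loaded and this file is saved as UTF-8.
// "Кузьменко" as a tagging tool might store it: one byte per letter, Windows-1251.
$cp1251 = iconv('UTF-8', 'Windows-1251', 'Кузьменко');

// Treat every byte as ISO-8859-1 (the same mapping utf8_encode() applies).
$garbled = iconv('ISO-8859-1', 'UTF-8', $cp1251);

// Name the real source encoding instead and the text survives.
$fixed = iconv('Windows-1251', 'UTF-8', $cp1251);

echo $garbled, "\n";  // Êóçüìåíêî
echo $fixed, "\n";    // Кузьменко
```

The bytes are the same in both conversions; only the encoding they are assumed to be in differs.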
Posts: 7994
Sounds like this is related to https://sourceforge.net/apps/trac/gallery/ticket/1254
---
Problems? Check gallery3/var/logs
bugs/feature req's | upgrade to the latest code | use git
Posts: 2
Not really.
The keyword arrived as "Win-1252" (when I call ord() on every character in the string, the values match the Windows-1252 table http://en.wikipedia.org/wiki/Windows-1252 , and Lebedev's decoder says the same http://www.artlebedev.ru/tools/decoder/ ), but "mb_detect_encoding()" returned "UTF-8", and with "strict == TRUE" the function returned "FALSE".
The default code "mb_detect_encoding($value, "ISO-8859-1, UTF-8")" in the function "item_created()" returned "FALSE" too, so of course it went on to encode the string as UTF-8. After that the string became truly unreadable, and "mb_detect_encoding()" reported the encoded string as "ASCII" 0_o
Then I hardcoded "iconv('Windows-1252', 'UTF-8', $value)". It returned "FALSE", so of course with code like "$value = iconv('Windows-1252', 'UTF-8', $value);" I ended up with $value == ''
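One way to avoid ending up with an empty string is to check the return value of iconv() before assigning it. The sketch below is a hypothetical helper, not code from Gallery 3. It assumes the mbstring and iconv extensions are loaded, and the Windows-1251 default is only an assumption based on the Russian keywords discussed in this thread.

```php
<?php
// Hypothetical helper, not part of Gallery 3. Assumes mbstring and iconv.
function keyword_to_utf8($value, $assumed_source = 'Windows-1251') {
  // Validity check: if the bytes already decode as UTF-8, leave them alone.
  if (mb_check_encoding($value, 'UTF-8')) {
    return $value;
  }
  // iconv() returns false on failure; @ silences the accompanying notice.
  $converted = @iconv($assumed_source, 'UTF-8', $value);
  // Keep the original bytes rather than overwrite them with false/''.
  return $converted === false ? $value : $converted;
}

$from_tool = iconv('UTF-8', 'Windows-1251', 'Привет');  // sample input
echo keyword_to_utf8($from_tool), "\n";
```

mb_check_encoding() only answers whether the bytes are valid in the named encoding, so it sidesteps the guessing that mb_detect_encoding() has to do.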
The function "mb_convert_encoding()" did the same thing to the string as "utf8_encode()".
I think the problem happened because our photographers used strange tagging tools. Either way, I still have no idea what happened 0_o
Then I found the module that "gcog" created. It works with the XMP data model, which doesn't care about encoding.
Here is one of the photos with Russian tags, if you are interested: http://coffeecard.com.ua/share/SVV_2897.JPG.zip
PS I'm still a junior PHP developer, so it may be that I just reinvented the wheel.
Posts: 18
I have the same problem with Russian (non-Latin) tags in files. I use PicaJet as an IPTC editor under Windows, so all tags are encoded in Windows-1251, and I see unreadable characters instead of the tag. I reviewed the code and found that a small change in modules/tag/helpers/tag_event.php (line 39) resolves my problem:
if (function_exists("mb_detect_encoding")) {
  // Detect once and reuse the result; it is FALSE when nothing matches.
  $coding = mb_detect_encoding($word, "Windows-1251,Windows-1252,ISO-8859-1,UTF-8");
  if ($coding && $coding != "UTF-8") {
    // Was: $word = utf8_encode($word);
    $word = iconv($coding, "UTF-8", $word);
  }
}
So I have replaced utf8_encode with iconv. I'm not an experienced PHP developer, though, and this code can probably be optimized. Moreover, I think it should be generalized and the list of available encodings should be expanded. Alternatively, the list could be an optional parameter that is editable in the settings.
So I ask the author of the tag module: could you please think about non-English users and add support for different encodings for them?
I have attached my changed file to this message. If you have a problem with encoding, unzip this file into the modules directory.
Posts: 7994
Please file a ticket for this. I think it's reasonable to put the encoding list into a module setting so that you can at least change it via Admin > Settings > Advanced.
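A rough sketch of how a configurable encoding list could be applied. The function name, the comma-separated setting format, and the sample value are assumptions for illustration only; reading the value from Admin > Settings > Advanced is not shown. It assumes the mbstring extension is loaded (iconv is used only to build the sample input) and that every name in the list is an encoding mbstring recognises.

```php
<?php
// Hypothetical helper, not part of Gallery 3.
function tag_to_utf8($word, $encoding_setting) {
  // "UTF-8, Windows-1251" becomes array("UTF-8", "Windows-1251").
  $encodings = array_filter(array_map('trim', explode(',', $encoding_setting)));
  foreach ($encodings as $encoding) {
    // Take the first encoding in which the bytes are valid.
    if (mb_check_encoding($word, $encoding)) {
      return mb_convert_encoding($word, 'UTF-8', $encoding);
    }
  }
  // Nothing matched: leave the tag untouched rather than guess.
  return $word;
}

$setting = 'UTF-8, Windows-1251';  // assumed setting value
$from_picajet = iconv('UTF-8', 'Windows-1251', 'Кузьменко');
echo tag_to_utf8($from_picajet, $setting), "\n";
echo tag_to_utf8('Кузьменко', $setting), "\n";
```

In this sketch the order of the list matters: single-byte encodings such as Windows-1251 or ISO-8859-1 accept almost any byte sequence, so UTF-8 should come first or valid UTF-8 tags would be converted a second time.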