Foreign Language Support

bradymiller wrote on Monday, May 04, 2009:

Sounds good (if nobody objects).
Let me know when it’s done.  I’ll then work on getting out the “revised” Google docs spreadsheet minus these strings, and also take this chance to re-sort the entries that start with special characters to the bottom of the list; hopefully I’ll get it done tonight if there are no problems.
-brady

sunsetsystems wrote on Monday, May 04, 2009:

OK I have committed the changes to interface/billing/*_codes.php.

Rod
www.sunsetsystems.com

bradymiller wrote on Tuesday, May 05, 2009:

Hey,

  There have been extensive changes to the translation spreadsheet. It looks OK on initial testing, but it would be great if you would all quickly review your languages for obvious issues, both on the spreadsheet and in OpenEMR.

   We have removed about 1600 constants from the spreadsheet since they were specific to US billing.  Luckily, only a small proportion of these were even translated, but I am sorry to those of you who spent time translating them.

   We have also re-ordered them, so all the entries that start with special characters or numbers are at the bottom.

   Because of the extensive changes we have released another set of language tables. Here are the statistics:
Total number of English constants: 2196
Total number of definitions: 3586
Chinese: 4% (77 definitions)
Dutch: 71% (1557 definitions)
German: 1% (13 definitions)
Norwegian: 33% (715 definitions)
Russian: 2% (34 definitions)
Spanish: 36% (793 definitions)
Swedish: 18% (397 definitions)
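
For anyone who wants to recompute the figures above, here is a minimal Python sketch. It assumes each percentage is the language's definition count divided by the 2196 English constants, rounded to the nearest whole number. How the build script actually computes them is not stated here, and the variable and function names are made up for illustration.

```python
# Definition counts per language, copied from the statistics above.
TOTAL_CONSTANTS = 2196
definitions = {
    "Chinese": 77,
    "Dutch": 1557,
    "German": 13,
    "Norwegian": 715,
    "Russian": 34,
    "Spanish": 793,
    "Swedish": 397,
}

def coverage_percent(count, total=TOTAL_CONSTANTS):
    """Assumed formula: share of English constants that have a definition."""
    return round(100 * count / total)

coverage = {lang: coverage_percent(n) for lang, n in definitions.items()}
total_definitions = sum(definitions.values())

for lang, n in definitions.items():
    print(f"{lang}: {coverage[lang]}% ({n} definitions)")
print(f"Total number of definitions: {total_definitions}")
```

Under that assumption the output matches the list above, including the total of 3586 definitions.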

We will continue to include the “dummy” language, which basically defines the word “dummy” for all 2196 constants. This lets us find strings that aren’t being translated and fix them.
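
A rough Python sketch of why the “dummy” language helps. It assumes a simplified lookup in which xl() returns the definition for the active language and falls back to the English constant when none exists. The table, the sample constants, and this Python xl() are hypothetical stand-ins, not the actual OpenEMR code (which is PHP).

```python
# Hypothetical, simplified stand-in for a translation lookup.
constants = ["Patient", "Billing", "Appointment"]

# The "dummy" language defines the word "dummy" for every constant.
dummy_table = {constant: "dummy" for constant in constants}

def xl(constant, table):
    """Return the definition if one exists, else fall back to the constant."""
    return table.get(constant, constant)

# Strings as they would appear on a rendered page: two pass through xl(),
# one was hard-coded by mistake and never reaches the lookup.
rendered = [xl("Patient", dummy_table), xl("Billing", dummy_table), "Appointment"]

# Anything on screen that is not "dummy" points at a string to fix.
untranslated = [text for text in rendered if text != "dummy"]
print(untranslated)
```

With the dummy language selected, every readable English word left on the screen is a candidate for wrapping in xl().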

The best way to test your translations is on the online cvs demo, which is using the above new translation table, and allows you to select a language before logging in: 
http://oemr.org/modules/wiwimod/index.php?page=DemoCVS

Here is the updated mysql dumpfile: http://openemr.cvs.sourceforge.net/viewvc/openemr/openemr/contrib/util/language_translations/currentLanguage.sql?revision=1.6

You can also install the above mysql dumpfile into your local openemr, but note this stuff is still considered experimental, and will delete your current language tables (back your stuff up before using this). Steps to do this are below: 
1. Download the above currentLanguage.sql dumpfile to your computer. Then in OpenEMR go to admin -> Database -> Databases -> openemr -> SQL and click the ‘Browse’ button at the ‘Or location of the textfile:’ field. Select the currentLanguage.sql file you downloaded to your computer, then click the ‘Go’ button.
2. If you want to see the translations, edit the openemr/interface/globals.php file at the translations entry. Put 2 if you want Swedish, 3 if you want Spanish, 4 if you want German, 5 if you want Dutch, 8 if you want Norwegian, 9 if you want Chinese, 10 if you want Russian, and 12 if you want “dummy”. After changing the setting, remember to log out and then log back in to OpenEMR.

Other News: 
We have an openemr language translation wiki at: 
http://www.oemr.org/modules/wiwimod/index.php?page=TranslationGuide 
To get "editing" access, register (at bottom right of screen) and then email me (brady@sparmy.com) your username, so I can give you wiki editing privileges.

Another cool thing:
You can now see Chinese characters if you go to the fees->billing menu in OpenEMR.

thanks,
Brady

bradymiller wrote on Wednesday, May 06, 2009:

hey,

I’m starting to go through the source to find the broken constants (basically strings not enclosed by the xl() function), and noticed that none of the demographics entry form stuff is translated. Is there a way to get these strings into xl() calls while keeping the layout stuff intact? This question is probably directed towards Rod.

-brady

sunsetsystems wrote on Wednesday, May 06, 2009:

library/options.inc.php contains the code that displays layouts, both the forms and the display-only formats.  It would not be difficult to modify that to wrap xl() around the necessary items.
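
To illustrate the idea (in Python rather than the PHP of library/options.inc.php), the sketch below translates only the human-readable part of each layout field, the title, and leaves field ids and ordering alone. The field structure, the sample Dutch strings, and the function names are all assumptions for illustration.

```python
# Hypothetical layout rows: only "title" is meant for display.
layout = [
    {"field_id": "fname", "title": "First Name", "seq": 1},
    {"field_id": "lname", "title": "Last Name", "seq": 2},
    {"field_id": "DOB", "title": "Date of Birth", "seq": 3},
]

# Illustrative translation table; "Date of Birth" is deliberately missing.
definitions = {"First Name": "Voornaam", "Last Name": "Achternaam"}

def xl(constant):
    """Assumed behaviour: translated text, or the constant if none exists."""
    return definitions.get(constant, constant)

def render_labels(fields):
    """Wrap xl() around display text only; ids and order stay intact."""
    ordered = sorted(fields, key=lambda field: field["seq"])
    return [(field["field_id"], xl(field["title"])) for field in ordered]

labels = render_labels(layout)
for field_id, label in labels:
    print(f"{field_id}: {label}")
```

Because only the display text goes through xl(), the layout definitions themselves do not need to change.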

Rod
www.sunsetsystems.com

sunsetsystems wrote on Thursday, May 07, 2009:

Just a note… I loaded currentLanguage.sql using the mysql utility, and found that I needed "--default-character-set=utf8" as an option on the command line.

Did not try it with mysqladmin, but am wondering if it will also be an issue there since mysqladmin does not know about the utf8 encoding?

Rod
www.sunsetsystems.com

bradymiller wrote on Thursday, May 07, 2009:

hey,

Yep, it’s odd. That’s not needed if the database is latin1, but is needed if the database is utf8. I must say, this stuff can be a bit mystifying.  I seem to have cleared it up by putting SET NAMES utf8; at the top of the script.  Also made some changes in the table to simplify; removed the bin and changed all collation to general.  Didn’t commit the changes; figured I was gonna rebuild the table anyways over the weekend in a couple of days (I try to build it weekly for translators to test).
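
For readers trying to picture the failure mode, here is a small Python illustration of a character-set mismatch in general: UTF-8 bytes that are read as latin1 turn each non-ASCII character into two wrong ones. This only demonstrates the byte-level mechanism with a made-up sample string; it does not claim to reproduce what mysql does internally in either configuration.

```python
# A translated string containing a non-ASCII character.
original = "Año"

# What a dumpfile holds if it was saved as UTF-8.
utf8_bytes = original.encode("utf-8")

# A reader that assumes latin1 decodes each byte as its own character.
misread = utf8_bytes.decode("latin-1")
print(misread)  # AÃ±o

# When both sides agree on UTF-8, the text survives the round trip.
correct = utf8_bytes.decode("utf-8")
print(correct)  # Año
```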

-brady

bradymiller wrote on Friday, May 08, 2009:

Rod,

Regarding currentLanguage.sql I had to remove the SET NAMES from the top, so you will continue needing to add your switch on import.

The reason was that it screws up if the database is based on latin1 (which is the case for all previous openemrs). So to ensure compatibility I left it off for now, until we find a better solution.  It’s a bummer, because the phpmyadmin import method also screws up in the newer UTF8 database.  However, it still works fine during installation into a UTF-8 database because you put the SET NAMES in the setup.php script.

-brady

bradymiller wrote on Friday, May 08, 2009:

hey,

Regarding UTF-8 in openemr, I just checked in some changes to cvs. In OpenEMR, both html and mysql encoding are working.  In php-GACL, only html encoding is working.  I’m working on mysql encoding in phpgacl; the gacl/gacl.class.php file is where the mysql connection is opened, so I’m gonna place the required stuff there. However, I can’t include globals.php there (it contains the UTF-8 user flag), since that causes ADODB clashing between openemr and phpgacl, so I’ll likely need to make a config variable in gacl that is configured during setup.php.  Then we can move to phpmyadmin.

-brady

markleeds wrote on Friday, May 08, 2009:

Brady,

I came up with a working demonstration of the Google language api.  Here is a link:

http://mjl69.com/translation/

The HTML file does the translations while you wait.  I only included the first 16 items from the list because it is slow.  All translatable languages are included.  The HTML file was generated with a Perl script which builds it from a list of constants.  I originally tried it on the whole list of nearly 1700 items, but it crashed my computer.

Mark

bradymiller wrote on Friday, May 08, 2009:

hey,

That is very cool.  Would be great if the translators could give their thoughts on how accurate these are.  It’s definitely relatively fast.

How slow was the first 16 to build?  Is there any hope of getting the entire list of constants (not really for the demo, but just to then dump them to a file; see below for why)?

Do you want me to give you viewing privileges of the google docs translation table? Then it would be easy to have your huge output file of all possible translations in the format necessary to build the language tables, since the input of the script is basically this spreadsheet using tab as the delimiter (you can then import the created sql dumpfile via the embedded phpmyadmin in openemr for an awesome openemr demonstration).  If you’re curious, the file to build the sql tables from the spreadsheet is openemr/contrib/util/language_translations/buildLanguageDatabase.pl in cvs (the header describes it; basically the 1st parameter is the spreadsheet tab-delimited file and the second is the constants file (for validation)).
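
As a rough idea of what such a build step involves, here is a Python sketch that reads a tab-delimited export and validates each row against a constants list. It assumes a header row of language names, the English constant in the first column, and one English constant per line in the constants file. The actual spreadsheet layout and the behaviour of buildLanguageDatabase.pl are not described here beyond its two inputs, so the sample data and function name are hypothetical.

```python
import csv
import io

# Hypothetical tab-delimited export: header row, then one constant per row.
spreadsheet_tsv = (
    "constant\tDutch\tSpanish\n"
    "Patient\tPatiënt\tPaciente\n"
    "Billing\t\tFacturación\n"
    "Obsolete\tVerouderd\t\n"
)

# Hypothetical constants file: one English constant per line.
constants_txt = "Patient\nBilling\n"

def build_definitions(tsv_text, constants_text):
    """Collect non-empty definitions, skipping rows whose constant is unknown."""
    valid = {line for line in constants_text.splitlines() if line}
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    languages = next(reader)[1:]
    definitions, rejected = [], []
    for row in reader:
        constant, cells = row[0], row[1:]
        if constant not in valid:
            rejected.append(constant)
            continue
        for language, text in zip(languages, cells):
            if text:
                definitions.append((language, constant, text))
    return definitions, rejected

definitions, rejected = build_definitions(spreadsheet_tsv, constants_txt)
print(definitions)
print(rejected)
```

Rows whose constant is missing from the constants file end up in the rejected list instead of the output, which is the kind of check the second parameter is for.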

-brady

bradymiller wrote on Friday, May 08, 2009:

Mark,
Also, the most up-to-date list of constants (recently reduced and sorted) is here:
http://openemr.cvs.sourceforge.net/viewvc/openemr/openemr/contrib/util/language_translations/currentConstants.txt
-brady

markleeds wrote on Friday, May 08, 2009:

Brady,

I would definitely like an invite to google docs to see the table.

Building the HTML file for use with the Google API from the perl script only takes a second, even with thousands of constants.  The slow part is what you saw when you went to the page, the real time translation by the API via AJAX.  It may have been faster on your machine.  I am using an iBook G4 1GB RAM on a 1.5 Mb/s connection.  Limiting the number of languages will speed things up.  The speed does not really matter if it is eventually able to complete in a usable format.

Mark

blankev wrote on Friday, May 08, 2009:

Mark,

What you did with the API would have to be repeated 200 times. I found 2 mismatches, but in the same Dutch column I needed to make changes at least 2 times.

So as a starting point for any foreign language, it seems to have a future.

Would it help if you did one column at a time?

I would be glad to help check the Dutch translations made with the API and compare them with the right translation: mine, yours, or mistakes corrected from real-time use of the CVS translations.

Pimm

bradymiller wrote on Friday, May 08, 2009:

Pimm,

So out of the 16, only two required changing? That’s really good.

What would be the most useful way to use this for the translators?  My hope is we end up with another spreadsheet with all the google translations in the same order of constants as our current one.  Then translators can use it while translating (would this be useful?).  It will also be cool to build the openemr language tables from the google-only spreadsheet, just to see what openemr looks like and how well openemr’s translation engine performs. Also, being able to translate doctor’s notes etc. in real time would be a cool function.

This is really cool. I hope google isn’t mad; I’ve clicked on Mark’s form many times just watching the translations pop up.

-brady

blankev wrote on Friday, May 08, 2009:

Brady,

Since the spreadsheet of translations is getting rather voluminous, do you agree that we should start with the Constants and the definitions of only the one language to be translated?

Spreadsheet "es"
Spreadsheet "du"
Spreadsheet "sw"

If wanted, translators can insert language columns at will for comparison. How many spreadsheets can be placed in Google Documents?

Brady,

Some time ago you said that it was rather easy to make the login page remember the last used language. If you still have nothing to do ;-)) , I would like to see this implemented in the next major release, since I keep forgetting to start with the right language and so will my colleagues.

Pimm

bradymiller wrote on Friday, May 08, 2009:

hey,

The number of Google docs is unlimited, but they are a pain to manage (we will need to add constants and re-sort them every once in a while, and it would be a nightmare keeping them all updated).  We could supply the main doc and one for google translations; then the translators could always copy the constants and their column to their own doc (or a doc they share with other translators).

Remembering the previous language and having per-user preferences won’t really be possible until we migrate all of our globals.php server/user settings to the mysql database (not sure when this will happen). But the globals.php in CVS does have an entry for a default language setting, so you could put Dutch there (the capital letter is important), and the default selection would be Dutch when you log in.

-brady

blankev wrote on Friday, May 08, 2009:

In MHO, I would suggest:

First Column: Numbered Constants

Second Column: English_Constants

Third Column: Any Language_Definition

Fourth Column: Google translation

As an option, it could be handy to insert and/or copy and paste any other accepted language translation next to the official columns. In my case it is easier when I can see the English, Dutch, and Spanish together, since medical terms are different from dictionary translations. (Even a medical dictionary won’t help for OpenEMR translation.)

So do NOT use the Google translation for comparison of all Language Definitions, but only to translate columns of the same language. Translated Definitions from the translator should get preference over the Google translation. The Google translation should get preference over English Constants. English Constants should be kept for non-translated constants. These last ones, the Constants, will always be included in real-time OpenEMR-International.
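
The order of preference described above could be sketched in Python as below. The function name and the sample rows are made up for illustration (the “Google” column is not real API output), and treating an empty cell as “no translation” is an assumption.

```python
def choose_display_text(constant, translator_definition=None, google_translation=None):
    """Assumed precedence: translator first, then Google, then the English constant."""
    for candidate in (translator_definition, google_translation):
        if candidate and candidate.strip():
            return candidate
    return constant

# Hypothetical rows: (English constant, translator's Dutch, machine Dutch).
rows = [
    ("Patient", "Patiënt", "Geduldig"),
    ("Billing", "", "Facturering"),
    ("Encounter", "", ""),
]
chosen = [choose_display_text(*row) for row in rows]
print(chosen)
```

The first row shows why the translator's column has to win: a dictionary-style translation of “Patient” can pick the adjective rather than the medical noun.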

Google columns should only be used as an example of same-language translations, or to show people what OpenEMR might look like once translated.

There might be a problem with this approach, since Google does not give a flag when a translation is wrong. Only hand-built translations can have official approval with any new OpenEMR version. Even my Dutch translation is still experimental. Many words I used need more fine-tuning or some kind of correction, but I know the English version, so I can handle them in my translated OpenEMR and accept understandable mistakes with a smile.

Pimm

sunsetsystems wrote on Friday, May 08, 2009:

Brady, are you *sure* that “SET NAMES utf8” doesn’t work with a latin1 database?  I was just noticing that mysqldump (which OpenEMR’s backup.php uses) generates that by default, and restores seem to work regardless of the database character set.

Rod
www.sunsetsystems.com

bradymiller wrote on Friday, May 08, 2009:

hey,

Very strange. On my Mandriva development system it was very consistent. If I insert the script via openemr’s embedded phpmyadmin I get the following:

If I installed openemr with the current default UTF8 database, then SET NAMES is needed for special characters to work (otherwise special characters are ruined).

If I installed openemr with latin1 (I chose the “not force” option during install and confirmed it was latin1 via mysql), then SET NAMES ruins the special characters.

Note, it’s odd, because when I say “ruin”, I mean that even setting the browser to the alternate encoding scheme does not work.
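
One possible explanation for this kind of permanent damage, offered as an assumption rather than a confirmed diagnosis, is double encoding: text that is already UTF-8 gets treated as latin1 and converted to UTF-8 a second time before it is stored. The Python sketch below shows that the stored bytes then decode to the wrong characters under either browser encoding.

```python
original = "é"

# First, correct encoding: one character becomes two UTF-8 bytes.
utf8_bytes = original.encode("utf-8")

# A layer that wrongly assumes latin1 input "converts" it to UTF-8 again.
double_encoded = utf8_bytes.decode("latin-1").encode("utf-8")

# Neither browser encoding shows the original character any more.
as_utf8 = double_encoded.decode("utf-8")
as_latin1 = double_encoded.decode("latin-1")
print(as_utf8 == original, as_latin1 == original)

# The damage is still reversible in code, by undoing the extra step.
repaired = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")
print(repaired == original)
```

If this is what is happening, switching the browser encoding cannot help, because the extra conversion is baked into the stored data.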

There do appear to be several reported bugs in phpmyadmin somewhat related to this, especially considering it’s such an old version. If that’s the case, we could force it to use SET NAMES if the utf8 flag has been set, and possibly just getting the html headers to have utf-8 encoding will fix it. Anyways, still looking into it.  Let me know if you are able to insert it with SET NAMES on latin1 via phpmyadmin; perhaps it’s development-environment specific (I really think this encoding stuff may introduce some environment-specific issues that will take a little time to iron out).
-brady