Translations and OpenEMR (part 3)

blankev wrote on Thursday, May 21, 2009:

Because the discussion under "Foreign Language Support" is getting overburdened, I am starting a new subject heading!

(If needed we could make different SUBJECTS for different problems)

Please note that most translation discussions were resolved under the "Foreign Language Support" subject in the Developers Forum, and during the transitions mentioned at the start of that subject.

Right now there is enough new discussion material to continue with a new SUBJECT heading.

Compilation:

Brady Miller started a Google spreadsheet where each translator has their own language column. Scripts pull these translations into the CVS Demo version of OpenEMR and, when available, into the official new releases.

In the Demo you can choose your language on the login screen and see if it works. If you doubt whether some parts are translatable in your language, please choose the dummy language. (Has nothing to do with your state of mind)

Before you dive into this discussion, please be sure to read the manuals. Find them at www.oemr.org

Start by making some changes to your own language definitions in your own OpenEMR VERSION. Log in, go to ADMINISTRATION => LANGUAGES => CHANGE DEFINITIONS => SAVE and test the result.

You can also make your own language. Then make some definitions in your language and test as shown before.

If the results of your efforts are positive, please get in contact through this Developers Forum so you can be rewarded with translation permissions and become a member of the official translation developments. (Don’t be afraid, we all started as dummyzzzz but the dummy language was first to have 100% coverage in the translation spreadsheet, ;-).

MAIN DISCUSSION AT THE MOMENT (as of 20090521):

1. How to translate a non-English part of OpenEMR (Psychiatric forms) so it can be used with multi-language translations.

2. How to get text in OpenEMR that is not yet coded for translation into the translation Google spreadsheet.

3. How to keep us engaged, to continue with the painstaking work of finishing all translation problems for Internationalization of OpenEMR.

Pimm

bradymiller wrote on Thursday, May 21, 2009:

1. Plan for now to keep constants in English. So I would recommend translating the form to English. It needs to be translated to English first anyway if we’re going to get all the other language definitions (most of our translators can likely only go from English to their language, not Dutch to their language).

2. Have a file in the builder scripts kept in cvs, which will hold these (it’s now empty, but soon to get several hundred):
http://openemr.cvs.sourceforge.net/viewvc/openemr/openemr/contrib/util/language_translations/manuallyAddedConstants.txt

3. Hopefully soon gonna dump several hundred more constants to the spreadsheet. (Will export the Google Doc, modify it, then re-import the new version)

-brady

bradymiller wrote on Thursday, May 21, 2009:

hey,

Incorporated the layouts and lists into the translations. It’s pretty cool (I also placed flags to allow disabling of each of these in globals.php) and rather interesting actually. Rod’s listing feature actually provides a mechanism for a multi-lingual clinic when the translation engine is run on top of it. You have to try it in the cvs demo to see what I mean (i.e. fill out demographics and history for a patient in Dutch, then look at it in English or Spanish; all the stuff that has been placed via lists will be translated, assuming it has a definition in the language table).

Plan to add about 250 constants this weekend to the translation spreadsheet (this will give very good coverage, although there are still some cracks to be filled). Rod, you’ll need to add the Armenian translations to our spreadsheet before I do this, since it will re-order the constants (I export the spreadsheet from Google Docs, run it through a Perl script, then re-import it to Google Docs).

-brady

sunsetsystems wrote on Thursday, May 21, 2009:

4. Maintaining reasonable support for the translation GUI in OpenEMR.  Two issues here: (a) the form is too big, takes much too long to load and save, rendering it virtually unusable; and (b) we need a mechanism to submit translations to the spreadsheet that were entered via this GUI.

Armenia is trying to enter translations now and I don’t want to turn them loose on the spreadsheet, so I’ll be trying to work out a quick fix for 4a ASAP.

Rod
www.sunsetsystems.com

sunsetsystems wrote on Thursday, May 21, 2009:

Regarding #4a, I have checked in changes to interface/language/language.php and lang_definition.php to support a "filter" that selects the language constants to be shown. Basically whatever you put in will have a "%" appended and is then used in a LIKE clause to match the constant names. You can even do things like type in "%encounter" to select all constants containing the word "encounter".
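The filter behaviour described above can be sketched as follows. This is only an illustration in Python with an in-memory SQLite table, not the PHP code that was checked in; the table layout (`lang_constants` with a `constant_name` column) and the helper names `filter_constants` and `demo_connection` are assumptions made for the example.

```python
import sqlite3


def filter_constants(conn, user_filter):
    """Return constant names matching the filter, with '%' appended to it."""
    pattern = user_filter + "%"  # a trailing wildcard is always added
    rows = conn.execute(
        "SELECT constant_name FROM lang_constants "
        "WHERE constant_name LIKE ? ORDER BY constant_name",
        (pattern,),  # bound parameter, so the filter text is never spliced into SQL
    ).fetchall()
    return [row[0] for row in rows]


def demo_connection():
    """Build a tiny in-memory table with hypothetical constants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lang_constants ("
        "cons_id INTEGER PRIMARY KEY, constant_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO lang_constants (constant_name) VALUES (?)",
        [("Encounter",), ("New Encounter Form",), ("Patient",),
         ("Encounters Report",)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(filter_constants(conn, "Enc"))         # prefix match: Enc%
    print(filter_constants(conn, "%encounter"))  # contains match: %encounter%
```

A leading "%" typed by the user turns the prefix match into a "contains" match, which is why "%encounter" finds the word anywhere in the constant name.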

Regarding #2, I think we’ll create a mess by maintaining the exceptions separately from the code.  I suggest designing a specially formatted type of comment to specify that some string should be a translation constant, and then put those comments in the code close to where they apply.  That way the developers might actually maintain them.

Rod
www.sunsetsystems.com

bradymiller wrote on Friday, May 22, 2009:

hey,

Regarding #2: these all come from places where a variable, not a static string, is passed to xl(). Check out the output of my collectConstant.pl script. This script builds the list of constants (using a previous list for comparison). The first list is the new constants that have been found in xl(); the second is the “MANUALLY ADDED” ones (from a file), which you’ll recognize as mostly lists/layout stuff. The third is “KEEPING”, which are constants that were in the previous list but are no longer found (xl() statement gone); we’re keeping these to ensure compatibility with previous versions (this can be toggled off if only a current listing is desired). Then “REMOVED” (from a file) can be used to remove obviously erroneous constants from the script’s output.
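For readers who want to see the shape of that four-way split, here is a rough Python sketch. It is not collectConstant.pl; the regular expression, the function name `collect_constants`, and the exact rules for each category are assumptions made for illustration. Note that a static xl() call inside a comment is picked up too, while xl($variable) is not.

```python
import re

# Matches xl('...') or xl("...") with a static string argument only.
# Escaped quotes inside the string are not handled in this sketch.
XL_CALL = re.compile(r"""xl\(\s*(['"])(.*?)\1\s*[,)]""")


def collect_constants(source_texts, previous, manually_added, removed,
                      keep_old=True):
    """Return (final sorted constant list, report of the four categories)."""
    found = set()
    for text in source_texts:
        for _quote, constant in XL_CALL.findall(text):
            found.add(constant)

    previous = set(previous)
    manual = set(manually_added)
    removed = set(removed)

    report = {
        "NEW": sorted(found - previous),
        "MANUALLY ADDED": sorted(manual - found),
        # Old constants whose xl() call has disappeared from the code.
        "KEEPING": sorted(previous - found - manual - removed) if keep_old else [],
        "REMOVED": sorted(removed),
    }

    final = found | manual
    if keep_old:
        final |= previous
    final -= removed
    return sorted(final), report


if __name__ == "__main__":
    sources = ['echo xl("Add"); echo xl(\'Patient\'); echo xl($label); // xl("(More)");']
    final, report = collect_constants(
        sources,
        previous=["Add", "Old Label", "xx"],
        manually_added=["Sex"],
        removed=["xx"],
    )
    print(final)
    print(report)
```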

I posted the output of the script here:
https://sourceforge.net/forum/forum.php?thread_id=3279872&forum_id=202506

At this point I’m predicting we’ll have about 150-175 constants total in the manuallyAdded file. It wouldn’t be hard to go back and place these (just put an xl(constant) in comments; for example, put a // xl("(More)"); in interface/globals.php where this variable is declared). But most of these came from the lists/layouts, and I worry that it may make the lists/layouts scripts a bit cumbersome and ruin the beauty of their generic approach. Maybe a lists/layouts text file is needed that just lists, in an organized fashion, the list and layout labels that are appropriate for translation?

-brady

blankev wrote on Friday, May 22, 2009:

Brady,

In layouts and lists you can also make some kind of translation, as well as additions in your own language.

I suppose language_definitions take precedence?

Which translation wins if one is changed?
1. lang_definitions
or
2. LABELS in Layout

Is this of any importance, since we had so many discussions about users making terrible, incomprehensible mistakes?

Pimm

bradymiller wrote on Friday, May 22, 2009:

hi,

If you make your own labels in your own language, then the engine will actually try to convert these from English to your language (it will fail, so it then keeps your label). There is a minute chance that problems could arise:

What happens if you type something in your language as a label that is, by chance, also an English constant? It will get translated to your definition of that constant. Again, a minute chance but a possibility (pretty much impossible if using Chinese; too bad you’re not Chinese). This is why I placed the options for turning off translations for layouts and/or lists in interface/globals.php.
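The fallback and the rare collision can be shown with a few lines of Python. This is a sketch of the idea only, not the actual translation engine; the function name `translate`, the `enabled` switch (standing in for the flags in interface/globals.php), and the Dutch sample definitions are all hypothetical.

```python
def translate(label, definitions, enabled=True):
    """Return the definition for `label`, or `label` itself if none exists."""
    if not enabled:  # stands in for the on/off flags in globals.php
        return label
    return definitions.get(label, label)


# Hypothetical Dutch definitions keyed by English constant.
DUTCH = {"Room": "Kamer", "Sex": "Geslacht"}

if __name__ == "__main__":
    # Normal case: an English constant with a definition is translated.
    print(translate("Sex", DUTCH))
    # A custom Dutch label has no definition, so it is kept unchanged.
    print(translate("Huisarts", DUTCH))
    # Collision: a user types the Dutch word "Room" (cream) as a label,
    # which happens to match the English constant "Room".
    print(translate("Room", DUTCH))
    # With translation of layouts/lists switched off, the label survives.
    print(translate("Room", DUTCH, enabled=False))
```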

There are two situations I see:
1) A single-physician practice of a Spanish guy who knows no English wants to start using/evaluating OpenEMR. He’ll likely use (actually need) the translations in layouts/lists to even use the program.
2) A clinic with more money and staff (somebody knows English). They would have more resources, so could just turn off the translations in layouts/lists and put their own language into the fields while configuring for their clinic.

At first I felt like this translation solution for layouts/lists wasn’t that great, but then I realized what would happen if you took the lists to the extreme. The lists, which are user data, basically consist of an ID (stored in the database) and a superficial label. Hence, the data (ID) in the database could be considered language neutral (in some cases it’s actually just a number). If a clinic wanted complete multi-lingual functionality, including patient data, they could develop an intricate system of lists (e.g. physical exam results), so that a physician in Morocco would be able to read the PE findings just as the author wrote them in Dutch. Now that would be cool (in an academic way); CAMOS is something that really comes to mind here for this type of function.
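A small Python sketch of that idea: the patient record stores only the list option ID, and the label is looked up and translated at display time. The list contents, language codes, translations, and the function name `display_value` are made up for illustration and are not taken from the OpenEMR tables.

```python
# English labels per list option ID (hypothetical list).
LIST_OPTIONS = {
    "marital": {"married": "Married", "single": "Single"},
}

# Hypothetical definitions per language, keyed by English label.
DEFINITIONS = {
    "nl": {"Married": "Gehuwd", "Single": "Alleenstaand"},
    "es": {"Married": "Casado", "Single": "Soltero"},
}


def display_value(list_id, option_id, lang):
    """Render a stored option ID in the viewer's language."""
    english = LIST_OPTIONS[list_id][option_id]
    # Fall back to the English label when no definition exists.
    return DEFINITIONS.get(lang, {}).get(english, english)


if __name__ == "__main__":
    stored = {"marital": "married"}  # what the database would hold
    for lang in ("nl", "en", "es"):
        print(lang, display_value("marital", stored["marital"], lang))
```

The stored value never changes; only the rendering depends on the language chosen at login.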

-brady

blankev wrote on Friday, May 22, 2009:

And working from memory, I would say that the Dutch parts in OpenEMR are only some forms that can be used, where you have to use the ACCEPT/REJECT BUTTONS.

Just a matter of xl( … ) coding for international translation in lang_constants => lang_definitions

Could you include the three Dutch forms in the online CVS Demo, so we can give them the proper translation input and dummy check…?

Pimm

blankev wrote on Friday, May 22, 2009:

I looked at the Dutch forms. Did not find ANY xl( … ) codes! All was in Dutch; even with the "dummy" test, no dummies were found.

Conclusion:

1)
we have to make a copy of these forms and make the copy in English, including xl( ) code! Just like vitals-m-form and vitals-form. I don’t have any in-depth information about the JavaScript that is used and to what extent it needs to be adapted to work with multi-level international translation.

2)
we give the Dutch forms xl( ) code and add a column in the translation spreadsheet with lang_definitions for the American-English language! Just the same as Swedish, Spanish, Dutch etc…
(My suggestion, if (2) is going to be the choice: give every Developer the option to develop in his/her own language, but they MUST add a lang_constant column AND a lang_definition in American-English before the development can be accepted.)

Let’s make a choice and start translating…

Rod, Brady, Sam, Joe let’s vote and continue!

The next vote round has to be on:
Do it sentence by sentence or with some kind of script.

Pimm

sunsetsystems wrote on Friday, May 22, 2009:

Looks like I missed some discussion.  Who wants the Dutch forms to be translated, and to what language?  I do think the standard should be to have an English translation for everything, but of course many parts of the project are in transition.  As for voting, the person doing the work tends to get the biggest vote.  :-)

I have mixed feelings about translating lists and layouts.  Either way might be best.  Again, the downside is maintaining two disjoint sets of English strings.  Sites are likely to customize these anyway, so such translation efforts are likely to be partially wasted.  And I think almost all clinics will want to standardize on a language – after all, the docs and staff do have to communicate with each other and they will have various things on paper anyway – so supporting multiple languages in one clinic seems of dubious value.  I would vote for procrastination on this.  :-)

There are some other things generally customized by the user, such as statement templates, that should probably skip the translation engine.

Rod
www.sunsetsystems.com

cfapress wrote on Friday, May 22, 2009:

I agree, the growing list of translation items makes the Admin->Language page take very long to load.

Or more specifically, the page takes very long to render. I believe the data transfer is pretty quick compared to the time required for the browser to render a table with 200+ rows.

I propose two solutions to this:

1) Forget the TABLE and switch to a UL or SPAN tags with a given width, say 50%. Browsers usually render those two tags much faster than TABLE.

2) Paginate it all. Break the definitions into pages of 10/25/50 items at a time. The user can choose how many constants to see on a single page.
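The pagination proposal could look roughly like this. It is a Python sketch of the slicing arithmetic only (the real page would be PHP, most likely with a LIMIT/OFFSET query); the function names `paginate` and `page_count` and the sample data are assumptions.

```python
def page_count(total_items, per_page):
    """Number of pages needed, using ceiling division."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    return (total_items + per_page - 1) // per_page


def paginate(items, page, per_page=25):
    """Return the items for a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    start = (page - 1) * per_page
    return items[start:start + per_page]


if __name__ == "__main__":
    constants = ["constant %d" % n for n in range(1, 211)]  # 210 dummy rows
    for size in (10, 25, 50):  # the page sizes suggested above
        print(size, "per page ->", page_count(len(constants), size), "pages")
    print(paginate(constants, 9, 25))  # last, partial page
```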

Of course I risk becoming the developer now that I’ve made the pitch. That’s OK. I can probably make the time for it. I will add a new feature request to the tracker referencing this post.

Jason

bradymiller wrote on Friday, May 22, 2009:

Pimm,
1) Let’s just translate the forms to English, and surround the translated English constants with xl(). Perfect “hello world” project for you in openemr…

Jason,
Glad to see you’re also entering the world of internationalization. If you want viewing access to the translation Google Docs spreadsheet, just let me know.

Rod,
Just so you know, the layouts/lists are now being translated, but this can be turned off in globals.php (in the growing translation settings section in globals.php). Try it out (use Dutch); would be great to get input on the layouts/lists views. My "default use" goal is that a non-English user can log in to openemr and start using it, and also be able to navigate the lists/layouts.

-brady

sunsetsystems wrote on Friday, May 22, 2009:

Jason, if you look for my followup post in this thread you’ll see I already added a “Filter” feature that pretty much takes care of the problem.  Of course paging wouldn’t hurt either.

Rod
www.sunsetsystems.com

bradymiller wrote on Saturday, May 23, 2009:

Hey, 

Time for the weekly translation table update and statistics.

I just added 225 more new constants to the translation spreadsheet.  I’d consider these constants high yield and include the currency ($ near bottom) symbol.  The new constants are listed here:
https://sourceforge.net/forum/forum.php?thread_id=3281138&forum_id=202506

Here are the statistics: 
Total number of english constants: 2421
Total number of definitions: 5499
Chinese: 3% (77 definitions)
Dutch: 91% (2197 definitions)
German: 1% (23 definitions)
Norwegian: 37% (903 definitions)
Russian: 2% (40 definitions)
Spanish: 43% (1030 definitions)
Swedish: 51% (1229 definitions)
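For anyone who wants to check the arithmetic, the percentages are simply each language's definition count divided by the number of English constants. The short Python sketch below recomputes them from the figures in this post; the variable and function names are mine, and the rounding to whole percent is an assumption about how the numbers were produced.

```python
TOTAL_CONSTANTS = 2421

# Definition counts per language, copied from the statistics above.
DEFINITIONS = {
    "Chinese": 77,
    "Dutch": 2197,
    "German": 23,
    "Norwegian": 903,
    "Russian": 40,
    "Spanish": 1030,
    "Swedish": 1229,
}


def coverage_percent(count, total=TOTAL_CONSTANTS):
    """Share of constants that have a definition, rounded to whole percent."""
    return round(100 * count / total)


if __name__ == "__main__":
    print("Total number of definitions:", sum(DEFINITIONS.values()))
    for language, count in DEFINITIONS.items():
        print("%s: %d%% (%d definitions)"
              % (language, coverage_percent(count), count))
```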

We will continue to include the “dummy” languages, which basically define the word dummy for all 2421 constants (there is also a dummyUTF language which uses a Chinese word for every translation). These are so we can find constants that aren’t translating and fix them.

The best way to test your translations is on the online cvs demo (this is the most current development version of openemr), which is using the new translation table, and allows you to select a language before logging in: 
http://oemr.org/modules/wiwimod/index.php?page=DemoCVS 

Still working on providing a “safe” set of sql tables for users who want a local version. I’ll place the link and instructions on the wiki below when finished.

Other News: 
We have an openemr language translation wiki at: 
http://www.oemr.org/modules/wiwimod/index.php?page=TranslationGuide 
To get "editing" access, register (at bottom right of screen) and then email me (brady@sparmy.com) your username, so I can give you wiki editing privileges. 

thanks, 
Brady

ideaman911 wrote on Sunday, May 24, 2009:

Brady et al;

The size and scope of the language tables is sounding both like we are making great inroads, and that we may overwhelm the ability to download an installation in a reasonable timeframe.

How difficult would it be to allow users to download a copy with selected languages which they could pick from our "list", and that would create their individual install language table?

Further to that, we should also consider the updating of that table at a future point as a language (or more) they overlooked needs to be added.

Things to ponder.  Thanks.  Nice work.

Joe Holzer    Idea Man
http://www.holzerent.com

blankev wrote on Sunday, May 24, 2009:

Joe,

Downloading the translation spreadsheet is easy and quick.

Deleting the text columns you don’t need should be easy if you have some knowledge of spreadsheets (MS Excel or OpenOffice).

Saving, renaming, and entering the right numbers for the definitions in parts of the spreadsheet is somewhat more complicated, but could be explained step by step.

The import into MySQL might be somewhat more complicated. You have to create the right CSV or TEXT files. But since I learned how to accomplish this task, I suppose it is not a huge enterprise. It takes concentration, but even an administrator with no deep software or scripting knowledge could import the tables using phpMyAdmin and fill the following tables:

lang_constants
lang_definitions

I don’t know anything about scripts but making the choices for constants and definitions for just some needed languages should be a possibility.
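For those who do want a script, selecting only the needed languages could be sketched like this in Python. Everything about the file layout is an assumption: the sketch supposes the spreadsheet is exported as CSV with a header row, a `constant` column, and one column per language. The row tuples are shaped for lang_constants (cons_id, constant_name) and lang_definitions (cons_id, lang_id, definition); check those column names against your own database before importing anything.

```python
import csv
import io


def build_tables(csv_text, wanted_languages):
    """Return (constants rows, definitions rows) for the chosen languages."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Hypothetical numbering: languages get IDs 1, 2, ... in the order given.
    lang_ids = {name: i for i, name in enumerate(wanted_languages, start=1)}

    constants = []    # (cons_id, constant_name)
    definitions = []  # (cons_id, lang_id, definition)
    for cons_id, row in enumerate(reader, start=1):
        constants.append((cons_id, row["constant"]))
        for name in wanted_languages:
            text = (row.get(name) or "").strip()
            if text:  # skip constants that are not translated yet
                definitions.append((cons_id, lang_ids[name], text))
    return constants, definitions


# Made-up sample of an exported spreadsheet.
SAMPLE = (
    "constant,Dutch,Spanish,Swedish\n"
    "Save,Opslaan,Guardar,Spara\n"
    "Add,Toevoegen,Agregar,\n"
)

if __name__ == "__main__":
    constants, definitions = build_tables(SAMPLE, ["Dutch", "Spanish"])
    print(constants)
    print(definitions)
```

The two row lists could then be written out as CSV files and imported with phpMyAdmin, as in the manual steps described above.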

Some time ago I made something that might be called a manual for importing the above-mentioned tables. If there is enough interest I can fine-tune it and make it available in Language Manuals.

Learning this way to import these tables took me about twenty times of trial and error. Now with the use of the manual I can do it in less than 20 minutes (not too much time for a onetime event?).

Pimm

bradymiller wrote on Friday, May 29, 2009:

Hey, 

Time for the weekly translation table update and statistics. 

Here are the statistics:
Total number of english constants: 2421
Total number of definitions: 10245
Bahasa Indonesia: 99.9% (2417 definitions)
Chinese: 62% (1502 definitions)
Dutch: 100% (2420 definitions)
German: 2% (44 definitions)
Norwegian: 37% (903 definitions)
Russian: 4% (89 definitions)
Spanish: 46% (1109 definitions)
Swedish: 73% (1761 definitions)

We will continue to include the “dummy” languages, which basically define the word dummy for all 2421 constants (there is also a dummyUTF language which uses a Chinese word for every translation). These are so we can find constants that aren’t translating and fix them.

The best way to test your translations is on the online cvs demo (this is the most current development version of openemr), which is using the new translation table, and allows you to select a language before logging in: 
http://oemr.org/modules/wiwimod/index.php?page=DemoCVS 

Still working on providing a “safe” set of sql tables for users who want a local version. I’ll place the link and instructions on the wiki below when finished.

Other News: 
We have an openemr language translation wiki at: 
http://www.oemr.org/modules/wiwimod/index.php?page=TranslationGuide 
To get "editing" access, register (at bottom right of screen) and then email me (brady@sparmy.com) your username, so I can give you wiki editing privileges. 

thanks, 
Brady

bradymiller wrote on Friday, May 29, 2009:

Hey,

Noticed the performance was dropping drastically as our definition list grew. Using the EXPLAIN command in MySQL, found out our current scheme was running through all rows (15,000 of them) during each query; ouch. Optimized it to only go through 12 rows with a well-placed index key in the lang_definitions table. Now it’s friggin fast.
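The effect of an index on the query plan can be reproduced in miniature. The sketch below uses Python with SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN; the query, the table layout, and the choice to index the `cons_id` column are assumptions for illustration, since the post does not say which column was indexed.

```python
import sqlite3

# Hypothetical lookup of one definition for a constant in one language.
QUERY = ("SELECT definition FROM lang_definitions "
         "WHERE cons_id = ? AND lang_id = ?")


def make_table():
    """Create an empty, unindexed lang_definitions table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lang_definitions ("
        "def_id INTEGER PRIMARY KEY, cons_id INTEGER, "
        "lang_id INTEGER, definition TEXT)"
    )
    return conn


def query_plan(conn):
    """Return SQLite's plan description for QUERY as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1, 1)).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the detail text


if __name__ == "__main__":
    conn = make_table()
    print("before:", query_plan(conn))  # a full scan of the table
    conn.execute("CREATE INDEX idx_cons ON lang_definitions (cons_id)")
    print("after: ", query_plan(conn))  # a search using idx_cons
```

Without the index the plan reports a scan of every row; with it, the engine jumps straight to the rows for the requested constant.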

Also noticed that the query for constants is case insensitive, which will cause issues. For example if we have two constants ‘Add’ and ‘add’, then one translation (whichever is first) is used for both. Not ideal, and was easily fixed by giving the constant column in the constants table a BINARY collation.
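The ‘Add’ versus ‘add’ problem is easy to reproduce. This Python sketch uses SQLite, where the NOCASE collation plays the role of a case-insensitive MySQL collation and BINARY plays the role of the binary one; the table layout and the function name `lookup` are assumptions, and MySQL's own collation names differ.

```python
import sqlite3


def lookup(collation, name):
    """Insert 'Add' and 'add', then return the rows matching `name`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lang_constants ("
        "cons_id INTEGER PRIMARY KEY, "
        "constant_name TEXT COLLATE %s)" % collation
    )
    conn.executemany(
        "INSERT INTO lang_constants (constant_name) VALUES (?)",
        [("Add",), ("add",)],
    )
    rows = conn.execute(
        "SELECT constant_name FROM lang_constants "
        "WHERE constant_name = ? ORDER BY cons_id",
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    # Case-insensitive column: both constants match, so the first one wins.
    print(lookup("NOCASE", "add"))
    # Binary column: only the exact constant matches.
    print(lookup("BINARY", "add"))
```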

Committed both of these optimizations, which are now working on the cvs demo.

-brady

blankev wrote on Saturday, May 30, 2009:

Dear Translation Developers,

working from memory, I seem to remember that there was a discussion in the past about a script that could create translations using a translation website…

But I can’t find this discussion in the Forums and can’t remember if this was feasible or if this was just terminated before it was born or if it had a bad Apgar score…

It might help to give new translators the choice to start with an automatic translation and make corrections only where needed. After some time this extra column could then be erased. At least it would help get quicker results. Going back and forth between the Demo version and the Spreadsheet is very time consuming, but it is the only way to evaluate whether translations are correct and to see how OpenEMR handles them.

If this suggestion is read by the person who tried to make this script, I would like to see how our new translators would handle this little piece of preparation at translation start-up.

Indonesian translation was done in no time… how did they manage to do this?

Pimm