Code to speed up the translation system

teddziuba wrote on Tuesday, May 08, 2012:

I just committed some code to my GitHub branch to re-tool the i18n/translation system, and also provide a helper function to profile database queries.

Profiling DB Queries

I added a file, library/query_profile.php, that has functions to track database queries. If you set the query_debug global variable, these functions will not only keep track of all database queries that get made per request, but also the code paths that produced them.

There are functions to print out the query profile either as an HTML comment or on the console with error_log, if you develop with php -S like I do. There’s an example usage of this functionality in encounters.php.

Updated i18n Functionality

After doing some modest query profiling with the above functionality, I found out that the internationalization feature of OpenEMR generates a lot of database queries, on the order of hundreds per page load. A web application should do 5, maybe 6 database queries per page load.

This patch introduces pre-caching to the i18n system. The basic idea is to query the database for all phrase translations at page load time, then every time xl() is called, it’s a simple hash table lookup instead of a database query.
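The pre-caching idea can be sketched like this. It is an illustration only, not the patch itself: it is written in Python with sqlite3 in place of MySQL, the table and column layout is a simplified assumption, and the helper build_translation_cache is a hypothetical name. Only xl() comes from the post.

```python
import sqlite3


def build_translation_cache(conn, lang_id):
    """One query at page load: fetch every phrase translation for the language."""
    rows = conn.execute(
        "SELECT c.constant_name, d.definition "
        "FROM lang_definitions d "
        "JOIN lang_constants c ON c.cons_id = d.cons_id "
        "WHERE d.lang_id = ?",
        (lang_id,),
    )
    return dict(rows)  # English phrase -> translated phrase


def xl(phrase, cache):
    """Translate via hash table lookup; fall back to the English phrase."""
    return cache.get(phrase, phrase)


# Tiny demo data set (assumed schema, hypothetical language id 2 for Dutch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lang_constants (cons_id INTEGER PRIMARY KEY, constant_name TEXT);
    CREATE TABLE lang_definitions (cons_id INTEGER, lang_id INTEGER, definition TEXT);
    INSERT INTO lang_constants VALUES (1, 'Save'), (2, 'Patient');
    INSERT INTO lang_definitions VALUES (1, 2, 'Opslaan'), (2, 2, 'Patiënt');
""")

cache = build_translation_cache(conn, lang_id=2)
print(xl("Save", cache))
print(xl("Not translated yet", cache))
```

After the cache is built, each xl() call costs a dictionary lookup and no database round trip, and untranslated phrases simply come back in English.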

I noticed a switch in there to turn off translation for English phrases, bypassing the query, but that’s not terribly robust as other language users still suffer the performance hit. This commit addresses that.

teddziuba wrote on Tuesday, May 08, 2012:

Edit

Oh my, this doesn’t work nearly as well as GitHub. Sorry this message is all borked up.

Also, here’s a link to my commit: https://github.com/openemr/openemr/pull/6

yehster wrote on Tuesday, May 08, 2012:

Any data on how long it takes to build the translation cache?  This is a trade-off of lots of individual “small” queries for one larger query (which I’m sure is still relatively fast).

I’m not sure how big the translation table is for the “most complete” language, where the cache will take the longest to build.

Brady, how will all this fit with your “comments” changes?

sunsetsystems wrote on Tuesday, May 08, 2012:

So that loads all constants and their translations for the given language on every page load?  I’m very surprised that would be faster than doing the individual queries as needed (MySQL does caching also).

Well I guess it would be faster if there are not many translations for the language.  Try it with a language that is mostly translated and see how the times compare.
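One way to set up such a comparison is sketched below. It is a timing harness only and says nothing about how OpenEMR on MySQL actually behaves: it uses Python with an in-memory sqlite3 table, a made-up single-table schema, and synthetic phrases. The 5500 and 200 figures are the ones quoted elsewhere in this thread.

```python
import sqlite3
import time

N_CONSTANTS = 5500  # constant count quoted in this thread
N_LOOKUPS = 200     # rough per-page query count quoted in this thread

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE translations (phrase TEXT PRIMARY KEY, definition TEXT)")
conn.executemany(
    "INSERT INTO translations VALUES (?, ?)",
    [(f"phrase {i}", f"vertaling {i}") for i in range(N_CONSTANTS)],
)
wanted = [f"phrase {i}" for i in range(N_LOOKUPS)]


def per_phrase(conn, phrases):
    """Current approach: one query per translated phrase."""
    out = {}
    for p in phrases:
        row = conn.execute(
            "SELECT definition FROM translations WHERE phrase = ?", (p,)
        ).fetchone()
        out[p] = row[0] if row else p
    return out


def bulk_then_lookup(conn, phrases):
    """Proposed approach: load everything once, then look up in memory."""
    cache = dict(conn.execute("SELECT phrase, definition FROM translations"))
    return {p: cache.get(p, p) for p in phrases}


def timed(fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


individual, t_individual = timed(per_phrase, conn, wanted)
bulk, t_bulk = timed(bulk_then_lookup, conn, wanted)
print(f"{N_LOOKUPS} individual queries: {t_individual * 1000:.2f} ms")
print(f"one bulk query + lookups:  {t_bulk * 1000:.2f} ms")
```

Against a real server the network round trip per query is likely to dominate, so a test like this is only meaningful when run against the actual MySQL setup with the Dutch translations loaded.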

Here’s an idea unrelated to caching:

What might be useful is to store hashes of the English strings as the language constants, instead of the strings themselves.  In this way the lang_constants table could be eliminated, as the hash could take over the role of a “constant ID”, and queries would presumably be simpler and faster.  This would probably simplify maintenance also.
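A rough sketch of that idea, in Python with sqlite3 for brevity. The choice of SHA-1, the single-table layout, and the helper names phrase_key, add_translation, and translate are all assumptions made for illustration; nothing here reflects existing OpenEMR code.

```python
import hashlib
import sqlite3


def phrase_key(english):
    """Derive the 'constant ID' from the English string itself (SHA-1 is an assumption)."""
    return hashlib.sha1(english.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
# One table keyed by (hash, language); no separate constants table is needed.
conn.execute(
    "CREATE TABLE lang_definitions ("
    " phrase_hash TEXT, lang_id INTEGER, definition TEXT,"
    " PRIMARY KEY (phrase_hash, lang_id))"
)


def add_translation(conn, english, lang_id, definition):
    conn.execute(
        "INSERT INTO lang_definitions VALUES (?, ?, ?)",
        (phrase_key(english), lang_id, definition),
    )


def translate(conn, english, lang_id):
    """Single-table primary-key lookup; fall back to English when missing."""
    row = conn.execute(
        "SELECT definition FROM lang_definitions"
        " WHERE phrase_hash = ? AND lang_id = ?",
        (phrase_key(english), lang_id),
    ).fetchone()
    return row[0] if row else english


add_translation(conn, "Save", 2, "Opslaan")  # lang_id 2 is a hypothetical Dutch id
print(translate(conn, "Save", 2))
print(translate(conn, "Cancel", 2))
```

Because the key is computed from the phrase, the lookup needs no join, and a fixed-length hash indexes more compactly than long English strings.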

Rod
www.sunsetsystems.com

bradymiller wrote on Tuesday, May 08, 2012:

Hi,

I still haven’t looked at, tested, or reviewed the code on GitHub. To quickly address the above questions:
1. There are 5500 constants, growing by about 500 or so yearly.
2. A good language that contains almost all 5500 constants is Dutch, which could be used to test processing times etc.
3. The “comments” code I just committed (only several lines were modified) will conflict with this code, but would be very easy to fix (the conflict is only because Ted has (nicely) reorganized/modularized the code).

-brady
OpenEMR Project

teddziuba wrote on Tuesday, May 08, 2012:

Where can I find the Dutch language pack? I’ll load it in and test it.

Basically anything is faster than doing 200 SQL queries per page load.

bradymiller wrote on Tuesday, May 08, 2012:

Hi,

It’s already included (all the languages are already included), so you just need to select the Dutch language at login. If you use the development version (master branch), you will have about 5200 or so Dutch translations. If you want to use the most recent translation set (it is built daily), which has 5500 Dutch translations, you can download it (along with installation instructions) here:
http://open-emr.org/wiki/index.php/Install_Translations

If you’re interested, here are some of the vital wiki docs for the translation engine:
http://open-emr.org/wiki/index.php/OpenEMR_Internationalization_Translator_Guide
http://open-emr.org/wiki/index.php/OpenEMR_Internationalization_Development_Guide

-brady
OpenEMR Project

bradymiller wrote on Wednesday, May 09, 2012:

Hi,

Placed a review on the GitHub commit.

thanks,
-brady
OpenEMR

yehster wrote on Friday, June 15, 2012:

Regarding the profiling mechanism proposed in this thread: it turns out that ADOdb actually has built-in profiling:

http://phplens.com/lens/adodb/docs-adodb.htm#logsql