cbezuidenhout wrote on Wednesday, February 08, 2012:
in an ajax return string using json_encode should I be using xla, xlt, xl, attr, tet or htmlspecialchars ?
thanks
- Craig
Tajemo Enterprises
cbezuidenhout wrote on Wednesday, February 08, 2012:
in an ajax return string using json_encode should I be using xla, xlt, xl, attr, tet or htmlspecialchars ?
thanks
- Craig
Tajemo Enterprises
bradymiller wrote on Wednesday, February 08, 2012:
Hi,
Not really sure if there is a global answer(are you translating something or not etc.). Can you provide the code example?
-brady
cbezuidenhout wrote on Wednesday, February 08, 2012:
yes translation required
here is the code line
echo json_encode(array("error"=> xl('Record already exist') ));
so I was thinking should be xla ?
- Craig
Tajemo Enterprises
yehster wrote on Wednesday, February 08, 2012:
json_encode excapes special characters, so I’m fairly certain you should just use plain xl to get the appropriate “raw text”.
bradymiller wrote on Wednesday, February 08, 2012:
Hi,
Read this:
http://stackoverflow.com/questions/6892559/i-sometimes-see-that-people-wrap-json-encode-in-htmlspecialchars-why
http://php.net/manual/en/function.json-encode.php (example 2)
Sounds like xla may be best, since the translated variable could contain <script></script> tags that won’t be escaped, so I think on the echo of it, could run the cross-scripting attack code. However, I don’t think this code is going directly to the browser, but is instead being send via an ajax, so not sure if that makes a difference. Thoughts anybody?
-brady
bradymiller wrote on Wednesday, February 08, 2012:
another useful link:
http://owasp.securityinnovation.com/ViewGuidanceItem.aspx?ItemID=16c073b9-e951-4f25-a282-e867d2d808f7
yehster wrote on Wednesday, February 08, 2012:
Like you said before, it really depends on the use case.
However, Why would there be script tags in the translated string? That would mean someone put script tags in the translations table? Why would it be correct to some how have xl(“Record already exist”) return "Record already exist<SCRIPT>window.alert(“yo!”)</SCRIPT>
If there is code that needs to be executed, that should be handled seperately. xl strings are generally displayed and not ever “executed”.
Also, from the article yo linked, I think this is the most relevant item:
A far better solution to this problem is to simply not send JSON data directly in an HTML source (JSON encoding just makes the content safe to embed in JS, not HTML).
So if the text is being handled by javascript (in an ajax handler) it shouldn’t need html entities escaped to prevent cross-site scripting.
Plus compromising the system through this mechanism would mean that someone put script code in your translation table (which means they have database access, so the scripting issue/attack vector wouldn’t really be relevant relative to your other problem).
However, if you did have html tags in the translated string, e.g." <em>Error</em> something bad happened"
running it through xla and passing it through json_encode is likely going to not give the desired result. (The tags for em will get displayed < etc…. rather than being emphasized.
I think as a general rule, there shouldn’t be html or other tags in the translation table. It may only be possible to display some important characters in other languages using an html tag, but where possible, we should avoid it.
After all that here’s what I think.
xl:
using plain xl is likely to make sure that things are displayed as intended
xla:
using xla will prevent someone with access to the translation table from hijacking end user’s browser to potentially do something odd, but may not display the way it’s supposed to.
bradymiller wrote on Wednesday, February 08, 2012:
Hi,
To clarify. HTML elements are not allowed in the translation table. So xla should not produce any undesirable effects of translation strings. Script tags could get in if somebody gets access to Administration->Languages, Administration->Database or even perhaps somebody may sneak one in on the public google docs translation spreadsheet.
-brady
yehster wrote on Wednesday, February 08, 2012:
If HTML elements aren’t allowed in the translation table, then why would we want to use xl over xla?
In other words why two versions then?
bradymiller wrote on Wednesday, February 08, 2012:
Hi,
xl() is the standard translation function.
Another developer, Stephen, made a good case that embedding htmlspecialchars within xl was making the code hard to follow, which is how I was initially doing things. So he then made abbreviated combination functions:
xla()(with ENT_QUOTES) and xlt()(with ENT_NOQUOTES) are the standard translation function with htmlspecialchar escaping built in.
Note there are situations where you do not want to use htmlspecialchars:
1) if need to escape within a literal, then need to use addslashes() instead
2) if sending the translation to another format besides html (saving it to a file or transferring over internet etc.)
3) and am guessing more (perhaps even above may be a situation when above is sorted out)
See step 4 in http://www.open-emr.org/wiki/index.php/Codebase_Security#Plan
-brady
yehster wrote on Wednesday, February 08, 2012:
Brady,
Your last post is informative and I’m still thinking about things, but I suspect that worrying too much about which xl version to use in the event of
Script tags could get in if somebody gets access to Administration->Languages, Administration->Database or even perhaps somebody may sneak one in on the public google docs translation spreadsheet.
Is akin to putting a bandage on your arm in a spot where you anticipate some one might hit you with a chain saw.
bradymiller wrote on Thursday, February 09, 2012:
Hi,
Makes sense to escape everything that the user can possibly get to, including both $_REQUEST stuff and stuff from the database(which includes translation entries. For the situation above, once figure out what to do, can simply document it for the next time. The thingd I am wondering is:
1. if echo of a string containing a <script>alert(“hey”)</script> in a ajax script even does anything(does an alert window open), or does it just get sent to the script making the ajax request.
2. After the script gets the information via the ajax call, will the above string open a alert prompt if inserted into the html page via jquery.
-brady
yehster wrote on Thursday, February 09, 2012:
Brady,
What would happen with a script tag returned via ajax depends a lot of what the ajax handler does.
A call might look something like this.
$.post("givemereminders.php",
{
whos_stuff: "Joe"
},
function(data){
// This is the ajax handler that gets called when the server finishes.
if(data.error!=null)
{
alert(data.error)
}
else
{
$("#reminders").html(data.html);
}
},
"json");
So if the php code returned an error via this mechanism, the error would get handled as an alert, and if there were script tags in it then they would just get displayed as text in the alert. (In your example, there would be an alert dialog would print “<script>alert(“hey”)</script>”
However if your php code returned a json object flagged as html the browser in this case would parse the script tag and execute it.
Being able to send script tags like that is useful so you can for example bind events to newly created elements.
bradymiller wrote on Thursday, February 09, 2012:
Hi,
Then basically seems impossible to know whether to do xla() or addslashes(xl()) on the script above. Brain hurts. With translations, actually, xla() is just fine, because we actually convert single and double quote to back-tics anyways in the translation engine. This was done, because unescaped single-quotes/double-quotes in translation strings were causing white screens of death when used a javascript literals. However, if there were a variable that didn’t need translating, then seems impossible whether would know if should use attr/text() vs. addslashes.
-brady
yehster wrote on Thursday, February 09, 2012:
Brady,
Because json_encode escapes characters for the ajax protocol, I don’t think there is any need to worry about “white screen of death.” json_encode does something similar to addslashes. I think if you used addslashes and then passed the resultant string to json_encode, you would “double escape” things, which isn’t correct either.
I stand by my initial assertion that plain xl is probably the correct thing to do in this scenario.
The script injection issue into the translation table while a “real threat” seems highly improbable. If you are confident that your translations table is clean (only folks you trust have access to the database table, you got your initial translations from a trusted source and/or you verified there was nothing funny after you inserted them) then what the ajax handler does is a non-issue.
bradymiller wrote on Thursday, February 09, 2012:
Hi,
There are about 150 translators with access to the spreadsheet and I can’t monitor all changes…
What about if this wasn’t a translated variable, but instead was a fname from the patient_data table in database(and it’s easy to insert <script> tag items into pt demographics). Then, wouldn’t you consider using attr() ? Don’t see why different from above (since a user can do same at the Administration->Languages for any number of translations).
-brady
yehster wrote on Thursday, February 09, 2012:
Brady,
I looked at the security wiki you posted again, and I think that
echo json_encode(array("error"=> xlt('Record already exist') ));
is probably the correct function to use, but it all really depends on the context.
bradymiller wrote on Friday, February 10, 2012:
Hi,
Sounds good. Agree that it really depends on the context.
-brady
yehster wrote on Friday, February 10, 2012:
Brady,
I guess the point I was missing before was that we should always try to run dynamic content that isn’t html through “htmlspecialcharacters” at some point. I seem to remember now some of my initial code contributions where I only used xl and before you brought them into the official code base you added an appropriate “htmlspecialcharacters” call.