Brady suggested I start a forum topic on this instead of burying my stuff in the tracker.
Some things I’m interested in hearing from other developers are your thoughts on:
* is it OK to add a dependency like this to pear’s libraries? They’re very commonly used, but we’d have to ensure the dependency is met on any machines we install on.
* is MDB2 the best library to use? I know there are others, but haven’t used them.
If this seems like a reasonable way to address this systemic problem, I would add some more functions to make our use of MDB2 easier, add some more documentation to the wiki on how to best use it, and start refactoring some of the code to make use of the new tools.
PEAR::MDB2 is probably the best choice for a DB abstraction layer. PEAR::DB and PEAR::MDB are deprecated in favor of it. PEAR::DBA is for BDB-style databases, which is not what we want. ADOdb is another contender, though the comparative review I read indicated that MDB2 was easier to use and performed just as well.
In fact, we are ALREADY USING an old version of ADOdb (yes, it has known, minor security issues) that was checked in to library/adodb. We don’t use it well - we create the connection with ADOdb, then ask it for the internal db-specific handle, then make mysql_ calls against that handle.
ADOdb also supports using placeholders, emulating it with per-type auto-escaping if the underlying driver doesn’t support it.
My impression is that whatever is used will have to “just work” on MS Windows by exploding the tarball somewhere in the web root and running the OpenEMR setup script. I don’t know if PEAR packages can be used like that - my intuition is that they can, but it is better to have PEAR running so you can be aware of available updates (possibly security-related).
hey,
It’s great that others are starting to fix this stuff. There’s been some upstream hackery to ADODB for the logging feature, so would be easier to use ADODB(if it can do this).
Visolve, was there any documentation placed on the wiki regarding the logging stuff.
Our existing ADOdb needs an upgrade, and the old version in combination with a new PHP5 version from unstable blew up one of my testing environments today.
I don’t see why we need to distribute adodb with OpenEMR. Would you like to look into the possibility of taking it out, and using whatever is appropriate as a dependency?
After doing this, I think I like this approach better than forcing MDB2 on the project, even though it seems more popular.
Before this is accepted, we would have to:
* upgrade ADODB, I think
* fix a bunch of SQL calls that were broken by my change, which I can do.
* gradually change old queries to use placeholders instead of directly using variables in SQL
* encourage people to start using SQL placeholders moving forward
Does anyone object to this direction, or see any unintended, unwanted consequences?
I don’t see why we need to distribute adodb with OpenEMR.
Well, right now our contains a few custom changes. The only one I can think of is the changes to the Execute member function, that causes it to use our logging function. Using the version downloaded from the ADOdb project or provided by the system (libphp-adodb package on Debian) would lose those changes.
I’m fine with that. I think the logging / auditing is probably better done at a level higher than ADOdb. The problem is, I’m not sure where that fits in best. Our stuff in library/sql.inc does it, but that doesn’t have any support for bind variables, which is really what kicked off this thread.
Would you like to look into the possibility of taking it out…
My main concern with dropping it entirely is breaking the CVS Demo and Developer Appliance. I suppose we could have the install/setup script download it, if needed, at install time. There’s also the conflicting goals of getting updates for security issues vs. installing a version of ADOdb we never tested a release against. Finally, some support for a system-provided ADOdb probably means fixing up our includes a bit.
Brady, are the script(s) that maintain the CVS demo and the developer’s appliance in the tree? I’d like to look at them and see if I can’t update them, both for Git and this possible change.
Well we don’t want to break logging. Certainly some sort of wrapper for adodb calls would be in order, for that and as insurance against other possible issues later. Don’t bite off too much at once, maybe do the wrapper first and then decide if you want to replace adodb, or take it out.
Agreed that a higher level of logging would be better, but let’s make that a different problem.
I’d been speaking (er, writing?) in the abstract. Andy’s patch still goes through the stuff in library/sql.inc so we still get logging, and we also get bind variables. Mostly, I think it is a good patch, and I already threw up a comment on the patch tracker about the biggest problem I see with it. It doesn’t support prepared statements, which I normally associate with bind variables, but I can’t think of any time they would be a performance gain in the codebase.
Dropping ADOdb from the CVS repository is something I’d like to do, but it is a medium-sized project and can be held off for later.
Thanks for looking it over. I’ll work on one that doesn’t use exceptions so we can continue to support PHP4. (yuk!)
I’d love to support prepared statements in some way eventually. I’m hoping to make small steps here for now.
I’m glad we seem to be on the same page. I’m still waiting for someone to point out something that I’m missing that makes
this impossible. If that doesn’t happen soon, I’ll put together a more comprehensive patch that we can actually apply.
acmoore,
Haven’t been able to test it, but like the concept of this. What happens if a variable has already been escaped, and then placed
into a binded sql statement? I’m guessing wouldn’t work, so some variable processing may require refactoring. So, then th flow
of a variable is to 1) remove escapes (if magic quotes is set) and then 2) would put this right into the sql command with bind.
Out of curiosity, is there a way to globally deal with magic quotes of all variables in the POST, PRE, etc, for each script (ie. would need to deal with recursive array issues). That would be nice and clean: Then a variable could go directly into the sql bind statement, thus avoiding the need for the current rather confusing functions in formdata.inc.php .
Brady: (untested, but should do it. )
$gpc_superglobals = array($_GET, $_POST, $_COOKIE);
function demagic(&$v, $k) { $v = strip_escape_custom($v); }
array_walk_recursive($gpc_superglobals, demagic);
function demagic(&$v, $k) { $v = strip_escape_custom($v); }
array_walk_recursive($gpc_superglobals, demagic);
Actually, that’s really inefficient 'cause it make lots of calls to get_magic_quotes_gpc(). (Also it uses a bareword -> BAD!). Something like this is probably better:
if (get_magic_quotes_gpc()) {
$gpc_superglobals = array($_GET, $_POST, $_COOKIE);
function ssip(&$v, $k) { $v = stripslashes($v); }
array_walk_recursive($gpc_superglobals, ‘ssip’);
}
So if we do a complete code walk-through to institute acmoore’s stuff, then perhaps we should also incorporate stephen-smith’s above superfunction (assuming it tests well). That would be very cool.
I didn’t consider magic_quotes. This is the first time I’ve ever heard of them. What an odd feature!
Perhaps the thing to do is to stripslashes() on all elements of the $bind arrays inside the sql.inc functions if magic_quotes is on. I’m not sure I totally get it, though.
Regarding your above point, this won’t have any effect on the demos or appliances. The demos and appliances follow this flow:
1) Turn on
2) Download code from SF cvs (via a mandriva startup script)
3) Copy the downloaded code to web directory
4) Run following script: openemr/contrib/util/installScripts/cvsDemoInstall
When I have time, I’ll post the mandriva startup and shutdown script to cvs
(good idea to have them as transparent as possible).
Regarding git, one big issue I’ve encountered so far is that it takes way to long to download the repository (downloading the code over cvs from SF is actually very fast)