So I am writing an external tool to allow claims data to be used to populate an OpenEMR database SQL query.
Currently, I am building the arrays of insurance companies and the associated addresses.
These are stored in three tables: insurance_companies
and addresses
and phone_numbers
For addresses and phone numbers (being that only insurance companies use these linked tables), the ID is always one more than the foreign ID linking it to the insurance company ID. Insurance ID’s are generated in some strange fashion. Address table ID’s are generated in a generate_id() function that basically says ID = insurance_companies.id [row] +1;.
This means that the tables have non-sequential ID’s.
This also means that nothing else can safely use the addresses table. Even if you pick a non-existing index, the generate_id() function which is the following:
function generate_id () {
$database = $GLOBALS['adodb']['db'];
return $database->GenID("sequences");
}
and has a comment “To Do, document this function”,
basically means that there are potential issues with generating, importing or any number of other things using insurance companies and associated tables from partial db dumps. I have no reference for how this ID is exactly generated, or how it can be replicated without using the database that I am creating in the first place. I am pretty sure that mysql GenID(“sequences”) is not a native function, and depends on autoincrement in some way, so I have no idea of what is really going on here. UUid’s are best generated (in my opinion) by a millisecond timing function or plain autoincrement, or in asyncronous situations perhaps reserving a MAX(id) and reserving a record.
I am currently stuck on the problem of how to safely proceed.
Since the tables are strictly 1-1-1 relationships, I intend to simply have all matching ID’s and foreign keys be the same in single digit increments.
Does anyone see an issue with doing this for a database that will subsequently get modified manually?
Unfortunately insurance companies are not the only entity that using the address and phone_numbers tables. Pharmacies are also using those tables. I found this as I was working implementing an ORM. And there is no field that notes which entity the record is tied to.
I my opinion, those tables need to be auto_increment and there needs to be a field denoting the relationship type/table. The second part of that is comlicated upgrade path for those with existing data.
Yeah, I belatedly remembered that pharmacies use that too.
Fortunately, a complicated upgrade really only requires a very slick SQL query to reset everything…and a very long long script execution allowance.
Luckily, my toy is for new databases or “add some more externally” databases…so I can ignore this for now…I think.
As I understand it, for this supposedly unrepeatable ID, you add up all the rows in a database and add one. That means if you delete rows and update the autoincrement settings of a table, or if you import a dump without that additional information, you could very well duplicate that number. The smaller a database is, the easier that could happen.
There is a type link in the addreses.
This is really an important piece if we are ever to have a “contacts” table with one-to-many (and vice versa) relationships to get past the use of the users table and add a lot more functionality to records and forms with functions such as are afforded in Knockout.js.
Sorry…that is just in phone_numbers, not addresses. I was looking at production, not vanilla. Insurance companies seem to be type 2 there, and pharmacies type 5, unless that has nothing to do with nothing.
My next issue is with the transaction ID’s.
To allow ERA’s to be uploaded after creating the DB from EDI, I would like to allow the user to configure what the transaction ID looks like.
I have the following format from openemr: CLM*87-134325*12.50***11:B:1*Y*A*Y*Y~
This is the following format: *PID|delimiter|CLAIM*
…with a hyphen as a delimiter.
I assume other possible formats are: *CLAIM* *CLAIM|delimiter|PID*
…if the other formats don’t include are repeatably identifiable medical record number as well as a claim (encounter) number, what might be the best to use?
I am assuming for now that if the openemr style is not used, then either a simple autoincrement index for each or a partial success with one or the other can be used…but that means ERA’s will require pre-processing to allow upload.
Anyone have a wide range of experience with these?
BTW Aaric, thanks for that input. I am proceeding on stupidly and ignoring the issue after you verified it is just another WTF that was probably done for a very good purpose…long long ago…but is a bit of a crimp right now.
I am just glad that Rod or whoever did the civilized thing and gave us a PID-Encounter format for our transaction ID’s. If I am dealing with claim number only stuff, I think I could just add some extra code to parse_era.inc (or whatever it is) with a global.goofy_eras==true. Not a code-base potential thing, but a good hack for getting someone up and running.
Progress on this tool has been awesome. I’m improving the Facilities construction a bit now after doing some standardized tests to check output quality, and then it is on to the last few tables, namely form_encounter, forms, and billing.
The ultimate test is to import a file, then generate the claims again and have the content acceptably exact.
Even without writing billing and encounters, all the duplicate checks, escape characters and all that wonderful stuff has slowed down this >mostly< single threaded version appreciably. It takes several seconds for it to process a megabyte of text files to recreate a measly few hundred patients. Better look into threading the processes to additional objects…but right now the code is written in a way that someone could easily convert it to a php scripts that chew their way through a single string of code.