I upgraded from OpenEMR 4.0 to 4.1 about two weeks ago. During the last couple of days I have encountered a new problem when I import tif or pdf files into the documents tree. When things worked right, I could import properly into any directory or subdirectory and the file would show up in the proper place. Now, when I do everything the same, a single file upload often shows up in two separate locations in the directory tree, even though it is only one file uploaded. If I delete either apparent instance, both apparent instances disappear and the file is removed from the documents directory.
I had a customized document tree before the conversion, and apparently some new branches were added with the data conversion. I may have even removed one or two seldom used directories. The utility for doing that appeared to move the underlying documents to the parent directory properly.
I don’t see anything in the documents table to link any particular document to a branch of the tree.
I do see some other tables that may be affecting the behavior, though I am basically clueless. Any ideas?
Here’s a clue to something that is apparently NOT the cause of this behavior: I found that in the categories table, entries looked correct and appropriate, but one of the id values was not in consecutive order. I ordered the entries in this table by id number. Behavior did not change.
One interesting finding is that document numbers assigned by the program for recent document entries are in the low hundreds (e.g. 350) while before the conversion we had document numbers more than 10,000. I wonder whether there is a table re-using document numbers and confusing matters.
I also looked at the table categories_to_documents and found that all the category_id entries were valid numbers that were included in the categories table.
Also, I confirmed that there were no duplicate document_id numbers in the categories_to_documents table.
I am puzzled by a table called documents_seq. It has nothing but three entries - 1, 26, 30. If these are actually document types, the 30 is not in my list of valid document types. I think I’m going to put in an empty table in place of this documents_seq table. (I’ll keep a backup.)
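For anyone who wants to repeat these two checks, here is a minimal sketch. It runs against an in-memory SQLite database with made-up rows, not a real OpenEMR MySQL instance. The table and column names (categories, categories_to_documents, category_id, document_id) come from this thread; the `name` column, the sample data and the function name `find_problems` are assumptions for illustration only.

```python
import sqlite3


def find_problems(conn):
    """Return (orphan_rows, duplicate_document_ids) for categories_to_documents."""
    # Rows whose category_id has no matching row in categories.
    orphans = conn.execute(
        "SELECT c2d.category_id, c2d.document_id "
        "FROM categories_to_documents AS c2d "
        "LEFT JOIN categories AS c ON c.id = c2d.category_id "
        "WHERE c.id IS NULL "
        "ORDER BY c2d.document_id"
    ).fetchall()
    # document_id values that are mapped to more than one category.
    duplicates = conn.execute(
        "SELECT document_id FROM categories_to_documents "
        "GROUP BY document_id HAVING COUNT(*) > 1 "
        "ORDER BY document_id"
    ).fetchall()
    return orphans, [row[0] for row in duplicates]


# Toy data (hypothetical): document 102 is mapped twice, and
# document 103 points at category 30, which does not exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories_to_documents (category_id INTEGER, document_id INTEGER);
INSERT INTO categories VALUES (1, 'Categories'), (2, 'Lab Report');
INSERT INTO categories_to_documents VALUES
    (1, 100), (2, 101), (2, 102), (1, 102), (30, 103);
""")
orphans, duplicates = find_problems(conn)
```

The two SELECT statements are plain SQL and should carry over to the mysql command line with little or no change, but that is untested here.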
Blanking out all entries in the categories_seq table made no difference in anything - a document uploaded still shows up in two places in the document tree. So I put it back the way it was.
Next I went to the admin tab and added some junk directories to use up id numbers 27 through 30. This made no difference.
Well, with documents and patients coming in all afternoon, I cried uncle and just started pruning the branches of the document tree from the periphery up. I started with the seldom or never used categories, then worked up the tree as far as necessary to restore proper behavior.
Nice thing about the setup is that if you prune branches, the content just moves into a less specific category.
I think everything is OK now with just two branches in the tree - one for labs and another for communications/letters.
Well, the weekend gave me a chance to diagnose the problem with my document directory tree. I really messed things up with the data conversion from 4.0 to 4.1. My documents had to go from the ./openemr/documents folder to the ./openemr/sites/default/documents folder. They did, but the URLs for the old documents still had the old path. I deleted the ./openemr/documents folder (or link?) and that must have been a no-no. Whatever happened, my categories_to_documents table was completely messed up. I had pruned the categories table down to practically nothing, and I was finding documents whose category values did not correspond to any remaining document category. I put an appropriate symbolic link in the place where the old documents folder was, so that regardless of which place was specified in the URL, it would work.
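The symbolic link step can be sketched as follows. This is only an illustration built in a temporary directory: the two relative paths mirror the ones mentioned above, while the file name scan.pdf and the helper `link_old_path` are hypothetical. It assumes a Unix-like system where creating symlinks is permitted.

```python
import os
import tempfile


def link_old_path(site_documents, old_documents):
    """Make old_documents a symlink to site_documents, refusing to overwrite."""
    if os.path.lexists(old_documents):
        # Something (a folder, file or link) is already there; do not touch it.
        raise FileExistsError(old_documents)
    os.symlink(site_documents, old_documents, target_is_directory=True)


# Build a throwaway layout that mimics the old and new locations.
root = tempfile.mkdtemp()
new_dir = os.path.join(root, "openemr", "sites", "default", "documents")
old_dir = os.path.join(root, "openemr", "documents")
os.makedirs(new_dir)
with open(os.path.join(new_dir, "scan.pdf"), "w") as fh:
    fh.write("placeholder")

link_old_path(new_dir, old_dir)

# The same file is now reachable through the old path as well.
reachable = os.path.exists(os.path.join(old_dir, "scan.pdf"))
```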
I then deleted all the values from the categories_to_documents table.
Next I deleted all the values in the categories_seq table - I’m not sure what that actually does but it seemed better without values.
I sourced a new categories file from an OpenEMR database dump file I made a while back, just after conversion to 4.1. That involved cutting out just that part of the openemr dump file pertaining to the documents table.
I ran a command line MySQL command to read all the ids in the documents table and to duplicate them into the document_id field of the categories_to_documents table. Then I assigned a value of 1 to every category_id field in the categories_to_documents table. This puts all documents at the trunk of the document tree when displayed. At least they’re visible and things work well. So far so good.
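A rough sketch of that rebuild, for anyone following along. It is not the exact command used here. It runs against an in-memory SQLite database with invented document ids; the table and column names are the ones named in this thread, and the function name `reset_to_root` plus the sample rows are assumptions.

```python
import sqlite3


def reset_to_root(conn, root_category_id=1):
    """Map every document to a single category, replacing all old mappings."""
    # "with conn" commits if both statements succeed and rolls back on error.
    with conn:
        conn.execute("DELETE FROM categories_to_documents")
        conn.execute(
            "INSERT INTO categories_to_documents (category_id, document_id) "
            "SELECT ?, id FROM documents",
            (root_category_id,),
        )


# Toy data (hypothetical): three documents and two stale mappings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (id INTEGER PRIMARY KEY, url TEXT);
CREATE TABLE categories_to_documents (category_id INTEGER, document_id INTEGER);
INSERT INTO documents VALUES (10, 'a.pdf'), (11, 'b.tif'), (350, 'c.pdf');
INSERT INTO categories_to_documents VALUES (30, 10), (7, 999);
""")
reset_to_root(conn)
rows = conn.execute(
    "SELECT category_id, document_id FROM categories_to_documents "
    "ORDER BY document_id"
).fetchall()
```

The INSERT ... SELECT form is standard SQL, so the same idea should work from the mysql prompt, though that is not verified here.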
Sometime down the line I think I can do command line SQL join commands to assign more specific and appropriate document types to most of the documents based on the naming conventions I have used over time.
Whew!
The openness of OpenEMR is awesome! It seems that no matter how messed up your tables get, you can fix them one way or another.
Regardless of whether you choose to switch from mysql on the command line or not, something you should consider when updating is to wrap your queries in a transaction: http://dev.mysql.com/doc/refman/5.1/en/commit.html
That way if you make a mistake when updating, you can easily rollback changes, or commit after verifying that they did what you expect.
The join you are going to want to do will probably look something like this:
start transaction;
update documents as d, categories_to_documents as c set c.category_id=2 where c.document_id=d.id and d.url like 'foo%';
At that point, you can select and verify it did what you want. If it worked, then the next statement you’d execute is commit; if not, rollback.
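The verify-then-commit pattern can be tried out safely on a scratch database first. The sketch below uses Python's built-in sqlite3 module instead of MySQL, so the UPDATE is written with a subquery rather than the multi-table form above. The row-count check is just one possible way to "verify"; the function name `recategorize`, the `expected_rows` parameter and the sample rows are all hypothetical.

```python
import sqlite3


def recategorize(conn, url_prefix, category_id, expected_rows):
    """Re-file documents whose url starts with url_prefix.

    The change is committed only if exactly expected_rows rows were updated;
    otherwise it is rolled back. Returns True when committed.
    """
    cur = conn.execute(
        "UPDATE categories_to_documents SET category_id = ? "
        "WHERE document_id IN (SELECT id FROM documents WHERE url LIKE ?)",
        (category_id, url_prefix + "%"),
    )
    if cur.rowcount == expected_rows:
        conn.commit()
        return True
    conn.rollback()
    return False


def current_mapping(conn):
    return conn.execute(
        "SELECT document_id, category_id FROM categories_to_documents "
        "ORDER BY document_id"
    ).fetchall()


# Toy data (hypothetical): everything starts in category 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (id INTEGER PRIMARY KEY, url TEXT);
CREATE TABLE categories_to_documents (category_id INTEGER, document_id INTEGER);
INSERT INTO documents VALUES (10, 'foo_1.pdf'), (11, 'foo_2.tif'), (12, 'bar_1.pdf');
INSERT INTO categories_to_documents VALUES (1, 10), (1, 11), (1, 12);
""")

# Wrong expectation: the update touches 2 rows, not 5, so it is rolled back.
first_try = recategorize(conn, "foo", 2, expected_rows=5)
after_first = current_mapping(conn)

# Correct expectation: the update is committed.
second_try = recategorize(conn, "foo", 2, expected_rows=2)
after_second = current_mapping(conn)
```

Note that LIKE matching details (case sensitivity in particular) can differ between SQLite and MySQL, so patterns should be checked with a plain SELECT on the real database before updating.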
-Kevin Yeh
Just to weigh in on the ‘url’ column issue in the documents table. Note this bug crept up back around version 3.0 or so when users were moving their OpenEMR instances around (i.e. to new servers or different paths). This has been dealt with by simply using only the filename and discarding the rest of the path whenever it is used in the documents module. So, this should not have caused you issues when moving the documents directory during upgrade to 4.0+, unless you have custom code (or, if it is still happening in some standard OpenEMR code, then this would be a bug that needs fixing). This has actually brought up an interesting issue, and has made it seemingly impossible to create any directory hierarchy in the documents directory, which has hindered code such as below from getting into the OpenEMR codebase (see my message at the end of the code): http://github.com/zhhealthcare/openemr/commit/ea27304ad50abd9cbcfd5087f31099801c7146aa
Although it just hit me that this could be done by simply adding a ‘directory’ column to the documents table that held any directory path, if applicable, within the documents directory.
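To make the ‘directory’ column idea concrete, here is a small sketch of how a file location could be resolved from the pieces. This is not OpenEMR code. The function `resolve_document_path`, its parameters and the example paths are hypothetical; it only illustrates keeping the filename from the stored url and combining it with an optional relative directory.

```python
import posixpath


def resolve_document_path(documents_root, stored_url, directory=""):
    """Build a file path from the current documents root, an optional
    relative directory, and only the filename part of the stored url."""
    # Discard whatever path was recorded when the file was first stored.
    filename = posixpath.basename(stored_url)
    # Strip slashes so the directory is always treated as relative to the root.
    relative_dir = directory.strip("/")
    return posixpath.join(documents_root, relative_dir, filename)


root = "/var/www/openemr/sites/default/documents"
old_url = "file:///var/www/openemr/documents/scan.pdf"

flat = resolve_document_path(root, old_url)
nested = resolve_document_path(root, old_url, directory="labs/2011")
```

With this approach the stale path inside the url no longer matters, and a hierarchy inside the documents directory becomes possible without changing how the filename is stored.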
Also, the categories_seq table contains only one row with a value that is equal to the highest number of ‘id’ that is in the categories table.
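A minimal sketch of putting categories_seq back into that state, shown on an in-memory SQLite database rather than MySQL. It assumes the single column in categories_seq is named `id`; that column name, the function name `reset_categories_seq` and the sample rows are assumptions, so check the real table definition before trying anything similar.

```python
import sqlite3


def reset_categories_seq(conn):
    """Leave categories_seq with one row holding the highest categories.id."""
    # "with conn" commits on success and rolls back if a statement fails.
    with conn:
        max_id = conn.execute(
            "SELECT COALESCE(MAX(id), 0) FROM categories"
        ).fetchone()[0]
        conn.execute("DELETE FROM categories_seq")
        conn.execute("INSERT INTO categories_seq (id) VALUES (?)", (max_id,))
    return max_id


# Toy data (hypothetical): three stray rows, like the 1, 26, 30 seen earlier.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories_seq (id INTEGER);
INSERT INTO categories VALUES (1, 'Categories'), (2, 'Lab Report'), (26, 'Letters');
INSERT INTO categories_seq VALUES (1), (26), (30);
""")
new_value = reset_categories_seq(conn)
seq_rows = conn.execute("SELECT id FROM categories_seq").fetchall()
```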
Thank you, Kevin and Brady. These hints will definitely help me and others.
For some reason my documents_seq table had three entries and that didn’t make sense to me. I fixed it using Brady’s suggestion.
The same thing happened, apparently, to the sequences table. It had two values - a large value and below it a smaller value. It turns out new documents were assigned small rather than large numbers. On a hunch, I took out all the values from the sequences table and put in a single value large enough so as not to overlap any existing document numbers.
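The hunch above can be sketched like this, again on a scratch SQLite database. It assumes the sequences table has a single `id` column and picks the larger of the highest existing sequence value and the highest document id as the new single value. The column name, the function name `collapse_sequences` and the numbers are assumptions. Other tables may also draw ids from sequences, which this sketch does not take into account.

```python
import sqlite3


def collapse_sequences(conn):
    """Replace all rows in sequences with one value that is at least as
    large as every existing sequence value and every documents.id."""
    with conn:
        highest_seq = conn.execute(
            "SELECT COALESCE(MAX(id), 0) FROM sequences"
        ).fetchone()[0]
        highest_doc = conn.execute(
            "SELECT COALESCE(MAX(id), 0) FROM documents"
        ).fetchone()[0]
        new_value = max(highest_seq, highest_doc)
        conn.execute("DELETE FROM sequences")
        conn.execute("INSERT INTO sequences (id) VALUES (?)", (new_value,))
    return new_value


# Toy data (hypothetical): a large and a small sequence value, plus
# documents numbered both above 10,000 and in the low hundreds.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (id INTEGER PRIMARY KEY, url TEXT);
CREATE TABLE sequences (id INTEGER);
INSERT INTO documents VALUES (349, 'a.pdf'), (350, 'b.pdf'), (10449, 'c.pdf'), (10450, 'd.pdf');
INSERT INTO sequences VALUES (10450), (350);
""")
new_value = collapse_sequences(conn)
seq_rows = conn.execute("SELECT id FROM sequences").fetchall()
```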
I played around with changing a few category_id values for certain documents, and the behavior seemed to follow the changes properly. For now, I’m going to keep all the category codes as 1 just to be sure it is truly stable before changing things permanently.
Kevin’s advice worked great for reassigning proper locations in the document tree for my documents based on file names according to naming conventions. All is well.
For anybody else who notices this problem - do not despair. For somebody who understands the database tables involved it can be fixed fairly easily.