Version Upgrade - UUID population

This might belong in the Support category, but since it’s specifically related to details of how the application code works at a fairly low level, I thought I’d start here.
I have been prototyping version upgrade paths for some extremely large working OpenEMR V5 systems, specifically how to get from there to V7.0.1.
During the phases of the version upgrade routine between V6 and V7, when the UUID fields are added to the database schema and subsequently populated, is there a specific reason or logic behind where the new UUID column is placed in the existing tables?
For example, in the “procedure_report” table the UUID is inserted as the second column, but in “procedure_result” it is added as the second-to-last column. Likewise, in “prescriptions” the UUID is placed seven columns from the end of the table, while in “documents” it is again the second column.
The reason I'm wondering is that, for upgrade purposes in large environments where these tables have millions of rows, the UUID phase either takes excessive time to complete or, in some cases, is unable to complete at all. One of my current upgrade tests involves a database with 2.1 million documents, attempting the version upgrade from 5 → 7. Once the UUID phase hit the documents table, the process ran for 10 days and then eventually just died: no errors, no actual failure reports, it simply stopped running. The underlying hardware was a physical server with SAS disk storage, 64GB of memory, and dual 8-core CPUs, so I am not sure that throwing more horsepower at the problem will make any significant difference in elapsed processing time.

Given that the existing data tables include over 2 million documents, 1 million+ prescription entries, and hundreds of thousands of laboratory tests/results, even if the problem that causes those UUID population routines to run for 10 days and then exit without completing were corrected, the required time to execute the version upgrade would still run into weeks of continuous SQL processing, possibly well over a month.

One possible workaround would be to implement and populate those fields gradually in the database before the 5 → 7 upgrade is performed (a rough sketch of what I have in mind is at the end of this post). But before I start a modified round of testing based on that plan, I'd like to gain a better understanding of the seemingly random placement of the UUID fields in the affected tables. If anyone has any insights, it would be much appreciated!
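For discussion, the kind of thing I have in mind is roughly the following. This is an untested sketch only: the table/column names (documents.id as the primary key) and the UUID flavour are placeholders, since the pre-populated values would presumably need to match whatever format the stock upgrade expects.

```php
<?php
// Rough sketch only: pre-populate missing UUIDs in batches ahead of the
// upgrade. Assumes the uuid column already exists as BINARY(16) and is NULL
// for rows that still need a value, and that documents.id is the primary key.
require __DIR__ . '/vendor/autoload.php';

use Ramsey\Uuid\Uuid;

$pdo = new PDO('mysql:host=localhost;dbname=openemr;charset=utf8mb4', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$batchSize = 5000;
$update = $pdo->prepare('UPDATE documents SET uuid = ? WHERE id = ?');

do {
    // Next batch of rows that still lack a UUID.
    $ids = $pdo->query("SELECT id FROM documents WHERE uuid IS NULL LIMIT $batchSize")
               ->fetchAll(PDO::FETCH_COLUMN);

    $pdo->beginTransaction();
    foreach ($ids as $id) {
        // uuid4() is used here purely as a placeholder for "some 16-byte UUID".
        $update->execute([Uuid::uuid4()->getBytes(), $id]);
    }
    $pdo->commit();
} while (count($ids) === $batchSize);
```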

You can look at our approach here; it is valid for current MariaDB/MySQL databases. This is one area where the project has taken a very peculiar approach to handling UUIDs, in that PHP scripts are used to do work that most current database engines do natively. To give an idea: when you disable the strange logic of the UUID registry and delegate UUID generation to the database engine, all table updates take seconds instead of hours.
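To make that concrete, here is a minimal sketch of what handing the work to MariaDB looks like. It is not the actual script from that folder, and the table and column names are just examples.

```php
<?php
// Minimal illustration of delegating UUID generation to the engine: MariaDB
// re-evaluates UUID() for every row, so one statement fills the whole table
// without any per-row PHP work. Note that UUID() yields a version 1 value and
// is unsafe with statement-based replication (Galera uses row-based anyway).
$pdo = new PDO('mysql:host=localhost;dbname=openemr;charset=utf8mb4', 'user', 'pass');

$pdo->exec(
    "UPDATE documents
        SET uuid = UNHEX(REPLACE(UUID(), '-', ''))
      WHERE uuid IS NULL"
);
```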

Best.

Yes, I should have mentioned the back end was MariaDB 10.6. I am done trying to work with MySQL 8; just dealing with its strange default table collation and character set handling adds hours of wasted processing time to the overall version upgrade process. MariaDB/Galera is the far superior platform at this point. Browsing through that folder, it looks like you discovered the same bug with passwords, and the glitch in the UUID code where it fails on the postcalendar_events table. It all starts to get interesting once you're dealing with millions of database rows instead of a few hundred thousand. Thanks for pointing that out! We'll be digging deeper into this next week, and I'll update here with any new information or fixes that we find, in case it helps others.

See here for why the uuid stuff was started:
Create patient uuid mechanism · Issue #2277 · openemr/openemr · GitHub
(The medium-term goal is having the uuid identifiers for FHIR; the long-term goal is supporting offsite data synchronization at some point, i.e. replacing database id indexing with uuid indexing.)
Essentially we needed a balance between being time based and not being guessable. The version 1 uuid that mysql/mariadb produce is time based but sticks a server-specific code on the end, so we instead went with a custom uuid (Timestamp-first COMB codec) that is also time based but appends a random entry rather than a server-specific code, so it is not guessable.
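For reference, the general recipe for that with the ramsey/uuid library (which is roughly what the uuid registry builds on) looks like this:

```php
<?php
// Timestamp-first COMB setup with ramsey/uuid: time-based like a version 1
// uuid, but the trailing bytes are random instead of a server-specific code.
require __DIR__ . '/vendor/autoload.php';

use Ramsey\Uuid\Codec\TimestampFirstCombCodec;
use Ramsey\Uuid\Generator\CombGenerator;
use Ramsey\Uuid\UuidFactory;

$factory = new UuidFactory();

$factory->setRandomGenerator(new CombGenerator(
    $factory->getRandomGenerator(),
    $factory->getNumberConverter()
));

$factory->setCodec(new TimestampFirstCombCodec(
    $factory->getUuidBuilder()
));

$uuid = $factory->uuid4();
echo $uuid->toString() . "\n";   // string form
$bytes = $uuid->getBytes();      // 16 bytes, as stored in the binary(16) columns
```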

Are you stepping through each version on your upgrade? I'd recommend not doing this and would instead try to jump from 5 directly to 7, since the uuid mechanism in the more recent version upgrades is orders of magnitude faster than in the older version upgrades.


If there are still issues even with that, then you could consider blocking out the stuff here:
https://github.com/openemr/openemr/blob/master/src/Common/Uuid/UuidRegistry.php#L42-L65
(Maybe make it empty on the initial upgrade, then open things back up one at a time, running the 7.0.0 to 7.0.1 upgrade each time you open one up; something along the lines of the sketch below.)
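Illustrative only; the actual code at those lines may be shaped differently, and the variable name here is just a placeholder. The idea is simply to comment everything out first, then re-enable entries one at a time:

```php
<?php
// Not the real contents of UuidRegistry.php -- apply the same idea to
// whatever list/calls are actually at the linked lines. Comment out every
// table first, then uncomment one at a time and re-run the 7.0.0 -> 7.0.1
// upgrade step after each.
$tablesNeedingUuids = [
    // 'documents',
    // 'prescriptions',
    // 'procedure_report',
    // 'procedure_result',
];
```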

We had about 400K documents, but it was the approximately 1.5M rows of the proc* tables that made it impossible to justify using the standard code. Next was the bloat factor, which would have affected the timing and storage of backups without adding any benefit for the foreseeable future.

In addition, our upgrade consisted of several ETL steps that were refined iteratively. Since the test cycles were turning out to be so long, we ended up with a CLI upgrader that can be incorporated into one of the scripts. This script also lets you upgrade components in their own sequence. Hopefully this project will use that approach if the modules are going to be self contained.

Best.

Hi @brady.miller,
I am doing an upgrade from v7.0.0 (1) to v7.0.2.
The upgrade is stuck on the UUID creation.
Is there a way to do this later? It has been running for 12+ hrs.

UPDATE
I went back and started the upgrade from v6.0.0 and ran it. Either the UUIDs finished, or starting further back fixed the issue that was hanging up the UUID step. I don't know at this point.