OpenEMR Patient Identifier aka "The PID"

dixonwhitmire · April 23, 2020, 8:01pm

Hi Folks,

I’ve noticed that the patient identifier, or “pid”, is a numeric value which is auto-generated using an “optimistic” database query. Just meaning that there isn’t any locking which means that a race condition could occur based on how many users are in the system, along with what they’re doing.

I’m not suggesting we add share locks and such into the code, as that may lead you down another rabbit hole. Rather would it make sense to migrate the pid column to utilize an “auto-incrementing” database column? This would allow OpenEMR to have a patient id that is physically separate from the database’s primary key, while allowing the database to handle allocating the value and the concurrency aspects. To be fair, this change would require a more complex deployment (code updates + SQL), but it would remove the complexity in managing this identifier.

If there is interest in taking a look at this, I can add it to “the backlog”.

@brady.miller or @sjpadgett what do y’all think?

Thanks!

Dixon

mdsupport · April 23, 2020, 10:45pm

‘id’ column in all tables is auto-incremented. I think that precludes any other column to be auto per mysql constraint. Too bad that db views are shunned in this project otherwise patient-data will be very useful view with pid as a computed column as IFNULL(pubpid, id).

dixonwhitmire · April 24, 2020, 1:59am

That is a very good point! I should have taken a closer look at the MySQL Docs (assuming that Maria DB will work in similar fashion).

Thank you for the feedback!

brady.miller · April 24, 2020, 6:07am

This kind of discussion generally circles back to use of uuid

mdsupport · April 24, 2020, 6:10pm

I don’t think so. Auto incremented id fields as integers are perfectly adequate for a single instance of a database and would be every user’s nightmare if uuid replaced as pid / MRN in human interfaces. UU of uuid is a specific use case for communicating with external entities, e.g. send/receive uuid as pid in FHIR responses (proxy for fully qualified attributes - patient nnn of practice of Jon Doe, MD / NPI xxxxxx).

stephenwaite · April 24, 2020, 6:32pm

hi @mdsupport, wouldn’t we still have pubpid for us humans

mdsupport · April 24, 2020, 7:08pm

@dixonwhitmire didn’t refer to uuid so I better stop with following -

Only original code religiously uses pubpid for every displayed field and all joins use pid instead of patient_data.id. As such I fail to see value of changing several table keys to string and paying performance penalty for pid as uuid.

dixonwhitmire · April 24, 2020, 7:21pm

Yes originally I was just noting the possible concurrency issues which may occur due to the current method used to generate a “PID”

UUIDs are certainly useful if you need a unique identifier for a system. For example, we could associate a UUID with a patient and use it as an identifier with the OpenEMR application and API, external FHIR systems, etc.

You definitely would not want to use a UUID to support a field in the UI as it’s not user friendly. Although it would be funny to see it in a drop down list somewhere

In regards to the PID, it is a good idea to have an “id” that is decoupled from the database primary key. If the PID is an internal field that isn’t really exposed in the UI, a UUID would be useful as it the identifier could travel with the patient record if it is integrated with other systems, whether those are other OpenEMR installations, FHIR, etc.

If PID concurrency becomes an issue there are other ways the ID can be allocated which wouldn’t involve a UUID if that’s preferred.

mdsupport · April 24, 2020, 9:50pm

If you are concerned about pid, a decent way to address that will be use of sequences - Oracle’s nextval or MSSQL/DB2’s next value for constructs. Adodb abstraction is genId with following disclaimer :

Limitations

In databases that do not support sequences, they are emulated by insertion and updating table rows. Because the method provides no row locking, its use is not recommended in intense multi-user environments as duplicate sequence numbers may be returned.

Another caution :
Last time I looked, this project put wrapper on genId and single threaded all sequences into a single sequence called ‘sequences’. So now you have encounters, documents and few other entities sharing the same sequence!

stephenwaite · April 24, 2020, 10:14pm

hopefully we can keep building on some recent modernizing work

brady.miller · April 25, 2020, 4:19am

(brief glimpse of the future in 5 years or so)

hey 2758daa4-af06-4437-84cc-528dfbcebec5 (ie. Stephen) and c9dc038b-4345-4555-874a-ed0a578cad61 (ie. Dixon) and d7a65faa-6147-4106-a568-7de1141f8dc1 (ie. MD Support),

Agree with 2758daa4-af06-4437-84cc-528dfbcebec5 on working on modernizing the database. Agree with d7a65faa-6147-4106-a568-7de1141f8dc1 that the sequence is crazy; as I (otherwise known as eb0fe8aa-119e-418a-9462-c04c19cff9e1) recall, that is what blew up OpenEMR several times when the audit engine broke a couple times in past.

thanks,
eb0fe8aa-119e-418a-9462-c04c19cff9e1