Realistic Test Data Generation

(Matthew Vita) #1


Using openemr/demo-data-generator (GitHub: "Generate fictional demo data for testing of OpenEMR"), supply realistic test data into OpenEMR with 100% record/encounter coverage.



How to fetch NAMCS data

   SELECT *                 -- column list not given in the original post; * is a placeholder
     FROM patient pat
LEFT JOIN diagnosis dia     ON pat.patientid   = dia.patientid
LEFT JOIN encounter enc     ON pat.patientid   = enc.patientid
LEFT JOIN prescription pre  ON pat.patientid   = pre.patientid
LEFT JOIN labresult lab     ON pat.patientid   = lab.patientid
LEFT JOIN measurement mea   ON enc.encounterid = mea.encounterid
    WHERE pat.patientid = ?;
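A minimal sketch of running the query above with a bound parameter, using Python's built-in sqlite3 module. Only the table names and key columns come from the post; the other columns (name, code, drug, test, kind), the sample rows, and the selected column list are hypothetical and exist only to make the example runnable.

```python
import sqlite3

# Hypothetical in-memory schema. Table names and key columns follow the query
# in the post; every other column and all sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient      (patientid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE encounter    (encounterid INTEGER PRIMARY KEY, patientid INTEGER, reason TEXT);
CREATE TABLE diagnosis    (patientid INTEGER, code TEXT);
CREATE TABLE prescription (patientid INTEGER, drug TEXT);
CREATE TABLE labresult    (patientid INTEGER, test TEXT);
CREATE TABLE measurement  (encounterid INTEGER, kind TEXT, value REAL);

INSERT INTO patient VALUES (1, 'Test Patient');
INSERT INTO encounter VALUES (10, 1, 'checkup'), (11, 1, 'follow-up');
INSERT INTO diagnosis VALUES (1, 'D1'), (1, 'D2');
INSERT INTO prescription VALUES (1, 'drug-a');
INSERT INTO measurement VALUES (10, 'weight', 70.0);
""")

# Same joins as in the post, with an assumed column list in place of the
# missing SELECT clause.
QUERY = """
   SELECT pat.patientid, dia.code, enc.encounterid, pre.drug, lab.test, mea.kind
     FROM patient pat
LEFT JOIN diagnosis dia     ON pat.patientid   = dia.patientid
LEFT JOIN encounter enc     ON pat.patientid   = enc.patientid
LEFT JOIN prescription pre  ON pat.patientid   = pre.patientid
LEFT JOIN labresult lab     ON pat.patientid   = lab.patientid
LEFT JOIN measurement mea   ON enc.encounterid = mea.encounterid
    WHERE pat.patientid = ?;
"""

# The "?" placeholder is bound to the patient id at execution time.
rows = conn.execute(QUERY, (1,)).fetchall()
for row in rows:
    print(row)
```

With this sample data the query returns four rows, not one: two diagnoses times two encounters. Joining several one-to-many tables on patientid multiplies rows, so for patients with long histories it may be easier to query each table separately and assemble the record in code.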


(André Millet) #2

The n=100 will depend on what kind of information we need for the analysis.

(Matthew Vita) #3

Hi @andremillet. I just picked the number 100 “out of thin air”. It could be 80 or 50, for instance.

The point is we need n really good sample records for realistic patients. This will be mostly useful in the classroom (OpenEMR as a teaching tool in med school, for instance). However, John and Jason can use it in the Analysis project.

(Matthew Vita) #4

So you know, I’m not asking you to come up with all of these records with notes, encounters, documents, etc. We first need to see if NAMCS will meet our needs (and I’m on the fence because some very important data points are missing, such as clinical notes). If not, it would be helpful to have you generate whatever number of unique patients and records you are comfortable with. (Think about your career in medicine so far; I’m sure there are patients with similar histories and patterns that you could model.)

(André Millet) #5

That’s the point. We need a significant sample. We already have functioning OpenEMR instances, so why not gather the numbers from there? REAL numbers?

How do I summon John to this discussion?

(Stephen Nielson) #6

@MatthewVita @sjpadgett

I know this is an old post, but has anyone looked at this work by Crucible to load test data using FHIR?

If there’s a FHIR server, they can load synthetic patient data into it. It looks like they piggyback on top of this project: GitHub - synthetichealth/synthea: Synthetic Patient Population Simulator

I’m resurrecting the thread as I’ve been looking at how to create test data for OpenEMR as part of the testing framework.
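To make the FHIR loading idea concrete, here is a rough sketch of wrapping synthetic Patient resources in a FHIR transaction Bundle, which is the standard FHIR way to submit several resources in one request. The two hand-written patients and the function name make_transaction_bundle are hypothetical and stand in for generator output; whether a particular server accepts transaction bundles is an assumption to check against that server.

```python
import json
import uuid


def make_transaction_bundle(patients):
    """Wrap Patient resources in a FHIR transaction Bundle (illustrative helper)."""
    entries = []
    for patient in patients:
        entries.append({
            # Temporary identifier so entries can reference each other in the bundle.
            "fullUrl": f"urn:uuid:{uuid.uuid4()}",
            "resource": patient,
            # Ask the server to create each resource under the Patient endpoint.
            "request": {"method": "POST", "url": "Patient"},
        })
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}


# Invented sample patients; a generator's output would replace this list.
sample_patients = [
    {
        "resourceType": "Patient",
        "name": [{"family": "Example", "given": ["Ana"]}],
        "gender": "female",
        "birthDate": "1980-01-01",
    },
    {
        "resourceType": "Patient",
        "name": [{"family": "Sample", "given": ["Ben"]}],
        "gender": "male",
        "birthDate": "1975-06-15",
    },
]

bundle = make_transaction_bundle(sample_patients)
payload = json.dumps(bundle, indent=2)
print(payload)
```

The resulting payload would be sent with an HTTP POST to the FHIR server's base URL using the content type application/fhir+json. No network call is made here, so the sketch can be run without a server.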

(Stephen Nielson) #7

Apparently I missed a post in the forum where this was already discussed: OpenEMR and FHIR. My apologies.