Recently had to do a backup and restore of our system and was curious how much data the typical OpenEMR user is dealing with. After two and a half years of use, our backup file is around 4.5 GB and increasing by the day (lots of scanned documents). Will there be a point when the traditional backup method won’t be able to handle that volume of data? Should I look into archiving records outside of the directory tree after a patient is discharged? Or am I worried about a non-issue?
I think you’ll be fine. Suggest you have a technician set up an offsite backup server that does differential backups, i.e. fetching just what has been added or changed. With proper setup the backup server could also provide temporary hosting if the main server is not available.
Appreciate the feedback, Rod. Been looking into DRBD lately. Is something like that overkill for a small webserver running only a couple services (OpenEMR/Timetrex)? Or should I be looking to replicate at the application level?
Have not used DRBD but I understand it’s a real-time redundancy solution, sort of like RAID (but a very different implementation). This is not a substitute for offsite backups, which cover a wider range of possible disaster scenarios such as human error and mischief.
Real-time redundancy is for “high availability” applications, to give you fast/automated recovery from some types of hardware failure.
Just make sure you have a workable and tested recovery plan also. One reason I like an offsite backup server that’s also ready to run OpenEMR is that if the main server becomes unavailable, the backup server can be ready to take its place with minimum fuss.
It would be pretty awkward to arrive at your office one morning with no server and 100 gigs of backup data that you can only retrieve over the Internet.
What software would you recommend for that type of solution? I’m in the process of configuring identical servers in both of our offices for a high-availability approach (being in the lightning capital of the world warrants it). That’s why I mentioned DRBD before. I figure since we’ve been operating on an old desktop tower for the last few years, it will be a nice upgrade and will let me sleep easier at night.
Until I can sort out a more efficient method of backup, I’ll continue to use the manual backup method and carry it with me on a USB drive. I’m curious how a tertiary backup-only server may fit into the mix though…
Hi Frankie, “rsync” is the magic tool for differential backups over a network connection. It will take a bit of study and/or assistance from someone who has used it before. Also you will probably want to be using encrypted filesystems for the sake of HIPAA and common sense.
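For the encrypted part, LUKS on the backup drive is one common approach. A rough sketch, assuming the USB partition shows up as /dev/sdb1 (double-check with lsblk before formatting, since luksFormat destroys existing data):
sudo cryptsetup luksFormat /dev/sdb1            # one-time: encrypt the partition (wipes it)
sudo cryptsetup luksOpen /dev/sdb1 backupusb    # unlock it; prompts for your passphrase
sudo mkfs.ext4 /dev/mapper/backupusb            # one-time: make a filesystem inside the container
sudo mount /dev/mapper/backupusb /mnt/backup    # mount; point rsync at /mnt/backup
From then on you only need the luksOpen and mount steps before each backup.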
So my backup has now exceeded 5 GB and the native script in Admin > Backup is failing with “The connection has timed out.” So far I’ve tried increasing max_execution_time in php.ini (apache and cli) as well as added a TimeOut directive in httpd.conf. No luck.
Which log file should I be looking in for timeout errors? Any other connection setting I should increase?
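For reference, here’s roughly what I changed (file locations vary by distro, and the values are just what I tried, not recommendations):
; php.ini (both the apache and cli copies)
max_execution_time = 600
memory_limit = 512M

# httpd.conf
TimeOut 600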
I’m guessing you’re running a Linux server; do you have a Linux laptop or desktop handy?
Just install rsync on both devices and then run this to an encrypted USB directory (where x.x.x.x is the IP of your server):
sudo rsync -rvz --progress x.x.x.x:/var/www/openemr directory_of_your_mounted_usb
(assuming you have OpenEMR in the default location)
You can then copy the contents of this USB to a test server to verify you’ve got a valid backup, and to put in place the emergency server suggested above.
Wouldn’t you want to dump the database with mysqldump prior to backing up? It’s my understanding that copying and restoring the database directly like that could result in data corruption.
I’ve resorted to doing a manual backup as per this article. Testing the restore on a virtual machine now. Is there any benefit to using rsync in the way you described? Seems fairly similar to the manual backup.
I do hope to set up a more automated approach in the near future; I just can’t seem to find the time.
Edit: Restore tested well. I’ll stick with the manual backup for now. Still curious what setting may need to be increased to allow the backup to complete through Apache. I found that archiving the directory tree was taking an exceptionally long time due to the volume of scanned documents…
I do remember reading that copying the database while Apache/MySQL are running is risky, so you could stop them first; of course, at that point you’ll want to make sure users are either off the system or aware of the downtime.
rsync works by doing a differential backup, grabbing only the files that have changed instead of downloading every document in openemr/sites/default/documents, for instance, over and over again.
For the database, either 1) rsync /var/lib/mysql to the drive noted above (risky while MySQL is running, as noted earlier, so stop it first), or 2) mysqldump -u openemr -ppsswd openemr > /var/www/openemr/dump/openemr.sql to have the dump included in the rsync of the webroot.
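If you want to automate option 2, a rough sketch of a nightly script combining the dump with the rsync above (paths, the “psswd” placeholder, and the /mnt/backup mount point are assumptions from this thread; run it on the server with the encrypted drive mounted, and adjust to your setup):
#!/bin/bash
# sketch: dump the db into the webroot, then differential-copy the whole webroot
set -e
mkdir -p /var/www/openemr/dump
# --single-transaction gives a consistent dump of InnoDB tables without stopping mysql
mysqldump -u openemr -ppsswd --single-transaction openemr > /var/www/openemr/dump/openemr.sql
rsync -avz /var/www/openemr/ /mnt/backup/openemr/
You could drop something like that in /etc/cron.daily; just make sure Apache can’t serve the dump directory to the world.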