Next steps on developer tooling

Hi all,

I wanted to follow up on some of the developer tooling and processes we’ve been discussing on the admin calls recently. There are a lot of high-level things either actively in-flight or being sketched out; I’m hoping to capture most of them here so people can share thoughts or concerns.

First and foremost, this is a long post, so thank you for your attention!

Second, and more importantly, the guiding principles here are all around improving OpenEMR for patients, providers, administrators, and developers alike. Some proposed changes may be disruptive, but the aim is to keep that to the bare minimum and never just for the sake of it. Everyone’s interests are aligned here - these proposals aim to make OpenEMR easier to maintain, which will lead to better security and performance, less wasted effort, and more reliable upgrades.

Database Connection Improvements

There are no fewer than three different systems that connect to and interact with the database in OpenEMR, and all have quite a few rough edges. We’re making progress on adding support for doctrine/dbal and updating existing code to opaquely use its connection (PDO-based, for those who care). Existing code shouldn’t notice a difference, by design.

I want to re-open the discussion around doctrine/orm as a successor to that project. There was apparently support for it in the past, which was later removed. The motivation for removing it, valid at the time, is now outdated. It’s one of the best-supported libraries in the entire PHP ecosystem these days, so I am not super concerned about it being an issue again. I strongly believe it will do a lot for code readability, security, and being able to create robust, reusable, testable services that will serve the future well.

Packaging

OpenEMR currently supports a lot of different runtime environments, and this leads to very slow CI (purely a developer problem) and makes upgrading somewhat scary (a wider issue). Since some code makes incorrect assumptions about the runtime environment, there could even be security issues (several modules rely on their own .htaccess files, but not all deployments even use Apache!)

I think we can give users (well, sysadmins and value-added resellers I guess) as well as developers a better experience by being a bit more opinionated on packaging and distribution. This thought is still in its infancy so nothing much to share yet, but if people have background or ideas I’m all ears.

Front Controller

This has been a frequent refrain, as I understand it. There’s a working-ish prototype here, though it’ll need some tuning to wrap it up. tl;dr: existing code will work as-is, new code will have a better foundation. This does tie into modules and API definitions (see Modules, below), but more as a later step.

This is one of the top things in mind with the packaging concerns above - we really want to have an official webserver config (or set of configs) that we can rely on to ensure requests will run through this path, otherwise our efforts will be for nought.

Modules

This is one of the bigger items here. tl;dr: we want it to be easier to develop, test, and distribute modules, to safely make them a part of an installation of OpenEMR, and to be able to reliably make changes to core without risking breaking modules. To do this, we’ve discussed well-defined interfaces and a certification process.

I wrote a long post on GitHub (#9981) so I won’t rehash everything here. There’s a prototype for CLI-providing modules as a starting point in #10114. There are blockers we need to address for it (few) and for other module types (a lot more). In short:

  • We need a good story for dependency injection, application config, and service auto-wiring
    • There are many options for this, each with their own trade-offs. But we’ll want to either settle on one or build a light abstraction API that modules can rely on for vending their own services.
  • We need a well-defined set (this can and should start small, and can increase over time) of interfaces that define the services that are guaranteed available to modules. This includes things like
    • DB reads and writes (see doctrine/orm as one approach, above)
    • Logging (think PSR-3)
    • Interaction with core services
    • Interaction with external services (think PSR-18)
    • A whole lot more
  • For modules that modify the DB, we need to select migration tooling - preferably not raw SQL files (in an amazing ideal world, OpenEMR could run on non-MySQL RDBMSs). I’ll proactively suggest doctrine/migrations here but there are others to consider.
  • For modules exposing new API endpoints, we need to decide:
    • How to map routes to endpoints
    • What data formats are supported (pure raw PSR? Symfony? some internal abstraction? multiple?)
  • For modules that connect core to an external service (e.g. a payment gateway), there’s a whole config story to solve beyond installation
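
To make the "light abstraction API" idea from the first bullet a bit more concrete, here is a rough, language-neutral sketch (Python purely for brevity; the real thing would be PHP). Every name in it (`ServiceContainer`, `Logger`, `ListLogger`, `ExampleModule`) is hypothetical and not part of OpenEMR or any proposal linked here. It only shows the shape: core registers implementations against an interface, and a module asks for the interface rather than a concrete class.

```python
from typing import Any, Callable, Dict, List, Protocol


class Logger(Protocol):
    """Hypothetical interface a module may rely on (think PSR-3, heavily reduced)."""

    def info(self, message: str) -> None: ...


class ListLogger:
    """Stand-in implementation that core might register; it just collects messages."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def info(self, message: str) -> None:
        self.messages.append(message)


class ServiceContainer:
    """Maps an interface to a factory and lazily builds one shared instance."""

    def __init__(self) -> None:
        self._factories: Dict[type, Callable[[], Any]] = {}
        self._instances: Dict[type, Any] = {}

    def register(self, interface: type, factory: Callable[[], Any]) -> None:
        self._factories[interface] = factory

    def get(self, interface: type) -> Any:
        if interface not in self._instances:
            if interface not in self._factories:
                raise LookupError(f"No service registered for {interface.__name__}")
            self._instances[interface] = self._factories[interface]()
        return self._instances[interface]


class ExampleModule:
    """A module only knows about the Logger interface, not ListLogger."""

    def __init__(self, container: ServiceContainer) -> None:
        self._logger = container.get(Logger)

    def run(self) -> None:
        self._logger.info("example module ran")


# Core wires up the guaranteed services, then hands the container to modules.
container = ServiceContainer()
container.register(Logger, ListLogger)
ExampleModule(container).run()
```

The point of the indirection is that core can swap the implementation behind `Logger` without modules noticing, which is the same property the interface list above is after.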

Plus a whole bunch of UI-related stuff which I don’t even want to think too much about yet.

There’s also the million-dollar question: what should be in core, and what should be in a module?

I don’t think we need a concrete answer to that one up front, but people seem generally on the same page about moving a lot of stuff (especially that which connects to third party services) into modules.

Even if OpenEMR itself doesn’t follow strict semantic versioning, we’ll want to do so within the module system. This will be easier for everyone in the long run, and I don’t expect many or frequent major versions (read: backwards compatibility breaks) to occur as long as we’re thoughtful about what we add to the interfaces.
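
As an illustration of what strict semantic versioning buys the module system, here is a minimal sketch of a compatibility check (Python for brevity; the function names and the idea of a module declaring a "required interface version" are assumptions for the example, not an agreed design). A module built against interface version 1.2.0 keeps working on any later 1.x release, and only a major bump counts as a break.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a plain MAJOR.MINOR.PATCH string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(provided: str, required: str) -> bool:
    """True if the interface version core provides satisfies a module's requirement.

    Same major version is mandatory (a major bump is a compatibility break);
    within a major version, the provided minor/patch must be at least the required one.
    """
    provided_parts = parse_version(provided)
    required_parts = parse_version(required)
    if provided_parts[0] != required_parts[0]:
        return False
    return provided_parts[1:] >= required_parts[1:]
```

This ignores pre-release tags and the special rules for 0.x versions, which a real implementation would need to decide on.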

There’s also a lot of secondary stuff to consider, like error handling, packaging/distribution, telemetry, and more.

Static Analysis (PHPStan)

We’re trying to raise the quality level of the code (including improving security and reducing bugs) through static analysis tools, primarily PHPStan. There’s an in-flight PR to raise the level much higher. All existing errors will go into the “baseline” so, for the most part, only new additions will be held to the higher standard.

By and large, this requires adding type information to code, so it can more accurately detect logical errors. There’s very little about style in there (that’s more PHPCS’s domain) outside of some advanced rulesets that we’re not including.

For context, there are about 20k errors in the current baseline, and the PR raising the level increases that tenfold to about 200k. But it’s not as bad as it sounds - often, adding type info to a single widely-called function or method can resolve (or reveal!) hundreds or even thousands of errors.

For existing developers, this is likely to be one of the more disruptive things to existing workflows. However, it should be a shallow learning curve with a lot of community support behind it, and I think the higher quality bar will make it more appealing for people to develop for OpenEMR, not less.

Other Rules

A few custom rules have been added recently to discourage use of older, deprecated functions. There are a few more either in development or review that are more explicitly security-focused (e.g. #10249), and others aiming to address “footguns” (e.g. code that works today but is likely to lead to a future crash or vulnerability).

~fin~

That’s all I’ve got for now. If you made it this far, thanks for staying with me. There are certainly plenty of other things worth discussing even strictly within developer stuff - I omitted a few to keep this down to a small novel, but please add anything that you feel should be a priority!

I have a dumb question.

How does this play with the Apache or nginx webserver configurations when having to set the document root?

What is the projected implementation of this front controller release?

Great question, @juggernautsei.

The simple version is that the RewriteRule that’s present in several .htaccess files would move up to the application root (probably with some minor adjustments), and we’d have an equivalent recommendation for nginx.

On some timeframe (possibly as part of this) I’d like to include official server config files with OpenEMR that are at least a starting point covering operational requirements, though tuning them for your specific deployment would be fine.

What is the projected implementation of this front controller release?

The linked PR (#9943) shows the general direction of what it’ll look like to start. In the front controller file, above where the current routing goes (“FallbackRouter”), there would be a primary router for new APIs once that new tooling lands. So basically the logic would look like:

Request arrives
 |
Check if route+verb match any module-provided API routes
 |
 |- Yes: dispatch to that handler
 |
 |- No: fall back to existing procedural-style approach
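
The same flow as a small runnable sketch, in Python purely for brevity (the real front controller is PHP). `PrimaryRouter`, `fallback_router`, `dispatch`, and both example paths are made up for illustration and are not taken from the PR; only the "try module routes first, otherwise fall back" ordering comes from the diagram above.

```python
from typing import Callable, Dict, Optional, Tuple

Handler = Callable[[str], str]


class PrimaryRouter:
    """Holds module-provided API routes, keyed by (verb, path)."""

    def __init__(self) -> None:
        self._routes: Dict[Tuple[str, str], Handler] = {}

    def add(self, verb: str, path: str, handler: Handler) -> None:
        self._routes[(verb.upper(), path)] = handler

    def match(self, verb: str, path: str) -> Optional[Handler]:
        return self._routes.get((verb.upper(), path))


def fallback_router(path: str) -> str:
    # Stand-in for the existing procedural-style handling.
    return f"legacy:{path}"


def dispatch(router: PrimaryRouter, verb: str, path: str) -> str:
    handler = router.match(verb, path)
    if handler is not None:
        # Yes: a module registered this route+verb, so dispatch to its handler.
        return handler(path)
    # No: fall back to the existing approach, unchanged.
    return fallback_router(path)


# Hypothetical module route and a hypothetical legacy page.
router = PrimaryRouter()
router.add("GET", "/api/example-module/ping", lambda path: "module:pong")

module_result = dispatch(router, "GET", "/api/example-module/ping")
legacy_result = dispatch(router, "GET", "/some/legacy/page.php")
```

Because unmatched requests go straight to the fallback, existing pages behave exactly as before until a module claims their route.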

Hope that answers your question, happy to discuss further!

Before I wrote my question, I had one of the AIs explain the code to me in detail. I understand what it is supposed to do. My concern is how gradual the transition will be. Is this going to be a hard left turn, or will it co-exist with the current architecture until it slowly envelops the current flow and consolidates it into a single front door?

In simple terms, how is this going to play out in the real world, release over release?

The plan is that things will coexist for the foreseeable future, but to push new contributions towards encapsulated modules.

As the new module infrastructure gets built up enough, we may start requiring new features go through that system rather than being added directly to core. There’s a minimum featureset that we’ll need to build up before this becomes possible, which I think we’ll reach iteratively.

At some point, presumably after a lot of headway in converting current systems into new modules, we’ll want to discuss a cut-over process. This would be quite a bit further out. I suspect we’ll want to have a relatively automated (and probably AI-powered) way to convert existing modules to the new infra before that happens.