Encounter history build time

tmccormi wrote on Sunday, November 07, 2010:

Thanks Brady, that’s a relief.  I, for one, am going to blast all my local repos and start them clean to make dang sure.
-Tony

tmccormi wrote on Sunday, November 07, 2010:

PS: this command is very useful if you don’t have access to a GUI (qgi, gitg or gitk)  (or just want it quick.  And this is what my local repo (newly recreated) shows from the master branch, which looks clean.

git log -graph -abbrev-commit -pretty=oneline -decorate
* 4999180 (HEAD, origin/master, origin/HEAD, master) Phyaura Calendar speed inhancement
*   352041b Merge branch ‘master’ of ssh://openemr.git.sourceforge.net/gitroot/openemr/openemr
| 
| *   198e01f Merge branch ‘work1’
| | 
| | * f2546cb Some cleanup suggested by Brady.
| | * 60a7087 Added administrative page for editing and uploading site-specific files.
* | |   d104c7d Merge branch ‘master’ of ssh://openemr.git.sourceforge.net/gitroot/openemr/openemr
|\ \  
| |/ / 
| * | 3ac5bc5 Finished security overhaul of the transactions report: 1) Removed unneeded formData function  (These funct
| |/ 
| * b2f2276 More security stuff.
| * 9debf42 Fixed to use more secure SQL methods.
| * 70d9e30 Added Inventory Transactions report.
| * 4a5d642 Added an address book type for Distributors and a new lot transaction type “Distributions” which are sales t
| * d46fd37 New Payment Screen.If it is Patient Payment automatically his/her distribution comes.No need to once again s
| * 3c4ae29 minor typo fixes
| * 5549fc5 Upgrade to phpMyAdmin 2.11.10
| * 0645669 Configure the cvs ubuntu package to collect code from the github repository
* |   d69d2c6 Merge branch ‘master’ of git://github.com/openemr/openemr
|\  

the rest snipped …

stephen-smith wrote on Sunday, November 07, 2010:

All,
Repository is okay.  We can move forward from here.  The “dirty” merge that Tony pulled in isn’t really a problem other than the commit not being quite as documented as it could be.  The commit that is a revert of a merge commit has the potential of causing confusion in the future, but even that is rather unlikely.  The HOWTO I posted earlier probably has the best information about using git revert on a merge commit.

I do still recommend against merging master into a feature branch with one potential exception.  If it is the last thing you do before doing a push to master (which is how Brady’s example workflow does it) AND if the push to master fails for some reason you undo the merge by using git reset (not git revert).

Brady,
A git merge can fail.  If it does so, you are asked to manually resolve the conflicts and commit the result.  This can result in a “dirty” merge.  That’s not a bad thing by itself, but it is indicative that the merge is not so “simple”{1}.  When a merge isn’t “simple”, the default merge message should really be replaced or supplemented with something similar to a normal commit message.  (Given our collective experience levels with git, it could be argued that NO merge commit is simple.  In that case, it might be worth stealing Gnome’s hook that disallows commits that use the default merge message.  That would force even “simple” merges to have a real, human-provided commit message written.)  (I’ve also been entertaining the idea of disallowing “dirty” commits, but I’ve not completely convinced myself it is a good thing  AND I’m not quite sure how to write it.)

All,
The git rev-list command can be useful to determine if your master contains any commits that are not in the SF.net master.  (If you intend your master to be a copy/mirror of the SF.net master, those commits should be dropped or moved to a branch.)  Using my local naming scheme the command would be:

git rev-list master --not openemr/master

No output is good.  Each commit in your master but not in the SF.net master would have it’s SHA printed on stdout, on a line by itself.

{div mode=“pedantic”}
git merge can use a variety of different strategies to generate the final, merged tree.  The command will fail if the strategy is unable to generate a tree without user input.  The default strategy is “recursive” which is a minor improvement on the standard 3-way merge algorithm that’s been in use for decades, but others are available. However, the strategy used is not recorded as part of the commit object.

Therefore, the strategy isn’t taken into account by tools that want to determine the “interesting” or “dirty” parts of a merge commit, like QGit.  This means that while manually resolving conflicts might generate a “dirty” commit, there isn’t a strict one-to-one correspondence.  It’s possible that an automatic resolution, particularly one generated by a custom strategy, could show up as “dirty”/“interesting”.  It’s also possible that a manual resolution might not show up as “dirty”/“interesting”, if it is sufficiently “simple”.
{/div}

{1} I am using “simple” instead of trivial, because the word “trivial” is already used by at least one message by git.  In that context it means a merge where each file in the tree is touched by at most one of the parents, so the merge can be done without even considering the contents of the files.  (I.e. trivial == in-tree.)  The default “recursive” strategy is also able to do 3-way merges AND use some amount of history information to resolve 3-way merge conflicts.

bradymiller wrote on Sunday, November 07, 2010:

Thanks for the explanations and the command Stephen.

All developers,
Updated the sourceforge commit workflow with Stephen’s command above (step 4) (also needed to add an additional fetch step for it to work, which is now step 1). Step 4 would of avoided Tony’s error, so should avoid other developers making this mistake in the future. Please look over and provide feedback as needed (especially interested in any ways to decrease the number of steps):
http://www.openmedsoftware.org/wiki/Git_for_dummies#Committing_to_the_official_git_repository_on_Sourceforge

-brady

bradymiller wrote on Sunday, November 07, 2010:

hey,

Clarified the SF commit work flow even more. Step 1 now contains the instructions (and warnings) to ensure the developers local repository is compatible with the official SF repository:
http://openmedsoftware.org/wiki/Git_for_dummies#Committing_to_the_official_git_repository_on_Sourceforge

-brady

bradymiller wrote on Sunday, November 07, 2010:

Just realized another workflow issues related to Tony’s error (and not sure how to avoid it in the future),

As most excited developers, Tony started a github fork and didn’t properly update the ‘master’ and did some custom work in the ‘master’ branch (note this is an extremely common occurrence that more than 50% of the active developers have done). As he became more proficient, he didn’t do this, and he appropriately started new branches for custom code etc. But his ‘master’ code still contained some custom stuff from previous work and some garbage merge commits. So, we agreed, that cleaning up his ‘master’ branch (simply doing a complete refresh from the official git repo to remove all his customizations and garbe merge commits which worked well) would fix it and make his repo compatible with the SF git repo. At this point he will then pass step 1 in the git for dummies script. HOWEVER, we haven’t account for his old branches. Thus, if he brings in an old branch (via rebase/master) then he’ll just bring back in all the “garbage” commits. For Tony not a big deal, since he only has several old branches (ie. can re-fork a new repo and then use cherry-pick to re-create nice new branches), although privately he may have many more branches based on the previous “bad” master branch. The question is how best to deal with this scenario? And a perfect practical example is zhhealthcare’s public github repo, which contains a bunch of branches (with potentially useful code in the future) based on a “bad” ‘master’ branch:
http://github.com/zhhealthcare/openemr

Considering the ramifications of a mistake in our SF git repo (with downstream mirrors and personal repos that can basically not be fixed), should we consider a different commit method.
For example:
Actual commit access - Having several git proficient developers responsible for all commits.
Pseudo commit access - Developer has demonstrated they deserve commit access, so they can commit what they want by simply asking (they are expected to provide a public rebased branch with a minimal number of commits so a developer with actual commit access will commit it without question using cherry-pick)
It’s actually very minimal work for the actual committer and only involves three commands (fetch the other developer repo, cherry-pick the commit(s), push it to sourceforge)

I’m only throwing the idea out there to see what others think.

-brady

sunsetsystems wrote on Sunday, November 07, 2010:

Brady I’m not sure I’m following all that without an example sequence of events.  But might it help to ask developers to push first to their personal (and updated) github master, as a way of testing what will happen when they push to SF?

Also I’d really like a documented plan in place for recovery in case something very bad happens.

Thanks for all of your help with this!

Rod
www.sunsetsystems.com

bradymiller wrote on Sunday, November 07, 2010:

hey,

Regarding backups, I have backup snapshops of repo daily for 7 days, then weekly for 4 weeks, then monthly for 6 months using the rsync command documented on at bottom of this sourceforge page:
https://sourceforge.net/apps/trac/sourceforge/wiki/Git
Since better to not rely on just one backup mechanism(ie. my server could always break etc.), I suggest you also incorporate some sort of backups mechanism on your site.

The real issue is the downstream effects of something “bad” happens, since these will be mirrored to three mirrors, which in turn has personal forks from these. If we had to restore our main SF git repo, then may lose compatibility with the downstream repos, which isn’t disastrous but could be a pain in the future as OpenEMR grows.

We could have two main workflows, which both require basic same set of instructions by the developer:
1) Review workflow (can be done by anybody)
-Request review in ‘Code Review’ tracker and provide a link to their public rebased (to sourceforge ‘master’) repo/branch with no more than 3 commits (developer may need to use interactive rebase to squash them to 3 if needed), and ask for a review.
2) Commit workflow (can only be done by developers with pseudo-commit privileges)
-Post a message in forum and place a rebased (to sourceforge ‘master’) branch with no more than 3 commits (developer may need to use interactive rebase to squash them to 3 if needed), and demand that it is committed.

So, workflow 1 is used for new developers without pseudo-commit privileges or any developer that thinks a review is needed. And workflow 2 is used by a pseudo-committer when their code is ready to be committed.

To review code that has been submitted from workflow 1 (by the code reviewer):
git remote add developerName publicGitRepoLink
git fetch developerName
git co master
git co -b testingCode
git cherry-pick sha# (do this for each commit) (can get these numbers from the log)

To commit code that has been submitted for commit from workflow 2 (by the code comitter):
git remote add developerName publicGitRepoLink
git fetch developerName
git co master
git pull sourceforge master
git cherry-pick sha# (do this for each commit) (can get these numbers from the log)
git push sourceforge master

hope this helps clarify the proposed workflows.
-brady

stephen-smith wrote on Sunday, November 07, 2010:

Brady,
Like Rod, I’m not sure I’m following you.  As I said before, the repo is fine.  Do you have a specific concern I haven’t addressed with the current state of the repo?

All,
It is probably beneficial for everyone to double-check themselves as far as what they commit to the SF.net repository until you are quite comfortable in Git.  In particular, remember that commit objects get pushed, not patches, so the history of your branch (and any branch merged into it) can be significant.  There are a number of git commands and tools based on git for examining the history of your branch.

I don’t want to be a bottleneck, but you can always message me to review a commit/branch/whatever before you push it to SF.net.

stephen-smith wrote on Monday, November 08, 2010:

Brady,
I don’t think either of those workflows since they both have seem to have an unnecessary rebase in them.  I also think it is unreasonable to set some arbitrary limit on the number of commits that gets pushed at a time.  Reviewing commits/branches (instead of patches) using our existing procedure doesn’t seem like a bad idea, though.  (We can also still accept/review patches, too.)

bradymiller wrote on Monday, November 08, 2010:

hey,

Sorry not clear. Several updates:

1) Yet again modified the SF git commit workflow on the wiki and got it to 7 steps and I think it’s complete. If a developer strictly adheres to this, then should avoid bringing in any additional unwanted commits:
http://openmedsoftware.org/wiki/Git_for_dummies#Committing_to_the_official_git_repository_on_Sourceforge

2) Next would like to attack the next issue of developers “fixing” their repo to avoid Tony’s Bug. Stephen, is there a way to fix repos such as zhhealthcare’s ( http://github.com/zhhealthcare/openemr ), so that their ‘master’ and numerous branches are safe for pushing to the SF git repo.

3) My previous above rambling was just meant to discuss the possibility of only having several developers with actual commit access in order to avoid “bad” commits to the SF git repo. Then, these few committers could grab code for commit from other developers repos via the “cherry-pick” command, which is another way to avoid Tony’s bug.

-brady

stephen-smith wrote on Monday, November 08, 2010:

If there are commits in the history that need to be removed, edited, combined, or otherwise altered, git rebase should be able to handle it, as long as you are fine with the flattening/linearizing history.  If you use -onto and are very specific with the end point, cut point, and attachment point, you can do a lot of surgery to a git history with rebase.

I haven’t looked at zhhealthcare’s tree specifically.

stephen-smith wrote on Monday, November 08, 2010:

Yeah, I don’t like zhhealthcare’s tree one bit.  First, each branch other than “master” would need to be rebased and most, if not all, of them are going to require a number of manual conflict resolutions followed by git rebase -continue.  You can probably avoid at least some of the conflicts by using (git rebase -onto openemr/master zhhealthcare/master $branch) instead of not specifying -onto.  (openemr/master is my name for the “master” branch on SF.net.)  Then, The “master” branch there should simply be thrown/blown away and the “master” branch at SF.net should be used instead.

Also, some (not all) branches look like they have commits that need to be squashed using (git rebase -i)-typos or very minor fixes.  That can be done either before, after, or even during the above.

In particular, we don’t really want anything that merges in (or in based upon) the “master” in zhhealthcare’s repository.  Mainly because it contains a number of unwanted commits: ec129f84f46f4, fddac9d7c5eb1, and ce9dd0ae6cabc6, at least.  Pulling those is would be less bad then some of the commits Tony pulled in, but they are still undesirable.  Also, if each real change includes 12-24 merge commits (even if they are all clean), I am likely to get quite confused looking at our commit history.

bradymiller wrote on Thursday, November 11, 2010:

hey,
Agreed we need to avoid bringing in trivial commits, but this risk will be difficult to completely avoid. Zhhealthcare is ready for commit access (from a coding standpoint), however we obviously can’t allow pushing from their repository in its current state. Despite the craziness of their repo, it is still very easy for myself to grab commits (via cherry-pick) from their repository, and then commit it to the SF git repo myself. This is what I was discussing above; if perhaps it makes more sense to only give a few developers actual commit privileges.
-brady

bradymiller wrote on Thursday, November 11, 2010:

Stephen, Rod, Tony (and all other developers),

To clarify what I was discussing above, put a quick proposal on the wiki:
http://openmedsoftware.org/wiki/Git_Migration#Proposed_Developer_Structure

With this structure in place, then don’t have to worry about everybody’s repositories; otherwise I’m predicting our main SF repository will become very chaotic over time.

-brady

sunsetsystems wrote on Thursday, November 11, 2010:

Brady this looks mostly OK but a couple of comments/suggestions:

1. If I’m asked to commit something for a Privileged developer I’d still like to be able to ask questions without anyone feeling violated.  So maybe word it as something like “further code review is not expected unless the committer initiates it”.

2. I don’t think we need “near-privileged”.  How about just calling everyone else a “contributor”.

Finally, if someone is going to contribute frequently then I want them to take some time to become proficient and trustworthy with git (i.e. become “privileged”).  While doing a commit for someone else does not take much time, it is a distraction and can have a significant impact on a developer’s productivity.

Thanks so much for helping to organize these procedures!

Rod
www.sunsetsystems.com

acmoore wrote on Thursday, November 11, 2010:

Here’s a late reply just to let you guys know that I think this is going in a good direction and to thank you for the work you all have done in getting here.

Rod, I agree with your #1. Although I see I’ve been put in the “privileged” group, I encourage reviews of my code without anyone feeling violated about it.

Also, for what it’s worth, I made my github repositories private since I rebase in them a lot and didn’t want anyone to think that they could clone them and not be disrupted.

Sorry to continue to contribute to the off-topic discussion, but it seems pretty important!

-Andy

sunsetsystems wrote on Thursday, November 11, 2010:

Oops, change “(i.e. become “privileged”)” above to “(i.e. become an “integrator”)”.

Rod
www.sunsetsystems.com

tmccormi wrote on Thursday, November 11, 2010:

I wish we could change the subject line on this thread …  but I think the proposal if fine.
-Tony

bradymiller wrote on Thursday, November 11, 2010:

Rod (and all other developers),

Put the updated flow/scheme on its own wiki page here:
http://openmedsoftware.org/wiki/Repository_work_flow_structure

The main goal is to keep our SF repository robust, while not wasting resources of the Integrators. In theory, we’re simply pretending the “Privileged” Developer has commit privileges. So, as is the case now, they can put their code up for review or they can commit it; this decision is up to them. The point is to keep this mechanism in place. So, the Privileged Developer has two options:
1. Put code up for review. When code passes review, then they can demand an Integrator to commit it.
2. Bypass review, and demand an Integrator to commit it (makes sense for bug fixes etc. and this decision is left up to the privileged developer).

The actual point of this is to reduce the burden on the Integrators. It’s very easy to grab a remote branch/commit and commit it to the SF repo, but if we are asked to do any review of it, then it becomes much more difficult.

The nice thing about this mechanism (we will require the “fast-track” commits by the privileged developers to be on a public rebased git branch) is that the natural progression will be that the developers become proficient in git, and in time they should progress to an integrator.

It’s also a nice way to keep track of developers (along with companies) (ie. give credit to all in new copyright page etc.). As the list grows, it will be important to have this information for Integrators; ie. whose Privileged. Near-privileges is just a nice way so developers don’t fall in the cracks whom are contributing good code in case the admins (myself or you) drop off the internet for a while. And the new Standard Developer category is nice to keep track of all contributions for the copyright credit page.

Feel free to express thoughts or further clarify text on the wiki page.

-brady