comm-central future as a branch

Gregory Szorc gps at mozilla.com
Thu Oct 11 10:32:52 UTC 2018


On Tue, Oct 9, 2018 at 5:49 PM Magnus Melin <mkmelin+mozilla at iki.fi> wrote:

> I'd like feedback on the below plan for having Thunderbird in a single
> repository.
> Let's have follow-ups go to dev-planning
>
> TL;DR version: planning to have current comm-central live as a branch
> instead of a separate repository.
>
> ---
>
> Background
> ----------------
>
> When the Mozilla move from CVS to hg took place in 2008, the code
> related to Thunderbird was not included in mozilla-central and instead
> placed in the comm-central repository. Under the hood Thunderbird uses
> the mozilla-central code, so to build it one currently has to check out
> both repositories. This is not ideal, for many, many reasons.
>
> I wrote [an email to maildev](http://lists.thunderbird.net/pipermail/maildev_lists.thunderbird.net/2017-November/000875.html)
> last year listing some of the drawbacks of this setup - but essentially
> what it comes down to is that version control was not meant to be used
> like this and it makes workflows error prone, in addition to having to
> do various hacks in the build automation. Enabling autoland, phabricator
> and other tools used by Mozilla core could also be done much more easily
> for comm-central with a single repository.
>
> The mail spurred some related discussion on git vs hg and importing of
> history, but basically there was agreement among the comm-central
> developers to go ahead with this change.
>
> Thunderbird has now migrated its build automation to Taskcluster, and
> also moved to building with mozilla-central as topdir, with a comm/
> sub-folder containing comm-central code. (We used to instead have a
> mozilla subdir checkout of mozilla-central as a child of the comm-central
> checkout.)
>
> In 2014 there was [a proposal to merge comm-central code into mozilla-central](https://groups.google.com/forum/#!msg/mozilla.dev.planning/P8oHKTCWSiQ/76ECituhK-IJ)
> but this was rejected since the powers that be were concerned about 1)
> confusion about what code belongs to what project and 2) pull/tree size,
> and 3) saw it as irrelevant to Firefox. The proposal below has none
> of these obstacles, and depending on the physical repository location,
> may not change anything at all for a Firefox developer.
>
> I think it's now time to revisit the single repository idea and pull
> that through in one way or the other. So, on to the proposal. For
> reference, on disk comm/ is currently 122 MB (+ comm/.hg 192 MB), so
> size-wise it is comparatively small, with a mozilla-central checkout
> weighing in at 2.3 GB. Including the relevant CVS history, the comm/.hg
> would be 317 MB in total.
>
> How
> ------
>
> To capture the essence of what Thunderbird is, we would make
> comm-central a named branch of mozilla-central. Let's call this
> branch "comm/default".
>
> Initially, we'd pull in a hg converted version of the current
> comm-central default branch, putting all the files into a comm
> sub-folder in the base directory of the comm/default branch. This is the
> way that the Thunderbird build system already wants the code laid out.
> With the comm sub-folder it will be clear to everyone involved which
> code is mozilla-central code and which is not. It is possible to
> import the CVS history for the related directories too while we're at
> it, but it adds some size. Importing related CVS history only adds
> around 125 MB extra so I'm inclined to include it. With the new
> comm/default branch pulled in, we merge the current mozilla-central
> default branch to comm/default. The resulting code is now the same as
> the status quo (with both repos pulled in), just that it's all version
> controlled in the same repository.
>
> Once this is all set up, Thunderbird development work would continue on
> the comm/default branch, and there would be periodic merges with the
> (mozilla-central) default branch. These merges could be automated if
> wanted.
>
> Quite often it's the case that Thunderbird code needs to be adjusted to
> build, due to changes in mozilla-central code. For cases of known
> incoming bustage Thunderbird now would have the possibility of waiting
> to do the merge until a fix is available. I would suggest never backing
> out bustage-causing mozilla-central changesets from the comm/default
> branch, but detecting other incoming bustage by doing builds from a
> comm/band-aid branch (branched from comm/default). The details on this
> can be discussed later though, not to derail discussion about the main
> issue here. There are many alternatives, and it's also completely
> possible to do what we currently do.
>
> Where
> ---------
>
> In the earlier discussions it's become clear there is a bunch of
> confusion around what is a branch and what is a repository, so please
> note the difference. A repository is the physical location where the
> history of a project is stored. Every repository does not have to carry
> every branch. I.e., the mozilla-central repository could carry the
> comm/default branch, or the comm/default branch could exist only in a
> comm-central2 repository somewhere else. In all cases it's trivial for
> developers who have mozilla-central checked out, to add the comm/default
> branch to their local checkout.
>
> So for the location there are at least three options:
>
> A: branch in the mozilla-central repository
> B: branch in the mozilla-unified repository
> C: branch in a repository elsewhere
>
> Not to disrupt operations too much I think it would be preferable to use
> option B - create the branch in the mozilla-unified repository. It does
> depend a bit on what plans Mozilla has for these repositories.
>
> Creating the branch in mozilla-unified of course needs buy-in from the
> Mozilla hg people, so please let me hear your opinions.
>

So we're all on the same page with regards to terminology and how things
work...

"branch" is overloaded to mean different things.

From a Mercurial VCS perspective, a "branch" is a name attached to
changesets within a repository. Mercurial branches are permanent and pretty
heavyweight. They are typically useful for tracking long-lived lines of
work, such as releases.

A Mercurial "branch" is different from a Git "branch." (We won't talk about
Git branches here because I don't think it is relevant.)

From a Firefox/Thunderbird Mercurial repository perspective, "branch" also
means "a specific Firefox repository." e.g. we refer to
hg.mozilla.org/{mozilla-central, integration/mozilla-inbound,
projects/larch, releases/mozilla-release} as "branches." I find this
terminology confusing because "branch" is ambiguous. But it's what a lot of
us use. (I would prefer the name "repository" or even "fork" instead.)

The Firefox Mercurial repositories typically have a single Mercurial
"branch:" "default." Exceptions are the release repositories, which have
various branches named *_RELBRANCH.

For historical reasons, the different Firefox Mercurial repositories mostly
all have a single DAG head. And that head is on the "default" Mercurial
branch.

The mozilla-unified repository is a read-only aggregation of various
Firefox repositories. Because the source repositories all use the "default"
branch, putting all these changesets in one repository means it isn't clear
which changesets "belong" to which repositories because all changesets
exist on the "default" branch. The aggregation process invents "bookmarks"
so names are associated with each DAG head and you can easily get a pointer
to the latest changeset from each source repository.

There are various things we don't like about the current Firefox repository
management.
https://docs.google.com/document/d/1mddoxa2i9ZgPLL_lGOZD0yNyS_ck3v1mzvp4GTEhC-0/edit?usp=sharing
tracks changes we'd like to make. There is general buy-in to make those
changes. But when we last talked about it seriously (at one of the prior two
all-hands), we agreed it wasn't high priority. So the work is on hold.

Regarding this proposal, I'm going to rule out adding Thunderbird code to
mozilla-central (A) at this point in time. People (and automated tools)
expect mozilla-central to have a single DAG head / branch and I think
adding a 2nd head/branch will cause problems. Plus there's a practical
concern of not wanting Thunderbird's code bloating the size of the
repository.

Because Thunderbird has separate project governance and support
requirements from Firefox (AFAIK), I think we should also rule out adding
Thunderbird code to mozilla-unified (B). I'm not super opposed to adding
Thunderbird code into mozilla-unified eventually if it makes sense. But I
don't think we should do it initially.

I think this new Thunderbird repository should live on its own, independent
from the Firefox repositories. Whether changesets should be in the
"default" branch or in some other Mercurial branch (so when changesets are
pulled into a Firefox repository you know they belong to Thunderbird), I
don't have a strong opinion. Having them named differently is more
compatible with a world where you are pulling Firefox and Thunderbird
repositories into the same repo. But having different branches/names also
introduces additional workflow requirements (people need to know to switch
between branches - and since we don't use Mercurial branches heavily in
Firefox land, people may not know how to do that).


>
> What about history?
> --------------------------
>
> For the proposed approach (going through `hg convert`) commits are
> preserved, but AFAIK it's not possible to preserve actual commit hashes.
> `hg convert` will graft the original commits, and this adds the original
> hash as an extra field to the new commit (use `hg log --debug` to see
> it). I don't think this is such a big problem. The original commits
> would be linked in the pushlog, the same way there is "converted from"
> for mozilla-central-cvs, like
> https://hg.mozilla.org/experimental/mozilla-central-cvs/rev/ab117d946959.
> I don't see them e.g. in
> https://hg.mozilla.org/users/mkmelin_iki.fi/thunderbird-central/rev/8f8b0933b611
> so perhaps there is some server side feature that needs turning on?
>

Correct. We have a very minimal server extension to render the
"convert_revision" changeset metadata (run `hg log --debug` to see it) as a
link to another repository. Since the conversion only stores the hash of
the source revision, we need to add a config option to the .hg/hgrc of the
repository to define which source repository to point to.

It appears modern versions of Mercurial support recording "convert_source"
changeset metadata (https://www.mercurial-scm.org/repo/hg/rev/c9093d4d1ff6).
We should change our server extension to always render that, if available.
Please file a Developer Services :: hg.mozilla.org bug to request that
feature if you will use it.

Anyway, I added the server config setting so the "converted from" links now
render on your user repository.


>
> Can I try it?
> ---------------
>
> You can check
> https://hg.mozilla.org/users/mkmelin_iki.fi/thunderbird-central/ which
> carries the comm/default branch.
>
> In your mozilla tree, pull in the branch like this:
>
>       hg pull -u -b comm/default https://hg.mozilla.org/users/mkmelin_iki.fi/thunderbird-central
>
> To completely remove it again, use `hg strip "branch(comm/default)"`
>
> How did you do it?
> ------------------------
>
>      CVS_GIT_REPO=~/tmp/mozilla-cvs-history/
>      CC_HG_REPO=~/Code/tb/mozilla/comm/
>
>      git clone https://github.com/ehsan/mozilla-cvs-history.git $CVS_GIT_REPO
>
>      echo default comm/default | tee branchmap.txt
>      echo rename . comm | tee filemap.txt cc0.txt
>
>      hg --cwd=$CC_HG_REPO up 0
>      (cd $CC_HG_REPO && ls -d */ | cut -f1 -d'/' | sed 's/^/include /') >> cc0.txt
>
>      hg convert --filemap=cc0.txt --branchmap=branchmap.txt $CVS_GIT_REPO single-repo/
>
>      CVS_TIP=`hg id --cwd=single-repo -i -r tip --debug`
>      CC_0=`hg id --cwd=$CC_HG_REPO -i -r 0 --debug`
>      echo $CC_0 $CVS_TIP | tee splicemap.txt
>      # e4f4569d451a5e0d12a6aa33ebd916f979dd8faa 93dd94bad983ce2eff5f228719c349b78a8a7f7d
>
>      hg convert --config convert.hg.saverev=True \
>          --splicemap=splicemap.txt --filemap=filemap.txt \
>          --branchmap=branchmap.txt $CC_HG_REPO single-repo/
>
>

You may also want to add the config options from
https://hg.mozilla.org/hgcustom/version-control-tools/file/d54b06ec9d16/vcssync/mozvcssync/git2hg.py#l71
to ensure file copies and renames are more robustly detected. The
conversion will take longer. But it's a one-time thing and accuracy is
probably worth burning a few trillion CPU cycles on :)


>
> For fun, see all the history down to 1999 is there, like
>
>      hg log comm/mailnews/mime/src/mimei.cpp
>
> Then go on and add the default branch from mozilla-central and do the merge
>
>      cd single-repo
>      hg pull -f -b default $CC_HG_REPO/..
>
>      hg up comm/default
>      hg merge default && hg commit -m "Merge default to comm/default branch"
>
>      hg push -f -b comm/default --new-branch thunderbird-central-push
>
> ---
>

As for converting CVS history, I'm +1 on that.

I wouldn't plan on the hashes aligning between any future "official"
conversion of the Firefox CVS history to Mercurial and what you did or what
any other conversion did. *If* we produce an official Mercurial repository
with CVS history, we will likely do so using a different technique than
what existing conversions have done with conventional tools. (If we do an
official conversion or any other kind of flag day conversion, we'll likely
switch to Mercurial's "tree manifests" feature and that will rewrite hashes
by definition. And we will also likely wait for a non-SHA-1 hash to be
supported by Mercurial.)


>
> Thoughts, comments, feedback appreciated!
>

Overall this all seems pretty reasonable! Speaking as an hg.mo service
operator and someone with domain expertise about converting repositories, I
would like to perform a final review on everything before this goes into
production though. Please flag me for needinfo on a bug or reach out
privately. I'm on PTO a bit in the next ~2 weeks though. Please reach out
to my manager - Kim Moir - if I'm not responsive. And feel free to drop
into #vcs to ask questions.

Thank you for taking the time to write all this up and for reaching out for
feedback.


>
>   -Magnus
>
> _______________________________________________
> dev-planning mailing list
> dev-planning at lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-planning
>


More information about the tb-planning mailing list