2017/10/28

Fixing CVS -> CVSps -> Git cvsimport

TOC

  1. Background
  2. CVS Log Messages
  3. Proposed Fix
  4. Patches
  5. Example Run Through

BACKGROUND

While importing some CVS repositories into Git I noticed some strange issues. My investigation shows that the heart of the problem is how CVS log messages are parsed. I worked through a couple of different solutions and finally settled on the one I feel is the best, the least disruptive, and hopefully the most acceptable. Since the solution involves changes to three different pieces of software, requiring communication with each of their respective development teams, putting all the information in one place seemed logical.

One thing to keep in mind as you read this document: as you can see, it is fairly lengthy. It took me a couple of days to put it together and trim it down to be consumable, so unintentional errors may have slipped in. For this I apologize.

Here is a quick walkthrough of how importing a CVS repository into Git works. As I outline the details, those familiar with how each of these components works will hopefully be able to follow my train of thought.

Git has a cvsimport Perl script which relies on CVSps (version 2.1) to get relevant information from the CVS server/repository to import source history and revisions.

Dashes

CVS rlog separates each revision in a file's history with a line of 28 dashes, and each file with a line of 77 equal signs.

Revision Meta-Data

As part of the revision history, CVS also includes some "meta-data", which is of the form FIELD :   META-DATA STRING ;

e.g.,

date: 2017/10/31 05:56:16; author: catbert; state: Exp; lines: +2 -0; commitid: T3v4RzWwNvgZCKe9;
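To make the FIELD: STRING; shape concrete, here is a minimal Python sketch that splits such a line into its fields. It is only an illustration of the format described above; the function name parse_revision_metadata is mine and does not come from CVS or CVSps.

```python
def parse_revision_metadata(line):
    """Split 'field: value; field: value;' into a dict, keeping field order."""
    fields = {}
    for chunk in line.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Split at the first colon only, so the time "05:56:16" stays intact.
        name, sep, value = chunk.partition(":")
        if not sep:
            continue  # not a FIELD: VALUE pair; ignored in this sketch
        fields[name.strip()] = value.strip()
    return fields


line = ("date: 2017/10/31 05:56:16; author: catbert; state: Exp; "
        "lines: +2 -0; commitid: T3v4RzWwNvgZCKe9;")
meta = parse_revision_metadata(line)
print(meta["date"], meta["author"], meta["lines"])
```

Note that nothing in this shape is unique to CVS-generated lines, which is exactly why a commit message line of the same shape can be mistaken for meta-data.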

PatchSet for Git's cvsimport

CVSps produces a "patch set" information file (for lack of a better description), which Git's cvsimport parses. This file follows a certain format, labeling relevant information for cvsimport's consumption.

Here is an example of data produced by CVSps:

---------------------
PatchSet 3
Date: 2017/11/01 19:32:51
Author: catbert
Branch: HEAD
Tag: (none)
Log:
Update information about usage change.
Add new -shiny option for glitzy new feature!

Members:
        README:1.1->1.2
        main.c:1.12->1.13

CVS LOG MESSAGES

CVS, like most revision control systems, allows the developer to enter a free-form message (the commit message) with each revision update.

CVSps and Git cvsimport make certain assumptions while parsing their respective inputs.

CVSps Assumptions

CVSps makes at least two assumptions, which cause it to potentially parse CVS rlog output incorrectly.

First, CVSps assumes the log messages will contain neither the revision separator (a line of 28 dashes) nor the file separator (a line of 77 equal signs).

Second, CVSps assumes log messages will not have initial lines that conform to revision meta-data format (described above). All initial lines of a commit message which do fit that format ( FIELD :   META-DATA STRING ; ) get ignored completely.
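Both assumptions can be illustrated with a small Python sketch. This is not CVSps code (CVSps is written in C); the META_LIKE pattern is my own approximation of "conforms to the meta-data format", and naive_log_message is a hypothetical name.

```python
import re

REVISION_SEP = "-" * 28   # separates revisions of one file
FILE_SEP = "=" * 77       # separates files

# Assumed pattern for a line shaped like "FIELD: STRING;".
META_LIKE = re.compile(r"^\w+:\s.*;\s*$")


def naive_log_message(lines):
    """Collect a log message the fragile way: skip leading meta-like
    lines, then stop at the first line that looks like a separator."""
    message = []
    in_header = True
    for line in lines:
        if in_header and META_LIKE.match(line):
            continue          # swallowed, even if the developer typed it
        in_header = False
        if line in (REVISION_SEP, FILE_SEP):
            break             # ends the message, even if the developer typed it
        message.append(line)
    return message


after_revision_line = [
    "date: 2017/10/31 05:56:16;  author: catbert;  state: Exp;",
    "NOTMETA: This is not a revision meta-data;",
    "cvsps will treat it as such.",
    REVISION_SEP,
]
print(naive_log_message(after_revision_line))
```

In this sketch the NOTMETA line is dropped because it has the same shape as the date line before it, and a message that contains 28 dashes would be cut short at that point.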

Git cvsimport Assumptions

Git cvsimport makes the assumption that log messages will not have lines which start with any of the "indicators" (or "TAGS") it associates special meaning with in the CVSps output file (see above).

For example, if a developer puts in his commit message a line which starts with Members:, this will throw off Git cvsimport's parsing.

This can easily happen -- and I have run into a couple of such examples -- when a developer copy-and-pastes a log history from Git into his commit message in CVS, in an attempt to document a relationship between his work and a Git repository.
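A deliberately naive Python sketch shows how a tag-driven parser gets confused. Git's cvsimport is a Perl script and its real logic differs; naive_members is a hypothetical stand-in that only demonstrates the class of problem.

```python
def naive_members(patchset_lines):
    """Treat every non-blank line after the first 'Members:' tag as a member."""
    members = []
    in_members = False
    for line in patchset_lines:
        if line.startswith("Members:"):
            in_members = True
            continue
        if in_members and line.strip():
            members.append(line.strip())
    return members


patchset = [
    "PatchSet 3",
    "Log:",
    "This will confuse git-cvsimport's parser",
    "",
    "Members:",                      # typed by the developer
    "\tsomefile.c:1.1->1.2",
    "\tanother.h:1.7->1.8",
    "\tfoo.mk:1.22->1.23",
    "",
    "Imagine these were lines pasted to note something",
    "",
    "Members: ",                     # the real tag written by CVSps
    "\tABC:1.1->1.2 ",
]
members = naive_members(patchset)
print(members[0])
```

The first "member" this sketch reports is somefile.c, a file that was only mentioned in the commit message and may not exist in the repository at all.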


PROPOSED FIX

The fix touches all three pieces of software: CVS, CVSps and Git's cvsimport. This may at first feel a bit drastic, but it is the most robust approach. In addition, it is fully backward compatible: if any of the three components does not have the proposed patch applied, the resulting behavior is the same as if none of them had been patched. Therefore, this approach should be fairly safe even if patched software ends up "talking" with a component outside the patched ecosystem!

CVS Change

Introduce a new response specifically for commit log messages: LOGM. Clients include LOGM in the Valid-responses string during client/server communication. Because it is an optional response, if either the client or the server does not support it, it does not come into play (Backward Compatible). However, if both client and server support LOGM, then during an rlog request the server outputs each line of a commit message prefixed with LOGM rather than M (the current/default behavior).
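The negotiation can be sketched in a few lines of Python. This models only the decision described above, not the actual C patch; log_prefix and emit_log_message are hypothetical names.

```python
def log_prefix(valid_responses):
    """Pick the response tag for commit-message lines."""
    # Compare whole tokens so that e.g. "M" or "Mbinary" never match "LOGM".
    return "LOGM" if "LOGM" in valid_responses.split() else "M"


def emit_log_message(message, valid_responses):
    """Return the protocol lines a server would send for one commit message."""
    tag = log_prefix(valid_responses)
    return [f"{tag} {line}" for line in message.splitlines()]


old_client = "ok error Valid-requests Mode M Mbinary E Checked-in Created Updated Merged Removed"
new_client = "ok error Valid-requests Mode M Mbinary E LOGM Checked-in Created Updated Merged Removed"
message = "NOTMETA: This is not a revision meta-data;\ncvsps will treat it as such."

for out in emit_log_message(message, old_client) + emit_log_message(message, new_client):
    print(out)
```

A client that never advertises LOGM keeps receiving plain M lines, which is what makes the change safe for unpatched clients.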

There is no specific handling added for the CVS client. As noted in the proposed patch, handle_m is specified as the filtering/processing function in the struct response responses[] = {...} table.

CVSps Change

CVSps is changed to include LOGM in its Valid-responses string. While processing the rlog response in load_from_cvs(), it switches to LOGM processing on the first LOGM line received from the CVS server. If LOGM support is not detected, processing continues as it does today (Backward Compatible).
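On the receiving side, the idea is that the response tag, rather than the content of the line, says whether a line belongs to a commit message. A rough Python sketch, assuming the raw server lines are available as strings (split_rlog_responses is a hypothetical helper, not a CVSps function):

```python
def split_rlog_responses(lines):
    """Separate LOGM payloads (message text) from M payloads (structure)."""
    logm_seen = False
    structure, log_lines = [], []
    for line in lines:
        tag, _, payload = line.partition(" ")
        if tag == "LOGM":
            logm_seen = True          # server supports LOGM; trust the tags
            log_lines.append(payload)
        elif tag == "M":
            structure.append(payload)
        # other responses ("ok", "E ...") are ignored in this sketch
    return logm_seen, structure, log_lines


server_out = [
    "M " + "-" * 28,
    "M revision 1.1",
    "LOGM New file ABC.",
    "LOGM ",
    "LOGM " + "-" * 28,              # dashes typed by the developer
    "LOGM BAD THINGS ARE BOUT TO HAPPEN",
    "M " + "=" * 77,
    "ok",
]
logm_seen, structure, log_lines = split_rlog_responses(server_out)
print(logm_seen, len(structure), len(log_lines))
```

With the tags in place, a line of 28 dashes inside a message is no longer ambiguous: it arrives as LOGM, while a real revision boundary arrives as M.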

The second change to CVSps is in how it outputs Log: lines in the PatchSet data: it appends the line count of the commit message which follows. This information is then extracted by the patched Git cvsimport, and helps it correctly parse the commit message lines out of each PatchSet section. Fortunately, since the current Git cvsimport isn't strict about Log: lines, this change is Backward Compatible with older Git cvsimport distributions.

One thing to note about the CVSps patch is that some abstraction was introduced to simplify the code dealing with the entire rlog processing: cvs_rlog_open(), cvs_rlog_close() and cvs_rlog_fgets(), as well as a new method for appending to the logbuff, in order to avoid duplicating code between the traditional/default message processing and the newly introduced LOGM method.
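As a small illustration of the Log: line with a count, here is a Python sketch. It assumes the count is simply the number of lines in the commit message, which matches the sample output later in this document; format_log_lines is a hypothetical name and the real change lives in the CVSps C sources.

```python
def format_log_lines(message_lines):
    """Return the 'Log: N' tag followed by the N message lines."""
    return [f"Log: {len(message_lines)}", *message_lines]


message_lines = [
    "New file ABC.",
    "",
    'Here we add the "Log Boundary Dashes" (28 dashes)',
    "-" * 28,
    "BAD THINGS ARE BOUT TO HAPPEN",
    "AND IT WILL BE SUBTLE",
    "SO PAY CLOSE ATTENTION",
    "-" * 28,
    "",
    "ABC comes before README. This is important!",
]
section = format_log_lines(message_lines)
print(section[0])
```

Blank lines inside the message count too, so a reader can consume exactly that many lines without inspecting their content.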

Git cvsimport Change

Possibly the simplest of all the patches is the one to Git's cvsimport script. The change is to the regular expression which searches for Log: in the PatchSet file. The regular expression is amended to optionally accept Log: tags with a commit message line count appended, as the modified CVSps produces. Because the line count is optional, Backward Compatibility with older CVSps PatchSets is preserved.
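The optional count can be sketched in Python as follows. The actual patch is to the Perl script and its regular expression may well differ; LOG_TAG and read_log are my own illustrative names, and the fallback branch only approximates how an unpatched reader behaves.

```python
import re

# "Log:" alone (older CVSps) or "Log: N" (patched CVSps).
LOG_TAG = re.compile(r"^Log:(?:\s+(\d+))?\s*$")


def read_log(lines, start):
    """Return (message_lines, index_of_next_line) for the Log: tag at `start`."""
    match = LOG_TAG.match(lines[start])
    if match is None:
        raise ValueError("not a Log: line")
    body = start + 1
    if match.group(1) is not None:
        end = body + int(match.group(1))   # take exactly N lines, unseen
        return lines[body:end], end
    # No count: fall back to scanning for the next "Members:" tag.
    end = body
    while end < len(lines) and not lines[end].startswith("Members:"):
        end += 1
    return lines[body:end], end


patchset = [
    "Log: 8",
    "This will confuse git-cvsimport's parser",
    "",
    "Members:",
    "       somefile.c:1.1->1.2",
    "        another.h:1.7->1.8",
    "        foo.mk:1.22->1.23",
    "",
    "Imagine these were lines pasted to note something",
    "",
    "Members: ",
    "\tABC:1.1->1.2 ",
]
counted, next_index = read_log(patchset, 0)
legacy, _ = read_log(["Log:"] + patchset[1:], 0)
print(len(counted), len(legacy))
```

With the count present, the Members: line typed by the developer stays inside the message; without it, the fallback stops at that line and the rest of the message is misread.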


PATCHES

LICENSE

I don't feel the changes I have made are significant enough to warrant releasing them with an explicit license file. However, based on experience, especially with the OpenBSD team, copyright, licenses and code release should be taken seriously. The idea is that, under copyright law, the right or permission to use a work must be given explicitly.

In order to ensure my contributions can be included in the respective source projects, as well as any offshoot projects, I offer my changes under both an ISC-like license and the license under which the respective source project is distributed.

/*
 * Copyright (c) 2017 Patrick Keshishian
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */

CVS Patch

Since most of my development is on OpenBSD, my patch is against their CVS sources.
PATCH: OBSD based CVS LOGM

This patch is against the latest (?) CVS sources (1.11.23) from savannah.nongnu.org. However, please note that most of my tests have been against the OpenBSD version of CVS. That said, I have run manual rlog queries with the patched version of CVS (1.11.23), with and without LOGM in the Valid-responses string, and against both patched and unpatched CVSps.
PATCH: CVS 1.11.23 LOGM

CVS (1.11.23) indicates in the COPYING file in its distribution that it is under GNU GENERAL PUBLIC LICENSE Version 1, February 1989. Therefore, consider my changes to it to be released under both GNU GENERAL PUBLIC LICENSE Version 1 and the above-noted ISC-like license.

CVSps Patch

PATCH: CVSps LOGM

CVSps (2.1) indicates in the COPYING file in its distribution that it is under GNU GENERAL PUBLIC LICENSE Version 2, June 1991. Therefore, consider my changes to it to be released under both GNU GENERAL PUBLIC LICENSE Version 2 and the above-noted ISC-like license.

Git cvsimport Patch

PATCH: Git cvsimport

Git indicates in the COPYING file in its distribution that it is under GNU GENERAL PUBLIC LICENSE Version 2, June 1991. Therefore, consider my changes to it to be released under both GNU GENERAL PUBLIC LICENSE Version 2 and the above-noted ISC-like license.


EXAMPLE RUN THROUGH

Pre Patch

Throughout this document the actual username is replaced by catbert and actual path where this work is performed is replaced by $WORKDIR.

$ cd
$ mkdir -p $WORKDIR
$ cd $WORKDIR
$ export CVSROOT=`pwd`/_cvsroot
$ cvs init

$ mkdir proj
$ cd proj
$ echo "Welcome to this test project!" > README
$ cvs import -m "Initial IMPORT!" proj PROJCORP_1 PROJ_INITIAL
N proj/README

No conflicts created by this import

$ cd ..
$ rm -r proj

$ cvs co proj
cvs checkout: Updating proj
U proj/README

$ cd proj
$ date >> README
$ echo A NEW DAY IN THIS PROJECT >> README

$ cvs commit -F /dev/stdin
NOTMETA: This is not a revision meta-data;
cvsps will treat it as such.
^D
cvs commit: Examining .
Checking in README;
/home/catbert/$WORKDIR/_cvsroot/proj/README,v  <--  README
new revision: 1.2; previous revision: 1.1
done

First let's see the RAW CVS server rlog output.

$ cat <<__EOM | cvs server > server.out
> Root $CVSROOT
> Valid-responses ok error Valid-requests Mode M Mbinary E Checked-in Created Updated Merged Removed
> valid-requests
> Argument proj
> rlog
> __EOM

$ cat server.out
Valid-requests Root Valid-responses valid-requests Repository Directory Max-dotdot Static-directory Sticky Checkin-prog Update-prog Entry Kopt Checkin-time Modified Is-modified UseUnchanged Unchanged Notify Questionable Case Argument Argumentx Global_option Gzip-stream wrapper-sendme-rcsOptions Set expand-modules ci co update diff log rlog add remove update-patches gzip-file-contents status rdiff tag rtag import admin export history release watch-on watch-off watch-add watch-remove watchers editors init annotate rannotate noop version
ok
E cvs rlog: Logging proj
M 
M RCS file: /home/catbert/$WORKDIR/_cvsroot/proj/README,v
M head: 1.2
M branch:
M locks: strict
M access list:
M symbolic names:
M 	PROJ_INITIAL: 1.1.1.1
M 	PROJCORP_1: 1.1.1
M keyword substitution: kv
M total revisions: 3;	selected revisions: 3
M description:
M ----------------------------
M revision 1.2
M date: 2017/10/31 05:56:16;  author: catbert;  state: Exp;  lines: +2 -0;  commitid: T3v4RzWwNvgZCKe9;
M NOTMETA: This is not a revision meta-data;
M cvsps will treat it as such.
M ----------------------------
M revision 1.1
M date: 2017/10/31 05:54:58;  author: catbert;  state: Exp;  commitid: WioLNqlsuQhPlqsU;
M branches:  1.1.1;
M Initial revision
M ----------------------------
M revision 1.1.1.1
M date: 2017/10/31 05:54:58;  author: catbert;  state: Exp;  lines: +0 -0;  commitid: WioLNqlsuQhPlqsU;
M Initial IMPORT!
M =============================================================================
ok

Next let's see what stock cvsps version 2.1, run with the options used by the Git cvsimport Perl script, outputs.

NOTE: It is VERY important that, if you need to re-run cvsps, you first remove the appropriate .cvsps/ "cache" files. Otherwise, you'll go crazy trying to figure out why you are seeing old data and/or unexpected behavior.

$ cvsps --norc --cvs-direct -u -A --root :local:$CVSROOT proj > cvsps.out
cvs_direct initialized to CVSROOT /home/catbert/$WORKDIR/_cvsroot
cvs rlog: Logging proj

$ cat cvsps.out
---------------------
PatchSet 1 
Date: 2017/10/30 22:54:58
Author: catbert
Branch: HEAD
Tag: (none) 
Log:
Initial revision

Members: 
	README:INITIAL->1.1 

---------------------
PatchSet 2 
Date: 2017/10/30 22:54:58
Author: catbert
Branch: PROJCORP_1
Ancestor branch: HEAD
Tag: PROJ_INITIAL 
Log:
Initial IMPORT!

Members: 
	README:1.1->1.1.1.1 

---------------------
PatchSet 3 
Date: 2017/10/30 22:56:16
Author: catbert
Branch: HEAD
Tag: (none) 
Log:
cvsps will treat it as such.

Members: 
	README:1.1->1.2 

Notice that in PatchSet 3, the log message is missing the first line! cvsps mistook it for revision meta-data and ignored it.

Next let's play with "Log Boundary Dashes". CVS uses 28 dashes for separating revisions. This works nicely for the human reader, but if such a line ever appears in an actual log message, it will confuse cvsps in a very destructive way.

Let's demonstrate.

$ echo A NEW FILE > ABC
$ cvs add ABC
cvs add: scheduling file `ABC' for addition
cvs add: use 'cvs commit' to add this file permanently

$ cvs commit -F /dev/stdin ABC
New file ABC.

Here we add the "Log Boundary Dashes" (28 dashes)
----------------------------
BAD THINGS ARE BOUT TO HAPPEN
AND IT WILL BE SUBTLE
SO PAY CLOSE ATTENTION
----------------------------

ABC comes before README. This is important!
^D
RCS file: /home/catbert/$WORKDIR/_cvsroot/proj/ABC,v
done
Checking in ABC;
/home/catbert/$WORKDIR/_cvsroot/proj/ABC,v  <--  ABC
initial revision: 1.1
done

First let's see the RAW CVS server rlog output.

$ cat <<__EOM | cvs server > server.out
> Root $CVSROOT
> Valid-responses ok error Valid-requests Mode M Mbinary E Checked-in Created Updated Merged Removed
> valid-requests
> Argument proj
> rlog
> __EOM

$ cat server.out
Valid-requests Root Valid-responses valid-requests Repository Directory Max-dotdot Static-directory Sticky Checkin-prog Update-prog Entry Kopt Checkin-time Modified Is-modified UseUnchanged Unchanged Notify Questionable Case Argument Argumentx Global_option Gzip-stream wrapper-sendme-rcsOptions Set expand-modules ci co update diff log rlog add remove update-patches gzip-file-contents status rdiff tag rtag import admin export history release watch-on watch-off watch-add watch-remove watchers editors init annotate rannotate noop version
ok
E cvs rlog: Logging proj
M 
M RCS file: /home/catbert/$WORKDIR/_cvsroot/proj/ABC,v
M head: 1.1
M branch:
M locks: strict
M access list:
M symbolic names:
M keyword substitution: kv
M total revisions: 1;	selected revisions: 1
M description:
M ----------------------------
M revision 1.1
M date: 2017/10/31 06:04:01;  author: catbert;  state: Exp;  commitid: ZjHgka7Xq4I856XV;
M New file ABC.
M 
M Here we add the "Log Boundary Dashes" (28 dashes)
M ----------------------------
M BAD THINGS ARE BOUT TO HAPPEN
M AND IT WILL BE SUBTLE
M SO PAY CLOSE ATTENTION
M ----------------------------
M 
M ABC comes before README. This is important!
M =============================================================================
M 
M RCS file: /home/catbert/$WORKDIR/_cvsroot/proj/README,v
M head: 1.2
M branch:
M locks: strict
M access list:
M symbolic names:
M 	PROJ_INITIAL: 1.1.1.1
M 	PROJCORP_1: 1.1.1
M keyword substitution: kv
M total revisions: 3;	selected revisions: 3
M description:
M ----------------------------
M revision 1.2
M date: 2017/10/31 05:56:16;  author: catbert;  state: Exp;  lines: +2 -0;  commitid: T3v4RzWwNvgZCKe9;
M NOTMETA: This is not a revision meta-data;
M cvsps will treat it as such.
M ----------------------------
M revision 1.1
M date: 2017/10/31 05:54:58;  author: catbert;  state: Exp;  commitid: WioLNqlsuQhPlqsU;
M branches:  1.1.1;
M Initial revision
M ----------------------------
M revision 1.1.1.1
M date: 2017/10/31 05:54:58;  author: catbert;  state: Exp;  lines: +0 -0;  commitid: WioLNqlsuQhPlqsU;
M Initial IMPORT!
M =============================================================================
ok

NOTE: Remove related "cache" files from ~/.cvsps to ensure consistency.

$ cvsps --norc --cvs-direct -u -A --root :local:$CVSROOT proj > cvsps.out
cvs_direct initialized to CVSROOT /home/catbert/$WORKDIR/_cvsroot
cvs rlog: Logging proj
WARNING: revision 1.1.1.1 of file ABC on unnamed branch


$ cat cvsps.out
---------------------
PatchSet 1 
Date: 2017/10/30 22:54:58
Author: catbert
Branch: #CVSPS_NO_BRANCH
Tag: (none) 
Log:
Initial IMPORT!

Members: 
	ABC:1.1->1.1.1.1 

---------------------
PatchSet 2 
Date: 2017/10/30 22:56:16
Author: catbert
Branch: HEAD
Tag: (none) 
Log:
cvsps will treat it as such.

Members: 
	ABC:1.1->1.2 

---------------------
PatchSet 3 
Date: 2017/10/30 23:04:01
Author: catbert
Branch: HEAD
Tag: (none) 
Log:
New file ABC.

Here we add the "Log Boundary Dashes" (28 dashes)

Members: 
	ABC:1.2->1.1 

SEE what happened? Where has the README file gone? The "bug" is in cvsps's parsing of the CVS server output when it encounters these "Log Boundary Dashes". The parser's state goes from NEED_EOM to NEED_REVISION when it sees the "dashes" in the log message for ABC revision just before README's start. The parser misses the "File Boundary" (which is the series of equal symbols), which would have transitioned the state to NEED_FILE.
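The state transitions can be reproduced with a heavily simplified Python model. The state names NEED_FILE, NEED_REVISION and NEED_EOM are the ones mentioned above; NEED_START_LOG and NEED_DATE as used here, the function name, and the shortened file names and rlog input are my own simplifications, not the real cvsps parser.

```python
REVISION_SEP = "-" * 28
FILE_SEP = "=" * 77


def naive_rlog_revisions(lines):
    """Return (file, revision, message) tuples from rlog-like lines."""
    state = "NEED_FILE"
    current_file = revision = None
    message, found = [], []
    for line in lines:
        if state == "NEED_FILE":
            if line.startswith("RCS file: "):
                current_file = line[len("RCS file: "):]
                state = "NEED_START_LOG"
        elif state == "NEED_START_LOG":
            if line == REVISION_SEP:
                state = "NEED_REVISION"
        elif state == "NEED_REVISION":
            # Anything that is not a "revision" line is silently skipped,
            # including FILE_SEP and the next file's "RCS file:" header.
            if line.startswith("revision "):
                revision, message = line.split()[1], []
                state = "NEED_DATE"
        elif state == "NEED_DATE":
            if line.startswith("date: "):
                state = "NEED_EOM"
        elif state == "NEED_EOM":
            if line in (REVISION_SEP, FILE_SEP):
                found.append((current_file, revision, message))
                state = "NEED_REVISION" if line == REVISION_SEP else "NEED_FILE"
            else:
                message.append(line)
    return found


rlog = [
    "RCS file: ABC,v", "description:", REVISION_SEP,
    "revision 1.1", "date: 2017/10/31 06:04:01;",
    "New file ABC.",
    REVISION_SEP,                      # typed by the developer
    "SO PAY CLOSE ATTENTION",
    REVISION_SEP,                      # typed by the developer
    "ABC comes before README. This is important!",
    FILE_SEP,                          # real file boundary, missed
    "RCS file: README,v", "description:", REVISION_SEP,
    "revision 1.2", "date: 2017/10/31 05:56:16;",
    "cvsps will treat it as such.",
    FILE_SEP,
]
for entry in naive_rlog_revisions(rlog):
    print(entry)
```

In this model README's revision 1.2 is attributed to ABC, because the parser is still waiting for a revision line when the file boundary and README's header go by.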

Let's see what Git cvsimport does with this repository.

NOTE: Remove related "cache" files from ~/.cvsps to ensure consistency.

$ cd ..
$ git cvsimport -v -a -d $CVSROOT -C togit -r proj0 -i -R proj
Initialized empty Git repository in /home/catbert/$WORKDIR/togit/.git/
Running cvsps...
cvs_direct initialized to CVSROOT /home/catbert/$WORKDIR/_cvsroot
cvs rlog: Logging proj
WARNING: revision 1.1.1.1 of file ABC on unnamed branch
Skipping #CVSPS_NO_BRANCH
Fetching ABC   v 1.2
Drop ABC
Tree ID 4b825dc642cb6eb9a060e54bf8d69288fbee4904
Parent ID (empty)
Committed patch 2 (master 1509429376 +0000)
Commit ID ea479084404fefe56936c077599a0bbf9dc4e19c
Fetching ABC   v 1.1
Update ABC: 11 bytes
Tree ID e43dcf7f05643ffb5a19ddd24c230d36b802b059
Parent ID ea479084404fefe56936c077599a0bbf9dc4e19c
Committed patch 3 (master 1509429841 +0000)
Commit ID 3d446c02c35d1be9381e679231df13cf5d565441
DONE; creating master branch

$ git clone togit/ startwork
Cloning into 'startwork'...
done.
$ cd startwork/
$ ls
.git/ ABC

$ git log
commit 3d446c02c35d1be9381e679231df13cf5d565441
Author: catbert 
Date:   Tue Oct 31 06:04:01 2017 +0000

    New file ABC.
    
    Here we add the "Log Boundary Dashes" (28 dashes)

commit ea479084404fefe56936c077599a0bbf9dc4e19c
Author: catbert 
Date:   Tue Oct 31 05:56:16 2017 +0000

    cvsps will treat it as such.

Let's do one more, which will confuse Git cvsimport. This again deals with log messages, but this time ones which include lines that start with Members:.

$ cd ../proj
$ cal >> ABC
$ cvs commit -F /dev/stdin ABC
This will confuse git-cvsimport's parser

Members: 
        somefile.c:1.1->1.2
        another.h:1.7->1.8
        foo.mk:1.22->1.23

Imagine these were lines pasted to note something
^D
Checking in ABC;
/home/catbert/$WORKDIR/_cvsroot/proj/ABC,v  <--  ABC
new revision: 1.2; previous revision: 1.1
done

Let's run cvsps

NOTE: Remove related "cache" files from ~/.cvsps to ensure consistency.

$ cvsps --norc --cvs-direct -u -A --root :local:$CVSROOT proj > cvsps.out
cvs_direct initialized to CVSROOT /home/catbert/$WORKDIR/_cvsroot
cvs rlog: Logging proj
WARNING: revision 1.1.1.1 of file ABC on unnamed branch


$ cat cvsps.out
---------------------
PatchSet 1 
Date: 2017/10/30 22:54:58
Author: catbert
Branch: #CVSPS_NO_BRANCH
Tag: (none) 
Log:
Initial IMPORT!

Members: 
	ABC:1.1->1.1.1.1 

---------------------
PatchSet 2 
Date: 2017/10/30 23:04:01
Author: catbert
Branch: HEAD
Tag: (none) 
Log:
New file ABC.

Here we add the "Log Boundary Dashes" (28 dashes)

Members: 
	ABC:1.2->1.1 

---------------------
PatchSet 3 
Date: 2017/10/30 23:25:20
Author: catbert
Branch: HEAD
Tag: (none) 
Log:
This will confuse git-cvsimport's parser

Members:
	somefile.c:1.1->1.2
	another.h:1.7->1.8
	foo.mk:1.22->1.23

Imagine these were lines pasted to note something

Members: 
	ABC:1.1->1.2 

Note that PatchSet 3 shows two Members: lines.

Now, let's rerun Git cvsimport afresh.

NOTE: Remove related "cache" files from ~/.cvsps to ensure consistency.

$ cd ..
$ rm -fr togit/ startwork/

$ git cvsimport -a -v -d $CVSROOT -C togit -r proj0 -i -R proj
Initialized empty Git repository in /home/catbert/$WORKDIR/togit/.git/
Running cvsps...
cvs_direct initialized to CVSROOT /home/catbert/$WORKDIR/_cvsroot
cvs rlog: Logging proj
WARNING: revision 1.1.1.1 of file ABC on unnamed branch
Skipping #CVSPS_NO_BRANCH
Fetching ABC   v 1.1
Update ABC: 11 bytes
Tree ID e43dcf7f05643ffb5a19ddd24c230d36b802b059
Parent ID (empty)
Committed patch 2 (master 1509429841 +0000)
Commit ID f8421b90b2089230995c3244f8feb6efdacde711
Fetching somefile.c   v 1.2
Unknown: error  

and BOOM!

Post Patch

With the above patches applied, the same exercise runs as follows.

$ mkdir -p $WORKDIR
$ cd $WORKDIR
$ export CVSROOT=`pwd`/_cvsroot
$ cvs init

$ mkdir proj
$ cd proj
$ echo "Welcome to this test project!" > README
$ cvs import -m "Initial IMPORT!" proj PROJCORP_1 PROJ_INITIAL
N proj/README

No conflicts created by this import

$ cd ..
$ rm -r proj

$ cvs co proj
cvs checkout: Updating proj
U proj/README

$ cd proj
$ date >> README
$ echo A NEW DAY IN THIS PROJECT >> README

$ cvs commit -F /dev/stdin
NOTMETA: This is not a revision meta-data;
cvsps will treat it as such.
^D
cvs commit: Examining .
Checking in README;
/tmp/workdir/_cvsroot/proj/README,v  <--  README
new revision: 1.2; previous revision: 1.1
done

First let's see the RAW CVS server rlog output.

$ cat <<__EOM | cvs server > server.out
> Root $CVSROOT
> Valid-responses ok error Valid-requests Mode M Mbinary E Checked-in Created Updated Merged Removed
> valid-requests
> Argument proj
> rlog
> __EOM

$ cat server.out
Valid-requests Root Valid-responses valid-requests Repository Directory Max-dotdot Static-directory Sticky Checkin-prog Update-prog Entry Kopt Checkin-time Modified Is-modified UseUnchanged Unchanged Notify Questionable Case Argument Argumentx Global_option Gzip-stream wrapper-sendme-rcsOptions Set expand-modules ci co update diff log rlog add remove update-patches gzip-file-contents status rdiff tag rtag import admin export history release watch-on watch-off watch-add watch-remove watchers editors init annotate rannotate noop version
ok
E cvs rlog: Logging proj
M 
M RCS file: /tmp/workdir/_cvsroot/proj/README,v
M head: 1.2
M branch:
M locks: strict
M access list:
M symbolic names:
M 	PROJ_INITIAL: 1.1.1.1
M 	PROJCORP_1: 1.1.1
M keyword substitution: kv
M total revisions: 3;	selected revisions: 3
M description:
M ----------------------------
M revision 1.2
M date: 2017/11/02 02:32:51;  author: catbert;  state: Exp;  lines: +2 -0;  commitid: 8DVlI7ADGNEglNu4;
M NOTMETA: This is not a revision meta-data;
M cvsps will treat it as such.
M ----------------------------
M revision 1.1
M date: 2017/11/02 02:32:13;  author: catbert;  state: Exp;  commitid: cgvDa5LYMvOQKzeN;
M branches:  1.1.1;
M Initial revision
M ----------------------------
M revision 1.1.1.1
M date: 2017/11/02 02:32:13;  author: catbert;  state: Exp;  lines: +0 -0;  commitid: cgvDa5LYMvOQKzeN;
M Initial IMPORT!
M =============================================================================
ok

NOTE: We didn't include LOGM in the Valid-responses request. Let's try this again with LOGM specified.

$ cat <<__EOM | cvs server > server.out
> Root $CVSROOT
> Valid-responses ok error Valid-requests Mode M Mbinary E LOGM Checked-in Created Updated Merged Removed
> valid-requests
> Argument proj
> rlog
> __EOM

$ cat server.out
Valid-requests Root Valid-responses valid-requests Repository Directory Max-dotdot Static-directory Sticky Checkin-prog Update-prog Entry Kopt Checkin-time Modified Is-modified UseUnchanged Unchanged Notify Questionable Case Argument Argumentx Global_option Gzip-stream wrapper-sendme-rcsOptions Set expand-modules ci co update diff log rlog add remove update-patches gzip-file-contents status rdiff tag rtag import admin export history release watch-on watch-off watch-add watch-remove watchers editors init annotate rannotate noop version
ok
E cvs rlog: Logging proj
M 
M RCS file: /tmp/workdir/_cvsroot/proj/README,v
M head: 1.2
M branch:
M locks: strict
M access list:
M symbolic names:
M 	PROJ_INITIAL: 1.1.1.1
M 	PROJCORP_1: 1.1.1
M keyword substitution: kv
M total revisions: 3;	selected revisions: 3
M description:
M ----------------------------
M revision 1.2
M date: 2017/11/02 02:32:51;  author: catbert;  state: Exp;  lines: +2 -0;  commitid: 8DVlI7ADGNEglNu4;
LOGM NOTMETA: This is not a revision meta-data;
LOGM cvsps will treat it as such.
M ----------------------------
M revision 1.1
M date: 2017/11/02 02:32:13;  author: catbert;  state: Exp;  commitid: cgvDa5LYMvOQKzeN;
M branches:  1.1.1;
LOGM Initial revision
M ----------------------------
M revision 1.1.1.1
M date: 2017/11/02 02:32:13;  author: catbert;  state: Exp;  lines: +0 -0;  commitid: cgvDa5LYMvOQKzeN;
LOGM Initial IMPORT!
M =============================================================================
ok

Following our previous "script", let's play with "Log Boundary Dashes".

$ echo A NEW FILE > ABC
$ cvs add ABC
cvs add: scheduling file `ABC' for addition
cvs add: use 'cvs commit' to add this file permanently

$ cvs commit -F /dev/stdin ABC
New file ABC.

Here we add the "Log Boundary Dashes" (28 dashes)
----------------------------
BAD THINGS ARE BOUT TO HAPPEN
AND IT WILL BE SUBTLE
SO PAY CLOSE ATTENTION
----------------------------

ABC comes before README. This is important!
^D
RCS file: /tmp/workdir/_cvsroot/proj/ABC,v
done
Checking in ABC;
/tmp/workdir/_cvsroot/proj/ABC,v  <--  ABC
initial revision: 1.1
done

Continuing with the "script", let's add lines starting with Members:.

$ cal >> ABC
$ cvs commit -F /dev/stdin ABC
This will confuse git-cvsimport's parser

Members: 
        somefile.c:1.1->1.2
         another.h:1.7->1.8
         foo.mk:1.22->1.23

Imagine these were lines pasted to note something
^D
Checking in ABC;
/tmp/workdir/_cvsroot/proj/ABC,v  <--  ABC
new revision: 1.2; previous revision: 1.1
done

First let's see the RAW CVS server rlog output.

$ cat <<__EOM | cvs server > server.out
> Root $CVSROOT
> Valid-responses ok error Valid-requests Mode M Mbinary E LOGM Checked-in Created Updated Merged Removed
> valid-requests
> Argument proj
> rlog
> __EOM

$ cat server.out
Valid-requests Root Valid-responses valid-requests Repository Directory Max-dotdot Static-directory Sticky Checkin-prog Update-prog Entry Kopt Checkin-time Modified Is-modified UseUnchanged Unchanged Notify Questionable Case Argument Argumentx Global_option Gzip-stream wrapper-sendme-rcsOptions Set expand-modules ci co update diff log rlog add remove update-patches gzip-file-contents status rdiff tag rtag import admin export history release watch-on watch-off watch-add watch-remove watchers editors init annotate rannotate noop version
ok
E cvs rlog: Logging proj
M 
M RCS file: /tmp/workdir/_cvsroot/proj/ABC,v
M head: 1.2
M branch:
M locks: strict
M access list:
M symbolic names:
M keyword substitution: kv
M total revisions: 2;	selected revisions: 2
M description:
M ----------------------------
M revision 1.2
M date: 2017/11/02 04:11:53;  author: catbert;  state: Exp;  lines: +8 -0;  commitid: 2kyrNR1HMXvpvHuO;
LOGM This will confuse git-cvsimport's parser
LOGM 
LOGM Members:
LOGM        somefile.c:1.1->1.2
LOGM         another.h:1.7->1.8
LOGM         foo.mk:1.22->1.23
LOGM 
LOGM Imagine these were lines pasted to note something
M ----------------------------
M revision 1.1
M date: 2017/11/02 02:49:05;  author: catbert;  state: Exp;  commitid: 3bAwSgarQcIX2nft;
LOGM New file ABC.
LOGM 
LOGM Here we add the "Log Boundary Dashes" (28 dashes)
LOGM ----------------------------
LOGM BAD THINGS ARE BOUT TO HAPPEN
LOGM AND IT WILL BE SUBTLE
LOGM SO PAY CLOSE ATTENTION
LOGM ----------------------------
LOGM 
LOGM ABC comes before README. This is important!
M =============================================================================
M 
M RCS file: /tmp/workdir/_cvsroot/proj/README,v
M head: 1.2
M branch:
M locks: strict
M access list:
M symbolic names:
M 	PROJ_INITIAL: 1.1.1.1
M 	PROJCORP_1: 1.1.1
M keyword substitution: kv
M total revisions: 3;	selected revisions: 3
M description:
M ----------------------------
M revision 1.2
M date: 2017/11/02 02:32:51;  author: catbert;  state: Exp;  lines: +2 -0;  commitid: 8DVlI7ADGNEglNu4;
LOGM NOTMETA: This is not a revision meta-data;
LOGM cvsps will treat it as such.
M ----------------------------
M revision 1.1
M date: 2017/11/02 02:32:13;  author: catbert;  state: Exp;  commitid: cgvDa5LYMvOQKzeN;
M branches:  1.1.1;
LOGM Initial revision
M ----------------------------
M revision 1.1.1.1
M date: 2017/11/02 02:32:13;  author: catbert;  state: Exp;  lines: +0 -0;  commitid: cgvDa5LYMvOQKzeN;
LOGM Initial IMPORT!
M =============================================================================
ok

Let's see what cvsps shows.

NOTE: Remove related "cache" files from ~/.cvsps to ensure consistency.

$ cvsps --norc --cvs-direct -u -A --root :local:$CVSROOT proj > cvsps.out
cvs_direct initialized to CVSROOT /tmp/workdir/_cvsroot
cvs rlog: Logging proj

$ cat cvsps.out
---------------------
PatchSet 1 
Date: 2017/11/01 19:32:13
Author: catbert
Branch: HEAD
Tag: (none) 
Log: 1
Initial revision

Members: 
	README:INITIAL->1.1 

---------------------
PatchSet 2 
Date: 2017/11/01 19:32:13
Author: catbert
Branch: PROJCORP_1
Ancestor branch: HEAD
Tag: PROJ_INITIAL 
Log: 1
Initial IMPORT!

Members: 
	README:1.1->1.1.1.1 

---------------------
PatchSet 3 
Date: 2017/11/01 19:32:51
Author: catbert
Branch: HEAD
Tag: (none) 
Log: 2
NOTMETA: This is not a revision meta-data;
cvsps will treat it as such.

Members: 
	README:1.1->1.2 

---------------------
PatchSet 4 
Date: 2017/11/01 19:49:05
Author: catbert
Branch: HEAD
Tag: (none) 
Log: 10
New file ABC.

Here we add the "Log Boundary Dashes" (28 dashes)
----------------------------
BAD THINGS ARE BOUT TO HAPPEN
AND IT WILL BE SUBTLE
SO PAY CLOSE ATTENTION
----------------------------

ABC comes before README. This is important!

Members: 
	ABC:INITIAL->1.1 

---------------------
PatchSet 5 
Date: 2017/11/01 21:11:53
Author: catbert
Branch: HEAD
Tag: (none) 
Log: 8
This will confuse git-cvsimport's parser

Members:
       somefile.c:1.1->1.2
        another.h:1.7->1.8
        foo.mk:1.22->1.23

Imagine these were lines pasted to note something

Members: 
	ABC:1.1->1.2 

As you can see, the PatchSet looks much better, and it includes the line count for each commit message for Git cvsimport, which comes next.

NOTE: Remove related "cache" files from ~/.cvsps to ensure consistency.

$ cd ..
$ rm -fr togit/ startwork/

$ git cvsimport -a -v -d $CVSROOT -C togit -r proj0 -i -R proj
Initialized empty Git repository in /tmp/workdir/togit/.git/
Running cvsps...
cvs_direct initialized to CVSROOT /tmp/workdir/_cvsroot
cvs rlog: Logging proj
Fetching README   v 1.1
New README: 30 bytes
Tree ID e553e30713aa517d13350bcf77efed681fa462b8
Parent ID (empty)
Committed patch 1 (master 1509589933 +0000)
Commit ID de55b0e4614d667e6981ac08f2019801049b5933
Fetching README   v 1.1.1.1
Update README: 30 bytes
Tree ID e553e30713aa517d13350bcf77efed681fa462b8
Parent ID de55b0e4614d667e6981ac08f2019801049b5933
Committed patch 2 (PROJCORP_1 1509589933 +0000)
Commit ID ac12bb6288860c536cccdf89de571549d7fe0fb8
Created tag 'PROJ_INITIAL' on 'PROJCORP_1'
Fetching README   v 1.2
Update README: 85 bytes
Tree ID a156bd9edc46794d5259c0459949bcfe7d881c33
Parent ID de55b0e4614d667e6981ac08f2019801049b5933
Committed patch 3 (master 1509589971 +0000)
Commit ID 12c3537432a089d396797567a32929e3bfeccccc
Fetching ABC   v 1.1
New ABC: 11 bytes
Tree ID f666f4a3e52ebb6418595dcebf0acb856d38f881
Parent ID 12c3537432a089d396797567a32929e3bfeccccc
Committed patch 4 (master 1509590945 +0000)
Commit ID 2ea1de8fa7d17d0a17e441cce9014ac2bd1f2376
Fetching ABC   v 1.2
Update ABC: 175 bytes
Tree ID 78a73e04bf0272f8869c1fd567a292756f5b9c0a
Parent ID 2ea1de8fa7d17d0a17e441cce9014ac2bd1f2376
Committed patch 5 (master 1509595913 +0000)
Commit ID 9c3f8afc0a587f9146669df3cbf73d51bfa54fe5
DONE; creating master branch


$ git clone togit/ startwork
Cloning into 'startwork'...
done.

$ cd startwork/
$ ls
.git/   ABC     README


$ git log
commit 9c3f8afc0a587f9146669df3cbf73d51bfa54fe5
Author: catbert 
Date:   Thu Nov 2 04:11:53 2017 +0000

    This will confuse git-cvsimport's parser
    
    Members:
           somefile.c:1.1->1.2
            another.h:1.7->1.8
            foo.mk:1.22->1.23
    
    Imagine these were lines pasted to note something

commit 2ea1de8fa7d17d0a17e441cce9014ac2bd1f2376
Author: catbert 
Date:   Thu Nov 2 02:49:05 2017 +0000

    New file ABC.
    
    Here we add the "Log Boundary Dashes" (28 dashes)
    ----------------------------
    BAD THINGS ARE BOUT TO HAPPEN
    AND IT WILL BE SUBTLE
    SO PAY CLOSE ATTENTION
    ----------------------------
    
    ABC comes before README. This is important!

commit 12c3537432a089d396797567a32929e3bfeccccc
Author: catbert 
Date:   Thu Nov 2 02:32:51 2017 +0000

    NOTMETA: This is not a revision meta-data;
    cvsps will treat it as such.

commit de55b0e4614d667e6981ac08f2019801049b5933
Author: catbert 
Date:   Thu Nov 2 02:32:13 2017 +0000

    Initial revision