/[escript]/trunk/pyvisi/admin/cvs2cl.pl
ViewVC logotype

Annotation of /trunk/pyvisi/admin/cvs2cl.pl

Parent Directory Parent Directory | Revision Log Revision Log


Revision 337 - (hide annotations)
Mon Dec 12 01:57:07 2005 UTC (15 years, 10 months ago) by cochrane
File MIME type: text/plain
File size: 77298 byte(s)
Initial merge of pyvisi into esys repository.
1 cochrane 337 #!/bin/sh
2     exec perl -w -x $0 ${1+"$@"} # -*- mode: perl; perl-indent-level: 2; -*-
3     #!perl -w
4    
5    
6     ##############################################################
7     ### ###
8     ### cvs2cl.pl: produce ChangeLog(s) from `cvs log` output. ###
9     ### ###
10     ##############################################################
11    
12     ## $Id: cvs2cl.pl,v 1.1 2004/11/23 12:30:21 paultcochrane Exp $
13     ## $Revision: 1.1 $
14     ## $Date: 2004/11/23 12:30:21 $
15     ## $Author: paultcochrane $
16     ##
17     ## (C) 2001,2002,2003 Martyn J. Pearce <fluffy@cpan.org>, under the GNU GPL.
18     ## (C) 1999 Karl Fogel <kfogel@red-bean.com>, under the GNU GPL.
19     ##
20     ## (Extensively hacked on by Melissa O'Neill <oneill@cs.sfu.ca>.)
21     ## (Gecos hacking by Robin Johnson <robbat2@orbis-terrarum.net>.)
22     ##
23     ## cvs2cl.pl is free software; you can redistribute it and/or modify
24     ## it under the terms of the GNU General Public License as published by
25     ## the Free Software Foundation; either version 2, or (at your option)
26     ## any later version.
27     ##
28     ## cvs2cl.pl is distributed in the hope that it will be useful,
29     ## but WITHOUT ANY WARRANTY; without even the implied warranty of
30     ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
31     ## GNU General Public License for more details.
32     ##
33     ## You may have received a copy of the GNU General Public License
34     ## along with cvs2cl.pl; see the file COPYING. If not, write to the
35     ## Free Software Foundation, Inc., 59 Temple Place - Suite 330,
36     ## Boston, MA 02111-1307, USA.
37    
38    
39     use strict;
40     use Text::Wrap qw( );
41     use Time::Local;
42     use File::Basename qw( fileparse );
43     use User::pwent;
44    
45    
46     # The Plan:
47     #
48     # Read in the logs for multiple files, spit out a nice ChangeLog that
49     # mirrors the information entered during `cvs commit'.
50     #
51     # The problem presents some challenges. In an ideal world, we could
52     # detect files with the same author, log message, and checkin time --
53     # each <filelist, author, time, logmessage> would be a changelog entry.
54     # We'd sort them; and spit them out. Unfortunately, CVS is *not atomic*
55     # so checkins can span a range of times. Also, the directory structure
56     # could be hierarchical.
57     #
58     # Another question is whether we really want to have the ChangeLog
59     # exactly reflect commits. An author could issue two related commits,
60     # with different log entries, reflecting a single logical change to the
61     # source. GNU style ChangeLogs group these under a single author/date.
62     # We try to do the same.
63     #
64     # So, we parse the output of `cvs log', storing log messages in a
65     # multilevel hash that stores the mapping:
66     # directory => author => time => message => filelist
67     # As we go, we notice "nearby" commit times and store them together
68     # (i.e., under the same timestamp), so they appear in the same log
69     # entry.
70     #
71     # When we've read all the logs, we twist this mapping into
72     # a time => author => message => filelist mapping for each directory.
73     #
74     # If we're not using the `--distributed' flag, the directory is always
75     # considered to be `./', even as descend into subdirectories.
76    
77    
78     ############### Globals ################
79    
80     use constant MAILNAME => "/etc/mailname";
81    
82     # What we run to generate it:
83     my $Log_Source_Command = "cvs log";
84    
85     # In case we have to print it out:
86     my $VERSION = '$Revision: 1.1 $';
87     $VERSION =~ s/\S+\s+(\S+)\s+\S+/$1/;
88    
89     ## Vars set by options:
90    
91     # Print debugging messages?
92     my $Debug = 0;
93    
94     # Just show version and exit?
95     my $Print_Version = 0;
96    
97     # Just print usage message and exit?
98     my $Print_Usage = 0;
99    
100     # Single top-level ChangeLog, or one per subdirectory?
101     my $Distributed = 0;
102    
103     # What file should we generate (defaults to "ChangeLog")?
104     my $Log_File_Name = "ChangeLog";
105    
106     # Grab most recent entry date from existing ChangeLog file, just add
107     # to that ChangeLog.
108     my $Cumulative = 0;
109    
110     # `cvs log -d`, this will repeat the last entry in the old log. This is OK,
111     # as it guarantees at least one entry in the update changelog, which means
112     # that there will always be a date to extract for the next update. The repeat
113     # entry can be removed in postprocessing, if necessary.
114     my $Update = 0;
115    
116     # Expand usernames to email addresses based on a map file?
117     my $User_Map_File = "";
118     my $User_Passwd_File;
119     my $Mail_Domain;
120    
121     # Output log in chronological order? [default is reverse chronological order]
122     my $Chronological_Order = 0;
123    
124     # Grab user details via gecos
125     my $Gecos = 0;
126    
127     # User domain for gecos email addresses
128     my $Domain;
129    
130     # Output to a file or to stdout?
131     my $Output_To_Stdout = 0;
132    
133     # Eliminate empty log messages?
134     my $Prune_Empty_Msgs = 0;
135    
136     # Tags of which not to output
137     my %ignore_tags;
138    
139     # Show only revisions with Tags
140     my %show_tags;
141    
142     # Don't call Text::Wrap on the body of the message
143     my $No_Wrap = 0;
144    
145     # Don't do any pretty print processing
146     my $Summary = 0;
147    
148     # Separates header from log message. Code assumes it is either " " or
149     # "\n\n", so if there's ever an option to set it to something else,
150     # make sure to go through all conditionals that use this var.
151     my $After_Header = " ";
152    
153     # XML Encoding
154     my $XML_Encoding = '';
155    
156     # Format more for programs than for humans.
157     my $XML_Output = 0;
158    
159     # Do some special tweaks for log data that was written in FSF
160     # ChangeLog style.
161     my $FSF_Style = 0;
162    
163     # Show times in UTC instead of local time
164     my $UTC_Times = 0;
165    
166     # Show times in output?
167     my $Show_Times = 1;
168    
169     # Show day of week in output?
170     my $Show_Day_Of_Week = 0;
171    
172     # Show revision numbers in output?
173     my $Show_Revisions = 0;
174    
175     # Show dead files in output?
176     my $Show_Dead = 0;
177    
178     # Hide dead trunk files which were created as a result of additions on a
179     # branch?
180     my $Hide_Branch_Additions = 1;
181    
182     # Show tags (symbolic names) in output?
183     my $Show_Tags = 0;
184    
185     # Show tags separately in output?
186     my $Show_Tag_Dates = 0;
187    
188     # Show branches by symbolic name in output?
189     my $Show_Branches = 0;
190    
191     # Show only revisions on these branches or their ancestors.
192     my @Follow_Branches;
193    
194     # Don't bother with files matching this regexp.
195     my @Ignore_Files;
196    
197     # How exactly we match entries. We definitely want "o",
198     # and user might add "i" by using --case-insensitive option.
199     my $Case_Insensitive = 0;
200    
201     # Maybe only show log messages matching a certain regular expression.
202     my $Regexp_Gate = "";
203    
204     # Pass this global option string along to cvs, to the left of `log':
205     my $Global_Opts = "";
206    
207     # Pass this option string along to the cvs log subcommand:
208     my $Command_Opts = "";
209    
210     # Read log output from stdin instead of invoking cvs log?
211     my $Input_From_Stdin = 0;
212    
213     # Don't show filenames in output.
214     my $Hide_Filenames = 0;
215    
216     # Don't shorten directory names from filenames.
217     my $Common_Dir = 1;
218    
219     # Max checkin duration. CVS checkin is not atomic, so we may have checkin
220     # times that span a range of time. We assume that checkins will last no
221     # longer than $Max_Checkin_Duration seconds, and that similarly, no
222     # checkins will happen from the same users with the same message less
223     # than $Max_Checkin_Duration seconds apart.
224     my $Max_Checkin_Duration = 180;
225    
226     # What to put at the front of [each] ChangeLog.
227     my $ChangeLog_Header = "";
228    
229     # Whether to enable 'delta' mode, and for what start/end tags.
230     my $Delta_Mode = 0;
231     my $Delta_From = "";
232     my $Delta_To = "";
233    
234     my $TestCode;
235    
236     # Whether to parse filenames from the RCS filename, and if so what
237     # prefix to strip.
238     my $RCS_Mode = 0;
239     my $RCS_Root = "";
240    
241     ## end vars set by options.
242    
243     # latest observed times for the start/end tags in delta mode
244     my $Delta_StartTime = 0;
245     my $Delta_EndTime = 0;
246    
247     # In 'cvs log' output, one long unbroken line of equal signs separates
248     # files:
249     my $file_separator = "======================================="
250     . "======================================";
251    
252     # In 'cvs log' output, a shorter line of dashes separates log messages
253     # within a file:
254     my $logmsg_separator = "----------------------------";
255    
256     my $No_Ancestors = 0;
257    
258     ############### End globals ############
259    
260    
261    
262     &parse_options ();
263     if ( defined $TestCode ) {
264     eval $TestCode;
265     die "Eval failed: '$@'\n"
266     if $@;
267     } else {
268     &derive_change_log ();
269     }
270    
271    
272     ### Everything below is subroutine definitions. ###
273    
274     sub run_ext {
275     my ($cmd) = @_;
276     $cmd = [$cmd]
277     unless ref $cmd;
278     local $" = ' ';
279     my $out = qx"@$cmd 2>&1";
280     my $rv = $?;
281     my ($sig, $core, $exit) = ($? & 127, $? & 128, $? >> 8);
282     return $out, $exit, $sig, $core;
283     }
284    
285     # If accumulating, grab the boundary date from pre-existing ChangeLog.
286     sub maybe_grab_accumulation_date ()
287     {
288     if (! $Cumulative || $Update) {
289     return "";
290     }
291    
292     # else
293    
294     open (LOG, "$Log_File_Name")
295     or die ("trouble opening $Log_File_Name for reading ($!)");
296    
297     my $boundary_date;
298     while (<LOG>)
299     {
300     if (/^(\d\d\d\d-\d\d-\d\d\s+\d\d:\d\d)/)
301     {
302     $boundary_date = "$1";
303     last;
304     }
305     }
306    
307     close (LOG);
308    
309     # convert time from utc to local timezone if the ChangeLog has
310     # dates/times in utc
311     if ($UTC_Times && $boundary_date)
312     {
313     # convert the utc time to a time value
314     my ($year,$mon,$mday,$hour,$min) = $boundary_date =~
315     m#(\d+)-(\d+)-(\d+)\s+(\d+):(\d+)#;
316     my $time = timegm(0,$min,$hour,$mday,$mon-1,$year-1900);
317     # print the timevalue in the local timezone
318     my ($ignore,$wday);
319     ($ignore,$min,$hour,$mday,$mon,$year,$wday) = localtime($time);
320     $boundary_date=sprintf ("%4u-%02u-%02u %02u:%02u",
321     $year+1900,$mon+1,$mday,$hour,$min);
322     }
323    
324     return $boundary_date;
325     }
326    
327     sub wrap {
328     my ($indent1, $indent2, @text) = @_;
329     my $text = Text::Wrap::wrap($indent1, $indent2, @text);
330     # If incoming text looks preformatted, don't get clever
331     if ( grep /^\s+/m, @text ) {
332     return $text;
333     }
334     my @lines = split /\n/, $text;
335     $indent2 =~ s!^((?: {8})+)!"\t" x (length($1)/8)!e;
336     $lines[0] =~ s/^$indent1\s+/$indent1/;
337     s/^$indent2\s+/$indent2/
338     for @lines[1..$#lines];
339     my $newtext = join "\n", @lines;
340     $newtext .= "\n"
341     if substr($text, -1) eq "\n";
342     return $newtext;
343     }
344    
345     # Fills up a ChangeLog structure in the current directory.
346     sub derive_change_log ()
347     {
348     # See "The Plan" above for a full explanation.
349    
350     my %grand_poobah;
351    
352     my $file_full_path;
353     my $time;
354     my $revision;
355     my $author;
356     my $state;
357     my $lines;
358     my $cvsstate;
359     my $msg_txt;
360     my $detected_file_separator;
361    
362     my %tag_date_printed;
363    
364     # Might be adding to an existing ChangeLog
365     my $accumulation_date = &maybe_grab_accumulation_date ();
366     if ($accumulation_date) {
367     # Insert -d immediately after 'cvs log'
368     my $Log_Date_Command = "-d\'>${accumulation_date}\'";
369     $Log_Source_Command =~ s/(^.*log\S*)/$1 $Log_Date_Command/;
370     &debug ("(adding log msg starting from $accumulation_date)\n");
371     }
372    
373     # We might be expanding usernames
374     my %usermap;
375    
376     # In general, it's probably not very maintainable to use state
377     # variables like this to tell the loop what it's doing at any given
378     # moment, but this is only the first one, and if we never have more
379     # than a few of these, it's okay.
380     my $collecting_symbolic_names = 0;
381     my %symbolic_names; # Where tag names get stored.
382     my %branch_names; # We'll grab branch names while we're at it.
383     my %branch_numbers; # Save some revisions for @Follow_Branches
384     my @branch_roots; # For showing which files are branch ancestors.
385    
386     # Bleargh. Compensate for a deficiency of custom wrapping.
387     if (($After_Header ne " ") and $FSF_Style)
388     {
389     $After_Header .= "\t";
390     }
391    
392     if (! $Input_From_Stdin) {
393     &debug ("(run \"${Log_Source_Command}\")\n");
394     open (LOG_SOURCE, "$Log_Source_Command |")
395     or die "unable to run \"${Log_Source_Command}\"";
396     }
397     else {
398     open (LOG_SOURCE, "-") or die "unable to open stdin for reading";
399     }
400    
401     binmode LOG_SOURCE;
402    
403     %usermap = &maybe_read_user_map_file ();
404    
405     while (<LOG_SOURCE>)
406     {
407     # Canonicalize line endings
408     s/\r$//;
409     my $new_full_path;
410    
411     # If on a new file and don't see filename, skip until we find it, and
412     # when we find it, grab it.
413     if (! (defined $file_full_path))
414     {
415     if (/^Working file: (.*)/) {
416     $new_full_path = $1;
417     } elsif ($RCS_Mode && m|^RCS file: $RCS_Root[/\\](.*),v$|) {
418     $new_full_path = $1;
419     }
420     }
421    
422     if (defined $new_full_path)
423     {
424     $file_full_path = $new_full_path;
425     if (@Ignore_Files)
426     {
427     my $base;
428     ($base, undef, undef) = fileparse ($file_full_path);
429     # Ouch, I wish trailing operators in regexps could be
430     # evaluated on the fly!
431     if ($Case_Insensitive) {
432     if (grep ($file_full_path =~ m|$_|i, @Ignore_Files)) {
433     undef $file_full_path;
434     }
435     }
436     elsif (grep ($file_full_path =~ m|$_|, @Ignore_Files)) {
437     undef $file_full_path;
438     }
439     }
440     next;
441     }
442    
443     # Just spin wheels if no file defined yet.
444     next if (! $file_full_path);
445    
446     # Collect tag names in case we're asked to print them in the output.
447     if (/^symbolic names:$/) {
448     $collecting_symbolic_names = 1;
449     next; # There's no more info on this line, so skip to next
450     }
451     if ($collecting_symbolic_names)
452     {
453     # All tag names are listed with whitespace in front in cvs log
454     # output; so if see non-whitespace, then we're done collecting.
455     if (/^\S/) {
456     $collecting_symbolic_names = 0;
457     }
458     else # we're looking at a tag name, so parse & store it
459     {
460     # According to the Cederqvist manual, in node "Tags", tag
461     # names must start with an uppercase or lowercase letter and
462     # can contain uppercase and lowercase letters, digits, `-',
463     # and `_'. However, it's not our place to enforce that, so
464     # we'll allow anything CVS hands us to be a tag:
465     /^\s+([^:]+): ([\d.]+)$/;
466     my $tag_name = $1;
467     my $tag_rev = $2;
468    
469     # A branch number either has an odd number of digit sections
470     # (and hence an even number of dots), or has ".0." as the
471     # second-to-last digit section. Test for these conditions.
472     my $real_branch_rev = "";
473     if (($tag_rev =~ /^(\d+\.\d+\.)+\d+$/) # Even number of dots...
474     and (! ($tag_rev =~ /^(1\.)+1$/))) # ...but not "1.[1.]1"
475     {
476     $real_branch_rev = $tag_rev;
477     }
478     elsif ($tag_rev =~ /(\d+\.(\d+\.)+)0.(\d+)/) # Has ".0."
479     {
480     $real_branch_rev = $1 . $3;
481     }
482     # If we got a branch, record its number.
483     if ($real_branch_rev)
484     {
485     $branch_names{$real_branch_rev} = $tag_name;
486     if (@Follow_Branches) {
487     if (grep ($_ eq $tag_name, @Follow_Branches)) {
488     $branch_numbers{$tag_name} = $real_branch_rev;
489     }
490     }
491     }
492     else {
493     # Else it's just a regular (non-branch) tag.
494     push (@{$symbolic_names{$tag_rev}}, $tag_name);
495     }
496     }
497     }
498     # End of code for collecting tag names.
499    
500     # If have file name, but not revision, and see revision, then grab
501     # it. (We collect unconditionally, even though we may or may not
502     # ever use it.)
503     if ((! (defined $revision)) and (/^revision (\d+\.[\d.]+)/))
504     {
505     $revision = $1;
506    
507     if (@Follow_Branches)
508     {
509     foreach my $branch (@Follow_Branches)
510     {
511     # Special case for following trunk revisions
512     if (($branch =~ /^trunk$/i) and ($revision =~ /^[0-9]+\.[0-9]+$/))
513     {
514     goto dengo;
515     }
516    
517     my $branch_number = $branch_numbers{$branch};
518     if ($branch_number)
519     {
520     # Are we on one of the follow branches or an ancestor of
521     # same?
522     #
523     # If this revision is a prefix of the branch number, or
524     # possibly is less in the minormost number, OR if this
525     # branch number is a prefix of the revision, then yes.
526     # Otherwise, no.
527     #
528     # So below, we determine if any of those conditions are
529     # met.
530    
531     # Trivial case: is this revision on the branch?
532     # (Compare this way to avoid regexps that screw up Emacs
533     # indentation, argh.)
534     if ((substr ($revision, 0, ((length ($branch_number)) + 1)))
535     eq ($branch_number . "."))
536     {
537     goto dengo;
538     }
539     # Non-trivial case: check if rev is ancestral to branch
540     elsif ((length ($branch_number)) > (length ($revision))
541     and
542     $No_Ancestors)
543     {
544     $revision =~ /^((?:\d+\.)+)(\d+)$/;
545     my $r_left = $1; # still has the trailing "."
546     my $r_end = $2;
547    
548     $branch_number =~ /^((?:\d+\.)+)(\d+)\.\d+$/;
549     my $b_left = $1; # still has trailing "."
550     my $b_mid = $2; # has no trailing "."
551    
552     if (($r_left eq $b_left)
553     && ($r_end <= $b_mid))
554     {
555     goto dengo;
556     }
557     }
558     }
559     }
560     }
561     else # (! @Follow_Branches)
562     {
563     next;
564     }
565    
566     # Else we are following branches, but this revision isn't on the
567     # path. So skip it.
568     undef $revision;
569     dengo:
570     next;
571     }
572    
573     # If we don't have a revision right now, we couldn't possibly
574     # be looking at anything useful.
575     if (! (defined ($revision))) {
576     $detected_file_separator = /^$file_separator$/o;
577     if ($detected_file_separator) {
578     # No revisions for this file; can happen, e.g. "cvs log -d DATE"
579     goto CLEAR;
580     }
581     else {
582     next;
583     }
584     }
585    
586     # If have file name but not date and author, and see date or
587     # author, then grab them:
588     unless (defined $time)
589     {
590     if (/^date: .*/)
591     {
592     ($time, $author, $state, $lines) =
593     &parse_date_author_and_state ($_);
594     if (defined ($usermap{$author}) and $usermap{$author}) {
595     $author = $usermap{$author};
596     } elsif(defined $Domain or $Gecos == 1) {
597     my $email = $author;
598     if(defined $Domain && $Domain ne '') {
599     $email = $author."@".$Domain;
600     }
601     my $pw = getpwnam($author);
602     my $fullname;
603     my $office;
604     my $workphone;
605     my $homephone;
606     for (($fullname, $office, $workphone, $homephone) = split /\s*,\s*/, $pw->gecos) {
607     s/&/ucfirst(lc($pw->name))/ge;
608     }
609     if($fullname ne "") {
610     $author = $fullname . " <" . $email . ">";
611     }
612     }
613     }
614     else {
615     $detected_file_separator = /^$file_separator$/o;
616     if ($detected_file_separator) {
617     # No revisions for this file; can happen, e.g. "cvs log -d DATE"
618     goto CLEAR;
619     }
620     }
621     # If the date/time/author hasn't been found yet, we couldn't
622     # possibly care about anything we see. So skip:
623     next;
624     }
625    
626     # A "branches: ..." line here indicates that one or more branches
627     # are rooted at this revision. If we're showing branches, then we
628     # want to show that fact as well, so we collect all the branches
629     # that this is the latest ancestor of and store them in
630     # @branch_roots. Just for reference, the format of the line we're
631     # seeing at this point is:
632     #
633     # branches: 1.5.2; 1.5.4; ...;
634     #
635     # Okay, here goes:
636    
637     if (/^branches:\s+(.*);$/)
638     {
639     if ($Show_Branches)
640     {
641     my $lst = $1;
642     $lst =~ s/(1\.)+1;|(1\.)+1$//; # ignore the trivial branch 1.1.1
643     if ($lst) {
644     @branch_roots = split (/;\s+/, $lst);
645     }
646     else {
647     undef @branch_roots;
648     }
649     next;
650     }
651     else
652     {
653     # Ugh. This really bothers me. Suppose we see a log entry
654     # like this:
655     #
656     # ----------------------------
657     # revision 1.1
658     # date: 1999/10/17 03:07:38; author: jrandom; state: Exp;
659     # branches: 1.1.2;
660     # Intended first line of log message begins here.
661     # ----------------------------
662     #
663     # The question is, how we can tell the difference between that
664     # log message and a *two*-line log message whose first line is
665     #
666     # "branches: 1.1.2;"
667     #
668     # See the problem? The output of "cvs log" is inherently
669     # ambiguous.
670     #
671     # For now, we punt: we liberally assume that people don't
672     # write log messages like that, and just toss a "branches:"
673     # line if we see it but are not showing branches. I hope no
674     # one ever loses real log data because of this.
675     next;
676     }
677     }
678    
679     # If have file name, time, and author, then we're just grabbing
680     # log message texts:
681     $detected_file_separator = /^$file_separator$/o;
682     if ($detected_file_separator && ! (defined $revision)) {
683     # No revisions for this file; can happen, e.g. "cvs log -d DATE"
684     goto CLEAR;
685     }
686     unless ($detected_file_separator || /^$logmsg_separator$/o)
687     {
688     $msg_txt .= $_; # Normally, just accumulate the message...
689     next;
690     }
691     # ... until a msg separator is encountered:
692     # Ensure the message contains something:
693     if ((! $msg_txt)
694     || ($msg_txt =~ /^\s*\.\s*$|^\s*$/)
695     || ($msg_txt =~ /\*\*\* empty log message \*\*\*/))
696     {
697     if ($Prune_Empty_Msgs) {
698     goto CLEAR;
699     }
700     # else
701     $msg_txt = "[no log message]\n";
702     }
703    
704     ### Store it all in the Grand Poobah:
705     {
706     my $dir_key; # key into %grand_poobah
707     my %qunk; # complicated little jobbie, see below
708    
709     # Each revision of a file has a little data structure (a `qunk')
710     # associated with it. That data structure holds not only the
711     # file's name, but any additional information about the file
712     # that might be needed in the output, such as the revision
713     # number, tags, branches, etc. The reason to have these things
714     # arranged in a data structure, instead of just appending them
715     # textually to the file's name, is that we may want to do a
716     # little rearranging later as we write the output. For example,
717     # all the files on a given tag/branch will go together, followed
718     # by the tag in parentheses (so trunk or otherwise non-tagged
719     # files would go at the end of the file list for a given log
720     # message). This rearrangement is a lot easier to do if we
721     # don't have to reparse the text.
722     #
723     # A qunk looks like this:
724     #
725     # {
726     # filename => "hello.c",
727     # revision => "1.4.3.2",
728     # time => a timegm() return value (moment of commit)
729     # tags => [ "tag1", "tag2", ... ],
730     # branch => "branchname" # There should be only one, right?
731     # branchroots => [ "branchtag1", "branchtag2", ... ]
732     # }
733    
734     if ($Distributed) {
735     # Just the basename, don't include the path.
736     ($qunk{'filename'}, $dir_key, undef) = fileparse ($file_full_path);
737     }
738     else {
739     $dir_key = "./";
740     $qunk{'filename'} = $file_full_path;
741     }
742    
743     # This may someday be used in a more sophisticated calculation
744     # of what other files are involved in this commit. For now, we
745     # don't use it much except for delta mode, because the
746     # common-commit-detection algorithm is hypothesized to be
747     # "good enough" as it stands.
748     $qunk{'time'} = $time;
749    
750     # We might be including revision numbers and/or tags and/or
751     # branch names in the output. Most of the code from here to
752     # loop-end deals with organizing these in qunk.
753    
754     $qunk{'revision'} = $revision;
755     $qunk{'state'} = $state;
756     if ( defined( $lines )) {
757     $qunk{'lines'} = $lines;
758     }
759    
760     # Grab the branch, even though we may or may not need it:
761     $qunk{'revision'} =~ /((?:\d+\.)+)\d+/;
762     my $branch_prefix = $1;
763     $branch_prefix =~ s/\.$//; # strip off final dot
764     if ($branch_names{$branch_prefix}) {
765     $qunk{'branch'} = $branch_names{$branch_prefix};
766     }
767    
768     # Keep a record of the file's cvs state.
769     $qunk{'cvsstate'} = $state;
770    
771     # If there's anything in the @branch_roots array, then this
772     # revision is the root of at least one branch. We'll display
773     # them as branch names instead of revision numbers, the
774     # substitution for which is done directly in the array:
775     if (@branch_roots) {
776     my @roots = map { $branch_names{$_} } @branch_roots;
777     $qunk{'branchroots'} = \@roots;
778     }
779    
780     # Save tags too.
781     if (defined ($symbolic_names{$revision})) {
782     $qunk{'tags'} = $symbolic_names{$revision};
783     delete $symbolic_names{$revision};
784    
785     # If we're in 'delta' mode, update the latest observed
786     # times for the beginning and ending tags, and
787     # when we get around to printing output, we will simply restrict
788     # ourselves to that timeframe...
789    
790     if ($Delta_Mode) {
791     if (($time > $Delta_StartTime) &&
792     (grep { $_ eq $Delta_From } @{$qunk{'tags'}}))
793     {
794     $Delta_StartTime = $time;
795     }
796    
797     if (($time > $Delta_EndTime) &&
798     (grep { $_ eq $Delta_To } @{$qunk{'tags'}}))
799     {
800     $Delta_EndTime = $time;
801     }
802     }
803     }
804    
805     unless ($Hide_Branch_Additions and $msg_txt =~ /file \S+ was initially added on branch \S+./) {
806     # Add this file to the list
807     # (We use many spoonfuls of autovivication magic. Hashes and arrays
808     # will spring into existence if they aren't there already.)
809    
810     &debug ("(pushing log msg for ${dir_key}$qunk{'filename'})\n");
811    
812     # Store with the files in this commit. Later we'll loop through
813     # again, making sure that revisions with the same log message
814     # and nearby commit times are grouped together as one commit.
815     push (@{$grand_poobah{$dir_key}{$author}{$time}{$msg_txt}}, \%qunk);
816     }
817     }
818    
819     CLEAR:
820     # Make way for the next message
821     undef $msg_txt;
822     undef $time;
823     undef $revision;
824     undef $author;
825     undef @branch_roots;
826    
827     # Maybe even make way for the next file:
828     if ($detected_file_separator) {
829     undef $file_full_path;
830     undef %branch_names;
831     undef %branch_numbers;
832     undef %symbolic_names;
833     }
834     }
835    
836     close (LOG_SOURCE);
837    
838     ### Process each ChangeLog
839    
840     while (my ($dir,$authorhash) = each %grand_poobah)
841     {
842     &debug ("DOING DIR: $dir\n");
843    
844     # Here we twist our hash around, from being
845     # author => time => message => filelist
846     # in %$authorhash to
847     # time => author => message => filelist
848     # in %changelog.
849     #
850     # This is also where we merge entries. The algorithm proceeds
851     # through the timeline of the changelog with a sliding window of
852     # $Max_Checkin_Duration seconds; within that window, entries that
853     # have the same log message are merged.
854     #
855     # (To save space, we zap %$authorhash after we've copied
856     # everything out of it.)
857    
858     my %changelog;
859     while (my ($author,$timehash) = each %$authorhash)
860     {
861     my $lasttime;
862     my %stamptime;
863     foreach my $time (sort {$main::a <=> $main::b} (keys %$timehash))
864     {
865     my $msghash = $timehash->{$time};
866     while (my ($msg,$qunklist) = each %$msghash)
867     {
868     my $stamptime = $stamptime{$msg};
869     if ((defined $stamptime)
870     and (($time - $stamptime) < $Max_Checkin_Duration)
871     and (defined $changelog{$stamptime}{$author}{$msg}))
872     {
873     push(@{$changelog{$stamptime}{$author}{$msg}}, @$qunklist);
874     }
875     else {
876     $changelog{$time}{$author}{$msg} = $qunklist;
877     $stamptime{$msg} = $time;
878     }
879     }
880     }
881     }
882     undef (%$authorhash);
883    
884     ### Now we can write out the ChangeLog!
885    
886     my ($logfile_here, $logfile_bak, $tmpfile);
887    
888     if (! $Output_To_Stdout) {
889     $logfile_here = $dir . $Log_File_Name;
890     $logfile_here =~ s/^\.\/\//\//; # fix any leading ".//" problem
891     $tmpfile = "${logfile_here}.cvs2cl$$.tmp";
892     $logfile_bak = "${logfile_here}.bak";
893    
894     open (LOG_OUT, ">$tmpfile") or die "Unable to open \"$tmpfile\"";
895     }
896     else {
897     open (LOG_OUT, ">-") or die "Unable to open stdout for writing";
898     }
899    
900     print LOG_OUT $ChangeLog_Header;
901    
902     if ($XML_Output) {
903     my $encoding =
904     length $XML_Encoding ? qq'encoding="$XML_Encoding"' : '';
905     my $version = 'version="1.0"';
906     my $declaration =
907     sprintf '<?xml %s?>', join ' ', grep length, $version, $encoding;
908     my $root =
909     '<changelog xmlns="http://www.red-bean.com/xmlns/cvs2cl/">';
910     print LOG_OUT "$declaration\n\n$root\n\n";
911     }
912    
913     my @key_list = ();
914     if($Chronological_Order) {
915     @key_list = sort {$main::a <=> $main::b} (keys %changelog);
916     } else {
917     @key_list = sort {$main::b <=> $main::a} (keys %changelog);
918     }
919     foreach my $time (@key_list)
920     {
921     next if ($Delta_Mode &&
922     (($time <= $Delta_StartTime) ||
923     ($time > $Delta_EndTime && $Delta_EndTime)));
924    
925     # Set up the date/author line.
926     # kff todo: do some more XML munging here, on the header
927     # part of the entry:
928     my ($ignore,$min,$hour,$mday,$mon,$year,$wday)
929     = $UTC_Times ? gmtime($time) : localtime($time);
930    
931     # XML output includes everything else, we might as well make
932     # it always include Day Of Week too, for consistency.
933     if ($Show_Day_Of_Week or $XML_Output) {
934     $wday = ("Sunday", "Monday", "Tuesday", "Wednesday",
935     "Thursday", "Friday", "Saturday")[$wday];
936     $wday = ($XML_Output) ? "<weekday>${wday}</weekday>\n" : " $wday";
937     }
938     else {
939     $wday = "";
940     }
941    
942     my $authorhash = $changelog{$time};
943     if ($Show_Tag_Dates) {
944     my %tags;
945     while (my ($author,$mesghash) = each %$authorhash) {
946     while (my ($msg,$qunk) = each %$mesghash) {
947     foreach my $qunkref2 (@$qunk) {
948     if (defined ($$qunkref2{'tags'})) {
949     foreach my $tag (@{$$qunkref2{'tags'}}) {
950     $tags{$tag} = 1;
951     }
952     }
953     }
954     }
955     }
956     foreach my $tag (keys %tags) {
957     if (!defined $tag_date_printed{$tag}) {
958     $tag_date_printed{$tag} = $time;
959     if ($XML_Output) {
960     # NOT YET DONE
961     }
962     else {
963     if ($Show_Times) {
964     printf LOG_OUT ("%4u-%02u-%02u${wday} %02u:%02u tag %s\n\n",
965     $year+1900, $mon+1, $mday, $hour, $min, $tag);
966     } else {
967     printf LOG_OUT ("%4u-%02u-%02u${wday} tag %s\n\n",
968     $year+1900, $mon+1, $mday, $tag);
969     }
970     }
971     }
972     }
973     }
974     while (my ($author,$mesghash) = each %$authorhash)
975     {
976     # If XML, escape in outer loop to avoid compound quoting:
977     if ($XML_Output) {
978     $author = &xml_escape ($author);
979     }
980    
981     FOOBIE:
982     # We sort here to enable predictable ordering for the testing porpoises
983     for my $msg (sort keys %$mesghash)
984     {
985     my $qunklist = $mesghash->{$msg};
986    
987     ## MJP: 19.xii.01 : Exclude @ignore_tags
988     for my $ignore_tag (keys %ignore_tags) {
989     next FOOBIE
990     if grep($_ eq $ignore_tag, map(@{$_->{tags}},
991     grep(defined $_->{tags},
992     @$qunklist)));
993     }
994     ## MJP: 19.xii.01 : End exclude @ignore_tags
995    
996     # show only files with tag --show-tag $show_tag
997     if ( keys %show_tags ) {
998     next FOOBIE
999     if !grep(exists $show_tags{$_}, map(@{$_->{tags}},
1000     grep(defined $_->{tags},
1001     @$qunklist)));
1002     }
1003    
1004     my $files = &pretty_file_list ($qunklist);
1005     my $header_line; # date and author
1006     my $body; # see below
1007     my $wholething; # $header_line + $body
1008    
1009     if ($XML_Output) {
1010     $header_line =
1011     sprintf ("<date>%4u-%02u-%02u</date>\n"
1012     . "${wday}"
1013     . "<time>%02u:%02u</time>\n"
1014     . "<author>%s</author>\n",
1015     $year+1900, $mon+1, $mday, $hour, $min, $author);
1016     }
1017     else {
1018     if ($Show_Times) {
1019     $header_line =
1020     sprintf ("%4u-%02u-%02u${wday} %02u:%02u %s\n\n",
1021     $year+1900, $mon+1, $mday, $hour, $min, $author);
1022     } else {
1023     $header_line =
1024     sprintf ("%4u-%02u-%02u${wday} %s\n\n",
1025     $year+1900, $mon+1, $mday, $author);
1026     }
1027     }
1028    
1029     $Text::Wrap::huge = 'overflow'
1030     if $Text::Wrap::VERSION >= 2001.0130;
1031     # Reshape the body according to user preferences.
1032     if ($XML_Output)
1033     {
1034     $msg = &preprocess_msg_text ($msg);
1035     $body = $files . $msg;
1036     }
1037     elsif ($No_Wrap && !$Summary)
1038     {
1039     $msg = &preprocess_msg_text ($msg);
1040     $files = wrap ("\t", "\t", "* $files");
1041     $msg =~ s/\n(.+)/\n\t$1/g;
1042     unless ($After_Header eq " ") {
1043     $msg =~ s/^(.+)/\t$1/g;
1044     }
1045     $body = $files . $After_Header . $msg;
1046     }
1047     elsif ($Summary)
1048     {
1049     my( $filelist, $qunk );
1050     my( @DeletedQunks, @AddedQunks, @ChangedQunks );
1051    
1052     $msg = &preprocess_msg_text ($msg);
1053     #
1054     # Sort the files (qunks) according to the operation that was
1055     # performed. Files which were added have no line change
1056     # indicator, whereas deleted files have state dead.
1057     #
1058     foreach $qunk ( @$qunklist )
1059     {
1060     if ( "dead" eq $qunk->{'state'})
1061     {
1062     push( @DeletedQunks, $qunk );
1063     }
1064     elsif ( !exists( $qunk->{'lines'}))
1065     {
1066     push( @AddedQunks, $qunk );
1067     }
1068     else
1069     {
1070     push( @ChangedQunks, $qunk );
1071     }
1072     }
1073     #
1074     # The qunks list was originally in tree search order. Let's
1075     # get that back. The lists, if they exist, will be reversed upon
1076     # processing.
1077     #
1078    
1079     #
1080     # Now write the three sections onto $filelist
1081     #
1082     if ( @DeletedQunks )
1083     {
1084     $filelist .= "\tDeleted:\n";
1085     foreach $qunk ( @DeletedQunks )
1086     {
1087     $filelist .= "\t\t" . $qunk->{'filename'};
1088     $filelist .= " (" . $qunk->{'revision'} . ")";
1089     $filelist .= "\n";
1090     }
1091     undef( @DeletedQunks );
1092     }
1093     if ( @AddedQunks )
1094     {
1095     $filelist .= "\tAdded:\n";
1096     foreach $qunk ( @AddedQunks )
1097     {
1098     $filelist .= "\t\t" . $qunk->{'filename'};
1099     $filelist .= " (" . $qunk->{'revision'} . ")";
1100     $filelist .= "\n";
1101     }
1102     undef( @AddedQunks );
1103     }
1104     if ( @ChangedQunks )
1105     {
1106     $filelist .= "\tChanged:\n";
1107     foreach $qunk ( @ChangedQunks )
1108     {
1109     $filelist .= "\t\t" . $qunk->{'filename'};
1110     $filelist .= " (" . $qunk->{'revision'} . ")";
1111     $filelist .= ", \"" . $qunk->{'state'} . "\"";
1112     $filelist .= ", lines: " . $qunk->{'lines'};
1113     $filelist .= "\n";
1114     }
1115     undef( @ChangedQunks );
1116     }
1117     chomp( $filelist );
1118     $msg =~ s/\n(.*)/\n\t$1/g;
1119     unless ($After_Header eq " ") {
1120     $msg =~ s/^(.*)/\t$1/g;
1121     }
1122     $body = $filelist . $After_Header . $msg;
1123     }
1124     else # do wrapping, either FSF-style or regular
1125     {
1126     if ($FSF_Style)
1127     {
1128     $files = wrap ("\t", "\t", "* $files");
1129    
1130     my $files_last_line_len = 0;
1131     if ($After_Header eq " ")
1132     {
1133     $files_last_line_len = &last_line_len ($files);
1134     $files_last_line_len += 1; # for $After_Header
1135     }
1136    
1137     $msg = &wrap_log_entry
1138     ($msg, "\t", 69 - $files_last_line_len, 69);
1139     $body = $files . $After_Header . $msg;
1140     }
1141     else # not FSF-style
1142     {
1143     $msg = &preprocess_msg_text ($msg);
1144     $body = $files . $After_Header . $msg;
1145     $body = wrap ("\t", "\t ", "* $body");
1146     $body =~ s/[ \t]+\n/\n/g;
1147     }
1148     }
1149    
1150     $body =~ s/[ \t]+\n/\n/g;
1151     $wholething = $header_line . $body;
1152    
1153     if ($XML_Output) {
1154     $wholething = "<entry>\n${wholething}</entry>\n";
1155     }
1156    
1157     # One last check: make sure it passes the regexp test, if the
1158     # user asked for that. We have to do it here, so that the
1159     # test can match against information in the header as well
1160     # as in the text of the log message.
1161    
1162     # How annoying to duplicate so much code just because I
1163     # can't figure out a way to evaluate scalars on the trailing
1164     # operator portion of a regular expression. Grrr.
1165     if ($Case_Insensitive) {
1166     unless ($Regexp_Gate && ($wholething !~ /$Regexp_Gate/oi)) {
1167     print LOG_OUT "${wholething}\n";
1168     }
1169     }
1170     else {
1171     unless ($Regexp_Gate && ($wholething !~ /$Regexp_Gate/o)) {
1172     print LOG_OUT "${wholething}\n";
1173     }
1174     }
1175     }
1176     }
1177     }
1178    
1179     if ($XML_Output) {
1180     print LOG_OUT "</changelog>\n";
1181     }
1182    
1183     close (LOG_OUT);
1184    
1185     if (! $Output_To_Stdout)
1186     {
1187     # If accumulating, append old data to new before renaming. But
1188     # don't append the most recent entry, since it's already in the
1189     # new log due to CVS's idiosyncratic interpretation of "log -d".
1190     if ($Cumulative && -f $logfile_here)
1191     {
1192     open (NEW_LOG, ">>$tmpfile")
1193     or die "trouble appending to $tmpfile ($!)";
1194    
1195     open (OLD_LOG, "<$logfile_here")
1196     or die "trouble reading from $logfile_here ($!)";
1197    
1198     my $started_first_entry = 0;
1199     my $passed_first_entry = 0;
1200     while (<OLD_LOG>)
1201     {
1202     if (! $passed_first_entry)
1203     {
1204     if ((! $started_first_entry)
1205     && /^(\d\d\d\d-\d\d-\d\d\s+\d\d:\d\d)/) {
1206     $started_first_entry = 1;
1207     }
1208     elsif (/^(\d\d\d\d-\d\d-\d\d\s+\d\d:\d\d)/) {
1209     $passed_first_entry = 1;
1210     print NEW_LOG $_;
1211     }
1212     }
1213     else {
1214     print NEW_LOG $_;
1215     }
1216     }
1217    
1218     close (NEW_LOG);
1219     close (OLD_LOG);
1220     }
1221    
1222     if (-f $logfile_here) {
1223     rename ($logfile_here, $logfile_bak);
1224     }
1225     rename ($tmpfile, $logfile_here);
1226     }
1227     }
1228     }
1229    
1230     sub parse_date_author_and_state ()
1231     {
1232     # Parses the date/time and author out of a line like:
1233     #
1234     # date: 1999/02/19 23:29:05; author: apharris; state: Exp;
1235    
1236     my $line = shift;
1237    
1238     my ($year, $mon, $mday, $hours, $min, $secs, $author, $state, $rest) =
1239     $line =~
1240     m#(\d+)/(\d+)/(\d+)\s+(\d+):(\d+):(\d+);\s+author:\s+([^;]+);\s+state:\s+([^;]+);(.*)#
1241     or die "Couldn't parse date ``$line''";
1242     die "Bad date or Y2K issues" unless ($year > 1969 and $year < 2258);
1243     # Kinda arbitrary, but useful as a sanity check
1244     my $time = timegm($secs,$min,$hours,$mday,$mon-1,$year-1900);
1245     my $lines;
1246     if ( $rest =~ m#\s+lines:\s+(.*)# )
1247     {
1248     $lines =$1;
1249     }
1250     return ($time, $author, $state, $lines);
1251     }
1252    
1253     # Here we take a bunch of qunks and convert them into printed
1254     # summary that will include all the information the user asked for.
1255     sub pretty_file_list ()
1256     {
1257     if ($Hide_Filenames and (! $XML_Output)) {
1258     return "";
1259     }
1260    
1261     my $qunksref = shift;
1262    
1263     my @qunkrefs =
1264     grep +((! exists $_->{'tags'} or
1265     ! grep exists $ignore_tags{$_}, @{$_->{'tags'}}) and
1266     (! keys %show_tags or
1267     (exists $_->{'tags'} and
1268     grep exists $show_tags{$_}, @{$_->{'tags'}}))
1269     ),
1270     @$qunksref;
1271     my @filenames;
1272     my $beauty = ""; # The accumulating header string for this entry.
1273     my %non_unanimous_tags; # Tags found in a proper subset of qunks
1274     my %unanimous_tags; # Tags found in all qunks
1275     my %all_branches; # Branches found in any qunk
1276     my $common_dir = undef; # Dir prefix common to all files ("" if none)
1277     my $fbegun = 0; # Did we begin printing filenames yet?
1278    
1279     # First, loop over the qunks gathering all the tag/branch names.
1280     # We'll put them all in non_unanimous_tags, and take out the
1281     # unanimous ones later.
1282     QUNKREF:
1283     foreach my $qunkref (@qunkrefs)
1284     {
1285     # Keep track of whether all the files in this commit were in the
1286     # same directory, and memorize it if so. We can make the output a
1287     # little more compact by mentioning the directory only once.
1288     if ($Common_Dir && (scalar (@qunkrefs)) > 1)
1289     {
1290     if (! (defined ($common_dir)))
1291     {
1292     my ($base, $dir);
1293     ($base, $dir, undef) = fileparse ($$qunkref{'filename'});
1294    
1295     if ((! (defined ($dir))) # this first case is sheer paranoia
1296     or ($dir eq "")
1297     or ($dir eq "./")
1298     or ($dir eq ".\\"))
1299     {
1300     $common_dir = "";
1301     }
1302     else
1303     {
1304     $common_dir = $dir;
1305     }
1306     }
1307     elsif ($common_dir ne "")
1308     {
1309     # Already have a common dir prefix, so how much of it can we preserve?
1310     $common_dir = &common_path_prefix ($$qunkref{'filename'}, $common_dir);
1311     }
1312     }
1313     else # only one file in this entry anyway, so common dir not an issue
1314     {
1315     $common_dir = "";
1316     }
1317    
1318     if (defined ($$qunkref{'branch'})) {
1319     $all_branches{$$qunkref{'branch'}} = 1;
1320     }
1321     if (defined ($$qunkref{'tags'})) {
1322     foreach my $tag (@{$$qunkref{'tags'}}) {
1323     $non_unanimous_tags{$tag} = 1;
1324     }
1325     }
1326     }
1327    
1328     # Any tag held by all qunks will be printed specially... but only if
1329     # there are multiple qunks in the first place!
1330     if ((scalar (@qunkrefs)) > 1) {
1331     foreach my $tag (keys (%non_unanimous_tags)) {
1332     my $everyone_has_this_tag = 1;
1333     foreach my $qunkref (@qunkrefs) {
1334     if ((! (defined ($$qunkref{'tags'})))
1335     or (! (grep ($_ eq $tag, @{$$qunkref{'tags'}})))) {
1336     $everyone_has_this_tag = 0;
1337     }
1338     }
1339     if ($everyone_has_this_tag) {
1340     $unanimous_tags{$tag} = 1;
1341     delete $non_unanimous_tags{$tag};
1342     }
1343     }
1344     }
1345    
1346     if ($XML_Output)
1347     {
1348     # If outputting XML, then our task is pretty simple, because we
1349     # don't have to detect common dir, common tags, branch prefixing,
1350     # etc. We just output exactly what we have, and don't worry about
1351     # redundancy or readability.
1352    
1353     foreach my $qunkref (@qunkrefs)
1354     {
1355     my $filename = $$qunkref{'filename'};
1356     my $cvsstate = $$qunkref{'cvsstate'};
1357     my $revision = $$qunkref{'revision'};
1358     my $tags = $$qunkref{'tags'};
1359     my $branch = $$qunkref{'branch'};
1360     my $branchroots = $$qunkref{'branchroots'};
1361    
1362     $filename = &xml_escape ($filename); # probably paranoia
1363     $revision = &xml_escape ($revision); # definitely paranoia
1364    
1365     $beauty .= "<file>\n";
1366     $beauty .= "<name>${filename}</name>\n";
1367     $beauty .= "<cvsstate>${cvsstate}</cvsstate>\n";
1368     $beauty .= "<revision>${revision}</revision>\n";
1369     if ($branch) {
1370     $branch = &xml_escape ($branch); # more paranoia
1371     $beauty .= "<branch>${branch}</branch>\n";
1372     }
1373     foreach my $tag (@$tags) {
1374     $tag = &xml_escape ($tag); # by now you're used to the paranoia
1375     $beauty .= "<tag>${tag}</tag>\n";
1376     }
1377     foreach my $root (@$branchroots) {
1378     $root = &xml_escape ($root); # which is good, because it will continue
1379     $beauty .= "<branchroot>${root}</branchroot>\n";
1380     }
1381     $beauty .= "</file>\n";
1382     }
1383    
1384     # Theoretically, we could go home now. But as long as we're here,
1385     # let's print out the common_dir and utags, as a convenience to
1386     # the receiver (after all, earlier code calculated that stuff
1387     # anyway, so we might as well take advantage of it).
1388    
1389     if ((scalar (keys (%unanimous_tags))) > 1) {
1390     foreach my $utag ((keys (%unanimous_tags))) {
1391     $utag = &xml_escape ($utag); # the usual paranoia
1392     $beauty .= "<utag>${utag}</utag>\n";
1393     }
1394     }
1395     if ($common_dir) {
1396     $common_dir = &xml_escape ($common_dir);
1397     $beauty .= "<commondir>${common_dir}</commondir>\n";
1398     }
1399    
1400     # That's enough for XML, time to go home:
1401     return $beauty;
1402     }
1403    
1404     # Else not XML output, so complexly compactify for chordate
1405     # consumption. At this point we have enough global information
1406     # about all the qunks to organize them non-redundantly for output.
1407    
1408     if ($common_dir) {
1409     # Note that $common_dir still has its trailing slash
1410     $beauty .= "$common_dir: ";
1411     }
1412    
1413     if ($Show_Branches)
1414     {
1415     # For trailing revision numbers.
1416     my @brevisions;
1417    
1418     foreach my $branch (keys (%all_branches))
1419     {
1420     foreach my $qunkref (@qunkrefs)
1421     {
1422     if ((defined ($$qunkref{'branch'}))
1423     and ($$qunkref{'branch'} eq $branch))
1424     {
1425     if ($fbegun) {
1426     # kff todo: comma-delimited in XML too? Sure.
1427     $beauty .= ", ";
1428     }
1429     else {
1430     $fbegun = 1;
1431     }
1432     my $fname = substr ($$qunkref{'filename'}, length ($common_dir));
1433     $beauty .= $fname;
1434     $$qunkref{'printed'} = 1; # Just setting a mark bit, basically
1435    
1436     if ($Show_Tags && (defined @{$$qunkref{'tags'}})) {
1437     my @tags = grep ($non_unanimous_tags{$_}, @{$$qunkref{'tags'}});
1438    
1439     if (@tags) {
1440     $beauty .= " (tags: ";
1441     $beauty .= join (', ', @tags);
1442     $beauty .= ")";
1443     }
1444     }
1445    
1446     if ($Show_Revisions) {
1447     # Collect the revision numbers' last components, but don't
1448     # print them -- they'll get printed with the branch name
1449     # later.
1450     $$qunkref{'revision'} =~ /.+\.([\d]+)$/;
1451     push (@brevisions, $1);
1452    
1453     # todo: we're still collecting branch roots, but we're not
1454     # showing them anywhere. If we do show them, it would be
1455     # nifty to just call them revision "0" on a the branch.
1456     # Yeah, that's the ticket.
1457     }
1458     }
1459     }
1460     $beauty .= " ($branch";
1461     if (@brevisions) {
1462     if ((scalar (@brevisions)) > 1) {
1463     $beauty .= ".[";
1464     $beauty .= (join (',', @brevisions));
1465     $beauty .= "]";
1466     }
1467     else {
1468     # Square brackets are spurious here, since there's no range to
1469     # encapsulate
1470     $beauty .= ".$brevisions[0]";
1471     }
1472     }
1473     $beauty .= ")";
1474     }
1475     }
1476    
1477     # Okay; any qunks that were done according to branch are taken care
1478     # of, and marked as printed. Now print everyone else.
1479    
1480     my %fileinfo_printed;
1481     foreach my $qunkref (@qunkrefs)
1482     {
1483     next if (defined ($$qunkref{'printed'})); # skip if already printed
1484    
1485     my $b = substr ($$qunkref{'filename'}, length ($common_dir));
1486     # todo: Shlomo's change was this:
1487     # $beauty .= substr ($$qunkref{'filename'},
1488     # (($common_dir eq "./") ? "" : length ($common_dir)));
1489     $$qunkref{'printed'} = 1; # Set a mark bit.
1490    
1491     if ($Show_Revisions || $Show_Tags || $Show_Dead)
1492     {
1493     my $started_addendum = 0;
1494    
1495     if ($Show_Revisions) {
1496     $started_addendum = 1;
1497     $b .= " (";
1498     $b .= "$$qunkref{'revision'}";
1499     }
1500     if ($Show_Dead && $$qunkref{'cvsstate'} =~ /dead/)
1501     {
1502     # Deliberately not using $started_addendum. Keeping it simple.
1503     $b .= "[DEAD]";
1504     }
1505     if ($Show_Tags && (defined $$qunkref{'tags'})) {
1506     my @tags = grep ($non_unanimous_tags{$_}, @{$$qunkref{'tags'}});
1507     if ((scalar (@tags)) > 0) {
1508     if ($started_addendum) {
1509     $b .= ", ";
1510     }
1511     else {
1512     $b .= " (tags: ";
1513     }
1514     $b .= join (', ', @tags);
1515     $started_addendum = 1;
1516     }
1517     }
1518     if ($started_addendum) {
1519     $b .= ")";
1520     }
1521     }
1522    
1523     unless ( exists $fileinfo_printed{$b} ) {
1524     if ($fbegun) {
1525     $beauty .= ", ";
1526     } else {
1527     $fbegun = 1;
1528     }
1529     $beauty .= $b, $fileinfo_printed{$b} = 1;
1530     }
1531     }
1532    
1533     # Unanimous tags always come last.
1534     if ($Show_Tags && %unanimous_tags)
1535     {
1536     $beauty .= " (utags: ";
1537     $beauty .= join (', ', sort keys (%unanimous_tags));
1538     $beauty .= ")";
1539     }
1540    
1541     # todo: still have to take care of branch_roots?
1542    
1543     $beauty = "$beauty:";
1544    
1545     return $beauty;
1546     }
1547    
1548     sub min ($$) { $_[0] < $_[1] ? $_[0] : $_[1] }
1549    
1550     sub common_path_prefix ($$)
1551     {
1552     my ($path1, $path2) = @_;
1553    
1554     # For compatibility (with older versions of cvs2cl.pl), we think in UN*X
1555     # terms, and mould windoze filenames to match. Is this really appropriate?
1556     # If a file is checked in under UN*X, and cvs log run on windoze, which way
1557     # do the path separators slope? Can we use fileparse as per the local
1558     # conventions? If so, we should probably have a user option to specify an
1559     # OS to emulate to handle stdin-fed logs. If we did this, we could avoid
1560     # the nasty \-/ transmogrification below.
1561    
1562     my ($dir1, $dir2) = map +(fileparse($_))[1], $path1, $path2;
1563    
1564     # Transmogrify Windows filenames to look like Unix.
1565     # (It is far more likely that someone is running cvs2cl.pl under
1566     # Windows than that they would genuinely have backslashes in their
1567     # filenames.)
1568     tr!\\!/!
1569     for $dir1, $dir2;
1570    
1571     my ($accum1, $accum2, $last_common_prefix) = ('') x 3;
1572    
1573     my @path1 = grep length($_), split qr!/!, $dir1;
1574     my @path2 = grep length($_), split qr!/!, $dir2;
1575    
1576     my @common_path;
1577     for (0..min($#path1,$#path2)) {
1578     if ( $path1[$_] eq $path2[$_]) {
1579     push @common_path, $path1[$_];
1580     } else {
1581     last;
1582     }
1583     }
1584    
1585     return join '', map "$_/", @common_path;
1586     }
1587    
1588     sub preprocess_msg_text ()
1589     {
1590     my $text = shift;
1591    
1592     # Strip out carriage returns (as they probably result from DOSsy editors).
1593     $text =~ s/\r\n/\n/g;
1594    
1595     # If it *looks* like two newlines, make it *be* two newlines:
1596     $text =~ s/\n\s*\n/\n\n/g;
1597    
1598     if ($XML_Output)
1599     {
1600     $text = &xml_escape ($text);
1601     $text = "<msg>${text}</msg>\n";
1602     }
1603     elsif (! $No_Wrap)
1604     {
1605     # Strip off lone newlines, but only for lines that don't begin with
1606     # whitespace or a mail-quoting character, since we want to preserve
1607     # that kind of formatting. Also don't strip newlines that follow a
1608     # period; we handle those specially next. And don't strip
1609     # newlines that precede an open paren.
1610     1 while ($text =~ s/(^|\n)([^>\s].*[^.\n])\n([^>\n])/$1$2 $3/g);
1611    
1612     # If a newline follows a period, make sure that when we bring up the
1613     # bottom sentence, it begins with two spaces.
1614     1 while ($text =~ s/(^|\n)([^>\s].*)\n([^>\n])/$1$2 $3/g);
1615     }
1616    
1617     return $text;
1618     }
1619    
1620     sub last_line_len ()
1621     {
1622     my $files_list = shift;
1623     my @lines = split (/\n/, $files_list);
1624     my $last_line = pop (@lines);
1625     return length ($last_line);
1626     }
1627    
1628     # A custom wrap function, sensitive to some common constructs used in
1629     # log entries.
1630     sub wrap_log_entry ()
1631     {
1632     my $text = shift; # The text to wrap.
1633     my $left_pad_str = shift; # String to pad with on the left.
1634    
1635     # These do NOT take left_pad_str into account:
1636     my $length_remaining = shift; # Amount left on current line.
1637     my $max_line_length = shift; # Amount left for a blank line.
1638    
1639     my $wrapped_text = ""; # The accumulating wrapped entry.
1640     my $user_indent = ""; # Inherited user_indent from prev line.
1641    
1642     my $first_time = 1; # First iteration of the loop?
1643     my $suppress_line_start_match = 0; # Set to disable line start checks.
1644    
1645     my @lines = split (/\n/, $text);
1646     while (@lines) # Don't use `foreach' here, it won't work.
1647     {
1648     my $this_line = shift (@lines);
1649     chomp $this_line;
1650    
1651     if ($this_line =~ /^(\s+)/) {
1652     $user_indent = $1;
1653     }
1654     else {
1655     $user_indent = "";
1656     }
1657    
1658     # If it matches any of the line-start regexps, print a newline now...
1659     if ($suppress_line_start_match)
1660     {
1661     $suppress_line_start_match = 0;
1662     }
1663     elsif (($this_line =~ /^(\s*)\*\s+[a-zA-Z0-9]/)
1664     || ($this_line =~ /^(\s*)\* [a-zA-Z0-9_\.\/\+-]+/)
1665     || ($this_line =~ /^(\s*)\([a-zA-Z0-9_\.\/\+-]+(\)|,\s*)/)
1666     || ($this_line =~ /^(\s+)(\S+)/)
1667     || ($this_line =~ /^(\s*)- +/)
1668     || ($this_line =~ /^()\s*$/)
1669     || ($this_line =~ /^(\s*)\*\) +/)
1670     || ($this_line =~ /^(\s*)[a-zA-Z0-9](\)|\.|\:) +/))
1671     {
1672     # Make a line break immediately, unless header separator is set
1673     # and this line is the first line in the entry, in which case
1674     # we're getting the blank line for free already and shouldn't
1675     # add an extra one.
1676     unless (($After_Header ne " ") and ($first_time))
1677     {
1678     if ($this_line =~ /^()\s*$/) {
1679     $suppress_line_start_match = 1;
1680     $wrapped_text .= "\n${left_pad_str}";
1681     }
1682    
1683     $wrapped_text .= "\n${left_pad_str}";
1684     }
1685    
1686     $length_remaining = $max_line_length - (length ($user_indent));
1687     }
1688    
1689     # Now that any user_indent has been preserved, strip off leading
1690     # whitespace, so up-folding has no ugly side-effects.
1691     $this_line =~ s/^\s*//;
1692    
1693     # Accumulate the line, and adjust parameters for next line.
1694     my $this_len = length ($this_line);
1695     if ($this_len == 0)
1696     {
1697     # Blank lines should cancel any user_indent level.
1698     $user_indent = "";
1699     $length_remaining = $max_line_length;
1700     }
1701     elsif ($this_len >= $length_remaining) # Line too long, try breaking it.
1702     {
1703     # Walk backwards from the end. At first acceptable spot, break
1704     # a new line.
1705     my $idx = $length_remaining - 1;
1706     if ($idx < 0) { $idx = 0 };
1707     while ($idx > 0)
1708     {
1709     if (substr ($this_line, $idx, 1) =~ /\s/)
1710     {
1711     my $line_now = substr ($this_line, 0, $idx);
1712     my $next_line = substr ($this_line, $idx);
1713     $this_line = $line_now;
1714    
1715     # Clean whitespace off the end.
1716     chomp $this_line;
1717    
1718     # The current line is ready to be printed.
1719     $this_line .= "\n${left_pad_str}";
1720    
1721     # Make sure the next line is allowed full room.
1722     $length_remaining = $max_line_length - (length ($user_indent));
1723    
1724     # Strip next_line, but then preserve any user_indent.
1725     $next_line =~ s/^\s*//;
1726    
1727     # Sneak a peek at the user_indent of the upcoming line, so
1728     # $next_line (which will now precede it) can inherit that
1729     # indent level. Otherwise, use whatever user_indent level
1730     # we currently have, which might be none.
1731     my $next_next_line = shift (@lines);
1732     if ((defined ($next_next_line)) && ($next_next_line =~ /^(\s+)/)) {
1733     $next_line = $1 . $next_line if (defined ($1));
1734     # $length_remaining = $max_line_length - (length ($1));
1735     $next_next_line =~ s/^\s*//;
1736     }
1737     else {
1738     $next_line = $user_indent . $next_line;
1739     }
1740     if (defined ($next_next_line)) {
1741     unshift (@lines, $next_next_line);
1742     }
1743     unshift (@lines, $next_line);
1744    
1745     # Our new next line might, coincidentally, begin with one of
1746     # the line-start regexps, so we temporarily turn off
1747     # sensitivity to that until we're past the line.
1748     $suppress_line_start_match = 1;
1749    
1750     last;
1751     }
1752     else
1753     {
1754     $idx--;
1755     }
1756     }
1757    
1758     if ($idx == 0)
1759     {
1760     # We bottomed out because the line is longer than the
1761     # available space. But that could be because the space is
1762     # small, or because the line is longer than even the maximum
1763     # possible space. Handle both cases below.
1764    
1765     if ($length_remaining == ($max_line_length - (length ($user_indent))))
1766     {
1767     # The line is simply too long -- there is no hope of ever
1768     # breaking it nicely, so just insert it verbatim, with
1769     # appropriate padding.
1770     $this_line = "\n${left_pad_str}${this_line}";
1771     }
1772     else
1773     {
1774     # Can't break it here, but may be able to on the next round...
1775     unshift (@lines, $this_line);
1776     $length_remaining = $max_line_length - (length ($user_indent));
1777     $this_line = "\n${left_pad_str}";
1778     }
1779     }
1780     }
1781     else # $this_len < $length_remaining, so tack on what we can.
1782     {
1783     # Leave a note for the next iteration.
1784     $length_remaining = $length_remaining - $this_len;
1785    
1786     if ($this_line =~ /\.$/)
1787     {
1788     $this_line .= " ";
1789     $length_remaining -= 2;
1790     }
1791     else # not a sentence end
1792     {
1793     $this_line .= " ";
1794     $length_remaining -= 1;
1795     }
1796     }
1797    
1798     # Unconditionally indicate that loop has run at least once.
1799     $first_time = 0;
1800    
1801     $wrapped_text .= "${user_indent}${this_line}";
1802     }
1803    
1804     # One last bit of padding.
1805     $wrapped_text .= "\n";
1806    
1807     return $wrapped_text;
1808     }
1809    
1810     sub xml_escape ()
1811     {
1812     my $txt = shift;
1813     $txt =~ s/&/&amp;/g;
1814     $txt =~ s/</&lt;/g;
1815     $txt =~ s/>/&gt;/g;
1816     return $txt;
1817     }
1818    
1819     sub maybe_read_user_map_file ()
1820     {
1821     my %expansions;
1822    
1823     if ($User_Map_File)
1824     {
1825     open (MAPFILE, "<$User_Map_File")
1826     or die ("Unable to open $User_Map_File ($!)");
1827    
1828     while (<MAPFILE>)
1829     {
1830     next if /^\s*#/; # Skip comment lines.
1831     next if not /:/; # Skip lines without colons.
1832    
1833     # It is now safe to split on ':'.
1834     my ($username, $expansion) = split ':';
1835     chomp $expansion;
1836     $expansion =~ s/^'(.*)'$/$1/;
1837     $expansion =~ s/^"(.*)"$/$1/;
1838    
1839     # If it looks like the expansion has a real name already, then
1840     # we toss the username we got from CVS log. Otherwise, keep
1841     # it to use in combination with the email address.
1842    
1843     if ($expansion =~ /^\s*<{0,1}\S+@.*/) {
1844     # Also, add angle brackets if none present
1845     if (! ($expansion =~ /<\S+@\S+>/)) {
1846     $expansions{$username} = "$username <$expansion>";
1847     }
1848     else {
1849     $expansions{$username} = "$username $expansion";
1850     }
1851     }
1852     else {
1853     $expansions{$username} = $expansion;
1854     }
1855     } # fi ($User_Map_File)
1856    
1857     close (MAPFILE);
1858     }
1859    
1860     if (defined $User_Passwd_File)
1861     {
1862     if ( ! defined $Domain ) {
1863     if ( -e MAILNAME ) {
1864     chomp($Domain = slurp_file(MAILNAME));
1865     } else {
1866     MAILDOMAIN_CMD:
1867     for ([qw(hostname -d)], 'dnsdomainname', 'domainname') {
1868     my ($text, $exit, $sig, $core) = run_ext($_);
1869     if ( $exit == 0 && $sig == 0 && $core == 0 ) {
1870     chomp $text;
1871     if ( length $text ) {
1872     $Domain = $text;
1873     last MAILDOMAIN_CMD;
1874     }
1875     }
1876     }
1877     }
1878     }
1879    
1880     die "No mail domain found\n"
1881     unless defined $Domain;
1882    
1883     open (MAPFILE, "<$User_Passwd_File")
1884     or die ("Unable to open $User_Passwd_File ($!)");
1885     while (<MAPFILE>)
1886     {
1887     # all lines are valid
1888     my ($username, $pw, $uid, $gid, $gecos, $homedir, $shell) = split ':';
1889     my $expansion = '';
1890     ($expansion) = split (',', $gecos)
1891     if defined $gecos && length $gecos;
1892    
1893     my $mailname = $Domain eq '' ? $username : "$username\@$Domain";
1894     $expansions{$username} = "$expansion <$mailname>";
1895     }
1896     close (MAPFILE);
1897     }
1898    
1899     return %expansions;
1900     }
1901    
1902     sub parse_options ()
1903     {
1904     # Check this internally before setting the global variable.
1905     my $output_file;
1906    
1907     # If this gets set, we encountered unknown options and will exit at
1908     # the end of this subroutine.
1909     my $exit_with_admonishment = 0;
1910    
1911     my (@Global_Opts, @Local_Opts);
1912    
1913     while (my $arg = shift (@ARGV))
1914     {
1915     if ($arg =~ /^-h$|^-help$|^--help$|^--usage$|^-?$/) {
1916     $Print_Usage = 1;
1917     }
1918     elsif ($arg =~ /^--delta$/) {
1919     my $narg = shift(@ARGV) || die "$arg needs argument.\n";
1920     if ($narg =~ /^([A-Za-z][A-Za-z0-9_\-]*):([A-Za-z][A-Za-z0-9_\-]*)$/) {
1921     $Delta_From = $1;
1922     $Delta_To = $2;
1923     $Delta_Mode = 1;
1924     } else {
1925     die "--delta FROM_TAG:TO_TAG is what you meant to say.\n";
1926     }
1927     }
1928     elsif ($arg =~ /^--debug$/) { # unadvertised option, heh
1929     $Debug = 1;
1930     }
1931     elsif ($arg =~ /^--version$/) {
1932     $Print_Version = 1;
1933     }
1934     elsif ($arg =~ /^-g$|^--global-opts$/) {
1935     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1936     # Don't assume CVS is called "cvs" on the user's system:
1937     push @Global_Opts, $narg;
1938     $Log_Source_Command =~ s/(^\S*)/$1 $narg/;
1939     }
1940     elsif ($arg =~ /^-l$|^--log-opts$/) {
1941     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1942     push @Local_Opts, $narg;
1943     $Log_Source_Command .= " $narg";
1944     }
1945     elsif ($arg =~ /^-f$|^--file$/) {
1946     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1947     $output_file = $narg;
1948     }
1949     elsif ($arg =~ /^--accum$/) {
1950     $Cumulative = 1;
1951     }
1952     elsif ($arg =~ /^--update$/) {
1953     $Update = 1;
1954     }
1955     elsif ($arg =~ /^--fsf$/) {
1956     $FSF_Style = 1;
1957     }
1958     elsif ($arg =~ /^--FSF$/) {
1959     $Show_Times = 0;
1960     $Common_Dir = 0;
1961     }
1962     elsif ($arg =~ /^--rcs/) {
1963     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1964     $RCS_Root = $narg;
1965     $RCS_Mode = 1;
1966     }
1967     elsif ($arg =~ /^-U$|^--usermap$/) {
1968     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1969     $User_Map_File = $narg;
1970     }
1971     elsif ($arg =~ /^--gecos$/) {
1972     $Gecos = 1;
1973     }
1974     elsif ($arg =~ /^--domain$/) {
1975     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1976     $Domain = $narg;
1977     }
1978     elsif ($arg =~ /^--passwd$/) {
1979     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1980     $User_Passwd_File = $narg;
1981     }
1982     elsif ($arg =~ /^--mailname$/) {
1983     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1984     warn "--mailname is deprecated; please use --domain instead\n";
1985     $Domain = $narg;
1986     }
1987     elsif ($arg =~ /^-W$|^--window$/) {
1988     defined(my $narg = shift (@ARGV)) || die "$arg needs argument.\n";
1989     $Max_Checkin_Duration = $narg;
1990     }
1991     elsif ($arg =~ /^--chrono$/) {
1992     $Chronological_Order = 1;
1993     }
1994     elsif ($arg =~ /^-I$|^--ignore$/) {
1995     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1996     push (@Ignore_Files, $narg);
1997     }
1998     elsif ($arg =~ /^-C$|^--case-insensitive$/) {
1999     $Case_Insensitive = 1;
2000     }
2001     elsif ($arg =~ /^-R$|^--regexp$/) {
2002     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
2003     $Regexp_Gate = $narg;
2004     }
2005     elsif ($arg =~ /^--stdout$/) {
2006     $Output_To_Stdout = 1;
2007     }
2008     elsif ($arg =~ /^--version$/) {
2009     $Print_Version = 1;
2010     }
2011     elsif ($arg =~ /^-d$|^--distributed$/) {
2012     $Distributed = 1;
2013     }
2014     elsif ($arg =~ /^-P$|^--prune$/) {
2015     $Prune_Empty_Msgs = 1;
2016     }
2017     elsif ($arg =~ /^-S$|^--separate-header$/) {
2018     $After_Header = "\n\n";
2019     }
2020     elsif ($arg =~ /^--no-wrap$/) {
2021     $No_Wrap = 1;
2022     }
2023     elsif ($arg =~ /^--summary$/) {
2024     $Summary = 1;
2025     $After_Header = "\n\n"; # Summary implies --separate-header
2026     }
2027     elsif ($arg =~ /^--gmt$|^--utc$/) {
2028     $UTC_Times = 1;
2029     }
2030     elsif ($arg =~ /^-w$|^--day-of-week$/) {
2031     $Show_Day_Of_Week = 1;
2032     }
2033     elsif ($arg =~ /^--no-times$/) {
2034     $Show_Times = 0;
2035     }
2036     elsif ($arg =~ /^-r$|^--revisions$/) {
2037     $Show_Revisions = 1;
2038     }
2039     elsif ($arg =~ /^--show-dead$/) {
2040     $Show_Dead = 1;
2041     }
2042     elsif ($arg =~ /^--no-hide-branch-additions$/) {
2043     $Hide_Branch_Additions = 0;
2044     }
2045     elsif ($arg =~ /^-t$|^--tags$/) {
2046     $Show_Tags = 1;
2047     }
2048     elsif ($arg =~ /^-T$|^--tagdates$/) {
2049     $Show_Tag_Dates = 1;
2050     }
2051     elsif ($arg =~ /^-b$|^--branches$/) {
2052     $Show_Branches = 1;
2053     }
2054     elsif ($arg =~ /^-F$|^--follow$/) {
2055     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
2056     push (@Follow_Branches, $narg);
2057     }
2058     elsif ($arg =~ /^--stdin$/) {
2059     $Input_From_Stdin = 1;
2060     }
2061     elsif ($arg =~ /^--header$/) {
2062     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
2063     $ChangeLog_Header = &slurp_file ($narg);
2064     if (! defined ($ChangeLog_Header)) {
2065     $ChangeLog_Header = "";
2066     }
2067     }
2068     elsif ($arg =~ /^--xml-encoding$/) {
2069     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
2070     $XML_Encoding = $narg ;
2071     }
2072     elsif ($arg =~ /^--xml$/) {
2073     $XML_Output = 1;
2074     }
2075     elsif ($arg =~ /^--hide-filenames$/) {
2076     $Hide_Filenames = 1;
2077     $After_Header = "";
2078     }
2079     elsif ($arg =~ /^--no-common-dir$/) {
2080     $Common_Dir = 0;
2081     }
2082     elsif ($arg =~ /^--ignore-tag$/ ) {
2083     die "$arg needs argument.\n"
2084     unless @ARGV;
2085     $ignore_tags{shift @ARGV} = 1;
2086     }
2087     elsif ($arg =~ /^--show-tag$/ ) {
2088     die "$arg needs argument.\n"
2089     unless @ARGV;
2090     $show_tags{shift @ARGV} = 1;
2091     }
2092     elsif ( lc ($arg) eq '--test-code' ) {
2093     # Deliberately undocumented. This is not a public interface,
2094     # and may change/disappear at any time.
2095     die "$arg needs argument.\n"
2096     unless @ARGV;
2097     $TestCode = shift @ARGV;
2098     }
2099     elsif ($arg =~ /^--no-ancestors$/) {
2100     $No_Ancestors = 1;
2101     }
2102     else {
2103     # Just add a filename as argument to the log command
2104     $Log_Source_Command .= " '$arg'";
2105     }
2106     }
2107    
2108     ## Check for contradictions...
2109    
2110     if ($Output_To_Stdout && $Distributed) {
2111     print STDERR "cannot pass both --stdout and --distributed\n";
2112     $exit_with_admonishment = 1;
2113     }
2114    
2115     if ($Output_To_Stdout && $output_file) {
2116     print STDERR "cannot pass both --stdout and --file\n";
2117     $exit_with_admonishment = 1;
2118     }
2119    
2120     if ($Input_From_Stdin && @Global_Opts) {
2121     print STDERR "cannot pass both --stdin and -g\n";
2122     $exit_with_admonishment = 1;
2123     }
2124    
2125     if ($Input_From_Stdin && @Local_Opts) {
2126     print STDERR "cannot pass both --stdin and -l\n";
2127     $exit_with_admonishment = 1;
2128     }
2129    
2130     if ($XML_Output && $Cumulative) {
2131     print STDERR "cannot pass both --xml and --accum\n";
2132     $exit_with_admonishment = 1;
2133     }
2134    
2135     # Or if any other error message has already been printed out, we
2136     # just leave now:
2137     if ($exit_with_admonishment) {
2138     &usage ();
2139     exit (1);
2140     }
2141     elsif ($Print_Usage) {
2142     &usage ();
2143     exit (0);
2144     }
2145     elsif ($Print_Version) {
2146     &version ();
2147     exit (0);
2148     }
2149    
2150     ## Else no problems, so proceed.
2151    
2152     if ($output_file) {
2153     $Log_File_Name = $output_file;
2154     }
2155     }
2156    
2157     sub slurp_file ()
2158     {
2159     my $filename = shift || die ("no filename passed to slurp_file()");
2160     my $retstr;
2161    
2162     open (SLURPEE, "<${filename}") or die ("unable to open $filename ($!)");
2163     my $saved_sep = $/;
2164     undef $/;
2165     $retstr = <SLURPEE>;
2166     $/ = $saved_sep;
2167     close (SLURPEE);
2168     return $retstr;
2169     }
2170    
2171     sub debug ()
2172     {
2173     if ($Debug) {
2174     my $msg = shift;
2175     print STDERR $msg;
2176     }
2177     }
2178    
2179     sub version ()
2180     {
2181     print "cvs2cl.pl version ${VERSION}; distributed under the GNU GPL.\n";
2182     }
2183    
2184     sub usage ()
2185     {
2186     &version ();
2187     print <<'END_OF_INFO';
2188     Generate GNU-style ChangeLogs in CVS working copies.
2189    
2190     Notes about the output format(s):
2191    
2192     The default output of cvs2cl.pl is designed to be compact, formally
2193     unambiguous, but still easy for humans to read. It is largely
2194     self-explanatory, I hope; the one abbreviation that might not be
2195     obvious is "utags". That stands for "universal tags" -- a
2196     universal tag is one held by all the files in a given change entry.
2197    
2198     If you need output that's easy for a program to parse, use the
2199     --xml option. Note that with XML output, just about all available
2200     information is included with each change entry, whether you asked
2201     for it or not, on the theory that your parser can ignore anything
2202     it's not looking for.
2203    
2204     Notes about the options and arguments (the actual options are listed
2205     last in this usage message):
2206    
2207     * The -I and -F options may appear multiple times.
2208    
2209     * To follow trunk revisions, use "-F trunk" ("-F TRUNK" also works).
2210     This is okay because no would ever, ever be crazy enough to name a
2211     branch "trunk", right? Right.
2212    
2213     * For the -U option, the UFILE should be formatted like
2214     CVSROOT/users. That is, each line of UFILE looks like this
2215     jrandom:jrandom@red-bean.com
2216     or maybe even like this
2217     jrandom:'Jesse Q. Random <jrandom@red-bean.com>'
2218     Don't forget to quote the portion after the colon if necessary.
2219    
2220     * Many people want to filter by date. To do so, invoke cvs2cl.pl
2221     like this:
2222     cvs2cl.pl -l "-d'DATESPEC'"
2223     where DATESPEC is any date specification valid for "cvs log -d".
2224     (Note that CVS 1.10.7 and below requires there be no space between
2225     -d and its argument).
2226    
2227     Options/Arguments:
2228    
2229     -h, -help, --help, or -? Show this usage and exit
2230     --version Show version and exit
2231     -r, --revisions Show revision numbers in output
2232     -b, --branches Show branch names in revisions when possible
2233     -t, --tags Show tags (symbolic names) in output
2234     -T, --tagdates Show tags in output on their first occurance
2235     --show-dead Show dead files
2236     --stdin Read from stdin, don't run cvs log
2237     --stdout Output to stdout not to ChangeLog
2238     -d, --distributed Put ChangeLogs in subdirs
2239     -f FILE, --file FILE Write to FILE instead of "ChangeLog"
2240     --fsf Use this if log data is in FSF ChangeLog style
2241     --FSF Attempt strict FSF-standard compatible output
2242     -W SECS, --window SECS Window of time within which log entries unify
2243     -U UFILE, --usermap UFILE Expand usernames to email addresses from UFILE
2244     --passwd PASSWORDFILE Use system passwd file for user name expansion.
2245     If no mail domain is provided (via --domain),
2246     it tries to read one from /etc/mailname else
2247     output of
2248     hostname -d / dnsdomainname / domainname. Dies
2249     if none successful. Use a domain of '' to
2250     prevent the addition of a mail domain.
2251     --domain DOMAIN Domain to build email addresses from
2252     --gecos Get user information from GECOS data
2253     -R REGEXP, --regexp REGEXP Include only entries that match REGEXP
2254     -I REGEXP, --ignore REGEXP Ignore files whose names match REGEXP
2255     -C, --case-insensitive Any regexp matching is done case-insensitively
2256     -F BRANCH, --follow BRANCH Show only revisions on or ancestral to BRANCH
2257     --no-ancestors When using -F, only track changes since the
2258     BRANCH started
2259     --no-hide-branch-additions By default, entries generated by cvs for a file
2260     added on a branch (a dead 1.1 entry) are not
2261     shown. This flag reverses that action.
2262     -S, --separate-header Blank line between each header and log message
2263     --summary Add CVS change summary information
2264     --no-wrap Don't auto-wrap log message (recommend -S also)
2265     --gmt, --utc Show times in GMT/UTC instead of local time
2266     --accum Add to an existing ChangeLog (incompat w/ --xml)
2267     --update As --accum, but lists only files changed since
2268     last run
2269     -w, --day-of-week Show day of week
2270     --no-times Don't show times in output
2271     --chrono Output log in chronological order
2272     (default is reverse chronological order)
2273     --header FILE Get ChangeLog header from FILE ("-" means stdin)
2274     --xml Output XML instead of ChangeLog format
2275     --xml-encoding ENCODING Insert encoding clause in XML header
2276     --hide-filenames Don't show filenames (ignored for XML output)
2277     --no-common-dir Don't shorten directory names from filenames.
2278     --rcs CVSROOT Handle filenames from raw RCS, for instance
2279     those produced by "cvs rlog" output, stripping
2280     the prefix CVSROOT.
2281     -P, --prune Don't show empty log messages
2282     --ignore-tag TAG Ignore individual changes that are associated
2283     with a given tag. May be repeated, if so,
2284     changes that are associated with any of the
2285     given tags are ignored.
2286     --show-tag TAG Log only individual changes that are associated
2287     with a given tag. May be repeated, if so,
2288     changes that are associated with any of the
2289     given tags are logged.
2290     --delta FROM_TAG:TO_TAG Attempt a delta between two tags (since FROM_TAG
2291     up to & including TO_TAG). The algorithm is a
2292     simple date-based one (this is a *hard* problem)
2293     so results are imperfect
2294     -g OPTS, --global-opts OPTS Invoke like this "cvs OPTS log ..."
2295     -l OPTS, --log-opts OPTS Invoke like this "cvs ... log OPTS"
2296     FILE1 [FILE2 ...] Show only log information for the named FILE(s)
2297    
2298     See http://www.red-bean.com/cvs2cl for maintenance and bug info.
2299     END_OF_INFO
2300     }
2301    
2302     __END__
2303    
2304     =head1 NAME
2305    
2306     cvs2cl.pl - produces GNU-style ChangeLogs in CVS working copies, by
2307     running "cvs log" and parsing the output. Shared log entries are
2308     unified in an intuitive way.
2309    
2310     =head1 DESCRIPTION
2311    
2312     This script generates GNU-style ChangeLog files from CVS log
2313     information. Basic usage: just run it inside a working copy and a
2314     ChangeLog will appear. It requires repository access (i.e., 'cvs log'
2315     must work). Run "cvs2cl.pl --help" to see more advanced options.
2316    
2317     See http://www.red-bean.com/cvs2cl for updates, and for instructions
2318     on getting anonymous CVS access to this script.
2319    
2320     Maintainer: Karl Fogel <kfogel@red-bean.com>
2321     Please report bugs to <bug-cvs2cl@red-bean.com>.
2322    
2323     =head1 README
2324    
2325     This script generates GNU-style ChangeLog files from CVS log
2326     information. Basic usage: just run it inside a working copy and a
2327     ChangeLog will appear. It requires repository access (i.e., 'cvs log'
2328     must work). Run "cvs2cl.pl --help" to see more advanced options.
2329    
2330     See http://www.red-bean.com/cvs2cl for updates, and for instructions
2331     on getting anonymous CVS access to this script.
2332    
2333     Maintainer: Karl Fogel <kfogel@red-bean.com>
2334     Please report bugs to <bug-cvs2cl@red-bean.com>.
2335    
2336     =head1 PREREQUISITES
2337    
2338     This script requires C<Text::Wrap>, C<Time::Local>, and
2339     C<File::Basename>.
2340     It also seems to require C<Perl 5.004_04> or higher.
2341    
2342     =pod OSNAMES
2343    
2344     any
2345    
2346     =pod SCRIPT CATEGORIES
2347    
2348     Version_Control/CVS
2349    
2350     =cut
2351    
2352     -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*-
2353    
2354     Note about a bug-slash-opportunity:
2355     -----------------------------------
2356    
2357     There's a bug in Text::Wrap, which affects cvs2cl. This script
2358     reveals it:
2359    
2360     #!/usr/bin/perl -w
2361    
2362     use Text::Wrap;
2363    
2364     my $test_text =
2365     "This script demonstrates a bug in Text::Wrap. The very long line
2366     following this paragraph will be relocated relative to the surrounding
2367     text:
2368    
2369     ====================================================================
2370    
2371     See? When the bug happens, we'll get the line of equal signs below
2372     this paragraph, even though it should be above.";
2373    
2374     # Print out the test text with no wrapping:
2375     print "$test_text";
2376     print "\n";
2377     print "\n";
2378    
2379     # Now print it out wrapped, and see the bug:
2380     print wrap ("\t", " ", "$test_text");
2381     print "\n";
2382     print "\n";
2383    
2384     If the line of equal signs were one shorter, then the bug doesn't
2385     happen. Interesting.
2386    
2387     Anyway, rather than fix this in Text::Wrap, we might as well write a
2388     new wrap() which has the following much-needed features:
2389    
2390     * initial indentation, like current Text::Wrap()
2391     * subsequent line indentation, like current Text::Wrap()
2392     * user chooses among: force-break long words, leave them alone, or die()?
2393     * preserve existing indentation: chopped chunks from an indented line
2394     are indented by same (like this line, not counting the asterisk!)
2395     * optional list of things to preserve on line starts, default ">"
2396    
2397     Note that the last two are essentially the same concept, so unify in
2398     implementation and give a good interface to controlling them.
2399    
2400     And how about:
2401    
2402     Optionally, when encounter a line pre-indented by same as previous
2403     line, then strip the newline and refill, but indent by the same.
2404     Yeah...
2405    

  ViewVC Help
Powered by ViewVC 1.1.26