Modified UserFactory to use a new locale for each new user, instead of the
same locale for a whole test run. This (almost) ensures that the code
which deals with non-ASCII names is exercised, something which would not
happen if a locale with ASCII names was chosen at the start of a run.
Modified name.initials() to not use non-word characters as initials.
Modified unidecode_name() to do more normalization, to conform to the
conventions used in internet-drafts.
Added saving of the factory-boy random state in order to be able to re-run
a test suite with the same pseudo-random sequence as in a previous failed
run.
Fixed an issue with email formatting in test_api_submit_ok().
Modified the draft author extraction code to deal better with names with
embedded apostrophes.
- Legacy-Id: 14141
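
A minimal sketch of the "no non-word characters as initials" idea, for
illustration only. The helper name given_initials() and its regex are
assumptions made for this example; they are not taken from the actual
name.initials() code.

```python
import re


def given_initials(given_names):
    """Return initials for a string of given names (hypothetical helper)."""
    # Drop everything that is neither whitespace nor a word character, so
    # that a leading apostrophe or similar can never become an initial.
    cleaned = re.sub(r"[^\s\w]", "", given_names)
    # One upper-cased initial per remaining whitespace-separated part.
    return " ".join(part[0].upper() + "." for part in cleaned.split())


print(given_initials("'Rnest"))       # R.
print(given_initials("Jean-Luc M."))  # J. M.
```

Note that this sketch also removes hyphens, so a hyphenated given name
yields a single initial; the real code may treat that case differently.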
to the draft parser (incorporating patch from trunk), store the
extracted country instead of trying to turn it into an ISO country
code, add country and continent name models and add initial data for
those, add helper function for cleaning the countries, add author
country and continent charts, move the affiliation models to
stats/models.py, fix a bunch of bugs.
- Legacy-Id: 12846
Populates RelatedDocument with reference relations for each Document of type draft.
Replaces these reference relationships with updated copies on draft submission.
Note to deployer: This migration takes around 10 minutes to complete on a fast development laptop.
- Legacy-Id: 6572
New features (keep in mind that utils/draft.py can be run standalone
to do extraction of draft author data, too):
* The handling of author info formatted in columns causes problems
in the face of an author named for instance A. Author with the
company 'Al Author and Associates', causing breakage of email
addresses longer than 'Al Author and'. Tweaked the recognition
of column data to require multiple (not only one) space around
'and'.
* Added support for extraction of author affiliation.
* Tweaked the meaning of -t, --timestamp and added --notimestamp; and
made the default be to emit leading timestamps based on the draft
file time.
* Added support for running author extraction on RFCs, by not bailing
out on not finding a draft name when RFC information is available.
* Added support for additional date formats and author name formats.
* Improved creation date extraction -- previously, the first supported
date format which was recognized on the first page of the draft would
be used, rather than the first date in a supported format. This could
cause errors if the Status of Memo section or Abstract contained a
date occurring at the start of a line.
* Tweaked the honorific regex to make things work better for the case
when the full name in the author's address section includes a first
name which isn't part of the first-page abbreviated name. Fixes
problems with draft-chiappa-lisp-introduction and similar.
* Added a special case for people who provide their email address as
'foo&cisco.com' instead of 'foo@cisco.com'. Bah.
* Added an alternative, more human-readable key-value-pair attribute
output mode with a '-a' switch.
* Tweaked the first-name regex to capture cases where the first name
is indicated with an alternate first letter: 'Y(J) Stein'. Fixes
problems with draft-anavi-tdmoip and similar.
- Legacy-Id: 4612
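
The 'foo&cisco.com' special case above could be sketched roughly as
follows. The function name repair_ampersand_email() and the regex are
assumptions made for illustration, not the code in utils/draft.py.

```python
import re

# A token with no '@', exactly one '&', and a dotted domain part after it.
_AMP_EMAIL = re.compile(r"^([^\s&@]+)&([^\s&@]+\.[^\s&@]+)$")


def repair_ampersand_email(token):
    """Turn 'user&example.com' into 'user@example.com' (hypothetical helper).

    Tokens which already contain '@', or whose right-hand side has no dot
    (such as a company name with an ampersand), are returned unchanged.
    """
    return _AMP_EMAIL.sub(r"\1@\2", token)


print(repair_ampersand_email("foo&cisco.com"))  # foo@cisco.com
print(repair_ampersand_email("foo@cisco.com"))  # foo@cisco.com
```

Requiring a dot after the '&' is a deliberate choice in this sketch, so
that ordinary text containing an ampersand is not mistaken for an address.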
recognizable author's address section, and not searching for
author names earlier in the document if found. Fixes a known
bad case where the author name occurred in the middle of a draft.
* Added handling for the case where an author name is followed by
parentheses which are not closed on the same line.
* Some refactoring.
- Legacy-Id: 3417
information: draft.get_author_info(). This method returns a list of
(full_name, first_name, middle_part, surname, suffix, email), with
middle_part, suffix and email set to None if none was found.
- Legacy-Id: 2921
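
A sketch of how a caller might consume tuples of the shape described
above. The AuthorInfo namedtuple, the describe() helper and the sample
rows are made up for this example; only the field order and the use of
None for missing parts come from the description of get_author_info().

```python
from collections import namedtuple

# Hypothetical wrapper giving names to the six tuple positions.
AuthorInfo = namedtuple(
    "AuthorInfo",
    "full_name first_name middle_part surname suffix email",
)

# Made-up sample data in the documented tuple order.
sample = [
    ("Jane Q. Doe", "Jane", "Q.", "Doe", None, "jane@example.com"),
    ("John Smith", "John", None, "Smith", None, None),
]


def describe(authors):
    """Format one line per author, coping with fields that are None."""
    lines = []
    for author in map(AuthorInfo._make, authors):
        email = author.email if author.email is not None else "(no email found)"
        lines.append("%s: %s" % (author.full_name, email))
    return lines


for line in describe(sample):
    print(line)
```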
to Yaco on 2011-03-19, and committed on branch/yaco/idsubmit as [2896].
* Extraction of titles which don't have the draft name on a separate
page fails. See for instance this example:
http://www.ietf.org/staging/draft-ma-cdni-publisher-use-cases-00.txt
The regex should maybe be updated to permit but not require a newline
before the draft filename:
'(?:\n\s*\n\s*)((.+\n){1,2}(.+\n?))(\s+<?draft-\S+\s*\n)\s*\n'
* If there are blank lines before the start of the author list on the
first page, the author extraction will fail. This sometimes happens
when there's junk at the start of a draft, see for instance
http://www.ietf.org/id/draft-ietf-mpls-tp-process-00.txt .
* Sometimes the Authors' Addresses section lists authors with the same
workplace address on the same line: "Sam Spade and Joe Smith". This
needs a fix in the author extraction code.
* Sometimes the order of first name, surname is different on the first
page and in the author list, and sometimes the surname is uppercase
in one place, but not in the other. This also needs a fix in the
author extraction code.
* The header stripping code had a bug, where multiple blank lines could
be replaced by a single blank line in the stripped text, which could
mess up title extraction.
* Title space normalization should also be done for titles from the
'unusual title format' code branch of the title extraction code.
* Company names on the first page are sometimes rendered with different
case than in the Authors' Addresses section.
* Some drafts list the draft filename _before_ the title, rather than
after the title. Permit this too. Covered in the patch.
* Spanish names can be shown as either
<given_name> <fathers_first_surname> <mothers_first_surname>
or less formally as
<given_name> <fathers_first_surname>
If the first form is used in the Authors' Addresses section, but the
second form (with the given name possibly abbreviated to its first
letter) is used on the first page, the author extraction will fail.
* Drafts containing tabs will be caught by idnits during I-D submission,
but in case the drafts.py module is used independently from idnits,
convert tabs to spaces in order for the author extraction and other
methods to work as expected. Example: recently submitted draft
draft-bergeron-payload-rtpfec-rs-00.txt.
* Found a draft with a previously unhandled header/footer format:
draft-fang-mpls-tp-oam-toolset-01.txt. Tweak needed for header/footer
stripping.
- Legacy-Id: 2919
Note: SVN reference [2896] has been migrated to Git commit 5a34b70e52
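
The tab-to-space conversion mentioned above can be sketched with the
standard str.expandtabs() method. The helper name detab() and the tab
width of 8 are assumptions for this example; the width actually used by
the draft parsing code is not stated here.

```python
def detab(text, tabsize=8):
    """Expand tabs to spaces so that column positions are preserved.

    str.expandtabs() restarts its column count after every newline, so
    the whole document text can be converted in a single call.
    """
    return text.expandtabs(tabsize)


raw = "ab\tcd\n\tx"
print(repr(detab(raw)))
```

Expanding to tab stops, rather than replacing each tab with a fixed
number of spaces, keeps column-formatted author blocks aligned.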