Tweaked unidecode_name() so that it does not produce single-letter ASCII surnames from non-ASCII codepoints; a one-letter result is now doubled and capitalized. The unidecode transliteration is somewhat arbitrary in any case, and in most cases a real person will adjust the ASCII name on their account. When running tests, however, this tweak avoids some false test failures. And no, it's not simple to fix the draft author-extraction heuristics to deal well with single-letter surnames.

- Legacy-Id: 15239
Henrik Levkowetz 2018-06-10 14:48:13 +00:00
parent 5f7fb2e0bd
commit dbe9211963

@@ -107,6 +107,8 @@ def unidecode_name(uname):
     first = first.title()
     middle = ' '.join([ capfirst(p) for p in middle.split() ])
     last = ' '.join([ capfirst(p) for p in last.split() ])
+    if len(last) == 1:
+        last = (last+last).capitalize()
     # Restore the particle, if any
     if particle and last.startswith(capfirst(particle)+' '):
         last = ' '.join([ particle, last[len(particle)+1:] ])
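
The effect of the two added lines can be tried in isolation. The following is a minimal sketch, not the project's code: the helper name pad_single_letter_surname is hypothetical and exists only for illustration, and it reproduces just the doubling step, leaving out the unidecode call, the name-part splitting and the particle handling done by unidecode_name().

```python
def pad_single_letter_surname(last: str) -> str:
    """Double a one-character surname so the result is never a single letter.

    Hypothetical helper for illustration; it mirrors only the
    `if len(last) == 1` branch shown in the diff.
    """
    if len(last) == 1:
        # Same expression as in the diff: repeat the letter, then
        # upper-case the first character and lower-case the rest.
        last = (last + last).capitalize()
    return last


if __name__ == "__main__":
    # Surnames of any other length, including the empty string, pass through unchanged.
    for surname in ["X", "x", "Li", ""]:
        print(repr(surname), "->", repr(pad_single_letter_surname(surname)))
```

Doubling the letter is an arbitrary choice, in keeping with the commit message's remark that the transliteration itself is somewhat arbitrary; it only guarantees that the surname has more than one character.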