Adjust the way authors with unknown countries are counted and improve
the explanation of how the numbers have come to be
 - Legacy-Id: 12858
Ole Laursen 2017-02-16 12:23:45 +00:00
parent d2e85a3aa3
commit 1a0e4599c5
3 changed files with 27 additions and 10 deletions

@@ -433,6 +433,14 @@ def document_stats(request, stats_type=None):
if c and c.in_eu:
bins[eu_name].append(name)
# remove from the unknown bin all authors with a known country
all_known = set(n for b, names in bins.iteritems() if b for n in names)
unknown = []
for name in bins[""]:
if name not in all_known:
unknown.append(name)
bins[""] = unknown
series_data = []
for country, names in sorted(bins.iteritems(), key=lambda t: t[0].lower()):
percentage = len(names) * 100.0 / total_persons
@@ -470,6 +478,14 @@ def document_stats(request, stats_type=None):
continent_name = country_to_continent.get(country_name, "")
bins[continent_name].append(name)
# remove from the unknown bin all authors with a known continent
all_known = set(n for b, names in bins.iteritems() if b for n in names)
unknown = []
for name in bins[""]:
if name not in all_known:
unknown.append(name)
bins[""] = unknown
series_data = []
for continent, names in sorted(bins.iteritems(), key=lambda t: t[0].lower()):
percentage = len(names) * 100.0 / total_persons
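
The pruning step that both hunks add can be sketched in isolation as follows. This is a minimal illustration, not the project's code: the helper name `prune_unknown` and the sample authors are hypothetical, and it uses Python 3's `items()` where the diff uses Python 2's `iteritems()`. It assumes, as the diff does, that the empty-string key holds the unknown bin.

```python
from collections import defaultdict


def prune_unknown(bins):
    """Remove from the unknown bin ("") every name that also has a known bin.

    `bins` maps a country or continent name to a list of author names;
    the empty string is the key for authors with no recognized location.
    """
    # Collect every name that appears under at least one non-empty key.
    all_known = {n for b, names in bins.items() if b for n in names}
    # Keep only the names that never appear under a known key.
    bins[""] = [n for n in bins[""] if n not in all_known]
    return bins


# Hypothetical sample data: Alice has one recognized and one unrecognized entry.
bins = defaultdict(list)
bins["Denmark"].append("Alice")
bins[""].append("Alice")
bins[""].append("Bob")

prune_unknown(bins)
```

After the call, only Bob remains in the unknown bin, because Alice has at least one entry with a recognized country.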

@@ -58,8 +58,7 @@
</tbody>
</table>
<p>The country information for an author can vary between documents,
so the sum of the rows in the table can be more than 100%. This
is especially true for the row with unknown continent information -
many authors may have one or more author entries with an
unrecognized country.</p>
<p>The statistics are based entirely on the author addresses provided
in each draft. Since this varies across documents, a travelling
author may be counted in more than one country, making the total sum
more than 100%.</p>

@@ -58,11 +58,13 @@
</tbody>
</table>
<p>The country information for an author can vary between documents,
so the sum of multiple rows in the table can be more than 100%. This
is especially true for the row with unknown country information -
many authors may have one or more author entries with an
unrecognized country.</p>
<p>The statistics are based entirely on the author addresses provided
in each draft. Since this varies across documents, a travelling
author may be counted in more than one country, making the total sum
more than 100%.</p>
<p>In case no country information is found for an author in the time
period, the author is counted as (unknown).</p>
<p>An author is counted in EU if the country is a member of the EU
now, even if that was not the case at publication.
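
To see why the rows can sum to more than 100%, here is a small sketch of the percentage calculation from the view, applied to made-up data. The helper name `bin_percentages` and the sample authors are hypothetical; only the formula `len(names) * 100.0 / total_persons` comes from the diff.

```python
def bin_percentages(bins, total_persons):
    """Return the share of all authors that falls into each bin, in percent."""
    return {
        label: len(names) * 100.0 / total_persons
        for label, names in bins.items()
    }


# Hypothetical data: three distinct authors. Bob gave a Danish address in one
# draft and a Swedish address in another, so he appears in two bins.
bins = {
    "Denmark": ["Alice", "Bob"],
    "Sweden": ["Bob"],
    "": ["Carol"],  # no country information found for Carol
}

percentages = bin_percentages(bins, total_persons=3)
total = sum(percentages.values())
```

Because Bob is counted once per country while the denominator counts him once overall, `total` exceeds 100 here.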