datatracker/ietf/submit/mail.py
Jennifer Richards 3705bedfcd
feat: Celery support and asynchronous draft submission API (#4037)
* ci: add Dockerfile and action to build celery worker image

* ci: build celery worker on push to jennifer/celery branch

* ci: also build celery worker for main branch

* ci: Add comment to celery Dockerfile

* chore: first stab at a celery/rabbitmq docker-compose

* feat: add celery configuration and test task / endpoint

* chore: run mq/celery containers for dev work

* chore: point to ghcr.io image for celery worker

* refactor: move XML parsing duties into XMLDraft

Move some PlaintextDraft methods into the Draft base class and
implement for the XMLDraft class. Use xml2rfc code from ietf.submit
as a model for the parsing.

This leaves some mismatch between the PlaintextDraft and the Draft
class spec for the get_author_list() method to be resolved.

* feat: add api_upload endpoint and beginnings of async processing

This adds an api_upload() that behaves analogously to the api_submit()
endpoint. Celery tasks to handle asynchronous processing are added but
are not yet functional enough to be useful.

* perf: index Submission table on submission_date

This substantially speeds up submission rate threshold checks.

* feat: remove existing files when accepting a new submission

After checking that a submission is not in progress, remove any files
in staging that have the same name/rev with any extension. This should
guard against stale files confusing the submission process if the
usual cleanup fails or is skipped for some reason.

* refactor: make clear that deduce_group() uses only the draft name

* refactor: extract only draft name/revision in clean() method

Minimizing the amount of validation done when accepting a file. The
data extraction will be moved to asynchronous processing.

* refactor: minimize checks and data extraction in api_upload() view

* ci: fix dockerfiles to match sandbox testing

* ci: tweak celery container docker-compose settings

* refactor: clean up Draft parsing API and usage

  * remove get_draftname() from Draft api; set filename during init
  * further XMLDraft work
    - remember xml_version after parsing
    - extract filename/revision during init
    - comment out long broken get_abstract() method
  * adjust form clean() method to use changed API

* feat: flesh out async submission processing

First basically working pass!

* feat: add state name for submission being validated asynchronously

* feat: cancel submissions that async processing can't handle

* refactor: simplify/consolidate async tasks and improve error handling

* feat: add api_submission_status endpoint

* refactor: return JSON from submission api endpoints

* refactor: reuse cancel_submission method

* refactor: clean up error reporting a bit

* feat: guard against cancellation of a submission while validating

Not bulletproof but should prevent

* feat: indicate that a submission is still being validated

* fix: do not delete submission files after creating them

* chore: remove debug statement

* test: add tests of the api_upload and api_submission_status endpoints

* test: add tests and stubs for async side of submission handling

* fix: gracefully handle (ignore) invalid IDs in async submit task

* test: test process_uploaded_submission method

* fix: fix failures of new tests

* refactor: fix type checker complaints

* test: test submission_status view of submission in "validating" state

* fix: fix up migrations

* fix: use the streamlined SubmissionBaseUploadForm for api_upload

* feat: show submission history event timestamp as mouse-over text

* fix: remove 'manual' as next state for 'validating' submission state

* refactor: share SubmissionBaseUploadForm code with Deprecated version

* fix: validate text submission title, update a couple comments

* chore: disable requirements updating when celery dev container starts

* feat: log traceback on unexpected error during submission processing

* feat: allow secretariat to cancel "validating" submission

* feat: indicate time since submission on the status page

* perf: check submission rate thresholds earlier when possible

No sense parsing details of a draft that is going to be dropped regardless
of those details!

* fix: create Submission before saving to reduce race condition window

* fix: call deduce_group() with filename

* refactor: remove code lint

* refactor: change the api_upload URL to api/submission

* docs: update submission API documentation

* test: add tests of api_submission's text draft consistency checks

* refactor: rename api_upload to api_submission to agree with new URL

* test: test API documentation and submission thresholds

* fix: fix a couple api_submission view renames missed in templates

* chore: use base image + add arm64 support

* ci: try to fix workflow_dispatch for celery worker

* ci: another attempt to fix workflow_dispatch

* ci: build celery image for submit-async branch

* ci: fix typo

* ci: publish celery worker to ghcr.io/painless-security

* ci: install python requirements in celery image

* ci: fix up requirements install on celery image

* chore: remove XML_LIBRARY references that crept back in

* feat: accept 'replaces' field in api_submission

* docs: update api_submission documentation

* fix: remove unused import

* test: test "replaces" validation for submission API

* test: test that "replaces" is set by api_submission

* feat: trap TERM to gracefully stop celery container

* chore: tweak celery/mq settings

* docs: update installation instructions

* ci: adjust paths that trigger celery worker image build

* ci: fix branches/repo names left over from dev

* ci: run manage.py check when initializing celery container

Driver here is applying the patches. Starting the celery workers
also invokes the check task, but this should cause a clearer failure
if something fails.

* docs: revise INSTALL instructions

* ci: pass filename to pip update in celery container

* docs: update INSTALL to include freezing pip versions

Will be used to coordinate package versions with the celery
container in production.

* docs: add explanation of frozen-requirements.txt

* ci: build image for sandbox deployment

* ci: add additional build trigger path

* docs: tweak INSTALL

* fix: change INSTALL process to stop datatracker before running migrations

* chore: use ietf.settings for manage.py check in celery container

* chore: set uid/gid for celery worker

* chore: create user/group in celery container if needed

* chore: tweak docker compose/init so celery container works in dev

* ci: build mq docker image

* fix: move rabbitmq.pid to writeable location

* fix: clear password when CELERY_PASSWORD is empty

Setting to an empty password is really not a good plan!

* chore: add shutdown debugging option to celery image

* chore: add django-celery-beat package

* chore: run "celery beat" in datatracker-celery image

* chore: fix docker image name

* feat: add task to cancel stale submissions

* test: test the cancel_stale_submissions task

* chore: make f-string with no interpolation a plain string

Co-authored-by: Nicolas Giard <github@ngpixel.com>
Co-authored-by: Robert Sparks <rjsparks@nostrum.com>
2022-08-22 13:29:31 -05:00

376 lines
14 KiB
Python

# Copyright The IETF Trust 2013-2020, All Rights Reserved
# -*- coding: utf-8 -*-
import re
import email
import datetime
import base64
import os
import pyzmail
from django.conf import settings
from django.urls import reverse as urlreverse
from django.core.exceptions import ValidationError
from django.contrib.sites.models import Site
from django.template.loader import render_to_string
from django.utils.encoding import force_text, force_str
import debug # pyflakes:ignore
from ietf.utils.log import log
from ietf.utils.mail import send_mail, send_mail_message
from ietf.doc.models import Document
from ietf.ipr.mail import utc_from_string
from ietf.person.models import Person
from ietf.message.models import Message, MessageAttachment
from ietf.utils.accesstoken import generate_access_token
from ietf.mailtrigger.utils import gather_address_lists, get_base_submission_message_address
from ietf.submit.models import SubmissionEmailEvent, Submission
from ietf.submit.checkers import DraftIdnitsChecker

def send_submission_confirmation(request, submission, chair_notice=False):
    subject = 'Confirm submission of I-D %s' % submission.name
    from_email = settings.IDSUBMIT_FROM_EMAIL
    (to_email, cc) = gather_address_lists('sub_confirmation_requested',submission=submission)

    confirmation_url = settings.IDTRACKER_BASE_URL + urlreverse('ietf.submit.views.confirm_submission', kwargs=dict(submission_id=submission.pk, auth_token=generate_access_token(submission.auth_key)))
    status_url = settings.IDTRACKER_BASE_URL + urlreverse('ietf.submit.views.submission_status', kwargs=dict(submission_id=submission.pk, access_token=submission.access_token()))

    send_mail(request, to_email, from_email, subject, 'submit/confirm_submission.txt',
              {
                  'submission': submission,
                  'confirmation_url': confirmation_url,
                  'status_url': status_url,
                  'chair_notice': chair_notice,
              },
              cc=cc)

    all_addrs = to_email
    all_addrs.extend(cc)
    return all_addrs

def send_full_url(request, submission):
    subject = 'Full URL for managing submission of draft %s' % submission.name
    from_email = settings.IDSUBMIT_FROM_EMAIL
    (to_email, cc) = gather_address_lists('sub_management_url_requested',submission=submission)
    url = settings.IDTRACKER_BASE_URL + urlreverse('ietf.submit.views.submission_status', kwargs=dict(submission_id=submission.pk, access_token=submission.access_token()))

    send_mail(request, to_email, from_email, subject, 'submit/full_url.txt',
              {
                  'submission': submission,
                  'url': url,
              },
              cc=cc)

    all_addrs = to_email
    all_addrs.extend(cc)
    return all_addrs

def send_approval_request(request, submission, replaced_doc=None):
    """Send an approval request for a submission

    If replaced_doc is not None, requests will be sent to the wg chairs or ADs
    responsible for that doc's group instead of the submission.
    """
    subject = 'New draft waiting for approval: %s' % submission.name
    from_email = settings.IDSUBMIT_FROM_EMAIL

    # Sort out which MailTrigger to use
    mt_kwargs = dict(submission=submission)
    if replaced_doc:
        mt_kwargs['doc'] = replaced_doc
    if submission.state_id == 'ad-appr':
        approval_type = 'ad'
        if replaced_doc:
            mt_slug = 'sub_replaced_doc_director_approval_requested'
        else:
            mt_slug = 'sub_director_approval_requested'
    else:
        approval_type = 'chair'
        if replaced_doc:
            mt_slug = 'sub_replaced_doc_chair_approval_requested'
        else:
            mt_slug = 'sub_chair_approval_requested'

    (to_email,cc) = gather_address_lists(mt_slug, **mt_kwargs)
    if not to_email:
        return to_email

    send_mail(request, to_email, from_email, subject, 'submit/approval_request.txt',
              {
                  'submission': submission,
                  'domain': Site.objects.get_current().domain,
                  'approval_type': approval_type,
              },
              cc=cc)
    all_addrs = to_email
    all_addrs.extend(cc)
    return all_addrs

def send_manual_post_request(request, submission, errors):
    subject = 'Manual Post Requested for %s' % submission.name
    from_email = settings.IDSUBMIT_FROM_EMAIL
    (to_email,cc) = gather_address_lists('sub_manual_post_requested',submission=submission)
    checker = DraftIdnitsChecker(options=[]) # don't use the default --submitcheck limitation
    file_name = os.path.join(settings.IDSUBMIT_STAGING_PATH, '%s-%s.txt' % (submission.name, submission.rev))
    nitspass, nitsmsg, nitserr, nitswarn, nitsresult = checker.check_file_txt(file_name)
    send_mail(request, to_email, from_email, subject, 'submit/manual_post_request.txt', {
        'submission': submission,
        'url': settings.IDTRACKER_BASE_URL + urlreverse('ietf.submit.views.submission_status', kwargs=dict(submission_id=submission.pk)),
        'errors': errors,
        'idnits': nitsmsg,
    }, cc=cc)

def announce_to_lists(request, submission):
    m = Message()
    m.by = Person.objects.get(name="(System)")
    if request.user.is_authenticated:
        try:
            m.by = request.user.person
        except Person.DoesNotExist:
            pass
    m.subject = 'I-D Action: %s-%s.txt' % (submission.name, submission.rev)
    m.frm = settings.IDSUBMIT_ANNOUNCE_FROM_EMAIL
    (m.to, m.cc) = gather_address_lists('sub_announced',submission=submission).as_strings()
    if m.cc:
        m.reply_to = m.cc
    m.body = render_to_string('submit/announce_to_lists.txt',
                              dict(submission=submission,
                                   settings=settings))
    m.save()
    m.related_docs.add(Document.objects.get(name=submission.name))

    send_mail_message(request, m)

def announce_new_wg_00(request, submission):
    m = Message()
    m.by = Person.objects.get(name="(System)")
    if request.user.is_authenticated:
        try:
            m.by = request.user.person
        except Person.DoesNotExist:
            pass
    m.subject = 'I-D Action: %s-%s.txt' % (submission.name, submission.rev)
    m.frm = settings.IDSUBMIT_ANNOUNCE_FROM_EMAIL
    (m.to, m.cc) = gather_address_lists('sub_new_wg_00',submission=submission).as_strings()
    if m.cc:
        m.reply_to = m.cc
    m.body = render_to_string('submit/announce_to_lists.txt',
                              dict(submission=submission,
                                   settings=settings))
    m.save()
    m.related_docs.add(Document.objects.get(name=submission.name))

    send_mail_message(request, m)

def announce_new_version(request, submission, draft, state_change_msg):
    (to_email,cc) = gather_address_lists('sub_new_version',doc=draft,submission=submission)

    if to_email:
        subject = 'New Version Notification - %s-%s.txt' % (submission.name, submission.rev)
        from_email = settings.IDSUBMIT_ANNOUNCE_FROM_EMAIL
        send_mail(request, to_email, from_email, subject, 'submit/announce_new_version.txt',
                  {'submission': submission,
                   'msg': state_change_msg},
                  cc=cc)

def announce_to_authors(request, submission):
    (to_email, cc) = gather_address_lists('sub_announced_to_authors',submission=submission)
    from_email = settings.IDSUBMIT_ANNOUNCE_FROM_EMAIL
    subject = 'New Version Notification for %s-%s.txt' % (submission.name, submission.rev)
    if submission.group:
        group = submission.group.acronym
    elif submission.name.startswith('draft-iesg'):
        group = 'IESG'
    else:
        group = 'Individual Submission'
    send_mail(request, to_email, from_email, subject, 'submit/announce_to_authors.txt',
              {'submission': submission,
               'group': group},
              cc=cc)

def get_reply_to():
    """Returns a new reply-to address for use with an outgoing message.  This is an
    address with "plus addressing" using a random string.  Guaranteed to be unique"""
    local,domain = get_base_submission_message_address().split('@')
    while True:
        rand = force_text(base64.urlsafe_b64encode(os.urandom(12)))
        address = "{}+{}@{}".format(local,rand,domain)
        q = Message.objects.filter(reply_to=address)
        if not q:
            return address

def process_response_email(msg):
    """Saves an incoming message.  msg=string.  Message "To" field is expected to
    be in the format ietf-submit+[identifier]@ietf.org.  Expect to find a message with
    a matching value in the reply_to field, associated to a submission.
    Create a Message object for the incoming message and associate it to
    the original message via new SubmissionEvent"""
    message = email.message_from_string(force_str(msg))
    to = message.get('To')

    # exit if this isn't a response we're interested in (with plus addressing)
    local,domain = get_base_submission_message_address().split('@')
    if not re.match(r'^{}\+[a-zA-Z0-9_\-]{}@{}'.format(local,'{16}',domain),to):
        return None

    try:
        to_message = Message.objects.get(reply_to=to)
    except Message.DoesNotExist:
        log('Error finding matching message ({})'.format(to))
        return None

    try:
        submission = to_message.manualevents.first().submission
    except Exception:
        # e.g. no related manual event exists, so first() returned None
        log('Error processing message ({})'.format(to))
        return None

    if not submission:
        log('Error processing message - no submission ({})'.format(to))
        return None

    parts = pyzmail.parse.get_mail_parts(message)
    body=''
    for part in parts:
        if part.is_body == 'text/plain' and part.disposition is None:
            payload, used_charset = pyzmail.decode_text(part.get_payload(), part.charset, None)
            body = body + payload + '\n'

    by = Person.objects.get(name="(System)")
    msg = submit_message_from_message(message, body, by)

    desc = "Email: received message - manual post - {}-{}".format(
        submission.name,
        submission.rev)

    submission_email_event = SubmissionEmailEvent.objects.create(
        submission = submission,
        desc = desc,
        msgtype = 'msgin',
        by = by,
        message = msg,
        in_reply_to = to_message
    )

    save_submission_email_attachments(submission_email_event, parts)

    log("Received submission email from %s" % msg.frm)
    return msg

def add_submission_email(request, remote_ip, name, rev, submission_pk, message, by, msgtype):
    """Add email to submission history"""

    #in_reply_to = form.cleaned_data['in_reply_to']
    # create Message
    parts = pyzmail.parse.get_mail_parts(message)
    body=''
    for part in parts:
        if part.is_body == 'text/plain' and part.disposition is None:
            payload, used_charset = pyzmail.decode_text(part.get_payload(), part.charset, None)
            body = body + payload + '\n'

    msg = submit_message_from_message(message, body, by)

    if submission_pk is not None:
        # Must exist - we're adding a message to an existing submission
        submission = Submission.objects.get(pk=submission_pk)
    else:
        # Must not exist
        submissions = Submission.objects.filter(name=name,rev=rev).exclude(state_id='cancel')
        if submissions.count() > 0:
            raise ValidationError("Submission {} already exists".format(name))

        # create Submission using the name
        try:
            submission = Submission.objects.create(
                state_id="waiting-for-draft",
                remote_ip=remote_ip,
                name=name,
                rev=rev,
                title=name,
                note="",
                submission_date=datetime.date.today(),
                replaces="",
            )
            from ietf.submit.utils import create_submission_event, docevent_from_submission
            desc = "Submission created for rev {} in response to email".format(rev)
            create_submission_event(request,
                                    submission,
                                    desc)
            docevent_from_submission(submission, desc)
        except Exception as e:
            log("Exception: %s\n" % e)
            raise

    if msgtype == 'msgin':
        rs = "Received"
    else:
        rs = "Sent"

    desc = "{} message - manual post - {}-{}".format(rs, name, rev)
    submission_email_event = SubmissionEmailEvent.objects.create(
        desc = desc,
        submission = submission,
        msgtype = msgtype,
        by = by,
        message = msg)
    #in_reply_to = in_reply_to

    save_submission_email_attachments(submission_email_event, parts)
    return submission, submission_email_event

def submit_message_from_message(message,body,by=None):
    """Returns an ietf.message.models.Message.  message=email.Message
    A copy of mail.message_from_message with different body handling
    """
    if not by:
        by = Person.objects.get(name="(System)")
    msg = Message.objects.create(
        by = by,
        subject = message.get('subject',''),
        frm = message.get('from',''),
        to = message.get('to',''),
        cc = message.get('cc',''),
        bcc = message.get('bcc',''),
        reply_to = message.get('reply_to',''),
        body = body,
        time = utc_from_string(message.get('date', '')),
        content_type = message.get('content_type', 'text/plain'),
    )
    return msg

def save_submission_email_attachments(submission_email_event, parts):
    for part in parts:
        if part.disposition != 'attachment':
            continue

        if part.type == 'text/plain':
            payload, used_charset = pyzmail.decode_text(part.get_payload(),
                                                        part.charset,
                                                        None)
            encoding = ""
        else:
            # Need a better approach - for the moment we'll just handle these
            # and encode as base64
            payload = base64.b64encode(part.get_payload())
            encoding = "base64"

        #name = submission_email_event.submission.name

        MessageAttachment.objects.create(message = submission_email_event.message,
                                         content_type = part.type,
                                         encoding = encoding,
                                         filename=part.filename,
                                         body=payload)
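
The plus-addressing round trip between get_reply_to() and process_response_email() can be tried outside Django with a minimal sketch like the one below. It is an illustration under stated assumptions, not the datatracker implementation: make_reply_to, is_plus_addressed and the in-memory `existing` set are hypothetical stand-ins (the real code queries Message.objects), and the base address follows the ietf-submit+[identifier]@ietf.org format named in the docstring. Unlike the original pattern, this sketch escapes the address parts and anchors the regex with `$`.

```python
import base64
import os
import re

# Assumed example value, following the format named in the docstring above.
BASE_ADDRESS = "ietf-submit@ietf.org"


def make_reply_to(base_address, existing=()):
    """Build a plus-addressed reply-to that is not already in `existing`.

    `existing` is a hypothetical stand-in for the Message table lookup.
    """
    local, domain = base_address.split("@")
    while True:
        # 12 random bytes encode to exactly 16 URL-safe base64 characters
        # (no '=' padding, because 12 is a multiple of 3).
        rand = base64.urlsafe_b64encode(os.urandom(12)).decode("ascii")
        address = "{}+{}@{}".format(local, rand, domain)
        if address not in existing:
            return address


def is_plus_addressed(to, base_address):
    """Return True if `to` looks like local+<16 chars>@domain."""
    local, domain = base_address.split("@")
    pattern = r"^{}\+[a-zA-Z0-9_\-]{{16}}@{}$".format(
        re.escape(local), re.escape(domain))
    return re.match(pattern, to) is not None


if __name__ == "__main__":
    reply_to = make_reply_to(BASE_ADDRESS)
    print(reply_to, is_plus_addressed(reply_to, BASE_ADDRESS))
```

The fixed length of 16 in the pattern follows directly from the 12 random bytes used when the address is generated, so the two values need to change together.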