datatracker/ietf/utils
Jennifer Richards 3705bedfcd
feat: Celery support and asynchronous draft submission API (#4037)
* ci: add Dockerfile and action to build celery worker image

* ci: build celery worker on push to jennifer/celery branch

* ci: also build celery worker for main branch

* ci: Add comment to celery Dockerfile

* chore: first stab at a celery/rabbitmq docker-compose

* feat: add celery configuration and test task / endpoint

* chore: run mq/celery containers for dev work

* chore: point to ghcr.io image for celery worker

* refactor: move XML parsing duties into XMLDraft

Move some PlaintextDraft methods into the Draft base class and
implement for the XMLDraft class. Use xml2rfc code from ietf.submit
as a model for the parsing.

This leaves some mismatch between the PlaintextDraft and the Draft
class spec for the get_author_list() method to be resolved.

* feat: add api_upload endpoint and beginnings of async processing

This adds an api_upload() that behaves analogously to the api_submit()
endpoint. Celery tasks to handle asynchronous processing are added but
are not yet functional enough to be useful.

* perf: index Submission table on submission_date

This substantially speeds up submission rate threshold checks.

* feat: remove existing files when accepting a new submission

After checking that a submission is not in progress, remove any files
in staging that have the same name/rev with any extension. This should
guard against stale files confusing the submission process if the
usual cleanup fails or is skipped for some reason.

* refactor: make clear that deduce_group() uses only the draft name

* refactor: extract only draft name/revision in clean() method

Minimizing the amount of validation done when accepting a file. The
data extraction will be moved to asynchronous processing.

* refactor: minimize checks and data extraction in api_upload() view

* ci: fix dockerfiles to match sandbox testing

* ci: tweak celery container docker-compose settings

* refactor: clean up Draft parsing API and usage

  * remove get_draftname() from Draft api; set filename during init
  * further XMLDraft work
    - remember xml_version after parsing
    - extract filename/revision during init
    - comment out long broken get_abstract() method
  * adjust form clean() method to use changed API

* feat: flesh out async submission processing

First basically working pass!

* feat: add state name for submission being validated asynchronously

* feat: cancel submissions that async processing can't handle

* refactor: simplify/consolidate async tasks and improve error handling

* feat: add api_submission_status endpoint

* refactor: return JSON from submission api endpoints

* refactor: reuse cancel_submission method

* refactor: clean up error reporting a bit

* feat: guard against cancellation of a submission while validating

Not bulletproof but should prevent

* feat: indicate that a submission is still being validated

* fix: do not delete submission files after creating them

* chore: remove debug statement

* test: add tests of the api_upload and api_submission_status endpoints

* test: add tests and stubs for async side of submission handling

* fix: gracefully handle (ignore) invalid IDs in async submit task

* test: test process_uploaded_submission method

* fix: fix failures of new tests

* refactor: fix type checker complaints

* test: test submission_status view of submission in "validating" state

* fix: fix up migrations

* fix: use the streamlined SubmissionBaseUploadForm for api_upload

* feat: show submission history event timestamp as mouse-over text

* fix: remove 'manual' as next state for 'validating' submission state

* refactor: share SubmissionBaseUploadForm code with Deprecated version

* fix: validate text submission title, update a couple comments

* chore: disable requirements updating when celery dev container starts

* feat: log traceback on unexpected error during submission processing

* feat: allow secretariat to cancel "validating" submission

* feat: indicate time since submission on the status page

* perf: check submission rate thresholds earlier when possible

No sense parsing details of a draft that is going to be dropped regardless
of those details!

* fix: create Submission before saving to reduce race condition window

* fix: call deduce_group() with filename

* refactor: remove code lint

* refactor: change the api_upload URL to api/submission

* docs: update submission API documentation

* test: add tests of api_submission's text draft consistency checks

* refactor: rename api_upload to api_submission to agree with new URL

* test: test API documentation and submission thresholds

* fix: fix a couple api_submission view renames missed in templates

* chore: use base image + add arm64 support

* ci: try to fix workflow_dispatch for celery worker

* ci: another attempt to fix workflow_dispatch

* ci: build celery image for submit-async branch

* ci: fix typo

* ci: publish celery worker to ghcr.io/painless-security

* ci: install python requirements in celery image

* ci: fix up requirements install on celery image

* chore: remove XML_LIBRARY references that crept back in

* feat: accept 'replaces' field in api_submission

* docs: update api_submission documentation

* fix: remove unused import

* test: test "replaces" validation for submission API

* test: test that "replaces" is set by api_submission

* feat: trap TERM to gracefully stop celery container

* chore: tweak celery/mq settings

* docs: update installation instructions

* ci: adjust paths that trigger celery worker image build

* ci: fix branches/repo names left over from dev

* ci: run manage.py check when initializing celery container

Driver here is applying the patches. Starting the celery workers
also invokes the check task, but this should cause a clearer failure
if something fails.

* docs: revise INSTALL instructions

* ci: pass filename to pip update in celery container

* docs: update INSTALL to include freezing pip versions

Will be used to coordinate package versions with the celery
container in production.

* docs: add explanation of frozen-requirements.txt

* ci: build image for sandbox deployment

* ci: add additional build trigger path

* docs: tweak INSTALL

* fix: change INSTALL process to stop datatracker before running migrations

* chore: use ietf.settings for manage.py check in celery container

* chore: set uid/gid for celery worker

* chore: create user/group in celery container if needed

* chore: tweak docker compose/init so celery container works in dev

* ci: build mq docker image

* fix: move rabbitmq.pid to writeable location

* fix: clear password when CELERY_PASSWORD is empty

Setting to an empty password is really not a good plan!

* chore: add shutdown debugging option to celery image

* chore: add django-celery-beat package

* chore: run "celery beat" in datatracker-celery image

* chore: fix docker image name

* feat: add task to cancel stale submissions

* test: test the cancel_stale_submissions task

* chore: make f-string with no interpolation a plain string

Co-authored-by: Nicolas Giard <github@ngpixel.com>
Co-authored-by: Robert Sparks <rjsparks@nostrum.com>
2022-08-22 13:29:31 -05:00
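The "remove existing files when accepting a new submission" item above describes clearing any staged files that share a draft's name and revision, whatever their extension. A minimal sketch of that idea follows. The function name `remove_staging_files` and its parameters are hypothetical and are not taken from the datatracker code, and the sketch assumes staged files are named `<name>-<rev>.<ext>`.

```python
from pathlib import Path
from typing import List


def remove_staging_files(staging_dir: Path, name: str, rev: str) -> List[str]:
    """Delete files named <name>-<rev>.<ext> in staging_dir.

    Hypothetical helper for illustration only. Returns the names of the
    files that were removed, in sorted order.
    """
    removed = []
    # Match the exact name/revision pair followed by any extension.
    for path in sorted(staging_dir.glob(f"{name}-{rev}.*")):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling `remove_staging_files(Path("/tmp/staging"), "draft-foo-bar", "00")` would remove `draft-foo-bar-00.txt` and `draft-foo-bar-00.xml` if present, while leaving other revisions and other drafts untouched. The path here is only an example.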
management fix: avoid mutables as defaults. Compute date default arguments at runtime rather than loadtime. (#4144) 2022-07-06 14:39:36 -05:00
migrations Reverted merge of timezone-aware migration efforts. 2021-01-12 16:54:20 +00:00
templatetags fix: Fix removetags (#4226) 2022-07-18 09:39:11 -05:00
.gitignore * Moved utility functions into utils/ directory, and started breaking out 2007-05-11 15:48:45 +00:00
__init__.py Cleaned out some (arbitrary) submodule name imports into ietf.utils, and made the corresponding import statements import the names directly from the correct submodules. 2014-03-16 07:09:38 +00:00
accesstoken.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
admin.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
aliases.py Log Unicode exceptions instead of printing them to the console when they occur building the email alias files. 2021-03-09 21:19:11 +00:00
bootstrap.py And more fixes. 2022-02-01 07:47:25 +00:00
crawlurls.txt Merged in [19713] from lars@eggert.org: 2021-12-01 22:50:13 +00:00
db.py Guard against absent 'form_class' kwarg in IETFJSONField.formfield(). Commit ready for merge. 2021-11-18 15:54:46 +00:00
decorators.py Reverted merge of timezone-aware migration efforts. 2021-01-12 16:54:20 +00:00
draft.py feat: Celery support and asynchronous draft submission API (#4037) 2022-08-22 13:29:31 -05:00
draft_search.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
fields.py feat: only offer IAB/IESG members for bofreq responsible leadership (#4276) 2022-07-26 11:23:00 -05:00
hedgedoc.py feat: improve notes imports by using de-gfm -4. Related to #3851. (#3930) 2022-05-03 18:04:48 -05:00
history.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
html.py fix: Fix removetags (#4226) 2022-07-18 09:39:11 -05:00
jstest.py fix: offset scrollspy so righthand-nav highlights the correct entry (#3820) 2022-04-14 15:35:07 -03:00
log.py Updated log.assertion() to provide an exception object (under Py3, it seems that logging.Logger instances ignore the traceback if there isn't also an exception object). Added a check for unset draft-iesg state to Document.set_state(). 2020-09-18 14:15:02 +00:00
mail.py feat: Celery support and asynchronous draft submission API (#4037) 2022-08-22 13:29:31 -05:00
markdown.py feat: Render the document shepherd writeup templates at two new URLs (#4225) 2022-07-22 13:43:02 -05:00
markup_txt.py Many more HTML fixes. 2022-02-03 07:49:34 +00:00
meetecho.py Use correct UTC time when creating Meetecho conferences. Fixes #3565. Commit ready for merge. 2022-02-23 20:51:18 +00:00
mime.py Merged in ^/trunk@17617. 2020-04-14 17:11:51 +00:00
models.py Added a utility function to convert objects to dictionaries (for comparison, for instance) 2020-03-19 22:42:43 +00:00
ordereddict.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
patch.py Made the patch utility return information to distinguish already patched files from successful patch application, and modified our checks extensions to signal when patches have been applied and a command needs to be re-run. 2020-08-20 11:36:46 +00:00
pdf.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
pipe.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
resources.py Added 'from __future__' imports all over the place, to bring code behaviour into closer alignment between python2 and python3 2019-07-15 15:40:51 +00:00
response.py test: Validate HTML rendered during tests (#3782) 2022-04-07 13:30:38 -03:00
serialize.py Reverted merge of timezone-aware migration efforts. 2021-01-12 16:54:20 +00:00
storage.py Added a pylint rc-file, and fixed or silenced a number of issues found by pylint using the settings .pylintrc (which enable only error checking). 2016-09-08 14:48:59 +00:00
test_data.py feat: begin supporting the new rfc editor model (#3960) 2022-05-20 12:22:17 -05:00
test_draft_with_references_v2.xml Find references from submitted XML instead of rendering to text and parsing. Fixes #3342. Commit ready for merge. 2022-01-07 17:53:23 +00:00
test_draft_with_references_v3.xml Merged in [19895] from jennifer@painless-security.com: 2022-01-31 16:54:14 +00:00
test_runner.py test: Convert interleaved migration failure to a warning. (#4301) 2022-08-02 10:23:12 -05:00
test_smtpserver.py fix: avoid mutables as defaults. Compute date default arguments at runtime rather than loadtime. (#4144) 2022-07-06 14:39:36 -05:00
test_textupload.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
test_utils.py test: Validate HTML rendered during tests (#3782) 2022-04-07 13:30:38 -03:00
tests.py fix: test web manifest (#4047) 2022-06-02 11:10:11 -05:00
tests_hedgedoc.py Add ability to import session minutes from notes.ietf.org. Mock out calls to the requests library in tests. Call markdown library through a util method. Fixes #3489. Commit ready for merge. 2021-12-09 17:16:19 +00:00
tests_meetecho.py Use correct UTC time when creating Meetecho conferences. Fixes #3565. Commit ready for merge. 2022-02-23 20:51:18 +00:00
tests_restapi.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
texescape.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
text.py feat: Celery support and asynchronous draft submission API (#4037) 2022-08-22 13:29:31 -05:00
textupload.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
timezone.py Reverted merge of timezone-aware migration efforts. 2021-01-12 16:54:20 +00:00
urls.py Removed all __future__ imports. 2020-03-05 23:53:42 +00:00
validators.py Snapshot of dev work to add session purpose annotation 2021-10-12 17:08:58 +00:00
xmldraft.py feat: Celery support and asynchronous draft submission API (#4037) 2022-08-22 13:29:31 -05:00
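
The commit message above says the draft classes now extract the filename and revision during initialization. A minimal sketch of such a split is shown here. The function `split_name_and_rev` and its regular expression are assumptions made for illustration, not the code in draft.py or xmldraft.py. It assumes names of the form `draft-<words>-<two-digit revision>` with an optional file extension.

```python
import os
import re
from typing import Optional, Tuple

# Assumed naming pattern: "draft-", lowercase words/digits/hyphens, then "-NN".
_DRAFT_RE = re.compile(r"(?P<name>draft-[a-z0-9-]+)-(?P<rev>\d{2})")


def split_name_and_rev(filename: str) -> Optional[Tuple[str, str]]:
    """Return (name, rev) for a draft filename, or None if it does not match.

    Hypothetical helper for illustration only.
    """
    stem, _ext = os.path.splitext(filename)  # drop ".xml", ".txt", etc.
    match = _DRAFT_RE.fullmatch(stem)
    if match is None:
        return None
    return match.group("name"), match.group("rev")
```

Doing this split once, when the object is created, means later code can read the name and revision without re-parsing the document.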