Commit graph

15 commits

Author SHA1 Message Date
Jennifer Richards cb8ef96f36
fix: more submission date feedback; refactor xml2rfc log capture (#8621)
* feat: catch and report any <date> parsing error

* refactor: error handling in a more testable way

* fix: no bare `except`

* test: exception cases for test_parse_creation_date

* fix: explicitly reject non-numeric day/year

* test: suppress xml2rfc output in test

* refactor: context manager to capture xml2rfc output

* refactor: more capture_xml2rfc_output usage

* fix: capture_xml2rfc_output exception handling
2025-03-03 09:21:39 -06:00
Jennifer Richards fb310e5ce2
feat: useful error when submission has inconsistent date (#8576)
* chore: handle errors in app-configure-blobstore.py

* feat: sensible error for inconsistent <date>
2025-02-21 09:49:16 -06:00
Jennifer Richards da0a217a8c
feat: use surname/initials for author name (#7510)
* feat: use surname/initials for author name

* test: test new method

* fix: handle case where author name is empty
2024-06-06 13:10:24 -05:00
Jennifer Richards ecc823eae5
feat: Give better errors when docName is missing (#7083)
* feat: Give better error when docName is missing

* refactor: Make method static for easier testing

* test: Add test of XMLDraft.parse_docname()
2024-02-21 11:59:07 -06:00
Jennifer Richards dc14308700
refactor: Drop submission validation via libmagic (#6500)
* refactor: Update parsers/base.py for Python3

* style: Black

* refactor: Remove mime type check from FileParser

* refactor: Validate that submission is UTF-8

The mime check distinguished us-ascii from UTF-8,
but as far as I can tell the code relying on it
treated both as equally valid.

* feat: Clear error when file is not valid XML

* chore: Remove unused import

* test: Update tests to match changes

* fix: Count bytes starting from 1

* test: Add tests of FileParser validations

* fix: Fix / simplify regexp

* test: Test error caused by bad XML submission

---------

Co-authored-by: Robert Sparks <rjsparks@nostrum.com>
2023-10-23 10:00:04 -05:00
Jennifer Richards 69fd3a059e
fix: Validate string before calling int() (#6268) 2023-09-13 15:29:16 -05:00
Jennifer Richards 60dc60234d
fix: Better match xml2rfc date parsing (#5914)
* refactor: Split up get_creation_date to ease testing

* test: Add tests of parse_creation_date()

Note change in expected output when date_elt is None:
instead of returning None, this expects today's date.

* fix: Return today instead of None when date is absent

* fix: Handle empty string for day attribute

* test: Check a couple more parse_creation_date cases

* fix: Revert to returning None when date elt is absent

* style: black
2023-07-05 08:23:37 -05:00
Jennifer Richards d33a6f3c0c
fix: Handle missing date fields in XML submissions (#5744)
* refactor: Eliminate _construct_creation_date helper

* fix: Use xml2rfc method for filling in missing date fields

* fix: Set options.date for xml2rfc writers

* test: Test handling of missing date element/fields
2023-06-02 14:40:52 -05:00
Jennifer Richards 5a2708283b
feat: Extract document creation date from XML draft (#5733)
* fix: Extract document creation date from XML draft

* test: Fix test
2023-06-01 09:58:55 -05:00
Lars Eggert 182158b5c0
fix: Also extract document names from XML seriesInfo attributes and XInclude URLs (#5037)
* fix: Also extract document names from XML seriesInfo attributes

The old code only looked in the anchor string for the document names of
references, which doesn't work if the anchor uses a mnemonic. This caused lots
of missed references for many documents.

* No need to import lxml anymore

* Add tests

* Handle xinclude to bibxml URLs

* Wrap line

* Apply suggestion from @rjsparks

* Undo erroneous additions

* Address suggestion from @rjsparks
2023-02-14 17:07:54 -06:00
Jennifer Richards 3705bedfcd
feat: Celery support and asynchronous draft submission API (#4037)
* ci: add Dockerfile and action to build celery worker image

* ci: build celery worker on push to jennifer/celery branch

* ci: also build celery worker for main branch

* ci: Add comment to celery Dockerfile

* chore: first stab at a celery/rabbitmq docker-compose

* feat: add celery configuration and test task / endpoint

* chore: run mq/celery containers for dev work

* chore: point to ghcr.io image for celery worker

* refactor: move XML parsing duties into XMLDraft

Move some PlaintextDraft methods into the Draft base class and
implement for the XMLDraft class. Use xml2rfc code from ietf.submit
as a model for the parsing.

This leaves some mismatch between the PlaintextDraft and the Draft
class spec for the get_author_list() method to be resolved.

* feat: add api_upload endpoint and beginnings of async processing

This adds an api_upload() that behaves analogously to the api_submit()
endpoint. Celery tasks to handle asynchronous processing are added but
are not yet functional enough to be useful.

* perf: index Submission table on submission_date

This substantially speeds up submission rate threshold checks.

* feat: remove existing files when accepting a new submission

After checking that a submission is not in progress, remove any files
in staging that have the same name/rev with any extension. This should
guard against stale files confusing the submission process if the
usual cleanup fails or is skipped for some reason.

* refactor: make clear that deduce_group() uses only the draft name

* refactor: extract only draft name/revision in clean() method

Minimizing the amount of validation done when accepting a file. The
data extraction will be moved to asynchronous processing.

* refactor: minimize checks and data extraction in api_upload() view

* ci: fix dockerfiles to match sandbox testing

* ci: tweak celery container docker-compose settings

* refactor: clean up Draft parsing API and usage

  * remove get_draftname() from Draft api; set filename during init
  * further XMLDraft work
    - remember xml_version after parsing
    - extract filename/revision during init
    - comment out long broken get_abstract() method
  * adjust form clean() method to use changed API

* feat: flesh out async submission processing

First basically working pass!

* feat: add state name for submission being validated asynchronously

* feat: cancel submissions that async processing can't handle

* refactor: simplify/consolidate async tasks and improve error handling

* feat: add api_submission_status endpoint

* refactor: return JSON from submission api endpoints

* refactor: reuse cancel_submission method

* refactor: clean up error reporting a bit

* feat: guard against cancellation of a submission while validating

Not bulletproof but should prevent

* feat: indicate that a submission is still being validated

* fix: do not delete submission files after creating them

* chore: remove debug statement

* test: add tests of the api_upload and api_submission_status endpoints

* test: add tests and stubs for async side of submission handling

* fix: gracefully handle (ignore) invalid IDs in async submit task

* test: test process_uploaded_submission method

* fix: fix failures of new tests

* refactor: fix type checker complaints

* test: test submission_status view of submission in "validating" state

* fix: fix up migrations

* fix: use the streamlined SubmissionBaseUploadForm for api_upload

* feat: show submission history event timestamp as mouse-over text

* fix: remove 'manual' as next state for 'validating' submission state

* refactor: share SubmissionBaseUploadForm code with Deprecated version

* fix: validate text submission title, update a couple comments

* chore: disable requirements updating when celery dev container starts

* feat: log traceback on unexpected error during submission processing

* feat: allow secretariat to cancel "validating" submission

* feat: indicate time since submission on the status page

* perf: check submission rate thresholds earlier when possible

No sense parsing details of a draft that is going to be dropped regardless
of those details!

* fix: create Submission before saving to reduce race condition window

* fix: call deduce_group() with filename

* refactor: remove code lint

* refactor: change the api_upload URL to api/submission

* docs: update submission API documentation

* test: add tests of api_submission's text draft consistency checks

* refactor: rename api_upload to api_submission to agree with new URL

* test: test API documentation and submission thresholds

* fix: fix a couple api_submission view renames missed in templates

* chore: use base image + add arm64 support

* ci: try to fix workflow_dispatch for celery worker

* ci: another attempt to fix workflow_dispatch

* ci: build celery image for submit-async branch

* ci: fix typo

* ci: publish celery worker to ghcr.io/painless-security

* ci: install python requirements in celery image

* ci: fix up requirements install on celery image

* chore: remove XML_LIBRARY references that crept back in

* feat: accept 'replaces' field in api_submission

* docs: update api_submission documentation

* fix: remove unused import

* test: test "replaces" validation for submission API

* test: test that "replaces" is set by api_submission

* feat: trap TERM to gracefully stop celery container

* chore: tweak celery/mq settings

* docs: update installation instructions

* ci: adjust paths that trigger celery worker image  build

* ci: fix branches/repo names left over from dev

* ci: run manage.py check when initializing celery container

Driver here is applying the patches. Starting the celery workers
also invokes the check task, but this should cause a clearer failure
if something fails.

* docs: revise INSTALL instructions

* ci: pass filename to pip update in celery container

* docs: update INSTALL to include freezing pip versions

Will be used to coordinate package versions with the celery
container in production.

* docs: add explanation of frozen-requirements.txt

* ci: build image for sandbox deployment

* ci: add additional build trigger path

* docs: tweak INSTALL

* fix: change INSTALL process to stop datatracker before running migrations

* chore: use ietf.settings for manage.py check in celery container

* chore: set uid/gid for celery worker

* chore: create user/group in celery container if needed

* chore: tweak docker compose/init so celery container works in dev

* ci: build mq docker image

* fix: move rabbitmq.pid to writeable location

* fix: clear password when CELERY_PASSWORD is empty

Setting to an empty password is really not a good plan!

* chore: add shutdown debugging option to celery image

* chore: add django-celery-beat package

* chore: run "celery beat" in datatracker-celery image

* chore: fix docker image name

* feat: add task to cancel stale submissions

* test: test the cancel_stale_submissions task

* chore: make f-string with no interpolation a plain string

Co-authored-by: Nicolas Giard <github@ngpixel.com>
Co-authored-by: Robert Sparks <rjsparks@nostrum.com>
2022-08-22 13:29:31 -05:00
Robert Sparks 10396d6f01
chore: remove more tools.ietf.org server only related things. (#4103)
* chore: remove more tools.ietf.org server only related things.

* chore: remove use of tools.ietf.org floorplans\n\nThe data will move into the FloorPlan models instead.
2022-06-22 14:11:46 -05:00
Robert Sparks dd66187362 Merged in [19895] from jennifer@painless-security.com:
Look at v2 'title' attribute in reference type heuristics for XML drafts. Related to #3529.
 - Legacy-Id: 19897
Note: SVN reference [19895] has been migrated to Git commit ea79fe0dcc183bc5cd8b27da67865c300b9dce4e
2022-01-31 16:54:14 +00:00
Robert Sparks d4fbaa2297 Guard against reference sections without names. Commit ready for merge.
- Legacy-Id: 19892
2022-01-29 22:49:08 +00:00
Jennifer Richards cf62b46093 Find references from submitted XML instead of rendering to text and parsing. Fixes #3342. Commit ready for merge.
- Legacy-Id: 19825
2022-01-07 17:53:23 +00:00