datatracker/docker
Jennifer Richards 3705bedfcd
feat: Celery support and asynchronous draft submission API (#4037)
* ci: add Dockerfile and action to build celery worker image

* ci: build celery worker on push to jennifer/celery branch

* ci: also build celery worker for main branch

* ci: Add comment to celery Dockerfile

* chore: first stab at a celery/rabbitmq docker-compose

* feat: add celery configuration and test task / endpoint

* chore: run mq/celery containers for dev work

* chore: point to ghcr.io image for celery worker

* refactor: move XML parsing duties into XMLDraft

Move some PlaintextDraft methods into the Draft base class and
implement for the XMLDraft class. Use xml2rfc code from ietf.submit
as a model for the parsing.

This leaves some mismatch between the PlaintextDraft and the Draft
class specs for the get_author_list() method, still to be resolved.

* feat: add api_upload endpoint and beginnings of async processing

This adds an api_upload() that behaves analogously to the api_submit()
endpoint. Celery tasks to handle asynchronous processing are added but
are not yet functional enough to be useful.

* perf: index Submission table on submission_date

This substantially speeds up submission rate threshold checks.

* feat: remove existing files when accepting a new submission

After checking that a submission is not in progress, remove any files
in staging that have the same name/rev with any extension. This should
guard against stale files confusing the submission process if the
usual cleanup fails or is skipped for some reason.

* refactor: make clear that deduce_group() uses only the draft name

* refactor: extract only draft name/revision in clean() method

This minimizes the amount of validation done when accepting a file. The
data extraction will be moved to asynchronous processing.

* refactor: minimize checks and data extraction in api_upload() view

* ci: fix dockerfiles to match sandbox testing

* ci: tweak celery container docker-compose settings

* refactor: clean up Draft parsing API and usage

  * remove get_draftname() from Draft api; set filename during init
  * further XMLDraft work
    - remember xml_version after parsing
    - extract filename/revision during init
    - comment out long broken get_abstract() method
  * adjust form clean() method to use changed API

* feat: flesh out async submission processing

First basically working pass!

* feat: add state name for submission being validated asynchronously

* feat: cancel submissions that async processing can't handle

* refactor: simplify/consolidate async tasks and improve error handling

* feat: add api_submission_status endpoint

* refactor: return JSON from submission api endpoints

* refactor: reuse cancel_submission method

* refactor: clean up error reporting a bit

* feat: guard against cancellation of a submission while validating

Not bulletproof, but should prevent a submission from being cancelled
while validation is in progress.

* feat: indicate that a submission is still being validated

* fix: do not delete submission files after creating them

* chore: remove debug statement

* test: add tests of the api_upload and api_submission_status endpoints

* test: add tests and stubs for async side of submission handling

* fix: gracefully handle (ignore) invalid IDs in async submit task

* test: test process_uploaded_submission method

* fix: fix failures of new tests

* refactor: fix type checker complaints

* test: test submission_status view of submission in "validating" state

* fix: fix up migrations

* fix: use the streamlined SubmissionBaseUploadForm for api_upload

* feat: show submission history event timestamp as mouse-over text

* fix: remove 'manual' as next state for 'validating' submission state

* refactor: share SubmissionBaseUploadForm code with Deprecated version

* fix: validate text submission title, update a couple comments

* chore: disable requirements updating when celery dev container starts

* feat: log traceback on unexpected error during submission processing

* feat: allow secretariat to cancel "validating" submission

* feat: indicate time since submission on the status page

* perf: check submission rate thresholds earlier when possible

No sense parsing details of a draft that is going to be dropped regardless
of those details!

* fix: create Submission before saving to reduce race condition window

* fix: call deduce_group() with filename

* refactor: remove code lint

* refactor: change the api_upload URL to api/submission

* docs: update submission API documentation

* test: add tests of api_submission's text draft consistency checks

* refactor: rename api_upload to api_submission to agree with new URL

* test: test API documentation and submission thresholds

* fix: fix a couple api_submission view renames missed in templates

* chore: use base image + add arm64 support

* ci: try to fix workflow_dispatch for celery worker

* ci: another attempt to fix workflow_dispatch

* ci: build celery image for submit-async branch

* ci: fix typo

* ci: publish celery worker to ghcr.io/painless-security

* ci: install python requirements in celery image

* ci: fix up requirements install on celery image

* chore: remove XML_LIBRARY references that crept back in

* feat: accept 'replaces' field in api_submission

* docs: update api_submission documentation

* fix: remove unused import

* test: test "replaces" validation for submission API

* test: test that "replaces" is set by api_submission

* feat: trap TERM to gracefully stop celery container

* chore: tweak celery/mq settings

* docs: update installation instructions

* ci: adjust paths that trigger celery worker image build

* ci: fix branches/repo names left over from dev

* ci: run manage.py check when initializing celery container

The driver here is applying the patches. Starting the celery workers
also invokes the check task, but running it at container init should
produce a clearer failure if something is wrong.

* docs: revise INSTALL instructions

* ci: pass filename to pip update in celery container

* docs: update INSTALL to include freezing pip versions

This will be used to coordinate package versions with the celery
container in production.

* docs: add explanation of frozen-requirements.txt

* ci: build image for sandbox deployment

* ci: add additional build trigger path

* docs: tweak INSTALL

* fix: change INSTALL process to stop datatracker before running migrations

* chore: use ietf.settings for manage.py check in celery container

* chore: set uid/gid for celery worker

* chore: create user/group in celery container if needed

* chore: tweak docker compose/init so celery container works in dev

* ci: build mq docker image

* fix: move rabbitmq.pid to writeable location

* fix: clear password when CELERY_PASSWORD is empty

Setting it to an empty password is really not a good plan!

* chore: add shutdown debugging option to celery image

* chore: add django-celery-beat package

* chore: run "celery beat" in datatracker-celery image

* chore: fix docker image name

* feat: add task to cancel stale submissions

* test: test the cancel_stale_submissions task

* chore: make f-string with no interpolation a plain string

Co-authored-by: Nicolas Giard <github@ngpixel.com>
Co-authored-by: Robert Sparks <rjsparks@nostrum.com>
2022-08-22 13:29:31 -05:00
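As a rough illustration of the asynchronous submission API described in this commit (a sketch only — the multipart form fields and the status URL shape are assumptions based on the commit messages above; see the submission API documentation updated in this commit for authoritative details):

    # Upload a draft; 'replaces' is optional (field names assumed):
    curl -F "xml=@draft-example-00.xml" -F "replaces=draft-example-previous" https://datatracker.ietf.org/api/submission

    # Processing continues asynchronously; poll a status endpoint
    # (hypothetical URL shape) until validation completes:
    curl https://datatracker.ietf.org/api/submission/12345/status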
assets misc: import docker improvements from 7.39.1.dev2 2021-11-10 21:51:55 +00:00
configs chore: create separate run config for vite dev (#4221) 2022-07-18 08:40:32 -05:00
misc misc: import docker improvements from 7.39.1.dev2 2021-11-10 21:51:55 +00:00
scripts chore: improve devcontainer build (#4241) 2022-07-23 10:33:41 -04:00
app.Dockerfile chore: improve devcontainer build (#4241) 2022-07-23 10:33:41 -04:00
base.Dockerfile feat: Replace graphviz with d3 (#4067) 2022-07-21 12:14:45 -05:00
cleanall fix: skip assets volume in cleandb script + add confirm prompt to cleanall (#4038) 2022-06-13 12:13:16 -05:00
cleandb fix: skip assets volume in cleandb script + add confirm prompt to cleanall (#4038) 2022-06-13 12:13:16 -05:00
db.Dockerfile ci: fix db import file permission 2022-07-28 17:38:56 -04:00
docker-compose.celery.yml feat: Celery support and asynchronous draft submission API (#4037) 2022-08-22 13:29:31 -05:00
docker-compose.extend.yml feat: Celery support and asynchronous draft submission API (#4037) 2022-08-22 13:29:31 -05:00
rabbitmq.conf feat: Celery support and asynchronous draft submission API (#4037) 2022-08-22 13:29:31 -05:00
README.md docs: update DOCKER readme 2022-07-23 09:39:20 -04:00
run chore: remove workspace chown from dev init script + add warning when using root 2022-05-21 00:18:24 -04:00

Datatracker Development in Docker

Getting started

  1. Set up Docker on your preferred platform. On Windows, it is highly recommended to use the WSL 2 (Windows Subsystem for Linux) backend.

     See the IETF Tools Windows Dev guide on how to get started when using Windows.

  2. On Linux, you must also install Docker Compose. Docker Desktop for Mac and Windows already include Docker Compose. (A sketch of one way to install it on Linux follows this list.)

  3. If you have a copy of the datatracker code checked out already, simply cd to the top-level directory.

     If not, check out a datatracker branch as usual. We'll check out main below, but you can use any branch:

     git clone https://github.com/ietf-tools/datatracker.git
     cd datatracker
     git checkout main

  4. Follow the instructions for your preferred editor:
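(For step 2 above: a sketch of one way to install Docker Compose on Linux, assuming a Debian/Ubuntu system — see the Docker documentation for other platforms and for current instructions.)

    sudo apt-get update
    sudo apt-get install docker-compose
    docker-compose --version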

Using Visual Studio Code

This project includes a devcontainer configuration which automates the setup of the development environment with all the required dependencies.

Initial Setup

  1. Launch VS Code
  2. Under the Extensions tab, ensure you have the Remote - Containers (ms-vscode-remote.remote-containers) extension installed. On Windows, you also need the Remote - WSL (ms-vscode-remote.remote-wsl) extension to take advantage of the WSL 2 (Windows Subsystem for Linux) native integration.
  3. Open the top-level directory of the datatracker code you fetched above.
  4. A prompt inviting you to reopen the project in containers will appear in the bottom-right corner. Click the Reopen in Container button. If you missed the prompt, you can press F1, start typing reopen in container, and launch that task.
  5. VS Code will relaunch in the dev environment and create the containers automatically.
  6. You may get several warnings prompting you to reload the window as extensions get installed for the first time. Wait for the initialization script to complete before doing so, i.e. until the message Done! appears in the terminal panel.

Subsequent Launch

To return to the dev environment created above, simply open VS Code, select File > Open Recent, and choose the datatracker folder with the [Dev Container] suffix.

You can also open the datatracker project folder and click the Reopen in Container button when prompted. If you missed the prompt, you can press F1, start typing reopen in container, and launch that task.

Usage

  • Under the Run and Debug tab, you can run the server with the debugger attached using Run Server (F5). Once the server is ready to accept connections, you'll be prompted to open it in a browser. You can also open http://localhost:8000 in a browser.

    An alternate profile, Run Server with Debug Toolbar, is also available from the dropdown menu; it displays various development tools on top of the webpage. Note, however, that this configuration has a significant performance impact.

    To add a Breakpoint, simply click in the gutter to the left of the line you wish to stop at. You can also add Conditional Breakpoints and Logpoints by right-clicking at the same location.

    While running in debug mode (F5), a debug toolbar is shown at the top of the editor.

    See this tutorial on how to use the debugging tools for Django in VS Code.

  • An integrated terminal is available with various shell options (zsh, bash, fish, etc.). Use the New Terminal button located at the right side of the Terminal panel. You can have as many terminals as needed running in parallel, and you can use the split feature to display several at once.

  • Under the SQL Tools tab, a connection Local Dev is preconfigured to connect to the DB container. Using this tool, you can list tables, view records and execute SQL queries directly from VS Code.

    The port 3306 is also exposed to the host automatically, should you prefer to use your own SQL tool (see the sketch after this list).

  • Under the Task Explorer tab, a list of available preconfigured tasks is displayed. (You may need to expand the tree to src > vscode to see it.) These are common scripts you can run (e.g. run tests, fetch assets).

  • From the command palette (F1), the command Run Test Task allows you to choose between running all tests or just the JavaScript tests.

  • The Ports panel, found in the Terminal area, shows the ports currently mapped to your host and whether they are currently listening.
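As mentioned in the SQL Tools bullet above, you can also connect from the host with your own client. A sketch (the user, password and database names are placeholders, not actual values — check the devcontainer / docker-compose configuration for those):

    # Connect to the DB container from the host on the exposed port 3306;
    # <user> and <database> are placeholders:
    mysql --host=127.0.0.1 --port=3306 --user=<user> --password <database>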

Using Other Editors / Generic

  1. From the terminal, in the top-level directory of the datatracker project:

    On Linux / macOS:

    cd docker
    ./run
    

    Note that you can pass the -r flag to ./run (i.e. ./run -r) to force a rebuild of the containers. This is useful if you switched branches and the existing containers still contain configurations from the old branch. You should also use it if you don't regularly keep up with main and your containers reflect a much older version of the branch.

    On Windows (using Powershell):

    Copy-Item "docker/docker-compose.extend.yml" -Destination "docker/docker-compose.extend-custom.yml"
    (Get-Content -path docker/docker-compose.extend-custom.yml -Raw) -replace 'CUSTOM_PORT','8000' | Set-Content -Path docker/docker-compose.extend-custom.yml
    docker-compose -f docker-compose.yml -f docker/docker-compose.extend-custom.yml up -d
    docker-compose exec app /bin/sh /docker-init.sh
    
  2. Wait for the containers to initialize. Upon completion, you will be dropped into a shell from which you can start the datatracker and execute related commands as usual, for example

    ietf/manage.py runserver 0.0.0.0:8000
    

    to start the datatracker.

    Once the datatracker has started, you should be able to open http://localhost:8000 in a browser and see the landing page.

    Note that, unlike in the VS Code setup, a debug SMTP server is launched automatically. Any email sent will be discarded and logged to the shell.

Exit Environment

To exit the dev environment, simply enter the command exit in the shell.

The containers will automatically be shut down on Linux / macOS.

On Windows, type the command

docker-compose down

to terminate the containers.

Clean and Rebuild DB from latest image

To delete the active DB container and its volume, and get the latest image / DB dump, simply run the following commands:

On Linux / macOS:

cd docker
./cleandb

On Windows:

docker-compose down -v
docker-compose pull db
docker-compose build --no-cache db

Clean all

To delete all containers for this project along with their associated images, and purge any remaining dangling images, simply run the following commands:

On Linux / macOS:

cd docker
./cleanall

On Windows:

docker-compose down -v --rmi all
docker image prune 

Accessing MariaDB Port

The port is exposed but not mapped to 3306 to avoid potential conflicts with the host. To get the mapped port, run the command (from the project /docker directory):

docker-compose port db 3306
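For example (the host port in the output below is illustrative; yours will differ):

$ docker-compose port db 3306
0.0.0.0:32768

You can then point your SQL client at 127.0.0.1 using the reported port.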

Notes / Troubleshooting

Slow zsh prompt inside Docker

On Windows, the zsh prompt can become incredibly slow because of the git status check displayed as part of the prompt. To remove this delay, run the command:

git config oh-my-zsh.hide-info 1

Windows .ics files incorrectly linked

When checking out the project on Windows, the .ics files are not correctly linked and will cause many tests to fail. To fix this issue, run the Fix Windows Timezone File Linking task in VS Code, or manually run the script docker/scripts/app-win32-timezone-fix.sh (see the example below).

The content of the source files will be copied into the target .ics files. Make sure not to add these modified files when committing code!
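For example, to run the fix script manually from the top-level project directory (a sketch; it assumes a POSIX shell, such as the one in the dev container or WSL):

sh docker/scripts/app-win32-timezone-fix.sh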

Missing assets in the data folder

Because including all assets in the image would significantly increase its size, they are not included by default. You can, however, fetch them by running the Fetch assets via rsync task in VS Code, or by manually running the script docker/scripts/app-rsync-extras.sh (see the example below).
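For example, from the top-level project directory (a sketch; the script fetches the assets over rsync, so it can take a while depending on your connection):

sh docker/scripts/app-rsync-extras.sh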