datatracker/ietf/doc/storage_utils.py
Robert Sparks 997239a2ea
feat: write objects to blob storage (#8557)
* feat: basic blobstore infrastructure for dev

* refactor: (broken) attempt to put minio console behind nginx

* feat: initialize blobstore with boto3

* fix: abandon attempt to proxy minio. Use docker compose instead.

* feat: beginning of blob writes

* feat: storage utilities

* feat: test buckets

* chore: black

* chore: remove unused import

* chore: avoid f string when not needed

* fix: inform all settings files about blobstores

* fix: declare types for some settings

* ci: point to new target base

* ci: adjust test workflow

* fix: give the tests debug environment a blobstore

* fix: "better" name declarations

* ci: use devblobstore container

* chore: identify places to write to blobstorage

* chore: remove unreachable code

* feat: store materials

* feat: store statements

* feat: store status changes

* feat: store liaison attachments

* feat: store agendas provided with Interim session requests

* chore: capture TODOs

* feat: store polls and chatlogs

* chore: remove unneeded TODO

* feat: store drafts on submit and post

* fix: handle storage during doc expiration and resurrection

* fix: mirror an unlink

* chore: add/refine TODOs

* feat: store slide submissions

* fix: structure slide test correctly

* fix: correct sense of existence check

* feat: store some indexes

* feat: BlobShadowFileSystemStorage

* feat: shadow floorplans / host logos to the blob

* chore: remove unused import

* feat: strip path from blob shadow names

* feat: shadow photos / thumbs

* refactor: combine photo and photothumb blob kinds

The photos / thumbs were already dropped in the same
directory, so let's not add a distinction at this point.

* style: whitespace

* refactor: use kwargs consistently

* chore: migrations

* refactor: better deconstruct(); rebuild migrations

* fix: use new class in mock patch

* chore: add TODO

* feat: store group index documents

* chore: identify more TODO

* feat: store reviews

* fix: repair merge

* chore: remove unnecessary TODO

* feat: StoredObject metadata

* fix: deburr some debugging code

* fix: only set the deleted timestamp once

* chore: correct typo

* fix: get_or_create vs get and test

* fix: avoid the questionable is_seekable helper

* chore: capture future design consideration

* chore: blob store cfg for k8s

* chore: black

* chore: copyright

* ci: bucket name prefix option + run Black

Adds/uses DATATRACKER_BLOB_STORE_BUCKET_PREFIX option. Other changes
are just Black styling.

* ci: fix typo in bucket name expression

* chore: parameters in app-configure-blobstore

Allows use with other blob stores.

* ci: remove verify=False option

* fix: don't return value from __init__

* feat: option to log timing of S3Storage calls

* chore: units

* fix: deleted->null when storing a file

* style: Black

* feat: log as JSON; refactor to share code; handle exceptions

* ci: add ietf_log_blob_timing option for k8s

* test: --no-manage-blobstore option for running tests

* test: use blob store settings from env, if set

* test: actually set a couple more storage opts

* feat: offswitch (#8541)

* feat: offswitch

* fix: apply ENABLE_BLOBSTORAGE to BlobShadowFileSystemStorage behavior

* chore: log timing of blob reads

* chore: import Config from botocore.config

* chore(deps): import boto3-stubs / botocore

botocore is implicitly imported, but make it explicit
since we refer to it directly

* chore: drop type annotation that mypy loudly ignores

* refactor: add storage methods via mixin

Shares code between Document and DocHistory without
putting it in the base DocumentInfo class, which
lacks the name field. Also makes mypy happy.

* feat: add timeout / retry limit to boto client

* ci: let k8s config the timeouts via env

* chore: repair merge resolution typo

* chore: tweak settings imports

* chore: simplify k8s/settings_local.py imports

---------

Co-authored-by: Jennifer Richards <jennifer@staff.ietf.org>
2025-02-19 17:41:10 -06:00


# Copyright The IETF Trust 2025, All Rights Reserved

from io import BufferedReader
from typing import Optional, Union

import debug  # pyflakes ignore

from django.conf import settings
from django.core.files.base import ContentFile, File
from django.core.files.storage import storages


# TODO-BLOBSTORE (Future, maybe after leaving 3.9) : add a return type
def _get_storage(kind: str):
    if kind in settings.MORE_STORAGE_NAMES:
        # TODO-BLOBSTORE - add a checker that verifies configuration will only return CustomS3Storages
        return storages[kind]
    else:
        debug.say(f"Got into not-implemented looking for {kind}")
        raise NotImplementedError(f"Don't know how to store {kind}")


def exists_in_storage(kind: str, name: str) -> bool:
    if settings.ENABLE_BLOBSTORAGE:
        store = _get_storage(kind)
        return store.exists_in_storage(kind, name)
    else:
        return False


def remove_from_storage(kind: str, name: str, warn_if_missing: bool = True) -> None:
    if settings.ENABLE_BLOBSTORAGE:
        store = _get_storage(kind)
        store.remove_from_storage(kind, name, warn_if_missing)
    return None


# TODO-BLOBSTORE: Try to refactor `kind` out of the signature of the methods already on the custom store (which knows its kind)
def store_file(
    kind: str,
    name: str,
    file: Union[File, BufferedReader],
    allow_overwrite: bool = False,
    doc_name: Optional[str] = None,
    doc_rev: Optional[str] = None,
) -> None:
    # debug.show('f"asked to store {name} into {kind}"')
    if settings.ENABLE_BLOBSTORAGE:
        store = _get_storage(kind)
        store.store_file(kind, name, file, allow_overwrite, doc_name, doc_rev)
    return None


def store_bytes(
    kind: str,
    name: str,
    content: bytes,
    allow_overwrite: bool = False,
    doc_name: Optional[str] = None,
    doc_rev: Optional[str] = None,
) -> None:
    if settings.ENABLE_BLOBSTORAGE:
        # Forward doc_name / doc_rev so they are not silently dropped.
        store_file(
            kind, name, ContentFile(content), allow_overwrite, doc_name, doc_rev
        )
    return None


def store_str(
    kind: str,
    name: str,
    content: str,
    allow_overwrite: bool = False,
    doc_name: Optional[str] = None,
    doc_rev: Optional[str] = None,
) -> None:
    if settings.ENABLE_BLOBSTORAGE:
        content_bytes = content.encode("utf-8")
        # Forward doc_name / doc_rev so they are not silently dropped.
        store_bytes(kind, name, content_bytes, allow_overwrite, doc_name, doc_rev)
    return None


def retrieve_bytes(kind: str, name: str) -> bytes:
    from ietf.doc.storage_backends import maybe_log_timing

    content = b""
    if settings.ENABLE_BLOBSTORAGE:
        store = _get_storage(kind)
        with store.open(name) as f:
            with maybe_log_timing(
                hasattr(store, "ietf_log_blob_timing") and store.ietf_log_blob_timing,
                "read",
                bucket_name=store.bucket_name if hasattr(store, "bucket_name") else "",
                name=name,
            ):
                content = f.read()
    return content


def retrieve_str(kind: str, name: str) -> str:
    content = ""
    if settings.ENABLE_BLOBSTORAGE:
        content_bytes = retrieve_bytes(kind, name)
        # TODO-BLOBSTORE: try to decode all the different ways doc.text() does
        content = content_bytes.decode("utf-8")
    return content
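
The helpers in this module share one shape: check the ENABLE_BLOBSTORAGE switch, look up a store by kind, then delegate (str to bytes to file-like object). The following is a minimal, Django-free sketch of that pattern, not the project's implementation. `SETTINGS`, `STORES`, `InMemoryStore`, and `get_store` are hypothetical stand-ins for `django.conf.settings`, the `storages` registry, the custom S3 backend, and `_get_storage`. The sketch also leaves out `allow_overwrite`, `doc_name`, and `doc_rev`.

```python
from io import BytesIO
from typing import BinaryIO, Dict

# Hypothetical stand-in for django.conf.settings (assumption, not the real object).
SETTINGS = {"ENABLE_BLOBSTORAGE": True}


class InMemoryStore:
    """Toy store keyed by blob name; stands in for a real storage backend."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def store_file(self, name: str, file: BinaryIO) -> None:
        self.blobs[name] = file.read()

    def read(self, name: str) -> bytes:
        return self.blobs[name]


# Hypothetical registry of stores by kind (stands in for Django's `storages`).
STORES: Dict[str, InMemoryStore] = {"draft": InMemoryStore()}


def get_store(kind: str) -> InMemoryStore:
    """Look up the store for a kind, refusing kinds that are not configured."""
    if kind in STORES:
        return STORES[kind]
    raise NotImplementedError(f"Don't know how to store {kind}")


def store_bytes(kind: str, name: str, content: bytes) -> None:
    """Wrap bytes in a file-like object and hand them to the store."""
    if SETTINGS["ENABLE_BLOBSTORAGE"]:
        get_store(kind).store_file(name, BytesIO(content))


def store_str(kind: str, name: str, content: str) -> None:
    """Encode text as UTF-8 and delegate to store_bytes."""
    if SETTINGS["ENABLE_BLOBSTORAGE"]:
        store_bytes(kind, name, content.encode("utf-8"))


def retrieve_str(kind: str, name: str) -> str:
    """Return the stored text, or an empty string when storage is switched off."""
    content = ""
    if SETTINGS["ENABLE_BLOBSTORAGE"]:
        content = get_store(kind).read(name).decode("utf-8")
    return content


# Usage: round-trip a small text blob through the toy store.
store_str("draft", "draft-example-00.txt", "héllo")
print(retrieve_str("draft", "draft-example-00.txt"))
```

With the switch off, writes become no-ops and reads return an empty value, so callers need no branching of their own. The cost is that an empty result is ambiguous between "disabled" and "stored empty content".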