-
-
Notifications
You must be signed in to change notification settings - Fork 16
[#441] [IMPROVE] Preload embedding model at startup #461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Changes from all commits
59b40f0
3ffb74a
12b09a7
da9afaa
5ce7782
50a8bd3
795f218
d498a00
6d3d8d1
3824d81
46e9969
64a19ef
dec3c12
f9e890a
67176a8
d273921
8198574
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,3 +4,30 @@ | |
| class ApiConfig(AppConfig): | ||
| default_auto_field = 'django.db.models.BigAutoField' | ||
| name = 'api' | ||
|
|
||
| def ready(self): | ||
| import os | ||
| import sys | ||
|
|
||
| # ready() runs in every Django process: migrate, test, shell, runserver, etc. | ||
| # Only preload the model when we're actually going to serve requests. | ||
| # Dev (docker-compose.yml) runs `manage.py runserver 0.0.0.0:8000`. | ||
| # Prod (Dockerfile.prod CMD) runs `manage.py runserver 0.0.0.0:8000 --noreload`. | ||
| # entrypoint.prod.sh also runs migrate, createsu, and populatedb before exec'ing | ||
| # runserver — the guard below correctly skips model loading for those commands too. | ||
| if sys.argv[1:2] != ['runserver']: | ||
| return | ||
|
|
||
| # Dev's autoreloader spawns two processes: a parent file-watcher and a child | ||
| # server. ready() runs in both, but only the child (RUN_MAIN=true) serves | ||
| # requests. Skip the parent to avoid loading the model twice on each file change. | ||
| # Prod uses --noreload so RUN_MAIN is never set; 'noreload' in sys.argv handles that case. | ||
| if os.environ.get('RUN_MAIN') != 'true' and '--noreload' not in sys.argv: | ||
| return | ||
|
|
||
| # Note: paraphrase-MiniLM-L6-v2 (~80MB) is downloaded from HuggingFace on first | ||
| # use and cached to ~/.cache/torch/sentence_transformers/ inside the container. | ||
| # That cache is ephemeral — every container rebuild re-downloads the model unless | ||
| # a volume is mounted at that path. | ||
| from .services.sentencetTransformer_model import TransformerModel | ||
| TransformerModel.get_instance() | ||
|
Comment on lines
+8
to
+33
|
||
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -2,6 +2,7 @@ | |||||||||
| import logging | ||||||||||
| from statistics import median | ||||||||||
|
|
||||||||||
| # Django filter() only does ADD logic | ||||||||||
|
||||||||||
| # Django filter() only does ADD logic | |
| # Use Q objects to express OR conditions in Django queries |
Copilot
AI
Mar 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
build_query() introduces/relocates important filtering + precedence logic (authenticated vs unauthenticated visibility; guid-over-document_name; LIMIT slicing), but the new tests only cover evaluate_query and log_usage. Add unit/integration tests covering build_query behavior (e.g., guid precedence and the authenticated/unauthenticated queryset filters) to prevent regressions in access control and filtering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Building on Copilot's comment, the specifics of the QuerySet object's structure aren't publicly documented. To inspect the QuerySets, we should actually execute them.
There's a couple ways we handle DB access for these tests. We could use [pytest-django's ``@pytest.mark.django_db](https://pytest-django.readthedocs.io/en/latest/database.html), which wraps the test in a transaction the rolls back automatically afterwards. Django also has a built-in django.test.TestCase`, which does a similar thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sharing the docs references -- I added tests for build_query and didn't have to access the database because I was able to inspect which methods and arguments were called on the model ("Embeddings")
Copilot
AI
Mar 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log_usage() swallows all exceptions, but logger.error(f"... {e}") drops the traceback, making it much harder to debug production failures when SemanticSearchUsage writes fail. Prefer logger.exception(...) (or logger.error(..., exc_info=True)) so the stack trace is captured while still not interrupting the request.
| except Exception as e: | |
| logger.error(f"Failed to create semantic search usage database record: {e}") | |
| except Exception: | |
| logger.exception("Failed to create semantic search usage database record") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ApiConfig.ready()will only run if this AppConfig is actually used by Django. Right nowINSTALLED_APPSappears to include just "api" (not "api.apps.ApiConfig"), andapi/__init__.pydoesn’t set a default config, so this preload hook may never execute. Consider updatingINSTALLED_APPSto referenceapi.apps.ApiConfig(or otherwise ensuring this config is selected) so the model is preloaded as intended.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The model preloads as intended because Django ≥ 3.2 auto discovers AppConfig subclasses in apps.py