Skip to content

matching stats as join table#3835

Open
awildturtok wants to merge 56 commits intodevelopfrom
feature/matching-stats-as-join-table
Open

matching stats as join table#3835
awildturtok wants to merge 56 commits intodevelopfrom
feature/matching-stats-as-join-table

Conversation

@awildturtok
Copy link
Copy Markdown
Collaborator

@awildturtok awildturtok commented Jan 22, 2026

Eine Beispielhafte join-table könnte folgende Einträge enthalten:

                     resolved_id                      | col_val
------------------------------------------------------+----------
 database.icd.s00-t98.s00-s09.s00.s00_0.s00_00 | T612
 database.icd.s00-t98.s00-s09.s00.s00_0.s00_00 | T611
 database.icd.s00-t98.s00-s09.s00.s00_0.s00_00 | T610
 database.icd.j00-j99.j30-j39.j38.j38_0.j38_00 | J951
  select
    diag.id as "pid",
    least(
      'infinity'::date,
      lower("datum"),
      upper("datum")
    ) as "lower_bound",
    greatest(
      '-infinity'::date,
      lower("datum"),
      upper("datum")
    ) as "upper_bound",
    "resolved_id"
  from "diag"
    left outer join "icd_ids"
      on "diag"."icd_code" = "col_val"

im Falle von ColumnEquals wären dann noch weitere Spalten dabei. Bei PRESENT/EMPTY wären stattdessen true/false drin, die dann im join abgebildet werden.

awildturtok and others added 30 commits November 12, 2025 15:24
# Conflicts:
#	backend/src/main/java/com/bakdata/conquery/mode/local/LocalStorageListener.java
Also fixes a case, where missing entries were not registered to the root.
…tats-as-sql-function

# Conflicts:
#	backend/src/main/java/com/bakdata/conquery/models/messages/namespaces/specific/UpdateMatchingStatsMessage.java
Comment on lines +19 to +20
@Data
public abstract class StorageListener<T extends Namespace>{
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ich hätte das Interface beibehalten und davon diese abstract Class abgeleitet.

Interfaces können auch praktisch sein, da man für diese Proxies implementieren kann

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ich hatte glaube ich gehofft mehr logik zu vereinheitlichen, bin auch nicht deiner Meinung. Aber hier ist es definitiv die klasse wert.

Comment thread backend/src/main/java/com/bakdata/conquery/sql/conquery/SqlMatchingStats.java Outdated
@awildturtok awildturtok force-pushed the feature/matching-stats-as-join-table branch from 81dd8ff to 4366d85 Compare February 5, 2026 15:32
@awildturtok awildturtok force-pushed the feature/matching-stats-as-join-table branch from db2209e to 1ef85b9 Compare February 17, 2026 09:02
@awildturtok awildturtok force-pushed the feature/matching-stats-as-join-table branch from 4a218ad to 14b72b5 Compare February 17, 2026 16:28
@awildturtok awildturtok enabled auto-merge (squash) March 5, 2026 15:06
…tats-as-join-table

# Conflicts:
#	backend/src/main/java/com/bakdata/conquery/mode/local/LocalManagerProvider.java
#	backend/src/main/java/com/bakdata/conquery/models/worker/LocalNamespace.java
#	backend/src/main/java/com/bakdata/conquery/sql/conversion/cqelement/concept/CQConceptConverter.java
#	backend/src/main/java/com/bakdata/conquery/sql/conversion/dialect/PostgreDialectBundle.java
#	backend/src/main/java/com/bakdata/conquery/sql/conversion/query/TableExportQueryConverter.java
#	backend/src/main/java/com/bakdata/conquery/util/TablePrimaryColumnUtil.java
#	backend/src/test/java/com/bakdata/conquery/integration/json/SqlTestDataImporter.java
@awildturtok awildturtok force-pushed the feature/matching-stats-as-join-table branch from 89dca10 to 8afe6c0 Compare April 8, 2026 13:32
@awildturtok awildturtok force-pushed the feature/matching-stats-as-join-table branch from 8afe6c0 to 9871265 Compare April 8, 2026 14:19
@awildturtok awildturtok requested a review from thoniTUB April 9, 2026 10:00
@awildturtok awildturtok disabled auto-merge April 9, 2026 10:02
@awildturtok awildturtok disabled auto-merge April 9, 2026 10:02
Comment thread backend/src/main/java/com/bakdata/conquery/mode/local/ManagedConnection.java Outdated


@Override
public void execute() throws Exception {
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hier bin ich noch ziemlich unentschlossen ob das reicht. Ich denke ich möchte auf jeden Fall die HikariConfig exposen in der config.json. Dann können wir darüber das connection management machen anstatt Angst zu haben


Stopwatch stopwatch = Stopwatch.createStarted();

ExecutorService executorService = Executors.newSingleThreadExecutor();
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

die collectMatchingStatsForConcept hat ein async callable, das sollte also genau passen

Comment on lines +19 to +20
@Data
public abstract class StorageListener<T extends Namespace>{
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ich hatte glaube ich gehofft mehr logik zu vereinheitlichen, bin auch nicht deiner Meinung. Aber hier ist es definitiv die klasse wert.


boolean matches(String value, CalculatedValue<Map<String, Object>> rowMap) throws ConceptConfigurationException;

//TODO implement using join-table
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ist ein separates Projekt, also kann bleiben

protected static Field<Object> coalesceFields(List<Field<?>> fields) {
if (fields.size() == 1) {
return fields.get(0).coerce(Object.class);
protected static <T> Field<T> coalesceFields(List<? extends Field<?>> fields, Class<T> type) {
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weiß leider nicht mehr wo aber hier gab es einen komischen Bug damit.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ist es schlau hier die Argumente (also fields.get(0)) zu überschreiben? Willst du die Funktion nicht lieber idempotent machen.

Ich würde auch das coerce erst im Nachgang machen, wenn das geht.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

das sieht tatsächlich besser aus, danke

Copy link
Copy Markdown
Collaborator

@thoniTUB thoniTUB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ich muss nochmal weiter machen.


@Setter @Getter @NotEmpty
private Set<String> values;
@NotEmpty @Setter @Getter
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@NotBlank @Setter @Getter

public class ColumnEqualCondition implements CTCondition {

@Setter @Getter @NotEmpty
private Set<String> values;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private Set<@NotBlank String> values;

fieldParams = otherParams;
}
else {
fieldParams = Sets.intersection(otherParams, myParams);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warum ist das ein intersect und kein union?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weil die zwei elemente verwundet werden

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩸 🗡️

Okay muss ich nochmal überdenken, ich dachte hier werden Filter/Conditionen verundet und nicht die gefilterten Werte

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hier werden Concept-Conditions verundet

Comment on lines +50 to +51
//TODO better name
record Expression(ConceptElement<?> conceptElement, Map<Field<?>, Set<Param<?>>> conditions) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Es wird ja nur für matching stats genommen, oder? Dann würde ich das in den Namen einfließen lassen

/**
* This condition requires that the selected Column has a value.
*/
@CPSType(id="PRESENT", base=CTCondition.class)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Würde ich behalten

Comment on lines +235 to +246
/**
* Drop the table, then recreate it.
*/
private void createConceptIdsTable(Name tableName, List<Field<?>> fields) {

log.debug("Creating table {} with fields {}", tableName, fields);

CreateTableElementListStep createTable =
dslContext.createTable(tableName)
.columns(fields);

log.info("{}", createTable);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ist das drop implizit? ich sehe es hier nicht

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

passiert im caller

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dann bitte die Doku anpassen

protected static Field<Object> coalesceFields(List<Field<?>> fields) {
if (fields.size() == 1) {
return fields.get(0).coerce(Object.class);
protected static <T> Field<T> coalesceFields(List<? extends Field<?>> fields, Class<T> type) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ist es schlau hier die Argumente (also fields.get(0)) zu überschreiben? Willst du die Funktion nicht lieber idempotent machen.

Ich würde auch das coerce erst im Nachgang machen, wenn das geht.

{
"type": "QUERY_TEST",
"sqlSpec": {
"isEnabled": true
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wieder enablen

case MONEY -> true; // Not possible to introspect for
case DATE -> field.getDataType().isDate();
case DATE_RANGE -> field.getDataType().getTypeName().equals("daterange");
case DATE_RANGE -> true; // Not possible to introspect for
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We wäre es hiermit?

Suggested change
case DATE_RANGE -> true; // Not possible to introspect for
case DATE_RANGE -> field.getDataType().getType().equals(org.jooq.postgres.extensions.types.DateRange.class);

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ich kann mal schauen, aber iirc kommen die einfach als Object zurück

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zu mindest in den Tests würde habe ich gesehen, dass hier passen würde

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ich meine, dass ich es in den Tests auch mal drin hatte und dann deaktivieren musste, sobald ich es auf eine separate Instanz hab zeigen lassen. Ich habs mal zurückgedreht und schau es mir an.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants