v0.4.0

Latest

patcon released this 20 Jan 05:50

· 40 commits to main since this release

4dbbab1

Changes

Add select_consensus_statements() function, and wire into Polis implementation.
Allow calculate_comment_statistics() to work without groups/labels.
Generalize format_comment_stats() to work for group and consensus statements.
Add select_representative_statements() to PolisClusteringResult as repness key.
Rename arg pick_n to pick_max in select_consensus_statements(), for clarity and consistency.
Slight change to PolisRepness type, so group IDs now returned as ints.
Add print_selected_statements() presenter for inspecting PolisClusteringResult.
Add print_consensus_statements() presenter for inspecting PolisClusteringResult.
Allow pick_max and confidence interval args to be set in polis.run_clustering().
Allow get_corrected_centroid_guesses() to unflip each axis if correction not needed.
Abstracted reducer and clusterer algorithm support.
- Added support for pacmap/localmap beyond PCA.
- Added support for HDBSCAN clustering beyond KMeans.
- Allow passing of arbitrary params into reducer/clusterer.
Remove support for polis_legacy implementation (PolisClient).
Added disagree variant of group-informed-consensus. (group-informed-consensus-disagree)
Brought group-informed-consensus metrics to top-level result object.
Renamed run_clustering function to run_pipeline and created base pipeline implementation.
Add option to generate_figure_polis to configure showing pid labels (show_pids).
Remove deprecated methods from doc website.
Remove deprecated modules from prior import paths.
Avoid using dataframes in a few low-level util functions, in favour of numpy arrays.
Rename projected_{participants,statements} to {participant,statement}_projections in run_pipeline results. Coordinates are now keyed by ID instead of being returned as dataframes.
Remove agora implementation and tests. (#73)
Migrate from reference HDBSCAN module (in scikit-learn) to full-featured HDBSCAN* package.
Add dependency groups to avoid installing everything. (#11)
Add support for statement and participant IDs to be strings. (#92)
Added BestPolisKMeans estimator. (#111)
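
The reducer/clusterer abstraction listed above (named algorithms plus arbitrary params) can be pictured with a small sketch. This is not the library's code: `FakePCA`, `FakeKMeans`, the `REDUCERS`/`CLUSTERERS` registries, and `build_estimators()` are hypothetical names used only for illustration, and the stand-in classes take the place of real estimators such as PCA or KMeans. It assumes the pipeline looks an algorithm up by name and forwards `reducer_kwargs`, `clusterer_kwargs`, and `random_state` to the constructor.

```python
from typing import Any, Dict, Optional, Tuple


class FakePCA:
    """Hypothetical stand-in for a dimensionality reducer."""

    def __init__(self, n_components: int = 2, random_state: Optional[int] = None):
        self.n_components = n_components
        self.random_state = random_state


class FakeKMeans:
    """Hypothetical stand-in for a clustering estimator."""

    def __init__(self, n_clusters: int = 2, random_state: Optional[int] = None):
        self.n_clusters = n_clusters
        self.random_state = random_state


# Hypothetical registries mapping an algorithm name to its class.
REDUCERS: Dict[str, type] = {"pca": FakePCA}
CLUSTERERS: Dict[str, type] = {"kmeans": FakeKMeans}


def build_estimators(
    reducer: str = "pca",
    clusterer: str = "kmeans",
    reducer_kwargs: Optional[Dict[str, Any]] = None,
    clusterer_kwargs: Optional[Dict[str, Any]] = None,
    random_state: Optional[int] = None,
) -> Tuple[Any, Any]:
    """Look up both algorithms by name and forward the caller's params."""
    if reducer not in REDUCERS:
        raise ValueError(f"Unknown reducer: {reducer!r}")
    if clusterer not in CLUSTERERS:
        raise ValueError(f"Unknown clusterer: {clusterer!r}")

    # Explicit kwargs win over the shared random_state default.
    reducer_params = {"random_state": random_state, **(reducer_kwargs or {})}
    clusterer_params = {"random_state": random_state, **(clusterer_kwargs or {})}

    return REDUCERS[reducer](**reducer_params), CLUSTERERS[clusterer](**clusterer_params)
```

Under these assumptions, `build_estimators(reducer_kwargs={"n_components": 3}, random_state=42)` would hand both the custom component count and the seed to the reducer, which is the behaviour the kwargs and `random_state` items in this release describe.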

Fixes

Handle when is-meta and is-seed columns arrive in CSV import. (#55)
Handle loading comments data from API when is_meta missing in CSV import.
Only pass unique labels into generate_figure() colorbar.
bugfix: clusterer_kwargs and reducer_kwargs were not being passed through run_pipeline().
bugfix: Ensure run_pipeline() passes random_state to reducer.
bugfix: Fix overly constrained versions from #80.
bugfix: Ensure we don't crash when a participant ID in keep_participant_ids doesn't exist in vote matrix. (#100)
bugfix: Relaxed weight_x_32767 field to accept alternative responses (voxit server). (#107)
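
The keep_participant_ids fix above (#100) concerns IDs that are requested but absent from the vote matrix. The sketch below shows only the general defensive pattern, not the library's implementation: `select_known_participants()` is a hypothetical helper, and the vote matrix is assumed to be a plain dict keyed by participant ID rather than the library's actual data structure. IDs may be ints or strings, in line with the string-ID support added in this release.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def select_known_participants(
    vote_matrix: Dict[Hashable, Dict[Hashable, int]],
    keep_participant_ids: Iterable[Hashable],
) -> Tuple[List[Hashable], List[Hashable]]:
    """Split requested IDs into those present in the matrix and those missing.

    Unknown IDs are collected and reported instead of raising a KeyError.
    """
    known: List[Hashable] = []
    missing: List[Hashable] = []
    for pid in keep_participant_ids:
        # Membership test avoids indexing the matrix with an unknown ID.
        (known if pid in vote_matrix else missing).append(pid)
    return known, missing


# Hypothetical data: participant ID -> {statement ID: vote}.
votes = {1: {"s1": 1, "s2": -1}, "p2": {"s1": -1}}
known, missing = select_known_participants(votes, [1, "p2", 99])
```

Returning the missing IDs separately lets a caller decide whether to log them or ignore them, rather than failing the whole run.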

Chores

Update the release process instructions.
Added simulate_api_response() test helper for easier comparison with polismath output.
Bumped minimum supported Python version from 3.8 to 3.10. (#94)
Added CI testing for all of Python 3.10, 3.13. (#93)
Enforced a build of website and notebooks in pull requests. (#96)
Added support for test fixtures that pull data from live polis convos. (#98)
Added data loader to docs website. (#101)
Sync'd sanity-check API tests with new Pol.is server response fields. (#106)
Resolved deprecated method in HDBSCAN. (#109)
Removed CloudflareBypassHTTPAdapter that is no longer necessary. (#108)
Extracted Exporter class out of data_loader into data_exporter. (#100)
Updated HDBSCAN to allow using scikit-learn >=1.8. (#112)
Keep running CI even when one version of Python fails. (#112)
