Skip to content

Latest commit

 

History

History
100 lines (67 loc) · 4.62 KB

CHANGELOG.md

File metadata and controls

100 lines (67 loc) · 4.62 KB

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Added

Fixed

[0.8.0]

Added

  • Increased compatibility with PromptingTools upto 0.71.

[0.7.0]

Added

  • Increased compatibility with PromptingTools upto 0.64.

[0.6.0]

Added

  • Added a new template TopicLabelerQuestionsWithContext to generate topic labels for questions in a given context (eg, chatbot inputs).
  • Added a new function topic_tree to generate a topic tree for a given topic level (display with print_tree).

Updated

  • Updated to use PromptingTools 0.44.0.
  • That implicitly changes the default chat model to gpt-4o-mini.
  • Minor updates to labeling templates to ensure that the labels are plain text, no markdown or code.
  • Re-formatted code to SciML style guide.
  • Upstreamed function wrap_string from LLMTextAnalysis to PromptingTools.

Fixed

  • Fixed a bug where wrap_string would not properly handle large words with a lot of Unicode characters.

[0.5.0]

Added

  • Added a classification function train_classifier to train a model to classify documents into a set of predefined labels (as opposed to the more open-ended topic modeling in build_clusters!). You can either provide a small set of labeled documents to train the model (that are in the index), or just specify the num_samples and the LLM model will generate its own training data based on the labels and labels_description provided.
  • Added a new template TextWriterFromLabel to generate synthetic documents for any given label (=topic).
  • Added methods for build_clusters! to add custom topic levels, eg, from a TrainedClassifier (build_clusters!(index,cls; topic_level="MyTopics")) or directly via providing a vector of document assignments (build_clusters!(index, assignments; topic_level="MyTopics")). The convention is to use topic_level::Integer for auto-generated topics, and topic_level::String for custom topics.

Updated

  • Updated to use PromptingTools 0.15.

Fixed

  • Fixed a bug where keywords were not properly filtered before being provided to the auto-labeling function.

[0.4.0]

Added

  • Added a new example on topic label customization (examples/3_customize_topic_labels.jl) and the corresponding sections in the FAQ.
  • Added a few string cleanup tricks in build_topic function to strip unnecessary repetition of the prompt template in the generated labels.
  • Added new templates TopicLabelerWithInstructions and TopicSummarizerWithInstructions that include the placeholder instructions to allow users to easily customize the labels and summaries, respectively.

Fixed

  • Fixed small typos in templates TopicLabelerBasic and TopicSummarizerBasic.

Updated

  • Updated logic in the plot to ensure topic labels are generated only when necessary. Use build_clusters! to force the generation of topic labels, or plot to generate them only if necessary.
  • Increased compatibility for PromptingTools to 0.12.

[0.3.2]

Fixed

  • Fixed a bug where plot() would error with UndefVarError(scores1).

[0.3.1]

Fixed

  • wrap_string utility would error with SubString chunks. Now it works with any AbstractString type.

[0.3.0]

Added

  • Changed compat for PromptingTools to 0.10.0 (with new default models! Ie, default embeddings will not match the previous version)

[0.2.1]

Fixed

  • Updated documentation to show Example 2 for concept/spectrum training.

[0.2.0]

Added

  • Added train_concept. Introduces the ability to train a model focusing on a single, specific concept within a collection of documents. This function helps in identifying and scoring the presence or intensity of the selected concept across the document set. Ideal for thematic studies, sentiment analysis, or tracking specific ideas in the text.
  • Added train_spectrum. Adds functionality to analyze documents across a spectrum defined by two contrasting concepts. This feature allows for a comparative analysis, providing insights into how documents align or contrast with two polar themes or sentiments.
  • Spectrum and concept can be plotted using plot function.
  • Improved plotting support:
    • Added package extension for PlotlyJS for plot function.
    • Enabled plot function to accept an arbitrary hoverdata table with information to be added to the tooltip for each document (expects Tables.jl-compatible data).

[0.1.0]

Added

  • Initial release with support for Plots.jl and building and labeling topics