Update to new PromptingTools RAG #10

Merged
merged 7 commits on Apr 18, 2024
48 changes: 48 additions & 0 deletions Artifacts.toml
@@ -5,3 +5,51 @@ lazy = true
[[juliaextra.download]]
sha256 = "61133afa7e06fda133f07164c57190a5b922f8f2a1aa17c3f8a628b5cf752512"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/juliaextra__v1.10.0__ada1.0.tar.gz"

["julia__textembedding3large-0-Float32"]
git-tree-sha1 = "a105a2482296fa0a80ce0c76677cc9ef673be70e"
lazy = true

[["julia__textembedding3large-0-Float32".download]]
sha256 = "ff4e91908fb54b7919aad9d6a2ac5045124d43eb864fe9f96a7a68d304d4e0a2"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/julia__v1.10.2__textembedding3large-0-Float32__v1.0.tar.gz"

["julia__textembedding3large-1024-Bool"]
git-tree-sha1 = "7eef82f15c72712b4f5fff2449ebf3ed64b56b14"
lazy = true

[["julia__textembedding3large-1024-Bool".download]]
sha256 = "27186886d19ea4c3f1710b4bc70e8e809d906069d5de8c992c948d97d0f454da"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/julia__v1.10.2__textembedding3large-1024-Bool__v1.0.tar.gz"

["tidier__textembedding3large-0-Float32"]
git-tree-sha1 = "680c7035e512844fd2b9af1757b02b931dfadaa5"
lazy = true

[["tidier__textembedding3large-0-Float32".download]]
sha256 = "59eb6fef198e32d238c11d3a95e5201d18cb83c5d42eae753706614c0f72db9e"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/tidier__v20240407__textembedding3large-0-Float32__v1.0.tar.gz"

["tidier__textembedding3large-1024-Bool"]
git-tree-sha1 = "44d861977d663a9c4615023ae38828e0ef88036e"
lazy = true

[["tidier__textembedding3large-1024-Bool".download]]
sha256 = "226cadd2805abb6ab6e561330aca97466e0a2cb1e1eb171be661d9dea9dcacdc"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/tidier__v20240407__textembedding3large-1024-Bool__v1.0.tar.gz"

["makie__textembedding3large-0-Float32"]
git-tree-sha1 = "30c29c10d9b2b160b43f358fad9f4f6fe83ce378"
lazy = true

[["makie__textembedding3large-0-Float32".download]]
sha256 = "ee15489022df191fbede93adf1bd7cc1ceb1f84185229026a5e38ae9a3fab737"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/makie__v20240330__textembedding3large-0-Float32__v1.0.tar.gz"

["makie__textembedding3large-1024-Bool"]
git-tree-sha1 = "a49a86949f86f6cf4c29bdc9559c05064b49c801"
lazy = true

[["makie__textembedding3large-1024-Bool".download]]
sha256 = "135f36effc0d29ed20e9bc877f727e4d9d8366bcae4bf4d13f998529d1091324"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/makie__v20240330__textembedding3large-1024-Bool__v1.0.tar.gz"
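The entries above all set `lazy = true`, so a knowledge pack is only downloaded on first use. A minimal sketch of how such a lazy artifact might be resolved at runtime (assuming the code runs inside the package that owns this `Artifacts.toml`; the artifact name is taken from the entries above):

```julia
# Sketch only: resolving a lazy knowledge-pack artifact on demand.
# Assumes this runs inside the package that owns the Artifacts.toml above.
using LazyArtifacts

# First access triggers the download and sha256 verification;
# later calls return the cached path immediately.
pack_dir = artifact"julia__textembedding3large-0-Float32"
readdir(pack_dir)  # inspect the downloaded files
```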
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Added
- (Preliminary) Knowledge packs available for Julia docs (`:julia`), Tidier ecosystem (`:tidier`), Makie ecosystem (`:makie`). Load with `load_index!(:julia)` or several with `load_index!([:julia, :tidier])`.

### Changed
- Bumped up PromptingTools to v0.20 (brings new RAG capabilities, pretty-printing, etc.)
- Changed default model to be GPT-4 Turbo to improve answer quality

### Fixed
- Fixed wrong initialization of `CONV_HISTORY` and other globals that led to `UndefVarError`. Moved several globals to the `const Ref{}` pattern to ensure type stability; note that they now always need to be dereferenced with `[]` (eg, `MAIN_INDEX[]` instead of `MAIN_INDEX`).
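The `const Ref{}` pattern mentioned above can be sketched as follows (the type parameters here are illustrative, not the package's actual ones):

```julia
# Illustrative sketch of the `const Ref{}` pattern for globals.
const MAIN_INDEX = Ref{Union{Nothing, Int}}(nothing)  # type-stable container

MAIN_INDEX[] = 42          # assign through the Ref
current = MAIN_INDEX[]     # always dereference with []
```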
5 changes: 4 additions & 1 deletion Project.toml
@@ -4,8 +4,11 @@ authors = ["J S <49557684+svilupp@users.noreply.github.com> and contributors"]
version = "0.0.1-DEV"

[deps]
HDF5 = "f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f"
LazyArtifacts = "4af54fe1-eca0-43a8-85a7-787d91b784e3"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
PrecompileTools = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
Preferences = "21216c6a-2e73-6563-6e65-726566657250"
PromptingTools = "670122d1-24a8-4d70-bfce-740807c42192"
REPL = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"
@@ -20,7 +23,7 @@ JSON3 = "1"
LazyArtifacts = "<0.0.1, 1"
LinearAlgebra = "<0.0.1, 1"
Preferences = "1"
PromptingTools = "0.9"
PromptingTools = "0.20"
REPL = "1"
SHA = "0.7"
Serialization = "<0.0.1, 1"
70 changes: 60 additions & 10 deletions README.md
@@ -4,7 +4,8 @@

AIHelpMe harnesses the power of Julia's extensive documentation and advanced AI models to provide tailored coding guidance. By integrating with PromptingTools.jl, it offers a unique, AI-assisted approach to answering your coding queries directly in Julia's environment.

Note: This is only a proof-of-concept. If there is enough interest, we will fine-tune the RAG pipeline for better performance.
> [!CAUTION]
> This is only a proof-of-concept. If there is enough interest, we will fine-tune the RAG pipeline for better performance.

## Features

@@ -27,7 +28,8 @@ Pkg.add(url="https://github.com/svilupp/AIHelpMe.jl")

- Julia (version 1.10 or later).
- Internet connection for API access.
- OpenAI and Cohere API keys (recommended for optimal performance). See [How to Obtain API Keys](#how-to-obtain-api-keys).
- OpenAI API keys with available credits. See [How to Obtain API Keys](#how-to-obtain-api-keys).
- For optimal performance, also get a Cohere API key (free for community use) and a Tavily API key (free for community use).

All setup should take less than 5 minutes!

@@ -40,10 +42,43 @@ All setup should take less than 5 minutes!
```

```plaintext
[ Info: Done generating response. Total cost: $0.001
[ Info: Done generating response. Total cost: $0.015
AIMessage("To implement quicksort in Julia, you can use the `sort` function with the `alg=QuickSort` argument.")
```

Note: By default, we load only the Julia documentation and docstrings for the standard libraries. The default model is GPT-4 Turbo.
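The call suggested in the answer above can be tried directly, since `QuickSort` is exported from Base:

```julia
# The approach suggested in the answer above.
v = [3, 1, 2]
sort(v; alg=QuickSort)   # returns [1, 2, 3]; `v` itself is left unchanged
sort!(v; alg=QuickSort)  # in-place variant
```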

You can pretty-print the answer using `pprint` if you return the full RAGResult (`return_all=true`):
```julia
using AIHelpMe: pprint

result = aihelp("How do I implement quicksort in Julia?", return_all=true)
pprint(result)
```

```plaintext
--------------------
QUESTION(s)
--------------------
- How do I implement quicksort in Julia?

--------------------
ANSWER
--------------------
To implement quicksort in Julia, you can use the [5,1.0]`sort`[1,1.0] function with the [1,1.0]`alg=QuickSort`[1,1.0] argument.[2,1.0]

--------------------
SOURCES
--------------------
1. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Functions
2. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Functions
3. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Algorithms
4. SortingAlgorithms::/README.md::0::SortingAlgorithms
5. AIHelpMe::/README.md::0::AIHelpMe
```

Note: You can see the model cheated because it can see this very documentation...

2. **`aihelp` Macro**:
```julia
aihelp"how to implement quicksort in Julia?"
@@ -56,11 +91,12 @@ All setup should take less than 5 minutes!
Note: The `!` is required for follow-up questions.
`aihelp!` does not add new context/more information - to do that, you need to ask a new question.
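A hypothetical follow-up flow (the question strings are illustrative; the macros are the ones documented here, and a configured OpenAI API key is required):

```julia
# Illustrative follow-up flow.
aihelp"How do I read a CSV file in Julia?"

# Refine the previous answer; `aihelp!` reuses the existing context.
aihelp!"Can you rewrite that example without external packages?"
```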

4. **Pick stronger models**:
Eg, "gpt4t" is an alias for GPT-4 Turbo:
4. **Pick faster models**:
Eg, for simple questions, GPT 3.5 might be enough, so use the alias "gpt3t":
```julia
aihelp"Elaborate on the `sort` function and quicksort algorithm"gpt4t
aihelp"Elaborate on the `sort` function and quicksort algorithm"gpt3t
```

```plaintext
[ Info: Done generating response. Total cost: $0.002 -->
AIMessage("The `sort` function in programming languages, including Julia.... continues for a while!
@@ -69,22 +105,36 @@ All setup should take less than 5 minutes!
5. **Debugging**:
How did you come up with that answer? Check the "context" provided to the AI model (ie, the documentation snippets that were used to generate the answer):
```julia
const AHM = AIHelpMe
AHM.preview_context()
AIHelpMe.pprint(AIHelpMe.LAST_RESULT[])
# Output: Pretty-printed Question + Context + Answer with color highlights
```

The color highlights show you which words were NOT supported by the provided context (magenta = completely new, blue = partially new).
It's a quick and intuitive way to see which function names or variables are made up versus which ones were in the context.

You can change the kwargs of `pprint` to hide the annotations or potentially even show the underlying context (snippets from the documentation):

```julia
AIHelpMe.pprint(AIHelpMe.LAST_RESULT[]; add_context = true, add_scores = false)
```

## How to Obtain API Keys

### OpenAI API Key:
1. Visit [OpenAI's API portal](https://openai.com/api/).
2. Sign up and generate an API Key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://svilupp.github.io/PromptingTools.jl/dev/frequently_asked_questions/#Configuring-the-Environment-Variable-for-API-Key).
3. Charge some credits ($5 minimum, but that will last you a long time).
4. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).

### Cohere API Key:
1. Sign up at [Cohere's registration page](https://dashboard.cohere.com/welcome/register).
2. After registering, visit the [API keys section](https://dashboard.cohere.com/api-keys) to obtain a free, rate-limited Trial key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://svilupp.github.io/PromptingTools.jl/dev/frequently_asked_questions/#Configuring-the-Environment-Variable-for-API-Key).
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).

### Tavily API Key:
1. Sign up at [Tavily](https://app.tavily.com/sign-in).
2. After registering, generate an API key on the [Overview page](https://app.tavily.com/home). You can get a free, rate-limited Trial key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).

## Usage

69 changes: 59 additions & 10 deletions docs/src/index.md
@@ -24,14 +24,15 @@ To install AIHelpMe, use the Julia package manager and the address of the repository:

```julia
using Pkg
Pkg.add("https://github.com/svilupp/AIHelpMe.jl")
Pkg.add(url="https://github.com/svilupp/AIHelpMe.jl")
```

**Prerequisites:**

- Julia (version 1.10 or later).
- Internet connection for API access.
- OpenAI and Cohere API keys (recommended for optimal performance). See [How to Obtain API Keys](#how-to-obtain-api-keys).
- OpenAI API keys with available credits. See [How to Obtain API Keys](#how-to-obtain-api-keys).
- For optimal performance, also get a Cohere API key (free for community use) and a Tavily API key (free for community use).

All setup should take less than 5 minutes!

@@ -44,10 +45,43 @@ All setup should take less than 5 minutes!
```

```plaintext
[ Info: Done generating response. Total cost: $0.001
[ Info: Done generating response. Total cost: $0.015
AIMessage("To implement quicksort in Julia, you can use the `sort` function with the `alg=QuickSort` argument.")
```

Note: By default, we load only the Julia documentation and docstrings for the standard libraries. The default model is GPT-4 Turbo.

You can pretty-print the answer using `pprint` if you return the full RAGResult (`return_all=true`):
```julia
using AIHelpMe: pprint

result = aihelp("How do I implement quicksort in Julia?", return_all=true)
pprint(result)
```

```plaintext
--------------------
QUESTION(s)
--------------------
- How do I implement quicksort in Julia?

--------------------
ANSWER
--------------------
To implement quicksort in Julia, you can use the [5,1.0]`sort`[1,1.0] function with the [1,1.0]`alg=QuickSort`[1,1.0] argument.[2,1.0]

--------------------
SOURCES
--------------------
1. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Functions
2. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Functions
3. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Algorithms
4. SortingAlgorithms::/README.md::0::SortingAlgorithms
5. AIHelpMe::/README.md::0::AIHelpMe
```

Note: You can see the model cheated because it can see this very documentation...

2. **`aihelp` Macro**:
```julia
aihelp"how to implement quicksort in Julia?"
@@ -60,11 +94,12 @@ All setup should take less than 5 minutes!
Note: The `!` is required for follow-up questions.
`aihelp!` does not add new context/more information - to do that, you need to ask a new question.

4. **Pick stronger models**:
Eg, "gpt4t" is an alias for GPT-4 Turbo:
4. **Pick faster models**:
Eg, for simple questions, GPT 3.5 might be enough, so use the alias "gpt3t":
```julia
aihelp"Elaborate on the `sort` function and quicksort algorithm"gpt4t
aihelp"Elaborate on the `sort` function and quicksort algorithm"gpt3t
```

```plaintext
[ Info: Done generating response. Total cost: $0.002 -->
AIMessage("The `sort` function in programming languages, including Julia.... continues for a while!
@@ -73,22 +108,36 @@ All setup should take less than 5 minutes!
5. **Debugging**:
How did you come up with that answer? Check the "context" provided to the AI model (ie, the documentation snippets that were used to generate the answer):
```julia
const AHM = AIHelpMe
AHM.preview_context()
AIHelpMe.pprint(AIHelpMe.LAST_RESULT[])
# Output: Pretty-printed Question + Context + Answer with color highlights
```

The color highlights show you which words were NOT supported by the provided context (magenta = completely new, blue = partially new).
It's a quick and intuitive way to see which function names or variables are made up versus which ones were in the context.

You can change the kwargs of `pprint` to hide the annotations or potentially even show the underlying context (snippets from the documentation):

```julia
AIHelpMe.pprint(AIHelpMe.LAST_RESULT[]; add_context = true, add_scores = false)
```

## How to Obtain API Keys

### OpenAI API Key:
1. Visit [OpenAI's API portal](https://openai.com/api/).
2. Sign up and generate an API Key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://svilupp.github.io/PromptingTools.jl/dev/frequently_asked_questions/#Configuring-the-Environment-Variable-for-API-Key).
3. Charge some credits ($5 minimum, but that will last you a long time).
4. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).

### Cohere API Key:
1. Sign up at [Cohere's registration page](https://dashboard.cohere.com/welcome/register).
2. After registering, visit the [API keys section](https://dashboard.cohere.com/api-keys) to obtain a free, rate-limited Trial key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://svilupp.github.io/PromptingTools.jl/dev/frequently_asked_questions/#Configuring-the-Environment-Variable-for-API-Key).
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).

### Tavily API Key:
1. Sign up at [Tavily](https://app.tavily.com/sign-in).
2. After registering, generate an API key on the [Overview page](https://app.tavily.com/home). You can get a free, rate-limited Trial key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).

## Usage

31 changes: 21 additions & 10 deletions src/AIHelpMe.jl
@@ -4,35 +4,46 @@ using Preferences, Serialization, LinearAlgebra, SparseArrays
using LazyArtifacts
using Base.Docs: DocStr, MultiDoc, doc, meta
using REPL: stripmd
using HDF5

using PromptingTools
using PromptingTools: pprint
using PromptingTools.Experimental.RAGTools
using PromptingTools.Experimental.RAGTools: AbstractRAGConfig, getpropertynested,
setpropertynested, merge_kwargs_nested
using SHA: sha256, bytes2hex
using Logging, PrecompileTools
const PT = PromptingTools
const RAG = PromptingTools.Experimental.RAGTools
const RT = PromptingTools.Experimental.RAGTools

## export load_index!, last_context, update_index!
## export remove_pkgdir, annotate_source, find_new_chunks
include("utils.jl")

## Globals and types are defined in here
include("pipeline_defaults.jl")

## export docdata_to_source, docextract, build_index
include("preparation.jl")

## export load_index!, update_index!
include("loading.jl")

export aihelp
include("generation.jl")

export @aihelp_str, @aihelp!_str
include("macros.jl")

## Globals
const CONV_HISTORY = Vector{Vector{PT.AbstractMessage}}()
const CONV_HISTORY_LOCK = ReentrantLock()
const MAX_HISTORY_LENGTH = 1
const LAST_CONTEXT = Ref{Union{Nothing, RAG.RAGContext}}(nothing)
const MAIN_INDEX = Ref{Union{Nothing, RAG.AbstractChunkIndex}}(nothing)
function __init__()
## Load index
MAIN_INDEX[] = load_index!()
## Set the active configuration
update_pipeline!(:bronze)
## Load index - auto-loads into MAIN_INDEX
load_index!(:julia)
end

# Enable precompilation to reduce start time, disabled logging
with_logger(NullLogger()) do
@compile_workload include("precompilation.jl")
end

end