author     Nate Sesti <sestinj@gmail.com>  2023-09-30 14:29:28 -0700
committer  Nate Sesti <sestinj@gmail.com>  2023-09-30 14:29:28 -0700
commit     82bafc9ffa0eabd2b96b90bf3d375f22d62dc16a (patch)
tree       3ad6669fdb02cba42e623d6c476b7a76dcb345a2
parent     685c60bd26adfc87ab4e134251b71604b8ea53ea (diff)
docs: :memo: update embeddings docs
 docs/docs/walkthroughs/codebase-embeddings.md | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/docs/docs/walkthroughs/codebase-embeddings.md b/docs/docs/walkthroughs/codebase-embeddings.md
index 79c9edc1..012395b6 100644
--- a/docs/docs/walkthroughs/codebase-embeddings.md
+++ b/docs/docs/walkthroughs/codebase-embeddings.md
@@ -9,7 +9,7 @@ While it is experimental, codebase embeddings will only be available through the
 3. `python -m continuedev` to start the Continue server
 4. Open `~/.continue/config.py` and add the following, filling in your OpenAI API key:
 
-> NOTE: All of the `params` are optional. If you don't provide an OpenAI API key, sentence transformers embeddings will be calculated locally. And the values seen for the other parameters are the defaults so you can leave them out. After retrieving the top `n_retrieve` results from the vector database, an additional re-reranking step uses 2 LLM calls to select the top `n_final` results to use to answer the question. If you want to increase the speed of the query at the cost of relevancy, you can skip the re-ranking step by setting `use_reranking` to `False`. Then the top `n_final` results will just be directly calculated from the vector database.
+> NOTE: All of the `params` are optional. If you don't provide an OpenAI API key, sentence transformers embeddings will be calculated locally. And the values seen in this example for the other parameters are the defaults, so you can leave them out.
 
 ```python
 from continuedev.plugins.steps.chroma import (
@@ -34,7 +34,8 @@ config=ContinueConfig(
             params={
                 "n_retrieve": 20,
                 "n_final": 10,
-                "use_reranking": True
+                "use_reranking": True,
+                "sentence_transformers_model": "openai"
             },
         ),
     ]
@@ -46,3 +47,9 @@ config=ContinueConfig(
 7. Please share feedback in [Discord](https://discord.gg/NWtdYexhMs)!
 
 > Note: There is a known bug that requires different imports for the PyPI package. If you encounter "No module named 'continuedev.src'", you should replace all instances of 'continuedev.src.continuedev' in `config.py` with just 'continuedev'.
+
+## Parameters
+
+After retrieving the top `n_retrieve` results from the vector database, an additional re-ranking step uses 2 LLM calls to select the top `n_final` results to use to answer the question. If you want to increase the speed of the query at the cost of relevancy, you can skip the re-ranking step by setting `use_reranking` to `False`. Then the top `n_final` results will just be taken directly from the vector database.
+
+The `sentence_transformers_model` parameter allows you to select a custom embeddings model from the list [here](https://www.sbert.net/docs/pretrained_models.html). The default value is "openai", but if you don't include your OpenAI API key, it will fall back to using the `all-MiniLM-L6-v2` sentence transformers model.
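For reference, the `params` block this commit documents reads as follows once assembled (values taken from the diff above; the surrounding `ContinueConfig` and step wiring are omitted):

```python
# Retrieval parameters for the codebase-embeddings step, per this commit.
# All keys are optional; the values shown are the documented defaults.
params = {
    "n_retrieve": 20,        # candidates pulled from the vector database
    "n_final": 10,           # results kept after the (optional) re-ranking step
    "use_reranking": True,   # set False to skip the 2-LLM-call re-rank
    # "openai" falls back to all-MiniLM-L6-v2 if no OpenAI API key is set
    "sentence_transformers_model": "openai",
}
```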
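The interaction between `n_retrieve`, `n_final`, and `use_reranking` described in the added Parameters section can be sketched as below. This is a toy stand-in, not Continue's implementation: the similarity and re-rank scores are placeholder fields, where the real pipeline queries a vector database and re-ranks with 2 LLM calls.

```python
def query_codebase(chunks, n_retrieve=20, n_final=10, use_reranking=True):
    """Toy model of the documented flow: pull n_retrieve candidates,
    optionally re-rank them, then keep the top n_final results."""
    # Stage 1: vector-database similarity lookup
    # (stand-in: sort by a precomputed "similarity" field).
    candidates = sorted(chunks, key=lambda c: c["similarity"], reverse=True)
    candidates = candidates[:n_retrieve]

    if use_reranking:
        # Stage 2: re-rank the candidates (the real step uses 2 LLM calls;
        # stand-in: sort by a precomputed "rerank" field).
        candidates = sorted(candidates, key=lambda c: c["rerank"], reverse=True)

    # Either way, only the top n_final results are used to answer the question.
    return candidates[:n_final]
```

Skipping the re-rank (`use_reranking=False`) trades relevancy for speed: the top `n_final` results come straight from the stage-1 ordering.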