author     Nate Sesti <sestinj@gmail.com>  2023-09-30 14:29:28 -0700
committer  Nate Sesti <sestinj@gmail.com>  2023-09-30 14:29:28 -0700
commit     82bafc9ffa0eabd2b96b90bf3d375f22d62dc16a (patch)
tree       3ad6669fdb02cba42e623d6c476b7a76dcb345a2
parent     685c60bd26adfc87ab4e134251b71604b8ea53ea (diff)
docs: :memo: update embeddings docs
 docs/docs/walkthroughs/codebase-embeddings.md | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/docs/docs/walkthroughs/codebase-embeddings.md b/docs/docs/walkthroughs/codebase-embeddings.md
index 79c9edc1..012395b6 100644
--- a/docs/docs/walkthroughs/codebase-embeddings.md
+++ b/docs/docs/walkthroughs/codebase-embeddings.md
@@ -9,7 +9,7 @@ While it is experimental, codebase embeddings will only be available through the
 3. `python -m continuedev` to start the Continue server
 4. Open `~/.continue/config.py` and add the following, filling in your OpenAI API key:
 
-> NOTE: All of the `params` are optional. If you don't provide an OpenAI API key, sentence transformers embeddings will be calculated locally. And the values seen for the other parameters are the defaults so you can leave them out. After retrieving the top `n_retrieve` results from the vector database, an additional re-reranking step uses 2 LLM calls to select the top `n_final` results to use to answer the question. If you want to increase the speed of the query at the cost of relevancy, you can skip the re-ranking step by setting `use_reranking` to `False`. Then the top `n_final` results will just be directly calculated from the vector database.
+> NOTE: All of the `params` are optional. If you don't provide an OpenAI API key, sentence transformers embeddings will be calculated locally. And the values seen in this example for the other parameters are the defaults, so you can leave them out.
 
 ```python
 from continuedev.plugins.steps.chroma import (
@@ -34,7 +34,8 @@ config=ContinueConfig(
             params={
                 "n_retrieve": 20,
                 "n_final": 10,
-                "use_reranking": True
+                "use_reranking": True,
+                "sentence_transformers_model": "openai"
             },
         ),
     ]
@@ -46,3 +47,9 @@ config=ContinueConfig(
 7. Please share feedback in [Discord](https://discord.gg/NWtdYexhMs)!
 
 > Note: There is a known bug that requires different imports for the PyPI package. If you encounter "No module named 'continuedev.src'", you should replace all instances of 'continuedev.src.continuedev' in `config.py` with just 'continuedev'.
+
+## Parameters
+
+After retrieving the top `n_retrieve` results from the vector database, an additional re-ranking step uses 2 LLM calls to select the top `n_final` results to use to answer the question. If you want to increase the speed of the query at the cost of relevancy, you can skip the re-ranking step by setting `use_reranking` to `False`. Then the top `n_final` results will just be taken directly from the vector database.
+
+The `sentence_transformers_model` parameter allows you to select a custom embeddings model from the list [here](https://www.sbert.net/docs/pretrained_models.html). The default value is "openai", but if you don't include your OpenAI API key, it will fall back to using the `all-MiniLM-L6-v2` sentence transformers model.
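For reference, the `params` block this commit documents reads as follows once assembled (values taken from the diff above; the surrounding `ContinueConfig` and step wiring are omitted):

```python
# Retrieval parameters for the codebase-embeddings step, per this commit.
# All keys are optional; the values shown are the documented defaults.
params = {
    "n_retrieve": 20,        # candidates pulled from the vector database
    "n_final": 10,           # results kept after the (optional) re-ranking step
    "use_reranking": True,   # set False to skip the 2-LLM-call re-rank
    # "openai" falls back to all-MiniLM-L6-v2 if no OpenAI API key is set
    "sentence_transformers_model": "openai",
}
```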
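The interaction between `n_retrieve`, `n_final`, and `use_reranking` described in the added Parameters section can be sketched as below. This is a toy stand-in, not Continue's implementation: the similarity and re-rank scores are placeholder fields, where the real pipeline queries a vector database and re-ranks with 2 LLM calls.

```python
def query_codebase(chunks, n_retrieve=20, n_final=10, use_reranking=True):
    """Toy model of the documented flow: pull n_retrieve candidates,
    optionally re-rank them, then keep the top n_final results."""
    # Stage 1: vector-database similarity lookup
    # (stand-in: sort by a precomputed "similarity" field).
    candidates = sorted(chunks, key=lambda c: c["similarity"], reverse=True)
    candidates = candidates[:n_retrieve]

    if use_reranking:
        # Stage 2: re-rank the candidates (the real step uses 2 LLM calls;
        # stand-in: sort by a precomputed "rerank" field).
        candidates = sorted(candidates, key=lambda c: c["rerank"], reverse=True)

    # Either way, only the top n_final results are used to answer the question.
    return candidates[:n_final]
```

Skipping the re-rank (`use_reranking=False`) trades relevancy for speed: the top `n_final` results come straight from the stage-1 ordering.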