From 8dad79af8b18c08e270382ce1b18a3956fa59626 Mon Sep 17 00:00:00 2001
From: Ty Dunn
Date: Thu, 7 Sep 2023 11:19:38 -0700
Subject: adding support for Hugging Face Inference Endpoints (#460)

* stream complete sketch
* correct structure but issues
* refactor: :art: clean up hf_inference_api.py
* fix: :bug: quick fix in hf_inference_api.py
* feat: :memo: update documentation code for hf_inference_api
* hf docs
* now working

---------

Co-authored-by: Nate Sesti
---
 docs/docs/customization.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/docs/docs/customization.md b/docs/docs/customization.md
index 5fc3eab5..fb7dc0c5 100644
--- a/docs/docs/customization.md
+++ b/docs/docs/customization.md
@@ -21,6 +21,7 @@ Open-Source Models (not local)
 
 - [TogetherLLM](#together) - Use any model from the [Together Models list](https://docs.together.ai/docs/models-inference) with your Together API key.
 - [ReplicateLLM](#replicate) - Use any open-source model from the [Replicate Streaming List](https://replicate.com/collections/streaming-language-models) with your Replicate API key.
+- [HuggingFaceInferenceAPI](#huggingface) - Use any open-source model from the [Hugging Face Inference API](https://huggingface.co/inference-api) with your Hugging Face token.
 
 ## Change the default LLM
 
@@ -206,6 +207,25 @@ config = ContinueConfig(
 
 If you don't specify the `model` parameter, it will default to `replicate/llama-2-70b-chat:58d078176e02c219e11eb4da5a02a7830a283b14cf8f94537af893ccff5ee781`.
 
+### Hugging Face
+
+The Hugging Face Inference API is a great option for newly released language models. Sign up for an account and add billing [here](https://huggingface.co/settings/billing). Then open the Inference Endpoints page [here](https://ui.endpoints.huggingface.co), click "New endpoint", fill out the form (e.g. select a model like [WizardCoder-Python-34B-V1.0](https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0)), and deploy your model by clicking "Create Endpoint". Then change `~/.continue/config.py` to look like this:
+
+```python
+from continuedev.src.continuedev.core.models import Models
+from continuedev.src.continuedev.libs.llm.hf_inference_api import HuggingFaceInferenceAPI
+
+config = ContinueConfig(
+    ...
+    models=Models(
+        default=HuggingFaceInferenceAPI(
+            endpoint_url="<your endpoint URL>",
+            hf_token="<your Hugging Face token>",
+        )
+    )
+)
+```
+
 ### Self-hosting an open-source model
 
 If you want to self-host on Colab, RunPod, HuggingFace, Haven, or another hosting provider you will need to wire up a new LLM class. It only needs to implement 3 primary methods: `stream_complete`, `complete`, and `stream_chat`, and you can see examples in `continuedev/src/continuedev/libs/llm`.
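Before pointing Continue at a new Inference Endpoint, it can help to call the endpoint directly. Below is a minimal smoke-test sketch, assuming a text-generation model and the standard Inference Endpoints request shape (`inputs` plus optional `parameters`); the URL and token values are placeholders, not real credentials:

```python
import requests

# Both values are placeholders: copy the real ones from your endpoint's page.
ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"
HF_TOKEN = "hf_..."

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json={"inputs": "def fibonacci(n):", "parameters": {"max_new_tokens": 64}},
)
response.raise_for_status()

# Text-generation endpoints respond with a list of generations.
print(response.json()[0]["generated_text"])
```

If this prints a completion, the same `endpoint_url` and `hf_token` should work in `~/.continue/config.py`.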
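The self-hosting route means implementing `stream_complete`, `complete`, and `stream_chat` yourself. Here is a minimal sketch of the shape such a class could take, assuming an OpenAI-compatible completions server; the class name, constructor, and exact signatures are illustrative, not Continue's actual base-class interface (see `continuedev/src/continuedev/libs/llm` for the real examples):

```python
import json
from typing import Dict, Generator, List

import requests


class SelfHostedLLM:
    """Sketch of the three methods named above; Continue's real base
    class lives in continuedev/src/continuedev/libs/llm."""

    def __init__(self, server_url: str, model: str):
        self.server_url = server_url  # e.g. "http://localhost:8000" (assumed)
        self.model = model

    def complete(self, prompt: str) -> str:
        # One-shot completion: send the whole prompt, return the full text.
        resp = requests.post(
            f"{self.server_url}/v1/completions",
            json={"model": self.model, "prompt": prompt},
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["text"]

    def _stream(self, path: str, payload: dict) -> Generator[dict, None, None]:
        # Shared SSE reader: yield each parsed "data:" chunk until [DONE].
        with requests.post(
            f"{self.server_url}{path}", json=payload, stream=True
        ) as resp:
            resp.raise_for_status()
            for line in resp.iter_lines():
                if line.startswith(b"data: ") and line != b"data: [DONE]":
                    yield json.loads(line[len(b"data: "):])

    def stream_complete(self, prompt: str) -> Generator[str, None, None]:
        # Streaming completion: yield text fragments as they arrive.
        payload = {"model": self.model, "prompt": prompt, "stream": True}
        for chunk in self._stream("/v1/completions", payload):
            yield chunk["choices"][0]["text"]

    def stream_chat(
        self, messages: List[Dict[str, str]]
    ) -> Generator[str, None, None]:
        # Streaming chat: same idea, against the chat endpoint.
        payload = {"model": self.model, "messages": messages, "stream": True}
        for chunk in self._stream("/v1/chat/completions", payload):
            yield chunk["choices"][0].get("delta", {}).get("content") or ""
```

The split between a one-shot `complete` and the two streaming generators mirrors the three methods the section names; any HTTP client and wire format would do, as long as the streaming methods yield text incrementally.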