Completions Configuration
Learn how to configure Cody via completions on a Sourcegraph Enterprise instance.
completions is a legacy configuration method, but it is still supported. We recommend using the newer modelConfiguration for flexible LLM model selection.
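A minimal modelConfiguration setup might look like the following sketch. The structure and model references here are assumptions (model references follow a provider::api-version::model pattern); consult the modelConfiguration docs for the exact identifiers your instance supports.

```json
{
  // [...]
  "cody.enabled": true,
  "modelConfiguration": {
    // Use Sourcegraph-provided models via Cody Gateway
    "sourcegraph": {},
    // Illustrative model references; check the modelConfiguration docs for current IDs
    "defaultModels": {
      "chat": "anthropic::2023-06-01::claude-3.5-sonnet",
      "fastChat": "anthropic::2023-06-01::claude-3-haiku",
      "codeCompletion": "fireworks::v1::starcoder"
    }
  }
}
```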
Cody Enterprise supports many models and model providers. You can configure Cody Enterprise to access models via the Sourcegraph Cody Gateway or directly using your own model provider account or infrastructure. Let's look at these options in more detail.
Using Sourcegraph Cody Gateway
This is the recommended way to configure Cody Enterprise. It supports all the latest models from Anthropic, OpenAI, Mistral, and more without requiring a separate account or incurring separate charges. You can learn more about these in our supported models docs.
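Configuration-wise, Cody Gateway is selected by setting the completions provider to sourcegraph. A minimal sketch (the model names here are illustrative; see the supported models docs for current identifiers):

```json
{
  // [...]
  "cody.enabled": true,
  "completions": {
    "provider": "sourcegraph",
    "chatModel": "anthropic/claude-2.0",     // illustrative Gateway model name
    "completionModel": "fireworks/starcoder" // see the StarCoder section below
  }
}
```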
Using your organization's account with a model provider
Example: Your organization's Anthropic account
First, create your own key with Anthropic. Once you have the key, go to Site admin > Site configuration (/site-admin/configuration) on your instance and set:
```json
{
  // [...]
  "cody.enabled": true,
  "completions": {
    "provider": "anthropic",
    "chatModel": "claude-2.0", // Or any other model you would like to use
    "fastChatModel": "claude-instant-1.2", // Or any other model you would like to use
    "completionModel": "claude-instant-1.2", // Or any other model you would like to use
    "accessToken": "<key>"
  }
}
```
Example: Your organization's OpenAI account
First, create your own key with OpenAI. Once you have the key, go to Site admin > Site configuration (/site-admin/configuration) on your instance and set:
```json
{
  // [...]
  "cody.enabled": true,
  "completions": {
    "provider": "openai",
    "chatModel": "gpt-4", // Or any other model you would like to use
    "fastChatModel": "gpt-3.5-turbo", // Or any other model you would like to use
    "completionModel": "gpt-3.5-turbo-instruct", // Or any other model that supports the legacy completions endpoint
    "accessToken": "<key>"
  }
}
```
Learn more about OpenAI models.
Using your organization's public cloud infrastructure
Example: Use Amazon Bedrock
You can use Anthropic Claude models on Amazon Bedrock.
First, make sure you can access Amazon Bedrock. Then, request access to the Anthropic Claude models in Bedrock. This may take some time to provision.
Next, create an IAM user with programmatic access in your AWS account. Depending on your AWS setup, you may need to grant access in different ways. All completion requests are made from the frontend service, so this service needs to be able to access AWS.

You can use instance role bindings or directly configure the IAM user credentials in the configuration. The AWS_REGION environment variable must also be set in the frontend container to scope the IAM credentials to the AWS region hosting the Bedrock endpoint.
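For example, in a Docker Compose deployment you might pass the variables to the frontend container through its environment (a sketch; the values are placeholders, and the credential variables can be omitted when using instance role bindings):

```bash
# Environment for the frontend container (placeholder values)
AWS_REGION=us-west-2                      # region hosting the Bedrock endpoint
AWS_ACCESS_KEY_ID=<ACCESS_KEY_ID>         # omit when using instance role bindings
AWS_SECRET_ACCESS_KEY=<SECRET_ACCESS_KEY> # omit when using instance role bindings
```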
Once ready, go to Site admin > Site configuration (/site-admin/configuration) on your instance and set:
```json
{
  // [...]
  "cody.enabled": true,
  "completions": {
    "provider": "aws-bedrock",
    "chatModel": "anthropic.claude-3-opus-20240229-v1:0",
    "completionModel": "anthropic.claude-instant-v1",
    "endpoint": "<See below>",
    "accessToken": "<See below>"
  }
}
```
For the chatModel and completionModel fields, see Amazon's Bedrock docs for an up-to-date list of supported model IDs, and cross-reference them against Sourcegraph's supported LLM list to verify compatibility with Cody.
For endpoint, you can either:

- For pay-as-you-go, set it to an AWS region code (e.g., us-west-2) when using a public Amazon Bedrock endpoint
- For provisioned throughput, set it to the provisioned VPC endpoint for the bedrock-runtime API (e.g., "https://vpce-0a10b2345cd67e89f-abc0defg.bedrock-runtime.us-west-2.vpce.amazonaws.com")
For accessToken, you can either:

- Leave it empty and rely on instance role bindings or other AWS configurations in the frontend service
- Set it to <ACCESS_KEY_ID>:<SECRET_ACCESS_KEY> if directly configuring the credentials
- Set it to <ACCESS_KEY_ID>:<SECRET_ACCESS_KEY>:<SESSION_TOKEN> if a session token is also required
We only recommend configuring AWS Bedrock with an accessToken for authentication. Leaving accessToken empty (e.g., to use IAM roles for EC2 / instance role binding) is not currently recommended: a known performance bug with that method prevents autocomplete from working correctly (internal issue: PRIME-662).
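If you still need to generate credentials for the IAM user, the AWS CLI can create an access key; a sketch (the user name cody-bedrock is hypothetical):

```bash
# Create an access key for the IAM user used by Cody (name is hypothetical)
aws iam create-access-key --user-name cody-bedrock
# Join the returned AccessKeyId and SecretAccessKey with a colon to form the
# "<ACCESS_KEY_ID>:<SECRET_ACCESS_KEY>" value for the accessToken field
```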
Example: Use GCP Vertex AI
On GCP Vertex, we only support Anthropic Claude models.
1. Enable the Vertex AI API in the GCP console. Once Vertex has been enabled in your project, navigate to the Vertex Model Garden to select and enable the Anthropic Claude model(s) that you wish to use with Cody. See Supported LLM Models for an up-to-date list of Anthropic Claude models supported by Cody.
2. Create a Service Account (see the sketch after this list for example gcloud commands):
   - Create a service account
   - Assign the Vertex AI User role to the service account
   - Generate a JSON key for the service account and download it
3. Convert the JSON key to Base64:

```bash
cat <downloaded-json-key> | base64
```
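As a rough sketch of step 2, the following gcloud commands create the service account, grant it the Vertex AI User role, and download a JSON key (the account name cody-vertex and PROJECT_ID are placeholders):

```bash
# Create a service account for Cody (name is hypothetical)
gcloud iam service-accounts create cody-vertex --project=PROJECT_ID

# Assign the Vertex AI User role (roles/aiplatform.user)
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:cody-vertex@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

# Generate and download a JSON key for the service account
gcloud iam service-accounts keys create key.json \
  --iam-account="cody-vertex@PROJECT_ID.iam.gserviceaccount.com"
```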
Once ready, go to Site admin > Site configuration (/site-admin/configuration) on your instance and set:
```json
{
  // [...]
  "cody.enabled": true,
  "completions": {
    "chatModel": "claude-3-opus@20240229",
    "completionModel": "claude-3-haiku@20240307",
    "provider": "google",
    "endpoint": "<See below>",
    "accessToken": "<Base64 string from step 3>"
  }
}
```
For the endpoint, you can:

- Go to the Claude 3 Haiku docs in the GCP Console Model Garden
- Scroll through the page to find the example that shows how to use the cURL command with the Claude 3 Haiku model. The example includes a sample request JSON body and the necessary endpoint URL. Copy that URL into the site-admin config
- The endpoint URL will look something like this: https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/
- Example URL: https://us-east5-aiplatform.googleapis.com/v1/projects/sourcegraph-vertex-staging/locations/us-east5/publishers/anthropic/models
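To sanity-check the endpoint before saving the site configuration, you can send a test request directly. The following curl sketch assumes the Vertex Anthropic rawPredict method and uses gcloud for a short-lived access token (the project and model values are placeholders):

```bash
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-east5-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-east5/publishers/anthropic/models/claude-3-haiku@20240307:rawPredict" \
  -d '{
    "anthropic_version": "vertex-2023-10-16",
    "max_tokens": 64,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```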
Example: Use Azure OpenAI service
Create a project in the Azure OpenAI Service portal. From the project overview, go to Keys and Endpoint and copy one of the keys and the endpoint from that page.
Next, under Model deployments, click Manage deployments and make sure you deploy the models you want, for example, gpt-35-turbo. Take note of the deployment name.
Once done, go to Site admin > Site configuration (/site-admin/configuration) on your instance and set:
```json
{
  // [...]
  "cody.enabled": true,
  "completions": {
    "provider": "azure-openai",
    "chatModel": "<deployment name of the model>",
    "fastChatModel": "<deployment name of the model>",
    "completionModel": "<deployment name of the model>", // the model must support the legacy completions endpoint, such as gpt-3.5-turbo-instruct
    "endpoint": "<endpoint>",
    "accessToken": "<See below>"
  }
}
```
For the access token, you can either:

- Leave it empty on Sourcegraph v5.2.4 or later, and rely on Environmental, Workload Identity, or Managed Identity credentials configured for the frontend and worker services
- Set it to <API_KEY> if directly configuring the credentials using the API key specified in the Azure portal
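If you use an API key, you can also retrieve it with the Azure CLI instead of the portal; for example (the resource and group names are placeholders):

```bash
# List the API keys for the Azure OpenAI resource (placeholder names)
az cognitiveservices account keys list \
  --name <azure-openai-resource-name> \
  --resource-group <resource-group>
```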
Use StarCoder for autocomplete
When tested against other code models for the autocomplete use case, StarCoder offered significant improvements in quality and latency compared to our control groups for users on Sourcegraph.com. You can read more about the improvements in our October 2023 release notes and the GA release notes.
To ensure a fast and reliable experience, we are partnering with Fireworks and have set up a dedicated hardware deployment for our Enterprise users. Sourcegraph supports StarCoder using the Cody Gateway.
To enable StarCoder, go to Site admin > Site configuration (/site-admin/configuration) and change the completionModel:
```json
{
  // [...]
  "cody.enabled": true,
  "completions": {
    "provider": "sourcegraph",
    "completionModel": "fireworks/starcoder"
  }
}
```
Users of the Cody extensions will automatically pick up this change when connected to your Enterprise instance.