Back up or migrate Sourcegraph data to a new instance

In some circumstances it may be necessary or advantageous to migrate from one Sourcegraph instance or deployment to another. This page describes how to execute such a migration.

While much of Sourcegraph's data can be regenerated, some cannot, and state is spread across the several locations described below.

Configuration JSON

Most parts of Sourcegraph's configuration are managed in the web app via text files. These files are typically stored in the Postgres database (described below), but are translated into text for editing in the web UI.

These files are the most essential pieces of information required for a migration to work.

Data	Can it be recreated without a backup?	Notes
Site configuration	No	This file contains key configuration that defines how the product works.
Code host connection configuration(s)	No	Each connection to an external code host has its own short configuration file.
Global settings	No	Default settings can be set by administrators for all users by editing this file.

Migrating this data is as simple as copying the text of the files described above from the old Sourcegraph instance and pasting it into the new one.
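Because these files are plain text, a copy can also be kept on disk in case the old instance becomes unavailable. The following is a minimal sketch in Python, assuming the text has already been copied out of the web UI. The file names and the `save_config_backup` helper are hypothetical and not part of Sourcegraph.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical file names for the three kinds of configuration listed above.
# An instance with several code host connections would need one file per connection.
CONFIG_FILES = ("site-config.json", "code-host-connection.json", "global-settings.json")


def save_config_backup(texts: dict[str, str], backup_root: Path) -> Path:
    """Write each copied configuration text into a new timestamped directory."""
    missing = [name for name in CONFIG_FILES if name not in texts]
    if missing:
        raise ValueError(f"missing configuration text for: {', '.join(missing)}")

    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_root / f"sourcegraph-config-{stamp}"
    # exist_ok=False so an earlier backup is never silently overwritten.
    target.mkdir(parents=True, exist_ok=False)
    for name, text in texts.items():
        # Store the text verbatim rather than parsing and re-serializing it.
        (target / name).write_text(text, encoding="utf-8")
    return target
```

Keeping the text verbatim means it can be pasted back into the new instance exactly as it was copied.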

Internal database (PostgreSQL)

Sourcegraph's internal database stores most of Sourcegraph's state. While most of this data can be recreated after a migration without a backup, some cannot.

This list is not guaranteed to be complete, but rather representative of the types of data stored here.

Data	Can it be recreated without a backup?	Notes
Repository metadata (e.g. clone URLs, whether it is a fork or archive, etc.)	Yes
User accounts	Yes (if using SSO authentication); No (if using builtin authentication)
Repository permissions	Yes
Organizations	No
User and org settings	No	Global settings can be backed up as described above, but user-level and org-level settings cannot.
Saved searches	No
User-generated access tokens	No
Batch Changes	No
Code graph metadata	Yes (if manually regenerated)	This can be regenerated by re-running the indexing and upload process for affected repositories and revisions, but will not be regenerated by default.
User survey responses	No
Usage statistics and event logs	No	Event logs allow admins to track and audit usage, but are not necessary for Sourcegraph to work.
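The rows marked "No" above can only be carried over with a copy of the database itself. One way to take such a copy is the standard PostgreSQL `pg_dump` tool. The sketch below builds and runs a `pg_dump` command from Python. The host, user, database name, and output path are placeholders, and how the database is reached depends on the deployment.

```python
import subprocess


def build_pg_dump_command(host: str, user: str, dbname: str, outfile: str) -> list[str]:
    """Return a pg_dump invocation that writes a custom-format archive."""
    return [
        "pg_dump",
        f"--host={host}",
        f"--username={user}",
        f"--dbname={dbname}",
        "--format=custom",  # archive format that pg_restore can read
        f"--file={outfile}",
    ]


def dump_database(host: str, user: str, dbname: str, outfile: str) -> None:
    """Run pg_dump; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_pg_dump_command(host, user, dbname, outfile), check=True)
```

For example, `dump_database("db.example.internal", "example_user", "example_db", "sourcegraph.dump")` would write the archive to `sourcegraph.dump`, provided `pg_dump` is installed and credentials are available to it.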

Data stored on disk

Git data, search indexes, precise code-intel data, Prometheus metrics, and some other large data sources are stored on disk.

This list is not guaranteed to be complete, but rather representative of the types of data stored here.

Data	Can it be recreated without a backup?	Notes
Repository (git) data	Yes
Search indexes	Yes
Code graph data	Yes (if manually regenerated)	This can be regenerated by re-running the indexing and upload process for affected repositories and revisions, but will not be regenerated by default.
Prometheus metrics	No
blobstore	Yes	This is where unprocessed uploads are stored.

Ephemeral data (Redis)

Short-lived data, including session data and some usage statistics, is stored in Redis. All of this data can be recreated without backups.

External data

Certain categories of data can be stored outside the Sourcegraph deployment. For example, configuration JSON files can be loaded from disk, and Sourcegraph can connect to external services (PostgreSQL, Redis, S3/GCS) instead of using PostgreSQL, Redis, and blobstore internally.

In these cases, no migration should be necessary; simply reuse the existing external data sources on the new Sourcegraph instance.

Name	Re-creatable	Notes
codeinsights-db	Yes
codeintel-db	Yes	While the data is re-creatable, we suggest including the disk in your migration, as it often contains a lot of data that would take a while to regenerate.
indexed-search	Yes
gitserver	Yes
grafana	Yes
blobstore	Yes
pgsql	No	This is Sourcegraph's main database, where most of the data is stored.
prometheus	Yes
redis-cache	Yes
redis-store	Yes