Remote LLM Access with Twingate
Securely access your remote LLM servers using Twingate.
Secure Your Remote LLM Communications with Twingate and Continue
In the age of AI-powered development, Large Language Models (LLMs) are becoming indispensable tools for coders. Extensions like Continue for VS Code make it seamless to integrate LLMs into your workflow. However, when you’re working with a self-hosted or cloud-hosted LLM, a critical question arises: how do you ensure that the communication between your local machine and the remote LLM is secure?
This is where Twingate comes in. Twingate provides a simple, modern, and secure way to connect to your remote resources, including LLMs, without the hassle of traditional VPNs. In this article, we’ll walk you through how to use Twingate to secure your connection to a remote LLM while using the Continue extension in VS Code.
Why Secure Your LLM Connection?
You might be wondering why you can’t just expose your remote LLM server to the internet. Here are a few critical reasons to secure the connection:
- Protecting Intellectual Property: The code snippets, prompts, and model responses you work with can contain proprietary information or sensitive business logic. An unsecured connection could expose this data to unauthorized parties.
- Preventing Unauthorized Access: A publicly exposed LLM endpoint is a target. Malicious actors could abuse your LLM for their own purposes, potentially incurring significant costs or using it for nefarious activities.
- Maintaining Compliance: If you work in a regulated industry (like healthcare or finance), data privacy and security are not just best practices; they are legal requirements.
- Ensuring Model Integrity: An unsecured connection could be vulnerable to man-in-the-middle attacks, where an attacker could intercept and even modify the prompts and responses between you and the LLM.
The Secure Architecture: Continue + Twingate + Remote LLM
Here’s a high-level look at the secure architecture we’re going to build:
- Local Machine: Your development machine running VS Code with the Continue extension.
- Remote Server: A server (either on-premises or in the cloud) running your LLM of choice. We’ll use Ollama as an example.
- Twingate: The magic that connects them. Twingate will create a secure, private network. Your local machine and the remote LLM server will both connect to this network, allowing them to communicate as if they were on the same local network, without exposing any ports to the public internet.
This setup ensures that only authenticated and authorized users (i.e., you) can access the LLM, and all traffic is end-to-end encrypted.
Step 1: Setting Up Your Remote LLM Server
First, you need a remote server where your LLM will run. For this guide, we’ll use a DigitalOcean GPU Droplet, which provides a cost-effective way to get the necessary GPU power for running high-performance LLMs. However, you can use any virtual machine in another cloud provider (AWS, GCP, Azure, etc.) or a physical server in your own data center. We’ll assume you have a Linux server ready.
We’ll use Ollama to serve our LLM. Ollama is a fantastic tool that makes it incredibly easy to run open-source LLMs.
- Install Ollama on your remote server:

  curl https://ollama.ai/install.sh | sh

- Pull and run an LLM: Let’s use llama3, a powerful model from Meta.

  ollama run llama3

  This command will download the model and start the Ollama service. By default, Ollama listens on port 11434 on localhost.
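Before wiring anything up to Twingate, it’s worth confirming the model is actually available on the server. A minimal check from the server’s shell, assuming the default Ollama CLI is on your PATH (the prompt text is just an illustration):

```bash
# List the models Ollama has downloaded; llama3 should appear here.
ollama list

# Optional: send a one-off prompt locally to confirm inference works end to end.
ollama run llama3 "Reply with the single word: ready"
```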
Important Security Note: Do not configure Ollama to listen on 0.0.0.0 or expose it to the public internet. The beauty of the Twingate setup is that you don’t need to. The Ollama server should only be reachable from inside your private network, so use the machine’s internal IP address rather than a public one. Twingate’s Connector will handle the secure access.
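A quick way to double-check the binding from the server, assuming a typical Linux distribution with iproute2 (ss) and curl installed; this is a sanity check, not a full audit:

```bash
# Confirm which address Ollama is listening on: you want 127.0.0.1 or the
# machine's private/internal address on port 11434, not 0.0.0.0 or a public IP.
ss -tlnp | grep 11434

# The API should respond locally on the server; /api/tags lists installed models.
curl http://localhost:11434/api/tags
```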
Step 2: Setting Up Twingate
Now for the core of our secure connection. If you don’t have a Twingate account, you can sign up for free at Twingate’s website.
- Create a Network and Add a Remote Network:
  - In your Twingate admin console, give your network a name (e.g., llm-dev-network).
  - Select “On-Premise” or the appropriate cloud option for the location of your remote LLM server.
- Add a Connector:
  - Twingate will guide you to add a Connector. This is a lightweight piece of software that you’ll install on your remote LLM server. It initiates a secure, outbound connection to the Twingate network, so you don’t need to open any inbound firewall ports.
  - Follow the instructions to generate a script to deploy the Connector. It’s usually a single command that you’ll run on your LLM server. For example, it might look something like this (your script will be unique):

    curl "https://binaries.twingate.com/connector/setup.sh" | sudo TWINGATE_ACCESS_TOKEN="{your_access_token}" TWINGATE_REFRESH_TOKEN="{your_refresh_token}" TWINGATE_NETWORK="{your_network_name}" TWINGATE_LABEL_DEPLOYED_BY="linux" bash

  - Once the Connector is running on your LLM server, it will show as connected in your Twingate admin console (a quick way to confirm this from the server itself is shown after this list).
- Define a Resource:
  - Now, you need to tell Twingate about the LLM service. In your Twingate admin console, go to the “Resources” section and add a new Resource.
  - Label: Give it a descriptive name, like Ollama LLM.
  - Address: Set this to the internal IP address of the machine.
  - Port: Set this to 11434.
  - Protocol: Allow TCP.

  This configuration tells Twingate: “When a Twingate client requests {internal_ip}:11434, forward that traffic through the Connector to the remote server.”
- Assign Users:
  - For yourself (and any other developers who need access), grant access to this new Resource.
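Before moving on, you can also check the Connector from the server’s shell. The service name below is an assumption based on the standard Linux (systemd) install script; if you deployed the Connector another way (for example, via Docker), adjust accordingly:

```bash
# Assumed unit name for the systemd-based Connector install; verify with
# `systemctl list-units | grep -i twingate` if it differs on your server.
sudo systemctl status twingate-connector

# Recent Connector logs, useful if it is not showing as connected in the admin console.
sudo journalctl -u twingate-connector --since "15 minutes ago"
```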
Step 3: Configuring Your Local Machine
Now, let’s set up your local development machine to securely connect to the LLM.
- Install the Twingate Client:
  - Download and install the Twingate client for your operating system (Windows, macOS, or Linux) from the Twingate website.
  - Sign in to the client using the same identity provider you used to sign up for Twingate.
  - Once connected, the Twingate client runs in the background, and any traffic destined for Resources you have access to will be automatically and securely routed through Twingate.
- Configure the Continue Extension:
  - In VS Code, open the Continue extension’s configuration file. You can do this by opening the command palette (Cmd+Shift+P or Ctrl+Shift+P) and searching for “Continue: Edit Config”. This will open config.json for Continue.
  - You’ll need to add a new model to the models array. This is where you’ll use the private Twingate address you defined earlier.

  Here’s an example config.json snippet:

    {
      "models": [
        {
          "title": "My Secure Llama3",
          "provider": "ollama",
          "model": "llama3",
          "apiBase": "http://{internal_ip}:11434"
        }
      ]
    }

  - title: A friendly name for your model in the Continue UI.
  - provider: ollama.
  - model: The name of the model you are serving, llama3 in our case.
  - apiBase: This is the key. We’re pointing Continue to the secure Twingate address http://{internal_ip}:11434.

  Save the file. The Continue extension will automatically pick up the new configuration. Before relying on it, you can also sanity-check the tunnel itself from your local machine, as shown below.
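With the Twingate client connected, a quick connectivity check confirms the private route works before the editor is involved. A minimal sketch run from your local machine, where {internal_ip} is the placeholder for the internal address you set on the Twingate Resource:

```bash
# Run on your local machine while the Twingate client is connected.
# Replace {internal_ip} with the address you configured on the Twingate Resource.
curl http://{internal_ip}:11434/api/tags
```

If this returns a JSON list that includes llama3, the tunnel is working; if it hangs or is refused, check that the Resource is assigned to your user and that the Connector is online.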
Step 4: Putting It All Together
With everything configured, let’s see it in action:
- Ensure Twingate is running: Make sure your Twingate client on your local machine is connected.
- Select your secure model in Continue: In the Continue extension pane in VS Code, you should now see “My Secure Llama3” (or whatever title you chose) in the model selection dropdown. Select it.
- Start coding! Ask a question or give a coding instruction in the Continue chat.
What happens now?
- Continue sends the request to http://{internal_ip}:11434.
- The Twingate client on your local machine intercepts this request. Because it matches a Resource you have access to, it securely tunnels the traffic to the Twingate network.
- The Twingate network routes the traffic to the Connector running on your remote LLM server.
- The Connector forwards the request to your Ollama server.
- Ollama processes the request with the llama3 model and sends the response back along the same secure path.
You are now communicating with your remote LLM with end-to-end encryption, and your LLM server is completely invisible to the public internet.
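If you’d like to watch that round trip outside of VS Code, you can replay it by hand from your local machine. A hedged sketch against Ollama’s /api/generate endpoint, with {internal_ip} again standing in for your Resource address and the prompt chosen purely for illustration:

```bash
# Send a single prompt along the same path Continue uses:
# local machine -> Twingate client -> Connector -> Ollama on the remote server.
curl http://{internal_ip}:11434/api/generate \
  -d '{"model": "llama3", "prompt": "Write a one-line Python hello world.", "stream": false}'
```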
Conclusion: Simple, Powerful, Secure
You’ve successfully created a secure, private communication channel between your local development environment and a remote LLM. By leveraging Twingate, you’ve eliminated the need for complex firewall rules, traditional VPNs, or exposing sensitive services to the internet.
This approach provides:
- Zero-Trust Security: Only authenticated and authorized users can access the LLM.
- Simplicity: The setup is straightforward and doesn’t require deep networking expertise.
- Developer Experience: For you, the developer, the experience is seamless. You access the remote LLM as if it were running locally.
As AI becomes more integrated into our development workflows, securing the components of that workflow is paramount. Twingate and Continue provide a powerful combination to help you build, innovate, and code securely.
Beyond VS Code
While this guide uses VS Code as an example, the same principles apply anywhere the Continue extension is supported. This allows you to maintain a consistent, secure, and powerful AI-assisted development workflow across different editors, including the JetBrains suite and Cursor.