Remote LLM Access via Twingate

Andrew Baumbach

Product Marketing Engineer

Illustrated graphic using dark colors with a line art lock and line art brain representing an LLM.

TL;DR: Run a local LLM at home but want to use it on the road? Twingate gives you private, authenticated access to your home rig without exposing Ollama, LM Studio, or Open WebUI to the internet. Twingate docs on adding resources.

You spent a weekend setting up Ollama on the machine with the good GPU. Maybe you wrapped it in Open WebUI, pointed Continue.dev at it from your laptop, and got a clean local Llama 3.1 or Qwen 2.5 Coder workflow going. It's fast, it's private, and it costs you nothing per token.

Then you leave the house.

The model is sitting at home doing nothing while you're on hotel Wi-Fi paying API rates to Anthropic. The obvious fix is also the bad one: forward port 11434, slap on a reverse proxy, hope nobody scans your IP. There's a better way that doesn't involve exposing your LLM to the open internet at all.

Why exposing a local LLM is a bad idea

Local LLM stacks were not designed to be public-facing services. Ollama's default API listens on 127.0.0.1:11434 with no authentication. Open WebUI has accounts, but the underlying inference endpoint usually doesn't. LM Studio's server is the same story.

A few things go wrong the moment you open a port:

  • Anyone who finds the endpoint can use your GPU for free. There are already Shodan dorks and GitHub scripts hunting for exposed Ollama instances, and researchers have found tens of thousands of them sitting open.

  • CVE-2024-37032 ("Probllama") was a remote code execution flaw in Ollama disclosed in mid-2024. Anyone who hadn't patched and had the port open was a target.

  • Even with auth in front of the UI, you're still trusting a single password and a TLS cert you have to renew against the entire internet.

  • Your home IP is now logged in someone else's reconnaissance database.

Tunneling through a reverse proxy with basic auth is better than nothing, but it's still an inbound service on the public internet. The architecture is wrong for what this actually is: a personal resource that two or three people should be able to reach.

The Twingate approach

Twingate treats your home LLM the same way it would treat a private database in a VPC. The resource stays where it is. Nothing about your home network changes from the outside. A small piece of software called a Connector runs inside your network and dials out to Twingate's control plane. Your laptop runs a Client that does the same.

When you try to hit ollama.home or whatever hostname you configure, the Client recognizes the request, checks that you're allowed, and routes it through an encrypted tunnel to the Connector, which forwards it to the Ollama process. No ports open on your router. No DNS records pointing at your house. No public attack surface.

A few properties worth calling out:

  • The Connector only makes outbound connections. Your firewall doesn't need any inbound rules.

  • Authentication is tied to your identity provider (Google, GitHub, Microsoft, etc. via the free tier), not a shared password.

  • Access is per-resource. You can grant your laptop access to the LLM without giving it access to anything else on the network.

  • Traffic goes peer-to-peer when possible and falls back to a relay if NAT traversal fails. Either way, Twingate never sees plaintext.

What you'll need

  • A machine on your home network running a local LLM. Examples: Ollama, LM Studio, llama.cpp's server, vLLM, Open WebUI.

  • A free Twingate account (up to 5 users is plenty for personal use).

  • Docker, or a Linux host where you can install the Connector as a service. A Raspberry Pi works fine.

  • The Twingate Client on the devices you want to use the LLM from.

Setting it up

This is the short version. The full Connector deployment docs cover edge cases.

1. Make sure the LLM is listening on the LAN

By default Ollama binds to localhost only. To make it reachable from the Connector (which will be a separate process or container), bind it to all interfaces:

Set the listen address before starting Ollama:

If you're running it as a systemd service, add the variable to the unit file's [Service] section as Environment="OLLAMA_HOST=0.0.0.0:11434" and reload. Same idea for LM Studio's server settings.

Confirm it works from another machine on the LAN:

curl http://<llm-host-ip>

You should get a JSON list of installed models.

2. Create a Remote Network and Connector in Twingate

In the Twingate admin console, create a Remote Network (call it home or whatever you like), then add a Connector inside it. Twingate will give you a Docker run command or a Linux install script with two tokens baked in.

Run the Connector on a machine that's always on. A Raspberry Pi, a NAS, or the LLM host itself all work. The Docker version looks something like this:

Start the Connector in a Docker container:

docker run -d \
  --sysctl net.ipv4.ping_group_range="0 2147483647" \
  --env TWINGATE_ACCESS_TOKEN="<access-token>" \
  --env TWINGATE_REFRESH_TOKEN="<refresh-token>" \
  --env TWINGATE_NETWORK="<your-network>

It will dial out, register, and show up as connected in the admin console within a few seconds. No inbound firewall rules required.

3. Add the LLM as a Resource

In the same Remote Network, add a Resource. The address can be the LAN IP of the LLM host (e.g. 192.168.1.50) or, better, a hostname like ollama.home. Twingate's split DNS will resolve that name through the Connector even though it doesn't exist publicly.

Assign the Resource to a group that includes your user. If you're the only person using this, the default group is fine.

4. Install the Client and connect

Install the Twingate Client on your laptop, phone, or whatever device you want to use the LLM from. Sign in with the same identity you used for the admin console. Once it's running, requests to ollama.home:11434 will route through the tunnel automatically.

Verify from your laptop while off your home network:

Same JSON response you'd get on the LAN. Point Continue.dev, Open WebUI on another machine, or any OpenAI-compatible client at that hostname and you're done.

What this actually gives you

Before: an LLM that works at home and stops existing the moment you walk out the door, unless you take on the operational and security cost of running a public-facing service.

After:

  • Same hostname works from any network: home, office, hotel, mobile tether.

  • The LLM port is never reachable from the internet. A port scan of your home IP shows nothing new.

  • Access is tied to your identity. Lose your laptop, revoke the device in the console, done.

  • You can share access with a partner or a couple of friends by adding them as users (free up to 5) without giving anyone your Wi-Fi password or VPN credentials.

  • If you eventually run other services (Home Assistant, a Jellyfin instance, a Git server), they slot into the same setup.

The setup takes maybe fifteen minutes if you already have Docker on something. The ongoing maintenance is approximately zero — the Connector updates itself, and there's no certificate to renew or DDNS record to worry about.

A few practical notes

  • If your LLM machine is also the Connector host, you can use localhost:11434 as the Resource address. Cleaner, fewer moving parts.

  • For Open WebUI specifically, you can host the UI itself behind Twingate too. Add it as a separate Resource on whatever port it runs on (usually 3000 or 8080) and access the whole web interface privately.

  • Coding assistants like Continue.dev, Zed's AI features, and Cursor's custom model option all accept an OpenAI-compatible base URL. Point them at http://ollama.home:11434/v1 and they'll happily use your home rig from anywhere.

  • If you want to use the LLM from a phone, the Twingate mobile clients work the same way. Useful for chat-style apps that point at a local endpoint.

Closing

For the full reference on Connectors, Resources, and split DNS, see the Twingate documentation.

New to Twingate? You can use Twingate for free for up to 5 users, request a personalized demo, or reach out to the team over on the Twingate subreddit.

Rapidly implement a modern Zero Trust network that is more secure and maintainable than VPNs.

/

Remote LLM Access

Remote LLM Access via Twingate

Andrew Baumbach

Product Marketing Engineer

Illustrated graphic using dark colors with a line art lock and line art brain representing an LLM.

TL;DR: Run a local LLM at home but want to use it on the road? Twingate gives you private, authenticated access to your home rig without exposing Ollama, LM Studio, or Open WebUI to the internet. Twingate docs on adding resources.

You spent a weekend setting up Ollama on the machine with the good GPU. Maybe you wrapped it in Open WebUI, pointed Continue.dev at it from your laptop, and got a clean local Llama 3.1 or Qwen 2.5 Coder workflow going. It's fast, it's private, and it costs you nothing per token.

Then you leave the house.

The model is sitting at home doing nothing while you're on hotel Wi-Fi paying API rates to Anthropic. The obvious fix is also the bad one: forward port 11434, slap on a reverse proxy, hope nobody scans your IP. There's a better way that doesn't involve exposing your LLM to the open internet at all.

Why exposing a local LLM is a bad idea

Local LLM stacks were not designed to be public-facing services. Ollama's default API listens on 127.0.0.1:11434 with no authentication. Open WebUI has accounts, but the underlying inference endpoint usually doesn't. LM Studio's server is the same story.

A few things go wrong the moment you open a port:

  • Anyone who finds the endpoint can use your GPU for free. There are already Shodan dorks and GitHub scripts hunting for exposed Ollama instances, and researchers have found tens of thousands of them sitting open.

  • CVE-2024-37032 ("Probllama") was a remote code execution flaw in Ollama disclosed in mid-2024. Anyone who hadn't patched and had the port open was a target.

  • Even with auth in front of the UI, you're still trusting a single password and a TLS cert you have to renew against the entire internet.

  • Your home IP is now logged in someone else's reconnaissance database.

Tunneling through a reverse proxy with basic auth is better than nothing, but it's still an inbound service on the public internet. The architecture is wrong for what this actually is: a personal resource that two or three people should be able to reach.

The Twingate approach

Twingate treats your home LLM the same way it would treat a private database in a VPC. The resource stays where it is. Nothing about your home network changes from the outside. A small piece of software called a Connector runs inside your network and dials out to Twingate's control plane. Your laptop runs a Client that does the same.

When you try to hit ollama.home or whatever hostname you configure, the Client recognizes the request, checks that you're allowed, and routes it through an encrypted tunnel to the Connector, which forwards it to the Ollama process. No ports open on your router. No DNS records pointing at your house. No public attack surface.

A few properties worth calling out:

  • The Connector only makes outbound connections. Your firewall doesn't need any inbound rules.

  • Authentication is tied to your identity provider (Google, GitHub, Microsoft, etc. via the free tier), not a shared password.

  • Access is per-resource. You can grant your laptop access to the LLM without giving it access to anything else on the network.

  • Traffic goes peer-to-peer when possible and falls back to a relay if NAT traversal fails. Either way, Twingate never sees plaintext.

What you'll need

  • A machine on your home network running a local LLM. Examples: Ollama, LM Studio, llama.cpp's server, vLLM, Open WebUI.

  • A free Twingate account (up to 5 users is plenty for personal use).

  • Docker, or a Linux host where you can install the Connector as a service. A Raspberry Pi works fine.

  • The Twingate Client on the devices you want to use the LLM from.

Setting it up

This is the short version. The full Connector deployment docs cover edge cases.

1. Make sure the LLM is listening on the LAN

By default Ollama binds to localhost only. To make it reachable from the Connector (which will be a separate process or container), bind it to all interfaces:

Set the listen address before starting Ollama:

If you're running it as a systemd service, add the variable to the unit file's [Service] section as Environment="OLLAMA_HOST=0.0.0.0:11434" and reload. Same idea for LM Studio's server settings.

Confirm it works from another machine on the LAN:

curl http://<llm-host-ip>

You should get a JSON list of installed models.

2. Create a Remote Network and Connector in Twingate

In the Twingate admin console, create a Remote Network (call it home or whatever you like), then add a Connector inside it. Twingate will give you a Docker run command or a Linux install script with two tokens baked in.

Run the Connector on a machine that's always on. A Raspberry Pi, a NAS, or the LLM host itself all work. The Docker version looks something like this:

Start the Connector in a Docker container:

docker run -d \
  --sysctl net.ipv4.ping_group_range="0 2147483647" \
  --env TWINGATE_ACCESS_TOKEN="<access-token>" \
  --env TWINGATE_REFRESH_TOKEN="<refresh-token>" \
  --env TWINGATE_NETWORK="<your-network>

It will dial out, register, and show up as connected in the admin console within a few seconds. No inbound firewall rules required.

3. Add the LLM as a Resource

In the same Remote Network, add a Resource. The address can be the LAN IP of the LLM host (e.g. 192.168.1.50) or, better, a hostname like ollama.home. Twingate's split DNS will resolve that name through the Connector even though it doesn't exist publicly.

Assign the Resource to a group that includes your user. If you're the only person using this, the default group is fine.

4. Install the Client and connect

Install the Twingate Client on your laptop, phone, or whatever device you want to use the LLM from. Sign in with the same identity you used for the admin console. Once it's running, requests to ollama.home:11434 will route through the tunnel automatically.

Verify from your laptop while off your home network:

Same JSON response you'd get on the LAN. Point Continue.dev, Open WebUI on another machine, or any OpenAI-compatible client at that hostname and you're done.

What this actually gives you

Before: an LLM that works at home and stops existing the moment you walk out the door, unless you take on the operational and security cost of running a public-facing service.

After:

  • Same hostname works from any network: home, office, hotel, mobile tether.

  • The LLM port is never reachable from the internet. A port scan of your home IP shows nothing new.

  • Access is tied to your identity. Lose your laptop, revoke the device in the console, done.

  • You can share access with a partner or a couple of friends by adding them as users (free up to 5) without giving anyone your Wi-Fi password or VPN credentials.

  • If you eventually run other services (Home Assistant, a Jellyfin instance, a Git server), they slot into the same setup.

The setup takes maybe fifteen minutes if you already have Docker on something. The ongoing maintenance is approximately zero — the Connector updates itself, and there's no certificate to renew or DDNS record to worry about.

A few practical notes

  • If your LLM machine is also the Connector host, you can use localhost:11434 as the Resource address. Cleaner, fewer moving parts.

  • For Open WebUI specifically, you can host the UI itself behind Twingate too. Add it as a separate Resource on whatever port it runs on (usually 3000 or 8080) and access the whole web interface privately.

  • Coding assistants like Continue.dev, Zed's AI features, and Cursor's custom model option all accept an OpenAI-compatible base URL. Point them at http://ollama.home:11434/v1 and they'll happily use your home rig from anywhere.

  • If you want to use the LLM from a phone, the Twingate mobile clients work the same way. Useful for chat-style apps that point at a local endpoint.

Closing

For the full reference on Connectors, Resources, and split DNS, see the Twingate documentation.

New to Twingate? You can use Twingate for free for up to 5 users, request a personalized demo, or reach out to the team over on the Twingate subreddit.

Rapidly implement a modern Zero Trust network that is more secure and maintainable than VPNs.

Remote LLM Access via Twingate

Andrew Baumbach

Product Marketing Engineer

Illustrated graphic using dark colors with a line art lock and line art brain representing an LLM.

TL;DR: Run a local LLM at home but want to use it on the road? Twingate gives you private, authenticated access to your home rig without exposing Ollama, LM Studio, or Open WebUI to the internet. Twingate docs on adding resources.

You spent a weekend setting up Ollama on the machine with the good GPU. Maybe you wrapped it in Open WebUI, pointed Continue.dev at it from your laptop, and got a clean local Llama 3.1 or Qwen 2.5 Coder workflow going. It's fast, it's private, and it costs you nothing per token.

Then you leave the house.

The model is sitting at home doing nothing while you're on hotel Wi-Fi paying API rates to Anthropic. The obvious fix is also the bad one: forward port 11434, slap on a reverse proxy, hope nobody scans your IP. There's a better way that doesn't involve exposing your LLM to the open internet at all.

Why exposing a local LLM is a bad idea

Local LLM stacks were not designed to be public-facing services. Ollama's default API listens on 127.0.0.1:11434 with no authentication. Open WebUI has accounts, but the underlying inference endpoint usually doesn't. LM Studio's server is the same story.

A few things go wrong the moment you open a port:

  • Anyone who finds the endpoint can use your GPU for free. There are already Shodan dorks and GitHub scripts hunting for exposed Ollama instances, and researchers have found tens of thousands of them sitting open.

  • CVE-2024-37032 ("Probllama") was a remote code execution flaw in Ollama disclosed in mid-2024. Anyone who hadn't patched and had the port open was a target.

  • Even with auth in front of the UI, you're still trusting a single password and a TLS cert you have to renew against the entire internet.

  • Your home IP is now logged in someone else's reconnaissance database.

Tunneling through a reverse proxy with basic auth is better than nothing, but it's still an inbound service on the public internet. The architecture is wrong for what this actually is: a personal resource that two or three people should be able to reach.

The Twingate approach

Twingate treats your home LLM the same way it would treat a private database in a VPC. The resource stays where it is. Nothing about your home network changes from the outside. A small piece of software called a Connector runs inside your network and dials out to Twingate's control plane. Your laptop runs a Client that does the same.

When you try to hit ollama.home or whatever hostname you configure, the Client recognizes the request, checks that you're allowed, and routes it through an encrypted tunnel to the Connector, which forwards it to the Ollama process. No ports open on your router. No DNS records pointing at your house. No public attack surface.

A few properties worth calling out:

  • The Connector only makes outbound connections. Your firewall doesn't need any inbound rules.

  • Authentication is tied to your identity provider (Google, GitHub, Microsoft, etc. via the free tier), not a shared password.

  • Access is per-resource. You can grant your laptop access to the LLM without giving it access to anything else on the network.

  • Traffic goes peer-to-peer when possible and falls back to a relay if NAT traversal fails. Either way, Twingate never sees plaintext.

What you'll need

  • A machine on your home network running a local LLM. Examples: Ollama, LM Studio, llama.cpp's server, vLLM, Open WebUI.

  • A free Twingate account (up to 5 users is plenty for personal use).

  • Docker, or a Linux host where you can install the Connector as a service. A Raspberry Pi works fine.

  • The Twingate Client on the devices you want to use the LLM from.

Setting it up

This is the short version. The full Connector deployment docs cover edge cases.

1. Make sure the LLM is listening on the LAN

By default Ollama binds to localhost only. To make it reachable from the Connector (which will be a separate process or container), bind it to all interfaces:

Set the listen address before starting Ollama:

If you're running it as a systemd service, add the variable to the unit file's [Service] section as Environment="OLLAMA_HOST=0.0.0.0:11434" and reload. Same idea for LM Studio's server settings.

Confirm it works from another machine on the LAN:

curl http://<llm-host-ip>

You should get a JSON list of installed models.

2. Create a Remote Network and Connector in Twingate

In the Twingate admin console, create a Remote Network (call it home or whatever you like), then add a Connector inside it. Twingate will give you a Docker run command or a Linux install script with two tokens baked in.

Run the Connector on a machine that's always on. A Raspberry Pi, a NAS, or the LLM host itself all work. The Docker version looks something like this:

Start the Connector in a Docker container:

docker run -d \
  --sysctl net.ipv4.ping_group_range="0 2147483647" \
  --env TWINGATE_ACCESS_TOKEN="<access-token>" \
  --env TWINGATE_REFRESH_TOKEN="<refresh-token>" \
  --env TWINGATE_NETWORK="<your-network>

It will dial out, register, and show up as connected in the admin console within a few seconds. No inbound firewall rules required.

3. Add the LLM as a Resource

In the same Remote Network, add a Resource. The address can be the LAN IP of the LLM host (e.g. 192.168.1.50) or, better, a hostname like ollama.home. Twingate's split DNS will resolve that name through the Connector even though it doesn't exist publicly.

Assign the Resource to a group that includes your user. If you're the only person using this, the default group is fine.

4. Install the Client and connect

Install the Twingate Client on your laptop, phone, or whatever device you want to use the LLM from. Sign in with the same identity you used for the admin console. Once it's running, requests to ollama.home:11434 will route through the tunnel automatically.

Verify from your laptop while off your home network:

Same JSON response you'd get on the LAN. Point Continue.dev, Open WebUI on another machine, or any OpenAI-compatible client at that hostname and you're done.

What this actually gives you

Before: an LLM that works at home and stops existing the moment you walk out the door, unless you take on the operational and security cost of running a public-facing service.

After:

  • Same hostname works from any network: home, office, hotel, mobile tether.

  • The LLM port is never reachable from the internet. A port scan of your home IP shows nothing new.

  • Access is tied to your identity. Lose your laptop, revoke the device in the console, done.

  • You can share access with a partner or a couple of friends by adding them as users (free up to 5) without giving anyone your Wi-Fi password or VPN credentials.

  • If you eventually run other services (Home Assistant, a Jellyfin instance, a Git server), they slot into the same setup.

The setup takes maybe fifteen minutes if you already have Docker on something. The ongoing maintenance is approximately zero — the Connector updates itself, and there's no certificate to renew or DDNS record to worry about.

A few practical notes

  • If your LLM machine is also the Connector host, you can use localhost:11434 as the Resource address. Cleaner, fewer moving parts.

  • For Open WebUI specifically, you can host the UI itself behind Twingate too. Add it as a separate Resource on whatever port it runs on (usually 3000 or 8080) and access the whole web interface privately.

  • Coding assistants like Continue.dev, Zed's AI features, and Cursor's custom model option all accept an OpenAI-compatible base URL. Point them at http://ollama.home:11434/v1 and they'll happily use your home rig from anywhere.

  • If you want to use the LLM from a phone, the Twingate mobile clients work the same way. Useful for chat-style apps that point at a local endpoint.

Closing

For the full reference on Connectors, Resources, and split DNS, see the Twingate documentation.

New to Twingate? You can use Twingate for free for up to 5 users, request a personalized demo, or reach out to the team over on the Twingate subreddit.