2966 Commits

Author SHA1 Message Date
Theodor Midtlien
ee360963f9 [client] Migrate profile identity from display name to ID and allow renaming of profiles (#6367)
* Migrate to profile ids

* Migrate android profile manager

* Clean up

* Fix review

* Add ID type

* Fix test and runes in ShortID()

* Fix profile switch on up and android comments

* Revert android profile to string id

* Fix feedback

* Fix UI feedback

* Fix id assignment

* Add renaming of profiles

* Fix review

* Remove ui binary

* Fix getProfileConfigPath not validating id

* Change resolve handle order and fix server merge problems

* Fix mdm test
2026-06-18 08:49:19 +02:00
Maycon Santos
8d9580e491 [misc] improve goreleaser with RC handling and update docker builds (#6438)
- introduce variables to avoid publishing latest docker tags and installers
- Refactor .goreleaser.yaml to simplify docker configurations and add environment-driven flags
- removed management debug containers (it was doing only log var)
- Stopped building arm v6 32bits in favor of v7 32 bits for services (not client)
- Add target argument to docker files
2026-06-17 20:13:13 +02:00
Viktor Liu
5bd7c6c7ea [client] Detect and recover from a stalled signal receive stream (#6459) 2026-06-17 18:48:09 +02:00
Zoltan Papp
8ae2cd0a08 [client] Fix ios route notify ordering (#6454)
* [client] fix iOS route-update reordering that black-holed IPv6 on exit-node disable

On iOS the route notifier delivered each prefix update from its own
fire-and-forget goroutine (notify -> `go func`), so Go provided no ordering
guarantee between consecutive updates. It also read currentPrefixes inside
that goroutine without holding the lock, racing the next OnNewPrefixes write.

On exit-node disable the core removes the default routes as two separate
prefix updates (0.0.0.0/0, then the synthesized ::/0). When the two
goroutines were reordered, the stale snapshot still containing ::/0 was
delivered last and clobbered the correct default-free one. iOS then kept the
::/0 default route on the tunnel with no exit node to carry it, black-holing
all IPv6 traffic while IPv4 recovered correctly.

Fix: deliver updates through a single worker goroutine fed by a buffered
channel, preserving production order, and snapshot the joined prefix string
under the mutex so it can't race a concurrent update. Buffered so producers
(which run under the route manager lock) don't block on the listener callback.

* [client] close iOS notifier delivery goroutine on Stop, unbounded queue

The delivery goroutine was never stopped, leaking on every engine
restart. Add Notifier.Close, called from the route manager Stop after
routing cleanup.

Replace the buffered update channel with a cond-driven linked-list
queue so route-update producers (running under the route manager lock)
never block when the listener callback is slow.
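
The delivery scheme described above (one worker goroutine, an unbounded queue guarded by a condition variable, producers that never wait on the listener) could look roughly like the sketch below. The type `orderedNotifier` and its methods are hypothetical names for illustration, not the project's `Notifier` API, and draining the queue on `Close` is an assumption the commit message does not confirm.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// orderedNotifier is a hypothetical stand-in for the notifier described above.
type orderedNotifier struct {
	mu     sync.Mutex
	cond   *sync.Cond
	queue  *list.List // pending snapshots, oldest first
	closed bool
	done   chan struct{}
}

func newOrderedNotifier(deliver func(string)) *orderedNotifier {
	n := &orderedNotifier{queue: list.New(), done: make(chan struct{})}
	n.cond = sync.NewCond(&n.mu)
	go n.run(deliver)
	return n
}

// Enqueue only appends under the lock, so it never waits on the listener.
func (n *orderedNotifier) Enqueue(snapshot string) {
	n.mu.Lock()
	defer n.mu.Unlock()
	if n.closed {
		return
	}
	n.queue.PushBack(snapshot)
	n.cond.Signal()
}

// Close lets the worker drain what is already queued, then waits for it to exit.
// (Assumption: the real Close may discard pending updates instead.)
func (n *orderedNotifier) Close() {
	n.mu.Lock()
	n.closed = true
	n.cond.Signal()
	n.mu.Unlock()
	<-n.done
}

// run is the single delivery goroutine; it hands out snapshots in enqueue order.
func (n *orderedNotifier) run(deliver func(string)) {
	defer close(n.done)
	for {
		n.mu.Lock()
		for n.queue.Len() == 0 && !n.closed {
			n.cond.Wait()
		}
		if n.queue.Len() == 0 {
			n.mu.Unlock()
			return
		}
		snapshot := n.queue.Remove(n.queue.Front()).(string)
		n.mu.Unlock()
		deliver(snapshot) // called outside the lock
	}
}

func main() {
	var got []string
	n := newOrderedNotifier(func(s string) { got = append(got, s) })
	n.Enqueue("10.0.0.0/8,::/0") // snapshot taken by the producer, under its own lock
	n.Enqueue("10.0.0.0/8")
	n.Close()
	fmt.Println(got)
}
```

Because only one goroutine calls the listener, a stale snapshot cannot overtake a newer one, which is the reordering the first part of this commit set out to remove.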
2026-06-17 18:29:33 +02:00
Pascal Fischer
e4397d4d46 [management] remove nmap calc from login (#6449) 2026-06-17 16:37:24 +02:00
Viktor Liu
6fbc90b4d3 [client, relay] Expose relay transport and connection errors in status and metrics (#6342) 2026-06-17 15:41:48 +02:00
Riccardo Manfrin
5095e17cc5 [management] fix flaky Test_SaveAccount_Large from random IP collision (#6452) 2026-06-17 14:00:50 +02:00
Zoltan Papp
6df0175607 [client] Add IsLoginRequiredCached for iOS mobile client (#6447)
Expose a network-free login-required check backed by the in-memory status
recorder. Unlike IsLoginRequired(), which creates a fresh auth client and
performs a blocking network call, IsLoginRequiredCached() reports whether the
LAST observed management error was an auth failure (PermissionDenied/
InvalidArgument).

This lets the iOS connection listener detect a mid-session token expiry from
within onDisconnected during teardown without blocking on a slow or
unavailable network.
2026-06-16 16:15:19 +02:00
Zoltan Papp
3c23700e56 [client] Add iOS debug bundle support in Go (#6270)
* Add iOS debug bundle support in Go

Thread cacheDir through NewClient -> RunOniOS -> MobileDependency.TempDir
so the iOS client can pass its sandbox-writable cache directory for
debug bundle zip file creation instead of os.TempDir().

Move log collection into platform-dispatched addPlatformLog():
- iOS: adds the file-based Go client log (with rotation, stderr/stdout
  companions and anonymization handled by addLogfile) plus the Swift app
  log (swift-log.log) written by the iOS app into the same log directory
- Other non-Android platforms: existing file-based log + systemd fallback

Narrow the debug_nonandroid.go build tag to !android && !ios so iOS no
longer attempts the systemd journal fallback.

Add a DebugBundle() entry point to the iOS Go client that generates a
bundle, uploads it and returns the upload key. It works with or without
a running engine: when the engine is up it reuses the live config, sync
response and client metrics; otherwise it loads the config from disk (or
the preloaded tvOS config). Guard the live config/ConnectClient behind a
state mutex since DebugBundle may run on a different thread.

* Include the iOS state file in the debug bundle

addStateFile() resolved the state path via ServiceManager.GetStatePath(),
which on iOS points at a hard-coded default that does not exist in the app
sandbox, so the state file was silently skipped.

Add an optional StatePath to GeneratorDependencies and use it when set,
falling back to the ServiceManager default otherwise. The iOS DebugBundle
passes the client's actual state file path (the App Group profile state),
matching the Android bundle which includes the state file.

* ios: enable sync response persistence for debug bundle

Turn on sync response persistence before starting the engine so
DebugBundle can include the network map. On iOS the store is disk-backed
(see syncstore) to keep the map out of the constrained process memory.

* ios: pass log file path through NewClient constructor (#6393)

Add logFilePath field to Client struct and expose it as a parameter
in NewClient so callers provide the Go log path at construction time.
Wire it into DebugBundle via GeneratorDependencies.LogPath so the
debug bundle includes client.log and swift-log.log regardless of
whether the bundle is triggered by the app or the management server.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* ios: pass log file path to engine for remote debug bundles

RunOniOS started the engine with an empty LogPath, so EngineConfig.LogPath
was never set. Management-triggered (jobs) debug bundles read the log path
from the engine config, so they collected no client logs (client.log,
rotated logs, swift-log.log). The GUI path was unaffected because it passes
c.logFilePath directly to the bundle generator.

Thread c.logFilePath through RunOniOS into the engine config so remote
bundles include the client logs too.

---------

Co-authored-by: evgeniyChepelev <68751844+evgeniyChepelev@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-16 15:54:46 +02:00
Pascal Fischer
38ad2b67e8 [proxy] fix context for udprelay (#6444) 2026-06-16 14:41:17 +02:00
Pascal Fischer
01aa49433e [management] delete targets when deleting exposed service (#6442) 2026-06-16 14:33:24 +02:00
Zoltan Papp
08a2b63675 [client] propagate exit-node deselect to synthesized v6 (::/0) route (#6296)
* [client] propagate exit-node deselect to synthesized v6 (::/0) route

When a client deselects an IPv4 exit node, the auto-generated IPv6 default
route (::/0) was still selected and pushed onto the tunnel interface, even
though the user disabled the exit node. On an exit node without a real IPv6
egress this blackholes IPv6 traffic, and because clients prefer IPv6 (happy
eyeballs) it can break general connectivity.

Root cause: the synthesized v6 route gets a different NetID than its v4 base
(base + "-v6"). The route selector keys deselects by NetID and defaults
unknown NetIDs to selected, so the "-v6" entry was never matched by the v4
deselect. The effectiveNetID() mirror that solves exactly this is used by
HasUserSelectionForRoute and FilterSelectedExitNodes, but categorizeUserSelection
called the raw IsSelected(), bypassing it and mis-categorizing the v6 pair as
user-selected.

Add RouteSelector.IsSelectedForExitNode(), which applies effectiveNetID before
the selection check, and use it in categorizeUserSelection. IsSelected() is left
untouched so non-exit code paths don't make unrelated "*-v6" routes inherit v4
state. Adds regression tests for the v4/v6 deselect mirror and explicit-v6
override.

* [client] add DIAG logging to trace exit-node v6 (::/0) route filtering

Temporary diagnostics to find why a deselected v4 exit node's synthesized
::/0 route still reaches the tunnel. Logs the full install path: incoming
client networks, route-selector state before/after the management-driven
update, what updateExitNodeSelections deselects/selects, and per-route
KEEP/SKIP/DROP decisions in FilterSelectedExitNodes and applyExitNodeFilter.
To be reverted once the real root cause is confirmed from a client log.

* [client] clear orphaned v6 exit selection when v4 pair is toggled

Root cause of the leaking ::/0 route, confirmed from client logs: the
synthesized "-v6" exit route could stay explicitly selected in the persisted
route-selector state while its v4 base was deselected (selected=[...-v6],
deselected=[...v4base]). Because the v6 entry then has its own explicit state,
effectiveNetID stops mirroring the v4 base, so FilterSelectedExitNodes keeps
::/0 and it is installed on the tunnel even though the user disabled the exit
node. This happened because the iOS SDK's deselect only pairs the "-v6" sibling
via ExpandV6ExitPairs when the v6 route is present in the current routesMap; a
deselect at a moment it wasn't expanded left the v6 selection orphaned.

Fix at the selector write path so it is independent of routesMap timing: when a
v4 exit NetID is selected or deselected, clear any orphaned explicit state on
its "-v6" sibling (clearPairedV6Locked), unless the sibling is part of the same
batch (the deliberate ExpandV6ExitPairs case). The v6 then falls back to
inheriting the v4 base via effectiveNetID, so a v4 deselect also drops ::/0 and
a v4 select brings both back.

Adds regression tests: a stale explicit v6 selection is cleared by a later v4
deselect, and an explicit v6 select made in the same batch is preserved.

* [ios] compute route connection status in the bridge

The iOS bridge exposed a route's Network as a possibly comma-joined string
("0.0.0.0/0, ::/0" for a merged exit node) but no connection status, forcing
the UI to infer status by string-matching that joined value against peer
routes — which never matched for the merged exit node, leaving it stuck as
not-connected. Android already computes status in the core (findBestRoutePeer).

Mirror that here: add a Status field to RoutesSelectionInfo and compute it from
the connected peers' route tables, matching the route's primary prefix, a merged
exit node's extra v6 prefix, or a dynamic route's domain pattern (the key the
route manager records). The UI can now read the status directly.

* [client] remove exit-node v6 DIAG logging and tidy routeselector

Drop the temporary DIAG diagnostics added to trace the leaking ::/0 route
(the root cause is fixed and confirmed). Also reorganize routeselector.go so
the exit-node helpers (clearPairedV6Locked, isExitNode) sit next to the
exit-node code paths and MarshalJSON/UnmarshalJSON are grouped together.

* [client] mirror v4 exit selection onto v6 pair at write time

The synthesized "-v6" exit route shares its v4 base's NetID plus a "-v6"
suffix. Selection state was reconciled at read time via effectiveNetID, a
mirror that could only be applied on exit-node code paths, which forced a
parallel IsSelectedForExitNode() alongside IsSelected() and a clearPairedV6Locked()
orphan cleanup on every toggle. That machinery still missed the case observed
in the field: a persisted state with the v4 base deselected but its "-v6"
sibling explicitly selected (orphaned). Because effectiveNetID returns the v6
entry itself once it carries explicit state, and clearPairedV6Locked only fires
on a live toggle, the loaded orphan survived and the ::/0 route leaked onto the
tunnel despite the exit node being disabled, breaking IPv6 (happy eyeballs).

Treat the v4/v6 exit pair as a single toggle and keep state consistent at write
time instead. RouteSelector.SyncPairedSelection forces the "-v6" entry to match
its v4 base unconditionally, resetting any orphaned explicit state. The route
manager, which knows the route prefixes, computes the pairs (V6ExitMergeSet) and
calls it from updateRouteSelectorFromManagement before selection is read, so both
collectExitNodeInfo and FilterSelectedExitNodes see consistent state, including
pairs loaded from persisted selector state.

This removes effectiveNetID, IsSelectedForExitNode and clearPairedV6Locked; the
selector is literal again and no longer needs the "exit-node paths only" caveat.
HasUserSelectionForRoute and applyExitNodeFilter use the raw NetID.

Adds a selector test for SyncPairedSelection (including the orphaned-v6 case) and
a route-manager test reproducing the persisted-orphan scenario from the field log.
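
A much-simplified sketch of the write-time mirror, assuming a selector that stores only an explicit deselect set. The `selector` type here is hypothetical; only the name `SyncPairedSelection` comes from the commit message, and its real signature may differ.

```go
package main

import "fmt"

// selector is a hypothetical, literal selection store keyed by NetID.
// Anything not explicitly deselected counts as selected.
type selector struct {
	deselected map[string]bool
}

func (s *selector) IsSelected(id string) bool { return !s.deselected[id] }

// SyncPairedSelection forces paired to the same state as base,
// discarding whatever explicit state paired carried before.
func (s *selector) SyncPairedSelection(base, paired string) {
	if s.deselected[base] {
		s.deselected[paired] = true
		return
	}
	delete(s.deselected, paired)
}

func main() {
	// Persisted orphan: v4 base deselected, "-v6" sibling still selected.
	s := &selector{deselected: map[string]bool{"exit": true}}
	fmt.Println("before:", s.IsSelected("exit"), s.IsSelected("exit-v6"))

	s.SyncPairedSelection("exit", "exit-v6")
	fmt.Println("after: ", s.IsSelected("exit"), s.IsSelected("exit-v6"))
}
```

Running the sync before any selection is read means later lookups can use the raw NetID, with no read-time mirroring.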

* [client] add DIAG logging to trace v6 exit-pair mirror

The write-time mirror did not eliminate the leak in field testing. Re-add the
DIAG diagnostics around the exit-node selection flow to capture a fresh trace:

- UpdateRoutes: incoming client networks, selector state before/after the
  management update, and the networks remaining after FilterSelectedExitNodes.
- mirrorV6ExitPairSelections: the NetIDs present in this update and the v6 pairs
  V6ExitMergeSet derives from them (reveals whether the v4 base and its ::/0 pair
  are present in the same update so the pair can be matched).
- SyncPairedSelection: the base/paired state before and after the sync.
- FilterSelectedExitNodes / applyExitNodeFilter: per-route SKIP/KEEP/DROP and the
  selection lookups behind each decision.
- updateExitNodeSelections / logExitNodeUpdate: categorization and deselect set.

Temporary; to be removed once the root cause is confirmed.

* [client] remove v6 exit-pair mirror DIAG logging

Drop the temporary DIAG diagnostics added to trace the v4/v6 exit-pair mirror.
The field log confirmed the write-time mirror keeps the pair consistent (the
::/0 route is only ever applied alongside its v4 base and is dropped on deselect),
so the diagnostics are no longer needed.
2026-06-16 12:27:58 +02:00
Maycon Santos
b3f9e6588a [management] sync openapi spec and test for diff on workflows (#6437)
* [management] sync openapi spec and test for diff on workflows

* [management] pin oapi-codegen version to v2.7.1
2026-06-15 17:53:25 +02:00
Pascal Fischer
967e2d6864 [management] network map for affected peers (#6105) 2026-06-15 17:43:22 +02:00
Zoltan Papp
e7c1d364c3 [management] treat ci- builds as development for remote jobs (#6436)
* fix(management): treat ci- builds as development for remote jobs

CI snapshot builds use a "ci-<sha>" version string that did not match
IsDevelopmentVersion, so the remote-jobs minimum-version gate rejected
them. Recognize the "ci-" prefix as a development build.

* fix(management): treat dev- builds as development for remote jobs

Dev snapshot builds use a "dev-<sha>" version string that did not match
IsDevelopmentVersion, so the remote-jobs minimum-version gate rejected
them. Recognize the "dev-" prefix as a development build, alongside the
existing "ci-" prefix.
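
A minimal sketch of the prefix check, assuming the version is a plain string. `isSnapshotBuild` is a hypothetical helper and covers only the two prefixes named above; the project's `IsDevelopmentVersion` may recognize other forms as well.

```go
package main

import (
	"fmt"
	"strings"
)

// snapshotPrefixes lists the snapshot-build prefixes named in the commit message.
var snapshotPrefixes = []string{"ci-", "dev-"}

// isSnapshotBuild is a hypothetical helper, not the project's IsDevelopmentVersion.
func isSnapshotBuild(version string) bool {
	for _, p := range snapshotPrefixes {
		if strings.HasPrefix(version, p) {
			return true
		}
	}
	return false
}

func main() {
	for _, v := range []string{"ci-1a2b3c4", "dev-1a2b3c4", "0.72.4"} {
		fmt.Printf("%s -> %t\n", v, isSnapshotBuild(v))
	}
}
```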
2026-06-15 17:22:40 +02:00
Viktor Liu
a44198fd77 [client] Add dialWebSocket method to WASM client (#5980) 2026-06-15 16:43:24 +02:00
Viktor Liu
b57f714350 [client] Drop signaling-side ICE candidate filter, drop overlay STUN at mux read-side instead (#6142) 2026-06-15 16:37:03 +02:00
Viktor Liu
f893abc41d [client] Recover from tun device read/write panics and restart the client (#6419) 2026-06-15 16:36:00 +02:00
Lee Sang Hoon
60067619a1 [proxy] Keep custom TCP listeners alive after mapping batches (#6415) 2026-06-15 12:21:24 +02:00
Bethuel Mmbaga
cd777395f2 [management] Skip JWT group evaluation for embedded-IdP local users (#6422)
When JWT group sync is enabled with a restrictive JWTAllowGroups list, the local owner of an embedded-IdP (Dex) deployment can get locked out. The allow-groups check runs account-wide but local password users do not receive
external IdP group claims, so they can't satisfy the allowed list.

This skips JWT group evaluation for local Dex users so the restriction and JWT group sync continue to apply to external-IdP users as intended.
2026-06-15 12:01:54 +03:00
Viktor Liu
b19467e3af [client] Answer NODATA when a host resolves without addresses of the requested family (#6418) 2026-06-12 14:50:46 +02:00
Riccardo Manfrin
2bcea9d582 [client] add MDM configuration profile support (Windows registry + macOS plist) (#6374)
* Initial scaffolding

* Applies MDM override

* Unit tests

* Helpers business logic

* Return error if trying to modify any config that is gated by MDM

* Add ManagedFields to returned config over GetConfig

* Adds initial 101 MDM policy business logic testing

* gRPC MDM changes

* MDM Name scoping for clarity

* Implements windows loading of MDM policy

* Adds missing WGPort config

* Cleanup setupKey to align to linear

* Align split tunnel code

* Adds some log

* Prefix every log with MDM

* Adds debug config cobra command

This can be useful for troubleshooting and checking config
now that its resolution is not trivial

defaults > config > env vars > CLI/UI > MDM

* Adds MDM 1m diff checker & reloader

* Adds also up/start after cancel

* Publishes event for UI to sync upon MDM changes

* Add events to resync UI to actual config

This also provides a fixup for the UI not aligning to changed config when coming from cli up with config flags.

* UI behavior conflicts relaxation

UI sends full config snapshot with all values. It doesn't
make sense to block it if the values are aligned with the
values constrained by the MDM policy. It's just simpler
to allow values that are compliant. (this goes for the CLI
as well at this point)

* Lock toggle Settings

* Advanced Settings locking

* Fixup presharedkey

* Apply MDM locks

* Toggle gray in/out for Advanced Settings

* Adds support for disabling of Profiles and UpdateSettings feature flags

* Adds Gate Login as well when --disable-update-settings=true is given to service

This commit tries to settle things with an old PR-4237 which had relaxed
the case where the SetConfig returned an `Unavailable` code error.

Under this circumstance the PR allowed the upFunc to just emit a warning and
progress further with the login gRPC. Since the login call is consuming
the --management-url coming from the `up` command, it might be possible
to abuse the "Unavailable" code to inject a management URL that is different
from the configured one even though the --disable-update-settings is set
to true (?)

* Evaluate disable-update-settings errors only when there's an actual override

* [UI] Fixup advanced Settings

* [UI] Fixup for preshared key

* [UI] Fixup for profile enable/disable toggle

We need to align the initial state to evaluate the delta in case.

The initial state has to be "true" since the profile starts visible.
Then we receive MDM and transition the cache bool value to the actual
MDM imposed state

* Enforces disable networks

* [UI] Aligns to "enable/disable once on change only"

* Fixup: MDM wins. always

* Removes --disable-advanced-settings

It was a typo in our meetings. the actual thing is --disable-update-settings

* [PROTO] Removes --disable-advanced-settings

* [UI] Removes --disable-advanced-settings

* Pins feat profile retrieval to notif event

* [UI] Fix for "hide" not working when propagating to parent with children

* Adds dep for reading plist files

* Introduces support for darwin plist loading

* Tests MDM config reload via ticker

* [PROVISIONING] ADMX/ADML/PS/bash scripts/templates

* CI fixes

- Add docstrings to `mdm_integration`
- refactor for cognitive complexity
- mod tidy

* Linting

* Add docstrings to `mdm_integration`

* nil,nil is no policy and no error. Allow it

* nil,nil is no policy and no error. Allow it

* exclude MDM profile administered keys data from debug bundle

* Fixes Rosenpass left disabled after MDM unlock

* Partial revert coderabbit added docstrings

* Renaming fix

* Avoid locking on clientRunning bool when the connection is aborted for whatever reason

We want to just signal this through the giveUpChan, we will manage the signal from
the waiter side and in case set it to false there. This way we avoid locking,
which should allow the MDM down+wait_for_term_chan_signal_+up procedure

clientRunning is used to signal two different conditions here:

1. the initialization procedure is over (we have an engine)
2. the connection being up (or being attempted)

Probably these two functionalities should not alias, and the failure of the second condition
(because of any error) should just drive a reconnection (currently it's not happening,
and we silently go idle).
OR, more probably, the two things are the SAME and there should not exist a case where
we did the "Up" initialization and connection attempt but we are not still attempting it.

* Moves test helper at the very bottom

* Addresses github comments

* No lock no copy

* Prevents engine not stopping within 10 secs from being paired by another instance

We instead just SKIP updating the policy, so
1. the MDM ticker will kick in 1 minute time,
2. find the policy misaligned,
3. enter the onMDMPolicyChange,
4. find the s.clientRunning == true
   (because it is set to false only in server cleanupConnection,
   and not by s.actCancel())
5. call s.actCancel() again if not nil
6. immediately return from <-s.clientGiveUpChan
7. finally call s.restartEngineForMDMLocked()

* Since we ARE running there should be a config

If the config was cancelled midflight, connect will abort later on

* DisableAutoConnect should not stop a running connection.

DisableAutoConnect should just avoid the connection attempts *when the service starts*.
If we are started and we are up and running, DisableAutoConnect should not kick in.

Another PR will follow about this topic

* Removes unused vars

* Moves callback into Run method arg

* align comment to removal of DisableAutoConnect

DisableAutoConnect should just avoid the connection attempts *when the service starts*.
If we are started and we are up and running, DisableAutoConnect should not kick in

* Removes unused managed_fields data.

This was initially used to drive the UI but approach changed
to reload config/features upon notifications which makes this data redundant.

* Reorder stuff

* Unexport unrequired vars/functions

PoliciesEqual → policiesEqual
AllKeys → allKeys

* Adds list of MDM managed fields in the debug bundle
2026-06-12 12:28:49 +02:00
Maycon Santos
8ff3b06cf1 [client] Index peer tunnel IPs for faster PeerStateByIP lookup (#6412)
* [client] Index peer tunnel IPs for O(1) PeerStateByIP lookup

Replace the linear scan over all peers with an ipToKey map maintained
by AddPeer/RemovePeer, covering both IPv4 and IPv6 tunnel addresses.

Offline peers are intentionally no longer resolvable by IP: only active
peers can carry traffic, so IdentityForIP and the DNS disconnected-peer
filter now treat them as unknown, same as foreign IPs.

Skip the DNS answer filter for single-record responses; dropping the
only answer was always restored by the empty-answer escape hatch, so
the fast path is behavior-neutral.

* Ensure `ipToKey` entries are only removed if they match the peer being deleted, preventing accidental removal of unrelated mappings.
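
The indexing idea can be sketched as follows, assuming tunnel addresses arrive as strings and peers are identified by a key string. `peerIndex` and `KeyByIP` are hypothetical names; only `ipToKey`, `AddPeer` and `RemovePeer` appear in the commit message, and their real signatures are not shown there.

```go
package main

import (
	"fmt"
	"net/netip"
	"sync"
)

// peerIndex is a hypothetical stand-in for the status recorder's IP index.
type peerIndex struct {
	mu      sync.RWMutex
	ipToKey map[netip.Addr]string // tunnel address -> peer key
}

func newPeerIndex() *peerIndex {
	return &peerIndex{ipToKey: make(map[netip.Addr]string)}
}

// AddPeer indexes every parsable tunnel address (IPv4 or IPv6) for the peer.
func (p *peerIndex) AddPeer(key string, ips ...string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for _, ip := range ips {
		if addr, err := netip.ParseAddr(ip); err == nil {
			p.ipToKey[addr] = key
		}
	}
}

// RemovePeer deletes only the entries that still point at this peer,
// so an address already reassigned to another peer is left alone.
func (p *peerIndex) RemovePeer(key string, ips ...string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for _, ip := range ips {
		addr, err := netip.ParseAddr(ip)
		if err != nil {
			continue
		}
		if p.ipToKey[addr] == key {
			delete(p.ipToKey, addr)
		}
	}
}

// KeyByIP is a single map lookup instead of a scan over all peers.
func (p *peerIndex) KeyByIP(ip string) (string, bool) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", false
	}
	p.mu.RLock()
	defer p.mu.RUnlock()
	key, ok := p.ipToKey[addr]
	return key, ok
}

func main() {
	idx := newPeerIndex()
	idx.AddPeer("peerA", "100.64.0.1", "fd00::1")
	idx.AddPeer("peerB", "100.64.0.1") // address reassigned to peerB
	idx.RemovePeer("peerA", "100.64.0.1", "fd00::1")
	fmt.Println(idx.KeyByIP("100.64.0.1"))
	fmt.Println(idx.KeyByIP("fd00::1"))
}
```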
v0.72.4
2026-06-12 10:24:15 +02:00
Maycon Santos
d7703767d5 [client, proxy] cancel context before stopping engine on embedded client (#6397)
- Engine.Start takes syncMsgMux with a deferred unlock (engine.go:445) and parks in receiveSignalEvents → WaitStreamConnected (engine.go:1762), which only wakes on signal-stream connect or client-context cancellation.
- When signal never connects, the 30s startup timeout fires and embed.Client.Start's rollback (embed.go:281) called client.Stop() → Engine.Stop, which blocks acquiring syncMsgMux (engine.go:318). The cancel() that would unpark Start was deferred until Start returned, a permanent cycle. RemovePeer calls (g43/g385) then queue behind the lifecycle mutex.
- Notably, embed.Client.Stop and the daemon's cleanupConnection both cancel before stopping; the startup rollback was the only path that didn't.
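
The ordering problem can be reproduced in miniature. The sketch below is an assumption-laden model, not the project's code: `engine`, `Start`, `Stop` and `rollbackCompletes` are hypothetical, with one mutex standing in for syncMsgMux and a context wait standing in for WaitStreamConnected.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// engine models only the lock interaction described above.
type engine struct{ mu sync.Mutex }

// Start holds the mutex until the context is cancelled, like a Start
// that parks while waiting for a stream that never connects.
func (e *engine) Start(ctx context.Context, parked chan<- struct{}) {
	e.mu.Lock()
	defer e.mu.Unlock()
	close(parked)
	<-ctx.Done()
}

// Stop needs the same mutex, so it blocks for as long as Start is parked.
func (e *engine) Stop() {
	e.mu.Lock()
	defer e.mu.Unlock()
}

// rollbackCompletes reports whether a cancel-then-stop rollback
// finishes within the given number of milliseconds.
func rollbackCompletes(timeoutMs int) bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	e := &engine{}
	parked := make(chan struct{})
	go e.Start(ctx, parked)
	<-parked // Start now holds the mutex

	done := make(chan struct{})
	go func() {
		cancel() // unpark Start first so it releases the mutex
		e.Stop() // now able to acquire it
		close(done)
	}()

	select {
	case <-done:
		return true
	case <-time.After(time.Duration(timeoutMs) * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("rollback completed:", rollbackCompletes(2000))
}
```

Swapping the two calls in the rollback goroutine (Stop before cancel) makes Stop wait on a mutex that Start will never release, which is the cycle the commit removes.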
v0.72.3
2026-06-10 21:26:54 +02:00
Maycon Santos
7feda907ca [management] fix L4 service update when no custom port (#6396)
This fixes an issue where L4 service update is not possible when proxy clusters don't support custom ports
2026-06-10 18:55:24 +02:00
Maycon Santos
62da482133 [management] Add version gate to stop sending deprecated RemotePeers field (#6371)
* [management] Add version gate to stop sending deprecated RemotePeers field

Don't send top-level remote peers to peers running v0.29.3 or newer

* precompute deprecated remote peers version constraint

* [management] update tests to validate network map-based remote peers

* [management] move deprecatedRemotePeersVersion constant closer to its usage

* fix misplaced precomputed constraint definition

* ensure top-level RemotePeers is empty for v0.29.3+ clients
2026-06-10 16:59:09 +02:00
Philip Laine
079bce3c2f Add commands to discover and write Kubernetes configuration (#6260) 2026-06-10 15:00:10 +02:00
Maycon Santos
1a09aa6715 [misc] Update Go toolchain version in go.mod (#6377) 2026-06-10 14:50:57 +02:00
Maycon Santos
61abf5b9ea [proxy] Use UUID for proxy ID generation (#6391)
Use UUID for proxy ID instead of the second to avoid race conditions when running multiple nodes at the same time.
2026-06-10 13:35:26 +02:00
Boris Dolgov
e229050ba3 [proxy] Notify certificate ready for domains covered by the static certificate (#6389) 2026-06-10 12:05:34 +02:00
Zoltan Papp
e919b2d55d [client] Preserve posture checks on config-only sync updates (#6373)
* [client] Preserve posture checks on config-only sync updates

When management sends a MessageTypeControlConfig update (e.g. relay token
rotation), the SyncResponse carries no NetworkMap and no Checks. Moving the
updateChecksIfNew call after the nm == nil guard ensures posture checks are
only updated when a full network map is present, preventing relay token
rotation from silently clearing the previously applied posture check state.

* [client] Clarify posture check update logic with explicit comment

* [client] Extract NetBird config and sync persistence into helpers

Move the NetbirdConfig handling block out of handleSync into
updateNetbirdConfig and the sync response persistence into
persistSyncResponse, mirroring updateChecksIfNew. This flattens
handleSync and makes the individual update steps unit-testable.
2026-06-10 11:43:24 +02:00
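The ordering change described in commit e919b2d55d above can be sketched as follows. This is a simplified model under stated assumptions: the types SyncResponse, NetworkMap and Engine and the method handleSync are hypothetical stand-ins, not the client's real definitions. The point is only that the checks update sits after the nil guard.

```go
package main

import "fmt"

// Hypothetical, heavily reduced stand-ins for the real message types.
type NetworkMap struct{ Serial uint64 }

type SyncResponse struct {
	NetworkMap *NetworkMap // nil on config-only updates
	Checks     []string    // empty on config-only updates
}

type Engine struct {
	checks []string
}

// handleSync applies one sync update. Posture checks are only replaced
// when the update carries a full network map.
func (e *Engine) handleSync(resp SyncResponse) {
	// Config handling (e.g. relay token rotation) would happen here.

	if resp.NetworkMap == nil {
		// Config-only update: leave the applied posture checks untouched.
		return
	}
	e.checks = resp.Checks
}

func main() {
	e := &Engine{}
	e.handleSync(SyncResponse{NetworkMap: &NetworkMap{Serial: 1}, Checks: []string{"os-version"}})
	e.handleSync(SyncResponse{}) // config-only update, no map and no checks
	fmt.Println(e.checks)        // [os-version]
}
```

Had the checks assignment come before the guard, the second call would have reset the checks to an empty list.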
Pascal Fischer
a40028092d [management] log user agent and return request id (#6380) 2026-06-09 15:24:26 +02:00
Pascal Fischer
13200265d8 [proxy] Add non-blocking mapping updates (#6369) 2026-06-09 13:57:17 +02:00
Viktor Liu
ed7a9363aa [management] Emit IPv6 default permit firewall rule for exit node routes (#6368) 2026-06-09 13:26:43 +02:00
Viktor Liu
d56859dc5d [client] Filter DNS fallback upstreams matching our server IP to prevent loops (#6183) 2026-06-09 12:26:03 +02:00
Viktor Liu
367d37050b [relay, client] Fall back to WebSocket relay transport on oversized QUIC datagrams (#6339) 2026-06-09 10:25:46 +02:00
Viktor Liu
106527182f [client] Snapshot iptables rule maps before persisting state (#6345) 2026-06-09 10:24:51 +02:00
Viktor Liu
8e1d5b78c2 [client] Preserve user deselect-all across management route sync (#6363) 2026-06-09 10:24:17 +02:00
PizzaLovingNerd
d3b63c6be9 [infrastructure] Better support for atomic distros in install.sh, docker fixes in getting-started.sh (#6139)
* Made the docker check first for getting-started.sh, better atomic support for install.sh

* Check for docker socket perms

* Added fallback for systems without rpm-ostree or bootc.

* macOS fix for docker socket check

* Change error message for docker group.

No longer using a blanket recommendation for the docker group.
2026-06-08 21:38:46 +02:00
Maycon Santos
60d2fa08b0 [client] Mask sensitive data in debug bundle creation (#6364)
* [client] Mask sensitive data in debug bundle creation

* Avoid nil reference in turn and use masked constant
2026-06-08 13:17:04 +02:00
Maycon Santos
1e7b16db0a [management] resolve private services on custom domains in synthesized DNS zones (#6348)
Private services on a custom domain didn't resolve on clients: the synthesized DNS zone was anchored to the cluster, and the account's custom domains weren't even loaded.

- account.go: SynthesizePrivateServiceZones now keys zones by a resolved apex (privateServiceDomainZone): cluster suffix → registered account.Domains (filtered by matching TargetCluster, longest wins) → skip if none. One zone per apex; custom-domain services group under their registered domain.
- sql_store.go: GetAccount now loads account.Domains on both loaders (gorm Preload("Domains") + pgx goroutine via ListCustomDomains; errChan buffer bumped 12→16). This was the reason the deploy didn't work: the relation was empty in prod.
- Tests: custom-domain zone synthesis cases (apex resolution, free+custom separation, sibling collapse, cluster mismatch, mixed cluster/custom/public) + GetAccount domain-preload tests on sqlite and Postgres.
v0.72.2
2026-06-06 12:56:01 +02:00
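The apex resolution order listed for account.go in commit 1e7b16db0a above (cluster suffix, then registered domains on the same cluster with the longest match winning, else skip) might look roughly like this. It is a sketch under assumptions: customDomain, resolveApex and hasZoneSuffix are hypothetical names, and treating the cluster address itself as the apex in the first step is a guess at what "cluster suffix" means.

```go
package main

import (
	"fmt"
	"strings"
)

// customDomain is a hypothetical reduction of a registered account domain.
type customDomain struct {
	Domain        string
	TargetCluster string
}

// hasZoneSuffix reports whether name equals zone or is a subdomain of it.
func hasZoneSuffix(name, zone string) bool {
	return name == zone || strings.HasSuffix(name, "."+zone)
}

// resolveApex picks the zone apex for a service domain:
// 1. the cluster itself if the service lives under the cluster suffix,
// 2. otherwise the longest registered domain on the same cluster,
// 3. otherwise nothing (the service is skipped).
func resolveApex(serviceDomain, cluster string, domains []customDomain) (string, bool) {
	if hasZoneSuffix(serviceDomain, cluster) {
		return cluster, true
	}
	best := ""
	for _, d := range domains {
		if d.TargetCluster != cluster {
			continue // registered for another cluster
		}
		if hasZoneSuffix(serviceDomain, d.Domain) && len(d.Domain) > len(best) {
			best = d.Domain
		}
	}
	return best, best != ""
}

func main() {
	domains := []customDomain{
		{Domain: "corp.test", TargetCluster: "eu.proxy.example"},
		{Domain: "apps.corp.test", TargetCluster: "eu.proxy.example"},
		{Domain: "other.test", TargetCluster: "us.proxy.example"},
	}
	fmt.Println(resolveApex("app.eu.proxy.example", "eu.proxy.example", domains)) // eu.proxy.example true
	fmt.Println(resolveApex("db.apps.corp.test", "eu.proxy.example", domains))    // apps.corp.test true
	fmt.Println(resolveApex("db.other.test", "eu.proxy.example", domains))        //  false
}
```

Services that resolve to the same apex would then be grouped into one zone, which matches the "one zone per apex" note in the commit.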
Maycon Santos
b377d99933 [management] Copy private field on shallowCloneMapping (#6347)
* [management] Copy private field on shallowCloneMapping

added test to ensure clone handles new fields

* Remove unnecessary debug logs from proxy service

* Increase Wasm binary size limit to 60MB in build validation
v0.72.1
2026-06-05 22:45:49 +02:00
Theodor Midtlien
512899d82d [client] Prevent corruption from competing log rotation and improve debug bundle (#6214)
* Adds a heuristic to detect an edge case on Linux where logrotate is configured as a separate service to rotate log files, which would mangle our client log files. If logrotate is detected as configured for netbird, our own rotation is disabled.

* Adds new env var to disable log rotation: NB_LOG_DISABLE_ROTATION

* Adds compressed and plain logrotate files to debug bundle.

* Replaces lumberjack with timberjack (maintained fork with bug fixes and extra features).

* Clarifies which daemon version is running in the bundle stats.

* Change logging for client service status to console
v0.72.0
2026-06-04 17:36:45 +02:00
Theodor Midtlien
5993ec6e43 [client] Allow wireguard port to be zero in UI and show port in status command (#6158)
* Allow wireguard port to be set to 0 in UI

* Add wireguard port to cmd status

* Correct protoc version
2026-06-04 15:04:11 +02:00
Maycon Santos
eac6d501c3 [infrastructure] allow docker image overrides for getting started (#6335)
* [infrastructure] allow docker image overrides for getting started

Make dashboard and server image configurations overrideable via environment variables

* [infrastructure] update Traefik gRPC rule to include ProxyService PathPrefix

* make Traefik and CrowdSec images configurable via environment variables
2026-06-04 11:24:47 +02:00
Maycon Santos
deeae30612 [misc] Add Codecov integration and coverage reporting across workflows (#6333) 2026-06-03 19:08:45 +02:00
Bethuel Mmbaga
f3cdf163e1 [management] Export ResolveDomain (#6334) 2026-06-03 19:53:57 +03:00
Zoltan Papp
3e61ccb162 [client] Persist sync response via pluggable store (disk on iOS) (#6331)
* Persist sync response via pluggable store (disk on iOS)

The latest Management sync response (which carries the network map) was
kept in memory for debug bundle generation. On memory-constrained
platforms like iOS the network map can be large enough to matter.

Introduce a syncstore package with a Store interface and two backends:
a memory backend (the previous behavior) and a disk backend that
serializes the response to a file in the state directory. The backend
is selected per-platform at build time: disk on iOS, memory elsewhere.

The disk store clears any leftover file on construction so a fresh
store never reads stale data from an earlier run (e.g. another
profile's network map).

In the engine, drop the separate persistSyncResponse bool: the store is
only instantiated while persistence is enabled, and its presence is
what marks persistence as active. The store is also cleared on engine
close so the file does not linger on disk.

* syncstore: silence nilnil linter on "nothing stored" returns

Get returns (nil, nil) to signal that nothing is stored, which is part
of the Store contract and preserves the original behaviour. Annotate
both backends with //nolint:nilnil so golangci-lint does not flag it.

* syncstore: hold syncRespMux for the whole store Set/Get

Both handleSync and GetLatestSyncResponse snapshotted e.syncStore under
the read lock and then released it before calling Set/Get. That allowed
SetSyncResponsePersistence(false) or engine close to clear the store
mid-call. In particular a concurrent Clear()+nil followed by a late
Set could re-create the file that was just removed, defeating the
leak/lingering protection.

Hold syncRespMux for the duration of the store operation in both spots
so the store cannot be cleared while a Set/Get is in flight.

* syncstore: avoid StateDir "." when state path is empty

On mobile the state path may be empty (the engine tolerates a missing
state file). filepath.Dir("") returns ".", which would make a
disk-backed syncstore write into the working directory instead of
letting NewDiskStore fall back to os.TempDir().

Only set engineConfig.StateDir when path is non-empty.
2026-06-03 14:18:50 +02:00
Viktor Liu
a48c20d8d8 [client] Gate DNS forwarder on BlockInbound (#6257) 2026-06-03 11:33:29 +02:00
Riccardo Manfrin
2b57a7d43b [client, management, misc] expose VCS revision in dev build version output (#6263)
* Refactor to use a common checker for development version

* Adds commit sha to development version for cobra command only

Leave dashboard unaffected

* Adjust for "v0.31.1-dev" test case

which must be considered pre-release

* Drop synthetic "dev"/"0.50.0-dev" firewall feature-gate fixtures

These test cases encoded the loose strings.Contains(v, "dev")
semantics inherited from peerSupportedFirewallFeatures, but
NetbirdVersion() never produces those values — only the literal
"development" (and now "development-<sha>[-dirty]") ever flows
through the wire. The agent owns the semantics of an ephemeral
development build, so the tests should exercise the strings we
actually emit.

Replaced with development, development-<sha> and
development-<sha>-dirty cases that match the HasPrefix("development")
predicate introduced upstream.

* Remove tests for a nonexistent wire format

The sha and dirty flag are added only when the CLI asks for the version.
The account version is unaffected and can only strictly match "development".

* Adds tests for IsDevelopmentVersion
2026-06-03 08:56:50 +02:00
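The difference between the loose strings.Contains(v, "dev") semantics and the HasPrefix("development") predicate mentioned in commit 2b57a7d43b above is easy to see in isolation. The two functions below are hypothetical illustrations of those predicates, not the project's IsDevelopmentVersion implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// looseIsDev models the older, looser check: any version containing "dev".
func looseIsDev(v string) bool {
	return strings.Contains(v, "dev")
}

// prefixIsDev models the stricter check: only versions that start with
// the literal "development", optionally followed by "-<sha>[-dirty]".
func prefixIsDev(v string) bool {
	return strings.HasPrefix(v, "development")
}

func main() {
	versions := []string{
		"development",
		"development-abc1234",
		"development-abc1234-dirty",
		"v0.31.1-dev", // a pre-release, not an ephemeral development build
		"0.50.0-dev",
	}
	for _, v := range versions {
		fmt.Printf("%-28s loose=%-5t prefix=%t\n", v, looseIsDev(v), prefixIsDev(v))
	}
}
```

Only the first three strings satisfy the prefix check, while the loose check accepts all five.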