Use 7z compression (better compression rate) if found in path. That
alone isn't sufficient to get us under 2G, so MLX is now split out as a
discrete download. Fix CI so it will fail if artifacts fail to upload.
* prefer rocm v6 on windows
Avoid building with v7 - more changes are needed
* MLX: add header vendoring and remove go build tag
This switches to using a vendoring approach for the mlx-c headers so that Go
can build without requiring a cmake first. This enables building the new MLX
based code by default. Every time cmake runs, the headers are refreshed, so we
can easily keep them in sync when we bump mlx versions. Basic Windows
and Linux support are verified.
* ci: harden for flaky choco repo servers
CI sometimes fails due to choco not actually installing cache. Since it just speeds up the build, we can proceed without.
* review comments
This adds a new powershell install script suitable for running via
irm https://ollama.com/install.ps1 | iex
If you download the script and run '-?' it reports basic usage
information, as well as usage examples for common customization
options. The script is signed as part of the release process
to ensure it can run on a typically configured Windows system.
This does not include doc updates - we can merge those after a release
ships to avoid user confusion.
With the upcoming addition of MLX, the linux bundle will exceed the
maximum github artifact size of 2G. This change will bring the size
back down.
The install.sh changes support backwards compatibility for prior versions
thus should be safe to merge concurrently with this change.
* docs: vulkan information
* Revert "CI: Set up temporary opt-out Vulkan support (#12614)"
This reverts commit 8b6e5baee7.
* vulkan: temporary opt-in for Vulkan support
Revert this once we're ready to enable by default.
* win: add vulkan CI build
* app: add code for macOS and Windows apps under 'app'
* app: add readme
* app: windows and linux only for now
* ci: fix ui CI validation
---------
Co-authored-by: jmorganca <jmorganca@gmail.com>
Initially Vulkan support in Ollama will require building from source. Once it is
more thoroughly tested and we have fixed any critical bugs, then we can
bundle Vulkan into the official binary releases.
This will likely yield builds that have problems with unicode characters
but at least we can start testing the release while we try to find an
alternate clang compiler for windows, or mingw ships a fixed version.
* Add support for upcoming NVIDIA Jetsons
The latest Jetsons with JetPack 7 are moving to an SBSA compatible model and
will not require building a JetPack specific variant.
* cuda: bring back dual versions
This adds back dual CUDA versions for our releases,
with v11 and v13 to cover a broad set of GPUs and
driver versions.
* win: break up native builds in build_windows.ps1
* v11 build working on windows and linux
* switch to cuda v12.8 not JIT
* Set CUDA compression to size
* enhance manual install linux docs
The preset CMAKE_HIP_FLAGS isn't getting used on Windows.
This passes the parallel flag in through the C/CXX flags, along
with suppression for some log spew warnings to quiet down the build.
* Re-remove cuda v11
Revert the revert - drop v11 support requiring drivers newer than Feb 23
This reverts commit c6bcdc4223.
* Simplify layout
With only one version of the GPU libraries, we can simplify things down somewhat. (Jetsons still require special handling)
* distinct sbsa variant for linux arm64
This avoids accidentally trying to load the sbsa cuda libraries on
a jetson system which results in crashes.
* temporary prevent rocm+cuda mixed loading
This reduces the size of our Windows installer payloads by ~256M by dropping
support for nvidia drivers older than Feb 2023. Hardware support is unchanged.
Linux default bundle sizes are reduced by ~600M to 1G.
* Bump cuda and rocm versions
Update ROCm to linux:6.3 win:6.2 and CUDA v12 to 12.8.
Yum has some silent failure modes, so largely switch to dnf.
* Fix windows build script
clang outputs are faster. we were previously building with clang via gcc
wrapper in cgo but this was missed during the build updates so there was
a drop in performance
set owner and group when building the linux tarball so extracted files
are consistent. this is the behaviour of release tarballs in version
0.5.7 and lower
ollama requires vcruntime140_1.dll which isn't found on 2019. previously
the job used the windows runner (2019) but it explicitly installs
2022 to build the app. since the sign job doesn't actually build
anything, it can use the windows-2022 runner instead.