# Foundry Local C# SDK

The Foundry Local C# SDK provides a .NET interface for running AI models locally via the Foundry Local Core. Discover, download, load, and run inference entirely on your own machine — no cloud required.

## Features

- **Model catalog** — browse and search all available models; filter by cached or loaded state
- **Lifecycle management** — download, load, unload, and remove models programmatically
- **Chat completions** — synchronous and `IAsyncEnumerable` streaming via OpenAI-compatible types
- **Audio transcription** — transcribe audio files with streaming support
- **Download progress** — wire up an `Action<float>` callback for real-time download percentage
- **Model variants** — select specific hardware/quantization variants per model alias
- **Optional web service** — start an OpenAI-compatible REST endpoint (`/v1/chat_completions`, `/v1/models`)
- **WinML acceleration** — opt-in Windows hardware acceleration with automatic EP download
- **Full async/await** — every operation supports `CancellationToken` and async patterns
- **IDisposable** — deterministic cleanup of native resources

## Installation

```bash
dotnet add package Microsoft.AI.Foundry.Local
```

### Building from source

From the repository root:

```bash
cd sdk_v2/cs
dotnet build src/Microsoft.AI.Foundry.Local.csproj
```

Or open [Microsoft.AI.Foundry.Local.SDK.sln](./Microsoft.AI.Foundry.Local.SDK.sln) in Visual Studio or VS Code.

## WinML: Automatic Hardware Acceleration (Windows)

On Windows, Foundry Local can leverage WinML for GPU/NPU hardware acceleration via ONNX Runtime execution providers (EPs). EPs are large binaries downloaded on first use and cached for subsequent runs.

To enable it, install the WinML package variant instead of the base package:

```bash
dotnet add package Microsoft.AI.Foundry.Local.WinML
```

Or build from source with:

```bash
dotnet build src/Microsoft.AI.Foundry.Local.csproj /p:UseWinML=true
```

### Triggering EP download

EP download can be time-consuming. Call `EnsureEpsDownloadedAsync` early (after initialization) to separate the download step from catalog access:

```csharp
// Initialize the manager first (see Quick Start)
await FoundryLocalManager.CreateAsync(
    new Configuration { AppName = "my-app" },
    NullLogger.Instance);

await FoundryLocalManager.Instance.EnsureEpsDownloadedAsync();

// Now catalog access won't trigger an EP download
var catalog = await FoundryLocalManager.Instance.GetCatalogAsync();
```

If you skip this step, EPs are downloaded automatically the first time you access the catalog. Once cached, subsequent calls are fast.

## Quick Start

```csharp
using Microsoft.AI.Foundry.Local;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using Betalgo.Ranul.OpenAI.ObjectModels.RequestModels;

// 1. Initialize the singleton manager
await FoundryLocalManager.CreateAsync(
    new Configuration { AppName = "my-app" },
    NullLogger.Instance);

// 2. Get the model catalog and look up a model
var catalog = await FoundryLocalManager.Instance.GetCatalogAsync();
var model = await catalog.GetModelAsync("phi-3.5-mini")
    ?? throw new Exception("Model 'phi-3.5-mini' not found in catalog.");

// 3. Download (if needed) and load the model
await model.DownloadAsync();
await model.LoadAsync();

// 4. Get a chat client and run inference
var chatClient = await model.GetChatClientAsync();
var response = await chatClient.CompleteChatAsync(new[]
{
    ChatMessage.FromUser("Why is the sky blue?")
});

Console.WriteLine(response.Choices![0].Message.Content);

// 5. Clean up
FoundryLocalManager.Instance.Dispose();
```

## Usage

### Initialization

`FoundryLocalManager` is an async singleton. Call `CreateAsync` once at startup:

```csharp
await FoundryLocalManager.CreateAsync(
    new Configuration { AppName = "my-app" },
    loggerFactory.CreateLogger("FoundryLocal"));
```

Access it anywhere afterward via `FoundryLocalManager.Instance`. Check `FoundryLocalManager.IsInitialized` to see whether the singleton has already been created.
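
A minimal sketch of lazy, guarded initialization using that check (the `GetManagerAsync` helper and its configuration values are illustrative, not part of the SDK):

```csharp
// Hypothetical helper: create the singleton on first use, reuse it afterward.
// Note: not guarded against two concurrent first calls.
static async Task<FoundryLocalManager> GetManagerAsync()
{
    if (!FoundryLocalManager.IsInitialized)
    {
        await FoundryLocalManager.CreateAsync(
            new Configuration { AppName = "my-app" },
            NullLogger.Instance);
    }

    return FoundryLocalManager.Instance;
}
```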

### Catalog

The catalog lists all models known to the Foundry Local Core:

```csharp
var catalog = await FoundryLocalManager.Instance.GetCatalogAsync();

// List all available models
var models = await catalog.ListModelsAsync();
foreach (var m in models)
    Console.WriteLine($"{m.Alias} — {m.SelectedVariant.Info.DisplayName}");

// Get a specific model by alias
var model = await catalog.GetModelAsync("phi-3.5-mini")
    ?? throw new Exception("Model 'phi-3.5-mini' not found in catalog.");

// Get a specific variant by its unique model ID
var variant = await catalog.GetModelVariantAsync("phi-3.5-mini-generic-gpu-4")
    ?? throw new Exception("Variant 'phi-3.5-mini-generic-gpu-4' not found in catalog.");

// List models already downloaded to the local cache
var cached = await catalog.GetCachedModelsAsync();

// List models currently loaded in memory
var loaded = await catalog.GetLoadedModelsAsync();
```
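
As a small follow-up, the three listing calls can be combined to report each model's state. This sketch assumes the cached and loaded lists expose the same `Alias` property as `ListModelsAsync`:

```csharp
var cachedAliases = new HashSet<string>();
foreach (var c in await catalog.GetCachedModelsAsync())
    cachedAliases.Add(c.Alias);

var loadedAliases = new HashSet<string>();
foreach (var l in await catalog.GetLoadedModelsAsync())
    loadedAliases.Add(l.Alias);

foreach (var m in await catalog.ListModelsAsync())
{
    // A loaded model is also cached; report the most specific state.
    var state = loadedAliases.Contains(m.Alias) ? "loaded"
              : cachedAliases.Contains(m.Alias) ? "cached"
              : "available";
    Console.WriteLine($"{m.Alias,-30} {state}");
}
```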

### Model Lifecycle

Each `Model` wraps one or more `ModelVariant` entries (different quantizations, hardware targets). The SDK auto-selects the best variant, or you can pick one:

```csharp
// Check and select variants
Console.WriteLine($"Selected: {model.SelectedVariant.Id}");
foreach (var v in model.Variants)
    Console.WriteLine($"  {v.Id} (cached: {await v.IsCachedAsync()})");

// Switch to a different variant
model.SelectVariant(model.Variants[1]);
```
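
You can also pick a variant by inspecting its ID rather than by index. A sketch, assuming variant IDs encode the hardware target the way the catalog example above does (e.g. `phi-3.5-mini-generic-gpu-4`):

```csharp
// Prefer a GPU variant when one exists; otherwise keep the auto-selected one.
foreach (var v in model.Variants)
{
    if (v.Id.Contains("gpu", StringComparison.OrdinalIgnoreCase))
    {
        model.SelectVariant(v);
        break;
    }
}

Console.WriteLine($"Using variant: {model.SelectedVariant.Id}");
```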

Download, load, and unload:

```csharp
// Download with progress reporting
await model.DownloadAsync(progress =>
    Console.WriteLine($"Download: {progress:F1}%"));

// Load into memory
await model.LoadAsync();

// Unload when done
await model.UnloadAsync();

// Remove from local cache entirely
await model.RemoveFromCacheAsync();
```
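
In a long-running application you typically want to skip the download when the selected variant is already cached, and to unload even if inference fails. A sketch using only the calls shown above:

```csharp
if (!await model.SelectedVariant.IsCachedAsync())
{
    await model.DownloadAsync(progress =>
        Console.WriteLine($"Download: {progress:F1}%"));
}

await model.LoadAsync();
try
{
    // ... run chat or audio inference here ...
}
finally
{
    // Release the model's memory even when inference throws.
    await model.UnloadAsync();
}
```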

### Chat Completions

```csharp
var chatClient = await model.GetChatClientAsync();

var response = await chatClient.CompleteChatAsync(new[]
{
    ChatMessage.FromSystem("You are a helpful assistant."),
    ChatMessage.FromUser("Explain async/await in C#.")
});

Console.WriteLine(response.Choices![0].Message.Content);
```
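
For a multi-turn conversation, keep the message list yourself and append each reply before the next call. A sketch; `ChatMessage.FromAssistant` is assumed to exist alongside the `FromSystem`/`FromUser` helpers used above:

```csharp
var history = new List<ChatMessage>
{
    ChatMessage.FromSystem("You are a helpful assistant.")
};

// Turn 1
history.Add(ChatMessage.FromUser("Explain async/await in C#."));
var first = await chatClient.CompleteChatAsync(history.ToArray());
history.Add(ChatMessage.FromAssistant(first.Choices![0].Message.Content ?? ""));

// Turn 2 sees the full history, including the assistant's previous answer
history.Add(ChatMessage.FromUser("Now show a short code example."));
var second = await chatClient.CompleteChatAsync(history.ToArray());
Console.WriteLine(second.Choices![0].Message.Content);
```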

#### Streaming

Use `IAsyncEnumerable` for token-by-token output:

```csharp
using var cts = new CancellationTokenSource();

await foreach (var chunk in chatClient.CompleteChatStreamingAsync(
    new[] { ChatMessage.FromUser("Write a haiku about .NET") }, cts.Token))
{
    Console.Write(chunk.Choices?[0]?.Delta?.Content);
}
```
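
The same token can stop generation early, which is useful for timeouts or user-initiated cancellation. A sketch building on the loop above:

```csharp
// Cancel the stream automatically after 30 seconds.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));

try
{
    await foreach (var chunk in chatClient.CompleteChatStreamingAsync(
        new[] { ChatMessage.FromUser("Explain the .NET garbage collector in detail") }, cts.Token))
    {
        Console.Write(chunk.Choices?[0]?.Delta?.Content);
    }
}
catch (OperationCanceledException)
{
    Console.WriteLine();
    Console.WriteLine("[generation stopped after 30 seconds]");
}
```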

#### Chat Settings

Tune generation parameters per client:

```csharp
chatClient.Settings.Temperature = 0.7f;
chatClient.Settings.MaxTokens = 256;
chatClient.Settings.TopP = 0.9f;
chatClient.Settings.FrequencyPenalty = 0.5f;
```

### Audio Transcription

```csharp
var audioClient = await model.GetAudioClientAsync();

// One-shot transcription
var result = await audioClient.TranscribeAudioAsync("recording.mp3");
Console.WriteLine(result.Text);

// Streaming transcription
await foreach (var chunk in audioClient.TranscribeAudioStreamingAsync("recording.mp3", CancellationToken.None))
{
    Console.Write(chunk.Text);
}
```

#### Audio Settings

```csharp
audioClient.Settings.Language = "en";
audioClient.Settings.Temperature = 0.0f;
```

### Web Service

Start an OpenAI-compatible REST endpoint for use by external tools or processes:

```csharp
// Configure the web service URL in your Configuration
await FoundryLocalManager.CreateAsync(
    new Configuration
    {
        AppName = "my-app",
        Web = new Configuration.WebService { Urls = "http://127.0.0.1:5000" }
    },
    NullLogger.Instance);

await FoundryLocalManager.Instance.StartWebServiceAsync();
Console.WriteLine($"Listening on: {string.Join(", ", FoundryLocalManager.Instance.Urls!)}");

// ... use the service ...

await FoundryLocalManager.Instance.StopWebServiceAsync();
```
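
Other processes can then call the endpoint with any OpenAI-compatible client or plain HTTP. A sketch that lists models over the `/v1/models` route mentioned above; the base address is whatever URL you configured:

```csharp
using var http = new HttpClient { BaseAddress = new Uri("http://127.0.0.1:5000") };

// GET /v1/models returns the model list as OpenAI-style JSON.
var json = await http.GetStringAsync("/v1/models");
Console.WriteLine(json);
```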

### Configuration

| Property | Type | Default | Description |
|---|---|---|---|
| `AppName` | `string` | **(required)** | Your application name |
| `AppDataDir` | `string?` | `~/.{AppName}` | Application data directory |
| `ModelCacheDir` | `string?` | `{AppDataDir}/cache/models` | Where models are stored locally |
| `LogsDir` | `string?` | `{AppDataDir}/logs` | Log output directory |
| `LogLevel` | `LogLevel` | `Warning` | `Verbose`, `Debug`, `Information`, `Warning`, `Error`, `Fatal` |
| `Web` | `WebService?` | `null` | Web service configuration (see below) |
| `AdditionalSettings` | `IDictionary<string, string>?` | `null` | Extra key-value settings passed to Core |

**`Configuration.WebService`**

| Property | Type | Default | Description |
|---|---|---|---|
| `Urls` | `string?` | `127.0.0.1:0` | Bind address; semicolon-separated for multiple addresses |
| `ExternalUrl` | `Uri?` | `null` | URI for accessing the web service in a separate process |
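
A sketch combining several of these options; the paths and the extra setting are illustrative, not defaults:

```csharp
var config = new Configuration
{
    AppName = "my-app",
    AppDataDir = "/opt/my-app/data",              // defaults to ~/.my-app
    ModelCacheDir = "/mnt/models/foundry-cache",  // share one model cache across apps
    LogsDir = "/var/log/my-app",
    AdditionalSettings = new Dictionary<string, string>
    {
        // Passed through to the Foundry Local Core as-is (hypothetical key).
        ["example-setting"] = "value"
    }
};

await FoundryLocalManager.CreateAsync(config, NullLogger.Instance);
```
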
### Disposal

`FoundryLocalManager` implements `IDisposable`. Dispose stops the web service (if running) and releases native resources:

```csharp
FoundryLocalManager.Instance.Dispose();
```

## API Reference

Auto-generated API docs live in [`docs/api/`](./docs/api/). See [`GENERATE-DOCS.md`](./GENERATE-DOCS.md) to regenerate.

Key types:

| Type | Description |
|---|---|
| [`FoundryLocalManager`](./docs/api/microsoft.ai.foundry.local.foundrylocalmanager.md) | Singleton entry point — create, catalog, web service |
| [`Configuration`](./docs/api/microsoft.ai.foundry.local.configuration.md) | Initialization settings |
| [`ICatalog`](./docs/api/microsoft.ai.foundry.local.icatalog.md) | Model catalog interface |
| [`Model`](./docs/api/microsoft.ai.foundry.local.model.md) | Model with variant selection |
| [`ModelVariant`](./docs/api/microsoft.ai.foundry.local.modelvariant.md) | Specific model variant (hardware/quantization) |
| [`OpenAIChatClient`](./docs/api/microsoft.ai.foundry.local.openaichatclient.md) | Chat completions (sync + streaming) |
| [`OpenAIAudioClient`](./docs/api/microsoft.ai.foundry.local.openaiaudioclient.md) | Audio transcription (sync + streaming) |
| [`ModelInfo`](./docs/api/microsoft.ai.foundry.local.modelinfo.md) | Full model metadata record |

## Tests

```bash
dotnet test
```

See [`test/FoundryLocal.Tests/LOCAL_MODEL_TESTING.md`](./test/FoundryLocal.Tests/LOCAL_MODEL_TESTING.md) for prerequisites and local model setup.