CoreAI · Mar 2024 — Present

AI Toolkit for VS Code

“Making AI agent development fast and delightful”

VS Code Marketplace · GitHub

1M+

Installs

Model Providers

Core Features

30+

Contributors

Problem Statement

AI development is fragmented and slow

Building production AI agents requires juggling multiple tools, platforms, and workflows. Developers waste hours context-switching between model providers, testing environments, and deployment pipelines.

Model Discovery

Finding the right model means navigating dozens of separate provider portals, comparing specs across different formats, and manually testing each candidate.

Tool Fragmentation

Developers context-switch between web UIs for prompting, separate IDEs for coding, terminal tools for deployment, and standalone dashboards for evaluation.

Agent Debugging

Multi-step agent workflows are black boxes. When an agent fails, developers have no way to set breakpoints, inspect intermediate state, or trace the execution flow.

Deploy Friction

Moving from a working prototype to production requires manual configuration of infrastructure, separate CI/CD pipelines, and deep cloud platform expertise.

User Personas

Primary User

The AI Application Developer

A full-stack developer integrating AI capabilities into production applications. Comfortable with code but needs to iterate quickly on prompts, evaluate model quality, and ship reliable agents.

Test models from multiple providers in one place
Build and debug agents without leaving VS Code
Deploy to cloud with minimal configuration

Secondary User

The ML Engineer

Specializes in model optimization and fine-tuning. Needs tools to customize open models for domain-specific tasks and benchmark performance across hardware targets (CPU, GPU, NPU).

Fine-tune models with QLoRA on local GPU or cloud
Convert and quantize models for edge deployment
Profile inference performance across execution providers

Extended User

The Citizen Developer

A product manager, designer, or domain expert who wants to prototype AI-powered features without writing code. Needs visual, low-barrier tools to validate ideas quickly.

Create prompt-based agents with a no-code builder
Iterate on prompts using natural language feedback
Export production-ready code for handoff to engineering

User Journey

From idea to deployed AI agent

Discover

Browse the Model Catalog to find models from 9+ providers. Compare capabilities, pricing, and latency side-by-side.

Pain point: Visiting each provider portal separately

Prototype

Test models in the Playground with multimodal inputs. Use Agent Builder to craft prompts and wire up MCP tools.

Pain point: No unified place to iterate on prompts and tools

Build & Debug

Write agent code with full IntelliSense. Press F5 to launch Agent Inspector with breakpoints and workflow visualization.

Pain point: Agent workflows are opaque and hard to debug

Evaluate & Deploy

Run bulk evaluations with built-in metrics. Deploy to Microsoft Foundry in one click, with tracing enabled.

Pain point: Separate toolchains for testing vs. deployment

User Stories

As an AI application developer, I want to compare models from OpenAI, Anthropic, and open-source providers in a single interface so that I can choose the best model for my use case without switching between provider portals.

As a product manager, I want to build and test a prompt-based agent using a visual no-code builder so that I can validate an AI feature idea before committing engineering resources.

As an ML engineer, I want to fine-tune an open model on my domain-specific dataset with QLoRA so that I can improve accuracy for my enterprise's specialized vocabulary and workflows.

As a platform engineer, I want to debug multi-agent workflows with breakpoints and execution tracing so that I can identify where agents fail and fix issues before they reach production.

As a Windows developer, I want to convert and optimize models for NPU acceleration on Copilot+ PCs so that I can deliver fast, offline AI experiences without cloud dependency.

Features

Core

Model Catalog

Unified model discovery across Microsoft Foundry, GitHub, Hugging Face, ONNX, Ollama, OpenAI, Anthropic, Google, and NVIDIA NIM. Side-by-side comparison and one-click playground access.

9+ integrated model providers

Agent Builder

No-code visual interface for creating prompt agents. Natural language prompt engineering with "Inspire Me" generation, MCP tool integration, and structured output support.

Zero-to-agent in minutes

Core

Agent Inspector

Full F5 debugging for AI agents with breakpoints, real-time streaming visualization, multi-agent workflow graphs, and one-click code navigation.

First-class debugger integration
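To make "inspecting intermediate state" concrete, here is a minimal sketch of recording a step-by-step trace for a multi-step workflow. It is an illustration only and does not show how Agent Inspector is implemented; `TraceEntry`, `Step`, and `runWithTrace` are hypothetical names, and the steps are plain synchronous functions standing in for model or tool calls.

```typescript
// Hypothetical sketch: record each step's input and output so a failed
// run can be inspected afterwards. Not an AI Toolkit API.

interface TraceEntry {
  step: string;
  input: string;
  output?: string;
  error?: string;
}

type Step = { name: string; run: (input: string) => string };

// Runs steps in order, piping each output into the next step.
// Stops at the first failure and returns the trace collected so far.
function runWithTrace(steps: Step[], initial: string): TraceEntry[] {
  const trace: TraceEntry[] = [];
  let current = initial;
  for (const step of steps) {
    try {
      const output = step.run(current);
      trace.push({ step: step.name, input: current, output });
      current = output;
    } catch (e) {
      const message = e instanceof Error ? e.message : String(e);
      trace.push({ step: step.name, input: current, error: message });
      break;
    }
  }
  return trace;
}
```

The last entry of the returned trace names the step that failed and the exact input it received, which is the information a breakpoint at that step would expose.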

Performance

Model Evaluation

Batch evaluation with built-in metrics (F1, relevance, similarity, coherence) and custom evaluators. "Evaluation as Tests" for CI-style quality gates.

Quantified model quality
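As a rough illustration of a metric-backed quality gate, the sketch below computes a token-level F1 score and fails a batch whose mean score falls below a threshold. It assumes whitespace tokenization and a mean-over-batch gate; `EvalCase`, `f1Score`, and `passesGate` are hypothetical names, and the toolkit's built-in metrics may be defined differently.

```typescript
// Hypothetical sketch: token-level F1 plus a CI-style threshold check.
// None of these names come from AI Toolkit; they are illustrative only.

interface EvalCase {
  expected: string;
  actual: string;
}

// Lowercase and split on whitespace; a deliberately simple tokenizer.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
}

// Token-level F1: harmonic mean of precision and recall over shared tokens.
function f1Score(expected: string, actual: string): number {
  const exp = tokenize(expected);
  const act = tokenize(actual);
  if (exp.length === 0 || act.length === 0) {
    return exp.length === act.length ? 1 : 0;
  }
  // Count expected tokens so a repeated token matches at most as often as it occurs.
  const counts = new Map<string, number>();
  for (const t of exp) counts.set(t, (counts.get(t) ?? 0) + 1);
  let overlap = 0;
  for (const t of act) {
    const remaining = counts.get(t) ?? 0;
    if (remaining > 0) {
      overlap++;
      counts.set(t, remaining - 1);
    }
  }
  if (overlap === 0) return 0;
  const precision = overlap / act.length;
  const recall = overlap / exp.length;
  return (2 * precision * recall) / (precision + recall);
}

// Quality gate: passes only when the mean F1 over the batch reaches the threshold.
function passesGate(cases: EvalCase[], threshold: number): boolean {
  if (cases.length === 0) return false;
  const total = cases.reduce((sum, c) => sum + f1Score(c.expected, c.actual), 0);
  return total / cases.length >= threshold;
}
```

Wired into a test runner, `passesGate` would act like any other assertion: a drop in mean score fails the build instead of going unnoticed.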

Fine-Tuning

Customize models with QLoRA on local GPU or cloud via Azure Container Apps. Supports Phi, Llama, Mistral, DeepSeek, and NPU-optimized variants for Copilot+ PCs.

Local GPU + cloud training

One-Click Deploy

Deploy agents directly to Microsoft Foundry from VS Code. Built-in tracing and profiling for production monitoring across CPU, GPU, and NPU.

VS Code to production in one click

Technical Architecture

User Interface

VS Code Extension: TypeScript + React
Webview UI: React Components
Extension Tree View: VS Code API

↓ Commands, events, state

Application Services

Agent Builder: Prompt Engineering
Agent Inspector: F5 Debugger
Model Playground: Interactive Chat

↓ API calls, inference requests

Backend Agents

Inference Agent: C# / .NET 8+
Workspace Agent: C# / .NET 8+
MCP Server: Tool Integration

↓ Model requests, tool calls

Model Providers

Microsoft Foundry: Cloud Models
GitHub Models: Open Source
Ollama / ONNX: Local Inference
OpenAI / Anthropic: 3rd Party APIs

↓ Deployment, monitoring

Infrastructure

Microsoft Foundry: Cloud Deploy
Azure Container Apps: Fine-Tuning
Windows ML / NPU: Edge Runtime

TypeScript · React · C# / .NET 8+ · ONNX Runtime · CUDA / NPU · MCP Protocol · QLoRA · Docker / WSL2 · Azure · VS Code API

Cross-Platform Reach

Runs on Windows, macOS, and Linux via VS Code. Local inference supports CPU, GPU (CUDA), and NPU hardware acceleration for Copilot+ PCs, enabling offline AI scenarios.

Provider Agnostic

Single interface abstracts 9+ model providers. Developers can swap between cloud and local models without changing their agent code, reducing vendor lock-in.
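The provider-agnostic idea can be sketched as a small interface that agent code depends on, with each provider supplying its own implementation. This is a hedged illustration, not the toolkit's actual abstraction: `ModelProvider`, `EchoCloudProvider`, `EchoLocalProvider`, and `summarizeAgent` are hypothetical names, and both providers are stubs that echo the prompt rather than calling a real model.

```typescript
// Hypothetical sketch of a provider-agnostic interface.
// These names are illustrative and are not AI Toolkit APIs.

interface ModelProvider {
  readonly name: string;
  complete(prompt: string): Promise<string>;
}

// Stand-in for a cloud-hosted model; a real one would call a remote API.
class EchoCloudProvider implements ModelProvider {
  readonly name = "cloud-stub";
  async complete(prompt: string): Promise<string> {
    return `[cloud] ${prompt}`;
  }
}

// Stand-in for a local runtime; a real one would run inference on-device.
class EchoLocalProvider implements ModelProvider {
  readonly name = "local-stub";
  async complete(prompt: string): Promise<string> {
    return `[local] ${prompt}`;
  }
}

// Agent code depends only on the interface, so swapping providers
// requires no change to this function.
async function summarizeAgent(provider: ModelProvider, text: string): Promise<string> {
  return provider.complete(`Summarize: ${text}`);
}
```

Calling `summarizeAgent(new EchoLocalProvider(), text)` instead of `summarizeAgent(new EchoCloudProvider(), text)` changes where the request goes but leaves the agent function untouched, which is the property that reduces lock-in.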

Full Lifecycle Coverage

From model discovery through fine-tuning, evaluation, debugging, and cloud deployment — the entire AI development lifecycle lives inside the editor developers already use.

Related projects

Microsoft 365 Agents Toolkit

Enterprise developers building for Microsoft 365 faced fragmented SDKs, complex auth configuration, and manual cloud provisioning for every new project. The Agents Toolkit streamlined the entire lifecycle — scaffold, debug, deploy, and publish — serving 20K+ monthly active developers across Teams, Copilot, and Outlook.
