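- File attachments (type: "file"): provide an absolute path; the runtime reads the file from disk, converts it to base64, and sends it to the LLM.
- Blob attachments (type: "blob"): provide base64-encoded data directly; useful when the image is already in memory (for example, a screenshot, a generated image, or data from an API).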
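Overview

| Concept | Description |
|---|---|
| File attachment | An attachment with type: "file" and an absolute path to an image on disk |
| Blob attachment | An attachment with type: "blob", base64-encoded data, and a mimeType; no disk I/O required |
| Automatic encoding | For file attachments, the runtime reads the image and converts it to base64 automatically |
| Automatic resizing | The runtime automatically resizes, or reduces the quality of, images that exceed model-specific limits |
| Vision capability | The model must have capabilities.supports.vision = true to process images |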
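Quick start - File attachments
Use the file attachment type to attach an image file to any message. The path must be an absolute path to the image on disk.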
TypeScript
import { CopilotClient } from "@github/copilot-sdk";

const client = new CopilotClient();
await client.start();

const session = await client.createSession({
  model: "gpt-4.1",
  onPermissionRequest: async () => ({ kind: "approve-once" }),
});

await session.send({
  prompt: "Describe what you see in this image",
  attachments: [
    {
      type: "file",
      path: "/absolute/path/to/screenshot.png",
    },
  ],
});
Python
from copilot import CopilotClient, PermissionDecisionApproveOnce

client = CopilotClient()
await client.start()

session = await client.create_session(
    on_permission_request=lambda req, inv: PermissionDecisionApproveOnce(),
    model="gpt-4.1",
)

await session.send(
    "Describe what you see in this image",
    attachments=[
        {
            "type": "file",
            "path": "/absolute/path/to/screenshot.png",
        },
    ],
)
Go
package main

import (
	"context"

	copilot "github.com/github/copilot-sdk/go"
	"github.com/github/copilot-sdk/go/rpc"
)

func main() {
	ctx := context.Background()
	client := copilot.NewClient(nil)
	client.Start(ctx)

	session, _ := client.CreateSession(ctx, &copilot.SessionConfig{
		Model: "gpt-4.1",
		OnPermissionRequest: func(req copilot.PermissionRequest, inv copilot.PermissionInvocation) (rpc.PermissionDecision, error) {
			return &rpc.PermissionDecisionApproveOnce{}, nil
		},
	})

	path := "/absolute/path/to/screenshot.png"
	session.Send(ctx, copilot.MessageOptions{
		Prompt: "Describe what you see in this image",
		Attachments: []copilot.Attachment{
			&copilot.UserMessageAttachmentFile{
				DisplayName: "screenshot.png",
				Path:        path,
			},
		},
	})
}
.NET
using GitHub.Copilot;
using GitHub.Copilot.Rpc;

public static class ImageInputExample
{
    public static async Task Main()
    {
        await using var client = new CopilotClient();
        await using var session = await client.CreateSessionAsync(new SessionConfig
        {
            Model = "gpt-4.1",
            OnPermissionRequest = (req, inv) =>
                Task.FromResult(PermissionDecision.ApproveOnce()),
        });

        await session.SendAsync(new MessageOptions
        {
            Prompt = "Describe what you see in this image",
            Attachments = new List<UserMessageAttachment>
            {
                new UserMessageAttachmentFile
                {
                    Path = "/absolute/path/to/screenshot.png",
                    DisplayName = "screenshot.png",
                },
            },
        });
    }
}
Java
import com.github.copilot.sdk.CopilotClient;
import com.github.copilot.sdk.events.*;
import com.github.copilot.sdk.json.*;
import java.util.List;

try (var client = new CopilotClient()) {
    client.start().get();

    var session = client.createSession(
        new SessionConfig()
            .setModel("gpt-4.1")
            .setOnPermissionRequest(PermissionHandler.APPROVE_ALL)
    ).get();

    session.send(new MessageOptions()
        .setPrompt("Describe what you see in this image")
        .setAttachments(List.of(
            new Attachment("file", "/absolute/path/to/screenshot.png", "screenshot.png")
        ))
    ).get();
}
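Because file attachments require an absolute path, it can help to normalize user-supplied paths before building the attachment. The following is a minimal Python sketch, not part of the SDK: the helper name file_attachment is hypothetical, and only the "type" and "path" keys come from the examples above.

```python
from pathlib import Path


def file_attachment(path_like: str) -> dict:
    """Build a file attachment dict with an absolute path (hypothetical helper)."""
    # expanduser() handles "~"; resolve() makes the path absolute
    # relative to the current working directory.
    absolute = Path(path_like).expanduser().resolve()
    return {"type": "file", "path": str(absolute)}


attachment = file_attachment("screenshots/login.png")
print(attachment["path"])  # an absolute path to login.png
```

The resulting dict can be passed in the attachments list in the same way as the hard-coded path in the Python example above.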
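Quick start - Blob attachments
If the image data is already in memory (for example, a screenshot captured by your app or an image fetched from an API), use a blob attachment to send it directly without writing it to disk.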
TypeScript
import { CopilotClient } from "@github/copilot-sdk";

const client = new CopilotClient();
await client.start();

const session = await client.createSession({
  model: "gpt-4.1",
  onPermissionRequest: async () => ({ kind: "approve-once" }),
});

const base64ImageData = "..."; // your base64-encoded image

await session.send({
  prompt: "Describe what you see in this image",
  attachments: [
    {
      type: "blob",
      data: base64ImageData,
      mimeType: "image/png",
      displayName: "screenshot.png",
    },
  ],
});
Python
from copilot import CopilotClient, PermissionDecisionApproveOnce

client = CopilotClient()
await client.start()

session = await client.create_session(
    on_permission_request=lambda req, inv: PermissionDecisionApproveOnce(),
    model="gpt-4.1",
)

base64_image_data = "..."  # your base64-encoded image

await session.send(
    "Describe what you see in this image",
    attachments=[
        {
            "type": "blob",
            "data": base64_image_data,
            "mimeType": "image/png",
            "displayName": "screenshot.png",
        },
    ],
)
Go
package main

import (
	"context"

	copilot "github.com/github/copilot-sdk/go"
	"github.com/github/copilot-sdk/go/rpc"
)

func main() {
	ctx := context.Background()
	client := copilot.NewClient(nil)
	client.Start(ctx)

	session, _ := client.CreateSession(ctx, &copilot.SessionConfig{
		Model: "gpt-4.1",
		OnPermissionRequest: func(req copilot.PermissionRequest, inv copilot.PermissionInvocation) (rpc.PermissionDecision, error) {
			return &rpc.PermissionDecisionApproveOnce{}, nil
		},
	})

	base64ImageData := "..." // your base64-encoded image
	mimeType := "image/png"
	displayName := "screenshot.png"
	session.Send(ctx, copilot.MessageOptions{
		Prompt: "Describe what you see in this image",
		Attachments: []copilot.Attachment{
			&copilot.UserMessageAttachmentBlob{
				Data:        base64ImageData,
				MIMEType:    mimeType,
				DisplayName: &displayName,
			},
		},
	})
}
.NET
using GitHub.Copilot;
using GitHub.Copilot.Rpc;

public static class BlobAttachmentExample
{
    public static async Task Main()
    {
        await using var client = new CopilotClient();
        await using var session = await client.CreateSessionAsync(new SessionConfig
        {
            Model = "gpt-4.1",
            OnPermissionRequest = (req, inv) =>
                Task.FromResult(PermissionDecision.ApproveOnce()),
        });

        var base64ImageData = "..."; // your base64-encoded image
        await session.SendAsync(new MessageOptions
        {
            Prompt = "Describe what you see in this image",
            Attachments = new List<UserMessageAttachment>
            {
                new UserMessageAttachmentBlob
                {
                    Data = base64ImageData,
                    MimeType = "image/png",
                    DisplayName = "screenshot.png",
                },
            },
        });
    }
}
Java
import com.github.copilot.sdk.CopilotClient;
import com.github.copilot.sdk.events.*;
import com.github.copilot.sdk.json.*;
import java.util.List;

try (var client = new CopilotClient()) {
    client.start().get();

    var session = client.createSession(
        new SessionConfig()
            .setModel("gpt-4.1")
            .setOnPermissionRequest(PermissionHandler.APPROVE_ALL)
    ).get();

    var base64ImageData = "..."; // your base64-encoded image

    session.send(new MessageOptions()
        .setPrompt("Describe what you see in this image")
        .setAttachments(List.of(
            new BlobAttachment()
                .setData(base64ImageData)
                .setMimeType("image/png")
                .setDisplayName("screenshot.png")
        ))
    ).get();
}
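The blob examples above assume the base64 string already exists. When only raw bytes are available, a small helper can do the encoding. This is a minimal Python sketch using only the standard library; the function name blob_attachment is hypothetical, and the dict keys mirror the Python blob example above.

```python
import base64


def blob_attachment(raw: bytes, mime_type: str, display_name: str) -> dict:
    """Encode raw image bytes and wrap them in a blob attachment dict."""
    # b64encode returns bytes; decode to str so the value is JSON-friendly.
    encoded = base64.b64encode(raw).decode("ascii")
    return {
        "type": "blob",
        "data": encoded,
        "mimeType": mime_type,
        "displayName": display_name,
    }


# Stand-in bytes for illustration only: the 8-byte PNG file signature,
# not a complete image.
png_signature = b"\x89PNG\r\n\x1a\n"
attachment = blob_attachment(png_signature, "image/png", "screenshot.png")
print(attachment["mimeType"], len(attachment["data"]))
```

In a real application, raw would be the full image bytes held in memory, such as a captured screenshot or an HTTP response body.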
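Supported formats
Supported image formats include JPG, PNG, GIF, and other common image types. For file attachments, the runtime reads the image from disk and converts it as needed. For blob attachments, you provide the base64 data and MIME type directly. For best results, use PNG or JPEG, because these are the most widely supported formats.
The model's capabilities.limits.vision.supported_media_types field lists the exact MIME types it accepts.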
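One way to use the supported_media_types list is to check a file's MIME type before attaching it. The sketch below guesses the type from the file extension with Python's standard mimetypes module; the helper name is hypothetical, and the supported list is passed in as a plain list of strings rather than read from the SDK.

```python
import mimetypes


def is_supported_image(path: str, supported_media_types: list[str]) -> bool:
    """Return True if the file's guessed MIME type is in the supported list."""
    # guess_type inspects only the file name; it does not open the file.
    mime_type, _ = mimetypes.guess_type(path)
    return mime_type is not None and mime_type in supported_media_types


# Example list in the shape of supported_media_types (illustrative values).
supported = ["image/png", "image/jpeg"]
print(is_supported_image("chart.png", supported))  # True
print(is_supported_image("logo.svg", supported))   # False
```

Because the guess is extension-based, a mislabeled file would pass this check; it is a cheap pre-filter, not a validation of the image contents.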
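Automatic processing
The runtime automatically processes images to fit the model's constraints. No manual resizing is required.
- Images that exceed the model's dimension or size limits are automatically resized (preserving aspect ratio) or reduced in quality.
- If an image still cannot meet the requirements after processing, it is skipped and not sent to the LLM.
- The model's capabilities.limits.vision.max_prompt_image_size field indicates the maximum image size in bytes.
You can check these limits at runtime through the model capabilities object. For the best experience, use reasonably sized PNG or JPEG images.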
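Although resizing is automatic, it can be useful to know in advance which images are likely to be downscaled. The following Python sketch compares the decoded size of a base64 payload with max_prompt_image_size. It assumes the limit applies to the decoded bytes, which this guide does not state, and both function names are hypothetical.

```python
import base64


def decoded_size(data_b64: str) -> int:
    """Return the number of raw bytes represented by a base64 string."""
    return len(base64.b64decode(data_b64))


def likely_needs_processing(data_b64: str, max_prompt_image_size: int) -> bool:
    """Assumption: the limit is compared against the decoded byte count."""
    return decoded_size(data_b64) > max_prompt_image_size


# 2048 zero bytes stand in for image data; 1024 is an illustrative limit.
sample = base64.b64encode(b"\x00" * 2048).decode("ascii")
print(decoded_size(sample))                   # 2048
print(likely_needs_processing(sample, 1024))  # True
```

A check like this only flags candidates for logging or user feedback; the runtime still decides whether to resize, degrade, or skip the image.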
Vision model capabilities
Not all models support vision. Check the model's capabilities before sending images.
Capability fields
| Field | Type | Description |
|---|---|---|
| capabilities.supports.vision | boolean | Whether the model can process image input |
| capabilities.limits.vision.supported_media_types | string[] | MIME types the model accepts (for example, ["image/png", "image/jpeg"]) |
| capabilities.limits.vision.max_prompt_images | number | Maximum number of images per prompt |
| capabilities.limits.vision.max_prompt_image_size | number | Maximum image size in bytes |
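The field paths above can be read defensively, since the vision limits may be absent. This Python sketch treats the capabilities object as a plain nested dict shaped like those paths; how the object is obtained from the SDK is not shown here, and the helper names and sample values are illustrative assumptions.

```python
from typing import Optional


def supports_vision(capabilities: dict) -> bool:
    """Read capabilities.supports.vision, defaulting to False when missing."""
    return bool(capabilities.get("supports", {}).get("vision", False))


def vision_limits(capabilities: dict) -> Optional[dict]:
    """Read capabilities.limits.vision, or None when the model reports none."""
    return capabilities.get("limits", {}).get("vision")


# Illustrative capabilities object; the numbers are made up for the example.
capabilities = {
    "supports": {"vision": True},
    "limits": {
        "vision": {
            "supported_media_types": ["image/png", "image/jpeg"],
            "max_prompt_images": 5,
            "max_prompt_image_size": 3_145_728,
        }
    },
}

if supports_vision(capabilities):
    limits = vision_limits(capabilities)
    print(limits["max_prompt_images"])  # 5
```

Defaulting to False when a field is missing errs on the side of not sending images to a model that may not handle them.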
Vision limits type
interface VisionCapabilities {
  vision?: {
    supported_media_types: string[];
    max_prompt_images: number;
    max_prompt_image_size: number; // bytes
  };
}
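Receiving image results
When a tool returns an image (for example, a screenshot or a generated chart), the result contains an "image" content block with base64-encoded data.
| Field | Type | Description |
|---|---|---|
| type | "image" | Content block type discriminator |
| data | string | Base64-encoded image data |
| mimeType | string | MIME type (for example, "image/png") |
These image blocks appear in the result of tool.execution_complete events. For the full event lifecycle, see the Streaming session events guide.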
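Given a list of content blocks with the fields described above, the images can be pulled out and decoded as follows. This is a Python sketch under stated assumptions: the blocks are treated as plain dicts, the exact shape of the event result that contains them is not covered here, and the "text" block in the sample is only a placeholder for non-image content.

```python
import base64


def extract_images(content_blocks: list[dict]) -> list[tuple[str, bytes]]:
    """Return (mimeType, raw bytes) for every block whose type is "image"."""
    images = []
    for block in content_blocks:
        if block.get("type") == "image":
            images.append((block["mimeType"], base64.b64decode(block["data"])))
    return images


# Sample blocks for illustration; b"abc" stands in for real image bytes.
blocks = [
    {"type": "text", "text": "Screenshot captured"},
    {
        "type": "image",
        "data": base64.b64encode(b"abc").decode("ascii"),
        "mimeType": "image/png",
    },
]

for mime_type, raw in extract_images(blocks):
    print(mime_type, len(raw))  # image/png 3
```

The decoded bytes can then be written to disk or handed to an image library, depending on what the application needs.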
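Tips and limitations
| Tip | Details |
|---|---|
| Use PNG or JPEG directly | Avoids conversion overhead; these formats are sent to the LLM as-is |
| Keep images reasonably sized | Large images may be reduced in quality, which can lose important detail |
| Use absolute paths for file attachments | The runtime reads files from disk; relative paths may not resolve correctly |
| Use blob attachments for in-memory data | If you already have base64 data (for example, a screenshot or an API response), a blob avoids unnecessary disk I/O |
| Check for vision support first | Sending images to a model without vision support wastes tokens |
| Multiple images are supported | Attach several images to one message, up to the model's max_prompt_images limit |
| SVG is not supported | SVG files are text-based and are excluded from image processing |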
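When there are more images than max_prompt_images allows, one option is to split them across several messages. The Python sketch below shows only the batching step; the helper name is hypothetical, and whether to send the batches as separate messages is an application-level choice that this guide does not prescribe.

```python
def batch_attachments(attachments: list[dict], max_prompt_images: int) -> list[list[dict]]:
    """Split attachments into consecutive groups of at most max_prompt_images."""
    if max_prompt_images < 1:
        raise ValueError("max_prompt_images must be at least 1")
    return [
        attachments[i:i + max_prompt_images]
        for i in range(0, len(attachments), max_prompt_images)
    ]


# Five placeholder file attachments and an illustrative limit of 2.
images = [{"type": "file", "path": f"/absolute/path/to/img{i}.png"} for i in range(5)]
batches = batch_attachments(images, 2)
print([len(batch) for batch in batches])  # [2, 2, 1]
```

Each batch keeps the original order, so prompts that refer to "the first image" remain meaningful within a batch.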