129 changes: 129 additions & 0 deletions .agents/docs/llvm-install-failure-analysis.md
@@ -0,0 +1,129 @@
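# LLVM Toolchain Install Failure Analysis

## Symptoms

When running `mcpp toolchain install llvm`, the dependency packages (libxml2, zlib, glibc, etc.) install successfully, but the LLVM package itself (800MB) is missing: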

```
~/.mcpp/registry/data/xpkgs/
├── xim-x-libxml2/ ✓ installed
├── xim-x-zlib/ ✓ installed
├── xim-x-glibc/ ✓ installed
└── xim-x-llvm/ ✗ missing
```

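## Root Cause Analysis

### Issue 1: Sealing stdin with `</dev/null` may break xlings subprocess communication

`seal_stdin()` in `platform/process.cppm:79-84` appends `</dev/null` to every POSIX command.

That fix resolved the hang on first run on macOS, but it has a side effect: xlings' internal subprocesses (such as the tar process that extracts the 800MB LLVM package) may rely on stdin for inter-process communication or signaling. Small packages (libxml2, etc.) are unaffected. The large package (LLVM) takes longer to extract and has a more complex subprocess chain, so the broken stdin may cause it to fail silently.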

### Issue 2: `2>/dev/null` swallows all error output

The command built at `xlings.cppm:432-434`:

```bash
cd ~/.mcpp && ... xlings interface install_packages --args '...' 2>/dev/null </dev/null
```

stderr is discarded entirely. If xlings writes an error message to stderr while installing LLVM, we never see it.
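
One way to keep the error output is to merge stderr into stdout and capture it instead of redirecting it to `/dev/null`. The sketch below is a POSIX-only illustration built on `popen`; `CapturedRun` and `run_capture` are hypothetical names and are not part of mcpp or xlings.

```cpp
#include <array>
#include <cstdio>
#include <string>
#include <sys/wait.h>

// Hypothetical result type for illustration; not an mcpp type.
struct CapturedRun {
    int exitCode;        // child's exit code, or -1 if it did not exit normally
    std::string output;  // combined stdout + stderr of the child
};

// Run `cmd` through the shell with stderr merged into stdout, and keep
// everything the child printed so a failure can be reported later.
inline CapturedRun run_capture(const std::string& cmd) {
    CapturedRun result{-1, {}};
    // The subshell makes the redirect apply to the whole command line.
    const std::string wrapped = "(" + cmd + ") 2>&1";
    FILE* pipe = popen(wrapped.c_str(), "r");
    if (!pipe) return result;

    std::array<char, 4096> buf{};
    std::size_t n = 0;
    while ((n = std::fread(buf.data(), 1, buf.size(), pipe)) > 0)
        result.output.append(buf.data(), n);

    // pclose() returns a wait status, not the exit code itself.
    const int status = pclose(pipe);
    if (status != -1 && WIFEXITED(status))
        result.exitCode = WEXITSTATUS(status);
    return result;
}
```

On a non-zero `exitCode`, the captured `output` could be logged or shown to the user rather than dropped.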

### Issue 3: The NDJSON handler only processes download_progress events

The `handle_line` callback at `xlings.cppm:645-692`:

```cpp
if (kind != "data") return; // ignore non-data events
if (ls.find_str("dataKind") != "download_progress") return; // only download progress is handled
```

If xlings emits an error event or a log event reporting an install failure, it is silently dropped.
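
A handler that also surfaces failures might first classify each line and only then decide what to drop. The following is a minimal sketch under several assumptions: the event kinds `"error"` and `"log"` are the hypothetical ones named above (the actual xlings event schema is not shown in this document), and `find_str` here is a naive stand-in for the project's JSON reader that handles neither escapes nor nesting.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Naive extraction of a string field from one flat JSON line.
// Illustration only: no escape handling, no nesting awareness.
inline std::optional<std::string> find_str(std::string_view line, std::string_view key) {
    const std::string needle = "\"" + std::string(key) + "\"";
    std::size_t pos = line.find(needle);
    if (pos == std::string_view::npos) return std::nullopt;
    pos += needle.size();
    auto skip_ws = [&] {
        while (pos < line.size() && (line[pos] == ' ' || line[pos] == '\t')) ++pos;
    };
    skip_ws();
    if (pos >= line.size() || line[pos] != ':') return std::nullopt;
    ++pos;
    skip_ws();
    if (pos >= line.size() || line[pos] != '"') return std::nullopt;  // not a string value
    const std::size_t close = line.find('"', pos + 1);
    if (close == std::string_view::npos) return std::nullopt;
    return std::string(line.substr(pos + 1, close - pos - 1));
}

enum class EventAction { Progress, ReportError, Log, Ignore };

// Decide what to do with one NDJSON line instead of dropping everything
// that is not a download_progress event.
inline EventAction classify_event(std::string_view line) {
    const auto kind = find_str(line, "kind");
    if (!kind) return EventAction::Ignore;
    if (*kind == "error") return EventAction::ReportError;  // assumed event kind
    if (*kind == "log") return EventAction::Log;            // assumed event kind
    const auto dataKind = find_str(line, "dataKind");
    if (*kind == "data" && dataKind && *dataKind == "download_progress")
        return EventAction::Progress;
    return EventAction::Ignore;
}
```

With a classification step like this, `ReportError` lines could be forwarded to the caller while `Progress` lines keep driving the existing callback.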

### Issue 4: Windows has a fallback, Linux does not

`package_fetcher.cppm:608-638` contains a Windows-only workaround:

```cpp
#if defined(_WIN32)
    // If verdir does not exist, check the global xlings directory
    // ~/.xlings/data/xpkgs/ and copy the payload over
    if (!std::filesystem::exists(verdir)) {
        // ... copy from ~/.xlings/ to ~/.mcpp/
    }
#endif
```

This workaround handles the case where xlings installs a package into its global directory rather than the directory specified by XLINGS_HOME. **Linux has no such fallback.**

### Why CI does not hit this problem

CI sets `MCPP_VENDORED_XLINGS="$XLINGS_BIN"`:

```yaml
export MCPP_VENDORED_XLINGS="$XLINGS_BIN"
"$MCPP" build --target x86_64-linux-musl
```

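`MCPP_VENDORED_XLINGS` triggers a special path in `make_xlings_env()` that uses the global xlings binary. In addition, toolchain installs in CI go through the xlings global sandbox (because MCPP_HOME is set explicitly), which is entirely different from the nested-sandbox scenario on a user's local machine.

In fact, **CI does not test the `mcpp toolchain install llvm` user flow at all**: CI only tests `mcpp build` (using a preinstalled toolchain).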

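## Proposed Fixes

### Fix 1: Switch the Linux path of `install_with_progress()` to the direct command (aligned with Windows)

Windows already uses the direct `xlings install ... -y` command rather than interface mode. Linux should do the same: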

```cpp
int install_with_progress(const Env& env, std::string_view target,
                          const BootstrapProgressCallback& cb)
{
    // All platforms: try the direct install command first
    auto directCmd = build_command_prefix(env) + std::format(" install {} -y", target);
    int directRc = mcpp::platform::process::run_silent(directCmd);
    if (directRc == 0) return 0;

    // If the direct command fails, fall back to interface mode
    // (which keeps the progress callback)
    // ...
}
```
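
The direct-then-fallback control flow can be sketched in isolation. The helper below is hypothetical (not mcpp's API) and goes one step beyond the snippet above: it also checks that the payload is actually present, on the assumption that an installer may exit with 0 even though the package directory was never created. That extra check is a suggestion, not something the proposed fix includes.

```cpp
#include <functional>

// Hypothetical helper: try the direct installer first, then fall back to
// the interface path when it fails OR when it reports success but the
// payload is still absent.
inline int install_with_fallback(const std::function<int()>& direct,
                                 const std::function<int()>& viaInterface,
                                 const std::function<bool()>& payloadPresent) {
    if (direct() == 0 && payloadPresent()) return 0;

    const int rc = viaInterface();
    if (rc != 0) return rc;

    // Treat "exit code 0 but nothing installed" as a failure.
    return payloadPresent() ? 0 : 1;
}
```

Passing the three steps as callables keeps the ordering logic testable without running xlings.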

### Fix 2: Add the same fallback check on Linux as on Windows

In `resolve_xpkg_path()`, extend the Windows global-directory fallback to all platforms:

```cpp
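// Remove #if defined(_WIN32) so this applies on all platforms
if (!std::filesystem::exists(verdir)) {
    // Check the global xlings directory
    auto homeDir = std::getenv("HOME");
    if (homeDir) {
        std::filesystem::path globalXpkgs =
            std::filesystem::path(homeDir) / ".xlings" / "data" / "xpkgs";
        // <package dir>/<version dir>, taken from the tail of verdir.
        // (verdir.filename().parent_path() would be empty, so take the
        // parent of verdir itself.)
        auto globalVerdir = globalXpkgs / verdir.parent_path().filename() / verdir.filename();
        if (std::filesystem::exists(globalVerdir)) {
            // Copy or symlink into the sandbox
        }
    }
}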
```

### Fix 3: Do not seal stdin for the xlings install command

Add an option to `install_with_progress()` that leaves stdin open, or have the direct install command go through `std::system()` instead of `platform::process`:

```cpp
// Run the direct command outside platform::process (no </dev/null appended)
int directRc = std::system(directCmd.c_str());
```
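
The "option that leaves stdin open" variant might be sketched as follows. `StdinPolicy`, `with_stdin_policy`, `exit_code_from_status`, and `run_shell` are hypothetical names, not existing mcpp functions. The sketch also normalizes the return value, because on POSIX `std::system()` returns a wait status rather than the exit code. Mapping signals to `128 + signal` follows the common shell convention and is a choice made here.

```cpp
#include <cstdlib>
#include <string>
#include <sys/wait.h>

enum class StdinPolicy { Inherit, Seal };  // hypothetical option

// Append the stdin redirect only when the caller asks for it.
inline std::string with_stdin_policy(std::string cmd, StdinPolicy policy) {
    if (policy == StdinPolicy::Seal) cmd += " </dev/null";
    return cmd;
}

// Convert a POSIX wait status (as returned by std::system) to an exit code.
inline int exit_code_from_status(int status) {
    if (status == -1) return -1;                              // could not run the shell
    if (WIFEXITED(status)) return WEXITSTATUS(status);        // normal exit
    if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);   // shell-style convention
    return -1;
}

// Run a shell command with the chosen stdin policy and return its exit code.
inline int run_shell(const std::string& cmd, StdinPolicy policy) {
    return exit_code_from_status(std::system(with_stdin_policy(cmd, policy).c_str()));
}
```

The install path would pass `StdinPolicy::Inherit`, while commands that previously hung on macOS could keep `StdinPolicy::Seal`.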

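### Fix 4: Add a toolchain install test to CI

Add a dedicated step to `ci.yml` that tests `mcpp toolchain install llvm`, so that this core user flow is covered.

## Recommended Implementation Order

1. **Fix 1 + Fix 3**: switch Linux to the direct command and leave stdin open (most likely to resolve the problem)
2. **Fix 2**: add the global-directory fallback (safety net)
3. **Fix 4**: add the CI test (prevents regressions)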
17 changes: 17 additions & 0 deletions .github/workflows/ci.yml
@@ -134,6 +134,23 @@ jobs:
"$MCPP" build
"$MCPP" test

- name: Toolchain install smoke test (mcpp toolchain install llvm)
run: |
# Test the core user flow: install a toolchain, create a project,
# build with it. Uses the freshly-built mcpp (not bootstrap).
MCPP=$(realpath "$(find target -type f -name mcpp -printf '%T@ %p\n' | sort -rn | head -1 | cut -d' ' -f2)")
# Install LLVM toolchain into mcpp's sandbox
"$MCPP" toolchain install llvm 20.1.7
# Set as default so the build picks it up
"$MCPP" toolchain default llvm@20.1.7
# Build a hello-world project with the installed toolchain
TMP=$(mktemp -d)
cd "$TMP"
"$MCPP" new hello
cd hello
"$MCPP" build
"$MCPP" run

- name: Fresh user experience (xlings install mcpp → new → run)
continue-on-error: true
run: |
19 changes: 17 additions & 2 deletions src/build/flags.cppm
@@ -88,9 +88,24 @@ CompileFlags compute_flags(const BuildPlan& plan) {
include_flags += " -I" + escape_path(abs);
}

// Sysroot
// Sysroot + config override for macOS.
// On macOS, xlings LLVM's clang++.cfg contains hardcoded --sysroot and
// -isystem paths from the original install location. When the package is
// copied to mcpp's sandbox, these paths become stale. We pass
// --no-default-config to ignore the cfg and provide correct paths.
std::string sysroot_flag;
if (!plan.toolchain.sysroot.empty()) {
bool is_macos_clang = mcpp::toolchain::is_clang(plan.toolchain)
&& (plan.toolchain.targetTriple.find("apple") != std::string::npos
|| plan.toolchain.targetTriple.find("darwin") != std::string::npos);
if (is_macos_clang) {
auto llvmRoot = plan.toolchain.binaryPath.parent_path().parent_path();
auto libcxxInclude = llvmRoot / "include" / "c++" / "v1";
sysroot_flag = " --no-default-config";
sysroot_flag += " -isystem" + escape_path(libcxxInclude);
if (auto sdk = mcpp::platform::macos::sdk_path())
sysroot_flag += " --sysroot=" + escape_path(*sdk);
f.sysroot = sysroot_flag;
} else if (!plan.toolchain.sysroot.empty()) {
sysroot_flag = " --sysroot=" + escape_path(plan.toolchain.sysroot);
f.sysroot = sysroot_flag;
}
18 changes: 10 additions & 8 deletions src/pm/package_fetcher.cppm
@@ -605,14 +605,17 @@ Fetcher::resolve_xpkg_path(std::string_view target,
};

auto resolve = [&]() -> std::expected<XpkgPayload, CallError> {
#if defined(_WIN32)
// Workaround: xlings on Windows may extract large packages (e.g. LLVM)
// into its global data dir instead of the mcpp sandbox, because the
// extraction subprocess doesn't inherit XLINGS_HOME. Detect this and
// copy the payload into the sandbox so mcpp remains self-contained.
// Workaround: xlings may extract large packages (e.g. LLVM) into its
// global data dir instead of the mcpp sandbox, because the extraction
// subprocess doesn't always inherit XLINGS_HOME. Detect this and copy
// the payload into the sandbox so mcpp remains self-contained.
// Originally Windows-only; extended to all platforms for the same
// reason (xlings subprocess XLINGS_HOME propagation is unreliable).
if (!std::filesystem::exists(verdir)) {
// Try xlings' own data dir (where `xlings self install` placed it)
auto xhome = std::getenv("USERPROFILE");
const char* xhome = nullptr;
#if defined(_WIN32)
xhome = std::getenv("USERPROFILE");
#endif
if (!xhome) xhome = std::getenv("HOME");
if (xhome) {
// xlings stores xpkgs at <home>/.xlings/data/xpkgs/ or
@@ -635,7 +638,6 @@ Fetcher::resolve_xpkg_path(std::string_view target,
}
}
}
#endif
if (!std::filesystem::exists(verdir)) {
return std::unexpected(CallError{
std::format("xpkg payload missing: {}", verdir.string())});
5 changes: 4 additions & 1 deletion src/toolchain/probe.cppm
@@ -262,7 +262,10 @@ probe_sysroot(const std::filesystem::path& compilerBin,
auto s = trim_line(*r);
if (!s.empty() && std::filesystem::exists(s)) return s;
}
// macOS fallback: use xcrun to discover the SDK path
// macOS fallback: use xcrun to discover the SDK path.
// The sysroot is used for regular compilation flags (flags.cppm) but
// skipped for std module precompilation on macOS (stdmod.cppm) to
// avoid breaking SDK internal header dependencies.
if (auto sdk = mcpp::platform::macos::sdk_path())
return *sdk;
return {};
20 changes: 19 additions & 1 deletion src/toolchain/stdmod.cppm
@@ -92,8 +92,26 @@ std::expected<StdModule, StdModError> ensure_built(
: mcpp::toolchain::gcc::std_bmi_path(sm.cacheDir);
sm.objectPath = sm.cacheDir / "std.o";

// Build sysroot + include flags for std module precompilation.
// On macOS, xlings LLVM's clang++.cfg contains hardcoded --sysroot and
// -isystem paths from the original install location. When the LLVM package
// is copied to mcpp's sandbox, these cfg paths become stale (still point
// to the original xlings directory). We override both:
// --sysroot → current active SDK (from xcrun)
// --no-default-config → ignore stale cfg entirely
// -isystem → correct libc++ headers in the sandbox copy
std::string sysroot_flag;
if (!tc.sysroot.empty()) {
bool is_macos = tc.targetTriple.find("apple") != std::string::npos
|| tc.targetTriple.find("darwin") != std::string::npos;
if (is_macos && is_clang(tc)) {
// Ignore the stale clang++.cfg and provide correct flags directly.
auto llvmRoot = tc.binaryPath.parent_path().parent_path();
auto libcxxInclude = llvmRoot / "include" / "c++" / "v1";
sysroot_flag = " --no-default-config";
sysroot_flag += std::format(" -isystem'{}'", libcxxInclude.string());
if (auto sdk = mcpp::platform::macos::sdk_path())
sysroot_flag += std::format(" --sysroot='{}'", sdk->string());
} else if (!tc.sysroot.empty()) {
sysroot_flag = std::format(" --sysroot='{}'", tc.sysroot.string());
}

31 changes: 18 additions & 13 deletions src/xlings.cppm
@@ -609,24 +609,29 @@ int install_with_progress(const Env& env, std::string_view target,
auto argsJson = std::format(
R"({{"targets":["{}"],"yes":true}})", target);

if constexpr (mcpp::platform::is_windows) {
mcpp::platform::env::set("XLINGS_HOME", env.home.string());
mcpp::platform::env::set("XLINGS_PROJECT_DIR", "");
std::error_code ec_mkdir;
std::filesystem::create_directories(env.home, ec_mkdir);
// Use direct `install` command instead of `interface install_packages`
// on Windows. The NDJSON interface may have issues with large packages
// where the extraction subprocess doesn't respect XLINGS_HOME.
auto directCmd = std::format("{} install {} -y",
env.binary.string(), target);
int directRc = mcpp::platform::process::run_silent(directCmd);
// All platforms: try direct `xlings install ... -y` first.
// The direct command is more reliable for large packages (e.g. LLVM
// ~800MB) because:
// - it doesn't pipe through NDJSON interface (simpler subprocess chain)
// - xlings manages its own stdin/stdout/stderr
// - extraction subprocess coordination works normally
// The NDJSON interface path is kept as a fallback for progress reporting.
{
auto directCmd = build_command_prefix(env) +
std::format(" install {} -y {}", target, mcpp::platform::shell::silent_redirect);
// Use std::system() directly — do NOT redirect stdin via </dev/null
// because xlings may need stdin for subprocess coordination during
// large package extraction.
int directRc = mcpp::platform::process::extract_exit_code(
std::system(directCmd.c_str()));
if (directRc == 0) return 0;
}

// Fallback: NDJSON interface path (provides progress callbacks).
auto cmd = [&]() -> std::string {
if constexpr (mcpp::platform::is_windows) {
// Fallback to interface path if direct install fails
return std::format("{} interface install_packages --args {} {}",
env.binary.string(),
build_command_prefix(env),
shq(argsJson),
mcpp::platform::null_redirect);
} else {