1. Ollama does not use all CPU cores (Ollama is only using 4 of my 8 cores)

  • Ollama has no option for setting the number of CPUs (it does have one for GPUs), but you can try setting the num_thread option to a value much higher than 8 (the default value) and see how it works for you:

  • curl --location 'http://127.0.0.1:11434/api/chat' \
    --header 'Content-Type: application/json' \
    --data '{
      "model": "llama3",
      "messages": [
        { "role": "user", "content": "why is the sky blue?" }
      ],
      "options":{
        "num_thread": 16
      }
    }'
  • https://github.com/ollama/ollama/blob/main/docs/api.md#generate-request-with-options
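
The same request can be sent from a script, which makes it easier to try several thread counts. The following is a minimal sketch using only the Python standard library. It assumes an Ollama server is listening on 127.0.0.1:11434 and that the llama3 model has already been pulled. The helper names build_chat_payload and chat are made up for this example, and "stream": false is added so that the server replies with a single JSON object.

```python
import json
import urllib.request

# Same endpoint as the curl call above (assumes a local Ollama server).
OLLAMA_CHAT_URL = "http://127.0.0.1:11434/api/chat"


def build_chat_payload(model, prompt, num_thread):
    """Build the JSON body for /api/chat with an explicit thread count."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream of chunks
        "options": {"num_thread": num_thread},
    }


def chat(model, prompt, num_thread):
    """POST the payload to the Ollama server and return the reply text."""
    body = json.dumps(build_chat_payload(model, prompt, num_thread)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["message"]["content"]
```

For example, chat("llama3", "why is the sky blue?", 16) mirrors the curl request. Watching CPU usage in a system monitor while trying a few values of num_thread shows whether more cores are actually being used.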

2. Using a custom model with Ollama

    2.1 Convert the model to GGUF format with llama.cpp. Reference:

        https://github.com/ggerganov/llama.cpp/discussions/2948
    2.2 Deploy the converted model with Ollama. Reference:

        https://github.com/ollama/ollama?tab=readme-ov-file#customize-a-model

    2.3 Publish the model so that others can use it

        https://ollama.com/caicongyang/Qwen2.5-1.5B-Instruct-web3
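
Steps 2.1 to 2.3 can be chained in a small script. The sketch below is only an outline under several assumptions: the conversion script is called convert_hf_to_gguf.py, as in recent llama.cpp checkouts (older versions used a different script name, so check the linked discussion), the script is run from inside the llama.cpp directory, and you are already signed in to ollama.com so that ollama push is allowed. All paths, the model name, and the helper function names are placeholders invented for this example.

```python
import subprocess
from pathlib import Path


def modelfile_text(gguf_path):
    """Return a minimal Modelfile that points Ollama at a local GGUF file."""
    return f"FROM {gguf_path}\n"


def workflow_commands(hf_model_dir, gguf_path, model_name, modelfile="Modelfile"):
    """List the commands for steps 2.1 to 2.3, in order."""
    return [
        # 2.1: convert a Hugging Face checkpoint to GGUF
        # (script name is an assumption; run inside a llama.cpp checkout)
        ["python", "convert_hf_to_gguf.py", hf_model_dir, "--outfile", gguf_path],
        # 2.2: register the GGUF file with Ollama under model_name
        ["ollama", "create", model_name, "-f", modelfile],
        # 2.3: upload the model so that others can pull it
        ["ollama", "push", model_name],
    ]


def run_workflow(hf_model_dir, gguf_path, model_name):
    """Write the Modelfile, then run each command, stopping at the first failure."""
    Path("Modelfile").write_text(modelfile_text(gguf_path), encoding="utf-8")
    for command in workflow_commands(hf_model_dir, gguf_path, model_name):
        subprocess.run(command, check=True)
```

A call such as run_workflow("./my-hf-model", "./my-model.gguf", "your-username/my-model") would run all three steps. Once the model is pushed, others can fetch it with ollama pull or ollama run followed by the same model name.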
