AIChatOne Docs
Xinference


Last updated 1 year ago


Xorbits Inference (Xinference) is an open-source project for running language models on your own machine. You can use it to serve open-source LLMs such as Llama-2 locally.

Preparation

Follow the instructions in the Xinference documentation to set up Xinference and run the llama-2-chat model.
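As a rough sketch, installing and starting a local Xinference server typically looks like the commands below. The package extras and the model launch flags vary by Xinference version and hardware, so treat these as assumptions to verify against the Xinference documentation.

```shell
# Install Xinference with the Transformers backend (one of several optional backends)
pip install "xinference[transformers]"

# Start a local server on the port used in the Configuration section
xinference-local --host 127.0.0.1 --port 9997

# In another terminal, launch the llama-2-chat model
# (size/format flags depend on your version and hardware)
xinference launch --model-name llama-2-chat --size-in-billions 7
```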

Configuration

  • API Endpoint: http://127.0.0.1:9997/v1/chat/completions

  • API Key: any random string (the local server does not validate it)

  • Model: llama-2-chat
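The settings above describe a standard OpenAI-style chat completions request. As a minimal sketch using only the Python standard library (the endpoint, model name, and placeholder key come from the list above; any OpenAI-compatible client should work the same way):

```python
import json
import urllib.request

API_ENDPOINT = "http://127.0.0.1:9997/v1/chat/completions"
API_KEY = "sk-anything"  # placeholder; the local server does not check it

def build_chat_request(prompt: str, model: str = "llama-2-chat") -> urllib.request.Request:
    """Build an OpenAI-compatible chat completions request for the local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

req = build_chat_request("Hello!")
# Sending it requires a running Xinference server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```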

Troubleshooting

  • Only models with "chat" in their name are supported.

You can find all the available models at https://inference.readthedocs.io/en/latest/models/builtin/llm/index.html.
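To check which models your local server actually exposes, you can query its OpenAI-compatible models endpoint (assuming the default port from the Configuration section):

```shell
# List models currently served by the local Xinference server
curl http://127.0.0.1:9997/v1/models
```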