298 commits
f55ffcd
BUG: Fix using extra gpus due to match in `__init__` (#1400)
ChengjieLi28 Apr 29, 2024
5fb5663
ENH: Update chatglm3 6b model version (#1401)
codingl2k1 Apr 29, 2024
7c974be
BLD: Use self-hosted aws machine to build docker image (#1405)
ChengjieLi28 Apr 29, 2024
5d55c9c
BUG: Fix qwen tool call paramerter empty issue (#1381)
codingl2k1 Apr 30, 2024
3b03520
BUG:Fix mertics is empty when call `/v1/chat/completions` (#1406)
amumu96 May 1, 2024
07d591f
DOC: update docker doc in using xinference (#1417)
qinxuye May 3, 2024
330c12f
BUG: Fix tool calls return invalid usage (#1420)
codingl2k1 May 6, 2024
1ef3ce4
FEAT: Support query engine with cmdline (#1380)
Ago327 May 6, 2024
f98b8b5
FEAT: Ascend support (#1408)
qinxuye May 6, 2024
1063585
TST: Pin `huggingface-hub` to pass CI since it has some break changes…
ChengjieLi28 May 6, 2024
d384ff0
REF: Add the `model_engine` parameter for launching process (#1367)
hainaweiben May 6, 2024
944bf6c
FEAT: Audio support verbose_json and timestamp (#1402)
codingl2k1 May 7, 2024
79e3897
CLN: Remove actor client (#1436)
ChengjieLi28 May 8, 2024
f9629e3
ENH: make qwen_vl support streaming output (#1425)
Minamiyama May 8, 2024
dc8124b
ENH: Removed the max tokens limitation and boost performance by avoid…
mikeshi80 May 8, 2024
a410672
DOC: add the missing backslash in shell command for benchmark README …
mikeshi80 May 8, 2024
2dfc64c
ENH: Improve benchmark and add long context generate (#1423)
frostyplanet May 8, 2024
2a393a9
BUG: Fix tools ability (#1447)
mikeshi80 May 9, 2024
5724011
ENH: make yi_vl support streaming output (#1443)
Minamiyama May 9, 2024
78920d0
CLN: Remove all speculative-related codes (#1435)
ChengjieLi28 May 9, 2024
8c558ff
ENH: Some minor changes (#1453)
frostyplanet May 9, 2024
da3693e
BUG: Install error on MacOS due to `auto-gptq` (#1457)
ChengjieLi28 May 9, 2024
b4a5c01
BUG: fix some issues in query engine interface (#1442)
Ago327 May 10, 2024
d7762ee
FEAT: [UI] Add engine option when launching LLM (#1456)
yiboyasss May 10, 2024
0cb0f0e
ENH: make deepseek_vl support streaming output (#1444)
Minamiyama May 10, 2024
9aba89f
ENH: Rename `model_engine` for more clear inference backend (#1466)
ChengjieLi28 May 10, 2024
21be5ab
DOC: Usage about `model_engine` (#1468)
ChengjieLi28 May 11, 2024
6a37186
BUG: fix top_k for vllm backend (#1461)
sixsun10 May 11, 2024
0f30688
DOC: update quick start ipynb (#1482)
qinxuye May 13, 2024
96caa22
DOC: Update readme for being integrated by RAGFlow (#1493)
JinHai-CN May 14, 2024
5b6ee04
FEAT: support Yi-1.5 series (#1489)
qinxuye May 15, 2024
1ae2d1f
FEAT: [UI] embedding and rerank support the specified GPU and CPU. (#…
yiboyasss May 15, 2024
672d8ff
BUG: Docker image issue due to `torchvision` (#1485)
ChengjieLi28 May 15, 2024
bd5f7e6
ENH: Refactoring the LoRa adaptation method for the LLM model. (#1470)
hainaweiben May 16, 2024
60f1dbf
DOC: Lora usage (#1506)
ChengjieLi28 May 16, 2024
2e330b6
BUG: Docker image crash during startup due to `llama-cpp-python` (#1507)
ChengjieLi28 May 17, 2024
556149b
BUG: Fix prompt is needed when docker image builds (#1512)
ChengjieLi28 May 17, 2024
0b3e13d
ENH: Add stream_options support (#1508)
amumu96 May 17, 2024
55a0200
BUG: `llama.cpp` model failed when chat due to `lora` (#1513)
ChengjieLi28 May 17, 2024
f9b0e7a
BLD: adapt to langchain 0.2.x, which has breaking changes (#1521)
mikeshi80 May 21, 2024
e0ac4ef
CHORE: Basic benchmark/benchmark_rerank.py (#1479)
codingl2k1 May 21, 2024
8464b41
FEAT: Add command cal-model-mem (#1460)
frostyplanet May 21, 2024
5f7e994
ENH: Compatible with `huggingface-hub` `v0.23.0` (#1514)
ChengjieLi28 May 22, 2024
bf88e9b
BLD: Fix pre commit (#1527)
frostyplanet May 22, 2024
6cc4680
FEAT: add deepseek llm and coder base (#1533)
qinxuye May 23, 2024
aa6a60d
FEAT: add codeqwen1.5 (#1535)
qinxuye May 23, 2024
263a115
BLD: compatible with torch 2.3.0 (#1534)
qinxuye May 23, 2024
7c6465c
BUG: Fix start worker failed due to None device name (#1539)
codingl2k1 May 24, 2024
d4c4fa9
FEAT: Auto detect rerank type for unknown rerank type (#1538)
codingl2k1 May 24, 2024
c3925ac
ENH: convert command-r to chat (#1537)
qinxuye May 24, 2024
9d818a4
BUG: Fix gpu_idx allocate error when set replica > 1 (#1528)
amumu96 May 24, 2024
27643de
ENH: Support Intern-VL-Chat model (#1536)
amumu96 May 24, 2024
77e79f8
FEAT: Provide the functionality to query information on various cache…
hainaweiben May 24, 2024
ac8f334
BUG: fix launch model error when use torch 2.3.0 (#1543)
amumu96 May 24, 2024
568f722
ENH: added engines options to model launch details (#1546)
qinxuye May 28, 2024
f7e088d
FEAT: support Yi-1.5-chat-16k (#1544)
qinxuye May 28, 2024
18e3d24
Correct ModelActor import path in worker & supervisor (#1550)
frostyplanet May 30, 2024
f7e33a3
BUG: fix vl-model img path error (#1559)
amumu96 May 30, 2024
5b7b5b6
ENH: rm mini-internvl (#1563)
amumu96 May 30, 2024
992aa56
FEAT: Support XINFERENCE_DISABLE_METRICS env (#1547)
codingl2k1 May 30, 2024
b8658bc
BUG: Fix validation errors when define a custom baichuan-chat LLM mod…
buptzyf May 30, 2024
18ab071
ENH: add additional_option at vl gradio (#1561)
amumu96 May 31, 2024
f8dd5ba
FEAT: Support new model CogVLM (#1551)
amumu96 May 31, 2024
cb9dbb2
DOC: update readme and fix description about model engine (#1566)
qinxuye May 31, 2024
488ef46
FEAT: telechat model (#1567)
LIKEGAKKI May 31, 2024
69c09cd
ENH: add real paths column (#1555)
hainaweiben May 31, 2024
25fd15a
BUG: Fix typo for cogvlm2 (#1573)
Minamiyama Jun 1, 2024
d76549b
ENH: make CogVLM2 support stream output (#1572)
Minamiyama Jun 1, 2024
9444e93
FEAT: new model: mini-cpm-llama3-v-2.5 (#1577)
Minamiyama Jun 5, 2024
d57eb3f
FEAT: support glm4-chat & glm4-chat-1m (#1584)
qinxuye Jun 5, 2024
29df0fd
FEAT: add mistral-instruct-v0.3 (#1576)
qinxuye Jun 5, 2024
ceb24e8
FEAT: add codestral-v0.1 (#1575)
qinxuye Jun 5, 2024
8808a7f
DOC: added new models in README (#1585)
qinxuye Jun 5, 2024
dad97d4
FEAT: Support ChatTTS (#1578)
codingl2k1 Jun 6, 2024
2e3e4cb
DOC: Fix audio doc (#1593)
codingl2k1 Jun 6, 2024
5dacb3d
DOC: Usage about cal-model-memory (#1589)
wxiwnd Jun 6, 2024
7c63a80
BLD: Docker clean all images after building image on self-hosted mach…
ChengjieLi28 Jun 6, 2024
9625032
DOC: Fix audio doc (#1599)
codingl2k1 Jun 7, 2024
c523335
FEAT: Continuous batching for chat model on transformers backend (#1548)
ChengjieLi28 Jun 7, 2024
a578b3f
FEAT: support qwen2 (#1597)
qinxuye Jun 7, 2024
95ee0d1
Feat: support glm-4v 9b (#1591)
Minamiyama Jun 7, 2024
e7a6331
DOC: Continuous batching (#1602)
ChengjieLi28 Jun 7, 2024
a1cd729
DOC: add new models to readme (#1604)
qinxuye Jun 7, 2024
55c5636
BLD: Fix pip is looking multiple versions of some packages while inst…
ChengjieLi28 Jun 7, 2024
f1d0a30
BUG: Fix wheel package missing thirdparty ChatTTS (#1606)
codingl2k1 Jun 8, 2024
1ea3412
ENH: modelscope for audio models (#1607)
Minamiyama Jun 11, 2024
db69f4a
BUG: fix XINFERENCE_MODEL_SRC behavior (#1616)
LukeWang-Plus Jun 11, 2024
d9f155a
Remove selected cache models (#1613)
hainaweiben Jun 13, 2024
c51fae5
ENH: Supports `generate` interface for continuous batching (#1621)
ChengjieLi28 Jun 13, 2024
7b626c1
BUG: Filtering Step for Streaming Responses to Qwen's Tool Calls when…
zhanghx0905 Jun 13, 2024
83f2b79
FEAT: qwen2-instruct support tool call (#1631)
ayhhyhh Jun 13, 2024
d8f113b
FEAT: Added a method to download models from csghub. (#1627)
hainaweiben Jun 13, 2024
ca2fbe9
FEAT: glm4-chat support tool call (#1617)
codingl2k1 Jun 14, 2024
29b7337
FEAT: [UI] Supports viewing and deleting cache data. (#1637)
yiboyasss Jun 14, 2024
34a57df
ENH: quantization for glm-4v (#1610)
Minamiyama Jun 14, 2024
cc972cd
BUG: show error when user launch quantized model without device suppo…
Minamiyama Jun 16, 2024
aa3acc3
BUG: Fix default rerank type (#1649)
codingl2k1 Jun 17, 2024
7a70214
FEAT: Add Tools Support for Qwen Series MOE Models (#1642)
zhanghx0905 Jun 17, 2024
dbbd5d0
TST: Fix CI due to `tenacity` (#1660)
ChengjieLi28 Jun 18, 2024
5e5e691
FEAT: [UI]Modify the deletion function of a custom model. (#1656)
yiboyasss Jun 18, 2024
c2e9d6d
ENH: Continuous batching supports all the models with `transformers` …
ChengjieLi28 Jun 19, 2024
d6041d5
BUG: chat_completion not response while error appears more than 100 (…
liuzhenghua Jun 20, 2024
2bbfab1
CHORE: [pre-commit] Add exclude thirdparty rules (#1678)
frostyplanet Jun 21, 2024
21b5ab2
FEAT: [UI]Custom model presents JSON data and modifies it. (#1670)
yiboyasss Jun 21, 2024
5cef7c3
FEAT: Add Rerank model token input/output usage (#1657)
wxiwnd Jun 21, 2024
7705d4a
BLD: pin `chatglm-cpp` version `v0.3.x` (#1692)
ChengjieLi28 Jun 22, 2024
006e2c2
BUG: [UI] Fix the issue where Model Abilities in the registration mod…
yiboyasss Jun 24, 2024
f5c66fa
BUG: GGUF models cannot use GPU in docker (#1710)
ChengjieLi28 Jun 24, 2024
85ce9fe
ENH: Set the CSG Hub endpoint as an environment variable. (#1666)
hainaweiben Jun 24, 2024
b2a84c1
BUG: Fix tool call observation (#1648)
codingl2k1 Jun 25, 2024
82c07f7
FEAT: [UI] Add favorite function. (#1714)
yiboyasss Jun 27, 2024
66c66b7
BUG: [UI]fix favorite bug. (#1728)
yiboyasss Jun 27, 2024
341e008
FEAT: add SD3 support (#1723)
qinxuye Jun 27, 2024
be83ab7
FEAT: [UI] Add the function of automatically obtaining the last confi…
yiboyasss Jun 27, 2024
33e8e1e
BUG: curl with stream returns unicode chars rather than chinese chara…
ChengjieLi28 Jun 28, 2024
dc273db
FEAT: support jina-rerank-v2 (#1733)
qinxuye Jun 28, 2024
cb5d165
FEAT: `tensorizer` integration (#1579)
Zihann73 Jun 28, 2024
2a64f4b
CHORE: upgrade version fix security vulnerability (#1674)
rickywu Jun 28, 2024
8feac94
BUG: Cluster info can be accessed without authorization in the auth m…
ChengjieLi28 Jun 28, 2024
3d9c261
FEAT: Delete cluster (#1719)
hainaweiben Jun 28, 2024
7384b74
BLD: Supports Aliyun docker image (#1753)
ChengjieLi28 Jul 1, 2024
7bedd67
BUG: Fix glm4 tool call (#1747)
codingl2k1 Jul 2, 2024
6c52566
ENH: added gguf files for qwen2 (#1745)
qinxuye Jul 2, 2024
fd0f49d
BLD: GPU docker use `vllm` image as base (#1759)
ChengjieLi28 Jul 2, 2024
7ab624e
TST: Fix `llama-cpp-python` issue in CI (#1763)
ChengjieLi28 Jul 3, 2024
7e643f1
BLD: Pin `llama-cpp-python` to `v0.2.77` in Docker for stability (#1767)
ChengjieLi28 Jul 3, 2024
b8110d6
ENH: Add more log modules (#1771)
ChengjieLi28 Jul 4, 2024
48fb744
BUG: [UI] Fix authentication mode related bugs (#1772)
yiboyasss Jul 4, 2024
4a25965
ENH: Continuous batching supports `vision` model ability (#1724)
ChengjieLi28 Jul 4, 2024
e99bc6e
ENH: Add guard for model launching (#1680)
frostyplanet Jul 4, 2024
9a54bc3
BUG: Fix python client returns documents for rerank task by default (…
ChengjieLi28 Jul 4, 2024
8fff9e7
DOC: Update continuous batching and docker usage (#1785)
ChengjieLi28 Jul 4, 2024
3cb1367
FEAT: support MLX engine (#1765)
qinxuye Jul 5, 2024
aa1772f
BUG: Fix LLM based reranker may raise a TypeError (#1794)
codingl2k1 Jul 5, 2024
aada9b4
BUG: fix deepseek-vl-chat (#1795)
qinxuye Jul 5, 2024
007408c
FEAT: add gemma-2-it (#1774)
qinxuye Jul 5, 2024
c94c038
ENH: Update ChatTTS (#1776)
codingl2k1 Jul 8, 2024
51b8b87
FIX: [UI] Historical parameter echo bugs. (#1810)
yiboyasss Jul 8, 2024
9a04b32
DOC: Define a custom Rerank model (#1821)
Weaxs Jul 9, 2024
09d78e9
CHORE: Close issue when it is stale (#1827)
ChengjieLi28 Jul 10, 2024
7fb3755
CHORE: Update issue template (#1833)
ChengjieLi28 Jul 10, 2024
a1be5dd
FEAT: support choose download hub (#1841)
amumu96 Jul 11, 2024
b8d3bcf
FEAT: [UI] Specify download hub. (#1840)
yiboyasss Jul 11, 2024
8f4cbb2
DOC: update readme (#1815)
qinxuye Jul 11, 2024
ff17be3
FIX: [UI] Fix download_hub bugs. (#1846)
yiboyasss Jul 11, 2024
0f9c942
REF: Remove `chatglm-cpp` and Fix latest `llama-cpp-python` issue (#1…
ChengjieLi28 Jul 12, 2024
9bb548a
FEAT: Add support for Flexible Model (#1671)
shellc Jul 12, 2024
e916d05
BUG: cache status missing for model id with quantization placeholder …
Zihann73 Jul 12, 2024
5e3f254
ENH: Added the parameter 'worker_ip' to the 'register' model. (#1773)
hainaweiben Jul 12, 2024
e80910d
BUG: Fix stream unicode issue for chinese characters when using vllm …
ChengjieLi28 Jul 15, 2024
1035728
FEAT: support sd inpainting models (#1879)
qinxuye Jul 16, 2024
bd932c5
FEAT: Stream ChatTTS (#1812)
codingl2k1 Jul 17, 2024
4e741cf
ENH: added gguf formats for gemma-2-it (#1874)
qinxuye Jul 17, 2024
8547c58
FEAT: support codegeex4 (#1888)
qinxuye Jul 19, 2024
d14e3fd
BUG: sglang stream error while stream_option not set (#1901)
wxiwnd Jul 19, 2024
35c3008
FEAT: support internlm2.5-chat & internlm2.5-chat-1m (#1887)
qinxuye Jul 19, 2024
880929c
BUG: fix client import (#1905)
amumu96 Jul 19, 2024
0a87cfb
BUG: fix inpainting and flexible infer due to inner API change (#1907)
qinxuye Jul 22, 2024
2e85c71
FEAT: GLM4 support stream tool call (#1876)
codingl2k1 Jul 23, 2024
c80f404
FEAT: support csg-wukong-chat-v0.1 (#1916)
qinxuye Jul 24, 2024
a266c8d
ENH: added MLX for llama-3-instruct, codestral, Yi-1.5-chat, internlm…
qinxuye Jul 24, 2024
af6dfa3
ENH: add gptq for llama-3-instruct (#1934)
Phoenix500526 Jul 25, 2024
d09c1e9
FEAT: [UI]Add configuration for image and audio models. (#1922)
yiboyasss Jul 26, 2024
a0a4d1e
FEAT: support mistral-nemo-instruct (#1936)
qinxuye Jul 26, 2024
52139c5
FEAT: CosyVoice speech (#1881)
codingl2k1 Jul 26, 2024
2a67d8b
FEAT: add llama-3.1, llama-3.1-instruct (#1932)
Weaxs Jul 26, 2024
c521cdd
FEAT: support mistral-large-instruct (#1944)
qinxuye Jul 26, 2024
681246f
Feat: support for llama 3.1 for vllm (#1935)
Phoenix500526 Jul 26, 2024
d5562f8
DOC: update new models to readme (#1946)
qinxuye Jul 26, 2024
aa51ff2
FEAT: add rembg flexible model to remove background of image (#1917)
qinxuye Jul 26, 2024
202e6cb
ENH: Add support of sglang for llama 3 qwen 2 (#1947)
luweizheng Jul 27, 2024
94580f6
BUG: Fix GLM chat (#1966)
codingl2k1 Jul 29, 2024
76e3ae5
REF: enable sglang by default (#1953)
qinxuye Jul 30, 2024
67bebc3
ENH: add cache_limit_gb option for MLX (#1954)
qinxuye Jul 30, 2024
09a18ac
BUG: fix match for transformers from model registered (#1955)
qinxuye Jul 30, 2024
38eeae4
BUG: Load llama.so failed in docker image (#1974)
ChengjieLi28 Jul 30, 2024
aafd36e
ENH: [benchmark] Add api-key support (#1961)
frostyplanet Jul 30, 2024
31523d6
DOC: ascend support (#1978)
qinxuye Jul 30, 2024
e056e65
DOC: add CosyVoice doc (#1980)
qinxuye Jul 31, 2024
fdab8e5
BUG: [UI]Modifying 'model format' again resulted in an error message.…
yiboyasss Aug 1, 2024
c45f264
ENH: Support for Gemma 2 and Llama 3.1 Models for vllm & sglang (#1929)
vikrantrathore Aug 1, 2024
2c810c6
FEAT: Supports model_path input when launching models (#1918)
Valdanitooooo Aug 1, 2024
7194405
ENH: [K8s] worker log dir name (#1997)
ChengjieLi28 Aug 1, 2024
32ee89b
BUG: fix loading multiple gguf parts (#1987)
qinxuye Aug 2, 2024
a7dd8d9
ENH: support image_to_image (#1986)
qinxuye Aug 2, 2024
be149a8
DOC: Documents for K8s (#2004)
ChengjieLi28 Aug 2, 2024
dd85cfe
FEAT: Support gte-Qwen2-7B-instruct and multi gpu deploy (#1994)
amumu96 Aug 2, 2024
8e292b9
BUG: fix flexible model register in worker (#2011)
frostyplanet Aug 4, 2024
a5e827d
ENH: Improve internal server error (#2009)
codingl2k1 Aug 4, 2024
ec421ed
BUG: [UI] Fix the 'model_path' bug. (#2015)
yiboyasss Aug 5, 2024
1112993
BUG: fix custom embedding launch error (#2016)
amumu96 Aug 5, 2024
0057ede
FEAT: support SenseVoice audio-to-text model (#2008)
qinxuye Aug 5, 2024
3f7dc2d
FEAT: support flux.1-schnell & flux.1-dev (#2007)
qinxuye Aug 6, 2024
957c613
CHORE: Increased frequency of issue processing (#2024)
ChengjieLi28 Aug 6, 2024
54c418e
FEAT: support kolors image model (#2028)
qinxuye Aug 6, 2024
6cf7563
FEAT: Add support for llama-3.1-instruct 405B model (#2025)
frostyplanet Aug 8, 2024
9b713bc
TST: Fix some dependency version issues (#2042)
ChengjieLi28 Aug 8, 2024
7d83300
REF: Mark `Deprecate` for `prompt`, `system_prompt` and `chat_history…
ChengjieLi28 Aug 8, 2024
8f0a0be
DOC: Directly launch custom model by `model_path` (#2047)
ChengjieLi28 Aug 8, 2024
e3f277d
DOC: fix typo in README (#2048)
ArtificialZeng Aug 8, 2024
a826372
ENH: Add `stream` option in Benchmark (#2038)
Dawnfz-Lenfeng Aug 9, 2024
9a02b40
ENH: optimize availability of vLLM (#2046)
qinxuye Aug 9, 2024
93e6604
ENH: [worker] Allow init supervisor_ref lazy (#1958)
frostyplanet Aug 9, 2024
cbdc811
ENH: optimize performance of sglang (#2050)
qinxuye Aug 9, 2024
3ebe1f3
FEAT: Support CogVideoX video model (#2049)
codingl2k1 Aug 9, 2024
3e7ed86
FEAT: Support MiniCPM-v-2_6 (#2031)
Minamiyama Aug 9, 2024
c4cbd38
DOC: update readme & add tips for large image models (#2056)
qinxuye Aug 10, 2024
9afee76
BUG: limit AutoAWQ version to fix docker issue (#2067)
qinxuye Aug 12, 2024
de06eb7
FEAT: add gemma-2-it 2b & internlm2.5-chat 1.8b and 20b & update vide…
qinxuye Aug 14, 2024
894c964
BUG: Fix custom glm4 & remove tool calls of ChatGLM3 (#2081)
codingl2k1 Aug 15, 2024
3540f2b
BUG: [UI] Infinited loop with login (#2039)
WalkerWang731 Aug 15, 2024
b6d655e
FEAT: support FP8 for vllm & sglang engine (#2069)
qinxuye Aug 15, 2024
e04eb43
ENH: make MiniCPM v2.6 support video (#2068)
Minamiyama Aug 16, 2024
f5229a2
REF: Remove some builtin old models and `ggmlv3` model format (#2086)
ChengjieLi28 Aug 16, 2024
e4d2257
Feat: Support internvl2 and internvl stream (#2079)
amumu96 Aug 16, 2024
466f163
FEAT: ChatTTS speech voice support encoded speaker str (#2096)
codingl2k1 Aug 17, 2024
b0848ae
REF: use utils._decode_image replacing same codes in individual vl fi…
Minamiyama Aug 18, 2024
9804c55
DOC: add more doc about ChatTTS (#2108)
qinxuye Aug 18, 2024
05b6180
BUG: fix asyncio.Queue error in benchmark (#2113)
Dawnfz-Lenfeng Aug 19, 2024
95d8786
BUG: fix deleting cache (#2114)
qinxuye Aug 19, 2024
7c891a4
DOC: remove models deleted (#2122)
qinxuye Aug 20, 2024
eb9a4ea
FEAT: [UI] Add other parameters to other models besides the LLM model…
yiboyasss Aug 21, 2024
ce0ff04
FEAT: support SD3-medium inpainting (#2137)
qinxuye Aug 22, 2024
6a372d6
ENH: make internvl2 support video (#2104)
Minamiyama Aug 23, 2024
c6a58ba
FEAT: Added the model dtype parameter for embedding (currently only s…
Zzzz1111 Aug 23, 2024
bd4c26d
FEAT: Support fish speech model (#2119)
codingl2k1 Aug 23, 2024
4d35b6f
ENH: support process_image with padding for image_to_image (#2109)
qinxuye Aug 23, 2024
b66c1d4
BUG: Fix concurrent ops in worker initialization (#2125)
frostyplanet Aug 23, 2024
07f8672
FEAT: support CogVLM2-video (#2110)
Minamiyama Aug 23, 2024
e672767
Bug: fix audio ability errorwhen get instance info (#2147)
amumu96 Aug 23, 2024
16d1193
DOC: Add doc for fish speech and cogvlm2 video (#2149)
codingl2k1 Aug 23, 2024
b500224
FEAT: Support LMDeploy for internvl2 and fix finish reasion miss at i…
amumu96 Aug 23, 2024
2460978
BUG: fix docker conflict (#2156)
amumu96 Aug 25, 2024
e7f5984
ENH: support padding for sd inpainting model (#2165)
wxiwnd Aug 27, 2024
1c8889a
EHN: clean cache for VL models (#2163)
qinxuye Aug 27, 2024
4d77065
ENH: Move matcha to third party (#2166)
codingl2k1 Aug 27, 2024
2b0c091
ENH: Fix callback status when model die (#2172)
frostyplanet Aug 27, 2024
1d78798
ENH: support cosyvoice-300m-instruct without instruction (#2175)
qinxuye Aug 28, 2024
c289807
ENH: Remove opencc and fast_whisper (#2179)
codingl2k1 Aug 28, 2024
5216e6f
ENH: solve the problem of health check for image-to-text model (#2182)
luhairong11 Aug 28, 2024
b91aae8
BUG: docker compose failed due to empty entrypoint (#2180)
ChengjieLi28 Aug 28, 2024
de80532
BUG: 🐛 fix unable launch qwen2-embedding (#2185)
Zzzz1111 Aug 29, 2024
ea407cc
BUG: configuration key different by sglang's version (#2188)
lordk911 Aug 29, 2024
71029b5
BUG: fix lora not load in transformers engine (#2194)
amumu96 Aug 29, 2024
db2b846
BUG: fix register model list error (#2189)
amumu96 Aug 30, 2024
bb73a97
BUG: Fix list video model (#2190)
codingl2k1 Aug 30, 2024
10094f9
CHORE: Clean test env (#2183)
codingl2k1 Aug 30, 2024
2f8c3ce
BUG: fix custom path test error (#2200)
amumu96 Aug 30, 2024
f3d510e
FEAT: support CogVideoX-5b (#2197)
qinxuye Aug 30, 2024
c84a59e
fix: detail
yiboyasss Sep 2, 2024
26 changes: 0 additions & 26 deletions .github/ISSUE_TEMPLATE/bug_report.md

This file was deleted.

77 changes: 77 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.yaml
@@ -0,0 +1,77 @@
name: "Bug Report"
description: Submit a bug report to help us improve Xinference. You should provide as much useful information as possible rather than simply describing what happened. / 提交一个问题报告来帮助我们改进 Xinference。你必须提供有用的信息而不只是描述发生的现象,否则将不予处理。
body:
  - type: textarea
    id: system-info
    attributes:
      label: System Info / 系統信息
      description: Your operating environment / 您的运行环境信息
      placeholder: Includes Cuda version, transformers / llama-cpp-python / vllm version, Python version, operating system... / 包括Cuda版本,transformers / llama-cpp-python / vllm版本,Python版本,操作系统等。
    validations:
      required: true

  - type: checkboxes
    id: information-scripts-examples
    attributes:
      label: Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?
      description: 'How are you using Xinference? / 以何种方式使用 Xinference?'
      options:
        - label: docker / docker
        - label: pip install / 通过 pip install 安装
        - label: installation from source / 从源码安装

  - type: textarea
    id: start-way
    attributes:
      label: Version info / 版本信息
      description: The version of Xinference you are running / Xinference 版本
    validations:
      required: true

  - type: textarea
    id: commandline
    attributes:
      label: The command used to start Xinference / 用以启动 xinference 的命令
      description: |
        Please provide the command used to start Xinference.
        If it is a distributed scenario, the commands for starting the supervisor and worker need to be listed separately.
        If it is a Docker scenario, please provide the complete command for starting Xinference through Docker.
        If it is another method, please describe it specifically.

        请提供启动 xinference 的命令。
        如果是分布式场景,启动 supervisor 和 worker 的命令需要分别列出。
        如果是docker场景,请提供通过 docker 启动 xinference 的完整命令。
        如果是其他方式,请具体描述。
    validations:
      required: true

  - type: textarea
    id: reproduction
    validations:
      required: true
    attributes:
      label: Reproduction / 复现过程
      description: |
        Please provide a code example that reproduces the problem you encountered, preferably with a minimal reproduction unit.
        If you have code snippets, error messages, stack traces, please provide them here as well.
        Please format your code correctly using code tags. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
        Do not use screenshots, as they are difficult to read and (more importantly) do not allow others to copy and paste your code.

        请提供能重现您遇到的问题的代码示例,最好是最小复现单元。
        如果您有代码片段、错误信息、堆栈跟踪、涉及的命令行操作等也请在此提供。
        请使用代码标签正确格式化您的代码。请参见 https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
        请勿使用截图,因为截图难以阅读,而且(更重要的是)不允许他人复制粘贴您的代码。
      placeholder: |
        Steps to reproduce the behavior/复现Bug的步骤:

        1.
        2.
        3.

  - type: textarea
    id: expected-behavior
    validations:
      required: true
    attributes:
      label: Expected behavior / 期待表现
      description: "A clear and concise description of what you would expect to happen. / 简单描述您期望发生的事情。"
20 changes: 0 additions & 20 deletions .github/ISSUE_TEMPLATE/feature_request.md

This file was deleted.

34 changes: 34 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.yaml
@@ -0,0 +1,34 @@
name: "Feature request"
description: Submit a request for a new Xinference feature / 提交一个新的 Xinference 的功能建议
labels: [ "feature" ]
body:
  - type: textarea
    id: feature-request
    validations:
      required: true
    attributes:
      label: Feature request / 功能建议
      description: |
        A brief description of the functional proposal.
        对功能建议的简述。

  - type: textarea
    id: motivation
    validations:
      required: true
    attributes:
      label: Motivation / 动机
      description: |
        Your motivation for making the suggestion. If that motivation is related to another GitHub issue, link to it here.
        您提出建议的动机。如果该动机与另一个 GitHub 问题有关,请在此处提供对应的链接。

  - type: textarea
    id: contribution
    validations:
      required: true
    attributes:
      label: Your contribution / 您的贡献
      description: |

        Your PR link or any other link you can help with.
        您的PR链接或者其他您能提供帮助的链接。
10 changes: 0 additions & 10 deletions .github/ISSUE_TEMPLATE/other.md

This file was deleted.

9 changes: 0 additions & 9 deletions .github/ISSUE_TEMPLATE/question.md

This file was deleted.

40 changes: 34 additions & 6 deletions .github/workflows/docker-cd.yaml
@@ -14,10 +14,10 @@ concurrency:

 jobs:
   build:
-    runs-on: ubuntu-latest
+    runs-on: self-hosted
     strategy:
       matrix:
-        python-version: [ "3.10" ]
+        python-version: [ "3.9" ]
     steps:
       - name: Check out code
         uses: actions/checkout@v3
@@ -38,10 +38,6 @@ jobs:
           DOCKER_ORG: ${{ secrets.DOCKERHUB_USERNAME }}
           PY_VERSION: ${{ matrix.python-version }}
         run: |
-          sudo rm -rf /usr/share/dotnet
-          sudo rm -rf /opt/ghc
-          sudo rm -rf "/usr/local/share/boost"
-          sudo rm -rf "$AGENT_TOOLSDIRECTORY"
           if [[ "$GITHUB_REF" =~ ^"refs/tags/" ]]; then
             export GIT_TAG=$(echo "$GITHUB_REF" | sed -e "s/refs\/tags\///g")
           fi
@@ -65,11 +61,43 @@ jobs:
             docker push "$DOCKER_ORG/xinference:${IMAGE_TAG}"
             docker build -t "$DOCKER_ORG/xinference:${IMAGE_TAG}-cpu" --progress=plain -f xinference/deploy/docker/cpu.Dockerfile .
             docker push "$DOCKER_ORG/xinference:${IMAGE_TAG}-cpu"
+            echo "XINFERENCE_IMAGE_TAG=${IMAGE_TAG}" >> $GITHUB_ENV
           done

           if [[ -n "$GIT_TAG" ]]; then
             docker tag "$DOCKER_ORG/xinference:${GIT_TAG}" "$DOCKER_ORG/xinference:latest"
             docker push "$DOCKER_ORG/xinference:latest"
             docker tag "$DOCKER_ORG/xinference:${GIT_TAG}-cpu" "$DOCKER_ORG/xinference:latest-cpu"
             docker push "$DOCKER_ORG/xinference:latest-cpu"
+            echo "XINFERENCE_GIT_TAG=${GIT_TAG}" >> $GITHUB_ENV
           fi
+
+      - name: Log in to Aliyun Docker Hub
+        uses: docker/login-action@v1
+        with:
+          registry: registry.cn-hangzhou.aliyuncs.com
+          username: ${{ secrets.DOCKERHUB_ALIYUN_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_ALIYUN_PASSWORD }}
+
+      - name: Push docker image to Aliyun
+        shell: bash
+        if: ${{ github.repository == 'xorbitsai/inference' }}
+        env:
+          DOCKER_ORG: registry.cn-hangzhou.aliyuncs.com/xprobe_xinference
+        run: |
+          docker tag "xprobe/xinference:${XINFERENCE_IMAGE_TAG}" "${DOCKER_ORG}/xinference:${XINFERENCE_IMAGE_TAG}"
+          docker push "${DOCKER_ORG}/xinference:${XINFERENCE_IMAGE_TAG}"
+          docker tag "xprobe/xinference:${XINFERENCE_IMAGE_TAG}-cpu" "${DOCKER_ORG}/xinference:${XINFERENCE_IMAGE_TAG}-cpu"
+          docker push "${DOCKER_ORG}/xinference:${XINFERENCE_IMAGE_TAG}-cpu"
+          if [[ -n "$XINFERENCE_GIT_TAG" ]]; then
+            docker tag "xprobe/xinference:${XINFERENCE_GIT_TAG}" "$DOCKER_ORG/xinference:latest"
+            docker push "$DOCKER_ORG/xinference:latest"
+            docker tag "xprobe/xinference:${XINFERENCE_GIT_TAG}-cpu" "$DOCKER_ORG/xinference:latest-cpu"
+            docker push "$DOCKER_ORG/xinference:latest-cpu"
+          fi
+
+      - name: Clean docker image cache
+        shell: bash
+        if: ${{ github.repository == 'xorbitsai/inference' }}
+        run: |
+          docker system prune -f -a
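Note on the tag handling in `docker-cd.yaml`: the flow of deriving a tag from `GITHUB_REF` and then re-tagging the pushed images for the Aliyun registry can be sketched as a dry run in shell. This is only an illustration under stated assumptions. The function names `git_tag_from_ref` and `plan_mirror_push` and the sample tag `v1.2.3` are hypothetical and not part of this PR, the sketch strips the `refs/tags/` prefix with parameter expansion rather than the workflow's `sed` call, and it prints the `docker` commands instead of running them.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the tag logic in docker-cd.yaml.
# All function names and sample values here are hypothetical.

# Print the tag name when the ref is a tag ref; print nothing otherwise.
git_tag_from_ref() {
  local ref="$1"
  if [[ "$ref" =~ ^refs/tags/ ]]; then
    # Assumption: prefix stripping stands in for the workflow's sed call.
    printf '%s\n' "${ref#refs/tags/}"
  fi
}

# Print (not execute) the docker commands that mirror the GPU and CPU
# images from one registry namespace to another.
# Args: source org, destination org, image tag, optional git tag.
plan_mirror_push() {
  local src_org="$1" dst_org="$2" image_tag="$3" git_tag="${4:-}"
  local suffix
  for suffix in "" "-cpu"; do
    echo "docker tag ${src_org}/xinference:${image_tag}${suffix} ${dst_org}/xinference:${image_tag}${suffix}"
    echo "docker push ${dst_org}/xinference:${image_tag}${suffix}"
  done
  # Only tagged builds also move the "latest" aliases.
  if [[ -n "$git_tag" ]]; then
    for suffix in "" "-cpu"; do
      echo "docker tag ${src_org}/xinference:${git_tag}${suffix} ${dst_org}/xinference:latest${suffix}"
      echo "docker push ${dst_org}/xinference:latest${suffix}"
    done
  fi
}

# Usage with a made-up tag; passing the same value as image tag and git tag
# is an assumption, since the loop that sets IMAGE_TAG is not shown in the diff.
tag="$(git_tag_from_ref "refs/tags/v1.2.3")"
plan_mirror_push "xprobe" "registry.cn-hangzhou.aliyuncs.com/xprobe_xinference" "$tag" "$tag"
```

Printing the commands keeps the sketch safe to run anywhere. In the workflow itself the two values travel between steps through `$GITHUB_ENV` as `XINFERENCE_IMAGE_TAG` and `XINFERENCE_GIT_TAG`.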
24 changes: 24 additions & 0 deletions .github/workflows/issue.yaml
@@ -0,0 +1,24 @@
name: Close inactive issues
on:
  schedule:
    - cron: "0 19 * * *"
  workflow_dispatch:

jobs:
  close-issues:
    runs-on: ubuntu-latest
    permissions:
      issues: write
      pull-requests: write
    steps:
      - uses: actions/stale@v9
        with:
          days-before-issue-stale: 7
          days-before-issue-close: 5
          stale-issue-label: "stale"
          stale-issue-message: "This issue is stale because it has been open for 7 days with no activity."
          close-issue-message: "This issue was closed because it has been inactive for 5 days since being marked as stale."
          days-before-pr-stale: -1
          days-before-pr-close: -1
          operations-per-run: 500
          repo-token: ${{ secrets.GITHUB_TOKEN }}
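Note on the thresholds in `issue.yaml`: with `days-before-issue-stale: 7` and `days-before-issue-close: 5`, an issue with no activity is expected to be labelled stale after roughly 7 days and closed roughly 5 days after that. The timeline can be sketched in shell as below. This is a simplified model, not the behaviour of `actions/stale` itself: the function name `issue_state` is hypothetical, inactivity is counted in whole days, and the once-a-day cron schedule and any activity after the stale label are ignored.

```shell
#!/usr/bin/env bash
# Simplified model of the issue timeline configured in issue.yaml.
# issue_state is a hypothetical helper, not part of actions/stale.

STALE_AFTER_DAYS=7   # days-before-issue-stale
CLOSE_AFTER_DAYS=5   # days-before-issue-close, counted after the stale label

# Print the expected state of an issue after N whole days without activity.
issue_state() {
  local inactive_days="$1"
  if (( inactive_days >= STALE_AFTER_DAYS + CLOSE_AFTER_DAYS )); then
    echo "closed"
  elif (( inactive_days >= STALE_AFTER_DAYS )); then
    echo "stale"
  else
    echo "active"
  fi
}

# Walk through a few sample inactivity periods.
for days in 3 7 11 12; do
  echo "${days} days inactive: $(issue_state "$days")"
done
```

Pull requests are outside this model because the workflow sets both `days-before-pr-stale` and `days-before-pr-close` to `-1`.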