Multimodal Model #6

james · 2024-12-17T23:46:28Z

james commented

2024-12-17 23:46:28 +00:00

Specifically, I want to try `unsloth/Llama-3.2-11B-Vision-Instruct-bnb-4bit`.

james added the enhancement label 2024-12-17 23:46:28 +00:00

james commented

2025-01-03 04:21:03 +00:00

Figure out whether an entirely separate second model is needed for vision, or whether it’s possible to train vision models on datasets without images.

james commented

2025-01-03 09:23:23 +00:00

When a message with an image is sent to Miku, we can just have the model infer a description of the image and offer that description as context in the longer LangChain instruct prompt (see #9).
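A rough sketch of that idea, in plain Python rather than LangChain so it stays self-contained. Everything named here is an assumption for illustration: `PROMPT_TEMPLATE`, `build_prompt`, the section headings, the default system text, and the `describe_image` callable (which stands in for whatever vision model ends up producing the description). None of it is taken from the actual MikuAI code.

```python
from typing import Callable, Optional

# Hypothetical instruct-style template; the project's real prompt is not shown in this issue.
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "{system}\n\n"
    "{image_context}"
    "### Input:\n"
    "{message}\n\n"
    "### Response:\n"
)


def build_prompt(
    message: str,
    image: Optional[bytes] = None,
    describe_image: Optional[Callable[[bytes], str]] = None,
    system: str = "You are Miku.",  # placeholder system text, an assumption
) -> str:
    """Build a text-only prompt, adding an image description when an image is attached."""
    image_context = ""
    if image is not None:
        if describe_image is None:
            raise ValueError("an image was attached but no describe_image callable was given")
        # The vision model (hypothetical callable) turns the image into plain text,
        # so the text-only model never has to see the pixels.
        description = describe_image(image).strip()
        if description:
            image_context = f"### Image description:\n{description}\n\n"
    return PROMPT_TEMPLATE.format(
        system=system, image_context=image_context, message=message
    )
```

With this shape, the text model's prompt format stays the same whether or not an image is attached, and the vision model can be swapped by passing a different `describe_image` callable.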

james referenced a pull request that will close this issue

2025-01-17 00:20:43 +00:00

WIP: Vision model #10

Reference: james/MikuAI#6