Shizu
Shizu 1.0
Back in June 2022, just a few months before the release of ChatGPT, I discovered "AI dungeons/story games" on YouTube. It piqued my interest, and I decided to take a closer look at how they worked.
After a bit of research, I found that GPT-2, from OpenAI, was basically what I was searching for. GPT-2 differs drastically from GPT-3.5 (the model behind the first release of ChatGPT) in that it is not inherently a chatbot. It simply continues the given snippet of text with whatever it thinks comes next. Give it a headline and it'll write a fake article.
If you wanted to use it as a chatbot, you could try giving it a snippet of text like the following:
user: Hello
bot: Hi
user: What are you doing right now?
bot: Reading a book
user: What's 5+8
bot:
That is: an example conversation, followed by the actual user question, and a blank line for the bot to fill in.
With this structure, the LLM could generate a response, but it could also keep going and generate a user line. And then another bot line, another user line... Basically talking to itself, generating both the questions and the responses.
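Here is a minimal sketch of the prompt structure above, plus one common workaround for the runaway conversation: cutting the completion off before the model starts writing the next user line. The helper names (`build_prompt`, `first_reply`) and the sample completion are made up for illustration; no real model is called here.

```python
# Hypothetical helpers; the "user:"/"bot:" labels come from the example above.
FEW_SHOT = [
    ("Hello", "Hi"),
    ("What are you doing right now?", "Reading a book"),
]


def build_prompt(question, examples=FEW_SHOT):
    """Return the example conversation, the real question, and an open bot line."""
    lines = []
    for user_text, bot_text in examples:
        lines.append(f"user: {user_text}")
        lines.append(f"bot: {bot_text}")
    lines.append(f"user: {question}")
    lines.append("bot:")
    return "\n".join(lines)


def first_reply(completion, stop="\nuser:"):
    """Keep only the text before the model starts writing the next user line."""
    return completion.split(stop, 1)[0].strip()


prompt = build_prompt("What's 5+8")

# Made-up completion, standing in for what a model might return.
raw = " 13\nuser: Thanks\nbot: No problem"
print(first_reply(raw))
```

This only trims the output after the fact. It does nothing about the quality of the answer itself.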
At best, with a better prompt, you could make a stupid lying bot. Not exactly great.
Something else you could do to change the behavior of the LLM was training. Using the OpenAI API, you could send a dataset, and the model would be fine-tuned on your data.
Now I had an idea: I wanted to not only train the LLM to behave like a chatbot, which would remove the need for the fake conversation example, but also fine-tune it to talk like I do.
But for that, I would need a somewhat large amount of data: conversations where I write the responses.
At first, I naively thought of just making up fake messages and responding to them. It would be slow and painful and not exactly great.
Then I thought, why not use my Discord messages? I have people sending me messages (the prompts), and me responding (the LLM responses). And there were a lot of them too (about 300k messages at the time). I would just need to download them and format them to fit the API requests.
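A minimal sketch of what that formatting step could look like. Everything here is an assumption for illustration: the export shape (a chronological list of dicts with `author` and `content` keys), the helper names, and the `prompt`/`completion` field names, which should be checked against whatever format the fine-tuning API actually expects.

```python
import json


# Hypothetical export shape: chronological list of {"author": ..., "content": ...}.
def to_pairs(messages, me):
    """Pair each message from someone else with my message right after it."""
    pairs = []
    for prev, cur in zip(messages, messages[1:]):
        if prev["author"] != me and cur["author"] == me:
            pairs.append({"prompt": prev["content"], "completion": cur["content"]})
    return pairs


def to_jsonl(pairs):
    """Serialize the pairs as one JSON object per line."""
    return "\n".join(json.dumps(p, ensure_ascii=False) for p in pairs)


# Made-up sample conversation.
sample = [
    {"author": "friend", "content": "hi"},
    {"author": "me", "content": "hello"},
    {"author": "me", "content": "what's up"},
    {"author": "friend", "content": "want to play?"},
    {"author": "me", "content": "sure"},
]

pairs = to_pairs(sample, me="me")
print(to_jsonl(pairs))
```

Note that this simple pairing drops my follow-up messages ("what's up" above) because they don't directly answer anyone. Merging consecutive messages from the same author would be one way to keep them.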
Wowie it's going to be great
I thought
Well it was lol
I did, in the process, send 300k of my personal messages out into the wild, which uhh... hm...
Looking back, not the greatest idea
But after some fiddling, and making a Discord bot to interact with the API, here were the results:
And make all humans gray idk
It's truly incredible.
If you'd like to read more of Shizu's messages, I will make a collection of them on Shizu Messages V1
A lot of responses are actual gibberish. But some of them are funny.
Also, something I did not originally anticipate: the LLM learned to use GIFs and emojis, which made for some funny conversations.
Shizu 2.0
Shizu 1.0 lived a happy life of plotting to enslave humanity and insulting my friends (check out the message collection).
But she sadly passed away...
OpenAI removed the GPT-2 API, and she died in her sleep.
The idea of remaking Shizu was in my mind for a while, but I didn't bother until recently.
Now that AI models were available everywhere, and I was equipped with a GPU capable of training and running models locally, it was time to awaken Shizu from the ashes.
A few improvements were made:
- A better model - Qwen 3B
- A better dataset, with more diverse entries (still my own discord messages)
- Locally running
Which meant smarter and better, for the most part. Shizu is still a bit dum dum.
After some time spent making the dataset (converting the Discord messages to the correct format) and training the model (which took around 8 hours), Shizu was alive once again, now with the ability to send images and videos I had previously sent. (It has not worked at all: she keeps making up links instead of using the real ones.)
Here are some messages:
hi shizu :3
If you want to ask Shizu something, send me a message on Discord or leave a comment in the comment section!
I'll add the message/response to Shizu Messages V2
Or maybe you'll have a 1-on-1 conversation with Shizu 😳