Updated readme, minor configuration stuff

James Shiffer 2021-12-26 11:55:19 -08:00
parent 6a475d584d
commit 2795fba194
3 changed files with 61 additions and 2 deletions

@@ -1,3 +1,62 @@
 # miku
-Discord bot/companion for the group chatte, powered by the GPT-J language model and modified with a soft prompt to understand all of our esoteric, elaborate inside jokes.
+Discord bot/companion for the group chatte, powered by the GPT-~~J~~ Neo language model and modified with a soft prompt to understand all of our esoteric, elaborate inside jokes.
## Setup
Python 3.8+ and PyTorch are required. CUDA is strongly recommended.
The `c1-1.3B` model used in development should run without high-end hardware. It has been tested on a GTX 1050 Ti with 4 GB of VRAM.
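To confirm these requirements before going further, a small check along the following lines may help. This is only a sketch: `check_environment` is a hypothetical helper, not part of this repository, and it treats a missing PyTorch install as "not available" rather than an error.

```python
import sys


def check_environment(min_version=(3, 8)):
    """Report whether the interpreter and PyTorch meet the stated requirements."""
    report = {
        "python_ok": sys.version_info >= min_version,
        "torch_installed": False,
        "cuda_available": False,
    }
    try:
        import torch  # imported lazily so the check still runs without PyTorch
    except ImportError:
        return report
    report["torch_installed"] = True
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {ok}")
```

If `cuda_available` comes back `False`, the model would presumably fall back to the CPU, which is likely to be much slower.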
Set up a virtual environment:
Linux/macOS:
```shell
python3 -m venv venv
source venv/bin/activate
```
Windows:
```shell
py -3 -m venv venv
.\venv\Scripts\activate
```
Install the required packages:
```shell
pip install -r requirements.txt
```
Copy `.env.example` to `.env` and fill in the bot's `TOKEN`.
For chat scraping, you will also need to get your own `USER_TOKEN`.
* In Discord, press Ctrl+Shift+I to open the developer tools.
* Go to the Network tab and filter by XHR requests.
* Open a different channel, scroll up, or do anything else that triggers an authenticated request.
* Click on a request that looks suitable (e.g. `messages?limit=50`).
* Under the "Request" tab, copy the contents of the `Authorization` request header.
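For reference, the two values in `.env` could be read and used roughly as follows. This is a minimal sketch that assumes `.env` holds plain `KEY=VALUE` lines; `parse_env`, `load_env`, and `auth_headers` are hypothetical helpers, and the project itself may load the file differently (for example through a dotenv package).

```python
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip("\"'")
    return values


def load_env(path=".env"):
    """Read a .env file from disk (assumed format: one KEY=VALUE per line)."""
    return parse_env(Path(path).read_text(encoding="utf-8"))


def auth_headers(user_token):
    """Send the copied Authorization header value back unchanged."""
    return {"Authorization": user_token}
```

With a filled-in `.env`, `auth_headers(load_env()["USER_TOKEN"])` would give the header dictionary for the scraping requests, under the assumptions above.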
## Usage
Scrape the messages from the chat channel you wish to use for a soft prompt. You will be prompted for the channel ID. To get it, either enable developer mode in Discord, right-click the channel, and copy its ID, or copy the last part of the channel URL in the browser.
```shell
cd src
python -m scraper
```
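If you take the channel ID from the browser URL, the "last part" can be extracted as sketched below. `channel_id_from_url` is a hypothetical helper, not part of the scraper, and the IDs in the usage note are placeholders.

```python
from urllib.parse import urlparse


def channel_id_from_url(url):
    """Return the last path segment of a Discord channel URL if it is numeric."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments or not segments[-1].isdigit():
        raise ValueError(f"no channel ID found in {url!r}")
    return segments[-1]
```

For example, `channel_id_from_url("https://discord.com/channels/123456789012345678/234567890123456789")` returns `"234567890123456789"`, which is the value to enter at the scraper's channel ID prompt.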
Train the soft prompt (TODO)
Run the Hatsune Miku bot. The first time you do this, it will download the model, which is ~5 GB.
```shell
python -m miku
```
## Final Remarks
sukima nuts

Binary file not shown.

@@ -113,7 +113,7 @@ def boot(token: str):
     if not token:
         token = input('Enter your Discord user token (Authorization request header): ')
     channel = input('Enter channel ID: ')
-    default_export = Path.cwd().parent / 'chats'
+    default_export = Path.cwd().parent.parent / 'chats'
     export = input('Enter path to export transcripts (default "chats"): ')
     scraper = Scraper(token, channel, Path(export) if export else default_export)
     scraper.scrape()
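
The default export location in this hunk depends entirely on the working directory the scraper is started from. The sketch below mirrors the fallback logic in isolation. `resolve_export` is a hypothetical function, and `PurePosixPath` with an explicit `cwd` argument stands in for `Path.cwd()` so the result is reproducible; the directory names are made up.

```python
from pathlib import PurePosixPath


def resolve_export(export, cwd):
    """Mirror the hunk's logic: use the given path, else two levels above cwd."""
    cwd = PurePosixPath(cwd)
    default_export = cwd.parent.parent / "chats"
    return PurePosixPath(export) if export else default_export


if __name__ == "__main__":
    # Empty input falls back to the default, two levels above the working directory
    print(resolve_export("", "/home/user/miku/src"))
    # Any non-empty input is used as given
    print(resolve_export("out", "/home/user/miku/src"))
```

With the hypothetical working directory `/home/user/miku/src`, an empty answer resolves to `/home/user/chats`, while a non-empty answer such as `out` is used unchanged.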