Playing with ColBERTv2 Embeddings and Retrieval
There are a lot of embedding models out there for use with LLMs. ColBERTv2 is a neat one. Here are some thoughts and code examples.
ColBERTv2
How you feed data into any embedding model can make a difference, and ColBERT is no exception. I started off by giving it a single HTML file containing an entire website (vimbook’s print-site one-pager). That file had a bunch of junk that wasn’t needed, which occasionally affected the
sqlite-utils insert-files https://github.com/bclavie/RAGatouille
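As a rough illustration of trimming that kind of junk before indexing, here is a minimal sketch that pulls visible text out of an HTML page using only Python's standard library. The set of tags to skip and the names `TextExtractor` and `html_to_text` are my own assumptions for illustration, not something from the original workflow or from RAGatouille.

```python
from html.parser import HTMLParser

# Assumption: content inside these tags is noise for retrieval purposes.
SKIP_TAGS = {"script", "style", "nav", "header", "footer"}


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while we are inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


sample = "<html><nav>Home | About</nav><p>Use :w to save.</p><script>x()</script></html>"
print(html_to_text(sample))
```

The cleaned text could then be split into passages and handed to whatever indexer you use. A real site would likely need a longer skip list, or a proper HTML library, to catch sidebars and similar clutter.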
Multiline script example:
# enable multilib - see link below
paru # make sure things are up to date generally
paru -S android-tools android-sdk-build-tools # includes adb and other goodies
reboot
Image example: