Below you will find pages that utilize the taxonomy term “Phi2”
2023
[Home Lab] Deploying OpenAI-Compatible LLAMA CPP Server with K3S
In this post, I expand my Home Lab by adding an always-on LLAMA CPP server for on-demand inferencing. The steps involve writing a Dockerfile, packaging Microsoft's Phi2 model, and deploying it with K3S. The result is a continuously available LLAMA server that my other inferencing projects can call.
2023
[Artificial Intelligence] Running LLaMA server in local machine
Continuing from my previous post, I prepared the environment using Pipenv and installed the OpenAI-like web server with specific CMAKE arguments. Running the server with a provided model was straightforward. To reach the remote Ubuntu machine from my Windows PC, I created an SSH tunnel with PuTTY, configured to forward port 8888. Connecting from BYO-GPT required adjusting the server endpoint in the Dart file. With this setup, I could access the Open API for the LLAMA CPP server and connect BYO-GPT to it.
2023
[Artificial Intelligence] Running LLaMA model locally
In this guide, I prepared my Ubuntu machine (32GB) for the llama.cpp build. Following Georgi Gerganov's llama.cpp, I ran the CMake commands, made sure to use the correct tag, and completed the build successfully. I downloaded Microsoft's Phi2 model in GGUF format, which allows local execution without exposing prompts or data. In a few-shot interaction, the Phi2 model gave accurate responses. I also explored optional OpenBLAS integration for improved speed, covering the installation and rebuild steps.