From af5a2996234768921b81d96ffaae00cb88229862 Mon Sep 17 00:00:00 2001 From: flu0r1ne Date: Wed, 1 Nov 2023 20:46:01 -0500 Subject: Initial commit --- README.md | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) create mode 100644 README.md (limited to 'README.md') diff --git a/README.md b/README.md new file mode 100644 index 0000000..bc573b4 --- /dev/null +++ b/README.md @@ -0,0 +1,31 @@ +MyLlama2 +======== + +This is a ground-up reimplementation of the Llama 2 family of large language models. +It adheres to the exact same architecture, based on a decoder-only transformer model +equipped with Group-Query Attention (GQA), key-value caching, SwiGLU feedforward layers, +and SentencePiece embeddings. It is compatible with the original Llama 2 weights. +Unlike Meta's model, this implementation does not incorporate parallel layers or any +distributed processing APIs. Consequently, it can only run on a single GPU and is also +capable of running on a CPU without the need for special tools (e.g., `torchrun`, `mpi`, etc.). + +This model was created for demonstration purposes, with the intent of sharing it with the +community. During its development, I identified a few minor issues in FAIR's +implementation, which I plan to contribute back through pull requests. I believe this +implementation is more accessible for those new to AI, and I've included references to the papers +where these concepts were first introduced. However, this code has not been extensively reviewed. +For production projects, I recommend starting with [Meta's implementation](https://github.com/facebookresearch/llama). +For high-performance CPU-only inference, consider compiling +[llama.cpp](https://github.com/ggerganov/llama.cpp) while targeting the native architecture. + +Example usage: +-------------- + +```bash +python inference_example.py \ + llama/llama-2-7b \ + ./tokenizer.model \ + --top_p 0.8 \ + --max_generation_length 100 \ + --context "Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal." +``` -- cgit v1.2.3