Get started with the open-source Grok-1 model

The release of Grok-1 sparked excitement in the AI community and is considered a major milestone for open-source artificial general intelligence (AGI).

On Sunday, Elon Musk’s AI company, xAI, released the base model weights and network architecture of Grok-1, a large language model that competes with OpenAI’s ChatGPT. Publishing open versions through GitHub and BitTorrent is in line with Musk’s ongoing criticism of, and legal battle against, OpenAI for not making its AI models available to the public.

Grok-1 model
Grok-1 is the raw base-model checkpoint from the pre-training phase, which ended in October 2023. It has not been fine-tuned for any specific application, such as a chatbot. The model uses a Mixture-of-Experts (MoE) architecture with 314 billion parameters, of which about 25% of the weights are active for any given token. xAI trained Grok-1 from scratch using a custom training stack built on JAX and Rust.
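To make the Mixture-of-Experts idea concrete, here is a minimal, hypothetical top-2 routing sketch in PyTorch. The sizes and names are made up for illustration only; Grok-1’s real implementation is written in JAX and far more involved.

# Illustrative top-2 MoE routing: each token is processed by only 2 of 8 experts
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, d_model)
        gates = F.softmax(self.router(x), dim=-1)       # routing probabilities per expert
        weights, idx = gates.topk(self.top_k, dim=-1)   # keep only the 2 best experts per token
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(4, 64)).shape)  # torch.Size([4, 64])

Because only the selected experts run for each token, the compute per token corresponds to a fraction of the total parameter count, which is how a 314B-parameter model keeps roughly a quarter of its weights active at a time.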

Grok-1 benchmarks
Grok-1 scores 73% on MMLU, beating Llama 2 70B (68.9%) and Mixtral 8x7B (71%).

Main technical specifications:
314 billion parameters
Mixture-of-Experts architecture with 8 experts (2 experts active per token)
SentencePiece tokenizer with a vocabulary of 131,072 tokens
Maximum context length of 8,192 tokens
Supports activation sharding and 8-bit quantization
64 transformer layers
48 attention heads for queries (8 for keys and values)
Rotary position embeddings (RoPE)
Internal embedding size of 6,144
Grok-1’s large size requires significant hardware resources to run locally. Running it in 8-bit requires roughly 320 GB of GPU memory, which in practice means a multi-GPU machine on the order of an NVIDIA DGX H100 system (8 GPUs with 80 GB of VRAM each).
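As a rough sanity check on those figures, weight memory scales with parameter count times bytes per parameter (weights only; activations and the KV cache need extra room):

# Back-of-the-envelope weight-memory estimate for a 314B-parameter model
params = 314e9
for bits in (16, 8, 4):
    gb = params * bits / 8 / 1e9      # parameters * bytes per parameter, in GB
    print(f"{bits}-bit weights: ~{gb:.0f} GB")
# prints roughly 628 GB (16-bit), 314 GB (8-bit), and 157 GB (4-bit)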

This release contains code and weights related to Grok-1 licensed under Apache 2.0.

How to download Grok
Instructions for downloading and running Grok-1 are provided in the project’s GitHub repository, which contains sample JAX code for loading and running the open-weights model. You can clone the repository to your local machine.

To download the model, follow these steps:

git clone https://github.com/xai-org/grok-1.git && cd grok-1

pip install -U "huggingface_hub[hf_transfer]"  # hf_transfer enables faster downloads

huggingface-cli download xai-org/grok-1 --repo-type model --include "ckpt-0/*" --local-dir checkpoints --local-dir-use-symlinks False
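If you prefer to do the same download from Python, the huggingface_hub library offers snapshot_download; here is a minimal sketch mirroring the CLI command above:

# Download only the ckpt-0 weights into ./checkpoints via the Hub API
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="xai-org/grok-1",
    repo_type="model",
    allow_patterns="ckpt-0/*",   # fetch just the checkpoint directory
    local_dir="checkpoints",
)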

You can also download the weights directly from the xai-org/grok-1 model repository on the Hugging Face Hub.

# You can also clone the model repository directly:
git lfs install
git clone https://huggingface.co/xai-org/grok-1

Make sure to place the ckpt-0 folder inside the checkpoints folder, matching the layout described on GitHub.

Note: when you clone the model directly from the Hub, make sure supporting files such as run.py are also present; otherwise, fetch them manually from the GitHub repository. They are not included because the model is not currently in the Hugging Face Transformers format.
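As an optional sanity check before running anything, you can verify that the code and checkpoint landed where run.py expects them (a small sketch; adjust the paths if your layout differs):

# Confirm the repository files and the ckpt-0 checkpoint are in place
from pathlib import Path

for p in ["run.py", "requirements.txt", "tokenizer.model", "checkpoints/ckpt-0"]:
    print(f"{p}: {'found' if Path(p).exists() else 'MISSING'}")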

You can also download all of the model weights using a torrent client and this magnet link:

magnet:?xt=urn:btih:5f96d43576e3d386c9ba65b883210a393b68210e&tr=https%3A%2F%2Facademictorrents.com%2Fannounce.php&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce

Run the model
Now the model is ready. To test it with the sample code, run the following commands:

pip install -r requirements.txt
python run.py

You can change the prompt in the run.py file.

# A snippet from the run.py file
inp = "The answer to life the universe and everything is of course"
print(f"Output for prompt: {inp}", sample_from_model(gen, inp, max_len=100, temperature=0.01))

Because of the model’s size (314B parameters), you need a machine with plenty of GPU memory to run the example code.

Grok-1 PyTorch version for Transformers
A PyTorch version of Grok-1, together with converted weights, is now available on Hugging Face. The original model, written in JAX, has been translated to PyTorch. The weights were converted by mapping the tensor files to parameter keys, de-quantizing the tensors with their corresponding scales, and saving them to checkpoint files with the torch APIs.
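For intuition, de-quantizing an 8-bit tensor is essentially an element-wise multiply by its stored scale. Here is a toy sketch with hypothetical shapes, not the actual conversion script:

# Toy illustration of de-quantizing int8 weights with a per-row scale factor
import torch

w_int8 = torch.randint(-128, 127, (4, 8), dtype=torch.int8)    # quantized weights
scale = torch.rand(4, 1, dtype=torch.bfloat16) * 0.01          # hypothetical per-row scales
w_bf16 = w_int8.to(torch.bfloat16) * scale                     # recovered bfloat16 weights
print(w_bf16.dtype, w_bf16.shape)  # torch.bfloat16 torch.Size([4, 8])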

Usage:

import torch
from sentencepiece import SentencePieceProcessor
from transformers import AutoModelForCausalLM

torch.set_default_dtype(torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(
    "hpcai-tech/grok-1",
    trust_remote_code=True,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
sp = SentencePieceProcessor(model_file="tokenizer.model")

text = "Replace this with your text: What is Fine Tuning?"
input_ids = sp.encode(text)
input_ids = torch.tensor([input_ids]).cuda()
attention_mask = torch.ones_like(input_ids)
generate_kwargs = {}  # add any additional generation arguments here
inputs = {
    "input_ids": input_ids,
    "attention_mask": attention_mask,
    **generate_kwargs,
}
outputs = model.generate(**inputs)
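model.generate returns token ids, so you still need to decode them with the same SentencePiece tokenizer to get text; continuing the snippet above:

# Decode the generated ids back into text
print(sp.decode(outputs[0].tolist()))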

The Grok-1 tokenizer
A Hugging Face version of the Grok-1 tokenizer, adapted from xai-org/grok-1, is now available, so Grok-1’s tokenizer can be used with Hugging Face libraries such as Transformers, Tokenizers, and Transformers.js.

from transformers import LlamaTokenizerFast

tokenizer = LlamaTokenizerFast.from_pretrained('Xenova/grok-1-tokenizer')
assert tokenizer.encode('hello world')  # returns a non-empty list of token ids
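As a quick usage check (continuing the snippet above), you can round-trip a prompt through the tokenizer and back:

# Encode a prompt to ids, then decode it back to text
ids = tokenizer.encode("The answer to life the universe and everything")
print(len(ids), ids)
print(tokenizer.decode(ids, skip_special_tokens=True))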

Conclusion
Limited access hinders progress in AI models: while some fully open-source models exist (e.g., Mistral and Falcon), many leading models are closed source or released under restrictive licenses. Companies like Meta release “free” research models (Llama 2) but restrict commercial use and further development, which limits valuable feedback and innovation from the broader research community.

Paid exclusivity can backfire: initially, the Grok chatbot required a paid subscription, aimed at positioning Grok as a unique and edgy alternative. However, our testing shows it to be inferior to freely available options like ChatGPT or Gemini, which suggests that limited access can hinder a model’s ability to compete and improve.

Facing a problem? Ask for advice or join the discussion on Hugging Face.
