
Setup and installation of LLaMa Factory on GCP

This section describes how to provision and connect to the ‘Custom LLMs, Ready in Minutes with LLaMa Factory’ VM solution on GCP.

  1. Open the Custom LLMs, Ready in Minutes with LLaMa Factory listing on the GCP Marketplace.

  2. Click Get Started.

/img/gcp/llama-factory/marketplace.png

It will ask you to enable the required APIs if they are not already enabled for your account. Click Enable as shown in the screenshot.

/img/gcp/nvidia-ubuntu/enable-api.png

  • It will take you to the agreement page. On this page, you can change the project using the project selector in the top navigation bar, as shown in the screenshot below.

  • Accept the terms of the agreement by ticking the checkbox and clicking the AGREE button. /img/common/gcp_agreement_page.png

  • A popup confirms that you have agreed. Click Deploy. /img/common/gcp_agreement_accept_page.png

  • On the deployment page, give your deployment a name.

  • In the Deployment Service Account section, select the Existing radio button and choose a service account from the Select a Service Account dropdown.
  • If you don’t see any service account in the dropdown, switch to the New Account radio button and create a new service account here.
  • If, after selecting the New Account option, you get the permission error below, ask your GCP admin to create a service account by following the step-by-step guide to create a GCP Service Account, then refresh this deployment page; once the service account is created, it will be available in the dropdown.

“You are missing resourcemanager.projects.setIamPolicy permission, which is needed to set the required roles on the created Service Account”
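If you hit this error, an admin with sufficient permissions can also create the service account from the gcloud CLI. This is a hedged sketch: the account name llama-factory-sa is an illustrative placeholder, YOUR_PROJECT_ID stands for your project, and the exact role your organization requires may differ from roles/editor.

gcloud iam service-accounts create llama-factory-sa --display-name="LLaMa Factory deployment"
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID --member="serviceAccount:llama-factory-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com" --role="roles/editor"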
  • Select a zone where you want to launch the VM (such as us-east1-a).

  • Optionally change the number of cores and amount of memory. (This defaults to 8 vCPUs and 30 GB RAM.)

Minimum VM specs: 8 vCPUs / 30 GB memory

/img/gcp/llama-factory/cpu-instance.png

This VM can also be deployed on an NVIDIA T4 GPU instance for faster training. To deploy the VM with a GPU, click the GPU tab as shown in the screenshot below and select an NVIDIA T4 GPU instance. Note that GPU availability is limited to specific regions, zones, and machine types; if you do not see a GPU option for your selected region, zone, or machine type, try adjusting those settings to find an available configuration.

/img/gcp/llama-factory/gpu-instance.png
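Before picking a zone, you can also list where T4 GPUs are offered from the gcloud CLI. This assumes the accelerator is registered under its usual GCP name, nvidia-tesla-t4:

gcloud compute accelerator-types list --filter="name=nvidia-tesla-t4"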

  • Optionally change the boot disk type and size. (These default to ‘Standard Persistent Disk’ and 60 GB, respectively.)

  • Optionally change the network and subnetwork names. Make sure that whichever network you specify has ports 22 (for SSH), 3389 (for RDP), and 443 (for HTTPS) exposed; see the sample firewall rule after this list.

  • Click Deploy when you are done.

  • Custom LLMs, Ready in Minutes with LLaMa Factory will begin deploying.
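If you deploy into a custom network, the three required ports can be opened with a firewall rule along the lines of the sketch below; the rule name allow-llama-factory and the network name YOUR_NETWORK are placeholders:

gcloud compute firewall-rules create allow-llama-factory --network=YOUR_NETWORK --allow=tcp:22,tcp:3389,tcp:443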

/img/gcp/llama-factory/deployed-01.png

/img/gcp/llama-factory/deployed-02.png

/img/gcp/llama-factory/deployed-03.png

  3. A summary page is displayed when the Compute Engine instance is successfully deployed. Click the Instance link to go to the instance page.

  4. On the instance page, click the “SSH” button and select “Open in browser window”.

/img/gcp/puppet-support/ssh-option.png

  5. This will open an SSH window in the browser. Switch to the ubuntu user and navigate to the ubuntu home directory.
sudo su ubuntu
cd /home/ubuntu/

/img/gcp/llama-factory/switch-user.png
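If you prefer a local terminal over the browser window, the same SSH session can typically be opened with the gcloud CLI; the instance name and zone below are placeholders:

gcloud compute ssh YOUR_INSTANCE_NAME --zone=us-east1-a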

  6. Run the command below to set the password for the “ubuntu” user.
sudo passwd ubuntu

/img/gcp/llama-factory/update-passwd.png

  7. Now that the password for the ubuntu user is set, you can connect to the VM’s desktop environment from a local Windows machine using RDP, or from a local Linux machine using Remmina.

  8. To connect using RDP from a Windows machine, first note the external IP of the VM from the VM details page, as highlighted below.

/img/gcp/llama-factory/public-ip.png
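The external IP can also be read from the CLI instead of the console; the instance name and zone are placeholders:

gcloud compute instances describe YOUR_INSTANCE_NAME --zone=us-east1-a --format="get(networkInterfaces[0].accessConfigs[0].natIP)"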

  9. From your local Windows machine, go to the Start menu and, in the search box, type and select “Remote Desktop Connection”.

  10. In the “Remote Desktop Connection” wizard, paste the external IP and click Connect.

/img/gcp/jupyter-python-notebook/rdp.png
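Alternatively, the Remote Desktop client can be launched directly from the Run dialog (Win+R) or a command prompt, substituting the VM’s external IP:

mstsc /v:EXTERNAL_IP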

  11. This will connect you to the VM’s desktop environment. Provide “ubuntu” as the username and the password set in step 6 to authenticate, then click OK.

/img/gcp/jupyter-python-notebook/rdp-login.png

  12. You are now connected to the out-of-the-box Custom LLMs, Ready in Minutes with LLaMa Factory VM’s desktop environment from your Windows machine.

/img/azure/minikube/rdp-desktop.png

  13. To connect using RDP from a Linux machine, first note the external IP of the VM from the VM details page. Then, from your local Linux machine, open the menu and, in the search box, type and select “Remmina”.

Note: If you don’t have Remmina installed on your Linux machine, first install it following the instructions for your Linux distribution.
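For example, on Debian or Ubuntu it is typically available as the remmina package:

sudo apt install remmina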

/img/gcp/common/remmina-search.png

  14. In the “Remmina Remote Desktop Client” wizard, select RDP from the dropdown, paste the external IP, and press Enter.

/img/gcp/common/remmina-external-ip.png

  15. This will connect you to the VM’s desktop environment. Provide “ubuntu” as the username and the password set in step 6 to authenticate, then click OK.

/img/gcp/common/remmina-rdp-login.png

  16. You are now connected to the out-of-the-box Custom LLMs, Ready in Minutes with LLaMa Factory VM’s desktop environment from your Linux machine.

/img/azure/minikube/rdp-desktop.png

  17. The VM generates a random password for logging in to the LLaMa Factory Web Interface. To get the password, connect via the SSH terminal as shown in the steps above and run the command below.
cat llama-factory-passwd.txt

The username is admin; the password is the randomly generated one stored in this file.

/img/azure/llama-factory-vm/llama-factory-passwd.png

  18. To access the LLaMa Factory Web Interface, copy the public IP address of the VM and open https://public_ip_of_vm in your local browser. Make sure to use https, not http.

The browser will display an SSL certificate warning. Accept the warning and continue.

/img/azure/chromadb-vm/browser-warning.png
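If the page does not load at all, you can first check from the SSH terminal that the interface is serving on port 443; this assumes it also answers on localhost, and -k tells curl to accept the self-signed certificate:

curl -k https://localhost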

  19. Provide the ‘admin’ user and the password retrieved in step 17 above. /img/azure/llama-factory-vm/login-with-passwd.png

  20. You are now logged in to the LLaMa Factory Web Interface. Here you can select different values and train, chat with, and evaluate models.

/img/azure/llama-factory-vm/llama-factory-homepage.png

Note: If you are using a CPU instance type, make sure to change the default Compute Type from bf16 to fp16 or fp32. If training starts with the default bf16 on a CPU instance, it will fail with the error message “Your setup doesn’t support bf16/gpu”. Also note that CPU instances take much longer to finish training than GPU instances.

/img/azure/llama-factory-vm/change-compute-type.png

/img/azure/llama-factory-vm/bf16-error.png

  21. To get started, you can set the values below in the Web Interface and click Start to begin training. Once training finishes, you can use the trained model for Chat.

Model name: Qwen2.5-0.5B-Instruct

Hub name: huggingface

Finetuning method: LoRA

Dataset: identity, alpaca_en_demo

Compute type: fp16 (for CPU instance) / bf16 (for GPU instance)

Output dir: train_qwen_05 (any name of your choice)

/img/azure/llama-factory-vm/train-qwen-05-model.png

/img/azure/llama-factory-vm/start-tuning-02.png

Once training completes, you will see the success message below in the logs window.

/img/azure/llama-factory-vm/training-completed.png

  22. You can now use the Chat functionality with the newly trained checkpoint. To do so, on the same page of the LLaMa Factory Web UI, select the Checkpoint path as highlighted in the screenshot below. (It is the same name you entered as the Output dir during fine-tuning.)

/img/azure/llama-factory-vm/select-checkpoint-path.png

  23. Navigate to the Chat tab and click the Load Model button.

/img/azure/llama-factory-vm/load-model.png

  24. Once the model is loaded successfully, you can run your queries.

/img/azure/llama-factory-vm/chat-feature.png

  25. To access the LLaMa Factory CLI on this VM, connect via the SSH terminal and run the command below. It opens a shell inside the LLaMa Factory container.
sudo docker exec -it llamafactory /bin/bash

/img/azure/llama-factory-vm/access-llamafactory-cli.png

  26. If the above command fails, check the status of the container using:
sudo docker ps -a

/img/azure/llama-factory-vm/docker-ps.png

If the container is not running and is in the Exited state, restart it with:

sudo docker start llamafactory

/img/azure/llama-factory-vm/start-container.png
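If the container keeps exiting, its logs are usually the quickest way to find out why:

sudo docker logs --tail 50 llamafactory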

  27. Inside the container, you can run various llamafactory-cli commands, or use lmf as a shortcut for llamafactory-cli.

/img/azure/llama-factory-vm/lmf-help.png
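As a reference, the fine-tuning run configured in the Web UI at step 21 can also be launched from this CLI. The sketch below follows the key names used in the LLaMA Factory example configs, but exact keys can vary between versions, and the file name qwen_lora_sft.yaml is a placeholder; it uses fp16 in line with the CPU note above (use bf16: true on a GPU instance instead). Save the YAML inside the container, then run the train command:

model_name_or_path: Qwen/Qwen2.5-0.5B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: identity,alpaca_en_demo
template: qwen
output_dir: train_qwen_05
per_device_train_batch_size: 1
num_train_epochs: 3.0
fp16: true

llamafactory-cli train qwen_lora_sft.yaml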

For more details, please visit the Official Documentation page.
