Setup and installation of LLaMa Factory on AWS

This section describes how to launch and connect to the ‘Custom LLMs, Ready in Minutes with LLaMa Factory’ VM solution on AWS.

Note: The VM can be deployed on NVIDIA GPU instances or CPU instances. CPU instances take longer to fine-tune or train a model, so choose the instance type based on your requirements.

Open the ‘Custom LLMs, Ready in Minutes with LLaMa Factory’ VM listing on AWS Marketplace.

/img/aws/llama-factory/marketplace.png

Click on View purchase options.

Log in with your credentials and follow the instructions.
Review the prices and subscribe to the product by clicking the Subscribe button at the bottom of the page. Once you are subscribed to the offer, click the Launch your software button.

/img/aws/llama-factory/subscribe.png

/img/aws/llama-factory/launch-your-software.png

The next page shows two options for launching the instance: Launch through EC2 and One-click launch from AWS Marketplace. Select the second option, One-click launch from AWS Marketplace.

/img/aws/llama-factory/launch.png

Select a Region where you want to launch the VM (such as US East (N. Virginia)).
Optionally change the EC2 instance type. (This defaults to the t2.2xlarge instance type, with 8 vCPUs and 32 GB RAM.)

Minimum VM specs: 16 GB RAM / 4 vCPUs. For better performance, choose the recommended 32 GB RAM / 8 vCPU configuration.

The VM can also be deployed on an NVIDIA GPU instance. To deploy with a GPU configuration, choose an NVIDIA GPU instance type (e.g. g4dn.xlarge) or check the available NVIDIA GPU instances on the AWS documentation page.

/img/aws/llama-factory/region.png

/img/aws/llama-factory/gpu-instance.png

Optionally change the VPC and subnet settings.

/img/aws/llama-factory/VPC.png

Select the Security Group. Make sure the Security Group you specify has ports 22 (for SSH), 3389 (for RDP) and 443 (for HTTPS) open. Alternatively, create a new Security Group by clicking the “Create Security Group” button, then provide a name and description and save it for this instance.

/img/aws/milvus-vm/create-sg.png

/img/aws/llama-factory/SG.png
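
If you prefer to prepare the Security Group from a terminal, the three ports listed above can be opened with the AWS CLI. The following is a minimal sketch, not part of the Marketplace flow: the group name, VPC ID, group ID and CIDR range are placeholders, and the script only prints the commands so you can review them before running anything.

```shell
#!/usr/bin/env bash
# Sketch: print AWS CLI commands that open the ports this guide requires.
# Assumptions (placeholders, not from the guide): SG_NAME, VPC_ID, ALLOWED_CIDR.

SG_NAME="llama-factory-sg"        # hypothetical group name
VPC_ID="vpc-0123456789abcdef0"    # placeholder, replace with your VPC ID
ALLOWED_CIDR="203.0.113.0/24"     # documentation range, replace with your own

# Print one ingress rule per required port: 22 (SSH), 3389 (RDP), 443 (HTTPS).
print_sg_commands() {
  local group_id="$1"
  local port
  for port in 22 3389 443; do
    echo "aws ec2 authorize-security-group-ingress --group-id ${group_id} --protocol tcp --port ${port} --cidr ${ALLOWED_CIDR}"
  done
}

# Step 1: create the group and note the GroupId it returns.
echo "aws ec2 create-security-group --group-name ${SG_NAME} --description 'LLaMa Factory VM access' --vpc-id ${VPC_ID}"

# Step 2: print the ingress rules for that GroupId (placeholder shown here).
print_sg_commands "sg-0123456789abcdef0"
```

Restricting the CIDR to your own address range, rather than 0.0.0.0/0, keeps SSH and RDP from being reachable by the whole internet.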

Be sure to download the key pair, which is available by default, or create a new key pair and download it.
Click on Launch.
Custom LLMs, Ready in Minutes with LLaMa Factory will begin deploying.

/img/aws/llama-factory/key-pair.png

A summary page displays. To see this instance on the EC2 Console, click the View instance on EC2 link.

/img/aws/llama-factory/deployed.png

To connect to this instance through PuTTY, copy the IPv4 Public IP Address from the VM’s details page.

/img/aws/llama-factory/public-ip.png

Open PuTTY and paste the IP address. Go to SSH->Auth->Credentials, browse to the private key you downloaded while deploying the VM, then click Open. Enter ubuntu as the user ID.

/img/aws/desktop-linux/putty-01.png

/img/aws/nvidia-aiml/putty-02.png

/img/aws/llama-factory/ssh-login.png
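
On Linux or macOS, or on Windows with OpenSSH installed, the same connection can be made from a terminal instead of PuTTY. This is a minimal sketch, assuming the key pair was downloaded in .pem format; the IP address and key path are placeholders, and build_ssh_command is a helper name invented for this example.

```shell
#!/usr/bin/env bash
# Sketch: build the ssh command for the VM.
# "ubuntu" is the login user named in this guide; IP and key path are placeholders.

build_ssh_command() {
  local public_ip="$1"
  local key_file="$2"
  echo "ssh -i ${key_file} ubuntu@${public_ip}"
}

# OpenSSH rejects private keys that are readable by other users,
# so restrict the key file before connecting:
#   chmod 400 ./my-key.pem

# Print the command for a placeholder IP; copy it and run it yourself.
build_ssh_command "203.0.113.10" "./my-key.pem"
```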

Once connected, change the password for the ubuntu user using the command below:

sudo passwd ubuntu

/img/aws/llama-factory/update-passwd.png

Now that the password for the ubuntu user is set, you can connect to the VM’s desktop environment from any local Windows machine using RDP, or from a Linux machine using Remmina.

From your local Windows machine, go to the “Start” menu, type “Remote Desktop Connection” in the search box and select it. In the “Remote Desktop Connection” wizard, paste the public IP address and click Connect.

/img/aws/desktop-linux/rdp.png

This will connect you to the VM’s desktop environment. Provide the username “ubuntu” and the password you set with the sudo passwd command above to authenticate. Click OK.

/img/aws/desktop-linux/rdp-login.png

Now you are connected to the out-of-the-box Custom LLMs, Ready in Minutes with LLaMa Factory VM’s desktop environment from a Windows machine.

/img/azure/milvus-vm/rdp-desktop.png

To connect using RDP from a Linux machine, first note the public IP of the VM from the VM details page. Then, on your local Linux machine, open the menu, type “Remmina” in the search box and select it.

Note: If you don’t have Remmina installed on your Linux machine, first install Remmina as per your Linux distribution.

In the “Remmina Remote Desktop Client” wizard, select the RDP option from the dropdown, paste the public IP and press Enter.

/img/gcp/common/remmina-external-ip.png

This will connect you to the VM’s desktop environment. Provide “ubuntu” as the user ID and the password you set with the sudo passwd command above to authenticate. Click OK.

/img/gcp/common/remmina-rdp-login.png

Now you are connected to the out-of-the-box Custom LLMs, Ready in Minutes with LLaMa Factory VM’s desktop environment from a Linux machine.

/img/azure/milvus-vm/rdp-desktop.png

The VM generates a random password for logging in to the LLaMa Factory Web Interface. To get the password, connect via an SSH terminal as shown in the steps above and run the command below.

cat llama-factory-passwd.txt

The username is admin, and the password is the randomly generated one shown in the file.

/img/azure/llama-factory-vm/llama-factory-passwd.png

To access the LLaMa Factory Web Interface, copy the public IP address of the VM and paste it in your local browser as https://public_ip_of_vm. Make sure to use https and not http.

The browser will display an SSL certificate warning message. Accept the certificate warning and continue.

/img/azure/chromadb-vm/browser-warning.png
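
Before opening the browser, you can check from a terminal that the web interface answers on HTTPS. This is a minimal sketch, assuming the certificate warning comes from a self-signed certificate, which is why curl is called with -k to skip verification. The IP address is a placeholder, and web_ui_url and check_web_ui are helper names invented for this example.

```shell
#!/usr/bin/env bash
# Sketch: probe the LLaMa Factory web interface over HTTPS.
# Assumption: the VM serves a self-signed certificate, hence curl -k.

# Build the HTTPS URL from the VM's public IP (always https, never http).
web_ui_url() {
  echo "https://$1"
}

# Print the HTTP status code returned by the web interface.
# --max-time stops the probe after 10 seconds if port 443 is unreachable.
check_web_ui() {
  curl -k -s -o /dev/null -w '%{http_code}' --max-time 10 "$(web_ui_url "$1")"
}

# Example, to be run with your VM's public IP:
#   check_web_ui 203.0.113.10
web_ui_url "203.0.113.10"
```

If the probe times out, the most likely cause is that port 443 is not open in the Security Group attached to the instance.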

Provide the ‘admin’ user and the password retrieved from llama-factory-passwd.txt above.
Now you are logged in to the LLaMa Factory Web Interface. Here you can select different values and train, chat with, or evaluate models.

/img/azure/llama-factory-vm/llama-factory-homepage.png

Note: If you are using a CPU instance type, make sure to change the default value of Compute Type from bf16 to fp16 or fp32. If training starts with the default value bf16 on a CPU instance, it will show the error message “Your setup doesn’t support bf16/gpu”. Also, CPU instances take much longer to finish training than GPU instances.

/img/azure/llama-factory-vm/change-compute-type.png

/img/azure/llama-factory-vm/bf16-error.png
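
If you are unsure which kind of instance you launched, a quick check from the SSH terminal can suggest a Compute Type. This is a minimal sketch, assuming that nvidia-smi is installed on GPU deployments and absent or failing on CPU ones. It only detects whether a GPU is present; some older NVIDIA GPUs may not support bf16, in which case fp16 is the safer choice. GPU_PROBE and pick_compute_type are names invented for this example.

```shell
#!/usr/bin/env bash
# Sketch: suggest a Compute Type based on whether an NVIDIA GPU is visible.
# Assumption: a working nvidia-smi means a GPU instance; otherwise treat it as CPU.

# Command used to detect the GPU; can be overridden for testing.
GPU_PROBE="${GPU_PROBE:-nvidia-smi}"

pick_compute_type() {
  # The probe must exist AND run successfully for the GPU branch.
  if command -v "$GPU_PROBE" >/dev/null 2>&1 && "$GPU_PROBE" >/dev/null 2>&1; then
    echo "bf16"
  else
    echo "fp16"
  fi
}

echo "Suggested compute type: $(pick_compute_type)"
```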

To begin with, you can set the values below in the Web Interface and click Start to start training. Once training finishes, you can use the trained model for chat.

Model name: Qwen2.5-0.5B-Instruct

Hub name: huggingface

Finetuning method: LoRA

Dataset: identity, alpaca_en_demo

Compute type: fp16 (for CPU instance) / bf16 (for GPU instance)

Output dir: train_qwen_05 (any name of your choice)

/img/azure/llama-factory-vm/train-qwen-05-model.png

/img/azure/llama-factory-vm/start-tuning-02.png

Once training has completed, you will see the success message below in the logs window.

/img/azure/llama-factory-vm/training-completed.png

Now you can access the Chat functionality with the newly trained checkpoint. To do so, on the same page of the LLaMa Factory Web UI, select the Checkpoint path as highlighted in the screenshot below. (It will be the same as the Output dir specified during fine-tuning.)

/img/azure/llama-factory-vm/select-checkpoint-path.png

Navigate to the Chat tab and click the Load Model button.

/img/azure/llama-factory-vm/load-model.png

Once the model is loaded successfully, you can run your queries.

/img/azure/llama-factory-vm/chat-feature.png

To access the LLaMa Factory CLI on this VM, connect via an SSH terminal and run the command below. This command logs you in to the LLaMa Factory container.

sudo docker exec -it llamafactory /bin/bash

/img/azure/llama-factory-vm/access-llamafactory-cli.png

If the above command fails, check the status of the container using:

sudo docker ps -a

/img/azure/llama-factory-vm/docker-ps.png

If the container is not running and is in the Exited state, restart it with:

sudo docker start llamafactory

/img/azure/llama-factory-vm/start-container.png
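
The check-and-restart steps above can be combined into one small helper. This is a minimal sketch, assuming the container is named llamafactory as in the commands above. The DOCKER variable and both function names are invented for this example; DOCKER exists only so the docker command can be swapped out when testing.

```shell
#!/usr/bin/env bash
# Sketch: start the llamafactory container only if it is not already running.

DOCKER="${DOCKER:-sudo docker}"   # command prefix, overridable for testing
CONTAINER="llamafactory"          # container name used in this guide

# Print the container state (e.g. running, exited); empty if it does not exist.
container_status() {
  $DOCKER inspect -f '{{.State.Status}}' "$CONTAINER" 2>/dev/null
}

ensure_running() {
  local status
  status="$(container_status)"
  case "$status" in
    running) echo "already running" ;;
    "")      echo "container not found"; return 1 ;;
    *)       $DOCKER start "$CONTAINER" >/dev/null && echo "started (was: $status)" ;;
  esac
}

# On the VM, call the helper and then open a shell in the container:
#   ensure_running && sudo docker exec -it llamafactory /bin/bash
```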

Inside the container you can run various llamafactory-cli commands, or use lmf as a shortcut for llamafactory-cli.

/img/azure/llama-factory-vm/lmf-help.png
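
As an illustration, the Web Interface training run described earlier can be approximated from the CLI by writing a YAML config and passing it to llamafactory-cli train. This is a minimal sketch under stated assumptions: the keys follow the upstream LLaMA-Factory example configs and should be checked against the version installed in the container, the Hugging Face model ID and the saves/ output path are assumptions, and write_train_config is a helper name invented for this example.

```shell
#!/usr/bin/env bash
# Sketch: write a LoRA fine-tuning config mirroring the Web UI values in this guide.
# Assumptions: key names follow upstream example configs; paths are placeholders.

write_train_config() {
  local out_file="$1"
  local precision="$2"   # fp16 for CPU instances, bf16 for GPU instances
  cat > "$out_file" <<EOF
model_name_or_path: Qwen/Qwen2.5-0.5B-Instruct
stage: sft
do_train: true
finetuning_type: lora
dataset: identity,alpaca_en_demo
template: qwen
output_dir: saves/train_qwen_05
${precision}: true
EOF
}

# Write the config to a temporary location and show it.
CONFIG_FILE="${TMPDIR:-/tmp}/qwen_lora_sft.yaml"
write_train_config "$CONFIG_FILE" "fp16"
cat "$CONFIG_FILE"

# Inside the container, training would then be started with:
#   llamafactory-cli train "$CONFIG_FILE"
```

Keeping the run in a config file makes it easy to repeat the same training later or to switch only the precision line between CPU and GPU instances.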

For more details, please visit the Official Documentation page.