Setup and installation of 'Chroma: Vector DB for AI Development' on Azure
This section describes how to launch and connect to the ‘Chroma: Vector DB for AI Development’ VM solution on the Azure platform.
- Open the Chroma: Vector DB for AI Development VM listing on Azure Marketplace.

- Click Get It Now.
- Log in with your credentials and provide the requested details. Once done, click the Get it now button at the bottom.

- This takes you to the Product details page. Click Create.

- Select a Resource group for your virtual machine.
- Select a Region where you want to launch the VM (such as East US).

- Note: If the error message “This image is not compatible with selected security type. To keep trusted launch virtual machines, select a compatible image. Otherwise change your security type back to Standard” appears below the Image name, as shown in the screenshot below, change the Security type to Standard.


- Optionally change the number of cores and amount of memory.
Minimum VM specs: 8 GB vRAM / 2 vCPU. For better performance, choose the 16 GB vRAM / 4 vCPU configuration.
Select Password as the Authentication type, enter ubuntu as the Username, and enter a Password of your choice.

- Optionally change the OS disk size and its type. By default the VM comes with a 40 GB disk.

- Optionally change the network and subnetwork names. Make sure that whichever network you specify has ports 22 (for SSH), 3389 (for RDP), 80 (for HTTP), and 443 (for HTTPS) exposed.
The VM comes with preconfigured NSG rules. You can check them by clicking the Create New option under the security group option.
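Once the VM is deployed and you know its public IP address, a short script run from your local machine can confirm that these ports are reachable. The sketch below is illustrative and not part of the VM image: it uses only the Python standard library, the function names are made up for this example, and 203.0.113.10 is a placeholder address rather than a real VM.

```python
import socket

# Ports this guide expects the network security group to expose.
EXPECTED_PORTS = {22: "SSH", 80: "HTTP", 443: "HTTPS", 3389: "RDP"}


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and DNS failures
        return False


def check_ports(host, ports=EXPECTED_PORTS, timeout=3.0):
    """Map each port number to True (reachable) or False (not reachable)."""
    return {port: is_port_open(host, port, timeout) for port in ports}


def format_report(host, results):
    """Build a human-readable summary of a check_ports() result."""
    lines = [f"Port check for {host}:"]
    for port, reachable in sorted(results.items()):
        label = EXPECTED_PORTS.get(port, "custom")
        status = "reachable" if reachable else "NOT reachable"
        lines.append(f"  {port:>5} ({label}): {status}")
    return "\n".join(lines)


# Example call (replace the placeholder with your VM's public IP address):
# print(format_report("203.0.113.10", check_ports("203.0.113.10")))
```

A port reported as not reachable means either that the NSG blocks it or that nothing on the VM is listening on it, so treat the result as a hint rather than a diagnosis.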


- Optionally go to the Management, Advanced, and Tags tabs for any advanced settings you want for the VM.
- Click Review + create, and then click Create when you are done.
The virtual machine will begin deploying.
- A summary page is displayed when the virtual machine has been created successfully. Click the Go to resource link to open the resource page, which shows an overview of the virtual machine.

- If you want to update your password, open the left navigation pane, select Run command, select RunShellScript, and enter the following command to change the password of the VM.
echo "ubuntu:yourpassword" | sudo chpasswd


Now that the password for the ubuntu user is set, you can SSH to the VM. To do so, first note the public IP address of the VM from the VM details page, as highlighted below.

Open PuTTY, paste the IP address, and click Open.

Log in as ubuntu and provide the password for the ‘ubuntu’ user.

- You can also connect to the VM’s desktop environment from a local Windows machine using RDP, or from a local Linux machine using Remmina.
- To connect using RDP from a Windows machine, first note the public IP address of the VM from the VM details page, as highlighted below.

- Then, on your local Windows machine, go to the “Start” menu, type “Remote Desktop Connection” in the search box, and select it.
In the “Remote Desktop Connection” wizard, paste the public IP address and click Connect.

- This connects you to the VM’s desktop environment. Provide the username (e.g. “ubuntu”) and the password you set earlier to authenticate. Click OK.

- You are now connected to the out-of-the-box “Chroma: Vector DB for AI Development” VM’s desktop environment from a Windows machine.

- To connect using RDP from a Linux machine, first note the external IP of the VM from the VM details page. Then, on your local Linux machine, go to the menu, type “Remmina” in the search box, and select it.
Note: If you don’t have Remmina installed on your Linux machine, first install it as appropriate for your Linux distribution.

- In the “Remmina Remote Desktop Client” wizard, select the RDP option from the dropdown, paste the external IP, and press Enter.

- This connects you to the VM’s desktop environment. Provide “ubuntu” as the user ID and the password set in the password reset step above to authenticate. Click OK.

- You are now connected to the out-of-the-box “Chroma: Vector DB for AI Development” VM’s desktop environment from a Linux machine.

- When the VM is deployed, ChromaDB starts in the background. Connect to Chroma at http://localhost:8000.
Example code to connect to the running ChromaDB server:
import chromadb
# Connect to the ChromaDB server running on this VM
client = chromadb.HttpClient(host="localhost", port=8000)
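To go one step beyond creating the client, a small round trip can confirm that the server accepts writes and queries. The following is a minimal sketch, not part of the VM image: the function name smoke_test and the scratch collection name are hypothetical, and explicit embeddings are passed so that no embedding model needs to be loaded.

```python
def smoke_test(client, name="connection_smoke_test"):
    """Write one record to a scratch collection, read it back, then clean up.

    `client` is any chromadb client, for example
    chromadb.HttpClient(host="localhost", port=8000).
    Returns True if the record written is the one returned by the query.
    """
    client.heartbeat()  # raises if the server cannot be reached

    # create_collection raises if the name is already taken, so an existing
    # collection is never modified or deleted by this check.
    collection = client.create_collection(name=name)
    try:
        collection.add(
            ids=["doc-1"],
            embeddings=[[0.1, 0.2, 0.3]],  # explicit vector, no model needed
            documents=["hello from the smoke test"],
        )
        result = collection.query(
            query_embeddings=[[0.1, 0.2, 0.3]],
            n_results=1,
        )
        # query() returns one list of ids per query embedding.
        return result["ids"][0] == ["doc-1"]
    finally:
        client.delete_collection(name=name)


# On the VM (requires the chromadb package and the running server):
# import chromadb
# print(smoke_test(chromadb.HttpClient(host="localhost", port=8000)))
```

The scratch collection is removed in the finally block, so the check leaves the server as it found it even if the query fails.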
- To access the JupyterHub web interface, copy the public IP address of the VM and open it in your local browser as https://public_ip_of_vm.
The browser will display an SSL certificate warning. Accept the certificate warning and continue.

- Log in as the ‘ubuntu’ user with the password set during VM creation. The ubuntu user is configured as an admin user here.

- If your Jupyter server does not spawn within 30 seconds, you will see an error message as shown in the screenshot below. In that case, click the Home tab and then click the Start My Server button to spawn the server again.


- You are now logged in to JupyterHub. Here you can see the setup folder, which contains the venv, jupyterhub_config.py, and other JupyterHub configuration files. You can use Jupyter notebooks to run and test your AI projects.

- The VM comes preloaded with the “Generative Benchmarking” app. The app project is available in the /home/ubuntu/setup/ directory. Once you are logged in to JupyterHub, navigate to the setup directory and click the “Generative Benchmarking” directory.

Benchmarking is used to evaluate how well a model performs. You can update the models, provide your own data, and run the benchmark. Instructions for modifying the code are given before the cells that need changes for your custom data.
The App directory comes with:
- generate_benchmark.ipynb: A comprehensive guide to generating a custom benchmark based on provided data
- compare.ipynb: A framework for comparing results, which is useful when evaluating different embedding models or configurations
- data/: Example data for immediately testing the notebooks
- functions/: Functions used to run the notebooks, including various embedding functions and LLM prompts
- results/: Folder for saving benchmark results, including results produced from the example data
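To give a sense of what comparing embedding models on a retrieval benchmark can look like, the sketch below computes recall@k for two models. It is an illustration only and does not claim to reproduce what the notebooks compute: the metric choice, the function names, and all document IDs and rankings are hypothetical.

```python
def recall_at_k(retrieved_ids, relevant_id, k):
    """Return 1.0 if relevant_id is among the first k retrieved ids, else 0.0."""
    return 1.0 if relevant_id in retrieved_ids[:k] else 0.0


def mean_recall_at_k(runs, k):
    """Average recall@k over a list of (retrieved_ids, relevant_id) pairs."""
    if not runs:
        raise ValueError("runs must contain at least one query")
    return sum(recall_at_k(ids, rel, k) for ids, rel in runs) / len(runs)


# Hypothetical rankings for four queries whose relevant documents are
# d1, d2, d3 and d4 respectively.
model_a = [
    (["d1", "d9", "d8"], "d1"),
    (["d7", "d2", "d6"], "d2"),
    (["d5", "d6", "d3"], "d3"),
    (["d9", "d8", "d7"], "d4"),
]
model_b = [
    (["d9", "d1", "d8"], "d1"),
    (["d2", "d7", "d6"], "d2"),
    (["d5", "d6", "d7"], "d3"),
    (["d4", "d8", "d9"], "d4"),
]

for k in (1, 3):
    print(f"recall@{k}: model_a={mean_recall_at_k(model_a, k):.2f} "
          f"model_b={mean_recall_at_k(model_b, k):.2f}")
```

With this made-up data the two models tie at k=3 but differ at k=1, which is why a comparison is usually reported at more than one cutoff.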
- Before running this sample app, you need to set the OPENAI API key in the environment file. To do so, open a terminal from this JupyterLab window and make sure you are in the /home/ubuntu/setup/generative_benchmarking directory.


- This directory contains a .env file. Open it in the vi editor (the keystrokes below are vi commands).
Press “i” to enter insert mode and paste your OPENAI API key and any other API keys. Save the changes by pressing the ESC key and then typing :wq followed by Enter.
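After saving the file, a quick check can confirm that the key was actually written, without printing its value. The sketch below assumes the file uses simple KEY=VALUE lines and that the variable is named OPENAI_API_KEY; both are assumptions, so adjust them to match what the .env file on your VM contains. The function names are hypothetical.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):  # tolerate shell-style "export KEY=..."
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional single or double quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(path, required=("OPENAI_API_KEY",)):
    """Return the required variable names that are absent or empty."""
    values = read_env_file(path)
    return [name for name in required if not values.get(name)]


# Example call, using the directory named in this guide:
# print(missing_keys("/home/ubuntu/setup/generative_benchmarking/.env"))
```

An empty list means every required variable has a non-empty value. The check only looks at the file, so a notebook kernel started before the edit may still need a restart to pick up the new values.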


- By default, the collection created by this sample code is held only in temporary storage (chromadb.Client() creates an ephemeral, in-memory client) and will not persist. To store it on a persistent volume, open generate_benchmark.ipynb, locate the “Set Clients” cell, and comment out the line:
chroma_client = chromadb.Client()
Then replace it with the snippet below, which initializes chroma_client with HttpClient so that it connects to the ChromaDB server running on localhost, port 8000. The server stores its data on local storage at /home/ubuntu/setup/chroma.
import chromadb
chroma_client = chromadb.HttpClient(host="localhost", port=8000)


- There are two notebook files in this directory: generate_benchmark.ipynb and compare.ipynb. You can run the cells one by one in sequence, or select the Run All Cells option from the Run menu at the top. Wait for the run to finish.

Note: If you get an API key not found error at any step after setting the API keys in the .env file and running the notebooks, restart the kernel as shown below and rerun all the cells from the beginning.

- This example inserts data into ChromaDB under the collection name “chroma-docs-openai-large”.

Output of compare.ipynb notebook

- ChromaDB provides an in-Terminal User Interface (iTUI) feature for browsing your data. If you have updated the ChromaDB client to use the persistent volume as explained above, you can browse the chroma-docs-openai-large collection with it. To do so, connect to the SSH terminal of this VM as explained earlier in this guide, then run the following command:
chroma browse --local chroma-docs-openai-large


Use the left, right, up, and down arrow keys to navigate. Press Enter to see the full record. Press the ESC key to exit the current window.
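If you would rather look at the same records from a notebook or script, the Python client can fetch a few of them directly. This is a minimal sketch under the assumption that the collection lives on the ChromaDB server at localhost:8000; the function name preview_collection and the 80-character cutoff are choices made for this example.

```python
def preview_collection(client, name, limit=5, width=80):
    """Return (id, shortened document text) pairs for the first few records.

    `client` is any chromadb client, for example
    chromadb.HttpClient(host="localhost", port=8000).
    """
    collection = client.get_collection(name=name)
    # get() without ids or filters returns records up to `limit`.
    batch = collection.get(limit=limit, include=["documents"])
    rows = []
    for record_id, document in zip(batch["ids"], batch["documents"]):
        text = (document or "").replace("\n", " ")  # documents may be None
        rows.append((record_id, text[:width]))
    return rows


# On the VM (requires the chromadb package and the running server):
# import chromadb
# client = chromadb.HttpClient(host="localhost", port=8000)
# for record_id, text in preview_collection(client, "chroma-docs-openai-large"):
#     print(record_id, "|", text)
```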
For more details, please visit the official documentation page.