The field of artificial intelligence (AI) has seen monumental advances in recent years, largely driven by the emergence of large language models (LLMs). Trained on vast datasets, LLMs can generate remarkably human-like text, work with images, perform calculations, and more; on some tasks they can even outperform humans. In essence, these LLMs are the brains of today's AI applications. However, the broad deployment of public LLMs has also raised valid concerns about data privacy, security, reliability, and cost.
As AI permeates critical domains like healthcare, finance and more, transmitting sensitive data to public cloud APIs can expose users to unprecedented risks. Dependency on external services also increases vulnerabilities to outages, while usage-based pricing limits widespread adoption. This underscores the need for AI solutions that run entirely on the user’s local device.
Several open-source initiatives have recently emerged to make LLMs accessible privately on local machines. One such initiative is LocalGPT – an open-source project enabling fully offline execution of LLMs on the user’s computer without relying on any external APIs or internet connectivity.
LocalGPT overcomes the key limitations of public cloud LLMs by keeping all processing self-contained on the local device. Users can leverage advanced NLP capabilities for information retrieval, summarization, translation, dialogue, and more without worrying about privacy, reliability, or cost. Documents never leave the device at any point in time.
In this comprehensive guide, we will walk through the step-by-step process of setting up LocalGPT on a Windows PC from scratch. We cover the essential prerequisites, installation of dependencies like Anaconda and Visual Studio, cloning the LocalGPT repository, ingesting sample documents, querying the LLM via the command line interface, and testing the end-to-end workflow on a local machine.
Follow this guide to harness the power of large language models locally on your Windows device for a private, high-performance LLM solution.
Introduction to LocalGPT
LocalGPT is an open-source project inspired by privateGPT that enables running large language models locally on a user's device for private use. The original privateGPT project proposed the idea of executing the entire LLM pipeline natively without relying on external APIs. However, it was limited to CPU execution, which constrained performance and throughput.
LocalGPT builds on this idea but makes key improvements by using more efficient models and adding support for hardware acceleration via GPUs and other co-processors. Instead of the GPT4All model used in privateGPT, LocalGPT adopts the smaller yet highly performant LLM Vicuna-7B. For generating semantic document embeddings, it uses InstructorEmbeddings rather than LlamaEmbeddings. Unlike privateGPT, which only leveraged the CPU, LocalGPT can take advantage of installed GPUs to significantly improve throughput and response latency when ingesting documents as well as when querying the model. The project readme highlights Blenderbot, Guanaco-7B, and WizardLM-7B as some of the compatible LLMs that can be run privately.
The default setup uses Vicuna-7B for text generation and InstructorEmbeddings for encoding document context vectors, which are indexed locally using Chroma. However, a key advantage is that these models can be readily swapped based on specific use cases and hardware constraints.
By keeping the entire pipeline limited to the local device while enabling acceleration using available hardware like GPUs, LocalGPT unlocks more efficient privatization of large language models for offline NLP tasks. Users get access to advanced natural language capabilities without compromising on privacy, reliability, or cost.
According to the moderators of LocalGPT, the project is still experimental. However, our belief is that it shows promising potential for building fully private AI applications across diverse domains like healthcare, finance, and more where data privacy and compliance are paramount.
What Benefits Does LocalGPT Offer over the privateGPT Project?
One of the biggest advantages LocalGPT has over the original privateGPT is support for diverse hardware platforms including multi-core CPUs, GPUs, IPUs, and TPUs.
By contrast, privateGPT was designed to only leverage the CPU for all its processing. This limited execution speed and throughput especially for larger models.
LocalGPT’s ability to offload compute-intensive operations like embedding generation and neural inference to available co-processors provides significant performance benefits:
- Faster response times – GPUs can process vector lookups and run neural net inferences much faster than CPUs. This reduces query latencies.
- Higher throughput – Multi-core CPUs and accelerators can ingest documents in parallel. This increases overall throughput.
- More efficient scaling – Larger models can be handled by adding more GPUs without hitting a CPU bottleneck.
- Lower costs – Accelerators are more cost-efficient for massively parallel workloads compared to high core-count CPUs.
- Flexibility – Different models and workflows can be mapped to suitable processors like IPUs for inference and TPUs for training.
- Portability – Can leverage hardware from all major vendors like Nvidia, Intel, AMD, etc.
So while privateGPT was limited to single-threaded CPU execution, LocalGPT unlocks more performance, flexibility, and scalability by taking advantage of modern heterogeneous computing. Even on laptops with integrated GPUs, LocalGPT can provide significantly snappier response times and support larger models not possible on privateGPT.
For users with access to desktop GPUs or enterprise accelerators, LocalGPT makes local privatization of LLMs much more practical across diverse settings – from individual users to large organizations dealing with confidential data.
By decoupling model execution from the underlying hardware, LocalGPT makes local LLM privatization faster, more affordable, and accessible to a much wider audience. This aligns well with its open-source ethos of AI privacy and security for all.
Prerequisites to Run LocalGPT on a Windows PC
To install and run LocalGPT on your Windows PC, there are some minimum system requirements that need to be met. Please make sure your system meets these requirements before you get started.
Operating System – You need Windows 10 or higher, 64-bit edition. Older Windows versions are not supported.
RAM – LocalGPT requires at least 16GB RAM, while 32GB is recommended for optimal performance, especially with larger models.
GPU – For leveraging GPU acceleration, an Nvidia GPU with a CUDA compute capability of 3.5 or higher is necessary. CUDA-enabled GPUs provide significant speedups versus just CPU.
Storage – 250GB of free disk space is required as LocalGPT databases can grow large depending on the documents ingested. SSD storage is preferred.
In addition to the hardware above, the following software is required:
- Anaconda or Miniconda for Python environment management. Python 3.10 or later is required.
- Visual Studio 2022 provides the necessary C++ build tools and compilers. Ensure the desktop development workload with C++ is selected during installation.
- Git is required for cloning the LocalGPT repository from GitHub.
- MinGW provides the gcc compiler needed to compile certain Python packages.
- Docker Desktop (optional) – Provides a containerized environment to simplify managing LocalGPT dependencies.
- Nvidia Container Toolkit to enable GPU support when running LocalGPT via Docker.
Additionally, an internet connection is required for the initial installation to download the required packages and models.
Ensuring these prerequisites are met before starting the LocalGPT installation will ensure a smooth setup process and avoid frustrating errors down the line. Pay particular attention to GPU driver versions, CUDA versions, and Visual Studio workloads during installation.
How to Setup LocalGPT on Your Windows PC?
Now you have enough background on LocalGPT. Let's go ahead and see how to set it up on your Windows PC.
Time needed: 1 hour
- Download the LocalGPT Source Code or Clone the Repository
Now we need to download the source code for LocalGPT itself. There are a couple of ways to do this:
Option 1 – Clone with Git
If you're familiar with Git, you can clone the LocalGPT repository directly from the command line:
1. Choose a local path to clone it to.
2. Change to that directory in the CLI and run this command: > git clone https://github.com/PromtEngineer/localGPT.git
This will download all the code to your chosen folder.
Option 2 – Download as ZIP
If you aren’t familiar with Git, you can download the source as a ZIP file:
1. Go to https://github.com/PromtEngineer/localGPT in your browser
2. Click on the green "<> Code" button and choose "Download ZIP"
3. Extract the ZIP somewhere on your computer.
Either cloning or downloading the ZIP will work!
We downloaded the source code, unzipped it into a 'LocalGPT' folder, and kept it at G:\LocalGPT on our PC.
- Import the LocalGPT into an IDE
The next step is to import the unzipped 'LocalGPT' folder into an IDE. We used the PyCharm IDE in this demo; you can use Visual Studio 2022, or it is even okay to work directly from the CLI.
If you want to set up the PyCharm on your Windows, follow this guide: https://thesecmaster.com/step-by-step-procedure-to-install-pycharm-on-windows/
To import LocalGPT as a project in PyCharm, click the 'Four Lines' button in the top left corner, click 'Open,' and browse to the LocalGPT folder.
- Install Anaconda
We will use Anaconda to set up and manage the Python environment for LocalGPT.
1. Download the latest Anaconda installer for Windows from https://www.anaconda.com/products/distribution
2. Choose Python 3.10 or higher during installation.
3. Complete the installation process and restart your terminal.
4. Open the Anaconda Prompt which will have the Conda environment activated by default.
To verify the installation is successful, fire up the 'Anaconda Prompt' and enter this command: conda --version.
Refer to Anaconda's online documentation for installation, setting up environment variables, and troubleshooting.
- Create and Activate LocalGPT Environment
It’s best practice to install LocalGPT in a dedicated Conda environment instead of the base env. This keeps the dependencies isolated.
Run the following commands in Anaconda Prompt:
conda create -n localgpt python=3.10
conda activate localgpt
- Change to the Anaconda Python Interpreter in PyCharm
Your PC could have multiple Python interpreters: the one bundled with PyCharm, the one installed with Anaconda, and perhaps another that came with an installation of Python from python.org. Make sure PyCharm uses the Anaconda Python interpreter. To do so, click the Settings gear icon in the top right corner of your project in PyCharm, go to 'Settings', and select Project > Python Interpreter. You should see all the interpreters listed in the drop-down; select the one that comes with Anaconda. If you don't see it, click 'Add Interpreter' and point it at the 'python.exe' location.
If you are not sure where 'python.exe' is located, open your 'Anaconda Prompt' and run this command: where python.
- Install Required Python Packages
Now we need to install the Python package requirements so LocalGPT can run properly. Run this command in the terminal to install all the packages listed in the 'requirements.txt' file:
pip install -r .\requirements.txt
This installs all of the required Python packages using pip. Depending on your internet speed, this may take a few minutes.
If you run into any errors during this step, you may need to install a C++ compiler. See the LocalGPT README on GitHub for help troubleshooting compiler issues on Windows.
- Install Packages Required to Run on GPU (Optional)
LocalGPT requires some essential packages to be installed if you want to run the LLM model on your GPU. This is an optional step for those who have an NVIDIA GPU card on their machine.
Run the following to install Conda packages:
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch-nightly -c nvidia
This installs PyTorch, the CUDA toolkit, and other Conda dependencies.
- Install MinGW Compiler
MinGW provides gcc, a C/C++ compiler needed to build certain Python packages.
1. Download the latest MinGW installer from https://sourceforge.net/projects/mingw/
2. Run the exe and select the mingw32-gcc-g++-bin package under Basic Setup.
3. Leave other options as default and complete the MinGW installation.
4. Finally, add MinGW to your PATH environment variable so it’s accessible from the command line.
- Ingest Documents
Now we're ready to ingest documents into the local vector database. This preprocesses your files so LocalGPT can search and query them.
In the PyCharm terminal, run:
python ingest.py
This will look for files in the source_documents folder, parse and encode the document contents into vector embeddings, and store them in an indexed local database.
You can add .pdf, .docx, .txt, and other files to the 'source_documents' folder. The initial process may take some time depending on how large your files are and how much computational power your PC has. If you run this on the CPU, the ingest process will take longer than on a GPU.
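The ingest step boils down to: read files, split them into chunks, turn each chunk into an embedding vector, and store the vectors in a local index. The following is a toy pure-Python sketch of that idea only, not LocalGPT's actual code (which uses InstructorEmbeddings and Chroma); the bag-of-words "embedding" and cosine ranking here are simplified stand-ins:

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words frequency vector (stand-in for InstructorEmbeddings)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def ingest(docs, chunk_size=50):
    """Split each document into fixed-size word chunks and index their embeddings
    (stand-in for the Chroma vector store)."""
    index = []
    for name, text in docs.items():
        words = text.split()
        for i in range(0, len(words), chunk_size):
            chunk = " ".join(words[i:i + chunk_size])
            index.append((name, chunk, embed(chunk)))
    return index

def query(index, question, top_k=1):
    """Return the chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[2]), reverse=True)
    return [(name, chunk) for name, chunk, _ in ranked[:top_k]]

docs = {"notes.txt": "Privilege escalation is gaining higher access rights. "
                     "Backups should be stored offline."}
idx = ingest(docs, chunk_size=8)
print(query(idx, "What is privilege escalation?")[0][0])  # notes.txt
```

Real embeddings capture semantics rather than literal word overlap, which is why LocalGPT can answer questions phrased differently from the source text.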
LocalGPT is designed to run ingest.py on the GPU as the default device type. However, if your PC doesn't have a CUDA-supported GPU, it runs on the CPU.
That said, LocalGPT provides an option to choose the device type explicitly, whether or not your device has a GPU. You can select the device type by adding the --device_type flag to the command.
python ingest.py --device_type cpu
python ingest.py --device_type cuda
python ingest.py --device_type ipu
To see the list of device types, run with the --help flag: python ingest.py --help
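The --device_type option above is an ordinary command-line flag. As a hedged sketch (not LocalGPT's actual source; the real scripts may accept more device types than the three shown above), parsing such a flag with argparse looks like this:

```python
import argparse

def parse_device(argv=None):
    """Parse a --device_type flag like the one ingest.py and run_localGPT.py accept."""
    parser = argparse.ArgumentParser(description="toy device-type parser")
    parser.add_argument(
        "--device_type",
        default="cuda",                 # GPU is the default device type
        choices=["cpu", "cuda", "ipu"],  # illustrative subset of supported devices
        help="hardware device to run on",
    )
    return parser.parse_args(argv).device_type

print(parse_device(["--device_type", "cpu"]))  # cpu
print(parse_device([]))                        # cuda (the default)
```

Passing an unknown value (say, --device_type coda) makes argparse exit with an error listing the valid choices, which is also how you would spot the cuda/coda typo.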
Once it finishes, your documents are ready to query!
- Query Your Documents
With documents ingested, we can ask LocalGPT questions relevant to them:
In the terminal, run:
python run_localGPT.py
It will prompt you to enter a question. Ask something relevant to the sample documents, like:
What is Privilege Escalation?
LocalGPT will provide an appropriate answer by searching through the ingested document contents.
You can keep entering new questions, or type exit to quit.
Note: LocalGPT provides the same option to choose the device type here, whether or not your device has a GPU. You can select the device type by adding the --device_type flag to the command.
python run_localGPT.py --device_type cpu
python run_localGPT.py --device_type cuda
python run_localGPT.py --device_type ipu
To see the list of device types, run with the --help flag: python run_localGPT.py --help
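run_localGPT.py keeps prompting for questions in a loop. As a toy sketch of that interaction pattern only (not the project's actual code; we assume here that a keyword such as "exit" ends the session):

```python
def qa_loop(ask, answer):
    """Minimal question-answer loop: keeps answering until the user types 'exit'."""
    transcript = []
    while True:
        question = ask()
        if question.strip().lower() == "exit":
            break
        transcript.append((question, answer(question)))
    return transcript

# Simulated session: one real question, then the exit keyword.
questions = iter(["What is Privilege Escalation?", "exit"])
log = qa_loop(lambda: next(questions), lambda q: f"(answer about: {q})")
print(len(log))  # 1
```

In the real script, ask() would be interactive input and answer() would retrieve relevant chunks from the vector store and feed them to the LLM.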
- Use a Different LLM
By default, LocalGPT uses the Vicuna-7B model. But you can replace it with any compatible HuggingFace model:
1. Open constants.py in an editor.
2. Update MODEL_BASENAME and the related model variables as per the instructions in the LocalGPT readme.
3. Comment out the other, now-redundant model variables.
4. Restart LocalGPT for the changes to take effect.
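For illustration only, the edit in step 2 tends to look like the fragment below. The variable names follow the convention used in the LocalGPT readme, but the specific values are placeholders we made up for this sketch; take the real ones from the readme:

```python
# Illustrative constants.py fragment -- values are placeholders, not recommendations.
MODEL_ID = "TheBloke/some-7B-model-HF"  # hypothetical HuggingFace repo id
MODEL_BASENAME = None                   # for quantized models, set this to the weights filename

# The previously active model's lines would be commented out here, e.g.:
# MODEL_ID = "TheBloke/another-7B-GPTQ"               # hypothetical
# MODEL_BASENAME = "model.no-act-order.safetensors"   # hypothetical
```

Only one MODEL_ID/MODEL_BASENAME pair should be active at a time, which is why the other model variables are commented out rather than deleted.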
And that's it! This is how you can set up LocalGPT on your Windows machine. You can ingest your own document collections, customize models, and build private AI apps leveraging its local LLM capabilities.
Note: If you use the CPU to run the LLM, you may need to wait a long time to see responses. We recommend running this on a GPU.
FYI: We tried this on one of our Windows PCs, which has an Intel i7 7700 processor, 32 GB RAM, and a 4 GB GTX 1050 GPU. We got an average response time of 60 to 90 seconds on the CPU. Unfortunately, we couldn't run this on the GPU due to version compatibility issues with PyTorch and the CUDA toolkit. We will keep trying and let you know once we succeed. If you are one of those who successfully ran this on your local GPU, please leave a comment.
Being able to leverage the power of large language models locally on your device provides tremendous opportunities to build intelligent applications privately. However, installing and configuring complex deep-learning software can seem daunting for many Windows users.
In this comprehensive, step-by-step guide, we simplified the process by detailing the exact prerequisites, dependencies, environment setup, installation steps, and configurations required to get LocalGPT up and running on a Windows PC.
By closely following the instructions outlined and checking the system requirements beforehand, you should be able to successfully install LocalGPT on your Windows 10 or 11 machine without major issues. We also covered how to ingest sample documents, query the model, and customize the underlying LLM as per your application needs.
While still experimental, LocalGPT enables you to unlock the myriad capabilities of large language models to create personalized AI solutions that keep your data completely secure and private. No documents or information is ever transmitted outside your computer.