Pakistan's First Oracle Blog


RAG Pipeline Tutorial Using Ollama, Triplex, and LangChain On Custom Data Locally

Sun, 2024-07-21 15:10

This video is a step-by-step guide to building an end-to-end RAG pipeline locally on your own custom data, using the Triplex and Mistral models via Ollama, LangChain, and a Gradio GUI.



Code:


conda create -n ragpipe python=3.11 -y && conda activate ragpipe

pip install torch sentence_transformers transformers accelerate
pip install langchain==0.1.14
pip install langchain-experimental==0.0.56
pip install langchain-community==0.0.31
pip install faiss-cpu==1.8.0
pip install pdfplumber==0.11.0
pip install gradio==4.25.0
pip install ollama
pip install pypdf
conda install jupyter -y
pip uninstall charset_normalizer -y
pip install charset_normalizer
jupyter notebook

from langchain_community.document_loaders import PDFPlumberLoader
from langchain_experimental.text_splitter import SemanticChunker
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import Ollama
from langchain.prompts import PromptTemplate
from langchain.chains.llm import LLMChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import TextLoader
from pypdf import PdfReader
import ollama
import gradio as gr
import os
import json

def triplextract(text, entity_types, predicates):
    input_format = """
        **Entity Types:**
        {entity_types}

        **Predicates:**
        {predicates}

        **Text:**
        {text}
        """

    message = input_format.format(
                entity_types = json.dumps({"entity_types": entity_types}),
                predicates = json.dumps({"predicates": predicates}),
                text = text)

    # Pass the message as a single string
    prompt = message
    output = ollama.generate(model='triplex', prompt=prompt)
    return output

entity_types = ["PERSON", "LOCATION"]
predicates = ["PROFESSION", "BASED_IN"]
   
reader = PdfReader("/home/Ubuntu/myfiles/mypdf.pdf")
text = ""
for page in reader.pages:
    text += page.extract_text() + "\n"
   

prediction = triplextract(text, entity_types, predicates)

# Strip the markdown JSON code fences from the model output before parsing it
response_string = prediction['response'].strip('```json\n').strip()
response_string = response_string.lstrip('\n')
response_string = response_string.strip('```')
response_string = response_string.replace('```', '').replace('json', '')
response_json = json.loads(response_string)
entities_and_triples = response_json['entities_and_triples']
print(entities_and_triples)

with open('output.txt', 'w') as f:
    f.write(text)
   
loader = TextLoader("./output.txt")
docs = loader.load()

       
# Split into chunks
text_splitter = SemanticChunker(HuggingFaceEmbeddings())
documents = text_splitter.split_documents(docs)

# Instantiate the embedding model
embedder = HuggingFaceEmbeddings()

# Create the vector store and fill it with embeddings
vector = FAISS.from_documents(documents, embedder)
retriever = vector.as_retriever(search_type="similarity", search_kwargs={"k": 3})

# Define llm
llm = Ollama(model="mistral")

# Define the prompt
prompt = """
1. Use the following pieces of context to answer the question at the end.
2. If you don't know the answer, just say that "I don't know" but don't make up an answer on your own.\n
3. Keep the answer crisp and limited to 3,4 sentences.

Context: {context}

Question: {question}

Helpful Answer:"""

QA_CHAIN_PROMPT = PromptTemplate.from_template(prompt)

llm_chain = LLMChain(
                  llm=llm,
                  prompt=QA_CHAIN_PROMPT,
                  callbacks=None,
                  verbose=True)

document_prompt = PromptTemplate(
    input_variables=["page_content", "source"],
    template="Context:\ncontent:{page_content}\nsource:{source}",
)

combine_documents_chain = StuffDocumentsChain(
                  llm_chain=llm_chain,
                  document_variable_name="context",
                  document_prompt=document_prompt,
                  callbacks=None)
             
qa = RetrievalQA(
                  combine_documents_chain=combine_documents_chain,
                  verbose=True,
                  retriever=retriever,
                  return_source_documents=True)
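
Before wiring the chain into Gradio, it helps to sanity-check it directly; a minimal sketch (the question string is illustrative):

# Quick manual test of the RetrievalQA chain before launching the UI
result = qa("Who is Fahd Mirza?")
print(result["result"])
for doc in result["source_documents"]:
    print(doc.metadata.get("source"))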

def respond(question, history):
    return qa(question)["result"]


gr.ChatInterface(
    respond,
    chatbot=gr.Chatbot(height=500),
    textbox=gr.Textbox(placeholder="Ask me a question related to Fahd Mirza", container=False, scale=7),
    title="Fahd's Chatbot",
    examples=["Where Fahd Lives", "Who is Fahd"],
    cache_examples=True,
    retry_btn=None,
).launch(share=True)
Categories: DBA Blogs

GraphRAG Replacement - SciPhi Triplex - Step by Step Local Installation

Sat, 2024-07-20 20:45

This video installs Triplex, a fine-tuned version of Phi-3 (3.8B) developed by SciPhi.AI for creating knowledge graphs from unstructured data. It works by extracting triplets.



Code:

conda create -n triplex python=3.11 -y && conda activate triplex

pip install torch transformers accelerate

import json
from transformers import AutoModelForCausalLM, AutoTokenizer

def triplextract(model, tokenizer, text, entity_types, predicates):

    input_format = """
        **Entity Types:**
        {entity_types}

        **Predicates:**
        {predicates}

        **Text:**
        {text}
        """

    message = input_format.format(
                entity_types = json.dumps({"entity_types": entity_types}),
                predicates = json.dumps({"predicates": predicates}),
                text = text)

    messages = [{'role': 'user', 'content': message}]
    input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt = True, return_tensors="pt").to("cuda")
    output = tokenizer.decode(model.generate(input_ids=input_ids, max_length=2048)[0], skip_special_tokens=True)
    return output

model = AutoModelForCausalLM.from_pretrained("sciphi/triplex", trust_remote_code=True).to('cuda').eval()
tokenizer = AutoTokenizer.from_pretrained("sciphi/triplex", trust_remote_code=True)

entity_types = [ "LOCATION", "POSITION", "DATE", "CITY", "COUNTRY", "NUMBER" ]
predicates = [ "POPULATION", "AREA" ]
text = """
San Francisco,[24] officially the City and County of San Francisco, is a commercial, financial, and cultural center in Northern California.

With a population of 808,437 residents as of 2022, San Francisco is the fourth most populous city in the U.S. state of California behind Los Angeles, San Diego, and San Jose.
"""

prediction = triplextract(model, tokenizer, text, entity_types, predicates)
print(prediction)


entity_types = ["CASE", "LAWYER", "DATE"]
predicates = ["VERDICT", "CHARGES"]
text = """
In the landmark case of Roe v. Wade (1973), lawyer Sarah Weddington successfully argued before the US Supreme Court, leading to a verdict that protected women's reproductive rights.
"""
prediction = triplextract(model, tokenizer, text, entity_types, predicates)
print(prediction)
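
To process several passages, the same helper can be looped over; a minimal sketch (the schema and texts below are illustrative):

# Batch extraction: reuse the helper over several passages with one schema
entity_types = ["CITY", "COUNTRY", "NUMBER"]
predicates = ["POPULATION"]
texts = [
    "Paris is the capital of France, with a population of over 2 million residents in the city proper.",
    "Tokyo, the capital of Japan, has a population of roughly 14 million.",
]
for passage in texts:
    print(triplextract(model, tokenizer, passage, entity_types, predicates))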
Categories: DBA Blogs

Create Space Shooter Game with Local Free AI Model Mistral Nemo

Sat, 2024-07-20 02:53

This video shows how to create a free, local space shooter game from just a text prompt using the Mistral NeMo LLM.


Full Prompt:

"""
Create a Python program that generates a customizable space shooter game. The game should have the following features:

A player ship that can be customized with a choice of 3 colors and 2 weapon types
3 types of enemies with unique behaviors and difficulty levels
2 power-ups that can be collected by the player
5 levels with increasing difficulty
A boss fight at the end of level 5

The program should use the Pygame library and generate the game assets (sprites, backgrounds, etc.) using AI algorithms. The game should be playable from the terminal.

Customization Options:

Player ship color (red, blue, or green)
Player ship weapon type (laser or missiles)
AI-Generated Assets:
Player ship sprite
Enemy sprites (3 types)
Power-up icons (2 types)
Backgrounds (5 levels)

Gameplay Logic:
Player movement and shooting
Enemy AI behaviors (patrolling, chasing, etc.)
Collision detection and response
Scoring and level progression
Test the AI model's ability to generate a functional and engaging space shooter game with customization options.

"""

Full Game Code:


import pygame
import random

# Initialize Pygame
pygame.init()

# Set up some constants
WIDTH, HEIGHT = 800, 600
FPS = 60
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
RED = (255, 0, 0)
LIVES = 3

# Set up the display
screen = pygame.display.set_mode((WIDTH, HEIGHT))
pygame.display.set_caption("Space Shooter")
clock = pygame.time.Clock()

# Player class
class Player(pygame.sprite.Sprite):
    def __init__(self):
        super().__init__()
        self.image = pygame.Surface((50, 50))
        self.image.fill(WHITE)
        self.rect = self.image.get_rect()
        self.rect.centerx = WIDTH // 2
        self.rect.bottom = HEIGHT - 10
        self.speed_x = 0

    def update(self):
        self.speed_x = 0
        keys = pygame.key.get_pressed()
        if keys[pygame.K_LEFT]:
            print("Left key pressed")
            self.speed_x = -5
        if keys[pygame.K_RIGHT]:
            print("Right key pressed")
            self.speed_x = 5
        self.rect.x += self.speed_x
        if self.rect.left < 0:
            self.rect.left = 0
        if self.rect.right > WIDTH:
            self.rect.right = WIDTH

    def shoot(self):
        bullet = Bullet(self.rect.centerx, self.rect.top)
        all_sprites.add(bullet)
        bullets.add(bullet)

# Enemy class
class Enemy(pygame.sprite.Sprite):
    def __init__(self):
        super().__init__()
        self.image = pygame.Surface((30, 30))
        self.image.fill(RED)
        self.rect = self.image.get_rect()
        self.rect.x = random.randrange(WIDTH - self.rect.width)
        self.rect.y = random.randrange(-100, -40)
        self.speed_y = random.randrange(1, 8)

    def update(self):
        self.rect.y += self.speed_y
        if self.rect.top > HEIGHT + 10:
            self.rect.x = random.randrange(WIDTH - self.rect.width)
            self.rect.y = random.randrange(-100, -40)
            self.speed_y = random.randrange(1, 8)

# Bullet class
class Bullet(pygame.sprite.Sprite):
    def __init__(self, x, y):
        super().__init__()
        self.image = pygame.Surface((10, 20))
        self.image.fill(WHITE)
        self.rect = self.image.get_rect()
        self.rect.centerx = x
        self.rect.top = y
        self.speed_y = -10

    def update(self):
        self.rect.y += self.speed_y
        if self.rect.bottom < 0:
            self.kill()

# Create sprite groups
all_sprites = pygame.sprite.Group()
enemies = pygame.sprite.Group()
bullets = pygame.sprite.Group()

# Create player instance
player = Player()
all_sprites.add(player)

# Create enemy instances
for i in range(10):
    enemy = Enemy()
    all_sprites.add(enemy)
    enemies.add(enemy)

# Main game loop
running = True
lives = LIVES
while running:
    clock.tick(FPS)

    # Process input (events)
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
        elif event.type == pygame.KEYDOWN:
            if event.key == pygame.K_SPACE:
                print("Spacebar pressed")
                player.shoot()

    # Update
    all_sprites.update()

    # Check for bullet-enemy collisions
    hits = pygame.sprite.groupcollide(enemies, bullets, True, True)
    for hit in hits:
        enemy = Enemy()
        all_sprites.add(enemy)
        enemies.add(enemy)

    # Check if player collides with an enemy
    if pygame.sprite.spritecollideany(player, enemies):
        lives -= 1
        print(f"Lives left: {lives}")
        if lives == 0:
            running = False
        else:
            player.rect.centerx = WIDTH // 2
            player.rect.bottom = HEIGHT - 10

    # Draw
    screen.fill(BLACK)
    all_sprites.draw(screen)

    # Flip the display
    pygame.display.flip()

pygame.quit()
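
The generated code tracks lives but never draws them on screen; a small, assumed addition using Pygame's default font:

# Create once, before the main loop:
font = pygame.font.Font(None, 36)  # Pygame's built-in default font

# Then, inside the main loop, after all_sprites.draw(screen):
lives_text = font.render(f"Lives: {lives}", True, WHITE)
screen.blit(lives_text, (10, 10))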
Categories: DBA Blogs

Use GPT-4o Mini Locally with Text and Images

Thu, 2024-07-18 16:54

This video introduces GPT-4o mini by OpenAI, which is cost-efficient and performant, and shows how to use it with both text and images.


Code:

from openai import OpenAI
import base64
import requests
import os

## Set the API key and model name
MODEL="gpt-4o-mini"
client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

IMAGE_PATH="nm.png"
base64_image = encode_image(IMAGE_PATH)

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant that responds in Markdown. Help me with this image!"},
        {"role": "user", "content": [
            {"type": "text", "text": "Describe the image? how many girls are there?"},
            {"type": "image_url", "image_url": {
                "url": f"data:image/png;base64,{base64_image}"}
            }
        ]}
    ],
    temperature=0.0,
)

print(response.choices[0].message.content)

-

#pip install -U openai
#export OPENAI_API_KEY=""

from openai import OpenAI
import os

## Set the API key and model name
MODEL="gpt-4o-mini"
client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))

completion = client.chat.completions.create(
  model=MODEL,
  messages=[
    {"role": "system", "content": "You are a helpful assistant. Help me with my question!"},
    {"role": "user", "content": "A bat and a ball together cost $1.10. The bat costs $1.00 more than the ball. How much does the ball cost?"}  
  ]
)

print("Assistant: " + completion.choices[0].message.content)
Categories: DBA Blogs

Install Mistral Nemo Locally and Test for Multi-Lingual, Function Calling

Thu, 2024-07-18 16:51

This video installs Mistral NeMo locally and tests it on multi-lingual, math, coding, and function-calling tasks.


Code:

conda create -n nemo python=3.11 -y && conda activate nemo

pip install torch
pip install git+https://github.com/huggingface/transformers.git
pip install mistral_inference
pip install huggingface_hub pathlib

from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', 'Nemo-Instruct')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Mistral-Nemo-Instruct-2407", allow_patterns=["params.json", "consolidated.safetensors", "tekken.json"], local_dir=mistral_models_path)

conda install jupyter -y
pip uninstall charset_normalizer -y
pip install charset_normalizer
jupyter notebook


from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate

from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

tokenizer = MistralTokenizer.from_file(f"{mistral_models_path}/tekken.json")
model = Transformer.from_folder(mistral_models_path)

prompt = "Write 10 sentences ending with the word beauty."

completion_request = ChatCompletionRequest(messages=[UserMessage(content=prompt)])

tokens = tokenizer.encode_chat_completion(completion_request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.35, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.decode(out_tokens[0])

print(result)


#===============================
# Function Calling
#===============================

from mistral_common.protocol.instruct.tool_calls import Function, Tool
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate

from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

completion_request = ChatCompletionRequest(
    tools=[
        Tool(
            function=Function(
                name="get_current_weather",
                description="Get the current weather",
                parameters={
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "format": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "The temperature unit to use. Infer this from the users location.",
                        },
                    },
                    "required": ["location", "format"],
                },
            )
        )
    ],
    messages=[
        UserMessage(content="What's the weather like today in Paris?"),
        ],
)

tokens = tokenizer.encode_chat_completion(completion_request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.35, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.decode(out_tokens[0])

print(result)
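
The multi-lingual test in the video follows the same pattern; swapping in a non-English prompt is enough (the French prompt below is illustrative):

# Multi-lingual check: same tokenizer and model, different prompt
prompt = "Écris un court poème sur la mer."  # "Write a short poem about the sea."
completion_request = ChatCompletionRequest(messages=[UserMessage(content=prompt)])
tokens = tokenizer.encode_chat_completion(completion_request).tokens
out_tokens, _ = generate([tokens], model, max_tokens=128, temperature=0.35, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
print(tokenizer.decode(out_tokens[0]))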
Categories: DBA Blogs

Install Codestral Mamba Locally - Best Math AI Model

Tue, 2024-07-16 17:07

This video installs Codestral Mamba locally, an open code model based on the Mamba2 architecture.



Code: 

conda create -n codestralmamba python=3.11 -y && conda activate codestralmamba

pip install torch huggingface_hub pathlib2

pip install "mistral_inference>=1" mamba-ssm causal-conv1d

from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', 'mamba-codestral-7B-v0.1')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/mamba-codestral-7B-v0.1", allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"], local_dir=mistral_models_path)

mistral-chat $HOME/mistral_models/mamba-codestral-7B-v0.1 --instruct  --max_tokens 256
Categories: DBA Blogs

Easy Tutorial to Fine-Tune Vision Model on Image Data Locally

Sun, 2024-07-14 19:56

This video is a step-by-step tutorial to easily fine-tune a vision model on your own custom image dataset locally.



Code:

conda create -n ft python=3.11 -y && conda activate ft
pip install -U transformers datasets trl peft accelerate Pillow torch

from datasets import features, load_dataset
from transformers import AutoModelForVision2Seq, AutoProcessor
import torch
from trl import DPOConfig, DPOTrainer
from peft import LoraConfig
import os

ds_id = "openbmb/RLAIF-V-Dataset"
dataset = load_dataset(ds_id, split="train")
dataset = dataset.shuffle(seed=42).select(range(100))

model_id = "HuggingFaceM4/idefics2-8b"

model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16)
processor = AutoProcessor.from_pretrained(model_id, do_image_splitting=False)

def format_ds(example):
    # Prepare the input for the chat template
    prompt = [{"role": "user", "content": [{"type": "image"}, {"type": "text", "text": example["question"]}]}]
    chosen = [{"role": "assistant", "content": [{"type": "text", "text": example["chosen"]}]}]
    rejected = [{"role": "assistant", "content": [{"type": "text", "text": example["rejected"]}]}]
    # Apply the chat template
    prompt = processor.apply_chat_template(prompt, tokenize=False)
    chosen = processor.apply_chat_template(chosen, tokenize=False)
    rejected = processor.apply_chat_template(rejected, tokenize=False)
    # Resize the image to ensure it fits within the maximum allowable
    # size of the processor to prevent OOM errors.
    max_size = processor.image_processor.size["longest_edge"] // 2
    example["image"].thumbnail((max_size, max_size))
    return {"images": [example["image"]], "prompt": prompt, "chosen": chosen, "rejected": rejected}

dataset = dataset.map(format_ds, remove_columns=dataset.column_names, num_proc=os.cpu_count())

f = dataset.features
f["images"] = features.Sequence(features.Image(decode=True))
dataset = dataset.cast(f)

training_args = DPOConfig(
    output_dir="my-idefics2",
    bf16=True,
    gradient_checkpointing=True,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=32,
    num_train_epochs=1,
    push_to_hub=False,
    dataset_num_proc=os.cpu_count(),
    dataloader_num_workers=os.cpu_count(),
    logging_steps=10,
    )

trainer = DPOTrainer(
    model,
    ref_model=None,  # not needed when using peft
    args=training_args,
    train_dataset=dataset,
    tokenizer=processor,
    peft_config=LoraConfig(target_modules="all-linear"),
)

trainer.train()
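
After training finishes, the LoRA adapter and processor can be written to disk for later inference; a minimal sketch reusing the output_dir from the config above:

# Persist the LoRA adapter weights and the processor configuration
trainer.save_model(training_args.output_dir)
processor.save_pretrained(training_args.output_dir)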
Categories: DBA Blogs

Install aTrain Locally - Offline Transcription of Speech Recordings Free with AI

Sun, 2024-07-14 03:23

This video is a step-by-step tutorial to install aTrain, a tool for automatically transcribing speech recordings with state-of-the-art machine learning models, without uploading any data.




Code:

conda create -n atrain python=3.11 -y && conda activate atrain
sudo apt install ffmpeg
pip install aTrain@git+https://github.com/BANDAS-Center/aTrain.git --extra-index-url https://download.pytorch.org/whl/cu118
sudo apt install nvidia-cudnn
aTrain init
aTrain start
Categories: DBA Blogs

llama-cpp-agent Installation to Use AI Models Locally in Simple Way

Sat, 2024-07-13 20:46

This video is a step-by-step tutorial to install llama-cpp-agent, a tool designed to simplify interactions with LLMs. It provides an interface for chatting with LLMs, executing function calls, and generating structured output.



Code:



conda create -n llamagent python=3.11 -y && conda activate llamagent

pip install --no-cache-dir llama-cpp-python==0.2.77 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124

pip install llama-cpp-agent

conda install jupyter -y
pip uninstall charset_normalizer -y
pip install charset_normalizer
jupyter notebook

# Import the Llama class of llama-cpp-python and the LlamaCppPythonProvider of llama-cpp-agent
from llama_cpp import Llama
from llama_cpp_agent.providers import LlamaCppPythonProvider

# Create an instance of the Llama class and load the model
llama_model = Llama(r"/home/Ubuntu/mymodels/mistral-7b-instruct-v0.2.Q6_K.gguf", n_batch=1024, n_threads=10, n_gpu_layers=40)

# Create the provider by passing the Llama class instance to the LlamaCppPythonProvider class
provider = LlamaCppPythonProvider(llama_model)


from llama_cpp_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType

agent = LlamaCppAgent(provider, system_prompt="You are a helpful assistant.", predefined_messages_formatter_type=MessagesFormatterType.MISTRAL)

agent_output = agent.get_chat_response("Hello, World!")

print(f"Agent: {agent_output.strip()}")


import math
from llama_cpp_agent import FunctionCallingAgent
from llama_cpp_agent.llm_output_settings import LlmStructuredOutputSettings
from typing import Union

# Callback for receiving messages for the user.
def send_message_to_user_callback(message: str):
    print(message)
   
def calculate_a_to_the_power_b(a: Union[int, float], b: Union[int, float]):
    """
    Calculates a to the power of b

    Args:
        a: number
        b: exponent

    """
    return f"Result: {math.pow(a, b)}"

output_settings = LlmStructuredOutputSettings.from_functions([calculate_a_to_the_power_b], allow_parallel_function_calling=True)

llama_cpp_agent = LlamaCppAgent(
    provider,
    debug_output=True,
    system_prompt=f"You are an advanced AI, tasked to assist the user by calling functions in JSON format.",
    predefined_messages_formatter_type=MessagesFormatterType.CHATML,
)

user_input = "Calculate a to the power of b: a = 2, b = 3"

print(
    llama_cpp_agent.get_chat_response(
        user_input, structured_output_settings=output_settings
    )
)

Categories: DBA Blogs

AuraFlow - Best and Free Text to Image Model - Install Locally

Fri, 2024-07-12 23:32

This video installs AuraFlow v0.1 locally, the largest fully open-sourced flow-based text-to-image generation model. It generates hyper-realistic images from text prompts.



Code:

conda create -n auraflow python=3.11 -y && conda activate auraflow

pip install transformers accelerate protobuf sentencepiece
pip install torch torchvision

pip install git+https://github.com/huggingface/diffusers.git@bbd2f9d4e9ae70b04fedf65903fd1fb035437db4

conda install jupyter -y
pip uninstall charset_normalizer -y
pip install charset_normalizer
jupyter notebook


from diffusers import AuraFlowPipeline
import torch

pipeline = AuraFlowPipeline.from_pretrained(
    "fal/AuraFlow",
    torch_dtype=torch.float16
).to("cuda")

image = pipeline(
    prompt="close-up portrait of a majestic iguana with vibrant blue-green scales, piercing amber eyes, and orange spiky crest. Intricate textures and details visible on scaly skin. Wrapped in dark hood, giving regal appearance. Dramatic lighting against black background. Hyper-realistic, high-resolution image showcasing the reptile's expressive features and coloration.",
    height=1024,
    width=1024,
    num_inference_steps=50,
    generator=torch.Generator().manual_seed(666),
    guidance_scale=3.5,
).images[0]
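
The pipeline returns a PIL image that the snippet otherwise discards; a one-line follow-up to save it (the filename is illustrative):

image.save("auraflow_iguana.png")  # write the generated image to disk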
Categories: DBA Blogs

Install LightRAG Locally - Modular RAG and Lightning Library for LLM Apps

Fri, 2024-07-12 07:37

This video is a step-by-step tutorial to install LightRAG, which helps developers build and optimize Retriever-Agent-Generator pipelines. It is light, modular, and robust, with a 100% readable codebase.


Code:

conda create -n lightrag python=3.11 -y && conda activate lightrag

git clone https://github.com/SylphAI-Inc/LightRAG.git && cd LightRAG

pip install lightrag
pip install openai==1.12.0
pip install faiss-cpu==1.8.0
pip install sqlalchemy==2.0.30
pip install pgvector==0.2.5
pip install groq==0.5.0

mv .env_example .env   # set your OpenAI and Groq API keys in the .env file


conda install jupyter -y
pip uninstall charset_normalizer -y
pip install charset_normalizer
jupyter notebook

from lightrag.utils import setup_env
setup_env()



from dataclasses import dataclass, field

from lightrag.core import Component, Generator, DataClass
from lightrag.components.model_client import GroqAPIClient
from lightrag.components.output_parsers import JsonOutputParser

@dataclass
class QAOutput(DataClass):
    explanation: str = field(
        metadata={"desc": "A brief explanation of the concept in one sentence."}
    )
    example: str = field(metadata={"desc": "An example of the concept in a sentence."})



qa_template = r"""<SYS>
You are a helpful assistant.
<OUTPUT_FORMAT>
{{output_format_str}}
</OUTPUT_FORMAT>
</SYS>
User: {{input_str}}
You:"""

class QA(Component):
    def __init__(self):
        super().__init__()

        parser = JsonOutputParser(data_class=QAOutput, return_data_class=True)
        self.generator = Generator(
            model_client=GroqAPIClient(),
            model_kwargs={"model": "llama3-8b-8192"},
            template=qa_template,
            prompt_kwargs={"output_format_str": parser.format_instructions()},
            output_processors=parser,
        )

    def call(self, query: str):
        return self.generator.call({"input_str": query})

    async def acall(self, query: str):
        return await self.generator.acall({"input_str": query})
       


qa = QA()
print(qa)

# call
output = qa("What is LLM?")
print(output)


qa.generator.print_prompt(
        output_format_str=qa.generator.output_processors.format_instructions(),
        input_str="What is LLM?",
)
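
Because the component also defines acall, it can be driven asynchronously; a minimal sketch, assuming no event loop is already running (i.e. a plain script rather than a Jupyter cell):

import asyncio

# Async variant of the same query
async_output = asyncio.run(qa.acall("What is LLM?"))
print(async_output)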

Categories: DBA Blogs

Microsoft GraphRAG Installation with Llama.CPP Locally - Step-by-Step Guide

Thu, 2024-07-11 01:02

This video is a step-by-step tutorial to install Microsoft GraphRAG with Llama.cpp and Groq API models, using your own data.




Code:

conda create -n graphcpp python=3.11 -y && conda activate graphcpp


./llama-server -m /home/Ubuntu/mymodels/qwen1_5-4b-chat-q4_k_m.gguf -c 8192 -n -1 -t 7 --embeddings


pip install graphrag


mkdir -p ./ragtest/input

cp fahd.txt ragtest/input 


python3 -m graphrag.index --init --root ./ragtest

python3 -m graphrag.index --root ./ragtest


python3 -m graphrag.query --root ./ragtest --method global "Who is Fahd Mirza?"

python3 -m graphrag.query --root ./ragtest --method local "Who is Fahd Mirza and where does he live?"


Settings.yaml

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: mixtral-8x7b-32768
  model_supports_json: false # recommended if this is available for your model.
  api_base: https://api.groq.com/openai/v1
  max_tokens: 8192
  concurrent_requests: 1 # the number of parallel inflight requests that may be made
  tokens_per_minute: 28000 # set a leaky bucket throttle
  requests_per_minute: 29 # set a leaky bucket throttle
  # request_timeout: 180.0
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  max_retries: 10
  max_retry_wait: 60.0
  sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: text-embedding-ada-002
    api_base: http://localhost:8080
    batch_size: 1 # the number of documents to send in a single request
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional


chunks:
  size: 300
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\\.txt$"

cache:
  type: file # or blob
  base_dir: "cache"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

entity_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 0

summarize_descriptions:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  # enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_report:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes
  # num_walks: 10
  # walk_length: 40
  # window_size: 2
  # iterations: 3
  # random_seed: 597832

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: false
  raw_entities: false
  top_level_nodes: false

local_search:
  # text_unit_prop: 0.5
  # community_prop: 0.1
  # conversation_history_max_turns: 5
  # top_k_mapped_entities: 10
  # top_k_relationships: 10
  # max_tokens: 12000

global_search:
  # max_tokens: 12000
  # data_max_tokens: 12000
  # map_max_tokens: 1000
  # reduce_max_tokens: 2000
  # concurrency: 32
Categories: DBA Blogs

Install Microsoft GraphRAG with Groq API Locally

Wed, 2024-07-10 21:45

This video is a step-by-step tutorial to install Microsoft GraphRAG with Groq API models, using your own data.



Code:

conda create -n graphgroq python=3.11 -y && conda activate graphgroq

export OPENAI_API_KEY=""
export GROQ_API_KEY=""

pip install graphrag

mkdir -p ./ragtest/input
cp fahd.txt ragtest/input

python3 -m graphrag.index --init --root ./ragtest

python3 -m graphrag.index --root ./ragtest

python3 -m graphrag.query --root ./ragtest --method global "Who is Fahd Mirza?"
python3 -m graphrag.query --root ./ragtest --method local "Who is Fahd Mirza and where does he live?"


Settings.yaml File:

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: mixtral-8x7b-32768
  model_supports_json: false # recommended if this is available for your model.
  api_base: https://api.groq.com/openai/v1
  max_tokens: 8192
  concurrent_requests: 1 # the number of parallel inflight requests that may be made
  tokens_per_minute: 28000 # set a leaky bucket throttle
  requests_per_minute: 29 # set a leaky bucket throttle
  # request_timeout: 180.0
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  max_retries: 10
  max_retry_wait: 60.0
  sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${OPENAI_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: text-embedding-3-small
    api_base: https://api.openai.com/v1
    # api_base: https://<instance>.openai.azure.com
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_size: 16 # the number of documents to send in a single request
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional



chunks:
  size: 300
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\\.txt$"

cache:
  type: file # or blob
  base_dir: "cache"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

entity_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 0

summarize_descriptions:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  # enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_report:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes
  # num_walks: 10
  # walk_length: 40
  # window_size: 2
  # iterations: 3
  # random_seed: 597832

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: false
  raw_entities: false
  top_level_nodes: false

local_search:
  # text_unit_prop: 0.5
  # community_prop: 0.1
  # conversation_history_max_turns: 5
  # top_k_mapped_entities: 10
  # top_k_relationships: 10
  # max_tokens: 12000

global_search:
  # max_tokens: 12000
  # data_max_tokens: 12000
  # map_max_tokens: 1000
  # reduce_max_tokens: 2000
  # concurrency: 32
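
Note that the llm block reads ${GRAPHRAG_API_KEY}, while the commands above only export OPENAI_API_KEY and GROQ_API_KEY; assuming the Groq key should drive the chat model, it also needs to be exposed under that name before indexing:

export GRAPHRAG_API_KEY="$GROQ_API_KEY"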
Categories: DBA Blogs

Run LoRA Adapter of Any Model with Ollama

Mon, 2024-07-08 21:58

This video is a step-by-step tutorial to integrate LoRA adapters of models into Ollama.



Code:

conda create -n lora python=3.11 -y && conda activate lora

pip install huggingface_hub
huggingface-cli login            

huggingface-cli download pacozaa/tinyllama-alpaca-lora

git clone https://github.com/anna-chepaikina/llama-cpp.git && cd llama-cpp
pip install -r requirements.txt

python3 convert-lora-to-ggml.py /home/Ubuntu/.cache/huggingface/hub/models--pacozaa--tinyllama-alpaca-lora/snapshots/10fcfd76a34f5b2b18a8d2dfc4372f9d4e0f01cd


ollama pull tinyllama

touch Modelfile
============================

FROM tinyllama:latest
ADAPTER /home/Ubuntu/.cache/huggingface/hub/models--pacozaa--tinyllama-alpaca-lora/snapshots/10fcfd76a34f5b2b18a8d2dfc4372f9d4e0f01cd/ggml-adapter-model.bin

TEMPLATE """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.


{{ if .System }}### Instruction:
{{ .System }}{{ end }}

{{ if .Prompt }}### Input:
{{ .Prompt }}{{ end }}

### Response:
"""
SYSTEM """You are Alexander the Great, the world conqueror."""
PARAMETER stop "### Response:"
PARAMETER stop "### Instruction:"
PARAMETER stop "### Input:"
PARAMETER stop "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request."
PARAMETER num_predict 200

=================

ollama create alextinyllama -f ./Modelfile
ollama run alextinyllama


/set system You're Aristotle, the philosopher. Answer using philosophical terms.
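
The adapter-backed model can also be called from Python through the ollama client (pip install ollama) used elsewhere in this blog; a minimal sketch with an illustrative prompt:

import ollama

# Query the newly created adapter-backed model
output = ollama.generate(model='alextinyllama', prompt='Tell me about your conquest of Persia.')
print(output['response'])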
Categories: DBA Blogs

Install Microsoft GraphRAG with Ollama Locally

Sun, 2024-07-07 03:25

This video is a step-by-step tutorial to install Microsoft GraphRAG with Ollama models, using your own data.


Commands Used:

conda create -n graphollama python=3.11 -y && conda activate graphollama

pip install ollama

ollama pull mistral
ollama pull nomic-embed-text

pip install graphrag

mkdir -p ./ragtest/input
cp fahd.txt ragtest/input

python3 -m graphrag.index --init --root ./ragtest

cd ragtest && vi settings.yaml

sudo find / -name openai_embeddings_llm.py


python3 -m graphrag.index --root ./ragtest

python3 -m graphrag.query --root ./ragtest --method global "Who is Fahd Mirza?"


Files Used:

#settings.yaml

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: mistral
  model_supports_json: true # recommended if this is available for your model.
  # max_tokens: 4000
  # request_timeout: 180.0
  api_base: http://localhost:11434/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 25 # the number of parallel inflight requests that may be made

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic_embed_text
    api_base: http://localhost:11434/api
    # api_version: 2024-02-15-preview


#openai_embeddings_llm.py

from typing_extensions import Unpack
from graphrag.llm.base import BaseLLM
from graphrag.llm.types import (
    EmbeddingInput,
    EmbeddingOutput,
    LLMInput,
)
from .openai_configuration import OpenAIConfiguration
from .types import OpenAIClientTypes
import ollama

class OpenAIEmbeddingsLLM(BaseLLM[EmbeddingInput, EmbeddingOutput]):
    _client: OpenAIClientTypes
    _configuration: OpenAIConfiguration

    def __init__(self, client: OpenAIClientTypes, configuration: OpenAIConfiguration):
        self._client = client
        self._configuration = configuration

    async def _execute_llm(
        self, input: EmbeddingInput, **kwargs: Unpack[LLMInput]
    ) -> EmbeddingOutput | None:
        args = {
            "model": self._configuration.model,
            **(kwargs.get("model_parameters") or {}),
        }
        embedding_list = []
        for inp in input:
            embedding = ollama.embeddings(model="nomic-embed-text", prompt=inp)
            embedding_list.append(embedding["embedding"])
        return embedding_list


Categories: DBA Blogs

Candle - Minimalist ML framework for Rust - Step-by-Step Tutorial - Run Models Locally

Fri, 2024-07-05 23:35

This video installs Candle locally and runs local models. Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support) and ease of use.



Code:

-- Make sure nvcc shows CUDA version 12.3 and nvidia-smi shows 8.9 for compute capability.

nvcc --version should print information about your CUDA compiler driver.
nvidia-smi --query-gpu=compute_cap --format=csv

Make sure that openssl and cc1plus are on the following paths by finding them:  sudo find / -name cc1plus

-- export PATH=$PATH:/usr/lib/gcc/x86_64-linux-gnu/11/   //This is for cc1plus  (cc1plus is the C++ compiler driver )
-- export OPENSSL_LIB_DIR=/usr/lib/x86_64-linux-gnu      //This is for openssl
-- export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
-- export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}


-- If you cannot find openssl, then run the following:

sudo find / -name cc1plus

sudo apt install libssl-dev
sudo apt install pkg-config

-- Create a new app and add candle-core with Cuda support.

cargo new myapp
cd myapp

cargo add --git https://github.com/huggingface/candle.git candle-core --features "cuda"
cargo build


-- Run Inference with Phi-2 model

git clone https://github.com/huggingface/candle.git
cd candle/candle-examples

cargo run --example phi --features cuda --release -- --model 2 --prompt "What is happiness in 3 lines?"
Categories: DBA Blogs

RouteLLM - Create LLM Routers Locally

Fri, 2024-07-05 19:14

This video installs RouteLLM locally, a framework for serving and evaluating LLM routers. It also shows a hands-on demo of routing model traffic between Ollama models and OpenAI models.



Code:

conda create -n rl python=3.11 -y && conda activate rl

git clone https://github.com/lm-sys/RouteLLM.git
cd RouteLLM
pip install -e .[serve,eval]

export OPENAI_API_KEY=sk-XXXXXX

python3 -m routellm.openai_server --routers mf --alt-base-url http://localhost:11434/v1 --config config.example.yaml --weak-model llama3

pip install openai

In Python Interpreter:

import openai
client = openai.OpenAI(base_url="http://localhost:6060/v1", api_key="no_api_key")
response = client.chat.completions.create(model="router-mf-0.116",messages=[{"role": "user", "content": "Hello!"}])
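
The call above stores the completion without displaying it; a minimal follow-up to print the routed answer:

print(response.choices[0].message.content)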

In CLI:

python3 -m routellm.calibrate_threshold --routers mf --strong-model-pct 0.5 --config config.example.yaml
Categories: DBA Blogs

OpenAutoCoder Agentless AI Applications - Install Locally

Wed, 2024-07-03 23:16

 

This video installs Agentless and tries it out on the SWE-bench Lite benchmark. It is an agentless approach to automatically solving software development problems.




Code:

conda create -n agentless python=3.11 -y && conda activate agentless

conda install jupyter -y
pip uninstall charset_normalizer -y
pip install charset_normalizer
jupyter notebook

pip install torch
pip install git+https://github.com/huggingface/transformers
pip install openai
git clone https://github.com/OpenAutoCoder/Agentless.git && cd Agentless

pip install -r requirements.txt

export PYTHONPATH=$PYTHONPATH:$(pwd)

export OPENAI_API_KEY=""

mkdir projectfiles
cd projectfiles
wget https://github.com/OpenAutoCoder/Agentless/releases/download/v0.1.0/swebench_lite_repo_structure.zip
unzip swebench_lite_repo_structure.zip

export PROJECT_FILE_LOC=/home/Ubuntu/projectfiles

mkdir results
python3 agentless/fl/localize.py --file_level --related_level --fine_grain_line_level --output_folder results/location --top_n 3 --compress --context_window=10 --temperature 0.8 --num_samples 4

python3 agentless/fl/localize.py --merge --output_folder results/location_merged --start_file results/location/loc_outputs.jsonl --num_samples 4
Categories: DBA Blogs

Install OpenLIT Locally - Best Free Tool for LLM Monitoring and Tracing

Wed, 2024-07-03 03:17

 This video installs OpenLIT locally and integrates it with local Ollama models. OpenLIT is an OpenTelemetry-native tool designed to help developers gain insights into the performance of their LLM applications in production. It automatically collects LLM input and output metadata, and monitors GPU performance for self-hosted LLMs.




Code:

conda create -n lit python=3.11 -y && conda activate lit

pip install torch
pip install git+https://github.com/huggingface/transformers

git clone https://github.com/openlit/openlit.git

docker compose up -d

pip install openlit
pip install ollama

import ollama
prompt="what is happiness"

import openlit
openlit.init(otlp_endpoint="http://127.0.0.1:4318",trace_content=False)

response = ollama.generate(model='llama3', prompt=prompt)
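
The generate call returns a dictionary whose response field holds the completion; printing it confirms the traced request succeeded:

print(response['response'])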
Categories: DBA Blogs

GraphRAG Local Installation - Create RAG Pipeline on Your Own Custom Data

Tue, 2024-07-02 06:10

This video installs GraphRAG locally and creates a RAG pipeline with your own custom data in a step-by-step tutorial.



Code:

conda create -n grag python=3.11 -y && conda activate grag

export OPENAI_API_KEY=""

pip install graphrag

pip install gptcache

mkdir -p ./ragtest/input && cd ./ragtest/input

touch fahd.txt

python3 -m graphrag.index --init --root ./ragtest

Set the API key in the .env file; a minimal example follows.
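
A minimal .env for the default configuration, using the same variable name that settings.yaml references (the key value is a placeholder):

GRAPHRAG_API_KEY=<your OpenAI API key>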

python3 -m graphrag.index --root ./ragtest

python3 -m graphrag.query --root ./ragtest --method global "Who is Fahd Mirza?"

python3 -m graphrag.query --root ./ragtest --method local "Who is Fahd Mirza and where does he live?"

Categories: DBA Blogs
