The AI landscape is shifting. Open-source models like Llama 3.1 are proving that smaller, strategically fine-tuned models can match or exceed the performance of expensive closed-source solutions. With Impulse AI, you can harness Llama 3.1 8B's capabilities on your own data, building custom AI models that are both powerful and entirely yours. This is a comprehensive guide to fine-tuning the pre-trained Llama 3.1 8B Instruct model using the Impulse SDK or the Web App. Training on Impulse is simple: upload a dataset, submit a job, and Impulse AI orchestrates everything else for you.

Key Takeaways

  • Prepare a dataset and upload it to the Impulse Platform.
  • Submit a fine-tuning job with custom training parameters using the SDK or the Web App.
  • Download and evaluate the fine-tuned model.

Prerequisites

  • Impulse SDK: Install it with pip install impulse-api-sdk-python
  • API Key: Obtain the key from the Impulse dashboard and set it as an environment variable (a quick sanity check follows below):
export IMPSDK_API_KEY=your_api_key
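Before submitting jobs, it can help to confirm that the SDK is importable and that the key is visible to Python. A minimal check, assuming the install and export above succeeded:
import os

# Importing proves the package is installed; the name below matches the
# import used later in this guide.
from impulse.api_sdk.sdk import ImpulseSDK

# Confirm the API key environment variable is set.
api_key = os.environ.get("IMPSDK_API_KEY")
assert api_key, "IMPSDK_API_KEY is not set"
print("SDK imported, API key found.")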

Dataset Preparation

Dataset preparation is the most crucial step when fine-tuning on the Impulse AI platform: the quality and formatting of your data largely determine the quality of the fine-tuned model. For a comprehensive guide on supported data formats and preparation methods, refer to the dataset guide.
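As a rough illustration, the snippet below writes a small instruction-style JSONL file. The prompt/completion field names are placeholders, not Impulse's confirmed schema; treat the dataset guide as the authoritative reference for the expected format.
import json

# Hypothetical instruction-style records; field names are illustrative only.
# Check the dataset guide for the schema Impulse actually expects.
records = [
    {"prompt": "What is the capital of Japan?", "completion": "Tokyo"},
    {"prompt": "Which planet is known as the Red Planet?", "completion": "Mars"},
]

with open("my_dataset.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")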

Fine Tune

The fine-tuning parameters are flexible, allowing you to specify options such as batch size, learning rate, number of epochs, random seed, and whether to shuffle the data. LoRA, QLoRA, and full fine-tuning are all supported.
Method 1: Fine-Tune via the Impulse SDK
import os
import asyncio
from impulse.api_sdk.sdk import ImpulseSDK
from impulse.api_sdk.models import (
    FineTuningJobCreate, FineTuningJobParameters
)
)

async def main():
    async with ImpulseSDK(os.environ.get("IMPSDK_API_KEY")) as client:
        job = await client.fine_tuning.create_fine_tuning_job(FineTuningJobCreate(
            base_model_name="llm_llama3_1_8b",
            dataset_name="<dataset-name>",
            name="<job-name>",
            type="<fine-tune mode>",
            parameters=FineTuningJobParameters(
                batch_size=2,
                shuffle=True,
                num_epochs=1,
                lr=2e-5,
                seed=42
            )
        ))
        print(f"Fine-tuning job started: {job}")

asyncio.run(main())
Method 2: Fine-Tune via the Web App
  1. Log in to the Impulse Dashboard.
  2. Navigate to the Fine-Tuning tab in the left panel.
  3. Click on “Create Job”.
Sit back & relax while we finish training and provide you with the fine-tuned model parameters 😃

Monitoring Jobs

Job status can be retrieved in the following ways.
Method 1: Impulse SDK
import os
import asyncio
from impulse.api_sdk.sdk import ImpulseSDK

async def main():
    async with ImpulseSDK(os.environ.get("IMPSDK_API_KEY")) as client:
        jobs = await client.fine_tuning.list_fine_tuning_jobs()
        print("Fine-tuning jobs:", jobs)

asyncio.run(main())
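If you want a script to block until training completes, one option is to poll the job list on an interval. The sketch below reuses list_fine_tuning_jobs and assumes it returns an iterable of job objects exposing name and status attributes, with statuses like "queued" and "running" while in progress; verify the actual field names and status values against the SDK models.
import os
import asyncio
from impulse.api_sdk.sdk import ImpulseSDK

async def wait_for_job(job_name, poll_seconds=60):
    # Poll the job list until the named job leaves an in-progress state.
    # The `name`/`status` attributes and the status strings below are
    # assumptions; check the SDK models for the real names and values.
    async with ImpulseSDK(os.environ.get("IMPSDK_API_KEY")) as client:
        while True:
            jobs = await client.fine_tuning.list_fine_tuning_jobs()
            job = next((j for j in jobs if j.name == job_name), None)
            if job is not None and job.status not in ("queued", "running"):
                return job
            await asyncio.sleep(poll_seconds)

print(asyncio.run(wait_for_job("<job-name>")))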
Method 2: Web App
Job status is visible under the Fine-Tuning section on the Impulse Dashboard.

Post Training

Fine-tuned model weights are available for download from the Fine-Tuning page of the Impulse Dashboard. Note: In-house evaluation and inference capabilities will be available on Impulse AI soon; our team is currently building these features.

Quick Guide to Inference

Inference on downloaded weights can be performed using the Hugging Face Transformers library. The sample script below demonstrates how to run inference locally or on a hosted machine once the model weights are available to that machine.
from transformers import AutoModelForCausalLM, AutoTokenizer
import sys

# Usage:
# python predict.py "What is the capital of Japan?"

# Read the question from the command line
if len(sys.argv) < 2:
    sys.exit('Usage: python predict.py "<your question>"')
query = sys.argv[1]

# Load your fine-tuned model and tokenizer (replace with your model path or Hugging Face hub path)
model_path = "<path_to_your_finetuned_model>"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Generate predictions from the model
def generate_answer(prompt):
    inputs = tokenizer(prompt, return_tensors="pt")
    # max_new_tokens bounds the generated text independently of prompt length;
    # passing **inputs also forwards the attention mask
    outputs = model.generate(**inputs, max_new_tokens=50)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Print model output
print(generate_answer(query))
Sample inference for a model fine-tuned on the TriviaQA dataset (refer to the dataset guide):
python predict.py "What is the capital of Japan?"
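One caveat: if you fine-tuned with LoRA or QLoRA, the download may contain adapter weights rather than a full model, in which case AutoModelForCausalLM.from_pretrained alone won't load them. A minimal sketch using the peft library, assuming the adapters sit in a local directory and the base model is pulled from the Hugging Face Hub (both paths below are placeholders):
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Assumed identifiers: replace with your base model and downloaded adapter paths.
base_model_path = "meta-llama/Llama-3.1-8B-Instruct"
adapter_path = "<path_to_your_downloaded_adapters>"

tokenizer = AutoTokenizer.from_pretrained(base_model_path)
base_model = AutoModelForCausalLM.from_pretrained(base_model_path)

# Attach the LoRA adapters, then fold them into the base weights so the
# result behaves like an ordinary Transformers model.
model = PeftModel.from_pretrained(base_model, adapter_path)
model = model.merge_and_unload()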

Conclusion

Using the Impulse SDK, you can quickly fine-tune open-source models like Llama 3.1 8B for specific downstream tasks, creating faster, more accurate models at a fraction of the cost of closed-source alternatives. The flexibility of Impulse AI's fine-tuning API lets you customize the entire process, from dataset management to model deployment. For more details, check out our full documentation or explore the package on PyPI to get started.