Local Hugging Face Detector

The Local Hugging Face Detector uses the Hugging Face transformers library to detect and sanitize sensitive data with a locally run language model. The example below shows how to implement and use this detector.

Example: Local Hugging Face Detector

This example demonstrates how to use a local Hugging Face model to detect secrets in text input.

Code Example

import asyncio
from dotenv import load_dotenv
from sentinel import instrument_model_class, LLMSecretDetector, TrustableLLM
import os

# Load environment variables from a .env file
load_dotenv()

try:
    from transformers import pipeline

    # Define a class that extends TrustableLLM to use a Hugging Face model
    class LocalHFLLM(TrustableLLM):
        def __init__(self, model_name: str = 'Qwen/Qwen2.5-1.5B-Instruct', token=os.getenv('HUGGING_FACE_HUB_TOKEN')):
            # Initialize a text generation pipeline with the specified model and token.
            # device=0 selects the first GPU; use device=-1 to run on the CPU instead.
            self.generator = pipeline("text-generation", model=model_name, device=0, token=token)

        def predict(self, text: str, **kwargs) -> str:
            # Generate deterministically with greedy decoding; temperature is
            # ignored when do_sample=False, so it is not passed here
            outputs = self.generator(text, max_new_tokens=512, do_sample=False)
            full_output = outputs[0]['generated_text']
            # Remove the input prompt from the beginning of the generated text
            generated_part = full_output[len(text):].strip()
            return generated_part

except ImportError:
    # transformers is not installed, so the local model class is unavailable
    LocalHFLLM = None

# Define a fake response class to simulate LLM responses
class FakeResponse:
    def __init__(self, content):
        self.role = 'AI'
        self.content = content

# Define a fake chat model class to simulate LLM interactions
class FakeChatModel:

    def invoke(self, messages):
        # Ensure sensitive information is not present in the input messages
        assert "ABC123SECRETXYZ" not in messages[0]["content"]
        return FakeResponse("demo_app_user_auth(kkk@gmail.com, ABC123SECRETXYZ).")

    async def ainvoke(self, messages):
        # Ensure sensitive information is not present in the input messages
        assert "ABC123SECRETXYZ" not in messages[0]["content"]
        return FakeResponse("demo_app_user_auth(kkk@gmail.com, ABC123SECRETXYZ).")

    def untouched_method(self):
        return "leave me alone"

async def main():

    messages = [
        {"role": "user", "content": "Write a function in Python that logs in to my demo account with the following "
                                        "API function demo_app_user_auth(username, password). "
                                        "My email is kkk@gmail.com and password ABC123SECRETXYZ."},
    ]

    # Define a custom prompt to guide the model in extracting secrets
    custom_prompt = (
        "Extract secrets from the following:\n\n"
        "Only include API keys, secrets, tokens, or credentials. Use the following output format as JSON: \n\n"
        "Text: {text}"
    )

    # Initialize the detector with the local Hugging Face model and custom prompt
    detector = LLMSecretDetector(LocalHFLLM(model_name='Qwen/Qwen2.5-1.5B-Instruct'), prompt_format=custom_prompt)
    # Wrap the fake chat model with the detector
    wrapped_llm = instrument_model_class(FakeChatModel, detector)()
    # Invoke the wrapped model asynchronously; the wrapper sanitizes the messages before FakeChatModel sees them
    result = await wrapped_llm.ainvoke(messages)

    # Print the response from the wrapped LLM
    print("Wrapped LLM Response:", result.content)
    # Assert that the original secret is present in the response
    assert 'ABC123SECRETXYZ' in result.content

# --- Run the test ---
if __name__ == '__main__':
    asyncio.run(main())

Explanation

Initialization: The LocalHFLLM class builds a transformers text-generation pipeline from a model name and an optional Hugging Face token. Its predict method runs the prompt through the pipeline and returns only the newly generated text, with the prompt stripped from the front.
Custom Prompt: The custom prompt tells the model which kinds of sensitive values to extract (API keys, secrets, tokens, credentials). The {text} placeholder marks where the input text is inserted.
Detection: LLMSecretDetector combines the local model with the custom prompt to find sensitive values in the input messages so that they can be sanitized before the messages are passed on.
Wrapping: instrument_model_class wraps FakeChatModel so that sensitive data is sanitized before it reaches the model and restored in the response afterwards. The assert statements inside FakeChatModel.invoke and FakeChatModel.ainvoke verify that the raw secret never reaches the model.
Invocation: ainvoke is called on the wrapped model with the original, unsanitized messages; the wrapper sanitizes them on the way in. The response is printed, and the final assert checks that it contains the original secret.
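
The sanitize-then-restore flow described above can be pictured as a small round trip. The following is a minimal sketch of that idea only and is not how sentinel implements it: the helper names mask_secrets and restore_secrets and the <SECRET_n> placeholder format are hypothetical, chosen purely for illustration.

```python
from typing import Dict, List, Tuple


def mask_secrets(text: str, secrets: List[str]) -> Tuple[str, Dict[str, str]]:
    """Replace each detected secret with a placeholder and remember the mapping."""
    mapping: Dict[str, str] = {}
    for index, secret in enumerate(secrets):
        placeholder = f"<SECRET_{index}>"  # hypothetical placeholder format
        mapping[placeholder] = secret
        text = text.replace(secret, placeholder)
    return text, mapping


def restore_secrets(text: str, mapping: Dict[str, str]) -> str:
    """Put the original values back into a response that contains placeholders."""
    for placeholder, secret in mapping.items():
        text = text.replace(placeholder, secret)
    return text


prompt = "My email is kkk@gmail.com and password ABC123SECRETXYZ."
# In the real flow the detector would supply the list of secrets.
masked, mapping = mask_secrets(prompt, ["ABC123SECRETXYZ"])

# The downstream model only ever sees the masked text and answers with the placeholder.
reply = "demo_app_user_auth(kkk@gmail.com, <SECRET_0>)."
restored = restore_secrets(reply, mapping)
```

Keeping the placeholder-to-secret mapping on the caller's side is what allows the response to be restored without the downstream model ever receiving the original value.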

Customization

For more details on customization, refer to the Custom Prompts page.

The Local Hugging Face Detector can be customized by changing the model name, token, and prompt format, which lets users tailor detection to their own needs and use cases.
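
As a rough illustration of these three customization points, the sketch below gathers them in one place and previews the prompt the model would receive. It rests on assumptions: the SECRET_DETECTOR_MODEL environment variable and the helper names load_detector_settings and preview_prompt are hypothetical, and the way the library itself substitutes {text} is not shown in this page, so plain string replacement is used here only for the preview.

```python
import os

# Hypothetical environment variable for illustration; only
# HUGGING_FACE_HUB_TOKEN appears in the example above.
MODEL_ENV_VAR = "SECRET_DETECTOR_MODEL"
DEFAULT_MODEL = "Qwen/Qwen2.5-1.5B-Instruct"

CUSTOM_PROMPT = (
    "Extract secrets from the following:\n\n"
    "Only include API keys, secrets, tokens, or credentials.\n\n"
    "Text: {text}"
)


def load_detector_settings() -> dict:
    """Collect the model name, token, and prompt format in one place."""
    return {
        "model_name": os.getenv(MODEL_ENV_VAR, DEFAULT_MODEL),
        "token": os.getenv("HUGGING_FACE_HUB_TOKEN"),
        "prompt_format": CUSTOM_PROMPT,
    }


def preview_prompt(prompt_format: str, text: str) -> str:
    """Show the final prompt, assuming plain substitution of the {text} placeholder."""
    if "{text}" not in prompt_format:
        raise ValueError("prompt_format must contain a {text} placeholder")
    return prompt_format.replace("{text}", text)


settings = load_detector_settings()
preview = preview_prompt(settings["prompt_format"], "password ABC123SECRETXYZ")
print(preview)
```

These settings map onto the constructor arguments used in the example: model_name and token go to LocalHFLLM, and prompt_format goes to LLMSecretDetector.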