FANA LLM v0.3.1 - Enhanced Context and Response Time, Task Determination, Groq and Claude 3.5 Sonnet

This significant update brings numerous improvements to FANA's capabilities, including enhanced multi-language support, contextual memory, faster inference for text responses, and improved image generation and analysis.


Task Determination, Contextual Image Generation, Improved Response Time, Groq Inference and Claude 3.5 Sonnet

Task Determination

The task_determination module routes user input to the appropriate task for execution. Here is an example of how it works:

async def process_user_input(user_input: str, chat_history: list, client):
    # Handle greetings and intro messages
    response, chat_history = await triggers_greetings(user_input, chat_history, client)
    if response:
        return response, chat_history
    # Handle image vision through URL or image upload
    response, chat_history = await process_url(user_input, chat_history)
    if response:
        return response, chat_history
    # Handle triggers for image diffusion, regeneration, and image context handling
    response, chat_history = await process_triggers(user_input, chat_history, client)
    if response:
        return response, chat_history
    # Fall back to general text processing (mostly FAQ questions)
    return await process_text(user_input, chat_history, client)

New AI Models Integration

Claude 3.5 Sonnet for Computer Vision

We have integrated Claude 3.5 Sonnet, a powerful multimodal model with strong vision capabilities, into our system. Here is an example of how it works:

import os
import httpx

# Set up the Anthropic API key
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")

# Set up the API endpoint and headers
ANTHROPIC_API_URL = "https://api.anthropic.com/v1/messages"
ANTHROPIC_HEADERS = {
    "x-api-key": ANTHROPIC_API_KEY,
    "Content-Type": "application/json",
    "anthropic-version": "2023-06-01"
}

# Create a helper function for making API calls
async def anthropic_api_call(data):
    async with httpx.AsyncClient(timeout=300) as client:
        response = await client.post(ANTHROPIC_API_URL, headers=ANTHROPIC_HEADERS, json=data)
        response.raise_for_status()
        return response.json()
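
For image analysis, the same helper can be passed an image content block. Below is a minimal sketch, assuming the helper above and the claude-3-5-sonnet-20240620 model; the prompt, media type, and function name are placeholders for illustration:

import base64

async def analyze_image(image_bytes: bytes, prompt: str = "Describe this image."):
    # Encode the raw image bytes as base64, as the Anthropic API expects
    image_data = base64.b64encode(image_bytes).decode("utf-8")
    data = {
        "model": "claude-3-5-sonnet-20240620",
        "max_tokens": 1024,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64",
                            "media_type": "image/jpeg",
                            "data": image_data}},
                {"type": "text", "text": prompt},
            ],
        }],
    }
    result = await anthropic_api_call(data)
    # The model's reply is in the first content block of the response
    return result["content"][0]["text"]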

Groq and Its Fast Inference Engine with All Available Models

We have integrated Groq's ultra-fast inference engine into our system, which provides a significant speedup in response time. It also gives access to several open-source models:

  • llama3-70b-8192 with an 8,192-token context window

  • llama3-8b-8192 with an 8,192-token context window

  • gemma2-9b-it with an 8,192-token context window

  • gemma-7b-it with an 8,192-token context window

  • mixtral-8x7b-32768 with a 32,768-token context window

  • whisper-large-v3 with a 1,500-token context window

2024-07-07 09:20:03 - root INFO: Groq token usage: 
Prompt tokens = 580 
Completion tokens = 71 
Total tokens = 651                                                                             
2024-07-07 09:20:03 - root INFO: Groq response times: 
Prompt time = 0.053630 seconds, 
Completion time = 0.114463 seconds 
Total time = 0.168093 seconds 
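
The figures above come from the usage object that Groq returns with each completion. Here is a minimal logging sketch, assuming Groq's OpenAI-compatible response format (which also reports per-request timing):

import logging

def log_groq_usage(response_json: dict) -> None:
    usage = response_json.get("usage", {})
    # Token counts for the prompt, the completion, and their sum
    logging.info(
        "Groq token usage: \nPrompt tokens = %s \nCompletion tokens = %s \nTotal tokens = %s",
        usage.get("prompt_tokens"),
        usage.get("completion_tokens"),
        usage.get("total_tokens"),
    )
    # Per-request timing as reported by the inference engine
    logging.info(
        "Groq response times: \nPrompt time = %s seconds, \nCompletion time = %s seconds \nTotal time = %s seconds",
        usage.get("prompt_time"),
        usage.get("completion_time"),
        usage.get("total_time"),
    )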

Mixtral 8x7b with 32k Context Window

We have integrated the open-source Mixtral 8x7B model with its 32k-token context window, which is larger than GPT-3.5's. This allows our AI to understand the conversation context better and provide more accurate and personalized responses.

Here is an example snippet:

import httpx

async def groq_chat_completion(messages: list, client) -> str:
    async with httpx.AsyncClient() as http_client:
        response = await http_client.post(
            "https://api.groq.com/openai/v1/chat/completions",
            headers={"Authorization": f"Bearer {client.api_key}"},
            json={
                "model": "mixtral-8x7b-32768",
                "messages": messages,
                "max_tokens": 8000,
            },
        )
        response.raise_for_status()
        response_json = response.json()
        # Extract the generated text from the first completion choice
        return response_json["choices"][0]["message"]["content"].strip()
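
With the snippet wrapped in a helper as above, a call from async code looks like this (the message list is illustrative):

messages = [
    {"role": "system", "content": "You are FANA, a helpful assistant."},
    {"role": "user", "content": "Summarize our conversation so far."},
]
reply = await groq_chat_completion(messages, client)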

Multi-Language Enhancement

We have enhanced our multi-language support, allowing users to interact with the AI in their preferred language. Our language detection module has been improved to properly detect and respond in multiple languages, including but not limited to English, Spanish, French, and many more.
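
As an illustration of how language detection can drive the response, here is a minimal sketch using the langdetect library; the helper name and fallback behavior are assumptions, not FANA's exact implementation:

from langdetect import detect
from langdetect.lang_detect_exception import LangDetectException

def detect_language(user_input: str, default: str = "en") -> str:
    # Return an ISO 639-1 code (e.g. "en", "es", "fr") for the user's message
    try:
        return detect(user_input)
    except LangDetectException:
        # Very short or ambiguous inputs cannot be classified; fall back
        return default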

Contextual Memory

Our contextual memory module has been improved to understand the conversation context better and provide more accurate and personalized responses. This is achieved by storing and retrieving conversation history, allowing the AI to recall previous interactions and adapt its responses accordingly.
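
A minimal sketch of how conversation history can be stored and trimmed; the class and its window size are illustrative, not FANA's actual implementation:

class ChatMemory:
    def __init__(self, max_messages: int = 20):
        self.max_messages = max_messages
        self.history: list[dict] = []

    def add(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})
        # Keep only the most recent messages so prompts stay inside the
        # model's context window
        self.history = self.history[-self.max_messages:]

    def as_messages(self) -> list[dict]:
        # Return a copy so callers cannot mutate the stored history
        return list(self.history)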

Image Generation and Analysis Context Improved

We have improved our image generation capabilities, allowing users to generate images based on the context of the latest messages. Additionally, our image generation module can now identify regeneration keywords and generate images based on the context provided. This enables the AI to generate more accurate and relevant images that match the user's intent.
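
For illustration, regeneration-keyword detection can be as simple as the sketch below; the trigger list and function name are assumptions, not FANA's exact trigger set:

REGENERATION_TRIGGERS = ("regenerate", "try again", "another version", "redo")

def is_regeneration_request(user_input: str) -> bool:
    # Match any trigger phrase anywhere in the (lower-cased) message
    text = user_input.lower()
    return any(trigger in text for trigger in REGENERATION_TRIGGERS)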

Technical Improvements

We have made several technical improvements, including:

  • Optimized task determination and execution logic for faster and more accurate responses

  • Integration of the latest Anthropic model claude-3-5-sonnet-20240620 for image analysis, reducing image-analysis response time by up to 50%

  • Integration of the latest Groq inference API for ultra-fast responses with the open-source models listed above (Llama 3 70B and 8B, Gemma 2 9B, Gemma 7B, Mixtral 8x7B, and Whisper large-v3)

  • Improved language detection and response generation using advanced NLP techniques

  • Enhanced contextual memory module with a RAG system for better conversation understanding (see the sketch after this list)

  • Improved image generation and analysis capabilities using state-of-the-art models

  • Significant improvement in response time through algorithmic optimizations
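
For the RAG-backed memory mentioned above, retrieval can be sketched as a cosine-similarity search over embedded conversation snippets; this is an illustrative outline, with embeddings assumed to come from whatever embedding model is in use:

import numpy as np

def retrieve_context(query_vec: np.ndarray, memory_vecs: np.ndarray,
                     snippets: list[str], top_k: int = 3) -> list[str]:
    # Cosine similarity between the query and every stored snippet
    norms = np.linalg.norm(memory_vecs, axis=1) * np.linalg.norm(query_vec)
    sims = (memory_vecs @ query_vec) / np.maximum(norms, 1e-10)
    # Indices of the top_k most similar snippets, best first
    best = np.argsort(sims)[::-1][:top_k]
    return [snippets[i] for i in best]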


Conclusion

FANA v0.3.1 is a significant release that brings numerous improvements to our AI capabilities. With enhanced multi-language support, contextual memory, faster inference for text responses, and improved image generation and analysis, FANA is now more powerful and accurate than ever. We are excited to see how developers will utilize these improvements to build innovative applications.

Getting Started

To get started with FANA v0.3.1, reach out to us at hello@fana.ai.

License

FANA is licensed under the Apache 2.0 license. See LICENSE for more information.

