FANA LLM v0.3.0 Update: Chat History RAG, NLP Enhancements, and Multi-Language Image Processing


What's new?

This update introduces significant enhancements to the FANA LLM system, focusing on chat history retrieval-augmented generation (RAG) for contextual conversations, natural language processing (NLP) for image generation and analysis, and improved multi-language support. Key improvements include more relevant retrieval of past interactions, more accurate context retrieval, and enhanced regex-based detection of trigger words and phrases for image generation and analysis.

Key Features and Improvements

1. Advanced Chat History RAG For Contextual Conversations

Retrieval-augmented generation (RAG) is an advanced natural language processing (NLP) technique that combines the strengths of retrieval-based and generative AI models. RAG leverages the vast pre-existing knowledge embedded within AI models and enhances it by retrieving relevant information to generate unique, context-aware responses. This approach allows RAG AI to deliver highly accurate results by not only summarizing the retrieved data but also synthesizing it into coherent and human-like language.

Purpose:

  • Finds the most relevant past interactions based on semantic similarity and keyword relevance.

  • Ensures context-aware responses by utilizing previous chat history effectively.

Use Case:

  • Ideal for tasks like image modifications or follow-up requests where understanding the context of previous user interactions is crucial.

Challenges Solved:

  1. AI Context Limitation:

    • Traditional models like GPT-3.5 have a 4k token limit for input and output. RAG overcomes this by retrieving only the most relevant parts of the chat history, ensuring that the input stays within token limits.

  2. Cost Efficiency:

    • Processing entire chat histories for each request can be computationally expensive. By retrieving only the relevant data, RAG significantly reduces the number of tokens processed, saving costs on API usage.

  3. Enhanced Relevance:

    • Combines semantic similarity with keyword relevance to ensure the retrieved data is not just similar but also contextually relevant to the user’s current query.

  4. Improved User Experience:

    • Provides more accurate and context-aware responses, enhancing the overall interaction quality and user satisfaction.

Highlights:

  • Cosine Similarity: Calculates the cosine similarity between user input embedding and stored embeddings to find semantically similar content.

  • Keyword Relevance: Extracts and matches keywords between user input and stored chat history to ensure contextual relevance.

  • Combined Scoring: Uses a weighted approach to combine similarity and keyword relevance scores, ensuring the most relevant content is retrieved.
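The cosine similarity and keyword relevance steps above rely on two helpers that the excerpt below assumes already exist. A minimal sketch of both follows; the stop-word list and keyword logic here are illustrative stand-ins, not the exact FANA implementations:

```python
import math
import re

def extract_keywords(text):
    """Toy keyword extractor: lowercase words minus common stop words."""
    stop_words = {"the", "a", "an", "is", "to", "of", "and", "in", "on", "for"}
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {w for w in words if w not in stop_words}

def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Parallel vectors score ≈ 1.0, orthogonal vectors score 0.0
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # ≈ 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # → 0.0
```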

Function: find_most_similar_chat_history

# Excerpt: `supabase_data` (stored chat histories) and `similarities` (cosine
# similarity scores between the user input embedding and stored embeddings)
# are computed earlier in the function.

# Extract keywords from user input
user_keywords = extract_keywords(user_input)
logging.info(f"Extracted user keywords: {user_keywords}")

# Calculate keyword relevance scores
keyword_relevance_scores = []
for entry in supabase_data:
    entry_text = " ".join([msg["content"] for msg in entry["chat_history"]["data"]])
    entry_keywords = extract_keywords(entry_text)
    common_keywords = user_keywords.intersection(entry_keywords)
    keyword_relevance_scores.append(len(common_keywords))

# Normalize keyword relevance scores
max_keyword_score = max(keyword_relevance_scores) if keyword_relevance_scores else 1
normalized_keyword_scores = [score / max_keyword_score for score in keyword_relevance_scores]

# Combine similarity and keyword relevance
combined_scores = [(sim * 0.7) + (kw_score * 0.3) for sim, kw_score in zip(similarities, normalized_keyword_scores)]

if not combined_scores:  # Check if the list is empty
    logging.info("No valid similarities found.")
    return "No similar content found due to invalid data"

max_combined_score = max(combined_scores)
similarity_threshold = 0.25  # Minimum combined score for chat history retrieval; higher values require closer matches.
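After the combined scores are computed, the retrieval step picks the highest-scoring entry and falls back gracefully when nothing clears the threshold. The continuation below is a sketch of one plausible implementation, not FANA's exact code:

```python
def select_best_entry(supabase_data, combined_scores, similarity_threshold=0.25):
    """Return the chat history entry with the highest combined score,
    or None when no entry clears the threshold."""
    if not combined_scores:
        return None
    best_index = max(range(len(combined_scores)), key=combined_scores.__getitem__)
    if combined_scores[best_index] < similarity_threshold:
        return None
    return supabase_data[best_index]

# Example with toy data: the second entry wins with a score of 0.8
entries = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
print(select_best_entry(entries, [0.1, 0.8, 0.3]))  # → {'id': 'b'}
print(select_best_entry(entries, [0.1, 0.2, 0.1]))  # → None (below threshold)
```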

2. NLP and Trigger Word Enhancements

Function: check_for_trigger_words

  • Purpose: Improved detection of trigger words and questions to handle more complex inputs.

  • Use Case: Detects if user inputs should trigger image generation or other actions.

  • Highlights:

    • Uses refined regex patterns to detect questions and trigger words.

    • Translates non-English inputs to English before processing.

    translated_input = await translate_with_gpt(client, user_input, "English")
    user_input_lower = translated_input.lower()
    is_question = any(re.search(pattern, user_input_lower) for pattern in question_patterns)
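A minimal, self-contained sketch of the kind of regex detection check_for_trigger_words performs follows; the trigger list and question patterns here are illustrative, not FANA's actual ones:

```python
import re

TRIGGER_WORDS = {"generate", "draw", "create", "imagine"}  # illustrative list
QUESTION_PATTERNS = [
    r"^\s*(what|who|where|when|why|how|can|could|do|does|is|are)\b",
    r"\?\s*$",
]

def check_for_trigger_words(user_input):
    """Return (is_question, has_trigger) for a user input string."""
    user_input_lower = user_input.lower()
    is_question = any(re.search(p, user_input_lower) for p in QUESTION_PATTERNS)
    has_trigger = any(
        re.search(rf"\b{re.escape(word)}\b", user_input_lower)
        for word in TRIGGER_WORDS
    )
    return is_question, has_trigger

print(check_for_trigger_words("Draw a castle at sunset"))  # → (False, True)
print(check_for_trigger_words("What is FANA?"))            # → (True, False)
```

Word-boundary anchors (`\b`) keep "draw" from matching inside unrelated words such as "withdrawn".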

3. Enhanced Image Processing

Function: analyze_and_generate

  • Purpose: Processes image-related queries considering the context of previous interactions.

  • Use Case: Ensures relevant image modifications and follow-up actions are based on past interactions.

  • Highlights:

    • Extracts the last generated image description from chat history.

    • Combines user input with previous image context for analysis.

    last_image_description = msg["content"].split("Here's the image created based on your request:")[1].strip()
    combined_input = f"{user_input} {last_image_description} {' '.join([msg['content'] for msg in retrieved_history])}"
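The extraction step above can be sketched end to end. The marker string matches the one in the excerpt, while the chat history shape (role/content messages) is an assumption based on the earlier code:

```python
IMAGE_MARKER = "Here's the image created based on your request:"

def last_image_description(chat_history):
    """Scan chat history newest-first and return the description that
    follows the image marker, or None if no image was generated."""
    for msg in reversed(chat_history):
        if msg["role"] == "assistant" and IMAGE_MARKER in msg["content"]:
            return msg["content"].split(IMAGE_MARKER, 1)[1].strip()
    return None

history = [
    {"role": "user", "content": "Generate a red dragon"},
    {"role": "assistant",
     "content": "Here's the image created based on your request: a red dragon"},
    {"role": "user", "content": "Make it breathe fire"},
]
print(last_image_description(history))  # → "a red dragon"
```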

4. Multi-Language Support

Functions: translate_to_english, translate_back_to_user

  • Purpose: Handles multi-language inputs and outputs by translating non-English inputs to English for processing and translating responses back into the user's language.

  • Use Case: Ensures non-English inputs are accurately processed and trigger appropriate actions.

  • Highlights:

    • Translates user input to English before processing.

    translated_text = response.choices[0].message["content"].strip()
    user_input_lower = translated_input.lower()
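The round-trip flow around translate_with_gpt can be sketched with the translator injected as a parameter, so the logic is testable without an API call. This is a sketch of the flow only; the placeholder processing step and the fake translator below are illustrative, not FANA's code:

```python
def process_multilingual(user_input, translate, detected_language):
    """Translate non-English input to English, process it lowercased,
    then translate the response back to the user's language."""
    if detected_language != "English":
        translated_input = translate(user_input, "English")
    else:
        translated_input = user_input
    user_input_lower = translated_input.lower()
    response = f"Processed: {user_input_lower}"  # placeholder for the LLM step
    if detected_language != "English":
        response = translate(response, detected_language)
    return response

# Toy translator standing in for translate_with_gpt
def fake_translate(text, target_language):
    table = {("Hola", "English"): "Hello"}
    return table.get((text, target_language), f"[{target_language}] {text}")

print(process_multilingual("Hola", fake_translate, "Spanish"))
# → "[Spanish] Processed: hello"
```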

Conclusion

The v0.3.0 update to FANA LLM brings significant enhancements in retrieving and utilizing past interactions, improving response relevance and accuracy, and supporting multi-language inputs. These improvements ensure a more effective and user-friendly experience, paving the way for more sophisticated interactions and better support for user queries in the future.

