Ankit-0803/FinSight-AI

📊 FinSight AI — Intelligent Equity Research Assistant

FinSight AI is an AI-powered equity research assistant that automates financial news analysis using a Retrieval-Augmented Generation (RAG) pipeline.
It allows users to input up to 3 financial news article URLs, processes them into a semantic vector index, and answers natural-language questions with grounded insights and source citations.

The architecture prioritizes retrieval accuracy, low hallucination risk, and fault tolerance through model fallback.


🚀 Key Capabilities

  • Financial News Scraping

    • Extracts clean article text from public financial news URLs using LangChain loaders and BeautifulSoup.
    • Preserves source URLs for citation and traceability.
  • RAG-Based Question Answering

    • Uses semantic search to retrieve the most relevant article chunks.
    • LLM answers are strictly grounded in retrieved content.
  • Semantic Vector Search (FAISS)

    • Embeddings are stored locally in a FAISS index for fast similarity search.
    • Exact (brute-force) L2 nearest-neighbor search gives high retrieval accuracy on small datasets.
  • Multi-Model LLM Fallback

    • Uses OpenRouter to route queries across multiple free LLMs.
    • If a model fails or times out, the query automatically falls through to the next model in the chain.
  • Source-Backed Answers

    • Every answer includes links to the original articles used.
    • Optional reasoning visibility for transparency.
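
The multi-model fallback described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the model labels are hypothetical placeholders that mirror the fallback order named in this README (real OpenRouter model IDs differ), and `call_model` is an assumed callable that wraps whatever client sends the request.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical labels mirroring the fallback order in this README;
# real OpenRouter model IDs are not shown here.
FALLBACK_CHAIN = ["mistral-small-3.1", "llama-3.3", "gemma-3", "deepseek-r1"]


def ask_with_fallback(
    question: str,
    call_model: Callable[[str, str], str],
    models: Sequence[str] = FALLBACK_CHAIN,
) -> Tuple[str, str]:
    """Try each model in order; return (model, answer) from the first success."""
    errors: List[str] = []
    for model in models:
        try:
            answer = call_model(model, question)
        except Exception as exc:  # e.g. timeouts, rate limits, HTTP errors
            errors.append(f"{model}: {exc}")
            continue
        if answer and answer.strip():
            return model, answer
        # An empty reply is treated as a failure so the next model gets a turn.
        errors.append(f"{model}: empty response")
    raise RuntimeError("All models failed: " + "; ".join(errors))
```

Passing `call_model` in as a parameter keeps the retry logic independent of any particular HTTP client, which also makes it easy to exercise with a fake model in tests.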

🧠 System Architecture (High Level)

  1. URL Ingestion

    • User provides up to 3 financial news URLs.
    • Content is fetched and cleaned into plain text.
  2. Chunking

    • Text is split using RecursiveCharacterTextSplitter.
    • chunk_size = 600 and chunk_overlap = 100 (measured in characters by default), so chunks stay semantically coherent and small enough for the embedding API.
  3. Embedding

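    • Each chunk is embedded using the Google Gemini embedding model (models/gemini-embedding-001).
    • Produces 768-dimensional dense vectors (the model's default output size is 3072, so this assumes the output dimensionality is set to 768).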
  4. Vector Store

    • Vectors are stored in a local FAISS index (faiss_index/).
    • Metadata (source URLs) is persisted alongside vectors.
  5. Query Flow

    • User question is embedded with the same Gemini model.
    • FAISS retrieves top-k (default = 4) most similar chunks.
    • Retrieved chunks are passed to the LLM for answer generation.
  6. LLM Inference

    • Queries routed via OpenRouter with a fallback chain:
      • Mistral Small 3.1 → LLaMA 3.3 → Gemma 3 → DeepSeek R1
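
Steps 2 to 5 above can be illustrated with a small, dependency-free sketch. It is a simplification under stated assumptions: `split_with_overlap` uses plain fixed-size character windows (RecursiveCharacterTextSplitter additionally prefers to break on separators), `toy_embed` is a hypothetical stand-in for the Gemini embedding call, and the brute-force scan plays the role of a FAISS Flat L2 index.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Entry = Tuple[Vector, str, str]  # (embedding, chunk text, source URL)


def split_with_overlap(text: str, chunk_size: int = 600, chunk_overlap: int = 100) -> List[str]:
    """Fixed-size character windows where consecutive chunks share chunk_overlap chars."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in for a real embedding model: letter frequencies (26 dims)."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]


def l2_distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(
    docs: Sequence[Tuple[str, str]],
    embed: Callable[[str], Vector] = toy_embed,
    chunk_size: int = 600,
    chunk_overlap: int = 100,
) -> List[Entry]:
    """docs is a list of (source_url, text); the URL is kept next to every vector."""
    index: List[Entry] = []
    for url, text in docs:
        for chunk in split_with_overlap(text, chunk_size, chunk_overlap):
            index.append((embed(chunk), chunk, url))
    return index


def retrieve(
    index: Sequence[Entry],
    question: str,
    k: int = 4,
    embed: Callable[[str], Vector] = toy_embed,
) -> List[Tuple[str, str]]:
    """Exact L2 search: score every stored vector and keep the k closest chunks."""
    q = embed(question)
    ranked = sorted(index, key=lambda entry: l2_distance(q, entry[0]))
    return [(chunk, url) for _, chunk, url in ranked[:k]]
```

The `(chunk, url)` pairs returned by `retrieve` are what would be handed to the LLM, which is how each answer can cite the articles it drew from.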

🛠️ Tech Stack

| Layer | Technology |
|---|---|
| UI | Streamlit (dark-themed UI) |
| Web Scraping | LangChain WebBaseLoader + BeautifulSoup |
| Text Chunking | RecursiveCharacterTextSplitter |
| Embeddings | Google Gemini gemini-embedding-001 |
| Vector Store | FAISS (CPU, Flat L2 index) |
| RAG Orchestration | LangChain RetrievalQAWithSourcesChain |
| LLM Gateway | OpenRouter |
| Environment | Python, python-dotenv |

⚙️ Installation

  1. Clone this repository:
    git clone https://github.com/Ankit-0803/FinSight-AI.git
  2. Navigate to the project directory:
    cd FinSight-AI
  3. Install dependencies:
    pip install -r requirements.txt
  4. Configure environment variables in a .env file:
    GOOGLE_API_KEY=your_gemini_api_key_here
    OPENROUTER_API_KEY=your_openrouter_api_key_here
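
A quick way to confirm both keys are visible before launching the app is sketched below. It assumes the keys reach the process environment, for example through python-dotenv's `load_dotenv()`, which is an assumption about how the app reads them. The helper name `missing_keys` is hypothetical.

```python
import os
from typing import List, Mapping

# The two variables named in the .env step above.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "OPENROUTER_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required keys that are absent or blank in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required API keys are set.")
```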
    
    

🧩 Usage

  1. Launch the Streamlit application: streamlit run main.py
  2. Enter up to 3 financial news article URLs (or upload files containing financial articles).
  3. Click “Process Articles” to extract text, create embeddings, and build the FAISS index.
  4. Ask questions about market trends, company performance, or financial insights. FinSight AI answers with references to the source articles.
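
Since the app accepts at most 3 URLs, input could be sanitized before scraping along these lines. This is an illustrative sketch only: the function name `clean_urls` and its exact rules (trimming, de-duplication, http/https only) are assumptions, not the project's actual validation.

```python
from typing import Iterable, List
from urllib.parse import urlparse

MAX_URLS = 3  # limit stated in this README


def clean_urls(raw_urls: Iterable[str], max_urls: int = MAX_URLS) -> List[str]:
    """Trim whitespace, skip blanks and duplicates, and require http(s) URLs."""
    cleaned: List[str] = []
    for raw in raw_urls:
        url = raw.strip()
        if not url or url in cleaned:
            continue  # ignore empty input boxes and repeated URLs
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not a valid http(s) URL: {url!r}")
        cleaned.append(url)
    if len(cleaned) > max_urls:
        raise ValueError(f"At most {max_urls} URLs are supported, got {len(cleaned)}")
    return cleaned
```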
