📊 FinSight AI — Intelligent Equity Research Assistant

FinSight AI is an AI-powered equity research assistant that automates financial news analysis using a Retrieval-Augmented Generation (RAG) pipeline.
It allows users to input up to 3 financial news article URLs, processes them into a semantic vector index, and answers natural-language questions with grounded insights and source citations.

The system is designed for accuracy, low hallucination risk, and fault tolerance: answers are grounded in retrieved article text, and LLM calls fall back across multiple models.

🚀 Key Capabilities

Financial News Scraping
- Extracts clean article text from public financial news URLs using LangChain loaders and BeautifulSoup.
- Preserves source URLs for citation and traceability.
RAG-Based Question Answering
- Uses semantic search to retrieve the most relevant article chunks.
- The LLM is given only the retrieved chunks as context, so answers stay grounded in the article content.
Semantic Vector Search (FAISS)
- Embeddings are stored locally in a FAISS index for fast in-memory similarity search.
- Uses exact (exhaustive) L2 nearest-neighbor search, which gives accurate retrieval and is cheap on small datasets.
Multi-Model LLM Fallback
- Uses OpenRouter to route queries across multiple free LLMs.
- If a model fails or times out, the query automatically falls back to the next model in the chain.
Source-Backed Answers
- Every answer includes links to the original articles used.
- Optionally shows the model's reasoning for transparency.
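
The grounding and citation behavior described above can be illustrated with a small, dependency-free sketch. It is not the project's actual prompt (that is handled by the LangChain chain listed in the Tech Stack); the function names `build_grounded_prompt` and `unique_sources` and the chunk dictionary layout are assumptions made for illustration.

```python
def build_grounded_prompt(question, chunks):
    """Build a prompt that restricts the model to the retrieved context.

    `chunks` is assumed to be a list of dicts with 'text' and 'source' keys.
    """
    context = "\n\n".join(
        f"[{i}] {chunk['text']}\n(source: {chunk['source']})"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def unique_sources(chunks):
    """Return source URLs in first-seen order, without duplicates."""
    # dict.fromkeys keeps insertion order while dropping repeats.
    return list(dict.fromkeys(chunk["source"] for chunk in chunks))


if __name__ == "__main__":
    retrieved = [
        {"text": "Revenue grew 12% year over year.", "source": "https://example.com/a"},
        {"text": "Margins narrowed in Q3.", "source": "https://example.com/b"},
        {"text": "Guidance was raised.", "source": "https://example.com/a"},
    ]
    print(build_grounded_prompt("How did revenue change?", retrieved))
    print(unique_sources(retrieved))
```

Deduplicating sources matters because several retrieved chunks often come from the same article, and the answer should list each link once.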

🧠 System Architecture (High Level)

URL Ingestion
- User provides up to 3 financial news URLs.
- Content is fetched and cleaned into plain text.
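
As a rough illustration of the ingestion step, the sketch below validates the URL list and reduces HTML to plain text using only the standard library. The project itself uses LangChain's `WebBaseLoader` with BeautifulSoup, so this is a stand-in, not the actual implementation; `validate_urls` and `html_to_text` are hypothetical helper names.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

MAX_URLS = 3  # limit stated in this README


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping script/style/noscript contents."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def validate_urls(urls):
    """Strip blanks and enforce the URL count and http(s) scheme."""
    cleaned = [u.strip() for u in urls if u and u.strip()]
    if not cleaned:
        raise ValueError("Provide at least one URL.")
    if len(cleaned) > MAX_URLS:
        raise ValueError(f"At most {MAX_URLS} URLs are supported.")
    for url in cleaned:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not a valid http(s) URL: {url}")
    return cleaned
```

Validating before fetching lets the UI reject bad input early, before any network or embedding calls are made.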
Chunking
- Text is split using RecursiveCharacterTextSplitter.
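- chunk_size = 600 and chunk_overlap = 100 (both in characters): the overlap preserves context across chunk boundaries, and the small chunk size keeps each embedding request within API input limits.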
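
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window splitter. It is deliberately simpler than `RecursiveCharacterTextSplitter`, which prefers to break at paragraph, line, and word separators; the function name `split_with_overlap` is hypothetical and only the two parameter values come from this README.

```python
def split_with_overlap(text, chunk_size=600, chunk_overlap=100):
    """Split text into fixed-size character windows that overlap.

    Each chunk starts (chunk_size - chunk_overlap) characters after the
    previous one, so consecutive chunks share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "".join(str(i % 10) for i in range(1300))
    pieces = split_with_overlap(sample)
    print([len(p) for p in pieces])
```

With the default settings, a 1300-character text yields windows starting at offsets 0, 500, and 1000, and each chunk repeats the last 100 characters of the one before it.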
Embedding
- Each chunk is embedded using Google Gemini Embedding Model (models/gemini-embedding-001).
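- Produces 768-dimensional dense vectors. Note that `gemini-embedding-001` returns larger vectors by default, so this assumes a reduced output dimensionality is requested.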
Vector Store
- Vectors are stored in a local FAISS index (faiss_index/).
- Metadata (source URLs) is persisted alongside vectors.
Query Flow
- User question is embedded with the same Gemini model.
- FAISS retrieves top-k (default = 4) most similar chunks.
- Retrieved chunks are passed to the LLM for answer generation.
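
The retrieval step amounts to an exhaustive nearest-neighbor search, which can be sketched in plain Python. This is an illustration of what a flat L2 index computes, not FAISS code; the `top_k` function and the `(vector, metadata)` pair layout are assumptions, and the toy vectors are 2-dimensional rather than real embeddings.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_vec, index, k=4):
    """Return the k closest (distance, metadata) pairs, nearest first.

    `index` is assumed to be a list of (vector, metadata) pairs. Every
    stored vector is compared with the query, so the result is exact.
    """
    scored = ((squared_l2(query_vec, vec), meta) for vec, meta in index)
    return heapq.nsmallest(k, scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    toy_index = [
        ([0.0, 0.0], {"source": "https://example.com/a"}),
        ([1.0, 0.0], {"source": "https://example.com/b"}),
        ([5.0, 5.0], {"source": "https://example.com/c"}),
    ]
    for distance, meta in top_k([0.9, 0.0], toy_index, k=2):
        print(round(distance, 3), meta["source"])
```

Ranking by squared distance gives the same order as ranking by true L2 distance, which is why the square root can be skipped. Exhaustive search is linear in the number of chunks, which is acceptable here because at most 3 articles are indexed.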
LLM Inference
- Queries routed via OpenRouter with a fallback chain:
  - Mistral Small 3.1 → LLaMA 3.3 → Gemma 3 → DeepSeek R1
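
The fallback chain can be sketched as a loop that tries each model in order and moves on when a call raises. The `call_model` callable is a hypothetical stand-in for whatever function sends the request through OpenRouter, and the strings below are the display names from this README, not OpenRouter model identifiers.

```python
# Display names from this README, used here as placeholders only.
FALLBACK_CHAIN = ["Mistral Small 3.1", "LLaMA 3.3", "Gemma 3", "DeepSeek R1"]


def ask_with_fallback(prompt, call_model, models=FALLBACK_CHAIN):
    """Try each model in order and return (model, answer) from the first success.

    `call_model(model, prompt)` is assumed to return the answer text and to
    raise an exception on failure or timeout.
    """
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # any failure moves on to the next model
            errors.append(f"{model}: {exc}")
    raise RuntimeError("All models failed: " + "; ".join(errors))


if __name__ == "__main__":
    def flaky(model, prompt):
        # Simulate the first two models timing out.
        if model in ("Mistral Small 3.1", "LLaMA 3.3"):
            raise TimeoutError("timed out")
        return f"answer from {model}"

    print(ask_with_fallback("What moved the stock?", flaky))
```

Collecting the per-model errors makes the final exception useful for debugging when every model in the chain is unavailable.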

🛠️ Tech Stack

| Layer | Technology |
| --- | --- |
| UI | Streamlit (dark-themed UI) |
| Web Scraping | LangChain `WebBaseLoader` + BeautifulSoup |
| Text Chunking | `RecursiveCharacterTextSplitter` |
| Embeddings | Google Gemini `gemini-embedding-001` |
| Vector Store | FAISS (CPU, Flat L2 index) |
| RAG Orchestration | LangChain `RetrievalQAWithSourcesChain` |
| LLM Gateway | OpenRouter |
| Environment | Python, `python-dotenv` |

⚙️ Installation

Clone this repository:

git clone https://github.com/yourusername/finsight-ai.git

Navigate to the project directory:
```
cd finsight-ai
```
Install dependencies:
```
pip install -r requirements.txt
```

Configure environment variables in a .env file:

GOOGLE_API_KEY=your_gemini_api_key_here
OPENROUTER_API_KEY=your_openrouter_api_key_here
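
A startup check along the following lines can catch a missing or empty key before any API call is made. It is a sketch that assumes the `.env` values have already been loaded into the process environment (for example by `python-dotenv`); the helper name `missing_keys` is hypothetical.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "OPENROUTER_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are absent or blank in `env`.

    Defaults to the process environment when no mapping is passed.
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required API keys are set.")
```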

🧩 Usage

Launch the Streamlit application: `streamlit run main.py`
Enter up to 3 financial news article URLs.
Click “Process Articles” to extract text, create embeddings, and build the FAISS index.
Ask questions about market trends, company performance, or financial insights. FinSight AI answers with source references.
