An intelligent automation platform designed to enhance SOC Level 1 analyst operations using advanced AI and vector search capabilities.
This project combines the Llama 3.1 language model, FAISS vector search, and Retrieval-Augmented Generation (RAG) to automate routine SOC Level 1 analyst tasks such as log analysis and categorization, with the goal of improving efficiency, accuracy, and response times in security operations.
- Automated log analysis and categorization
- Real-time threat detection with reduced latency
- Interactive dashboard for log visualization
- AI-powered insights using Llama 3.1
- Efficient vector search using FAISS
- Advanced text preprocessing with BERT-based embeddings
The system is built as a pipeline of five components:
- Log Ingestion Interface
  - User-friendly interface for log file uploads
  - Supports multiple log file formats: .csv, .log, .txt, .md
  - Streamlined data input process
- Log Processing Engine
  - Intelligent log chunking for optimal processing
  - BERT-based text embeddings using Nomic Embed Text
  - Advanced preprocessing for enhanced accuracy
- Vector Storage and Search
  - FAISS-powered vector database
  - Efficient similarity search capabilities
  - Optimized for large-scale log data
- AI Analysis Engine
  - Llama 3.1 integration
  - Retrieval-Augmented Generation (RAG) implementation
  - Contextual understanding of security events
- Visualization Dashboard
  - Real-time log categorization
  - Interactive metrics and statistics
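To make the flow through these components concrete, here is a minimal, dependency-free sketch of the chunk, embed, search, and prompt steps. It is not the project's actual code: the bag-of-words `embed` function stands in for the Nomic embedding model, the brute-force `retrieve` function stands in for a FAISS index lookup, and every function name, the prompt wording, and the sample log lines are made up for illustration.

```python
import math
import re
from collections import Counter


def chunk_lines(log_text, lines_per_chunk=3):
    """Split raw log text into chunks of consecutive non-empty lines."""
    lines = [ln for ln in log_text.splitlines() if ln.strip()]
    return ["\n".join(lines[i:i + lines_per_chunk])
            for i in range(0, len(lines), lines_per_chunk)]


def embed(text):
    """Toy bag-of-words vector (assumption: stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Brute-force top-k search (assumption: stand-in for a FAISS index)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Assemble a RAG-style prompt: retrieved log chunks first, then the question."""
    context = "\n---\n".join(context_chunks)
    return f"Context logs:\n{context}\n\nQuestion: {query}\nAnswer:"


# Made-up sample log lines, for illustration only.
SAMPLE_LOG = """\
2024-01-01 10:00:01 INFO service started
2024-01-01 10:00:05 INFO user alice logged in
2024-01-01 10:02:11 WARN failed login for user root from 10.0.0.7
2024-01-01 10:02:13 WARN failed login for user root from 10.0.0.7
2024-01-01 10:05:00 INFO backup completed
2024-01-01 10:06:00 INFO disk usage 41 percent
"""

chunks = chunk_lines(SAMPLE_LOG, lines_per_chunk=2)
top = retrieve("failed login root", chunks, k=1)
prompt = build_prompt("Were there failed login attempts?", top)
```

In a real deployment the resulting `prompt` would be sent to the language model. The point of the retrieval step is that only the chunks most similar to the question reach the model, which keeps the prompt short even when the log file is large.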
# Clone the repository
git clone https://github.com/Sai-Chakradhar-Mahendrakar/SOC-Analyst-Automation-using-RAG-Model.git
# Create and activate virtual environment
python -m venv env_name
source env_name/bin/activate # On Windows: env_name\Scripts\activate
# Install dependencies
pip install -r requirements.txt
- Start the application: `python app.py`
- Access the web interface at http://localhost:8000
- Upload log files through the interface
- Submit a query and get a response
- View results in the dashboard
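Before uploading, it can be handy to check that a file uses one of the supported extensions (.csv, .log, .txt, .md). The snippet below is a small sketch of such a check; `is_supported_log` is a hypothetical helper and not part of this project's code.

```python
from pathlib import Path

# Extensions taken from the supported formats listed in this README.
SUPPORTED_EXTENSIONS = {".csv", ".log", ".txt", ".md"}


def is_supported_log(filename):
    """Return True if the file name ends with a supported extension.

    Hypothetical helper for illustration; the comparison is case-insensitive.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_log("log4.log")` returns `True`, while `is_supported_log("report.pdf")` returns `False`.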
- Python 3.8+
- Llama 3.1
- FAISS
- Nomic Embed Text
- BERT Transformers
- React + Vite
- FastAPI
- Additional requirements listed in `requirements.txt`
soc-automation/
├── FrontEnd/
│   ├── public/
│   ├── src/
│   │   ├── assets
│   │   ├── Components
│   │   ├── App.css
│   │   ├── App.jsx
│   │   ├── index.css
│   │   └── main.jsx
│   ├── index.html
│   └── ReadMe.md
│
├── Backend/
│   ├── data/
│   │   └── Windows_2k.log_structured.csv
│   ├── logs/
│   │   ├── log4.log
│   │   ├── logs.md
│   │   ├── logs1.md
│   │   └── logs2.md
│   ├── src/
│   │   ├── api
│   │   ├── chains
│   │   ├── loaders
│   │   ├── utils
│   │   └── main.py
│   ├── uploads/
│   │   ├── Windows_2k.log_structured.csv
│   │   └── logs2.md
│   └── requirement.txt
│
├── Images/
└── README.md
- Fork the repository
- Create a feature branch (`git checkout -b feature/new-feature`)
- Commit changes (`git commit -am 'Add new feature'`)
- Push to branch (`git push origin feature/new-feature`)
- Create Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Llama team for their excellent language model
- FAISS team for the vector similarity search engine
- Nomic for their text embedding model
For questions and support, please open an issue in the GitHub repository or contact the maintainers.