Active

🦠 AI-Based Malware Detector

Machine learning-powered malware detection system that analyses PCAP network captures to identify suspicious traffic patterns and flag potential malware communication. Built with Python and ML classification models.

Python, ML, PCAP Analysis, Malware Detection

Project Overview

Traditional signature-based antivirus solutions often fail to detect zero-day malware and highly sophisticated polymorphic threats. This project addresses that gap by shifting detection from static file signatures to dynamic network behavior.

Given raw `.pcap` files, the AI Malware Detector extracts and normalizes network flow telemetry (such as packet sizes, inter-arrival times, protocol ratios, and TLS handshake metrics), then uses machine learning classification models to distinguish benign user traffic from malicious Command & Control (C2) beaconing.
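To make the flow telemetry concrete, here is a minimal sketch of how per-flow features such as packet sizes, inter-arrival times, and protocol ratios could be computed. It is not the project's actual extractor: the `flow_features` function, the `(timestamp, size, protocol)` tuple layout, and the feature names are all assumptions chosen for illustration, and TLS handshake metrics and normalization are left out.

```python
from collections import Counter
from statistics import mean, pstdev


def flow_features(packets):
    """Summarize one non-empty flow.

    `packets` is a list of (timestamp_seconds, size_bytes, protocol) tuples.
    This layout is a hypothetical stand-in for parsed PCAP records.
    """
    packets = sorted(packets)  # tuples sort by timestamp first
    times = [t for t, _, _ in packets]
    sizes = [s for _, s, _ in packets]

    # Inter-arrival times: gaps between consecutive packets in the flow.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]

    features = {
        "packet_count": len(packets),
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),
        "mean_iat": mean(gaps) if gaps else 0.0,
        "std_iat": pstdev(gaps) if gaps else 0.0,
    }

    # Protocol ratios: share of the flow's packets seen per protocol.
    for proto, count in Counter(p for _, _, p in packets).items():
        features[f"ratio_{proto}"] = count / len(packets)
    return features


# Hypothetical beacon-like flow: identical packet size, fixed 60 s interval.
beacon = [(0.0, 120, "tcp"), (60.0, 120, "tcp"), (120.0, 120, "tcp"), (180.0, 120, "tcp")]
features = flow_features(beacon)
print(features)
```

A flow with a near-constant inter-arrival time (a small `std_iat` relative to `mean_iat`) is the kind of regularity that periodic C2 beaconing tends to produce, which is why timing statistics are useful inputs for a classifier.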

Technical Implementation

The core data pipeline is written in Python, using `scapy` and `tshark` bindings to break PCAP files down into structured flow datasets. Pandas and Scikit-learn are used to curate the feature matrix and to train several classifiers: Random Forest, Support Vector Machines (SVM), and gradient-boosted trees (XGBoost).
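The training and comparison step can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's training code: the feature matrix is synthetic data from `make_classification` rather than real flow features, the hyperparameters are arbitrary, and Scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the example needs only one library.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the flow feature matrix (NOT the project's data).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # SVMs are sensitive to feature scale, so standardize inside a pipeline.
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    # Stand-in for XGBoost (assumption made to avoid an extra dependency).
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model on the same split and record its F1-score on held-out data.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: F1 = {score:.3f}")
```

Evaluating every model on the same held-out split keeps the comparison fair. Because XGBoost's `XGBClassifier` follows the same `fit`/`predict` interface, it should be usable in place of the stand-in if that library is installed.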

The model was trained on a combined dataset containing millions of packets: real-world botnet traffic from the Stratosphere IPS dataset, mixed with regular enterprise network captures. The resulting ensemble model achieves a high F1-score with an exceptionally low false positive rate, which makes it viable as a supplementary SOC detection layer.
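For reference, the two metrics mentioned above can be computed from a confusion matrix as in the sketch below. The `detection_metrics` helper and the label vectors are hypothetical and exist only to show the arithmetic. They are not results from this project.

```python
def detection_metrics(y_true, y_pred):
    """Precision, recall, F1 and false positive rate for binary labels (1 = malicious)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    # False positive rate: share of benign flows wrongly flagged as malicious.
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


# Made-up labels for illustration: 3 malicious flows, 7 benign flows.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

In a SOC setting benign flows vastly outnumber malicious ones, so even a small false positive rate can translate into many alerts. That is why the false positive rate is worth reporting alongside F1.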

Key Features / Findings

Parses complex PCAP captures into standardized CSV datasets automatically.
Ensemble machine learning approach that compares Random Forest (RF), SVM, and XGBoost performance.
Focuses on behavioral network anomalies rather than static file hashes.
Provides a modular Python CLI for easy integration into existing SOC pipelines.
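As a rough idea of what such a command-line entry point could look like, here is a sketch built on the standard library's `argparse`. The program name, the positional `pcap` argument, and the `--model`, `--output`, and `--threshold` options are all hypothetical. The project's real flags may differ.

```python
import argparse


def build_parser():
    """Build an illustrative argument parser (all option names are assumptions)."""
    parser = argparse.ArgumentParser(
        prog="malware-detector",
        description="Classify network flows in a PCAP capture (illustrative sketch).",
    )
    parser.add_argument("pcap", help="path to the .pcap file to analyze")
    parser.add_argument(
        "--model",
        choices=["rf", "svm", "xgboost"],
        default="rf",
        help="which trained classifier to apply",
    )
    parser.add_argument(
        "--output",
        default="flows.csv",
        help="where to write the extracted flow features as CSV",
    )
    parser.add_argument(
        "--threshold",
        type=float,
        default=0.5,
        help="score above which a flow is flagged as malicious",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["capture.pcap", "--model", "svm"])
print(args.pcap, args.model, args.output, args.threshold)
```

Keeping the parser in its own function makes it easy to test and to reuse from a wrapper script, which helps when the tool is called from an existing SOC pipeline.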