UPISheild
Inspiration
Cyber attacks today are automated, polymorphic, and increasingly AI-assisted. However, many defensive systems still depend heavily on static signatures and predefined rules. This reactive model creates a structural gap: attackers evolve faster than defenses.
We asked a forward-looking question:
What if a cybersecurity system could continuously learn system behavior and detect threats based on deviation rather than known signatures?
This project is a lightweight, AI-driven behavioral anomaly detection engine designed to predict malicious activity in real time.
Project Overview
We developed an adaptive cyber defense system that:
- Monitors real-time system and network activity
- Learns baseline behavioral patterns
- Detects anomalous deviations
- Computes dynamic risk scores
- Triggers intelligent alerts
Each system state is represented as a feature vector:
$$ X = (x_1, x_2, x_3, \ldots, x_n) $$
Where features include:
- Network request frequency
- Port access distribution
- File access entropy
- CPU usage variance
- Authentication attempt patterns
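As a rough illustration of the feature vector above, one observation window could be held in a small record and flattened into \(X\). The class name `SystemState`, the field names, and the units in the comments are hypothetical, chosen only to mirror the listed features.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class SystemState:
    """One observation window; field names are illustrative assumptions."""
    request_frequency: float    # network requests per second (assumed unit)
    port_entropy: float         # summary of the port access distribution (assumed)
    file_access_entropy: float  # Shannon entropy of file accesses
    cpu_variance: float         # variance of CPU usage within the window
    auth_attempt_rate: float    # authentication attempts per minute (assumed unit)

    def to_vector(self) -> list:
        """Flatten the record into the feature vector X = (x_1, ..., x_n)."""
        return [float(value) for value in astuple(self)]


state = SystemState(12.0, 1.5, 2.0, 0.3, 4.0)
x = state.to_vector()
```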
Mathematical Framework
We model threat probability using Bayesian inference:
$$ P(\text{Threat} \mid X) = \frac{P(X \mid \text{Threat}) \, P(\text{Threat})}{P(X)} $$
If:
$$ P(\text{Threat} \mid X) > \tau $$
where \(\tau\) is a dynamic threshold, the system flags the activity as suspicious.
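A minimal sketch of this rule, assuming a two-class model (threat versus benign) so that \(P(X)\) can be expanded with the law of total probability. The function names and the example likelihoods are assumptions for illustration, not values from the project.

```python
def threat_posterior(likelihood_threat: float,
                     likelihood_benign: float,
                     prior_threat: float) -> float:
    """Return P(Threat | X) using Bayes' rule for a threat/benign model."""
    # P(X) = P(X | Threat) P(Threat) + P(X | Benign) P(Benign)
    evidence = (likelihood_threat * prior_threat
                + likelihood_benign * (1.0 - prior_threat))
    if evidence == 0.0:
        raise ValueError("P(X) is zero under both classes")
    return likelihood_threat * prior_threat / evidence


def is_suspicious(posterior: float, tau: float) -> bool:
    """Flag the activity when the posterior exceeds the threshold tau."""
    return posterior > tau


# Assumed numbers: the observation is 9x more likely under attack,
# but attacks have a 1% prior.
posterior = threat_posterior(0.9, 0.1, 0.01)
flagged = is_suspicious(posterior, tau=0.05)
```

With a low prior, even a strongly attack-like observation yields a modest posterior, which is one reason the threshold is kept adjustable rather than fixed.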
We combine multiple anomaly metrics into a unified risk score:
$$ \text{Risk Score} = \alpha \cdot IF(X) + \beta \cdot B(X) $$
Where:
- \(IF(X)\) = Isolation Forest anomaly score
- \(B(X)\) = Bayesian threat probability
- \(\alpha, \beta\) = adaptive weighting coefficients
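The weighted combination can be sketched as a one-line function. This assumes both inputs are already scaled to the range 0 to 1, and the default weights are placeholders rather than the project's tuned coefficients.

```python
def risk_score(if_score: float, bayes_prob: float,
               alpha: float = 0.5, beta: float = 0.5) -> float:
    """Combine the two anomaly signals: alpha * IF(X) + beta * B(X)."""
    return alpha * if_score + beta * bayes_prob


# Assumed inputs: a fairly abnormal Isolation Forest score and a low posterior.
score = risk_score(if_score=0.8, bayes_prob=0.1, alpha=0.6, beta=0.4)
```

Choosing weights that sum to 1 keeps the combined score in the same 0 to 1 range as its inputs, which makes a single threshold easier to interpret.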
Long-term trust modeling is expressed as:
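$$ \text{Trust}(t) = e^{-\lambda \cdot \text{Risk}(t)} $$
Here \(\lambda > 0\) is a decay rate and \(\text{Risk}(t)\) is the cumulative risk up to time \(t\), so trust decays exponentially as cumulative risk increases.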
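A small sketch of the trust function, assuming risk is accumulated as a non-negative number and that the decay rate is a tunable constant. The default of 0.5 is an arbitrary placeholder.

```python
import math


def trust(cumulative_risk: float, decay_rate: float = 0.5) -> float:
    """Return exp(-decay_rate * cumulative_risk), a value in (0, 1]."""
    if cumulative_risk < 0 or decay_rate <= 0:
        raise ValueError("cumulative_risk must be >= 0 and decay_rate > 0")
    return math.exp(-decay_rate * cumulative_risk)


# Trust starts at 1.0 with no accumulated risk and falls as risk grows.
levels = [trust(r) for r in (0.0, 1.0, 2.0, 4.0)]
```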
System Architecture
Data Collection Layer
- Real-time log ingestion
- Network traffic metadata parsing
- Feature normalization
Feature Engineering
We extracted:
- Statistical deviation metrics
- Temporal frequency shifts
- Shannon entropy of file access:
$$ H = -\sum_{i=1}^{n} p_i \log p_i $$
- Behavioral drift scores
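The entropy feature above can be computed directly from a list of observed file paths. This sketch assumes base-2 logarithms (entropy in bits), which the formula leaves unspecified, and the example paths are made up.

```python
import math
from collections import Counter


def shannon_entropy(events) -> float:
    """Return the Shannon entropy, in bits, of the empirical distribution of events."""
    counts = Counter(events)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


# Repeated access to one file gives zero entropy; accesses spread evenly
# over many files give high entropy.
focused = shannon_entropy(["/etc/passwd"] * 4)
scattered = shannon_entropy(["/a", "/b", "/c", "/d"])
```

A sudden jump in this value can indicate a process touching many unrelated files, while a sudden drop can indicate repeated access to a single target.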
Model Layer
- Isolation Forest (unsupervised anomaly detection)
- Bayesian probabilistic modeling
- Adaptive threshold optimization
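The Isolation Forest step might look like the following sketch built on scikit-learn's `IsolationForest`. The synthetic baseline, the five-feature shape, and the `anomaly_score` helper are assumptions for illustration. The project's real training data and hyperparameters are not described here.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic "normal" behavior: 500 windows with 5 features each (assumed shape).
rng = np.random.default_rng(0)
baseline = rng.normal(loc=0.0, scale=1.0, size=(500, 5))

model = IsolationForest(n_estimators=100, random_state=0)
model.fit(baseline)


def anomaly_score(fitted_model, samples) -> np.ndarray:
    """Return scores where higher means more anomalous.

    score_samples() returns the negated anomaly score, so it is negated
    again here to get a value that grows with abnormality.
    """
    return -fitted_model.score_samples(np.asarray(samples, dtype=float))


typical = anomaly_score(model, [[0.0, 0.0, 0.0, 0.0, 0.0]])[0]
outlier = anomaly_score(model, [[10.0, 10.0, 10.0, 10.0, 10.0]])[0]
```

Because the model is fit only on baseline behavior, it needs no labeled attack data.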
Response Engine
When the risk score exceeds the threshold:
- Event is logged
- Alert is generated
- Mitigation suggestions are provided
- Optional IP auto-blocking is triggered
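The response steps above can be sketched as a small handler. The class name, the alert fields, and the suggestion text are hypothetical, and the "block" here only records the address in a set. A real deployment would call a firewall or similar control instead.

```python
import logging
from dataclasses import dataclass, field
from typing import Optional

logger = logging.getLogger("response_engine")


@dataclass
class ResponseEngine:
    threshold: float
    auto_block: bool = False
    blocked_ips: set = field(default_factory=set)

    def handle(self, source_ip: str, risk: float) -> Optional[dict]:
        """Return an alert when risk exceeds the threshold, otherwise None."""
        if risk <= self.threshold:
            return None
        # 1. Log the event.
        logger.warning("risk %.2f from %s exceeds threshold %.2f",
                       risk, source_ip, self.threshold)
        # 2. Build the alert, including 3. a mitigation suggestion.
        alert = {
            "source_ip": source_ip,
            "risk": risk,
            "suggestion": "Review recent activity from this source",
            "blocked": False,
        }
        # 4. Optionally block the source (placeholder: record it only).
        if self.auto_block:
            self.blocked_ips.add(source_ip)
            alert["blocked"] = True
        return alert


engine = ResponseEngine(threshold=0.7, auto_block=True)
alert = engine.handle("10.0.0.5", risk=0.9)
```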
Technology Stack
- Python
- Scikit-learn
- NumPy
- Pandas
- Real-time socket monitoring
- Modular logging engine
The system is:
- Lightweight
- Modular
- Deployable locally
- Extensible for enterprise-scale deployment
Challenges Faced
1. Limited Labeled Attack Data
High-quality labeled cybersecurity datasets are scarce.
We addressed this using:
- Simulated attack injection
- Synthetic anomaly generation
- Unsupervised learning models
2. False Positives
Anomaly detection systems are sensitive by nature.
To reduce false alarms, we implemented:
- Rolling behavioral baselines
- Context-aware detection windows
- Adaptive threshold tuning
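One way to picture a rolling baseline with an adaptive threshold is a sliding window whose cutoff is the window mean plus a multiple of its standard deviation. That rule, the class name, and the default parameters are assumptions for this sketch, not necessarily the tuning method the project uses.

```python
from collections import deque
from statistics import fmean, pstdev


class RollingBaseline:
    """Sliding-window baseline with a mean + k * std threshold (assumed rule)."""

    def __init__(self, window: int = 50, k: float = 3.0, min_samples: int = 10):
        self.values = deque(maxlen=window)
        self.k = k
        self.min_samples = min_samples

    def threshold(self):
        """Return the current cutoff, or None while the window is warming up."""
        if len(self.values) < self.min_samples:
            return None
        return fmean(self.values) + self.k * pstdev(self.values)

    def update(self, value: float) -> bool:
        """Return True if value exceeds the current threshold."""
        limit = self.threshold()
        flagged = limit is not None and value > limit
        # Flagged values are kept out of the baseline so that an attack
        # does not raise its own threshold.
        if not flagged:
            self.values.append(value)
        return flagged


baseline = RollingBaseline(window=20, k=3.0, min_samples=5)
normal_flags = [baseline.update(v) for v in (10, 11, 9, 10, 12, 10, 11, 9)]
spike_flag = baseline.update(100)
```

Because the threshold follows recent behavior, a gradual and legitimate change in load moves the cutoff with it instead of producing a stream of alerts.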
3. Real-Time Constraints
Live threat detection requires computational efficiency.
Optimizations included:
- Vectorized numerical operations
- Batch inference windows
- Lightweight model selection
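To illustrate vectorized batch inference, the sketch below scores a whole window of feature vectors in one NumPy operation instead of looping over rows. The z-score rule, the function names, and the baseline statistics are assumptions chosen to keep the example small.

```python
import numpy as np


def batch_zscores(batch, mean, std) -> np.ndarray:
    """Return per-feature z-scores for every row of the batch at once."""
    batch = np.asarray(batch, dtype=float)
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    # Guard against zero variance so constant features do not divide by zero.
    safe_std = np.where(std == 0.0, 1.0, std)
    return (batch - mean) / safe_std


def flag_batch(batch, mean, std, limit: float = 3.0) -> np.ndarray:
    """Return one boolean per row: True if any feature deviates beyond limit."""
    return np.any(np.abs(batch_zscores(batch, mean, std)) > limit, axis=1)


# Assumed baseline statistics for two features, and a batch of three windows.
flags = flag_batch(
    [[0.5, 1.0], [4.0, 0.0], [0.0, -7.0]],
    mean=[0.0, 0.0],
    std=[1.0, 2.0],
)
```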
Key Learnings
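- Behavioral modeling scales better than signature-based detection, because it does not need a new rule for every attack variant.
- Risk scoring is more useful than binary classification in real-world security, because a graded score lets alerts be prioritized.
- Adaptive thresholds help reduce false positives compared with a fixed cutoff.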
- AI in cybersecurity must balance detection sensitivity and operational practicality.
Future Scope
This system can evolve into:
- Federated learning across endpoints
- Zero-trust architecture integration
- Cloud-native deployment
- Autonomous response agents
- SIEM platform integration
Conclusion
This project shifts cybersecurity from reactive detection to predictive defense.
Instead of asking:
Is this a known attack?
We evaluate:
Does this behavior statistically deviate from trusted patterns?
That shift enables next-generation adaptive cyber defense systems.