My Research Publications

Ext-NoSQL: A NoSQL Schematics through JSON

IEEE Paper: https://ieeexplore.ieee.org/abstract/document/10705262

This paper tackles the challenge of extracting descriptive schemas from JSON data collections, specifically targeting the discovery of tagged unions. A tagged union is a JSON Schema design pattern in which the value of one property dictates conditional subschemas for its sibling properties. We formalize these implications as conditional functional dependencies and encode them with JSON Schema operators such as if-then-else. Our heuristics are designed to avoid overfitting. Experiments with our prototype implementation show promising results, detecting tagged unions in real-world GeoJSON and TopoJSON datasets. Additionally, we explore potential extensions of our approach for future work.
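To make the idea of a tagged union concrete, here is a minimal sketch in Python. It is not the paper's algorithm or prototype: the function names, the coarse `shape` signature, and the `min_support` threshold (a simple stand-in for an overfitting guard) are all assumptions made for illustration. It looks for a string-valued property whose value functionally determines the structural shape of a sibling property, which is the kind of dependency an if-then-else subschema would express.

```python
from collections import Counter, defaultdict


def shape(value):
    """Return a coarse structural signature for a JSON value."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    if isinstance(value, list):
        inner = sorted({shape(v) for v in value})
        return "array<" + "|".join(inner) + ">"
    return "object"


def find_tagged_unions(objects, max_tag_values=10, min_support=2):
    """Return {(tag, sibling): {tag_value: sibling_shape}}.

    A pair is reported when every tag value maps to exactly one sibling
    shape and at least two different shapes occur. `min_support` is a
    hypothetical guard: each tag value must be seen that many times.
    """
    result = {}
    keys = set().union(*(o.keys() for o in objects))
    for tag in sorted(keys):
        # Candidate tags are always present and string-valued.
        if not all(isinstance(o.get(tag), str) for o in objects):
            continue
        counts = Counter(o[tag] for o in objects)
        if not 2 <= len(counts) <= max_tag_values:
            continue
        if min(counts.values()) < min_support:
            continue
        for sibling in sorted(keys - {tag}):
            seen = defaultdict(set)
            for o in objects:
                seen[o[tag]].add(shape(o[sibling]) if sibling in o else "absent")
            functional = all(len(shapes) == 1 for shapes in seen.values())
            distinct = {next(iter(shapes)) for shapes in seen.values()}
            if functional and len(distinct) > 1:
                result[(tag, sibling)] = {
                    value: next(iter(shapes)) for value, shapes in seen.items()
                }
    return result


# GeoJSON-like geometries: "type" acts as the tag for "coordinates".
sample = [
    {"type": "Point", "coordinates": [30.0, 10.0]},
    {"type": "Point", "coordinates": [1.5, 2.5]},
    {"type": "LineString", "coordinates": [[30.0, 10.0], [10.0, 30.0]]},
    {"type": "LineString", "coordinates": [[0.0, 0.0], [1.0, 1.0]]},
]
unions = find_tagged_unions(sample)
print(unions)
```

Each entry in the result corresponds to one candidate if/then branch: for example, if `type` is `"Point"`, then `coordinates` is an array of numbers.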

Enhancing Data Engineering Frameworks for Scalable Real-Time Marketing Solutions

Paper: https://www.ijrah.com/index.php/ijrah/article/view/562

This report discusses improvements to data engineering frameworks for cost-efficient real-time marketing environments, with applications in mobile advertising and smart city analytics. The proposed framework combines time-series data processing with map-reduce (M-R) systems to compensate for inefficiencies in executing temporal queries and processing real-time data streams. It enables effective behavioral targeting in mobile marketing and helps integrate real-time data from multiple domains for smart city use. A case study and experimental evidence illustrate further algorithmic refinements and quantify the gains in throughput, data reliability, and system effectiveness for real-time marketing and smart city applications.
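As a rough illustration of pairing time-series processing with a map-reduce style computation, the following Python sketch buckets timestamped events into fixed windows (map) and sums them per key (reduce). It is a simplified, single-process stand-in and not the framework from the paper; the event fields `ts`, `campaign`, and `clicks`, and both function names, are hypothetical.

```python
from collections import defaultdict


def map_event(event, window_seconds):
    """Map step: emit ((window_start, campaign), clicks) for one event."""
    window_start = event["ts"] - event["ts"] % window_seconds
    return (window_start, event["campaign"]), event["clicks"]


def reduce_counts(pairs):
    """Reduce step: sum the values that share the same key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Hypothetical click events with timestamps in seconds.
events = [
    {"ts": 0, "campaign": "a", "clicks": 1},
    {"ts": 30, "campaign": "a", "clicks": 2},
    {"ts": 61, "campaign": "a", "clicks": 4},
    {"ts": 65, "campaign": "b", "clicks": 1},
]
windowed = reduce_counts(map_event(e, 60) for e in events)
print(windowed)
```

Keying on the window start turns a temporal query such as "clicks per campaign per minute" into an ordinary group-by, which is the part a distributed map-reduce engine would parallelize.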

Enhanced Intrusion Detection in Big Data Systems: A Machine Learning Approach

IEEE Paper: https://ieeexplore.ieee.org/abstract/document/10692729

Robust intrusion detection systems (IDS) protect against cyberattacks and are critical in big data environments because of the growing volume and complexity of network traffic. Conventional signature-based IDS have difficulty keeping up with new attack techniques. In big data network security, machine learning and deep learning methods are viable alternatives for anomaly detection. To improve performance on big data network traffic datasets, this study proposes a novel deep learning-based IDS that combines an embedding layer, a capsule layer, and a bi-directional LSTM (BiLSTM) layer for feature extraction and pattern detection from network traffic data. To test the model’s efficacy in intrusion detection, two benchmark datasets are used: UNSW-NB15 and an IoT dataset. The proposed model performs well on both datasets, reaching a test accuracy of 99.99% on UNSW-NB15. This exceeds the accuracy reported in earlier studies and indicates a greater capacity to detect anomalies and possible intrusions in big data network traffic. Given its accuracy and its generalization across the evaluated datasets, the model has practical applications in protecting big data networks from cyberattacks.