RESEARCH AREAS
Data Mining and Knowledge Discovery
Data mining (DM) is the process of extracting relevant knowledge from
large and complex data sets. Knowledge Discovery (KD), a broader process
(also referred to, in a more limited scope, as knowledge discovery from databases
(KDD)), involves preprocessing (data preparation), DM, and postprocessing
(knowledge refinement). DM algorithms work on preprocessed
data, and knowledge refinement processes validate and refine the discovered
knowledge. My interests in DM and KD revolve around both basic and applied
research. In particular, I am interested in developing fundamentally
new techniques that build on soft computing, pattern matching, statistics,
and biological systems. For the applied aspects, I am interested in
applications of DM and KD to Internet security. Of special interest
are anomaly detection in the World Wide Web and computer networks to identify
intruders and malicious activity, and the detection and control of malicious
executables.
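The three KD stages described above (preprocessing, mining, knowledge refinement) can be sketched for a toy anomaly-detection task. This is only an illustrative sketch under stated assumptions: the log format (`<host> <path>`), the function names, the median/MAD scoring rule, and the threshold of 3.5 are all hypothetical choices made for the example, not techniques taken from the research described here.

```python
from collections import Counter
from statistics import median


def preprocess(raw_log_lines):
    """Data preparation: drop malformed lines and count requests per host.

    Assumes a hypothetical log format of "<host> <path>" per line.
    """
    counts = Counter()
    for line in raw_log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # malformed record, discard
        counts[parts[0]] += 1
    return dict(counts)


def mine_anomalies(counts, threshold=3.5):
    """Mining: flag hosts whose request count has a large robust z-score.

    The score uses the median and the median absolute deviation (MAD).
    """
    if not counts:
        return {}
    values = list(counts.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return {}  # no spread, nothing can be called anomalous
    flagged = {}
    for host, count in counts.items():
        score = 0.6745 * (count - med) / mad
        if score > threshold:
            flagged[host] = score
    return flagged


def refine(candidates, known_benign):
    """Knowledge refinement: discard candidates already known to be benign."""
    return sorted(host for host in candidates if host not in known_benign)


def discover(raw_log_lines, known_benign=frozenset()):
    """Run the full pipeline: preprocess, mine, then refine."""
    return refine(mine_anomalies(preprocess(raw_log_lines)), known_benign)
```

Keeping the stages as separate functions mirrors the split in the text: the mining step sees only prepared data, and validation of its output happens afterwards.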
Fault Mitigation in Software Systems
The search for fundamental principles of fault tolerance in human-engineered
complex dynamic systems is very new. We are interested in modeling complex
dynamic systems as hybrid interacting automata whose continuously varying
dynamics capture the physical process at the lowest level of abstraction.
Discrete event models at the higher levels capture the cognitive response
of the system to observed emerging physical phenomena. Our broader aim
is to formulate analytical models of the higher-level dynamics of component
interactions triggered by all types of individual failures to (i) predict
emerging pathological system behavior from time-series observations
of events and their dynamic interactions, and (ii) formulate adaptive
mechanisms to circumvent or mitigate the effects of pathological behavior.
In particular, my interests lie in the mitigation of faults in complex software
systems. One area that we are currently exploring is modeling the interactions
of software applications with the operating system as a deterministic finite state
automaton (DFSA) (i.e., a regular language) and applying Supervisory
Control Theory to develop a recognizer of this language that controls
and mitigates faults in software execution. Specifically, the discrete-event
supervisor restricts the legal language of the model in an attempt to
mitigate the detrimental consequences that faults or undesirable
events would normally have.
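To make the idea of a supervisor restricting the legal language concrete, here is a minimal sketch. The states, events, and the split into controllable and uncontrollable events are invented for illustration; the code implements only the basic safety part of supervisory synthesis (it does not address nonblocking) and is not the model used in this research.

```python
def unsafe_states(transitions, forbidden, uncontrollable):
    """Return forbidden states plus every state from which uncontrollable
    events alone can force the system into one (least fixed point)."""
    bad = set(forbidden)
    changed = True
    while changed:
        changed = False
        for (state, event), target in transitions.items():
            if event in uncontrollable and target in bad and state not in bad:
                bad.add(state)
                changed = True
    return bad


def supervise(transitions, forbidden, uncontrollable):
    """Restrict the model: drop every transition that starts or ends in an
    unsafe state. Only controllable events are ever disabled at safe states."""
    bad = unsafe_states(transitions, forbidden, uncontrollable)
    return {
        (state, event): target
        for (state, event), target in transitions.items()
        if state not in bad and target not in bad
    }


def accepts(transitions, start, events):
    """Check whether an event string belongs to the language of the DFSA."""
    state = start
    for event in events:
        if (state, event) not in transitions:
            return False
        state = transitions[(state, event)]
    return True


# Hypothetical application/OS interaction model (all names are assumptions).
MODEL = {
    ("idle", "open"): "ready",
    ("ready", "write"): "writing",
    ("ready", "flush"): "flushing",
    ("ready", "close"): "idle",
    ("writing", "done"): "ready",           # uncontrollable
    ("writing", "disk_error"): "degraded",  # uncontrollable fault event
    ("degraded", "write"): "crashed",
    ("degraded", "close"): "idle",
    ("flushing", "timeout"): "crashed",     # uncontrollable
}
UNCONTROLLABLE = {"done", "disk_error", "timeout"}
SUPERVISED = supervise(MODEL, {"crashed"}, UNCONTROLLABLE)
```

In this toy model the supervisor cannot prevent `disk_error`, but it disables `write` in the degraded state and `flush` in the ready state, because both lead (directly or through an uncontrollable event) to the forbidden `crashed` state.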
Web and Internet Security
Topics in this area include Web caching, Web site reorganization, Web personalization,
Web optimization, and trust and related protocols. My interests are the design,
implementation, and security aspects of the World Wide Web at the system level. I am also
interested in the design of algorithms, Internet protocols, Internet programming,
the study of Internet attack methods, and building software-based Internet
security mechanisms above the IP layer. Further interests include authentication
and trust protocols and methods, Internet security policies, mobile
code control of malicious software, and intrusion detection.
Soft Computing
Soft Computing (SC) consists of Fuzzy Logic, Neural Computing, Evolutionary
Computation, Machine Learning, and Probabilistic Reasoning, the last of which includes
belief networks, chaos theory, and parts of learning theory. SC tolerates
imprecision, uncertainty, partial truth, and approximation to achieve
tractability, robustness, and low solution cost. SC draws on
a great deal of earlier work, but I attribute the greatest influence on SC (in
its present form) to Zadeh's 1965 paper on fuzzy sets, his 1973 paper
on the analysis of complex systems and decision processes, and his 1979
report (1981 paper) on possibility theory and soft data analysis. The
inclusion of neural computing and genetic computing in soft computing
came at a later point. SC is a foundational component for the emerging
field of conceptual intelligence.
I am interested in doing basic and applied work in soft computing with
applications to the Internet, World Wide Web, and computer networks.
In earlier research, I applied two organizing principles of the
functional architecture of the mammalian primary visual cortex, (1) competitive
learning and (2) Hebbian learning, to image restoration and segmentation.
I am interested in extending this work to steganography (hiding
messages in Web images, etc.).
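As a rough illustration of the first principle, the sketch below applies winner-take-all competitive learning to segment a handful of pixel intensities. It is a toy under stated assumptions: one-dimensional intensities in [0, 1], hand-picked initial prototypes, and hypothetical function names. It does not reproduce the earlier restoration and segmentation work, and the Hebbian part is not shown.

```python
def nearest(weights, x):
    """Index of the prototype closest to intensity x (the 'winner')."""
    return min(range(len(weights)), key=lambda i: abs(weights[i] - x))


def competitive_learning(samples, prototypes, rate=0.1, epochs=20):
    """Winner-take-all training: for each sample, only the winning
    prototype moves a fraction `rate` of the way toward the sample."""
    weights = list(prototypes)
    for _ in range(epochs):
        for x in samples:
            winner = nearest(weights, x)
            weights[winner] += rate * (x - weights[winner])
    return weights


def segment(pixels, weights):
    """Label each pixel with the index of its nearest learned prototype."""
    return [nearest(weights, p) for p in pixels]


# Toy "image": dark pixels near 0.1 and bright pixels near 0.9.
PIXELS = [0.10, 0.12, 0.90, 0.88, 0.11, 0.91]
WEIGHTS = competitive_learning(PIXELS, prototypes=[0.3, 0.7])
LABELS = segment(PIXELS, WEIGHTS)
```

After training, each prototype settles inside one cluster of intensities, so the labels separate the dark pixels from the bright ones.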
World Wide Web and Computer Networks
This area includes the ability of the network to monitor itself, find
alternate paths for traffic through the network and modify these paths
dynamically based on the state of the network. The specific topics of
interest in this area are: (1) Load estimation based on the history
of network traffic at the node level. Considerable success has been achieved
in using recurrent neural nets to predict chaotic time series such as those
based on the Mackey-Glass equation. The advantage of these techniques is
that a hardware implementation can provide real-time load estimation.
(2) Adaptive route selection. When a particular channel
or path fails, fast rerouting of traffic is very important. The number of possible
routes in a densely connected network (or a wide area network with a
number of LANs acting as nodes, or even mobile units acting as nodes)
can increase combinatorially. The selection of paths also depends
on the importance of the message, the existing traffic, and the topology
of the network. Here I want to apply some of the emerging techniques
that use neural nets (Hopfield nets and Kohonen's feature maps) to find
an optimal path in a dynamic environment. I am also interested in exploring
genetic algorithms and fuzzy logic, which may also have potential for this problem.
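The rerouting problem in topic (2) can be illustrated with a conventional baseline. The sketch below uses Dijkstra's shortest-path algorithm, not the neural, genetic, or fuzzy techniques mentioned above, and simply recomputes the route when links are marked as failed. The topology, the link weights (standing in for current load), and the function name are assumptions made for the example.

```python
import heapq


def shortest_path(graph, source, target, failed=frozenset()):
    """Dijkstra's algorithm over all links not listed in `failed`.

    `graph` maps node -> {neighbor: link cost}; `failed` holds undirected
    links as frozenset({u, v}). Returns (cost, path), or None if the
    target is unreachable.
    """
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor in visited or frozenset((node, neighbor)) in failed:
                continue
            heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None


# Hypothetical four-node network; costs model the current load on each link.
NETWORK = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1},
}

normal_route = shortest_path(NETWORK, "A", "D")
rerouted = shortest_path(NETWORK, "A", "D", failed={frozenset(("B", "C"))})
```

Recomputing from scratch is adequate for a network this small. The combinatorial growth in candidate routes noted above is what motivates looking at approximate or adaptive methods for larger, denser topologies.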