Friday, April 22, 2011

DPI Research: New Features are Needed (Encryption, User Profiling)

A "Proposition de Sujets de Thèse" (thesis topic proposal) by Pr. Guillaume Urvoy-Keller, of the Université Nice Sophia Antipolis, seeks a PhD candidate to research DPI. The "candidate should have a solid background in networking and programming", and "This work will be partly carried out in cooperation with Orange Lab, Sophia-Antipolis".

See "Internet/Intranet Traffic classification" - here.
In a previous work, "Hybrid Traffic Identification" - here, Pr. Urvoy-Keller proposes ".. a framework, called Hybrid Traffic Identification (HTI) that enables to take advantage of the merits of different approaches. Any source of information (flow statistics, signatures, etc) is encoded as a feature; the actual classification is made by a machine learning algorithm. We demonstrated that HTI is not-dependent on a specific machine learning algorithm, and that any classification method can be incorporated to HTI as its decision could be encoded as a new feature". 

He concludes: "We heavily tested HTI using different ADSL traces and 3 different machine learning methods. We demonstrated that not only HTI outperforms the classical classification schemes, but, in addition, it is suitable for cross-site classification. We further reported on the use of an HTI instance that takes its decision on the fly for all the traffic generated by the customers of an ADSL platform".
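To make the "everything is a feature" idea concrete, here is a minimal sketch in Python. It is not the HTI implementation: the flow record layout, the feature helpers (`signature`, `decision_is`, `mean_size_above`), the port table and the tiny training set are all hypothetical, and a 1-nearest-neighbour rule stands in for whatever machine learning algorithm one would actually plug in. It only illustrates how a payload signature, a flow statistic and the decision of another classifier can each be encoded as one feature before a learner makes the final call.

```python
from typing import Callable, Dict, List, Tuple

Flow = Dict[str, object]         # hypothetical flow record (payload, dst_port, sizes)
Feature = Callable[[Flow], int]  # every source of information becomes a 0/1 feature


def signature(pattern: bytes) -> Feature:
    """DPI-style feature: 1 if the captured payload contains the pattern."""
    return lambda flow: int(pattern in flow["payload"])


def mean_size_above(threshold: int) -> Feature:
    """Flow-statistics feature: 1 if the mean packet size exceeds the threshold."""
    return lambda flow: int(sum(flow["sizes"]) / len(flow["sizes"]) > threshold)


PORT_TABLE = {80: "web", 6881: "p2p"}  # illustrative port table, an assumption


def port_guess(flow: Flow) -> str:
    """A stand-alone port-based classifier."""
    return PORT_TABLE.get(flow["dst_port"], "unknown")


def decision_is(classifier: Callable[[Flow], str], label: str) -> Feature:
    """Wrap another classifier: its decision is encoded as one more feature."""
    return lambda flow: int(classifier(flow) == label)


FEATURES: List[Feature] = [
    signature(b"GET "),
    signature(b"BitTorrent protocol"),
    decision_is(port_guess, "web"),
    mean_size_above(1000),
]


def encode(flow: Flow) -> Tuple[int, ...]:
    """Turn a flow into its feature vector."""
    return tuple(feature(flow) for feature in FEATURES)


# Tiny hand-labelled training set, made up for the example.
TRAINING = [
    (encode({"payload": b"GET /index.html HTTP/1.1", "dst_port": 80,
             "sizes": [300, 1400, 1400]}), "web"),
    (encode({"payload": b"\x13BitTorrent protocol", "dst_port": 6881,
             "sizes": [68, 1400, 1400]}), "p2p"),
    (encode({"payload": b"\x16\x03\x01", "dst_port": 443,
             "sizes": [200, 150, 180]}), "other"),
]


def classify(flow: Flow) -> str:
    """1-nearest-neighbour on Hamming distance; any learner could replace it."""
    vector = encode(flow)
    return min(TRAINING,
               key=lambda item: sum(a != b for a, b in zip(item[0], vector)))[1]


# HTTP on a non-standard port: the port-based feature misses, the others agree.
print(classify({"payload": b"GET /video HTTP/1.1", "dst_port": 8080,
                "sizes": [400, 1400, 1400]}))
# BitTorrent hiding on port 80: the signature feature outweighs the port guess.
print(classify({"payload": b"\x13BitTorrent protocol", "dst_port": 80,
                "sizes": [68, 68]}))
```

With this toy data the first flow is labelled "web" and the second "p2p", even though the port-based classifier alone would get both wrong. Adding a new source of information only means appending one more entry to `FEATURES`.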

In the new proposal ".. we aim at investigating new problems related to traffic identification:
  • .. we want to further explore the process of adding features to the [above] classification tool. In addition to adding features, we would like to investigate new applications that might represent a minority of bytes when observing the overall traffic on the long run but might be considered as crucial by the ISP, e.g. streaming or social network traffic. Also, encrypted traffic is of high interest ...
  • So far, all works that rely on statistical approaches use deep packet inspection tools for annotating the pre-labeled trace used to train the classifier. No study has considered the reverse problem of how results of statistical tools can help improving deep packet inspection tools.
  • Traffic classification might be used to profile groups of users. It can also be used to inform anomaly detection, e.g., abnormal trends in a specific application. We started to investigate users profiling".
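
As an illustration of the last point, the sketch below shows one simple way classified traffic could feed anomaly detection. It is an assumption, not something taken from the proposal: the function name `abnormal_apps`, the per-application daily volumes and the three-standard-deviation threshold are all made up for the example.

```python
from statistics import mean, pstdev
from typing import Dict, List


def abnormal_apps(history: Dict[str, List[float]],
                  today: Dict[str, float],
                  k: float = 3.0) -> List[str]:
    """Return applications whose volume today deviates from their own
    history by more than k standard deviations (hypothetical rule)."""
    flagged = []
    for app, series in history.items():
        mu, sigma = mean(series), pstdev(series)
        value = today.get(app, 0.0)
        # Skip applications with a perfectly flat history to avoid a zero threshold.
        if sigma > 0 and abs(value - mu) > k * sigma:
            flagged.append(app)
    return sorted(flagged)


# Made-up daily volumes (GB) per application, as labelled by a traffic classifier.
history = {
    "streaming": [10, 12, 11, 9, 10],
    "web": [50, 52, 49, 51, 50],
}
today = {"streaming": 30, "web": 51}

print(abnormal_apps(history, today))
```

Here only "streaming" is flagged: its volume roughly triples, while "web" stays within its usual range. The check is only as good as the per-application labels it is fed, which is why it depends on the classifier.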

1 comment:

  1. Deep packet inspection tools are useful for security functions as well as internet data mining, eavesdropping, and internet censorship.