
IEEE PHOTONICS TECHNOLOGY LETTERS, VOL. 18, NO. 22, NOVEMBER 15, 2006

Automatic Identification of Impairments Using Support Vector Machine Pattern Classification on Eye Diagrams

Ronald A. Skoog, Member, IEEE, Thomas C. Banwell, Member, IEEE, Joel W. Gannett, Senior Member, IEEE, Sarry F. Habiby, Member, IEEE, Marcus Pang, Michael E. Rauch, and Paul Toliver, Member, IEEE

Abstract—We have demonstrated powerful new techniques for identifying the optical impairments causing the degradation of an optical channel. We use machine learning and pattern classification techniques on eye diagrams to identify the optical impairments. These capabilities can enable the development of low-cost optical performance monitors having significant diagnostic capabilities.

Index Terms—Machine learning, optical performance monitoring (OPM), pattern recognition.

I. INTRODUCTION

As all-optical subnetworks become more extensive, it will be necessary to have optical performance monitoring (OPM) capabilities placed throughout those subnetworks so the cause and location of performance problems can be readily identified. The network interface points that do optical-to-electrical conversion (e.g., client interfaces, domain interconnection points, etc.) provide performance measurements such as bit-error ratio, so they can identify performance degradation, but they cannot identify the problem's cause or its source. The function of the OPMs will be to identify the cause (e.g., type of impairment) of performance degradations, and multiple OPMs will be needed to sectionalize and locate the faulty equipment. To make this economically viable from a carrier's perspective, OPMs will need to be developed that have sophisticated diagnostic capability and low-cost components. This letter identifies new OPM techniques that could make this goal achievable.

The basic idea we are investigating is to use the extensive body of knowledge that has been developed for machine learning and pattern classification (e.g., see [1]) to develop automated techniques for analyzing the eye diagram of a monitored optical signal and identifying its optical impairments. A key observation is that it is essential to examine the entire eye diagram to identify optical signal impairments, and not just look at one part of the eye diagram (e.g., using histograms at the maximum eye opening).

A fundamental step in developing a pattern classification system is the machine-learning process of training the classifier.


This step requires training data. To generate the needed training data, we use commercial optical communication system simulation packages to generate eye diagrams with different types and levels of impairment. To test the efficacy of a trained pattern classification system, testing data is required. Cross validation [1] can be used in the training process to get a form of testing data, but we found it essential to use measured data from real optical signals to get a reliable indication of the OPM classifier's true performance.

The type of pattern classification system we use is called a support vector machine (SVM) [1], [2]. SVM has been found to be effective in a number of pattern recognition application areas (e.g., character recognition, face detection in images, etc.), and its performance is significantly better than that of competing methods (e.g., neural networks). In this letter, to validate the basic concepts for the simplest case, we consider optical signals having a single impairment. Extensions to optical signals with multiple impairments have been done, and we will report details on those extensions in the future.

II. FEATURE COMPUTATION AND SELECTION

Fig. 1. (a) Eye diagrams and features. (b) SVM mapping and classification margin. (c) Experimental setup.

Fig. 1(a) shows example simulated eye diagrams for different cases of a single optical impairment. To a human being looking at these pictures, it is very clear that the different impairments give rise to distinct features. The distinct features occur at different locations in the eye diagram, as illustrated in the right panel of Fig. 1(a). However, these are features that are visually perceived by a human. A major challenge is defining mathematical features of entire eye diagrams so that a machine classification algorithm can be trained to distinguish between the different types of impairment.

The most straightforward feature definition is to view the eye diagram as a discrete image and define each of its pixel values as a feature. We found, however, that the performance using this feature definition is unacceptable. The reason for this is that the feature vectors representing the eye diagram are very sensitive to small spatial perturbations: for example, shifting all pixel values one pixel to the right or left can lead to the shifted vector having a large Euclidian distance from the unshifted vector, even though the two images are spatially close.
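As a toy illustration of this sensitivity (not part of the original letter), the sketch below builds a small binary "eye diagram" stand-in, shifts it by one pixel, and compares the Euclidean distance between raw pixel vectors with the distance between a few low-order normalized central moments; the image size, trace shapes, and choice of central moments are arbitrary assumptions for illustration only (the letter itself uses Zernike moments, introduced next).

```python
# Hypothetical illustration: raw pixel features are fragile under a 1-pixel shift,
# while simple normalized central moments change very little.
import numpy as np

def toy_eye(size=64, shift=0):
    """Crude stand-in for a binary eye diagram: two crossing sinusoidal traces."""
    img = np.zeros((size, size))
    t = np.arange(size)
    for trace in (np.sin(np.pi * t / size), 1.0 - np.sin(np.pi * t / size)):
        rows = np.clip((trace * (size - 1)).astype(int) + shift, 0, size - 1)
        img[rows, t] = 1.0
    return img

def central_moments(img, order=3):
    """Normalized central moments mu_pq of order 2..order (translation tolerant)."""
    y, x = np.indices(img.shape)
    m00 = img.sum()
    xc, yc = (x * img).sum() / m00, (y * img).sum() / m00
    return np.array([((x - xc) ** p * (y - yc) ** q * img).sum() / m00
                     for p in range(order + 1) for q in range(order + 1)
                     if 2 <= p + q <= order])

a, b = toy_eye(shift=0), toy_eye(shift=1)           # same shape, shifted one pixel
pixel_dist = np.linalg.norm(a.ravel() - b.ravel())   # large: many pixels differ
moment_dist = np.linalg.norm(central_moments(a) - central_moments(b))  # small
print(f"pixel-vector distance:   {pixel_dist:.2f}")
print(f"moment-feature distance: {moment_dist:.4f}")
```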



A technique developed in image analysis to deal with this problem is to use moments to characterize images [3], [4]. The geometrical moments $m_{pq}$ of order $p+q$ of an image having pixel values $f(x,y)$ are expressed as

$$m_{pq} \;=\; \sum_{(x,y)\in D} x^{p}\, y^{q}\, f(x,y) \qquad (1)$$

where $D$ is the region in pixel space where the intensity function $f(x,y)$ is defined. A generalization of this concept is to use a complete set of orthogonal polynomials $P_{n}$ ($n$ is the order of the polynomial) and replace $x^{p}$ and $y^{q}$ in (1) with $P_{p}(x)$ and $P_{q}(y)$, respectively. A heavily studied choice of orthogonal polynomials in the image processing literature is the set of Legendre polynomials (and their generalizations). Teague [4] introduced the use of Zernike moments, which are based on complex-valued orthogonal functions $V_{nm}(x,y)$ defined over the unit disk and called Zernike polynomials. In image processing, Zernike moments have proved to be superior to Legendre (and other orthogonal polynomial) moments in terms of their feature representation and low noise sensitivity (e.g., see [5, Ch. 5]), and we have found them to perform the best in our application as well. The Zernike moment $Z_{nm}$ of order $n$ with repetition $m$ for an image function $f(x,y)$ is defined as ($n$ and $m$ are integers, $|m|\le n$, and $n-|m|$ even)

$$Z_{nm} \;=\; \frac{n+1}{\pi} \sum_{x^{2}+y^{2}\le 1} f(x,y)\, V_{nm}^{*}(x,y) \qquad (2)$$

where $V_{nm}(x,y) = R_{nm}(\rho)\, e^{jm\theta}$, with $\rho = \sqrt{x^{2}+y^{2}}$ and $\theta = \arctan(y/x)$, and $R_{nm}(\rho)$ is a real-valued radial polynomial of order $n$ with coefficients depending on both $n$ and $m$ [5].

The features we use to characterize the eye diagrams for the pattern classification system are the lower order Zernike moments. To compute the Zernike moments, the eye diagrams are scaled to fit within a unit disk. The moment set is determined by a chosen upper bound on the order $n$; we used an upper bound that produces 23 independent moments to use for classification. Thus, each eye diagram is characterized by a 23-dimensional feature vector. Zernike moments are invariant to rotation, but not to linear translation. As a result, it is important to have consistent phase registration of the eye diagrams. For nonreturn-to-zero modulation, this can be done by identifying the predominant crossover phase for the rising and falling waveforms.
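As a concrete illustration (not from the letter), the sketch below maps a binary eye-diagram image onto the unit disk and evaluates the Zernike moments of (2) up to a chosen maximum order; the input image, the maximum order, and the exact set of retained moments are assumptions made for illustration, since the letter does not spell out its precise 23-moment index set.

```python
# Sketch (assumed details): compute low-order Zernike moments of a binary eye image
# mapped to the unit disk, following the standard definition in Eq. (2).
import numpy as np
from math import factorial

def radial_poly(n, m, rho):
    """Real-valued Zernike radial polynomial R_nm(rho); requires n - |m| even, |m| <= n."""
    m = abs(m)
    R = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s) /
             (factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s)))
        R += c * rho ** (n - 2 * s)
    return R

def zernike_features(img, n_max=6):
    """Return |Z_nm| for 0 <= m <= n <= n_max with n - m even (magnitudes are rotation invariant)."""
    h, w = img.shape
    y, x = np.indices(img.shape)
    # Scale pixel coordinates so the image fits inside the unit disk.
    xs = (2.0 * x - (w - 1)) / (w - 1)
    ys = (2.0 * y - (h - 1)) / (h - 1)
    rho = np.sqrt(xs ** 2 + ys ** 2)
    theta = np.arctan2(ys, xs)
    inside = rho <= 1.0
    feats = []
    for n in range(n_max + 1):
        for m in range(0, n + 1):
            if (n - m) % 2:
                continue
            V_conj = radial_poly(n, m, rho) * np.exp(-1j * m * theta)   # V_nm*(x, y)
            Z = (n + 1) / np.pi * np.sum(img[inside] * V_conj[inside])  # Eq. (2)
            feats.append(abs(Z))
    return np.array(feats)

# Example on a placeholder binary image standing in for a cleaned eye diagram.
rng = np.random.default_rng(0)
demo = (rng.random((64, 64)) > 0.9).astype(float)
print(zernike_features(demo).shape)
```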

III. SVM CLASSIFICATION SYSTEM

The aim of a "machine" classification system is to classify test samples into one of a set of known classes. In this letter, we consider optical signals with a single impairment, and the impairment classes are chromatic dispersion (CD), first-order polarization-mode dispersion (PMD), noncoherent crosstalk, and no impairment (normal eye). The input to the SVM classifier is a feature vector representing the test sample, and as discussed above we use 23 low-order Zernike moments as the feature vector of an eye diagram. Classifiers are trained with a training set $\{(\mathbf{x}_i, y_i)\}$, where $\mathbf{x}_i$ is the feature vector of the $i$th training sample and $y_i$ represents its known classification. Many types of classification system have been developed [1], and SVM has been found to be superior in a wide range of applications (e.g., character recognition, tumor detection, etc.). SVM is based on the idea of minimizing the risk of error (called generalization error) when the classifier is applied to test samples that do not exactly match any training sample used in training the classifier. In comparison, most classifiers (e.g., neural networks) try to minimize the training error, and tend to "overfit" the training data.

Fig. 1(b) illustrates the basic idea behind SVM. A nonlinear mapping is implicitly defined that maps the training points (vectors) in the feature space into a high-dimensional (HD) feature space. In the HD feature space, a decision hyperplane is computed to separate the training points of the two classes. This is done so as to maximize the margin between the two classes. Maximizing the margin is what minimizes the generalization error.

SVM is basically a two-class classifier (one hyperplane separating two classes). To build a multiclass classifier using SVM, an SVM for each pairing of two classes is built and a test sample is classified by each of those two-class SVMs. A voting scheme is used to determine the multiclass classification. The SVM process [6] also provides a confidence level out of this voting computation. Once an SVM classifier has been trained, the time required for a processor to compute the feature vector of an eye diagram and have it classified is very short (a few milliseconds). Thus, the above pattern classification techniques are quite amenable to real-time monitoring.
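A minimal sketch of such a classification stage is shown below, assuming scikit-learn's SVC as the SVM implementation; scikit-learn builds the pairwise (one-vs-one) SVMs internally and, with probability=True, derives class probability estimates by pairwise coupling in the spirit of [6]. The feature matrix and labels are random placeholders for the Zernike-moment vectors described above, not the authors' actual data.

```python
# Sketch (assumed implementation): train a multiclass SVM on Zernike-moment
# feature vectors and report the predicted impairment class with a confidence level.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

CLASSES = ["normal", "CD", "PMD", "crosstalk"]

# Placeholder training data: rows are 23-dimensional Zernike feature vectors
# (random numbers stand in for features computed from simulated eye diagrams).
rng = np.random.default_rng(1)
X_train = rng.normal(size=(160, 23))
y_train = rng.integers(0, len(CLASSES), size=160)

# RBF-kernel SVM; one-vs-one pairwise SVMs are built internally for multiclass,
# and probability=True enables pairwise-coupling probability estimates.
clf = make_pipeline(StandardScaler(),
                    SVC(kernel="rbf", C=10.0, gamma="scale", probability=True))
clf.fit(X_train, y_train)

x_test = rng.normal(size=(1, 23))          # feature vector of one measured eye diagram
probs = clf.predict_proba(x_test)[0]
best = int(np.argmax(probs))
print(f"classified as {CLASSES[best]} with confidence {probs[best]:.0%}")
```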


IV. EXPERIMENTAL RESULTS

When laboratory measurements of eye diagrams are used, low-pass filtering is required to remove noise, and we found that gray-scale pixel values are difficult to calibrate accurately between simulation and measurement data. Therefore, pixel values of 0 or 1 must be used, and a filtering and processing stage is needed to generate "clean" eye diagrams for classification. The filtering we do uses a simple low-pass filter that filters one row or column at a time to remove noise and smooth out pixel amplitude variation. We then use a peak analyzer on the filtered data to identify local maxima. These local maxima are the pixel locations we label as 1s; all other pixels are set to 0. Fig. 1(c) shows the experimental laboratory setup for taking measured data, and also shows the measured and filtered eye diagrams. The filtering and peak analysis is also done on the simulation data to achieve as much consistency as possible between the two data sets.
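The sketch below is an assumed rendering of this clean-up stage: a simple moving-average low-pass filter applied along each column of the gray-scale eye diagram, followed by per-column local-maximum detection to produce the binary (0/1) image. The filter length, column-wise processing, and placeholder input are illustrative choices, not the authors' exact parameters.

```python
# Sketch (assumed parameters): low-pass filter each column of a gray-scale eye
# diagram, then keep only local maxima as 1-pixels to form a "clean" binary image.
import numpy as np
from scipy.signal import find_peaks

def clean_eye(gray, smooth_len=5):
    """gray: 2-D array of accumulated eye-diagram intensities (rows = amplitude bins,
    columns = time bins). Returns a binary image with 1s at per-column local maxima."""
    kernel = np.ones(smooth_len) / smooth_len          # moving-average low-pass filter
    binary = np.zeros_like(gray)
    for col in range(gray.shape[1]):
        smoothed = np.convolve(gray[:, col], kernel, mode="same")
        peaks, _ = find_peaks(smoothed)                # local maxima in this column
        binary[peaks, col] = 1.0
    return binary

# Example on a placeholder noisy eye-diagram histogram.
rng = np.random.default_rng(2)
noisy = rng.random((128, 128))
print(int(clean_eye(noisy).sum()), "pixels labeled 1")
```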

For test data, we developed experimental measurement data for each of the four impairment classes (normal, CD, PMD, and crosstalk). The test set we generated had three CD samples, 11 PMD samples, and three crosstalk samples. For training with simulation data, we generated 31 CD samples, 107 PMD samples, 20 crosstalk samples, and six "normal" samples (distinguished by different levels of amplified spontaneous emission noise). The accuracy we obtained from the training cross validation (we used ten-fold cross validation) was 95%. Applying the trained SVM classifier to the 17 measurement test eye diagrams resulted in 15 high-confidence (>60%) classifications that were all correct and two low-confidence (<50%) classifications that were both incorrect. The two low-confidence incorrect classifications were crosstalk samples that were classified as extreme cases of PMD (i.e., differential group delay around one pulse period).

To show the capability of quantifying the amount of the impairment, we used a nearest neighbor technique. After impairment classification of a test sample, we found the nearest (i.e., shortest Euclidian distance between feature vectors) training sample within the determined impairment class. For correct classifications, this always identified the training sample having the closest impairment value to the test sample's impairment value.

V. CONCLUSION

We have shown that the extensive body of knowledge that has been developed for machine learning and pattern classification can enable automated techniques for identifying optical signal impairments from measured eye diagrams. This is a fundamentally new technique that has not previously been used for OPM applications. Simulation is used to generate the eye diagrams for training the impairment classification system. A major problem that had to be solved for this approach to work was developing eye-diagram filtering and feature computation capabilities that are consistent across simulation and measured data. The capabilities presented here show significant potential for enabling the development of low-cost OPMs with significant diagnostic capabilities.

REFERENCES

[1] R. Duda, P. Hart, and D. Stork, Pattern Classification. New York: Wiley, 2001.
[2] V. Vapnik, The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.
[3] M. K. Hu, "Visual pattern recognition by moment invariants," IRE Trans. Inf. Theory, vol. 8, no. 1, pp. 179–187, 1962.
[4] M. R. Teague, "Image analysis via the general theory of moments," J. Opt. Soc. Amer., vol. 70, no. 8, pp. 920–930, 1980.
[5] R. Mukundan, Moment Functions in Image Analysis. Singapore: World Scientific, 1998.
[6] T.-F. Wu, C.-J. Lin, and R. Weng, "Probability estimates for multi-class classification by pairwise coupling," J. Mach. Learn. Res., vol. 5, pp. 975–1005, 2004.