A New Proposal for Person Identification Based on the Dynamics of Typing: Preliminary Results

Krisztian Buza, Dora Neubrandt


The availability of cheap and widely applicable person identification techniques is essential given the widespread use of online services. The dynamics of typing are characteristic of individual users, and users can hardly mimic the typing dynamics of others. State-of-the-art solutions for person identification from the dynamics of typing are based on machine learning. The presence of hubs, i.e., a few instances that appear as nearest neighbors of surprisingly many other instances, has recently been observed in various domains, and hubness-aware machine learning approaches have been shown to work well in those domains. However, hubness has not yet been studied in the context of person identification, and hubness-aware techniques have not been applied to this task. In this paper, we examine hubness in typing data and propose to use EC$k$NN, a recent hubness-aware regression technique, together with dynamic time warping for person identification. We collected time-series data describing the dynamics of typing and used it to evaluate our approach. Experimental results show that hubness-aware techniques outperform state-of-the-art time-series classifiers.
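
To make the notion of a hub concrete, the following is a minimal Python sketch, not the authors' implementation and not EC$k$NN itself. It assumes plain dynamic time warping with absolute difference as the local cost and no warping window, and it counts how often each time series appears among the k nearest neighbors of the others. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Assumption: absolute difference as local cost, no warping window.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again to b[j-1] later
                cost[i][j - 1],      # b[j-1] is matched again to a[i-1] later
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def k_occurrence(series, k):
    """Count how often each series is among the k nearest neighbors
    of the other series (a series is never its own neighbor)."""
    counts = [0] * len(series)
    for i, query in enumerate(series):
        dists = [
            (dtw_distance(query, other), j)
            for j, other in enumerate(series)
            if j != i
        ]
        dists.sort()  # ties are broken by index, an arbitrary choice
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


# Hypothetical toy data standing in for typing time series.
toy = [
    [1, 2, 3, 2, 1],
    [1, 1, 2, 3, 2, 1],
    [5, 5, 5, 5],
    [1, 2, 2, 3, 2, 1],
    [9, 9, 9],
]
counts = k_occurrence(toy, k=1)
print(counts)
```

A strongly skewed distribution of these counts, where a few series collect most of the neighbor occurrences, is what is meant by hubness. Hubness-aware methods use such occurrence statistics when combining the neighbors' votes.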

DOI: http://dx.doi.org/10.20904/281-2001


Copyright (c) 2017 Krisztian Buza, Dora Neubrandt

This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN: 1896-5334 (print), 2300-889X (online)
