Paper excerpt
The rapid development of mobile networks gives researchers the opportunity to analyze user behavior based on large-scale network traffic data. Such analysis is important for Internet Service Providers (ISPs) seeking to optimize resource allocation and provide customized services to users. The first step in analyzing user behavior is to extract information about user actions from HTTP traffic data by multi-pattern URL matching. However, efficiency becomes a major problem when this work is performed on massive network traffic data. To solve this problem, we propose a novel and accurate algorithm named Multi-Pattern Parallel Matching (MPPM), which takes advantage of HashMap lookups to extract user behaviors from big network data more effectively. Extensive experiments on real-world traffic data demonstrate that the MPPM algorithm can handle massive HTTP traffic with better accuracy, concurrency, and efficiency. We expect the proposed algorithm and its parallelized implementation to provide a solid base for building a high-performance engine for analyzing user behavior from massive HTTP traffic data.
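To make the idea of hash-based multi-pattern URL matching concrete, the following is a minimal sketch and not the MPPM algorithm itself, whose details are not given in this excerpt. It assumes that each pattern can be reduced to a (host, first path segment) key stored in a hash map, so that matching one URL costs a single dictionary lookup regardless of how many patterns exist. The pattern table, the action labels, and the function names `match_url` and `match_all` are all hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit

# Hypothetical pattern table: (host, first path segment) -> action label.
# A real deployment would load many such patterns from configuration.
PATTERNS = {
    ("shop.example.com", "cart"): "add_to_cart",
    ("shop.example.com", "search"): "search",
    ("video.example.com", "watch"): "play_video",
}


def match_url(url):
    """Return the action label for a URL, or None if no pattern matches."""
    parts = urlsplit(url)
    # Keep only non-empty path segments, e.g. "/cart/add" -> ["cart", "add"].
    segments = [s for s in parts.path.split("/") if s]
    first = segments[0] if segments else ""
    # One hash lookup replaces a scan over every pattern.
    return PATTERNS.get((parts.hostname, first))


def match_all(urls, workers=4):
    """Match many URLs concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(match_url, urls))
```

Because each URL is matched independently of the others, the records can be split across workers without coordination. A thread pool is used here only to keep the sketch short; in CPython, CPU-bound matching would need processes or a distributed framework to gain real parallel speedup.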