Student at University of Washington
I’m a master’s student majoring in computer science at the University of Washington, graduating this June. I earned my bachelor’s degree in computer science from Xi’an Jiaotong University in China. I completed a software engineering internship at IBM, working on iOS application development for a Human Resource Information Management System. I’m a fast learner who is passionate about application development, comfortable with many programming languages, and able to work as a full-stack developer. I also have substantial data analysis and machine learning experience from several projects conducted with university professors, such as detecting malicious domains from domain names and predicting software defects. My GPA is 4.0/4.0.
- Algorithm Design and Analysis
- Data Analysis
- Data Structure
- Machine Learning
- Software Engineering
09/2016 – 06/2018 (Expected)
Master of Science in Computer Science and Systems at University of Washington
09/2012 – 06/2016
Bachelor of Science in Computer Science at Xi’an Jiaotong University
Software Engineer Intern at IBM
• Researched common functions of Human Resource Information Systems; wrote demos to communicate with customers and documented their requirements in detail;
• Implemented the overall layouts and the display of personal information using Swift;
• Modified and optimized existing functional module code using Swift.
Developer at University of Washington
Domain-Based Feature Selection, AGD Detection and AGD Family Classification
• Goal & Context: Applied various data analysis techniques and feature selection methods to
obtain the best feature set for distinguishing benign domains from malicious
Algorithmically Generated Domains (AGDs), then classified AGD families with and without features.
• Researched and extracted all available linguistic features of domains, and performed feature analyses
such as linear correlation, multicollinearity, and per-class feature frequency distribution;
• Leveraged feature selection techniques including wrapper and filter methods to obtain the best feature set;
• Improved AGD detection performance (TPR at FPR = 0.001) by 10% using Random Forest;
• Improved AGD family classification accuracy by 15% using a CNN approach without features.
Developer at University of Washington
Predicting Author Gender and Age from Facebook Text
• Goal & Context: Used users’ texts, text analysis techniques, and machine learning algorithms to
perform binary classification of author gender and 4-class classification of author age.
• Programming Language & Packages: Python, NLTK, Scikit-Learn.
• Performed tokenization, word normalization, stemming, and sentence segmentation; computed
term frequency–inverse document frequency (TF-IDF) features;
• Built an SGD classifier for gender and a one-vs-rest classifier for age, both achieving high accuracy.
Developer at Xi’an Jiaotong University
Predicting Defects of Software Changes Based on Network Analysis
• Goal & Context: To improve software defect prediction, added a new category of features,
network-attribute features, to the existing feature set. In the network, nodes correspond to
Java source files and edges to the dependency relationships among them.
• Programming Language & Other Tools: Java, Linux, Weka, Understand, UCINET.
• Extracted five classes of network-attribute features, including closeness measures and reach centrality;
• Integrated the data and applied recursive feature elimination to select the best feature set;
• Trained a Naïve Bayes classifier on the best feature set, improving accuracy by 9%.