Many classification methods, such as kernel methods and decision trees, are nonlinear approaches. However, linear methods, which use a simple weight vector as the model, remain very useful for many applications. With careful feature engineering and data in a rich high-dimensional space, their performance can be competitive with that of a highly nonlinear classifier. Successful application areas include document classification and computational advertising (CTR prediction). In the first part of this talk, we give an overview of linear classification by introducing commonly used formulations. We discuss optimization techniques for fast training that were developed in our linear-classification package LIBLINEAR. Our discussion shows clearly that linear classification offers more flexibility than kernel methods in selecting and employing optimization methods. In the second part of the talk, we select a few examples, ranging from small to big data, to demonstrate how linear classification is applied in practice. The third part of the talk discusses issues in applying linear classification to big-data analytics. In our recent work on distributed linear classification, we have encountered several challenges in this research topic. I will discuss them and hope to get your comments.