Read paper “Large-Scale Machine Learning with Stochastic Gradient Descent”

Paper reference: Large-Scale Machine Learning with Stochastic Gradient Descent


This GD(Gradient Descent), which is used for computing weight of NN (also used for other Machine Learning Algorithm). zi represents the example ‘i’, also as (xi, yi). After calculate all examples, we need to compute the average for all differentials by weight. Calculating all examples is a slow progress, so we can image GD is not adequate efficient.


Here comes the SGD, which use only one example to compute gradient. It is simpler, and more efficient.


Using SGD in K-mean clustering algorithm seems counterintuitive for me at first glance. But after thinking about “Sample zi belongs to cluster of wk, then don’t wait for all samples, just update wk by zi“, it becomes conceivable.


ASGD is suitable for distributed machine learning environment, since it could get averaged gradient from any example of data at any time (no order restrain).

Data Preprocessing in Tableau

In previous article, I created two tables in my Redshift Cluster. Now I wan’t to find out the relation between salary of every employee and their working age. Tableau is the best choice for visualizing data analysis (SAS is too expensive and has no trail-version for learning).
First, we connect to Redshift in Tableau, and double-click the “New Custom SQL”. In the popup window, type in our SQL to query first-year-salary of every employee:

Now we have the table “custom sql query”. Drag in table “salary”, and choose “inner join” for employee_id, start_date:

Click into the “Sheet 1”. Drag “salary” to “Rows”, “min_start_date” to “Columns”, and “employee_id” to “Color” in “Marks” panel.

Now we can see the “expensive employees” (who have the most high salary in the same first-year) on the top of the graph:

Instead of adding custom SQL in tableau datasource panel, we can also create view in Redshift, and let tableau show views in “Tables”.

Or using “WITH” clause